ZTE Builds a TCO-Optimal AI Factory to Fuel Token Economy
- OEX architecture SuperPOD achieves a breakthrough in both computing density and energy efficiency, showcasing ZTE's technological innovation and system-level synergy in AI infrastructure
- ZTE leverages multi-dimensional co-design to enhance TPS, enabling cost-optimized AI factories and fueling the Token Economy

Shanghai, China, June 24, 2026 – ZTE Corporation (0763.HK / 000063.SZ), a leading global provider of integrated information and communication technology solutions, showcased its comprehensive boost to TPS at MWC Shanghai 2026. Powered by multi-dimensional co-design, deep optimization, and acceleration across chips, servers, clusters, AIDC, software algorithms, and scheduling platforms, this approach empowers customers to build TCO-optimal AI factories, providing robust support for the efficient development of the Token Economy.
As large models enter the phase of scaled inference deployment, the "cost per Token" has emerged as the ultimate metric for measuring the commercial value of AI. ZTE proposes that a leap in Token generation efficiency can only be achieved through architecture-level innovation and system-level synergy. To that end, the SuperPOD based on the OEX (Orthogonal Electrical eXchange) architecture, showcased at MWC Shanghai 2026, represents a milestone innovation designed to break through computing power bottlenecks and maximize energy efficiency.
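The "cost per Token" metric can be illustrated with a minimal sketch. It assumes that TPS means tokens generated per second and that total cost of ownership is amortized evenly over the system's service life. The function name, its parameters, and the utilization factor are hypothetical and are not taken from ZTE's materials.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def cost_per_million_tokens(tco: float, lifetime_years: float,
                            tokens_per_second: float,
                            utilization: float = 1.0) -> float:
    """Amortized cost of generating one million tokens (hypothetical model).

    tco               -- total cost of ownership over the whole lifetime
    lifetime_years    -- service life over which the TCO is amortized
    tokens_per_second -- sustained TPS of the whole system (assumed meaning)
    utilization       -- fraction of time the system is serving, in (0, 1]
    """
    if tco < 0 or lifetime_years <= 0 or tokens_per_second <= 0:
        raise ValueError("tco must be >= 0; lifetime and TPS must be > 0")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_tokens = (tokens_per_second * utilization
                    * lifetime_years * SECONDS_PER_YEAR)
    return tco / total_tokens * 1_000_000
```

Under this simple model, doubling TPS at a fixed TCO halves the cost per Token, which is why throughput and total cost are treated together.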
Pioneering the OEX architecture to define the next-generation super-node standard
ZTE pioneered the Orthogonal Architecture SuperPOD concept. Its OEX architecture features a midplane-free, zero-cable design that physically decouples core components such as GPUs, CPUs, and switch chips and allows them to be replaced flexibly. By supporting mainstream high-speed interconnect protocols such as CLink and SUE, it realizes "multi-chip synergy, open compatibility, and on-demand optimization". Compared with traditional architectures, the OEX-based SuperPOD has shorter communication paths and lower signal loss, significantly improving overall interconnection efficiency, minimizing latency, and enhancing system reliability.
A single ZTE SuperPOD rack integrates 128 GPUs, an industry-leading ultra-high density, and the system scales up to 16,000 GPUs to build an extra-large-scale cluster. It meets AI training and inference requirements ranging from thousand-card to ten-thousand-card scale, providing a solid foundation for long-context, high-concurrency agent scenarios.
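As a rough sizing illustration, the rack counts implied by these figures can be worked out with a short sketch. It takes the 128-GPU rack and 16,000-GPU cluster figures at face value and assumes every rack is fully populated. The helper name is hypothetical.

```python
GPUS_PER_RACK = 128        # per-rack density stated above
MAX_CLUSTER_GPUS = 16_000  # maximum cluster scale stated above


def racks_needed(total_gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Smallest number of fully populated racks that holds total_gpus."""
    if total_gpus < 0 or gpus_per_rack <= 0:
        raise ValueError("total_gpus must be >= 0 and gpus_per_rack > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_gpus // gpus_per_rack)


for gpus in (1_000, 10_000, MAX_CLUSTER_GPUS):
    print(f"{gpus} GPUs -> {racks_needed(gpus)} racks")
```

On these assumptions, a thousand-card deployment fits in 8 racks and the full 16,000-GPU cluster in 125 racks.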
Multi-Dimensional Optimization to Achieve Comprehensive Improvements in Inference Efficiency and Computing Energy Efficiency
Hardware-software synergy unleashes ultimate efficiency per watt. By leveraging PD (prefill-decode) disaggregation and integrating technologies such as network efficiency optimization, operator optimization, and multi-level KV cache, ZTE overcomes performance bottlenecks and significantly increases throughput. In close collaboration with multiple manufacturers, ZTE advances heterogeneous mixed inference and system-level optimization on domestic chip platforms, resulting in a comprehensive enhancement of inference efficiency and a notable increase in TPS.
Compute-storage-network synergy builds a large-scale inference pool. ZTE offers the Full Series AI Server supporting high-density deployment with 8 or 16 GPUs per server and 64 or 128 GPUs per rack, adapting to diverse scenarios. The AI-native KV cache is implemented through DPU hardware acceleration that enables direct GPU access to storage, achieving zero-copy data transfer, microsecond-level latency, and PB-scale scalability. Combined with intelligent prefetching and dynamic eviction mechanisms, the cache delivers a hit rate exceeding 70%, significantly boosting inference efficiency.
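How a cache hit rate such as the one quoted above is measured can be shown with a minimal sketch. ZTE has not described its eviction or prefetching policies, so a plain least-recently-used (LRU) policy is swapped in here purely for illustration, and prefetching is omitted. The class and its methods are hypothetical and do not represent the DPU-accelerated implementation.

```python
from collections import OrderedDict


class LruKvCache:
    """Toy KV-block cache with LRU eviction and hit-rate accounting."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._blocks = OrderedDict()  # key -> cached KV block, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached block, or None on a miss."""
        if key in self._blocks:
            self._blocks.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._blocks[key]
        self.misses += 1
        return None

    def put(self, key, block) -> None:
        """Insert or refresh a block, evicting the least recently used."""
        self._blocks[key] = block
        self._blocks.move_to_end(key)
        while len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)

    @property
    def hit_rate(self) -> float:
        lookups = self.hits + self.misses
        return self.hits / lookups if lookups else 0.0
```

Every hit means the corresponding prefill work does not have to be recomputed, which is how a higher hit rate translates into higher inference efficiency.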
Building an Open and Evolvable AI Infrastructure with Ecosystem Partners
ZTE emphasizes that AI computing power development must balance performance, cost, and sustainable evolution. To achieve this, ZTE's OEX-based SuperPOD adopts a "Pre-Integration" model. Through pre-adaptation and pre-integration, the product adaptation and tuning cycle is cut from over one year to within six months, significantly accelerating ecosystem convergence and large-scale commercial rollout.
Efficient computing power serves as the bedrock of the Token Economy, and ZTE is dedicated to ensuring that every ounce of computing power translates into tangible AI productivity. This showcase underscores ZTE's deep technological heritage and innovative strength in intelligent computing infrastructure, while offering customers a definitive, future-proof blueprint for building highly efficient AI factories.