Abstract: As the connectivity foundation of the intelligent computing center, the intelligent computing center network must provide high-performance, low-latency communication. It is a complex system that integrates many elements and depends on collaborative innovation across the upstream and downstream industry chain, including intelligent computing services, intra- and inter-machine switching chips, network interface cards, and network equipment. This paper systematically analyzes three core areas: the GPU interconnect network within a server or supernode, the inter-machine interconnect network within a campus, and the cross-campus interconnect network between intelligent computing centers. It discusses the requirements and challenges of intelligent computing networks and the state of industry development. China Mobile has proposed the OISA and GSE technical systems, along with innovations such as elastic Ethernet aggregation, fine-grained congestion control, and physical-layer security. The aim is to build intelligent computing center networks that offer ultra-large scale, ultra-high bandwidth, ultra-low latency, and ultra-high reliability, and thereby to support the development of the AI industry.
Keywords: intelligent computing center network; Omni-directional Intelligent Sensing Express Architecture (OISA); Global Scheduling Ethernet (GSE); intelligent computing center interconnection