Global Scheduling Ethernet Technology for New Intelligent Computing Centers

Published: 2023-07-25 Authors: Duan Xiaodong (段晓东), Cheng Weiqiang (程伟强), Wang Ruixue (王瑞雪), Wang Wenxuan (王雯萱)

 

Abstract: As the connectivity foundation of the intelligent computing center, the intelligent computing center network must provide high-performance, low-latency communication. Poor network performance severely degrades distributed training. The intelligent computing center network is a complex system that integrates many elements, and it depends on collaborative innovation across the upstream and downstream industry, including intelligent computing services, network equipment, switching chips, network interface cards, and test instruments. This paper proposes a new global scheduling Ethernet (GSE) technology architecture. While remaining as compatible as possible with the Ethernet ecosystem, the architecture relies on forwarding and load balancing mechanisms based on packet containers (PKTC) and on packet-container-based dynamic global scheduling queue (DGSQ) global scheduling technology. The goal is an intelligent computing center network with ultra-large scale, ultra-high bandwidth, ultra-low latency, and ultra-high reliability that supports the development of the AI industry.

Keywords: artificial intelligence-generated content; intelligent computing center network; global scheduling Ethernet (GSE); packet container; dynamic global scheduling queue
