Research on Optical-Computing Collaboration Technology for Intelligent Computing Center Interconnection

Published: 2026-01-06 Authors: Tan Yanxia, Man Xiangkun, Wu Shaohui, Zhang He, Xu Bohua

Abstract: To address the new demands that intelligent computing center interconnection places on optical networks, and in light of the current state of intelligent computing networks, this paper explores the interconnection architecture and key technologies for intelligent computing centers, with the aim of achieving high-performance computing power interconnection. For the scenario of distributed collaborative training across intelligent computing centers, a trial environment based on the optical transport network (OTN) is built on a live network. With a wide-area convergence ratio of no less than 16:1, cross-domain distributed training of an AI large model at the ten-billion-parameter scale reaches more than 95% training performance. The trial uses single-wavelength 800G transmission over 300 km and verifies its ultra-high-reliability transmission capability.

Keywords: intelligent computing center interconnection; optical transport network; distributed collaborative training; highly reliable transmission
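
To give a feel for the figures quoted in the abstract, the following is a minimal arithmetic sketch and not the authors' method. It assumes that the wide-area convergence ratio means intra-center aggregate bandwidth divided by inter-center bandwidth, and that the fiber has a typical group index of about 1.468. The intra-center bandwidth value and all function names are hypothetical; only the 16:1 ratio, the 800G wavelength rate, and the 300 km distance come from the text.

```python
import math

# Illustrative arithmetic only. Apart from the 16:1 ratio, the 800G
# per-wavelength rate, and the 300 km distance, every value is an assumption.

SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_GROUP_INDEX = 1.468  # assumed typical value for standard single-mode fiber


def wan_bandwidth_gbps(intra_dc_gbps: float, convergence_ratio: float) -> float:
    """Wide-area bandwidth implied by an N:1 convergence ratio (assumed definition)."""
    return intra_dc_gbps / convergence_ratio


def wavelengths_needed(wan_gbps: float, per_wave_gbps: float = 800.0) -> int:
    """Number of wavelengths required to carry wan_gbps, rounded up."""
    return math.ceil(wan_gbps / per_wave_gbps)


def one_way_delay_ms(distance_km: float) -> float:
    """One-way propagation delay through fiber, ignoring equipment latency."""
    return distance_km * FIBER_GROUP_INDEX / SPEED_OF_LIGHT_KM_PER_S * 1000.0


if __name__ == "__main__":
    intra_dc = 51_200.0  # hypothetical intra-center aggregate bandwidth in Gbit/s
    wan = wan_bandwidth_gbps(intra_dc, 16.0)
    print(f"WAN bandwidth at 16:1: {wan:.0f} Gbit/s")
    print(f"800G wavelengths needed: {wavelengths_needed(wan)}")
    print(f"One-way delay over 300 km: {one_way_delay_ms(300.0):.2f} ms")
```

Under these assumptions, a higher convergence ratio reduces the number of wavelengths that must be provisioned between centers, while the propagation delay of roughly 1.5 ms one way over 300 km stays fixed regardless of bandwidth. That fixed delay is one reason cross-center training efficiency is worth measuring on a live network.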