Geo-Distributed Machine Learning: Framework and Technology Exceeding LAN Speed

Published: 2020-10-22 Authors: LI Zonghang, YU Hongfang, WANG Yi

LI Zonghang¹, YU Hongfang¹,³, WANG Yi²,³

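(1. University of Electronic Science and Technology of China, Chengdu 611731, China; 2. Southern University of Science and Technology, Shenzhen 518055, China; 3. Peng Cheng Laboratory, Shenzhen 518055, China)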

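Abstract: GeoMX, a software framework for geo-distributed machine learning, is proposed. The framework optimizes communication from two angles: the communication architecture and the compressed transmission mechanism. Correspondingly, a hierarchical parameter server (HiPS) architecture and a bi-directional sparse gradient transmission (BiSparse) technique are designed to reduce, respectively, the number and the size of the gradient flows sent over the wide area network. On data centers distributed across a wide area network, GeoMX achieves up to 4 times the training efficiency of MXNET running inside a single data center, with almost no loss of accuracy.
Keywords: big data; artificial intelligence; geo-distributed machine learning; gradient sparsification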

Geo-Distributed Machine Learning: Framework and Technology Exceeding LAN Speed

LI Zonghang¹, YU Hongfang¹,³, WANG Yi²,³

(1. University of Electronic Science and Technology of China, Chengdu 611731, China; 2. Southern University of Science and Technology, Shenzhen 518055, China; 3. Peng Cheng Laboratory, Shenzhen 518055, China)

Abstract: A software framework called GeoMX is proposed for geo-distributed machine learning. GeoMX improves communication efficiency through both architecture and compression: a hierarchical parameter server (HiPS) architecture and a bi-directional sparsification (BiSparse) technique are designed to reduce, respectively, the number and the size of the gradient flows transmitted over the wide area network (WAN). In the experiments, GeoMX is deployed on multiple data centers distributed across a WAN, while MXNET is deployed in a single data center over a local area network (LAN). The results show that GeoMX is up to 4 times faster than MXNET with little loss of accuracy.
Keywords: big data; artificial intelligence; geo-distributed machine learning; gradient sparsification
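
To make the idea of gradient sparsification concrete, the following is a minimal sketch of generic top-k sparsification, a common way to shrink the size of a gradient flow. It is not the BiSparse algorithm from the paper, whose details are not given in this abstract. The function names `topk_sparsify` and `densify` and the `ratio` parameter are hypothetical and chosen only for illustration.

```python
import heapq


def topk_sparsify(grad, ratio):
    """Keep only the largest-magnitude entries of a dense gradient.

    Assumption: a generic top-k scheme, not the paper's BiSparse.
    Returns (indices, values, residual), where residual holds the
    dropped entries so a sender could accumulate them locally.
    """
    k = max(1, int(len(grad) * ratio))
    # Indices of the k entries with the largest absolute value.
    keep = sorted(heapq.nlargest(k, range(len(grad)), key=lambda i: abs(grad[i])))
    values = [grad[i] for i in keep]
    kept = set(keep)
    residual = [0.0 if i in kept else g for i, g in enumerate(grad)]
    return keep, values, residual


def densify(indices, values, size):
    """Rebuild a dense vector from the sparse (indices, values) pair."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


grad = [0.1, -2.0, 0.05, 1.5, -0.3, 0.0, 0.7, -0.01]
indices, values, residual = topk_sparsify(grad, ratio=0.25)
restored = densify(indices, values, len(grad))
```

In this sketch only the `(indices, values)` pair would cross the wide area network. A bi-directional scheme would apply the same kind of compression to the traffic in both directions, that is, to what is pushed to the server and to what is pulled back.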
