New Parameter Exchange Scheme with Topology-Awareness

Published: 2020-10-22 Authors: WAN Xinchen, HU Shuihai, ZHANG Junxue

 

 

 
WAN Xinchen, HU Shuihai, ZHANG Junxue
(Hong Kong University of Science and Technology, Hong Kong 999077, China)
 
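Abstract: This paper defines a new topology-aware parameter exchange scheme, resilient allreduce trees (RAT). Based on the underlying physical topology and its oversubscription conditions, RAT builds resilient allreduce trees. These trees specify the parameter aggregation pattern: each aggregator node aggregates the gradients of all workers within one oversubscribed region in the reduce phase and sends the updates back to the workers in the broadcast phase. Experiments show that the method effectively reduces traffic across oversubscribed regions and shortens the dependency chain.
Keywords: distributed machine learning; allreduce algorithm; parameter exchange scheme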


New Parameter Exchange Scheme with Topology-Awareness
 
WAN Xinchen, HU Shuihai, ZHANG Junxue
(Hong Kong University of Science and Technology, Hong Kong SAR 999077, China)
 
Abstract: A new topology-aware parameter exchange scheme called resilient allreduce trees (RAT) is proposed. Given the underlying physical topology and its oversubscription conditions, RAT builds a set of resilient allreduce trees. These trees specify the aggregation pattern: each aggregator is responsible for aggregating the gradients from all workers within an oversubscribed region in the reduce phase and for broadcasting the updates back to those workers in the broadcast phase. Experiments show that this method effectively reduces cross-region traffic and shortens the dependency chain.
Keywords: distributed machine learning; allreduce algorithm; parameter exchange scheme
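
The aggregation pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-level tree (one aggregator per oversubscribed region plus a single root), element-wise summation as the reduce operation, and globally unique worker names. The names `tree_allreduce`, `vector_sum`, `rack_a`, and `rack_b` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def vector_sum(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized gradient vectors."""
    return [sum(values) for values in zip(*vectors)]


def tree_allreduce(regions: Dict[str, Dict[str, Vector]]) -> Dict[str, Vector]:
    """Two-level allreduce sketch: per-region aggregators, then a root.

    `regions` maps a region name to {worker name: gradient vector}.
    """
    # Reduce phase, level 1: each (hypothetical) aggregator sums the
    # gradients of all workers inside its own region.
    partials = {
        region: vector_sum(list(workers.values()))
        for region, workers in regions.items()
    }
    # Reduce phase, level 2: only one partial sum per region has to
    # cross the region boundary to reach the root.
    total = vector_sum(list(partials.values()))
    # Broadcast phase: the aggregated result travels back down the
    # tree, so every worker ends up with the same vector.
    return {
        worker: list(total)
        for workers in regions.values()
        for worker in workers
    }


regions = {
    "rack_a": {"w0": [1.0, 2.0], "w1": [3.0, 4.0]},
    "rack_b": {"w2": [5.0, 6.0], "w3": [7.0, 8.0]},
}
result = tree_allreduce(regions)
```

In this sketch the upward traffic that crosses a region boundary is one partial sum per region rather than one gradient per worker, which is the intuition behind reducing cross-region traffic.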

 

 
