Abstract
Recent work in unsupervised feature learning and deep learning has shown that the ability to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with hundreds of parameters using distributed CPU cores. We have developed the Bagging-Down SGD algorithm to address the problems of distributed training. Bagging-Down SGD introduces a parameter server on top of the several model replicas and separates parameter updating from the training computation to accelerate the whole system. We have successfully used our system to train a distributed deep network and achieve state-of-the-art performance on MNIST, a database of handwritten digits. We show that these techniques dramatically accelerate the training of this kind of distributed deep network.
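The abstract describes a parameter-server architecture in which several model replicas compute gradients while a separate server applies the updates. The paper's record here gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions: the toy linear-regression model, the `ParameterServer` and `replica` names, and the threading layout are all illustrative choices, not the authors' Bagging-Down SGD code.

```python
# Sketch of a parameter-server setup: replicas train on data shards and push
# gradients; a separate updater thread applies them, so updating is decoupled
# from the training computation. Illustrative only; not the paper's algorithm.
import threading
import queue
import numpy as np


class ParameterServer:
    """Holds the shared parameters and applies pushed gradients."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr
        self.grad_queue = queue.Queue()
        self.lock = threading.Lock()

    def pull(self):
        # Replicas fetch the current parameters before computing a gradient.
        with self.lock:
            return self.w.copy()

    def push(self, grad):
        # Replicas hand gradients to the server instead of updating locally.
        self.grad_queue.put(grad)

    def update_loop(self, n_updates):
        # Updating runs in its own thread, separated from the replicas' compute.
        for _ in range(n_updates):
            grad = self.grad_queue.get()
            with self.lock:
                self.w -= self.lr * grad


def replica(server, X, y, steps):
    """A model replica: pull parameters, compute a gradient on its shard, push it."""
    for _ in range(steps):
        w = server.pull()
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        server.push(grad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([1.0, -2.0, 0.5])
    X = rng.normal(size=(300, 3))
    y = X @ true_w

    n_replicas, steps = 3, 50
    server = ParameterServer(dim=3)
    updater = threading.Thread(target=server.update_loop,
                               args=(n_replicas * steps,))
    workers = [threading.Thread(target=replica,
                                args=(server, X[i::n_replicas],
                                      y[i::n_replicas], steps))
               for i in range(n_replicas)]
    updater.start()
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    updater.join()
    print("learned parameters:", server.pull())
```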
| Original language | English |
| --- | --- |
| DOIs | |
| Publication status | Published - 27 Aug 2016 |
| Event | 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC) - Duration: 27 Aug 2016 → … |
Conference

| Conference | 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC) |
| --- | --- |
| Period | 27/08/16 → … |