Charlotte, North Carolina, USA
May 2-4, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FCCM.2010.38
Deep Belief Nets (DBNs) are an emerging application in the machine learning domain that use Restricted Boltzmann Machines (RBMs) as their basic building block. Although small-scale DBNs have shown great potential, the computational cost of RBM training has been a major challenge in scaling to large networks. In this paper we present a highly scalable architecture for Deep Belief Net processing on hardware systems that can handle hundreds of boards, if not more, of customized logic, with a near-linear performance increase. We elucidate the tradeoffs between flexibility in the neuron connections and the hardware resources, such as memory and communication bandwidth, required to build a custom processor design with optimal efficiency. We illustrate how our architecture can easily support sparse networks with dense regions of connections between neighboring sets of neurons, which is relevant to applications with obvious spatial correlations in the data, such as image processing. We demonstrate the feasibility of our approach by implementing a multi-FPGA system, and show that a four-FPGA implementation achieves a 46X-112X speedup over an optimized single-core CPU implementation.
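For context on the workload the abstract refers to, RBM training is commonly done with one-step contrastive divergence (CD-1): sample the hidden units from the visible data, reconstruct the visible units, resample the hiddens, and update the weights from the difference of the two correlation estimates. The sketch below is a minimal pure-Python illustration of CD-1; the network sizes, learning rate, and toy data are illustrative assumptions, not the configuration or implementation described in the paper.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(p):
    """Draw a binary sample from a Bernoulli probability."""
    return 1.0 if random.random() < p else 0.0

# Tiny illustrative RBM: n_v visible units, n_h hidden units.
n_v, n_h = 6, 3
W = [[random.gauss(0.0, 0.1) for _ in range(n_h)] for _ in range(n_v)]
b_v = [0.0] * n_v  # visible biases
b_h = [0.0] * n_h  # hidden biases

def hidden_probs(v):
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(n_v)))
            for j in range(n_h)]

def visible_probs(h):
    return [sigmoid(b_v[i] + sum(h[j] * W[i][j] for j in range(n_h)))
            for i in range(n_v)]

def cd1_step(v0, lr=0.1):
    """One CD-1 update for a single binary training vector v0."""
    ph0 = hidden_probs(v0)               # positive phase
    h0 = [sample(p) for p in ph0]
    pv1 = visible_probs(h0)              # reconstruction
    v1 = [sample(p) for p in pv1]
    ph1 = hidden_probs(v1)               # negative phase
    # Update: positive-phase correlations minus negative-phase correlations.
    for i in range(n_v):
        for j in range(n_h):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(n_v):
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(n_h):
        b_h[j] += lr * (ph0[j] - ph1[j])

# Toy data: two complementary binary patterns.
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
for epoch in range(200):
    for v in data:
        cd1_step(v)
```

The inner weight-update loop is an O(n_v * n_h) dense matrix operation per training vector, which is the cost that dominates at scale and motivates the custom hardware acceleration the paper proposes.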
Neural network hardware, Computer architecture, Large-scale systems, Field programmable gate arrays, Parallel processing, Boltzmann machines
Sang Kyun Kim, Peter Leonard McMahon, Kunle Olukotun, "A Large-Scale Architecture for Restricted Boltzmann Machines", in Proc. IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2010, pp. 201-208, doi:10.1109/FCCM.2010.38