Multi-FPGA DNN Acceleration


A generic multi-FPGA solution, written in OpenCL, which can accelerate more complex CNNs (e.g., C3D CNN) and achieve a near-linear speedup with respect to the available single-FPGA solutions.

Project status: Concept

HPC, Artificial Intelligence

Intel Technologies
Intel FPGA

Overview / Usage

High-throughput, low-latency Convolutional Neural Network (CNN) inference is increasingly important for many cloud- and edge-computing applications. FPGA-based acceleration of CNN inference has demonstrated various benefits compared to other high-performance devices such as GPGPUs. However, current FPGA CNN-acceleration solutions are based on a single-FPGA design, which is limited by the resources available on one device, and they can only accelerate conventional 2D neural networks. To address these limitations, we present a generic multi-FPGA solution, written in OpenCL, which can accelerate more complex CNNs (e.g., the C3D CNN) and achieve a near-linear speedup with respect to the available single-FPGA solutions.

The design is built upon the DLA architecture, with three extensions. First, it includes updates for better area efficiency (up to 25%) and higher performance (up to 24%). Second, it supports 3D convolutions for more challenging applications such as video learning. Third, it supports multi-FPGA communication for higher inference throughput.

The results show that utilizing multiple FPGAs linearly increases the overall bandwidth while maintaining the same end-to-end latency. The design also outperforms other FPGA 2D accelerators by up to 8.4 times, and 3D accelerators by up to 1.7 times.
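For readers unfamiliar with the 3D-convolution extension mentioned above: a 3D convolution slides the kernel along depth (time) as well as height and width, which is what makes video workloads like C3D more demanding than 2D image CNNs. The sketch below shows the single-channel, stride-1, valid-padding case as a plain loop nest; the function name `conv3d` and the flat-array layout are our own illustration, not the project's OpenCL kernels, which map this loop nest to hardware pipelines.

```cpp
#include <vector>
#include <cstddef>

// Naive single-channel 3D convolution over a D x H x W volume with a
// K x K x K kernel (valid padding, stride 1). Input and kernel are
// stored flat in depth-major order. Illustrative reference code only.
std::vector<float> conv3d(const std::vector<float>& in,
                          std::size_t D, std::size_t H, std::size_t W,
                          const std::vector<float>& ker, std::size_t K) {
    std::size_t OD = D - K + 1, OH = H - K + 1, OW = W - K + 1;
    std::vector<float> out(OD * OH * OW, 0.0f);
    for (std::size_t d = 0; d < OD; ++d)
        for (std::size_t h = 0; h < OH; ++h)
            for (std::size_t w = 0; w < OW; ++w) {
                float acc = 0.0f;
                // Accumulate over the K x K x K receptive field.
                for (std::size_t kd = 0; kd < K; ++kd)
                    for (std::size_t kh = 0; kh < K; ++kh)
                        for (std::size_t kw = 0; kw < K; ++kw)
                            acc += in[(d + kd) * H * W + (h + kh) * W + (w + kw)]
                                 * ker[kd * K * K + kh * K + kw];
                out[d * OH * OW + h * OW + w] = acc;
            }
    return out;
}
```

Compared to a 2D convolution, the extra depth loop multiplies both the arithmetic and the on-chip buffering requirements by K, which is one reason a single FPGA's resources become the bottleneck.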

Technologies Used

OpenCL / FPGA
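As a rough illustration of the scaling claim in the overview (linear bandwidth growth at constant end-to-end latency), the toy model below is our own back-of-envelope sketch, not the project's measured results: when each additional FPGA serves its own input stream, aggregate throughput grows with device count while the latency of any single inference stays that of one device.

```cpp
// Toy multi-FPGA scaling model (illustrative assumption, not project data):
// N devices each run the full network on independent inputs, so aggregate
// throughput is N times the single-device rate and latency is unchanged.
struct Perf {
    double throughput_fps;  // inferences per second across all devices
    double latency_ms;      // end-to-end latency of one inference
};

Perf scale_out(Perf single_fpga, int n_fpgas) {
    return { single_fpga.throughput_fps * n_fpgas, single_fpga.latency_ms };
}
```

Under this model, four FPGAs at 100 inferences/s and 10 ms each would yield 400 inferences/s at the same 10 ms latency; in practice, inter-FPGA communication overhead is what the design must minimize to stay near this ideal.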
