January 13, 2023
Conference Paper

Sparse Deep Neural Network Inference using different Programming Models

Abstract

Sparse deep neural networks have recently gained attention for achieving inference speedups with reduced memory footprints. However, few real-world applications employ specialized sparse optimizations, and a wide variety of DNN tasks remain dense, leaving the advantages of network sparsity unexploited. Recent work presented at the MIT/IEEE/Amazon GraphChallenge has demonstrated significant speedups and a variety of techniques. Still, we find limited investigation of the impact of different Python- and C/C++-based programming models, which could expose new opportunities in the general case. In this work, we evaluate the performance of sparse inference implementations written with CuPy, cuSPARSE, and OpenMP, and discuss their advantages and disadvantages on single and multiple GPUs of NVIDIA DGX-A100 40GB/80GB platforms.
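To illustrate the kind of computation the paper benchmarks, here is a minimal sketch (not from the paper) of a single sparse inference layer, y = ReLU(W·x + b), using SciPy's CSR sparse matrices on the CPU; CuPy exposes a largely API-compatible `cupyx.scipy.sparse` interface for running the same pattern on a GPU. The layer sizes, density, and helper name are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random

def sparse_layer(W, x, bias=0.0):
    """One sparse DNN inference layer: ReLU(W @ x + bias).

    W: CSR sparse weight matrix (the sparse-matrix times
       dense-matrix product, SpMM, dominates inference cost).
    x: dense activation vector or batch matrix.
    """
    y = W @ x + bias          # SpMM: sparse weights, dense activations
    return np.maximum(y, 0.0)  # ReLU nonlinearity

# Illustrative sizes: 64x64 weights with ~10% nonzeros, batch of 8 inputs
W = csr_matrix(sparse_random(64, 64, density=0.1, random_state=0))
x = np.random.default_rng(0).random((64, 8))
y = sparse_layer(W, x)
print(y.shape)  # (64, 8)
```

On a GPU, the same code would typically replace `scipy.sparse` with `cupyx.scipy.sparse` and `numpy` with `cupy`, which is the style of CuPy implementation the paper compares against cuSPARSE and OpenMP variants.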

Citation

Lee, H., M. Jain, and S. Ghosh. 2022. "Sparse Deep Neural Network Inference using different Programming Models." In IEEE High Performance Extreme Computing Conference (HPEC 2022), September 19-23, 2022, Waltham, MA, 1-6. Piscataway, NJ: IEEE. PNNL-SA-175380. doi:10.1109/HPEC55821.2022.9926362