U.S. Department of Energy
ACMD Division

Staff information

Shuaiwen Leon Song

High Performance Computing
Scientist
Pacific Northwest National Laboratory
PO Box 999
MSIN: J4-30
Richland, WA 99352
509/372-4189

Biography

I am currently a staff research scientist in the High Performance Computing Group at Pacific Northwest National Laboratory (PNNL). Before joining the PNNL HPC group in May 2013, I worked as an R&D intern with several government and industrial labs, including the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL), the Performance Analysis Lab (PAL) at PNNL, and the Architecture Research Division at NEC Research America in Princeton.

I was a 2011 Livermore ISCR scholar and a recipient of the 2011 Paul E. Torgersen Excellent Research Award and the 2016 PNNL PCSD Outstanding Performance Award. I have published in major HPC conferences, including HPDC, ICS, SC, PACT, and IPDPS. My SC'15 paper was nominated for best student paper. I have served as a PC member and as session or publicity chair for several major HPC venues, including SC, IPDPS, and HPDC. My past and current research has been funded by several major government agencies, including DOE ASCR, DoD, and DARPA.

Research Interests

  • Performance and Energy evaluation and optimization for HPC systems
  • Fault tolerance and system reliability
  • Multi-core and Many-core architectures (e.g., emergent many-core accelerators)
  • Power-aware computing and energy-efficient design for large scale distributed systems
  • Big data analytics, deep learning, and dynamic modeling techniques (e.g., machine learning)
  • Approximate computing and accuracy-aware computing
  • Runtime systems

Education and Credentials

  • Ph.D. in Computer Science and Applications, Virginia Tech, May 2013
  • Master's in Computer Science and Applications, Virginia Tech, May 2009

Affiliations and Professional Service

  • IEEE professional
  • ACM professional
  • ACM SIGHPC
  • Upsilon Pi Epsilon

Awards and Recognitions

  • PNNL PCSD Outstanding Performance Award
  • Co-Chair, The First IEEE Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM'2016), in conjunction with IPDPS'16, Chicago.
  • Co-Chair, The Twelfth IEEE Workshop on High-Performance Power-Aware Computing (HPPAC), in conjunction with IPDPS'16, Chicago.
  • PNNL staff research highlight award 2015
  • PNNL research award 2015
  • Best student paper nominee for SC15
  • Recipient of 2011 Paul E. Torgersen excellent research award
  • 2011 ISCR scholar, Lawrence Livermore National Lab
  • PACT 12 ACM SRC by Microsoft Research
  • SC 11 selected Ph.D. showcase

PNNL Publications

2016

  • Li A, S Song, M Wijtvliet, A Kumar, and H Corporaal. 2016. "SFU-Driven Transparent Approximation Acceleration on GPUs." In Proceedings of the International Conference on Supercomputing (ICS 2016), June 1-3, 2016, Istanbul, Turkey, p. Paper No. 15.  Association for Computing Machinery, New York, NY.  doi:10.1145/2925426.2926255
  • Li A, S Song, A Kumar, E Zhang, D Chavarría-Miranda, and H Corporaal. 2016. "Critical Points Based Register-Concurrency Autotuning for GPUs." In Proceedings of the Design, Automation and Test in Europe Conference (DATE 2016), March 14-18, 2016, Dresden, Germany, pp. 1273-1278.  IEEE, Piscataway, NJ. 
  • Li A, S Song, E Brugel, A Kumar, D Chavarría-Miranda, and H Corporaal. 2016. "X: A Comprehensive Analytic Model for Parallel Machines." In IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), May 23-27, 2016, Chicago, Illinois, pp. 242-252.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPS.2016.89
  • Li L, A Hayes, S Song, and E Zhang. 2016. "Tag-Split Cache for Efficient GPGPU Cache Utilization." In Proceedings of the International Conference on Supercomputing (ICS 2016), June 1-3, 2016, Istanbul, Turkey, p. Paper No. 43.  ACM, New York, NY.  doi:10.1145/2925426.2926253
  • Roy P, X Liu, and S Song. 2016. "SMT-Aware Instantaneous Footprint Optimization." In Proceedings of the 25th ACM International Symposium on High-Performance and Distributed Computing (HPDC 2016), May 31-June 4, 2016, Kyoto, Japan, pp. 267-279.  ACM, New York, NY.  doi:10.1145/2907294.2907308
  • Tan L, Z Chen, and S Song. 2016. "Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology." ACM Transactions on Architecture and Code Optimization 12(4):Article No. 35.  doi:10.1145/2822893
  • Tan L, Z Chen, and S Song. 2016. "Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology." In 11th International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC 2016), January 18-20, 2016, Prague, Czech Republic.  ACM, New York, NY.
  • Tao D, S Song, S Krishnamoorthy, P Wu, X Liang, E Zhang, DJ Kerbyson, and Z Chen. 2016. "New-Sum: A Novel Online ABFT Scheme For General Iterative Methods." In Proceedings of the 25th ACM International Symposium on High-Performance and Distributed Computing (HPDC 2016), May 31-June 4, 2016, Kyoto, Japan, pp. 43-55.  ACM, New York, NY.  doi:10.1145/2907294.2907306

2015

  • Li C, S Song, H Dai, A Sidelnik, S Hari, and H Zhou. 2015. "Locality-Driven Dynamic GPU Cache Bypassing." In Proceedings of the 29th ACM on International Conference on Supercomputing (ICS 2015), June 8-11, 2015, Newport Beach, California, pp. 66-77.  ACM, New York, NY.  doi:10.1145/2751205.2751237
  • Sengupta D, S Song, K Agarwal, and K Schwan. 2015. "GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems." In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'15), November 15-20, 2015, Austin, Texas, p. Paper No. 28.  ACM, New York, NY.  doi:10.1145/2807591.2807655
  • Sengupta D, K Agarwal, S Song, and K Schwan. 2015. "GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems." In IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW 2015), May 25-29, 2015, Hyderabad, India, pp. 604-609.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPSW.2015.16
  • Shrestha S, JB Manzano Franco, A Marquez, S Zuckerman, S Song, and GR Gao. 2015. "Gregarious Data Re-structuring in a Many Core Architecture." In IEEE 17th International Conference on High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), August 24-26, 2015, New York, pp. 712-720.  IEEE, Piscataway, NJ.  doi:10.1109/HPCC-CSS-ICESS.2015.291
  • Tan L, S Song, P Wu, Z Chen, R Ge, and DJ Kerbyson. 2015. "Investigating the Interplay between Energy Efficiency and Resilience in High Performance Computing." In IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015), May 25-29, 2015, Hyderabad, India, pp. 786-796.  IEEE Computer Society, Los Alamitos, CA.  doi:10.1109/IPDPS.2015.108
  • You Y, H Fu, S Song, A Randles, DJ Kerbyson, A Marquez, G Yang, and A Hoisie. 2015. "Scaling Support Vector Machines On Modern HPC Platforms." Journal of Parallel and Distributed Computing 76:16-31.  doi:10.1016/j.jpdc.2014.09.005

2014

  • Li B, HC Chang, S Song, CY Su, T Meyer, J Mooring, and K Cameron. 2014. "Extending PowerPack for Profiling and Analysis of High Performance Accelerator-Based Systems." Parallel Processing Letters 24(4):Article No. 144200.  doi:10.1142/S0129626414420018
  • Li B, HC Chang, S Song, CY Su, T Meyer, J Mooring, and K Cameron. 2014. "The Power-Performance Tradeoffs of the Intel Xeon Phi on HPC Applications." In IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW 2014), May 19-23, 2014, Phoenix, Arizona, pp. 1448-1456.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPSW.2014.162
  • Marquez A, JB Manzano Franco, S Song, B Meister, S Shrestha, T St. John, and GR Gao. 2014. "ACDT: Architected Composite Data Types Trading-in Unfettered Data Access for Improved Execution." In The 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2014), December 16-19, 2014, Hsinchu, Taiwan, pp. 289-297.  IEEE, Piscataway, NJ.  doi:10.1109/PADSW.2014.7097820
  • You Y, S Song, and DJ Kerbyson. 2014. "An Adaptive Cross-Architecture Combination Method for Graph Traversal." In Proceedings of the 28th ACM International Conference on Supercomputing (ICS'14), June 10-13, 2014, Munich, Germany, pp. 169-169.  Association for Computing Machinery, New York, NY.  doi:10.1145/2597652.2600110
  • You Y, H Fu, S Song, M Mehri Dehanavi, L Gan, X Huang, and G Yang. 2014. "Evaluating Multi-core Architectures through Accelerating the Three-Dimensional Lax-Wendroff Correction." International Journal of High Performance Computing Applications 28(3):301-318.  doi:10.1177/1094342014524807
  • You Y, S Song, H Fu, A Marquez, M Mehri Dehanavi, KJ Barker, K Cameron, A Randles, and G Yang. 2014. "MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures." In IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS 2014), May 19-23, 2014, Phoenix, Arizona, pp. 809-818.  IEEE Computer Society, Los Alamitos, CA.  doi:10.1109/IPDPS.2014.88

2013

  • Vishnu A, S Song, A Marquez, KJ Barker, DJ Kerbyson, K Cameron, and P Balaji. 2013. "Designing Energy Efficient Communication Runtime Systems: A View from PGAS Models." Journal of Supercomputing 63(3):691-709.  doi:10.1007/s11227-011-0699-9
  • Li B, S Song, I Bezakova, and K Cameron. 2013. "EDR: An Energy-Aware Runtime Load Distribution System for Data-Intensive Applications in the Cloud." In IEEE International Conference on Cluster Computing (CLUSTER 2013), September 23-27, 2013, Indianapolis, IN, pp. 1-8.  Institute of Electrical and Electronics Engineers, Piscataway, NJ.  doi:10.1109/CLUSTER.2013.6702674
  • Song S, KJ Barker, and DJ Kerbyson. 2013. "Unified Performance and Power Modeling of Scientific Workloads." In E2SC '13 Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, November 17-21, 2013, Denver, Colorado, p. Article No. 4.  Association for Computing Machinery, New York, NY.  doi:10.1145/2536430.2536435
  • Song S, NR Tallent, and A Vishnu. 2013. "Exploring Machine Learning Techniques For Dynamic Modeling on Future Exascale Systems." In Modeling & Simulation of Exascale Systems & Applications: Workshop on Modeling & Simulation of Exascale Systems & Applications, September 18-19, 2013, Seattle, Washington.  US Department of Energy, Office of Advanced Scientific Computing Research, Washington DC. 

2011

  • Song S, C Si Yu, R Ge, A Vishnu, and K Cameron. 2011. "Iso-Energy-Efficiency: An Approach to Power Constrained Parallel Computation." In IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), May 16-20, 2011, Anchorage, Alaska, pp. 128-139.  IEEE, Piscataway, NJ.  doi:10.1109/IPDPS.2011.22

2010

  • Vishnu A, HJJ van Dam, WA De Jong, P Balaji, and S Song. 2010. "Fault Tolerant Communication Runtime Support for Data-Centric Programming Models." In International Conference on High Performance Computing (HiPC 2010), December 19-22, 2010, Goa, India.  Institute of Electrical and Electronics Engineers, Piscataway, NJ.  doi:10.1109/HIPC.2010.5713195
  • Vishnu A, S Song, A Marquez, KJ Barker, DJ Kerbyson, K Cameron, and P Balaji. 2010. "Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models." In IEEE/ACM International Conference on Green Computing and Communications (GreenCom 2010) and the International Conference on Cyber, Physical and Social Computing (CPSCom 2010), December 18-20, 2010, Hangzhou, China, ed. P Zhu, et al, pp. 229-236.  Institute of Electrical and Electronics Engineers, Inc., Piscataway, NJ.  doi:10.1109/GreenCom-CPSCom.2010.133
