Murali Emani



Computer Scientist
Argonne Leadership Computing Facility
Argonne National Laboratory, IL, USA
m[lastname]@anl[dot]gov

About

I am a Computer Scientist in the Artificial Intelligence and Machine Learning (AIML) group with the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. Previously, I was a Postdoctoral Research Staff Member at Lawrence Livermore National Laboratory, US. I obtained my PhD and worked as a Research Associate at the Institute for Computing Systems Architecture at the School of Informatics, University of Edinburgh, UK, under the guidance of Prof. Michael O'Boyle. My research interests are in parallel programming models, high performance computing, scalable machine learning, runtime systems, emerging HPC architectures, and online adaptation. Some of my current projects include:

  • Developing performance models to identify and address bottlenecks while scaling machine learning and deep learning frameworks on emerging supercomputers for scientific applications.
  • Co-design of emerging hardware architectures to scale up machine learning algorithms.
  • Efforts on benchmarking ML/DL frameworks and methods on HPC systems.
At ALCF, I also co-lead the AI Testbed, where we explore the performance and efficiency of AI accelerators for scientific machine learning applications. I am serving as a co-chair for the MLPerf HPC group at MLCommons to benchmark large-scale ML on HPC systems. The slides are found here.

Service

  • Program Committee: CCGRID'24, Bench'23, ISC'23, Bench'22, ICCD'22, IPDPS'22, SC'21 (posters), IPDPS'21, AI4Science at SC'20, ICPP'19, CCGRID'19, PACT '18, CCGRID '18, ICPP '18
  • Artifact Evaluation Committee: ASPLOS'21, ASPLOS'20, MLSys'20
  • Technical Reviewer:
    Journals: Concurrency and Computation '19, '18, ACM TOPC '16, Elsevier TJOR '15, Elsevier ParCo '18, '15, ACM TECS '15
    Funding agencies: NSF '17
    Conferences: SBAC-PAD '16, PACT '15, IISWC '15, CC '13, HiPC '10
  • Postdoc and Student Mentoring:
    Sid Raskar, Postdoc, Argonne National Laboratory
    Krishna Teja Chitty Venkata, Postdoc, Argonne National Laboratory
    Zhen Xie, Assistant Professor, Binghamton University
    Gaurav Verma, PhD Student, Stony Brook University
    Shilpika, PhD student, University of California, Davis (now at ANL)
    Scott Cheng, PhD Student, The Pennsylvania State University
    Xianzhong Ding, PhD Student, University of California, Merced
    Sixing Yu, PhD student, Iowa State University
    Yulie Zamora, PhD student, University of Chicago (now at Intel)
    Hailu Xu, PhD student, Florida International University (now Asst. Professor, California State University)
    Larisa Stoltzfus, PhD student, University of Edinburgh (now at EPCC)

Publications

  • [CCGRID'24] A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems
    Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael Papka, Kwan-Liu Ma
    24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID'24)

  • [Applied Sciences'24] Cross-Feature Transfer Learning for Efficient Tensor Program Generation
    Gaurav Verma, Sid Raskar, Murali Emani, and Barbara Chapman
    Journal of Applied Sciences

  • [SIGMETRICS'24] Thorough Characterization and Analysis of Large Transformer Model Training At-Scale
    Scott Cheng, Jun-Liang Lin, Murali Emani, Siddhisanket Raskar, Sam Foreman, Zhen Xie, Venkatram Vishwanath, Mahmut Kandemir

  • A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators
    Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Sid Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael E. Papka
    arXiv link

  • DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
    Leon Song et al.
    arXiv link

  • [AI4S'23] Protein Generation via Genome-scale Language Models with Bio-physical Scoring
    Gautham Dharuman, Logan Ward, Heng Ma, Priyanka V Setty, Ozan Gokdemir, Sam Foreman, Murali Emani, Kyle Hippe, Alexander Brace, Kristopher Keipert, Thomas Gibbs, Ian Foster, Anima Anandkumar, Venkatram Vishwanath, Arvind Ramanathan
    Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S) at SC23

  • [WACCPD'23] Characterizing the Performance of Triangle Counting on Graphcore's IPU Architecture
    Reet Barik, Sid Raskar, Murali Emani, Venkatram Vishwanath
    Workshop on Accelerator Programming and Directives (WACCPD) at SC23

  • [MLG-HPCE'23] HPC-GPT: Integrating Large Language Model for High-Performance Computing
    Xianzhong Ding, Le Chen, Murali Emani, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao, Zhen Xie, Alberto Cerpa, Wan Du
    Machine Learning with Graphs in High Performance Computing Environments (MLG-HPCE) at SC23

  • [Correctness'23] Data Race Detection Using Large Language Models
    Le Chen, Xianzhong Ding, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao, Murali Emani
    Correctness workshop at SC23

  • Differentiable Neural Architecture, Mixed Precision and Accelerator Co-search
    Krishna Teja Chitty-Venkata, Yiming Bian, Murali Emani, Venkatram Vishwanath, Arun K. Somani
    IEEE Access

  • A Survey of Techniques for Optimizing Transformer Inference
    Krishna Teja Chitty-Venkata, Sparsh Mittal, Murali Emani, Venkatram Vishwanath, Arun K. Somani
    Journal of Systems Architecture

  • LM4HPC: Towards Effective Language Model Application in High-Performance Computing
    Le Chen, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao, Murali Emani, Bronis de Supinski
    arXiv link

  • A Multi-Level, Multi-Scale Visual Analytics Approach to Assessment of Multifidelity HPC Systems
    Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael E. Papka, Kwan-Liu Ma
    arXiv link

  • [IWOMP'23] Towards Effective Language Model Application in High-Performance Computing
    Le Chen, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao, Murali Emani and Bronis de Supinski
    The International Workshop on OpenMP (IWOMP) 2023

  • [Nature Scientific Data'23] FAIR for AI: An interdisciplinary, international, inclusive, and diverse community building perspective
    Eliu Huerta, Ben Blaiszik, L. Brinson, Kristofer Bouchard, Daniel Diaz, Caterina Doglioni, Javier Duarte, Murali Emani, Ian Foster, Geoffrey Fox, Philip Harris, Lukas Heinrich, Shantenu Jha, Daniel Katz, Volodymyr Kindratenko, Christine Kirkpatrick, Kati Lassila-Perini, Ravi Madduri, Mark Neubauer, Fotis Psomopoulos, Avik Roy, Oliver Ruebel, Zhizhen Zhao, and Ruike Zhu

  • [Euro-Par'23] TrainBF: High-Performance DNN Training Engine using BFloat16 on AI Accelerators
    Zhen Xie, Sid Raskar, Murali Emani, Venkatram Vishwanath
    Euro-Par 2023

  • [IEEE Access'23] Neural Architecture Search Benchmarks: Insights and Survey
    Krishna Teja Chitty-Venkata, Murali Emani, Venkatram Vishwanath, Arun K. Somani
    IEEE Access 2023

  • [ExHET'23] Transfer Learning Across Heterogeneous Features For Efficient Tensor Program Generation
    Gaurav Verma, Siddhisanket Raskar, Zhen Xie, Abid M Malik, Murali Emani and Barbara Chapman
    International Workshop on Extreme Heterogeneity Solutions at PPoPP 2023

  • [AAAI'23] Towards Seamless Management of AI Models in High-Performance Computing
    Sixing Yu, Murali Emani, Chunhua Liao, Pei-Hung Lin, Tristan Vanderbruggen, Xipeng Shen, Ali Jannesari
    2nd AAAI workshop on AI to Accelerate Science and Engineering, 2023

  • [SC'22] GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics
    Maxim Zvyagin, Alexander Brace, Kyle Hippe, Yuntian Deng, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot-Sasson, Murali Emani, Sam Foreman, Zhen Xie, Diangen Lin, Maulik Shukla, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Rick Stevens, Anima Anandkumar, Venkatram Vishwanath, Arvind Ramanathan
    Winner of the ACM Gordon Bell Special Prize for HPC-based Covid-19 research, SC 2022

  • [PMBS'22] A Comprehensive Evaluation of Novel AI Accelerators for Deep Learning Workloads
    Murali Emani, Zhen Xie, Sid Raskar, Varuni Sastry, William Arnold, Bruce Wilson, Rajeev Thakur, Venkatram Vishwanath, Michael E Papka, Cindy Orozco Bohorquez, Rick Weisner, Karen Li, Yongning Sheng, Yun Du, Jian Zhang, Alexander Tsyplikhin, Gurdaman Khaira, Jeremy Fowers, Ramakrishnan Sivakumar, Victoria Godsoe, Adrian Macias, Chetan Tekur, Matthew Boyd
    13th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS) at SC 2022

  • [IEEE Access'22] Neural architecture search for transformers: A survey
    Krishna Teja Chitty-Venkata, Murali Emani, Venkatram Vishwanath, Arun K. Somani
    IEEE Access 2022

  • [HUST'22] Interactive NLU-Powered Ontology-Based Workflow Synthesis for FAIR Support of HPC
    Zifan Nan, Mithil Dave, Xipeng Shen, Chunhua Liao, Tristan Vanderbruggen, Pei-Hung Lin, Murali Emani
    IEEE/ACM 9th International Workshop on HPC User Support Tools (HUST-22) at SC 2022

  • [Correctness'22] Early Experience with Transformer-Based Similarity Analysis for DataRaceBench
    Winson Chen, Tristan Vanderbruggen, Pei-Hung Lin, Chunhua Liao, Murali Emani
    Correctness workshop at SC 2022

  • [TEML'22] Making Machine Learning Datasets and Models FAIR for HPC: A Methodology and Case Study
    Pei-Hung Lin, Chunhua Liao, Winson Chen, Tristan Vanderbruggen, Murali Emani and Hailu Xu
    1st Workshop on Trustable and Ethical Machine Learning (TEML) 2022

  • [SAML'22] Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines
    Patrick Flynn, Tristan Vanderbruggen, Chunhua Liao, Pei-Hung Lin, Murali Emani, and Xipeng Shen
    International Workshop on Software Architecture and Machine Learning (SAML), co-located with ECSA 2022

  • [IEEE Access'22] XUnified: A Framework for Guiding Optimal Use of GPU Unified Memory
    Hailu Xu, Murali Emani, Pei-Hung Lin, Liting Hu, and Chunhua Liao
    IEEE Access Journal, 2022

  • [H3'22] AI Benchmarking for Science: Efforts from the MLCommons Science Working Group
    Jeyan Thiyagalingam, Gregor von Laszewski, Junqi Yin, Murali Emani, Juri Papay, Gregg Barrett, Piotr Luszczek, Aristeidis Tsaris, Christine Kirkpatrick, Feiyi Wang, Tom Gibbs, Venkatram Vishwanath, Mallikarjun Shankar, Geoffrey Fox and Tony Hey
    Workshop on HPC on Heterogeneous Hardware (H3) at ICS'22

  • [HPDC '22] Efficient Design Space Exploration for Sparse Mixed Precision Neural Architectures
    Krishna Teja Chitty Venkata, Murali Emani, Venkatram Vishwanath, and Arun K Somani
    ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC)'22

  • [CFW '22] Towards Neural Architecture-Aware Exploration Of Compiler Optimizations in a Deep Learning Graph Compiler
    Gaurav Verma, Swetang Finviya, Abid M. Malik, Murali Emani, and Barbara Chapman
    Compiler Frontiers Workshop (CFW)'22

  • [ScaDL '22] Throughput-oriented and Accuracy-aware DNN Training with BFloat16 on GPU
    Zhen Xie, Sid Raskar, and Murali Emani
    Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL), IPDPS'22

  • [CCGRID '22] Towards an In-Depth Analysis of Multifidelity High Performance Computing Systems
    Fnu Shilpika, Bethany Lusch, Murali Emani, Filippo Simini, Venkatram Vishwanath, Michael Papka, Kwan-Liu Ma
    The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID'22)

  • [IJHPC '21] Intelligent Resolution: Integrating Cryo-EM with AI-driven Multi-resolution Simulations to Observe the SARS-CoV-2 Replication-Transcription Machinery in Action
    Anda Trifan, Defne Gorgun, Zongyi Li, Alexander Brace, Maxim Zvyagin, Heng Ma, Austin R Clyde, David A Clark, Michael Salim, David Hardy, Tom Burnley, Lei Huang, John McCalpin, Murali Emani, Hyunseung Yoo, Junqi Yin, Aristeidis Tsaris, Vishal Subbiah, Jessica Liu, Noah Trebesch, Geoffrey Wells, Venkatesh Mysore, Tom Gibbs, James Phillips, S Chakra Chennubhotla, Ian Foster, Rick Stevens, Anima Anandkumar, Venkatram Vishwanath, John E Stone, Emad Tajkhorshid, Sarah A Harris, Arvind Ramanathan
    The International Journal of High Performance Computing Applications (IJHPC'21)
    Finalist for the Gordon Bell Special Prize for HPC-based Covid-19 research

  • [MLHPC'21] MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems
    Steven Farrell, Murali Emani et al.
    Machine Learning in HPC Environments (MLHPC)'21

  • [MLHPC'21] HPCFAIR: Enabling FAIR AI for HPC Applications
    Gaurav Verma, Murali Emani, Chunhua Liao, Pei-Hung Lin, Tristan Lucas Vanderbruggen, Xipeng Shen, Barbara Chapman
    Machine Learning in HPC Environments (MLHPC)'21

  • [MLHPC'21] HPC Ontology: Towards a Unified Ontology for Managing Training Datasets and AI Models for High-Performance Computing
    Chunhua Liao, Pei-Hung Lin, Gaurav Verma, Tristan Lucas Vanderbruggen, Murali Emani, Zifan Nan, Xipeng Shen
    Machine Learning in HPC Environments (MLHPC)'21

  • [PASC'21] Stream-AI-MD: Streaming AI-driven Adaptive Molecular Simulations for Heterogeneous Computing Platform
    Alex Brace, Misha Salim, Vishal Subbiah, Heng Ma, Murali Emani, Anda Trifan, Austin Clyde, Corey Adams, Thomas Uram, Hyunseung Yoo, Andrew Hock, Jessica Liu, Venkatram Vishwanath, Arvind Ramanathan
    Platform for Advanced Scientific Computing (PASC)'21

  • [IEEE CSE'21] Accelerating Scientific Applications With SambaNova Reconfigurable Dataflow Architecture
    Murali Emani, Venkatram Vishwanath, Corey Adams, Michael E. Papka, Rick Stevens, Laura Florescu, Sumti Jairath, William Liu, Tejas Nama, Arvind Sujeeth
    IEEE Journal Computing in Science & Engineering '21

  • [ECP'20] Exploring Deep Learning for Science Benchmarks on DOE Supercomputers (poster)
    Aristeidis Tsaris, Jacob Balma, Murali Emani, Steve Farrell, Junqi Yin, Abid Malik, Prabhat, Venkatram Vishwanath, Mallikarjun Shankar
    Exascale Computing Project (ECP) Annual Meeting '20

  • [DAAC'19] MELA: A Visual Analytics Tool for Studying Multifidelity HPC System Logs
    Fnu Shilpika, Bethany Lusch, Murali Emani, Venkatram Vishwanath, Michael E. Papka, Kwan-Liu Ma
    In International Workshop on Data-Center Automation, Analytics, and Control (DAAC) at Supercomputing '19

  • [MCHPC '19] Machine Learning Guided Optimal Use of GPU Unified Memory
    Hailu Xu, Murali Emani, Pei-Hung Lin, Chunhua Liao
    In Workshop on Memory Centric High Performance Computing (MCHPC) at Supercomputing '19

  • [PMBS '18] Is Data Placement Optimization Still Relevant On Newer GPUs?
    Md Abdullah Shahneous Bari, Larisa Stoltzfus, Pei-Hung Lin, Chunhua Liao, Murali Emani, Barbara Chapman
    In 9th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS18) @ Supercomputing 2018

  • [MCHPC '18] Data Placement Optimization in GPU Memory Hierarchy using Predictive Modeling
    Larisa Stoltzfus, Murali Emani, Pei-Hung Lin, Chunhua Liao
    In Workshop on Memory Centric High Performance Computing (MCHPC) at Supercomputing '18

  • [EuroMPI '18] MPI Stages: Checkpointing MPI State for Bulk Synchronous Applications
    Nawrin Sultana, Anthony Skjellum, Ignacio Laguna, Matthew Farmer, Kathryn Mohror and Murali Emani
    In EuroMPI, 2018

  • [ICS '18] Bootstrapping Parameter Space Exploration for Fast Tuning
    Jay Thiagarajan, Nikhil Jain, Rushil Anirudh, Alfredo Gimenez, Rahul Sridhar, Aniruddha Marathe, Tao Wang, Murali Emani, Abhinav Bhatele and Todd Gamblin
    In International Conference on Supercomputing, 2018

  • [CnC Journal '18] EReinit: Scalable and Efficient Fault-Tolerance for Bulk-Synchronous MPI Applications
    Sourav Chakraborty, Ignacio Laguna, Murali Emani, Kathryn Mohror, Dhabaleswar K. Panda, Martin Schulz and Hari Subramoni
    In Concurrency and Computation: Practice and Experience, 2018

  • [ExaMPI '17] Checkpointable MPI: A Transparent, Fault-Tolerance Approach for MPI
    Murali Emani, Ignacio Laguna, Kathryn Mohror, Nawrin Sultana and Anthony Skjellum
    In Workshop on Exascale MPI, 2017

  • [ExaMPI '17] EReinit: Scalable and Efficient Fault-Tolerance for Bulk-Synchronous MPI Applications
    Sourav Chakraborty, Ignacio Laguna, Murali Emani, Kathryn Mohror, Dhabaleswar K. Panda, Martin Schulz and Hari Subramoni
    In Workshop on Exascale MPI, 2017

  • [CnC '16] Adaptive Tuning of Parallel Programs with CnC
    Murali Emani
    In Annual Concurrent Collections Workshop, 2016

  • [LCPC '16] Mapping Medley: Adaptive Parallelism Mapping with Varying Optimization Goals
    Murali Krishna Emani (LLNL)
    In 29th International Workshop on Languages and Compilers for Parallel Computing (LCPC), 2016

  • [PACT '16] Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in Dense 3D Scene Understanding
    Bruno Bodin (U. of Edinburgh), Luigi Nardi and Zia Zeeshan (Imperial College London), Harry Wagstaff, Govind Sreekar Shenoy (U. of Edinburgh), Murali Emani (LLNL), John Mawer, Christos Kotselidis, Andy Nisbet and Mikel Lujan (U. of Manchester), Bjoern Franke (U. of Edinburgh), Paul Kelly (Imperial College London), Michael O'Boyle (U. of Edinburgh)
    In International Conference on Parallel Architectures and Compilation Techniques, 2016

  • [REC2 '15] Ensemble of Mapping Techniques for Improved Efficiency
    Murali Krishna Emani and Michael O'Boyle
    In Workshop on Resource-Efficient Cloud Computing, 2015

  • [CnC '15] A Mixture of Experts Approach for Parallelism Mapping in Dynamic Environment
    Murali Krishna Emani and Michael O'Boyle
    In Annual Concurrent Collections Workshop, 2015

  • [PLDI '15] Celebrating Diversity: A Mixture of Experts Approach for Runtime Mapping in Dynamic Environments [pdf]
    Murali Krishna Emani and Michael O'Boyle
    In 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

  • [LCPC '14] Change Detection based Parallelism Mapping: Exploiting Offline Models and Online Adaptation [pdf]
    Murali Krishna Emani and Michael O'Boyle
    In The 27th International Workshop on Languages and Compilers for Parallel Computing, 2014

  • [HiPC '13] A Novel Technique to Improve Parallel Program Performance Co-executing with Dynamic Workloads [pdf]
    Murali Krishna Emani and Michael O'Boyle
    Workshop on Performance Engineering and Applications, IEEE International Conference on High Performance Computing 2013

  • [ICAC '13] Self-Adaptive Parallelism Mapping in Dynamic Environments
    Murali Krishna Emani and Michael O'Boyle
    In Doctoral Forum, USENIX International Conference on Autonomic Computing, 2013

  • [CGO '13] Smart, Adaptive Mapping of Parallelism in the Presence of External Workload [pdf]
    Murali Krishna Emani, Zheng Wang and Michael O'Boyle
    In International Symposium on Code Generation and Optimization, February 2013

  • [GTC '10] High Performance Complex Event Processing on GPGPU [pdf]
    Murali Krishna Emani and Sudeep Mallick
    In GPU Technology Conference, 2010

  • [ICCSE '09] Parallelism in Workflows/BPM: Approaches, Techniques and Applications
    Murali Krishna Emani
    International Conference on Computer Science and Software Engineering, 2009

  • PhD : Adaptive Parallelism Mapping in Dynamic Environments using Machine Learning [pdf]
    University of Edinburgh, UK, 2015

  • Masters : Scalability of J2EE Applications on Multi-core Machines
    IIIT Bangalore, India, 2008

Granted Patents

  • US Patent Publication number: US9317456 B2
    Method and system for performing event-matching with a graphical processing unit
    Murali Krishna Emani, Sudeep Mallick

  • US Patent Publication number: US8869125 B2
    Systems and Methods for Demarcating Information Related to one or more Blocks in an Application
    Murali Krishna Emani, Sudeep Mallick and Balkrishna Prasad

  • US Patent Publication number: 9,043,775
    Method for Identifying Problematic Loops in an Application and Devices thereof
    Murali Krishna Emani, Sudeep Mallick and Balkrishna Prasad