2008-11: Scalable Algorithms for Petascale Systems with Multicore Architectures

This work is part of the U.S. Department of Energy’s Institute for Advanced Architecture and Algorithms (IAA). The IAA was established in 2008 to facilitate the co-design of architectures and applications so that the two evolve together, closing the gap between the peak capabilities of the hardware and the performance actually realized by high performance computing applications (the application-architecture performance gap).

This project focuses on the development of architecture-aware algorithms and the supporting runtime features needed by these algorithms to solve general sparse linear systems common in many scientific applications. Targeted architecture-aware algorithms include (1) multi-precision Krylov solvers, preconditioners, and multi-level smoothers, (2) multi-resolution, multi-precision fast Poisson and Helmholtz solvers, (3) multi-core aware hybrid algorithms for preconditioning, and (4) parallel-in-time algorithms based on Krylov Deferred Correction. Targeted features within an architecture-aware runtime environment include multi-core aware Message Passing Interface (MPI) memory allocation, multi-level MPI communicators, and process-to-core and memory-to-core affinity.
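
To give a rough feel for what a multi-precision solver does, the following is a minimal, generic sketch of mixed-precision iterative refinement. It is not the project's algorithm. It assumes a small dense symmetric positive definite system stored as Python lists, and it simulates single precision by rounding values through `struct`. The function names (`f32`, `cg_low_precision`, `refine`) and all tolerances are hypothetical choices made for this illustration. The inner conjugate-gradient (Krylov) solve runs in simulated float32, while the residual and the solution update are kept in double precision.

```python
import struct


def f32(v):
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_low_precision(A, b, iters):
    """Conjugate-gradient steps with all stored values rounded to float32.

    Illustrative assumption: A is symmetric positive definite.
    """
    A32 = [[f32(a) for a in row] for row in A]
    x = [0.0] * len(b)
    r = [f32(v) for v in b]
    p = list(r)
    rs = f32(dot(r, r))
    rs0 = rs
    for _ in range(iters):
        # Stop near float32 accuracy (relative residual about 1e-7).
        if rs == 0.0 or rs <= 1e-14 * rs0:
            break
        Ap = [f32(v) for v in matvec(A32, p)]
        pAp = dot(p, Ap)
        if pAp <= 0.0:  # guard against breakdown caused by rounding
            break
        alpha = f32(rs / pAp)
        x = [f32(xi + alpha * pi) for xi, pi in zip(x, p)]
        r = [f32(ri - alpha * api) for ri, api in zip(r, Ap)]
        rs_new = f32(dot(r, r))
        beta = rs_new / rs
        p = [f32(ri + beta * pi) for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def refine(A, b, tol=1e-12, max_outer=20, inner_iters=10):
    """Outer refinement loop in double precision around the float32 solve.

    Returns the solution and the number of low-precision solves performed.
    """
    x = [0.0] * len(b)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for outer in range(max_outer):
        # The residual is computed in full double precision.
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, outer
        # The correction comes from the cheap low-precision solve.
        d = cg_low_precision(A, r, inner_iters)
        x = [xi + di for xi, di in zip(x, d)]
    return x, max_outer


if __name__ == "__main__":
    # 1D Laplacian test matrix, chosen only for illustration.
    A = [[2.0, -1.0, 0.0, 0.0],
         [-1.0, 2.0, -1.0, 0.0],
         [0.0, -1.0, 2.0, -1.0],
         [0.0, 0.0, -1.0, 2.0]]
    b = [1.0, 0.0, 0.0, 0.0]
    x, solves = refine(A, b)
    print("solution:", x, "low-precision solves:", solves)
```

The design point of the sketch is that most of the arithmetic happens in the cheaper precision, while a small amount of double-precision work in the outer loop recovers double-precision accuracy for well-conditioned problems.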

This project further focuses on evaluating the algorithmic impact of future architecture choices and determining what architecture changes would have the highest impact. The evaluation includes (1) detailed performance analyses of key computational kernels on different simulated node architectures, (2) analysis and development of new memory access capabilities that may improve use of memory bandwidth and cache memory resources, and (3) simulation of system architectures at full scale to evaluate the scalability and fault tolerance behavior of key science algorithms.

Peer-reviewed Conference Publications

  1. Swen Böhm and Christian Engelmann. xSim: The Extreme-Scale Simulator. In Proceedings of the International Conference on High Performance Computing and Simulation (HPCS) 2011, pages 280-286, Istanbul, Turkey, July 4-8, 2011. IEEE Computer Society, Los Alamitos, CA, USA. ISBN 978-1-61284-383-4. DOI 10.1109/HPCSim.2011.5999835. Acceptance rate 28.1% (48/171).

Peer-reviewed Workshop Publications

  1. Ian S. Jones and Christian Engelmann. Simulation of Large-Scale HPC Architectures. In Proceedings of the 40th International Conference on Parallel Processing (ICPP) 2011: 2nd International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI), pages 447-456, Taipei, Taiwan, September 13-19, 2011. IEEE Computer Society, Los Alamitos, CA, USA. ISBN 978-0-7695-4511-0. ISSN 1530-2016. DOI 10.1109/ICPPW.2011.44.
  2. Christian Engelmann and Frank Lauer. Facilitating Co-Design for Extreme-Scale Systems Through Lightweight Simulation. In Proceedings of the 12th IEEE International Conference on Cluster Computing (Cluster) 2010: 1st Workshop on Application/Architecture Co-design for Extreme-scale Computing (AACEC), pages 1-8, Hersonissos, Crete, Greece, September 20-24, 2010. IEEE Computer Society, Los Alamitos, CA, USA. ISBN 978-1-4244-8395-2. DOI 10.1109/CLUSTERWKSP.2010.5613113.

Talks and Lectures

  1. Christian Engelmann. Resilience and Hardware/Software Co-design for Extreme-Scale Supercomputing. Seminar at the Barcelona Supercomputing Center, Barcelona, Spain, July 27, 2011.
  2. Christian Engelmann. Beyond Application-Level Checkpoint/Restart – Advanced Software Approaches for Fault Resilience. Talk at the 39th SPEEDUP Workshop on High Performance Computing, Zurich, Switzerland, September 6, 2010.
  3. Christian Engelmann and Stephen L. Scott. HPC System Software Research at Oak Ridge National Laboratory. Seminar at the Leibniz Rechenzentrum (LRZ), Garching, Germany, February 22, 2010.
  4. Christian Engelmann. High-Performance Computing Research Internship and Appointment Opportunities at Oak Ridge National Laboratory. Seminar at the Department of Computer Science, University of Reading, Reading, United Kingdom, December 14, 2009.
  5. Christian Engelmann. JCAS – IAA Simulation Efforts at Oak Ridge National Laboratory. Invited talk at the IAA Workshop on HPC Architectural Simulation (HPCAS), Boulder, CO, USA, September 1-2, 2009.

Co-advised Theses

  1. Ian S. Jones. Simulation of Large Scale Architectures on High Performance Computers. Master’s thesis, Department of Computer Science, University of Reading, UK, October 22, 2010. Thesis research performed at Oak Ridge National Laboratory. Advisors: Prof. Vassil N. Alexandrov (University of Reading); Christian Engelmann (Oak Ridge National Laboratory); George Bosilca (University of Tennessee, Knoxville).
  2. Frank Lauer. Simulation of Advanced Large-Scale HPC Architectures. Master’s thesis, Department of Computer Science, University of Reading, UK, March 12, 2010. Thesis research performed at Oak Ridge National Laboratory. Advisors: Prof. Vassil N. Alexandrov (University of Reading); Christian Engelmann (Oak Ridge National Laboratory); George Bosilca (University of Tennessee, Knoxville).
