Alumni Project
The Performance Evaluation Research Center (PERC)
Summary
The Performance Evaluation Research Center (PERC) project is developing a science for understanding and improving the performance of scientific application codes on large-scale computer systems. PERC integrates several previously active efforts in the high performance computing community, and is forging alliances with several other SciDAC scientific projects. These interactions not only help these scientific teams understand and improve the performance of their codes, but also ensure that the techniques and tools developed in this activity are truly useful to the broader DOE Office of Science community.
The PERC project seeks to:
- Understand the key factors in scientific codes that affect performance;
- Understand the key factors in computer systems that affect performance;
- Develop models that accurately predict performance of codes on systems;
- Develop an enabling infrastructure of tools for performance monitoring, modeling and optimization;
- Validate these ideas and infrastructure via close collaboration with DOE SciDAC projects and other scientists;
- Transfer this technology to scientists.
This activity focuses on high performance computing (HPC) systems (i.e., large distributed memory parallel systems, large shared memory systems, and large cluster systems), although the techniques and tools developed here are also expected to benefit scientists using smaller systems. PERC also focuses on representative scientific applications and problems of interest to the DOE Office of Science (DOE/SC), initially those areas emphasized in the SciDAC scientific projects.
PERC researchers believe that overall performance (namely wall-clock execution time) is dominated by how well a scientific code exploits the entire memory hierarchy of a machine. Hence, a science of performance must analyze performance phenomena from the register and CPU level up to the scale of the interprocessor network and beyond.
PERC focuses on four thrusts:
- Application and system benchmarking;
- Performance analysis tools;
- Performance modeling and analysis; and
- Performance optimizers.
PERC benchmarking activities target both application characterization and machine measurement. PERC has developed effective low-level benchmark programs that accurately measure multi-level memory system performance. PERC has also adapted large-scale scientific applications for use as high-level benchmarks. These benchmark codes permit PERC researchers to compare systems and analyze low-level performance.
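To illustrate what a low-level memory benchmark does, here is a minimal sketch of a working-set sweep. It is not one of the PERC benchmark programs: the function names, buffer sizes and repeat counts are assumptions chosen for illustration, and a production probe would be written in C or Fortran with controlled access strides. The idea is to time the same operation (a contiguous copy) over buffers of increasing size, so that changes in the observed rate hint at transitions between levels of the memory hierarchy.

```python
import time


def copy_bandwidth(size_bytes, total_bytes=2**24, repeats=3):
    """Return the best observed copy rate (bytes/second) for one buffer size.

    The buffer is copied enough times to move roughly total_bytes per trial,
    so small and large buffers are timed over comparable amounts of data.
    """
    src = bytearray(size_bytes)
    inner = max(1, total_bytes // size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(inner):
            dst = bytes(src)  # one contiguous memory copy of the buffer
        elapsed = time.perf_counter() - start
        best = min(best, max(elapsed, 1e-12))  # guard against a zero reading
    return (size_bytes * inner) / best


def sweep(sizes):
    """Measure the copy rate for each buffer size; returns {size: bytes/second}."""
    return {n: copy_bandwidth(n) for n in sizes}


if __name__ == "__main__":
    # Hypothetical sweep from 4 KB to 16 MB, in powers of four.
    for n, rate in sweep([2**k for k in range(12, 25, 2)]).items():
        print(f"{n:>10d} bytes  {rate / 1e9:8.2f} GB/s")
```

For small buffers, Python call overhead rather than the memory system limits the rate, so only the trend at larger sizes is meaningful in this sketch.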
PERC performance analysis tools span the spectrum from low-level infrastructure to high-level end-user tools. PERC researchers are improving the PAPI hardware performance monitoring infrastructure being developed at the University of Tennessee and the dynamic instrumentation API being developed at the University of Maryland, and are integrating these technologies with end-user tools. End-user tool efforts include the SvPablo toolkit under development at the University of Illinois at Urbana-Champaign and the Sigma tool for cache measurement from the University of Maryland, with participation by IBM Research.
PERC researchers are pursuing several distinct performance modeling and analysis strategies, including machine signatures, application signatures, statistical modeling and performance bound analysis. Our vision is to develop tools and techniques that can accurately estimate the performance of a given application on a given computer system. One highlight of this activity during this past year is the development of an infrastructure of low-level benchmarks, high-level tools and a “convolution” methodology that has demonstrated performance predictions accurate to within a few percent in tested cases involving large-scale scientific codes.
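The general idea of combining a machine signature with an application signature can be sketched as follows. This is a simplified illustration under stated assumptions, not the PERC convolution method itself: the category names, the example counts and rates, and the assumption that time in each category adds up with no overlap are all hypothetical. The application signature records how much work of each kind the code performs, the machine signature records the rate a system sustains for each kind of work, and the prediction divides one by the other and sums the results.

```python
def predict_time(app_signature, machine_signature):
    """Estimate runtime in seconds as the sum over categories of count / rate.

    app_signature:     {category: amount of work, e.g. flops or bytes}
    machine_signature: {category: sustained rate in the same units per second}
    Assumes no overlap between categories (an illustrative simplification).
    """
    total = 0.0
    for category, count in app_signature.items():
        if category not in machine_signature:
            raise KeyError(f"no machine rate measured for {category!r}")
        total += count / machine_signature[category]
    return total


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to make the arithmetic easy to follow.
    app = {"flops": 2e9, "cache_bytes": 8e9, "memory_bytes": 4e9, "network_bytes": 1e8}
    machine = {"flops": 1e9, "cache_bytes": 8e9, "memory_bytes": 1e9, "network_bytes": 1e8}
    print(f"predicted runtime: {predict_time(app, machine):.1f} s")
```

In this sketch the machine rates would come from low-level benchmarks and the application counts from tracing or instrumentation, which is why the two kinds of measurement are gathered separately.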
PERC researchers have established collaborations with several SciDAC application projects and other scientists performing large-scale computations under DOE/SC sponsorship:
- TSI Supernova project: In collaboration with TSI researchers, PERC has extensively analyzed the astrophysics codes EVH1 and Agile-Boltztran using SvPablo and other tools. Several benchmark kernels were developed and shared with the TSI group. Removal of certain performance bottlenecks resulted in significantly better performance.
- Accelerator S&T project: PERC is analyzing the Standard Template Library, which is used heavily in these C++ codes, and has identified a number of optimizations for one key code.
- Lattice Gauge Theory project: PERC researchers have analyzed the performance of the MILC code, a very important lattice gauge QCD code. In particular, the communication performance has been studied on a Pentium-4 cluster system.
- Community Climate System Model project: PERC has studied the CAM benchmark program in detail, motivating changes that have significantly improved the parallel scalability of CAM. Baseline performance figures for the POP ocean model have been generated as well, and analysis has begun on this code.
- Wave-Plasma Interaction Fusion project: PERC has benchmarked, analyzed and modeled the performance of the AORSA3D application code. PERC is working with the fusion sciences community to identify additional benchmarks for analysis.
- Electronic Structure Theory project: PERC is developing a performance model for a calculation in GAMESS, a widely used electronic structure computation package.
- Terascale Optimal PDE Solvers project: PERC researchers have analyzed several codes, resulting in reductions both in floating-point operation counts and overall runtime for a mesh smoothing code. In addition, we have developed a performance bound model for TOPS codes, including a 2D/3D radiation transport code.
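As a rough illustration of what a performance bound model computes, the sketch below derives a lower bound on runtime from operation counts and peak machine rates. It is a generic bound, not the model PERC developed for the TOPS codes: the function names and the example numbers are assumptions. A code cannot finish faster than its floating-point work at peak rate, nor faster than its memory traffic at peak bandwidth, so the larger of the two times bounds the runtime from below.

```python
def runtime_lower_bound(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Return a lower bound on runtime in seconds.

    The bound is the larger of the time needed for the arithmetic at peak
    floating-point rate and the time needed for the memory traffic at peak
    bandwidth.
    """
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    return max(compute_time, memory_time)


def fraction_of_bound(measured_seconds, bound_seconds):
    """Return how close a measured run is to the bound (1.0 means at the bound)."""
    return bound_seconds / measured_seconds


if __name__ == "__main__":
    # Hypothetical kernel: 1e9 flops and 8e9 bytes on a 1 Gflop/s, 2 GB/s machine.
    bound = runtime_lower_bound(1e9, 8e9, 1e9, 2e9)
    print(f"bound: {bound:.1f} s, efficiency at 8 s measured: "
          f"{fraction_of_bound(8.0, bound):.2f}")
```

Comparing a measured runtime against such a bound shows how much room for optimization remains, and whether arithmetic or memory traffic is the limiting factor.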
For further information please contact:
Dr. David H. Bailey, PERC Project Lead
Lawrence Berkeley National Laboratory
Tel: 510-495-2773 Email: dhbailey@lbl.gov