August 2005
The Coming Era of Low Power, High-Performance Computing — Trends, Promises, and Challenges
Wu-chun Feng, Los Alamos National Laboratory

Low-Power HPC (and Power-Aware HPC): The Future

Implicit in the preceding discussion is the distinction between capability and capacity computing. According to Graham et al,5 capability computing applies maximum processing power to solve a large problem in a short period of time — with the main figure of merit being “time to solution.” Another important facet to capability computing is the ability to solve problems of a magnitude that have never been solved before. Examples of such systems are the DOE ASCI-class supercomputers such as ASCI White and the recently demonstrated ASC Purple supercomputer — the Formula One race cars of supercomputing.

In contrast, capacity systems are typically cheaper and less performance-capable than capability systems on a per-node basis as well as relative to the entire system. Capacity systems allow scientists to explore design alternatives that are often needed to prepare for larger-scale runs on capability systems. In addition, capacity systems typically solve a multitude of smaller problems simultaneously. Systems such as Green Destiny, MegaProto, Orion Multisystems DS-96, and arguably Blue Gene/L fit into this category.

Because low-power HPC generally sacrifices a measurable amount of performance (e.g., 3.6-GHz Intel Xeon CPU versus 1.4-GHz Transmeta Efficeon CPU) to achieve substantially lower power consumption per node (e.g., 151 W versus 7 W), and hence, better efficiency and reliability, low-power HPC will be confined to capacity computing for the foreseeable future. See citations35 36 for the latest results in low-power HPC.

But what about capability computing? HPC vendors now realize that in building capability systems, power consumption is becoming a primary design constraint because of the exorbitant operational costs associated with such systems due to their inefficiency and because of its effect of reliability, as noted in Table 1. Excessive power consumption is becoming such a dominant issue that ASC Purple requires new air-handling designs and specifications because of the 7.5-MW required to power the system and the cooling equipment. This 7.5-MW appetite equates to powering 7,500 typical homes.

With low-power HPC unable to support the requirements of capability computing and too much power being consumed by traditional capability systems, what the HPC community should expect to see over the next decade is the emergence of power-aware solutions for capability computing. These solutions will ultimately reduce operational costs and improve reliability and availability, particularly in capacity systems, while minimizing impact on overall performance. We are already seeing indications of this trend at SC2005 where the following three technical papers will be presented on power-aware HPC:

  1. R. Ge, X. Feng, and K. Cameron, “Performance-Constrained, Distributed DVS Scheduling for Scientific Applications on Power-Aware Clusters.”
    Describes a software framework for implementing and evaluating dynamic voltage and frequency scaling, where performance-directed scheduling is of particular interest.
  2. C. Hsu and W. Feng, “A Power-Aware Run-Time System for High-Performance Computing.”
    Presents a power-aware run-time system on a high-end commodity cluster that automatically and transparently adapts its voltage and frequency settings to achieve about 20% energy savings on average with minimal impact on performance.
  3. N. Kappiah, V. Freeh, and D. Lowenthal, “Just-in-Time Dynamic Voltage Scaling: Explointing Inter-Node Slack to Save Energy in MPI Programs.”
    Saves energy by taking advantage of the slack time that exists when the computational load is not perfectly balanced across a HPC system.

As noted earlier, a power-aware approach makes use of commodity processors (e.g., AMD Opteron33) with dynamic voltage and frequency scaling (e.g., PowerNow!33) to ensure high-end capability performance while reducing power consumption. For the capability supercomputer called ASC Purple, using our power-aware run-time system would reduce the power envelope by 1.3 MW on average, thus reducing its electrical bill by $1.37M/year, when assuming a rate of $0.12/kWh. Furthermore, such a dramatic reduction in power consumption would lengthen the life of system components in the supercomputer, and hence, improve overall reliability of the supercomputer as well as those presented in Table 1.

Pages: 1 2 3 4 5 6 7

Reference this article
Feng, W. "The Importance of Being Low Power in High Performance Computing," CTWatch Quarterly, Volume 1, Number 3, August 2005. http://www.ctwatch.org/quarterly/articles/2005/08/the-importance-of-being-low-power-in-high-performance-computing/

Any opinions expressed on this site belong to their respective authors and are not necessarily shared by the sponsoring institutions or the National Science Foundation (NSF).

Any trademarks or trade names, registered or otherwise, that appear on this site are the property of their respective owners and, unless noted, do not represent endorsement by the editors, publishers, sponsoring institutions, the National Science Foundation, or any other member of the CTWatch team.

No guarantee is granted by CTWatch that information appearing in articles published by the Quarterly or appearing in the Blog is complete or accurate. Information on this site is not intended for commercial purposes.