August 2005
The Coming Era of Low Power, High-Performance Computing — Trends, Promises, and Challenges
Satoshi Matsuoka, Tokyo Institute of Technology


All is not favorable, however. There are various drawbacks to low power designs, some of which are generic and some of which are more peculiar to HPC. Both types require substantial research and engineering.


  • Increased system complexity — Low power design obviously adds complexity to the overall system, in hardware and software as well as in overall management. We omit a more detailed discussion here for the sake of brevity.
  • Increased sporadic failures — In low power systems generally, the chance of sporadic failures may increase for several reasons, including reduced noise margins caused by lowered supply voltages and tighter timing constraints. Such failures will have to be compensated for by careful and somewhat conservative circuit design, sanity checking, redundancy, and the like. Another possibility is to employ software checking and recovery more extensively, but such measures tend to be difficult to implement without some hardware support.
  • Increased failures as the number of components is scaled up — In some low power HPC architectures, the desire to exploit a “slow and parallel” strategy leads to designs with a higher number of nodes, and thus more components in the system. For example, the largest BlueGene/L on the Top500 to date sports 65,536 CPU cores, an order of magnitude more than any other machine on the list; by comparison, the Earth Simulator has only 5,120 cores. Certainly, core count is only one metric and cannot account for overall machine stability. In fact, BlueGene/L has gone to great lengths to reduce the number of overall system components, and the results from early deployments have demonstrated that it is quite a reliable machine. Nonetheless, as we approach the petaflops range, the increase in component count will become substantially more demanding.
  • Reliance on extremely high parallel efficiency to extract performance — Since each processor in such low power designs is slow, achieving good overall performance requires a much higher degree of parallel efficiency than on conventional high-performance, high-power CPUs. Unless the application can exhibit considerable parallel efficiency, we will not be able to attain proper performance from the system. If the inefficiency is due to the software or the underlying hardware, solutions may be available to resolve it. However, if the cause is fundamental to the algorithm, with unavoidable serialization capping the available parallelism, then we will have to resort to redesigning the application's fundamental algorithm. This can be very difficult, however, especially for very large legacy applications.
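The last point can be made concrete with Amdahl's law. The sketch below (an illustration, not from the article) shows why even a tiny unavoidable serial fraction caps the benefit of a "slow and parallel" design, no matter how many low power processors are added.

```python
# Amdahl's law: maximum speedup on n processors when a fixed fraction
# of the work cannot be parallelized.
def amdahl_speedup(serial_fraction, n_procs):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# Even 0.1% unavoidable serialization caps speedup below 1000x,
# here on a BlueGene/L-scale machine with 65,536 cores.
print(amdahl_speedup(0.001, 65536))   # roughly 985x, far below 65,536x
```

Since each core in a low power design delivers only a fraction of the flops of a high-power CPU, hitting such ceilings hurts proportionally more.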
Column key: (A) advanced vector — Earth Simulator => SX-8; (B) high-density cluster — Itanium Montecito blade + InfiniBand 4x; (C) low power CPU, super high density — BlueGene/L.

                                         (A)        (B)        (C)
  GFLOPS/CPU core                         16          8        2.8
  CPU cores/chip                           1          2          2
  CPU chips/cabinet                        8         72       1024
  TFLOPS/cabinet                       0.128      1.152     5.7344
  Memory BW/chip (GB/s)                   64     10.672        6.4
  Memory BW/cabinet (GB/s)               512    768.384     6553.6
  Network BW/chip (MB/s)                 N/A        625       1050
  Network bytes/s per flop             0.125  0.0390625     0.1875
  #Cabinets for 1 PF (+30% network)    10156       1128        174
  Physical size relative to ES         13.22       1.47       0.23
  Power/cabinet (kW)                       9         15         25
  Total power, incl. 30% cooling (MW) 118.83      22.00       5.66
  Power relative to ES (8 MW)          14.85       2.75       0.71
  Cost/cabinet ($M US)                     1          1        1.5
  Total cost ($B US)                   10.16       1.13       0.26
  Cost relative to ES ($400M US)       25.39       2.82       0.65

Table 1. Modern HPC Machine Parameters


Reference this article
Matsuoka, S. "Low Power Computing for Fleas, Mice, and Mammoth — Do They Speak the Same Language?" CTWatch Quarterly, Volume 1, Number 3, August 2005. http://www.ctwatch.org/quarterly/articles/2005/08/low-power-computing-for-fleas-mice-and-mammoth/
