February 2007
The Promise and Perils of the Coming Multicore Revolution and Its Impact
John McCalpin, Advanced Micro Devices, Inc.
Chuck Moore, Advanced Micro Devices, Inc.
Phil Hester, Advanced Micro Devices, Inc.


For the purposes of this article, the performance impacts of various configuration options will be estimated using an analytical model with coefficients "tuned" to provide the best fit for a large subset of the SPECfp2000 and SPECfp_rate2000 benchmark results collected in March 2006 [4]. The analysis included 508 SPECfp2000 results and 730 SPECfp_rate2000 results. A further 233 results were excluded from the analysis because either advanced compiler optimizations or unusual hardware configurations made them inappropriate for comparison with the bulk of the results.

The performance model has been described previously [5, 6] but has been extended here to include a much more complete set of data and has been applied to each of the 14 SPECfp2000 benchmarks as well as to the geometric mean values. Although the model does not capture some of the details of the performance characteristics of these benchmarks, a least-squares fit to a large number of results greatly reduces the random "noise" associated with the individual results and makes the model largely platform-independent.

In brief, the model assumes that the execution time of each benchmark is the sum of a "CPU time" and a "memory time." The amount of "work" done by the memory subsystem is a simple function of the cache size: it decreases linearly from a maximum value with no cache to a minimum value with a "large" cache (where "large" is also a parameter of the model) and stays constant at that minimum for caches larger than the "large" size. The rate at which CPU work is completed is assumed to be proportional to the peak floating-point performance of the chip for 64-bit IEEE arithmetic, while the rate at which memory work is completed is assumed to be proportional to the performance of the system on the 171.swim (base) benchmark. Previous studies have shown a strong correlation between the performance of the 171.swim benchmark and direct measurement of sustained memory performance using the STREAM benchmark [7].
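The structure of the model described above can be sketched in a few lines of Python. This is only an illustration of the functional form, not the fitted model itself: the function and parameter names are hypothetical, the proportionality constants are assumed to be folded into the "work" coefficients, and the numbers in the usage lines are made up rather than taken from the SPEC data.

```python
def memory_work(cache_mb, work_no_cache, work_large_cache, large_cache_mb):
    """Memory work as a function of cache size.

    Decreases linearly from work_no_cache (0 MB of cache) to
    work_large_cache (at large_cache_mb), then stays constant.
    All parameter names are hypothetical labels for model coefficients.
    """
    if cache_mb >= large_cache_mb:
        return work_large_cache
    fraction = cache_mb / large_cache_mb
    return work_no_cache + fraction * (work_large_cache - work_no_cache)


def execution_time(cpu_work, peak_flops_rate, swim_rate, cache_mb,
                   work_no_cache, work_large_cache, large_cache_mb):
    """Execution time = CPU time + memory time.

    CPU work is retired at a rate proportional to peak 64-bit
    floating-point performance; memory work is retired at a rate
    proportional to the 171.swim (base) result. Proportionality
    constants are assumed to be absorbed into the work terms.
    """
    cpu_time = cpu_work / peak_flops_rate
    mem_time = memory_work(cache_mb, work_no_cache, work_large_cache,
                           large_cache_mb) / swim_rate
    return cpu_time + mem_time


# Illustrative (made-up) coefficients, not fitted values.
t = execution_time(cpu_work=100.0, peak_flops_rate=5.0, swim_rate=3.0,
                   cache_mb=2.0, work_no_cache=10.0,
                   work_large_cache=2.0, large_cache_mb=4.0)
print(t)
```

In this form the per-benchmark unknowns are the CPU work, the two memory-work levels, and the "large" cache size; the last of these enters nonlinearly, so a fit would need to search over it rather than solve a single linear system.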

The model results are strongly correlated with the measured results, with 75% of the measurements falling within 15% of the projection. This suggests that the underlying model assumptions are reasonably consistent with the actual performance characteristics of these systems on these benchmarks. Although there are some indications of systematic errors in the model, not all of the differences between the model and the observations are due to oversimplification of the hardware assumptions – much of the variance also appears to be due to differences in compilers, compiler options, operating systems, and benchmark configurations. Overall, the model seems appropriately robust to use as a basis for illustrations of performance and price/performance sensitivities in microprocessor-based systems.
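A statistic of the kind quoted above (the share of measurements falling within a given relative distance of the projection) could be computed as in the following sketch. The function name and the sample values are hypothetical and are not drawn from the SPEC results; the error is taken relative to the projection, which is one reading of "within 15% of the projection."

```python
def fraction_within(measured, projected, tolerance=0.15):
    """Fraction of measurements within `tolerance` (relative) of the projection."""
    if len(measured) != len(projected) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for m, p in zip(measured, projected)
               if abs(m - p) <= tolerance * abs(p))
    return hits / len(measured)


# Made-up values for illustration only.
measured = [100.0, 120.0, 90.0, 200.0]
projected = [100.0, 100.0, 100.0, 100.0]
share = fraction_within(measured, projected)
print(share)
```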

Assumptions and Modeling

For the performance and price/performance analysis, we will assume:

  • The bare, two-socket system (with disks, memory, and network interfaces, but without CPUs) costs $1,500.
  • The base CPU configuration is a single-core processor at 2.4 GHz with a 1 MB L2 cache, costing $300.
  • The die is assumed to be about ½ CPU core and about ½ L2 cache, with the other on-die functionality limited to a small fraction of the total area.
  • The "smaller chip" configuration is a single-core processor at 2.8 GHz with a 1 MB L2 cache, costing $150.
  • The "lots of cache" configuration is a single-core processor at 2.8 GHz with a 3 MB L2 cache, costing $300.
  • The "more cores" configuration is a dual-core processor at 2.0 GHz with 1 MB L2 cache per core, costing $300.
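The price side of these assumptions is simple arithmetic and can be sketched as follows. The sketch assumes that both sockets are populated with identical CPUs, which the list above does not state explicitly; the dictionary layout and function names are hypothetical. Performance is left as an input, since it would come from the fitted model rather than from the assumptions listed here.

```python
BARE_SYSTEM_COST = 1500   # two-socket system without CPUs, in dollars
SOCKETS = 2               # assumption: both sockets hold identical CPUs

# CPU configurations as listed in the assumptions above.
CONFIGS = {
    "base":          {"cores": 1, "ghz": 2.4, "l2_mb_per_core": 1, "cpu_cost": 300},
    "smaller chip":  {"cores": 1, "ghz": 2.8, "l2_mb_per_core": 1, "cpu_cost": 150},
    "lots of cache": {"cores": 1, "ghz": 2.8, "l2_mb_per_core": 3, "cpu_cost": 300},
    "more cores":    {"cores": 2, "ghz": 2.0, "l2_mb_per_core": 1, "cpu_cost": 300},
}


def system_cost(cpu_cost, sockets=SOCKETS, bare_cost=BARE_SYSTEM_COST):
    """Total system price: bare system plus one CPU per socket."""
    return bare_cost + sockets * cpu_cost


def price_performance(cost, performance):
    """Dollars per unit of performance (lower is better)."""
    return cost / performance


costs = {name: system_cost(cfg["cpu_cost"]) for name, cfg in CONFIGS.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost}")
```

Under these assumptions the "smaller chip" system is the only one whose total price differs from the base system, so any price/performance differences among the other three configurations come entirely from performance.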


Reference this article
McCalpin, J., Moore, C., Hester, P. "The Role of Multicore Processors in the Evolution of General-Purpose Computing," CTWatch Quarterly, Volume 3, Number 1, February 2007. http://www.ctwatch.org/quarterly/articles/2007/02/the-role-of-multicore-processors-in-the-evolution-of-general-purpose-computing/
