Tuesday, July 25, 2006

The End of Moore's Law?

At ISCA this year, much of the talk in the halls was about the end of Moore's Law. Not down the road, when we get to atomic structures too small for lithography -- now. Moore's Law ended last year (2005). The CPU manufacturers can't keep increasing clock speeds enough to deliver the 2x performance improvement we have come to expect every (pick your doubling time).

The biggest problem is heat. Smaller transistors have higher leakage current, which means more heat even when the system is idle. Raise the clock speed, and heat goes up further. The maximum heat that can be extracted from a silicon die is about 100 watts per square centimeter. We're already there.

So, what next? Well, it is well known that Intel and AMD are both shipping dual-core processors -- two microprocessors on a single chip, sharing an L2 cache. Both companies have also promised quad-core chips by mid-2007. What is happening is that each individual core will now gain perhaps only 10-20% in performance each year, but the number of cores on a chip will double every (pick your doubling time) until lithography really does run out of gas, or quantum effects come into play, in fifteen years or so.

What does this mean to Joe Programmer? It means that if you have code whose performance you care about -- whether it's physics simulations or evolutionary algorithms or graphics rendering -- and your code isn't parallel, you have a problem. Not tomorrow, today. The handwriting has been on the wall for years, with multithreaded processors and known heat problems and whatnot. Now it's really here. Of course, if you have very large computations, you have probably already made that work in a distributed-memory compute cluster.

Haven't you?
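
To make that concrete for Joe Programmer, here's a toy sketch of the basic move -- not code from any real application; the class name, the work() function, and the numbers are all invented for illustration. It carves a loop into one chunk per core using Java's java.util.concurrent thread pools, and the same pattern applies whether the inner loop is physics, pixels, or fitness evaluations:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelSum {
        // Stand-in for whatever inner-loop work you actually care about
        // (a physics step, a pixel, a fitness evaluation...).
        static double work(int i) {
            return Math.sqrt(i) * Math.sin(i);
        }

        public static void main(String[] args) throws Exception {
            final int n = 10000000;
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);

            // Carve the index range into one chunk per core and hand
            // each chunk to the thread pool.
            List<Future<Double>> parts = new ArrayList<Future<Double>>();
            int chunk = n / cores;
            for (int c = 0; c < cores; c++) {
                final int lo = c * chunk;
                final int hi = (c == cores - 1) ? n : lo + chunk;
                parts.add(pool.submit(new Callable<Double>() {
                    public Double call() {
                        double sum = 0.0;
                        for (int i = lo; i < hi; i++) {
                            sum += work(i);
                        }
                        return sum;
                    }
                }));
            }

            // Collect the partial sums; get() blocks until each chunk is done.
            double total = 0.0;
            for (Future<Double> part : parts) {
                total += part.get();
            }
            pool.shutdown();
            System.out.println(cores + " cores, total = " + total);
        }
    }

On a single-core chip this buys you nothing; on a dual- or quad-core chip the chunks run side by side, which is exactly where the architects expect your future performance gains to come from.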

3 comments:

Dave Bacon said...

Hey Rod, just to be nitpicky (one of my best traits :): I thought that Moore's Law, i.e. the doubling of transistor density, was still holding (even though clock speed, a raw proxy for processing power, is now topping off).

rdv said...

You're absolutely right, of course (what, agree with a comment? where's the controversy?). Moore's original 1965 paper shows a doubling of "number of components per integrated function" every year, extrapolating from six years' worth of data out another decade, to 1975. (The doubling period has since been adjusted.)

But since then Moore's Law has become, as you say, a proxy for CPU speed. People have come to expect that doubling the transistor count and shrinking the channel length will give them double the performance on their benchmark of choice.

Of course, the end of Moore's Law growth in performance has been predicted for years; the increasing ratio of memory latency to CPU clock cycle has been a problem forever. And I, for one, thought we were doomed when branch prediction studies showed that a branch is taken every few (half a dozen?) instructions. But the hardware guys (and women) have continued to kick butt, cramming more and better cache onto a chip, adding speculative execution, improving prefetching, doing compiler superblocks (which won the ISCA award this year for most influential paper of the 1991 ISCA), doing multithreading.

But it has been true for quite a while that adding 10% to your transistor count doesn't gain you 10% in performance on most applications, and growth in clock speed has been tailing off (mostly it's wire delays, rather than MOSFET switching, anyway). Now we have finally reached the point where the chip architects have determined that the best use of die space is to increase the number of complete, independent CPUs on a chip. If your code is parallel, you'll continue to get performance gains -- how much depends on the code itself and its interaction with the hardware.

If your application is serving static or multimedia web content, you're probably sitting pretty (except that your performance bottleneck is likely I/O anyway). If your application is transaction code, with lots of writes and heavy locking, you're going to struggle. If you're doing, say, Navier-Stokes, you probably figured out how to parallelize your code a long time ago, and will breeze through this. If it's some single-threaded Java app, well, have fun parallelizing it.
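
To put the locking point in code: another toy sketch, not from any real transaction system (the class name and the numbers are made up). Every update grabs the same global lock, so the updates execute one at a time no matter how many cores the chip has:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class LockContention {
        // One global lock guarding one shared record: the pattern that kills
        // multicore scaling, because every thread serializes on the same monitor.
        static final Object lock = new Object();
        static long balance = 0;

        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            final int updatesPerThread = 1000000;

            long start = System.currentTimeMillis();
            for (int t = 0; t < cores; t++) {
                pool.submit(new Runnable() {
                    public void run() {
                        for (int i = 0; i < updatesPerThread; i++) {
                            synchronized (lock) {  // every core queues up here
                                balance++;
                            }
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            System.out.println("updates = " + balance + ", elapsed ms = "
                               + (System.currentTimeMillis() - start));
        }
    }

Throughput stays flat -- or drops, once the cores start bouncing the lock's cache line back and forth -- as the core count goes up. That's the transaction-processing world in miniature.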

Dave Bacon said...

Wow, thanks for the detailed response! I'm going to have to show up at one of these ISCA conferences so I get to see all this cool stuff. I was told that at recent conferences for parallel computing, they are all very excited, with the mantra being "This time we matter!" :)