Thread: Triple-Core
Programmer
Member
 
Join Date: Nov 2004
 
2005-05-14, 11:15

Quote:
Originally Posted by Ocelot
The short answer is that the multi-core gamebox PPCs are very simple compared to the PPC 970, but they bode well for IBM's fabbing abilities. But you really must read the thread yourself since there's no way I can summarize it well.
Looking naively at the limited information we have, the 970 is a 4-5 way issue processor (but probably averages closer to 3 instructions/clock), and the new core (in XB2 and Cell, similar if not identical) is only 2-way issue (probably 1.5 instructions/clock on average). That means in the same number of clocks the 970 will do double the work. A 3.2 GHz XB2 core is probably roughly equivalent to a 1.6 GHz 970, and a 4 GHz Cell's Power core is probably roughly equivalent to a 2 GHz 970. The 2.7 GHz G5 in Apple's latest would roughly equate to a 5.4 GHz version of one of these new cores.
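
If you want to play with those numbers, here is a minimal Python sketch of the clock-scaling rule of thumb. The IPC values are just my guesses from above, and the function names are made up for illustration; none of this is a benchmark.

```python
# Rule of thumb: scale the clock by the ratio of average instructions/clock.
# Both IPC figures are rough guesses from the post, not measurements.
IPC_970 = 3.0        # guessed average for the PPC 970
IPC_NEW_CORE = 1.5   # guessed average for the new 2-way issue core


def equivalent_970_ghz(new_core_ghz):
    """Clock a 970 would need to match one new core at new_core_ghz."""
    return new_core_ghz * IPC_NEW_CORE / IPC_970


def equivalent_new_core_ghz(ghz_970):
    """Clock a new core would need to match a 970 at ghz_970."""
    return ghz_970 * IPC_970 / IPC_NEW_CORE


for label, ghz in [("3.2 GHz XB2 core", 3.2), ("4 GHz Cell Power core", 4.0)]:
    print(f"{label} ~ {equivalent_970_ghz(ghz):.1f} GHz 970")

print(f"2.7 GHz G5 ~ {equivalent_new_core_ghz(2.7):.1f} GHz new core")
```

Change the two IPC guesses and everything else moves with them, which is the whole point: the conclusions are only as good as those two numbers.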

Also, this new core is 2-way SMT, which typically means that you get two threads, each running at half speed (well, a little more, because each thread uses cycles the other would have wasted). A 3.2 GHz XB2 core, therefore, is something like having 2 x 1 GHz 970. If your code is properly multithreaded and can take advantage of lots of threads, then the XB2, with its three cores, is promising about 6 GHz worth of 970, vaguely comparable to a dual-core 3 GHz 970MP.
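
The SMT arithmetic can be sketched the same way in Python. Everything here is an assumption: the 0.5 work-per-clock ratio is my guess from above, the 1.25 SMT bonus is a number I picked only so that each thread rounds up from 0.8 to about 1 GHz of 970, and the three cores come from the thread title.

```python
# All constants are rough guesses for illustration, not measurements.
WORK_PER_CLOCK_VS_970 = 0.5  # new core does about half a 970's work per clock
SMT_THREADS = 2              # 2-way SMT
SMT_BONUS = 1.25             # hypothetical: threads reclaim some wasted cycles
XB2_CORES = 3                # assumed from the thread title (triple-core)


def per_thread_970_ghz(core_ghz):
    """Rough 970-equivalent clock of one SMT thread."""
    core_equiv = core_ghz * WORK_PER_CLOCK_VS_970   # 3.2 GHz -> about 1.6
    return core_equiv / SMT_THREADS * SMT_BONUS     # about 0.8, nudged to 1.0


def total_970_ghz(core_ghz, cores=XB2_CORES):
    """Aggregate 970-equivalent GHz if every thread is kept busy."""
    return per_thread_970_ghz(core_ghz) * SMT_THREADS * cores


def dual_core_970mp_ghz(clock_ghz):
    """Aggregate GHz of a dual-core 970MP, for comparison."""
    return 2 * clock_ghz


print(f"per thread: {per_thread_970_ghz(3.2):.1f} GHz of 970")
print(f"whole XB2:  {total_970_ghz(3.2):.1f} GHz of 970")
print(f"3 GHz 970MP: {dual_core_970mp_ghz(3.0):.1f} GHz of 970")
```

Note that the total only holds if all six threads have useful work; single-threaded code sees just the per-thread figure.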

Note that there are many, many variables in performance, so these can only be very rough rules of thumb and ought to be taken with a grain of salt until some form of real benchmark is available. One example of other factors: both XB2 and Cell claim 22-25 GB/sec of memory bandwidth, compared to something like 6 GB/sec in Apple's G5. That comparison isn't strictly fair, though, since the GPU in a game console typically shares main memory bandwidth, while on a Mac it has its own VRAM for many things... and that VRAM is 20+ GB/sec. An example favoring the other direction: the 970 is an out-of-order processor with pretty good branch prediction, so on crappy code it probably does quite a bit better than this new core, which is in-order and so likely spends a good deal of time spinning its wheels while struggling through crappy code. This means that properly optimized code is going to sing on the new cores, but poorly optimized code will choke them far worse than it does the 970 (which is a big part of why IBM originally designed the 970 core the way they did [as POWER4] for their server line).