Monday, September 12, 2011

Multicore CPUs: Amdahl's Law in the Multicore Era

This short paper applies Amdahl's Law to multicore chips. Amdahl's Law was historically used to argue that hardware manufacturers should concentrate on high-powered single-core processors, but now, in the parallel age, many kinds of computations are embarrassingly parallel, and these parallel machines let us do far more computation on data.
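For reference, the classic form of Amdahl's Law says that if a fraction f of the work is parallelizable across n processors, speedup is 1 / ((1 - f) + f/n). A minimal sketch (function name is my own):

```python
def amdahl_speedup(f, n):
    """Classic Amdahl's Law: fraction f of the work runs in parallel
    on n processors; the remaining (1 - f) stays serial."""
    return 1.0 / ((1.0 - f) + f / n)

# The serial fraction caps the speedup no matter how many cores you add:
# with f = 0.99, no number of processors can ever exceed 100x.
print(amdahl_speedup(0.99, 256))   # ~72x
print(amdahl_speedup(0.99, 1e9))   # approaches, but never reaches, 100x
```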

The paper looks into three types of multicore chips: symmetric, asymmetric, and dynamic. In the symmetric model, every core uses the same amount of resources. In the asymmetric model, a single large core takes a large portion of the resources, and the remaining resources go to identical small cores. In the dynamic model, the authors imagine a chip whose resources can be configured either as many symmetric cores or as one large serial-processing core. The results show that the asymmetric chip achieves much greater speedup than the symmetric chip, and the dynamic model, unsurprisingly, has the potential for incredible speedup.
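As I understand the paper's model: a chip has a budget of n base-core-equivalent (BCE) resource units, a core built from r units delivers sequential performance perf(r) (the authors take perf(r) = sqrt(r)), and each design plugs into a generalized Amdahl formula. A sketch of the three speedup formulas as I read them from the paper:

```python
from math import sqrt

def perf(r):
    # The paper's assumption: a core built from r base units runs
    # sequential code sqrt(r) times faster than one base core.
    return sqrt(r)

def symmetric(f, n, r):
    # n/r identical cores, each built from r units.
    return 1.0 / ((1.0 - f) / perf(r) + f * r / (perf(r) * n))

def asymmetric(f, n, r):
    # One large core of r units plus n - r single-unit cores;
    # the large core also participates in the parallel phase.
    return 1.0 / ((1.0 - f) / perf(r) + f / (perf(r) + n - r))

def dynamic(f, n, r):
    # All n units fuse into one big core for the serial phase,
    # then split back into n base cores for the parallel phase.
    return 1.0 / ((1.0 - f) / perf(r) + f / n)

# One sample point: with f = 0.975, n = 256 units, and r = 64,
# asymmetric beats symmetric by a wide margin, and dynamic beats both.
f, n, r = 0.975, 256, 64
print(symmetric(f, n, r), asymmetric(f, n, r), dynamic(f, n, r))
```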

I think this was a great, simple way to adapt Amdahl's Law to the parallel age, and the authors leverage it well to show which areas of research we should be concentrating on. However, I had a few problems with this paper.

The first thing I found questionable was the limited number of configurations the authors tried, both theoretically and practically. For example, it would have been interesting to explore, theoretically, a configuration in which the symmetric portion of an asymmetric chip is built from cores that each use multiple resource units. Also, practically, it is much more likely for the dynamic model that only a few of the cores can be combined to boost the sequential component, rather than all of the resources.

A depressing point is that the majority of the lines shown in the graphs have f >= 0.9. Even within this small range of parallelism, there are huge changes in the slope and height of the lines. A lot of problems have f < 0.9, and their lines would all lie in the region between f = 0.9 and f = 0.5: the region where adding cores does not really seem to help speed up the computation.
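A quick check with the classic formula (the sample values are mine) backs this up: at n = 256, f = 0.5 stays just under 2x and f = 0.9 stays under 10x, while f = 0.99 reaches roughly 72x, so the dramatic part of the graphs really does live above f = 0.9.

```python
def amdahl_speedup(f, n):
    # Classic Amdahl's Law: parallel fraction f on n processors.
    return 1.0 / ((1.0 - f) + f / n)

# Speedup on 256 cores for a range of parallel fractions; the jump
# between f = 0.9 and f = 0.99 dwarfs the gap between 0.5 and 0.9.
for f in (0.5, 0.9, 0.99):
    print(f, amdahl_speedup(f, 256))
```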

The final thing I'd like to say is that the authors assume, for the dynamic model, that perf(r) = sqrt(r), just as in the other scenarios. In practice, however, perf(r) is likely to be much less than sqrt(r), so the authors should not have reused the same function. This makes the graphs very misleading.
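To see how much this assumption matters, here is a hypothetical comparison (the r^0.3 exponent is my own pick, not from the paper) using the dynamic model's speedup formula as I read it:

```python
def dynamic_speedup(f, n, r, perf):
    # Dynamic chip: serial phase on one fused core of r units,
    # parallel phase on n base cores.
    return 1.0 / ((1.0 - f) / perf(r) + f / n)

f, n, r = 0.975, 256, 256
print(dynamic_speedup(f, n, r, lambda r: r ** 0.5))  # paper's perf(r) = sqrt(r)
print(dynamic_speedup(f, n, r, lambda r: r ** 0.3))  # weaker, hypothetical scaling
```

Under the weaker scaling the serial phase slows down substantially, and the dynamic model's headline speedup shrinks accordingly.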
