AMD is now pushing two new instruction set extensions in its Bulldozer CPUs: FMA4 and XOP. Both are designed to speed up vector and matrix multiplication; FMA4 in particular adds fused multiply-add operations, which compute a*b + c in a single instruction. The way they work is something like the vector design of AMD's GPUs, in that they break code down into more manageable chunks for the relatively small memory bus to push through the multiple CPU cores in tandem. The only problem is that not many people need this, or at least not many are using it. Those who do have largely moved to much more parallel GPGPU computing systems to get faster performance returns on these compute functions.
To raise awareness, AMD has gone on something of an underground campaign to let people know that existing compilers (GCC and Open64) can already handle these new instructions. Intel's own compilers do not support them yet, and Intel's CPUs (including the Xeon line) do not implement either extension. It could be argued that, given the memory and caching performance of most Intel CPUs, these instructions are not strictly needed there, but it does seem that AMD has something on Intel for a change.
Now the question is: can AMD get these new instructions adopted? In the past it tried to work up enthusiasm for the Vec5 architecture in the gaming world without much success, and we have a feeling the same may happen here. If the instructions fail to take off, Intel has lost nothing. If they do take off, Intel can probably still keep up until it works something equivalent into its compilers and silicon. It will be interesting to see how this one plays out, especially given the lukewarm performance reviews the Bulldozer line has been getting so far, particularly for memory and caching efficiency.