Flawed Methods and Inaccurate Results Used to Show ARM CPUs as More Efficient

There used to be a thing called truth in journalism. It meant that when you published an article you should at the very least check your facts if you were presenting it as “news.” In the world of editorial articles things are different, as an editorial is nearly always an opinion-based piece with some facts thrown in for fun. What has happened, though, is that with the introduction of blogs, fan sites, and other venues for information, some of that fact checking has gone out the window in the effort to be the first to report a juicy bit of news. When an article hits one of the big sites it often gets spread around the net and becomes the “truth” simply by means of repetition. We have watched this happen many times (and it is something that Apple’s PR and marketing thrive on).

Today we have an incident where a single blog post is fast becoming a focus of conversation with apparently no fact checking at all. It started off on a relatively small site called armservers.com. As you might imagine, they are fans of ARM-based servers, and in keeping with their name they put together what they felt was a good indication of ARM performance per watt vs. an Intel Xeon. This benchmark was quickly grabbed up by at least one large site, and from there it will quickly make its rounds.

Unfortunately the entire comparison is simply wrong. Let me tell you why.

The comparison is supposed to be a Calxeda EnergyCore ECX-1000 ARM system @ 1.1GHz vs. an Intel Xeon at 3.3GHz, both running ApacheBench v2.3. According to their numbers, the Calxeda system uses only 5.26 watts compared to the Xeon’s 102 watts. Amazing, right? 15x the performance-per-watt ratio! Now that is incredible, and also inaccurate.
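As a back-of-the-envelope sketch, here is the arithmetic behind a claim like this, using only the two wattage figures quoted above (the post’s raw throughput numbers are not reproduced here, so we back out the throughput ratio the 15x claim would imply):

```python
# Wattage figures as quoted in the blog post's comparison.
P_ARM = 5.26    # watts, measured on the Calxeda ECX-1000 system
P_XEON = 102.0  # watts, estimated from published TDP, NOT measured

CLAIMED_PERF_PER_WATT_RATIO = 15.0

# perf/watt ratio = (T_arm / P_arm) / (T_xeon / P_xeon)
#                 = (T_arm / T_xeon) * (P_xeon / P_arm)
power_ratio = P_XEON / P_ARM  # roughly 19.4x more power on the Xeon side
implied_throughput_ratio = CLAIMED_PERF_PER_WATT_RATIO / power_ratio

print(f"Power ratio (Xeon/ARM):        {power_ratio:.1f}x")
print(f"Implied throughput (ARM/Xeon): {implied_throughput_ratio:.2f}x")
```

In other words, the claimed 15x figure leans almost entirely on the power denominator, which is exactly the number that was never measured on the Intel side.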

The first item that invalidates the whole test is that the Intel wattage numbers were derived from nothing more than the published TDP for the Xeon.

“The Intel (Sandybridge) platform is based on published TDP values for the CPU and I/O chipset, along with an estimate for DDR memory. Unfortunately, at the time of this blog post, we didn’t have a way to measure actual power consumption with the same level of fine detail.”

Now that right there makes the whole claim invalid. You cannot call it a comparison if you do not have both systems in the lab. Where did they get the performance numbers for the Intel system? If they had it in house, they should have been able to measure exact power usage at the same time. So we have multiple issues that invalidate the article right off the bat.

The second item that shows the invalidity of the comparison is this statement:

“The Sandybridge system saturated the single 1Gb NIC with less than 15% CPU utilization. (While it’s possible to add additional network adapters, most data center customers we’ve spoken with don’t add extra NICs for web servers for a variety of reasons.) This is a classic example of where Calxeda can deliver superior value: workloads for which “brawny cores” simply deliver more horsepower than can be consumed by the rest of the platform/infrastructure.”

Ok, first of all, I do not know ANY data center that would dedicate a FULL Xeon E3-1240 server with 16GB of RAM to a single Apache web server. What they would do is segment that single server into multiple virtual servers and get the best bang for the buck from the hardware. Even using something like Xen, you could get 10-15 web servers out of the comparison system; something that you cannot do on the ARM server. So the performance per watt of the Xeon just went up quite a bit, and you would not need to add more NICs. That statement is also inaccurate on its face, as most Xeon boards have more than one NIC, and many include a NIC specifically for remote management via Ethernet (Dell calls this their DRAC card). So the test was stacked, and the benchmark never even had a chance of being accurate.

The test should also have used only 15% of the TDP number if they were only using 15% of the CPU to get the results. Instead they used the full TDP and compared it to the measured results from the ARM server. Once you correct for that, the results are startlingly different.
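As a rough illustration (and an admitted oversimplification, since idle power, chipset, and memory draw do not scale linearly with CPU load), here is what scaling the Xeon's TDP-based figure by the reported 15% utilization does to the power gap:

```python
# Illustrative only: scales the whole 102 W platform estimate by CPU
# utilization, which overstates the correction somewhat since non-CPU
# components do not scale linearly with load.
P_ARM = 5.26       # watts, measured on the Calxeda system
P_XEON_TDP = 102.0 # watts, full TDP-based platform estimate
CPU_UTIL = 0.15    # the Sandy Bridge box saturated its NIC at <15% CPU

p_xeon_scaled = P_XEON_TDP * CPU_UTIL  # 15.3 W
print(f"Scaled Xeon power estimate: {p_xeon_scaled:.1f} W")
print(f"Power ratio (Xeon/ARM):     {p_xeon_scaled / P_ARM:.1f}x")
```

Under that crude correction, the power gap shrinks from roughly 19x to under 3x, before you even account for the Xeon doing that work at a fraction of its capacity.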

We would be very interested to see the performance comparison done properly with the right (and typical) configuration. That would include directly measured power from both the Intel and ARM systems, followed by measuring a single virtual server on the Intel system to see where the real total cost of ownership lies. We imagine their statement that “our TCO advantage improves to a 77% reduction of overall total cost of ownership” might be a way off the mark.

Discuss this in our Forum
