Today we have the release of the Intel Xeon E5-2600 V3 series of processors. One will also hear the release referred to by the Haswell-EP or “Grantley” code names. This is a huge deal, as it signals many billions of dollars of technology refreshes over the next year. For this release, several companies sent STH pre-release hardware. My personal coverage of the launch will extend not just to STH, but also to a piece on Tom’s Hardware and another on Tom’s IT Pro. I was also invited to a press day at Intel’s facilities in Oregon about a month ago to hear from Intel’s folks about the launch. Suffice it to say, I have been living the launch for the past month or so leading up to today’s announcement.
Intel Xeon E5-2690 and E5-2699 V3 Benchmarks
Many of the Linux-Bench.com benchmarks were published on Tom’s Hardware for their launch piece. This also gave me time to review the Linux-Bench parser script, and a few minor errors were found (again, this is the web parser, not the actual benchmark script itself). In the meantime, I thought I would share some of the raw benchmark data for folks to look at:
Head on over to http://linux-bench.com/parser.html and you can use the following reference IDs to see a few runs in the database using those processors:
- Intel Xeon E5-2690 V1: 00171408296480
- Intel Xeon E5-2690 V2: 35171408319075
- Intel Xeon E5-2690 V3: 28161408152388
- Intel Xeon E5-2699 V3: 33061410024153
The bottom line is that in multi-threaded benchmarks these things are fast! In some workloads this generation’s flagship processor (the Xeon E5-2699 V3) is going to be more than twice as fast as the flagship processor of two years ago (the Intel Xeon E5-2690 V1).
The actual dice are huge for these chips. Here is a wafer and die shot of the highest core count die taken at the press event:
Haswell-EP Power Consumption
A quick note on this. Haswell-EP power consumption is awesome. Using two higher end Xeon E5-2600 V3 processors (Intel Xeon E5-2690 V3s) in a Supermicro SYS-6018R-WTR with 16x 8GB DDR4 DIMMs, I was able to achieve as low as 77W idle in the 1U server. A quick overview of one of the test platforms:
For a 1U server with 24 cores, 48 threads and 16 DIMMs, that is pretty great as of 2014. AVX 2.0 workloads really turn up the power consumption, though. I do have Intel Xeon E5-2650L V3 processors on hand, and testing of those should happen soon. Noise for a 1U with redundant power supplies was surprisingly good.
Insights from the press event
There were a few major points I took away from the press event. First, AVX 2.0 is going to be a great accelerator, but AVX 2.0 also uses a ton of power. If you want to really see these processors push power consumption figures higher, try running an AVX 2.0 workload. On the other hand, the fused multiply-add function is extremely useful. Here is Intel’s slide on the note:
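To make the fused multiply-add point a bit more concrete, here is a minimal Python sketch of what "fused" means numerically: a*b + c is computed exactly and rounded once, instead of rounding after the multiply and again after the add. This is only an emulation using exact rational arithmetic, not the hardware instruction itself, and the function names `fma_emulated` and `mul_then_add` as well as the sample inputs are assumptions chosen for illustration.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b + c, rounded to a float once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no intermediate rounding
    return float(exact)  # single rounding step at the very end


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the add."""
    return a * b + c


# Hypothetical inputs picked so that the intermediate rounding matters.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = mul_then_add(a, b, c)  # product rounds to 1.0, so the tiny term is lost
fused = fma_emulated(a, b, c)    # the tiny term -2**-60 survives

print("multiply then add:", unfused)
print("fused (emulated): ", fused)
```

Besides the accuracy benefit shown here, a hardware FMA does the multiply and the add in one instruction, which is where the throughput gain for dense math comes from.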
Second, Fortville is going to be a game changer. We previewed the new dual 40GbE Fortville adapters a few weeks ago. Intel is pushing virtual network offloading to help cope with the speed demands of software defined networking. With typical power consumption of 3.6W and a maximum TDP of 7W, this is going to be huge in terms of getting faster networking in servers.
Intel gets a lot of data from its customers and partners. One area it is specifically working on is called the “noisy neighbor” problem. This problem manifests itself when a particular virtual machine takes a share of resources out of line with other applications, to the detriment of the other virtual machines on the same host. There are many stories of folks on Amazon Web Services’ EC2 for whom this has been an issue.
Intel is starting to recognize that this is an issue, especially with processors like the Intel Xeon E5-2699 V3, which can provide 36 cores and 72 threads in a single dual-socket system. The possibility for cloud providers such as Amazon to provide bigger virtual machines or more virtual machines on a system means that noisy neighbors can impact more customers on a single server. Expect to hear more on this trend over the next few generations.
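The core and thread counts above, and why they matter for noisy neighbors, can be sketched with some back-of-the-envelope arithmetic. The 18 cores per socket and two threads per core are the E5-2699 V3's published figures; the VM sizes and the `overcommit` ratio are hypothetical values for illustration only, not anything a cloud provider has stated.

```python
SOCKETS = 2            # dual-socket server
CORES_PER_SOCKET = 18  # Xeon E5-2699 V3
THREADS_PER_CORE = 2   # Hyper-Threading


def hardware_threads(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total hardware threads the host exposes."""
    return sockets * cores_per_socket * threads_per_core


def vms_per_host(total_threads: int, vcpus_per_vm: int, overcommit: float = 1.0) -> int:
    """How many VMs of a given size fit, with an assumed vCPU overcommit ratio."""
    return int(total_threads * overcommit) // vcpus_per_vm


threads = hardware_threads(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
print("cores:", SOCKETS * CORES_PER_SOCKET, "threads:", threads)

# Hypothetical VM sizes: the smaller the VM, the more tenants share one box.
for vcpus in (2, 4, 8):
    print(f"{vcpus}-vCPU VMs per host:", vms_per_host(threads, vcpus))
```

The more tenants that share one box, the more customers a single misbehaving VM can affect.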
Is the Intel Xeon E5-2600 V3’s progress due to ARM?
One lingering thought I will sign off with: is the Intel Xeon E5-2600 V3’s progress due to ARM’s threat on the low end? At some point, managing more nodes becomes burdensome. I was having a conversation around ARM processors with an industry source. The question is: what is more efficient, a bunch of ARM servers or a single virtualized Xeon E5 server?
Intel no longer has a competitor for the Xeon EP series. AMD has not released a new architecture in the market since the original Opteron 6300 was released in November 2012. Intel has transitioned to 22nm and with the V3 generation now has offerings that are more than twice as fast as its offerings in late 2012.
Consider that one can put more and more VMs on a single system (Xeon E5-2600 V3) and feed them with bigger network pipes (Fortville), RAM (DDR4) and storage (NVMe). The big argument for ARM in large installations then becomes custom IP and the ability to avoid noisy neighbors. Intel is clearly looking at the latter, as discussed above. As for the former, Intel is now making custom silicon for big customers. There is another series of processors with an 8 in their model number (like the Intel Atom C2758) that are part of the communications processor series with QuickAssist. Intel is building IP blocks for major applications.