I started this discussion to work out the technical differences between Intel's Nehalem and AMD's architectures, including AMD Istanbul (coming in 2009) but excluding Intel's Nehalem EX, which comes in 2010.
Today we use only 4-socket servers. Most are AMD Barcelona, but there are also some Intel Tigertons. So all are quad-core.
What are the differences of one architecture over the other in these areas:
2-socket vs. 4-socket, number of DIMMs, memory speed downgrade, scheduler, CPU queueing, hyperthreading (are threads 1 to 8 equal to threads 9 to 16?), VMmark results.
CPU Part | Intel Nehalem EP System | AMD Shanghai | AMD Istanbul
VMmark result | 23.46 | 20.50 | Expected 30
Sockets | 2 | 4 | 4
Sockets x Cores per Socket | 2 x 4 | 4 x 4 | 4 x 6
Native Threads | 8 | 16 | 24
Hyperthreads | 8 | - | -
Threads Total | 16 | 16 | 24
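The thread totals in the table follow from sockets x cores per socket x hardware threads per core. A minimal Python sketch of that arithmetic, using only the figures from the table (the function name `total_threads` is made up for illustration):

```python
def total_threads(sockets, cores_per_socket, threads_per_core=1):
    """Hardware threads the hypervisor would see on one system."""
    return sockets * cores_per_socket * threads_per_core

# Figures taken from the table above; Nehalem EP has hyperthreading,
# so it exposes two threads per core.
systems = {
    "Intel Nehalem EP": total_threads(2, 4, threads_per_core=2),
    "AMD Shanghai": total_threads(4, 4),
    "AMD Istanbul": total_threads(4, 6),
}

for name, threads in systems.items():
    print(f"{name}: {threads} threads")
```

Note that the Nehalem total counts a hyperthread the same as a native thread, which is exactly the point the question about threads 1 to 8 versus 9 to 16 is getting at.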
Caches
Cache | Intel Nehalem EP System | AMD Shanghai | AMD Istanbul
L1 cache size (max) | 32 KB D-cache + 32 KB L1 instruction cache per core | 64 KB (Data) + 64 KB (Instruction) per core | 64 KB (Data) + 64 KB (Instruction) per core
L2 cache size (max) | 256 KB low latency L2 cache per core | 512 KB per core | 512 KB per core
L3 cache size (max) | 8 MB shared L3 cache | 6 MB (shared) | 6 MB or more
SIMD Instruction Set Support | ? | SSE, SSE2, SSE3, SSE4A | SSE, SSE2, SSE3, SSE4A
Memory Controller | Intel QPI | AMD HyperTransport | AMD HyperTransport
Version | 1 | HT1.1 | HT3
Memory frequencies (MHz) | 800/1066/1333 | 667/800 | 800/1066/1333
DDR Technology | DDR3 | DDR2 | DDR3 with Fiorano platform
Intel metric | Intel Nehalem EP System | AMD metric | AMD Shanghai | AMD Istanbul
QuickPath Interconnect bandwidth per CPU | 25.6 GB/s | Maximum Memory Bandwidth 4P System | 51.2 GB/s | higher
? | ? | Maximum I/O bandwidth with 4P System | 32.0 GB/s | higher
Maximum QuickPath Interconnect bandwidth | 51.2 GB/s | Maximum Total Bandwidth 4P System | 83.2 GB/s | higher
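The AMD total in the last row appears to be the sum of the memory and I/O figures above it, and the Intel maximum appears to be the per-CPU figure times two sockets. A quick Python check under that assumption (the variable names are only illustrative, and this reading of the table is not confirmed by either vendor here):

```python
# AMD Shanghai, 4P system, values from the table (GB/s)
amd_memory_bw = 51.2
amd_io_bw = 32.0
amd_total_bw = amd_memory_bw + amd_io_bw

# Intel Nehalem EP, values from the table (GB/s)
qpi_bw_per_cpu = 25.6
nehalem_sockets = 2  # assumption: the maximum is simply per-CPU x sockets
qpi_bw_max = qpi_bw_per_cpu * nehalem_sockets

print(f"AMD 4P total: {amd_total_bw:.1f} GB/s")
print(f"Intel 2S QPI maximum: {qpi_bw_max:.1f} GB/s")
```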
Not finished yet.
Where are you getting "Expected 30" for Istanbul? And comparing a Quad Socket Istanbul to a dual socket Nehalem is not fair.
And not stating the number of tiles in a VMmark result also leaves out a significant part of its meaning. Without the tiles, the Nehalem is clearly better than Shanghai, but since its number of tiles was also higher, it is that much better than the Shanghai.
Fred,
I totally agree on the 2S vs 4S. 23 @2S vs 30 (expected) @4S is a no brainer. Especially if you take into account sw licensing.
What I don't understand is your tiles comment. My limited understanding of this benchmark is that a tile is a way to make the server spin. The bigger the server is, the more tiles you need to saturate it. At the end of the day it is the normalized number (i.e. 23, 30, etc.) that matters. In fact I am not even sure why they call out the # of tiles. Tiles in my mind are similar to the # of workstations you need to generate the requests for a TPC-C benchmark.
Massimo.
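To put the 2S vs 4S point in numbers, here is a small Python sketch that divides the VMmark scores from the table at the top of the thread by the socket count. Score per socket is not a VMmark metric; it is only an illustration of the licensing argument, and the Istanbul figure is the unconfirmed "Expected 30".

```python
# (score, sockets) per system, taken from the first table in this thread.
# The Istanbul score is an expectation, not a published result.
results = {
    "Intel Nehalem EP (2S)": (23.46, 2),
    "AMD Shanghai (4S)": (20.50, 4),
    "AMD Istanbul (4S, expected)": (30.0, 4),
}

# Hypothetical per-socket normalization, relevant if software is licensed per socket.
per_socket = {name: score / sockets for name, (score, sockets) in results.items()}

for name, value in per_socket.items():
    print(f"{name}: {value:.2f} per socket")
```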
I would agree if Intel had a 4-socket server based on an integrated memory controller. Then I would also do a comparison between Intel and AMD 4-socket servers.
On the other hand, there is price and the availability of systems. According to rumors, Intel Nehalem systems will be available in May and AMD in June 2009.
I would also like to have both CPU vendors in a big VMware farm. From my perspective AMD is the 64-bit leader in the x86 space. It is also native.
I guess I don't know enough about how VMmark results are reported, but you always see the tiles mentioned. It was my understanding that a tile is a set number of VMs loaded up, so having a number that is only marginally better on what is supposed to be a faster server could hurt its sales, until you mention that the marginally better number was actually achieved while doing a whole lot more. If you made it do the same amount of work, the result number would be higher.
Guess I have some reading to do on VMMark results and what they really mean.
Guess I have some reading to do on VMMark results and what they really mean.
Welcome to the club.
I have always seen the tiles as a means to generate the score number. I have always tried not to think about scenarios where a given server could score 20 with 10 tiles and another server could score 15 with 13 tiles. Which one would be better? That's why, for clarity, I would not personally highlight the # of tiles and I would rather list only the score. As I said, the tile, in my mind, is just the "tool" to get to the score and as such should not be relevant.
Massimo.
What about the ESX scheduler?
Nehalem 2 socket (8 cores)
18 DIMM slots with max. 144 GB at 800 MHz
7 cores and 1 for the ESX service console. 8 native threads and 8 hyperthreads (a hyperthread can only get 30% of a native thread!)
Intel VT-d could be an advantage, but so far no performance figures are available.
AMD 4 socket server (16 cores)
15 cores and 1 core for the service console. 16 native threads. Advantage: better parallel queueing and/or better guaranteed MHz.
32 DIMM slots with 128 (256) GB at 667 MHz.
Both platforms are priced nearly equally when both use the same RAM capacity.
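A rough Python sketch of the comparison above, assuming the 30% hyperthread figure from this post (an estimate, not a measured value) and assuming the core reserved for the service console also takes its hyperthread with it. The function names are made up for illustration.

```python
HT_YIELD = 0.30  # this post's estimate of what a hyperthread adds over a native thread

def native_equivalent_threads(cores, console_cores=1, hyperthreading=False):
    """Cores left for VMs, counting each hyperthread as HT_YIELD of a native thread."""
    usable = cores - console_cores
    return usable * (1 + HT_YIELD) if hyperthreading else float(usable)

def gb_per_dimm(total_gb, dimm_slots):
    """DIMM size needed to reach the stated capacity with all slots filled."""
    return total_gb / dimm_slots

nehalem_threads = native_equivalent_threads(8, hyperthreading=True)
amd_threads = native_equivalent_threads(16)

print(f"Nehalem 2S: {nehalem_threads:.1f} native-equivalent threads, "
      f"{gb_per_dimm(144, 18):.0f} GB per DIMM")
print(f"AMD 4S: {amd_threads:.1f} native-equivalent threads, "
      f"{gb_per_dimm(128, 32):.0f} or {gb_per_dimm(256, 32):.0f} GB per DIMM")
```

Under these assumptions the 4-socket AMD box offers noticeably more schedulable capacity, while the Nehalem needs larger DIMMs to reach its maximum memory.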