The Intel Core i9-9990XE Review: All 14 Cores at 5.0 GHz
by Dr. Ian Cutress on October 28, 2019 10:00 AM EST
Within a few weeks, Intel is set to launch its most daring consumer desktop processor yet: the Core i9-9900KS, which offers eight cores all running at 5.0 GHz. There’s going to be a lot of buzz about this processor, but what people don’t know is that Intel already has an all-core 5.0 GHz processor, and it actually has 14 cores: the Core i9-9990XE. This ultra-rare thing isn’t sold to consumers – Intel only sells it to select partners, and even then it is only sold via an auction, once per quarter, with no warranty from Intel. How much would you pay for one? Well, we got one to test.
Build It, And They Will Come
The Core i9-9990XE is the pinnacle of Intel’s 14nm process, binned to such an nth degree that Intel can neither guarantee how many it can produce nor support it in any way or fashion. Unlike other mass market processors, there is no product support on this thing, no such thing as ‘EOL’ – once a system integrator wins it at auction it’s a sunk cost to that integrator. The idea is to sell it on for a premium, before the boss wants it for his own personal system. I mean, who wouldn’t want 14 cores at 5.0 GHz?
This CPU is part of the high-end desktop family of processors, and runs in select X299 motherboards. It’s a Core i9, rather than a Xeon, which means only four memory channels and no ECC support. It does technically support overclocking, although your mileage may vary. This here is a processor for only one market, and it’s a market willing to spend big bucks to get any sort of millisecond latency advantage: high-frequency trading.
At the first auction, we initially knew of three companies that took part. The closed auction was somewhat of a mystery to those wanting to bid: they knew what the hardware was, but not how many Intel were going to offer. Out of the three companies we spoke to, one sat by and didn’t bid, the second got three processors, and a third got the rest. How many that was, we’re not sure – just like how much value these companies put in these parts. As I mentioned at the start: how much would you pay for a 14-core 5.0 GHz all-core processor?
High-frequency trading systems are no stranger to esoteric arrangements. I once heard a story of companies spending tens of millions to build line-of-sight microwave transmitter towers in order to shave 3 milliseconds off their latency. All the big financial traders have their servers located as close to the exchange as possible, because the speed of light through an optical cable still isn’t fast enough. These companies not only pay through the nose for the hardware, but also pay experts and specialists to tune those systems for low latency. That means tweaking the memory, overclocking the processor, and even implementing chillers to get the fastest possible system that is still fully stable.
So how much would these people pay for a pre-binned 14-core 5.0 GHz processor? Some of them might already be running faster than that: buy enough off-the-shelf Core i9-9980XE processors and bin them, and some could potentially run at this speed. In the end, we got an answer from CaseKing, the recipient of most of these Core i9-9990XE processors: $2800. In fact, since that initial price, it has actually gone up to $2850. Compared to the Core i9-9980XE ($1979), or the newly announced Core i9-10980XE ($979), yes, traders will easily spend $1000-$2000 more for the lowest latency x86 CPU on the market.
|Intel's HEDT CPUs|
|CPU|Cores / Threads|Base (GHz)|All-Core Turbo (GHz)|TB2 (GHz)|TB3 (GHz)|TDP|PCIe 3.0 Lanes|Price|
|i9-10980XE|18 / 36|3.0|3.8|4.6|4.8|165 W|48|$979|
|i9-10940X|14 / 28|3.3|4.1|4.6|4.6|165 W|48|$784|
|i9-10920X|12 / 24|3.5|4.2|4.6|4.8|165 W|48|$689|
|i9-10900X|10 / 20|3.7|4.3|4.5|4.7|165 W|48|$590|
|i9-9990XE|14 / 28|4.0|5.0|5.0|5.0|255 W|44|auction|
|i9-9980XE|18 / 36|3.0|3.8|4.4|4.5|165 W|44|$1979|
|i9-9960X|16 / 32|4.1| |4.4|4.5|165 W|44|$1684|
|i9-9940X|14 / 28|3.3| |4.4|4.5|165 W|44|$1387|
|i9-9920X|12 / 24|3.5| |4.4|4.5|165 W|44|$1189|
|i9-9900X|10 / 20|3.5| |4.4|4.5|165 W|44|$989|
|Coffee Lake Refresh|
|i9-9900KS|8 / 16|4.0|5.0|5.0|-|127 W?|16|$513|
So where do we come in? We have a sample. Technically we have a whole system, from International Computer Concepts, or ICC. ICC is a server specialist – we first met them at Supercomputing 2015 showing off a crazy tower system with 8 different servers inside. They work closely with Intel to provide specific solutions for different vertical markets: oil and gas, medical, high performance computing, and very importantly, financial. They will sell a system overclocked to the gills.
Unfortunately, due to some proprietary technology, we can’t show you the inside of the server they sent us. It’s a standard 1U design, with an ASUS X299 motherboard inside and 32GB of customized memory. It uses an all-copper custom liquid cooled system that is absolutely overkill for most hardware, but does enough to keep this Core i9-9990XE under control. Being a 1U system, just 1.75 inches (4.45 cm) tall, and having to house this monstrous beast means the cooling has to be top class, and ICC doesn’t skimp. As a result, it is also loud: there’s no way you’re having a 1U like this in the same room you are working in. More detail inside the review.
On top of the standard out-of-the-box specifications, ICC has done further tweaks to the BIOS to ensure the lowest latency and stability. Again, we’re not able to show you what these are, but we were told not to update the BIOS as part of our testing. The 1U server does have space for two graphics cards, two M.2 drives, and four SATA drives, and comes with a 1200W power supply. We do have some measurements inside the review for the power as well.
Don’t Drop It
On the face of it, the Core i9-9990XE is a standard LGA2066 chip. It uses Intel’s regular 18-core ‘HCC’ Skylake silicon, however it’s geared towards the ‘consumer’ platform, which is part of Intel’s product segmentation strategy. It doesn’t have ECC support, and so is limited to 128 GB of standard DDR4 memory, although you can bet that any HFT system that uses this part will run high speed memory. The chip has 44 PCIe 3.0 lanes, in line with other LGA2066 consumer parts, and because it isn’t a Xeon, does not support RAS features or vPro for management.
One of the issues with this chip is that at this price, typically we have professional users that require in-band management features and other security elements to make sure their expensive hardware remains secure and affords appropriate manageability. By designating this part a Core i9, rather than something like a Xeon W, Intel takes those offerings off the table: OEMs that purchase and resell the part to end-users have to explain to end-users that this rare chip comes with these limitations.
At this point we do not know how many chips Intel intends to put into the market. Intel holds an auction every quarter with whatever chips make the grade, assuming that any OEMs actually want to buy them for their customers. We could be talking sub-100 units per year, which is a little odd given that Intel doesn’t need to bin these to the same strict longevity standard as other chips, as it doesn’t provide a warranty. Because of all this ‘product / not a product’ status, the Core i9-9990XE doesn’t get its own page in Intel’s processor database, and it will never be given a strict ‘end-of-life’ program as it doesn’t fall under the standard product order/shipping regime. All the long-term support falls to the company or OEM that buys them.
The Chip and the Competition
Strictly speaking, this Core i9-9990XE is a 14-core processor with a base frequency of 4.0 GHz and a thermal design power at that frequency of 255W. The turbo frequency for this processor is 5.0 GHz on all cores. But this creates a little bit of an issue for an ‘all core 5.0 GHz turbo’ classification.
As we explained in our interviews with Intel Fellows about how turbo response should be presented, how long a system has turbo enabled depends on the instructions being used and also on the motherboard manufacturer. Turbo is defined by a higher level power limit (PL2) and a turbo budget time (Tau), which is indicative of a percentage of a power virus. Normally Intel ‘suggests’ a turbo power of 25% higher than TDP (so for 255W, that is 319W), and anywhere from 8-200 seconds of turbo depending on the platform.
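The 25% figure is easy to sanity check. The snippet below is only a rough sketch of that arithmetic: the function name `suggested_pl2` and the `uplift` parameter are made up for illustration, and it assumes the suggested PL2 is simply TDP plus a flat 25%, which motherboard vendors are free to override.

```python
# Rough sketch of the 'suggested' turbo power limit (PL2).
# Assumption: PL2 is TDP plus a flat 25% uplift; real boards may differ.

def suggested_pl2(tdp_watts: float, uplift: float = 0.25) -> float:
    """Return the suggested PL2 in watts for a given TDP."""
    return tdp_watts * (1.0 + uplift)


if __name__ == "__main__":
    # TDP values taken from the HEDT table in this article.
    for name, tdp in [("Core i9-9990XE", 255), ("Core i9-9980XE", 165)]:
        pl2 = suggested_pl2(tdp)
        print(f"{name}: TDP {tdp} W -> suggested PL2 {pl2:.2f} W")
```

For the 255W part this gives 318.75W, which rounds to the 319W quoted above.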
For the 1U server we were given for testing, ICC has enabled turbo with unlimited power for an unlimited time (technically up to 4096 seconds, I believe), as they want this CPU to hit 5.0 GHz on all cores all the time. Doing this, as mentioned above, requires some very effective cooling. It becomes doubly complicated for ICC, given that they want to do this in a 1U, and so they have developed some proprietary cooling technology to make it possible.
This is as much as I can legally show you about the cooling
Technically this chip supports Turbo Max 3.0 (TBM3), whereby Intel designates the best performing cores for even higher turbo frequencies. In our case, out of the 14 cores, Core 10 was considered the best. Inside Windows, the ACPI interface will detect key software (or software defined by the active window) and try to run it on these cores with an extra frequency bump (+100 MHz or so). For our system, while the TBM3 and ACPI interface did lock software to specific cores, we saw no increase in frequency, due to the way the system has been set up. Another key requirement for ICC’s customers is latency that is not just low but consistently low. In order not to disturb that consistency, TBM3 has no effect on the processor frequencies for our testing.
The chip’s other features include quad-channel memory support at DDR4-2666 in single rank mode. ICC supplied our system with custom memory modules and appropriate heatsinks, with the system running at DDR4-3600 CL16. This chip also has 44 PCIe 3.0 lanes, in line with other 9000-series Intel HEDT processors.
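To put the memory configuration in context, here is a back-of-the-envelope sketch of theoretical peak bandwidth. It assumes the standard 64-bit (8-byte) DDR4 channel width and uses the nominal transfer rates; the function name `peak_bandwidth_gbs` is just for illustration, and real-world throughput will be lower than these ceilings.

```python
# Theoretical peak DRAM bandwidth: channels x transfers per second x bytes per transfer.
# Assumption: each DDR4 channel is 64 bits (8 bytes) wide, nominal MT/s ratings.

BYTES_PER_TRANSFER = 8  # 64-bit channel


def peak_bandwidth_gbs(channels: int, mega_transfers: int) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 1000 MB)."""
    return channels * mega_transfers * BYTES_PER_TRANSFER / 1000


if __name__ == "__main__":
    configs = [
        ("Rated: 4 x DDR4-2666", 4, 2666),
        ("As tested by ICC: 4 x DDR4-3600", 4, 3600),
    ]
    for label, channels, rate in configs:
        print(f"{label}: {peak_bandwidth_gbs(channels, rate):.1f} GB/s peak")
```

On paper, the DDR4-3600 setup raises the ceiling from roughly 85 GB/s to roughly 115 GB/s, though for latency-sensitive work the CL16 timings likely matter at least as much as raw bandwidth.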
Competition for the Core i9-9990XE comes from several sides.
One CPU on the books is the upcoming Core i9-9900KS, an eight-core processor that also promotes all eight cores at 5.0 GHz. This chip uses the consumer grade mainstream silicon, and thus only has two memory channels and 16 PCIe 3.0 lanes. This CPU is going to be launched in a couple of days (October 30th), with a $513 MSRP.
Another CPU is the new Cascade Lake-X 18-core flagship, the Core i9-10980XE, for $979. This is the latest high-end desktop processor, with (we assume) the latest security updates from Intel as well as a boost in some of the frequencies over the Core i9-9980XE. Ultimately this has four more cores than the 9990XE, but lower frequencies, and is cheaper. A user lucky enough to get a good sample could perhaps overclock it to match the 9990XE. The Core i9-10980XE also has four more PCIe 3.0 lanes and the same number of memory channels.
From AMD’s side, the upcoming 16-core Ryzen 9 3950X in November is one angle. Being on 7nm it is certainly more energy efficient, and the Zen 2 microarchitecture has a higher IPC than the Intel part, but the CPU won’t be able to reach the same frequencies. It is also aimed at consumers, with 24 PCIe 4.0 lanes and two memory channels. At an MSRP of $749, it will certainly cost a lot less, however.
We can also look towards AMD’s launch of the next generation of Threadripper, also based on Zen 2 and 7nm. At this point, aside from AMD announcing that they are coming in November and starting with a 24-core CPU, we don’t have many details. It is expected to have four memory channels, 64 PCIe lanes, and might come in around 4.0 GHz. It will still have the issue of not clocking as high as the Intel part, and price/power is an unknown at this point.
AMD has, however, launched its Zen 2 server hardware, the EPYC 7002 series. Rather than looking at a high frequency 14-core part, users might consider a 32-core CPU here, with eight memory channels, a high IPC, and 128 PCIe 4.0 lanes. Again, the deficit is going to be in the frequency, which is something that HFT traders desire. The EPYC 7502P retails for around $3400, so in the right server, if an HFT trader needs to scale out, this could be an option.
|Comparing the i9-9990XE|
|255 W|255 W|127 W?|TDP|105 W|250 W|225 W|
|6 x 2666|4 x 2666|2 x 2666|DDR4|2 x 3200|4 x 2933|8 x 3200|
For any comparison you make, there’s no denying that the Core i9-9990XE pushes the boundaries for Intel’s binning on its 14nm process. This is why it has no MSRP, and why Intel can’t predict how many it will be able to manufacture in any given quarter. Whatever the OEMs end up paying for it at auction, the fact that CaseKing has it for sale (with a 1 year OEM warranty) for 2849 Euro means that it sits well above any other Intel high-end desktop processor, and with good reason.
It should be noted that Intel’s recent updates regarding Spectre, Meltdown, and ZombieLoad may have an effect on performance. Based on data we’ve seen at Intel, the mitigations hurt the newest hardware the least (compared to say, Broadwell). The system provided by ICC does not have firmware mitigations in place, however we did use an OS version that had some of the software implemented fixes. ICC was clear that some of its customers, while concerned about these issues, just want the fastest system possible based on the way they use these systems.
Consequently, our results here are not quite of the same ‘ilk’ as those in our previous reviews. Because of the custom BIOS being used, with the overclock options locked down, the benchmark data will not necessarily mirror an ‘off-the-shelf’ installation, but will mirror a pre-built system, which is ultimately what these chips are aimed at. We’re therefore putting an asterisk by our results, to indicate that the environment for this chip was different.
CPU: Intel Core i9-9990XE, 14 Cores, 4.0 GHz Base, 5.0 GHz Turbo, 255W TDP, $Auction
DRAM: 4x8 GB Custom ICC Modules, DDR4-3600 CL16
Motherboard: ASUS X299
GPU: Sapphire Radeon RX460 2GB
Cooling: ICC Proprietary Liquid Cooling
Power Supply: Dual 1200W 1U Redundant Supplies
Storage: Micron MX500 1TB SSD
Chassis: 1U Rack Server
In our reviews, we normally take an open-air testbed with powerful cooling, a powerful motherboard, DRAM at manufacturer supported frequencies, and the latest public BIOS for that motherboard.
For our benchmarks, we ran our standard CPU suite. Due to the 1U arrangement, and where this chip is focused, we did not install a large GPU for gaming tests. Users looking at this system wanting to pair it with a large CUDA card for financial simulation will likely have a field day, but for gaming, that is best left to the Core i9-9900KS when it comes out.
Also, while this CPU is overclockable, the motherboard supplied had a locked BIOS on overclocking: ICC has configured it for performance and stability, and we were unable to even open the appropriate menus in the BIOS to perform overclocking.
If there is a sufficient request from readers, we’ll look into taking the chip and running it in a different motherboard for gaming and overclocking performance. I’ll have to see if my best cooling solution will be sufficient.
Pages In This Review
- Analysis and Competition
- The Core i9-9990XE: Compilation Champion
- CPU Performance: Rendering Tests
- CPU Performance: Encoding Tests
- CPU Performance: System Tests
- CPU Performance: Office Tests
- CPU Performance: Web and Legacy Tests
- Power Consumption and Thermals
- Conclusions and Final Words
edzieba - Thursday, October 31, 2019
An ASIC has a significant (many months to years) lead time between "we need X design" and functioning silicon. Trading algorithms are a constant arms race being updated to counter others' algorithm changes (who then counter your counters, etc) on the days to hours timescales.
shtldr - Tuesday, October 29, 2019
If you've got all the money (which you should, in case you are a successful algorithmic trader), why not go ASIC?
Dribble - Monday, October 28, 2019
It's not as simple as you need hundreds of threads or you need one. Compiling is an obvious example. You have a mixture of tasks - some take more threads (e.g. you have a large number of files in a makefile you can compile at once), some take less threads (you have a smaller makefile with only a few files), some take one thread (you need to link).
A chip like this with 14 cores and very high single thread performance it turns out is ideal for this sort of task.
Compiling is very much not a niche market.
eek2121 - Monday, October 28, 2019
Word (in the article) is that it helps with web browsing as well. So there is that. ;)
That being said, I don't look at this CPU as being competitive to AMD offerings simply because you can't buy the thing. However it is nice to see that Intel can do something if they put their mind to it.
bananaforscale - Thursday, October 31, 2019
Well, multiple cores *do* help with web browsing, doesn't mean you need 14@5 GHz. :D
MattZN - Tuesday, October 29, 2019
You don't need all those cores running on a single platform to do HFT. In fact, that winds up being a negative because all the cores are competing for memory cycles. Instead what you want to do is mirror (not split, but do a full mirror) the packet stream to a whole bunch of platforms with fewer cores which can then maximally leverage their memory bandwidth and CPU caches. You also filter the packets inside the NIC itself, not with the CPU.
You also don't need to have a high-frequency CPU to minimize response time. The CPU is calculating outcomes from likely moves way ahead of time, long before actually receiving any packet telling it what movement actually happened. When the packet comes in, the CPU really only needs to look up the appropriate response from a table that has already been calculated. In fact, the NIC itself could do the table lookup for certain actions and bypass the CPU entirely.
So you want lots of cores, but they don't actually have to be ultra fast. Anyone using something like this processor to try to 'get ahead' in the HFT game is going to be in for a big surprise.
Spunjji - Wednesday, October 30, 2019
Thanks for the clarification. I thought that leaning on a single, many-core high-frequency CPU for this sort of task sounded a lot like optimising the wrong part of the whole process.
peevee - Monday, October 28, 2019
That's the point. It does not make them much money, the volume is simply not there. It is for INTEL's bragging.
eek2121 - Monday, October 28, 2019
They likely auction the chips due to the aggressive binning required. I expect if they could roll out this kind of chip easily, they would have already. Think: 10 chips for every 100,000 can do 4-5 GHz @ 14 cores, 255 watt TDP.
lazarpandar - Monday, October 28, 2019
So if you have an absurd amount of money and can't scale with more cores beyond 14...
What a stupid product.