SPECworkstation 3

The best place to start for performance is to confirm that this system does get the best SPECworkstation 3 score ever. For users who have never heard of SPECworkstation, it comes from the same people who produce the SPEC benchmark that we often use on new processors. The workstation element comes in because this set of benchmarks is designed to test a number of common workstation workloads, such as 3D rendering and animation, molecular modeling and dynamics, medical, oil and gas, construction and architecture, financial services, general operations, and GPU compute. The benchmark combines 30 workloads and ~140 tests into a single package, and results are given as a multiple of the performance of a ‘reference’ machine, which uses a quad-core Intel Skylake processor and an AMD W3100 GPU. This means that the quad-core Intel reference system gets a value of ‘1’.

SPECworkstation 3 Test Systems
AnandTech             CPU            GPU         DRAM       SSD       Price
Fujitsu Celsius R970  2 x Xeon 8276  RTX 8000    DDR4-2933  PCIe 3.0  $30000+
Armari Magnetar X64T  TR3 3990X      RTX 6000    DDR4-3200  PCIe 4.0  ~$14200
TR3 3990X 'Stock'     TR3 3990X      2080 Super  DDR4-3200  SATA      -
W-3175X 'Stock'       Xeon W-3175X   2080 Ti     DDR4-2933  SATA      -

The current system at the top of the official SPECworkstation 3 standings is a Fujitsu Celsius R970 workstation (D3488-A2). This is the system that Armari has beaten with the X64T. The Fujitsu uses two Intel Xeon Platinum 8276 processors (28 cores each, 56 cores in total) paired with an NVIDIA Quadro RTX 8000 and 384 GB of DDR4-2933. This system, going on list prices for just these components, already comes to $24538. Add in the rest, and some overhead, and this is easily $30000+. By comparison, Armari’s Magnetar X64T workstation is only ~$14200.

The results are as follows. Here we are comparing the Fujitsu official results to Armari’s official results. We also have included our results with the same system (technically classified as ‘estimated results’ because these haven’t been formally submitted to the results database), and a W-3175X system with an RTX 2080 Ti and PCIe 3.0 SSD.

SPECworkstation 3 Results
AnandTech                Fujitsu Celsius R970*  Armari Magnetar X64T*  Our X64T Run  3990X + 2080 Super  3175X + 2080 Ti
Media and Entertainment  4.72                   7.04                   6.84          4.79                3.69
Product Development      6.07                   10.85                  9.95          3.51                3.35
Life Sciences            5.89                   8.24                   8.11          -                   3.72
Financial Services       8.78                   10.55                  10.45         9.15                6.59
Energy                   5.44                   9.09                   8.73          4.20                2.86
General Operations       2.27                   2.53                   2.45          1.55                1.59
GPU Compute              5.40                   5.75                   5.70          4.63                5.01

Geomean                  5.17                   7.06                   6.84          4.08                3.54

*As submitted to SPEC

 

Within each of these segments, 7-20 sub-tests are performed covering CPU, GPU, and Storage workloads. Our results were a little lower than Armari's, which can come down to tuning, ambient temperatures, and run-to-run variation. On the geomean, our run was within about 3% of Armari's submission.

Overall, the Magnetar X64T results beat the old Fujitsu results by 37%:

  • CPU: Armari wins by +46%
  • GPU: Armari wins by +12%
  • Storage: Armari wins by +58%
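The overall figure can be checked against the per-category scores in the results table. The sketch below assumes the 'Geomean' row is a plain geometric mean of the seven category scores, which is consistent with the published numbers; the variable and function names are illustrative only. The separate CPU, GPU, and Storage splits come from sub-test data that is not in the table, so they are not reproduced here.

```python
from math import prod

# Category scores copied from the results table, in the order:
# Media and Entertainment, Product Development, Life Sciences,
# Financial Services, Energy, General Operations, GPU Compute.
scores = {
    "Fujitsu Celsius R970": [4.72, 6.07, 5.89, 8.78, 5.44, 2.27, 5.40],
    "Armari Magnetar X64T": [7.04, 10.85, 8.24, 10.55, 9.09, 2.53, 5.75],
    "Our X64T run": [6.84, 9.95, 8.11, 10.45, 8.73, 2.45, 5.70],
}


def geomean(values):
    """Geometric mean: the n-th root of the product of n values."""
    return prod(values) ** (1.0 / len(values))


geomeans = {name: geomean(vals) for name, vals in scores.items()}

# Relative lead of the Armari submission over the Fujitsu submission.
uplift = geomeans["Armari Magnetar X64T"] / geomeans["Fujitsu Celsius R970"] - 1

# How far our own run falls short of Armari's submitted result.
gap = 1 - geomeans["Our X64T run"] / geomeans["Armari Magnetar X64T"]

for name, value in geomeans.items():
    print(f"{name}: {value:.2f}")
print(f"Armari vs Fujitsu: {uplift:+.1%}")
print(f"Our run vs Armari submission: {-gap:+.1%}")
```

Run on the table values, this gives a lead of roughly 37% for the Armari system and a gap of roughly 3% between our run and Armari's submission, in line with the figures quoted in the text.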

Now, users might wonder how the Armari wins in the GPU tests, given that it has an RTX 6000 compared to the RTX 8000 in the Fujitsu. This is mainly down to processor performance: the Fujitsu system's processors have a base frequency of 2200 MHz, compared to the Magnetar X64T, which can run all of its cores at 3925 MHz. Even if the Fujitsu were running a single core at its max turbo of 4000 MHz, the Armari would still have the better IPC of the Zen 2 core against Intel’s Skylake core.
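To put the frequency argument into numbers, here is a minimal sketch using only the three clock speeds quoted above. The constant names are illustrative, and the comparison is deliberately simple: a base frequency is a guaranteed floor, so the Fujitsu's real all-core clocks under load may sit above 2200 MHz, and clock speed alone says nothing about IPC.

```python
# Clock speeds quoted in the text, in MHz.
FUJITSU_BASE_MHZ = 2200       # Xeon Platinum 8276 base frequency
FUJITSU_MAX_TURBO_MHZ = 4000  # Xeon Platinum 8276 single-core max turbo
ARMARI_ALL_CORE_MHZ = 3925    # Magnetar X64T all-core frequency

# Armari all-core clock relative to the Fujitsu base clock.
all_core_ratio = ARMARI_ALL_CORE_MHZ / FUJITSU_BASE_MHZ

# Armari all-core clock relative to the Fujitsu best-case single-core turbo.
single_core_ratio = ARMARI_ALL_CORE_MHZ / FUJITSU_MAX_TURBO_MHZ

print(f"All-core vs base clock: {all_core_ratio:.2f}x")
print(f"All-core vs single-core turbo: {single_core_ratio:.2f}x")
```

Against the base clock the Armari runs at about 1.78x the frequency, while in the Fujitsu's best single-core case the two are within about 2% of each other, which is where the IPC difference becomes the deciding factor.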

Each of the category scores above is a combined score from a set of sub-tests.

The Intel-based Fujitsu system does have some specific wins in individual tests, such as Maya Storage (+15%), NAMD Storage (+12%) and 7-zip CPU (+75%), though these are mostly down to the larger memory capacity of the Intel machine.

The AMD-based Armari system has 40 other wins, including Blender CPU (+62%), HandBrake CPU (+86%), CFD CPU (+108%), NAMD CPU (+164%), Seismic Data Processing (+230%), LAMMPS Storage (+88%), and Creo GPU (+55%).

Full data for the Armari and the Fujitsu systems can be found at these links:


96 Comments


  • KillgoreTrout - Wednesday, September 9, 2020 - link

    Intelol
  • close - Wednesday, September 9, 2020 - link

    This shows some awesome performance but the tradeoff is the limited memory capacity. If you don't need that, great. If you do, then Threadripper is not the best option.
  • twotwotwo - Wednesday, September 9, 2020 - link

    Hmm, so you're saying AnandTech needs a 3995WX or 2x7742 workstation sample? :)
  • close - Wednesday, September 9, 2020 - link

    A stack of them even :). Thing is memory support doesn't make for a more interesting review, doesn't really change any of the bars there. It's a tick box "supports up to 2TB of RAM".

    Memory support is one of the things that makes an otherwise absurdly expensive workstation like the Mac Pro attractive (that and the fact that for whoever needs to stay within that ecosystem the licenses alone probably cost more than a stack of Pros).
  • oleyska - Wednesday, September 9, 2020 - link

    https://www.lenovo.com/no/no/thinkstation-p620

    will probably be able to help.
  • close - Wednesday, September 9, 2020 - link

    The P620 supports up to 512GB of RAM. Generally OK and probably delivers on every other aspect but for those few that need 1.5-2TB of RAM it still wouldn't cut it. For that the go to is usually a Xeon, or EPYC more recently.
  • schujj07 - Wednesday, September 9, 2020 - link

    Remember that Threadripper Pro supports 2TB of RAM in an 8 channel setup. While getting 2TB/socket isn't cheap, it is a possibility.
  • rbanffy - Thursday, September 10, 2020 - link

    I wonder about the impact of the 8-channel config on single-threaded workloads. The 256MB of L3 is already quite ample, to the point that I'm unsure how diminished the returns are.
  • sjerra - Monday, September 28, 2020 - link

    This is my biggest concern and rarely considered or studied in reviews. Design space exploration.
    CAE over many design variations. Hundreds of design variations calculated as much as possible in parallel over the available cores (one core per variation, but each grabbing a slice of the memory). I've tested this on a 7960xe, purposely running it on dual channel and quad channel memory. On dual channel memory, at 12 parallel calculations (so 6 cores/channel) I measured a 46% increase in the calculation time / sample. in quad channel, at 12 parallel calculations (so 3 cores/ channel) I already measured a 30% reduction per calculation. (can anyone explain the worse results for quad channel?)
    Either way, it leaves me to conclude that 64 cores with 4 channel memory for this type of workload is a big no go. Something to keep in mind. I'm now spec'ing a dual processor workstation with two lower core count processors and fully populated memory channels (either EPYC (2x32c, 16 channels) or Xeon (2x24c, 12 channels); still deciding).
  • sjerra - Monday, September 28, 2020 - link

    Edit: 30% increase of course.
