CPU ST Performance: Not Much Change from M1

Apple didn’t talk much about core performance of the new M1 Pro and Max, likely because it hasn’t changed much compared to the M1. We’re still seeing the same Firestorm performance cores, and they’re still clocked at 3.23GHz. The new chip has more cache and more DRAM bandwidth, but under ST scenarios we’re not expecting large differences.

When we first tested the M1 last year, we had compiled SPEC under Apple’s Xcode compiler, and we lacked a Fortran compiler. We’ve since moved to a vanilla LLVM11 toolchain and are making use of GFortran (GCC11) for the numbers published here, allowing for more apples-to-apples comparisons. The figures don’t change much for the C/C++ workloads, but we get a more complete set of figures for the suite thanks to the Fortran workloads. We keep flags very simple at just “-Ofast” and nothing else.

SPECint2017 Rate-1 Estimated Scores

In SPECint2017, the differences to the M1 are small. 523.xalancbmk showcases a large performance improvement; however, I don’t think this is due to changes on the chip, but rather to a change in Apple’s memory allocator in macOS 12. Unfortunately, we no longer have an M1 device available to us, so the M1 figures are still the older ones from earlier in the year on macOS 11.

Against the competition, the M1 Max either has a significant performance lead, or is able to at least reach parity with the best AMD and Intel have to offer. The chip however doesn’t change the landscape all too much.

SPECfp2017 Rate-1 Estimated Scores

SPECfp2017 also doesn’t change dramatically. 549.fotonik3d does score quite a bit better than on the M1, which could be tied to the greater available DRAM bandwidth, as this workload puts extreme stress on the memory subsystem. Otherwise the scores change very little compared to the M1, which is still on average well ahead of the laptop competition.

SPEC2017 Rate-1 Estimated Total

The M1 Max lands as the top-performing laptop chip in SPECint2017, just shy of being the best CPU overall, a title that still goes to the 5950X, but it is able to take and maintain the crown from the M1 in the FP suite.
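
As a rough illustration of how a suite total like the ones above is formed: SPEC CPU 2017 aggregates per-benchmark ratios with a geometric mean, and the sketch below assumes the estimated totals here follow that convention. The benchmark names and ratios in it are made up for illustration and are not figures from this review.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of per-benchmark ratios (all must be positive)."""
    ratios = list(ratios)
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("need at least one ratio, and all ratios must be > 0")
    # Averaging the logs avoids overflow from multiplying many ratios together.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical ratios, NOT measured data from this article.
baseline = {"bench_a": 5.0, "bench_b": 5.0, "bench_c": 5.0, "bench_d": 5.0}
# Same chip, but one workload doubles (e.g. from an OS or allocator change).
with_outlier = dict(baseline, bench_d=10.0)

total_baseline = geometric_mean(baseline.values())
total_outlier = geometric_mean(with_outlier.values())
arithmetic_outlier = sum(with_outlier.values()) / len(with_outlier)

print(f"geomean, baseline:        {total_baseline:.2f}")
print(f"geomean, one outlier:     {total_outlier:.2f}")
print(f"arithmetic, one outlier:  {arithmetic_outlier:.2f}")
```

With a geometric mean, a single workload doubling moves the total by less than it would under an arithmetic mean, which is one reason an outlier such as 523.xalancbmk shifts the suite total only modestly.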

Overall, the new M1 Max doesn’t deliver any large surprises on single-threaded performance metrics, which is also something we didn’t expect the chip to achieve.

493 Comments

  • techconc - Monday, October 25, 2021 - link

    I guess you missed the section where they showed the massive performance gains for the various content creation applications.
  • GatesDA - Monday, October 25, 2021 - link

    Apple currently has the benefit of an advanced manufacturing process. If it feels like future tech compared to Intel/AMD, that's because it is. The real test will be if it still holds up when x86 chips are on equal footing.

    Notably, going from M1 Pro to Max adds more transistors than the 3080 has TOTAL. This wouldn't be feasible without the transistor density of TSMC's N5 process. M1's massive performance CPU cores also benefit from the extra transistor density.

    Samsung and Intel getting serious about fabrication mean it'll be much harder for future Apple chips to maintain a process advantage. From the current roadmaps they'll actually fall behind, at least for a while.
  • michael2k - Monday, October 25, 2021 - link

    That's a tautology and therefore a fallacy and bad logic:
    Apple is only ahead because they're ahead. When they fall behind they will fall behind.

    You can deconstruct your fallacy by asking this:
    When will Intel get ahead of Apple? The answer is never, at least according to Intel itself:
    https://appleinsider.com/articles/21/03/23/now-int...

    By the time Intel has surpassed TSMC, it means Intel will need to have many more customers to absorb the costs of surpassing TSMC, because it means Intel's process advantage will be too expensive to maintain without the customer base of TSMC.
  • kwohlt - Tuesday, October 26, 2021 - link

    It's pretty clear that Apple will never go back to x86/64, and that they will be using in-house designed custom silicon for their Macs. Doesn't matter how good AMD or Intel get, Apple's roadmap on that front is set in stone for as far into the future as corporate roadmaps are made.

    Intel saying they hope to one day get A and M series manufacturing contracts suggests they're confident about their ability to rival TSMC in a few years, not that they will never be able to reach Apple Silicon perf/watt.

    Intel def won't come close to M series in perf/watt until at least 2025 with Royal Core Project, and even then, who knows, still probably not.
  • daveinpublic - Monday, October 25, 2021 - link

    So by your logic, Apple is ahead right now.

    Samsung and Intel are behind right now. And could be for a while.
  • Sunrise089 - Tuesday, October 26, 2021 - link

    The Apple chips have perf/watt numbers in some instances 400% better than the Intel competition. Just how much benefit are you expecting a node shrink to provide? Are you seriously suggesting Intel would see a doubling, tripling, or even quadrupling of perf/watt via moving to a smaller node? You are aware node shrink efficiency gains don’t remotely approach that level of improvement be it on Intel or TSMC, aren’t you?

    “Samsung and Intel getting serious about fabrication.” What does this even mean? Intel has been the world leader in fabrication investment and technology for decades before recently falling behind. How on earth could you possibly consider them not ‘serious’ about it?
  • AshlayW - Tuesday, October 26, 2021 - link

    Firestorm cores have >2X the transistors as Zen3/Sunny Cove cores in >2X the area on the same process (or slightly less). The cores are designed to be significantly wider making use of the N5 process, and yes, I very much expect at LEAST a doubling of perf/w from N5 CPUs from AMD, since they doubled Ryzen 2000 with 3000, and +25% from 3000 to 5000 on the same N7 node.
  • kwohlt - Tuesday, October 26, 2021 - link

    Ryzen 3000 doubled perf/watt over Ryzen 2000?? Which workloads on which SKUs are you comparing?
  • dada_dave - Monday, October 25, 2021 - link

    So I wonder why Geekbench scores have so far shown the M1 Max very far off its expected score relative to the M1 (22K)? I've checked other GPUs in its score range across a variety of APIs (including Metal) and so far they all show the expected scaling (or close enough) between TFLOP and GB score except the M1 Max. Even the 24-core Max is not that far off; it's the 32-core scores that are really far off. They should be in the 70Ks or even high 80Ks for perfect scaling, which is achieved by the 16-core Pro GPU, but the 32-core scores are actually in the upper 50Ks/low 60Ks. Do you have any hypotheses as to why that is? Also, does the 16" have the high performance mode supposedly coming (or here already)?
  • Andrei Frumusanu - Monday, October 25, 2021 - link

    The GB compute is too short in bursts and the GPU isn't ramping up to peak frequencies. Just ignore it.
