OS Preparation and Benchmark Installation

Windows 10 Pro

Since we started using Windows 10 Pro in our last update, there has been a large opportunity for something to come in and disrupt our testing. Windows 10 is known to kick in and check for updates at any hour of the day (and we’re testing around the clock), so anything that can interrupt a run or take CPU time away from benchmarking is a hassle. There’s also the added element of Windows silently adjusting the update schedule and moving settings around in the registry without warning.

While we were building this latest suite, Microsoft launched Windows 10 version 2004. There is always a question as to what we should do in this regard: move to the absolute latest, or take a step back to something more stable with fewer bugs, even though it might not be as relevant. In order to not create any level of programming debt, where lots of work is needed to fix the smallest issues that might arise, we often choose the latter. As a result, we are using Windows 10 version 1909 (18363.900). It has since transpired, from talking to peers, that 2004 has a number of issues that would affect benchmarking consistency, which validates our concerns.

Naturally, the first thing an OS wants to do when it starts up is connect to the internet and update. We install the OS without the internet connected, and our install image automatically sets the update deferral to the maximum period possible. The scripts we run are continuously updated to ensure that when the benchmark starts, the ‘don’t restart’ period for the OS is resynchronized to the latest possible time. There’s nothing worse than waking up in the morning to find that the system rebooted at 1am in the middle of a scripted run.

The OS is installed manually with most of the default settings, while disabling all the extra monitoring features offered on install. On entering the OS, our default strategy has several parts: disable the ability to update as much as possible in the registry, disable Windows Defender, uninstall OneDrive, disable Cortana as much as possible, enable the high performance mode in the power options, and stop the platform from turning off the display. We also pull the latest version of CPU-Z from network storage, in case we are testing a very new system. Another script runs when the OS loads to check that the CPU and GPU are what we expect and that the GPU drivers we need are in place, as Windows has a habit of updating those without saying anything. Windows Defender is disabled because, in my experience, it has historically seemed to eat CPU time for no reason whenever the network changes, even when the system is in use.
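A start-up hardware check of this kind might look something like the following Python sketch. It is an illustration rather than the script used for the suite: it assumes the `wmic` tool that ships with Windows 10 1909 is used for the query, and the function names and the expected name fragments are hypothetical.

```python
import subprocess
from typing import List


def parse_wmic_names(output: str) -> List[str]:
    """Return the values from `wmic ... get name` output, dropping the header row."""
    lines = [line.strip() for line in output.splitlines()]
    values = [line for line in lines if line]
    # The first non-empty line is the column header ("Name").
    return values[1:]


def query_names(alias: str) -> List[str]:
    """Query device names; alias is e.g. 'cpu' or 'path win32_VideoController'."""
    result = subprocess.run(
        ["wmic", *alias.split(), "get", "name"],
        capture_output=True, text=True, check=True,
    )
    return parse_wmic_names(result.stdout)


def missing_hardware(expected: List[str], detected: List[str]) -> List[str]:
    """Return every expected name fragment that no detected device contains."""
    lowered = [name.lower() for name in detected]
    return [want for want in expected
            if not any(want.lower() in name for name in lowered)]
```

On a test bed, calling `missing_hardware(["i9-10900K"], query_names("cpu"))` and aborting the run if the list is not empty would catch a mismatched system before any benchmark starts. The same pattern could be extended to driver versions.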

Some of these strategies are designed to be redundant. The goal here is to attack the option needed in as many different ways as possible. There’s nothing lost by being thorough at this point and hammering the point home. This means executing registry files that adjust settings, executing batch files which do the same while installing files, and reiterating these commands before every benchmark run so that nothing is left to chance. Simply put, do not implicitly trust Windows to leave the settings alone. Something invariably changes (or moves somewhere else) if it is not monitored. Some of the commands in place are also old/legacy, but are kept as they don’t otherwise adjust the system (and can take effect if options that are continually moved around suddenly move back).
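As a rough illustration of re-applying settings before every run, the Python sketch below loops over a list of commands and reports any that fail. The three commands are illustrative choices, not the exact list used in the suite: the `NoAutoUpdate` policy value and the built-in High performance power plan GUID are standard Windows settings, while `SETTINGS` and `reapply` are hypothetical names.

```python
import subprocess
from typing import Callable, List, Sequence

# Illustrative settings only. The registry value is the Windows Update
# "NoAutoUpdate" policy; the GUID is the built-in High performance power plan.
SETTINGS: List[List[str]] = [
    ["reg", "add", r"HKLM\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU",
     "/v", "NoAutoUpdate", "/t", "REG_DWORD", "/d", "1", "/f"],
    ["powercfg", "/setactive", "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"],
    ["powercfg", "/change", "monitor-timeout-ac", "0"],
]


def reapply(commands: Sequence[Sequence[str]],
            run: Callable = subprocess.run) -> List[List[str]]:
    """Run every command and return the ones that failed, without stopping early."""
    failed = []
    for cmd in commands:
        result = run(list(cmd))
        if result.returncode != 0:
            failed.append(list(cmd))
    return failed
```

Because every command is idempotent, calling `reapply(SETTINGS)` at the top of each benchmark costs almost nothing. The returned list makes it obvious when Windows has moved or rejected a setting.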

It is worth noting that some of the options, when run through a batch file, require the file to be run as Administrator. Recent versions of Windows 10 make this frustrating to do manually without some form of user access elevation. The best way to ensure that the batch file always runs in admin mode seems to be to create a shortcut to the batch file, and then adjust the properties of the shortcut to always enable the ‘run as admin’ mode. It is an interesting kludge, and it is frustrating that I cannot just adjust the batch file properties directly to run as admin every time.
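One way to avoid discovering a missing elevation halfway through a run is to check for it up front. The Python sketch below is a minimal example, not part of the suite: it relies on the Windows shell32 `IsUserAnAdmin` call, falls back to a root check on other platforms so it can be exercised anywhere, and the function names are hypothetical.

```python
import ctypes
import os


def is_elevated() -> bool:
    """Return True when the current process has administrator (or root) rights."""
    if hasattr(ctypes, "windll"):
        # Windows: shell32.IsUserAnAdmin returns non-zero for an elevated process.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Non-Windows fallback so the sketch can be run elsewhere.
    return os.geteuid() == 0


def require_elevation() -> None:
    """Stop early with a clear message instead of failing mid-run."""
    if not is_elevated():
        raise PermissionError("Run this script from an elevated (Administrator) prompt.")
```

This does not replace the shortcut trick, which is what actually grants the rights. It only turns a silent failure into an immediate one.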

Benchmark Installs

When choosing a benchmark, it often falls under one of two headings: standalone, such that it can be run as is, or requiring installation. Those that need installation are subdivided further into those with silent installers, and those that have to be installed manually.

Installing benchmarks can either be done before running the main script, or be integrated directly into the main testing script. As time has progressed, we have moved from the former to the latter, so we can wrap uninstall commands into the script if we only get limited access to a system. For the manually installed benchmarks this isn’t possible, and technically calling an install/uninstall from the script does make total testing time longer, but it also reduces requirements for SSD capacity by not having everything installed at once. Experience of doing this scripting over the past few years, and of making the benchmark scripts as portable as possible, has pointed to making the install/uninstall part of the benchmark run.
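The idea of wrapping install and uninstall around each benchmark can be sketched in Python as a context manager. This is an illustration under assumptions rather than our actual script: `installed` is a hypothetical helper, and the commands passed to it are whatever the package in question needs.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def installed(install_cmd, uninstall_cmd, run=subprocess.run):
    """Install before the benchmark body runs and always uninstall afterwards."""
    run(install_cmd, check=True)
    try:
        yield
    finally:
        # Runs even if the benchmark raises, so the drive is left clean.
        run(uninstall_cmd, check=True)
```

A benchmark shipped as an .msi could then be run inside `with installed(["msiexec", "/i", path, "/qb"], ["msiexec", "/x", path, "/qb"]):`, so the package only occupies disk space while its tests are running.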

Benchmarks that can be run without installing, known as ‘standalone’ benchmarks, are the holy grail. Cinebench and others are great for this. The rest are probed for silent install methods. Certain benchmarks in the past, such as PCMark8, also offer command line options to perform the online registration needed for their DRM. Other installers, such as .msi files, seem to be unable to install without the right commands if they are not in the directory from which the batch file was called. When scripting successive installs, it also becomes important to check that the previous one has finished before the next one starts, otherwise the script might jump straight to the next installer while earlier ones are still running.

For .msi files, our install code relies heavily on the following command to ensure that each install is finished before tackling the next one:

cmd /c start /wait msiexec /qb /i <file>

Most .msi files have the same flags for silent installs, however install executables can vary significantly and require probing the vendor documentation. For the most part, ‘/S’ is the silent install flag, while others require /norestart to ensure the system doesn’t restart immediately, or /quiet to get going in a silent fashion. Some installations use none of these and rely on their own definitions of what constitutes a silent install flag. I’m looking at you, Adobe. Ultimately, however, most software packages can be installed silently, with additional commands to enable licenses where required, and are then ready to be called for their respective tests.
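To illustrate how successive silent installs can be chained safely, here is a minimal Python sketch. It is not the code used in the suite: the helper names are hypothetical, and the default ‘/S’ flag for executables is only the common case and has to be checked per vendor. It resolves each installer to an absolute path so that .msi files do not depend on the calling directory.

```python
import subprocess
from pathlib import Path


def silent_install_command(installer, exe_flags=("/S",)):
    """Build a silent install command; exe_flags vary by vendor and must be checked."""
    # Absolute path, so the command works regardless of the calling directory.
    path = Path(installer).resolve()
    if path.suffix.lower() == ".msi":
        return ["msiexec", "/i", str(path), "/qb", "/norestart"]
    return [str(path), *exe_flags]


def install_all(installers, run=subprocess.run):
    """Install one package at a time; subprocess.run blocks until each process exits."""
    for installer in installers:
        run(silent_install_command(installer), check=True)
```

One caveat: some installer executables hand off to a child process and return early, in which case waiting on the parent is not enough and the script needs an additional check that the install has finished.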

One benchmark is a special case: Chrome. Chrome has the amazing ability to update itself as soon as it is installed, even without opening it, or when the system is booted. Stopping this takes more than a simple software adjustment, purely because Google no longer offers an option to delay updates. We initially found an undocumented way to stop it from updating, which required the install script to gut some of the files after installing the software. However, the quick update cycle of Chrome means that our v56 version from last year is now out of date. To get around this, we are using a standalone version of Chromium.

The final benchmark in our install is Steam, which is a fully manual install. Valve has built Steam with a really odd interface interaction mechanism, even for AHK scripting, which makes installing Steam a bit of a hassle. Valve does not offer a complete standalone installer here, so the base program opens after installation to download ~200MB of updates on a fresh system. We install the software over the Steam directory already present on the benchmark partition from a previous OS install, so the games do not need to be re-downloaded. (When an OS is installed, it’s installed on a specific OS partition, and all benchmarks are kept on a second partition.)

One other point to be aware of is when software checks for updates. Loading AIDA, for example, means that it will probe online for the latest version and leave a hanging message box to be answered before a script can continue. There are often two ways to deal with this, and the best is when the program allows the user to set a ‘no updates’ option in its configuration files. The fallback tactic that works is to disable internet connectivity (often by disabling all network adaptors through PowerShell) while the application is running.
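The fallback tactic can be sketched in Python as a context manager that disables the adaptors, runs the application, and restores the network afterwards. It uses the standard PowerShell `Disable-NetAdapter` and `Enable-NetAdapter` cmdlets and needs an elevated prompt. The helper names are hypothetical and the sketch is an illustration, not the suite's own code.

```python
import subprocess
from contextlib import contextmanager


def _powershell(command, run=subprocess.run):
    """Run a single PowerShell command and fail loudly if it returns an error."""
    run(["powershell", "-NoProfile", "-Command", command], check=True)


@contextmanager
def network_disabled(run=subprocess.run):
    """Disable every network adaptor for the duration of the block, then re-enable."""
    _powershell('Disable-NetAdapter -Name "*" -Confirm:$false', run)
    try:
        yield
    finally:
        # Restore connectivity even if the benchmark inside the block fails.
        _powershell('Enable-NetAdapter -Name "*" -Confirm:$false', run)
```

Note that this re-enables every adaptor on exit, including any that were deliberately disabled beforehand, so a system with a specific adaptor layout would need to record and restore the original state instead.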


  • vasily - Monday, July 20, 2020 - link

    You might want to check out Phoronix Test Suite and openbenchmarking.org.

    https://www.phoronix-test-suite.com/
    https://openbenchmarking.org/
  • colinisation - Monday, July 20, 2020 - link

    would love to see the following processors added
    5775C (overclocked to 4Ghz) - just purely to see what impact the eDRAM has on workloads
    4770K
    7600K

    Phenom II X4
    Highest Bulldozer core

    VIA's highest performance x86 core
  • faizoff - Monday, July 20, 2020 - link

    What a gargantuan project this is going to be. And I cannot wait, oddly enough I've been using the bench tool the past few weeks to get a sense of how much difference an upgrade for me would make.

    I am probably one of the many (or few) people that have still held on to their i5 2500k and this is one of the places I can select that CPU and compare the benchmarks with newer releases.

    This project looks to be an amazing read once all done and will be especially looking forward to those segments "how well does x CPU run today?"
  • Alim345 - Monday, July 20, 2020 - link

    Are you going to make benchmark scripts available? They should be useful for individual comparisons, since many users might have overclocked CPUs which were more common in 2010-2015.
  • brantron - Monday, July 20, 2020 - link

    Just to fill out the starting set:

    7700K needs a common AMD counterpart, i.e. Ryzen 2600
    Sandy or Ivy Bridge i7
    Haswell i7

    That would also make for a good article, as it should be possible to overclock any of those to ~4.5 GHz for a more apples to apples comparison.
  • StormyParis - Monday, July 20, 2020 - link

    Thank you for that. My main question is not "what should I buy", because that's always very well covered, and on a fixed budget there's never much choice anyway, but "should I upgrade *now*", which is only worth it when last time's amount of money gets you at least 2x performance. I've got a 7yo Core i5... I'll look into it !
  • eastcoast_pete - Monday, July 20, 2020 - link

    Ian, thanks for this!
    One aspect I've wondered about for a while is whether you could include performance/Watt in your tests and comparisons going forward? I know that's usually done for server CPUs, but I also find it of interest for desktop and laptop CPUs.
  • thebigteam - Monday, July 20, 2020 - link

    I think I have the below list of Intel CPUs available if needed, likely with working mobos too. Would be very happy to clean out the closet and get these to you guys :) Likely some 2009/2010 Athlons as well
    E8400
    i3 530
    i3 540
    i5 760
    i5 2500
    i5 4670K
  • inighthawki - Monday, July 20, 2020 - link

    Thank you so much for changing your gaming benchmark methodology. I tend to play my games at 1440p on lowest settings for maximum framerates, which is far more often than not CPU bound. It was always so annoying seeing the benchmarks be GPU bound when I'm trying to see how much a new CPU helps.
  • Smell This - Monday, July 20, 2020 - link

    Chicken
    (lol)

    With AM3, AM2+ and AM2 processors, AM3+ processors broke backwards-compatibility.

    A mobo like the MSI 790FX K9A2 Platinum transitioned nearly 250 processors from S754-939, to AM2-AM3, beginning with the single-core Athlon 64 3000+ 'Orleans' up to the PhII x6 DDR3 Thubans.

    These were the progeny of the K8 or 'Hammer' projects. A Real Man would never leave them behind ...

    https://www.cpu-upgrade.com/mb-MSI/K9A2_Platinum_%...
