Titan’s Compute Performance, Cont.

With Rahul having covered the basis of Titan’s strong compute performance, let’s shift gears a bit and take a look at real world usage.

On top of Rahul’s work with Titan, as part of our 2013 GPU benchmark suite we put together a larger number of compute benchmarks to try to cover real world usage, including the old standards of gaming usage (Civilization V) and ray tracing (LuxMark), along with several new tests. Unfortunately that got cut short when we discovered that OpenCL support is currently broken in the press drivers, which prevents us from using several of our tests. We still have our CUDA and DirectCompute benchmarks to look at, but a full look at Titan’s compute performance on our 2013 GPU benchmark suite will have to wait for another day.

For their part, NVIDIA of course already has OpenCL working on GK110 with Tesla. The issue is that somewhere between that and bringing up GK110 for Titan by integrating it into NVIDIA’s mainline GeForce drivers (specifically the new R314 branch), OpenCL support was broken. Since the underlying support already exists we expect this will be fixed in short order, but it’s not something NVIDIA checked for ahead of the press launch of Titan, and it’s not something they could fix in time for today’s article.

Unfortunately this means that comparisons with Tahiti will be few and far between for now. Most significant cross-platform compute programs are based on OpenCL rather than DirectCompute, so short of games and a couple of other cases such as Ian’s C++ AMP benchmark, we don’t have many cross-platform benchmarks to look at. With that out of the way, let’s dive into our condensed collection of compute benchmarks.

We’ll once more start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the few games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Note that for 2013 we have changed the benchmark a bit, moving from using a single leader to using all of the leaders. As a result the reported numbers are higher, but they’re also not comparable with the results for this benchmark in our 2012 datasets.

Civilization V launched in 2010, and graphics cards have become significantly more powerful since then, far outpacing growth in the CPUs that feed them. As a result we’ve rather quickly drifted from being GPU bottlenecked to being CPU bottlenecked, as we see both in our Civ V game benchmarks and our DirectCompute benchmarks. For high-end GPUs the performance difference is rather minor; the gap between GTX 680 and Titan for example is 45fps, or just less than 10%. Still, it’s at least enough to get Titan past the 7970GE in this case.

Our second test is one of our new tests, utilizing Elcomsoft’s Advanced Office Password Recovery utility to take a look at GPU password generation. AOPR has separate CUDA and OpenCL kernels for NVIDIA and AMD cards respectively, which means it doesn’t follow the same code path on all GPUs but it is using an optimal path for each GPU it can handle. Unfortunately we’re having trouble getting it to recognize AMD 7900 series cards in this build, so we only have CUDA cards for the time being.

Password generation and other forms of brute force crypto are an area where the GTX 680 is particularly weak, thanks to the various compute aspects that have been stripped out in the name of efficiency. As a result it ends up below even the GTX 580 in these benchmarks, never mind AMD’s GCN cards. But with Titan/GK110 offering NVIDIA’s full compute performance, it rips through this task. In fact it more than doubles performance from both the GTX 680 and the GTX 580, indicating that the huge performance gains we’re seeing are coming from not just the additional functional units, but from architectural optimizations and new instructions that improve overall efficiency and reduce the number of cycles needed to complete work on a password.
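
For readers unfamiliar with this kind of workload, here is a minimal CPU-side sketch in Python of what a brute force password search looks like. It is not AOPR's code: Elcomsoft's CUDA and OpenCL kernels are proprietary, and Office documents use a more elaborate hashing scheme than the single SHA-1 pass used here as a stand-in. The function names `candidates` and `brute_force` are hypothetical. What the sketch does show is why the task maps so well onto a GPU: every candidate is hashed independently of every other, so throughput scales with how many hashes the hardware can compute at once.

```python
import hashlib
import itertools
import string
import time


def candidates(alphabet, max_len):
    """Yield every string over `alphabet` of length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)


def brute_force(target_digest, alphabet=string.ascii_lowercase, max_len=4):
    """Return (password, tried) for the first candidate whose hash matches.

    A single SHA-1 pass is an assumption made for illustration only;
    real document formats use their own key derivation.
    """
    tried = 0
    for candidate in candidates(alphabet, max_len):
        tried += 1
        # Each candidate is hashed independently, which is what makes
        # this workload easy to spread across thousands of GPU threads.
        if hashlib.sha1(candidate.encode("utf-8")).digest() == target_digest:
            return candidate, tried
    return None, tried


if __name__ == "__main__":
    secret = "gpu"  # hypothetical password for the demo
    digest = hashlib.sha1(secret.encode("utf-8")).digest()
    start = time.perf_counter()
    found, tried = brute_force(digest)
    elapsed = time.perf_counter() - start
    print(f"found {found!r} after {tried} candidates "
          f"({tried / elapsed:,.0f} passwords/second)")
```

The passwords/second figure this prints is the same kind of metric the benchmark reports, though a single-threaded Python loop and a GPU kernel are obviously not comparable in absolute terms.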

Altogether at 33K passwords/second Titan is not just faster than GTX 680, but it’s faster than GTX 690 and GTX 680 SLI, making this a test where one big GPU (and its full compute performance) is better than two smaller GPUs. It will be interesting to see where the 7970 GHz Edition and other Tahiti cards place in this test once we can get them up and running.
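
To put a passwords/second figure in context, a quick back-of-the-envelope calculation helps. The sketch below takes the 33K passwords/second rate quoted above and estimates how long an exhaustive search would take. The keyspace (lowercase letters only, up to 6 characters) is purely an assumption chosen for illustration, and the helper names `keyspace_size` and `seconds_to_exhaust` are hypothetical.

```python
def keyspace_size(alphabet_size, max_len):
    """Number of distinct passwords of length 1..max_len over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


def seconds_to_exhaust(alphabet_size, max_len, rate_per_second):
    """Worst-case time to try every password at a fixed rate."""
    return keyspace_size(alphabet_size, max_len) / rate_per_second


if __name__ == "__main__":
    rate = 33_000  # passwords/second, the Titan figure quoted in the text
    # Assumed keyspace: lowercase letters only, 1 to 6 characters.
    total = keyspace_size(26, 6)
    hours = seconds_to_exhaust(26, 6, rate) / 3600
    print(f"{total:,} candidates, about {hours:.1f} hours worst case")
```

Under those assumptions the full search comes to roughly 321 million candidates and about 2.7 hours in the worst case. Because each extra character multiplies the keyspace by the alphabet size, doubling the rate only buys a small fraction of one more character of search depth.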

Our final test in our abbreviated compute benchmark suite is our very own Dr. Ian Cutress’s SystemCompute benchmark, which is a collection of several different fundamental compute algorithms. Rahul went into greater detail on this back in his look at Titan’s compute performance, but I wanted to go over it again quickly with the full lineup of cards we’ve tested.

Surprisingly, for all of its performance gains relative to GTX 680, Titan still falls notably behind the 7970GE here. Given Titan’s theoretical performance and the fundamental nature of this test we would have expected it to do better. But without additional cross-platform tests it’s hard to say whether this is something where AMD’s GCN architecture continues to shine over Kepler, or if perhaps it’s a weakness in NVIDIA’s current DirectCompute implementation for GK110. Time will tell on this one, but in the meantime this is the first solid sign that Tahiti may be more of a match for GK110 than it’s typically given credit for.

337 Comments

  • varg14 - Thursday, February 21, 2013 - link

    I will hang on to my SLI 560 tis for a while longer. Since i game at 1080p they perform very well.
  • mayankleoboy1 - Thursday, February 21, 2013 - link

    Some video conversion benchmarks please.
  • mayankleoboy1 - Thursday, February 21, 2013 - link

    Ohh, and the effect of PCIE2.0 VS PCIE3.0 also. Let's see how much the Titan is gimped by PCIE2.0
  • Ryan Smith - Thursday, February 21, 2013 - link

    This isn't something we can do at this second, but it's definitely something we can follow up on once things slow down a bit.
  • mayankleoboy1 - Thursday, February 21, 2013 - link

    Sure. I am looking forward to a part three of the Titan review
  • Hrel - Thursday, February 21, 2013 - link

    The problem with that reasoning, that they're raising here, is that the 7970 is almost as fast and costs a lot less. The Titan is competing, based on performance, with the 7970. Based on that comparison it's a shitty deal.

    http://www.newegg.com/Product/Product.aspx?Item=N8...

    $430. So based on that I'd say the highest price you can justify for this card is $560. We'll round up to $600.

    Nvidia shipping this, at this price, and just saying "it's a luxury product" is bullshit. It's not a luxury product, it's their version of a 7970GHE. But they want to try and get a ridiculous profit to support their PhysX and CUDA projects.

    Nvidia just lost me as a customer. This is the last straw. This card should be pushing the pricing down on the rest of their lineup. They SHOULD be introducing it to compete with the 7970GHE. Even at my price range, compare the GTX660 to the 7870GHE, or better yet the sub $200 7850. They just aren't competitive anymore. I'll admit, I was a bit of a Nvidia fan boy. Loved their products. Was disappointed by older ATI cards and issues I had with them. (stability, screen fitting properly, audio issues) But ATI has become AMD and they've improved quality a lot and Nvidia is USING their customers loyalty; that's just wrong.

    I'm done with Nvidia on the desktop. By the time I need a new laptop AMD will probably have the graphics switching all sorted; so I'm probably done with Nvidia on laptops too.
  • CeriseCogburn - Saturday, February 23, 2013 - link

    LOL - be done, and buy the alternative crap - amd.

    You'll be sorry, and when you have to hold back admitting it, I'll be laughing the whole time.

    Poor baby can't pony up the grand, so he's boycotting the whole line.
    You know you people are the sickest freaks the world has ever seen, and frankly I don't believe you, and consider you insane.

    You're all little crybaby socialist activists. ROFL You're all pathetic.

    nVidia won't listen to you, so go blow on your amd crash monkey, you and two other people will do it before amd disappears into bankruptcy, and then we can laugh at your driver less video cards.

    I never seen bigger crybaby two year olders in my entire life. You all live in your crybaby world together, in solidarity - ROFL

    No one cares if you lying turds claim you aren't buying nVidia - they have billions and are DESTROYING amd because you cheapskate losers cost amd money - LOL

    YOU ARE A BURDEN AND CANNOT PAY FOR THE PRODUCTION OF A VIDEO CARD !

    Enjoy your false amd ghetto loser lifestyle.
  • Soulnibbler - Thursday, February 21, 2013 - link

    Hey, I'm excited about the fp64 performance but I'm not going to have any time to write code for a bit so I'll ask the question that would let me justify buying a card like this:

    How much acceleration should I expect using this card with Capture One as compared to AMD/software rendering. I've heard anecdotal evidence that the openCL code paths in version 7 make everything much faster, but I'd like a metric before I give up my current setup (windows in VMware) and dual-boot to get openCL support.

    I know openCL is not yet ready on this card but when you revisit it could we see a little Capture One action?

    Preferably the benchmark sets would be high resolution images at both high and low iso.
  • Ryan Smith - Monday, February 25, 2013 - link

    I'm afraid I don't know anything about Capture One. Though if you could describe it, that would be helpful.
  • Soulnibbler - Monday, February 25, 2013 - link

    Capture One is a raw developer for digital cameras.
    http://www.phaseone.com/en/Imaging-Software.aspx
    notably for medium format digital backs but also for 35mm and aps sensors. It could be considered a competitor to Adobe's Lightroom and ACR software but the medium format camera support and workflow are the major differentiators.

    The last two releases have had openCL support for both previews and exporting which I've heard has led to reductions in time to get an image through post.

    I'd imagine that one could benchmark on a large library of photos and determine if this card as a compute card is any improvement over standard gaming cards in this use scenario.

    I'd imagine this is part of the market that NVIDIA is aiming at as I know at least one user who switched to an ATI W7000 for openCL support with Capture One.
