Intel's Xeon Cascade Lake vs. NVIDIA Turing: An Analysis in AI

Author: Johan De Gelas

by Johan De Gelas on July 29, 2019 8:30 AM EST

AI Is More Than Deep Learning

At a high level, while deep learning is a form of artificial intelligence, the converse isn't always true; an application implementing AI does not necessarily use deep learning. Many AI applications use “conventional statistical” or “traditional” machine learning. After all, Support Vector Machines, logistic regression, k-nearest neighbors, Naive Bayes, and decision trees still make a lot of sense for automating information classification, especially if you don’t have a lot of data.
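To make the "not a lot of data" point concrete, here is a minimal sketch of a k-nearest neighbors classifier in plain Python. The six training points, the two features (word count and link count), and the labels are all made up for illustration; nothing here comes from our benchmarks.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort the labeled points by Euclidean distance to the query, keep the k nearest.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (word_count, link_count) -> label.
train = [
    ((120, 9), "spam"), ((90, 7), "spam"), ((150, 12), "spam"),
    ((300, 1), "ham"), ((250, 0), "ham"), ((400, 2), "ham"),
]

print(knn_predict(train, (110, 8)))
print(knn_predict(train, (280, 1)))
```

There is no training phase at all: the "model" is simply the labeled data, which is why methods like this remain practical when only a handful of examples exist.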

For example, Conditional Random Fields (CRFs) are used in natural language processing, and a lot of recommendation engines are based upon Boltzmann Machines, Alternating Least Squares (ALS), and so on. Case in point: one of our most demanding and unique benchmarks, our "big data" benchmark, uses an ALS algorithm as its recommendation engine ("collaborative filtering").
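For readers who have not met ALS before, the idea can be sketched in a few lines. The example below is a deliberately tiny rank-1 version in plain Python, not the Spark implementation our benchmark uses: each user and each item gets a single latent factor, and the function name `als_rank1`, the regularization value, and the toy ratings are all assumptions made for illustration.

```python
def als_rank1(ratings, n_users, n_items, reg=0.01, iterations=30):
    """Fit rating(u, i) ~ user[u] * item[i] on the observed ratings only.

    ratings maps (user_index, item_index) -> observed rating.
    """
    user = [1.0] * n_users
    item = [1.0] * n_items
    for _ in range(iterations):
        # Step 1: hold item factors fixed; each user factor then has a
        # closed-form regularized least-squares solution.
        for u in range(n_users):
            seen = [(i, r) for (uu, i), r in ratings.items() if uu == u]
            user[u] = (sum(r * item[i] for i, r in seen)
                       / (reg + sum(item[i] ** 2 for i, _r in seen)))
        # Step 2: hold user factors fixed and solve for each item factor.
        for i in range(n_items):
            seen = [(uu, r) for (uu, ii), r in ratings.items() if ii == i]
            item[i] = (sum(r * user[uu] for uu, r in seen)
                       / (reg + sum(user[uu] ** 2 for uu, _r in seen)))
    return user, item

# Toy data: every rating is (user taste) * (item appeal), with taste 1, 2, 3
# and appeal 4, 2, 1. The rating of user 2 for item 2 (true value 3) is hidden.
ratings = {
    (0, 0): 4, (0, 1): 2, (0, 2): 1,
    (1, 0): 8, (1, 1): 4, (1, 2): 2,
    (2, 0): 12, (2, 1): 6,
}
user, item = als_rank1(ratings, n_users=3, n_items=3)
predicted = user[2] * item[2]
print(round(predicted, 2))
```

The "alternating" part is the two inner loops: with one side frozen, the other side is an ordinary least-squares problem, and every user (or item) can be solved independently of the others. That independence is what lets a cluster spread the work across many CPU cores.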

Of course, the use of neural networks (itself a whole field of study) is booming, and they tend to dominate the latest AI applications. Neural networks are also among the most demanding workloads, requiring lots of processing power and driving expensive (and well-publicized) hardware contracts. All of which contrasts heavily with logistic regression, which remains the most used machine learning method, and also happens to need much less processing.

The reason for this difference in processing power requirements is actually pretty simple. To quote Wouter Gevaert, an AI expert at the university department I work in:

“Each Neuron in a Neural Network can be considered like a logistic regression unit. Therefore, a Neural Network is like a massive amount of logistic regressions” (that is, when you use a sigmoid as the activation function).
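The quote can be illustrated with a short sketch in plain Python. The weights, the input vector, and the layer sizes below are arbitrary values chosen for illustration, and `multiply_adds` is a hypothetical helper that only counts the weight multiplications in fully connected layers, ignoring biases and activations.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, weights, bias):
    """Probability produced by one logistic regression model for input x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def dense_sigmoid_layer(x, weight_rows, biases):
    """A fully connected sigmoid layer: one logistic regression per neuron."""
    return [logistic_regression(x, w, b) for w, b in zip(weight_rows, biases)]

def multiply_adds(layer_sizes):
    """Multiply-add count for one forward pass through dense layers."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

x = [0.5, -1.0, 2.0]
single = logistic_regression(x, [0.1, 0.2, 0.3], 0.0)
layer = dense_sigmoid_layer(x, [[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]], [0.0, 0.0])

print(single == layer[0])                   # the first neuron is that same model
print(multiply_adds([3, 1]))                # one logistic regression, 3 features
print(multiply_adds([784, 512, 512, 10]))   # a small hypothetical network
```

Even this modest hypothetical network needs several hundred thousand multiply-adds per input where a single logistic regression needs three, and training repeats that work for every example and every pass over the data.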

With all of that said, while neural networks are the most processing-intensive of AI technologies (especially with a large number of layers), several traditional machine learning techniques also require a lot of processing power. Support Vector Machines, with their complex transformations, tend to require a lot of computational time, for example. And in our Spark test, the Stanford NER system is based on a supervised CRF model trained on a labeled collection of English data. In that test, it has to crunch through a massive amount of unstructured text: several hundred gigabytes.

And of course, most analytical queries are still written in good old SQL. For structured and semi-structured data, for OLAP cubes and so on, SQL code is still prevalent. As a single SQL query is nowhere near as parallel as a neural network (in many cases it is 100% sequential), the CPU is the best tool for the job.

So in practice, most data (pre)processing and lots of AI software still runs on a CPU. GPUs mostly run massively parallel HPC applications and neural networks, an important market to be sure, but still only a piece of the larger AI market. This is one of the reasons why NVIDIA brought in $3 billion of datacenter revenue last year, while Intel’s datacenter group made $20 billion. Yes, Intel’s number includes networking and storage. Yes, that includes several markets other than HPC and data analytics. But still, a significant part of that revenue is going to be based upon servers that store and process data for analytics.

Complicating this picture, however, is not just revenue, but opportunities for growth. NVIDIA has been seeing massive growth in the datacenter market, while Intel has only seen single-digit growth. Customer needs are continuing to shift as new technologies become available; the battle for the data analytics market has begun, and it is intensifying.

56 Comments


ballsystemlord - Saturday, August 3, 2019 - link
Spelling and grammar errors:

"But it will have a impact on total energy consumption, which we will discuss."
"An" not "a":
"But it will have an impact on total energy consumption, which we will discuss."

"We our newest servers into virtual clusters to make better use of all those core."
Missing "s" and missing word. I guessed "combine".
"We combine our newest servers into virtual clusters to make better use of all those cores."

"For reasons unknown to us, we could get our 2.7 GHz 8280 to perform much better than the 2.1 GHz Xeon 8176."
The 8280 is only slightly faster in the table than the 8176. It is the 8180 that is missing from the table.

"However, since my group is mostly using TensorFlow as a deep learning framework, we tend to with stick with it."
Excess "with":
"However, since my group is mostly using TensorFlow as a deep learning framework, we tend to stick with it."

"It has been observed that using a larger batch can causes significant degradation in the quality of the model,..."
Remove plural form:
"It has been observed that using a larger batch can cause significant degradation in the quality of the model,..."

"...but in many applications a loss of even a few percent is a significant."
Excess "a":
"...but in many applications a loss of even a few percent is significant."

"LSTM however come with the disadvantage that they are a lot more bandwidth intensive."
Add an "s":
"LSTMs however come with the disadvantage that they are a lot more bandwidth intensive."

"LSTMs exhibit quite inefficient memory access pattern when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."
"pattern" should be plural because "LSTMs" is plural, I choose an "s":
"LSTMs exhibit quite inefficient memory access patterns when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."

"Of course, you have the make the most of the available AVX/AVX2/AVX512 SIMD power."
"to" not "the":
"Of course, you have to make the most of the available AVX/AVX2/AVX512 SIMD power."

"Also, this another data point that proves that CNNs might be one of the best use cases for GPUs."
Missing "is":
"Also, this is another data point that proves that CNNs might be one of the best use cases for GPUs."

"From a high-level workflow perfspective,..."
A joke, or a misspelling?

"... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter doesn't completely suck at."
Traditionally, AT has had no language.
"... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter is good at."

"It is been going on for a while,..."
"has" not "is":
"It has been going on for a while,..."

ballsystemlord - Saturday, August 3, 2019 - link
Thanks for the cool article!

tmnvnbl - Tuesday, August 6, 2019 - link
Great read, especially liked the background and perspective next to the benchmark details

dusk007 - Tuesday, August 6, 2019 - link
Great Article.
I wouldn't call Apache Arrow a database though. It is a data format more akin to a file format like csv or parquet. It is not something that stores data for you and gives it to you. It is the how to store data in memory. Like CSV or Parquet are a "how to" store data in Files. More efficient less redundancy less overhead when access from different runtimes (Tensorflow, Spark, Pandas,..).

Love the article, I hope we get more of those. Also that huge performance optimizations are possible in this field just in software. Often renting compute in the cloud is cheaper than the man hours required to optimize though.

Emrickjack - Thursday, August 8, 2019 - link
Johan's new piece in 14 months! Looking forward to your Rome review