MediaTek Unveils Helio X20 Tri-Cluster 10-Core SoC
by Andrei Frumusanu on May 12, 2015 4:00 AM EST
Posted in: Smartphones, Mobile, MediaTek, Cortex A72, Helio, Helio X20, MT6597
Today MediaTek announces their brand new flagship SoC for smartphones and tablets, the Helio X20. MediaTek continues their Helio SoC branding announced earlier in the year, making the X20 the second SoC in the X-lineup and the first one to be actually released with the new product name from the beginning (as the X10 was a direct name change from the MT6795).
Right off the bat, MediaTek raises eyebrows with what is the first 10-core System-on-a-Chip design. The 10 processor cores are arranged in three clusters, a departure from the myriad dual-cluster big.LITTLE heterogeneous CPU designs. The three clusters consist of a low-power quad-core A53 cluster clocked at 1.4GHz, a power/performance balanced quad-core A53 cluster at 2.0GHz, and an extreme-performance dual-core A72 cluster clocked in at 2.5GHz. To achieve this tri-cluster design, MediaTek chose to employ a custom interconnect IP called the MediaTek Coherent System Interconnect (MCSI).
We'll get back to the new CPU arrangement in a bit, but first let's look at what the rest of the SoC offers. MediaTek is proud to present its first CDMA2000 compatible integrated modem with the X20. This is an important stepping stone as the company attempts to enter the US market and breach Qualcomm's stronghold on North American modems and SoCs. Besides C2K, the X20's modem supports LTE Release 11 Category 6 with 20+20MHz Carrier Aggregation (downstream), for speeds of up to 300Mbps downstream and 50Mbps upstream. The new modem is also supposed to use 30% less power compared to the Helio X10.
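To make the carrier aggregation figures concrete, here is a minimal sketch of the arithmetic. It assumes each 20MHz downlink carrier peaks at roughly 150Mbps (the Cat. 4 single-carrier figure), which is a simplification that ignores real-world radio conditions; the function and constant names are my own.

```python
# Assumption: one 20 MHz LTE downlink carrier peaks at about 150 Mbps,
# matching the Cat. 4 single-carrier figure. Real throughput is lower.
PER_CARRIER_DL_MBPS = 150


def aggregated_downlink_mbps(carriers: int,
                             per_carrier_mbps: int = PER_CARRIER_DL_MBPS) -> int:
    """Peak downlink rate when `carriers` identical carriers are aggregated."""
    return carriers * per_carrier_mbps


print(aggregated_downlink_mbps(2))  # 2x20MHz, as on the X20's Cat. 6 modem -> 300
print(aggregated_downlink_mbps(3))  # 3x20MHz, as on a Cat. 9 modem -> 450
```

Under that assumption, the X20's 2x20MHz aggregation lands at the quoted 300Mbps, while a third carrier would be needed for 450Mbps.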
The SoC also has integrated 802.11ac Wi-Fi with what seems to be a single spatial stream, rated in the spec sheets at up to 280Mbps.
MediaTek Helio X20 vs The Competition
SoC | MediaTek Helio X20 (MT6797) | MediaTek Helio X10 (MT6795) | Qualcomm Snapdragon 808 (MSM8992) | Qualcomm Snapdragon 620 (MSM8976) |
CPU | 4x Cortex A53 @1.4GHz, 4x Cortex A53 @2.0GHz, 2x Cortex A72 @2.3-2.5GHz | 4x Cortex A53 @2.2GHz, 4x Cortex A53 @2.2GHz | 4x Cortex A53 @1.44GHz, 2x Cortex A57 @1.82GHz | 4x Cortex A53 @1.2GHz, 4x Cortex A72 @1.8GHz |
Memory Controller | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w |
GPU | Mali T8??MP4 @700MHz | PowerVR G6200 @700MHz | Adreno 418 @600MHz | "Next-gen" Adreno |
Encode/Decode | 2160p30 10-bit H.264/HEVC/VP9 decode; 2160p30 HEVC w/HDR encode | 2160p30 10-bit H.264/HEVC/VP9 decode; 2160p30 HEVC encode | 2160p30, 1080p120 H.264 & HEVC decode; 2160p30, 1080p120 H.264 encode | 2160p30, 1080p120 H.264 & HEVC |
Camera/ISP | Dual ISP 32MP @ 24fps | 13MP | Dual ISP 21MP | Dual ISP 21MP |
Integrated Modem | LTE Cat. 6, 300Mbps DL, 50Mbps UL, 2x20MHz C.A. (DL) | LTE Cat. 4, 150Mbps DL, 50Mbps UL | "X10 LTE" Cat. 9, 450Mbps DL, 50Mbps UL, 3x20MHz C.A. (DL) | "X8 LTE" Cat. 7, 300Mbps DL, 100Mbps UL, 2x20MHz C.A. (DL & UL) |
Video encoding and decoding capabilities seem to be carried over from the MT6795 / X10, but MediaTek advertises a 30% and 40% improvement in decoding and encoding power consumption respectively.
Still on the multimedia side, we see the employment of a new integrated Cortex-M4 companion core. It serves both as an audio processor for low-power audio decoding, speech enhancement and voice recognition, and as a sensor hub, acting as a microcontroller that offloads sensor data processing from the main CPU cores. This means that while the device has its display turned off but is playing audio, only the M4 is in use, which prolongs battery life.
On the GPU side, the X20 seems to be the first officially announced SoC with a Mali T800 series GPU. MediaTek explains that this is a still-unreleased high-end ARM Mali GPU similar to the T880. MediaTek initially chose a more conservative MP4 configuration clocked at 700MHz, although final specifications are being withheld at this time. It should be noted that MediaTek has traditionally never aimed very high in terms of GPU configurations. Even so, the X20's GPU could remain competitive under prolonged sustained loads, as we have seen larger Mali implementations, such as those in Samsung's Exynos SoCs, fail to stay within their thermal envelope at their maximum rated frequencies. MediaTek's initial relative estimates for the X20, compared to the Helio X10's G6200, show a 40% improvement in performance with a 40% drop in power.
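If both of those GPU figures hold at the same time on the same workload (an assumption on my part, as MediaTek's estimate does not spell this out), the implied efficiency gain can be worked out as follows. The function name is purely illustrative.

```python
def perf_per_watt_gain(perf_gain: float, power_drop: float) -> float:
    """Relative performance-per-watt versus the baseline.

    perf_gain:  fractional performance increase (0.40 for +40%)
    power_drop: fractional power reduction (0.40 for -40%)
    """
    return (1.0 + perf_gain) / (1.0 - power_drop)


# Assumption: +40% performance and -40% power apply simultaneously.
gain = perf_per_watt_gain(0.40, 0.40)
print(f"{gain:.2f}x performance per watt")  # 2.33x performance per watt
```

In other words, taken together the two claims would amount to a bit more than a doubling of GPU efficiency over the X10, which is worth keeping in mind when real devices can be measured.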
On the memory side, MediaTek sticks with a 2x32-bit LPDDR3 memory interface running at 933MHz. MediaTek reasons that the SoC is limited to 1440p devices and that LPDDR3 memory should be enough to satisfy the SoC's bandwidth requirements (a notion I agree with, given the GPU configuration).
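For reference, the 14.9GB/s figure in the table follows directly from the interface width and clock. A quick sketch of the calculation, assuming the usual double-data-rate behaviour of two transfers per clock (the helper name is my own):

```python
def lpddr_bandwidth_gbs(channels: int, bus_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = channels * bus_bits / 8          # total bus width in bytes
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# 2x 32-bit channels at 933MHz, double data rate
bandwidth = lpddr_bandwidth_gbs(channels=2, bus_bits=32, clock_mhz=933)
print(f"{bandwidth:.1f} GB/s")  # 14.9 GB/s
```

This is a theoretical peak; sustained bandwidth in practice will be lower.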
Going back to the signature 10-Core/Tri-Cluster architecture of the SoC, MediaTek explains that this was a choice of power optimization over conventional two-cluster big.LITTLE designs. b.L works by employing heterogeneous CPU clusters - these may differ in architecture, but can also be identical architectures which then differ in their electrical characteristics and their target operating speeds. We've covered how power consumption curves behave in our Exynos 5433 deep-dive, and MediaTek presents a similar overview when explaining the X20's architecture.
One option in traditional 2-cluster designs is to employ a low-power, low-performance cluster, typically a lower-power in-order CPU architecture such as ARM's A53. This is paired with a higher-power, high-performance cluster: either a larger CPU core such as the A57/A72, or a frequency-optimized A53 as we have seen in some past MediaTek SoCs and, most recently, in HiSilicon's Kirin 930 found in the Huawei P8.
Contrary to what MediaTek presents as an "introduction of a Mid cluster", I like to see MediaTek's tri-cluster approach as an extension of the existing dual A53 cluster designs, where the added A72 cluster is truly optimized for only the highest frequencies. Indeed, we are told that the A72 cluster can reach up to 2.5GHz on a TSMC 20nm process. ARM targets similar clocks for the A72, but only on 14/16nm FinFET processes, so to see MediaTek go this high on 20nm is impressive, even if it's only a two-core cluster. It will be interesting to see how MediaTek chooses the lower frequency limits on each cluster, especially the A72 CPUs, or how these options will be presented to OEMs.
The end result is a promised 30% improvement in power consumption over a similar 2-cluster approach. This comes from the finer granularity in the performance/power curve and the larger number of performance-power points on which the scheduler can place a thread. A process that is too heavy to reside on the smallest cluster due to performance constraints, but not demanding enough to require the big cluster's full performance, can now reside on the medium cluster at much greater efficiency than if it were running on the big cluster at reduced clocks. MediaTek uses CorePilot, a custom-developed scheduler implementation that is both power aware and very advanced (based on our internal testing of other MediaTek SoCs). My experience and research with it on existing devices was fairly positive, so I'm sure the X20's new v3.0 implementation of CorePilot will be able to take good advantage of the tri-cluster design.
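The idea of having more performance-power points to choose from can be illustrated with a toy model. This is not CorePilot, whose internals MediaTek has not detailed; it is a sketch that simply picks the lowest-power operating point able to meet a thread's demand. Every number below is hypothetical and in arbitrary units, chosen only to show the shape of the argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    cluster: str
    perf: float   # relative performance, arbitrary units (hypothetical)
    power: float  # relative power, arbitrary units (hypothetical)


# Hypothetical operating points, NOT measured or vendor-supplied data.
POINTS = [
    OperatingPoint("little A53", 1.0, 1.0),
    OperatingPoint("little A53", 1.4, 1.8),
    OperatingPoint("mid A53", 1.6, 2.4),
    OperatingPoint("mid A53", 2.0, 3.5),
    OperatingPoint("big A72", 2.0, 4.5),  # big cluster at reduced clocks
    OperatingPoint("big A72", 3.5, 9.0),
]


def cheapest_point(demand: float, points: list[OperatingPoint]) -> OperatingPoint:
    """Return the lowest-power point whose performance meets the demand.

    If nothing is fast enough, fall back to the fastest available point.
    """
    candidates = [p for p in points if p.perf >= demand]
    if not candidates:
        return max(points, key=lambda p: p.perf)
    return min(candidates, key=lambda p: p.power)


# Same model without the middle cluster, i.e. a dual-cluster design.
dual_cluster = [p for p in POINTS if p.cluster != "mid A53"]

demand = 1.8  # too heavy for the little cluster, light for the big one
tri = cheapest_point(demand, POINTS)
dual = cheapest_point(demand, dual_cluster)
print(f"tri-cluster:  {tri.cluster} at power {tri.power}")
print(f"dual-cluster: {dual.cluster} at power {dual.power}")
```

With these made-up numbers, the medium-weight thread lands on the mid cluster in the tri-cluster model and on a down-clocked big core in the dual-cluster one, at higher power. How large the real saving is depends entirely on the actual power curves, which is what the promised 30% figure will have to be tested against.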
The biggest open question is what the MCSI (the interconnect) is capable of. ARM had announced its CCI-500 interconnect back in February, which incidentally also promised the capability of up to 4 CPU clusters. MediaTek hinted that this may be a design based on ARM's CCI - but we're still not sure if this means a loosely based design or a direct improvement of ARM's IP. Cache coherence is a major design effort, and if MediaTek saw this custom IP as an effort worth committing to, then the MCSI may have some improvements we're still not clear on.
The Helio X20 is certainly an interesting SoC and I'm eager to see how the tri-cluster design performs in practice. The X20 samples in H2 2015, and devices with it are planned to ship in Q1 2016. In that time-frame, it seems the X20's primary competitor is Qualcomm's Snapdragon 620, so it'll definitely be a battle for the "super-mid" (as MediaTek likes to put it) crown.
69 Comments
jay5 - Tuesday, May 12, 2015
Think the Snapdragon 808 in the table should be listed as having 2x A57s not A72s
Ian Cutress - Tuesday, May 12, 2015
Yup, a mistake in our copy and paste. Updated :)
rocketbuddha - Tuesday, May 12, 2015
Hi Ian,
Could you also correct/clarify the below
MT6795 - 4xA53 2.2GHz + 4xA53 2.2GHz.
I think that the 2nd cluster should be like 1.3-1.5 GHz...
JoshHo - Tuesday, May 12, 2015
The X10 SoC isn't quite a conventional bL SoC. Both clusters are clocked identically and treated identically.
MrSpadge - Tuesday, May 12, 2015
Given the tiny size of A53 cores this could really make sense. And under heavy multi-threaded loads (image & video processing) one could imagine 8xA53 to be more efficient than using fewer big cores. I don't know if they can be used together, though.
There could also be further synergy between the clusters by sharing a last-level cache. This could probably reduce average switching latency and provide more cache space (or a lower die size). Caches are also built for certain power & performance points, though. so sharing the cache between the low and high cluster would hurt power efficiency and/or performance.
Kracer - Tuesday, May 12, 2015
Why all this complexity to gain CPU performance on mobile? That is not to say to stop improving performance, but the experience is rarely ever constrained by CPU perf. GPU perf, battery life and storage performance have a tangible effect in the experience of a smartphone.
A pair of very low-power A53s and a pair of blazing fast A72s(similar to the cluster of this SOC) are plenty. This setup can rush to idle to give a responsive experience and be very low-power.
If one is doing a sustained workload on mobile that is not offloaded to a dedicated ASIC(video/audio playback/capture) and are not plugged in they are using a smartphone wrong.
psychobriggsy - Tuesday, May 12, 2015
Because it isn't that complex for the SoC designer to put the IP blocks together (assuming their custom interconnect is a modified ARM design), the A53 cores are tiny so you might as well use four (and maybe licensing tiers encourage using a quad-core unit rather than a dual-core unit), and including the low power co-processor reduces integration cost.
The A72s do sound like a benchmarketing mode to me, but sometimes race to sleep really does work (with timer coalescing and other OS level features supporting it).
jjj - Tuesday, May 12, 2015
If they do get 30% power savings, or even 20%,then it's worth using an extra 4.5mm2 (more or less) for the additional quad cluster. Any clue on cache sizes, maybe they save some die there?
This vs SD620 shouldn't really be any contest given the 28- 39% higher clocks and likely more advanced process. Still no clue what GPU the SD620 will have but if it's aimed at upper midrange they got to keep the die size in check even if 28nm might be not that expensive anymore.
So i assume they'll have to make some cuts to keep the die size at 80-100mm2 to sell it at 20-25$, if it's on 28nm.
On shipping timing, Mediatek a couple of weeks ago during their results call said it's shipping to device makers in Q4 ,was kinda hoping we might see 1-2 phones late in Q4.
jjj - Tuesday, May 12, 2015
"MediaTek is proud to present its first CDMA2000 compatible integrated modem with the X20"
The MT6735 and MT6753 have CDMA2000 and are in devices now so unless somehow those have it on a discrete chip, the HX20 is not the first.
What might be a first for MTK is carrier aggregation, unless some unannounced chip hits the market sooner.
dragosmp - Tuesday, May 12, 2015
"[...]only the M4 is in use in order to decrease battery life"
Funny