Details on Trinity - AMD's Next Gen APU
by Kristian Vättö on October 25, 2011 10:53 AM EST
DonahimHaber has leaked a slide concerning AMD's next-generation APU, called Trinity. The slide does not reveal any detailed specifications; it is merely an overview of Trinity. Let's begin with a table comparing Llano and Trinity:
Comparison of AMD's Higher-End APUs
Specification | Llano | Trinity |
Core | Husky | Piledriver |
Core Count | Up to 4 | Up to 4 |
RAM | Up to DDR3-1866 | Up to DDR3-2133 |
GPU | AMD 6000 Series | AMD 7000 Series |
Socket | FM1 | FM2 |
Those are the differences in a nutshell. The Husky core is based on an upgraded 10h microarchitecture (also known as AMD K10), the same microarchitecture that is used in Phenom II CPUs. As for Piledriver, AMD is referring to it as the second-generation Bulldozer core (see our Bulldozer review). Trinity will have up to four cores, just like Llano, which means up to two Piledriver modules (each Bulldozer/Piledriver module has two cores). In terms of speed, AMD is claiming up to a 20% increase over Llano. Bulldozer's poor single-threaded performance might limit the performance gains to multithreaded tasks, though, unless AMD can work some magic with Piledriver. RAM support is also up, from DDR3-1866 to DDR3-2133.
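To put the memory speed bump in perspective, here is a rough back-of-the-envelope sketch of theoretical peak bandwidth. It assumes a 64-bit (8-byte) DDR3 channel and a dual-channel configuration as on Llano; the leaked slide does not confirm Trinity's channel count, and the function name is purely illustrative.

```python
# Rough peak-bandwidth estimate for DDR3 memory.
# Assumptions (not from the leaked slide): 64-bit channels, dual-channel setup.

BYTES_PER_TRANSFER = 8  # one 64-bit DDR3 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(mega_transfers_per_s: float, channels: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


llano = peak_bandwidth_gbs(1866)    # DDR3-1866, nominal rate
trinity = peak_bandwidth_gbs(2133)  # DDR3-2133, nominal rate

print(f"Llano   (DDR3-1866): {llano:.1f} GB/s")
print(f"Trinity (DDR3-2133): {trinity:.1f} GB/s")
print(f"Increase: {(trinity / llano - 1) * 100:.1f}%")
```

Under these assumptions the faster memory is worth roughly 14% more theoretical bandwidth, which matters most for the integrated GPU since it shares system memory with the CPU cores.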
The GPU department will also get an overhaul. We already reported that Trinity's GPU will be named the AMD 7000 Series, which suggests that it will be based on the same design as other 7000 Series GPUs (this might sound obvious, but Llano's GPU was named 6000 Series, yet it was based on the 5000 Series "Redwood" core). The leaked slide supports this, since it mentions support for next-generation DirectX 11, most likely DX 11.1. AMD will also compete with Intel's QuickSync by including a Video Compression Engine (VCE) in Trinity. According to AMD, the performance increase will be around 30% compared to Llano's GPU.
Trinity will continue to use the same chipsets as Llano. However, the socket will change to FM2, which will most likely be compatible with FM1. Another leaked slide shows that mobile Trinity's package is FS1r2, whereas Llano's is FS1. The APU after Trinity, called Kaveri, will use the FS2 package. This suggests that FS1 and FS1r2, as well as FM1 and FM2, are very similar and hence backwards compatible, although this has not been confirmed.
Availability is unknown, but if roadmaps are to be believed, Trinity should make its first appearance in Q1'12, with full availability in Q2'12.
Source: DonahimHaber
44 Comments
Taft12 - Tuesday, October 25, 2011
"Trinity should make its first appearance in Q1'12, full availability being in Q2'12."
So, we'll get Trinity APUs (and a new socket, bleh) likely before we can even buy the "good" desktop Llanos (A8-3800 or A6-3600).
I like desktop Llano, but it's hard to recommend when it becomes a dead end so fast. It's like the bad old days of Intel (more recently LGA1156).
Snoop - Tuesday, October 25, 2011
I wonder if they are going to be able to get the power down for the mobile market? I cannot imagine a Bulldozer derivative with decent performance without crazy power usage.
Roland00Address - Tuesday, October 25, 2011
The architecture itself is not power hungry. The problem is that GlobalFoundries' 32nm process is still immature, so the chips run hot because there is a lot of leakage, and AMD is using a high voltage for the chips (a high-voltage Llano or Bulldozer chip is still a chip AMD can sell).
There is no doubt that AMD is going to get better performance per watt out of Bulldozer and Bulldozer derivatives such as Piledriver in the future. Remember how bad Phenom was when it first came out: Agena sucked, but it was improved with Deneb, and the third improvement, Thuban and Stars, isn't that bad either. The problem is that AMD's competition (Intel and ARM) is a moving target; AMD has to improve its products faster than everybody else improves theirs.
Snoop - Tuesday, October 25, 2011
I am far from a CPU architect, but long pipelines and super high transistor counts, with low per-clock performance and a reliance on high clock speeds, seem to translate to high power usage. How is the GF process going to change this part of the architecture?
niva - Tuesday, October 25, 2011
GF won't change the architecture; it would be up to AMD to change that at a later time. However, once GF gets the manufacturing to a better quality, you can reduce the voltage or increase the clocks for the same heat output.
Architecture changes (like the pipeline, or the single FPU per 2 integer cores) are made for reasons. I'm not happy with the results of the choices they made for the architecture with Bulldozer. Maybe in the long term it will turn out to be the right direction, but right now it doesn't seem to be so.
Super high transistor count isn't necessarily a bad thing, CPUs are becoming more and more complicated, expect this trend to continue not just for AMD but any company which makes processors (including ARM). Everything we expect out of these machines is becoming more complicated which drives up the transistor count.
Penti - Tuesday, October 25, 2011
Dude, the cache uses as many transistors as a whole Nehalem or Sandy Bridge CPU. That's obviously a mistake, and not one they can make on mobile chips like the mobile version of Piledriver, so it ought to be a quite different overall architecture/product. We'll have to see.
Roland00Address - Wednesday, October 26, 2011
Four CPU modules (8 cores) + L2 cache = 852 million transistors
L3 cache = 405 million transistors
Everything else (I/O, DDR3 memory controller, logic and routing) = 800 million transistors
So for Trinity, AMD is going to have only two CPU modules, so at least 426 million transistors. There will be less of everything else, since AMD won't need any multi-CPU interconnects. It is rumored Trinity will have a 7000 series GPU based on a 4D arrangement with 480 shaders. Well, a Turks GPU (480 shaders, HD 6670) uses 716 million transistors.
So we are talking about a billion-transistor chip. Let's hope they can keep it closer to 1 billion and not 2 billion like Bulldozer is.
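The arithmetic above can be sketched as follows. All figures are the ones quoted in this comment (in millions of transistors); the size of Trinity's "everything else" portion is unknown, so the sketch only yields a lower bound, and the variable names are illustrative.

```python
# Back-of-the-envelope transistor budget using the figures quoted above
# (all values in millions of transistors). This is an estimate, not a spec.

BULLDOZER_MODULES_AND_L2 = 852  # four modules (8 cores) plus L2 cache
BULLDOZER_L3 = 405
BULLDOZER_UNCORE = 800          # I/O, memory controller, logic and routing
TURKS_GPU = 716                 # 480-shader HD 6670, used as a GPU stand-in

bulldozer_total = BULLDOZER_MODULES_AND_L2 + BULLDOZER_L3 + BULLDOZER_UNCORE

# Trinity is expected to have two modules, i.e. half of Bulldozer's four.
trinity_cpu = BULLDOZER_MODULES_AND_L2 // 2

# Lower bound: CPU modules plus a Turks-sized GPU, with no uncore counted,
# because the size of Trinity's uncore is not known.
trinity_lower_bound = trinity_cpu + TURKS_GPU

print(f"Bulldozer total:     {bulldozer_total} million")
print(f"Trinity CPU modules: {trinity_cpu} million")
print(f"Trinity lower bound: {trinity_lower_bound} million")
```

Even before counting any I/O or memory controller logic, the estimate already lands a little above one billion transistors.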
DanNeely - Tuesday, October 25, 2011
You're confusing "fundamentally bad" with "Intel botched the implementation" here. IBM's POWER6 CPUs also used very long pipelines compensated by high clock rates (up to 5GHz in shipping parts in '08) quite successfully.
At the moment, most of AMD's troubles appear to be due to GF's 32nm process sucking, although it will probably be a year or two before we know if the process was also masking architectural problems.
Zoomer - Wednesday, October 26, 2011
The pipeline isn't that long; it's just longer than before. More stages is a legit strategy to split the work up more, perhaps relieving a bottleneck.
Low per-clock performance? Per thread, yes. But the overall threaded performance goes up since more threads can run. They had to go to 2B transistors for 8 cores; that's because they were first and foremost targeting servers and HPC. These have higher margins than the piddly desktop market, unfortunately.
jeremyshaw - Wednesday, October 26, 2011
2B transistors is about equal to two Thubans... you could have 12 cores, and better per-core performance, too :p Hell, even comparable power consumption, despite being on 45nm...