The AMD Radeon RX 480 Preview

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
13,421
Location
Montreal
AMD’s Polaris architecture has been talked about for some time now but details about the actual cards that will be launched have been kept under lock and key. Now, with Computex fully underway, some additional details are beginning to make their way out of the Radeon Technologies Group.

Let’s start with some of the more obvious elements that have already been discussed at length since a refresher is in order before going in depth with this one. While NVIDIA recently launched their GTX 1080 and GTX 1070 into high level price points of $699 and $449 respectively, Polaris’ goals are more modest but infinitely more realistic. Instead of leveraging their substantial engineering knowledge to launch a halo product that costs a small fortune, AMD is targeting two key markets for the time being: the mid-level performance segment and notebooks. These are areas where the Radeon lineup desperately needed some reinforcements so it makes sense to start there.


Before anyone reading this sighs in frustration and starts thinking the Radeon Technologies Group is abandoning enthusiasts, you should be aware that Polaris is just the tip of a very large and substantial product stack. This initial offering will act like a cornerstone within AMD’s GPU foundation by (hopefully) driving volume sales rather than the strictly limited quantities a niche product would move. This in theory will help the Radeon lineup expand its market share and free up resources to work on those GP104-beaters everyone loves looking at but very few can afford.

So what does this all mean from a performance perspective? Well, with the Polaris lineup broken up into two categories dubbed Polaris 10 and Polaris 11, AMD is casting a pretty wide net, but it will be focused on the $199 and lower price points. While competitive analysis against GeForce alternatives is still under NDA until launch on June 29th (yes, you now have an official date!), what we can state is that this new lineup will be aiming to bring down the cost of entry for VR-certified GPUs.


The first but certainly not last product being officially unveiled is the RX 480. While it may have a new moniker rather than the typical R9 or R7 designation, from the name alone it should slot somewhere between AMD’s current R9 390 and R9 380. However, affordability is paramount this time around.

The amount of information we have about the card is quite limited, but we do know it will be priced at just $199 for the 4GB version and slightly higher for an 8GB-equipped alternative. What the RTG is willing to reveal does, however, point towards an extremely potent yet efficient budget-friendly graphics card. It utilizes a 256-bit GDDR5 memory interface with modules operating at 8Gbps, consumes just 150W of power and includes 36 individual Compute Units. Provided AMD retains the same CU hierarchy as previous Graphics Core Next designs (64 Stream Processors and 4 Texture Units per CU) for this fourth generation version, that would lead to 2304 Stream Processors and 144 Texture Units, but that could change this time around so I’m only speculating.
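For anyone who wants to check the math behind those numbers, here is a quick sketch. The per-CU figures are assumptions carried over from earlier GCN parts (AMD has not confirmed them for Polaris), and the function names are just made up for illustration.

```python
# Sketch only: per-CU figures assume the layout of earlier GCN designs.
# AMD has not confirmed these values for Polaris.
SP_PER_CU = 64   # assumed Stream Processors per Compute Unit
TMU_PER_CU = 4   # assumed Texture Units per Compute Unit


def shader_resources(compute_units):
    """Return (stream_processors, texture_units) for a given CU count."""
    return compute_units * SP_PER_CU, compute_units * TMU_PER_CU


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus width times per-pin rate, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


sps, tmus = shader_resources(36)
print(sps, tmus)                      # 2304 144
print(memory_bandwidth_gbs(256, 8))   # 256.0
```

So a 256-bit bus at 8Gbps works out to a theoretical peak of 256GB/s, before any gains from memory compression.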


The official pictures of the RX 480 show a card that keeps with previous designs: a black / red color scheme, a dual slot blower-style fan and a dimpled heatsink shroud. There are some references here to the Fury X which could point towards a shroud that can be 3D printed for some personalization.


With a 150W power draw and some hope that this card will overclock, AMD is utilizing a simple 6-pin power input, which should be par for the course with these new generation $199 GPUs.
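To put that 6-pin connector into context, here is a rough power budget sketch. The wattage limits are the nominal PCI Express figures (75W from the slot, 75W from a 6-pin, 150W from an 8-pin), not numbers from AMD, and real cards can draw more or less than these ratings in practice.

```python
# Nominal PCI Express power limits in watts (spec figures, not AMD data).
PCIE_SLOT_W = 75
AUX_CONNECTOR_W = {"6-pin": 75, "8-pin": 150}


def board_power_budget(connectors):
    """Nominal in-spec power available from the slot plus auxiliary connectors."""
    return PCIE_SLOT_W + sum(AUX_CONNECTOR_W[c] for c in connectors)


budget = board_power_budget(["6-pin"])
headroom = budget - 150  # 150W is the RX 480 figure quoted above

print(budget)    # 150
print(headroom)  # 0
```

In other words, the quoted 150W sits right at the nominal limit for a single 6-pin layout, so any overclocking headroom will depend on how much the card actually pulls at stock.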


The RX 480’s backside reveals a design that’s extremely similar to NVIDIA’s reference GTX 970. The PCB is quite short but the fan assembly extends past the edge, allowing for a more unified design and also some extra airflow to the internal components.


As a result of architectural improvements (more about those on the next pages) and a set of unknown core clock speeds, the RX 480 offers up “over” 5 TFLOPs of single precision performance. For those of you keeping track at home, the R9 390 currently tops out at 5.12 TFLOPs while the R9 380X hits about 4 TFLOPs. Unfortunately, past those numbers we can’t really say where the RX 480 will ultimately land within the current gamut of budget-conscious GPUs.
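Since AMD isn't talking clock speeds, here is a back-of-the-envelope sketch of what "over 5 TFLOPs" might imply. It assumes the usual peak FP32 formula of two operations (one fused multiply-add) per Stream Processor per clock, plus the speculative 2304 SP count from earlier. The R9 390 and R9 380X shader counts and clocks are public reference specs, not figures from AMD's Polaris material.

```python
def fp32_tflops(stream_processors, clock_mhz):
    """Peak single precision TFLOPs, assuming 2 FLOPs per SP per clock."""
    return 2 * stream_processors * clock_mhz * 1e6 / 1e12


def min_clock_mhz(stream_processors, tflops):
    """Core clock in MHz needed to reach a given peak TFLOPs figure."""
    return tflops * 1e12 / (2 * stream_processors) / 1e6


# Reference specs for existing cards (assumed from public spec sheets).
print(round(fp32_tflops(2560, 1000), 2))  # R9 390:  5.12
print(round(fp32_tflops(2048, 970), 2))   # R9 380X: 3.97

# Hypothetical RX 480 with 2304 SPs (speculative count, see above).
print(round(min_clock_mhz(2304, 5.0)))    # 1085
```

If the 2304 SP guess holds, the card would need to run somewhere north of roughly 1085MHz to clear the 5 TFLOPs mark. Treat that as an estimate rather than a leaked spec.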

One thing is certain: the Radeon Technologies Group is clearly dedicated to Virtual Reality. As a matter of fact, in their 800-word press release announcing some of the information above, VR was mentioned no fewer than 38 times. While the new RX 480 will likely set a new benchmark in affordable VR performance (through an expanded feature set it supposedly offers the throughput of a $500 GPU in these environments), it remains to be seen how many people want to spend mega-bucks on an Oculus or Vive and then grab a mid-level GPU.

Regardless of the marketing and hype surrounding any new GPU launch, I have to laud AMD for effectively targeting an aggressive price / performance segment instead of risking the farm on an ultra expensive initiative. I may actually be more excited about what the RX 480 and its spin-offs will provide than anything else that has launched or will be launching this year. While it remains to be seen how well AMD’s approach works, it looks like one of the largest sections of the graphics card market will have a new “king” in the coming weeks.
 
Polaris; Achieved Through FinFET


Lately there’s been a lot of discussion about the current 28nm manufacturing process being used on GPUs and AMD’s APUs. It used to be considered efficient and the go-to standard but as semiconductor companies move forward with increasingly complicated designs, its limitations are quickly being realized. While there have been numerous efforts like clock gating and voltage controls to lower power consumption, 28nm has been a major roadblock to microarchitecture advancement for the last few years.

In order to move the performance yardsticks forward, engineers have to continually pack more transistors into a core while still optimizing yields and respecting increasingly constrained TDPs. While it would be relatively easy to launch a massive core with a metric ton of transistors, the realities of today’s device needs require efficiency and performance rather than just the latter. This is where the FinFET tri-gate transistor approach gets factored into the equation.


FinFET has taken on many different guises since its introduction in 2000. For example, Intel rolled it into their 22nm Ivy Bridge microarchitecture and has since refined it for smaller nodes like Skylake’s 14nm and the upcoming 10nm for Cannonlake. Meanwhile Samsung, GlobalFoundries and TSMC have all been working on their own versions of FinFET in an effort to optimize its benefits for their own high performance core rollouts. At this time, it is unknown whether or not AMD will adopt a 14nm or 16nm process for their Polaris architecture.

Without getting too far into the technical details, FinFET allows for the transistors and their respective gates to rise above the core substrate rather than simply lie flat across the horizontal plane. Essentially, a flat planar surface makes it more difficult for the voltage from a gate electrode to stop the flow through its associated carrier channel. This erodes on-die efficiency. In a FinFET design, the channel is raised above the surface as a fin and the gate wraps around it between the source and drain. Not only does this structure allow for up to twice as much gate control versus a typical planar layout but it also ensures an optimal current flow.


In plain English, FinFET designs improve overall efficiency, can condense the footprint of a core design and boost IPC metrics by a substantial amount. In Polaris’ case, when coupled with a 14nm / 16nm manufacturing process, this technology will allow more performance to be wrung out of the architecture while providing better leakage metrics than previous 28nm designs. We can see this with the RX 480’s amazing power consumption numbers.

With all of this being said, there are still several challenges that have to be overcome before AMD can claim a successful rollout of 14nm FinFET cores. Judging by the 14nm and 16nm nodes' continual delays from all semiconductor foundries, this manufacturing process is anything but mature and it is that lack of maturity which could negatively impact yields for first-run products. According to AMD, however, yields are very good and the first Polaris cards should be available in volume from day one. This situation also brings us back to our point about why the choice was made to initially focus upon a simplified Polaris core rather than a high end luxury card.

Packing a large number of transistors into a limited space through the use of a finer-grain manufacturing process has long presented some roadblocks on the thermal side of the spectrum as well. Simply put, condensing components focuses thermal output into a smaller footprint, which means that cooling devices and any interface between the core and internal heatspreader have to be extra efficient to ensure adequate temperatures. Intel faced this with the Ivy Bridge processors which proved to be quite hot running, particularly when overclocked, and NVIDIA's Pascal shows flashes of increased thermals as well. AMD on the other hand claims they have addressed this through several optimizations within the Polaris architecture itself and in the way the cores will interface with their cooling medium.


All of the aforementioned technologies are being combined in an effort to make Polaris as natively power friendly as possible since that ingrained architectural efficiency will give engineers additional thermal headroom to play with. There’s also the possibility of utilizing many of the advances pioneered during efforts to lower TDPs on 28nm architectures to make newer FinFET-based designs even more efficient. Essentially, this will lead to significantly more performance than previous cores at a similar TDP.

On paper at least, there are a huge number of benefits and relatively few drawbacks to using FinFET. From power consumption, to the number of cores that can be fit onto a wafer, to the raw performance potential, this seems to be the best available approach to high performance computing in 2016 and beyond. However, don’t expect power consumption to be reduced by a measurable amount since engineers will continually strive to meet today's achievable TDPs but with higher performance metrics. In other words, high end GPUs will still strive to meet that 200W to 250W barrier but the amount of performance that can be achieved at a given power level will likely move into the stratosphere for every SKU in AMD’s upcoming lineup.
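A rough way to picture that performance-per-watt shift is to divide peak compute by board power. In the sketch below, the RX 480 line uses the "over 5 TFLOPs" and 150W figures quoted earlier, while the R9 390's 275W is its commonly listed typical board power and is an assumption not drawn from AMD's Polaris material. Peak TFLOPs per watt is also only a paper metric and says nothing about actual gaming results.

```python
def gflops_per_watt(tflops, board_power_w):
    """Peak single precision GFLOPs delivered per watt of board power."""
    return tflops * 1000 / board_power_w


rx_480 = gflops_per_watt(5.0, 150)    # figures quoted in this preview
r9_390 = gflops_per_watt(5.12, 275)   # 275W is an assumed public spec

print(round(rx_480, 1))           # 33.3
print(round(r9_390, 1))           # 18.6
print(round(rx_480 / r9_390, 2))  # 1.79
```

On paper that is close to 1.8x the compute per watt of the 28nm card, assuming those inputs hold up once retail boards get tested.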
 
A Peek Under the Polaris Architecture Covers


In essence, Polaris is simply the name given to an updated core design which utilizes AMD’s Graphics Core Next architecture. But make no mistake about it; this version of GCN has been so thoroughly revised that it can’t be directly compared to the architecture’s first showing in HD 7000-series Southern Islands cards. Indeed, as with other GCN derivatives it follows the same baseline core layout but adds its own set of optimizations to those already built into the architecture through the Sea Islands (Bonaire and Hawaii) and Volcanic Islands (Tonga and Hawaii) product families.

Like its forefathers, Polaris will incorporate additional upgrades that AMD has deemed necessary to handle next generation workloads so performance in 4K and virtual reality environments alongside support for advanced display formats are being given priority. Whether or not this is termed “GCN 1.3” or “GCN 2.0” is immaterial at this point; the Radeon Technologies Group simply wants us to focus upon the Polaris nomenclature and its potential impact upon the future of gaming.


For the time being the real nuts and bolts details about what makes the Polaris architecture tick are still being held behind closed doors, but we do know a bit more about which areas of GCN AMD is updating and which will remain as-is.

Let’s start with the strengths of their current GCN design since the core elements are still going strong years after its introduction. For all intents and purposes the basic rendering pathways and core layout will remain the same since the actual needs of next generation workloads aren’t fundamentally different from what’s currently around. There are however many efficiencies built into APIs like DX12 and Vulkan that AMD can take advantage of through the implementation of targeted architectural improvements while still maintaining excellent performance in legacy situations.


One of the areas deemed in need of some attention is the primary graphics command stage, which contains the Hardware Scheduler, Command Processor and Asynchronous Compute Engines. Here they’ve added a new primitive discard accelerator and hardware scheduler which are supposed to work in tandem to more efficiently route requests through the Global Data Share and onto the geometry processing elements.

The Asynchronous Compute Engines (which were already enhanced in Fiji-based cores) also receive an upgrade to improve their throughput. This bodes extremely well for DX12 performance since those ACEs have already proven to be a key differentiator between AMD’s current GPU designs and their NVIDIA counterparts.

It looks like the Geometry Engines will get some attention as well. Since these processing stages live within the Shader Engines and contain the Geometry / Vertex Assemblers along with a dedicated tessellation unit in GCN-based cores, we can only assume this means improved tessellation and shader throughput. Once again though, the details about what’s been done aren’t being made public just yet.

The main Compute Units haven’t been overlooked either. The Fiji architecture’s implementation of Graphics Core Next kept the number of SIMD cores and Texture Units per CU constant from the previous generation while physically increasing the quantity of CUs per Shader Engine. In order to further boost performance in these stages, Polaris implements some key efficiency improvements and pre-fetch algorithms.

Rounding out the graphics side of this equation, AMD is looking at enhancing their memory compression algorithms and will be addressing L2 cache limitations. It should be noted that the Polaris lineup could contain parts both with and without an HBM interface (edit: GDDR5X is also a possibility according to AMD). This makes sense considering High Bandwidth Memory adds significantly to any BOM cost so integrating it into mid-range or lower-end GPUs may not be financially possible.
 