The AMD Polaris GPU Architecture Preview

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,841
Location
Montreal
In the last quarter of 2015, AMD and their Radeon Technologies Group did something we’ve rarely seen in the secretive world of tech companies. They sat down members of the press, gave us an iron-clad NDA and started talking about their plans for the next year. While that’s what typically happens behind closed doors, AMD added a refreshing twist: we wouldn’t have to wait until a product’s official launch to talk about what was being discussed. Instead, there were preset times throughout 2015 and 2016 where we could publish information well in advance and give our readers a glimpse at some exciting elements coming down the pipeline. Among the key takeaways from those meetings were details about AMD’s upcoming GPU architecture, code named Polaris.

AMD-RTG-1-11.png

The Polaris architecture represents a huge step forward for AMD but it will also walk hand in hand with a number of other initiatives fronted by the Radeon Technologies Group. For example, GPUOpen aims to put additional resources into the hands of developers which could allow for better optimization in PC games and enhanced visual effects across all platforms. There’s also a whole packet of upcoming display-driven technologies like HDR panels, FreeSync over HDMI and DisplayPort 1.3 that are coming down the pipeline. Finally, the RTG is hoping to offer a robust driver and software infrastructure through their (hopefully) regularly updated Radeon Software Suite.

These elements and others should combine to lay a solid foundation and ensure the stars align in preparation for the Polaris architecture. As you can imagine, there are some major investments tied up in Polaris’ success but the architecture itself requires a bit more explanation as well.

AMD-POLARIS-4.png

The Polaris architecture may represent a shining beacon for gamers looking for a flagship product from AMD which can offer an alternative to NVIDIA’s upcoming Pascal microarchitecture. However, at least initially, Polaris will be targeting volume rather than halo markets in an effort to compete in segments where the Radeon Technologies Group feels the largest inroads can be made. This means mid-level desktops / all-in-ones, notebooks and even integration into upcoming consoles will all take priority over enthusiast-grade wares. Availability for those first Polaris cores is slated for mid 2016 while the more complicated designs meant for higher end GPUs will likely be rolled out in Q3 and Q4 of this year.

There’s a good reason for this staggered rollout: not only does it allow for a potentially quick Radeon resurgence within the key low power applications AMD has been historically weak in but there are also some good old fashioned production assurances involved. With AMD utilizing a new 14nm or 16nm FinFET manufacturing process (more on this later), they need to perfect the Polaris core design, optimize yields and start understanding the limitations of their new architecture without taking huge risks. A primary rollout with a smaller, more efficient and less specialized core allows them to do exactly that. It could also optimize the timeline for Polaris’ closer integration into upcoming APU designs.

At this time the amount of information about the Polaris architecture is relatively minimal but AMD is set to release additional talking points between now and its official launch in mid 2016. However, over the next few pages we’ll go over what we can officially discuss about Polaris and what the 14nm / 16nm FinFET manufacturing process means for its future.
 
Polaris; Achieved Through FinFET

*Note: an earlier version of this preview indicated AMD is using a 16nm manufacturing process. While there has been no official confirmation, either a 16nm or 14nm node may be used. We apologize for this error.

Lately there’s been a lot of discussion about the current 28nm manufacturing process being used on GPUs and AMD’s APUs. It used to be considered efficient and the go-to standard but as semiconductor companies move forward with increasingly complicated designs, its limitations are quickly being realized. While there have been numerous efforts like clock gating and voltage controls to lower power consumption, 28nm has been a major roadblock to microarchitecture advancement for the last few years.

In order to move the performance yardsticks forward, engineers have to continually pack more transistors into a core while still optimizing yields and respecting increasingly constrained TDPs. While it would be relatively easy to launch a massive core with a metric ton of transistors, the realities of today’s device needs require efficiency and performance rather than just the latter. This is where the FinFET tri-gate transistor approach gets factored into the equation.

AMD-POLARIS-1.png

FinFET has taken on many different guises since its introduction in 2000. For example, Intel rolled it into their 22nm Ivy Bridge microarchitecture and has since refined it for smaller nodes like Skylake’s 14nm and the upcoming 10nm for Cannonlake. Meanwhile Samsung, GlobalFoundries and TSMC have all been working on their own versions of FinFET in an effort to optimize its benefits for their own high performance core rollouts. At this time, it is unknown whether AMD will adopt a 14nm or a 16nm process for their Polaris architecture.

Without getting too far into the technical details, FinFET raises the transistor’s channel above the core substrate as a thin vertical fin rather than leaving it flat across the horizontal plane. In a flat planar layout, the gate electrode only touches the channel from above, which makes it harder for the gate voltage to shut off the flow of current through that channel. The resulting leakage erodes on-die efficiency. In a FinFET design, the gate wraps around the raised fin between the source and drain, contacting it on multiple sides. Not only does this structure allow for up to twice as much gate control versus a typical planar layout but it also ensures an optimal current flow.

AMD-POLARIS-2.png

In plain English, FinFET designs improve overall efficiency, can condense the footprint of a core design and boost IPC metrics by a substantial amount. In Polaris’ case, when coupled with a 14nm / 16nm manufacturing process, this technology will allow more performance to be wrung out of the architecture while providing better leakage metrics than previous 28nm designs.

With all of this being said, there are still several challenges that have to be overcome before AMD can claim a successful rollout of 14nm / 16nm FinFET cores. Judging by the continual delays to the 14nm and 16nm nodes at every semiconductor foundry, this manufacturing process is anything but mature and it is that lack of maturity which could negatively impact yields for first-run products. According to AMD, yields are improving and their first Polaris testing units are in-hand but volume production is still several months away. This situation also brings us back to our point about why the choice was made to initially focus upon a simplified Polaris core for mid-2016 delivery.

Packing a large number of transistors into a limited space through the use of a finer-grain manufacturing process has long presented some roadblocks on the thermal side of the spectrum as well. Simply put, condensing components focuses thermal output into a smaller footprint, which means that cooling devices and any interface between the core and internal heatspreader have to be extra efficient to ensure adequate temperatures. Intel faced this with the Ivy Bridge processors which proved to be quite hot running, particularly when overclocked. AMD on the other hand claims they have addressed this through several optimizations within the Polaris architecture itself and in the way the cores will interface with their cooling medium.
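To put the thermal point in rough numbers, here is a minimal sketch of how power density changes when the same heat output is squeezed into a smaller die. The wattage and die areas are hypothetical placeholders chosen for easy arithmetic, not figures from AMD or Intel, and the function name is ours.

```python
def power_density_w_per_mm2(power_w, die_area_mm2):
    """Average heat flux the cooler has to pull off the die, in W/mm^2."""
    if die_area_mm2 <= 0:
        raise ValueError("die area must be positive")
    return power_w / die_area_mm2


if __name__ == "__main__":
    # Hypothetical example: the same 100 W dissipated by a die that
    # shrinks from 200 mm^2 to 100 mm^2 on a finer process.
    older = power_density_w_per_mm2(100, 200)
    newer = power_density_w_per_mm2(100, 100)
    print(f"older node: {older:.2f} W/mm^2")
    print(f"newer node: {newer:.2f} W/mm^2")
    print(f"increase:   {newer / older:.1f}x")
```

Halving the area at constant power doubles the average heat flux, which is why the interface between the core and the heatspreader matters more as the die shrinks.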

AMD-POLARIS-3.png

All of the aforementioned technologies are being combined in an effort to make Polaris as natively power friendly as possible since that ingrained architectural efficiency will give engineers additional thermal headroom to play with. There’s also the possibility of utilizing many of the advances pioneered during efforts to lower TDPs on 28nm architectures to make newer FinFET-based designs even more efficient. Essentially, this will lead to significantly more performance than previous cores at a similar TDP.

On paper at least, there are a huge number of benefits and relatively few drawbacks to using FinFET. From power consumption, to the number of cores that can be fit onto a wafer, to the raw performance potential, this seems to be the best available approach to high performance computing in 2016 and beyond. However, don’t expect power consumption to be reduced by a measurable amount since engineers will continue to target today's achievable TDPs but with higher performance metrics. In other words, high end GPUs will still aim for that 200W to 250W range but the amount of performance that can be achieved at a given power level will likely move into the stratosphere for every SKU in AMD’s upcoming lineup.
 
The Polaris Architecture & Its Performance

In essence, Polaris is simply the name given to an updated core design which utilizes AMD’s Graphics Core Next architecture. But make no mistake about it; this version of GCN has been so thoroughly revised that it can’t be directly compared to the architecture’s first showing in HD 7000-series Southern Islands cards. Indeed, as with other GCN derivatives it follows the same baseline core layout but adds its own set of optimizations to those already built into the architecture through the Sea Islands (Bonaire and Hawaii) and Volcanic Islands (Tonga and Hawaii) product families.

Like its forefathers, Polaris will incorporate additional upgrades that AMD has deemed necessary to handle next generation workloads, so performance in 4K and virtual reality environments is being given priority alongside support for advanced display formats. Whether or not this is termed “GCN 1.3” or “GCN 2.0” is immaterial at this point; the Radeon Technologies Group simply wants us to focus upon the Polaris nomenclature and its potential impact upon the future of gaming.

AMD-POLARIS-5.png

For the time being, the real nuts and bolts details about what makes the Polaris architecture tick are still being held behind closed doors but we do know a bit more about which areas of GCN AMD is updating and which will remain as-is.

Let’s start with the strengths of their current GCN design since the core elements are still going strong years after its introduction. For all intents and purposes the basic rendering pathways and core layout will remain the same since the actual needs of next generation workloads aren’t fundamentally different from what’s currently around. There are however many efficiencies built into upcoming APIs like DX12 and Vulkan that AMD can take advantage of through the implementation of targeted architectural improvements while still maintaining excellent performance in legacy situations.

One of the areas deemed in need of some attention is the primary graphics command stages, which contain the Hardware Scheduler, Command Processor and Asynchronous Compute Engines. Here they’ve added a new primitive discard accelerator and hardware scheduler which are supposed to work in tandem to more efficiently route requests through the Global Data Share and onto the geometry processing elements.

The Asynchronous Compute Engines (which were already enhanced in Fiji-based cores) also receive an upgrade to improve their throughput. This bodes extremely well for DX12 performance since those ACEs have already proven to be a key differentiator between AMD’s current GPU designs and their NVIDIA counterparts.

It looks like the Geometry Engines will get some attention as well. Since these processing stages live within the Shader Engines and contain the Geometry / Vertex Assemblers along with a dedicated tessellation unit in GCN-based cores, we can only assume this means improved tessellation and shader throughput. Once again though, the details about what’s been done aren’t being made public just yet.

The main Compute Units haven’t been overlooked either. The Fiji architecture’s implementation of Graphics Core Next kept the number of SIMD cores and Texture Units per CU constant from the previous generation while physically increasing the quantity of CUs per Shader Engine. In order to further boost performance in these stages, Polaris implements some key efficiency improvements and pre-fetch algorithms.

Rounding out the graphics side of this equation, AMD is looking at enhancing their memory compression algorithms and will be addressing L2 cache limitations. It should be noted that the Polaris lineup will contain parts both with and without an HBM interface. This makes sense considering High Bandwidth Memory adds significantly to any BOM cost so integrating it into mid-range or lower-end GPUs may not be financially possible.

AMD-RTG-1-3.png

As we’ve already covered in previous articles, Polaris’ media engines and display outputs will be designed specifically for next generation formats. The current Fiji and Tonga architectures faced their fair share of criticism for not integrating support for at least HDMI 2.0, which limited their usefulness with UHD 4K TVs. That’s about to change in a big way.

Not only will Polaris have native support for HDMI 2.0a and DisplayPort 1.3 (enabling 4K/60 content) but the architecture will also incorporate engines for decoding H.265 Main10 at up to 4K and encoding 4K H.265 content at 60FPS.
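For readers wondering why HDMI 2.0 matters so much for 4K/60, a quick back-of-the-envelope check helps. The sketch below compares the raw pixel data rate of a video mode against the effective data rates published for each link (after 8b/10b encoding overhead). It counts active pixels only and ignores blanking intervals, so treat it as a rough lower bound rather than a full timing calculation; the function and table names are ours.

```python
# Effective (post-encoding) link data rates in Gbit/s.
LINK_DATA_RATE_GBPS = {
    "HDMI 1.4": 8.16,          # 10.2 Gbit/s TMDS with 8b/10b encoding
    "HDMI 2.0": 14.4,          # 18 Gbit/s TMDS with 8b/10b encoding
    "DisplayPort 1.3": 25.92,  # 32.4 Gbit/s (HBR3, 4 lanes) with 8b/10b
}


def pixel_data_rate_gbps(width, height, refresh_hz, bits_per_pixel=24):
    """Active-pixel data rate in Gbit/s; blanking intervals are ignored."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def links_that_fit(width, height, refresh_hz, bits_per_pixel=24):
    """Names of the links whose effective rate covers the requested mode."""
    needed = pixel_data_rate_gbps(width, height, refresh_hz, bits_per_pixel)
    return [name for name, rate in LINK_DATA_RATE_GBPS.items() if rate >= needed]


if __name__ == "__main__":
    needed = pixel_data_rate_gbps(3840, 2160, 60)
    print(f"4K/60 at 24 bits per pixel needs about {needed:.2f} Gbit/s")
    print("Links with enough headroom:", links_that_fit(3840, 2160, 60))
```

At 8 bits per channel, 4K/60 needs roughly 11.94 Gbit/s of pixel data, which is more than HDMI 1.4 can carry but within reach of HDMI 2.0 and DisplayPort 1.3.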

AMD-POLARIS-6.png

In addition to all of the information we’ve already gone through, the Radeon Technologies Group gave us a quick preview of Polaris performance. Since this architecture looks to achieve a new benchmark in performance per watt, the focus of their presentation was to play up what their cards could accomplish when competing against a similarly-performing NVIDIA card.

In the example above, both the Polaris-based card and a stock GTX 950 were used on the same test system and achieved identical performance. AMD’s newest architecture was able to hit a much lower system power consumption number even though this was using extremely early hardware and immature drivers. This bodes extremely well for mid-level and low power applications but it also tells us very little about how well it can scale upwards.
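Since the comparison above holds performance constant and varies only power draw, the efficiency gap reduces to a simple ratio. The sketch below shows the arithmetic; the frame rate and wattages are hypothetical placeholders, not AMD's measured results, and the function names are ours.

```python
def perf_per_watt(fps, system_watts):
    """Frames per second delivered per watt of total system power."""
    if system_watts <= 0:
        raise ValueError("system power must be positive")
    return fps / system_watts


def efficiency_gain(fps_a, watts_a, fps_b, watts_b):
    """How many times more efficient system A is than system B."""
    return perf_per_watt(fps_a, watts_a) / perf_per_watt(fps_b, watts_b)


if __name__ == "__main__":
    # Hypothetical numbers: both systems locked to 60 FPS,
    # system A drawing 90 W at the wall and system B drawing 135 W.
    gain = efficiency_gain(60, 90, 60, 135)
    print(f"System A delivers {gain:.2f}x the performance per watt of system B")
```

Keep in mind that wall power includes the CPU, motherboard and drives, which draw roughly the same in both systems. A system-level ratio therefore understates the difference between the two graphics cards themselves.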

We had some additional time to pick the brains of some folks at AMD’s Radeon Technologies Group and they brought up several more enlightening points. Like Tonga and Bonaire before it, the Polaris architecture’s mid-2016 rollout will target mid-level performance, which is why the performance numbers you see above are being compared to a GTX 950. The first Polaris-based cards will be considered a type of pipe-cleaning product that tests a new architecture and allows AMD to further refine their new manufacturing process.

This actually meshes quite well with the revised timeline for the dual Fiji card previewed last year and then later delayed to Q2 2016. That means Polaris will live alongside the current Radeon product stack in the short to mid-term. We also hope it can scale well enough to finally replace cores that have been constantly rebranded as AMD fights to keep their lineup at least looking “fresh” to buyers and their system builder partners. However, while there are still plenty of questions we need answered, at first glance it looks like Polaris could really be something worth waiting for.
 