AMD R9 Fury X Review; Fiji Arrives

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,840
Location
Montreal
After nearly two years of waiting, rumors, hope and disappointment, AMD’s Fiji architecture and the R9 Fury X are finally here. Not only does this pairing hope to compete against the best NVIDIA has to offer but it represents the best hope for AMD’s graphics division as they move forward into the future.

While AMD may not have been leading the graphics race for some time now, if history is any indication, they are obviously willing to take chances on new technologies. For example, the HD 4000-series was the first to boast native support for GDDR5 memory and the now-standard 28nm manufacturing process debuted with the HD 7900-series parts. This time around it is the addition of an innovative High Bandwidth Memory (HBM) interface which is supposed to drastically enhance memory performance while lowering manufacturing costs and memory subsystem overhead.

<iframe width="640" height="360" src="https://www.youtube.com/embed/rfCb6oiJ6EI?rel=0" frameborder="0" allowfullscreen></iframe>

Since the launch of their Hawaii architecture, things haven’t been easy for AMD’s graphics unit. While the R9 290X and R9 290 were relatively successful on the gaming front, other than a slight jump in sales due to the popularity of crypto-currency mining, their momentum couldn’t be sustained. One of the main reasons for this was their lack of follow-up products which could have brought forth better performance and lower power consumption. The possibility was certainly there but for whatever reason AMD decided to abandon any potential of a follow-up to Hawaii. Instead of those potential launches, AMD’s board partners were left marketing an older architecture while the GeForce lineup was refreshed with Maxwell-based graphics cards.

While the situation went from bad to worse when NVIDIA introduced the GM200-based TITAN X and GTX 980 Ti, there were some faint glimmers of hope along the line. The R9 285 incorporated AMD’s new Tonga GPU which was based upon a thoroughly revised Hawaii-type core but with plenty of updates built into its svelte frame. There were also the feature-rich Omega drivers and FreeSync’s successful implementation of DisplayPort’s Adaptive Sync protocol. So even though the new product front may have been quiet for the Radeon lineup, AMD was anything but quiet over the last 20 or so months.

R9-FURY-X-2-16.png

While many have criticized AMD for missing opportunities and letting NVIDIA run away with market share within the graphics card industry, that was then and this is now. Water under the bridge so to speak. Right now, at this moment we have Fiji, an architecture which represents a paradigm shift that will likely impact every future GPU design. It shows how, contrary to the diatribes of their vocal critics, AMD may have actually been playing the long game for the last two years rather than falling into that all-too-familiar rut of near sightedness.

So what makes Fiji and the R9 Fury X so unique and, more importantly, why should gamers sit up and take notice? First and foremost this is the first mass-produced product to use HBM to enhance memory bandwidth while also lowering BOM costs. AMD has also worked hard to enhance overall process efficiency and performance over their previous generation products. That means cramming more transistors into a 28nm-based die (8.9 billion transistors to be exact) and substantially better performance while still offering power consumption savings versus the outgoing Hawaii architecture. It may sound impressive in writing but in practice it is nothing short of revolutionary.

R9-FURY-X-2-83.jpg

For the time being Fury X stands preeminent in the current Radeon lineup and it will surely be followed up by the air-cooled Fury and the mini-ITX friendly Fury Nano. There’s also a dual Fiji card which will be introduced before [...]. Indeed, unlike with the Hawaii architecture’s dead end lineup of just two cards, AMD hopes the Fury X and by extension the Fiji architecture will headline a whole new product stack once their rollout is fully complete.

From a raw specifications standpoint the Fury X is indeed impressive. It boasts a massive 4096 Stream Processors and 256 Texture Units, which represents a 45% increase over a fully enabled Hawaii XT / Grenada XT core, while clock speeds have the ability to hit a peak of 1050MHz. Expect those speeds to be extremely consistent since AMD has equipped their flagship card with an integrated water cooling unit to keep temperatures well under control. That means performance which should be well ahead of even the R9 390X we reviewed.

One area which seems to have been left by the wayside is the ROP count which remains at 64, a number that has been carried on since the Hawaii days. According to AMD, this shouldn’t cause a bottleneck since additional resources have been allocated to facilitating data transfer within the core so additional ROPs weren’t needed. We’ll get into this in a bit more detail later. Plus, a good amount of die area has been set aside for the High Bandwidth Memory interface.

R9-FURY-X-2-17.png

In many ways the Fury X is acting as a testing platform for the first generation of HBM. There may be “just” 4GB of it operating at a relatively paltry 500MHz but the way it is utilized ensures that bottlenecks won’t happen, even in high detail 4K scenarios where the core itself will likely become a limiting factor long before the memory interface gets saturated. One of the primary reasons 4GB of HBM won’t limit this card is due to the titanic 4096-bit wide interface which grants an effective bandwidth of half a terabyte per second.

With this being AMD’s flagship part it naturally receives their full stable of features as well. That means DX12 compatibility (though not a full 12.1 feature level certification), Virtual Super Resolution support, Framerate Target Control and of course FreeSync abilities. There’s also an updated display scaler with improved quality and an enhanced video decode engine with support for HEVC.

R9-FURY-X-2-2.png

Before we get too far into this particular rabbit hole something needs to be mentioned about the Fury X’s price. At $649 it puts expectations for performance right in line with NVIDIA’s GTX 980 Ti. That’s an impressive claim to start things off since higher end Radeon cards have typically launched for significantly less than their GeForce opposites due to slightly slower performance metrics. In our recent article detailing what AMD needed to accomplish to effect a comeback in the GPU market, suitably high yet competitive pricing was among our “must do” elements. And here we have it.

Despite a long list of positive take-aways from our conversations with AMD about Fiji, actual availability was a topic that was never quite addressed head-on. There’s a good reason for that. The Fiji core alongside its associated interposer and HBM modules don’t necessarily represent bleeding edge technology but making this holy union into something that can be mass produced is likely a challenge of epic proportions. As with all new designs, getting yields and production volumes up to the point of broad-scale retail availability probably won’t happen for some time to come.

With all of this being said, the R9 Fury X is a bright new hope not only for AMD but the gaming market as a whole. What was a one-horse field will now once again become an arms race between AMD and NVIDIA... provided this new card can deliver on all of its lofty promises.
 
A Closer Look at the R9 Fury X


R9-FURY-X-2-200.jpg

AMD’s R9 Fury X is one heck of a unique looking card but its design was largely dictated by the cooling requirements of the Fiji core. That meant two things: water cooling and plenty of power capacity since it can draw up to 275W (and much more when overclocked).

To hear AMD tell it, the Fury X’s design was less about taming their HBM-equipped monster’s temperatures and more about pushing the boundaries of graphics board design. Just looking at it, we can’t argue with their line of thinking since this card looks incredible and is ridiculously compact for a flagship GPU.

Let’s get this off our shoulders right away: this thing feels like a top shelf product. The build quality of our engineering sample was nearly flawless with tight material seams and a nice robust feel to every surface. Materials include die-cast aluminum, steel reinforcements and some black nickel finishes that would put many plating shops to shame. It’s leaps and bounds better than the Fisher-Price feel of previous AMD reference cards.

R9-FURY-X-2-201.jpg

Starting with the card itself, AMD has gone in a different direction by completely eliminating a fan and instead relying on the liquid cooling unit to cool all of the internal components. This leaves a svelte 7.5” long card that’s topped with a textured soft-touch metal plate, which can be removed by loosening a trio of small hex screws, and a single Radeon logo that glows red.

Supposedly AMD will be releasing specifications for this top plate to the 3D printing community so users can print their own replacements provided they have the necessary equipment.

R9-FURY-X-2-202.jpg

Unlike other water cooled cards which have their water cooling tubes protruding from the side, AMD once again chose the path less travelled by routing their tubing through the R9 Fury X’s rear area. While it may add some length (about 2”) this positioning allows the card to fit into those slimmer cases which are all the rage these days.

Speaking of the tubes, they’re pretty basic plasticized corrugated affairs that have been loosely covered in black mesh. Unfortunately, the mesh hasn’t been heat-shrunk onto the tubes so there’s a good amount of wiggle room. Nonetheless, it looks absolutely stunning.

R9-FURY-X-2-209.jpg

Even with those tubes jutting out of the Fury X’s back, its overall 9.5” long footprint is still shorter than the reference GTX 980 Ti and every other custom GTX 980 Ti we have seen thus far. With that being said, the added complication of finding a place for its radiator does somewhat hinder an otherwise clean looking build.

R9-FURY-X-2-208.jpg

The radiator being used follows a typical 120mm design but then throws in a bit of a twist with an additional reservoir that extends the height by about ½”. Supposedly the radiator allows for up to 500W of thermal capacity.

In most scenarios this addition won’t cause a problem but some smaller cases may experience installation hurdles since the reservoir will either hit the case’s roof or interfere with the motherboard I/O area. Do your research before assuming the Fury X will fit in your chassis!

R9-FURY-X-2-204.jpg

In order to feed their 8.9 billion transistor beast, AMD has added a pair of 8-pin power inputs but, from our testing at least, it will likely only need the excess capacity when heavily overclocked. Above these connectors are nine LEDs which make up the GPU Tach and are meant to indicate relative real time load levels of the GPU.

R9-FURY-X-2-203.jpg

On the card’s side the LED parade continues with another illuminated logo, and above that is a small opening to access a DIP switch. The switch allows enthusiasts to switch between the standard BIOS profile and a customized one of their choice.

R9-FURY-X-2-206.jpg

The back area of AMD’s R9 Fury X is a pretty boring affair but we can’t knock them for that. The backplate’s soft-touch finish and clean design lend a certain feeling of elegance to the card. We happen to like it quite a bit.

R9-FURY-X-2-205.jpg

The only interesting feature on the card’s back area is a pair of small DIP switches which allow for control over the GPU Tach LED but it doesn’t control the Radeon logos. It’s a great addition for anyone who wants a more “stealthy” build rather than seeing the GPU Tach spazzing away while gaming.

R9-FURY-X-2-3.png

While there aren’t a whole lot of options (don’t expect RGB lighting here folks!) the above chart shows how the DIP switches can be manipulated to give either red or blue lighting or completely turn off the Tach.

R9-FURY-X-2-11.png

For those of you wondering how AMD crammed all of this haute technology onto a PCB that measures just 7.5” long, look no further than HBM. With the memory modules integrated directly onto the GPU package via an advanced interposer, there was no need to add a bunch of memory modules and their associated traces onto the PCB. Indeed, GPU cores with HBM encourage small form factor designs like no other technology before.

R9-FURY-X-2-215.jpg

By breaking the Fury X down into its basic raw elements we can see why AMD didn’t see the need for a fan atop the card itself. Not only are the HBM memory modules and core directly cooled by the closed loop liquid cooler’s contact plate but there’s a secondary heatspreader for the PWM components. This heatspreader transmits heat from the PWM components into the core contact plate, thus ensuring adequate cooling for all components.
 

The GCN 1.2 Architecture…On Testosterone


While AMD’s GCN architecture may seem to have more lives than disco, the fact of the matter is that a brand new ground-up redesign hasn’t been needed. The same basic principles that made it so potent years ago still hold true today though, as DX12 begins rolling out, some of its higher level feature sets may begin to show their age in the next 18 to 24 months.

As was mentioned in the introduction, Fiji is essentially an upscaled GCN 1.2 part which is very similar to Tonga in its capabilities. This means it shares a lot in common with the GCN 1.1-based Hawaii / Grenada cores, but it also has additional efficiencies built into several key processing stages. From our standpoint there isn’t quite enough to call this a completely new version of Graphics Core Next since it still uses the 1.1 revision as a primary foundation. However, many of AMD’s design choices for Fiji are throughput-focused so the core can physically keep up with the ultra high bandwidth HBM interface.

R9-FURY-X-2-10.png

In a core layout diagram there’s very little to distinguish Fiji from its predecessors. Like Hawaii it utilizes a quartet of Shader engines which hold the respective geometry processors, rasterizers, ROPs (within the Render Back-Ends) and Compute Units. Meanwhile, at a lower level the Compute Units still hold 64 cores and a quartet of texture units along with their associated caching blocks so not much has changed from a physical perspective there either. But don’t stop there and assume AMD simply took Hawaii, added in a bit of GCN 1.2 goodness and called it a day.

One of the first things AMD did to create Fiji was expand the Compute Unit count within each Shader Engine. In comparison to Grenada / Hawaii, there are five additional CUs per engine, which equates to 320 additional cores and 20 more TMUs per engine, and of course more on-die caching capability.
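For anyone who wants to check the math, here is a rough sketch of how the CU counts roll up into the headline shader and texture unit numbers. The 64 cores and four texture units per CU and the four Shader Engines come from the description above; the figure of 11 CUs per engine for Hawaii is our assumption rather than something stated here, and the names are purely illustrative.

```python
# Per-CU resources and engine count as described in the text.
CORES_PER_CU = 64
TMUS_PER_CU = 4
SHADER_ENGINES = 4

# Assumption: Hawaii / Grenada carries 11 CUs per Shader Engine (not stated in the article).
HAWAII_CUS_PER_ENGINE = 11
# Fiji adds five CUs to each engine.
FIJI_CUS_PER_ENGINE = HAWAII_CUS_PER_ENGINE + 5


def totals(cus_per_engine):
    """Roll per-engine CU counts up into chip-wide totals."""
    cus = cus_per_engine * SHADER_ENGINES
    return {"cus": cus, "cores": cus * CORES_PER_CU, "tmus": cus * TMUS_PER_CU}


hawaii = totals(HAWAII_CUS_PER_ENGINE)
fiji = totals(FIJI_CUS_PER_ENGINE)

# Extra resources contributed by the five additional CUs in a single engine.
extra_cores_per_engine = 5 * CORES_PER_CU
extra_tmus_per_engine = 5 * TMUS_PER_CU

# Chip-wide increase in shader cores, in percent.
core_increase_pct = (fiji["cores"] / hawaii["cores"] - 1) * 100

print(hawaii, fiji)
print(extra_cores_per_engine, extra_tmus_per_engine, round(core_increase_pct, 1))
```

Under those assumptions the five extra CUs are worth 320 cores and 20 TMUs per engine, and the chip-wide totals land on the 4096 Stream Processors and 256 Texture Units quoted for the Fury X.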

By far the largest change here has to be the addition of HBM and the inner workings that were necessary to ensure the high bandwidth memory interface was fed with enough information to justify its presence. To accomplish that, AMD doubled the L2 cache from 1MB to 2MB and instituted advanced lossless color compression algorithms for frame buffer reads and writes. Naturally, there are also the eight 512-bit memory controllers that can process a veritable torrent of information towards the quartet of 1GB HBM modules.

All of this has contributed to a relatively large transistor count increase from Hawaii’s 6.2 billion to 8.9 billion on Fiji. That equates to a die size of 596 mm² for the core and 1011 mm² when the HBM interposer is taken into account as well. Ironically, you can see that Hawaii’s legacy is still firmly ingrained into the die since there’s space for TrueAudio DSPs, a technology which AMD seems to have abandoned.

In order to keep power consumption to a minimum despite the massive die, AMD turned to a highly tuned 28nm manufacturing process. This allowed them to boost transistor count by 44% over Hawaii yet Fiji’s die size has only increased by a little over 35%. The maturity of 28nm has also contributed towards high transistor efficiency and lower overall power consumption. As a matter of fact, several insiders have told us AMD reduced their TDP forecasts numerous times over the course of designing this new core.
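The percentages above are easy to sanity check. The sketch below uses the transistor counts and Fiji die size quoted in this section; Hawaii’s 438 mm² die size is an assumption on our part (it is the commonly cited figure but isn’t given here), so treat the die-size and density results as approximate.

```python
def pct_increase(old, new):
    """Percentage growth from old to new."""
    return (new / old - 1) * 100


# Figures quoted in the text.
HAWAII_TRANSISTORS_BN = 6.2
FIJI_TRANSISTORS_BN = 8.9
FIJI_DIE_MM2 = 596

# Assumption: commonly cited Hawaii die size, not stated in the article.
HAWAII_DIE_MM2 = 438

transistor_growth = pct_increase(HAWAII_TRANSISTORS_BN, FIJI_TRANSISTORS_BN)
die_growth = pct_increase(HAWAII_DIE_MM2, FIJI_DIE_MM2)

# Transistor density in millions of transistors per square millimetre.
hawaii_density = HAWAII_TRANSISTORS_BN * 1000 / HAWAII_DIE_MM2
fiji_density = FIJI_TRANSISTORS_BN * 1000 / FIJI_DIE_MM2

print(f"transistors: +{transistor_growth:.1f}%  die area: +{die_growth:.1f}%")
print(f"density: {hawaii_density:.1f} -> {fiji_density:.1f} Mtransistors/mm^2")
```

Transistor count growing faster than die area is what you would expect if the tuned 28nm process packs things a little more tightly on Fiji than it did on Hawaii.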

R9-FURY-X-2-13.png

Eagle eyed readers may have noticed that despite increasing the number of SIMD cores and Texture Units, the number of Render Back-Ends containing the ROPs and other tertiary output functions has remained untouched. There are still four of them per Shader Engine, which is the exact same number as on Hawaii. At first glance that’s a bit concerning since overloading these key architectural elements could result in a bottleneck within some games.

According to AMD, any potential hindrance caused by a stagnant number of ROPs has been mitigated by a number of core modifications. The ROPs are able to process 16 bits per clock which is the exact same rate as Hawaii’s theoretical limit but they can now run at full speed due to the implementation of HBM and the aforementioned color compression routines. There are some other additions as well which enhance the ROPs’ processing abilities. As a result, Fiji’s claimed fill rate performance is more than double that of Hawaii.

AMD has also modified a few elements within those Compute Engines. There are new data processing instructions which allow for parallel sharing between SIMD lanes, new 16-bit floating point and integer instructions and double the amount of effective cache per CU.

R9-FURY-X-2-14.png

Another area that doesn’t have any physical changes is the geometry processing stage but once again the revisions have been done below the skin. There are still four processors containing the assemblers and tessellators but each unit has received a significant speedup, thus boosting tessellation performance among other things.

Last but not least AMD has enhanced the throughput of their Asynchronous Compute Engines and further worked upon prioritizing data across their shared Crossbar. This should help Fiji in higher level DX12 scenarios.
 

HBM; The Art of Doing More With Less


High Bandwidth Memory has been heralded as a key next generation technology for both SoCs and graphics cards. It is supposed to offer significant bandwidth increases while also featuring notable power consumption and size savings over current GDDR-based designs. AMD has been developing HBM for the better part of seven years now and Fiji is acting as a kind of proof of concept for them while also showing us a glimpse of what the future will hold. Indeed, NVIDIA has already announced their intent to use second generation HBM modules on their Pascal architecture.

R9-FURY-X-2-21.png

With lower powered systems becoming the norm even among gaming enthusiasts, engineers have been on a desperate search for any way to improve their component designs while also enhancing performance. Simply put, even ultra high performance graphics architectures can only scale so far upwards if they are constrained by the power consumption and heat production from other board-bound components.

This is where the current state of GDDR5 enters into the equation. According to AMD the market was rapidly approaching a point of inflection where the increasing power consumption of memory and the die space needed for its associated memory controllers could conceivably begin to negatively impact development of faster graphics processors. You see, as GDDR5 speeds increase so too do the requirements for signaling strength and memory controller capacity. This means frequencies of today’s memory technologies could conceivably increase (8Gbps GDDR5 is actually undergoing its final testing phases) but going forward, it becomes increasingly hard to justify adding them to more efficient cores.

Make no mistake about it though; GDDR5 isn’t anywhere close to playing its swan song just yet. However, memory efficiency, throughput, signaling strength and inherent limitations of current memory technologies needed to be addressed if there was any hope of moving forward with next generation GPU cores. This is where HBM is able to potentially shine.

R9-FURY-X-2-22.png

Unlike previous memory designs which required complicated PCB traces and secondary PWM components, HBM modules live in extremely close proximity to the processing core, be it a GPU, CPU or SoC. They communicate with the core via a high performance connected interposer and their own integrated logic circuits, providing an ultra quick pathway between all critical data points. This combination of processor and memory creates a compact package but one which cannot be modified by third parties.

Since the communication streams between the core and memory are drastically shortened versus the offboard memory layouts of today’s modern GPUs, there’s an inherent increase in power efficiency. Some of the added power savings comes from a lowering of that critical signal strength we talked about above; there’s no longer a need to boost frequencies since the components are so physically close to one another.

The end results of HBM’s built-in efficiencies speak for themselves. AMD claims it features a 50% or greater reduction in power consumption versus 6Gbps GDDR5 while delivering up to 60% more bandwidth. That naturally brings to the table some massive performance per watt benefits.

R9-FURY-X-2-25.png

Along with the obvious power benefits HBM consumes something along the order of 94% less surface area than GDDR5 due to its location directly on the substrate package. Not only that but it also requires a substantially smaller power distribution grid since the modules can effectively operate at lower clock speeds while still maintaining extremely high bandwidth. This actually looks like the only option going forward for small form factor high performance devices.

HBM’s approach to module design is relatively straightforward as well. The DRAM core dies are stacked on top of one another, building upwards like a skyscraper instead of previous memory implementations’ urban sprawl. This means HBM still requires multiple stacks to achieve a given memory footprint but that doesn’t necessarily add excess space since each stack (at least in first generation HBM) can be up to 1GB in size.

In the case of AMD’s Fiji architecture, each of these stacks has four 256MB DRAM dies stacked on top of one another which each communicate with the GPU’s onboard memory controllers via a 1024-bit wide bus.
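To put the stack layout into numbers, here is a minimal sketch that adds up capacity and bus width from the figures in this section: four stacks, four 256MB dies per stack and a 1024-bit bus per stack. The variable names are just for illustration.

```python
# Layout figures taken from the text.
STACKS = 4
DIES_PER_STACK = 4
DIE_CAPACITY_MB = 256
BUS_BITS_PER_STACK = 1024

# Capacity of a single stack, then of the whole package.
stack_capacity_mb = DIES_PER_STACK * DIE_CAPACITY_MB
total_capacity_gb = STACKS * stack_capacity_mb / 1024

# Aggregate interface width across all stacks.
total_bus_bits = STACKS * BUS_BITS_PER_STACK

print(stack_capacity_mb, total_capacity_gb, total_bus_bits)
```

That works out to 1GB per stack, 4GB in total and a combined 4096-bit interface, which lines up with the eight 512-bit memory controllers on the core.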

R9-FURY-X-2-24.png

While GDDR5 had to rely on high frequencies to compensate for lower bus widths, HBM’s location allows for massive bus widths since memory controller design is less complex and more compact. On the flip side of that equation, a larger bus allows the memory itself to operate at lower clock speeds and lower voltages while still achieving amazingly high throughput numbers.

Lower temperatures don’t necessarily mean that HBM runs at a neutral temperature though. Its addition to the already hot-running core package means there are some additional challenges when trying to cool things off. AMD’s Fury series will have a combination of liquid (the Fury X) and air cooled models to compensate for this.

In terms of feeds and speeds, the HBM AMD currently uses runs at 500MHz while each stack has access to a dedicated 1024-bit communication bus to the core. That grants an effective bandwidth of 128GB/s per stack or 512GB/s for the entire setup. This is all achieved at very low latencies as well.
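Those bandwidth figures fall straight out of clock speed and bus width. The sketch below assumes two transfers per clock (first generation HBM is double data rate, a detail not spelled out above), and the function name is our own.

```python
def hbm_bandwidth_gbs(clock_mhz, bus_bits, transfers_per_clock=2):
    """Peak bandwidth in GB/s for a given clock and bus width.

    transfers_per_clock=2 is an assumption (double data rate signaling).
    """
    bits_per_second = clock_mhz * 1e6 * transfers_per_clock * bus_bits
    return bits_per_second / 8 / 1e9


STACKS = 4

per_stack_gbs = hbm_bandwidth_gbs(clock_mhz=500, bus_bits=1024)
total_gbs = per_stack_gbs * STACKS

print(per_stack_gbs, total_gbs)
```

At 500MHz over 1024 bits that gives 128GB/s per stack and 512GB/s across all four, the "half a terabyte per second" figure AMD likes to quote.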

Talking bandwidth is one thing but AMD has a herculean marketing task ahead of them if they want to convince buyers that 4GB of HBM is just as good as 6GB, 8GB or even 12GB of GDDR5. While there have been countless hours of work put into properly utilizing the amazing potential of this new technology, today’s games often require copious amounts of memory when operating at 4K and above. Will this slow down HBM? AMD doesn’t believe so since they’ve instituted different resource allocation algorithms within their drivers to ensure their available memory and on-chip caching bandwidth is fully utilized, therefore processing requests quicker so capacities above 4GB aren’t necessary. In theory of course.

There is still a challenge of selling what amounts to a 4GB card to ODMs who like using the “bigger is better” approach on their specification sheets. This all happens right after AMD themselves were championing their larger memory allocation and ribbing on NVIDIA’s GTX 970’s setup. Nonetheless, HBM obviously has certain limitations for the time being but the potential for game-changing performance is certainly here.
 
The Great DX12 Question


There has been a lot of talk about Microsoft’s upcoming Windows 10 and the DirectX 12 API which resides at its heart. The reasons for excitement begin with the fact that the new version of Windows will be a free upgrade for users of genuine Windows 7 or Windows 8.1 systems. Then there’s DX12 which is the first Direct3D version that’s backwards compatible with existing graphics hardware, though with some limits.

The DX12 API represents a fundamental change in the way developers program their games. Instead of keeping the hardware / software interaction at arm’s distance through a sometimes-clunky application / driver / hardware handoff, it promises closer to the metal programming benefits, giving game developers additional control over the system’s resource pools. If this all sounds familiar, that’s because AMD blazed this trail nearly two years ago with Mantle.

Unlike Mantle, which was only compatible with certain AMD graphics engines, DX12 boasts a much wider compatibility range. It spans NVIDIA’s Kepler and Maxwell parts along with AMD’s HD 7000, 200 series, 300 series and Fury cards. However, the actual feature level support of each of these architectures varies.

Feature level support in DX12 is a somewhat slippery slope since it is broken into a number of different tiers ranging from basic [...]. Indeed, a given architecture can support some features within the API and claim DX12 support without actually meeting feature level 12_0 requirements. We’ll discuss this in a bit more detail within an upcoming article since now isn’t the time or the place to start debating who does what.

We’ve already highlighted AMD’s use of an upscaled and slightly upgraded Tonga-based architecture for the creation of Fiji. This has a direct impact upon how it goes about supporting the various Feature Levels within the Direct3D 12 ecosystem. Since Tonga utilizes the same basic design as previous generation GCN 1.1 parts and AMD has stated there aren’t any additional DX12-focused features in Fiji, the Fury series boasts Feature Level 12_0 support. Meanwhile, the Maxwell architecture goes full-on with Feature Level 12_1. With that being said AMD does support Tier 3 resource binding which is a step above NVIDIA’s current designs and Fiji’s Asynchronous Compute Engines could also prove to be a huge differentiating factor.

R9-FURY-X-2-4.gif

Even without full DX12_1 feature level compatibility there are plenty of advantages to using the latest version of DirectX, one of which is how the API’s relative efficiency cuts down on overhead and properly (finally) utilizes multi core systems.

In a typical DX11 environment, it was extremely hard to properly code for today’s multi core processors. As a result, the graphics card was often bottlenecked as it waited around for the processor, squandering valuable system resources in the process. As you can see in the chart above, the primary core is completely consumed processing various information sets with the DX11 command buffer while other cores are left idle. The result is extremely high API overhead while performance lags at around 29ms per frame, or 34FPS.

In the DX12 graph things are drastically different, mostly due to simpler programming language and better resource allocation. All of the cores are engaged with work evenly distributed across the CPU while the DirectX API overhead has been reduced by an order of magnitude. There also happens to be an incredible performance jump to 66FPS in the exact same scenario as we detailed in the DX11 situation above.
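Since the chart mixes frame times and frame rates, a quick conversion helps when comparing the two cases. This is just the standard relationship between milliseconds per frame and frames per second applied to the 29ms and 66FPS figures quoted above; the function names are ours.

```python
def frame_time_ms_to_fps(frame_time_ms):
    """Convert an average frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


def fps_to_frame_time_ms(fps):
    """Convert frames per second to an average frame time in milliseconds."""
    return 1000.0 / fps


dx11_fps = frame_time_ms_to_fps(29)        # DX11 case, quoted as a frame time
dx12_frame_time = fps_to_frame_time_ms(66)  # DX12 case, quoted as a frame rate
speedup = 66 / dx11_fps

print(f"DX11: {dx11_fps:.1f} FPS, DX12: {dx12_frame_time:.1f} ms, speedup: {speedup:.2f}x")
```

In other words a 29ms frame works out to a bit over 34FPS, while 66FPS means each frame takes roughly 15ms, so the DX12 result in this particular demo is a little under twice as fast.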

R9-FURY-X-2-6.png

DX12 also allows for very linear performance upscaling up to the 6 processing core mark but beyond that the amount of polygons per second being processed tends to plateau. This is actually an interesting metric since it directly points to AMD’s own 6-core processors being a “sweet spot” that offer a good blend of price and performance in future gaming systems. Supposedly Intel will have additional optimizations built into their Skylake architecture to better distribute DX12 draw calls but we have yet to see any specifics regarding this.

R9-FURY-X-2-7.png

As you can probably tell by now, enhanced resource management is one of the key selling points for DX12 but it goes beyond just CPU core allocation and utilization. The API also seeks to address how today’s multi core, massively parallel graphics architectures handle the information thrown their way.

In DX11 generation applications, the GPU pipeline behaved in a linear, serial manner with compute, lighting and memory running one after another rather than in a parallel manner. As you can imagine, this takes an inordinate amount of GPU time while key elements are left waiting around without anything to do.

R9-FURY-X-2-8.png

DX12 changes things in a pretty dramatic fashion with multiple processing stages being completed in parallel. This reduces rendering times, increases framerates and also diminishes latency. Every one of these elements will contribute to faster graphics processing and should also lead to a better experience as VR continues to make inroads.
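As a very rough illustration of why overlapping stages matters, here is a toy model comparing a strictly serial pipeline against a perfectly overlapped one. The stage durations are made-up numbers, and real hardware never overlaps this cleanly since stages share resources and depend on one another, so take it as a best-case sketch rather than a performance prediction.

```python
# Hypothetical per-frame stage costs in milliseconds (illustrative only).
stage_ms = {"compute": 4.0, "lighting": 6.0, "memory": 3.0}

# DX11-style: each stage waits for the previous one to finish.
serial_ms = sum(stage_ms.values())

# Idealized DX12-style: independent stages fully overlap, so the frame
# takes only as long as the slowest stage.
parallel_ms = max(stage_ms.values())

time_saved_pct = (1 - parallel_ms / serial_ms) * 100

print(serial_ms, parallel_ms, round(time_saved_pct, 1))
```

With these invented numbers the frame drops from 13ms to 6ms. Actual gains will sit somewhere between the two extremes depending on how much of a game’s workload can really run concurrently.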

R9-FURY-X-2-9.png

One of AMD’s main DX12-enhancing architectural features is their Asynchronous Compute Engines or ACEs. These engines are primarily used to dynamically accelerate workloads containing a combination of GPU and compute-focused elements. This can lead to a dramatic speedup for any game that utilizes DX12’s native asynchronous shaders and according to AMD, many upcoming titles like Deus Ex: Mankind Divided and Ashes of the Singularity will do just that.


These elements simply scratch the surface of what will be possible with DX12 and we haven’t even touched upon certain features like GPU Multi Adapter, combined memory pools and other high-level talking points. But how does this relate to Fiji? At this point that isn’t all too obvious since DX12’s true benefits will only be realized when supporting games become available and the ecosystem moves out of its beta stage. In the end, the onus rests upon the developers to take advantage of the supposed strengths of this new API to bring truly next generation performance to today’s graphics architectures.

With all of this being said, it is more than obvious that AMD is betting quite heavily upon DX12 to enhance and expand the performance potential of Fiji. They emphasized it quite heavily in all of their presentations. With just 4GB of HBM on tap, there’s hope that Microsoft’s upcoming API will take better advantage of the Fury X’s available resources rather than squandering available memory bandwidth on idle time. Will that actually happen? Considering DX12 gives the developers what they have always wanted and it meshes well with a holistic console / PC ecosystem, we’re willing to bet the initial uptake is much quicker than DX9, DX10 and DX11 exhibited.
 
Test System & Setup

Main Test System

Processor: Intel i7 4930K @ 4.7GHz
Memory: G.Skill Trident 16GB @ 2133MHz 10-10-12-29-1T
Motherboard: ASUS P9X79-E WS
Cooling: NH-U14S
SSD: 2x Kingston HyperX 3K 480GB
Power Supply: Corsair AX1200
Monitor: Dell U2713HM (1440P) / ASUS PQ321Q (4K)
OS: Windows 8.1 Professional


Drivers:
AMD 15.15 Beta
NVIDIA 352.90


*Notes:

- All games tested have been patched to their latest version

- The OS has had all the latest hotfixes and updates installed

- All scores you see are the averages after 2 benchmark runs

- All IQ settings were adjusted in-game and all GPU control panels were set to use application settings


The Methodology of Frame Testing, Distilled


How do you benchmark an onscreen experience? That question has plagued graphics card evaluations for years. While framerates give an accurate measurement of raw performance, there’s a lot more going on behind the scenes which a basic frames per second measurement by FRAPS or a similar application just can’t show. A good example of this is how “stuttering” can occur but may not be picked up by typical min/max/average benchmarking.

Before we go on, a basic explanation of FRAPS’ frames per second benchmarking method is important. FRAPS determines FPS rates by simply logging and averaging out how many frames are rendered within a single second. The average framerate measurement is taken by dividing the total number of rendered frames by the length of the benchmark being run. For example, if a 60 second sequence is used and the GPU renders 4,000 frames over the course of that time, the average result will be 66.67FPS. The minimum and maximum values meanwhile are simply two data points representing single second intervals which took the longest and shortest amount of time to render. Combining these values together gives an accurate, albeit very narrow snapshot of graphics subsystem performance and it isn’t quite representative of what you’ll actually see on the screen.
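The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the averaging described here, not FRAPS’ actual code; the function name and the per-second frame counts are invented for the example.

```python
def summarize_fps(frames_per_second):
    """Return (average, minimum, maximum) FPS from per-second frame counts.

    frames_per_second is a hypothetical list holding the number of frames
    rendered during each one-second interval of a benchmark run.
    """
    if not frames_per_second:
        raise ValueError("need at least one second of data")
    total_frames = sum(frames_per_second)
    duration_s = len(frames_per_second)
    # Average FPS = total rendered frames / benchmark length in seconds.
    average = total_frames / duration_s
    # Min and max are just the slowest and fastest single-second intervals.
    return average, min(frames_per_second), max(frames_per_second)


# Invented 60-second run totalling 4,000 frames:
# 40 seconds at 70 frames plus 20 seconds at 60 frames.
counts = [70] * 40 + [60] * 20
average, minimum, maximum = summarize_fps(counts)
print(f"avg {average:.2f} FPS, min {minimum} FPS, max {maximum} FPS")
# prints: avg 66.67 FPS, min 60 FPS, max 70 FPS
```

Note that all three numbers are built from whole-second buckets, which is why they say nothing about what happens inside any given second.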

FCAT on the other hand has the capability to log onscreen average framerates for each second of a benchmark sequence, resulting in the “FPS over time” graphs. It does this by simply logging the reported framerate result once per second. However, in real world applications, a single second is actually a long period of time, meaning the human eye can pick up on onscreen deviations much quicker than this method can actually report them. So what actually happens within each second of time? A whole lot since each second of gameplay time can consist of dozens or even hundreds (if your graphics card is fast enough) of frames. This brings us to frame time testing and where the Frame Time Analysis Tool gets factored into this equation.
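To show how much a one-second window can hide, here is a rough Python sketch that buckets frame display timestamps into whole seconds. The function name and the timestamps are assumptions made for illustration; this is not how any particular capture tool is implemented.

```python
from collections import Counter


def fps_over_time(frame_timestamps_ms):
    """Count the frames displayed in each whole second of a run.

    frame_timestamps_ms is a hypothetical list of display times in
    milliseconds, measured from the start of the benchmark sequence.
    """
    if not frame_timestamps_ms:
        return []
    per_second = Counter(t // 1000 for t in frame_timestamps_ms)
    # Seconds with no frames at all are reported as 0 FPS.
    return [per_second.get(s, 0) for s in range(max(per_second) + 1)]


# Second 0: a frame every 20 ms (smooth).
smooth = list(range(0, 1000, 20))
# Second 1: 25 quick frames, a stall of about half a second, 25 more frames.
stalled = [1000 + 10 * i for i in range(25)] + [1750 + 10 * i for i in range(25)]

print(fps_over_time(smooth + stalled))
# prints: [50, 50]
```

Both seconds report 50 FPS even though the second one contains a 510 ms pause that would be very obvious onscreen.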

Frame times simply represent the length of time (in milliseconds) it takes the graphics card to render and display each individual frame. Measuring the interval between frames allows for a detailed millisecond by millisecond evaluation of frame times rather than averaging things out over a full second. The larger the amount of time, the longer each frame takes to render. This detailed reporting just isn’t possible with standard benchmark methods.
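As a minimal sketch of the idea, assuming we already have a list of timestamps marking when each frame was displayed, frame times are just the gaps between consecutive entries. The function name and sample values below are hypothetical.

```python
def frame_times_ms(frame_timestamps_ms):
    """Return the interval in milliseconds between consecutive frames."""
    return [
        later - earlier
        for earlier, later in zip(frame_timestamps_ms, frame_timestamps_ms[1:])
    ]


# Invented timestamps (ms): steady 16-17 ms frames with one 50 ms hitch.
timestamps = [0, 16, 33, 50, 100, 116, 133]
times = frame_times_ms(timestamps)

print(times)                              # prints: [16, 17, 17, 50, 16, 17]
print(max(times))                         # prints: 50
print(f"{sum(times) / len(times):.1f}")   # prints: 22.2
```

The single 50 ms frame stands out immediately in the list, while the average of roughly 22 ms on its own would only hint that something went wrong.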

We are now using FCAT for ALL benchmark results, other than 4K.
 

Performance Consistency & Temperatures Over Time


One of the major knocks against AMD’s R9 290X was its inability to maintain its initial performance as temperatures rose and its heatsink struggled to keep up. To somewhat mitigate this problem every card had “Uber” and “Silent” modes, the latter of which would reduce fan speeds while the former would boost cooling to the point where the core wouldn’t throttle as much.

The Fiji core on the R9 Fury X is one of the largest ever created which should point towards potentially rampant heat production. However, this time around AMD decided to pre-install a closed loop liquid cooler. Let’s see if it can overcome its predecessor’s shortcomings.

R9-FURY-X-2-80.jpg

The temperature results we see here are exactly what we have come to expect from AiO coolers placed atop hot running graphics cards. Not once did the core temperatures exceed the 50°C mark.

R9-FURY-X-2-79.jpg

It looks like AMD hasn’t resorted to core frequency fluctuations to attain their low temperatures either. In every situation, the Fiji architecture was able to easily attain its 1050MHz speed, even after hours of continual testing. That’s a far cry from what we saw with the R9 290X in reference form.

R9-FURY-X-2-78.jpg

As you can imagine, performance was consistent across the board as well without any of the drops that characterized the R9 Fury X’s predecessors. This actually makes us wonder whether or not board partners could simply apply their higher end air coolers (for example the ASUS DirectCU II or Sapphire’s Tri-X) to this card and still attain suitable temperature results. Only time will tell…
 

Thermal Imaging / Acoustics / Power Consumption

Thermal Imaging


R9-FURY-X-2-210.jpg
R9-FURY-X-2-211.jpg

At first glance the R9 Fury X looks like an extremely cool running card and that’s because it is. We didn’t pick up any significant heat signatures on the card itself which points towards how efficiently the water cooler is working.

R9-FURY-X-2-212.jpg
R9-FURY-X-2-213.jpg

R9-FURY-X-2-214.jpg

Judging from the tubes and radiator it is more than evident that the core and various PCB components are producing a prodigious amount of heat. Just the reservoir itself registered a temperature that was in excess of 25°C hotter than the ambient air reading. However, none of these figures should be of any concern since these images are simply proof that the water cooler is doing its job well.


Acoustical Testing


What you see below are the baseline idle dB(A) results attained for a relatively quiet open-case system (specs are in the Methodology section) sans GPU along with the attained results for each individual card in idle and load scenarios. The meter we use has been calibrated and is placed at seated ear-level exactly 12” away from the GPU’s fan. For the load scenarios, Hitman Absolution is used in order to generate a constant load on the GPU(s) over the course of 15 minutes.

R9-FURY-X-2-81.jpg

One thing we haven’t made evident up until now is how our decibel meter is calibrated. It reads both high and low frequencies in parallel which allows for a more balanced reading that’s more akin to what an average person’s ear would actually hear. This is why the readings for AMD’s R9 Fury X are slightly higher than those of the competition.

While it is nowhere near as loud as the R9 290X, this card’s water cooling pump throws out a good amount of high pitched noise. There’s some additional coil whine thrown in for good measure at higher loads but the idle results are significantly increased due to the pump.

According to AMD, they are aware of this issue and it is only supposed to affect early samples. They are expecting retail cards to ship with a slightly revised pump bearing design that eliminates any instances of whine. Other than that unfortunate situation, the Fury X’s single 120mm fan is dead quiet.


System Power Consumption


For this test we hooked up our power supply to a UPM power meter that logs the power consumption of the whole system twice every second. To stress the GPU as much as possible we ran Unigine Valley on a loop for 15 minutes, and to determine the peak idle power consumption we let the card sit at a stable Windows desktop for 15 minutes.
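For anyone wanting to reduce a similar log themselves, the sketch below shows one way to do it in Python. It assumes the meter’s output has already been read into a plain list of watt readings; the function names are hypothetical and the sample values are placeholders, not measurements from this review.

```python
SAMPLES_PER_SECOND = 2  # the meter logs twice every second


def expected_samples(minutes):
    """Number of readings a run of the given length should produce."""
    return minutes * 60 * SAMPLES_PER_SECOND


def summarize_power(samples_w):
    """Return (peak, average) in watts for whole-system readings."""
    if not samples_w:
        raise ValueError("no samples logged")
    return max(samples_w), sum(samples_w) / len(samples_w)


print(expected_samples(15))  # prints: 1800

# Placeholder readings in watts, shortened to four samples each.
idle_log = [96.0, 95.5, 97.0, 96.5]
load_log = [380.0, 392.5, 388.0, 390.5]

idle_peak, _ = summarize_power(idle_log)
load_peak, load_avg = summarize_power(load_log)
print(idle_peak, load_peak, load_avg)  # prints: 97.0 392.5 387.75
```

A 15 minute run at two samples per second gives 1,800 readings per scenario, which is plenty to catch short spikes that a single spot reading would miss.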

R9-FURY-X-2-82.jpg

This is one area where the Fury X managed to surprise me. Expecting yet another power-hogger from AMD, I was ready for the worst…which didn’t happen. It only consumes some 15W more than the GTX 980 Ti and less than the much less powerful R9 290X and R9 390X. Much of this is likely due to the lower temperature positively affecting leakage but regardless of how AMD achieved this, we’ll take it!
 

1440P: AC:Unity / Battlefield 4

Assassin’s Creed: Unity


R9-FURY-X-2-54.jpg

R9-FURY-X-2-30.jpg


Battlefield 4


<iframe width="640" height="360" src="//www.youtube.com/embed/y9nwvLwltqk?rel=0" frameborder="0" allowfullscreen></iframe>​

In this sequence, we use the Singapore level which combines three of the game’s major elements: a decayed urban environment, a water-inundated city and finally a forested area. We chose not to include multiplayer results because their randomness makes apples to apples comparisons impossible.

R9-FURY-X-2-55.jpg

R9-FURY-X-2-31.jpg
 

1440P: Dragon Age: Inquisition / Dying Light

Dragon Age: Inquisition


<iframe width="640" height="360" src="https://www.youtube.com/embed/z7wRSmle-DY" frameborder="0" allowfullscreen></iframe>​

Dragon Age: Inquisition is one of the most popular games around due to its engaging gameplay and open-world style. In our benchmark sequence we run through two typical areas: a busy town and an outdoor environment.

R9-FURY-X-2-56.jpg

R9-FURY-X-2-32.jpg



Dying Light


<iframe width="640" height="360" src="https://www.youtube.com/embed/MHc6Vq-1ins" frameborder="0" allowfullscreen></iframe>​

Dying Light is a relatively late addition to our benchmarking process but with good reason: it required multiple patches to optimize performance. While one of the patches handicapped viewing distance, this is still one of the most demanding games available.

R9-FURY-X-2-57.jpg

R9-FURY-X-2-33.jpg
 
