AMD Radeon HD 6990 4GB Review

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,857
Location
Montreal
When AMD first launched their second generation DX11 lineup with the HD 6800-series, their plans for the whole Northern Islands family of GPUs were laid bare. The original timeline called for the release of the Barts, Cayman and Antilles products before the end of 2010 and for the most part, this was accomplished. The Cayman-based HD 6970 and HD 6950 made their way onto the market in mid December but the eagerly anticipated Antilles dual GPU card was pushed back to Q1 2011.

The official reason behind the Antilles’ delay will likely be shrouded in mystery for the foreseeable future but several issues did arise due to its devolution from a 32nm flagship to using existing 40nm technology. When trying to push the limits of DX11 performance, the 40nm process poses some unique challenges in terms of power consumption and heat production. To engineer their way around these obstacles, AMD had to come up with some innovative solutions while striving to keep costs down.

We have had our fair share of great experiences with cards like Antilles in the past but it’s important to remember its type of design has had inherent problems as well. Unsteady drivers, low availability and horrendously loud fans have become stumbling blocks time and again. The HD 6990 aims to smash these preconceptions and forge a new chapter in the history of dual GPU cards.

Regardless of launch dates and other challenges, the card we now know as the HD 6990 should become available at retailers as you read this for a vertigo inducing price of $699. This actually makes some cruel sense since it pegs Antilles to compete with dual HD 6970s plus a price premium for the “convenience” of a dual GPU card.

Some may question the reasoning behind the release of a $700 dual GPU graphics card in today’s market. However, the HD 6990 isn’t geared towards people who really care about saving money or want to save on their energy bills; it’s for gamers who want uncompromising performance regardless of mere worldly common sense. It also gives AMD the ability to once again plant their flag into ultra high end soil while giving consumers access to what is (for the next few weeks at least) billed as the fastest graphics card in the world.

 

The Architecture behind Antilles


AMD’s Antilles card uses a pair of AMD’s Cayman cores which have been specifically binned for low power consumption and heat output. Unfortunately, since the cores are so stringently binned, we will likely see a mere trickle of these cards making their way into the retail channel.

One of the main benefits of using such select GPUs is the ability to utilize fully-enabled dies instead of trading off SIMD blocks for additional power savings. As we will see on the next pages, AMD stuck with fully-enabled cores but did have to sacrifice clock speeds in order to retain acceptable power consumption numbers. For a more in-depth presentation of the Cayman architecture, we recommend that you visit our HD 6970 / HD 6950 launch article.


A bird’s-eye view of a single Cayman core which resides in Antilles really doesn’t show that much of a departure from Cypress but there are some noteworthy changes. Since AMD has moved to a simplified VLIW4 architecture for the thread processors, the number of SIMD engines has been increased by four for a total of 24. Each of these engines features 16 thread processors with four ALUs each (for a total of 64 ALUs per SIMD), four texture units, 512KB of L2 texture cache and 64KB allocated to the local data share. This means a fully-enabled core will have 1536 shaders and 96 TMUs while the ROP array layout hasn’t changed from Cypress with its 32 colour and 128 z-stencil ROPs. Multiply this by two and you have a general idea of what Antilles can bring to the table.
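For anyone who wants to check the math, the per-core totals quoted above fall straight out of the SIMD layout. The short Python sketch below simply multiplies the figures from the text; the function and variable names are ours and purely illustrative.

```python
# Per-core figures quoted in the text for a fully-enabled Cayman GPU.
SIMD_ENGINES = 24
THREAD_PROCESSORS_PER_SIMD = 16
ALUS_PER_THREAD_PROCESSOR = 4      # VLIW4: four ALUs per thread processor
TEXTURE_UNITS_PER_SIMD = 4
GPUS_ON_ANTILLES = 2


def cayman_totals(simd_engines: int = SIMD_ENGINES) -> dict:
    """Return ALU and texture unit counts for a single core."""
    alus_per_simd = THREAD_PROCESSORS_PER_SIMD * ALUS_PER_THREAD_PROCESSOR
    return {
        "alus_per_simd": alus_per_simd,
        "shaders": simd_engines * alus_per_simd,
        "tmus": simd_engines * TEXTURE_UNITS_PER_SIMD,
    }


core = cayman_totals()
# Double the per-core totals to get a rough picture of the whole card.
card = {key: core[key] * GPUS_ON_ANTILLES for key in ("shaders", "tmus")}

print(core)
print(card)
```

Passing a lower `simd_engines` value shows what a cut-down die would have looked like had AMD traded SIMD blocks for power savings.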

All in all, Cayman may have fewer shader processors than Cypress but the processors themselves are slightly more efficient and the architecture has additional texture processing power granted by the additional 4 SIMD engines.

Much like on the Barts series, we can also see that in an effort to increase rendering efficiency even more, AMD has broken up the Ultra Threaded Dispatch Processor into two with each section having its own instruction and constant cache. This dispatch processor basically acts like a traffic cop, directing draw calls to the SIMD arrays. With each directing its own “half” of the SIMD engine, rendering information can be processed at a much quicker rate.

Since geometry performance has been the overriding focus here, we can naturally expect Cayman-based cards to run circles around the HD 5800-series in some games. However, not all of the first generation DX11 games incorporate higher level geometry or higher levels of tessellation. DX10 and to a greater extent DX9 applications lack a real need for increased performance in this area, which may very well lead to a relatively minor gap between AMD’s current and past generations.


In order to facilitate communication between the two cores, AMD has once again gone with a PCI-E switch from PLX. For the Antilles card, an ultra low latency 8647 chip is used and allows for a total of 48 second generation PCI-E lanes to be used which should eliminate any bottlenecks seen with previous generations. The HD 5970 used this same switch but due to communication differences between the Cypress and Cayman cores, Antilles should be better able to utilize it. There is no magical Sideport wand here, just a fast 7.92 Gbps interconnect between the two GPUs.

The 48 bi-directional lanes afforded by the 8647 are partitioned at a 2:1 ratio between the GPU-to-switch links and the switch-to-slot interface. This means each Cayman GPU has the bandwidth of sixteen PCI-E Gen 2 lanes between it and the PLX switch while the switch itself has an open x16 link to the motherboard’s PCI-E slot. Meanwhile, the main display output is handled by the primary GPU.
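To put the lane split into numbers, here is a rough Python sketch. It assumes the standard PCI-E 2.0 signalling rate of 5 GT/s per lane with 8b/10b encoding, which is not taken from AMD's materials, and the link names are our own labels.

```python
# PCI-E 2.0 signals at 5 GT/s per lane with 8b/10b encoding,
# so 80% of the raw bit rate carries payload.
GEN2_GT_PER_S = 5.0
ENCODING_EFFICIENCY = 8 / 10


def link_bandwidth_gb_s(lanes: int) -> float:
    """Usable bandwidth in GB/s, per direction, for a Gen 2 link."""
    gbit_per_lane = GEN2_GT_PER_S * ENCODING_EFFICIENCY   # 4 Gbit/s per lane
    return lanes * gbit_per_lane / 8                      # bits -> bytes


# Lane split described in the text: x16 to each GPU, x16 to the slot.
links = {"gpu0": 16, "gpu1": 16, "host": 16}
total_lanes = sum(links.values())

for name, lanes in links.items():
    bandwidth = link_bandwidth_gb_s(lanes)
    print(f"{name}: x{lanes} = {bandwidth:.1f} GB/s per direction")
```

Two thirds of the switch's lanes face the GPUs and one third faces the slot, which is where the 2:1 ratio comes from.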

AMD has chosen a very straightforward design which should be quite efficient considering the PLX chip consumes less than 3 watts of power. In addition, an expensive dual PCB solution has been avoided through the use of a linear communication path with a low latency switch instead of a built-in Crossfire bridge.
 

PowerTune: Keeping Consumption in Check


One of the largest challenges GPU manufacturers have been smashing into as of late is the rapid increase in the power consumption of their higher-end ASICs. NVIDIA’s solution to cut consumption and TDP in their GTX 500-series has been a combination of input current monitoring and upgraded heatsinks as well as application detection. AMD meanwhile is taking a different path with their PowerTune technology which uses a complex set of current calculations to determine on-the-fly TDP levels. It can then adjust clock speeds once the card reaches a pre-determined maximum thermal design power level.

The entire point of PowerTune is to allow AMD to strike a delicate balance between power consumption, thermals and clock speeds. If such a middle-man didn’t exist, the clock speeds of Antilles would have been significantly lower since there would have been nothing to keep TDP in check.

The Antilles design represented a unique challenge since, left unchecked, two Cayman cores, each with its associated 2GB of memory, would have easily burst through the 375W ceiling needed for PCI-E certification and compatibility with the largest cross section of motherboards and power supplies. PowerTune allows for high clock speeds to be maintained but it can also be pushed aside if a user wants to overclock the card for additional performance. On Antilles, it is implemented in an identical fashion to the Cayman series.


A typical GPU will likely be used for any number of applications but its primary focus will usually be upon one thing: entertainment. While there are several synthetic benchmarks which cause a graphics card to consume copious amounts of power, most typical games will never even begin to approach these levels. As such, AMD is focusing their PowerTune technology upon scenarios which put unrealistic loads upon the GPU rather than games. Since most of us don’t sit around all day benchmarking with 3DMark, this is good news.

Unfortunately, depending on their rendering methods there may still be the odd game which will be caught up in the crossfire and have its performance capped. In past articles we have found next to no impact from PowerTune’s implementation though. It is just important to remember that AMD has tuned this technology to deliver the best gaming performance while weeding out potential power viruses.


As AMD describes it, this new technology is simply used to contain power consumption in such a way that the actual TDP of a given product will in effect determine clock speeds. Instead of letting the card run amok for the few seconds of absolute peak consumption that will likely occur every now and then, PowerTune caps power draw through clock speed modification. After the peak periods are concluded, clock speeds along with performance will return to normal.
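The behaviour described above can be pictured with a toy control loop. To be clear, this is not AMD's algorithm: the linear power-versus-clock model, the 10MHz step, the 800MHz ceiling and the sample load trace are all assumptions made purely for illustration.

```python
def powertune_step(clock_mhz, est_power_w, cap_w, max_clock_mhz,
                   step_mhz=10, min_clock_mhz=500):
    """Lower the clock while over the cap, otherwise creep back up."""
    if est_power_w > cap_w:
        return max(min_clock_mhz, clock_mhz - step_mhz)
    return min(max_clock_mhz, clock_mhz + step_mhz)


def simulate(load_trace_w, cap_w=375.0, max_clock_mhz=800):
    """Return the clock chosen after each interval.

    load_trace_w holds the power (W) each interval would draw at full
    clock. Both it and max_clock_mhz are hypothetical values.
    """
    clock = max_clock_mhz
    history = []
    for full_clock_power in load_trace_w:
        # Toy assumption: power scales linearly with clock speed.
        est_power = full_clock_power * clock / max_clock_mhz
        clock = powertune_step(clock, est_power, cap_w, max_clock_mhz)
        history.append(clock)
    return history


# Two quiet intervals, a three-interval power spike, then quiet again.
clocks = simulate([300, 300, 420, 420, 420, 300, 300])
print(clocks)
```

The clock only dips while the estimated draw exceeds the cap and climbs back once the spike has passed, which is the general pattern AMD describes.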

This may all sound like doom and gloom for overall performance but PowerTune is actually designed for a worst-case scenario rather than a typical usage pattern. The algorithm to determine implied power consumption is based upon an extremely high leakage ASIC operating with 45 degree inlet temperature. Remember that high temperatures increase power draw in transistors so this ensures products are not artificially capped in lower temperature scenarios. Since TDP is the determining factor here, if you keep your card cool within a well ventilated case you should in theory never see PowerTune kick in while gaming.


One of the beauties of this technology is the control the end user has over it. Within the Catalyst Control Center’s Overdrive panel, there is a Power Control Setting slider that allows PowerTune to add some overhead to its calculations. This could also improve overclocking since it allows the core to loosen its grip on TDP.


While the current PowerTune cap for Antilles is 375W, this modifier will allow for theoretical consumption limits of up to a massive 450W which could increase performance if one is running into rendering limitations. However, setting additional overhead does not guarantee games will perform any better since as we saw there are very few (if any) applications that will be limited by the already-lax limits AMD has instituted. It should also be mentioned that since PowerTune is considered an overclocking tool, its usage will not be covered by certain board partners’ warranties.


Conversely, AMD has also allowed for the tightening of containment. There may not be much value in this for a typical hardcore gamer but if performance is above that magical 60 FPS mark, lowering the PowerTune setting could still net perfectly acceptable framerates coupled with lower power consumption and temperatures.
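The slider arithmetic is simple enough to sketch in a few lines of Python. The 375W and 450W figures come from the text above; the symmetric -20% lower bound used in the example is our assumption and the function name is hypothetical.

```python
BASE_CAP_W = 375.0   # default PowerTune limit quoted for Antilles
MAX_CAP_W = 450.0    # theoretical limit quoted with the slider maxed out


def cap_for_slider(percent: float, base_w: float = BASE_CAP_W) -> float:
    """PowerTune cap in watts for a given Power Control slider setting."""
    return base_w * (1 + percent / 100)


# The quoted ceiling implies how far the slider can loosen the cap.
max_percent = round((MAX_CAP_W / BASE_CAP_W - 1) * 100, 6)

loosened = cap_for_slider(max_percent)
tightened = cap_for_slider(-20)   # assumed lower bound, not from the text

print(max_percent, loosened, tightened)
```

In other words, the jump from 375W to 450W works out to 20% of extra overhead, and a similar step in the other direction would rein the card in to roughly 300W.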
 

Image Quality Improvements Aplenty


AMD has also brought some new image quality enhancements to the table. The cornerstone of this push to increase IQ is the addition of a new anti-aliasing method which AMD calls Morphological AA.


Morphological AA Explained


Morphological AA is basically a new form of fullscreen anti-aliasing that delivers an image quality which is comparable to Super Sample AA, but can be implemented with a fraction of SSAA’s performance hit. The AA algorithms are calculated more efficiently by leveraging the GPGPU compute abilities of modern Radeon cards and the power of the DirectCompute API. Since the post-processing filtering is done by DirectCompute, the whole scene can be quickly analyzed so this AA method isn’t limited to only certain aspects of a given image.
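The general idea of a post-process filter that looks at the finished frame can be shown with a very small Python sketch. This is a heavily simplified stand-in and not AMD's Morphological AA: a real implementation classifies edge shapes and computes blend weights from them, whereas the toy below just softens any pixel sitting on a strong luminance step. The function name and threshold are hypothetical.

```python
def smooth_edges(img, threshold=0.25):
    """Blend pixels on a strong luminance step with their neighbours.

    img is a list of rows of floats in [0, 1]; a new image is returned.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # 4-connected neighbours, clamped at the image border.
            nbrs = [img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x],
                    img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]]
            contrast = max(abs(n - img[y][x]) for n in nbrs)
            if contrast > threshold:          # pixel lies on an edge
                out[y][x] = 0.5 * img[y][x] + 0.5 * sum(nbrs) / 4
    return out


# A hard black-to-white vertical edge and a featureless grey frame.
edge = [[0.0, 1.0], [0.0, 1.0]]
flat = [[0.5, 0.5, 0.5] for _ in range(3)]

softened = smooth_edges(edge)
untouched = smooth_edges(flat)
print(softened)
```

Because the filter works on final pixel values rather than geometry, it doesn't care whether the frame came from a DX9 game, a DX11 game or a video, which matches the flexibility AMD is advertising.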


One of the more interesting benefits of Morphological AA being done through a standalone API is the fact that it can be applied to both 2D and 3D scenes. It can be applied to things like video, Flash apps and more. In addition, since it is controlled directly through AMD’s Catalyst Control Center and makes use of DirectCompute, Morphological AA has the ability to be forced in any DX9, DX10 or DX11 game.


Enhanced Quality AA Makes an Entry


Enhanced Quality Anti Aliasing is another new image enhancement routine which AMD has implemented for Cayman-series cards. This is not DirectCompute-controlled and acts much like CSAA routines. Unlike Multi-Sample AA which uses an equal number of color and coverage samples per pixel, EQAA allows for each to be controlled independently with a maximum of 16 coverage samples per pixel. It is also compatible with existing AA modes which could further enhance image quality.
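A conceptual way to picture the split between colour and coverage samples is shown below. This is only a sketch of the general idea behind coverage-based AA, not a description of how AMD's hardware resolves a pixel; the function name and sample layout are our own.

```python
def resolve_pixel(color_fragments, coverage_to_fragment):
    """Resolve one pixel from a few colours and many coverage samples.

    color_fragments: list of (r, g, b) tuples, one per stored colour.
    coverage_to_fragment: for each coverage sample, the index of the
        colour fragment that covers it.
    """
    n = len(coverage_to_fragment)
    weights = [0.0] * len(color_fragments)
    for idx in coverage_to_fragment:
        weights[idx] += 1.0 / n
    return tuple(
        sum(w * frag[c] for w, frag in zip(weights, color_fragments))
        for c in range(3)
    )


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# Two stored colours, but 16 coverage samples: 5 hit red, 11 hit blue.
fine = resolve_pixel([red, blue], [0] * 5 + [1] * 11)
# The same two colours with only one coverage sample each.
coarse = resolve_pixel([red, blue], [0, 1])

print(fine, coarse)
```

The extra coverage samples refine how the stored colours are weighted along an edge without adding more colour samples, which is where the modest performance cost comes from.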


The main benefit of EQAA is its minimal impact on performance. According to AMD’s own numbers, in most cases users will see less than a ten percent drop in framerates when enabling this feature.
 

Antilles’ Place in AMD’s Lineup



It should go without saying that the HD 6990 is meant to be AMD’s flagship product for the foreseeable future and it has a price to match. Due to a perceived shortage of these new cards in the coming weeks, we expect to see the price go slightly beyond this before leveling out once demand is met. It is also possible that we will see some movement here once NVIDIA introduces their upcoming flagship but for the time being, a launch price of $699 makes it one of the most expensive reference cards ever launched (though it still doesn’t hold a candle to the 8800 Ultra’s initial price of $799 and more).

The GPUs on the Antilles have the same number of cores, texture units, ROPs and memory layout as the one found on the HD 6970. On paper, these specifications give it twice the graphics processing power of a HD 6970 but due to slightly lower core clocks, there will likely be a small gap between Antilles and two Cayman XT cards. Memory speeds have also been reduced by 10% which likely won’t make that much of a difference since these cards aren’t anywhere close to bandwidth starved.

Regardless of the fact that AMD bins low leakage, high clock speed Cayman XT cores for the Antilles, these clock speed reductions were necessary in order to reduce the core voltage which in turn has a substantial impact upon power consumption. Even with these savings, the HD 6990 still consumes 375W which makes it the most power hungry reference graphics card ever released.


When compared directly against the outgoing HD 5970, Antilles actually has fewer cores but because of the architectural and core speed differences, the HD 6990 should have no problem outstripping its predecessor. The additional GDDR5 memory and higher number of texture units will also likely play a huge factor when it comes to differentiating between these two cards.


AMD’s current high end lineup

Judging from the price and specifications, it should be quite obvious that AMD is trying to thread a very thin needle between Crossfire solutions of the HD 6970 and HD 6950. Whether or not the PLX chip imparts any latency into the equation will be seen throughout testing but from our past experiences, it should really be a non-issue.

Past dual card solutions may have had quite a few issues in terms of driver implementation and overall stability but AMD has been hard at work on getting around these challenges. With the HD 6990 4GB at the pinnacle of their lineup, AMD is positioned to take on almost anything the competition can throw at them.
 

AMD’s HD 6990 4GB Under the Microscope



The HD 6990 4GB is one huge card which has its heatsink shroud designed in such a way that it fits right in with the rest of the HD 6000 series. The main differentiating factor is the centrally mounted blower-style fan which mirrors the design XFX used for their HD 5970 Black Edition.

In terms of length, we are looking at about 12 ¼” which puts it on par with the HD 5970. Unfortunately, there are quite a few standard ATX cases on the market that won’t be able to fit this card without making some extensive modifications. So make sure your enclosure is compatible before taking a $700 plunge into dual GPU territory.


The heatsink itself is quite unusual in terms of design and implementation. Instead of using a typical rear-mounted intake fan which pushes air across a series of heatsinks placed one after another, AMD has chosen a centrally located intake which pushes cool air across heatsinks located on either side of the fan.

In theory, this should allow the fan to spin at lower RPMs since each core will receive an equal amount of airflow in order to stabilize their temperatures. Considering reducing transistor heat is a key component in lowering overall power consumption, AMD is relying on this setup to pull double duty. The only potential issue we can see is the exhaust path over the second GPU core means hot air will be blown against a case’s natural airflow direction.


Underneath the Antilles’ fan shroud lies a complex series of heatsinks, vapor chambers and high endurance thermal compound. In order to maximize heat transfer from the memory, AMD has equipped the HD 6990 with a backplate which makes direct contact with the eight GDDR5 modules on the card’s back. The other front-mounted modules along with the VRMs and other tertiary components receive their own low profile anodized aluminum heatsink which runs across the PCB’s whole length and width.

Each core has its own vapor chamber that makes direct contact with the core and has a complex fin array in order to quickly disperse heat. AMD has been using these types of vapor chambers for quite a while now but they have supposedly been perfected for use in the HD 6990.

Usually we would take the card and remove the heatsink in order to show you all of this but AMD has equipped their flagship product with a special kind of “phase change” thermal compound. This isn’t anything new on the market since companies like Shin Etsu have been producing something similar for several years now. Basically, this type of thermal compound rapidly changes consistency as temperatures increase and flows into any imperfections in order to increase heat transfer between the core and heatsink.


The HD 6990’s back shows how AMD has changed their core layout in order to optimize cooling. To their way of thinking, two equally spaced cores with a single fan in close proximity should allow for a significant reduction in heat over past designs. The departure away from the HD 5970 design is quite obvious.

Additionally, AMD has applied neoprene spacers on the outer edge of the board in order to ensure a small space is maintained for air intake when two cards are used in Crossfire.
 

AMD’s HD 6990 4GB Under the Microscope pg.2



Due to the high end power needs of this card, AMD saw the need to go with a dual 8-pin connector setup which happens to be the first such layout from a reference board. We’re sure to see this again when NVIDIA launches their dual chip card in the coming weeks.

A dual 8-pin setup also gives AMD the option to offer a fair amount of overclocking headroom on Antilles since their design and the components chosen are rated to provide up to 450W to the two cores. This naturally raises some questions about PSU selection and output but we’ll tackle that in the overclocking and power consumption sections.
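For reference, here is how the 375W figure adds up. The per-source limits below are the standard PCI-E ratings (75W from the slot, 75W per 6-pin and 150W per 8-pin connector) rather than anything specific to AMD's design, and the function name is just for illustration.

```python
# Standard PCI-E power limits per source, in watts.
SLOT_W = 75
SIX_PIN_W = 75
EIGHT_PIN_W = 150


def board_power_budget(six_pin: int = 0, eight_pin: int = 0) -> int:
    """In-spec power available to a card with the given connectors."""
    return SLOT_W + six_pin * SIX_PIN_W + eight_pin * EIGHT_PIN_W


hd6990_budget = board_power_budget(eight_pin=2)
# The board's components are rated for 450W, per the text above.
beyond_spec_w = 450 - hd6990_budget

print(hd6990_budget, beyond_spec_w)
```

The gap between the two numbers is why pushing the card towards its 450W rating means drawing more than the connectors are officially specified to deliver.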

Next to the single Crossfire connector is the same dual BIOS switch we saw on the HD 6900-series but in this case, both BIOS files have been populated with different settings. Position 2 is the standard position which incorporates factory default clock speeds as well as a VCore of 1.12V meant to reduce power consumption. Position 1 allows the core to run at 880MHz with a voltage of 1.175V and opens up the possibility of additional overclocking.

On retail cards, this switch will be covered by a sticker which can be removed (doing so will void the warranty) in order to switch between the standard and Performance BIOSes.
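To get a feel for what the Performance BIOS costs in power terms, the usual rule of thumb is that dynamic power scales with frequency times voltage squared. The Python sketch below applies it to the two BIOS settings; it ignores leakage entirely, and the 830MHz default clock is the commonly quoted reference speed rather than a figure stated on this page, so treat the result as a ballpark only.

```python
def relative_dynamic_power(freq_mhz, vcore, ref_freq_mhz, ref_vcore):
    """Dynamic power relative to a reference, using P ~ f * V^2."""
    return (freq_mhz / ref_freq_mhz) * (vcore / ref_vcore) ** 2


STANDARD_VCORE = 1.12       # standard BIOS, from the text
PERFORMANCE_VCORE = 1.175   # Performance BIOS, from the text
PERFORMANCE_MHZ = 880       # Performance BIOS, from the text
STANDARD_MHZ = 830          # assumption: reference clock, not stated above

# Effect of the voltage bump alone (same clock on both sides).
voltage_only = relative_dynamic_power(
    PERFORMANCE_MHZ, PERFORMANCE_VCORE, PERFORMANCE_MHZ, STANDARD_VCORE)

# Voltage and clock together.
combined = relative_dynamic_power(
    PERFORMANCE_MHZ, PERFORMANCE_VCORE, STANDARD_MHZ, STANDARD_VCORE)

print(f"voltage only: {voltage_only:.3f}x, combined: {combined:.3f}x")
```

Under these assumptions the voltage increase alone accounts for roughly a tenth more dynamic power per core, before the higher clock is even factored in.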


At 12 ¼” long, the HD 6990 is slightly longer than the HD 6970. Due to its enclosed heatsink design that is supposed to funnel air towards the front and rear exhaust vents, the shroud actually overlaps the PCB by a good half inch.


The output connector selection is definitely interesting since it forgoes the usual dual DVI outputs seen on most cards. Instead, there are four DisplayPort connectors and a single DVI but according to AMD, every card will ship with two DisplayPort to DVI adaptors (one active and one passive) alongside a single passive DisplayPort to HDMI dongle. By combining these adaptors, the HD 6990 will be able to drive up to three monitors in an Eyefinity setup.

DisplayPort 1.2 connectors also have the ability to daisy chain up to two 1080P monitors together. In theory, an Antilles card could therefore drive up to six monitors but supports five natively.

Photo provided by AMD

AMD has designed the Antilles card quite a bit differently from past dual chip solutions. The cores are placed equally apart and are each paired with 2GB of GDDR5 memory. The card’s sixteen 256MB modules are split eight per GPU, with four on each side of the PCB. The PLX 8647 48-lane PCI-E bridge chip sits in an off-center location in order to make room for the fan.

The two massive Volterra digital programmable voltage regulators sit above the cores and should push clean, uninterrupted power through a complete 8 + 4 phase power design.
 

Test System & Setup

Processor: Intel Core i7 920(ES) @ 4.0GHz (Turbo Mode Enabled)
Memory: Corsair 3x2GB Dominator DDR3 1600MHz
Motherboard: Gigabyte EX58-UD5
Cooling: CoolIT Boreas mTEC + Scythe Fan Controller (Off for Power Consumption tests)
Disk Drive: Pioneer DVD Writer
Hard Drive: Western Digital Caviar Black 640GB
Power Supply: Corsair HX1000W
Monitor: Samsung 305T 30” widescreen LCD / 3x Acer GD235HZ 23.5" 1080P LCDs
OS: Windows 7 Ultimate N x64 SP1


Graphics Cards:

AMD HD 6990 4GB
AMD HD 5970 2GB
AMD HD 6970 2GB Single + Crossfire (Ref)
AMD HD 6950 2GB Crossfire (Ref)

NVIDIA GTX 560 Ti 1GB SLI (Ref)
NVIDIA GTX 570 SLI (Ref)
NVIDIA GTX 580 Single & SLI (Ref)




Drivers:

NVIDIA 267.31 Beta
ATI 11.4 Preview + CAP 11.2 R4

Note: Even though AMD claims the “AMD Optimized Tessellation” feature in the 11.1a drivers has not yet been implemented, we have changed the setting to “Off” in order to ensure additional, untested optimizations are not enabled.

Applications Used:

3DMark 11
Aliens Versus Predator
Battlefield: Bad Company 2
DiRT 2
F1 2010
Just Cause 2
Lost Planet
Metro 2033
Unigine: Heaven


*Notes:

- For more information about our benchmarking process, please see this article

- All games tested have been patched to their latest version

- The OS has had all the latest hotfixes and updates installed

- All scores you see are the averages after 3 benchmark runs

- All game-specific methodologies are explained above the graphs for each game

- All IQ settings were adjusted in-game
 

3DMark 11 (DX11)


3DMark 11 is the latest in a long line of synthetic benchmarking programs from the Futuremark Corporation. This is their first foray into the DX11 rendering field and the result is a program that incorporates all of the latest techniques into a stunning display of imagery. Tessellation, depth of field, HDR, OpenCL physics and many others are on display here. In the benchmarks below we have included the results (at default settings) for both the Performance and Extreme presets.


Performance Preset



Extreme Preset

 

Aliens Versus Predator (DX11)


When benchmarking Aliens Versus Predator, we played through the whole game in order to find a section which represents a “worst case” scenario. We finally decided to include “The Refinery” level which includes a large open space and several visual features that really tax a GPU. For this run-through, we start from within the first tunnel, make our way over the bridge on the right (blowing up several propane tanks in the process), head back over the bridge and finally climb the tower until the first run-in with an Alien. In total, the time spent is about four minutes per run. Framerates are recorded with FRAPS.
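For readers who want to reproduce this kind of run-through at home, the scoring boils down to an average framerate per run, averaged again across three runs. The Python sketch below assumes each run is available as a list of cumulative frame timestamps in milliseconds (the sort of data a frametime log provides); the function names and sample numbers are hypothetical and not our actual results.

```python
from statistics import mean


def average_fps(frame_times_ms):
    """Average FPS from cumulative frame timestamps in milliseconds."""
    if len(frame_times_ms) < 2:
        raise ValueError("need at least two timestamps")
    elapsed_s = (frame_times_ms[-1] - frame_times_ms[0]) / 1000.0
    return (len(frame_times_ms) - 1) / elapsed_s


def benchmark_score(runs):
    """Mean of the per-run average framerates."""
    return mean(average_fps(run) for run in runs)


# Synthetic data: three short runs at a steady 20, 25 and 16 ms per frame.
runs = [
    [0, 20, 40, 60, 80],
    [0, 25, 50, 75, 100],
    [0, 16, 32, 48, 64],
]
score = benchmark_score(runs)
print(f"{score:.1f} FPS")
```

Averaging per run first keeps a single unusually long or short run from dominating the final number.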


1920 x 1200





2560 x 1600



 