
The NVIDIA TITAN X Performance Review

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,861
Location
Montreal
NVIDIA’s TITAN series has always been about delivering the highest possible performance to both gamers and developers but the new TITAN X does things a bit differently. While it is still a processing powerhouse with roots firmly planted in the CUDA developer field, the goals this time around are distinctively more gamer-centric.

When the GTX 980 and its GM204 core were introduced to widen NVIDIA's lead over AMD's Hawaii, everyone knew the Maxwell architecture had much more to offer. The GM204 was and will always be NVIDIA's mid-tier core, one that parades around in an enthusiast graphics card due to a lack of pressure from the Radeon lineup. With that in mind, NVIDIA was able to hold onto its larger cores until inventory built up and yields improved. Why prematurely launch something if there's nothing for it to compete against? Hence the TITAN X has been born with a fully enabled GM200 core, and it represents a "catch me if you can" challenge for AMD, who now know exactly what it's going to take to challenge Maxwell in its current form.


With an incredible 8 billion transistors, sized at 601mm² and boasting a TDP of 250W, NVIDIA’s GM200 core is based on the mature 28nm manufacturing process and is one of the largest they’ve ever developed. It is composed of six Graphics Processing Clusters, each of which holds a quartet of Streaming Multiprocessors. In total that means TITAN X has access to 3072 CUDA cores and 192 Texture units or 50% more than are rolled into the GTX 980.
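The unit arithmetic above can be sketched quickly; the per-SM figures (128 CUDA cores and 8 texture units per Maxwell SM) are the standard Maxwell SMM layout:

```python
# GM200 unit counts as described above
GPCS = 6
SMS_PER_GPC = 4
CORES_PER_SM = 128   # Maxwell SMM: 128 CUDA cores per SM
TMUS_PER_SM = 8      # Maxwell SMM: 8 texture units per SM

total_sms = GPCS * SMS_PER_GPC          # 24 SMs
cuda_cores = total_sms * CORES_PER_SM   # 3072 CUDA cores
tmus = total_sms * TMUS_PER_SM          # 192 texture units

# The GTX 980 (GM204) ships with 2048 cores, so GM200 is 50% larger
print(cuda_cores, tmus, cuda_cores / 2048 - 1)  # 3072 192 0.5
```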

The GM200’s secondary processing stages include 3MB of shared L2 cache along with 96 ROPs, which have access to six 64-bit GDDR5 memory controllers combined into a 384-bit interface. Once again these specifications put the TITAN X a good 50% above its sibling. Like other Maxwell cores, this one includes advanced delta color compression algorithms which effectively boost theoretical bandwidth, along with a new video engine that grants higher pixel clocks alongside native support for 5K resolution. You can find a deeper dive into those features HERE.
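The memory figures above work out as follows (the GTX 980 comparison assumes its stock 256-bit, 7Gbps configuration):

```python
# Theoretical memory bandwidth: six 64-bit controllers at 7Gbps GDDR5
bus_width_bits = 6 * 64                       # 384-bit interface
data_rate_gbps = 7                            # effective Gbps per pin
bandwidth_gbs = bus_width_bits / 8 * data_rate_gbps
print(bandwidth_gbs)                          # 336.0 GB/s

# Versus the GTX 980's 256-bit bus at the same 7Gbps: 224 GB/s
gtx_980_gbs = 256 / 8 * 7
print(bandwidth_gbs / gtx_980_gbs)            # 1.5, i.e. 50% more
```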

If we go back to when the “original” TITAN, TITAN Black and insanely powerful (for the time) TITAN Z were announced, NVIDIA specifically highlighted their abilities for budding CUDA developers. Unlike the GeForce lineup, TITAN cards’ DP units could operate at full speed which proved to be a boon for some developers who weren’t initially looking for the extra driver features and back-end support provided by NVIDIA’s workstation-class products. TITAN X takes a different stance by offering the same simplified 1/32 speed double precision floating point throughput as the GeForce lineup.

The move away from raw DP throughput was made due to pure space and power limitations. Adding additional double precision units would have consumed valuable die space and forced NVIDIA to make sacrifices in other key areas to ensure the TITAN X remained at a nominal 250W TDP. As it stands, they were able to maximize gaming horsepower for this particular card, but expect a GM200 specifically tailored for the Quadro market very soon.


On paper at least the TITAN X represents a significant step up from the Kepler-toting TITAN Black, but many enthusiasts will understandably compare it to the GTX 980. In that respect NVIDIA is touting huge improvements in the core count, TMU and ROP departments, but the most noteworthy area of change is the memory interface. With so many games pushing past the 4GB mark at 4K and higher resolutions, some gamers began to run face-first into the GTX 980’s 4GB framebuffer limit. Even SLI solutions were held back by this cap. The TITAN X’s 384-bit interface has been paired up with 12GB of 7Gbps GDDR5 which should make memory bottlenecks a thing of the past, particularly for anyone developing virtual reality applications.

While a 50% improvement (or even more if memory-limited situations come into play) over the GTX 980 sounds like a dream come true, the TITAN X’s core speeds may conspire to limit that hope. While the GTX 980 can harness its Boost algorithms to push its engine frequency to 1216MHz and higher in some instances, the 250W card should top out around 1075MHz, so we can’t expect a linear framerate difference based on architecture alone. Luckily, NVIDIA has added a bit of overclocking headroom into the TITAN X so those numbers are anything but a foregone conclusion.
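A rough cores-times-clock estimate illustrates why the 50% hardware advantage shrinks once those clock speeds are factored in (this ignores memory, ROP and game-specific effects, so it's an upper-bound sketch, not a prediction):

```python
# Theoretical shader throughput = CUDA cores x sustained clock (MHz)
titan_x = 3072 * 1075   # TITAN X at its expected ~1075MHz boost ceiling
gtx_980 = 2048 * 1216   # GTX 980 at an observed 1216MHz boost

print(round(titan_x / gtx_980, 2))  # 1.33 -> closer to +33% than +50%
```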

Leading edge performance does of course come with some sacrifices as well. While the TITAN X is still able to maintain an air cooled heatsink in part due to the Maxwell architecture’s inherent thermal efficiency, its TDP hits 250W. We do have to remember that older TITAN parts had similarly high TDPs and yet they offered substantially less performance.

The real question many will likely have about the TITAN X is price, one aspect which was a closely guarded secret until an hour before this review went live. With previous iterations retailing between $999 and a spectacular $2999 for the TITAN Z, some were pegging this card to hit the $1499 mark. However, NVIDIA no longer has that double precision / CUDA developer crutch to lean on so some rethinking is in order and the GTX 980 fits in quite well at just $549. Add 50% to that and a few bucks for the convenience of a single GPU solution and you’ve got a $900 graphics card but that isn’t quite the case here.
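The back-of-the-napkin pricing math from the paragraph above looks like this:

```python
# Hypothetical "50% more hardware" pricing, starting from the GTX 980's MSRP
gtx_980_msrp = 549
naive_titan_price = gtx_980_msrp * 1.5
print(naive_titan_price)  # 823.5

# Add "a few bucks" for the convenience of a single-GPU solution and you
# land near the ~$900 figure -- well under what NVIDIA actually charges.
```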

NVIDIA is following in the footsteps of TITAN cards past by pricing the X at $999 which is a far cry from what some were predicting. Does that make for a good value on paper? Nope. However, avoiding the tiresome eccentricities of a dual card solution does have a certain allure.

All things considered, the TITAN X's price and its potential capabilities certainly put it within reach of gamers. But can it actually provide enough raw performance to justify a $1000 investment? Let's find out...

 
A Closer Look at the TITAN X




While the retail packaging will vary depending on board partners, reviewers received their TITAN X in packaging that points towards gaming rather than CUDA developer roots. “Inspired by Gamers, Built By NVIDIA” is a fitting slogan considering this card has gaming at its heart.


The TITAN X itself is a relatively straightforward affair. Though its design mirrors the GTX 980 quite closely, the heatsink shroud takes inspiration from previous TITANs with an all-aluminum shell. Like all other recent high end NVIDIA cards, its length remains 10.5” which should ensure compatibility with smaller chassis. Along with a glowing GeForce GTX logo on the side, the design is a stunning one.


Even though the TITAN X is a 250W card, NVIDIA has been able to retain an air cooled heatsink design that uses a blower-style layout. This effectively exhausts hot air outside the case so there’s no need to worry about increased ambient temperatures within your system.


Around back there’s a continuation of the internal heatsink that pokes out through an open duct which is supposed to feed the fan with cool air if the card is placed close to an obstruction. We can also see a basic 8+6 pin power input layout.


Even though the TITAN X’s rear PCB area is stacked full of GDDR5 modules, NVIDIA didn’t include a backplate for some reason. While these modules don’t run excessively hot, plenty of enthusiasts simply like the look of a clean backplate on their ultra-expensive graphics cards.


Taking off the shroud reveals an extensive three-stage internal vapor chamber heatsink design that consists of two primary aluminum fin arrays alongside a full-coverage plate that covers the memory and several VRM components. There are also channels to direct and accelerate the fan’s airflow so it can be most effective in dissipating any built-up heat.


The card itself makes use of an advanced 6+2 phase PWM with insulated chokes to reduce whine at high framerates. Interestingly enough, there’s room for another power input connector on the PCB which perhaps points to it pulling double duty as a Tesla / Quadro workstation board as well.


The rear I/O area mirrors the one found on NVIDIA’s GTX 980 with a trio of DisplayPort outputs, a lone HDMI 2.0 port and the DVI output.
 
Performance Consistency & Temperatures Over Time



With a TDP of 250W, there should be no doubt that the TITAN X is a hot-running card. Nonetheless, NVIDIA claims their air cooler is capable enough to keep the GM200 core cool enough for it to achieve its Boost clocks fairly regularly. That’s a tall order considering the Boost algorithms can (and WILL) throttle performance if temperatures, input current or power consumption get too high. With that in mind, we took the TITAN X through an extensive gaming test to see if performance stayed within NVIDIA’s specifications.


The first test out the door sees the core getting a bit toasty throughout testing. However, it never goes above 81°C, which may seem quite high in a market filled with massive custom heatsinks cooling off efficient cores, but in reality it's a far cry from NVIDIA’s thermal throttling temperature of 90°C.


As we can see, the temperatures do have a slightly negative effect upon frequencies as the Boost algorithm lowers clocks by about 12MHz to better stabilize temperatures. Even with that slight reduction taken into account, our TITAN X remained at a lofty 1164MHz throughout testing which is nearly 100MHz more than NVIDIA’s advertised Boost speeds.


As one can imagine, all of this leads to extremely consistent performance with a slight, sub-1% framerate reduction as the frequencies kick down half a notch. For a 250W core on a quiet air cooler, these results are quite impressive.
 
Thermal Imaging, Acoustics & Power Consumption

Thermal Imaging



In our usual thermal imaging shots, there’s nothing out of the ordinary and it seems all of the components are being well cooled. It is certainly interesting to see the aluminum shroud doing some work in alleviating any built-up heat from the internal fin array but it never builds up enough heat to become dangerous to handle.

Around back the story is pretty much the same though we can see the memory modules do tend to heat up without a secondary heatsink in place. Supposedly it was omitted to help with airflow but we would have still liked to see one in place.


Acoustical Testing


What you see below are the baseline idle dB(A) results attained for a relatively quiet open-case system (specs are in the Methodology section) sans GPU along with the attained results for each individual card in idle and load scenarios. The meter we use has been calibrated and is placed at seated ear-level exactly 12” away from the GPU’s fan. For the load scenarios, Hitman Absolution is used in order to generate a constant load on the GPU(s) over the course of 15 minutes.


Despite housing a hot running core, the fan doesn’t need to speed up by a large amount to keep things cool. It looks as though the heatsink coupled with small fluctuations in core frequencies achieve a good balance so the TITAN X can remain relatively quiet when gaming.


System Power Consumption


For this test we hooked up our power supply to a UPM power meter that logs the power consumption of the whole system twice every second. In order to stress the GPU as much as possible we used 15 minutes of Unigine Valley running on a loop, while peak idle power consumption was determined by letting the card sit at a stable Windows desktop for 15 minutes.
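A log sampled twice per second can be reduced to a headline figure in a number of ways; a hypothetical sketch, assuming a plain list of watt samples (the UPM meter's actual export format may differ), is to take the peak of a short moving average so single-sample spikes don't skew the result:

```python
def summarize(samples, window=10):
    """Return the peak 'window'-sample moving average (watts).

    Smooths out one-off spikes that don't reflect sustained draw.
    """
    if len(samples) < window:
        return max(samples)
    peaks = (sum(samples[i:i + window]) / window
             for i in range(len(samples) - window + 1))
    return max(peaks)

# Hypothetical idle log: 12 samples (6 seconds) hovering in the low 70s
idle_log = [72, 74, 73, 75, 73, 74, 72, 73, 74, 75, 73, 74]
print(round(summarize(idle_log), 1))  # 73.6
```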


This is one of the Maxwell architecture’s true strengths. Its performance per watt is through the roof, and the fully enabled GM200 core alongside 12GB of GDDR5 memory ends up consuming less power than AMD’s R9 290X in Uber mode. Idle power is a bit higher than the other single-core cards, but that’s to be expected.
 
Test System & Setup

Main Test System

Processor: Intel i7 4930K @ 4.7GHz
Memory: G.Skill Trident 16GB @ 2133MHz 10-10-12-29-1T
Motherboard: ASUS P9X79-E WS
Cooling: NH-U14S
SSD: 2x Kingston HyperX 3K 480GB
Power Supply: Corsair AX1200
Monitor: Dell U2713HM (1440P) / ASUS PQ321Q (4K)
OS: Windows 8.1 Professional


Drivers:
AMD 15.3.1 Beta
NVIDIA 347.84


*Notes:

- All games tested have been patched to their latest version

- The OS has had all the latest hotfixes and updates installed

- All scores you see are the averages after 2 benchmark runs

All IQ settings were adjusted in-game and all GPU control panels were set to use application settings


The Methodology of Frame Testing, Distilled


How do you benchmark an onscreen experience? That question has plagued graphics card evaluations for years. While framerates give an accurate measurement of raw performance, there’s a lot more going on behind the scenes which a basic frames per second measurement by FRAPS or a similar application just can’t show. A good example of this is how “stuttering” can occur but may not be picked up by typical min/max/average benchmarking.

Before we go on, a basic explanation of FRAPS’ frames per second benchmarking method is important. FRAPS determines FPS rates by simply logging and averaging out how many frames are rendered within a single second. The average framerate measurement is taken by dividing the total number of rendered frames by the length of the benchmark being run. For example, if a 60 second sequence is used and the GPU renders 4,000 frames over the course of that time, the average result will be 66.67FPS. The minimum and maximum values meanwhile are simply two data points representing single second intervals which took the longest and shortest amount of time to render. Combining these values together gives an accurate, albeit very narrow snapshot of graphics subsystem performance and it isn’t quite representative of what you’ll actually see on the screen.
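The FRAPS-style math described above can be sketched in a few lines, assuming a simple list of per-second frame counts:

```python
def fraps_summary(per_second_frames):
    """Average, minimum and maximum FPS from per-second frame counts."""
    avg = sum(per_second_frames) / len(per_second_frames)
    return avg, min(per_second_frames), max(per_second_frames)

# The article's example: 4,000 frames rendered over a 60-second run
print(round(4000 / 60, 2))  # 66.67 FPS average

# Hypothetical per-second counts: min/max are just the single worst
# and best seconds, which is why they make such a narrow snapshot
avg, lo, hi = fraps_summary([58, 66, 71, 64, 69])
print(round(avg, 1), lo, hi)  # 65.6 58 71
```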

FCAT on the other hand has the capability to log onscreen average framerates for each second of a benchmark sequence, resulting in the “FPS over time” graphs. It does this by simply logging the reported framerate result once per second. However, in real world applications a single second is actually a long period of time, meaning the human eye can pick up on onscreen deviations much quicker than this method can actually report them. So what actually happens within each second? A whole lot, since each second of gameplay can consist of dozens or even hundreds (if your graphics card is fast enough) of frames. This brings us to frame time testing and where the Frame Time Analysis Tool gets factored into this equation.

Frame times simply represent the length of time (in milliseconds) it takes the graphics card to render and display each individual frame. Measuring the interval between frames allows for a detailed millisecond by millisecond evaluation of frame times rather than averaging things out over a full second. The larger the value, the longer that frame took to render and the more likely it is to be perceived as a stutter. This detailed reporting just isn’t possible with standard benchmark methods.
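A minimal sketch of why frame-time data catches what averages hide; the 2x-average spike threshold here is an illustrative assumption, not the Frame Time Analysis Tool's actual heuristic:

```python
def analyze(frame_times_ms, spike_factor=2.0):
    """Convert per-frame render times (ms) to instantaneous FPS and
    flag outlier frames that would be perceived as stutter."""
    avg = sum(frame_times_ms) / len(frame_times_ms)
    fps = [1000.0 / t for t in frame_times_ms]
    spikes = [t for t in frame_times_ms if t > spike_factor * avg]
    return avg, fps, spikes

# A steady ~60 FPS cadence (16.7ms frames) with a single 50ms hitch:
# the one-second average barely moves, but the frame-time view exposes it
avg, fps, spikes = analyze([16.7] * 59 + [50.0])
print(spikes)  # [50.0]
```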

We are now using FCAT for ALL benchmark results, other than 4K.
 

1440P: Battlefield 4 / Dragon Age: Inquisition

Battlefield 4


<iframe width="640" height="360" src="//www.youtube.com/embed/y9nwvLwltqk?rel=0" frameborder="0" allowfullscreen></iframe>​

In this sequence, we use the Singapore level which combines three of the game’s major elements: a decayed urban environment, a water-inundated city and finally a forested area. We chose not to include multiplayer results simply due to their randomness, which makes apples-to-apples comparisons impossible.





Dragon Age: Inquisition


<iframe width="640" height="360" src="https://www.youtube.com/embed/z7wRSmle-DY" frameborder="0" allowfullscreen></iframe>

Dragon Age: Inquisition is one of the most popular games around due to its engaging gameplay and open-world style. In our benchmark sequence we run through two typical areas: a busy town and through an outdoor environment.


 

1440P: Dying Light / Far Cry 4

Dying Light


<iframe width="640" height="360" src="https://www.youtube.com/embed/MHc6Vq-1ins" frameborder="0" allowfullscreen></iframe>​

Dying Light is a relatively late addition to our benchmarking process but with good reason: it required multiple patches to optimize performance. While one of the patches handicapped viewing distance, this is still one of the most demanding games available.




Far Cry 4


<iframe width="640" height="360" src="https://www.youtube.com/embed/sC7-_Q1cSro" frameborder="0" allowfullscreen></iframe>​

The latest game in Ubisoft’s Far Cry series takes up where the others left off by boasting some of the most impressive visuals we’ve seen. In order to emulate typical gameplay we run through the game’s main village, head out through an open area and then transition to the lower areas via a zipline.


 

1440P: Hitman Absolution / Metro: Last Light

Hitman Absolution


<iframe width="560" height="315" src="http://www.youtube.com/embed/8UXx0gbkUl0?rel=0" frameborder="0" allowfullscreen></iframe>​

Hitman is arguably one of the most popular FPS (first person “sneaking”) franchises around and this time around Agent 47 goes rogue so mayhem soon follows. Our benchmark sequence is taken from the beginning of the Terminus level which is one of the most graphically-intensive areas of the entire game. It features an environment virtually bathed in rain and puddles making for numerous reflections and complicated lighting effects.




Metro: Last Light


<iframe width="640" height="360" src="http://www.youtube.com/embed/40Rip9szroU" frameborder="0" allowfullscreen></iframe>​

The latest iteration of the Metro franchise once again sets high water marks for graphics fidelity while making use of advanced DX11 features. In this benchmark, we use the Torchling level which represents a scene you’ll be intimately familiar with after playing this game: a murky sewer underground.


 

1440P: Middle Earth: Shadow of Mordor / Thief

Middle Earth: Shadow of Mordor


<iframe width="640" height="360" src="https://www.youtube.com/embed/U1MHjhIxTGE" frameborder="0" allowfullscreen></iframe>​

With its high resolution textures and several other visual tweaks, Shadow of Mordor’s open world is also one of the most detailed around. This means it puts a massive load on graphics cards and should help indicate which GPUs will excel at next-generation titles.




Thief


<iframe width="640" height="360" src="//www.youtube.com/embed/p-a-8mr00rY?rel=0" frameborder="0" allowfullscreen></iframe>​

When it was released, Thief was arguably one of the most anticipated games around. From a graphics standpoint, it is something of a tour de force. Not only does it look great but the engine combines several advanced lighting and shading techniques that are among the best we’ve seen. One of the most demanding sections is actually within the first level where you must scale rooftops amidst a thunder storm. The rain and lightning flashes add to the graphics load, though the lightning flashes occur randomly so you will likely see interspersed dips in the charts below due to this.


 

1440P: Tomb Raider

Tomb Raider


<iframe width="560" height="315" src="http://www.youtube.com/embed/okFRgtsbPWE" frameborder="0" allowfullscreen></iframe>​

Tomb Raider is one of the most iconic brands in PC gaming and this iteration brings Lara Croft back in DX11 glory. This happens to not only be one of the most popular games around but it is also one of the best looking by using the entire bag of DX11 tricks to properly deliver an atmospheric gaming experience.

In this run-through we use a section of the Shanty Town level. While it may not represent the caves, tunnels and tombs of many other levels, it is one of the most demanding sequences in Tomb Raider.



 
