
NVIDIA GeForce GTX 780 Ti 3GB Review

SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,857
Location
Montreal
When it was first announced at NVIDIA’s Montreal event, it was obvious that the GeForce GTX 780 Ti was an attempt to recapture the performance crown from AMD’s extremely capable R9 290X. However, NVIDIA does have an interesting situation to contend with since the 290X currently goes for $549 and competes well against the almighty TITAN, but hasn’t been widely available in the retail channels since its launch. This has left the door open for a more powerful, highly targeted alternative, and NVIDIA is ready to provide one.


The GTX 780 Ti is the card we’ve all been waiting for since Kepler was first announced as it represents the first GeForce-branded unveiling of NVIDIA’s fully enabled 7.1 billion transistor GK110 core. This means it has one more Streaming Multiprocessor active than the TITAN but it still retains the same 384-bit memory controller and ROP layout.

Having that extra SM grants access to some additional processing backbone in the form of 2880 CUDA cores, 240 texture units, additional L1 instruction cache, and another all-important PolyMorph Engine. Budding CUDA developers may be salivating right now but unlike the TITAN, the GTX 780 Ti’s double precision throughput has been neutered to 1/24 the single precision rate. With this in mind, double precision performance will be faster than a GTX 780 but significantly slower than a TITAN. This approach towards segmentation actually makes sense since NVIDIA wanted to keep some differentiation between the two cards.


With a bucket load of impressive specifications in tow, it goes without saying NVIDIA’s GTX 780 Ti is taking over TITAN’s mantle as the GeForce lineup’s flagship. In many ways, this card takes the GTX 780’s blueprint and turns all the settings up to eleven. There are more cores, additional TMUs and even higher Base and Boost frequencies. The memory layout remains at 3GB but overall speed has seen a stratospheric increase through the use of 7Gbps GDDR5 modules. It manages to accomplish this while operating at about the same TDP numbers as a GTX 780 due to enhanced power management routines.

While the R9 290X may have the GTX 780 Ti beat in terms of raw memory allotment with 4GB, NVIDIA’s latest card features more bandwidth. The smaller 3GB framebuffer may be the lone stumbling block since we are starting to see some games max out 3GB on other NVIDIA cards, particularly at 4K resolutions (more on that in an upcoming article), so it should be interesting to see where the GTX 780 Ti’s performance lands in the coming months.

The GTX 780 Ti’s $699 price may be a bit controversial since AMD’s R9 290X set the value bar quite high at $549 while the GTX 780 is a full $200 less expensive at $499. With that being said, NVIDIA is aiming quite high, claiming this card will easily beat AMD’s flagship and one-up the TITAN by a handy margin.

It should also be mentioned that NVIDIA has extended their Holiday Gaming Bundle to the GTX 780 Ti so buyers will receive three free games (Batman: Arkham City, Splinter Cell: Blacklist and Assassin’s Creed IV: Black Flag) along with a $100 rebate coupon for SHIELD. When combined with exclusive features like ShadowPlay and G-Sync, this may still be an enticing solution regardless of its premium.


While many things have changed in NVIDIA’s lineup, some have remained the same. The GTX 690 is still at $999 since it is an immensely powerful card that’s well supported by NVIDIA’s latest SLI profiles regardless of its age. TITAN meanwhile continues to be an excellent option for CUDA developers so NVIDIA will keep it around without touching its cost since it still provides phenomenal value for non-gaming markets that care about full speed double precision.

NVIDIA is hoping that enthusiasts will appreciate their approach to flagship graphics cards since performance isn’t their sole focus here. The GTX 780 Ti is expected to provide class-leading framerates while being more efficient and quieter than the R9 290X. Considering NVIDIA’s past track record in this area, we certainly have some high expectations and so should you.
 
A Closer Look at the GTX 780 Ti




The GTX 780 Ti follows the same exterior design as other high end NVIDIA cards with a striking all-metal heatsink shroud that’s been finished in powder coated nickel and matte black. Other than the GTX 780 Ti logo prominently displayed near the backplate, there would be nothing to distinguish it from the TITAN, GTX 780 or GTX 770. Even its 10.75” length is shared across most of NVIDIA’s GTX 700-series.


The GTX 780 Ti’s internal heatsink engineering has also been carried over from previous cards which is to say it is still light-years ahead of what AMD has recently launched. NVIDIA also adds some unique flourishes like an acrylic window which looks down onto the aluminum fin array and a GeForce GTX logo that’s backlit with LEDs. All in all, the GTX 780 Ti looks like a flagship product.


Connector-wise, we get another shot of déjà vu with a 6+8 pin power input layout as well as dual DVI, HDMI and DisplayPort outputs on the backplate. This allows the GTX 780 Ti to natively support triple monitor surround and 4K setups without adaptors.


With all of the GDDR5 modules being loaded onto the PCB’s upper side, the GTX 780 Ti’s back is relatively sparse, though there is some indication that NVIDIA is using a Tesla-inspired layout as evidenced by the power input pin-outs.

Despite a relatively basic 6+2 phase all-digital PWM and Samsung 7Gbps GDDR5 modules, there’s still plenty to distinguish the GTX 780 Ti from its siblings. One of the main differentiators is what NVIDIA calls Power Balancing. In a typical PWM design, the power is evenly split between the power connectors and PCI-E interface but when overclocking, one of these sources can be maxed out and thus impede an overclock that would otherwise be stable.

What Power Balancing does is dynamically switch power routing between sources so if one is pushed to its limit, the other will automatically take up the slack. This could in theory increase overclocking headroom but it won’t impact NVIDIA’s preset Power, Voltage and Temperature limits which are more often what cuts off an enthusiast’s clock speed endeavors.
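To make the idea concrete, here is a rough sketch of proportional power balancing in Python. This is purely illustrative and is not NVIDIA’s actual firmware logic; only the connector ratings (75W for the PCI-E slot and 6-pin, 150W for the 8-pin) and the card’s 250W TDP come from the specifications, everything else is our simplification.

```python
# Illustrative sketch (not NVIDIA's actual firmware logic): spread the
# board's total draw across its power sources so no single source hits
# its rated limit before the others.

# Rated limits in watts: PCI-E slot = 75W, 6-pin = 75W, 8-pin = 150W
LIMITS = {"pcie_slot": 75.0, "six_pin": 75.0, "eight_pin": 150.0}

def balance_power(total_draw):
    """Distribute total board power across sources proportionally to
    their limits, keeping every source at the same utilisation."""
    capacity = sum(LIMITS.values())  # 300W combined
    if total_draw > capacity:
        raise ValueError("Draw exceeds combined connector capacity")
    return {src: total_draw * lim / capacity for src, lim in LIMITS.items()}

split = balance_power(250.0)  # a GTX 780 Ti's 250W TDP
print({k: round(v, 1) for k, v in split.items()})
# Each source sits at ~83% utilisation instead of one maxing out at 100%.
```

In a fixed-split design, the same 250W load could push one connector to its ceiling while the others idle with headroom to spare; the proportional split above is one simple way to picture how rerouting that slack helps overclocking.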
 

GTX 780 Ti’s Clock Speeds; Consistency Personified



One of the major critiques leveled at AMD’s Hawaii architecture is the R9 290X’s and R9 290’s lack of clock speed consistency while playing games. As we saw in their respective reviews, both have a tendency to throttle back frequencies after a few minutes of gameplay because PowerTune’s algorithms prioritize power draw and temperatures in the face of inadequate cooling. This was completely understandable but we can’t forget that NVIDIA’s GPU Boost features a similar albeit more advanced approach to balancing clock speeds, temperatures and power requirements.

So how does NVIDIA’s potentially hot-running GTX 780 Ti fare when placed in the same situations? Let’s find out.


Taking a look at temperatures, it’s obvious that NVIDIA has their targets set substantially lower than AMD’s. It seems like AMD needed to run their architecture as hot as possible in order to achieve the highest possible clock speeds on a reference cooler. NVIDIA meanwhile is able to utilize their heatsink engineering to the fullest of its capabilities without massive increases in the fan’s rotational speed. Thus, temperatures never venture above the 85°C mark.


NVIDIA’s GTX 780 Ti sports a fully enabled GK110 core so we half expected it to clamp down quite hard on clock speeds but that didn’t happen. It remained consistent throughout testing with a core frequency that actually exceeded NVIDIA’s specified Boost Clock by a good 40MHz. Seeing a situation of under-promising and over-delivering is great.

AMD’s R9 290X on the other hand doesn’t come with a Base Clock and the reason for that is obvious: its clock speeds are all over the place. We’ll be looking into sample to sample frequency comparisons in an upcoming article but for the time being, let’s just say that the frequencies you see above are anything but consistent when looking at a larger sample size than just one card.


The GTX 780 Ti’s performance is a model of consistency which should have been evident by now considering the results we’ve already seen in this section. The R9 290X does match its continual framerate but only when used in Uber Mode which boosts fan speeds to uncomfortable levels in an effort to achieve the highest possible stable clock speed.

If anything, these tests should be vindication for NVIDIA’s approach to their GPU Boost algorithms and their commitment to deliver a pleasant gaming experience. The GTX 780 Ti could have been a loud mess which performed even faster but instead it remains a relatively docile card that can still pump out framerates without sacrificing in other key areas.
 

Test System & Setup

Main Test System

Processor: Intel Core i7-3930K @ 4.5GHz
Memory: Corsair Vengeance 32GB @ 1866MHz
Motherboard: ASUS P9X79 WS
Cooling: Corsair H80
SSD: 2x Corsair Performance Pro 256GB
Power Supply: Corsair AX1200
Monitor: Samsung 305T / 3x Acer 235Hz
OS: Windows 7 Ultimate N x64 SP1


Acoustical Test System

Processor: Intel Core i7-2600K @ stock
Memory: G.Skill Ripjaws 8GB 1600MHz
Motherboard: Gigabyte Z68X-UD3H-B3
Cooling: Thermalright TRUE Passive
SSD: Corsair Performance Pro 256GB
Power Supply: Seasonic X-Series Gold 800W


Drivers:
NVIDIA 331.70 Beta
AMD 13.11 v8 Beta



*Notes:

- All games tested have been patched to their latest version

- The OS has had all the latest hotfixes and updates installed

- All scores you see are the averages after 3 benchmark runs

- All IQ settings were adjusted in-game and all GPU control panels were set to use application settings


The Methodology of Frame Testing, Distilled


How do you benchmark an onscreen experience? That question has plagued graphics card evaluations for years. While framerates give an accurate measurement of raw performance, there’s a lot more going on behind the scenes which a basic frames per second measurement by FRAPS or a similar application just can’t show. A good example of this is how “stuttering” can occur but may not be picked up by typical min/max/average benchmarking.

Before we go on, a basic explanation of FRAPS’ frames per second benchmarking method is important. FRAPS determines FPS rates by simply logging and averaging out how many frames are rendered within a single second. The average framerate measurement is taken by dividing the total number of rendered frames by the length of the benchmark being run. For example, if a 60 second sequence is used and the GPU renders 4,000 frames over the course of that time, the average result will be 66.67FPS. The minimum and maximum values meanwhile are simply two data points representing single second intervals which took the longest and shortest amount of time to render. Combining these values together gives an accurate, albeit very narrow snapshot of graphics subsystem performance and it isn’t quite representative of what you’ll actually see on the screen.
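The arithmetic above can be sketched in a few lines of Python. This is our own simplified illustration of FRAPS-style accounting, not FRAPS itself; the per-second log values are made up to total the 4,000 frames from the example.

```python
# Minimal sketch of FRAPS-style FPS accounting: per-second frame counts
# are summed and averaged, and min/max are just the slowest and fastest
# single-second intervals.

def fraps_style_summary(frames_per_second_log):
    total_frames = sum(frames_per_second_log)
    seconds = len(frames_per_second_log)
    return {
        "average": total_frames / seconds,
        "minimum": min(frames_per_second_log),
        "maximum": max(frames_per_second_log),
    }

# A 60-second run totalling 4,000 frames, as in the example above.
log = [66] * 40 + [68] * 20   # 2640 + 1360 = 4000 frames
summary = fraps_style_summary(log)
print(summary)  # average works out to ~66.67 FPS
```

Notice how little this reveals: the average, minimum and maximum say nothing about what happened *within* any given second, which is exactly the blind spot described above.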

FCAT on the other hand has the capability to log onscreen average framerates for each second of a benchmark sequence, resulting in the “FPS over time” graphs. It does this by simply logging the reported framerate result once per second. However, in real world applications, a single second is actually a long period of time, meaning the human eye can pick up on onscreen deviations much quicker than this method can actually report them. So what actually happens within each second of time? A whole lot, since each second of gameplay can consist of dozens or even hundreds (if your graphics card is fast enough) of frames. This brings us to frame time testing and where the Frame Time Analysis Tool gets factored into this equation.

Frame times simply represent the length of time (in milliseconds) it takes the graphics card to render and display each individual frame. Measuring the interval between frames allows for a detailed millisecond by millisecond evaluation of frame times rather than averaging things out over a full second. The higher the frame time, the longer that individual frame took to render and the lower the instantaneous framerate. This detailed reporting just isn’t possible with standard benchmark methods.

We are now using FCAT for ALL benchmark results.


Frame Time Testing & FCAT

To put a meaningful spin on frame times, we can equate them directly to framerates. A constant 60 frames across a single second would lead to an individual frame time of 1/60th of a second or about 17 milliseconds, 33ms equals 30 FPS, 50ms is about 20FPS and so on. Contrary to framerate evaluation results, in this case higher frame times are actually worse since they would represent a longer interim “waiting” period between each frame.
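The conversion is simply 1000 divided by the value in question, in either direction. A quick sketch to verify the numbers quoted above:

```python
def frame_time_to_fps(ms):
    """Convert a single frame time in milliseconds to the equivalent
    sustained framerate."""
    return 1000.0 / ms

def fps_to_frame_time(fps):
    """Convert a framerate to the per-frame interval in milliseconds."""
    return 1000.0 / fps

print(round(fps_to_frame_time(60), 1))   # 60 FPS -> ~16.7ms per frame
print(round(frame_time_to_fps(33), 1))   # 33ms -> ~30 FPS
print(round(frame_time_to_fps(50), 1))   # 50ms -> 20 FPS
```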

With the milliseconds to frames per second conversion in mind, the “magical” maximum number we’re looking for is 28ms or about 35FPS. If too much time is spent above that point, performance suffers and the in-game experience will begin to degrade.

Consistency is a major factor here as well. Too much variation in adjacent frames could induce stutter or slowdowns. For example, spiking up and down from 13ms (75 FPS) to 28ms (35 FPS) several times over the course of a second would lead to an experience which is anything but fluid. However, even though deviations between slightly lower frame times (say 10ms and 25ms) wouldn’t be as noticeable, some sensitive individuals may still pick up a slight amount of stuttering. As such, the less variation the better the experience.
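Both checks described above can be expressed as a short script. The 28ms ceiling comes from the text; the 15ms adjacent-frame swing threshold is our own illustrative cutoff, not an industry standard.

```python
# Hedged sketch of the consistency checks described above: flag frames
# over the 28ms (35 FPS) ceiling and count large adjacent-frame swings.
# The 15ms swing threshold is an illustrative choice on our part.

def analyse_frame_times(frame_times_ms, limit_ms=28.0, swing_ms=15.0):
    slow = [t for t in frame_times_ms if t > limit_ms]
    swings = [abs(b - a)
              for a, b in zip(frame_times_ms, frame_times_ms[1:])
              if abs(b - a) > swing_ms]
    return {
        "pct_over_limit": 100.0 * len(slow) / len(frame_times_ms),
        "large_swings": len(swings),
    }

# Alternating ~13ms (75 FPS) and ~29ms (34 FPS) frames: the average FPS
# looks respectable, but the adjacent-frame swings reveal the stutter.
trace = [13.0, 29.0] * 20
print(analyse_frame_times(trace))
```

A plain average of this trace would sit around a comfortable-sounding 48FPS, which is exactly why frame-to-frame variation has to be examined separately.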

In order to determine accurate onscreen frame times, a decision has been made to move away from FRAPS and instead implement real-time frame capture into our testing. This involves the use of a secondary system with a capture card and an ultra-fast storage subsystem (in our case five SanDisk Extreme 240GB drives hooked up to an internal PCI-E RAID card), connected to our primary test rig via a DVI splitter. Essentially, the capture card records a high bitrate video of whatever is displayed from the primary system’s graphics card, allowing us to get a real-time snapshot of what would normally be sent directly to the monitor. By using NVIDIA’s Frame Capture Analysis Tool (FCAT), each and every frame is dissected and then processed in an effort to accurately determine latencies, frame rates and other aspects.

We've also now transitioned all testing to FCAT which means standard frame rates are also being logged and charted through the tool. This means all of our frame rate (FPS) charts use onscreen data rather than the software-centric data from FRAPS, ensuring dropped frames are taken into account in our global equation.
 

Assassin’s Creed III / Crysis 3

Assassin’s Creed III (DX11)


Video: http://www.youtube.com/watch?v=RvFXKwDCpBI

The third iteration of the Assassin’s Creed franchise is the first to make extensive use of DX11 graphics technology. In this benchmark sequence, we proceed through a run-through of the Boston area which features plenty of NPCs, distant views and high levels of detail.


2560 x 1440





Crysis 3 (DX11)


Video: http://www.youtube.com/watch?v=zENXVbmroNo

Simply put, Crysis 3 is one of the best looking PC games of all time and it demands a heavy system investment before even trying to enable higher detail settings. Our benchmark sequence for this one replicates a typical gameplay condition within the New York dome and consists of a run-through interspersed with a few explosions for good measure. Due to the hefty system resource needs of this game, post-process FXAA was used in place of MSAA.


2560 x 1440


 

Dirt: Showdown / Far Cry 3

Dirt: Showdown (DX11)


Video: http://www.youtube.com/watch?v=IFeuOhk14h0

Among racing games, Dirt: Showdown is somewhat unique since it deals with demolition-derby type racing where the player is actually rewarded for wrecking other cars. It is also one of the many titles which falls under the Gaming Evolved umbrella so the development team has worked hard with AMD to implement DX11 features. In this case, we set up a custom 1-lap circuit using the in-game benchmark tool within the Nevada level.


2560 x 1440





Far Cry 3 (DX11)


Video: http://www.youtube.com/watch?v=mGvwWHzn6qY

One of the best looking games in recent memory, Far Cry 3 has the capability to bring even the fastest systems to their knees. Its use of nearly the entire repertoire of DX11’s tricks may come at a high cost but with the proper GPU, the visuals will be absolutely stunning.

To benchmark Far Cry 3, we used a typical run-through which includes several in-game environments such as a jungle, in-vehicle and in-town areas.



2560 x 1440


 

Hitman Absolution / Max Payne 3

Hitman Absolution (DX11)


Video: http://www.youtube.com/watch?v=8UXx0gbkUl0

Hitman is arguably one of the most popular FPS (first person “sneaking”) franchises around and this time around Agent 47 goes rogue so mayhem soon follows. Our benchmark sequence is taken from the beginning of the Terminus level which is one of the most graphically-intensive areas of the entire game. It features an environment virtually bathed in rain and puddles making for numerous reflections and complicated lighting effects.


2560 x 1440





Max Payne 3 (DX11)


Video: http://www.youtube.com/watch?v=ZdiYTGHhG-k

When Rockstar released Max Payne 3, it quickly became known as a resource hog and that isn’t surprising considering its top-shelf graphics quality. This benchmark sequence is taken from Chapter 2, Scene 14 and includes a run-through of a rooftop level featuring expansive views. Due to its random nature, combat is kept to a minimum so as to not overly impact the final result.


2560 x 1440


 

Metro: Last Light / Tomb Raider

Metro: Last Light (DX11)


Video: http://www.youtube.com/watch?v=40Rip9szroU

The latest iteration of the Metro franchise once again sets high water marks for graphics fidelity and makes use of advanced DX11 features. In this benchmark, we use the Torchling level which represents a scene you’ll be intimately familiar with after playing this game: a murky sewer underground.


2560 x 1440




Tomb Raider (DX11)


Video: http://www.youtube.com/watch?v=okFRgtsbPWE

Tomb Raider is one of the most iconic brands in PC gaming and this iteration brings Lara Croft back in DX11 glory. This happens to not only be one of the most popular games around but it is also one of the best looking by using the entire bag of DX11 tricks to properly deliver an atmospheric gaming experience.

In this run-through we use a section of the Shanty Town level. While it may not represent the caves, tunnels and tombs of many other levels, it is one of the most demanding sequences in Tomb Raider.


2560 x 1440


 

Onscreen Frame Times w/FCAT



When capturing output frames in real-time, there are a number of eccentricities which wouldn’t normally be picked up by FRAPS but are nonetheless important to take into account. For example, some graphics solutions can either partially display a frame or drop it altogether. While both situations may sound horrible, these so-called “runts” and dropped frames will be completely invisible to someone sitting in front of a monitor. However, since FRAPS counts them as full frames, they get factored into its results nonetheless, potentially giving numbers that don’t reflect what’s actually being displayed.

With certain frame types being non-threatening to the overall gaming experience, we’re presented with a simple question: should the fine-grain details of these invisible runts and dropped frames be displayed outright or should we show a more realistic representation of what you’ll see on the screen? Since Hardware Canucks is striving to evaluate cards based upon an end-user experience rather than from a purely scientific standpoint, we decided on the latter of these two methods.

With this in mind, we’ve used the FCAT tools to add the timing of partially rendered frames to the latency of successive frames. Dropped frames meanwhile are ignored as their value is zero. This provides a more realistic snapshot of visible fluidity.
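As a rough illustration of that policy, here is a small sketch in Python. Note this is our own simplification: FCAT actually identifies runts by counting displayed scanlines in the captured video, whereas here we simply treat any frame below an arbitrary millisecond threshold as a runt.

```python
# Simplified sketch of the runt/drop policy described above. A dropped
# frame contributes zero display time; a "runt" is a frame shown for
# only a sliver of time. Both thresholds here are illustrative.

RUNT_THRESHOLD_MS = 1.0   # illustrative cutoff for a partially shown frame

def fold_runts(frame_times_ms):
    """Ignore dropped frames and fold each runt's display time into
    the following full frame, as described in the text above."""
    cleaned, carry = [], 0.0
    for t in frame_times_ms:
        if t == 0.0:               # dropped frame: never visible, ignore
            continue
        if t < RUNT_THRESHOLD_MS:  # runt: carry its time into the next frame
            carry += t
            continue
        cleaned.append(t + carry)
        carry = 0.0
    return cleaned

print(fold_runts([16.0, 0.5, 16.0, 0.0, 16.0]))
# -> [16.0, 16.5, 16.0]: the runt's 0.5ms joins the next frame, the drop vanishes
```

The cleaned sequence is what gets charted: three visible frames with their true onscreen durations, rather than five "frames" that never all reached the display.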





 


Onscreen Frame Times w/FCAT (pg.2)


The same runt and dropped frame handling described on the previous page applies to these results as well.





 