
NVIDIA GeForce GTX 680 2GB Review


SKYMTL

HardwareCanuck Review Editor
Staff member
Joined
Feb 26, 2007
Messages
12,900
Location
Montreal
Unbeknownst to many, the design of a modern GPU architecture is a long, drawn out process that involves hundreds of engineers, thousands of software architects and a healthy dose of assumption. NVIDIA's newest Kepler architecture is a prime example of this; the core that lies within the GeForce GTX 680 began its life as a rough schematic about five years ago. As they say, Rome wasn’t built in a day, but in the GPU world “guesstimates” are a way of life because no one really knows exactly where the market (or the competition) will be in half a decade's time. Most of the time the architectural teams have a good idea of directionality, but there’s always a significant amount of risk when it comes to releasing a new GPU core.

At its heart Kepler was conceived as a way to further refine the DirectX 11 and HPC centric approach that began with Fermi. You see, unlike AMD, NVIDIA already had the solid foundation of an existing DX11 architecture to build upon and was able to focus upon rendering efficiency and performance per watt this time around. In many ways Kepler can be considered a kind of “Fermi 2.0” since it still uses many of the same building blocks as its predecessor, but as we will see on the upcoming pages, nearly every one of the rendering pipeline’s features has been augmented in some way. More importantly, NVIDIA’s initial offering, the GK104-based GTX 680, is smaller and more efficient than AMD’s own Tahiti XT.


For the time being the GTX 680 will occupy the flagship spot in NVIDIA’s lineup, and with good reason. It boasts 1536 CUDA cores (a threefold increase over the GTX 580) while the texture units have been doubled to 128, matching the HD 7970’s layout. On the other hand, the ROP count has dropped to 32, but as with many things in the Kepler architecture, the interaction between certain processing stages and these units has been refined, resulting in better throughput. We can also see that NVIDIA has halved the PolyMorph Engine count. On paper this should lead to a 50% reduction in tessellation performance, but the fixed function stages of Kepler have received a thorough facelift, making them substantially more powerful than those in previous generations.

Some of the most noticeable changes are found in the GTX 680’s clock speeds. The asynchronous graphics and shader clocks of past generations are now a thing of the past, with both domains running at a 1:1 ratio. This unification has led to a much faster graphics clock of just over 1GHz, though the shaders now operate at a lower speed than on many Fermi-based cards.


With the introduction of the GTX 680, NVIDIA is also premiering a new technology which they affectionately call GPU Boost. Learn to love this term because you’ll likely be seeing a lot of it in the coming months. GPU Boost acts like an overdrive gear for the GPU core, allowing it to dynamically increase clock speeds in situations where the architecture isn’t fully utilized. We go into further detail in a dedicated section, but for now, 1058MHz is the rated Boost speed and most applications will run at or above it.

Along with a core clock speed that makes AMD’s “GHz Edition” marketing seem like nothing more than a gimmick, the GTX 680 boasts 2GB of some of the fastest GDDR5 memory around, rated at 6Gbps. This is paired up with a 256-bit interface, which does come as a surprise for a flagship level product, but when combined with those blistering 6Gbps speeds the GTX 680 offers the same memory bandwidth as the outgoing 384-bit GTX 580. Hopefully the additional 512MB of memory allows this card to overcome the high resolution performance limitations of its predecessor. We just can’t forget that AMD’s card still sits atop the market with a staggering 264GB/s of bandwidth on tap.
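The bandwidth comparison is easy to verify with a little arithmetic: peak throughput is simply the bus width in bytes multiplied by the effective data rate. A quick sketch, with the GTX 680 and HD 7970 figures taken from this page; the GTX 580's roughly 4Gbps effective rate is a commonly cited number we're assuming here:

```python
# Peak memory bandwidth: (bus width in bytes) x (effective data rate in Gbps).
def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

print(bandwidth_gbs(256, 6.0))   # GTX 680: 192.0 GB/s
print(bandwidth_gbs(384, 4.0))   # GTX 580 (~4 Gbps effective): ~192 GB/s
print(bandwidth_gbs(384, 5.5))   # HD 7970: 264.0 GB/s
```

Which is exactly why the narrower 256-bit bus ends up a wash: the 50% higher data rate cancels out the 33% narrower interface.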


Having learned early on that adding a massive amount of geometry processing and compute horsepower to a GPU architecture invariably increases die size and decreases overall efficiency, NVIDIA has been able to optimize several aspects of the GK104 core to better fit within the market’s new realities. The result is a TDP of just 195W which undercuts the HD 7970’s supposed 210W power draw and bucks a longstanding trend which had NVIDIA always releasing less efficient cards than AMD.

With a die size of just 294mm², the GK104 should also be quite inexpensive to manufacture (when compared against Fermi and Tahiti) and NVIDIA’s pricing structure reflects this. Ready for a shock? Instead of carrying on a trend that led to a gradual increase in high end GPU prices, the GTX 680 actually undercuts AMD’s HD 7970 by $50. Not only should this lead to lower costs for the entire graphics card market once NVIDIA cascades the Kepler architecture down into more accessible price points, but high level GPU performance just became that much more affordable. That isn’t to say the GTX 680 will underperform, though.

Interestingly enough, NVIDIA isn’t going for a complete knockout punch against AMD’s HD 7970 on the performance front. While the GTX 680 is indeed meant to beat its competitor’s flagship, it isn’t necessarily supposed to do so by a significant amount in every game. This may sound completely at odds with NVIDIA’s old mantra of performance at any cost, but they believe a focus upon efficiency and cost meshes seamlessly with the current post financial market meltdown realities. Make no mistake about it; the GTX 680 will be the fastest GPU on the planet, but its foremost goal is to run against many people’s preconceptions about NVIDIA’s graphics cards and chart a new course for the GeForce lineup.
 

Putting it all Together; Kepler’s Core Revealed



In many ways, the genesis of the Kepler architecture came with the realization that Fermi’s large die, monolithic approach on an inefficient 40nm process node would cause long term issues on a number of fronts. Not only did the first Fermi-based GF100 cards consume loads of power but their cores were expensive to produce due to the inherent yield issues that arose when producing a 529mm² die with 3.2 billion transistors. Unfortunately, creating a whole new architecture from the ground up necessitated these sacrifices but things obviously needed to change in a big way in order to bring the GTX 680 and its ilk up to modern standards.

The one thing making this all possible is the switch to TSMC’s 28nm manufacturing process which allows for a ton of transistors to be packed into a relatively small die area. In the case of the GK104, we’re talking about 3.5 billion transistors in an area of just 294mm², which makes it slightly more than half the size of the 3 billion transistor GF110 and a good 20% smaller than Tahiti. This goes to illustrate just how far we’ve come in the last two years.
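Those proportions check out with some quick division. In the sketch below, the GK104's 294mm² comes from this review, while GF110's roughly 520mm² and Tahiti's roughly 365mm² are commonly cited figures we're assuming for the comparison:

```python
# Die areas in mm^2: GK104 from the review; GF110 and Tahiti are
# commonly cited figures (assumptions, not from the text).
gk104, gf110, tahiti = 294, 520, 365

print(round(gk104 / gf110, 2))       # 0.57 -> "slightly more than half"
print(round(1 - gk104 / tahiti, 2))  # 0.19 -> "a good 20% smaller"
```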

Another interesting aspect of this new architecture is the fact that NVIDIA’s first card out of the gate doesn’t boast what would be called a “fully enabled” core. While the GK104 doesn’t necessarily come with any of its elements disabled, its code name seems to indicate that NVIDIA may have an even higher performance core waiting in the wings. Indeed, with a TDP of just 195 watts, the GTX 680 still has loads of overhead before it would hit the 250W plateau.


The GK104 core is segmented into four distinct groups called Graphics Processing Clusters or GPCs, which are in turn broken down into individual Streaming Multiprocessors (SMs), raster engines and so on. Each of the GPCs can be considered a dedicated compute unit all on its own with self contained processing engines and rendering stages. This generalized layout and its functionality haven’t changed much from the Fermi architecture, but Kepler builds upon the lessons learned from its predecessors in several ways.

There is now a total of 1536 CUDA cores, a drastic increase over the 512 present within the GF110. Regardless of the physical die space this change requires, lowering clock speeds while increasing the core count has allowed NVIDIA to realize a twofold improvement in performance per watt. In effect, NVIDIA is taking advantage of the 28nm manufacturing process by trading power hungry clock speeds for a wider, more parallel design.
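The twofold performance-per-watt claim survives some napkin math. The sketch below counts two FLOPs per core per clock (fused multiply-add) and assumes the commonly cited board specs: shader clocks of 1006MHz and 1544MHz and TDPs of 195W and 244W for the GTX 680 and GTX 580 respectively; only the 195W figure comes from this page.

```python
# Peak single-precision throughput: cores x clock x 2 FLOPs (FMA), in GFLOPS.
def gflops(cores: int, clock_mhz: float) -> float:
    return cores * clock_mhz * 2 / 1000

gtx680 = gflops(1536, 1006) / 195   # ~15.8 GFLOPS per watt
gtx580 = gflops(512, 1544) / 244    # ~6.5 GFLOPS per watt
print(round(gtx680 / gtx580, 1))    # ~2.4x on paper
```

Real-world gains depend on how well games fill those extra cores, but the direction of the claim holds up.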

The Raster Engines haven’t undergone a facelift per se and they still work in a highly parallelized fashion but several other architectural changes like a drastic speed increase for the PolyMorph engines meant their functionality had to be rationalized. As such, processing efficiency was increased in order to better handle the workload being thrown in their direction.

With high level geometry rendering taking up many of the headlines, texture throughput may not be something that’s discussed all that much these days but it still plays an integral role in game performance. NVIDIA has massaged this portion of their architecture as well by upping the Texture Unit count from 64 in the GF110 to 128 in the new Kepler-based GK104. As we drill down into each SMX on the next page, we will see how additional refinements have been integrated within each processing stage to ensure proper load balancing and optimal efficiency.


On the periphery of the GK104 die is the GigaThread Engine along with the memory controllers. The GigaThread Engine performs the somewhat thankless duty of reading the CPU’s commands over the new PCI-E 3.0 host interface and then fetching data from the system’s main memory bank. The data is then copied over onto the framebuffer of the graphics card itself before being passed along to the designated engine within the core.

Speaking of memory, things have changed here as well but some will think NVIDIA took a step backwards. Even though AMD’s Tahiti architecture and the previous generation GF100 / GF110 all incorporated 384-bit memory interfaces, Kepler (for the time being at least) makes do with a 256-bit interface spread across a quartet of 64-bit GDDR5 memory controllers. However, in order to compensate for this NVIDIA has updated the controllers themselves for compatibility with 6Gbps and higher memory ICs so the GK104’s combined bandwidth is still on par with the GTX 580’s.


Each of the 64-bit memory controllers is paired up with 128KB of L2 cache and a ROP partition, which mirrors the layout from previous generations, but once again there are some significant differences built into Kepler that allow for higher performance. While the L2 cache may only total 512KB, its bandwidth has been doubled, eliminating any potential bottlenecks and allowing for much quicker data handoffs between the various rendering stages.

The design of the ROP units has also been rationalized with the 32 ROPs being divided into four groups of eight. At first glance, this may not seem like a noteworthy change since GF110 also incorporated 32 ROPs but the differences between past and present architectures are significant. Since each of the ROP units is driven by its associated GPC, both GF114 and GF110 couldn’t take full advantage of the Render Output Units due to asynchronous throughput. For example, the GF114 had two GPCs running at 16 pixels per clock while the four ROP partitions had a render throughput of 32 samples per clock. This setup allowed the extra ROPs to help out in some cases -particularly with uncompressed AA rendering- but actually led to several pipeline inefficiencies that hurt overall performance.

Kepler meanwhile has a 1:1 ratio between raster and ROP units, allowing the GPCs to process 32 pixels per clock, which lines up perfectly with the 32 samples per clock of the ROP partitions. The result is a twofold speedup in aliased and compressed anti aliased rendering alike, even though uncompressed AA performance hasn’t been touched.
 

The SMX: Kepler’s Building Block




Much like Fermi, Kepler uses a modular architecture which is structured into dedicated, self contained compute / graphics units called Streaming Multiprocessors or in this case Extreme Streaming Multiprocessors. While the basic design and implementation principles may be the same as the previous generation (other than doubling up the parallel threading capacity that is), several changes have been built into this version that help it further maximize performance and consume less power than its predecessor.

Due to die space limitations on the 40nm manufacturing process, the Fermi architecture had to cope with fewer CUDA cores, but NVIDIA offset this shortcoming by running those cores at a higher speed than the rest of the processing stages. The result was a 1:2 graphics to shader clock ratio that led to excellent performance but unfortunately high power consumption numbers.

As we already mentioned, the inherent efficiencies of TSMC’s 28nm manufacturing process have allowed Kepler’s SMX to take a different path by offering six times the number of processors but running their clocks at a 1:1 ratio with the rest of the core. Essentially we are left with core components that run at slower speeds, but sheer volume more than makes up for that limitation. In theory this should lead to an increase in raw processing power for graphics intensive workloads and higher performance per watt, even though the CUDA cores’ basic functionality and throughput haven’t changed.

Each SMX holds 192 CUDA cores along with 32 load / store units, allowing 32 load / store operations to be processed per clock. Alongside these core blocks are the Warp Schedulers and their associated dispatch units, which can juggle up to 64 concurrent groups of 32 threads (called warps) across the cores, while the primary register file sits at 65,536 x 32-bit entries. All of these numbers have been increased twofold over the previous generation to avoid bottlenecks now that each SMX’s CUDA core count is so high.
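Multiplying the per-SMX figures out against the totals quoted earlier makes for a useful sanity check. GK104's SMX count of eight isn't stated outright on this page, but it follows from the eight PolyMorph engines (one per SMX) discussed below:

```python
# GK104 ships with 8 SMX units (inferred from its eight PolyMorph engines).
SMX_COUNT = 8

print(SMX_COUNT * 192)  # 1536 CUDA cores, matching the card's spec sheet
print(SMX_COUNT * 16)   # 128 texture units, matching the HD 7970's layout
```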

NVIDIA’s ubiquitous PolyMorph geometry engine has gone through a redesign as well. Each engine still contains five stages from Vertex Fetch to the Stream Output which process data from the SMX they are associated with. The data then gets output to the Raster Engine within each Graphics Processing Cluster. In order to further speed up operations, data is dynamically load balanced and goes from one of eight PolyMorph engines to another through the on-die caching infrastructure for increased communication speed.

The main difference between the current and past generation PolyMorph engines boils down to data stream efficiency. The new “2.0” version in the Kepler core boasts primitive rates that are two times higher and, along with other improvements throughout the architecture, offers a fourfold increase in tessellation performance over the Fermi-based cores.


The SMX plays host to a dedicated caching network which runs parallel to the primary core stages in order to help store draw calls so they are not passed off through the card’s memory controllers, taking up valuable storage space. Not only does this help with geometry processing efficiency but GPGPU performance can also be drastically increased provided an API can take full advantage of the caching hierarchy.

As with Fermi, each one of Kepler’s SMX blocks has 64KB of shared, programmable on-chip memory that can be configured in one of three ways: as 48KB of shared memory with 16KB of L1 cache, or as 16KB of shared memory with 48KB of L1 cache, while Kepler adds a third 32KB / 32KB mode which balances out the configuration for situations where the core may be processing graphics in parallel with compute tasks. This L1 cache is supposed to help with access to the on-die L2 cache as well as streamlining functions like stack operations and global loads / stores. However, in total the GK104 has fewer SMXs than Fermi, which results in significantly less on-die memory. This could negatively impact compute performance in some instances.

Even though there haven’t been any fundamental changes in the way textures are handled across the Kepler architecture, each SMX receives a huge influx of texture units: 16, up from four in each of Fermi’s SMs. Hopefully this will help in certain texture heavy rendering situations, particularly in DX11 environments.
 

GPU Boost; Dynamic Clocking Comes to Graphics Cards



Turbo Boost was first introduced into Intel’s CPUs years ago and through a successive number of revisions, it has become the de facto standard for situation dependent processing performance. In layman’s terms Turbo Boost allows Intel’s processors to dynamically fluctuate their clock speeds based upon operational conditions, power targets and the demands of certain programs. For example, if a program only demanded a pair of a CPU’s six cores the monitoring algorithms would increase the clock speeds of the two utilized cores while the others would sit idle. This sets the stage for NVIDIA’s new feature called GPU Boost.


Before we go on, let’s explain one of the most important factors in determining how high a modern high end graphics card can clock: a power target. Typically, vendors like AMD and NVIDIA set this in such a way that ensures an ASIC doesn’t overshoot a given TDP value, putting undue stress upon its included components. Without this, board partners would have one hell of a time designing their cards so they wouldn’t overheat, pull too much power from the PWM or overload a PSU’s rails.

While every game typically strives to take advantage of as many GPU resources as possible, many don’t fully utilize every element of a given architecture. As such, some processing stages may sit idle while others are left to do the majority of rendering, post processing and other tasks. As in our Intel Turbo Boost example, this situation results in lower heat production and reduced power consumption, and will ultimately cause the GPU core to fall well short of its predetermined power (or TDP) target.

In order to take advantage of this NVIDIA has set their “base clock” –or reference clock- in line with a worst case scenario which allows for a significant amount of overhead in typical games. This is where the so-called GPU Boost gets worked into the equation. Through a combination of software and hardware monitoring GPU Boost fluctuates clock speeds in an effort to run as close as possible to the GK104’s TDP of 195W. When gaming, this monitoring algorithm will typically result in a core speed that is higher than the stated base clock.


Unfortunately, things do get a bit complicated since we are now talking about two clock speeds, one of which may vary from one application to another. The “Base Clock” is the minimum speed at which the core is guaranteed to run, regardless of the application being used. Granted, there may be some power viruses out there which will push the card beyond even these limits but the lion’s share of games and even most synthetic applications will have no issue running at or above the Base Clock.

The “Boost Clock” meanwhile is the typical speed at which the core will run in non-TDP limited applications. As you can imagine, this value will fluctuate higher and lower depending on how close the core is running to its power target. However, NVIDIA likens the Boost Clock rating to a happy medium that nearly every game will achieve, at a minimum. For those of you wondering, both the Base Clock and the Boost Clock will be advertised on all Kepler-based cards; on the GTX 680 the values are 1006MHz and 1058MHz respectively.
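To make the Base/Boost interplay concrete, here is a deliberately simplified toy model of the feedback loop: every polling interval the clock steps up while there is power headroom and backs off when the estimate exceeds the target, never dropping below the base clock. The 13MHz step and the linear power curve are invented for illustration; NVIDIA's actual algorithm is more sophisticated:

```python
# Toy model of a GPU Boost style feedback loop. All constants and the
# linear power model are illustrative assumptions, not NVIDIA's algorithm.
BASE_MHZ, POWER_TARGET_W, STEP_MHZ = 1006, 195, 13

def estimated_power(clock_mhz: int, load: float) -> float:
    # Crude assumption: board power scales with clock and workload intensity.
    return load * clock_mhz * 0.21

def next_clock(clock_mhz: int, load: float) -> int:
    if estimated_power(clock_mhz + STEP_MHZ, load) <= POWER_TARGET_W:
        return clock_mhz + STEP_MHZ                  # headroom left: boost
    if estimated_power(clock_mhz, load) > POWER_TARGET_W:
        return max(BASE_MHZ, clock_mhz - STEP_MHZ)   # over target: back off
    return clock_mhz                                 # hold steady

clock = BASE_MHZ
for _ in range(10):                  # a light workload lets clocks climb
    clock = next_clock(clock, load=0.85)
print(clock)                         # settles at 1084 under this toy model
```

A heavier workload (a larger `load` value) would pin the clock at or near the base, which is exactly the worst case scenario NVIDIA designed the base clock around.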

GPU Boost differs from AMD’s PowerTune in a number of ways. While AMD sets their base clock off of a typical in-game TDP scenario and throttles performance if an application exceeds these predetermined limits, NVIDIA has taken a more conservative approach to clock speeds. Their base clock is the minimum level at which their architecture will run under worst case conditions, and this allows for a clock speed increase in most games rather than throttling.

In order to better give you an idea of how GPU Boost operates, we logged clock speeds and Power Use in Dirt 3 and 3DMark11 using EVGA’s new Precision X utility.



In both of the situations above the clock speeds tend to fluctuate as the core moves closer to and further away from its maximum power limit. Since the reaction time of the GPU Boost algorithm is about 100ms, there are situations when clock speeds don’t line up with power use, causing a minor peak or valley but for the most part both run in perfect harmony. This is most evident in the 3DMark11 tests where we see the GK104’s ability to run slightly above the base clock in a GPU intensive test and then boost up to even higher levels in the Combined Test which doesn’t stress the architecture nearly as much.


According to NVIDIA, lower temperatures could promote higher GPU Boost clocks, but even by increasing our sample’s fan speed to 100% we couldn’t achieve higher Boost speeds. We’re guessing that high end forms of water cooling would be needed to give this feature more headroom and, according to some board partners, benefits could be seen once temperatures drop below 70 degrees Celsius. However, the default GPU Boost / power offset NVIDIA built into their core seems to leave more than enough wiggle room to ensure that all reference-based cards should behave in the same manner.

There may be a bit of variance from the highest to the lowest leakage parts but the resulting dropoff in Boost clocks will never be noticeable in-game. This is why the boost clock is so conservative; it strives to stay as close as possible to a given point so power consumption shouldn’t fluctuate wildly from one application to another. But will this cause performance differences from one reference card to another? Again, absolutely not unless they are running at abnormally hot or very cool temperatures.
 

Smoother Gaming Through Adaptive VSync



In a market that seems eternally obsessed with high framerates, artificially capping performance at certain levels by enabling Vertical Synchronization (or VSync) may seem like a cardinal sin. In simplified terms, VSync sets the framerate within games to the refresh rate of the monitor, which means games running on 60Hz monitors will achieve framerates no higher than 60FPS. 120Hz panels raise this ceiling to 120FPS, but monitors sporting the technology are few in number and usually come with astronomical price points.


With today’s graphics cards pushing boundaries that weren’t even dreamed of a few years ago, gamers usually want to harness every last drop of their latest purchase. Because of this, alongside the possible input lag issues VSync causes, many gamers choose to disable VSync altogether. However, there are some noteworthy issues associated with running games at high framerates, asynchronously to the vertical refresh rate of most monitors.

Without VSync enabled, games will flow more naturally, average framerates are substantially higher and commands will be registered onscreen in short order. However, as the framerates run outside of the monitor’s refresh rate tearing begins to occur, decreasing image quality and potentially leading to unwanted distractions. Tearing happens when fragments of multiple frames are displayed on the screen as the monitor can’t keep up with the massive amount of rendered information being pushed through at once.


For some, VSync can be a saving grace since it eliminates the horizontal tearing, but other than the aforementioned input lag there is one other major drawback: stuttering. Remember, syncing up the monitor with your game holds both the refresh rate and framerate at 60. However, some scenes can cause framerates to drop well below the optimal 60 mark, which leads to some frames being “missed” by the 60Hz monitor refresh and thus causes a slight stuttering effect. Basically, as the monitor is refreshing itself 60 times every second, the lower framerate causes it to momentarily display 30, 20, 15 (or any other whole-number divisor of 60) frames per second.
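The divisor behaviour is easy to demonstrate: under VSync, a frame that misses a refresh simply waits for the next one, so the displayed rate snaps to 60 divided by a whole number of refresh intervals. A small sketch:

```python
import math

# With VSync, each rendered frame occupies a whole number of refresh
# intervals, so the displayed rate is the refresh rate divided by that count.
def displayed_fps(render_fps: int, refresh_hz: int = 60) -> float:
    intervals = math.ceil(refresh_hz / render_fps)  # refreshes each frame spans
    return refresh_hz / intervals

print(displayed_fps(60))  # 60.0 -> perfect sync
print(displayed_fps(45))  # 30.0 -> every frame takes two refreshes
print(displayed_fps(25))  # 20.0
print(displayed_fps(59))  # 30.0 -> even a tiny miss halves the rate
```

That last line is the stutter problem in a nutshell: a game hovering just under 60FPS keeps snapping between 60 and 30 rather than degrading gracefully.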


Through Adaptive VSync, NVIDIA now gives users the best of both worlds by still capping framerates at the same level as the screen’s refresh rate, but when framerate drops are detected it temporarily disables the synchronization. This boosts framerates for as long as needed before once again enabling VSync when performance climbs back to optimal levels. It is supposed to virtually eliminate visible stutter (even though some will still occur as the algorithm switches over) and improve overall framerates while still maintaining the tear-free experience normally associated with VSync.
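Our reading of the mechanism boils down to a per-frame toggle, sketched below as an illustration of the idea rather than NVIDIA's actual driver code:

```python
# Sketch of the Adaptive VSync decision (our interpretation, not NVIDIA's
# implementation): cap frames when the GPU can keep pace with the refresh
# rate, let them run uncapped when it can't, trading tearing for smoothness.
def adaptive_vsync(render_fps: float, refresh_hz: float = 60.0) -> float:
    if render_fps >= refresh_hz:
        return refresh_hz    # VSync on: cap at the refresh rate, no tearing
    return render_fps        # VSync off: avoid snapping down to 30/20/15 FPS

for fps in (110.0, 72.0, 60.0, 48.0):
    print(adaptive_vsync(fps))   # 60.0, 60.0, 60.0, 48.0
```

Compare the 48FPS case against the plain VSync behaviour above: instead of collapsing to 30FPS, the game keeps its full 48.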


Adaptive VSync can be enabled in the driver’s control panel but will only be available starting with the 300.xx-series driver stack. For this technology to be effective, all VSync changes should be made in the control panel while VSync needs to be disabled within any in-game graphics menu. There’s also the option for a “Half Refresh Rate” sync that can be used to lock framerates to 30FPS for highly demanding games or for graphics cards that can’t quite hit the 60FPS mark.

So what kind of effect does Adaptive VSync have upon a typical gaming experience? We used it in Batman: Arkham City and Dirt 3 to find out.



The results were certainly definitive, at least in the case of Batman: Arkham City. In it, the framerates were significantly more consistent, with Adaptive VSync effectively eliminating the peaks and valleys normally associated with a highly demanding game. Dirt 3 on the other hand doesn’t really benefit from this technology in an overt manner, but there are still plenty of instances where the framerates were smoothed out so they didn’t reach quite as far into negative territory.

The graphs above only tell half the story though since the real impact of Adaptive VSync can only be experienced when actually playing a game live. Stuttering will become nearly nonexistent and the difference between it being enabled and disabled really is like night and day. The term “smooth as a baby’s bottom” comes to mind. Unfortunately, NVIDIA hasn’t quite found a way to eliminate VSync’s usual input lag issues but turning on Triple Buffering within the control panel can help mask these problems.
 

Introducing TXAA



As with every other new graphics architecture that has launched in the last few years, NVIDIA will be launching a few new features alongside Kepler. In order to improve image quality in a wide variety of scenarios, FXAA has been added as an option to NVIDIA’s control panel, making it applicable to every game. For those of you who haven’t used it, FXAA is a form of post processing anti aliasing which offers image quality that’s comparable to MSAA but at a fraction of the performance cost.


Another addition is a new anti aliasing mode called TXAA. TXAA uses hardware multisampling alongside a custom software-based resolve filter for smoother edges, and adds an optional temporal component for even higher in-game image quality.

According to NVIDIA, their TXAA 1 mode offers comparable performance to 2xMSAA but results in much higher edge quality than 8xMSAA. TXAA 2 meanwhile steps things up to the next level by offering image enhancements that can’t be equaled by MSAA, but once again the performance impact is negligible when compared against higher levels of multisampling.


From the demos we were shown, TXAA has the ability to significantly decrease the aliasing in a scene. Indeed, it looks like the developer market is trying to move away from inefficient implementations of multi sample anti aliasing and has instead started gravitating towards higher performance alternatives like MLAA, FXAA and now possibly TXAA.

There is however one catch: TXAA cannot be enabled in NVIDIA’s control panel. Instead, game engines have to support it and developers will be implementing it within the in-game options. Presently there isn’t a single title on the market that supports TXAA but that should change over the next 12 months. Once available, it will be backwards compatible with the GTX 400 and GTX 500 series GPUs as well.
 

NVIDIA Surround Improvements




When it was first released, many thought of NVIDIA’s Surround multi monitor technology as nothing more than a way to copy AMD’s competing Eyefinity. Since then it has become much more, with NVIDIA rolling out near seamless support for stereoscopic 3D Vision Surround while gradually improving performance and compatibility with constant driver updates. The one thing they were missing was the ability to run more than two monitors off of a single core graphics card. Well, Kepler is about to change that.


NVIDIA has thoroughly revised their display engine so it can output signals to four monitors simultaneously. This means a trio of monitors can be used alongside a fourth “accessory” display, allowing you to game in Surround on the three primary screens while the fourth acts as a location for email, instant messaging and anything else you may want to keep track of. There are some Windows-related limitations when running 3D Vision since the fourth panel won’t be able to display a 2D image in parallel with stereoscopic content, but running your game in windowed mode should alleviate this issue.

NVIDIA has also added a simple yet handy feature to Surround: confining the Windows taskbar to the center panel. This means all of your core functionality can stay in one area without having to move the cursor across three monitors to interact with some items. Unfortunately, for the time being there isn’t any way to move the taskbar, but NVIDIA may implement this as an option in a later release and the ability to span the taskbar across all monitors is still available.


Bezel correction is an integral part of the Surround experience since it offers continuous, break-free images from one screen to the next. However, it does tend to hide portions of the image as it compensates for the bezel’s thickness, sometimes leading to in-game menus getting cut off. The new Bezel Peeking feature allows gamers to temporarily disable bezel correction by pressing CTRL+ALT+B in order to see and interact with anything being hidden. The corrective measures can then be enabled again without exiting the application.


One major complaint from gamers that use Surround is the wide array of unused and sometimes unusable resolutions that Windows displays in games. NVIDIA has addressed this by adding a Custom Resolutions option to their control panel so users can select only the resolutions they want displayed in games.
 

A Closer Look at the NVIDIA GeForce GTX 680 2GB




The GTX 680’s design is exactly what you would expect from an NVIDIA card: black, green and imposing. It uses a full-length heatsink with a push / pull fan setup, covered by a black shroud designed to loosely mirror the GTX 480’s external heatsink layout. What really differentiates this flagship card from past examples is its length: at about 10”, it is one of the shortest class-leading graphics cards to date. The GTX 680 also supports Tri-SLI through its pair of SLI connectors.


There are subtle (and some not so subtle) hints about this card’s origins, with NVIDIA and GeForce logos strategically placed for maximum effect. Since this is an engineering sample, we can assume that board partners will take a different direction and install their own unique stickers and branding.


Since this card has a TDP of just 195W, with power consumption numbers that come in slightly lower than that, a pair of 6-pin connectors is all that’s needed to keep it going, even when overclocked. NVIDIA has set itself apart from the pack by placing the two connectors in a stacked, top-to-bottom configuration instead of the typical side-by-side fashion. While this saved about an inch of PCB length, the layout complicates installation since the “top” connector partially blocks the top-down view of its sibling. You’ll eventually line things up, but it could take a bit of trial and error.

The backplate features a pair of dual-link DVI connectors as well as full-size outputs for HDMI 1.4a and DisplayPort 1.2. In our opinion, this is much more user friendly than AMD’s current layouts since no adaptors need to be purchased for multi-monitor gaming. In addition, NVIDIA has built in 4K resolution support along with (finally!) full compatibility with bitstreaming of HD audio signals.


The PCB’s underside holds a number of interesting items, such as provisions for an extra 6-pin connector, which we will cover in more detail on the next page. There is also a secondary PCB soldered onto the card right next to the power connectors. It carries the Richtek RT8802A PWM controller, which manages the GTX 680’s power distribution grid and dynamically adjusts current based upon need.


At 10” long, the GTX 680 is shorter than both the GTX 580 and HD 7970, but not by much and not to the point that it will affect your case buying decision.
 

Under the Heatsink




Once the GTX 680’s lid is popped off, we can see a cooling setup reminiscent of the GTX 580’s since it sports a large fin array over a copper vapor chamber and a beveled exhaust area that should help decrease air turbulence. There is also a long black aluminum support brace that morphs into a VRM heatsink for added heat dissipation.


The rearmost portion of NVIDIA’s GTX 680 holds the fan while the components for the 4-phase digital PWM are pushed off to one side. This is a unique design that works in tandem with the stacked power connectors to decrease the card’s overall length and offers more direct airflow over the hot-running power distribution network.


Once the heatsink is removed, it becomes clear how far NVIDIA has moved away from past designs. Not only is the core devoid of the large IHS of yesteryear, but there’s also provision for a third 6-pin PCI-E power connector should higher clocked versions ever become a reality. We should also mention that the GDDR5 receives its own dedicated dual-phase power setup.
 

Test System & Setup / Benchmark Sequences

Main Test System

Processor: Intel i7 3930K @ 4.5GHz
Memory: Corsair Vengeance 32GB @ 1866MHz
Motherboard: ASUS P9X79 WS
Cooling: Corsair H80
SSD: 2x Corsair Performance Pro 256GB
Power Supply: Corsair AX1200
Monitor: Samsung 305T / 3x Acer GD235HZ
OS: Windows 7 Ultimate N x64 SP1


Acoustical Test System

Processor: Intel 2500K @ stock
Memory: G.Skill Ripjaws 8GB 1600MHz
Motherboard: Gigabyte EX58-UD5
Cooling: Thermalright TRUE Passive
SSD: Corsair Performance Pro 256GB
Power Supply: Seasonic X-Series Gold 800W


Drivers:
NVIDIA 300.99 Beta for GTX 680
AMD 12.2 WHQL
NVIDIA 295.73 WHQL

Application Benchmark Information:
Note: In all instances, in-game sequences were used. The videos of the benchmark sequences have been uploaded below.


Batman: Arkham City

<object width="640" height="480"><param name="movie" value="http://www.youtube.com/v/Oia84huCvLI?version=3&hl=en_US&rel=0"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Oia84huCvLI?version=3&hl=en_US&rel=0" type="application/x-shockwave-flash" width="640" height="480" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Battlefield 3

<object width="640" height="480"><param name="movie" value="http://www.youtube.com/v/i6ncTGlBoAw?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/i6ncTGlBoAw?version=3&hl=en_US" type="application/x-shockwave-flash" width="640" height="480" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Crysis 2

<object width="560" height="315"><param name="movie" value="http://www.youtube.com/v/Bc7_IAKmAsQ?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Bc7_IAKmAsQ?version=3&hl=en_US" type="application/x-shockwave-flash" width="560" height="315" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Deus Ex Human Revolution

<object width="560" height="315"><param name="movie" value="http://www.youtube.com/v/GixMX3nK9l8?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/GixMX3nK9l8?version=3&hl=en_US" type="application/x-shockwave-flash" width="560" height="315" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Dirt 3

<object width="560" height="315"><param name="movie" value="http://www.youtube.com/v/g5FaVwmLzUw?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/g5FaVwmLzUw?version=3&hl=en_US" type="application/x-shockwave-flash" width="560" height="315" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Metro 2033

<object width="480" height="360"><param name="movie" value="http://www.youtube.com/v/8aZA5f8l-9E?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/8aZA5f8l-9E?version=3&hl=en_US" type="application/x-shockwave-flash" width="480" height="360" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Shogun 2: Total War

<object width="560" height="315"><param name="movie" value="http://www.youtube.com/v/oDp29bJPCBQ?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/oDp29bJPCBQ?version=3&hl=en_US" type="application/x-shockwave-flash" width="560" height="315" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Skyrim

<object width="640" height="480"><param name="movie" value="http://www.youtube.com/v/HQGfH5sjDEk?version=3&hl=en_US&rel=0"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/HQGfH5sjDEk?version=3&hl=en_US&rel=0" type="application/x-shockwave-flash" width="640" height="480" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Wargame: European Escalation

<object width="640" height="480"><param name="movie" value="http://www.youtube.com/v/ztXmjZnWdmk?version=3&hl=en_US&rel=0"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/ztXmjZnWdmk?version=3&hl=en_US&rel=0" type="application/x-shockwave-flash" width="640" height="480" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


Witcher 2 v2.0

<object width="560" height="315"><param name="movie" value="http://www.youtube.com/v/tyCIuFtlSJU?version=3&hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/tyCIuFtlSJU?version=3&hl=en_US" type="application/x-shockwave-flash" width="560" height="315" allowscriptaccess="always" allowfullscreen="true"></embed></object>​


*Notes:

- All games tested have been patched to their latest version

- The OS has had all the latest hotfixes and updates installed

- All scores you see are the averages after 3 benchmark runs

- All IQ settings were adjusted in-game and all GPU control panels were set to use application settings
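The three-run averaging described above can be sketched as follows; the per-run FPS figures are purely hypothetical and stand in for one game/setting combination:

```python
# Hypothetical average FPS from three benchmark runs of the same sequence.
runs = [62.4, 61.8, 63.1]

# The reported score is the mean of the three runs, rounded to one decimal.
score = round(sum(runs) / len(runs), 1)
print(score)  # 62.4
```

Averaging repeated runs of an identical, repeatable in-game sequence smooths out run-to-run variance, which is why each chart figure represents three passes rather than one.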
 