
NVIDIA GeForce GTX 470 Review


SKYMTL

HardwareCanuck Review Editor
We will be breaking our GTX 400 series articles into two separate reviews: the one you are reading right now will concentrate on the reasonably priced GTX 470 while the other will take an in-depth look at the flagship GTX 480. Click here to go to the GTX 480 review.


The birth of NVIDIA’s Fermi architecture has been a difficult one and the folks in Santa Clara are the first to admit it. Sure, there were tantalizing bits and pieces shown at the GPU Technology Conference and later at CES, but the rumour mill continued to spin and cast doubt upon everything from the chip design to the manufacturing hurdles. What it all boils down to is that rather than push a product out the door that simply wasn’t ready, NVIDIA waited until they had nailed down almost every aspect of the GF100 before releasing the final product. So, for better or worse, the first day in the life of this new poster child starts today: the GTX 470 and GTX 480 are hoping to shake up the industry.

Now that the so-called GTX 400 series is seeing the light of day after so many months behind closed doors and iron-clad NDAs, everyone from board partners to consumers is finally breathing a deep sigh of relief. In a DX11 market that is currently dominated by ATI’s HD 5000 series, we have seen very little (if any) downward movement in graphics card prices over the last seven months. Rather, there have actually been price increases, cleverly disguised as a way to even out production shortfalls so the media and consumers wouldn’t classify them as simple cash grabs. With NVIDIA finally putting in a showing after all this time, consumers may at last be able to benefit from some competition.

At this point, the 480-core GTX 480 will assume the guise of a flagship product and take the lead against the current reigning single-GPU champion: the HD 5870. With a price of nearly $500 USD, the high-end market is where the GTX 480 will vie for dominance. On the flip side of the coin, NVIDIA is also targeting the lucrative $300 to $400 market that was previously the unchallenged stomping ground of ATI’s extremely popular HD 5850. As such, the GTX 470 will start at a competitive $349 USD, which translates into a mere 10% price premium over ATI’s current price / performance leader. From where we stand, pricing like this is very encouraging, but it also opens up a huge $150 chasm between the two 400-series cards which will hopefully be bridged by overclocked GTX 470 products.

At this card’s heart beats a 40nm processor with 448 cores and a clock speed of 607MHz, paired up with 1.28GB of GDDR5 memory. As is the case with all GF100-based cards, power consumption and heat were two of the more pressing concerns when it came to designing the GTX 470. Even with such a low clock speed, the maximum board power for this card is in the 215W range. Considering the competing HD 5850 consumes 151W at full load, this could be a bitter pill to swallow for people who are looking to cut down their electricity bills.
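If you are wondering what that difference works out to in practice, here is a rough back-of-the-envelope sketch; the hours of use and the electricity rate are assumptions we picked purely for illustration.

```cpp
// Rough sketch: the extra running cost of the GTX 470's higher board power
// versus the HD 5850. The 4 hours/day of load and the $0.10/kWh rate are
// assumptions for illustration only.
#include <cstdio>

int main()
{
    const double gtx470_watts  = 215.0;  // maximum board power quoted above
    const double hd5850_watts  = 151.0;  // HD 5850 full-load figure quoted above
    const double hours_per_day = 4.0;    // assumed time spent at full load
    const double usd_per_kwh   = 0.10;   // assumed electricity price

    const double extra_watts = gtx470_watts - hd5850_watts;                 // 64 W
    const double extra_kwh   = extra_watts * hours_per_day * 365.0 / 1000.0;
    printf("Extra draw: %.0f W, ~%.0f kWh per year, ~$%.2f per year\n",
           extra_watts, extra_kwh, extra_kwh * usd_per_kwh);
    return 0;
}
```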

According to NVIDIA, the GTX 470 is meant to offer the perfect combination of performance, price and features which will make it appealing to a wide range of potential buyers. Considering the supposed (we’ll get to the facts later in the review) superior DX11 performance of the GF100-based cards, many people could be looking at this product as something which is more future-proof than the current ATI offerings.

We know you are excited to get on with this review but if there is anyone out there who wants additional background, we suggest you take a look at our dedicated GF100 article that covers everything from architecture scaling to compute performance. We will be using some of that article’s passages here but there is a ton of additional reading contained therein to satisfy anyone.

GTX470-15.jpg
 

The GTX 400 Series’ Specifications


GTX480-81.jpg

As the GF100 architecture has matured over the last few months, a clearer picture of its capabilities and final clock speeds has begun to take shape. The basis of this architecture (a large, monolithic core for the flagship products) hasn’t changed much since the days of the G80 and as such the GTX 400 series won’t be packing any high graphics clock speeds. We have to remember that NVIDIA’s cards historically launch at lower clock speeds than ATI’s. For example, the GTX 260 ran at 576MHz while even the mighty 8800 Ultra clocked in at a mere 612MHz. The 400 series does however bring NVIDIA some much-needed products to the DX11 marketplace, which has been in an ATI stranglehold for some time. Both the GTX 480 and GTX 470 are aimed directly at two cards which ATI has been selling for months now: the HD 5870 and the HD 5850. As such, they are priced accordingly, which should actually come as a minor shock to people who are used to new high-end cards retailing in the $400 to $600+ range. It seems ATI’s stiff competition has had a direct impact upon NVIDIA’s pricing structure.

GTX480-1.jpg

While the core itself is capable of sporting up to 512 cores, the flagship GTX 480 has mysteriously “lost” 32 cores (totalling one SM) somewhere along the line, while the $349 GTX 470 has two SMs disabled, giving it 448 cores. With the elimination of a single Streaming Multiprocessor also comes the loss of four texture units, but the memory (384-bit), L2 cache (768KB) and ROP (48) structure retains the maximum allowable specifications. It also receives a whopping 1536MB of GDDR5 memory. The choice to go with 480 cores wasn’t discussed at length by NVIDIA but if we had to hazard a guess, we’d say there currently aren’t enough 512-core parts capable of 700MHz and higher clocks coming off the production line. Sacrifices simply had to be made in order to ensure that enough units are available when they ship to retailers and that power consumption stays within optimal parameters (250W in this case). Nonetheless, in the grand scheme of things, the loss of a single SM should not have a significant impact upon performance.

GTX480-2.jpg

The GTX 470, on the other hand, is specifically tailored to offer a price / performance solution that rivals the HD 5850. To achieve this while ensuring it would not be much more power hungry than the GTX 285, NVIDIA took the standard GF100 core and cut it down significantly. Not only were a total of 64 cores cut, but the ROP / L2 cache and memory controllers also went under the knife. Clock speeds for both the memory and the core / processor domains were also reduced quite a bit compared to the GTX 480, which results in a card that is understandably priced a full $150 below the flagship product.
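To put the cut-down into concrete numbers, the quick sketch below derives both cards’ unit counts from the number of disabled SMs and ROP / memory partitions, using the per-SM and per-partition figures described in the architecture pages that follow; treating the GTX 470 as having 14 active SMs and five active partitions is our reading of the published specifications rather than an official breakdown.

```cpp
// Minimal sketch deriving GTX 480 / GTX 470 specs from disabled units.
// Per-SM and per-partition figures come from the GF100 description below;
// the GTX 470 partition count is inferred from its 320-bit bus.
#include <cstdio>

struct GF100Config {
    const char *name;
    int active_sms;        // out of 16
    int active_partitions; // ROP / L2 / 64-bit memory controller groups, out of 6
};

int main()
{
    const GF100Config cards[] = {
        { "GTX 480", 15, 6 },  // one SM disabled
        { "GTX 470", 14, 5 },  // two SMs and one ROP/memory partition disabled
    };

    for (const GF100Config &c : cards) {
        const int cores    = c.active_sms * 32;          // 32 CUDA cores per SM
        const int tmus     = c.active_sms * 4;           // 4 texture units per SM
        const int rops     = c.active_partitions * 8;    // 8 ROPs per partition
        const int l2_kb    = c.active_partitions * 128;  // 128KB of L2 per partition
        const int bus_bits = c.active_partitions * 64;   // 64-bit controller per partition
        printf("%s: %d cores, %d TMUs, %d ROPs, %dKB L2, %d-bit bus\n",
               c.name, cores, tmus, rops, l2_kb, bus_bits);
    }
    return 0;
}
```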

For the foreseeable future, these two cards will be NVIDIA’s claim to fame in the enthusiast GPU market, but there are a number of lingering questions. We have heard very little about lower-end cards which could attack price ranges where ATI is currently quite weak, namely the segment held by the underperforming HD 5830. In addition, there’s very little doubt in our minds that ATI will soon have something out to counter the GTX 480, so is NVIDIA simply waiting for that before releasing a 512-core card? Only time will tell, but it’s quite obvious that the GPU market will be an interesting place this year.
 

In-Depth GF100 Architecture Analysis (Core Layout)


The first stop on this whirlwind tour of the GeForce GF100 is an in-depth look at what makes the GPU tick as we peer into the core layout and how NVIDIA has designed this to be the fastest graphics core on the planet.

Many people incorrectly believed that the Fermi architecture was primarily designed for GPU computing applications with very little thought given to graphics processing. This couldn’t be further from the truth since the computing and graphics capabilities were determined in parallel, and the result is a brand new architecture tailor-made to live in a DX11 environment. Basically, NVIDIA needed to apply what they had learned from past generations (G80 & GT200) to the GF100.

GF100-5.jpg

What you are looking at above is the heart and soul of any GF100 card: the core layout. While we will go into each section in a little more detail below, from the overall view we can see that the main functions are broken up into four distinct groups called Graphics Processing Clusters (GPCs), which are then broken down again into individual Streaming Multiprocessors (SMs), raster engines and so on. To make matters simple, think of it this way: in its highest-end form, a GF100 will have four GPCs, each of which is equipped with four SMs, for a total of 16 SMs. Within each of these SMs are 32 CUDA cores (or shader processors, in past generations’ parlance) for a total of 512 cores. However, the current GTX 480 and GTX 470 cards make do with slightly fewer cores (480 and 448 respectively) while we are told there will be a 512-core version in the near future.

On the periphery of the die are the GigaThread Engine and the memory controllers. The GigaThread Engine performs the somewhat thankless duty of reading the CPU’s commands over the host interface and fetching data from the system’s main memory. The data is then copied into the framebuffer of the graphics card itself before being passed along to the designated engine within the core. Meanwhile, the GF100 incorporates a total of six 64-bit GDDR5 memory controllers for a combined 384-bit interface. The massive amount of bandwidth created by a 384-bit GDDR5 memory interface provides extremely fast access to the card’s onboard memory and should eliminate bottlenecks seen in past generations.
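Since NVIDIA’s final memory clocks aren’t being discussed at this point in the article, the sketch below simply shows how a peak bandwidth figure falls out of the bus width and the GDDR5 data rate; the per-pin rate used here is an assumed example, not an official GTX 400 series specification.

```cpp
// Sketch: peak memory bandwidth from bus width and GDDR5 data rate.
// The 3.35 Gbps effective per-pin rate is an assumed example, not a spec.
#include <cstdio>

int main()
{
    const double bus_width_bits = 384.0;  // six 64-bit GDDR5 controllers
    const double gbps_per_pin   = 3.35;   // assumed effective data rate per pin

    const double bandwidth_gb_s = bus_width_bits / 8.0 * gbps_per_pin;
    printf("Peak bandwidth: ~%.1f GB/s\n", bandwidth_gb_s);
    return 0;
}
```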

GF100-6.jpg

Each Streaming Multiprocessor holds 32 CUDA cores along with 16 load / store units, which allows addresses for 16 threads per clock to be calculated. Above these we see the warp schedulers along with their associated dispatch units, which dispatch groups of 32 concurrent threads (called warps) to the cores.

Finally, closer to the bottom of the SM are the L1 cache, the PolyMorph Engine and the four texture units. In total, the maximum number of texture units in this architecture is 64, which may come as a surprise considering the outgoing GT200 architecture supported up to 80 TMUs. However, NVIDIA has implemented a number of improvements in the way the architecture handles textures, which we will go into in a later section. Suffice it to say that the texture units are now integrated into the SM rather than having multiple SMs address a common texture cache.

GF100-7.jpg

Independent of the SM structure are six dedicated partitions of eight ROP units each, for a total of 48 ROPs as opposed to the 32 units of the GT200 architecture. Also different from the GT200 layout is that instead of backing directly onto the memory bus, the ROPs interface with the shared L2 cache, which provides a quick path for data storage.
 

A Closer Look at the Raster & PolyMorph Engines


In the last few pages you may have noticed mention of the PolyMorph and Raster engines which are used for highly parallel geometry processing operations. What NVIDIA has done is effectively grouped all of the fixed function stages into these two engines, which is one of the main reasons drastically improved geometry rendering is being touted for GF100 cards. In previous generations these functions used to be outside of the core processing stages (SMs) and NVIDIA has now brought them inside the core stages to ensure proper load balancing. This in effect will help immeasurably with tessellated scenes which feature extremely high triangle counts.

We should also note here and now that the GTX 400 series’ “core” clock numbers refer to the speed at which these fixed function stages run.

GF100-12.jpg

Within the PolyMorph engine there are five stages from Vertex Fetch to the Stream Output which each process data from the Streaming Multiprocessor they are associated with. The data then gets output to the Raster Engine. Contrary to past architectures which featured all of these stages in a single pipeline, the GF100 architecture does all of the calculations in a completely parallel fashion. According to our conversations with NVIDIA, their approach vastly improves triangle, tessellation, and Stream Out performance across a wide variety of applications.

In order to further speed up operations, data goes from one of 16 PolyMorph engines to another and uses the on-die cache structure for increased communication speed.

GF100-13.jpg

After the PolyMorph engine is done processing data, it is handed off to the Raster Engine’s three pipeline stages that pass off data from one to the next. These Raster Engines are set up to work in a completely parallel fashion across the GPU for quick processing.

GF100-14.jpg

Both the PolyMorph and Raster engines are distributed throughout the architecture which increases parallelism but are distributed in a different way from one another. In total, there are 16 PolyMorph engines which are incorporated into each of the SMs throughout the core while the four Raster Engines are placed at a rate of one per GPC. This setup makes for four Graphics Processing Clusters which are basically dedicated, individual GPUs within the core architecture allowing for highly parallel geometry rendering.

Now that we are done with looking at the finer details of this architecture, it’s time to see how that all translates into geometry and texture rendering. In the following pages we take a look at how the new architecture works in order to deliver the optimal performance in a DX11 environment.
 

Efficiency Through Caching


There are benefits to having dedicated L1 and L2 caches as this approach not only helps when it comes to GPGPU computing but also for storing draw calls so they are not passed off to the memory on the graphics card. This is supposed to drastically streamline rendering efficiency, especially in situations with a lot of higher-level geometry.

GF100-9.jpg

Above we have an enlarged section of the cache and memory layout within each SM. To put things into perspective, a Streaming Multiprocessor has 64 KB of shared, programmable on-chip memory that can be configured in one of two ways: either as 48 KB of shared memory with 16 KB of L1 cache, or as 16 KB of shared memory with 48 KB of L1 cache. When used for graphics processing as opposed to GPGPU functions, the SM will make use of the 16 KB L1 cache configuration. This L1 cache is supposed to help with access to the L2 cache as well as streamlining functions like stack operations and global loads / stores.
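For the GPGPU-minded, this split is something developers can actually request through the CUDA runtime. The sketch below is our own illustration of how that is done on a Fermi-class card (it needs the CUDA toolkit and nvcc); the "blur" kernel is a throwaway stand-in, not anything from NVIDIA’s documentation.

```cpp
// Minimal sketch: choosing the 48/16 or 16/48 split per kernel on Fermi.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void blur(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;  // trivial stand-in workload
}

int main()
{
    // Favour L1 cache (16 KB shared / 48 KB L1) for this kernel, mirroring
    // the configuration the GF100 uses for graphics work...
    cudaFuncSetCacheConfig(blur, cudaFuncCachePreferL1);
    // ...or favour shared memory (48 KB shared / 16 KB L1) for compute
    // kernels that stage data manually:
    // cudaFuncSetCacheConfig(blur, cudaFuncCachePreferShared);

    const int n = 1024;
    float *in = nullptr, *out = nullptr;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    blur<<<(n + 255) / 256, 256>>>(out, in, n);
    cudaDeviceSynchronize();
    printf("done\n");

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```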

In addition, each texture unit now has its own high efficiency cache as well which helps with rendering speed.

GF100-8.jpg

Through the L2 cache architecture NVIDIA is able to keep most of the rendering function data like tessellation, shading and rasterizing on-die instead of going to the framebuffer (DRAM) which would slow down the process. Caching for the GPU benefits bandwidth amplification and alleviates memory bottlenecks which normally occur when doing multiple reads and writes to the framebuffer. In total, the GF100 has 768KB of L2 cache which is dynamically load balanced for peak efficiency.

It is also possible for the L1 and L2 caches to perform loads and stores to memory and pass data from engine to engine so that nothing moves off-chip. The trade-off of this approach is that a significant amount of die area is taken up in order to do geometry processing in a parallel, scalable way without consuming DRAM bandwidth.

GF100-11.jpg

When compared with the new GeForce GF100, the previous architecture is inferior in every way. The GT200 only used cache for textures and featured a read-only L2 cache structure whereas the new GPU’s L2 is rewritable and caches everything from vertex data to textures to ROP data and nearly everything in between.

By contrast, with their Radeon HD 5000 series, ATI dumps all of the data from the geometry shaders to memory and then pulls it back into the core for rasterization before output. This causes a drop in efficiency and therefore performance. Meanwhile, as we discussed before, NVIDIA is able to keep all of these functions on-die in the cache without introducing memory latency into the equation or hogging bandwidth.

So what does all of this mean for the end-user? Basically, it means vastly improved memory efficiency since less bandwidth is being taken up by unnecessary read and write calls. This can and will benefit the GF100 in high resolution, high IQ situations where lesser graphics cards’ framebuffers can easily become saturated.
 

The GF100’s Modular Architecture Scaling



When the GT200 series was released, there really wasn’t much information presented regarding how the design could be scaled down to create lower-end cards appealing to a wide variety of price brackets. Indeed, the GT200 proved extremely hard to scale due to its inherent design properties, which is why we saw the G92 series of cards stick around for much longer than was originally planned. NVIDIA was lambasted for their constant product renames but, considering the limitations of their architecture at the time, there really wasn’t much they could do.

Lessons were learned the hard way and the GF100 actually features an amazingly modular design which can be scaled down from its high-end 512-SP version to a wide range of smaller, less expensive derivatives. In this section we take a look at how these lower-end products can be designed.


The GPCs: Where it All Starts

Before we begin, it is necessary to take a closer look at one of the four GPCs that make up a fully-endowed GF100.

GF100-33.jpg

By now you should all remember that the Graphics Processing Cluster is the heart of the GF100. It encompasses a quartet of Streaming Multiprocessors and a dedicated Raster Engine. Each of the SMs consists of 32 CUDA cores, four texture units, dedicated cache and a PolyMorph Engine for fixed function calculations. This means each GPC houses 128 cores and 16 texture units. According to NVIDIA, they have the option to eliminate these GPCs as needed to create other products but they are also able to do additional fine tuning as we outline below.

GF100-35.jpg

Within the GPC are four Streaming Multiprocessors and these too can be eliminated one by one to decrease the die size and create products at lower price points. As you eliminate each SM, 32 cores and 4 texture units are removed as well. It is also worth mentioning that due to the load balancing architecture used in the GF100, it’s possible to eliminate multiple SMs from a single GPC without impacting the Raster Engine’s parallel communication with the other engines. So in theory, one GPC can have one to four SMs while all the other GPCs have their full amount without impacting performance one bit.

So what does this mean for actual specifications of GF100 cards aside from the 512 core version? The way we look at this, additional products would theoretically be able to range from 480 core, 60 texture unit high-end cards to 32 core, 4 TMU budget-oriented products. This is assuming NVIDIA sticks to the 32 cores per SM model they currently have.

Since we want to be as realistic as possible here, we expect NVIDIA to keep some separation between product ranges and release GF100-based cards with either two or four SMs disabled. This could translate into products with 448 cores + 56 texture units, 384 + 48, 320 + 40 and so on, for a wide range of possible solutions. The current GTX 480 has a single SM disabled.


ROP, Framebuffer and Cache Scaling

You may have noticed that we haven’t discussed ROPs and memory scaling yet and that’s because these items scale independently from the GPCs.

GF100-34.jpg

Focusing in on the ROP, memory and cache array, we can see that while these are placed relatively far apart on the block diagram, they are closely related and as such must be scaled together. In its fullest form, the GF100 has 48 ROP units arranged into six groups of eight, and each of these groups is served by 128KB of L2 cache for a total of 768KB. In addition, every ROP group has a dedicated 64-bit GDDR5 memory controller. This all translates into a pretty straightforward rule: once you eliminate a ROP group, you also have to eliminate a memory controller and 128KB of L2 cache.

GF100-36.jpg

Scaling of these three items happens in a linear fashion, as you can see in the chart above, since in the GF100 architecture you can’t have ROPs without an associated amount of L2 cache or memory interface and vice versa. One way or another, the architecture can scale all the way down to 8 ROPs with a 64-bit memory interface.

Meanwhile, the memory on lower-end versions scales in a linear fashion as well, since a 64-bit interface is eliminated along with every group of ROPs that is removed. So the possibility of a 1.28GB, 320-bit card, a 1GB, 256-bit product and so on does exist, and the first of these is exactly what we see in the GTX 470’s specifications.
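Written out as a quick sketch, the scaling rule looks like this; the framebuffer sizes assume 256MB per 64-bit controller, which matches the two launch cards (1536MB / 384-bit and 1280MB / 320-bit) but is our assumption for the hypothetical lower-end configurations.

```cpp
// Sketch: what one ROP / L2 / memory-controller partition is worth when the
// design is scaled down. 256MB per 64-bit controller is our assumption,
// chosen because it matches the GTX 480 and GTX 470 launch configurations.
#include <cstdio>

int main()
{
    for (int partitions = 6; partitions >= 1; --partitions) {
        const int rops   = partitions * 8;    // 8 ROPs per partition
        const int l2_kb  = partitions * 128;  // 128KB of L2 per partition
        const int bus    = partitions * 64;   // 64-bit controller per partition
        const int mem_mb = partitions * 256;  // assumed framebuffer density
        printf("%d partitions -> %2d ROPs, %3dKB L2, %3d-bit bus, %4dMB\n",
               partitions, rops, l2_kb, bus, mem_mb);
    }
    return 0;
}
```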
 

Image Quality Improvements


Even though additional geometry could end up adding to the overall look and “feel” of a given scene, methods like tessellation and HDR lighting still require accurate filtering and sampling to achieve high rendering fidelity. For that you need custom anti-aliasing (AA) modes as well as vendor-specific anisotropic filtering (AF) sampling and everything in between. As the power of GPUs rapidly outpaces the ability of DX9 and even DX10 games to feed them with information, a new focus has been turned to image quality adjustments. These adjustments do tend to impact upon framerates but with GPUs like the GF100 there is much less of a chance that increasing IQ will result in the game becoming unplayable.


Quicker Jittered Sampling Techniques

Many of you are probably scratching your head and wondering what in the world jittered sampling is. Basically, it is a shadow sampling method that has been around since the DX9 days (maybe even prior to that) which allows for realistic, soft shadows to be mapped by the graphics hardware. Unfortunately, this method is extremely resource hungry so it hasn’t been used very often regardless of how good the shadows it produces may look.

GF100-20.jpg

In the picture above you can see what happens with shadows which don’t use this method of mapping. Basically, for a shadow to look good it shouldn’t have a hard, serrated edge.

GF100-21.jpg

Soft shadows are the way to go and while past generations of hardware were able to do jittered sampling, they just didn’t have the resources to do it efficiently. Their performance was adequate with one light source in a scene but when asked to produce soft shadows from multiple light sources (in a night scene for example), the framerate would take an unacceptably large hit. With the GF100, NVIDIA had the opportunity to vastly improve shadow rendering and they did just that.

GF100-22.jpg

To enable quicker, more efficient jittered sampling, NVIDIA worked with Microsoft to implement hardware support for Gather4 in DX11. Instead of issuing four separate texture fetches, the hardware can now specify one coordinate along with an offset and fetch four texels in a single operation. This significantly improves the shadow rendering efficiency of the hardware, and it is still able to work as a standard Gather4 instruction if need be.

With this feature turned on, NVIDIA expects a 200% improvement in shadow rendering performance when compared to the same scene being rendered with their hardware Gather4 turned off.


32x CSAA Mode for Improved AA

In our opinion, the differences between the AA modes above 8x are minimal at best unless you are rendering thin items such as grass, a chain-link fence or a distant railing. With the efficiency of the DX11 API in addition to increased horsepower from cards like the GF100, it is now possible to use geometry to model vegetation and the like. However, developers will continue using the billboarding and alpha texturing methods from DX9 which allow for dynamic vegetation, but it will continue to look jagged and under-rendered. In such cases, anti-aliasing can be applied but high levels of AA are needed in order to properly render these items. This is why NVIDIA has implemented their new 32x Coverage Sample AA.

GF100-23.jpg

In order to accurately apply AA, three things are needed: coverage samples, color samples and levels of transparency. To put this into context, the GT200 had 8 color samples and 8 coverage samples for a total of 16 samples on edges. However, this allowed for only 9 levels of transparency, which led to edges that still looked jagged and without proper blending, so dithering was implemented to mask the banding.

The GF100 on the other hand features 24 coverage samples and 8 color samples for a total of 32 samples (hence the 32x CSAA moniker). This layout also offers 33 levels of transparency for much smoother blending of the anti-aliased edges into the background and increased performance as well.

GF100-24.jpg

With increased efficiency comes decreased overhead when running complex AA routines and NVIDIA specifically designed the GF100 to cope with high IQ settings. Indeed, on average this new architecture only loses about 7% of its performance when going from 8x AA to 32x CSAA.
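To make that figure a little more tangible, here is a trivial sketch; the 60 FPS starting point is a hypothetical number of our own, not a benchmark result.

```cpp
// Trivial sketch: what a ~7% hit from 8x AA to 32x CSAA looks like.
// The 60 FPS baseline is hypothetical, not a measured result.
#include <cstdio>

int main()
{
    const double fps_8x_aa    = 60.0;
    const double fps_32x_csaa = fps_8x_aa * (1.0 - 0.07);
    printf("8x AA: %.1f FPS -> 32x CSAA: ~%.1f FPS\n", fps_8x_aa, fps_32x_csaa);
    return 0;
}
```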


TMAA and CSAA: Hand in Hand

No matter how much AA you apply in DX9, there will still invariably be some issues with distant, thin objects that are less than a pixel wide due to the method older APIs use to render these. Transparency Multisample AA (TMAA) allows the DX9 API to convert shader code to effectively use alpha to coverage routines when rendering a scene. This, combined with CSAA, can greatly increase the overall image quality.

GF100-25.jpg

It may be hard to see in the image above but without TMAA, the railing in the distance would have its lines shimmer in and out of existence since the DX9 API doesn’t have the tools necessary to properly process items less than a pixel wide. It may not impact gameplay but it is noticeable when moving through a level.

GF100-26.jpg

Since coverage samples are used as part of GF100’s TMAA evaluation, much smoother gradients are produced. TMAA will help in instances such as this railing and even with the vegetation examples we used in the last section.
 

Touching on NVIDIA Surround / 3D Vision Surround


During CES, NVIDIA unveiled their answer to ATI’s Eyefinity multi-display capability: 3D Vision Surround and NVIDIA Surround. These two “surround” technologies from NVIDIA share common ground but in some ways their prerequisites and capabilities are at two totally different ends of the spectrum. We should also mention straight away that both of these technologies will become available once the GF100 cards launch and will support bezel correction management from the outset.


NVIDIA Surround

GF100-45.jpg

Not to be confused with NVIDIA’s 3D Vision Surround, the standard Surround moniker allows for three displays to be fed concurrently via an SLI setup. Yes, you need an SLI system in order to run three displays at the same time, but the good news is that NVIDIA Surround is backwards compatible with GTX 200-series cards in addition to being forward compatible with GF100-series parts. This mode can spread the picture across three 2560 x 1600 screens and allows for a mixture of monitors to be used as long as they all support the same resolutions.

The reason why SLI is needed is that both the GT200-series and the GF100 cards are only capable of having a pair of display outputs active at the same time. In addition, if you want to drive three monitors at reasonably high detail levels, you’ll need some serious horsepower and that’s exactly what a dual or triple card system gives you.

This does tend to leave out the people who may want to use three displays for professional applications but that’s where NVIDIA’s Quadro series comes into play.


3D Vision Surround

GF100-46.jpg

We all know by now that immersive gaming has been taken to new levels by both ATI, with their HD 5000-series’ ability to game on up to three monitors at once, and NVIDIA’s own 3D Vision which offers stereoscopic viewing within games. What has now happened is a combining of these two techniques under the 3D Vision Surround banner, which brings stereo 3D to surround gaming.

This is the mac-daddy of display technologies and it is compatible with SLI setups of upcoming GF100 cards and older GT200-series. The reasoning behind this is pretty straightforward: you need a massively powerful system for rendering and outputting what amounts to six high resolution 1920 x 1080 images (two to each of the three 120Hz monitors). Another thing you should be aware of is the fact that all three monitors MUST be of the same make and model in order to ensure uniformity.
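A bit of quick math shows why so much horsepower is required; the sketch below simply counts the pixels that have to be rendered for every refresh.

```cpp
// Sketch: per-refresh pixel counts for 3D Vision Surround (three 1920x1080
// panels, two images each for stereo) versus a single 1080p display.
#include <cstdio>

int main()
{
    const long long single_1080p = 1920LL * 1080LL;
    const long long surround_3d  = single_1080p * 3 * 2;  // three panels, two eyes
    printf("Single 1080p image: %lld pixels\n", single_1080p);
    printf("3D Vision Surround: %lld pixels per refresh (%.0fx the work)\n",
           surround_3d, (double)surround_3d / single_1080p);
    return 0;
}
```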


All in all, we saw NVIDIA’s 3D Vision Surround in action and while it was extremely impressive to say the least, we can’t offer any further thoughts about it until we have done more testing of our own.
 

A Closer Look at the NVIDIA GeForce GTX 470


GTX470-1.jpg

The GTX 470’s overall design isn’t all that different from past NVIDIA cards when it comes to the overall look and feel of the heatsink shroud. Since this is a reference design, there isn’t a board partner’s sticker anywhere to be seen, but if there were one it would be placed where the NVIDIA logos are. For the time being, every GF100-based card will be based off this design until board partners are allowed to manufacture their own versions. All in all, we love the black-on-black look.

GTX470-2.jpg
GTX470-3.jpg

The fan is actually a relatively small 70mm affair that sucks in cool air from the top and back of the shroud in order to facilitate cooling. We can also see a faint NVIDIA logo that will likely be replaced with a board partner’s sticker once these hit retail.

GTX470-4.jpg
GTX470-5.jpg

As we already mentioned, in order to allow the fan additional fresh air in certain situations (with two cards in SLI or when it is installed in particularly cramped quarters), the back of the plastic heatsink shroud is left open. This also facilitates cooling the VRMs and other hot-running PCB components.

GTX470-6.jpg
GTX470-7.jpg

When it comes to connectors, the GTX 470 is decked out with a pair of DVI outputs as well as a mini-HDMI port which can be used with an included mini-HDMI to HDMI adaptor. In addition it should be noted that all Fermi-based graphics cores have compatibility for DisplayPort built into them but it is up to the board partner to add the necessary output. For the time being, all GTX 400 series cards will ship in the configuration seen above.

Power to the card comes from a pair of 6-pin PCI-E connectors while the indent you see to the right of these is only used as a latch for the heatsink shroud attachment. There is no SPDIF input since all audio processing is now done natively on the chip itself.

GTX470-8.jpg
GTX470-9.jpg

The back of the card doesn’t hold anything of particular interest other than the cut out in the PCB which is used for additional airflow to the 70mm fan.

GTX470-14.jpg

The GTX 470 is actually quite compact, measuring a total of 9” in length, which makes it equal in size to an HD 5850.
 

The GTX 470’s Heatsink


GTX470-10.jpg
GTX470-11.jpg

In order to tame the raging temperatures emanating from the GTX 470’s core, NVIDIA had to implement a high-end cooling solution that isn’t as exotic as the one which graces the GTX 480 but still gets the job done. In this case they have gone with five large-diameter copper heatpipes which run away from the core and make their way into an aluminum heatsink assembly. It is obvious every available millimetre was needed in order to cram this large heatsink into the plastic shroud.

GTX470-13.jpg
GTX470-12.jpg

We can also see that NVIDIA has gone with a large-scale secondary heatsink that runs the length of the card and acts as a way to dissipate heat from the VRMs and the memory modules. This actually looks to be quite a complicated design but it is interesting nonetheless.
 
