
Intel Core i7 "Nehalem" 920, 940 & 965 XE Processor Review


MAC

Associate Review Editor
Joined
Nov 8, 2006
Messages
1,086
Location
Montreal
intelcorplogo.jpg


Intel Core i7 "Nehalem" 920, 940 & 965 XE Processor Review​




Let’s take a little walk through the recent history of processors, since it is a story of amazing successes and crushing losses, along with both stunning and utterly disappointing performance numbers. Indeed, we have seen a remarkable transformation in overall performance as well as efficiency in the last few years. Back in the days of Intel’s Pentium 4 and AMD’s Athlon, the processor marketplace was a confusing minefield of competing products vying for your attention, until AMD released the Athlon 64. In the blink of an eye, the game of cat and mouse had Intel scurrying to find an answer to a surging AMD. AMD enjoyed its time in the limelight until Intel found the answer to its problems in the form of the Yonah mobile processor. From this unexpected corner, the Core 2 Duo and Core 2 Quad processors we have all come to know and love were conceived and born. Not only did these new processors run all over the AMD competition, but they have been running strong for more than two years with little competition. Now, following Intel’s Tick-Tock model, in which a new architecture is released every other year, we have a brand new processor family launching. Many call it Nehalem after its microarchitecture, but it also goes by its Core i7 brand name or by its codename: Bloomfield.

Today marks the day that we are finally able to post our review of the next step in Intel’s march to market domination, and with it they are hoping to put the competition to shame. What we get is a processor with four physical cores which, through the “miracle” of Hyper-Threading (don’t worry, this will be explained in its own section a bit later on in this review), operates as if it has eight cores. Terms like QuickPath Interconnect, Turbo Mode and Triple Channel Memory will become familiar in no time at all, since they are all integral technologies that this new 45nm processor family uses.

This may sound perfect for many of you operating those LGA775 systems out there, but there is one important thing to remember: unlike the move from the Pentium D to the Core architecture, the i7 processors use a completely new socket. Technological advancement comes at a price, folks, and thankfully the sacrifices one will have to make with this step forward are relatively minimal. This new LGA 1366 socket will be the centerpiece of a new generation of motherboards boasting enough overclocking potential to put a smile on the face of the most jaded enthusiast while having features galore. We will be running multiple reviews of these new X58-based boards in the coming weeks, so stay tuned since some of them are real show stoppers.

While today may mark the day we can begin talking about these new processors and their accompanying motherboards, it will still be a few weeks until you can actually go out and purchase one. In other words...be patient. That being said, even though they may not have any real amount of competition on the market today, the prices being asked for the processors themselves will not be exorbitant. According to the chaps over at Intel, the launch prices will be about $284 for the i7-920, $562 for the i7-940 and finally nearly $1000 for the i7-965. This isn’t outrageous but since you have to factor the price of a new motherboard and memory into the equation, things start getting a bit hairy for the bank account. That being said, boys will be boys and many of us will spend whatever it takes to get the latest and greatest.

We should also mention right now that this review marks the launching point of what will amount to a series of articles about this new Intel platform, its features and how best to take control of all the options at your fingertips. Without a doubt, this is an exciting day to be a reviewer and a consumer, so without sounding too excited, let’s get this show on the road and introduce you to this brand new world Intel wants us all to be part of.

corei7logos.jpg

Logos: Core i7 on the left, Core i7 Extreme Edition on the right.
 

Core i7 - Bloomfield



Now that the media embargo has ended, we can finally confirm the final specifications of the soon-to-be-released Core i7 family of processors, codenamed Bloomfield. Keep in mind that today is not the actual product launch; that is going to be sometime “later in November”. Nevertheless, let's take a glance at what these new chips have to offer.


Specifications

<table align="center" table border="0" bgcolor="#666666" cellpadding="5" cellspacing="1" width="735px">
<tr><td align="center" bgcolor="#cc9999" width="130"><b>Processor name</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-920</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-940</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965 Extreme Edition</b></td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Form Factor</b></td><td align="center" bgcolor="#ececec" width="100">LGA1366</td><td align="center" bgcolor="#ececec" width="100">LGA1366</td><td align="center" bgcolor="#ececec" width="100">LGA1366</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Manufacturing Process</b></td><td align="center" bgcolor="#ececec" width="100">45nm</td><td align="center" bgcolor="#ececec" width="100">45nm</td><td align="center" bgcolor="#ececec" width="100">45nm</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Die Size</b></td><td align="center" bgcolor="#ececec" width="100">263mm²</td><td align="center" bgcolor="#ececec" width="100">263mm²</td><td align="center" bgcolor="#ececec" width="100">263mm²</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Transistor Count</b></td><td align="center" bgcolor="#ececec" width="100">731 Million</td><td align="center" bgcolor="#ececec" width="100">731 Million</td><td align="center" bgcolor="#ececec" width="100">731 Million</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Physical Cores</b></td><td align="center" bgcolor="#ececec" width="100">4</td><td align="center" bgcolor="#ececec" width="100">4</td><td align="center" bgcolor="#ececec" width="100">4</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Logical Cores</b></td><td align="center" bgcolor="#ececec" width="100">8</td><td align="center" bgcolor="#ececec" width="100">8</td><td align="center" bgcolor="#ececec" width="100">8</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Clock Speed</b></td><td align="center" bgcolor="#ececec" width="100">2.66GHz</td><td align="center" bgcolor="#ececec" width="100">2.93GHz</td><td align="center" bgcolor="#ececec" width="100">3.2GHz</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>QuickPath Interconnect</b></td><td align="center" bgcolor="#ececec" width="100">4.8 GT/s</td><td align="center" bgcolor="#ececec" width="100">4.8 GT/s</td><td align="center" bgcolor="#ececec" width="100">6.4 GT/s</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Memory Support</b></td><td align="center" bgcolor="#ececec" width="100">Triple-Channel DDR3-1066</td><td align="center" bgcolor="#ececec" width="100">Triple-Channel DDR3-1066</td><td align="center" bgcolor="#ececec" width="100">Triple-Channel DDR3-1066</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>L1 Cache Total</b></td><td align="center" bgcolor="#ececec" width="100">4 x 32KB Data + 4 x 32KB Instruction</td><td align="center" bgcolor="#ececec" width="100">4 x 32KB Data + 4 x 32KB Instruction</td><td align="center" bgcolor="#ececec" width="100">4 x 32KB Data + 4 x 32KB Instruction</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>L2 Cache Total</b></td><td align="center" bgcolor="#ececec" width="100">4 x 256KB</td><td align="center" bgcolor="#ececec" width="100">4 x 256KB</td><td align="center" bgcolor="#ececec" width="100">4 x 256KB</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>L3 Cache Total</b></td><td align="center" bgcolor="#ececec" width="100">8MB Shared</td><td align="center" bgcolor="#ececec" width="100">8MB Shared</td><td align="center" bgcolor="#ececec" width="100">8MB Shared</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Instruction Set Architecture (ISA)</b></td><td align="center" bgcolor="#ececec" width="100">MMX, EM64T, SSE-SSE4.2</td><td align="center" bgcolor="#ececec" width="100">MMX, EM64T, SSE-SSE4.2</td><td align="center" bgcolor="#ececec" width="100">MMX, EM64T, SSE-SSE4.2</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>TDP</b></td><td align="center" bgcolor="#ececec" width="100">130W</td><td align="center" bgcolor="#ececec" width="100">130W</td><td align="center" bgcolor="#ececec" width="100">130W</td></tr>
<tr><td align="center" bgcolor="#ececec" width="100"><b>Launch Price</b></td><td align="center" bgcolor="#ececec" width="100">$284 USD</td><td align="center" bgcolor="#ececec" width="100">$562 USD</td><td align="center" bgcolor="#ececec" width="100">$999 USD</td></tr>
</table>

As you can see, the three initial launch models are quite similar even though their prices differ greatly. Aside from price and clock speed, the only noteworthy difference between the three chips is that the Extreme Edition model benefits from a higher QuickPath Interconnect (QPI) speed and has the unlocked multipliers that enthusiasts crave. As is their custom, Intel is targeting these initial offerings at the mid to high-end market. Although the budget-conscious among you may be disappointed, you must remember that the Bloomfield family is Intel's high-performance quad-core product line. Based on Intel's roadmap, we can expect mainstream Nehalem offerings sometime in Q3 2009. Specifically, these will be the quad-core Lynnfield and dual-core Havendale models.


The Chips:

cpu1th.jpg
cpu2th.jpg

On their own, the Core i7 processors look quite similar to their Core 2 predecessors; however, they can be clearly distinguished by the unique gold dots that grace the edge of the CPU package. By the way, notice the stepping of our engineering sample processor, "Q1CK" - Quick. A little cockiness on Intel's behalf? We can't wait to see if it is warranted!

cpu3th.jpg
cpu4th.jpg

Those are 1366 contact points, a huge increase from the 775 that have been used since the Prescott and Cedar Mill Pentium 4 models. The layout of the micro SMD resistors is very interesting because it mimics the actual layout of the core, which you can see on the subsequent page.

cpu5th.jpg
cpu6th.jpg

When placed side-by-side with a Core 2 chip, you can clearly see that the Core i7 is a decent bit larger. It is also no longer square, instead having a slightly rectangular design.

920.jpg
cpu7.jpg

The all-important CPU-Z shots. You can see that although the Core i7-920 and Core i7-965 share an identical bus speed, their QPI Link speed is in fact different.

As usual, the CPU core speed is derived by multiplying the bus speed by a multiplier. Since the FSB is no longer present, the bus speed in question is the QPI source clock (BCLK), which has a stock frequency of 133MHz. The memory controller and the L3 cache run on a separate frequency called the un-core clock, which is 2.13GHz on the i7-920/940 and 2.66GHz on the i7-965. By the way, although our chips are engineering samples, they are manufactured with the final retail stepping, so they perform the same as the chips you will be able to buy in the retail channel.
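To make the multiplier arithmetic concrete, here is a minimal sketch in Python. The 22x multiplier for the i7-940 is the one quoted in our test setup; the 20x and 24x core multipliers and the 16x and 20x un-core multipliers are our own inferences from the listed clock speeds, not figures confirmed by Intel. The BCLK is rounded to 133MHz (nominally 133.33MHz), so the i7-965 works out to 3.19GHz rather than an even 3.2GHz.

```python
# Sketch: deriving Core i7 clock speeds as multiplier x BCLK.
BCLK_MHZ = 133  # stock QPI source clock, rounded (nominally 133.33MHz)

# Assumption: 22x for the i7-940 comes from the test setup notes; the other
# multipliers are inferred from the clock speeds in the specification table.
CORE_MULTIPLIERS = {"i7-920": 20, "i7-940": 22, "i7-965": 24}
UNCORE_MULTIPLIERS = {"i7-920": 16, "i7-940": 16, "i7-965": 20}


def clock_mhz(multiplier, bclk_mhz=BCLK_MHZ):
    """Return a clock speed in MHz for a given multiplier and base clock."""
    return multiplier * bclk_mhz


def clock_table():
    """Map each model to its (core MHz, un-core MHz) pair."""
    return {
        model: (clock_mhz(CORE_MULTIPLIERS[model]),
                clock_mhz(UNCORE_MULTIPLIERS[model]))
        for model in CORE_MULTIPLIERS
    }


for model, (core, uncore) in clock_table().items():
    print(f"{model}: core {core / 1000:.2f}GHz, un-core {uncore / 1000:.2f}GHz")
```

Raising either the BCLK or the multiplier raises the core clock, which is why the unlocked multipliers on the Extreme Edition matter to overclockers.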

Now let’s take an in-depth look at the Core i7’s Nehalem microarchitecture to see what exactly makes it so special.
 

Microarchitecture Dissected #1


ticktock.jpg

In 2007, Intel unveiled the Tick-Tock model as a demonstration of the company's dedication to continued rapid technological innovation. The "tick" is a shrink of the previous architecture's manufacturing process (65nm, then 45nm, then 32nm) and the "tock" is a new architecture. Since Penryn was a shrink and slight improvement of the preceding Core architecture, it was time for a brand new architecture, and that is where Nehalem comes in. On a side note, the codename Nehalem first appeared on Intel's long-term roadmap back in 2002, when it was claimed to be a future version of the NetBurst architecture used in the Pentium 4, but clocked at an amazing 10GHz. We can all breathe a sigh of relief that Intel canned that idea long ago.

quadpenryncoreth.jpg
nehalemcoreth.jpg
Yorkfield quad-core die on the left, new Nehalem quad-core die on the right.​
On the left we have the current Intel "Yorkfield" quad-core die, which is effectively two dual-core dies mounted in one CPU package. On the other hand, Bloomfield-Nehalem is a native quad-core design. We say 'Bloomfield-Nehalem' because the Nehalem architecture was designed to be dynamically scalable and there will be native hexa-core (Westmere) and octo-core (Beckton) models in the future.

Despite the huge and easily visible 12MB L2 cache, the Yorkfield die measures a relatively small 214mm². In comparison, Nehalem comes in at 263mm², a 23% size increase. Although Nehalem has a larger die, it actually has fewer transistors than a quad-core Yorkfield (731M vs. 820M). This is despite the fact that both are manufactured using the same 45nm high-k + metal gate transistor technology. The reason for this 'anomaly' is that Yorkfield has a lot of extremely transistor-dense L2 cache, while Nehalem has less cache but more components on the die (integrated memory controller, QPI link, etc.).
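For those who like to check the math, the die comparison can be worked through with a few lines of Python. This is only a sketch using the figures quoted above; the function and variable names are ours.

```python
# Sketch: comparing Yorkfield and Nehalem die size and transistor density.
YORKFIELD = {"die_mm2": 214, "transistors_m": 820}
NEHALEM = {"die_mm2": 263, "transistors_m": 731}


def percent_increase(old, new):
    """Percentage growth going from old to new."""
    return (new - old) / old * 100


def density_m_per_mm2(chip):
    """Average transistor density in millions of transistors per mm²."""
    return chip["transistors_m"] / chip["die_mm2"]


growth = percent_increase(YORKFIELD["die_mm2"], NEHALEM["die_mm2"])
print(f"Die size increase: {growth:.0f}%")
print(f"Yorkfield density: {density_m_per_mm2(YORKFIELD):.2f}M/mm²")
print(f"Nehalem density:   {density_m_per_mm2(NEHALEM):.2f}M/mm²")
```

The lower average density on Nehalem is consistent with the explanation above: logic and I/O blocks pack fewer transistors per square millimetre than cache does.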

Now many of you are probably looking at the Nehalem die picture and saying "It is pretty but what am I looking at exactly?". A valid question, so let's take a look at the Nehalem core's layout:

nehalemcorelayout.jpg

We are not actually going to start critiquing the pros and cons of this core layout (we aren't that knowledgeable), but it is still amazing that all four cores, 256KB L1 cache, 1024KB L2 cache, 8MB L3 cache, one QuickPath interface, and the memory controller are on a single die.
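The cache totals quoted here follow directly from the per-core figures in the specification table. As a quick sketch (names are ours):

```python
# Sketch: totalling Nehalem's on-die cache from the per-core figures.
CORES = 4
L1_DATA_KB = 32         # per core
L1_INSTRUCTION_KB = 32  # per core
L2_KB = 256             # per core
L3_SHARED_MB = 8        # shared by all cores, so not multiplied


def cache_totals(cores=CORES):
    """Return total L1 and L2 (in KB) and the shared L3 (in MB)."""
    return {
        "l1_kb": cores * (L1_DATA_KB + L1_INSTRUCTION_KB),
        "l2_kb": cores * L2_KB,
        "l3_mb": L3_SHARED_MB,
    }


print(cache_totals())
```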

scalablemodular.jpg

Now as mentioned above, the Nehalem architecture is dynamically scalable, and that is because it was designed with modularity in mind. What this means is that Intel can custom create processors based on the needs of the market without having to design a brand new chip from scratch. They can add or remove cores, L3 cache, number of QPI links, number of memory channels, type of memory supported, power management, and even integrated graphics. Therefore, Intel now have the ability to add new blocks to the core without having to go back to the drawing board and redesign the whole layout. Amazingly, they are only limited by how much stuff they can actually fit on one CPU package.


On the next page we will examine some of the more functional features that Intel have built into the Core i7 processors.
 

Microarchitecture Dissected #2


Although Nehalem has been highly touted as one of the most significant architectural overhauls ever, it still shares significant roots with the original P6 microarchitecture that debuted in the Pentium Pro in 1995. Furthermore, there is no denying that Nehalem was built upon a Penryn base. From there, however, Intel's engineers have added significant performance-oriented features, like an integrated memory controller, a completely new system interconnect, and a multi-level shared cache. As you will see, they have also focused a great deal on the chip's power efficiency capabilities.

Let's examine some of these advancements:


  • QPI

For the Nehalem architecture, Intel has forgone the legacy front side bus in favour of the QuickPath Interconnect (QPI). The QPI is a high-speed, low-latency point-to-point processor link. From a technical standpoint, the QPI is a bi-directional, 20-bit wide bus that is integrated onto the processor itself. The result? An incredibly fast interconnect that improves overall bandwidth while reducing latency. This high-speed interface is used to access the distributed shared memory, it helps cores communicate with each other, and it links up with the X58 northbridge, now known as the IO Hub (IOH).

Consumer-oriented Nehalem models will have a single QPI link, but the workstation/server processors will have up to four of these high-speed interconnects. With its faster 6.4 Gigatransfers per second (GT/s) QPI link, the 965 Extreme Edition will benefit from a theoretical bandwidth of 25.6GB/s, which is double the bandwidth offered by the 1600MHz front side bus implemented in the X48 Express chipset. Interestingly, it is also equivalent to Nehalem's triple-channel DDR3-1066 memory bandwidth. The lesser 920 and 940 models feature a 4.8GT/s QPI interface with 19.2GB/s of bandwidth.
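The bandwidth figures above can be reproduced with some simple arithmetic. The sketch below rests on assumptions that go beyond what is stated in this review: that 16 of the QPI's 20 bits carry data (2 bytes per transfer), that the quoted figure sums both directions, and that the front side bus and each DDR3 channel are 64 bits (8 bytes) wide. The function names are ours.

```python
# Sketch: theoretical peak bandwidth of QPI, the old FSB, and DDR3.
def qpi_bandwidth_gbs(gt_per_s, payload_bytes=2, directions=2):
    """QPI bandwidth in GB/s; assumes 2 data bytes per transfer, both ways."""
    return gt_per_s * payload_bytes * directions


def fsb_bandwidth_gbs(mt_per_s, bus_bytes=8):
    """Front side bus bandwidth in GB/s; assumes a 64-bit wide bus."""
    return mt_per_s * bus_bytes / 1000


def ddr3_bandwidth_gbs(mt_per_s, channels=3, channel_bytes=8):
    """Memory bandwidth in GB/s; assumes 64-bit channels."""
    return mt_per_s * channel_bytes * channels / 1000


print(f"QPI 6.4GT/s:        {qpi_bandwidth_gbs(6.4):.1f}GB/s")
print(f"QPI 4.8GT/s:        {qpi_bandwidth_gbs(4.8):.1f}GB/s")
print(f"FSB 1600:           {fsb_bandwidth_gbs(1600):.1f}GB/s")
print(f"3 x DDR3-1066:      {ddr3_bandwidth_gbs(1066):.1f}GB/s")
```

Keep in mind that these are theoretical peaks; real-world throughput will be lower.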

Some of you may be wondering why a replacement for the front-side bus was needed. Well, the easy answer is that the conventional shared-bus interconnect topology was bandwidth-starved and not scalable. The front-side bus is a decade-old concept that was never meant for a multi-core era. At the moment, there are significant communication bottlenecks between the processor and chipset, as well as among the various cores in one processor. For example, if one core wanted to communicate with a core on another die or access that core's L2 cache, the data had to go through the slow FSB, causing a bottleneck and performance hit. There is no denying that current multi-core processors perform very well, but they are simply relying on their large cache to offset the front-side bus bottleneck issues.


  • Integrated Memory Controller (IMC)

mem.jpg

Following AMD's lead, Intel has finally integrated the memory controller into the processor itself. As a result, the memory is directly connected to the processor, which not only means significantly lower latency, but much higher bandwidth as well. Current Core i7 processors feature a triple-channel memory interface, and each channel can support one or two DDR3 modules. This means that memory modules should be installed in sets of three, not two as has been the norm since the dual-channel memory architecture was first introduced back in 2003. It also means that most Core i7 motherboards will ship with three or six memory slots, but you will see the occasional four-slot design, like Intel's DX58SO Smackover motherboard.

With this new design, Intel claims up to a 3.4x increase in memory bandwidth from Penryn, as well as 40% lower memory latency. We definitely look forward to testing out that claim.


  • SSE4.2

Building upon Penryn's implementation of SSE4.1, which focused on improving video encoding, image/video editing, 3D game physics, and so on, Nehalem adds 7 new instructions, namely Accelerated String and Text New Instructions (STTNI) and Application Targeted Acceleration (ATA), which focus on faster XML parsing, faster search and pattern matching, and other cryptic processor functions.

Keep in mind that with Penryn, the SSE4 instructions were responsible for the most significant performance increases, so we definitely look forward to seeing what Intel have accomplished with these latest instructions.
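To give a feel for what the Application Targeted Acceleration instructions do, here is a software sketch of the two operations they put into hardware. The instruction names are not given in this review; to the best of our knowledge the pair is POPCNT (count the set bits in a value) and CRC32 (accumulate a CRC-32C checksum). The Python below is purely illustrative and obviously nowhere near as fast as a single hardware instruction.

```python
# Sketch: software equivalents of the operations behind POPCNT and CRC32.
def popcount(value):
    """Count the number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def crc32c(data):
    """Bitwise CRC-32C (Castagnoli polynomial, reflected form 0x82F63B78)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


print(popcount(0b10110))
print(hex(crc32c(b"123456789")))
```

Bit counting shows up in things like pattern recognition, and CRC checks are common in storage and networking, which is why these made the cut as dedicated instructions.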


  • Hyper-Threading

htsmall.gif

Nehalem also brings Hyper-Threading (HT) back from the dead. With HT enabled, a processor with four physical cores is viewed by the operating system as having eight logical cores. A core usually processes the pieces of different threads one after another, whereas an HT-enabled core can process two threads simultaneously. While Hyper-Threading did not perform particularly well on the Pentium 4, Nehalem's architecture was designed to remove many of the processing bottlenecks. Depending on the workload, and how effectively multi-threaded an application is, the performance increases could be 20% or higher.


  • Power Control Unit (PCU)

Nehalem’s Power Control Unit (PCU) is an extremely innovative power management feature that uses an on-chip micro-controller to actively manage the power and performance of the entire processor with the help of numerous integrated power sensors. The PCU can dynamically alter the voltage and frequency of the CPU cores to lower power consumption or provide a performance boost in conjunction with the new Turbo Mode feature. Also, thanks to a development known as Power Gates, idle cores can be completely shut down and placed in a C6 sleep mode while other cores continue working. This is noteworthy because C6 mode had previously only been featured on mobile processors.


  • Turbo Mode

tma.gif

For the first time ever, Intel has included a feature that automatically overclocks a processor based on the workload demand. Basically, all Core i7 processors come with two additional speed bins, which is to say that they have two higher multipliers that they can use under certain scenarios. For example, if you are using a single-threaded application, the PCU will down-clock or shut down three cores, thereby freeing up power and lowering heat output while "overclocking" that one core that is in use. If an application is multi-threaded and the cores are not running too hot, the PCU will overclock all the cores up one speed bin. The only limit to Turbo Mode is the power and thermal headroom, so keeping your processor cool should definitely be an even greater priority with the Core i7 series than it ever was.
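As a rough illustration of the behaviour described above, here is a hypothetical model of how the speed bins might be applied. This is our own simplification and not Intel's actual PCU logic: we assume a lone active core gets both extra bins, several active cores get one, and no boost is applied without power and thermal headroom. The 20x base multiplier for the i7-920 is inferred from its 2.66GHz clock and the 133MHz BCLK.

```python
# Sketch: a simplified, hypothetical model of Turbo Mode speed bins.
BCLK_MHZ = 133  # stock BCLK, rounded


def turbo_multiplier(base_multiplier, active_cores, has_headroom):
    """Return the multiplier after Turbo Mode is applied.

    Assumption: one active core earns two speed bins, more than one earns
    a single bin, and nothing is added without power/thermal headroom.
    """
    if not has_headroom:
        return base_multiplier
    bins = 2 if active_cores == 1 else 1
    return base_multiplier + bins


def turbo_clock_mhz(base_multiplier, active_cores, has_headroom):
    """Resulting core clock in MHz for the given scenario."""
    return turbo_multiplier(base_multiplier, active_cores, has_headroom) * BCLK_MHZ


for cores, headroom in [(1, True), (4, True), (4, False)]:
    mhz = turbo_clock_mhz(20, cores, headroom)
    print(f"{cores} active core(s), headroom={headroom}: {mhz}MHz")
```

Since each bin is one BCLK step, a single bin is worth roughly 133MHz, so the gains are modest but free.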


Taken as a whole, these new performance and energy-saving features are what truly distinguish Nehalem as a veritable next-generation microarchitecture. They are little elements that some users may never know exist, but which ultimately deliver a superior computing experience. We look forward to testing and examining each and every one of these new capabilities.


If you are truly interested in knowing everything there is to know about this new microarchitecture, we highly recommend that you watch this Intel Developer Forum 2008 presentation by Steve Pawlowski, the Digital Enterprise Group chief technology officer and general manager for Architecture and Planning for Intel Corporation.
 

X58 Express Chipset Examined



x58blockdiagram.jpg

As mentioned previously, along with the Core i7 processors, Intel will be launching the X58 Tylersburg northbridge, now known as the IO Hub (IOH). This reclassification has occurred because the memory controller has been integrated into the processor. As a result, the IO Hub is now solely responsible for implementing PCI Express lanes and linking to the I/O Controller Hub (ICH) southbridge. Since the front side bus is no more, the X58 communicates with the processor via the new hyper-fast QuickPath point-to-point link, and it is connected to the ICH via the traditional Direct Media Interface (DMI). The southbridge is the venerable ICH10 found on all P45 Express motherboards.

The X58 supports 32 PCI-Express 2.0 lanes, which means that it supports two PCI-E x16 slots. CrossFire support is still present, and some X58 motherboards have been certified for NVIDIA SLI, which is something that enthusiasts the world over have been waiting for with great anticipation.

This Intel chipset is manufactured on the venerable 65nm process, it measures 37.5mm x 37.5mm, and it has a high 24.1W TDP. With these specifications, it is no surprise that it is one fairly hot running chipset and you will be seeing fairly hefty chipset cooling solutions by motherboard manufacturers.

Here are some of the features and functionality that Intel has added to the X58 chipset.

x58chipsetfeatures.jpg
 

The Platform in Pictures


Stock Cooler


Need a reading break? Well enjoy this picture gallery which provides an up-close look at the various non-processor components that Intel will be launching with this new platform.

With a Thermal Design Power (TDP) of 130W, the Core i7 series has the potential to be a hot little chip, but Intel have bundled a good-sized copper-core cooler to cope with the heat output. As you can see, it is huge compared to the relatively dinky cooler that was included with the cool-running 45nm Core 2 Duos.


Intel DX58SO Motherboard


Skull_Circuitry.jpg

The Smiling Skull: scaring people into buying Intel products since '07, and the face of Intel's performance motherboards for the last few years.

platform1.jpg

Along with the three Core i7 models, Intel will also soon be launching the DX58SO 'Smackover' motherboard, which has served as our test platform, and which has proven itself to be a solid foundation on which to launch a new product line.

platform2th.jpg
platform4th.jpg

As with all their enthusiast-oriented motherboards, Intel's Smackover features a simple black, blue, and white theme. As you can see, the Skull logo has a prominent place on the X58 IOH heatsink. While the aluminum heatsink may look plain, it performs well and has a truly excellent mounting system.

platform5th.jpg
platform6th.jpg

As has become the industry norm, Intel have used long-lasting solid state capacitors throughout the motherboard. The Smackover features a well-spaced six-phase power design, with mighty-looking PA2080 power inductors from Pulse Engineering. All the MOSFETs are covered by their own heatsinks and have thermal material applied to them. The socket area is a little crowded, and installing low and wide CPU coolers can be a little tricky, but some of the larger ones like the Thermalright Ultra Extreme (with the optional i7 mounting kit) installed without a problem during our testing.

platform7th.jpg
platform8th.jpg

Here we have the brand new LGA1366 socket with its 1366 contact points. We counted, they're all there. As you can see the whole socket features a sturdy bolt-thru design.

platform9th.jpg
platform10th.jpg

Although the overwhelming majority of Core i7 motherboards will ship with three or six memory slots, Intel have gone with a four-slot design that is placed very close to the CPU socket. According to Intel, this layout ensures smoother, shorter data routing.

On the other hand, we really don't like the location of the 8-pin CPU power connector, since it's placed in between the IOH heatsink and the back of any installed graphics card. There is very likely a power-routing reason for why it was placed in that location, but we hope other motherboard manufacturers place it closer to the edge of the motherboard.

platform11th.jpg
platform12th.jpg

Evidently this platform needs some serious juice as here we have a 12-volt auxiliary power connector. We will be testing the power consumption a little later. Thankfully, the 24-pin main power connector is located in the traditional location.


platform13th.jpg
platform14th.jpg

We are happy to see the SATA ports all along the motherboard's edge, but there is one flaw. If you use a dual-slot graphics card that is 10-inches long, you lose access to one SATA port. If you use two such cards, you lose access to three ports in total. This is slightly disappointing, but regrettably all too common. The ICH10 southbridge is cooled by a small but capable heatsink.

platform15th.jpg
platform16th.jpg

The Smackover comes with one legacy PCI slot, two PCI-E x1 slots, one PCI-E x4 slot, and two mighty PCI-E 2.0 x16 slots. The x16 slots are well-spaced, so all but the most radical cooling solutions should fit without any issues.

platform17.jpg

The rear I/O panel looks a little sparse, and there is very good reason for this...no more legacy PS/2 keyboard and mouse ports! That's right folks, if you want to use this motherboard you are going to have to kiss that beige keyboard and ball mouse goodbye.
The remaining ports are one Gigabit LAN, eight USB2.0, two eSATA, one FireWire, six audio jacks, optical and coaxial S/PDIF connectors.

platform19th.jpg
platform20th.jpg


Even though the X58 chipset doesn't really get particularly hot, Intel has nevertheless included a neat little glowing fan. Noise-o-phobics will be glad to know that it is barely audible.


platform21th.jpg
platform22th.jpg

Ok, these are just gratuitous drool-inducing hardware pics. By the way, don't get too excited, as mentioned previously the Smackover has actually not yet been certified for SLI. We will be taking a good look at multi-GPU scaling in an upcoming article.

Now it's high time for some actual benchmarks, don't you agree? Keep on reading!
 

Test Setups & Methodology



For the benchmarking section, we decided to disable the Turbo Mode feature. The logic behind this decision was two-fold. First, since Turbo Mode varies clock speeds based on CPU temperatures, we could not guarantee that the clock speeds our chips ran at would be anywhere near what you could achieve. There were simply too many variables to take into consideration to be able to use this feature and maintain a transparent, reproducible benchmark methodology. Secondly, by forcing the processors to run at their native speeds, we could better determine how the Core i7 series performs on a clock-per-clock basis in comparison to previous processors.

Intel Core i7 Test Setup​
<table table="" align="center" bgcolor="#666666" border="0" cellpadding="4" cellspacing="1" width="670"><tbody><tr><td align="left" bgcolor="#cc9999" width="30">Motherboard:</td><td align="justify" bgcolor="#ececec" width="200">Intel DX58SO 'Smackover'</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Processor:</td><td align="justify" bgcolor="#ececec" width="200">Intel Core i7-920, Core i7-940*, and Core i7-965 Extreme Edition</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Processor Cooling:</td><td align="justify" bgcolor="#ececec" width="200">Thermalright Ultra-120 eXtreme LGA-1366<br>Thermalright TR-FDB-12-1600 120MM FAN - 63.7CFM 1600RPM</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Memory:</td><td align="justify" bgcolor="#ececec" width="200">3 x 1GB Qimonda PC3-8300 7-7-7 1.5V</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Power Supply:</td><td align="justify" bgcolor="#ececec" width="200">Corsair HX620W</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Video Card:</td><td align="justify" bgcolor="#ececec" width="200">BFG GeForce 9800GTX 512MB</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Hard Drive:</td><td align="justify" bgcolor="#ececec" width="200">Western Digital 320GB WD3200AAKS-00B3A0</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Operating System:</td><td align="justify" bgcolor="#ececec" width="200">Windows Vista Ultimate SP1 64-bit (with all updates)</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Chipset Drivers:</td><td align="justify" bgcolor="#ececec" width="200">ForceWare 175.16<br>Intel 9.1.1.1010</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Special Notes:</td><td align="justify" bgcolor="#ececec" width="200">*In order to recreate a Core i7-940, we simply downclocked the 965 to 2.93GHz (22x133) and reduced the QPI to 4.8GT/s</td></tr></tbody></table>


Intel Core 2 Test Setup​
<table table="" align="center" bgcolor="#666666" border="0" cellpadding="4" cellspacing="1" width="670"><tbody><tr><td align="left" bgcolor="#cc9999" width="30">Motherboard:</td><td align="justify" bgcolor="#ececec" width="200">Asus Maximus Formula X38 (Bios 0907)</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Processor:</td><td align="justify" bgcolor="#ececec" width="200">Intel Pentium Dual Core E5200, Core 2 Duo E8400, Core 2 Quad Q6600, Core 2 Extreme QX9770</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Memory:</td><td align="justify" bgcolor="#ececec" width="200">OCZ Reaper HPC 4GB 2X2GB PC2-8500 DDR2-1066 5-5-5-15</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Power Supply:</td><td align="justify" bgcolor="#ececec" width="200">Corsair HX620W</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Video Card:</td><td align="justify" bgcolor="#ececec" width="200">eVGA GeForce 9800GTX 512MB</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Hard Drive:</td><td align="justify" bgcolor="#ececec" width="200">Western Digital 320GB WD3200AAKS-00B3A0</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Operating System:</td><td align="justify" bgcolor="#ececec" width="200">Windows Vista Ultimate SP1 64-bit (with all updates)</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Chipset Drivers:</td><td align="justify" bgcolor="#ececec" width="200">ForceWare 175.16</td></tr></tbody></table>

AMD Athlon X2 & Phenom Test Setup​
<table table="" align="center" bgcolor="#666666" border="0" cellpadding="4" cellspacing="1" width="670"><tbody><tr><td align="left" bgcolor="#cc9999" width="30">Motherboard:</td><td align="justify" bgcolor="#ececec" width="200">Asus M3A32-MVP Deluxe 790FX (Bios 1002 – Latest)</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Processor:</td><td align="justify" bgcolor="#ececec" width="200">AMD X2 6400+, Phenom X3 8750, Phenom X4 9850 Black Edition</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Memory:</td><td align="justify" bgcolor="#ececec" width="200">OCZ Reaper HPC 4GB 2X2GB PC2-8500 DDR2-1066 5-5-5-15</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Power Supply:</td><td align="justify" bgcolor="#ececec" width="200">Corsair HX620W</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Video Card:</td><td align="justify" bgcolor="#ececec" width="200">eVGA GeForce 9800GTX 512MB</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Hard Drive:</td><td align="justify" bgcolor="#ececec" width="200">Western Digital 320GB WD3200AAKS-00B3A0</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Operating System:</td><td align="justify" bgcolor="#ececec" width="200">Windows Vista Ultimate SP1 64-bit (with all updates)</td></tr><tr><td align="left" bgcolor="#cc9999" width="30">Chipset Drivers:</td><td align="justify" bgcolor="#ececec" width="200">ForceWare 175.16</td></tr></tbody></table>


To ensure consistent results, a few tweaks were applied to Windows Vista SP1:
  • Page File – Disabled
  • System Protection/Restore – Disabled
  • Problem & Error Reporting – Disabled
  • Remote Desktop/Assistance – Disabled
  • Windows Security Center Alerts – Disabled
  • Windows Defender – Disabled
  • Visual Effects – Disabled


All scores you see in the following pages are the averages of 5 benchmark runs. If there were any clearly anomalous results, the 5-loop run was repeated.
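
For anyone wanting to reproduce this averaging approach, here is a minimal Python sketch. The review does not say how an "anomalous" result was judged, so the 5% deviation-from-median rule, the function names, and the sample scores below are all assumptions made purely for illustration.

```python
from statistics import mean, median


def has_anomaly(runs, tolerance=0.05):
    """Return True if any run deviates from the median by more than `tolerance`.

    The 5% default is an assumed threshold, not one stated in the review.
    """
    mid = median(runs)
    return any(abs(r - mid) / mid > tolerance for r in runs)


def report_score(runs, tolerance=0.05):
    """Average a 5-run loop, or return None to signal the loop should be repeated."""
    if len(runs) != 5:
        raise ValueError("expected exactly 5 runs")
    if has_anomaly(runs, tolerance):
        return None
    return mean(runs)


# Hypothetical scores, not taken from the review.
clean = [100.0, 101.0, 99.0, 100.0, 100.0]
noisy = [100.0, 101.0, 99.0, 100.0, 80.0]

print(report_score(clean))  # average of a clean loop
print(report_score(noisy))  # None: one run is far off, so the loop would be rerun
```

Comparing against the median rather than the mean keeps a single bad run from dragging the reference point toward itself.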


Here is a full list of the applications that we utilized in our benchmarking suite:
  • 3DMark06 v1.1.0
  • 3DMark Vantage Professional Edition v1.0.1
  • Cinebench R10 64-bit
  • Crysis v1.2
  • Half-Life 2: Episode 2 (Latest Updates)
  • Lavalys Everest Ultimate Edition v4.60.1500
  • iTunes v7.6.1.9
  • PCMark Vantage Advanced 64-Bit Edition (1.0.0.0)
  • Super PI Mod 1.5
  • Supreme Commander v1.1.3280
  • Unreal Tournament 3 v1.2
  • Valve Particle Simulation Benchmark
  • WinRAR 3.71
  • x264 HD Benchmark v1.0


So without further ado, let's get to the good stuff!
 
Feature Test: Goodbye FSB – Hello QPI



As mentioned previously, the QuickPath Interconnect (QPI) is the brand new high-speed interface that has replaced the decade-old front side bus. We are all familiar with the good ol' FSB, and we also know that increasing its speed generally provides improved performance. Therefore, we were interested in determining whether there is a noticeable performance difference between a 4.8GT/s - 2400MHz QPI Link and a 6.4GT/s - 3200MHz QPI Link.
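
To put those two link speeds in perspective, here is a rough Python sketch of the arithmetic. The GT/s-to-MHz relationship (two transfers per clock) matches the figures quoted above; the 2 bytes of data per transfer per direction is an assumption based on Intel's published QPI figures rather than anything measured in this review, and the function names are ours.

```python
# Assumption: each QPI transfer carries 16 bits (2 bytes) of data per direction.
BYTES_PER_TRANSFER = 2


def qpi_clock_mhz(gt_per_s):
    """Link clock in MHz, given two transfers per clock cycle."""
    return gt_per_s * 1000 / 2


def qpi_bandwidth_gb_s(gt_per_s, bidirectional=False):
    """Theoretical peak bandwidth in GB/s for one link."""
    per_direction = gt_per_s * BYTES_PER_TRANSFER
    return per_direction * 2 if bidirectional else per_direction


for rate in (4.8, 6.4):
    print(
        f"{rate}GT/s: {qpi_clock_mhz(rate):.0f}MHz clock, "
        f"{qpi_bandwidth_gb_s(rate):.1f}GB/s per direction, "
        f"{qpi_bandwidth_gb_s(rate, bidirectional=True):.1f}GB/s total"
    )
```

These are theoretical peaks only; they say nothing about how much of that bandwidth a desktop workload actually uses.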

4.8QPIa.jpg
6.4QPIa.jpg

To isolate the QuickPath Interconnect as the focus of the test, we downclocked our Core i7-965 to 2.66GHz, and manually adjusted the QPI speed option in the BIOS. Are consumers missing out by buying the lower-end Core i7-920 and 940 models? Let's find out!

<table align="center" table border="0" bgcolor="#666666" cellpadding="5" cellspacing="1" width="735px"><tr><td align="center" bgcolor="#cc9999" width="130"><b></b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965 @ 2.66GHz - 4.8GT/s QPI</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965 @ 2.66GHz - 6.4GT/s QPI</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Performance Difference</b></td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>3DMark Vantage: CPU Score</b></td><td align="center" bgcolor="#ececec" width="100">16540</td><td align="center" bgcolor="#ececec" width="100">16480</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>Valve Particle Simulation Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">130</td><td align="center" bgcolor="#ececec" width="100">130</td><td align="center" bgcolor="#ececec" width="100">None</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>Cinebench R10 1-CPU</b></td><td align="center" bgcolor="#ececec" width="100">3584</td><td align="center" bgcolor="#ececec" width="100">3571</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>Cinebench R10 Multi-CPU</b></td><td align="center" bgcolor="#ececec" width="100">15745</td><td align="center" bgcolor="#ececec" width="100">15820</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>x264 HD Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">22.87 fps</td><td align="center" bgcolor="#ececec" width="100">22.87 fps</td><td align="center" bgcolor="#ececec" width="100">None</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>WinRAR 3.71 Compression</b></td><td align="center" bgcolor="#ececec" width="100">3:04</td><td align="center" bgcolor="#ececec" width="100">3:00</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>SuperPI 1M</b></td><td align="center" bgcolor="#ececec" width="100">15.641s</td><td align="center" bgcolor="#ececec" width="100">15.625s</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>SuperPI 32M</b></td><td align="center" bgcolor="#ececec" width="100">14:05.465s</td><td align="center" bgcolor="#ececec" width="100">14:04.715s</td><td align="center" bgcolor="#ececec" width="100">Insignificant</td></tr></table>

The results certainly speak for themselves: a faster QPI Link does not increase overall performance, at least in our tests. It is quite possible that under a specific workload the faster QPI speed would in fact distinguish itself; however, we suspect that this type of workload would only be found in the high-end workstation or server sector. It is also quite possible that even the 4.8GT/s QPI Link provides more than enough bandwidth for any possible workload.
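
For readers who would rather see numbers than "Insignificant", the sketch below turns the QPI table into percentage differences. The helper names are ours, and treating timed results as a speed ratio (old time divided by new time) is just one reasonable convention, not necessarily how the table's labels were decided.

```python
def to_seconds(value):
    """Convert 'm:ss', 'm:ss.fff' or 'ss.fff' (optional trailing 's') to seconds."""
    seconds = 0.0
    for part in value.rstrip("s").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def relative_gain(at_4_8, at_6_4, timed=False):
    """Percent gain of the 6.4GT/s run over the 4.8GT/s run (positive = faster)."""
    if timed:
        return (to_seconds(at_4_8) / to_seconds(at_6_4) - 1) * 100
    return (at_6_4 / at_4_8 - 1) * 100


# (benchmark, 4.8GT/s result, 6.4GT/s result, timed?) copied from the QPI table
qpi_results = [
    ("3DMark Vantage: CPU Score", 16540, 16480, False),
    ("Valve Particle Simulation Benchmark", 130, 130, False),
    ("Cinebench R10 1-CPU", 3584, 3571, False),
    ("Cinebench R10 Multi-CPU", 15745, 15820, False),
    ("x264 HD Benchmark", 22.87, 22.87, False),
    ("WinRAR 3.71 Compression", "3:04", "3:00", True),
    ("SuperPI 1M", "15.641s", "15.625s", True),
    ("SuperPI 32M", "14:05.465s", "14:04.715s", True),
]

for name, slow_link, fast_link, timed in qpi_results:
    print(f"{name}: {relative_gain(slow_link, fast_link, timed):+.2f}%")
```

Under this convention every row lands within about 2.5% either way, with WinRAR showing the largest gap at 4 seconds on a 3-minute run.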

Although we may have raised more questions than answers, it is clear that those who buy the lower-end Core i7 models and overclock them to Extreme Edition speeds will not be bottlenecked in any discernible manner.
 
Feature Test: The Return of Hyper-Threading



After the less-than-successful implementation of Hyper-Threading that we experienced with the Pentium 4, we were not particularly excited to see this feature make a comeback. However, Intel assured everyone that the numerous architectural advancements they had made to Nehalem were specifically designed to eliminate any of the bottlenecks that Hyper-Threading can cause.

HT.gif

It doesn't matter how technologically jaded you are, opening the Windows Task Manager on a Core i7 system is a smile-inducing experience.

With its shorter, faster, more efficient pipeline (ability to simultaneously process up to four instructions), can Nehalem truly make Hyper-Threading a worthwhile feature with real-world performance gains? Let's find out.

<table align="center" table border="0" bgcolor="#666666" cellpadding="5" cellspacing="1" width="735px"><tr><td align="center" bgcolor="#cc9999" width="130"><b></b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965<br> - HT Enabled</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Intel Core i7-965<br> - HT Disabled</b></td><td align="center" bgcolor="#cc9999" width="180"><b>Performance Difference</b></td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>3DMark Vantage: CPU Score</b></td><td align="center" bgcolor="#ececec" width="100">19546</td><td align="center" bgcolor="#ececec" width="100">14644</td><td align="center" bgcolor="#ececec" width="100">+33%</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>Valve Particle Simulation Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">153</td><td align="center" bgcolor="#ececec" width="100">133</td><td align="center" bgcolor="#ececec" width="100">+15%</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>Cinebench R10 Multi-CPU</b></td><td align="center" bgcolor="#ececec" width="100">18569</td><td align="center" bgcolor="#ececec" width="100">16067</td><td align="center" bgcolor="#ececec" width="100">+15%</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>x264 HD Benchmark</b></td><td align="center" bgcolor="#ececec" width="100">27.23 fps</td><td align="center" bgcolor="#ececec" width="100">21.55 fps</td><td align="center" bgcolor="#ececec" width="100">+26%</td></tr><tr><td align="center" bgcolor="#ececec" width="100"><b>WinRAR 3.71 Compression</b></td><td align="center" bgcolor="#ececec" width="100">2:39</td><td align="center" bgcolor="#ececec" width="100">3:02</td><td align="center" bgcolor="#ececec" width="100">+15%</td></tr></table>

We think a "WOW" is called for. Intel have come through: gains ranged from roughly 15% to 33%, with x264 and 3DMark Vantage meeting or beating the predicted 20-30%. Nehalem was always touted as being a multi-threading monster, and this certainly proves it. Clearly, if you run heavily multi-threaded applications, the Core i7 series is a very attractive proposition.
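
As a quick sanity check, the Hyper-Threading gains can be recomputed from the raw results and compared against the predicted 20-30% range. This is only a sketch: the function names are ours, the WinRAR times have been converted to seconds by hand (2:39 = 159s, 3:02 = 182s), and a gain computed this way can differ by a point or so from the rounded figure in the table.

```python
PREDICTED_LOW, PREDICTED_HIGH = 20.0, 30.0  # predicted HT gain in percent, as quoted above

# (benchmark, HT enabled, HT disabled, lower_is_better) from the Hyper-Threading table
ht_results = [
    ("3DMark Vantage: CPU Score", 19546, 14644, False),
    ("Valve Particle Simulation Benchmark", 153, 133, False),
    ("Cinebench R10 Multi-CPU", 18569, 16067, False),
    ("x264 HD Benchmark (fps)", 27.23, 21.55, False),
    ("WinRAR 3.71 Compression (s)", 159, 182, True),
]


def ht_gain(enabled, disabled, lower_is_better=False):
    """Percent speedup from enabling Hyper-Threading."""
    ratio = disabled / enabled if lower_is_better else enabled / disabled
    return (ratio - 1) * 100


def classify(gain, low=PREDICTED_LOW, high=PREDICTED_HIGH):
    """Place a gain below, within, or above the predicted range."""
    if gain < low:
        return "below"
    if gain > high:
        return "above"
    return "within"


for name, enabled, disabled, lower_is_better in ht_results:
    gain = ht_gain(enabled, disabled, lower_is_better)
    print(f"{name}: {gain:+.1f}% ({classify(gain)} the predicted range)")
```

For the timed WinRAR result the gain is expressed as a speed ratio (old time divided by new time), which is an assumption about convention; measuring it as time saved would give a slightly smaller figure.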
 
Feature Test: Turbo Mode



tma.gif

As mentioned previously, Turbo Mode is a potentially exciting new feature that automatically unlocks two additional multipliers and allows the processor to self-overclock based on thermal conditions and workload. If the Power Control Unit (PCU) senses that only one core is active and the other three are in an idle state, it will use the unused power and thermal headroom to overclock that single active core to ensure superior single-threaded performance. Conversely, if you are running a multi-threaded application, the PCU will measure the thermal headroom and if the processor is running cool enough it will overclock all four cores. Turbo Mode can overclock a single core by a maximum of two speed bins (multipliers), thus 266MHz higher at the stock 133MHz BCLK. When overclocking all four cores, it can increase the frequency by 133MHz.

Although the results will be fairly self-evident, we have to measure the performance boost that Turbo Mode provides on the top-end Core i7-965 model. As per the above, thermal conditions permitting (and they were), it will run one core at 3.46GHz for single-threaded workloads, and all four cores at 3.33GHz for multi-threaded scenarios.
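
The speed-bin arithmetic can be sketched in a few lines of Python. The 24x stock multiplier for the Core i7-965 is an assumption inferred from the clock speeds quoted above, the 133MHz BCLK is the rounded figure used throughout this review, and the function names are ours.

```python
BCLK_MHZ = 133          # stock base clock, rounded as in the review
I7_965_MULTIPLIER = 24  # assumption: stock multiplier of the Core i7-965


def turbo_clock_mhz(multiplier, extra_bins, bclk_mhz=BCLK_MHZ):
    """Clock speed after Turbo Mode raises the multiplier by `extra_bins`."""
    return (multiplier + extra_bins) * bclk_mhz


def max_uplift_percent(multiplier, extra_bins):
    """Upper bound on the speedup if performance scaled perfectly with clock speed."""
    return extra_bins / multiplier * 100


for label, bins in (("stock", 0), ("four cores active", 1), ("one core active", 2)):
    clock = turbo_clock_mhz(I7_965_MULTIPLIER, bins)
    uplift = max_uplift_percent(I7_965_MULTIPLIER, bins)
    print(f"{label}: {clock}MHz (+{uplift:.1f}%)")
```

The uplift figures are ceilings: they assume performance scales perfectly with clock speed, so real-world gains should come in at or below them.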

<table table="" align="center" bgcolor="#666666" border="0" cellpadding="5" cellspacing="1" width="735"><tbody><tr><td align="center" bgcolor="#cc9999" width="130"></td><td align="center" bgcolor="#cc9999" width="180">Intel Core i7-965 - Turbo Disabled</td><td align="center" bgcolor="#cc9999" width="180">Intel Core i7-965 - Turbo Enabled</td><td align="center" bgcolor="#cc9999" width="180">Performance Difference</td></tr><tr><td align="center" bgcolor="#ececec" width="100">3DMark Vantage: CPU Score</td><td align="center" bgcolor="#ececec" width="100">19646</td><td align="center" bgcolor="#ececec" width="100">20301</td><td align="center" bgcolor="#ececec" width="100">+3%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">Valve Particle Simulation Benchmark</td><td align="center" bgcolor="#ececec" width="100">153</td><td align="center" bgcolor="#ececec" width="100">161</td><td align="center" bgcolor="#ececec" width="100">+5%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">Cinebench R10 1-CPU</td><td align="center" bgcolor="#ececec" width="100">4252</td><td align="center" bgcolor="#ececec" width="100">4474</td><td align="center" bgcolor="#ececec" width="100">+5%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">Cinebench R10 Multi-CPU</td><td align="center" bgcolor="#ececec" width="100">18569</td><td align="center" bgcolor="#ececec" width="100">19097</td><td align="center" bgcolor="#ececec" width="100">+3%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">x264 HD Benchmark</td><td align="center" bgcolor="#ececec" width="100">27.23 fps</td><td align="center" bgcolor="#ececec" width="100">28.40 fps</td><td align="center" bgcolor="#ececec" width="100">+4%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">WinRAR 3.71 Compression</td><td align="center" bgcolor="#ececec" width="100">2:39</td><td align="center" bgcolor="#ececec" width="100">2:36</td><td align="center" bgcolor="#ececec" width="100">+2%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">SuperPI 1M</td><td align="center" bgcolor="#ececec" width="100">12.960s</td><td align="center" bgcolor="#ececec" width="100">12.078s</td><td align="center" bgcolor="#ececec" width="100">+7%</td></tr><tr><td align="center" bgcolor="#ececec" width="100">SuperPI 32M</td><td align="center" bgcolor="#ececec" width="100">12:04.755s</td><td align="center" bgcolor="#ececec" width="100">11:16.562s</td><td align="center" bgcolor="#ececec" width="100">+7%</td></tr></tbody></table>

As you can see, there are some marginal performance improvements in multi-threaded applications, and some more noticeable speed boosts in single-threaded applications like SuperPI. Some people may consider the Turbo Mode feature a mere gimmick, and perhaps it is for enthusiast users. However, who would begrudge Intel for giving all users a free 133-266MHz speed boost? No one, and we definitely like the concept and the implementation.
 