I am a big fan of John Carmack (read Masters of Doom). At E3 this year he sat down for an interview on G4. The interviewer was pissing me off, though. Left to steer the conversation himself, Carmack could have talked for hours about MegaTexture and the inner workings of the PS3 architecture. It just felt like he wanted to go more in-depth, but the interviewer and the assumed intelligence of the audience were keeping him locked down. I guess programmer speak is not what they were looking for, but I wanted to hear it.
To call the custom-built ATI GPU found inside the Xbox 360 “powerful” is like saying Muhammad Ali was a “good boxer.” Sure, it’s true, but it doesn’t even come close to the entire truth of the matter. There are subtle yet very important aspects of the Xbox 360’s GPU that, at first glance, might not strike you as impressive. But once you take a deeper look, we’re sure you’ll agree that this is one bad chip.
First off, the custom-built ATI chip runs at 500MHz, a very respectable speed for a console-based GPU. It uses 512MB of GDDR3 memory (which requires less power and runs cooler than previous memory types) running at 700MHz, and has a 256-bit memory interface with 22.4GBps of bandwidth. This memory is equally accessible to both the GPU and the CPU, a design known as Unified Memory Architecture, which makes graphics performance truly lightning-fast.
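The quoted bandwidth figure can be sanity-checked with quick arithmetic (a sketch, not official math: it assumes the quoted 700MHz is the effective per-pin transfer rate, since GDDR3's double-data-rate signaling makes marketing clocks ambiguous):

```python
# Back-of-the-envelope check of the quoted GDDR3 bandwidth figure.
# Assumption: 700MHz is treated as the effective transfers/sec per pin.

transfer_rate_hz = 700e6   # effective transfers per second, per pin
bus_width_bits = 256       # memory interface width

bytes_per_transfer = bus_width_bits / 8
bandwidth_gbps = transfer_rate_hz * bytes_per_transfer / 1e9
print(f"{bandwidth_gbps:.1f} GB/s")  # 22.4 GB/s, matching the quoted figure
```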
Innovative use of on-die eDRAM makes sure this punch doesn’t lose its speed and impact, even at 720p.
Manufactured by NEC on a 90nm process, the unique 10MB eDRAM (embedded DRAM) chip is the source of the Xbox 360 GPU's most powerful benefits. (You might note that NEC also supplied the embedded DRAM for the Nintendo GameCube, but that was an entirely different generation of eDRAM.)
Even running at a full 500MHz, the ATI GPU draws less than 35 watts of power, and that includes the eDRAM. The power management features found on the chip are impressive. They provide clock throttling at both a “macro” and “micro” level, powering down either large blocks of the chip or smaller logical units where necessary. The true wonder of this chip is the fact that there is no fan directly on the GPU, only a passive heatsink, which is cooled by the air drawn over it by the fans on the back of the Xbox 360. Together, these features define a truly efficient chip.
The 360’s GPU can produce up to 500 million triangles per second. Given that a triangle has three vertices, for practical purposes this means the GPU can produce approximately 1.5 billion vertices per second. (In comparison, the ATI X1900 XTX processes only 1.3 billion vertices per second and runs at nearly double the clock speed.) For antialiasing, the 360 GPU pounds out a pixel fillrate of 16 gigasamples per second, using 4X MSAA (Multi-Sampling Anti-Aliasing). Of course, the big claim to fame of the 360’s GPU is the stunning 48 billion shader operations per second, thanks to its innovative use of Unified Shader Architecture.
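The throughput figures above reduce to simple arithmetic (a sketch; the 4-gigapixel base fillrate is my assumption, implied by dividing the 16-gigasample figure by the 4X MSAA sample count, not a number stated in the text):

```python
# Rough arithmetic behind the quoted throughput figures.
triangles_per_sec = 500e6
vertices_per_sec = triangles_per_sec * 3           # 3 vertices per triangle -> 1.5 billion

base_pixel_fillrate = 4e9                          # assumed base pixels/sec
msaa_samples = 4                                   # 4X MSAA
samples_per_sec = base_pixel_fillrate * msaa_samples  # 16 gigasamples/sec

print(vertices_per_sec, samples_per_sec)
```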
Why is that figure so impressive? For the uninitiated, shader operations are the core of what makes a rendered graphic look the way it does. There are two separate types of shaders that are used in gaming graphics: vertex shaders and pixel shaders. Vertex shaders impact the values of the lines that make up a polygon. They are what determine how realistic animation of polygons and wireframe models will look: the swagger of a walking character, for instance, or the rolling tread of a tank as it crushes an android skull laid to waste on a charred battleground.
Pixel shaders, on the other hand, are what determine how realistic that charred battlefield will look or the color of the dents in the tank. They alter the pixel’s color and brightness, altering the overall tone, texture, and shape of a “skin” once it’s applied to the wireframe. These shaders allow developers to create materials and surfaces with textures and environments that much more closely resemble reality.
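The two shader types can be illustrated with a toy software sketch (hypothetical and heavily simplified; real shaders run on GPU hardware across many vertices and pixels in parallel, but the division of labor is the same):

```python
# Toy software analogues of the two shader types described above.

def vertex_shader(position, model_matrix):
    """Transform a vertex's (x, y, z) position by a 4x4 model matrix --
    the kind of per-vertex work that animates a walking character."""
    x, y, z = position
    v = (x, y, z, 1.0)
    return tuple(
        sum(model_matrix[row][col] * v[col] for col in range(4))
        for row in range(3)
    )

def pixel_shader(base_color, light_intensity):
    """Scale a pixel's RGB color by a lighting term, clamped to [0, 255] --
    the kind of per-pixel work that shades a dent in a tank."""
    return tuple(min(255, int(c * light_intensity)) for c in base_color)

# Example: an identity transform leaves the vertex in place,
# while a bright light pushes the green channel to its clamp.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(vertex_shader((1.0, 2.0, 3.0), identity))   # (1.0, 2.0, 3.0)
print(pixel_shader((100, 200, 50), 1.5))          # (150, 255, 75)
```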
Each of these graphics processing functions is called and executed on a per-pixel or per-vertex basis as data passes through the pipeline. Until recently, graphics processors handled each type of shader with its own dedicated units. Developers talked to those units either through low-level assembly languages or through APIs such as OpenGL or DirectX. Unified Shader Architecture changes all that by handling both shader types at the hardware level with the same instruction process. This means the GPU can exploit the pieces common to both types of shader while still relaying shader-specific instructions where needed. It decreases the actual size of the instruction sets and, where applicable, combines common instructions for the two shader types into one. That is how the 360’s GPU handles shader operations so quickly and efficiently. 48 billion shader operations per second, in fact.
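The scheduling idea behind a unified pool can be sketched roughly like this (all names are illustrative; real hardware arbitration is far more sophisticated, but the point is that one pool of general units accepts whichever shader work is pending):

```python
from collections import deque

class UnifiedShaderPool:
    """A minimal sketch of the unified-shader idea: one pool of general
    ALUs, fed from a single queue of mixed vertex and pixel jobs."""

    def __init__(self, num_units):
        self.num_units = num_units
        self.queue = deque()

    def submit(self, kind, workload):
        # kind is "vertex" or "pixel"; the units don't care which.
        self.queue.append((kind, workload))

    def run_cycle(self):
        # Each cycle, every free unit grabs the next job regardless of
        # type, so a vertex-heavy or pixel-heavy frame still keeps all
        # units busy -- unlike fixed pools of dedicated shader units.
        done = []
        for _ in range(min(self.num_units, len(self.queue))):
            done.append(self.queue.popleft())
        return done

# Example: three units chew through a mixed workload in arrival order.
pool = UnifiedShaderPool(3)
for job in [("vertex", "v0"), ("pixel", "p0"), ("vertex", "v1"), ("pixel", "p1")]:
    pool.submit(*job)
print(pool.run_cycle())  # first three jobs, mixed types
print(pool.run_cycle())  # the remaining pixel job
```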
It’s tempting to compare the GPU inside the Xbox 360 to today’s high-dollar, high-performance video cards, and some who do might scoff a little. The latest graphic cards from Nvidia and ATI, such as Nvidia’s GeForce 7800 GTX and ATI’s Radeon X1900 series, are—on paper—superior GPUs. They tout processor speeds of 550 to 625MHz and memory clock speeds of 1,500MHz and above. In terms of raw horsepower, these cards are indeed brutes. Of course, if there’s one thing we’ve all learned about clock speeds in the great processor wars between Intel and AMD, it’s that raw speed hardly translates into a real measure of processing power.
It’s not hyperbole to say that video memory bandwidth is one of the most important (if not the most important) factors in processing and rendering graphic elements. This is simply because bandwidth and speed determine how rapidly instructions can be transferred, processed, and returned to the system, which puts them in direct control of a system’s overall graphics performance.
To improve video memory bandwidth, graphics card manufacturers have resorted to the typical methods of boosting speed, such as widening the memory bus (up to 512 bits) or raising core and memory clocks. These techniques have pushed peak performance into the range of 40 to 50GBps, which is respectable compared with other graphics processors. However, these figures still fall short of the Xbox 360’s 256GBps.
Yes, you read that right: 256GBps memory bandwidth. It’s utterly stunning, and it’s thanks to the chip’s embedded 10MB of eDRAM.
No currently available video card makes use of embedded DRAM. And even if one were available, it would be of little use until at least the end of 2006, when Windows Vista ships: the operating systems gamers are running today can’t take advantage of Vista’s WGF (Windows Graphics Foundation) 2.0 features. This speed of instruction handling, combined with Unified Shader Architecture, not only makes the GPU inside the Xbox 360 the current graphics powerhouse, it also means it’ll stay that way for a number of years.
And even when current PC-based GPUs start catching up, it’s going to be extremely expensive to match the performance of this dedicated gaming platform. The top-level cards from ATI and Nvidia retail for around $560 apiece, and that’s without Unified Shader Architecture support or eDRAM. And of course, there are other aspects of the system to consider, such as the fact that the CPU and memory were custom-built for dedicated gaming performance.
ATI and Microsoft have truly built something special in the Xbox 360’s GPU. It’s astounding to see a chip with such power run at such an efficient clock speed and generate as little heat as it does, while at the same time making use of never-before-seen technology that will surely be replicated in graphics cards and consoles for years to come. It’s comforting to know that the Xbox 360 will continue to produce visually stunning and smooth graphics well into the foreseeable future.
You can’t help but like Epic Games VP Mark Rein. Sure, he is a graphics whore, but he openly speaks his mind and has a lot of important things to say. In an interview with FiringSquad, Mark Rein called out Intel for making horrible graphics chips, and I could not agree with him more.
FiringSquad: What is Epic’s feeling about PC game hardware and how will the Unreal engine be part of that?
Mark Rein: I wish I could report only good news but that’s not the case.
On the good side there is a lot of exciting news in the PC space. Ageia is launching their physics hardware. NVIDIA now has Quad SLI. Dell is getting serious about PC gaming with their XPS 600 Renegade system and XPS M170 laptops. Both Intel and AMD now sell dual-core processors. Apple has switched to Intel. NVIDIA’s SLI is taking hold and lots of game enthusiasts are starting to use it. These are all good things that Unreal Engine 3 is very qualified to take advantage of. We’re obviously using the PhysX library which, in addition to giving us very strong physics performance on Xbox360 and PS3, will give us the ability to take advantage of the upcoming PC PhysX hardware. Prior to their announcement at CES this year we had a chance to run Unreal Engine 3 on Dell’s new XPS 600 Renegade Quad SLI system and I can tell you that it is fantastic! Dell is lending us gear for our GDC demos this year. In addition to running our theatre on the amazing Renegade we’ll also be demo’ing on more down-to-earth SLI-equipped XPS 600 systems and the Dell M170 laptop with NVIDIA’s Geforce Go 7800 GTX. Dell is also lending us some of their amazing 30” monitors which are fantastic for Unreal Editor demos. Our new multi-threaded renderer will also be great on dual core processors from AMD and Intel. Windows Vista will also give us a nice performance boost by getting us closer to the hardware than past versions of Windows have.
Unfortunately the bad side is getting really bad. It is getting harder and harder for the average consumer to buy a computer with a decent graphics chip in it. When I go to major electronics retailers I see that most of the machines being sold are using Intel Integrated graphics – including the vast majority of laptops. Some of the desktop machines don’t even have slots for discrete graphics cards, which I find personally offensive. Laptops of course are mostly not upgradable, so a bad laptop is a bad laptop forever, and considering how many people are replacing desktops with laptops this is especially worrisome. It is really sad when you see the moniker “media” or “entertainment” attached to something with Intel Integrated graphics in it. I question the logic of developing dual-core CPUs and saddling them with ultra-low-end graphics, especially considering that one of the big benefits of Windows Vista will be a hugely improved graphical user interface that will help improve productivity. There are some seriously expensive desktops and laptops with crappy graphics chips in them – these aren’t just the low-priced machines either. Intel salespeople are probably patting themselves on the back for these design wins, but the truth is the more successful they are with this strategy the faster they could be killing off the PC games market, and nobody has the balls to stand up and cry foul because Intel is so powerful.
If people take those machines home and try to play recent PC games on them, they’re going to have a horrible experience and possibly give up on PC gaming altogether. Users aren’t educated in this area, but when their new $1,500 PC says “no” to a decent PC game they’re going to just assume the PC games market has passed them by. This is sad because the difference in cost to the PC manufacturer of putting in a decent graphics chip isn’t very much.
We need to find a way to encourage manufacturers to offer more balanced systems with better graphics chips and understand that every user they convert to a gamer represents a potential higher-margin sale the next time and every user they discourage from gaming represents a potential lower-margin commodity purchaser later. We need those mainstream users to be trying PC games. It is nearly impossible to justify the cost of making games that scale down to integrated graphics when the next-gen consoles have so much graphics power and represent a huge upcoming market. How many publishers would bother bringing their latest games to PC if only the hardcore players could run them? Those customers have already proven they’re willing to spend $300 for a graphics card so expecting them to own a next-gen console isn’t much of a leap.
So despite the fact that I’m a big cheerleader for PC gaming I am worried about a potential for catastrophic failure of the PC gaming market. You’d think Intel would be worried about that too especially considering that none of the next-generation consoles use Intel CPUs.
So, price info has been coming out for the BFG PhysX card. It should cost around $350. I am still not sure if I will be buying one of these or not. In a way, it seems somewhat pointless to buy a physics card. With today’s dual-core processors and GPUs, I don’t see any problem with calculating physics on the CPU and GPU. Two cores running a few threads are enough in my opinion, especially considering most game engines are still single-threaded.
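As a rough illustration of that point, a basic physics step is cheap enough to push onto a spare CPU thread while the main thread handles game logic (a toy sketch under that assumption, not how any real engine or AGEIA’s middleware actually works):

```python
from concurrent.futures import ThreadPoolExecutor

def step_particles(particles, dt, gravity=-9.8):
    """Advance (position, velocity) pairs one timestep with basic
    Euler integration -- a stand-in for a real physics update."""
    out = []
    for pos, vel in particles:
        vel = vel + gravity * dt
        pos = pos + vel * dt
        out.append((pos, vel))
    return out

# Run the physics step on a worker thread, leaving the main thread
# free for rendering and game logic -- the "spare core" argument.
particles = [(10.0, 0.0), (5.0, 2.0)]
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(step_particles, particles, 0.1)
    particles = future.result()
print(particles)
```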
Then there is the issue of support. The PhysX cards only work with games that run AGEIA’s physics middleware. How many games will actually use it? Havok is by far the most commonly used physics middleware, and if anything its support is growing. It is also worth mentioning that Havok has been talking about physics processing on the GPU for a while now. To me, that sounds like a much better idea than having a separate PPU for physics. I find it hard to suggest buying a physics card that will only work with select games. Can you imagine buying a video card that only worked with games that ran in OpenGL?
Then there is the fact that AGEIA’s physics middleware runs perfectly fine on the CPU. So you don’t even need the card to play games that use it. I would suggest waiting a good long time before buying one. I would like to see how much support this card actually gets, and how much of a performance boost you will actually find on a game running on dual cores.
One thing is for sure though: the demos do look sexy (just not $350 cool), and I am willing to bet those demos would run fine on a dual-core processor if they took advantage of it. You can view one of the demos below. If it looks interesting, a slightly different (and a little bit cooler) high-resolution version of that video can be found here.
Seriously. I am tired of reading people complain about their outdated junk. I hate people that feel like the whole world should be held back for them because they are too cheap to keep their hardware updated. These are the same people that make me install games from four CDs. Everyone that doesn’t have a DVD drive needs to go out and buy a $25 DVD drive. Give game manufacturers a reason to start putting EVERY game on DVDs now.
Then there are the people that ask for help in forums and get mad at you when you tell them they are SOL. They are like, “I need a new graphics card. I want to run Quake 4, Call of Duty 2, F.E.A.R., and Half-Life 2 on high settings and I don’t want to spend more than $200.” I tell them that’s just not going to happen for $200 and they get all mad. Honestly, what the hell do they expect? Maybe they are just clueless, but if that’s the case, how about doing at least a little research yourself before coming to a forum and asking people for help.