GT300 is meant to be Nvidia's first completely new architecture (because it's DX11) since the introduction of G80 / 8800 in 2006.
Likewise, Rx8xx is meant to be ATI's first completely new architecture (it's also DX11) since the introduction of R600 / HD2900 in early 2007.
Both GPUs look to be a major advance in performance and features over GT200 and RV770.
Nvidia GT300
http://www.brightsideofnews.com/news/2009/...-cgpu!.aspx
http://www.bit-tech.net/news/hardware/2009...-architecture/1
ATI RV870 / R800 (HD 5850, HD5870, HD 5850X2, HD 5870X2)
http://www.neoseeker.com/news/10564-specs-...-5870-turn-up-/
http://www.brightsideofnews.com/news/2009/...s-revealed.aspx
Nvidia GT300
http://www.brightsideofnews.com/news/2009/...-cgpu!.aspx
nVidia's GT300 specifications revealed - it's a cGPU!
4/22/2009 by: Theo Valich
Over the past six months, we have heard different bits 'n' pieces of information about GT300, nVidia's next-gen part. We decided to stay silent until we had the information confirmed from multiple sources, and now we feel confident enough to disclose what is cooking in Santa Clara, India, China and other nVidia sites around the world.
GT300 isn't the architecture that was envisioned by nVidia's Chief Architect, former Stanford professor Bill Dally, but this architecture will give you a pretty good idea why Bill told Intel to take a hike when the larger chip giant from Santa Clara offered him a job on the Larrabee project.
Thanks to Hardware-Infos, we managed to complete the puzzle of what nVidia plans to bring to market a couple of months from now.
What is GT300?
Even though it shares the same first two letters with the GT200 architecture [GeForce Tesla], GT300 is the first truly new architecture since SIMD [Single Instruction, Multiple Data] units first appeared in graphics processors.
The GT300 architecture groups processing cores in sets of 32 - up from 24 in the GT200 architecture. But the bigger difference is that GT300 parts ways with the SIMD architecture that dominates today's GPUs. GT300's cores rely on MIMD-like functionality [Multiple Instruction, Multiple Data] - all the units work in MPMD mode, executing simple and complex shader and computing operations on the go. We're not exactly sure whether we should continue to use the terms "shader processor" or "shader core", as these units are now almost on equal terms with the FPUs inside the latest AMD and Intel CPUs.
GT300 itself packs 16 groups of 32 cores each - yes, we're talking about 512 cores for the high-end part. This number alone raises the computing power of GT300 by more than 2x over the GT200 core. Before the chip tapes out, there is no way anybody can predict working clocks, but if the clocks remain the same as on GT200, we would have over double the computing power.
If, for instance, nVidia gets a 2 GHz clock for the 512 MIMD cores, we are talking about no less than 3 TFLOPS of single-precision performance. Double-precision performance is highly dependent on how efficient the MIMD-like units turn out to be, but you can count on a 6-15x improvement over GT200.
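As a sanity check, the arithmetic behind those claims works out as follows (a sketch only: the 2 GHz clock is the article's hypothetical, and 3 FLOPs per core per clock assumes GT300 keeps GT200's MAD+MUL dual issue):

// Back-of-the-envelope peak-FLOPS check for the rumoured GT300.
// Assumptions: 512 cores (16 x 32), a hypothetical 2 GHz shader clock,
// and GT200-style dual issue of MAD + MUL (3 FLOPs per core per clock).
#include <cstdio>

int main() {
    const double cores     = 512;
    const double clock_hz  = 2.0e9;
    const double flops_clk = 3.0;
    const double gt300 = cores * clock_hz * flops_clk / 1e12;
    printf("GT300 peak single-precision: %.2f TFLOPS\n", gt300);  // ~3.07

    // GT200 for comparison (GTX 285: 240 cores at a 1.476 GHz shader clock).
    const double gt200 = 240 * 1.476e9 * 3.0 / 1e12;
    printf("GT200 peak: %.2f TFLOPS, ratio %.1fx\n",
           gt200, gt300 / gt200);                                 // ~1.06, ~2.9x
    return 0;
}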
This is not the only change - the cluster organization is no longer static. The scratch cache is much more granular and allows for greater interactivity between the cores inside a cluster. GPGPU, i.e. GPU computing, applications should really benefit from this architectural choice. When it comes to gaming, the obvious question is: how good can GT300 be? Do bear in mind that this 32-core cluster will be used in next-generation Tegra, Tesla, GeForce and Quadro cards.
This architectural change should result in a dramatic increase in double-precision performance, and if GT300 packs enough registers, performance on both single-precision and double-precision data might surprise all the players in the industry. Given the timeline when nVidia began work on GT300, it looks to us like the GT200 architecture was a test run for the real thing coming in 2009.
Just like a CPU, GT300 gives direct hardware access [HAL] for CUDA 3.0, DirectX 11, OpenGL 3.1 and OpenCL. You can also program the GPU directly, though we're not exactly sure whether developing such a solution would be financially feasible. But the point is that now you can do it. It looks like Tim Sweeney's prophecy is slowly, but certainly, coming to life.
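For illustration, this is roughly what "direct programming on the GPU" already looks like in CUDA C - a minimal, self-contained sketch using only the CUDA runtime API that shipped well before GT300, with no GT300-specific features assumed:

// Minimal CUDA C example: a SAXPY kernel plus the host code to launch it.
// Nothing here is GT300-specific; this is the CUDA programming model as
// it already existed in 2009.
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) y[i] = a * x[i] + y[i];              // one multiply-add per thread
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));
    // ...initialize x and y on the device here...
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y); // 4096 blocks of 256 threads
    cudaDeviceSynchronize();                        // wait for the kernel to finish
    cudaFree(x);
    cudaFree(y);
    return 0;
}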
Rumour: Nvidia GT300 architecture revealed
Author: Ben Hardwidge
Published: 23rd April 2009
How do you follow a GPU architecture such as Nvidia's original G80? Possibly by moving to a completely new MIMD GPU architecture. Although Nvidia hasn't done much to the design of its GPU architecture recently - other than adding some more stream processors and renaming some of its older GPUs - there's little doubt that the original GeForce 8-series architecture was groundbreaking stuff. How do you follow up something like that? Well, according to the rumour mill, Nvidia has similarly radical ideas in store for its upcoming GT300 architecture.
Bright Side of News claims to have harvested information confirmed from multiple sources about the part, which looks as though it could be set to take on any threat posed by Intel's forthcoming Larrabee graphics processor. Unlike today's traditional GPUs, which are based on a SIMD (single instruction, multiple data) architecture, the site reports that GT300 will rely on "MIMD-similar functions" where all the units "work in MPMD mode".
MIMD stands for multiple instruction, multiple data, and it's a technique often found in SMP systems and clusters. Meanwhile, MPMD stands for multiple program, multiple data. An MIMD system such as this would enable you to run an independent program on each of the GPU's parallel processors, rather than having the whole lot running the same program.
Put simply, this could open up the possibilities of parallel computing on GPUs even further, particularly when it comes to GPGPU apps.
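To make the SIMD/MIMD distinction concrete, here is a sketch in CUDA terms (physicsStep and aiStep are made-up names for illustration): the two kernels are genuinely different programs, and an MIMD-style part could in principle run them side by side rather than in lockstep. Whether GT300 actually schedules independent programs this way is exactly what the rumour leaves open:

// Two unrelated programs submitted to the GPU at once. On a pure SIMD
// design, parallelism only exists *within* each kernel; an MIMD/MPMD
// design could execute different programs on different core groups
// at the same time.
#include <cuda_runtime.h>

__global__ void physicsStep(float* bodies) { /* one program... */ }
__global__ void aiStep(int* agents)        { /* ...and a different one */ }

int main() {
    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    float* bodies;
    int* agents;
    cudaMalloc(&bodies, 4096 * sizeof(float));
    cudaMalloc(&agents, 4096 * sizeof(int));

    physicsStep<<<16, 256, 0, s0>>>(bodies);  // independent program A
    aiStep<<<16, 256, 0, s1>>>(agents);       // independent program B

    cudaStreamSynchronize(s0);
    cudaStreamSynchronize(s1);
    cudaFree(bodies);
    cudaFree(agents);
    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    return 0;
}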
Computing expert Greg Pfister, who's worked in parallel computing for 30 years, has a good blog about the differences between MIMD and SIMD architectures here, which is well worth a read if you want to find out more information. Pfister makes the case that a major difference between Intel's Larrabee and an Nvidia GPU running CUDA is that the former will use a MIMD architecture, while the latter uses a SIMD architecture. Pure graphics processing isn't the end point of all of this, says Pfister. He gives the example of game physics, saying "maybe my head just isn't built for SIMD; I don't understand how it can possibly work well [on SIMD]. But that may just be me."
Pfister says there are pros and cons to both approaches. "For a given technology," says Pfister, "SIMD always has the advantage in raw peak operations per second. After all, it mainly consists of as many adders, floating-point units, shaders, or what have you, as you can pack into a given area." However, he adds that engineers who have never programmed don't understand why SIMD isn't absolutely the cat's pajamas.
He points out that SIMD also has its problems. "There's the problem of batching all those operations," says Pfister. "If you really have only one ADD to do, on just two values, and you really have to do it before you do a batch (like, it's testing for whether you should do the whole batch), then you're slowed to the speed of one single unit. This is not good. Average speeds get really screwed up when you average with a zero. Also not good is the basic need to batch everything. My own experience in writing a ton of APL, a language where everything is a vector or matrix, is that a whole lot of APL code is written that is basically serial: one thing is done at a time." As such, Pfister says that Larrabee should have a big advantage in flexibility, and also familiarity: "You can write code for it just like SMP code, in C++ or whatever your favorite language is."
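Pfister's "one ADD before a batch" complaint translates directly into GPU code. In the hypothetical CUDA kernel below, step 1 keeps only one thread busy - on SIMD hardware that part runs at the speed of a single unit - while step 2 runs at full width:

// Illustration of the SIMD batching problem Pfister describes.
__global__ void guardedBatch(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Step 1: one scalar test before the batch. Only thread 0 works,
    // so the SIMD unit is as slow here as a single ALU.
    __shared__ bool runBatch;
    if (threadIdx.x == 0)
        runBatch = (data[0] + 1.0f > 0.0f);  // the lone ADD
    __syncthreads();

    // Step 2: the batch itself. Every lane does the same operation on
    // different data - this is where SIMD earns its peak numbers.
    if (runBatch && i < n)
        data[i] *= 2.0f;
}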
Bright Side of News points out that this could potentially put the GPU's parallel processing units almost on equal terms with the FPUs inside the latest AMD and Intel CPUs. In terms of numbers, the site claims that the top-end GT300 part will feature 16 groups that will each contain 32 parallel processing units, making for a total of 512. The site also claims that the GPU's scratch cache will be much more granular, which will enable a greater degree of interactivity between the cores inside the cluster.
No information on clock speeds has been revealed yet, but if this is true, it looks as though Nvidia's forthcoming GT300 GPU will really offer something new to the GPU industry. Are you excited about the prospect of an MIMD-based GPU architecture with 512 parallel processing units, and could this help Nvidia to take on the threat from Intel's Larrabee graphics chip? Let us know your thoughts in the forums.
http://www.bit-tech.net/news/hardware/2009...-architecture/1
ATI RV870 / R800 (HD 5850, HD5870, HD 5850X2, HD 5870X2)
http://www.neoseeker.com/news/10564-specs-...-5870-turn-up-/
Specs for ATI HD 5870 turn up
Kevin Spiess - Friday, April 24th, 2009 | 11:38AM (PT)
Seems reasonable; coming in July
Following rumors we went over earlier in the week, it does seem that ATI's next flagship GPU, the RV870, will be landing sometime later this summer, possibly in July.
Today some specs turned up for the RV870 on the German site ATI-Forum.de.
The RV870 will be a 40nm part, meaning that it will draw less power than current-generation 55nm GPUs. It will have:
* 1200 shader processors (compared with 800 on the current HD 4870)
* 32 ROPS (compared with 16 on the HD 4870)
* 48 TMUs (compared with 40 on the HD 4870)
* 2.1 TFlops of effective computational potential (an impressive figure - just about double the TFlops offered by the HD 4870!)
The core clock speed for the HD 5870 appears to be 900 MHz, with the 512MB (or possibly 1GB) of GDDR5 running at 1100 MHz (4400 MHz effective, thanks to GDDR5's quad data rate). The RV870 will be DirectX 11 compatible as well.
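For what it's worth, the quoted 2.1 TFlops figure does follow from these specs, assuming ATI's usual 2 FLOPs (one multiply-add) per shader processor per clock:

// Checking the rumoured HD 5870 numbers against each other.
#include <cstdio>

int main() {
    const double sps      = 1200;    // shader processors (rumoured)
    const double clock_hz = 900e6;   // 900 MHz core clock (rumoured)
    printf("Peak: %.2f TFLOPS\n", sps * clock_hz * 2 / 1e12);  // 2.16, i.e. "about 2.1"

    // GDDR5 transfers four data words per clock, hence the effective figure:
    printf("Effective memory clock: %.0f MHz\n", 1100.0 * 4);  // 4400
    return 0;
}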
It is presumed that the RV870 will come in the same variants as the last few generations of ATI cards. That is to say, HD 5850 and HD 5870 parts will launch first, followed by an HD 5870 X2. Perhaps most interesting here, though, is that many more board partners will be making an HD 5850 X2 card - unlike the current generation, where Sapphire was the only company to put an HD 4850 X2 together.
Looking at these specs, if someone were to take a guess at the HD 5870's performance, factoring in shader processor improvements, it seems that an HD 5870 will offer somewhere around 155%-160% of the performance of the HD 4870 - which seems hard to believe at first. That would put one HD 5870 around the power of two HD 4850 cards.
These specs are all from a "very trusted source" according to ATI-Forum.de, and they seem reasonable.
Certainly NVIDIA will have something equally fast and powerful to compete against the HD 5870 with. We'll post more rumors as they become available.
http://www.brightsideofnews.com/news/2009/...s-revealed.aspx
ATI Radeon 5870 and 5870X2 specs revealed?
4/24/2009 by: Theo Valich
German site ATI-Forum probably scored the coup of 2009 - according to their sources, ATI's RV870-based cards are already out at selected partners.
We cannot say whether this leak was a reaction to our joint-exclusive story about nVidia's GT300 architecture, but one thing is for sure - ATI wants to bring out its Cypress board, planned for July 2009, as soon as possible.
The alleged specifications of RV870 reveal that this chip is not exactly a new architecture, but rather a DirectX 11-specification tweak of the RV770 GPU architecture. Just like nVidia's GT300, the actual RV870 chip is manufactured in TSMC's 40nm half-node process, packing more transistors than the GT200 chip. Regardless of what ATI says about nVidia and large dies, the fact of the matter is that ATI is making a large die as well - but the company will continue to use the dual-GPU approach to reach high-end performance.
The RV870 chip should feature 1200 cores, divided into 12 SIMD groups of 100 cores each [20 "5D" units per group], while RV770 was based on 10 SIMD groups of 80 cores each [16 "5D" units, each consisting of one "fat" ALU and four simpler ones]. Thus, it is logical to conclude that, when it comes to the execution cores, not much happened architecturally - ATI's engineers increased the number of registers and made other demanding architectural changes in order to comply with Shader Model 5.0 and DirectX 11 Compute Shaders. The core is surrounded by 48 texture mapping units, meaning ATI is continuing to shift the ROP:Core:TMU ratio. For the first time, ATI is shipping a part with 32 ROP [Raster OPeration] units, meaning the chip is able to output 32 pixels in a single clock.
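Written out, the rumoured shader counts are at least internally consistent (the "5D" breakdown into one "fat" and four simple ALUs is carried over from RV770 here):

// RV870 vs RV770 shader organisation, per the rumoured figures.
#include <cstdio>

int main() {
    const int rv870 = 12 * 20 * 5;   // 12 SIMD groups x 20 "5D" units x 5 ALUs = 1200
    const int rv770 = 10 * 16 * 5;   // 10 SIMD groups x 16 "5D" units x 5 ALUs = 800
    printf("RV870: %d cores, RV770: %d cores (+%d%%)\n",
           rv870, rv770, 100 * (rv870 - rv770) / rv770);   // +50%
    return 0;
}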
When it comes to products, ATI plans to launch four parts: the Radeon HD 5850 and 5850X2 in a more affordable pricing bracket, and the HD5870 and HD5870X2 as the high-end parts. While no clocks were given for the Radeon HD 5850/5850X2 parts, the alleged clocks for the HD5870 and HD5870X2 reveal that, for the first time, an X2 part is clocked higher than the single-GPU part. Whether this was a requirement of the SidePort memory interface, we do not know at the moment. German site Hardware-Infos placed all of the data in a very convenient table, which we are running here with permission. Their story also contains more data about the upcoming ATI RV870 architecture.
[Table: ATI 4870 vs 5870, courtesy of Hardware-Infos]
These units should result in 2.16 TFLOPS for the HD5870 and 4.56 TFLOPS for the dual-GPU part. Yes, you've read that correctly - we are going from a 1 TFLOPS chip to 4.6 TFLOPS within 13 months. Is it now clear that CPUs are at a standstill when it comes to performance improvements? There is no doubt that ATI pulled another miracle out of its hat with brilliant on-time execution, releasing a 40nm part that should be relatively cheap to manufacture. The biggest question, though, is: can it beat nVidia's GT300, and by how much?
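Solving backwards from the quoted figures (again assuming 2 FLOPs per core per clock), the 4.56 TFLOPS for the X2 implies a core clock of roughly 950 MHz - consistent with the claim above that the X2 part is clocked higher than the 900 MHz single-GPU card:

// Implied clock of the rumoured HD5870X2, derived from its TFLOPS figure.
#include <cstdio>

int main() {
    const double x2_tflops = 4.56;
    const double cores     = 2 * 1200;   // two RV870 GPUs on one board
    const double clock_mhz = x2_tflops * 1e12 / (cores * 2) / 1e6;
    printf("Implied X2 core clock: %.0f MHz\n", clock_mhz);  // ~950
    return 0;
}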
Some journalists allegedly have miracle 8-balls and claim that the ATI cards will blow nVidia out of the water. We are not so certain... stay tuned.