VGLeaks: Durango's Move Engines

Much as it's neat to see JPEG decoding in hardware return, it's a bit depressing that 13 years later the decoder is barely 50% faster than what the PS2 had.
 
A good technical analysis of the leaked Durango documents (from the DMEs to the GPU)... It's machine-translated from Spanish, BUT if you're technical you may be able to follow along!

http://www.microsofttranslator.com/...013/02/08/durango-nos-hace-coger-el-delorean/

That article is a really good read; the explanation is good and makes sense. The reason MS went with DDR3 is starting to make sense: if you are using tile-based rendering to an extent, you wouldn't need as much bandwidth as a traditional renderer.
 
IMO I don't believe those rumors. I think someone wanted that information to sound negative, but in reality it's the same thing all the consoles are doing and have done in the past.

I actually do, as it would allow Microsoft to offer forward/cross-device compatibility with any other products they have in the pipeline.
 
That article is a really good read; the explanation is good and makes sense. The reason MS went with DDR3 is starting to make sense: if you are using tile-based rendering to an extent, you wouldn't need as much bandwidth as a traditional renderer.

The puzzle pieces are falling into place.
And it seems Microsoft collaborated a lot with devs, asking where their engines are now and where they think they're going with them.
 
That article is a really good read; the explanation is good and makes sense. The reason MS went with DDR3 is starting to make sense: if you are using tile-based rendering to an extent, you wouldn't need as much bandwidth as a traditional renderer.

And you could potentially have a really large pool of RAM to grab stuff from: 5 or 6 GB in the case of the next Xbox.
 
That article is a really good read; the explanation is good and makes sense. The reason MS went with DDR3 is starting to make sense: if you are using tile-based rendering to an extent, you wouldn't need as much bandwidth as a traditional renderer.


Can someone sum this up for me?
I'm not a tech guy :) Is it good? Is it comparable to Orbis?

But can this be applied to PS Orbis? Basically, to implement virtual memory management you don't strictly need two levels of memory; it's enough that the upper level has sufficient storage capacity for the image buffers (color, depth and stencil) and the textures needed for the scene, and that there is a lower level beneath it. On PS Orbis the GPU's caches don't have enough storage capacity for this, and the GDDR5 is a single level of memory for the whole GPU. Obviously the ESRAM and the whole mechanism cost die space, which is a sacrifice in terms of compute capability. But the biggest advantage comes from the fact that it allows access to large amounts of memory per frame without having to rely on the huge bandwidth of expensive, high-wattage memory like GDDR5. The reason Xbox 8/Durango doesn't use GDDR5 is that it would then be completely redundant: GDDR5 exists on GPUs to avoid texture thrashing through higher bandwidth, while virtual memory on the GPU and virtual texturing are another solution to that same problem, so the two would conflict within a single system.

Appreciate it :)
 
He's saying that since the Orbis has a single large pool of memory with high bandwidth it doesn't need DMEs or see any benefits from virtual texturing.
 
He's saying that since the Orbis has a single large pool of memory with high bandwidth it doesn't need DMEs or see any benefits from virtual texturing.

What about those things with Mark Grossman and Carmack?
PRTs?
Virtual texturing?
The connection to 3DLabs?

Are those things positive in any way when it comes to graphics and performance?


Thanks for your response :)
 
Partially Resident Textures are a standard feature of the GCN architecture, so Orbis will support that. Virtual textures are only useful if you have textures too large to fit in the memory you want to texture from. For Durango the 32 MB embedded memory pool could be too small, and the 8 GB memory pool too slow, so by using virtual textures you can copy just the piece you need at a given time to the small, fast memory and use it. On Orbis your large memory pool is also your fast memory pool, so it isn't a problem to just read the part of the texture (using PRT) whenever you want.
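
To make that concrete, here's a minimal sketch of the kind of tile streaming being described (my own illustration, not Durango's actual API: the 64 KB tile size matches AMD's PRT granularity, the 32 MB figure is the leaked ESRAM size, and everything else is made up). Only the tiles the renderer actually needs this frame get copied into the small, fast pool:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

struct TileId { std::uint32_t mip, x, y; };

constexpr std::size_t kTileBytes     = 64 * 1024;          // PRT tile granularity
constexpr std::size_t kFastPoolBytes = 32 * 1024 * 1024;   // small, fast embedded pool

class TileStreamer {
public:
    // 'needed' = the tiles the renderer determined are visible this frame.
    void RequestTiles(const std::vector<TileId>& needed) {
        for (const TileId& t : needed) {
            const std::uint64_t key = Key(t);
            if (resident_.count(key)) continue;             // already in the fast pool
            if (usedBytes_ + kTileBytes > kFastPoolBytes)
                EvictOneTile();                             // make room (real code would pick the LRU tile)
            CopyTileToFastPool(t);                          // DMA-style copy from the big, slow pool
            resident_.insert(key);
            usedBytes_ += kTileBytes;
        }
    }

private:
    static std::uint64_t Key(const TileId& t) {
        return (std::uint64_t(t.mip) << 48) | (std::uint64_t(t.x) << 24) | t.y;
    }
    void EvictOneTile() {
        resident_.erase(resident_.begin());                 // sketch: drop an arbitrary resident tile
        usedBytes_ -= kTileBytes;
    }
    void CopyTileToFastPool(const TileId&) { /* issue the asynchronous copy here */ }

    std::unordered_set<std::uint64_t> resident_;
    std::size_t usedBytes_ = 0;
};
```

The big, slow pool then only has to serve the copies, not every single texture fetch.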
 
What about those things with Mark Grossman and Carmack?
PRTs?
Virtual texturing?
The connection to 3DLabs?

Are those things positive in any way when it comes to graphics and performance?


Thanks for your response :)

It's very positive for reducing bandwidth usage from what I gather.
 
Partially Resident Textures are a standard feature of the GCN architecture, so Orbis will support that. Virtual textures are only useful if you have textures too large to fit in the memory you want to texture from. For Durango the 32 MB embedded memory pool could be too small, and the 8 GB memory pool too slow, so by using virtual textures you can copy just the piece you need at a given time to the small, fast memory and use it. On Orbis your large memory pool is also your fast memory pool, so it isn't a problem to just read the part of the texture (using PRT) whenever you want.

Thanks bro, appreciate it :)
I hope we'll all be happy next gen :D
 
I was reading that 2010 Yukon PowerPoint leak, and MS was aiming for a $299 box with Kinect 2.0 packed in that would be profitable from year one. It also speaks of a modular design.

I wonder if MS's target is still the same.
 
The memory pools allow a program to read and write to different areas of memory at the same time without stalls. IMO that's what game programmers always want to achieve but can't when they run into system stalls.

These are improvements over an architecture devs are already familiar with, with many of the new additions being automatic or hidden from the user. It's still x86 and a GCN graphics chip.
That's all true to an extent, but being able to somewhat conveniently juggle data between two separate pools still requires more developer effort than simply having one single pool (which is as fast as those two pools combined) in the first place.
 
He's saying that since the Orbis has a single large pool of memory with high bandwidth it doesn't need DMEs or see any benefits from virtual texturing.

Of course it will. And I say "will" because I'm sure we will see lots of games using virtual textures on Orbis too.
 
That's all true to an extent, but being able to somewhat conveniently juggle data between two separate pools still requires more developer effort than simply having one single pool (which is as fast as those two pools combined) in the first place.

Hence (supposedly) why they have custom hardware to orchestrate all that.

If it all works as "advertised", they end up with much more memory on board, alleviate some of the bandwidth gap, and in turn have a reasonably large cache memory that's much better at handling random access and could eliminate stalls across the entire system.

Of course, it could all backfire, but if they are sticking to this plan I'd say that at the very least they are satisfied with the results.
 
That's all true to an extent, but being able to somewhat conveniently juggle data between two separate pools still requires more developer effort than simply having one single pool (which is as fast as those two pools combined) in the first place.

It's not going to require any significant effort from developers. There has been a system with embedded memory of one form or another in each of at least the past three generations. Developers are used to it and it has its benefits, and it definitely isn't comparable to the split memory of the PS3. This is in no way discounting the PS4's single fast pool, just pointing out that it isn't necessary to always draw a parallel with the PS4. To me it's just unnecessary and it's tiring.
 
This is in no way discounting the PS4's single fast pool, just pointing out that it isn't necessary to always draw a parallel with the PS4. To me it's just unnecessary and it's tiring.
The whole discussion started with oldergamer saying "There's no evidence that the new xbox is harder to work with." vis-a-vis PS4. I just pointed out that there is such evidence, even though it need not be a significant difference. Of course, PS3 had 8 user-managed memory pools, and somehow some ports still turned out well.
 
The whole discussion started with oldergamer saying "There's no evidence that the new xbox is harder to work with." vis-a-vis PS4. I just pointed out that there is such evidence, even though it need not be a significant difference. Of course, PS3 had 8 user-managed memory pools, and somehow some ports still turned out well.

Oh OK, sorry about that. It's just that you can't seem to have a discussion in this thread about the benefits of the Durango architecture without somebody saying "yeah, but Orbis is better". I mean, the PS4 can be ten times better for all I care, and I am getting both systems at launch.
 
Don't confuse megatextures with virtual textures.

I'm not. I am talking about virtual textures, not megatextures. Orbis might have enough bandwidth to access the entire RAM per frame, but that doesn't mean you can't save it by accessing only what's really needed and using the bandwidth somewhere else, like really expensive alpha blending across a large portion of the screen.
 
It's not going to require any significant effort from developers. There has been a system with embedded memory of one form or another in each of at least the past three generations. Developers are used to it and it has its benefits, and it definitely isn't comparable to the split memory of the PS3. This is in no way discounting the PS4's single fast pool, just pointing out that it isn't necessary to always draw a parallel with the PS4. To me it's just unnecessary and it's tiring.

Do you not remember how much bitching there was from devs about the PS2's 4MB of embedded memory? Or noticed how few devs went to the effort of tiling to get a full 720p framebuffer with MSAA on 360? And that was in a system where the embedded memory was restricted to basically one use.
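
For reference, some rough back-of-the-envelope numbers (my own math, assuming a 32-bit colour target plus a 32-bit depth/stencil buffer per sample) on why a full 720p target with MSAA needed tiling on 360:

```
1280 x 720 x 8 bytes                ~  7.0 MB  -> fits in the 10 MB eDRAM
1280 x 720 x 8 bytes x 2 (2x MSAA)  ~ 14.1 MB  -> needs 2 tiles
1280 x 720 x 8 bytes x 4 (4x MSAA)  ~ 28.1 MB  -> needs 3 tiles
```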

I'm not. I am talking about virtual textures, not megatextures. Orbis might have enough bandwidth to access the entire RAM per frame, but that doesn't mean you can't save it by accessing only what's really needed and using the bandwidth somewhere else, like really expensive alpha blending across a large portion of the screen.

That doesn't require virtual textures.
 
Do you not remember how much bitching there was from devs about the PS2's 4MB of embedded memory? Or noticed how few devs went to the effort of tiling to get a full 720p framebuffer with MSAA on 360? And that was in a system where the embedded memory was restricted to basically one use.



That doesn't require virtual textures.
Unless I'm indeed confusing things, that's what virtual texturing is. I know AMD calls it Partially Resident Textures, but I thought the broader name was virtual texturing... Anyway, that's what the Spanish guy is talking about, right? And that would definitely benefit Orbis too...
 
Do you not remember how much bitching there was from devs about the PS2's 4MB of embedded memory? Or noticed how few devs went to the effort of tiling to get a full 720p framebuffer with MSAA on 360? And that was in a system where the embedded memory was restricted to basically one use.

Tiling was automatic for XNA-created games; I'm sure MS provided similar things to devs of real games. I believe the bitching was due to the performance hit.
 
Excited to see how they utilize this, with the new ways ESRAM can be used versus the eDRAM.
Durante, any thoughts? I see some excitement on B3D, but they are getting highly technical.
 
Tiling was automatic for XNA-created games; I'm sure MS provided similar things to devs of real games. I believe the bitching was due to the performance hit.

Automatic is code for non-optimal. When devs have full control of the data pipeline they get the best results, though they may not care enough.
 
IMO I don't believe those rumors. I think someone wanted that information to sound negative, but in reality it's the same thing all the consoles are doing and have done in the past.
MS has actually forced devs to only use approved dev libraries for any kind of X360 development. On the other hand, on PS3 Sony has been encouraging devs to use LibGCM as much as possible (as close to the metal as it can get on today's machines) and circumvent the regular OpenGL path. From what I know it's the same exact deal on Vita, so it would be pretty consistent if it stays like that with PS4.
 
MS has actually forced devs to only use approved dev libraries for any kind of X360 development. On the other hand, on PS3 Sony has been encouraging devs to use LibGCM as much as possible (as close to the metal as it can get on today's machines) and circumvent the regular OpenGL path. From what I know it's the same exact deal on Vita, so it would be pretty consistent if it stays like that with PS4.

That's not true. It has been debunked several times by developers on B3D.
 
MS has actually forced devs to only use approved dev libraries for any kind of X360 development. On the other hand, on PS3 Sony has been encouraging devs to use LibGCM as much as possible (as close to the metal as it can get on today's machines) and circumvent the regular OpenGL path. From what I know it's the same exact deal on Vita, so it would be pretty consistent if it stays like that with PS4.
Ask yourself why MS would "force" anything on devs when they are known for bending over backwards to make the dev community happy. Seriously....
 
That's not true. It has been debunked several times by developers on B3D.
I read it at some point on Digital Foundry, and it wasn't debunked there, so I thought they knew what they were talking about. What's the actual story, then?

Ask yourself why MS would "force" anything on devs when they are known for bending over backwards to make the dev community happy. Seriously....
To ensure that everything can be easily emulated on future hardware is the line of thinking.
 
MS has actually forced devs to only use approved dev libraries for any kind of X360 development. On the other hand, on PS3 Sony has been encouraging devs to use LibGCM as much as possible (as close to the metal as it can get on today's machines) and circumvent the regular OpenGL path. From what I know it's the same exact deal on Vita, so it would be pretty consistent if it stays like that with PS4.

Approved libraries get support and regular updates. With unapproved libraries, well, you run into trouble and you could be shit out of luck. Yes, MS has always forced developers on their consoles to use DirectX, but keep in mind it's a version of DirectX tailored for the hardware in the console, and it supports some functions that you wouldn't find on the PC. Yeah, MS gives you less freedom of choice, but it's still just as custom or low-level as whatever Sony themselves are applying. Again, not any different than the last two generations, IMO. I'm still not certain why this is being discussed at all.

It's a non-issue, which is why I said it's being put forth as a negative when in reality it's not.
 
That's all true to an extent, but being able to somewhat conveniently juggle data between two separate pools still requires more developer effort than simply having one single pool (which is as fast as those two pools combined) in the first place.

See, I'm not really sure about that. I look at it this way: IMO more developer effort is spent trying to prevent system stalls and keep the hardware fed with a constant flow of data (to improve or keep performance high) than is needed when you don't have to worry about stalls.

What it really comes down to is less optimization in specific areas that would impact performance. Handling things differently doesn't necessarily mean more work, is what I'm saying.
 
Absolutely.

While that may be true historically, it's gotten less and less true as time moves on... the dev tools and libs provided by MS last gen were unprecedented in the console space... a lot of people assumed that Cell is the reason many developers liked coding for the 360 over the PS3, but there was a LOT more to it than just that.

It's like comparing coding in free Java Studio to enterprise Visual Studio .NET (with all the server-side collaborative stuff)... there really is no comparison; the development "pipeline" and the integration with tools like Maya and 3D Studio Max were (and probably still are) light years ahead of what Sony or Nintendo have.

Again, I'm not talking about this in a comparison sense as far as Durango vs. PS4 (I still believe PS4 will be more "powerful"), more of an overall discussion of Durango by itself, in a vacuum.
 
If Microsoft have solved the bandwidth issues (and all this points to a yes) while simultaneously providing 8 GB of RAM, then it'll be interesting to see how this plays out in a few years, when devs get used to coding for Durango and start utilizing that extra RAM. I'm starting to feel more confident about this system, and we still don't know anything about the three display planes.
 
If Microsoft have solved the bandwidth issues (and all this points to a yes) while simultaneously providing 8 GB of RAM, then it'll be interesting to see how this plays out in a few years, when devs get used to coding for Durango and start utilizing that extra RAM. I'm starting to feel more confident about this system, and we still don't know anything about the three display planes.

3 display planes? Where did you read that?
 
Thanks for that. There's a very good explanation there from ERP and Dominik. There's also a link there to the very article I read a few years back that I mentioned above.

http://www.eurogamer.net/articles/digitalfoundry-directx-360-performance-blog-entry

Suspicions were first aroused by a tweet by EA Vancouver's Jim Hejl who revealed that addressing the Xenos GPU on 360 involves using the DirectX APIs, which in turn incurs a cost on CPU resources. Hejl later wrote in a further message that he'd written his own API for manual control of the GPU ring, incurring little or no hit to the main CPU.

"Cert would hate it tho," he added mysteriously.

So it could be that they don't allow you to override what they already have in those slim libraries, but those libraries are already pretty close to the metal as it is.
 
Thanks for that. There's a very good explanation there from ERP and Dominik. There's also a link there to the very article I read a few years back that I mentioned above.

http://www.eurogamer.net/articles/digitalfoundry-directx-360-performance-blog-entry



So it could be that they don't allow you to override what they already have in those slim libraries, but those libraries are already pretty close to the metal as it is.

No. Read sebbbi's post specifically, and check most of the PDF articles from MS Gamefest. They would describe a method for achieving something, but they would always say that you can use microcode to get better performance if you want. Microcode is about as close to coding to the metal as you can get.
 
If Microsoft have solved the bandwidth issues (and all this points to a yes) while simultaneously providing 8 GB of RAM, then it'll be interesting to see how this plays out in a few years, when devs get used to coding for Durango and start utilizing that extra RAM. I'm starting to feel more confident about this system, and we still don't know anything about the three display planes.

As I understand it, the move engines are not really there to save bandwidth, but to save GPU cycles. No matter how you slice it, they are still moving data around at a peak of 102 GB/s.
 
Approved libraries get support and regular updates. With unapproved libraries, well, you run into trouble and you could be shit out of luck. Yes, MS has always forced developers on their consoles to use DirectX, but keep in mind it's a version of DirectX tailored for the hardware in the console, and it supports some functions that you wouldn't find on the PC. Yeah, MS gives you less freedom of choice, but it's still just as custom or low-level as whatever Sony themselves are applying. Again, not any different than the last two generations, IMO. I'm still not certain why this is being discussed at all.

It's a non-issue, which is why I said it's being put forth as a negative when in reality it's not.
I remember this myth being spread as something that would give a huge edge in performance for PS3 as the generation went on... which strangely never happened.
 
I remember this myth being spread as something that would give a huge edge in performance for PS3 as the generation went on... which strangely never happened.

Really? From what I remember, in 2005 everyone was saying the opposite (that the 360's API was lower-level). The myth that D3D on 360 is highly abstracted seems to have started relatively recently (with that DF article).

For example, from Carmack (Quakecon 2005):
I’m not completely sure yet which direction we’re going to go, but the plan of record is that it’s going to be more the Microsoft model right now where we’ve got the game and the renderer running as two primary threads and then we’ve got targets of opportunity for render surface optimization and physics work going on the spare processor, or the spare threads, which will amenable to moving to the CELL, but it’s not clear yet how much the hand feeding of the graphics processor on the renderer, how well we’re going to be able to move that to a CELL processor, and that’s probably going to be a little bit more of an issue because the graphics interface on the PS3 is a little bit more heavyweight. You’re closer to the metal on the Microsoft platform and we do expect to have a little bit lower driver overhead.
 
3 display planes? Where did you read that?

I believe the only news we have pointing towards display planes is the survey that vgleaks put out on what we would like to know more about concerning Durango. Display planes were listed as an option. Outside of their existence, I don't think we know much.
 
Interesting, I must have missed that post. I'm assuming you could render things to two different display planes and the hardware composites them before pushing to the display.

Hmm, that would mean it's possible to render each plane at a different resolution. The more I hear, the more this sounds similar to Talisman, the proposed graphics hardware that MS researched many moons ago.

It too could render various screen elements at different resolutions; however, this was before 3D accelerators became prominent.
 
Interesting, I must have missed that post. I'm assuming you could render things to two different display planes and the hardware composites them before pushing to the display.

Hmm, that would mean it's possible to render each plane at a different resolution. The more I hear, the more this sounds similar to Talisman, the proposed graphics hardware that MS researched many moons ago.

It too could render various screen elements at different resolutions; however, this was before 3D accelerators became prominent.

I'm guessing that it's like IllumiRoom... as a dev you can have render targets that you render content onto, which are beamed separately from each other onto whatever surfaces are present. These surfaces can be physically different or can be on top of each other... just guessing here...

An example of "on top of each other" is a game's surface, and on top of that another surface that apps are being rendered onto...
 
Interesting, I must have missed that post. I'm assuming you could render things to two different display planes and the hardware composites them before pushing to the display.

Yeah, there was also a patent posted describing just that: http://www.faqs.org/patents/app/20110304713

Hmm, that would mean it's possible to render each plane at a different resolution. The more I hear, the more this sounds similar to Talisman, the proposed graphics hardware that MS researched many moons ago.

It too could render various screen elements at different resolutions; however, this was before 3D accelerators became prominent.

Oddly enough, right before all that info came out I was reading an old presentation about scaling on 360, and they made the same reference to Talisman.


A little nostalgia: Talisman was my first job at Microsoft, back in 1996. It was a hardware initiative where objects were rendered to offscreen surfaces and depth-buffers similar to what we call Z-Sprites nowadays. The hardware would then composite these layers prior to or even during the video signal out. With literally all the objects in a scene rendered into their own 2D layer, software would decide whether a layer could be reused – moved and transformed in 2D – or needed to be re-rendered. This framerate-independent rendering aspect of Talisman was a cool idea but seems to have been lost with time. Talisman never made it to market, but now you know why DirectX went from DX3 to DX5. As DirectX 4 was for Talisman, and without Talisman, it was skipped. However, one lovely feature that remained was SetRenderTarget(), a totally new concept designed for Talisman…something that was challenging for hardware at the time, as render target memory and texture memory were separate in the hardware of the era. A tangent yes, but now you know

Mixed-resolution rendering seems to be in vogue. Specifically for translucency effects, like a smoke or fireball effect that fills the screen and has tons of overdraw.
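
For anyone curious, the usual trick looks roughly like this (a hedged sketch; all the function and type names below are hypothetical placeholders, not from any real API): the heavy-overdraw translucency is rendered into a quarter-size buffer and then blended back over the full-resolution frame, which cuts the blending bandwidth to roughly a quarter.

```cpp
struct RenderTarget { int width, height; };

// Stubs standing in for the real GPU work in this sketch.
void RenderOpaque(RenderTarget& /*color*/, RenderTarget& /*depth*/) {}
void DownsampleDepth(const RenderTarget& /*fullDepth*/, RenderTarget& /*halfDepth*/) {}
void RenderTranslucency(RenderTarget& /*halfColor*/, const RenderTarget& /*halfDepth*/) {}
void UpsampleAndComposite(const RenderTarget& /*halfColor*/, RenderTarget& /*fullColor*/) {}

void RenderFrame() {
    RenderTarget fullColor{1280, 720}, fullDepth{1280, 720};
    RenderTarget halfColor{640, 360},  halfDepth{640, 360};

    RenderOpaque(fullColor, fullDepth);           // opaque geometry at full resolution
    DownsampleDepth(fullDepth, halfDepth);        // so half-res smoke still gets occluded correctly
    RenderTranslucency(halfColor, halfDepth);     // tons of overdraw at a quarter of the pixel cost
    UpsampleAndComposite(halfColor, fullColor);   // upsample and blend back over the full-res frame
}
```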
 
Very good information, guys! Interesting read. I think there may be a key in the IllumiRoom comment as well. Did you notice how, in the IllumiRoom concept video, the graphics displayed beyond the borders of the TV were of a different resolution and refresh rate?

I was wondering how something like that could be possible with a single graphics chip. I'm guessing two graphics chips would have to be used, with one rendering to each display plane before the compositing occurs. I'm still wondering if there's the possibility of two GPUs in the Xbox 720.
 
Automatic is code for non-optimal. When devs have full control of the data pipeline they get the best results, though they may not care enough.

It really depends; automatic optimizations can be really fantastic when there's no ambiguity in the task and, in that context, whatever is doing the optimizing knows exactly what you are trying to do. 32 MB can be large enough for a simple cache scenario: in one frame you render only from main RAM, and by the next one the system already knows which data was requested most and puts parts of it into the ESRAM to speed up access... Some simple use cases like this one can already improve performance and are simple to optimize automatically.
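
A minimal sketch of what that kind of automatic scheme could look like (purely my own illustration, nothing from the leaked docs): count how often each 64 KB page of main RAM gets touched during a frame, then promote the hottest pages into the 32 MB pool before the next frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

constexpr std::size_t kPageBytes  = 64 * 1024;
constexpr std::size_t kEsramPages = (32 * 1024 * 1024) / kPageBytes;   // 512 pages

class HotPageTracker {
public:
    // Called for every (tracked) memory access during the frame.
    void RecordAccess(std::uint64_t address) { ++counts_[address / kPageBytes]; }

    // Pages worth copying into the embedded pool before the next frame, hottest first.
    std::vector<std::uint64_t> HottestPages() const {
        std::vector<std::pair<std::uint64_t, std::uint64_t>> pages(counts_.begin(), counts_.end());
        std::sort(pages.begin(), pages.end(),
                  [](const auto& a, const auto& b) { return a.second > b.second; });
        std::vector<std::uint64_t> result;
        for (std::size_t i = 0; i < pages.size() && i < kEsramPages; ++i)
            result.push_back(pages[i].first);                 // page index
        return result;
    }

    void NewFrame() { counts_.clear(); }

private:
    std::unordered_map<std::uint64_t, std::uint64_t> counts_;  // page index -> access count
};
```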

As I understand it, the move engines are not really to save bandwidth, but to save GPU cycles. No matter how you slice it, they are still moving data around at a peak 102GB/s.

They mitigate bandwidth issues by moving pieces of data to ESRAM, so when the GPU needs to access them it can fetch from both pools at the same time.

Obviously, it's not going to be 68 + 102 GB/s, but the perceived bandwidth of the system can be much higher than that of either pool individually.
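
Rough numbers, purely my own back-of-the-envelope reading of the leaked figures:

```
DDR3 pool:   up to  68 GB/s
ESRAM pool:  up to 102 GB/s
Ideal case (every read hits the right pool, nothing being copied): ~170 GB/s combined
Realistic:   somewhere between 68 and 170 GB/s, since the move-engine copies that
             keep the ESRAM filled also consume bandwidth on both sides
```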
 