6th September 2011, 00:00   #15
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,926
Quote:
The major issue here is the overhead the driver adds for memory copies.

John Carmack (ID Software) wrote about it in this interview.
Quote:
The topic of the GPU hardware race came up early in our talk and the response Carmack gave us was pretty interesting. Stating “I don’t worry about the GPU hardware at all, I worry about the drivers” seemed to be a reiterated point. This became very apparent to id Software while developing RAGE where even though the PC had truly an order of magnitude more horsepower than the consoles, it struggled to keep up with the “minimum latency”, get feedback here, update data there, etc and do it all to maintain a 60 Hertz frame rate. DirectX 11 and multi-threaded drivers might have helped things but he still claims that they are far from the solution he envisions: direct surfacing of the memory system. The process of updating a texture on the PC is on the order of “tens of thousands of times slower” than on the Xbox 360 and PS3. AMD did implement a “multi-texture” update specifically for id Tech 5 which should help, but from the interview you can tell that Carmack really does want more done on this topic.

One interesting side effect of this talk – Intel’s integrated graphics actually has impressed Carmack quite a bit and the shared memory address space could potentially fix much of this issue. AMD’s Fusion architecture, seen in the Llano APU and upcoming Trinity design, would also fit into the same mold here. He calls it “almost a forgone conclusion” that eventually this type of architecture is going to be the dominant force. You might remember our discussion of this topic with Josh’s analysis of AMD’s Fusion System Architecture – it would appear that AMD has a potential ally on its side if they are paying attention.
The same situation applies here too. Basically, the Intel GPU driver provides virtual GPU memory that in reality resides in system RAM.
But... you can't get direct access to that memory. The way the driver provides access to this memory is thousands of percent slower than if the driver were able to point to the real memory address and let you just copy the image directly.
Also, when we talk about GPU/CPU efficiency we have to look at the OS itself and its current driver architecture. WDDM 1.1 is just the start of this process; the next Windows is going to bring the next step, until we some day reach WDDM 2.0.
We already had a similar discussion on Beyond3D, and nobody really wants to go back to coding the GPU directly in assembler style anymore, so yeah, it's up to Microsoft and the vendors to improve this.

Quote:
The (relatively) high CPU usage is caused by one thing - memory copying from the GPU to system memory. I'll try to reduce this by trying to do VPP (DXVA/MSDK video post processing) to a system memory buffer. Hopefully the driver will do the copy faster than memcpy().
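For reference, here is a minimal sketch in C of what the plain memcpy() path looks like once you have a CPU-visible pointer to the decoded frame. It assumes an NV12 surface whose rows are `src_pitch` bytes apart (the kind of pointer and pitch a surface lock hands back) and whose chroma plane starts right after `height` luma rows; real drivers may pad or align differently. `copy_plane` and `copy_nv12` are made-up names for illustration, not part of DXVA or the Media SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `rows` rows of `row_bytes` visible bytes each. Source rows are
 * `src_pitch` bytes apart, destination rows `dst_pitch` bytes apart. */
static void copy_plane(uint8_t *dst, size_t dst_pitch,
                       const uint8_t *src, size_t src_pitch,
                       size_t row_bytes, size_t rows)
{
    assert(src_pitch >= row_bytes && dst_pitch >= row_bytes);

    if (src_pitch == row_bytes && dst_pitch == row_bytes) {
        /* No padding on either side: one large copy is enough. */
        memcpy(dst, src, row_bytes * rows);
        return;
    }
    /* Padded rows: copy only the visible part of each row. */
    for (size_t y = 0; y < rows; ++y)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
}

/* Copy an NV12 frame (full-height Y plane, then a half-height plane of
 * interleaved UV) into a tightly packed system memory buffer.
 * Assumption: the UV plane starts at src + src_pitch * height. */
static void copy_nv12(uint8_t *dst, const uint8_t *src, size_t src_pitch,
                      size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);

    copy_plane(dst, width, src, src_pitch, width, height);
    copy_plane(dst + width * height, width,
               src + src_pitch * height, src_pitch, width, height / 2);
}
```

Every decoded frame goes through a copy like this once, so the cost grows with resolution and frame rate.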


Quote:
My idea with FFDShow is to have a one-stop decoder that's low on power and high on quality. I want to abstract the HW acceleration and hopefully not lose too much because of the above frame copying.
Nvidia was very successful with this.
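To make the "abstract the HW acceleration" idea concrete, here is a rough sketch in C of one way a decoder could hide the backends behind a common descriptor and fall back in priority order. Everything in it is hypothetical: the `decoder_backend` struct, the stub probes and the backend names are placeholders, not ffdshow's actual design or any vendor API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend descriptor; all names are illustrative only. */
typedef struct decoder_backend {
    const char *name;
    int (*probe)(void);     /* returns nonzero if usable on this machine */
    int copies_to_sysmem;   /* 1 if frames must be copied back from the GPU */
} decoder_backend;

/* Stand-in probes; real ones would query the driver or SDK. */
static int probe_unavailable(void) { return 0; }
static int probe_available(void)   { return 1; }

/* Return the first usable backend in priority order, or NULL if none is. */
static const decoder_backend *pick_backend(const decoder_backend *list,
                                           size_t count)
{
    assert(list != NULL || count == 0);

    for (size_t i = 0; i < count; ++i)
        if (list[i].probe != NULL && list[i].probe())
            return &list[i];
    return NULL;
}

/* Example priority list: hardware paths first, software decode last. */
static const decoder_backend example_backends[] = {
    { "hw-native",   probe_unavailable, 0 },
    { "hw-copyback", probe_available,   1 },
    { "software",    probe_available,   0 },
};
```

With this list, `pick_backend(example_backends, 3)` skips the unavailable first entry and returns the copy-back one; the `copies_to_sysmem` flag lets the caller know it pays for the frame copy discussed above.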

Quote:
Just using DXVA to decode isn't trivial as different splitters behave differently and give different data and maybe the HW decoders aren't following the various specs to the letter. Microsoft's documentation isn't clear enough on how to write things properly. Theoretically they could have created a DXVA decoder themselves, but they didn't. The same goes for Intel/AMD/Nvidia.
Yeah, true. Many ISVs know that, and some do better than others in that regard. Having more open and better documented APIs like Nvcuvid, Open Video and Intel Media SDK is great and will hopefully make this easier for devs. Lav Cuvid and ffdshow-quicksync are nice examples, though Nvidia is still in the lead here, and both AMD and Intel came late to the game.
It also makes it much easier to adapt to a new renderer that doesn't support DXVA and to use its full capabilities without being limited.
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is too Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 6th September 2011 at 00:31.