Quote:
Originally Posted by TheImperial2004
Nope. For myself, I'm talking entirely in theories. They can be right or wrong. I'm just wondering how we can magically speed things up while delivering the same quality as x264.
|
You never get any speed-up for free. And GPUs certainly won't "magically" make your program faster!
Porting software to CUDA/OpenCL isn't simple at all. Getting non-trivial software running on the GPU is a tough task, not to mention all the work that has to be done to optimize it for speed.
Also, there is absolutely no guarantee that your software will run any faster (more efficiently) on the GPU than it does on the CPU. It may or may not work out.
If your problem isn't highly parallel, it won't fit on the GPU. But even if your problem is highly parallel in theory, you still have to come up with a smart parallel algorithm that works on the real hardware.
See also:
http://forum.doom9.org/showpost.php?...&postcount=192
Also, this example shows how complex it is to optimize even something as simple as a "parallel reduction" in CUDA:
http://developer.download.nvidia.com.../reduction.pdf
Quote:
Originally Posted by TheImperial2004
Dark Shikari had already state that we need to build the whole encoder from scratch . So I think it'd be best if we wait for H.265 and build x265 from the ground up to harness the GPU.
|
Not necessarily the whole encoder, but a significant part.
You can't "move" a single DSP function to the GPU (even if it is a LOT faster there), because the delay of the CPU -> GPU -> CPU data transfer would nullify the speed-up.
Instead you must "move" (read: re-implement) complete algorithms on the GPU, so that there are enough "calculations per data transfer" to justify the transfer delay.
(Furthermore, we don't have any indication that H.265 will be easier or harder to implement on a GPU.)