3rd August 2010, 11:26 | #21 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,251
|
Quote:
http://developer.download.nvidia.com...tart_Guide.pdf Search for "Pointer Traversal" and you'll find: ...pointers must be converted to be relative to the buffer base pointer and only refer to data within the buffer itself (no pointers between OpenCL buffers are allowed)
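As a rough illustration of that rule (not taken from the NVIDIA guide), here is a minimal Python sketch in which a linked list is packed into one flat buffer and every "next" link is stored as a byte offset from the start of the buffer instead of a raw pointer. The node layout, the `NIL` sentinel and the function names are all made up for the example.

```python
import struct

# Hypothetical node layout: int32 value, int32 offset of the next node.
NODE = struct.Struct("<ii")
NIL = -1  # sentinel offset meaning "no next node"


def pack_list(values):
    """Pack values into one buffer; links are offsets from the buffer base."""
    buf = bytearray(NODE.size * len(values))
    for i, v in enumerate(values):
        is_last = i == len(values) - 1
        next_off = NIL if is_last else (i + 1) * NODE.size
        NODE.pack_into(buf, i * NODE.size, v, next_off)
    return buf


def walk(buf, start=0):
    """Follow the offset links, the way a kernel would from its base pointer."""
    out = []
    off = start if len(buf) else NIL
    while off != NIL:
        value, off = NODE.unpack_from(buf, off)
        out.append(value)
    return out
```

Because the links are offsets, the buffer can be copied to any address (host or device) and still be traversed, which is the point of the restriction quoted above.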
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ |
|
3rd August 2010, 23:24 | #22 | Link |
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Am I the only one who believes that all "minor" x264 development should be postponed and all efforts focused on finding a way to offload at least the ME (motion estimation) to the GPU?
One can dream, can't he?
__________________
AviDemux Windows Builds |
4th August 2010, 01:15 | #23 | Link | |
Drazick
Join Date: May 2003
Location: Israel
Posts: 139
|
Quote:
I wish someone would offer a significant improvement in encoding performance. |
|
4th August 2010, 05:28 | #25 | Link |
Registered User
Join Date: Nov 2003
Posts: 1,281
|
More speed is always nice. But personally, x264 is 'fast enough' for my needs.
I would rather gain 20% more quality for the same bitrate than 20% more speed for the same options.
__________________
http://www.7-zip.org/ |
4th August 2010, 07:24 | #26 | Link | |
Registered User
Join Date: Jun 2005
Posts: 278
|
Quote:
The only case I can think of is when you have a lot of different-sized data sets that you have to copy around continuously. But unless NVIDIA has significantly improved the memory fragmentation issue, even there it might be better to allocate a single "max-sized" buffer and do your own memory management, which eliminates the issue, albeit in a really ugly way. |
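A minimal sketch of what "do your own memory management" inside one big buffer could look like, assuming the simplest possible scheme: a bump allocator that hands out offsets and is reset in one go. The class name and the 256-byte alignment are arbitrary choices for the example, not anything required by OpenCL or CUDA.

```python
class BumpArena:
    """Hands out offsets inside one pre-allocated buffer of fixed capacity."""

    def __init__(self, capacity, alignment=256):
        self.capacity = capacity
        self.alignment = alignment  # assumed value, purely illustrative
        self.top = 0  # offset of the first free byte

    def alloc(self, size):
        """Return the offset of a new block, or None if it does not fit."""
        start = -(-self.top // self.alignment) * self.alignment  # round up
        if start + size > self.capacity:
            return None
        self.top = start + size
        return start

    def reset(self):
        """Release every block at once, e.g. between frames."""
        self.top = 0
```

Individual blocks cannot be freed, which is what keeps fragmentation out of the picture; the price is that the buffer must be sized for the worst case up front.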
|
4th August 2010, 11:05 | #27 | Link | ||
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Quote:
Quote:
So I don't think we are in a rush for more quality at this point in development.
__________________
AviDemux Windows Builds Last edited by TheImperial2004; 4th August 2010 at 11:09. |
||
4th August 2010, 11:11 | #29 | Link | |
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Quote:
EDIT: Wait a sec! Does that mean all CUDA encoders right now are faster just because they produce much lower quality than x264? Come to think of it, how can a GPU be 20x faster than a CPU? They advertise that their CUDA encoders are 20x faster than regular CPU ones... Now I see your point, DS.
__________________
AviDemux Windows Builds Last edited by TheImperial2004; 4th August 2010 at 11:18. |
|
4th August 2010, 11:37 | #31 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
A Honda Civic is 10 times faster than a go-kart. It's easy to be "20 times faster" when you're comparing yourself to the worst encoders on the market. I have yet to see a CUDA encoder that's faster than x264. All of them are "fast" because they use incredibly crappy encoding settings, and if you set x264 to use comparable settings, x264 is just as fast, or even faster. |
|
4th August 2010, 12:30 | #32 | Link | |
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Quote:
Thanks, DS. I'm now released from GPGPU torment.
__________________
AviDemux Windows Builds |
|
4th August 2010, 13:17 | #33 | Link | |
Registered User
Join Date: Apr 2009
Posts: 478
|
Quote:
I would say that speed is like money: you can never have enough of it. Nothing is ever fast enough. I would definitely say that x264 is not fast enough for HD content even if you have an i7-980X. That said, would the AVX instructions in Bulldozer and Sandy Bridge make a huge difference in speed? Last edited by aegisofrime; 4th August 2010 at 13:21. |
|
4th August 2010, 15:08 | #35 | Link | |
Registered User
Join Date: Oct 2006
Posts: 150
|
Quote:
Theoretically, a Radeon 5970 has 9 times the double-precision floating-point throughput of the best Core i7 980X. Some matrix workloads can be processed more than 9 times faster on such GPUs. However, if the code isn't suited to that kind of job, you can actually end up losing speed compared to a CPU. As far as I know, x264 doesn't use FP calculations, and the GPGPU programming landscape is a huge mess right now, so whether a port of x264 to the GPU would actually bring any speed advantage is highly debatable. Then there is the question of optimizing it. |
|
4th August 2010, 17:49 | #36 | Link | |
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Quote:
I believe the major issue here is synchronizing the data between two different entities. What if the GPU is just too fast for the CPU to keep up with? Of course we will need the CPU to do some of the calculations. If the CPU is 9x slower than the GPU, then what's the point? In that case the GPU will have to wait for the CPU to respond and complete its part of the job, and *only* then will the GPU continue with its own part. Lag is the major issue here. Feel free to correct me, though.
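That worry can be put into rough numbers with Amdahl's law. The sketch below assumes the two parts run strictly one after the other and ignores transfer and synchronization costs entirely; the 9x figure is the theoretical one mentioned earlier in the thread, and the offloaded fractions are hypothetical, not measurements of x264.

```python
def overall_speedup(offloaded_fraction, gpu_speedup):
    """Amdahl's law: the part left on the CPU still runs at its old speed."""
    cpu_part = 1.0 - offloaded_fraction
    return 1.0 / (cpu_part + offloaded_fraction / gpu_speedup)


# Hypothetical splits of the encoding work between CPU and GPU.
for fraction in (0.25, 0.5, 0.9):
    print(f"{fraction:.0%} offloaded at 9x: {overall_speedup(fraction, 9):.2f}x overall")
```

Even with half of the work running 9x faster, the whole job is only 1.8x faster, because the CPU's half takes just as long as before.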
__________________
AviDemux Windows Builds |
|
4th August 2010, 17:56 | #37 | Link |
Drazick
Join Date: May 2003
Location: Israel
Posts: 139
|
A guy comes along and takes on a big challenge, yet you take all the wind out of his sails.
Try to support this. Worst case, we'll be left with the CPU :-). Let him explore. There are so many smart guys here; they might come up with a solution. |
4th August 2010, 18:35 | #38 | Link | |
Registered User
Join Date: Dec 2008
Posts: 589
|
Quote:
As for what you're talking about, I'm not sure it can be improved much as long as the encoder keeps assuming that it receives a series of frames and must process them as they come. Sure, that's needed for real-time encoding and streaming, but in lots of cases the whole content to be encoded is already physically there. Imagine, for example, that you have a 10 GB video to re-encode and an 8-core processor: you could quickly parse the first 512 MB of the content, split it into 8 smaller chunks, upload them to the video card and let it do its calculations while the 8-12 CPU threads work on each of those 8 chunks. If the CPU lags behind, just store the computations performed on the GPU somewhere and upload the next 512 MB chunk to the card, and when it's all done, do some computations to glue these chunks together.
With almost all cards nowadays having at least 512 MB of memory, and a lot of them 768 MB-1 GB, that's (I think) plenty of space to fill with data to be crunched while the CPU glues everything together. I'm not sure how much disk space or normal memory these GPU results would take, or whether dumping them to disk would be fast enough to make this faster than just doing it all on the CPU. Otherwise I don't really see a problem with using a lot of disk space to encode something: the mbtree file already uses about 200 MB to encode 3 GB of content. Last edited by mariush; 4th August 2010 at 18:38. |
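The bookkeeping for that scheme can be sketched in a few lines. This only plans byte ranges (512 MB batches, each split across 8 workers); the function name and defaults are invented for the example, and a real encoder could not cut at arbitrary byte offsets but would have to split on frame or GOP boundaries.

```python
MIB = 1024 * 1024


def plan_batches(total_bytes, batch_bytes=512 * MIB, workers=8):
    """Yield (batch_start, ranges) with up to one (start, end) range per worker."""
    for batch_start in range(0, total_bytes, batch_bytes):
        batch_end = min(batch_start + batch_bytes, total_bytes)
        length = batch_end - batch_start
        step = -(-length // workers)  # ceiling division
        ranges = [(s, min(s + step, batch_end))
                  for s in range(batch_start, batch_end, step)]
        yield batch_start, ranges


# A 10 GB input (taken here as 10 * 1024 MiB) gives 20 batches of 8 ranges.
batches = list(plan_batches(10 * 1024 * MIB))
print(len(batches), len(batches[0][1]))
```

Each range would then be handed to one CPU thread while the batch as a whole sits in GPU memory.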
|
4th August 2010, 20:02 | #39 | Link | |
Registered User
Join Date: Oct 2006
Posts: 150
|
Quote:
One way to avoid hitting a bottleneck would be to program the encoder to perform most of the (decoding and) encoding on the GPU while the CPU is used only for I/O and task scheduling. I don't see that happening anytime soon due to technical constraints. "Running out of work" due to thread waits is already a big problem in the multicore CPU world, and the people involved are investing huge amounts of time in improving caching, branch prediction and so on. It's just that nobody has put the same effort into coordinating with the GPU. But as CPUs and GPUs are getting "fused" (lol, math co-processor redux), using GPUs to accelerate compression is only a matter of time. |
|
4th August 2010, 21:59 | #40 | Link | ||
C# Addict
Join Date: Oct 2008
Location: Saudi Arabia
Posts: 114
|
Quote:
"just store the computations performed on GPU somewhere": I don't think there will be any place to store them other than the HDD. And we all know what that might mean: yes, lag. For storing, let's say, a 512 MB segment every 10-30 seconds, I believe the HDD will be the bottleneck here. Your idea is great, but I can't see it "magically" improving encoding speed in the near future, especially when the HDD is involved. Quote:
I'm just wondering: if we are to offload "everything" to the GPU, how can a 600-700 MHz GPU be faster than a 3.0+ GHz CPU? Isn't clock speed all we are looking for? Corrections are welcome.
__________________
AviDemux Windows Builds |
||
Tags |
encoder, gpu, h.264 |