Quote:
Originally Posted by Beelzebubu
Frames with a "similar entropy" reference each other, so a high-level P might use the previous P (which is coded 16 frames back) as its entropy reference, and a non-reference inner B frame (which might not be a reference picture at all for pixel purposes) may actually use the previous inner B-frame (which may well be the one directly before this, or usually 2 and sometimes 3 frames back) as its reference. So this certainly influences how well frame-multithreading scales, not in the worst possible way but not ideal either.
And that's why you see weird things where using 256 instead of 128 threads (I think this is 32/16 frame threads x 8 tile threads) on a 32 core leads to pretty significant speedups ( like this).
|
Great analysis, thanks!
And huh, I can just imagine the tears of people trying to implement low-cost HW decoders for this. I can see how interframe entropy could provide a percent or two of compression efficiency, though.
I would rather have per-frame entropy and no slice requirement if I had a choice.