Old 16th September 2019, 06:03   #154
FranceBB
Broadcast Encoder
 
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,883
Quote:
Except that the implementation is not as easy as it may seem. All MPEG-based codecs (including AVC/H.264, HEVC/H.265 and VVC) make use of similarities between consecutive pictures. This means each encoding block needs to know about subsequent or previous images, as well as neighboring blocks. Any shortcut taken here will invariably lead to image quality degradation, and any over-engineering in the control mechanism will sacrifice the benefit of parallel processing. We aren't interested in this
Well, of course parallelization means a reduction in quality, and ideally we would stick with a single thread; however, the ever-increasing computational power required to encode has led us to where we are now: a very parallelized world.
I gotta say, though, that H.264 and H.265 parallelize very well: although a tiny reduction in quality can be measured with objective metrics like SSIM/PSNR under extreme parallelization, it's so small that it's almost unnoticeable, and H.265 in particular introduced several internal tools (tiles and wavefront parallel processing, for instance) that were designed from the start to be used efficiently by multi-core CPUs.
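Just to be concrete about what I mean by "measured with objective metrics": PSNR is nothing more than a log-scaled mean squared error between two decoded frames. Here's a minimal sketch in Python/NumPy; frame_ref/frame_test are hypothetical placeholders (in a real test you'd decode the single-threaded and the heavily parallelized encode of the same source):

Code:
# Minimal PSNR between two decoded frames (8-bit grayscale numpy arrays).
# frame_ref / frame_test are hypothetical placeholders for frames decoded
# from a single-threaded encode and a heavily parallelized encode.
import numpy as np

def psnr(frame_ref: np.ndarray, frame_test: np.ndarray) -> float:
    """PSNR in dB for 8-bit content: 10*log10(255^2 / MSE)."""
    mse = np.mean((frame_ref.astype(np.float64) - frame_test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(255.0 ** 2 / mse)

# Synthetic example: a frame plus tiny noise loses almost nothing.
ref = np.random.randint(0, 256, (1080, 1920)).astype(np.uint8)
test = np.clip(ref.astype(np.int16) + np.random.randint(-1, 2, ref.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, test):.2f} dB")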

Quote:
Is Hybrid GPU Encoding the ultimate answer? As every use case is different, there is no single best answer. However, Hybrid GPU accelerated encoding delivers faster processing, allowing for more live channels per server, with less demand for CPU power when encoding. But, the performance of the encoder depends on a variety of factors such as the desired encoding profile, resolution, server system, GPU model, etc. In today’s world of ever-increasing quality and performance requirements, especially for live video, GPU Hybrid offers flexibility to get the most out of your investment.
Sure, GPU encoding is a thing, but GPU encoders have repeatedly been shown to achieve lower quality than CPU encoders in x264 and x265 comparisons over the years.
Of course, things have improved since then, but they still lag behind.
Anyway, the same parallelization trade-off applies: in the world we're living in, there's so much pressure to parallelize workflows that hybrid CPU/GPU encoding is almost always preferred, despite a loss of quality which is not nearly as negligible as in the single-thread vs. multi-thread CPU case (heck, you can see the difference between CPU encoders and GPU encoders with your own eyes).


On a totally unrelated note...


Linear Algebra consideration:

I was taking a look at the official document about H.266 VVC.
I've stumbled across different transforms in my life, namely the DCT (Discrete Cosine Transform), DFT (Discrete Fourier Transform), WHT (Walsh-Hadamard Transform), DWT (Discrete Wavelet Transform) and of course the KLT (Karhunen-Loeve Transform).
The Discrete Cosine Transform is widely used in many codecs and indeed it's my favorite transform, 'cause it works with real numbers and its implied signal extension is even-symmetric (the block is mirrored), so the periodic repetition stays continuous at the block boundaries.
On the contrary, the Fourier Transform works with complex numbers and simply repeats the block as-is, so not only does it introduce jump discontinuities at the boundaries (more discontinuity points means energy smeared across many coefficients), but it also requires more computational power, since it's operating on complex values.
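Just to show what I mean with a toy example (Python/NumPy, the ramp signal is of course made up): on a smooth ramp whose endpoints don't match, the DCT packs almost all the energy into a couple of coefficients, while the DFT spreads it out, precisely because its periodic repetition creates a jump at the block boundary:

Code:
# Energy compaction on a ramp: the DFT "sees" a jump at the block boundary
# (periodic repetition), the DCT-II doesn't (even/mirrored extension).
import numpy as np
from scipy.fft import dct, fft

n = 8
x = np.arange(n, dtype=np.float64)      # a simple ramp, smooth inside the block

X_dct = dct(x, type=2, norm='ortho')    # real coefficients
X_dft = fft(x) / np.sqrt(n)             # complex coefficients, unitary scaling

# Fraction of total energy captured by the 2 largest-magnitude coefficients:
def top2_energy(coeffs):
    e = np.abs(coeffs) ** 2
    return np.sort(e)[-2:].sum() / e.sum()

print(f"DCT-II top-2 energy: {top2_energy(X_dct):.4f}")  # close to 1.0
print(f"DFT    top-2 energy: {top2_energy(X_dft):.4f}")  # noticeably lower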
As a matter of fact, in codecs like H.264 the DCT is used alongside the Hadamard Transform, which is very light (it needs additions and subtractions only): it doesn't compress much on its own, but it's applied as a secondary transform on top of the DCT, to take care of the residual correlation the DCT didn't remove (H.264 applies it to the DC coefficients of Intra 16x16 macroblocks and to the chroma DC coefficients).
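For the curious, this is roughly what that Hadamard stage looks like; a toy NumPy version (real encoders use the equivalent integer butterfly, and the scaling here is simplified):

Code:
# Toy 4x4 (Walsh-)Hadamard transform as used conceptually in H.264:
# only +1/-1 entries, so it needs additions/subtractions only - very cheap.
import numpy as np

H2 = np.array([[1, 1],
               [1, -1]])
H4 = np.kron(H2, H2)          # 4x4 Hadamard matrix (natural/Hadamard order)

def hadamard_2d(block: np.ndarray) -> np.ndarray:
    """Separable 2D Hadamard of a 4x4 block (unnormalized)."""
    return H4 @ block @ H4.T

# Hypothetical example: a 4x4 matrix of DC coefficients gathered from the
# sixteen 4x4 sub-blocks of an Intra 16x16 macroblock.
dc_block = np.arange(16).reshape(4, 4)
coeffs = hadamard_2d(dc_block)
# The transform is its own inverse up to a scale factor of 16 (= 4*4):
restored = hadamard_2d(coeffs) // 16
assert np.array_equal(restored, dc_block)
print(coeffs)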
As to the Wavelet Transform, it became popular years ago as a way to deal with the blocking artifacts caused by quantizing 4x4 and 8x8 DCT blocks, which were not pleasant at lower bitrates. A wavelet decomposition processes the whole image at full resolution all at once, so there's no block grid to begin with, which is why it was adopted in codecs like JPEG 2000 and the wavelet-based Dirac/VC-2. I've never been a fan of this transform, though, and as a matter of fact, despite its advantages and peculiarities, it's nowhere near as widespread as the DCT.
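A single level of a Haar decomposition is enough to see why wavelets don't block: the transform runs over the whole picture. A minimal sketch (Haar chosen purely for simplicity, even dimensions assumed):

Code:
# One level of a 2D Haar wavelet decomposition over the whole image:
# no 4x4/8x8 block grid anywhere, hence no blocking artifacts by construction.
import numpy as np

def haar_2d_level(img: np.ndarray):
    """Split an image (even dimensions assumed) into LL/LH/HL/HH sub-bands."""
    img = img.astype(np.float64)
    # Horizontal pass: averages and differences of column pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Vertical pass on both results.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return LL, LH, HL, HH

img = np.random.randint(0, 256, (64, 64))
LL, LH, HL, HH = haar_2d_level(img)
print(LL.shape)  # (32, 32): a half-resolution approximation of the full frame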
As to the Karhunen-Loève Transform, it became popular back in 2013, when H.265 was still at the design stage and we were all looking for alternatives.
The Karhunen-Loève Transform is probably the heaviest transform I'm aware of and it requires a lot of computational power. During the tests it did prove better than the Discrete Cosine Transform itself, but the advantage wasn't significant on a meaningful amount of content, while the computational cost was far higher, so the proposal was rejected at the time and the Discrete Cosine Transform and the Discrete Sine Transform were used in H.265 instead.
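For context, the reason the KLT is optimal and heavy at the same time is that its basis isn't fixed: it's the set of eigenvectors of the covariance matrix of the data itself. A sketch in NumPy on made-up correlated rows:

Code:
# KLT sketch: the transform basis is the eigenvector set of the data's own
# covariance matrix - optimal energy compaction, but data-dependent (heavy).
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training data: 10000 rows of 8 correlated samples,
# roughly mimicking neighboring pixels.
noise = rng.standard_normal((10000, 8))
rows = np.cumsum(noise, axis=1)           # strong sample-to-sample correlation

cov = np.cov(rows, rowvar=False)          # 8x8 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
klt = eigvecs[:, ::-1].T                  # rows = basis vectors, biggest first

coeffs = rows @ klt.T                     # forward KLT of every row
energy = (coeffs ** 2).mean(axis=0)
print(energy / energy.sum())              # energy piles up in the first few

Incidentally, for highly correlated first-order Markov sources (which natural images resemble), the KLT basis converges towards the DCT basis, which is the classic argument for why a fixed DCT gets so close to the "optimal" transform at a fraction of the cost.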

Linear Algebra question:

If I got it right, H.266 VVC inherited the Discrete Cosine Transform and Discrete Sine Transform approach from H.265 HEVC, which is fine, but it's also using an adaptive multiple transform (AMT) scheme for residual coding of both inter-coded and intra-coded blocks. This approach basically consists of a set of five DCT- and DST-based transforms, namely DCT-II, DCT-V, DCT-VIII, DST-I and DST-VII.
Fair enough, but how? I mean, I can't find any extensive documentation that fully explains how those are used.
Are they applied subsequently, one after the other, and if so, in which order? Or is there some sort of pre-analysis stage which looks at the frame, divides it into blocks, and decides which transform (or which subset of transforms) is going to be used?
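I haven't found the selection logic either, but at least the transform "types" themselves are easy to pin down: they're just different trigonometric basis formulas. A small NumPy sketch of the DST-VII and DCT-VIII basis matrices, using the formulas from the JVET documents (indexing conventions may differ):

Code:
# The AMT transform "types" are just different trigonometric basis formulas.
# DST-VII / DCT-VIII basis matrices as given in the JVET documents
# (row k = k-th basis vector of length N).
import numpy as np

def dst7(N: int) -> np.ndarray:
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    return np.sqrt(4.0 / (2 * N + 1)) * np.sin(np.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))

def dct8(N: int) -> np.ndarray:
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    return np.sqrt(4.0 / (2 * N + 1)) * np.cos(np.pi * (2 * k + 1) * (2 * n + 1) / (4 * N + 2))

for T in (dst7(4), dct8(4)):
    # Both are orthonormal, so T @ T.T should be the identity matrix.
    assert np.allclose(T @ T.T, np.eye(4))
print("DST-VII basis (N=4):")
print(np.round(dst7(4), 4))

How the encoder picks among these per block is exactly the part I'd love to see documented.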

Last edited by FranceBB; 16th September 2019 at 06:10.