8th October 2019, 22:36   #7092
FranceBB
Broadcast Encoder
 
 
Quote:
Originally Posted by aymanalz
He has raised a valid and pertinent point. After all these years, x265 is still painfully slow on "normal" processors. You are right that the only way to encode faster is to get faster processors with more cores. I wish that wasn't the case, and that by now the developers could have made it faster. Maybe it is mathematically/programatically impossible to get any more speed improvements. That's a pity.
It was pretty much the same during the shift from MPEG-2, and then Xvid, to H.264: the computational complexity was way higher, and the old single-core, single-thread CPUs couldn't cope with the resources required. The fact that H.264 has been the de facto standard for several years kinda got us used to high encoding speeds. As a matter of fact, MPEG-2 encoders like x262 and MPEG-4 ASP encoders like Xvid were either not parallelized at all or poorly parallelized, while x264 can max out a very high number of cores and threads, depending on the settings.
One thing it won't use properly, though, is a second CPU: I noticed that in a dual-socket configuration, like a dual Xeon with a high number of cores and threads per CPU, x264 will use only one of the CPUs, which reduces the speed.
Anyway, x264 is generally so fast by now that it's fine, also thanks to modern hand-written assembly optimizations, like AVX2, that were not available in the x262 and Xvid encoders.
x265, on the other hand, has been developed with modern hardware in mind: not only does it use modern assembly optimizations (like x264), it is also heavily parallelized, it uses both CPUs in a dual-socket environment, and it offers some additional settings you can enable if your CPU has so many cores that the encoder doesn't max it out.
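To make the "additional settings" a bit more concrete, here is a minimal sketch that only assembles an x265 command line (it doesn't run anything). The `--pools`, `--pmode` and `--pme` options come from the x265 CLI documentation; whether these are exactly the settings meant above is an assumption, and the helper name and file names are made up for illustration.

```python
import shlex


def build_x265_command(source, output, pools="*", extra_parallelism=True):
    """Assemble an x265 invocation as an argument list (nothing is executed).

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # --pools "*" asks x265 to create thread pools on all NUMA nodes,
    # which is what lets it spread work across both sockets.
    args = ["x265", "--input", source, "--output", output, "--pools", pools]
    if extra_parallelism:
        # Parallel mode decision and parallel motion estimation: extra
        # work distribution that may help when many cores sit idle.
        args += ["--pmode", "--pme"]
    return args


# Placeholder file names, not taken from the post.
command = build_x265_command("input.y4m", "output.hevc")
print(shlex.join(command))
```

Whether `--pmode` and `--pme` actually help depends on the core count and preset, so it's worth benchmarking them on your own material before relying on them.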

You mentioned the mathematical complexity, and indeed H.264 is based on the Discrete Cosine Transform (in practice an integer approximation of it, since the DCT proper works with real numbers and its cosine basis functions are periodic in 2π) plus the Hadamard Transform, which is very light and is applied on top of the DCT stage to take care of what the DCT couldn't compress well enough. H.265 is indeed more demanding in terms of computational cost, as it uses both the Discrete Cosine Transform and the Discrete Sine Transform. Keep in mind, though, that it could have been even more demanding: years ago, before 2013, there were proposals to use the Karhunen-Loève transform (KLT), which is the heaviest transform I know of in terms of computational cost. Those proposals came up because, back in the day, it seemed impossible to achieve a 40% bitrate reduction compared to H.264 with a linear-algebra-only approach. According to the results, the KLT did beat the DCT, but in some cases the improvement was so small and the computational cost so high that they decided not to proceed with it, which led to the modern DCT + DST approach.
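To show why these two transforms are cheap, here is a small pure-Python sketch of the 4x4 integer core transform used by H.264 (an integer approximation of the DCT) and the 4x4 Hadamard matrix. It is an illustration only, not code from x264: the helper names are made up, and the scaling that the standard folds into quantization is left out.

```python
# 4x4 forward core transform matrix from H.264: small integers only,
# so it can be computed with additions and shifts.
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

# 4x4 Hadamard matrix: entries are +1/-1, so it needs additions only.
HADAMARD = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def transform_2d(matrix, block):
    """Separable 2D transform: matrix * block * matrix^T."""
    return matmul(matmul(matrix, block), transpose(matrix))


# A flat block has all its energy in the DC coefficient.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = transform_2d(CORE, flat_block)
print(coeffs[0][0])  # 160 (unnormalized: 16 * 10), all other entries are 0

# The Hadamard matrix is its own inverse up to a scale factor:
# applying the 2D transform twice returns 16 times the input.
ramp_block = [[r * 4 + c for c in range(4)] for r in range(4)]
twice = transform_2d(HADAMARD, transform_2d(HADAMARD, ramp_block))
print(twice == [[16 * v for v in row] for row in ramp_block])  # True
```

In the standard the Hadamard stage works on the DC coefficients collected from the core transform (for example in Intra 16x16 luma macroblocks), which is the "second pass" mentioned above.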
If you take a look at the "future", you'll see 8K and H.266 VVC, which inherited the Discrete Cosine Transform and Discrete Sine Transform approach from H.265 HEVC but also uses an adaptive multiple transform (AMT) scheme for residual coding of both inter-coded and intra-coded blocks. This approach basically consists of a set of five DCT- and DST-based transforms, namely DCT-II, DCT-V, DCT-VIII, DST-I and DST-VII, plus a Signal Dependent Transform (SDT) that competes with the AMT output. The SDT approximates the optimal Karhunen-Loève transform (KLT), which is a signal-dependent transform, by estimating the current signal to code (the transform block) from similar signals (i.e. reference patches) that are already coded and therefore available at the decoder. This way a lot of computational power is saved by not using the KLT directly, which is far too demanding in terms of computational cost.
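To illustrate what "signal dependent" means and why a fixed DCT is usually good enough, here is a rough pure-Python sketch. It assumes a toy AR(1) signal model with correlation 0.9 and a block length of 4 (both my assumptions, not anything from a codec), derives the KLT basis from the covariance matrix with a basic Jacobi eigen-solver, and compares the coding gain (arithmetic over geometric mean of the coefficient variances) against the fixed DCT-II.

```python
import math


def ar1_covariance(n, rho):
    """Covariance of a unit-variance AR(1) signal: R[i][j] = rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def dct2_basis(n):
    """Rows of the orthonormal DCT-II matrix (fixed, signal independent)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def jacobi_eigenvectors(matrix, sweeps=50):
    """Eigenvectors (as rows) of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A and V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [[v[i][k] for i in range(n)] for k in range(n)]


def coefficient_variances(basis_rows, cov):
    """Variance of each transform coefficient: b^T R b for every basis row b."""
    n = len(cov)
    return [sum(b[i] * cov[i][j] * b[j] for i in range(n) for j in range(n))
            for b in basis_rows]


def coding_gain(variances):
    """Arithmetic mean over geometric mean of the coefficient variances."""
    n = len(variances)
    geometric = math.exp(sum(math.log(x) for x in variances) / n)
    return (sum(variances) / n) / geometric


N, RHO = 4, 0.9  # toy block length and correlation (assumed values)
cov = ar1_covariance(N, RHO)
klt_rows = jacobi_eigenvectors(cov)  # must be recomputed whenever the statistics change
gain_klt = coding_gain(coefficient_variances(klt_rows, cov))
gain_dct = coding_gain(coefficient_variances(dct2_basis(N), cov))
print(f"KLT coding gain: {gain_klt:.3f}")
print(f"DCT coding gain: {gain_dct:.3f}")
```

For this toy model the DCT lands within about one percent of the KLT, while the KLT needs a covariance estimate and an eigen-decomposition for every new set of statistics. That is the trade-off described above, and the reason an approximation like the SDT is attractive.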

Anyway, you can be sure of one thing: it will be even more demanding, but, you know, in a world in which we have configurations with Intel Xeon CPUs like the Intel Xeon Platinum 9282 56c/112th, this doesn't seem to be a problem. For instance, at work I encode on an Intel Xeon 28c/56th with 64 GB of RAM and a Quadro GPU, even though I've asked them several times to upgrade the CPU, as it's been like this since 2017 and it's now "old" for what I gotta do on a daily basis.

What do I have at home? Well, a crappy i7 4c/8th with 32 GB of DDR4 and an NVIDIA RTX GPU, but it doesn't matter since the PC I use at home is general purpose: browsing (replying to you folks here on Doom9 :P), occasionally watching videos (although I do have my 4K Panasonic Blu-ray player for that), listening to music and studying (I'm in the middle of my master's at university while working as an encoder for a company).

In a nutshell: computational cost will always get higher and higher but CPUs will get better and better. :P

Last edited by FranceBB; 8th October 2019 at 22:44.