Old 4th October 2019, 19:34   #7081  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,788
I wonder if that mode is really that beneficial. If it detects dupes, it could just code them as all-skip P or B frames with minimal bitstream overhead. Sure, that single frame is being saved, but in the grand scheme of things for a small scene of a static shot that seems insignificant. Never mind that repeat flags are likely to trip up decoders and/or muxers, as you basically generate a VFR bitstream.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
nevcairiel is offline   Reply With Quote
Old 4th October 2019, 20:07   #7082  |  Link
vpupkind
Registered User
 
Join Date: Jul 2007
Posts: 17
Quote:
Originally Posted by nevcairiel View Post
I wonder if that mode is really that beneficial. If it detects dupes, it could just code them as all-skip P or B frames with minimal bitstream overhead. Sure, that single frame is being saved, but in the grand scheme of things for a small scene of a static shot that seems insignificant. Never mind that repeat flags are likely to trip up decoders and/or muxers, as you basically generate a VFR bitstream.
All-skip frame is way more expensive than changing a pic_struct value.
The reason for doing such a weird VFR is that the same mechanism used for 3:2 pulldown and 24->60p conversion is used here.
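For anyone who hasn't looked at the repeat mechanism: a rough sketch of how a 24 -> 60p cadence falls out of per-picture repeat flags. The values 7 (frame doubling) and 8 (frame tripling) are the pic_struct codes as I understand the picture timing SEI tables; the function names are made up for illustration, and this is not how x265 itself is written.

```python
# pic_struct codes for progressive frames (picture timing SEI), as assumed here:
# 0 = show once, 7 = frame doubling, 8 = frame tripling.
FRAME, FRAME_DOUBLING, FRAME_TRIPLING = 0, 7, 8
DISPLAY_COUNT = {FRAME: 1, FRAME_DOUBLING: 2, FRAME_TRIPLING: 3}


def pulldown_24_to_60(num_coded_frames):
    """Alternate tripling and doubling so every 2 coded frames give 5 displayed frames."""
    return [
        FRAME_TRIPLING if i % 2 == 0 else FRAME_DOUBLING
        for i in range(num_coded_frames)
    ]


def displayed_frames(pic_structs):
    """Number of frames the display shows for a list of per-picture pic_struct values."""
    return sum(DISPLAY_COUNT[p] for p in pic_structs)


if __name__ == "__main__":
    one_second = pulldown_24_to_60(24)
    print(displayed_frames(one_second))  # 12 * 3 + 12 * 2 displayed frames
```

The same signalling covers a dropped duplicate: the frame before it is marked as doubling, and nothing is coded for the duplicate.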
vpupkind is offline   Reply With Quote
Old 4th October 2019, 20:26   #7083  |  Link
vpupkind
Registered User
 
Join Date: Jul 2007
Posts: 17
Quote:
Originally Posted by benwaggoner View Post
I see the default value of dup-threshold is 70. It would be helpful to know if higher numbers require more or less similarity, and ballpark how much similarity is required for 70. I hope it is sub-psychovisual at least.

I could see this helping efficiency and encoding speed for stuff like a title card displayed for a couple of seconds.
The threshold is the PSNR value (in dB) between consecutive pictures, so a higher number requires more similarity.
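A minimal sketch of what a PSNR-based duplicate test looks like, assuming 8-bit samples and a plain "PSNR at or above the threshold means duplicate" rule. The helper names are hypothetical and the actual x265 check may differ in detail (which planes it measures, for example).

```python
import math


def psnr(frame_a, frame_b, max_value=255):
    """PSNR in dB between two equally sized frames given as flat sample sequences."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of samples")
    if not frame_a:
        raise ValueError("frames must not be empty")
    sse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b))
    if sse == 0:
        return math.inf  # bit-identical frames
    mse = sse / len(frame_a)
    return 10.0 * math.log10(max_value ** 2 / mse)


def is_duplicate(frame_a, frame_b, threshold_db=70.0):
    """Assumed decision rule: a frame is a duplicate when PSNR >= threshold."""
    return psnr(frame_a, frame_b) >= threshold_db


if __name__ == "__main__":
    flat = [128] * 10_000
    shifted = [129] * 10_000        # every sample off by one level
    print(is_duplicate(flat, flat))     # identical frames
    print(is_duplicate(flat, shifted))  # MSE of 1 is far below 70 dB
```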
vpupkind is offline   Reply With Quote
Old 4th October 2019, 22:25   #7084  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,788
Quote:
Originally Posted by vpupkind View Post
All-skip frame is way more expensive than changing a pic_struct value.
Sure, in relative terms. But in absolute terms, in a sea of actually coded and changing frames, how much of a difference are we talking here? 0.01%? Likely not even that.

Quote:
Originally Posted by vpupkind View Post
The reason for doing such a weird VFR is that the same mechanism used for 3:2 pulldown and 24->60p conversion is used here.
A mechanism that's already rarely used in HEVC, and likely not well supported, or intentionally ignored, because original 24p content is just better than stuttery 30p.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
nevcairiel is offline   Reply With Quote
Old 5th October 2019, 03:51   #7085  |  Link
WhatZit
Registered User
 
Join Date: Aug 2016
Posts: 60
Quote:
Originally Posted by benwaggoner View Post
I could see this helping efficiency and encoding speed for stuff like a title card displayed for a couple of seconds.
I could see this being as big a pain-in-the-arse as traditional VFR. Luckily, it is disabled by default.

Still, one of Multicoreware's corporate clients probably asked for it (anime OTT?), so there it is.
WhatZit is offline   Reply With Quote
Old 5th October 2019, 15:54   #7086  |  Link
LigH
German doom9/Gleitz SuMo
 
LigH's Avatar
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 5,932
x265 3.2+5-354901970679
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
LigH is offline   Reply With Quote
Old 5th October 2019, 17:02   #7087  |  Link
mandarinka
Registered User
 
mandarinka's Avatar
 
Join Date: Jan 2007
Posts: 734
Quote:
Originally Posted by WhatZit View Post
I could see this being as big a pain-in-the-arse as traditional VFR. Luckily, it is disabled by default.

Still, one of Multicoreware's corporate clients probably asked for it (anime OTT?), so there it is.
Actually it's not really a good idea for anime. Duplicate removal filters like this can mishandle a very common scenario where the whole picture doesn't move at all, but there is just mouth movement in a tiny part. It can be just a few pixels, but killing that and replacing it with a wrong duplicate is a terrible sort of artifact.
mandarinka is offline   Reply With Quote
Old 6th October 2019, 15:33   #7088  |  Link
vpupkind
Registered User
 
Join Date: Jul 2007
Posts: 17
Quote:
Originally Posted by mandarinka View Post
Actually it's not really a good idea for anime. Duplicate removal filters like this can mishandle a very common scenario where the whole picture doesn't move at all, but there is just mouth movement in a tiny part. It can be just a few pixels, but killing that and replacing it with a wrong duplicate is a terrible sort of artifact.
I am unsure whether these two frames will have a PSNR of 70dB -- you will still have a significant absolute distance between a couple of co-located pixels, which should bring it below the threshold. Would be very interested in test results -- don't have a decent anime source.
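To put rough numbers on that, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: a 1920x1080 8-bit plane, PSNR taken over the whole plane, and two made-up "mouth" changes. It is not a measurement on real anime.

```python
import math

TOTAL_PIXELS = 1920 * 1080  # assumed 1080p luma plane


def psnr_for_local_change(total_pixels, changed_pixels, delta, max_value=255):
    """Whole-frame PSNR when only `changed_pixels` samples differ, each by `delta` levels."""
    mse = changed_pixels * delta ** 2 / total_pixels
    return 10.0 * math.log10(max_value ** 2 / mse)


# A 10x10 patch changing by 50 levels: a clearly visible mouth flap.
visible = psnr_for_local_change(TOTAL_PIXELS, 10 * 10, 50)

# Only 4 pixels changing by 10 levels: a very subtle change.
subtle = psnr_for_local_change(TOTAL_PIXELS, 4, 10)

if __name__ == "__main__":
    print(f"visible patch: {visible:.1f} dB")
    print(f"subtle change: {subtle:.1f} dB")
```

Under these assumptions the visible patch lands around 57 dB, well under a threshold of 70, while the 4-pixel change comes out around 85 dB and would count as a duplicate. So both posts can be right depending on how small and how faint the movement is.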
vpupkind is offline   Reply With Quote
Old 6th October 2019, 23:52   #7089  |  Link
Magik Mark
Registered User
 
Join Date: Dec 2014
Posts: 541
Hey Guys,

Are there any new switches that would speed up multi pass encoding?
__________________
Asus X99 Sabertooth - Xeon E5 2695 - Asus Strix GTX 960 4G - DDR4 16GB Predator - Pioneer KRP 600M (isf calibrated) - Yamaha A3030 - Windows 10 x64 - Kodi with DSplayer - Lav - MadVR - XYsubtitle
Magik Mark is offline   Reply With Quote
Old 7th October 2019, 05:45   #7090  |  Link
RanmaCanada
Registered User
 
Join Date: May 2009
Posts: 95
Quote:
Originally Posted by Magik Mark View Post
Hey Guys,

Are there any new switches that would speed up multi pass encoding?
Yes the -Buy Ryzen 3900x switch
RanmaCanada is offline   Reply With Quote
Old 8th October 2019, 20:56   #7091  |  Link
aymanalz
Registered User
 
Join Date: May 2015
Posts: 53
Quote:
Originally Posted by RanmaCanada View Post
Yes the -Buy Ryzen 3900x switch
He has raised a valid and pertinent point. After all these years, x265 is still painfully slow on "normal" processors. You are right that the only way to encode faster is to get faster processors with more cores. I wish that wasn't the case, and that by now the developers could have made it faster. Maybe it is mathematically/programmatically impossible to get any more speed improvements. That's a pity.
aymanalz is offline   Reply With Quote
Old 8th October 2019, 22:36   #7092  |  Link
FranceBB
Broadcast Encoder
 
FranceBB's Avatar
 
Join Date: Nov 2013
Location: Germany
Posts: 650
Quote:
Originally Posted by aymanalz View Post
He has raised a valid and pertinent point. After all these years, x265 is still painfully slow on "normal" processors. You are right that the only way to encode faster is to get faster processors with more cores. I wish that wasn't the case, and that by now the developers could have made it faster. Maybe it is mathematically/programmatically impossible to get any more speed improvements. That's a pity.
It was pretty much the same when there was the shift from MPEG-2 and then Xvid to H.264: the computational complexity was way higher and the old single-core, single-thread CPUs weren't able to cope with the amount of resources required. The fact that H.264 has been the de-facto standard for several years kinda got us used to high encoding speed. As a matter of fact, MPEG-2 encoders like x262 and MPEG-4 ASP encoders (Xvid) were not parallelized at all or were poorly parallelized, while x264 is able to max out a very high number of cores and threads depending on the settings.
One thing it won't use properly, though, is a second CPU: I noticed that in a dual-socket configuration, like a dual Xeon with a high number of cores and threads each, x264 will use only one of the CPUs, thus reducing the speed.
Anyway, x264 is generally so fast by now that it's fine, also because of modern assembly optimizations (hand-written SIMD code) like AVX2 that were not available to the x262 and Xvid encoders.
x265, on the other hand, has been developed with modern hardware in mind: not only does it use modern assembly optimizations (like x264), it is also heavily parallelized, uses both CPUs in a dual-socket environment, and offers some additional settings if your CPU has so many cores that they're not maxed out.

You mentioned the mathematical complexity, and indeed H.264 was based on a Discrete Cosine Transform (which works with real numbers and is continuous in 2phi) and the Hadamard Transform, which is very light and is meant to take care of what the DCT couldn't compress well enough. H.265 is indeed more demanding in terms of computational cost, as it's using the Discrete Cosine Transform and the Discrete Sine Transform, but keep in mind that it could have been even more demanding: years ago, before 2013, there were proposals to use the Karhunen-Loève transform, which is the heaviest transform that I know of, because back in the day it seemed impossible to achieve a 40% reduction compared to H.264 with a linear-algebra-only approach. According to the results, the KLT did do better than the DCT, but the improvements were so small in some cases and the computational cost was so high that they decided not to proceed with that approach, which then led to the modern DCT/DST approach.
If you take a look at the "future", you'll see 8K and H.266 VVC, which inherited the Discrete Cosine Transform and Discrete Sine Transform approach from H.265 HEVC but also uses an adaptive multiple transform (AMT) scheme for residual coding of both inter-coded and intra-coded blocks. This approach basically consists of a set of five DCT- and DST-based transforms, namely DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII, and a Signal Dependent Transform (SDT) competes with the AMT output. The SDT approximates the optimal Karhunen-Loève transform (KLT), which is a signal-dependent transform, by estimating the current signal to code (the transform block) from similar signals (i.e. reference patches) available at the decoder (already coded). This way a lot of computational power is saved by not using the KLT directly, which is far too demanding in terms of computational cost.
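As a small illustration of why the DCT family is the workhorse here, the sketch below runs a textbook orthonormal 1-D DCT-II on a smooth 8-sample ramp and checks how much of the energy lands in the first two coefficients. This is the floating-point definition, not the integer approximation a real codec uses, and the sample values are invented.

```python
import math


def dct2(block):
    """Orthonormal 1-D DCT-II of a sequence of samples (textbook floating-point form)."""
    n = len(block)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block)
        )
        coeffs.append(scale * total)
    return coeffs


def energy(values):
    """Sum of squares, preserved by an orthonormal transform."""
    return sum(v * v for v in values)


ramp = [10, 12, 14, 16, 18, 20, 22, 24]  # made-up smooth row of pixels
coeffs = dct2(ramp)
compaction = energy(coeffs[:2]) / energy(coeffs)

if __name__ == "__main__":
    print([round(c, 2) for c in coeffs])
    print(f"share of energy in first two coefficients: {compaction:.4f}")
```

A KLT would be derived from the statistics of the signal itself, which is where the extra cost comes from; the fixed cosine basis gets most of the compaction on smooth content without that step.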

Anyway, you can be sure of one thing: it will be even more demanding, but, you know, in a world in which we have configurations with Intel Xeon CPUs like this Intel Xeon Platinum 9282 56c/112th, this doesn't seem to be a problem. For instance, I myself encode with an Intel Xeon 28c/56th at work with 64GB of RAM and a Quadro GPU, even though I've asked them several times to upgrade the CPU, as it's been like this ever since 2017 and it's now "old" for what I gotta do on a daily basis.

What do I have at home? Well, a crappy i7 4c/8th with 32 GB DDR4 and an RTX NVIDIA GPU but it doesn't matter since the PC I use at home is for general purpose: browsing (replying to you folks here on Doom9 :P), occasionally watching videos (although I do have my 4K Panasonic Bluray for that), listening to music and studying (I'm in the middle of my master at university while I'm working as encoder for a company).

In a nutshell: computational cost will always get higher and higher but CPUs will get better and better. :P
__________________
Broadcast Encoder
Avisynth memes: 1 - 2 - 3
Videotek - Audacity XP

Last edited by FranceBB; 8th October 2019 at 22:44.
FranceBB is offline   Reply With Quote
Old 9th October 2019, 02:21   #7093  |  Link
RanmaCanada
Registered User
 
Join Date: May 2009
Posts: 95
FranceBB hits the nail on the head. It's also why AV1 is a literal order of magnitude slower than x265. As you get more complex with your codecs and your compression, the processing power required is basically a bell curve. There are ways around this, like the SVT implementations of HEVC and AV1, but they are seriously garbage in comparison to a dedicated CPU encode. In time they might get better, but currently, no.
RanmaCanada is offline   Reply With Quote
Old 9th October 2019, 03:05   #7094  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 104
Quote:
Originally Posted by RanmaCanada View Post
There are ways around this, like the SVT implementations of HEVC and AV1, but they are seriously garbage in comparison to a dedicated CPU encode. In time they might get better, but currently, no.
SVT is purely a CPU encode, there's no GPU, ASIC or other accelerator code in there - just a great parallel scaling framework that seemingly loses no quality as you pile on threads (per the BAV conference), that and oodles of AVX2 and AVX-512 SIMD code.

It's not a question of whether it 'might' get better though, the SVT codecs are owned/controlled by Intel, with Netflix working on it too, so unless they get bored and shelve them, it will continue to get developed because it's a perfect way to show off and benchmark their super-mega-core-a-paloosa CPUs.

The libaom encoder is more of a development platform/reference implementation of AV1 optimised into a working encoder, much like libvpx for VP8/VP9 from Google - I wouldn't ever expect it to reach the speed performance of the other implementations because they are also developing the next gen codec on an experimental branch at the moment.
soresu is offline   Reply With Quote
Old 9th October 2019, 03:07   #7095  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,648
Quote:
Originally Posted by FranceBB View Post
Anyway, you can be sure of one thing: it will be even more demanding, but, you know, in a world in which we have configurations with Intel Xeon CPUs like this Intel Xeon Platinum 9282 56c/112th, this doesn't seem to be a problem.
This is not a real world CPU.
It's practically non-existent, it doesn't have a listed price and it has probably never been sold to anyone. Only rumors.
Just for papers.
Quote:
Originally Posted by FranceBB View Post
For instance, I myself encode with an Intel Xeon 28c/56th at work with 64GB of RAM and a Quadro GPU even though I've been asking them several times to upgrade the CPU as it's been like this ever since 2017 and it's now "old" for what I gotta do on a daily basis.
Since your needs are that high, you should convince your boss at work to buy some serious processing power with half the money.
Try a 64C/128T EPYC CPU and you will get double processing power with half the money of the Xeon 28C/56T
Yes, it's that simple.
__________________
Win 10 x64 (18362.388) - Core i3-9100F - nVidia 1660 (436.15)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
NikosD is offline   Reply With Quote
Old 9th October 2019, 17:55   #7096  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 615
Quote:
Originally Posted by aymanalz View Post
He has raised a valid and pertinent point. After all these years, x265 is still painfully slow on "normal" processors. You are right that the only way to encode faster is to get faster processors with more cores. I wish that wasn't the case, and that by now the developers could have made it faster. Maybe it is mathematically/programmatically impossible to get any more speed improvements. That's a pity.
Things designed for future are supposed to be used with future technologies.

May I ask what a "normal" processor is? When x264 was released, I was among the pioneers using x264 as a daily driver. What was a normal processor back then? An Athlon 64 4000+ with 1 core 1 thread at 2.4GHz was probably a HEDT(?) processor. A Sempron 2400+ was probably a fairly normal processor with 1c1t at 1.66GHz. Does the latest x264 run faster on a Sempron 2400+? Probably not.

Within 5 years after that, at around 2010, we got the Phenom II X6 1055T at a reasonable price, with 6c6t at 2.8GHz, which is about 10x as fast as an Athlon 64 4000+. You used to get 3 fps from x264, now it's 30 fps, which sounds very reasonable.

Now regarding x265, it was released at around 2013, the year Haswell was released. The i7-4770K comes with 4c8t at 3.5GHz. Within 5 years, what do we get? A Core i9-9900K that's 8c16t at 4GHz, if you can afford that. From the PassMark score it's barely 2x the performance of a 4770K, and even if you take AVX2 into consideration it's not gonna be 3x or 4x the performance. You used to get 3 fps, now it's 9 fps, which is still slow.

So, blame the CPU manufacturers, not developers.

Also you probably made an assumption that the code can be optimized by a good portion. That's not very true if you are talking about the versions after full AVX2 optimization was done. Yes, before they did that, it was a bit slow because the full advantage of a modern CPU was not being used. For now, the AVX2 code (and AVX512 code as we speak) has little room for further optimization. We may be able to squeeze out a little time by applying some early-exit and skipping algorithm optimizations. Again, marginal difference.

HEVC is designed for the (near) future, with the ability to reduce processor demands by downgrading the parameters. If you want something fast, use a lower setting. Encoding at a high setting with every feature enabled is supposed, and intended, to be slow even on a high-end processor, let alone a "normal" one.

Hope this helps.
MeteorRain is online now   Reply With Quote
Old 10th October 2019, 10:30   #7097  |  Link
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 108
Quote:
Originally Posted by FranceBB View Post
One thing it won't use properly, though, is a second CPU: I noticed that in a dual-socket configuration, like a dual Xeon with a high number of cores and threads each, x264 will use only one of the CPUs, thus reducing the speed.
Anyway, x264 is generally so fast by now that it's fine, also because of modern assembly optimizations (hand-written SIMD code) like AVX2 that were not available to the x262 and Xvid encoders.
x265, on the other hand, has been developed with modern hardware in mind: not only does it use modern assembly optimizations (like x264), it is also heavily parallelized, uses both CPUs in a dual-socket environment, and offers some additional settings if your CPU has so many cores that they're not maxed out.
I have not seen any multi-socket issues with x264. I think it handles it fine, maybe not as well as x265, but it can use the second socket if needed. You will see more load on one socket, but isn't that the logical way? Most of the time you won't saturate all threads in a multi-socket system, and it will of course prioritize one socket to minimize cross-socket communication.

And tbh, I don't think x265 is better parallelized than x264; at default settings it's actually worse for resolutions under 4K because of the large CU size. And both x264 and x265 have a hard time scaling beyond 24-ish threads for 1080p at slower settings. Above that, clock speed should be prioritized over thread count if you're not doing chunk encoding. This is by no means a criticism of x264 or x265; the parallelization and thread scaling are already very impressive for this task, and I don't think we can expect them to get much better without sacrificing something else. To increase speed and to utilize render farms, chunk encoding will be the way forward. And for live/realtime content, hw encoding will be doing the job.
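For what chunk encoding can look like in practice, here is a rough sketch that splits a source into fixed-size frame ranges and builds one x265 command line per chunk using the `--seek` and `--frames` options. The file names and chunk size are placeholders, the helper names are hypothetical, and a real pipeline would rather cut at scene changes and then concatenate or mux the resulting streams.

```python
def chunk_ranges(total_frames, chunk_size):
    """Split [0, total_frames) into (start_frame, frame_count) pairs."""
    if total_frames < 0 or chunk_size <= 0:
        raise ValueError("total_frames must be >= 0 and chunk_size > 0")
    return [
        (start, min(chunk_size, total_frames - start))
        for start in range(0, total_frames, chunk_size)
    ]


def x265_chunk_commands(source, total_frames, chunk_size, extra_args=()):
    """Build one argv list per chunk; nothing is executed here."""
    commands = []
    for index, (start, count) in enumerate(chunk_ranges(total_frames, chunk_size)):
        commands.append([
            "x265",
            "--input", source,
            "--seek", str(start),      # first frame of this chunk
            "--frames", str(count),    # number of frames to encode
            *extra_args,
            "--output", f"chunk_{index:04d}.hevc",  # placeholder naming scheme
        ])
    return commands


if __name__ == "__main__":
    for argv in x265_chunk_commands("source.y4m", 1000, 300, ("--preset", "slow")):
        print(" ".join(argv))
```

Each command can then go to a separate machine or socket, which sidesteps the thread-scaling limit because every instance only needs to keep its own cores busy.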

Quote:
Originally Posted by MeteorRain View Post
Now regarding x265, it was released at around 2013, the year Haswell was released. The i7-4770K comes with 4c8t at 3.5GHz. Within 5 years, what do we get? A Core i9-9900K that's 8c16t at 4GHz, if you can afford that. From the PassMark score it's barely 2x the performance of a 4770K, and even if you take AVX2 into consideration it's not gonna be 3x or 4x the performance. You used to get 3 fps, now it's 9 fps, which is still slow.
The 4770K has AVX2 too; if I'm not mistaken it was introduced with Haswell. I own a 4790K and have some experience with the 9900, and I would say there is about a 2.5x performance difference for x265.

Last edited by excellentswordfight; 10th October 2019 at 12:29.
excellentswordfight is offline   Reply With Quote
Old 10th October 2019, 12:18   #7098  |  Link
aymanalz
Registered User
 
Join Date: May 2015
Posts: 53
Quote:
Originally Posted by MeteorRain View Post
Things designed for future are supposed to be used with future technologies.

May I ask what a "normal" processor is? When x264 was released, I was among the pioneers using x264 as a daily driver. What was a normal processor back then? An Athlon 64 4000+ with 1 core 1 thread at 2.4GHz was probably a HEDT(?) processor. A Sempron 2400+ was probably a fairly normal processor with 1c1t at 1.66GHz. Does the latest x264 run faster on a Sempron 2400+? Probably not.

Within 5 years after that, at around 2010, we got the Phenom II X6 1055T at a reasonable price, with 6c6t at 2.8GHz, which is about 10x as fast as an Athlon 64 4000+. You used to get 3 fps from x264, now it's 30 fps, which sounds very reasonable.

Now regarding x265, it was released at around 2013, the year Haswell was released. The i7-4770K comes with 4c8t at 3.5GHz. Within 5 years, what do we get? A Core i9-9900K that's 8c16t at 4GHz, if you can afford that. From the PassMark score it's barely 2x the performance of a 4770K, and even if you take AVX2 into consideration it's not gonna be 3x or 4x the performance. You used to get 3 fps, now it's 9 fps, which is still slow.

So, blame the CPU manufacturers, not developers.

Also you probably made an assumption that code can be optimized by a good portion.
...
But HEVC is no longer for the future, is it? Successors to HEVC are in development, so HEVC is the present.

The part about how processors haven't improved much is absolutely right, and I had that in mind as well. From the year 2000 to 2010, processors (low end, mid grade, high end, everything) became several times faster, and sold for the same or even lower prices, probably due to the Intel-AMD competition. But incremental improvements have been a lot less in this decade, especially after Sandybridge or Haswell.

I could blame the CPU manufacturers, or blame the x265 developers for not anticipating that processing power per price will not keep increasing at the rate it used to, but I'm not really trying to assign blame; just making an observation that x265 is extremely slow on "normal" processors - by which I meant a reasonable home desktop without a gazillion cores and threads. (I'd say a quad core i7 or hexacore is the mainstream now.)

I wasn't assuming that code can be optimized further - I was wondering out loud whether it could. I was lamenting that perhaps they have reached a point where further optimizations for speed just aren't possible - in which case, only professional encoders or studios with 16+ core machines can use it at decent speeds. Not the casual home users.
aymanalz is offline   Reply With Quote
Old 10th October 2019, 12:34   #7099  |  Link
LigH
German doom9/Gleitz SuMo
 
LigH's Avatar
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 5,932
CPUs are not developed with the one and only purpose to encode video.

If you want video encoding at top speed, use a dedicated video encoder chip... but as always, there are the usual conflicts between speed, accuracy/complexity, and other factors: "You can't have them all at maximum at the same time."
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
LigH is offline   Reply With Quote
Old 10th October 2019, 15:14   #7100  |  Link
Rousseau
Registered User
 
Join Date: Jun 2017
Posts: 3
On UHD rips made with 3.2, the image looks darker when played in MPV than on rips made with 3.1. They look the same in MPC with no tone mapping. I made no change other than the encoder.
Rousseau is offline   Reply With Quote