Old 15th October 2018, 04:51   #6421  |  Link
StvG
Registered User
 
Join Date: Jul 2018
Posts: 65
Tested binaries downloaded from here.
input.mkv - hevc (Main 10), yuv420p10le(tv), 3840x1606
AVX2 clock speed = AVX512 clock speed

Code:
ffmpeg -i input.mkv -f yuv4mpegpipe -strict -1 - | .\resources\x265-10b.exe --y4m - --ctu 32 -o .\OUTPUT.mkv

x265 [info]: HEVC encoder version 2.8+74-fd517ae68f93
x265 [info]: build info [Windows][GCC 8.2.0][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

encoded 498 frames in 51.34s (9.70 fps), 5044.61 kb/s, Avg QP:31.42

x265 [info]: HEVC encoder version 2.8+74-fd517ae68f93
x265 [info]: build info [Windows][MSVC 1915][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

encoded 498 frames in 51.80s (9.61 fps), 5044.61 kb/s, Avg QP:31.42
Code:
ffmpeg -i input.mkv -f yuv4mpegpipe -strict -1 - | .\resources\x265-10b.exe --y4m - --ctu 32 -o .\OUTPUT.mkv

x265 [info]: HEVC encoder version 2.9+2-7e978ed93d60
x265 [info]: build info [Windows][GCC 8.2.0][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

encoded 498 frames in 52.82s (9.43 fps), 5044.61 kb/s, Avg QP:31.42

x265 [info]: HEVC encoder version 2.9+2-7e978ed93d60
x265 [info]: build info [Windows][MSVC 1915][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

encoded 498 frames in 51.55s (9.66 fps), 5044.61 kb/s, Avg QP:31.42
Code:
ffmpeg -i input.mkv -f yuv4mpegpipe -strict -1 - | .\resources\x265-10b.exe --y4m - --ctu 32 -o .\OUTPUT.mkv

VS 2017 Generic compilation ("none")

encoded 498 frames in 51.49s (9.67 fps), 5044.61 kb/s, Avg QP:31.42

VS 2017 AVX2 compilation ("AVX2")

encoded 498 frames in 52.27s (9.53 fps), 5044.61 kb/s, Avg QP:31.42
Code:
ffmpeg -i input.mkv -f yuv4mpegpipe -strict -1 - | .\resources\x265-10b.exe --y4m - --ctu 32 (--asm avx512) -o .\OUTPUT.mkv

x265 [info]: HEVC encoder version 2.9+2-7e978ed93d60
x265 [info]: build info [Windows][MSVC 1915][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

encoded 498 frames in 52.05s (9.57 fps), 5044.61 kb/s, Avg QP:31.42

x265 [info]: HEVC encoder version 2.9+2-7e978ed93d60
x265 [info]: build info [Windows][MSVC 1915][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512

encoded 498 frames in 50.79s (9.80 fps), 5044.61 kb/s, Avg QP:31.42
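
Summary lines like the ones above can be parsed instead of compared by eye. The following is a minimal Python sketch, assuming the summary format shown in these logs; the names `parse_summary` and `relative_speedup` are made up for illustration and are not part of x265.

```python
import re

# Matches the end-of-encode summary line as it appears in the logs above.
SUMMARY_RE = re.compile(
    r"encoded (?P<frames>\d+) frames in (?P<seconds>[\d.]+)s "
    r"\((?P<fps>[\d.]+) fps\), (?P<kbps>[\d.]+) kb/s, Avg QP:(?P<qp>[\d.]+)"
)

def parse_summary(line):
    """Return frames, seconds, fps, kbps and qp from one summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not an x265 summary line: %r" % line)
    return {
        "frames": int(m.group("frames")),
        "seconds": float(m.group("seconds")),
        "fps": float(m.group("fps")),
        "kbps": float(m.group("kbps")),
        "qp": float(m.group("qp")),
    }

def relative_speedup(baseline, candidate):
    """Fractional speed gain of candidate over baseline, from wall-clock time."""
    return baseline["seconds"] / candidate["seconds"] - 1.0

# The two MSVC 2.9+2 runs from the last block above (without and with AVX512).
avx2 = parse_summary("encoded 498 frames in 52.05s (9.57 fps), 5044.61 kb/s, Avg QP:31.42")
avx512 = parse_summary("encoded 498 frames in 50.79s (9.80 fps), 5044.61 kb/s, Avg QP:31.42")

print("AVX512 vs AVX2: %+.1f%%" % (100 * relative_speedup(avx2, avx512)))
print("same bitrate and QP:", (avx2["kbps"], avx2["qp"]) == (avx512["kbps"], avx512["qp"]))
```

For these two runs the gain works out to roughly 2.5%, which is about the size of the spread between the other runs listed here (51.34s to 52.82s), so a single 498-frame run is weak evidence either way.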
Old 15th October 2018, 07:21   #6422  |  Link
DotJun
Registered User
 
Join Date: Aug 2014
Posts: 17
I tried a short test clip with avx512 enabled and disabled on a 4K source using the slower preset. FPS went up to 1.37 from 0.84 when I enabled 512.

Encoded clip looks good, no obvious errors that is. File size is roughly the same, but clip length and crf might have something to do with the tiny difference between the two.

64bit x265 on an intel 7820x. Temps are roughly equal to when 512 is disabled. Load is mostly at 100% on all cores with the occasional dip down to 87% every minute or so.
Old 15th October 2018, 07:30   #6423  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 5,851
So it appears to be efficient on your specific CPU model.
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 15th October 2018, 12:16   #6424  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 6,856
Quote:
Originally Posted by DotJun View Post
I tried a short test clip with avx512 enabled and disabled on a 4K source using the slower preset. FPS went up to 1.37 from 0.84 when I enabled 512.

Encoded clip looks good, no obvious errors that is. File size is roughly the same, but clip length and crf might have something to do with the tiny difference between the two.

64bit x265 on an intel 7820x. Temps are roughly equal to when 512 is disabled. Load is mostly at 100% on all cores with the occasional dip down to 87% every minute or so.
You should encode a whole movie (130k frames) instead of an ultra-short clip of a few hundred frames.
The longer you encode, the more heat your CPU will produce, and hence a more aggressive AVX negative offset will be activated.
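
To put the numbers in scale, here is a rough Python estimate of what a 130k-frame encode would take at the two short-clip speeds quoted above. It assumes the frame rate stays constant for the whole encode, which is exactly the assumption being questioned here; `encode_hours` is an illustrative helper, not an x265 feature.

```python
def encode_hours(frames, fps):
    """Wall-clock hours needed to encode `frames` at a constant `fps`."""
    return frames / fps / 3600.0

FRAMES = 130_000      # "whole movie" figure from the post above
FPS_AVX2 = 0.84       # DotJun's short-clip result, AVX512 disabled
FPS_AVX512 = 1.37     # DotJun's short-clip result, AVX512 enabled

print("speedup on the short clip: %.2fx" % (FPS_AVX512 / FPS_AVX2))
print("full movie without AVX512: %.1f h" % encode_hours(FRAMES, FPS_AVX2))
print("full movie with AVX512:    %.1f h" % encode_hours(FRAMES, FPS_AVX512))
```

That is about 43.0 h versus 26.4 h if the short-clip rates held. Any sustained clock reduction during a long run would shrink that gap, which is why the long test matters.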
Old 15th October 2018, 12:32   #6425  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,542
Has anybody made any recent tests with different CTU and TU sizes? I made a quick test yesterday on a 720p encode, and max CTU (and TU) 16 turned out to produce the smallest file but also looked best compared to the original frame. At CTU max 64, the frame was clearly more blurry in places where there were more small details such as hair etc.
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
Old 15th October 2018, 14:16   #6426  |  Link
RieGo
Registered User
 
Join Date: Nov 2009
Posts: 57
Quote:
Originally Posted by DotJun View Post
I tried a short test clip with avx512 enabled and disabled on a 4K source using the slower preset. FPS went up to 1.37 from 0.84 when I enabled 512.

Encoded clip looks good, no obvious errors that is. File size is roughly the same, but clip length and crf might have something to do with the tiny difference between the two.
AFAIK, encoding with AVX512 enabled and disabled should produce exactly the same output. At least in my tests it did. Can you share your command line?
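
One way to check this is to hash the two output files. The sketch below is a minimal Python example under the assumption that both encodes were written as raw bitstreams; the file names are hypothetical. A muxed container such as .mkv may carry metadata that differs between runs even when the video stream is the same, so a hash mismatch there is not conclusive.

```python
import hashlib

def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large encodes need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_output(path_a, path_b):
    """True if both files are byte-for-byte identical."""
    return file_sha256(path_a) == file_sha256(path_b)

if __name__ == "__main__":
    import sys
    # Hypothetical usage: python compare.py avx2.hevc avx512.hevc
    if len(sys.argv) == 3:
        print("identical" if same_output(sys.argv[1], sys.argv[2]) else "different")
```

If the hashes differ, comparing decoded frames would be the next step before concluding that the AVX512 code path changes the result.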
Old 15th October 2018, 19:01   #6427  |  Link
Clare
Registered User
 
Join Date: Apr 2016
Posts: 60
Pushing Encoding Quality and Speed with x265
Massively Parallel Encoding

from Mile-High Video Workshop videos http://mile-high.video/files/mhv2018/
Old 15th October 2018, 19:27   #6428  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by Forteen88 View Post
x265 2.9+2 released now!
http://www.msystem.waw.pl/x265/
Anyone know what the new patch actually does:

rc: Fix rowStat computation in const-vbv

It looks like it might fix a serious issue in a given RC mode, but it isn't actually self-documenting.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 16th October 2018, 06:32   #6429  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 568
It seems to me that these aren't the only x265 bugs.
Github is still at 2.8. There will probably be some fixes to implement version 2.9.
Old 16th October 2018, 06:43   #6430  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,542
Quote:
Originally Posted by Boulder View Post
Has anybody made any recent tests with different CTU and TU sizes? I made a quick test yesterday on a 720p encode, and max CTU (and TU) 16 turned out to produce the smallest file but also looked best compared to the original frame. At CTU max 64, the frame was clearly more blurry in places where there were more small details such as hair etc.
Tests with 4K and 1080p encodes also showed the same behaviour. It's quite strange as it's said that the big thing in HEVC is that it can use larger CTUs than AVC to increase efficiency and that the bigger the CTU is, the less bitrate should be required. It doesn't seem to be like this at least in CRF mode. Too bad it's not allowed to use 16x16 CTUs with 1080p or 4K encodes if they are to be compliant.
Old 16th October 2018, 07:57   #6431  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 5,851
Quote:
Originally Posted by Jamaika View Post
Github is still at 2.8. There will probably be some fixes to implement version 2.9.
Bitbucket recently provided tag v2.9; but the "tip" is currently in the "stable" branch, not in "default".

Still, there seems to be a lack of communication recently. I have already reported GCC 8.x compiler warnings twice, and nobody has replied so far.

Last edited by LigH; 16th October 2018 at 07:59.
Old 16th October 2018, 09:22   #6432  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 78
Quote:
Originally Posted by LigH View Post
Still, there seems to be a lack of communication recently. I have already reported GCC 8.x compiler warnings twice, and nobody has replied so far.
Don't get me wrong, I have nothing but the conspiracy theories in my head, BUT, seeing as MulticoreWare sells en/decoding products, I wouldn't be surprised to see some announcement around the turn of the year about their work on some of the new codecs, be it AV1 or VVC.
But that's probably just me wishfully hoping for a sane AV1 encoder with proper performance, multithreading, profiles and documentation...
Old 16th October 2018, 16:10   #6433  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by Boulder View Post
Tests with 4K and 1080p encodes also showed the same behaviour. It's quite strange as it's said that the big thing in HEVC is that it can use larger CTUs than AVC to increase efficiency and that the bigger the CTU is, the less bitrate should be required. It doesn't seem to be like this at least in CRF mode. Too bad it's not allowed to use 16x16 CTUs with 1080p or 4K encodes if they are to be compliant.
It would be helpful to see the command line used, or at least bitrate/CRF.

Larger CTUs are definitely helpful at lower bitrates. Comparing at high perceptual quality can bring in other subtler differences between modes.

I've even been able to see a slight improvement at sub-SD resolutions using CTU 64 at very low bitrates.
Old 16th October 2018, 17:02   #6434  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,542
Source was 4K, downsampled to 1080p. These are the basic parameters that I've set:
Code:
--input-depth 16
--dither
--profile main10
--min-keyint 5
--keyint 480
--merange 44
--splitrd-skip
--preset veryslow
--rc-lookahead 60
--deblock -2:-2
--no-strong-intra-smoothing
--no-sao
--qcomp 0.8
--aq-mode 3
--aq-strength 0.8
--ctu 64
--max-tu-size 32
--rdpenalty 1
--qg-size 16
--tu-inter-depth 4
--tu-intra-depth 4
--limit-tu 4
--limit-refs 3
--max-merge 2
--rd-refine
--ref 6
--bframes 10
--crf 19
I've then also tested ctu 32 with max-tu-size 32 or 16.
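
The three variants can be generated from one shared option list so that only --ctu and --max-tu-size differ between runs. This is a minimal Python sketch: the option values are copied from the list above, while the output file names and the `build_args` helper are hypothetical, and the input argument is left out because the post does not say how the source is fed to the encoder.

```python
# Options shared by every run, copied from the list above.
BASE_OPTIONS = [
    "--input-depth", "16", "--dither", "--profile", "main10",
    "--min-keyint", "5", "--keyint", "480", "--merange", "44",
    "--splitrd-skip", "--preset", "veryslow", "--rc-lookahead", "60",
    "--deblock", "-2:-2", "--no-strong-intra-smoothing", "--no-sao",
    "--qcomp", "0.8", "--aq-mode", "3", "--aq-strength", "0.8",
    "--rdpenalty", "1", "--qg-size", "16",
    "--tu-inter-depth", "4", "--tu-intra-depth", "4", "--limit-tu", "4",
    "--limit-refs", "3", "--max-merge", "2", "--rd-refine",
    "--ref", "6", "--bframes", "10",
]

# (ctu, max-tu-size) pairs tested in the post.
VARIANTS = [(64, 32), (32, 32), (32, 16)]

def build_args(ctu, max_tu, crf=19):
    """Full option list for one variant; only CTU and TU size change."""
    return BASE_OPTIONS + [
        "--ctu", str(ctu), "--max-tu-size", str(max_tu), "--crf", str(crf),
    ]

for ctu, max_tu in VARIANTS:
    out_name = "ctu%d_tu%d.hevc" % (ctu, max_tu)  # hypothetical file name
    print(" ".join(["x265"] + build_args(ctu, max_tu) + ["-o", out_name]))
```

Keeping everything else in one list makes it harder for an unrelated setting to drift between runs and contaminate the comparison.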

In the 1080p encode, the bitrate is as follows:

CTU 64 / TU 32 - 15078 kbps
CTU 32 / TU 32 - 14608 kbps
CTU 32 / TU 16 - 14472 kbps
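
Expressed relative to the CTU 64 / TU 32 run, the differences are a few percent. A small Python sketch using only the three figures above:

```python
bitrates_kbps = {
    "CTU 64 / TU 32": 15078,
    "CTU 32 / TU 32": 14608,
    "CTU 32 / TU 16": 14472,
}

def percent_change(reference, value):
    """Signed change of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference

reference = bitrates_kbps["CTU 64 / TU 32"]
for name, kbps in bitrates_kbps.items():
    print("%s: %d kbps (%+.1f%%)" % (name, kbps, percent_change(reference, kbps)))
```

This gives about -3.1% for CTU 32 / TU 32 and -4.0% for CTU 32 / TU 16. At a fixed CRF both bitrate and quality move, so the bitrate change alone does not show which setting is more efficient.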

Of these, CTU 32 / TU 32 resembles the original the most. It's interesting that setting TU 16 also causes distortion in the same areas as CTU 64 / TU 32. I checked areas like eyes, hair etc. which have easily some sort of distortion because there are many fine lines and things that can be compared quite easily.

I've just started testing what CRF value is visually enough for 1080p so the final bitrate will probably be lower than what I got from my tests. I'd estimate CRF 20-21 would be the final value.
Old 16th October 2018, 17:11   #6435  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by Boulder View Post
Source was 4K, downsampled to 1080p. These are the basic parameters that I've set...

I've then also tested ctu 32 with max-tu-size 32 or 16.

In the 1080p encode, the bitrate is as follows:

CTU 64 / TU 32 - 15078 kbps
CTU 32 / TU 32 - 14608 kbps
CTU 32 / TU 16 - 14472 kbps

Of these, CTU 32 / TU 32 resembles the original the most. It's interesting that setting TU 16 also causes distortion in the same areas as CTU 64 / TU 32. I checked areas like eyes, hair etc. which have easily some sort of distortion because there are many fine lines and things that can be compared quite easily.

I've just started testing what CRF value is visually enough for 1080p so the final bitrate will probably be lower than what I got from my tests. I'd estimate CRF 20-21 would be the final value.
Target CRF will also depend on other parameters, if you are still adjusting those.

Overall, that is a very idiosyncratic set of options. Nothing looks wrong (and I'd love to hear how you picked some of those!), but it's definitely outside of any combination of settings that MCW would be doing psychovisual optimization for.


I'm curious about why you chose these particular settings:
  • --merange 44
  • --splitrd-skip
  • --max-merge 2
  • --deblock -2:-2
  • --rdpenalty 1
  • --qg-size 16
  • --bframes 10
Is this for encoding anime or some other kind of synthetic or mixed synthetic/natural image encoding?
Old 16th October 2018, 17:28   #6436  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,542
My sources are just regular movies or TV series. I'll tune CRF as the last item once I've got all the rest in place.

--merange 44, basically lowering it from the default 57 which I understand is meant for 4K. For 720p, I've used 38 all the time. I think this is a remnant of littlepox's set of "tune film" parameters.
--splitrd-skip, according to my old notes, I didn't find it to cause any ill effects. Do you have any specific information why it's a "bad idea"?
--max-merge 2, values 3-4 tested and it caused blur. One of the things in x265 that is different from x264 - the slower presets don't mean similar quality at lower final bitrate
--deblock -2:-2, no need for intense deblocking according to my tests
--rdpenalty 1, trying to favour smaller blocks.
--qg-size 16, tested values from 64 to 8, 16 looked best (in terms of distortion in small details again) when compared frame-by-frame. Tested with a 720p encode, so I'll need to check that also with 1080p later.
--bframes 10, some video utilizes a lot of B-frames for some reason. Not a big slowdown so I've kept it at that all the time.
Old 16th October 2018, 17:52   #6437  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by Boulder View Post
My sources are just regular movies or TV series. I'll tune CRF as the last item once I've got all the rest in place.

--merange 44, basically lowering it from the default 57 which I understand is meant for 4K. For 720p, I've used 38 all the time. I think this is a remnant of littlepox's set of "tune film" parameters.
It's not THAT frame size dependent, but 44 is probably fine.
Quote:
--splitrd-skip, according to my old notes, I didn't find it to cause any ill effects. Do you have any specific information why it's a "bad idea"?
No reason it would be a bad idea; I just haven't seen it used before. Generally parameters that have a reliable quality/speed tradeoff are in a preset. But it isn't listed as experimental...

Anyone from MultiCoreWare care to weigh in? What's the tradeoff? Is this something that should get added to the faster presets, or default to On but off in --preset placebo?
Quote:
--max-merge 2, values 3-4 tested and it caused blur. One of the things in x265 that is different from x264 - the slower presets don't mean similar quality at lower final bitrate
Yeah, with the many more tools available in HEVC, the differences between presets are greater. Odd that it causes blur; this should be a speed/quality tradeoff. You should file an issue here with repro details: https://bitbucket.org/multicoreware/...ew&status=open
Quote:
--deblock -2:-2, no need for intense deblocking according to my tests
Well, to improve compression efficiency. Did you see any issues with using 0:0?
Quote:
--rdpenalty 1, trying to favour smaller blocks.
Have you seen any experimental validation of it helping in x265 2.4+? I found places where it was helpful in older versions, but not recently.
Quote:
--qg-size 16, tested values from 64 to 8, 16 looked best (in terms of distortion in small details again) when compared frame-by-frame. Tested with a 720p encode, so I'll need to check that also with 1080p later.
I would expect it would look better, but at some cost to efficiency due to the signaling overhead. --opt-cu-delta-qp might help that some. What sorts of bitrates are you targeting/getting?
Quote:
--bframes 10, some video utilizes a lot of B-frames for some reason. Not a big slowdown so I've kept it at that all the time.
Why 10 specifically? I'd use either 8 (most tested, as it's in the slower+ presets) or 16 (maximum allowed).
Old 16th October 2018, 18:09   #6438  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,542
I think the general problem is that the presets and tunings are really old. I'm quite sure things are very much different now compared to what they were when most of the parameters and settings were devised. Also, we are still missing the most common tuning, which is --tune film. I would very much like to see what the creators of the encoder think of as proper settings, as they should know the internal workings best.

I recall someone else also complaining about max-merge smoothing things if the value is too big. I'll need to retest and file a bug report if it's still reproducible. It's been some time since I tested it.

Will also retest deblock 0:0. When I tested it earlier, I did find that it smoothed things slightly, so I went one notch down from the standard -1:-1 of x264's --tune film.

--rdpenalty 1 is something I have not tested recently.

--qg-size 16 does cause the bitrate to jump compared to 32 or 64, but it's worth it in my opinion.
I had tested it with a 720p encode with quite a big filesize difference:
qg-size 64 : 3147,41 kbps
qg-size 32 : 3648,01 kbps
qg-size 16 : 3954,23 kbps
It is something I need to separately retest for 1080p. The bitrates fluctuate a lot, I'd say between 2.5 - 6 Mbps for 720p encodes. I have no specific target so I just use CRF 19.
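
The size of each step is easier to see as a percentage. A minimal Python sketch using the three 720p figures above, which are written with decimal commas; `parse_kbps` is an illustrative helper:

```python
def parse_kbps(text):
    """Parse a bitrate written with a decimal comma, e.g. '3147,41'."""
    return float(text.replace(",", "."))

# qg-size -> bitrate string exactly as posted above
results = {64: "3147,41", 32: "3648,01", 16: "3954,23"}

baseline = parse_kbps(results[64])
previous = None
for qg in sorted(results, reverse=True):
    kbps = parse_kbps(results[qg])
    line = "qg-size %2d: %.2f kbps (%+.1f%% vs 64)" % (qg, kbps, 100 * (kbps / baseline - 1))
    if previous is not None:
        line += ", %+.1f%% vs previous step" % (100 * (kbps / previous - 1))
    print(line)
    previous = kbps
```

Going from 64 to 32 costs about 15.9%, and 32 to 16 a further 8.4% or so, for roughly 25.6% in total on this one encode.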

10 B-frames because for some quite noisy sources, I noticed that 8 consecutive B-frames were being used 5-10% of the time. Setting ten usually meant that the longest sequence was used 1-2% of the time.

Last edited by Boulder; 16th October 2018 at 18:11.
Old 16th October 2018, 20:46   #6439  |  Link
jlpsvk
Registered User
 
Join Date: Dec 2014
Posts: 187
Quote:
Originally Posted by DotJun View Post
Is there a downside to enabling avx512 on an intel X chip?
Heat.
__________________
Core i9-7960X, 64GB DDR4, RTX 2070, 1TB NVMe SSD, 56TB NAS
Old 17th October 2018, 02:09   #6440  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by jlpsvk View Post
Heat.
Would it be hotter? Or would it just be slower due to thermal throttling?

I wouldn't be surprised if Intel's next big microarchitecture revision makes AVX512 useful in cases where it isn't today. We saw the same thing with Skylake and AVX2.