Old 24th July 2022, 16:13   #8561  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,277
Yup, using MABS too when building Windows tools for Hybrid.
__________________
Hybrid here in the forum, homepage
Old 27th July 2022, 16:22   #8562  |  Link
BuccoBruce
Registered User
 
Join Date: Apr 2022
Posts: 28
Quote:
Originally Posted by benwaggoner View Post
If anyone can give me an overview of the best way to go from a source directory to a Windows 64-bit binary, I'll owe you one! I don't mind going the GCC route if that's easier.
Quote:
Originally Posted by Boulder View Post
I don't know if it helps, but I just cloned the repo and edited the make-solutions.bat in the build\vc15-x86_64 folder to this:

cmake -G "Visual Studio 16 2019" ..\..\source && cmake-gui ..\..\source

Then I ran make-solutions in the console and it proceeded as described in the wiki. You might need to point to the path where NASM is and enable assembly in the configure part (assembly is disabled if CMake cannot find NASM). The resulting .sln file can be opened in Visual Studio 2019.
Just a note about MABS, it will build with GCC and purely with GCC, unless you tell it to build with clang. Not an issue for most, and probably preferred if you're going to be using the libraries to link against anything else built with GCC. I still prefer it for building non-free ffmpeg (with ffmpeg's AAC, MP2, MP3, opus, and vorbis implementations disabled outright) to make an MPV build that can handle USAC and that de/encodes opus using libopus. Trying to build all those requirements separately with MSVC myself would take a few years off my life, and take forever.

For what it's worth, I've found that on some machines VS builds of x265.exe (and x264, aomenc, SVT-AV1, to name a few) perform a bit better, though usually only negligibly. The gain does seem to add up on Intel CPUs without AVX2, though. VS also lets you use profile-guided optimization (PGO) if you choose to and, in my case, disable things like Spectre slowdow...I mean mitigations, but only because I don't know how to pass that through to GCC.

I can confirm Boulder's edit worked for compiling under VS2019. I add -A x64 out of habit, so cmake -G "Visual Studio 16 2019" -A x64 ..\..\source && cmake-gui ..\..\source. You would presumably edit it to read "Visual Studio 17 2022" if you're using 2022.

Make sure you start a "x64 Native Tools Command Prompt for VS 2019" to run things from, and make sure NASM is in your PATH or you'll end up with no optimized assembly code.
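
For anyone who wants to script the configure step above, here is a minimal Python sketch. The cmake arguments are the ones from the post; the function names are made up for illustration, and it assumes you run it from the build\vc15-x86_64 folder inside the x64 Native Tools prompt.

```python
import shutil
import subprocess


def cmake_configure_command(generator="Visual Studio 16 2019",
                            arch="x64",
                            source_dir=r"..\..\source"):
    """Return the cmake configure command from the post as an argument list."""
    return ["cmake", "-G", generator, "-A", arch, source_dir]


def nasm_available():
    """Return True if a nasm executable can be found on PATH."""
    return shutil.which("nasm") is not None


if __name__ == "__main__":
    if not nasm_available():
        # Without NASM, CMake disables the optimized assembly code.
        print("warning: nasm not found on PATH")
    # Only print the command; pass the list to subprocess.run() to execute it.
    print(subprocess.list2cmdline(cmake_configure_command()))
```

For VS 2022 you would presumably call cmake_configure_command(generator="Visual Studio 17 2022").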
Old 27th July 2022, 16:49   #8563  |  Link
BuccoBruce
Registered User
 
Join Date: Apr 2022
Posts: 28
Microsoft y u do dis?

TL;DR Is there some magic bullet for muxing an HEVC elementary stream with Open GOP using mp4box and getting Media Foundation to decode it properly?

I'm running into some weird issues with Open GOP HEVC + Media Foundation decoding in an MP4 container. Muxing with ffmpeg -movflags faststart+negative_cts_offsets seems to work fine most of the time. It complains about a lack of timestamps in the raw .hevc stream and outputs VFR, so -i has to be preceded with e.g. -r 60000/1001 to get around that, and I have to add -loglevel error -stats or it will quickly fill the console, ad infinitum, with:
Quote:
[mp4 @ 000001da260000c0] Timestamps are unset in a packet for stream 0.
This is deprecated and will stop working in the future.
Fix your code to set the timestamps properly
[mp4 @ 000001da260000c0] pts has no valueB time=00:00:00.00 bitrate=N/A speed= 0x
Last message repeated 103 times
What I've tested so far:
  • 5760x2880, 5408x2704, 4800x2400, 4096x2048, 3840x1920, 3000x1500
  • Level 6 Main at the max, lower for smaller resolutions
  • Issue persists even with L5/Main 3000x1500 50 fps video, or 2160x2160 30 fps
  • 8/10 bit doesn't matter
  • GOP length doesn't seem to matter, tried 60/600 (10 second rule), 30/300 (half), 25/250 (default)
  • Ref/b-frame count doesn't seem to matter, all within the limits of Level 6 or well below anyways
  • Thought it might be VBV limited CRF acting up and overflowing the DPB using the default Level 6 VBV, so I tried lowering the VBV, and lower bitrate ABR with a much longer RC Lookahead, issue persisted
  • CRF encodes ended up being fine anyways, since the issue seems to be limited to MP4Box+Media Foundation...
  • Disabling Open GOP magically fixes it most of the time.
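
The GOP lengths in the list above follow a simple pattern: --min-keyint is about one second of frames and --keyint about ten seconds. A small Python sketch of that arithmetic, assuming the frame rate is given as a string or number; the helper name gop_settings is hypothetical.

```python
from fractions import Fraction


def gop_settings(fps, max_gop_seconds=10):
    """Return (min_keyint, keyint) for a frame rate such as '60000/1001' or 25."""
    frames_per_second = round(Fraction(fps))  # 60000/1001 rounds to 60
    return frames_per_second, frames_per_second * max_gop_seconds


if __name__ == "__main__":
    for rate in ("60000/1001", "30", "25"):
        min_keyint, keyint = gop_settings(rate)
        print(f"{rate}: --min-keyint {min_keyint} --keyint {keyint}")
```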

Is it some kind of IDR signaling issue? Disabling Open GOP alone seemingly resolves all of the issues, but I would like to use Open GOP since these are static camera shots. Is it something really dumb like -inter 500 being too small? I guess that would make me really dumb. It's starting to seem more like a GPAC/mp4box issue, or more likely super-duper Dunning-Kruger PEBKAC, but I don't know enough about HEVC bitstream output and signaling (PPS/SPS/VUI) to know any better so I wasted my time messing with encoder parameters.

MP4Box output is mostly unplayable, it doesn't seek, and it plays choppy, almost like what you'd expect to see when the decoder drops a temporal enhancement layer and plays back at half FPS. Using --forcesync with mp4box didn't help either. I tried MP4Box with an added --negctts but that just outputs "Arg negctts set but not used" in the console - whereas using negative_cts_offsets in ffmpeg seems to fix things?! Either way, the issue only seems to be with Media Foundation playback. Using an MKV and/or decoding with LAV works fine, as does decoding in software or using MPV or even ffplay.
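
A minimal Python sketch of the ffmpeg mux described above, for reference. It only assembles the argument list from the options mentioned in this post (-r before -i, -c copy, -movflags, -loglevel error -stats); the function name and file names are placeholders.

```python
def ffmpeg_mux_command(src, dst, fps="60000/1001",
                       movflags="faststart+negative_cts_offsets"):
    """Build an ffmpeg command that remuxes a raw .hevc stream into MP4."""
    cmd = [
        "ffmpeg",
        "-loglevel", "error", "-stats",  # hide the repeated timestamp warnings
        "-r", fps,                       # input option: must come before -i
        "-i", src,
        "-c", "copy",                    # remux only, no re-encode
    ]
    if movflags:
        cmd += ["-movflags", movflags]
    cmd.append(dst)
    return cmd


if __name__ == "__main__":
    # Pass the list to subprocess.run() to actually execute it.
    print(ffmpeg_mux_command("input.hevc", "output.mp4"))
```

Passing movflags=None gives the plain -r/-c copy variant.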

---

What about any of the x265 bitstream options, could they help? Based on the documentation, --repeat-headers seems like it's only useful for trying to seek within the elementary stream output before muxing it. --aud? --eos? --hrd? I thought --idr-recovery-sei might help, but enabling it along with --repeat-headers seemingly made things worse. mp4box's output when trying to mux a stream made with those two options makes avidemux crash immediately, and makes ffmpeg (MPV) have serious issues playing the file too. This seems to be regardless of the parameters I tried with mp4box: -inter 0 to force a flat mp4, letting it do the default -inter 500, and trying with and without --forcesync for both options. I am pretty sure I tried --nosei, but that'd just be throwing away the extra stuff I asked x265 to write, and then I'd have to waste my time remembering how to re-signal bt709/limited.

Trying to remux any of that mp4box output using ffmpeg results in a file that is entirely unplayable in anything, it skips back and forth randomly, you get intermittently decoded blocks, etc. It's also the only result that could technically allow posting a screenshot, since it's NSFW video...it's VR pr0n alright?

I can mux directly from the raw .hevc stream if I use those two x265 options with ffmpeg but only with no other parameters, just forcing the FPS with -r 60000/1001 to prevent erroneous VFR output, and -c copy. I haven't tried -movflags faststart, and using negative_cts_offsets seemingly breaks these files too. I also have yet to try putting ffmpeg's output back through mp4box. I am streaming these from a NAS and would prefer to have the MOOV atom at the beginning of the file, so a working flat mp4 is only "half fixed".

What's even weirder is I can take a working mp4 from elsewhere at the same resolution, frame rate, and bitrate, and with seemingly identical x265 settings visible in the SEI, and remux it all I want with mp4box. It doesn't break playback under Media Foundation. The only difference is the version tag for x265 reading 0.0 - these working files were also seemingly muxed with ffmpeg (Lavf58.12.100) or even encoded directly with it using -c:v libx265. Looking at that file, it looks like the only options they passed to x265 were --bitrate 30000 --output-depth 10 --colormatrix=2 --colorprim=2 --transfer=2 --videoformat=5. Everything else is --preset medium defaults.

I've just been using --preset medium with some slower options selectively enabled, and some that I thought would help lower bitrate when I thought that was the issue.

Code:
--bitrate 20000 --output-depth 10 --level-idc 6 --no-high-tier
--rect --amp --tskip --tskip-fast --b-intra --limit-modes
--vbv-bufsize 30000 --vbv-maxrate 40000
--analyze-src-pics --rc-lookahead 120 --min-keyint 60 --keyint 600
--fades --video-signal-type-preset BT709_YCC
--opt-qp-pps --opt-ref-list-length-pps --opt-cu-delta-qp
--limit-sao --selective-sao 1 --sao-non-deblock
Plus either +,- or -,+ for pools on a dual socket system, and I've obviously tried with/without --repeat-headers --idr-recovery-sei . Adding/removing any of --b-intra --fades --analyze-src-pics --opt-qp-pps --opt-ref-list-length-pps --opt-cu-delta-qp didn't make a difference either - I'm just including the command I tried with the most options for completeness.
Old 29th July 2022, 12:23   #8564  |  Link
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Finland
Posts: 5,733
A note for MABS users who build for Zen 2/3: add -march=znver2 or -march=znver3 to custom_profile in the local64\etc directory. It gives slightly better performance on those chips; I think I found it 3-4% better when I tested it on my 3900X.
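
To pick the right flag per machine, something like this small Python lookup could work. It only covers the CPUs named in this thread (3900X and 3950X are Zen 2, 5950X is Zen 3); the table and function name are illustrative, and how the flag is written into custom_profile is left out because that depends on your MABS setup.

```python
# Only the CPU models mentioned in this thread; extend as needed.
MARCH_BY_MODEL = {
    "3900X": "znver2",
    "3950X": "znver2",
    "5950X": "znver3",
}


def march_flag(cpu_name):
    """Return the GCC -march flag for e.g. 'Ryzen 9 5950X', or None if unknown."""
    parts = cpu_name.split()
    if not parts:
        return None
    target = MARCH_BY_MODEL.get(parts[-1].upper())
    return f"-march={target}" if target else None


if __name__ == "__main__":
    print(march_flag("Ryzen 9 5950X"))
```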
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
Old 30th July 2022, 20:06   #8565  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,782
New upload: x265 3.5+39-a599806d3

[Windows][GCC 12.1.0][32/32XP/64 bit] 8bit+10bit+12bit
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 30th July 2022, 20:42   #8566  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
I was wondering if someone could build me a Zen3 optimized and Zen2 optimized Windows version for my 5950x and 3950x CPUs. That would be much appreciated.
Old 1st August 2022, 15:34   #8567  |  Link
RanmaCanada
Registered User
 
Join Date: May 2009
Posts: 331
Quote:
Originally Posted by LeXXuz View Post
I was wondering if someone could build me a Zen3 optimized and Zen2 optimized Windows version for my 5950x and 3950x CPUs. That would be much appreciated.
Pretty sure DJATOM has the best. Yes it's an older build, but x265 has been in maintenance mode for well over a year now.
Old 1st August 2022, 19:20   #8568  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
Quote:
Originally Posted by RanmaCanada View Post
Pretty sure DJATOM has the best. Yes it's an older build, but x265 has been in maintenance mode for well over a year now.
I know, that's why I'd like a current build for comparison.
Old 1st August 2022, 19:23   #8569  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by BuccoBruce View Post
TL;DR Is there some magic bullet for muxing an HEVC elementary stream with Open GOP using mp4box and getting Media Foundation to decode it properly?
I've had some .hevc files that don't play properly when muxed in mp4box, but do when muxed in ffmpeg. ffmpeg complains enormously about missing PTS data, but seems to fix it fine.

They were all Closed GOP, though, so potentially unrelated to your issue.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 1st August 2022, 22:19   #8570  |  Link
BuccoBruce
Registered User
 
Join Date: Apr 2022
Posts: 28
Quote:
Originally Posted by benwaggoner View Post
I've had some .hevc files that don't play properly when muxed in mp4box, but do when muxed in ffmpeg. ffmpeg complains enormously about missing PTS data, but seems to fix it fine.

They were all Closed GOP, though, so potentially unrelated to your issue.
Might still be related. I re-encoded so many files that I might have forgotten whether something other than Open GOP was also causing it. Guess I'll stick to ffmpeg for HEVC and just use mp4box for AVC+HLS.
Old 2nd August 2022, 06:13   #8571  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
Is SAO still an issue for high quality encodes with current builds, or can it safely be activated now?
Old 2nd August 2022, 07:23   #8572  |  Link
microchip8
ffx264/ffhevc author
 
Join Date: May 2007
Location: /dev/video0
Posts: 1,844
Quote:
Originally Posted by LeXXuz View Post
Is SAO still an issue for high quality encodes with current builds, or can it safely be activated now?
it's still an issue
__________________
ffx264 || ffhevc || ffxvid || microenc
Old 24th August 2022, 11:58   #8573  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
I'm tinkering around with my profiles to gain more speed out of my encodes. The significant rise in electricity cost here in Germany made that decision necessary.

I have a question regarding the --limit-refs parameter. As there is a huge speed difference between mode 1 and 3 and I was told to use mode 1 for better quality, I now also tested mode 2, which none of the presets seem to use by default.

I got a decent performance increase with mode 2 over mode 1 and tested this with quite a few examples. Can't say I've seen any notable differences in quality so far.

I read the docs about the different modes, but in all honesty I don't really understand what's written there and how that may affect quality.

I always do high bitrate encodes with the "slower" preset as a base and CRF values of 18 or even below. Is there any good reason NOT to use mode 2 over 1 for better performance?
Old 24th August 2022, 18:42   #8574  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by LeXXuz View Post
I'm tinkering around with my profiles to gain more speed out of my encodes. The significant rise in electricity cost here in Germany made that decision necessary.

I have a question regarding the --limit-refs parameter. As there is a huge speed difference between mode 1 and 3 and I was told to use mode 1 for better quality, I now also tested mode 2, which none of the presets seem to use by default.

I got a decent performance increase with mode 2 over mode 1 and tested this with quite a few examples. Can't say I've seen any notable differences in quality so far.

I read the docs about the different modes, but in all honesty I don't really understand what's written there and how that may affect quality.

I always do high bitrate encodes with the "slower" preset as a base and CRF values of 18 or even below. Is there any good reason NOT to use mode 2 over 1 for better performance?
To test more subtle features like this, I strongly recommend using a 2-pass --bitrate encode instead of CRF. It's hard to disentangle impacts on quality when bitrate is also varying. 1-pass CBR can also work, and is faster.
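
A rough Python sketch of such a 2-pass comparison, in case it helps. It only builds the two x265 command lines (using --pass, --stats, --bitrate, --input and --output); the helper name, file names and the 8000 kbps figure are placeholders, and it assumes a .y4m input so no resolution or frame rate flags are needed.

```python
import os


def two_pass_commands(src, dst, bitrate_kbps, extra_args,
                      stats_file="x265_2pass.log"):
    """Return (first_pass, second_pass) x265 argument lists at a fixed bitrate."""
    common = ["x265", "--input", src,
              "--bitrate", str(bitrate_kbps),
              "--stats", stats_file] + list(extra_args)
    first = common + ["--pass", "1", "--output", os.devnull]  # discard pass-1 video
    second = common + ["--pass", "2", "--output", dst]
    return first, second


if __name__ == "__main__":
    # Same bitrate for both runs, so only the tested option differs.
    for mode in ("1", "2"):
        args = ["--preset", "slower", "--limit-refs", mode]
        for cmd in two_pass_commands("clip.y4m", f"limit-refs-{mode}.hevc",
                                     8000, args, f"limit-refs-{mode}.log"):
            print(" ".join(cmd))
```

With both outputs at the same bitrate, any visible or measured quality difference comes from the option under test rather than from a change in file size.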
Old 24th August 2022, 19:03   #8575  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by LeXXuz View Post
I'm tinkering around with my profiles to gain more speed out of my encodes. The significant rise in electricity cost here in Germany made that decision necessary.
If you're looking for ways to reduce joules/pixel, --frame-threads 1 can really help. The overhead of frame threading can really reduce power efficiency, and doesn't always have that big of a speed boost depending on how many cores you have and the resolution you're encoding at.

If you use SAO, --selective-sao 2 saves a bit without material quality impact.

If you can share your current command line, we might have other suggestions.

In general, the --preset options are pretty well tuned for a typical range of content and scenarios as of x265 3.0. They don't include any features added in 3.1 or later, which is why they leave out --selective-sao, --rskip 2, etc., even though those really should be the defaults.
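
To put a number on joules/pixel, the arithmetic is just energy divided by pixels encoded. A small Python sketch, assuming you can measure average power draw during the encode yourself (wall meter or similar); the function name is made up.

```python
def joules_per_pixel(avg_watts, encode_seconds, width, height, frames):
    """Energy used per encoded pixel: (watts * seconds) / (width * height * frames)."""
    total_joules = avg_watts * encode_seconds
    total_pixels = width * height * frames
    return total_joules / total_pixels
```

Run the same clip once with your usual settings and once with --frame-threads 1, plug in the measured power and encode time for each, and compare the two results.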
Old 24th August 2022, 23:07   #8576  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
Thank you for those suggestions.

Right now I recode 1080p content.

I use these settings:

Code:
--preset slower --crf 17.00 --qpfile "E:\WORK\chp.qpf"
 --repeat-headers --input-depth 16 --output-depth 10 --dither 
--ctu 32 --limit-refs 2 --psy-rdoq 5 --selective-sao 0 --no-sao 
--colorprim bt709 --transfer bt709 --colormatrix bt709
CPUs used are Ryzen 5950x and 3950x.
Old 25th August 2022, 01:49   #8577  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by LeXXuz View Post
Thank you for those suggestions.

Right now I recode 1080p content.

I use these settings:

Code:
--preset slower --crf 17.00 --qpfile "E:\WORK\chp.qpf"
 --repeat-headers --input-depth 16 --output-depth 10 --dither 
--ctu 32 --limit-refs 2 --psy-rdoq 5 --selective-sao 0 --no-sao 
--colorprim bt709 --transfer bt709 --colormatrix bt709
CPUs used are Ryzen 5950x and 3950x.
--preset slower is already one of the better-balanced presets. Changing parameters from their slower values to the ones from slow will speed things up, but all of them have quality impacts too.

There's no point to using --selective-sao if you're already using --no-sao.

I always like to set --profile and --level-idc so I'll get warnings if I violate the requirements. In your case that looks like --profile main10 --level-idc 4.0 or 4.1.
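
A simplified Python sketch of how the level follows from resolution and frame rate. The limits are the maximum luma picture size and luma sample rate as I recall them from the HEVC level table, so check them against the spec before relying on this; it ignores bitrate, DPB and picture dimension limits as well as levels below 4, and the names are illustrative.

```python
# (level, max luma picture size in samples, max luma sample rate in samples/s)
HEVC_LEVELS = [
    ("4",    2_228_224,    66_846_720),
    ("4.1",  2_228_224,   133_693_440),
    ("5",    8_912_896,   267_386_880),
    ("5.1",  8_912_896,   534_773_760),
    ("5.2",  8_912_896, 1_069_547_520),
    ("6",   35_651_584, 1_069_547_520),
    ("6.1", 35_651_584, 2_139_095_040),
    ("6.2", 35_651_584, 4_278_190_080),
]


def minimum_level(width, height, fps):
    """Return the lowest listed level that fits the picture size and sample rate."""
    picture_size = width * height
    sample_rate = picture_size * fps
    for level, max_picture_size, max_sample_rate in HEVC_LEVELS:
        if picture_size <= max_picture_size and sample_rate <= max_sample_rate:
            return level
    return None  # larger than anything in the table


if __name__ == "__main__":
    print(minimum_level(1920, 1080, 24))
    print(minimum_level(1920, 1080, 60))
```

Under these assumptions 1080p up to 30 fps fits Level 4 and 1080p at 50/60 fps needs 4.1, which matches the "4.0 or 4.1" above.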

Using --psy-rdoq 5 without raising --psy-rd as well is an uncommon configuration, but should work.

I'd use --rskip 2 to replace the default --rskip 1 because it's a better quality mode. I've not directly compared the speed. Higher --rskip-edge-threshold values are faster, but can reduce quality. I tend to use 2-3 in my stuff, but I'm more biased towards quality/efficiency than your use case.

What CPU are you running on?

The biggest thing to improve pixels/joule without any quality loss would be --frame-threads 1. Lower values can actually improve quality.

You can learn a lot from doing a --csv-log-level 2 and looking at the frame level data. For example, if there aren't a lot of TUs smaller than 8x8 you could reduce --tu-intra-depth and --tu-inter-depth by 1. Recursing all the way down is mostly helpful with content that has sharp details, like text and cel animation.

If you have a lot of RAM, increasing --rc-lookahead can improve quality when VBV-limited quite a lot without much negative speed impact.
Old 25th August 2022, 09:38   #8578  |  Link
LeXXuz
21 years and counting...
 
Join Date: Oct 2002
Location: Germany
Posts: 716
Quote:
Originally Posted by benwaggoner View Post
There's no point to using --selective-sao if you're already using --no-sao.
I was uncertain if I have to set it to 0 as well when I don't want to have SAO at all. I will remove that parameter.

Quote:
Originally Posted by benwaggoner View Post
I always like to set --profile and --level-idc so I'll get warnings if I violate the requirements. In your case that looks like --profile main10 --level-idc 4.0 or 4.1.
Again, I was unsure if I should let x265 decide on its own or put these in manually. Never thought about the violation warnings though which is a very good point. Will add these again.

Quote:
Originally Posted by benwaggoner View Post
Using --psy-rdoq 5 without raising --psy-rd as well is an uncommon configuration, but should work.
Well, that is a longer story, and it's the most subtle approach I have at the moment to fight banding with the quite clean source material I have. The slower preset already uses --psy-rd 2. Raising that any higher added too much static noise to flat areas for my taste.
It's barely visible on 4k, but visible on 1080p and almost terrible on SD.
Without raising at least --psy-rdoq a little, x265 tends to produce banding in certain flat areas. And sadly my living room TV is very susceptible to that and tends to intensify even the slightest banding compared to my other TVs. So this is somewhat a personal compromise.

Quote:
Originally Posted by benwaggoner View Post
I'd use --rskip 2 to replace the default --rskip 1 because it's a better quality mode. I've not directly compared the speed. Higher --rskip-edge-threshold values are faster, but can reduce quality. I tend to use 2-3 in my stuff, but I'm more biased towards quality/efficiency than your use case.
I'll add --rskip 2 to my script.

Quote:
Originally Posted by benwaggoner View Post
What CPU are you running on?
AMD Ryzen 5950x and 3950x. Both with 16 cores/32 threads

Quote:
Originally Posted by benwaggoner View Post
The biggest thing to improve pixels/joule without any quality loss would be --frame-threads 1. Lower values can actually improve quality.
Doesn't that decrease speed a lot as it reduces parallel processing? Or am I mistaken here?


Quote:
Originally Posted by benwaggoner View Post
If you have a lot of RAM, increasing --rc-lookahead can improve quality when VBV-limited quite a lot without much negative speed impact.
The machines have 64GB. I think the default is 40 for the slower preset? How much should I raise that?

Thanks again for your valued input Ben.
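
Putting the changes discussed in this exchange together, a minimal Python sketch that rewrites an x265 argument list: it drops --selective-sao when --no-sao is present and appends --profile main10, --level-idc 4.1 and --rskip 2 if they are missing. The function name is made up, and 4.1 is just one of the two levels suggested above.

```python
def apply_suggestions(args):
    """Return a copy of args with the redundant SAO option removed and the
    suggested profile/level/rskip options added when not already present."""
    has_no_sao = "--no-sao" in args
    out = []
    skip_value = False
    for arg in args:
        if skip_value:            # value belonging to a dropped option
            skip_value = False
            continue
        if has_no_sao and arg == "--selective-sao":
            skip_value = True     # drop the option and its value
            continue
        out.append(arg)
    for flag, value in (("--profile", "main10"),
                        ("--level-idc", "4.1"),
                        ("--rskip", "2")):
        if flag not in out:       # existing values are left untouched
            out += [flag, value]
    return out


if __name__ == "__main__":
    current = ["--preset", "slower", "--crf", "17.00",
               "--ctu", "32", "--limit-refs", "2", "--psy-rdoq", "5",
               "--selective-sao", "0", "--no-sao"]
    print(" ".join(apply_suggestions(current)))
```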
Old 25th August 2022, 20:25   #8579  |  Link
vpupkind
Registered User
 
Join Date: Jul 2007
Posts: 63
For --rc-lookahead: at least 1 s worth of frames.
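
As a quick Python sketch of that rule of thumb: round the frame rate up to whole frames per second and use that as the lookahead. The cap of 250 is what I believe the x265 docs give as the maximum for --rc-lookahead, so treat it as an assumption; the function name is made up.

```python
import math
from fractions import Fraction


def rc_lookahead_for(fps, seconds=1, maximum=250):
    """Return a --rc-lookahead value covering `seconds` of video at `fps`."""
    frames = math.ceil(Fraction(fps) * seconds)  # round up to whole frames
    return min(frames, maximum)


if __name__ == "__main__":
    for rate in ("24000/1001", "25", "60000/1001"):
        print(rate, "->", "--rc-lookahead", rc_lookahead_for(rate))
```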
Old 25th August 2022, 23:13   #8580  |  Link
Immaculate
Registered User
 
Join Date: Oct 2021
Posts: 1
Quote:
Originally Posted by benwaggoner View Post
--preset slower is already one of the better-balanced presets. Changing parameters from their slower values to the ones from slow will speed things up, but all of them have quality impacts too.

There's no point to using --selective-sao if you're already using --no-sao.

I always like to set --profile and --level-idc so I'll get warnings if I violate the requirements. In your case that looks like --profile main10 --level-idc 4.0 or 4.1.

Using --psy-rdoq 5 without raising --psy-rd as well is an uncommon configuration, but should work.

I'd use --rskip 2 to replace the default --rskip 1 because it's a better quality mode. I've not directly compared the speed. Higher --rskip-edge-threshold values are faster, but can reduce quality. I tend to use 2-3 in my stuff, but I'm more biased towards quality/efficiency than your use case.

What CPU are you running on?

The biggest thing to improve pixels/joule without any quality loss would be --frame-threads 1. Lower values can actually improve quality.

You can learn a lot from doing a --csv-log-level 2 and looking at the frame level data. For example, if there aren't a lot of TUs smaller than 8x8 you could reduce --tu-intra-depth and --tu-inter-depth by 1. Recursing all the way down is mostly helpful with content that has sharp details, like text and cel animation.

If you have a lot of RAM, increasing --rc-lookahead can improve quality when VBV-limited quite a lot without much negative speed impact.
Thanks for the tips. --rskip 2 seems to improve grain "motion" quite a bit in some cases.

It's a shame that you have to fiddle with x265 to get an acceptable quality, when a simple --tune film --preset veryslow produces good results with x264. Of course, clean material isn't an issue, it's just that x264 looks better with noise/grain - out of the box.