15th July 2019, 22:18 | #1 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
|
VQ Test for MPEG-2 Encoders
Broadcast Encoder: Francesco Bucciantini (FranceBB)
Senior Video Editor: Livio Aloja (algia)

The files analysed are named “Test4” because we ran several tests. The aim is to encode files from lossless masters as mezzanine files for internal usage.

1) Input file and encoding target
2) The impact of Dithering on objective metrics
3) Comparison between encoders
4) Results and final thoughts
5) Bibliography

1) Input file and encoding target

For this test, the original masterfile is an Apple ProRes, FULL HD (1920x1080), 10bit, 4:2:2 planar, BT709 Limited TV Range, with both progressive and interlaced content at 25fps. The target is an XDCAM50 lossy mezzanine file for broadcast usage, which is an MPEG-2, FULL HD, 50Mbit/s, 8bit, 4:2:2 planar (yv16), BT709 Limited TV Range, closed GOP, with both progressive (flagged as interlaced) and interlaced content at 25fps. The test reel contains different types of content to show how the encoders behave.

2) The impact of Dithering on objective metrics

Internally, whenever we get a high bit depth source, we apply dithering to avoid introducing banding while bringing it down to 8bit. In particular, we use Floyd-Steinberg error diffusion. The algorithm pushes (adds) the residual quantization error of a pixel onto its neighboring pixels, to be dealt with later. It spreads the error according to the following distribution (shown as a map of the neighboring pixels):

            *     7/16
   3/16    5/16   1/16

The star marks the pixel currently being scanned, and the blank positions are previously scanned pixels. The algorithm scans the image from left to right, top to bottom, quantizing pixel values one by one. Each time, the quantization error is transferred to the neighboring pixels, without affecting the pixels that have already been quantized.
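The error-diffusion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the filter chain used for the test: the function names are made up for this example, the frame is a plain list of rows of 10-bit samples, the scan is the simple left-to-right one (not the serpentine variant), and the 10-bit to 8-bit scaling is a plain division by 4.

```python
import math


def floyd_steinberg_10_to_8(frame):
    """Dither a 2-D list of 10-bit samples (0..1023) to 8-bit (0..255)."""
    height, width = len(frame), len(frame[0])
    # Work in the 8-bit scale as floats so diffused error can accumulate.
    work = [[v / 4.0 for v in row] for row in frame]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = min(255, max(0, math.floor(old + 0.5)))  # quantize
            out[y][x] = new
            err = old - new  # residual quantization error
            # Push the error onto pixels that have not been scanned yet.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def truncate_10_to_8(frame):
    """Drop the two least significant bits (plain truncation)."""
    return [[v >> 2 for v in row] for row in frame]


# A flat 10-bit area at 514 sits halfway between 8-bit codes 128 and 129.
flat = [[514] * 16 for _ in range(16)]
dithered = floyd_steinberg_10_to_8(flat)
truncated = truncate_10_to_8(flat)
```

On this flat patch, truncation maps every sample to 128, while the dithered output is a mix of 128 and 129.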
Hence, if a number of pixels have been rounded downwards, it becomes more likely that the next pixel is rounded upwards, so that on average the quantization error is close to zero.

The original lossless masterfile for this test is 10bit, but the encoded file has to be 8bit due to the XDCAM specifications, so we wondered whether dithering has a positive or a negative impact on objective metrics compared to truncation. We ran some internal tests with three different types of dithering:
– Serpentine Floyd-Steinberg error diffusion
– Stucki error diffusion
– Atkinson error diffusion

When we compared the tests, we noticed that Serpentine Floyd-Steinberg error diffusion is a well-balanced algorithm (which confirms why we use it internally), Stucki error diffusion looks “sharp” and preserves light edges and details well, and Atkinson error diffusion generates distinct patterns but keeps flat areas clean. Unfortunately, even if they look “better” to the human eye, this is strictly subjective, as they only look “different”. As a matter of fact, on both SSIM and PSNR, every dithering method scored lower than truncation. Interestingly, dithering didn't score lower in every single frame: there are a few frames in which the dithering algorithms scored higher than truncation. Overall, though, truncation outperformed the dithering algorithms by 1.51% in SSIM and 0.3% in PSNR, which is why we decided not to include dithering as reference.

3) Comparison between encoders

In this part, we compare the following encoders: Ateme, AWS, Selenio, Telestream and x262. For the reason explained above, the x262 file has been encoded without any dithering algorithm, just using truncation. The first graph represents how the different encoders behave during the whole video.
Since SSIM goes from 0 to 1 with many digits after the decimal point, we re-scaled it to make it more human readable. In the tests, AWS performed better than the other encoders, with a score of 289921, followed by Ateme by a very narrow margin (289639). In third position, with a rather significant quality drop, we have Telestream with 281010, followed by Selenio with 279577. At the bottom we have x262, which scored 276854 and is outperformed by 4.51%.

PSNR pretty much confirms what SSIM shows. AWS performed better than all the other encoders and scored 733272, followed by Ateme by a narrow margin with 731639. In third position there's Telestream with 729195, followed by Selenio, which scored 722449. At the bottom there's x262 with a total score of 720755. According to PSNR, though, Selenio is closer to the quality reached by x262 than to the one reached by Telestream.

SSIM Individual Charts (From best to worst):
PSNR Individual Charts (From best to worst):

4) Results and final thoughts

AWS achieved a better score than all the other encoders, but its advantage comes only from a slightly higher spike on a few scenes, while overall it performed pretty much the same as Ateme. In particular, grain retention was pretty much fine on both, but when random noise recorded by the camera came into the equation, AWS handled it slightly better than Ateme. Again, overall they performed pretty much the same, which is why the margin was really narrow.

In third place, Telestream performed worse than Ateme by a not-so-high margin, but still, it was worse. Even so, it's closer to the upper part of the chart than to the lower part; it just didn't manage to reach the same level as Ateme on too many scenes, which is why it ended up third.
In fourth position there's Selenio, which performed significantly worse than AWS and Ateme, and worse than Telestream by a still significant margin.

At the bottom of the table there's x262, which is apparently the worst MPEG-2 encoder among those tested at such a high bitrate. Its performance was pretty low overall and, on top of that, it struggled to encode sport properly. On the other hand, x262 is free and open source, while all the other encoders are closed source, have to be purchased at a very high cost and don't support Avisynth input, so we would still choose x262 over those other encoders. We're already using x262 and we're not planning to change anytime soon.

5) Bibliography

– Visgraf (Vision and Graphic Laboratory) Mathematics, FS algorithm
– Proceedings of the Society of Information Display (adaptive algorithm)

Last edited by FranceBB; 15th July 2019 at 22:20. |
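For readers who want to check the margins quoted in the post above, the re-scaled totals can be compared with a few lines of Python. The scores are copied from the post; the helper name `gap_to_best` is made up for this example, and the way the gap is computed (relative to the top score) is an assumption, since the post does not state its formula. That assumption does reproduce the 4.51% SSIM figure quoted for x262.

```python
# Re-scaled totals as quoted in the post.
ssim = {"AWS": 289921, "Ateme": 289639, "Telestream": 281010,
        "Selenio": 279577, "x262": 276854}
psnr = {"AWS": 733272, "Ateme": 731639, "Telestream": 729195,
        "Selenio": 722449, "x262": 720755}


def gap_to_best(scores):
    """Percentage by which each encoder trails the top score (assumed formula)."""
    best = max(scores.values())
    return {name: round(100.0 * (best - s) / best, 2)
            for name, s in scores.items()}


ssim_gap = gap_to_best(ssim)
psnr_gap = gap_to_best(psnr)

# Selenio's PSNR distance to its two neighbours in the ranking.
selenio_to_x262 = psnr["Selenio"] - psnr["x262"]
selenio_to_telestream = psnr["Telestream"] - psnr["Selenio"]

for name in ssim:
    print(f"{name:<11} SSIM gap {ssim_gap[name]:5.2f}%   PSNR gap {psnr_gap[name]:5.2f}%")
```

With these numbers, Selenio's PSNR total is 1694 points above x262 and 6746 points below Telestream, which matches the remark that it sits closer to x262.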
17th July 2019, 00:27 | #3 | Link |
Derek Prestegard IRL
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
|
Fascinating
I was always impressed with Elemental's MPEG-2 encoder, so I'm not surprised to see it doing so well here. Particularly when doing low bitrate 15 Mbps CableLabs compliant 1080i it absolutely crushed my go-to at that point - Harmonic ProMedia Carbon aka Rhozet Carbon Coder. Any particular reason you left Harmonic out of the mix? Last edited by Blue_MiSfit; 17th July 2019 at 06:33. |
18th July 2019, 02:46 | #5 | Link |
Derek Prestegard IRL
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
|
Rhozet was fine, and WAY better than Digital Rapids (now Imagine's Selenio product line), but at low bitrates (15 Mbps for 1080i) it had a lot of blocking that totally went away with Elemental. Plus, the latter was WAY faster
Gosh, I haven't thought about this in a while. It's been years since I've done any MPEG-2 encoding. |
14th July 2021, 02:15 | #6 | Link |
Registered User
Join Date: Dec 2009
Posts: 72
|
Do you remember the x262 commandline? I sure hope you used --tune ssim/psnr.
I should probably check in here more often. This thread is 2 years old and I haven't seriously worked on x262 in 8 years. I'm glad people still find it somewhat useful, though. |
14th July 2021, 13:50 | #7 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
|
Quote:
About x262, I don't remember the command line, but I remember that I had to make sure the content was progressive 25p 'cause this:
Code:
x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 --keyint 12 --bframes 2 --tff --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component --nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"
pause
Code:
avs [info]: 1920x1080i 0:0 @ 25/1 fps (cfr)
x262 [error]: interlaced 4:2:2 not implemented
Code:
x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 --keyint 12 --bframes 2 --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component --nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"
pause
Now, here is the thing: first of all, to create a compliant XDCAM-50 stream we need:
Code:
--level
The MPEG-2 levels are: LL Low Level ML Main Level 2 H-14 High 1440 HL High Level
we need the last one.
Code:
--nal-hrd cbr
Then:
Code:
--keyint 12 --bframes 2
pict_type=I pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B
and then repeats the sequence as M=3 N=12.
This is an example of an XDCAM File encoded with FFMpeg:
Code:
General
Complete name : S:\MEDIADIRECTOR\ARCA\F1 1993 GP ITALIA GARA 930912 1P (7754523).mxf
Format : MXF
Commercial name : XDCAM HD422
Format version : 1.3
Format profile : OP-1a
Format settings : Closed / Complete
File size : 45.8 GiB
Duration : 1 h 49 min
Overall bit rate : 60.0 Mb/s
Encoded date : 2021-06-23 14:23:37.404
Writing application : FFmpeg OP1a Muxer 58.65.101.0.0

Video
ID : 2
Format : MPEG Video
Commercial name : XDCAM HD422
Format version : Version 2
Format profile : 4:2:2@High
Format settings : CustomMatrix / BVOP
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : M=3, N=12
Format settings, picture structure : Frame
Format settings, wrapping mode : Frame
Codec ID : 0D01030102046001-0401020201040300
Duration : 1 h 49 min
Bit rate mode : Constant
Bit rate : 50.0 Mb/s
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate : 25.000 FPS
Standard : Component
Color space : YUV
Chroma subsampling : 4:2:2
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.965
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
GOP, Open/Closed : Open
GOP, Open/Closed of first frame : Closed
Stream size : 38.2 GiB (83%)
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Code:
avs2yuv.exe "S:\00_INGEST_MAM\A.R.C.A\02_ALTRO\NR_DJF_AVISYNTH_TEST_SPORT_SD.avs" -csp AUTO -o - | ffmpeg.exe -i - -pix_fmt yuv422p -vcodec mpeg2video -s 1920:1080 -aspect 16:9 -vf setfield=tff -flags +ildct+ilme+cgop -b_strategy 0 -mpv_flags +strict_gop -sc_threshold 1000000000 -r 25 -b:v 50000k -minrate 50000k -maxrate 50000k -bufsize 17825792 -g 12 -bf 2 -profile:v 0 -level:v 2 -color_range 1 -color_primaries 1 -color_trc 1 -colorspace 1 -y "\\MIBCSSDA001\Media Ingest\filetemporanei\server0\output.mxf"
Can you help me out a bit here? I mean, can we try to support XDCAM-50 properly this time? Let's be honest, aside from professional formats like XDCAM and IMX, there's not much use for MPEG-2 nowadays. On top of that, I think x262 should not encode in H.264. I think it should be MPEG-2 only, with the same features as x264, but that's it. This way, if we get it MPEG-2 only and XDCAM/IMX compliant, then I'm pretty sure it can be used professionally and it could even be included in FFMpeg as libx262!
Wouldn't that be cheating? ehehehehe I mean the goal was to offer a real life scenario, but yeah, I guess picking them would have achieved a higher score, definitely eheheheh
Last edited by FranceBB; 14th July 2021 at 14:17. |
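The "Bits/(Pixel*Frame)" figure in the MediaInfo report quoted above can be checked by hand. The small sketch below does that, and also works out how much video the ffmpeg `-bufsize` value holds at the target rate. The function names are made up for this example, and the buffer calculation assumes the value is read in bits, as ffmpeg does for `-bufsize`; it says nothing about how x262 interprets `--vbv-bufsize`.

```python
def bits_per_pixel_frame(bitrate_bps, width, height, fps):
    """Average number of coded bits available per luma pixel per frame."""
    return bitrate_bps / (width * height * fps)


def buffer_seconds(bufsize_bits, bitrate_bps):
    """How long the rate-control buffer takes to fill at the given bitrate."""
    return bufsize_bits / bitrate_bps


# XDCAM HD422 figures from the report: 50 Mb/s, 1920x1080, 25 fps.
bpp = round(bits_per_pixel_frame(50_000_000, 1920, 1080, 25), 3)

# -bufsize 17825792 from the ffmpeg command, which is 17 * 2**20 bits.
buf = round(buffer_seconds(17_825_792, 50_000_000), 3)

print(bpp, buf)
```

The first value agrees with the 0.965 reported by MediaInfo.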
|
14th July 2021, 15:26 | #8 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,546
|
There is still room for improvement to pull closer to CCE; DVDs are still needed.
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain) "Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..." |
14th July 2021, 15:43 | #9 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
|
Quote:
Anyway, if ifb gets XDCAM right I'm gonna be happy xD |
|
14th July 2021, 17:23 | #10 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
|
What's the goal in using objective metrics here? Particularly for dithering, which is absolutely a subjective optimization.
The "best" encoder is the one that delivers the best subjective results in double-blind comparisons. Objective metrics are an okay 1st order approximation of some things, but not for things like how different dithering modes impact subjective quality of the final output. For MPEG-2, a dithering mode that looks better pre-compression might wind up compressing less, and any potential quality gain is eaten up by artifacts from encoding requiring a higher QP. |
14th July 2021, 19:34 | #11 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
|
Yes, which is why I tested to see whether dithering was going to affect quality in terms of objective metrics or not, and it did, negatively, so in the end the one I evaluated was the one obtained via truncation, which is the one you see in the charts
|
15th July 2021, 01:23 | #12 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
|
Quote:
There's a lot of secret sauce in dithering for 8-bit codecs, particularly as simple a one as MPEG-2. |
|
21st July 2021, 15:28 | #13 | Link | |||||
Registered User
Join Date: Dec 2009
Posts: 72
|
Quote:
Code:
--output-csp i422 --level high
Code:
x262 [info]: 4:2:2 profile @ High level
Quote:
Quote:
Quote:
The x262 repository hasn't been updated largely because of the x264 history rewrite. I'm not enough of a git-filter-branch expert to fix it.
Quote:
A "real life scenario" uses your eyes, not PSNR/SSIM. |
|||||
21st July 2021, 22:21 | #14 | Link | ||||
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
|
That's really a shame...
If you ever feel like getting into this, lots of people would definitely appreciate it, I'm sure.
Quote:
Quote:
Quote:
Quote:
Code:
--no-scenecut --keyint 12 --bframes 2
Code:
pict_type=I pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B pict_type=P pict_type=B pict_type=B
So I guess the final command line would be:
Code:
x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --level high --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 --keyint 12 --bframes 2 --no-scenecut --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component --nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"
pause
I'll test again and let you know if it works correctly, but I have faith it will :P
The only "sad" thing is that it's gonna be progressive only... If you ever introduce interlaced encoding in 4:2:2, please let me know. I also hope that one day your patch will be merged into the mainline x264 repository, although, if it hasn't been merged after so many years, I doubt it ever will be...
By the way, just let me say that what you've done was brilliant; I mean, using the x264 features to encode an MPEG-2 stream is very interesting and it's a shame that it hasn't been used by more broadcasters...
Last edited by FranceBB; 21st July 2021 at 22:25. |
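The fixed picture-type sequence quoted in this post (M=3, N=12, i.e. --keyint 12 --bframes 2 with scene-cut detection off) can be written down as a tiny generator. This is only an illustration: the function name is made up, it reproduces the pattern in the order listed in the post, and it does not model frame reordering or anything the encoder does internally.

```python
def gop_pattern(n=12, m=3):
    """Picture types for one GOP: an I frame, then a P every m-th frame, B otherwise.

    n is the GOP length (keyint); m is the anchor distance (bframes + 1).
    """
    return ["I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)]


pattern = gop_pattern(n=12, m=3)
print(" ".join(pattern))
```

With a keyint of 12 and 2 B-frames between anchors, each GOP carries one I, three P and eight B pictures.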
||||
22nd July 2021, 17:13 | #15 | Link |
Registered User
Join Date: Mar 2006
Posts: 1,049
|
I have one question: is there any reason not to test ordered dither alongside (or as an alternative to) "noise"-like dither?
FS is a "noise"-type dither; it unavoidably increases entropy, so sources coded with a lossy encoder may deliver suboptimal results. |
22nd July 2021, 18:01 | #16 | Link | ||||
Registered User
Join Date: Dec 2009
Posts: 72
|
Quote:
Quote:
--profile 422 isn't needed with --output-csp i422
--level high isn't needed either if you are encoding HD resolutions
Maybe add --open-gop if XDCAM allows it. Short keyints are the worst and that helps some.
Quote:
Quote:
Code:
2010-10-09 22:28:47 < kierank> awww BBB's not here
2010-10-09 22:28:57 < Dark_Shikari> what do you need to harass him for?
2010-10-09 22:29:09 < kierank> the fact that x262 is making progress as opposed to xvp8
2010-10-09 22:29:15 < ifb> lol
2010-10-09 22:29:45 < Dark_Shikari> hell yes
2010-10-09 22:29:53 < Dark_Shikari> \o/
2010-10-09 22:29:59 < Dark_Shikari> Now, what you need to do
2010-10-09 22:30:02 < Dark_Shikari> is show x262 beating libvpx
2010-10-09 22:31:55 < Dark_Shikari> troll troll troll troll troll
2010-10-09 22:32:08 < Dark_Shikari> we can really troll the shit out of michael with this one too
2010-10-09 22:32:12 < Dark_Shikari> "lol x264 is a better ffmpeg than ffmpeg" |
||||
23rd July 2021, 04:57 | #17 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
|
Quote:
Really, metrics should be calculated by comparing the dithered input to the encoder with the final encoder output. A subjective analysis comparing different dithering modes is interesting, but objective metrics just aren't going to be helpful. Dithering is a fascinating area of study that I'm hoping to purge from my memory as we enter the 10+ bit era. But heck, it was probably 20 years ago I first said "thank goodness I'll never have to do an Animated GIF again!" |
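The point about which pair of frames to feed the metric can be made concrete with a minimal PSNR helper. This is a sketch under assumptions: frames are plain lists of rows of 8-bit luma samples, the function name and the two tiny example frames are made up, and a real measurement would use a proper tool on full video. The idea is simply that the reference passed in is the 8-bit dithered frame the encoder actually received, and the test frame is the decoded output.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized frames of integer samples."""
    squared = [(r - t) ** 2
               for ref_row, test_row in zip(reference, test)
               for r, t in zip(ref_row, test_row)]
    mse = sum(squared) / len(squared)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Hypothetical data: what went into the encoder vs. what came out of the decoder.
encoder_input = [[128, 129, 128, 129],
                 [129, 128, 129, 128]]
decoded_output = [[129, 130, 129, 130],
                  [130, 129, 130, 129]]

score = psnr(encoder_input, decoded_output)
```

Measured this way, the score reflects only what the encoder did to its input, not the bit-depth conversion that happened before it.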
|
2nd August 2021, 20:59 | #18 | Link | |||
Registered User
Join Date: Mar 2006
Posts: 1,049
|
Quote:
Also, the temporal characteristics of FS make this kind of dither impossible to compress. From my perspective FS is like noise, i.e. suboptimal from both a spatial and a temporal perspective. Besides this, here is a clear demonstration of how FS behaves in video: take the same video before compression, then modify it by placing a single static pixel at a fixed position (so no change from a temporal perspective), apply FS to both videos, subtract the original from the modified one and compare the spatial residue. Then a second experiment: modify a single pixel at a different position in each frame, subtract the reference from the modified version and analyze the residue in the spatial and temporal domains. Perhaps I'm a bit naive, but from my perspective the residue is noise, and noise is extremely difficult to compress, especially for a lossy encoder... Perhaps my example is too simplistic, but a similar effect can be achieved simply by adding filtered noise (blue?) to, let's say, the luma channel. Also, instead of FS, some refined ordered dither such as Ulichney's could be better, at least from a temporal perspective.
Quote:
And as always, you need to clearly define the goal and the methodology.
Quote:
This is the same as with ultra high frame rates: 300...600 frames per second will be key to getting full immersion... But my question was triggered by FS dithering: in my experience, FS dither raises QP dramatically and literally steals bits from video details. |
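The first experiment described in this post (one changed pixel, FS applied to both versions, then a subtraction) can be sketched on a single small frame. Everything here is an assumption made for illustration: the function names, the 16x16 flat 10-bit frame, the position and value of the changed pixel, and the plain left-to-right scan. It is not a measurement on real video.

```python
import math


def fs_dither(frame):
    """Floyd-Steinberg dither of 10-bit samples to 8-bit, left-to-right scan."""
    h, w = len(frame), len(frame[0])
    work = [[v / 4.0 for v in row] for row in frame]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = min(255, max(0, math.floor(old + 0.5)))
            out[y][x] = new
            err = old - new
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def residue(width=16, height=16, flat=514, pos=(4, 4), spike=600):
    """Dither a flat frame and a copy with one changed pixel; return the difference."""
    reference = [[flat] * width for _ in range(height)]
    modified = [row[:] for row in reference]
    py, px = pos
    modified[py][px] = spike
    a, b = fs_dither(reference), fs_dither(modified)
    return [[vb - va for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


res = residue()
changed = sum(1 for row in res for v in row if v != 0)
print("pixels that differ after dithering:", changed)
```

Pixels scanned before the changed one are never affected, since the error is only pushed forward; how far the difference spreads after that point is what the printed count shows for this particular frame. The second experiment from the post would repeat this with a different `pos` for every frame.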
|||
31st August 2021, 02:58 | #19 | Link |
Derek Prestegard IRL
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
|
Any suggestions on tuning ffmpeg's mpeg-2 encoder or x262 for quality? I'm happy to burn as much CPU time as possible!
I'm targeting 6 Mbps for 720p59.94. Aggressive? ... Yes
__________________
These are all my personal statements, not those of my employer :) Last edited by Blue_MiSfit; 31st August 2021 at 05:11. |
1st September 2021, 01:33 | #20 | Link | |
Registered User
Join Date: Apr 2010
Location: I have a statue in Hakodate, Japan
Posts: 744
|
Quote:
EDIT: Ok, I see, this is another class of encoders, my comment is not exactly applicable here. Last edited by GMJCZP; 1st September 2021 at 18:50. |
|