VQ Test for MPEG-2 Encoders

FranceBB · 15th July 2019, 22:18

VQ Test for MPEG-2 Encoders

Broadcast Encoder : Francesco Bucciantini (FranceBB)
Senior Video Editor : Livio Aloja (algia)

The files analysed are named “Test4” because we ran several tests. The aim is to
encode files from lossless masters as mezzanine files for internal usage.

1) Input file and encoding target
2) The impact of Dithering on objective metrics
3) Comparison between encoders
4) Results and final thoughts
5) Bibliography

1) Input file and encoding target

For this test, the original masterfile is an Apple ProRes, FULL HD (1920x1080), 10bit,
4:2:2 planar, BT709 Limited TV Range, with both progressive and interlaced content at
25fps. The target is an XDCAM50 lossy mezzanine file for broadcast usage, which is MPEG-2,
FULL HD, 50Mbit/s, 8bit, 4:2:2 planar (yv16), BT709 Limited TV Range, closed GOP, with both
progressive (flagged as interlaced) and interlaced content at 25fps.
The test reel contains different types of content to test how the encoders behave.

2) The impact of Dithering on objective metrics

Internally, whenever we get a high bit depth source, we apply Dithering in order to
avoid introducing banding while bringing it to 8bit. In particular, we use the Floyd-Steinberg
error diffusion.
The algorithm achieves dithering using error diffusion, meaning it pushes (adds) the residual
quantization error of a pixel onto its neighboring pixels, to be dealt with later. It spreads the
debt out according to the following distribution (shown as a map of the neighboring pixels):

            *     7/16
  3/16    5/16    1/16

The pixel indicated with a star indicates the pixel currently being scanned, and the blank pixels
are the previously-scanned pixels. The algorithm scans the image from left to right, top to
bottom, quantizing pixel values one by one. Each time the quantization error is transferred to
the neighboring pixels, while not affecting the pixels that already have been quantized. Hence, if
a number of pixels have been rounded downwards, it becomes more likely that the next pixel is
rounded upwards, such that on average, the quantization error is close to zero.
The original lossless masterfile for this test is 10bit, but the encoded file has to be 8bit due to
the XDCAM specifications, so we were wondering whether Dithering has a positive or a negative
impact on objective metrics compared to truncation.
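
A minimal sketch of the two 10bit to 8bit conversions being compared, assuming samples map to 8bit by a plain division by 4 and a simple left-to-right scan (not the serpentine variant used in the tests). The function names and the flat test patch are illustrative only and are not taken from our actual pipeline.

```python
def truncate_10_to_8(frame):
    """Drop the two least significant bits of every 10-bit sample."""
    return [[v >> 2 for v in row] for row in frame]


def floyd_steinberg_10_to_8(frame):
    """Quantize 10-bit samples to 8 bits, diffusing the rounding error."""
    height, width = len(frame), len(frame[0])
    # Float working copy expressed in 8-bit units (e.g. 514 -> 128.5).
    work = [[v / 4.0 for v in row] for row in frame]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = min(255, max(0, int(old + 0.5)))  # round to nearest, clip
            out[y][x] = new
            err = old - new
            # Push the error onto pixels that have not been quantized yet.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# Flat 10-bit patch whose value (514) sits halfway between 8-bit codes 128 and 129.
flat = [[514] * 8 for _ in range(8)]
truncated = truncate_10_to_8(flat)
dithered = floyd_steinberg_10_to_8(flat)

mean_trunc = sum(map(sum, truncated)) / 64
mean_dith = sum(map(sum, dithered)) / 64
print("truncation mean:", mean_trunc)
print("dithered mean:  ", mean_dith)
```

On this patch truncation maps every sample to 128, while error diffusion mixes 128 and 129 so that the average stays close to the original 128.5. That is the banding-avoidance effect described above, at the cost of added pixel-to-pixel variation.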

We tried to run some internal tests with three different types of dithering:

– Serpentine Floyd-Steinberg error diffusion
– Stucki error diffusion
– Atkinson error diffusion

When we compared the tests, we noticed that the Serpentine Floyd-Steinberg error diffusion is
a well-balanced algorithm (which confirms the reason why we use it internally), the
Stucki error diffusion looks “sharp” and preserves light edges and details well, and the Atkinson
error diffusion generates distinct patterns but keeps flat areas clean. Unfortunately, though,
whether they look “better” to the human eye is strictly subjective; strictly speaking they only look “different”.
As a matter of fact, on both SSIM and PSNR, each dithering method scored lower
than truncation.
Interestingly, dithering didn't score lower in every single frame: there are a
few frames in which the dithering algorithms achieved a higher value than
truncation. Overall, though, truncation outperformed the dithering algorithms by 1.51% in SSIM and 0.3%
in PSNR, which is why we decided not to include Dithering as reference.
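
For readers unfamiliar with the metric, here is a minimal sketch of a per-frame PSNR for 8bit samples. It assumes the usual definition based on mean squared error with a peak of 255; the function name and the tiny sample frames are illustrative, and this is not the tool we used for the measurements.

```python
import math


def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


reference = [[100, 100], [100, 100]]
noisy = [[101, 99], [100, 100]]  # two samples off by one code value
score = psnr(reference, noisy)
print(f"{score:.2f} dB")
```

Because the metric only measures sample-by-sample deviation, any noise-like pattern added by dithering counts as error, regardless of how it looks.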

3) Comparison between encoders

In this part, we are going to compare the following encoders: Ateme, AWS, Selenio, Telestream and x262.
For the reason already explained above, the file encoded with x262 has been encoded without any dithering algorithms and just using truncation.

The first graph represents how the different encoders behave during the whole video.
Since SSIM goes from 0 to 1 with many digits after the 0, we re-scaled it in order to make it more human readable. From the tests, AWS performed better than other encoders, with a score of 289921, followed by Ateme by a very narrow margin (289639). At the third position, with a rather significant quality drop, we have Telestream with 281010, followed by Selenio with 279577.
At the bottom, we have x262, which scored 276854 and is outperformed by AWS by 4.51%.

PSNR pretty much confirms what is shown by SSIM.
AWS performed better than all the other encoders and scored 733272, followed by Ateme by a narrow margin with 731639. At the third position, there's Telestream with 729195, followed by Selenio which scored 722449. At the bottom, there's x262 with a total score of 720755.
According to PSNR, though, Selenio is closer to the quality reached by x262 rather than the one reached by Telestream.
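
The percentages and the ranking remarks can be checked directly from the scores quoted above. This is a small sketch: the dictionary and function names are illustrative, and the gap is computed relative to the best score, which is an assumption about how the 4.51% figure was derived.

```python
ssim = {"AWS": 289921, "Ateme": 289639, "Telestream": 281010,
        "Selenio": 279577, "x262": 276854}
psnr = {"AWS": 733272, "Ateme": 731639, "Telestream": 729195,
        "Selenio": 722449, "x262": 720755}


def gap_to_best(scores):
    """Percentage by which each encoder trails the best score."""
    best = max(scores.values())
    return {name: round(100 * (best - s) / best, 2) for name, s in scores.items()}


ssim_gap = gap_to_best(ssim)
psnr_gap = gap_to_best(psnr)
for name in ssim:
    print(f"{name:<11} SSIM -{ssim_gap[name]:.2f}%  PSNR -{psnr_gap[name]:.2f}%")

# Selenio sits closer to x262 than to Telestream in PSNR.
selenio_to_x262 = psnr["Selenio"] - psnr["x262"]
telestream_to_selenio = psnr["Telestream"] - psnr["Selenio"]
```

With these numbers the SSIM gap between AWS and x262 comes out at 4.51%, matching the figure in the text.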

SSIM Individual Charts (From best to worst):

PSNR Individual Charts (From best to worst):

4) Results and final thoughts

AWS achieved a better score than all the other encoders, but only because it spiked slightly higher on a few scenes; overall it performed pretty much the same as Ateme. In particular, grain retention was fine on both, but when random noise recorded by the camera came into the equation, AWS handled it slightly better than Ateme, which is why the margin was so narrow. In third place, Telestream performed worse than Ateme, although not by a large margin. It is still closer to the upper part of the chart than to the lower part, but it fell short of Ateme's level on too many scenes, which is why it ended up third. In fourth position there's Selenio, which performed significantly worse than AWS and Ateme, and worse than Telestream by a still significant margin. At the bottom of the table there's x262, which apparently is the worst MPEG-2 encoder among those tested at such a high bitrate: its performance was pretty low overall and, on top of that, it struggled to encode sport properly. On the other hand, x262 is free and open source, while all the other encoders are closed source, need to be purchased at a very high cost and don't support Avisynth input, so we would still choose x262 over them. We're already using x262 and we're not planning to change anytime soon.

5) Bibliography

– Visgraf (Vision and Graphic Laboratory) Mathematics, FS algorithm
– Proceedings of the Society of Information Display (adaptive algorithm)

kolak · 16th July 2019, 19:07

AWS is an ex Elemental encoder, no?
I also found FS dithering to be the best, but I also kept adding a tiny amount of noise (this was for Blu-ray encoding though).

Blue_MiSfit · 17th July 2019, 00:27

Fascinating

I was always impressed with Elemental's MPEG-2 encoder, so I'm not surprised to see it doing so well here. Particularly when doing low bitrate 15 Mbps CableLabs compliant 1080i it absolutely crushed my go-to at that point - Harmonic ProMedia Carbon aka Rhozet Carbon Coder.

Any particular reason you left Harmonic out of the mix?

kolak · 17th July 2019, 11:03

Is it that good?
Rhozet was a reference for interlaced content for me.

Blue_MiSfit · 18th July 2019, 02:46

Rhozet was fine, and WAY better than Digital Rapids (now Imagine's Selenio product line), but at low bitrates (15 Mbps for 1080i) it had a lot of blocking that totally went away with Elemental. Plus, the latter was WAY faster

Gosh, I haven't thought about this in a while. It's been years since I've done any MPEG-2 encoding.

ifb · 14th July 2021, 02:15

Do you remember the x262 commandline? I sure hope you used --tune ssim/psnr.

I should probably check in here more often. This thread is 2 years old and I haven't seriously worked on x262 in 8 years. I'm glad people still find it somewhat useful, though.

FranceBB · 14th July 2021, 13:50

Quote:

Originally Posted by ifb

Do you remember the x262 commandline?
I should probably check in here more often. This thread is 2 years old and I haven't seriously worked on x262 in 8 years. I'm glad people still find it somewhat useful, though.

I'm glad you're back, though!

About x262, I don't remember the command line, but I remember that I had to make sure the content was progressive 25p 'cause this:

Code:

x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 
--keyint 12 --bframes 2 --tff --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 
--videoformat component --nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"

pause

ends up with:

Code:

avs [info]: 1920x1080i 0:0 @ 25/1 fps (cfr)
x262 [error]: interlaced 4:2:2 not implemented

so I had to settle for something like:

Code:

x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 
--keyint 12 --bframes 2 --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component 
--nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"

pause

Now, here is the thing:

First of all, to create a compliant XDCAM-50 stream we need:

Code:

--level

which is currently not working if I try things over 1, but we need "HL", so High Level.
The MPEG-2 levels are:
LL Low Level
ML Main Level 2
H-14 High 1440
HL High Level

we need the last one.

Code:

--nal-hrd cbr

should actually be displaying constant bitrate properly rather than showing it as variable.
Then:

Code:

--keyint 12 --bframes 2

are meant to be used (I think) so that the GOP is:

pict_type=I
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B

and then repeats the sequence as M=3 N=12.
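
A small sketch of how a fixed pattern follows from those two numbers, assuming N is the keyframe interval and M is the number of B-frames plus one. The function name is illustrative, and the sketch only reproduces the sequence listed above; it says nothing about what a given encoder actually emits.

```python
def gop_pattern(keyint, bframes):
    """Picture types of one fixed GOP with N = keyint and M = bframes + 1."""
    m = bframes + 1
    types = []
    for i in range(keyint):
        if i == 0:
            types.append("I")   # every GOP starts with an I picture
        elif i % m == 0:
            types.append("P")   # an anchor picture every M pictures
        else:
            types.append("B")   # the pictures between anchors
    return "".join(types)


pattern = gop_pattern(keyint=12, bframes=2)
print(pattern)
```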

This is an example of an XDCAM File encoded with FFMpeg:

Code:

General 
Complete name : S:\MEDIADIRECTOR\ARCA\F1 1993 GP ITALIA GARA 930912 1P (7754523).mxf 
Format : MXF 
Commercial name : XDCAM HD422 
Format version : 1.3 
Format profile : OP-1a 
Format settings : Closed / Complete 
File size : 45.8 GiB 
Duration : 1 h 49 min 
Overall bit rate : 60.0 Mb/s 
Encoded date : 2021-06-23 14:23:37.404 
Writing application : FFmpeg OP1a Muxer 58.65.101.0.0 

Video 
ID : 2 
Format : MPEG Video 
Commercial name : XDCAM HD422 
Format version : Version 2 
Format profile : 4:2:2@High 
Format settings : CustomMatrix / BVOP 
Format settings, BVOP : Yes 
Format settings, Matrix : Custom 
Format settings, GOP : M=3, N=12 
Format settings, picture structure : Frame 
Format settings, wrapping mode : Frame 
Codec ID : 0D01030102046001-0401020201040300 
Duration : 1 h 49 min 
Bit rate mode : Constant 
Bit rate : 50.0 Mb/s 
Width : 1 920 pixels 
Height : 1 080 pixels 
Display aspect ratio : 16:9 
Frame rate : 25.000 FPS 
Standard : Component 
Color space : YUV 
Chroma subsampling : 4:2:2 
Bit depth : 8 bits 
Scan type : Interlaced 
Scan order : Top Field First 
Compression mode : Lossy 
Bits/(Pixel*Frame) : 0.965 
Time code of first frame : 00:00:00:00 
Time code source : Group of pictures header 
GOP, Open/Closed : Open 
GOP, Open/Closed of first frame : Closed 
Stream size : 38.2 GiB (83%) 
Color range : Limited 
Color primaries : BT.709 
Transfer characteristics : BT.709 
Matrix coefficients : BT.709
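
As a sanity check, a couple of the derived figures in the report above can be recomputed from its other fields. This is only a sketch with illustrative variable names; it assumes MediaInfo's Mb/s means 1,000,000 bits per second.

```python
width, height, fps = 1920, 1080, 25
video_bitrate = 50_000_000      # "Bit rate : 50.0 Mb/s"
overall_bitrate = 60_000_000    # "Overall bit rate : 60.0 Mb/s"

# Bits spent per pixel per frame.
bits_per_pixel_frame = video_bitrate / (width * height * fps)

# Share of the file taken by the video stream.
video_share = video_bitrate / overall_bitrate

print(round(bits_per_pixel_frame, 3))
print(f"{video_share:.0%}")
```

Both values agree with the report: 0.965 Bits/(Pixel*Frame), and a video stream that takes roughly 83% of the file.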

and this is the command line from FFMpeg:

Code:

avs2yuv.exe "S:\00_INGEST_MAM\A.R.C.A\02_ALTRO\NR_DJF_AVISYNTH_TEST_SPORT_SD.avs" -csp AUTO -o - | ffmpeg.exe -i - -pix_fmt yuv422p -vcodec mpeg2video 
-s 1920:1080 -aspect 16:9 -vf setfield=tff -flags +ildct+ilme+cgop -b_strategy 0 -mpv_flags +strict_gop -sc_threshold 1000000000 -r 25 
-b:v 50000k -minrate 50000k -maxrate 50000k -bufsize 17825792 -g 12 -bf 2 -profile:v 0 -level:v 2 -color_range 1 -color_primaries 1 -color_trc 1 
-colorspace 1 -y "\\MIBCSSDA001\Media Ingest\filetemporanei\server0\output.mxf"
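
A quick look at what the buffer size in that command amounts to, assuming the value is interpreted in bits, as FFmpeg's -bufsize is documented to be. The variable names are illustrative.

```python
bufsize_bits = 17825792
bitrate_bps = 50_000_000
fps = 25

# The value is exactly 17 * 2^20.
is_17_mebibit = bufsize_bits == 17 * 1024 * 1024

# How much video the buffer holds at the constant bitrate.
buffer_seconds = bufsize_bits / bitrate_bps
buffer_frames = buffer_seconds * fps

print(is_17_mebibit, round(buffer_seconds, 3), round(buffer_frames, 1))
```

So the buffer corresponds to roughly a third of a second, or about nine frames, at 50 Mbit/s. Note that x264's help text lists --vbv-bufsize in kbit, so it may be worth checking whether the same number means the same thing in the x262 command lines earlier in this post.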

Can you help me out a bit here?
I mean, can we try to support XDCAM-50 properly this time?
Let's be honest, aside from professional formats like XDCAM and IMX, there's not much use for MPEG-2 nowadays.
On top of that, I think x262 should not encode in H.264. I think it should be MPEG-2 only with the same features as x264 but that's it. This way, if we get that MPEG-2 only and XDCAM/IMX compliant, then I'm pretty sure it can be used professionally and it could even be included in FFMpeg as libx262!

Quote:

Originally Posted by ifb

I sure hope you used --tune ssim/psnr.

Wouldn't that be cheating? ehehehehe
I mean the goal was to offer a real life scenario, but yeah I guess picking them would have achieved a higher score, definitely eheheheh

Emulgator · 14th July 2021, 15:26

Still there is improvement room to pull closer to CCE, DVDs are still needed.

FranceBB · 14th July 2021, 15:43

Quote:

Originally Posted by Emulgator

Still there is improvement room to pull closer to CCE, DVDs are still needed.

True. But making a DVD compliant stream is also professional in some sense, so yeah.

Anyway, if ifb gets XDCAM right I'm gonna be happy xD

benwaggoner · 14th July 2021, 17:23

What's the goal in using objective metrics here? Particularly for dithering, which is absolutely a subjective optimization.

The "best" encoder is the one that delivers the best subjective results in double-blind comparisons. Objective metrics are an okay 1st order approximation of some things, but not for things like how different dithering modes impact subjective quality of the final output. For MPEG-2, a dithering mode that looks better pre-compression might wind up compressing less, and any potential quality gain is eaten up by artifacts from encoding requiring a higher QP.

FranceBB · 14th July 2021, 19:34

Yes, which is why I tested to see whether dithering was going to affect quality in terms of objective metrics or not, and it did, negatively, so in the end the one I evaluated was the one obtained via truncation, which is the one you see in the charts

benwaggoner · 15th July 2021, 01:23

Quote:

Originally Posted by FranceBB

Yes, which is why I tested to see whether dithering was going to affect quality in terms of objective metrics or not, and it did, negatively, so in the end the one I evaluated was the one obtained via truncation, which is the one you see in the charts

But truncation looks bad, and would never be used in real-world DVD encoding.

There's a lot of secret sauce in dithering for 8-bit codecs, particularly as simple a one as MPEG-2.

ifb · 21st July 2021, 15:28

Quote:

Originally Posted by FranceBB

The MPEG-2 levels are:
LL Low Level
ML Main Level 2
H-14 High 1440
HL High Level

we need the last one.

There are five levels supported (you omitted HighP). Using

Code:

--output-csp i422 --level high

reports

Code:

x262 [info]: 4:2:2 profile @ High level

so I don't see the problem. It's the same behavior as x264. The minimal level is assumed and you can force High if needed (i.e. with SD resolutions).

Quote:

Code:

--keyint 12 --bframes 2

are meant to be used (I think) so that the GOP is:

I don't know that a fixed GOP pattern is strictly required for XDCAM, but a fixed M,N pattern is achieved the same as with x264. Use --no-scenecut.

Quote:

Can you help me out a bit here?
I mean, can we try to support XDCAM-50 properly this time?

You're implying some sort of failure, but XDCAM was never a goal. 4:2:2 was added just because I could. Interlacing was added (and paid for) by a company that needed it. Extending that to 4:2:2 is outside of my expertise and interest.

Quote:

On top of that, I think x262 should not encode in H.264. I think it should be MPEG-2 only with the same features as x264 but that's it. This way, if we get that MPEG-2 only and XDCAM/IMX compliant, then I'm pretty sure it can be used professionally and it could even be included in FFMpeg as libx262!

x262 IS x264. It's a patch that would have been merged into mainline x264. The whole genius/stupid troll is that you can dumb down an AVC encoder and have it spit out MPEG-2 (even MPEG-1) with minimal effort. Unless and until there is some requirement/feature that would negatively affect the main AVC codebase, separating them makes little sense. They are 95% the same code.

The x262 repository hasn't been updated largely because of the x264 history rewrite. I'm not a git-filter-branch expert enough to fix it.

Quote:

Wouldn't that be cheating? ehehehehe
I mean the goal was to offer a real life scenario, but yeah I guess picking them would have achieved a higher score, definitely eheheheh

It's not cheating at all. The entire point of those tunes is to not cripple x264 when doing codec comparisons. You're using metrics that the AQ and psy-opts in x264/x262 bias against.

A "real life scenario" uses your eyes, not PSNR/SSIM.

FranceBB · 21st July 2021, 22:21

Quote:

Originally Posted by ifb

Extending that to 4:2:2 is outside of my expertise and interest.

That's really a shame...

If you ever feel like looking into this, lots of people will definitely appreciate it, I'm sure.

Quote:

Originally Posted by ifb

x262 IS x264. It's a patch that would have been merged into mainline x264. The whole genius/stupid troll is that you can dumb down an AVC encoder and have it spit out MPEG-2 (even MPEG-1) with minimal effort. Unless and until there is some requirement/feature that would negatively affect the main AVC codebase, separating them makes little sense. They are 95% the same code.

Well, if it will actually be merged into x264, then that's even better.

Quote:

Originally Posted by ifb

The x262 repository hasn't been updated largely because of the x264 history rewrite. I'm not a git-filter-branch expert enough to fix it.

Ah, I see.

Quote:

Originally Posted by ifb

It's not cheating at all. The entire point of those tunes is to not cripple x264 when doing codec comparisons. You're using metrics that the AQ and psy-opts in x264/x262 bias against.

Ah, right, due to the psychovisual optimization. In this case I can try to repeat the tests with --tune ssim

Quote:

Originally Posted by ifb

I don't know that a fixed GOP pattern is strictly required for XDCAM, but a fixed M,N pattern is achieved the same as with x264.

Ok, so I guess

Code:

--no-scenecut --keyint 12 --bframes 2

should give me:

Code:

pict_type=I
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B

right?

So I guess the final command line would be:

Code:

x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --level high --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 
--keyint 12 --bframes 2 --no-scenecut --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component 
--nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"

pause

and then use --tune ssim for the comparison.

I'll test again and let you know if it works correctly but I have faith it will :P

The only "sad" thing is that it's gonna be progressive only...

If you'll ever introduce interlace encode in 4:2:2 please let me know.
I also hope that one day or another your patch will be merged into the mainline x264 repository at this point, although, if it hasn't been merged after so many years, I doubt it will ever be...

By the way, just let me say that what you've done was brilliant; I mean, using the x264 features to encode an MPEG-2 stream is very interesting and it's a shame that it hasn't been used by more broadcasters...

pandy · 22nd July 2021, 17:13

I have one question: is there any justification for not testing ordered dither alongside (or as an alternative to) "noise"-like dither?
FS is a "noise"-type dither that unavoidably increases entropy, so sources coded with a lossy encoder may deliver suboptimal results.

ifb · 22nd July 2021, 18:01

Quote:

Originally Posted by FranceBB

Ok, so I guess

Code:

--no-scenecut --keyint 12 --bframes 2

should give me:

Code:

pict_type=I
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B
pict_type=P
pict_type=B
pict_type=B

right?

I didn't test it, but I think so.

Quote:

So I guess the final command line would be:

Code:

x262_64.exe "AVS Script.avs" --mpeg2 --preset medium --level high --profile 422 --bitrate 50000 --vbv-maxrate 50000 --vbv-bufsize 17825792 
--keyint 12 --bframes 2 --no-scenecut --deblock -1:-1 --overscan show --colormatrix bt709 --range tv --transfer bt709 --colorprim bt709 --videoformat component 
--nal-hrd cbr --output-csp i422 --output "\\mibctvan000\Ingest\MEDIA\temp\raw_video.h262"

pause

--deblock, --overscan, --range do nothing in MPEG-2
--profile 422 isn't needed with --output-csp i422
--level high isn't needed either if you are encoding HD resolutions

Maybe add --open-gop if XDCAM allows it. Short keyints are the worst and that helps some.

Quote:

If you'll ever introduce interlace encode in 4:2:2 please let me know.
I also hope that one day or another your patch will be merged into the mainline x264 repository at this point, although, if it hasn't been merged after so many years, I doubt it will ever be...

x264 development is in different hands than when this started, so that's a big part of the issue.

Quote:

By the way, just let me say that what you've done was brilliant;

kierank did a lot, so I can't take all the credit (or even most of it). There were rumors about xvp8 being written, plus I wanted something I could use for ATSC, thus x262 was born. That it turned out kinda OK is a pleasant surprise. The very informal eyeball comparisons I did using park_joy were favorable, IMHO.

Code:

2010-10-09 22:28:47 < kierank> awww BBB's not here
2010-10-09 22:28:57 < Dark_Shikari> what do you need to harass him for?
2010-10-09 22:29:09 < kierank> the fact that x262 is making progress as opposed to xvp8
2010-10-09 22:29:15 < ifb> lol
2010-10-09 22:29:45 < Dark_Shikari> hell yes
2010-10-09 22:29:53 < Dark_Shikari> \o/
2010-10-09 22:29:59 < Dark_Shikari> Now, what you need to do
2010-10-09 22:30:02 < Dark_Shikari> is show x262 beating libvpx
2010-10-09 22:31:55 < Dark_Shikari> troll troll troll troll troll
2010-10-09 22:32:08 < Dark_Shikari> we can really troll the shit out of michael with this one too
2010-10-09 22:32:12 < Dark_Shikari> "lol x264 is a better ffmpeg than ffmpeg"

benwaggoner · 23rd July 2021, 04:57

Quote:

Originally Posted by pandy

I have one question: is there any justification for not testing ordered dither alongside (or as an alternative to) "noise"-like dither?
FS is a "noise"-type dither that unavoidably increases entropy, so sources coded with a lossy encoder may deliver suboptimal results.

Patterned dithers are more efficient with GIF and PNG as they result in similar pixel sequences; better for lossless RGB entropy codecs. But they're no help for DCT codecs like MPEG-2. High quality 8-bit movie encoders are really dependent on high quality, highly tuned dithering. Tools like xscaler are/were used and different modes could be used for different scenes.

Really, metrics should be calculated comparing the dithered input to the encoder in the final and the encoder output. A subjective analysis comparing different dithering modes is interesting, but objective metrics just aren't going to be helpful.

Dithering is a fascinating area of study that I'm hoping to purge from my memory as we enter the 10+ bit era.

But heck, it was probably 20 years ago I first said "thank goodness I'll never have to do an Animated GIF again!"

pandy · 2nd August 2021, 20:59

Quote:

Originally Posted by benwaggoner

Patterned dithers are more efficient with GIF and PNG as they result in similar pixel sequences; better for lossless RGB entropy codecs. But they're no help for DCT codecs like MPEG-2. High quality 8-bit movie encoders are really dependent on high quality, highly tuned dithering. Tools like xscaler are/were used and different modes could be used for different scenes.

Don't get me wrong, but a dither like FS is nothing but stress for the encoder (and it is usually filtered out if possible).

Also, the temporal characteristics of FS make this kind of dither impossible to compress.
From my perspective FS is like noise, i.e. suboptimal from both a spatial and a temporal perspective.

Besides, here is a clear demonstration of how FS behaves in video: take the same video before compression, modify it by placing a single static pixel at a fixed position (so nothing changes from a temporal perspective), apply FS to both videos, subtract the original from the modified one and compare the spatial residue. Then, as a second experiment, modify a single pixel at a different position in each frame, subtract the reference from the modified video and analyze the residue in the spatial and temporal domains.

Perhaps I'm a bit naive, but from my perspective the residue is noise, and noise is extremely difficult to compress, especially for a lossy encoder...

Perhaps my example is too simplistic, but a similar effect can be achieved simply by adding filtered noise (blue?) to, let's say, the Luma channel.

Also, instead of FS, some refined ordered dither such as Ulichney's could be better, at least from a temporal perspective.
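
The first experiment can be sketched on a single small frame. This is only an illustration under stated assumptions: a flat 8x8 10-bit patch stands in for the video, samples map to 8 bits by a division by 4, the scan is plain left-to-right, and all names are made up for the example.

```python
def truncate_10_to_8(frame):
    """Drop the two least significant bits of every 10-bit sample."""
    return [[v >> 2 for v in row] for row in frame]


def floyd_steinberg_10_to_8(frame):
    """Quantize 10-bit samples to 8 bits with Floyd-Steinberg error diffusion."""
    height, width = len(frame), len(frame[0])
    work = [[v / 4.0 for v in row] for row in frame]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = min(255, max(0, int(old + 0.5)))
            out[y][x] = new
            err = old - new
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def changed_pixels(a, b):
    """Coordinates (row, column) where two frames differ."""
    return [(y, x) for y in range(len(a)) for x in range(len(a[0]))
            if a[y][x] != b[y][x]]


base = [[514] * 8 for _ in range(8)]
modified = [row[:] for row in base]
modified[0][0] = 1020  # the single altered pixel

diff_truncated = changed_pixels(truncate_10_to_8(base), truncate_10_to_8(modified))
diff_dithered = changed_pixels(floyd_steinberg_10_to_8(base),
                               floyd_steinberg_10_to_8(modified))
print("truncation:", len(diff_truncated), "pixel(s) changed")
print("FS dither: ", len(diff_dithered), "pixel(s) changed")
```

With truncation only the altered pixel differs. With FS the difference spreads to pixels that were identical in the source, because the diffused error changes. Moving the altered pixel from frame to frame would turn that spread into a temporal residue as well.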

Quote:

Originally Posted by benwaggoner

Really, metrics should be calculated comparing the dithered input to the encoder in the final and the encoder output. A subjective analysis comparing different dithering modes is interesting, but objective metrics just aren't going to be helpful.

It depends on your goal. If it is "objective", then definitely yes, apples should always be compared with apples; if it is subjective, then, provided you define the scope properly, apples can be compared with, for example, pineapples.
And you always need to clearly define the goal and the methodology.

Quote:

Originally Posted by benwaggoner

Dithering is a fascinating area of study that I'm hoping to purge from my memory as we enter the 10+ bit era.

But heck, it was probably 20 years ago I first said "thank goodness I'll never have to do an Animated GIF again!"

Dithering is unavoidable: whenever quantization is involved, dithering (ideally with psychovisually matched noise shaping/error shaping) is mandatory. 10 bit solves some problems, and of course modern display technology is quickly reaching a level that will be enough for the average consumer, but human eyes are capable of way more than 10 bits (somewhere between 12 and 14 bits, depending on conditions and context).
It's the same with ultra high frame rates: 300...600 frames per second will be key to getting full immersion....

But my question was triggered by FS dithering: in my experience FS dither raises QP dramatically and literally steals bits from video details.

Blue_MiSfit · 31st August 2021, 02:58

Any suggestions on tuning ffmpeg's mpeg-2 encoder or x262 for quality? I'm happy to burn as much CPU time as possible!

I'm targeting 6 Mbps for 720p59.94. Aggressive? ... Yes

GMJCZP · 1st September 2021, 01:33

Quote:

Originally Posted by Blue_MiSfit

Any suggestions on tuning ffmpeg's mpeg-2 encoder or x262 for quality? I'm happy to burn as much CPU time as possible!

I'm targeting 6 Mbps for 720p59.94. Aggressive? ... Yes

Please search in my Arsenal the script for DVD with FFMPEG.

EDIT: Ok, I see, this is another class of encoders, my comment is not exactly applicable here.

15th July 2019, 22:18	#1 \| Link
FranceBB Broadcast Encoder Join Date: Nov 2013 Location: Royal Borough of Kensington & Chelsea, UK Posts: 2,905	VQ Test for MPEG-2 Encoders VQ Test for MPEG-2 Encoders Broadcast Encoder : Francesco Bucciantini (FranceBB) Senior Video Editor : Livio Aloja (algia) The files analysed are named “Test4” as we did several tests and they are aimed to encode files from lossless masters for internal usage as mezzanine files. 1) Input file and encoding target 2) The impact of Dithering on objective metrics 3) Comparison between encoders 4) Results and final thoughts 5) Bibliography 1) Input file and encoding target For this test, the original masterfile is an Apple ProRes, FULL HD (1920x1080), 10bit, 4:2:2 planar BT709 Limited TV Range with both progressive and interlaced contents at 25fps. The target is an XDCAM50 lossy mezzanine file for broadcast usage, which is an MPEG-2, FULL HD, 50Mbit/s, 8bit, 4:2:2 planar (yv16), BT709 Limited TV Range, closed GOP, with both progressive (flagged as interlaced) and interlaced contents at 25fps. The test reel has different types of contents to test how encoders behave. 2) The impact of Dithering on objective metrics Internally, whenever we get an high bit depth source, we apply Dithering in order to avoid to introduce banding while bringing it to 8bit. In particular, we use the Floyd-Steinberg error diffusion. The algorithm achieves dithering using error diffusion, meaning it pushes (adds) the residual quantization error of a pixel onto its neighboring pixels, to be dealt with later. It spreads the debt out according to the distribution (shown as a map of the neighboring pixels): The pixel indicated with a star indicates the pixel currently being scanned, and the blank pixels are the previously-scanned pixels. The algorithm scans the image from left to right, top to bottom, quantizing pixel values one by one. 
Each time the quantization error is transferred to the neighboring pixels, while not affecting the pixels that already have been quantized. Hence, if a number of pixels have been rounded downwards, it becomes more likely that the next pixel is rounded upwards, such that on average, the quantization error is close to zero. The original lossless masterfile for this test is 10bit, but the encoded file has to be 8bit due to the XDCAM specifications, so we were wondering whether Dithering has a positive or a negative impact on objective metrics compared to truncation. We tried to run some internal tests with three different types of dithering: – Serpentine Floyd-Steinberg error diffusion – Stucki error diffusion – Atkinson error diffusion When we compared each test, we noticed that the Serpentine Floyd-Steinberg error diffusion is a well-balanced algoritm (which confirms the reason why we use it internally), the Stucki error diffusion looks “sharp” and preserve light edges and details well and the Atkinson error diffusion generates distinct patterns but keeps clean the flat areas. Unfortunately, though, even if they look “better” to the human eye, this is strictly subjective, as they only look “different”. As a matter of fact, on both SSIM and PSNR, each dithering method has got a lower score compared to truncation. The interesting fact, though, is that it didn't get a lower score in every single frame, as there are a very few frames in which dithering algorithms managed to get an higher value compared to truncation, but overall truncation outperformed dithering algorithms by 1.51% in SSIM and 0.3% in PSNR, that's why we decided not to include Dithering as reference. 3) Comparison between encoders In this part, we are going to compare the following encoders: Ateme, AWS, Selenio, Telestream and x262. For the reason already explained above, the file encoded with x262 has been encoded without any dithering algorithms and just using truncation. 
The first graph represents how the different encoders behave during the whole video. Since SSIM goes from 0 to 1 with many digits after the 0, we re-scaled it in order to make it more human readable. From the tests, AWS performed better than other encoders, with a score of 289921, followed by Ateme by a very narrow margin (289639). At the third position, with a rather significant quality drop, we have Telestream with 281010, followed by Selenio with 279577. At the bottom, we have x262 which scored 276854 and which is outperformed by 4.51%. PSNR pretty much confirms what is shown by SSIM. AWS performed better than all the other encoders and scored 733272, followed by Ateme by a narrow margin with 731639. At the third position, there's Telestream with 729195, followed by Selenio which scored 722449. At the bottom, there's x262 with a total score of 720755. According to PSNR, though, Selenio is closer to the quality reached by x262 rather than the one reached by Telestream. SSIM Individual Charts (From best to worse): PSNR Individual Charts (From best to worse): 4) Results and final thoughts AWS managed to achieve a better score compared to all the other encoders, but its advantage is only because it had a slightly higher spike on a few scenes, while overall it had pretty much the same performance as Ateme. In particular, grain retention was pretty much fine on both, but when random noise recorded by the camera came into the equation, AWS managed to handle it slightly better than Ateme, but again, overall, they performed pretty much the same, that's why the margin was really narrow. At the third place, Telestream performed worse compared to Ateme by a not-so-high margin, but still, it was worse. Even though Telestream performed worse, it's still closer to the upper part of the chart rather than to the lower part of the chart, however it didn't quite manage to get to the same level of Ateme on too many scenes, that's why it ended up being third. 
In fourth position there's Selenio, which performed significantly worse than AWS and Ateme, and worse than Telestream by a still significant margin.

At the bottom of the table, there's x262, which apparently is the worst MPEG-2 encoder among those tested at such a high bitrate. Its performance was pretty low overall and, on top of that, it struggled to encode sport properly. On the other hand, x262 is free and open source, while all the other encoders are closed source, need to be purchased at a very high cost, and don't support Avisynth input, so we would still choose x262 over them. We're already using x262 and we're not planning to change anytime soon.

5) Bibliography

– Visgraf (Vision and Graphic Laboratory) Mathematics, FS algorithm
– Proceedings of the Society of Information Display (adaptive algorithm)

__________________
LUT Collection FFAStrans Videotek - AAA - SafeColorLimiter

Last edited by FranceBB; 15th July 2019 at 22:20.

17th July 2019, 00:27	#3 \| Link
Blue_MiSfit Derek Prestegard IRL Join Date: Nov 2003 Location: Los Angeles Posts: 5,989	Fascinating. I was always impressed with Elemental's MPEG-2 encoder, so I'm not surprised to see it doing so well here. Particularly when doing low-bitrate 15 Mbps CableLabs-compliant 1080i, it absolutely crushed my go-to at that point: Harmonic ProMedia Carbon, aka Rhozet Carbon Coder. Any particular reason you left Harmonic out of the mix? Last edited by Blue_MiSfit; 17th July 2019 at 06:33.

14th July 2021, 15:26	#8 \| Link
Emulgator Big Bit Savings Now ! Join Date: Feb 2007 Location: close to the wall Posts: 1,546	There is still room for improvement to pull closer to CCE; DVDs are still needed. __________________ "To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain) "Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."

14th July 2021, 17:23	#10 \| Link
benwaggoner Moderator Join Date: Jan 2006 Location: Portland, OR Posts: 4,771	What's the goal in using objective metrics here? Particularly for dithering, which is absolutely a subjective optimization. The "best" encoder is the one that delivers the best subjective results in double-blind comparisons. Objective metrics are an okay 1st order approximation of some things, but not for things like how different dithering modes impact subjective quality of the final output. For MPEG-2, a dithering mode that looks better pre-compression might wind up compressing less, and any potential quality gain is eaten up by artifacts from encoding requiring a higher QP. __________________ Ben Waggoner Principal Video Specialist, Amazon Prime Video My Compression Book

14th July 2021, 19:34	#11 \| Link
FranceBB Broadcast Encoder Join Date: Nov 2013 Location: Royal Borough of Kensington & Chelsea, UK Posts: 2,905	Yes, which is why I tested whether dithering was going to affect quality in terms of objective metrics, and it did, negatively. So in the end, the encode I evaluated was the one obtained via truncation, which is the one you see in the charts. __________________ LUT Collection FFAStrans Videotek - AAA - SafeColorLimiter

16th July 2019, 19:07	#2 \| Link
kolak Registered User Join Date: Nov 2004 Location: Poland Posts: 2,843	AWS is an ex-Elemental encoder, no? I also found FS dithering to be the best, but I also kept adding a tiny amount of noise (this was for Blu-ray encoding, though).

17th July 2019, 11:03	#4 \| Link
kolak Registered User Join Date: Nov 2004 Location: Poland Posts: 2,843	Is it that good? Rhozet was a reference for interlaced content for me.

18th July 2019, 02:46	#5 \| Link
Blue_MiSfit Derek Prestegard IRL Join Date: Nov 2003 Location: Los Angeles Posts: 5,989	Rhozet was fine, and WAY better than Digital Rapids (now Imagine's Selenio product line), but at low bitrates (15 Mbps for 1080i) it had a lot of blocking that totally went away with Elemental. Plus, the latter was WAY faster. Gosh, I haven't thought about this in a while. It's been years since I've done any MPEG-2 encoding.

14th July 2021, 02:15	#6 \| Link
ifb Registered User Join Date: Dec 2009 Posts: 72	Do you remember the x262 commandline? I sure hope you used --tune ssim/psnr. I should probably check in here more often. This thread is 2 years old and I haven't seriously worked on x262 in 8 years. I'm glad people still find it somewhat useful, though.

22nd July 2021, 17:13	#15 \| Link
pandy Registered User Join Date: Mar 2006 Posts: 1,049	I have one question: is there any reason not to test ordered dither alongside (or as an alternative to) "noise"-like dither? FS is a "noise"-type dither, and it unavoidably increases entropy, so sources coded with a lossy encoder may deliver suboptimal results.

31st August 2021, 02:58	#19 \| Link
Blue_MiSfit Derek Prestegard IRL Join Date: Nov 2003 Location: Los Angeles Posts: 5,989	Any suggestions on tuning ffmpeg's mpeg-2 encoder or x262 for quality? I'm happy to burn as much CPU time as possible! I'm targeting 6 Mbps for 720p59.94. Aggressive? ... Yes __________________ These are all my personal statements, not those of my employer :) Last edited by Blue_MiSfit; 31st August 2021 at 05:11.