1st March 2019, 15:19 | #61 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Quote:
Bear in mind though that all of those tests were done on a single source. Any number of factors could be weighing in there. What scores do you get if you test the ref and x265 clips against themselves?
__________________
Nostalgia's not what it used to be Last edited by WorBry; 1st March 2019 at 15:57. |
|
1st March 2019, 16:43 | #62 | Link | |
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
I'm not sure how valid that test would be. EXR is usually linear sRGB and 16-bit half float. For any metric you usually need a common ground to compare. This means the same pixel format (same colorspace, same bit depth, same chroma subsampling). Otherwise you introduce other variables that are not controlled for, e.g. if one run uses one scaling algorithm and another run uses a different one (bicubic vs. bilinear, etc.), or one dithers down using some algorithm but another does not, or if you convert to RGB using a different matrix. There are many factors that can invalidate your testing. |
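As a rough illustration of the "common ground" point, here is a minimal Python sketch that lists which pixel-format properties differ between two clips before any metric is run. The `ClipFormat` class and its field names are hypothetical and are not part of any VMAF, ffmpeg or VapourSynth API; the two example formats follow the EXR and x265 12-bit 444 clips discussed in this thread.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClipFormat:
    # Hypothetical descriptor; the field names are illustrative only.
    color_family: str     # e.g. "RGB" or "YUV"
    bits_per_sample: int  # e.g. 8, 10, 12, 16
    sample_type: str      # "integer" or "float"
    subsampling: str      # e.g. "444", "422", "420"


def format_mismatches(ref: ClipFormat, test: ClipFormat) -> list[str]:
    """Return the names of the properties that differ between two clips."""
    return [f.name for f in fields(ClipFormat)
            if getattr(ref, f.name) != getattr(test, f.name)]


# 16-bit half-float RGB reference vs. a 12-bit integer YUV 4:4:4 encode.
exr_ref = ClipFormat("RGB", 16, "float", "444")
x265_enc = ClipFormat("YUV", 12, "integer", "444")
print(format_mismatches(exr_ref, x265_enc))
```

Every property reported here implies a conversion somewhere in the chain, and each conversion is one more variable to control for.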
|
1st March 2019, 22:12 | #63 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Not to mention the potential for frame shifts/misalignment when using different decoders for the reference and test clips, although the filter will report an error if the number of frames is different.
Also needs to be appreciated that the VMAF models are 'trained' for predicting perceptual quality at streaming bitrates primarily. Came across this quote from Netflix: Quote:
I would assume a similar focus was applied in training the 4K model. So at CRF 10 you are well into uncharted territory. Personally, I'd be more inclined to look at other metrics available for VapourSynth that are (maybe) better attuned for VQA in the visually lossless domain - GMSD, MDSI and yes, SSIM....Butteraugli, possibly. My own journey of discovery in that vein continues: https://forum.doom9.org/showthread.php?t=176101
__________________
Nostalgia's not what it used to be Last edited by WorBry; 1st March 2019 at 22:19. |
|
1st March 2019, 22:42 | #64 | Link | |
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
The 16-bit EXR ref clip is 400 frames.
EXR ref, control tested against itself via 0.6.1: 98.2
x265 12-bit 444 encode (from EXR), control tested against itself via 0.6.1: 98.08
x265 12-bit 444 encode (from EXR), tested against its source (16-bit EXR) via 0.6.1: 96.4
Interesting that the clip you tested had 399 out of 400 frames a perfect 100 in the control test... |
|
1st March 2019, 22:51 | #65 | Link | |
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
For streaming, the movie was encoded via ffmpeg and x265 from EXR (rgb48le) to x265 12-bit 444 (yuv444p12le). The final result looks very good. We're using VMAF to compare various encodes (presets/CRF/etc.) against each other, exactly as NF does it w/ VMAF. The master one delivers to NF is obviously also not 8-bit 420; they encode from that (high quality) master for NF streaming. So the "common ground" you state is the same movie in the same resolution, which is the only thing that NF states in their VMAF instructions. The whole reason for the comparison is different output bit depth w/ different output chroma subsampling, on top of different encoding settings, so I do not understand your point... Last edited by Iron_Mike; 1st March 2019 at 22:53. |
|
1st March 2019, 22:59 | #66 | Link | |
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
ffmpeg reads an EXR frame, then reads a frame from the mp4 encode (which was done from the EXR via ffmpeg), and then passes the decoded frames to VMAF for comparison. Is there a setting to avoid frame shifts/misalignment? You yourself validated the test clip up to CRF 0, and the results on the charts make sense: your encodes went up to VMAF 100. Not sure I understand your concern about CRF 10? Thanks Last edited by Iron_Mike; 1st March 2019 at 23:07. |
|
1st March 2019, 23:09 | #67 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Quote:
Go through the list of per-frame VMAF scores from your 'self' tests and you'll be able to identify which frames are skewing the aggregate score.
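A minimal Python sketch of that check, assuming the per-frame scores have already been read into a plain list (how they are extracted from the log is not shown). The function name, the threshold and the sample scores are hypothetical.

```python
def low_scoring_frames(scores, threshold=99.0):
    """Return (frame_index, score) pairs whose score falls below threshold."""
    return [(i, s) for i, s in enumerate(scores) if s < threshold]


# Hypothetical per-frame scores from a clip compared against itself:
# every frame scores 100 except the first.
self_test = [97.0] + [100.0] * 9
print(low_scoring_frames(self_test))
```

If only a handful of frames show up, those are the ones pulling the aggregate down; if nearly every frame shows up, the problem is more likely a format or alignment issue.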
__________________
Nostalgia's not what it used to be |
|
1st March 2019, 23:29 | #68 | Link | ||||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
You are posting in the vapoursynth vmaf thread. Only certain pixel formats are supported. https://github.com/HomeOfVapourSynth...ourSynth-VMAF/ Quote:
My point is to strive to be more scientific: to eliminate all those confounding variables in a controlled environment. How you perform the various conversions will affect the results that are calculated. But now it's clear you're using ffmpeg vmaf. Did you look at the ffmpeg log to see what other conversions were occurring? There might be other stuff going on behind your back. Quote:
For VapourSynth, the source filter can be indexed, which is a more robust method for frame accuracy. For ffmpeg you can reset the PTS, which might help.
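As a sketch of the PTS reset for the ffmpeg route, the Python helper below only builds the argument list and does not run anything. It assumes an ffmpeg build with the `libvmaf` filter enabled; the helper name and the file names are placeholders, and the filter is left at its defaults. The distorted clip is given as the first input and the reference as the second.

```python
import shlex


def vmaf_command(distorted: str, reference: str) -> list[str]:
    """Build an ffmpeg argument list that resets PTS on both inputs before libvmaf."""
    filtergraph = (
        "[0:v]setpts=PTS-STARTPTS[dist];"  # distorted clip: timestamps start at 0
        "[1:v]setpts=PTS-STARTPTS[ref];"   # reference clip: timestamps start at 0
        "[dist][ref]libvmaf"
    )
    return ["ffmpeg", "-i", distorted, "-i", reference,
            "-lavfi", filtergraph, "-f", "null", "-"]


# Placeholder file names, not the clips from this thread.
cmd = vmaf_command("encode.mp4", "reference.mov")
print(shlex.join(cmd))
```

Resetting the timestamps only lines up the start of both streams. It does not help if one decoder drops or duplicates frames, so checking the frame counts is still worthwhile.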
Last edited by poisondeathray; 1st March 2019 at 23:34. |
||||
1st March 2019, 23:30 | #69 | Link | |
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
This all seems normal considering that NF themselves state that in their FAQs but when I saw you get a perfect 100 in 499/500 frames I thought maybe they've updated their model to make control tests perform close to 100... |
|
1st March 2019, 23:42 | #71 | Link | ||||
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
Quote:
the VMAF score whether using original bit depth, 12 bit, 10 bit or 8 bit for the main/ref clips was always ~ 96.x (real test, not control) Quote:
Quote:
those results could easily be interpreted that from a certain CRF on, the encode is perceptually identical, which is the whole point of VMAF... their samples are based on humans reporting perceived quality differences... Last edited by Iron_Mike; 1st March 2019 at 23:46. |
||||
1st March 2019, 23:48 | #72 | Link | |
Registered User
Join Date: Dec 2005
Location: Germany
Posts: 1,795
|
Quote:
__________________
AVSRepoGUI // VSRepoGUI - Package Manager for AviSynth // VapourSynth VapourSynth Portable FATPACK || VapourSynth Database |
|
1st March 2019, 23:50 | #73 | Link | |||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
https://github.com/HomeOfVapourSynth...ourSynth-VMAF/ Quote:
Quote:
I personally haven't used VMAF enough to be comfortable with it yet. I personally don't find that particularly useful. I guess it might be good enough for "joe public"; they might not be able to tell the difference. But you can bet that people who deal frequently with encoding, codecs and compression, i.e. people that post here, can tell the difference between, say, a crf 10 and a crf 18 encode. Maybe a conspiracy theory, but it's almost like a Netflix scheme trying to justify their low delivery bitrate practices. Last edited by poisondeathray; 2nd March 2019 at 00:02. |
|||
2nd March 2019, 01:18 | #74 | Link | |
Registered User
Join Date: Jul 2010
Posts: 132
|
Quote:
The problem w/ "scientific metrics" is that they often do not relate much to the HVS (Human Visual System), which is the only thing that matters when humans watch the streamed content... VMAF attempts to address that with their sample data... The questions are always whether enough people were sampled, what kind of people (gender/age/race/ethnicity, diff between European and Asian samples etc.) and whether the sampling procedure was done as well as possible... hah! probably the reason to start the project.. |
|
2nd March 2019, 02:11 | #75 | Link | |||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
Quote:
Yes, pros/cons to every measure , but there are other HVS modelled metrics. Quote:
So another way to phrase it is that the data set is not valid at higher bitrates. You cannot apply VMAF at higher bitrates because it was trained at CRF 22-28 |
|||
2nd March 2019, 02:18 | #76 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Quote:
https://forum.doom9.org/showthread.p...37#post1867137 Makes me wonder. https://www.reddit.com/r/netflix/com...for_hd_titles/
__________________
Nostalgia's not what it used to be Last edited by WorBry; 2nd March 2019 at 02:31. |
|
2nd March 2019, 05:55 | #78 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Cool. Should be interesting to see what statistical significance VMAF gives to those superfine score differences seen at the very high bitrates that I brought attention to earlier, which now, in light of the present discussion, I wish I hadn't
https://forum.doom9.org/showthread.p...24#post1865424 Seeing that comment from Netflix changed my perspective somewhat: Quote:
__________________
Nostalgia's not what it used to be Last edited by WorBry; 2nd March 2019 at 07:51. |
|
4th March 2019, 07:08 | #79 | Link | |
Registered User
Join Date: Jan 2004
Location: Here, there and everywhere
Posts: 1,197
|
Quote:
https://forum.doom9.org/showthread.p...70#post1864770 Here are the VMAF results, together with the aggregate 95% confidence interval (CI95_Low and CI95_High) scores, i.e. the aggregate derived from the individual frame confidence intervals. I didn't generate the component SSIM, MS-SSIM and PSNR scores.
First thing to note is that the VMAF v4 scores are lower than the scores I obtained previously (with the exact same x264 encodes) with v3. The same default pool=1 (harmonic mean) setting was applied in both cases, so I can only assume this reflects changes in the VMAF model itself. And homed in on the higher bitrate range.
As noted in the v3 test series, the VMAF score for the lossless x264 CRF=0 encode (99.9954) didn't quite reach 100, and for the same reason: the component motion2=0 score for the first frame skewed an otherwise perfect 100 score for the other 499 frames.
I've yet to test the parallel x265 series with VMAF v4, but looking at the aggregate CI scores obtained with the x264 files I think I can confidently say that what minor differences were seen at the high bitrates in the first test series are not statistically significant. Seems odd though that the CI95_Low intervals for the CRF 22 - 30 encodes are actually smaller than those of CRF 20 despite being beyond the scope of the trained vmaf_v0.6.1.pkl model. Would have thought they would be larger. I suppose it depends on the content and quality of the source/reference video also.
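To illustrate how a single lower first frame keeps a pooled score just under 100, here is a small Python sketch comparing arithmetic and harmonic pooling. The +1 offset in `harmonic_pool` is an assumption about how harmonic pooling is usually implemented (it avoids division by zero for a score of 0), and the first-frame value of 97.0 is hypothetical, not taken from the results above.

```python
from statistics import fmean


def harmonic_pool(scores):
    """Harmonic mean with a +1 offset so that a score of 0 cannot divide by zero."""
    return len(scores) / sum(1.0 / (s + 1.0) for s in scores) - 1.0


# Hypothetical 500-frame lossless result: first frame lower, the other 499 at 100.
scores = [97.0] + [100.0] * 499
print(f"arithmetic mean: {fmean(scores):.4f}")
print(f"harmonic mean:   {harmonic_pool(scores):.4f}")
```

With either pooling method the one outlier frame is enough to stop the aggregate reaching 100, and the harmonic mean always lands at or below the arithmetic mean because it weights low scores more heavily.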
__________________
Nostalgia's not what it used to be Last edited by WorBry; 4th March 2019 at 17:42. |
|
4th March 2019, 07:48 | #80 | Link |
Registered User
Join Date: Jul 2015
Posts: 697
|
I did my VMAF test on BPG files.
VMAF is already embedded in the SVT encoder, not as a JSON file tester. Pictures for I-frames are better because they have a larger size at the same QP values for different encoders. So much for the topic of photos. Ma should add the x265 codec with the VMAF metric. http://forum.doom9.org/showthread.ph...19#post1867419 Last edited by Jamaika; 4th March 2019 at 07:50. |