Finally got around to re-testing the Crowd Run 2160/50p x264 series that I kept from earlier tests with v3:
https://forum.doom9.org/showthread.p...16#post1865316
So this was testing with VapourSynth VMAF v4 in Model=1 mode, which uses vmaf_4k_v0.6.1 by default (CI=False) and vmaf_4k_rb_v0.6.2 when CI=True.
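For reference, roughly what such a script looks like. This is only a sketch: apart from the model and ci parameters mentioned above, everything else (the vmaf.VMAF namespace per HolyWu's plugin, the LSMASHSource loader, the log arguments, the filenames) is an assumption and may differ from your setup:

```python
# Minimal sketch, assuming HolyWu's VapourSynth-VMAF plugin exposes
# core.vmaf.VMAF; filenames and log arguments are hypothetical.
import vapoursynth as vs

core = vs.core

ref = core.lsmas.LWLibavSource('CrowdRun_2160p50_source.mkv')  # hypothetical files
dis = core.lsmas.LWLibavSource('CrowdRun_2160p50_x264.mkv')

# model=1 selects the 4K model; ci toggles the confidence-interval
# ('rb') variant, as described above.
scored = core.vmaf.VMAF(ref, dis, model=1, ci=True,
                        log_path='vmaf.json', log_fmt=1)
scored.set_output()
```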
In this case, CI=False and CI=True produced the exact same aggregate VMAF scores, which came as a surprise.
How can that be? The Confidence Interval doc doesn't mention the 4K models specifically, but I would assume the 'rb' in 'vmaf_4k_rb_v0.6.2' means 'residue bootstrapping'. If so, why is residue bootstrapping used to derive CI scores for 4K video, while the CI model for HD/SD (vmaf_b_v0.6.3) uses plain bootstrapping? All rather confusing.
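For what it's worth, the general statistical distinction (no claim about libvmaf's internals): a plain bootstrap resamples the training cases themselves, while a residual bootstrap keeps the predictors fixed, resamples the residuals of a fitted model, and rebuilds synthetic responses from them. A minimal sketch on toy data, all names hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + rng.normal(0, 0.1, x.size)  # toy data, hypothetical

def fit_slope(x, y):
    # least-squares slope through the origin, just for illustration
    return (x @ y) / (x @ x)

slope = fit_slope(x, y)
resid = y - slope * x

plain_slopes, resid_slopes = [], []
for _ in range(1000):
    # plain bootstrap: resample (x, y) pairs with replacement
    i = rng.integers(0, x.size, x.size)
    plain_slopes.append(fit_slope(x[i], y[i]))

    # residual bootstrap: keep x fixed, resample residuals,
    # rebuild y* from the fitted model plus resampled residuals
    r = rng.choice(resid, x.size, replace=True)
    resid_slopes.append(fit_slope(x, slope * x + r))

print(np.percentile(plain_slopes, [2.5, 97.5]))
print(np.percentile(resid_slopes, [2.5, 97.5]))
```

Both loops produce a 95% interval for the slope; the only difference is what gets resampled.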