Once deinterlaced, the 4:1:1 -> 4:2:0 -> 4:2:2 chain isn't an important transition as long as some chroma fidelity remains. Try it and see how little difference there is. 4:1:1 is already the lowest quality of the three (chroma is sampled at only a quarter of the luma width), so mixing in neighboring pixels usually doesn't change much.
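To put rough numbers on that, here is a minimal Python sketch (not part of any filter chain) that computes the size of one chroma plane under each subsampling scheme. The 720x480 frame size is an assumption (typical NTSC DV), and `chroma_plane_size` is just a hypothetical helper name for illustration.

```python
# Chroma subsampling divisors relative to the luma plane:
# (horizontal divisor, vertical divisor)
SUBSAMPLING = {
    "4:1:1": (4, 1),  # quarter width, full height
    "4:2:0": (2, 2),  # half width, half height
    "4:2:2": (2, 1),  # half width, full height
}


def chroma_plane_size(width, height, scheme):
    """Return (chroma_width, chroma_height) of one chroma plane."""
    h_div, v_div = SUBSAMPLING[scheme]
    return width // h_div, height // v_div


if __name__ == "__main__":
    # Assumed frame size: 720x480 (NTSC DV)
    for scheme in ("4:1:1", "4:2:0", "4:2:2"):
        cw, ch = chroma_plane_size(720, 480, scheme)
        print(scheme, cw, "x", ch, "=", cw * ch, "chroma samples per plane")
```

For the assumed 720x480 frame, 4:1:1 and 4:2:0 both come out at 86,400 chroma samples per plane, just arranged differently, and 4:2:2 only interpolates up to 172,800. No step in the chain can recover detail that 4:1:1 never had.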
Mild residual combs in your screenshot can be tackled with Vinverse, or better yet, with whatever general denoiser you plan to use (fft3d? tbilateral? mctd?). Residual combs are the least of your problems in a noisy frame like that.