Old 3rd October 2015, 22:39   #33340  |  Link
chros
Registered User
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by aufkrawall View Post
There's no difference for me. On the right side, there is new windowed path and on the left side, it's old FSE mode:

It's the same with old windowed path.
My test case was 1080p60 -> WQHD with Jinc3 + SuperRes 2 passes + Adaptive Sharpen.
Hmm... That's interesting. Maybe your card is just that powerful, or it depends on a lot of things. Are you sure that you used D3D9 exclusive (old path)?

Here are my results (see my signature for details):
0. I did the testing on the 1080p TV, using the NVIDIA GPU with DXVA copy-back in LAV Video, dithering on, all "trade quality for performance" options switched off, no smooth motion, no other processing. For each case I looked for the lowest GPU and memory clocks at which the queues still stay full (max values: GPU 745MHz, RAM 2000MHz).
1. I finally managed to use Overlay mode, but only with the iGPU (Intel) and only when testing on the TV (otherwise my AV receiver is recognized as a cloned display device, which is why I got a blank picture when I tried it on my laptop screen).
2. 720p (1280x720) 25fps video, chroma Jinc+AR, luma Jinc+AR:
- D3D9 FSE new path: gpu 690MHz, mem 2000MHz (gpu usage: 90-92%)
- D3D9 FSE old path: gpu 450MHz, mem 800MHz !!! (gpu usage: 97-99%, but playback is without a problem)
3. 1080p (1920x1080) 23.976fps video, chroma super-xbr+AR:
- D3D9 FSE new path: gpu 490MHz, mem 900MHz (gpu usage: 90-92%)
- D3D9 FSE old path: gpu 405MHz, mem 405MHz !!! (gpu usage: 80-82%) This result is actually insane!!! -> that's the lowest setting in the P8 state (P12->P8->P5->P0)
4. The above values are valid. Both GPU-Z and NVIDIA Inspector show them, and the thermal monitor app for CPU and GPU agrees in both cases.
5. I noticed one more thing on my system: the new path is way more demanding on video RAM than the old one. In both tests I had to raise the RAM clock, not just the GPU clock.
6. All the other modes (FS Windowed new/old path, FS Overlay, FS D3D11) were about as fast as the FSE new path (only the D3D9 old path is completely different).
7. The results were similar when I tried it on the Intel iGPU (with different scaling algos, since it's not as powerful as the NVIDIA): the D3D9 FSE old path was way faster than the rest.
8. Conclusion: if you care about performance AND you don't/can't use NNEDI3 AND you don't use 10bit output, then give the D3D9 old path a chance! It takes less than 5 minutes to test how it behaves.
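
To put the clock numbers from points 2 and 3 in relative terms, here is a small Python sketch. The figures are the ones reported above; the dictionary layout and the function name `old_vs_new` are only for illustration. GPU usage wasn't identical between the runs, so treat the ratios as rough.

```python
# Lowest clocks (MHz) that still kept the queues full, as reported above.
# Each pair is (new path, old path).
results = {
    "720p25, Jinc+AR": {"gpu": (690, 450), "mem": (2000, 800)},
    "1080p23.976, super-xbr+AR": {"gpu": (490, 405), "mem": (900, 405)},
}

def old_vs_new(pair):
    """Return the old-path clock as a fraction of the new-path clock."""
    new_clock, old_clock = pair
    return old_clock / new_clock

for test, clocks in results.items():
    gpu = old_vs_new(clocks["gpu"])
    mem = old_vs_new(clocks["mem"])
    print(f"{test}: old path needs {gpu:.0%} of the GPU clock "
          f"and {mem:.0%} of the memory clock")
```

In the 720p test the old path got by with roughly 65% of the GPU clock and 40% of the memory clock that the new path needed.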

Madshi, what could cause such a big performance difference?

Quote:
Originally Posted by madshi View Post
It's a technical limitation and there's not much I can do about it, without totally rewriting the "old exclusive mode".
You don't need to worry about it, since it's the fastest mode for me, thanks!

Quote:
Originally Posted by madshi View Post
It's possible, but difficult, due to 2 reasons:
1) You'd need to find an image, ideally some high quality RGB photo, where different chroma upscaling algorithms would show clear differences. In order to find such an image, you need to know what to look for. E.g. dark content with lots of black and red usually is a good idea. You should use an RGB photo which has full resolution chroma, maybe even scale it down a bit in RGB, so that the chroma channels really have full resolution and quality.
2) You need to convert the image to YCbCr 4:2:0, and you can't do that by simple downscaling chroma. You need to use the correct chroma offset, too! The chroma channel is not in center position compared to luma channel. It's slightly offset for all newer video codecs (MPEG2, h264, VC-1, h265 etc). Maybe LAV Video Decoder applies the proper offset when forcing it to output NV12? I'm not sure. @nevcairiel?
Sorry, but this was like Chinese to me.
I made a chroma comparison test with the video "mp4.v2c\Misc Patterns\A - Additional\3-Color Steps.mp4" from AVS.HD.709.v2d.Calibration of AVSforum, in windowed fullscreen mode, the zipped pngs are here: http://www10.zippyshare.com/v/bDzpW31T/file.html
(NN: nearest neighbour, SX: super-xbr+AR, Jinc: Jinc+AR, SR: SuperRes strength 1, SR2: SuperRes strength 2)
What I can see in the images is that SuperRes adds unwanted bright edges, visible on the red rectangles on the right side of the pictures.
Apart from this, I can't tell the difference between the 3 tested algos (NN, Jinc+AR, SX+AR); maybe the test video wasn't appropriate for this.
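
For reference, point 2 of madshi's quoted list (the chroma offset) can be sketched roughly in Python as below. This is only an illustration under assumptions, not what LAV or madVR actually do: it uses the BT.709 matrix on gamma-encoded RGB in the 0..1 range, very simple box and [1,2,1]/4 filters, planes with even width and height, and made-up function names. "Left" siting means the chroma sample lines up horizontally with the even luma columns and sits vertically between two luma rows, as opposed to sitting in the centre of each 2x2 luma block.

```python
def rgb_to_ycbcr709(r, g, b):
    """Convert one gamma-encoded R'G'B' pixel (floats in 0..1) to BT.709 Y'CbCr."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (b - y) / 1.8556
    cr = (r - y) / 1.5748
    return y, cb, cr

def subsample_420_center(plane):
    """Plain 2x2 box average: the chroma sample sits in the centre of each 2x2 block."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def subsample_420_left(plane):
    """Left-sited 4:2:0: horizontally aligned with even luma columns,
    vertically halfway between the two luma rows."""
    h, w = len(plane), len(plane[0])
    out = []
    for y in range(0, h, 2):
        # Vertical: average the two rows (sample lands between them).
        row = [(plane[y][x] + plane[y + 1][x]) / 2.0 for x in range(w)]
        # Horizontal: [1,2,1]/4 filter centred on each even column, edges clamped.
        out.append([0.25 * row[max(x - 1, 0)] + 0.5 * row[x]
                    + 0.25 * row[min(x + 1, w - 1)]
                    for x in range(0, w, 2)])
    return out

# A horizontal ramp makes the half-pixel shift between the two sitings visible.
ramp = [[float(x) for x in range(8)] for _ in range(4)]
print(subsample_420_center(ramp)[0])  # samples centred at x = 0.5, 2.5, 4.5, 6.5
print(subsample_420_left(ramp)[0])    # samples centred at x = 0, 2, 4, 6
```

On the ramp, the box average returns the values at x = 0.5, 2.5, 4.5, 6.5, while the left-sited version returns the values at the even columns (except at the clamped left edge). That half-pixel difference is the offset madshi is talking about.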
__________________
Ryzen 5 2600,Asus Prime b450-Plus,16GB,MSI GTX 1060 Gaming X 6GB(v398.18),Win10 LTSC 1809,MPC-BEx64+LAV+MadVR,Yamaha RX-A870,LG OLED77G2(2160p@23/24/25/29/30/50/59/60Hz) | madvr config

Last edited by chros; 3rd October 2015 at 22:53.