Old 6th July 2019, 17:59   #1  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
impact of copyback operations for video playback

this thread is here to discuss and analyse the impact of copyback decoders and other copyback operations used for video playback.

just copy-paste your findings from the madVR thread in here. let's get started.
Old 8th July 2019, 11:33   #2  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by clsid View Post
Native is obviously more efficient than copyback. But the performance impact, while noticeable, isn't so huge that it makes native a necessity.
Quote:
Originally Posted by el Filou View Post
did you test if disabling black bar detection changes anything?
Quote:
Originally Posted by chros View Post
I don't remember. But for me the only advantage of using copyback would be to utilise black bar detection + cropping to save performance, and that's not the case. Otherwise I don't mind the full image processing, and it will make writing profile rules easier.
My last report about this, using a GTX 1060 6GB (max underclocked frequency is 1544 MHz):
- first 3 minutes of Shazam 23p 4k HDR BD remux (~75GB, video bitrate 76.7 Mb/s) on a 4K screen
- external srt subtitle is used (MPC-BE internal sub filter)
- LAV filters
- madvr:
-- hdr passthrough
-- only chroma upscaling is applied: NGU Sharp High
-- dithering: Error Diffusion 2
-- no trade quality option is checked
-- full screen window mode
-- 10 bit output if possible

CPU usage results.
GPU usage results (checked with nvidiainspector):
Code:
- software decoding, - crop:	74% - 82%
- software decoding, + crop:	75% - 85%
- cuvid copy-back, - crop:	82% - 90%
- cuvid copy-back, + crop:	84% - 95%
- dxva2 native:			76% - 80%
- dxva2 copy-back, - crop:	83% - 87%
- dxva2 copy-back, + crop:	83% - 91%
- d3d11 native:			73% - 77%
- d3d11 copy-back, - crop:	85% - 88%
- d3d11 copy-back, + crop:	87% - 95%
So there's a ~10% difference between native and copy-back on my system. The closest performer is dxva2 native, but with its obvious flaws.
Interestingly enough, cropping (with copy-back modes) increases GPU usage instead of reducing it (it uses the same profile, so the result is valid); the diff becomes ~15%!
So, in summary, the fastest mode is: d3d11 native.
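To put rough numbers on the ~10% and ~15% figures, here is a minimal Python sketch that compares every mode against d3d11 native. The ranges are copied from the table above; taking the midpoint of each range is only an assumption made for illustration, not how the values were measured.

```python
# GPU usage ranges (min %, max %) copied from the table above.
gpu_usage = {
    "software, -crop": (74, 82),
    "software, +crop": (75, 85),
    "cuvid copy-back, -crop": (82, 90),
    "cuvid copy-back, +crop": (84, 95),
    "dxva2 native": (76, 80),
    "dxva2 copy-back, -crop": (83, 87),
    "dxva2 copy-back, +crop": (83, 91),
    "d3d11 native": (73, 77),
    "d3d11 copy-back, -crop": (85, 88),
    "d3d11 copy-back, +crop": (87, 95),
}


def midpoint(usage_range):
    """Midpoint of a (low, high) range; an illustrative simplification."""
    low, high = usage_range
    return (low + high) / 2


baseline = midpoint(gpu_usage["d3d11 native"])

# Difference to d3d11 native in percentage points, smallest first.
deltas = sorted(
    ((name, midpoint(r) - baseline) for name, r in gpu_usage.items()),
    key=lambda item: item[1],
)

for name, delta in deltas:
    print(f"{name:24s} {delta:+5.1f} points vs d3d11 native")
```

With midpoints, d3d11 copy-back lands about 11.5 points above d3d11 native without cropping and 16 points above it with cropping, which is in line with the ~10% and ~15% quoted above.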

Note about d3d11-native: no black-bar detection / deinterlacing / BD menus (using jRiver) are available in this mode.

I'm curious about your results/graphs with a similar test case, guys, including your system specs (mine are in my signature).
__________________
Ryzen 5 2600,Asus Prime b450-Plus,16GB,MSI GTX 1060 Gaming X 6GB(v398.18),Win10 LTSC 1809,MPC-BEx64+LAV+MadVR,Yamaha RX-A870,LG OLED77G2(2160p@23/24/25/29/30/50/59/60Hz) | madvr config

Last edited by chros; 9th September 2021 at 20:10.
Old 8th July 2019, 12:10   #3  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by el Filou View Post
The real limitations of copyback decoding only start to become a problem with 10-bit 4K, because it takes up 8 times the bandwidth of 8-bit FHD. With lower resolutions, using copyback or native doesn't have an impact on which madVR settings I am able to use. With 4K 10-bit it does.
Quote:
Originally Posted by tp4tissue View Post
is that on full bitrate 4k remux ?
Quote:
Originally Posted by huhn View Post
it's a broadcast sample, and we are talking about decoded frames here: they always have the same size, whether the source is 10 mbit or 125 mbit.
Somehow it does matter at some point in the rendering process. So to try out the biggest impact we can:
- use a 4k hdr bt.2020 10bit 23p remux (HDR passthrough is fine)
- use a 4k monitor/tv
- set "12 bit" in nvidia CP
- set "10 bit or above" in madVR
- and use the highest setting you can with madvr chroma scaling (e.g. NGU Sharp @ High)

All these settings are real life examples, not some demo material settings.
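The "8 times the bandwidth" figure quoted above can be checked with simple arithmetic. This sketch assumes the copied-back frames are 4:2:0 surfaces in NV12 (8-bit, one byte per sample) and P010 (10-bit stored in 16-bit words); `frame_bytes` is a hypothetical helper, not part of any decoder API.

```python
def frame_bytes(width, height, bytes_per_sample):
    """Size of one decoded 4:2:0 frame in bytes.

    4:2:0 means one full-resolution luma plane plus two chroma planes
    at quarter resolution, i.e. 1.5 samples per pixel.
    """
    samples = width * height * 3 // 2
    return samples * bytes_per_sample


fhd_8bit = frame_bytes(1920, 1080, 1)   # NV12: 8-bit samples, 1 byte each
uhd_10bit = frame_bytes(3840, 2160, 2)  # P010: 10-bit samples in 16-bit words

ratio = uhd_10bit / fhd_8bit

print(f"FHD 8-bit frame:  {fhd_8bit / 1e6:.2f} MB")
print(f"UHD 10-bit frame: {uhd_10bit / 1e6:.2f} MB")
print(f"ratio: {ratio:.0f}x")
```

Four times the pixels and twice the bytes per sample gives the factor of 8 per frame, independent of the source bitrate.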

@huhn, do you have a 4k display? If you don't have 4k remux files then I think guys posted some samples in the tonemapping topic.

Last edited by chros; 8th July 2019 at 12:13.
Old 8th July 2019, 17:07   #4  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
i use these: https://kodi.wiki/view/Samples. a broadcast sample is a real world file.

and yes, even for me something is happening which should not be happening.

10 bit UHD 59p is only 2.5 times the copyback work of 10 bit UHD 23p, but i get far, far more CPU usage than 2.5 times.

while i have a 4K TV, you don't need one: testing with DSR is fine too.
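The 2.5x figure can be sketched in Python as follows. It assumes "23p" and "59p" mean 24000/1001 and 60000/1001 fps and that each frame is a UHD P010 surface (4:2:0, 16-bit words); the constant and function names are made up for this example.

```python
from fractions import Fraction

# Assumption: one UHD 4:2:0 frame with 10-bit samples stored in 16-bit words.
UHD_P010_FRAME_BYTES = 3840 * 2160 * 3 // 2 * 2


def copy_rate_mb_per_s(fps):
    """Data rate for copying every frame once, in one direction only."""
    return float(UHD_P010_FRAME_BYTES * fps) / 1e6


fps_23p = Fraction(24000, 1001)  # assumed exact rate behind "23p"
fps_59p = Fraction(60000, 1001)  # assumed exact rate behind "59p"

rate_23p = copy_rate_mb_per_s(fps_23p)
rate_59p = copy_rate_mb_per_s(fps_59p)
work_ratio = fps_59p / fps_23p

print(f"23p: {rate_23p:.1f} MB/s per direction")
print(f"59p: {rate_59p:.1f} MB/s per direction")
print(f"ratio: {float(work_ratio)}x")
```

Under these assumptions the copy work scales from roughly 0.6 GB/s to 1.5 GB/s per direction, so a CPU load that grows by much more than 2.5x points at something other than the raw amount of data.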
Old 10th July 2019, 10:03   #5  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by el Filou View Post
4. Just out of curiosity I underclocked the CPU to 2100 MHz (FSB 200), to be able to test more different RAM speeds:

(Jellyfish 10-bit DXVA Checker decode):

RAM @ 400: 131,6 fps (native 268,4), CPU 76, GPU 1589, video 989, bus 13
RAM @ 533: 139,6 fps (native 275,7), CPU 70, GPU 1642, video 909, bus 14
RAM @ 666: 148,8 fps (native 281,1), CPU 67, GPU 1642, video 798, bus 15
RAM @ 800: 146,0 fps (native 281,8), CPU 68, GPU 1428, video 766, bus 15

for reference, CPU @ 3500 & RAM @ 800: 210,7 fps, CPU 60, GPU 1797, video 996, bus 22

With same RAM speed but 66% faster CPU, 40-45% more fps.
With same (slow) CPU speed but 66% faster RAM, 13% more fps.
Quote:
Originally Posted by nevcairiel View Post
70% CPU usage on Copy-Back is not a typical result, really. On NVIDIA or Intel you should see extremely low CPU usage, if you have a relatively recent CPU, since both of those will use the DMA engines to copy the image, which does not result in high CPU usage.
AMD, especially on older generations, has been notoriously bad with copy-back, and I would not recommend using it there, or using it as a testing reference for any meaning beyond those cards specifically.
I just tried this here with different CPU speeds (DDR4 RAM is 3200 MHz @ CL16); in short: there's no change.

I made my undervolted 6C12T Ryzen 5 2600 CPU lazy and capped its speed at 2.8GHz with a Windows Power Plan (to keep it cool in a fanless system), and it mainly runs around its lowest clocks, 1.5GHz - 1.7GHz, until some heavy task kicks in (e.g. encoding).
I switched to the Ryzen Balanced plan (that makes the CPU as snappy as it can be, ~3.4GHz), but there wasn't any change in copyback performance here.
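For reference, the scaling percentages in el Filou's quoted Jellyfish results can be reproduced from the listed fps values. This is only a small sketch; `pct_gain` is a hypothetical helper, and the decimal commas of the original numbers are written as decimal points.

```python
def pct_gain(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


# Same RAM speed (800), CPU 2100 MHz -> 3500 MHz
cpu_clock_gain = pct_gain(2100, 3500)
fps_gain_from_cpu = pct_gain(146.0, 210.7)

# Same CPU speed (2100 MHz), RAM 400 -> 666
ram_clock_gain = pct_gain(400, 666)
fps_gain_from_ram = pct_gain(131.6, 148.8)

print(f"CPU +{cpu_clock_gain:.1f}% clock -> +{fps_gain_from_cpu:.1f}% fps")
print(f"RAM +{ram_clock_gain:.1f}% clock -> +{fps_gain_from_ram:.1f}% fps")
```

On that Core 2 system the copyback rate follows the CPU clock much more closely than the RAM clock, which is the opposite of the "no change" seen on the Ryzen 5 2600 above.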
Old 16th July 2019, 21:51   #6  |  Link
el Filou
Registered User
 
Join Date: Oct 2016
Posts: 896
Quote:
Originally Posted by el Filou View Post
CPU 3500 / RAM 666: DXVA Checker decode 63,0 fps; madVR 439 dropped frames, avg 50,16 ms, max 78,17 ms

CPU 3500 / RAM 800: DXVA Checker 66,8 fps; madVR 315 dropped frames, avg 45,68 ms, max 63,78 ms
So I pushed my Core 2 Duo a bit more for fun, and on that same UHD BD 75-second test clip I now have:

CPU 3500 / RAM 800 'aggressive' timings: DXVA Checker 68,3 fps; madVR 283 dropped frames, avg 45,92 ms, max 58,61 ms

CPU 3780 / RAM 864: DXVA Checker 73,4 fps; madVR 165 dropped frames, avg 43,92 ms, max 55,13 ms

CPU still at 85+ % in all cases.
I'd be curious what an old AMD CPU of the same era would get with its integrated memory controller. It has to be the platform.

Edit: based on this old post by nevcairiel, Core 2 Duo has SSE4.1 so should be fine.
Quote:
Originally Posted by nevcairiel View Post
The Athlon 64 is probably still rather slow for CB, as it doesn't have SSE4.1, which introduced the optimized instructions for copying from GPU memory to system memory. It can make a world of difference.
My results are closer to what nevcairiel measured for the 'non-direct' copyback method:
Quote:
Originally Posted by nevcairiel View Post
Decode, Direct P010 Out: 127 fps, 3% CPU
Decode, Direct NV12 Out: 126 fps, 3% CPU
And for giggles without the new Direct Mode:
Decode, No Direct P010 Out: 73 fps, 6% CPU
I've gone back and read the LAV thread from that post: https://forum.doom9.org/showthread.php?t=171219&page=15 and apparently performance is heavily dependent on memory speed even with SSE4.1. Maybe even modern systems with slower memory can see an impact?
__________________
HTPC: Windows 10 22H2, MediaPortal 1, LAV Filters/ReClock/madVR. DVB-C TV, Panasonic GT60, Denon 2310, Core 2 Duo E7400 oc'd, GeForce 1050 Ti 536.40

Last edited by el Filou; 17th July 2019 at 00:42.
Old 17th July 2019, 02:55   #7  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
if i understand the wiki correctly not every core 2 duo has sse 4.1.
easy to confirm with CPU-Z.
Old 17th July 2019, 12:22   #8  |  Link
el Filou
Registered User
 
Join Date: Oct 2016
Posts: 896
Mine does (Wolfdale).
I guess you need at least good DDR3 for 4K 10-bit copyback.
Old 19th July 2019, 10:26   #9  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
i'm testing my zen 2 right now. finally some bad results.

i get 15 % CPU load on a 3700X. something very odd is happening here.

1060 3700X 3200 mhz ram.
UHD 60p NGU AA mid chroma SSim d1 100 to 1080p

d3d11 copyback
CPU 15 %, GPU 19 MS
d3d11 native
CPU 11 %, GPU 14 MS
DXVA copyback
CPU 12 %, GPU 19 MS

1 core is pretty much totally loaded with and without copyback. if an 8 core CPU is loaded like this, my 2 core intel shouldn't be able to do this while idling...

edit: the CPU load seems to be a part of madVR...
can you guys please retest with mpcVR just for the CPU load: https://github.com/Aleksoid1978/VideoRenderer/releases
all you need to do is install and load it as an external filter.
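To relate the percentages above to "1 core is pretty much totally loaded", here is a small sketch. It rests on two assumptions that are not stated in the post: that the CPU figure is a system-wide average over all logical CPUs, and that the 3700X runs with 16 logical CPUs (8 cores with SMT enabled). `busy_threads` is a made-up helper.

```python
def busy_threads(total_cpu_pct, logical_cpus):
    """Convert a system-wide CPU percentage into 'fully busy threads'.

    A system-wide figure is an average over all logical CPUs, so
    multiplying back gives the equivalent number of saturated threads.
    """
    return total_cpu_pct / 100 * logical_cpus


LOGICAL_CPUS_3700X = 16  # assumption: 8 cores with SMT enabled

# Share of the system-wide figure that one saturated thread accounts for.
one_thread_pct = 100 / LOGICAL_CPUS_3700X

for mode, pct in [("d3d11 copyback", 15), ("d3d11 native", 11), ("DXVA copyback", 12)]:
    threads = busy_threads(pct, LOGICAL_CPUS_3700X)
    print(f"{mode:15s} {pct:2d} % total = {threads:.2f} busy threads")
```

If those assumptions hold, one saturated thread alone accounts for 6.25 % of the total, so 11 % to 15 % corresponds to roughly two busy threads in every mode, including native.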

Last edited by huhn; 19th July 2019 at 10:49.
Old 19th July 2019, 14:39   #10  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by huhn View Post
i'm testing my zen 2 right now finally some bad results.
Congrats, enjoy your new system!

Quote:
Originally Posted by huhn View Post
d3d11 copyback
CPU 15 %, GPU 19 MS
d3d11 native
CPU 11 %, GPU 14 MS
DXVA copyback
CPU 12 %, GPU 19 MS
Yes, similar results to ours.

Quote:
Originally Posted by huhn View Post
1 core is pretty much totally loaded with and without copyback.
edit: the CPU load seems to be a part of madVR...
Interesting, I don't recall this. I'll double check this weekend.
Which madvr version do you use? latest stable or HDR2SDR test?
Old 19th July 2019, 15:11   #11  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
i use the stable release. no HDR, no black bar detection: i want to know what effect copyback has.

i have a huge difference between mpcVR and madVR in terms of CPU usage.
Old 19th July 2019, 15:49   #12  |  Link
littleD
Registered User
 
Join Date: Aug 2008
Posts: 343
Important for Intel iGPU users.
For as long as I can remember, enabling the dx11 option in the LAV Filters decoder caused a slowdown in video decoding: performance was noticeably much worse than with dxva native, especially at the highest resolutions. That was a few versions of Win10 and Intel drivers back.

Now, I don't know why, but setting dx11 to automatic makes normal playback possible even with 8k videos. It does not drop any frames with the nightly 2.1 MPC video decoder. It drops some frames with madVR, but it's not so noticeable. madVR consumes a lot of shared memory (Intel iGPU video RAM), and it is almost full at 8k (I see it in the process manager). And finally, video playback crawls with the EVR Custom Presenter. That renderer does not convert HDR to SDR tones, so it's the least featured of the available renderers now. I finally need to switch to another renderer for daily use.

Take note: enabling dxva2 copy-back (CB) makes video slow. There must be some dx11 compatibility chain between decoder and renderer that gives dx11 playback better performance.

I didn't watch CPU usage; it's not so important for me right now.

Last edited by littleD; 31st July 2019 at 20:17.
Old 19th July 2019, 17:04   #13  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by huhn View Post
i have huge difference between mpcVR and madVR in term of CPU usage.
One thing popped into my mind: Ryzen CPUs are extremely snappy, especially with AMD's Ryzen Balanced power plan (if you installed the AMD chipset driver). You can try to switch to Windows Balanced or take a look at the power plan that I use above.

I understand if you don't want to change it, I'm just saying that even a small workload can result in high clock speeds with the Ryzen Balanced plan.
Old 19th July 2019, 17:45   #14  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
this is very different with zen2 and not the issue here. only one core boosts fully, one is at ~2 ghz, and the rest pretty much sleep or are below 400 mhz.
Old 20th July 2019, 13:30   #15  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by huhn View Post
i get 15 % CPU load on an 3700X something very odd is happening here.

1060 3700X 3200 mhz ram.
UHD 60p NGU AA mid chroma SSim d1 100 to 1080p

d3d11 copyback
CPU 15 %, GPU 19 MS
d3d11 native
CPU 11 %, GPU 14 MS
DXVA copyback
CPU 12 %, GPU 19 MS

1 core is pretty much totally loaded with and without copyback.
Using my Power Plan (CPU@1.53 - 1.56 GHz), UHD SDR 10bit 59p sample + NGU Sharp Low chroma + SSim d1 100 to 1080p:
2160p_59fps_hevc-LG_2_DEMO_4K_L_N_06_Slam Dunk.mkv

d3d11 native (madVR / mpcbeVR):
CPU MPC-BE: ~1.8 % / ~1.8 %
d3d11 copyback - no zoom control/black bar detection (madVR / mpcbeVR)
CPU MPC-BE: ~5.8 % / ~3.8%
d3d11 copyback + zoom control and black bar detection (only madVR)
CPU MPC-BE: ~9.2 %

Load is equally distributed between threads (3-4 threads having load <20% out of 12).

Edit:
I added mpc-be video renderer as well. The diff between madvr and mpcbeVR is 2% using d3d11 copyback, while there's no difference using d3d11 native.

Last edited by chros; 1st August 2019 at 11:31.
Old 21st July 2019, 04:40   #16  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
mpcVR d3d9
d3d11 native 5 % at very low frequency
DXVA copyback 3 % at even lower clocks...
DXVA native 0.3 % at idle

mpcVR d3d11
d3d11 native 2% at idle 300mhz
DXVA copyback 3 % at up to 1300 usually much lower
DXVA native 1.0 % at idle

and here again the madVR numbers. they are lower than the old numbers.
d3d11 copyback
CPU 13 %, GPU 19 MS
d3d11 native
CPU 9 %, GPU 14 MS
DXVA copyback
CPU 12 %, GPU 19 MS

pressing control+v in a browser causes higher load than the copyback operation with mpcVR
i have PCIe 3.0 which is currently not a given with ryzen 3000
Old 3rd August 2019, 13:32   #17  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
I added CUVID results to the above post: as @nevcairiel suggested, it performs similarly to other copy-back methods.
Old 20th September 2019, 21:11   #18  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
system 1 specs:
i7 3770k 16GB Kingston HyperX blu. DDR3-1600 DIMM CL10 Dual Kit
ASRock Z77Pro3 Intel Z77
1060
the system is not assembled anymore. it was used most of the time with 4 dimms from 2 totally different kits, but it stopped booting with them about 5 months ago.
ram support was superb on this platform. the missing ram was 8GB (2x 4096MB) TeamGroup Elite DDR3-1333 DIMM CL9-9-9-24 Dual Kit, OC'd to 1600 mhz

system 2 specs:
r7 3700x
msi x570a pro
32GB Corsair Vengeance LPX black DDR4-3200 DIMM CL16 Dual Kit
currently 1060

system 3 specs:
i3 4130
ASRock B85 Pro4
2*8GB (1x 8192MB) Crucial Ballistix Sport DDR3-1600 DIMM CL9-9-9-24 Single mostly run at 1333 mhz because i didn't care.
currently 960

to this date system 3 doesn't really care about copyback operations up to UHD 23p. at 60p it shows real differences.
Old 21st September 2019, 20:46   #19  |  Link
chros
Registered User
 
Join Date: Mar 2002
Posts: 2,323
Quote:
Originally Posted by nevcairiel View Post
The problem with CopyBack is also that it has to copy the image twice, once from the GPU to the system, and then back from the system to the GPU. On some GPUs, the download step is also rather slow (AMD used to historically have trouble there, no clue how recent hardware changed). But it can also stress the system RAM, especially on dual-channel memory mainstream systems.

I did a quick test on my system (which isn't a good example, since it has fast quad-channel RAM and everything else high-end as well, but regardless) with a random 4K 10-bit test clip I had at hand:
With DXVAChecker and naive EVR playback testing

DXVA2-Native, ~380 FPS
DXVA2-CopyBack, ~104 FPS
Software Decoding, ~196 FPS

The native test is close to what the hardware decoder can achieve, it was at ~95% usage most of the time. CopyBack definitely takes quite a toll on 4K. Interestingly on 1080p the overhead from CopyBack is generally extremely minimal.
Interesting is also software decoding. Granted you need a CPU that can actually decode this fast, and it was decode-limited at this point, but uploading the image alone is not bottlenecking the decoder yet.

Since I could upload at 196 fps at least (and probably more), I did another test, DXVA2-CopyBack, Decode only - which means it'll only download the image from the GPU, but not re-upload it. That yielded ~232 FPS.
Clearly the doubled use from download and upload creates the real bottleneck ... somewhere. It's not entirely clear where the real bottleneck is. Clearly the software upload path in the renderer can handle more than ~104 FPS. Clearly the download path in LAV Video can as well. PCIe is full-duplex, which means it should be capable of sending and receiving at the same time. System memory is more complex in regards to that... but my quad-channel memory should have plenty of bandwidth to accommodate this here.

What I don't know is if the EVR used in this example uses a different thread for uploading the video, or if it's on the same thread that LAV Video uses to deliver the image - which might explain why it's slowing down so much, since it does two things on the same thread. madVR, at least, uses a separate thread for uploading, so it wouldn't be affected by that.
Quote:
Originally Posted by littleD View Post
Not sure if I'm adding something new, but computers have been designed since forever with the data flowing in one direction. For PC games to reach high fps, the system needs fast CPU > GPU memory transfers. The backward direction was always a few times slower because no applications needed it. That started to matter in the GPGPU era, even before OpenCL, because general-purpose processing needs to re-upload data many times. Since then the GPU > CPU path has been steadily improved, but it is still slower.
RAM speed doesn't have much impact on CPU > GPU transfers, since that is the native direction for the PC architecture. Upload to GPU, decode (texture/image) and render is typical for a PC game. Quad channel might have an advantage in software decoding, with its many CPU <> RAM memory transfers. So RAM speed might matter in software decoding.
In your example, downloading the image alone (decode only) looks fast anyway. Re-uploading (playback) is contrary to PC design; even with fast RAM, the slow speed may be a hardware or software limitation (dxva design?). There are some small tools to benchmark PCIe GPU > CPU transfers.
Interesting to see that native is ~3.6x faster than copyback and software decoding ~1.9x faster.

What's more interesting is that it has a big impact on the "normal" GPU operations for whatever reason.

I added software decoding results to the above post: as @nevcairiel suggested, it performs better than other copy-back methods, almost like dxva2 native!
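The ratios from nevcairiel's quoted numbers, plus a very rough check of the "two things on the same thread" idea, can be sketched like this. The serial model is purely my assumption for illustration: it treats the decode-only copyback rate (232 fps) as the download stage and the software decoding rate (196 fps, which was decode-limited and so is only a lower bound) as the upload stage, and adds their per-frame times.

```python
native_fps = 380
copyback_fps = 104
software_fps = 196        # decode-limited, so only a lower bound for upload
download_only_fps = 232   # copyback, decode only (no re-upload)

native_speedup = native_fps / copyback_fps
software_speedup = software_fps / copyback_fps


def serial_fps(*stage_fps):
    """Throughput if the stages run back to back on one thread.

    Per-frame times add up, so the rates combine harmonically.
    """
    return 1 / sum(1 / f for f in stage_fps)


predicted_copyback_fps = serial_fps(download_only_fps, software_fps)

print(f"native vs copyback:   {native_speedup:.2f}x")
print(f"software vs copyback: {software_speedup:.2f}x")
print(f"serial download+upload estimate: {predicted_copyback_fps:.1f} fps")
```

The estimate comes out near 106 fps, close to the measured ~104 fps. That is consistent with the download and upload being serialized in that EVR test, but it does not prove it, since the 196 fps figure also contains CPU decoding time.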

Last edited by chros; 21st September 2019 at 20:52.
Old 24th September 2019, 11:45   #20  |  Link
el Filou
Registered User
 
Join Date: Oct 2016
Posts: 896
Quote:
Originally Posted by chros View Post
What's more interesting is that it has a big impact on the "normal" GPU operations for whatever reason.
Just out of curiosity, from a recent test I did for something else (https://forum.doom9.org/showthread.p...#post1885607):

System: PCIe 2.0 x16, Core 2 FSB 333 DDR2-800, 1050 Ti.
Settings: 2160 to 1080, SSIM2D100, scale chroma separately, HDR processing with no trade quality and with highlights recovery.
(Edit: GPU clock is at maximum boost in both cases)

native / copyback (which skipped 110 frames in just 30 seconds):

33.61 / 51.32

 0.53 /  1.79  Jinc Image Downscaling - Convert to Linear Light
10.04 / 10.83  Jinc Downscaling
 0.45 /  3.79  SSIM RT
 0.41 /  0.90  SSIM Final
 0.77 /  1.48  SSIM AR
 0.83 /  1.69  HDR Blur Dif
 6.84 /  8.45  HDR Blur
 0.66 /  0.88  HDR Frequency Split
 0.47 /  0.64  Chroma Scaling - Shift X
 0.47 /  1.03  Chroma Upscaling - ConvertToRGB
 1.20 /  1.72  HDR Tone Map
 0.47 /  1.49  HDR Blur Dif
 4.46 /  6.45  HDR Blur
 0.42 /  1.23  HDR Join
 0.50 /  1.05  Image Scaling X
 0.16 /  0.62  Image Scaling Y
 0.09 /  0.35  HDR Compare
 0.09 /  0.18  HDR Blur Dif
 0.67 /  1.25  HDR Blur
 0.42 /  0.70  HDR Gamut Map
 0.57 /  2.16  HDR Final

I don't know enough about GPUs to form an opinion, and the fact that it drops frames may render this comparison invalid, but what's interesting is that the long Jinc Downscaling step, which by itself takes almost 1/3 of the rendering time, sees a really small difference.
Maybe a scheduling issue in the memory controller between the video decoder, the compute units, and the bus interface?
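To make the per-step comparison easier to read, here is a small sketch that turns a few rows from the list above into slowdown factors. Only a subset of rows is included, and treating the first pair (33.61 / 51.32) as the overall rendering time is my assumption.

```python
# (step, native, copyback) for a few rows copied from the list above.
steps = [
    ("Jinc Downscaling", 10.04, 10.83),
    ("SSIM RT", 0.45, 3.79),
    ("HDR Blur", 6.84, 8.45),
    ("HDR Tone Map", 1.20, 1.72),
    ("Image Scaling Y", 0.16, 0.62),
    ("HDR Final", 0.57, 2.16),
]

# Assumption: the first pair in the post is the overall rendering time.
overall_native, overall_copyback = 33.61, 51.32

# Slowdown factor per step: copyback time divided by native time.
slowdowns = sorted(
    ((name, copyback / native) for name, native, copyback in steps),
    key=lambda item: item[1],
    reverse=True,
)
most_affected = slowdowns[0]
least_affected = slowdowns[-1]

overall_ratio = overall_copyback / overall_native
jinc_share = steps[0][1] / overall_native  # Jinc Downscaling share, native

for name, factor in slowdowns:
    print(f"{name:18s} {factor:5.2f}x slower with copyback")
print(f"overall: {overall_ratio:.2f}x, Jinc share of native time: {jinc_share:.0%}")
```

In this subset the short steps slow down by a factor of 3 to 8, while Jinc Downscaling, at about 30% of the native time, slows down by less than 10%, so the penalty does not simply scale with how long a step runs.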

Last edited by el Filou; 24th September 2019 at 11:55.