Old 24th January 2014, 13:33   #21701  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by cyberbeing View Post
CPU load with 0.87e is now the same as 0.87.0 & 0.87a in my previous post, but it's still more than twice that of 0.86.11.
This CPU load regression exists in all builds starting with the first madVR deband test build released on 9/30/2013. There is no change whether debanding is enabled or disabled, so it must be some other change you made starting with that build.
Old 24th January 2014, 13:37   #21702  |  Link
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by madshi View Post
Here's a new test build:

http://madshi.net/madVR87e.rar

I hope that CPU and GPU performance is mostly back to v0.86.11 levels (maybe GPU performance could be slightly lower due to modified dithering logic). Can anybody confirm?

There's a new option in the rendering settings now, allowing you to enable/disable OpenCL processing of DXVA NV12 surfaces for AMD and Intel GPUs. It's disabled by default now. Please check whether this option helps or harms with your GPU (AMD/Intel only) and report. Thanks.

OpenCL will still not work with newer NVidia GPUs. Need a new debug log for this with error diffusion enabled.
Hmm. Performance still seems worse than 0.86.x for me. GPU usage is 95-100% and the queues are essentially empty.

Disabling "Use random dithering instead of OpenCL error diffusion" drops GPU usage to ~85% but the queues are still generally empty. Using 10-bit chroma and image buffers gets me perfect playback but I've never had to use these options before.

The "Use OpenCL to process DXVA NV12 surfaces" option doesn't seem to make any difference to GPU or CPU usage or the queues for me. By the way, ticking/unticking that option doesn't enable the "Apply" button, unlike the other options. I never tried the deband builds, so I don't know whether those had the problem or not.
__________________
TV Setup: LG OLED55B7V; Onkyo TX-NR515; ODroid N2+; CoreElec 9.2.7
Old 24th January 2014, 13:41   #21703  |  Link
James Freeman
Registered User
 
Join Date: Sep 2013
Posts: 919
Here is another Log with 87.1 (87e) & GTX660:
https://www.mediafire.com/?kc6y1y9253f4gva
__________________
System: i7 3770K, GTX660, Win7 64bit, Panasonic ST60, Dell U2410.
Old 24th January 2014, 13:56   #21704  |  Link
michkrol
Registered User
 
Join Date: Nov 2012
Posts: 167
Quote:
Originally Posted by madshi View Post
You're right. I thought adding the "renderQueue" field would be a clever idea, but it changes too often for it to make sense. So I've completely removed it now.
Thanks for looking into this.

Quote:
Originally Posted by madshi View Post
There's a new option in the rendering settings now, allowing you to enable/disable OpenCL processing of DXVA NV12 surfaces for AMD and Intel GPUs. It's disabled by default now. Please check whether this option helps or harms with your GPU (AMD/Intel only) and report. Thanks.
I'm unable to save this option. The Apply button doesn't get activated, clicking OK does nothing.

Quote:
Originally Posted by DarkSpace View Post
In the latest test build, I have the same issues:
Code:
if (srcInterlaced) && (!filmMode) "Don't Use OpenCL"
else                              "Use OpenCL"
works just fine (thanks michkrol!), but
Code:
if (srcInterlaced && !filmMode) "Don't Use OpenCL"
else                            "Use OpenCL"
doesn't work. If you want me to enter this into the bug tracker, please tell me so.
You're welcome. It's all in the documentation:
Quote:
Each value comparison must be placed in brackets
So it shouldn't work the way you want it to. The syntax may look familiar (Java, C#, etc.), but it is not exactly the same.
Old 24th January 2014, 14:08   #21705  |  Link
noee
Registered User
 
Join Date: Jan 2007
Posts: 530
Quote:
Originally Posted by madshi
There's a new option in the rendering settings now, allowing you to enable/disable OpenCL processing of DXVA NV12 surfaces for AMD and Intel GPUs. It's disabled by default now. Please check whether this option helps or harms with your GPU (AMD/Intel only) and report. Thanks.
This option will not "take". If I click it, the Apply button does not get enabled, and if I hit OK and come back, the option has reverted to disabled.

.87e fixes the performance problem here for me for SD material. No drops, queues full. Turned on error diffusion also.

Still have the slideshow with 1080p film on 1080 monitor, upload queue just never goes above 1...fwiw, I have a 1080p video file that works perfectly..

Edit: I should also add that all of my 1080p film mkvs are P010, not sure if that makes a difference, I don't have any 8-bit 1080p film mkvs....

Last edited by noee; 24th January 2014 at 14:36.
Old 24th January 2014, 14:10   #21706  |  Link
jaju123
Registered User
 
Join Date: Apr 2012
Posts: 16
So what is the new maximum quality setting? I am using two AMD R9 290s in CrossFire, if that helps! Is OpenCL providing higher quality upscaling?

Thanks guys.
Old 24th January 2014, 14:29   #21707  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
Quote:
Originally Posted by jaju123 View Post
So what is the new maximum quality setting? I am using two AMD R9 290s in CrossFire, if that helps! Is OpenCL providing higher quality upscaling?

Thanks guys.
Whatever looks best to your eyes.
Old 24th January 2014, 14:58   #21708  |  Link
DarkSpace
Registered User
 
Join Date: Oct 2011
Posts: 204
Quote:
Originally Posted by michkrol View Post
You're welcome. It's all in the documentation
Ugh. I think I even read that, but I thought comparisons meant all stuff that isn't bool by itself ( e.g. (srcWidth <= 720) compared to srcInterlaced ). Thanks for pointing it out to me, I hope I've learned something for the future. Anyway, it seems like it's not even a bug then!

By the way, madshi: What do you think about a single bool that states whether deinterlacing is active (false for source treated as progressive or IVTC, true for deinterlacing)? As I understand it, filmMode will be false for progressive content, so it's not entirely suitable, and I fear that if I ever come across a source encoded as progressive and force deinterlacing on, the srcInterlaced switch won't change.

Last edited by DarkSpace; 24th January 2014 at 15:01.
Old 24th January 2014, 15:14   #21709  |  Link
cca
Anime Otaku
 
Join Date: Oct 2002
Location: Somewhere in Cyberspace...
Posts: 437
Testing 0.87e on the PC shown in my signature produced good results: I can play my interlaced DVDs with the same settings I used in 0.86.11 plus debanding on. NNEDI is a no-go, too much load on the GPU. The only case where I can use NNEDI is on SD videos of either 24 or 30 fps; in those cases it works well enough. To solve these issues I set up the profile system to select those options according to resolution/frame rate. I just blatantly copied madshi's example and tailored it to my needs.

EDIT: Forgot to mention, the option to enable/disable OpenCL processing of DXVA NV12 surfaces is not working for me either, just like in the above reports.
__________________
AMD FX8350 on Gigabyte GA-970A-D3 / 8192 MB DDR3-1600 SDRAM / AMD R9 285 with Catalyst 1.5.9.1/ Asus Xonar D2X / Windows 10 pro 64bit
Old 24th January 2014, 15:19   #21710  |  Link
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by cyberbeing View Post
CPU load with 0.87e is now the same as 0.87.0 & 0.87a in my previous post, but it's still more than twice that of 0.86.11.
Quote:
Originally Posted by cyberbeing View Post
This CPU load regression exists in all builds starting with the first madVR deband test build released on 9/30/2013. There is no change whether debanding is enabled or disabled, so it must be some other change you made starting with that build.
This would seem to match my findings.
Old 24th January 2014, 15:56   #21711  |  Link
kasper93
MPC-HC Developer
 
Join Date: May 2010
Location: Poland
Posts: 586
Quote:
Originally Posted by madshi View Post
You're right. I thought adding the "renderQueue" field would be a clever idea, but it changes too often for it to make sense. So I've completely removed it now.
Yeah, "renderQueue" was funny. But that's the only thing that I was excited about, since we could make profiles based on performance rather than on the source video. Maybe add "droppedFrames" instead? Then we could make an automatic fallback to a "faster" profile if a certain threshold is reached. Or even a few profiles with different limits ;p You know:
Code:
if (droppedFrames < 50) "Profile1" else if (droppedFrames < 100) "Profile2" else "Profile3"
I know that it won't be of much use, but hey, it will be fun to have just in case.


Thanks for the new release, everything is working great. Except for a minor cosmetic issue, and my GPU performance :X

Last edited by kasper93; 24th January 2014 at 16:07.
Old 24th January 2014, 16:09   #21712  |  Link
vivan
/人 ◕ ‿‿ ◕ 人\
 
Join Date: May 2011
Location: Russia
Posts: 643
Maybe rendering time would be a better option?
Like (rendering_time * fps < 0.9) means that profile is too slow.
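A minimal sketch of that check, assuming rendering_time is measured in seconds per frame so that rendering_time * fps is the share of one frame interval spent rendering. The function names and the sample numbers are hypothetical, not madVR fields, and how the 0.9 threshold should map to "too slow" is left as in the post; the code only computes the product and reports which side of the threshold it falls on.

```python
def frame_budget_used(rendering_time_s: float, fps: float) -> float:
    """Return rendering_time * fps, i.e. the share of one frame interval spent rendering."""
    return rendering_time_s * fps


def below_threshold(rendering_time_s: float, fps: float, threshold: float = 0.9) -> bool:
    """Evaluate the condition from the post: rendering_time * fps < threshold."""
    return frame_budget_used(rendering_time_s, fps) < threshold


# Hypothetical numbers: 30 ms and 45 ms of rendering per frame at 25 fps.
for ms in (30.0, 45.0):
    seconds = ms / 1000.0
    ratio = frame_budget_used(seconds, 25.0)
    print(f"{ms} ms per frame: ratio {ratio:.3f}, below 0.9: {below_threshold(seconds, 25.0)}")
```

A product above 1.0 means a frame takes longer to render than it is displayed for, so playback cannot keep up at that frame rate.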
Old 24th January 2014, 16:09   #21713  |  Link
djfred93
Registered User
 
Join Date: Aug 2012
Posts: 32
madVR doesn't load with the latest version: it gets stuck opening the file, and it crashes on the settings window. The debug build works fine, but it doesn't have the option to enable/disable OpenCL processing of DXVA NV12 surfaces. Deinterlacing drops frames, but it's better than with the other 0.87 versions of madVR. Deinterlacing worked fine (no frame drops) with the latest deband test build and the stable version. Thanks anyway for the latest version.

My system: Intel Core i7 920, ATI Radeon HD 5770, Windows 8.1, MPC-HC 1.7.1

Last edited by djfred93; 24th January 2014 at 16:16.
Old 24th January 2014, 16:37   #21714  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Qotscha View Post
For me this did the trick. Playback is smooth and CPU and GPU usage are about the same as with 0.86.11.
That's a relief.

Quote:
Originally Posted by cyberbeing View Post
CPU load with 0.87e is now the same as 0.87.0 & 0.87a in my previous post, but it's still more than twice that of 0.86.11.
Quote:
Originally Posted by cyberbeing View Post
This CPU load regression exists in all builds starting with the first madVR deband test build released on 9/30/2013. There is no change whether debanding is enabled or disabled, so it must be some other change you made starting with that build.
Ok, good to know. So it's got nothing to do with OpenCL or profiling, which should make it easier to fix. Was the CPU load already higher in the first few builds where fade in/out detection was not implemented yet? You seem to say so. Yet I'm wondering. The fade in/out detection was my first guess about what could have increased the CPU load. But if the CPU load was already higher in the first few deband test builds, then fade in/out detection can't be the reason, because it was added rather late.

Quote:
Originally Posted by DarkSpace View Post
In the latest test build, I have the same issues:
Code:
if (srcInterlaced) && (!filmMode) "Don't Use OpenCL"
else                              "Use OpenCL"
works just fine (thanks michkrol!), but
Code:
if (srcInterlaced && !filmMode) "Don't Use OpenCL"
else                            "Use OpenCL"
doesn't work. If you want me to enter this into the bug tracker, please tell me so.
That's as intended. The script language is not fully C++ compatible. I do require brackets everywhere to make parsing simpler (= faster).

Quote:
Originally Posted by James Freeman View Post
Here is another Log with 87.1 (87e) & GTX660:
https://www.mediafire.com/?kc6y1y9253f4gva
Quote:
Originally Posted by HolyWu View Post
Thx.

Quote:
Originally Posted by DarkSpace View Post
By the way, madshi: What do you think about a single bool that states whether deinterlacing is active (false for source treated as progressive or IVTC, true for deinterlacing)? As I understand it, filmMode will be false for progressive content, so it's not entirely suitable, and I fear that if I ever come across a source encoded as progressive and force deinterlacing on, the srcInterlaced switch won't change.
srcInterlaced will be true if you force deinterlacing on.

Quote:
Originally Posted by DragonQ View Post
Hmm. Performance still seems worse than 0.86.x for me. GPU usage is 95-100% and the queues are essentially empty.

Disabling "Use random dithering instead of OpenCL error diffusion" drops GPU usage to ~85% but the queues are still generally empty. Using 10-bit chroma and image buffers gets me perfect playback but I've never had to use these options before.
Does that mean performance is 1-2% worse than before? Or are rendering times twice as high as before? Your post doesn't give any indication about how much worse things really got. Even a 1% performance drop could explain the problem if your settings were already borderline with v0.86.x.

Quote:
Originally Posted by DragonQ View Post
The "Use OpenCL to process DXVA NV12 surfaces" option doesn't seem to make any difference to GPU or CPU usage or the queues for me. By the way, ticking/unticking that option doesn't enable the "Apply" button, unlike the other options.
Quote:
Originally Posted by michkrol View Post
I'm unable to save this option. The Apply button doesn't get activated, clicking OK does nothing.
Quote:
Originally Posted by noee View Post
This option will not "take".
Yes, seems to be a bug.

Quote:
Originally Posted by noee View Post
Still have the slideshow with 1080p film on 1080 monitor, upload queue just never goes above 1...fwiw, I have a 1080p video file that works perfectly..

Edit: I should also add that all of my 1080p film mkvs are P010, not sure if that makes a difference, I don't have any 8-bit 1080p film mkvs....
Your CPU does support SSE2, I hope? In v0.86.x I uploaded P010 content simply by using the MSVC++ "memcpy" function. Now in v0.87.x I'm using custom SSE2 code which, while copying, also does some rudimentary analysis of the pixel data (for fade in/out detection and for potential future features). On my CPU/GPU the SSE2 code performs just as fast as the old "memcpy" code. But it seems to be very different on your PC. I'm wondering why...

Quote:
Originally Posted by cca View Post
Testing 0.87e on the PC shown in my signature produced good results: I can play my interlaced DVDs with the same settings I used in 0.86.11 plus debanding on. NNEDI is a no-go, too much load on the GPU. The only case where I can use NNEDI is on SD videos of either 24 or 30 fps; in those cases it works well enough. To solve these issues I set up the profile system to select those options according to resolution/frame rate
That's exactly the reason I implemented profiles now.

Quote:
Originally Posted by kasper93 View Post
Yeah, "renderQueue" was funny. But that's the only thing that I was excited about, since we could make profiles based on performance rather than on the source video. Maybe add "droppedFrames" instead? Then we could make an automatic fallback to a "faster" profile if a certain threshold is reached. Or even a few profiles with different limits ;p
The problem is that dropped frames can jump a lot if you seek or things like that. I liked the idea of switching profiles based on rendering performance/state myself. But in real life switching rendering settings costs time/performance, too. And if whatever value you're checking is borderline, it's bound to jump back and forth over the boundary all the time, resulting in profiles having to switch back and forth all the time, too, which doesn't really make sense. Maybe I'll find a better solution for this in the future. But for now I think switching based on actual performance isn't going to work well.

Quote:
Originally Posted by vivan View Post
Maybe rendering time would be a better option?
Like (rendering_time * fps < 0.9) means that profile is too slow.
That would probably work better. But what happens if the profile switches, and then the rendering time is low enough, so just 2 frames later the profile switches back to the more demanding one again? Switching settings around all the time is not a good idea, it costs performance, too. Furthermore NNEDI3 has a 0.5 pixel offset, so if you switch NNEDI3 on/off during playback, the image will shift 0.5 pixels during runtime, too.
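The back-and-forth switching described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: two hypothetical profiles ("heavy" and "light") with fixed per-frame rendering costs, and a single threshold re-evaluated on every frame. None of the names or numbers come from madVR.

```python
def count_profile_switches(frames: int, frame_ms: float, heavy_ms: float,
                           light_ms: float, limit: float = 0.9) -> int:
    """Count profile changes when one threshold is re-checked on every frame."""
    profile = "heavy"
    switches = 0
    for _ in range(frames):
        # Rendering cost of the profile that is currently active.
        cost_ms = heavy_ms if profile == "heavy" else light_ms
        # Over the limit: fall back to "light". Under it: go back to "heavy".
        wanted = "light" if cost_ms / frame_ms > limit else "heavy"
        if wanted != profile:
            profile = wanted
            switches += 1
    return switches


# Hypothetical 25 fps clip (40 ms per frame). The heavy profile needs 38 ms
# (just over the limit), the light one 20 ms (far under it).
print(count_profile_switches(frames=100, frame_ms=40.0, heavy_ms=38.0, light_ms=20.0))
```

With these made-up costs the heavy profile is always just over the limit and the light one always well under it, so the rule changes profile on every single frame. That is the borderline case in which per-frame switching pays the switching cost continuously.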

Quote:
Originally Posted by djfred93 View Post
madVR doesn't load with the latest version: it gets stuck opening the file, and it crashes on the settings window.
Strange. Please try again with the next build.

-------

So here's the next test build. I'm cautiously optimistic that it might make OpenCL work with newer NVidia GPUs. Give it a few seconds when you activate OpenCL features the first time. The kernels need to be compiled, which may take 1-3 seconds or so.

http://madshi.net/madVR87f.rar

Also the new OpenCL option should now work properly.
Old 24th January 2014, 16:55   #21715  |  Link
James Freeman
Registered User
 
Join Date: Sep 2013
Posts: 919
Quote:
Originally Posted by madshi
So here's the next test build. I'm cautiously optimistic that it might make OpenCL work with newer NVidia GPUs.
Not yet, but I see a different behaviour.
The image turns black, whereas on the previous release (87e) it froze.

MadVR 87f Log
Old 24th January 2014, 16:56   #21716  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Still a black screen when enabling OpenCL stuff (tested dither and nnedi) with 0.87f.

madVR 0.87f debug log

And yes, the CPU load regression began in the very first deband test build. CPU load in 0.87.0 is slightly higher (+0.4%), but the significant doubling in madVR CPU usage started with this first deband test build.

Quote:
Originally Posted by madshi View Post
Seems that D3D9 <-> OpenCL interop doesn't work properly.
Quote:
Originally Posted by NVIDIA
Simple OpenCL D3D9 Texture
Simple program which demonstrates Direct3D9 texture interoperability with OpenCL. The program creates a number of D3D9 textures (2D, 3D, and CubeMap) which are written to from OpenCL kernels. Direct3D then renders the results on the screen.
The NVIDIA test program for D3D9 <-> OpenCL interop runs successfully on my GTX 770. Link

It also seems to be very fast, at least in these simple test programs. While the D3D9 <-> OpenCL interop program seems to have its fps limited to VSync (the max I could test on my CRT was 170 fps @ 170 Hz with 5% GPU load), their D3D10 version utilizing a swap chain isn't, and runs at 1550 fps on my GTX 770 at only 50% GPU load.

Maybe worth looking at? At least it confirms this OpenCL feature isn't bugged with the 332.21 driver.

Last edited by cyberbeing; 24th January 2014 at 22:09.
Old 24th January 2014, 16:57   #21717  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by James Freeman View Post
Not yet, but I see a different behaviour.
The image turns black, whereas on the previous release (87e) it froze.

MadVR 87f Log
Hmmmm... Your log reports that everything's working. Can you try NNEDI3 instead of Error Diffusion? Still black image?
Old 24th January 2014, 16:58   #21718  |  Link
HeadlessCow
Registered User
 
Join Date: Nov 2002
Posts: 131
Quote:
Originally Posted by madshi View Post
That would probably work better. But what happens if the profile switches, and then the rendering time is low enough, so just 2 frames later the profile switches back to the more demanding one again? Switching settings around all the time is not a good idea, it costs performance, too. Furthermore NNEDI3 has a 0.5 pixel offset, so if you switch NNEDI3 on/off during playback, the image will shift 0.5 pixels during runtime, too.
Maybe a "maxRenderingTime" value instead of "renderingTime", that way once you hit a frame that has a rendering time that is too high it will switch to the lower profile and stay there.
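To make the idea concrete, here is a rough sketch of such a latch. It assumes a per-frame rendering time in milliseconds is available; the class name, the profile names and the 36 ms limit are all made up for illustration and are not part of madVR.

```python
class MaxRenderingTimeLatch:
    """Remember the worst rendering time seen so far and never go back up."""

    def __init__(self, limit_ms: float) -> None:
        self.limit_ms = limit_ms
        self.max_seen_ms = 0.0

    def update(self, rendering_ms: float) -> str:
        # The maximum can only grow, so once the limit has been exceeded
        # the result stays "light" for the rest of playback.
        self.max_seen_ms = max(self.max_seen_ms, rendering_ms)
        return "light" if self.max_seen_ms > self.limit_ms else "heavy"


latch = MaxRenderingTimeLatch(limit_ms=36.0)
for ms in (30.0, 38.0, 20.0, 25.0):
    print(ms, latch.update(ms))
```

Because the stored maximum never decreases, a single slow frame (for example right after a seek) would lock the lower profile in until the value is reset, so a real implementation would probably need some reset rule.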
Old 24th January 2014, 17:02   #21719  |  Link
DarkSpace
Registered User
 
Join Date: Oct 2011
Posts: 204
Quote:
Originally Posted by madshi View Post
That's as intended. The script language is not fully C++ compatible. I do require brackets everywhere to make parsing simpler (= faster).
Thanks, I'll just leave it as it is and remember it for the future, then.

Quote:
Originally Posted by madshi View Post
srcInterlaced will be true if you force deinterlacing on.
That's one less worry, then!

Quote:
Originally Posted by madshi View Post
Furthermore NNEDI3 has a 0.5 pixel offset, so if you switch NNEDI3 on/off during playback, the image will shift 0.5 pixels during runtime, too.
Talking about NNEDI pixel shifts: You added a chroma upscaling option for NNEDI, how do you handle the shifts there?
Old 24th January 2014, 17:07   #21720  |  Link
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by madshi View Post
Does that mean performance is 1-2% worse than before? Or are rendering times twice as high as before? Your post doesn't give any indication about how much worse things really got. Even a 1% performance drop could explain the problem if your settings were already borderline with v0.86.x.
Screenshots for the same clip (1080i/25) with the same settings (no new OpenCL stuff) for 0.86.11 and 0.87f are below using an HD4000. My settings are:

Luma: Lanczos3AR (shouldn't be used in this case)
Chroma: Bicubic75
Downscaling: Catmull-Rom
General: Using separate device for presentation & DXVA processing
Smooth Motion: On
Deinterlacing: On

With 0.86.11 I only get a few dropped frames when first opening the video; with 0.87f I get loads of them and very dodgy playback. In terms of average stats, all of them are worse with the latter.

Screenshot 0.86.11
Screenshot 0.87f