Doom9's Forum > Hardware & Software > Software players

3rd April 2014, 04:12   #25621
biggerpapi
Registered User
 
Join Date: Jan 2014
Posts: 5
Quote:
Originally Posted by Anime Viewer View Post
Something else of note: when I went in to add the forceVendor key and the string inside it, I noticed there is a HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\Intel(R) HD Graphics 4000 key, but no sign of an Nvidia one.
I noticed the same thing. It seems madVR is trying to use the HD4000 for OpenCL. I even tried deleting the HD4000 value and adding a new value for my nVidia GPU, but as soon as I started a video it added the HD4000 right back.
3rd April 2014, 05:19   #25622
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by Anime Viewer View Post
I opened up regedit, went to HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL and created a new key named forceVendor, and in that created a string key which I named nVidia (you didn't mention it, but I'm guessing it should also have a value of 1), but it made no difference. I tried changing the name to Intel, and nothing ran any different.
I think you've misunderstood madshi's notation.
Code:
HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\forceVendor REG_SZ "nVidia"
This means to add a 'forceVendor' String (REG_SZ) with a value of 'nVidia' at HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL , not a key.
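For anyone who would rather not click through regedit, here is a minimal Python sketch that builds the text of a .reg file for that value. It assumes the usual "Windows Registry Editor Version 5.00" text format; the function name build_reg_text is made up for illustration and is not part of madVR. The point it shows is that forceVendor is a named string value inside the OpenCL key, not a subkey of it.

```python
def build_reg_text(key_path: str, value_name: str, value_data: str) -> str:
    """Return .reg file text that sets one REG_SZ (string) value under key_path."""

    def escape(text: str) -> str:
        # Inside quoted names and data, backslashes and quotes must be escaped.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",  # the key (folder) that holds the value
        f'"{escape(value_name)}"="{escape(value_data)}"',  # quoted data means REG_SZ
        "",
    ]
    return "\r\n".join(lines)


# Key path, value name and value data are taken from the post above.
OPENCL_KEY = r"HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL"
reg_text = build_reg_text(OPENCL_KEY, "forceVendor", "nVidia")
print(reg_text)
```

Saving the printed text as a .reg file and importing it should have the same effect as adding the string by hand in regedit; which file encoding regedit accepts is worth checking on your own system.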
3rd April 2014, 05:26   #25623
Anime Viewer
Troubleshooter
 
Join Date: Feb 2014
Posts: 339
Quote:
Originally Posted by cyberbeing View Post
I think you've misunderstood madshi's notation.
Code:
HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\forceVendor REG_SZ "nVidia"
This means to add a 'forceVendor' String (REG_SZ) with a value of 'nVidia' at HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL , not a key.
Thanks for clearing that up. That worked, and now it has created a HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\GeForce GTX 680M key in the registry. 32 neurons seems to be the best chroma upscaling setting for an Nvidia GTX 680M GPU.

It's interesting... if I force the Nvidia, NNEDI3 runs correctly, but if I force the Intel, the screen goes green again...
__________________
System specs: Sager NP9150 SE with i7-3630QM 2.40GHz, 16 GB RAM, 64-bit Windows 10 Pro, NVidia GTX 680M/Intel 4000 HD optimus dual GPU system. Video viewed on LG notebook screen and LG 3D passive TV.

Last edited by Anime Viewer; 3rd April 2014 at 05:44.
3rd April 2014, 05:42   #25624
QBhd
QB the Slayer
 
Join Date: Feb 2011
Location: Toronto
Posts: 697
Quote:
Originally Posted by seiyafan View Post
Agreed, my 270x had OpenCL copy interop and kernel interop in the 400'ish fps, check your PCI-e
What OS do you use?

My R9 270x is only getting 185'ish fps

QB
__________________
3rd April 2014, 05:42   #25625
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
Thanks for the fixes in this build - especially #097.
Now I can leave Smooth Motion enabled all the time.
3rd April 2014, 05:53   #25626
truexfan81
Registered User
 
Join Date: Nov 2012
Posts: 138
Quote:
Originally Posted by madshi View Post
Just copy the settings.bin file. In order to go back to a stored settings.bin file, first run "restore default settings.bat", then copy the stored settings.bin file into the madVR folder. For any of this to work, madVR requires write access to its own folder. If you don't want to give madVR write access to its own folder, open regedit, browse to "HKCU\Software\madshi\madVR" and export to a reg file. Then when you later want to go back, again first run "restore default settings.bat", then double click the saved reg file.
[...] quite slow and results in very slow OpenCL <-> Direct3D9 interop performance.
good to know, thanks
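A rough Python sketch of the file-copy part, in case someone wants to script it. The folder paths you pass in and the helper names backup_settings / restore_settings are placeholders, not anything madVR ships; running "restore default settings.bat" before restoring is still a manual step here.

```python
import shutil
from pathlib import Path


def backup_settings(madvr_dir: Path, backup_dir: Path) -> Path:
    """Copy settings.bin out of the madVR folder and return the backup path."""
    source = madvr_dir / "settings.bin"
    if not source.is_file():
        raise FileNotFoundError(f"no settings.bin in {madvr_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, backup_dir / "settings.bin"))


def restore_settings(backup_file: Path, madvr_dir: Path) -> Path:
    """Copy a stored settings.bin back into the madVR folder.

    Run "restore default settings.bat" first, as described in the quoted post.
    """
    return Path(shutil.copy2(backup_file, madvr_dir / "settings.bin"))
```

Both helpers only copy files, so they need the same write access to the madVR folder that the quoted post mentions.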
3rd April 2014, 06:29   #25627
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,926
Quote:
Originally Posted by QBhd View Post
What OS do you use?

My R9 270x is only getting 185'ish fps

QB
this has a lot to do with the cpu and pci-e versions.

I get over 440 fps with Haswell and PCI-E 3.0; Ivy Bridge is slower with PCI-E 3.0.

just have a look at this post:
http://www.avsforum.com/t/1477339/so-youve-built-your-htpc-now-what-is-next-how-to-get-the-ultimate-picture-and-sound-quality-from-your-htpc-madvr-svp-xbmc-mediabrowser-jriver/450#post_24514253

This is not 100% reproducible; the number can vary between two identical cards, and by a lot.
3rd April 2014, 09:13   #25628
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,348
Quote:
Originally Posted by madshi View Post
D'ya still have a Maxwell? Would be interesting to hear how that fares compared to the 660, considering that Maxwell seems to have noticeably improved compute performance over Kepler.
Quote:
Originally Posted by ryrynz View Post
Hey nevcairiel, any chance of that compute performance test of your 750?
Using the same settings as my GTX 660, it seems to produce similar performance - which is kinda impressive, considering it's quite a bit slower in raw performance (~30%).

Chroma: Bicubic 75+AR
Image Up: Jinc3+AR
Image Down: CR+AR+LL

720->1080 with NNEDI 32N: 37ms with OD, 40ms with ED1. Stable and no dropped frames during a 24p movie.
If I step image downscaling down to CR without AR/LL, rendering goes down to ~32ms; stepping image upscaling down to Lanczos3+AR instead gets it to 35ms.

The hit from error diffusion is much lower than on the 660, but NNEDI does take a bit more performance out of it.
However, it's quite usable with NNEDI for 24p content. A 750Ti, with around 20% more performance, should fare even better and beat the GTX 660 consistently. A "860" model based on Maxwell sounds like a good madVR card, but who knows when those will come out. To reach enough performance to run 60p with NNEDI, you'll need to wait for the new high end cards, though!

90% of all content I watch is 24p, 7% is 25p, and only the last 3% are anything higher, so for me it would probably even be enough.
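To put those render times in context, here is a small Python sketch of the frame-time budget arithmetic. It assumes nominal frame rates (exactly 24, 25 and 60 fps, not 23.976) and that a frame has to finish rendering within one frame interval to avoid drops; the function names are only for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def fits_budget(render_ms: float, fps: float) -> bool:
    """True if the measured render time is within one frame interval."""
    return render_ms <= frame_budget_ms(fps)


for fps in (24, 25, 60):
    print(f"{fps}p budget: {frame_budget_ms(fps):.1f} ms")

# Render times quoted in the post: 37 ms with OD, 40 ms with ED1.
for render_ms in (37, 40):
    print(render_ms, "ms fits 24p:", fits_budget(render_ms, 24),
          "| fits 60p:", fits_budget(render_ms, 60))
```

Under these assumptions both 37 ms and 40 ms fit inside the roughly 41.7 ms available at 24p, while neither comes close to the roughly 16.7 ms available at 60p, which is consistent with the remark about waiting for faster cards.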
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 3rd April 2014 at 09:18.
3rd April 2014, 10:12   #25629
John Carmack
Registered User
 
Join Date: Jan 2014
Posts: 10
Quote:
Originally Posted by madshi View Post
Try increasing the GPU queue size and the number of pre-presented frames for exclusive mode. Smooth motion FRC benefits from bigger queues.
I actually tried and noticed a lot of things:
-Disabling exclusive mode solves the stutter.
-When in exclusive mode, (even with a GPU queue of 24), the decoder queue is always something like 21-25/24 while the others are 20 or 21-24/24
-The mouse is stuttering a lot in exclusive mode (maybe a clue?)
3rd April 2014, 10:44   #25630
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by Asmodian View Post
Wow! look at that interop fps, not good.

Do you have something else on the PCI-E bus? It looks like it takes a really long time for a frame to make a round trip.
Nope, no other PCI-E devices.

Quote:
Originally Posted by seiyafan View Post
Agreed, my 270x had OpenCL copy interop and kernel interop in the 400'ish fps, check your PCI-e
The only option in the BIOS that I can remember which relates to PCI-E sets the speeds of the 2nd and 3rd PCI-E x16 slots, which I'm not using. :/

The only other setting I can think of is the Windows Power Settings for PCI-E power savings but that makes no difference.

Quote:
Originally Posted by Anime Viewer View Post
Something else of note: when I went in to add the forceVendor key and the string inside it, I noticed there is a HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\Intel(R) HD Graphics 4000 key, but no sign of an Nvidia one.
Hmmmm, mine also only has a key for my old GTS 250 and "Pitcairn", whatever that is. There is no key for my R9 270...could this be the problem?

EDIT: Just realised Pitcairn is an AMD code name so that's probably fine.
__________________
TV Setup: LG OLED55B7V; Onkyo TX-NR515; ODroid N2+; CoreElec 9.2.7

Last edited by DragonQ; 3rd April 2014 at 11:09.
3rd April 2014, 11:18   #25631
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by DragonQ View Post
{benchmark image}

I don't think I'll be changing my GPU again, especially to nVidia out of principle.
Well, for whatever reason interop cost seems to be extremely high on your PC. How well does DXVA decoding with copyback work for you? I guess it'll probably be very slow, too. Many other users have better results. Could be a problem with your mainboard or your OS installation or something. Can't say for sure. At least it looks like you can forget about NNEDI3 with AMD on your mainboard or OS installation, unfortunately, unless AMD fixes the interop performance problem with a future driver.

Quote:
Originally Posted by ryrynz View Post
madshi, is there still reason to keep the old FSE path?
Some users are still using it. I hope to be able to get rid of it some day, but it's not costing me much support work, so I'll leave it in for now. The big cleanup might come for v1.0. No need to rush cleanup now.

Quote:
Originally Posted by Anime Viewer View Post
Thanks for clearing that up. That worked, and now it has created a HKEY_CURRENT_USER\Software\madshi\madVR\OpenCL\GeForce GTX 680M key in the registry. 32 neurons seems to be the best chroma upscaling setting for an Nvidia GTX 680M GPU.

It's interesting... if I force the Nvidia, NNEDI3 runs correctly, but if I force the Intel, the screen goes green again...
Ok, that's good to know. So NNEDI3 generally works on Optimus laptops. Now the big question is how I can automatically detect the situation, so that you don't have to create that registry tweak. I'll probably have to create a small test tool for you to run...

Quote:
Originally Posted by 6233638 View Post
Thanks for the fixes in this build - especially #097.
Now I can leave Smooth Motion enabled all the time.


Quote:
Originally Posted by nevcairiel View Post
Using the same settings as my GTX 660, it seems to produce similar performance - which is kinda impressive, considering it's quite a bit slower in raw performance (~30%).

Chroma: Bicubic 75+AR
Image Up: Jinc3+AR
Image Down: CR+AR+LL

720->1080 with NNEDI 32N: 37ms with OD, 40ms with ED1. Stable and no dropped frames during a 24p movie.
If I step image downscaling down to CR without AR/LL, rendering goes down to ~32ms; stepping image upscaling down to Lanczos3+AR instead gets it to 35ms.

The hit from error diffusion is much lower than on the 660, but NNEDI does take a bit more performance out of it.
That's quite interesting, thanks for the tests! FWIW, looking at raw numbers it seems that the 750 has about 55% of the memory bandwidth of the 660 and also about 55% of the GFLOPS shader power. So it's quite promising if it runs at about the same speed in madVR.

You're saying the 750 is faster with error diffusion but a bit slower with NNEDI compared to the 660? I think the 750Ti should be a great HTPC card, at least it's by far the greatest card which doesn't need an extra power plug. Can't wait for 860Ti or 960Ti in 20nm without an extra power plug...

Quote:
Originally Posted by John Carmack View Post
I actually tried and noticed a lot of things:
-Disabling exclusive mode solves the stutter.
-When in exclusive mode, (even with a GPU queue of 24), the decoder queue is always something like 21-25/24 while the others are 20 or 21-24/24
-The mouse is stuttering a lot in exclusive mode (maybe a clue?)
The queues look just fine to me. Some questions:

(1) How's the presentation queue?
(2) Do you get frame drops? Or presentation glitches? Or both?
(3) Have you changed any other settings besides the queue sizes?
(4) Hope you're using the default flush settings?
(5) Have you tried disabling desktop composition (if you're on win7)?
3rd April 2014, 11:18   #25632
TheShadowRunner
Registered User
 
Join Date: Feb 2004
Posts: 399
Quote:
Originally Posted by Mangix View Post
Windows XP support ends on the 8th. I don't think it makes sense anymore.
Yeah well, 30% of the world thinks otherwise.
Please don't drop XP support.
And thanks madshi for the new build with Vsfilter/black frame bug fixed!
__________________
XP SP3 / Geforce 8500 / Zoom Player

Last edited by TheShadowRunner; 3rd April 2014 at 11:20.
3rd April 2014, 11:50   #25633
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by madshi View Post
Well, for whatever reason interop cost seems to be extremely high on your PC. How well does DXVA decoding with copyback work for you? I guess it'll probably be very slow, too. Many other users have better results. Could be a problem with your mainboard or your OS installation or something. Can't say for sure. At least it looks like you can forget about NNEDI3 with AMD on your mainboard or OS installation, unfortunately, unless AMD fixes the interop performance problem with a future driver.
OS is Windows 8.1 x64, couple of months old. Naturally all motherboard drivers are installed. Latest BIOS too.

Can't think of any settings to change so I guess I'm stuck. My CPU is overclocked but Nehalem uses Base Clock rather than CPU Multiplier for overclocking, so the PCI-E bus is overclocked as well. My RAM has to be underclocked due to this but I can't imagine that would make such a huge difference; it's still over 1333 MHz.

EDIT: Just realised the PCI-E bus isn't overclocked since it's separate to the Base Clock. Been a while since I've checked BIOS settings on this machine! In any case, QPI should surely be fast enough...is anyone else here using an AMD GPU on a QPI-based motherboard? I think QPI was only used on first-generation Core i7s (Bloomfield), at least for desktop systems.
__________________
TV Setup: LG OLED55B7V; Onkyo TX-NR515; ODroid N2+; CoreElec 9.2.7

Last edited by DragonQ; 3rd April 2014 at 12:42.
3rd April 2014, 12:49   #25634
ryrynz
Registered User
 
Join Date: Mar 2009
Posts: 3,650
Quote:
Originally Posted by TheShadowRunner View Post
Yeah well, 30% of the world thinks otherwise.
30% of the world won't upgrade unless made to. Also, I'm sure 29.9% of them aren't using madVR. Anyway, the old path is sticking around at least until 1.0, so don't worry.
3rd April 2014, 12:51   #25635
John Carmack
Registered User
 
Join Date: Jan 2014
Posts: 10
Quote:
Originally Posted by madshi View Post
The queues look just fine to me. Some questions:

(1) How's the presentation queue?
(2) Do you get frame drops? Or presentation glitches? Or both?
(3) Have you changed any other settings besides the queue sizes?
(4) Hope you're using the default flush settings?
(5) Have you tried disabling desktop composition (if you're on win7)
First, thank you for helping me, even if I'm an isolated case.

1) Almost always empty (i.e. 0 or 1-8/8)
2) Both
3) I tried with the settings reset to defaults and it's the same.
4) Yes, defaults
5) Tried it too

I must point out again that these errors only show up with High10.
3rd April 2014, 21:11   #25636
Anime Viewer
Troubleshooter
 
Join Date: Feb 2014
Posts: 339
Quote:
Originally Posted by madshi View Post

Ok, that's good to know. So NNEDI3 generally works on Optimus laptops. Now the big question is how I can automatically detect the situation, so that you don't have to create that registry tweak.
While experimenting with different combinations, it looks like Optimus (and maybe AMD's hybrid setup) may be able to run in a hybrid state. With the registry forced to Nvidia and the Intel GPU selected for MPC-HC in the Nvidia control panel, it looks like it may have used a combination of the two GPUs. NNEDI3 continued to render on the Nvidia, because the screen didn't turn green and render times didn't shoot up drastically (like when I force the Intel in the registry), but the render times did increase ~5-7ms over the times with both the Nvidia control panel and the registry set to Nvidia. Potentially, if other things besides OpenCL could be forced (like DirectCompute), people with dual GPU systems could exclude the weaker of the two GPUs when it comes to using certain features. People with hybrid AMD/Intel systems might be able to get around the OpenCL interop bug if they force the Intel in the registry, but tell their system to use the AMD for MPC-HC (or whatever other video player, like POT, they may be using).
__________________
System specs: Sager NP9150 SE with i7-3630QM 2.40GHz, 16 GB RAM, 64-bit Windows 10 Pro, NVidia GTX 680M/Intel 4000 HD optimus dual GPU system. Video viewed on LG notebook screen and LG 3D passive TV.
3rd April 2014, 23:24   #25637
blu3wh0
Registered User
 
Join Date: Feb 2014
Posts: 39
Quote:
Originally Posted by DragonQ View Post


I don't think I'll be changing my GPU again, especially to nVidia out of principle.
I would like to note that I'm getting the same issue as DragonQ with AMD 7950 x2. At first I thought I was crazy since most other people with AMD video cards seem to be getting at least some respectable luma doubling, thinking it might be a problem with my setup. I have tried everything that I could think of to try to fix this to no avail.

My OpenCL copy and kernel fps drops from approximately 5250 fps to 130 fps with the interop. I'm on Windows 8.1 x64 and running the latest AMD beta driver. OpenCL error diffusion used to give me the same problem before madshi switched it to DirectCompute. The lowest frame time I can get with luma doubling at 16 neurons for a 720p 24 fps file is 48-50 ms with everything else either disabled or at the lowest possible settings. My GPU runs at approximately 40% when I try to use my current settings with luma doubling.

Sorry if this is kind of abrupt. I have been following madVR and lurking here since madshi started developing it, for which I cannot express my gratitude enough in words (for now, thank you!), but I have been pulling my hair out due to this problem. Please let me know if I can provide any other kind of info.
4th April 2014, 00:10   #25638
flashmozzg
Registered User
 
Join Date: May 2013
Posts: 77
Quote:
Originally Posted by blu3wh0 View Post
I would like to note that I'm getting the same issue as DragonQ with AMD 7950 x2. At first I thought I was crazy since most other people with AMD video cards seem to be getting at least some respectable luma doubling, thinking it might be a problem with my setup. I have tried everything that I could think of to try to fix this to no avail.

My OpenCL copy and kernel fps drops from approximately 5250 fps to 130 fps with the interop. I'm on Windows 8.1 x64 and running the latest AMD beta driver. OpenCL error diffusion used to give me the same problem before madshi switched it to DirectCompute. The lowest frame time I can get with luma doubling at 16 neurons for a 720p 24 fps file is 48-50 ms with everything else either disabled or at the lowest possible settings. My GPU runs at approximately 40% when I try to use my current settings with luma doubling.

Sorry if this is kind of abrupt. I have been following madVR and lurking here since madshi started developing it, for which I cannot express my gratitude enough in words (for now, thank you!), but I have been pulling my hair out due to this problem. Please let me know if I can provide any other kind of info.
What cpu / pci-e do you have?
4th April 2014, 01:12   #25639
blu3wh0
Registered User
 
Join Date: Feb 2014
Posts: 39
I have an i5-2500k on an ASRock Z67 Extreme4 motherboard, with the video cards running on PCI Express 2.0 x8/x8. Disabling crossfire also provided no improvements.
4th April 2014, 01:49   #25640
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,926
PCI-E 2.0 x8 is like using PCI-E 1.1 x16, and this is too slow for NNEDI on AMD. I hope AMD fixes this in the future.
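The bandwidth comparison can be checked with a bit of arithmetic. This Python sketch uses the nominal per-lane signalling rates and line encodings of each PCIe generation to get theoretical one-direction bandwidth; real throughput is lower because of protocol overhead, and the numbers alone don't prove that link speed is what slows the interop down. The function and table names are just for illustration.

```python
# Per-lane signalling rate (GT/s) and line-code efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "1.1": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_mb_s(generation: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in MB/s."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gigatransfers * 1000 * efficiency / 8 * lanes


for generation, lanes in (("1.1", 16), ("2.0", 8), ("2.0", 16), ("3.0", 16)):
    bandwidth = link_bandwidth_mb_s(generation, lanes)
    print(f"PCIe {generation} x{lanes}: {bandwidth:.0f} MB/s")
```

On paper a 2.0 x8 link and a 1.1 x16 link both come out at about 4000 MB/s, half of what a 2.0 x16 slot offers.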
All times are GMT +1.