Old 19th October 2016, 13:19   #39761  |  Link
ashlar42
Registered User
 
Join Date: Jun 2007
Posts: 652
Quote:
Originally Posted by Asmodian View Post
FLOPS, and I agree that it seems to indicate madVR performance quite accurately.
Ok, so my idea was correct.

If this is the case, there's a huge gap between 1050Ti and 1060.
2.1 TFLOPS vs 4.4 TFLOPS.

RX 480 should offer 5.8 TFLOPS. Has real testing confirmed that it is faster than the GTX 1060 in madVR?

Note: the above values are calculated from specifications, according to Tech Report.

Edit: I searched for the comparison spreadsheet that was posted here, based on real testing, and no, the RX 480 does not seem to deliver the theoretical advantage that its specs suggest.
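
A rough sketch of how spec-sheet figures like the ones above are usually derived: peak FP32 throughput is 2 operations (one fused multiply-add) per shader core per clock. The core counts and boost clocks below are assumptions taken from public spec sheets, not from this thread, and the GTX 1060 entry assumes the 6GB model.

```python
# Theoretical FP32 throughput: each shader core can retire one fused
# multiply-add (counted as 2 floating-point operations) per clock.
# Core counts and boost clocks are assumptions from public spec sheets;
# they are not stated anywhere in this thread.
GPUS = {
    "GTX 1050 Ti": (768, 1392),   # (shader cores, boost clock in MHz)
    "GTX 1060":    (1280, 1708),  # assumed 6GB model
    "RX 480":      (2304, 1266),
}

def theoretical_tflops(cores: int, clock_mhz: float) -> float:
    """Peak single-precision TFLOPS = 2 ops * cores * clock."""
    return 2 * cores * clock_mhz * 1e6 / 1e12

for name, (cores, clock) in GPUS.items():
    print(f"{name}: {theoretical_tflops(cores, clock):.1f} TFLOPS")
```

These are theoretical peaks only, so they say nothing about how a card actually behaves in madVR testing.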

Last edited by ashlar42; 19th October 2016 at 13:30.
Old 19th October 2016, 13:30   #39762  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
the first tests showed this was not the case with polaris.

i just got my RX 480 back and it can easily do nnedi3 64 neuron 24p FHD-> UHD.
but that is mostly openCL.

just wait for the new algo, i'm pretty sure people with different hardware will test it.
Old 19th October 2016, 14:45   #39763  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by Asmodian View Post
FLOPS, and I agree that it seems to indicate madVR performance quite accurately.
Keep in mind that FLOPS can only do so much to make things faster; running shaders always has some overhead (reading / writing memory etc.) and for modern GPUs this overhead seems to dominate the computation time for all but the most computation intensive shaders (e.g. NNEDI3).

Case in point, a while back I upgraded my GTX 560Ti to a GTX 960, and while the GTX 960 is definitely faster the difference in FLOPS is far greater than the difference in render times.
Old 19th October 2016, 17:47   #39764  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Quote:
Originally Posted by Shiandow View Post
Keep in mind that FLOPS can only do so much to make things faster; running shaders always has some overhead (reading / writing memory etc.) and for modern GPUs this overhead seems to dominate the computation time for all but the most computation intensive shaders (e.g. NNEDI3).

Case in point, a while back I upgraded my GTX 560Ti to a GTX 960, and while the GTX 960 is definitely faster the difference in FLOPS is far greater than the difference in render times.
I have noticed this too. All the easy shader passes take 0.49ms on my Titan X and this is independent of clock speed and resolution. This means an easy six pass shader, like thin edges, is the same 3 ms on my Titan X as it was on my 980Ti while NNEDI3 256 is much faster on my Titan X.

More FLOPS = faster in madVR. It isn't 1 to 1, but other aspects of the GPU don't do much to help madVR at all.
__________________
madVR options explained
Old 19th October 2016, 20:13   #39765  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by Asmodian View Post
I have noticed this too. All the easy shader passes take 0.49ms on my Titan X and this is independent of clock speed and resolution. This means an easy six pass shader, like thin edges, is the same 3 ms on my Titan X as it was on my 980Ti while NNEDI3 256 is much faster on my Titan X.
Interesting... the 0.5ms per shader pass seems to be pretty universal, I've had similar results on my GTX 560Ti and GTX 960. Not sure what's causing it, it doesn't seem to be just memory bandwidth as reading in more pixels doesn't slow things down much (in a rather extreme case I've had a shader read 1920 pixels per output pixel, that definitely slowed things down but it still worked). I think the number of output pixels has a bigger effect but I haven't checked it thoroughly.

At any rate if it's 0.5ms for every GPU then there's indeed no point in trying to make it faster, so FLOPS are pretty much the only benchmark that will have some effect. Although 2x more FLOPS won't necessarily mean 2x faster rendering.
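
The point about fixed per-pass overhead can be sketched with a toy model: render time is a constant cost per pass plus compute time that shrinks as FLOPS grow. Only the roughly 0.5 ms per pass comes from the posts above; the function names and the workload numbers are made up for illustration.

```python
# Toy model: render time = fixed per-pass overhead + compute time that
# scales inversely with FLOPS. The workload numbers are hypothetical;
# only the ~0.5 ms per pass is taken from the posts above.
PASS_OVERHEAD_MS = 0.5

def render_time_ms(passes: int, compute_ms_at_1x: float, flops_scale: float) -> float:
    """Estimated render time on a GPU with `flops_scale` times the FLOPS."""
    return passes * PASS_OVERHEAD_MS + compute_ms_at_1x / flops_scale

def speedup(passes: int, compute_ms_at_1x: float, flops_scale: float) -> float:
    """Ratio of baseline render time to render time on the faster GPU."""
    base = render_time_ms(passes, compute_ms_at_1x, 1.0)
    return base / render_time_ms(passes, compute_ms_at_1x, flops_scale)

# Light six-pass shader: almost no compute, so doubling FLOPS barely helps.
print(f"light shader: {speedup(6, 0.2, 2.0):.2f}x")
# Heavy NNEDI3-style shader: compute dominates, so it gets close to 2x.
print(f"heavy shader: {speedup(2, 20.0, 2.0):.2f}x")
```

Under these assumptions the light shader stays at about 3 ms no matter how fast the GPU is, while the heavy one gets close to, but never reaches, the full 2x.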
Old 19th October 2016, 23:36   #39766  |  Link
AngelGraves13
Registered User
 
Join Date: Dec 2010
Posts: 254
VRAM plays a role too when upscaling. My 1080 GTX is a huge upgrade over my 980 GTX.

NNEDI3 at 256 for Chroma still isn't possible, but maybe one day...
Old 19th October 2016, 23:43   #39767  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Quote:
Originally Posted by AngelGraves13 View Post
VRAM plays a role too when upscaling. My 1080 GTX is a huge upgrade over my 980 GTX.

NNEDI3 at 256 for Chroma still isn't possible, but maybe one day...
Do you mean VRAM speed or amount? I haven't noticed a change in performance due to VRAM speed myself, and I never need more than 3GB of VRAM even with 4K and large buffer sizes.

Are you sure the improvement isn't simply due to the large increase in FLOPS (4981 vs. 8873 GFLOPS)?

edit: Why would you want NNEDI3 256 for chroma? I can do it but I do not.
__________________
madVR options explained

Last edited by Asmodian; 19th October 2016 at 23:46.
Old 20th October 2016, 00:09   #39768  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
with a GPU buffer of 16 i get more than 3 GB of VRAM usage.
Old 20th October 2016, 00:26   #39769  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Quote:
Originally Posted by huhn View Post
with a GPU buffer of 16 i get more than 3 GB of VRAM usage.
Relatively larger buffers then, or maybe simply not small buffers; the default of 8 stays under 3GB.

The 980 has 4GB of VRAM anyway, I don't see how the extra 4GB of VRAM on the 1080 helps.
__________________
madVR options explained
Old 20th October 2016, 05:44   #39770  |  Link
AngelGraves13
Registered User
 
Join Date: Dec 2010
Posts: 254
Quote:
Originally Posted by Asmodian View Post
Do you mean VRAM speed or amount? I haven't noticed a change in performance due to VRAM speed myself, and I never need more than 3GB of VRAM even with 4K and large buffer sizes.

Are you sure the improvement isn't simply due to the large increase in FLOPS (4981 vs. 8873 GFLOPS)?

edit: Why would you want NNEDI3 256 for chroma? I can do it but I do not.
I'd say VRAM usage. Doubt speed matters much, if at all. It's just better to have more buffer, especially for 4K video, though it will be a few years until we can play back 4K Ultra HD on PC, assuming the copy protection will ever be cracked.

I have 128 for CPU, 24 for GPU and 16 for Present.

NNEDI3 128 I can do, but 256 isn't possible on a 1080 GTX. I use Jinc w/ SuperRes 4 for Chroma and Upscaling.
Old 20th October 2016, 07:50   #39771  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
Quote:
Originally Posted by Asmodian View Post
Relatively larger buffers then, or maybe simply not small buffers; the default of 8 stays under 3GB.

The 980 has 4GB of VRAM anyway, I don't see how the extra 4GB of VRAM on the 1080 helps.
even with a GPU buffer of 8 you get close to 3 GB VRAM usage.
of course 8 GB doesn't help but 3 GB is still very limited for UHD. with a larger buffer size you simply have no chance with 3 GB.
Old 20th October 2016, 08:05   #39772  |  Link
Betroz
Is this for real?
 
Join Date: Mar 2016
Location: Norway
Posts: 168
With 720p, 23.976 fps content upscaled to 1440p and then downscaled to 1080p for my 1080p TV, I can use the following settings in madVR:

- Chroma upscaling : NNEDI3 64
- image downscaling : SSIM2D 100%, AR, LL
- image doubling : double Luma NNEDI3 256 + double Chroma NNEDI3 32 (quadruple off)
- image upscaling : Jinc AR
- upscaling refinement : only SuperRes 1X (2X is possible, but a bit higher render times)

This gives me a render time of 32-35 ms. 480p content is first doubled to 960p by the NNEDI3 settings and then upscaled with Jinc AR from 960p -> 1080p.
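
The chain described above can be sketched roughly as follows. This is a simplified guess at the order of operations (double once with quadruple off, then resize to the display height), not madVR's actual logic, and the function names are hypothetical. The second helper gives the time available per frame at a given frame rate.

```python
from typing import List

# Simplified sketch of the scaling chain: double once (quadruple off),
# then resize the result to the display height. The order of operations
# is an assumption for illustration, not madVR's real decision logic.
def scaling_chain(src_height: int, display_height: int) -> List[str]:
    steps = []
    height = src_height
    if height < display_height:
        height *= 2
        steps.append(f"double (NNEDI3) {src_height}p -> {height}p")
    if height > display_height:
        steps.append(f"downscale (SSIM) {height}p -> {display_height}p")
    elif height < display_height:
        steps.append(f"upscale (Jinc) {height}p -> {display_height}p")
    return steps

def frame_budget_ms(fps: float) -> float:
    """Time available to render one frame without dropping frames."""
    return 1000.0 / fps

print(scaling_chain(720, 1080))
print(scaling_chain(480, 1080))
print(f"{frame_budget_ms(23.976):.1f} ms per frame")
```

At 23.976 fps the budget is about 41.7 ms per frame, so a render time of 32-35 ms fits with some headroom.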

Or I can use these settings with the same content :

- Chroma upscaling : NNEDI3 256
- image downscaling : SSIM2D 100%, AR, LL
- image doubling : double Luma NNEDI3 64 + double Chroma NNEDI3 64 (quadruple off)
- image upscaling : Jinc AR
- upscaling refinement : only SuperRes 1X (2X is possible, but a bit higher render times)

Very little difference in IQ between the two setups, although I like NNEDI3 256 for Chroma Upscaling with native 1080p content on my 1080p TV that is not using any image upscaling/doubling or SuperRes refinement.
__________________
My HTPC : i9 10900K | nVidia RTX 4070 Super | TV : Samsung 75Q9FN QLED

Last edited by Betroz; 20th October 2016 at 08:08.
Old 20th October 2016, 08:25   #39773  |  Link
ryrynz
Registered User
 
Join Date: Mar 2009
Posts: 3,646
Quote:
Originally Posted by Betroz View Post
I like NNEDI3 256 for Chroma Upscaling
Spot the difference.
Old 20th October 2016, 09:19   #39774  |  Link
Betroz
Is this for real?
 
Join Date: Mar 2016
Location: Norway
Posts: 168
Quote:
Originally Posted by ryrynz View Post
Hehe, I can't
I use the first settings from my previous post anyway (the one with NNEDI3 256 for Luma).
__________________
My HTPC : i9 10900K | nVidia RTX 4070 Super | TV : Samsung 75Q9FN QLED
Old 20th October 2016, 09:26   #39775  |  Link
ryrynz
Registered User
 
Join Date: Mar 2009
Posts: 3,646
Quote:
Originally Posted by Betroz View Post
Hehe, I can't


I'd almost suggest removing 128 and 256 neuron NNEDI3 for chroma but I don't think that'll happen, IMO it's best to avoid them.
Old 20th October 2016, 13:43   #39776  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Seems I may need to hurry along my latest experiment a bit.
Old 20th October 2016, 13:50   #39777  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Quote:
Originally Posted by Shiandow View Post
Seems I may need to hurry along my latest experiment a bit.
Compared to NNEDI3, it seems to sharpen the sleeve and instrument of the guy on the left a little more, but at the expense of some aliasing (or clipping) that makes it look a little rough. Had to switch between the images several times to even notice though! (I somehow noticed the aliasing before the increase in sharpness)
Old 20th October 2016, 14:07   #39778  |  Link
aufkrawall
Registered User
 
Join Date: Dec 2011
Posts: 1,812
Quote:
Originally Posted by huhn View Post
even with a GPU buffer of 8 you get close to 3 GB VRAM usage.
of course 8 GB doesn't help but 3 GB is still very limited for UHD. with a larger buffer size you simply have no chance with 3 GB.
Considering that I'm getting presentation glitches with default queues, which I don't get with longer queues, this is a real dealbreaker for 3GB cards.
Old 20th October 2016, 16:23   #39779  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by Ver Greeneyes View Post
Compared to NNEDI3, it seems to sharpen the sleeve and instrument of the guy on the left a little more, but at the expense of some aliasing (or clipping) that makes it look a little rough. Had to switch between the images several times to even notice though! (I somehow noticed the aliasing before the increase in sharpness)
True, although softer images almost automatically look less aliased. And compared to NNEDI3 it's quite a bit faster. Still somewhat of a work in progress though.
Old 20th October 2016, 18:31   #39780  |  Link
sauma144
Registered User
 
Join Date: Sep 2016
Posts: 89
Quote:
Originally Posted by Shiandow View Post
True, although softer images almost automatically look less aliased. And compared to NNEDI3 it's quite a bit faster. Still somewhat of a work in progress though.
Is it SSimSuperRes, its successor or a totally new algo?
What's the performance difference between NNEDI3 (64n) and your algo?

Last edited by sauma144; 20th October 2016 at 18:35.
Tags
direct compute, dithering, error diffusion, madvr, ngu, nnedi3, quality, renderer, scaling, uhd upscaling, upsampling

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 14:54.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.