Old 18th May 2016, 19:59   #1601  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,309
Just wanted to ask whether there is anything special about the nnedi3_rpow2 function, because applying SetFilterMTMode to it seems to be simply ignored and it runs like a NICE filter. I saw parallel calls to the rpow2 function in the debug logs, and the corruption pattern looks as if the nnedi3 core were working on the same internal buffers: 4x-size images, U and V copied to the Y plane, etc.
I'll have to dig into it.
Remark: it looks like a regular Avisynth script but written in C; I learned something new again (Invoke).
Old 18th May 2016, 20:53   #1602  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 595
Quote:
Originally Posted by Reel.Deel View Post
The MPEG2Source issue was fixed so maybe there's hope for nnedi3_rpow2.
The way to fix the nnedi3_rpow2 issue is simply to remove tritical's PlanarFrame hack and the internal buffers, so that nnedi3 can run as an MT_NICE_FILTER.
__________________
my repositories
Old 19th May 2016, 08:48   #1603  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
Quote:
Originally Posted by Chikuzen View Post
The way to fix the nnedi3_rpow2 issue is simply to remove tritical's PlanarFrame hack and the internal buffers, so that nnedi3 can run as an MT_NICE_FILTER.
But this is used only inside nnedi3, as nnedi3_rpow2 is "just" calling nnedi3. If that were the cause (and I agree it's possible), shouldn't nnedi3 itself also show corruption? That doesn't seem to be the case.

Nevertheless, it would indeed be interesting and cleaner to get rid of this hack if possible, but it means I have to dig deeper into the code, and for now I have no idea (and not much time) how to do it.
Any advice, clue, or very general idea (like "functions xxx and yyy are a good place to start looking") is welcome.
Basically, the first step would be working out how to do properly what the hack is doing (creating and using buffers).

Last edited by jpsdr; 19th May 2016 at 09:07.
Old 19th May 2016, 10:36   #1604  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 595
Quote:
Originally Posted by jpsdr View Post
But this is used only inside nnedi3, as nnedi3_rpow2 is "just" calling nnedi3. If that were the cause (and I agree it's possible), shouldn't nnedi3 itself also show corruption? That doesn't seem to be the case.
It seems that Avisynth can't automatically handle a filter that is invoked from another filter.
If it could, why would plugin authors have to call "InternalCache" manually?
I don't think Avisynth+ can choose the MT mode appropriately for an invoked filter when it cannot even insert the internal cache automatically.

Quote:
Any advice, clue, or very general idea (like "functions xxx and yyy are a good place to start looking") is welcome.
Basically, the first step would be working out how to do properly what the hack is doing (creating and using buffers).
TCannyMod?
tcanny uses the PlanarFrame hack and floating-point read/write buffers allocated in the constructor.
I removed them and changed it to allocate the buffers in each GetFrame() call.
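
To make the per-GetFrame allocation idea concrete, here is a minimal sketch. It is not TCannyMod or nnedi3 code: the class name BoxBlurRow and its simplified GetFrame signature are assumptions for illustration, and no Avisynth headers are used. The point is only that the scratch buffer lives inside the call, so parallel calls on one instance have nothing to corrupt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical filter used only for illustration; it is not TCannyMod or nnedi3 code.
class BoxBlurRow {
    int width_;
    int height_;

public:
    BoxBlurRow(int width, int height) : width_(width), height_(height) {
        assert(width > 0 && height > 0);
    }

    // const and free of mutable members: concurrent calls share no writable state.
    std::vector<std::uint8_t> GetFrame(const std::vector<std::uint8_t>& src) const {
        assert(src.size() == static_cast<std::size_t>(width_) * height_);

        // Scratch buffer owned by this call only, released when the call returns.
        std::vector<float> scratch(src.begin(), src.end());
        std::vector<std::uint8_t> dst(src.size());

        for (int y = 0; y < height_; ++y) {
            const float* row = scratch.data() + static_cast<std::size_t>(y) * width_;
            for (int x = 0; x < width_; ++x) {
                // 3-tap horizontal average with the edge pixels repeated.
                float left  = row[x > 0 ? x - 1 : x];
                float right = row[x + 1 < width_ ? x + 1 : x];
                float mean  = (left + row[x] + right) / 3.0f;
                dst[static_cast<std::size_t>(y) * width_ + x] =
                    static_cast<std::uint8_t>(mean + 0.5f);
            }
        }
        return dst;
    }
};
```

The cost is one allocation per call, which is exactly the trade-off discussed in the following posts.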
Old 19th May 2016, 10:48   #1605  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Wouldn't it be better to use a script function to achieve rpow2 with nnedi3/(f)turn calls and sort out chroma/luma alignment issues in that function instead of hard-coding it into the plugin?
Old 19th May 2016, 12:01   #1606  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
If I understand properly, it basically means that any filter which needs buffers and allocates them once in the constructor, freeing them in the destructor (normally the best and correct way to do it), will be broken in MT mode: for the same filter instance, several GetFrame calls can run at the same time (which never happens without MT), so they all access the same buffer and corrupt each other.
The issue is not the PlanarFrame hack, it's deeper: to be MT compatible, if you need buffers, you have to allocate and free them inside GetFrame... Honestly, I don't like the idea, but I don't know whether there is currently a better solution.
We are hitting something deeper here, a design restriction. If you want your filter to be MT compatible, you have to accept design restrictions. But again, that's specific to this MT mode.
A better solution, though again tied to the MT design, would be for the constructor to allocate several buffers once and for all, the number of buffers being determined by the number of threads, with each GetFrame receiving some "thread number" information so that it uses the matching buffer. The MT scheduler would then have to guarantee that two GetFrame calls on the same filter instance never run with the same number.

Anyone can of course discuss, agree or disagree with this (but if you disagree, please explain why and where I'm wrong).
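
A rough sketch of the "several buffers allocated once" idea above, under one stated assumption: since the scheduler in this thread does not hand a thread number to GetFrame, the sketch lets each call claim any free slot instead. SlotPool and Lease are hypothetical names, not part of any Avisynth API.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-size buffer pool: all buffers are allocated once, up front.
class SlotPool {
    std::vector<std::vector<float>> buffers_;
    std::vector<bool> in_use_;
    std::mutex m_;
    std::condition_variable cv_;

    void release(std::size_t index) {
        {
            std::lock_guard<std::mutex> lock(m_);
            in_use_[index] = false;
        }
        cv_.notify_one();  // wake one caller waiting for a free slot
    }

public:
    // RAII handle: the slot is handed back when the lease goes out of scope.
    class Lease {
        SlotPool* pool_;
        std::size_t index_;

    public:
        Lease(SlotPool& pool, std::size_t index) : pool_(&pool), index_(index) {}
        Lease(Lease&& other) noexcept : pool_(other.pool_), index_(other.index_) {
            other.pool_ = nullptr;
        }
        Lease(const Lease&) = delete;
        Lease& operator=(const Lease&) = delete;
        ~Lease() { if (pool_) pool_->release(index_); }

        std::vector<float>& buffer() { return pool_->buffers_[index_]; }
        std::size_t index() const { return index_; }
    };

    SlotPool(std::size_t slots, std::size_t elems)
        : buffers_(slots, std::vector<float>(elems)), in_use_(slots, false) {
        assert(slots > 0);
    }

    // Blocks until a slot is free, then marks it as taken.
    Lease acquire() {
        std::unique_lock<std::mutex> lock(m_);
        std::size_t i = 0;
        cv_.wait(lock, [&] {
            for (i = 0; i < in_use_.size(); ++i)
                if (!in_use_[i]) return true;
            return false;
        });
        in_use_[i] = true;
        return Lease(*this, i);
    }

    std::size_t free_slots() {
        std::lock_guard<std::mutex> lock(m_);
        std::size_t n = 0;
        for (bool used : in_use_) if (!used) ++n;
        return n;
    }
};
```

A GetFrame body would take `auto lease = pool.acquire();` at the top and work in `lease.buffer()`. If more calls run in parallel than there are slots, the extra calls simply wait, so a wrong slot count costs speed rather than correctness.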
Old 19th May 2016, 13:39   #1607  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 595
@jpsdr
If your filter repeatedly allocates and frees memory of the same size, Windows works out how much memory all the threads need and keeps it reserved.
A freed buffer isn't returned to the system; it waits for the next request and is reused as it is.
So allocating and freeing buffers per GetFrame is cheap nowadays.
And this is actually very similar to the scheduler you want.

I think implementing a buffer scheduler is a worse idea than re-implementing memcpy.
Old 19th May 2016, 14:46   #1608  |  Link
Myrsloik
Professional Code Monkey
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,547
Actually this is a bit of a problem in practice. In VS2015, and probably all reasonably modern versions, it's only fast for small allocations, up to about 1080p frames. Once you start allocating a whole 4K frame's worth of space it simply calls VirtualAlloc, which is quite slow. Even stupid buffer reuse behind a global mutex easily beats that.

I'm not going to comment on any specific ideas, but some kind of buffer reuse can (unfortunately) still help. Or just shove in tcmalloc if you don't use VS2015 and it'll basically do it for you.
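
For illustration, "stupid buffer reuse behind a global mutex" could look roughly like the sketch below. BufferCache, take and give_back are hypothetical names, and this is neither tcmalloc nor anything from VapourSynth or Avisynth; it just keeps freed blocks in a size-keyed list so the next request of the same size skips the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical free-list cache: one mutex, blocks keyed by their size in bytes.
class BufferCache {
    std::mutex m_;
    std::multimap<std::size_t, std::unique_ptr<unsigned char[]>> free_;

public:
    // Returns a cached block of exactly this size if one exists, else allocates.
    std::unique_ptr<unsigned char[]> take(std::size_t bytes) {
        {
            std::lock_guard<std::mutex> lock(m_);
            auto it = free_.find(bytes);
            if (it != free_.end()) {
                std::unique_ptr<unsigned char[]> buf = std::move(it->second);
                free_.erase(it);
                return buf;
            }
        }
        // Slow path, taken only while the cache is still warming up.
        return std::unique_ptr<unsigned char[]>(new unsigned char[bytes]);
    }

    // Hands a block back for reuse instead of freeing it.
    void give_back(std::size_t bytes, std::unique_ptr<unsigned char[]> buf) {
        assert(buf);
        std::lock_guard<std::mutex> lock(m_);
        free_.emplace(bytes, std::move(buf));
    }

    std::size_t cached() {
        std::lock_guard<std::mutex> lock(m_);
        return free_.size();
    }
};
```

The cache never shrinks in this sketch, so a real version would need some cap on how much memory it holds on to.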
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet
Old 19th May 2016, 15:17   #1609  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 595
@Myrsloik
Hmm, interesting.
In that case, Avisynth itself should first use tcmalloc, like VapourSynth.
Old 19th May 2016, 15:38   #1610  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
I just took a look at how to use a mutex. I think I'll try to do something with some kind of buffer scheduler, but don't expect it too soon.

Edit:
I'll need to know how to do the following in the plugin:
- Get the Avisynth version, so as not to use MT functions when not running an MT version (basically, whether it's avs+ or not, and which release).
- In avs+, is there a way to detect whether MT has been enabled? And if it has, is there a way to know how many threads will be running?

Is there a link to some information, a tutorial or anything else that explains how to do the above?

Edit 2:
Or... I'll just add an "MT" parameter (at the end, of course), set to "1" by default, where you specify the maximum number of possible threads.

Last edited by jpsdr; 19th May 2016 at 16:16.
Old 19th May 2016, 16:16   #1611  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,309
I'm lost.
I have put a mutex in the nnedi3 class and lock it inside nnedi3's GetFrame, and there is no corruption. There are still parallel calls to nnedi3::GetFrame with the same frame number, but they are for different class instances. Different instances, different mutexes, and they work fine.
But this is just a workaround and doesn't address the underlying problem.
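
The workaround described above can be sketched as follows. This is not the nnedi3 source: SharedScratchFilter and RunParallel are hypothetical names, and the "processing" is a trivial fill-and-sum chosen so that any overlap between two calls on the shared buffer would show up as a wrong result.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical filter with one scratch buffer allocated in the constructor.
class SharedScratchFilter {
    std::vector<int> scratch_;
    std::mutex m_;  // one mutex per instance, as in the workaround

public:
    explicit SharedScratchFilter(std::size_t n) : scratch_(n) { assert(n > 0); }

    long GetFrame(int frame) {
        // Without this lock, two calls could interleave between the two stages.
        std::lock_guard<std::mutex> lock(m_);
        for (int& v : scratch_) v = frame;   // stage 1: write the shared buffer
        long sum = 0;
        for (int v : scratch_) sum += v;     // stage 2: read it back
        return sum;
    }
};

// Calls GetFrame on ONE instance from several threads; true if no result was corrupted.
bool RunParallel(int threads, int frames_per_thread) {
    const std::size_t n = 1024;
    SharedScratchFilter filter(n);
    std::vector<int> bad(threads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            for (int i = 0; i < frames_per_thread; ++i) {
                int frame = t * frames_per_thread + i;
                if (filter.GetFrame(frame) != static_cast<long>(n) * frame) bad[t] = 1;
            }
        });
    }
    for (std::thread& th : pool) th.join();
    for (int b : bad) if (b) return false;
    return true;
}
```

The lock makes each instance behave as if it were serialized, which hides the shared-buffer problem but also removes any parallelism inside that instance.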

Except that how nnedi3_rpow2 works internally, it should work as-is.

But after inserting the nnedi3 mutex, something is still weird.

Avs 2.6 mt runs at 28.7 fps
Avisynth+
no prefetch: 7.6 fps (13%, thread count=20)
prefetch(1): 7.7 fps (13%, thread count=20) (same)
prefetch(2): 8.2 fps (15%, thread count=21)
prefetch(4): 8.7 fps (15%, thread count=23)
prefetch(8): 8.7 fps (15%, thread count=27)

Last edited by pinterf; 19th May 2016 at 18:38. Reason: removed stupid remark on prefetcher
Old 19th May 2016, 16:52   #1612  |  Link
TurboPascal7
Registered User
 
Join Date: Jan 2010
Posts: 270
jpsdr
I'm just gonna leave this here. And here's an example.

It should be pretty obvious that avisynth+ would not require every single plugin author ever to implement his own memory pool.

And I really should stop checking this thread.
__________________
Me on GitHub | AviSynth+ - the (dead) future of AviSynth
Old 19th May 2016, 19:08   #1613  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
Code:
AVSValue __cdecl Create_SangNom2(AVSValue args, void*, IScriptEnvironment* env) {
    if (!env->FunctionExists("SetFilterMtMode")) {
        env->ThrowError("SangNom2: this plugin only works with multithreaded versions of Avisynth+!");
    }
Uh... Not for me!

But thanks for the example, it will let me see how to get the information and check things.
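
A sketch of the softer alternative: use the same FunctionExists check to pick a code path instead of throwing. FakeEnv below is a hypothetical stand-in so the example compiles on its own; only the FunctionExists call and the "SetFilterMtMode" name are taken from the snippet above, and the lookup here is case-sensitive, which is a simplification.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the script environment, for illustration only.
struct FakeEnv {
    std::set<std::string> functions;

    bool FunctionExists(const char* name) const {
        assert(name != nullptr);
        return functions.count(name) != 0;
    }
};

enum class Threading { SingleThreaded, HostManagedMT };

// Same probe as in the SangNom2 snippet, but used to choose a mode, not to fail.
Threading DetectThreading(const FakeEnv& env) {
    return env.FunctionExists("SetFilterMtMode") ? Threading::HostManagedMT
                                                 : Threading::SingleThreaded;
}
```

With a result like this, a plugin could keep its single constructor-allocated buffer on a classic host and switch to per-call or pooled buffers only when the host can run GetFrame in parallel.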

If the memory pool handles things "right", meaning it allocates only when a requested buffer doesn't exist yet, so that it very quickly stops allocating and just hands out already allocated pointers, then indeed this doesn't bother me. But if it's allocating and freeing on each GetFrame, I don't like it.
But I haven't looked at the pool code yet, so for now I'm giving it the benefit of the doubt...

Last edited by jpsdr; 19th May 2016 at 19:38.
Old 19th May 2016, 21:55   #1614  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
@jpsdr

I don't know whether you could do it, but if you link nnedi3 with AVSTP it will be easy to control MT from the script, outside nnedi3.
__________________
See My Avisynth Stuff
Old 20th May 2016, 07:38   #1615  |  Link
jackoneill
unsigned int
 
Join Date: Oct 2012
Location: 🇪🇺
Posts: 760
Reminder that NNEDI3 for Avisynth has internal multithreading: https://github.com/jpsdr/NNEDI3/blob...nedi3.cpp#L577
__________________
Buy me a "coffee" and/or hire me to write code!
Old 20th May 2016, 08:43   #1616  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
This is becoming nnedi3-specific, so I'll switch to the nnedi3 thread to continue discussing this topic.
Old 23rd May 2016, 05:00   #1617  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
Quote:
Originally Posted by Reel.Deel View Post
I'll add it, better yet, maybe you can make your plugin self-register the appropriate MT mode. Take a look here: http://forum.doom9.org/showthread.ph...30#post1667529
I tried adding it but it doesn't seem to be having any effect.

This should cause the memory usage to go WAY down, and also to run much slower, which doesn't happen.
Code:
class ExecuteShader : public GenericVideoFilter {
public:
	// Supported MT mode for AviSynth+
	int __stdcall SetCacheHints(int cachehints, int frame_range) override {
		return cachehints == CACHE_GET_MTMODE ? MT_NICE_FILTER : 0;
	}
	...
};
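
To show what the override above is meant to answer, here is a self-contained model of the query. Everything lives in a `sketch` namespace: the names follow the post, but the numeric values are placeholders rather than the ones in avisynth.h, and ResolveMode is only one plausible way a host could consume the answer. How AviSynth+ actually resolves the mode is not shown in this thread.

```cpp
#include <cassert>

namespace sketch {

// Placeholder values for illustration; NOT the constants from avisynth.h.
enum CacheHint { CACHE_OTHER = 0, CACHE_GET_MTMODE = 1 };
enum MtMode { MT_INVALID = 0, MT_NICE_FILTER = 1, MT_MULTI_INSTANCE = 2, MT_SERIALIZED = 3 };

// Minimal stand-in for a clip interface with a default "no answer" reply.
struct Clip {
    virtual ~Clip() = default;
    virtual int SetCacheHints(int /*cachehints*/, int /*frame_range*/) { return 0; }
};

// Filter that self-registers its MT mode, as in the post above.
struct SelfRegisteringFilter : Clip {
    int SetCacheHints(int cachehints, int /*frame_range*/) override {
        return cachehints == CACHE_GET_MTMODE ? MT_NICE_FILTER : 0;
    }
};

// Filter that predates the query and answers 0 to everything.
struct LegacyFilter : Clip {};

// Hypothetical host-side logic: ask the filter, else use the configured fallback.
MtMode ResolveMode(Clip& clip, MtMode fallback) {
    int answer = clip.SetCacheHints(CACHE_GET_MTMODE, 0);
    assert(answer >= MT_INVALID && answer <= MT_SERIALIZED);
    return answer == MT_INVALID ? fallback : static_cast<MtMode>(answer);
}

}  // namespace sketch
```

In this model a filter that never sees the query keeps whatever fallback the script or definition file supplies, which would look from the outside like the override "having no effect".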
Old 28th May 2016, 05:17   #1618  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
AviSynthShader now reports its supported MT mode to AviSynth+ and no longer needs an entry in the MT definition file.

KNLMeans, however, isn't in the file either, so this line must be added.
Code:
SetFilterMTMode("KNLMeansCL",          MT_SERIALIZED)
Old 28th May 2016, 08:46   #1619  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
Is there a place or link that explains the differences between, and the impact of, all the MT modes such as MT_SERIALIZED and MT_NICE_FILTER, so that I can eventually make my filters report what they support?
I need the list of all the available modes and what exactly each one means for a filter reporting it. Where can I find this information?
Old 28th May 2016, 10:27   #1620  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,746
A few hints are in AviSynth Wiki: AviSynth+; a few more are in the Avisynth+ MT modes definitions (you may have to log in to GitHub to access them).

As a rule of thumb: to be an MT_NICE_FILTER, the code must be written in a "threading-aware", i.e. re-entrant, style (preferably only function parameters and local variables, no global variables where they don't need to be global) and should not spawn its own threads. If a filter is at all likely to produce wrong output in an MT environment, it may have to be set to MT_SERIALIZED, even though that creates a bottleneck by reducing its execution to one thread and possibly even requesting its input in an ordered manner. Filters that create their own threads may work as MT_MULTI_INSTANCE filters only if their number of internal threads is limited, possibly to just 1.
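
A toy model of the three modes as described in the rule of thumb above, under stated assumptions: "nice" is modelled as one instance called by all threads without a lock, "multi instance" as one instance per thread, and "serialized" as one instance behind a host-side lock. Probe, Run and Mode are hypothetical names, ordered input requests are not modelled, and this is not the AviSynth+ scheduler.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

enum class Mode { Nice, MultiInstance, Serialized };

// Records how many threads were inside GetFrame of one instance at the same time.
struct Probe {
    std::atomic<int> inside{0};
    std::atomic<int> peak{0};

    void GetFrame() {
        int now = ++inside;
        int seen = peak.load();
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
        --inside;
    }
};

struct Result {
    int instances;          // how many filter objects the "host" created
    int peak_per_instance;  // highest concurrency observed on any single instance
};

Result Run(Mode mode, int threads, int calls) {
    assert(threads > 0 && calls > 0);
    int count = (mode == Mode::MultiInstance) ? threads : 1;
    std::vector<std::unique_ptr<Probe>> filters;
    for (int i = 0; i < count; ++i) filters.emplace_back(new Probe);

    std::mutex serial;  // only used in Serialized mode
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            Probe& f = *filters[mode == Mode::MultiInstance ? t : 0];
            for (int i = 0; i < calls; ++i) {
                if (mode == Mode::Serialized) {
                    std::lock_guard<std::mutex> lock(serial);
                    f.GetFrame();
                } else {
                    f.GetFrame();
                }
            }
        });
    }
    for (std::thread& th : pool) th.join();

    int peak = 0;
    for (const auto& f : filters) peak = std::max(peak, f->peak.load());
    return {count, peak};
}
```

In this model only the Nice mode can ever see more than one thread inside a single instance, which is why that mode demands re-entrant code, while the other two trade memory (extra instances) or speed (the lock) for safety.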
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid

Last edited by LigH; 28th May 2016 at 10:39.