21st March 2017, 16:13   #3161
MysteryX
Soul Architect
Join Date: Apr 2014
Posts: 2,559
Quote:
Originally Posted by pinterf
- vdubfilter (deshaker issue at exit)
Isn't that the issue I had fixed?

22nd March 2017, 04:37   #3162
olex99
Guest
Posts: n/a
I've just moved to Avisynth+ from Avisynth 2.6 MT and I'm seeing some weird performance issues.

Using the following script in AviSynth 2.6 MT (x86), I get an average of 4.115fps

Code:
SetMemoryMax(2000)
SetMTMode(5, 6)

DGSource("D:\Videos\Video.dgi")

SetMTMode(2)
QTGMC(EdiThreads=1, DftThreads=1).SelectEven()
Trim(0,999)

Using the following script in AviSynth+ r2455 (x64), I get an average of 2.505fps

Code:
SetMemoryMax(2000)

DGSource("D:\Videos\Video.dgi")
QTGMC(EdiThreads=1, DftThreads=1).SelectEven()
Trim(0,999)

Prefetch(6)

I've tried the 32-bit version of Avisynth+ and see similar performance to the 64-bit version.

I've downloaded an MTModes.avsi file and also tried setting the default MT mode to 2, but nothing seemed to make a difference.
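
For reference, a minimal sketch of how MT modes are usually declared in an AviSynth+ script, which is roughly what an MTModes.avsi file does for a list of known filters. It reuses the script above; the explicit mode for DGSource is an illustrative assumption and is not taken from any particular MTModes.avsi.

Code:
SetMemoryMax(2000)
SetFilterMTMode("DEFAULT_MT_MODE", 2)   # mode 2 (MT_MULTI_INSTANCE) for filters with no explicit mode
SetFilterMTMode("DGSource", 3)          # assumption: keep the source filter serialized (mode 3, MT_SERIALIZED)

DGSource("D:\Videos\Video.dgi")
QTGMC(EdiThreads=1, DftThreads=1).SelectEven()
Trim(0,999)

Prefetch(6)                             # 6 prefetch threads, placed at the end of the script

Unlike SetMTMode in AviSynth 2.6 MT, the modes here are attached to filter names rather than to positions in the script, and multithreading only starts once Prefetch is called.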

Any ideas why I might be seeing such a big difference in speed? I was hoping to move to Avisynth+ 64-bit to use more memory and increase performance. Setting max memory to 3000 in Avisynth 2.6 MT gets me another 1 fps, but it crashes as soon as I try to encode because it runs out of memory.

The following plugins are installed (for both x64 and x86):
DGDecodeNV 0.0.0.2052
masktools2 - 2.2.4.0
mvtools2 - 2.7.15.22
nnedi3 - 0.9.4.37
rgtools - 0.95.0.0

Thanks
 
22nd March 2017, 13:03   #3163
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by MysteryX
Isn't that the issue I had fixed?
Yes, you partially fixed the issue, but it has now been reported again, this time in a scenario with multiple calls.

Works:
Code:
source="test.mp4"
LoadVirtualDubPlugin ("c:\Virtualdub\plugins32\deshaker.vdf", "deshaker", preroll=0)
FFVideoSource(source)
clip=ConvertToRGB32()
clip.deshaker("
        12|2|30|4|1.09402|1|1|0|640|480|
        1|2|1|400|400|400|1500|4|1|4|2|
        5|40|300|4|C:\deshaker.log|0|0|0|0|0|
        0|0|0|0|0|0|0|0|1|15|
        15|5|15|1|1|30|30|0|0|0|
        0|1|0|1|10|1|15|1000|1|88")
Freeze on exit:

Code:
source="test.mp4"
LoadVirtualDubPlugin ("c:\Virtualdub\plugins32\deshaker.vdf", "deshaker", preroll=0)
FFVideoSource(source)
clip=ConvertToRGB32()
clip.deshaker("
	19|1|30|4|1|0|1|0|1920|1080|
	1|2|1000|1000|1000|1000|4|1|0|2|
	8|30|300|4|C:\Deshaker.log|0|0|0|0|0|
	0|0|0|0|0|0|0|0|1|15|
	15|5|15|0|0|30|30|0|0|1|
	0|1|0|0|10|1000|1|104|1|1|
	20|5000|100|20|1|0|ff00ff")

clip.deshaker("
	19|2|30|4|1|0|1|0|1920|1080|
	1|2|1000|1000|1000|1000|4|1|0|2|
	8|30|300|4|C:\Deshaker.log|0|0|0|0|0|
	0|0|0|0|0|0|0|0|1|15|
	15|5|15|0|0|30|30|0|0|1|
	0|1|0|0|10|1000|1|104|1|1|
	20|5000|100|20|1|0|ff00ff")
The difference is that we have two deshaker calls.

Last edited by pinterf; 22nd March 2017 at 13:32. Reason: new line

22nd March 2017, 13:29   #3164
LigH
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,779
^ which appears nonsensical to me, especially if you take the implicit assignment to last into account:

Code:
...
clip = ConvertToRGB32()
last = clip.deshaker({ParamSet1}) # ignored due to the following call superseding the output
last = clip.deshaker({ParamSet2}) # this is the call producing the script output
If you want to create a sequence of two calls that possibly depend on each other, you have to ensure that both have an impact on the output clip, even if only through a merge with 0% and 100% weight. Furthermore, understand that the sequence runs per frame: if you need a first pass to produce a log file and a second pass to read and process it, you will need two different scripts anyway; otherwise the second call will not find a finished log file, because the first call has not yet completed it.
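
A minimal sketch of the two-script approach, assuming the plugin path, source file and parameter strings from pinterf's report above. The file names pass1.avs and pass2.avs are hypothetical, and the reading of the second field of the parameter string as the pass number (1 or 2) is an assumption based on it being the only difference between the two reported calls. The first script has to be processed to the last frame, and closed, before the second one is opened.

Code:
# pass1.avs (hypothetical name): analysis pass only, writes C:\Deshaker.log
LoadVirtualDubPlugin("c:\Virtualdub\plugins32\deshaker.vdf", "deshaker", preroll=0)
FFVideoSource("test.mp4")
ConvertToRGB32()
deshaker("
	19|1|30|4|1|0|1|0|1920|1080|
	1|2|1000|1000|1000|1000|4|1|0|2|
	8|30|300|4|C:\Deshaker.log|0|0|0|0|0|
	0|0|0|0|0|0|0|0|1|15|
	15|5|15|0|0|30|30|0|0|1|
	0|1|0|0|10|1000|1|104|1|1|
	20|5000|100|20|1|0|ff00ff")

Code:
# pass2.avs (hypothetical name): reads the finished log and produces the stabilized output
LoadVirtualDubPlugin("c:\Virtualdub\plugins32\deshaker.vdf", "deshaker", preroll=0)
FFVideoSource("test.mp4")
ConvertToRGB32()
deshaker("
	19|2|30|4|1|0|1|0|1920|1080|
	1|2|1000|1000|1000|1000|4|1|0|2|
	8|30|300|4|C:\Deshaker.log|0|0|0|0|0|
	0|0|0|0|0|0|0|0|1|15|
	15|5|15|0|0|30|30|0|0|1|
	0|1|0|0|10|1000|1|104|1|1|
	20|5000|100|20|1|0|ff00ff")
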
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid

Last edited by LigH; 22nd March 2017 at 13:36.

22nd March 2017, 13:35   #3165
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by LigH
^ which appears nonsensical to me, especially if you take the implicit assignment to last into account:
Report is from here
Anyway, it shouldn't freeze on exit.

22nd March 2017, 13:42   #3166
LigH
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,779
I'll post my thoughts there if nobody else has answered yet. To me it looks like the author of this report tried to run a 2-pass sequence in one script, so the second-pass call will either have missed a log file that did not exist yet, or tried to read from an open file that was still being written to (and, depending on file sharing modes, this might cause a lock?). So I would not bet on the AviSynth core being to blame.

22nd March 2017, 15:18   #3167
StainlessS
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Small point though: the first call's constructor will be called even if that filter's result is ignored.
EDIT: Is deshaker re-entrant? Perhaps adding a "Last=0" between the calls would make the freeze disappear, i.e. call the destructor on Last.

EDIT: A post here seems to suggest that the log does indeed need to be closed between calls (as per LigH): https://forum.doom9.org/showthread.p...98#post1782998
Also, of course, a complete scan is needed between calls.
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 22nd March 2017 at 15:57.

23rd March 2017, 17:47   #3168
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by olex99
I've just moved to Avisynth+ from Avisynth 2.6 MT and I'm seeing some weird performance issues.

Using the following script in AviSynth 2.6 MT (x86), I get an average of 4.115fps
[...]
Using the following script in AviSynth+ r2455 (x64), I get an average of 2.505fps
[...]
Thanks
With Prefetch(6), SetMemoryMax(2000) kills performance for this script.
Avisynth+ r2455:
x64, SetMemoryMax(2000): 6.1 fps
x64, SetMemoryMax(3000): 8.8 fps (AVSMeter: 2500MB virtual)
x86, SetMemoryMax(2000): 7.7 fps
x86, SetMemoryMax(3000): 8.2 fps (AVSMeter: 2600MB virtual)

AVS2.6 MT 2.6.0.5
x86, SetMemoryMax(2000): 8.7 fps (AVSMeter: 2540MB virtual)
x86, SetMemoryMax(3000): 8.6 fps (AVSMeter: 3540MB virtual)

I don't have DGSource, so I used a test clip with lsmashvideosource.
The tests were run from AVSMeter, so no extra memory was needed for an encoder.

Using this script line:
Code:
SetLogParams("log.txt", LOG_DEBUG)
a warning appears in the log file for the SetMemoryMax(2000) case:
WARNING: Caches have been shrunk due to low memory limit. This will probably degrade performance. You can try increasing the limit using SetMemoryMax().
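
A minimal sketch of where the logging line would go, assuming olex99's AviSynth+ script from earlier in the thread with only the memory limit raised. If the cache warning no longer shows up in log.txt at the higher limit, the memory limit was the bottleneck.

Code:
SetLogParams("log.txt", LOG_DEBUG)   # write AviSynth+ messages, including the cache warning, to log.txt
SetMemoryMax(3000)                   # raised from 2000, the value that triggered the warning

DGSource("D:\Videos\Video.dgi")
QTGMC(EdiThreads=1, DftThreads=1).SelectEven()
Trim(0,999)

Prefetch(6)
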

24th March 2017, 01:05   #3169
olex99
Guest
Posts: n/a
Thanks Pinterf.

You are right, I've managed to get similar performance between Avisynth+ 64bit and Avisynth 2.6 MT by setting the max memory to 3000.

I did a lot of testing a few months back on the best max memory value for my setup and settled on 2000, as it gave me the best speed and reliability. Going to 3000 in x86 gave me more speed, but it would crash when encoding because the process would run out of memory.

I'm surprised to see such a big performance difference between Avisynth+ and Avisynth 2.6 MT: the same script and plugins are about 40% faster in Avisynth 2.6 MT. I guess the internals of the two programs differ a fair bit when it comes to multithreading.

The bonus of Avisynth+ is the 64bit mode, which allows me to pump more memory into it. For my setup, though, 6 threads and 3000 max memory seem to give the best performance; any more threads just increase the amount of memory required without giving me any performance increase.

Is there anything you can think of that might increase my performance in Avisynth+ to get it closer to Avisynth MT?

I am looking at getting some more RAM soon, as I'm currently stuck running single channel on an X58 Xeon which can run triple channel. I've found that memory speed gives a fairly substantial jump in performance, so hopefully I should see a difference once I get that.

Thanks for your help.
 
24th March 2017, 01:06   #3170
vdcrim
Registered User
Join Date: Dec 2011
Posts: 192
Quote:
Originally Posted by videoFred
AvsPmod throws this error message:
Quote:
Only a single prefetcher is allowed per script
But the only prefetch in the script is this one.
It's a bug in AvsPmod, I just posted a build with a fix here.

24th March 2017, 09:23   #3171
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by olex99
I'm surprised to see such a big performance difference between Avisynth+ and Avisynth 2.6 MT: the same script and plugins are about 40% faster in Avisynth 2.6 MT. I guess the internals of the two programs differ a fair bit when it comes to multithreading.
A 40% difference between avs+ 32 and avs+ x64 (and classic x86 Avisynth MT 2.6.0.5) is too large to be explained by internal MT differences.

24th March 2017, 10:59   #3172
olex99
Guest
Posts: n/a
Quote:
Originally Posted by pinterf
A 40% difference between avs+ 32 and avs+ x64 (and classic x86 Avisynth MT 2.6.0.5) is too large to be explained by internal MT differences.
Would you expect the 64bit version to be faster than the 32bit version? I know x264 is meant to be faster as 64bit, and I assumed that Avisynth would be the same, but from what I've seen in my tests the 32bit version is faster.

I don't have the results of my tests on me, as the computer is at work, so I'll have to get them on Monday, but I ended up getting the speeds to within 5% of each other. Avisynth 2.6 MT x86 with SetMemoryMax at 2000 is about 5-10% faster than Avisynth+ 64 bit with SetMemoryMax at 3000.

Today I basically uninstalled Avisynth+ and normal Avisynth, deleted everything and started from scratch. I installed Avisynth+ 2294 and then updated to 2455, and removed all plugins other than the ones I use for QTGMC. I downloaded the latest QTGMC 3.355s as well as the latest SMDegrain.

Then I ran 1000 frames of a 1080i (25fps) video through AVSMeter with a whole bunch of different SetMemoryMax and Prefetch values, trying to match or beat the speed I got with Avisynth 2.6 MT.

One thing I haven't tried is the Avisynth+ x86 version, to see what sort of speed I get from that. I was hoping to use the x64 version so I could use more memory, as I know 2000 is limiting, but I can't go any higher than that in x86 without it crashing.

I've set the logging in Avisynth+ to Debug and the log is empty.
 
24th March 2017, 12:59   #3173
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by olex99
Would you expect the 64bit version to be faster than the 32bit version? I know x264 is meant to be faster as 64bit, and I assumed that Avisynth would be the same, but from what I've seen in my tests the 32bit version is faster.
mvtools2 and RgTools benefit from running on x64.
In QTGMC most of the time is spent in plugins, not in Avisynth, unless there are other factors such as MT scheduling and memory issues.
Btw, on what exact processor type did you run the speed tests?

24th March 2017, 13:16   #3174
olex99
Guest
Posts: n/a
Quote:
Originally Posted by pinterf
mvtools2 and RgTools benefit from running on x64.
In QTGMC most of the time is spent in plugins, not in Avisynth, unless there are other factors such as MT scheduling and memory issues.
Btw, on what exact processor type did you run the speed tests?
Is there a way I can work out which plugin(s) might be the issue?

It's weird, because I'm literally using the exact same plugins and scripts between the different Avisynths; I downloaded all the latest ones this morning. I installed the Avisynth+ 2294 exe and then copied the Avisynth.dll and DevIL.dll from Avisynth 2.6 MT over the top of the x86 version, and the same DLLs from Avisynth+ r2455 over the top of the x64 version.

I'm running an Intel Xeon X5675 (6 cores) clocked at 4 GHz with 8 GB of DDR3 RAM clocked at 1820 MHz, on Windows 10 Pro x64. As I said in a previous post, the RAM is only running single channel at the moment, but I'm looking at buying another 2 sticks for triple channel in the next week or so. I've found that the faster the memory is clocked, the faster the encoding runs.

The scripts literally just load the source using DGDecodeNV and then run QTGMC with default settings, other than setting the threads to 1. I've commented the QTGMC call out to see if there was an issue with DGDecodeNV; however, both versions of Avisynth give me around 340fps with just DGDecode in there, so it's got to be something within QTGMC.
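
The isolation test described above can be sketched as follows, assuming the AviSynth+ script posted earlier with only the QTGMC line disabled. Timing this in AVSMeter gives the speed of the source filter alone.

Code:
SetMemoryMax(2000)

DGSource("D:\Videos\Video.dgi")
# QTGMC(EdiThreads=1, DftThreads=1).SelectEven()   # disabled to time DGSource on its own
Trim(0,999)

Prefetch(6)
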
 
24th March 2017, 13:30   #3175
olex99
Guest
Posts: n/a
I wonder if it could be something to do with the runtime version I'm using? I can't remember the versions, but I installed the x86 ones a while ago and I did have to install an x64 version for one of the plugins, so maybe there is a difference there.

On Monday when I get back to the computer I'll do some testing between Avisynth 2.6 MT and Avisynth+ x86 and see what the performance difference is. At least that way I can narrow down whether it's a difference between Avisynth and Avisynth+ or whether there is something wrong with my x64 setup.

Should I expect them to be pretty similar with the exact same plugins and script? The only difference in the script would be the different way we specify MT.
 
24th March 2017, 13:44   #3176
DJATOM
Registered User
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Quote:
Originally Posted by olex99
I'm running an Intel Xeon X5675 (6 cores) clocked at 4 GHz with 8 GB of DDR3 RAM clocked at 1820 MHz, on Windows 10 Pro x64
My friend has an encoding server with the same CPU, but it actually has 2 CPUs and he didn't overclock them. We also use Windows Server 2008 as the host OS.
All I can say is that it's a good low-cost CPU for encoding, especially if you can afford a pair of them. It gives more than a 2x speed boost over my i5-4670k clocked at 4.3 GHz.

pinterf, if you need any tests on that CPU, ask me in PM or so.

24th March 2017, 14:35   #3177
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
Thanks, I was just wondering whether the processor has AVX or better capabilities (it does not; it has only SSE4.2), because Avs+ can report AVX or better CPU flags to plugins, and that may result in a different code path. If that code path contained a bug, it could explain the difference.

Another question for olex99: please set SetMemoryMax to 5000 and run the script in the x64 AVSMeter (avsmeter64) to see how much memory is actually needed (usage tops out at a maximum; I don't expect it to reach 5000). Maybe even 3000 is not enough (though you have said that no cache warning was seen in your logs). What are the frame dimensions and format (YV12?) of your clip?
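
A minimal sketch of the requested memory test, assuming olex99's AviSynth+ script from earlier in the thread with only the limit changed. The peak memory that AVSMeter reports for this run shows how much the script needs when the limit is not the constraint.

Code:
SetMemoryMax(5000)   # deliberately generous so that the cache limit is not the bottleneck

DGSource("D:\Videos\Video.dgi")
QTGMC(EdiThreads=1, DftThreads=1).SelectEven()
Trim(0,999)

Prefetch(6)

The frame dimensions and colour format can be read off the source clip with the built-in Info() filter, which overlays the clip properties on the video:

Code:
DGSource("D:\Videos\Video.dgi")
Info()   # shows frame size, colour space, frame rate and other clip properties
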

24th March 2017, 16:09   #3178
pinterf
Registered User
Join Date: Jan 2014
Posts: 2,314
For those who are interested in what happens in the background, this is what the Visual Studio performance profiler shows for Avs+ x86.

Script
Code:
SetMemoryMax(3000)
SetFilterMTMode("DEFAULT_MT_MODE",2)
lsmashvideosource("test107frame.mp4", format="YUV420P8").Loop(10)
Crop(0, 140, 0, -140)
QTGMC(EdiThreads=1, fftThreads=1).SelectEven()
Trim(0,999)
Prefetch(6)
Code:
Function Name	Inclusive Samples %	Exclusive Samples %	Module Name
[nnedi3.dll]	6,76	6,76	nnedi3.dll
Filtering::MaskTools::Filters::Lut::Dual::lut_c	5,82	5,82	masktools2.dll
[LSMASHSource.dll]	5,83	5,79	LSMASHSource.dll
resizer_h_ssse3_generic	5,65	5,65	avisynth.dll
Degrain1to6_sse2<16,16,0,1>	5,09	5,09	mvtools2.dll
_Overlaps16x16_sse2	4,60	4,60	mvtools2.dll
memcpy	4,32	4,32	vcruntime140.dll
_VerticalWiener_iSSE	4,12	4,12	mvtools2.dll
Short2Bytes	4,00	4,00	mvtools2.dll
PlaneOfBlocks::PseudoEPZSearch<unsigned char>	3,44	3,44	mvtools2.dll
_HorizontalWiener_iSSE	3,28	3,28	mvtools2.dll
_x264_pixel_sad_16x16_sse2	2,81	2,81	mvtools2.dll
_Overlaps8x8_sse2	2,50	2,50	mvtools2.dll
_x264_pixel_sad_8x8_mmx2	2,32	2,32	mvtools2.dll
resize_v_ssse3_planar<&simd_load_streaming>	2,31	2,31	avisynth.dll
Degrain1to6_sse2<8,8,0,1>	2,10	2,10	mvtools2.dll
MVDegrainX::process_chroma	4,19	2,10	mvtools2.dll
PlaneOfBlocks::search_mv_slice<unsigned char>	1,95	1,94	mvtools2.dll
PlaneOfBlocks::ExpandingSearch<unsigned char>	1,89	1,89	mvtools2.dll
PlaneOfBlocks::FetchPredictors<unsigned char>	1,76	1,76	mvtools2.dll
_Copy16x16_sse2	1,71	1,71	mvtools2.dll
_Thread32Next@8	1,59	1,59	kernel32.dll
process_plane_sse<unsigned char,&rg_mode20_sse<0,1>,&rg_mode20_sse<1,1> >	1,43	1,43	RgTools.dll
MVDegrainX::GetFrame	37,15	1,28	mvtools2.dll
Filtering::MaskTools::Filters::Morphologic::xxpand_sse2_vertical<&expand_operator_sse2,&Filtering::MaskTools::Filters::Morphologic::limit_up_sse2,1>	1,26	1,26	masktools2.dll
weighted_merge_planar_sse2	1,26	1,26	avisynth.dll
Filtering::MaskTools::Filters::Morphologic::xxpand_sse2_vertical<&inpand_operator_sse2,&Filtering::MaskTools::Filters::Morphologic::limit_down_sse2,1>	1,22	1,22	masktools2.dll
PlaneOfBlocks::InterpolatePrediction<unsigned char>	1,20	1,20	mvtools2.dll
_Copy8x8_sse2	1,19	1,19	mvtools2.dll
calculate_sad_sse2<0>	0,76	0,76	avisynth.dll
PlaneOfBlocks::Hex2Search<unsigned char>	0,69	0,69	mvtools2.dll
accumulate_line_sse2<1,1>	0,66	0,66	avisynth.dll
MVCompensate::compensate_slice_overlap	0,65	0,65	mvtools2.dll
logic_t_sse2<1,&max_t_sse2<&nop_sse2,&nop_sse2>,&max_t<&nop,&nop> >	0,62	0,62	masktools2.dll
process_plane_sse<unsigned char,&rg_mode11_sse<0,1>,&rg_mode11_sse<1,1> >	0,62	0,62	RgTools.dll
Filtering::MaskTools::Filters::Lut::Single::lut_c	0,60	0,60	masktools2.dll
norm_weights<1>	0,57	0,57	mvtools2.dll
logic_t_sse2<1,&min_t_sse2<&nop_sse2,&nop_sse2>,&min_t<&nop,&nop> >	0,57	0,57	masktools2.dll
Filtering::MaskTools::Filters::Support::MakeDiff::makediff_sse2_t<1>	0,48	0,48	masktools2.dll
_RB2BilinearFilteredVerticalLine_SSE	0,41	0,41	mvtools2.dll

24th March 2017, 20:29   #3179
videoFred
Registered User
Join Date: Dec 2004
Location: Terneuzen, Zeeland, the Netherlands, Europe, Earth, Milky Way,Universe
Posts: 689
Quote:
Originally Posted by vdcrim
It's a bug in AvsPmod, I just posted a build with a fix here.
Thank you!

Fred.
__________________
About 8mm film:
http://www.super-8.be
Film Transfer Tutorial and example clips:
https://www.youtube.com/watch?v=W4QBsWXKuV8
More Example clips:
http://www.vimeo.com/user678523/videos/sort:newest

25th March 2017, 15:44   #3180
MysteryX
Soul Architect
Join Date: Apr 2014
Posts: 2,559
Pinterf, here's a bug in Avisynth+. Better if I leave this one for you so it's done the right way instead of hacking around the issue.
https://forum.doom9.org/showthread.php?t=174459