Old 17th October 2013, 02:57   #161  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by ARDA View Post
in some cases, when that code runs on an AMD CPU, it is diverted to a generic, non-optimized library.
This only applies if you use the automatic CPU dispatch optimizations (Qax...). I only use Qx..., which "hardcodes" the supported instruction set into the binary. For example, if you use "QxSSE2", you need at least a P4 or an Athlon 64.
Also, the maximum optimization settings don't always ensure maximum performance. It's a lot of work to optimize a binary (especially with the Intel compiler), but sometimes you get pretty decent gains.
Old 17th October 2013, 03:04   #162  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by ARDA View Post
I'll be waiting for Groucho2004's answer about benchmark methods
Here goes:
I used this script (which I also use for PGO) and AVSMeter.
The results are (i5 2500K @ 4GHz):
Ultim's latest DLL: 29.73 fps
Official Alpha5 DLL: 28.95 fps
My Alpha5 ICL10.1 DLL: 30.02 fps
zero9999's ICC14 DLL: 24.56 fps

I'm aware that the script does not cover all internal functions and I'd be happy about suggestions to improve it.
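
To put the figures above in relative terms, here is a minimal C++ sketch that expresses a build's frame rate as a percentage difference against a baseline such as the official Alpha5 DLL. The helper name `percent_vs_baseline` is hypothetical and not part of AVSMeter or AviSynth.

```cpp
#include <cassert>

// Hypothetical helper: percentage difference of `fps` relative to `baseline_fps`.
// A positive result means faster than the baseline, a negative one slower.
double percent_vs_baseline(double fps, double baseline_fps) {
    assert(baseline_fps > 0.0);  // a zero baseline would make the ratio meaningless
    return (fps / baseline_fps - 1.0) * 100.0;
}
```

For example, `percent_vs_baseline(24.56, 28.95)` comes out at roughly -15%, while the other two builds land within a few percent of the official DLL.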

Last edited by Groucho2004; 17th October 2013 at 03:14.
Old 17th October 2013, 03:22   #163  |  Link
ARDA
Registered User
 
Join Date: Nov 2001
Posts: 291
Thanks, Groucho2004. I'll work with it, and if I have any suggestions I will post them here.

ARDA
Old 17th October 2013, 10:10   #164  |  Link
ultim
AVS+ Dev
 
Join Date: Aug 2013
Posts: 359
Theoretically there shouldn't be any difference between the runtime performance of Avs+ and Avs yet. Any current differences are probably due to different compilers. While some of my current changes might improve running time marginally (removed allocations, fewer exceptions), I have not yet done anything specifically aimed at increasing performance.

Edit: The only difference in speed should be initialization/loading time, where I did make conscious improvements.

Last edited by ultim; 17th October 2013 at 10:18.
Old 19th October 2013, 02:07   #165  |  Link
ARDA
Registered User
 
Join Date: Nov 2001
Posts: 291
Quote:
Originally Posted by ultim
- The "crop" function now defaults to aligned crop. You can still control alignment using its second
parameter, but if you omit it, the default is now for the new frame to be aligned.
This is important for plugin authors so that they can have a stronger alignment guarantee, in the end leading
to faster processing in multiple plugins.
Please allow me to point out some issues.
I know this is not a major subject; consider it a warning for plugin developers and, above all, for users.
Take the following code just as an example.

Code:
MPEG2Source("D:\TRABAJO32BITS\MALENA\MALENA.d2v")# 720 x 576, use your source

AvsTimer(frames=1000, name="ANYONE",type=3, frequency=2000, total=false, quiet=true)#use your cpu frequency

#crop(8,72,700,432,align=true)
#or
crop(8,72,700,432,align=false)

#BicubicResize(640,352)         # or any other size or resizer; many other filters behave the same way
#or                             # these are just the two cases I've tested
flipvertical()                  # flipvertical is in fact just a bitblt

AvsTimer(frames=1500 ,name="ANYONE",type=3, frequency=2000, difference=1, total=false)
Which of the two crops do you think is faster? At first sight we would say the one with align=true, but
it isn't; the fastest is the one with align=false, from 3% to 20% faster with resizers and more than
40% faster with flipvertical(). Why?
Simple: with crop + align=true the whole frame is in fact moved twice, first by the bitblt that align=true performs
and then into the newly created frame by the resize function, or by another bitblt in the case of flipvertical.
Obviously this only happens if the filter immediately after crop in the chain creates
a new frame. From that frame onwards we have an aligned buffer again.
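
The two-copies argument above can be sketched in C++ as follows. This is a simplified single-plane model under stated assumptions, not the real AviSynth classes: `PlaneView`, `crop_view` and `bitblt` are hypothetical names, and buffer alignment itself is not modelled, only the number of bytes moved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified view of one plane of a frame.
struct PlaneView {
    const std::uint8_t* data;  // first byte of the visible area
    std::size_t pitch;         // bytes between the starts of two rows
    std::size_t row_bytes;     // visible bytes per row
    std::size_t height;        // visible rows
};

// Crop without copying: only the start pointer and the dimensions change.
PlaneView crop_view(const PlaneView& src, std::size_t left, std::size_t top,
                    std::size_t row_bytes, std::size_t height) {
    assert(left + row_bytes <= src.row_bytes && top + height <= src.height);
    return PlaneView{src.data + top * src.pitch + left, src.pitch, row_bytes, height};
}

// Row-by-row copy (a "bitblt") into `dst`; returns the number of bytes moved.
std::size_t bitblt(std::vector<std::uint8_t>& dst, const PlaneView& src) {
    dst.resize(src.row_bytes * src.height);
    for (std::size_t y = 0; y < src.height; ++y) {
        std::memcpy(dst.data() + y * src.row_bytes, src.data + y * src.pitch, src.row_bytes);
    }
    return src.row_bytes * src.height;
}
```

In this model a crop with align=false is only `crop_view`, so the chain in the example moves the frame once (the following filter's `bitblt`). With align=true the crop performs its own `bitblt` first, so the same chain moves the frame twice.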

There are a few filters that could work in place under the appropriate conditions, and in that case
align=true might be better. I still have some doubts; I didn't test this.
(One from memory: the ISSE code of Tweak by dividee, but probably many others.)

So I am not saying align=false is better, just that the end user should be aware of how the next filter in the
chain after the crop works. Is that possible? Probably not. At the very least I would add a WARNING
to the documentation for users: IF YOU RESIZE IMMEDIATELY AFTER A CROP, ALIGN=FALSE COULD BE FASTER.

As for plugin developers, I would never trust that the source (last) you receive is always
aligned to 16 bytes for SSE2 or to 32 bytes for the new AVX. So I would always check, unless the only possible way
to get misaligned data is crop with align=false; I don't remember and didn't check.
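
A minimal C++ sketch of such a check, assuming the plugin has the plane's base pointer and pitch at hand. The helper names `is_ptr_aligned` and `rows_aligned` are hypothetical and not taken from the AviSynth headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `ptr` is a multiple of `alignment` bytes.
// `alignment` must be a power of two (16 for SSE2, 32 for AVX).
bool is_ptr_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Every row start is aligned only if both the base pointer and the pitch are.
bool rows_aligned(const void* base, std::size_t pitch, std::size_t alignment) {
    return is_ptr_aligned(base, alignment) && pitch % alignment == 0;
}
```

For instance, a crop that skips 8 bytes at the left of a 16-byte aligned row without copying leaves the row start 8 bytes past an aligned address, so `is_ptr_aligned(ptr, 16)` fails for it.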

I hope this can be useful. ARDA
Old 19th October 2013, 02:26   #166  |  Link
TurboPascal7
Registered User
 
Join Date: Jan 2010
Posts: 270
Yes, without a doubt, unaligned crop is faster in some (most) cases. Sometimes significantly so. Should a user know about it? Probably. Does it matter? Not really. You don't use crop often in a single script and even on my Core i7 860 system (stock clocks, 4 years old), aligned crop at 1080p runs over 2000fps. Hardly a noticeable impact for me as my scripts rarely run faster than 1fps in the end.

As for developers: yes, one should always assume that any kind of data can arrive. But one can also assume that most of the time the data received will be aligned. This means the unaligned SIMD routines can be omitted completely, with all such cases dispatched to the plain C implementation.

For example, in the core filters the recommended dispatch looks like this:
Code:
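if ((env->GetCPUFlags() & CPUF_SSE2) && IsPtrAligned(srcp, 16)) {
  process_sse2(/* some args */);  // SSE2 path: requires 16-byte aligned data
} else
#ifdef X86_32
if (env->GetCPUFlags() & CPUF_MMX) {
  process_mmx(/* some args */);   // MMX path: 32-bit builds only
} else
#endif
{
  process_c(/* some args */);     // plain C fallback, also takes the unaligned cases
}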
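
A self-contained variant of the dispatch above, as a sketch only: the flag values, the `Path` enum and `choose_path` are stand-ins invented for illustration (the real flags come from avisynth.h), and the `is_x86_32` parameter mirrors the X86_32 preprocessor check.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in CPU feature bits for this sketch; not the real CPUF_* values.
enum CpuFlag : unsigned { kMmx = 1u << 0, kSse2 = 1u << 1 };

enum class Path { C, Mmx, Sse2 };

// Pick the fastest routine whose requirements are met.
Path choose_path(unsigned cpu_flags, const void* srcp, bool is_x86_32) {
    assert(srcp != nullptr);
    const bool aligned16 = (reinterpret_cast<std::uintptr_t>(srcp) & 15u) == 0;
    if ((cpu_flags & kSse2) && aligned16) return Path::Sse2;  // needs aligned input
    if (is_x86_32 && (cpu_flags & kMmx)) return Path::Mmx;    // 32-bit builds only
    return Path::C;  // unaligned input on 64-bit, or no SIMD support at all
}
```

The point of the design is that no unaligned SSE2 routine has to be written: misaligned input simply falls through to a slower but always-valid path.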

Last edited by TurboPascal7; 19th October 2013 at 02:34. Reason: Misspelled aligned crop as unaligned, completely changing the meaning
Old 19th October 2013, 09:38   #167  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by TurboPascal7 View Post
Hardly a noticeable impact for me as my scripts rarely run faster than 1fps in the end.
Ouch. You must have terrible sources if they require this kind of filtering.
Old 19th October 2013, 11:21   #168  |  Link
ultim
AVS+ Dev
 
Join Date: Aug 2013
Posts: 359
As Turbo pointed out, what is important is that plugin developers can know that in the majority of cases they will get aligned data. Of course they should not rely on this, but they should definitely optimize for it. Crop is a relatively cheap filter, little more than a bitblt, and the performance lost by copying a frame one extra time should be lower than what you'd lose if you ran a more complex filter in its unoptimized version.
Old 20th October 2013, 21:27   #169  |  Link
ultim
AVS+ Dev
 
Join Date: Aug 2013
Posts: 359
Quote:
Originally Posted by Groucho2004 View Post
Here goes:
I used this script (which I also use for PGO) and AVSMeter.
The results are (i5 2500K @ 4GHz):
Ultim's latest DLL: 29.73 fps
Official Alpha5 DLL: 28.95 fps
My Alpha5 ICL10.1 DLL: 30.02 fps
zero9999's ICC14 DLL: 24.56 fps

I'm aware that the script does not cover all internal functions and I'd be happy about suggestions to improve it.
Out of curiosity, I ran your script through a profiler, and have found that over 40% of time was spent in text antialiasing, more specifically in Antialiaser::GetAlphaRect().
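
To illustrate why this matters for the benchmark, here is a small C++ sketch of Amdahl's law: if a fixed share of the runtime (here, text rendering) is untouched by a change, improvements elsewhere are diluted in the measured fps. The function name `overall_speedup` is hypothetical, and the 10% figure in the note below is an assumed example value, not a measurement.

```cpp
#include <cassert>

// Overall speedup when only the runtime outside a fixed-cost section gets faster.
// fixed_fraction: share of the original runtime that does not change (0..1)
// rest_speedup:   speedup factor applied to the remaining share (> 0)
double overall_speedup(double fixed_fraction, double rest_speedup) {
    assert(fixed_fraction >= 0.0 && fixed_fraction <= 1.0 && rest_speedup > 0.0);
    return 1.0 / (fixed_fraction + (1.0 - fixed_fraction) / rest_speedup);
}
```

With `fixed_fraction = 0.4`, a build that makes everything else 10% faster (`rest_speedup = 1.1`) would show up as less than a 6% gain in the script's overall fps.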
Old 20th October 2013, 21:49   #170  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by ultim View Post
Out of curiosity, I ran your script through a profiler, and have found that over 40% of time was spent in text antialiasing, more specifically in Antialiaser::GetAlphaRect().
I know, it's far from a good source for profiling. It does however have a bunch of functions but apparently their CPU usage pales in comparison to the text rendering. I guess I should just use "blankclip" as a source.
Old 20th October 2013, 22:03   #171  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 724
Text anti-aliasing as a separate procedure? Why don't you use the Anti-Grain Geometry library and render the text already anti-aliased?
__________________
...desu!
Old 20th October 2013, 23:44   #172  |  Link
ultim
AVS+ Dev
 
Join Date: Aug 2013
Posts: 359
Quote:
Originally Posted by Groucho2004 View Post
I know, it's far from a good source for profiling. It does however have a bunch of functions but apparently their CPU usage pales in comparison to the text rendering. I guess I should just use "blankclip" as a source.
Basically, anything that doesn't dominate all the other filters this much. Text rendering could perhaps be kept, just with less text or a smaller text area.

Quote:
Originally Posted by Mystery Keeper View Post
Text anti-aliasing as a separate procedure? Why don't you use the Anti-Grain Geometry library and render the text already anti-aliased?
Antialiaser::GetAlphaRect() is just part of the antialiasing. There might be other libraries out there for this purpose (anti-grain, cairo, freetype etc.), but the code for this is not much, it worked just fine, and at least there was no need to link to yet another library. We'll need to link to one when we go cross-platform though, since we're currently relying on GDI for the actual drawing.

Last edited by ultim; 20th October 2013 at 23:49.
Old 21st October 2013, 06:43   #173  |  Link
qyot27
...?
 
Join Date: Nov 2005
Location: Florida
Posts: 1,419
Just because I've been curious about this since the rewritten demuxer was committed this past March, can the 64-bit build of AviSynth+ (or heck, the old builds of AviSynth64) work correctly* with Win64 builds of FFmpeg that were compiled with --enable-avisynth? I have no 64-bit Windows setup to test with, so I've never bothered to actually check.

*obviously defining 'work correctly' in the context of what's actually available in said 64-bit builds of AviSynth(+). Even if it's just Version() or simple file loading, that's enough.

Last edited by qyot27; 21st October 2013 at 06:45.
Old 21st October 2013, 16:57   #174  |  Link
l33tmeatwad
Registered User
 
Join Date: Jun 2007
Posts: 414
Quote:
Originally Posted by qyot27 View Post
Just because I've been curious about this since the rewritten demuxer was committed this past March, can the 64-bit build of AviSynth+ (or heck, the old builds of AviSynth64) work correctly* with Win64 builds of FFmpeg that were compiled with --enable-avisynth? I have no 64-bit Windows setup to test with, so I've never bothered to actually check.

*obviously defining 'work correctly' in the context of what's actually available in said 64-bit builds of AviSynth(+). Even if it's just Version() or simple file loading, that's enough.
Yes, it works with both AviSynth64 & AviSynth+ 64-bit.
__________________
Github | AviSynth 101 | VapourSynth 101
Old 22nd October 2013, 09:00   #175  |  Link
ultim
AVS+ Dev
 
Join Date: Aug 2013
Posts: 359
I'm giving Avs+ a Boost (pun intended). Boost has libraries that are really useful for Avs, so much so that many of them have in fact been adopted into the new C++11 standard. You could ask: why not just use the corresponding standard headers, then? First, not all of them are in C++11. But much more importantly, even those that are, are not all supported by Visual Studio, or are supported only by Visual Studio 2013. And I don't want to limit developers to the newest VS, which was released literally a week ago. In the far future, when C++11 support has become widespread enough, maybe in another 2-3 iterations of VS, I'll be glad to switch from Boost to the standard headers, but until then Boost will make sure we get the appropriate support on every compiler. Below is a list of libraries that could prove really useful. Many of them are non-trivial, but Boost has well-tested, time-proven, maintained, and portable implementations of them.

Boost libs that I've set my eyes on: Atomic, Smart Ptr, Filesystem, Thread, Lockfree.
Other Boost libs that can come in handy: Locale, Unordered, Log, Static Assert, and some others.
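
As an illustration of the overlap with C++11 mentioned above, several of the listed Boost libraries have direct standard counterparts (`std::atomic`, `std::shared_ptr`, `std::unordered_map`, `static_assert`, and `std::thread` for Boost.Thread), whereas Boost.Filesystem and Boost.Lockfree have none in C++11. The sketch below uses the standard versions; `FrameCache` is a hypothetical type made up for this example and is not Avs+ code.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

static_assert(sizeof(int) >= 2, "int is at least 16 bits");  // cf. Boost.StaticAssert

// Hypothetical cache type, only here to exercise the standard facilities.
struct FrameCache {
    std::atomic<int> hits{0};                                   // cf. Boost.Atomic
    std::unordered_map<std::string, std::shared_ptr<int>> map;  // cf. Boost.Unordered, Smart Ptr

    // Returns the cached value for `key`, or an empty pointer if it is missing.
    std::shared_ptr<int> get(const std::string& key) {
        auto it = map.find(key);
        if (it == map.end()) return nullptr;
        hits.fetch_add(1, std::memory_order_relaxed);
        return it->second;
    }
};
```

Because the Boost and standard interfaces for these components are close, code written against Boost today should need mostly namespace and header changes when moving to the standard headers later.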
Old 23rd October 2013, 10:49   #176  |  Link
zerowalker
Registered User
 
Join Date: Jul 2011
Posts: 1,121
Just to make sure: it does not speed up the scripts themselves, for example QTGMC?
Old 23rd October 2013, 10:52   #177  |  Link
TurboPascal7
Registered User
 
Join Date: Jan 2010
Posts: 270
Right now - no. To speed up something like QTGMC one will either need faster plugins or maybe good multithreading. Plus faster plugins.
__________________
Me on GitHub | AviSynth+ - the (dead) future of AviSynth
Old 23rd October 2013, 10:56   #178  |  Link
zerowalker
Registered User
 
Join Date: Jul 2011
Posts: 1,121
Okay, then I got things right. But will this take over the Avisynth development?
There seem to be quite a few branches going on, as well as Vapoursynth, which is supposed to supersede Avisynth eventually, if I understand things correctly.

Last edited by zerowalker; 23rd October 2013 at 10:58.
Old 23rd October 2013, 11:10   #179  |  Link
TurboPascal7
Registered User
 
Join Date: Jan 2010
Posts: 270
Vapoursynth is an entirely different world that some people are not completely happy about (including the author of this fork and myself for example).

As for other Avisynth branches: Avxsynth and Avs64 are effectively dead. Avisynth MT is alive, but it's not a real "branch" of avs, it does not get much development, and it won't "supersede" Avisynth.

As for avs+ taking over the avisynth development - this depends on your understanding of "taking over". No one is going to assassinate IanB and I'm pretty sure he will be maintaining the official avisynth branch in the future. Whether the official branch will continue to be the most widespread version is another question. We'll see.
__________________
Me on GitHub | AviSynth+ - the (dead) future of AviSynth
Old 23rd October 2013, 11:15   #180  |  Link
zerowalker
Registered User
 
Join Date: Jul 2011
Posts: 1,121
I see. I had quite positive thoughts about Vapoursynth, as it seems good in itself (Avisynth is old; time to renew, so to speak).
But well, there's always more than meets the eye.

Good to know. And as you conclude, it's those two I am concerned with: IanB's branch and avs+.

I'd like to have one to follow, but I guess I will have to wait and see whether one of these "wins" the war or whether they continue as two independent branches.

Thanks for the info.