17th October 2013, 02:57 | #161 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
Also, the maximum optimizations don't always ensure maximum performance. It's a lot of work to optimize a binary (especially with the Intel compiler) but sometimes you get pretty decent gains. |
|
17th October 2013, 03:04 | #162 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Here goes:
I used this script (which I also use for PGO) and AVSMeter. The results are (i5 2500K @ 4GHz):
Ultim's latest DLL: 29.73 fps
Official Alpha5 DLL: 28.95 fps
My Alpha5 ICL10.1 DLL: 30.02 fps
zero9999's ICC14 DLL: 24.56 fps
I'm aware that the script does not cover all internal functions and I'd be happy about suggestions to improve it.
Last edited by Groucho2004; 17th October 2013 at 03:14. |
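To put the four figures above on a common scale, here is a minimal sketch that expresses each build relative to the official Alpha5 DLL. The fps numbers are the ones quoted in the post; the function and struct names are made up for illustration and are not part of AVSMeter or AviSynth.

```cpp
#include <cassert>
#include <cstdio>

// Relative speed of a build versus a baseline, in percent.
// Positive means faster than the baseline, negative means slower.
double relative_gain_percent(double fps, double baseline_fps) {
    assert(baseline_fps > 0.0);
    return (fps / baseline_fps - 1.0) * 100.0;
}

// Prints every result from the post relative to the official Alpha5 DLL.
void print_relative_results() {
    struct Result { const char* name; double fps; };
    const Result results[] = {
        { "Ultim's latest DLL",     29.73 },
        { "Official Alpha5 DLL",    28.95 },
        { "My Alpha5 ICL10.1 DLL",  30.02 },
        { "zero9999's ICC14 DLL",   24.56 },
    };
    const double baseline = 28.95;  // official Alpha5 DLL
    for (const Result& r : results) {
        std::printf("%-24s %6.2f fps  %+6.1f%%\n",
                    r.name, r.fps, relative_gain_percent(r.fps, baseline));
    }
}
```

Calling print_relative_results() from main() lists the builds side by side. Three of them land within a few percent of each other, while the ICC14 build is roughly 15% behind the official one.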
17th October 2013, 10:10 | #164 | Link |
AVS+ Dev
Join Date: Aug 2013
Posts: 359
|
Theoretically there shouldn't be any difference between the runtime performance of Avs+ and Avs yet. Any current differences are probably due to different compilers. While some of my current changes might improve running time marginally (removed allocations, fewer exceptions), I have not yet done anything that was specifically aimed at increasing performance.
Edit: The only difference in speed should be initialization/loading time, where I did make conscious improvements. Last edited by ultim; 17th October 2013 at 10:18. |
19th October 2013, 02:07 | #165 | Link | |
Registered User
Join Date: Nov 2001
Posts: 291
|
Quote:
I know it is not an important subject, but this is a warning for plugin developers and, mainly, for users. Take the following code just as an example. Code:
MPEG2Source("D:\TRABAJO32BITS\MALENA\MALENA.d2v") # 720 x 576, use your source
AvsTimer(frames=1000, name="ANYONE", type=3, frequency=2000, total=false, quiet=true) # use your cpu frequency
#crop(8,72,700,432,align=true)
#or
crop(8,72,700,432,align=false)
#BicubicResize(640,352) # or any other sizes and resizers, and many filters as well
#or # just two cases I've tested.
flipvertical() # as flipvertical is in fact just a bitblt
AvsTimer(frames=1500, name="ANYONE", type=3, frequency=2000, difference=1, total=false)
One would expect align=true to be the faster choice; it isn't. The fastest one is with align=false: between 3% and 20% faster with resizers, and more than 40% faster with flipvertical(). Why? Simple: with crop + align=true there are in fact two movements of the whole frame, first the bitblt done by align=true and second the copy into the new frame created by the resize function (or just another bitblt with flipvertical). Obviously this situation only happens if the filter immediately after crop in the chain creates a new frame. From that frame onwards we have an aligned buffer again.
There are a few filters that can work in place under the appropriate conditions, and in that case align=true might be better. I still have some doubts; I didn't test. (One I know by heart is the ISSE code of Tweak by dividee, but there are probably many others.) So I'm not saying align=false is better, just that the end user should be aware of how the next filter in the chain after the crop works. Is that possible? Probably not. At the least I would add a WARNING to the documentation for users: IF YOU RESIZE IMMEDIATELY AFTER A CROP, ALIGN=FALSE COULD BE FASTER.
As for plugin developers, I would never trust that the source (last) you receive is always aligned, to 16 bytes for SSE2 or 32 bytes for the new AVX. So I would always check, unless the only way to get misaligned data is crop with align=false; I don't remember and didn't check.
I hope this can be useful. ARDA |
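The "two movements of the whole frame" argument can be sketched in C++. This is a simplified model under stated assumptions, not AviSynth's actual implementation: PlaneView, OwnedPlane, crop_view and crop_copy_aligned are hypothetical names, and a single 8-bit plane stands in for a full frame. An unaligned crop only moves a pointer, while an aligned crop has to copy every row into a fresh buffer before the next filter copies the data again.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified view of one 8-bit plane (not AviSynth's VideoFrame).
struct PlaneView {
    const std::uint8_t* data;  // first visible pixel
    int pitch;                 // bytes between the starts of two rows
    int width;                 // visible bytes per row
    int height;                // visible rows
};

bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);  // power of two
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// align=false style crop: no pixel is moved, only the pointer and size change.
PlaneView crop_view(const PlaneView& src, int left, int top, int width, int height) {
    assert(left >= 0 && top >= 0);
    assert(left + width <= src.width && top + height <= src.height);
    return PlaneView{ src.data + static_cast<std::ptrdiff_t>(top) * src.pitch + left,
                      src.pitch, width, height };
}

// Owns the storage behind an aligned copy. Move-only, so the view cannot
// outlive or get detached from its buffer through an accidental copy.
struct OwnedPlane {
    std::vector<std::uint8_t> storage;
    PlaneView view{ nullptr, 0, 0, 0 };
    OwnedPlane() = default;
    OwnedPlane(OwnedPlane&&) = default;
    OwnedPlane& operator=(OwnedPlane&&) = default;
};

// align=true style crop: one full copy (the extra "bitblt") into an aligned buffer.
OwnedPlane crop_copy_aligned(const PlaneView& src, int left, int top,
                             int width, int height, std::size_t alignment) {
    const PlaneView cropped = crop_view(src, left, top, width, height);
    const int pitch = static_cast<int>(
        (static_cast<std::size_t>(width) + alignment - 1) / alignment * alignment);

    OwnedPlane out;
    out.storage.resize(static_cast<std::size_t>(pitch) * height + alignment);
    std::uint8_t* base = out.storage.data();
    const std::size_t skip =
        (alignment - (reinterpret_cast<std::uintptr_t>(base) & (alignment - 1))) & (alignment - 1);
    std::uint8_t* dst = base + skip;  // first aligned byte inside the storage

    for (int y = 0; y < height; ++y) {
        std::memcpy(dst + static_cast<std::ptrdiff_t>(y) * pitch,
                    cropped.data + static_cast<std::ptrdiff_t>(y) * cropped.pitch,
                    static_cast<std::size_t>(width));
    }
    out.view = PlaneView{ dst, pitch, width, height };
    return out;
}
```

With a left offset of 8 bytes, as in the crop call above, crop_view returns a pointer that is generally no longer 16-byte aligned. crop_copy_aligned fixes that at the price of touching every pixel once more, which is the cost the post measures.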
|
19th October 2013, 02:26 | #166 | Link |
Registered User
Join Date: Jan 2010
Posts: 270
|
Yes, without a doubt, unaligned crop is faster in some (most) cases. Sometimes significantly so. Should a user know about it? Probably. Does it matter? Not really. You don't use crop often in a single script and even on my Core i7 860 system (stock clocks, 4 years old), aligned crop at 1080p runs over 2000fps. Hardly a noticeable impact for me as my scripts rarely run faster than 1fps in the end.
As for developers: yes, one should always assume that any kind of data can arrive. But one can also assume that most of the time the data received will be aligned. This means the unaligned SIMD routines can be omitted completely, with all such cases dispatched to the plain C implementation. For example, the recommended dispatch in the core filters looks like this: Code:
if ((env->GetCPUFlags() & CPUF_SSE2) && IsPtrAligned(srcp, 16))
{
  process_sse2(some args);
}
else
#ifdef X86_32
if (env->GetCPUFlags() & CPUF_MMX)
{
  process_mmx(some args);
}
else
#endif
  process_c(some args);
Last edited by TurboPascal7; 19th October 2013 at 02:34. Reason: Misspelled aligned crop as unaligned, completely changing the meaning |
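For readers who want to try the pattern outside of AviSynth, here is a self-contained sketch of the same decision. Everything in it is a hypothetical stand-in: kCpuSse2 replaces the real CPU flag, is_ptr_aligned replaces IsPtrAligned, and choose_path only reports which routine would run instead of calling one. The MMX branch is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical CPU feature flag; the real value comes from the host application.
enum CpuFlags : unsigned { kCpuSse2 = 1u << 0 };

// Which implementation the dispatcher would pick.
enum class Path { Sse2, PlainC };

// True if ptr is a multiple of alignment, which must be a power of two.
bool is_ptr_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Take the SIMD path only when the CPU supports it AND the data is aligned;
// every other case, including misaligned input, falls back to plain C.
Path choose_path(unsigned cpu_flags, const void* srcp) {
    if ((cpu_flags & kCpuSse2) && is_ptr_aligned(srcp, 16)) {
        return Path::Sse2;
    }
    return Path::PlainC;
}
```

The design choice is the one described in the post: no separate unaligned SIMD routine is written, so the rare misaligned frame costs some speed but never correctness.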
19th October 2013, 11:21 | #168 | Link |
AVS+ Dev
Join Date: Aug 2013
Posts: 359
|
As Turbo pointed out, what is important is that plugin developers can know that in the majority of cases they will get aligned data. Of course they should not rely on this, but they should definitely optimize for it. Crop is a relatively cheap filter, little more than a bitblt, and the performance that is lost by copying a frame one extra time should be lower than what you'd lose if you ran a more complex filter in its unoptimized version.
|
20th October 2013, 21:27 | #169 | Link | |
AVS+ Dev
Join Date: Aug 2013
Posts: 359
|
Quote:
|
|
20th October 2013, 21:49 | #170 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
I know, it's far from a good source for profiling. It does however have a bunch of functions but apparently their CPU usage pales in comparison to the text rendering. I guess I should just use "blankclip" as a source.
|
20th October 2013, 23:44 | #172 | Link | |
AVS+ Dev
Join Date: Aug 2013
Posts: 359
|
Quote:
Antialiaser::GetAlphaRect() is just part of the antialiasing. There might be other libraries out there for this purpose (anti-grain, cairo, freetype etc.), but the code for this is not much, it worked just fine, and at least there was no need to link to yet another library. We'll need to link to one when we go cross-platform though, since we're currently relying on GDI for the actual drawing. Last edited by ultim; 20th October 2013 at 23:49. |
|
21st October 2013, 06:43 | #173 | Link |
...?
Join Date: Nov 2005
Location: Florida
Posts: 1,419
|
Just because I've been curious about this since the rewritten demuxer was committed this past March, can the 64-bit build of AviSynth+ (or heck, the old builds of AviSynth64) work correctly* with Win64 builds of FFmpeg that were compiled with --enable-avisynth? I have no 64-bit Windows setup to test with, so I've never bothered to actually check.
*obviously defining 'work correctly' in the context of what's actually available in said 64-bit builds of AviSynth(+). Even if it's just Version() or simple file loading, that's enough. Last edited by qyot27; 21st October 2013 at 06:45. |
21st October 2013, 16:57 | #174 | Link | |
Registered User
Join Date: Jun 2007
Posts: 414
|
Quote:
|
|
22nd October 2013, 09:00 | #175 | Link |
AVS+ Dev
Join Date: Aug 2013
Posts: 359
|
I'm giving Avs+ a Boost (pun). It has libraries that are really useful for Avs, so much so that many of them have in fact been adopted from there into the new C++11 standard. I guess you could ask, why not just use the corresponding standard headers then? First, not all of them are in C++11. But much more importantly, even those that are, are not all supported by Visual Studio, or are only supported by Visual Studio 2013. And I don't want to limit developers to the newest VS, which was released literally a week ago. In the far future, when C++11 support has become widespread enough among people, in another 2-3 iterations of VS maybe, I'll be glad to switch from Boost to the standard headers, but until then, Boost will make sure we get the appropriate support on every compiler. Below is a list of libraries that could prove really useful. Many of them are non-trivial, but Boost has well-tested, time-proven, maintained, and portable implementations of them.
Boost libs that I've set my eyes on: Atomic, Smart Ptr, Filesystem, Thread, Lockfree. Other Boost libs that can come in handy: Locale, Unordered, Log, Static Assert, and some others. |
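One way such a later switch from Boost to the standard headers could be kept cheap is a small compatibility namespace. This is only a sketch of the idea, not how Avs+ is organized: the macro AVS_SKETCH_USE_BOOST, the namespace compat, and the FrameStats example are all invented for illustration. Only the Atomic and Smart Ptr pieces are shown, since both have direct C++11 counterparts.

```cpp
#include <cassert>

#ifdef AVS_SKETCH_USE_BOOST            // hypothetical build switch
  #include <boost/atomic.hpp>
  #include <boost/make_shared.hpp>
  #include <boost/shared_ptr.hpp>
  namespace compat {
    using boost::atomic;
    using boost::make_shared;
    using boost::shared_ptr;
  }
#else                                  // compilers with C++11 library support
  #include <atomic>
  #include <memory>
  namespace compat {
    using std::atomic;
    using std::make_shared;
    using std::shared_ptr;
  }
#endif

// Hypothetical shared counter; code below only ever names compat::, never
// boost:: or std:: directly, so the backend can change in one place.
struct FrameStats {
    compat::atomic<long> frames_served;
    FrameStats() : frames_served(0) {}
};

// Atomically counts `count` served frames and returns the running total.
long serve_frames(const compat::shared_ptr<FrameStats>& stats, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        ++stats->frames_served;  // atomic increment
    }
    return stats->frames_served.load();
}
```

A caller would create the object with compat::make_shared<FrameStats>() and pass it around as usual; flipping the macro swaps the implementation without touching the calling code.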
23rd October 2013, 10:56 | #178 | Link |
Registered User
Join Date: Jul 2011
Posts: 1,121
|
Okay, then I got things right, but will this take over the Avisynth development?
There seem to be quite a few branches going on, as well as Vapoursynth, which is supposed to supersede Avisynth eventually, if I understand things correctly. Last edited by zerowalker; 23rd October 2013 at 10:58. |
23rd October 2013, 11:10 | #179 | Link |
Registered User
Join Date: Jan 2010
Posts: 270
|
Vapoursynth is an entirely different world that some people are not completely happy about (including the author of this fork and myself for example).
As for other Avisynth branches - Avxsynth and Avs64 are effectively dead. Avisynth MT is alive, but it's not a real "branch" of avs, it does not get much development, and it won't "supersede" Avisynth. As for avs+ taking over the avisynth development - this depends on your understanding of "taking over". No one is going to assassinate IanB and I'm pretty sure he will be maintaining the official avisynth branch in the future. Whether the official branch will continue to be the most widespread version is another question. We'll see. |
23rd October 2013, 11:15 | #180 | Link |
Registered User
Join Date: Jul 2011
Posts: 1,121
|
I see. I had quite positive thoughts about Vapoursynth, as it seems good in itself (Avisynth is old, time to renew, so to speak).
But well, there's always more than meets the eye. Good to know, and as you conclude, it's those two I am concerned with, IanB and avs+. I'd like to have one to follow, but I guess I will have to wait and see if one of these will "win" the war, or if they will continue to be two independent branches. Thanks for the info. |