9th September 2018, 01:22 | #1 | Link |
Registered User
Join Date: Mar 2018
Posts: 447
|
Introducing Zopti (ex AvisynthOptimizer)
Last year I dug up some old VHS videos and decided to finally digitize them. The tapes are old and I wanted to preserve them in the best quality I could afford. I soon found AviSynth and started to investigate how to do restoration and deinterlacing (for playback on computer).
This led me to the QTGMC plugin, and I was impressed by its capabilities, but I did find some strange artifacts when I used the best quality settings. These artifacts were not present at lower quality settings. Being a perfectionist, I naturally searched manually for which of the 92 parameters were responsible for the artifacts and found them (more about that later).

There is quite a bit of noise in the old VHS tapes, so naturally I investigated noise removal plugins. The temporal denoising made possible by MVTools is very important, and I wanted to find out which settings are best for my VHS tapes. Reading these and other video restoration forums, I found that there really isn't any one set of settings that works for every type of video, so finding good settings would involve a lot of manual testing. Using MVTools to do motion compensation involves about 60 parameters, so there's a lot to adjust... this task seemed hopeless. So I began to wonder if anything could automate this kind of process. It turns out there is, and I made a tool that can help do it. And now I'm about to share it with you.

An AviSynth script is first "augmented" by adding instructions on which parameters should be optimized. The instructions are written inside comments, so the script will continue to work normally. The script also needs to measure the quality of the processed video using the current parameter values, or measure the running time (or preferably both quality and runtime), and write these results to a specific file. Now you might be wondering how on earth a script can measure quality. I can only think of one way: compare the frames to reference frames and measure the similarity. The higher the similarity value, the better the quality. You could measure similarity in a very simplistic way using LumaDifference and ChromaUDifference+ChromaVDifference, but the best similarity metric AviSynth currently has is SSIM.
SSIM is no longer the best similarity metric, but it's still widely used in the image processing community. The SSIM plugin needs to have a function which returns the similarity value to the script. The only version that currently does that (as far as I know) is v0.25.1 by mitsubishi.

Ok, but where do we get the reference frames? In the case of MVTools we can use the original frames as reference frames and have the script try to reconstruct them using motion compensation (but it's not allowed to use the reference frame itself in the reconstruction). We can do something similar if we want to use MVTools to double the framerate: we first create the double-rate video, then remove the original frames from it, then double the framerate again and finally compare the result to the original frames. This idea is not limited to MVTools; you could for example do color grading in some other software and then try to recreate the same result in AviSynth. I'm sure the smart people here will find use cases I couldn't even dream of.

Most people don't only care about quality; the script's run time is also important. I found a couple of different plugins to measure the runtime, but only one of them (AvsTimer) fit the bill, and even then I had to make some modifications to it.

So if the purpose is to find the best settings, what do we consider the "best" when both quality and time are involved? For example, say we have settings A with quality 99 (larger is better) and time 2300ms (larger is worse), and settings B with quality 95 and time 200ms. Which one is better? We can use the concept of Pareto domination to answer this question. When one solution is at least as good as the other solution in every objective (in this case the objectives are quality and speed) and better in at least one objective, it dominates the other solution. In this example, neither dominates the other. But if we have settings C with quality 99 and time 200ms, it dominates both A and B.
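The domination rule is easy to express in code. Here's a minimal Python sketch (just an illustration, not code from the optimizer); the tuples are (quality, time) pairs as in the example above:

```python
def dominates(a, b):
    """True if solution a Pareto-dominates solution b.

    Each solution is a (quality, time) tuple: quality is maximized,
    time is minimized.
    """
    at_least_as_good = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return at_least_as_good and strictly_better

A = (99, 2300)  # quality 99, 2300 ms
B = (95, 200)
C = (99, 200)

print(dominates(A, B))  # False: A has better quality but worse time
print(dominates(C, A))  # True: same quality, faster
print(dominates(C, B))  # True: better quality, same time
```

Neither A nor B dominates the other, but C dominates both, exactly as in the example.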
In the end we want to know all the nondominated solutions, which together are called the Pareto front. So there's going to be a list of parameter sets with increasing quality and runtime. You can have more than two objectives if you want; the same Pareto concept still works. There's a good and free ebook called "Essentials of Metaheuristics" which describes the Pareto concept and much more.

AvisynthOptimizer reads the augmented script, verifies it and starts running a metaheuristic optimization algorithm. There are currently three algorithm choices: NSGA-II, SPEA2 and mutation. The first two are among the best metaheuristic algorithms available and are also described in the ebook mentioned above. Mutation is my own simplistic algorithm (but it can be useful if you're in a hurry, since it's the most efficient). All the metaheuristic algorithms work in a similar manner, generating different solutions and testing them. The optimizer performs these tests by creating AviSynth scripts with certain parameter values and running them. The script writes the quality/time results into a file, which the optimizer then reads. The metaheuristic then decides which parameters to try next based on the results.

This continues until some end criterion is met. There are three different ending criteria: number of iterations, time limit and "dynamic". Number of iterations is just that: the algorithm runs the script a specific number of times. Setting a time limit can be pretty useful if you know how much time you can spend on the optimization; you could for example let it run overnight for 8 hours and see the results in the morning. The dynamic variant stops only when it isn't making progress anymore, where progress is defined as "no new Pareto front members in the last x iterations". This can be useful if you want to find the best possible results regardless of how long it takes. During the optimization all tested parameters and their results are written to a log file.
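Finding the nondominated set is just a filter over the results. A minimal Python sketch of the idea (an illustration only, not the optimizer's implementation):

```python
def pareto_front(results):
    """Return the nondominated subset of (quality, time) results.
    Quality is maximized, time is minimized."""
    def dominates(a, b):
        return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])
    # a result belongs to the front if no other result dominates it
    return [r for r in results
            if not any(dominates(other, r) for other in results)]

results = [(99, 2300), (95, 200), (99, 200), (90, 5000)]
print(pareto_front(results))  # [(99, 200)] since it dominates every other result
```

With only nondominated results, e.g. `[(99, 2300), (95, 200)]`, the whole list is returned: that trade-off curve between quality and runtime is the Pareto front.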
This log file can be "evaluated" during and after the optimization process. Evaluation basically means finding the Pareto front and showing it. You can also create AviSynth scripts from the Pareto front in order to test them yourself.

It's also possible to visualize the results of the log file in a two-dimensional scatter chart. This chart highlights the Pareto front and shows all the other results too. The chart can also be "autorefreshing": it loads the log file every few seconds and updates the visuals, which is a fun way to track how the optimization is progressing. Here's a GIF of what it looks like (obviously sped up): The visualization has a few other bells and whistles, but one I'd like to highlight here is the group-by functionality: you can group the results by a certain parameter's values and show a separate Pareto front for each value. I think a picture is in order here; this is what grouping by MVTools' blocksize looks like:

Measuring the script's runtime is not very accurate, i.e. it has some variation. All the other processes running at the same time are using CPU cycles and messing with the cache, so you should try to minimize other activity on the computer. To get more accurate results you can run a validation on the finished results log file. In validation the idea is to run the Pareto front results multiple times and calculate the average, median, minimum or maximum of these measurements (you can decide which one(s)). After the Pareto front is measured, validation calculates the "secondary Pareto front" by removing the results in the original Pareto front and finding the Pareto front of the remaining results. The validation is run on the secondary Pareto front as well, because with inaccurate runtimes it's possible that the real Pareto front contains results from the secondary Pareto front. If the secondary Pareto front did contain real Pareto front members, the validation takes the third Pareto front and validates it as well.
And so on, until the current Pareto front doesn't contain any new Pareto front members.

So how good is the optimizer? Let's take a look at one example. A while back there was a thread about the best motion interpolation filters. There's a test video with a girl waving her hand. The best options currently are John Meyer's jm_fps script and FrameRateConverter. Here's a comparison GIF with those two and AvisynthOptimizer. FrameRateConverter was run with preset="slow". Now obviously I'm showing a bit of a cherry-picked example here. The optimizer was instructed to search for the best parameters for this short 10-frame sequence. I have run most of my optimization runs using only 10 frames because otherwise the optimization takes too long. Ideally the optimizer would automatically select the frames from a longer video; I have some ideas on how to implement that, but as of now the user has to make the selection. Also, this part of the video is not the most challenging part; I decided to try something relatively easy first. After all, the optimizer cannot do miracles (that feature is not finished yet).

At this point I envision AvisynthOptimizer as a useful tool for plugin authors, so they can test plugin parameters and try to search for the optimal ones. At some later point AvisynthOptimizer could be useful for normal users, when combined with a script with a limited search space so that the search will not take an excessively long time.

All right, that's all for now. I will add more detailed explanations and the download links tomorrow. I haven't actually finished the documentation yet, but still wanted to get this thing out there before my vacation is over.

[EDIT] Here are the download links: Download the zip (version 1.2.3) here and unzip it to a folder of your liking. You will also need my modded AvsTimer and the SSIM plugin with the SSIM_FRAME function, which returns the SSIM value to the script. I have a package which contains both of those here. 
More documentation can be found in these posts (if you're not interested in reading the whole thread): Augmented Script (part 1/2) Augmented Script (part 2/2) Hands-on tutorial (part 1/2) Hands-on tutorial (part 2/2) Optimizer arguments Last edited by zorr; 5th March 2022 at 21:24. Reason: Changed download link to latest version 1.2.3 |
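The validation procedure described above peels off successive Pareto fronts (primary, secondary, and so on). The layering idea can be sketched in Python (an illustration under the same (quality, time) convention as before, not the actual tool code):

```python
def peel_fronts(results):
    """Split (quality, time) results into successive Pareto fronts:
    the first front, then the front of what remains, and so on."""
    def dominates(a, b):
        return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])
    remaining = list(results)
    fronts = []
    while remaining:
        front = [r for r in remaining
                 if not any(dominates(o, r) for o in remaining)]
        fronts.append(front)
        # everything on this front is removed before the next layer is computed
        remaining = [r for r in remaining if r not in front]
    return fronts

fronts = peel_fronts([(99, 200), (95, 100), (99, 2300), (90, 150)])
print(fronts[0])  # primary front:   [(99, 200), (95, 100)]
print(fronts[1])  # secondary front: [(99, 2300), (90, 150)]
```

Validation re-measures the runtimes of the first layer, and because a re-measurement can promote a second-layer result into the real front, the second layer gets validated too, and so on down the layers.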
9th September 2018, 12:53 | #3 | Link |
Registered User
Join Date: Sep 2010
Location: Russia
Posts: 85
|
Sounds cool, but wouldn't it make more sense to measure quality by halving the source framerate and comparing the upsampled result to the original frames, instead of double-upsampling and comparing to the already imperfect recreated frames? Settings that worked well for a lower source framerate will surely work even better for a higher one.
Besides, I don't get why everyone's using FRC as the golden measure; all it does is blend frames like ConvertFps, but slower. |
9th September 2018, 14:06 | #4 | Link | ||
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
you can try to offset the difficulty by selecting a moderately easy part of the video for optimization. I think this idea is worth investigating. I will run some tests later. Quote:
|
||
9th September 2018, 17:12 | #5 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Quote:
I did start on some half-arsed similar project but gave up (well, it's on a back burner somewhere). I was intending to use it from script, using RT_Stats' RT_Array and/or RT_DBase, maybe DBase for arg names, ranges etc, and array for results. Your effort looks way more organized than my totally unplanned (let's see what happens) effort. Good luck and keep us all posted, your cherry pickin's look good to me EDIT: I guess your thread explains all of the 'torture' you put me through in this thread:- http://forum.doom9.org/showthread.php?t=175373 EDIT: Also take note that the source framerate is also an indirect arg, and it would be good to also know how it affects the results.
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 9th September 2018 at 17:54. |
|
9th September 2018, 20:52 | #6 | Link | ||||
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
Quote:
Quote:
Quote:
Sorry, which function has this argument? |
||||
9th September 2018, 21:49 | #7 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Quote:
(i.e. the results will be different depending on the input framerate; a higher input framerate will almost certainly give better output). EDIT: Assuming e.g. output at double rate. EDIT: Also, assuming that I'm correct in thinking that vector length is limited to signed BYTE size, it is also affected by input frame size (and also the pel setting) [perhaps for 16-bit colorspace it's limited to a 16-bit vector, don't know].
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 9th September 2018 at 22:21. |
|
10th September 2018, 01:20 | #8 | Link |
Registered User
Join Date: Mar 2018
Posts: 447
|
Augmented Script (part 1/2)
Let's take a look at what an "augmented" script looks like. This is a complete script that contains everything AvisynthOptimizer needs. The script uses MFlowInter to reconstruct every frame of the video from the neighbouring frames and then compares the reconstructions to the original frames.
Code:
TEST_FRAMES = 10 # how many frames are tested
MIDDLE_FRAME = 50 # middle frame number

AVISource("d:\process2\1 deinterlaced.avi")
orig = last

# you could add preprocessing here to help MSuper - not used here
searchClip = orig

super_pel = 4 # optimize super_pel = _n_ | 2,4 | super_pel
super_sharp = 2 # optimize super_sharp = _n_ | 0..2 | super_sharp
super_rfilter = 2 # optimize super_rfilter = _n_ | 0..4 | super_rfilter
super_search = MSuper(pel=super_pel, sharp=super_sharp, rfilter=super_rfilter, searchClip)
super_render = MSuper(pel=super_pel, sharp=super_sharp, rfilter=super_rfilter, last, levels=1)

blockSize = 8 # optimize blockSize = _n_ | 6,8,12,16,24,32 ; min:divide 0 > 8 2 ? ; filter:overlap overlapv max 2 * x <= | blockSize
searchAlgo = 5 # optimize searchAlgo = _n_ | 0..7 D | searchAlgo
searchRange = 2 # optimize searchRange = _n_ | 1..10 | searchRange
searchRangeFinest = 2 # optimize searchRangeFinest = _n_ | 1..10 | searchRangeFinest
lambda = 1000*(blockSize*blockSize)/(8*8) # optimize lambda = _n_ | 0..20000 | lambda
lsad=1200 # optimize lsad=_n_ | 8..20000 | LSAD
pnew=0 # optimize pnew=_n_ | 0..256 | pnew
plevel=1 # optimize plevel=_n_ | 0..2 | plevel
overlap=2 # optimize overlap=_n_ | 0,2,4,6,8,10,12,14,16 ; max:blockSize 2 / ; filter:x divide 0 > 4 2 ? % 0 == | overlap
overlapv=2 # optimize overlapv=_n_ | 0,2,4,6,8,10,12,14,16 ; max:blockSize 2 / | overlapv
divide=0 # optimize divide=_n_ | 0..2 ; max:blockSize 8 >= 2 0 ? overlap 4 % 0 == 2 0 ? min | divide
globalMotion = true # optimize globalMotion = _n_ | false,true | globalMotion
badSAD = 10000 # optimize badSAD = _n_ | 4..10000 | badSAD
badRange = 24 # optimize badRange = _n_ | 4..50 | badRange
meander = true # optimize meander = _n_ | false,true | meander
temporal = false # optimize temporal = _n_ | false,true | temporal
trymany = false # optimize trymany = _n_ | false,true | trymany

# smallest delta is 1 but you can make the task more challenging by using larger delta (larger deltas often used in temporal denoising)
delta = 1
useChroma = true
bv = MAnalyse(super_search, isb = true, blksize=blockSize, search=searchAlgo, searchparam=searchRange, pelsearch=searchRangeFinest, chroma=useChroma, \
     delta=delta, lambda=lambda, lsad=lsad, pnew=pnew, plevel=plevel, global=globalMotion, overlap=overlap, overlapv=overlapv, divide=divide, badSAD=badSAD, \
     badrange=badRange, meander=meander, temporal=temporal, trymany=trymany)
fv = MAnalyse(super_search, isb = false, blksize=blockSize, search=searchAlgo, searchparam=searchRange, pelsearch=searchRangeFinest, chroma=useChroma, \
     delta=delta, lambda=lambda, lsad=lsad, pnew=pnew, plevel=plevel, global=globalMotion, overlap=overlap, overlapv=overlapv, divide=divide, badSAD=badSAD, \
     badrange=badRange, meander=meander, temporal=temporal, trymany=trymany)

# NOTE: we disable scene change detection by setting thSCD1 very high
blockChangeThreshold = 10000
maskScale = 70 # optimize maskScale = _n_ | 1..300 | maskScale
inter = last.MFlowInter(super_render, bv, fv, time=50, ml=maskScale, thSCD1=blockChangeThreshold, thSCD2=100, blend=false)

# SSIM needs YV12 colorspace
inter_yv12 = inter.ConvertToYV12()
orig_yv12 = orig.ConvertToYV12()

# for comparison original must be forwarded one frame
orig_yv12 = trim(orig_yv12,1,0)

# cut out the part used in quality / speed evaluation
inter_yv12 = inter_yv12.Trim(MIDDLE_FRAME - TEST_FRAMES/2 + (TEST_FRAMES%2==0?1:0), MIDDLE_FRAME + TEST_FRAMES/2)
orig_yv12 = orig_yv12.Trim(MIDDLE_FRAME - TEST_FRAMES/2 + (TEST_FRAMES%2==0?1:0), MIDDLE_FRAME + TEST_FRAMES/2)
last = inter_yv12

# calculate SSIM value for each test frame
global total = 0.0
global ssim_total = 0.0
FrameEvaluate(last, """
    global ssim = SSIM_FRAME(orig_yv12, inter_yv12)
    global ssim = (ssim == 1.0 ? 0.0 : ssim)
    global ssim_total = ssim_total + ssim
""")

# measure runtime, plugin writes the value to global avstimer variable
# NOTE: AvsTimer should be called before WriteFile
global avstimer = 0.0
AvsTimer(frames=1, type=0, total=false, name="Optimizer")

# per frame logging (ssim, time)
delimiter = "; "
resultFile = "D:\optimizer\perFrame.txt"
# output out1="ssim: MAX(float)" out2="time: MIN(time) ms" file="D:\optimizer\perFrame.txt"
WriteFile(resultFile, "current_frame", "delimiter", "ssim", "delimiter", "avstimer")

# write "stop" at the last frame to tell the optimizer that the script has finished
frame_count = FrameCount()
WriteFileIf(resultFile, "current_frame == frame_count-1", """ "stop " """, "ssim_total", append=true)

# return original and reconstructed frame side by side for comparison
#return StackHorizontal(orig_yv12, inter_yv12)

# NOTE: must return last or FrameEvaluate will not run
return last
Code:
TEST_FRAMES = 10 # how many frames are tested
MIDDLE_FRAME = 50 # middle frame number
Code:
super_pel = 4 # optimize super_pel = _n_ | 2,4 | super_pel
super_sharp = 2 # optimize super_sharp = _n_ | 0..2 | super_sharp
super_rfilter = 2 # optimize super_rfilter = _n_ | 0..4 | super_rfilter
super_search = MSuper(pel=super_pel, sharp=super_sharp, rfilter=super_rfilter, searchClip)
super_render = MSuper(pel=super_pel, sharp=super_sharp, rfilter=super_rfilter, last, levels=1)
The first section (after the word optimize) tells which part of the code needs to be manipulated by the optimizer. The part with "_n_" is replaced with different values; the rest is there just to give enough context on where this "_n_" part is located. Whitespace matters here, so "# optimize super_pel=_n_" would not work when the code says "super_pel = 4". The second section lists the valid values the optimizer should try for this parameter. You can define it as a range, for example "1..5", which means values from 1 to 5. Or you can define the values as a comma-separated list, for example "1,2,3". Booleans are supported (usually given as "false,true"), and so are strings. Floats however are not, so if the parameter is a floating point number, define it something like this:
Code:
param = 100/1000.0 # optimize param = _n_/1000.0 | 0..1000 | paramName
The last section is the name of the parameter. I have named some of the parameters differently than their corresponding parameter names in MVTools, just to make it a little easier to remember what they do. These names are used in the log files where tested parameter values are reported. You can also refer to other parameters by using these names (more about that in a little while).
Code:
searchAlgo = 5 # optimize searchAlgo = _n_ | 0..7 D | searchAlgo
Code:
overlap=2 # optimize overlap=_n_ | 0,2,4,6,8,10,12,14,16 ; max:blockSize 2 / ; filter:x divide 0 > 4 2 ? % 0 == | overlap
The better solution is to tell the optimizer what kind of dependencies the parameters have between them. In this case we define a max dependency: "max:blockSize 2 /". The part after "max:" is a function written in reverse polish notation; for all the non-reverse-polish people this means blockSize/2. Now the optimizer knows that the maximum value of overlap is blockSize/2. Here we are referring to another parameter, "blockSize", by its name. The final part, and by far the most difficult one, is the filter dependency. It's used when minimum and maximum dependencies are not enough. The idea of the filter is that the optimizer tests the current parameter values at runtime. The current value of the parameter is substituted into the formula in place of "x"; the formula is then evaluated, and if it returns true, the parameter value is accepted as valid. The formula here in reverse polish form is "x divide 0 > 4 2 ? % 0 ==", which in infix form is "x % ((divide > 0) ? 4 : 2) == 0". Note that "divide" is not the division operator but another parameter's name (perhaps not the best name choice here). In plain English this states that if divide > 0, then overlap should be divisible by 4; otherwise it should be divisible by 2.
Code:
blockSize = 8 # optimize blockSize = _n_ | 6,8,12,16,24,32 ; min:divide 0 > 8 2 ? ; filter:overlap overlapv max 2 * x <= | blockSize
Whenever you define a dependency between two parameters, you should define it in both of them. For example, because overlap has a max dependency on blockSize, blockSize should have a min or filter dependency on overlap. The reason is to avoid bias in the search: if there is a conflict between overlap and blockSize, the optimizer will try to resolve it by changing either overlap's value or blockSize's value (and it does this fairly, so that both get changed equally often). If the dependency is defined in only one of the parameters, it will always change that parameter's value (it's not smart enough to figure out the valid values for the other one). This would introduce bias in the search and could ruin the chances of finding the optimal results. That's all there is to defining the parameters to optimize. The reverse polish notation is something I would like to change since it's not the most user friendly format; the infix parsers I looked at were unfortunately not able to deal with ternary operators. But if you just want to optimize MVTools, the hard part is already done and you can reuse my parameter definitions.
Code:
# calculate SSIM value for each test frame
global total = 0.0
global ssim_total = 0.0
FrameEvaluate(last, """
    global ssim = SSIM_FRAME(orig_yv12, inter_yv12)
    global ssim = (ssim == 1.0 ? 0.0 : ssim)
    global ssim_total = ssim_total + ssim
""")
Code:
# measure runtime, plugin writes the value to global avstimer variable
# NOTE: AvsTimer should be called before WriteFile
global avstimer = 0.0
AvsTimer(frames=1, type=0, total=false, name="Optimizer")
Part 2 below... Last edited by zorr; 23rd November 2018 at 00:45. Reason: Added new operators !=, and, or |
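The reverse polish dependency formulas described above can be evaluated with a small stack machine. Here's a Python sketch of the idea; the operator set and the integer-only number handling are assumptions based on the examples in these posts, not the optimizer's actual parser:

```python
import operator

# binary operators seen in the example formulas (assumed set, not exhaustive)
BINOPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
          '/': operator.floordiv, '%': operator.mod,
          '>': operator.gt, '<': operator.lt,
          '>=': operator.ge, '<=': operator.le, '==': operator.eq,
          'min': min, 'max': max}

def eval_rpn(formula, env):
    """Evaluate a whitespace-separated reverse polish expression.
    env maps parameter names (and 'x') to integer values."""
    stack = []
    for tok in formula.split():
        if tok in BINOPS:
            b = stack.pop(); a = stack.pop()
            stack.append(BINOPS[tok](a, b))
        elif tok == '?':  # ternary: cond t f ? -> t if cond else f
            f = stack.pop(); t = stack.pop(); cond = stack.pop()
            stack.append(t if cond else f)
        else:  # parameter reference or integer literal
            stack.append(env[tok] if tok in env else int(tok))
    return stack[0]

# the overlap filter: "x % ((divide > 0) ? 4 : 2) == 0"
formula = "x divide 0 > 4 2 ? % 0 =="
print(eval_rpn(formula, {'x': 8, 'divide': 1}))  # True  (8 % 4 == 0)
print(eval_rpn(formula, {'x': 6, 'divide': 1}))  # False (6 % 4 != 0)
print(eval_rpn(formula, {'x': 6, 'divide': 0}))  # True  (6 % 2 == 0)
print(eval_rpn("blockSize 2 /", {'blockSize': 16}))  # 8
```

The max dependency "max:blockSize 2 /" evaluates the same way, just without the ternary.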
10th September 2018, 01:26 | #9 | Link |
Registered User
Join Date: Mar 2018
Posts: 447
|
Augmented Script (part 2/2)
Code:
# per frame logging (ssim, time)
delimiter = "; "
resultFile = "D:\optimizer\perFrame.txt"
# output out1="ssim: MAX(float)" out2="time: MIN(time) ms" file="D:\optimizer\perFrame.txt"
WriteFile(resultFile, "current_frame", "delimiter", "ssim", "delimiter", "avstimer")
The "file=" definition is the only compulsory output parameter; the others have default values. The optimizer changes the file name the results are written to for every new script it creates. For that reason it needs to know which part of the text to replace with the new file name: we just repeat the same file name that is used as resultFile's value.
Code:
# write "stop" at the last frame to tell the optimizer that the script has finished
frame_count = FrameCount()
WriteFileIf(resultFile, "current_frame == frame_count-1", """ "stop " """, "ssim_total", append=true)
Code:
# return original and reconstructed frame side by side for comparison
#return StackHorizontal(orig_yv12, inter_yv12)

# NOTE: must return last or FrameEvaluate will not run
return last
The results file will look like this:
Code:
0; 0.754317; 164.351318
1; 0.859464; 92.895966
2; 0.805696; 67.377174
3; 0.744211; 64.517632
4; 0.684871; 60.627941
5; 0.821181; 63.919125
6; 0.919067; 57.346405
7; 0.842346; 57.677219
8; 0.833121; 65.297905
9; 0.787272; 59.674370
stop 8.051547 |
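A consumer of this results file only has to split each line on ";" and watch for the "stop" line. A Python sketch of such a parser (illustrative only, not the optimizer's actual code):

```python
def parse_results(text):
    """Parse the per-frame results written by the augmented script.
    Returns (per_frame, total): per_frame maps frame number to
    (ssim, runtime_ms); total is the ssim sum from the 'stop' line."""
    per_frame = {}
    total = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("stop"):
            total = float(line.split()[1])  # script has finished
        else:
            frame, ssim, ms = line.split(";")
            per_frame[int(frame)] = (float(ssim), float(ms))
    return per_frame, total

sample = """0; 0.754317; 164.351318
1; 0.859464; 92.895966
stop 8.051547"""
frames, total = parse_results(sample)
print(total)      # 8.051547
print(frames[0])  # (0.754317, 164.351318)
```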
10th September 2018, 03:18 | #10 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Sorry Zorr, I had not properly read your intro (skim-read far too much), and I now understand that you are trying to optimize for the current clip,
and not a generally-best settings set for any ol' clip. The "double the current clip, chuck the originals away and do it again" idea is a bit of lateral thinking that skipped over this head, and (assuming it works, and I guess it must) is quite inspired. I can see me reading this thread a good few times from start to finish, and I have also downed the meta-wottsit PDF thing to have a read of that too. Will delete this post so as not to interfere with your postings flow, after you post the next intriguing episode. EDIT: Quote:
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 10th September 2018 at 17:27. |
|
14th September 2018, 21:27 | #11 | Link | ||
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
But in the long run I also want to see if the default settings of MVTools and other filters can be made better. That however takes a lot more work, the settings will have to be tested on many kinds of videos. Maybe we can find settings that work better for most videos than the current default settings. Maybe the current defaults are already optimal in that regard. I think it's worth trying now that we have a tool that can help. Quote:
Yes, in this case the optimization task is more difficult than the way the script is used for real, which should make the real case better quality. |
||
14th September 2018, 22:07 | #12 | Link |
Registered User
Join Date: Mar 2018
Posts: 447
|
Hands-on tutorial (part 1/2)
[EDIT] Tutorial updated to match the features of the latest version of AvisynthOptimizer (no longer using VirtualDub to run scripts, better error handling, colored text)
It's finally time for you to get your greedy hands on AvisynthOptimizer! Download the zip here and unzip it to a folder of your liking. You will also need my modded AvsTimer and the SSIM plugin with the SSIM_FRAME function, which returns the SSIM value to the script. I have a package which contains both of those here. Let's first see what is inside the AvisynthOptimizer package:
Code:
lib (folder)
AviSynthOptimizer.jar
avsr.exe
avsr64.exe
optimizer.bat
optimizer.ini
versionHistory.txt
AvisynthOptimizer.jar contains the main code. There are some external jar dependencies in the lib folder.
avsr.exe and avsr64.exe are Groucho2004's avsr utility, which is used to run the scripts (thanks Groucho2004!).
optimizer.bat is a batch file you can use to run the optimizer.
optimizer.ini contains some configuration settings. In its initial state it looks like this:
Code:
architecture=
log=
The second line "log=" doesn't have a value either. The optimizer will update it with the latest log file it has written to. This is useful when evaluating logs (we don't have to specify the log file; it defaults to the file found in this .ini file).
versionHistory.txt, what could this be? Yes, it's the version history.
Let's move on to our first optimization task. When I was trying to come up with a suitable first demonstration, I found this thread where Fizick was comparing the quality of denoisers using SSIM. One of the first filters he tried was FFT3DFilter. The script is short and simple and shouldn't be too difficult to optimize. Let's find out if we can beat Fizick's best settings! The challenge introduced in that thread is the following: add noise to a video, then try to remove it with the denoiser, and finally compare the denoised frames to the original frames with SSIM. The goal is to get as good a similarity as possible. The video used can be downloaded here. It's in raw 4:2:0 YUV format and can be read with RawSource. Let's first make a script that has the SSIM comparison and, just for good measure, the timing measurement as well. If I were doing a serious optimization I would measure more than 5 frames, but let's keep the runtime fast in this example. Note that we're adding the noise using a constant seed value so that the noise is exactly the same every time we run the script, making sure there isn't an unknown variable messing with our quality measurements. The script uses the best FFT3DFilter settings found in the thread. Change the RawSource path and resultFile path if needed. The example script has an absolute path for the source file but it's not necessary.
Code:
TEST_FRAMES = 5 # how many frames are tested
MIDDLE_FRAME = 50 # middle frame number

RawSource("D:\optimizer\test\flower\flower_cif.yuv", width=352, height=288, pixel_type="I420")
source=ColorYUV(levels="PC->TV")
noisy=source.AddGrain(25, 0, 0, seed=1)
denoised=noisy.FFT3DFilter(sigma=4, bt=4, bw=16, bh=16, ow=8, oh=8) # best settings by Fizick

# cut out the part used in quality / speed evaluation
source = source.Trim(MIDDLE_FRAME - TEST_FRAMES/2 + (TEST_FRAMES%2==0?1:0), MIDDLE_FRAME + TEST_FRAMES/2)
denoised = denoised.Trim(MIDDLE_FRAME - TEST_FRAMES/2 + (TEST_FRAMES%2==0?1:0), MIDDLE_FRAME + TEST_FRAMES/2)
last = denoised

global total = 0.0
global ssim_total = 0.0
FrameEvaluate(last, """
    global ssim = SSIM_FRAME(source, denoised)
    global ssim = (ssim == 1.0 ? 0.0 : ssim)
    global ssim_total = ssim_total + ssim
""")

# measure runtime, plugin writes the value to global avstimer variable
global avstimer = 0.0
AvsTimer(frames=1, type=0, total=false, name="Optimizer")

# per frame logging (ssim, time)
delimiter = "; "
resultFile = "perFrameResults.txt"
# output out1="ssim: MAX(float)" out2="time: MIN(time) ms" file="perFrameResults.txt"
WriteFile(resultFile, "current_frame", "delimiter", "ssim", "delimiter", "avstimer")

# write "stop" at the last frame to tell the optimizer that the script has finished
frame_count = FrameCount()
WriteFileIf(resultFile, "current_frame == frame_count-1", """ "stop " """, "ssim_total", append=true)

return last
Code:
0; 0.987766; 12.808190
1; 0.987759; 6.340095
2; 0.987965; 6.358214
3; 0.987979; 5.819328
4; 0.987961; 5.781337
stop 4.939430
Ok, so 4.939430 is the result we're trying to beat. Now let's change the script a bit to make it ready for the optimizer. Replace this part
Code:
denoised=noisy.FFT3DFilter(sigma=4, bt=4, bw=16, bh=16, ow=8, oh=8) # best settings by Fizick
with this:
Code:
sigma = 400/100.0 # optimize sigma = _n_/100.0 | 100..800 | sigma
bt = 4 # optimize bt = _n_ | -1..5 | blockTemporal
blockSize = 32 # optimize blockSize = _n_ | 2..64 | blockSize
overlap = 16 # optimize overlap = _n_ | 0..32 | overlap
denoised=noisy.FFT3DFilter(sigma=sigma, bt=bt, bw=blockSize, bh=blockSize, ow=overlap, oh=overlap)
We're going to try values 1.0 - 8.0 for sigma, values from -1 to 5 for bt (the full range allowed), values 2 - 64 for bw and bh, and 0 - 32 for ow and oh. The last two are the overlap in the x and y directions; they can use the same value because we have no reason to believe they should differ (at least in this case, where the noise is uniform in all directions). The same applies to the block size. In general it's a good idea to keep the number of optimized parameters small; it makes the optimization job easier. Now we can start the optimizer. Open a command line window in the directory you installed it to and write
Code:
optimizer <path_to_your_script>

Since this is the first time you're running the optimizer, it will ask for your preferred AviSynth architecture:

Code:
Which Avisynth architecture are you (mostly) using?
NOTE: this default setting can be overridden with -arch argument
1 - 32bit (x86)
2 - 64bit (x64)

Code:
Error in script execution:
FFT3DFilter: Must not be 2*ow > bw
<script path and line number>

Let's figure out what the error message is trying to say. Looks like there is a dependency between the parameters ow and bw. This dependency is almost the same as the one we already saw in the MVTools script, namely that the overlap cannot be larger than half the block size. We should add the dependencies to our script and try again. The dependencies are thus:

Code:
blockSize = 32 # optimize blockSize = _n_ | 2..64 ; min:overlap 2 * | blockSize
overlap = 16   # optimize overlap = _n_ | 0..32 ; max:blockSize 2 / | overlap

Code:
Mutating 2 params by 28,5 %
RESOLVED: blockSize 5 -> 14
RESOLVED: blockSize 10 -> 36
RESOLVED: overlap 10 -> 6
RESOLVED: overlap 4 -> 3
* 105 / 2000 : 4.946868 50ms sigma=514 blockTemporal=5 blockSize=11 overlap=5
  106 / 2000 : 4.9306154 90ms sigma=717 blockTemporal=5 blockSize=26 overlap=13
  107 / 2000 : 4.911076 30ms sigma=726 blockTemporal=4 blockSize=5 overlap=2
  108 / 2000 : 4.932592 60ms sigma=732 blockTemporal=5 blockSize=14 overlap=7
+ 109 / 2000 : 4.8711557 10ms sigma=752 blockTemporal=4 blockSize=16 overlap=0
  110 / 2000 : 4.933645 70ms sigma=625 blockTemporal=5 blockSize=36 overlap=18
  111 / 2000 : 4.937391 60ms sigma=670 blockTemporal=5 blockSize=12 overlap=6
  112 / 2000 : 4.918705 30ms sigma=752 blockTemporal=4 blockSize=7 overlap=3
Parameter sensitivity estimation with 256 result combinations
-> sigma 1,573 blockTemporal 1,178 blockSize 0,600 overlap 0,720

Next we have four RESOLVED lines; they tell us how the conflicts were resolved in the current generation. If you don't see any RESOLVED lines, that just means there weren't any conflicts. The important thing here is that both blockSize and overlap get their values changed. If you have a parameter with dependencies defined but never see it on a RESOLVED line, something could be wrong with the definitions.

Then we have a list of results. The first number is the current iteration, always increasing by one with every result. The second number is the total number of iterations in this run; the default is 2000. The next two numbers are the SSIM value and runtime which the script calculated. The rest of the line spells out the parameters used for this result.

Some of the result lines have different colors and symbols (+, *) in front of them. If you're lucky you might even see the "e" symbol. The * symbol (red line) means this is the best result found so far. Now this needs a bit more explanation... didn't we already conclude that there is not one best result but a pareto front?
Yes, but here the best result is determined by sorting the results by the *first* value output from the script (in this case the SSIM value). So whenever we find the best SSIM so far, the * is displayed in front of the result. The + symbol (yellow line) signifies that we have found a new pareto front member. And if you see "e", it means the result is exactly as good as some other result already in the pareto front ("e" is short for "equal"). The optimizer never tries the same parameter combination twice, so this should be pretty rare, but it can happen because sometimes certain parameters don't have much (or any) effect on the result. Whenever you see * or + you know the algorithm is making progress.

The last two lines say some gibberish about parameter sensitivity. This is my invention, where the algorithm tries to determine (based on recent results) how sensitive the parameters are. A sensitive parameter is one that causes a large change in the result when its value changes. The sensitivity value shown is large for sensitive parameters and small for non-sensitive ones. The reason for this sensitivity business is that the mutations are scaled by the sensitivity in order to avoid changes that are too large or too small for each parameter. In my testing the optimizer gives better results with sensitivity estimation than without it, so it defaults to being enabled.

2000 iterations is a bit long for this optimization task, so let's stop the optimizer (CTRL+C, Y) and give it a much lighter task. This time we give it the number of iterations as an argument:

Code:
optimizer <path_to_your_script> -iters 100

Last edited by zorr; 9th December 2018 at 23:41. Reason: Updated download link to latest version 0.9.16-beta |
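As an addendum to the tutorial above: the dependency expressions ("min:overlap 2 *", "max:blockSize 2 /") read like postfix (RPN) notation over the other parameters. The sketch below is purely illustrative — it is not Zopti's actual code, the function names are made up, and Zopti may resolve conflicts differently than simple clamping:

```python
def eval_postfix(expr, params):
    """Evaluate a postfix (RPN) expression such as 'overlap 2 *'
    against the current parameter values."""
    stack = []
    for token in expr.split():
        if token in params:
            stack.append(params[token])
        elif token == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif token == "/":
            b, a = stack.pop(), stack.pop()
            stack.append(a / b)
        else:
            stack.append(float(token))
    return stack[0]

def resolve(value, params, min_expr=None, max_expr=None):
    """Clamp a proposed value into the range allowed by its dependencies.
    (Clamping is just one possible resolution strategy.)"""
    if min_expr is not None:
        value = max(value, eval_postfix(min_expr, params))
    if max_expr is not None:
        value = min(value, eval_postfix(max_expr, params))
    return value

# blockSize must be at least overlap * 2 ("min:overlap 2 *")
print(resolve(5, {"overlap": 10}, min_expr="overlap 2 *"))      # → 20.0
# overlap must be at most blockSize / 2 ("max:blockSize 2 /")
print(resolve(10, {"blockSize": 16}, max_expr="blockSize 2 /")) # → 8.0
```

This would explain the RESOLVED lines in the optimizer output: a mutated value that violates a dependency gets replaced with one that satisfies it.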
14th September 2018, 22:30 | #13 | Link |
Registered User
Join Date: Mar 2018
Posts: 447
|
Hands-on tutorial (part 2/2)
While the optimizer is running we can run the visualization. Open another command line window in the optimizer's folder and run:
Code:
optimizer -mode evaluate

The grey dots are all the results of the latest run. The red dots, connected by a black line, are the latest run's pareto front. The larger black dots, connected by a grey line, are the global pareto front, meaning the best results from all the runs. These descriptions are also visible in the info box in the bottom right corner. The info box also shows the latest run number and the number of iterations finished.

Above the chart window there's a long text: the name of the latest log file where these results are stored. The name is long because it contains all the relevant parameters of the metaheuristic algorithm, as well as the name of the script we're optimizing and the timestamp when the optimization started. All this information can be useful later when you have hundreds of log files.

Notice how all the results are clustered on 10ms marks on the X axis; that's because the runtimes are rounded. The chart doesn't show all the results but instead focuses on the best ones: to be precise, it shows only the best 20% of the results, because the top results are usually the interesting ones. This percentage can be changed with the -top argument.

Close the visualization window from the X button and let's try an automatically updating chart. Start it by typing

Code:
optimizer -mode evaluate -autorefresh true

All right, that's enough about visualization for now. In the next phase we will look at the finished results. Go grab some coffee (or something else if you don't like coffee, like me) and come back in less than 8 minutes. Or even sooner if your machine is faster than mine.

...

Now that the optimizer has finished we can try to analyze the results. Start the evaluation once more (autorefresh is not needed, there's nothing to refresh):

Code:
optimizer -mode evaluate

Code:
Run 1 best: 4.940206 70 sigma=704 blockTemporal=5 blockSize=17 overlap=7
Run 2 best: 4.947239 50 sigma=551 blockTemporal=5 blockSize=13 overlap=6
Run 3 best: 4.935884 50 sigma=771 blockTemporal=5 blockSize=19 overlap=5
Run 4 best: 4.941647 90 sigma=607 blockTemporal=5 blockSize=23 overlap=10
Run 5 best: 4.928436 20 sigma=725 blockTemporal=3 blockSize=20 overlap=7

Code:
Run 1 best: 4.94857 60 sigma=486 blockTemporal=5 blockSize=16 overlap=8
Run 2 best: 4.947047 60 sigma=543 blockTemporal=5 blockSize=16 overlap=8
Run 3 best: 4.931532 160 sigma=474 blockTemporal=5 blockSize=43 overlap=21
Run 4 best: 4.942368 30 sigma=464 blockTemporal=3 blockSize=12 overlap=6
Run 5 best: 4.944955 30 sigma=462 blockTemporal=4 blockSize=12 overlap=6

Code:
Run 1 best: 4.9489594 60 sigma=480 blockTemporal=5 blockSize=14 overlap=7
Run 2 best: 4.948967 60 sigma=475 blockTemporal=5 blockSize=14 overlap=7
Run 3 best: 4.948965 60 sigma=485 blockTemporal=5 blockSize=14 overlap=7
Run 4 best: 4.948967 60 sigma=475 blockTemporal=5 blockSize=14 overlap=7
Run 5 best: 4.948962 60 sigma=477 blockTemporal=5 blockSize=14 overlap=7

Code:
Run 1 best: 4.948967 60 sigma=479 blockTemporal=5 blockSize=14 overlap=7
Run 2 best: 4.948967 60 sigma=479 blockTemporal=5 blockSize=14 overlap=7
Run 3 best: 4.948967 60 sigma=479 blockTemporal=5 blockSize=14 overlap=7
Run 4 best: 4.948967 60 sigma=479 blockTemporal=5 blockSize=14 overlap=7
Run 5 best: 4.948967 60 sigma=479 blockTemporal=5 blockSize=14 overlap=7

Code:
Pareto front:
4.947239 50 sigma=551 blockTemporal=5 blockSize=13 overlap=6
4.942367 40 sigma=566 blockTemporal=5 blockSize=16 overlap=6
4.9421406 30 sigma=551 blockTemporal=4 blockSize=14 overlap=6
4.93908 20 sigma=560 blockTemporal=3 blockSize=14 overlap=6
4.917379 10 sigma=790 blockTemporal=3 blockSize=15 overlap=2

One final step in this tutorial: let's create the scripts from the pareto front.

Code:
optimizer -mode evaluate -scripts true

So far we have always evaluated the latest optimization run. If you need to go back to earlier runs, give the name of the log file(s) with the -log parameter. You can use the "*" wildcard in the name; for example, to analyze all logs starting with "denoise" you could run

Code:
optimizer -mode evaluate -log "../test/flower/denoise*.log"

I will give you one more option to play with: -vismode (short for "visualize mode"), which can take the values none, single, series and seriespareto. In the next episode we will take a closer look at the optimizer and how to customize the optimization process beyond the iteration count.

Last edited by zorr; 14th September 2018 at 22:32. Reason: Corrected description of the chart |
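To recap the two-objective logic used throughout this tutorial (maximize SSIM, minimize runtime): a result belongs to the pareto front when no other result is at least as good on both objectives and strictly better on one. A minimal sketch of that rule — illustrative only, not Zopti's implementation:

```python
def dominates(a, b):
    """a and b are (ssim, time_ms) pairs; higher ssim and lower time are better.
    a dominates b if it is no worse on both objectives and strictly better on one."""
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def pareto_front(results):
    """Keep only the results that no other result dominates."""
    return [r for r in results if not any(dominates(other, r) for other in results)]

# the first three trade SSIM against runtime; the last is dominated and drops out
results = [(4.947239, 50), (4.942367, 40), (4.93908, 20), (4.92, 60)]
print(pareto_front(results))
```

This is why a "best so far" marker (*) needs the extra tie-breaking convention described above: the front itself contains several results, none of which beats all the others.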
15th September 2018, 00:06 | #14 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
For example: "avsmeter script.avs -o".
__________________
Groucho's Avisynth Stuff |
|
15th September 2018, 00:33 | #15 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
JFYI (anybody's info),
On WXP32 (last Firefox for XP32) I was getting a 'Page is not redirecting properly' type message, but the download worked OK on W10 (current Firefox). EDIT: Both links at top of post #12.
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 15th September 2018 at 01:35. |
15th September 2018, 21:59 | #16 | Link | |
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
Luckily (or rather, by design) it's very easy to use AVSMeter instead of VirtualDub; you only need to change the runavs.bat to something like this: Code:
title %2
"D:\optimizer\bin\tools\AvsMeter\AvsMeter" %1 -o
exit

Code:
D:\optimizer\bin\tools\AvsMeter>avsmeter ../../../test/flower/denoise.avs -o

AVSMeter 2.8.5 (x86) - Copyright (c) 2012-2018, Groucho2004
AviSynth 2.60, build:Mar 31 2015 [16:38:54] (2.6.0.6)

Number of frames:     5
Length (hh:mm:ss.ms): 00:00:00.200
Frame width:          352
Frame height:         288
Framerate:            25.000 (25/1)
Colorspace:           i420

Exception 0xC0000094 [STATUS_INTEGER_DIVIDE_BY_ZERO]
Module:  D:\optimizer\bin\tools\AvsMeter\AVSMeter.exe
Address: 0x00324CD3

Code:
Script runtime is too short for meaningful measurements

I was able to make it work by adding more test frames to the script; it worked with TEST_FRAMES = 50. Another issue is that AVSMeter takes quite a long time (several seconds) displaying "Query Avisynth info..." which then gets replaced with the AviSynth version information. Would it be possible to skip this part and start running the script right away? I didn't find a switch to suppress that. |
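The STATUS_INTEGER_DIVIDE_BY_ZERO exception above is the classic symptom of a frame-rate calculation dividing by an elapsed time that rounds to zero when the clip is only a few frames long. A hypothetical guard (this is not AVSMeter's actual code, just an illustration of the failure mode):

```python
def frames_per_second(frames, elapsed_ms):
    """Compute fps from a frame count and a measured runtime in milliseconds."""
    # With a very short clip the measured time can round down to 0 ms,
    # which would crash an unguarded division.
    if elapsed_ms <= 0:
        raise ValueError("Script runtime is too short for meaningful measurements")
    return frames * 1000.0 / elapsed_ms
```

Adding more test frames, as described above, sidesteps the problem by making the measured runtime non-zero.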
|
15th September 2018, 22:04 | #17 | Link | |
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
[EDIT] I replaced the download links; I probably used incorrect ones earlier. Last edited by zorr; 15th September 2018 at 22:16. Reason: Info about download links |
|
16th September 2018, 02:49 | #18 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Try "avsr" instead.
__________________
Groucho's Avisynth Stuff |
16th September 2018, 20:51 | #19 | Link | |
Registered User
Join Date: Mar 2018
Posts: 447
|
Quote:
I decided to run some benchmarks to see which one is faster and also do some tests with the poll frequency. Apparently avsr starts up faster than VirtualDub. When running a validation on the first pareto front of denoise.avs (2000 iterations), it finished with VirtualDub in 54.1 seconds, but with avsr it was 0.6 seconds faster (measured as the average of 5 runs). If I change the poll frequency from 100ms to 50ms the difference is even larger, about 1.8 seconds. I tried intervals of 25ms and 10ms too; there's still improvement, but probably not enough to justify the cost. I decided to change the poll frequency to 50ms. With that change and using avsr, the validation runs about 5% faster.

Do you mind if I add avsr to the AvisynthOptimizer package and use it as the default script runner? I will of course document that this is software made by you. It would be nice if I could determine automatically which AviSynth platform (x86 or x64) the user is running; then I could skip one manual step of the installation.

Last edited by zorr; 16th September 2018 at 21:03. Reason: Added percentage of speedup |
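For context on the poll-frequency numbers above: the optimizer has to notice when the script has written its "stop" line to the results file, and a shorter poll interval trims dead time from every iteration at the cost of more file reads. A hypothetical sketch of such polling (not Zopti's actual code; the function name is made up):

```python
import time

def wait_for_stop(path, poll_ms=50, timeout_s=60):
    """Poll the per-frame results file until the 'stop <total>' line appears,
    then return the final total written by the script."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with open(path) as f:
                for line in f:
                    if line.startswith("stop"):
                        return float(line.split()[1])
        except FileNotFoundError:
            pass  # the script has not created the file yet
        # shorter interval = less dead time per iteration, but more file reads
        time.sleep(poll_ms / 1000.0)
    raise TimeoutError("script did not finish within the timeout")
```

With ~2000 iterations per run, even 50ms of average dead time per iteration adds up to seconds, which matches the measured differences.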
|
16th September 2018, 21:46 | #20 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
VDub has a much bigger overhead so that's to be expected.
Quote:
Where do you want to determine this? In the batch file? In your software?
__________________
Groucho's Avisynth Stuff |
|