Old 11th October 2018, 00:24   #70  |  Link
zorr
Quote:
Originally Posted by Seedmanc View Post
Some other ideas to add: previously it was mentioned that some parameter combinations might have no effect on time or SSIM. But what if we knew for sure which those combinations are? It would be nice to be able to mark them as such so the optimizer could skip them.
Using the dependency definitions (min / max / filter) you can guarantee that the optimizer will not try certain combinations. The purpose doesn't really matter: in some cases it's to avoid completely invalid combinations, in others to avoid doing useless work.

Quote:
Originally Posted by Seedmanc View Post
For example, in MSuper the sharp parameter is only used for pel>1; it won't raise an error otherwise, but it'll be a waste of time.
You could define it like this:

Code:
super_pel = 2 # optimize super_pel = _n_ | 1,2,4 | super_pel
super_sharp = 2 # optimize super_sharp = _n_ | 0..2 ; max:super_pel 1 == 0 2 ? | super_sharp
Translation: if super_pel is 1 then the maximum value of super_sharp is 0, otherwise it is 2. But note that in your script super_pel only has the values 2 and 4, so in this case all the values of super_sharp are valid. My experience so far has been that the best results pretty much always use super_pel=4, so it may not be worth your time to try the other values unless you're also optimizing for speed.
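As a side note, these dependency expressions are in reverse Polish notation. Here's a minimal sketch in Python of how such an expression evaluates (my illustration only, not the optimizer's actual code):

```python
# Toy RPN evaluator for dependency expressions like "super_pel 1 == 0 2 ?".
# Illustration only - not the optimizer's actual implementation.
def eval_rpn(expr, variables):
    stack = []
    for tok in expr.split():
        if tok in variables:
            stack.append(variables[tok])
        elif tok == "==":
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        elif tok == "?":  # stack order: cond true_val false_val ?
            false_val, true_val, cond = stack.pop(), stack.pop(), stack.pop()
            stack.append(true_val if cond else false_val)
        else:
            stack.append(int(tok))
    return stack[0]

eval_rpn("super_pel 1 == 0 2 ?", {"super_pel": 1})  # max super_sharp = 0
eval_rpn("super_pel 1 == 0 2 ?", {"super_pel": 4})  # max super_sharp = 2
```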



Quote:
Originally Posted by Seedmanc View Post
Regarding the ConvertToYV24, can you give an example of what kind of parameter combinations would be unsuitable for YV12, so that this becomes necessary? It slows down the processing considerably, so I'd like to get rid of it.
The error message you get is "MAnalyse: wrong overlap for the colorspace subsampling for divide mode". Looking at the MVTools source, I was able to gather that it's triggered when overlap is not divisible by xRatioUV (or overlapv is not divisible by yRatioUV). xRatioUV is 2 for YUY2 and (1 << vi.GetPlaneWidthSubsampling(PLANAR_U)) for other formats. Assuming you convert to YV12 instead (which is needed for the SSIM anyway), you can avoid the error by making the overlap values divisible by 4. Like this (the filter dependency is no longer needed, so I removed it):

Code:
overlap=0  # optimize overlap=_n_ | 0,4,8,12,16,20,24,28,32 ; max:blockSize 2 / | overlap
overlapv=0 # optimize overlapv=_n_ | 0,4,8,12,16,20,24,28,32 ; max:blockSize 2 / | overlapv
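The listed values are simply the multiples of 4 up to 32, further capped at blockSize/2 by the max: dependency. A quick sanity check with a hypothetical helper (mine, not part of the optimizer) that generates the same candidates:

```python
# Hypothetical helper: overlap candidates divisible by 4 (safe for YV12),
# capped at blockSize/2 just like the max: dependency above.
def overlap_candidates(block_size):
    return [v for v in range(0, 33, 4) if v <= block_size // 2]

overlap_candidates(16)  # [0, 4, 8]
overlap_candidates(64)  # all nine values, 0 through 32
```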
You could also avoid the speed penalty by preprocessing the video to YV24 once and using that instead. It might still be slower than processing YV12, but at least you don't pay the price of the conversion every time. Also consider the cropping idea; that can give you a significant speed increase.

Quote:
Originally Posted by Seedmanc View Post
Another thing: the badRange parameter of MAnalyse says that we need to use positive values for UMH search and negative for exhaustive. Unfortunately it does not disclose why; however, it doesn't raise an error either way.
I think this means that using negative values selects the exhaustive algorithm and positive values selects the UMH algorithm for this wide search, independently of what is used in the first search. Btw, I didn't even realize you could use negative values; I guess I now have to test those too.

Quote:
Originally Posted by Seedmanc View Post
Anyway, how would we describe it in the settings to use negative values when searchAlgo is 3?
You could use the min and max dependencies. Or the filter dependency; with that one you can do pretty much anything you can think of.
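For example, assuming badRange is optimized over a range like -50..50, something along these lines might work (an untested sketch, using the same RPN syntax as the super_sharp example above):

Code:
badRange = 24 # optimize badRange = _n_ | -50..50 ; max:searchAlgo 3 == -1 50 ? | badRange
Translation: if searchAlgo is 3 the maximum badRange is -1, which forces the negative (exhaustive) values; otherwise the maximum is 50.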

Quote:
Originally Posted by Seedmanc View Post
Finally, I wonder about the temporal parameter. The readme says it's incompatible with setMTmode; however, the new mvtools has the MT parameter built in and on by default. Do we know if it should be disabled for temporal? Again, it doesn't raise errors; the output looks different, but then it also looks different when disabling MT with temporal=false as well. Really, the readme should be updated there.
I don't know any more than you about this; if Pinterf is around, maybe he could clarify. But it strikes me as odd that the result would be different with temporal disabled depending on whether MT is on or off. That could be a bug.

Quote:
Originally Posted by Seedmanc View Post
But judging from the example where you ran a script for 100, 200, 500 and 2000 iterations, the key was to see how the results stabilize across multiple runs, converging to a single set of parameters and resulting SSIMs/times. Hence, multiple runs are only used to check whether the process has stabilized enough, but multiple iterations are the requirement for that stabilization.
I was just showing off. It's rare that you can make the results converge to exactly the same values; when that does happen, though, it's a strong indicator that we've found the optimal result. In a more realistic scenario the iteration counts are always insufficient to make the results converge in that way. There may be some parameters with the same values (like super_pel=4), but many others are different, some wildly so.

There is another reason to do multiple runs. The beginning of the search usually "locks" the search into a certain corner of the search space, and it may never get out of it within the iteration count. So it could be that you get a significantly better result in one out of, say, 10 runs. If you only ever do 3 runs, maybe you'll never hit that lucky corner. Here's a recent example:

Code:
Run 1 best: 9.790725 2130 rmgrain=12 super_rfilter=1 blockSize=8 searchAlgo=3 searchRange=1 
searchRangeFinest=7 lambda=2213 LSAD=2744 plevel=2 overlap=0 globalMotion=true badSAD=9559 
badRange=34 meander=false temporal=false trymany=false dct=1 maskScale=2
Run 2 best: 9.787563 1210 rmgrain=19 super_rfilter=1 blockSize=8 searchAlgo=1 searchRange=1 
searchRangeFinest=3 lambda=3931 LSAD=19316 plevel=2 overlap=0 globalMotion=false badSAD=2770 
badRange=13 meander=true temporal=true trymany=false dct=1 maskScale=2
Run 3 best: 9.790011 1510 rmgrain=19 super_rfilter=1 blockSize=8 searchAlgo=3 searchRange=1 
searchRangeFinest=4 lambda=2629 LSAD=1684 plevel=2 overlap=0 globalMotion=true badSAD=9831 
badRange=10 meander=true temporal=false trymany=false dct=1 maskScale=2
Run 4 best: 9.789404 1190 rmgrain=19 super_rfilter=0 blockSize=8 searchAlgo=1 searchRange=1 
searchRangeFinest=2 lambda=2142 LSAD=2679 plevel=2 overlap=0 globalMotion=true badSAD=9558 
badRange=45 meander=true temporal=false trymany=false dct=1 maskScale=2
Run 5 best: 9.788453 1220 rmgrain=8  super_rfilter=0 blockSize=8 searchAlgo=1 searchRange=1 
searchRangeFinest=3 lambda=3432 LSAD=6410 plevel=2 overlap=0 globalMotion=true badSAD=4531 
badRange=4 meander=true temporal=false trymany=false dct=1 maskScale=1
There are three results below 9.79 and two above it. If you're unlucky, you'll get only the below-9.79 results in your three runs. These results were run with 5000 iterations; the differences are larger with a smaller iteration count.

Quote:
Originally Posted by Seedmanc View Post
In the best case all you'd need is 2 runs and a large number of iterations to see if the results became close to each other. I use 3 since it's the lowest count from which you can calculate both the average and the median.
Maybe you're right. It's a tough call because we're talking about probabilities here: if you do *this*, then *that* happens with a certain probability, but not always. One could do a large number of runs and then use that data to calculate the odds of reaching a certain result with N runs. But I think it also depends on the particular script and how difficult it is to optimize.
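To make that concrete: if a single run lands in the lucky corner with probability p, then at least one of N independent runs finds it with probability 1-(1-p)^N. A quick illustration (the p=0.1 figure is just an assumed example, not measured data):

```python
# Chance that at least one of N independent runs hits the "lucky corner",
# assuming each run hits it with probability p (p = 0.1 is just an example).
def p_at_least_one(p, runs):
    return 1 - (1 - p) ** runs

round(p_at_least_one(0.1, 3), 3)   # 0.271 - three runs will often miss it
round(p_at_least_one(0.1, 10), 3)  # 0.651 - ten runs find it more often than not
```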

Quote:
Originally Posted by Seedmanc View Post
If I'm correct about that, all we need to figure out is how to scale the population count with the increasing search space. Which configuration is more likely to try the largest subset of the search space: high iterations with low population, or vice versa?
Strictly speaking there is no difference: the number of iterations defines how large the searched subset is (the optimizer never tries duplicates within one run). But I guess you're asking how to get the widest possible subset. A high population should do better in terms of how wide the search is, but it will do fewer mutations of the best result and therefore might end up with a worse result than a smaller population. The "mutation" algorithm with population 1 is the narrowest search possible: it simply keeps the best result and mutates it until one of the mutations is better. If you just want to make the search wider, you can also do that by cranking up the mutation amount and count.
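To illustrate that population-1 "mutation" strategy, here's a toy sketch (not the optimizer's actual code, and it doesn't bother with duplicate tracking) on a one-dimensional score function:

```python
import random

# Toy population-1 mutation search: keep the best candidate, mutate it,
# and accept a mutation only when it scores strictly better.
def mutation_search(score, start, mutate, iterations, rng):
    best, best_score = start, score(start)
    for _ in range(iterations):
        cand = mutate(best, rng)
        cand_score = score(cand)
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best, best_score

rng = random.Random(42)
# Maximize -(x-7)^2 starting from 0, mutating by +/-1 steps;
# the search walks toward the optimum at x = 7.
best, best_score = mutation_search(lambda x: -(x - 7) ** 2, 0,
                                   lambda x, r: x + r.choice([-1, 1]),
                                   200, rng)
```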

Last edited by zorr; 11th October 2018 at 00:42. Reason: Added suggestion to use mutation count / amount