23rd May 2020, 16:42   #195
Boulder
Quote:
Originally Posted by Boulder
Here's an example of a script which starts pounding on the GPU at 100%. I've tried changing the prefetch value, but I haven't found one that puts the CPU (12c/24t) to work at 90-100%. In VapourSynth, a very similar script uses 95-100% of the CPU and is much faster, and GPU usage there stays below 10% almost all the time.

Code:
DGSource("potter_stone.dgi", ct=280, cb=280, cl=0, cr=0) # UHD source

c2 = convertbits(bits=16)
c2blur = c2.blur(0.2)
prefilt = convertbits(bits=10)

w = prefilt.width()
h = prefilt.height()
prefilt = prefilt.removegrain(12, 12).gaussresize(w, h, 0, 0, w+0.0001, h+0.0001, p=2).mergeluma(prefilt, 0.1)

sharp_luma = c2.sharpen(0.6)
sharp_chroma = c2.sharpen(0.2)
sharp = sharp_luma.mergechroma(sharp_chroma)

superanalyse = prefilt.msuper(pel=2, hpad=16, vpad=16, sharp=2, rfilter=4)
supermdg = sharp.msuper(pel=2, hpad=16, vpad=16, levels=1, sharp=2, rfilter=4)

fv1 = manalyse(superanalyse, isb=false, delta=1, blksize=64, overlap=32, search=5, searchparam=8, pelsearch=8, truemotion=false, dct=5, mt=false)
bv1 = manalyse(superanalyse, isb=true, delta=1, blksize=64, overlap=32, search=5, searchparam=8, pelsearch=8, truemotion=false, dct=5, mt=false) # backward vectors need isb=true
fv1 = mrecalculate(superanalyse, fv1, thsad=100, blksize=32, overlap=16, search=5, searchparam=6, truemotion=false, dct=5, mt=false)
bv1 = mrecalculate(superanalyse, bv1, thsad=100, blksize=32, overlap=16, search=5, searchparam=6, truemotion=false, dct=5, mt=false)
fv1 = mrecalculate(superanalyse, fv1, thsad=100, blksize=16, overlap=8, search=5, searchparam=6, truemotion=false, dct=5, mt=false)
bv1 = mrecalculate(superanalyse, bv1, thsad=100, blksize=16, overlap=8, search=5, searchparam=6, truemotion=false, dct=5, mt=false)

fv1scaled = fv1.mscalevect(bits=16)
bv1scaled = bv1.mscalevect(bits=16)

c2blur.mdegrain1(supermdg, bv1scaled, fv1scaled, thsad=200, thsadc=200, plane=4, limit=255, limitc=255, thscd1=200, thscd2=70)

Prefetch(24)

I'm still quite confused by these tests of mine.
Running the script with Prefetch(frames=1, threads=24) results in CPU usage that looks like only one thread is doing any work. AVSMeter reports that the thread count has increased, but CPU utilization stays around 4%, which is roughly one of the 24 logical threads. GPU usage stays low.

I thought this would emulate the old behaviour of SetMTMode(x, 24)?

I've tried various combinations of frames and threads, but at best I've been able to reach around 40-45% CPU load. Sooner or later the source decoding becomes the bottleneck as the number of frames increases.
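
For reference, the variants described above differ only in the last line of the script. A rough sketch, assuming the named-argument form of Prefetch used above; the third combination uses made-up values purely for illustration and is not a recommendation:

Code:
# Original last line: 24 threads, default number of prefetched frames
Prefetch(24)

# Variant described above: 24 threads, one prefetched frame
# Prefetch(frames=1, threads=24)

# Hypothetical intermediate combination (illustrative values only)
# Prefetch(frames=8, threads=12)

Only one Prefetch line should be active at a time, so the alternatives are left commented out.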