I suspect there is nothing you can do to reach 25 fps at 1920x1080 on this level of CPU performance; the combination of core count, per-core IPC, and clock speed is too low.
Ultrafast already goes about as fast as it can; it turns off almost everything. You can probably get some speedup, since the ARM optimizations weren't, IIRC, as thorough as the x86 ones, but it won't yield a 100%+ speedup, more like 10% if you are lucky and spend a lot of time.
This task needs either a stronger SoC (quad-core, higher clock) or a downscaled input. However, downscaling will also consume cycles unless you successfully offload it to the chip's GPU or something.
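
To get a rough feel for how much downscaling could buy you, here is a back-of-the-envelope sketch comparing raw pixel rates. The 1280x720 and 960x540 targets are just illustrative choices of mine, and encoder cost does not scale exactly linearly with pixel count, so treat the ratios as ballpark figures only.

```python
def pixel_rate(width, height, fps):
    """Pixels the encoder has to process per second."""
    return width * height * fps

# Target from the question: 1920x1080 at 25 fps.
full = pixel_rate(1920, 1080, 25)

# Hypothetical downscale targets, picked only for illustration.
for w, h in [(1920, 1080), (1280, 720), (960, 540)]:
    rate = pixel_rate(w, h, 25)
    print(f"{w}x{h}@25: {rate / 1e6:.1f} Mpx/s, "
          f"{full / rate:.2f}x fewer pixels than 1080p")
```

Going to 720p cuts the pixel rate by a bit more than half, which is far more than the ~10% you might squeeze out of the encoder itself, but the scaling step has its own cost if it runs on the CPU.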