Go Back   Doom9's Forum > Video Encoding > MPEG-4 AVC / H.264

Old 2nd November 2015, 15:01   #1  |  Link
Registered User
Join Date: May 2008
Posts: 12
speeding up x264


I’m trying to run the x264 encoder on an ARM processor (2 cores, 1.2GHz, ARM Cortex-A9 with v7-A instruction set, with Neon extensions) and I would like to run it in real time.

I use this command line “x264 --input-res 1920x1080 --preset ultrafast -I -1 -b 0 --bitrate 3500 --fps 25 --nr 0 --threads 2 -o /mnt/ramdisk/1.264 /mnt/ramdisk/00001.yuv”

(This is a low-delay application, which is why I use -b 0 and -I -1. I know the quality is bad, but that's not the concern here.)

I obtain about 12fps performance. The profiling gives the following functions as being the major ones:
8.67% x264_quant_4x4x4_neon
6.81% x264_macroblock_cache_load_progressive
4.78% x264_plane_copy_neon
4.29% x264_frame_init_lowres_core_neon
4.19% x264_macroblock_cache_save
3.93% x264_mb_encode_chroma
3.73% x264_mb_encode_i16x16
3.23% x264_mc_copy_w16_aligned_neon

So I’m trying to see how I could accelerate the code (any suggestion is welcome!). Since we are targeting low delay, we mainly use I frames only for now, or I and P frames (B frames add too much delay). For the I-frame-only bitstream (the fastest configuration I could get), I’m thinking about restricting the number of intra predictions tried or the number of possible partitions. I know this will cost quality or bitrate, but my aim is to simplify the encoder so that it runs in real time…

I’ve got two questions about the code:
• Is there any documentation for the code? Not all steps are self-explanatory. For example, I couldn’t find out what the function mc_copy_w16 does, since it is also called when we compress with I frames only (that is, when there is no motion compensation…)
• I’m also trying to figure out the macroblock_cache_load and save functions as they represent around 11% of the profiling. As I’m using the ultrafast preset, the only prediction modes tried are the 4 intra 16x16. So I’m surprised that we spend that much time loading and saving MBs in the cache and I’m trying to save some time there…
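
To put a rough bound on what optimising the cache load/save functions could buy, here is a small Amdahl's-law sketch in Python. It assumes the profile percentages are fractions of total run time and that nothing else changes; the helper name amdahl_fps is made up for illustration and is not part of x264.

```python
# Amdahl's-law sketch: upper bound on the frame rate gained by speeding up
# one part of the encoder. Percentages come from the profile in the post;
# the helper name is illustrative, not an x264 function.

def amdahl_fps(current_fps, fraction, local_speedup):
    """Frame rate after speeding up `fraction` of the run time by `local_speedup`."""
    remaining = (1.0 - fraction) + fraction / local_speedup
    return current_fps / remaining

CURRENT_FPS = 12.0
# x264_macroblock_cache_load_progressive (6.81%) + x264_macroblock_cache_save (4.19%)
CACHE_FRACTION = (6.81 + 4.19) / 100.0

fps_if_halved = amdahl_fps(CURRENT_FPS, CACHE_FRACTION, 2.0)
fps_if_free = amdahl_fps(CURRENT_FPS, CACHE_FRACTION, float("inf"))

print(f"cache load/save twice as fast: {fps_if_halved:.2f} fps")
print(f"cache load/save removed:       {fps_if_free:.2f} fps")
```

Under these assumptions, even removing both functions entirely would leave the encoder well short of 25 fps, so the cache functions alone cannot close the gap.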

viper_room is offline   Reply With Quote
Old 3rd November 2015, 01:49   #2  |  Link
Registered User
mandarinka's Avatar
Join Date: Jan 2007
Posts: 739
I suspect there is nothing you can do to reach 25 fps at 1920x1080 on this level of CPU performance: the combination of core count, per-core IPC throughput, and clock speed is too low.
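
For a sense of scale, the per-macroblock cycle budget can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes both cores are fully used with perfect scaling (optimistic) and that the raw input is 8-bit 4:2:0, which the thread does not state.

```python
import math

# Figures from the original post.
WIDTH, HEIGHT = 1920, 1080
TARGET_FPS = 25
CORES = 2
CLOCK_HZ = 1.2e9
MB_SIZE = 16  # H.264 macroblocks are 16x16 luma samples

mb_cols = math.ceil(WIDTH / MB_SIZE)    # 120
mb_rows = math.ceil(HEIGHT / MB_SIZE)   # 68 (1080 is padded to 1088)
mbs_per_frame = mb_cols * mb_rows
mbs_per_second = mbs_per_frame * TARGET_FPS

# Assumption: both cores contribute all their cycles to encoding.
cycles_per_mb = CORES * CLOCK_HZ / mbs_per_second

# Assumption: 8-bit 4:2:0 input, i.e. 1.5 bytes per pixel.
input_bytes_per_second = WIDTH * HEIGHT * 3 // 2 * TARGET_FPS

print(f"macroblocks per frame:  {mbs_per_frame}")
print(f"macroblocks per second: {mbs_per_second}")
print(f"cycle budget per MB:    {cycles_per_mb:.0f}")
print(f"raw input rate:         {input_bytes_per_second / 1e6:.2f} MB/s")
```

That budget has to cover reading the input, prediction, transform, quantisation, and entropy coding for each macroblock, plus any memory stalls.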

Ultrafast already goes about as fast as it can; it turns off almost everything. You can probably get some speedups, since IIRC the ARM optimizations weren't as thorough as the x86 ones, but it won't yield a 100%+ speedup, more like 10% if you are lucky and spend a lot of time.

This task needs either a stronger SoC (quad-core, higher clock) or a downscaled input. However, downscaling will also consume cycles unless you successfully offload it to the chip's GPU or something.
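
As a rough guide to how far the input would have to be downscaled, the sketch below extrapolates from the measured 12 fps. It assumes encoding time scales linearly with pixel count and ignores the cost of the scaler itself, so treat the numbers as optimistic estimates; the candidate resolutions are just common 16:9 sizes picked for illustration.

```python
# Extrapolate the measured frame rate to smaller resolutions.
# Assumption: encode time is proportional to the number of pixels,
# and the downscaling step itself costs nothing.

MEASURED_FPS = 12.0
TARGET_FPS = 25.0
SOURCE = (1920, 1080)
CANDIDATES = [(1600, 900), (1280, 720), (960, 540)]  # illustrative sizes

def estimated_fps(resolution):
    """Estimated frame rate at `resolution`, scaled from the 1080p measurement."""
    pixel_ratio = (resolution[0] * resolution[1]) / (SOURCE[0] * SOURCE[1])
    return MEASURED_FPS / pixel_ratio

for res in CANDIDATES:
    fps = estimated_fps(res)
    verdict = "meets" if fps >= TARGET_FPS else "misses"
    print(f"{res[0]}x{res[1]}: ~{fps:.1f} fps ({verdict} {TARGET_FPS:.0f} fps)")
```

Under that assumption 1280x720 is about the largest of these sizes that gets past 25 fps, and only with little margin left for the scaling step.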
Old 12th November 2015, 19:47   #3  |  Link
Big Bit Savings Now !
Emulgator's Avatar
Join Date: Feb 2007
Location: close to the wall
Posts: 863
--no-cabac ?
well, ultrafast does this already...
So yes, downscale before.
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're working on that issue. Synce invntoin uf lingöage..."

performance, x264
