23rd February 2021, 10:15   #855
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,041
This looks like a hard, long-term strategic question for core and plugin development:

Because current execution hardware is moving toward a large number of processing cores, each with a small per-core L1d cache and slow main-memory access, could new scan formats be added, such as block scan instead of full-frame line scan?

Currently, algorithms that access data vertically have to read memory with large strides, and even one full line of an 8K frame in float32 (7680 samples × 4 bytes = 30 kB) takes up nearly all of a 32 kB L1d cache.
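
To put rough numbers on this, here is a small sketch of the arithmetic. The 7680-sample width for 8K and the 64-byte cache line are my assumptions, and pitch padding is ignored:

```python
# Rough arithmetic for vertical access in a row-major (line-scan) frame.
# Assumptions (not measured): 8K width of 7680 samples, float32 samples,
# 32 KiB L1d, 64-byte cache lines, no pitch padding between rows.
WIDTH = 7680
BYTES_PER_SAMPLE = 4
L1D_BYTES = 32 * 1024
CACHE_LINE = 64

# Distance in bytes between two vertically adjacent samples.
row_bytes = WIDTH * BYTES_PER_SAMPLE

# Number of whole rows that fit in L1d at once.
rows_in_l1d = L1D_BYTES // row_bytes

# A vertical walk touches a new cache line on every row, but uses
# only one sample (4 bytes) out of each 64-byte line it pulls in.
useful_fraction = BYTES_PER_SAMPLE / CACHE_LINE

print(row_bytes, rows_in_l1d, useful_fraction)
```

So under these assumptions only a single row fits in L1d, and a purely vertical pass uses about 1/16 of the data each cache line fetch brings in.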

The size of the scan blocks is a new question. It looks like a block must be smaller than the L1d cache, i.e. 32..48 kB at most, and the block aspect ratio could be anywhere from 1:1 to the natural frame ratio (4:3..16:9).
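
As an illustration of what such block sizes could look like, here is a minimal sketch that picks the largest block of a given aspect ratio fitting into a byte budget. The function name `block_dims` and its parameters are hypothetical, and a real implementation would probably also round the width to a SIMD or cache-line multiple and leave part of L1d free for output data:

```python
import math

def block_dims(budget_bytes, bytes_per_sample, aspect_w=1, aspect_h=1):
    """Return (width, height) in samples of the largest block with
    width:height close to aspect_w:aspect_h that fits in budget_bytes.
    Hypothetical helper for illustration only."""
    samples = budget_bytes // bytes_per_sample
    # We need width * height <= samples with width = height * aspect_w / aspect_h,
    # so height <= sqrt(samples * aspect_h / aspect_w).
    height = math.isqrt(samples * aspect_h // aspect_w)
    width = height * aspect_w // aspect_h
    return width, height

# float32 samples, 32 KiB budget, three candidate aspect ratios
for aw, ah in ((1, 1), (4, 3), (16, 9)):
    w, h = block_dims(32 * 1024, 4, aw, ah)
    print(f"{aw}:{ah} -> {w} x {h} samples, {w * h * 4} bytes")
```

With a 32 kB budget and float32 samples this gives blocks on the order of 90 x 90 (1:1) to about 119 x 67 (16:9) samples, so the blocks are quite small compared with an 8K frame.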

The conversion to the new scan format and back could be done inside each plugin, but that would hurt performance. Are there perhaps already some plans in this direction?
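
For reference, the per-plugin conversion could be sketched roughly as below. This is only an illustration on flat Python lists, not code from any existing core or plugin; `to_blocks` and `from_blocks` are hypothetical names. Each direction copies every sample once, which is the extra cost mentioned above:

```python
def to_blocks(frame, width, height, bw, bh):
    """Reorder a row-major frame (flat list) so that every bw x bh block
    is stored contiguously. Edge blocks may be smaller."""
    out = []
    for by in range(0, height, bh):
        for bx in range(0, width, bw):
            x_end = min(bx + bw, width)
            for y in range(by, min(by + bh, height)):
                row = y * width
                out.extend(frame[row + bx:row + x_end])
    return out

def from_blocks(blocks, width, height, bw, bh):
    """Inverse of to_blocks: restore the row-major (line-scan) layout."""
    frame = [None] * (width * height)
    pos = 0
    for by in range(0, height, bh):
        for bx in range(0, width, bw):
            w = min(bx + bw, width) - bx
            for y in range(by, min(by + bh, height)):
                row = y * width
                frame[row + bx:row + bx + w] = blocks[pos:pos + w]
                pos += w
    return frame

# Tiny 6 x 4 frame split into 4 x 2 blocks, then restored.
frame = list(range(6 * 4))
blocked = to_blocks(frame, 6, 4, 4, 2)
restored = from_blocks(blocked, 6, 4, 4, 2)
print(blocked)
print(restored == frame)
```

If the core delivered frames in block order directly, a chain of block-aware plugins could skip both copies; that is the reason for asking about core support.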