Clang sometimes generates faster code; its optimizer seems to be smarter than Microsoft's. For example, the TemporalRepair filter mentioned above has modes that run 10-15% faster in the Clang build than in the MS-built DLL.
I also looked at the assembly generated from the pure C source. Clang was able to vectorize (compute several pixels in parallel using SIMD registers) in situations where the MS compiler simply stuck with scalar, one-by-one processing.
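To illustrate the kind of loop this is about, here is a minimal sketch of a plain C per-pixel routine that an auto-vectorizer can turn into SIMD code. It is not taken from TemporalRepair; the function name clamp_row and its parameters are hypothetical and chosen only for the example. Whether a given compiler actually vectorizes it depends on the compiler version and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not the plugin's real code: clamp each source
   pixel into the range [lo[i], hi[i]] and store the result in dst.
   The restrict qualifiers promise the compiler that the rows do not
   overlap, which makes auto-vectorization easier. */
void clamp_row(uint8_t *restrict dst, const uint8_t *restrict src,
               const uint8_t *restrict lo, const uint8_t *restrict hi,
               size_t width)
{
    for (size_t i = 0; i < width; ++i) {
        uint8_t v = src[i];
        if (v < lo[i]) v = lo[i];   /* raise to the lower bound */
        if (v > hi[i]) v = hi[i];   /* cap at the upper bound */
        dst[i] = v;
    }
}
```

To compare the compilers yourself, look at the generated assembly (clang -O2 -S, or /O2 /FA with the MS compiler) or ask for a vectorization report (-Rpass=loop-vectorize for clang, /Qvec-report:1 for MS).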
|