@katzenjoghurt:
For the core functions of the x265 encoder (especially those called most often in tight loops), one or more of the following may apply, depending on the performance gain, how far development has progressed for the different bit depths, etc.:
- There is basic C/C++ code. Which instruction set it uses depends on the compiler options. If your CPU supports that instruction set, x265 can use this code. If not, it will crash due to unsupported instructions.
- There is hand-optimized assembler code with MMX/SSE2 optimization. If your CPU supports it, x265 can use this code. If not, x265 should use simpler code.
- There is hand-optimized assembler code with SSSE3/SSE4 optimization. If your CPU supports it, x265 can use this code. If not, x265 should use simpler code.
- There is hand-optimized assembler code with AVX optimization. If your CPU supports it, x265 can use this code. If not, x265 should use simpler code.
- There is hand-optimized assembler code with AVX2 optimization. If your CPU supports it and you enable it explicitly, x265 can use this code.
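The "use the best supported variant, otherwise fall back to simpler code" logic in the list above can be sketched roughly as follows. This is only an illustration, not x265's actual dispatch code: the flag constants, the `Kernel` struct and `pickKernel` are hypothetical names, and a real encoder would fill the flags from CPUID at startup and store function pointers rather than strings.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical capability bits (assumption: a real encoder derives them from CPUID).
constexpr std::uint32_t CPU_SSE2  = 1u << 0;
constexpr std::uint32_t CPU_SSSE3 = 1u << 1;
constexpr std::uint32_t CPU_SSE4  = 1u << 2;
constexpr std::uint32_t CPU_AVX   = 1u << 3;
constexpr std::uint32_t CPU_AVX2  = 1u << 4;

// One candidate implementation of the same core function.
struct Kernel {
    std::string   name;      // label for illustration; real code would hold a function pointer
    std::uint32_t required;  // every bit set here must be present in the CPU flags
};

// Candidates ordered from most specialized to simplest.
// The last entry requires nothing and stands for the plain C/C++ version.
static const std::vector<Kernel> kCandidates = {
    {"avx2",  CPU_SSE2 | CPU_SSSE3 | CPU_SSE4 | CPU_AVX | CPU_AVX2},
    {"avx",   CPU_SSE2 | CPU_SSSE3 | CPU_SSE4 | CPU_AVX},
    {"sse4",  CPU_SSE2 | CPU_SSSE3 | CPU_SSE4},
    {"sse2",  CPU_SSE2},
    {"c",     0},
};

// Return the first (most specialized) candidate whose requirements are all met.
std::string pickKernel(std::uint32_t cpuFlags) {
    for (const Kernel& k : kCandidates) {
        if ((cpuFlags & k.required) == k.required)
            return k.name;
    }
    return "c";  // not reached in practice: the last candidate requires no flags
}
```

With this layout, a variant that has to be enabled explicitly (or one you want to avoid) is handled by clearing its bit before calling `pickKernel`; the selection then falls through to the next simpler candidate. Note that this check only protects the hand-written variants: the plain C/C++ code has no such guard, which is why compiling it for an instruction set the CPU lacks leads to a crash.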
Well, for any x86-64 CPU today, SSE2 should be the minimum sensible supported instruction set. But even there you find small differences. As far as I remember, the Athlon 64 (AMD K8) family marks the threshold at which the SSE2 implementation is considered relatively "fast"; and although these CPUs support SSE3 according to their specs, x264 and x265 will refuse to use it on them.
Usually all these code variants are present in a binary of (lib)x265, unless they were excluded during compilation (e.g. you may disable all assembler code paths; but why would you want that?).