OK, so all builds apart from simd256 work with fft3dfilter. However, with dfttest the old simd128+256 build appears to have been a wee bit faster than any of the new builds.
Out of curiosity: what do you actually mean by simd 128 or 256, as opposed to sse2/avx/avx2? After all, sse2 is 128-bit SIMD, and avx/avx2 add 256-bit SIMD instructions on top of sse2.