speed up accumulate, accumulateSquare, accumulateProduct and accumulateWeighted using SIMD
* use SSE and/or AVX based on configuration
* revise the test to verify the implementation
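
For illustration only, a minimal sketch of what a configuration-dependent SIMD path for a float-to-float accumulate might look like. The function name `accumulateSketch` is hypothetical and this is not the code from this commit; it assumes the compiler defines `__SSE2__` when SSE2 is enabled, and falls back to a scalar loop otherwise.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE/SSE2 intrinsics
#endif

// Hypothetical helper (not OpenCV's implementation): dst[i] += src[i].
void accumulateSketch(const float* src, float* dst, std::size_t n)
{
    std::size_t i = 0;
#if defined(__SSE2__)
    // Vector path: four floats per iteration, unaligned loads and stores.
    for (; i + 4 <= n; i += 4)
    {
        __m128 s = _mm_loadu_ps(src + i);
        __m128 d = _mm_loadu_ps(dst + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(d, s));
    }
#endif
    // Scalar tail; also the full fallback when SSE2 is not enabled.
    for (; i < n; ++i)
        dst[i] += src[i];
}
```

A test for such a routine could compare its output against a plain scalar loop, using a length that is not a multiple of the vector width so the tail path is exercised as well.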