Commit Graph

5 Commits

Author SHA1 Message Date
Scott LaVarnway
8e6022844f vpx: [x86] add vpx_satd_avx2()
SSE2 intrinsic vs AVX2 intrinsic speed gains:
blocksize   16: ~1.33x
blocksize   64: ~1.51x
blocksize  256: ~3.03x
blocksize 1024: ~3.71x

Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
2017-11-10 12:24:12 -08:00
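
For context on the vpx_satd_avx2() entry above: SATD here is taken to mean the sum of the absolute values of a block of transform coefficients. The following is a minimal scalar sketch of that computation, not the code from the commit. The name satd_ref and the 16-bit coefficient type are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference (not the libvpx source): returns the sum of
 * the absolute values of `length` transform coefficients. Assumes 16-bit
 * coefficients; a SIMD version would process many coefficients per step. */
static int satd_ref(const int16_t *coeff, int length) {
  int satd = 0;
  int i;
  assert(coeff != NULL && length >= 0);
  for (i = 0; i < length; ++i) {
    satd += abs(coeff[i]); /* coeff[i] is promoted to int before abs() */
  }
  return satd;
}
```

The block sizes listed in the commit message (16, 64, 256, 1024) would correspond to the `length` argument in this sketch.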
Scott LaVarnway
3bf02ad74a vpx: hadamard: use ptrdiff_t instead of int for stride
Eliminates the following instruction from the x86 (64-bit)
intrinsic code:

movslq %esi,%rax

Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae
2017-10-26 11:41:48 -07:00
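
The stride change above can be illustrated with a small sketch. On x86-64, a 32-bit int stride generally has to be sign-extended to 64 bits (the movslq in the commit message) before it can be used in pointer arithmetic, while a ptrdiff_t stride is already pointer-sized. The function names below are hypothetical and the loop is not taken from libvpx. Whether the extension actually appears depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: walk down one column of a 2-D block.
 * With an int stride, a 64-bit target may need to sign-extend
 * the stride before adding it to the pointer. */
static int sum_column_int(const int16_t *src, int stride, int rows) {
  int sum = 0;
  int r;
  assert(src != NULL && rows >= 0);
  for (r = 0; r < rows; ++r) {
    sum += *src;
    src += stride; /* int -> pointer-width conversion happens here */
  }
  return sum;
}

/* Same loop with a ptrdiff_t stride: the value is already pointer-width,
 * so no extension is required before the pointer addition. */
static int sum_column_ptrdiff(const int16_t *src, ptrdiff_t stride, int rows) {
  int sum = 0;
  int r;
  assert(src != NULL && rows >= 0);
  for (r = 0; r < rows; ++r) {
    sum += *src;
    src += stride;
  }
  return sum;
}
```

Both functions return the same result; the difference is only in the code the compiler may emit for the stride.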
Scott LaVarnway
512bf4e029 vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix
Use an intermediate buffer before storing to coeffs when
highbitdepth is enabled.

Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b
2017-10-23 08:49:32 -07:00
Scott LaVarnway
4906cea027 vpx: [x86] vpx_hadamard_16x16_avx2() improvements
~10% performance gain.  Fixed the cosmetics noted in the
previous commit.

Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13
2017-10-20 08:55:06 -07:00
Scott LaVarnway
55c126a5d7 vpx: [x86] add vpx_hadamard_16x16_avx2()
This version is ~1.91x faster than the sse2 version.  When
highbitdepth is enabled, it is ~1.74x faster.

Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd
2017-10-18 18:00:00 -07:00