vpx/vpx_dsp/x86
Scott LaVarnway d6c9bbc2b6 vpxdsp: [x86] add highbd_d207_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.31x

C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x

BUG=webm:1411

Change-Id: I0bac29db261079181ddabc6814bd62c463109caf
2017-09-11 07:36:24 -07:00
add_noise_sse2.asm postproc : fix function parameters for noise functions. 2016-07-15 08:27:34 -07:00
avg_intrin_sse2.c highbd x86: consolidate tran_low_t conversions 2017-02-06 10:43:26 -08:00
avg_pred_sse2.c vpx_comp_avg_pred: sse2 optimization 2017-04-13 08:44:52 -07:00
avg_ssse3_x86_64.asm bitdepth conversion: really use num elements 2017-02-16 15:02:48 +00:00
bitdepth_conversion_avx2.h block error avx2: use tran_low_t 2017-02-16 12:39:02 -08:00
bitdepth_conversion_sse2.asm quantize_fp highbd ssse3: use tran_low_t for coeff 2017-02-16 07:40:56 -08:00
bitdepth_conversion_sse2.h correct bitdepth_conversion_sse2.h header guard 2017-02-16 12:43:33 -08:00
convolve.h Remove get_filter_base() and get_filter_offset() in convolve 2017-09-05 15:22:36 -07:00
deblock_sse2.asm Fix segmentation fault caused by denoiser working with spatial SVC. 2017-02-21 09:38:28 -08:00
fwd_dct32x32_impl_avx2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_dct32x32_impl_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_impl_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_sse2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_ssse3_x86_64.asm Rework forward 8x8 2D-DCT ssse3 implementation 2017-01-10 12:50:55 -08:00
highbd_convolve_avx2.c Remove get_filter_base() and get_filter_offset() in convolve 2017-09-05 15:22:36 -07:00
highbd_idct4x4_add_sse2.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct4x4_add_sse4.c Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx() 2017-08-04 15:33:37 -07:00
highbd_idct8x8_add_sse2.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct8x8_add_sse4.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct16x16_add_sse2.c Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_idct16x16_add_sse4.c Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_idct32x32_add_sse2.c highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo 2017-08-17 15:37:38 -07:00
highbd_idct32x32_add_sse4.c highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo 2017-08-17 15:37:38 -07:00
highbd_intrapred_intrin_sse2.c vpxdsp: [x86] add highbd_d207_predictor functions 2017-09-11 07:36:24 -07:00
highbd_intrapred_intrin_ssse3.c vpxdsp: [x86] add highbd_d207_predictor functions 2017-09-11 07:36:24 -07:00
highbd_intrapred_sse2.asm Code clean of highbd_tm_predictor_32x32 2015-12-22 16:51:57 -08:00
highbd_inv_txfm_sse2.h Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_inv_txfm_sse4.h Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_loopfilter_sse2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
highbd_quantize_intrin_sse2.c quantize: ignore skip_block in x86 2017-08-21 14:37:03 -07:00
highbd_sad4d_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
highbd_sad_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
highbd_subpel_variance_impl_sse2.asm Fix for issue 1114 compile error 2015-12-18 09:43:22 +00:00
highbd_variance_impl_sse2.asm Move variance functions to vpx_dsp 2015-05-26 12:01:52 -07:00
highbd_variance_sse2.c Resolve -Wshorten-64-to-32 in highbd variance. 2017-04-05 17:34:02 -07:00
intrapred_sse2.asm *.asm: normalize label format 2016-06-27 19:46:57 -07:00
intrapred_ssse3.asm Slow pshufb removal in 3 intra prediction functions. 2016-06-02 10:55:58 -07:00
inv_txfm_sse2.c Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_txfm_sse2.h inv_txfm_sse2.h: correct idct*/iadst* prototypes 2017-08-16 23:06:09 -07:00
inv_txfm_ssse3.c Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_txfm_ssse3.h Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_wht_sse2.asm bitdepth conversion: really use num elements 2017-02-16 15:02:48 +00:00
loopfilter_avx2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
loopfilter_sse2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
quantize_avx_x86_64.asm Revert "quantize avx: copy 32x32 implementation" 2017-08-25 16:56:08 +00:00
quantize_avx.c Revert "quantize avx: copy 32x32 implementation" 2017-08-25 16:56:08 +00:00
quantize_sse2.c quantize sse2: copy opts from ssse3 2017-08-22 13:01:44 -07:00
quantize_ssse3.c quantize ssse3: copy implementation to intrinsics 2017-08-24 07:47:51 -07:00
sad4d_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
sad4d_sse2.asm Code clean of sad4xNx4D_sse 2015-12-17 17:43:46 -08:00
sad_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
sad_sse2.asm sad_sse2: fix sad4xN(_avg) on windows 2015-12-18 19:19:32 -08:00
sad_sse3.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
sad_sse4.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
sad_ssse3.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
ssim_opt_x86_64.asm ssim: Replace unsigned long with uint32_t. 2015-08-07 11:48:31 -07:00
subpel_variance_sse2.asm Code clean of sub_pixel_variance4xh -- 2 2016-05-24 04:44:05 -07:00
subtract_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
sum_squares_sse2.c vp9_rdopt: correct size to vpx_sum_squares_2d_i16 2017-03-22 12:04:33 -07:00
transpose_sse2.h Add transpose_32bit_8x4() sse2 optimization 2017-08-02 16:15:58 -07:00
txfm_common_sse2.h Refactor highbd idct 4x4 and 8x8 x86 functions 2017-07-27 18:01:03 -07:00
variance_avx2.c vpx_dsp: get32x32var_avx2() cleanup 2017-08-18 13:44:09 -07:00
variance_sse2.c vpx_variance16x16_sse2: correct cast order 2017-07-25 16:45:40 -07:00
vpx_asm_stubs.c Remove get_filter_base() and get_filter_offset() in convolve 2017-09-05 15:22:36 -07:00
vpx_convolve_copy_sse2.asm Remove get_filter_base() and get_filter_offset() in convolve 2017-09-05 15:22:36 -07:00
vpx_high_subpixel_8t_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_high_subpixel_bilinear_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_8t_intrin_avx2.c Remove get_filter_base() and get_filter_offset() in convolve 2017-09-05 15:22:36 -07:00
vpx_subpixel_8t_intrin_ssse3.c Update convolve functions' assertions 2017-09-07 12:33:58 -07:00
vpx_subpixel_8t_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_8t_ssse3.asm Update vpx subpixel 1d filter ssse3 asm 2016-06-29 13:48:41 -07:00
vpx_subpixel_bilinear_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_bilinear_ssse3.asm improve vpx_filter_block1d* based on replace paddsw+psrlw to pmulhrsw 2016-06-27 17:50:45 +00:00