vpx/vpx_dsp/x86
Johann 7c27872164 quantize avx: copy implementation to intrinsics
Adds an early exit based on ptest. Slightly slower than ssse3 in the
full case because of the extra check, but potentially faster if lots of
rows can be skipped.

Very close in speed to the assembly.

Can run in 32 bit, unlike the assembly. Allows reworking the function
prototype to use structs.

Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
2017-08-23 09:19:16 -07:00
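The early exit described in the commit message above can be pictured with a small sketch. This is not the code from `quantize_avx.c`: it is a scalar stand-in, and the names `group_is_below_zbin`, `quantize_sketch`, and `GROUP`, as well as the shift-based arithmetic, are hypothetical and chosen only for illustration. The scalar test loop plays the role that a vector compare followed by a single `ptest` would play in an intrinsics version. It shows why the check costs a little when every group has work to do and saves time when many groups can be skipped.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Coefficients examined per iteration. The value 16 is an assumption made
 * for this sketch, not a figure taken from the commit. */
#define GROUP 16

/* Hypothetical helper: returns 1 when no coefficient in the group reaches
 * the zero-bin threshold. In a vectorized version this whole loop would be
 * a compare producing a mask, followed by one ptest on that mask. */
static int group_is_below_zbin(const int16_t *coeff, int zbin) {
  int any = 0;
  for (int i = 0; i < GROUP; ++i) any |= (abs(coeff[i]) >= zbin);
  return !any;
}

/* Simplified quantizer. n must be a multiple of GROUP. The right shift is a
 * placeholder for the real quantization math. Returns the number of groups
 * that took the early exit. */
static int quantize_sketch(const int16_t *coeff, int n, int zbin, int shift,
                           int16_t *qcoeff) {
  int skipped = 0;
  for (int g = 0; g < n; g += GROUP) {
    if (group_is_below_zbin(coeff + g, zbin)) {
      /* Early exit: everything quantizes to zero, so skip the full path. */
      memset(qcoeff + g, 0, GROUP * sizeof(*qcoeff));
      ++skipped;
      continue;
    }
    /* Full path: only reached when at least one coefficient survives. */
    for (int i = g; i < g + GROUP; ++i) {
      const int a = abs(coeff[i]);
      const int q = (a >= zbin) ? (a >> shift) : 0;
      qcoeff[i] = (int16_t)(coeff[i] < 0 ? -q : q);
    }
  }
  return skipped;
}
```

Calling `quantize_sketch` on a buffer whose first group is all zeros and whose second group holds a few large values takes the early exit once and runs the full path once. The extra branch is the "extra check" the commit message refers to.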
add_noise_sse2.asm postproc : fix function parameters for noise functions. 2016-07-15 08:27:34 -07:00
avg_intrin_sse2.c highbd x86: consolidate tran_low_t conversions 2017-02-06 10:43:26 -08:00
avg_pred_sse2.c vpx_comp_avg_pred: sse2 optimization 2017-04-13 08:44:52 -07:00
avg_ssse3_x86_64.asm bitdepth conversion: really use num elements 2017-02-16 15:02:48 +00:00
bitdepth_conversion_avx2.h block error avx2: use tran_low_t 2017-02-16 12:39:02 -08:00
bitdepth_conversion_sse2.asm quantize_fp highbd ssse3: use tran_low_t for coeff 2017-02-16 07:40:56 -08:00
bitdepth_conversion_sse2.h correct bitdepth_conversion_sse2.h header guard 2017-02-16 12:43:33 -08:00
convolve.h Update highbd convolve functions arguments to use uint16_t src/dst 2017-04-25 14:22:19 -07:00
deblock_sse2.asm Fix segmentation fault caused by denoiser working with spatial SVC. 2017-02-21 09:38:28 -08:00
fwd_dct32x32_impl_avx2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_dct32x32_impl_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_impl_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_sse2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_sse2.h vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
fwd_txfm_ssse3_x86_64.asm Rework forward 8x8 2D-DCT ssse3 implementation 2017-01-10 12:50:55 -08:00
highbd_convolve_avx2.c High bit depth inter prediction horizontal/vertical filters AVX2 2017-05-03 12:18:01 -07:00
highbd_idct4x4_add_sse2.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct4x4_add_sse4.c Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx() 2017-08-04 15:33:37 -07:00
highbd_idct8x8_add_sse2.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct8x8_add_sse4.c Clean highbd idct x86 code with inline functions 2017-08-08 17:53:28 -07:00
highbd_idct16x16_add_sse2.c Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_idct16x16_add_sse4.c Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_idct32x32_add_sse2.c highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo 2017-08-17 15:37:38 -07:00
highbd_idct32x32_add_sse4.c highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo 2017-08-17 15:37:38 -07:00
highbd_intrapred_sse2.asm Code clean of highbd_tm_predictor_32x32 2015-12-22 16:51:57 -08:00
highbd_inv_txfm_sse2.h Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_inv_txfm_sse4.h Update highbd idct x86 optimizations. 2017-08-14 16:59:50 -07:00
highbd_loopfilter_sse2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
highbd_quantize_intrin_sse2.c quantize: ignore skip_block in x86 2017-08-21 14:37:03 -07:00
highbd_sad4d_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
highbd_sad_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
highbd_subpel_variance_impl_sse2.asm Fix for issue 1114 compile error 2015-12-18 09:43:22 +00:00
highbd_variance_impl_sse2.asm Move variance functions to vpx_dsp 2015-05-26 12:01:52 -07:00
highbd_variance_sse2.c Resolve -Wshorten-64-to-32 in highbd variance. 2017-04-05 17:34:02 -07:00
intrapred_sse2.asm *.asm: normalize label format 2016-06-27 19:46:57 -07:00
intrapred_ssse3.asm Slow pshufb removal in 3 intra prediction functions. 2016-06-02 10:55:58 -07:00
inv_txfm_sse2.c Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_txfm_sse2.h inv_txfm_sse2.h: correct idct*/iadst* prototypes 2017-08-16 23:06:09 -07:00
inv_txfm_ssse3.c Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_txfm_ssse3.h Update 32x32 idct sse2 and ssse3 optimizations. 2017-08-14 16:59:31 -07:00
inv_wht_sse2.asm bitdepth conversion: really use num elements 2017-02-16 15:02:48 +00:00
loopfilter_avx2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
loopfilter_sse2.c Unify loopfilter function names 2016-09-29 16:25:42 -07:00
quantize_avx_x86_64.asm quantize avx: copy implementation to intrinsics 2017-08-23 09:19:16 -07:00
quantize_avx.c quantize avx: copy implementation to intrinsics 2017-08-23 09:19:16 -07:00
quantize_sse2.c quantize sse2: copy opts from ssse3 2017-08-22 13:01:44 -07:00
quantize_ssse3_x86_64.asm quantize: ignore skip_block in x86 2017-08-21 14:37:03 -07:00
quantize_ssse3.c quantize ssse3: copy style from sse2 2017-08-22 14:25:27 -07:00
sad4d_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
sad4d_sse2.asm Code clean of sad4xNx4D_sse 2015-12-17 17:43:46 -08:00
sad_avx2.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
sad_sse2.asm sad_sse2: fix sad4xN(_avg) on windows 2015-12-18 19:19:32 -08:00
sad_sse3.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
sad_sse4.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
sad_ssse3.asm Move shared SAD code to vpx_dsp 2015-05-06 16:58:20 -07:00
ssim_opt_x86_64.asm ssim: Replace unsigned long with uint32_t. 2015-08-07 11:48:31 -07:00
subpel_variance_sse2.asm Code clean of sub_pixel_variance4xh -- 2 2016-05-24 04:44:05 -07:00
subtract_sse2.asm Use newer x86inc.asm 2015-08-07 16:44:44 -07:00
sum_squares_sse2.c vp9_rdopt: correct size to vpx_sum_squares_2d_i16 2017-03-22 12:04:33 -07:00
transpose_sse2.h Add transpose_32bit_8x4() sse2 optimization 2017-08-02 16:15:58 -07:00
txfm_common_sse2.h Refactor highbd idct 4x4 and 8x8 x86 functions 2017-07-27 18:01:03 -07:00
variance_avx2.c vpx_dsp: vpx_get16x16var_avx2() cleanup 2017-08-18 12:23:49 -07:00
variance_sse2.c vpx_variance16x16_sse2: correct cast order 2017-07-25 16:45:40 -07:00
vpx_asm_stubs.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
vpx_convolve_copy_sse2.asm Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve 2017-04-19 12:13:49 -07:00
vpx_high_subpixel_8t_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_high_subpixel_bilinear_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_8t_intrin_avx2.c vpx_subpixel_8t_intrin_avx2: tolerate unversioned clang 2016-09-16 07:14:17 +00:00
vpx_subpixel_8t_intrin_ssse3.c add vpx high bitdepth convolve8 NEON intrinsics optimization 2016-10-17 15:23:54 -07:00
vpx_subpixel_8t_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_8t_ssse3.asm Update vpx subpixel 1d filter ssse3 asm 2016-06-29 13:48:41 -07:00
vpx_subpixel_bilinear_sse2.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_subpixel_bilinear_ssse3.asm improve vpx_filter_block1d* based on replace paddsw+psrlw to pmulhrsw 2016-06-27 17:50:45 +00:00