vpx/vp9/encoder/x86
Ronald S. Bultje e5fb4b61b6 Use pmovmskb to skip quantize loops over empty coefficients.
If none of the 16 coefficients that we quantize per loop iteration
are larger than the zbin, directly skip to the next round of coeffs,
rather than doing a full quantize loop that will eventually result
in 16 zeroes. This incurs a jump cost, but saves a lot of other work.
32x32 quant goes from 1349 -> 1184 cycles. The same approach yielded
no significantly positive results for smaller transforms, so it is not
used there (8x8: 103 -> 101 cycles; 16x16: 302 -> 306 cycles).

Change-Id: I8fca17dc2543fc8eed1dbcd5100145e3c3a9b647
2013-07-02 16:34:24 -07:00
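
The early-out described in the commit message above can be sketched in C with SSE2 intrinsics. This is an illustration only, not the code in vp9_quantize_ssse3.asm: the function names, the unaligned loads, and the scalar divide-by-step quantizer are all hypothetical stand-ins. It assumes 16-bit coefficients, a count that is a multiple of 16, and no coefficient equal to INT16_MIN.

```c
#include <emmintrin.h> /* SSE2: _mm_movemask_epi8 maps to pmovmskb */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any of 16 coefficients has |c| > zbin.
 * Assumes no coefficient equals INT16_MIN (its negation would overflow). */
static int any_above_zbin16(const int16_t *coeff, int16_t zbin) {
  const __m128i zero = _mm_setzero_si128();
  const __m128i z = _mm_set1_epi16(zbin);
  __m128i a = _mm_loadu_si128((const __m128i *)coeff);
  __m128i b = _mm_loadu_si128((const __m128i *)(coeff + 8));
  /* Absolute value as max(x, -x), which needs only SSE2. */
  a = _mm_max_epi16(a, _mm_sub_epi16(zero, a));
  b = _mm_max_epi16(b, _mm_sub_epi16(zero, b));
  /* Each 16-bit lane becomes all ones where |c| > zbin, else zero. */
  const __m128i m = _mm_or_si128(_mm_cmpgt_epi16(a, z), _mm_cmpgt_epi16(b, z));
  /* pmovmskb gathers the top bit of every byte into an integer mask. */
  return _mm_movemask_epi8(m) != 0;
}

/* Hypothetical quantizer: a plain divide stands in for the real math.
 * n must be a multiple of 16. Returns how many 16-wide groups were skipped. */
static int quantize_skip_sketch(const int16_t *coeff, int n, int16_t zbin,
                                int16_t step, int16_t *qcoeff) {
  int skipped = 0;
  for (int i = 0; i < n; i += 16) {
    if (!any_above_zbin16(coeff + i, zbin)) {
      /* All 16 outputs would be zero: write zeroes and move on. */
      memset(qcoeff + i, 0, 16 * sizeof(*qcoeff));
      skipped++;
      continue;
    }
    for (int j = i; j < i + 16; j++) {
      const int mag = abs(coeff[j]);
      const int q = mag > zbin ? mag / step : 0;
      qcoeff[j] = (int16_t)(coeff[j] < 0 ? -q : q);
    }
  }
  return skipped;
}
```

The branch only pays off when whole groups of 16 are frequently below the zbin, which is consistent with the commit's measurements: a gain for 32x32 and roughly neutral results for 8x8 and 16x16.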
..
vp9_dct_mmx.asm add private to assembly files to insure proper chromebuild 2012-12-20 09:40:18 -08:00
vp9_dct_mmx.h google style guide include guards 2012-11-30 07:30:59 -08:00
vp9_dct_sse2.c Merge "Enable SSE2 4x4 ADST/DCT transform" 2013-06-29 15:57:04 -07:00
vp9_error_sse2.asm Make coefficient skip condition an explicit RD choice. 2013-06-28 10:28:49 -07:00
vp9_fwalsh_sse2.asm add private to assembly files to insure proper chromebuild 2012-12-20 09:40:18 -08:00
vp9_mcomp_x86.h google style guide include guards 2012-11-30 07:30:59 -08:00
vp9_quantize_ssse3.asm Use pmovmskb to skip quantize loops over empty coefficients. 2013-07-02 16:34:24 -07:00
vp9_sad4d_sse2.asm Implement SSE version for sad4x8x4d and SSE2 version for sad8x4x4d. 2013-06-12 17:40:01 -04:00
vp9_sad_mmx.asm add private to assembly files to insure proper chromebuild 2012-12-20 09:40:18 -08:00
vp9_sad_sse2.asm Add averaging-SAD functions for 8-point comp-inter motion search. 2013-06-25 12:57:28 -07:00
vp9_sad_sse3.asm Merge master branch into experimental 2013-03-01 11:06:05 -08:00
vp9_sad_sse4.asm this commit converts all sad ptrs to uint32 2013-02-28 08:46:35 -08:00
vp9_sad_ssse3.asm add private to assembly files to insure proper chromebuild 2012-12-20 09:40:18 -08:00
vp9_ssim_opt.asm add private to assembly files to insure proper chromebuild 2012-12-20 09:40:18 -08:00
vp9_subpel_variance_impl_sse2.asm Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. 2013-06-20 09:34:25 -07:00
vp9_subpel_variance.asm Add missing SECTION .text marker in assembly file. 2013-06-21 12:55:46 -07:00
vp9_subtract_sse2.asm Remove emms - that shouldn't be there. 2013-06-21 14:45:04 -07:00
vp9_temporal_filter_apply_sse2.asm Fix --as=nasm compatibility for new asm code. 2013-02-27 09:55:38 -08:00
vp9_variance_impl_mmx.asm Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. 2013-06-20 09:34:25 -07:00
vp9_variance_impl_sse2.asm Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. 2013-06-20 09:34:25 -07:00
vp9_variance_mmx.c Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. 2013-06-20 09:34:25 -07:00
vp9_variance_sse2.c SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). 2013-06-20 15:59:48 -07:00