vpx/vp9/common
Geza Lore aa8f85223b Optimize vp9_highbd_block_error_8bit assembly.
A new version of vp9_highbd_block_error_8bit is now available,
optimized with AVX assembly. AVX itself does not buy us much, but the
non-destructive 3-operand encoding of the 128-bit SSEn integer
instructions helps to eliminate move instructions. The Sandy Bridge
micro-architecture cannot eliminate move instructions in the processor
front end, so AVX helps on these machines.

Two further optimizations are applied:

1. The common case of computing block error on 4x4 blocks is optimized
as a special case.
2. All arithmetic is speculatively done in 32 bits only. At the end of
the loop, the code detects whether overflow might have happened and, if
so, re-executes the whole computation using higher-precision arithmetic.
This case, however, is extremely rare in real use, so we achieve a
large net gain here.
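
The speculative scheme in item 2 can be sketched in scalar C. This is only
an illustration, not the AVX assembly from this change: the function names
(block_error_wide, block_error_speculative) and the sticky top-bit check
are assumptions made for the example. It relies on the input constraints
stated in this message, so every squared term is below 2^30 and a 32-bit
sum cannot wrap before its top bit has been observed set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exact path: 64-bit accumulation, used only as a fallback. */
static int64_t block_error_wide(const int32_t *coeff, const int32_t *dqcoeff,
                                size_t n, int64_t *ssz) {
  int64_t err = 0, sq = 0;
  for (size_t i = 0; i < n; i++) {
    const int64_t c = coeff[i];
    const int64_t d = c - dqcoeff[i];
    err += d * d; /* squared quantization error */
    sq += c * c;  /* squared input coefficient */
  }
  *ssz = sq;
  return err;
}

/* Hypothetical speculative path. Precondition (assumed, as in the text):
 * coeff[i] is in [-(2^15-1), 2^15-1] and dqcoeff[i] is 0 or has the same
 * sign, so |coeff[i] - dqcoeff[i]| <= 2^15-1 and each square is < 2^30. */
static int64_t block_error_speculative(const int32_t *coeff,
                                       const int32_t *dqcoeff, size_t n,
                                       int64_t *ssz) {
  uint32_t err = 0, sq = 0, seen = 0;
  for (size_t i = 0; i < n; i++) {
    const int32_t c = coeff[i];
    const int32_t d = c - dqcoeff[i];
    err += (uint32_t)(d * d);
    sq += (uint32_t)(c * c);
    seen |= err | sq; /* sticky record of every bit a running sum has set */
  }
  /* A sum below 2^31 plus a term below 2^30 cannot wrap, so any wrap is
   * preceded by a sum with bit 31 set. The check is conservative: it may
   * fall back even when no wrap actually happened. */
  if (seen & 0x80000000u) return block_error_wide(coeff, dqcoeff, n, ssz);
  *ssz = sq;
  return err;
}
```

The fallback costs a second pass over the block, which is acceptable only
because, as noted above, it is rarely taken.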

The optimizations rely on the fact that the coefficients are in the
range [-(2^15-1), 2^15-1], and that the quantized coefficients always
have the same sign as the input coefficients (in the worst case they are
0). These are the same assumptions that the old SSE2 assembly code for
the non high bitdepth configuration relied on. The unit tests have been
updated to take this constraint into consideration when generating test
input data.
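
As a rough illustration of what test input honoring these constraints
could look like, here is a small C sketch. It is not the code from the
updated unit tests: the helper names (next_random,
fill_constrained_inputs), the COEFF_MAX macro and the simple linear
congruential generator are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define COEFF_MAX 32767 /* 2^15 - 1, the bound stated in the text */

/* Hypothetical deterministic generator: returns 16 pseudo-random bits. */
static uint32_t next_random(uint32_t *state) {
  *state = *state * 1664525u + 1013904223u;
  return *state >> 16;
}

/* Fills coeff[] with values in [-COEFF_MAX, COEFF_MAX] and dqcoeff[] with
 * values that are either 0 or share the sign of the matching coeff[]. */
static void fill_constrained_inputs(int32_t *coeff, int32_t *dqcoeff,
                                    size_t n, uint32_t seed) {
  uint32_t state = seed;
  for (size_t i = 0; i < n; i++) {
    const int32_t c =
        (int32_t)(next_random(&state) % (2 * COEFF_MAX + 1)) - COEFF_MAX;
    const int32_t mag = (int32_t)(next_random(&state) % (COEFF_MAX + 1));
    coeff[i] = c;
    /* Copy the sign of c onto the magnitude; a zero input stays zero. */
    dqcoeff[i] = c > 0 ? mag : (c < 0 ? -mag : 0);
  }
}
```

Unconstrained random data would allow opposite signs, where the
difference no longer fits in 16 bits and the assumptions above break.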

Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
2015-10-21 12:30:40 +01:00
..
arm/neon Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names 2015-08-04 13:46:11 -07:00
mips Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names 2015-08-04 13:46:11 -07:00
x86 Accelerated transform in high bit depth 2015-09-28 21:09:16 -07:00
vp9_alloccommon.c VP9: move loopfilter build masks to decode loop 2015-09-29 05:20:49 -07:00
vp9_alloccommon.h Safely free all the frame buffers after all the workers finish the work. 2015-03-19 12:21:00 -07:00
vp9_blockd.c VP9: remove plane_type from macroblockd_plane 2015-09-30 15:15:11 -07:00
vp9_blockd.h VP9: remove plane_type from macroblockd_plane 2015-09-30 15:15:11 -07:00
vp9_common_data.c Include vpx_dsp_common.h when using VPXMIN/MAX 2015-08-31 14:36:35 -07:00
vp9_common_data.h vp9_common_data: right-size tables 2015-06-25 20:20:40 -07:00
vp9_common.h Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h 2015-08-10 15:37:14 -07:00
vp9_debugmodes.c Fix debugmodes file to print modes and MVs correctly 2015-04-27 17:09:38 -07:00
vp9_entropy.c vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_entropy.h vp9: simplify extrabits encoding 2015-10-06 16:26:08 -07:00
vp9_entropymode.c vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_entropymode.h Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vp9_entropymv.c vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_entropymv.h vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_enums.h Reducing size of MODE_INFO struct 2015-06-04 07:32:16 -07:00
vp9_filter.c remove vp9_get_interp_kernel() 2015-07-06 13:04:05 -07:00
vp9_filter.h Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vp9_frame_buffers.c vpx_mem: remove vpx_memset 2015-04-28 20:00:59 -07:00
vp9_frame_buffers.h Add get release decoder frame buffer functions. 2014-02-10 14:08:11 -08:00
vp9_idct.c Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h 2015-08-10 15:37:14 -07:00
vp9_idct.h Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names 2015-08-04 13:46:11 -07:00
vp9_loopfilter.c Merge "VP9: remove plane_type checks in loopfilter functions" 2015-09-30 22:11:21 +00:00
vp9_loopfilter.h VP9: move loopfilter build masks to decode loop 2015-09-29 05:20:49 -07:00
vp9_mfqe.c Add static syntax to copy_mem64x64 2015-08-07 10:16:27 -07:00
vp9_mfqe.h Multiframe Quality Enhancement(MFQE) in VP9. 2014-12-11 09:19:39 -08:00
vp9_mv.h Replacing int_mv with MV inside the first pass code. 2014-08-22 16:20:18 -07:00
vp9_mvref_common.c Remove tile param 2015-06-22 06:09:38 -07:00
vp9_mvref_common.h Remove tile param 2015-06-22 06:09:38 -07:00
vp9_onyxc_int.h Add a new enum type vpx_color_range_t 2015-10-16 16:27:18 -07:00
vp9_postproc.c Include vpx_dsp_common.h when using VPXMIN/MAX 2015-08-31 14:36:35 -07:00
vp9_postproc.h Multiframe Quality Enhancement(MFQE) in VP9. 2014-12-11 09:19:39 -08:00
vp9_ppflags.h Multiframe Quality Enhancement(MFQE) in VP9. 2014-12-11 09:19:39 -08:00
vp9_pred_common.c vp9_pred_common: inline vp9_get_tx_size_context 2015-06-15 18:41:22 -07:00
vp9_pred_common.h Include vpx_dsp_common.h when using VPXMIN/MAX 2015-08-31 14:36:35 -07:00
vp9_quant_common.c inline vp9_get_segdata() 2015-06-11 09:52:00 -07:00
vp9_quant_common.h Clean up header files in vp9_blockd.h and related files 2014-10-07 15:17:10 -07:00
vp9_reconinter.c Fix the sub8x8 block inter prediction with scaled reference frame 2015-09-08 11:09:30 -07:00
vp9_reconinter.h Make build_inter_predictors static function 2015-08-10 15:51:13 +00:00
vp9_reconintra.c Small cleanup 2015-10-01 11:19:13 -07:00
vp9_reconintra.h Refactor intra block prediction function 2015-07-13 15:20:47 -07:00
vp9_rtcd_defs.pl Optimize vp9_highbd_block_error_8bit assembly. 2015-10-21 12:30:40 +01:00
vp9_rtcd.c Reorganize *_rtcd() calling conventions 2015-04-15 11:12:05 -04:00
vp9_scale.c VPX: Add rtcd support for scaling. 2015-08-03 09:43:34 -07:00
vp9_scale.h Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vp9_scan.c Make iscan and scan neighbor arrays static const. 2014-10-02 14:08:14 -07:00
vp9_scan.h Re-worked header files 2015-05-22 11:19:51 -07:00
vp9_seg_common.c vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_seg_common.h vpx_dsp/prob.h: vp9_ -> vpx_ 2015-07-20 18:13:04 -07:00
vp9_textblit.c Code cleanup. 2013-02-22 11:03:14 -08:00
vp9_textblit.h vp9/common: add extern "C" to headers 2014-01-23 16:21:24 -08:00
vp9_thread_common.c VP9: move loopfilter build masks to decode loop 2015-09-29 05:20:49 -07:00
vp9_thread_common.h Fix a macro definition 2015-09-29 09:34:42 -07:00
vp9_tile_common.c Include vpx_dsp_common.h when using VPXMIN/MAX 2015-08-31 14:36:35 -07:00
vp9_tile_common.h Refactor decode_tiles and loopfilter code. 2014-05-20 14:47:45 -07:00