13 Commits

Author SHA1 Message Date
Johann
158c80cbb0 convolve8 optimizations for neon
Independent horizontal and vertical implementations.

Requires that blocks be built from 4x4 and [xy]_step_q4 == 16

6-10% improvement. CIF improved the least.

Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda
2013-07-11 11:08:19 -07:00
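Note on 158c80cbb0: the stated requirements suggest a guard in front of the optimized path. The following is a minimal sketch in C, not the code from the commit. The function name `can_use_fast_convolve` is hypothetical, and reading "built from 4x4" as "width and height are multiples of 4" is an assumption. A step of 16 in q4 fixed point (4 fractional bits) means one source pixel per output pixel, i.e. no scaling.

```c
#include <stdbool.h>

/* Hypothetical guard: returns true when a block meets the constraints
 * quoted in the commit message (unscaled, and tiled by 4x4 units).
 * The multiple-of-4 interpretation is an assumption. */
static bool can_use_fast_convolve(int w, int h, int x_step_q4, int y_step_q4) {
  /* 16 in q4 fixed point is a step of exactly 1.0 pixel. */
  const bool unscaled = (x_step_q4 == 16) && (y_step_q4 == 16);
  const bool tiles_4x4 = (w > 0) && (h > 0) && (w % 4 == 0) && (h % 4 == 0);
  return unscaled && tiles_4x4;
}
```

A caller would presumably fall back to a generic C implementation whenever the guard returns false.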
Ronald S. Bultje
decead7336 Replace copy_memNxM functions with a generic copy/avg function.
Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa
2013-07-10 18:27:24 -07:00
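Note on decead7336: a size-generic copy/avg routine can be sketched as below. This is an illustration only; the name `copy_or_avg_block`, the `do_avg` flag, and the round-to-nearest averaging are assumptions, not details taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic replacement for fixed-size copy helpers.
 * Copies a w x h block from src to dst, or (when do_avg is nonzero)
 * averages src into the existing dst contents with rounding. */
static void copy_or_avg_block(const uint8_t *src, ptrdiff_t src_stride,
                              uint8_t *dst, ptrdiff_t dst_stride,
                              int w, int h, int do_avg) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      /* Assumed rounding: (a + b + 1) >> 1. */
      dst[x] = do_avg ? (uint8_t)((dst[x] + src[x] + 1) >> 1) : src[x];
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

Passing the block dimensions as arguments is what removes the need for one function per NxM size.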
James Zern
e7b599f683 convolve_test: align filter arrays
fixes issue #583

Change-Id: I4b855a5b5b168c8961410cef6ab5e6d86f14d301
2013-06-17 23:14:15 -07:00
Deb Mukherjee
995ce523eb Cosmetic cleanups of filters
No bitstream change.

Removes unused filters and the code for the case of 2 switchable filters.
Also changes the 8tap-smooth filter coefficients for integer shifts to be
interpolating, consistent with the current implementation.

Change-Id: I96c542fd8c06f4e0df507a645976f58e6de92aae
2013-06-10 12:06:36 -07:00
James Zern
8fb48afd54 convolve_test: cosmetics
fix indent, whitespace, casts

Change-Id: Ifea8618a90f9da263a8955dd242bb3aa7fc59ae5
2013-05-02 19:30:47 -07:00
James Zern
b0e5775ebc convolve_test: remove unnecessary memset
input_ is filled with random values just afterward.
The size was wrong anyway, as input_ is allocated with memalign, so
sizeof(input_)==sizeof(uint8_t*).

Change-Id: I014b832ac60960cd22b6f369dbc9fd648d4055b5
2013-05-02 12:32:13 -07:00
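Note on b0e5775ebc: the size problem described there is the usual pointer-versus-array `sizeof` pitfall. The sketch below reproduces it in isolation, assuming a heap buffer of a made-up size (`kBufferSize`) allocated with `malloc` instead of memalign; none of these names come from the test itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { kBufferSize = 4096 }; /* hypothetical buffer size */

/* Fills a heap buffer with 0xFF, clears it using sizeof(pointer) as the
 * length, and returns how many bytes actually became zero. */
static size_t count_bytes_zeroed_via_pointer_sizeof(void) {
  uint8_t *input = (uint8_t *)malloc(kBufferSize);
  size_t zeroed = 0;
  if (input == NULL) return 0;
  memset(input, 0xFF, kBufferSize);
  /* sizeof applied to a pointer yields the pointer size, not the
   * size of the allocation it points to. */
  const size_t wrong_size = sizeof(input);
  memset(input, 0, wrong_size);
  for (size_t i = 0; i < kBufferSize; ++i) {
    if (input[i] == 0) ++zeroed;
  }
  free(input);
  return zeroed;
}
```

Only `sizeof(uint8_t *)` bytes are cleared (typically 4 or 8), which is why the memset was both wrong and, given the random fill that follows, unnecessary.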
John Koleszar
a9ebbcc338 convolve: support larger blocks, fix asm saturation bug
Updates the common convolution code to support blocks larger than
16x16, and rectangular blocks. This uncovered a bug in the SSSE3
filtering routines due to the order of application of saturation.
This commit fixes that bug, adjusts the unit test to bias its
random values towards the extremes, and adds a test to ensure that
all filters conform to the expected pairwise addition structure.

Change-Id: I81f69668b1de0de5a8ed43f0643845641525c8f0
2013-04-18 13:57:59 -07:00
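Note on a9ebbcc338: the log does not show the SSSE3 instruction sequence, but the general hazard is that saturating addition is not associative, so the order in which partial sums are combined can change the result. The sketch below demonstrates that property with a scalar model of 16-bit saturating addition; the helper names and the sample values are made up for illustration.

```c
#include <stdint.h>

/* Scalar model of a signed 16-bit saturating add. */
static int16_t sat_add16(int16_t a, int16_t b) {
  const int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}

/* (a + b) + c with saturation after each step. */
static int16_t sum_ab_then_c(int16_t a, int16_t b, int16_t c) {
  return sat_add16(sat_add16(a, b), c);
}

/* a + (b + c) with saturation after each step. */
static int16_t sum_a_then_bc(int16_t a, int16_t b, int16_t c) {
  return sat_add16(a, sat_add16(b, c));
}
```

With a = 30000, b = 10000, c = -20000 the exact sum is 20000. `sum_a_then_bc` returns 20000, while `sum_ab_then_c` clamps the intermediate 40000 to 32767 and returns 12767. Biasing test inputs towards extreme pixel values, as the commit does, makes such intermediate overflows far more likely to be exercised.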
John Koleszar
04c2407874 convolve test: validate 1D filters are 1D
Since the 8-tap lowpass filter is non-interpolating, the results are
different between applying it at whole-pel values and not. This
means that 1D-only versions are required to be implemented, as
opposed to being an optimization of the 2D case. Calling the 2D
filter instead of the horizontal-only filter is not equivalent
in this case. Update the test to pass invalid filters to the
unused stage of the 1D-only calls, to verify they're unused.

Change-Id: Idc1c490f059adadd4cc80dbe770c1ccefe628b0a
2013-02-27 11:19:11 -08:00
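Note on 04c2407874: the difference between an interpolating and a non-interpolating kernel at a whole-pel position can be shown with one filtered sample. The sketch below is an illustration under stated assumptions: both kernels are made-up values that sum to 128 (they are not the codec's coefficients), and the 7-bit rounding shift is assumed to match that weight.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Applies an 8-tap kernel along one dimension. `center` is the pixel
 * aligned with tap index 3; `stride` is 1 for a row, or the image
 * stride for a column. */
static uint8_t filter8_at(const uint8_t *center, ptrdiff_t stride,
                          const int16_t *kernel) {
  int sum = 0;
  for (int i = 0; i < 8; ++i) sum += center[(i - 3) * stride] * kernel[i];
  return clip_pixel((sum + 64) >> 7); /* round, then divide by 128 */
}

/* Hypothetical whole-pel kernels, for illustration only. */
static const int16_t kInterpolatingWholePel[8] = {0, 0, 0, 128, 0, 0, 0, 0};
static const int16_t kSmoothingWholePel[8] = {0, 0, 32, 64, 32, 0, 0, 0};
```

On a column containing a step edge, `{0, 0, 0, 200, 200, 200, 200, 200}`, filtering at index 3 returns the pixel unchanged with the interpolating kernel but a blurred value with the smoothing kernel. A horizontal-only call must leave the vertical direction untouched, so routing it through a 2D filter that applies a smoothing whole-pel kernel vertically would change the output.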
John Koleszar
557a1b209e Run all filters through convolve test
Updates the convolve test to verify that all filters match the
reference implementation. This verifies commit 30f866f, which
fixed some problems with the SSE3 version of the filters for
the vp9_sub_pel_filters_8s and vp9_sub_pel_filters_8lp banks
due to overflow and order of operations.

Change-Id: I6b5fe1a41bc20062e2e64633b1355ae58c9c592c
2013-02-27 11:15:20 -08:00
John Koleszar
6fd7dd1a70 Use 256-byte aligned filter tables
This avoids duplicating all the filters twice. Includes fixups to the
convolve routines and associated tests to make this work.

Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
2013-02-27 08:22:39 -08:00
John Koleszar
6a4f708c25 Refactor inter recon functions to support scaling
Ensure that all inter prediction goes through a common code path
that takes scaling into account. Removes a bunch of duplicate
1st/2nd predictor code. Also introduces a 16x8 mode for 8x8
MVs, similar to the 8x4 trick we were doing before. This has an
unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that
case for now.

Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
2013-02-26 10:03:29 -08:00
John Koleszar
29d47ac80e Restore SSSE3 subpixel filters in new convolve framework
This commit adds the 8 tap SSSE3 subpixel filters back into the code
underneath the convolve API. The C code is still called for 4x4
blocks, as well as compound prediction modes. This restores the
encode performance to be within about 8% of the baseline.

Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c
2013-02-08 12:18:14 -08:00
John Koleszar
5ca6a3667f Add 8-tap generic convolver
This commit introduces a new convolution function which will be used to
replace the existing subpixel interpolation functions. It is much the
same as the existing functions, but allows for changing the filter
kernel on a per-pixel basis, and doesn't bake in knowledge of the
filter to be applied or the size of the resulting block into the
function name.

Replacing the existing subpel filters will come in a later commit.

Change-Id: Ic9a5615f2f456cb77f96741856fc650d6d78bb91
2013-02-05 14:19:28 -08:00
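Note on 5ca6a3667f: the idea of choosing the filter kernel per output pixel can be sketched for a single row as follows. This is a simplified illustration, not the function added by the commit. All names are hypothetical, and the layout is an assumption: a bank of 16 sub-pixel phases with 8 taps each, positions in q4 fixed point, and kernels that sum to 128.

```c
#include <stdint.h>

enum { kTaps = 8, kPhases = 16, kFilterShift = 7 };

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Fills `bank` (kPhases * kTaps entries) with simple 2-tap bilinear
 * kernels placed at taps 3 and 4. Stand-in data for the sketch. */
static void make_bilinear_bank(int16_t *bank) {
  for (int i = 0; i < kPhases; ++i) {
    int16_t *kernel = bank + i * kTaps;
    for (int k = 0; k < kTaps; ++k) kernel[k] = 0;
    kernel[3] = (int16_t)(128 - 8 * i);
    kernel[4] = (int16_t)(8 * i);
  }
}

/* Filters one row. `src` points at the pixel for position 0; the caller
 * must provide 3 readable pixels before it and enough after the last
 * position. x0_q4 is the start position and x_step_q4 the per-pixel
 * advance, both with 4 fractional bits. */
static void convolve_row(const uint8_t *src, uint8_t *dst, int width,
                         const int16_t *bank, int x0_q4, int x_step_q4) {
  int x_q4 = x0_q4;
  for (int x = 0; x < width; ++x) {
    const uint8_t *s = src + (x_q4 >> 4) - (kTaps / 2 - 1);
    /* The fractional part selects the kernel for this pixel. */
    const int16_t *kernel = bank + (x_q4 & (kPhases - 1)) * kTaps;
    int sum = 0;
    for (int k = 0; k < kTaps; ++k) sum += s[k] * kernel[k];
    dst[x] = clip_pixel((sum + (1 << (kFilterShift - 1))) >> kFilterShift);
    x_q4 += x_step_q4;
  }
}
```

With a step of 16 every output pixel uses the same phase, which is the unscaled case. With a step of 8 the phase alternates between 0 and 8, so consecutive outputs use different kernels. Neither the kernel nor the block size appears in the function name, which is the property the commit message highlights.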