these are compatible as they only load one element of the input, so the
larger size of tran_low_t makes no difference in little-endian builds.
note the asm is incompatible with big-endian, but there are other points of
failure there, so big-endian is currently considered unsupported.
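
A minimal sketch of why a single-element load is unaffected by the wider
type, assuming tran_low_t is a 32-bit integer in high bitdepth builds and the
value fits in 16 bits. The helper names load_first_as_int16 and
is_little_endian are hypothetical and only illustrate the byte layout; they
are not part of the asm:

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the wider coefficient type used in high bitdepth builds. */
typedef int32_t tran_low_t;

/* Hypothetical helper: read the first 16 bits at `input`, the way code
 * written for 16-bit coefficients would load its single element. */
static int16_t load_first_as_int16(const void *input) {
  int16_t value;
  memcpy(&value, input, sizeof(value));
  return value;
}

/* Hypothetical helper: returns 1 when the low-order byte is stored first. */
static int is_little_endian(void) {
  const uint16_t probe = 1;
  uint8_t first_byte;
  memcpy(&first_byte, &probe, sizeof(first_byte));
  return first_byte == 1;
}
```

On a little-endian target the low 16 bits of the first 32-bit element sit at
the lowest address, so the 16-bit load returns the same value. On big-endian
the same load would return the high half instead.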
BUG=webm:1294
Change-Id: Icd2665a0699bccae92d1bea43a95b0a83fb17028
these functions are currently incompatible and unreferenced in rtcd, so
exclude them from the build.
BUG=webm:1294
Change-Id: I7790c195a91e1b142f56c04d2a5e305d9133b896
This function only exists as a shortcut to subpixel variance with
predefined offsets: xoffset = 4 for horizontal, yoffset = 4 for vertical,
and both for "hv".
Removing this allows the existing optimizations for the variance
functions to be called. Instead of having only sse2 optimizations, this
gives sse2, ssse3, msa and neon.
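
A minimal sketch of the equivalence, assuming an 8-step bilinear filter whose
taps sum to 128, so that offset 4 is the half-pixel position. The names
kBilinearTaps, sub_pixel_variance_h and half_pixel_variance_h are
hypothetical, and the code handles a single 8-pixel row rather than a full
block:

```c
#include <stdint.h>

enum { kWidth = 8 };

/* Assumed 8-step bilinear taps; each pair sums to 128. */
static const int kBilinearTaps[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Generic path: filter horizontally at xoffset, then compute the variance
 * against ref. src must hold kWidth + 1 pixels. */
static unsigned int sub_pixel_variance_h(const uint8_t *src, int xoffset,
                                         const uint8_t *ref,
                                         unsigned int *sse) {
  const int *taps = kBilinearTaps[xoffset];
  int sum = 0;
  unsigned int sq = 0;
  for (int i = 0; i < kWidth; ++i) {
    const int pred = (src[i] * taps[0] + src[i + 1] * taps[1] + 64) >> 7;
    const int diff = pred - ref[i];
    sum += diff;
    sq += (unsigned int)(diff * diff);
  }
  *sse = sq;
  return sq - (unsigned int)((sum * sum) / kWidth);
}

/* Shortcut path: a rounded average of two neighbouring pixels. */
static unsigned int half_pixel_variance_h(const uint8_t *src,
                                          const uint8_t *ref,
                                          unsigned int *sse) {
  int sum = 0;
  unsigned int sq = 0;
  for (int i = 0; i < kWidth; ++i) {
    const int pred = (src[i] + src[i + 1] + 1) >> 1;
    const int diff = pred - ref[i];
    sum += diff;
    sq += (unsigned int)(diff * diff);
  }
  *sse = sq;
  return sq - (unsigned int)((sum * sum) / kWidth);
}
```

With taps { 64, 64 }, (64 * a + 64 * b + 64) >> 7 reduces to (a + b + 1) >> 1,
so under these assumptions both paths produce the same prediction and the
same variance.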
BUG=webm:1273
Change-Id: Ieb407b423b91b87d33c4263c6a1ad5e673b0efd6
Use pixel domain distortion metric in speed 0. This improves the
compression performance by 0.3% for both low and high resolution
test sets.
Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d
development has moved to the nextgenv2 branch and a snapshot from here
was used to seed aomedia
BUG=b/29457125
Change-Id: Iedaca11ec7870fb3a4e50b2c9ea0c2b056a0d3c0
Followed the code style of other lpf functions.
These 2 functions put 2 rows of data in a single xmm register,
so they have similar but not identical filter operations,
and cannot share the same macros.
Change-Id: I3bab55a5d1a1232926ac8fd1f03251acc38302bc
This reverts commit 9aa083d164e0d39086aa0c83f0d1a0d0f0d1ba61.
Fixes a decoder mismatch with 32bit PIC builds.
Change-Id: I94717df662834810302fe3594b38c53084a4e284
Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.
This patch makes the encoder faster in both the high bitdepth and 8 bit
configurations. In the high bitdepth configuration, it affects profile 0
only.
Based on my profiling using 1080p input, the net gain is 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.
Change-Id: I86714a6b7364da20cd468cd784247009663a5140
This is based on the original patch optimized for 32bit
platforms by Tamar/Ilya and now uses the x86inc style asm.
The assembly was also modified to support 64bit platforms.
Change-Id: Ice12f249bbbc162a7427e3d23fbf0cbe4135aff2
These were lost in the great sub pixel variance move of
6a82f0d7fb9ee908c389e8d55444bbaed3d54e9c.
Not having these functions caused a ~10% performance regression in
some realtime vp8 encodes.
Change-Id: I50658483d9198391806b27899f2c0d309233c4b5
This commit cleans up the function naming convention in vpx_dsp. It
replaces vp9_ prefix of global functions with vpx_ prefix. It also
removes the vp9_ prefix from static functions.
Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf
Add a guard to exclude dspr2 inverse transform files from the vpx_dsp
makefile when high bit-depth is turned on. This fixes the Jenkins
nightly build.
Change-Id: Ibacd86563af1ec4810c550905b3fa0397baeeafc