The 32x32 quantization process can potentially have the intermediate
stacks over 16-bit range, thereby causing enc/dec mismatch. This commit
fixes this overflow issue in the SSSE3 implementation, as well as the
prototype, of 32x32 quantization.
This fixes issue 607 from webm@googlecode.
Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
The two arrays are typically initialized to INT64_MAX, if they are not
filled with valid values before the addition, the values can overflow
and lead to wrong results.
Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
This patch is a reformatted version of optimizations done by
engineers at Intel (Erik/Tamar) who have been providing
performance feedback for VP9. For the test clips used (720p, 1080p),
up to 1.2% performance improvement was seen.
Change-Id: Ic1a7149098740079d5453b564da6fbfdd0b2f3d2
Switching from mi_{width, height}_log2 and b_{width, height}_log2 to
num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing
redundant code, adding const.
Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
Incorporates a speed feature for fast forward updates of
coefficients. This feature takes 3 values:
0 - use standard 2-loop version
1 - use a 1-loop version
2 - use a 1-loop version with reduced updates
Results: derfraw300 +0.007% (on speed 0) at feature value = 1
-0.160% (on speed 0) at feature value = 2
There is substantial speed up at speeds 2 and above for low
resolution sequences where the entropy updates are a big part
of the overall computations.
Change-Id: Ie96fc50777088a5bd441288bca6111e43d03bcae
This commit resolved a mis-alignment issue in compound inter-inter
prediction of sub8x8. This patch follows solution from dkovalev@.
Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932
In subpel_avg_variance functions, code similar to the following
punpkldq m2, [addr]
actually reads 8 bytes. For functions that are supposed to work on
buffers only have less 8 bytes a line, this caused valgrind error
of reading uninitialized memory.
Change-Id: I2a4c079dbdbc747829bd9e2ed85f0018ad2a3a34
vp9_setup_interp_filters before each inter block decoding, it is not
necessary to call it just before the whole frame decoding.
Change-Id: Id1b0ee62f987474e27eafba0013a4896b492c400
Removing references to plane_block_width and plane_block_height (we are
going to delete the latter ones).
Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287
Make the current head working properly, while working on fixing an
issue in the SSSE3 implementation of 32x32 quantization.
Change-Id: Ic029da3fd7f1f5e58bc641341cbd226ec49a16bc