Commit Graph

6317 Commits

Author SHA1 Message Date
Jingning Han
e22bb0dc8e Merge "Refactor 16x16 unit tests" 2013-08-30 08:53:19 -07:00
Tero Rintaluoma
e326cecf18 Fix intermediate height in convolve_c
- Intermediate height was not correct i.e. when block size is 4 and
  y_step_q4 is 6. In this case intermediate height was
  (4*6) >> 4 = 1 and vertical interpolation needs two source pixels
  plus 7 extra pixels for taps.
- Also if the current output block is 16x16 and we are using 4x upscaling
  we need only 12 rows after horizontal filtering instead of 16.

  Patch Set 2: Intermediate_height updated after CL 66723
               "Fix bug in convolution functions (filter selection)"

Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
2013-08-30 10:31:21 +03:00
Jim Bankoski
1d44fc0c49 Merge "rework filter_block_plane" 2013-08-29 20:11:09 -07:00
Jim Bankoski
bc50961a74 rework filter_block_plane
Change-Id: I55c3b60c4c0f4910d3dfb70e3edaae00cfa8dc4d
2013-08-29 17:00:05 -07:00
Jingning Han
ec4b2742e7 Refactor 16x16 unit tests
Make the new test module comply to the unit test rules.

Change-Id: Id79ff7f03f870973ffbc74f26d64edb418b75299
2013-08-29 16:49:11 -07:00
Jingning Han
c86c5443eb Merge "Fix overflow issue in SSSE3 32x32 quantization" 2013-08-29 16:49:04 -07:00
Paul Wilkins
1f4bf79d65 Added per pixel inter rd hit count stats
Added some code to output normalized rd hit count stats.
In effect this approximates to the average number of rd
operations/tests per pixel for the sequence.

The results are not quite accurate and I have not bothered
to account for partial SB64s at frame edges and for key frames
However they do give some idea of the number of modes /
prediction methods being tested for each pixel across the
different partition sizes. This indicates how much scope their
is for further gains either by reducing the number of partitions
examined or the modes per partition through heuristics.

Patch 3 moved place where count incremented so partial rd
tests that are aborted with INT_MAX return are also counted.

Example numbers for first 50 frames of Akiyo.
Speed 0 ~84.4 rd operations / pixel
Speed 1 ~28.8
Speed 2 ~11.9

Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
2013-08-30 00:13:51 +01:00
Deb Mukherjee
b6dbf11ed5 Merge "Adds a speed feature for fast 1-loop forw updates" 2013-08-29 15:54:04 -07:00
James Zern
e83e8f0426 Merge changes Ib1e853f9,Ifd75c809,If3e83404
* changes:
  consistently name VP9_COMMON variables #3
  consistently name VP9_COMMON variables #2
  consistently name VP9_COMMON variables #1
2013-08-29 15:50:56 -07:00
Yaowu Xu
ee961599e1 Merge "Fixed potential overflows" 2013-08-29 15:43:26 -07:00
James Zern
d765df2796 consistently name VP9_COMMON variables #3
stragglers

Change-Id: Ib1e853f9a331b7b66639dc34d79568d84d1930f1
2013-08-29 13:27:41 -07:00
James Zern
aa05321262 consistently name VP9_COMMON variables #2
oci -> cm

Change-Id: Ifd75c809d9cc99034d3c2fccc4653a78b3aec21f
2013-08-29 13:25:58 -07:00
James Zern
924d74516a consistently name VP9_COMMON variables #1
pc -> cm

Change-Id: If3e83404f574316fdd3b9aace2487b64efdb66f3
2013-08-29 13:25:57 -07:00
Dmitry Kovalev
e80bf802a9 Merge "Renaming txfm_size to tx_size." 2013-08-29 12:30:18 -07:00
Jingning Han
abff678866 Fix overflow issue in SSSE3 32x32 quantization
The 32x32 quantization process can potentially have the intermediate
stacks over 16-bit range, thereby causing enc/dec mismatch. This commit
fixes this overflow issue in the SSSE3 implementation, as well as the
prototype, of 32x32 quantization.

This fixes issue 607 from webm@googlecode.

Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
2013-08-29 11:00:54 -07:00
Yaowu Xu
aaa7b44460 Fixed potential overflows
The two arrays are typically initialized to INT64_MAX, if they are not
filled with valid values before the addition, the values can overflow
and lead to wrong results.

Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
2013-08-29 10:26:52 -07:00
Scott LaVarnway
22dc946a7e Improved mb_lpf_horizontal_edge_w_sse2_8
This patch is a reformatted version of optimizations done by
engineers at Intel (Erik/Tamar) who have been providing
performance feedback for VP9.  For the test clips used (720p, 1080p),
up to 1.2% performance improvement was seen.

Change-Id: Ic1a7149098740079d5453b564da6fbfdd0b2f3d2
2013-08-29 08:30:17 -04:00
Dmitry Kovalev
b71807082c Merge "General code cleanup." 2013-08-28 12:57:49 -07:00
Dmitry Kovalev
db20806710 Merge "Removing unnecessary call to vp9_setup_interp_filters." 2013-08-28 12:31:08 -07:00
Dmitry Kovalev
b62ddd5f8b General code cleanup.
Switching from mi_{width, height}_log2 and b_{width, height}_log2 to
num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing
redundant code, adding const.

Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
2013-08-28 12:22:37 -07:00
Deb Mukherjee
e02dc84c1a Adds a speed feature for fast 1-loop forw updates
Incorporates a speed feature for fast forward updates of
coefficients. This feature takes 3 values:
0 - use standard 2-loop version
1 - use a 1-loop version
2 - use a 1-loop version with reduced updates

Results: derfraw300 +0.007% (on speed 0) at feature value = 1
                    -0.160% (on speed 0) at feature value = 2

There is substantial speed up at speeds 2 and above for low
resolution sequences where the entropy updates are a big part
of the overall computations.

Change-Id: Ie96fc50777088a5bd441288bca6111e43d03bcae
2013-08-28 10:56:52 -07:00
Dmitry Kovalev
851a2fd72c Renaming txfm_size to tx_size.
Change-Id: I752e374867d459960995b24d197301d65ad535e3
2013-08-27 19:47:53 -07:00
Jingning Han
eb7acb5524 Merge "Fix buf alignment in sub8x8 comp inter-inter pred" 2013-08-27 19:03:12 -07:00
Dmitry Kovalev
1d3f94efe2 Merge "Adding get_entropy_context function." 2013-08-27 17:02:36 -07:00
Frank Galligan
7d058ef86c Merge "Fix winodws warning." 2013-08-27 15:39:58 -07:00
Frank Galligan
f1560ce035 Fix winodws warning.
Const is not needed on the function parameter.

Change-Id: I38c2a7317cb6f42f70bbddfde9a2cd18d65ceb1c
2013-08-27 15:19:55 -07:00
Dmitry Kovalev
a93992e725 Adding get_entropy_context function.
Moving common code from encoder and decoder to this function.

Change-Id: I60fa643fb1ddf7ebbff5e83b6c4710137b0195ef
2013-08-27 14:17:53 -07:00
hkuang
3a679e56b2 Add neon optimize vp9_short_idct16x16_1_add.
Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5
2013-08-27 14:00:27 -07:00
hkuang
ce04b1aa62 Merge "Add neon optimize vp9_short_idct8x8_1_add." 2013-08-27 12:10:07 -07:00
Dmitry Kovalev
7b95f9bf39 Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder.
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
2013-08-27 11:05:08 -07:00
Dmitry Kovalev
ba10aed86d Merge "Using num_8x8_* lookup tables instead of mi_*_log2." 2013-08-27 10:49:36 -07:00
Dmitry Kovalev
12e5931a9a Merge "Using existing functions instead of raw expressions." 2013-08-27 10:33:34 -07:00
Dmitry Kovalev
f77c6973a1 Merge "Cleaning up decode_block_intra function." 2013-08-27 10:17:56 -07:00
Dmitry Kovalev
f389ca2acc Merge "Cleaning up model_rd_for_sb_y_tx." 2013-08-27 10:17:10 -07:00
Dmitry Kovalev
bfebe7e927 Merge "Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the common/decoder." 2013-08-27 10:15:21 -07:00
Dmitry Kovalev
78e670fcf8 Merge "Renaming D27 to D207." 2013-08-27 10:03:57 -07:00
Jingning Han
2d6aadd7e2 Fix buf alignment in sub8x8 comp inter-inter pred
This commit resolved a mis-alignment issue in compound inter-inter
prediction of sub8x8. This patch follows solution from dkovalev@.

Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932
2013-08-27 09:28:05 -07:00
Yaowu Xu
45125ee573 Merge "fixed the reading too many bytes" 2013-08-27 09:09:18 -07:00
Yaowu Xu
9482c07953 fixed the reading too many bytes
In subpel_avg_variance functions, code similar to the following

punpkldq m2, [addr]

actually reads 8 bytes. For functions that are supposed to work on
buffers only have less 8 bytes a line, this caused valgrind error
of reading uninitialized memory.

Change-Id: I2a4c079dbdbc747829bd9e2ed85f0018ad2a3a34
2013-08-27 08:39:20 -07:00
Jim Bankoski
3e43e49ffd Merge "Add a test vector that tests color space 444" 2013-08-27 08:33:12 -07:00
Dmitry Kovalev
44b7854c84 Removing unnecessary call to vp9_setup_interp_filters.
vp9_setup_interp_filters before each inter block decoding, it is not
necessary to call it just before the whole frame decoding.

Change-Id: Id1b0ee62f987474e27eafba0013a4896b492c400
2013-08-26 17:25:49 -07:00
hkuang
36e9b82080 Add neon optimize vp9_short_idct8x8_1_add.
Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4
2013-08-26 16:28:57 -07:00
hkuang
ba8fc71979 Merge "Add neon optimize vp9_short_idct4x4_1_add." 2013-08-26 16:26:38 -07:00
Dmitry Kovalev
657ee2d719 Cleaning up model_rd_for_sb_y_tx.
Removing references to plane_block_width and plane_block_height (we are
going to delete the latter ones).

Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287
2013-08-26 16:18:28 -07:00
hkuang
69384f4fad Add neon optimize vp9_short_idct4x4_1_add.
Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5
2013-08-26 15:55:16 -07:00
Jim Bankoski
bbb490f6a3 Merge "Fix Chroma plane md5 check" 2013-08-26 15:27:14 -07:00
Jim Bankoski
a5cb05c45d Add a test vector that tests color space 444
This adds a test vector for 444 color space.

Change-Id: I1e2ac3883211989a062cfafc0e58151b14d294b8
2013-08-26 15:24:35 -07:00
Dmitry Kovalev
242460cb66 Cleaning up decode_block_intra function.
Change-Id: Ia41ea5d526d15fcbc9b56d74079593cf8b2fdf66
2013-08-26 15:24:12 -07:00
Jim Bankoski
af13fbb70f Fix Chroma plane md5 check
Chroma plane MD5 calculation was incorrect for 444 and 422
yuv color spaces.

Change-Id: If985396871a2f57db85108a4355172f9793d3007
2013-08-26 14:26:38 -07:00
Dmitry Kovalev
b25589c6bb Using num_8x8_* lookup tables instead of mi_*_log2.
Change-Id: I8a246b3d056c98be614d05a90bc261e2441ffc10
2013-08-26 14:22:54 -07:00