Commit Graph

1851 Commits

Author SHA1 Message Date
Dmitry Kovalev
52fa10a9a3 Cleaning up vp9_append_sub8x8_mvs_for_idx.
Change-Id: Ic92f15d82ff5cfa3df655d08e460335c2ef8a325
2013-11-22 15:28:32 -08:00
Dmitry Kovalev
e0ec61187e Merge "Removing txfrm_block_to_raster_xy() call from extend_for_intra()." 2013-11-22 10:51:38 -08:00
Dmitry Kovalev
7c8cac3c21 Removing txfrm_block_to_raster_xy() call from extend_for_intra().
Change-Id: I6a48d1f35ed5fe7a2c7499675b339994c9c3bdf2
2013-11-21 19:30:58 -08:00
Dmitry Kovalev
ad3333e2cd Merge "Removing plane_block_{width, height} functions." 2013-11-21 16:37:27 -08:00
Frank Galligan
97d1258375 Revert "Add 16 wide neon horz loopfilter."
The change caused mismatches with some test vectors on neon.

Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/

Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
2013-11-21 14:01:33 -08:00
Yunqing Wang
e002bb99a8 Merge "Add filter_selectively_vert_row2 to enable parallel loopfiltering" 2013-11-21 11:25:55 -08:00
hkuang
370bf116a2 Merge "Remove unnecessary eob checking." 2013-11-21 11:24:02 -08:00
Frank Galligan
2dd77580c0 Merge "Add 16 wide neon horz loopfilter." 2013-11-21 10:29:30 -08:00
Yunqing Wang
b5e6d6cccf Add filter_selectively_vert_row2 to enable parallel loopfiltering
Added filter_selectively_vert_row2 to be ready for parallel
loopfiltering in vertical direction. This change did 2-row
filtering at a time. If 2 vertically adjacent 8x8 blocks do same
type of filtering, we can do 16-pixel filtering in parallel.

Next, we need to provide 16-pixel loopfiltering functions in c
and optimized versions for codec speedup.

Change-Id: Idf97bbdd70566e55bd30e1fd25cb8544e33291be
2013-11-21 09:53:15 -08:00
Yunqing Wang
6c4964602a Merge "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" 2013-11-21 09:40:02 -08:00
Frank Galligan
98de15137e Add 16 wide neon horz loopfilter.
Add support to do 16 pixel horizontal filtering in Neon.
Nexus devices saw about 0.5% decode speed increase.

Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d
2013-11-21 09:39:36 -08:00
Dmitry Kovalev
a218a96784 Merge "Adding MV_FP_SIZE constant." 2013-11-20 14:39:58 -08:00
Yunqing Wang
256cf7ee7d Correct ssse3 8/16-pixel wide sub-pixel filter calculation
Although no mismatch was indicated for 8/16 wide sub-pixel filters
in issue 661, they had similar problems that could cause mismatch
potentially. This patch fixed calculations in HORIZx8/16
and VERTx8/16.

Change-Id: I169961c9d40a20340995b7d22aafc89ccf30bfca
2013-11-20 12:52:56 -08:00
Dmitry Kovalev
79b5a2b142 Removing plane_block_{width, height} functions.
Change-Id: I29c0dfcf41a1253d5e2a0d2ff740c0c38ebaa5a2
2013-11-20 12:39:29 -08:00
Jim Bankoski
302c33e49f Merge "Clean up removal of vp9_pareto8 table." 2013-11-20 12:30:03 -08:00
Dmitry Kovalev
4956fcd31b Adding MV_FP_SIZE constant.
Change-Id: I98d750ee92ff51fb714980418ea28be3b1d0f3c6
2013-11-20 12:07:57 -08:00
hkuang
6debc446e0 Remove unnecessary eob checking.
Change-Id: Ia568f70bddc1a2b62141a0197459119ca74c22b5
2013-11-20 11:58:11 -08:00
Jim Bankoski
25aae73a30 Merge "remove the model and copy in pack_mb_tokens" 2013-11-20 11:34:30 -08:00
Jim Bankoski
5bbb0c6295 Clean up removal of vp9_pareto8 table.
Change-Id: I5556e8d1fc150be8a3e93af21900829b59a500dc
2013-11-20 11:17:26 -08:00
Jingning Han
81b9fd4310 Merge "Take out assertion from inverse transforms" 2013-11-20 10:55:27 -08:00
Jim Bankoski
03276bf6e6 remove the model and copy in pack_mb_tokens
Change-Id: I00a5203c8ed76c184d936fccf93d76e7c06773d3
2013-11-20 10:06:04 -08:00
Yunqing Wang
0ef63f596d Fix stack pointer in sub-pixel filters
In commit "3d50da5397d20abc932d81453b26cde758293a40", the stack
pointer was modified while aligning the stack, and it needed to
be pop out at the end.

Change-Id: I062971e195f1f2ab9d0ab5fb84dcf215a0fcaa67
2013-11-20 09:42:44 -08:00
Guillaume Martres
b00057c88a Merge "vpxenc: add --aq-mode flag to control adaptive quantization" 2013-11-20 08:13:28 -08:00
Jim Bankoski
7a8a68e2bd Merge "scan order table lookup same for encoder and decoder" 2013-11-19 16:22:48 -08:00
Yunqing Wang
e8f8e77642 Merge "Fix decoder mismatch with ssse3 enabled" 2013-11-19 16:19:32 -08:00
Yaowu Xu
dd04ff506b Merge "Move vp9_setup_interp_filter() to encoder" 2013-11-19 16:01:19 -08:00
Jim Bankoski
d6667dd54f scan order table lookup same for encoder and decoder
Change-Id: I473947b5ca70b7a81151926284bff86f8555492a
2013-11-19 15:31:43 -08:00
Yunqing Wang
3d50da5397 Fix decoder mismatch with ssse3 enabled
This patch fixed issue 661: "Decoder produces mismatched outputs
with ssse3 enabled and disabled." In sub-pixel filters, a pixel
value was multiplied by a filter coefficient, and the results
were added up. The order of adding up these multiplications had to
be arranged carefully to prevent incorrect overflowing.

Change-Id: Id08af4200fea9e1b896fc40157b8651c2c7e80f2
2013-11-19 15:10:04 -08:00
Dmitry Kovalev
65cee2f01a Merge "Simplifying partition context calculation." 2013-11-19 15:09:01 -08:00
Jim Bankoski
60aba6558f Merge "entropy code speedup" 2013-11-19 14:58:44 -08:00
Yaowu Xu
df78fea166 Move vp9_setup_interp_filter() to encoder
As it is used in encoder only.

Change-Id: I5f2a8abbe72bb18cbf6ce36a3dc7e132aeae8ec2
2013-11-19 14:57:58 -08:00
Yaowu Xu
f92cfa1ca6 Merge "Move vp9_sadmxn.h from common to encoder" 2013-11-19 14:41:33 -08:00
Jim Bankoski
8cf352abac entropy code speedup
Change-Id: Ic316d3374ff9a2b43897272260947d56765a0fdd
2013-11-19 14:31:38 -08:00
Jim Bankoski
ff4f1c4b76 scan order / neighbors converted to lookup
Change-Id: I64b189dfeee1cf3e90134a1a93497072f3361e5e
2013-11-19 12:55:44 -08:00
Yaowu Xu
30b03050a2 Move vp9_sadmxn.h from common to encoder
Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b
2013-11-19 12:46:08 -08:00
Dmitry Kovalev
f6ec323906 Simplifying partition context calculation.
Reversing bit order of partition_context_lookup, and modifying accordingly
update_partition_context() and partition_plane_context().

Change-Id: I64a11f1a94962a3bf217de2f50698cb781db71a5
2013-11-19 11:17:30 -08:00
Yunqing Wang
f16fb829e6 Merge "Improve vp9_iht4x4_16_add_sse2 (x1.341)" 2013-11-19 11:11:47 -08:00
Dmitry Kovalev
953b1e9683 Removing raster_block_offset_uint8() function.
There is no need to use that function, it is much clear to pass offset
directly to the buffer.

Change-Id: I9026cb0c5094c46f97df5d7f7daeb952f2843b24
2013-11-18 19:00:49 -08:00
Dmitry Kovalev
9e1e7bee48 Merge "Finally removing txfrm_block_to_raster_block() function." 2013-11-18 18:43:16 -08:00
Dmitry Kovalev
220af9ac2c Merge "Cleaning up vp9_entropy.c file." 2013-11-18 18:04:56 -08:00
Abo Talib Mahfoodh
613e2d2e90 Improve vp9_iht4x4_16_add_sse2 (x1.341)
This rebase is a better implementation of the previous ones.

Modifications are done to reduce the total clock cycle.
Speedup: 1.341
Compiled with -O3
Tested with: park_joy_420_720p50.y4m

Change-Id: I940eaf283f60597ca0d9d2e13d518878d55ff02d
2013-11-18 20:53:13 -05:00
Dmitry Kovalev
d8c06d23da Cleaning up vp9_entropy.c file.
Change-Id: I568f5e2d4ef2f2affe013ba1691ffb546f1fe8c6
2013-11-18 17:18:14 -08:00
Yaowu Xu
a42ab027fd Merge "Move vp9_extend.{h,c} from common to encoder" 2013-11-18 15:43:32 -08:00
Yaowu Xu
1c61e1960d Move vp9_extend.{h,c} from common to encoder
Since they used in encoder only. This commit also re-order includes
for the files that include vp9_extend.h

Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
2013-11-18 12:43:36 -08:00
Yunqing Wang
e3168b0c54 Merge "Do horizontal loopfiltering in parallel" 2013-11-18 10:03:41 -08:00
Jim Bankoski
83eb1975df partition context update speedup
This removes a lot of operations in setting partition context...

Change-Id: I365e6f5607ece85190cb21443988816dfa510ce3
2013-11-17 06:58:08 -08:00
Yunqing Wang
64f728caef Do horizontal loopfiltering in parallel
This patch followed "Rewrite filter_selectively_horiz for parallel
loopfiltering" commit, and added x86 SSE2 optimization to do
16-pixel filtering in parallel. Also, corrected the declaration
of aligned arrays. For 8-pixel-in-parallel case, improved the
calculation of the masks and filters. Updated the threshold loading
since the thresholds were already duplicated. Updated neon C functions
to call neon loopfilters twice.

Using tulip clip, tests showed it gave a ~1.5% decoder speed gain.

Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
2013-11-15 16:18:43 -08:00
Jingning Han
bdc4371174 Take out assertion from inverse transforms
Separate the rounding and right shift operations of forward transform
from those of inverse transform. Take out the assertion check from
inverse transforms. If the transform coefficients were constructed to
cause intermediate steps of inverse transform overflow, the codec will
just let it overflow without breaking the decoding flow.

Change-Id: I73cfc3706c4e840fc543a77cbc4cdb0b05d07730
2013-11-15 15:30:47 -08:00
hkuang
7424492a0b Let the idct vp9_idct32x32_34_add = vp9_idct32x32_1024_add
on arm until we implenment real vp9_idct32x32_34_add_neon.

This issue is due to commit 47665452f0
Merge "Add 32x32 idct function for eob<=34 case".

Change-Id: I56b5f0abc20e7dd1bba521f78a995e85d65ea296
2013-11-15 14:59:16 -08:00
Guillaume Martres
17084657e6 vpxenc: add --aq-mode flag to control adaptive quantization
Change-Id: I57e1ad4bed3487df12893ced77c49093f8755706
2013-11-15 19:42:20 +01:00