Commit Graph

7416 Commits

Author SHA1 Message Date
Dmitry Kovalev
0a64f943fc Merge "Cleaning up entropy probability update in encoder." 2013-11-22 10:52:40 -08:00
Dmitry Kovalev
e0ec61187e Merge "Removing txfrm_block_to_raster_xy() call from extend_for_intra()." 2013-11-22 10:51:38 -08:00
Yunqing Wang
384089004d Merge "Improve vp9_fdct4x4_sse2 (x1.2)" 2013-11-22 10:39:55 -08:00
Yunqing Wang
ed36720b66 Do vertical loopfiltering in parallel
This patch followed "Add filter_selectively_vert_row2 to enable
parallel loopfiltering" commit, and added x86 SSE2 optimization
to do 16-pixel filtering in parallel. For other optimizations
(neon and dspr2), current 16-pixel functions were done by calling
8-pixel functions twice, and real 16-pixel functions could be added
later.

Decoder speedup:
tulip clip:     2% speed gain;
old_town_cross: 1.2% speed gain;
bus:            2% speed gain.

Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7
2013-11-22 10:04:51 -08:00
Yaowu Xu
16ad35f64e Merge "Fix the cpuid macro for x86_64 non-gcc build" 2013-11-22 10:03:51 -08:00
Adrian Grange
d427fab587 Fix bug in extend_frame chroma extended too far
This fixes issue 667.

In the case where the frame was an odd number of pixels
wide or high, the border was being extended by one col
or row too far.

The calculation of color plane dimensions was modified
to use those already computed at the time the frame
buffer was allocated.

Also freed the temporary scaling buffer in vpxdec to
prevent a memory leak.

Change-Id: Ied04bdcdfd77469731408c05da205db1a6f89bf5
2013-11-22 09:55:10 -08:00
Jim Bankoski
a64a192c90 Merge changes Id1698a35,Idcabd0b9
* changes:
  detokenization speedups
  Don't write 0's to token_cache
2013-11-22 08:16:17 -08:00
Deb Mukherjee
5576a4e1cb Merge "Refactoring of rate control - part 1" 2013-11-22 08:06:48 -08:00
Deb Mukherjee
f1781e86b7 Refactoring of rate control - part 1
Moves all rate control variables to a separate structure,
removes some currently unused variables,
moves some rate control functions to vp9_ratectrl.c,
and splits the encode_frame_to_data_rate function.

Change-Id: I4ed54c24764b3b6de2dd676484f01473724ab52b
2013-11-22 07:07:24 -08:00
Dmitry Kovalev
7c8cac3c21 Removing txfrm_block_to_raster_xy() call from extend_for_intra().
Change-Id: I6a48d1f35ed5fe7a2c7499675b339994c9c3bdf2
2013-11-21 19:30:58 -08:00
Yaowu Xu
36dfb90c53 Fix the cpuid macro for x86_64 non-gcc build
Change-Id: I0c44800db10db8d74c1ddfe89abecfd1c53d0f8d
2013-11-21 18:02:20 -08:00
Tom Finegan
65ac291f20 Merge "vpxenc: Add vpxenc.h and move/rename the global_config struct" 2013-11-21 17:56:26 -08:00
Jim Bankoski
70ffd5d055 detokenization speedups
removed unnecessary ifs and branches ..

Change-Id: Id1698a35292659388f48926791024d1400f2cea9
2013-11-21 16:55:22 -08:00
Dmitry Kovalev
5925ba08a3 Merge "Using num_4x4_blocks_* instead of b_{width, height}_log2." 2013-11-21 16:48:34 -08:00
Tom Finegan
49dc9cafa6 vpxenc: Add vpxenc.h and move/rename the global_config struct
- Rename the struct to VpxEncoderConfig.
- The idea behind this is to enable checking the global settings against
  stream specific settings in source files other than vpxenc.c.

Change-Id: Ic736cbb714845b9466acb34671780d65b83ad1a8
2013-11-21 16:46:40 -08:00
Dmitry Kovalev
ad3333e2cd Merge "Removing plane_block_{width, height} functions." 2013-11-21 16:37:27 -08:00
Dmitry Kovalev
6042f781f7 Merge "Using txfrm_block_to_raster_xy() in encoder." 2013-11-21 16:24:22 -08:00
Dmitry Kovalev
485682c30a Adding select_tx_size() function.
Change-Id: I9d18b31661a2ccdcd4e25956882c7fc2d4b7002e
2013-11-21 15:55:40 -08:00
Dmitry Kovalev
27e6b5b6bd Using num_4x4_blocks_* instead of b_{width, height}_log2.
Change-Id: I9ea3946c17b19f511565cd771037abe7db8b3ddb
2013-11-21 15:53:06 -08:00
Joshua Litt
3aeebfb231 Merge "Removing PARAMS macro for consistency" 2013-11-21 15:06:51 -08:00
Frank Galligan
fe847e7660 Merge "Revert "Add 16 wide neon horz loopfilter."" 2013-11-21 15:06:17 -08:00
levytamar82
8def766de2 vp9_short_fdct32x32_rd vp9_short_fdct32x32 optimized for AVX2
Change-Id: I6366e84490883b72362f762369d7e5bccb64f02f
2013-11-21 14:19:49 -08:00
Frank Galligan
97d1258375 Revert "Add 16 wide neon horz loopfilter."
The change caused mismatches with some test vectors on neon.

Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/

Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
2013-11-21 14:01:33 -08:00
Yunqing Wang
4fbc1210ed Correct ssse3 8/16-pixel wide sub-pixel filter calculation
Although no mismatch was indicated for 8/16 wide sub-pixel filters
in issue 661, they had similar problems that could cause mismatch
potentially. This patch fixed calculations in HORIZx8/16
and VERTx8/16.

Change-Id: Ib85412d690bea5609a51f0e50e7c858406b8ff9e
2013-11-21 12:57:47 -08:00
Jingning Han
272c82c13a Take out assertion from inverse transforms
Separate the rounding and right shift operations of forward transform
from those of inverse transform. Take out the assertion check from
inverse transforms. If the transform coefficients were constructed to
cause intermediate steps of inverse transform overflow, the codec will
just let it overflow without breaking the decoding flow.

Change-Id: Ia7ce15dfd1a73b4abbaa78cbc74ec718523c5b1b
2013-11-21 12:57:42 -08:00
Yunqing Wang
da359253d5 Fix stack pointer in sub-pixel filters
In commit "3d50da5397d20abc932d81453b26cde758293a40", the stack
pointer was modified while aligning the stack, and it needed to
be pop out at the end.

Change-Id: I39e4adc6b8aa3379854dd264d41aa6f0f15c7953
2013-11-21 12:57:36 -08:00
Yunqing Wang
2b55cfcfbd Fix decoder mismatch with ssse3 enabled
This patch fixed issue 661: "Decoder produces mismatched outputs
with ssse3 enabled and disabled." In sub-pixel filters, a pixel
value was multiplied by a filter coefficient, and the results
were added up. The order of adding up these multiplications had to
be arranged carefully to prevent incorrect overflowing.

Change-Id: Ia78663dfe74a2d46900f1c6fb07c21fac273892f
2013-11-21 12:57:32 -08:00
Guillaume Martres
e4f3b970ad vpxenc: add --aq-mode flag to control adaptive quantization
Change-Id: I2a08c00e8576099abc84b6ef05cb3567426e29cf
2013-11-21 12:57:27 -08:00
Johann
37a83b447e Disable avx/avx2 for Visual Studio 2010
VS2010 only supports avx. There is currently no avx code
in libvpx so don't create a special case for it.

Change-Id: I39a11410367712b98bc6122c5a42fabffcdb94cf
2013-11-21 12:57:22 -08:00
Yaowu Xu
0d28133b12 Add support for VC++2013
Change-Id: I5f979135d371c3fc7b9485e29479f112baa5fa3b
2013-11-21 12:57:12 -08:00
Jim Bankoski
b38e42fe9d Don't write 0's to token_cache
This code only updates the token_cache if the result is non0.

Change-Id: Idcabd0b993a926fea9c29dbec134b9c5c4859b40
2013-11-21 12:52:15 -08:00
Dmitry Kovalev
864e7c51b6 Syncing update_coef_probs() implementation with decoder.
Using for loop based on max_tx_size instead of separate checks. Combining
build_coeff_contexts() with update_coef_probs().

Change-Id: Ie335a7db29830677fbc14478a9c190d3c1068665
2013-11-21 12:36:02 -08:00
Abo Talib Mahfoodh
ec2dbdd107 Improve vp9_fdct4x4_sse2 (x1.2)
Modifications are done to reduce the total clock cycle.
Speedup: 1.2

Tested with: park_joy_420_720p50.y4m

Change-Id: Ia36b87e62e2f80a5fadaf5628729aedc80f38f3f
2013-11-21 15:04:35 -05:00
Dmitry Kovalev
4896d5c7ef Moving {left, right}_block_mode to vp9_blockd.h.
Both functions have no relation to motion vectors, so moving them from
vp9_findnearmv.h to vp9_blockd.h.

Change-Id: I74f524267886ab0fff4a2da793a10c906ed0f43a
2013-11-21 11:43:53 -08:00
Yunqing Wang
e002bb99a8 Merge "Add filter_selectively_vert_row2 to enable parallel loopfiltering" 2013-11-21 11:25:55 -08:00
hkuang
370bf116a2 Merge "Remove unnecessary eob checking." 2013-11-21 11:24:02 -08:00
Frank Galligan
2dd77580c0 Merge "Add 16 wide neon horz loopfilter." 2013-11-21 10:29:30 -08:00
Yunqing Wang
b5e6d6cccf Add filter_selectively_vert_row2 to enable parallel loopfiltering
Added filter_selectively_vert_row2 to be ready for parallel
loopfiltering in vertical direction. This change did 2-row
filtering at a time. If 2 vertically adjacent 8x8 blocks do same
type of filtering, we can do 16-pixel filtering in parallel.

Next, we need to provide 16-pixel loopfiltering functions in c
and optimized versions for codec speedup.

Change-Id: Idf97bbdd70566e55bd30e1fd25cb8544e33291be
2013-11-21 09:53:15 -08:00
Yunqing Wang
6c4964602a Merge "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" 2013-11-21 09:40:02 -08:00
Frank Galligan
98de15137e Add 16 wide neon horz loopfilter.
Add support to do 16 pixel horizontal filtering in Neon.
Nexus devices saw about 0.5% decode speed increase.

Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d
2013-11-21 09:39:36 -08:00
Dmitry Kovalev
c90b6bb101 Removing redundant call of vp9_init_mbmode_probs().
This function is called from vp9_setup_past_independence() which is called
before the modified piece of code. Moving reset of inter_mode_probs  into
vp9_init_mbmode_probs() for consistency.

Change-Id: Ib188e8798e1fbe15407fd501406761b746fdda95
2013-11-20 21:56:38 -08:00
Tom Finegan
44dd3274da vpxenc: Warn users about incorrect quantizer settings.
Also, clean up stylistically questionable code near my changes.

Change-Id: I92c96a274cb339b7b74174a608f94ae86aba8354
2013-11-20 17:18:28 -08:00
Dmitry Kovalev
77a865d970 Merge "Removing old code." 2013-11-20 14:43:03 -08:00
Dmitry Kovalev
a218a96784 Merge "Adding MV_FP_SIZE constant." 2013-11-20 14:39:58 -08:00
Dmitry Kovalev
d54893da1d Merge "Using is_inter_block() and has_second_ref() functions." 2013-11-20 14:39:50 -08:00
Dmitry Kovalev
87ff7f2af3 Removing old code.
Change-Id: I67d1681c7b17661deb792c5e6a9e2014a73ff9b7
2013-11-20 14:05:21 -08:00
Dmitry Kovalev
a84c5f0f64 Using txfrm_block_to_raster_xy() in encoder.
Change-Id: Ibe847000467fe46bf8ce87d8f1ef8f2d5ad1eaf4
2013-11-20 13:58:21 -08:00
Yunqing Wang
256cf7ee7d Correct ssse3 8/16-pixel wide sub-pixel filter calculation
Although no mismatch was indicated for 8/16 wide sub-pixel filters
in issue 661, they had similar problems that could cause mismatch
potentially. This patch fixed calculations in HORIZx8/16
and VERTx8/16.

Change-Id: I169961c9d40a20340995b7d22aafc89ccf30bfca
2013-11-20 12:52:56 -08:00
Dmitry Kovalev
79b5a2b142 Removing plane_block_{width, height} functions.
Change-Id: I29c0dfcf41a1253d5e2a0d2ff740c0c38ebaa5a2
2013-11-20 12:39:29 -08:00
Jim Bankoski
302c33e49f Merge "Clean up removal of vp9_pareto8 table." 2013-11-20 12:30:03 -08:00