Commit Graph

17028 Commits

Author SHA1 Message Date
Marco
06c8713e89 vp9: Use sb content measure to bias against golden.
For each superblock, keep track of how far back from the current frame
the last significant content change was, and use that (along
with GF distance) to turn off GF search in non-rd pickmode.

Only enabled for speed >= 8.

avgPSNR on RTC/RTC_derf down by ~0.9/1.2.
Speedup on mac: ~3-5%.
Speedup on arm: 3.6% for VGA and 4.4% for HD.

Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
2017-03-20 12:42:26 -07:00
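
The per-superblock bookkeeping described in the commit above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the names SbState, update_sb_state and skip_golden_search are hypothetical, and the comparison rule (skip golden when the content changed more recently than the golden frame was refreshed) is one plausible reading of the message, not the encoder's actual condition.

```c
#include <limits.h>

/* Hypothetical per-superblock state: frames elapsed since the last
   significant content change in this superblock. */
typedef struct {
  int frames_since_change;
} SbState;

/* Call once per frame for each superblock. */
static void update_sb_state(SbState *sb, int significant_change) {
  if (significant_change) {
    sb->frames_since_change = 0;
  } else if (sb->frames_since_change < INT_MAX) {
    sb->frames_since_change++;
  }
}

/* Assumed rule: at speed >= 8, skip the golden-frame search when the
   superblock content changed after the golden frame was last refreshed,
   since golden then predates the current content. */
static int skip_golden_search(const SbState *sb, int frames_since_golden,
                              int speed) {
  if (speed < 8) return 0;
  return sb->frames_since_change < frames_since_golden;
}
```

With this rule, a superblock whose content changed 2 frames ago skips the golden search when the golden frame is 5 frames old, and keeps it when the golden frame is only 1 frame old.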
Johann Koenig
d642dd4311 Merge "temporal filter test: update types" 2017-03-20 19:05:55 +00:00
Yunqing Wang
9c2552a1c1 Record the sum of tx block eobs in the partition block
The sum of tx block eobs is needed in the machine learning based partition
early termination. The eobs are first accumulated during tx search, and
then the value associated with the best tx_size is copied to ctx for later
use.

With the sum of eobs now calculated correctly, re-enabled the
ml_partition_search_early_termination speed feature.

Re-did the quality/speed test to check the impact of the fix.

1. Borg test BDRATE result:
4k set:     PSNR: +0.183%; SSIM: +0.100%;
hdres set:  PSNR: +0.168%; SSIM: +0.256%;
midres set: PSNR: +0.186%; SSIM: +0.326%;

2. Average speed gain result:
4k clips: 21%;
hd clips: 26%;
midres clips: 15%.

The result is in line with the original result.

Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
2017-03-20 17:12:15 +00:00
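
The accumulate-then-copy flow described in the commit above can be sketched as follows. This is a simplified illustration under stated assumptions: TxCandidate, PartitionCtx, sum_eobs and pick_best_tx are hypothetical names, and the rd_cost field stands in for whatever criterion the real tx search uses to pick the best tx_size.

```c
#include <limits.h>

/* Hypothetical candidate produced by the tx search for one tx_size. */
typedef struct {
  long rd_cost;     /* assumed selection criterion (lower is better) */
  const int *eobs;  /* eob of each tx block at this tx_size */
  int num_blocks;   /* number of tx blocks at this tx_size */
} TxCandidate;

/* Hypothetical partition context that keeps the value for later use. */
typedef struct {
  int sum_eobs;
} PartitionCtx;

/* Accumulate the eobs of all tx blocks of one candidate. */
static int sum_eobs(const TxCandidate *c) {
  int sum = 0;
  for (int i = 0; i < c->num_blocks; ++i) sum += c->eobs[i];
  return sum;
}

/* Pick the lowest-cost candidate and copy its eob sum into ctx.
   Returns the index of the chosen candidate, or -1 if n == 0. */
static int pick_best_tx(const TxCandidate *cands, int n, PartitionCtx *ctx) {
  int best = -1;
  long best_cost = LONG_MAX;
  for (int i = 0; i < n; ++i) {
    const int sum = sum_eobs(&cands[i]);
    if (cands[i].rd_cost < best_cost) {
      best_cost = cands[i].rd_cost;
      best = i;
      ctx->sum_eobs = sum; /* keep only the sum tied to the best tx_size */
    }
  }
  return best;
}
```

The point of the sketch is that the stored sum always belongs to the winning tx_size, rather than to whichever size happened to be searched last.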
Jingning Han
ca9bedd538 Backport "Optimize the use case of token_cost table" to VP9
cherry picked from nextgenv2 90ea281f29

Change-Id: Ie989e60c6479ac3251cadaac9c7e795ccba52f4e
2017-03-17 16:54:22 -07:00
Alex Converse
ab71181545 Drop vp9_get_token_extracost
vp9_get_token_cost does the same thing with one fewer lookup.

Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
2017-03-17 16:53:09 -07:00
James Zern
36533e8c5a Merge "inv_txfm_sse2: clear conversion warning in hbd build" 2017-03-17 21:48:20 +00:00
Johann
775569473d temporal filter test: update types
Use 'int' for w/h since it is that way everywhere else.

Pass Buffer pointers

Change-Id: I9eef6890af657baba171c6bcfcc85fc976173399
2017-03-17 13:22:28 -07:00
Johann Koenig
9675affae0 Merge "test: add vp9_temporal_filter_apply test" 2017-03-17 18:18:06 +00:00
Alex Converse
0842daa24e Merge "vp9_optimize_b: Combine extrabits cost with token lookup" 2017-03-17 16:18:21 +00:00
James Zern
5da2e500d7 inv_txfm_sse2: clear conversion warning in hbd build
tran_high -> tran_low in return from dct_const_round_shift()

Change-Id: I2fe06c4b604823b1d1fe40a487017c3c2819a440
2017-03-17 01:16:38 -07:00
Linfeng Zhang
27530d484e Add vpx_highbd_idct32x32_1024_add_neon()
BUG=webm:1301

Change-Id: Ib90af0c1712e56b301d0e981dbe9a641e15e36ca
2017-03-17 00:27:46 -07:00
Linfeng Zhang
50b13f75b8 Add vpx_highbd_idct32x32_34_add_neon()
BUG=webm:1301

Change-Id: I74dd16c6c64e7bb71aa991cedccddf0663ef5e06
2017-03-17 00:27:46 -07:00
James Zern
2882778310 Merge "Add vpx_highbd_idct32x32_135_add_neon()" 2017-03-17 07:26:52 +00:00
Linfeng Zhang
65e9fb65e8 Add vpx_highbd_idct32x32_135_add_neon()
BUG=webm:1301

Change-Id: I58c2d65d385080711c3666d6d8f9d241dac7b21a
2017-03-16 22:37:55 -07:00
James Zern
68efc64b72 Merge "Clean vpx_idct32x32_1024_add_neon()" 2017-03-17 05:24:58 +00:00
Marco
02975a604c vp9: Fix speed 8 condition for enabling copy_partition.
Change-Id: I2c090e6ba853a30fef1957b620853315f9471753
2017-03-16 17:08:37 -07:00
Alex Converse
3a6ec9ea72 vp9_optimize_b: Combine extrabits cost with token lookup
About 0.6% fewer cycles spent in vp9_optimize_b.

Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
2017-03-16 17:03:22 -07:00
Gabriel Marin
976ddb61d3 Add a vector form of routine vp9_model_rd_from_var_lapndz
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb
to model the rate and distortion for MAX_MB_PLANE Laplacian sources in
parallel. The caller ensures that all sources have non-zero variance.

Measured an 18% to 25% reduction in retired instructions, and 17% to 24%
reduction in instruction execution cost with different compilers for the
Laplacian modeling.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
2017-03-16 22:19:44 +00:00
Marco Paniconi
83ba1880bf Merge "vp9: Fixes in non-rd pickmode for denoising with SVC." 2017-03-16 21:53:38 +00:00
Johann Koenig
eeeb71ed97 Merge "Remove ppc-linux-gcc target" 2017-03-16 21:53:17 +00:00
Johann Koenig
cd3d7cf4ac Merge "Add Hadamard for Power8" 2017-03-16 21:52:15 +00:00
Marco
bc7d4935bb vp9: Fixes in non-rd pickmode for denoising with SVC.
Don't denoise spatial layer frames whose base layer is a key frame.

Disallow golden reference for SVC with denoising on frames
that will be denoised (highest layer), as this removes a bad artifact.
Will re-enable when issue is resolved.

Change-Id: I87a6597812330500966458172acfce54af65f70f
2017-03-16 12:59:41 -07:00
Marco
ba8bfaafa7 vpx_codec.h: include vpx/*.h -> ./*.h
This matches the other includes and also fixes a compile issue in
chromium.

Change-Id: I45e00a1454f7ed948aa3b96b04cc5946b1d02985
2017-03-16 16:55:56 +00:00
Jerome Jiang
bf40776aa4 Merge "Refactor: Change cpi->resize_state to enum values." 2017-03-16 16:43:42 +00:00
Marco Paniconi
ec73bf53a5 Merge "vp8: Fix compiler warning in vp8 pickinter.c" 2017-03-16 05:13:38 +00:00
Rafael de Lucena Valle
405b94c661 Add Hadamard for Power8
Change-Id: I3b4b043c1402b4100653ace4869847e030861b18
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
2017-03-15 23:46:18 -03:00
Marco Paniconi
cd47c1942e Merge "vp9: Fix some issues with denoiser and SVC." 2017-03-16 02:42:55 +00:00
Marco
a340c64a79 vp9: Fix some issues with denoiser and SVC.
Fix the update of the denoiser buffer when the base
spatial layer is a key frame. And allow for better/lower
QP on high spatial layers when their base layer is key frame.

Change-Id: I96b2426f1eaa43b8b8d4c31a68b0c6d68c3024a2
2017-03-15 17:19:17 -07:00
Jerome Jiang
b5f7f7737a Refactor: Change cpi->resize_state to enum values.
Change-Id: Iab1409b0fc1175bc5a14afc4749a08c536c98c41
2017-03-15 17:16:17 -07:00
Marco
2c8430e223 vp9: Turn off ml_partition_search_early_termination.
Fails on nightly ubsan, valgrind tests.
Enabled in commit 6701014.

Change-Id: Ied3f5cb38e39cba54ac134f4514107cdfdfce159
2017-03-15 15:00:38 -07:00
Marco
deea4ede59 vp8: Fix compiler warning in vp8 pickinter.c
Change-Id: I0e5714538fe53d885a2201d808846901ae8fc288
2017-03-15 11:50:14 -07:00
Linfeng Zhang
e54231d613 Clean vpx_idct32x32_1024_add_neon()
Change-Id: I05921e16d6a3e4e7e5b00a90624735050a186636
2017-03-15 11:24:31 -07:00
Yi Luo
8440cc4817 Merge "Improve idct32x32_1024_add SSSE3 intrinsics performance" 2017-03-15 02:32:52 +00:00
Linfeng Zhang
d9a9a4ffea Merge "Fix overflow issue in 32x32 idct NEON intrinsics" 2017-03-15 00:38:17 +00:00
Jerome Jiang
27d5a57072 Merge "vp9: Using source sad for speedup for dynamic resizing." 2017-03-15 00:03:52 +00:00
Linfeng Zhang
c756eb01c8 Fix overflow issue in 32x32 idct NEON intrinsics
Similar issue to Change bc1c18e.

The PartialIDctTest.ResultsMatch test on vpx_idct32x32_135_add_neon()
in high bit-depth mode exposes a 16-bit overflow in the final stage of
pass 2 when the test count is raised from 1,000 to 1,000,000.

Change to use saturating add/sub for vpx_idct32x32_34_add_neon(),
vpx_idct32x32_135_add_neon() and vpx_idct32x32_1024_add_neon() in high
bit-depth mode.

Change-Id: Iaec0e9aeab41a3fdb4e170d7e9b3ad1fda922f6f
2017-03-14 16:59:14 -07:00
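
The difference between a wrapping and a saturating 16-bit add, which is the idea behind the fix above, can be shown with a scalar sketch. The function names are hypothetical. NEON offers saturating intrinsics such as vqaddq_s16() and vqsubq_s16(); that the commit uses exactly these is an assumption.

```c
#include <stdint.h>

/* Wrapping add: the result is truncated to 16 bits, so an out-of-range
   sum wraps around. The final conversion to int16_t is
   implementation-defined in C, but wraps on common targets. */
static int16_t wrap_add_s16(int16_t a, int16_t b) {
  return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Saturating add: compute in 32 bits, then clamp to the int16_t range. */
static int16_t sat_add_s16(int16_t a, int16_t b) {
  const int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}

/* Saturating subtract, same approach. */
static int16_t sat_sub_s16(int16_t a, int16_t b) {
  const int32_t diff = (int32_t)a - (int32_t)b;
  if (diff > INT16_MAX) return INT16_MAX;
  if (diff < INT16_MIN) return INT16_MIN;
  return (int16_t)diff;
}
```

A sum that overflows, such as 32767 + 1, clamps to 32767 with the saturating version instead of flipping sign, which keeps the error bounded when an intermediate value briefly exceeds the 16-bit range.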
Jerome Jiang
2fa7092808 Merge "vp9: Enable row multithreading for SVC in real-time mode." 2017-03-14 23:29:46 +00:00
Jerome Jiang
02463273c9 vp9: Using source sad for speedup for dynamic resizing.
Only for speed >= 7.

Change-Id: I3ac85fbb4023cf7e6f8333806b345b0174382a09
2017-03-14 15:47:19 -07:00
Yi Luo
fedcf83f33 Improve idct32x32_1024_add SSSE3 intrinsics performance
- Function level speed improves ~12%.

Change-Id: I9b7dbddabf08c7d0f6b25264e6074d5ccbe39290
2017-03-14 14:04:08 -07:00
James Zern
1b91f41935 Merge "vp9/encoder: fix segfault on win32 using vs < 2015" 2017-03-14 19:21:42 +00:00
Yunqing Wang
c3e290963d Merge "Apply machine learning-based early termination in VP9 partition search" 2017-03-14 18:07:05 +00:00
Marco Paniconi
78a6946904 Merge "vp9: Speed >= 8: Enable simple_block_yrd speed feature." 2017-03-14 17:50:17 +00:00
Marco
c0c789ab50 vp9: Adjust copy partition threshold, for speed 8.
Reduce it from 5 to 4; small/no change in metrics or speed.
Small reduction in dragging artifact near moving head.

Change-Id: Ic3bc5ca67c70bf0c89fc2ed14454840a28ae5b6a
2017-03-14 09:18:53 -07:00
Marco
c216c8d6f2 vp9: Speed >= 8: Enable simple_block_yrd speed feature.
Enable speed feature for resolutions > VGA.
avgPSNR on RTC down by ~1.7%.
Speedup on ARM: ~5%.

Change-Id: I7a3fe5f7425aa8df3f4a2eced1afa355bc0d4c95
2017-03-14 09:10:28 -07:00
Johann
a14a987c82 test: add vp9_temporal_filter_apply test
Add an independent implementation of the filter.

BUG=webm:1379

Change-Id: I309c459b493c3011273b78b127a786bb23c59f9c
2017-03-13 15:26:26 -07:00
Marco Paniconi
507204316a Merge "vp9: Fix to source_sad feature for SVC." 2017-03-13 19:18:31 +00:00
Linfeng Zhang
b0bfcc368c Merge "Add vpx_highbd_idct32x32_135_add_c()" 2017-03-13 18:49:01 +00:00
Marco
f0a22b23fe vp9: Fix to source_sad feature for SVC.
Allow speed feature sf->use_source_sad to be used
on highest spatial layer for SVC.

Change-Id: I260eb0478902764f49f83e43b17024fe86ff3b22
2017-03-13 11:00:40 -07:00
Yunqing Wang
670101439f Apply machine learning-based early termination in VP9 partition search
This patch was based on Yang Xian's intern project code. Further modifications
were made:
1. Moved machine-learning related parameters into the context structure.
2. Corrected the calculation of sum_eobs.
3. Removed unused parameters and calculations.
4. Made it work with multiple tiles.
5. Added a speed feature for the machine-learning based partition search
early termination.
6. Re-organized the code.

The patch was rebased to the top-of-tree.

Borg test BDRATE result:
4k set:     PSNR: +0.144%; SSIM: +0.043%;
hdres set:  PSNR: +0.149%; SSIM: +0.269%;
midres set: PSNR: +0.127%; SSIM: +0.257%;

Average speed gain result:
4k clips: 22%;
hd clips: 23%;
midres clips: 15%.

Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
2017-03-13 09:54:18 -07:00
Marco Paniconi
b39f7c3364 Merge "vp9: Fix condition for intra search in non-rd pickmode." 2017-03-13 06:11:13 +00:00