18311 Commits

Author SHA1 Message Date
JackyChen
891dbe1e52 vp9: Fix valgrind failure for short circuit on low temporal vaiance block.
Add check for actual split before using the variance of the split.

Change-Id: If0f93248be0b16d17738675d16c90516054dad2b
2016-06-02 15:56:58 -07:00
Debargha Mukherjee
17c4f1c7f5 Merge "Use standard rounding in combine_interintra." into nextgenv2 2016-06-02 19:29:16 +00:00
Debargha Mukherjee
7534a15c3a Merge "Warped motion functions added" into nextgenv2 2016-06-02 19:28:03 +00:00
Linfeng Zhang
ad0646cb84 Slow pshufb removal in 3 intra prediction functions.
Replaced vpx_d45_predictor_4x4_ssse3(), vpx_d45_predictor_8x8_ssse3()
and vpx_d207_predictor_4x4_ssse3() with
created vpx_d45_predictor_4x4_sse2(), vpx_d45_predictor_8x8_sse2()
and vpx_d207_predictor_4x4_sse2() respectively.
It's mostly neutral or slightly worse than ssse3 in good cases and
better than ssse3 in the bad cases (but still worse than using the mmx
regs).

Change-Id: Ib0237ceb71d2c57b8a93fd3170330cfed9d56bdd
2016-06-02 10:55:58 -07:00
JackyChen
a32f341539 Disable short circuit feature for low temporal variance.
The featrue fails in libvpx_unit_tests-valgrind. Will re-enable it after
fixing the issue.

Change-Id: I8ba132f04e98f4615b31fbff2097eda83c5e42bc
2016-06-02 09:45:00 -07:00
Linfeng Zhang
10969dfc6e Merge "Update filter_selectively_vert_row2()" 2016-06-02 16:22:21 +00:00
Yaowu Xu
100dfc9eab Merge "firstpass.c: fix an UBSAN/IOC error" 2016-06-02 16:20:06 +00:00
Geza Lore
888e90e823 Use standard rounding in combine_interintra.
Use the same rounding method that is used throughout the codebase,
where the halfway value is rounded up rather than down.

Change-Id: I04e92850bc69a7d7a07b06e3d2ce97f6f2ada321
2016-06-02 16:26:05 +01:00
Yaowu Xu
fd500f955f firstpass.c: fix an UBSAN/IOC error
Change-Id: I579286e6741b689ae4281a35beb7b8f95c3ffce5
2016-06-02 00:31:32 +00:00
jackychen
bacc67f4a8 vp9: Skip some modes when variance is low for big blocks, for 1 pass real-time.
Skip intra-mode and some inter-modes (newmv, nearmv, nearestmv) for
golden frame if the variance got from choose_partitioning is very low.
Only for 1 pass real-time CBR mode and bsize >= 32x32, it has ~2.5%
speed up with less than 0.1% PSNR drop for rtc test set. Don't see
visual regression.

Change-Id: I70efbc95a1007231ae36f02c5b2fbf6cd35077ad
2016-06-01 13:54:18 -07:00
Linfeng Zhang
b26232eb1b Update filter_selectively_vert_row2()
Reduce operations and jumps. perf shows CPU time reduced from 1.9% to
1.6% when decoding fdJc1_IBKJA.248.webm on Xeon E5.
Will apply the changes to vp10 after code review.

Change-Id: I9351509922855d8896ddef1ed093b3ca12619a61
2016-06-01 11:20:47 -07:00
Alex Converse
380c4ee32d Merge "segmentation: Don't use uninitialized probability data." into nextgenv2 2016-06-01 17:50:37 +00:00
Marco Paniconi
204809bfb3 Merge "vp9: Skip computation of best_sad for newmv, unless needed." 2016-06-01 17:37:29 +00:00
Yaowu Xu
6382727dc5 Fix UBSAN/IOC errors
1. test/dct16x16_test.cc
2. test/dct32x32_test.cc
3. test/fdct8x8_test.cc

BUG=webm:1225

Change-Id: I9c9315fbd65ddb3b44f688e01ba265fd22192198
2016-06-01 16:01:18 +00:00
Yaowu Xu
787b38ebb9 Fix VP8 encoder UBSAN/IOC errors
1. vp8/decoder/dboolhuff.c
2. vp8/decoder/dboolhuff.h
3. vp8/encoder/bitstream.c
4. vp8/encoder/boolhuff.h
5. vp8/encoder/rdopt.c

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1218

Change-Id: I5d315d63fd7aeaee6f3bd79178e593f3db38a6b1
2016-06-01 16:00:56 +00:00
James Zern
e5e2932cb3 ivfdec: tolerate invalid framerates
default invalid framerates to 30, quiets warnings in corrupt / fuzzed
files

Change-Id: Ib10d2b67df83cb6f9ed1cd6ef8e0e637aa7099ff
2016-05-31 17:37:59 -07:00
Alex Converse
6bae20ca43 Merge "Replace some vpxbool calls with entropy coder agnostic calls." into nextgenv2 2016-05-31 23:58:00 +00:00
Yaowu Xu
46ff1072b3 variance_avx2.c: UBSAN/IOC fix
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1222

Change-Id: Ifb3bedf9b4e1b007b21aebaa4beb9ba50424efef
2016-05-31 16:44:35 -07:00
Alex Converse
7a6cb59dbb segmentation: Don't use uninitialized probability data.
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1224

Change-Id: I17b76fcf0d8c191850350d5aa50dcc007b8b0cdc
2016-05-31 16:42:29 -07:00
Hui Su
afaefc89eb Merge "ext-intra: speed up keyframe encoding" into nextgenv2 2016-05-31 23:21:03 +00:00
Hui Su
118167a47d Merge "Add a speed feature for inter tx type search" into nextgenv2 2016-05-31 23:20:57 +00:00
Hui Su
60b52a1334 Merge "Add a speed feature for intra tx type search" into nextgenv2 2016-05-31 23:20:52 +00:00
James Zern
1d9cf262f7 Merge "vp10_inv_txfm2d_test: fix memory leak" into nextgenv2 2016-05-31 23:19:47 +00:00
Alex Converse
aee0091161 Replace some vpxbool calls with entropy coder agnostic calls.
Change-Id: Ifbcd0714fcf994c43b69255185456c7a255df66c
2016-05-31 15:42:19 -07:00
Debargha Mukherjee
faf3c2cd38 Warped motion functions added
Change-Id: I5064ef1421e17c3ecafe70e7ff1fc7db0c16cc8f
2016-05-31 14:03:23 -07:00
hui su
fa933553da ext-intra: speed up keyframe encoding
130% speed increase for keyframe encoding, with 0.4%
compression loss.

When kf-max-dist=150, 1.5% speed increase with 0.03%
compression loss.

Change-Id: I4cf7314ab95b9eb6dd17f314aca8955522c82676
2016-05-31 10:34:44 -07:00
hui su
f523d7b540 Add a speed feature for inter tx type search
Seperate prediction mode and tx type search for inter
modes. Enabled for speed >=1.

baseline:
speed increase     40%
compression drop   0.30%/0.29% on lowres/midres

ext-tx:
speed increase    160%
compression drop  1.08%/0.95% on lowres/midres

Change-Id: Ieb34b1ee80df6980d16e26a5783e08cc0deae55b
2016-05-31 10:34:35 -07:00
hui su
38e6dd71bb Add a speed feature for intra tx type search
Add a speed feature to seperate prediction mode and tx type search
for intra modes: search for best intra prediction mode with fixed
default tx type first, then choose the best tx type for the
selected mode.

Coding performance drop:
baseline
  lowres 0.10% midres 0.08% hdres 0.14%
with ext-tx
  lowres 0.14% midres 0.25% hdres 0.20%

Speed improvement is 20% for baseline and 17% for ext-tx.

It is turned on for speed >= 1.

Change-Id: Ia5e8d39e8a4e2e42c521bfde938f8b6a98ab24f9
2016-05-31 10:33:56 -07:00
Marco
bedf1c3af6 vp9: Skip computation of best_sad for newmv, unless needed.
For non-rd pickmode:
best_pred_sad, computed for NEWMV-last, is only used for
skipping golden non-zero modes. Add condition to avoid this
computation if not used (i.e, if golden nonzero modes are not used).

And remove code for computing best_pred_sad for NEWMV-golden,
since that sad is not used.

No change in behavior; small speed gain (~1%) for svc encodes.

Change-Id: Ic2cbdef6c4e9a233a57c0db0eeac8ad5fcead366
2016-05-31 10:29:00 -07:00
Zoe Liu
e89ca180c2 Make the bi-predictive frame group interval adjustable
This is for the bidir-pred experiment. Previously the length of the
bi-predictive frame group interval is fixed at 2, i.e. one
bi-predictive frame may be inserted every other frame. This patch
makes the length adjustable, i.e. any positive number may be
specified, but the use of the backward ref will be turned off if the
bi-predictive frame group interval is larger than the golden frame
group.

Further, an additional rate factor level has been added:
INTER_LOW
, which applies to LAST_BIPRED_UPDATE frames that are not used as
references.

Change-Id: I5514d34a64dd486bbb5756c2d0612946f598a789
2016-05-28 16:46:45 -07:00
Hui Su
6fd7f7dd3e Merge "ext-intra: refactor mode info. writing and reading" into nextgenv2 2016-05-28 04:34:59 +00:00
Tom Finegan
f80d8011a0 Merge "vpx_ports/mem_ops.h: cast the lhs of bitwise shifts of 24." 2016-05-27 18:52:05 +00:00
James Zern
f6ac6cf5bd Merge "acm_random,Rand9Signed: correct cast" 2016-05-27 18:32:06 +00:00
Linfeng Zhang
2ab7b9a6c9 Merge "Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10." 2016-05-27 17:51:35 +00:00
James Zern
13d48c4267 acm_random,Rand9Signed: correct cast
convert the random value to int16 before subtracting 256 from it; quiets
a ubsan (sanitize=integer) warning

BUG=webm:1225

Change-Id: Ibc2c5a21f30e112bd6c180f7d6a033327c38d0df
2016-05-27 10:33:56 -07:00
Linfeng Zhang
af7fb17c09 Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10.
Function level timing test shows about 27% time saving on
a Xeon E5-2680 v2 desktop.

Rename vp9_dct_sse2.c to vp9_dct_intrin_sse2.c for vp9 and
rename dct_sse2.c to dct_intrin_sse2.c for vp10 to avoid
duplicate basenames.

Actually vp9_fwht4x4_mmx/sse2() and vp10_fwht4x4_mmx/sse2()
are identical. TODO: They should be unified later if there is
no intention to keep a duplicate.

Change-Id: I3e537b7bbd9ba417c606cd7c68c4dbbfa583f77d
2016-05-27 09:51:16 -07:00
Tom Finegan
f1de622617 vpx_ports/mem_ops.h: cast the lhs of bitwise shifts of 24.
C does not allow for shifting into the sign bit of a signed
integer, and the two instances here become signed ints via
promotion. Explcitly cast them to unsigned MEM_VALUE_T to
avoid the problem.

BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=614648

Change-Id: I51165361a8c6cbb5c378cf7e4e0f4b80b3ad9a6e
2016-05-27 09:23:11 -07:00
Linfeng Zhang
0ba9b299e9 Merge "Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2" 2016-05-27 15:47:28 +00:00
James Zern
5d237f0986 vp10_inv_txfm2d_test: fix memory leak
input_, ref_input_ and output_ were being allocated with new[] followed
by vpx_memalign, remove the former

Change-Id: Ia16d0f9b9317042a24445095ad3c284f4e7bb481
2016-05-26 20:04:59 -07:00
Hui Su
e717ece4ab Merge "Add a quick path in build_intra_predictors" into nextgenv2 2016-05-26 22:12:53 +00:00
hui su
e5f47d4334 ext-intra: refactor mode info. writing and reading
No performance changes.

Change-Id: I001068330ea217a993aee9b79d7ffead0d23100e
2016-05-26 14:56:40 -07:00
Linfeng Zhang
4b5e462d08 Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
Followed the code style of other lpf fuctions.
These 2 functions put 2 rows of data in a single xmm register,
so they have similar but not identical filter operations,
and cannot share the same macros.

Change-Id: I3bab55a5d1a1232926ac8fd1f03251acc38302bc
2016-05-26 14:55:18 -07:00
Yaowu Xu
ff6accf936 Merge "Convert to unsigned int before left shift" 2016-05-26 21:29:50 +00:00
Hui Su
88eaf5d6ce Merge "Skip unnecessary calculations in ext-intra" into nextgenv2 2016-05-26 18:03:02 +00:00
Yaowu Xu
301e345273 Convert to unsigned int before left shift
This is to fix overflow when 128 is left shifted by 24.

Change-Id: Ibb5f6813536d985afa003a9848c0c3dd358955a7
2016-05-26 08:46:01 -07:00
Scott LaVarnway
9d24fe60f1 Merge "Code clean of sub_pixel_variance4xh -- 2" 2016-05-26 13:20:24 +00:00
hui su
bad6e169bf Add a quick path in build_intra_predictors
For the cases where no reference data is available.

Change-Id: Ibf1ac9b7073acc2c7fc44da893f3d608dc74bc1e
2016-05-25 15:21:57 -07:00
Yi Luo
469d002f4e Merge "Integrate HBD inverse HT flip types sse4.1 optimization" into nextgenv2 2016-05-25 21:35:14 +00:00
Marco
75d551783d vp9: Add datarate test for 1 pass VBR mode.
Existing tests are only for CBR mode.

Change-Id: Ie3b2cd46236457748e2650901d1a347a730f38af
2016-05-25 14:20:30 -07:00
Alex Converse
19e0b406c9 Refactor probability savings search.
- Avoid excessive copying

- Don't both searching if no update can possibly offer savings

- Simplify the interface

- Remove the confusing vp9_cost_upd256 macro

Change-Id: Id9d9676a361fd1203b27e930cd29c23b2813ce59
2016-05-25 13:00:09 -07:00