17183 Commits

Author SHA1 Message Date
Geza Lore
8d64b53dc8 Revert "Fix uninitialized blk_skip for VAR TX."
This reverts commit e7b89d88354708790211ff3949fdc705a4fa1672.
2016-04-19 15:41:56 +01:00
Geza Lore
e7b89d8835 Fix uninitialized blk_skip for VAR TX.
x->blk_skip used to be uninitialized (leftover from encoding the
previous block) if cm->tx_mode != TX_MODE_SELECT (which is used with
higher --cpu-used or --rt options). This resulted in degraded coding
performance when using cm->tx_mode != TX_MODE_SELECT.

This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.

Change-Id: If39062927446798c626fc93694b4e6a4f35fa5da
2016-04-19 14:22:48 +01:00
Jingning Han
ec2ffda599 Handle zero motion vector residual
This commit handles the zero motion vector residuals for single
and compound reference modes, respectively. It improves the coding
performance by 0.13% with no additional encoding complexity.

Change-Id: I16075a836025bd2746da2ff4698fb9261e4b08c1
2016-04-18 18:14:01 -07:00
Yi Luo
3cf1a082e0 Merge "Disable HBD 4x4 DCT_DCT HT test" into nextgenv2 2016-04-18 23:07:25 +00:00
Yue Chen
c0fd271932 Remove an unsuccessful adaption of overlap sizes in obmc experiment
We removed this adaption, which was intended to reduce the size of the
overlapped region if the neighboring block is a non-skip one. The
width/height of the overlapping region is now fixed at half of
the current block.

Performance improvement (lowres/midres): 0.111%/0.102%

Change-Id: Ife75dad9d4eb355c78a05178b50cc015c442884f
2016-04-18 15:27:59 -07:00
Yaowu Xu
ed04e82a04 Merge branch 'master' into nextgenv2
Conflicts:
	vp10/common/scan.c
	vp9/common/vp9_pred_common.c
	vp9/decoder/vp9_decoder.c

Change-Id: Id559d98ea676da15d60ed464ddb6c48d3eed1111
2016-04-18 15:15:05 -07:00
Marco
9cc1f692bd vp9: 1 pass vbr: More even spacing for gf near key frame.
More even spacing near key frame and avoid gf on scene cut
if it's close to key frame.

Small increase in metrics for ytlive set (which uses key-period=150).
(~0.2% gain)

Change only affects 1 pass vbr mode.

Change-Id: If1e5a59baf1e0befbaf998522fbc47d94ac5b5df
2016-04-18 14:40:55 -07:00
Marco
d488236ce3 vp9: Adjustment to active_best_quality for inter_frame, 1 pass vbr.
Change only affects 1 pass vbr.

Use a q value somewhat larger (~6%) than avg_frame_qindex[INTER]
as basis for active_best_quality for inter-frames.
And use the minimum of this (avg_frame_qindex) and the active_worst_quality.

This reduces some overshoot in ytlive clips.
Overall small but positive average increase in metrics (up on average ~0.2%).

Change-Id: Icdbaae7872d5675fd38a13c0ec6ce0e2e3b919ce
2016-04-18 13:01:27 -07:00
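
The rule described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the change: the function name and the 17/16 scaling (6.25%, standing in for "~6%") are assumptions.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the vp9 rate control code.
 * Scales the average inter-frame qindex up by 17/16 (an assumed
 * stand-in for "~6%") and clamps it to active_worst_quality. */
static int inter_active_best_quality(int avg_inter_qindex,
                                     int active_worst_quality) {
  assert(avg_inter_qindex >= 0 && active_worst_quality >= 0);
  const int scaled = (avg_inter_qindex * 17) >> 4;
  return scaled < active_worst_quality ? scaled : active_worst_quality;
}
```

With an average qindex of 100 the basis becomes 106, unless active_worst_quality is lower, in which case that value is used.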
Jingning Han
2aa6117bda Refactor transform selection process
This commit re-arranges the transform type and size selection
process. It removes an unnecessary rate-distortion cost computation
step. Local experiments show that this speeds up the encoding
process by 6% for both the baseline and the ext-intra experiment.

Change-Id: Iab3b86a63a1e9e55548466791ed5d29a0575c1e7
2016-04-18 19:45:56 +00:00
Jingning Han
c5449d3eb7 Merge "Refactor rd_variance_adjustment function" into nextgenv2 2016-04-18 19:45:45 +00:00
Angie Chiang
caf066f845 Merge changes I67543d36,I763f2924 into nextgenv2
* changes:
  Reduce shift in txfm8x8
  Let txfm's constant bit be the same for each stage
2016-04-18 19:40:33 +00:00
Yi Luo
dd04329367 Disable HBD 4x4 DCT_DCT HT test
- HBD HT unit tests will be modified to test against new algorithm.

Change-Id: Iba58eeb21a45612685c93c98d7c846dab25e6638
2016-04-18 12:24:31 -07:00
Paul Wilkins
2e0841931c Merge "Adjustment to prediction decay." 2016-04-18 18:47:13 +00:00
Angie Chiang
d72560e10d Merge "Fit adst/dct's stage range into 32-bit in bd12" into nextgenv2 2016-04-18 18:40:28 +00:00
Angie Chiang
cf3ef18fc4 Merge "Remove double operation from tx_size selection" into nextgenv2 2016-04-18 18:11:36 +00:00
Yi Luo
a431a93cb1 Merge "Improvement on hybrid transform 4x4 DCT_DCT SSE4.1 optimization" into nextgenv2 2016-04-18 18:04:06 +00:00
Angie Chiang
6de4a77df3 Remove double operation from tx_size selection
This CL fixes the bug
rdopt.c:1687: choose_tx_size_from_rd: Assertion
`mbmi->tx_type == DCT_DCT' failed

It is caused by
1) mms register access before double operation
2) different compiler behaviors
code:
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 0

code:
  --target=x86-linux-gcc
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 1

I removed the double operation and tested it with the EXT_TX experiment.
The psnr change is around 0.05%, which is considered noise level.

Change-Id: If8935c70c8603617fcfa8571accd30ccdda786a0
2016-04-18 11:00:13 -07:00
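
The two results quoted in the commit above are consistent with INT64_MAX (2^63 - 1) having no exact double representation: converting it to a 64-bit double rounds to 2^63, so the outcome of a < b can depend on whether the compiler compares in double or in a wider floating-point format. The sketch below shows the rounding and an integer-only comparison that avoids it. Both function names are hypothetical and are not taken from rdopt.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decide "candidate is cheaper than best" purely in
 * integer arithmetic, so no int64_t -> double conversion is involved. */
static int rd_is_better(int64_t candidate, int64_t best) {
  return candidate < best;
}

/* Returns 1 if INT64_MAX rounds up to 2^63 when stored as a double.
 * The volatile store forces the value into a real 64-bit double even on
 * targets that keep intermediates in wider registers. */
static int int64_max_rounds_up_in_double(void) {
  volatile double d = (double)INT64_MAX;
  return d == 9223372036854775808.0; /* 2^63 */
}
```

Keeping the comparison in int64_t gives the same answer on every target, which is the property the assertion in choose_tx_size_from_rd depends on.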
Jingning Han
c8312daad1 Refactor rd_variance_adjustment function
Compute the reconstruction variance in the prediction mode search.

Change-Id: Id9c7635a9c9f5383e61c0e427e95234211834301
2016-04-18 09:37:34 -07:00
Yue Chen
16a99e967c Merge "Optimization for EXT_INTER + OBMC combination" into nextgenv2 2016-04-17 18:54:33 +00:00
Yue Chen
321794c4d5 Optimization for EXT_INTER + OBMC combination
In the rd loop, check the perf of obmc, whose mv is copied from regular
inter predictor, when wedge interinter is better than regular inter
(previously it will force allow_obmc = 0). The condition of the early
termination before this step is relaxed to avoid skipping too many obmc
predictions. The rates of the overhead are properly calculated for these tools.

The logic of the bitstream syntax:
(a single ref) the interintra flag is sent first, only if it is 0, we
send the obmc flag;
(compound refs) the obmc flag is sent first, only if it is 0, we send
the wedge interinter flag

Coding gain
lowres: 0.428% (2.287%->2.715%)

Change-Id: I5f3a34640b398e313cbf84235c9fe2073eb2173f
2016-04-15 17:03:20 -07:00
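
The flag ordering described in the commit above can be sketched as below. This is only an illustration of the signalling order: FlagWriter, write_flag and write_inter_tool_flags are hypothetical names, and the real encoder writes these flags through its own entropy coder.

```c
#include <assert.h>

/* Hypothetical bit sink that records flags in the order they are written. */
typedef struct {
  int bits[4];
  int count;
} FlagWriter;

static void write_flag(FlagWriter *w, int flag) {
  assert(w->count < 4);
  w->bits[w->count++] = (flag != 0);
}

/* Single reference: interintra first, obmc only if interintra is 0.
 * Compound reference: obmc first, wedge interinter only if obmc is 0. */
static void write_inter_tool_flags(FlagWriter *w, int is_compound,
                                   int use_interintra, int use_obmc,
                                   int use_wedge_interinter) {
  if (!is_compound) {
    write_flag(w, use_interintra);
    if (!use_interintra) write_flag(w, use_obmc);
  } else {
    write_flag(w, use_obmc);
    if (!use_obmc) write_flag(w, use_wedge_interinter);
  }
}
```

In this sketch a block sends at most two flags, and the second flag is skipped whenever the first one is set.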
Yi Luo
71fa2b2218 Merge "Fix an unaligned memory allocation in HT 4x4 speed test" into nextgenv2 2016-04-15 23:56:21 +00:00
Angie Chiang
e7f64756a1 Merge "remove redundant header" into nextgenv2 2016-04-15 22:44:33 +00:00
Angie Chiang
8de8499cc9 remove redundant header
Change-Id: Ib0e880c341adebb238f43a6caeb661e2094e7a93
2016-04-15 15:34:05 -07:00
Angie Chiang
1b0092a76e relax txfm test error constraint
The error increases because we reduce the const bit
of txfm

Change-Id: I0235a3fdb7dc6a4c0cd1c8cebb369df2a5071b94
2016-04-15 15:25:53 -07:00
Yi Luo
f53ecc21b0 Fix an unaligned memory allocation in HT 4x4 speed test
- Allocate 16-byte aligned memory.
- Disable speed test in unit tests.

Change-Id: Ibef734f4b9d39ad50e9b2e8e0a5d74565d57b409
2016-04-15 14:59:31 -07:00
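
A 16-byte aligned buffer of the kind the commit above mentions can be obtained in standard C11 as sketched below. This assumes aligned_alloc is available and does not reproduce the allocator the test itself uses; alloc_coeffs_aligned16 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns room for `count` int16_t values at an
 * address that is a multiple of 16, or NULL on failure.
 * Release the buffer with free(). */
static int16_t *alloc_coeffs_aligned16(size_t count) {
  size_t bytes = count * sizeof(int16_t);
  /* aligned_alloc expects the size to be a multiple of the alignment. */
  bytes = (bytes + 15) & ~(size_t)15;
  if (bytes == 0) bytes = 16;
  return (int16_t *)aligned_alloc(16, bytes);
}
```

Aligned loads and stores in SSE code fault on misaligned addresses, which is why the buffer address matters for a SIMD speed test.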
Yi Luo
f095ea7dd6 Improvement on hybrid transform 4x4 DCT_DCT SSE4.1 optimization
- Implemented Angie's new fwd txfm algorithm.
- Improves ~100% over the last 64-bit version; 3 times faster than
  the original C code.
- Passed bit-exact unit test.

Change-Id: Ica30b9768706604a6d69fe42da778441f0f5f02e
2016-04-15 14:16:30 -07:00
Scott LaVarnway
9faa0296b8 Merge "VP9: inline vp9_get_intra_inter_context()" 2016-04-15 19:06:33 +00:00
Jingning Han
4d503d1043 Remove duplicated TxfmFunc declarations
Change-Id: If3876610a1fbce0988cc21ea917596bbb467df93
2016-04-15 12:03:21 -07:00
Zoe Liu
9638ee1f4e Merge "Fix segfault with --cpu-used >= 3 and ext-refs." into nextgenv2 2016-04-15 16:41:15 +00:00
Johann Koenig
c59c5cbeff Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm" 2016-04-15 16:00:51 +00:00
Scott LaVarnway
ef98a8f61f VP9: inline vp9_get_intra_inter_context()
Change-Id: I71366140799b9b39474b9b459082cdb250bd1905
2016-04-15 04:58:37 -07:00
Geza Lore
77d197e635 Fix segfault with --cpu-used >= 3 and ext-refs.
With ext-ref enabled, it is possible that when trying to encode the
first true ALTREF frame after a keyframe, the previous ALTREF frame
(alias for the keyframe) is the same as one of the new LAST{2,3,4}
reference frames, and hence cpi->ref_frame_flags will have the ALTREF
bit clear, as computed by get_ref_frame_flags in encoder.c.

sf->alt_ref_search_fp forces the previous ALTREF frame to
be used as the only possible reference when encoding a new ALTREF
frame, but due to cpi->ref_frame_flags, some buffers will not be
initialized (see rdopt.c:7689 yv12_mb), leading to a segfault.

get_ref_frame_flags in encoder.c has been changed to prefer to keep
the LAST frame, then the ALTREF frame, then any of the LAST{2,3,4}
frames and then the GOLDEN frame in that order of preference in case
any of them are the same. This avoids the segfault and behaves the
same for the baseline.

Change-Id: I4da1991667614009da5d3061a6316c0d5dbc6c0c
2016-04-15 11:17:22 +01:00
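
The preference order described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the get_ref_frame_flags implementation: the enum values, the bitmask layout and the function name are all hypothetical.

```c
#include <assert.h>

/* Hypothetical reference indices; the encoder's real definitions may differ. */
enum { REF_LAST, REF_LAST2, REF_LAST3, REF_LAST4,
       REF_GOLDEN, REF_ALTREF, REF_COUNT };

/* buf_idx[r] is the frame buffer that reference r points to.
 * Returns a bitmask (1 << r) of references to keep. When several
 * references alias the same buffer, only the one that comes first in the
 * order LAST, ALTREF, LAST2, LAST3, LAST4, GOLDEN keeps its bit. */
static int pick_ref_frame_flags(const int buf_idx[REF_COUNT]) {
  static const int preference[REF_COUNT] = {
    REF_LAST, REF_ALTREF, REF_LAST2, REF_LAST3, REF_LAST4, REF_GOLDEN
  };
  int flags = 0;
  for (int i = 0; i < REF_COUNT; ++i) {
    const int ref = preference[i];
    int duplicate = 0;
    for (int j = 0; j < i; ++j) {
      if (buf_idx[preference[j]] == buf_idx[ref]) duplicate = 1;
    }
    if (!duplicate) flags |= 1 << ref;
  }
  return flags;
}
```

When ALTREF and LAST2 share a buffer, this sketch keeps the ALTREF bit and clears LAST2, so a search restricted to ALTREF still has a reference to use.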
Martin Storsjo
d8b3e29ee7 Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm
This was never hooked up for the 32x32_34 case as the neon_asm version
in 3f7c12da, when the intrinsics version was added.

Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320
2016-04-15 10:25:47 +03:00
Angie Chiang
0a715add2e Reduce shift in txfm8x8
Change-Id: I67543d365cbef3c3e113f01660ae8cb744cc556d
2016-04-14 19:12:22 -07:00
Angie Chiang
dfa532cc2a Let txfm's constant bit be the same for each stage
Change-Id: I763f2924afca526db371231bca18b38879bdf793
2016-04-14 15:46:54 -07:00
Angie Chiang
02d23fbbf4 Fit adst/dct's stage range into 32-bit in bd12
Change-Id: Ie428c6f0655873de3e77e844a2f2e4203cf47dff
2016-04-14 15:44:05 -07:00
Johann
26faa3ec7a Apply 'const' to data not pointer
Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298
2016-04-14 14:47:16 -07:00
Jingning Han
019683e963 Merge "Clean up motion vector precision check in the encoding process" into nextgenv2 2016-04-14 20:55:51 +00:00
Jingning Han
79bef030f2 Merge "Apply motion vector precision check to candidate mv" into nextgenv2 2016-04-14 20:55:45 +00:00
Jingning Han
03a468f9ac Merge "Enable mode conversion in sub8x8 block" into nextgenv2 2016-04-14 19:01:15 +00:00
Alex Converse
031fd260f1 Merge "Disable the TestSuperframeIndexIsOptional test with ANS." into nextgenv2 2016-04-14 18:57:28 +00:00
Jingning Han
6af8f63d96 Clean up motion vector precision check in the encoding process
Remove unnecessary motion vector precision check in the encoding
process.

Change-Id: Ica32933c7d138f499f36b1dedec14c894b27d85a
2016-04-14 11:37:19 -07:00
Jingning Han
525995a3d9 Apply motion vector precision check to candidate mv
This avoids repeatedly checking the candidate motion vector
precision level at the decoder end. The compression performance
varies at 0.01% level.

Change-Id: I4a88e95decd900d0cac9a0c2e70ba43ef7ecac38
2016-04-14 09:44:41 -07:00
Jingning Han
cd39224cff Merge "Speed up dynamic motion vector referencing system" into nextgenv2 2016-04-14 16:16:43 +00:00
James Zern
5fb49e456a Merge "dct32x32_test: s/HAVE_NEON_ASM/HAVE_NEON/" 2016-04-14 02:35:46 +00:00
Hui Su
436a6cc4e7 Merge "ext-tx: use raster scan order for identity transform" into nextgenv2 2016-04-13 23:52:35 +00:00
Jingning Han
885a81f468 Merge "Fix a few mis-use cases of MAX_MV_REF_CANDIDATES" into nextgenv2 2016-04-13 23:44:25 +00:00
Angie Chiang
716f0ea3cf Merge changes I92819356,I50b5a313,I807e60c6,I8a8df9fd into nextgenv2
* changes:
  Branch dct to new implementation for bd12
  Change dct32x32's range
  Fit dct's stage range into 32-bit when bitdepth is 12
  Pass tx_type into get_tx_scale
2016-04-13 23:24:41 +00:00
Alex Converse
91c985fc28 Merge "Convert some vpx boolcoder calls back to vp10 generic calls." into nextgenv2 2016-04-13 23:04:17 +00:00
Hui Su
85a3f5b740 Merge "Speed-up in tx_size search" into nextgenv2 2016-04-13 23:02:21 +00:00