Commit Graph

97 Commits

Author SHA1 Message Date
Geza Lore
552d5cd715 Extend superblock size fo 128x128 pixels.
If --enable-ext-partition is used at build time, the superblock size
(sometimes also referred to as coding unit (CU) size) is extended to
128x128 pixels.

Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a
2016-03-30 18:23:06 +01:00
Yaowu Xu
c810740c36 Merge branch 'masterbase' into nextgenv2
Conflicts:
	vp9/encoder/vp9_encoder.c
	vpx_dsp/x86/convolve.h

Change-Id: I60c3532936bedd796a75dfe78245a95ec21e2e55
2016-03-28 17:44:28 -07:00
Geza Lore
490ba1ad25 Port large scale tile coding features from nextgen.
If configured with --enable-ext-tile, the codec uses an alternative
tile coding syntax in the bitstream. Changes include::
 - The maximum number of tile rows and columns is extended to 1024
   each.
 - The minimum tile width/height is 64 pixels (1 superblock).
 - A tile copy mode is added where a tile directly reuse the coded
   data of a previous tile
 - The meaning of the tile-columns and tile-rows codec parameters are
   overloaded to mean tile-width and tile-height in units of 64
   pixels.
 - All tiles should now be independent, including rows within the
   same columns, so large scale parallel, or independent decoding is
   possible.
 - vpxdec also gained the options to decode only a particular tile,
   tile row, or tile column.

Changes without --enable-ext-tile:
 - All tiles should now be independent, including rows within the
   same columns, so large scale parallel, or independent decoding is
   possible.
 - vpxenc default tile configuration changed to use 1 tile column.

Change-Id: I0cd08ad550967ac18622dae5e98ad23d581cb33e
2016-03-24 09:26:05 +00:00
hui su
83b47af18d Add "entropy" experiment
This patch added two features to improve entropy coding efficiency
for coefficient tokens.

1. Choose 1 of 4 default probability tables based on q-index for
key-frames.
It is ported from nextgen branch:
https://chromium-review.googlesource.com/#/c/280586/

2. Do backward update after each superblock (64X64) row using
subframe token counts.

Coding gain: 0.1% on lowres; 0.42% on midres; 0.36% on hdres.
Much larger gain for key-frames: 2.6%, 2.3%, 1.7%.

Design doc: go/huisu-entropy

Change-Id: Ia3b6a615636be09247d70e4c520405637561532b
2016-03-16 11:55:50 -07:00
Yunqing Wang
e6e2d886d3 Add high-precision sub-pixel search as a speed feature
Using the up-sampled reference frames in sub-pixel motion search is
enabled as a speed feature for good-quality mode speed 0 and speed 1.

Change-Id: Ieb454bf8c646ddb99e87bd64c8e74dbd78d84a50
2016-03-11 16:32:11 -08:00
Debargha Mukherjee
f34deab243 Adds compound wedge prediction modes
Incorporates wedge compound prediction modes.

Change-Id: Ie73b54b629105b9dcc5f3763be87f35b09ad2ec7
2016-03-10 07:19:54 -08:00
Alex Converse
6bbbe31656 ANS: Switch from PDFs to CDFs.
Make the RANS implementation operate on cumulative distribution
functions rather than individual probability distribution functions.
CDFs have shown themselves more flexible to work with.

Reduces decoding memory usage from scaling O(num_distributions *
symbol_resolution) to O(num_distributions).

No bitstream change. This is an purely implementation change.

Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
2016-03-03 09:32:54 +00:00
Yunqing Wang
342a368fd4 Do sub-pixel motion search in up-sampled reference frames
Up-sampled the reference frames to 8 times in each dimension using
the 8-tap interpolation filter. In sub-pixel motion search, use the
up-sampled reference frames to find the best matching blocks. This
largely improved the motion search precision, and thus, improved
the compression quality. There was no change in decoder side.

Borg test and speed test results:
1. On derflr set,
Overall PSNR gain: 1.306%, and SSIM gain: 1.512%.
Average speed loss on derf set was 6.0%.
2. On stdhd set,
Overall PSNR gain: 0.754%, and SSIM gain: 0.814%.
On hevchd set,
Overall PSNR gain: 0.465%, and SSIM gain: 0.527%.
Speed loss on HD clips was 3.5%.

Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f
2016-02-29 12:14:47 -08:00
Debargha Mukherjee
bab2912b5e Some refactoring and cleanups of interp filter
Includes various cosmetic changes and refactoring including
naming the sharp filters differently (since they are no longer
8-tap).

Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a
2016-02-26 15:42:49 -08:00
Yaowu Xu
a570cefcf8 Merge "Extend vpxssim to handle more HBD combinations" into nextgenv2 2016-02-26 15:57:40 +00:00
James Zern
ac4c37c684 vp9/10: fix forced keyframes w/alt-refs enabled
in 1-pass encodes. issues with 2-pass as well as other forced flags
persist.

Change-Id: Ic7ceb906fccea6456d5df96483c10cacd46e01c7
2016-02-24 15:56:37 -08:00
Yaowu Xu
aa6c754635 Merge remote-tracking branch 'webm/master' into nextgenv2 2016-02-24 10:53:17 -08:00
Yaowu Xu
272dbaa13f Merge "Cleanup psnr.h" into nextgenv2 2016-02-23 17:13:34 +00:00
Yaowu Xu
ec6b8d8b76 Merge "Add shift stage in FASTSSIM computation" into nextgenv2 2016-02-23 00:43:18 +00:00
Yaowu Xu
eeaf8e6b6c Extend vpxssim to handle more HBD combinations
Change-Id: I38426d946b74c9090a265d34b89e2db6693927c2
2016-02-22 16:09:08 -08:00
Yaowu Xu
38cfc45e07 Cleanup psnr.h
Change-Id: Id026e72ee655ee5bd645a89e378da0d462be367d
2016-02-22 15:37:40 -08:00
Yaowu Xu
d1c5cd4a30 Add shift stage in FASTSSIM computation
This commits adds a shift stage for FASTSSIM computaton when source
bit depth is different from working bit depth, to make sure metric
results are calculated in bit_depth consistent with source.

Change-Id: I997799634076ef7b00fd051710544681ed536185
2016-02-22 14:58:10 -08:00
Yaowu Xu
af3a8381ef Merge "Move psnrhvs function declaration to psnr.h" into nextgenv2 2016-02-22 18:46:39 +00:00
Jingning Han
404c512786 Merge "Unify motion vector cost system" into nextgenv2 2016-02-22 17:38:00 +00:00
Jingning Han
a10814e11e Merge "Account context based prob model for motion vector cost estimate" into nextgenv2 2016-02-22 17:37:42 +00:00
Yaowu Xu
6e695da2d9 Move psnrhvs function declaration to psnr.h
From "ssim.h"

Change-Id: Ie53378794149ef8a844b4eb47ad4f08579de4b60
2016-02-22 08:38:49 -08:00
Jingning Han
fec5988657 Unify motion vector cost system
This commit unifies the motion vector cost buffers for full pixel
and sub-pixel motion search. The new motion vector coding system
provides 0.5% coding gains for 720p and above sequences and 0.2%
for lower resolution sets.

Change-Id: I927ec81eadc39d11a3c12b375221a1ddd2e8bf24
2016-02-21 22:21:28 -08:00
Jingning Han
03c01bc3c0 Account context based prob model for motion vector cost estimate
This commit accounts for the context based probability model for
motion vector cost estimate in rate-distortion optimization.

Change-Id: Ia068a9395dcb4ecc348f128b17b8d24734660b83
2016-02-19 16:32:51 -08:00
Yaowu Xu
7823fbb45c Merge "Move PSNR related functions into vpx_dsp/psnr.c" into nextgenv2 2016-02-18 01:00:54 +00:00
James Zern
7fe96753d7 vp10/encoder: add missing alloc checks
Change-Id: I5f81250d054bfd1cc69308a491b8fd21b77e4ee1
2016-02-17 14:36:06 -08:00
Yaowu Xu
7538501ad1 Move PSNR related functions into vpx_dsp/psnr.c
This makes all metric computation to locate at some place, also gets
rid of duplicate code between vp9 and vp10.

Change-Id: I24a2707d183a2419cd18a8343010adae185ffcd4
2016-02-17 13:05:34 -08:00
Yaowu Xu
6ed7f7a516 Merge branch 'master' into nextgenv2 2016-02-17 07:23:58 -08:00
James Zern
fdc977afc6 vp10,encoder: relocate setjmp
move to encoder_encode() as vp10_get_compressed_data() allocates data and
would require some modification to make its error return meaningful.

Change-Id: Ia5267c35d16ccd42b6da6d2136402b13e28f9159
2016-02-16 19:33:16 -08:00
Debargha Mukherjee
8b0a5b8718 Adding loop wiener restoration
Adds a wiener filter based restoration scheme in loop which can
be optionally selected instead of the bilateral filter.

The LMMSE filter generated per frame is a separable symmetric 7
tap filter. Three parameters for each of horizontal and vertical
filters are transmitted in the bitstream. The fourth parameter
is obtained assuming the sum is normalized to 1.
Also integerizes the bilateral filters, along with other
refactoring necessary in order to support the new switchable
restoration type framework.

derflr: -0.75% BDRATE

[A lot of videos still prefer bilateral, however since many frames
now use the simpler separable filter, the decoding speed is
much better].

Further experiments to follow, related to replacing the bilateral.

Change-Id: I6b1879983d50aab7ec5647340b6aef6b22299636
2016-02-12 09:56:24 -08:00
Yaowu Xu
1a69cb286f Refactor internal stats code
Also removed the use of postprocessing in computing internal stats.

Change-Id: Ib8fdbdfe7b7ca05cd1a034a373aa7762fa44323c
2016-02-12 07:31:29 -08:00
James Zern
8628898acf vp10_receive_raw_frame: add missing setjmp
allocations done within this function are protected with
vpx_internal_error; adding the setjmp fixes a crash in
vp10_lookahead_push() under low memory conditions.

Change-Id: I5515017cd71b218840c506791b3a517da7ffc93e
2016-02-11 19:21:28 -08:00
Yaowu Xu
bb8ca08816 Enable computing PSNRHVS for hbd build
This commit adds computation of PSNRHVS for highbitdepth build, it
also adds tests to make sure the calculation of psnrhvs metric for
10 and 12 bit correct.

Change-Id: Iac8a8073d2b3e3ba5d368829d770793212fa63b6
2016-02-11 13:17:59 -08:00
Yaowu Xu
c0874f2441 Enable computing of FastSSIM for HBD build
This commit adds the computation of fastSSIM for highbitdepth build,
it also modifies the hbdmetric test to be more generic and applicable
for fastSSIM.

The 255 used for calculating ssim constants c1 and c2 is not exactly
scaled by 4x and 16x to 1023 and 4095, therefore requries the metric
test to have a thresold more tolerant than 0, currently at 0.03dB.

Change-Id: I631829da7773de400e77fc36004156e5e126c7e0
2016-02-10 17:11:58 -08:00
Yaowu Xu
bb5f9e431f Fix a bug in HBD buffer size computation
The value of use_highbitdepth flag is used for compute the size for
high bit depth buffer allocation, which should take value 0 or 1
depending on if the buffer is used for high bit depth or not.
Previously, the values is set to 8 or 0, this commit fixes the issue
and properly set the value for this flag to 1 or 0.

This cuts the size of highbitdepth buffer memory allocation to 2/9 of
the size prior to the fix.

Change-Id: I401518b5a6147e5d8a973e54f7ca6bc1892065e0
2016-02-08 18:52:08 -08:00
Yaowu Xu
090eaadf20 Change to use local variables consistently
This commit does not change the computation, nor results.

Change-Id: I1a7bb47050220d970f075458b507c5e55d93b22e
2016-02-08 11:38:04 -08:00
Yaowu Xu
204e77e059 Remove a flavor of SSIM that is never really used.
Change-Id: I61ea7f63acbcfeecd3f7dba5a5a38b980efc802b
2016-02-08 11:22:08 -08:00
Debargha Mukherjee
f0a4485e54 Refactor to separate restoration from loop filter
Change-Id: Iab517862d957f3aa2a664e9349d57bbf424febb3
2016-01-29 15:39:23 -08:00
Debargha Mukherjee
3eb10fcf21 Cosmetic changes to loop restoration
Also adds a normalized filtering function to be used later.

Change-Id: I30e2140e664db635602f26a73b81ce8e008dff5e
2016-01-27 17:33:36 -08:00
Debargha Mukherjee
eef57c1e99 Fixes ext-interp experiment
Fixes integer pel MV usage for the sub8x8 case, which fixes a
rare mismatch issue.

Also adds some other minor missing code related to filter threshes.

Change-Id: I6b07e6cf9b287ba4b5bd6599af4a7412e50b3bdc
2016-01-27 09:24:48 -08:00
Debargha Mukherjee
84ca7a9f0f Loop restoration filter
Current implementation is a bilateral filter whose
parameters are transmitted in the bitstream.

derflr: -0.647% BDRATE
hevcmr: -0.794% BDRATE

This is a prelimary patch. Various other variations are to
be investigated next, that will hopefully be less expensive
on the decoder side.

Change-Id: I50634ae8f5014ad0bf7432306348908a349d81e1
2016-01-20 17:59:46 -08:00
Yaowu Xu
727ca802bf Merge "Merge branch 'master' into nextgenv2" into nextgenv2 2016-01-14 00:26:45 +00:00
Yaowu Xu
0367f32ea8 Merge branch 'master' into nextgenv2
Manually resovled the following conflicts:
	vp10/common/blockd.h
	vp10/common/entropy.h
	vp10/common/entropymode.c
	vp10/common/entropymode.h
	vp10/common/enums.h
	vp10/common/thread_common.c
	vp10/decoder/decodeframe.c
	vp10/decoder/decodemv.c
	vp10/encoder/bitstream.c
	vp10/encoder/encodeframe.c
	vp10/encoder/rd.c
	vp10/encoder/rdopt.c

Change-Id: I15d20ce5292b70f0c2b4ba55c1f1318181481596
2016-01-13 13:18:06 -08:00
Jingning Han
33cc1bd21d Generate compound reference motion vector
This commit allows the codec to add motion vector pairs into
the candidate list. It further improves the compression performance
by 0.1% across derf, hevcmr, stdhd, and hevchr sets without adding
encode/decode time.

Change-Id: I88d36da25a2a89bb506d411844af667081eba98b
2016-01-12 15:28:47 -08:00
Debargha Mukherjee
f7dfa4ece7 Modifies inter/intra coding to allow all tx types
The nominal tx_type for a given mode is used as a context
to encode the actual tx_type for intra.

Results:
derflr: -0.241% BDRATE
hevcmr: -0.366% BDRATE

Change-Id: Icfe7b0a58d79bc6497a06e3441779afec6e01e21
2016-01-08 11:13:46 -08:00
Jingning Han
387a10e3dc Enable context analyzer for inter mode entropy coding
It allows the codec to account for certain corner cases when
processing inter prediction mode entropy coding.

Change-Id: Ied451f4fff26ba579f6556554b8381ff2ccd0003
2016-01-08 10:27:27 -08:00
Zoe Liu
9581f3d49a Replaced a hard-coded value with the macro
Change-Id: I2aec63d8a600e319d037b764b0609092bce1e483
2015-12-30 17:16:51 -08:00
Zoe Liu
ec36a2b061 Restore the flexibility for the new 3 references
For the experiment of EXT_REFS, removed the previous special handling
on the new last 3 references, i.e. LAST2_FRAME, LAST3_FRAME, and
LAST4_FRAME, at the decoder, so that these new last references are
treated the same way as the other 3 references (LAST_FRAME,
GOLDEN_FRAME, and ALTREF_FRAME). Encoder changes have been made
accordingly to realize this flexibility.

Change-Id: Ic6546f9443b4377bb7e7b101bfa3e70a8b8d1c65
2015-12-17 16:34:02 -08:00
Yaowu Xu
dab7515aa4 Merge branch 'master' into nextgenv2
With a few manual fixes of merge conflicts.

Change-Id: I0dd65ff90f9fa8606e5563f528659e2607b12376
2015-12-16 09:00:57 -08:00
paulwilkins
99309004bf Fixed interval, fixed Q 1 pass test patch.
For testing implemented a fixed pattern and delta, 1 pass,
fixed Q, low delay mode.

This has not in any way been tuned or optimized.

Change-Id: Icf9b57c3bb16cc5c0726d5229009212af36eb6d9
2015-12-15 15:33:25 +00:00
Yaowu Xu
f07d73b9bf Merge branch 'master' into nextgenv2
Change-Id: Id0b784b115602e2502b42fa972a5ae210435a3be
2015-12-11 08:58:40 -08:00