4774 Commits

Author SHA1 Message Date
Debargha Mukherjee
caae13d54f Global motion continued
Implements a first version of global motion where the
existing ZEROMV mode is converted to a translation only
global motion mode.
A lot of the code for supporting a rotation-zoom affine
model is also incorporated.
WIP.

Change-Id: Ia1288a8dfe82f89484d4e291780288388e56d91b
2015-05-06 14:59:38 -07:00
Alex Converse
47cd96fb49 Try non-traditional intra prediction even when spatial isn't good.
Change-Id: I3a9b94d52cc0e962d91827a9b7ca8b65e82130ca
2015-05-06 10:23:22 -07:00
Peter de Rivaz
2dad1a7c8e Added high bitdepth sse2 transform functions
Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
2015-05-06 10:10:18 -07:00
Peter de Rivaz
2189a51891 Added sse2 acceleration for highbitdepth variance
This is a combination of:
  4a19fa6 Added sse2 acceleration for highbitdepth variance
  c6f5d3b Fix high bit depth assembly function bugs

Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f
2015-05-06 10:04:08 -07:00
Peter de Rivaz
41973e0e3e Refactored idct routines and headers
This change is made in preparation for a
subsequent patch which adds acceleration
for the highbitdepth transform functions.

The highbitdepth transform functions attempt
to use 16/32bit sse instructions where possible,
but fallback to using the C implementations if
potential overflow is detected.  For this reason
the dct routines are made global so they can be
called from the acceleration functions in the
subsequent patch.

Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665
2015-05-06 09:59:20 -07:00
Peter de Rivaz
0e82cba628 Added highbitdepth sse2 SAD acceleration and tests
Change-Id: I9f09e404e3136951e5cc15bf40b915c1fe10b620
2015-05-06 09:00:53 -07:00
Yaowu Xu
846396ddda Enable build with vs2013
Change-Id: I0592b9e92c3ca45e0a81d9ce49a9f2381bec3e39
2015-05-04 14:08:52 -07:00
Alex Converse
9b638cded6 tx_skip: Avoid undefined shift behavior.
vp9_quantize_rect did illegal shifts but didn't use the results.
The shift |a << b| is unfortunately undefined if |a < 0|, but the
more verbose |a * (1 << b)| generates the same machine code.

Change-Id: I7ceac66fa20a700630cf8ed008949146b161dab4
2015-04-30 12:56:27 -07:00
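
The shift issue described in 9b638cded6 can be illustrated with a minimal sketch. The helper name `scale_pow2` is hypothetical and is not taken from `vp9_quantize_rect`; it only shows the `a * (1 << b)` form that the commit message recommends over `a << b` for a possibly negative `a`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the libvpx source): scale a possibly
 * negative value by 2^shift without left-shifting a negative operand,
 * which is undefined behavior in C. */
static int32_t scale_pow2(int32_t a, int shift) {
  /* Assumes a 32-bit int, so (1 << shift) is well defined here. */
  assert(shift >= 0 && shift < 31);
  /* The caller must still ensure the product fits in 32 bits. */
  return a * (1 << shift);
}
```

For example, `scale_pow2(-3, 4)` evaluates to -48, whereas `-3 << 4` has no defined result under the C standard even though most compilers emit the same instruction for both.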
Alex Converse
aaa50de4ca Fix integer overflows in TX skipping
Change-Id: Ic1fc0f1271065180cffcbd2906e8faac6d07d08d
2015-04-30 11:42:31 -07:00
Debargha Mukherjee
735360f70e Consolidate common count updates
Cleanup - does not have any change in RD performance.

Change-Id: Iaca9c7378b294bd8c780958f5e33e697690eebfa
2015-04-29 14:12:03 -07:00
Alex Converse
8d9c600d44 Merge "palette: Add missing consts" into nextgen 2015-04-28 19:51:16 +00:00
Alex Converse
cb437f800f Merge "Refactor 4:4:4 palette selection." into nextgen 2015-04-28 19:07:41 +00:00
Alex Converse
7ae0b65f32 palette: Add missing consts
Change-Id: I83a2e57dc5dbc328c7bfea421ffbaeb83b7ca3bd
2015-04-28 11:35:17 -07:00
Alex Converse
d0cb4e75bc Refactor 4:4:4 palette selection.
Move 444 palette selection out of vp9_rd_pick_intra_mode_sb and into
a subfunction.

Change-Id: Ib323b740318626e2a68cd3d106dbd27c8f4652a6
2015-04-28 11:21:04 -07:00
hui su
1f7b49f7cd Use uniform quantization settings for non-transform blocks
Do not treat first element (dc) differently.

on screen_content
tx-skip only: +16.4% (was +15.45%)

no significant impact on natural videos

Change-Id: I79415a9e948ebbb4a69109311c10126d8a0b96ab
2015-04-28 07:54:16 -07:00
hui su
761cd0b010 Fix bugs in palette and intrabc expt
palette expt: correctly update color buffer
intrabc expt: update zcoeff_blk so that residue coding will not
              be mistakenly skipped

Change-Id: I870f5b742c2ac394f4c871aa65e6591e293d8ef6
2015-04-27 15:27:11 -07:00
Alex Converse
98d4f09a7a Replace vp9_get_bit_depth with vp9_ceil_log2.
The current name is confusing with regard to high bit depth buffers.

Change-Id: Ieacd55ec22c81bd2f013f2e3d73a095affc93689
2015-04-23 10:26:57 -07:00
Debargha Mukherjee
96213ac5e7 Merge "Some minor improvements in bilateral filter expt." into nextgen 2015-04-23 01:14:45 +00:00
Debargha Mukherjee
425a45a45c Some minor improvements in bilateral filter expt.
Changes include:

* Use double for RD cost computation to guard against overflow
for large resolution frames.
* Use previous frame's filter level to code the level better.
* Change precision of the filter parameters.
* Allow spatial variance for x and y to be different.

Change-Id: I1669f65eb0ab1e8519962954c92d59e04f1277b7
derflr: +0.556% (a little up from before)
2015-04-22 18:09:42 -07:00
hui su
9e0750c2d2 Modify scan order for non-transform coding blocks
Use raster scan order for non-transform blocks

+15.45% (+2.1%) on screen_content
no significant change on natural videos

Change-Id: I0e264cb69e8624540639302d131f7de9c31c3ba7
2015-04-21 14:23:52 -07:00
hui su
33207fb170 Remove unused variable in new-quant expt
remove dequant_val_nuq in macroblock_plane

Change-Id: I4b4070ae2d01c2403c781433030204d6e95c3750
2015-04-20 11:14:51 -07:00
Alex Converse
90b3838fca Merge "Don't use old uv scores for NEWDV and cleanup mbmi saving." into nextgen 2015-04-17 12:47:23 -07:00
hui su
ebd3666940 Merge "Add high bit depth support for tx-skip expt" into nextgen 2015-04-17 11:37:39 -07:00
Alex Converse
e45e18593b Don't use old uv scores for NEWDV and cleanup mbmi saving.
Change-Id: Ic0fae1b348ad7659e4a41db29d075ae5eb6cdc82
2015-04-17 10:54:33 -07:00
Debargha Mukherjee
fb001c2e2f Merge "Simplify bilateral filter search for speed" into nextgen 2015-04-16 18:58:03 -07:00
Debargha Mukherjee
017baf9f4b Simplify bilateral filter search for speed
Adds an internal buffer in the encoder to store the deblocked
result to help speed up the search for the best bilateral filter.

Very small change in performance but a lot faster:
derflr: +0.518%

Change-Id: I5d37e016088e559c16317789cfb1c2f49334b2b9
2015-04-16 15:33:34 -07:00
hui su
8c00c7a9cd Fix palette expt asan failure
Account for 422 video format.

Change-Id: Ic5af661720fc5fa7142210d907dd25e1e79ff653
2015-04-16 15:08:06 -07:00
hui su
b69152db79 Add high bit depth support for tx-skip expt
+0.3% on 10-bit
+0.3% on 12-bit

With other high bit compatible experiments on 12-bit
+12.44% (+0.17) over 8-bit baseline

Change-Id: I40b4c382fa54ba4640d08d9d01950ea8c1200bc9
2015-04-16 14:54:39 -07:00
hui su
871c51b30a Fix a bug in tx_skip expt
tx_skip is not enabled for sub8x8 blocks.

Change-Id: I3797238735f85fb2bd07b50ca2845611b198bff6
2015-04-14 11:25:55 -07:00
hui su
294159d41e Merge "refactoring in tx_skip experiment" into nextgen 2015-04-14 08:12:08 -07:00
hui su
261a9bac5a refactoring in tx_skip experiment
simplify code logic

Change-Id: Ifafc712f3f85abafadb429a04e295cf8cbb185d2
2015-04-13 17:14:05 -07:00
Debargha Mukherjee
343c092e2e High bit-depth support for wedge partition expt
Change-Id: Idbd27e66d4f4a7953f888137d5752856215a6760
2015-04-13 09:28:15 -07:00
Debargha Mukherjee
8fa0b12cf7 Merge "An experiment introducing a bilateral loop filter" into nextgen 2015-04-10 16:46:16 -07:00
Debargha Mukherjee
fe4b6ac652 An experiment introducing a bilateral loop filter
Adds a framework to incorporate a parameterized loop
postfilter in the coding loop after the application of the
standard deblocking loop filter.

The first version uses a straight bilateral filter
where the parameters conveyed are just spatial and
intensity gaussian variances.

Results on derflr:
+0.523% (only with this experiment)
+6.714% (with all expts other than intrabc)

Change-Id: I20d47285b4d25b8c6386ff8af2a75ff88ac2b69b
2015-04-10 16:05:00 -07:00
hui su
bfc27bb614 tx-skip experiment: improve entropy coding of coeff tokens
This patch allows the prediction residues of tx-skipped blocks
to use probs that are different from regular transform
coefficients for token entropy coding. Prediction residues are
assumed to be in band 6.

The initial value of probs is obtained with stats from limited
tests. The statistic model for constrained token nodes has not
been optimized. The probs for token extra bits have not been
optimized. These can be future work.

Some coding improvement is observed:
derflr with all experiments:                +6.26%  (+0.10%)
screen_content with palette:               +22.48%  (+1.28%)

Change-Id: I1c0d78178ee9f3655febb6f30cdaef8ee9f8e3cc
2015-04-10 11:33:42 -07:00
Alex Converse
16e5e713fa Add an intra block copy mode (NEWDV).
Change-Id: I82b261c54ac9db33706bb057613dcbe66fc71387
2015-04-03 11:59:57 -07:00
Zoe Liu
e1cae5eebf Clean the COMPOUND_MODES mv initialization for sub8x8
Change-Id: I04f4ad41c002c761d55093432d6c437c25e5bddd
2015-04-02 16:30:48 -07:00
Zoe Liu
2ae3d4f266 Add a new PREDICTION mode using NEARMV as ref mv
This experiment, referred to as NEWMVREF, is also merged with NEWMVREF_SUB8X8,
and the latter has been removed. Runborgs results show that:

(1) Turning on this experiment only, compared against the base:
derflf: Average PSNR 0.40%; Overall PSNR 0.40%; SSIM 0.35%
(2) Turning on all the experiments including this feature, compared against
that without this feature, on the highbitdepth case using 12-bit:
derflf: Average PSNR 0.33%; Overall PSNR 0.32%; SSIM 0.30%.

Now for highbitdepth using 12-bit, compared against base:
derflf: Average PSNR 11.12%; Overall PSNR 11.07%; SSIM 20.27%.

Change-Id: Ie61dbfd5a19b8652920d2c602201a25a018a87a6
2015-04-02 14:37:22 -07:00
hui su
9eada94a3e palette experiment: remove run-length coding
Change-Id: I1e52475d0179cf019841d09a53b3b7fc53c79336
2015-03-31 11:09:30 -07:00
hui su
65d39f9fae Merge "Palette experiment: encode color indices based on context" into nextgen 2015-03-26 18:34:43 -07:00
hui su
a3af20f56e Merge "Palette experiment: adaptly update probs" into nextgen 2015-03-26 18:34:28 -07:00
hui su
6ad18db24f Palette experiment: encode color indices based on context
The basic idea is to use a pixel’s neighboring colors as
context to predict its own color. Up to 4 neighbors are
considered here: left, left-above, above, right-above.
To reduce the number of contexts, the combination of any
4 (or fewer) colors is mapped to a reduced number of
patterns. For example, 1111, 2222, 3333, … , can be mapped
to the same pattern: AAAA. Similarly, 1122, 1133, 2233, …,
can be mapped to the pattern AABB. In this way, the total
number of color contexts is reduced to 16.

This almost doubles the gain of palette coding on screen
content videos.

on screen_content
--enable-palette                                  +14.2%
--enable-palette --enable-tx-skip                 +21.2%

on derflr
--enable-palette                                  +0.12%
with all other experiments                        +6.16%

Change-Id: I560306dae216f2ac11a9214968c2ad2319fa1718
2015-03-26 15:48:08 -07:00
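
The pattern mapping described in 6ad18db24f can be sketched as follows. This is an illustration only: the function name `color_pattern` and the first-occurrence relabeling rule are assumptions, and the sketch does not reproduce the actual 16-context table used by the experiment.

```c
#include <assert.h>

#define NUM_NEIGHBORS 4

/* Hypothetical sketch: relabel up to 4 neighboring color indices by
 * order of first appearance, so that 1111 and 2222 both become "AAAA"
 * and 1122 and 2233 both become "AABB". The result is written to
 * out[0..n-1] followed by a terminating NUL. */
static void color_pattern(const int *colors, int n, char *out) {
  int seen[NUM_NEIGHBORS];
  int num_seen = 0;
  assert(n >= 0 && n <= NUM_NEIGHBORS);
  for (int i = 0; i < n; ++i) {
    int j;
    /* Look for this color among the distinct colors seen so far. */
    for (j = 0; j < num_seen; ++j)
      if (seen[j] == colors[i]) break;
    /* A new color gets the next unused letter. */
    if (j == num_seen) seen[num_seen++] = colors[i];
    out[i] = (char)('A' + j);
  }
  out[n] = '\0';
}
```

With this rule the context depends only on which neighbors share a color, not on the color values themselves, which is what lets many neighbor combinations collapse into a small set of patterns.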
Debargha Mukherjee
3d965883f4 Merge "Add palette coding mode for inter frames" into nextgen 2015-03-26 11:59:36 -07:00
hui su
e18b104462 Palette experiment: adaptly update probs
Also make changes to transmit palette-enabled flag using
neighbor blocks as context.

on screen_content
--enable-palette                            +7.35%

on derflr
with all other experiments                  +6.05%

Change-Id: Id6c2f726d21913d54a3f86ecfea474a4044c27f6
2015-03-25 09:12:57 -07:00
Zoe Liu
2a5648dca5 Cleaned the code in handle_inter_mode
borg results are consistent with those before this patch

Change-Id: I3d21623cb03ea169a031328e9dde9c26ba1bd016
2015-03-23 15:33:25 -07:00
hui su
070d635657 Add palette coding mode for inter frames
on screen_content
--enable-palette                                    +6.74%

on derflr
with all other experiments                          +6.02%
(--enable-supertx --enable-copy-mode
 --enable-ext-tx --enable-filterintra
 --enable-tx64x64 --enable-tx-skip
 --enable-interintra --enable-wedge-partition
 --enable-compound-modes --enable-new-quant
 --enable-palette)

Change-Id: Ib85049b4c3fcf52bf95efbc9d6aecf53d53ca1a3
2015-03-23 08:41:51 -07:00
Deb Mukherjee
73dcd41b72 Merge "Make interintra experiment work with highbitdepth" into nextgen 2015-03-22 22:48:56 -07:00
Deb Mukherjee
c082df2359 Make interintra experiment work with highbitdepth
Also includes some adjustments to the algorithm.
All stats look good.

Change-Id: I824ef8ecf25b34f3feb358623d14fe375c3e4eb7
2015-03-21 07:35:40 -07:00
Deb Mukherjee
8c5ac79e66 Some build fixes with highbitdepth and new quant
Highbitdepth performance about the same as 8-bit.

Change-Id: If737962d8588dd190083edae4383b731f9d22873
2015-03-21 06:53:58 -07:00
Deb Mukherjee
c8ed36432e Non-uniform quantization experiment
This framework allows lower quantization bins to be shrunk down or
expanded to more closely match the source distribution (assuming a
generalized gaussian-like central peaky model for the coefficients) in an
entropy-constrained sense. Specifically, the widths of bins 0-4 are
modified as a factor of the nominal quantization step size and from 5
onwards all bins become the same as the nominal quantization step size.
Further, different bin width profiles as well as reconstruction values
can be used based on the coefficient band as well as the quantization step
size divided into 5 ranges.

A small gain currently on derflr of about 0.16% is observed with the
same parameters for all q values.
Optimizing the parameters based on qstep value is left as a TODO for now.

Results on derflr with all expts on is +6.08% (up from 5.88%).

Experiments are in progress to tune the parameters for different
coefficient bands and quantization step ranges.

Change-Id: I88429d8cb0777021bfbb689ef69b764eafb3a1de
2015-03-17 21:42:55 -07:00
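
The bin structure described in c8ed36432e can be sketched roughly as below. The width factors in `kBinWidthQ7`, the function name `nuq_quantize`, and the treatment of bin 0 as a one-sided zero bin on the coefficient magnitude are all assumptions made for illustration; they are not the values or code from the experiment, which also varies the profile by coefficient band and step-size range.

```c
#include <assert.h>
#include <stdlib.h>

#define NUQ_KNOTS 5

/* Hypothetical widths of bins 0..4, in units of 1/128 of the nominal
 * quantization step (128 would mean exactly one nominal step). */
static const int kBinWidthQ7[NUQ_KNOTS] = { 160, 112, 120, 124, 126 };

/* Map a coefficient to a signed bin index. Bins 0..4 use the modified
 * widths above; bins 5 and up are each one nominal step wide. */
static int nuq_quantize(int coeff, int qstep) {
  assert(qstep > 0);
  const int sign = coeff < 0 ? -1 : 1;
  const int mag = abs(coeff) * 128; /* magnitude in the same 1/128 units */
  int upper = 0;                    /* running upper edge of the current bin */
  for (int bin = 0; bin < NUQ_KNOTS; ++bin) {
    upper += kBinWidthQ7[bin] * qstep;
    if (mag < upper) return sign * bin;
  }
  /* Past the non-uniform region: fall back to uniform bins. */
  return sign * (NUQ_KNOTS + (mag - upper) / (128 * qstep));
}
```

With `qstep = 10` and these assumed widths, the zero bin covers magnitudes below 12.5, so `nuq_quantize(12, 10)` is 0 and `nuq_quantize(13, 10)` is 1; from bin 5 onwards each additional 10 units of magnitude advances the index by one.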