Commit Graph

207 Commits

Author SHA1 Message Date
Dmitry Kovalev
5b65246a71 Adding missing const to vp9_extra_bits array.
Change-Id: Icd128ab58719e0b9066bdfa66a5d0d427a84d6df
2013-07-31 18:51:18 -07:00
Jingning Han
525745b17a Remove a redundant branching in tokenize_b
The tokenize_b function is only called when output flag is on. Hence
removing the conditional branch on it therein.

Change-Id: Ib709f47f23f39ca05a695faf86fa3377f11f2dd0
2013-07-29 17:08:13 -07:00
Jingning Han
455f2de20b Tune tokenization/detokenization flow for speed-up
This commit optimizes the tokenization and detokenization operational
flow for speed-up. It makes the coding process about 0.3% faster at
speed 0.

Change-Id: I28008df7482874e4b5f237f2d418ff82a249dd56
2013-07-29 16:15:30 -07:00
Jingning Han
b5323ed89a Skip redundant tokenization in rd loop
This commit makes the encoder skip the redundant tokenization process
in the rate-distortion optimization search loop, while updating the
entropy contexts accordingly. It makes the speed 0 encoding process
about 0.5% faster at no performance change.

Change-Id: I34a4155a0b5332afeb45c93a51c7f35a294d685c
2013-07-29 16:09:16 -07:00
Dmitry Kovalev
828119d6ab Renaming txfm to tx for consistency in some places.
Change-Id: I2a6a646570e2af66315e7c658d00d99f80c4b127
2013-07-29 14:35:55 -07:00
Dmitry Kovalev
23391ea835 Renaming TX_SIZE_MAX_SB to TX_SIZES.
Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107
2013-07-29 12:25:34 -07:00
Dmitry Kovalev
fcc34796d2 Removing CONFIG_BALANCED_COEFTREE experiment.
Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4
2013-07-24 15:53:42 -07:00
Dmitry Kovalev
9139ee0908 Adding condition inside get_tx_type_{4x4, 8x8, 16x16}.
Adding plane type check condition because it was always used outside of
get_tx_type_{4x4, 8x8, 16x16}.

Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230
2013-07-24 12:55:45 -07:00
Dmitry Kovalev
1099a436d3 Moving counts from FRAME_CONTEXT to new struct FRAME_COUNTS.
Counts are separate from frame context. We have several frame contexts but
need only one copy of all counts.

Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d
2013-07-23 17:02:08 -07:00
Dmitry Kovalev
dd150e8ea9 Removing redundant code mostly from vp9_pred_common.{h, c}.
Removing redundant function arguments and curly braces.

Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
2013-07-11 18:39:10 -07:00
Dmitry Kovalev
c4ad3273c7 Moving segmentation related vars into separate struct.
Adding segmentation struct to vp9_seg_common.h. Struct members are from
macroblockd and VP9Common structs. Moving segmentation related constants
and enums to vp9_seg_common.h.

Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
2013-07-11 11:57:57 -07:00
Jim Bankoski
fb027a7658 removing case statements around prediction entropy coding
Removes SEG_ID
Removes MBSKIP
Removes SWITCHABLE_INTERP
Removes INTRA_INTER
Removes COMP_INTER_INTER
Removes COMP_REF_P
Removes SINGLE_REF_P1
Removes SINGLE_REF_P2
Removes TX_SIZE

Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b
2013-07-09 20:10:16 -07:00
Ronald S. Bultje
26b6318de8 Make get_coef_context() branchless.
This should significantly speedup cost_coeffs(). Basically what the
patch does is to make the neighbour arrays padded by one item to
prevent an eob check in get_coef_context(), then it populates each
col/row scan and left/top edge coefficient with two times the same
neighbour - this prevents a single/double context branch in
get_coef_context(). Lastly, it populates neighbour arrays in pixel
order (rather than scan order), so we don't have to dereference the
scantable to get the correct neighbours.

Total encoding time of first 50 frames of bus (speed 0) at 1500kbps
goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase.

Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56
2013-07-01 16:34:10 -07:00
Ronald S. Bultje
7353ceab9d Quantize (64-bit only, for now) SSSE3 SIMD.
Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
goes 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is
x86-64 only, it needs some minor modifications to be 32bit compatible,
because it uses 15 xmm registers, whereas 32bit only has 8.

Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
2013-07-01 11:36:07 -07:00
Ronald S. Bultje
d00b8e5f82 Inline vp9_get_coef_context() (and remove vp9_ prefix).
Makes cost_coeffs() a lot faster:
4x4: 236 -> 181 cycles
8x8: 888 -> 588 cycles
16x16: 3550 -> 2483 cycles
32x32: 17392 -> 12010 cycles

Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.

Change-Id: I16b8d595946393c8dc661599550b3f37f5718896
2013-06-28 10:40:21 -07:00
Dmitry Kovalev
87ee34aacb Removing unused code.
Removing block index (ib) parameter from get_tx_type_{8x8, 16x16}
functions.

Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1
2013-06-25 10:17:19 -07:00
Deb Mukherjee
869a39ba60 Cleans up mbskip encoding
Refactors mbskip coding to be compatible with coding of the rest of
the symbols. Adds forward/backward adaptation and removes a lot of
the legacy code.

Results:
fast50: +1.6%
derfraw300: +0.317%

Change-Id: I395a2976d15af044d3b8ded5acfa45f6f065f980
2013-06-07 16:00:26 -07:00
Ronald S. Bultje
6ef805eb9d Change ref frame coding.
Code intra/inter, then comp/single, then the ref frame selection.
Use contextualization for all steps. Don't code two past frames
in comp pred mode.

Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95
2013-06-06 17:28:09 -07:00
Jim Bankoski
5a88271b09 don't tokenize & encode tokens for blocks in UMV
This avoids encoding tokens for blocks that are entirely
in the UMV border. This changes the bitstream.

Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9
2013-06-06 06:10:25 -07:00
Ronald S. Bultje
e9d68a5e36 Merge all various transform size data trackers into single variables.
Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db
2013-05-31 09:18:59 -07:00
Deb Mukherjee
b8b3f1a46d Balancing coef-tree to reduce bool decodes
This patch changes the coefficient tree to move the EOB to below
the ZERO node in order to save number of bool decodes.

The advantages of moving EOB one step down as opposed to two steps down
in the other parallel patch are: 1. The coef modeling based on
the One-node becomes independent of the tree structure above it, and
2. Fewer conext/counter increases are needed.

The drawback is that the potential savings in bool decodes will be
less, but assuming that 0s are much more predominant than 1's the
potential savings is still likely to be substantial.

Results on derf300: -0.237%

Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242
2013-05-29 16:25:52 -07:00
Sami Pietila
88a4d4c510 Residual coding to cache energy class of tokens.
Proposal for tuning the residual coding by changing how the context
from previous tokens is calculated. Storing the energy class of previous
tokens instead of the token itself eases the critical path of
HW implementations.

Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab
2013-05-29 15:21:01 +01:00
Paul Wilkins
33ecd6ad54 Merge Scatter Scan experiment.
Removal from under configure flag.
A bit  renaming

Change-Id: I2213229dfe852001dfec16b149f47c52ce88f3aa
2013-05-23 13:09:27 +01:00
Jingning Han
7ac5ac52f9 Merge 4x4 block level partition into codebase
Move 4x4/4x8/8x4 partition coding out of experimental list.

This commit fixed the unit test failure issues. It also resolved
the merge conflicts between 4x4 block level partition and iterative
motion search for comp_inter_inter.

Change-Id: I898671f0631f5ddc4f5cc68d4c62ead7de9c5a58
2013-05-23 11:58:50 +01:00
Deb Mukherjee
de4d682ca4 Using 128 entry look up table for coef models
Reverts to using 128 bit LUT for the coef models rather than 48
to ease hardware implementation.

Also incorporates some cleanups including removing various
hooks to support different lookup tables based on block_type and
ref_type.

Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6
2013-05-22 08:44:31 -07:00
Deb Mukherjee
7a645e4e12 Merging the model coef prob experiment
Merges the experiment.

Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9
2013-05-21 14:44:38 -07:00
Deb Mukherjee
39a90bc8e8 Updating the model coef experiment
Cleans up the experiment. Actually uses reduced counts for backward
updates, and reduced number of probabilities in the context.

No change in bitstream when the experiment is on.

Between expt on and off:
derfraw300 is down only -0.062% (which is better than when expts
were run previously).

Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5
2013-05-20 12:46:36 -07:00
Jingning Han
1f26840fbf Enable recursive partition down to 4x4
This commit allows the rate-distortion optimization recursion
at encoder to go down to 4x4 block size. It deprecates the use
of I4X4_PRED and SPLITMV syntax elements from bit-stream
writing/reading. Will remove the unused probability models in
the next patch.

The partition type search and bit-stream are now capable of
supporting the rectangular partition of 8x8 block, i.e., 8x4
and 4x8. Need to revise the rate-distortion parts to get these
two partition tested in the rd loop.

Change-Id: I0dfe3b90a1507ad6138db10cc58e6e237a06a9d6
2013-05-14 12:39:56 -07:00
Paul Wilkins
e5f715201a Change to band calculation.
Change band calculation back to simpler model based
on the order in which coefficients are coded in scan order
not the absolute coefficient positions.

With the scatter scan experiment enabled the results were
appear broadly neutral on derf (-0.028) but up a little on std-hd +0.134).

Without the scatterscan experiment on the results were up derf as well.

Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791
2013-05-13 17:21:49 +01:00
Paul Wilkins
a14ae84749 Deprecate code_zerogroup experiment.
Delete code under the CONFIG_CODE_ZEROGROUP flag.

Change-Id: I5fe6c7b42a5da9b73118e33594301da4129f320a
2013-05-07 16:52:55 -07:00
Jingning Han
776c1482a3 Merge SB8X8 into the codebase
Pull sb8x8 out of experimental list. verified via borg run tests.
Fixed unit test failures.

Change-Id: I12a4bbd17395930580c048ab68becad1ffe46e76
2013-05-07 09:08:25 -07:00
John Koleszar
acc9c125dd Remove old_block_idx_4x4
Removes several instances where the old block numbering was
still in use.

Change-Id: Id35130591455a4abe6844613e45c0b70c1220c08
2013-05-03 17:19:13 -07:00
Ronald S. Bultje
d068d869b9 sb8x8 integration in rd loop.
Work-in-progress, not yet ready for review. TODO items:
- bitstream writing (encoder) and reading (decoder)
- decoder reconstruction

Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081
2013-04-30 16:13:20 -07:00
John Koleszar
5c24469c2c Use foreach_transformed_block with tokenize_b
Updates the tokenizer to use the common block walker used by the
detokenizer, to support non-4:2:0 and more than 3 planes.

Change-Id: If1854117a9c7c1427349209fa2b3051ce6459dcb
2013-04-29 12:07:39 -07:00
Jingning Han
c8bed9d4f9 Merge "Immigrate tokenize_mb into tokenize_sb" into experimental 2013-04-29 11:28:56 -07:00
Jingning Han
3ac3c4695c Immigrate tokenize_mb into tokenize_sb
Unify the tokenize_ function and enable configurable block size for
superblock 8x8. We are immigrating the functionalities of
macroblock handles into superblock ones, and eventually will remove
encode_mb and decode_mb. To be continued on detokenize_ module.

Change-Id: I9f81e8c2291082535cf5e0c4b662eb24fb7c8a7f
2013-04-29 11:19:36 -07:00
Ronald S. Bultje
2dbaa4f4f4 Change above/left_context to use an 8x8 basis.
Output changes slightly because of a minor bug in (at least) the sb32x16
block2above tx16x16 tables that previously existed in vp9_blockd.c.

Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc
2013-04-29 10:37:25 -07:00
Ronald S. Bultje
c849eaca59 Use b_width/height_log2 instead of mb_ where appropriate.
Basic assumption: when talking about transform units, use b_; when
talking about macroblock indices, use mb_.

Change-Id: Ifd163f595d4924ff892de4eb0401ccd56dc81884
2013-04-25 14:20:59 -07:00
Jingning Han
b0e3b3df18 Move sbsegment out of experimental list
Move rectangular superblock coding out of experimental list.

Change-Id: I96c37547d122330d666a67b4bf577ae54547857f
2013-04-24 15:19:17 -07:00
Dmitry Kovalev
d0d1094a05 Merge "Adding get_scan_{4x4, 8x8, 16x16} functions." into experimental 2013-04-23 12:44:51 -07:00
Ronald S. Bultje
00269a2413 Remove unused stuffing function.
Change-Id: I2bc8d775f8d698bf8582f4eecabc2329452e8d9b
2013-04-23 11:18:42 -07:00
Dmitry Kovalev
5de7e16ca2 Adding get_scan_{4x4, 8x8, 16x16} functions.
Change-Id: Id4306ef6d65d4a3984aed50b775bdf48d4f6c438
2013-04-22 14:08:41 -07:00
Deb Mukherjee
0aa79be7d5 Removes the code_nonzerocount experiment
This patch does not seem to give any benefits.

Change-Id: I9d2b4091d6af3dfc0875f24db86c01e2de57f8db
2013-04-22 10:58:49 -07:00
Deb Mukherjee
70d9f116fd End of orientation zero group experiment
Adds an experiment that codes an end-of-orientation symbol
for every eligible zero encountered in scan order.

This cleans out various other sub-experiments that were part
of the origiinal patch, which will be later included if found
useful.

Results are slightly positive on all sets (0.1 - 0.2% range).

Change-Id: I57765c605fefc7fb9d1b57f1b356843602abefaf
2013-04-22 09:27:59 -07:00
Dmitry Kovalev
684ddc61ea Renaming vp9_extra_bit_struct to vp9_extra_bit.
Change-Id: Ie4713da125e954c1d30e1d4cbeb38666fce90ccc
2013-04-19 11:14:33 -07:00
Dmitry Kovalev
77f4697a13 Fixing member names inside TOKENVALUE and TOKENEXTRA structs.
Change-Id: I183ec5819d4d80966c92db36db75b8c3be0d381d
2013-04-18 16:18:08 -07:00
Dmitry Kovalev
a8d903e539 Merge "Replacing VP9_COMBINEENTROPYCONTEXTS macro with function." into experimental 2013-04-18 14:26:34 -07:00
John Koleszar
ff3f93639c Use BLOCK_SIZE_TYPE in foreach_ walker
Change-Id: I655305c9e22bdd9abc893d3c40d4bc6616aa1d35
2013-04-17 15:08:37 -07:00
Ronald S. Bultje
e693472236 Fairly basic integration of rectangular blocks in encoding RD loop.
Adds RD integration for 32x16, 16x32, 64x32 and 32x64 rectangular blocks.
Derf almost +0.6%, HD a little over +1.0%, STDHD +1.3%.

Change-Id: Id651fdb6a655fdbb5c47009757e63317acfb88a5
2013-04-17 09:25:06 -07:00
Dmitry Kovalev
9087d6d470 Replacing VP9_COMBINEENTROPYCONTEXTS macro with function.
Change-Id: I3bbc31840af69481e1d9bb4427c9ee25abf82946
2013-04-16 15:30:28 -07:00
John Koleszar
e3cfe4e89e Remove the mb_no_coeff_skip flag
This flag was added to VP8 to allow a mode where MB-level skipping
was not allowed, saving a bit per mb. It was never used in practice,
and hasn't been tested in VP9, so remove it.

Change-Id: Id450ec6904c6d06c1919508e7efc52d05cde5631
2013-04-16 12:36:16 -07:00
Dmitry Kovalev
399a6cbcde Merge "Renaming vp9_token_struct to vp9_token and removing previous typedef." into experimental 2013-04-14 04:31:39 -07:00
John Koleszar
8bf6de725c Merge changes I6721e42f,Iaffb1ae8 into experimental
* changes:
  tokenize: convert skippable functions
  Add foreach_transformed_block
2013-04-11 13:36:25 -07:00
Dmitry Kovalev
24f18e1c34 Renaming vp9_token_struct to vp9_token and removing previous typedef.
Change-Id: If69c3d795f87af5cc7bfdfe70ef733c41b4d55c8
2013-04-11 13:01:52 -07:00
John Koleszar
c2bd46bf45 tokenize: convert skippable functions
Use the common block walker to calculate skippability.

Change-Id: I6721e42f065df237426c91c1d871ec226ba7cdcb
2013-04-11 12:27:37 -07:00
John Koleszar
c18b2617a4 Remove vp9_reset_mb_tokens_context
Use sb-common version instead.

Change-Id: If2552b5a39fd2e5272f66a41c5667dda85fd3939
2013-04-11 11:39:19 -07:00
Ronald S. Bultje
8fb5be48a6 Make usage of sb_type independent of literal values.
Change-Id: I0d12f9ef9d960df0172a1377f8e5236eb6d90492
2013-04-10 17:38:57 -07:00
Ronald S. Bultje
a3874850dd Make SB coding size-independent.
Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code
gives identical encoder results before and after. There are a few
macros for rectangular block sizes under the sbsegment experiment; this
experiment is not yet functional and should not yet be used.

Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728
2013-04-09 21:28:27 -07:00
John Koleszar
05a79f2fbf Move EOB to per-plane data
Continue migrating data from BLOCKD/MACROBLOCKD to the per-plane
structures.

Change-Id: Ibbfa68d6da438d32dcbe8df68245ee28b0a2fa2c
2013-04-04 21:30:23 -07:00
John Koleszar
4c05a051ab Move qcoeff, dqcoeff from BLOCKD to per-plane data
Start grouping data per-plane, as part of refactoring to support
additional planes, and chroma planes with other-than 4:2:0
subsampling.

Change-Id: Idb76a0e23ab239180c818025bae1f36f1608bb23
2013-04-04 16:30:57 -07:00
Deb Mukherjee
e3955007df Merge "Framework changes in nzc to allow more flexibility" into experimental 2013-03-29 15:57:27 -07:00
Deb Mukherjee
fe9b5143ba Framework changes in nzc to allow more flexibility
The patch adds the flexibility to use standard EOB based coding
on smaller block sizes and nzc based coding on larger blocksizes.
The tx-sizes that use nzc based coding and those that use EOB based
coding are controlled by a function get_nzc_used().
By default, this function uses nzc based coding for 16x16 and 32x32
transform blocks, which seem to bridge the performance gap
substantially.

All sets are now lower by 0.5% to 0.7%, as opposed to ~1.8% before.

Change-Id: I06abed3df57b52d241ea1f51b0d571c71e38fd0b
2013-03-28 09:33:50 -07:00
Ronald S. Bultje
9eea9fa206 Fix mix-up in pt token indexing.
This fixes uninitialized reads in the trellis, and probably makes the
trellis do something again.

Change-Id: Ifac8dae9aa77574bde0954a71d4571c5c556df3c
2013-03-28 09:24:29 -07:00
Ronald S. Bultje
d9094d8fd3 Add col/row-based coefficient scanning patterns for 1D 8x8/16x16 ADSTs.
These are mostly just for experimental purposes. I saw small gains (in
the 0.1% range) when playing with this on derf.

Change-Id: Ib21eed477bbb46bddcd73b21c5c708a5b46abedc
2013-03-26 16:46:13 -07:00
Ronald S. Bultje
3120dbddb1 Redo banding for all transforms.
Now that the first AC coefficient in both directions use the same DC
as their context, there no longer is a purpose in letting both have
their own band. Merging these two bands allows us to split bands for
some of the very high-frequency AC bands.

In addition, I'm redoing the banding for the 1D-ADST col/row scans. I
don't think the old banding made any sense at all (it merged the last
coefficient of the first row/col in the same band as the first two of
the second row/col), which was clearly an oversight from the band being
applied in scan-order (rather than in their actual position). Now,
coefficients at the same position will be in the same band, regardless
what scan order is used. I think this makes most sense for the purpose
of banding, which is basically "predict energy for this coefficient
depending on the energy of context coefficients" (i.e. pt).

After full re-training, together with previous patch, derf gains about
1.2-1.3%, and hd/stdhd gain about 0.9-1.0%.

Change-Id: I7a0cc12ba724e88b278034113cb4adaaebf87e0c
2013-03-26 16:46:13 -07:00
Ronald S. Bultje
790fb13215 Use above/left (instead of previous in scan-order) as token context.
Pearson correlation for above or left is significantly higher than for
previous-in-scan-order (absolute values depend on position in scan, but
in general, we gain about 0.1-0.2 by using either above or left; using
both basically just makes this even better). For eob branch skipping,
we continue to use the previous token in scan order.

This helps about 0.9% on derf after re-training on a limited data set.
Full re-training and results on larger-resolution clips are pending.

Note that this commit breaks trellis, so we can probably get further
gains out of it by fixing trellis at some later point.

Change-Id: Iead68e296fc3a105cca746b5e3da9555d6010cfe
2013-03-26 16:46:09 -07:00
Ronald S. Bultje
b99dce6881 Fix ENTROPY_STATS code in vp9_tokenize.c.
Change-Id: I9b4cb1e2ce6c6a99cffd473ff2fa7579bd318fcd
2013-03-18 15:39:04 -07:00
Deb Mukherjee
a28139c849 Continued experiment with nonzero count
Adds probability updates for extra bits for the nzcs, code for
getting nzc stats, plus some minor cleanups and fixes.

Change-Id: If2814e7f04fb52f5025ad9f400f3e6c50a00b543
2013-03-08 16:37:08 -08:00
Ronald S. Bultje
d3724abe9f Re-add support for ADST in superblocks.
This also changes the RD search to take account of the correct block
index when searching (this is required for ADST positioning to work
correctly in combination with tx_select).

Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
2013-03-07 11:19:10 -08:00
Deb Mukherjee
eb6ef2417f Coding con-zero count rather than EOB for coeffs
This patch revamps the entropy coding of coefficients to code first
a non-zero count per coded block and correspondingly remove the EOB
token from the token set.

STATUS:
Main encode/decode code achieving encode/decode sync - done.
Forward and backward probability updates to the nzcs - done.
Rd costing updates for nzcs - done.
Note: The dynamic progrmaming apporach used in trellis quantization
is not exactly compatible with nzcs. A suboptimal approach has been
used instead where branch costs are updated to account for changes
in the nzcs.

TODO:
Training the default probs/counts for nzcs

Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
2013-03-07 07:20:30 -08:00
Ronald S. Bultje
111ca42133 Make superblocks independent of macroblock code and data.
Split macroblock and superblock tokenization and detokenization
functions and coefficient-related data structs so that the bitstream
layout and related code of superblock coefficients looks less like it's
a hack to fit macroblocks in superblocks.

In addition, unify chroma transform size selection from luma transform
size (i.e. always use the same size, as long as it fits the predictor);
in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
transform will now use the 16x16 (instead of the 8x8) chroma transform,
and 64x64 superblocks using the 32x32 luma transform will now use the
32x32 (instead of the 16x16) chroma transform.

Lastly, add a trellis optimize function for 32x32 transform blocks.

HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There's
a few negative points here and there that I might want to analyze
a little closer.

Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
2013-03-04 16:34:36 -08:00
Ronald S. Bultje
e8c74e2b70 Move eob from BLOCKD to MACROBLOCKD.
Consistent with VP8.

Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
2013-02-27 11:00:55 -08:00
Ronald S. Bultje
b1641150b1 Merge cnvcontext experiment.
Change-Id: I35e64998b25694a3bb4a62164bba3c03c1db4bc7
2013-02-26 10:40:15 -08:00
Ronald S. Bultje
0c9e2e9a1d Split coefficient token tables intra vs. inter.
Change-Id: I5416455f8f129ca0f450d00e48358d2012605072
2013-02-23 07:33:46 -08:00
Paul Wilkins
c17672a33d Further changes to coefficient contexts.
This patch alters the balance of context between the
coefficient bands (reflecting the position of coefficients
within a transform blocks) and the energy of the previous
token (or tokens) within a block.

In this case the number of coefficient bands is reduced
but more previous token energy bands are supported.

Some initial rebalancing of the default tables has been
by running multiple derf clips at multiple data rates using
the ENTOPY_STATS macro. Further balancing needs to be
done using larger image formatsd especially in regard to
the bigger transform sizes which are not as well represented
in encodings of smaller image formats.

Change-Id: If9736e95c391e711b04aef6393d26f60f36e1f8a
2013-02-23 07:29:09 -08:00
Ronald S. Bultje
3af36ea8cc Remove Y2 and Y-no-DC token types from the bitstream.
Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042
2013-02-15 14:06:30 -08:00
Ronald S. Bultje
46dff5d233 Remove some Y2-related code.
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
2013-02-15 14:06:25 -08:00
Scott LaVarnway
ae886d6bff Moved vp9_get_coef_band to header file
allowing the compiler to inline.

Change-Id: I66e5caf5e7fefa68a223ff0603aa3f9e11e35dbb
2013-02-14 12:27:25 -08:00
Paul Wilkins
9255ad107f Abstract selection of coef band.
This patch abstracts the selection of the coefficient band
context into a function as a precursor to further experiments
with the coefficient context.

It also removes the large per TX size coefficient band structures
and uses a single matrix for all block sizes within the test function.

This may have an impact on quality (results to follow) but is only an
intermediate step in the process of redefining the context. Also the
quality impact will be larger initially because the default tables will
be out of step with the new banding.

In particular the 4x4 will in this case only use 7 bands. If needed we
can add back block size dependency localized within the function, but
this can follow on after the other changes to the definition of the
context.

Change-Id: Id7009c2f4f9bb1d02b861af85fd8223d4285bde5
2013-02-13 19:01:25 +00:00
Paul Wilkins
0d284ffed1 Abstract the selection of coefficient context.
This is an initial step to facilitate experimentation
with changes to the prior token context used to code
coefficients to take better account of the energy of
preceding tokens.

This patch merely abstracts the selection of context into
two functions and does not alter the output.

Change-Id: I117fff0b49c61da83aed641e36620442f86def86
2013-02-13 18:56:30 +00:00
Paul Wilkins
6a9f0c61a4 Remove NEWCOEFCONTEXT experiment.
Removal of the  NEWCOEFCONTEXT experiment to
reduce code clutter and make it easier to experiment with
some other changes to the coefficient coding context.

Change-Id: Icd17b421384c354df6117cc714747647c5eb7e98
2013-02-13 15:12:17 +00:00
Ronald S. Bultje
aac73df1a7 Use configure checks for various inline keywords.
Change-Id: I8508f1a3d3430f998bb9295f849e88e626a52a24
2013-02-06 16:12:56 -08:00
Paul Wilkins
0ff9b033b0 Segment Skip Flag
First step in simplifying the segment mode and
segment EOB flags into a simpler segment skip
flag that implies 0,0 mv and EOB at position 0.

Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5
2013-01-28 17:28:04 +00:00
Ronald S. Bultje
aa2effa954 Merge tx32x32 experiment.
Change-Id: I615651e4c7b09e576a341ad425cf80c393637833
2013-01-10 08:23:59 -08:00
Ronald S. Bultje
4455036cfc Merge superblocks (32x32) experiment.
Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587
2013-01-08 12:54:45 -08:00
Yaowu Xu
83664f457b make cost_coeffs() and tokenize_b() consistent
Change-Id: I7cdb5c32a1400f88ec36d08ea982e38b77731602
2013-01-03 09:31:47 -08:00
Deb Mukherjee
08f0c7cc9c New previous coef context experiment
Adds an experiment to derive the previous context of a coefficient
not just from the previous coefficient in the scan order but from a
combination of several neighboring coefficients previously encountered
in scan order.  A precomputed table of neighbors for each location
for each scan type and block size is used. Currently 5 neighbors are
used.

Results are about 0.2% positive using a strategy where the max coef
magnitude from the 5 neigbors is used to derive the context.

Change-Id: Ie708b54d8e1898af742846ce2d1e2b0d89fd4ad5
2012-12-19 18:49:39 -08:00
Yaowu Xu
b41c3583ac Merge "correct logic in cnvcontext experiment for tx32x32" into experimental 2012-12-18 14:23:39 -08:00
Yaowu Xu
de269c8a62 correct logic in cnvcontext experiment for tx32x32
Change-Id: I004ded11983b7fda85793912ebc5c6f266dc5eb5
2012-12-18 13:53:17 -08:00
Ronald S. Bultje
8986eb5c26 Give 4x4 scan and coef_band tables a _4x4 suffix.
This matches the names of tables for all other transform sizes.

Change-Id: Ia7681b7f8d34c97c27b0eb0e34d490cd0f8d02c6
2012-12-18 10:49:10 -08:00
John Koleszar
1306ba7659 Remove vp9_type_aliases.h
Prefer the standard fixed-size integer typedefs.

Change-Id: Iad75582350669e49a8da3b7facb9c259e9514a5b
2012-12-17 11:32:37 -08:00
Yaowu Xu
2b9ec585d6 fixed an encoder/decoder mismatch
The mismatch was caused by an improper merge of cleanup code around
tokenize_b() and stuff_b() with TX32X32 experiment.

Change-Id: I225ae62f015983751f017386548d9c988c30664c
2012-12-13 15:33:21 -08:00
Deb Mukherjee
7fa3deb1f5 Build fixes with teh super blcoks and 32x32 expts
Change-Id: I3c751f8d57ac7d3b754476dc6ce144d162534e6d
2012-12-13 12:18:38 -08:00
Ronald S. Bultje
f4608e3606 Merge "New default coefficient/band probabilities." into experimental 2012-12-13 09:56:50 -08:00
Ronald S. Bultje
5a5df19de3 New default coefficient/band probabilities.
Gives 0.5-0.6% improvement on derf and stdhd, and 1.1% on hd. The
old tables basically derive from times that we had only 4x4 or
only 4x4 and 8x8 DCTs.

Note that some values are filled with 128, because e.g. ADST ever
only occurs as Y-with-DC, as does 32x32; 16x16 ever only occurs
as Y-with-DC or as UV (as complement of 32x32 Y); and 8x8 Y2 ever
only has 4 coefficients max. If preferred, I can add values of
other tables in their place (e.g. use 4x4 2nd order high-frequency
probabilities for 8x8 2nd order), so that they make at least some
sense if we ever implement a larger 2nd order transform for the
8x8 DCT (etc.), please let me know

Change-Id: I917db356f2aff8865f528eb873c56ef43aa5ce22
2012-12-12 16:23:57 -08:00
Ronald S. Bultje
39de1e14ed Merge "Consistently use get_prob(), clip_prob() and newly added clip_pixel()." into experimental 2012-12-12 10:34:14 -08:00
Ronald S. Bultje
4d0ec7aacd Consistently use get_prob(), clip_prob() and newly added clip_pixel().
Add a function clip_pixel() to clip a pixel value to the [0,255] range
of allowed values, and use this where-ever appropriate (e.g. prediction,
reconstruction). Likewise, consistently use the recently added function
clip_prob(), which calculates a binary probability in the [1,255] range.
If possible, try to use get_prob() or its sister get_binary_prob() to
calculate binary probabilities, for consistency.

Since in some places, this means that binary probability calculations
are changed (we use {255,256}*count0/(total) in a range of places,
and all of these are now changed to use 256*count0+(total>>1)/total),
this changes the encoding result, so this patch warrants some extensive
testing.

Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628
2012-12-12 10:01:19 -08:00
Yaowu Xu
899f0fc126 clean up tokenize_b() and stuff_b()
Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5
2012-12-11 13:32:16 -08:00
Yaowu Xu
ab480cede5 experiment with CONTEXT conversion
This commit changed the ENTROPY_CONTEXT conversion between MBs that
have different transform sizes.

In additioin, this commit also did a number of cleanup/bug fix:
1. removed duplicate function vp9_fix_contexts() and changed to use
vp8_reset_mb_token_contexts() for both encoder and decoder
2. fixed a bug in stuff_mb_16x16 where wrong context was used for
the UV.
3. changed reset all context to 0 if a MB is skipped to simplify the
logic.

Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c
2012-12-07 17:25:45 -08:00
Ronald S. Bultje
885cf816eb Introduce vp9_coeff_probs/counts/stats/accum types.
Use these, instead of the 4/5-dimensional arrays, to hold statistics,
counts, accumulations and probabilities for coefficient tokens. This
commit also re-allows ENTROPY_STATS to compile.

Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5
2012-12-07 16:09:59 -08:00
Ronald S. Bultje
c456b35fdf 32x32 transform for superblocks.
This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
code all over the place to wrap that in the bitstream/encoder/decoder/RD.

Some implementation notes (these probably need careful review):
- token range is extended by 1 bit, since the value range out of this
  transform is [-16384,16383].
- the coefficients coming out of the FDCT are manually scaled back by
  1 bit, or else they won't fit in int16_t (they are 17 bits). Because
  of this, the RD error scoring does not right-shift the MSE score by
  two (unlike for 4x4/8x8/16x16).
- to compensate for this loss in precision, the quantizer is halved
  also. This is currently a little hacky.
- FDCT and IDCT is double-only right now. Needs a fixed-point impl.
- There are no default probabilities for the 32x32 transform yet; I'm
  simply using the 16x16 luma ones. A future commit will add newly
  generated probabilities for all transforms.
- No ADST version. I don't think we'll add one for this level; if an
  ADST is desired, transform-size selection can scale back to 16x16
  or lower, and use an ADST at that level.

Additional notes specific to Debargha's DWT/DCT hybrid:
- coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
  block than for the rest (DWT pixel differences) of the block. Therefore,
  RD error scoring isn't easily scalable between coefficient and pixel
  domain. Thus, unfortunately, we need to compute the RD distortion in
  the pixel domain until we figure out how to scale these appropriately.

Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
2012-12-07 14:45:05 -08:00
Yaowu Xu
6431007df3 Merge "minor fix to eob check for setting CONTEXT" into experimental 2012-11-29 09:27:00 -08:00
Yaowu Xu
7ab1d3e49f minor fix to eob check for setting CONTEXT
Previously, the "!=" check is logically incorrect when eob is at 0 and
effective coefficient starting position is 1. This commit should have
no effect on bitstream.

Change-Id: I6ce3a847c7e72bfbe4f7c74f88e3310c6b9b6d30
2012-11-29 09:10:15 -08:00
Deb Mukherjee
0742b1e4ae Fixing 8x8/4x4 ADST for intra modes with tx select
This patch allows use of 8x8 and 4x4 ADST correctly for Intra
16x16 modes and Intra 8x8 modes when the block size selected
is smaller than the prediction mode. Also includes some cleanups
and refactoring.

Rebase.

Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a
2012-11-28 16:21:12 -08:00
Jim Bankoski
c67873989f fixed includes to be fully specified
Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243
2012-11-28 10:53:17 -08:00
Yaowu Xu
3e976bba21 Localize Y2 entropy coding context
This commit makes sure Y2 entropy coding context is always updated on
every macroblock even there is no Y2 block.

Change-Id: Ie307cfc46526efe55613be39f9f178d2531b56ba
2012-11-28 09:27:36 -08:00
John Koleszar
fcccbcbb39 Add vp9_ prefix to all vp9 files
Support for gyp which doesn't support multiple objects in the same
static library having the same basename.

Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
2012-11-27 14:12:30 -08:00