Commit Graph

92 Commits

Author SHA1 Message Date
Ronald S. Bultje
885cf816eb Introduce vp9_coeff_probs/counts/stats/accum types.
Use these, instead of the 4/5-dimensional arrays, to hold statistics,
counts, accumulations and probabilities for coefficient tokens. This
commit also re-allows ENTROPY_STATS to compile.

Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5
2012-12-07 16:09:59 -08:00
Ronald S. Bultje
c456b35fdf 32x32 transform for superblocks.
This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
code all over the place to wrap that in the bitstream/encoder/decoder/RD.

Some implementation notes (these probably need careful review):
- token range is extended by 1 bit, since the value range out of this
  transform is [-16384,16383].
- the coefficients coming out of the FDCT are manually scaled back by
  1 bit, or else they won't fit in int16_t (they are 17 bits). Because
  of this, the RD error scoring does not right-shift the MSE score by
  two (unlike for 4x4/8x8/16x16).
- to compensate for this loss in precision, the quantizer is halved
  also. This is currently a little hacky.
- FDCT and IDCT is double-only right now. Needs a fixed-point impl.
- There are no default probabilities for the 32x32 transform yet; I'm
  simply using the 16x16 luma ones. A future commit will add newly
  generated probabilities for all transforms.
- No ADST version. I don't think we'll add one for this level; if an
  ADST is desired, transform-size selection can scale back to 16x16
  or lower, and use an ADST at that level.

Additional notes specific to Debargha's DWT/DCT hybrid:
- coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
  block than for the rest (DWT pixel differences) of the block. Therefore,
  RD error scoring isn't easily scalable between coefficient and pixel
  domain. Thus, unfortunately, we need to compute the RD distortion in
  the pixel domain until we figure out how to scale these appropriately.

Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
2012-12-07 14:45:05 -08:00
Johann
1009f76566 Use 'vpx_scale' consistently
Change-Id: I178352813d2b8702d081caf405de9dbad9af2cc3
2012-12-05 16:05:44 -08:00
Paul Wilkins
7405040142 Merge "Change to MV reference search." into experimental 2012-12-05 09:14:46 -08:00
John Koleszar
5d91a1e0ae Merge remote-tracking branch 'origin/vp9-preview' into experimental 2012-12-05 08:41:35 -08:00
Paul Wilkins
4cc657ec6e Change to MV reference search.
This patch reduces the cpu cost of the MV ref
search by only allowing insert for candidates
that would be in the current top 4.

This could alter the outcome and slightly favors
near candidates which are tested first but also
limits the worst case loop count to 4 and means in
many cases it will drop out and not happen.

Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113
2012-12-05 14:03:45 +00:00
Johann
d138262ac0 Merge "Begin to refactor vpx_scale usage in VP9" into experimental 2012-12-04 15:23:42 -08:00
Johann
57e72208b3 Merge "Remove ARM optimizations from VP9" into experimental 2012-12-03 13:54:38 -08:00
Johann
c6bd29e2f5 Begin to refactor vpx_scale usage in VP9
Only declare the functions in vpx_scale RTCD and include the relevant
header.

Remove unused files and functions in vpx_scale to avoid wasting time
renaming. vpx_scale/win32/scaleopt.c contains functions which have not
been called in a long time but are potentially optimized.

The 'vp8' functions have not been renamed yet. That is for after the
cleanup.

Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c
2012-12-03 12:51:56 -08:00
Johann
34591b54dd Remove ARM optimizations from VP9
Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b
2012-12-03 12:50:15 -08:00
Jim Bankoski
b95338c7ab Merge "fixes --disable-vp9-encoder" into vp9-preview 2012-12-03 12:41:31 -08:00
Jim Bankoski
d9038b3c60 fixes --disable-vp9-encoder
Change-Id: I467bf0fdf3b35326bcce58d5459e6d2dbfd6c5e5
2012-12-03 12:21:16 -08:00
Deb Mukherjee
8b92f1e023 Supports inter-intra prediction with superblocks
Adds support for compound inter-intra prediction with superblocks.
Also, fixes a bug that disabled intra modes for superblocks.

Change-Id: I4d711317e1bc19df8c2f32dc645429f7fff31036
2012-12-01 15:19:55 -08:00
Deb Mukherjee
6632330702 Adds switchable filters with superblocks
Allows switchbale filters to be used without mismatch when the
superblock experiment is on.

Also removes a spurious clamping code in decodemv.c which causes
rare encode/decode mismatches.

Change-Id: I809d9ee0b2859552b613500b539a615515b863ae
2012-11-30 09:37:08 -08:00
Jim Bankoski
9f9370425b warnings in various experiments
Change-Id: Ib5106d4772450f8026f823dd743f162ab833b1d6
2012-11-30 07:31:37 -08:00
Jim Bankoski
2b8dc065d1 google style guide include guards
Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea
2012-11-30 07:30:59 -08:00
Yunqing Wang
eebc0b49f1 Merge "Further improve macroblock loop filters" into experimental 2012-11-29 16:07:14 -08:00
Jim Bankoski
245fba74b7 signed mismatch mvrefcount
Change-Id: Ie34820c1b6eaba9cf9316415a46f48af79c41646
2012-11-29 08:13:18 -08:00
Jim Bankoski
030e268a90 ihtllm moves to rtcd
clears up some warnings

Change-Id: I9899637497c6ad7519f098e055ab98580ae6d688
2012-11-29 07:19:38 -08:00
Jim Bankoski
e69b5258fd fix vp9_vp8 files renamed
Change-Id: I20c426e91ee49666db42e20eb074095ab6b8ec5d
2012-11-29 06:53:08 -08:00
Jim Bankoski
13dbf1fb17 more rtcd cleanup
Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7
2012-11-28 17:27:15 -08:00
Deb Mukherjee
0de214260b Merge "Fixing 8x8/4x4 ADST for intra modes with tx select" into experimental 2012-11-28 16:59:17 -08:00
Deb Mukherjee
0742b1e4ae Fixing 8x8/4x4 ADST for intra modes with tx select
This patch allows use of 8x8 and 4x4 ADST correctly for Intra
16x16 modes and Intra 8x8 modes when the block size selected
is smaller than the prediction mode. Also includes some cleanups
and refactoring.

Rebase.

Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a
2012-11-28 16:21:12 -08:00
Yaowu Xu
b2f27d909a Merge "remove the vp9_default_mode_contexts_a" into experimental 2012-11-28 13:56:42 -08:00
Yaowu Xu
1cc5739669 remove the vp9_default_mode_contexts_a
Given the way mode_context is updated, the benefit of an additional
default is not signficant.

Change-Id: I67489453e8781340b18e26a1cc2f04e9221004a2
2012-11-28 11:14:30 -08:00
Jim Bankoski
c67873989f fixed includes to be fully specified
Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243
2012-11-28 10:53:17 -08:00
Jim Bankoski
926d95cd84 Merge "remove postproc invokes" into experimental 2012-11-28 10:30:42 -08:00
John Koleszar
00e2c6bf7a Merge "Clamp decoded feature data" into experimental 2012-11-28 10:08:37 -08:00
Jim Bankoski
85cba19e16 remove postproc invokes
and some miscellaneous invoke left overs

Change-Id: I63191b1bfd3bea4ce30cceaeb686ec850570fc43
2012-11-28 10:00:25 -08:00
Yunqing Wang
d202138621 Further improve macroblock loop filters
This change included:
1. Aligned reads in vp9_mbloop_filter_vertical_edge function.
Since we actually read 16 bytes, we can align the reads to read
starting at (s - 8) instead of (s - 5).
2. Combined u, v loop filters.
3. Added 8x16 transpose.

This gave 2% decoder performance gain (tulip clip).

Change-Id: Ib14c2f1645c4a3436df17fe2f24789506bf0bb58
2012-11-28 09:27:07 -08:00
Yaowu Xu
12da793d00 removed redundant mode_context data structures
This commit removed a couple of redundant data structures in frame
coding contextsm, mode_context and mode_context_a, and changed to
use vp9_mode_contexts only. The switch of the context for different
frame type now relies on the switch of frame coding context between
lfc and lfc_a. This commit also removed a number of memcpy among
these redundant data structure.

Change-Id: I42e8174bd60f466b0860afc44c1263896471b0f3
2012-11-28 09:24:30 -08:00
John Koleszar
a1f15814be Clamp decoded feature data
Not all segment feature data elements are full-range powers of two, so
there are values that can be encoded that are invalid. Add a new function
to clamp values to the maximum allowed.

Change-Id: Ie47cb80ef2d54292e6b8db9f699c57214a915bc4
2012-11-27 16:38:31 -08:00
John Koleszar
fcccbcbb39 Add vp9_ prefix to all vp9 files
Support for gyp which doesn't support multiple objects in the same
static library having the same basename.

Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
2012-11-27 14:12:30 -08:00
Yunqing Wang
e7cd80718b Improve sad3x16 SSE2 function
Vp9_sad3x16_sse2() is heavily called in decoder, in which the
unaligned reads consume lots of cpu cycles. When CONFIG_SUBPELREFMV
is off, the unaligned offset is 1. In this situation,
we can adjust the src_ptr to be 4-byte aligned, and then do the
aligned reads. This reduced the reading time significantly. Tests
on 1080p clip showed over 2% decoder performance gain with
CONFIG_SUBPELREFM off.

Change-Id: I953afe3ac5406107933ef49d0b695eafba9a6507
2012-11-26 09:53:50 -08:00
Jim Bankoski
510557e2eb removed the idct rtcd idct calls
More cleanup to do after this,  but this is a good chunk of removing rtcd.

Change-Id: I551db75e341a0a85c3ad650df1e9a60dc305681a
2012-11-24 19:33:58 -08:00
Jim Bankoski
3338af4109 remove subpixel invoke functions
Removed the rtcd subpixel invoke functions.

Change-Id: I8b7618bd5813333fac66b2817bdf807616e0fb33
2012-11-21 09:16:30 -08:00
Jim Bankoski
4ad2f08c72 Merge "clean out some of the rtcd code." into experimental 2012-11-21 06:41:37 -08:00
John Koleszar
414f68d266 Merge "Pack invisible frames without lengths" into experimental 2012-11-20 17:22:50 -08:00
Yunqing Wang
bbe5e032a4 Fix ref_stride in sad function
Used ref_stride.

Change-Id: I31f0a3bb935520f54d11a1d87315627f162ae845
2012-11-20 10:01:20 -08:00
Jim Bankoski
f4871b6a3f clean out some of the rtcd code.
This removes functions that are no longer needed and cleans up some warnings.

Change-Id: I292a4c3694e9c1d68ce99cea390905b198434719
2012-11-18 12:33:18 -08:00
Jim Bankoski
cb98b83239 removal of temporal invoke
Change-Id: I18ca713b02a5241bdb20dddcde0216467b55b596
2012-11-17 06:11:01 -08:00
Yunqing Wang
0eb5590425 Merge "Add const before the dequant(dq)" into experimental 2012-11-16 12:35:17 -08:00
Yunqing Wang
4c7c15ee69 Merge "Optimize 8x8 dequant and idct" into experimental 2012-11-16 12:23:06 -08:00
Yunqing Wang
47d9d48fa4 Add const before the dequant(dq)
Modified code to use const before dq.

Change-Id: I6fa59c2ed9743ded33ad08df70e15c2fe1ae7b99
2012-11-16 12:13:13 -08:00
Ronald S. Bultje
5b11052ac1 Support 32x32 intra modes in non-keyframe superblocks.
Change-Id: Icf8ad313c543462e523bff89690e5daa8d49bcc0
2012-11-16 09:54:43 -08:00
Paul Wilkins
a57dbd957b Further experimentation with the mode context
Experiments with a larger set of contexts and some
clean up to replace magic numbers regarding the
number of contexts.

The starting values and rate of backwards adaption
are still suspect and based on a small set of tests.
Added forwards adjustment of probabilities.

The net result of adding the new context and forward
update is small compared to the old context from the
legacy find_near function.  (down a little on derf but
up by a similar amount for HD)

HOWEVER.... with the new context and forward update
the impact of disabling the reverse update (which may be
necessary in some use cases to facilitate parallel decoding)
is hugely reduced.

For the old context without forward update, the impact of
turning off reverse update (Experiment was with SB off) was
Derf - 0.9, Yt -1.89, ythd -2.75 and sthd -8.35. The impact was
mainly at low data rates.

With the new context and forward update enabled the impact
for all the test sets was no more than 0.5-1% (again most at
the low end).

Change-Id: Ic751b414c8ce7f7f3ebc6f19a741d774d2b4b556
2012-11-16 16:58:00 +00:00
Deb Mukherjee
cb2d06ceac Merge "Compound inter-intra experiment" into experimental 2012-11-16 08:30:34 -08:00
Yaowu Xu
170305dcd3 Merge "changed mv candidate search for superblocks" into experimental 2012-11-16 07:21:55 -08:00
Yaowu Xu
415e6bff4d changed mv candidate search for superblocks
added additional motion vectors at close neighborhood of a superblock
to the list of candiate motion vectors, and removed a couple that are
further away.

The change helped std-hd set about .8% (all metrics) and smaller gain
for derf set.

Change-Id: Iaa69b98614db43420ed3fd4738d0ca5587b90045
2012-11-16 07:01:13 -08:00
Deb Mukherjee
0c917fc975 Compound inter-intra experiment
A patch on compound inter-intra prediction.

In compound inter-intra prediction, a new predictor for
16x16 inter coded MBs are obtained by combining a single
inter predictor with a 16x16 intra predictor, in a manner
that the weight varies with distance from the top/left
boundary. The current search strategy is to combine the best
inter mode with the best intra mode obtained independently.

Results so far:

derf +0.31%
yt +0.32%
std-hd +0.35%
hd +0.42%

It is conceivable that the results would improve somewhat
with a more thorough search strategy where all intra modes
are searched given the best mv, or even a joint search for
the best mv and the best intra mode.

Change-Id: I7951f1ed0d6eb31ca32ac24d120f1585bcd8d79b
2012-11-16 06:56:29 -08:00