Commit Graph

3628 Commits

Author SHA1 Message Date
Deb Mukherjee
210dc5b2db Further improvements on the hybrid dwt/dct expt
Modifies the scanning pattern and uses a floating point 16x16
dct implementation for now to handle scaling better.
Also experiments are in progress with 2/6 and 9/7 wavelets.

Results have improved to within ~0.25% of 32x32 dct for std-hd
and about 0.03% for derf. This difference can probably be bridged by
re-optimizing the entropy stats for these transforms. Currently
the stats used are common between 32x32 dct and dwt/dct.

Experiments are in progress with various scan pattern - wavelet
combinations.

Ideally the subbands should be tokenized separately, and an
experiment will be condcuted next on that.

Change-Id: Ia9cbfc2d63cb7a47e562b2cd9341caf962bcc110
2012-12-13 10:37:49 -08:00
Ronald S. Bultje
f4608e3606 Merge "New default coefficient/band probabilities." into experimental 2012-12-13 09:56:50 -08:00
Ronald S. Bultje
5a5df19de3 New default coefficient/band probabilities.
Gives 0.5-0.6% improvement on derf and stdhd, and 1.1% on hd. The
old tables basically derive from times that we had only 4x4 or
only 4x4 and 8x8 DCTs.

Note that some values are filled with 128, because e.g. ADST ever
only occurs as Y-with-DC, as does 32x32; 16x16 ever only occurs
as Y-with-DC or as UV (as complement of 32x32 Y); and 8x8 Y2 ever
only has 4 coefficients max. If preferred, I can add values of
other tables in their place (e.g. use 4x4 2nd order high-frequency
probabilities for 8x8 2nd order), so that they make at least some
sense if we ever implement a larger 2nd order transform for the
8x8 DCT (etc.), please let me know

Change-Id: I917db356f2aff8865f528eb873c56ef43aa5ce22
2012-12-12 16:23:57 -08:00
Scott LaVarnway
b575394e21 Improved vp9_ihtllm_c
As suggested by Yaowu, we can use eob to reduce the complexity
of the vp9_ihtllm_c function.  For the 1080p test clip used, the decoder
performance improved by 17%.

Change-Id: I32486f2f06f9b8f60467d2a574209aa3a3daa435
2012-12-12 15:49:39 -08:00
Ronald S. Bultje
39de1e14ed Merge "Consistently use get_prob(), clip_prob() and newly added clip_pixel()." into experimental 2012-12-12 10:34:14 -08:00
Ronald S. Bultje
4d0ec7aacd Consistently use get_prob(), clip_prob() and newly added clip_pixel().
Add a function clip_pixel() to clip a pixel value to the [0,255] range
of allowed values, and use this where-ever appropriate (e.g. prediction,
reconstruction). Likewise, consistently use the recently added function
clip_prob(), which calculates a binary probability in the [1,255] range.
If possible, try to use get_prob() or its sister get_binary_prob() to
calculate binary probabilities, for consistency.

Since in some places, this means that binary probability calculations
are changed (we use {255,256}*count0/(total) in a range of places,
and all of these are now changed to use 256*count0+(total>>1)/total),
this changes the encoding result, so this patch warrants some extensive
testing.

Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628
2012-12-12 10:01:19 -08:00
Yaowu Xu
0c35b27689 Merge "clean up tokenize_b() and stuff_b()" into experimental 2012-12-11 13:51:56 -08:00
Yaowu Xu
899f0fc126 clean up tokenize_b() and stuff_b()
Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5
2012-12-11 13:32:16 -08:00
Johann
7a09f6b892 Revert "Upstream build bug for chromium"
This reverts commit 8bb82fded5.

This is an incorrect workaround. It has been fixed in the GYP files
upstream.

Change-Id: If42f997747ce878b874508fdf7ae5a73a6fa1b2b
2012-12-11 11:49:18 -08:00
Scott LaVarnway
fd671152bc Merge "Bug fix: use correct count_mb_ref_frame_usage" 2012-12-11 11:00:53 -08:00
Yaowu Xu
6b380c0cfa Merge "experiment with CONTEXT conversion" into experimental 2012-12-11 09:46:36 -08:00
Scott LaVarnway
57e12be283 Bug fix: use correct count_mb_ref_frame_usage
Change-Id: I9702f3e9ed664c2537e7874698c944620b07fff8
2012-12-10 17:38:55 -08:00
Scott LaVarnway
a0ad16e203 Moved error_bins to macroblock struct
Change-Id: Ic9956ddf1c2ddffcf7be7fdfc23ad9a2426fc47a
WIP: Fixing unsafe threading in VP8 encoder.
2012-12-10 17:32:58 -08:00
Frank Galligan
b192d99f73 Fix ads2gas script to look for ALIGN as a word.
Change-Id: I4efc4f4e87e8666b69257de82c5c5dd4aadee28c
2012-12-10 16:30:32 -08:00
Scott LaVarnway
2cd48bdc92 Merge "Moved zbin_mode_boost to macroblock struct" 2012-12-10 16:22:57 -08:00
Scott LaVarnway
cc91d655e4 Update correct macroblock quantize_b function ptrs
WIP: Fixing unsafe threading in VP8 encoder.
Use the passed in macroblock instead of the macroblock located in
cpi.

Change-Id: I1bfa07de6ea463f2baeaae1bae5d950691bc4afc
2012-12-10 15:23:11 -08:00
Scott LaVarnway
74efda4bd6 Moved zbin_mode_boost to macroblock struct
Fixing unsafe threading in VP8 encoder.

Change-Id: Ibf4c89a2043654834747811bc11eb283de0bb830
2012-12-10 12:42:24 -08:00
John Koleszar
d984763867 configure: add --enable-external-build support
First attempt at avoiding all the compile-time environment detection for
cases where you can generate the environments statically, as when the
real build is being performed by another build system.

Change-Id: Ie3cf95d71d6c5169900f31e263b84bc123cdf73f
2012-12-10 12:39:55 -08:00
Deb Mukherjee
f09c4cde85 Merge "A bug fix related to switchable filters" into experimental 2012-12-10 12:28:06 -08:00
Deb Mukherjee
14a38a8735 A bug fix related to switchable filters
The switchable count update was mistakenly inside a macro.

Change-Id: Iec04c52ad57034b88312dbaf05eee1f47ce265b3
2012-12-10 12:10:36 -08:00
Scott LaVarnway
3a19eebe4d Moved zbin_over_quant to macroblock struct
Change-Id: I76fe20ade099573997404b8733cf7f79e82fb21e
WIP: Fixing unsafe threading in VP8 encoder.
2012-12-10 10:51:42 -08:00
Paul Wilkins
d124465975 Further changes to mv reference code.
Some further changes and refactoring of mv
reference code and selection of center point for
searches. Mainly relates to not passing so many
different local copies of things around.

Some place holder comments.

Change-Id: I309f10ffe9a9cde7663e7eae19eb594371c8d055
2012-12-10 17:31:51 +00:00
John Koleszar
d1356faeb8 Merge remote-tracking branch 'origin/vp9-preview' into experimental 2012-12-07 17:26:31 -08:00
Yaowu Xu
ab480cede5 experiment with CONTEXT conversion
This commit changed the ENTROPY_CONTEXT conversion between MBs that
have different transform sizes.

In additioin, this commit also did a number of cleanup/bug fix:
1. removed duplicate function vp9_fix_contexts() and changed to use
vp8_reset_mb_token_contexts() for both encoder and decoder
2. fixed a bug in stuff_mb_16x16 where wrong context was used for
the UV.
3. changed reset all context to 0 if a MB is skipped to simplify the
logic.

Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c
2012-12-07 17:25:45 -08:00
John Koleszar
6f014dc5ad libvpx_test: ensure rtcd init functions are called
In addition to allowing tests to use the RTCD-enabled functions (perhaps transitively)
without having run a full encode/decode test yet, this fixes a linking issue with
Apple's G++ whereby the Common symbols (the function pointers themselves) wouldn't
be resolved. Fixing this linking issue is the primary impetus for this patch, as none
of the tests exercise the RTCD functionality except through the main API.

Change-Id: I12aed91ca37a707e5309aa6cb9c38a649c06bc6a
2012-12-07 17:21:53 -08:00
Jim Bankoski
fccebcba57 Merge "Fix implicit cast." into vp9-preview 2012-12-07 17:16:01 -08:00
Jim Bankoski
26a4918282 Merge "Fix meaninglesss if." into vp9-preview 2012-12-07 17:15:52 -08:00
Ronald S. Bultje
fbf052df42 Clean up 4x4 coefficient decoding code.
Don't use vp9_decode_coefs_4x4() for 2nd order DC or luma blocks. The
code introduces some overhead which is unnecessary for these cases.
Also, remove variable declarations that are only used once, remove
magic offsets into the coefficient buffer (use xd->block[i].qcoeff
instead of xd->qcoeff + magic_offset), and fix a few Google Style
Guide violations.

Change-Id: I0ae653fd80ca7f1e4bccd87ecef95ddfff8f28b4
2012-12-07 16:27:07 -08:00
Ronald S. Bultje
885cf816eb Introduce vp9_coeff_probs/counts/stats/accum types.
Use these, instead of the 4/5-dimensional arrays, to hold statistics,
counts, accumulations and probabilities for coefficient tokens. This
commit also re-allows ENTROPY_STATS to compile.

Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5
2012-12-07 16:09:59 -08:00
Frank Galligan
1c0ee77589 Fix meaninglesss if.
Change-Id: I0cb06d77805246fe39d39ad3bc5df3c3f52c7050
2012-12-07 15:44:39 -08:00
Frank Galligan
8d449ce0a9 Remove unused symbols from vp9 asm offsets C files.
Change-Id: I366e6d175da3012f1c8607fd7fad99fbbb616091
2012-12-07 15:38:40 -08:00
Frank Galligan
eec0bc4f1e Fix implicit cast.
Change-Id: I1eb7433061a6c529471026e0ebdc6467942062eb
2012-12-07 15:25:44 -08:00
Ronald S. Bultje
c456b35fdf 32x32 transform for superblocks.
This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
code all over the place to wrap that in the bitstream/encoder/decoder/RD.

Some implementation notes (these probably need careful review):
- token range is extended by 1 bit, since the value range out of this
  transform is [-16384,16383].
- the coefficients coming out of the FDCT are manually scaled back by
  1 bit, or else they won't fit in int16_t (they are 17 bits). Because
  of this, the RD error scoring does not right-shift the MSE score by
  two (unlike for 4x4/8x8/16x16).
- to compensate for this loss in precision, the quantizer is halved
  also. This is currently a little hacky.
- FDCT and IDCT is double-only right now. Needs a fixed-point impl.
- There are no default probabilities for the 32x32 transform yet; I'm
  simply using the 16x16 luma ones. A future commit will add newly
  generated probabilities for all transforms.
- No ADST version. I don't think we'll add one for this level; if an
  ADST is desired, transform-size selection can scale back to 16x16
  or lower, and use an ADST at that level.

Additional notes specific to Debargha's DWT/DCT hybrid:
- coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
  block than for the rest (DWT pixel differences) of the block. Therefore,
  RD error scoring isn't easily scalable between coefficient and pixel
  domain. Thus, unfortunately, we need to compute the RD distortion in
  the pixel domain until we figure out how to scale these appropriately.

Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
2012-12-07 14:45:05 -08:00
Scott LaVarnway
000c8414b5 Moved denoiser frame copy/updates out of loopfilter thread
The loopfilter thread from the previous frame can be running while
starting the current frame.  cpi->Source will change during this time causing
the wrong data to be copied.  The refresh_x_frame flags also change, which
will cause incorrect updates of the denoised buffers.

Change-Id: I7d982b4fcb40a0610801332aa85f3b792c64e4c3
2012-12-07 12:19:52 -08:00
Scott LaVarnway
bc10eab41b Merge "added work buffer for denoiser" 2012-12-06 15:27:54 -08:00
John Koleszar
434336b072 libvpx_test: ensure rtcd init functions are called
In addition to allowing tests to use the RTCD-enabled functions (perhaps transitively)
without having run a full encode/decode test yet, this fixes a linking issue with
Apple's G++ whereby the Common symbols (the function pointers themselves) wouldn't
be resolved. Fixing this linking issue is the primary impetus for this patch, as none
of the tests exercise the RTCD functionality except through the main API.

Change-Id: I12aed91ca37a707e5309aa6cb9c38a649c06bc6a
2012-12-06 14:02:36 -08:00
Scott LaVarnway
ef2248a2a3 added work buffer for denoiser
The denoiser was writing to LAST_FRAME buffer.   If LAST_FRAME isn't being
updated,  the reference frame buffers were out of sync between the encoder and the
denoised raw buffers. This patch resolves the discrepancy by always writing to a work
buffer (INTRA_FRAME) and then copying from that buffer to any buffers that needs to
be updated.

Change-Id: I6dd855b9749978b542bc3d515914d5f16faf25df
2012-12-05 19:09:05 -08:00
Johann
a36d9a4a15 Move vp8_scale_frame to vpx namespace
Change-Id: I92d613e89c8f1174eca0789116120bfa20c25c28
2012-12-05 16:05:46 -08:00
Johann
11a84b25b5 Remove last duck_ functions
Change-Id: I5fbcd2006d05bfe841f3c7af9c1aeb2cb83b3149
2012-12-05 16:05:45 -08:00
Johann
1009f76566 Use 'vpx_scale' consistently
Change-Id: I178352813d2b8702d081caf405de9dbad9af2cc3
2012-12-05 16:05:44 -08:00
Adrian Grange
9a3de881c0 Disable background update on non-base layer frames
Multi-threaded code was not updated to disable background
refresh for non base-layer frames at the time it was
disabled in the main C-code.

Change-Id: Id6cc376130b7def046942121cfd0526b4f0a71d4
2012-12-05 13:24:52 -08:00
Paul Wilkins
7405040142 Merge "Change to MV reference search." into experimental 2012-12-05 09:14:46 -08:00
John Koleszar
d345e65d3c Merge remote-tracking branch 'origin/vp9-preview' into experimental 2012-12-05 09:01:57 -08:00
Johann
52d350febf Begin to refactor vpx_scale usage in VP9
Only declare the functions in vpx_scale RTCD and include the relevant
header.

Remove unused files and functions in vpx_scale to avoid wasting time
renaming. vpx_scale/win32/scaleopt.c contains functions which have not
been called in a long time but are potentially optimized.

The 'vp8' functions have not been renamed yet. That is for after the
cleanup.

Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c
2012-12-05 08:59:40 -08:00
Johann
a905672906 Remove ARM optimizations from VP9
Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b
2012-12-05 08:59:25 -08:00
Johann
4a9b95470c Update ARM for vpx_scale changes
Refactor asm_offsets for vpx_scale.

Change-Id: I2db0eeb28c8e757bd033c6614a1e5319a1a204a5
2012-12-05 08:59:17 -08:00
Scott LaVarnway
f2b36a4de7 Removed check_gf_quality()
and various unused members in VP8_COMP along with other
code cleanups.

Change-Id: I56c6c0a77a51f5ac5cbd6071017bcbfd2623b7df
2012-12-05 08:56:42 -08:00
John Koleszar
5d91a1e0ae Merge remote-tracking branch 'origin/vp9-preview' into experimental 2012-12-05 08:41:35 -08:00
John Koleszar
4a4d2aa55c vp9_bilinear_filters_mmx: add missing extern specifiers
Change-Id: Ibabf18947f90cb4f45052763ebf44cfb8209bd8b
2012-12-05 08:27:48 -08:00
Paul Wilkins
4cc657ec6e Change to MV reference search.
This patch reduces the cpu cost of the MV ref
search by only allowing insert for candidates
that would be in the current top 4.

This could alter the outcome and slightly favors
near candidates which are tested first but also
limits the worst case loop count to 4 and means in
many cases it will drop out and not happen.

Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113
2012-12-05 14:03:45 +00:00