Commit Graph

1326 Commits

Author SHA1 Message Date
Paul Wilkins
b6f02c8592 Code Simplification
Removal of code relating to token partitioning

Change-Id: Iaf3c88d6758639a55bd92c3be5c51e6bed407a3c
2012-02-28 17:55:42 +00:00
Yaowu Xu
eb87b56eab fixed a wrong intialization value
The "update" variable was used as a flag in coef_prob update dry run
that tests if a frame should encodes update at all. The wrong init
value forced the update happening always. fixing this has a minor
improvement in low bit rate situation when 8x8 transform is allowed.

Change-Id: Icb498e8d6a62fd074dcbc2065b797cba9237cb51
2012-02-28 09:10:34 -08:00
Paul Wilkins
3cdd0a8e75 Merge "Corrected spelling" into experimental 2012-02-28 02:07:49 +00:00
Paul Wilkins
b00ed02a16 Corrected spelling
Apparently the correct spelling of segement is segment !

Change-Id: I88593ee0523f251b3a96794c6166ef8c7898a029
2012-02-27 21:42:36 +00:00
Paul Wilkins
88a867c6dd Merge "Code Cleanup." into experimental 2012-02-27 20:50:21 +00:00
Paul Wilkins
2e9d7d647a Merge "Removal of temporal re sampling code." into experimental 2012-02-27 20:50:01 +00:00
Paul Wilkins
46ab54abf8 Merge "Code Simplification." into experimental 2012-02-27 17:58:57 +00:00
Paul Wilkins
d90b1ee16c Merge "Further code simplification and clean up." into experimental 2012-02-27 17:58:12 +00:00
Paul Wilkins
646e62211e Code Cleanup.
Removal of error_resilient_mode features.
The interface has been left in place but does nothing.

Change-Id: I2407863bd0d3c98407354507423ca48d29f63b17
2012-02-26 01:15:47 +00:00
Paul Wilkins
80b873e318 Removal of temporal re sampling code.
For now the interface elements have been left in place
to make sure existing parameter files work but parameters
relating to drop frame wont do anything.

Change-Id: I579ee614726387381c546845dac4bc03c74c6a07
2012-02-25 18:13:57 +00:00
Deb Mukherjee
88b36eb0d9 Bug fix in ssse3 variance computation.
Fixes a bug that was introduced in the high precision mv patch.

Change-Id: Ieadb433ebe4c3ef3e0e63944dab11528bf8bd73a
2012-02-24 20:24:54 -08:00
Paul Wilkins
69e80a028c Code Simplification.
Removal of code relating to spatial re sampling

Change-Id: Iff1bc651c62cd528f960c4b27f9673b172e68835
2012-02-24 23:58:24 +00:00
Paul Wilkins
3cc5b92c65 Further code simplification and clean up.
Change-Id: Ifdb17b56090a317b2aa82cf125d57934902c5298
2012-02-24 23:38:36 +00:00
Paul Wilkins
583f2d8fc7 Deleted code.
Removed redundant code for ref frame cost.
2012-02-24 02:16:53 +00:00
Deb Mukherjee
18e90d744e Supporting high precision 1/8-pel motion vectors
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.

Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpoaltion)

Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V

Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).

Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16 th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.

With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over thei
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html

Results on high precision mv and on the hd set are to follow.

Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c

Patch 7: Cleaning up various debug messages.

Patch 8: Merge conflict

Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
2012-02-23 09:25:21 -08:00
Yaowu Xu
3c872b6c27 Merge "Fixed skippable evaluation in mode decision" into experimental 2012-02-22 17:13:04 +00:00
Yaowu Xu
0f430084e0 Merge "Reduced bias in picking loop filter level" into experimental 2012-02-22 17:12:52 +00:00
Yaowu Xu
7670933386 Merge "a bit code clean-up" into experimental 2012-02-22 15:55:53 +00:00
Yaowu Xu
c54bfcb6f0 Merge "Reworked context conversion between 8x8 and 4x4" into experimental 2012-02-22 15:55:37 +00:00
Yaowu Xu
2b4cd4cc01 Fixed skippable evaluation in mode decision
Yunqing fixed an oddity in UVIntra skippable evaluation for stable
branch, which brought up the fact that the evaluation is broken.
The issue was that for MBs with 2nd order block, the eob for 1st
order blocks is set at 1. The previous evaluation did not take that
into account. This commit intend to fix the problem. The commit also
absorbed Yunqing's fix for UVIntra skippable evalution.

Test on hd showed some good gains in combination with LPF bias fix:
http://www.corp.google.com/~yaowu/no_crawl/LPFBias_FixSkip.html
(avg psnr: .34%, glb psnr: .32%, ssim: .22%)

Change-Id: I36af11c8ef7f643e8ff46da7bf3a167b437039d4
2012-02-22 06:49:13 -08:00
Yaowu Xu
737179f275 Reduced bias in picking loop filter level
The bias in picklpf intended to bias toward less greedy in getting
best frame level psnr while maximize overall quality for a clip.
This commit reduced the bias for frames using 8x8 transform to
achieve better compression overall.

The change improve compression by ~.15% consistently on most of the
HD clips tested.

http://www.corp.google.com/~yaowu/no_crawl/LPFBias_FixSkip.html

Change-Id: Ic30932d2b8eaebd52339b0195f569edc48eed7bc
2012-02-17 16:44:08 -08:00
Paul Wilkins
4cfb8ed4c9 Code base simplification.
Removal of most code to do with 1 pass.
Removal of cyclic refresh code.

Change-Id: I74971082bc19dd76e795d4d2e781a0424cec5c8c
2012-02-17 16:29:03 +00:00
Yaowu Xu
47d545f166 a bit code clean-up
Removed some transform code that is not in use.

Change-Id: I9489af7e23d9d7fe052feb6c8bbafa62ebbda39c
2012-02-16 15:15:06 -08:00
Yaowu Xu
b92a96d8ad Reworked context conversion between 8x8 and 4x4
The commit rationized and simplified the entropy context conversion
betwen MB using 8x8 transform and MB using 4x4 transform. The old version
had a number of weirdness in how 4x4 transform MB's context is used for
8x8 blocks other than the first 8x8 within a MB.

Test showed the change has a gain ~.1% for avg psnr, glb psnr and ssim on
the limited HD set.

Change-Id: I774536c416baa6845aa741f956d8a69fa40e5d47
2012-02-16 15:00:10 -08:00
Paul Wilkins
79d330d7d5 Code simplification
Removal of the pickinter.c and .h files and calls to this
code.

Removal of some code relating to real time and one pass
settings  though there is more to be done in this regard.

However,  vp8_set_speed_features() now
only supports modes 0 and 1 and speeds up to 3
so rd should always be set.

Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8
2012-02-16 17:21:20 +00:00
Yaowu Xu
f90983e167 revised the rate distortion computation for UV
this commit changed the UV r/d calculation in the mode decision process to
properly account for the rate of 8x8 transform coefficients.

Change-Id: I485f8f35f2b61db0b6539beb32e83481b1cf083b
2012-02-16 07:34:46 -08:00
Yaowu Xu
8b71f3e059 Merge "revised the rate distortion computation for UV" into experimental 2012-02-16 15:34:38 +00:00
Yaowu Xu
bc3dd313ef Merge "optmized rounding for transforms" into experimental 2012-02-16 15:33:01 +00:00
Yaowu Xu
a96bf2038c Merge "re-scaled 2nd order haar transform" into experimental 2012-02-16 15:32:33 +00:00
Yaowu Xu
a78a4b4551 Merge "moved scaling from dequantization to inverse transform for T8x8" into experimental 2012-02-16 15:32:17 +00:00
Yaowu Xu
efa9abd028 optmized rounding for transforms
the changes are still temporary, the final transforms, especially
inverse ones should take in account both accuracy, complexity, and
sign-bias, which should be decided at a later time.

Change-Id: I116b0c70b25f5ee324ae5713d4564f5d0aa27151
2012-02-16 07:03:57 -08:00
Yaowu Xu
62a78f0342 re-scaled 2nd order haar transform
During the work of extend_qrange, we have rolled a factor of 2 from
quantization/dequatnization into 2nd order walsh-hadamard transform.
This commit does the same for the 2nd order haar transform. so they
can share the same quantizaiton process as the 2nd order WHT.

Change-Id: I734af4a20ea8149a01b5b1971a065092977dfe33
2012-02-16 07:03:56 -08:00
Yaowu Xu
454c7abc1a moved scaling from dequantization to inverse transform for T8x8
Previously, the scaling related to extended quantize range happens in
dequantization stage, which implies the coefficients form forward
transform are in different scale(4x) from dequantization coefficients
This worked fine when there was not distortion computation done based
on 8x8 transform, but it completely wracked the distortion estimation
based on transform coefficients and dequantized transform coefficients
introduced in commit f64725a00 for macroblocks using 8x8 transform.
This commit fixed the issue by moving the scaling into the stage of
inverse 8x8 transform.

TODO: Test&Verify the transform/quantization pipeline accuracy.

Change-Id: Iff77b36a965c2a6b247e59b9c59df93eba5d60e2
2012-02-16 07:03:55 -08:00
Ronald S. Bultje
721711fb51 Remove dual prediction frame re-encoding loop.
I'm basically not convinced that the concept works at all, let alone
that this is the right place to do it. I think if we want something
like this at all, I should integrate it with the main encoding loop
and re-encode checks in onyx_if.c, and show that it has a significant
benefit (which right now, it doesn't; removing this re-encode check
actually increases all metrics by ~0.15%).

Change-Id: I1b597385dc17f468384a994484fb24813389411f
2012-02-15 16:38:04 -08:00
Ronald S. Bultje
0930dde249 Fix overflows in dual prediction mode selection.
Change-Id: I265ad46e01a307bca21e6223725e4055f5e08648
2012-02-15 15:57:49 -08:00
Yaowu Xu
d327dcf3aa moved segment based LPF level selection under CONFIG_FEATUREUPDATES
This commit moved segment based loop filter level selection into
the experiment of CONFIG_FEATUREUPDATES. As previous commit noted,
the segment based loop filter selection helps the compression by
~0.1% on cif set, the ongoing experiment CONFIG_FEATUREUPDATES
made encoding updates of the segment based LPF level more efficient,
hence, another .04% gain on cif set. The commit also fixed an issue
previously where encoder/decoder may use different loop filter level
for one of the segments.

Change-Id: Ia978b14aae95bb107d561ba53a7a2bb6ff01faf3
2012-02-15 07:18:05 -08:00
Yaowu Xu
9b68ad0f30 added 8x8 based Rate estimation for dualpred case
This commmit added logic for MB using dual-pred to compute rate
estimation based on correct transform size. The section of code
was previously located under #if CONFIG_DUALPRED, that was made
to be working with T8x8 experiment at the same time.

Change-Id: Iebc2518c03f11378b9c2e72905520f088b54d5c0
2012-02-14 09:23:21 +00:00
Paul Wilkins
9a8204d6ee Simplification of experimental code base.
Removed ~CONFIG_REALTIME_ONLY code.

Change-Id: I5fafff29a08acd8928699f9ddce8744787024d8c
2012-02-14 09:03:56 +00:00
Jim Bankoski
af8f1928d1 vp8 - config_featureupdates
Added a bit to signify that the feature changed since
the last time we sent it, or not so that we don't need
to send all the databits for every feature change.

added config

Change-Id: I8d3064ce90d4500bf0d5c6b87c664e46138dfcac
2012-02-13 12:31:12 -08:00
Yaowu Xu
2d1ead342c Changed how coefficient probability table is updated
Added a frame level flag to indicate if coef probabilities are updated
at all for the frame.

During the experimental work with 8x8 transform, it is discovered that
even in the case of no probability is ever update, cost of transmitting
"no update" for each of probabilities can run up to become a significant
overhead cost. A single bit to indicate no-update for all coef probs
is therefore helpful, which is also demonstrated by the test results:

1. On Cif set:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_updprob.html
(avg psnr: .14%, glb psnr: .14% SSIM: .13%)

2. On HD set:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_updprob.html
(avg psnr: .02%  glb psnr: .01% SSIM: .02%)
It should be noted that the gain on HD is smaller because the average bit
rate is much higher in contrast to the overhead bit cost.

Change-Id: I46db270e693ee8799fef34a14d8260868ce4cd16
2012-02-13 13:20:02 +00:00
Yaowu Xu
9ded6e375a fixed an issue related to 2nd order size due to merge artifacts.
For 8x8 transformed macroblock, the 2nd order transform is a 2x2 haar
transform, here there is only 4 coefficients total. A previous merge
changed these to 64, causing crashes when encoding with 8x8 transform
enabled. (i.e. when input video image size > 640x360 ) This commit
reverts them back to 4 and fixes the crashes.

Change-Id: I3290b81f8c0d32c7efec03093a61ea57736c0550
2012-02-10 11:49:22 -08:00
Paul Wilkins
2615ca5d41 Removal of threading code.
For the experimental branch we are trying to slim the codebase
down removing features such as threading for now which complicate
the process of development and testing.

Change-Id: I657c0246aef4d1fa8c8ffc6a1adfeee45bce8e24
2012-02-10 16:23:59 +00:00
Ronald S. Bultje
f64725a009 Improved coding using 8x8 transform
In summary, this commit encompasses a series of changes in attempt to
improve the 8x8 transform based coding to help overall compression
quality, please refer to the detailed commit history below for what
are the rationale underly the series of changes:

a. A frame level flag to indicate if 8x8 transform is used at all.
b. 8x8 transform is not used for key frames and small image size.
c. On inter coded frame, macroblocks using modes B_PRED, SPLIT_MV
and I8X8_PRED are forced to using 4x4 transform based coding, the
rest uses 8x8 transform based coding.
d. Encoder and decoder has the same assumption on the relationship
between prediction modes and transform size, therefore no signaling
is encoded in bitstream.
e. Mode decision process now calculate the rate and distortion scores
using their respective transforms.

Overall test results:
1. HD set
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120206.html
(avg psnr: 3.09% glb psnr: 3.22%, ssim: 3.90%)
2. Cif set:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120206.html
(avg psnr: -0.03%, glb psnr: -0.02%, ssim: -0.04%)
It should be noted here, as 8x8 transform coding itself is disabled
for cif size clips, the 0.03% loss is purely from the 1 bit/frame
flag overhead on if 8x8 transform is used or not for the frame.

---patch history for future reference---
Patch 1:
this commit tries to select transform size based on macroblock
prediction mode. If the size of a prediction mode is 16x16, then
the macroblock is forced to use 8x8 transform. If the prediction
mode is B_PRED, SPLITMV or I8X8_PRED, then the macroblock is forced
to use 4x4 transform. Tests on the following HD clips showed mixed
results: (all hd clips only used first 100 frames in the test)

http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8.html
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_log.html

while the results are mixed and overall negative, it is interesting to
see 8x8 helped a few of the clips.

Patch 2:
this patch tries to hard-wire selection of transform size based on
prediction modes without using segmentation to signal the transform size.
encoder and decoder both takes the same assumption that all macroblocks
use 8x8 transform except when prediciton mode is B_PRED, I8X8_PRED or
SPLITMV. Test results are as follows:

http://www.corp.google.com/~yaowu/no_crawl/t8x8/cifmodebase8x8_0125.html
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_0125log.html

Interestingly, by removing the overhead or coding the segmentation, the
results on this limited HD set have turn positive on average.

Patch 3:
this patch disabled the usage of 8x8 transform on key frames, and kept the
logic from patch 2 for inter frames only. test results on HD set turned
decidedly positive with 8x8 transform enabled on inter frame with 16x16
prediction modes: (avg psnr: .81% glb psnr: .82 ssim: .55%)

http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdintermode8x8_0125.html
results on cif set still negative overall

Patch 4:
continued from last patch, but now in mode decision process, the rate and
distortion estimates are computed based on 8x8 transform results for MBs
with modes associated with 8x8 transform. This patch also fixed a problem
related to segment based eob coding when 8x8 transform is used. The patch
significantly improved the results on HD clips:

http://www.corp.google.com/~yaowu/no_crawl/t8x8/hd8x8RDintermode.html
(avg psnr: 2.70% glb psnr: 2.76% ssim: 3.34%)
results on cif also improved, though they are still negative compared to
baseline that uses 4x4 transform only:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif8x8RDintermode.html
(avg psnr: -.78% glb psnr: -.86% ssim: -.19%)

Patch 5:
This patch does 3 things:
a. a bunch of decoder bug fixes, encodings and decodings were verified
to have matched recon buffer on a number of encodes on cif size mobile and
hd version of _pedestrian.
b. the patch further improved the rate distortion calculation of MBS that
use 8x8 transform. This provided some further gain on compression.
c. the patch also got the experimental work SEG_LVL_EOB to work with 8x8
transformed macroblock, test results indicates it improves the cif set
but hurt the HD set slightly.

Tests results on HD clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120201.html
(avg psnr: 3.19% glb psnr: 3.30% ssim: 3.93%)

Test results on cif clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120201.html
(avg psnr: -.47% glb psnr: -.51% ssim: +.28%)

Patch 6:
Added a frame level flag to indicate if 8x8 transform is allowed at all.
temporarily the decision is based on frame size, can be optimized later
one. This get the cif results to basically unchanged, with one bit per
frame overhead on both cif and hd clips.

Patch 8:
Rebase and Merge to head by PGW.
Fixed some suspect 4s that look like hey should be 64s in regard
to segmented EOB. Perhaps #defines would be bette.
Bulit and tested without T8x8 enabled and produces unchanged
output.

Patch 9:
Corrected misalligned code/decode of "txfm_mode" bit.
Limited testing for correct encode and decode with
T8x8 configured on derf clips.

Change-Id: I156e1405d25f81579d579dff8ab9af53944ec49c
2012-02-10 14:23:27 +00:00
Ronald S. Bultje
e3ca23a361 Reindent some code after merging the dualpred experiment.
Change-Id: Idb328dd29ebcd360e39886abe48694f90f2e1140
2012-02-09 16:29:22 -08:00
Ronald S. Bultje
29e4d7e861 Merge dualpred (compound prediction) experiment.
Change-Id: Ieaaa07c50eae41118596197f6a4d848135946e41
2012-02-09 16:29:18 -08:00
Paul Wilkins
d90f0eb4c5 Removal of SEGFEATURES placeholder comments
This commit only involves the removal of placeholder comments
//#if CONFIG_SEGFEATURES.

Change-Id: I94b350daaf998ee0cfdde5aa25b1d3b0522ab816
2012-02-09 17:25:05 +00:00
Paul Wilkins
3e9890a394 Merge Extended Q experiment.
Merge the extended Q experiment as indicated by the

Change-Id: I02d9e654fff9998cc7e9e2f1f5cd838dad8fb431
2012-02-09 17:22:34 +00:00
Paul Wilkins
cf8af867dd Merge COMPRED
Merged in most of the current common prediction changes
that were under the #if CONFIG_COMPRED option.

Change-Id: If4e6f61dbe7b86dd449f6effbe93b5eb7e893885
2012-02-09 16:10:46 +00:00
Paul Wilkins
8266abfe96 Dual pred flag
Further changes to make experiments with the context
used for coding the dual pred flag easier.

Current best performing method tested on derf is a two
element context based on reference frame. I also tried
various combinations of mode and reference frame as
shown in commented out case using up to 6 contexts.

Derf +0.26 overall psnr +0.15% ssim vs original method.

Change-Id: I64c21ddec0abbb27feaaeaa1da2e9f164ebaca03
2012-02-09 15:44:18 +00:00
Paul Wilkins
59a200f1ea Changes to coding of dual_pred flag.
Further use of common prediction functions and experiments
with alternate contexts based on mode and reference frame.

For the Derf set using reference frame as basis of context
gives +0.18% Overall Psnr and +0.08 SSIM

Change-Id: Ie7eb76f329f74c9c698614f01ece31de0b6bfc9e
2012-02-09 15:27:20 +00:00
Ronald S. Bultje
915f13bd59 Fix dual prediction recode loop.
We should only change the dual prediction mode if we actually entered
the recode branch. Else, it may potentially undo beneficial changes
to the dual prediction mode in the first encode iteration.

Change-Id: I79fc53e5fd0bb551092ed422c797619f1566f002
2012-02-08 14:55:46 -08:00
Ronald S. Bultje
3adcbe2f15 Remove write-only variable "mbs_dual_count".
Change-Id: Icf7a6749ca2f8ad6a032f86c34540d1c5880cf68
2012-02-08 14:18:02 -08:00
Ronald S. Bultje
c8ec59d858 Fix dual prediction recode loop.
Some conditions were conditional under a threshold, whereas they should
always execute. Also, some conditions were testing an array instead of
the values within it.

Change-Id: Ia6892945cfbbe07322e6af6be42cd864bf9479c1
2012-02-08 10:09:02 +00:00
Paul Wilkins
e1050bd3dc Move update of ref frame probabilities in encode loop.
The existing code updated the reference frame probabilities before
the test to evaluate the impact of using updated probabilities
in vp8_estimate_entropy_savings().

The estimate of cost and savings is still basic and does not reflect
the new prediction code but this would require per MB costings
and the benefit is probably marginal, as this is really just used for
rate estimation in the loop.

Change-Id: Id6ba88ae6e11c273b3159deff70980363ccd8ea1
2012-02-06 16:42:00 +00:00
Paul Wilkins
9c9300f56f Merged NEWNEAR experiment
This commit merges the NEWNEAR experiment such that it
is effectively always on.

The fact that there were changes in the threading code again
highlights the need to strip out such features during the
bitstream development phase as trying to maintain this code
(especially as it is not being tested) slows the development cycle.

Change-Id: I8b34950a1333231ced9928aa11cd6d6459984b65
2012-02-06 16:40:57 +00:00
Paul Wilkins
82b865da94 Coding the hybrid dual prediction signal.
Initial modifications to make limited use of common prediction
functions.

The only functional change thus far is that updates to the probabilities are
no longer "damped". This was a testing convenience but in fact seems to
help by a little over 0.1% over the derf set.

Change-Id: I8b82907d9d6b6a4a075728b60b31ce93392a5f2e
2012-02-06 16:38:41 +00:00
Paul Wilkins
c98e9d2882 Moved prob_dualpred to common.
Moved the prob_dualpred[] sturcture to common.
Created common prediction entry for Dual flag.

Change-Id: I9ac3d128bae6114f09e5c18216d4b95cf36453d5
2012-02-06 16:37:11 +00:00
Paul Wilkins
58ec6fe8c3 Modified prediction behavior for reference frame.
Trial of a modified prediction function that ranks each possible
reference frame based on a combination of local usage and
frame level probability. The code is a bit cleaner and simpler.

In direct comparison with old unpredicted method with segment level
coding turned off for mode,ref & EOB the prediction gives a gain on derf
of around 0.4%. There is some further gain from bug fixes over earlier code.

With segment coding on the prediction method is slightly -ve on some very
easy clips (at low rates) due to slightly higher overheads, but better on harder
clips. Overall neutral on derf in direct comparison on latest code base, but
compared to earlier code without bug fixes about +0.7% overall psnr
+0.3% SSIM.

Change-Id: I5b8474658b208134d352d24f6517f25795490789
2012-02-06 16:34:41 +00:00
Paul Wilkins
f0459549a6 Reference frame prediction:
Extended prediction and coding of reference frame where
a subset of options are flagged as available at the segment level.

Updated copyright notices.

Switch to SAD in mbgraph code as SATD problematic for the
foreground and background separation as it can ignore large DC shifts.

Change-Id: I661dbbb2f94f3ec0f96bb928c1655e5e415a7de1
2012-02-03 12:44:45 +00:00
Adrian Grange
5d0b5a17d9 Added encoding in Superblock Order
As a precursor to encoding 32x32 blocks this cl adds the
ability to encode the frame superblock (=32x32 block) at
a time. Within a SB the 4 indiviual MBs are encoded in
raster-order (NW,NE,SW,SE).

This functionality is added as an experiment which can be
enabled by ispecifying --enable-superblocks in the
command line specified to configure (CONFIG_SUPERBLOCKS
macro in the code).

To make this work I had to disable the two intra
prediction modes that use data from the top-right of the
MB.

On the tests that I have run the results produce
almost exactly the same PSNRs & SSIMs with a very
slightly higher average data rate (and slightly higher
data rate than just disabling the two intra modes in
the original code).

NOTE: This will also break the multi-threaded code.

This replaces the abandoned change:
Iebebe0d1a50ce8c15c79862c537b765a2f67e162

Change-Id: I1bc1a00f236abc1a373c7210d756e25f970fcad8
2012-02-02 10:30:57 -08:00
Paul Wilkins
92ffb17cc1 Comment out segref segmentation filter changes.
Commented out changes from earlier checking:

"Change Iab7f1eff: vpnext use segref segmentation filter"

Which in its current state breaks the decoder.

Change-Id: I9185098aeda8ce65310f338c4c9375f4a39005d3
2012-02-02 12:43:53 +00:00
Adrian Grange
3ff8c7d968 Correctly capped minqtarget to maxq
This line of code incorrectly set maxq = maxq rather than
capping minqtarget.

Change-Id: Ifbc86df8b0ff2779e7b2a5f7349724d04a18bd62
2012-01-31 12:58:34 -08:00
Paul Wilkins
1335ac3071 Implementation of new prediction model for reference frame coding.
This check in uses the common prediction interface functions
to code reference frame.

Some updates made regarding the impact of the new code in rd loop
but there remain TODOs in this regard.

Change-Id: I9da3ed5dfdaa489e0903ab33258b0767a585567f
2012-01-31 12:54:05 +00:00
Paul Wilkins
56904be19d Use common prediction interface for segment coding.
This does not change any functionality just modifies the code to
use the common prediction module interface for coding
the segment data.

Change-Id: Ifd43e9153573365619774a4f5572215e44fb5aa3
2012-01-31 12:53:49 +00:00
Paul Wilkins
048e4fd524 Moved some reference frame data structures into common.
Encoder side changes

Change-Id: I8921800e4fccec5e5a9e4755b80cbd472623107b
2012-01-31 12:52:51 +00:00
Paul Wilkins
fe96afa705 Moved some segmentation data structures.
Moved some segmentation data structures into VP8_COMMON

Change-Id: I59c6e2edf7a0176e35319936eea450027aeb3b39
2012-01-31 12:51:57 +00:00
Jim Bankoski
a127c940c0 vpnext use segref segmentation filter
Goes through set of ref frames used by each macroblock and
sets seg_lvl_ref_frame flags accordingly..

http://www.corp.google.com/~jimbankoski/no_crawl/segref.html

Change-Id: Iab7f1effd75a839b34eb310d7168692c8f105411
2012-01-27 07:53:15 -08:00
Jim Bankoski
1a9fd54ecd vpnext: pick loop filter segment by segment
Picks a per segment loopfilter.  Adapts the algorithm to search for
a loopfilter value for each separate segment.  Further todo fix the
bias

Improvements .06 % ov psnr, .11% ssim
http://www.corp.google.com/~jimbankoski/no_crawl/segmentedpicklpf.html

Change-Id: Ic6a571c16fcd6ec0139f4de1f8061f87c6515a10
2012-01-27 07:48:10 -08:00
Yaowu Xu
9a4dde1b36 Merge "fixed an issue with 8x8 token cost in trellisquant" into experimental 2012-01-27 15:11:18 +00:00
Yaowu Xu
81d16e3f53 fixed an issue with 8x8 token cost in trellisquant
changed the token cost for 8x8 transformed macroblock used in trellisquant
from those derived from 4x4 transform coefficient distribution to those
derived from 8x8 transform coefficient distribution. Test results show
this fix help 8x8 transform based compression consistently on cif and hd
sets:

http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_cost8x8only.html
(avg psnr:.14% glb psnr: .17% ssim: .20%)
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hd_cost8x8only.html
(avg psnr:.17% glb psnr: .18% ssim: .58%)

Note: To test the effect of this change, 8x8 transform was forced to be used
only on 16x16 predicted macroblocks on inter frames, the effect would be
bigger had all macroblocks been forcd to use 8x8 transform.

Change-Id: If9b7868b75357c66541f511e5ee78e4d2d4929a4
2012-01-26 14:50:11 -08:00
Deb Mukherjee
6fa47a5f16 Adds support for enhanced interpolation for subpel motion
using an 8-tap filter.

The results with 3 different 8-tap filters on the derf set are in:
http://www.corp.google.com/~debargha/vp8_results/enhinterp.html
The one that gives the most gain achieves an overall gain of about
0.6%. The results for a set of 12 hd (720p) videos are in:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd.html
with max gain of 0.55% with the same filter. The best filter apparently
achieves the best trade-off between pass band ripple and stop band
attenuation.

Change-Id: I919e28ae245c0493147fa0864f8c9d048a9dd530
2012-01-26 10:24:47 -08:00
Jim Bankoski
91325b8fe7 vpn common -> implicit segmentation
This introduces base functions for introducing implicit segmentation.
The code that actually stores the results to the segment map isn't
here yet.   This just prints out the segmentation map results
if you call it.

Uses connected component labeling technique on mbmi info so that only
if 2 mbs are horizontally or vertically touching do they get the same
segment.

vp8next - plumbing for rotation

code to produce taps for rotation ( tapify. py ),  code
for predicting using rotation ( predict_rotated.c ) ,  code
for finding the best rotation find_rotation.c.

didn't checkin code that uses this in the codec.   still work
in progress.

Fixed copyright notice

Change-Id: I450c13cfa41ab2fcb699f3897760370b4935fdf8
2012-01-24 11:20:13 -08:00
Yaowu Xu
5aab0c3fb7 Added code to prevent I8X8_PRED mode for MBs using 8x8 transform
This fixed a conflict introduced by the change of adding 8x8 intra
prediction modes. The 8x8 intra prediction mode code assumed the
use of 4x4 transform, and causes encoder crashes when the codec is
configured with --enable-t8x8.

Change-Id: I00cc94df63e9725377ffba9eb51be6b77fe3fcf9
2012-01-19 17:09:40 -08:00
Yaowu Xu
be9af16e16 reverted an accidental code deleting
commit cf561bad accidentally deleted a line of code that sets the
base_qindex for each frame, which leads to every frame is encoded
at Q of 0.

Change-Id: Ib5f8022e856bf3b3bd0d4147405e46241e3dcf2d
2012-01-19 16:56:46 -08:00
Paul Wilkins
bd5f384bef Possible divide by 0 error.
Put traps to prevent two possible divide by 0 errors.

Change-Id: Ia415b945244253dcdd12f54f1f157f9ca8c94d6b
2012-01-18 11:10:51 +00:00
Paul Wilkins
cf561bad1d Rate control on static scenes plus Y2dc delta Q fix.
A problem can arise on static clips with force key frames where
attempts to avoid popping lead to a progressive reduction in key
frame Q that ultimately may lead to unexpected overspend against
the  rate target.

The changes in this patch help to insure that in such clips the
quality of the key frames across the clip is more uniform (rather
than starting bad and getting better - especially at low target rates).

This patch also includes a fix that removes a delta on the Y2DC
when the baseline q index < 4 as this is no longer needed.

There is also a fix to try and prevent repeat single step Q adjustment in
the recode loop leading to lots of recodes, especially where the use
of forced skips as part of segmentation has made the impact of Q on
the number of bits generated much smaller.

Patch 2: Amend "last_boosted_qindex" calculation for arf overlay frames.

Change-Id: Ia1feeb79ed8ed014e4239994fcf5e58e68fd9459
2012-01-17 17:42:46 +00:00
Yaowu Xu
483b262bab Merge "Added an emms to prevent invalid stats output" into experimental 2012-01-11 23:07:54 +00:00
Yaowu Xu
a5ea68447f Added an emms to prevent invalid stats output
In certain hardware configuration, where mmx code is enabled and
other simd (sse2/sse3) disabled, lacking of this emms caused invalid
internal stats outputs.

Change-Id: I77c61cf6e0448d3f3b8c11781aa9e42f31d231c9
2012-01-05 13:25:41 -08:00
Christian Duvivier
e4ca542a3b Fix more warnings.
Change-Id: Ifadf65026a11bdb5d39840748613880bcfb364bb
2011-12-22 16:33:06 -08:00
Christian Duvivier
a7eb21760f Fix a couple of warnings. 2011-12-21 15:58:14 -08:00
Paul Wilkins
df4e79f7f7 Extend to 256 Q steps.
This commit extends the number of Q steps to 256 from 128.
The q_trans[] array has been altered to distribute available Q index values
(using the current 64 steps available as input parameters) evenly across the
available range. This is coupled with the fact that each Q step where possible
now equates to a fixed % change in the quantizer. This may want refinement
later especially in terms of the granularity at the high quality end but is a
reasonable starting point.

Change-Id: I2aaa6874fa10ce05c958dd182947ce39f6f1eecb
2011-12-19 09:36:19 +00:00
Paul Wilkins
ec670bc558 QRange experiements.
High Q end extended a little.
Some clean up.

Slightly better on SSIM, Slightly worse on PSNR over derf set.

Change-Id: I3dceea8a39e11c26e1a389a40e40b86efc76d28c
2011-12-19 09:35:10 +00:00
Paul Wilkins
fb807776a2 Further QIndex realted Fixes:
Added code to support 256 index steps instead of 128 but disabled for now.
Replace hard wired table vp8cx_base_skip_false_prob[128]
Observed Qindex problem with setting minimum loop filter value.
(Experiment code using real Q in place but for now just returning 0. This has a big
beneficial effect on some clips, particularly waterfall which shows 5% ssim gain)

Change-Id: I2f7117de8adc1797164c106aa13effc900a1467e
2011-12-19 09:27:19 +00:00
Adrian Grange
b3ade15a26 Fixed stride bug in segmentation code
mode_info_context is padded with an additional column of data, so
mode_info_stride should be used to move between rows rather than
mb_cols.

Change-Id: I598559a2cd9df1c486d64aaeccf76b76a7ecf21c
2011-12-15 12:27:38 -08:00
Adrian Grange
ae63ce248a Fixed bug to use mode_info_stride rather than mb_cols
Both encoder & decoder were using mb_cols to
offset from one row of MODE_INFO structures to the next
when they should have been using mode_info_stride.

Fixing this in both encoder and decoder gives around
a 3KB size saving and 0.025dB PSNR improvement on the one
720P clip I tried.

(Also removed "index" which was being updated but not used)

Change-Id: I413bea802b142886bfcf8d8aa7f5a2f0c524fd4b
2011-12-15 10:00:46 -08:00
Paul Wilkins
ae9023a3c9 QINDEX_RANGE fixed tables.
Removed a couple more fixed tables for the extended quantizer experiment
that depend on QINDEX_RANGE.

Change-Id: I2c15ffc7488c2a2b8d6504e2c4b6b2339799d117
2011-12-12 11:18:57 +00:00
Yaowu Xu
be360d47f4 Enabled adaptive UV intra coding for inter frames
Previously, Y-adaptive UV intra coding only enabled on key frames in
UVINTRA experiment. This commit enabled the same coding for inter
frames, so the encoding of UV intra modes are consistent cross all
frame types. Tests on derf set showed a very small overall gain around
.04%:

http://www.corp.google.com/~yaowu/no_crawl/interUVintra.html

The gain looks to be reasonable given inta coded MBs is only a
small portion of MBs in inter frames.

Change-Id: Ic6fc261923f2c253f4a0c9f8bccf4797557b9e16
2011-12-09 14:44:13 -08:00
Adrian Grange
43a059de71 Merge "Fix out of bounds read in update_mbgraph_frame_stats" into experimental 2011-12-09 21:05:00 +00:00
Adrian Grange
95b4cf059c Fix out of bounds read in update_mbgraph_frame_stats
update_mbgraph_frame_stats used xd->mode_info_context
before it had been setup, resulting in potentially
random accesses of uninitialized memory.

This fix allocates a local MODE_INFO structure to hold
the data generated in the function.

Change-Id: Ic9e75610008ce0e2d690e8e583c21582fee6fc45
2011-12-09 12:47:57 -08:00
Yaowu Xu
ba1a6619b3 Revised coding using adaptive mode context to depend on frame type
A previous commit 76feb965 made the vp8_mode_context adaptive on a frame
frame basis, this commit further made the coding context adaptive to two
frame types separately. Tests on derf set showed a further small gain on
all metrics: avg psnr 0.10%, glb psnr: 0.11%, ssim: 0.08%

http://www.corp.google.com/~yaowu/no_crawl/newNearMode_1209.html

Change-Id: I7b3e32ec8729de1903d14a3f1213f1624b78cdee
2011-12-09 12:13:42 -08:00
Yaowu Xu
ebcc6605c1 fixed a crash caused invalid Q choice
The commit fixed a problem by capping cpi->active_best_quality to be
smaller than cpi->worst_quality.  Also fixed a few line of code that
was misplaced.

Change-Id: Ie908264b72140c669122a0afde5d886619c33474
2011-12-08 07:04:23 -08:00
Yaowu Xu
b70f23caec Removed #if CONFIG_MULCONTEXT
This commit removed the macro CONFIG_MULCONTEXT, which was used to
indicate the experiment code for using separate context for altref
and normal frames. This commit made the change fully merged in.

Change-Id: I525f927f68e2365d37b340ef23b836a136a4f70b
2011-12-07 14:01:07 -08:00
Yaowu Xu
d37cd97682 Removed #if CONFIG_I8X8
This commit removed the macro CONFIG_I8X8, which was used to indicate
the 8x8 intra prediction experiment, made the change fully merged in.

Change-Id: Iafa4443781ce6e83f5591c12ba615a0e92ce0ea0
2011-12-07 13:48:53 -08:00
Yaowu Xu
76feb965d3 made vp8_mode_context adaptive
vp8_mode_contexts[] is an entropy table used to code inter mode
choices. It was a fixed constant table. This commit made the entropy
context adaptive. Tests on derf set showed very good consistent gains
on all metrics: avg psnr .47%, overall psnr .46% and ssim .40%.

http://www.corp.google.com/~yaowu/no_crawl/newModeContext.html

Change-Id: Ia62b14485c948e2b74586118619c5eb2068b43b2
2011-12-07 11:01:59 -08:00
Yaowu Xu
b1823a7dd2 fixed a crash when MODE_STATS is enabled
The MODE_STATS macro was used to #ifdef around code for mode entropy
stats collection, this commit fixed a crash when MODE_STATS is on.
The commit also changed a number of array definitions to use defined
macros instead of hard-coded numbers.

Change-Id: I114592f53a1e44e31e455f5725f036ae6168735a
2011-12-07 10:56:39 -08:00
Yaowu Xu
d0e3acf98c Merge "Minor fixes:" into experimental 2011-12-07 18:52:51 +00:00
Paul Wilkins
79774d108f Minor fixes:
fixed issues caused by conflicts between two experiments.

Change-Id: I56a9bd69493e4850c121ea057a6233c55777c2a5
2011-12-07 09:55:27 -08:00
Ronald S. Bultje
73bbdfe506 Rename use_dc_pred to use_16x16_pred.
Because the variable doesn't distinguish between DC and non-DC
prediction, but rather between 16x16 or 4x4 prediction.

Change-ID: Ia4e7dda2bd6230c91515072e3277be2d64e42629
2011-12-07 09:10:26 -08:00
Ronald S. Bultje
0072b8bc73 Fix for RD thresholds if both I8X8 and DUALPRED are enabled.
Change-Id: I5f9fc894e6a332d9be6d7336c7c5fe11e65b8498
2011-12-06 15:13:11 -08:00
Ronald S. Bultje
60cb39da86 Dual 16x16 inter prediction.
This patch introduces the concept of dual inter16x16 prediction. A
16x16 inter-predicted macroblock can use 2 references instead of 1,
where both references use the same mvmode (new, near/est, zero). In the
case of newmv, this means that two MVs are coded instead of one. The
frame can be encoded in 3 ways: all MBs single-prediction, all MBs dual
prediction, or per-MB single/dual prediction selection ("hybrid"), in
which case a single bit is coded per-MB to indicate whether the MB uses
single or dual inter prediction.

In the future, we can (maybe?) get further gains by mixing this with
Adrian's 32x32 work, per-segment dual prediction settings, or adding
support for dual splitmv/8x8mv inter prediction.

Gain (on derf-set, CQ mode) is ~2.8% (SSIM) or ~3.6% (glb PSNR). Most
gain is at medium/high bitrates, but there's minor gains at low bitrates
also. Output was confirmed to match between encoder and decoder.

Note for optimization people: this patch introduces a 2nd version of
16x16/8x8 sixtap/bilin functions, which does an avg instead of a
store. They may want to look and make sure this is implemented to
their satisfaction so we can optimize it best in the future.

Change-ID: I59dc84b07cbb3ccf073ac0f756d03d294cb19281
2011-12-06 11:53:02 -08:00
Paul Wilkins
b4ad9b5d50 Some further QIndex issues with extended Q
Resolved or factored out some further issues with Q index.
Put in a 3rd order polynomial instead of less accurate power function
as the best fit on gf and kf boost adjustment.
Added avg_q value to use instead of ni_av_qi.
Compute segment delta Q values based on avg_q.
Fixed bug in adjust_maxq_qrange().

The extended range Q on the derf set, using standard data rates
(which do not extend high enough to get big benefits) still show
a shortfall of between 0.5 and 1% though so there would appear to
be further issues that need to be tracked down.

Change-Id: Icfd49b9f401906ba487ef1bef7d397048295d959
2011-12-06 15:43:17 +00:00
Yaowu Xu
82d99257f2 removed leftover code from a couple merge problems.
Change-Id: I17d9c1246d69e102297ec1c3efb359691b3da313
2011-12-05 11:22:35 -08:00
Yaowu Xu
acf5d20ce5 added separate entropy context for alt_ref
This commit added code to keep track of separate entropy contexts for
normal frames and alt ref frames. The underly assumption was that the
two type of frames have different entropy characteristics given they
typically have quite different quantization levels. By keeping entropy
contexts separate, it helps the entropy context distribution to be more
closely adapted to each frame type.

Tests on derf set showed a good and very consistent gain on all clips
on all metrics, avg psnr: 0.89%, overall psnr: 0.84% and ssim 0.93%.

http://www.corp.google.com/~yaowu/no_crawl/mulcontext.html

Change-Id: I15bc9697f6ff7829042911fe0c62930585d7e65d
2011-12-02 14:43:33 -08:00
Yaowu Xu
a8fbab8697 enabled 8x8 intra prediction modes on inter frames
This commit enabled the usage of 8x8 intra prediction modes on inter
frames. There are a few TODO items related to this: 1)baseline entropy
need be calibrated; 2)cost of UV need to be done more properly rather
than using decision only relying on Y; 3)Threshold for allowing picking
8x8 intra prediction should be lowered to lower than the B_PRED.

Even with all the TODOs, tests showed consistent gain on derf set ~0.1%
(PSNR:0.08% and SSIM:0.14%). It is assumed that 8x8 intra prediction
will help more on large resolution clips, especially with above TODOs
addressed.

Change-Id: I398ada49dfc32575cfab962a569c2885111ae3ba
2011-12-02 13:44:47 -08:00
Paul Wilkins
8487a68baf Further work on extended Q range.
Fixed some further QIndex related issues and replaced some tables
(eg zbin and rounding)

Also Added function (currently disabled by default) to populate the
main AC and DC quantizer tables. Using the original AC range the
resulting computed DC values give behavior broadly comparable
on the DERF set. That is not to say that the equations will hold good
over a more extended range. The purpose of this code is to make it
easier to experiment with further alterations to the Q range and distribution
of Q values plus the relative weights given to AC and DC.

The function find_fp_qindex() ensures that changes to the Q tables
are reflected in the value passed in to the first pass code.

Slight experimental adjustment to static segment Q offset.

Change-Id: I36186267d55dfc2a3d565d0cff7218ef300d1cd5
2011-12-02 15:30:01 +00:00
Paul Wilkins
2b307b38e3 CR/LF issue.
Change-Id: I95fab6f51967008acf1bc9e98fdb7bb56974807f
2011-12-02 15:06:15 +00:00
Yaowu Xu
bba710fcbd added transform type to MB_MODE_INFO
this commit is to add an variable in the macroblock level mode
info structure to track the transform size used in each MB, so
the information can be used later in the loop filter to change
how loop filter works on MBs with different transform sizes.

Change-Id: Id0eeaba6cc854c6d1be00ed8d237b3d9e250e447
2011-12-01 07:34:27 -08:00
Paul Wilkins
a917afabbb MinQ equations.
Slight tweaks to the new minq equations to bring results more into line with
original lookup tables.

Change-Id: I969fc87d95912df549b6775e83ee2345e84d4da0
2011-11-29 18:03:57 +00:00
Paul Wilkins
b9ce9bcbc5 Extended Q Range:
Addressed a couple of other QIndex dependencies.

Change-Id: I15b224bffd0210d3c7065cb6905156f2ca8e9ea9
2011-11-29 18:02:56 +00:00
Paul Wilkins
99df6bb629 Further work on extended Q range.
Fixed bug in firspass.c call to vp8_initialize_rd_consts()

This was passing in vp8_dc_quant(cm->base_qindex, cm->y1dc_delta_q)
 instead of (cm->base_qindex + cm->y1dc_delta_q).

It just so happens that for the value 26 used for cm->base_qindex in the
unextended Q case,  the two give similar results. However, when using
the extended Q range the two are very different.

Also added more stats output and partly disabled another broken feature.

Change-Id: Iddf6cf5ea8467c44b7c133f38e629f6ba6f2581e
2011-11-29 17:59:23 +00:00
Ronald S. Bultje
82733643ca mbgraph: fix invalid memory access if motion vectors are too big. 2011-11-28 12:39:38 -08:00
Yaowu Xu
643238a3e0 changed find_near_mvs search to include a mb from last frame
This is an experiment to include a mv contribution from last frame to
nearest and near mv definition. Initial test showed some small though
consistent gain.

latest patch slightly better result ~.13%-~.18%.

TODO: the entropy used to encode the mode choice, i.e. the mv counts
based conditional distribution of modes should be re-collected to
reflect this change, it is expected that there is some further gain
from that.

Change-Id: Ief1e284a36d8aa56b49ae5b360c91419ec494fa4
2011-11-28 08:52:08 -08:00
Paul Wilkins
ee2051f650 Two pass rate control code changes.
This comitt brings accross changes from the public branch
commit number Icf74d13af77437c08602571dc7a97e747cce5066.

The main puurpose of this comit relates to CQ mode but it
also includes some refactoring of the two pass code which
I hope will make tuning the experimental branch for the new
quantizer range a little less painfull.

Change-Id: I278e989436a928fc1fe7761068960048f9d7a376
2011-11-23 17:18:31 +00:00
Paul Wilkins
a0b7db22e6 Further resolution of QIndex LUTS;
This commit resolves further QIndex look up tables to facilitate
experimentation with the quantizer range.

In some cases  rather than remove the look up tables completely
I have created functions that are called once  to populate them
using a formulaic approach base on the actual quantizer.

The use of these functions based on best fit of data from the original
tables does affect the results on some clips but across the derf test
set the effect was broadly neutral.

Change-Id: I8baa61c97ce87dc09a6340d56fdeb681b9345793
2011-11-23 11:32:20 +00:00
Adrian Grange
08491b8665 Remove redundant code (lf_or_gf and frame_lf_or_gf)
Removed unused variables lf_or_gf and frame_lf_or_gf.

Change-Id: I88692cd7d53e532d303c4525ee4667c1ecea3026
2011-11-22 08:47:08 +00:00
Paul Wilkins
d39b5d0546 Removal of Qindex LUTS.
One of the problems arising when tweaking or adjusting the quantizer
tables is that there are a lot of look up tables that depend on the QINDEX.
Any adjustment to the link between QINDEX and real quantizer therefore tends
to break aspects of for example the rate control.

In this check in I have replaced several of the look up tables with functions that
approximate the same results as the old Q luts but use a formulaic approach
based on real Q values rather than QIndex. This should hopefully make it easier
to experiment with changes to the Q tables without always having to go through
and hand optimize a set of look up tables. Once things stabilize we may choose
to re-instate luts for the sake of performance.

Patch 2:
    Addressed Ronald's comments.
    vp8_init_me_luts() Added so luts only initialized once.

Change-Id: Ic80db2212d2fd01e08e8cb5c7dca1fda1102be57
2011-11-22 08:42:33 +00:00
Paul Wilkins
9bac509ac5 Extended Q range Experiment.
Corrected dc lookup table to maintain ac/dc balance
close to what it was previously.

Firstpass not being passed the adjusted Q index for
the extended range.

Change-Id: Ic0200dabda445fea03bf81067999cb2670e99b77
2011-11-21 15:53:40 +00:00
Paul Wilkins
4f792921e7 CONFIG_T8X8 experiment.:
Block the selection of 4x4 modes in key frames if 8x8 is selected.

Change-Id: Ie5729ec22a999d9a1996f020bd4b941e29514992
2011-11-21 15:46:32 +00:00
Adrian Grange
eb15fe85e0 Clip buffer level to the maximum buffer size in CBR
The buffer level was able to increase indefinitely rather than
being clipped to the maximum buffer size specified by the user.

This change checks the buffrer level and prevents it from
going beyond the upper limit of the buffer.

Change-Id: Ifff55f79d3c018e4d3d77e554b11ada543cc1654
2011-11-17 15:57:37 -08:00
Yaowu Xu
6dddcbc57d Merge "fixed the scaling in 8x8 trellis quant" into experimental 2011-11-17 14:55:43 +00:00
Yaowu Xu
7f33be9e96 fixed the scaling in 8x8 trellis quant
This commit has a few minor fixes to the 8x8 trellis quant, so to
make it work regardless if extend_qrange is enabled or not. It also
borrowed adaptive RDMULT constants from 4x4 trellis that was missed
in the 8x8 trellis quant.

Change-Id: I60d7769071f102c699b5084597e62bca87a1f759
2011-11-16 14:29:02 -08:00
Paul Wilkins
cee3d2223a Header inclusion for Unix build
Explicit inclusion of limits.h to satisfy unix build for definition of INT_MAX.
Some commented out code removed.

Change-Id: I5b5980dfaa9b4d2d12bfd729cfd35bd982106908
2011-11-16 10:34:47 +00:00
Paul Wilkins
3cdfdb55e4 Merge CONFIGURE_SEGMENTATION experiment.
Removal of CONFIGURE_SEGMENTATION ifdefs.

Removal of legacy support code fo the old coding mechanism.

Use local reference "xd" for MACROBLOCKD structure in
encode_frame_to_data_rate()

Moved call to choose_segmap_coding_method() out of encode
loop as the cost of segmentation is not properly accounted
in the loop anyway. If this is desirable in the future it
can be moved back. The use of this function to do all the
analysis and set the probabilities also removes the need
to track segment useage in threading code.

Change-Id: I85bc8fd63440e7176c73d26cb742698f9b70cade
2011-11-15 16:15:23 +00:00
Paul Wilkins
6394ef28d7 Further clean up of Segmentation experiment code
Changed name and sense of segment_flag to "seg_id_predicted"
Added some additional comments and retested.

I also did some experimentation with a spatial prediction option
using a similar strategy to the temporal mode implemented.
This helps in some cases where temporal prediction is bad but
I suspect there is more overlap here with work on a larger scale
block structure and spatial correlation will likely be better
handled through that mechanism.

Next check in will remove #ifdefs and legacy mode code.

Change-Id: I3b382b65ed2a57bd7775ac0f3a01a9508a209cbc
2011-11-15 15:22:26 +00:00
Paul Wilkins
661b2c2dcf Further work on Segmentation Experiment:
This check in includes quite a lot of clean up and refactoring.

Most of the analysis and set up for the different coding options for the
segment map (currently simple distribution based coding or temporaly
predicted coding), has been moved to one location (the function
choose_segmap_coding_method() in segmenation.c). This code was previously
scattered around in various locations making integration with other
experiments and modification / debug more difficult.

Currently the functionality is as it was with the exception that the
prediction probabilities are now only transmitted when the temporal
prediction mode is selected.

There is still quite a bit more clean up work that will be possible
when the #ifdef is removed. Also at that time I may rename and alter
the sense of macroblock based variable "segment_flag" which indicates
(1 that the segmnet id is not predicted vs 0 that it is predicted).

I also intend to experiment with a spatial prediction mode that can be
used when coding a key frame segment map or in cases where temporal
prediction does not work well but there is spatial correlation.

In a later check in when the ifdefs have gone I may also move the call
to choose_segmap_coding_method() to just before where the bitsream is
packed (currently it is in vp8_encode_frame()) to further reduce the
possibility of clashes with other experiments and prevent it being called
on each itteration of the recode loop.

Change-Id: I3d4aba2a2826ec21f367678d5b07c1d1c36db168
2011-11-15 11:13:33 +00:00
Paul Wilkins
c9130bdbbc Segmentation experiment:
Added last_segmentation_map[] structure
to keep track of what we had before when
doing temporal prediction. With this change
the existing code does once again appear to
be giving a decodable bitstream for both
temporal and standard prediction modes.
However, it is still somewhat messy and
confused and there is no option to take
advantage of spatial prediction so it could
do with further work.

Some housekeeping / clean out.

Change-Id: I368258243f82127b81d8dffa7ada615208513b47
2011-11-11 18:33:25 +00:00
Paul Wilkins
bf25d4ad7f SEGMENTATION experiment:
Some initial cleanup to aid testing and debug.

Pull code to choose temporal or spatial encoding
out of encodeframe.c into a dedicated function
in segmentation.c.

For now disable broken temporal mode.

Move the coding of "temporal_update" flag and
only transmit if segment map update is indicated.

Rename the functions read_mb_features() and
write_mb_features() to read_mb_segid() and
read_mb_segid() as they only read and write
the macroblock segment id not any of the
features.

Change-Id: Ib75118520b1144c24d35fdfc6ce46106803cabcf
2011-11-11 18:31:21 +00:00
Yaowu Xu
982b061dc2 Make 8x8 and extend_qrange to work together
This commit added scaling factors to 8x8 transform, quant, dequant and
inverse transform pipeline to make 8x8 transform to work when configed
with enable-extend_qrange. This commit also disabled the trellis-quant
when extend_qrange is configured.

Change-Id: Icfb3192e4746f70a4bb35ad18b7b47705b657e52
2011-11-11 07:31:00 -08:00
Yaowu Xu
a0ed4e6380 Merge "scaled the threshold for 2nd order coefficient reset" into experimental 2011-11-10 16:34:36 +00:00
Yaowu Xu
cbcba9e7c0 scaled the threshold for 2nd order coefficient reset
extend_qrange introduces a different scaling factor, this commit takes
the scaling difference into account for reset 2nd order coefficients.

Change-Id: Ie58bca9f52698fa759e3f88da2aa4d82630fa91a
2011-11-10 08:31:13 -08:00
Paul Wilkins
0789253125 T8x8 experiment merge.
For ease of testing and merging experiments I have
removed in line code in encode_frame() that assigns
MBs to be t8x8 or t4x4 coded segments and have
moved the decision point and segment setup to
the init_seg_features0 test function.

Keeping everything in one place helps make sure
for now that experiments using segmentation are
not fighting each other.

Also made sure mode selection code can't choose 4x4
modes if t8x8 is selected.

Patch2: In init_seg_features() add checks
for SEG_LVL_TRANSFORM active.

Change-Id: Ia1767edd99b78510011d4251539f9bc325842e3a
2011-11-09 15:46:05 +00:00
Paul Wilkins
b0f9f15dbd Merging and testing of SEGMENTATION experiment.
Removed code in #if CONFIG_SEGMENTATION that
enables segmentation and creates a test segmentation
map, to avoid conflicts with the other segmentation test
code,

Change-Id: I7a21a44ed188b814cd80b30dd628c62474eba730
2011-11-09 11:59:20 +00:00
Paul Wilkins
ac2ab02dcf Segmentation feature logic fix.
Bug fix to logic in vp8_pick_inter_mode() and
vp8_rd_pick_inter_mode().

The block on the use of segment features
for the cm->refresh_alt_ref_frame case
was just for testing and is not correct.

The special case code for alt ref can
be re-enabled as an else clause.

Change-Id: Ic9b57cdb5f04ea7737032b8fb953d84d7717b3ce
2011-11-09 11:53:26 +00:00
Yaowu Xu
5883246d01 make debug match release build on win32 with 8x8 transform enabled
The 8x8 forward transform makes use of floating operations, therefore
requires emms call to reset mmx registers to correct state. Without
the resets, the 8x8 forward transform results are indefinite on win32
platform.

Change-Id: Ib5b71c3213e10b8a04fe776adf885f3714e7deb1
2011-11-08 20:36:54 -08:00
Paul Wilkins
a9df4183a6 Segment signaling of TX size
Initial attempt at using new segment feature signaling
to indicate 4x4 or 8x8 transform.

needs --enable-experimental --enable-t8x8

Note this is work in progress.

Change-Id: Ib160d46a5d810307bfcbc79853ce1a65b5b870b7
2011-11-08 12:21:08 +00:00
Yaowu Xu
f82d601f94 Merge "Added context reset when 2nd order coefficients are cleared" into experimental 2011-11-04 16:25:15 +00:00
Paul Wilkins
fe38082f44 Segment Features with 8x8DCT.
Temporary check in to turn off other segment features
tests when #if CONFIG_T8X8 is set as the assignment of
MBs to differnt segments in each case  will conflict.

The 8x8 code will be modified to use the new segment
feature method properly in a later check in.

Increase bits allowed for EOB end stop marker to 6 ready
for 8x8.

Change-Id: I4835bc8d3bf98e1775c3d247d778639c90b01f7f
2011-11-04 11:06:24 +00:00
Paul Wilkins
a258bba1fb Segment Feature Data Access
No change to functionality or output.

Updates to the segment feature data structure now all done
through functions such as set_segdata() and get_segdata()
in seg_common.c.

The reason for this is to make changing the structures (if needed)
and debug easier.

In addition it provides a single location for subsequent addition
of range and validity checks. For  example valid combination of
mode and reference frame.

Change-Id: I2e866505562db4e4cb6f17a472b25b4465f01add
2011-11-04 10:42:12 +00:00
Yaowu Xu
2bbde25003 make uv intra mode coding adaptive to Y mode
This commit tries to do UV intra mode coding adaptive to Y intra mode.
Entropy context is defined as conditional PDF of uv intra mode given
the Y mode. All constants are normalized with 256 to be fit in 8 bits.

This provides further coding efficiency beyond the quantizer adaptive
y intra mode coding. Consistent gains were observed on all clips and
all bit rates for HD all key encoding tests.

To test, configure with
--enable-experimental --enable-uvintra

Change-Id: I2d78d73f143127f063e19bd0bac3b68c418d756a
2011-11-03 21:48:08 -07:00
Yaowu Xu
d8afecef71 Added context reset when 2nd order coefficients are cleared
As discovered in path 10 of Change Ia12acd2f, reset 2nd order coeffs
without reset of above and left coding context may have introduced
problem that causes encoder/decoder mismatching. This commit added
update to coding context when the 2nd coefficients are cleared.

In addition, this commit also introduced early breakout in the checks
to speed up when coefficients are too significant to be cleared.

Change-Id: I85322a432b11e8af85001525d1e9dc218f9a0bd6
2011-11-03 16:05:29 -07:00
Paul Wilkins
a10a268e58 Segment Features. Removal of #ifdefs
Removal of configure #ifdefs so that segment features
always available. Removal of code supporting old
segment feature method.

Still a good deal of tidying up to do.

Change-Id: I397855f086f8c09ab1fae0a5f65d9e06d2e3e39f
2011-11-03 17:14:26 +00:00
Paul Wilkins
2370d440bd Merge "Segmentation: Reference frames" into experimental 2011-11-03 12:58:43 +00:00
Paul Wilkins
ab9a4ce065 Merge "Change to prevent encoding of effect-less 2nd order coefficients" into experimental 2011-11-02 14:48:17 +00:00
Paul Wilkins
87ff8620b2 Segmentation: Reference frames
Modify reference frame segmentation so that ONE or MORE
reference frames may be marked as a available for a given
segment.

Fixed bugs relating to segment coding of INTRA and some
INTER modes at the segment level.

Modified Q boost for static areas based on ambient average Q.

Strong results now on clips with significant static areas.
(some data points in derf set as high as 9% and some static &
slide show type content in YT set > 20%)

Change-Id: Ia79f912efa84b977f35a23683ae3643251e24f0c
2011-11-02 13:31:54 +00:00
Adrian Grange
2b450a460f Deleted repeated code block
The block of code skipped testing the current mode if the
reference frame is AltRef, the mv is not (0,0) and
ARNR filtering is disabled.

This block of code has already been tested above if the
macro CONFIG_SEGFEATURES is set to 0.

Change-Id: I3f5710bb8270caad06c9a0eee59fa0daf1f70776
2011-11-01 08:41:43 -07:00
Adrian Grange
71fb1f8eab Fixed this_mode used before set in vp8_pick_inter_mode
The variable this_mode was being used before it had been
initialized.

Moved the line that sets-up this_mode toward the top of the
enclosing loop, prior to its first use. The bug would result in
tests in the loop lagging the mode that was expected to be
tested.

Change-Id: If4e51600449ce6b4285f112da17a44c24b4a19fb
2011-10-31 12:42:00 -07:00
Paul Wilkins
795c6dd2c9 Segmentation Entropy and tweaks.
Some correction for entropy impact of segment signaled (EOB and ref frame)

Other slight tweaks.

Derf VBR average gain now over 1% (best over 7%)
One YT test clip has gains of circa 30% (VBR)

There is still an issue with noisy clips where making the background static
and coded with 0,0 can have a negative effect, especially at low Q.
This is probably because of the loss of smoothing by fractional pixel filters.

Change-Id: I7a225613c98067b96f8fc7a7e36f95d465b2b834
2011-10-31 10:59:25 +00:00
Yaowu Xu
c8ef79d22e Change to prevent encoding of effect-less 2nd order coefficients
similar logic to http://gerrit.chromium.org/gerrit/#change,10359

Change-Id: Ia12acd2f2b3b92ef2a601da43c2497034ef62174
2011-10-25 10:25:02 -07:00
Paul Wilkins
23701f4f87 Segmentation Features;
Only encode sign bit for feature data that can have a sign.

Tweaks to the test segmentation rules so that it now actually gives
a net benefit on the derf set of about 0.4% though much higher
on some clips at the low end.

Change-Id: I8e61f1aebf41c9037db7e67e2f8975aa18a0c986
2011-10-24 17:06:29 +01:00
Paul Wilkins
01ce04bc06 Further segment feature extensions.
This quite large check in includes the following:

Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group.
This is used as a basis for a simple segmentation for the normal frames
in a gf/arf group. This code also uses satd functions from Yaowu.

Adds functionality for coding the latest possible position of an EOB for
blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct).
Where the EOB position is 0 this acts like "skip" and the normal coding
of skip at the per mb level is disabled.

Added functions (seg_common.c) for setting and reading segment feature
elements. These may want to be optimized away at some point but while the
mecahnism is in a state of flux they provide a single location for making
changes and keep things a bit cleaner.

This is still proof of concept code. Currently the tested feature set:-

Quantizer,
Loop Filter level,
Reference frame,
Prediction Mode,
EOB end stop.

TBD:-

Add functions for setting and reading the feature data with range
and validity checking.

Handling of signed and unsigned feature data. At the moment all is assumed
to be signed and a sign bit is coded but many cannot be negative.

Correct handling of EOB feature with intra coded blocks.

Testing/trapping of legal/illegal ref frame and mode combinations.

Transform size switch plus merge and test with 8c8 DCT work

Merge and test with Sumans Segmenation coding optimizations

Change-Id: Iee12e83661c7abbd1e0ce6810915eb4ec35e2d8e
2011-10-24 15:52:18 +01:00
Paul Wilkins
156b221a7f Segment coding of mode and reference frame.
Proof of concept test code that encodes mode and reference
frame data at the segment level.

Decode-able bit stream but some issues not yet resolved.
As it this helps a little on a couple of clips but hurts on most as
the basis for segmentation is unsound.

To build and test, configure with
--enable-experimental --enable-segfeatures

Change-Id: I22a60774f69273523fb152db8c31f4b10b07c7f4
2011-09-30 16:45:16 +01:00
John Koleszar
305084d5fa Merge remote branch 'internal/upstream' into HEAD 2011-09-21 00:05:04 -04:00
Fritz Koenig
bd0c3409a8 Move neon only arm functions under arm/neon.
These files don't contain generic arm code, so should
only be compiled by neon.

Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564
2011-09-20 10:51:06 -07:00
Johann
6829e62718 Merge "NEON FDCT updated to match current C code" 2011-09-20 09:51:05 -07:00
Johann
86e07525d5 Merge "NEON walsh transform updated to match C" 2011-09-20 09:50:42 -07:00
Johann
3a16276cf7 Merge "Updated ARMv6 forward transforms to match C" 2011-09-20 09:50:36 -07:00
Tero Rintaluoma
0c2529a812 NEON FDCT updated to match current C code
- Removed fast_fdct4x4_neon and fast_fdct8x4_neon
- Uses now short_fdct4x4 and short_fdct8x4
- Gives ~1-2% speed-up on Cortex-A8/A9

Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec
2011-09-20 10:20:55 +03:00
Tero Rintaluoma
3c19bc3fb3 Fixed armv5te multiplications
Rd and Rm registers should be different in 'mul'. This register
combination results in unpredictable behaviour. GCC will give
a warning and RVCT an error in this case.

Restriction applies only to armv5 targets and not for armv6 and above.

Change-Id: I378d17c51e1f16a6820814fbed43e115aaabb03e
2011-09-20 09:59:27 +03:00
Tero Rintaluoma
4c3ad66b7f Updated ARMv6 forward transforms to match C
- Updated walsh transform to match C
  (based on Change Id24f3392)
- Changed fast_fdct4x4 and 8x4 to short_fdct4x4 and 8x4
  correspondingly

Change-Id: I704e862f40e315b0a79997633c7bd9c347166a8e
2011-09-19 10:26:59 +03:00
Tero Rintaluoma
2a4b2a000c NEON walsh transform updated to match C
Modified original patch If2f07220885c4c3a0cae0dace34ea0e36124f001
according to comments. Scheduled code a little bit to prevent some
interlocks.

Change-Id: I338f02b881098782f82af63d97f042b85e63e902
2011-09-19 10:15:33 +03:00
Yaowu Xu
1d44e7ce1f enable selecting&transmitting to for intra mode entropy
This commit added a 3 bit index to the bitstream, the index is used to
look into the intra mode coding entropy context table. The commit uses
the mode stats to calculate the cost of transmitting modes using 8
possible entropy distributions, and selects the distribution that
provides the lowest cost to do the actual mode coding.

Initial test show this provides additional .2%~.3% gain over quantizer
adaptive intra mode coding. So the adaptive intra mode coding provides
a total of .5%(psnr) to .6% gain(ssim) combined for all-key-encoding

To build and test, configure with
--enable-experimental --enable-qimode

Change-Id: I7c41cd8bfb352bc1fe7c5da1848a58faea5ed74a
2011-09-16 16:33:19 -07:00
Yaowu Xu
aac2c12663 add quantizer adaptive intra mb mode encoding
make intra mode coding entropy distribution adaptive to baseQindex, an
encoding test on hd clips with all key frame shows universal gain on
all clips in both .2%(psnr) and (ssim).3%.

To build and test, configure with
--enable-experimental --enable-qimode

Change-Id: Iaa69241b984d4fdd8baa6d77ee78c0140f5ac00a
2011-09-16 16:26:35 -07:00
Yaowu Xu
ca6b85aa4e add 8x8 intra prediction modes
Patch 1 to Patch 3 is an initial implementation of 8x8 intra prediction
modes, here are with the following assumptions:
a. 8x8 has 4 prediction modes DC, H, V and TM
b. UV 4x4 block use the same mode as corresponding 8x8 area
c. i8x8 modes are enabled for key frame only for now
Patch 4:
d. removed debug code from previous patches
Patch 5:
e. added stats code to collect entropy stats and further cleaned up
Patch 6:
f. changed mode stats code to collect finer stats of modes
Patch 7:
g. normalized i8x8 modes distribution to total at 256 (8bits).
Patch 8:
h. fixed a bug in decoder and removed debug printf output.
Patch 9:
i. more cleanups to address paul's comment
Patch 10:
j. messy rebase/merges to bring the commit up to date.

Tests on HD clips encoded with all key frame showing consistent gain
on all clips and all metrics:~0.5%(psnr) and 0.6%(ssim):
http://www.corp.google.com/~yaowu/no_crawl/i8x8hd_allkey_fixedq.html

To build and test, configure with:
--enable-experimental --enable-i8x8

Change-Id: I9813fe07ae48cab5fdb5d904bca022514ad01e7f
2011-09-16 15:55:19 -07:00
John Koleszar
62371d382a Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/decoder/decodframe.c
	vp8/encoder/encodeframe.c
	vp8/encoder/encodemb.c

Change-Id: I6e0d1669e4409a2dfd73ba2c7038d730842d3953
2011-09-16 09:22:29 -04:00
Paul Wilkins
ceb5174205 Segment Feature Signaling
Plumbing for tuning new segment features on and off.

Change-Id: If86cd6f103296b73030e8af7cf85c5b9bbffdbaf
2011-09-15 10:19:09 +01:00
Paul Wilkins
1741cc7ab9 Reverse coding order for segment features:
Code all the features for one segment (grouped together)
then all for the next etc. etc. rather than grouping the
data by feature.

Change-Id: I2a65193b3a70aca78f92e855e35d8969d857b6dd
2011-09-13 16:57:17 +01:00
Scott LaVarnway
5bc7b3a68e Fixed encoder crash
caused by the "Removed bmi copy to/from BLOCKD" commit.

Change-Id: I9fae71bdc34c8ecc07bb81cd3ccf498b91ce3ec7
2011-09-13 11:46:33 -04:00
Paul Wilkins
1c24442a07 Change to segment_feature_data[][] structure.
This data structure is  now [Segment ID][Features]
rather than [Features][Segment_ID]

I propose as a separate modification to make the experimental
bit stream reflect this such that all the features for a segment
are coded together.

Change-Id: I581e4e3ca2033bdbdef3d9300977a8202f55b4fb
2011-09-13 12:58:04 +01:00
Paul Wilkins
dfbc61f3ab Segment Features:
Some basic plumbing added for a range of segment level features.
MB_LVL_* changed to SEG_LVL_* to better reflect meaning.

Change-Id: Iac96da36990aa0e40afc0d86e990df337fd0c50b
2011-09-13 11:26:39 +01:00
Scott LaVarnway
c4b9089bb9 Merge "Skip computation of distortion in vp8_pick_inter_mode if active_map is used" 2011-08-31 07:18:52 -07:00
Scott LaVarnway
222c72e50f Merge "Removed bmi copy to/from BLOCKD" 2011-08-31 06:57:20 -07:00
Alpha Lam
0e05f2c6c9 Skip computation of distortion in vp8_pick_inter_mode if active_map is used
If a block is marked to be inactive then set distortion to 0.

Change-Id: Ib415f19642a2ff7b5cf5cfaedd60ebbd79732272
2011-08-31 14:06:55 +01:00
John Koleszar
4551743ceb Merge remote branch 'internal/upstream' into HEAD 2011-08-31 00:05:05 -04:00
John Koleszar
800b70a3bf Merge "Recalculate zbin_extra only if regular quantizer is being used" 2011-08-30 12:49:24 -07:00
Alpha Lam
bc9293b815 Recalculate zbin_extra only if regular quantizer is being used
vp8_update_zbin_extra() is called all the time even though the fast
quantizer doesn't use it. Skip this call if fast quantizer is used.

Change-Id: Ia711c38431930cc2486cf59b8466060ef0e9d9db
2011-08-30 19:23:34 +01:00
John Koleszar
ce59a150a6 Merge remote branch 'internal/upstream' into HEAD 2011-08-27 00:05:05 -04:00
John Koleszar
4a28115464 Merge remote branch 'internal/upstream' into HEAD 2011-08-26 00:05:06 -04:00
Yunqing Wang
1f20202e2c Minor modification on key frame decision
This change makes sure that no key frame recoding in real-time mode
even if CONFIG_REALTIME_ONLY is not configured.

Change-Id: Ifc34141f3217a6bb63cc087d78b111fadb35eec2
2011-08-25 16:54:45 -04:00
John Koleszar
180b0306cc Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/common/defaultcoefcounts.h
	vp8/common/entropy.c
	vp8/encoder/bitstream.c

Change-Id: Idd4990c80d5b5494ac036254694015fab449bc08
2011-08-25 08:36:19 -04:00
Fritz Koenig
4797a97215 Quiet warning by removing unused variable.
fwd_boost_score was not being computed or
referenced, so remove declaration.

Change-Id: Iece36cde1ec113e3c6afaff1407d24cdf12bd0a8
2011-08-24 15:47:09 -07:00
Scott LaVarnway
b870947d42 Removed bmi copy to/from BLOCKD
for SPLITMV and B_PRED modes.  Modified code to use the bmi
found in mode_info_context instead of BLOCKD.  On the decode
side, the uvmvs are calculated only when required, instead of
every macroblock.  This is WIP. (bmi should eventually be
removed from BLOCKD)
Small performance gains noticed for RT encodes and decodes.(VGA)

Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7
2011-08-24 14:42:26 -04:00
Scott LaVarnway
1de5da80c9 Merge "Faster vp8_default_coef_probs" 2011-08-24 07:52:10 -07:00
John Koleszar
67864c5f97 Merge remote branch 'internal/upstream' into HEAD 2011-08-24 00:05:05 -04:00
Fritz Koenig
c5f890af2c Use local labels for jumps/loops in x86 assembly.
Prepend . to local labels in assembly code.  This
allows non unique labels within a file.  Also
makes profiling information more informative
by keeping the function name with the loop name.

Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
2011-08-23 09:05:29 -07:00
Fritz Koenig
694d4e7777 Reclassify optimized ssim calculations as SSE2.
Calculations were incorrectly classified as either
SSE3 or SSSE3.  Only using SSE2 instructions.
Cleanup function names and make non-RTCD code work
as well.

Change-Id: I48ad0218af0cc51c5078070a08511dee43ecfe09
2011-08-22 12:36:28 -07:00
Fritz Koenig
b7a6f1d20e Merge "Revert "Reclasify optimized ssim calculations as SSE2."" 2011-08-22 12:32:12 -07:00
Fritz Koenig
734b1b2041 Revert "Reclasify optimized ssim calculations as SSE2."
This reverts commit 01376858cd
2011-08-22 11:31:12 -07:00
Fritz Koenig
f8e3d23b99 Merge "Reclasify optimized ssim calculations as SSE2." 2011-08-22 09:20:33 -07:00
John Koleszar
efe35fa63f Merge remote branch 'internal/upstream' into HEAD 2011-08-20 00:05:04 -04:00
Fritz Koenig
01376858cd Reclasify optimized ssim calculations as SSE2.
Calculations were incorrectly classified as either
SSE3 or SSSE3.  Only using SSE2 instructions.
Cleanup function names and make non-RTCD code work
as well.

Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97
2011-08-19 08:51:27 -07:00
John Koleszar
edec5eb5e7 Merge "Copy less when active map is in use" 2011-08-19 07:31:00 -07:00
Alpha Lam
4e8d35a461 Copy less when active map is in use
When active map is specified and the current frame is not a key frame,
golden frame nor a altref frame then copy only those active regions.

This significantly reduces encoding time by as much as 19% on the test
system where realtime encoding is used. This is particularly useful
when the frame size is large (e.g. 2560x1600) and there's only a few
action macroblocks.

Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4
2011-08-19 10:29:41 -04:00
John Koleszar
3743fd0cc7 Merge remote branch 'internal/upstream' into HEAD 2011-08-18 00:05:09 -04:00
Paul Wilkins
744f482350 Small boost to every other frame.
Instead of a single mid GF boost apply a few extra bits to
every other frame. This gives a very small average metrics
improvement on both derf and YT sets.

Also use min GF interval as min KF interval.

Change-Id: Iee238b8cae0ffaed850a5a944ac825cee18da485
2011-08-17 14:14:23 +01:00
Scott LaVarnway
19987dcbfa Faster vp8_default_coef_probs
Copies from a generated table instead of building the
default coeff probabilities during runtime.

Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
2011-08-16 16:21:21 -04:00
John Koleszar
f54d561fa8 Merge remote branch 'internal/upstream' into HEAD 2011-08-16 00:05:05 -04:00
John Koleszar
9cc1611588 Merge v0.9.7-p1 release int 'origin/master'
Change-Id: I93388d2f8846615ad1e26b975308c5e96b9b1918
2011-08-15 17:10:01 -04:00
John Koleszar
e96131705a Revert "Improved 1-pass CBR rate control"
This reverts commit b5ea2fbc2c. Further
testing showed noticable keyframe popping in some cases, reverting this
for now to give time for a proper fix.

Conflicts:

	vp8/encoder/onyx_if.c
	vp8/encoder/ratectrl.c

Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af
2011-08-12 14:51:36 -04:00
John Koleszar
a16cd74ba1 Merge remote branch 'internal/upstream-experimental' into HEAD
Conflicts:
	vp8/decoder/detokenize.c
	vp8/decoder/onyxd_if.c
	vp8/vp8_common.mk

Change-Id: Ifca1108186a8bc715da86a44021ee2fa5550b5b8
2011-08-11 13:01:45 -04:00
John Koleszar
939f64f68e Merge remote branch 'origin/master' into experimental
Change-Id: I9c479c9b6e72aa78b412d25c00b8075eaca5229d
2011-08-06 00:05:15 -04:00
Yunqing Wang
b84e8f20c3 Merge "Adjust half-pixel only search" 2011-08-05 12:15:32 -07:00
John Koleszar
712762b508 Merge remote branch 'origin/master' into experimental
Change-Id: Ic698ea5f5b31a5faf467eb0da4b762f9586df938
2011-08-05 00:05:05 -04:00
John Koleszar
238dae8604 Fix source buffer selection
This patch fixes a bug in the interaction between the recode loop and
spatial resampling. If the codec was in a spatial resampling state,
and a subsequent iteration of the recode loop disables resampling,
then the source buffer must be reset to the unscaled source.

Change-Id: I4e4cd47b943f6cd26a47449dc7f4255b38e27c77
2011-08-03 16:13:15 -04:00
Yunqing Wang
b9f19f8917 Adjust half-pixel only search
Changed motion search in vp8_find_best_half_pixel_step() to be the
same as in vp8_find_best_sub_pixel_step(), which checks 5 points
instead of 8 points. This only affects real-time mode with
cpu-used >=9. Tests showed it gives 2% encoding speedup with
a quality loss(psnr) of up to 0.5%.

Change-Id: I16049cad1535002346d46cfdfad345bfc3dc5146
2011-08-03 11:51:07 -04:00
John Koleszar
06c3d5bb9a Fix building with --disable-postproc
Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9
2011-08-01 17:50:23 -04:00
John Koleszar
87e570e6be Merge remote branch 'origin/master' into experimental
Change-Id: I473166452c0ed5a4219b5e7d96a91a6641b11b9d
2011-07-30 00:05:09 -04:00
John Koleszar
1f71d2e2c8 Correctly track sharpness in vp8cx_pick_filter_level_fast
Make sure to update last_sharpness_level from the current
sharpness_level whenever it changes.

Change-Id: I0258d2f5b11a407abf6176a8d4c4994d925943f0
2011-07-29 12:27:03 -04:00
John Koleszar
728886fae9 Merge remote branch 'origin/master' into experimental
Change-Id: Iaca87acc9726b5173d638528684d154538ec01e6
2011-07-28 00:05:12 -04:00
Yunqing Wang
2f2302f8d5 Preload reference area in sub-pixel motion search (real-time mode)
This change implemented same idea in change "Preload reference area
to an intermediate buffer in sub-pixel motion search." The changes
were made to vp8_find_best_sub_pixel_step() and vp8_find_best_half
_pixel_step() functions which are called when speed >= 5. Test
result (using tulip clip):

1. On Core2 Quad machine(Linux)
rt mode, speed (-5 ~ -8), encoding speed gain: 2% ~ 3%
rt mode, speed (-9 ~ -11), encoding speed gain: 1% ~ 2%
rt mode, speed (-12 ~ -14), no noticeable encoding speed gain

2. On Xeon machine(Linux)
Test on speed (-5 ~ -14) didn't show noticeable speed change.

Change-Id: I21bec2d6e7fbe541fcc0f4c0366bbdf3e2076aa2
2011-07-27 14:19:10 -04:00
Yunqing Wang
f11613b620 Merge "Fix range checks in motion search" 2011-07-27 09:34:13 -07:00
Yunqing Wang
bde2afbe23 Fix range checks in motion search
There were some situations that the start motion vectors were
out of range. This fix adjusted range checks to make sure they
are checked and clamped.

Change-Id: Ife83b7fed0882bba6d1fa559b6e63c054fd5065d
2011-07-27 10:37:33 -04:00
John Koleszar
9fbb1d4350 Merge remote branch 'origin/master' into experimental
Change-Id: I1ae82458536ba2f0969e1bea78f41cd16fe96b79
2011-07-27 00:05:06 -04:00
James Zern
b45065d38b cosmetics: consistently use [u]int64_t
Removes mixed usage of (unsigned) long long and INT64.
Fixes Issue #208.

Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a
2011-07-26 11:34:36 -07:00
John Koleszar
62400028e2 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/decoder/detokenize.c
	vp8/decoder/onyxd_int.h

Change-Id: Ib9b516b939358ac8bf694200a8425fdd62c8d149
2011-07-26 10:22:42 -04:00
John Koleszar
3c4a39e71c Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/decoder/detokenize.c
	vp8/decoder/onyxd_int.h

Change-Id: Idc301ae630dc1aedeb85674ecfdcf1eb28420f81
2011-07-26 10:04:36 -04:00
Yunqing Wang
fe270dd527 Specify size for argument pushed to stack
The change fixes building error on Win64.

Change-Id: I63d25b26220c4da8a98ca2e36530cbb802468e6b
2011-07-25 11:30:45 -04:00
John Koleszar
664cd5ac91 Merge remote branch 'internal/upstream' into HEAD 2011-07-23 00:05:14 -04:00
John Koleszar
e14ad46efa Merge remote branch 'origin/master' into experimental
Change-Id: I0a24d6762598e5fee30f264de1dcd10331c01eac
2011-07-23 00:05:13 -04:00
Johann
773bcc300d Merge "fix sharpness bug and clean up" 2011-07-22 09:34:55 -07:00
Johann
a04ed0e8f3 fix sharpness bug and clean up
sharpness was not recalculated in vp8cx_pick_filter_level_fast

remove last_filter_type. all values are calculated, don't need to update
the lfi data when it changes.

always use cm->sharpness_level. the extra indirection was annoying.

don't track last frame_type or sharpness_level manually. frame type
only matters for motion search and sharpness_level is taken care of in
frame_init

move function declarations to their proper header

Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db
2011-07-22 12:33:57 -04:00
Yunqing Wang
829179e888 Merge "Preload reference area to an intermediate buffer in sub-pixel motion search" 2011-07-22 06:56:15 -07:00
Yunqing Wang
20bd1446c0 Preload reference area to an intermediate buffer in sub-pixel motion search
In sub-pixel motion search, the search range is small(+/- 3 pixels).
Preload whole search area from reference buffer into a 32-byte
aligned buffer. Then in search, load reference data from this buffer
instead. This keeps data in cache, and reduces the crossing cache-
line penalty. For tulip clip, tests on Intel Core2 Quad machine(linux)
showed encoder speed improvement:
  3.4%   at --rt --cpu-used =-4
  2.8%   at --rt --cpu-used =-3
  2.3%   at --rt --cpu-used =-2
  2.2%   at --rt --cpu-used =-1

Test on Atom notebook showed only 1.1% speed improvement(speed=-4).
Test on Xeon machine also showed less improvement, since unaligned
data access latency is greatly reduced in newer cores.

Next, I will apply similar idea to other 2 sub-pixel search functions
for encoding speed > 4.

Make this change exclusively for x86 platforms.

Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
2011-07-22 09:28:06 -04:00
John Koleszar
dc9e1b7683 Merge remote branch 'origin/master' into experimental
Change-Id: I8b0a76b3232c8cff15c0ca5289e18af6889e5095
2011-07-22 00:05:11 -04:00
John Koleszar
7d44c805cf Merge remote branch 'internal/upstream' into HEAD 2011-07-22 00:05:06 -04:00
Yaowu Xu
8c31484ea1 fix more merge issues
With this fix, the experimental branch now builds and encodes correctly
with the following two configure options respectively:
--enable-experimental --enable-t8x8
--enable-experimental

Change-Id: I3147c33c503fe713a85fd371e4f1a974805778bf
2011-07-21 09:01:53 -07:00
John Koleszar
2bdda84e37 Merge "Increase chrow row alignment to 16 bytes." 2011-07-21 07:32:39 -07:00
Yunqing Wang
c5fe641179 Merge "Add improvements made in good-quality mode to real-time mode" 2011-07-21 07:27:09 -07:00
Yaowu Xu
1c24eb2b7b fixed a number of problems caused by auto merges
The auto merge process pull and merge commits from public git or master
branch. These automerges while worked well most time, but has created
a few problems. This commit fixed several issues existed long before
the latest 8x8 transform commit.

Change-Id: I895ca99713231b1aec521d57db5d9839f74aacfa
2011-07-20 12:45:35 -07:00
Timothy B. Terriberry
7d1b37cdac Increase chrow row alignment to 16 bytes.
This is done by expanding luma row to 32-byte alignment, since
 there is currently a bunch of code that assumes that
 uv_stride == y_stride/2 (see, for example, vp8/common/postproc.c,
 common/reconinter.c, common/arm/neon/recon16x16mb_neon.asm,
 encoder/temporal_filter.c, and possibly others; I haven't done a
 full audit).
It also uses replaces the hardcoded border of 16 in a number of
 encoder buffers with VP8BORDERINPIXELS (currently 32), as the
 chroma rows start at an offset of border/2.
Together, these two changes have the nice advantage that simply
 dumping the frame memory as a contiguous blob produces a valid,
 if padded, image.

Change-Id: Iaf5ea722ae5c82d5daa50f6e2dade9de753f1003
2011-07-20 10:20:31 -07:00
Deb Mukherjee
08f6471890 Add 8x8 transform to experimental branch
Please refer to previous commit messages for detailed info:
https://on2-git.corp.google.com/g/#change,5940
https://on2-git.corp.google.com/g/#change,6045

Change-Id: I8b16992f2f69c5a808ad40a3e32ef589cce7c59d
2011-07-20 09:49:22 -07:00
John Koleszar
6907117175 Merge remote branch 'origin/master' into experimental
Change-Id: I956822324c046c254806dd712a2d3be4dcf8564b
2011-07-20 00:05:17 -04:00
John Koleszar
8e464cc4c2 Merge remote branch 'internal/upstream' into HEAD 2011-07-20 00:05:09 -04:00
Scott LaVarnway
a25f6a9c88 Moved vp8_encode_bool into boolhuff.h
allowing the compiler to inline this function.  For real-time
encodes, this gave a boost of 1% to 2.5%, depending on the
speed setting.

Change-Id: I3929d176cca086b4261267b848419d5bcff21c02
2011-07-19 09:17:25 -04:00
John Koleszar
2614b77fcb Merge remote branch 'origin/master' into experimental
Change-Id: Ida9204624fe3fb99fed1b149d1f88159480fdd83
2011-07-19 00:05:11 -04:00
John Koleszar
b3b34b0bc7 Merge remote branch 'internal/upstream' into HEAD 2011-07-19 00:05:05 -04:00
John Koleszar
b5ea2fbc2c Improved 1-pass CBR rate control
This patch attempts to improve the handling of CBR streams with
respect to the short term buffering requirements. The "buffer level"
is changed to be an average over the rc buffer, rather than a long
running average. Overshoot is also tracked over the same interval
and the golden frame targets suppressed accordingly to correct for
overly aggressive boosting.

Testing shows that this is fairly consistently positive in one
metric or another -- some clips that show significant decreases
in quality have better buffering characteristics, others show
improvenents in both.

Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920
2011-07-18 11:48:05 -04:00
John Koleszar
8bf2cbce98 Merge remote branch 'origin/master' into experimental
Change-Id: Ic623c335cd4991c9d80f675f390e81282b18c137
2011-07-16 00:05:08 -04:00
John Koleszar
dc1c3f9024 Merge remote branch 'internal/upstream' into HEAD 2011-07-16 00:05:05 -04:00
Scott LaVarnway
e68894fa03 Merge "Tokenize MB optimized" 2011-07-15 07:54:14 -07:00
Tero Rintaluoma
4e82f01547 Tokenize MB optimized
Optimized C-code of the following functions:
 - vp8_tokenize_mb
 - tokenize1st_order_b
 - tokenize2nd_order_b
Gives ~1-5% speed-up for RT encoding on Cortex-A8/A9
depending on encoding parameters.

Change-Id: I6be86104a589a06dcbc9ed3318e8bf264ef4176c
2011-07-15 11:26:54 +03:00
John Koleszar
f1fcd74e3e Merge remote branch 'origin/master' into experimental
Change-Id: Icbeb14d64ed3d9337606b591dde4e0669540a10d
2011-07-15 00:05:06 -04:00
John Koleszar
087b338d9e Merge remote branch 'internal/upstream' into HEAD 2011-07-15 00:05:04 -04:00
John Koleszar
04dce631a2 Remove unused speed features
min_fs_radius, max_fs_radius, full_freq were set but never read.

Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74
2011-07-14 14:20:25 -04:00
John Koleszar
86edcb0cc7 Merge remote branch 'origin/master' into experimental
Change-Id: I3f64e220b78738e5261a9fda3c270d51613f4faa
2011-07-14 00:05:12 -04:00
John Koleszar
6901105e99 Merge remote branch 'internal/upstream' into HEAD 2011-07-14 00:05:04 -04:00
Yunqing Wang
0e9a6ed72a Add improvements made in good-quality mode to real-time mode
Several improvements we made in good-quality mode can be added
into real-time mode to speed up encoding in speed 1, 2, and 3
with small quality loss. Tests using tulip clip showed:

--rt --cpu-used=-1
(before change)
PSNR: 38.028
time: 1m33.195s
(after change)
PSNR: 38.014
time: 1m20.851s

--rt --cpu-used=-2
(before change)
PSNR: 37.773
time: 0m57.650s
(after change)
PSNR: 37.759
time: 0m54.594s

--rt --cpu-used=-3
(before change)
PSNR: 37.392
time: 0m42.865s
(after change)
PSNR: 37.375
time: 0m41.949s

Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9
2011-07-13 14:51:02 -04:00
Fritz Koenig
84c3cd79d1 Merge "Reduce motion vector search on alt-ref frame." 2011-07-13 10:07:30 -07:00
Johann
d9b825cff2 Merge "New loop filter interface" 2011-07-13 04:09:26 -07:00
John Koleszar
ffc4587a47 Merge remote branch 'origin/master' into experimental
Change-Id: I9dab62c24d71f71cdc36732ed8ed469bee67d7e1
2011-07-13 00:05:04 -04:00
John Koleszar
791ad1bb37 Merge remote branch 'internal/upstream' into HEAD 2011-07-13 00:05:03 -04:00