Most of these were picked up in the previous commit (prefix change from
vp8_ to vp9_), but I'm pushing this separately so that it's easier to
review.
Change-Id: I91e959895778b8632d7d33375523df8a7568a490
There is a macro DEFAULT_INTERP_FILTER defined in encoder/onyx_if.c that
is set as EIGHTTAP for now - so SWITCHABLE is not really used. Ideally,
this should be SWITCHABLE but that would make the encoder quite a bit slower.
We will change the default filter to SWITCHABLE once we find a faster way to
search for switchable filters.
Change-Id: Iee91832cdc07e6e14108d9b543130fdd12fc9874
Results: derf (vanilla or +hybridtx) +0.2% and (+hybrid16x16
or +tx16x16) +0.7%-0.8%; HD (vanilla or +hybridtx) +0.1-0.2%
and (+hybrid16x16 or +tx16x16) +1.4%, STD/HD (vanilla or +hybridtx)
about even, and (+hybrid16x16 or +tx16x16) +0.8-1.0%.
Change-Id: I03899e2f7a64e725a863f32e55366035ba77aa62
Separates the entropy coding context models for 4x4, 8x8 and 16x16
ADST variants.
There is a small improvement for HD (hd/std-hd) by about 0.1-0.2%.
Results on derf/yt are about the same, probably because there is not
enough statistics.
Results may improve somewhat once the initial probability tables are
updated for the hybrid transforms which is coming soon.
Change-Id: Ic7c0c62dacc68ef551054fdb575be8b8507d32a8
Adds a new experiment with redesigned/refactored motion vector entropy
coding. The patch also takes a first step towards separating the
integer and fractional pel components of a MV. However the fractional
pel encoding still depends on the integer pel part and so they are
not fully independent. Further experiments are in progress to see
how much they can be decoupled without affecting performance.
All components including entropy coding/decoding, costing for MV
search, forward updates and backward updates to probability tables,
have been implemented.
Results so far:
derf: +0.19%
std-hd: +0.28%
yt: +0.80%
hd: +1.15%
Patch: Simplifies the fractional pel models:
derf: +0.284%
std-hd: +0.289%
yt: +0.849%
hd: +1.254%
Patch: Some changes in the models, rebased.
derf: +0.330%
std-hd: +0.306%
yt: +0.816%
hd: +1.225%
Change-Id: I646b3c48f3587f4cc909639b78c3798da6402678
Enable ADST/DCT of dimension 16x16 for I16X16 modes. This change provides
benefits mostly for hd sequences.
Set up the framework for selectable transform dimension.
Also allowing quantization parameter threshold to control the use
of hybrid transform (This is currently disabled by setting threshold
always above the quantization parameter. Adaptive thresholding can
be built upon this, which will further improve the coding performance.)
The coding performance gains (with respect to the codec that has all
other configuration settings turned on) are
derf: 0.013
yt: 0.086
hd: 0.198
std-hd: 0.501
Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926
Merged in the high_precision_mv experiment to make it easier
to work on new mv encoding strategies. Also removed
coef_update_probs3().
Change-Id: I82d3b0bb642419fe05dba82528bc9ba010e90924
Set on all 16x16 intra/inter modes
Features:
- Butterfly fDCT/iDCT
- Loop filter does not filter internal edges with 16x16
- Optimize coefficient function
- Update coefficient probability function
- RD
- Entropy stats
- 16x16 is a config option
Have not tested with experiments.
hd: 2.60%
std-hd: 2.43%
yt: 1.32%
derf: 0.60%
Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
Allows for swtiching/setting interpolation filters at the MB
level. A frame level flag indicates whether to use a specifc
filter for the entire frame or to signal the interpolation
filter for each MB. When switchable filters are used, the
encoder chooses between 8-tap and 8-tap sharp filters. The
code currently has options to explore other variations as well,
which will be cleaned up subsequently.
One issue with the framework is that encoding is slow. I
tried to do some tricks to speed things up but it is still slow.
Decoding speed should not be affected since the number of
filter taps remain unchanged.
With the current version, we are up 0.5% on derf on average but
some videos city/mobile improve by close to 4 and 2% respectively.
If we did a full-search by turning the SEARCH_BEST_FILTER flag
on, the results are somewhat better.
The framework can be combined with filtered prediction, and I
seek feedback regarding that.
Rebased.
Change-Id: I8f632cb2c111e76284140a2bd480945d6d42b77a
This commit adds lossless compression capability to the experimental
branch. The lossless experiment can be enabled using --enable-lossless
in configure. When the experiment is enabled, the encoder will use
lossless compression mode by command line option --lossless, and the
decoder automatically recognizes a losslessly encoded clip and decodes
accordingly.
To achieve the lossless coding, this commit has changed the following:
1. To encode at lossless mode, encoder forces the use of unit
quantizer, i.e, Q 0, where effective quantization is 1. Encoder also
disables the usage of 8x8 transform and allows only 4x4 transform;
2. At Q 0, the first order 4x4 DCT/IDCT have been switched over
to a pair of forward and inverse Walsh-Hadamard Transform
(http://goo.gl/EIsfy), with proper scaling applied to match the range
of the original 4x4 DCT/IDCT pair;
3. At Q 0, the second order remains to use the previous
walsh-hadamard transform pair. However, to maintain the reversibility
in second order transform at Q 0, scaling down is applied to first
order DC coefficients prior to forward transform, and scaling up is
applied to the second order output prior to quantization. Symmetric
upscaling and downscaling are added around inverse second order
transform;
4. At lossless mode, encoder also disables a number of minor
features to ensure no loss is introduced, these features includes:
a. Trellis quantization optimization
b. Loop filtering
c. Aggressive zero-binning, rounding and zero-bin boosting
d. Mode based zero-bin boosting
Lossless coding test was performed on all clips within the derf set,
to verify that the commit has achieved lossless compression for all
clips. The average compression ratio is around 2.57 to 1.
(http://goo.gl/dEShs)
Change-Id: Ia3aba7dd09df40dd590f93b9aba134defbc64e34
Added the ability to optionally filter the prediction data
when inter modes are selected (excludes SPLITMV, for now).
The mode selection loop considers both the filtered and
non-filtered prediction data when choosing mode. The filter
can be turned on/off at the frame-level, or signaled for
each MB.
Change-Id: I1b783c71d95a361ab36c761b07e8a6b06bc36822
Incorporates mv_ref, mbsplit and second_mv into the adaptive
entropy framework. The mv_ref framework has been modified from
before.
Adds some clean-ups and fixes.
Results with the adaptive entropy experiment are currently up by
+1.93% on derf; +2.33% std-hd and +1.87% yt-hd.
Fixed a nasty intermittent bug.
Change-Id: I4b1ac9f9483b48432597595195bfec05f31d1e39
This patch incorporates adaptive entropy coding of coefficient tokens,
and mode/mv information based on distributions encountered in a frame.
Specifically, there is an initial forward update to the probabilities
in the bitstream as before for coding the symbols in the frame, however
at the end of decoding each frame, the forward update to the
probabilities is reverted and instead the probabilities are updated
towards the actual distributions encountered within the frame.
The amount of update is weighted by the number of hits within each
context.
Results on derf/hd/std-hd are all up by 1.6%.
On derf, the most of the gains come from coefficients, however for the
hd and std-hd sets, the most of the gains come from the mode/mv
information updates.
Change-Id: I708c0e11fdacafee04940fe7ae159ba6844005fd
This commit changed to enable the usage 8x8 transform for all frame
type, all resolution and all quantizer range. This has an overall
benefit .2% to .3% in term of compression, but more importantly,
the difficult clips benefits much more, up to 2% to 3% on clips
like football, harbour and so on.
We observed some weird humps on very high end on a couple of youtube
clips, but have determined the underly cause was the aggressive zbin
having an effect of lowering rate with lower quality, which have
an impact on slide show clips around 60DB.
The commit does not change the association between prediction mode
and transform size.
Change-Id: I33043bdce6207528ae00b4a4b26d8ff63cfea1f4
These contexts need to be saved and restored for recode, otherwise
encoder/decoder mismatch happens for some clips (eg._mobcal 720p)
Change-Id: Ic65cfa0bf56ed0472ecab962ce31394d59d344bf
Some code re-factored / moved to allow the main
pack operation inside the recode loop so that the
size estimate is accurate.
Deletion of some redundant code relating to one pass.
Aproximate improvement over March 27 code base:
Derf 0.0%, YT 0.5%, YThd 0.3% Std_hd 0.25%
Change-Id: Id2d071794ab44f0b52935f6fcdb5733d09a6bb86
When ac_yquant>171, a key frame is enabled to use 8x8 transform. In
such case, MBs with DC_PRED or TM_PRED are selected to use T8x8. This
change helped the full STD-HD set by ~.1% or so, which is reasonable
considering how often key frame occurs in these encodings.
Change-Id: Id17009ef6327252177b19e6bf0d6628827febaf1
The recoding loop save and restore frame coding context for recodes.
However in recoding of key frames, some of the coding context saved
was stale from last encoded inter frame. The save/restore sometimes
overwrites the re-inintialized coding context with saved context
from last frame, resulting in encoder/decoder mismatch
Change-Id: I354ae2f71074d142602d51d06544c05a2462caaf
This is the first patch for refactoring of the code related to
high-precision mv, so that 1/4 and 1/8 pel motion vectors can
co-exist in the same bit-stream by use of a frame level flag.
The current patch works fine for only use of 1/4th and
only use of 1/8th pel mv, but there are some issues with the
mode switching in between. Subsequent patches on this change Id
will fix the remaining issues.
Patch 2: Adds fixes to make sure that multiple mv precisions can
co-exist in the bit-stream. Frame level switching has been tested
to work correctly.
Patch 3: Fixes lines exceeding 80 char
Patch 4:
http://www.corp.google.com/~debargha/vp8_results/enhinterp.html
Results on derf after ssse3 bugfix, compared to everything
enabled but the 8-tap, 1/8-subpel and 1/16-subpel uv. Overall the
gains are about 3% now. Hopefully there are no more bugs lingering.
Apparently the sse3 bug affected the quartel subpel results more than
the eighth pel ones (which is understandabale because one bad predictor
due to the bug, matters less if there are a lot more subpel options
available as in the 1/8 subpel case).
The results in the 4th column correspond to the current settings.
The first two columns correspond to two settings of adaptive switching
of the 1/4 or 1/8 subpel mode based on initial Q estimate. These
do not work as good as just using 1/8 all the time yet.
Change-Id: I3ef392ad338329f4d68a85257a49f2b14f3af472
The commit overall on derf test is break even to very slightly positive
comparing to all 4x4 transform.
Change-Id: I2a7c19599aa54c2d3a5b35db0dc891ba8a6a2b26
For now the interface elements have been left in place
to make sure existing parameter files work but parameters
relating to drop frame wont do anything.
Change-Id: I579ee614726387381c546845dac4bc03c74c6a07
Removal of the pickinter.c and .h files and calls to this
code.
Removal of some code relating to real time and one pass
settings though there is more to be done in this regard.
However, vp8_set_speed_features() now
only supports modes 0 and 1 and speeds up to 3
so rd should always be set.
Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8
In summary, this commit encompasses a series of changes in attempt to
improve the 8x8 transform based coding to help overall compression
quality, please refer to the detailed commit history below for what
are the rationale underly the series of changes:
a. A frame level flag to indicate if 8x8 transform is used at all.
b. 8x8 transform is not used for key frames and small image size.
c. On inter coded frame, macroblocks using modes B_PRED, SPLIT_MV
and I8X8_PRED are forced to using 4x4 transform based coding, the
rest uses 8x8 transform based coding.
d. Encoder and decoder has the same assumption on the relationship
between prediction modes and transform size, therefore no signaling
is encoded in bitstream.
e. Mode decision process now calculate the rate and distortion scores
using their respective transforms.
Overall test results:
1. HD set
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120206.html
(avg psnr: 3.09% glb psnr: 3.22%, ssim: 3.90%)
2. Cif set:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120206.html
(avg psnr: -0.03%, glb psnr: -0.02%, ssim: -0.04%)
It should be noted here, as 8x8 transform coding itself is disabled
for cif size clips, the 0.03% loss is purely from the 1 bit/frame
flag overhead on if 8x8 transform is used or not for the frame.
---patch history for future reference---
Patch 1:
this commit tries to select transform size based on macroblock
prediction mode. If the size of a prediction mode is 16x16, then
the macroblock is forced to use 8x8 transform. If the prediction
mode is B_PRED, SPLITMV or I8X8_PRED, then the macroblock is forced
to use 4x4 transform. Tests on the following HD clips showed mixed
results: (all hd clips only used first 100 frames in the test)
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8.htmlhttp://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_log.html
while the results are mixed and overall negative, it is interesting to
see 8x8 helped a few of the clips.
Patch 2:
this patch tries to hard-wire selection of transform size based on
prediction modes without using segmentation to signal the transform size.
encoder and decoder both takes the same assumption that all macroblocks
use 8x8 transform except when prediciton mode is B_PRED, I8X8_PRED or
SPLITMV. Test results are as follows:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cifmodebase8x8_0125.htmlhttp://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_0125log.html
Interestingly, by removing the overhead or coding the segmentation, the
results on this limited HD set have turn positive on average.
Patch 3:
this patch disabled the usage of 8x8 transform on key frames, and kept the
logic from patch 2 for inter frames only. test results on HD set turned
decidedly positive with 8x8 transform enabled on inter frame with 16x16
prediction modes: (avg psnr: .81% glb psnr: .82 ssim: .55%)
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdintermode8x8_0125.html
results on cif set still negative overall
Patch 4:
continued from last patch, but now in mode decision process, the rate and
distortion estimates are computed based on 8x8 transform results for MBs
with modes associated with 8x8 transform. This patch also fixed a problem
related to segment based eob coding when 8x8 transform is used. The patch
significantly improved the results on HD clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hd8x8RDintermode.html
(avg psnr: 2.70% glb psnr: 2.76% ssim: 3.34%)
results on cif also improved, though they are still negative compared to
baseline that uses 4x4 transform only:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif8x8RDintermode.html
(avg psnr: -.78% glb psnr: -.86% ssim: -.19%)
Patch 5:
This patch does 3 things:
a. a bunch of decoder bug fixes, encodings and decodings were verified
to have matched recon buffer on a number of encodes on cif size mobile and
hd version of _pedestrian.
b. the patch further improved the rate distortion calculation of MBS that
use 8x8 transform. This provided some further gain on compression.
c. the patch also got the experimental work SEG_LVL_EOB to work with 8x8
transformed macroblock, test results indicates it improves the cif set
but hurt the HD set slightly.
Tests results on HD clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120201.html
(avg psnr: 3.19% glb psnr: 3.30% ssim: 3.93%)
Test results on cif clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120201.html
(avg psnr: -.47% glb psnr: -.51% ssim: +.28%)
Patch 6:
Added a frame level flag to indicate if 8x8 transform is allowed at all.
temporarily the decision is based on frame size, can be optimized later
one. This get the cif results to basically unchanged, with one bit per
frame overhead on both cif and hd clips.
Patch 8:
Rebase and Merge to head by PGW.
Fixed some suspect 4s that look like hey should be 64s in regard
to segmented EOB. Perhaps #defines would be bette.
Bulit and tested without T8x8 enabled and produces unchanged
output.
Patch 9:
Corrected misalligned code/decode of "txfm_mode" bit.
Limited testing for correct encode and decode with
T8x8 configured on derf clips.
Change-Id: I156e1405d25f81579d579dff8ab9af53944ec49c
This commit merges the NEWNEAR experiment such that it
is effectively always on.
The fact that there were changes in the threading code again
highlights the need to strip out such features during the
bitstream development phase as trying to maintain this code
(especially as it is not being tested) slows the development cycle.
Change-Id: I8b34950a1333231ced9928aa11cd6d6459984b65
A problem can arise on static clips with force key frames where
attempts to avoid popping lead to a progressive reduction in key
frame Q that ultimately may lead to unexpected overspend against
the rate target.
The changes in this patch help to insure that in such clips the
quality of the key frames across the clip is more uniform (rather
than starting bad and getting better - especially at low target rates).
This patch also includes a fix that removes a delta on the Y2DC
when the baseline q index < 4 as this is no longer needed.
There is also a fix to try and prevent repeat single step Q adjustment in
the recode loop leading to lots of recodes, especially where the use
of forced skips as part of segmentation has made the impact of Q on
the number of bits generated much smaller.
Patch 2: Amend "last_boosted_qindex" calculation for arf overlay frames.
Change-Id: Ia1feeb79ed8ed014e4239994fcf5e58e68fd9459
Removed a couple more fixed tables for the extended quantizer experiment
that depend on QINDEX_RANGE.
Change-Id: I2c15ffc7488c2a2b8d6504e2c4b6b2339799d117
A previous commit 76feb965 made the vp8_mode_context adaptive on a frame
frame basis, this commit further made the coding context adaptive to two
frame types separately. Tests on derf set showed a further small gain on
all metrics: avg psnr 0.10%, glb psnr: 0.11%, ssim: 0.08%
http://www.corp.google.com/~yaowu/no_crawl/newNearMode_1209.html
Change-Id: I7b3e32ec8729de1903d14a3f1213f1624b78cdee
This commit removed the macro CONFIG_MULCONTEXT, which was used to
indicate the experiment code for using separate context for altref
and normal frames. This commit made the change fully merged in.
Change-Id: I525f927f68e2365d37b340ef23b836a136a4f70b
vp8_mode_contexts[] is an entropy table used to code inter mode
choices. It was a fixed constant table. This commit made the entropy
context adaptive. Tests on derf set showed very good consistent gains
on all metrics: avg psnr .47%, overall psnr .46% and ssim .40%.
http://www.corp.google.com/~yaowu/no_crawl/newModeContext.html
Change-Id: Ia62b14485c948e2b74586118619c5eb2068b43b2
The MODE_STATS macro was used to #ifdef around code for mode entropy
stats collection, this commit fixed a crash when MODE_STATS is on.
The commit also changed a number of array definitions to use defined
macros instead of hard-coded numbers.
Change-Id: I114592f53a1e44e31e455f5725f036ae6168735a
Resolved or factored out some further issues with Q index.
Put in a 3rd order polynomial instead of less accurate power function
as the best fit on gf and kf boost adjustment.
Added avg_q value to use instead of ni_av_qi.
Compute segment delta Q values based on avg_q.
Fixed bug in adjust_maxq_qrange().
The extended range Q on the derf set, using standard data rates
(which do not extend high enough to get big benefits) still show
a shortfall of between 0.5 and 1% though so there would appear to
be further issues that need to be tracked down.
Change-Id: Icfd49b9f401906ba487ef1bef7d397048295d959
This commit added code to keep track of separate entropy contexts for
normal frames and alt ref frames. The underly assumption was that the
two type of frames have different entropy characteristics given they
typically have quite different quantization levels. By keeping entropy
contexts separate, it helps the entropy context distribution to be more
closely adapted to each frame type.
Tests on derf set showed a good and very consistent gain on all clips
on all metrics, avg psnr: 0.89%, overall psnr: 0.84% and ssim 0.93%.
http://www.corp.google.com/~yaowu/no_crawl/mulcontext.html
Change-Id: I15bc9697f6ff7829042911fe0c62930585d7e65d