In summary, this commit encompasses a series of changes in an attempt to
improve the 8x8 transform based coding to help overall compression
quality. Please refer to the detailed commit history below for the
rationale underlying the series of changes:
a. A frame level flag to indicate if 8x8 transform is used at all.
b. 8x8 transform is not used for key frames and small image size.
c. On inter coded frames, macroblocks using modes B_PRED, SPLITMV
and I8X8_PRED are forced to use 4x4 transform based coding; the
rest use 8x8 transform based coding.
d. Encoder and decoder make the same assumption on the relationship
between prediction modes and transform size, therefore no signaling
is encoded in the bitstream.
e. The mode decision process now calculates the rate and distortion
scores using their respective transforms.
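
The rules in items a to d can be sketched as a single function that both
encoder and decoder would evaluate identically, which is why no per
macroblock signaling is needed. This is only an illustration: the enum
names, the function name select_tx_size() and the trimmed-down mode list
are assumptions made for the example and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down mode list for illustration only. */
typedef enum {
  DC_PRED, V_PRED, H_PRED, TM_PRED,  /* 16x16 intra modes */
  B_PRED, I8X8_PRED,                 /* sub-macroblock intra modes */
  NEARESTMV, NEARMV, ZEROMV, NEWMV,  /* 16x16 inter modes */
  SPLITMV                            /* sub-macroblock inter mode */
} mb_pred_mode;

typedef enum { TX_4X4, TX_8X8 } tx_size;

/* Derive the transform size from data both sides already have. */
static tx_size select_tx_size(bool is_key_frame, bool frame_allows_8x8,
                              mb_pred_mode mode) {
  /* Items a and b: frame level flag off, or key frame, means 4x4. */
  if (is_key_frame || !frame_allows_8x8) return TX_4X4;
  /* Item c: sub-macroblock prediction modes are forced to 4x4. */
  if (mode == B_PRED || mode == I8X8_PRED || mode == SPLITMV) return TX_4X4;
  /* Everything else uses the 8x8 transform. */
  return TX_8X8;
}
```

Because the result depends only on the frame type, the frame level flag
and the prediction mode, the only bitstream cost is the one flag per
frame.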
Overall test results:
1. HD set
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120206.html
(avg psnr: 3.09% glb psnr: 3.22%, ssim: 3.90%)
2. Cif set:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120206.html
(avg psnr: -0.03%, glb psnr: -0.02%, ssim: -0.04%)
It should be noted here that, as 8x8 transform coding itself is disabled
for cif size clips, the 0.03% loss is purely from the 1 bit/frame
flag overhead that signals whether 8x8 transform is used for the frame.
---patch history for future reference---
Patch 1:
this commit tries to select transform size based on macroblock
prediction mode. If the size of a prediction mode is 16x16, then
the macroblock is forced to use 8x8 transform. If the prediction
mode is B_PRED, SPLITMV or I8X8_PRED, then the macroblock is forced
to use 4x4 transform. Tests on the following HD clips showed mixed
results: (all hd clips only used first 100 frames in the test)
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8.html
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_log.html
while the results are mixed and overall negative, it is interesting to
see 8x8 helped a few of the clips.
Patch 2:
this patch tries to hard-wire selection of transform size based on
prediction modes without using segmentation to signal the transform size.
Encoder and decoder both make the same assumption that all macroblocks
use 8x8 transform except when the prediction mode is B_PRED, I8X8_PRED or
SPLITMV. Test results are as follows:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cifmodebase8x8_0125.html
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdmodebased8x8_0125log.html
Interestingly, by removing the overhead of coding the segmentation, the
results on this limited HD set have turned positive on average.
Patch 3:
this patch disabled the usage of 8x8 transform on key frames, and kept the
logic from patch 2 for inter frames only. test results on HD set turned
decidedly positive with 8x8 transform enabled on inter frame with 16x16
prediction modes: (avg psnr: .81% glb psnr: .82 ssim: .55%)
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hdintermode8x8_0125.html
Results on the cif set are still negative overall.
Patch 4:
continued from last patch, but now in mode decision process, the rate and
distortion estimates are computed based on 8x8 transform results for MBs
with modes associated with 8x8 transform. This patch also fixed a problem
related to segment based eob coding when 8x8 transform is used. The patch
significantly improved the results on HD clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/hd8x8RDintermode.html
(avg psnr: 2.70% glb psnr: 2.76% ssim: 3.34%)
results on cif also improved, though they are still negative compared to
baseline that uses 4x4 transform only:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif8x8RDintermode.html
(avg psnr: -.78% glb psnr: -.86% ssim: -.19%)
Patch 5:
This patch does 3 things:
a. a bunch of decoder bug fixes, encodings and decodings were verified
to have matched recon buffer on a number of encodes on cif size mobile and
hd version of _pedestrian.
b. the patch further improved the rate distortion calculation of MBs that
use 8x8 transform. This provided some further gain on compression.
c. the patch also got the experimental work SEG_LVL_EOB to work with 8x8
transformed macroblocks; test results indicate it improves the cif set
but hurts the HD set slightly.
Test results on HD clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/HD_t8x8_20120201.html
(avg psnr: 3.19% glb psnr: 3.30% ssim: 3.93%)
Test results on cif clips:
http://www.corp.google.com/~yaowu/no_crawl/t8x8/cif_t8x8_20120201.html
(avg psnr: -.47% glb psnr: -.51% ssim: +.28%)
Patch 6:
Added a frame level flag to indicate if 8x8 transform is allowed at all.
Temporarily the decision is based on frame size; it can be optimized
later on. This gets the cif results to basically unchanged, with one bit
per frame overhead on both cif and hd clips.
Patch 8:
Rebase and Merge to head by PGW.
Fixed some suspect 4s that look like they should be 64s in regard
to segmented EOB. Perhaps #defines would be better.
Built and tested without T8x8 enabled and produces unchanged
output.
Patch 9:
Corrected misaligned code/decode of "txfm_mode" bit.
Limited testing for correct encode and decode with
T8x8 configured on derf clips.
Change-Id: I156e1405d25f81579d579dff8ab9af53944ec49c
Merged in most of the current common prediction changes
that were under the #if CONFIG_COMPRED option.
Change-Id: If4e6f61dbe7b86dd449f6effbe93b5eb7e893885
Further changes to make experiments with the context
used for coding the dual pred flag easier.
Current best performing method tested on derf is a two
element context based on reference frame. I also tried
various combinations of mode and reference frame as
shown in commented out case using up to 6 contexts.
Derf +0.26 overall psnr +0.15% ssim vs original method.
Change-Id: I64c21ddec0abbb27feaaeaa1da2e9f164ebaca03
This commit merges the NEWNEAR experiment such that it
is effectively always on.
The fact that there were changes in the threading code again
highlights the need to strip out such features during the
bitstream development phase as trying to maintain this code
(especially as it is not being tested) slows the development cycle.
Change-Id: I8b34950a1333231ced9928aa11cd6d6459984b65
Initial modifications to make limited use of common prediction
functions.
The only functional change thus far is that updates to the probabilities are
no longer "damped". This was a testing convenience but in fact seems to
help by a little over 0.1% over the derf set.
Change-Id: I8b82907d9d6b6a4a075728b60b31ce93392a5f2e
Trial of a modified prediction function that ranks each possible
reference frame based on a combination of local usage and
frame level probability. The code is a bit cleaner and simpler.
In direct comparison with old unpredicted method with segment level
coding turned off for mode,ref & EOB the prediction gives a gain on derf
of around 0.4%. There is some further gain from bug fixes over earlier code.
With segment coding on the prediction method is slightly -ve on some very
easy clips (at low rates) due to slightly higher overheads, but better on harder
clips. Overall neutral on derf in direct comparison on latest code base, but
compared to earlier code without bug fixes about +0.7% overall psnr
+0.3% SSIM.
Change-Id: I5b8474658b208134d352d24f6517f25795490789
Extended prediction and coding of reference frame where
a subset of options are flagged as available at the segment level.
Updated copyright notices.
Switch to SAD in mbgraph code as SATD is problematic for the
foreground and background separation as it can ignore large DC shifts.
Change-Id: I661dbbb2f94f3ec0f96bb928c1655e5e415a7de1
As a precursor to encoding 32x32 blocks this CL adds the
ability to encode the frame one superblock (=32x32 block) at
a time. Within a SB the 4 individual MBs are encoded in
raster-order (NW,NE,SW,SE).
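
A minimal sketch of the visiting order described above, assuming a frame
measured in 16x16 macroblocks and a superblock covering a 2x2 group of
them. The function name sb_visit_order() and the handling of frame edges
(skipping MBs that fall outside the frame) are assumptions made for the
example, not details taken from the change.

```c
#include <stddef.h>

/* Record the (row, col) of every macroblock in superblock order.
 * out_rows/out_cols must hold at least mb_rows * mb_cols entries.
 * Returns the number of macroblocks visited. */
static size_t sb_visit_order(int mb_rows, int mb_cols,
                             int *out_rows, int *out_cols) {
  size_t n = 0;
  for (int sb_row = 0; sb_row < mb_rows; sb_row += 2) {
    for (int sb_col = 0; sb_col < mb_cols; sb_col += 2) {
      /* i = 0..3 maps to NW, NE, SW, SE within the superblock. */
      for (int i = 0; i < 4; ++i) {
        int r = sb_row + (i >> 1);
        int c = sb_col + (i & 1);
        /* Assumption: MBs outside the frame are simply skipped. */
        if (r >= mb_rows || c >= mb_cols) continue;
        out_rows[n] = r;
        out_cols[n] = c;
        ++n;
      }
    }
  }
  return n;
}
```

For a frame of 2 rows by 4 columns of MBs this visits (0,0), (0,1),
(1,0), (1,1) and then (0,2), (0,3), (1,2), (1,3), rather than finishing
the whole top row first.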
This functionality is added as an experiment which can be
enabled by specifying --enable-superblocks in the
command line passed to configure (CONFIG_SUPERBLOCKS
macro in the code).
To make this work I had to disable the two intra
prediction modes that use data from the top-right of the
MB.
On the tests that I have run the results produce
almost exactly the same PSNRs & SSIMs with a very
slightly higher average data rate (and slightly higher
data rate than just disabling the two intra modes in
the original code).
NOTE: This will also break the multi-threaded code.
This replaces the abandoned change:
Iebebe0d1a50ce8c15c79862c537b765a2f67e162
Change-Id: I1bc1a00f236abc1a373c7210d756e25f970fcad8
The bug was introduced by the commit that added I8X8 intra prediction
mode for inter frames: the decoder was not updated to accept the
additional probability update from the encoder. This typically causes
the decoder to crash when the encoder sends an intra mode probability
update.
Change-Id: Ib7dc42dc77a51178aa9ece41e081829818a25016
This check in uses the common prediction interface functions
to code reference frame.
Some updates made regarding the impact of the new code in rd loop
but there remain TODOs in this regard.
Change-Id: I9da3ed5dfdaa489e0903ab33258b0767a585567f
This does not change any functionality just modifies the code to
use the common prediction module interface for coding
the segment data.
Change-Id: Ifd43e9153573365619774a4f5572215e44fb5aa3
Added code to support 256 index steps instead of 128 but disabled for now.
Replace hard wired table vp8cx_base_skip_false_prob[128]
Observed Qindex problem with setting minimum loop filter value.
(Experiment code using real Q in place but for now just returning 0. This has a big
beneficial effect on some clips, particularly waterfall which shows 5% ssim gain)
Change-Id: I2f7117de8adc1797164c106aa13effc900a1467e
Whilst the encoder explicitly set the segment_id to 0
when segmentation is disabled, the decoder would allow
the segment_id to persist from the previous frame.
This fix attempts to make the decoder behave the same
as the encoder by explicitly setting the segment_id to
0 in this case.
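
The intent of the fix can be sketched as below. The structure and
function names are hypothetical stand-ins for illustration, not the
decoder's actual types; the point is only that stale segment ids are
cleared when segmentation is off.

```c
/* Hypothetical stand-in for the per-macroblock mode info. */
typedef struct {
  unsigned char segment_id;
} demo_mb_info;

/* When segmentation is disabled, force every segment_id to 0 so that
 * values from the previous frame cannot leak into this one. */
static void demo_reset_segment_ids(demo_mb_info *mi, int count,
                                   int segmentation_enabled) {
  if (segmentation_enabled) return; /* ids are read from the bitstream */
  for (int i = 0; i < count; ++i) mi[i].segment_id = 0;
}
```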
Change-Id: I65c3a05247550edb10706eb5d54d306dfb792309
Both encoder & decoder were using mb_cols to
offset from one row of MODE_INFO structures to the next
when they should have been using mode_info_stride.
Fixing this in both encoder and decoder gives around
a 3KB size saving and 0.025dB PSNR improvement on the one
720P clip I tried.
(Also removed "index" which was being updated but not used)
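
To illustrate why the two offsets differ, here is a small sketch. It
assumes the MODE_INFO array is allocated with a row stride larger than
mb_cols (for example because of a border column); the type and function
names are hypothetical.

```c
/* Hypothetical stand-in for MODE_INFO. */
typedef struct {
  int mode;
} demo_mode_info;

/* Address of the entry for (mb_row, mb_col). Rows are mode_info_stride
 * entries apart, which is not necessarily mb_cols. */
static demo_mode_info *demo_mi_at(demo_mode_info *base,
                                  int mode_info_stride,
                                  int mb_row, int mb_col) {
  return base + mb_row * mode_info_stride + mb_col;
}
```

With mb_cols = 4 and a stride of 5, stepping by mb_cols would land one
entry short per row, so neighbouring-MB lookups would read the wrong
macroblock's data.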
Change-Id: I413bea802b142886bfcf8d8aa7f5a2f0c524fd4b
Previously, Y-adaptive UV intra coding was only enabled on key frames in
the UVINTRA experiment. This commit enabled the same coding for inter
frames, so the encoding of UV intra modes is consistent across all
frame types. Tests on derf set showed a very small overall gain around
.04%:
http://www.corp.google.com/~yaowu/no_crawl/interUVintra.html
The gain looks to be reasonable given intra coded MBs are only a
small portion of MBs in inter frames.
Change-Id: Ic6fc261923f2c253f4a0c9f8bccf4797557b9e16
A previous commit 76feb965 made the vp8_mode_context adaptive on a frame
by frame basis; this commit further made the coding context adaptive to two
frame types separately. Tests on derf set showed a further small gain on
all metrics: avg psnr 0.10%, glb psnr: 0.11%, ssim: 0.08%
http://www.corp.google.com/~yaowu/no_crawl/newNearMode_1209.html
Change-Id: I7b3e32ec8729de1903d14a3f1213f1624b78cdee
This commit removed the macro CONFIG_MULCONTEXT, which was used to
indicate the experiment code for using separate context for altref
and normal frames. This commit made the change fully merged in.
Change-Id: I525f927f68e2365d37b340ef23b836a136a4f70b
This commit removed the macro CONFIG_I8X8, which was used to indicate
the 8x8 intra prediction experiment, made the change fully merged in.
Change-Id: Iafa4443781ce6e83f5591c12ba615a0e92ce0ea0
vp8_mode_contexts[] is an entropy table used to code inter mode
choices. It was a fixed constant table. This commit made the entropy
context adaptive. Tests on derf set showed very good consistent gains
on all metrics: avg psnr .47%, overall psnr .46% and ssim .40%.
http://www.corp.google.com/~yaowu/no_crawl/newModeContext.html
Change-Id: Ia62b14485c948e2b74586118619c5eb2068b43b2
This patch introduces the concept of dual inter16x16 prediction. A
16x16 inter-predicted macroblock can use 2 references instead of 1,
where both references use the same mvmode (new, near/est, zero). In the
case of newmv, this means that two MVs are coded instead of one. The
frame can be encoded in 3 ways: all MBs single-prediction, all MBs dual
prediction, or per-MB single/dual prediction selection ("hybrid"), in
which case a single bit is coded per-MB to indicate whether the MB uses
single or dual inter prediction.
In the future, we can (maybe?) get further gains by mixing this with
Adrian's 32x32 work, per-segment dual prediction settings, or adding
support for dual splitmv/8x8mv inter prediction.
Gain (on derf-set, CQ mode) is ~2.8% (SSIM) or ~3.6% (glb PSNR). Most
gain is at medium/high bitrates, but there are minor gains at low bitrates
also. Output was confirmed to match between encoder and decoder.
Note for optimization people: this patch introduces a 2nd version of
16x16/8x8 sixtap/bilin functions, which does an avg instead of a
store. They may want to look and make sure this is implemented to
their satisfaction so we can optimize it best in the future.
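
As a rough illustration of "avg instead of store": the second
prediction is combined with what is already in the destination rather
than overwriting it. The sketch below is a plain C reference, not the
patch's filter code; the function name and the round-to-nearest
averaging are assumptions.

```c
#include <stdint.h>

/* Average a second prediction into dst, which already holds the first
 * prediction. Assumption: rounding is (a + b + 1) >> 1. */
static void demo_avg_block(uint8_t *dst, int dst_stride,
                           const uint8_t *second, int second_stride,
                           int width, int height) {
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      dst[c] = (uint8_t)((dst[c] + second[c] + 1) >> 1);
    }
    dst += dst_stride;
    second += second_stride;
  }
}
```

A store variant would write second[c] directly; only the final write
differs, which is why a second set of 16x16/8x8 functions is needed.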
Change-ID: I59dc84b07cbb3ccf073ac0f756d03d294cb19281
This commit added code to keep track of separate entropy contexts for
normal frames and alt ref frames. The underlying assumption was that the
two types of frames have different entropy characteristics given they
typically have quite different quantization levels. By keeping entropy
contexts separate, it helps the entropy context distribution to be more
closely adapted to each frame type.
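
A minimal sketch of the idea, assuming one saved context per frame type
and a selector that picks which one to load and adapt. All names and the
contents of the context struct are hypothetical.

```c
#include <stdint.h>

/* Hypothetical entropy context: a handful of probabilities. */
typedef struct {
  uint8_t mode_probs[4];
} demo_entropy_ctx;

/* One independently adapted context per frame type. */
typedef struct {
  demo_entropy_ctx normal;
  demo_entropy_ctx altref;
} demo_frame_contexts;

/* Pick the context matching the current frame type. Updates made while
 * coding an alt ref frame never disturb the normal frame context. */
static demo_entropy_ctx *demo_select_ctx(demo_frame_contexts *fc,
                                         int is_altref_frame) {
  return is_altref_frame ? &fc->altref : &fc->normal;
}
```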
Tests on derf set showed a good and very consistent gain on all clips
on all metrics, avg psnr: 0.89%, overall psnr: 0.84% and ssim 0.93%.
http://www.corp.google.com/~yaowu/no_crawl/mulcontext.html
Change-Id: I15bc9697f6ff7829042911fe0c62930585d7e65d
This commit enabled the usage of 8x8 intra prediction modes on inter
frames. There are a few TODO items related to this: 1) baseline entropy
needs to be calibrated; 2) cost of UV needs to be done more properly
rather than using a decision relying only on Y; 3) the threshold for
allowing picking 8x8 intra prediction should be lowered to below that of
B_PRED.
Even with all the TODOs, tests showed consistent gain on derf set ~0.1%
(PSNR:0.08% and SSIM:0.14%). It is assumed that 8x8 intra prediction
will help more on large resolution clips, especially with above TODOs
addressed.
Change-Id: I398ada49dfc32575cfab962a569c2885111ae3ba
Fixed some further QIndex related issues and replaced some tables
(eg zbin and rounding)
Also added a function (currently disabled by default) to populate the
main AC and DC quantizer tables. Using the original AC range the
resulting computed DC values give behavior broadly comparable
on the DERF set. That is not to say that the equations will hold good
over a more extended range. The purpose of this code is to make it
easier to experiment with further alterations to the Q range and distribution
of Q values plus the relative weights given to AC and DC.
The function find_fp_qindex() ensures that changes to the Q tables
are reflected in the value passed in to the first pass code.
Slight experimental adjustment to static segment Q offset.
Change-Id: I36186267d55dfc2a3d565d0cff7218ef300d1cd5
This commit adds a variable to the macroblock level mode
info structure to track the transform size used in each MB, so
the information can be used later in the loop filter to change
how loop filter works on MBs with different transform sizes.
Change-Id: Id0eeaba6cc854c6d1be00ed8d237b3d9e250e447
This is an experiment to include a mv contribution from last frame to
nearest and near mv definition. Initial test showed some small though
consistent gain.
The latest patch gives a slightly better result, ~.13% to ~.18%.
TODO: the entropy used to encode the mode choice, i.e. the mv counts
based conditional distribution of modes should be re-collected to
reflect this change, it is expected that there is some further gain
from that.
Change-Id: Ief1e284a36d8aa56b49ae5b360c91419ec494fa4
Fix decoder segmentation bug for temporal coding where the segment map
was first initialized on a key frame.
In vp8_kfread_modes(), after reading the segment id it must be written to
the pbi->segmentation_map[] for use in temporal coding on subsequent frames.
Change-Id: I1489305efc376564e734a216f69c2844646ee3d3
Removal of CONFIGURE_SEGMENTATION ifdefs.
Removal of legacy support code for the old coding mechanism.
Use local reference "xd" for MACROBLOCKD structure in
encode_frame_to_data_rate()
Moved call to choose_segmap_coding_method() out of encode
loop as the cost of segmentation is not properly accounted
in the loop anyway. If this is desirable in the future it
can be moved back. The use of this function to do all the
analysis and set the probabilities also removes the need
to track segment usage in threading code.
Change-Id: I85bc8fd63440e7176c73d26cb742698f9b70cade
Changed name and sense of segment_flag to "seg_id_predicted"
Added some additional comments and retested.
I also did some experimentation with a spatial prediction option
using a similar strategy to the temporal mode implemented.
This helps in some cases where temporal prediction is bad but
I suspect there is more overlap here with work on a larger scale
block structure and spatial correlation will likely be better
handled through that mechanism.
Next check in will remove #ifdefs and legacy mode code.
Change-Id: I3b382b65ed2a57bd7775ac0f3a01a9508a209cbc
This check in includes quite a lot of clean up and refactoring.
Most of the analysis and set up for the different coding options for the
segment map (currently simple distribution based coding or temporally
predicted coding), has been moved to one location (the function
choose_segmap_coding_method() in segmentation.c). This code was previously
scattered around in various locations making integration with other
experiments and modification / debug more difficult.
Currently the functionality is as it was with the exception that the
prediction probabilities are now only transmitted when the temporal
prediction mode is selected.
There is still quite a bit more clean up work that will be possible
when the #ifdef is removed. Also at that time I may rename and alter
the sense of macroblock based variable "segment_flag" which indicates
(1 that the segment id is not predicted vs 0 that it is predicted).
I also intend to experiment with a spatial prediction mode that can be
used when coding a key frame segment map or in cases where temporal
prediction does not work well but there is spatial correlation.
In a later check in when the ifdefs have gone I may also move the call
to choose_segmap_coding_method() to just before where the bitstream is
packed (currently it is in vp8_encode_frame()) to further reduce the
possibility of clashes with other experiments and prevent it being called
on each iteration of the recode loop.
Change-Id: I3d4aba2a2826ec21f367678d5b07c1d1c36db168
Some initial cleanup to aid testing and debug.
Pull code to choose temporal or spatial encoding
out of encodeframe.c into a dedicated function
in segmentation.c.
For now disable broken temporal mode.
Move the coding of "temporal_update" flag and
only transmit if segment map update is indicated.
Rename the functions read_mb_features() and
write_mb_features() to read_mb_segid() and
write_mb_segid() as they only read and write
the macroblock segment id not any of the
features.
Change-Id: Ib75118520b1144c24d35fdfc6ce46106803cabcf
The dequantizer functions for 2nd order haar block had confusing 8x8
in their names. This commit fixed their names to avoid confusion.
Change-Id: I6ae4e7888330865f831436313637d4395b1fc273
This commit added scaling factors to 8x8 transform, quant, dequant and
inverse transform pipeline to make the 8x8 transform work when configured
with enable-extend_qrange. This commit also disabled the trellis-quant
when extend_qrange is configured.
Change-Id: Icfb3192e4746f70a4bb35ad18b7b47705b657e52
Updated the decode_macroblock logic to reflect that 8x8 transform is
not used for "SPLITMV". Also fixed an issue where the 2nd order haar
block had the wrong dequant/idct process.
Change-Id: I1e373f6535c009dfec503b6362c8a5cfc196e1da
Initial attempt at using new segment feature signaling
to indicate 4x4 or 8x8 transform.
needs --enable-experimental --enable-t8x8
Note this is work in progress.
Change-Id: Ib160d46a5d810307bfcbc79853ce1a65b5b870b7
No change to functionality or output.
Updates to the segment feature data structure now all done
through functions such as set_segdata() and get_segdata()
in seg_common.c.
The reason for this is to make changing the structures (if needed)
and debug easier.
In addition it provides a single location for subsequent addition
of range and validity checks. For example valid combination of
mode and reference frame.
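
A sketch of what accessor-style access to the segment feature data
could look like. The real set_segdata() and get_segdata() signatures
are not given here, so the demo_ prefixed names, the struct layout, the
array sizes and the bool return values are all assumptions for
illustration.

```c
#include <stdbool.h>

#define DEMO_MAX_SEGMENTS 4 /* hypothetical */
#define DEMO_SEG_FEATURES 4 /* hypothetical */

typedef struct {
  int data[DEMO_MAX_SEGMENTS][DEMO_SEG_FEATURES];
} demo_seg_state;

/* Single place for range checks; validity checks could be added here. */
static bool demo_seg_index_valid(int segment_id, int feature_id) {
  return segment_id >= 0 && segment_id < DEMO_MAX_SEGMENTS &&
         feature_id >= 0 && feature_id < DEMO_SEG_FEATURES;
}

static bool demo_set_segdata(demo_seg_state *s, int segment_id,
                             int feature_id, int value) {
  if (!demo_seg_index_valid(segment_id, feature_id)) return false;
  s->data[segment_id][feature_id] = value;
  return true;
}

static bool demo_get_segdata(const demo_seg_state *s, int segment_id,
                             int feature_id, int *value) {
  if (!demo_seg_index_valid(segment_id, feature_id)) return false;
  *value = s->data[segment_id][feature_id];
  return true;
}
```

Since callers never index the array directly, the layout of the struct
can change without touching them.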
Change-Id: I2e866505562db4e4cb6f17a472b25b4465f01add
This commit tries to do UV intra mode coding adaptive to Y intra mode.
Entropy context is defined as conditional PDF of uv intra mode given
the Y mode. All constants are normalized to 256 so that they fit in 8 bits.
This provides further coding efficiency beyond the quantizer adaptive
y intra mode coding. Consistent gains were observed on all clips and
all bit rates for HD all key encoding tests.
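
A minimal sketch of turning conditional counts into 8 bit probabilities
on a scale of 256. It models only one binary decision per Y mode, while
a real mode tree has several; the function names, the number of Y modes,
the rounding and the clamp to 1..255 are assumptions, not values taken
from the commit.

```c
#include <stdint.h>

#define NUM_Y_MODES 5 /* hypothetical number of Y intra modes */

/* Probability of branch 0 on a 0..256 scale, clamped to fit a byte and
 * to avoid a zero probability. Returns 128 when there is no data. */
static uint8_t prob_from_counts(uint32_t count0, uint32_t count1) {
  uint64_t total = (uint64_t)count0 + count1;
  uint64_t p;
  if (total == 0) return 128;
  p = ((uint64_t)count0 * 256 + total / 2) / total;
  if (p < 1) p = 1;
  if (p > 255) p = 255;
  return (uint8_t)p;
}

/* One probability per Y mode: P(uv branch 0 | Y mode). */
static void build_uv_probs(uint32_t counts[NUM_Y_MODES][2],
                           uint8_t probs[NUM_Y_MODES]) {
  for (int y = 0; y < NUM_Y_MODES; ++y) {
    probs[y] = prob_from_counts(counts[y][0], counts[y][1]);
  }
}
```

The coder would then pick the probability row by the already decoded Y
mode before coding the UV mode.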
To test, configure with
--enable-experimental --enable-uvintra
Change-Id: I2d78d73f143127f063e19bd0bac3b68c418d756a
Removal of configure #ifdefs so that segment features
are always available. Removal of code supporting old
segment feature method.
Still a good deal of tidying up to do.
Change-Id: I397855f086f8c09ab1fae0a5f65d9e06d2e3e39f