This commit adds a pick_sb_mode() function which selects the best 32x32
superblock coding mode. Then it selects the best per-MB modes, compares
the two and encodes that in the bitstream.
The bitstream coding is rather simplistic right now. At the SB level,
we code a bit to indicate whether this block uses SB-coding (32x32
prediction) or MB-coding (anything else), and then we follow with the
actual modes. This could and should be modified in the future, but is
omitted from this commit because it will likely involve reorganizing
much more code rather than just adding SB coding, so it's better to let
that be judged on its own merits.
Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%.
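
The SB-versus-MB decision described above can be sketched as follows. This
is only an illustration under stated assumptions: SbDecision,
choose_sb_or_mb() and the cost parameters are hypothetical names, not the
code added by this commit.

```c
#include <stdint.h>

/* Hypothetical result of the SB-vs-MB decision; not a libvpx type. */
typedef struct {
  int use_sb;      /* value of the one-bit flag coded at the SB level */
  int64_t rd_cost; /* cost of the alternative that was chosen */
} SbDecision;

/* sb_rd:        cost of the best 32x32 mode, flag bit included (assumed).
 * mb_rd:        costs of the best per-MB modes of the four 16x16 MBs.
 * mb_flag_cost: cost of signalling "MB-coding" at the SB level (assumed). */
static SbDecision choose_sb_or_mb(int64_t sb_rd, const int64_t mb_rd[4],
                                  int64_t mb_flag_cost) {
  SbDecision d;
  int64_t mb_total = mb_flag_cost;
  int i;

  for (i = 0; i < 4; ++i) mb_total += mb_rd[i];

  /* Keep whichever alternative is cheaper overall. */
  d.use_sb = sb_rd < mb_total;
  d.rd_cost = d.use_sb ? sb_rd : mb_total;
  return d;
}
```

The flag returned in use_sb corresponds to the single bit coded at the SB
level; the actual modes would follow it in the bitstream.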
Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6
Loop filter produces weird artifacts when
repeatedly applied to noisy video. This
mitigates the effect.
Change-Id: If4b1a8543912d186a486f84e11d8b01f7436fa5f
Resolved the decoder mismatch issue due to quantization parameter
threshold for hybrid transform coding. The macroblock dequantizer
initialization is moved to be performed before coefficient
detokenization, since the (de)tokenization is now dependent on the
macroblock level quantization parameter.
Change-Id: I443da4992ebb70ae4114750b2f1363c0c628580e
This doesn't affect the result, since there are no MVs coded using this
entropy. It does, however, silence valgrind warnings about uninitialized
variables.
Change-Id: I6e21ba92df6ce5381bf58b8c349ef4373294a0b6
About 3.5x faster, 30% overall encoder speedup. Rest of optimizations
will come soon (see TODO section in filter_sse4.c).
Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0
The reference motion vector selected by surrounding pixels that has
the best matching score is used as nearest motion vector.
The change has shown consistent gain on all test sets, compression
gains range from .2% to .6%. The variation is largely dependent on
various other experiments on or off.
Change-Id: I5552e1c2f6fc57c3e8818a5ee41ffda89af05e75
References to MACROBLOCKD that use "x" changed to "xd"
to comply with convention elsewhere that x = MACROBLOCK
and xd = MACROBLOCKD.
Simplify some repeat references using local variables.
Change-Id: I0ba2e79536add08140a6c8b19698fcf5077246bc
Add local variable in several places to reference the MB mode
info structure. Currently this is usually accessed in the code as
x->e_mbd.mode_info_context->mbmi.* or in some places
xd->mode_info_context->mbmi.*
Resolved some uses of x-> for the MACROBLOCKD structure.
Rebased without dependency on motion reference experiment.
Change-Id: If6718276ee4f2ef131825d1524dfdb02a3793aed
Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.
Also removes an experimental filter.
Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd
Latest version of all scripts/makefile but rtcd_defs.sh is empty, all
existing functions are still selected using the old/current way.
Change-Id: Ib92946a48a31d6c8d1d7359eca524bc1d3e66174
Using large values for the timebase, e.g., {33333, 1000000}, could
roll over the timestamp calculation in vp8e_encode as it was not using
64-bit math.
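
A minimal sketch of the rollover, assuming an internal clock of 10,000,000
ticks per second; TICKS_PER_SEC and pts_to_ticks() are illustrative names,
not the actual code in vp8e_encode.

```c
#include <stdint.h>

/* Assumed internal clock rate, for illustration only. */
#define TICKS_PER_SEC 10000000

/* Convert a timestamp in timebase units to internal ticks.
 * With timebase {33333, 1000000}, TICKS_PER_SEC * tb_num alone is
 * 333,330,000,000, which does not fit in a 32-bit integer, so the
 * operands are widened to 64 bits before multiplying. */
static int64_t pts_to_ticks(int64_t pts, int tb_num, int tb_den) {
  return pts * (int64_t)TICKS_PER_SEC * tb_num / tb_den;
}
```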
originally reported on ffmpeg's trac:
https://ffmpeg.org/trac/ffmpeg/ticket/1014
BUG=468
Change-Id: Iedb4e11de086a3dda75097bfaf08f2488e2088d8
The commit replaces run-time initialization of cosine constants with
static constant values, which provides ~30% relief on slow speed. The
real solution, however, will be to implement integer versions of the
functions that currently use float/double.
Change-Id: Ie3ff1793509653d78dd1aeaf88cc6737da1bc55f
Using surrounding reconstructed pixels from left and above to select
best matching mv to use as reference motion vector for mv encoding.
Test results:
         AVGPSNR  GLBPSNR  VPXSSIM
Derf:    1.107%   1.062%   0.992%
Std-hd:  1.209%   1.176%   1.029%
Change-Id: I8f10e09ee6538c05df2fb9f069abcaf1edb3fca6
The forward and inverse hybrid transforms are now performed using
single function modules, where the dimension is sent as argument.
Added an inline function clip8b to clip the reconstruction pixels
into range of 0-255.
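
A clipping helper of this kind might look like the sketch below. The
signature is an assumption; only the name clip8b and the 0-255 range come
from this message.

```c
#include <stdint.h>

/* Clamp a reconstructed sample to the 8-bit pixel range [0, 255]. */
static inline uint8_t clip8b(int value) {
  return (uint8_t)(value < 0 ? 0 : (value > 255 ? 255 : value));
}
```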
Change-Id: Id7d870b3e1aefc092721c80c0af6f641eb5f3747
This allows building on MountainLion as the 10.6 SDK has been
removed from the latest Xcode version (4.4 4F250). Also fix
all warnings for that build.
Change-Id: Ib70bca4a25295f13595f0d10ea9f0229631de5a4
Merged in the high_precision_mv experiment to make it easier
to work on new mv encoding strategies. Also removed
coef_update_probs3().
Change-Id: I82d3b0bb642419fe05dba82528bc9ba010e90924
Previously, the decoding of modes and motion vectors was done on a per
frame basis, followed by residue decoding and reconstruction. The commit
added the option to allow the decoder to interleave the decoding of
modes and mvs with the residue decoding on a per MB basis.
Change-Id: Ia5316f4a7af9ba7f155c92b5a6fc97201b653571
Fixed the code review comments.
Under the htrans8x8 experiment the 8X8 DCT in the
I8X8 mode is replaced with a combination of 8X8 ADST and
DCT.
Overall coding gains with the htrans8x8 experiment are:
derf: 0.486
std-hd: 1.040
hd: 1.063
yt: 0.506
Note that part of the gain comes from bigger transforms
(8x8 instead of 4x4) and part comes from replacing the DCT
with the ADST.
Change-Id: I92ca6bbfce11b4165d612b81d9adfad4d010c775
Set on all 16x16 intra/inter modes
Features:
- Butterfly fDCT/iDCT
- Loop filter does not filter internal edges with 16x16
- Optimize coefficient function
- Update coefficient probability function
- RD
- Entropy stats
- 16x16 is a config option
Have not tested with experiments.
hd: 2.60%
std-hd: 2.43%
yt: 1.32%
derf: 0.60%
Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
Interleaved loopfiltering with decode. For 1080p clips, up to 1%
performance gain. For 4k clips, up to 10% seen. This patch is required
for better "frame-based" multithreading.
Change-Id: Ic834cf32297cc04f27e8205652fb9f70cbe290db
Apply 2D-DCT transform of dimension 8x8 to encode prediction
residuals of I8X8 mode.
Brought back block type 3 probability context model for 8x8 tokens,
which is used for the coefficients of Y blocks in I8x8 modes. The
coefficient costs estimate of I8X8 mode in rate-distortion is also
changed appropriately.
Performance results:
derf: 0.246
yt: 0.114
std-hd: 0.730
hd: 0.670
Change-Id: If1d970eeb4e1827c9f0d2c5b27d33089b347ea27
predict_d has become canonical. Remove previous helper function.
Disable ARM assembly pending update.
Change-Id: Idd84ac8a28f9b0221ea97904a77de1e705d06a7d
The sync interval for the multithreaded encoder was considered as not changing
during the encoding. This is not true if picture size is changed.
The encoder could dead-lock because the main thread and the other threads were
using different sync interval.
Change-Id: I75232bbdbc6c02d77f830d870fd8b4e96697c64e
After the picture size was changed to a bigger one, the internal memory was
corrupted and multithreaded encoder was deadlocking.
Memory for last frame's MVs, segmentation map and active map were allocated when
the compressor was created (vp8_create_compressor). Buffers need to be
reallocated when picture size is changed, so, the allocation was moved to
vp8_alloc_compressor_data, which is called every time the picture is resized.
Change-Id: I7ce16b8e69bbf0386d7997df57add155aada2240
Merged the enhanced_interp experiment.
Found and fixed a bug in the include files framework, whereby
certain encoder files were still using the old INTERP_EXTEND
value of 3 instead of 4. The thresholds for mv range mcomp.c
need a small adjustment to prevent crashes.
The results are more or less unchanged.
Change-Id: Iac5008390f1efc97ce1102fbb5f8989c847fb579
Allows for switching/setting interpolation filters at the MB
level. A frame level flag indicates whether to use a specific
filter for the entire frame or to signal the interpolation
filter for each MB. When switchable filters are used, the
encoder chooses between 8-tap and 8-tap sharp filters. The
code currently has options to explore other variations as well,
which will be cleaned up subsequently.
One issue with the framework is that encoding is slow. I
tried to do some tricks to speed things up but it is still slow.
Decoding speed should not be affected since the number of
filter taps remains unchanged.
With the current version, we are up 0.5% on derf on average, but
some videos (city, mobile) improve by close to 4% and 2% respectively.
If we did a full-search by turning the SEARCH_BEST_FILTER flag
on, the results are somewhat better.
The framework can be combined with filtered prediction, and I
seek feedback regarding that.
Rebased.
Change-Id: I8f632cb2c111e76284140a2bd480945d6d42b77a
The ambient qp and active worst/best qp were reset for every frame
when temporal layers are on. This change removes this reset.
As this affects the target size for forced key frames
(it will actually lower the size somewhat), we increased the
initial boost factor to compensate.
Change-Id: Ie38d95f5c99ab3d447469c49e2177bc3fcc4ad28
SAD returns unsigned values. Make all the declarations the same.
Remove bestsad initialization and check. It is always set to the
result of a SAD call so it will never remain UINT_MAX
Use ja instead of jg to test unsigned comparison instead of signed.
Update test.
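
The unsigned handling can be illustrated in C as below. sad_block() and
best_candidate() are hypothetical helpers, not the libvpx functions; the
point is that the running best is seeded from a real SAD call, so no
UINT_MAX sentinel or check is needed, and the comparison is unsigned.

```c
#include <stdlib.h>

/* Sum of absolute differences over a w x h block; the result is unsigned. */
static unsigned int sad_block(const unsigned char *src, int src_stride,
                              const unsigned char *ref, int ref_stride,
                              int w, int h) {
  unsigned int sad = 0;
  int r, c;
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) sad += (unsigned int)abs(src[c] - ref[c]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Return the index of the candidate with the lowest SAD (requires n >= 1).
 * All candidates are assumed to share the source stride. */
static int best_candidate(const unsigned char *src, int stride,
                          const unsigned char *const refs[], int n,
                          int w, int h) {
  /* Seeded from an actual SAD call, so it never remains a sentinel. */
  unsigned int best_sad = sad_block(src, stride, refs[0], stride, w, h);
  int best = 0;
  int i;
  for (i = 1; i < n; ++i) {
    const unsigned int sad = sad_block(src, stride, refs[i], stride, w, h);
    if (sad < best_sad) { /* unsigned comparison */
      best_sad = sad;
      best = i;
    }
  }
  return best;
}
```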
Change-Id: I46336ab45f4e60fc37caf20bd36bc5782079c7a5
The following five experiments are merged:
newentropy
newupdate
adaptive_entropy (also includes a couple of parameter changes
that improves results a little
in common/entropymode.c and encoder/modecosts.c
that were not merged from the internal branch)
newintramodes
expanded_coef_context
Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
Precalculated block ptrs do not need updates during encoding.
Set these at init stage.
Moved the allocation of 'mt_current_mb_col' (last encoded MB on each
row) to vp8_alloc_compressor_data(), so that it is correctly
reallocated when frame size is changing.
Change-Id: Idcdaa2d0cf3a7f782b7d888626b7cf22a4ffb5c1
Added drop_frame support in multi-resolution encoder.
If one frame is dropped at a lower-resolution level, the next
upper-resolution level encoder needs to encode that frame
independently without any lower-resolution level motion
information.
Another issue is that if one frame is dropped at some but not all
resolution levels, a frame after that one may use different set
of reference frames at different resolution levels. This reference
frame asynchronization could degrade motion search precision in
upper-resolution level encoding, which uses lower-resolution level
motion result. This change compares the lower-resolution and upper-
resolution level's reference frames. If they are not the same, the
upper-resolution level encoder can not use lower-resolution level
motion result.
Change-Id: I61afa4f313630e75b7cbdd5742e230e8724a988a
Replaced local definitions of the extension required
by the filters with the globally defined value.
Change-Id: If9e590a1f2e5b0bdc2d3e3c3f04aacbd3b09bfee
Adds ADST/DCT hybrid transform coding for Intra4x4 mode.
The ADST is applied to directions in which the boundary
pixels are used for prediction, while DCT applied to
directions without corresponding boundary prediction.
Adds enum TX_TYPE in b_mode_infor to indicate the transform
type used.
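
The selection rule above could be sketched as follows. Only the enum name
TX_TYPE comes from this message; the enumerator names, the IntraMode subset
and the mapping for DC prediction are assumptions made for illustration.

```c
/* Hypothetical transform types: first name is the vertical transform,
 * second name is the horizontal transform. */
typedef enum { DCT_DCT, ADST_DCT, DCT_ADST, ADST_ADST } TX_TYPE;

/* Hypothetical subset of 4x4 intra prediction modes. */
typedef enum { PRED_DC, PRED_VERTICAL, PRED_HORIZONTAL, PRED_TM } IntraMode;

/* Use ADST along a direction whose boundary pixels drive the prediction,
 * and DCT along a direction without such boundary prediction. */
static TX_TYPE tx_type_for_mode(IntraMode mode) {
  switch (mode) {
    case PRED_VERTICAL:   return ADST_DCT;  /* predicted from the row above */
    case PRED_HORIZONTAL: return DCT_ADST;  /* predicted from the left column */
    case PRED_TM:         return ADST_ADST; /* uses both boundaries */
    default:              return DCT_DCT;   /* assumed: no directional bias */
  }
}
```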
Make coding style consistent with google style.
Fixed the commented issues.
Experimental results in terms of bit-rate reduction:
derf: 0.731%
yt: 0.982%
std-hd: 0.459%
hd: 0.725%
Will be looking at 8x8 transforms next.
Change-Id: I46dbd7b80dbb3e8856e9c34fbc58cb3764a12fcf
The integer version has very good precision, so the float version is no
longer useful. This commit also removes the experiment option from
the configure script.
Change-Id: Ibb92e63c9f5083357cdf89c559d584a7deb3353f
This commit removes a number of experiment options from the configure
script. The associated experiments are already fully merged, so the
options in the configure script have no effect at all.
Change-Id: I8054ccaee0a04610162ed76ac9e59c4538217113
vp8_encode_inter_macroblock() is called in both pick_mb_modes() as
well as encode_sb(), thus the number of macroblocks in the counter
were twice as big as actual numbers. This doesn't affect output.
Change-Id: I6de8a996ee44d2f7f2080d8d2177dd7bc6207c93
This allows CONFIG_SUPERBLOCKS experiment to almost compile successfully,
except for the missing pick_sb_modes() function.
Change-Id: Ib2322f2aacdc371e8066f2eb4a8d761c40490b4d
Added validity checking in multi-res encoder. Disable spatial
resampling and frame dropping until they are supported.
Also, deallocate the memory for all resolution levels once an error
occurs.
Change-Id: Ia5d65a645381cad1a49940ab3a19bb5696c39c09
This is to avoid a rare encoder/decoder mismatch for MBs using SPLITMV
mode. In the decoder, the UV mv can be determined to need clamping, but
the flag is never set in the encoder's motion vector selection process,
so the clamp is not done in the encoder.
Change-Id: I60520d3f790354c7855dadf03f0978ea9b77e2c0
This patch fixed issue 458 by calling copy function when both
offsets are 0, which guarantees the SSSE3 functions output
same result as the c function for all possible offsets.
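
A sketch of the dispatch, assuming a simple 2-tap bilinear filter with
1/8-pel offsets as a stand-in for the real predictors; predict_block() and
bilinear_predict() are hypothetical names, not the SSSE3 or C functions
referred to above.

```c
#include <string.h>

/* Stand-in sub-pixel filter (2-tap bilinear, offsets in 1/8 pel).
 * It reads one extra row and column beyond the w x h block. */
static void bilinear_predict(const unsigned char *src, int src_stride,
                             int xoff, int yoff, unsigned char *dst,
                             int dst_stride, int w, int h) {
  int r, c;
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const unsigned char *p = src + r * src_stride + c;
      const int top = p[0] * (8 - xoff) + p[1] * xoff;
      const int bot = p[src_stride] * (8 - xoff) + p[src_stride + 1] * xoff;
      dst[r * dst_stride + c] =
          (unsigned char)((top * (8 - yoff) + bot * yoff + 32) >> 6);
    }
  }
}

/* When both offsets are 0 the block is copied, so every implementation
 * yields identical output for the full-pel case. */
static void predict_block(const unsigned char *src, int src_stride, int xoff,
                          int yoff, unsigned char *dst, int dst_stride,
                          int w, int h) {
  int r;
  if (xoff == 0 && yoff == 0) {
    for (r = 0; r < h; ++r)
      memcpy(dst + r * dst_stride, src + r * src_stride, (size_t)w);
    return;
  }
  bilinear_predict(src, src_stride, xoff, yoff, dst, dst_stride, w, h);
}
```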
Change-Id: I209aec7a4c6b3362db2646a8887c1038493b6496
xd->subpixel_predict16x16 is called in first pass, but isn't
initialized in first pass, which causes segfault. This patch
fixed that problem.
Change-Id: Ibd2cad4e2d32ea589fc3e0876d60d3079ae836e7
This commit adds lossless compression capability to the experimental
branch. The lossless experiment can be enabled using --enable-lossless
in configure. When the experiment is enabled, the encoder will use
lossless compression mode by command line option --lossless, and the
decoder automatically recognizes a losslessly encoded clip and decodes
accordingly.
To achieve the lossless coding, this commit has changed the following:
1. To encode at lossless mode, encoder forces the use of unit
quantizer, i.e, Q 0, where effective quantization is 1. Encoder also
disables the usage of 8x8 transform and allows only 4x4 transform;
2. At Q 0, the first order 4x4 DCT/IDCT have been switched over
to a pair of forward and inverse Walsh-Hadamard Transform
(http://goo.gl/EIsfy), with proper scaling applied to match the range
of the original 4x4 DCT/IDCT pair;
3. At Q 0, the second order remains to use the previous
walsh-hadamard transform pair. However, to maintain the reversibility
in second order transform at Q 0, scaling down is applied to first
order DC coefficients prior to forward transform, and scaling up is
applied to the second order output prior to quantization. Symmetric
upscaling and downscaling are added around inverse second order
transform;
4. In lossless mode, the encoder also disables a number of minor
features to ensure no loss is introduced. These features include:
a. Trellis quantization optimization
b. Loop filtering
c. Aggressive zero-binning, rounding and zero-bin boosting
d. Mode based zero-bin boosting
Lossless coding test was performed on all clips within the derf set,
to verify that the commit has achieved lossless compression for all
clips. The average compression ratio is around 2.57 to 1.
(http://goo.gl/dEShs)
Change-Id: Ia3aba7dd09df40dd590f93b9aba134defbc64e34
Added the ability to optionally filter the prediction data
when inter modes are selected (excludes SPLITMV, for now).
The mode selection loop considers both the filtered and
non-filtered prediction data when choosing mode. The filter
can be turned on/off at the frame-level, or signaled for
each MB.
Change-Id: I1b783c71d95a361ab36c761b07e8a6b06bc36822
Incorporates mv_ref, mbsplit and second_mv into the adaptive
entropy framework. The mv_ref framework has been modified from
before.
Adds some clean-ups and fixes.
Results with the adaptive entropy experiment are currently up by
+1.93% on derf; +2.33% std-hd and +1.87% yt-hd.
Fixed a nasty intermittent bug.
Change-Id: I4b1ac9f9483b48432597595195bfec05f31d1e39
Changes relating to Issue 411
Removed code that was clearing down the segmentation data each
frame.
Added range/parameter checking in vp8_set_roimap(); Return error
if called when cyclic_refresh is enabled.
Correct setup_features() so that it sets or clears the segment update
flags as appropriate.
Change-Id: Ib31ac53006640ddf1ba7b9ec8f8b952e3eff860a
The function vp8_post_proc_down_and_across_c takes the
stride of both the src and dst images as parameters, but
assumes that they are the same.
I modified the code to use the correct strides, as the
assembler versions of these functions do.
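
The stride handling can be illustrated with a simplified row loop. The
function filter_rows() below is a hypothetical stand-in that only copies
pixels; it is not vp8_post_proc_down_and_across_c.

```c
/* Process a block row by row, advancing each pointer by its own stride
 * instead of assuming the source and destination strides are equal. */
static void filter_rows(const unsigned char *src, int src_stride,
                        unsigned char *dst, int dst_stride,
                        int rows, int cols) {
  int r, c;
  for (r = 0; r < rows; ++r) {
    for (c = 0; c < cols; ++c) dst[c] = src[c]; /* real filtering goes here */
    src += src_stride; /* source pitch */
    dst += dst_stride; /* destination pitch, may differ */
  }
}
```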
Change-Id: I222715b774cd071b21c15a4b0d2f4aef64a520de
Avoid a pthreads dependency via pthread_once() when compiled with
--disable-multithread.
In addition, this synchronization is disabled for Win32 as well, even
though we can be sure that the required primitives exist, so that the
requirements on the application when built with --disable-multithread
are consistent across platforms.
Users using libvpx built with --disable-multithread in a multithreaded
context should provide their own synchronization. Updated the
documentation to vpx_codec_enc_init_ver() and vpx_codec_dec_init_ver()
to note this requirement. Moved the RTCD initialization call to match
this description, as previously it didn't happen until the first
frame.
Change-Id: Id576f6bce2758362188278d3085051c218a56d4a
This patch incorporates adaptive entropy coding of coefficient tokens,
and mode/mv information based on distributions encountered in a frame.
Specifically, there is an initial forward update to the probabilities
in the bitstream as before for coding the symbols in the frame, however
at the end of decoding each frame, the forward update to the
probabilities is reverted and instead the probabilities are updated
towards the actual distributions encountered within the frame.
The amount of update is weighted by the number of hits within each
context.
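
One way to weight the update by the number of hits is sketched below for a
single binary probability. The constants COUNT_SAT and MAX_UPDATE_FACTOR
and the function adapt_prob() are assumptions chosen for illustration, not
the values or code used in this patch.

```c
#include <stdint.h>

#define COUNT_SAT 24          /* assumed: hits at which the weight saturates */
#define MAX_UPDATE_FACTOR 128 /* assumed: maximum weight, out of 256 */

/* Move pre_prob (probability of a 0, out of 256) towards the distribution
 * observed in the frame. c0 and c1 are the counts of 0s and 1s seen in
 * this context; more hits give the observed distribution more weight. */
static uint8_t adapt_prob(uint8_t pre_prob, unsigned int c0, unsigned int c1) {
  const unsigned int total = c0 + c1;
  unsigned int observed, n, factor;

  if (total == 0) return pre_prob; /* no hits: keep the old probability */

  observed = (c0 * 256 + total / 2) / total;
  if (observed < 1) observed = 1;
  if (observed > 255) observed = 255;

  n = total < COUNT_SAT ? total : COUNT_SAT;
  factor = MAX_UPDATE_FACTOR * n / COUNT_SAT;

  return (uint8_t)((pre_prob * (256 - factor) + observed * factor + 128) >> 8);
}
```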
Results on derf/hd/std-hd are all up by 1.6%.
On derf, most of the gains come from coefficients, whereas for the
hd and std-hd sets, most of the gains come from the mode/mv
information updates.
Change-Id: I708c0e11fdacafee04940fe7ae159ba6844005fd
This commit is to remove two arrays, which contain the probabilities
of how likely each probability in coef_probs table is updated. The
commit changed to use a fixed number "252".
Surprisingly, the overall impact on quality is close to zero, which
basically says the two big static arrays are not helpful at all.
derf: -0.016%, -0.020%
std-hd: 0.000%, -0.013%
yt: -0.022%, +0.007%
yt-hd: -0.038%, +0.034%
Change-Id: Ifee94d28a37dcab4f1d2b994bd5b07575be42b72
This commit added the ability to accumulate the coef stats across
different encodings using an intermediate binary stats file. The
accumulation happens only if the binary stats file exists in the current
directory. The encoder needs to be built with "ENTROPY_STATS" to
allow the output. The commit also fixed a few formatting issues in
the output stats file.
Change-Id: Ib1a41180aa554845cf51e4421a230b128a3a82b4
Changes to calculation of sr_coded_error to include 0,0 case.
Experimental use of sr_coded_error in calculating correction factor
for estimating the allowable Q range.
Reinstated some code needed for calculating section_intra_rating.
Add flash detection in calculation of KF boost.
Increased tolerance in testing candidate key frames (needed with
longer motion search as this tends to slightly increase inter %).
Zbin changes for 8x8.
Other minor adjustments, refactoring and bug fixes.
Reinstated some motion break out clauses in boost loop
as their removal hurt a few 50fps clips badly in the std set.
It may be possible to remove them again later if a better way
can be found of preventing overly long gf intervals.
Change-Id: Iee686d0c31072828bb1ccd2bc63f5f1c7c548ea2
Allows building the library with the gcc -pedantic option, for improved
portability. In particular, this commit removes usage of C99/C++ style
single-line comments and dynamic struct initializers. This is a
continuation of the work done in commit 97b766a46, which removed most
of these warnings for decode only builds.
Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
Frame dropping decision is made by evaluating both current frame
and next frame's buffer_level. If both buffer_levels are less
than drop_mark, next frame is dropped. When frame dropping is
over, namely, buffer_level becomes normal again, we need to
reset decimation_count to 0.
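
The decision can be sketched as below. DropState and should_drop_next() are
hypothetical; how decimation_count is maintained while frames are being
dropped is left out, since it is not described here.

```c
#include <stdint.h>

/* Hypothetical slice of the rate-control state. */
typedef struct {
  int decimation_count;
} DropState;

/* Returns 1 if the next frame should be dropped. Both the current and
 * the next frame's buffer levels must be below drop_mark. Once the
 * buffer is back to normal, decimation_count is reset to 0. */
static int should_drop_next(DropState *s, int64_t cur_level,
                            int64_t next_level, int64_t drop_mark) {
  if (cur_level < drop_mark && next_level < drop_mark) return 1;
  s->decimation_count = 0; /* frame dropping is over */
  return 0;
}
```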
Change-Id: Iae182612e61e0da367fbd43afdc90738d975d1a3
The logic for spatial resizing is done after the Q is selected for the
frame. This causes a problem that the Q we select for the (resized)
key frame may be based on a different resolution than the frame we
will encode.
This fix is to ensure that, when resize is on, the selected Q is still
based on the resolution of the frame to be encoded.
Change-Id: Ia49a9eac5f64e48d1c00dfc7ed4ce26fe84d3fa1
Variables m & mi were being dereferenced when they might
hold invalid values.
The fix is simply to move these dereferences to after the
point at which mb_row and mb_col are tested for validity.
Change-Id: Ib16561efa9792dc469759936189ea379d374ad20
Compares the sum of differences between the input block and the averaged
block. If they differ too much the block will not be filtered. Negligible
performance hit.
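
A minimal sketch of such a check, assuming a signed sum of per-pixel
differences compared against a threshold by magnitude;
accept_filtered_block() and its parameters are hypothetical names.

```c
#include <stdlib.h>

/* Returns 1 if the averaged block may replace the input block, or 0 if
 * the two differ too much and the block should be left unfiltered. */
static int accept_filtered_block(const unsigned char *in, int in_stride,
                                 const unsigned char *avg, int avg_stride,
                                 int size, int threshold) {
  int sum_diff = 0;
  int r, c;
  for (r = 0; r < size; ++r) {
    for (c = 0; c < size; ++c) sum_diff += avg[c] - in[c];
    in += in_stride;
    avg += avg_stride;
  }
  return abs(sum_diff) <= threshold;
}
```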
Change-Id: Ib1c31a265efd4d100b3abc4a1ea6675038c8ddde
Add PRIVATE macro for adding private_extern directive for yasm
to hide global symbols. This is only enabled if -DCHROMIUM is used
with YASM.
Also fixed a small problem with rtcd_defs.sh to guard TEMPORAL_DENOISING.
Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
This extends the denoiser to work for temporally scalable
coding.
I believe this also fixes a very rare but really bad bug in the original
implementation.
Change-Id: I8b3593a8c54b86eb76f785af1970935f7d56262a
This fix addresses some problems with very complex clips like
handling of flashes on clips like crew (which was made worse
by an earlier patch (derf and std-hd)).
Most clips show a small effect, but some change by between 1 and 2%.
Derf +0.039, +0.211%
YT +0.042, +0.083%
Change-Id: I65fc7c13afc31482040068544dd65b8808f5cb4a
Removed the local scaling factor est_max_qcorrection_factor
and related code to simplify estimateq calculation (little effect
anyway)
Cap range of total correction factor.
Slight change to break out case to turn off arf.
Change-Id: I748187737ba93cfadf016f3dfdf8d2741934067f
The commit fixed a number of compile issues when some experiments
are turned on at the same time.
Change-Id: Idb15b215e2d2a7d25f2707f99ef55a34e7301ce7
After a key frame encoding, the frame type could change while
filtering is still going on. Pass the frame type as parameter to the
loopfilter function and don't read it from common storage.
vp8cx_set_alt_lf_level has to be done before packing the stream.
Currently alt_lf_level is not used so there hasn't been any visible
problem here.
Change-Id: Ia114162158cd833c2b16e3b89303cc9c91f19165
The two-pass code does not support the case where the application
changes the frame size dynamically. Add this case to the validation
checks in the vpx_codec_enc_config_set() path.
Change-Id: Idadc42c7c3bd566ecdbce30d8dd720add097f992
* changes:
Add initial keyframe tests
Move all tests to test/ directory
Enable unit tests by default
Build unit tests monolithically
configure: initial support for CXX, CXXFLAGS variables
Resolution changes in calls to vpx_codec_enc_config_set() would cause
a memory leak due to failing to release the lookahead and alt ref
buffers.
Change-Id: I48392ea25e71fe2760d60cfde3fb3874598cc85f
Reverted part of change in memory allocation code, which ensures
that the function returns 0 and encoder works correctly when
CONFIG_MULTI_RES_ENCODING isn't turned on.
Change-Id: Id5d5e7f2c8bd9e961a6dca79d257e8185f0d592a
The commit changed how baseline 8x8 coefficient probabilities are
initialized, to be consistent with the initialization of baseline
4x4 coefficient probabilities.
The commit does not have any effect on compression.
Change-Id: Ifb3902b5dc0b0c2e6dc3aa5d4a6589d528e58355
Rework unit tests to have a single executable rather than many, which
should avoid pollution of the visual studio project namespace, improve
build times, and make it easier to use the gtest test sharding system
when we get these going on the continuous build cluster.
Change-Id: If4c3e5d4b3515522869de6c89455c2a64697cca6
Remove dependency on amount and speed of motion as this
may not behave well across different image sizes.
Tweak impact of % inter.
Add in experimental adjustment based on relative quality of an
older second reference frame.
Cap range of decay values allowed.
A small positive effect on derf but negative on yt and hd at this stage.
Change-Id: I390d6f6ebe67a2eb0b834980d0d4650124980d3e
In multi-resolution encoding, frame_type decision for each frame
is made by the lowest-resolution encoder. For all other higher-
resolution encoders, kf_mode is always set to VPX_KF_DISABLED,
and they are forced to use the same frame_type picked by the
lowest-resolution encoder.
Change-Id: Ic4d52ec65bbc012ca9c2d236210e28a295591eaf
I now see I didn't write a very long description, so let's do it
here then. We took a pretty big quality hit (0.1-0.2%) from my
recent fix of the inversion of arguments to vp8_cost_bit() in the
RD reference frame costing. I looked into it and basically the
costing prevented us from switching reference frames. This is of
course silly, since each frame codes its own prob_intra_coded, so
using last frame cost indications as a limiting factor can never
be right.
Here, I've rewritten that code to estimate costings based partially
on statistics from progress on current frame encoding. Overall,
this gives us a ~0.2%-0.3% improvement over what we had previously
before my argument-inversion-fix, and thus about ~0.4% over current
git (on derf-set), and a little more (0.5-1.0%) on HD/STD-HD/YT.
Change-Id: I79ebd4ccec4d6edbf0e152d9590d103ba2747775
base the static image test off a measure of 0,0 motion
instead of the decay accumulator value.
Change "transition to still detection" to compare the
decay rate from successive frames.
Minor tweak to the arf extra boost given based on the
number of frames affected.
Removed unused variable mod_err_per_mb_accumulator.
Change-Id: Idd8360083ad409e45f133ce97dd2488259003e64
The commit added an integer version of 8x8 forward DCT, based on the
original forward DCT from VP6. The constants, roundings, and shifts
were adjusted to improve the accuracy. The latest patch has very
similar accuracy in terms of round trip error to the floating
point version.
It should be noted here that the purpose of the patch is to help
encoding speed and facilitate all other experiments. There will be
further review in combination with inverse DCT before finalization.
configure with "--enable--int_8x8fdct" to use the integer version
Change-Id: I5a4f80507429f0e07cf02a13768ec81cbfddc5bc
Some marginal impact due to the fact that it makes use of
arf more likely / stable even in hard sections.
Change-Id: Ic72fda0f63eefc9433914b5d9cd374d515810129
Removed unused function.
Added tentative code to take error score of an older frame
into account when calculating Q range. However, for now
it is disabled pending merging other changes and testing.
Change-Id: Ie89955e70319dac31b79e3b833e3352712a061ec
Remove testing of whether we estimate that it will be possible
to code an arf at a lower Q than the ambient Q. This adds quite
a bit of extra code and complexity for marginal gain.
Factored out some code relating to ARNR selection to a separate
function as this is likely to be changed / simplified soon.
Change-Id: Ia1cf060405637ef5bbf7018355437be21d12375f
Removed odd *100 >> 4 factor from boost calculations. Not all the
calculations exactly match what was there before so there may be
some minor impact on results.
Some other minor tidying up in regard to coding conventions.
The specific values of factors and thresholds will likely change as
part of subsequent patches.
Change-Id: Id976321484ac02ba50294cf54fafbc17dda85686
These frames can force reference frame (arf), mode (zeromv) and skip,
which means that if we use compound prediction (i.e. arf+last), we
might use a blend of a perfect (arf) and an imperfect (last) predictor,
leading to semi-garbage display and thus a huge drop in SSIM/PSNR (up
to 10dB for some frames I analyzed).
Gives a +0.2% gain on YT.
Change-Id: If1f2b7899ad165684af3808fd379295e82558cbb
This is the first patch in a series of changes to the first
pass code. (Broken down for ease of testing/merging/review).
This patch introduces a new stats element "sr_coded_error".
This is the coded error recorded vs the second reference
frame (which is updated such that it lags by at least one frame).
No use is made of the new structure in this change so this patch
should have no material effect.
Removed some ifdefs and deprecated code (#if NEW_BOOST).
Removed twopass.gf_decay_rate (not used any more)
Change-Id: I1be672a73017f7c13fd50fb4f99236aa2ed30916
This commit changed the forward and the inverse 4x4 Walsh Hadamard
transform to a new pair, where the inverse transform can perfectly
reconstruct the input to forward transform. It also does so without
changing the input and output value range. Even more, it does not
change the complexity of the transforms.
While it was not expected to improve the results of our current test,
it does improve std-hd set by 0.2% on all metrics. No change on derf.
Change-Id: Ie4f23ddd3a0f3c5fbe97fb58399f860031f99337
Accept the same range of inputs for the VP8E_SET_NOISE_SENSITIVITY
control, regardless of whether temporal denoising is enabled or not.
This is important for maintaining compatibility with existing
applications.
Change-Id: I94cd4bb09bf7c803516701a394cf1a63bfec0097
1. block types
There are only three types of blocks for 8x8 transformed MBs, i.e. Y
block with DC does not exist for 8x8 transformed MBs as all MB using
8x8 transform have 2nd order haar transform. This commit introduced
a new macro BLOCK_TYPES_8X8 to reflect such fact.
2. context counters
This commit also fixed the mixing of context_counters between 4x4 and
8x8 transformed MBs. The mixed use of the counters leads me to think
the existing context probabilities were not properly generated
from 8x8 transformed MBs.
3. redundant collecting in recoding
The commit also corrected the code that accumulates entropy stats by
making sure stats are only collected for final packing, not during the
recode loop.
Change-Id: I029f09f8f60bd0c3240cc392ff5c6d05435e322c
Adds a unit test to the boolcoder that tests encoding
and decoding thousands of different bits, with different
probabilities in different patterns.
Code borrowed from the webp project - and its committers.
Change-Id: Icabbb884d57e666496490c961dd29b246144ab3e
Make functions only referenced from one translation unit static. Other
symbols with extern linkage get a vp8/vpx prefix.
Change-Id: I928c7e0d0d36e89ac78cb54ff8bb28748727834f
Change If4321cc5 fixed a bug caused by forward declarations not being
kept in sync across C files, resulting in a function call with the
wrong arguments. The commit moves the affected function declarations
into a header file, along with the other symbols from encodeframe.c
that were being sloppily shared.
Change-Id: I76a7b4c66d4fe175f9cbef7e52148655e4bb9ba1
Removes all runtime initialization of global data. This commit is a
squashed version of the following series cherry-picked from master.
This is necessary because of a change that was merged to the tester
that depends on the scaler being moved to the RTCD framework, which
is a worthwhile thing to include in Eider anyway.
- a91b42f02 Makes all global data in entropy.c const
- b35a0db0e Makes all global data in tokenize.c const
- 441cac8ea Makes all mode token tables const
- 5948a0210 Ports vpx_xcaler to new RTCD method
- 317d4244c Makes all mode token tables const part 2
Change-Id: Ifeaea24df2b731e7c509fa6c6ef6891a374afc26
Move the notion of 0 bitrate implying skip deeper into the codec,
rather than doing it at the multi-encoder API level. This preserves
v1.0.0 ABI compatibility, rather than forcing a bump to v2.0.0 over a
minor change. Also, this allows the case where the application can
selectively enable and disable the larger resolution(s) without having
to reinitialize the codec instance (for instance, if no target is
receiving the full resolution stream).
It's not clear how deep to push this check. It may be valuable to
allow the framerate adaptation code to run, for example. Currently put
the check as early as possible for simplicity, should reevaluate this
as this feature gains real use.
Change-Id: I371709b8c6b52185a1c71a166a131ecc244582f0
Besides imposing a performance penalty at startup in most
configurations, these relocations break the dynamic linker for
native Fennec, since it does not support them at all.
Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de
These are warnings in most builds, but show up as compile errors on
some platforms when these headers are included from C++ code.
Change-Id: I6c523b4dbbc699075fe73830442b51922e5a61d5
This commit slightly adjusted the 4x4 coefficient band definition to
better classify coefficients with similar distributions and usages.
It helps the derf set by about .1%, and it is also slightly positive for
the std-hd set, where 4x4 blocks are used less frequently.
The commit also removed a const array not in use.
Change-Id: I78d16905d4036641ec905b0c32c190c1def5b249
The ARNR filter uses a motion compensated temporal filter,
but the motion estimation implementation accounts for the
cost of the mv in its decision making process. The ARNR
filter uses a dummy cost table initialized to 0 as a way
to ignore the mv costs (which are irrelevant to the filter).
This CL modifies the ARNR filter implementation so that
the mv costing is ignored without the requirement for
dummy tables.
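One way to ignore the mv cost without a zero-filled table is to let the
cost helper accept a NULL table. The sketch below only illustrates that
idea; the function name, the single flat table and the rounding are
assumptions, not the code touched by this CL.

```c
#include <stddef.h>

/* Hypothetical helper: returns the rate cost of a motion vector, scaled by
 * error_per_bit. A NULL cost table means "do not charge for the mv", which
 * is what a temporal filter wants. */
static int mv_err_cost(const int *mvcost, int mv_index, int error_per_bit) {
  if (mvcost == NULL) return 0;
  /* Costs are assumed to be in 1/256 bit units, hence the rounding shift. */
  return (mvcost[mv_index] * error_per_bit + 128) >> 8;
}
```

A caller that only cares about the matching error would pass NULL instead
of allocating and zeroing a dummy table.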
Change-Id: I0dd9620c3b70682f938b2a70912c11d4d7c9284c
The ARNR filter uses MC to find the best match between the
ARF and other nearby frames in the filter-set. Since the
ARF is a member of the filter-set, MC in that case is
unnecessary. This patch modifies the filter so it does
not apply MC in this case.
Change-Id: Ic0321199c08db2189a57f28d1700b745bc7ff66d
The ARNR filter uses a motion compensated temporal filter,
but the motion estimation implementation accounts for the
cost of the mv in its decision making process. The ARNR
filter uses a dummy cost table initialized to 0 as a way
to ignore the mv costs (which are irrelevant to the filter).
This CL modifies the ARNR filter implementation so that
the mv costing is ignored without the requirement for
dummy tables.
Change-Id: I4196aa5c24da63f858ff54fbaa5fc85ae1f1957f
The backup MODE_INFO buffer used in the error concealment was
allocated in the codec common parts allocation even though this is a
decoder only resource. Moved the allocation to the decoder side.
No need to update_mode_info_border as mode_info buffers are zero
allocated.
This also fixes a potential memory leak as the EC overlaps buffer was not
properly released before reallocation after a frame size change.
Change-Id: I12803d3e012308d95669069980b1c95973fb775f
Adds a speed feature to conduct a brute force search among a set of
available interpolation filters for the best one in an RD sense.
There is a gain of 0.4% on derf, 1.0% on Std-HD.
Patch 2: A macro added to determine if the encoder state is reset
for each new filter tried.
Patch 3: rebase, also fixes a bug (decodframe.c) introduced by a
couple of missing function pointer assignments.
Patch 4: rebase.
Change-Id: Ic9ccca9d8c35c6af557449ae867391a2f996cc29
This commit merges the QI mode experiment. As the experiment affects
the encoding of intra coding modes on key frame only, the overall
effect of the experiment on encoding tests is insignificant.
Change-Id: I9e4e3933adface88867ad429cee3986e529c511d
The commit merges the UVINTRA experiment and removes the related
macros. The overall effect of the experiment is a small gain (.1%
on derf)
Change-Id: Ia34b3312fb9b5b34c9ba111bf0fa78c6f78ac80b
Race was introduced by https://gerrit.chromium.org/gerrit/15563.
If loopfilter related config params were changed between frames, or
after a KEY frame, there could be a mismatch between encoder's and
decoder's reconstructed image. In the worst case, when frame buffers are
reallocated because of a size change, the loopfilter could
do an invalid data access (segmentation fault).
Fixes:
Sync with the loopfilter before applying any encoder changes in
vp8_change_config().
Moved the loopfilter synching to the top of
encode_frame_to_data_rate() so that it's done before any alteration of
the encoder.
Change-Id: Ide5245d2a2aeed78752de750c0110bc4b46f5b7b
Increment the last_row_mb_col counter by nsync after last MB of row is
ready. This way we don't need to check for last MB of row when
synching.
Set last MB of row ready just after row extension is done. This avoids
a potential race condition with the processing of last MB of next row.
Change-Id: I19c44fd6041116ee5483be2143b4f4bfcd149eac
Adds differential encoding of prob updates using a subexponential
code centered around the previous probability value.
Also searches for the most cost-effective update, and breaks
up the coefficient updates into smaller groups.
Small gain on Derf: 0.2%
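As a rough illustration of coding an update relative to the previous
probability, the sketch below remaps a new value v around a previous
value m so that values close to m get small non-negative indices, which
a subexponential code can then represent in few bits. The function names
and the exact mapping are assumptions for illustration and may differ
from what this change implements.

```c
/* Hypothetical helper: map v to a non-negative index that is small when v
 * is close to the previous value m. Assumes v >= 0 and m >= 0. */
static int recenter_nonneg(int v, int m) {
  if (v > 2 * m) return v;        /* far above m: keep the value as is */
  if (v >= m) return 2 * (v - m); /* at or above m: even indices */
  return 2 * (m - v) - 1;         /* below m: odd indices */
}

/* Inverse of recenter_nonneg for the same previous value m. */
static int inv_recenter_nonneg(int r, int m) {
  if (r > 2 * m) return r;
  return (r & 1) ? m - (r + 1) / 2 : m + r / 2;
}
```

An unchanged probability maps to index 0, and a change of one step in
either direction maps to index 1 or 2.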
Change-Id: Ie0071e3dc113e3d0d7ab95b6442bb07a89970030
RD costs were local to MACROBLOCK data and had to be copied all the
time to each thread's MACROBLOCK data. Tables moved to a common place
and only pointers are setup for each encoding thread.
vp8_cost_tokens() generates 'int' costs so changed all types to be
int (i.e. removed unsigned).
NOTE: Could do some more cleaning in vp8cx_init_mbrthread_data().
Change-Id: Ifa4de4c6286dffaca7ed3082041fe5af1345ddc0
The block pointers and offset do not need to be calculated for every
frame. Block internal predictors can be updated once when decoder is
allocated. Destination and previous buffer offsets have to be updated
just when frame size is changing.
Change-Id: I92ca8df0e6aaac4cc35ab890751d446760bf82e2
Key frame macroblock and block mode probabilities are constant.
Remove the allocation of tables for each codec instance and use
instead the default const prob tables.
Change-Id: I8361798ac491f9b3889e86925a494c58647c753f
Look for changes in the codec's configured w/h instead of its active
w/h when forcing keyframes. Otherwise calls to vp8_change_config()
will force a keyframe when spatial resampling is active.
Change-Id: Ie0d20e70507004e714ad40b640aa5b434251eb32
This commit enables the use of the 8x8 transform for all frame
types, all resolutions and all quantizer ranges. This has an overall
benefit of .2% to .3% in terms of compression, but more importantly,
the difficult clips benefit much more, up to 2% to 3% on clips
like football, harbour and so on.
We observed some weird humps at the very high end on a couple of youtube
clips, but have determined the underlying cause was the aggressive zbin
having the effect of lowering rate with lower quality, which has
an impact on slide show clips around 60DB.
The commit does not change the association between prediction mode
and transform size.
Change-Id: I33043bdce6207528ae00b4a4b26d8ff63cfea1f4
This is to prevent the evaluation of a mode from using values left
over from a mode evaluated prior in the loop.
Change-Id: Ife2c6ceb76d2f7365fd262515d3ae48229033c2d
(see Change I9b2ccc88: Makes all mode token tables const)
Further remove runtime table initialization and use
precalculated const data. Data footprint reduced
by 4112 bytes.
Change-Id: Ia3ae9fc19f77316b045cabff01f6e5f0876a86ab
Ensure that RTCD function pointers are set at most once, to silence
some data race warnings. Implementation provided for POSIX threads and
Win32, with the prior unsynchronized behavior left in place for other
platforms.
Change-Id: I65c5856df43ef67043b3d5f26ddafddd8fcb2f7e
We can get rid of all remaining global initializers now:
vp8_scale_machine_specific_config()
vp8_initialize()
vp8dx_initialize()
Change-Id: I2825cea5d1c01ad9f6c45df49a0f86d803bfeb69
Mode token tables precalculated in entropymode.c.
Removes vp8_initialize_common() as all common global data
is precalculated const now.
Change-Id: I9b2ccc883e4f618069e1bc180dad3a823394eb73
With the NEWENTROPY experiment enabled, encoding certain clips
produced invalid bitstreams, or files that had a high degree
of artefacts.
This was the result of pointers in MACROBLOCKD not being
setup correctly (mode_info_context and prev_mode_info_context).
Change-Id: Ice13e1efa8bd122997d2f8f3f1e761c6c16e0403
These contexts need to be saved and restored for recode, otherwise
encoder/decoder mismatch happens for some clips (e.g. _mobcal 720p).
Change-Id: Ic65cfa0bf56ed0472ecab962ce31394d59d344bf
Removes all runtime initialization of global data in tokenize.c.
DCT token and cost tables are pre-generated.
Second patch in a series to make sure code is reentrant.
Change-Id: Iab48b5fe290129823947b669413101f22a1bcac0
Removes all runtime initialization of global data in entropy.c.
Precalculated values are used for initializing all entropy related
tables.
First patch in a series to make sure code is reentrant.
Change-Id: I9aac91a2a26f96d73c6470d772a343df63bfe633
When producing an invisible ARF, the time stamp counters aren't
updated since the last time stamp is seen by the codec twice. The
prior code was trapping this case with refresh_alt_ref, but this isn't
correct for other uses of the ARF. Instead, use the show_frame flag.
Change-Id: If67fff7c6c66a3606698e34e2fb5731f56b4a223
Added code to save the coding context in vp8_rd_pick_inter_mode
when the coding mode is forced to ARF(0,0).
Also, modified the MV bounds computation to comply with the
change in MV border from 32 to 64 pixels.
Change-Id: I96963a6f5f4d04ce84c807ae11e0635177c3ad6c
Local variable offsets are now consistent for the functions,
removed unused parameters, reworked the assembly to eliminate
stalls/instructions.
Change-Id: Iaa37668f8a9bb8754df435f6a51c3a08d547f879
Turning off the interpolation filter selection based on edge
proportion. This heuristic has not been working as well as
expected and I have started a more rigorous investigation into
this. We can turn this off for now since it is unnecessarily
slowing things down.
Rebase.
Change-Id: Ic5958b2b3a35ec2d8eb73b6d81617ca8fbe07e74
This commit tries to address an issue related to the oddity shown on
the HD _mobcal clip, where some rather ugly blocks are shown in the
second frame at low-mid bit rates if the third frame is not made a key
frame by the encoder. The fixes include: 1) made calls to sad_16x16
consistent with the function prototype. 2) removed the error bias to
intra and golden in mbgraph search. 3) changed the error accumulation on
inter_segment encoding to avoid potential out-of-range. 1) has no
effect on encoding results.
Encoding tests show that the overall effect of the commit is a gain of
about .2% (HD) to .3% (cif).
Change-Id: I930975a2d0c06252f01c39e0a02351529774e30b
The commit removed a limit on key frame detection that caused a big
drop in all metric measurements for standard HD clips such as _mobcal.
This single change helps two standard HD clips by a huge amount, which
helps the overall std-hd set by 2.4% (glb psnr), 0.9% (avg_psnr), 2.1%
(vpxssim).
In the result page:
http://pafr9.prod.google.com:26163/?/cns/rc-d/home/on2-prod/sunkaras/borg-test/yaowu
2012_04_02_1649_yaowu_bugfix_std-hd
2012_04_03_1452_yaowu_hump_std-hd
represent the encoding test results on the std-hd set before and after
this commit respectively.
Change-Id: Ie4313e317c737ea0e699c3a7919c1376744baa1a
This commit has made macro_block_yrd_8x8 and macro_block_yrd_8x8
take the same parameters. It also removed a few unnecessary shifts that
have the potential to create out-of-range distortion values.
Change-Id: I4ec5afb307c3685c2a67a07c2850f0927d214455
Some code re-factored / moved to allow the main
pack operation inside the recode loop so that the
size estimate is accurate.
Deletion of some redundant code relating to one pass.
Approximate improvement over March 27 code base:
Derf 0.0%, YT 0.5%, YThd 0.3% Std_hd 0.25%
Change-Id: Id2d071794ab44f0b52935f6fcdb5733d09a6bb86
Some adjustments to zbin for t8x8.
Changes to rules for sizing forced key frames.
Some extra stats output in tmp.stt.
Approximate gain on YT-hd set 0.5%
There are still issues in sizing key frames and gf/arf frames
when the image is largely static. These in part relate to
problems with cost estimates in the recode loop.
Change-Id: I6f0159dc8a8faeab4115a19c668d442491619a68
This is the first patch to add superblock (32x32) coding
order capabilities. It does not yet do any mode selection
at the SB level, that will follow in a further patch.
This patch encodes rows of SBs rather than
MBs, each SB contains 2x2 MBs.
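The resulting coding order can be sketched as follows. This is an
illustrative helper only: the function name, the flat index layout and
the handling of SBs that overhang the right or bottom frame edge are
assumptions, not code from this patch.

```c
/* Hypothetical sketch: fill `order` with macroblock indices
 * (row * mb_cols + col) in superblock coding order and return how many
 * indices were written. `order` must hold mb_rows * mb_cols entries. */
static int sb_coding_order(int mb_rows, int mb_cols, int *order) {
  int n = 0;
  for (int sb_row = 0; sb_row < mb_rows; sb_row += 2) {
    for (int sb_col = 0; sb_col < mb_cols; sb_col += 2) {
      /* Visit the 2x2 MBs of this SB in raster order. */
      for (int i = 0; i < 4; ++i) {
        int row = sb_row + (i >> 1);
        int col = sb_col + (i & 1);
        /* Skip MBs that fall outside the frame at the right/bottom edge. */
        if (row >= mb_rows || col >= mb_cols) continue;
        order[n++] = row * mb_cols + col;
      }
    }
  }
  return n;
}
```

For a frame of 2x4 MBs this visits indices 0, 1, 4, 5, 2, 3, 6, 7. The
bottom-right MB of the first SB (index 5) is coded before its above-right
neighbour (index 2).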
Two intra prediction modes have been disabled since they
require reconstructed data for the above-right MB which
may not have been encoded yet (e.g. for the bottom right
MB in each SB).
Results on the one test clip I have tried (720p GIPS clip)
suggest that it is somewhere around 0.2dB worse than the
baseline version, so there may be bugs.
It has been tested with no experiments enabled and with
the following 3 experiments enabled:
--enable-enhanced_interp
--enable-high_precision_mv
--enable-sixteenth_subpel_uv
in each case the decode buffer matches the recon buffer
(using "cmp" to compare the dumped/decoded frames).
Note: Testing these experiments individually created
errors.
Some problems were found with other experiments but it
is unclear what state these experiments are in:
--enable-comp_intra_pred
--enable-newentropy
--enable-uvintra
This code has not been extensively tested yet, so there
is every likelihood that further bugs remain. I also
intend to do some code cleanup & refactoring in tandem
with the next patch that adds the 32x32 modes.
Change-Id: I1eba7f740a70b3510df58db53464535ef881b4d9
This patch includes:
1. fixes to disable block based temporal mixing when motion
is detected (because this version of mfqe only handles zero motion).
2. The criteria used for determining whether to mix or
not are changed to use squared differences rather than
absolute differences.
3. Additional checks on color mismatch and excessive block
flatness added. If the block as decoded has very low activity
it is unlikely to yield benefits for mixing.
Change-Id: I07331e5ab5ba64844c56e84b1a4b7de823eac6cb
In cases where you have a flat background occluded by a moving object
of similar luminosity in the foreground, it was likely that the
foreground blocks would persist for a few frames after the background
is uncovered. This is particularly noticeable when the object has a
different color than the background, so add the chroma planes in as an
additional check.
In addition, for block sizes of 8 and 16, the luma threshold is
applied on four subblocks independently, which helps when only part of
the background in the block has been uncovered.
This fixes issue #392, which includes a test clip to reproduce the
issue.
BUG=392
Change-Id: I2bd7b2b0e25e912dcac342e5ad6e8914f5afd302
Reduced the size of the struct by 8 bytes, which would be
a memory savings of 64800 bytes for 1080 resolutions. Had
an extra byte, so created an is_4x4 for B_PRED or SPLITMV
modes. This simplified the mode checks in
vp8_reset_mb_tokens_context and vp8_decode_mb_tokens.
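The 64800 figure is consistent with saving 8 bytes for each of the
1920 * 1080 / 256 = 8100 macroblocks of a 1080p frame. The sketch below
reproduces that arithmetic and shows the kind of single-flag check that
is_4x4 allows. The enum values and function names are hypothetical and
do not match the real headers.

```c
/* Hypothetical subset of prediction modes; the real enum has more entries. */
enum { DC_PRED_MODE, B_PRED_MODE, NEARESTMV_MODE, SPLITMV_MODE };

/* Bytes saved per frame when the per-macroblock struct shrinks by
 * bytes_per_mb, counting width * height / (16 * 16) macroblocks. */
static int mode_info_savings(int width, int height, int bytes_per_mb) {
  return width * height / (16 * 16) * bytes_per_mb;
}

/* Computed once when the mode is decoded, so later code can test one flag
 * instead of comparing against two modes. */
static int is_4x4_mode(int mode) {
  return mode == B_PRED_MODE || mode == SPLITMV_MODE;
}
```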
Change-Id: Ibec27784139abdc34d4d01f73c09f43e9e10e0f5
Found this bug while tracking down some anomalies in my experiments.
Since vp8_cost_one and vp8_cost_zero return unsigned int, the
bit shift by 8 will be incorrect if the value is negative.
I am cautiously optimistic that this fix will make the prob
updates more correct and somewhat improve results across the board.
But the update probabilities will need to be retuned I think.
Patch 2: Adding more of the same fixes using a macro.
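The failure mode can be reproduced in isolation. In the sketch below the
function names are hypothetical and the costs are plain 32-bit values.
It only shows why a difference of unsigned costs must be made signed
before it is scaled down by 256.

```c
#include <stdint.h>

/* Buggy pattern: the subtraction wraps around when new_cost > old_cost,
 * and the shift on the unsigned result brings in zero bits. */
static int cost_diff_unsigned(uint32_t old_cost, uint32_t new_cost) {
  return (int)((old_cost - new_cost) >> 8);
}

/* Fixed pattern: convert to a signed type before subtracting. Division is
 * used here instead of >> because right-shifting a negative int is
 * implementation-defined in C. */
static int cost_diff_signed(uint32_t old_cost, uint32_t new_cost) {
  return ((int)old_cost - (int)new_cost) / 256;
}
```

With old_cost = 100 and new_cost = 356 the unsigned version returns
16777215, while the signed version returns -1.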
Change-Id: I1a168f040e74e8c67e7225103b1c2af9a611da49
This new vp8_decode_mb_tokens() uses a modified version of
WebP's GetCoeffs function. For now, the dequant does not
occur in GetCoeffs.
Tests showed performance improvements up to 2.5% depending
on material.
Change-Id: Ia24d78627e16ffee5eb4d777ee8379a9270f07c5
Adds logic to disable mfqe for the first frame after a configuration
change such as change in resolution. Also adds some missing
if CONFIG_POSTPROC macro checks.
Change-Id: If29053dad50b676bd29189ab7f9fe250eb5d30b3
When ac_yquant>171, a key frame is enabled to use 8x8 transform. In
such case, MBs with DC_PRED or TM_PRED are selected to use T8x8. This
change helped the full STD-HD set by ~.1% or so, which is reasonable
considering how often key frame occurs in these encodings.
Change-Id: Id17009ef6327252177b19e6bf0d6628827febaf1
Deprecate fast quant and strict_quant code.
Small effect on quality as fast was used in first pass but the
effect is basically neutral across the derf set.
The rationale here is to reduce the number of code paths for
now to make experimentation easier. Optimized and fast code
options can be re-introduced later along with other encode
speed options.
Change-Id: Ia30c5daf3dbc52e72c83b277a1d281e3c934cdad
Various refactoring to make the subpel motion compensation
filters switchable by a frame level field.
Two types of 8-tap filters are supported in addition to the existing
bilinear and sixtap filters. One is the default 8-tap and the
other has a sharper cut-off for use with frames with substantial
edge content.
Patch 2: Added a preliminary strategy for filter selection based on
edginess detection. Also includes some filter changes.
Change-Id: I866085bda5ae143cfdf2ec88157feaabdf7bd63a
This change added a motion search skipping mechanism similar
to what we did in second pass. For a macroblock that is very
similar to the macroblock at same location on last frame,
we can set its mv to be zero, and skip motion search. This
improves first-pass performance for slide shows and video
conferencing clips with a slight PSNR loss.
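A minimal sketch of such a skip test is shown below. The type, the
function names, the use of SAD as the similarity measure and the
threshold are all assumptions for illustration; the commit does not say
which measure or threshold the encoder uses.

```c
#include <stdlib.h>

typedef struct { int row, col; } MotionVector; /* hypothetical type */

/* Sum of absolute differences over one 16x16 block. */
static unsigned int sad16x16(const unsigned char *a, int a_stride,
                             const unsigned char *b, int b_stride) {
  unsigned int sad = 0;
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 16; ++c) sad += (unsigned int)abs(a[c] - b[c]);
    a += a_stride;
    b += b_stride;
  }
  return sad;
}

/* Returns 1 and sets *mv to (0,0) when the co-located block in the last
 * frame is close enough, so the motion search can be skipped. Returns 0
 * and leaves *mv untouched when a real search is still needed. */
static int try_skip_motion_search(const unsigned char *cur, int cur_stride,
                                  const unsigned char *last, int last_stride,
                                  unsigned int threshold, MotionVector *mv) {
  if (sad16x16(cur, cur_stride, last, last_stride) < threshold) {
    mv->row = 0;
    mv->col = 0;
    return 1;
  }
  return 0;
}
```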
Change-Id: Ic73f9ef5604270ddd6d433170091d20361dfe229
The commit added a clamp to the 2nd motion vector used in compound
prediction to ensure the mv stays within UMV borders. The clamp is
similar to that of the first motion vector except that no SPLITMV is
ever used
for the 2nd motion vector.
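The clamp itself amounts to limiting each component to the allowed
range. In this sketch the struct, the function names and the way the
limits are passed in are hypothetical. In the codec the limits would be
derived from the macroblock position and the border size.

```c
typedef struct { int row, col; } MotionVector; /* hypothetical type */

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp a whole-macroblock motion vector so the predicted block stays
 * inside the extended (UMV) border. No per-partition handling is needed
 * because SPLITMV is never used for the second vector. */
static void clamp_second_mv(MotionVector *mv, int min_row, int max_row,
                            int min_col, int max_col) {
  mv->row = clamp_int(mv->row, min_row, max_row);
  mv->col = clamp_int(mv->col, min_col, max_col);
}
```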
Change-Id: I26dd63c304bd66b2e03a083749cc98c641667116
The recoding loop saves and restores the frame coding context for
recodes. However, in recoding of key frames, some of the coding context
saved was stale from the last encoded inter frame. The save/restore
sometimes overwrote the re-initialized coding context with saved context
from the last frame, resulting in encoder/decoder mismatch.
Change-Id: I354ae2f71074d142602d51d06544c05a2462caaf
This issue likely doesn't appear in the unmodified encoder, but
sufficient hacking on the mode selection loop can expose it.
Change-Id: I8a35831e8f08b549806d0c2c6900d42af883f78f
In a previous commit, a duplicate of the header file
defaultcoefcounts.h was identified. This commit updates the .mk file to
ensure configure and make work properly for all platforms.
Change-Id: I31a39c809a734ba438ee53db700f252e9a03eddd
Pulled out super block code for the snapshot as this
is not quite ready and will need an extensive re-merge.
Change-Id: I436369b511257447a7b0ea064016cb63f5011849
Break MFQE code into its own file.
It is currently only valid for 16x16 and 8x8 Y blocks. It also filters
4x4 U/V blocks.
Refactor filtering and add associated assembly. Limited test cases show
--mfqe introduces a penalty of ~20% with HD content. The assembly
reduces the penalty to ~15%
Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf
https://gerrit.chromium.org/gerrit/#change,17319 fixes cost estimating
to take skip_eob into account. No quality difference seen on derf set
tests, but about .4% gain on STD_HD set.
Change-Id: Ic5fe6d35ee021e664a6fcd28037b8432a0e470ca
Coefficient costing failed to take account of the first branch
being skipped ( 0 vs eob) if the previous token is 0.
Fixed rd to account for slightly increased token cost & cleaned up
warning message
Change-Id: I56140635d9f48a28dded5a816964e973a53975ef
This gives a modest gain on derf overall, although at low bitrates the
cost is still too high, so this can be improved further.
Patch 2. Re-base and fix 80 column issues
Change-Id: Ida2f9fa3fe75370669f6a27b37108dc602231c63
The commit changed to compute UV intra RD estimates for 4x4 and 8x8
separately to be used in mode decision for MB modes associated with
the appropriate transform size respectively. Now finally after many
other changes related to the 8x8 quantizer zbin boost and zbin_mode_boost,
this change overall helps the HD(with 8x8) by around ~.13%.
(avg .13% glb .13% ssim .17%)
The commit also has a few changes for eliminating compiler warnings.
Change-Id: Ibab35dad44820c87e6b44799c66f8d519cc37344
The commit added the correct Zbin_mode_boost initialization based on
Intra Mode before using rate distortion to pick UV intra mode.
Change-Id: I8e57878ff356a06672f6fa2431be860bf9b9a5c7
Produce the token partitions on-the-fly, while processing each MB.
Context is updated at the beginning of each frame based on the
previous frame's counters. Optionally, the encoder outputs partitions in
separate buffers. For frame based output, partitions are concatenated
internally.
Limitations:
- enabled just in combination with realtime-only mode
- number of encoding threads has to be equal or less than the
number of token partitions. For this reason, by default the encoder
will do 8 token partitions.
- vpxenc supports partition output (-P) just in combination with
IVF output format (--ivf)
Performance:
- Realtime encoder can be up to 13% faster (ARM) depending on the number
of threads and bitrate settings. Constant gain over the 5-16 speed
range.
- Token buffer reduced from one frame to 8 MBs
Quality:
- quality is affected by the delayed context updates. This again
depends on input material, speed and bitrate settings. For VC
style input the loss seen is up to 0.2dB. If error-resilient=2
mode is used then the effect of this change is negligible.
Example:
./configure --enable-realtime-only --enable-onthefly-bitpacking
./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
--target-bitrate=1000 --token-parts=3 --static-thresh=2000
--ivf -P -t 4 -o strm.ivf tanya_640x480.yuv
Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76
Eliminated some mb branches along with other code cleanups.
This is part of an ongoing effort to remove cut/paste
code in the decoder.
Change-Id: Ifabb0f67cafa6922b5a0e89a0d03a9b34e9e5752
The commit fixed a problem where 8x8 regular quantizer was using the
4x4 zbinboost lookup table that only has 16 entries at each Q. The
commit assigned a uniform zbin boost value for all cases that there
are more than 16 consecutive zeros. The change only affects MBs using
8x8 transform. The fix has a slightly positive impact on quality.
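The shape of the fix can be sketched as a bounds check in front of the
table lookup. The table contents, the uniform value and the names below
are made up for illustration; only the 16-entry table size comes from
the description above.

```c
#define ZBIN_BOOST_ENTRIES 16

/* Hypothetical boost values indexed by the current run of zeros. */
static const int zbin_boost_table[ZBIN_BOOST_ENTRIES] = {
  0, 0, 8, 10, 12, 14, 16, 20, 24, 28, 32, 36, 40, 44, 44, 44
};

/* Hypothetical uniform boost for runs past the end of the table. */
#define ZBIN_BOOST_LONG_RUN 48

/* An 8x8 block has up to 64 coefficients, so the zero run can exceed the
 * range the 4x4 table was built for. Indexing the table directly would
 * read out of bounds. */
static int zbin_boost_for_run(int zero_run) {
  if (zero_run < ZBIN_BOOST_ENTRIES) return zbin_boost_table[zero_run];
  return ZBIN_BOOST_LONG_RUN;
}
```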
Test results:
http://www.corp.google.com/~yaowu/no_crawl/hd_fixzbinb.html
(avg psnr: .26% glb psnr: .21% ssim: .28%)
Results on cif size clip are also positive even though gain is smaller
http://www.corp.google.com/~yaowu/no_crawl/derf_fixzbinb.html
Change-Id: Ibe8f6da181d1fb377fbd0d3b5feb15be0cfa2017
This is the first patch for refactoring of the code related to
high-precision mv, so that 1/4 and 1/8 pel motion vectors can
co-exist in the same bit-stream by use of a frame level flag.
The current patch works fine for only use of 1/4th and
only use of 1/8th pel mv, but there are some issues with the
mode switching in between. Subsequent patches on this change Id
will fix the remaining issues.
Patch 2: Adds fixes to make sure that multiple mv precisions can
co-exist in the bit-stream. Frame level switching has been tested
to work correctly.
Patch 3: Fixes lines exceeding 80 char
Patch 4:
http://www.corp.google.com/~debargha/vp8_results/enhinterp.html
Results on derf after ssse3 bugfix, compared to everything
enabled but the 8-tap, 1/8-subpel and 1/16-subpel uv. Overall the
gains are about 3% now. Hopefully there are no more bugs lingering.
Apparently the sse3 bug affected the quarter subpel results more than
the eighth pel ones (which is understandable because one bad predictor
due to the bug matters less if there are a lot more subpel options
available as in the 1/8 subpel case).
The results in the 4th column correspond to the current settings.
The first two columns correspond to two settings of adaptive switching
of the 1/4 or 1/8 subpel mode based on initial Q estimate. These
do not work as well as just using 1/8 all the time yet.
Change-Id: I3ef392ad338329f4d68a85257a49f2b14f3af472
Overall, the commit is break even to very slightly positive on the derf
test compared to using the 4x4 transform everywhere.
Change-Id: I2a7c19599aa54c2d3a5b35db0dc891ba8a6a2b26
Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead. vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict. Future commits will fix this
code duplication.
Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5
When we encode slide-show clips, for the majority of the time,
only ZEROMV mode is checked, and all other modes are skipped.
This change delayed uv intra-mode evaluation until intra mode is
actually checked. This gave big performance gain for slide-show
video encoding (2nd pass gain: 18% to 28%). But, this change
doesn't help other types of videos.
Also, zbin_mode_boost is adjusted in mode-checking loop, which
causes bitstream mismatch before/after this change when --best
or --good with --cpu-used=0 are used.
Change-Id: I582b3e69fd384039994360e870e6e059c36a64cc
The "update" variable was used as a flag in the coef_prob update dry run
that tests if a frame should encode an update at all. The wrong init
value forced the update to always happen. Fixing this gives a minor
improvement in low bit rate situations when 8x8 transform is allowed.
Change-Id: Icb498e8d6a62fd074dcbc2065b797cba9237cb51
use oxcf instead of common in check to Reinit the
lookahead buffer if the frame size changes
prior behavior would cause assertion fail/crash
first observed in:
support changing resolution with vpx_codec_enc_config_set
Change-Id: Ib669916ca9b4f206d4cc3caab5107e49d39a36aa
For now the interface elements have been left in place
to make sure existing parameter files work but parameters
relating to drop frame won't do anything.
Change-Id: I579ee614726387381c546845dac4bc03c74c6a07
The Lagrangian interpolation filter is maximally flat in the
passband. There is non-trivial improvement with the hd set, while
for derf the results are virtually unchanged.
See:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html (derf)
http://www.corp.google.com/~debargha/vp8_results/enhinterpn_hd.html (HD)
Patch 2: Updated the results for derf in the html above to use the
new baseline. There is still about 4% improvement. Will update the
hd baseline later (since it takes 9 hours to run on my machine)
Patch 3: By mistake the default filter was left at 60 - should be 0
to use the new interpolation filter.
Change-Id: If5f64444976562415d68a2aeabb94fdfa0d47890
* Removes EDGE_PIXEL_FILTER for external snapshot
* changes the default 8-tap filter based on high precision results
in http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html
* changes the default prob tables for high-precision mv encoding to
favor zeros in the last bit (i.e. quarter pel). This is only important
for short clips.
Change-Id: I02bb0de8679d9eec06cdbcc8160dbf073cd847a4
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.
Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpolation)
Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V
Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).
Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16 th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.
With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over the
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html
Results on high precision mv and on the hd set are to follow.
Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c
Patch 7: Cleaning up various debug messages.
Patch 8: Merge conflict
Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
When temporal layers is used (i.e., number_of_layers > 1),
we don't use the frame rate boost for setting the key
frame target size. The factor was forcing the target size to be
always at its minimum (2* per_frame_bandwidth) for low frame rates
(i.e., base layer frame rate).
Generally we should modify or remove this frame rate factor;
for now we turn it off for number_of_layers > 1.
Change-Id: Ia5acf406c9b2f634d30ac2473adc7b9bf2e7e6c6
Yunqing fixed an oddity in UVIntra skippable evaluation for stable
branch, which brought up the fact that the evaluation is broken.
The issue was that for MBs with 2nd order block, the eob for 1st
order blocks is set at 1. The previous evaluation did not take that
into account. This commit intends to fix the problem. The commit also
absorbed Yunqing's fix for UVIntra skippable evaluation.
Test on hd showed some good gains in combination with LPF bias fix:
http://www.corp.google.com/~yaowu/no_crawl/LPFBias_FixSkip.html
(avg psnr: .34%, glb psnr: .32%, ssim: .22%)
Change-Id: I36af11c8ef7f643e8ff46da7bf3a167b437039d4
Reworked the code to use vp8_build_intra_predictors_mbuv_s
instead. This is WIP with the goal of eliminating all
functions in reconintra_mt.h
Change-Id: I61c4a132684544b24a38c4a90044597c6ec0dd52
The bias in picklpf is intended to make the search less greedy about
the best frame-level psnr while maximizing overall quality for a clip.
This commit reduces the bias for frames using the 8x8 transform to
achieve better compression overall.
The change improves compression by ~.15% consistently on most of the
HD clips tested.
http://www.corp.google.com/~yaowu/no_crawl/LPFBias_FixSkip.html
Change-Id: Ic30932d2b8eaebd52339b0195f569edc48eed7bc
In vp8_rd_pick_inter_mode(), if the total of eobs is zero, rate needs
to be adjusted since there are no non-zero coefficients for
transmission. The uv intra eobs calculated in
rd_pick_intra_mbuv_mode() need to be saved before they are
overwritten by inter-mode eobs.
Change-Id: I41dd04fba912e8122ef95793d4d98a251bc60e58
mode_info_context->mbmi.mb_skip_coeff must always reflect whether
coeffs exist for a given MB. The loopfilter needs this
info.
mb_skip_coeff is either set by vp8_tokenize_mb or has to be set to
1 when the MB is skipped by mode selection. This has to be done
regardless of the mb_no_coeff_skip value.
prob_skip_false is needed only when mb_no_coeff_skip is 1. No need to
keep count of both skip_false and skip_true as they are complementary
(skip_true+skip_false = total_mbs)
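As an illustration of the complementary counts, an 8-bit prob_skip_false
can be derived from a skip_true counter alone. The helper name and the
clamping to [1, 255] are assumptions made for this sketch.

```c
#include <assert.h>

/* Hypothetical helper: probability (out of 256) that an MB is not skipped. */
static int prob_skip_false_from_count(int skip_true_count, int total_mbs) {
  int prob;

  assert(total_mbs > 0);
  assert(skip_true_count >= 0 && skip_true_count <= total_mbs);

  /* skip_false = total_mbs - skip_true, so one counter is enough. */
  prob = (total_mbs - skip_true_count) * 256 / total_mbs;

  /* Keep the value representable as a non-degenerate 8-bit probability. */
  if (prob < 1) prob = 1;
  if (prob > 255) prob = 255;

  return prob;
}
```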
Change-Id: I3c74c9a0ee37bec10de7bb796e408f3e77006813
Depending on the implementation, the optimized SAD functions may return
early when the calculated SAD exceeds max_sad.
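A minimal sketch of what such an early return looks like in plain C. The
function name and the per-row check are illustrative assumptions; the
optimized versions may check at different points.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 16x16 SAD with an early exit, for illustration only. */
static unsigned int sad16x16_early_exit(const unsigned char *src,
                                        int src_stride,
                                        const unsigned char *ref,
                                        int ref_stride,
                                        unsigned int max_sad) {
  unsigned int sad = 0;
  int r, c;

  assert(src != NULL && ref != NULL);

  for (r = 0; r < 16; ++r) {
    for (c = 0; c < 16; ++c) sad += (unsigned int)abs(src[c] - ref[c]);

    /* Once the partial sum exceeds max_sad, stop and return it.
     * The value is then only a lower bound on the full SAD. */
    if (sad > max_sad) return sad;

    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

A caller can therefore rely on the result only for the comparison against
max_sad. A return value above max_sad means "worse than the threshold",
not the exact SAD of the block.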
Change-Id: I05ce5b2d34e6d45fb3ec2a450aa99c4f3343bf3a
The commit rationalized and simplified the entropy context conversion
between MBs using the 8x8 transform and MBs using the 4x4 transform. The
old version had a number of oddities in how a 4x4 transform MB's context
is used for 8x8 blocks other than the first 8x8 within an MB.
Tests showed the change has a gain of ~.1% for avg psnr, glb psnr and
ssim on the limited HD set.
Change-Id: I774536c416baa6845aa741f956d8a69fa40e5d47
Refactoring some of the mode decoding logic introduced a bug where
the segmentation maps would not be properly reset on keyframes.
http://code.google.com/p/webm/issues/detail?id=378
The text of the bug is somewhat misleading as I initially read it to
imply the bug was present in v0.9.7-p1 (Cayuga), but note the text
"master", which indicates this was something subsequent. This issue
bisects back to v0.9.7-p1-84-ga99c20c, so unfortunately it was broken
during the Duclair release.
Thanks to Alexei Leonenko for investigating the root cause.
Change-Id: I9713c9f070eb37b31b3b029d9ef96be9b6ea2def
On the Android NDK, rand() is an inlined function, but our SSE
optimization needs a symbol for rand().
Change-Id: I42ab00e3255208ba95d7f9b9a8a3605ff58da8e1
Removal of the pickinter.c and .h files and calls to this
code.
Removal of some code relating to real time and one pass
settings though there is more to be done in this regard.
However, vp8_set_speed_features() now
only supports modes 0 and 1 and speeds up to 3
so rd should always be set.
Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8