This includes trellis optimization, forward/inverse transform,
quantization, tokenization and stuffing functions.
Change-Id: Ibd34132e1bf0cd667671a57b3f25b3d361b9bf8a
Entropy coding takes care of this anyway, and this causes changes to
the txfm size assigned to skip blocks, which can affect the loopfilter
output, thus causing encoder/decoding mismatches.
Change-Id: I591a8d8a4758a507986b751a9f83e6d76e406998
The update_mb_segmentation_map flag was being signalled earlier than
other data dependent on that flag. Consolidate this data so it's
parsed within the same if-scope as the flag is originally parsed in.
Change-Id: I10e90b4f511856445ef75a85a44ff441e1e5e672
If the threshold(limits) <= 0, skipped filtering and copied the
frame directly. Also, fixed memory allocation checking.
Change-Id: If3d79d5b2bcb71b9777e6eb5cba1384585131e22
Use the common update_skip_probs() function rather than duplicating its
logic in write_kf_modes().
Change-Id: I2890a28f6907cb79ffe0fb21d20f0ef98b85cdd9
If a reference frame is forced because of low dissimilarity, then
shut off the search of intra modes. This change has mixed results. On
one clip (QVGA), it hurt quality by ~1.5% with negligible speed impact.
On another (VGA) it had negligible affect on quality, but a ~0.2% speed
impact.
Change-Id: Ic8b07648979d732f489de5f094957e140f84d2eb
Rather than overloading the parent_ref_frame value to shut off the
search in some cases, add a new validity flag. This cleans up some
of the duplicated mr_encoder_id && mr_low_res_mv_avail checks as
well, for readability.
Change-Id: Iddad93a27066c3d85ff2f25a361ac113b288ab7b
Results: derf (vanilla or +hybridtx) +0.2% and (+hybrid16x16
or +tx16x16) +0.7%-0.8%; HD (vanilla or +hybridtx) +0.1-0.2%
and (+hybrid16x16 or +tx16x16) +1.4%, STD/HD (vanilla or +hybridtx)
about even, and (+hybrid16x16 or +tx16x16) +0.8-1.0%.
Change-Id: I03899e2f7a64e725a863f32e55366035ba77aa62
1. Algorithm modification:
Instead of having same filter threshold for a whole frame, now we
allow the thresholds to be adjusted for each macroblock. In current
implementation, to avoid excessive blur on background as reported
in issue480(http://code.google.com/p/webm/issues/detail?id=480), we
reduce the thresholds for skipped macroblocks.
2. SSE2 optimization:
As started in issue479(http://code.google.com/p/webm/issues/detail?id=479),
the filter calculation was adjusted for better performance. The c
code was also modified accordingly. This made the deblock filter
2x faster, and the decoder was 1.2x faster overall.
Next, the demacroblock filter will be modified similarly.
Change-Id: I05e54c3f580ccd427487d085096b3174f2ab7e86
In some situations, believed to be an interaction between temporal
scalability and dropped frames, the references available to an
encoder may not be the same references available to its parent.
Previously, the code tried to force the reference frame chosen by
the parent to be used on this frame, even if it was disabled. This
was preventing the pick mode loop from running even once, which led
to a crash.
Attempts to reproduce this bug locally were unsuccessful, so it is
still undetermined what the underlying cause of this issue is. In
the specific case that was failing, the application did not set
any flags which influenced the reference selection on that frame.
ref_frame_flags indicated that the golden frame was disabled,
believed to be because the last frame updated the last and golden
frames, so golden was shut off by default. It's not clear why this
wouldn't have also been true in the lower res encoder, ie, why the
lower res encoder decided to use and/or was allowed to use the
golden frame. We weren't able to debug into the non-crashing
lower res encoder as the crash couldn't be reproduced locally.
Change-Id: Ifb265253d26963ac2afde0e20cf6792788be6af7
This commit fixes unsafe simd / floating point interactions arising
from the current hybrid and 16x16 transform implementation.
These led to a raft of bugs and issues when the project was
built using VS2008 for Win32 though they did not show up with
the unix builds.
Gerrit makes a meal out of presenting the fix but all I have actually
done is indent the body of each function that uses floating point by
one level and bracket with emms instructions using the function
vp8_clear_system_state(). See below.
function () {
vp8_clear_system_state();
{
... function body
}
vp8_clear_system_state();
}
This is almost certainly over the top in terms of number of emms
instructions but is a temporary measure pending implementation of
integer variants of each function to replace the floating point.
Limited testing suggests that this fixes the problems that arose for
Win32 VS2008 when the hybrid or 16x16 transforms were enabled.
Change-Id: I7c9a72bd79315246ed880578dec51e2b7c178442
If a parent mb is available but is intra coded, then parent_ref_mv is
invalid. Check that the parent is inter coded before trying to access
the parent_ref_mv. Previously the parent_ref_mv was being read from
an uninitialized stack allocation, causing potential OOB reads and
other undefined behavior.
Change-Id: I0c93cd412a19c3a184bcf6decaa145b3a036a6c0
Protect the call to {Initialize,Delete}CriticalSection() with an
Interlocked{Inc,Dec}rement() pair, rather than the previous static
initialization. This should play better with AppVerifier, and fix issue
http://code.google.com/p/webm/issues/detail?id=467
Change-Id: I06eadbfac1b3b4414adb8eac862ef9bd14bbe4ad
The codec as it stood placed a keyframe one frame after a
real cut scene - and ignored datarate and other considerations.
TODO: Its possible that we should detect a keyframe and recode
the frame ( in certain circumstances) to improve quality.
Change-Id: Ia1fd6d90103f4da4d21ca5ab62897d22e0b888a8
Reset the cyclie refresh mode index in alloc_compressor_data().
This is needed to handle both cases of internal and
external spatial resizing.
Change-Id: I2697e12d45135eae2e8f0d45161811f24722312a
Separates the entropy coding context models for 4x4, 8x8 and 16x16
ADST variants.
There is a small improvement for HD (hd/std-hd) by about 0.1-0.2%.
Results on derf/yt are about the same, probably because there is not
enough statistics.
Results may improve somewhat once the initial probability tables are
updated for the hybrid transforms which is coming soon.
Change-Id: Ic7c0c62dacc68ef551054fdb575be8b8507d32a8
On an internal spatial resize, this mode index was not reset to 0,
and therefore could exceed dimensions of seg_map or cyclic_refresh_map.
Change-Id: I6fe85dbd2765eb0207a9d9f71fda8d8b8c34f075
Fixes some build issues for people building for win32 who have a
pthreads emulation layer installed.
Change-Id: I0e0003fa01f65020f6ced35d961dcb1130db37a8
This should avoid problems with blocks gettings high quality
improvement despite having recently moved:
Change-Id: Ic0af0de2d6577807fa3c553f47b55d547ef36359
Set the seg map to 0 for key frame.
In previous commit on cyclic refresh, the seg map for key frame
was not reset, and instead used the seg map from last frame.
Change-Id: I848eb2face420dfcd2f7daca6f070b9127ca938b
-Increase the amount of mbs to be refreshed.
-Replace the delta qp with a fixed and reduced delta.
-Change to the mb update loop to try to always update same amount of mbs.
Change-Id: I93ac88002fd8dc677d2337f77998ff93f64e4ff9
With this change, even if hybridtransform8x8 experiment is off,
8x8 dct is used for the I8x8 mode. However note that the gains
observed with the hybridtransform8x8 experiment will now be less,
since part of the gain is now merged in.
Change-Id: I9afb3880906fd0a1368a374041fc08efcf060c54
This change is necessary for the frame-based multithreading
implementation.
Since the postproc occurs in this call, vpxdec was modified to time around
vpx_codec_get_frame()
Change-Id: I389acf78b6003cd35e41becc16c893f7d3028523
The commit changed to use 3 rows above and 3 cols from left for SAD
scoring for selecting the best reference motion vector. The change
helped std-hd set by >.2% on psnr/ssim metrics.
Change-Id: Ifad3b528d0b4b6e3c22518af789d76eff23c1520
Initialize the top line at the beginning of each frame and
the left column at the beginning of each row.
Change-Id: I5412f7ea49ffc490215cf65a62715a6c5e3a5a29
The high-precision (1/8) pel bit is turned off if the reference
MV is larger than a threshold. The motivation for this patch is
the intuition that if motion is likely large (as indicated by
the reference), there is likley to be more motion blur, and as
a result 1/8 pel precision would be wasteful both in rd sense
as well as computationally.
The feature is incorporated as part of the newmventropy experiment.
There is a modest RD improvement with the patch. Overall the
results with the newmventropy experiment with the threshold being
16 integer pels are:
derf: +0.279%
std-hd: +0.617%
hd: +1.299%
yt: +0.822%
With threshold 8 integer pels are:
derf: +0.295%
std-hd: +0.623%
hd: +1.365%
yt: +0.847%
Patch: rebased
Patch: rebase fixes
Change-Id: I4ed14600df3c457944e6541ed407cb6e91fe428b
Multiple decoders were getting allocated per frame.
If the decoder crashed we exitted with out freeing
them and the next time in we'd allocate over.
This fix removes the allocation and just has 8
boolcoders in the pbi structure
Change-Id: I638b5bda23b622b43b7992aec21dd7cf6f6278da
Some cleanups that will make it easier to maintain the code
and incorporate upcoming changes on entropy coding for the
hybrid transforms.
Change-Id: I44bdba368f7b8bf203161d7a6d3b1fc2c9e21a8f
If the decoder crashes and returned an error before it set up
block offsets but after it set up frame buffers. We had a
problem decoding the next keyframe because the block offsets
were never set.
Change-Id: Ied2866e9770d80fc66241d5e0d978d4f5f9cdd89
This commit merges those parts of the CONFIG_NEW_MVREF
that specifically relate to choosing a better set of candidate
MV references into the NEWBESTREFMV experiment.
CONFIG_NEW_MVREF will then be used for changes relating
to the explicit coding of a cost optimized MV reference in the
bitstream as part of MV coding.
Change-Id: Ied982c0ad72093eab29e38b8cd74d5c3d7458b10
Extend experiment to use both vectors from MBs
coded using compound prediction as candidates.
In final sort only consider best 4 candidates
for now but make sure 0,0 is always one of them.
Other minor changes to new MV reference code.
Pass in Mv list to vp8_find_best_ref_mvs().
Change-Id: Ib96220c33c6b80bd1d5e0fbe8b68121be7997095
Adds a new experiment with redesigned/refactored motion vector entropy
coding. The patch also takes a first step towards separating the
integer and fractional pel components of a MV. However the fractional
pel encoding still depends on the integer pel part and so they are
not fully independent. Further experiments are in progress to see
how much they can be decoupled without affecting performance.
All components including entropy coding/decoding, costing for MV
search, forward updates and backward updates to probability tables,
have been implemented.
Results so far:
derf: +0.19%
std-hd: +0.28%
yt: +0.80%
hd: +1.15%
Patch: Simplifies the fractional pel models:
derf: +0.284%
std-hd: +0.289%
yt: +0.849%
hd: +1.254%
Patch: Some changes in the models, rebased.
derf: +0.330%
std-hd: +0.306%
yt: +0.816%
hd: +1.225%
Change-Id: I646b3c48f3587f4cc909639b78c3798da6402678
Adjusts some of the qualification thresholds in mfqe to eliminate
artifacts due to wrong decisions. Besides, a new qualification
criteria is used to disable mfqe if the quality of the previous
frame is itself not too good.
Change-Id: I4097c20b7fd4fcc60cc3003c1e33e8faae2ff066
The denoiser function was modified to reduce the computational
complexity.
1. The denoiser c function modification:
The original implementation calculated pixel's filter_coefficient
based on the pixel value difference between current raw frame and last
denoised raw frame, and stored them in lookup tables. For each pixel c,
find its coefficient using
filter_coefficient[c] = LUT[abs_diff[c]];
and then apply filtering operation for the pixel.
The denoising filter costed about 12% of encoding time when it was
turned on, and half of the time was spent on finding coefficients in
lookup tables. In order to simplify the process, a short cut was taken.
The pixel adjustments vs. pixel diff value were calculated ahead of time.
adjustment = filtered_value - current_raw
= (filter_coefficient * diff + 128) >> 8
The adjustment vs. diff curve becomes flat very quick when diff increases.
This allowed us to use only several levels to get a close approximation
of the curve. Following the denoiser algorithm, the adjustments are
further modified according to how big the motion magnitude is.
2. The sse2 function was rewritten.
This change made denoiser filter function 3x faster, and improved the
encoder performance by 7% ~ 10% with the denoiser on.
Change-Id: I93a4308963b8e80c7307f96ffa8b8c667425bf50
Enable ADST/DCT of dimension 16x16 for I16X16 modes. This change provides
benefits mostly for hd sequences.
Set up the framework for selectable transform dimension.
Also allowing quantization parameter threshold to control the use
of hybrid transform (This is currently disabled by setting threshold
always above the quantization parameter. Adaptive thresholding can
be built upon this, which will further improve the coding performance.)
The coding performance gains (with respect to the codec that has all
other configuration settings turned on) are
derf: 0.013
yt: 0.086
hd: 0.198
std-hd: 0.501
Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926
Alternative strategy for finding a list of candidate motion
vectors to use as reference values in mv coding and as
nearest and near.
Sort by sad in vp8_find_best_ref_mvs() rather than just
pick the best. Allow 0,0 as a best ref option but not a
nearest or near unless there are no alternatives.
Encode/Decode verified on at least some clips.
Some commented out experimental and stats code still in place.
Gain over existing code averages about 1% on derf (alll metrics)
with improvement on all clips. Other test results pending.
The entropy coding of the mode (nearest/near etc) still
depends upon and requires the old "findnear" code so
this needs looking at and may provide room for further gains.
Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247
For videos with big static background(such as video conferencing
clips), the mode decision was biased to ZEROMV in order to
obtain a stable background. The percentage of ZEROMV on last
frame was used to predict if there is static area in current frame,
and checking already-encoded neighboring macroblocks' motion
vectors to make sure the local area has low motion.
Change-Id: I05b3241d3a56a0bda88b6681e5646c1c8baf2e57
The current way of counting inter_zz_count doesn't work correctly
in multi-threaded encoding. Calculating it after the frame is
encoded fixed the problem.
Change-Id: Ifcb1972cde950b8cc194f75c6d7b6af09e8b0e65
Added missing parameters to calls to:
vp8_build_intra_predictors_internal
vp8_build_intra_predictors_mbuv_internal
Change-Id: If8eeb8ff23eff4572397b404fe61be5d0c950bbe
This commit adds a pick_sb_mode() function which selects the best 32x32
superblock coding mode. Then it selects the best per-MB modes, compares
the two and encodes that in the bitstream.
The bitstream coding is rather simplistic right now. At the SB level,
we code a bit to indicate whether this block uses SB-coding (32x32
prediction) or MB-coding (anything else), and then we follow with the
actual modes. This could and should be modified in the future, but is
omitted from this commit because it will likely involve reorganizing
much more code rather than just adding SB coding, so it's better to let
that be judged on its own merits.
Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%.
Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6
Loop filter producing wierd artifacts when
repeatedly applied in noisy video. This
mitigates the effect.
Change-Id: If4b1a8543912d186a486f84e11d8b01f7436fa5f
Resolved the decoder mismatch issue due to quantization parameter
threshold for hybrid transform coding. The macroblock dequantizer
initialization is moved to be performed before coefficient
detokenization, since the (de)tokenization is now dependent on the
macroblock level quantization parameter.
Change-Id: I443da4992ebb70ae4114750b2f1363c0c628580e
This doesn't affect the result, since there are no MVs coded using this
entropy. It does, however, silence valgrind warnings about uninitialized
variables.
Change-Id: I6e21ba92df6ce5381bf58b8c349ef4373294a0b6
About 3.5x faster, 30% overall encoder speedup. Rest of optimizations
will come soon (see TODO section in filter_sse4.c).
Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0
The reference motion vector selected by surrounding pixels that has
the best matching score is used as nearest motion vector.
The change has shown consistent gain on all test sets, compression
gains range from .2% to .6%. The variation is largely dependent on
various other experiments on or off.
Change-Id: I5552e1c2f6fc57c3e8818a5ee41ffda89af05e75
References to MACROBLOCKD that use "x" changed to "xd"
to comply with convention elsewhere that x = MACROBLOCK
and xd = MACROBLOCKD.
Simplify some repeat references using local variables.
Change-Id: I0ba2e79536add08140a6c8b19698fcf5077246bc
Add local variable in several places to reference the MB mode
info structure. Currently this is usually accessed in the code as
x->e_mbd.mode_info_context->mbmi.* or in some places
xd->mode_info_context->mbmi.*
Resolved some uses of x-> for the MACROBLOCKD structure.
Rebased without dependency on motion reference experiment.
Change-Id: If6718276ee4f2ef131825d1524dfdb02a3793aed
Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.
Also removes an experimental filter.
Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd
Latest version of all scripts/makefile but rtcd_defs.sh is empty, all
existing functions are still selected using the old/current way.
Change-Id: Ib92946a48a31d6c8d1d7359eca524bc1d3e66174
using large values for the timebase, e.g., {33333, 1000000} could
rollover the timestamp calculation in vp8e_encode as it was not using
64-bit math.
originally reported on ffmpeg's trac:
https://ffmpeg.org/trac/ffmpeg/ticket/1014
BUG=468
Change-Id: Iedb4e11de086a3dda75097bfaf08f2488e2088d8
The commit replaces run-time initialization of cosine constants with
static constant values, which provides ~30% relief on slow speed. The
real solution, however will be to implement integer versions of those
functions that current use float/double.
Change-Id: Ie3ff1793509653d78dd1aeaf88cc6737da1bc55f
Using surrounding reconstructed pixels from left and above to select
best matching mv to use as reference motion vector for mv encoding.
Test results:
AVGPSNR GLBPSNR VPXSSIM
Derf: 1.107% 1.062% 0.992%
Std-hd:1.209% 1.176% 1.029%
Change-Id: I8f10e09ee6538c05df2fb9f069abcaf1edb3fca6
The forward and inverse hybrid transforms are now performed using
single function modules, where the dimension is sent as argument.
Added an inline function clip8b to clip the reconstruction pixels
into range of 0-255.
Change-Id: Id7d870b3e1aefc092721c80c0af6f641eb5f3747
This allows building on MountainLion as the 10.6 SDK has been
removed from the latest Xcode version (4.4 4F250). Also fix
all warnings for that build.
Change-Id: Ib70bca4a25295f13595f0d10ea9f0229631de5a4
Merged in the high_precision_mv experiment to make it easier
to work on new mv encoding strategies. Also removed
coef_update_probs3().
Change-Id: I82d3b0bb642419fe05dba82528bc9ba010e90924
Previouly, the decoding of mode and motion vector are done a per frame
basis followed by residue decoding and reconstuction. The commit added
the option to allow decoder to interleave the decoding of mode and mvs
with the residue decoding on a per MB basis.
Change-Id: Ia5316f4a7af9ba7f155c92b5a6fc97201b653571
Fixed the code review comments.
Under the htrans8x8 experiment the 8X8 DCT in the
I8X8 mode is replaced with a combination of 8X8 ADST and
DCT.
Overall coding gains with the htrans8x8 experiment are:
derf: 0.486
std-hd: 1.040
hd: 1.063
yt: 0.506
Note that part of the gain comes from bigger transforms
(8x8 instead of 4x4) and part comes from replacing the DCT
wth the ADST.
Change-Id: I92ca6bbfce11b4165d612b81d9adfad4d010c775
Set on all 16x16 intra/inter modes
Features:
- Butterfly fDCT/iDCT
- Loop filter does not filter internal edges with 16x16
- Optimize coefficient function
- Update coefficient probability function
- RD
- Entropy stats
- 16x16 is a config option
Have not tested with experiments.
hd: 2.60%
std-hd: 2.43%
yt: 1.32%
derf: 0.60%
Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
Interleaved loopfiltering with decode. For 1080p clips, up to 1%
performance gain. For 4k clips, up to 10% seen. This patch is required
for better "frame-based" multithreading.
Change-Id: Ic834cf32297cc04f27e8205652fb9f70cbe290db
Apply 2D-DCT transform of dimension 8x8 to encode prediction
residuals of I8X8 mode.
Brought back block type 3 probability context model for 8x8 tokens,
which is used for the coefficients of Y blocks in I8x8 modes. The
coefficient costs estimate of I8X8 mode in rate-distortion is also
changed appropriately.
Performance results:
derf: 0.246
yt: 0.114
std-hd: 0.730
hd: 0.670
Change-Id: If1d970eeb4e1827c9f0d2c5b27d33089b347ea27
predict_d has become canonical. Remove previous helper function.
Disable ARM assembly pending update.
Change-Id: Idd84ac8a28f9b0221ea97904a77de1e705d06a7d
The sync interval for the multithreaded encoder was considered as not changing
during the encoding. This is not true if picture size is changed.
The encoder could dead-lock because the main thread and the other threads were
using different sync interval.
Change-Id: I75232bbdbc6c02d77f830d870fd8b4e96697c64e
After the picture size was changed to a bigger one, the internal memory was
corrupted and multithreaded encoder was deadlocking.
Memory for last frame's MVs, segmentation map and active map were allocated when
the compressor was created (vp8_create_compressor). Buffers need to be
reallocated when picture size is changed, so, the allocation was moved to
vp8_alloc_compressor_data, which is called every time the picture is resized.
Change-Id: I7ce16b8e69bbf0386d7997df57add155aada2240
Merged the enhanced_interp experiment.
Found and fixed a bug in the include files framework, whereby
certain encoder files were still using the old INTERP_EXTEND
value of 3 instead of 4. The thresholds for mv range mcomp.c
need a small adjustment to prevent crashes.
The results are more or less unchanged.
Change-Id: Iac5008390f1efc97ce1102fbb5f8989c847fb579
Allows for swtiching/setting interpolation filters at the MB
level. A frame level flag indicates whether to use a specifc
filter for the entire frame or to signal the interpolation
filter for each MB. When switchable filters are used, the
encoder chooses between 8-tap and 8-tap sharp filters. The
code currently has options to explore other variations as well,
which will be cleaned up subsequently.
One issue with the framework is that encoding is slow. I
tried to do some tricks to speed things up but it is still slow.
Decoding speed should not be affected since the number of
filter taps remain unchanged.
With the current version, we are up 0.5% on derf on average but
some videos city/mobile improve by close to 4 and 2% respectively.
If we did a full-search by turning the SEARCH_BEST_FILTER flag
on, the results are somewhat better.
The framework can be combined with filtered prediction, and I
seek feedback regarding that.
Rebased.
Change-Id: I8f632cb2c111e76284140a2bd480945d6d42b77a
The ambient qp and active worse/best qp were reset for every frame
when temporal layers is on. This change removes this reset.
As this affects the target size for forced key frames
(it will actually lower the size somewhat), we increased the
inital boost factor to compensate.
Change-Id: Ie38d95f5c99ab3d447469c49e2177bc3fcc4ad28
SAD returns unsigned values. Make all the declarations the same.
Remove bestsad initialization and check. It is always set to the
result of a SAD call so it will never remain UINT_MAX
Use ja instead of jg to test unsigned comparison instead of signed.
Update test.
Change-Id: I46336ab45f4e60fc37caf20bd36bc5782079c7a5
The following five experiments are merged:
newentropy
newupdate
adaptive_entropy (also includes a couple of parameter changes
that improves results a little
in common/entropymode.c and encoder/modecosts.c
that were not merged from the internal branch)
newintramodes
expanded_coef_context
Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
Precalculated block ptrs do not need updates during encoding.
Set these at init stage.
Moved the allocation of 'mt_current_mb_col' (last encoded MB on each
row) to vp8_alloc_compressor_data(), so that it is correctly
reallocated when frame size is changing.
Change-Id: Idcdaa2d0cf3a7f782b7d888626b7cf22a4ffb5c1
Added drop_frame support in multi-resolution encoder.
If one frame is dropped at a lower-resolution level, the next
upper-resolution level encoder needs to encode that frame
independently without any lower-resolution level motion
information.
Another issue is that if one frame is dropped at some but not all
resolution levels, a frame after that one may use different set
of reference frames at different resolution levels. This reference
frame asynchronization could degrade motion search precision in
upper-resolution level encoding, which uses lower-resolution level
motion result. This change compares the lower-resolution and upper-
resolution level's reference frames. If they are not the same, the
upper-resolution level encoder can not use lower-resolution level
motion result.
Change-Id: I61afa4f313630e75b7cbdd5742e230e8724a988a
Replaced local definitions of the extension required
by the filters with the globally defined value.
Change-Id: If9e590a1f2e5b0bdc2d3e3c3f04aacbd3b09bfee
Adds ADST/DCT hybrid transform coding for Intra4x4 mode.
The ADST is applied to directions in which the boundary
pixels are used for prediction, while DCT applied to
directions without corresponding boundary prediction.
Adds enum TX_TYPE in b_mode_infor to indicate the transform
type used.
Make coding style consistent with google style.
Fixed the commented issues.
Experimental results in terms of bit-rate reduction:
derf: 0.731%
yt: 0.982%
std-hd: 0.459%
hd: 0.725%
Will be looking at 8x8 transforms next.
Change-Id: I46dbd7b80dbb3e8856e9c34fbc58cb3764a12fcf
the integer version has very good precision, the float version is no
longer useful. this commit also removes the experiment option from
configure script.
Change-Id: Ibb92e63c9f5083357cdf89c559d584a7deb3353f
this commit removes a number of experiment options from configure
script. the associated experiments are already fully merged, the
options in configure script have no effect at all.
Change-Id: I8054ccaee0a04610162ed76ac9e59c4538217113
vp8_encode_inter_macroblock() is called in both pick_mb_modes() as
well as encode_sb(), thus the number of macroblocks in the counter
were twice as big as actual numbers. This doesn't affect output.
Change-Id: I6de8a996ee44d2f7f2080d8d2177dd7bc6207c93
This allows CONFIG_SUPERBLOCKS experiment to almost compile succesfully,
except for the missing pick_sb_modes() function.
Change-Id: Ib2322f2aacdc371e8066f2eb4a8d761c40490b4d
Added validity checking in multi-res encoder. Disable spatial
resampling and frame dropping before we have those supports.
Also, deallocate the memory for all resolution levels once error
occurs.
Change-Id: Ia5d65a645381cad1a49940ab3a19bb5696c39c09
This is to avoid a rare encoder/decoder mismatch for MB using SPLITMV
mode. In decoder, the UV mv can be determined to need clamp, but the
flag is never set in encoder motion vector selection process, and the
clamp is not done in encoding in encoder.
Change-Id: I60520d3f790354c7855dadf03f0978ea9b77e2c0
This patch fixed issue 458 by calling copy function when both
offsets are 0, which guarantees the SSSE3 functions output
same result as the c function for all possible offsets.
Change-Id: I209aec7a4c6b3362db2646a8887c1038493b6496
xd->subpixel_predict16x16 is called in first pass, but isn't
initialized in first pass, which causes segfault. This patch
fixed that problem.
Change-Id: Ibd2cad4e2d32ea589fc3e0876d60d3079ae836e7
This commit adds lossless compression capability to the experimental
branch. The lossless experiment can be enabled using --enable-lossless
in configure. When the experiment is enabled, the encoder will use
lossless compression mode by command line option --lossless, and the
decoder automatically recognizes a losslessly encoded clip and decodes
accordingly.
To achieve the lossless coding, this commit has changed the following:
1. To encode at lossless mode, encoder forces the use of unit
quantizer, i.e, Q 0, where effective quantization is 1. Encoder also
disables the usage of 8x8 transform and allows only 4x4 transform;
2. At Q 0, the first order 4x4 DCT/IDCT have been switched over
to a pair of forward and inverse Walsh-Hadamard Transform
(http://goo.gl/EIsfy), with proper scaling applied to match the range
of the original 4x4 DCT/IDCT pair;
3. At Q 0, the second order remains to use the previous
walsh-hadamard transform pair. However, to maintain the reversibility
in second order transform at Q 0, scaling down is applied to first
order DC coefficients prior to forward transform, and scaling up is
applied to the second order output prior to quantization. Symmetric
upscaling and downscaling are added around inverse second order
transform;
4. At lossless mode, encoder also disables a number of minor
features to ensure no loss is introduced, these features includes:
a. Trellis quantization optimization
b. Loop filtering
c. Aggressive zero-binning, rounding and zero-bin boosting
d. Mode based zero-bin boosting
Lossless coding test was performed on all clips within the derf set,
to verify that the commit has achieved lossless compression for all
clips. The average compression ratio is around 2.57 to 1.
(http://goo.gl/dEShs)
Change-Id: Ia3aba7dd09df40dd590f93b9aba134defbc64e34
Added the ability to optionally filter the prediction data
when inter modes are selected (excludes SPLITMV, for now).
The mode selection loop considers both the filtered and
non-filtered prediction data when choosing mode. The filter
can be turned on/off at the frame-level, or signaled for
each MB.
Change-Id: I1b783c71d95a361ab36c761b07e8a6b06bc36822
Incorporates mv_ref, mbsplit and second_mv into the adaptive
entropy framework. The mv_ref framework has been modified from
before.
Adds some clean-ups and fixes.
Results with the adaptive entropy experiment are currently up by
+1.93% on derf; +2.33% std-hd and +1.87% yt-hd.
Fixed a nasty intermittent bug.
Change-Id: I4b1ac9f9483b48432597595195bfec05f31d1e39
Changes relating to Issue 411
Removed code that was clearing down the segmentation data each
frame.
Added range/parameter checking in vp8_set_roimap(); Return error
if called when cyclic_refresh is enabled.
Correct setup_features() so that it sets or clears the segment update
flags as appropriate.
Change-Id: Ib31ac53006640ddf1ba7b9ec8f8b952e3eff860a
The function vp8_post_proc_down_and_across_c takes the
stride of both the src and dst images as parameters, but
assumes that they are the same.
I modified the code to use the correct strides, as the
assembler versions of these functions do.
Change-Id: I222715b774cd071b21c15a4b0d2f4aef64a520de
Avoid a pthreads dependency via pthread_once() when compiled with
--disable-multithread.
In addition, this synchronization is disabled for Win32 as well, even
though we can be sure that the required primatives exist, so that the
requirements on the application when built with --disable-multithread
are consistent across platforms.
Users using libvpx built with --disable-multithread in a multithreaded
context should provide their own synchronization. Updated the
documentation to vpx_codec_enc_init_ver() and vpx_codec_dec_init_ver()
to note this requirement. Moved the RTCD initialization call to match
this description, as previously it didn't happen until the first
frame.
Change-Id: Id576f6bce2758362188278d3085051c218a56d4a
This patch incorporates adaptive entropy coding of coefficient tokens,
and mode/mv information based on distributions encountered in a frame.
Specifically, there is an initial forward update to the probabilities
in the bitstream as before for coding the symbols in the frame, however
at the end of decoding each frame, the forward update to the
probabilities is reverted and instead the probabilities are updated
towards the actual distributions encountered within the frame.
The amount of update is weighted by the number of hits within each
context.
Results on derf/hd/std-hd are all up by 1.6%.
On derf, the most of the gains come from coefficients, however for the
hd and std-hd sets, the most of the gains come from the mode/mv
information updates.
Change-Id: I708c0e11fdacafee04940fe7ae159ba6844005fd
This commit is to remove two arrays, which contain the probabilities
of how likely each probability in coef_probs table is updated. The
commit changed to use a fixed number "252".
Surprisedly, the overall impact on quality is close to zero, which
basically says the two big static arrays are not helpful at all.
derf: -0.016%, -0.020%
std-hd: 0.000%, -0.013%
yt: -0.022%, +0.007%
yt-hd: -0.038%, +0.034%
Change-Id: Ifee94d28a37dcab4f1d2b994bd5b07575be42b72
This commit added the ability to accumulate the coef stats across
different encodings using an intermediate binary stats files. The
accumulation happens only the binary stats file exists in current
directory. The encoder needs to be built with "ENTROPY_STATS" to
allow the output. The commit also fixed a few formating issues in
output stats file.
Change-Id: Ib1a41180aa554845cf51e4421a230b128a3a82b4
Changes to calculation of sr_coded_error to include 0,0 case.
Experimental use of sr_coded_error in calculating correction factor
for estimating the allowable Q range.
Reinstated some code needed for calculating section_intra_rating.
Add flash detection in calculation of KF boost
Increased tolerance in testing candidate key frames (needed with
longer motion search as this tends to slightly increase inter %.
Zbin changes for 8x8.
Other minor adjustments, refactoring and bug fixes.
Reinstated some motion break out clauses in boost loop
as their removal hurt a few 50fps clips badly in the std set.
It may be possible to remove them again later if a better way
can be found of preventing overly long gf intervals.
Change-Id: Iee686d0c31072828bb1ccd2bc63f5f1c7c548ea2
Allows building the library with the gcc -pedantic option, for improved
portabilty. In particular, this commit removes usage of C99/C++ style
single-line comments and dynamic struct initializers. This is a
continuation of the work done in commit 97b766a46, which removed most
of these warnings for decode only builds.
Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
Frame dropping decision is made by evaluating both current frame
and next frame's buffer_level. If both buffer_levels are less
than drop_mark, next frame is dropped. When frame dropping is
over, namely, buffer_level becomes normal again, we need to
reset decimation_count to 0.
Change-Id: Iae182612e61e0da367fbd43afdc90738d975d1a3
The logic for spatial resizing is done after the Q is selected for the
frame. This causes a problem that the Q we select for the (resized)
key frame may be based on a different resolution than the frame we
will encode.
This fix is to ensure that, when resize is on, the selected Q is still
based on the resolution of the frame to be encoded.
Change-Id: Ia49a9eac5f64e48d1c00dfc7ed4ce26fe84d3fa1
Variables m & mi were being dereferenced when they might
hold invalid values.
The fix is simply to move these dereferences to after the
point at which mb_row and mb_col are tested for validity.
Change-Id: Ib16561efa9792dc469759936189ea379d374ad20
Compares the sum of differences between the input block and the averaged
block. If they differ too much the block will not be filtered. Negligible
perfomance hit.
Change-Id: Ib1c31a265efd4d100b3abc4a1ea6675038c8ddde
Add PRIVATE macro for adding private_extern directive for yasm
to hide global symbols. This is only enabled if -DCHROMIUM is used
with YASM.
Also fixed a small problem with rtcd_defs.sh to guard TEMPORAL_DENOISING.
Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
This extends the denoiser to work for temporally scalable
coding.
I believe this also fixes a very rare but really bad bug in the original
implementation.
Change-Id: I8b3593a8c54b86eb76f785af1970935f7d56262a
This fix addresses some problems with very complex clips like
handling of flashes on clips like crew (which was made worse
by an earlier patch (derf and std-hd)).
Most clips a small effect but some between 1 & 2%
Derf +0.039, +0.211%
YT +0.042, +0.083%
Change-Id: I65fc7c13afc31482040068544dd65b8808f5cb4a
Compares the sum of differences between the input block and the averaged
block. If they differ too much the block will not be filtered. Negligible
perfomance hit.
Change-Id: Ib1c31a265efd4d100b3abc4a1ea6675038c8ddde
Removed the local scaling factor est_max_qcorrection_factor
and related code to simplify estimateq calculation (little effect
anyway)
Cap range of total correction factor.
Slight change to break out case to turn off arf.
Change-Id: I748187737ba93cfadf016f3dfdf8d2741934067f
the commit fixed a number of compiling issues when some epxeriments
are turned on at the same time.
Change-Id: Idb15b215e2d2a7d25f2707f99ef55a34e7301ce7
This extends the denoiser to work for temporally scalable
coding.
I believe this also fixes a very rare but really bad bug in the original
implementation.
Change-Id: I8b3593a8c54b86eb76f785af1970935f7d56262a
Add PRIVATE macro for adding private_extern directive for yasm
to hide global symbols. This is only enabled if -DCHROMIUM is used
with YASM.
Also fixed a small problem with rtcd_defs.sh to guard TEMPORAL_DENOISING.
Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
After a key frame encoding, the frame type could change while
filtering is still going on. Pass the frame type as parameter to the
loopfilter function and don't read it from common storage.
vp8cx_set_alt_lf_level has to be done before packing the stream.
Currently alt_lf_level is not used so there hasn't been any visible
problem here.
Change-Id: Ia114162158cd833c2b16e3b89303cc9c91f19165
The two-pass code does not support the case where the application
changes the frame size dynamically. Add this case to the validation
checks in the vpx_codec_enc_config_set() path.
Change-Id: Idadc42c7c3bd566ecdbce30d8dd720add097f992
* changes:
Add initial keyframe tests
Move all tests to test/ directory
Enable unit tests by default
Build unit tests monolithically
configure: initial support for CXX, CXXFLAGS variables
Resolution changes in calls to vpx_codec_enc_config_set() would cause
a memory leak due to failing to release the lookahead and alt ref
buffers.
Change-Id: I48392ea25e71fe2760d60cfde3fb3874598cc85f
Reverted part of change in memory alllocation code, which ensures
that the function returns 0 and encoder works correctly when
CONFIG_MULTI_RES_ENCODING isn't turned on.
Change-Id: Id5d5e7f2c8bd9e961a6dca79d257e8185f0d592a
The commit changed how baseline 8x8 coefficient probabilities are
initialized, to be consistent with the initialization of baseline
4x4 coefficient probabilities.
The commit does not have any effect on compression.
Change-Id: Ifb3902b5dc0b0c2e6dc3aa5d4a6589d528e58355
After a key frame encoding, the frame type could change while
filtering is still going on. Pass the frame type as parameter to the
loopfilter function and don't read it from common storage.
vp8cx_set_alt_lf_level has to be done before packing the stream.
Currently alt_lf_level is not used so there hasn't been any visible
problem here.
Change-Id: Ia114162158cd833c2b16e3b89303cc9c91f19165
Rework unit tests to have a single executable rather than many, which
should avoid pollution of the visual studio project namespace, improve
build times, and make it easier to use the gtest test sharding system
when we get these going on the continuous build cluster.
Change-Id: If4c3e5d4b3515522869de6c89455c2a64697cca6
Remove dependency on amount and speed of motion as this
may not behave well across different image sizes.
Tweak impact of % inter.
Add in experimental adjustment based on relative quality of an
older second reference frame.
Cap range of decay values allowed.
Some small + effect on derf but -ve on yt & hd at this stage.
Change-Id: I390d6f6ebe67a2eb0b834980d0d4650124980d3e
In multi-resolution encoding, frame_type decision for each frame
is made by the lowest-resolution encoder. For all other higher-
resolution encoders, kf_mode is always set to VPX_KF_DISABLED,
and they are forced to use the same frame_type picked by the
lowest-resolution encoder.
Change-Id: Ic4d52ec65bbc012ca9c2d236210e28a295591eaf
I now see I didn't write a very long description, so let's do it
here then. We took a pretty big quality hit (0.1-0.2%) from my
recent fix of the inversion of arguments to vp8_cost_bit() in the
RD reference frame costing. I looked into it and basically the
costing prevented us from switching reference frames. This is of
course silly, since each frame codes its own prob_intra_coded, so
using last frame cost indications as a limiting factor can never
be right.
Here, I've rewritten that code to estimate costings based partially
on statistics from progress on current frame encoding. Overall,
this gives us a ~0.2%-0.3% improvement over what we had previously
before my argument-inversion-fix, and thus about ~0.4% over current
git (on derf-set), and a little more (0.5-1.0%) on HD/STD-HD/YT.
Change-Id: I79ebd4ccec4d6edbf0e152d9590d103ba2747775
base the static image test off a measure of 0,0 motion
instead of the decay accumulator value.
Change "transition to still detection" to compare the
decay rate from successive frames.
Minor tweak to the arf extra boost given based on the
number of frames affected.
Removed unused variable mod_err_per_mb_accumulator.
Change-Id: Idd8360083ad409e45f133ce97dd2488259003e64
The commit added an integer version of 8x8 forward DCT, based on the
orginal forward DCT from VP6. The constants, roundings, and shifts
were adjusted to improve the accuracy. The latest patch has a very
similar accuracy in term of round trip error against the floating
point version.
It should be noted here that the purpose of the patch is to help
encoding speed and facilitate all other experiments. There will be
futher review in combination with inverse DCT before finalization.
configure with "--enable--int_8x8fdct" to use the integer version
Change-Id: I5a4f80507429f0e07cf02a13768ec81cbfddc5bc
Some marginal impact due to the fact that it makes use of
arf more likely / stable even in hard sections.
Change-Id: Ic72fda0f63eefc9433914b5d9cd374d515810129
Removed unused function.
Added tentative code to take error score of an older frame
into account when calculating Q range. However, for now
it is disabled pending merging other changes and testing.
Change-Id: Ie89955e70319dac31b79e3b833e3352712a061ec
Remove testing of whether we estimate that it will be possible
to code an arf at a lower Q than the ambient Q. This adds quite
a bit of extra code and complexity for marginal gain.
Factored out some code relating to ARNR selection to a separate
function as this is likely to be changed / simplified soon.
Change-Id: Ia1cf060405637ef5bbf7018355437be21d12375f
Removed odd *100 >> 4 factor from boost calculations. Not all the
calculations exactly match what was there before so there may be
some minor impact on results.
Some other minor tidying up in regard to coding conventions.
The specific values of factors and thresholds will likely change as
part of subsequent patches.
Change-Id: Id976321484ac02ba50294cf54fafbc17dda85686
These frames can force reference frame (arf), mode (zeromv) and skip,
which means that if we use compound prediction (i.e. arf+last), we
might use a blend of a perfect (arf) and an imperfect (last) predictor,
leading to semi-garbage display and thus a huge drop in SSIM/PSNR (up
to 10dB for some frames I analyzed).
Gives a +0.2% gain on YT.
Change-Id: If1f2b7899ad165684af3808fd379295e82558cbb
This is the first patch in a series of changes to the first
pass code. (Broken down for ease of testing/merging/review).
This patch introduces a new stats element "sr_coded_error".
This is the coded error recorded vs the second reference
frame (which is updated such that it lags by at least one frame).
No use is made of the new structure in this change so this patch
should have no material effect.
Removed some ifdefs and deprecated code (#if NEW_BOOST).
Removed twopass.gf_decay_rate (not used any more)
Change-Id: I1be672a73017f7c13fd50fb4f99236aa2ed30916
This commit changed the forward and the inverse 4x4 Walsh Hadamard
transform to a new pair, where the inverse transform can pefectly
reconstuct the input to forward transform. It also does so without
changing the input and output value range. Even more, it does not
change the complexity of the transforms.
While it was not expected to improve the results of our current test,
it does improve std-hd set by 0.2% on all metrics. No change on derf.
Change-Id: Ie4f23ddd3a0f3c5fbe97fb58399f860031f99337
Accept the same range of inputs for the VP8E_SET_NOISE_SENSITIVITY
control, regardless of whether temporal denoising is enabled or not.
This is important for maintaining compatibility with existing
applications.
Change-Id: I94cd4bb09bf7c803516701a394cf1a63bfec0097
1. block types
There are only three types of blocks for 8x8 transformed MBs, i.e. Y
block with DC does not exist for 8x8 transformed MBs as all MB using
8x8 transform have 2nd order haar transform. This commit introduced
a new macro BLOCK_TYPES_8X8 to reflect such fact.
2. context counters
This commit also fixed the mixed of context_counters between 4x4 and
8x8 transformed MBs. The mixed use of the counters leads me to think
the existing the context probabilities were not properly generated
from 8x8 transformed MBs.
3. redundant collecting in recoding
The commit also corrected the code that accumulates entropy stats by
making sure stats only collected for final packing, not during the
recode loop
Change-Id: I029f09f8f60bd0c3240cc392ff5c6d05435e322c
Adds a unit test to the boolcoder that tests encoding
and decoding thousands of different bits, with different
probabilities in different patterns.
Code borrowed from the webp project - and its committers.
Change-Id: Icabbb884d57e666496490c961dd29b246144ab3e
Make functions only referenced from one translation unit static. Other
symbols with extern linkage get a vp8/vpx prefix.
Change-Id: I928c7e0d0d36e89ac78cb54ff8bb28748727834f
Change If4321cc5 fixed a bug caused by forward declarations not being
kept in sync across C files, resulting in a function call with the
wrong arguments. The commit moves the affected function declarations
into a header file, along with the other symbols from encodeframe.c
that were being sloppily shared.
Change-Id: I76a7b4c66d4fe175f9cbef7e52148655e4bb9ba1
Removes all runtime initialization of global data. This commit is a
squashed version of the following series cherry-picked from master.
This is necessary because of a change that was merged to the tester
that depends on the scaler being moved to the RTCD framework, which
is a worthwhile thing to include in Eider anyway.
- a91b42f02 Makes all global data in entropy.c const
- b35a0db0e Makes all global data in tokenize.c const
- 441cac8ea Makes all mode token tables const
- 5948a0210 Ports vpx_xcaler to new RTCD method
- 317d4244c Makes all mode token tables const part 2
Change-Id: Ifeaea24df2b731e7c509fa6c6ef6891a374afc26
Move the notion of 0 bitrate implying skip deeper into the codec,
rather than doing it at the multi-encoder API level. This preserves
v1.0.0 ABI compatibility, rather than forcing a bump to v2.0.0 over a
minor change. Also, this allows the case where the application can
selectively enable and disable the larger resolution(s) without having
to reinitialize the codec instace (for instance, if no target is
receiving the full resolution stream).
It's not clear how deep to push this check. It may be valuable to
allow the framerate adaptation code to run, for example. Currently put
the check as early as possible for simplicity, should reevaluate this
as this feature gains real use.
Change-Id: I371709b8c6b52185a1c71a166a131ecc244582f0
Besides imposing a performance penalty at startup in most
configurations, these relocations break the dynamic linker for
native Fennec, since it does not support them at all.
Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de
These are warnings in most builds, but show up as compile errors on
some platforms when these headers are included from C++ code.
Change-Id: I6c523b4dbbc699075fe73830442b51922e5a61d5
These are warnings in most builds, but show up as compile errors on
some platforms when these headers are included from C++ code.
Change-Id: I6c523b4dbbc699075fe73830442b51922e5a61d5
This commit adjusted slightly the 4x4 coefficents band definition to
better classify coefficients with similar distributions and usages.
It helps derf set about .1%, it is alos slightly positive for std-hd
set, where 4x4 blocks are used less frequently.
The commit also removed a const array not in use.
Change-Id: I78d16905d4036641ec905b0c32c190c1def5b249
The ARNR filter uses a motion compensated temporal filter,
but the motion estimation implementation accounts for the
cost of the mv in its decision making process. The ARNR
filter uses a dummy cost table initialized to 0 as a way
to ignore the mv costs (which are irrelevant to the filter).
This CL modifies the ARNR filter implementation to so that
the mv costing is ignored without the requirement for
dummy tables.
Change-Id: I0dd9620c3b70682f938b2a70912c11d4d7c9284c
The ARNR filter uses MC to find the best match between the
ARF and other nearby frames in the filter-set. Since the
ARF is a member of the filter-set, MC in that case is
unnecesssary. This patch modifies the filter so it does
not apply MC in this case.
Change-Id: Ic0321199c08db2189a57f28d1700b745bc7ff66d
The ARNR filter uses a motion compensated temporal filter,
but the motion estimation implementation accounts for the
cost of the mv in its decision making process. The ARNR
filter uses a dummy cost table initialized to 0 as a way
to ignore the mv costs (which are irrelevant to the filter).
This CL modifies the ARNR filter implementation so that
the mv costing is ignored without the requirement for
dummy tables.
Change-Id: I4196aa5c24da63f858ff54fbaa5fc85ae1f1957f
The backup MODE_INFO buffer used in the error concealment was
allocated in the codec common parts allocation even though this is a
decoder only resource. Moved the allocation to the decoder side.
No need to update_mode_info_border as mode_info buffers are zero
allocated.
This fixes also a potential memory leak as the EC overlaps buffer was not
properly released before reallocation after a frame size change.
Change-Id: I12803d3e012308d95669069980b1c95973fb775f
Adds a speed feature to conduct a brute force search among a set of
available interpolation filters for the best one in an RD sense.
There is a gain of 0.4% on derf, 1.0% on Std-HD.
Patch 2: A macro added to determine if the encoder state is reset
for each new filter tried.
Patch 3: rebase, also fixes a bug (decodframe.c) introduced by a
couple of missing function pointer assignements.
Patch 4: rebase.
Change-Id: Ic9ccca9d8c35c6af557449ae867391a2f996cc29
This commit merge the QI mode experiment. As the experiment affects
the encoding of intra coding modes on key frame only, the overall
effect of the experiment on encoding tests is insignificant.
Change-Id: I9e4e3933adface88867ad429cee3986e529c511d
The commit merges the UVINTRA experiment and removed the related
macros. The overall effect of the experiment is a small gain (.1%
on derf)
Change-Id: Ia34b3312fb9b5b34c9ba111bf0fa78c6f78ac80b
Race was introduced by https://gerrit.chromium.org/gerrit/15563.
If loopfilter related config params were changed between frames, or
after a KEY frame, there could be a mismatch between encoder's and
decoder's recontructed image. In worst case, when frame buffers are
reallocated because of a size change, the loopfilter could
do an invalid data access (segmentation fault).
Fixes:
Sync with the loopfilter before applying any encoder changes in
vp8_change_config().
Moved the loopfilter synching to the top of
encode_frame_to_data_rate() so that it's done before any alteration of
the encoder.
Change-Id: Ide5245d2a2aeed78752de750c0110bc4b46f5b7b
Increment the last_row_mb_col counter by nsync after last MB of row is
ready. This way we dont need to check for last MB of row when
synching.
Set last MB of row ready just after row extension is done, This avoids
o potential race condition whit the processing of last MB of next row.
Change-Id: I19c44fd6041116ee5483be2143b4f4bfcd149eac
Adds differential encoding of prob updates using a subexponential
code centered around the previous probability value.
Also searches for the most cost-effective update, and breaks
up the coefficient updates into smaller groups.
Small gain on Derf: 0.2%
Change-Id: Ie0071e3dc113e3d0d7ab95b6442bb07a89970030
RD costs were local to MACROBLOCK data and had to be copied all the
time to each thread's MACROBLOCK data. Tables moved to a common place
and only pointers are setup for each encoding thread.
vp8_cost_tokens() generates 'int' costs so changed all types to be
int (i.e. removed unsigned).
NOTE: Could do some more cleaning in vp8cx_init_mbrthread_data().
Change-Id: Ifa4de4c6286dffaca7ed3082041fe5af1345ddc0
The block pointers and offset do not need to be calculated for every
frame. Block internal predictors can be update once when decoder is
allocated. Destination and previous buffer offsets have to be updated
just when frame size is changing.
Change-Id: I92ca8df0e6aaac4cc35ab890751d446760bf82e2
Key frame macrobock and block mode probabilities are constant.
Remove the allocation of tables for each codec instance and use
instead the default const prob tables.
Change-Id: I8361798ac491f9b3889e86925a494c58647c753f
Move the notion of 0 bitrate implying skip deeper into the codec,
rather than doing it at the multi-encoder API level. This preserves
v1.0.0 ABI compatibility, rather than forcing a bump to v2.0.0 over a
minor change. Also, this allows the case where the application can
selectively enable and disable the larger resolution(s) without having
to reinitialize the codec instace (for instance, if no target is
receiving the full resolution stream).
It's not clear how deep to push this check. It may be valuable to
allow the framerate adaptation code to run, for example. Currently put
the check as early as possible for simplicity, should reevaluate this
as this feature gains real use.
Change-Id: I371709b8c6b52185a1c71a166a131ecc244582f0
Look for changes in the codec's configured w/h instead of its active
w/h when forcing keyframes. Otherwise calls to vp8_change_config()
will force a keyframe when spatial resampling is active.
Change-Id: Ie0d20e70507004e714ad40b640aa5b434251eb32
This commit changed to enable the usage 8x8 transform for all frame
type, all resolution and all quantizer range. This has an overall
benefit .2% to .3% in term of compression, but more importantly,
the difficult clips benefits much more, up to 2% to 3% on clips
like football, harbour and so on.
We observed some weird humps on very high end on a couple of youtube
clips, but have determined the underly cause was the aggressive zbin
having an effect of lowering rate with lower quality, which have
an impact on slide show clips around 60DB.
The commit does not change the association between prediction mode
and transform size.
Change-Id: I33043bdce6207528ae00b4a4b26d8ff63cfea1f4
This is to prevent the evaluation of a mode from using values left
over from a mode evaluated prior in the loop.
Change-Id: Ife2c6ceb76d2f7365fd262515d3ae48229033c2d
(see Change I9b2ccc88: Makes all mode token tables const)
Further remove runtime table initialization and use
precalculated const data. Data footprint reduced
by 4112 bytes.
Change-Id: Ia3ae9fc19f77316b045cabff01f6e5f0876a86ab
Ensure that RTCD function pointers are set at most once, to silence
some data race warnings. Implementation provided for POSIX threads and
Win32, with the prior unsynchronized behavior left in place for other
platforms.
Change-Id: I65c5856df43ef67043b3d5f26ddafddd8fcb2f7e
We can get rid of all remaining global initializers now:
vp8_scale_machine_specific_config()
vp8_initialize()
vp8dx_initialize()
Change-Id: I2825cea5d1c01ad9f6c45df49a0f86d803bfeb69
Mode token tabels precalculated in entropymode.c.
Removes vp8_initialize_common()as all common global data
is precalculated const now.
Change-Id: I9b2ccc883e4f618069e1bc180dad3a823394eb73
With the NEWENTROPY experiment enabled encoding certain clips
produced invlid bitstreams, or files that had a high degree
of artefacts.
This was the results of pointers in MACROBLOCKD not being
setup correctly (mode_info_context and prev_mode_info_context).
Change-Id: Ice13e1efa8bd122997d2f8f3f1e761c6c16e0403
These contexts need to be saved and restored for recode, otherwise
encoder/decoder mismatch happens for some clips (eg._mobcal 720p)
Change-Id: Ic65cfa0bf56ed0472ecab962ce31394d59d344bf
Removes all runtime initialization of global data in tokenize.c.
DCT token and cost tabels are pre-generated.
Second patch in a series to make sure code is reentrant.
Change-Id: Iab48b5fe290129823947b669413101f22a1bcac0
Removes all runtime initialization of global data in entropy.c.
Precalculated values are used for initializing all entropy related
tabels.
First patch in a series to make sure code is reentrant.
Change-Id: I9aac91a2a26f96d73c6470d772a343df63bfe633
When producing an invisible ARF, the time stamp counters aren't
updated since the last time stamp is seen by the codec twice. The
prior code was trapping this case with refresh_alt_ref, but this isn't
correct for other uses of the ARF. Instead, use the show_frame flag.
Change-Id: If67fff7c6c66a3606698e34e2fb5731f56b4a223
Added code to save the coding context in vp8_rd_pick_inter_mode
when the coding mode is forced to ARF(0,0).
Also, modified the MV bounds computation to comply with the
change in MV border from 32 to 64 pixels.
Change-Id: I96963a6f5f4d04ce84c807ae11e0635177c3ad6c
Local variable offsets are now consistent for the functions,
removed unused parameters, reworked the assembly to eliminate
stalls/instructions.
Change-Id: Iaa37668f8a9bb8754df435f6a51c3a08d547f879
Turning off the interpolation filter selection based on edge
proportion. This heuristics has not been working as well as
expected and I have started a more rigorous investigation into
this. We can turn this off for now since it is unnecessarily
slowing things down.
Rebase.
Change-Id: Ic5958b2b3a35ec2d8eb73b6d81617ca8fbe07e74
This commit tries to address an issue related to the oddity shown on
HD _mobcal clip, where some rather ugly blocks shown in the second
frame at low-mid bit rates if the third frame is not made a key frame
by he encoder. The fixes include: 1) made calls to sad_16x16 to be
consistent with function prototype. 2) remove the error bias to intra
and golden in mbgraph search. 3) changed the error accumulation on
inter_segment encoding to avoid potential out-of-range. 1) has no
effect on encoding results.
Encoding test show that the overall effect of the commit helps about
.2%(HD) to .3%(cif)
Change-Id: I930975a2d0c06252f01c39e0a02351529774e30b
The commit removed a limit on key frame detection, which caused a big
drop in all metric measurements for standard HD clip such as _mobcal.
This single change helps two standard HD clips by a huge amount, which
help the overall std-hd set by 2.4% (glb psnr), 0.9% (avg_psnr), 2.1%
(vpxssim).
In the result page:
http://pafr9.prod.google.com:26163/?/cns/rc-d/home/on2-prod/sunkaras/borg-test/yaowu
2012_04_02_1649_yaowu_bugfix_std-hd
2012_04_03_1452_yaowu_hump_std-hd
represent the encoding test results and std-hd set prior and after this
commit respectively.
Change-Id: Ie4313e317c737ea0e699c3a7919c1376744baa1a
This commit has made macro_block_yrd_8x8 and macro_block_yrd_8x8 to
take same parameters. It also removed a few unnecessary shifts that
has the potential to create out-of-range distortion values.
Change-Id: I4ec5afb307c3685c2a67a07c2850f0927d214455
Some code re-factored / moved to allow the main
pack operation inside the recode loop so that the
size estimate is accurate.
Deletion of some redundant code relating to one pass.
Aproximate improvement over March 27 code base:
Derf 0.0%, YT 0.5%, YThd 0.3% Std_hd 0.25%
Change-Id: Id2d071794ab44f0b52935f6fcdb5733d09a6bb86
Some adjustments to zbin for t8x8.
Changes to rules for sizing forced key frames.
Some extra stats output in tmp.stt.
Approximate gain on YT-hd set 0.5%
There are still issues in sizing key frames and gf/arf frames
when the image is largely static. These in part relate to
problems with cost estimates in the recode loop.
Change-Id: I6f0159dc8a8faeab4115a19c668d442491619a68
This is the first patch to add superblock (32x32) coding
order capabilities. It does not yet do any mode selection
at the SB level, that will follow in a further patch.
This patch encodes rows of SBs rather than
MBs, each SB contains 2x2 MBs.
Two intra prediction modes have been disabled since they
require reconstructed data for the above-right MB which
may not have been encoded yet (e.g. for the bottom right
MB in each SB).
Results on the one test clip I have tried (720p GIPS clip)
suggest that it is somewhere around 0.2dB worse than the
baseline version, so there may be bugs.
It has been tested with no experiments enabled and with
the following 3 experiments enabled:
--enable-enhanced_interp
--enable-high_precision_mv
--enable-sixteenth_subpel_uv
in each case the decode buffer matches the recon buffer
(using "cmp" to compare the dumped/decoded frames).
Note: Testing these experiments individually created
errors.
Some problems were found with other experiments but it
is unclear what state these experiments are in:
--enable-comp_intra_pred
--enable-newentropy
--enable-uvintra
This code has not been extensively tested yet, so there
is every likelihood that further bugs remain. I also
intend to do some code cleanup & refactoring in tandem
with the next patch that adds the 32x32 modes.
Change-Id: I1eba7f740a70b3510df58db53464535ef881b4d9
This patch includes:
1. fixes to disable block based termporal mixing when motion
is detected (because this version of mfqe only handles zero motion).
2. The criterion used for determining whether to mix or
not are changed to use squared differences rather than
absolute differences.
3. Additional checks on color mismatch and excessive block
flatness added. If the block as decoded has very low activity
it is unlikely to yield benefits for mixing.
Change-Id: I07331e5ab5ba64844c56e84b1a4b7de823eac6cb
In cases where you have a flat background occluded by a moving object
of similar luminosity in the foreground, it was likely that the
foreground blocks would persist for a few frames after the background
is uncovered. This is particularly noticable when the object has a
different color than the background, so add the chroma planes in as an
additional check.
In addition, for block sizes of 8 and 16, the luma threshold is
applied on four subblocks independently, which helps when only part of
the background in the block has been uncovered.
This fixes issue #392, which includes a test clip to reproduce the
issue.
BUG=392
Change-Id: I2bd7b2b0e25e912dcac342e5ad6e8914f5afd302
Reduced the size of the struct by 8 bytes, which would be
a memory savings of 64800 bytes for 1080 resolutions. Had
an extra byte, so created an is_4x4 for B_PRED or SPLITMV
modes. This simplified the mode checks in
vp8_reset_mb_tokens_context and vp8_decode_mb_tokens.
Change-Id: Ibec27784139abdc34d4d01f73c09f43e9e10e0f5
Found this bug while tracking down some anomalies in my experiments.
Since vp8_cost_one and vp8_cost_zero return unsigned int, the
bit shift by 8 will be incorrect if the value is negative.
I am cautiously optimistic that this fix will make the prob
updates more correct and somewhat improve results across the board.
But the update probabilities will need to be retuned I think.
Patch 2: Adding more of the same fixes using a macro.
Change-Id: I1a168f040e74e8c67e7225103b1c2af9a611da49
This new vp8_decode_mb_tokens() uses a modified version of
WebP's GetCoeffs function. For now, the dequant does not
occur in GetCoeffs.
Tests showed performance improvements up to 2.5% depending
on material.
Change-Id: Ia24d78627e16ffee5eb4d777ee8379a9270f07c5
Adds logic to disable mfqe for the first frame after a configuration
change such as change in resolution. Also adds some missing
if CONFIG_POSTPROC macro checks.
Change-Id: If29053dad50b676bd29189ab7f9fe250eb5d30b3
When ac_yquant>171, a key frame is enabled to use 8x8 transform. In
such case, MBs with DC_PRED or TM_PRED are selected to use T8x8. This
change helped the full STD-HD set by ~.1% or so, which is reasonable
considering how often key frame occurs in these encodings.
Change-Id: Id17009ef6327252177b19e6bf0d6628827febaf1
Deprecate fast quant and strict_quant code.
Small effect on quality as fast was used in first pass but the
effect is basically neutral across the derf set.
The rationale here is to reduce the number of code paths for
now to make experimentation easier. Optimized and fast code
options can be re-introduced later along with other encode
speed options.
Change-Id: Ia30c5daf3dbc52e72c83b277a1d281e3c934cdad
Various refactoring to make the subpel motion compensation
filters switchable by a frame level field.
Two types of 8-tap filters are supported in addition to the existing
bilinar and sixtap filters. One is the default 8-tap and the
other has a sharper cut-off for use with frames with substantial
edge content.
Patch 2: Added a preliminary strategy for filter selection based on
edginess detecton. Also includes some filter changes.
Change-Id: I866085bda5ae143cfdf2ec88157feaabdf7bd63a
This change added a motion search skipping mechanism similar
to what we did in second pass. For a macroblock that is very
similar to the macroblock at same location on last frame,
we can set its mv to be zero, and skip motion search. This
improves first-pass performance for slide shows and video
conferencing clips with a slight PSNR loss.
Change-Id: Ic73f9ef5604270ddd6d433170091d20361dfe229
The commit added a clamp to the 2nd motion vector used in compound
prediction to insure mv within UMV borders. The clamp is similar to
that of the first motion vector except that No SPLITMV is ever used
for the 2nd motion vector.
Change-Id: I26dd63c304bd66b2e03a083749cc98c641667116
The recoding loop save and restore frame coding context for recodes.
However in recoding of key frames, some of the coding context saved
was stale from last encoded inter frame. The save/restore sometimes
overwrites the re-inintialized coding context with saved context
from last frame, resulting in encoder/decoder mismatch
Change-Id: I354ae2f71074d142602d51d06544c05a2462caaf
This issue likely doesn't appear in the unmodified encoder, but
sufficient hacking on the mode selection loop can expose it.
Change-Id: I8a35831e8f08b549806d0c2c6900d42af883f78f
In a previous commit, the duplicate of headerfile defaultcoefcounts.h
was identified. This commit updates the .mk file to ensure configure
and make works properly for all platforms.
Change-Id: I31a39c809a734ba438ee53db700f252e9a03eddd
Pulled out super block code for the snapshot as this
is not quite ready and will need an extensive re-merge.
Change-Id: I436369b511257447a7b0ea064016cb63f5011849
Break MFQE code into it's own file.
It is currently only valid for 16x16 and 8x8 Y blocks. It also filters
4x4 U/V blocks.
Refactor filtering and add associated assembly. Limited test cases show
--mfqe introduces a penalty of ~20% with HD content. The assembly
reduces the penalty to ~15%
Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf
https://gerrit.chromium.org/gerrit/#change,17319 fixes cost estimating
to take skip_eob into account. No quality difference seen on derf set
tests, but about .4% gain on STD_HD set.
Change-Id: Ic5fe6d35ee021e664a6fcd28037b8432a0e470ca
Coefficient costing failed to take account of the first branch
being skipped ( 0 vs eob) if the previous token is 0.
Fixed rd to account for slightly increased token cost & cleaned up
warning message
Change-Id: I56140635d9f48a28dded5a816964e973a53975ef
This gives a modest gain on derf overall, although at low bitrates the
cost is still too high, so this can be improved further.
Patch 2. Re-base and fix 80 column issues
Change-Id: Ida2f9fa3fe75370669f6a27b37108dc602231c63
The commit changed to compute UV intra RD estimates for 4x4 and 8x8
separately to be used in mode decision for MB modes associated with
the appropriate transform size respectively. Now finally after many
other changes related 8x8 quantizer zbin boost and zbin_mode_boost,
this change overall helps the HD(with 8x8) by around ~.13%.
(avg .13% glb .13% ssim .17%)
The commit also has a few changes for eliminating compiler warnings.
Change-Id: Ibab35dad44820c87e6b44799c66f8d519cc37344
The commit added the correct Zbin_mode_boost initialization based on
Intra Mode before using rate distortion to pick UV intra mode.
Change-Id: I8e57878ff356a06672f6fa2431be860bf9b9a5c7
Produce the token partitions on-the-fly, while processing each MB.
Context is updated at the beginning of each frame based on the
previoud frame's counters. Optimally encoder outputs partitions in
separate buffers. For frame based output, partitions are concatenated
internally.
Limitations:
- enabled just in combination with realtime-only mode
- number of encoding threads has to be equal or less than the
number of token partitions. For this reason, by default the encoder
will do 8 token partitions.
- vpxenc supports partition output (-P) just in combination with
IVF output format (--ivf)
Performance:
- Realtime encoder can be up to 13% faster (ARM) depending on the number
of threads and bitrate settings. Constant gain over the 5-16 speed
range.
- Token buffer reduced from one frame to 8 MBs
Quality:
- quality is affected by the delayed context updates. This again
dependents on input material, speed and bitrate settings. For VC
style input the loss seen is up to 0.2dB. If error-resilient=2
mode is used than the effect of this change is negligible.
Example:
./configure --enable-realtime-only --enable-onthefly-bitpacking
./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
--target-bitrate=1000 --token-parts=3 --static-thresh=2000
--ivf -P -t 4 -o strm.ivf tanya_640x480.yuv
Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76
Eliminated some mb branches along with other code cleanups.
This is part of an ongoing effort to remove cut/paste
code in the decoder.
Change-Id: Ifabb0f67cafa6922b5a0e89a0d03a9b34e9e5752
The commit fixed a problem where 8x8 regular quantizer was using the
4x4 zbinboost lookup table that only has 16 entries at each Q. The
commit assigned a uniform zbin boost value for all cases that there
are more than 16 consective zeros. The change only affects MBs using
8x8 transform. The fix has a slightly positive impact on quality.
Test results:
http://www.corp.google.com/~yaowu/no_crawl/hd_fixzbinb.html
(avg psnr: .26% glb psnr: .21% ssim: .28%)
Results on cif size clip are also positive even though gain is smaller
http://www.corp.google.com/~yaowu/no_crawl/derf_fixzbinb.html
Change-Id: Ibe8f6da181d1fb377fbd0d3b5feb15be0cfa2017
This is the first patch for refactoring of the code related to
high-precision mv, so that 1/4 and 1/8 pel motion vectors can
co-exist in the same bit-stream by use of a frame level flag.
The current patch works fine for only use of 1/4th and
only use of 1/8th pel mv, but there are some issues with the
mode switching in between. Subsequent patches on this change Id
will fix the remaining issues.
Patch 2: Adds fixes to make sure that multiple mv precisions can
co-exist in the bit-stream. Frame level switching has been tested
to work correctly.
Patch 3: Fixes lines exceeding 80 char
Patch 4:
http://www.corp.google.com/~debargha/vp8_results/enhinterp.html
Results on derf after ssse3 bugfix, compared to everything
enabled but the 8-tap, 1/8-subpel and 1/16-subpel uv. Overall the
gains are about 3% now. Hopefully there are no more bugs lingering.
Apparently the sse3 bug affected the quartel subpel results more than
the eighth pel ones (which is understandabale because one bad predictor
due to the bug, matters less if there are a lot more subpel options
available as in the 1/8 subpel case).
The results in the 4th column correspond to the current settings.
The first two columns correspond to two settings of adaptive switching
of the 1/4 or 1/8 subpel mode based on initial Q estimate. These
do not work as good as just using 1/8 all the time yet.
Change-Id: I3ef392ad338329f4d68a85257a49f2b14f3af472
The commit overall on derf test is break even to very slightly positive
comparing to all 4x4 transform.
Change-Id: I2a7c19599aa54c2d3a5b35db0dc891ba8a6a2b26
Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead. vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict. Future commits will fix this
code duplication.
Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5