This patch factors in the different partition coding syntax used for
right and bottom edge blocks when doing RD search.
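As an illustration of the edge-block syntax referred to above, here is a
minimal sketch. It assumes the VP9-style rule that a block whose bottom half
lies outside the frame can only signal HORZ or SPLIT, one whose right half
lies outside can only signal VERT or SPLIT, and one that crosses both edges
is implicitly SPLIT. The enum and function names are hypothetical, not
libvpx identifiers.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the libvpx definitions. */
typedef enum { PART_NONE, PART_HORZ, PART_VERT, PART_SPLIT } ToyPartition;

/* has_rows / has_cols: nonzero when the bottom / right half of the block
 * lies inside the frame. Returns nonzero if partition p can be signalled. */
static int partition_allowed(ToyPartition p, int has_rows, int has_cols) {
  assert(p >= PART_NONE && p <= PART_SPLIT);
  if (has_rows && has_cols) return 1; /* interior block: full syntax */
  if (!has_rows && has_cols) return p == PART_HORZ || p == PART_SPLIT;
  if (has_rows && !has_cols) return p == PART_VERT || p == PART_SPLIT;
  return p == PART_SPLIT; /* corner block: nothing is coded */
}

/* Number of alternatives the coder chooses between. An RD search that
 * charges the full four-way cost on an edge block overestimates the rate. */
static int partition_alternatives(int has_rows, int has_cols) {
  if (has_rows && has_cols) return 4;
  if (has_rows || has_cols) return 2;
  return 1;
}
```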
Change-Id: I2f31650512b6a4a7a2c03352414693aff6fbf87b
Increase in the damping used in adjusting the active Q range.
This does hurt rate accuracy a little in a few extreme cases
especially if the clip is very short*, but helps metrics.
* Note that the adjustment is applied at the GF/ARF group level based
on what happened in the last group. Hence for very short clips where
the length of a single group may be a significant % of the clip length
there is still scope for some drift that cannot be accommodated.
In practice most data points in our test sets are now much closer to target
than was previously the case with default settings, and in some cases are
better even than they were when the command line undershoot and overshoot
parameters were set very low (e.g. 2%). For example, in bridge_close at high
rates the old mechanism was unable to adapt enough to prevent extreme overshoot.
Change-Id: I634f8f0e015b5ee64a9f0ccaa2bcfdbc1d360489
Change to the calculation of the error divisor used in
get_twopass_worst_quality(). This follows on from other
changes to the rate control that impact the output of this
function.
Change-Id: I414fa9aa1e6a68a64dccea17c3712f44b8a0c10c
Changes to the function that redistributes bits from overshoot
or undershoot throughout the rest of the clip, so that it responds
more quickly.
Change-Id: I90f10900cdd82cf2ce1d8da4b6f91eb5934310da
Added a factor based on the bit spend in the last ARF group vs the
target to adjust the choice of the active worst quality in subsequent
groups.
Helps clips where previously there was a big overshoot or undershoot
to adapt and get closer to the target rate.
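The idea can be sketched as follows. This is not the code in the patch: the
function names, the damping parameter, the clamp limits and the 0..255 Q
range are all assumptions chosen for illustration. The factor is the ratio
of bits spent to bits targeted in the last group, pulled towards 1.0 by the
damping term, and then used to scale the active worst Q for the next group.

```c
#include <assert.h>

/* Hypothetical helper: damped spend/target ratio for the last group.
 * damping in [0, 1): 0 applies the full correction, values near 1 apply
 * almost none. The 0.5..2.0 clamp is an assumed safety limit. */
static double group_rate_factor(long long bits_spent, long long bits_target,
                                double damping) {
  assert(damping >= 0.0 && damping < 1.0);
  if (bits_spent <= 0 || bits_target <= 0) return 1.0; /* no information */
  const double ratio = (double)bits_spent / (double)bits_target;
  double factor = 1.0 + (ratio - 1.0) * (1.0 - damping);
  if (factor < 0.5) factor = 0.5;
  if (factor > 2.0) factor = 2.0;
  return factor;
}

/* Hypothetical helper: overshoot (factor > 1) raises the worst Q so the
 * next group spends fewer bits; undershoot lowers it. */
static int adjust_active_worst_q(int active_worst_q, double factor,
                                 int min_q, int max_q) {
  int q = (int)(active_worst_q * factor + 0.5);
  if (q < min_q) q = min_q;
  if (q > max_q) q = max_q;
  return q;
}
```

With damping 0.5, a group that spent twice its target gives a factor of 1.5
rather than 2.0, which trades some rate accuracy for stability.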
Change-Id: I67034b801679b99024409489a2273ea6fe23b8e6
The use of this value is preventing rate adjustment on clips
or sections that have very little motion but high noise and
this can give rise to some sections with massive overshoot.
Change-Id: I9a65c7c1148dc5d3a7d8b23e50fc1733f3661621
This is purely a refactoring patch and has no functional effect.
Uses of these masks can be arranged such that all input blocks are
contiguous in memory (stride == block width). In this case 1D versions
of operations can be used. 1D vector operations outperform their 2D block
equivalents because they are more processor cache friendly and avoid the
overhead of a second loop.
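A minimal sketch of the difference, using a masked blend as the example
operation. The function names, the 0..64 mask weight range and the blend
formula are assumptions for illustration and are not taken from the patch.
When stride == block width, the rows are back to back in memory and the 2D
loop collapses to a single pass over w * h elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2D version: works for any stride, at the cost of an outer loop and
 * per-row address computation. Mask weights are assumed to be 0..64. */
static void blend_mask_2d(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                          const uint8_t *mask, ptrdiff_t stride, int w,
                          int h) {
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const ptrdiff_t i = r * stride + c;
      const int m = mask[i];
      assert(m <= 64);
      dst[i] = (uint8_t)((m * a[i] + (64 - m) * b[i] + 32) >> 6);
    }
  }
}

/* 1D version: valid only when every buffer is contiguous
 * (stride == w), so the block is just n = w * h consecutive bytes. */
static void blend_mask_1d(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                          const uint8_t *mask, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    const int m = mask[i];
    assert(m <= 64);
    dst[i] = (uint8_t)((m * a[i] + (64 - m) * b[i] + 32) >> 6);
  }
}
```

For a contiguous 4x2 block, blend_mask_2d(dst, a, b, mask, 4, 4, 2) and
blend_mask_1d(dst, a, b, mask, 8) produce the same output.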
Change-Id: I2b76c9888aea2c857cc497e8a4b2841fd3dad54e
Replaced vpx_d45_predictor_4x4_ssse3(), vpx_d45_predictor_8x8_ssse3()
and vpx_d207_predictor_4x4_ssse3() with the newly created
vpx_d45_predictor_4x4_sse2(), vpx_d45_predictor_8x8_sse2()
and vpx_d207_predictor_4x4_sse2() respectively.
It's mostly neutral or slightly worse than ssse3 in good cases and
better than ssse3 in the bad cases (but still worse than using the mmx
regs).
Change-Id: Ib0237ceb71d2c57b8a93fd3170330cfed9d56bdd
Use the same rounding method that is used throughout the codebase,
where the halfway value is rounded up rather than down.
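For reference, the two rounding conventions differ only in the bias added
before the shift. The helper names below are hypothetical and the sketch
covers non-negative values only; it does not reproduce the code changed by
this patch.

```c
#include <assert.h>

/* Divide by 2^n, rounding the halfway value up: add half, then shift. */
static int round_shift_half_up(int value, int n) {
  assert(value >= 0 && n >= 1);
  return (value + (1 << (n - 1))) >> n;
}

/* Divide by 2^n, rounding the halfway value down: add just under half. */
static int round_shift_half_down(int value, int n) {
  assert(value >= 0 && n >= 1);
  return (value + (1 << (n - 1)) - 1) >> n;
}
```

The two agree everywhere except at exact halves: 6 / 4 = 1.5 becomes 2 with
the first helper and 1 with the second.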
Change-Id: I04e92850bc69a7d7a07b06e3d2ce97f6f2ada321
Skip intra mode and some inter modes (newmv, nearmv, nearestmv) for the
golden frame if the variance obtained from choose_partitioning is very low.
This applies only to 1 pass real-time CBR mode with bsize >= 32x32. It gives
a ~2.5% speed up with less than 0.1% PSNR drop on the rtc test set. No
visual regression was observed.
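The check can be pictured roughly as below. This is one reading of the
description, not the patch itself: it assumes that intra is skipped
outright and that newmv, nearmv and nearestmv are skipped only when the
reference is the golden frame. All names and the variance threshold
parameter are hypothetical.

```c
#include <assert.h>

/* Hypothetical enums for illustration; not the libvpx definitions. */
typedef enum {
  TOY_INTRA, TOY_ZEROMV, TOY_NEARESTMV, TOY_NEARMV, TOY_NEWMV
} ToyMode;
typedef enum { TOY_REF_NONE, TOY_REF_LAST, TOY_REF_GOLDEN } ToyRef;

/* Returns nonzero if the mode should be skipped. variance stands for the
 * value produced by the partitioning stage; low_var_thresh is an assumed
 * tuning parameter. */
static int skip_mode_for_low_variance(ToyMode mode, ToyRef ref,
                                      int rt_cbr_1pass, int bsize_pixels,
                                      unsigned int variance,
                                      unsigned int low_var_thresh) {
  assert(bsize_pixels > 0);
  if (!rt_cbr_1pass || bsize_pixels < 32) return 0; /* feature not active */
  if (variance >= low_var_thresh) return 0;         /* block is not flat */
  if (mode == TOY_INTRA) return 1;
  if (ref == TOY_REF_GOLDEN) {
    return mode == TOY_NEWMV || mode == TOY_NEARMV || mode == TOY_NEARESTMV;
  }
  return 0;
}
```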
Change-Id: I70efbc95a1007231ae36f02c5b2fbf6cd35077ad
Reduce operations and jumps. perf shows CPU time reduced from 1.9% to
1.6% when decoding fdJc1_IBKJA.248.webm on Xeon E5.
Will apply the changes to vp10 after code review.
Change-Id: I9351509922855d8896ddef1ed093b3ca12619a61
Separate prediction mode and tx type search for inter
modes. Enabled for speed >= 1.
baseline:
speed increase 40%
compression drop 0.30%/0.29% on lowres/midres
ext-tx:
speed increase 160%
compression drop 1.08%/0.95% on lowres/midres
Change-Id: Ieb34b1ee80df6980d16e26a5783e08cc0deae55b
Add a speed feature to separate prediction mode and tx type search
for intra modes: search for best intra prediction mode with fixed
default tx type first, then choose the best tx type for the
selected mode.
Coding performance drop:
baseline
lowres 0.10% midres 0.08% hdres 0.14%
with ext-tx
lowres 0.14% midres 0.25% hdres 0.20%
Speed improvement is 20% for baseline and 17% for ext-tx.
It is turned on for speed >= 1.
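The trade-off can be seen in a toy model. The sketch below compares an
exhaustive joint search over (mode, tx type) pairs with the two-stage search
described above. The cost table, type names and function names are invented
for illustration and do not come from the encoder.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*ToyCostFn)(int mode, int tx_type, void *ctx);

typedef struct {
  int mode;
  int tx_type;
  double cost;
  int evaluations; /* number of cost function calls made */
} ToySearchResult;

/* Exhaustive search: num_modes * num_tx evaluations. */
static ToySearchResult search_joint(int num_modes, int num_tx,
                                    ToyCostFn cost_fn, void *ctx) {
  assert(num_modes > 0 && num_tx > 0);
  ToySearchResult best = { 0, 0, cost_fn(0, 0, ctx), 1 };
  for (int m = 0; m < num_modes; ++m) {
    for (int t = 0; t < num_tx; ++t) {
      if (m == 0 && t == 0) continue; /* already evaluated */
      const double c = cost_fn(m, t, ctx);
      ++best.evaluations;
      if (c < best.cost) { best.mode = m; best.tx_type = t; best.cost = c; }
    }
  }
  return best;
}

/* Two-stage search: pick the mode with the default tx type, then pick the
 * tx type for that mode only. num_modes + num_tx - 1 evaluations. */
static ToySearchResult search_separate(int num_modes, int num_tx,
                                       int default_tx, ToyCostFn cost_fn,
                                       void *ctx) {
  assert(num_modes > 0 && default_tx >= 0 && default_tx < num_tx);
  ToySearchResult best = { 0, default_tx, cost_fn(0, default_tx, ctx), 1 };
  for (int m = 1; m < num_modes; ++m) {
    const double c = cost_fn(m, default_tx, ctx);
    ++best.evaluations;
    if (c < best.cost) { best.mode = m; best.cost = c; }
  }
  const int mode = best.mode; /* fixed for the second stage */
  for (int t = 0; t < num_tx; ++t) {
    if (t == default_tx) continue;
    const double c = cost_fn(mode, t, ctx);
    ++best.evaluations;
    if (c < best.cost) { best.tx_type = t; best.cost = c; }
  }
  return best;
}

/* Made-up RD costs for 3 modes x 4 tx types. */
static double toy_cost(int mode, int tx_type, void *ctx) {
  static const double table[3][4] = {
    { 10.0, 9.0, 12.0, 11.0 },
    { 8.0, 7.5, 9.0, 6.0 },
    { 9.0, 5.0, 10.0, 10.0 },
  };
  (void)ctx;
  return table[mode][tx_type];
}
```

On this table the joint search finds cost 5.0 in 12 evaluations, while the
two-stage search settles for cost 6.0 in 6 evaluations, which mirrors the
small coding loss traded for speed reported above.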
Change-Id: Ia5e8d39e8a4e2e42c521bfde938f8b6a98ab24f9