1. Made speed choices to be progressive
2. Adjusted rt speed settings to achieve better speed/quality
Overall, rt-5 gained 2.5% in compression/quality, encoding time of 720p
niklas clip goes from 137,052ms to 121,874ms
Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
Optimizing the variance functions: vp9_variance16x16, vp9_variance32x32,
vp9_variance64x64, vp9_variance32x16, vp9_variance64x32,
vp9_mse16x16 by migrating to AVX2
some of the functions were optimized by processing 32 elements instead of 16.
some of the functions were optimized by processing 2 loop strides of 16
elements in a single 256 bit register
This optimization gives between 2.4% - 2.7% user level performance gain
and 42% function level gain.
Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
Fix miss alignment of the frames contributing to the
error score and bit allocation for gf/arf groups.
Initial results slightly +.
Change-Id: Ie508bdcfdac52e592d48e1f13e01b3551b523deb
Some cleanups on frames_to_key, frames_since_key.
Also removes the unused fixed_q parameters in vp9.
Change-Id: If8743a32c71de30a8d17136477b53d607a7acda8
The previous implementation stops motion vector prediction test when
the zero motion vector appears for the second time. This commit fixes
it by simply skipping the second time check on zero mv and continuing
on to next mv candidate.
It slightly improves stdhd in speed 2 by 0.06% on average. Most static
sequences are not affected. A few hard ones, like jet, ped, and riverbed
were improved by 0.1 - 0.2%.
Change-Id: Ia8d4e2ffb7136669e8ad1fb24ea6e8fdd6b9a3c1
The feature undergoes prior assumption that the recursive partition
size search from 4x4 to 64x64, hence utilizing information from small
blocks to determine early termination in large block rate-distortion
optimization search. The current codebase is now going from top down.
The previous function might go with not properly initialized values,
hence removed.
Tested on pedestrian_area_1080p at 4000 kbps running under speed 2.
No visible difference in runtime observed.
Change-Id: I553df415c6191413762db7ae34e8790c71d8118e
This patch sets frame types correctly in the new
vp9_get_second_pass_params() function called prior
to encode_frame_to_data_rate() function, so that the
latter function can just work with what is passed to
it. This will allow multiple vp9_get_second_pass_params()
to be created for various encode strategies without
messing with the core encode function.
There is no difference in derf and yt. stdhd/hd are pending.
Change-Id: I70dfb97e9f497e9cee04052e0e8e0c2892eab0c3
Adding RefBuffer to simplify reference buffer management. The struct has a
pointer to image data and scale factors relative to the current frame.
Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e
In two pass encodes bits are allocated to each frame
according to a modified error score for the frame as a
fraction of the modified error score for the clip or section.
Previously a minimum rate per frame was reserved and
subtracted from the bits allocatable by the two pass code.
The vbr max section rate was enforced by clipping the
actual number of bits allocated.
In this patch the min and max vbr rates are enforced
instead by clipping the modified error scores for each frame
rather than the number of bits allocated.
Small gains for all test sets (psnr and SSIM) ranging from
~ +0.05 for YT psnr up to ~ +0.25 for Std-hd SSIM.
Change-Id: Iae27d70bdd3944e3f0cceaf225bad2e8802833de
This commit takes a preliminary attempt to refine the motion search
control. It detects the SAD associated with mv predictor per reference
frame, and based on which to determine whether the encoder wants to
reduce the motion search range (if the predicted mv provides fairly
small SAD), or to skip the current reference frame (if there exists
another ref frame that gives much smaller SAD cost).
This feature is turned on in the settings of speed 1 and above.
In speed 1, compression performance changed
derf -0.018%
yt -0.043%
hd -0.045%
stdhd -0.281%
speed-up
pedestrian_area_1080p at 4000 kbps 100 frames
199651ms -> 188846ms (5.5% speed-up)
blue_sky_1080p at 6000 kbps
443531ms -> 415239ms (6.3% speed-up)
In speed 2, compression performance changed
derf -0.026%
yt -0.090%
hd -0.055%
stdhd -0.210%
speed-up
pedstrian 113949ms -> 108855ms (4.5% speed-up)
blue_sky 271057ms -> 257322ms (5% speed-up)
Change-Id: I1b74ea28278c94fea329d971d706d573983d810d
Buffer the SSE of prediction residuals in the rate-distortion
optimization loop of a given block. This information would be used
for later encoding control.
Change-Id: If4e63f3462490513c48be9407d3327c8dd438367
Moving back to scale_factors struct. We don't need anymore x_offset_q4 and
y_offset_q4 because both values are calculated locally inside vp9_scale_mv
function.
Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b
Before mv scaling it is required to calculate x_offset_q4/y_offset_q4
by calling set_scaled_offsets(). Now offset configuration can not be
missed because it happens just before scale_mv().
Change-Id: I7dd1a85b85811a6cc67c46c9b01e6ccbbb06ce3a
Various cleanups and streamlining of interfaces as precursor
to further advancements in rate control.
Pre-encode parameter setting for different use cases:
One-pass, first of 2-pass, second of 2-pass, and Svc
are separated out.
There is no change in output with this change.
Change-Id: Ied8ca7d84d610993776aa30ef263fe20452e0e3e
Take account of the fact that the overlay frame is usually
very cheap so distribute target bits among the other frames.
Change-Id: I120685122e8cbbe75da8d07d02932f7877059867
This will hurt metrics in some cases (particularly for static
clips at low data rates where there is extra overhead, but it
helps smooth transitions around forced key frames between
stitched kf sections.
Change-Id: I7e1026ae0de6c77bba863061e115136d7f283cc0
Slightly reduces the mean tendency to undershoot target
rate in vbr, especially when using the memory less mode
and when recodes are disabled.
The effect is primarily at low q.
Change-Id: I59a593b99522cc7da31b4134d1c8a65f5b7b7c53
This commit reworks the prediction filter rate-distortion cost update
process consistent for all block sizes.
Change-Id: I5874349ab38df380240f96c2d4ef924072bab68d