Removes certain cases of feedback of active_worst_quality,
and removes it from the RATE_CONTROL structure. Now active
worst quality is expected to be computed locally in the
q picking function during the encode.
Making temporal filter strength depend on avg_frame_qindex
rather than on active_worst_quality actually improves
performance esp. for yt.
derf: +0.038%
yt: +0.359%
Change-Id: I1fe5a343034b55af9322289165321f00ac0827b1
In real time encoding, we enable encode_breakout to make encoding
fast. A speed feature "use_encode_breakout" is defined to set
encode_breakout thresholds for different speeds.
However, currently, static_thresh is an encoder option. The encode_
breakout can be turned off if user sets static_thresh=0 specifically.
The rtc set borg test result: (need to set --static_thresh=1)
speed -5, psnr loss -3.543%;
speed -4, psnr loss -2.358%;
speed -3, psnr loss -0.771%.
Encoding speed test:
speed -5, 11% - 60% speedup;
speed -4, 5.5% - 28% speedup;
speed -3, 0.8% - 7% speedup.
Change-Id: Icde592ffbe77eac7446f872a2e9eb2051733677b
Aq 1 only updates segment map on kf and arf and
only uses 3 segments. With these settings AQ1 is
+ for most clips in SSIM but negative in psnr.
However, the penalty in PSNR is much less than
previously.
Old version aq1 average results for std hd
-20.899% psnr, -5.809% SSIM
New version aq1 for std hd
-3.57% psnr, +1.23% SSIM
Aq2 Now uses only 2 segments and rd.
This mode is still slightly negative for most clips on
psnr and SSIM but seems to have a much bigger visual
impact on several problem clips than aq mode 1.
Old results for std hd:
-2.578% psnr, -1.151% SSIM
New results for std hd:
-1.561% psnr, -0.85% SSIM
Change-Id: I94f57f8a73121629ce598fb921aad761c1450e1c
Some parameter changes and fixes on one-pass rate control.
derfraw300 is now only 10% below 2-pass speed 0 rate control.
Change-Id: I1940eef8a5a035dc18e71b880d5e00cabd1f01b9
This CL changes libvpx to call a function when a frame buffer
is needed for decode. Libvpx will call a release callback when
no other frames reference the frame buffer. This CL adds a
default implementation of the frame buffer callbacks. Currently
only VP9 is supported. A future CL will add support for
applications to supply their own frame buffer callbacks.
Change-Id: I1405a320118f1cdd95f80c670d52b085a62cb10d
Function encode_rtc_frame_internal() and encode_frame_internal() only
differed by a couple of speed features, this commit relocation those
difference into the setup of speed features and merged two functions
into one to remove duplication.
It also fixed a subtle bug super_fast_rtc was used before it was
initialized.
Change-Id: I234a5a1d11a4450930e5b4943dbab434208d5030
-Properly set the average frame size for each layer.
-Allow each layer to update its average/last Q stats after encoding.
-Initialize for some layer context variables.
Change-Id: Iaa37d144fcf4f30ff4283a4e8db8b9ca8bf4c815
A bug was reported in Issue 702: "SIGILL (Illegal instruction) when
transcoding with vp9 - using FFmpeg". It was reproduced and fixed.
Change-Id: Ie32c149a89af02856084aeaf289e848a905c7700
Bitwise OR operation doesn't guarantee any subexpression evaluation order.
Just reading one bit now and ignoring the next one. For reference look at
vp9_decode_frame() implementation.
Change-Id: I4971686929838ae5ded8f43a38a2934db5e1d462
Fixes some of the parameters for 1-pass non-cbr mode.
Also includes some cleanups, inlcuding refactoring of the
recode_loop options.
Results on derfraw300 improve by about 5-6%, so that the one-pass
mode is now 13% below the 2-pass mode in speed 0.
Change-Id: I844cc2638694c7574f3be00d41d60b23dc1016f0
this ensures both are properly initialized when calling _dealloc().
+ check the arrays before access
Change-Id: I789af39b41c271b5cb3c029526581b4d9903b895
This patch adds a buffer-based rate control for temporal layers,
under CBR mode.
Added vpx_temporal_scalable_patters.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).
Updated datarate unittest with tests for temporal layer rate-targeting.
Change-Id: I8900a854288b9354d9c697cfeb0243a9fd6790b1
Inlcudes a number cleanups:
1. Moves the one-pass pre-encode parameter setting functions
to vp9_ratectrl.c
2. Deprecates per_frame_bandwidth in RATE_CONTROL structure
3. Removes target_bandwidth in cpi structure since it is not used.
4. Various renaming of functions
There is no bit-stream change in 2-pass, one-pass cbr and one-pass
vbr modes.
Change-Id: Ifd9916bf4d485b7d04c5f52044ffe6703254ccbd
This isn't strictly necessary, but makes the file more consistent
with the other arm assembly source files.
Change-Id: I245c9677d89e0ab3f31991e473764858af35b180
CONFIG_USE_X86INC is available to every makefile, there's no need to
duplicate its value with USE_X86INC
Change-Id: Id12bd5f09cba78abba56ab5a8f56351562e5b8b6
avoid wrapping msvc includes with extern "C"; this breaks some visual
studio builds of the (c++) tests.
Change-Id: Ie8062d55d4f4c049f6cd360a36da6a67607df132
Fixes rate control partially in one-pass non-cbr case to achieve a
bitrate close to the one desired. Previous version was way off at
the high bitrate end.
Also includes several one-pass rate control cleanups and refactoring.
On derfraw300, one-pass encoding is now 19% off from two-pass speed
0 encoding, down from 35%.
Change-Id: I6f0dcdb7f8aa85a7e7cd3a3155d4f9d2a4d2f4f4
This patch added ssse3 optimization of bilinear sub-pixel filters.
The real time encoder was speeded up by ~1%.
Change-Id: Ie82e98976f411183cb8c61ab8d2ba0276e55a338
Moved a few features with low impact on compression form -5 to -4 and
increased adaptive_rd_thresh for -5.
Change-Id: Ib1b748168cc6ed7684ae4818499f3a536ae76253
The new implementation disagrees when the argument is equal to 2**n but
that is never called in practice and based on how it is used the new
implementation is correct in that case.
Change-Id: Ifbac4ad87d459fe6bd2fd0f400c0340f96617342
This avoids calls to get_unsigned_bits() with constants and
replaces hard to trace loops with simpler structures.
Change-Id: Ic1afc5a17d7df5bcfc85b76efda316b0bf118467
Using bilinear filters could speed up the codec in real-time mode.
This patch added sse2 optimizations of bilinear filters that
operate on different-sized blocks.
Tests showed that the real-time encoder was speeded up by 3%.
Change-Id: If99a7ee4385fcc225c3ee7445d962d5752e57c3f
This patch adds a buffer-based rate control for temporal layers,
under CBR mode.
Added vpx_temporal_scalable_patters.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).
Updated datarate unittest with tests for temporal layer rate-targeting.
Change-Id: I9cb6cce2494390ae6096ee17774af7fb9308bde7
As pointed out by Dmitry and James, "partial" is a Microsoft-
specific c++ keyword, and it is renamed.
Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
This commit added a logic to prevent the inter_filter type from being
changed if the default interp_filter mode is not switchable. Also, it
sets the default interp_filter to BILINEAR at very and super fast rtc
encoding modes
Change-Id: Ic41e6d31de29795a4ce536ec79afb01cab6daad3
--rt --cpu-used=-5 uses the progressive rtc mode
--rt --cpu-used=-6 uses the new super fast rtc mode
Change-Id: Id6469ca996100cdf794a0e42d76430161f22f976
Implemented parallel loopfiltering, which uses existing tile-
decoding threads. Each thread works on one row, and when that row
is loopfiltered, it moves to next unattended row. To ensure the
correct filtering order, threads are synchronized and one
superblock is filtered only if the superblocks it depends on are
filtered already.
To reduce synchronization overhead and speed up the decoder, we use
nsync > 1 for high resolution.
Performance tests:
1. on desktop:
8-tile 4k video using 8 threads, speedup: 70% - 80%
4-tile HD video using 4 threads, speedup: ~35%
2. on mobile device(Nexus 7):
4-tile 1080p video using 4 threads, speedup: 18% - 25%
4-tile 1080p video using 2 threads, speedup: 10% - 15%
Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
* Avoid unnecessary type erasure
* Prune unused/duplicate fields from struct rdcost_block_args
* Make struct rdcost_block_args a local
Change-Id: I4f1fd4837ccd028bbfe727191ee8d69f0463b7e5
When showing a previously decoded frame, i.e. when
show_existing_frame=1, the update of the
last_show_frame flag must be disabled.
This is to ensure that the last_show_frame flag
reflects the state of the flag for the immediately
previously decoded frame rather then the value that
was forced to ensure that a previously decoded frame
would be displayed.
This patch also adds a test vector to verify that the
display_existing_frame flag works as expected. Code
for generating the test vector can be found in this
patch:
https://gerrit.chromium.org/gerrit/#/c/68581/
(Bug originally reported by Alexander Voronov
<ru.xalba@gmail.com>).
Change-Id: I731d288fba02088959f7fcc87707137fffc6acf5
Added a constant to represent the minimum KF boost
rather than using the magic number 2000 in the code.
Change-Id: I9428b61f47d26312caff81c6f9ae8587df004791
Some changes in 1ca1186 were mistakenly reverted by a later merge,
this commit re-instated the chanages from values to enums.
Change-Id: Ia6b01c31da584a1f612996e6432612c1295b9eaf
In this new mode, the size range is strictly determined by the min
and max partition size in neighborhood blocks.
Niklas720 encoding time at cpu-used -5 goes from 56250ms to 50676ms,
a 10% reduction.
Change-Id: I316b0e2ac967ff3fad57b28d69c0ec80b7d8b34e
Includes a few fixes and clean-ups that adds the ability
to use alt-ref frames in one-pass mode.
Whether alt-refs are actually used or not is controlled by a
macro USE_ALTREF_FOR_ONE_PASS in vp9_firstpass.c.
This first cut seems to improve derf by 15+% in 1-pass mode.
But further experiments with parameters are underway.
Change-Id: I78254421435478003367c788c7930d2dc4ee2816
This patch only works if the video is a width and height that are both
a multiple of 32.. It sets every partition to 16x16, and does INTRADC
only on the first frame and ZEROMV on every other frame. It always does
does the largest possible transform, and loop filter level is set to 4.
Was ~20% faster than speed -5 of vp8
Now 20% slower but adds motion search ( every block ), nearest, near
and zeromv
The SVC test was changed because - while this realtime mode produces
bad quality albeit quickly, it isn't obeying all the rules it should
about which frames are available.
Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7
Adds multiple filters in the 0.5-1.0 range in the last stage
of the resize functions to prevent over-smoothing/aliasing
Change-Id: I1a615adb16f0df5095790945c94b28b4d6a6fc48
SSE for a 64x64 block with 3 planes can go as high as 3*2^28. So left
shift by 4 may overflow 32 bit int.
Change-Id: I63c84aa56894788bb987299badabbd7cc6fd0be6