sf->unused_mode_skip_lvl. Tests modes as normal for all
sizes at or below the given level. At larger sizes it skips
all modes that were not chosen at any smaller size.
Hence setting BLOCK_SIZE_SB64X64 is in effect off.
Setting BLOCK_SIZE_AB4X4 will only consider modes that
were chosen for one or more 4x4 blocks at larger sizes.
sf->reference_masking.
Do a test encode of the NONE partition at one size and create
a reference frame mask based on the best rd choice. In the
full search only allow this reference frame.
Currently it is testing 64x64 and repeats this in the full search.
This does not work well with Jim's Partition code just now and
is disabled by default.
Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd
This commit adds a speed feature where only squared partition are
evaluated in partition picking. Enable this feature in cpu-used 2
reduces encoding time by ~30%.
loss of compression:
-0.9% on cif set
-1.23% on stdhd
Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9
This speed feature will skip searching the directional intra prediction
modes D63, D117, D27, D153 if the best intra mode so far is not one of
the diagonal, horizontal or vertical directions closest to the respective
directions being tested. In other words, this implements a sort of
binary search in the angular domain.
Speedup: about 9-10%
Results: -0.05% only on derfraw300.
Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2
(1) Refines the modeling function and uses that to add some speed
features. Specifically, intead of using a flag use_largest_txfm as
a speed feature, an enum tx_size_search_method is used, of which
two of the types are USE_FULL_RD and USE_LARGESTALL. Two other
new types are added:
USE_LARGESTINTRA (use largest only for intra)
USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for
inter)
(2) Another change is that the framework for deciding transform type
is simplified to use a heuristic count based method rather than
an rd based method using txfm_cache. In practice the new method
is found to work just as well - with derf only -0.01 down.
The new method is more compatible with the new framework where
certain rd costs are based on full rd and certain others are
based on modeled rd or are not computed. In this patch the existing
rd based method is still kept for use in the USE_FULL_RD mode.
In the other modes, the count based method is used.
However the recommendation is to remove it eventually since the
benefit is limited, and will remove a lot of complications in
the code
(3) Finally a bug is fixed with the existing use_largest_txfm speed feature
that causes mismatches when the lossless mode and 4x4 WH transform is
forced.
Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a
pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction
(currently the benefit of modeling is limited for txfm size selection,
but keeping this enum as a placeholder) .
USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing
use_largest_txfm speed feature).
Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
Added a speed feature in speed 1 to disable splitmv for HD (>=720)
clips. Test result on stdhd set: 0.3% psnr loss and 0.07% ssim
loss. Encoding speedup is 36%.
(For reference: The test result on derf set showed 2% psnr loss
and 1.6% ssim loss. Encoding speedup is 34%. SPLITMV should be
enabled for small resolution videos.)
Change-Id: I54f72b94f506c6d404b47c42e71acaa5374d6ee6
Remove the use of sf->comp_inter_joint_search_thresh
from the baseline speed 0. Approx +0.4% on derf.
Change-Id: Icc14db98909830f40e5ac66130d40e78d2e55c71
This cl converts use partition from last frame to do the following:
if part is none,horz, vert -> try split
if part != none and one of the children is not split - try none
Change-Id: I5b6c659e35f3ac9f11c051b92ba98af6d7e8aa87
Signed-off-by: Jim Bankoski <jimbankoski@google.com>
Added a speed feature that focuses only on thresholds
for new motion modes.
Moved sf->comp_inter_joint_search_thresh into speed
1. This has ~+0.4% impact on quality at speed 0 as
our quality reference baseline.
Slight adjustment to baseline thresholds.
Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5
Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated
ones from vp9_onyx_int.h and vp9_onyxd_int.h.
Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6
This commit change the partition search order to allow checking of
rectangular partition to be done after square partitions. It also
added a speed feature to skip rectangular partition check when
NONE is better than SPLIT in RD sense.
This feature roughly speed up encoder by 1.5X with loss on compression
-0.91% on cif set
-0.56% on stdhd set
Change-Id: I0d2d06993041aa9ea9073fcc39c54f73a127dfa4
Also tweaks to other features and experiments with
what is on and off at different speed settings.
Change-Id: I3e1d0be0d195216bf17c2ac5df67f34ce0b306b2
Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size
and changed its meaning such that it is a delta applied to
reduce the default first step size (>> x) in the motion search
rather than an absolute value.
The default first step size is already changed according to the image
dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size
now applies a further correction from the default.
Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d
The part where we align it by 8 or 16 is an implementation detail that
shouldn't matter to the outside world.
Change-Id: I9edd6f08b51b31c839c0ea91f767640bccb08d53
Makes first 50 frames of bus @ 1500kbps encode from 3min22.7 to 3min18.2,
i.e. 2.3% faster. In addition, use the sub_pixel_avg functions to calc
the variance of the averaging predictor. This is slightly suboptimal
because the function is subpixel-position-aware, but it will (at least
for the SSE2 version) not actually use a bilinear filter for a full-pixel
position, thus leading to approximately the same performance compared to
if we implemented an actual average-aware full-pixel variance function.
That gains another 0.3 seconds (i.e. encode time goes to 3min17.4), thus
leading to a total gain of 2.7%.
Change-Id: I3f059d2b04243921868cfed2568d4fa65d7b5acd
This uses variance to split partition. Variance is calculated using
nearest mv, always from last ref frame.
Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
The encode-side scaling was not indexing through the image correctly
for the chroma planes, causing a green checkerboard-like output in
the unit test.
Change-Id: I9abbd73615404cd6699588be3e64dcf59005bc14
* New probs for subpel filters/tx_count
* Makes a change to not reset to defaults for the tx_size
probs if an intermediate frame reverts to using a fixed tx_size.
* A few updates to the parameters for backward adaptation for mode/mv
* some cosmetic cleanups
derf300: +0.06%
Change-Id: I22994d659bc31ca7a4fc8820fde24001e64a2920
Remove the bilinear filter mode, and the no-loopfilter mode, and the
related vp9_setup_version() function.
Change-Id: I32311367812faf37863131df3af37d63d03973d7
Adds coding of transform size within a frame by use of context
of transform sizes selected in left and above blocks.
Also incorporates code for generating stats.
TODO: generate and incorporate new default stats
Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6
Refactors mbskip coding to be compatible with coding of the rest of
the symbols. Adds forward/backward adaptation and removes a lot of
the legacy code.
Results:
fast50: +1.6%
derfraw300: +0.317%
Change-Id: I395a2976d15af044d3b8ded5acfa45f6f065f980
The partition types of blocks sitting on the frame boundary are
constrained by the block size and the position of each sub-block
relative to the frame. Hence we use truncated probability models
to handle the coding of such information.
100 frames run:
yt 0.138%
Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d
With the removal of i4X4 and SPLIT_MV modes, the two entries for the
modes are no longer used. This patch remove the coding of the deltas.
Change-Id: Iea4eb500404ebe9706159380a03b8eca542fb4c3
Changes to the coding of transform sizes, along with forward
and backward probability updates.
Results:
derf300: +0.241%
Context based coding of transform sizes will be in a separate
patch.
Change-Id: I97241d60a926f014fee2de21fa4446ca56495756
Wrong max data size (skip has no data) and use of vp9_get_segdata()
when it should be vp9_segfeature_active().
Change-Id: I1eb97d33df6e2a42cc589049f704266fe3639902
Code intra/inter, then comp/single, then the ref frame selection.
Use contextualization for all steps. Don't code two past frames
in comp pred mode.
Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95
Added structures to support independent rd thresholds
for different block sizes (and set experimental block
size correction factors).
Added structure to to allow dynamic adaptation of thresholds
per mode and per block size basis depending on how often
the mode/block size combination is seen (currently fixed factor).
Removed some unused variables.
TODO
- Adaptation of thresholds based on how often each mode chosen.
- The baseline mode values could also be adjusted based on
the block size (e.g. for a particular intra mode use a low threshold
for 4x4 prediction blocks but a relatively high value for 64x64.
Change-Id: Iddee65ff3324ee309815ae7c1c5a8584720e7568