Commit Graph

322 Commits

Author SHA1 Message Date
Adrian Grange
fbd73648dd Merge "Cleanup typos, remove unnecessary lines, replace switch" 2013-07-30 12:59:46 -07:00
Adrian Grange
b30a06b930 Cleanup typos, remove unnecessary lines, replace switch
Removed unnecessary code lines, replaced switch with an if,
fixed spelling errors and formatting.

Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6
2013-07-30 12:10:32 -07:00
Yaowu Xu
a15d1f3134 removed duplication
Change-Id: Ica23b66f6664e5a5b168499584f0afffbc54794f
2013-07-30 09:09:14 -07:00
Dmitry Kovalev
c09b81719f Merge "General cleanups." 2013-07-26 13:59:39 -07:00
Paul Wilkins
fe5e2a91bb Auto min and max partition size experiment.
Speed feature experiment to set an upper and lower
partition size limit based on what has been seen
in spatial neighbors.

This seems to gives quite reasonable speed gains in local
(10-15%) and when used with speed 0 the losses are small
(0.25% derf, 0.35% stdhd). However, for now I am only
enabling it on speed 1 as there may be clashes with the existing
temporal partition selection in speed 2.

Using a tighter min / max around the range derived from the
neighbors increases speed further but at the cost of a
bigger quality loss. However,  I think this spatial method could
be combined with data from either the last frame or a variance
method (or both) to refine the range of minimum and maximum
partition size. I.e. consider the min and max from spatial and
temporal neighbors and the variance recommendation.

Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
2013-07-26 18:30:49 +01:00
Dmitry Kovalev
7131cb0e3d General cleanups.
Removing unused constants, macros, and function declarations. Using
ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
#include from *.h to *.c. Merging for loops for motion vectors.

Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
2013-07-25 14:13:48 -07:00
Dmitry Kovalev
47d61f008f Removing vp9_adapt_mode_context function.
Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs.

Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9
2013-07-25 10:48:45 -07:00
Adrian Grange
a183f17d33 Merge "Correct spelling mistakes" 2013-07-24 09:48:57 -07:00
Adrian Grange
bc8b0529db Correct spelling mistakes
Change-Id: Id4138293efeac4503b2e01ce7a6c150a5abeef77
2013-07-24 07:58:26 -07:00
Dmitry Kovalev
1099a436d3 Moving counts from FRAME_CONTEXT to new struct FRAME_COUNTS.
Counts are separate from frame context. We have several frame contexts but
need only one copy of all counts.

Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d
2013-07-23 17:02:08 -07:00
Paul Wilkins
cedd24ec61 Merge "Renaming of segment constants." 2013-07-23 08:16:12 -07:00
Scott LaVarnway
2fd20eb37d Merge "Eliminated prev_mip memsets/memcpys in encoder" 2013-07-23 06:43:52 -07:00
Paul Wilkins
7c134bc0cd Merge "Reworked the auto_mv_step_size speed feature" 2013-07-23 04:49:55 -07:00
Paul Wilkins
32042af14b Renaming of segment constants.
Renamed:
  MAX_MB_SEGMENTS to MAX_SEGMENTS
  MB_SEG_TREE_PROBS to SEG_TREE_PROBS

The minimum unit for segmentation in the segment map
is now 8x8 so it is misleading to use MB_ as macro-block
traditionally refers to a 16x16 region.

Change-Id: I0b55a6f0426bb46dd13435fcfa5bae0a30a7fa22
2013-07-23 12:09:04 +01:00
James Zern
76db4d599a Merge "VP[89]_COMMON: remove golden/altref frame counts" 2013-07-22 12:55:07 -07:00
Paul Wilkins
1d189d6464 Re-order mode search in rd.
Mode search order in rd loop changed to better reflect
observed hit counts.

Also some adjustment of the baseline mode rd thresholds
to reflect the order change and observed frequencies.

Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c
2013-07-22 17:21:12 +01:00
Paul Wilkins
888375d243 Fix build error.
When CONFIG_POSTPROC is set there was a now
invalid reference to cm->filter_level.

Changed to cpi->mb.e_mbd.lf.filter_level in line with
change Iaf5fb71c33719cdfa1b991f671caf071be9ea035

Change-Id: If746e60044903f7ba8d0d346225b3d015226c7d0
2013-07-22 14:01:43 +01:00
Dmitry Kovalev
ee1771ebaa Moving all loop filter related variables into new struct.
Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
Eventually it will be moved to vp9_loopfilter.h for better code structure.

Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
2013-07-19 16:19:10 -07:00
Deb Mukherjee
302698fb12 Reworked the auto_mv_step_size speed feature
This patch modifies the auto_mv_step_size speed feature to
use a combination of the maximum magnitude mv from the last
inter frame, and the maximum magnitude mv for the two reference
mvs with the same reference. For arf frames, the max mav step
for the resolution is used.
The bounds therefore are slightly tighter. The feature is made
a speed 1 feature.

Rebased.

Results (when this feature is turned on over speed 0):
derfraw300: -0.046% psnr, about 5+% speedup
(tested on football: goes from 4m30.760s to 4m17.410s).

Change-Id: If492797a61b0b4b3e58c0b8f86afb880165fc9f6
2013-07-19 15:12:56 -07:00
James Zern
5f30a0c687 VP[89]_COMMON: remove golden/altref frame counts
these are only used in the encoder.
frames_since_golden / frames_till_alt_ref_frame -> VP[89]_COMP

Change-Id: Ie14a6f46987bced685ddb449b85dc261caba6dfe
2013-07-18 14:09:21 -07:00
Dmitry Kovalev
9f3c0e34a9 Moving Scale2Ration function from vp9_onyx.h to vp9_onyx_if.c.
Change-Id: Idfe2a850f72b38f519aea1aac1266d8c3aa813ee
2013-07-18 14:05:06 -07:00
Yunqing Wang
3798db88e1 Remove unnecessary calling of vp9_init_quantizer()
vp9_init_quantizer() is called in vp9_create_compressor(), and
should not be called in vp9_set_speed_features().

Change-Id: Ic2f1f4b0531b9d46bb841d7e1d8da9812207dad6
2013-07-17 14:59:00 -07:00
Yunqing Wang
10e83b0717 Enable disable_splitmv feature for other speeds
Added disable_splitmv feature at other speed levels. For speed 3 or
above, always turn it on.

Change-Id: Ibb36f0a7ef12a34b4f8d0f9cb6193eab43b34360
2013-07-17 10:25:49 -07:00
Yunqing Wang
df90d58f4f Speed up motion estimation using small partitions' result(experiment)
Current partition checking starts from small sizes, and then goes up
to large sizes. This experiment uses the small partitions' motion
estimation result, which is already available, to speed up the
large partition's motion estimation. We can decide to skip some
patition checkings if they are unlikely choices. We could use the
motion vector(MV) result as current partition's prediction MV, limit
the search range and reference frame.

Current result at speed 1:
psnr loss: 1.19% for stdhd, 0.287% for derf.
speed gain: 14% for sunflower(hd), 11% for akiyo.

Further improvement will be done later.

Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
2013-07-17 09:11:47 -07:00
Paul Wilkins
d66eab15dd Merge "Move uv intra mode selection in rd loop." 2013-07-17 05:19:26 -07:00
Paul Wilkins
2ee338ce3b Move uv intra mode selection in rd loop.
Use an estimate based on DC_PRED for intra uv cost
within the rd loop then only do a full uv mode analysis
if an intra mode is chosen.

Significant speed gains in some cases. Currently only
enabled for speed 2 pending speed/quality tests.

Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0
2013-07-17 11:11:21 +01:00
Dmitry Kovalev
9482a0bf10 Cleaning up tile code.
Removing tile_rows and tile_columns from VP9Common, removing redundant
constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
vp9_get_tile_n_bits.

Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
2013-07-16 14:47:15 -07:00
James Zern
9581eb6e8a use consistent framerate naming
s/frame_rate/framerate/g

Change-Id: I6fc3e088e419c5f46e3a9390dd8a2cad2677a2fc
2013-07-16 14:12:47 -07:00
James Zern
3a7c2665d0 Merge "yv12config: remove YUV_TYPE" 2013-07-16 12:16:04 -07:00
Yaowu Xu
5b915ebd92 Change to extend full border only when needed
This is a short term optimization till we work out a decoder
implementation requiring no frame border extension.

Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
2013-07-15 20:52:13 -07:00
Jingning Han
faff6ed0fb Skip duplicate block encoding in the rd loop
This speed feature allows the encoder to largely remove the spatial
dependency between blocks inside a 64x64 superblock, thereby removing
the need to repeatedly encode superblocks per partition type in the
rate-distortion optimization loop.

A major challenge lies in the intra modes tested in the rate-distortion
optimization loop. The subsequent blocks do not have access to the
reconstructed boundary pixels without the intermediate coding steps.
This was resolved by using the original pixels for intra prediction
in the rd loop, followed by an appropriately designed distortion
modeling on the quantization parameters. Experiments also suggested
that the performance impact is more discernible at lower bit-rate/psnr
settings. Hence a quantizer dependent threshold is applied to deactivate
skip of block coding.

For bus_cif at 2000 kbps,
speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB
         performance loss.

speed 1: runtime 65312ms  -> 61536ms, (7% speed-up) at 0.04dB
         performance loss.

This operation is currently turned on in settings of speed 1.

Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec
2013-07-15 11:08:58 -07:00
James Zern
4fc6c88e9c yv12config: remove YUV_TYPE
this was never fleshed out in the context of VP8, for which it was
added. for VP9 it has no meaning.

Change-Id: Iba2ecc026d9e947067b96690245d337e51e26eff
2013-07-12 15:25:48 -07:00
Paul Wilkins
b8ddc9f0d3 Merge "Speed 2 feature adjustment." 2013-07-12 02:14:01 -07:00
Dmitry Kovalev
c4ad3273c7 Moving segmentation related vars into separate struct.
Adding segmentation struct to vp9_seg_common.h. Struct members are from
macroblockd and VP9Common structs. Moving segmentation related constants
and enums to vp9_seg_common.h.

Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
2013-07-11 11:57:57 -07:00
Scott LaVarnway
f2a6bcfb18 Eliminated prev_mip memsets/memcpys in encoder
This patch is in experimental but was not merged into master.

This patch swaps ptrs instead of copying and uses the
last show_frame flag instead of setting the entire buffer
to zero.

Change-Id: Ia0950466c8ba301a2a5bf917ff3d07bc1a2c2311
2013-07-11 10:47:28 -04:00
Paul Wilkins
5290eeab88 Speed 2 feature adjustment.
With sf->auto_mv_step_size on it is questionable
whether sf->reduce_first_step_size is worthwhile.
At speed 2 it was not having a big impact.

Even at speed 2 sf->optimize_coefficients = 0 is not
having a big speed imapct so for now I have moved it
down into a higher speed setting.

Change-Id: I8a54de76d486ad37aabce76474889da2768b14c1
2013-07-11 13:59:12 +01:00
Deb Mukherjee
53ff43adc3 Prunes out full-rd computation based on modeled rd
Adds a speed feature to eliminate full-rd computation if the modeled
rd or rd based on a different parameter in the same mode is already
a lot larger than the best rd yet.

Specifically, only search the sharp and smooth filters if the modeled
rd cost based on the  regular filter is within a certain factor of the
best rd cost so far. Also, skip full-rd computation of non splitmv
inter modes if the modeled rd cost based on pred error is within the
same factor of the best rd cost so far.

Also adds some enhancements in the rd search for splitmv mode to
speed things up by early breakouts. Negligible impact on performance.

Resuts on derfraw300:
psnr:    -0.013% with the splitmv enhancements, -0.24% with the rd
         breakout feature on.
speedup: 6% with splitmv enhancements, 20% with also residual breakout
         (tested on football sequence at 600 Kbps)

Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc
2013-07-10 13:49:49 -07:00
Yaowu Xu
bed27a960a Add a feature to reduce chrome intra mode search
Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8
2013-07-10 08:59:18 -07:00
Ronald S. Bultje
ed995afba1 Make frame-wide filter-type decision fully RD-based.
Overall, on all test sets, this gains about +0.2% on all metrics.
City is a clip where this really hurts (-1.0% on all metrics), I'm
not quite sure why yet. Maybe interesting to look into in the future.

Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78
2013-07-08 16:22:37 -07:00
Deb Mukherjee
d9b62160a0 Implements several heuristics to prune mode search
Skips mode searches for intra and compound inter modes depending
on the best mode so far and the reference frames. The various
heuristics to be used are selected by bits from a flag. The
previous direction based intra mode search pruning is also absorbed
in this framework.

Specifically the flags and their impact are:

1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique
directional modes and TM_PRED if the best so far is
an inter mode)
derfraw300: -0.15%, 10% speedup

2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153
mode search if the best so far is not one of the closest
hor/vert/diagonal directions.
derfraw300: -0.05%, about 9% speedup

3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode
search if the best so far is an intra mode)
derfraw300: -0.06%, about 7-8% speedup

4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search
if the best single ref inter mode does not have the same ref
as one of the two references being tested in the compound mode)
derfraw300: -0.56%, about 10% speedup

Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495
2013-07-08 12:17:12 -07:00
Paul Wilkins
ef0ca2deaa Merge "Fix to comp_inter_joint_search_thresh feature." 2013-07-04 03:27:00 -07:00
Dmitry Kovalev
430bd0c94a Merge "Replacing 64 / MI_SIZE with MI_BLOCK_SIZE." 2013-07-03 14:16:02 -07:00
Dmitry Kovalev
5a21de8418 Replacing 64 / MI_SIZE with MI_BLOCK_SIZE.
Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c
2013-07-03 10:54:50 -07:00
Paul Wilkins
f58b44ad62 Fix to comp_inter_joint_search_thresh feature.
When this is 0 (BLOCK_SIZE_AB4X4) we want to do
the inter joint search for all sizes.

Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88
2013-07-03 16:58:34 +01:00
Paul Wilkins
72c5778ec5 Added two new skip experiments.
sf->unused_mode_skip_lvl. Tests modes as normal for all
sizes at or below the given level. At larger sizes it skips
all modes that were not chosen at any smaller size.
Hence setting BLOCK_SIZE_SB64X64 is in effect off.
Setting BLOCK_SIZE_AB4X4 will only consider modes that
were chosen for one or more 4x4 blocks at larger sizes.

sf->reference_masking.
Do a test encode of the NONE partition at one size and create
a reference frame mask based on the best rd choice. In the
full search only allow this reference frame.
Currently it is testing 64x64 and repeats this in the full search.
This does not work well with Jim's Partition code just now and
is disabled by default.

Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd
2013-07-03 16:56:06 +01:00
Paul Wilkins
b0a2871c35 Merge "Adjust Speed 0 settings." 2013-07-03 02:47:18 -07:00
Yaowu Xu
0d7b7c09cb Added a speed feature use_square_partition_only
This commit adds a speed feature where only squared partition are
evaluated in partition picking. Enable this feature in cpu-used 2
reduces encoding time by ~30%.

loss of compression:
-0.9% on cif set
-1.23% on stdhd

Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9
2013-07-02 16:40:15 -07:00
Deb Mukherjee
37501d687c Speed feature to binary search dir intramodes
This speed feature will skip searching the directional intra prediction
modes D63, D117, D27, D153 if the best intra mode so far is not one of
the diagonal, horizontal or vertical directions closest to the respective
directions being tested. In other words, this implements a sort of
binary search in the angular domain.

Speedup: about 9-10%
Results: -0.05% only on derfraw300.

Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2
2013-07-02 14:07:19 -07:00
Deb Mukherjee
8d3d2b76f3 Tx size selection enhancements
(1) Refines the modeling function and uses that to add some speed
features. Specifically, intead of using a flag use_largest_txfm as
a speed feature, an enum tx_size_search_method is used, of which
two of the types are USE_FULL_RD and USE_LARGESTALL. Two other
new types are added:
USE_LARGESTINTRA (use largest only for intra)
USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for
inter)

(2) Another change is that the framework for deciding transform type
is simplified to use a heuristic count based method rather than
an rd based method using txfm_cache. In practice the new method
is found to work just as well - with derf only -0.01 down.
The new method is more compatible with the new framework where
certain rd costs are based on full rd and certain others are
based on modeled rd or are not computed. In this patch the existing
rd based method is still kept for use in the USE_FULL_RD mode.
In the other modes, the count based method is used.
However the recommendation is to remove it eventually since the
benefit is limited, and will remove a lot of complications in
the code

(3) Finally a bug is fixed with the existing use_largest_txfm speed feature
that causes mismatches when the lossless mode and 4x4 WH transform is
forced.

Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a
pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction
(currently the benefit of modeling is limited for txfm size selection,
but keeping this enum as a placeholder) .
USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing
use_largest_txfm speed feature).

Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
2013-07-02 13:54:00 -07:00
Yunqing Wang
b12e060b55 Add speed feature to disable splitmv
Added a speed feature in speed 1 to disable splitmv for HD (>=720)
clips. Test result on stdhd set: 0.3% psnr loss and 0.07% ssim
loss. Encoding speedup is 36%.

(For reference: The test result on derf set showed 2% psnr loss
and 1.6% ssim loss. Encoding speedup is 34%. SPLITMV should be
enabled for small resolution videos.)

Change-Id: I54f72b94f506c6d404b47c42e71acaa5374d6ee6
2013-07-02 10:02:34 -07:00