Commit Graph

403 Commits

Author SHA1 Message Date
Dmitry Kovalev
8962d975b2 Merge "Moving all loop filter related variables into new struct." 2013-07-20 22:45:24 -07:00
Dmitry Kovalev
f66821afbb Merge "Removing frame_type field from MACROBLOCKD struct." 2013-07-20 22:40:06 -07:00
Yaowu Xu
ea284d6281 added checks to prevent rate/distortion overflow
At speed 2, due to the threshold scheme used, it is possible the rate
and distortion assigned with INT_MAX value. The patch added checking
to prevent the INT_MAX value is used in further calculation of RD
scores. The patch also changed the assertion in rd_use_partition() to
be mirror similar assertion in rd_pick_partition().

Change-Id: Idb52c543cc1e10abdf6e6a5d6e9cb535a42214dc
2013-07-19 17:52:50 -07:00
Dmitry Kovalev
ee1771ebaa Moving all loop filter related variables into new struct.
Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
Eventually it will be moved to vp9_loopfilter.h for better code structure.

Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
2013-07-19 16:19:10 -07:00
Dmitry Kovalev
e71a4a77bb Merge "Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE)." 2013-07-19 12:14:32 -07:00
Dmitry Kovalev
97e96bc4e9 Removing frame_type field from MACROBLOCKD struct.
Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119
2013-07-19 11:55:36 -07:00
Dmitry Kovalev
c0eb57406c Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE).
Moving TX_MODE enum to vp9_enums.h. Renaming txfm_mode variables to
tx_mode.

Change-Id: I459d1af6dd928ce7fccdf8ce30b6f1ca057bef92
2013-07-19 11:37:13 -07:00
Dmitry Kovalev
afe43d4089 Removing redundant VP9_COMMON* from function signatures.
Functions: vp9_get_pred_context_switchable_interp,
           vp9_get_pred_context_intra_inter,
           vp9_get_pred_context_single_ref_p1,
           vp9_get_pred_context_single_ref_p2.

Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96
2013-07-19 11:20:49 -07:00
Ronald S. Bultje
e4686c589e Fix slightly quality drop caused at speed 1.
We would skip the rectangular blocks for sub8x8 partitions because
we would conclude that PARTITION_NONE was better than PARTITION_SPLIT,
however, that conclusion was made before we actually really tested
PARTITION_SPLIT.

Change-Id: I8fa91e59894badc1d8cee3ba8a49e40ae4c4a489
2013-07-18 17:52:08 -07:00
Ronald S. Bultje
5ebe503f04 Merge scale_factors and scale_factors_uv.
This prevents a duplicate memcpy of a 128-byte struct every time
set_scale_factors() is called (which is a lot), thus leading to a
decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block
RD/partition loop.

Overall, this decreases encoding time of the first 50 frames of bus
@ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5%
overall speedup. We can likely get more gains by removing the copy
of the other struct (and replacing it with an indexing) as well.

Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
2013-07-18 14:10:56 -07:00
Ronald S. Bultje
2d4929e340 Remove motion vectors from PARTITION_INFO.
The same information already exists in union b_mode_info.

Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d
2013-07-18 14:10:52 -07:00
Ronald S. Bultje
607424449c Merge "Best_rd breakout in rd partition search." 2013-07-17 16:10:22 -07:00
Dmitry Kovalev
a7a1e96136 Merge changes Ieffea49e,Idf610746
* changes:
  Removing two unused arguments from vp9_inc_mv signature.
  Changing signature of vp9_get_pred_probs_tx_size.
2013-07-17 14:44:20 -07:00
Ronald S. Bultje
9f427bfe98 Best_rd breakout in rd partition search.
About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which
goes from 1min36 to 1min24. Results become slightly better (+0.2% on
derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in
super_block_yrd(). Overall speed change (on derfraw300) is roughly
-13%. This can probably be improved further by caching best_yrd
between partition searches. Also, we might be able to get more
speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not
just at the sb8x8 level.

Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b
2013-07-17 09:56:46 -07:00
Yunqing Wang
df90d58f4f Speed up motion estimation using small partitions' result(experiment)
Current partition checking starts from small sizes, and then goes up
to large sizes. This experiment uses the small partitions' motion
estimation result, which is already available, to speed up the
large partition's motion estimation. We can decide to skip some
patition checkings if they are unlikely choices. We could use the
motion vector(MV) result as current partition's prediction MV, limit
the search range and reference frame.

Current result at speed 1:
psnr loss: 1.19% for stdhd, 0.287% for derf.
speed gain: 14% for sunflower(hd), 11% for akiyo.

Further improvement will be done later.

Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
2013-07-17 09:11:47 -07:00
Dmitry Kovalev
5b65a71cdc Changing signature of vp9_get_pred_probs_tx_size.
Removing VP9_COMMON* argument and adding struct tx_probs* instead of
MACROBLOCKD*.

Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c
2013-07-16 16:34:54 -07:00
Dmitry Kovalev
9482a0bf10 Cleaning up tile code.
Removing tile_rows and tile_columns from VP9Common, removing redundant
constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
vp9_get_tile_n_bits.

Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
2013-07-16 14:47:15 -07:00
Dmitry Kovalev
5de96b3ce6 Merge "Rewriting vp9_set_pred_flag_{seg_id, mbskip}." 2013-07-16 13:34:42 -07:00
James Zern
5baa416b6c Merge "vp9: remove frames_{since,till}.. from MACROBLOCKD" 2013-07-16 13:00:14 -07:00
Dmitry Kovalev
863138a2ad Rewriting vp9_set_pred_flag_{seg_id, mbskip}.
Making implementation of vp9_set_pred_flag_{seg_id, mbskip} consistent
with vp9_get_segment_id without using confusing sub(a, b) macro. Passing
mi_row and mi_col to functions explicitly instead of replying on
mb_to_right_edge and mb_to_bottom_edge.

Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435
2013-07-16 10:44:48 -07:00
Jingning Han
faff6ed0fb Skip duplicate block encoding in the rd loop
This speed feature allows the encoder to largely remove the spatial
dependency between blocks inside a 64x64 superblock, thereby removing
the need to repeatedly encode superblocks per partition type in the
rate-distortion optimization loop.

A major challenge lies in the intra modes tested in the rate-distortion
optimization loop. The subsequent blocks do not have access to the
reconstructed boundary pixels without the intermediate coding steps.
This was resolved by using the original pixels for intra prediction
in the rd loop, followed by an appropriately designed distortion
modeling on the quantization parameters. Experiments also suggested
that the performance impact is more discernible at lower bit-rate/psnr
settings. Hence a quantizer dependent threshold is applied to deactivate
skip of block coding.

For bus_cif at 2000 kbps,
speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB
         performance loss.

speed 1: runtime 65312ms  -> 61536ms, (7% speed-up) at 0.04dB
         performance loss.

This operation is currently turned on in settings of speed 1.

Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec
2013-07-15 11:08:58 -07:00
James Zern
dc1d2331f6 vp9: remove frames_{since,till}.. from MACROBLOCKD
frames_since_golden / frames_till_alt_ref_frame are unused.

Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b
2013-07-13 18:02:11 -07:00
Dmitry Kovalev
429070987a Using vp9_copy and vp9_zero instead of custom code.
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
2013-07-12 18:07:43 -07:00
Dmitry Kovalev
cc662dd768 Adding struct tx_probs and struct tx_counts to cleanup the code.
Also removing unused declarations from vp9_entropymode.h file.

Change-Id: Ib9c5826db3584a32f6bb3297a76c522b99d83402
2013-07-12 15:22:38 -07:00
Dmitry Kovalev
dd150e8ea9 Removing redundant code mostly from vp9_pred_common.{h, c}.
Removing redundant function arguments and curly braces.

Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
2013-07-11 18:39:10 -07:00
Dmitry Kovalev
c4ad3273c7 Moving segmentation related vars into separate struct.
Adding segmentation struct to vp9_seg_common.h. Struct members are from
macroblockd and VP9Common structs. Moving segmentation related constants
and enums to vp9_seg_common.h.

Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
2013-07-11 11:57:57 -07:00
Dmitry Kovalev
544d8c3316 Removing unused TOKENEXTRA arg from pick_sb_modes function.
Change-Id: I0543e72fa092eef3976b65e16bb597197c364873
2013-07-10 15:57:28 -07:00
Jim Bankoski
68ef7a6b8a configure with internal stats not working
Change-Id: I5dea4570cb05df27a522abf6e7b695998654284a
2013-07-10 15:07:53 -07:00
Jim Bankoski
6591cf2f7e remove warnings when NDEBUG is set
Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136
2013-07-10 14:27:20 -07:00
Jim Bankoski
fb027a7658 removing case statements around prediction entropy coding
Removes SEG_ID
Removes MBSKIP
Removes SWITCHABLE_INTERP
Removes INTRA_INTER
Removes COMP_INTER_INTER
Removes COMP_REF_P
Removes SINGLE_REF_P1
Removes SINGLE_REF_P2
Removes TX_SIZE

Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b
2013-07-09 20:10:16 -07:00
Dmitry Kovalev
c6c279aff0 Merge "Using mi_cols instead of mb_cols." 2013-07-08 20:09:19 -07:00
Dmitry Kovalev
1c65c580d6 Merge "Refactoring setup_pre_planes function." 2013-07-08 20:08:05 -07:00
Ronald S. Bultje
a5062cc635 Don't call encode_sb() for the final of 4-split subpartitions.
The resulting reconstruction is never used, thus it just wastes CPU
cycles. Reduces encode time of first 50 frames of bus (speed 0) @
1500kbps from 2min2.0 to 2min1.2, i.e. a 0.65% overall speedup.

Change-Id: I74755ca3aadc21e2be220f486259060bd4088c45
2013-07-08 16:22:39 -07:00
Ronald S. Bultje
ed995afba1 Make frame-wide filter-type decision fully RD-based.
Overall, on all test sets, this gains about +0.2% on all metrics.
City is a clip where this really hurts (-1.0% on all metrics), I'm
not quite sure why yet. Maybe interesting to look into in the future.

Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78
2013-07-08 16:22:37 -07:00
Dmitry Kovalev
b7559258a4 Using mi_cols instead of mb_cols.
Eliminating usage of mb-units, switching to mi-units. Adding
ALIGN_POWER_OF_TWO macro.

Change-Id: I2491c969f713207c062011878b57e4e531818607
2013-07-08 14:54:04 -07:00
Deb Mukherjee
d9b62160a0 Implements several heuristics to prune mode search
Skips mode searches for intra and compound inter modes depending
on the best mode so far and the reference frames. The various
heuristics to be used are selected by bits from a flag. The
previous direction based intra mode search pruning is also absorbed
in this framework.

Specifically the flags and their impact are:

1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique
directional modes and TM_PRED if the best so far is
an inter mode)
derfraw300: -0.15%, 10% speedup

2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153
mode search if the best so far is not one of the closest
hor/vert/diagonal directions.
derfraw300: -0.05%, about 9% speedup

3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode
search if the best so far is an intra mode)
derfraw300: -0.06%, about 7-8% speedup

4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search
if the best single ref inter mode does not have the same ref
as one of the two references being tested in the compound mode)
derfraw300: -0.56%, about 10% speedup

Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495
2013-07-08 12:17:12 -07:00
Dmitry Kovalev
f72e072555 Refactoring setup_pre_planes function.
Removing set_refs, adding set_ref function.

Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63
2013-07-03 17:42:01 -07:00
Dmitry Kovalev
430bd0c94a Merge "Replacing 64 / MI_SIZE with MI_BLOCK_SIZE." 2013-07-03 14:16:02 -07:00
Dmitry Kovalev
5a21de8418 Replacing 64 / MI_SIZE with MI_BLOCK_SIZE.
Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c
2013-07-03 10:54:50 -07:00
Paul Wilkins
72c5778ec5 Added two new skip experiments.
sf->unused_mode_skip_lvl. Tests modes as normal for all
sizes at or below the given level. At larger sizes it skips
all modes that were not chosen at any smaller size.
Hence setting BLOCK_SIZE_SB64X64 is in effect off.
Setting BLOCK_SIZE_AB4X4 will only consider modes that
were chosen for one or more 4x4 blocks at larger sizes.

sf->reference_masking.
Do a test encode of the NONE partition at one size and create
a reference frame mask based on the best rd choice. In the
full search only allow this reference frame.
Currently it is testing 64x64 and repeats this in the full search.
This does not work well with Jim's Partition code just now and
is disabled by default.

Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd
2013-07-03 16:56:06 +01:00
Dmitry Kovalev
1f6e95e76a Merge "Removing redundant struct from union b_mode_info." 2013-07-02 18:09:31 -07:00
Dmitry Kovalev
be77f6bbbf Removing redundant struct from union b_mode_info.
Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83
2013-07-02 16:51:57 -07:00
Yaowu Xu
0d7b7c09cb Added a speed feature use_square_partition_only
This commit adds a speed feature where only squared partition are
evaluated in partition picking. Enable this feature in cpu-used 2
reduces encoding time by ~30%.

loss of compression:
-0.9% on cif set
-1.23% on stdhd

Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9
2013-07-02 16:40:15 -07:00
Deb Mukherjee
8d3d2b76f3 Tx size selection enhancements
(1) Refines the modeling function and uses that to add some speed
features. Specifically, intead of using a flag use_largest_txfm as
a speed feature, an enum tx_size_search_method is used, of which
two of the types are USE_FULL_RD and USE_LARGESTALL. Two other
new types are added:
USE_LARGESTINTRA (use largest only for intra)
USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for
inter)

(2) Another change is that the framework for deciding transform type
is simplified to use a heuristic count based method rather than
an rd based method using txfm_cache. In practice the new method
is found to work just as well - with derf only -0.01 down.
The new method is more compatible with the new framework where
certain rd costs are based on full rd and certain others are
based on modeled rd or are not computed. In this patch the existing
rd based method is still kept for use in the USE_FULL_RD mode.
In the other modes, the count based method is used.
However the recommendation is to remove it eventually since the
benefit is limited, and will remove a lot of complications in
the code

(3) Finally a bug is fixed with the existing use_largest_txfm speed feature
that causes mismatches when the lossless mode and 4x4 WH transform is
forced.

Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a
pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction
(currently the benefit of modeling is limited for txfm size selection,
but keeping this enum as a placeholder) .
USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing
use_largest_txfm speed feature).

Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
2013-07-02 13:54:00 -07:00
Dmitry Kovalev
3140c443e4 Merge "Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h." 2013-07-02 11:31:35 -07:00
Jim Bankoski
d4158283e7 use partitioning from last frame
This cl converts use partition from last frame to do the following:

if part is none,horz, vert -> try split
if part != none and one of the children is not split - try none


Change-Id: I5b6c659e35f3ac9f11c051b92ba98af6d7e8aa87
Signed-off-by: Jim Bankoski <jimbankoski@google.com>
2013-07-01 18:18:50 -07:00
Dmitry Kovalev
1ac0540296 Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h.
Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c
2013-07-01 17:28:08 -07:00
Yaowu Xu
632289b31f fix a mismatch in cpuused 2
Change-Id: I921c9faba6386535aaf717a54301dd346a9b8540
2013-07-01 08:54:50 -07:00
Dmitry Kovalev
59070f6e3c Merge "Removing CONFIG_DEBUG checks on assertions." 2013-06-28 14:03:28 -07:00
Dmitry Kovalev
0345fc3ad9 Merge "Decoder's code cleanup." 2013-06-28 10:38:54 -07:00
Dmitry Kovalev
8e6ce6bb9e Removing CONFIG_DEBUG checks on assertions.
Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated
ones from vp9_onyx_int.h and vp9_onyxd_int.h.

Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6
2013-06-28 10:36:20 -07:00
Yaowu Xu
64bb996e03 Merge "Optimize partition search order" 2013-06-28 09:29:39 -07:00
Yaowu Xu
1374a06bd8 Optimize partition search order
This commit change the partition search order to allow checking of
rectangular partition to be done after square partitions. It also
added a speed feature to skip rectangular partition check when
NONE is better than SPLIT in RD sense.

This feature roughly speed up encoder by 1.5X with loss on compression
-0.91% on cif set
-0.56% on stdhd set

Change-Id: I0d2d06993041aa9ea9073fcc39c54f73a127dfa4
2013-06-28 07:13:54 -07:00
Ronald S. Bultje
fd4eed3b08 Fix tile independence with both column tiling and static_thresh set.
Change-Id: I0b2be0ec2c410a527f88b95a44f24ac967b2dac1
2013-06-27 21:56:40 -07:00
Dmitry Kovalev
3231da0a9e Decoder's code cleanup.
Using vp9_set_pred_flag function instead of custom code, adding
decode_tokens function which is now called from decode_atom,
decode_sb_intra, and decode_sb.

Change-Id: Ie163a7106c0241099da9c5fe03069bd71f9d9ff8
2013-06-27 16:15:43 -07:00
Dmitry Kovalev
a3664258c5 Merge "General cleanup in segmentation-related code." 2013-06-27 14:57:07 -07:00
Paul Wilkins
59af9049d3 Merge "Start adaptive threshold for each mode at max." 2013-06-27 02:28:36 -07:00
Jingning Han
bd9bac0391 Remove empty function vp9_build_block_offsets
This function is empty, hence is removed.

Change-Id: Ia9d01710806bffe0398a6dc9405f8a5a81b27d74
2013-06-26 14:55:47 -07:00
Dmitry Kovalev
be07485e9a General cleanup in segmentation-related code.
Using consistent function and variable names.

Change-Id: I2deb3fded8797453a2081836c9ce2e79ade06eb7
2013-06-26 10:27:28 -07:00
Paul Wilkins
689957e3ad Start adaptive threshold for each mode at max.
Each frame we reset all adaptive thresholds to MAX
rather than base. As modes are picked their thresholds
drop down.

Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8
2013-06-26 17:04:47 +01:00
Dmitry Kovalev
70e9622185 Merge "Removing find_seg_id and using vp9_get_pred_mi_segid instead." 2013-06-25 10:16:06 -07:00
Yaowu Xu
e371cd73a3 change to enable use_largest_txform feature
for all regular inter frames at speed 1

Change-Id: I0a8b301273ecf2b8730ab1f6b7a05f89f4d498e0
2013-06-24 16:43:26 -07:00
Dmitry Kovalev
40141681c0 Removing find_seg_id and using vp9_get_pred_mi_segid instead.
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
2013-06-21 13:05:10 -07:00
Ronald S. Bultje
54b2a59623 Implement SSE2 block_error.
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.

Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.

Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
2013-06-21 12:54:52 -07:00
Yaowu Xu
ee07a261a0 rename variables to avoid build error in MSVC
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
2013-06-20 18:31:48 -07:00
Jim Bankoski
9f2a1ae23e adds force partitioning greater than or less than block size
adds a new speed feature to force partitioning to be greater than
or less than a certain size

Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
2013-06-20 09:51:42 -07:00
Jim Bankoski
18bdf708e7 adds a set partitioning to speed features
this feature lets you set a partitioning size to be used by the entire
frame.

Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
2013-06-20 09:50:44 -07:00
Jim Bankoski
476d73d294 partition by variance using var from last frame
This uses variance to split partition. Variance is calculated using
nearest mv,  always from last ref frame.

Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
2013-06-20 09:48:22 -07:00
Jim Bankoski
727fa7b1e4 new partition via variance
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
2013-06-20 09:42:05 -07:00
Jim Bankoski
0fad6a9d99 fix to set up new speed feature
This uses the speed feature functionality for code.

Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
2013-06-20 09:35:02 -07:00
Jim Bankoski
df2314cfdd don't copy partitions for key frames or altrefs
force us to go through slow partitioning for keyframes, altref and
overlays.

Change-Id: I1a286361bf74083e71973575a7296be46eb98742
2013-06-20 09:34:32 -07:00
Jim Bankoski
fbcce4dd6f Merge "copy partitioning from last fame" 2013-06-20 09:32:43 -07:00
Jim Bankoski
f033b44e74 copy partitioning from last fame
Change-Id: I26e80ede80cb4389378a95afa95d229092a9859a
2013-06-20 09:32:19 -07:00
Jingning Han
7088426976 Merge "Make fdct32 computation flow within 16bit range" 2013-06-18 11:40:14 -07:00
Jingning Han
a41a4860c0 Make fdct32 computation flow within 16bit range
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.

This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.

Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
2013-06-18 09:46:24 -07:00
Dmitry Kovalev
686b99741c Removing vp9_invtrans.{c, h} files.
Moving single function from vp9_invtrans.c to vp9_encodemb.c.

Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7
2013-06-17 16:09:03 -07:00
Ronald S. Bultje
8a0808a145 Fix row tiling.
Change-Id: I57be4eeaea6e4402f6a0cc04f5c6b7a5d9aedf9b
2013-06-12 13:42:59 -04:00
Jim Bankoski
fca6c82b29 Fix rd partition search for corner blocks
This commit enables proper partition type search for the bottom-
right corner blocks.

Change-Id: Id1123d0e4e81eba648ed4f3c0c7ab587e174f650
2013-06-11 09:29:21 -07:00
Ronald S. Bultje
eedd98ac0a Fix crash on RD iterations with segmentation enabled.
Change-Id: I3baf93c2fa5c2f7f45c6bc5514d317040975da71
2013-06-10 10:42:09 -07:00
Yaowu Xu
c08317e4f2 Merge "Fix the rd loop over partition types" into experimental 2013-06-08 13:30:27 -07:00
Deb Mukherjee
17da2cab78 TX_SIZE contexts simplification.
Reduces TX_SIZE contexts to 2 for each kind. The code is
cleaner and there is hardly any performance difference with
more than two contexts.

Results: almost neutral

Change-Id: I17656bd6db76224ae2856adf882504560e7dbaa4
2013-06-08 12:32:26 -07:00
Jingning Han
e1d63c010e Fix the rd loop over partition types
This commit enables boundary blocks properly tested over allowable
partition types.

Change-Id: I405a9a46ddcfa0c7af2b63e3644cabfa3b6a951d
2013-06-07 23:36:35 -07:00
Yaowu Xu
b7da6d0c5a Merge "Handle partition type coding of boundary blocks" into experimental 2013-06-07 18:16:16 -07:00
Deb Mukherjee
21401942b0 Coding tx-size selection by use of spatial context
Adds coding of transform size within a frame by use of context
of transform sizes selected in left and above blocks.

Also incorporates code for generating stats.

TODO: generate and incorporate new default stats

Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6
2013-06-07 16:07:58 -07:00
Deb Mukherjee
869a39ba60 Cleans up mbskip encoding
Refactors mbskip coding to be compatible with coding of the rest of
the symbols. Adds forward/backward adaptation and removes a lot of
the legacy code.

Results:
fast50: +1.6%
derfraw300: +0.317%

Change-Id: I395a2976d15af044d3b8ded5acfa45f6f065f980
2013-06-07 16:00:26 -07:00
Jingning Han
78b8190cc7 Handle partition type coding of boundary blocks
The partition types of blocks sitting on the frame boundary are
constrained by the block size and the position of each sub-block
relative to the frame. Hence we use truncated probability models
to handle the coding of such information.

100 frames run:
yt 0.138%

Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d
2013-06-07 14:19:40 -07:00
Ronald S. Bultje
6462afe088 Fix ref_frame segment feature when it is intra.
Change-Id: Ifbf790c14cee0c08a27f6728e3c637404e1f8477
2013-06-07 13:57:55 -07:00
Paul Wilkins
340c7a48e6 Change to segment ref frame feature.
Simplify feature to only support a single reference frame
instead of a mask.

Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb
2013-06-07 21:42:22 +01:00
Deb Mukherjee
78fbaf4d84 Merge "Coding updates for tx-size selection" into experimental 2013-06-07 09:19:36 -07:00
Deb Mukherjee
3ee1a21a42 Coding updates for tx-size selection
Changes to the coding of transform sizes, along with forward
and backward probability updates.

Results:
derf300: +0.241%

Context based coding of transform sizes will be in a separate
patch.

Change-Id: I97241d60a926f014fee2de21fa4446ca56495756
2013-06-07 08:54:00 -07:00
Paul Wilkins
653a25569b Compound inter encoder bug fix.
In the longer term the encoder should allow compound as long
as one of the buffers has opposite sign bias and as per the decoder
this buffer is then set as the fixed reference. However at the moment
the encoder and RD loop only supports the case where the ALTREF_FRAME
buffer (or third of the 3 allowed in any given frame) is the odd one out.

This patch fixes a bug that would allow compound inter and set
fixed ref to ALTREF_FRAME when it is not the odd one out.

Change-Id: Ic83a69486e088a147ba83a4aedc2a0042f6b3721
2013-06-07 12:31:54 +01:00
Yaowu Xu
e127bdc04c fix a typo
Change-Id: I8fd21e3a8435b873c5687d8b273922fc60988295
2013-06-06 22:25:13 -07:00
Ronald S. Bultje
6ef805eb9d Change ref frame coding.
Code intra/inter, then comp/single, then the ref frame selection.
Use contextualization for all steps. Don't code two past frames
in comp pred mode.

Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95
2013-06-06 17:28:09 -07:00
Ronald S. Bultje
ad34368786 New intra mode and partitioning probabilities.
Split partition probabilities between keyframes and non-keyframes,
since they are fairly different. Also have per-blocksize interframe
y intramode probabilities, since these vary heavily between different
blocksizes.

Lastly, replace default probabilities for partitioning and intra modes
with new ones generated from current codec. Replace counts with actual
probabilities also.

Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f
2013-06-06 10:45:30 -07:00
Jim Bankoski
5a88271b09 don't tokenize & encode tokens for blocks in UMV
This avoids encoding tokens for blocks that are entirely
in the UMV border. This changes the bitstream.

Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9
2013-06-06 06:10:25 -07:00
Deb Mukherjee
83885235a7 Clean-ups on switchable interpolation and mv_ref
Adds backward adaptation and differential forward updates of switchable
interpolation filter probabilities. Also adds some cosmetic cleanups
and minor fixes on mv_ref probabilities.

derfraw300: +0.353% (with most coming from switchable interp changes)

Change-Id: Ie2718be73528c945fd0d80cfd63ca2d9cb3032de
2013-06-05 10:11:52 -07:00
Dmitry Kovalev
3b9ec31eaf Replacing memcpy with struct assignment.
Change-Id: Ib557cc6351404b9e178e95a545883eb3666f11f0
2013-05-31 16:00:32 -07:00
Dmitry Kovalev
75cf80ee8e Adding new encode_txfm function.
Moving some code from vp9_pack_bitstream to encode_txfm function.

Change-Id: Icc25d6083e54f09886216fea632ceac002042d7f
2013-05-31 12:33:44 -07:00
Ronald S. Bultje
a288cb3b10 Merge "Merge all various transform size data trackers into single variables." into experimental 2013-05-31 09:59:24 -07:00
Scott LaVarnway
1e025dbfd1 Merge "Moved use_prev_in_find_mv_refs check to frame level" into experimental 2013-05-31 09:35:51 -07:00
Ronald S. Bultje
e9d68a5e36 Merge all various transform size data trackers into single variables.
Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db
2013-05-31 09:18:59 -07:00
Jim Bankoski
21595f8e38 Merge "Creates a new speed 1:" into experimental 2013-05-30 20:36:05 -07:00
Jim Bankoski
ced21bd6a6 Creates a new speed 1:
This speed 1 - uses variance threshold stolen from static-thresh
to determine split.  Any superblock with greater than the variance
set by static thresh * quantizer index squared is split. In addition
transform size is set to largest size less than or equal to partition
size, sub pixel filter is set to normal,  and only 12 modes are used
at all.

Change-Id: If7a2858ee70f96d1eb989c04fd87a332b147abef
2013-05-30 19:53:00 -07:00
Ronald S. Bultje
e6485581fe Remove splitmv.
We leave it in rdopt.c as a local define for now - this can be removed
later. In all other places, we remove it, thereby slightly decreasing
the size of some arrays in the bitstream.

Change-Id: Ic2a9beb97a4eda0b086f62c039d994b192f99ca5
2013-05-30 17:21:01 -07:00
Ronald S. Bultje
98c192ae83 Merge all intra mode coding trees into a single one.
Also merge all counters. This removes a few unused probability updates
from the bitstream.

Change-Id: I20f58853e9dac84d8c0d9703ae012c55917516eb
2013-05-30 09:58:53 -07:00
Scott LaVarnway
353642bc53 Moved use_prev_in_find_mv_refs check to frame level
This patch checks at the frame level to see if the previous
mode info context can be used.  This patch eliminates the
flag check that was done for every mode and removes another
check that was done prior to every vp9_find_mv_refs().

Change-Id: I9da5e18b7e7e28f8b1f90d527cad087073df2d73
2013-05-29 16:42:23 -04:00
Ronald S. Bultje
5cac66078e Remove splitmv.
Also do per-partition motion vector referencing in <sb8x8 partitions,
and adjust mvref finding for sub8x8 partitions.

Change-Id: Id3ed1ed4d2a8910d11d327db6cc63b8eb79f941f
2013-05-26 14:40:49 -07:00
Jingning Han
d093027791 Fix transform size coding mismatch
This commit fixes a transform size enc/dec mismatch issue in the
key frame coding.

Change-Id: I0c4f40464a367b33dd91ace84506650b1aec2873
2013-05-24 11:30:58 -07:00
Yaowu Xu
a2db88fc26 Fix two bugs
1) Added an initialization of rd_tx_select_threshs[].
2) Made updating transform size counts to be consistent

Change-Id: Iaa9d6c6be825b0364c9d61a9802873d01356815c
2013-05-24 09:28:19 -07:00
Yaowu Xu
f116abf774 update txfm size counting
Change-Id: I3a26baf8b2f945fea4f1aea156e60fa79f620f86
2013-05-23 18:27:07 -07:00
Jingning Han
7ac5ac52f9 Merge 4x4 block level partition into codebase
Move 4x4/4x8/8x4 partition coding out of experimental list.

This commit fixed the unit test failure issues. It also resolved
the merge conflicts between 4x4 block level partition and iterative
motion search for comp_inter_inter.

Change-Id: I898671f0631f5ddc4f5cc68d4c62ead7de9c5a58
2013-05-23 11:58:50 +01:00
Jingning Han
d2cacdc530 Merge "Make the intra rd search support 8x4/4x8" into experimental 2013-05-22 10:00:15 -07:00
Yaowu Xu
8ba92a0bed changes intra coding to be based on txfm block
This commit changed the encoding and decoding of intra blocks to be
based on transform block. In each prediction block, the intra coding
iterates thorough each transform block based on raster scan order.

This commit also fixed a bug in D135 prediction code.

TODO next:
The RD mode/txfm_size selection should take this into account when
computing RD values.

Change-Id: I6d1be2faa4c4948a52e830b6a9a84a6b2b6850f6
2013-05-22 11:53:19 +01:00
Yaowu Xu
232d90d8fd Generalized intra 4x4 encoding for all sizes
Change-Id: I1b86744fa247233c8df031b3f4b87b212c8dd094
2013-05-22 11:44:12 +01:00
Jingning Han
f153a5d063 Make the intra rd search support 8x4/4x8
This commit allows the rate-distortion optimization of intra coding
capable of supporting 8x4 and 4x8 partition settings.

It enables the entropy coding of intra modes in key frame using a
unified contextual probability model conditioned on its above/left
prediction modes.

Coding performance:
derf 0.464%

Change-Id: Ieed055084e11fcb64d5d5faeb0e706d30268ba18
2013-05-21 21:03:00 -07:00
John Koleszar
ddf13be8ef Merge "Initial version of alpha channel support" into experimental 2013-05-21 17:29:51 -07:00
Jingning Han
1f7d810a72 Merge "Deprecate 4x4 intra modes from bit-stream" into experimental 2013-05-21 15:59:58 -07:00
Scott LaVarnway
1db6373267 Merge "WIP: 4x4 idct/recon merge" into experimental 2013-05-21 10:45:53 -07:00
Dmitry Kovalev
4ac70bd7d3 Adding get_ref_frame_idx function.
Change-Id: I4f1a4eca6794cda78d00512196caacd5567e2dcc
2013-05-20 16:09:00 -07:00
Jingning Han
1e6be7bc75 Deprecate 4x4 intra modes from bit-stream
Replace B_DC_PRED like syntax element writing/reading with sb_ymode
set (e.g., DC_PRED, etc).

Change-Id: I293006a6b3bcd130c08ea9f053e7a79c6819c6f8
2013-05-20 12:14:24 -07:00
Scott LaVarnway
ba48a11130 WIP: 4x4 idct/recon merge
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.

Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
2013-05-20 13:03:17 -04:00
Jingning Han
810b612c23 Enable bit-stream support to 8x4 and 4x8 partition
The recursive partition type search is enabled down to 4x4, 4x8 and
8x4, followed by the corresponding rate-distortion optimization for
the per-partition encoding mode decisions.

The bit-stream writing/reading synchronized in supporting the
rectangular partition of 8x8 block.

This provides above 1% coding performance gains on derf.

To do next:
1. re-design the rate-distortion loop for inter prediction below 8x8.
2. re-design the rate-distortion loop for intra prediction below 4x4.
3. make the loop-filter aware of rectangular partition of 8x8 block.
4. clean the unused probability models.
5. update default probability values.

Change-Id: Idd41a315b16879db08f045a322241f46f1d53f20
2013-05-19 14:59:04 -07:00
Jingning Han
49068dd985 Merge "Refactor encode_sb_ for 4x8/8x4 partition" into experimental 2013-05-17 09:19:48 -07:00
Paul Wilkins
51bc4bf4a0 Remove MODE_STATS flag and code
Change-Id: I6c70a8a8a4633399842ac74792003ae5f7859ffa
2013-05-17 12:34:10 +01:00
John Koleszar
679e4abdd5 Initial version of alpha channel support
This is a mostly-working implementation of an extra channel in the
bitstream. Configure with --enable-alpha to test. Notable TODOs:

 - Add extra channel to all mismatch tests, PSNR, SSIM, etc
 - Configurable subsampling
 - Variable number of planes (currently always uses all 4)
 - Loop filtering
 - Per-plane lossless quantizer
 - ARNR support

This implementation just uses the same contents as the Y channel
for the A channel, due to lack of content and general pain in
playing back 4 channel content. A later patch will use the actual
alpha channel passed in from outside the codec.

Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592
2013-05-16 22:21:09 -07:00
Jingning Han
5a1c953310 Refactor encode_sb_ for 4x8/8x4 partition
Deprecate set_block_index. Replace it with get_sb_index_ for
consistency with partition search and bit-stream writing/reading.

Use b_width/height_log2 instead of mi_width/height_log2, to support
4x4 resolution partition types.

Change-Id: Ic1e71981e163c669f7ea6b3c12b831c284c4a494
2013-05-16 16:33:31 -07:00
John Koleszar
f07602e403 Merge "Remove vp9_extend_mb_row()" into experimental 2013-05-16 15:22:38 -07:00
John Koleszar
16ac5a5cde Remove vp9_extend_mb_row()
This code is no longer needed for correct intra prediction.

Change-Id: I822d1a8b0ad0a00e7c4c6e7b2931790c39d1267d
2013-05-16 11:56:00 -07:00
Dmitry Kovalev
b0c101e2b4 Removing lossless flag from the bitstream.
Change-Id: If6aee510cbc4910f2f24fcd92dddc65fdf8edeea
2013-05-15 18:20:51 -07:00
Jingning Han
1f26840fbf Enable recursive partition down to 4x4
This commit allows the rate-distortion optimization recursion
at encoder to go down to 4x4 block size. It deprecates the use
of I4X4_PRED and SPLITMV syntax elements from bit-stream
writing/reading. Will remove the unused probability models in
the next patch.

The partition type search and bit-stream are now capable of
supporting the rectangular partition of 8x8 block, i.e., 8x4
and 4x8. Need to revise the rate-distortion parts to get these
two partition tested in the rd loop.

Change-Id: I0dfe3b90a1507ad6138db10cc58e6e237a06a9d6
2013-05-14 12:39:56 -07:00
Jingning Han
6910f178f1 Use consistent partition context setup in enc/dec
Move set_partition_seg_context_ to common file. Use consistent
context setup conditions for partition probability model update at
encoder and decoder.

Change-Id: I24b7ed3b1c48e3d2568191a46b70136b99b67b1a
2013-05-11 15:22:13 -07:00
Jingning Han
2117d4ee96 Move get_sb_index to vp9_blockd.h
Use common function to fetch/assign sb_index in rd loop, bit-stream
writing and reading.

Change-Id: I1d8a214a57ed9cbcd026040436ef33e5e39d65b7
2013-05-11 13:26:59 -07:00
John Koleszar
64667d5af7 Merge "Subsampling aware allocs and bitstream" into experimental 2013-05-10 16:55:00 -07:00
Jingning Han
e44678c061 Enable recursive partition type search
This commit enables the search for the optimal superblock
partition types in the recursion form. The intention is to
make the optimization process more concise and ready to
support partition down to 4x4 block size next.

Change-Id: Iae279a67df3a7cc372553c84c775bc4d2f3e4336
2013-05-10 10:13:10 -07:00
John Koleszar
da58436f43 Subsampling aware allocs and bitstream
Make framebuffer allocations according to the chroma subsamping
factors in use. A bit is placed in the raw part of the frame header for
each of the two subsampling factors. This will be moved in a future
commit to make them part of the TBD feature set bits, probably only set
on keyframes, etc.

Change-Id: I59ed38d3a3c0d4af3c7c277617de28d04a001853
2013-05-09 17:50:12 -07:00
Jingning Han
4a88ad89fd Extend left/above partition context to per mi(8x8)
Update and buffer left/above partition information context per 8x8
block. This allows to further enable recursive partition down to
4x4 block size, and hence deprecating I4X4_PRED and SPLITMV.

This commit also fixes a context buffer swap/restore issue in 32x32
partition type search. This gives 0.1% performance gain for derf/yt.
Will refactor the superblock partition type search into recursion
form.

Change-Id: Ib61975aca5f12b78d8018481d7fa1393d085689b
2013-05-08 10:20:34 -07:00
Paul Wilkins
a14ae84749 Deprecate code_zerogroup experiment.
Delete code under the CONFIG_CODE_ZEROGROUP flag.

Change-Id: I5fe6c7b42a5da9b73118e33594301da4129f320a
2013-05-07 16:52:55 -07:00
Paul Wilkins
1ed57a6a62 Deprecate comp_interintra_pred experiment.
Delete code under the CONFIG_COMP_INTERINTRA_PRED
flag.

Change-Id: I3d1079cf46305c08f7e11d738596ea112e7b547f
2013-05-07 16:24:08 -07:00
Paul Wilkins
8c1b516d10 Deprecate the newbintramode experiment.
Clean out code relating to newbintramode.

Change-Id: Ie91f4f156cdf60ce0da8ca407c1c9cb00c7d0705
2013-05-07 16:00:59 -07:00
Jingning Han
c0504a9b24 Merge "Merge SB8X8 into the codebase" into experimental 2013-05-07 09:23:47 -07:00
Jingning Han
776c1482a3 Merge SB8X8 into the codebase
Pull sb8x8 out of experimental list. verified via borg run tests.
Fixed unit test failures.

Change-Id: I12a4bbd17395930580c048ab68becad1ffe46e76
2013-05-07 09:08:25 -07:00
Scott LaVarnway
cb7955d83e Removed vp9_setup_intra_recon()
This setup is now handled by vp9_build_intra_predictors()
when left_available and/or up_available is zero.

Change-Id: I59cec0ab95f8be69ce885fd20727510e4deef8a0
2013-05-06 16:13:06 -04:00
John Koleszar
6c622e2783 Merge "Separate transform and quant from vp9_encode_sb" into experimental 2013-05-03 17:19:01 -07:00
John Koleszar
4529c68b3b Separate transform and quant from vp9_encode_sb
This allows removing a large number of transform size specific functions,
as well as supporting 444/alpha by routing all code through the
subsampling-aware path.

Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4
2013-05-03 12:14:50 -07:00
John Koleszar
f07733010b Merge "Create common vp9_encode_sb{,y}" into experimental 2013-05-03 10:26:53 -07:00
Ronald S. Bultje
034928843f Fix use of wrong rate/distortion variables in 16x8 r/d check.
Change-Id: Ib5961b4c8ca84d54c84b2651a4e0317c72fe7da4
2013-05-02 21:03:38 -07:00
Ronald S. Bultje
1069c12cf4 Fix 16x16-iteration indexing bug in main encode_sb_row loop.
With this, encoder/decoder appear to match with sb8x8 experiment.
Needs some larger-scale testing.

Change-Id: I44d3cac37b3c98264985ed0a0fc763c30089aa64
2013-05-02 16:41:08 -07:00
Jingning Han
879a2f053d Fix state update in sb8x8 rate-distortion loop
Update mode_info of 8x8 blocks within the scope of current block.

Change-Id: I110c599e60664a5acde6afd919b107cea8419a0d
2013-05-02 14:41:51 -07:00
John Koleszar
3f4e80634b Create common vp9_encode_sb{,y}
Creates a common encode (subtract, transform, quantize, optimize,
inverse transform, reconstruct) function for all sb sizes, including
the old 16x16 path.

Change-Id: I964dff1ea7a0a5c378046a069ad83495f54df007
2013-05-02 14:02:03 -07:00
Ronald S. Bultje
e931dac733 Fix i4x4 mode reading and writing in sb8x8 bitstream.
Don't allow i4x4 except for sb8x8 recursion step. Read only 4 (not 16)
i4x4 submodes if we are i4x4.

Change-Id: Iaaaced1a134006b2c96eed66f014300eae41e0ed
2013-05-02 13:01:09 -07:00