360 Commits

Author SHA1 Message Date
Paul Wilkins
1a3641d91b Merge "Renaming in MB_MODE_INFO" 2013-08-15 02:12:48 -07:00
Dmitry Kovalev
bb072000e8 foreach_transformed_block_in_plane cleanup, explicit tx_size var.
Making foreach_transformed_block_in_plane more clear (it's not finished
yet). Using explicit tx_size variable consistently instead of
(ss_txfrm_size / 2) or (ss_txfrm_size >> 1) expression.

Change-Id: I1b9bba2c0a9f817fca72c88324bbe6004766fb7d
2013-08-14 11:39:31 -07:00
Paul Wilkins
26fead7ecf Renaming in MB_MODE_INFO
The macro block mode info context originally contained an
entry for each 16x16 macroblock. In VP9 each entry refers
to an 8x8 region not a macro block, so the naming is misleading.

This first stage clean up changes the names of 3 entries in the
structure to remove the mb_ prefix.

TODO clean up the nomenclature more widely in respect of
mbmi and bmi.

Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6
2013-08-14 12:47:52 +01:00
Jingning Han
78136edcdc SSE2 high precision 32x32 forward DCT
Enable SSE2 implementation of high precision 32x32 forward DCT. The
intermediate stacks are of 32-bits. The run-time goes down from
32126 cycles to 13442 cycles.

Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
2013-08-12 16:52:53 -07:00
Dmitry Kovalev
76d166e413 Removing foreach_predicted_block_uv function.
Adding function build_inter_predictors_for_planes to build inter
predictors for specified planes. This function allows to remove
condition "#if CONFIG_ALPHA" and use MAX_MB_PLANE for general case.
Renaming 'which_mv' local var to 'ref', and 'weight' argument to 'ref'.

Change-Id: I1a97160c9263006929d38953f266bc68e9c56c7d
2013-08-12 13:54:13 -07:00
Dmitry Kovalev
3c43ec206c Renaming BLOCK_SIZE_TYPES constant to BLOCK_SIZES.
There will be another change set to rename BLOCK_SIZE_TYPE enum to
BLOCK_SIZE.

Change-Id: I8d1dfc873d6186fa5e554262f5169e929978085e
2013-08-09 17:47:32 -07:00
Dmitry Kovalev
f1559bdeaf Inlining 16 as a stride for BLOCK_OFFSET macro.
Change-Id: I7f23d174eb089e5500f268a10db09648634c1b82
2013-08-09 16:40:05 -07:00
Dmitry Kovalev
cd0629fe68 Merge "Removing plane_block_{width, height}_log2by4 functions." 2013-08-09 15:26:51 -07:00
Dmitry Kovalev
ff7df102d9 Merge "Moving loopfilter struct to VP9_COMMON." 2013-08-09 15:23:00 -07:00
Dmitry Kovalev
816d6c989c Moving loopfilter struct to VP9_COMMON.
Loop filter configuration doesn't belong to macroblock, so moving it from
MACROBLOCKD to VP9_COMMON. Also moving the declaration of loopfilter struct
from vp9_blockd.h to vp9_loopfilter.h.

Change-Id: I4b3e34be9623b47cda35f9b1f9951f8c5b1d5d28
2013-08-09 14:41:51 -07:00
Dmitry Kovalev
8ffe85ad00 Moving scale_factors and related code to separate files.
Change-Id: I531829e5aee2a4a7a112d528ecccbddf052d0e74
2013-08-09 14:07:09 -07:00
Dmitry Kovalev
61c33d0ad5 Removing plane_block_{width, height}_log2by4 functions.
Change-Id: I040b82b8e32aee272d10cbb021c7ba1c76343d7a
2013-08-07 17:06:33 -07:00
Dmitry Kovalev
82d7c6fb3c Merge "Using only one scale function in scale_factors struct." 2013-08-07 16:32:09 -07:00
Dmitry Kovalev
8db2675b97 Adding ss_size_lookup table.
Removing the old one bsize_from_dim_lookup. Now we have a way to determine
block size for plane using its subsampling values (ss_size_lookup). And
then we can find the number of pixels in the block (num_pels_log2_lookup).

Change-Id: I6fc981da2ae093de81741d3d78eaefed11015db9
2013-08-07 15:33:17 -07:00
Dmitry Kovalev
1c552e79bd Using only one scale function in scale_factors struct.
Functions scale_mv_q4 and scale_mv_q3_to_q4 were almost identical except
q3->q4 conversion in scale_mv_q3_to_q4. Now q3->q4 conversion happens
directly in vp9_build_inter_predictor.

Also adding useful constants: SUBPEL_BITS and SUBPEL_MASK.

Change-Id: Ia0a6ad2ac07c45fdf95a5139ece6286c035e9639
2013-08-06 15:43:56 -07:00
Dmitry Kovalev
3f611555d7 Changing the order switchable filter enum constants.
This changeset allows to remove vp9_switchable_interp and
vp9_switchable_interp_map arrays and make code much clear. Actually we
still have to use these mapping but only inside read_interp_filter_type and
write_interp_filter_type functions.

Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50
2013-08-05 12:26:15 -07:00
Dmitry Kovalev
d007446b3f Replacing long block size enum values with shorter ones (2).
Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
2013-08-05 10:51:02 -07:00
Dmitry Kovalev
680ec32d18 Adding is_inter_block function.
Using it instead of long unclear verbose check
"mbmi->ref_frame[0] != INTRA_FRAME".

Change-Id: I9c7b4b3797942fa962bf3ba7460fff3084beabe9
2013-08-02 16:25:33 -07:00
Dmitry Kovalev
769bcab3f5 Cleaning up set_contexts_on_border function.
Change-Id: I8f21c18b29f54b277fb1c167f278f109d9f3b996
2013-08-02 15:52:26 -07:00
Dmitry Kovalev
b47153deed Replacing long block size enum values with shorter ones.
Change-Id: I0e9329490828684a4fd46f540d89114cc68e8407
2013-08-02 10:48:27 -07:00
Adrian Grange
b30a06b930 Cleanup typos, remove unnecessary lines, replace switch
Removed unnecessary code lines, replaced switch with an if,
fixed spelling errors and formatting.

Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6
2013-07-30 12:10:32 -07:00
Dmitry Kovalev
778989a097 Removing duplicated PREDICTION_PROBS constant.
Already defined in vp9_seg_common.h.

Change-Id: I5a0e3fa15966b1ebeb77ccd506b55fc231c22342
2013-07-25 11:08:21 -07:00
Dmitry Kovalev
9139ee0908 Adding condition inside get_tx_type_{4x4, 8x8, 16x16}.
Adding plane type check condition because it was always used outside of
get_tx_type_{4x4, 8x8, 16x16}.

Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230
2013-07-24 12:55:45 -07:00
Jingning Han
a5a9f5f7f3 Merge "Optimize operation flow in sub8x8 rd loop" 2013-07-22 12:08:15 -07:00
Jingning Han
409e77f2d4 Optimize operation flow in sub8x8 rd loop
Stack the rate-distortion statistics in the sub8x8 rd loop. This allows
the encoder to skip the forward transform, quantization, and coeff cost
estimation, in the sub8x8 rd optimization search, if the motion
vector(s) are of integer pixel value, and have been tested in the
previous prediction filter type rd loops of the same block.

This gives about 2% speed-up for bus_cif at 2000 kpbs, for speed 0.
Its efficacy depends how frequently the motion search will select an
integer motion vector.

Change-Id: Iee15d4283ad4adea05522c1d40b198b127e6dd97
2013-07-22 10:40:33 -07:00
Dmitry Kovalev
8962d975b2 Merge "Moving all loop filter related variables into new struct." 2013-07-20 22:45:24 -07:00
Dmitry Kovalev
ee1771ebaa Moving all loop filter related variables into new struct.
Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
Eventually it will be moved to vp9_loopfilter.h for better code structure.

Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
2013-07-19 16:19:10 -07:00
Dmitry Kovalev
97e96bc4e9 Removing frame_type field from MACROBLOCKD struct.
Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119
2013-07-19 11:55:36 -07:00
Paul Wilkins
710d10c521 Block index variables in MACROBLOCKD reduced to chars.
Change-Id: I9a4df095732d561807de01a41dcb1a1960726a3c
2013-07-19 11:32:51 +01:00
Yaowu Xu
67fb0679ee Merge "Merge scale_factors and scale_factors_uv." 2013-07-18 17:50:34 -07:00
Dmitry Kovalev
0b562b2d3d Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT).
Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34
2013-07-18 15:12:46 -07:00
Ronald S. Bultje
5ebe503f04 Merge scale_factors and scale_factors_uv.
This prevents a duplicate memcpy of a 128-byte struct every time
set_scale_factors() is called (which is a lot), thus leading to a
decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block
RD/partition loop.

Overall, this decreases encoding time of the first 50 frames of bus
@ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5%
overall speedup. We can likely get more gains by removing the copy
of the other struct (and replacing it with an indexing) as well.

Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
2013-07-18 14:10:56 -07:00
Paul Wilkins
5f4722c75f Merge "Minor cleanup in code to fine uv tx_size." 2013-07-17 02:50:09 -07:00
James Zern
5baa416b6c Merge "vp9: remove frames_{since,till}.. from MACROBLOCKD" 2013-07-16 13:00:14 -07:00
Paul Wilkins
30d2ea45ce Minor cleanup in code to fine uv tx_size.
Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e
2013-07-16 18:27:33 +01:00
Dmitry Kovalev
e8e7620a1f Merge "Removing and moving around constant definitions." 2013-07-16 00:52:53 -07:00
Yaowu Xu
5b915ebd92 Change to extend full border only when needed
This is a short term optimization till we work out a decoder
implementation requiring no frame border extension.

Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f
2013-07-15 20:52:13 -07:00
Dmitry Kovalev
ca75f1255f Removing and moving around constant definitions.
Removing unused and duplicated constants, moving them from *.h to *.c
if possible.

Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
2013-07-15 19:26:30 -07:00
James Zern
dc1d2331f6 vp9: remove frames_{since,till}.. from MACROBLOCKD
frames_since_golden / frames_till_alt_ref_frame are unused.

Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b
2013-07-13 18:02:11 -07:00
James Zern
0195fb53cb vp9: consistent 'log2' variable naming
lg2 -> log2

Change-Id: I0602ddff49e42c9c40c29c084d04b7592b9f8edf
2013-07-12 11:37:43 -07:00
Deb Mukherjee
94c481f9f1 Some minor cleanups for efficiency
Implements some of the helper functions more efficiently with
lookups rathers than branches. Modeling function is consolidated
to reduce some computations.

Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into
one because there is no need to keep them separate (even though
the semantics are a little different).

No bitstream or output change.

About 0.5% speedup

Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f
2013-07-12 10:22:56 -07:00
Dmitry Kovalev
c4ad3273c7 Moving segmentation related vars into separate struct.
Adding segmentation struct to vp9_seg_common.h. Struct members are from
macroblockd and VP9Common structs. Moving segmentation related constants
and enums to vp9_seg_common.h.

Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
2013-07-11 11:57:57 -07:00
Deb Mukherjee
7494bba66b Merge "Prunes out full-rd computation based on modeled rd" 2013-07-10 15:37:11 -07:00
Jim Bankoski
6591cf2f7e remove warnings when NDEBUG is set
Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136
2013-07-10 14:27:20 -07:00
Deb Mukherjee
53ff43adc3 Prunes out full-rd computation based on modeled rd
Adds a speed feature to eliminate full-rd computation if the modeled
rd or rd based on a different parameter in the same mode is already
a lot larger than the best rd yet.

Specifically, only search the sharp and smooth filters if the modeled
rd cost based on the  regular filter is within a certain factor of the
best rd cost so far. Also, skip full-rd computation of non splitmv
inter modes if the modeled rd cost based on pred error is within the
same factor of the best rd cost so far.

Also adds some enhancements in the rd search for splitmv mode to
speed things up by early breakouts. Negligible impact on performance.

Resuts on derfraw300:
psnr:    -0.013% with the splitmv enhancements, -0.24% with the rd
         breakout feature on.
speedup: 6% with splitmv enhancements, 20% with also residual breakout
         (tested on football sequence at 600 Kbps)

Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc
2013-07-10 13:49:49 -07:00
Jim Bankoski
863204e64d mi_width_log2 & mi_height_log2
converted to lookup to avoid unnecessary code

Change-Id: I2ee6a01f06984cc2c4ba74b3fffd215318f749d2
2013-07-10 07:26:08 -07:00
Jim Bankoski
6c8170af52 b_width_log2 and b_height_log2 lookups
Replace case statement with lookup.
    Small speed gain at low speed settings but at speed 2+ where the
    number of motion searches etc. falls the impact rises to ~3-4%.

    Change-Id: Idff639b7b302ee65e042b7bf836943ac0a06fad8

Change-Id: I5940719a4a161f8c26ac9a6753f1678494cec644
2013-07-10 07:19:09 -07:00
Dmitry Kovalev
be77f6bbbf Removing redundant struct from union b_mode_info.
Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83
2013-07-02 16:51:57 -07:00
Deb Mukherjee
8d3d2b76f3 Tx size selection enhancements
(1) Refines the modeling function and uses that to add some speed
features. Specifically, intead of using a flag use_largest_txfm as
a speed feature, an enum tx_size_search_method is used, of which
two of the types are USE_FULL_RD and USE_LARGESTALL. Two other
new types are added:
USE_LARGESTINTRA (use largest only for intra)
USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for
inter)

(2) Another change is that the framework for deciding transform type
is simplified to use a heuristic count based method rather than
an rd based method using txfm_cache. In practice the new method
is found to work just as well - with derf only -0.01 down.
The new method is more compatible with the new framework where
certain rd costs are based on full rd and certain others are
based on modeled rd or are not computed. In this patch the existing
rd based method is still kept for use in the USE_FULL_RD mode.
In the other modes, the count based method is used.
However the recommendation is to remove it eventually since the
benefit is limited, and will remove a lot of complications in
the code

(3) Finally a bug is fixed with the existing use_largest_txfm speed feature
that causes mismatches when the lossless mode and 4x4 WH transform is
forced.

Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a
pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction
(currently the benefit of modeling is limited for txfm size selection,
but keeping this enum as a placeholder) .
USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing
use_largest_txfm speed feature).

Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
2013-07-02 13:54:00 -07:00
Dmitry Kovalev
1ac0540296 Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h.
Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c
2013-07-01 17:28:08 -07:00