Commit Graph

475 Commits

Author SHA1 Message Date
Jingning Han
84f3b76e1c Fix rectangular partition check flag
Put rectangular partition check flag change according to the rd
costs of NONE and SPLIT partition types under the speed feature.

Change-Id: If681e1e078a8d43d86961ea4b748da5cd1b6c331
2013-08-22 17:15:01 -07:00
Dmitry Kovalev
604022d40b vp9_encodeframe.c cleanup.
Removing unused get_sbuv_perpixel_variance function, using has_second_ref/
is_inter_block functions, organizing includes.

Change-Id: I016de4af12fbbb8b4ece26a70759b2392651b095
2013-08-22 15:50:51 -07:00
James Zern
40ae02c247 rename LOG2_* defines to *_LOG2
gets rid of a mix of styles

Change-Id: I3591d312157bc6f53a25438bf047765c671fd8a8
2013-08-22 14:45:24 -07:00
Jingning Han
01a37177d1 Refactor rd_pick_partition for parameter control
This commit changes the partition search order of superblocks from
{SPLIT, NONE, HORZ, VERT} to {NONE, SPLIT, HORZ, VERT} for
consistency with that of sub8x8 partition search. It enable the use
of early termination in partition search for all block sizes.

For ped_area_1080p 50 frames coded at 4000 kbps, it makes the runtime
goes down from 844305ms -> 818003ms (3% speed-up) at speed 0.

This will further move towards making the in-search partition types
configurable, hence unifying various speed-up approaches.

Some speed 1 and 2 features are turned off during the refactoring
process, including:
disable_split_var_thresh
using_small_partition_info

Stricter constraints are applied to use_square_partition_only for
right/bottom boundary blocks. Will bring back/refine these features
subsequently. At this point, it makes derf set at speed 1 about
0.45% higher in compression performance, and 9% down in run-time.

Change-Id: I3db9f9d1d1a0d6cbe2e50e49bd9eda1cf705f37c
2013-08-22 12:36:02 -07:00
Deb Mukherjee
8b810c7a78 Fixes on feature disabling split based on variance
Adds a couple of minor fixes, which may be absorbed in Jingning's
patch. Thanks to Guillaume for pointing these out.
Also adjusts the thresholds for speed 1 and 2 to 16 and 32
respectively, to keep quality drops small.

Results:
--------
derfraw300:  threshold = 16, psnr -0.082%, speedup 2-3%
             threshold = 32, psnr -0.218%, speedup 5-6%
stdhdraw250: threshold = 16, psnr -0.031%, speedup 2-3%
             threshold = 32, psnr -0.273%, speedup 5-6%

Change-Id: I4b11ae8296cca6c2a9f644be7e40de7c423b8330
2013-08-22 07:05:44 -07:00
Scott LaVarnway
f39bf458e5 Merge "Initialize mb_skip_coeff before picking modes" 2013-08-22 06:26:04 -07:00
Scott LaVarnway
94bfbaa84e Initialize mb_skip_coeff before picking modes
It appears that the above/left mb_skip_coeff used during
the pick modes, is left over from the previously
encode frame.  This patch initializes the flag to the default
value of zero.


Change-Id: Ida4684cc99611d6e3e82628db35ed717e28ce550
2013-08-22 08:51:04 -04:00
Dmitry Kovalev
048ccb2849 Cleaning up sum_intra_stats function.
Using size_group_lookup table and better variable names.

Change-Id: I6e67f2ce091845db43ace7d21b7ae31c6f165aec
2013-08-21 16:25:02 -07:00
Adrian Grange
ce28d0ca89 Fix typos and minor stylistic cleanup
Change-Id: I32e43474e8651ef2eb181d24860a8f118cfea7bf
2013-08-21 08:45:42 -07:00
Paul Wilkins
e8923fe492 Changes to auto partition size selection.
Changes to code to auto select a partition size range
based on data from spatial neighbors.

Now looks at the sb_type in each 8x8 block of above
and left SB64.

The effect on speed 1 is now weaker giving better
quality but less speed gain. Now also used in speed 2.

Change-Id: Iace33a97d5c3498dd2a9a8a4067351941abcbabc
2013-08-20 14:05:39 +01:00
Yaowu Xu
c4048dbdd3 Change to limit the mv search range
As the pixel values beyond image border are duplicates of pixels
on edge, the change limits the mv search range, any mv beyond
the limits no longer produce new/different prediction values
as entire block with pixels used for subpel interpolation are
outside image border.

Change-Id: I4c6fdf06e33c1cef1489f5470ce0fb4e5e01fb79
2013-08-19 17:19:36 -07:00
Dmitry Kovalev
26e5b5e25d Removing unused or redundant arguments from *_args structures.
Redundant dst, pre[2] from build_inter_predictors_args, unused cm from
encode_b_args.

Change-Id: I2c476cd328c5c0cca4c78ba451ca6ba2a2c37e2d
2013-08-16 12:51:20 -07:00
Dmitry Kovalev
b7616e387e Moving segmentation struct from MACROBLOCKD to VP9_COMMON.
VP9_COMMON is the right place to segmentatation struct because it has
global segmentation parameters, not something specific to macroblock
processing.

Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708
2013-08-15 10:47:48 -07:00
Deb Mukherjee
24856b6abc Speed feature to skip split partition based on var
Adds a speed feature to disable split partition search based on a
given threshold on the source variance. A tighter threshold derived
from the threshold provided is used to also disable horizontal and
vertical partitions.

Results on derfraw300:
threshold = 16, psnr = -0.057%, speedup ~1% (football)
threshold = 32, psnr = -0.150%, speedup ~4-5% (football)
threshold = 64, psnr = -0.570%, speedup ~10-12% (football)

Results on stdhdraw250:
threshold = 32, psnr = -0.18%, speedup is somewhat more than derf
because of a larger number of smoother blocks at higher resolution.

Based on these results, a threshold of 32 is chosen for speed 1,
and a threshold of 64 is chosen for speeds 2 and above.

Change-Id: If08912fb6c67fd4242d12a0d094783a99f52f6c6
2013-08-15 10:01:45 -07:00
Paul Wilkins
26fead7ecf Renaming in MB_MODE_INFO
The macro block mode info context originally contained an
entry for each 16x16 macroblock. In VP9 each entry refers
to an 8x8 region not a macro block, so the naming is misleading.

This first stage clean up changes the names of 3 entries in the
structure to remove the mb_ prefix.

TODO clean up the nomenclature more widely in respect of
mbmi and bmi.

Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6
2013-08-14 12:47:52 +01:00
Guillaume Martres
fc50477082 Honor min_partition_size properly for non-square splits
Don't do vertical or horizontal splits if subsize < min_partition_size,
except for edge blocks where it makes sense.

Change-Id: I479aa66ba1838d227b5de8312d46be184a8d6401
2013-08-13 15:24:03 -07:00
Paul Wilkins
8e35263bed Merge "Honor min_partition_size properly" 2013-08-13 05:19:51 -07:00
Dmitry Kovalev
8b0e6035a2 Entropy context related cleanups.
Adding set_skip_context() function used from both encoder and decoder.

Change-Id: Ia22cfad3211a00a63eb294f64f857b78f4aa9b85
2013-08-12 11:24:24 -07:00
Dmitry Kovalev
3c43ec206c Renaming BLOCK_SIZE_TYPES constant to BLOCK_SIZES.
There will be another change set to rename BLOCK_SIZE_TYPE enum to
BLOCK_SIZE.

Change-Id: I8d1dfc873d6186fa5e554262f5169e929978085e
2013-08-09 17:47:32 -07:00
Guillaume Martres
58b07a6f9d Honor min_partition_size properly
It represents the minimum partition size, so don't split if
bsize == min_partition_size .

Change-Id: Id77c32d6afef7d2ddec0368eaae18fb13227d30e
2013-08-09 17:28:33 -07:00
Dmitry Kovalev
816d6c989c Moving loopfilter struct to VP9_COMMON.
Loop filter configuration doesn't belong to macroblock, so moving it from
MACROBLOCKD to VP9_COMMON. Also moving the declaration of loopfilter struct
from vp9_blockd.h to vp9_loopfilter.h.

Change-Id: I4b3e34be9623b47cda35f9b1f9951f8c5b1d5d28
2013-08-09 14:41:51 -07:00
Scott LaVarnway
41251ae558 Bug fix: call set_offsets before rd_auto_partition_range
The set_offsets call is necessary inorder to set the
mode_info_context ptr correctly.

Change-Id: I644910cc5bacc50ee9cd78458843274ad8ee636d
2013-08-09 14:09:49 -04:00
Adrian Grange
83ee80c045 Moved fast motion search level decision to function
Moving this block of code into a function makes the
code easier to read and change.

Change-Id: If4ede570cce1eab1982b188c4d3e4fd3d4db236e
2013-08-08 11:01:44 -07:00
Adrian Grange
aae6a4c895 Simplify & fix potential bug in rd_pick_partition
Different partitionings were not being evaluated against
best_rd and there were unnecessary calls to RDCOST. This
could have resulted in a non-optimal partioning being
selected.

I simplified the variables used to track the rate,
distortion and RD values throughout the function.

Change-Id: Ifa7085ee80d824e86791432a5bc6d8fea5a3e313
2013-08-08 09:55:45 -07:00
Jingning Han
debb9c68c8 Use low precision 32x32fdct for encodemb in speed1
The low precision 32x32 fdct has all the intermediate steps within
16-bit depth, hence allowing faster SSE2 implementation, at the
expense of larger round-trip error. It was used in the rate-distortion
optimization search loop only.

Using the low precision version, in replace of the high precision one,
affects the compression performance by about 0.7% (derf, stdhd) at
speed 0. For speed 1, it makes derf set down by only 0.017%.

Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b
2013-08-07 15:34:12 -07:00
Dmitry Kovalev
0c80065694 Inlining vp9_get_pred_probs_switchable_interp function.
There was no benefit having this function. For example, inside
read_switchable_filter_type switchable filter context was calculated twice.

Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1
2013-08-06 11:04:31 -07:00
Dmitry Kovalev
3e51acafec Merge "Finally removing all old block size constants." 2013-08-06 10:30:37 -07:00
Dmitry Kovalev
4a692e4168 Merge "Changing the order switchable filter enum constants." 2013-08-06 10:30:26 -07:00
Deb Mukherjee
33afddadb9 Merge "Add variance based mode/skipping" 2013-08-06 10:19:15 -07:00
Dmitry Kovalev
b9c7d04e95 Finally removing all old block size constants.
Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d
2013-08-05 15:23:49 -07:00
Deb Mukherjee
8b3faccb9e Add variance based mode/skipping
Adds a speed feature to skip all intra modes other than
DC_PRED if the source variance is small. This feature is
made part of speed 1 and up.

Results on derf300: psnr -0.07%, speedup about 1-2%

Also uses the source variance to fine-tune the early
termination criteria when FLAG_EARLY_TERMINATE is on.
This feature is made part of speed 2 and up.

Results on derf300: psnr -0.52%, speedup about 5-7%

Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
2013-08-05 14:14:01 -07:00
Jim Bankoski
9f988a2edf Merge "cleanups after bw bh code" 2013-08-05 14:02:02 -07:00
Dmitry Kovalev
3f611555d7 Changing the order switchable filter enum constants.
This changeset allows to remove vp9_switchable_interp and
vp9_switchable_interp_map arrays and make code much clear. Actually we
still have to use these mapping but only inside read_interp_filter_type and
write_interp_filter_type functions.

Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50
2013-08-05 12:26:15 -07:00
Jim Bankoski
5d2cb7ead0 cleanups after bw bh code
Cons bw/bh parms that should have been const. Additional formatting.

Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16
2013-08-05 12:15:52 -07:00
Dmitry Kovalev
d007446b3f Replacing long block size enum values with shorter ones (2).
Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
2013-08-05 10:51:02 -07:00
Dmitry Kovalev
fe2a201eb1 Replacing "txfm" with "tx" in identifiers.
Consistent names with TX_SIZE, TX_MODE, and TX_MODE.

Change-Id: I79592218bf5a40ace89197a34a06ee7de581ed8d
2013-08-02 17:28:23 -07:00
Dmitry Kovalev
680ec32d18 Adding is_inter_block function.
Using it instead of long unclear verbose check
"mbmi->ref_frame[0] != INTRA_FRAME".

Change-Id: I9c7b4b3797942fa962bf3ba7460fff3084beabe9
2013-08-02 16:25:33 -07:00
Yunqing Wang
d340c114fb Merge "Add more checking to using_small_partition_info" 2013-08-02 15:55:09 -07:00
Adrian Grange
60ff123536 Merge "Fixed typos and added a few explanatory comments" 2013-08-02 11:37:47 -07:00
Dmitry Kovalev
b47153deed Replacing long block size enum values with shorter ones.
Change-Id: I0e9329490828684a4fd46f540d89114cc68e8407
2013-08-02 10:48:27 -07:00
Yunqing Wang
0d68080445 Merge "Comment out 2 unused speed features" 2013-08-02 09:58:46 -07:00
Dmitry Kovalev
741537f3ce Cleanup: replacing xd->seg with seg, and xd->lf with lf.
Change-Id: I73b59d7699a8e7e7acd3bf8041cb6c98ce9ba4bf
2013-08-01 15:38:16 -07:00
Dmitry Kovalev
9f4f001ba5 Merge "Cleanup: removing unused function arguments." 2013-08-01 15:07:12 -07:00
Dmitry Kovalev
ce8dedc353 Cleanup: removing unused function arguments.
Change-Id: I27471768980fc631916069f24bc7c482a5c9ca17
2013-08-01 13:41:38 -07:00
Deb Mukherjee
dbea726daf Adds a source variance computation function
Adds a function to compute source variance for various
sb_types to be used for pruning mode and partition searches.
[The existing activity measure function is currently specialized
for only 16x16 MBs and needs to be updated].

Change-Id: I22a41e6f1430184201487326fdbebb9b47e6fc24
2013-08-01 13:01:54 -07:00
Yunqing Wang
215b010f4b Add more checking to using_small_partition_info
If the partition is out of partition size range, we don't
need to process small partition information.

Change-Id: Ice9bfbbdebe1f2ef79271a3aee17de0ed4608376
2013-08-01 11:37:41 -07:00
Yunqing Wang
7965a6ea34 Comment out 2 unused speed features
use_min_partition_size and use_max_partition_size are not used
currently, and could be added back if needed later.

Change-Id: Ib22a9c06b064567a7c1d6d5445567ed77e0d3acc
2013-08-01 11:03:34 -07:00
Adrian Grange
89e73c63c0 Fixed typos and added a few explanatory comments
Change-Id: Ib4e4b41094b54874ee34343dd77c0c131ceed9d2
2013-08-01 09:23:49 -07:00
Jingning Han
12f5762756 Remove unnecessary arguments in rd_pick_ref_frame
This commit removes redundant arguments passing in the function of
rd_pick_reference_frame. This resolves the clang warnings about
potential use of uninitialized values.

Change-Id: Ic68f949a9f8fcd0a583786b0c75321104ea44739
2013-07-31 17:04:13 -07:00
Jingning Han
ac7bab7575 Merge "Make the use of ref_frame index consistent" 2013-07-31 09:11:37 -07:00
Jingning Han
86c384d398 Make the use of ref_frame index consistent
Refactor the frame buffer referencing in choose_partition and make
it consistent with other places. This means to prevent potential
issues when we extend reference frame buffer.

Change-Id: I5ff33ed5f671e1f4cc7049622212769a9b4578d9
2013-07-30 19:49:36 -07:00
Adrian Grange
fbd73648dd Merge "Cleanup typos, remove unnecessary lines, replace switch" 2013-07-30 12:59:46 -07:00
Adrian Grange
b30a06b930 Cleanup typos, remove unnecessary lines, replace switch
Removed unnecessary code lines, replaced switch with an if,
fixed spelling errors and formatting.

Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6
2013-07-30 12:10:32 -07:00
Dmitry Kovalev
730a34416f Renaming NB_TXFM_MODES constant to TX_MODES.
Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039
2013-07-29 13:38:40 -07:00
Dmitry Kovalev
23391ea835 Renaming TX_SIZE_MAX_SB to TX_SIZES.
Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107
2013-07-29 12:25:34 -07:00
Dmitry Kovalev
c09b81719f Merge "General cleanups." 2013-07-26 13:59:39 -07:00
Paul Wilkins
fe5e2a91bb Auto min and max partition size experiment.
Speed feature experiment to set an upper and lower
partition size limit based on what has been seen
in spatial neighbors.

This seems to gives quite reasonable speed gains in local
(10-15%) and when used with speed 0 the losses are small
(0.25% derf, 0.35% stdhd). However, for now I am only
enabling it on speed 1 as there may be clashes with the existing
temporal partition selection in speed 2.

Using a tighter min / max around the range derived from the
neighbors increases speed further but at the cost of a
bigger quality loss. However,  I think this spatial method could
be combined with data from either the last frame or a variance
method (or both) to refine the range of minimum and maximum
partition size. I.e. consider the min and max from spatial and
temporal neighbors and the variance recommendation.

Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f
2013-07-26 18:30:49 +01:00
Yunqing Wang
845fd5011c Merge "Add encoding option --static-thresh" 2013-07-25 14:58:00 -07:00
Yunqing Wang
d36852b702 Add encoding option --static-thresh
This option exists in VP8, and it was rewritten in VP9 to support
skipping on different partition levels. After prediction is done,
we can check if the residuals in the partition block will be all
quantized to 0. If this is true, the skip flag is set, and only
prediction data are needed in reconstruction. Based on DCT's energy
conservation property, the skipping check can be estimated in
spatial domain.

The prediction error is calculated and compared to a threshold.
The threshold is determined by the dequant values, and also
adjusted by partition sizes. To be precise, the DC and AC parts
for Y, U, and V planes are checked to decide skipping or not.

Test showed that
1. derf set:
when static-thresh = 1, psnr loss is 0.666%;
when static-thresh = 500, psnr loss is 1.162%;
2. stdhd set:
when static-thresh = 1, psnr loss is 1.249%;
when static-thresh = 500, psnr loss is 1.668%;

For different clips, encoding speedup range is between several
percentage and 20+% when static-thresh <= 500. For example,
clip            bitrate  static-thresh psnr    time
akiyo(cif)       500        0          48.923  5.635s(50f)
akiyo            500        500        48.863  4.402s(50f)

parkjoy(1080p)   4000       0          30.380  77.54s(30f)
parkjoy          4000       500        30.384  69.59s(30f)

sunflower(1080p) 4000       0          44.461  85.2s(30f)
sunflower        4000       500        44.418  78.1s(30f)

Higher static-thresh values give larger speedup with larger
quality loss.

Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53
2013-07-25 14:28:05 -07:00
Dmitry Kovalev
7131cb0e3d General cleanups.
Removing unused constants, macros, and function declarations. Using
ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
#include from *.h to *.c. Merging for loops for motion vectors.

Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
2013-07-25 14:13:48 -07:00
Adrian Grange
e862c6f9eb Merge "Simplify handling of sub-partition motion vectors" 2013-07-25 12:58:38 -07:00
Adrian Grange
be700e140a Simplify handling of sub-partition motion vectors
Simplified the code that extracts and uses the motion
vectors for the 4 sub-partitions in rd_pick_partition.

Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9
2013-07-25 11:51:51 -07:00
Yaowu Xu
3e386aefc2 fix a bug where flags are not reset
The feature that uses small partition results as a measure to skip
mode evaluation at larger partition requires the flags to be reset.
The reset was missing in the code path that calls rd_use_partition().

Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50
2013-07-25 10:28:38 -07:00
Adrian Grange
a183f17d33 Merge "Correct spelling mistakes" 2013-07-24 09:48:57 -07:00
Adrian Grange
bc8b0529db Correct spelling mistakes
Change-Id: Id4138293efeac4503b2e01ce7a6c150a5abeef77
2013-07-24 07:58:26 -07:00
Dmitry Kovalev
1099a436d3 Moving counts from FRAME_CONTEXT to new struct FRAME_COUNTS.
Counts are separate from frame context. We have several frame contexts but
need only one copy of all counts.

Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d
2013-07-23 17:02:08 -07:00
Adrian Grange
719cd35f3a Merge "Rolled-up several for loops into one" 2013-07-23 15:00:06 -07:00
Adrian Grange
646edbc1b2 Rolled-up several for loops into one
Several consecutive for loops executed over the same
index range, so I rolled them into one.

Change-Id: I5cfcc8c38c738478965768409cca9d09adf224e1
2013-07-23 14:32:21 -07:00
Dmitry Kovalev
2855d8aea1 Merge "Adding update_tx_counts function." 2013-07-23 13:57:59 -07:00
Jim Bankoski
86a9dec73c clean up bw, bh
many structures use bw and bh and they have different meanings.   This cl attempts
to start this clean up and remove unneccessary 2 step look up log and then
shift operations...

also removed partition type multiple operation code in bitstream.c.

Change-Id: I7e03e552bdfc0939738e430862e3073d30fdd5db
2013-07-23 06:51:44 -07:00
Dmitry Kovalev
b2fc6fa969 Adding update_tx_counts function.
Moving common encoder/decoder code to update_tx_counts. Also renaming
vp9_get_pred_probs_tx_size to get_tx_probs2 and adding get_tx_probs to
call vp9_get_pred_context_tx_size inside read_selected_tx_size only once
(twice before).

Change-Id: Ia50247f3893de88ef8e9041b0d44be44a40aaa4d
2013-07-22 14:57:43 -07:00
Jim Bankoski
2ac8b50cd8 fix left over overflow
This cl fixes issues rbultje brought up. that I somehow neglected when I
submitted yaowu's patch.

Change-Id: I07ad18796317822510b96e951c88d29f194a3c2e
2013-07-22 06:39:39 -07:00
Dmitry Kovalev
8962d975b2 Merge "Moving all loop filter related variables into new struct." 2013-07-20 22:45:24 -07:00
Dmitry Kovalev
f66821afbb Merge "Removing frame_type field from MACROBLOCKD struct." 2013-07-20 22:40:06 -07:00
Yaowu Xu
ea284d6281 added checks to prevent rate/distortion overflow
At speed 2, due to the threshold scheme used, it is possible the rate
and distortion assigned with INT_MAX value. The patch added checking
to prevent the INT_MAX value is used in further calculation of RD
scores. The patch also changed the assertion in rd_use_partition() to
be mirror similar assertion in rd_pick_partition().

Change-Id: Idb52c543cc1e10abdf6e6a5d6e9cb535a42214dc
2013-07-19 17:52:50 -07:00
Dmitry Kovalev
ee1771ebaa Moving all loop filter related variables into new struct.
Adding loopfilter struct with fields from MACROBLOCKD and VP9Common.
Eventually it will be moved to vp9_loopfilter.h for better code structure.

Change-Id: Iaf5fb71c33719cdfa1b991f671caf071be9ea035
2013-07-19 16:19:10 -07:00
Dmitry Kovalev
e71a4a77bb Merge "Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE)." 2013-07-19 12:14:32 -07:00
Dmitry Kovalev
97e96bc4e9 Removing frame_type field from MACROBLOCKD struct.
Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119
2013-07-19 11:55:36 -07:00
Dmitry Kovalev
c0eb57406c Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE).
Moving TX_MODE enum to vp9_enums.h. Renaming txfm_mode variables to
tx_mode.

Change-Id: I459d1af6dd928ce7fccdf8ce30b6f1ca057bef92
2013-07-19 11:37:13 -07:00
Dmitry Kovalev
afe43d4089 Removing redundant VP9_COMMON* from function signatures.
Functions: vp9_get_pred_context_switchable_interp,
           vp9_get_pred_context_intra_inter,
           vp9_get_pred_context_single_ref_p1,
           vp9_get_pred_context_single_ref_p2.

Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96
2013-07-19 11:20:49 -07:00
Ronald S. Bultje
e4686c589e Fix slightly quality drop caused at speed 1.
We would skip the rectangular blocks for sub8x8 partitions because
we would conclude that PARTITION_NONE was better than PARTITION_SPLIT,
however, that conclusion was made before we actually really tested
PARTITION_SPLIT.

Change-Id: I8fa91e59894badc1d8cee3ba8a49e40ae4c4a489
2013-07-18 17:52:08 -07:00
Ronald S. Bultje
5ebe503f04 Merge scale_factors and scale_factors_uv.
This prevents a duplicate memcpy of a 128-byte struct every time
set_scale_factors() is called (which is a lot), thus leading to a
decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block
RD/partition loop.

Overall, this decreases encoding time of the first 50 frames of bus
@ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5%
overall speedup. We can likely get more gains by removing the copy
of the other struct (and replacing it with an indexing) as well.

Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
2013-07-18 14:10:56 -07:00
Ronald S. Bultje
2d4929e340 Remove motion vectors from PARTITION_INFO.
The same information already exists in union b_mode_info.

Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d
2013-07-18 14:10:52 -07:00
Ronald S. Bultje
607424449c Merge "Best_rd breakout in rd partition search." 2013-07-17 16:10:22 -07:00
Dmitry Kovalev
a7a1e96136 Merge changes Ieffea49e,Idf610746
* changes:
  Removing two unused arguments from vp9_inc_mv signature.
  Changing signature of vp9_get_pred_probs_tx_size.
2013-07-17 14:44:20 -07:00
Ronald S. Bultje
9f427bfe98 Best_rd breakout in rd partition search.
About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which
goes from 1min36 to 1min24. Results become slightly better (+0.2% on
derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in
super_block_yrd(). Overall speed change (on derfraw300) is roughly
-13%. This can probably be improved further by caching best_yrd
between partition searches. Also, we might be able to get more
speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not
just at the sb8x8 level.

Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b
2013-07-17 09:56:46 -07:00
Yunqing Wang
df90d58f4f Speed up motion estimation using small partitions' result(experiment)
Current partition checking starts from small sizes, and then goes up
to large sizes. This experiment uses the small partitions' motion
estimation result, which is already available, to speed up the
large partition's motion estimation. We can decide to skip some
patition checkings if they are unlikely choices. We could use the
motion vector(MV) result as current partition's prediction MV, limit
the search range and reference frame.

Current result at speed 1:
psnr loss: 1.19% for stdhd, 0.287% for derf.
speed gain: 14% for sunflower(hd), 11% for akiyo.

Further improvement will be done later.

Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab
2013-07-17 09:11:47 -07:00
Dmitry Kovalev
5b65a71cdc Changing signature of vp9_get_pred_probs_tx_size.
Removing VP9_COMMON* argument and adding struct tx_probs* instead of
MACROBLOCKD*.

Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c
2013-07-16 16:34:54 -07:00
Dmitry Kovalev
9482a0bf10 Cleaning up tile code.
Removing tile_rows and tile_columns from VP9Common, removing redundant
constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
vp9_get_tile_n_bits.

Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
2013-07-16 14:47:15 -07:00
Dmitry Kovalev
5de96b3ce6 Merge "Rewriting vp9_set_pred_flag_{seg_id, mbskip}." 2013-07-16 13:34:42 -07:00
James Zern
5baa416b6c Merge "vp9: remove frames_{since,till}.. from MACROBLOCKD" 2013-07-16 13:00:14 -07:00
Dmitry Kovalev
863138a2ad Rewriting vp9_set_pred_flag_{seg_id, mbskip}.
Making implementation of vp9_set_pred_flag_{seg_id, mbskip} consistent
with vp9_get_segment_id without using confusing sub(a, b) macro. Passing
mi_row and mi_col to functions explicitly instead of replying on
mb_to_right_edge and mb_to_bottom_edge.

Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435
2013-07-16 10:44:48 -07:00
Jingning Han
faff6ed0fb Skip duplicate block encoding in the rd loop
This speed feature allows the encoder to largely remove the spatial
dependency between blocks inside a 64x64 superblock, thereby removing
the need to repeatedly encode superblocks per partition type in the
rate-distortion optimization loop.

A major challenge lies in the intra modes tested in the rate-distortion
optimization loop. The subsequent blocks do not have access to the
reconstructed boundary pixels without the intermediate coding steps.
This was resolved by using the original pixels for intra prediction
in the rd loop, followed by an appropriately designed distortion
modeling on the quantization parameters. Experiments also suggested
that the performance impact is more discernible at lower bit-rate/psnr
settings. Hence a quantizer dependent threshold is applied to deactivate
skip of block coding.

For bus_cif at 2000 kbps,
speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB
         performance loss.

speed 1: runtime 65312ms  -> 61536ms, (7% speed-up) at 0.04dB
         performance loss.

This operation is currently turned on in settings of speed 1.

Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec
2013-07-15 11:08:58 -07:00
James Zern
dc1d2331f6 vp9: remove frames_{since,till}.. from MACROBLOCKD
frames_since_golden / frames_till_alt_ref_frame are unused.

Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b
2013-07-13 18:02:11 -07:00
Dmitry Kovalev
429070987a Using vp9_copy and vp9_zero instead of custom code.
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
2013-07-12 18:07:43 -07:00
Dmitry Kovalev
cc662dd768 Adding struct tx_probs and struct tx_counts to cleanup the code.
Also removing unused declarations from vp9_entropymode.h file.

Change-Id: Ib9c5826db3584a32f6bb3297a76c522b99d83402
2013-07-12 15:22:38 -07:00
Dmitry Kovalev
dd150e8ea9 Removing redundant code mostly from vp9_pred_common.{h, c}.
Removing redundant function arguments and curly braces.

Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
2013-07-11 18:39:10 -07:00
Dmitry Kovalev
c4ad3273c7 Moving segmentation related vars into separate struct.
Adding segmentation struct to vp9_seg_common.h. Struct members are from
macroblockd and VP9Common structs. Moving segmentation related constants
and enums to vp9_seg_common.h.

Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03
2013-07-11 11:57:57 -07:00
Dmitry Kovalev
544d8c3316 Removing unused TOKENEXTRA arg from pick_sb_modes function.
Change-Id: I0543e72fa092eef3976b65e16bb597197c364873
2013-07-10 15:57:28 -07:00
Jim Bankoski
68ef7a6b8a configure with internal stats not working
Change-Id: I5dea4570cb05df27a522abf6e7b695998654284a
2013-07-10 15:07:53 -07:00
Jim Bankoski
6591cf2f7e remove warnings when NDEBUG is set
Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136
2013-07-10 14:27:20 -07:00
Jim Bankoski
fb027a7658 removing case statements around prediction entropy coding
Removes SEG_ID
Removes MBSKIP
Removes SWITCHABLE_INTERP
Removes INTRA_INTER
Removes COMP_INTER_INTER
Removes COMP_REF_P
Removes SINGLE_REF_P1
Removes SINGLE_REF_P2
Removes TX_SIZE

Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b
2013-07-09 20:10:16 -07:00
Dmitry Kovalev
c6c279aff0 Merge "Using mi_cols instead of mb_cols." 2013-07-08 20:09:19 -07:00
Dmitry Kovalev
1c65c580d6 Merge "Refactoring setup_pre_planes function." 2013-07-08 20:08:05 -07:00
Ronald S. Bultje
a5062cc635 Don't call encode_sb() for the final of 4-split subpartitions.
The resulting reconstruction is never used, thus it just wastes CPU
cycles. Reduces encode time of first 50 frames of bus (speed 0) @
1500kbps from 2min2.0 to 2min1.2, i.e. a 0.65% overall speedup.

Change-Id: I74755ca3aadc21e2be220f486259060bd4088c45
2013-07-08 16:22:39 -07:00
Ronald S. Bultje
ed995afba1 Make frame-wide filter-type decision fully RD-based.
Overall, on all test sets, this gains about +0.2% on all metrics.
City is a clip where this really hurts (-1.0% on all metrics), I'm
not quite sure why yet. Maybe interesting to look into in the future.

Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78
2013-07-08 16:22:37 -07:00
Dmitry Kovalev
b7559258a4 Using mi_cols instead of mb_cols.
Eliminating usage of mb-units, switching to mi-units. Adding
ALIGN_POWER_OF_TWO macro.

Change-Id: I2491c969f713207c062011878b57e4e531818607
2013-07-08 14:54:04 -07:00
Deb Mukherjee
d9b62160a0 Implements several heuristics to prune mode search
Skips mode searches for intra and compound inter modes depending
on the best mode so far and the reference frames. The various
heuristics to be used are selected by bits from a flag. The
previous direction based intra mode search pruning is also absorbed
in this framework.

Specifically the flags and their impact are:

1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique
directional modes and TM_PRED if the best so far is
an inter mode)
derfraw300: -0.15%, 10% speedup

2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153
mode search if the best so far is not one of the closest
hor/vert/diagonal directions.
derfraw300: -0.05%, about 9% speedup

3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode
search if the best so far is an intra mode)
derfraw300: -0.06%, about 7-8% speedup

4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search
if the best single ref inter mode does not have the same ref
as one of the two references being tested in the compound mode)
derfraw300: -0.56%, about 10% speedup

Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495
2013-07-08 12:17:12 -07:00
Dmitry Kovalev
f72e072555 Refactoring setup_pre_planes function.
Removing set_refs, adding set_ref function.

Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63
2013-07-03 17:42:01 -07:00
Dmitry Kovalev
430bd0c94a Merge "Replacing 64 / MI_SIZE with MI_BLOCK_SIZE." 2013-07-03 14:16:02 -07:00
Dmitry Kovalev
5a21de8418 Replacing 64 / MI_SIZE with MI_BLOCK_SIZE.
Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c
2013-07-03 10:54:50 -07:00
Paul Wilkins
72c5778ec5 Added two new skip experiments.
sf->unused_mode_skip_lvl. Tests modes as normal for all
sizes at or below the given level. At larger sizes it skips
all modes that were not chosen at any smaller size.
Hence setting BLOCK_SIZE_SB64X64 is in effect off.
Setting BLOCK_SIZE_AB4X4 will only consider modes that
were chosen for one or more 4x4 blocks at larger sizes.

sf->reference_masking.
Do a test encode of the NONE partition at one size and create
a reference frame mask based on the best rd choice. In the
full search only allow this reference frame.
Currently it is testing 64x64 and repeats this in the full search.
This does not work well with Jim's Partition code just now and
is disabled by default.

Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd
2013-07-03 16:56:06 +01:00
Dmitry Kovalev
1f6e95e76a Merge "Removing redundant struct from union b_mode_info." 2013-07-02 18:09:31 -07:00
Dmitry Kovalev
be77f6bbbf Removing redundant struct from union b_mode_info.
Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83
2013-07-02 16:51:57 -07:00
Yaowu Xu
0d7b7c09cb Added a speed feature use_square_partition_only
This commit adds a speed feature where only squared partition are
evaluated in partition picking. Enable this feature in cpu-used 2
reduces encoding time by ~30%.

loss of compression:
-0.9% on cif set
-1.23% on stdhd

Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9
2013-07-02 16:40:15 -07:00
Deb Mukherjee
8d3d2b76f3 Tx size selection enhancements
(1) Refines the modeling function and uses that to add some speed
features. Specifically, intead of using a flag use_largest_txfm as
a speed feature, an enum tx_size_search_method is used, of which
two of the types are USE_FULL_RD and USE_LARGESTALL. Two other
new types are added:
USE_LARGESTINTRA (use largest only for intra)
USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for
inter)

(2) Another change is that the framework for deciding transform type
is simplified to use a heuristic count based method rather than
an rd based method using txfm_cache. In practice the new method
is found to work just as well - with derf only -0.01 down.
The new method is more compatible with the new framework where
certain rd costs are based on full rd and certain others are
based on modeled rd or are not computed. In this patch the existing
rd based method is still kept for use in the USE_FULL_RD mode.
In the other modes, the count based method is used.
However the recommendation is to remove it eventually since the
benefit is limited, and will remove a lot of complications in
the code

(3) Finally a bug is fixed with the existing use_largest_txfm speed feature
that causes mismatches when the lossless mode and 4x4 WH transform is
forced.

Results on derf:
USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction
USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a
pretty good compromise)
USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction
(currently the benefit of modeling is limited for txfm size selection,
but keeping this enum as a placeholder) .
USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing
use_largest_txfm speed feature).

Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936
2013-07-02 13:54:00 -07:00
Dmitry Kovalev
3140c443e4 Merge "Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h." 2013-07-02 11:31:35 -07:00
Jim Bankoski
d4158283e7 use partitioning from last frame
This cl converts use partition from last frame to do the following:

if part is none,horz, vert -> try split
if part != none and one of the children is not split - try none


Change-Id: I5b6c659e35f3ac9f11c051b92ba98af6d7e8aa87
Signed-off-by: Jim Bankoski <jimbankoski@google.com>
2013-07-01 18:18:50 -07:00
Dmitry Kovalev
1ac0540296 Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h.
Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c
2013-07-01 17:28:08 -07:00
Yaowu Xu
632289b31f fix a mismatch in cpuused 2
Change-Id: I921c9faba6386535aaf717a54301dd346a9b8540
2013-07-01 08:54:50 -07:00
Dmitry Kovalev
59070f6e3c Merge "Removing CONFIG_DEBUG checks on assertions." 2013-06-28 14:03:28 -07:00
Dmitry Kovalev
0345fc3ad9 Merge "Decoder's code cleanup." 2013-06-28 10:38:54 -07:00
Dmitry Kovalev
8e6ce6bb9e Removing CONFIG_DEBUG checks on assertions.
Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated
ones from vp9_onyx_int.h and vp9_onyxd_int.h.

Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6
2013-06-28 10:36:20 -07:00
Yaowu Xu
64bb996e03 Merge "Optimize partition search order" 2013-06-28 09:29:39 -07:00
Yaowu Xu
1374a06bd8 Optimize partition search order
This commit change the partition search order to allow checking of
rectangular partition to be done after square partitions. It also
added a speed feature to skip rectangular partition check when
NONE is better than SPLIT in RD sense.

This feature roughly speed up encoder by 1.5X with loss on compression
-0.91% on cif set
-0.56% on stdhd set

Change-Id: I0d2d06993041aa9ea9073fcc39c54f73a127dfa4
2013-06-28 07:13:54 -07:00
Ronald S. Bultje
fd4eed3b08 Fix tile independence with both column tiling and static_thresh set.
Change-Id: I0b2be0ec2c410a527f88b95a44f24ac967b2dac1
2013-06-27 21:56:40 -07:00
Dmitry Kovalev
3231da0a9e Decoder's code cleanup.
Using vp9_set_pred_flag function instead of custom code, adding
decode_tokens function which is now called from decode_atom,
decode_sb_intra, and decode_sb.

Change-Id: Ie163a7106c0241099da9c5fe03069bd71f9d9ff8
2013-06-27 16:15:43 -07:00
Dmitry Kovalev
a3664258c5 Merge "General cleanup in segmentation-related code." 2013-06-27 14:57:07 -07:00
Paul Wilkins
59af9049d3 Merge "Start adaptive threshold for each mode at max." 2013-06-27 02:28:36 -07:00
Jingning Han
bd9bac0391 Remove empty function vp9_build_block_offsets
This function is empty, hence is removed.

Change-Id: Ia9d01710806bffe0398a6dc9405f8a5a81b27d74
2013-06-26 14:55:47 -07:00
Dmitry Kovalev
be07485e9a General cleanup in segmentation-related code.
Using consistent function and variable names.

Change-Id: I2deb3fded8797453a2081836c9ce2e79ade06eb7
2013-06-26 10:27:28 -07:00
Paul Wilkins
689957e3ad Start adaptive threshold for each mode at max.
Each frame we reset all adaptive thresholds to MAX
rather than base. As modes are picked their thresholds
drop down.

Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8
2013-06-26 17:04:47 +01:00
Dmitry Kovalev
70e9622185 Merge "Removing find_seg_id and using vp9_get_pred_mi_segid instead." 2013-06-25 10:16:06 -07:00
Yaowu Xu
e371cd73a3 change to enable use_largest_txform feature
for all regular inter frames at speed 1

Change-Id: I0a8b301273ecf2b8730ab1f6b7a05f89f4d498e0
2013-06-24 16:43:26 -07:00
Dmitry Kovalev
40141681c0 Removing find_seg_id and using vp9_get_pred_mi_segid instead.
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
2013-06-21 13:05:10 -07:00
Ronald S. Bultje
54b2a59623 Implement SSE2 block_error.
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.

Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.

Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
2013-06-21 12:54:52 -07:00
Yaowu Xu
ee07a261a0 rename variables to avoid build error in MSVC
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
2013-06-20 18:31:48 -07:00
Jim Bankoski
9f2a1ae23e adds force partitioning greater than or less than block size
adds a new speed feature to force partitioning to be greater than
or less than a certain size

Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
2013-06-20 09:51:42 -07:00
Jim Bankoski
18bdf708e7 adds a set partitioning to speed features
this feature lets you set a partitioning size to be used by the entire
frame.

Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
2013-06-20 09:50:44 -07:00
Jim Bankoski
476d73d294 partition by variance using var from last frame
This uses variance to split partition. Variance is calculated using
nearest mv,  always from last ref frame.

Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
2013-06-20 09:48:22 -07:00
Jim Bankoski
727fa7b1e4 new partition via variance
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
2013-06-20 09:42:05 -07:00
Jim Bankoski
0fad6a9d99 fix to set up new speed feature
This uses the speed feature functionality for code.

Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
2013-06-20 09:35:02 -07:00
Jim Bankoski
df2314cfdd don't copy partitions for key frames or altrefs
force us to go through slow partitioning for keyframes, altref and
overlays.

Change-Id: I1a286361bf74083e71973575a7296be46eb98742
2013-06-20 09:34:32 -07:00
Jim Bankoski
fbcce4dd6f Merge "copy partitioning from last fame" 2013-06-20 09:32:43 -07:00
Jim Bankoski
f033b44e74 copy partitioning from last fame
Change-Id: I26e80ede80cb4389378a95afa95d229092a9859a
2013-06-20 09:32:19 -07:00
Jingning Han
7088426976 Merge "Make fdct32 computation flow within 16bit range" 2013-06-18 11:40:14 -07:00
Jingning Han
a41a4860c0 Make fdct32 computation flow within 16bit range
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.

This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.

Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
2013-06-18 09:46:24 -07:00
Dmitry Kovalev
686b99741c Removing vp9_invtrans.{c, h} files.
Moving single function from vp9_invtrans.c to vp9_encodemb.c.

Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7
2013-06-17 16:09:03 -07:00
Ronald S. Bultje
8a0808a145 Fix row tiling.
Change-Id: I57be4eeaea6e4402f6a0cc04f5c6b7a5d9aedf9b
2013-06-12 13:42:59 -04:00
Jim Bankoski
fca6c82b29 Fix rd partition search for corner blocks
This commit enables proper partition type search for the bottom-
right corner blocks.

Change-Id: Id1123d0e4e81eba648ed4f3c0c7ab587e174f650
2013-06-11 09:29:21 -07:00