VP9_COMMON is the right place to segmentatation struct because it has
global segmentation parameters, not something specific to macroblock
processing.
Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708
This commit unifies the rate-distortion cost calculation process of
luma and chroma components. It allows early termination to be enabled
later in the rd search loop of chroma components, in consistent with
luma pixels.
Change-Id: I2e52a7c6496176bf2a5e3ef338d34ceb8aad9b3d
The macro block mode info context originally contained an
entry for each 16x16 macroblock. In VP9 each entry refers
to an 8x8 region not a macro block, so the naming is misleading.
This first stage clean up changes the names of 3 entries in the
structure to remove the mb_ prefix.
TODO clean up the nomenclature more widely in respect of
mbmi and bmi.
Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6
Enable SSE2 implementation of high precision 32x32 forward DCT. The
intermediate stacks are of 32-bits. The run-time goes down from
32126 cycles to 13442 cycles.
Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
'skippable' can remain unset and negatively affect later decisions
address one aspect of issue #599
Change-Id: Iffdf0ac2e49ac481c27dc27c87fa546d4167bb28
Adds a new subpel motion estimation function that uses a 2-level
tree-structured decision tree to eliminate redundant computations.
It searches fewer points than iterative search (which can search
the same point multiple times) but has the same quality roughly.
This is made the default setting at speeds 0 and 1, while at
speed 2 and above only a 1-level search is used.
Also includes various cleanups for consistency and redundancy removal.
Results:
derf: +0.012% psnr
stdhd: +0.09% psnr
Speedup of about 2-3%
Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9
Removing the old one bsize_from_dim_lookup. Now we have a way to determine
block size for plane using its subsampling values (ss_size_lookup). And
then we can find the number of pixels in the block (num_pels_log2_lookup).
Change-Id: I6fc981da2ae093de81741d3d78eaefed11015db9
Removes some unused code and speed features, and organizes the
interfaces for fractional mv step functions for use in new speed
features to come.
In the process a new speed feature - number of iterations per
step during the subpel search - is exposed.
No change when this parameter is set as the original value of 3.
Results:
subpel_iters_per_step = 3: baseline
subpel_iters_per_step = 2: psnr -0.067%, 1% speedup
subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup
Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8
Adds a few pattern searches to achieve various tradeoffs
between motion estimation complexity and performance.
The search framework is unified across these searches so that a
common pattern search function is used for all. Besides it will
be easier to experiment with various patterns or combinations
thereof at different scales in the future.
The new pattern search is multi-scale and is capable of using
different patterns at different scales.
The new hex search uses 8 points at the smallest scale
and 6 points at other scales.
Two other pattern searches - big-diamond and square are
also added. Big diamond uses 4 points at the smallest scale and
8 points in diamond shape at the larger scales.
Square is very similar conceptually to the default n-step search
but is somewhat faster since it keeps only one survivor across
all scales.
Psnr/speed-up results on derf300:
hex: -1.6% psnr%, 6-8% speed-up
big-diamond: -0.96% psnr, 4-5% speedup
square: -0.93% psnr, 4-5% speedup
Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2
There was no benefit having this function. For example, inside
read_switchable_filter_type switchable filter context was calculated twice.
Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1
Adds a speed feature to skip all intra modes other than
DC_PRED if the source variance is small. This feature is
made part of speed 1 and up.
Results on derf300: psnr -0.07%, speedup about 1-2%
Also uses the source variance to fine-tune the early
termination criteria when FLAG_EARLY_TERMINATE is on.
This feature is made part of speed 2 and up.
Results on derf300: psnr -0.52%, speedup about 5-7%
Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
This changeset allows to remove vp9_switchable_interp and
vp9_switchable_interp_map arrays and make code much clear. Actually we
still have to use these mapping but only inside read_interp_filter_type and
write_interp_filter_type functions.
Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50
Removing assign_and_clamp_mv function, making implementation of clamp_mv
and clamp_mv2 more clear and consistent.
Change-Id: Iecd08e1c1bf0379f8314ebe01811f8253f4ade58
The function name rd_pick_intra4x4mby_modes is confusing, so
I changed it to rd_pick_intra_sub_8x8_y_modes to better
reflect what the function does. Also added const qualifiers
to some of the input parameters and removed camel-case.
Change-Id: I23d53d4c7af5d79ed8a471acd59a09bbb47add39
This allows us to increment the position at the band-level only as
we go from one band to the next; more importantly, that allows us to
use an add instead of multiply instruction, and omit the instruction
altogether if the band doesn't change from one coef to the next, thus
being slightly faster (probably more noticeable on systems where a
multiply is expensive, like arm).
Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381