As a side-effect, the sad unit tests for VP8 and VP9
had to be separated.
Fixes a bug in original patch:
(https://gerrit.chromium.org/gerrit/#/c/70163/8)
that was reverted due to a nightly test failure.
Change-Id: Ia2a4e9e278fd3c89d6c3c82fcc6381320ec2a8a6
The various motion search functions share a
common function prototype. In the case of
vp9_full_range_search() two of the parameters
are not needed.
Change-Id: I0e190af54a3b3f276409f20e8ec55912f9b0b798
As a side-effect, the max_sad check is removed from the
C-implementation of VP8, for consistency with VP9, and to
ensure that the SAD tests common to VP8/VP9 pass.
That will make the VP8 C implementation of sad a little slower
but given that is rarely used in practice, the impact will be
minimal.
Change-Id: I7f43089fdea047fbf1862e40c21e4715c30f07ca
Problem has been introduced recently with the cleanup patch
I0816ec12ec0a6f21d0f25f10c214b5fd327afc6c
Change-Id: Iaacb956a6039eb5826b82618dc03be32053fb892
Brings back most of Jim's previous patch for choosing
partitioning based on variance while making it compatible
with the current state of the code. Also adds a
nonrd_use_partition() function to recursively encode for any
arbitrary sb_type decisions within a 64x64 block; and
includes some refactoring.
Currently, when the VAR_BASED_PARTITIONING mode is turned on
for speed 7, there is a 10+% speed-up observed.
Experiments/improvements with this new partitioning method
will be conducted subsequently.
Change-Id: Ie6f43bfbde30583e941f450bf07c3b48828c9571
Adds a fast diamond search which is about 5% faster than FAST_HEX
with only a 0.1% drop in psnr when turned on for both speeds 5 and 7.
This search is turned on for speed 7.
Change-Id: I497630aa88a5148926086bb3038e7975e5f4eb98
The core motion estimation fucntions all return sad now consistently.
The only exception is vp9_full_pixel_diamond(), however the core diamond
and refining search routines called from vp9_full_pixel_diamond() also
return SAD. If variance of pred error + mv cost is desired it must be
calculated explicitly outside these functions. For very fast encoding,
hopefully this will eliminate some redundant computations.
Also suggests reimplementing FAST_HEX with the vp9_pattern_search
framework. It is not exactly the same as the existing FAST_HEX, but
performance is slightly better and speed is very similar. Enables
removing a lot of duplicate code.
Change-Id: I152736393438c25bdf7e96b37cbb8ce330f4f94a
In good quality mode motion search, the best matches are normally
found after searching in a large area. In real time mode, to make
encoding fast, a center-biased fast HEX search is used, which
converges quickly most of the time. A 4-point diamond search is
also carried out as the following refining search, which gives more
precise results, and maintains good motion search quality.
At speed 5, the borg test on rtc set showed an overall PSNR loss of
0.936%. The encoding speed gain is 4% - 5%.
Change-Id: I42cd68bb56a09ca1b86293c99d5f7312225ca7ae
This commit explicitly enforces the effective motion vector range
in the motion search stage. The range needs to be the intersection
of UMV border, effective absolute motion vector value range, and
the target search area.
Change-Id: I1cf7c563e02b1086040dad6c1f4f6be1538635a6
Add a full range motion search for regular block sizes. This runs
exhaustive search within the given reference area. This commit further
optimizes the search process by combining 4 points test into one
pipeline, which gives 30% speed-up as compared to run each individual
point at a time.
This full range search serves as a best possible motion search reference.
When replacing the diamond search with full range search, the speed 0
runtime of bus CIF at 2000 kbps goes from 153872ms to 623051ms. The
compression performance compared to speed 0 setting gains 0.585% for
derf set.
Change-Id: Ieef1225216b0b86b4ac4872fa7fb9e18bf2eabb3
Explicitly constrain the upper limit of motion search range (in the
unit of full pixel) to be [-1023, +1023]. It is intended to control
the effective motion search range for 4K sequences.
Change-Id: I645539c70885eec0f155781f439d97d333336e88
Making this change in order to move allow_high_precision_mv field
from MACROBLOCKD structure to VP9_COMMON (because it is a frame level
flag).
Change-Id: I1d006ba36d938e0caf4d40fa051e2e38df9c1108
39c7b01d accidently reverted the row/col initialization, which broke
mv clamps, which is dependent on the sites for valid motion vector
range. This commit fixed the issue.
Change-Id: Ibcce0226e0360b1ef483fe760b2e33f1af4bf494
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now constists of pointers to MODE_INFO structs. The
MODE_INFO structs are now stored as a stream (decoder only),
eliminating unnecessary copies and is a little more cache
friendly.
Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now constists of a pointer to a MODE_INFO struct and
a "in the image" flag. The MODE_INFO structs are now stored
as a stream, eliminating unnecessary copies and is a little
more cache friendly.
For the test clips used, the decoder performance improved
by ~4.3% (1080p) and ~9.7% (720p).
Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
and 5.9% (720p).
Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256