This experimental change reorders the search so
that all candidate references that match the target
reference frame are tested first, in order of their
distance from the current block. These will usually
be the highest-scoring candidates.
If we do not find enough good candidates this way,
we then try the non-matching cases, which will usually
be lower-scoring candidates.
The change in order, together with breakouts once
enough candidates have been found, should reduce
the computational cost and, in particular, the number
of sort operations.
Quality Results:
Std Hd +0.228%, Hd +0.074%, YT +0.046%, derf +0.137%
This effect is probably because more distant weak
candidates are now less likely to get "promoted" over
nearer candidates, even if they are repeated.
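A minimal sketch of the reordered two-pass search with its
breakouts (the names, types and candidate limit are illustrative,
not the actual libvpx code; candidates are assumed to be visited
nearest first):

    #define MAX_MV_REF_CANDIDATES 4                /* illustrative limit */

    typedef struct { int row, col; } MV;           /* illustrative types */
    typedef struct { int ref_frame; MV mv; } CANDIDATE;

    /* Pass 1 takes only candidates whose reference frame matches the
     * target; pass 2 falls back to the non-matching cases.  Both passes
     * break out as soon as enough candidates have been collected, so
     * the (usually weaker) non-matching cases are often never visited. */
    static int collect_mv_refs(const CANDIDATE *cand, int n,
                               int target_ref, MV *out) {
      int found = 0, i;
      for (i = 0; i < n && found < MAX_MV_REF_CANDIDATES; ++i)
        if (cand[i].ref_frame == target_ref) out[found++] = cand[i].mv;
      for (i = 0; i < n && found < MAX_MV_REF_CANDIDATES; ++i)
        if (cand[i].ref_frame != target_ref) out[found++] = cand[i].mv;
      return found;
    }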
Change-Id: Iec37e77d88a48ad0ee1f315b14327a95d63f81f6
Old Scheme:
When SWITCHABLE filter selection is enabled, the encoder
evaluates each interpolation filter type and selects the
best one to use at the MB level. A frame-level flag can be
set to force the use of a particular filter type for all
MBs in a frame if it is more efficient to encode that way.
The logic here involved a Q-dependent threshold that assumed
the second 8-tap filter was a high-pass filter. However, this
required a trip around the recode loop. If the frame-level
flag indicates use of a particular filter, the other filters
are not evaluated in the pick_mode loop.
New Scheme:
Each filter type is evaluated at the MB level and a record
of the best filter is kept, irrespective of what filter is
signaled at the frame level. Once all MBs have been encoded,
a decision is made as to what frame-level mode to set for
the *next* frame. If one filter is used by 80% or more of
the MBs, then that filter is forced, on the assumption that
this will be more efficient if the next frame has similar
characteristics; i.e. there is a one-frame lag between
measuring the filter selection and setting the frame-level
mode to use.
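A minimal sketch of the frame-level decision for the next frame
(the enum values, names and counting scheme are illustrative,
not the actual encoder code):

    enum { FILTER_A, FILTER_B, FILTER_C, NUM_FILTERS, USE_PER_MB_SELECTION };

    /* count[f] is how many MBs picked filter f while encoding the
     * current frame; the returned mode is applied to the *next* frame. */
    static int pick_frame_level_filter(const int count[NUM_FILTERS],
                                       int total_mbs) {
      int f;
      for (f = 0; f < NUM_FILTERS; ++f)
        if (count[f] * 100 >= total_mbs * 80)  /* >= 80% of MBs chose f  */
          return f;                            /* force f next frame     */
      return USE_PER_MB_SELECTION;             /* else keep per-MB choice */
    }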
Change-Id: I6a7e7ced8f27e120fafb99db2dc9c6293f8d20f7
Remove the special-case function cost_coeffs_2x2() and change
cost_coeffs() to handle the 2nd order Haar block, since it
already handles all other block types.
Change-Id: I2aac6f81ee0ae9e03d6a8da4f8681d69b79ce41f
Read the mode before calling vp9_find_best_ref_mvs(). If the
mode is ZEROMV, the best ref_mvs are not needed, so we can skip
calling vp9_find_best_ref_mvs().
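A sketch of the reordered control flow (illustrative pseudo-code,
not the actual decoder source):

    /*
     *   mbmi->mode = <read the inter mode from the bitstream>;
     *   if (mbmi->mode != ZEROMV)
     *     vp9_find_best_ref_mvs(...);   // skipped for ZEROMV: the best,
     *                                   // nearest and near MVs are never
     *                                   // consulted in that case
     */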
Change-Id: I5baa3658dd3f1c7107211cbbbcf919b4584be2e2
The 2-D inverse transform X = M1*Z*Transposed_M2 was calculated
in two steps, from left to right:
1. Vertical transform: Y = M1*Z
2. Horizontal transform: X = Y*Transposed_M2
In SIMD, a transpose is needed in the vertical transform.
This change switches the calculation order to go from right to
left, which eliminates that transpose by writing the
intermediate results out to their transposed positions.
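An outline of the reordering (illustrative only, not the actual
SIMD kernels):

    /*
     *   Old order:  Y = M1 * Z              (vertical pass first; in SIMD
     *               X = Y * Transposed_M2    this pass needs an explicit
     *                                        transpose of the input)
     *
     *   New order:  W = Z * Transposed_M2   (horizontal pass first,
     *               X = M1 * W               processed row by row)
     *
     * The first pass writes each transformed row to the transposed
     * position in the intermediate buffer, so the second pass can again
     * work on contiguous rows and no separate transpose step remains.
     */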
Change-Id: I34dfe5eb01292f6e363712420d99475e2e81e12c
Adds an experiment to derive the context of a coefficient not
just from the previous coefficient in the scan order but from a
combination of several neighboring coefficients previously
encountered in scan order. A precomputed table of neighbors for
each location, for each scan type and block size, is used.
Currently 5 neighbors are used.
Results are about 0.2% positive using a strategy where the max
coefficient magnitude from the 5 neighbors is used to derive the
context.
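A minimal sketch of that context derivation (the table layout,
names and final mapping are illustrative, not the experiment's
exact code):

    #include <stdint.h>

    /* token_cache holds the magnitude class of each already-coded
     * coefficient; neighbors[] lists the scan positions of up to 5
     * previously coded neighbors of the current position, taken from
     * the precomputed per-scan-type/per-block-size table. */
    static int get_coef_context(const int *neighbors, int num_neighbors,
                                const uint8_t *token_cache) {
      int max_mag = 0, k;
      for (k = 0; k < num_neighbors; ++k)
        if (token_cache[neighbors[k]] > max_mag)
          max_mag = token_cache[neighbors[k]];
      /* Map the max neighbor magnitude into a small context index. */
      return max_mag > 2 ? 2 : max_mag;
    }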
Change-Id: Ie708b54d8e1898af742846ce2d1e2b0d89fd4ad5
Remove an extra level of escaping around the $@ variable to get valid output.
Prior to this change, modifying header files did not trigger a rebuild of
sources dependent on them.
Change-Id: I93ecc60371b705b64dc8a2583a5d31126fe3f851
This patch changes the token packing to call the bool encoder API rather
than inlining it into the token packing function, and similarly removes
a special get_signed case from the detokenizer. This allows easier
experimentation with changing the bool coder as a whole.
Change-Id: I52c3625bbe4960b68cfb873b0e39ade0c82f9e91
For coefficients, use int16_t (instead of short); for pixel values in
16-bit intermediates, use uint16_t (instead of unsigned short); for all
others, use uint8_t (instead of unsigned char).
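Illustrative declarations following that convention (not taken
from the tree):

    #include <stdint.h>

    int16_t  coeff[16 * 16];     /* transform coefficients             */
    uint16_t inter_pred[16 * 9]; /* 16-bit intermediate pixel values   */
    uint8_t  recon[16 * 16];     /* ordinary 8-bit pixel values        */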
Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da
MAX_PSNR is used to assign a "psnr" number when the MSE is close
to zero; the direct assignment prevents a divide-by-zero in the
computation. Change it from 60 to 100 to be consistent with what
is done in VP9.
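A minimal sketch of the guard in question (names and the exact
formula are illustrative):

    #include <math.h>

    #define MAX_PSNR 100.0   /* was 60.0 before this change */

    static double mse_to_psnr(double samples, double peak, double sse) {
      if (sse <= 0.0)
        return MAX_PSNR;     /* assign the cap directly: avoids divide by 0 */
      {
        const double psnr = 10.0 * log10(peak * peak * samples / sse);
        return psnr > MAX_PSNR ? MAX_PSNR : psnr;
      }
    }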
Change-Id: I4854ffc4961e59d372ec8005a0d52ca46e3c4c1a
The mismatch was caused by an improper merge of cleanup code around
tokenize_b() and stuff_b() with the TX32X32 experiment.
Change-Id: I225ae62f015983751f017386548d9c988c30664c
Modifies the scanning pattern and uses a floating-point 16x16
DCT implementation for now, to handle scaling better.
Also, experiments are in progress with 2/6 and 9/7 wavelets.
Results have improved to within ~0.25% of the 32x32 DCT for std-hd
and to about 0.03% for derf. This difference can probably be bridged
by re-optimizing the entropy stats for these transforms. Currently
the stats used are shared between the 32x32 DCT and the dwt/dct.
Experiments are in progress with various scan pattern / wavelet
combinations.
Ideally the subbands should be tokenized separately, and an
experiment on that will be conducted next.
Change-Id: Ia9cbfc2d63cb7a47e562b2cd9341caf962bcc110
Gives a 0.5-0.6% improvement on derf and stdhd, and 1.1% on hd. The
old tables basically derive from the time when we had only 4x4, or
only 4x4 and 8x8, DCTs.
Note that some values are filled with 128 because, e.g., ADST only
ever occurs as Y-with-DC, as does 32x32; 16x16 only ever occurs
as Y-with-DC or as UV (as the complement of 32x32 Y); and 8x8 Y2
only ever has at most 4 coefficients. If preferred, I can add values
from other tables in their place (e.g. use the 4x4 2nd order
high-frequency probabilities for 8x8 2nd order), so that they make
at least some sense if we ever implement a larger 2nd order transform
for the 8x8 DCT (etc.); please let me know.
Change-Id: I917db356f2aff8865f528eb873c56ef43aa5ce22
Add a function clip_pixel() to clip a pixel value to the [0,255]
range of allowed values, and use it wherever appropriate (e.g.
prediction, reconstruction). Likewise, consistently use the
recently added function clip_prob(), which calculates a binary
probability in the [1,255] range. If possible, try to use
get_prob() or its sister get_binary_prob() to calculate binary
probabilities, for consistency.
Since this changes the binary probability calculations in some
places (we used {255,256} * count0 / total in a range of places,
and all of these now use (256 * count0 + (total >> 1)) / total),
this changes the encoding result, so this patch warrants some
extensive testing.
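Minimal sketches of the helpers referred to above (the definitions
in the tree may differ in detail):

    #include <stdint.h>

    static uint8_t clip_pixel(int val) {        /* clamp to [0, 255] */
      return (val > 255) ? 255 : (val < 0) ? 0 : (uint8_t)val;
    }

    static uint8_t clip_prob(int p) {           /* clamp to [1, 255] */
      return (p > 255) ? 255 : (p < 1) ? 1 : (uint8_t)p;
    }

    /* Rounded, scaled ratio, e.g. get_prob(count0, total). */
    static uint8_t get_prob(int num, int den) {
      return (den == 0) ? 128 : clip_prob((num * 256 + (den >> 1)) / den);
    }

    static uint8_t get_binary_prob(int n0, int n1) {
      return get_prob(n0, n0 + n1);
    }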
Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628
Some further changes and refactoring of the mv reference
code and of the selection of the center point for searches.
Mainly relates to not passing so many different local copies
of things around.
Some placeholder comments.
Change-Id: I309f10ffe9a9cde7663e7eae19eb594371c8d055
This commit changed the ENTROPY_CONTEXT conversion between MBs that
have different transform sizes.
In addition, this commit also includes a number of cleanups/bug fixes:
1. removed the duplicate function vp9_fix_contexts() and changed to use
vp8_reset_mb_token_contexts() for both encoder and decoder
2. fixed a bug in stuff_mb_16x16 where the wrong context was used for
the UV planes
3. changed to reset all contexts to 0 if a MB is skipped, to simplify
the logic.
Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c
In addition to allowing tests to use the RTCD-enabled functions (perhaps transitively)
without having run a full encode/decode test yet, this fixes a linking issue with
Apple's G++ whereby the Common symbols (the function pointers themselves) wouldn't
be resolved. Fixing this linking issue is the primary impetus for this patch, as none
of the tests exercise the RTCD functionality except through the main API.
Change-Id: I12aed91ca37a707e5309aa6cb9c38a649c06bc6a
Don't use vp9_decode_coefs_4x4() for 2nd order DC or luma blocks. The
code introduces some overhead which is unnecessary for these cases.
Also, remove variable declarations that are only used once, remove
magic offsets into the coefficient buffer (use xd->block[i].qcoeff
instead of xd->qcoeff + magic_offset), and fix a few Google Style
Guide violations.
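An illustrative before/after for the magic-offset cleanup (the
index and offset shown are examples only):

    /*
     *   before:  int16_t *q = xd->qcoeff + 16 * i;   // hand-computed offset
     *   after:   int16_t *q = xd->block[i].qcoeff;   // per-block pointer
     */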
Change-Id: I0ae653fd80ca7f1e4bccd87ecef95ddfff8f28b4