* changes:
Unfork ANS decode_coefs
Remove ZERO_TOKEN from the ANS tokenset
Drop costing ANS tokens from derived probabilities
Unfork ANS pack_mb_tokens
Fix a bug in the C implementation of the ihalfright32
transform, in the case that its input and output buffers are the same.
This occurs when it is called by av1_iht32x16_512_add_c.
Change-Id: I61c652e2662178520c0639a2879ae128a9c7ec3f
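
The aliasing problem described above can be illustrated with a minimal sketch. This is not the libaom code: `toy_half_transform`, `halfright_unsafe` and `halfright_safe` are hypothetical names, and the arithmetic is a placeholder for the real transform. It only shows why a half-right style transform that writes one half of the output before reading the other half of the input breaks when both pointers refer to the same buffer, and how saving the input half first avoids that.

```c
#include <stdint.h>
#include <string.h>

#define HALF 16

/* Hypothetical stand-in for the 16-point transform applied to one half. */
static void toy_half_transform(const int32_t *in, int32_t *out) {
  for (int i = 0; i < HALF; ++i) out[i] = in[i] + 1;
}

/* Unsafe when input == output: the loop overwrites input[0..15]
 * before toy_half_transform() gets to read it. */
static void halfright_unsafe(const int32_t *input, int32_t *output) {
  for (int i = 0; i < HALF; ++i) output[i] = input[HALF + i] * 4;
  toy_half_transform(input, output + HALF);
}

/* Safe for in-place use: save the half that is about to be overwritten. */
static void halfright_safe(const int32_t *input, int32_t *output) {
  int32_t saved[HALF];
  memcpy(saved, input, sizeof(saved));
  for (int i = 0; i < HALF; ++i) output[i] = input[HALF + i] * 4;
  toy_half_transform(saved, output + HALF);
}
```

With the safe variant, calling the function in place gives the same result as calling it with separate buffers; the unsafe variant does not.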
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.
- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(),
but replacing that function requires replacing the corresponding inverse
txfm at the same time.
- No obvious user level time reduction due to 32x32 TX_TYPE selection.
- Zero high 128b YMM to avoid AVX-SSE transition penalties
(fix 16x16 case).
- Added 32x32 AVX2 unit tests to verify bit-exactness.
- AVX2 optimization summary:
On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
C to AVX2: function level time reduction, ~86-89%.
SSE2 to AVX2: function level time reduction, ~51%.
Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.
This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.
Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
This mimics what's currently done in aom/master. This can be re-added
after aligning AOM's ANS with nextgenv2's ANS.
Change-Id: I3ae62181dd4803694204a234c717a86a15ca8a40
This commit renames LIBVPX_TEST_DATA_PATH to LIBAOM_TEST_DATA_PATH,
with a workaround for working with Jenkins environment variables.
Change-Id: If664ce57e25ad2af8121d1b578bf64043f0baa2a
Before this change, gm parameters were being written to the
bitstream for all frames, but only read for inter only frames,
causing a bitstream error.
Change-Id: I63b8e2fdf6358e07cc00718de04cc399809bde37
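
The write/read mismatch described above is easiest to avoid when the writer and the reader test the same condition. The following is a minimal sketch under that assumption, not the actual bitstream code: `ToyStream`, `frame_has_gm_params`, `write_frame_header` and `read_frame_header` are hypothetical names, and a byte array stands in for the real bit writer and reader.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy byte stream standing in for the real bit writer/reader. */
typedef struct {
  uint8_t data[64];
  size_t pos;
} ToyStream;

static void put_byte(ToyStream *s, uint8_t v) { s->data[s->pos++] = v; }
static uint8_t get_byte(ToyStream *s) { return s->data[s->pos++]; }

/* Single predicate shared by both sides, so they cannot drift apart. */
static int frame_has_gm_params(int is_inter_frame) { return is_inter_frame; }

static void write_frame_header(ToyStream *s, int is_inter_frame,
                               uint8_t gm_param, uint8_t next_field) {
  if (frame_has_gm_params(is_inter_frame)) put_byte(s, gm_param);
  put_byte(s, next_field);
}

/* Returns the field that follows the optional gm parameter. */
static uint8_t read_frame_header(ToyStream *s, int is_inter_frame,
                                 uint8_t *gm_param) {
  if (frame_has_gm_params(is_inter_frame)) *gm_param = get_byte(s);
  return get_byte(s);
}
```

If the writer emitted `gm_param` unconditionally while the reader skipped it on some frames, the reader would interpret the parameter as `next_field`, which is the kind of desynchronization the commit fixes.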
Adds the functionality to return the rate cost due to
coefficients without doing a full search of all modes.
This will subsequently be used in various experiments,
including the new_quant experiment, to search quantization
profiles at the superblock level without repeating the
full mode/partition search.
Change-Id: I4aad3f3f0c8b8dfdea38f8f4f094a98283f47f08
* Move clipping tests from inside to outside loops
* Pass the clipped block size as sizex and sizey to clpf_block() rather
than passing bs for both
* Make fallback tests to C more accurate
Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
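
The first two points above can be sketched as follows. This is an illustration only, not the CLPF code: `toy_filter_block` and `toy_filter_frame` are hypothetical, and the per-pixel operation is a placeholder. The idea shown is that the frame-level loop clips the block size once per block, so the per-pixel loops need no boundary tests.

```c
#include <stdint.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical per-block worker: trusts sizex/sizey, no clipping inside. */
static int toy_filter_block(uint8_t *frame, int stride, int x0, int y0,
                            int sizex, int sizey) {
  int touched = 0;
  for (int y = y0; y < y0 + sizey; ++y) {
    for (int x = x0; x < x0 + sizex; ++x) {
      frame[y * stride + x] ^= 1; /* placeholder for the real filter */
      ++touched;
    }
  }
  return touched;
}

/* Clipping is done here, once per block, outside the pixel loops. */
static int toy_filter_frame(uint8_t *frame, int width, int height,
                            int stride, int bs) {
  int touched = 0;
  for (int y0 = 0; y0 < height; y0 += bs) {
    const int sizey = min_int(bs, height - y0);
    for (int x0 = 0; x0 < width; x0 += bs) {
      const int sizex = min_int(bs, width - x0);
      touched += toy_filter_block(frame, stride, x0, y0, sizex, sizey);
    }
  }
  return touched;
}
```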
When CLPF was extended to chroma, the chroma RDO accidentally
discarded the optimal block size found in the luma RDO.
PSNR YCbCr: -0.25% 0.05% 0.06%
PSNRHVS: -0.19%
SSIM: -0.36%
MSSSIM: -0.23%
Conflicts:
av1/common/clpf.c
Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030
Previously, only the motion vectors were being stored. This caused
a mismatch in the global motion experiment, which needs this
mode information to decide whether or not to use the gm parameters
in reconstruction.
Change-Id: I58cde750ec06587dbfb8d65b07c15a67b7d6b1f6
Rename av1_write_tree() to aom_write_tree() and move it into bitwriter.h
to match aom_read_tree() in bitreader.h.
Manually cherry-picked from aom/master:
33a143fa7ac42d62080bfc20468cb76ad26045db
Change-Id: I6c686cdd3e0f179d7e95c5bc6984558b62d46d67
av1_clpf_frame() was always called with the same src and dst,
so we only need one argument and the code supporting different
src and dst was removed.
Change-Id: I70919f50e5cfb19c22eb4dff9ee7c0fa2697fad3