This commit ports the following from aom/master:
4c46278 Add aom_reader_tell() support.
b9c9935 Remove an erroneous declaration.
56c9c3b Fix ANS build.
Change-Id: I59bd910f58c218c649a1de2a7b5fae0397e13cb1
Move computing the partition_cdf tables per symbol to
computing them only when the probabilities are updated.
Change-Id: I442f9230ba00be7f5d0558d7c38d7324ad009ee8
Move computing the inter_ext_tx_cdf tables per symbol to
computing them only when the probabilities are updated.
Change-Id: I5e1e62f8eae8f6b2edbbd378beeb786649502c10
Move computing the intra_ext_tx_cdf tables per symbol to
computing them only when the probabilities are updated.
Change-Id: I26d5e419e103093e98a7d896c196176305b50fc9
Move from computing the switchable_interp_cdf per symbol to
computing once per frame when the probabilities are adapted.
Change-Id: I6571126239f0327e22bb09ee8bad94114291683e
When building with --enable-daala_ec, calls to aom_write() and aom_read()
use the daala entropy coder to write and read bits.
When the probability is exactly 0.5 (128), then raw bits are used.
ntt-short-1:
MEDIUM (%) HIGH (%)
PSNR -0.027556 -0.020114
PSNRHVS -0.027401 -0.020169
SSIM -0.027587 -0.020151
FASTSSIM -0.027592 -0.020102
subset1:
RATE (%) DSNR (dB)
PSNR 0.03296 -0.00210
PSNRHVS 0.03537 -0.00281
SSIM 0.03299 -0.00161
FASTSSIM 0.03458 -0.00111
Change-Id: I48ad8eb40fc895d62d6e241ea8abc02820d573f7
This flag was already added to aomedia/master, so bringing it back to
webm/nextgenv2, as part of an effort to get the two codebases in sync.
Change-Id: I2b933a6a160e4210d1411a9e7978149eb8553205
This reverts commit 9b25f3067485b32442e13964df098903736c3fd8 to
reinstate the reverted commit with fixes that solved the build issues
when --enalbe-clpf is used in configure.
Change-Id: I15447cae7fa9b3deb27976345dc3db230a4a7a60
These signals were in the uncompressed frame header (as a temporary
hack), which caused two problems:
* We don't want that header to be duplicated in the slice header
* It was necessary to signal the number of bits to transmit up front
However, the filter size can be 128x128 which is greater than the SB
size, and a decoder wouldn't be able to know whether to read a bit or
not until the final SB of that 128x128 block has been decoded
(depending on whether the 128x128 is all skip or not). Therefore the
signalling was changed for 128x128 blocks so that every top left SB of
a 128x128 filter block contains a signal regardless of whether the
block is all skip or not. Also, all the MB's of 128x128 block are
filtered even if they are skip MB's. This gives the signal a purpose
even when the 128x128 block is all skip, and it also gives a slight
coding gain as it leaves a way to filter skip blocks, which was
previously forbidden.
Low latency:
PSNR YCbCr: -0.19% -0.14% -0.06%
PSNRHVS: -0.15%
SSIM: -0.13%
MSSSIM: -0.15%
CIEDE2000: -0.19%
High latency:
PSNR YCbCr: -0.03% -0.01% -0.09%
PSNRHVS: 0.04%
SSIM: 0.00%
MSSSIM: 0.02%
CIEDE2000: -0.02%
Change-Id: I69ba7144d07d388b4f0968f6a53558f480979171
To get ready for pulling AV1 to nextgenv2
Replace the experimental flag by MOTION_VAR. Rename major variables.
Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
* changes:
Deringing cleanup: remove DERING_REFINEMENT (always on now)
Don't run the deringing filter on skipped blocks within a superblock
Don't dering skipped superblocks
On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
* changes:
Remove custom rans types
Remove add_token_no_extra.
Remove unused aom_rans_build_cdf_from_pdf
Add the tool used to generate the constrained tokenset.
Remove the starting zero from ANS CDFs.
Import the aom_read/write_symbol abstractions from aom/master
(cherry picked from aom/master commit 11206c60d930be9d29100567aa67f2a65463852a)
Includes renames in a bunch of places not handled by the original
due to differing tree states.
Change-Id: Ic74d9d8850b8c80a51e55e425bbf472a67e2653f
This brings it in line with the Daala CDFs and will make it easier to
share code.
Change-Id: Idfd2d2b33c3b9b2c4e72ce72fb3d8039013448b9
(cherry picked from aom/master commit af98507ca928afe33e9f88fdd2ca168379528d6a)
Fix a bug in the C implementation of the ihalfright32
transform, in the case that its input and output buffers are the same.
This occurs when it is called by av1_iht32x16_512_add_c.
Change-Id: I61c652e2662178520c0639a2879ae128a9c7ec3f
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.
- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2()
But function replacement must go with the corresponding inverse txfm.
- No obvious user level time reduction due to 32x32 TX_TYPE selection.
- Zero high 128b YMM to avoid AVX-SSE transition penalties
(fix 16x16 case).
- Added 32x32 AVX2 unit tests to verify bitexact.
- AVX2 optimization summary:
On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
C to AVX2: function level time reduction, ~86-89%.
SSE2 to AVX2: function level time reduction, ~51%.
Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.
This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.
Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
* Move clipping tests from inside to outside loops
* Let sizex and sizey to clpf_block() be the clipped block size rather
than both just bs
* Make fallback tests to C more accurate
Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
When CLPF was extended to chroma, the chroma RDO accidentally
discarded the optimal block size found in the luma RDO.
PSNR YCbCr: -0.25% 0.05% 0.06%
PSNRHVS: -0.19%
SSIM: -0.36%
MSSSIM: -0.23%
Conflicts:
av1/common/clpf.c
Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030