472 Commits

Author SHA1 Message Date
Nathan E. Egge
56eeaa5daf Rename aom_write_tree_cdf() to aom_write_symbol().
Change-Id: I7c088c55f1c461063976d5bd84ff2026c4f3bc69
2016-10-17 11:54:51 -07:00
Yushin Cho
09de28b4f7 Bug fix in super_block_uvrd().
In super_block_uvrd(), if is_cost_valid == 0, all return parameters,
i.e. rate, distortion, skippable, and sse, are reset.
So txfm_rd_in_plane() should not be called if is_cost_valid == 0.
The bug also causes av1_xform_quant() to see an invalid diff signal,
since av1_subtract_plane() is not called in super_block_uvrd().

Change-Id: Iaa06061e2e9aa8876b4611a54f4ae6b8d499332b
2016-10-17 11:25:13 -07:00
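The guard described in the super_block_uvrd() entry above can be illustrated with a minimal sketch. This is not the libaom code: `RdResult`, `search_plane()`, `reset_result()`, and `chroma_rd()` are hypothetical names, and the per-plane cost values are invented. It only shows the shape of the fix, which is to skip the per-plane search entirely once the cost is known to be invalid.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical result record; the field names mirror the commit message. */
typedef struct {
  int rate;
  int64_t dist;
  int64_t sse;
  int skippable;
} RdResult;

static int plane_search_calls = 0; /* counts calls to the stand-in below */

/* Stand-in for an expensive per-plane transform search (hypothetical). */
static void search_plane(int plane, RdResult *r) {
  ++plane_search_calls;
  r->rate += 10 * plane;
  r->dist += 100;
  r->sse += 200;
}

/* Mark every output as invalid. */
static void reset_result(RdResult *r) {
  r->rate = INT_MAX;
  r->dist = INT64_MAX;
  r->sse = INT64_MAX;
  r->skippable = 0;
}

/* Returns 1 if the result is valid, 0 otherwise. */
static int chroma_rd(int is_cost_valid, RdResult *r) {
  if (!is_cost_valid) {
    /* The outputs are reset anyway, so searching the planes would be wasted
       work on inputs that were never prepared. */
    reset_result(r);
    return 0;
  }
  r->rate = 0;
  r->dist = 0;
  r->sse = 0;
  r->skippable = 1;
  for (int plane = 1; plane <= 2; ++plane) search_plane(plane, r);
  return 1;
}
```

With the early return in place, the invalid path never touches the per-plane search, which in this sketch is visible through `plane_search_calls`.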
Nathan E. Egge
fba2be692f Update partition_cdf per frame.
Instead of computing the partition_cdf tables per symbol,
compute them only when the probabilities are updated.

Change-Id: I442f9230ba00be7f5d0558d7c38d7324ad009ee8
2016-10-17 10:21:06 -07:00
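The caching idea behind the per-frame CDF entries in this log can be sketched as follows. This is an illustration under stated assumptions, not the libaom implementation: `SymbolContext`, `update_cdf()`, `symbol_range_low()`, the alphabet size, and the 15-bit probability scale are all hypothetical. The point is that the cumulative table is rebuilt only when the probabilities change, while the per-symbol path just reads it.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SYMBOLS 4   /* hypothetical alphabet size */
#define CDF_TOTAL 32768 /* hypothetical 15-bit probability scale */

typedef struct {
  uint16_t pdf[NUM_SYMBOLS]; /* per-symbol probabilities, summing to CDF_TOTAL */
  uint16_t cdf[NUM_SYMBOLS]; /* cached cumulative table */
} SymbolContext;

/* Rebuild the cached CDF. Call this only when pdf[] changes,
   for example once per frame after probability adaptation. */
static void update_cdf(SymbolContext *ctx) {
  uint32_t sum = 0;
  for (int i = 0; i < NUM_SYMBOLS; ++i) {
    sum += ctx->pdf[i];
    ctx->cdf[i] = (uint16_t)sum;
  }
  assert(sum == CDF_TOTAL);
}

/* Per-symbol path: reads the cached table, does no recomputation. */
static uint16_t symbol_range_low(const SymbolContext *ctx, int symbol) {
  return symbol == 0 ? 0 : ctx->cdf[symbol - 1];
}
```

Coding a symbol then costs a table lookup, and the loop in `update_cdf()` runs once per probability update rather than once per symbol.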
Nathan E. Egge
93878c4243 Update inter_ext_tx_cdf per frame.
Instead of computing the inter_ext_tx_cdf tables per symbol,
compute them only when the probabilities are updated.

Change-Id: I5e1e62f8eae8f6b2edbbd378beeb786649502c10
2016-10-17 10:20:53 -07:00
Nathan E. Egge
7c5b4c1665 Update intra_ext_tx_cdf per frame.
Instead of computing the intra_ext_tx_cdf tables per symbol,
compute them only when the probabilities are updated.

Change-Id: I26d5e419e103093e98a7d896c196176305b50fc9
2016-10-17 08:47:02 -07:00
Nathan E. Egge
4947c296f7 Update switchable_interp_cdf once per frame.
Instead of computing the switchable_interp_cdf per symbol,
compute it once per frame when the probabilities are adapted.

Change-Id: I6571126239f0327e22bb09ee8bad94114291683e
2016-10-17 08:44:57 -07:00
Yaowu Xu
73d702db7f Merge changes I339d0389,I2fa1e87a,If79fa5ae,Icb1a8cb8,Ic76de4a4, ... into nextgenv2
* changes:
  Add missing CONFIG_DAALA_EC declaration.
  Add API for writing trees using a CDF.
  Add macro to build a simple cdf table.
  Use Daala entropy coder to code trees.
  Silence clang-format code review warning.
  Use Daala entropy coder to code bits.
  Clear existing format issue in the codebase
  Add Daala entropy coder.
2016-10-14 23:42:22 +00:00
Yi Luo
1dec26e004 Merge "Zero high 128b YMM registers to avoid SSE-AVX transition penalties" into nextgenv2
2016-10-14 23:13:10 +00:00
Nathan E. Egge
8043cc4018 Use Daala entropy coder to code bits.
When building with --enable-daala_ec, calls to aom_write() and aom_read()
use the Daala entropy coder to write and read bits.
When the probability is exactly 0.5 (128), raw bits are used.

ntt-short-1:

          MEDIUM (%) HIGH (%)
    PSNR -0.027556  -0.020114
 PSNRHVS -0.027401  -0.020169
    SSIM -0.027587  -0.020151
FASTSSIM -0.027592  -0.020102

subset1:

         RATE (%)  DSNR (dB)
    PSNR 0.03296  -0.00210
 PSNRHVS 0.03537  -0.00281
    SSIM 0.03299  -0.00161
FASTSSIM 0.03458  -0.00111

Change-Id: I48ad8eb40fc895d62d6e241ea8abc02820d573f7
2016-10-14 14:59:27 -07:00
Urvang Joshi
74114a3a1e Bugfix: fix the build for CONFIG_FP_MB_STATS
Cherry-picked from aomedia/master: bf6c636

Change-Id: Iea3fb46d23cb94d1152de3a7a40b6a183e78b4d7
2016-10-14 13:42:53 -07:00
Urvang Joshi
b100db7c1d Wrap palette code inside CONFIG_PALETTE flag.
This flag was already added to aomedia/master, so bringing it back to
webm/nextgenv2, as part of an effort to get the two codebases in sync.

Change-Id: I2b933a6a160e4210d1411a9e7978149eb8553205
2016-10-14 13:42:02 -07:00
Yi Luo
e9fde265f7 Zero high 128b YMM registers to avoid SSE-AVX transition penalties
Documents:
- https://software.intel.com/en-us/articles/intel-avx-state-transitions-migrating-sse-code-to-avx
- https://software.intel.com/sites/default/files/m/d/4/1/d/8/11MC12_Avoiding_2BAVX-SSE_2BTransition_2BPenalties_2Brh_2Bfinal.pdf

Change-Id: I90f85fcb15a7a2c49ee068300be6ffe9c68d371c
2016-10-14 12:22:35 -07:00
Yaowu Xu
d71be7815d Revert "Revert "Move CLPF block signals from frame to SB level.""
This reverts commit 9b25f3067485b32442e13964df098903736c3fd8 to
reinstate the reverted commit with fixes that solved the build issues
when --enable-clpf is used in configure.

Change-Id: I15447cae7fa9b3deb27976345dc3db230a4a7a60
2016-10-14 08:58:49 -07:00
Yaowu Xu
4b71775307 Merge "Revert "Move CLPF block signals from frame to SB level."" into nextgenv2
2016-10-14 15:39:36 +00:00
Yaowu Xu
9b25f30674 Revert "Move CLPF block signals from frame to SB level."
This reverts commit 975350387ce0b55bf5af8cb944f6a242b72251ff.

Change-Id: I9f8e891739352ca2bde4b294e37c85a668f416e0
2016-10-14 15:39:03 +00:00
Debargha Mukherjee
a720f4b3b5 Merge "Add sse2 forward and inverse 16x32 and 32x16 transforms" into nextgenv2
2016-10-14 02:49:20 +00:00
Yue Chen
a48764d05f Merge "Renamings for OBMC experiment" into nextgenv2
2016-10-14 01:33:00 +00:00
Steinar Midtskogen
975350387c Move CLPF block signals from frame to SB level.
These signals were in the uncompressed frame header (as a temporary
hack), which caused two problems:

* We don't want that header to be duplicated in the slice header
* It was necessary to signal the number of bits to transmit up front

However, the filter size can be 128x128 which is greater than the SB
size, and a decoder wouldn't be able to know whether to read a bit or
not until the final SB of that 128x128 block has been decoded
(depending on whether the 128x128 is all skip or not).  Therefore the
signalling was changed for 128x128 blocks so that every top left SB of
a 128x128 filter block contains a signal regardless of whether the
block is all skip or not.  Also, all the MBs of a 128x128 block are
filtered even if they are skip MBs.  This gives the signal a purpose
even when the 128x128 block is all skip, and it also gives a slight
coding gain as it leaves a way to filter skip blocks, which was
previously forbidden.

Low latency:
PSNR YCbCr:     -0.19%     -0.14%     -0.06%
   PSNRHVS:     -0.15%
      SSIM:     -0.13%
    MSSSIM:     -0.15%
 CIEDE2000:     -0.19%

High latency:
PSNR YCbCr:     -0.03%     -0.01%     -0.09%
   PSNRHVS:      0.04%
      SSIM:      0.00%
    MSSSIM:      0.02%
 CIEDE2000:     -0.02%

Change-Id: I69ba7144d07d388b4f0968f6a53558f480979171
2016-10-13 16:06:10 -07:00
Yue Chen
cb60b185c7 Renamings for OBMC experiment
To get ready for pulling AV1 to nextgenv2
Replace the experimental flag by MOTION_VAR. Rename major variables.

Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
2016-10-13 15:51:22 -07:00
Jean-Marc Valin
209f830d97 Fix deringing level choice for 10-bit and 12-bit
Making sure we never exceed a base level of 63

Change-Id: I821254b8d970446bd40fdd6e4d7073c69760a86d
2016-10-13 18:27:17 +00:00
Yaowu Xu
98e9ce923b Merge "Add SSE4.1 code for deringing functions." into nextgenv2
2016-10-13 18:02:59 +00:00
Michael Bebenita
7227b65c4c Add SSE4.1 code for deringing functions.
Change-Id: I363f7fb610a5c86ea9f417e34b57c6373af877e5
2016-10-13 18:02:19 +00:00
Yaowu Xu
fd44e24541 Merge "Removing Daala-specific deringing code" into nextgenv2
2016-10-13 18:01:11 +00:00
Zoe Liu
12cbaac759 Merge "Clean code a bit and fix a couple of small bugs in ext-refs" into nextgenv2
2016-10-13 16:47:03 +00:00
Yaowu Xu
9ffdf48c5a Merge "Use a quantizer-based threshold rather than full search for deringing" into nextgenv2
2016-10-13 16:35:08 +00:00
Yaowu Xu
8ac419f307 Merge changes Ic3a68557,Ib1dbe41a,I0da09270,Ibdbd720d into nextgenv2
* changes:
  Deringing cleanup: remove DERING_REFINEMENT (always on now)
  Don't run the deringing filter on skipped blocks within a superblock
  Don't dering skipped superblocks
  On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
2016-10-13 15:54:32 +00:00
Zoe Liu
f0e4669edb Clean code a bit and fix a couple of small bugs in ext-refs
Currently the patch does not have any impact on the RD performance. The
fix could however potentially help on the next step of work, especially
when the extra altref frames allow non-zero temporal filtering strength
and their corresponding OVERLAY frames, i.e. the INTNL_OVERLAY frames
are being added.

Change-Id: I2e07fb3d0aa547a0b5dd05bb4ba865cd46309076
2016-10-13 08:42:51 -07:00
Alex Converse
fc4980edb7 Merge changes Ic74d9d88,Ie93b474e,I544989ea,Ic273f7d9,Idfd2d2b3, ... into nextgenv2
* changes:
  Remove custom rans types
  Remove add_token_no_extra.
  Remove unused aom_rans_build_cdf_from_pdf
  Add the tool used to generate the constrained tokenset.
  Remove the starting zero from ANS CDFs.
  Import the aom_read/write_symbol abstractions from aom/master
2016-10-13 14:03:15 +00:00
David Barker
33231d4801 Add sse2 forward and inverse 16x32 and 32x16 transforms
Change-Id: I1241257430f1e08ead1ce0f31db8272b50783102
2016-10-13 14:01:22 +01:00
Alex Converse
9ed1a2ff44 Remove custom rans types
(cherry picked from aom/master commit 11206c60d930be9d29100567aa67f2a65463852a)

Includes renames in a bunch of places not handled by the original
due to differing tree states.

Change-Id: Ic74d9d8850b8c80a51e55e425bbf472a67e2653f
2016-10-13 05:53:58 +00:00
Jean-Marc Valin
2c616e61e0 Removing Daala-specific deringing code
No point in keeping them in sync now that all the code is reformatted

Change-Id: I8a062253ed6a5f86028cd5a2a922b3c760def6fb
2016-10-12 18:16:23 -07:00
Jean-Marc Valin
6d5a7a924b Use a quantizer-based threshold rather than full search for deringing
objective-1-short results (with deringing enabled):
PSNR YCbCr:      0.08%      0.03%      0.11%
   PSNRHVS:      0.06%
      SSIM:      0.12%
    MSSSIM:      0.08%
 CIEDE2000:      0.05%

Change-Id: Ifcfc42c14c33650dcf879c4d0ddd8688d4d07da1
2016-10-12 18:16:07 -07:00
Alex Converse
4ce69de9a6 Remove add_token_no_extra.
It was a fairly small production optimization for VP9.

Change-Id: Ie93b474ea5b7e63384a7c0b3a56b135462d1471b
(cherry picked from aom/master commit df9bb76b1330de42fe13827df4c72010adb51429)
2016-10-12 17:44:28 -07:00
Alex Converse
a1ac972867 Import the aom_read/write_symbol abstractions from aom/master
Change-Id: I0b255c05108c3b97e74df1b59c34111c9e9a5770
2016-10-12 17:41:01 -07:00
Jean-Marc Valin
e874ce0300 Deringing cleanup: remove DERING_REFINEMENT (always on now)
Change-Id: Ic3a6855799be010e69aeab924b013679282ab191
2016-10-12 17:13:09 -07:00
Jean-Marc Valin
56b0c3c51b Don't dering skipped superblocks
No change in metrics

Change-Id: I0da09270d78c3caf78a32a3157f02c87f2232e3e
2016-10-12 17:12:10 -07:00
Yi Luo
e01484e412 Merge "Hybrid forward transform 32x32 AVX2 optimization" into nextgenv2
2016-10-13 00:08:48 +00:00
Alex Converse
91e4e604bd Merge changes I3ca2b674,I78afc587,I3ae62181,I5ed91556 into nextgenv2
* changes:
  Unfork ANS decode_coefs
  Remove ZERO_TOKEN from the ANS tokenset
  Drop costing ANS tokens from derived probabilities
  Unfork ANS pack_mb_tokens
2016-10-12 22:25:27 +00:00
Yi Luo
fed8e1c06d Hybrid forward transform 32x32 AVX2 optimization
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.

- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2()
  But function replacement must go with the corresponding inverse txfm.

- No obvious user level time reduction due to 32x32 TX_TYPE selection.

- Zero high 128b YMM to avoid AVX-SSE transition penalties
  (fix 16x16 case).

- Added 32x32 AVX2 unit tests to verify bit-exactness.

- AVX2 optimization summary:
  On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
  C to AVX2: function level time reduction, ~86-89%.
  SSE2 to AVX2: function level time reduction, ~51%.

Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
2016-10-12 14:19:53 -07:00
Hui Su
933bf08cfb Merge "Send allow_screen_content flag for both key and intra only frames" into nextgenv2
2016-10-12 21:13:24 +00:00
Debargha Mukherjee
4282b6bbbb Merge "Refactor expand dry_run types to return coef rate" into nextgenv2
2016-10-12 21:06:41 +00:00
Alex Converse
ea7e990fd4 Remove ZERO_TOKEN from the ANS tokenset
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.

This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.

Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
2016-10-12 13:15:08 -07:00
Alex Converse
ccf472bc05 Drop costing ANS tokens from derived probabilities
This mimics what's currently done in aom/master. This can be re-added
after aligning AOM's ANS with nextgenv2's ANS.

Change-Id: I3ae62181dd4803694204a234c717a86a15ca8a40
2016-10-12 13:14:21 -07:00
Alex Converse
dc62b0925d Unfork ANS pack_mb_tokens
This is less code and more like what we have in aom/master.

Change-Id: I5ed915563cbfbc6281113c1eb31455f50710ba9f
2016-10-12 13:09:13 -07:00
hui su
24f7b07f2e Send allow_screen_content flag for both key and intra only frames
BUG=webm:1311

Change-Id: I03c1043d17ed4e4ea22002473779a9612884c6c6
2016-10-12 11:45:05 -07:00
Yaowu Xu
732c188523 Merge "LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH" into nextgenv2
2016-10-12 17:56:26 +00:00
Sarah Parker
d2b1fe4a1f Merge "Fix inconsistency in gm parameter write to bitstream" into nextgenv2
2016-10-12 17:32:21 +00:00
Yaowu Xu
97aa09f658 LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH
This commit renames LIBVPX_TEST_DATA_PATH to LIBAOM_TEST_DATA_PATH,
with a workaround for Jenkins environment variables.

Change-Id: If664ce57e25ad2af8121d1b578bf64043f0baa2a
2016-10-12 08:26:44 -07:00
Sarah Parker
689b0caea7 Fix inconsistency in gm parameter write to bitstream
Before this change, gm parameters were being written to the
bitstream for all frames, but only read for inter only frames,
causing a bitstream error.

Change-Id: I63b8e2fdf6358e07cc00718de04cc399809bde37
2016-10-11 19:35:26 -07:00
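The mismatch described in the gm parameter entry above comes from the writer and reader using different conditions. A minimal sketch of one way to keep them in step is to route both through a single predicate. Everything here is hypothetical and not taken from libaom: `ByteStream`, `frame_has_gm_params()`, `write_frame()`, and `read_frame()` are illustrative names, and one byte stands in for the real parameter set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-oriented stream, standing in for a real bit writer. */
typedef struct {
  uint8_t data[16];
  size_t pos;
} ByteStream;

/* Single source of truth shared by writer and reader (hypothetical rule:
   only inter frames carry global motion parameters). */
static bool frame_has_gm_params(bool is_inter_frame) { return is_inter_frame; }

static void write_frame(ByteStream *bs, bool is_inter_frame, uint8_t gm_param,
                        uint8_t payload) {
  if (frame_has_gm_params(is_inter_frame)) bs->data[bs->pos++] = gm_param;
  bs->data[bs->pos++] = payload;
}

/* Returns the payload; *gm_param is written only when the frame carries one. */
static uint8_t read_frame(ByteStream *bs, bool is_inter_frame,
                          uint8_t *gm_param) {
  if (frame_has_gm_params(is_inter_frame)) *gm_param = bs->data[bs->pos++];
  return bs->data[bs->pos++];
}
```

If the writer emitted the parameter unconditionally while the reader kept the check, the reader would take the parameter byte for the payload on non-inter frames, which is the kind of desynchronization the commit fixes.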
Debargha Mukherjee
ceebb70197 Refactor expand dry_run types to return coef rate
Adds the functionality to return the rate cost due to
coefficients without doing full search of all modes.
This will be subsequently used in various experiments,
including in new_quant experiment to search quantization
profiles at the superblock level without repeating the
full mode/partition search.

Change-Id: I4aad3f3f0c8b8dfdea38f8f4f094a98283f47f08
2016-10-11 14:55:26 -07:00