18311 Commits

Author SHA1 Message Date
Yi Luo
761ae880d7 Delete some redundant function declarations in aom_dsp_rtcd_defs.pl
Change-Id: I4df57a7faba5800c048b2dc469ec31545406f55c
2016-10-13 17:53:45 -07:00
Steinar Midtskogen
975350387c Move CLPF block signals from frame to SB level.
These signals were in the uncompressed frame header (as a temporary
hack), which caused two problems:

* We don't want that header to be duplicated in the slice header
* It was necessary to signal the number of bits to transmit up front

However, the filter size can be 128x128 which is greater than the SB
size, and a decoder wouldn't be able to know whether to read a bit or
not until the final SB of that 128x128 block has been decoded
(depending on whether the 128x128 is all skip or not).  Therefore the
signalling was changed for 128x128 blocks so that every top left SB of
a 128x128 filter block contains a signal regardless of whether the
block is all skip or not.  Also, all the MB's of 128x128 block are
filtered even if they are skip MB's.  This gives the signal a purpose
even when the 128x128 block is all skip, and it also gives a slight
coding gain as it leaves a way to filter skip blocks, which was
previously forbidden.

Low latency:
PSNR YCbCr:     -0.19%     -0.14%     -0.06%
   PSNRHVS:     -0.15%
      SSIM:     -0.13%
    MSSSIM:     -0.15%
 CIEDE2000:     -0.19%

High latency:
PSNR YCbCr:     -0.03%     -0.01%     -0.09%
   PSNRHVS:      0.04%
      SSIM:      0.00%
    MSSSIM:      0.02%
 CIEDE2000:     -0.02%

Change-Id: I69ba7144d07d388b4f0968f6a53558f480979171
2016-10-13 16:06:10 -07:00
Yue Chen
cb60b185c7 Renamings for OBMC experiment
To get ready for pulling AV1 to nextgenv2
Replace the experimental flag by MOTION_VAR. Rename major variables.

Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
2016-10-13 15:51:22 -07:00
Steinar Midtskogen
2d5f752ae9 Don't use _mm_cvtsi128_si64 on 32 bit systems
Change-Id: I332afb8d9e35cd60f05915160a5b2e1dc8757de5
2016-10-13 14:35:00 -07:00
Yaowu Xu
410fee8de6 Fix formatting in a few files
Change-Id: Ia5175afe82b142d9e18c01c546610202c630588e
2016-10-13 13:04:29 -07:00
Jean-Marc Valin
a8ce2c9199 Removing some useless loops in deringing filter
No change in the output

Change-Id: I1627feaa163d65da0df90e9dacbc5e39ee755de8
2016-10-13 18:27:25 +00:00
Jean-Marc Valin
209f830d97 Fix deringing level choice for 10-bit and 12-bit
Making sure we never exceed a base level of 63

Change-Id: I821254b8d970446bd40fdd6e4d7073c69760a86d
2016-10-13 18:27:17 +00:00
Jean-Marc Valin
3cfec90d33 Don't dering superblocks that have deringing disabled
Doesn't change the output, but avoids useless deringing with threshold=0

Change-Id: I69f3e54abad2d2493cfbc76c188ad7d190f0aeff
2016-10-13 18:27:03 +00:00
Yaowu Xu
98e9ce923b Merge "Add SSE4.1 code for deringing functions." into nextgenv2 2016-10-13 18:02:59 +00:00
Michael Bebenita
7227b65c4c Add SSE4.1 code for deringing functions.
Change-Id: I363f7fb610a5c86ea9f417e34b57c6373af877e5
2016-10-13 18:02:19 +00:00
Yaowu Xu
3feb89170b Merge "Simpler threshold calculation for the second filter" into nextgenv2 2016-10-13 18:01:45 +00:00
Yaowu Xu
5d2f01284f Merge "Make 4x4 deringing (chroma) use shorter filters" into nextgenv2 2016-10-13 18:01:23 +00:00
Yaowu Xu
fd44e24541 Merge "Removing Daala-specific deringing code" into nextgenv2 2016-10-13 18:01:11 +00:00
Zoe Liu
12cbaac759 Merge "Clean code a bit and fix a couple of small bugs in ext-refs" into nextgenv2 2016-10-13 16:47:03 +00:00
Yaowu Xu
9ffdf48c5a Merge "Use a quantizer-based threshold rather than full search for deringing" into nextgenv2 2016-10-13 16:35:08 +00:00
Yaowu Xu
8ac419f307 Merge changes Ic3a68557,Ib1dbe41a,I0da09270,Ibdbd720d into nextgenv2
* changes:
  Deringing cleanup: remove DERING_REFINEMENT (always on now)
  Don't run the deringing filter on skipped blocks within a superblock
  Don't dering skipped superblocks
  On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
2016-10-13 15:54:32 +00:00
Zoe Liu
f0e4669edb Clean code a bit and fix a couple of small bugs in ext-refs
Currently the patch does not have any impact on the RD performance. The
fix could however potentially help on the next step of work, especially
when the extra altref frames allow non-zero temporal filtering strength
and their corresponding OVERLAY frames, i.e. the INTNL_OVERLAY frames
are being added.

Change-Id: I2e07fb3d0aa547a0b5dd05bb4ba865cd46309076
2016-10-13 08:42:51 -07:00
Yaowu Xu
89d3f2fd10 Merge "Sync 2x2 intra predictors" into nextgenv2 2016-10-13 15:20:52 +00:00
David Barker
4f803efac1 Simplify 8x16 and 16x8 inverse transform tests
Change-Id: Ie86aedfb1f3e0d9c0cf58d7183861a0ed0e8ccc8
2016-10-13 16:02:59 +01:00
David Barker
7825022daa Enable test system to detect transforms misusing 'stride' parameter
This would have caught the bug introduced in patch set 1 of
https://chromium-review.googlesource.com/#/c/397378/

Change-Id: I9c6d5d9c4c98aed5ac48c4fb1c4ff4131b0df1d5
2016-10-13 15:50:44 +01:00
Alex Converse
cba3d1f1c3 AnsTest: Replace the dummy distribution
Use constrained token table row 65/256 instead.

Change-Id: I8b442d4c82af8fa9d36ac2de0d73179ed040478d
(cherry picked from commit 47eb9a2ca46821b468903514cd34eaaca2533d45)
2016-10-13 07:04:55 -07:00
Alex Converse
fc4980edb7 Merge changes Ic74d9d88,Ie93b474e,I544989ea,Ic273f7d9,Idfd2d2b3, ... into nextgenv2
* changes:
  Remove custom rans types
  Remove add_token_no_extra.
  Remove unused aom_rans_build_cdf_from_pdf
  Add the tool used to generate the constrained tokenset.
  Remove the starting zero from ANS CDFs.
  Import the aom_read/write_symbol abstractions from aom/master
2016-10-13 14:03:15 +00:00
David Barker
33231d4801 Add sse2 forward and inverse 16x32 and 32x16 transforms
Change-Id: I1241257430f1e08ead1ce0f31db8272b50783102
2016-10-13 14:01:22 +01:00
Debargha Mukherjee
cad8283e55 Merge "Fix a bug in inverse halfright 32x32 transform" into nextgenv2 2016-10-13 08:16:47 +00:00
Alex Converse
9ed1a2ff44 Remove custom rans types
(cherry picked from aom/master commit 11206c60d930be9d29100567aa67f2a65463852a)

Includes renames in a bunch of places not handled by the original
due to differing tree states.

Change-Id: Ic74d9d8850b8c80a51e55e425bbf472a67e2653f
2016-10-13 05:53:58 +00:00
Jingning Han
e3954d8312 Sync 2x2 intra predictors
Add 2x2 DC, V, H, TM intra predictors.

Change-Id: I2a614adde553f821c45bc5a9bf09800a9f0aaa26
2016-10-12 21:04:01 -07:00
Jean-Marc Valin
4713d8d019 Simpler threshold calculation for the second filter
PSNR YCbCr:      0.03%     -0.00%      0.07%
   PSNRHVS:      0.06%
      SSIM:      0.12%
    MSSSIM:      0.09%
 CIEDE2000:      0.05%

Change-Id: I15ef9598a08f6713bc28ab98b0182310433e97ef
2016-10-12 18:17:10 -07:00
Jean-Marc Valin
ea64c342b7 Make 4x4 deringing (chroma) use shorter filters
Avoids blurring chroma for 4:2:0

PSNR YCbCr:      0.03%     -0.31%     -0.29%
   PSNRHVS:      0.02%
      SSIM:      0.03%
    MSSSIM:      0.02%
 CIEDE2000:      0.01%

Change-Id: If744fb902b5f24404479def22b9ca8a19baec722
2016-10-12 18:16:54 -07:00
Jean-Marc Valin
2c616e61e0 Removing Daala-specific deringing code
No point in keeping them in sync now that all the code is reformatted

Change-Id: I8a062253ed6a5f86028cd5a2a922b3c760def6fb
2016-10-12 18:16:23 -07:00
Jean-Marc Valin
6d5a7a924b Use a quantizer-based threshold rather than full search for deringing
objective-1-short results (with deringing enabled):
PSNR YCbCr:      0.08%      0.03%      0.11%
   PSNRHVS:      0.06%
      SSIM:      0.12%
    MSSSIM:      0.08%
 CIEDE2000:      0.05%

Change-Id: Ifcfc42c14c33650dcf879c4d0ddd8688d4d07da1
2016-10-12 18:16:07 -07:00
Alex Converse
4ce69de9a6 Remove add_token_no_extra.
It was a fairly small production optimization for VP9.

Change-Id: Ie93b474ea5b7e63384a7c0b3a56b135462d1471b
(cherry picked from aom/master commit df9bb76b1330de42fe13827df4c72010adb51429)
2016-10-12 17:44:28 -07:00
Alex Converse
d5b9c730ad Remove unused aom_rans_build_cdf_from_pdf
Change-Id: I544989eae45b7dda04250365c3de99f50110a76b
(cherry picked from aom/master commit 06cce842caa5212826d51c2a317de0bdfae74349)
2016-10-12 17:44:14 -07:00
Alex Converse
dacf45facd Add the tool used to generate the constrained tokenset.
The code that generates the raw distribution is based on a MATLAB
program by Debargha Mukherjee, and the algorithm used to quantize the
distribution comes from the ANS Toolkit by Jarek Duda.

Change-Id: Ic273f7d9e43e3ecd999e9e7e04cde57e8559375a
(cherry picked from aom/master commit ef446026aeafa318f9bee182b8c80eb4f1ef5a0a)
2016-10-12 17:41:01 -07:00
Alex Converse
e9f70f8f10 Remove the starting zero from ANS CDFs.
This brings it in line with the Daala CDFs and will make it easier to
share code.

Change-Id: Idfd2d2b33c3b9b2c4e72ce72fb3d8039013448b9
(cherry picked from aom/master commit af98507ca928afe33e9f88fdd2ca168379528d6a)
2016-10-12 17:41:01 -07:00
Alex Converse
a1ac972867 Import the aom_read/write_symbol abstractions from aom/master
Change-Id: I0b255c05108c3b97e74df1b59c34111c9e9a5770
2016-10-12 17:41:01 -07:00
Jean-Marc Valin
e874ce0300 Deringing cleanup: remove DERING_REFINEMENT (always on now)
Change-Id: Ic3a6855799be010e69aeab924b013679282ab191
2016-10-12 17:13:09 -07:00
Jean-Marc Valin
8455cd9fc1 Don't run the deringing filter on skipped blocks within a superblock
No change in metrics

Change-Id: Ib1dbe41a9e1a564dd9a63a33e2a5315ad6bca70c
2016-10-12 17:12:45 -07:00
Jean-Marc Valin
56b0c3c51b Don't dering skipped superblocks
No change in metrics

Change-Id: I0da09270d78c3caf78a32a3157f02c87f2232e3e
2016-10-12 17:12:10 -07:00
Yi Luo
e01484e412 Merge "Hybrid forward transform 32x32 AVX2 optimization" into nextgenv2 2016-10-13 00:08:48 +00:00
Steinar Midtskogen
b074823863 On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
Change-Id: Ibdbd720d4f68892da6164a9849e212e759305005
2016-10-12 15:48:13 -07:00
Alex Converse
91e4e604bd Merge changes I3ca2b674,I78afc587,I3ae62181,I5ed91556 into nextgenv2
* changes:
  Unfork ANS decode_coefs
  Remove ZERO_TOKEN from the ANS tokenset
  Drop costing ANS tokens from derived probabilities
  Unfork ANS pack_mb_tokens
2016-10-12 22:25:27 +00:00
Debargha Mukherjee
e52816bf8f Fix a bug in inverse halfright 32x32 transform
Fix a bug in the C implementation of the ihalfright32
transform, in the case that its input and output buffers are the same.
This occurs when it is called by av1_iht32x16_512_add_c.

Change-Id: I61c652e2662178520c0639a2879ae128a9c7ec3f
2016-10-12 14:49:18 -07:00
Yi Luo
fed8e1c06d Hybrid forward transform 32x32 AVX2 optimization
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.

- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2()
  But function replacement must go with the corresponding inverse txfm.

- No obvious user level time reduction due to 32x32 TX_TYPE selection.

- Zero high 128b YMM to avoid AVX-SSE transition penalties
  (fix 16x16 case).

- Added 32x32 AVX2 unit tests to verify bitexact.

- AVX2 optimization summary:
  On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
  C to AVX2: function level time reduction, ~86-89%.
  SSE2 to AVX2: function level time reduction, ~51%.

Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
2016-10-12 14:19:53 -07:00
Hui Su
933bf08cfb Merge "Send allow_screen_content flag for both key and intra only frames" into nextgenv2 2016-10-12 21:13:24 +00:00
Debargha Mukherjee
4282b6bbbb Merge "Refactor expand dry_run types to return coef rate" into nextgenv2 2016-10-12 21:06:41 +00:00
Alex Converse
5e4d00c37e Unfork ANS decode_coefs
This is less code and more like what we have in aom/master.

Change-Id: I3ca2b674e4ad9e2e211d08bb51d78549e8b63a54
2016-10-12 13:23:33 -07:00
Alex Converse
ea7e990fd4 Remove ZERO_TOKEN from the ANS tokenset
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.

This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.

Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
2016-10-12 13:15:08 -07:00
Alex Converse
ccf472bc05 Drop costing ANS tokens from derived probabilities
This mimics what's currently done in aom/master. This can be re-added
after aligning AOM's ANS with nextgenv2's ANS.

Change-Id: I3ae62181dd4803694204a234c717a86a15ca8a40
2016-10-12 13:14:21 -07:00
Alex Converse
dc62b0925d Unfork ANS pack_mb_tokens
This is less code and more like what we have in aom/master.

Change-Id: I5ed915563cbfbc6281113c1eb31455f50710ba9f
2016-10-12 13:09:13 -07:00
Jim Bankoski
3265ef3d1d AUTHORS regenerated
script changed to remove extra entities and clang-format bot.

Change-Id: I102cd80fdf4b240e6e4d5172943e49146a601a72
2016-10-12 12:26:05 -07:00