748 Commits

Author SHA1 Message Date
Alex Converse
91e4e604bd Merge changes I3ca2b674,I78afc587,I3ae62181,I5ed91556 into nextgenv2
* changes:
  Unfork ANS decode_coefs
  Remove ZERO_TOKEN from the ANS tokenset
  Drop costing ANS tokens from derived probabilities
  Unfork ANS pack_mb_tokens
2016-10-12 22:25:27 +00:00
Debargha Mukherjee
e52816bf8f Fix a bug in inverse halfright 32x32 transform
Fix a bug in the C implementation of the ihalfright32
transform, in the case that its input and output buffers are the same.
This occurs when it is called by av1_iht32x16_512_add_c.

Change-Id: I61c652e2662178520c0639a2879ae128a9c7ec3f
2016-10-12 14:49:18 -07:00
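The aliasing hazard described in this commit can be illustrated with a minimal sketch. The functions below are hypothetical stand-ins, not the actual ihalfright32 code: they only assume a transform that writes one half of its output before it has finished reading the matching half of its input, which breaks as soon as both pointers refer to the same buffer.

```c
#include <assert.h>
#include <string.h>

#define HALF 4 /* hypothetical half-length; the real transform is larger */

/* Naive ordering: overwrites out[0..HALF) before in[0..HALF) has been read.
 * If in == out, the second loop reads values that were already replaced. */
static void half_swap_naive(const int *in, int *out) {
  assert(in != NULL && out != NULL);
  for (int i = 0; i < HALF; ++i) out[i] = in[HALF + i] * 4;
  for (int i = 0; i < HALF; ++i) out[HALF + i] = in[i] * 2;
}

/* Alias-safe ordering: consume the first half into a temporary buffer
 * before any output element is written. */
static void half_swap_safe(const int *in, int *out) {
  int first_half[HALF];
  assert(in != NULL && out != NULL);
  for (int i = 0; i < HALF; ++i) first_half[i] = in[i] * 2;
  for (int i = 0; i < HALF; ++i) out[i] = in[HALF + i] * 4;
  memcpy(out + HALF, first_half, sizeof(first_half));
}
```

Calling half_swap_safe(buf, buf) gives the same result as calling it with separate buffers, whereas half_swap_naive(buf, buf) does not.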
Yi Luo
fed8e1c06d Hybrid forward transform 32x32 AVX2 optimization
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.

- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(),
  but replacing the function requires replacing the corresponding inverse txfm as well.

- No obvious user level time reduction due to 32x32 TX_TYPE selection.

- Zero high 128b YMM to avoid AVX-SSE transition penalties
  (fix 16x16 case).

- Added 32x32 AVX2 unit tests to verify bitexact.

- AVX2 optimization summary:
  On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
  C to AVX2: function level time reduction, ~86-89%.
  SSE2 to AVX2: function level time reduction, ~51%.

Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
2016-10-12 14:19:53 -07:00
Hui Su
933bf08cfb Merge "Send allow_screen_content flag for both key and intra only frames" into nextgenv2 2016-10-12 21:13:24 +00:00
Debargha Mukherjee
4282b6bbbb Merge "Refactor expand dry_run types to return coef rate" into nextgenv2 2016-10-12 21:06:41 +00:00
Alex Converse
5e4d00c37e Unfork ANS decode_coefs
This is less code and more like what we have in aom/master.

Change-Id: I3ca2b674e4ad9e2e211d08bb51d78549e8b63a54
2016-10-12 13:23:33 -07:00
Alex Converse
ea7e990fd4 Remove ZERO_TOKEN from the ANS tokenset
This can be re-added after aligning AOM's ANS with nextgenv2's ANS.

This partially reverts commit 3829cd2f2f9904572019aa047d068baeee843767.

Change-Id: I78afc587f1abfe33ffcd53b3262910cfae135534
2016-10-12 13:15:08 -07:00
Alex Converse
ccf472bc05 Drop costing ANS tokens from derived probabilities
This mimics what's currently done in aom/master. This can be re-added
after aligning AOM's ANS with nextgenv2's ANS.

Change-Id: I3ae62181dd4803694204a234c717a86a15ca8a40
2016-10-12 13:14:21 -07:00
Alex Converse
dc62b0925d Unfork ANS pack_mb_tokens
This is less code and more like what we have in aom/master.

Change-Id: I5ed915563cbfbc6281113c1eb31455f50710ba9f
2016-10-12 13:09:13 -07:00
hui su
24f7b07f2e Send allow_screen_content flag for both key and intra only frames
BUG=webm:1311

Change-Id: I03c1043d17ed4e4ea22002473779a9612884c6c6
2016-10-12 11:45:05 -07:00
Yaowu Xu
732c188523 Merge "LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH" into nextgenv2 2016-10-12 17:56:26 +00:00
Sarah Parker
d2b1fe4a1f Merge "Fix inconsistency in gm parameter write to bitstream" into nextgenv2 2016-10-12 17:32:21 +00:00
Yaowu Xu
97aa09f658 LIBVPX_TEST_DATA_PATH -> LIBAOM_TEST_DATA_PATH
This commit renames LIBVPX_TEST_DATA_PATH to LIBAOM_TEST_DATA_PATH,
with a workaround for Jenkins environment variables.

Change-Id: If664ce57e25ad2af8121d1b578bf64043f0baa2a
2016-10-12 08:26:44 -07:00
Sarah Parker
689b0caea7 Fix inconsistency in gm parameter write to bitstream
Before this change, gm parameters were being written to the
bitstream for all frames, but only read for inter only frames,
causing a bitstream error.

Change-Id: I63b8e2fdf6358e07cc00718de04cc399809bde37
2016-10-11 19:35:26 -07:00
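The mismatch described in this commit is a writer/reader asymmetry: a field is written unconditionally but read conditionally, so every later field is parsed from the wrong bit position. A minimal sketch of the symmetric form follows. All names here (BitBuf, FrameHeader, has_gm_params, the 8-bit and 4-bit field widths) are hypothetical and chosen for illustration. They are not the AV1 bitstream syntax.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bit buffer storing one bit per byte, for clarity only. */
typedef struct {
  unsigned char bits[64];
  size_t pos;
} BitBuf;

static void put_bit(BitBuf *b, int bit) {
  assert(b->pos < sizeof(b->bits));
  b->bits[b->pos++] = (unsigned char)(bit & 1);
}

static int get_bit(BitBuf *b) {
  assert(b->pos < sizeof(b->bits));
  return b->bits[b->pos++];
}

static void put_literal(BitBuf *b, unsigned value, int nbits) {
  for (int i = nbits - 1; i >= 0; --i) put_bit(b, (int)((value >> i) & 1u));
}

static unsigned get_literal(BitBuf *b, int nbits) {
  unsigned value = 0;
  for (int i = 0; i < nbits; ++i) value = (value << 1) | (unsigned)get_bit(b);
  return value;
}

typedef struct {
  int is_intra_only;  /* hypothetical frame-type flag */
  unsigned gm_param;  /* hypothetical 8-bit global motion parameter */
  unsigned trailing;  /* hypothetical 4-bit field that follows */
} FrameHeader;

/* Single predicate shared by writer and reader keeps the two in sync. */
static int has_gm_params(const FrameHeader *h) { return !h->is_intra_only; }

static void write_header(BitBuf *b, const FrameHeader *h) {
  put_bit(b, h->is_intra_only);
  if (has_gm_params(h)) put_literal(b, h->gm_param, 8);
  put_literal(b, h->trailing, 4);
}

static void read_header(BitBuf *b, FrameHeader *h) {
  h->is_intra_only = get_bit(b);
  h->gm_param = has_gm_params(h) ? get_literal(b, 8) : 0;
  h->trailing = get_literal(b, 4);
}
```

Routing both sides through one predicate is a design choice that makes this class of mismatch harder to reintroduce.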
Yaowu Xu
c96168987d Merge "Clean up and speed up CLPF clipping" into nextgenv2 2016-10-11 22:09:31 +00:00
Debargha Mukherjee
ceebb70197 Refactor expand dry_run types to return coef rate
Adds the functionality to return the rate cost due to
coefficients without doing full search of all modes.
This will be subsequently used in various experiments,
including in new_quant experiment to search quantization
profiles at the superblock level without repeating the
full mode/partition search.

Change-Id: I4aad3f3f0c8b8dfdea38f8f4f094a98283f47f08
2016-10-11 14:55:26 -07:00
Yaowu Xu
53a9745c7a Merge "Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0." into nextgenv2 2016-10-11 21:54:13 +00:00
Yaowu Xu
1aa6cbc7ea Merge "Bugfix in the CLPF RDO." into nextgenv2 2016-10-11 21:53:56 +00:00
Sarah Parker
4082ff0bf6 Merge "Read mode to mi->bmi for sub 8x8 blocks" into nextgenv2 2016-10-11 21:48:01 +00:00
Steinar Midtskogen
e66fc87c46 Clean up and speed up CLPF clipping
* Move clipping tests from inside to outside loops
* Pass the clipped block size to clpf_block() as sizex and sizey rather
  than passing bs for both
* Make fallback tests to C more accurate

Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
2016-10-11 12:36:17 -07:00
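The first two bullets of this commit describe a common loop optimization: compute the clipped block dimensions once per block, then run the inner loops without per-pixel bounds tests. The sketch below illustrates the idea with hypothetical names (filter_block, filter_frame) and a placeholder filter that just increments each pixel. It is not the CLPF kernel.

```c
#include <assert.h>
#include <stdint.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Placeholder per-block filter: the caller guarantees that the
 * sizex-by-sizey block lies fully inside the frame, so the inner
 * loops need no clipping tests. */
static void filter_block(uint8_t *dst, int stride, int x0, int y0,
                         int sizex, int sizey) {
  for (int y = 0; y < sizey; ++y) {
    for (int x = 0; x < sizex; ++x) {
      dst[(y0 + y) * stride + x0 + x] += 1;
    }
  }
}

/* Clip each block against the frame edge once, outside the pixel loops. */
static void filter_frame(uint8_t *frame, int width, int height, int stride,
                         int bs) {
  assert(bs > 0 && stride >= width);
  for (int y0 = 0; y0 < height; y0 += bs) {
    const int sizey = min_int(bs, height - y0);
    for (int x0 = 0; x0 < width; x0 += bs) {
      const int sizex = min_int(bs, width - x0);
      filter_block(frame, stride, x0, y0, sizex, sizey);
    }
  }
}
```

With a 10x6 frame and bs = 8, the right-hand blocks are processed with sizex = 2 and every block with sizey = 6, so no pixel outside the frame is touched.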
Steinar Midtskogen
86b19177ab Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0.
PSNR YCbCr:     -0.01%     -0.06%     -0.17%
   PSNRHVS:      0.01%
      SSIM:      0.03%
    MSSSIM:      0.00%
 CIEDE2000:     -0.05%

Change-Id: I1205c021bfc5cee6f80344fec92aabb529af9bd1
2016-10-11 12:35:48 -07:00
Steinar Midtskogen
2e40cc4ce6 Bugfix in the CLPF RDO.
When CLPF was extended to chroma, the chroma RDO accidentally
discarded the optimal block size found in the luma RDO.

PSNR YCbCr:     -0.25%      0.05%      0.06%
   PSNRHVS:     -0.19%
      SSIM:     -0.36%
    MSSSIM:     -0.23%

Conflicts:
	av1/common/clpf.c

Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030
2016-10-11 12:35:10 -07:00
Yaowu Xu
25faa0e9f5 Merge "Move tree writing code into bitwriter.h." into nextgenv2 2016-10-11 19:16:25 +00:00
Yaowu Xu
de005d322a Merge "Remove unused color_sensitivity member from MACROBLOCK." into nextgenv2 2016-10-11 19:16:07 +00:00
Sarah Parker
d7fa8542f6 Read mode to mi->bmi for sub 8x8 blocks
Previously, only the motion vectors were being stored. This caused
a mismatch in the global motion experiment, which needs this
mode information to decide whether or not to use the gm parameters
in reconstruction.

Change-Id: I58cde750ec06587dbfb8d65b07c15a67b7d6b1f6
2016-10-11 11:51:59 -07:00
Yaowu Xu
57aa518c30 Merge "CLPF: Remove redundant function argument." into nextgenv2 2016-10-11 18:44:56 +00:00
Yaowu Xu
80eaf1a120 Merge "Extend CLPF to chroma." into nextgenv2 2016-10-11 18:44:31 +00:00
Yaowu Xu
39b25dfa38 Merge "Remove some dead code in CLPF." into nextgenv2 2016-10-11 18:43:27 +00:00
Yaowu Xu
443e522b5c Merge "Reduce memory footprint for CLPF encoding." into nextgenv2 2016-10-11 18:42:34 +00:00
Yaowu Xu
a71552421d Merge "Non-normative quality improvements to CLPF." into nextgenv2 2016-10-11 18:41:40 +00:00
Yaowu Xu
038d41045b Merge "Added high bit-depth support in CLPF." into nextgenv2 2016-10-11 18:41:15 +00:00
Yaowu Xu
6fc92c1ccc Merge "Fix a memleak in CLPF." into nextgenv2 2016-10-11 18:41:03 +00:00
Yaowu Xu
a2bbf621f1 Merge "Reduce memory footprint for CLPF decoding." into nextgenv2 2016-10-11 18:40:47 +00:00
Yaowu Xu
4da3ed40a3 Merge "Make CLPF handle frame widths and heights not divisible by 8." into nextgenv2 2016-10-11 18:40:05 +00:00
Yaowu Xu
b5e73bddb0 Merge "CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix." into nextgenv2 2016-10-11 17:44:12 +00:00
Yaowu Xu
3b161e14b3 Merge "Silence some harmless compiler warnings in CLPF." into nextgenv2 2016-10-11 17:43:23 +00:00
Zoe Liu
d623c4122a Merge "Add a small code clean for show_existing_frame" into nextgenv2 2016-10-11 16:58:17 +00:00
Nathan E. Egge
eeedc633c0 Move tree writing code into bitwriter.h.
Rename av1_write_tree() to aom_write_tree() and move it into bitwriter.h
 to match aom_read_tree() in bitreader.h.

Manually cherry-picked from aom/master:
33a143fa7ac42d62080bfc20468cb76ad26045db

Change-Id: I6c686cdd3e0f179d7e95c5bc6984558b62d46d67
2016-10-11 09:36:01 -07:00
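For readers unfamiliar with tree-coded symbols, the pairing of a tree writer with a tree reader can be sketched as follows. This assumes a libvpx-style array tree, in which positive entries index the next node pair and non-positive entries are negated leaf symbols. BitLog is a hypothetical stand-in that records bits and the probability slot used for each one. It is not the aom bitwriter, and the functions are not the actual aom_write_tree() or aom_read_tree().

```c
#include <assert.h>
#include <stdint.h>

typedef int8_t tree_index;

/* Stand-in for an entropy coder: logs each bit and the probability used. */
typedef struct {
  uint8_t bits[32];
  uint8_t probs[32];
  int count; /* number of bits written */
  int pos;   /* next bit to read */
} BitLog;

static void log_write(BitLog *w, int bit, uint8_t prob) {
  assert(w->count < 32);
  w->bits[w->count] = (uint8_t)bit;
  w->probs[w->count] = prob;
  ++w->count;
}

static int log_read(BitLog *r) {
  assert(r->pos < r->count);
  return r->bits[r->pos++];
}

/* Emit the `len` low bits of `bits`, most significant first, walking the
 * tree so that each bit is coded with the probability of its node. */
static void write_tree(BitLog *w, const tree_index *tree,
                       const uint8_t *probs, int bits, int len) {
  tree_index i = 0;
  assert(len > 0);
  do {
    const int bit = (bits >> --len) & 1;
    log_write(w, bit, probs[i >> 1]);
    i = tree[i + bit];
  } while (len);
}

/* Follow bits from the root until a leaf (non-positive entry) is reached. */
static int read_tree(BitLog *r, const tree_index *tree) {
  tree_index i = 0;
  while ((i = tree[i + log_read(r)]) > 0) {
  }
  return -i;
}

/* Four symbols with codes 0 -> "0", 1 -> "10", 2 -> "110", 3 -> "111". */
static const tree_index example_tree[6] = { 0, 2, -1, 4, -2, -3 };
```

Keeping the writer next to the reader, as the commit does for the header files, makes it easier to see that both walk the same tree in the same order.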
Thomas Daede
debaface95 Remove unused color_sensitivity member from MACROBLOCK.
Conflicts:
	av1/encoder/block.h
	av1/encoder/encodeframe.c

Change-Id: I941e7b9e76380f262b173928d3c5132c5613b3ce
2016-10-11 09:35:39 -07:00
Yaowu Xu
12fcf74c8a Merge "Use derived variable size for memcpy" into nextgenv2 2016-10-11 16:15:43 +00:00
Yaowu Xu
4960f7c3bd Merge "Added generic SIMD support for CLPF." into nextgenv2 2016-10-11 16:05:18 +00:00
Debargha Mukherjee
fb865cf41c Merge "Add sse2 forward / inverse 4x8 and 8x4 transforms" into nextgenv2 2016-10-11 15:50:32 +00:00
Yaowu Xu
c648a9fd83 Use derived variable size for memcpy
Manually cherry-picked from aom/master:
bf2ad75a1723d223c376b93295aa06dd23226937

Change-Id: I99f05e79ec8ad35a49bc124e6dd829ccc7d9cc36
2016-10-10 17:39:29 -07:00
Zoe Liu
5fca72498a Add a small code clean for show_existing_frame
Change-Id: I42dc9f0fdecd3cf3398ab82d6e01dde06bdf7b24
2016-10-10 17:18:57 -07:00
Steinar Midtskogen
ded69f5668 CLPF: Remove redundant function argument.
Change-Id: I31bea3b1f76493060edd7e1bd616a223841d5f77
2016-10-10 15:24:33 -07:00
Steinar Midtskogen
ecf9a0c821 Extend CLPF to chroma.
Objective quality impact (low latency):

PSNR YCbCr:      0.13%     -1.37%     -1.79%
   PSNRHVS:      0.03%
      SSIM:      0.24%
    MSSSIM:      0.10%
 CIEDE2000:     -0.83%

Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
2016-10-10 15:23:38 -07:00
Steinar Midtskogen
9021d09f9a Remove some dead code in CLPF.
av1_clpf_frame() was always called with the same src and dst,
so we only need one argument and the code supporting different
src and dst was removed.

Change-Id: I70919f50e5cfb19c22eb4dff9ee7c0fa2697fad3
2016-10-10 15:23:09 -07:00
Steinar Midtskogen
a8af9126fb Reduce memory footprint for CLPF encoding.
Use in-place filtering, like in the decoder
(see eb5794da1659f87597291d84c2fbdfd89280065d).

Change-Id: If037ead45f5cb3461347a63e0e415954d5dcba8b
2016-10-10 15:20:42 -07:00
Steinar Midtskogen
499deb9def Non-normative quality improvements to CLPF.
BDR improvements:
     PSNR  PSNRHVS SSIM  MSSSIM CIEDE2000 PSNR Cb  PSNR Cr
LL: -0.17% -0.13% -0.11% -0.12%   -0.18%   -0.19%   -0.21%
HL: -0.21% -0.14% -0.15% -0.11%   -0.37%   -0.39%   -0.52%

Change-Id: I58c00a1cc0ddfc3376644f66345e99472482a613
2016-10-10 11:31:50 -07:00
Steinar Midtskogen
3dbd55a6c4 Added high bit-depth support in CLPF.
Change-Id: Ic5eadb323227a820ad876c32d4dc296e05db6ece
2016-10-10 11:27:04 -07:00