Commit Graph

183 Commits

Author SHA1 Message Date
Yaowu Xu
c96168987d Merge "Clean up and speed up CLPF clipping" into nextgenv2 2016-10-11 22:09:31 +00:00
Yaowu Xu
53a9745c7a Merge "Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0." into nextgenv2 2016-10-11 21:54:13 +00:00
Yaowu Xu
1aa6cbc7ea Merge "Bugfix in the CLPF RDO." into nextgenv2 2016-10-11 21:53:56 +00:00
Sarah Parker
4082ff0bf6 Merge "Read mode to mi->bmi for sub 8x8 blocks" into nextgenv2 2016-10-11 21:48:01 +00:00
Steinar Midtskogen
e66fc87c46 Clean up and speed up CLPF clipping
* Move clipping tests from inside to outside loops
* Pass the clipped block size to clpf_block() as sizex and sizey,
  rather than passing bs for both
* Make fallback tests to C more accurate

Change-Id: Icdc57540ce21b41a95403fdcc37988a4ebf546c7
2016-10-11 12:36:17 -07:00
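
The idea in the first two bullets of e66fc87c46 (clip once per block, then loop without per-pixel bounds tests) can be sketched roughly as below. This is an illustration only, not the code from the commit: process_block_clipped(), min_int() and the pixel-inverting body are hypothetical stand-ins, and only the names sizex, sizey and bs come from the commit message.

```c
#include <stdint.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical per-block operation (NOT the CLPF kernel): invert every
 * pixel of the bs x bs block at (x0, y0), restricted to the frame. */
static void process_block_clipped(uint8_t *frame, int width, int height,
                                  int stride, int x0, int y0, int bs) {
  /* Clip the block size once, outside the loops. */
  const int sizex = min_int(bs, width - x0);
  const int sizey = min_int(bs, height - y0);
  int x, y;

  /* The loops no longer need any per-pixel bounds test. If the block
   * starts outside the frame, sizex or sizey is <= 0 and nothing runs. */
  for (y = 0; y < sizey; y++)
    for (x = 0; x < sizex; x++)
      frame[(y0 + y) * stride + (x0 + x)] ^= 0xff;
}
```

With a 10x10 frame and bs = 8, the block starting at (8, 8) is clipped to 2x2, so only four pixels are touched.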
Steinar Midtskogen
86b19177ab Bugfix in CLPF RDO. Prevented selection of enable_fb_flag=0.
PSNR YCbCr:     -0.01%     -0.06%     -0.17%
   PSNRHVS:      0.01%
      SSIM:      0.03%
    MSSSIM:      0.00%
 CIEDE2000:     -0.05%

Change-Id: I1205c021bfc5cee6f80344fec92aabb529af9bd1
2016-10-11 12:35:48 -07:00
Steinar Midtskogen
2e40cc4ce6 Bugfix in the CLPF RDO.
When CLPF was extended to chroma, the chroma RDO accidentally
discarded the optimal block size found in the luma RDO.

PSNR YCbCr:     -0.25%      0.05%      0.06%
   PSNRHVS:     -0.19%
      SSIM:     -0.36%
    MSSSIM:     -0.23%

Conflicts:
	av1/common/clpf.c

Change-Id: Ie49cd30f9276a311ada88cb2f13d14757617f030
2016-10-11 12:35:10 -07:00
Yaowu Xu
25faa0e9f5 Merge "Move tree writing code into bitwriter.h." into nextgenv2 2016-10-11 19:16:25 +00:00
Yaowu Xu
de005d322a Merge "Remove unused color_sensitivity member from MACROBLOCK." into nextgenv2 2016-10-11 19:16:07 +00:00
Sarah Parker
d7fa8542f6 Read mode to mi->bmi for sub 8x8 blocks
Previously, only the motion vectors were being stored. This caused
a mismatch in the global motion experiment, which needs this
mode information to decide whether or not to use the gm parameters
in reconstruction.

Change-Id: I58cde750ec06587dbfb8d65b07c15a67b7d6b1f6
2016-10-11 11:51:59 -07:00
Yaowu Xu
57aa518c30 Merge "CLPF: Remove redundant function argument." into nextgenv2 2016-10-11 18:44:56 +00:00
Yaowu Xu
80eaf1a120 Merge "Extend CLPF to chroma." into nextgenv2 2016-10-11 18:44:31 +00:00
Yaowu Xu
39b25dfa38 Merge "Remove some dead code in CLPF." into nextgenv2 2016-10-11 18:43:27 +00:00
Yaowu Xu
443e522b5c Merge "Reduce memory footprint for CLPF encoding." into nextgenv2 2016-10-11 18:42:34 +00:00
Yaowu Xu
a71552421d Merge "Non-normative quality improvements to CLPF." into nextgenv2 2016-10-11 18:41:40 +00:00
Yaowu Xu
038d41045b Merge "Added high bit-depth support in CLPF." into nextgenv2 2016-10-11 18:41:15 +00:00
Yaowu Xu
6fc92c1ccc Merge "Fix a memleak in CLPF." into nextgenv2 2016-10-11 18:41:03 +00:00
Yaowu Xu
a2bbf621f1 Merge "Reduce memory footprint for CLPF decoding." into nextgenv2 2016-10-11 18:40:47 +00:00
Yaowu Xu
4da3ed40a3 Merge "Make CLPF handle frame widths and heights not divisible by 8." into nextgenv2 2016-10-11 18:40:05 +00:00
Yaowu Xu
b5e73bddb0 Merge "CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix." into nextgenv2 2016-10-11 17:44:12 +00:00
Yaowu Xu
3b161e14b3 Merge "Silence some harmless compiler warnings in CLPF." into nextgenv2 2016-10-11 17:43:23 +00:00
Zoe Liu
d623c4122a Merge "Add a small code clean for show_existing_frame" into nextgenv2 2016-10-11 16:58:17 +00:00
Nathan E. Egge
eeedc633c0 Move tree writing code into bitwriter.h.
Rename av1_write_tree() to aom_write_tree() and move it into bitwriter.h
 to match aom_read_tree() in bitreader.h.

Manually cherry-picked from aom/master:
33a143fa7ac42d62080bfc20468cb76ad26045db

Change-Id: I6c686cdd3e0f179d7e95c5bc6984558b62d46d67
2016-10-11 09:36:01 -07:00
Thomas Daede
debaface95 Remove unused color_sensitivity member from MACROBLOCK.
Conflicts:
	av1/encoder/block.h
	av1/encoder/encodeframe.c

Change-Id: I941e7b9e76380f262b173928d3c5132c5613b3ce
2016-10-11 09:35:39 -07:00
Yaowu Xu
12fcf74c8a Merge "Use derived variable size for memcpy" into nextgenv2 2016-10-11 16:15:43 +00:00
Yaowu Xu
4960f7c3bd Merge "Added generic SIMD support for CLPF." into nextgenv2 2016-10-11 16:05:18 +00:00
Debargha Mukherjee
fb865cf41c Merge "Add sse2 forward / inverse 4x8 and 8x4 transforms" into nextgenv2 2016-10-11 15:50:32 +00:00
Yaowu Xu
c648a9fd83 Use derived variable size for memcpy
Manually cherry-picked from aom/master:
bf2ad75a1723d223c376b93295aa06dd23226937

Change-Id: I99f05e79ec8ad35a49bc124e6dd829ccc7d9cc36
2016-10-10 17:39:29 -07:00
Zoe Liu
5fca72498a Add a small code clean for show_existing_frame
Change-Id: I42dc9f0fdecd3cf3398ab82d6e01dde06bdf7b24
2016-10-10 17:18:57 -07:00
Steinar Midtskogen
ded69f5668 CLPF: Remove redundant function argument.
Change-Id: I31bea3b1f76493060edd7e1bd616a223841d5f77
2016-10-10 15:24:33 -07:00
Steinar Midtskogen
ecf9a0c821 Extend CLPF to chroma.
Objective quality impact (low latency):

PSNR YCbCr:      0.13%     -1.37%     -1.79%
   PSNRHVS:      0.03%
      SSIM:      0.24%
    MSSSIM:      0.10%
 CIEDE2000:     -0.83%

Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
2016-10-10 15:23:38 -07:00
Steinar Midtskogen
9021d09f9a Remove some dead code in CLPF.
av1_clpf_frame() was always called with the same src and dst,
so we only need one argument and the code supporting different
src and dst was removed.

Change-Id: I70919f50e5cfb19c22eb4dff9ee7c0fa2697fad3
2016-10-10 15:23:09 -07:00
Steinar Midtskogen
a8af9126fb Reduce memory footprint for CLPF encoding.
Use in-place filtering, like in the decoder
(see eb5794da1659f87597291d84c2fbdfd89280065d).

Change-Id: If037ead45f5cb3461347a63e0e415954d5dcba8b
2016-10-10 15:20:42 -07:00
Steinar Midtskogen
499deb9def Non-normative quality improvements to CLPF.
BDR improvements:
     PSNR  PSNRHVS SSIM  MSSSIM CIEDE2000 PSNR Cb  PSNR Cr
LL: -0.17% -0.13% -0.11% -0.12%   -0.18%   -0.19%   -0.21%
HL: -0.21% -0.14% -0.15% -0.11%   -0.37%   -0.39%   -0.52%

Change-Id: I58c00a1cc0ddfc3376644f66345e99472482a613
2016-10-10 11:31:50 -07:00
Steinar Midtskogen
3dbd55a6c4 Added high bit-depth support in CLPF.
Change-Id: Ic5eadb323227a820ad876c32d4dc296e05db6ece
2016-10-10 11:27:04 -07:00
Steinar Midtskogen
9351b2f792 Fix a memleak in CLPF.
The memleak appeared in eb5794da1659f87597291d84c2fbdfd89280065d.

Change-Id: Ifdd6d64aafa0d0ce4dfaf1844f594d5f843bf2e0
2016-10-10 11:26:52 -07:00
Steinar Midtskogen
e8224c7ad5 Reduce memory footprint for CLPF decoding.
Instead of having CLPF write to an entire new frame and
copy the result back into the original frame, make the
filter able to work in-place by keeping a buffer of size
frame_width*filter_block_size and delay the write-back
by one filter_block_size row.

This reduces the cycles spent in the filter to ~75%.

Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
2016-10-10 11:26:33 -07:00
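
The delayed write-back described in e8224c7ad5 can be illustrated with a minimal sketch, under several assumptions. smooth_px() is a hypothetical cross-shaped smoothing kernel standing in for the real CLPF kernel, and filter_copy() and filter_in_place() are illustrative names, not functions from the codebase. The sketch keeps a ring buffer of bs rows (frame_width*filter_block_size bytes) and writes a filtered row back to the frame only after rows further down no longer need its unfiltered pixels.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in kernel (NOT the CLPF kernel): cross-shaped
 * smoothing, with border pixels replicated at the frame edges. */
static uint8_t smooth_px(const uint8_t *f, int w, int h, int stride,
                         int x, int y) {
  const int xl = x > 0 ? x - 1 : x, xr = x < w - 1 ? x + 1 : x;
  const int yu = y > 0 ? y - 1 : y, yd = y < h - 1 ? y + 1 : y;
  const int sum = 4 * f[y * stride + x] + f[y * stride + xl] +
                  f[y * stride + xr] + f[yu * stride + x] +
                  f[yd * stride + x];
  return (uint8_t)((sum + 4) >> 3);
}

/* Reference: filter into a separate destination frame. */
static void filter_copy(const uint8_t *src, uint8_t *dst, int w, int h,
                        int stride) {
  int x, y;
  for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
      dst[y * stride + x] = smooth_px(src, w, h, stride, x, y);
}

/* In-place variant: filtered rows are parked in a ring buffer of bs rows
 * and written back bs rows late. Returns 0 on success, -1 on error. */
static int filter_in_place(uint8_t *frame, int w, int h, int stride, int bs) {
  uint8_t *buf;
  int x, y;

  if (w <= 0 || h <= 0) return 0;
  /* The kernel reads row y - 1, so the delay must be at least 2 rows. */
  if (bs < 2) return -1;
  buf = (uint8_t *)malloc((size_t)w * (size_t)bs);
  if (buf == NULL) return -1;

  for (y = 0; y < h; y++) {
    uint8_t *slot = buf + (size_t)(y % bs) * (size_t)w;
    /* This slot holds filtered row y - bs; its unfiltered source is no
     * longer read by any later row, so it is safe to write it back now. */
    if (y >= bs) memcpy(frame + (size_t)(y - bs) * stride, slot, (size_t)w);
    /* Rows y - 1, y and y + 1 of the frame are still unfiltered here. */
    for (x = 0; x < w; x++) slot[x] = smooth_px(frame, w, h, stride, x, y);
  }

  /* Flush the rows still parked in the buffer. */
  for (y = (h > bs ? h - bs : 0); y < h; y++)
    memcpy(frame + (size_t)y * stride, buf + (size_t)(y % bs) * (size_t)w,
           (size_t)w);

  free(buf);
  return 0;
}
```

Under these assumptions the in-place result matches filter_copy() byte for byte, while the extra memory drops from a full frame to bs rows.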
Steinar Midtskogen
34dac00adc Make CLPF handle frame widths and heights not divisible by 8.
Change-Id: If5eb33b6b090f43ba64c82468576b89eddd872c3
2016-10-10 11:26:15 -07:00
Steinar Midtskogen
f4d41e6330 CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix.
Change-Id: I518ad9c58973910eb0bdcb377f2d90138208c570
2016-10-10 11:21:23 -07:00
Steinar Midtskogen
2fd70ee124 Silence some harmless compiler warnings in CLPF.
Change-Id: I4a6d84007bc17b89cfd8d8f2440bf2968505bd6a
2016-10-10 11:20:43 -07:00
Steinar Midtskogen
be668e92c3 Added generic SIMD support for CLPF.
Change-Id: Ie03f9a5b0a4c708a586532198d755a1e7509f149
2016-10-10 11:19:37 -07:00
Yaowu Xu
abe0484cee Merge "New CLPF: New kernel and RDO for strength and block size" into nextgenv2 2016-10-10 18:17:41 +00:00
David Barker
4d03d6fc6f Add sse2 forward / inverse 4x8 and 8x4 transforms
Change-Id: I89ed93fb20cf975c2b463cff58879521ceaa4163
2016-10-10 09:02:45 -07:00
Yi Luo
3a8217f21b Merge "Hybrid forward transforms 16x16 AVX2 optimization" into nextgenv2 2016-10-07 01:52:11 +00:00
Debargha Mukherjee
609453e7e4 Merge "Added sse2 inverse 8x16 and 16x8 transforms" into nextgenv2 2016-10-07 00:03:34 +00:00
Debargha Mukherjee
e4dc5f8dc9 Merge "A bug fix for var-tx" into nextgenv2 2016-10-07 00:02:31 +00:00
Yi Luo
e8e8cd8f1b Hybrid forward transforms 16x16 AVX2 optimization
- Unit tests are added for AVX2 SIMD.
- Encoder speed improvement:
  AV1 baseline and EXT_TX, three 1080p sequences at bitrate:
  800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average
  user level time reduction: 3.86%.

Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
2016-10-06 15:33:15 -07:00
Alex Converse
24aa59cc51 Fix left shift of negative integer in hbd directional predictors
Change-Id: Id78139ae2dfa2d521bd50618b4a81cf24e09e391
2016-10-06 11:41:47 -07:00
Peter de Rivaz
1baecfeb03 Added sse2 inverse 8x16 and 16x8 transforms
Change-Id: I43628407b11e5c8e6af4df69f2acdc67ac827834
2016-10-06 11:23:14 -07:00
Debargha Mukherjee
29804479b5 Merge "Silence some warnings" into nextgenv2 2016-10-06 18:15:16 +00:00