401 Commits

Author SHA1 Message Date
Yaowu Xu
a2bbf621f1 Merge "Reduce memory footprint for CLPF decoding." into nextgenv2 2016-10-11 18:40:47 +00:00
Yaowu Xu
4da3ed40a3 Merge "Make CLPF handle frame widths and heights not divisible by 8." into nextgenv2 2016-10-11 18:40:05 +00:00
Yaowu Xu
b5e73bddb0 Merge "CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix." into nextgenv2 2016-10-11 17:44:12 +00:00
Yaowu Xu
3b161e14b3 Merge "Silence some harmless compiler warnings in CLPF." into nextgenv2 2016-10-11 17:43:23 +00:00
Yaowu Xu
12fcf74c8a Merge "Use derived variable size for memcpy" into nextgenv2 2016-10-11 16:15:43 +00:00
Yaowu Xu
4960f7c3bd Merge "Added generic SIMD support for CLPF." into nextgenv2 2016-10-11 16:05:18 +00:00
Debargha Mukherjee
fb865cf41c Merge "Add sse2 forward / inverse 4x8 and 8x4 transforms" into nextgenv2 2016-10-11 15:50:32 +00:00
Yaowu Xu
c648a9fd83 Use derived variable size for memcpy
Manually cherry-picked from aom/master:
bf2ad75a1723d223c376b93295aa06dd23226937

Change-Id: I99f05e79ec8ad35a49bc124e6dd829ccc7d9cc36
2016-10-10 17:39:29 -07:00
Steinar Midtskogen
ded69f5668 CLPF: Remove redundant function argument.
Change-Id: I31bea3b1f76493060edd7e1bd616a223841d5f77
2016-10-10 15:24:33 -07:00
Steinar Midtskogen
ecf9a0c821 Extend CLPF to chroma.
Objective quality impact (low latency):

PSNR YCbCr:      0.13%     -1.37%     -1.79%
   PSNRHVS:      0.03%
      SSIM:      0.24%
    MSSSIM:      0.10%
 CIEDE2000:     -0.83%

Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
2016-10-10 15:23:38 -07:00
Steinar Midtskogen
9021d09f9a Remove some dead code in CLPF.
av1_clpf_frame() was always called with the same src and dst,
so we only need one argument and the code supporting different
src and dst was removed.

Change-Id: I70919f50e5cfb19c22eb4dff9ee7c0fa2697fad3
2016-10-10 15:23:09 -07:00
Steinar Midtskogen
3dbd55a6c4 Added high bit-depth support in CLPF.
Change-Id: Ic5eadb323227a820ad876c32d4dc296e05db6ece
2016-10-10 11:27:04 -07:00
Steinar Midtskogen
9351b2f792 Fix a memleak in CLPF.
The memleak appeared in eb5794da1659f87597291d84c2fbdfd89280065d.

Change-Id: Ifdd6d64aafa0d0ce4dfaf1844f594d5f843bf2e0
2016-10-10 11:26:52 -07:00
Steinar Midtskogen
e8224c7ad5 Reduce memory footprint for CLPF decoding.
Instead of having CLPF write to an entire new frame and
copy the result back into the original frame, make the
filter able to work in-place by keeping a buffer of size
frame_width*filter_block_size and delay the write-back
by one filter_block_size row.

This reduces the cycles spent in the filter to ~75%.

Change-Id: I78ca74380c45492daa8935d08d766851edb5fbc1
2016-10-10 11:26:33 -07:00
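
The in-place idea described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names (`filter_row`, `filter_via_copy`, `filter_in_place`), the `block_rows` parameter, and the vertical [1 2 1]/4 smoothing kernel are all assumptions standing in for the real CLPF kernel. The sketch keeps a delay buffer of `width * block_rows` bytes and writes each filtered row back only after the filter has moved `block_rows` rows further down, so every row is still computed from unfiltered neighbours.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real CLPF kernel: a vertical [1 2 1]/4
 * smoothing filter that reads one row above and one row below row r.
 * Rows outside the frame are replaced by the current row. */
static void filter_row(const uint8_t *frame, int width, int height, int stride,
                       int r, uint8_t *out) {
  const uint8_t *cur = frame + (size_t)r * stride;
  const uint8_t *above = r > 0 ? cur - stride : cur;
  const uint8_t *below = r < height - 1 ? cur + stride : cur;
  for (int c = 0; c < width; ++c)
    out[c] = (uint8_t)((above[c] + 2 * cur[c] + below[c] + 2) >> 2);
}

/* Reference approach: filter into a full-size temporary frame, then copy
 * the result back. Needs width * height bytes of scratch memory. */
static int filter_via_copy(uint8_t *frame, int width, int height, int stride) {
  uint8_t *tmp = malloc((size_t)width * height);
  if (!tmp) return -1;
  for (int r = 0; r < height; ++r)
    filter_row(frame, width, height, stride, r, tmp + (size_t)r * width);
  for (int r = 0; r < height; ++r)
    memcpy(frame + (size_t)r * stride, tmp + (size_t)r * width, (size_t)width);
  free(tmp);
  return 0;
}

/* In-place approach: keep only width * block_rows bytes of filtered output
 * and delay the write-back by block_rows rows. block_rows must exceed the
 * vertical reach of the kernel (1 row here), hence the assert. */
static int filter_in_place(uint8_t *frame, int width, int height, int stride,
                           int block_rows) {
  assert(block_rows >= 2);
  uint8_t *delay = malloc((size_t)width * block_rows);
  if (!delay) return -1;
  for (int r = 0; r < height; ++r) {
    uint8_t *slot = delay + (size_t)(r % block_rows) * width;
    /* The slot still holds filtered row r - block_rows; no later row reads
     * that frame row any more, so it is safe to write it back now. */
    if (r >= block_rows)
      memcpy(frame + (size_t)(r - block_rows) * stride, slot, (size_t)width);
    filter_row(frame, width, height, stride, r, slot);
  }
  /* Flush the rows still waiting in the delay buffer. */
  int first = height > block_rows ? height - block_rows : 0;
  for (int r = first; r < height; ++r)
    memcpy(frame + (size_t)r * stride,
           delay + (size_t)(r % block_rows) * width, (size_t)width);
  free(delay);
  return 0;
}
```

Under these assumptions both functions produce identical frames, while the in-place variant allocates `width * block_rows` bytes instead of a full `width * height` copy.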
Steinar Midtskogen
34dac00adc Make CLPF handle frame widths and heights not divisible by 8.
Change-Id: If5eb33b6b090f43ba64c82468576b89eddd872c3
2016-10-10 11:26:15 -07:00
Steinar Midtskogen
f4d41e6330 CLPF: Don't assume sb size=64 and w&h multiple of 8 + valgrind fix.
Change-Id: I518ad9c58973910eb0bdcb377f2d90138208c570
2016-10-10 11:21:23 -07:00
Steinar Midtskogen
2fd70ee124 Silence some harmless compiler warnings in CLPF.
Change-Id: I4a6d84007bc17b89cfd8d8f2440bf2968505bd6a
2016-10-10 11:20:43 -07:00
Steinar Midtskogen
be668e92c3 Added generic SIMD support for CLPF.
Change-Id: Ie03f9a5b0a4c708a586532198d755a1e7509f149
2016-10-10 11:19:37 -07:00
Yaowu Xu
abe0484cee Merge "New CLPF: New kernel and RDO for strength and block size" into nextgenv2 2016-10-10 18:17:41 +00:00
David Barker
4d03d6fc6f Add sse2 forward / inverse 4x8 and 8x4 transforms
Change-Id: I89ed93fb20cf975c2b463cff58879521ceaa4163
2016-10-10 09:02:45 -07:00
Yi Luo
3a8217f21b Merge "Hybrid forward transforms 16x16 AVX2 optimization" into nextgenv2 2016-10-07 01:52:11 +00:00
Debargha Mukherjee
609453e7e4 Merge "Added sse2 inverse 8x16 and 16x8 transforms" into nextgenv2 2016-10-07 00:03:34 +00:00
Yi Luo
e8e8cd8f1b Hybrid forward transforms 16x16 AVX2 optimization
- Unit tests are added for AVX2 SIMD.
- Encoder speed improvement:
  AV1 baseline and EXT_TX, three 1080p sequences at bitrate:
  800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average
  user level time reduction: 3.86%.

Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4
2016-10-06 15:33:15 -07:00
Alex Converse
24aa59cc51 Fix left shift of negative integer in hbd directional predictors
Change-Id: Id78139ae2dfa2d521bd50618b4a81cf24e09e391
2016-10-06 11:41:47 -07:00
Peter de Rivaz
1baecfeb03 Added sse2 inverse 8x16 and 16x8 transforms
Change-Id: I43628407b11e5c8e6af4df69f2acdc67ac827834
2016-10-06 11:23:14 -07:00
Steinar Midtskogen
d06588ab18 New CLPF: New kernel and RDO for strength and block size
This commit ports a CLPF change from aom/master by manually
cherry-picking:
7560123c066854aa40c4685625454aea03410b18

Change-Id: I61eb08862a101df74a6b65ece459833401e81117
2016-10-06 09:36:03 -07:00
Jingning Han
3b22d1a875 Merge "Make ref_mv_idx syntax context dependent on block distance only" into nextgenv2 2016-10-06 15:55:40 +00:00
Angie Chiang
9c2d401ca0 Merge "Simplify file dependencies of SIMD implementation of interpolation filters" into nextgenv2 2016-10-05 16:26:26 +00:00
Jingning Han
8205b78552 Make ref_mv_idx syntax context dependent on block distance only
This allows the hardware decoder to start decoding ref_mv_idx
syntax prior to the sorting stage and hide the latency of entropy
decoding. The compression performance change is at about the 0.01% level.

Change-Id: I86b34f31f6c99a36ae2780416175cc0bd90ff492
2016-10-05 09:09:00 -07:00
Debargha Mukherjee
cb603790b0 Fix a compiler warning in ext-inter experiment
Change-Id: If36417c1384646da57453344b208e7653a4d31e5
2016-10-04 13:22:21 -07:00
Debargha Mukherjee
1a16a987ee Fix an integer overflow issue in restoration
https://bugs.chromium.org/p/webm/issues/detail?id=1306

Change-Id: Icd11d373ff08954121c097728e4c7791791e223f
2016-10-04 11:50:00 -07:00
Angie Chiang
b9ba5c251b Simplify file dependencies of SIMD implementation of interpolation filters
This change is similar to the following aom CL:
https://aomedia-review.googlesource.com/#/c/1961/

Move SIMD-related functions from filter.c/h to the following files:
av1_convolve_ssse3.c
av1_highbd_convolve_filters_sse4.c

Change the following C files to header files:
av1_highbd_convolve_filters_sse4.c
av1_convolve_filters_ssse3.c

Change-Id: I41a3cc6b0789e632451aeda82f5eb97a4d78e370
2016-10-03 18:43:23 -07:00
Yi Luo
8e46b860c6 Fix filter type mismatch warning on Visual Studio
- Move filter look-up functions to corresponding optimization modules.

BUG=webm:1296

Change-Id: I87f399609052db2dbc7e5a590afb08b82e3fa89f
2016-10-03 16:24:25 -07:00
Debargha Mukherjee
bf0431276d Merge "Further changes to new-quant tables" into nextgenv2 2016-10-03 21:10:30 +00:00
Jingning Han
42bc3a9ef3 Sync ref-mv experiment between aom and nextgenv2
Change-Id: I134d276234b3b8aa7df1ab647892b5d739647f4c
2016-10-03 09:02:20 -07:00
Debargha Mukherjee
3c42c09608 Further changes to new-quant tables
Refactor to streamline the number of profiles needed, in
preparation for the next steps.

NO change in performance.

Change-Id: I753b89299897857f3c250c316b4cdc4fedcb90e8
2016-10-01 17:59:28 -07:00
Yaowu Xu
671f2bd3f5 Rename AOM_ENC/DEC_BORDER_IN_PIXELS
Cherry-picked from aom/master:
e2721a65cbfb5b560cd884d60eb17f53539df5f0

Change-Id: I4ade58be91e7bca0cc4f2bed98a43177d7f590a5
2016-09-30 15:17:16 -07:00
Jingning Han
71e4553c3b Clean up av1_adapt_mv_probs format
Change-Id: Ib5226d4fe3dcf916fe8954c7240966e3a32eed31
2016-09-30 17:58:21 +00:00
Jingning Han
3b0a3f3ab3 Merge "Set spatial neighbor search resolution 16x16 for block size 64x64" into nextgenv2 2016-09-30 17:57:52 +00:00
Jingning Han
dcf1b40d91 Merge "Search collocated reference block in 16x16 unit" into nextgenv2 2016-09-30 17:45:09 +00:00
Jingning Han
75e513f126 Set spatial neighbor search resolution 16x16 for block size 64x64
When the block width or height is greater than or equal to 64, use a
16x16 block search step for the reference motion vector search in the
non-immediate rows and columns.

Change-Id: If11ce97a9328b879f30ef87115086aa0cd985a2f
2016-09-30 10:00:10 -07:00
Jingning Han
883c63ca57 Search collocated reference block in 16x16 unit
Use 16x16 block resolution for collocated reference motion vector
search.

Change-Id: I1091b5b178e255eb6cc0b994de360994f7661b79
2016-09-30 09:04:21 -07:00
Alex Converse
770911d48c Merge changes I319cb856,Ib009b6b6 into nextgenv2
* changes:
  Remove multi-entropy coder hacks from the treewriter
  Rename rans_dec_lut to rans_lut
2016-09-29 21:54:28 +00:00
Jingning Han
d54e5a04c4 Merge "more ref_mv changes from aom/master" into nextgenv2 2016-09-29 21:46:56 +00:00
Yue Chen
7dc7703bcb Merge "Fix unit test failure for RECT_TX + VAR_TX" into nextgenv2 2016-09-29 21:41:10 +00:00
Yaowu Xu
4306b6e599 more ref_mv changes from aom/master
Change-Id: I9152f898dfacdf3877ed719f193bb1e0dbee0a1a
2016-09-29 12:41:55 -07:00
Yue Chen
8e87224604 Merge "Move warping model estimation functions to COMMON folder" into nextgenv2 2016-09-29 18:24:32 +00:00
Alex Converse
57aa0f656d Merge changes Ideda50a6,Id2bced5f,If423eeb3 into nextgenv2
* changes:
  Port ANS from aom/master 25aaf40
  Refactor bitreader and bitwriter wrapper.
  Migrate aom/master ANS test from d311d02.
2016-09-29 16:43:12 +00:00
Yue Chen
49587a77f1 Fix unit test failure for RECT_TX + VAR_TX
Disable rect_tx because we only support 4x4 Walsh-Hadamard transform
in lossless mode.

Fixes failure in ./test_libaom --gtest_filter=*Large*ScreencastQ0/1
Configuration: --enable-experimental --enable-var-tx --enable-rect-tx
 --enable-ref-mv --enable-ext_intra --enable-ext_tx --enable-debug
 --disable-optimizations

Change-Id: Ib6b3494c7dcf7182f1cab9b138388d054851a23d
2016-09-29 09:20:52 -07:00
Debargha Mukherjee
485af9e580 Merge "Change non-uniform-quant parameters" into nextgenv2 2016-09-29 16:04:58 +00:00