17183 Commits

Author SHA1 Message Date
Marco
f21ff53830 vp9: Remove effective_bitrate from SVC datarate tests.
Change-Id: I1189c2403463e0aca288ba344052ba1c9cf94390
2016-02-29 13:13:32 -08:00
Yunqing Wang
342a368fd4 Do sub-pixel motion search in up-sampled reference frames
Up-sampled the reference frames to 8 times in each dimension using
the 8-tap interpolation filter. In sub-pixel motion search, use the
up-sampled reference frames to find the best matching blocks. This
largely improved the motion search precision, and thus, improved
the compression quality. There was no change in decoder side.

Borg test and speed test results:
1. On derflr set,
Overall PSNR gain: 1.306%, and SSIM gain: 1.512%.
Average speed loss on derf set was 6.0%.
2. On stdhd set,
Overall PSNR gain: 0.754%, and SSIM gain: 0.814%.
On hevchd set,
Overall PSNR gain: 0.465%, and SSIM gain: 0.527%.
Speed loss on HD clips was 3.5%.

Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f
2016-02-29 12:14:47 -08:00
Marco
729c997642 vp8: multi-res-encoder: Fix timer around encoder in sample encoder.
Change-Id: I0131ab4767e2eb72838ab6e58dd77a85fbf508e0
2016-02-29 11:13:42 -08:00
Debargha Mukherjee
db084506d8 A build fix and some other cosmetic changes
Fixes some issues introduced by a merge of two patches.
Also decouples the temporal interpolation filter from the switchable
filters for now for ease of experimentation with both separately.

Change-Id: If1c7c08adf00e0cf818fe8d0d3656c26ea65eb32
2016-02-29 10:20:52 -08:00
Marco
55a09f7f45 vp9-svc: For 1 pass svc, remove frame-level upsampling.
With the svc fix in https://chromium-review.googlesource.com/#/c/328978/,
the asan error is resolved, so this should work now.

Change-Id: I57b2a593651d414e1b445431d90f2fdc3281128b
2016-02-29 08:56:14 -08:00
Debargha Mukherjee
48589e8d07 Merge "Some refactoring and cleanups of interp filter" into nextgenv2 2016-02-29 15:55:48 +00:00
Scott LaVarnway
dd6729f826 VPX: Remove pmin/pmax from subpixel functions.
These instructions are unnecessary if the adds
are done in the correct order.

Change-Id: I4e533b8267c32e610a4b94203ad052dc9fdabd71
2016-02-27 05:47:56 -08:00
Scott LaVarnway
51beb29f52 Merge "VPX: vpx_filter_block1d16_(v8, v8_avg)" 2016-02-27 13:31:18 +00:00
Hui Su
95428a5926 Merge "Fix compiler warnings" into nextgenv2 2016-02-27 05:04:02 +00:00
James Zern
3b5cb2dbe0 Merge changes I95159bcb,Ia74e3097,I661f6439
* changes:
  x86/convolve.h: remove redundant check in FUN_CONV_2D
  x86/convolve.h: replace while w/if for w < 16
  x86/convolve.h: change filter[] || chains to |
2016-02-27 02:56:41 +00:00
Jingning Han
0fc0c1a32d Merge "Enable improved temporal filter in ext-interp experiment" into nextgenv2 2016-02-27 01:22:15 +00:00
Jingning Han
dca86af8f4 Merge "Unify frame border extension operation" into nextgenv2 2016-02-27 01:22:03 +00:00
James Zern
4b00f0ecae datarate_test/ChangingDropFrameThresh: set kf interval
restore the value for VP9 to 9999 to satisfy the current test
expectations; without this
VP9/DatarateTestVP9Large.ChangingDropFrameThresh/8 will overshoot.

Change-Id: I88dad574ae4ab10f923579824c7347ff468c7045
2016-02-26 16:54:36 -08:00
James Zern
8062e10162 Revert "vp9-svc: Fix speed issue with source downscaling for spatial layers."
This reverts commit f51f0998e1ca99cd7497ded3642bb27445b1b215.

This causes datarate tests to fail. Some are due to the new default
keyframe distance, another causes an assert even forcing 9999:

[ RUN      ] VP9/DatarateOnePassCbrSvc.OnePassCbrSvc3SpatialLayers/0
test_libvpx:
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c:853: scaledconvolve2d:
Assertion `y_step_q4 <= 32' failed.

Change-Id: I4ee4fea97f47e4f1a23b82a62e6afc6280961e38
2016-02-26 16:53:26 -08:00
Marco Paniconi
9ef41cf577 Merge "vp8-denoiser: Update some denoiser thresholds." 2016-02-27 00:20:53 +00:00
Debargha Mukherjee
bab2912b5e Some refactoring and cleanups of interp filter
Includes various cosmetic changes and refactoring including
naming the sharp filters differently (since they are no longer
8-tap).

Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a
2016-02-26 15:42:49 -08:00
Julia Robson
74a679de6f Port "cost_coeff speed improvements" to vp9.
About a 5% faster overall encode (perf cycles) at speed zero!

Change-Id: Iaf013ba75884415cd824e98349f654ffb1c3ef33
2016-02-26 14:47:18 -08:00
Marco Paniconi
a69c3f2823 Merge "vp9-svc: Bugfix for svc in non-rd variance partition." 2016-02-26 22:39:28 +00:00
Jingning Han
95d35a4a0b Enable improved temporal filter in ext-interp experiment
It improves the coding performance by 0.3%.

Change-Id: I9703abd705ceacdf9e7424428e5120253cadcc18
2016-02-26 21:59:51 +00:00
Jingning Han
d1d11fc6dd Unify frame border extension operation
This commit unifies the encoder and decoder border extension and
motion compensated prediction process. Remove the decoder specific
flow to simplify the development flow.

Change-Id: I9c43bbe6d7c017e6da2db6a62c5bf3d0af7ccfce
2016-02-26 13:58:53 -08:00
hui su
4aeabf1b0d Fix compiler warnings
Change-Id: Id7240260cec471a3f8d0986b9c8df06efda925f9
2016-02-26 13:52:49 -08:00
Geza Lore
7ded038af5 Port interintra experiment from nextgen.
The interintra experiment, which combines an inter prediction and an
inter prediction have been ported from the nextgen branch. The
experiment is merged into ext_inter, so there is no separate configure
option to enable it.

Change-Id: I0cc20cefd29e9b77ab7bbbb709abc11512320325
2016-02-26 13:01:51 -08:00
Debargha Mukherjee
3287f5519e Merge "Hooks to use 32x32 masked transforms for ext-tx" into nextgenv2 2016-02-26 20:54:37 +00:00
Yi Luo
b347c3c5e5 Merge "Implemented DST 8x8 with SSE2 intrinsics." into nextgenv2 2016-02-26 19:10:00 +00:00
Marco
6a23966c34 vp9-svc: Bugfix for svc in non-rd variance partition.
Reset the scale factors before build_inter_predictors.

Add datarate tests for 3 spatial layers, which exposed this issue.

Change-Id: I7f81efbe44345ecea9fdd5f639a4cca76aed3874
2016-02-26 09:24:18 -08:00
Jingning Han
2b7196a8bb Merge "Use sharp filter for alter reference frame generation" into nextgenv2 2016-02-26 16:24:59 +00:00
Jingning Han
83ecafbd95 Merge "Enable context based motion vector entropy coding" into nextgenv2 2016-02-26 16:24:49 +00:00
Marco
f51f0998e1 vp9-svc: Fix speed issue with source downscaling for spatial layers.
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of souce.

Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.

Change-Id: Ifcf526ec2aaf3e5fa7924588d9dd8660bf02fb46
2016-02-26 08:11:37 -08:00
Yaowu Xu
a570cefcf8 Merge "Extend vpxssim to handle more HBD combinations" into nextgenv2 2016-02-26 15:57:40 +00:00
James Zern
654d2163c9 x86/convolve.h: remove redundant check in FUN_CONV_2D
the filter will be the same in this case

Change-Id: I95159bcb05bbfb71b57da741393e80cc7ffc5cff
2016-02-25 23:31:50 -08:00
James Zern
6d8c8c6201 x86/convolve.h: replace while w/if for w < 16
in non-hbd configurations; any high-bitdepth changes will be done in a
follow-up

Change-Id: Ia74e30971b744c1faab68c92fdeda1a053988c77
2016-02-25 21:44:06 -08:00
James Zern
1ff2935ebf altref_test: move AltRefTest instantiation w/in VP8 check
some configurations may fail if AltRefTest is undefined though
VP8_INSTANTIATE_TEST_CASE is defined away.

Change-Id: I7272775a506718336bd6cee2225cf83bd72fede5
2016-02-25 20:58:56 -08:00
James Zern
48755f9f1a Merge "vp9/10: fix forced keyframes w/alt-refs enabled" 2016-02-26 03:52:44 +00:00
Jingning Han
72eda13e50 Use sharp filter for alter reference frame generation
This commit uses 12-tap sharp filter to generate alter reference
frame. It improves the compression performance by
derf    0.45%
hevcmr  0.35%
stdhd   0.79%

No encoding time change is observed.

Change-Id: Ia5dc26d5aae6b9b0cb782e5a28dc5066eeeb2ec8
2016-02-25 14:20:38 -08:00
James Zern
14828e756f vp9: set kf_max_dist to a reasonable default (128)
the same as vp8, with the same reasoning from:
2a0d7b1 Reduce the default kf_max_dist to 128.

see also:
https://trac.ffmpeg.org/ticket/4904
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815673

+ restore vpxenc behavior of taking the library default rather than
  forcing 5s

This change also exposes an issue with one-pass svc in cbr mode, keep
the old default in datarate_test.cc for now.

Change-Id: Id6d1244f42490b06fefc1a7b4e12a423a1f83e88
2016-02-25 12:34:12 -08:00
Scott LaVarnway
1f736e400f VPX: vpx_filter_block1d16_(v8, v8_avg)
Store result with one 16 byte store instead of
two 8 byte stores.

Change-Id: I43acbc5edfd6d6055a926f9b9605d47127400f09
2016-02-25 06:15:24 -08:00
James Zern
b3ceb629ba x86/convolve.h: change filter[] || chains to |
Change-Id: I661f64390f232826857b259e7a67e77f5a3a91ad
2016-02-24 19:47:43 -08:00
Hui Su
1226e734a0 Merge "Add test for screen content coding tools in end to end test" into nextgenv2 2016-02-25 03:47:03 +00:00
Angie Chiang
8878fa4f9a convolve8 sse2 test
This experiment shows that when frame size is 64x64
vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2's speed are similar.
However when frame size becomes 1024x1024
vpx_highbd_convolve8_sse2 is around 50% slower than vpx_convolve8_sse2
we think the bottleneck is from memory IO

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms)

VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms)

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms)

Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943
2016-02-24 17:01:20 -08:00
James Zern
ac4c37c684 vp9/10: fix forced keyframes w/alt-refs enabled
in 1-pass encodes. issues with 2-pass as well as other forced flags
persist.

Change-Id: Ic7ceb906fccea6456d5df96483c10cacd46e01c7
2016-02-24 15:56:37 -08:00
hui su
827e1b3fef Add test for screen content coding tools in end to end test
Test screen content coding tools (currently only palette) at
speed 1 and two-pass.

Change-Id: I3c467aee1cd9c366c65a3abfdccfafa0416b59b7
2016-02-24 15:27:07 -08:00
Yi Luo
0353f596e9 Implemented DST 8x8 with SSE2 intrinsics.
Implemented fdst8_sse2() function against C version: fdst8().
Added seven DST related hybrid transform types in vp10_fht8x8_sse2().
Replaced vp10_fht8x8_c() with vp10_fht8x8_sse2() in fwd_txfm_8x8().
Speedup: 18.1%, 11.5%, 22.0% based on speed test from
city_cif.y4m, garden_sif.y4m, mobile_cif.y4m.

Change-Id: Ia4aa1ea44c7a33e494f64ce843037f8703f975e3
2016-02-24 14:58:01 -08:00
Johann Koenig
784eebb2d3 Merge changes from topic 'x86inc'
* changes:
  x86inc.asm: only set visibility for chromium builds
  Only use .text sections for aout
  Use .text instead of .rodata on macho
  Copy PIC handling code from x86_abi_support
  Set 'private_extern' visibility for macho targets
  Expand PIC default to macho64 and respect CONFIG_PIC from libvpx
  Use libvpx defines to set name mangling rules
  Customize x86inc.asm for libvpx
  Update x86inc.asm from x264
2016-02-24 22:33:02 +00:00
Scott LaVarnway
87bd54fa05 Merge "BUG FIX: vpx_filter_block1d(8,4)_(v8, v8_avg)" 2016-02-24 22:31:07 +00:00
Debargha Mukherjee
da2d4a7afc Hooks to use 32x32 masked transforms for ext-tx
Adds hooks to use 32x32 ext-tx. Also adds scan orders for the masked
transforms for 32x32.
Make macro USE_MSKTX_FOR_32X32 1 in blockd.h to support 32x32 masked
transforms for ext-tx.

Change-Id: Ie6564830266651fcafae2d536c274dafd664ce17
2016-02-24 13:08:37 -08:00
Debargha Mukherjee
389efb289e Adds an utility macro ROUNDZ_POWER_OF_TWO
This macro works for the shift parameter being 0.
The ROUND_POWER_OF_TWO macro does not.

Change-Id: I8434d2933892e09bbc0d2dafc934d0c3637df347
2016-02-24 12:35:29 -08:00
Hui Su
aa703adb46 Merge "Fix some compiler warnings." into nextgenv2 2016-02-24 20:28:37 +00:00
Debargha Mukherjee
ad574d4008 Merge "Some fixes in reconintra" into nextgenv2 2016-02-24 20:25:25 +00:00
hui su
8537826eb4 Fix some compiler warnings.
"taking the absolute value of unsigned type 'unsigned int' has no effect"

Change-Id: Iea1f67c2a3171a98ca89d5dc7192a5508d086c16
2016-02-24 11:17:33 -08:00
Yaowu Xu
aa6c754635 Merge remote-tracking branch 'webm/master' into nextgenv2 2016-02-24 10:53:17 -08:00