16370 Commits

Author SHA1 Message Date
James Zern
a914ffad97 Merge "variance_neon: sync variance*() w/c,sse2" 2016-09-23 02:18:49 +00:00
Scott LaVarnway
7a34f85955 VP9: pass TileWorkerData instead of MACROBLOCKD and vpx_reader.
Change-Id: I869ef0f113c022143b531c44aefa0f1bb267052d
2016-09-22 13:18:36 -07:00
James Zern
fdd1186f97 vpx_idct32x32_34_add_sse2: rm unneeded transposes
this change is neutral to mildly positive across various x86-64
platforms

Change-Id: I28fb5ae598fc1317b7a42c9a846ac5d57d104784
2016-09-21 19:49:25 -07:00
Angie Chiang
99ef84c65a Merge "Detect invalid highbd iht input" 2016-09-22 01:06:38 +00:00
James Zern
e372bfd5ac variance_neon: sync variance*() w/c,sse2
removes some unnecessary casts and adds a few explicit uint32 ones for
larger sizes to quiet -Wshorten-64-to-32 warnings

Change-Id: I63c5fce8e62c426d5cf5c10a66a113c119a43518
2016-09-21 18:04:45 -07:00
James Zern
fcf281b6a1 Merge "vp8: remove VP8_SET_DBG* control support" 2016-09-22 00:43:35 +00:00
Angie Chiang
80338b91d3 Detect invalid highbd iht input
Do nothing in vp9_highbd_iht#x#_##_add_c when input magnitude is beyond
20 bits. Note that, sign bit is not included here.

In the 20 bits, we use 12 bits for input signal, 7 bits for forward
transform amplification, and 1 bit for contingency in rounding and
quantizing

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1286

Change-Id: I332c6f68df4614fc2e7d2dc4c5bb0d0cff8a245c
2016-09-21 17:15:19 -07:00
Johann
2bed8b6acd Keep vp8 sixtap read within bounds
When filtering it needs 6 pixels: 2 prior to the source, the source, and
3 after the source.

When filtering 16 wide, that means 21. To accomplish this the SSE2 reads
[-2] to [5], [6] to [13], and [14] to [21], a total of 24 bytes (reading
in groups of 8 is easy)

The filter then shifts this last set to the top half of the register and
uses 'or' to combine it with the previous set.

Valgrind detected an issue reading pixels [19], [20] and [21]:
Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd

Note: we only need pixels [16], [17], and [18] as context for [15].

To fix this, it now reads 8 bytes starting at [11], which re-loads [11]
through [13], but stops at [18] and does not over-read any values.

This is shifted by 5 and 'or'd with xmm1. Although the lower bits are
not cleared, they overlap directly with [11] through [13], so 'or'
produces the correct results.

Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
2016-09-21 16:17:07 -07:00
Johann
35ebc1cddf predict_test: align dst buffer to 16
On 32 bit machines 'new' does not always appear to allocate sufficiently
aligned buffers, causing intermittent test failures.

Change-Id: I0db4fc73782012e4eef71dc0fb540e74fdbfcebe
2016-09-21 13:35:47 -07:00
James Zern
3f72509587 vp8: remove VP8_SET_DBG* control support
the --enable-postproc-visualizer configure option remains as a no-op as
do the control names and values for compatibility
+ remove the corresponding debug flags from vpxdec: --pp-*

Change-Id: I4a001cd9962b59560d7d6bda6272d4ff32b8d37c
2016-09-20 20:19:36 -07:00
James Zern
cec6433e41 vp9_idct: delete dead TODOs
Change-Id: Icdd5494f557d83026dc078bce37997a76aa288fb
2016-09-20 19:46:27 -07:00
James Zern
b6e686b1ea Merge changes from topic 'Wshorten'
* changes:
  vp8: convert some uses of unsigned long to size_t
  vp8/encoder: quiet some -Wshorten-64-to-32 warnings
2016-09-20 23:17:20 +00:00
James Zern
c31d02615d Merge "variance_avx2: sync variance functions with c-code" 2016-09-20 22:33:39 +00:00
James Zern
2351a73531 Merge "examples: quiet -Wshorten-64-to-32 warnings" 2016-09-20 22:32:58 +00:00
James Zern
feb4313c5f Merge "vp9_rtcd: remove non-existent highbd convolve fns" 2016-09-20 22:07:09 +00:00
Johann Koenig
8478f97105 Merge "Enable ssse3 bilinear tests" 2016-09-20 21:46:50 +00:00
Johann Koenig
18fd69ee91 Merge "Add vp8_bilinear_filter test" 2016-09-20 20:30:48 +00:00
Alex Converse
0d2687ef87 Merge "Code class0 using vpx_read() / vpx_write()." 2016-09-20 19:19:29 +00:00
James Zern
5841929fde vp9_rtcd: remove non-existent highbd convolve fns
these were moved to vpx_dsp

Change-Id: I307b07ae05e2333277d4b7011cba36dcf8409959
2016-09-19 20:01:23 -07:00
James Zern
08b8b6bb8f examples: quiet -Wshorten-64-to-32 warnings
all around usage of strtol/strtoul

Change-Id: If907c89f107a068987aa71ddd93cee9a7389e4cd
2016-09-19 19:02:49 -07:00
James Zern
8281da74b9 vp8: convert some uses of unsigned long to size_t
similar to changes that were done in vp9 for encoded frame size
reporting. has the side-effect of quieting a -Wshorten-64-to-32 warning.

Change-Id: I89f74cb617fc29334ee351dc8dfaa3b8cfd4e5af
2016-09-19 18:35:59 -07:00
James Zern
0ce98b423b vp8/encoder: quiet some -Wshorten-64-to-32 warnings
this code is similar to other existing uses and/or vp9

Change-Id: I56e646931379759d9f7332ea6d746060007c75ee
2016-09-19 18:35:59 -07:00
Linfeng Zhang
761e5ec2f6 Refactor lpf (size 4 and 8) NEON intrinsics optimization
Also check in 8x8 8-bit transpose NEON intrinsics optimization
transpose_u8_8x8()

Change-Id: I32d321cf97ea21eab158ac4896990fc9a51681c4
2016-09-19 16:41:37 -07:00
James Zern
6acd061aad variance_avx2: sync variance functions with c-code
add missing int64 -> uint32 cast; quiets -Wshorten-64-to-32 warnings

Change-Id: I4850b36e18dc8b399108342be4bfe0b684aefb78
2016-09-19 16:19:29 -07:00
Johann Koenig
0695843a21 Merge "Remove -fno-strict-aliasing flag" 2016-09-19 22:49:23 +00:00
Johann
fad70a358b Remove -fno-strict-aliasing flag
The referenced bug was fixed by saving neon registers. That this had any
effect was coincidental.

Both chromium and Android build with clang and neither uses this flag.

Change-Id: I470247d6fd9226fc207b42a187105581a94badc3
2016-09-19 12:16:03 -07:00
Nathan E. Egge
de7f5ce9e5 Code class0 using vpx_read() / vpx_write().
The vp9_mv_class0_tree is a balanced tree with two leafs and can
simply be coded as a boolean with probability class0[0].

Change-Id: If294dac825a5f945371092c74aa8e3f84cd962b6
(cherry picked from commit be8a8ab62ebdd111c6f2e9a33b15630570671eba)
2016-09-19 10:50:39 -07:00
Alex Converse
01e2902521 Zero the whole rd_counts struct rather than the each member
Change-Id: I495aa9cec2b2b8f1ae69bdab8b3feeca76358472
2016-09-19 10:04:47 -07:00
James Zern
aa0eb67bf7 loopfilter_mb_neon: remove unused load_8x8()
quiets a -Wunused-function warning for arm targets

Change-Id: I293a7e3d3d7d61d6af2fbedad5e8c25126c418b6
2016-09-17 11:00:31 -07:00
Linfeng Zhang
5d73639d8f Merge "Refactor lpf (size 16) NEON intrinsics optimization" 2016-09-17 00:33:30 +00:00
James Zern
112eb54c1b Merge "vpx_codec_control: return incapable for unmatched control" 2016-09-16 17:30:44 +00:00
Linfeng Zhang
8107368000 Refactor lpf (size 16) NEON intrinsics optimization
Extract shared code so later lpf size 4 and 8 functions can reuse.

Change-Id: Ibb43ef1fd8651bd2e32fcc4c56cf6fa7ca237401
2016-09-16 09:12:13 -07:00
James Zern
33aef48f29 vpx_subpixel_8t_intrin_avx2: tolerate unversioned clang
assume __clang_major__==0 has the latest version of
_mm256_broadcastsi128_si256. fixes builds with custom clang toolchains.

BUG=b/30970831

Change-Id: I90becd56278e4716bd46e2ba9d910af977e8dfa6
2016-09-16 07:14:17 +00:00
James Zern
7a9e476072 Merge changes from topic 'clang-format'
* changes:
  apply clang-format
  .clang-format: update to 3.8.1
2016-09-16 07:11:33 +00:00
Johann
e813c2b416 Enable ssse3 bilinear tests
The code only has issues when xoffset == 0 and yoffset == 0 which
represents a simple copy. Presumably this case does not need to be
handled because the issue has existed since 2010.

BUG=webm:1287

Change-Id: Ic47e2653f3b729e99b40e53d8d2d8d1501edaaa9
2016-09-15 23:16:26 -07:00
Johann
caf9a7841e Add vp8_bilinear_filter test
Build out the sixtap_predict test because the filters are
interchangeable. Add verbose failures and border checking.

Change-Id: I962f50041750dca6f8d0cd35a943424cf82ddcb1
2016-09-15 23:16:19 -07:00
James Zern
6ae58fd55e Merge "Revert "Restore vp8_sixtap_predict4x4_neon"" 2016-09-16 06:13:42 +00:00
Johann Koenig
7795e99296 Revert "Restore vp8_sixtap_predict4x4_neon"
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5.

Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well.

Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-16 06:12:49 +00:00
Johann Koenig
fdbe249991 Merge "Restore vp8_bilinear_predict4x4_neon" 2016-09-16 05:33:50 +00:00
Johann Koenig
102eae06e9 Merge "zero structures completely" 2016-09-16 04:41:22 +00:00
Johann
43743b1d3e Restore vp8_bilinear_predict4x4_neon
This function was removed when clang started introducing alignment hints
which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421

The load has been rendered safe with an implementation ~indiscernible
performance-wise that uses _u8 and over-reads just a touch.

It is still ~5x faster than C in the unaligned case and doing both
filters.

BUG=webm:892
BUG=webm:1273

Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
2016-09-15 21:16:11 -07:00
Johann Koenig
7bc0733c27 Merge "Restore vp8_sixtap_predict4x4_neon" 2016-09-16 04:12:08 +00:00
Johann
d5054504a7 zero structures completely
Use vp[89]_zero when possible.

Expand the {} set when neither is available or nearby.

Change-Id: Ifc1f46f60100916cd798bf7be3a10f09321c99bd
2016-09-16 03:54:11 +00:00
Johann
1d2aaf58dd vp8 postproc: expand CONFIG_POSTPROC guard
postproc.c is overloaded and used for both postproc and internal stats.
If only --enable-internal-stats is specified there are issues with
non-existent struct members and unused functions.

Change-Id: I82367f1ffce659c3918c9f964dbce94a716fbb89
2016-09-16 03:52:19 +00:00
Johann
f2be831885 altref test: comment out 'pass'
All the other test which do not use 'pass' (which appears to be almost
all of them) do this.

Cleans -Wextra/-Wunused-parameter:
unused parameter ‘pass’

Change-Id: I1ff3acf3f3d1e831f94dcb00ea36337afe0aefe0
2016-09-15 17:45:47 -07:00
Johann Koenig
c53aacf408 Merge "vp9 frame parallel test: Initialize cfg differently" 2016-09-15 23:46:56 +00:00
Marco
4c1a9fb8db vp9: Small code cleanup.
Remove the experiment LIMIT_QP_ONEPASS_VBR_LAG, as its
not currently used and no plan to use in near future.

Change-Id: Ib069f8d7225195be04b765d0ab477510dfba6a3b
2016-09-15 15:17:17 -07:00
clang-format
5f6d143b41 apply clang-format
Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
2016-09-15 15:07:53 -07:00
James Zern
30b1abd6e6 .clang-format: update to 3.8.1
based on --style=Google with the following differences:
3a4
> # Generated with clang-format 3.8.1
13c14
< AllowShortCaseLabelsOnASingleLine: false
---
> AllowShortCaseLabelsOnASingleLine: true
41c42
< ConstructorInitializerAllOnOneLineOrOnePerLine: true
---
> ConstructorInitializerAllOnOneLineOrOnePerLine: false
44,45c45,46
< Cpp11BracedListStyle: true
< DerivePointerAlignment: true
---
> Cpp11BracedListStyle: false
> DerivePointerAlignment: false
73c74
< PointerAlignment: Left
---
> PointerAlignment: Right
75c76
< SortIncludes:    true
---
> SortIncludes:    false

SortIncludes will like be enabled in a future commit

Change-Id: I5c404f44081b65354e7f526411c91fbbe31ac5af
2016-09-15 15:05:52 -07:00
Johann
d9dce2f48e Restore vp8_sixtap_predict4x4_neon
This function was removed when clang started introducing alignment hints
which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421

The load has been rendered safe with an implementation ~indiscernible
performance-wise that uses _u8 and over-reads just a touch.

The store, when unaligned, has a version that is ~25% slower but safe
when xoffset = 0 (second pass filter only). When the first pass filter
(or both) are in play, the new version is almost identical in speed.

Worst case performance (both filters, unaligned stores) is roughly 3-4x
faster than C.

BUG=webm:817
BUG=webm:1273

Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
2016-09-15 14:56:47 -07:00