Commit Graph

87 Commits

Author SHA1 Message Date
Yi Luo
a3452996a1 High bit depth inter prediction horizontal/vertical filters AVX2
User-level speed improvement on i7-6700, cpu-used=1,
  x86_64 Linux; bitrates: 1080p at 8 Mbps, 4K at 16 Mbps:
- Decoder:
  1080p: ~4%
  4K: ~5%
- Encoder:
  1080p: ~1%
  4K: ~3%

Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640
2017-05-03 12:18:01 -07:00
Luca Barbato
e2ad89092d ppc: Add convolve8_vsx and convolve8_avg_vsx
Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857
2017-05-02 20:27:47 -07:00
Luca Barbato
e6ca81ee67 ppc: Add convolve8_avg_vert_vsx
Only the generic one again, speedups for 8x8 and larger blocks to
come later.

Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d
2017-05-02 20:27:42 -07:00
Luca Barbato
a65f1771ad ppc: Add convolve8_vert
Only the generic one again, speedups for 8x8 and larger blocks
to come later.

Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f
2017-05-02 20:27:33 -07:00
Luca Barbato
77772350f3 ppc: Add convolve8_horiz_avg
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15
2017-05-02 20:27:28 -07:00
Luca Barbato
08edb85bd0 ppc: Add convolve8_horiz
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046
2017-05-02 20:27:16 -07:00
Luca Barbato
d51d3934f5 ppc: Add convolve_avg
Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014
2017-04-29 15:47:25 +02:00
Luca Barbato
63860ba7b8 ppc: Add convolve_copy
Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405
2017-04-29 15:47:25 +02:00
Linfeng Zhang
51dc998f3a Update highbd convolve functions arguments to use uint16_t src/dst
BUG=webm:1388

Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-25 14:22:19 -07:00
Linfeng Zhang
bf8a49abbd Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
Replace by CAST_TO_BYTEPTR/SHORTPTR.
The rule is: if a short ptr is cast to a byte ptr, any offset
operation on the byte ptr must be doubled. We do this by casting to
short ptr first, adding the offset, then casting back to byte ptr.

BUG=webm:1388

Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
2017-04-19 12:13:49 -07:00
Yi Luo
aa5a941992 Add AVX2 optimization to copy/avg functions
Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074
2017-04-14 16:50:10 -07:00
Linfeng Zhang
9c8981c666 add vpx high bitdepth convolve8 NEON intrinsics optimization
BUG=webm:1299

Change-Id: I236bfa0441e357b6ff05add8269a2cfb543924d1
2016-10-17 15:23:54 -07:00
Linfeng Zhang
f910d14a1a add vpx_highbd_convolve_{copy,avg}_neon()
BUG=webm:1299

Change-Id: Ib87ac466ada63251eb06ae2abd1e13e61e0d1538
2016-10-13 15:21:14 -07:00
Linfeng Zhang
85a9e48d25 Refine vpx_convolve_copy_neon() and vpx_convolve_avg_neon()
BUG=webm:1290

Change-Id: Ia27e58521eba5a4852b50381c56746fa5767f6d6
2016-09-29 16:19:39 -07:00
Linfeng Zhang
81ff7a065f Clean convolve_test.cc
Combine test MatchesReferenceSubpixelFilter and
MatchesReferenceAveragingSubpixelFilter.

Change-Id: I75f96befbbb118cdc6b8c6001b4cdda8d88fbbd3
2016-09-27 13:36:31 -07:00
clang-format
9c9d92ae3a test: apply clang-tidy google-readability-braces-around-statements
applied against a x86_64 configure with and without
--enable-vp9-highbitdepth

clang-tidy-3.7.1 \
  -checks='-*,google-readability-braces-around-statements' \
  -header-filter='.*' -fix
+ clang-format afterward

Change-Id: Ia2993ec64cf1eb3505d3bfb39068d9e44cfbce8d
2016-08-05 20:02:28 -07:00
clang-format
33e40cb5db test: apply clang-format
Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f
2016-07-27 01:58:52 +00:00
Johann Koenig
e616012d69 Merge changes I59a11921,I296a0b81,I397d7753
* changes:
  configure: remove x86inc.asm distinction
  test: remove x86inc.asm distinction
  vpx_dsp: remove x86inc.asm distinction
2016-07-01 18:13:41 +00:00
Johann
0266e70c52 test: remove x86inc.asm distinction
BUG=b:29583530

Change-Id: I296a0b81755e3086bc0a40cb126d0200ff03c095
2016-06-30 11:14:10 -07:00
James Zern
f5a6079141 convolve_test: fix byte offsets in hbd build
CONVERT_TO_BYTEPTR(x) was corrected in:
003a9d2 Port metric computation changes from nextgenv2
to use the more common (x) within the expansion. offsets should occur
after converting the pointer to the desired type.

+ factorized some common expressions

Change-Id: I171c3faaa5606d098e984baa9aa74bb36042f57f
2016-06-29 20:39:07 -07:00
Tom Finegan
9a56a5ea18 convolve_test: Fix high bit depth IOC runtime errors.
Add a cast.

BUG=webm:1225

Change-Id: I34ea18ee816569485c1f1046a81fd2a0ce527ac8
2016-05-13 09:42:58 -07:00
Tom Finegan
6042d68851 convolve_test: Fix IOC runtime errors.
Add a cast.

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1216

Change-Id: I40627de387bc9cfba37860e7a0a4f2d4524f3431
2016-05-09 16:33:59 -04:00
Alex Converse
2f97b7cbfe Port convolve test refactor to master.
Brings f03e238f to master.

Change-Id: I7f7754e7d1288b103a4510303d10afc68a7d8ca8
2016-04-27 16:53:33 -07:00
James Zern
cffef113b9 tests: quiet some unused parameter warnings
Change-Id: Iff8b0d77234f78bf407676891bccad92825bfcc6
2016-02-11 19:25:48 -08:00
Alex Converse
0c00af126d Add vpx_highbd_convolve_{copy,avg}_sse2
single-threaded:
swanky (silvermont): ~1% faster overall
peppy (celeron,haswell): ~1.5% faster overall

Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073
2015-10-09 11:50:25 -07:00
Alex Converse
7e77938d72 Generate convolve_test wrapper functions with a macro
Change-Id: Iccb4cdc23c1845cf9cb7d69101c9f4f43675d368
2015-10-09 11:42:05 -07:00
Scott LaVarnway
4e6b5079c6 VPX: remove scaled calls from FUN_CONV_1D
and FUN_CONV_2D macros.  The predict lut now handles
this case.  The encoder now calls vpx_scaled_2d() instead
of vpx_convolve8() for scaling.

Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7
2015-08-05 10:47:06 -07:00
Zoe Liu
7186a2dd86 Code refactor on InterpKernel
In essence, this refactors the code for both the interpolation
filtering and the convolution. The change moves all of the files
and renames the code from the vp9_ prefix to the vpx_ prefix
accordingly, for the underlying architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.

Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
2015-07-31 10:27:33 -07:00
Jingning Han
097d59c28c Cosmetics - Fix header file order in unit tests
Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798
2015-07-29 20:48:25 -07:00
Scott LaVarnway
1ec0853d17 Delete ChangeFilterWorks test
This test places 128 in positions that would not be found
in the VP9 filter tables.  The ssse3 code packs this table
into chars and uses the pmaddubsw instruction, which treats
the value as signed.  The ssse3 code checks for 128 in
position 3, skipping the ssse3 code if found, and calls
vp9_convolve8_c().  vp9_convolve8_c() is also used for scaling.
ChangeFilterWorks breaks the ssse3 scaling code found in other
commits.

Change-Id: I1f5a76834bc35180b9094c48f9421bdb19d3d1cb
2015-07-22 09:05:17 -07:00
James Zern
017253b7a3 remove vp9_get_interp_kernel()
expose filter_kernels[] and do the table lookup directly

Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20
2015-07-06 13:04:05 -07:00
Johann
ff8505a54d Fix --disable-use-x86inc
Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06
2015-07-01 13:15:51 -07:00
Parag Salasakar
bdfbc3e876 mips msa vp9 convolve8 avg hv optimization
average improvement ~4x-6x

Change-Id: I7c8b4f2334491be8a859592606e568bc95d019aa
2015-06-04 08:11:01 +05:30
Parag Salasakar
b8c1cdcd12 mips msa vp9 convolve8 avg horiz optimization
average improvement ~5x-8x

Change-Id: I179a69ec620fbd69979bd128f05d18113618aab4
2015-06-03 11:33:42 +05:30
Parag Salasakar
c543d38ac7 mips msa vp9 convolve8 avg vert optimization
average improvement ~4x-6x

Change-Id: Ia2e6f770da46416ebec31fdcea5cc7878879a9d9
2015-06-03 09:55:25 +05:30
Parag Salasakar
ebf7466cd8 mips msa vp9 updated convolve horiz, vert, hv, copy, avg module
Updated sources according to improved version of common MSA macros.
Enabled respective convolve MSA hooks and tests.
Overall, this is just upgrading the code with styling changes.

Change-Id: If5ad6ef8ea7ca47feed6d2fc9f34f0f0e8b6694d
2015-06-02 12:03:51 +05:30
Parag Salasakar
f9f078ebb6 mips msa vp9 updated macros and disable all MSA functions
Made minor restructuring and styling changes to the sources, such as
generic macro definitions and their use to reduce code lines, better
code alignment, etc.
Disabled all MSA hooks and tests.

Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190
2015-05-29 13:34:33 +05:30
Parag Salasakar
95cb130f32 Merge "mips msa vp9 copy and avg convolve optimization"
2015-04-30 04:39:13 +00:00
Parag Salasakar
2301d10f73 mips msa vp9 copy and avg convolve optimization
average improvement ~3x-5x

Change-Id: I422e4c33ea7e6d6783ba40029438ccf21b0e76bb
2015-04-29 12:28:17 +05:30
James Zern
f274c2199b vpx_mem: remove vpx_memcpy
vestigial. replace instances with memcpy(), which they were already
defined to.

Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c
2015-04-28 19:59:41 -07:00
Parag Salasakar
ca90d4fd96 mips msa vp9 convolve8 horiz optimization
average improvement ~6x-8x

Change-Id: I7c91eec41aada3b0a5231dda7869b3b968f3ad18
2015-04-21 12:31:26 +05:30
Parag Salasakar
ef51c1ab5b mips msa vp9 convolve8 hv optimization
average improvement ~5x-8x

Change-Id: I3214734cb3716e742907ce0d2d7a042d953df82b
2015-04-21 09:17:49 +05:30
Parag Salasakar
27d083c1b9 mips msa vp9 convolve8 vert optimization
average improvement ~6x-10x

Change-Id: Ie3f3ab3a9005be84935919701e56b404e420affa
2015-04-18 08:13:04 +05:30
Jim Bankoski
18d323606d Fix test to call clear system state in convolve_test.
Assembly tests should clear system state, as we have no
expectation of proper system state in between test runs.

Change-Id: I0f591996c1f17ef2a5a8572a6b445f757223a144
2014-12-12 06:18:56 -08:00
James Yu
01fc6f51e0 VP9 common for ARMv8 by using NEON intrinsics 07
Add vp9_convolve8_neon.c
- vp9_convolve8_horiz_neon
- vp9_convolve8_vert_neon

Change-Id: I0bdd99ff72d275223fe211ac7243c25a5a60cf87
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:03:07 -08:00
James Yu
893534a996 VP9 common for ARMv8 by using NEON intrinsics 04
Add vp9_convolve8_avg_neon.c
- vp9_convolve8_avg_horiz_neon
- vp9_convolve8_avg_vert_neon

Change-Id: I617971e37b02186fec5aca181f4f9622050ea2df
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:03:07 -08:00
James Yu
d12757f5c6 VP9 common for ARMv8 by using NEON intrinsics 03
Add vp9_copy_neon.c
- vp9_convolve_copy_neon

Change-Id: I291fc5423d06240876411bbceab03eae5ef585be
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:02:46 -08:00
Scott LaVarnway
617382a2e3 VP9 common for ARMv8 by using NEON intrinsics 02
Add vp9_avg_neon.c
- vp9_convolve_avg_neon

Change-Id: Id2c9d5bcfa37cff1a16417aba1656ff07bdf10fd
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 19:00:21 -08:00
Johann
1c3594c334 Add convolve_copy and convolve_avg to the test
Change-Id: Ic9438031282e63e627550f7e4cdeda36e43e647b
2014-12-09 12:56:38 -08:00
Deb Mukherjee
27dce0f324 Test name changes to use SSE/SSE2 exactly
Change-Id: I3b5a478d198868c2796366f0ac59d0e2036308b8
2014-11-07 13:44:19 -08:00