generic-library/vpx

Author	SHA1	Message	Date
Kaustubh Raste	339f4dcaee	mips msa optimize vpx_scaled_2d function Change-Id: I638507b360c71489ab0e87bd558d2719ad995333	2017-11-29 13:27:04 +05:30
Kyle Siefring	ae35425ae6	Optimize convolve8 SSSE3 and AVX2 intrinsics Changed the intrinsics to perform summation similiar to the way the assembly does. The new code diverges from the assembly by preferring unsaturated additions. Results for haswell SSSE3 Horiz/Vert Size Speedup Horiz x4 ~32% Horiz x8 ~6% Vert x8 ~4% AVX2 Horiz/Vert Size Speedup Horiz x16 ~16% Vert x16 ~14% BUG=webm:1471 Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668	2017-10-24 10:39:48 -04:00
Linfeng Zhang	0d2e95193b	Merge "Generalize CheckScalingFiltering in ConvolveTest"	2017-10-17 16:03:07 +00:00
Linfeng Zhang	54f7d68c5c	Generalize CheckScalingFiltering in ConvolveTest Let it test extreme inputs and all filter types. In the future ConvolveTest should test regular 8-bit functions in high bitdepth mode. Change-Id: I1042564d1d390589ca203070fe332c6da3315d75	2017-10-10 14:12:43 -07:00
Kyle Siefring	1b2f92ee8e	Extend 16 wide AVX2 convolve8 code to support averaging. Also adds vpx_convolve8_avg_horiz_avx2. Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf	2017-10-09 19:10:03 -04:00
Kyle Siefring	9ca06bcdd2	Add AVX2 version of vpx_convolve8_avg. vpx_convolve8_avg works by first running a normal horizontal filter then a vertical filter averages at the end. The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the horizontal step. vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code. Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983	2017-10-07 23:37:48 -04:00
Linfeng Zhang	6543213e87	Refactor x86/vpx_subpixel_8t_intrin_ssse3.c Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac	2017-10-03 13:02:05 -07:00
Linfeng Zhang	9d0d13e939	Add vpx_scaled_2d_neon() BUG=webm:1419 Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96	2017-09-26 09:22:39 -07:00
Linfeng Zhang	d331e7a1c0	Remove get_filter_base() and get_filter_offset() in convolve so that the convolve functions are independent of table alignment. Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee	2017-09-05 15:22:36 -07:00
Yi Luo	a3452996a1	High bit depth inter prediction horizontal/vertical filters AVX2 User level speed improvement on i7-6700, cpu-used=1, x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps: - Decoder: 1080p: ~4% 4K: ~5% - Encoder: 1080p: ~1% 4K: ~3% Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640	2017-05-03 12:18:01 -07:00
Luca Barbato	e2ad89092d	ppc: Add convolve8_vsx and convolve8_avg_vsx Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857	2017-05-02 20:27:47 -07:00
Luca Barbato	e6ca81ee67	ppc: Add convolve8_avg_vert_vsx Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d	2017-05-02 20:27:42 -07:00
Luca Barbato	a65f1771ad	ppc: Add convolve8_vert Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f	2017-05-02 20:27:33 -07:00
Luca Barbato	77772350f3	ppc: Add convolve8_horiz_avg The 8x8 and larger blocks cases can be sped up further. Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15	2017-05-02 20:27:28 -07:00
Luca Barbato	08edb85bd0	ppc: Add convolve8_horiz The 8x8 and larger blocks cases can be sped up further. Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046	2017-05-02 20:27:16 -07:00
Luca Barbato	d51d3934f5	ppc: Add convolve_avg Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014	2017-04-29 15:47:25 +02:00
Luca Barbato	63860ba7b8	ppc: Add convolve_copy Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405	2017-04-29 15:47:25 +02:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
Linfeng Zhang	bf8a49abbd	Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Replace by CAST_TO_BYTEPTR/SHORTPTR. The rule is: if a short ptr is casted to a byte ptr, any offset operation on the byte ptr must be doubled. We do this by casting to short ptr first, adding offset, then casting back to byte ptr. BUG=webm:1388 Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248	2017-04-19 12:13:49 -07:00
Yi Luo	aa5a941992	Add AVX2 optimization to copy/avg functions Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074	2017-04-14 16:50:10 -07:00
Linfeng Zhang	9c8981c666	add vpx high bitdepth convolve8 NEON intrinsics optimization BUG=webm:1299 Change-Id: I236bfa0441e357b6ff05add8269a2cfb543924d1	2016-10-17 15:23:54 -07:00
Linfeng Zhang	f910d14a1a	add vpx_highbd_convolve_{copy,avg}_neon() BUG=webm:1299 Change-Id: Ib87ac466ada63251eb06ae2abd1e13e61e0d1538	2016-10-13 15:21:14 -07:00
Linfeng Zhang	85a9e48d25	Refine vpx_convolve_copy_neon() and vpx_convolve_avg_neon() BUG=webm:1290 Change-Id: Ia27e58521eba5a4852b50381c56746fa5767f6d6	2016-09-29 16:19:39 -07:00
Linfeng Zhang	81ff7a065f	Clean convolve_test.cc Combine test MatchesReferenceSubpixelFilter and MatchesReferenceAveragingSubpixelFilter. Change-Id: I75f96befbbb118cdc6b8c6001b4cdda8d88fbbd3	2016-09-27 13:36:31 -07:00
clang-format	9c9d92ae3a	test: apply clang-tidy google-readability-braces-around-statements applied against a x86_64 configure with and without --enable-vp9-highbitdepth clang-tidy-3.7.1 \ -checks='-,google-readability-braces-around-statements' \ -header-filter='.' -fix + clang-format afterward Change-Id: Ia2993ec64cf1eb3505d3bfb39068d9e44cfbce8d	2016-08-05 20:02:28 -07:00
clang-format	33e40cb5db	test: apply clang-format Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f	2016-07-27 01:58:52 +00:00
Johann Koenig	e616012d69	Merge changes I59a11921,I296a0b81,I397d7753 * changes: configure: remove x86inc.asm distinction test: remove x86inc.asm distinction vpx_dsp: remove x86inc.asm distinction	2016-07-01 18:13:41 +00:00
Johann	0266e70c52	test: remove x86inc.asm distinction BUG=b:29583530 Change-Id: I296a0b81755e3086bc0a40cb126d0200ff03c095	2016-06-30 11:14:10 -07:00
James Zern	f5a6079141	convolve_test: fix byte offsets in hbd build CONVERT_TO_BYTEPTR(x) was corrected in: `003a9d2` Port metric computation changes from nextgenv2 to use the more common (x) within the expansion. offsets should occur after converting the pointer to the desired type. + factorized some common expressions Change-Id: I171c3faaa5606d098e984baa9aa74bb36042f57f	2016-06-29 20:39:07 -07:00
Tom Finegan	9a56a5ea18	convolve_test: Fix high bit depth IOC runtime errors. Add a cast. BUG=webm:1225 Change-Id: I34ea18ee816569485c1f1046a81fd2a0ce527ac8	2016-05-13 09:42:58 -07:00
Tom Finegan	6042d68851	convolve_test: Fix IOC runtime errors. Add a cast. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1216 Change-Id: I40627de387bc9cfba37860e7a0a4f2d4524f3431	2016-05-09 16:33:59 -04:00
Alex Converse	2f97b7cbfe	Port convolve test refactor to master. Brings `f03e238f` to master. Change-Id: I7f7754e7d1288b103a4510303d10afc68a7d8ca8	2016-04-27 16:53:33 -07:00
James Zern	cffef113b9	tests: quiet some unused parameter warnings Change-Id: Iff8b0d77234f78bf407676891bccad92825bfcc6	2016-02-11 19:25:48 -08:00
Alex Converse	0c00af126d	Add vpx_highbd_convolve_{copy,avg}_sse2 single-threaded: swanky (silvermont): ~1% faster overall peppy (celeron,haswell): ~1.5% faster overall Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073	2015-10-09 11:50:25 -07:00
Alex Converse	7e77938d72	Generate convolve_test wrapper functions with a macro Change-Id: Iccb4cdc23c1845cf9cb7d69101c9f4f43675d368	2015-10-09 11:42:05 -07:00
Scott LaVarnway	4e6b5079c6	VPX: remove scaled calls from FUN_CONV_1D and FUN_CONV_2D macros. The predict lut now handles this case. The encoder now calls vpx_scaled_2d() instead of vpx_convolve8() for scaling. Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7	2015-08-05 10:47:06 -07:00
Zoe Liu	7186a2dd86	Code refactor on InterpKernel It in essence refactors the code for both the interpolation filtering and the convolution. This change includes the moving of all the files as well as the changing of the code from vp9_ prefix to vpx_ prefix accordingly, for underneath architectures: (1) x86; (2) arm/neon; and (3) mips/msa. The work on mips/drsp2 will be done in a separate change list. Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46	2015-07-31 10:27:33 -07:00
Jingning Han	097d59c28c	Cosmetics - Fix header file order in unit tests Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798	2015-07-29 20:48:25 -07:00
Scott LaVarnway	1ec0853d17	Delete ChangeFilterWorks test This test places 128 in positions that would not be found in the VP9 filter tables. The ssse3 code packs this table into chars and uses the pmaddubsw instruction, which treats the value as signed. The ssse3 code checks for 128 in position 3, skipping the ssse3 code if found, and calls vp9_convolve8_c(). vp9_convolve8_c() is also used for scaling. ChangeFilterWorks breaks the ssse3 scaling code found in other commits. Change-Id: I1f5a76834bc35180b9094c48f9421bdb19d3d1cb	2015-07-22 09:05:17 -07:00
James Zern	017253b7a3	remove vp9_get_interp_kernel() expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20	2015-07-06 13:04:05 -07:00
Johann	ff8505a54d	Fix --disable-use-x86inc Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06	2015-07-01 13:15:51 -07:00
Parag Salasakar	bdfbc3e876	mips msa vp9 convolve8 avg hv optimization average improvement ~4x-6x Change-Id: I7c8b4f2334491be8a859592606e568bc95d019aa	2015-06-04 08:11:01 +05:30
Parag Salasakar	b8c1cdcd12	mips msa vp9 convolve8 avg horiz optimization average improvement ~5x-8x Change-Id: I179a69ec620fbd69979bd128f05d18113618aab4	2015-06-03 11:33:42 +05:30
Parag Salasakar	c543d38ac7	mips msa vp9 convolve8 avg vert optimization average improvement ~4x-6x Change-Id: Ia2e6f770da46416ebec31fdcea5cc7878879a9d9	2015-06-03 09:55:25 +05:30
Parag Salasakar	ebf7466cd8	mips msa vp9 updated convolve horiz, vert, hv, copy, avg module Updated sources according to improved version of common MSA macros. Enabled respective convolve MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: If5ad6ef8ea7ca47feed6d2fc9f34f0f0e8b6694d	2015-06-02 12:03:51 +05:30
Parag Salasakar	f9f078ebb6	mips msa vp9 updated macros and disable all MSA functions Done little restructuring/styling changes to the sources like generic macro definitions, their use to reduce code lines, better code alignments etc. Disabled all MSA hooks and tests Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190	2015-05-29 13:34:33 +05:30
Parag Salasakar	95cb130f32	Merge "mips msa vp9 copy and avg convolve optimization"	2015-04-30 04:39:13 +00:00
Parag Salasakar	2301d10f73	mips msa vp9 copy and avg convolve optimization average improvement ~3x-5x Change-Id: I422e4c33ea7e6d6783ba40029438ccf21b0e76bb	2015-04-29 12:28:17 +05:30
James Zern	f274c2199b	vpx_mem: remove vpx_memcpy vestigial. replace instances with memcpy() which they already were being defined to. Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c	2015-04-28 19:59:41 -07:00
Parag Salasakar	ca90d4fd96	mips msa vp9 convolve8 horiz optimization average improvement ~6x-8x Change-Id: I7c91eec41aada3b0a5231dda7869b3b968f3ad18	2015-04-21 12:31:26 +05:30

1 2

96 Commits