generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	30d9a1916c	vpxdsp: [x86] add highbd_h_predictor functions C vs SSE2 speed gains: _4x4 : ~8.12x _8x8 : ~9.71x _16x16 : ~8.21x _32x32 : ~5.0x BUG=webm:1422 Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993	2017-08-28 17:31:18 -07:00
Marco Paniconi	3e069846b9	Merge "Revert "quantize avx: copy 32x32 implementation""	2017-08-25 18:20:31 +00:00
Marco Paniconi	8c42237bb2	Revert "quantize avx: copy 32x32 implementation" This reverts commit `f60d1dcd3d`. Reason for revert: <INSERT REASONING HERE> Failures in AVX/VP9QuantizeTest in nightly tests. Original change's description: > quantize avx: copy 32x32 implementation > > Ensure avx and ssse3 stay in sync by testing them against each other. > > Change-Id: I699f3b48785c83260825402d7826231f475f697c TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org Change-Id: Ibd38636212269328317dd0721be9d25452113d1c No-Presubmit: true No-Tree-Checks: true No-Try: true	2017-08-25 16:56:08 +00:00
Shiyou Yin	ece1989fa2	Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi."	2017-08-25 06:44:02 +00:00
Shiyou Yin	9e4647c7ab	vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi. Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174	2017-08-25 01:58:49 +08:00
Johann	f60d1dcd3d	quantize avx: copy 32x32 implementation Ensure avx and ssse3 stay in sync by testing them against each other. Change-Id: I699f3b48785c83260825402d7826231f475f697c	2017-08-24 10:42:34 -07:00
Johann	1787e7dbe0	quantize ssse3: copy implementation to intrinsics Still does not pass tests. Does match the previous assembly, although saving the sign before multiplying is dubious. Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a	2017-08-24 07:47:51 -07:00
Shiyou Yin	d080c92524	Merge "vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi."	2017-08-24 00:55:11 +00:00
Johann Koenig	f53b656207	Merge "quantize avx: copy implementation to intrinsics"	2017-08-23 21:14:13 +00:00
Scott LaVarnway	1aad50c092	Merge "vpx_dsp: get32x32var_avx2() cleanup"	2017-08-23 19:59:25 +00:00
Johann Koenig	dfafd10ef5	Merge "quantize neon: round dqcoeff towards zero"	2017-08-23 19:20:53 +00:00
Johann	7c27872164	quantize avx: copy implementation to intrinsics Adds an early exit based on ptest. Slightly slower than ssse3 in the full case because of the extra check, but potentially faster if lots of rows can be skipped. Very close in speed to the assembly. Can run in 32 bit, unlike the assembly. Allows reworking the function prototype to use structs. Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e	2017-08-23 09:19:16 -07:00
Johann	2a5aa98a35	quantize neon: round dqcoeff towards zero Add 1 if negative to get dqcoeff to round towards zero. 10-15% faster than converting to positive before shifting. Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d	2017-08-23 08:05:50 -07:00
Shiyou Yin	59e065b6ed	vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi. Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2	2017-08-23 15:14:15 +08:00
Johann	b9c1dcc5fa	quantize ssse3: copy style from sse2 Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598	2017-08-22 14:25:27 -07:00
Johann	75752ab7c0	quantize sse2: copy opts from ssse3 Simplify eob calculations based on ssse3 implementation. General clean up and re-scoping. Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee	2017-08-22 13:01:44 -07:00
Johann Koenig	ab27b68693	Merge changes Icfb70687,I9a963e99,Ie8ac00ef,I1272917c * changes: quantize: ignore skip_block in arm quantize: ignore skip_block in x86 quantize fp: ignore skip_block in arm quantize fp: ignore skip_block in x86	2017-08-22 19:19:14 +00:00
James Zern	419ce36294	Merge "ppc: Add vpx_idct16x16_256_add_vsx"	2017-08-22 00:48:39 +00:00
Shiyou Yin	bff5aa9827	Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi."	2017-08-22 00:37:23 +00:00
Johann	2c56bb97f2	quantize: ignore skip_block in arm Change-Id: Icfb70687476b2edb25d255793ba325b261d40584	2017-08-21 14:37:50 -07:00
Johann	c02fdd0258	quantize: ignore skip_block in x86 Change-Id: I9a963e99f08761f0c8d6a305619270b2f1c4edf8	2017-08-21 14:37:03 -07:00
Johann	13eed991f9	Remove skip_block from quantize This condition is handled before this code is reached. The ssse3 version of the function has always crashed when attempting to handle the skip_block condition. Add assert() and comments regarding the usage of skip_block. Removing the parameter is a fairly involved process so leave it be for the moment. Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a	2017-08-21 09:49:04 -07:00
Scott LaVarnway	eab3f5e0cc	vpx_dsp: get32x32var_avx2() cleanup renamed to get32x16var_avx2() BUG=webm:1404 Change-Id: Icb8f3986c9c9c646e13a69430db7235fc7e1a036	2017-08-18 13:44:09 -07:00
Scott LaVarnway	2c5478e383	Merge "vpx_dsp: vpx_get16x16var_avx2() cleanup"	2017-08-18 20:30:59 +00:00
Scott LaVarnway	2f7497f341	vpx_dsp: vpx_get16x16var_avx2() cleanup BUG=webm:1404 Change-Id: I88aceb07f4db4870a06eee21d87296974ce3221a	2017-08-18 12:23:49 -07:00
Johann Koenig	1426f04e91	Merge "quantize: normalize intermediate types"	2017-08-18 16:00:28 +00:00
Shiyou Yin	7d82e57f5b	vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi. Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e	2017-08-18 09:06:49 +08:00
James Zern	bb15fd51be	highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo 135 -> 34 fixes unused function warnings for highbd_idct32_34_4x32_quarter_[12] Change-Id: I4f50ff6ea514200af93dd59ff94c7f9717409682	2017-08-17 15:37:38 -07:00
Johann	7f602d6114	quantize: normalize intermediate types Despite abs_coeff being a positive value, all the other implementations treat it as signed which simplifies restoring the sign. HBD builds cast qcoeff to avoid a visual studio warning. Match vp9_quantize.c style of casting the entire expression. Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5	2017-08-17 12:34:28 -07:00
James Zern	e038d1610e	inv_txfm_sse2.h: correct idct/iadst prototypes fixes mismatch between prototypes and definitions Change-Id: Ib5e7dfcce244dbb8401815be2cdd183d96792652	2017-08-16 23:06:09 -07:00
Linfeng Zhang	f95686895b	Merge changes I08b562b6,Ia275940a,I51106e90 * changes: Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} Update highbd idct x86 optimizations. Update 32x32 idct sse2 and ssse3 optimizations.	2017-08-16 16:36:37 +00:00
Jerome Jiang	6b9c691daf	Merge "Clean up writing YUV files for debug purpose."	2017-08-15 18:28:54 +00:00
Jerome Jiang	a153080b55	Clean up writing YUV files for debug purpose. Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files. Delete some flags that can be enabled during build. To enable writing denoised YUV, use the following command line: CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure --enable-vp9-temporal-denoising For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP' Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528	2017-08-15 10:44:03 -07:00
Johann	77ed4414d6	quantize: silence unsigned overflow warning The result of the xor operation is unsigned. If coeff was negative, this results in an unsigned value - INT_MIN. Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0	2017-08-15 09:48:24 -07:00
Linfeng Zhang	d72e20b123	Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} BUG=webm:1412 Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f	2017-08-14 17:05:22 -07:00
Linfeng Zhang	69775d2f40	Update highbd idct x86 optimizations. BUG=webm:1412 Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069	2017-08-14 16:59:50 -07:00
Linfeng Zhang	3f05a70c41	Update 32x32 idct sse2 and ssse3 optimizations. Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70	2017-08-14 16:59:31 -07:00
Linfeng Zhang	15193ce51f	Merge "Clean highbd idct x86 code with inline functions"	2017-08-10 20:25:18 +00:00
Johann Koenig	9bb8ce5efb	Merge "neon: vpx_quantize_b_32x32"	2017-08-10 15:42:49 +00:00
Johann Koenig	0b393ae505	Merge "quantize: copy ssse3 optimizations to intrinsics"	2017-08-10 15:42:20 +00:00
Linfeng Zhang	39da7fb786	Clean highbd idct x86 code with inline functions Created inline functions highbd_butterfly_cospi16_sse2() and highbd_butterfly_cospi16_sse4_1() BUG=webm:1412 Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a	2017-08-08 17:53:28 -07:00
Johann	93166c5e51	neon: vpx_quantize_b_32x32 With skip block the neon is about twice as fast as C. The neon has no shortcut for coeff < zbin so it always takes the same amount of time. Even if the C can take the shortcut, it is over twice as fast in neon. If it can't, that gap increases to over 10x. BUG=webm:1426 Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6	2017-08-08 14:05:18 -07:00
Johann	d52cb59729	quantize: copy ssse3 optimizations to intrinsics Fairly minor differences from sse2. pabsw and psignw are the big gains. Also re-uses some values in eob calculation to avoid an extra pcmp. Fixes test failures in HBD and OS X builds. Allows using it in 32bit builds, where it is about 40% faster than sse2. Substantially faster than the assembly for skip_block. 10-20% faster the rest of the time. Change-Id: If783bb3567e561e47667e10133b9c84414a334e2	2017-08-08 12:22:14 -07:00
Linfeng Zhang	853165ba39	Update 32x32 idct sse2 funcs, add partial case 135 Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a	2017-08-07 17:37:02 -07:00
Linfeng Zhang	d670678f26	Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx() in idct x86 code Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083	2017-08-04 15:33:37 -07:00
Linfeng Zhang	fa829e0e5a	Replace multiplication_and_add() with butterfly() in idct x86 code Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e	2017-08-04 15:33:34 -07:00
Linfeng Zhang	c9fb719ee1	Update butterfly() in idct x86 optimizations. Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701	2017-08-04 15:33:28 -07:00
Linfeng Zhang	7f20c3ac44	Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1 BUG=webm:1412 Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca	2017-08-04 15:31:17 -07:00
Linfeng Zhang	22b6dc9fdf	Update for loop increment of idct x86 functions Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4	2017-08-04 15:29:19 -07:00
Linfeng Zhang	0c61331244	Update high bitdepth 16x16 idct x86 code Prepare for high bitdepth 16x16 idct sse4.1 code. Just functions moving and renaming. BUG=webm:1412 Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a	2017-08-04 15:12:33 -07:00
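
Commit 2a5aa98a35 above ("quantize neon: round dqcoeff towards zero") describes adding 1 to negative values so that a right shift rounds toward zero rather than toward negative infinity. A minimal scalar sketch of that idea follows. It is not the NEON code from the commit, and `halve_toward_zero` is a hypothetical name used only for illustration. It also assumes the compiler implements `>>` on negative signed integers as an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): halve a signed value,
 * rounding toward zero so the result matches C's truncating `x / 2`. */
static int32_t halve_toward_zero(int32_t x) {
  /* The sign bit moved to bit 0: 1 when x is negative, 0 otherwise. */
  const int32_t is_negative = (int32_t)((uint32_t)x >> 31);
  /* ASSUMPTION: >> on a negative signed value is an arithmetic shift.
   * A plain arithmetic shift rounds toward negative infinity
   * (-3 >> 1 == -2); adding 1 to negative inputs first corrects that. */
  return (x + is_negative) >> 1;
}
```

Under that assumption, `halve_toward_zero(-3)` gives -1, where a bare `-3 >> 1` would give -2. Non-negative inputs are unaffected because nothing is added to them.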

878 Commits