generic-library/vpx

Author	SHA1	Message	Date
Johann	fe96dbda15	vp9: remove x86inc.asm distinction BUG=b:29583530 Change-Id: I952da3fc0d4716dec897be0d2e9806af6612722b	2016-06-29 18:55:28 -07:00
Scott LaVarnway	eb09bbe88b	Revert "remove vp9_diamond_search_sad_avx.c" This reverts commit `be12fefa4b` and commit `057c1c4034`. Also, the mismatch between the avx version and the c version has been fixed. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1168 For a rt encode using 1080p@60fps material, up to 11% performance improvement overall was seen. Change-Id: Icd1f216209ebc6fc0b8da885f32f356fa4355ed0	2016-06-07 17:21:01 -07:00
Linfeng Zhang	2ab7b9a6c9	Merge "Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10."	2016-05-27 17:51:35 +00:00
Linfeng Zhang	af7fb17c09	Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10. Function level timing test shows about 27% time saving on a Xeon E5-2680 v2 desktop. Rename vp9_dct_sse2.c to vp9_dct_intrin_sse2.c for vp9 and rename dct_sse2.c to dct_intrin_sse2.c for vp10 to avoid duplicate basenames. Actually vp9_fwht4x4_mmx/sse2() and vp10_fwht4x4_mmx/sse2() are identical. TODO: They should be unified later if there is no intention to keep a duplicate. Change-Id: I3e537b7bbd9ba417c606cd7c68c4dbbfa583f77d	2016-05-27 09:51:16 -07:00
James Zern	be12fefa4b	remove vp9_diamond_search_sad_avx.c vp9_diamond_search_sad_avx was disabled in: `057c1c4` disable vp9_diamond_search_sad_avx this removes a missing prototype warning as the prototype is no longer included in vp9_rtcd.h. the file can be restored if someone gets around to fixing the issue. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1168 Change-Id: Ia9fda4b81c53dc5fba7c31d780d761f886940b52	2016-05-24 12:02:22 -07:00
Scott LaVarnway	0840c6fb8f	BUG FIX: undefined reference to `vp9_scale_and_extend_frame_c' See https://bugs.chromium.org/p/webm/issues/detail?id=1145 Change-Id: I778ee07dc39a524e3f729bef47a7abeed51e0cee	2016-02-08 13:42:56 -08:00
Scott LaVarnway	989c69303d	Vidyo patch: Optimization for 1-to-2 downsampling and upsampling. Change-Id: I9cc9780f506e025aea57485a9e21f0835faf173c	2016-02-04 14:50:26 -08:00
Debargha Mukherjee	02345be986	Adding an aq mode for 360 videos Different quality levels are used for different regions in the frame depending on how far they are vertically from the center. Specifically, three segments are used based on the mi_row index with respect number to the number of mi_rows in the frame. Change-Id: Ifc8b777bc58ea8521dffc4640360c67d99f8d381	2016-01-13 16:17:37 -08:00
James Zern	d36659cec7	move vp9_avg to vpx_dsp Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f	2015-12-14 14:42:12 -08:00
Geza Lore	5eefd3ebfd	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc	2015-11-11 14:03:47 +00:00
James Zern	30466f26b4	Revert "Add AVX vectorized vp9_diamond_search_sad" This reverts commit `f1342a7b07`. This breaks 32-bit builds: runtime error: load of misaligned address 0xf72fdd48 for type 'const __m128i' (vector of 2 'long long' values), which requires 16 byte alignment + _mm_set1_epi64x is incompatible with some versions of visual studio Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673	2015-11-06 13:15:01 -08:00
Yunqing Wang	57cae22c1e	Merge "Add AVX vectorized vp9_diamond_search_sad"	2015-11-05 20:17:13 +00:00
Geza Lore	f1342a7b07	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6	2015-11-05 10:02:17 +00:00
Marco	c7da053d4b	Move noise level estimate outside denoiser. Source noise level estimate is also useful for setting variance encoder parameters (variance thresholds, qp-delta, mode selection, etc), so allow it to be used also if denoising is not on. Change-Id: I4fe23d47607b4e17a35287057f489c29114beed1	2015-11-02 12:15:26 -08:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
Alex Converse	c7b7011b9b	Move VP9 SSIM metrics to vpx_dsp. Change-Id: I20c7b42631b579fade6cf7ebf6d4c69b2fcb5e5e	2015-08-06 18:25:25 -07:00
James Zern	f42012e526	Merge "add vp9_block_error_fp_neon"	2015-07-29 00:47:09 +00:00
Jingning Han	a7e9178d80	Remove vp9_dct.h file The forward 32x32 2D-DCT functions are aligned in vpx_dsp folder. The vp9_dct.h file is not effectively used now. Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281	2015-07-28 15:27:36 -07:00
Jingning Han	d19033fa4e	Move DC only forward 2D-DCT functions to vpx_dsp This completes the forward transform functions layout refactoring. Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd	2015-07-28 14:52:30 -07:00
Jingning Han	a6a4659bea	Factor 32x32 fwd DCT to vpx_dsp folder Move the 32x32 2D-DCT implementations from vp9/ to vpx_dsp/. Change-Id: Id3980696f8b69906ff7a59ff9fb2b9013d60047d	2015-07-28 11:13:41 -07:00
James Zern	ea990af7f5	add vp9_block_error_fp_neon ~60-70% faster depending on the block size Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1	2015-07-27 19:59:50 -07:00
Jingning Han	8eefb36ca9	Move forward dct sse2 header file to vpx_dsp Change-Id: Iba03852ce778c956200818e3473cfb2b48cf8d8e	2015-07-27 14:59:57 -07:00
Jingning Han	b67821f37b	Factor forward 2D-DCT transforms into vpx_dsp This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward transform operations into vpx_dsp folder. Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d	2015-07-22 15:48:17 -07:00
Yaowu Xu	c5ad31e518	Move bit writer files to vpx_dsp/ Change-Id: Id27e0007a0feac821ca66bcecbf3a723305da82d	2015-07-20 11:20:02 -07:00
Yunqing Wang	38f1fbbb75	Migrate quantization functions from vp9/ to vpx_dsp/ The following quantization functions were moved: vp9_quantize_b vp9_quantize_b_32x32 vp9_highbd_quantize_b vp9_highbd_quantize_b_32x32 vp9_quantize_dc vp9_quantize_dc_32x32 vp9_highbd_quantize_dc vp9_highbd_quantize_dc_32x32 The purpose of doing that was to allow these functions to be shared by multiple codecs. Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f	2015-07-17 16:38:14 -07:00
Johann	6a82f0d7fb	Move sub pixel variance to vpx_dsp Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1	2015-07-07 15:51:04 -07:00
James Zern	1696114587	Merge "mips msa vp9 subpel variance optimization"	2015-07-06 22:43:01 +00:00
Parag Salasakar	fbe67d307a	mips msa vp9 subpel variance optimization Change-Id: If88401bf8c5d8ee58200278734d7a5058d1585d0	2015-07-06 14:59:01 -07:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
James Zern	97946622c0	Revert "mips msa vp9 subpel variance optimization" This reverts commit `a42df86c03`. this change causes MSA/VP9SubpelVarianceTest.Ref and MSA/VP9SubpelVarianceTest.ExtremeRef failures under mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu Change-Id: I40b71a0b774eaeb31f66f795733f95cf360909f7	2015-07-02 12:06:51 -07:00
Johann	ff8505a54d	Fix --disable-use-x86inc Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06	2015-07-01 13:15:51 -07:00
Parag Salasakar	a42df86c03	mips msa vp9 subpel variance optimization average improvement ~3x-5x Change-Id: I4cbba2711467b0e205904769ebbb4a1fcbb1a311	2015-07-01 07:51:34 +05:30
Parag Salasakar	b92cc27b76	mips msa vp9 temporal filter optimization average improvement ~4x-5x Change-Id: Iad9c0a296dbc2ea96d000bd009077999ed58a3c5	2015-06-26 12:00:24 +05:30
Parag Salasakar	c040f96e4b	mips msa vp9 subtract block optimization average improvement ~3x-4x Change-Id: Idbe4d13a00d05ff8be6559b116f416e42c3b4097	2015-06-26 09:23:56 +05:30
Parag Salasakar	1543f2b60e	mips msa vp9 block error optimization average improvement ~3x-4x Change-Id: If0fdcc34b17437a7e3e7fb4caaf1067bc175f291	2015-06-26 09:04:00 +05:30
Parag Salasakar	7555e2b822	mips msa vp9 avg optimization average improvement ~2x-3x Change-Id: I76f7fc00c0ffdf2b4ba41bf3819f3b6044bcdeff	2015-06-23 07:32:25 +05:30
Parag Salasakar	bc94999148	mips msa vp9 fdct 4x4 optimization average improvement ~2x-3x Change-Id: Idf8be780b8b4228fc91f110a94e4ee1fd9af0163	2015-06-22 14:30:24 +05:30
Parag Salasakar	7ca84888c2	mips msa vp9 fdct 8x8 optimization average improvement ~4x-5x Change-Id: I37582efc2622bc20b2bf99617a76110ab24e9f6a	2015-06-20 07:48:35 +05:30
Parag Salasakar	d9fedf7832	mips msa vp9 fdct 32x32 optimization average improvement ~4x-6x Change-Id: Ibcac3ef8ed5e207cf8c121e696570e6b63d3c0f4	2015-06-17 07:58:34 +05:30
Parag Salasakar	89b4b315aa	mips msa vp9 fdct 16x16 optimization average improvement ~4x-6x Change-Id: Id3b2243e5b3c7844c90c4231a5e75fa69911362c	2015-06-16 12:49:34 +05:30
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
James Zern	a989c66b84	rename vp9_dct_impl_sse2.c to vp9_dct_sse2_impl.h this file shouldn't be built directly, it is included in vp9_dct_sse2.c to create a non-high-bitdepth and a high-bitdepth version silences missing prototype warnings for the unused FDCT* functions Change-Id: Ide6ff8c24ab31bdb0f833260505ae33660a1ad5b	2015-05-15 17:01:19 -07:00
James Zern	587a71f1d6	rename vp9_dct32x32_sse2.c to vp9_dct32x32_sse2_impl.h this file shouldn't be built directly, it is included in vp9_dct_sse2.c to create a non-high-bitdepth and a high-bitdepth version silences missing prototype warnings for the unused FDCT32x32* functions Change-Id: I0e38f16dae5ea1728de184ee2c89287d48675c51	2015-05-15 16:59:52 -07:00
James Zern	4ec47249bc	rename vp9_dct32x32_avx2.c to vp9_dct32x32_avx2_impl.h this file shouldn't be built directly, it is included in vp9_dct_avx2.c to create a non-high-bitdepth and a high-bitdepth version silences missing prototype warnings for the unused FDCT32x32* functions Change-Id: I4c19935c0e035b393be513bde735e9a78064a494	2015-05-15 16:47:51 -07:00
Johann	d5d9289800	Move shared SAD code to vpx_dsp Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a	2015-05-06 16:58:20 -07:00
Jim Bankoski	03829f2fea	Merge "Adds a blockiness metric to internal stats."	2015-04-17 16:06:26 -07:00
Jim Bankoski	3d2f037a44	Merge "adds psnrhvs to internal stats."	2015-04-17 16:06:10 -07:00
Jim Bankoski	f2cbee9a04	Merge "Adds a fastssim metric to VPX internal stats."	2015-04-17 16:05:53 -07:00
Jim Bankoski	1777413a2a	Adds a blockiness metric to internal stats. Change-Id: Iedceeb020492050063acf3fd2326f96c29db9ae5	2015-04-17 11:13:18 -07:00

1 2 3 4

167 Commits