generic-library/vpx

Author	SHA1	Message	Date
James Zern	60760f710f	fix vp9_satd_sse2 accumulate satd in 32-bits + add unit test Change-Id: I6748183df3662ddb9d635f9641f9586f2fd38ad5	2015-11-20 14:35:46 -08:00
James Zern	3e0138edb7	vp9_satd: return an int the final sum may use up to 26 bits + add a unit test + disable the sse2 as the result will rollover; this will be fixed in a future commit Change-Id: I2a49811dfaa06abfd9fa1e1e65ed7cd68e4c97ce	2015-11-20 14:35:38 -08:00
paulwilkins	0149fb3d6b	Changes to exhaustive motion search. This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa	2015-11-13 10:16:31 +00:00
Geza Lore	5eefd3ebfd	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc	2015-11-11 14:03:47 +00:00
James Zern	30466f26b4	Revert "Add AVX vectorized vp9_diamond_search_sad" This reverts commit `f1342a7b07`. This breaks 32-bit builds: runtime error: load of misaligned address 0xf72fdd48 for type 'const __m128i' (vector of 2 'long long' values), which requires 16 byte alignment + _mm_set1_epi64x is incompatible with some versions of visual studio Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673	2015-11-06 13:15:01 -08:00
Geza Lore	f1342a7b07	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6	2015-11-05 10:02:17 +00:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
Julia Robson	406030d1b0	Accelerated transform in high bit depth When configured with high bitdepth enabled, the 8bit transform stopped using optimised code. This made 8bit content decode slowly. Change-Id: I67d91f9b212921d5320f949fc0a0d3f32f90c0ea	2015-09-28 21:09:16 -07:00
Alex Converse	c7b7011b9b	Move VP9 SSIM metrics to vpx_dsp. Change-Id: I20c7b42631b579fade6cf7ebf6d4c69b2fcb5e5e	2015-08-06 18:25:25 -07:00
James Zern	a0fd7a9831	Merge "add vp9_vector_var_neon"	2015-08-04 02:30:41 +00:00
James Zern	7dc5a689b4	add vp9_vector_var_neon ~50-60% faster depending on the width Change-Id: I9d007cfa10b9aaa2169c8c009d95522df6123a92	2015-07-31 17:31:58 -07:00
Jingning Han	e8b133c79c	Factor inverse transform functions into vpx_dsp This commit moves the module inverse transform functions from vp9 to vpx_dsp folder. The hybrid transform wrapper functions stay in the vp9 folder, since it involves codec-specific data structures. Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8	2015-07-31 16:21:00 -07:00
Zoe Liu	7186a2dd86	Code refactor on InterpKernel This change refactors the code for both interpolation filtering and convolution. It moves all the files and renames the code from the vp9_ prefix to the vpx_ prefix for the underlying architectures: (1) x86; (2) arm/neon; and (3) mips/msa. The work on mips/dspr2 will be done in a separate change list. Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46	2015-07-31 10:27:33 -07:00
James Zern	f42012e526	Merge "add vp9_block_error_fp_neon"	2015-07-29 00:47:09 +00:00
Jingning Han	fc18cf7a11	Merge "Move DC only forward 2D-DCT functions to vpx_dsp"	2015-07-29 00:06:37 +00:00
Jingning Han	d19033fa4e	Move DC only forward 2D-DCT functions to vpx_dsp This completes the forward transform functions layout refactoring. Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd	2015-07-28 14:52:30 -07:00
Hui Su	fe7cabe8b6	Merge "Move intra prediction functions from vp9/common/ to vpx_dsp/"	2015-07-28 20:41:01 +00:00
Jingning Han	a6a4659bea	Factor 32x32 fwd DCT to vpx_dsp folder Move the 32x32 2D-DCT implementations from vp9/ to vpx_dsp/. Change-Id: Id3980696f8b69906ff7a59ff9fb2b9013d60047d	2015-07-28 11:13:41 -07:00
James Zern	ea990af7f5	add vp9_block_error_fp_neon ~60-70% faster depending on the block size Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1	2015-07-27 19:59:50 -07:00
hui su	7971846a5e	Move intra prediction functions from vp9/common/ to vpx_dsp/ Change-Id: I64edc26cf4aab050c83f2d393df6250628ad43b8	2015-07-27 13:38:16 -07:00
Jingning Han	b67821f37b	Factor forward 2D-DCT transforms into vpx_dsp This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward transform operations into vpx_dsp folder. Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d	2015-07-22 15:48:17 -07:00
Jingning Han	e253eaa036	Unify the high bit-depth forward hybrid transforms The SSE2 version high bit-depth forward hybrid transforms are essentially using the C functions via cross referencing to 1-D functions in vp9_dct.c. This commit unifies the two versions and removes the unnecessary dependency. Change-Id: Ib4d0702a138f8daf7d0bd97c141ee7088f293765	2015-07-20 11:17:49 -07:00
Yunqing Wang	38f1fbbb75	Migrate quantization functions from vp9/ to vpx_dsp/ The following quantization functions were moved: vp9_quantize_b vp9_quantize_b_32x32 vp9_highbd_quantize_b vp9_highbd_quantize_b_32x32 vp9_quantize_dc vp9_quantize_dc_32x32 vp9_highbd_quantize_dc vp9_highbd_quantize_dc_32x32 The purpose of doing that was to allow these functions to be shared by multiple codecs. Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f	2015-07-17 16:38:14 -07:00
Jingning Han	845aad42b8	Merge "Migrate loop filter functions from vp9/ to vpx_dsp/"	2015-07-17 16:12:01 +00:00
Jingning Han	50adfdf5ba	Migrate loop filter functions from vp9/ to vpx_dsp/ The various tap loop filter operations are common functions across codec. This commit moves them along with SIMD optimizations to vpx_dsp folder. Change-Id: Ia5fa0b2e5289cdb98467502a549c380b9c60e92c	2015-07-16 16:40:47 -07:00
Frank Galligan	1c39998e39	Add vp9_int_pro_col_neon. BUG=https://code.google.com/p/webm/issues/detail?id=1023 Change-Id: I212a1d67b23ce3b5ce08800de369b25b9e375e7d	2015-07-15 09:04:28 -07:00
Alex Converse	d8426d6f12	Add an SSE2 version of vp9_iwht4x4_16_add Roughly half as many cycles as plain C. Change-Id: I8c16c29940b76d54ee7e4fb874c328ce90bff5d4	2015-07-14 14:23:32 -07:00
Yaowu Xu	ae5394b9e2	Revert "Add an SSE2 version of vp9_iwht4x4_16_add." This reverts commit `f8d3501640`. Change-Id: If8c7af403c091b7fb447a6f0c73fecdbccbc51b3	2015-07-13 16:26:27 +00:00
Alex Converse	f8d3501640	Add an SSE2 version of vp9_iwht4x4_16_add. 80% fewer cycles than C Change-Id: I841bde1e268ddd33ae2ee75eee94737a400e2cde	2015-07-08 15:00:51 -07:00
Frank Galligan	5327fcf857	Merge "Add vp9_int_pro_row_neon."	2015-07-08 00:16:03 +00:00
Johann	6a82f0d7fb	Move sub pixel variance to vpx_dsp Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1	2015-07-07 15:51:04 -07:00
James Zern	1696114587	Merge "mips msa vp9 subpel variance optimization"	2015-07-06 22:43:01 +00:00
Parag Salasakar	fbe67d307a	mips msa vp9 subpel variance optimization Change-Id: If88401bf8c5d8ee58200278734d7a5058d1585d0	2015-07-06 14:59:01 -07:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
James Zern	97946622c0	Revert "mips msa vp9 subpel variance optimization" This reverts commit `a42df86c03`. this change causes MSA/VP9SubpelVarianceTest.Ref and MSA/VP9SubpelVarianceTest.ExtremeRef failures under mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu Change-Id: I40b71a0b774eaeb31f66f795733f95cf360909f7	2015-07-02 12:06:51 -07:00
James Zern	ced982640b	Revert "mips msa vp9 avg subpel variance optimization" This reverts commit `61774ad1c4`. this change causes MSA/VP9SubpelAvgVarianceTest.Ref failures under mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu Change-Id: I7fb520c12b2a3b212d5e84b7619a380a48e49bb0	2015-07-02 12:06:29 -07:00
Johann	79fcc56781	Merge "Fix --disable-use-x86inc when used with --enable-vp9-highbitdepth"	2015-07-01 21:14:41 +00:00
Johann	8d5389171f	Merge "Fix --disable-use-x86inc"	2015-07-01 21:14:17 +00:00
Johann	1c967f17bd	Fix --disable-use-x86inc when used with --enable-vp9-highbitdepth Change-Id: I0ed6de72dc0bb99fc9c5b1f6500399b16754ffb3	2015-07-01 13:17:01 -07:00
Johann	ff8505a54d	Fix --disable-use-x86inc Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06	2015-07-01 13:15:51 -07:00
Parag Salasakar	61774ad1c4	mips msa vp9 avg subpel variance optimization average improvement ~3x-5x Change-Id: Iefbcafc05daab77b38a4e63b551e427867a501a4	2015-07-01 13:46:41 +05:30
Parag Salasakar	a42df86c03	mips msa vp9 subpel variance optimization average improvement ~3x-5x Change-Id: I4cbba2711467b0e205904769ebbb4a1fcbb1a311	2015-07-01 07:51:34 +05:30
Parag Salasakar	b92cc27b76	mips msa vp9 temporal filter optimization average improvement ~4x-5x Change-Id: Iad9c0a296dbc2ea96d000bd009077999ed58a3c5	2015-06-26 12:00:24 +05:30
Parag Salasakar	c040f96e4b	mips msa vp9 subtract block optimization average improvement ~3x-4x Change-Id: Idbe4d13a00d05ff8be6559b116f416e42c3b4097	2015-06-26 09:23:56 +05:30
Parag Salasakar	d017f5ba38	Merge "mips msa vp9 block error optimization"	2015-06-26 03:42:31 +00:00
Parag Salasakar	1543f2b60e	mips msa vp9 block error optimization average improvement ~3x-4x Change-Id: If0fdcc34b17437a7e3e7fb4caaf1067bc175f291	2015-06-26 09:04:00 +05:30
James Zern	d219f2b9d2	Merge "vp9_reconintra_neon: add d45 16x16"	2015-06-24 21:23:15 +00:00
Frank Galligan	944ad6cac9	Add vp9_int_pro_row_neon. BUG=https://code.google.com/p/webm/issues/detail?id=1022 Change-Id: I510c3b0a70158fa2e4da554f7c5d7558021a6ddf	2015-06-23 11:53:49 -07:00
James Zern	9db1f24c47	vp9_reconintra_neon: add d45 16x16 ~90% faster over 20M pixels Change-Id: I92d80f66e91e0a870a672cfb5dd29bf1a17cb11a	2015-06-22 21:00:07 -07:00
Parag Salasakar	7555e2b822	mips msa vp9 avg optimization average improvement ~2x-3x Change-Id: I76f7fc00c0ffdf2b4ba41bf3819f3b6044bcdeff	2015-06-23 07:32:25 +05:30
Parag Salasakar	7b71cdb0b4	Merge "mips msa vp9 fdct 4x4 optimization"	2015-06-23 01:46:54 +00:00
James Zern	c8b9658ecc	Merge "vp9_reconintra_neon: add d45 8x8"	2015-06-22 22:27:57 +00:00
Parag Salasakar	bc94999148	mips msa vp9 fdct 4x4 optimization average improvement ~2x-3x Change-Id: Idf8be780b8b4228fc91f110a94e4ee1fd9af0163	2015-06-22 14:30:24 +05:30
Parag Salasakar	b6131a733d	Merge "mips msa vp9 fdct 8x8 optimization"	2015-06-20 02:58:10 +00:00
James Zern	12c6688e31	vp9_reconintra_neon: add d45 8x8 based on ssse3 implementation ~91% faster over 20M pixels Change-Id: I6d743a53352c2d6de0efe7899d7996e8b0f7fa29	2015-06-19 19:19:22 -07:00
Parag Salasakar	7ca84888c2	mips msa vp9 fdct 8x8 optimization average improvement ~4x-5x Change-Id: I37582efc2622bc20b2bf99617a76110ab24e9f6a	2015-06-20 07:48:35 +05:30
James Zern	a2c69af50e	Merge "vp9_reconintra_neon: add d45 4x4"	2015-06-19 03:27:23 +00:00
James Zern	5d1d72df16	Merge changes from topic 'vp9-intra-pred' * changes: vp9_reconintra_neon: add d135 4x4 vp9_reconintra: correct d135 4x4 signature	2015-06-19 03:24:58 +00:00
James Zern	ce88d74d34	vp9_reconintra_neon: add d45 4x4 based on webp's LD4() ~59% faster over 20M pixels Change-Id: I371eaed9ce8f470451046997e130b0ba1a2f7a9c	2015-06-18 15:25:07 -07:00
James Zern	337b221e00	vp9_reconintra_neon: add d135 4x4 based on webp's RD4() ~50% faster over 20M pixels Change-Id: Ifcb7bf7f7fc8eabf79d9e3b219ce1be67abc524a	2015-06-18 15:25:06 -07:00
James Zern	41d8545ab6	Merge "vp9_reconintra_neon: add DC 4x4 predictors"	2015-06-18 22:24:55 +00:00
James Zern	6e44bf20f7	vp9_reconintra_neon: add DC 4x4 predictors ~85-89% faster over 20M pixels Change-Id: I3812e8adfffe5255034da88dfe6546e12f4d10ee	2015-06-18 15:22:43 -07:00
James Zern	e77f859d72	Merge "vp9_reconintra_neon: add DC 32x32 predictors"	2015-06-18 22:17:51 +00:00
Parag Salasakar	d9fedf7832	mips msa vp9 fdct 32x32 optimization average improvement ~4x-6x Change-Id: Ibcac3ef8ed5e207cf8c121e696570e6b63d3c0f4	2015-06-17 07:58:34 +05:30
Parag Salasakar	fa53008fb7	Merge "mips msa vp9 fdct 16x16 optimization"	2015-06-17 01:21:59 +00:00
Parag Salasakar	89b4b315aa	mips msa vp9 fdct 16x16 optimization average improvement ~4x-6x Change-Id: Id3b2243e5b3c7844c90c4231a5e75fa69911362c	2015-06-16 12:49:34 +05:30
James Zern	79fb3a013e	vp9_reconintra_neon: add DC 32x32 predictors ~84-85% faster over 20M pixels Change-Id: Ia67a7f4a342bf7b0a9280e05c25d81a774d90469	2015-06-15 20:57:28 -07:00
James Zern	98f0178611	enable vp9_d153_predictor_32x32_ssse3 unused since its initial commit ~91% faster over 20M pixels Change-Id: Ic8b5b3246bc97c8406be8bc4496601370403b70a	2015-06-12 19:48:22 -07:00
Parag Salasakar	fbac961b47	mips msa vp9 filter by weight optimization filter by weight - average improvement ~2x-3x Change-Id: I4832033335d339cdafdce697f07ce3e643920057	2015-06-12 12:06:42 +05:30
Parag Salasakar	a2288d274c	mips msa vp9 intra-pred optimization intra pred - average improvement ~2x-3x Change-Id: Ie3f7d6eded5ecb7ed7ee506ba8e4d98f93803b09	2015-06-06 22:29:32 +05:30
Parag Salasakar	d43fd99822	mips msa vp9 loopfilter 4, 8 optimization average improvement ~3x-4x Change-Id: I59279293ce4b2a1e99bd10579ac97740e943643f	2015-06-05 09:56:08 +05:30
Parag Salasakar	914f8f9ee0	mips msa vp9 loopfilter 16 optimization average improvement ~3x-4x Change-Id: I8ef263da6ebcf8f20aabaefeccf25a84640ba048	2015-06-04 11:50:41 +05:30
Parag Salasakar	bdfbc3e876	mips msa vp9 convolve8 avg hv optimization average improvement ~4x-6x Change-Id: I7c8b4f2334491be8a859592606e568bc95d019aa	2015-06-04 08:11:01 +05:30
Parag Salasakar	b8c1cdcd12	mips msa vp9 convolve8 avg horiz optimization average improvement ~5x-8x Change-Id: I179a69ec620fbd69979bd128f05d18113618aab4	2015-06-03 11:33:42 +05:30
Parag Salasakar	c543d38ac7	mips msa vp9 convolve8 avg vert optimization average improvement ~4x-6x Change-Id: Ia2e6f770da46416ebec31fdcea5cc7878879a9d9	2015-06-03 09:55:25 +05:30
Parag Salasakar	54a6f73958	mips msa vp9 idct4x4 and iwht4x4 optimization average improvement ~3x-4x moved assert to respective files Change-Id: I6c915059d456a00bdd76fab0dd2eede8b6c6ea58	2015-06-02 12:16:28 +05:30
Parag Salasakar	ebf7466cd8	mips msa vp9 updated convolve horiz, vert, hv, copy, avg module Updated sources according to improved version of common MSA macros. Enabled respective convolve MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: If5ad6ef8ea7ca47feed6d2fc9f34f0f0e8b6694d	2015-06-02 12:03:51 +05:30
Parag Salasakar	6af9d7f2e2	mips msa vp9 updated idct 8x8, 16x16 and 32x32 module Updated sources according to improved version of common MSA macros. Enabled idct MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: I1f488ab2c741f6c622b7a855388a202168082209	2015-06-01 09:24:23 +05:30
Parag Salasakar	71e88f903d	Merge "mips msa vp9 updated macros and disable all MSA functions"	2015-05-30 02:52:27 +00:00
James Zern	a2a13cbe5f	vp9_reconintra_neon: add DC 16x16 predictors 85-89% faster over 20M pixels Change-Id: I9b320ed6b9e67f27df738b84c8b43b65a93c50c2	2015-05-29 15:41:44 -07:00
James Zern	e97b849219	vp9_reconintra_neon: add DC 8x8 predictors ~90% faster over 20M pixels Change-Id: Iab791510cc57c8332c2f9a5da0ed50702e5f5763	2015-05-29 15:39:08 -07:00
Parag Salasakar	f9f078ebb6	mips msa vp9 updated macros and disable all MSA functions Made minor restructuring/styling changes to the sources, such as generic macro definitions, using them to reduce code lines, and better code alignment. Disabled all MSA hooks and tests. Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190	2015-05-29 13:34:33 +05:30
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
James Zern	330fba41e2	vp9 intrinsics: add vp9_rtcd include silences a missing declaration warning Change-Id: I59a34e1a1377cf3529b678d7ec0122bd43ab1bf1	2015-05-15 10:43:47 -07:00
Parag Salasakar	686616a989	Merge "mips msa vp9 idct 8x8 optimization"	2015-05-13 04:36:34 +00:00
hkuang	f5574fb44c	Merge "Add more sse2 code for intra prediction."	2015-05-08 17:26:30 +00:00
Parag Salasakar	7c5f00f868	mips msa vp9 idct 8x8 optimization average improvement ~4x-6x Change-Id: I5edf713721b9e24c7e0ce2e69d8fc3ecab625d91	2015-05-08 12:23:27 +05:30
Parag Salasakar	a8a9c2bb45	Merge "mips msa vp9 idct 32x32 optimization"	2015-05-08 04:27:44 +00:00
Johann	76a08210b6	Merge "Move shared SAD code to vpx_dsp"	2015-05-07 18:33:06 +00:00
Parag Salasakar	1601c1385a	mips msa vp9 idct 32x32 optimization average improvement ~4x-6x Change-Id: Idaba7e49fbd7f388caee0d73773ccf6e4807ef17	2015-05-07 12:42:23 +05:30
hkuang	7153b822ed	Add more sse2 code for intra prediction. vp9_dc_left_predictor_16x16 vp9_dc_top_predictor_32x32 vp9_dc_left_predictor_32x32 vp9_dc_128_predictor_32x32 Change-Id: Ib9861deefd01c3527235b92ff6b3d571ef6b4bc6	2015-05-06 17:17:00 -07:00
Johann	d5d9289800	Move shared SAD code to vpx_dsp Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a	2015-05-06 16:58:20 -07:00
Parag Salasakar	d1cdda88bd	Merge "mips msa vp9 idct 16x16 optimization"	2015-05-06 06:40:56 +00:00
James Zern	ccae5d99d2	fix and enable vp9_dc_128_predictor_16x16 widen the loads and stores to 128-bit. this was added, but not enabled in: `493a857` Add some sse2 code for intra prediction. Change-Id: I277d7db608a7db7d75cc0bde86f48fa66ad487e4	2015-05-05 11:40:13 -07:00
hkuang	e47811ef8f	Merge "Add some sse2 code for intra prediction."	2015-05-05 17:11:07 +00:00
Parag Salasakar	60052b618f	mips msa vp9 idct 16x16 optimization average improvement ~4x-6x Change-Id: I55e95b7f2ba403dff11813958dc7c73a900dd022	2015-05-05 12:37:06 +05:30
Yaowu Xu	2061359fcf	Merge "Remove vp9_idct16x16_10_add_ssse3()"	2015-04-30 23:13:33 +00:00
hkuang	493a8579f1	Add some sse2 code for intra prediction. Change-Id: I16c0a62e52dab62837c547345df31e7518620ed4	2015-04-30 15:42:57 -07:00
Yaowu Xu	47767609fe	Remove vp9_idct16x16_10_add_ssse3() The rotation computation using 2X of cos(pi/16) can overflow 32 bits; this commit disables the function to allow further investigation and optimization. Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf	2015-04-30 09:07:30 -07:00
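
Note on commit 0149fb3d6b ("Changes to exhaustive motion search"): the multi-stage mesh search described in that message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. The names MeshMv, MeshStage, MeshCostFn, mesh_search and squared_distance_cost are hypothetical, the cost function is a toy stand-in for a real distortion metric, and the caller is assumed to have already run the normal step search and to pass in its result and distortion. Only the three range/interval pairs come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; these are not libvpx's structures. */
typedef struct { int row, col; } MeshMv;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*MeshCostFn)(MeshMv mv, const void *ctx);

/* Stage parameters quoted in the commit message's example. */
static const MeshStage kExampleStages[3] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };

/* Toy distortion metric (an assumption): squared distance to a target
 * position passed through ctx. A real encoder would compute SAD or similar. */
static unsigned int squared_distance_cost(MeshMv mv, const void *ctx) {
  const MeshMv *target = (const MeshMv *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* start/start_cost: result of the preceding step search (done by the caller).
 * The mesh stages run only if start_cost exceeds threshold. Each stage is
 * centered on the best position found so far and scans a square grid of
 * +/- range with the given interval. */
static MeshMv mesh_search(MeshMv start, unsigned int start_cost,
                          unsigned int threshold, const MeshStage *stages,
                          int num_stages, MeshCostFn cost, const void *ctx,
                          unsigned int *best_cost_out) {
  MeshMv best = start;
  unsigned int best_cost = start_cost;
  assert(stages != NULL && num_stages > 0 && cost != NULL);

  if (start_cost > threshold) {
    for (int s = 0; s < num_stages; ++s) {
      const MeshMv center = best; /* fixed for the whole stage */
      const int range = stages[s].range;
      const int step = stages[s].interval;
      assert(range > 0 && step > 0);
      for (int r = -range; r <= range; r += step) {
        for (int c = -range; c <= range; c += step) {
          const MeshMv cand = { center.row + r, center.col + c };
          const unsigned int d = cost(cand, ctx);
          if (d < best_cost) { /* keep the earlier position on ties */
            best = cand;
            best_cost = d;
          }
        }
      }
    }
  }
  if (best_cost_out != NULL) *best_cost_out = best_cost;
  return best;
}
```

With the three example stages the sketch evaluates 33x33 + 33x33 + 31x31 = 3139 candidate positions, compared with 129x129 = 16641 for a single +/- 64 search at interval 1. That matches the direction of the complexity reduction the message describes, though the speed figures it reports come from the real encoder, not from this sketch.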


308 Commits