generic-library/vpx

Author	SHA1	Message	Date
Johann	462e29703c	fdct 8x8 neon: minor comment cleanup Simplify HBD/non distinction in test. Document why transpose_neon.h is not used Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52	2017-05-04 15:14:23 -07:00
Yi Luo	a3452996a1	High bit depth inter prediction horizontal/vertical filters AVX2 User level speed improvement on i7-6700, cpu-used=1, x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps: - Decoder: 1080p: ~4% 4K: ~5% - Encoder: 1080p: ~1% 4K: ~3% Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640	2017-05-03 12:18:01 -07:00
Linfeng Zhang	a10a5cb356	Merge changes I8bb660de,Ica51d780,I6037525d * changes: Clean specializes of idct functions Clean add_protos of highbd idct functions Clean add_protos of idct functions	2017-05-03 19:17:55 +00:00
Luca Barbato	e2ad89092d	ppc: Add convolve8_vsx and convolve8_avg_vsx Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857	2017-05-02 20:27:47 -07:00
Luca Barbato	e6ca81ee67	ppc: Add convolve8_avg_vert_vsx Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d	2017-05-02 20:27:42 -07:00
Luca Barbato	a65f1771ad	ppc: Add convolve8_vert Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f	2017-05-02 20:27:33 -07:00
Luca Barbato	77772350f3	ppc: Add convolve8_horiz_avg The 8x8 and larger blocks cases can be sped up further. Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15	2017-05-02 20:27:28 -07:00
Luca Barbato	08edb85bd0	ppc: Add convolve8_horiz The 8x8 and larger blocks cases can be sped up further. Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046	2017-05-02 20:27:16 -07:00
Linfeng Zhang	0178d974e5	Clean specializes of idct functions Change-Id: I8bb660de47b5f97263ec381dc428db96e9c9a4b2	2017-05-02 18:01:19 -07:00
Linfeng Zhang	4412996d59	Clean add_protos of highbd idct functions Change-Id: Ica51d780b92b316ce9112740c56cdf7670816371	2017-05-02 17:59:38 -07:00
Linfeng Zhang	a7a57d9756	Clean add_protos of idct functions Change-Id: I6037525d92ec172810edab720389eb1865ed3b1a	2017-05-02 17:58:40 -07:00
Luca Barbato	d51d3934f5	ppc: Add convolve_avg Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014	2017-04-29 15:47:25 +02:00
Luca Barbato	63860ba7b8	ppc: Add convolve_copy Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405	2017-04-29 15:47:25 +02:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
Luca Barbato	914b160fb5	ppc: h predictor 8x8 Slightly faster with the current compiler. Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47	2017-04-19 19:57:51 -07:00
Luca Barbato	0b9be93205	ppc: d63 predictor 8x8 10x faster. Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8	2017-04-19 19:57:51 -07:00
Luca Barbato	ee9325b0bd	ppc: tm predictor 4x4 Slightly faster. Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191	2017-04-19 19:57:51 -07:00
Luca Barbato	2904eb5800	ppc: h predictor 4x4 2x faster. Change-Id: I0583dec353299c6797401b646099f18db4e0420d	2017-04-19 19:57:51 -07:00
Luca Barbato	58245d7050	ppc: dc predictor 8x8 Slightly faster, the other dc predictors cannot be faster since the computation speedup is overwhelmed by the time spent reading dst to write just the 8x8 part. Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66	2017-04-19 19:57:51 -07:00
Luca Barbato	6b4a65e8b1	ppc: d45 predictor 8x8 11x faster. Change-Id: I5b8f39213ee1f5260724fc254e3fb5c462435798	2017-04-19 19:57:51 -07:00
Luca Barbato	92e33c7b31	ppc: d63 predictor 32x32 About 10x faster. Change-Id: If7d0645f75c5d7deb9751edd0bf47e2f9068e9e7	2017-04-19 19:57:51 -07:00
Luca Barbato	a5469a00a8	ppc: d63 predictor 16x16 About 18x faster. Change-Id: Id043bf76c011e03e992085bb5e20f330d3e98cd4	2017-04-19 19:57:51 -07:00
Luca Barbato	cc868da526	ppc: d45 predictor 32x32 About 12x faster. Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed	2017-04-19 19:57:51 -07:00
Luca Barbato	7a7dc9e624	ppc: d45 predictor 16x16 About 16x faster. Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9	2017-04-19 19:57:51 -07:00
Luca Barbato	c08baa2900	ppc: dc predictor 32x32 10x and 5x faster. Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf	2017-04-19 19:57:47 -07:00
Luca Barbato	22ca468c7c	ppc: dc top and left predictor 32x32 6x faster. Change-Id: I717995b4056e5579c68191d11b495372971fe1ae	2017-04-19 19:49:31 -07:00
Luca Barbato	ad9dea1f6d	ppc: dc top and left predictor 16x16 13x faster. Change-Id: I1771ac39fda599153f933cb3f0506c9f97a6cbe6	2017-04-19 19:49:31 -07:00
Luca Barbato	d68d37872c	ppc: dc_128 predictor 32x32 6x faster. Change-Id: I1da8f51b4262871cb98f0aa03ccda41b0ac2b08b	2017-04-19 19:49:31 -07:00
Luca Barbato	f9d20e6df2	ppc: dc_128 predictor 16x16 20x faster. Change-Id: I05f0deb2d38ae7966eae6b71fbc0aa51880e5709	2017-04-19 19:49:31 -07:00
Luca Barbato	0d9417de4a	ppc: tm predictor 32x32 About 8x faster. Change-Id: I9bad827ccbdf47ec95406e961c74ac2ff45f80cf	2017-04-19 19:49:26 -07:00
James Zern	a81f037f15	Merge changes I1f5a3752,I95123051,I3bb724e0,Ie81077fa,Ic80f3c05, ... * changes: ppc: tm predictor 16x16 ppc: tm predictor 8x8 ppc: horizontal predictor 32x32 ppc: horizontal predictor 16x16 ppc: vertical intrapred 16x16 and 32x32 configure: Workaround clang not enabling altivec on -mvsx configure: Match power64 as ppc64	2017-04-20 02:45:45 +00:00
Linfeng Zhang	bf8a49abbd	Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Replace by CAST_TO_BYTEPTR/SHORTPTR. The rule is: if a short ptr is casted to a byte ptr, any offset operation on the byte ptr must be doubled. We do this by casting to short ptr first, adding offset, then casting back to byte ptr. BUG=webm:1388 Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248	2017-04-19 12:13:49 -07:00
Luca Barbato	479443a570	ppc: tm predictor 16x16 About 10x faster. Change-Id: I1f5a3752d346459df3b45f92963208bf3e520f06	2017-04-19 01:48:10 +02:00
Luca Barbato	c8f5a55df4	ppc: tm predictor 8x8 About 5x faster. Change-Id: I951230517f49c0dca9ac9eac2efa8916a303b85a	2017-04-19 01:48:09 +02:00
Luca Barbato	7b0e12934e	ppc: horizontal predictor 32x32 About 5x faster. Change-Id: I3bb724e07baffd901aa2d0f65060ba48882cc9b8	2017-04-19 01:48:09 +02:00
Luca Barbato	a7a2d1653b	ppc: horizontal predictor 16x16 About 10x faster. Change-Id: Ie81077fa32ad214cdb46bdcb0be4e9e2c7df47c2	2017-04-19 01:48:09 +02:00
Luca Barbato	7ad1faa6f8	ppc: vertical intrapred 16x16 and 32x32 Change-Id: Ic80f3c050cfbe7697e81a311b4edaaa597b85cab	2017-04-19 01:48:09 +02:00
Johann	9fa24f03b5	re-enable vpx_comp_avg_pred_sse2 Buffers on 32 bit x86 builds only guaranteed 8 byte alignment. Fixed with "AvgPred test: use aligned buffers" and "sad avg: align intermediate buffer" Also re-enable asserts on the C version. BUG=webm:1390 Change-Id: I93081f1b0002a352bb0a3371ac35452417fa8514	2017-04-17 08:40:43 -07:00
Johann	069b772915	sad avg: align intermediate buffer comp_avg_pred has started declaring a requirement for aligned buffers. BUG=webm:1390 Change-Id: Idaf6667498ea343e8d49b32bc9d8b9d0aa43ef5c	2017-04-17 14:26:33 +00:00
James Zern	4ba20da8b1	Merge "Add AVX2 optimization to copy/avg functions"	2017-04-15 00:26:08 +00:00
Yi Luo	aa5a941992	Add AVX2 optimization to copy/avg functions Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074	2017-04-14 16:50:10 -07:00
Johann	eaa7cdf05d	Disable vpx_comp_avg_pred_sse2 Failures on windows: unknown file: error: SEH exception with code 0xc0000005 thrown in the test body. Alignment check errors on linux: test_libvpx: ../libvpx/vpx_dsp/variance.c:230: void vpx_comp_avg_pred_c(uint8_t *, const uint8_t *, int, int, const uint8_t *, int): Assertion `((intptr_t)comp_pred & 0xf) == 0' failed. BUG=webm:1390 Change-Id: I5eed5381c0f1a8fe594a128eb415e77232f544ea	2017-04-14 08:43:06 -07:00
Johann	28a8622143	vpx_comp_avg_pred: sse2 optimization Provides over 15x speedup for width > 8. Due to smaller loads and shifting for width == 8 it gets about 8x speedup. For width == 4 it's only about 4x speedup because there is a lot of shuffling and shifting to get the data properly situated. BUG=webm:1390 Change-Id: Ice0b3dbbf007be3d9509786a61e7f35e94bdffa8	2017-04-13 08:44:52 -07:00
James Zern	04e9456567	Merge changes from topic 'Wshorten' * changes: configure: enable -Wshorten-64-to-32 for hbd vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd Resolve -Wshorten-64-to-32 in highbd variance.	2017-04-07 07:32:14 +00:00
James Zern	47b9a09120	Resolve -Wshorten-64-to-32 in highbd variance. For 8-bit the subtrahend is small enough to fit into uint32_t. This is the same that was done for: c0241664a Resolve -Wshorten-64-to-32 in variance. For 10/12-bit apply: 63a37d16f Prevent negative variance Change-Id: Iab35e3f3f269035e17c711bd6cc01272c3137e1d	2017-04-05 17:34:02 -07:00
Linfeng Zhang	6fc2e57c2c	Update 32x32 high bitdepth idct NEON optimization Preparation of CONVERT_TO_BYTEPTR/SHORTPTR clean up. BUG=webm:1388 Change-Id: I928d30a5698023bb90888d783cf81c51ec183760	2017-04-05 15:28:11 -07:00
James Zern	aefc1088a2	intrapred: sync highbd_d135_predictor w/d135_ previously: 05437805f intrapred/d135: flatten border results before storing BUG=webm:1316 Change-Id: I3b8bd89117ad7f2f4560b57f7c148da781e86f85	2017-03-24 20:45:44 -07:00
James Zern	67cde46dd7	intrapred: specialize highbd 4x4 predictors d207/d63/d45/d117/d135/d153 ~9-45% better depending on the predictor on 32-bit ARM, similar range on x86-64 this matches the non-highbitdepth implementation BUG=webm:1316 Change-Id: Iddebdf7c58c6f31c47cae04da95c6e5318200e4c	2017-03-24 20:45:36 -07:00
James Zern	e05f4cf8f4	intrapred: rename d63f to d63e this is consistent with he/ve/d45e Change-Id: I75641ae5667430b0ecd370db86fff6e666cb577d	2017-03-24 20:41:39 -07:00
James Zern	d45617c702	remove CONFIG_MISC_FIXES this belonged to vp10 with the changes now migrated to av1. Change-Id: Ie30ead3e7b71f465bc14136e1b6f156ea978c43f	2017-03-24 20:41:39 -07:00
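
The pointer rule stated in commit bf8a49abbd (a byte pointer that aliases 16-bit samples must have its offsets doubled, which is done by casting to a short pointer, adding the offset, and casting back) can be sketched as follows. This is an illustration only, not the libvpx source: the macro bodies are assumed to be plain casts, and advance_samples and third_sample are hypothetical helpers.

```c
#include <stdint.h>

/* Assumed stand-ins for the macros named in the commit message:
 * plain casts, with no address arithmetic hidden inside them. */
#define CAST_TO_SHORTPTR(x) ((uint16_t *)(x))
#define CAST_TO_BYTEPTR(x) ((uint8_t *)(x))

/* Hypothetical helper: move a byte-typed alias of a 16-bit buffer
 * forward by `offset` samples. Casting to a short pointer first makes
 * the compiler scale the offset by sizeof(uint16_t), so the byte
 * address moves by 2 * offset. */
static uint8_t *advance_samples(uint8_t *p8, int offset) {
  return CAST_TO_BYTEPTR(CAST_TO_SHORTPTR(p8) + offset);
}

/* Hypothetical usage: read the sample at index 3 through the alias. */
static int third_sample(void) {
  uint16_t samples[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  uint8_t *alias = CAST_TO_BYTEPTR(samples);
  uint8_t *moved = advance_samples(alias, 3);
  return *CAST_TO_SHORTPTR(moved);
}
```

Adding 3 directly to the byte pointer would land in the middle of samples[1]; going through the short pointer lands on samples[3].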

668 Commits