generic-library/vpx

Author	SHA1	Message	Date
Johann	87610ac45e	neon: consolidate horizontal adds Change-Id: Iaf9e88ff636ccf8f0ef310869c6827f3f205cca8	2017-07-10 15:29:13 -07:00
Johann	67ac68e399	variance neon: assert overflow conditions Change-Id: I12faca82d062eb33dc48dfeb39739b25112316cd	2017-05-22 11:25:06 -07:00
Johann	7b742da63e	neon variance: process 4x blocks Continue processing sets of 16 values. Plenty of improvement for 4x8 (doubles the speed) but only about 30% for 4x4. BUG=webm:1422 Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905	2017-05-17 17:35:01 -07:00
Johann	f7d1486f48	neon variance: process 16 values at a time Read in a Q register. Works on blocks of 16 and larger. Improvement of about 20% for 64x64. The smaller blocks are faster, but don't have quite the same level of improvement. 16x32 is only about 5%. BUG=webm:1422 Change-Id: Ie11a877c7b839e66690a48117a46657b2ac82d4b	2017-05-08 18:48:55 +00:00
Johann	d6a7489dd5	neon variance: process two rows of 8 at a time When the width is equal to 8, process two rows at a time. This doubles the speed of 8x4 and improves 8x8 by about 20%. 8x16 was using this technique already, but still improved a little bit with the rewrite. Also use this for vpx_get8x8var_neon. BUG=webm:1422 Change-Id: Id602909afcec683665536d11298b7387ac0a1207	2017-05-04 08:59:46 -07:00
Johann	cb9133c72f	neon variance: add small missing sizes Some of the mixed sizes were missing. They can be implemented trivially using the existing helper function. When comparing the previous 16x8 and 8x16 implementations, the helper function is about 10% faster than the 16x8 version. The 8x16 is very close, but the existing version appears to be faster. BUG=webm:1422 Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004	2017-05-04 08:59:42 -07:00
James Zern	e372bfd5ac	variance_neon: sync variance*() w/c,sse2 removes some unnecessary casts and adds a few explicit uint32 ones for larger sizes to quiet -Wshorten-64-to-32 warnings Change-Id: I63c5fce8e62c426d5cf5c10a66a113c119a43518	2016-09-21 18:04:45 -07:00
clang-format	099bd7f07e	vpx_dsp: apply clang-format Change-Id: I3ea3e77364879928bd916f2b0a7838073ade5975	2016-07-25 14:14:19 -07:00
James Zern	be380f2005	variance_neon: add missing include vpx_ports/mem.h is necessary for MSVC __builtin_prefetch compatibility macro Change-Id: I210fad6c6b4545df1874d028b31f42018490b029	2015-05-28 23:38:53 -07:00
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00

10 Commits