generic-library/vpx

Author	SHA1	Message	Date
Johann	d658216276	Don't return value for void functions Clears "warning: 'return' with a value, in function returning void" Change-Id: I93972610d67e243ec772a1021d2fdfcfc689c8c2	2014-06-20 11:26:44 -07:00
Johann	baef0b89da	Include type defines Clears error: unknown type name 'uint8_t' Change-Id: I9b6eff66a5c69bc24aeaeb5ade29255a164ef0e2	2014-06-20 11:26:13 -07:00
Jingning Han	41a350a83d	Change eob threshold for partial inverse 8x8 2D-DCT to 12 The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT allows to handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2	2014-05-08 09:48:58 -07:00
hkuang	edcbbf2ee3	Merge "Fix a bug in neon that has not save and restore q4-q7 registers."	2014-02-28 09:48:26 -08:00
hkuang	f3d8e315ac	Fix a bug in neon that has not save and restore q4-q7 registers. Change-Id: Ie21b5ae89100389b80f919710839084f935a8545	2014-02-27 14:06:52 -08:00
James Yu	e486488ce8	Replace vqshrun by vqmovun if shift #0 bit Change-Id: Ifabb8c7ec0c327fea9d6739cab10addb060ff435 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-14 21:03:40 -08:00
Johann	4378503665	Merge "Remove redundant arm neon instructions."	2014-02-14 20:02:51 -08:00
Yaowu Xu	ecf392a155	Merge "minor spelling cleanup in comments"	2014-02-14 14:29:35 -08:00
Frank Galligan	b41acbf9bb	Fix neon wide loopfilter for filter8 only branch The current code removed the check to only perform the filter8. Change-Id: Ie54e19a77745042a5660eab986d9ef1c42e82410	2014-02-12 18:36:17 -08:00
Andrew Russell	549c31f8ae	minor spelling cleanup in comments Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06	2014-02-12 16:32:51 -08:00
James Yu	619f29cdb0	Remove redundant arm neon instructions. Change-Id: I1fabad59747eb5f68c64275a36c3a1d94daf32a3 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-11 21:19:12 -08:00
Martin Storsjo	03bc491721	arm: Consistently use braces around doubleword arguments to vld This isn't strictly necessary, but makes the file more consistent with the other arm assembly source files. Change-Id: I245c9677d89e0ab3f31991e473764858af35b180	2014-02-05 13:24:25 +02:00
Martin Storsjo	c2bb1aa544	arm: Use {} around quadword arguments to vld This fixes building for iOS. Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb	2014-02-05 13:24:17 +02:00
Dmitry Kovalev	c49b08c9a1	Removing "_short" suffix from arm transform file names. Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507	2014-01-31 17:19:02 -08:00
hkuang	770454f3a8	Add vp9_tm_predictor_32x32 neon implementation which is 7.8 times faster than C. Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321	2014-01-27 16:01:07 -08:00
hkuang	05d2081d38	Fix the vp9_tm_predictor_8x8_neon. Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9	2014-01-27 10:17:20 -08:00
Frank Galligan	183361dadb	Merge "Optimize vp9_tm_predictor_8x8_neon function"	2014-01-24 16:21:56 -08:00
Frank Galligan	56a8a0b54b	Optimize vp9_tm_predictor_8x8_neon function Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5	2014-01-24 11:07:14 -08:00
hkuang	3633ffcbf7	Add vp9_tm_predictor_16x16 neon implementation which is 3.5 times faster than C. Change-Id: I24439ba7a2971829c11620f34848facf2c916678	2014-01-24 10:22:58 -08:00
hkuang	97826df96b	Add tm_predictor_8x8 neon implementation. Change-Id: I76c2720546b737cb63018a8ab6a3ff62a291786d	2014-01-22 13:43:20 -08:00
hkuang	2a2d8c140f	Merge "Add vp9_tm_predictor_4x4 neon implementation"	2014-01-16 10:18:12 -08:00
hkuang	f2ef389256	Add vp9_tm_predictor_4x4 neon implementation Change-Id: I10c423bde7ea5a3bac9f14f35c73b6bc31c8f3e3	2014-01-15 11:51:36 -08:00
hkuang	5be0ed30dc	Merge "Add initial intra frame neon optimization. 1~2% gain."	2014-01-08 14:41:43 -08:00
hkuang	691111aacf	Add initial intra frame neon optimization. 1~2% gain. More intra optimizations will be added. Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8	2014-01-08 11:58:42 -08:00
Jim Bankoski	b720ba165f	rename loop filter functions This renames all the loop filter functions so that they no longer refer to mb Change-Id: I8a58a8c7fd253d835cb619bde13913e896ece90b	2013-12-17 17:34:34 -08:00
Frank Galligan	b4874e2c82	Fix 16 wide neon horz loopfilter. Multiply by 3 was on 8bit vectors when it should have been on 16bit vectors. Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc	2013-11-26 10:02:40 -08:00
Yunqing Wang	ed36720b66	Do vertical loopfiltering in parallel This patch followed "Add filter_selectively_vert_row2 to enable parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. For other optimizations (neon and dspr2), current 16-pixel functions were done by calling 8-pixel functions twice, and real 16-pixel functions could be added later. Decoder speedup: tulip clip: 2% speed gain; old_town_cross: 1.2% speed gain; bus: 2% speed gain. Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7	2013-11-22 10:04:51 -08:00
Frank Galligan	97d1258375	Revert "Add 16 wide neon horz loopfilter." The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0	2013-11-21 14:01:33 -08:00
Frank Galligan	98de15137e	Add 16 wide neon horz loopfilter. Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d	2013-11-21 09:39:36 -08:00
Yunqing Wang	64f728caef	Do horizontal loopfiltering in parallel This patch followed "Rewrite filter_selectively_horiz for parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Using tulip clip, tests showed it gave a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35	2013-11-15 16:18:43 -08:00
Johann	e72d49a97a	Use lowercase 'b' to branch iOS doesn't recognize B: bad instruction `B idct32_pass_loop' Change-Id: I3cf6aede4639f1d9efa97f7962fa287ba6feaaef	2013-11-12 10:41:06 -08:00
hkuang	c689a126ed	Fix a bug in the assembly code. Change-Id: Ic416e3f8a11e82ee298e6f709b2119a9ddf1e2f8	2013-11-11 12:49:12 -08:00
hkuang	6b16f63332	Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3. Change-Id: I034848cf05031618818f7df2e7f9c35102686948	2013-11-05 14:57:32 -08:00
Dmitry Kovalev	65f118d72f	Making input pointer of any inverse transform constant. Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940	2013-10-11 18:27:12 -07:00
Dmitry Kovalev	7ef573914d	Consistent names for inverse hybrid transforms (1 of 2). Renames: vp9_short_iht4x4_add -> vp9_iht4x4_16_add vp9_short_iht8x8_add -> vp9_iht8x8_64_add vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0	2013-10-11 13:31:32 -07:00
Dmitry Kovalev	1e766b50e2	Giving consistent names to IDCT 32x32 functions. Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add vp9_idct_add_32x32 -> vp9_idct32x32_add Change-Id: Id85306f5814bac6c47463a6b5901a93082510666	2013-10-10 11:27:39 -07:00
Dmitry Kovalev	b096c5a336	Giving consistent names to IDCT 16x16 functions. Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add vp9_idct_add_16x16 -> vp9_idct16x16_add Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3	2013-10-07 14:31:10 -07:00
Dmitry Kovalev	c6ad70d5f1	Giving consistent names to IDCT 8x8 functions. Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add vp9_idct_add_8x8 -> vp9_idct8x8_add Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1	2013-10-06 00:24:09 -07:00
Dmitry Kovalev	3a0602578e	Giving consistent names to IDCT/IWHT functions. The idea is to have the following names for each transform size: vp9_idct4x4_add vp9_idct4x4_1_add vp9_idct4x4_10_add vp9_idct4x4_16_add vp9_idct8x8_add vp9_idct8x8_1_add vp9_idct8x8_10_add vp9_idct8x8_64_add etc for 16x16, 32x32 The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add vp9_idct_add -> vp9_idct4x4_add vp9_short_idct4x4_add -> vp9_idct4x4_16_add vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1	2013-10-04 14:17:06 -07:00
Dmitry Kovalev	3fab2125ff	Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add. Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1. Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5	2013-09-27 15:26:27 -07:00
Christian Duvivier	b1b4ba1bdd	Properly save neon registers. Replace current code which corrupts the stack by duplicate of vp8 code to save and restore neon registers. Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a	2013-09-27 14:25:33 -07:00
Dmitry Kovalev	db60c02c9e	Merge "Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10."	2013-09-27 13:08:52 -07:00
Dmitry Kovalev	15a36a0a0d	Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10. Making function name consistent with vp9_short_idct16x16 and vp9_short_idct16x16_1. Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517	2013-09-26 14:01:25 -07:00
Christian Duvivier	5b1dc1515f	Fix a bunch of TODO from vp9_short_idct32x32_add_neon. - full ASM version, no more C gateway file. - integrate combine-add with last step of 2nd pass. - remove a few push/pop pairs. - some instruction reordering to hide latency. Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6	2013-09-25 21:15:19 -07:00
Johann	a6a00fc6a3	Use lowercase instruction in assembly The iOS compiler does not recognize BLE: bad instruction `BLE idct32_transpose_pair_loop' Change-Id: I7426694c66bc31caf939a2d5000968da1222c15b	2013-09-20 16:11:05 -07:00
hkuang	23e1a29fc7	Speed up iht8x8 by rearranging instructions. Speed improves from 282% to 302% faster based on assembly-perf. Change-Id: I08c5c1a542d43361611198f750b725e4303d19e2	2013-09-16 14:23:26 -07:00
hkuang	86fb12b600	Merge "Add neon optimize iht8x8 which is 282% faster than C."	2013-09-12 15:42:44 -07:00
hkuang	182366c736	Add neon optimize iht8x8 which is 282% faster than C. Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530	2013-09-12 11:49:05 -07:00
Christian Duvivier	6a501462f8	First draft of vp9_short_idct32x32_add_neon. Lots of TODO which will be taken care in upcoming changes. As is, about 6x faster than C version. Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0	2013-09-11 15:19:38 -07:00
hkuang	fc5ec206a7	Speed up idct16x16 by rearrange instructions. Speed improve from 376% to 400% faster base on assembly-perf. Change-Id: If0b2eccc39d5793dc101ce9feb7fcadf88396ea2	2013-09-09 18:00:13 -07:00
hkuang	01c4e04424	Speed up idct8x8 by rearrange instructions. Speed improve from 264% ~ 270% to 280% ~ 300% base on assembly-perf. Change-Id: I3e2cc818ec14b432204ff43732f39b6438db685d	2013-09-04 15:57:22 -07:00
hkuang	3b8614a8f6	Add neon optimize vp9_short_iht4x4_add. Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e	2013-09-04 12:37:58 -07:00
hkuang	3a679e56b2	Add neon optimize vp9_short_idct16x16_1_add. Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5	2013-08-27 14:00:27 -07:00
hkuang	36e9b82080	Add neon optimize vp9_short_idct8x8_1_add. Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4	2013-08-26 16:28:57 -07:00
hkuang	69384f4fad	Add neon optimize vp9_short_idct4x4_1_add. Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5	2013-08-26 15:55:16 -07:00
hkuang	b85367a608	Merge "Optimise idct4x4: rearrange the instructions a bit to improve instruction scheduling."	2013-08-23 10:08:43 -07:00
hkuang	4082bf9d7c	Add neon optimize vp9_short_idct10_16x16_add. vp9_short_idct10_16x16_add is used to handle the block that only have valid data at top left 4x4 block. All the other datas are 0. So we could cut many unnecessary calculations in order to save instructions. Change-Id: I6e30a3fee1ece5af7f258532416d0bfddd1143f0	2013-08-22 15:53:22 -07:00
hkuang	610642c130	Optimise idct4x4: rearrange the instructions a bit to improve instruction scheduling. Change-Id: I5ea881a6e419f9e8ed4b3b619406403b4de24134	2013-08-22 11:02:22 -07:00
hkuang	37cda6dc4c	Add neon optimize vp9_short_idct10_8x8_add. vp9_short_idct10_8x8_add is used to handle the block that only have valid data at top left 4x4 block. All the other datas are 0. So we could cut several unnecessary calculations in order to save instructions. Change-Id: I34fda95e29082b789aded97c2df193991c2d9195	2013-08-20 11:51:07 -07:00
Johann	d514b778c4	Merge "Reduce the instructions of idct8x8. Also add the saving and restoring of D registers."	2013-08-16 11:30:21 -07:00
Johann	65aa89af1a	Merge "Reduce instructions of idct4x4."	2013-08-16 11:28:35 -07:00
Frank Galligan	bdc785e976	Merge "vp9: neon: optimise vp9_wide_mbfilter_neon"	2013-08-16 11:16:48 -07:00
hkuang	df0715204c	Reduce instructions of idct4x4. Change-Id: Ia26a2526804e7e2f656b0051618a615fca8fc79d	2013-08-16 10:54:56 -07:00
hkuang	60ecd60c9a	Reduce the instructions of idct8x8. Also add the saving and restoring of D registers. Change-Id: Id3630c90fcb160ef939fef55411342608af5f990	2013-08-16 10:32:12 -07:00
Mans Rullgard	4fa93bcef4	vp9: neon: use aligned stores in convolve functions The destination is block-aligned so it is safe to use aligned stores. Change-Id: I38261e4fa40bc60e6472edffece59e372908da7e	2013-08-16 14:25:08 +01:00
Johann	a9aa7d07d0	Merge "vp9: neon: add vp9_convolve_avg_neon"	2013-08-15 14:55:15 -07:00
Johann	63e140eaa7	Merge "vp9: neon: add vp9_convolve_copy_neon"	2013-08-15 14:55:08 -07:00
Mans Rullgard	67e53716e0	vp9: neon: optimise vp9_wide_mbfilter_neon Break up long dependency chains to improve instruction scheduling. Change-Id: I0e0cb66943df24af920767bb4167b25c38af9630	2013-08-15 19:07:22 +01:00
hkuang	39f42c8713	Merge "Add neon optimize vp9_short_idct16x16_add."	2013-08-14 14:16:20 -07:00
hkuang	cf6beea661	Add neon optimize vp9_short_idct16x16_add. Change-Id: I27134b9a5cace2bdad53534562c91d829b48838d	2013-08-14 13:52:16 -07:00
Mans Rullgard	0f1deccf86	vp9: neon: add vp9_convolve_avg_neon Change-Id: I33cff9ac4f2234558f6f87729f9b2e88a33fbf58	2013-08-14 16:27:55 +01:00
Mans Rullgard	635ba269be	vp9: neon: add vp9_convolve_copy_neon Change-Id: I15adbbda15d1842e9f15f21878a5ffbb75c3c0c9	2013-08-14 16:27:55 +01:00
Johann	4417c04531	Merge "vp9: neon: optimise convolve8_vert functions"	2013-08-12 17:54:47 -07:00
Mans Rullgard	ad7021dd6c	vp9: neon: optimise convolve8_vert functions Invert loops to operate vertically in the inner loop. This allows removing redundant loads. Also add preloading of data. Change-Id: I4fa85c0ab1735bcb1dd6ea58937efac949172bdc	2013-08-12 15:37:48 +01:00
Mans Rullgard	b84dc949c8	vp9: neon: optimise convolve8_horiz functions Each iteration of the horizontal loop reuses 7 of the 11 source values. Loading only the 4 new values saves some time. Also add preload for source data. Overall 4% faster on Chromebook. Change-Id: I8f69e749f2b7f79e9734620dcee51dbfcd716b44	2013-08-11 16:21:55 +01:00
Christian Duvivier	78182538d6	Neon version of vp9_short_idct4x4_add. Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4	2013-08-06 18:41:27 -07:00
Mans Rullgard	355cb14dc7	vp9: neon: convolve: replace some insns with simpler equivalents Change-Id: I5d6906772e6e6adf68d7f0fd5b8b5207a64a3a37	2013-08-02 08:11:28 -07:00
Mans Rullgard	2003468df8	vp9: neon: convolve: simplify branching to C fallbacks Change-Id: Ic7cacd02d6dc9243ad8fc85082c5618a9d1e66dc	2013-08-02 08:11:25 -07:00
Mans Rullgard	5e2e78d024	vp9: neon: optimise loads in horiz convolve functions Loading to single lanes in multiple registers is expensive since it requires a read and write of each register which saturates the register file access. Loading to single registers followed by a separate transpose reduces this pressure. Change-Id: I4cc35887ddbca80e5e635b50d2b1d158de9668ee	2013-08-02 08:11:08 -07:00
Mans Rullgard	d85ae87183	vp9: neon: add vp9_mb_lpf_* functions Change-Id: I13e0880df234f15abc4cc7c57fe84488d5d46a75	2013-08-02 08:10:50 -07:00
hkuang	588b4daf54	Fix some format error and code error in neon code. Change-Id: I748dee8938dfb19f417f24eed005f3d216f83a82	2013-07-26 14:14:57 -07:00
Frank Galligan	e88db77892	Merge "Speedup loopfilter neon code."	2013-07-22 17:39:42 -07:00
Frank Galligan	5af6bf6c43	Speedup loopfilter neon code. Try and cut down the cycle count by rearranging the instructions so there are less stalls. Change-Id: Ic1383335ee0f05e656477d9ee9c179ec231285d5	2013-07-22 17:00:01 -07:00
hkuang	97dbee00dd	Merge "Add neon optimize vp9_short_idct8x8_add."	2013-07-19 08:28:39 -07:00
hkuang	d757de744c	Add neon optimize vp9_short_idct8x8_add. Change-Id: Ic32acf3e2939c6d12d9c2bf192a5f5da59705fda	2013-07-18 16:40:41 -07:00
Frank Galligan	7fd5d8e6a4	Fix horz loopfilter loops If count was greater than 1 the src pointer would be off on the second loop. Change-Id: I8e09037e68dc4ae92076a8067f7b6dacbbef8263	2013-07-18 09:44:15 -07:00
Johann	59dc4e9cdd	vp9_convolve8_neon placeholder Call the individually optimized horizontal and vertical functions. This implementation abuses the temp buffer. This will be replaced with a custom optimized function. Over 2x speedup. Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd	2013-07-17 08:39:27 -07:00
Johann	90ebfe621f	Merge "vp9_convolve8_[horiz|vert]_avg"	2013-07-16 09:42:52 -07:00
Frank Galligan	f4f60f6005	Neon: Update mbfilter if all vectors follow one branch. Change the mbfilter Neon code from executing both branches if all vectors follow only one branch. The code is about 5% faster when executing only one branch and about 1% slower when executing both branches. -PS5: Remove local stack space from mbfilter. Change-Id: I6a23f9b318a9f4568a2718b4c9348db988fe2182	2013-07-15 13:08:28 -07:00
Johann	a15bebfc0a	vp9_convolve8_[horiz|vert]_avg Super basic conversion from the other implementations. Any changes to one should be trivial to copy over keep in sync. Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8	2013-07-12 16:21:33 -07:00
Johann	158c80cbb0	convolve8 optimizations for neon Independent horizontal and vertical implementations. Requires that blocks be built from 4x4 and [xy]_step_q4 == 16 6-10% improvement. CIF improved the least. Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda	2013-07-11 11:08:19 -07:00
hkuang	c9b25dcae4	Add neon optimize vp9_dc_only_idct_add. Change-Id: Iae84ab945cc9662a0ddd839aa2b9ca59f2ae5423	2013-07-11 10:30:47 -07:00
Frank Galligan	198fa6d0a0	Add Neon horizontal and vertical vp9_mbloop_filter - The vp9 mbfilter C code will branch on flat and mask. This CL will perform both branches and combine the data. A later CL will perform a check to see if all patch will take one branch. - These functions are about 1.75 times faster than the C code on Nexus 7. PS #3 - Changed all functions to dub limit, blimit, and thresh from vld {dx[]}, freeing up r4-r6. - Changed code to use vbif to reduce one instruction and free up a d register. Change-Id: I028dae0e434dc9891c3677bdb182e201ffb04777	2013-07-09 12:40:05 -07:00
Frank Galligan	1d6dc1b702	Add Neon optimized loop filter functions. - Added vp9_loop_filter_horizontal_edge_neon and vp9_loop_filter_vertical_edge_neon. - The functions are based off the vp8 loopfilter functions. - Matches x86 md5 checksum. Change-Id: Id1c4dddb03584227e5ecd29f574a6ac27738fdd0	2013-06-27 16:14:45 -07:00

144 Commits