generic-library/vpx

Author	SHA1	Message	Date
James Yu	fb5d281bb6	VP8 for ARMv8 by using NEON intrinsics 05 Add dequantizeb_neon.c - vp8_dequantize_b_loop_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.23 (fps) Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 10:16:00 +08:00
James Yu	28b2f82f97	VP8 for ARMv8 by using NEON intrinsics 04 Add dequant_idct_neon.c - vp8_dequant_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.22 (fps) Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:59:23 +08:00
James Yu	d749ab6221	VP8 for ARMv8 by using NEON intrinsics 03 Add dc_only_idct_add_neon.c - vp8_dc_only_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.24 (fps) Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:28:29 +08:00
James Yu	300a3bfc73	VP8 for ARMv8 by using NEON intrinsics 02 Add copymem_neon.c - vp8_copy_mem16x16_neon - vp8_copy_mem8x8_neon - vp8_copy_mem8x4_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.25 (fps) Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-23 22:56:53 +08:00
Andrew Russell	549c31f8ae	minor spelling cleanup in comments Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06	2014-02-12 16:32:51 -08:00
James Zern	aceba82c41	vp8/common: add extern "C" to headers Change-Id: I13b434b1e6621e31962b08831c3587c039368c83	2014-01-23 16:21:24 -08:00
Johann	dadf350551	Apply neon flags to intrinsic files Filter out files ending in _neon.c and append .neon so the Android build system knows to apply -mfpu=neon Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305	2014-01-10 12:16:59 -08:00
James Yu	79395e16cf	VP8 for ARMv8 by using NEON intrinsics 01 Add bilinearpredict_neon_intrinsics.c - vp8_bilinear_predict4x4_neon - vp8_bilinear_predict8x4_neon - vp8_bilinear_predict8x8_neon - vp8_bilinear_predict16x16_neon Change-Id: I33dfa502881219841b442dda32b73220e51b716b Signed-off-by: James Yu <james.yu@linaro.org>	2014-01-09 09:56:22 -08:00
James Zern	e903cacf5b	vp8/common: normalize include guards Change-Id: Ia8789a8f864e0edc0bf94f00f6430846f86911c3	2013-12-16 19:40:54 -08:00
Martin Storsjo	fba3110b09	arm: Move the definition of bilinear_taps_coeff to within the section Previously, the microsoft arm assembler errored out, saying that bilinear_taps_coeff was an undefined symbol. Change-Id: Ib938f0b454c41ccbd801e70a7c9acc0fa04e3c55	2013-05-22 01:50:59 +03:00
Martin Storsjo	b9ed185659	arm: Explicitly write both target registers for ldrd The microsoft assembler can't handle the second register being implicit. Change-Id: Ia831953a78a25fd6b2082474f05fdb78d96cdf78	2013-05-22 01:50:58 +03:00
John Koleszar	a9c7597adc	support building vp8 and vp9 into a single lib Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d	2012-11-15 10:46:17 -08:00
Johann	aa165c8c5d	Update armv6 vp8_intra4x4_predict Change-Id: I52a3b0a4a42e5af91b987e19523df07c8f467847	2012-08-08 10:57:33 -07:00
Johann	a82c58c40f	Change vp8_intra4x4_predict call sites Use the _d variant from the decoder. It moves the pointer calculations to the caller. Change-Id: Iae2a793433ef082980a3ffa0a1cabf0264a6a24d	2012-08-01 10:48:46 -07:00
John Koleszar	0164a1cc5b	Fix pedantic compiler warnings Allows building the library with the gcc -pedantic option, for improved portabilty. In particular, this commit removes usage of C99/C++ style single-line comments and dynamic struct initializers. This is a continuation of the work done in commit `97b766a46`, which removed most of these warnings for decode only builds. Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966	2012-06-11 15:14:58 -07:00
John Koleszar	d8216b19b6	Merge "Fix compiler warnings" into eider	2012-05-02 16:22:34 -07:00
Timothy B. Terriberry	e50c842755	Fix TEXTRELs in the ARM asm. Besides imposing a performance penalty at startup in most configurations, these relocations break the dynamic linker for native Fennec, since it does not support them at all. Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de	2012-05-02 10:36:01 -07:00
Attila Nagy	14c9fce8e4	Fix compiler warnings Fix code for following warnings: -Wimplicit-function-declaration -Wuninitialized -Wunused-but-set-variable -Wunused-variable Change-Id: I2be434f22fdecb903198e8b0711255b4c1a2947a	2012-05-02 10:57:57 +03:00
Johann	e50f96a4a3	Move SAD and variance functions to common The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a	2012-03-05 16:50:33 -08:00
John Koleszar	f103dcefaf	RTCD: add subpixel functions This commit continues the process of converting to the new RTCD system. Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe	2012-01-30 12:08:29 -08:00
John Koleszar	fdb61a4531	RTCD: add recon functions This commit continues the process of converting to the new RTCD system. Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6	2012-01-30 12:08:28 -08:00
John Koleszar	ab77b4e898	RTCD: add remaining IDCT functions This commit continues the process of converting to the new RTCD system. Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80	2012-01-30 12:08:22 -08:00
John Koleszar	55f74c59c7	RTCD: add loopfilter functions This commit continues the process of converting to the new RTCD system. Change-Id: Ic8a4047d72ff3a54ec98977dd90e70c13213db71	2012-01-30 12:06:31 -08:00
John Koleszar	a910049aea	New RTCD implementation This is a proof of concept RTCD implementation to replace the current system of nested includes, prototypes, INVOKE macros, etc. Currently only the decoder specific functions are implemented in the new system. Additional functions will be added in subsequent commits. Overview: RTCD "functions" are implemented as either a global function pointer or a macro (when only one eligible specialization available). Functions which have RTCD specializations are listed using a simple DSL identifying the function's base name, its prototype, and the architecture extensions that specializations are available for. Advantages over the old system: - No INVOKE macros. A call to an RTCD function looks like an ordinary function call. - No need to pass vtables around. - If there is only one eligible function to call, the function is called directly, rather than indirecting through a function pointer. - Supports the notion of "required" extensions, so in combination with the above, on x86_64 if the best function available is sse2 or lower it will be called directly, since all x86_64 platforms implement sse2. - Elides all references to functions which will never be called, which could reduce binary size. For example if sse2 is required and there are both mmx and sse2 implementations of a certain function, the code will have no link time references to the mmx code. - Significantly easier to add a new function, just one file to edit. Disadvantages: - Requires global writable data (though this is not a new requirement) - 1 new generated source file. Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55	2012-01-30 12:06:27 -08:00
Attila Nagy	294aa37745	Rename save_neon_reg.asm as save_reg_neon.asm Easier to filter out all NEON asm. Change-Id: I0022dae8321a9608e864b09d4181414c5fff4610	2012-01-26 09:44:00 +02:00
Fritz Koenig	892102842a	Disconnect ARM tgt_isa from dsp extensions A processor with ARMv7 instructions does not necessarily have NEON dsp extensions. This CL has the added side effect of allowing the ability to enable/disable the dsp extensions cleanly. Change-Id: Ie1e879b8fe131885bc3d4138a0acc9ffe73a36df	2012-01-20 10:38:15 -08:00
Scott LaVarnway	5f25d4c175	Reduced the size of Y1Dequant and friends to [128][2] This patch removes the local copies of the dequantize constants and implements John's idea as described in "Make a local copy of the dequantized data" commit. Change-Id: Ic6b7d681f00bf63263f71ff1e39ab2f80729e8b2	2012-01-06 11:12:00 -08:00
John Koleszar	0c2f8e77cc	Remove useless g_common.h This file declared a bunch of nonexistent, unreferenced global function pointers. Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728	2011-12-21 15:02:23 -08:00
Scott LaVarnway	a53d5a4c44	Moved dequant idct into common These functions are now used by the encoder. This is WIP with the goal of creating a common idct/add for the encoder and decoder. A boost of 1.8% was seen for the HD rt test clip used. [Tero] Added needed changes to ARM side. Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf	2011-12-15 14:23:41 -05:00
Scott LaVarnway	4a91541c94	Modified the inverse walsh to output directly to the dqcoeff or qcoeff buffer. The encoder would populate the dc coeffs of the y blocks as a separate stage (recon_dcblock) and the decoder would use a special version of the idct. This change eliminates the extra copy and reduces the code footprint. [Tero] Added needed changes to armv6 and NEON assembly. Change-Id: I83202ffdbaf83f6e5dd69f4ba2519fcf0b13b3ba	2011-11-25 09:24:04 +02:00
Tero Rintaluoma	5a2fd63a2a	ARMv6 optimized Intra4x4 prediction Added ARM optimized intra 4x4 prediction - 2x faster on Profiler compared to C-code compiled with -O3 - Function interface changed a little to improve BLOCKD structure access Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54	2011-11-09 09:13:51 +02:00
Scott LaVarnway	ed9c66f584	Remove usage of predict buffer for decode Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6	2011-10-18 12:06:50 -04:00
Attila Nagy	1a7d25a484	Replace vpx_ports/config.h with vpx_config.h Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a	2011-09-22 13:33:54 +03:00
Johann	8f910594bd	Merge "Update armv6 loopfilter to new interface"	2011-07-13 04:09:55 -07:00
Johann	1a219c22b1	Merge "Update armv7 loopfilter to new interface"	2011-07-13 04:09:42 -07:00
Attila Nagy	c231b0175d	Update armv6 loopfilter to new interface Change-Id: I5fe581d797571a7a9432fbd17fc557591d0c1afa	2011-07-12 12:14:51 +03:00
Attila Nagy	283b0e25ac	Update armv7 loopfilter to new interface Change-Id: I65105a9c63832669237e6a6a7fcb4ea3ea683346	2011-07-12 12:12:25 +03:00
Johann	6611f66978	clean up warnings when building arm with rtcd Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df	2011-06-29 10:51:41 -04:00
Johann	dc004e8c17	Merge "Avoid text relocations in ARM vp8 decoder"	2011-06-28 16:34:10 -07:00
Mike Hommey	e3f850ee05	Avoid text relocations in ARM vp8 decoder The current code stores pointers to coefficient tables and loads them to access the tables contents. As these pointers are stored in the code sections, it means we end up with text relocations. eu-findtextrel will thus complain about code not compiled with -fpic/-fPIC. Since the pointers are stored in the code sections, we can actually cheat and let the assembler generate relative addressing when accessing the coefficient tables, and just load their location with adr. Change-Id: Ib74ae2d3f2bab80b29991355f2dbe6955f38f6ae	2011-06-28 09:11:40 +02:00
Taekhyun Kim	458fb8f491	utilize preload in ARMv6 MC/LPF/Copy routines About 9~10% decoding perf improvement on non-Neon ARM cpus Change-Id: I7dc2a026764e84e9c2faf282b4ae113090326837	2011-06-17 14:04:53 -07:00
Attila Nagy	f96d56c4aa	Fixed iwalsh_neon build problems with RVDS4.1 rvct 4.1 was complaining about vstmia.16, store multiple expects 64 data type. optimized the implementation. Change-Id: I0701052cabd685c375637bbc3796ff6d88f5972c	2011-05-19 10:27:26 +03:00
Attila Nagy	a6aa389d2f	Loopfilter NEON: Use VMOV for constant vectors instead of VLD. Change-Id: I562b6e01c32bb51d00f3b95faf757fc7dc29a3a3	2011-05-04 11:29:23 +03:00
Johann	01527e743f	remove simpler_lpf the decision to run the regular or simple loopfilter is made outside the function and managed with pointers stop tracking the option in two places. use filter_type exclusively Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15	2011-04-25 17:37:41 -04:00
John Koleszar	27972d2c1d	Move build_intra_predictors_mby to RTCD framework The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534	2011-03-11 13:04:50 -05:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
Tero Rintaluoma	cb14764fab	Adds armv6 optimized variance calculation Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6 and adds new assembly file for variance16x16 calculation. - vp8_filter_block2d_bil_first_pass_armv6 (integrated) - vp8_filter_block2d_bil_second_pass_armv6 (integrated) - vp8_variance16x16_armv6 (new) - bilinearfilter_arm.h (new) Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db	2011-02-09 10:23:43 -05:00
Johann	e5aaac24bb	clean up bilinear filter make reference version of bilinear_filters short. use reference versions of bilinear_filters and sub_pel_filters when possible. recognize that Width was being passed into filter_block2d_bil_first_pass multiple times. ARM version had already fixed this. propegate to C. change references to src_pixels_per_line to src_pitch and standardize on src/dst (instead of input/output). recognize that first_pass is only run in the verticle and second_pass only horizontal. ARM version had already fixed this. propegate to C Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2	2011-02-08 17:42:54 -05:00
Johann	3273c7b679	move one of the offset files common/arm/vpx_asm_offsets moves up a level. prepare for muxing with encoder/arm/vpx_vp8_enc_asm_offsets Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24	2011-02-07 11:35:30 -05:00
Johann	bb9c95ea53	remove unused dboolhuff code we were holding on to this "just in case." purge it instead Change-Id: I77a367b36d0821d731019f2566ecfffdae1d4b8a	2011-02-04 16:00:00 -05:00

1 2

80 Commits