generic-library/vpx

Author	SHA1	Message	Date
Jingning Han	0ede9f52b7	Unify subtract function used in VP8/9 This commit replaces the vp8_ prefixed subtract function with the common vpx_subtract_block function. It removes redundant SIMD optimization codes and unit tests. Change-Id: I42e086c32c93c6125e452dcaa6ed04337fe028d9	2015-07-07 09:57:44 -07:00
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
Johann	08ad7e4db5	Correctly initialize "ones" value in neon quantize By using 0xff for a short it was not setting the high bits. When comparing the output with vtst to find non-zero elements it was skipping values which had no low bits set such as -512 / 0xFE00. Using -8191 as the first element of coeff will generate this condition. BUG=883 Change-Id: Ia1e10fb809d1e7866f28c56769fe703e6231a657	2014-11-20 18:34:45 -08:00
Johann	2134eb2f05	Remove pair quantization The intrinsics version of the pair quant is slower than running it individually. Change-Id: I7b4ea8599d4aab04be0a5a0c59b8b29a7fc283f4	2014-10-31 13:42:55 -07:00
Johann	7ae75c3d52	vp8 quantization -> intrinsics Use intrinsics for neon quantization. Slight loss (<5%) of performance compared to the assembly. Roughly 10x faster on arm64 because that was running C code before. Change-Id: I7cf5242d8f29b7eab5bca6a1c20c89c9fc9ca66d	2014-10-31 13:42:13 -07:00
Johann	587ff646f6	Fix build failure with Android NDK The version of gcc4.6 included with the Android NDK through r10b fails to compile this function. Replace it with C code. BUG=860 Change-Id: Ifcc0476664071aec46a171cdd5ad17305930986a	2014-09-25 12:15:21 -07:00
Scott LaVarnway	1220b49c89	arm: Fix building vp8_mse16x16_neon.c with MSVC Use the right return values - vadd_s64 returns int64x1_t, not a normal int64_t. Change-Id: Ife17213087c1dfb5faaa647f804d2fd140f3a0eb	2014-09-16 12:36:00 -07:00
Scott LaVarnway	fe2cc873dc	VP8 encoder for ARMv8 by using NEON intrinsics 1 Add vp8_mse16x16_neon.c - vp8_mse16x16_neon - vp8_get4x4sse_cs_neon Change-Id: I108952f60a9ae50613f0ce3903c2c81df19d99d0 Signed-off-by: James Yu <james.yu@linaro.org>	2014-09-15 12:04:09 -07:00
Jia Jia	395f2e874b	vp8 encoder: remove vp8_yv12_copy_partial_frame_neon Use generic C implementation instead of neon-specific code Change-Id: Ib322b4ece9cdbd4de76a9eed3d2e9fd1d8542406	2014-09-08 08:59:24 -07:00
James Yu	eed005b076	VP8 encoder for ARMv8 by using NEON intrinsics 6 Add shortfdct_neon.c - vp8_short_fdct4x4_neon - vp8_short_fdct8x4_neon Change-Id: I90152c803b484f5fab839473d632c50af0524e68 Signed-off-by: James Yu <james.yu@linaro.org>	2014-08-20 09:25:29 -07:00
James Yu	6d6fdd9c3d	VP8 encoder for ARMv8 by using NEON intrinsics 3 Add subtract_neon.c - vp8_subtract_b_neon - vp8_subtract_mby_neon - vp8_subtract_mbuv_neon Change-Id: If9a17a093478552e3e3276eeaa3f098b9021d08c Signed-off-by: James Yu <james.yu@linaro.org>	2014-08-20 09:20:55 -07:00
Scott LaVarnway	8013aaa10b	VP8 encoder for ARMv8 by using NEON intrinsics 2 Add vp8_shortwalsh4x4_neon.c - vp8_short_walsh4x4_neon Change-Id: Ica5f584be608c9e636f62db14f563757e94be09b Signed-off-by: James Yu <james.yu@linaro.org>	2014-08-20 09:19:23 -07:00
Marco Paniconi	7788c62286	Fix clang compiler warning in denoising_neon. Issue: https://code.google.com/p/webm/issues/detail?id=829 Change-Id: I580308f8aa4af194b5d8990a9692ebd18db68ee8	2014-07-23 09:59:27 -07:00
Scott LaVarnway	a4b7ae7e82	Neon version of vp8_denoiser_filter_uv() The encoder performance improved by 5% (vs "C") for the test clip used. Change-Id: I866b35eb2a06092edce7b37fc409562d0dacd7e7	2014-06-27 11:03:58 -07:00
Scott LaVarnway	4d9b9fa508	Neon match to vp8 temporal denoiser fix Now match the "C" version of "Fix to reduce block artifacts from vp8 temporal denoiser." (see change id Id9b56e59e33f3c22e79d2f89f763bdde246fdf3f) Change-Id: I99e569bb6af4ae3532621127e12bf917a48ba08e	2014-05-28 13:32:52 -07:00
Scott LaVarnway	03de5a38e2	neon matches "C" when using increase_denoising If increase_denoising is set, vp8_denoiser_filter_neon() produced incorrect results. Change-Id: I645f78e48b8f6657fa8a4b69d2c4d3488a0581dc	2014-05-26 08:06:25 -07:00
Marco Paniconi	6da66e1114	vp8: Add increase_denoising parameter to denoiser. Change-Id: I96ed73e109c4f89dd06f3583cf7ecf9277401fae	2014-05-16 15:06:59 -07:00
Marco Paniconi	96d1946e87	Revert "Revert "Remove struct params from vp8_denoiser_filter"" This reverts commit `06e6d56fa1` Change-Id: If95598385b693945d6b144d03b6da8f6a57dac98	2014-05-14 10:55:53 -07:00
Frank Galligan	06e6d56fa1	Revert "Remove struct params from vp8_denoiser_filter" This reverts commit `e516a42527` Change-Id: I7c78712acc737ad5f580181cdab3aa76b23f3ca5	2014-05-07 16:19:20 -07:00
Scott LaVarnway	e516a42527	Remove struct params from vp8_denoiser_filter This eliminates the asm_offsets dependency for future all-assembly versions of this function. Change-Id: I3227073ecfcb8ee6e593934fab941e9081abdda0	2014-05-02 10:31:52 -07:00
Scott LaVarnway	dea687f733	Merge "Improved intrinsic version of vp8_denoiser_filter_neon"	2014-05-02 09:59:59 -07:00
Scott LaVarnway	ff209de82b	Improved intrinsic version of vp8_denoiser_filter_neon Used horizontal add instructions instead of adding byte lanes. The encoder performance improved by ~4% for the test clip used. Change-Id: Iaddd10403fcffb5b3f53b1f591ab2fe0ff002c08	2014-04-30 06:58:16 -07:00
Yunqing Wang	33df6d1fc1	Save NEON registers in VP8 NEON functions The recent compiler can generate optimized code that uses NEON registers for various operations besides floating-point operations. Therefore, only saving callee-saved registers d8 - d15 at the beginning of the encoder/decoder is not enough anymore. This patch added register saving code in VP8 NEON functions that use those registers. Change-Id: Ie9e44f5188cf410990c8aaaac68faceee9dffd31	2014-04-28 14:51:53 -07:00
Martin Storsjo	e5647d6826	arm: Use vreinterpret instead of a plain cast for converting between neon vector types This fixes building with MSVC for arm. Change-Id: Iffae0408e0c68760e87e96b9e17d9df8e8cadb1a	2014-01-22 11:28:37 +02:00
Christian Duvivier	b52db6b7e8	ARM NEON version of denoiser. Change-Id: I951abd4ad0078f78949f3cb79453ac334fb82a7e	2014-01-02 10:51:05 -08:00
Johann	4d5f1955de	Remove type from vmvn datatype is optional for the instruction but clang refuses it. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/CIHIJIHC.html It is still required when using an immediate. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/CIHGGEEB.html Change-Id: I0fae956c8c0fa3f97578ce80abea247f7fc88705	2013-05-23 13:02:44 -07:00
John Koleszar	7b8dfcb5a2	Rough merge of master into experimental Creates a merge between the master and experimental branches. Fixes a number of conflicts in the build system to allow either VP8 or VP9 to be built. Specifically either: $ configure --disable-vp9 $ configure --disable-vp8 --disable-unit-tests VP9 still exports its symbols and files as VP8, so that will be resolved in the next commit. Unit tests are broken in VP9, but this isn't a new issue. They are fixed upstream on origin/experimental as of this writing, but rebasing this merge proved difficult, so will tackle that in a second merge commit. Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21	2012-11-07 11:30:16 -08:00
Ronald S. Bultje	4b2c2b9aa4	Rename vp8/ codec directory to vp9/. Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4	2012-11-01 16:31:22 -07:00
Ronald S. Bultje	6a4b1e5958	Remove vp8 in local symbols. For non-static functions, change the prefix to vp9_. For static functions, remove the prefix. Also fix some comments, remove unused code or unused function prototypes. Change-Id: I1f8be05362f66060fe421c3d4c9a906fdf835de5	2012-11-01 10:03:43 -07:00
John Koleszar	c6b9039fd9	Restyle code Approximate the Google style guide[1] so that there's a written document to follow and tools to check compliance[2]. [1]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml [2]: http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py Change-Id: Idf40e3d8dddcc72150f6af127b13e5dab838685f	2012-07-17 11:46:03 -07:00
John Koleszar	d8216b19b6	Merge "Fix compiler warnings" into eider	2012-05-02 16:22:34 -07:00
Timothy B. Terriberry	e50c842755	Fix TEXTRELs in the ARM asm. Besides imposing a performance penalty at startup in most configurations, these relocations break the dynamic linker for native Fennec, since it does not support them at all. Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de	2012-05-02 10:36:01 -07:00
Attila Nagy	14c9fce8e4	Fix compiler warnings Fix code for following warnings: -Wimplicit-function-declaration -Wuninitialized -Wunused-but-set-variable -Wunused-variable Change-Id: I2be434f22fdecb903198e8b0711255b4c1a2947a	2012-05-02 10:57:57 +03:00
Johann	e50f96a4a3	Move SAD and variance functions to common The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a	2012-03-05 16:50:33 -08:00
Johann	fea3556e20	Fix variance overflow In the variance calculations the difference is summed and later squared. When the sum exceeds sqrt(2^31) the value is treated as a negative when it is shifted which gives incorrect results. To fix this we cast the result of the multiplication as unsigned. The alternative fix is to shift sum down by 4 before multiplying. However that will reduce precision. For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and change). PPC change is untested. Change-Id: I1bad27ea0720067def6d71a6da5f789508cec265	2012-02-09 12:38:31 -08:00
Scott LaVarnway	edd98b7310	Added predictor stride argument(s) to subtract functions Patch set 2: 64 bit build fix Patch set 3: 64 bit crash fix [Tero] Patch set 4: Updated ARMv6 and NEON assembly. Added also minor NEON optimizations to subtract functions. Patch set 5: x86 stride bug fix Change-Id: I1fcca93e90c89b89ddc204e1c18f208682675c15	2011-11-15 12:53:01 -05:00
Scott LaVarnway	46639567a0	Merge "Change use of eob in the encoder"	2011-11-03 08:06:06 -07:00
Tero Rintaluoma	e4f2ec7a52	Change use of eob in the encoder Changed 'int eob' to 'char *eob' in BLOCKD so that both encoder and decoder will use eobs[25] array from MACROBLOCKD structure. In future, this will enable use of the decoder side IDCT in the encoder. Change-Id: I6e1c011628cb8864fd4a0b80f0279ce16a5ca978	2011-11-03 16:08:09 +02:00
Attila Nagy	de82809444	Reduce partial frame copy in encoder's pick_filter_level_fast The partial frame copy function used to copy an extra 8 lines above and below. The partial frame filtering can only modify 3 pixel rows above the partial frame. Reduce copy to bare minimum needed, which is 4 lines, so that partial filtering on copied frame is possible. Define the "magic" fraction number for partial filtering in loopfilter.h . Change-Id: I4791ffc541b6884b12759a0d0714a8faf16147ec	2011-10-26 15:25:07 +03:00
Fritz Koenig	bd0c3409a8	Move neon only arm functions under arm/neon. These files don't contain generic arm code, so should only be compiled by neon. Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564	2011-09-20 10:51:06 -07:00
Johann	6829e62718	Merge "NEON FDCT updated to match current C code"	2011-09-20 09:51:05 -07:00
Tero Rintaluoma	0c2529a812	NEON FDCT updated to match current C code - Removed fast_fdct4x4_neon and fast_fdct8x4_neon - Uses now short_fdct4x4 and short_fdct8x4 - Gives ~1-2% speed-up on Cortex-A8/A9 Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec	2011-09-20 10:20:55 +03:00
Tero Rintaluoma	2a4b2a000c	NEON walsh transform updated to match C Modified original patch If2f07220885c4c3a0cae0dace34ea0e36124f001 according to comments. Scheduled code a little bit to prevent some interlocks. Change-Id: I338f02b881098782f82af63d97f042b85e63e902	2011-09-19 10:15:33 +03:00
Yaowu Xu	361717d2be	remove one set of 16x16 variance functions call to this set of functions are replaced by var16x16. Change-Id: I5ff1effc6c1358ea06cda1517b88ec28ef551b0d	2011-06-09 11:23:05 -07:00
Tero Rintaluoma	61f0c090df	neon fast quantize block pair vp8_fast_quantize_b_pair_neon function added to quantize two adjacent blocks at the same time to improve performance. - Additional 3-6% speedup compared to neon optimized fast quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16) Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e	2011-06-01 10:48:05 +03:00
Tero Rintaluoma	33fa7c4ebe	neon fast quantizer updated vp8_fast_quantize_b_neon function updated and further optimized. - match current C implementation of fast quantizer - updated to use asm_enc_offsets for structure members - updated ads2gas scripts to handle alignment issues Change-Id: I5cbad9c460ad8ddb35d2970a8684cc620711c56d	2011-05-06 08:59:52 +03:00
Tero Rintaluoma	cec76a36d6	Wrapper function removed from vp8_subtract_b_neon function call Address calculations moved from encodemb_arm.c file to neon optimized assembly function to save cycles in function calls. - vp8_subtract_b_neon_func replaced with vp8_subtract_b_neon that contains all needed address calculations - unnecessary file encodemb_arm.c removed - consistent with ARMv6 optimized version Change-Id: I6cbc1a2670b56c2077f59995fcf8f70786b4990b	2011-04-01 10:06:44 +03:00
Johann	f3cb9ae459	Merge "Adds "armvX-none-rvct" targets"	2011-01-28 09:03:58 -08:00
Tero Rintaluoma	11a222f5d9	Adds "armvX-none-rvct" targets Adds following targets to configure script to support RVCT compilation without operating system support (for Profiler or bare metal images). - armv5te-none-rvct - armv6-none-rvct - armv7-none-rvct To strip OS specific parts from the code "os_support"-config was added to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS specific parts such as OS specific includes and function calls for timers and threads etc. This was done to enable RVCT compilation for profiling purposes or running the image on bare metal target with Lauterbach. Removed separate AREA directives for READONLY data in armv6 and neon assembly files to fix the RVCT compilation. Otherwise "ldr <reg>, =label" syntax would have been needed to prevent linker errors. This syntax is not supported by older gnu assemblers. Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43	2011-01-28 12:47:39 +02:00
Yunqing Wang	ce6c954d2e	Modify calling of NEON code in sub-pixel search In vp8_find_best_sub_pixel_step_iteratively(), many times xoffset and yoffset are specific values - (4,0) (0,4) and (4,4). Modified code to call simplified NEON version at these specific offsets to help with the performance. Change-Id: Iaf896a0f7aae4697bd36a49e182525dd1ef1ab4d	2011-01-18 14:19:52 -05:00


56 Commits
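
The overflow described in commit fea3556e20 ("Fix variance overflow") can be checked with a little arithmetic. The sketch below is not the libvpx code; the helper names `square_exceeds_int32` and `variance16x16_from_sums` are hypothetical, and it assumes the usual form `sse - (sum * sum >> 8)` for a 16x16 block (256 pixels). It casts the operands to unsigned before multiplying, which keeps the sketch free of signed overflow; the commit message itself describes casting the result of the multiplication.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when sum * sum does not fit in a signed
 * 32-bit integer, i.e. when |sum| exceeds sqrt(2^31), about 46340. */
static int square_exceeds_int32(int sum) {
  int64_t wide = (int64_t)sum * (int64_t)sum; /* exact product in 64 bits */
  return wide > (int64_t)INT32_MAX;
}

/* Hypothetical helper: variance of a 16x16 block computed from the sum of
 * differences (sum) and the sum of squared differences (sse).
 * For 16x16 blocks |sum| <= 255 * 256 = 65280, so sum * sum is at most
 * 4261478400, which fits in 32 unsigned bits but not in 31 signed bits.
 * Doing the multiplication in unsigned arithmetic keeps the shift logical. */
static uint32_t variance16x16_from_sums(int sum, uint32_t sse) {
  uint32_t squared = (uint32_t)sum * (uint32_t)sum; /* wraps mod 2^32; exact here */
  return sse - (squared >> 8);                      /* >> 8 divides by 256 pixels */
}
```

With the worst case from the commit message, every difference equal to 255, `sum` is 65280 and `sse` is 256 * 255 * 255 = 16646400. The squared sum exceeds the signed 32-bit range, yet the unsigned computation returns a variance of 0, as expected for a constant difference.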