ffmpeg

Author	SHA1	Message	Date
James Almer	32c836cb11	x86/vp9: remove duplicate function prototypes Fixes "redundant redeclaration" warnings. Signed-off-by: James Almer <jamrial@gmail.com>	2014-12-23 00:56:51 -03:00
James Almer	7696e429c7	x86/vp3dsp: port put_vp_no_rnd_pixels8_l2_mmx to yasm Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-20 13:25:43 +01:00
James Almer	a4d62f7775	x86/constants: fix alignment of pw_255 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-19 20:21:34 +01:00
Ronald S. Bultje	bdc1e3e3b2	vp9/x86: intra prediction sse2/32bit support. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-19 14:07:19 +01:00
Ronald S. Bultje	b6e1711223	vp9/x86: invert hu_ipred left array ordering. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-19 14:07:18 +01:00
Ronald S. Bultje	0a7964dca5	vp9/x86: save one register on 32bit idct32x32. Fixes build on win32. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-16 02:51:26 +01:00
Ronald S. Bultje	cae893f692	vp9/x86: sse2 MC assembly. Also a slight change to the ssse3 code, which prevents a theoretical overflow in the sharp filter. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-15 02:34:05 +01:00
Ronald S. Bultje	fd77fbb390	vp9/x86: 32bit and sse2 support for vp9 inverse transform assembly Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-15 00:38:05 +01:00
Michael Niedermayer	a03f72e744	avcodec/x86/hevc_mc: fix sse register counts These fix failures of --enable-xmm-clobber-test. It would be better to change the code to use fewer registers, but until someone does, the used register count must not be too small. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-11 13:17:26 +01:00
Michael Niedermayer	d43d5c5707	avcodec/x86/hevc_mc: remove dead branch from EPEL_FILTER Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-10 07:34:49 +01:00
Michael Niedermayer	ed9be7dd47	avcodec/x86/pngdsp: fix off by 1 error This fixes artifacts in the last pixel of rows with some widths and pixel formats Found-by: Dominique Leroux <Dominique.Leroux@autodesk.com> Tested-by: Dominique Leroux <Dominique.Leroux@autodesk.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-08 18:24:40 +01:00
Michael Niedermayer	1d048f762d	Merge commit '9a738c27dceb4b975784b23213a46f5cb560d1c2' * commit '9a738c27dceb4b975784b23213a46f5cb560d1c2': v210enc: Add SIMD optimised 8-bit and 10-bit encoders Conflicts: libavcodec/v210enc.c libavcodec/v210enc.h libavcodec/x86/Makefile libavcodec/x86/v210enc.asm libavcodec/x86/v210enc_init.c tests/ref/vsynth/vsynth1-v210 tests/ref/vsynth/vsynth2-v210 See: `36091742d1` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-06 01:54:10 +01:00
Kieran Kunhya	9a738c27dc	v210enc: Add SIMD optimised 8-bit and 10-bit encoders Signed-off-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2014-12-05 13:03:49 +00:00
Reimar Döffinger	49d9cbe55d	h264_i386: Fix operand size Fixes fate failure on macosx clang x86-64 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-03 23:03:13 +01:00
Christophe Gisquet	9fa056ba75	pngdsp x86: use unaligned access For test images manually generated to contain only up prediction, timing results: 8380x3032 255x185 before: 138635 1992 after: 139232 1996 Actually jumping to the proper version depending on the alignment: 8380x3032: 138767 A 0.5% speed improvement for gigantic images is not worth the code duplication. Fixes ticket #4148 Signed-off-by: Christophe Gisquet <christophe.gisquet@gmail.com> Tested-by: Benoit Fouet <benoit.fouet@free.fr> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-12-03 11:56:22 +01:00
Kieran Kunhya	36091742d1	v210enc: Add SIMD optimised 8-bit and 10-bit encoders Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-26 20:30:47 +01:00
Michael Niedermayer	ea41e6d637	Merge commit '9c12c6ff9539e926df0b2a2299e915ae71872600' * commit '9c12c6ff9539e926df0b2a2299e915ae71872600': motion_est: convert stride to ptrdiff_t Conflicts: libavcodec/me_cmp.c libavcodec/ppc/me_cmp.c libavcodec/x86/me_cmp_init.c See: `9c669672c7` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-24 12:13:00 +01:00
Vittorio Giovara	9c12c6ff95	motion_est: convert stride to ptrdiff_t CC: libav-stable@libav.org Bug-Id: CID 700556 / CID 700557 / CID 700558	2014-11-24 01:30:10 +00:00
Carl Eugen Hoyos	600e38f563	Fix standalone compilation of the apng decoder on x86.	2014-11-23 13:21:29 +01:00
Michael Niedermayer	65ce8f8895	avcodec/x86/Makefile: fix order Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-23 01:49:04 +01:00
Michael Niedermayer	d3512a0e89	avcodec/x86/lossless_audiodsp: fix fallback code for 32bit Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-22 21:08:38 +01:00
Michael Niedermayer	4327088da3	avcodec/x86/lossless_audiodsp: support len %16 == 8 in scalarproduct_and_madd_int16() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-11-22 20:40:36 +01:00
Reimar Döffinger	478c61ccb2	h264_i386: Optimize decode_significance_8x8_x86 for 64 bit. 11674 -> 10877 decicycles on my Phenom II. Overall speedup was unfortunately within measurement error. Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>	2014-11-22 14:06:48 +01:00
James Almer	3cec54b7d7	x86/flacdsp: add SSE2 and AVX decorrelate functions Two to four times faster depending on instruction set, block size and channel count.	2014-11-13 13:47:55 -03:00
James Almer	84ccc317ce	x86/flacdsp: separate decoder and encoder dsp initialization Signed-off-by: James Almer <jamrial@gmail.com>	2014-11-12 14:41:45 -03:00
James Almer	7292b0477a	x86/hpeldsp: fix loop in {avg,avg_no_rnd}_pixels16_x2_mmx Handle it inside the __asm__() block. Fixes fate-vc1_ilaced_twomv when using the gcc-usan toolchain. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-23 13:11:05 -03:00
Michael Niedermayer	3c1378ce0a	Merge commit '2d91abade29e43bb45c881d45909b8ee77e904e2' * commit '2d91abade29e43bb45c881d45909b8ee77e904e2': x86: h264_intrapred: Don't treat 32-bit integers as 64-bit Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-10-08 11:48:58 +02:00
Henrik Gramner	2d91abade2	x86: h264_intrapred: Don't treat 32-bit integers as 64-bit The upper halves are not guaranteed to be zero in x86-64. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2014-10-08 08:15:52 +00:00
Mickaël Raulet	4ba6371a83	x86/hevc: get rid of packusdw for ssse3 compatibility cherry picked from commit df8ebe304df453f26c28ff8f11d607f49b90a4c2 Fixes out of array access Fixes: asan_stack-oob_1046454_9_asan_stack-oob_15a9e7c_170_WP_MAIN10_B_Toshiba_3.bit Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-10-04 21:14:15 +02:00
James Almer	0de1d6287e	x86/mlpdec: add ff_mlp_rematrix_channel_{sse4,avx2} 2x to 2.5x faster than the C version. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-02 22:11:55 -03:00
James Almer	acebff8e5d	x86/mpegvideoencdsp: improve ff_pix_sum16_sse2 ~15% faster. Also add an mmxext version that takes advantage of the new code, and build it alongside with the mmx version only on x86_32. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-10-01 13:07:22 -03:00
Michael Niedermayer	d22e88d120	avcodec/x86/fmtconvert: Fix operand size in ff_int32_to_float_fmul_array8_sse* Fixes acodec-dca2 fate failure Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-28 19:04:06 +02:00
James Almer	26cd7b1e1a	x86/fmtconvert: add ff_int32_to_float_fmul_array8_{sse,sse2} About two times faster than the c wrapper. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-26 20:48:40 -03:00
Carl Eugen Hoyos	c0f9df30dd	lavc/x86/idctdsp.h: Fix make checkheaders.	2014-09-25 22:18:25 +02:00
James Almer	a829870b2f	avcodec/svq1enc: align buffer used by simd functions Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-25 16:00:20 -03:00
James Almer	4b892e469b	x86/cavsdsp: fix buffer alignment in cavs_idct8_add_mmx() It may be used by ff_add_pixels_clamped_sse2(). Should fix fate-cavs failures on some systems. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-25 16:00:16 -03:00
James Almer	4f4f08e6f0	x86/idctdsp: port {put,add}_pixels_clamped to yasm Also add sse2 versions for both. put_pixels_clamped port and sse2 version originally written by Timothy Gu. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 21:52:13 -03:00
James Almer	c99a882814	avcodec/idctdsp: change {put,add}_pixels_clamped to ptrdiff_t line_size Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 21:43:19 -03:00
James Almer	ad26e83f9c	avcodec/x86: use function pointers for {put,add}_pixels_clamped Same behavior as in simple_idct. This way the best optimized versions available will be used instead. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 18:52:32 -03:00
James Almer	70277d1d23	x86/videodsp: add ff_emu_edge_{hfix,hvar}_avx2 ~15% faster than sse2. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-24 16:12:55 -03:00
James Almer	164d6c7f5b	x86/videodsp: fix warning about discarded 'const' qualifier Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-23 19:59:20 -03:00
James Almer	6b2caa321f	x86/vp9: add AVX and AVX2 MC Roughly 25% faster MC than ssse3 for blocksizes 32 and 64. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-22 22:35:03 -03:00
James Almer	33c752be51	x86/me_cmp: port mmxext vsad functions to yasm Also add mmxext versions of vsad8 and vsad_intra8, and sse2 versions of vsad16 and vsad_intra16. Since vsad8 and vsad16 are not bitexact, they are accordingly marked as approximate. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-19 20:50:20 -03:00
James Almer	77f9a81cca	x86/me_cmp: combine sad functions into a single macro No point in having the sad8 functions separate now that the loop is no longer unrolled. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-17 23:52:36 -03:00
Michael Niedermayer	41d82b85ab	avcodec/x86/vp9lpf: Always include x86util.asm Fixes executable stack Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-17 23:37:46 +02:00
Michael Niedermayer	85f2c0124d	avcodec/x86/me_cmp: fix sad8xh This adds back support for 8x4 and 8x16. It does not support 8x2; I think nothing uses that. Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-17 14:08:24 +02:00
James Almer	0456d169c4	x86/me_cmp: port mmxext and sse2 sad functions to yasm Also add a missing c->pix_abs[0][0] initialization, and sse2 versions of sad16_x2, sad16_y2 and sad16_xy2 (15% to 20% faster than mmxext). Since the _xy2 versions are not bitexact, they are accordingly marked as approximate. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-17 11:12:50 +02:00
James Almer	52ec81c67d	x86/hevc_res_add: add missing guards to hevc_transform_add32_8_avx2 Should fix compilation with old Yasm/Nasm versions. Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-04 23:34:01 -03:00
James Almer	c3d2426cca	x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 ~20% faster than AVX. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2014-09-04 20:21:29 -03:00
James Darnley	46ef45ab59	lavc/x86/v210: give cpuflag to INIT macro This lets the cglobal macro automatically append a suffix to the function name. This means that INIT_XMM avx must be used rather than INIT_AVX. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-09-05 00:35:07 +02:00

1872 Commits