ffmpeg

Author	SHA1	Message	Date
Diego Biurrun	6961bdface	x86: avcodec: Consistently name all init files	2012-08-16 11:05:38 +02:00
Martin Storsjö	1d9c2dc89a	Don't include common.h from avutil.h Signed-off-by: Martin Storsjö <martin@martin.st>	2012-08-15 22:32:06 +03:00
Diego Biurrun	29cfdd3767	x86: avcodec: Appropriately name files containing only init functions	2012-08-15 03:24:08 +02:00
Diego Biurrun	be12958937	mpegvideo_mmx_template: drop some commented-out cruft	2012-08-15 03:24:07 +02:00
Mans Rullgard	8ec0204ee4	x86: cabac: allow building with suncc This fixes two issues preventing suncc from building this code. The undocumented 'a' operand modifier, causing gcc to omit a $ in front of immediate operands (as required in addresses), is not supported by suncc. Luckily, the also undocumented 'c' modifier has the same effect and is supported. On some asm statements with a large number of operands, suncc for no obvious reason fails to correctly substitute some of the operands. Fortunately, some of the operands in these statements are plain numbers which can be inserted directly into the code block instead of passed as operands. With these changes, the code builds correctly with both gcc and suncc. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-13 14:51:52 +01:00
Mans Rullgard	c8252e80eb	x86: mlpdsp: avoid taking address of void This code contains a C array of addresses of labels defined in inline asm. To do this, the names must be declared as external in C. The declared type does not matter since only the address is used, and for some reason, the author of the code used the 'void' type despite taking the address of a void expression being invalid. Changing the type to char, a reasonable choice since the alignment of the code labels cannot be known or guaranteed, eliminates gcc warnings and allows building with suncc. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-13 14:51:52 +01:00
Diego Biurrun	3b9e832e17	x86: Drop silly "_yasm" suffixes from filenames	2012-08-12 17:13:05 +02:00
Mans Rullgard	d7a4f8f8b9	Move MASK_ABS macro to libavcodec/mathops.h This macro is only used in two places, both in libavcodec, so this is a more sensible place for it. Two small tweaks to the macro are made: - removing the trailing semicolon - dropping unnecessary 'volatile' from the x86 asm Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-09 00:58:20 +01:00
Mans Rullgard	c318626ce2	x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h This puts x86-specific things in the x86/ subdirectory where they belong. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-09 00:58:20 +01:00
Dave Yeo	197439c1ef	x86: pngdsp: Fix assembly for OS/2 The a.out object format does not allow aligning sections. On OS/2 LD aligns sections to 16 bytes. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-08-08 15:45:09 +02:00
Mans Rullgard	2b140a3d09	x86: use 32-bit source registers with movd instruction yasm tolerates mismatch between movd/movq and source register size, adjusting the instruction according to the register. nasm is more strict. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-07 15:21:20 +01:00
Mans Rullgard	a3df4781f4	x86: add colons after labels nasm prints a warning if the colon is missing. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-07 15:20:56 +01:00
Anton Khirnov	36ef5369ee	Replace all CODEC_ID_* with AV_CODEC_ID_*	2012-08-07 16:00:24 +02:00
Diego Biurrun	2096857551	x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2	2012-08-05 21:40:49 +02:00
Ronald S. Bultje	4a8143e73c	fft: 3dnow: fix register name typo in DECL_IMDCT macro Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-08-04 00:16:02 +02:00
Diego Biurrun	0c3ff1982c	x86: dct32: port to cpuflags	2012-08-03 22:51:06 +02:00
Diego Biurrun	239fdf1b4a	x86: build: replace mmx2 by mmxext Refactoring mmx2/mmxext YASM code with cpuflags will force renames. So switching to a consistent naming scheme beforehand is sensible. The name "mmxext" is more official and widespread and also the name of the CPU flag, as reported e.g. by the Linux kernel.	2012-08-03 22:51:05 +02:00
Ronald S. Bultje	da6505ad2f	dsputil: make add_hfyu_left_prediction_sse4() support unaligned src. This makes add_hfyu_left_prediction_sse4() handle sources that are not 16-byte aligned in its own function rather than by proxying the call to add_hfyu_left_prediction_ssse3(). This fixes a crash on Win64, since the sse4 version clobbers xmm6, but the ssse3 version (which uses MMX regs) does not restore it, thus leading to XMM clobbering and RSP being off. Fixes bug 342.	2012-08-03 11:09:14 -07:00
Diego Biurrun	ca844b7be9	x86: Use consistent 3dnowext function and macro name suffixes Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to "3dnowext", which is a more common name of the CPU flag, as reported e.g. by the Linux kernel, unifies this.	2012-08-03 14:00:47 +02:00
Diego Biurrun	03737412a3	x86: proresdsp: improve SIGNEXTEND macro comments	2012-08-02 22:30:44 +02:00
Diego Biurrun	81905088a1	x86: h264dsp: K&R formatting cosmetics	2012-08-02 20:20:21 +02:00
Ronald S. Bultje	c728518b3c	x86: fft: fix imdct_half() for AVX Some calculations were changed in b6a3849 to use mmsize, which was not correct for the AVX version, which uses INIT_YMM and therefore has mmsize == 32. Fixes Bug 341. Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>	2012-08-02 13:40:11 -04:00
Mans Rullgard	ec7c501ed5	x86: remove libmpeg2 mmx(ext) idct functions These functions are not faster than other mmx implementations on any hardware I have been able to test on, and they are horribly inaccurate. There is thus no reason to ever use them. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-08-02 12:14:52 +01:00
Ronald S. Bultje	b6a3849adb	fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64. 64-bit CPUs always have SSE available, thus there is no need to compile in the 3dnow functions. This results in smaller binaries.	2012-07-31 21:20:47 -07:00
Ronald S. Bultje	53dfaedc01	x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM.	2012-07-31 20:28:52 -07:00
Diego Biurrun	6376a3ad24	x86: h264dsp: Remove unused variable ff_pb_3_1	2012-08-01 00:17:16 +02:00
Diego Biurrun	8728b381cb	x86: h264dsp: Adjust YASM #ifdefs This fixes compilation with YASM disabled.	2012-07-31 13:54:07 +02:00
Ronald S. Bultje	b829b4ce29	h264: convert loop filter strength dsp function to yasm. This completes the conversion of h264dsp to yasm; note that h264 also uses some dsputil functions, most notably qpel. Performance-wise, the yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles faster (201->193) on x86-32.	2012-07-30 19:39:47 -07:00
Ronald S. Bultje	c83f44dba1	h264_idct_10bit: port x86 assembly to cpuflags.	2012-07-28 08:29:45 -07:00
Ronald S. Bultje	b3c5ae5607	fft: rename "z" to "zc" to prevent name collision. Without this, cglobal will expand "z" to "zh" to access the high byte in a register's word, which causes a name collision with the ZH(x) macro further up in this file.	2012-07-28 08:29:44 -07:00
Ronald S. Bultje	4d777eedfd	vp3: don't compile mmx IDCT functions on x86-64. 64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX version will never be used.	2012-07-27 20:12:30 -07:00
Ronald S. Bultje	a5bbb1242c	h264_loopfilter: port x86 simd to cpuflags.	2012-07-27 20:12:11 -07:00
Ronald S. Bultje	d07ff3cd5a	h264_chromamc_10bit: port x86 simd to cpuflags.	2012-07-27 17:35:49 -07:00
Ronald S. Bultje	4a26fdd852	vp3: port x86 SIMD to cpuflags.	2012-07-27 17:35:49 -07:00
Ronald S. Bultje	76888c64b0	rv34: port x86 SIMD to cpuflags.	2012-07-27 15:13:26 -07:00
Ronald S. Bultje	158744a4cd	vp56: only compile MMX SIMD on x86-32. All x86-64 CPUs have SSE2, so the MMX version will never be used. This leads to smaller binaries.	2012-07-27 14:40:27 -07:00
Ronald S. Bultje	2734ba787b	vp56: port x86 simd to cpuflags.	2012-07-27 14:39:07 -07:00
Ronald S. Bultje	5361e10a5e	proresdsp: port x86 assembly to cpuflags.	2012-07-27 11:43:06 -07:00
Ronald S. Bultje	bde73f28af	mpegaudio: bury inline asm under HAVE_INLINE_ASM.	2012-07-26 13:43:16 -07:00
Ronald S. Bultje	30b45d9c38	x86inc: automatically insert vzeroupper for YMM functions.	2012-07-26 13:43:16 -07:00
Ronald S. Bultje	a1878a88a1	vp3: don't use calls to inline asm in yasm code. Mixing yasm and inline asm is a bad idea, since if either yasm or inline asm is not supported by your toolchain, all of the asm stops working. Thus, better to use either one or the other alone. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2012-07-25 14:24:30 -04:00
Ronald S. Bultje	79195ce565	x86/dsputil: put inline asm under HAVE_INLINE_ASM. This allows compiling with compilers that don't support gcc-style inline assembly. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2012-07-25 14:24:27 -04:00
Yang Wang	845e92fd6a	dsputil_mmx: fix incorrect assembly code In ff_put_pixels_clamped_mmx(), there are two assembly code blocks. In the first block (in the unrolled loop), the instructions "movq 8%3, %%mm1 \n\t", and so forth, have problems. From the above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: "movq 8(%edi), %mm1". During compiler optimization, it is possible that the compiler will be able to constant propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes "movq 810000(%edx)". That is, it will stride by 810000 instead of 8. This will cause a segmentation fault. This error was fixed in the second block of the assembly code, but not in the unrolled loop. How to reproduce: This error is exposed when we build using Intel C++ Compiler, with IPO+PGO optimization enabled. Crashed when decoding an MJPEG video. Signed-off-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2012-07-25 14:22:18 -04:00
Jason Garrett-Glaser	85a3c19ed1	dsputil: x86: add SHUFFLE_MASK_W macro Simplifies pshufb masks that operate on words.	2012-07-22 16:56:58 -04:00
Diego Biurrun	9f97af2688	x86: dsputil: drop some unused CPU flag debug code	2012-07-19 10:17:56 +02:00
Mans Rullgard	28f9ab7029	vp3: move idct and loop filter pointers to new vp3dsp context This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-07-18 10:32:19 +01:00
Mans Rullgard	ab9f987661	build: add CONFIG_VP3DSP, reduce repetition in OBJS lists Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-07-18 10:32:18 +01:00
Martin Storsjö	f27386cdc7	x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro The SPLATB_REG macro already adds the 'd' suffix internally. This fixes building on Win64, which has been broken since 878e66902. This worked for unix, where r2 happened to be rdx in this case, which with the first suffix rdxd was mapped to eax, and eaxd is defined back to eax. On win64 however, r2 happened to be R8 in this case, and R8d maps to R8D just fine, but there's no mapping for R8Dd to anything. Signed-off-by: Martin Storsjö <martin@martin.st>	2012-07-06 21:07:23 +03:00
Diego Biurrun	878e669029	x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros	2012-07-05 17:37:11 +02:00
Loren Merritt	4d4752366f	x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-07-05 17:37:11 +02:00

944 Commits