Commit Graph

869 Commits

Author SHA1 Message Date
Diego Biurrun
239fdf1b4a x86: build: replace mmx2 by mmxext
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
2012-08-03 22:51:05 +02:00
Ronald S. Bultje
da6505ad2f dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
This makes add_hfyu_left_prediction_sse4() handle sources that are not
16-byte aligned in its own function rather than by proxying the call to
add_hfyu_left_prediction_ssse3(). This fixes a crash on Win64, since the
sse4 version clobberes xmm6, but the ssse3 version (which uses MMX regs)
does not restore it, thus leading to XMM clobbering and RSP being off.

Fixes bug 342.
2012-08-03 11:09:14 -07:00
Diego Biurrun
ca844b7be9 x86: Use consistent 3dnowext function and macro name suffixes
Currently there is a wild mix of 3dn2/3dnow2/3dnowext.  Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
2012-08-03 14:00:47 +02:00
Michael Niedermayer
9c6e23f5d2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: fft: fix imdct_half() for AVX
  rtmppkt: Add missing libavcodec/bytestream.h include.
  rtmp: add functions for reading AMF values
  vc1dec: remove useless #include simple_idct.h
  dct-test: always link with aandcttab.o
  vp8: pack struct VP8ThreadData more efficiently
  x86: remove libmpeg2 mmx(ext) idct functions
  eamad: Use dsputils instead of a custom bswap16_buf
  Canopus Lossless decoder

Conflicts:
	Changelog
	LICENSE
	libavcodec/avcodec.h
	libavcodec/cllc.c
	libavcodec/eamad.c
	libavcodec/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-02 23:34:01 +02:00
Diego Biurrun
03737412a3 x86: proresdsp: improve SIGNEXTEND macro comments 2012-08-02 22:30:44 +02:00
Ronald S. Bultje
9f14cd91b5 fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-08-02 22:14:40 +02:00
Diego Biurrun
81905088a1 x86: h264dsp: K&R formatting cosmetics 2012-08-02 20:20:21 +02:00
Ronald S. Bultje
c728518b3c x86: fft: fix imdct_half() for AVX
Some calculations were changed in b6a3849 to use mmsize, which was not correct
for the AVX version, which uses INIT_YMM and therefore has mmsize == 32.

Fixes Bug 341.

Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-08-02 13:40:11 -04:00
Mans Rullgard
ec7c501ed5 x86: remove libmpeg2 mmx(ext) idct functions
These functions are not faster than other mmx implementations on
any hardware I have been able to test on, and they are horribly
inaccurate.  There is thus no reason to ever use them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Michael Niedermayer
ec7ecb8811 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dca: Switch dca_sample_rates to avpriv_ prefix; it is used across libs
  ARM: use =const syntax instead of explicit literal pools
  ARM: use standard syntax for all LDRD/STRD instructions
  fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
  dct-test: allow to compile without HAVE_INLINE_ASM.
  x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM.
  dca: Move tables used outside of dcadec.c to a separate file.
  dca: Rename dca.c ---> dcadec.c
  x86: h264dsp: Remove unused variable ff_pb_3_1
  apetag: change a forgotten return to return 0

Conflicts:
	libavcodec/Makefile
	libavcodec/dca.c
	libavcodec/x86/fft_3dn.c
	libavcodec/x86/fft_3dn2.c
	libavcodec/x86/fft_mmx.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-01 23:33:06 +02:00
Ronald S. Bultje
b6a3849adb fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-07-31 21:20:47 -07:00
Ronald S. Bultje
53dfaedc01 x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM. 2012-07-31 20:28:52 -07:00
Diego Biurrun
6376a3ad24 x86: h264dsp: Remove unused variable ff_pb_3_1 2012-08-01 00:17:16 +02:00
Michael Niedermayer
d1dad7c824 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpc8: return more meaningful error codes.
  mpc: return more meaningful error codes.
  wv,mpc8: don't return apetag data in packets.
  rtmp: do not warn about receiving metadata packets
  x86: h264dsp: Adjust YASM #ifdefs
  x86: yadif: Mark mmxext optimizations as such
  h264: convert loop filter strength dsp function to yasm.
  Improve descriptiveness of a number of codec and container long names

Conflicts:
	libavcodec/flvdec.c
	libavcodec/libopenjpegdec.c
	libavformat/apetag.c
	libavformat/mp3dec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-31 22:41:00 +02:00
Diego Biurrun
8728b381cb x86: h264dsp: Adjust YASM #ifdefs
This fixes compilation with YASM disabled.
2012-07-31 13:54:07 +02:00
Ronald S. Bultje
b829b4ce29 h264: convert loop filter strength dsp function to yasm.
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles
faster (201->193) on x86-32.
2012-07-30 19:39:47 -07:00
Michael Niedermayer
706bd8ea19 Merge remote-tracking branch 'qatar/master'
* qatar/master: (35 commits)
  h264_idct_10bit: port x86 assembly to cpuflags.
  x86inc: clip num_args to 7 on x86-32.
  x86inc: sync to latest version from x264.
  fft: rename "z" to "zc" to prevent name collision.
  wv: return meaningful error codes.
  wv: return AVERROR_EOF on EOF, not EIO.
  mp3dec: forward errors for av_get_packet().
  mp3dec: remove a pointless local variable.
  mp3dec: remove commented out cruft.
  lavfi: bump minor to mark stabilizing the ABI.
  FATE: add tests for yadif.
  FATE: add a test for delogo video filter.
  FATE: add a test for amix audio filter.
  audiogen: allow specifying random seed as a commandline parameter.
  vc1dec: Override invalid macroblock quantizer
  vc1: avoid reading beyond the last line in vc1_draw_sprites()
  vc1dec: check that coded slice positions and interlacing match.
  vc1dec: Do not ignore ff_vc1_parse_frame_header_adv return value
  configure: Move parts that should not be user-selectable to CONFIG_EXTRA
  lavf: remove commented out cruft in avformat_find_stream_info()
  ...

Conflicts:
	Makefile
	configure
	libavcodec/vc1dec.c
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264dsp_mmx.c
	libavfilter/version.h
	libavformat/mp3dec.c
	libavformat/utils.c
	libavformat/wv.c
	libavutil/x86/x86inc.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-29 02:16:26 +02:00
Ronald S. Bultje
c83f44dba1 h264_idct_10bit: port x86 assembly to cpuflags. 2012-07-28 08:29:45 -07:00
Ronald S. Bultje
b3c5ae5607 fft: rename "z" to "zc" to prevent name collision.
Without this, cglobal will expand "z" to "zh" to access the high byte
in a register's word, which causes a name collision with the ZH(x) macro
further up in this file.
2012-07-28 08:29:44 -07:00
Ronald S. Bultje
4d777eedfd vp3: don't compile mmx IDCT functions on x86-64.
64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX
version will never be used.
2012-07-27 20:12:30 -07:00
Ronald S. Bultje
a5bbb1242c h264_loopfilter: port x86 simd to cpuflags. 2012-07-27 20:12:11 -07:00
Ronald S. Bultje
d07ff3cd5a h264_chromamc_10bit: port x86 simd to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
4a26fdd852 vp3: port x86 SIMD to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
76888c64b0 rv34: port x86 SIMD to cpuflags. 2012-07-27 15:13:26 -07:00
Michael Niedermayer
c6963a220d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  proresdsp: port x86 assembly to cpuflags.
  lavr: x86: improve non-SSE4 version of S16_TO_S32_SX macro
  lavfi: better channel layout negotiation
  alac: check for truncated packets
  alac: reverse lpc coeff order, simplify filter
  lavr: add x86-optimized mixing functions
  x86: add support for fmaddps fma4 instruction with abstraction to avx/sse
  tscc2: fix typo in array index
  build: use COMPILE template for HOSTOBJS
  build: do full flag handling for all compiler-type tools
  eval: fix printing of NaN in eval fate test.
  build: Rename aandct component to more descriptive aandcttables
  mpegaudio: bury inline asm under HAVE_INLINE_ASM.
  x86inc: automatically insert vzeroupper for YMM functions.
  rtmp: Check the buffer length of ping packets
  rtmp: Allow having more unknown data at the end of a chunk size packet without failing
  rtmp: Prevent reading outside of an allocate buffer when receiving server bandwidth packets

Conflicts:
	Makefile
	configure
	libavcodec/x86/proresdsp.asm
	libavutil/eval.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-27 23:42:19 +02:00
Ronald S. Bultje
158744a4cd vp56: only compile MMX SIMD on x86-32.
All x86-64 CPUs have SSE2, so the MMX version will never be used. This
leads to smaller binaries.
2012-07-27 14:40:27 -07:00
Ronald S. Bultje
2734ba787b vp56: port x86 simd to cpuflags. 2012-07-27 14:39:07 -07:00
Ronald S. Bultje
5361e10a5e proresdsp: port x86 assembly to cpuflags. 2012-07-27 11:43:06 -07:00
jamal
52a62f9085 dwt: Fix several warnings about incompatible pointer type
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-27 19:36:17 +02:00
Ronald S. Bultje
bde73f28af mpegaudio: bury inline asm under HAVE_INLINE_ASM. 2012-07-26 13:43:16 -07:00
Ronald S. Bultje
30b45d9c38 x86inc: automatically insert vzeroupper for YMM functions. 2012-07-26 13:43:16 -07:00
Michael Niedermayer
7333798c85 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libopenjpeg: support YUV and deep RGB pixel formats
  Fix typo in v410 decoder.
  vf_yadif: unset cur_buf on the input link.
  vf_overlay: ensure the overlay frame does not get leaked.
  vf_overlay: prevent premature freeing of cur_buf
  Support urlencoded http authentication credentials
  rtmp: Return an error when the client bandwidth is incorrect
  rtmp: Return proper error code in handle_server_bw
  rtmp: Return proper error code in handle_client_bw
  rtmp: Return proper error codes in handle_chunk_size
  lavr: x86: add missing vzeroupper in ff_mix_1_to_2_fltp_flt()
  vp8: Replace x*155/100 by x*101581>>16.
  vp3: don't use calls to inline asm in yasm code.
  x86/dsputil: put inline asm under HAVE_INLINE_ASM.
  dsputil_mmx: fix incorrect assembly code
  rtmp: Factorize the code by adding handle_invoke
  rtmp: Factorize the code by adding handle_chunk_size
  rtmp: Factorize the code by adding handle_ping
  rtmp: Factorize the code by adding handle_client_bw
  rtmp: Factorize the code by adding handle_server_bw

Conflicts:
	libavcodec/libopenjpegdec.c
	libavcodec/x86/dsputil_mmx.c
	libavfilter/vf_overlay.c
	libavformat/Makefile
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-26 21:37:15 +02:00
Ronald S. Bultje
a1878a88a1 vp3: don't use calls to inline asm in yasm code.
Mixing yasm and inline asm is a bad idea, since if either yasm or inline
asm is not supported by your toolchain, all of the asm stops working.
Thus, better to use either one or the other alone.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:24:30 -04:00
Ronald S. Bultje
79195ce565 x86/dsputil: put inline asm under HAVE_INLINE_ASM.
This allows compiling with compilers that don't support gcc-style
inline assembly.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:24:27 -04:00
Yang Wang
845e92fd6a dsputil_mmx: fix incorrect assembly code
In ff_put_pixels_clamped_mmx(), there are two assembly code blocks.
In the first block (in the unrolled loop), the instructions
"movq 8%3, %%mm1 \n\t", and so forth, have problems.

From above instruction, it is clear what the programmer wants: a load from
p + 8. But this assembly code doesn’t guarantee that. It only works if the
compiler puts p in a register to produce an instruction like this:
"movq 8(%edi), %mm1". During compiler optimization, it is possible that the
compiler will be able to constant propagate into p. Suppose p = &x[10000].
Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction
becomes "movq 810000(%edx)". That is, it will stride by 810000 instead of 8.

This will cause a segmentation fault.

This error was fixed in the second block of the assembly code, but not in
the unrolled loop.

How to reproduce:
    This error is exposed when we build using Intel C++ Compiler, with
    IPO+PGO optimization enabled. Crashed when decoding an MJPEG video.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:22:18 -04:00
yang
6a2bad2c4f dsputil_mmx: fix incorrect assembly code
In file libavcodec/x86/dsputil_mmx.c, function ff_put_pixels_clamped_mmx(), there are two assembly code blocks. In the first block (in the unrolled loop), the instructions "movq 8%3, %%mm1 \n\t" etc have problem.
For above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: “movq 8(%edi), %mm1”. During compiler optimization, it is possible that the compiler will be able to constant propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes “movq 810000(%edx)”. That is, it will stride by 810000 instead of 8.
This will cause the segmentation fault.
This error was fixed in the second block of the assembly code, but not in the unrolled loop.

How to reproduce:
This error is exposed when we build the ffmpeg using Intel C++ Compiler, IPO+PGO optimization. The ffmpeg was crashed when decoding a mjpeg video.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-24 00:55:05 +02:00
Michael Niedermayer
2cb4d51654 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  v410dec: Implement explode mode support
  zerocodec: fix direct rendering.
  wav: init st to NULL to avoid a false-positive warning.
  wavpack: set bits_per_raw_sample for S32 samples to properly identify 24-bit
  h264: refactor NAL decode loop
  RTMPTE protocol support
  RTMPE protocol support
  rtmp: Add ff_rtmp_calc_digest_pos()
  rtmp: Rename rtmp_calc_digest to ff_rtmp_calc_digest and make it global
  swscale: add missing HAVE_INLINE_ASM check.
  lavfi: place x86 inline assembly under HAVE_INLINE_ASM.
  vc1: Add a test for interlaced field pictures
  swscale: Mark all init functions as av_cold
  swscale: x86: Drop pointless _mmx suffix from filenames
  lavf: use conditional notation for default codec in muxer declarations.
  swscale: place inline assembly bilinear scaler under HAVE_INLINE_ASM.
  dsputil: ppc: cosmetics: pretty-print
  dsputil: x86: add SHUFFLE_MASK_W macro
  configure: respect CC_O setting in check_cc

Conflicts:
	Changelog
	configure
	libavcodec/v410dec.c
	libavcodec/zerocodec.c
	libavformat/asfenc.c
	libavformat/version.h
	libswscale/utils.c
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-23 21:25:09 +02:00
Jason Garrett-Glaser
85a3c19ed1 dsputil: x86: add SHUFFLE_MASK_W macro
Simplifies pshufb masks that operate on words.
2012-07-22 16:56:58 -04:00
Michael Niedermayer
85044358f6 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Print full compiler identification, not only version number
  flacdec: reverse lpc coeff order, simplify filter
  x86: dsputil: drop some unused CPU flag debug code

Conflicts:
	cmdutils.c
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-19 22:01:31 +02:00
Diego Biurrun
9f97af2688 x86: dsputil: drop some unused CPU flag debug code 2012-07-19 10:17:56 +02:00
Michael Niedermayer
204c4e953d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ppc: fix build with altivec disabled
  vp3: move idct and loop filter pointers to new vp3dsp context
  build: add CONFIG_VP3DSP, reduce repetition in OBJS lists
  tscc2: do not add/subtract 128 bias during DCT
  tscc2: fix typo in DCT
  configure: clarify external library section of help output
  configure: mark libfdk-aac as nonfree
  configure: cosmetics: drop some unnecessary backslashes
  os_support: K&R formatting cosmetics

Conflicts:
	configure
	libavcodec/vp3.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-18 22:34:48 +02:00
Mans Rullgard
28f9ab7029 vp3: move idct and loop filter pointers to new vp3dsp context
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context.  There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:19 +01:00
Mans Rullgard
ab9f987661 build: add CONFIG_VP3DSP, reduce repetition in OBJS lists
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:18 +01:00
Michael Niedermayer
3245c8b669 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mxfdec: replace x>>av_log2(sizeof(..)) by x/sizeof(..).
  x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro

Conflicts:
	libavformat/mxfdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-07 20:29:43 +02:00
Loren Merritt
e14052dbc8 x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-07 20:26:39 +02:00
Martin Storsjö
f27386cdc7 x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro
The SPLATB_REG macro already adds the 'd' suffix internally.

This fixes building on Win64, which has been broken since 878e66902.

This worked for unix, where r2 happened to be rdx in this case, which
with the first suffix rdxd was mapped to eax, and eaxd is defined back
to eax. On win64 however, r2 happened to be R8 in this case, and
R8d mapps to R8D just fine, but there's no mapping for R8Dd to anything.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-07-06 21:07:23 +03:00
Michael Niedermayer
24823a761c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  qdm2: remove broken and disabled dump_context() debug function
  x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros
  x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros
  x86inc: modify ALIGN to not generate long nops on i586
  x86: h264_intrapred: port to cpuflag macros
  avplay: update input filter pointer when the filtergraph is reset.
  avconv: fix parsing of -force_key_frames option.
  h264: use templates to avoid excessive inlining
  xtea: Make the count parameter match the documentation
  blowfish: Make the count parameter match the documentation
  mpegvideo: Don't use ff_mspel_motion() for vc1
  xtea: invert branch and loop precedence
  blowfish: invert branch and loop precedence
  flvdec: optionally trust the metadata
  avconv: Set audio filter time base to the sample rate
  vp8: Add ifdef guards around the sse2 loopfilter in the sse2slow branch too

Conflicts:
	ffmpeg.c
	ffplay.c
	libavcodec/h264.c
	libavcodec/mpegvideo_common.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-05 21:55:31 +02:00
Diego Biurrun
878e669029 x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros 2012-07-05 17:37:11 +02:00
Loren Merritt
4d4752366f x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-07-05 17:37:11 +02:00
Diego Biurrun
d20f133ef9 x86: h264_intrapred: port to cpuflag macros 2012-07-05 17:37:10 +02:00
Martin Storsjö
07eeeb1d4f vp8: Add ifdef guards around the sse2 loopfilter in the sse2slow branch too
This was missed in the the previous commit in 70a1c800.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-07-05 09:39:01 +03:00
Michael Niedermayer
039e9fe01c Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  lavfi: reclassify showfiltfmts as a TESTPROG
  graph2dot: fix printf format specifier
  swscale: yuv2planeX 8bit >=sse2 functions need aligned stack on x86-32.
  vp8: loopfilter >=sse2 functions need aligned stack on x86-32.
  amr: remove shift out of the AMR_BIT() macro.
  dsputilenc: group yasm and inline asm function pointer assignment.
  mov: use forward declaration of a function instead of a table.
  Clarify Doxygen comment for FF_API_* #defines.
  configure: simplify get_version()
  Create version.h headers for libraries that lack them
  gitignore: Use full path instead of relative path to specify patterns
  mpegvideo: remove VLAs
  Add XTEA encryption support in libavutil
  Add Blowfish encryption support in libavutil
  eval: Add the isinf() function and tests for it
  flacdec: move lpc filter to flacdsp
  flacdec: split off channel decorrelation as flacdsp
  avplay: Add an option for not limiting the input buffer size
  FATE: add a test for WMA cover art.
  FATE: add a test for apetag cover art
  ...

Conflicts:
	.gitignore
	configure
	ffplay.c
	libavcodec/Makefile
	libavcodec/error_resilience.c
	libavcodec/mpegvideo.c
	libavcodec/ratecontrol.c
	libavdevice/avdevice.h
	libavfilter/Makefile
	libavfilter/filtfmts.c
	libavfilter/version.h
	libavformat/mov.c
	libavformat/version.h
	libavutil/Makefile
	libavutil/avutil.h
	libavutil/version.h
	libswscale/swscale.h
	libswscale/x86/swscale_mmx.c
	tests/fate/libavutil.mak
	tests/lavfi-regression.sh
	tools/graph2dot.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-04 21:03:28 +02:00
Martin Storsjö
70a1c8000f vp8: loopfilter >=sse2 functions need aligned stack on x86-32.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-07-04 08:25:50 -07:00
Ronald S. Bultje
723b266d72 dsputilenc: group yasm and inline asm function pointer assignment. 2012-07-04 07:46:27 -07:00
Michael Niedermayer
64b25938e9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dsputilenc_mmx: split assignment of ff_sse16_sse2 to SSE2 section.
  dnxhdenc: add space between function argument type and comment.
  x86: fmtconvert: add special asm for float_to_int16_interleave_misc_*
  attributes: Add a definition of av_always_inline for MSVC
  cmdutils: Pass the actual chosen encoder to filter_codec_opts
  os_support: Add fallback definitions for stat flags
  os_support: Rename the poll fallback function to ff_poll
  network: Check for struct pollfd
  os_support: Don't compare a negative number against socket descriptors
  os_support: Include all the necessary headers for the win32 open function
  x86: vc1: fix and enable optimised loop filter

Conflicts:
	cmdutils.c
	cmdutils.h
	ffmpeg.c
	ffplay.c
	libavformat/os_support.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-30 22:44:18 +02:00
Ronald S. Bultje
ceabc13f12 dsputilenc_mmx: split assignment of ff_sse16_sse2 to SSE2 section. 2012-06-30 09:24:52 -07:00
Ronald S. Bultje
66a02159ea x86: fmtconvert: add special asm for float_to_int16_interleave_misc_*
This gets rid of a variable-length array and a for loop in C code.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-06-30 19:10:36 +03:00
Mans Rullgard
f2fd167835 x86: vc1: fix and enable optimised loop filter
The problem is that the ssse3 psign instruction does the wrong
thing here.  Commit ea60dfe incorrectly removed a macro emulating
this instruction for pre-ssse3 code.  However, the emulation is
incorrect, and the code relies on the behaviour of the macro.
Specifically, the psign sets destination elements to zero where
the corresponding source element is zero, whereas the emulation
only negates destination elements where the source is negative.

Furthermore, the PSIGNW_MMX macro in x86util.asm is totally bogus,
which is why the original VC-1 code had an additional right shift
when using it.  Since the psign instruction cannot be used here,
skip all the macro hell and use the working instruction sequence
directly.

None of this was noticed due a stray return statement in
ff_vc1dsp_init_mmx() which meant that only the mmx version of the
loop filter was ever used (before being removed in ea60dfe).

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-30 00:12:05 +01:00
Michael Niedermayer
87df986dcf Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mss1: validate number of changeable palette entries
  mss1: report palette changed when some additional colours were decoded
  x86: fft: replace call to memcpy by a loop
  udp: Support IGMPv3 source specific multicast and source blocking
  dxva2: include dxva.h if found
  libm: Provide fallback definitions for isnan() and isinf()
  tcp: Pass NULL as hostname to getaddrinfo if the string is empty
  tcp: Set AI_PASSIVE when the socket will be used for listening

Conflicts:
	configure
	libavcodec/mss1.c
	libavformat/udp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-28 01:08:52 +02:00
Christophe Gisquet
a5bfa66df5 x86: fft: replace call to memcpy by a loop
The function call was a mess to handle, and memcpy cannot make
the assumptions we do in the new code.

Tested on an IMC sample: 430c -> 370c.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-27 12:49:33 +01:00
Mans Rullgard
37c3864ef7 x86: fft: elf64: fix PIC build
In a 64-bit PIC build, external functions must be called
through the PLT.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-26 15:57:32 +02:00
Nicolas George
d4c45b8adf Revert "Revert "x86: fft: win64: fix stack alignment for memcpy() call""
This reverts commit f767658414.

The bug it introduces has been fixed.
2012-06-26 15:56:01 +02:00
Nicolas George
91765594dd Revert "Revert "x86: fft: convert sse inline asm to yasm""
This reverts commit fd91a3ec44.

The bug it introduced has been fixed.
2012-06-26 15:55:41 +02:00
Nicolas George
fd91a3ec44 Revert "x86: fft: convert sse inline asm to yasm"
This reverts commit 8299260470.

It breaks shared builds on x86_64.
2012-06-26 13:00:14 +02:00
Nicolas George
f767658414 Revert "x86: fft: win64: fix stack alignment for memcpy() call"
This reverts commit 8725da49a2.

Necerrary to revert 8299260470.
2012-06-26 12:59:48 +02:00
Michael Niedermayer
3b0ad040b3 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  log: Include io.h on windows
  lavr: x86: merge some branches
  x86: cpu: whitespace (mostly) cosmetics
  x86: fft: win64: fix stack alignment for memcpy() call

Conflicts:
	libavutil/log.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-26 01:13:07 +02:00
Mans Rullgard
0595334892 x86: fft: elf64: fix PIC build
In a 64-bit PIC build, external functions must be called
through the PLT.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 22:58:18 +01:00
Michael Niedermayer
a6ff8514a9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  wtv: Check the return value from gmtime
  x86: fft: convert sse inline asm to yasm
  x86: place some inline asm under #if HAVE_INLINE_ASM

Conflicts:
	libavcodec/x86/fft_sse.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-25 16:55:31 +02:00
Mans Rullgard
8725da49a2 x86: fft: win64: fix stack alignment for memcpy() call 2012-06-25 15:10:39 +01:00
Mans Rullgard
8299260470 x86: fft: convert sse inline asm to yasm 2012-06-25 13:31:00 +01:00
Ronald S. Bultje
8123e0901f x86: place some inline asm under #if HAVE_INLINE_ASM
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 13:23:12 +01:00
Michael Niedermayer
244682dd08 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  log: Only include unistd.h if configure found it
  ape: create audio stream before reading tags.
  mov: make a length variable larger.
  image2: Add "start_number" private option to the demuxer
  image2: Add "start_number" private option to the muxer
  avconv: remove a forgotten debugging printf.
  avconv: use more descriptive names for hardcoded filters.
  avconv: remove redundant handling of async.
  doc/filters: fix typo.
  h264: use asm cabac reader under a generic condition

Conflicts:
	ffmpeg.c
	libavformat/img2dec.c
	libavformat/img2enc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-24 21:34:54 +02:00
Michael Niedermayer
1c60088885 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: Only use optimizations with cmov if the CPU supports the instruction
  x86: Add CPU flag for the i686 cmov instruction
  x86: remove unused inline asm macros from dsputil_mmx.h
  x86: move some inline asm macros to the only places they are used
  lavfi: Add the af_channelmap audio channel mapping filter.
  lavfi: add join audio filter.
  lavfi: allow audio filters to request a given number of samples.
  lavfi: support automatically inserting the fifo filter when needed.
  lavfi/audio: eliminate ff_default_filter_samples().

Conflicts:
	Changelog
	libavcodec/x86/h264dsp_mmx.c
	libavfilter/Makefile
	libavfilter/allfilters.c
	libavfilter/avfilter.h
	libavfilter/avfiltergraph.c
	libavfilter/version.h
	libavutil/x86/cpu.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-24 02:09:53 +02:00
Mans Rullgard
0b6f973635 h264: use asm cabac reader under a generic condition
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 22:14:21 +01:00
Diego Biurrun
fe07c9c6b5 x86: Only use optimizations with cmov if the CPU supports the instruction 2012-06-23 16:21:50 +02:00
Mans Rullgard
29686d6ea3 x86: remove unused inline asm macros from dsputil_mmx.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 14:14:06 +01:00
Mans Rullgard
685f5438bb x86: move some inline asm macros to the only places they are used
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 14:14:06 +01:00
Michael Niedermayer
e847f41285 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libspeexenc: add supported sample rates and channel layouts.
  Replace usleep() calls with av_usleep()
  lavu: add av_usleep() function
  utvideo: mark interlaced frames as such
  utvideo: Fix interlaced prediction for RGB utvideo.
  cosmetics: do not use full path for local headers
  lavu/file: include unistd.h only when available
  configure: check for unistd.h
  log: include unistd.h only when needed
  lavf: include libavutil/time.h instead of redeclaring av_gettime()

Conflicts:
	configure
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/utvideo.c
	libavutil/avutil.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-22 22:34:02 +02:00
Michael Niedermayer
fba18ef8cc x86/dsputil_mmx: support 4 sample edges
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-22 17:23:09 +02:00
Diego Biurrun
a5a93fa8f5 cosmetics: do not use full path for local headers 2012-06-22 10:49:40 +02:00
Michael Niedermayer
82edf6727f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: add x86-optimized functions for mixing 1-to-2 s16p with flt coeffs
  lavr: add x86-optimized functions for mixing 1-to-2 fltp with flt coeffs
  Add Dolby/DPLII downmix support to libavresample
  vorbisdec: replace div/mod in loop with a counter
  fate: vorbis: add 5.1 surround test
  rtpenc: Allow requesting H264 RTP packetization mode 0
  configure: Sort the library listings in the help text alphabetically
  dwt: remove variable-length arrays
  RTMPT protocol support
  http: Properly handle chunked transfer-encoding for replies to post data
  http: Fail reading if the connection has gone away
  amr: Mark an array const
  amr: More space cleanup
  rtpenc: Fix memory leaks in the muxer open function

Conflicts:
	Changelog
	configure
	doc/APIchanges
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-18 20:07:00 +02:00
Ronald S. Bultje
d9669eab0b dwt: remove variable-length arrays
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-17 23:20:10 +01:00
Michael Niedermayer
9946a6aa55 diracdsp: try to fix segfault
This might fix Ticket1412

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-16 23:16:54 +02:00
Michael Niedermayer
3b196bb737 libavcodec/x86/rv40dsp_init.c: add missing HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:26:24 +02:00
Michael Niedermayer
915ec91e6b libavcodec/x86/h264dsp_mmx.c: add forgotten HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:23:44 +02:00
Michael Niedermayer
63bfee8796 libavcodec/x86/dwt.c: move some missed things under HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:20:27 +02:00
Michael Niedermayer
7e22514d98 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  float_dsp: ppc: add a separate header for Altivec function prototypes
  ARM: fix float_dsp breakage from d5a7229
  Add a float DSP framework to libavutil
  PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
  ARM: Move asm.S from libavcodec to libavutil
  vc1dsp: mark put/avg_vc1_mspel_mc() always_inline

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-08 23:59:09 +02:00
Justin Ruggles
d5a7229ba4 Add a float DSP framework to libavutil
Move vector_fmul() from DSPContext to AVFloatDSPContext.
2012-06-08 13:14:38 -04:00
Michael Niedermayer
b0387edd5e Merge commit 'f919cc7df6ab844bc12f89fe7bef4fb915a47725'
* commit 'f919cc7df6ab844bc12f89fe7bef4fb915a47725':
  fate: fix acodec/vsynth tests for make 3.81
  pcm_mpeg: fix number of consumed bytes to include the header.
  avfilter: include required header file avfilter.h in video.h
  x86: Avoid movs on BUTTERFLYPS when in AVX mode
  x86: use new schema for ASM macros
  fate: convert codec-regression.sh to makefile rules
  fate: allow tests to specify unit size for psnr comparison
  fate: teach videogen/rotozoom to output a single raw video stream
  http: Add support for reusing the http socket for subsequent requests
  http: Add support for using persistent connections

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-30 01:40:54 +02:00
Vitor Sessak
bac0729d9e x86: use new schema for ASM macros
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-05-29 14:49:45 +02:00
Vitor Sessak
2fd5e70869 x86: use new schema for ASM macros
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-27 15:42:45 +02:00
Carl Eugen Hoyos
001d9d5e93 Fix compilation with --disable-everything. 2012-05-24 08:08:31 +02:00
Michael Niedermayer
d0ad91c258 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  os_support: Define SHUT_RD, SHUT_WR and SHUT_RDWR on OS/2
  http: Add support for reading http POST reply headers
  http: Add http_shutdown() for ending writing of posts
  tcp: Allow signalling end of reading/writing
  avio: Add a function for signalling end of reading/writing
  lavfi: fix comment, audio is supported now.
  lavfi: fix incorrect comment.
  lavfi: remove avfilter_null_* from public API on next bump.
  lavfi: remove avfilter_default_* from public API on next bump.
  lavfi: deprecate default config_props() callback and refactor avfilter_config_links()
  avfiltergraph: smarter sample format selection.
  avconv: rename transcode_audio/video to decode_audio/video.
  asyncts: reset delta to 0 when it's not used.
  x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code.
  dwt: return errors from ff_slice_buffer_init()

Conflicts:
	ffmpeg.c
	libavfilter/avfilter.c
	libavfilter/avfilter.h
	libavfilter/formats.c
	libavfilter/version.h
	libavfilter/vf_blackframe.c
	libavfilter/vf_drawtext.c
	libavfilter/vf_fade.c
	libavfilter/vf_format.c
	libavfilter/vf_showinfo.c
	libavfilter/video.c
	libavfilter/video.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-23 21:48:31 +02:00
Michael Niedermayer
ea5dab58e0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dwt: check malloc calls
  ppc: Drop unused header regs.h
  af_resample: remove an extra space in the log output
  Convert vector_fmul range of functions to YASM and add AVX versions
  lavfi: add an audio split filter
  lavfi: rename vf_split.c to split.c

Conflicts:
	doc/filters.texi
	libavcodec/ppc/regs.h
	libavfilter/Makefile
	libavfilter/allfilters.c
	libavfilter/f_split.c
	libavfilter/split.c
	libavfilter/version.h
	libavfilter/vf_split.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-22 23:42:17 +02:00
Justin Ruggles
713548cbad x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code.
This is needed for older versions of yasm/nasm that do not support AVX.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-05-22 20:46:02 +02:00
Kieran Kunhya
5ff01259a8 Convert vector_fmul range of functions to YASM and add AVX versions
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-05-21 17:13:05 -04:00
Michael Niedermayer
703e920bb7 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  fate: Work around non-standard wc implementations at more places
  fate: work around non-standard wc implementations
  x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions.
  ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16
  fate: use standard diff options
  tta: Fix comment about channel number; TTA supports >2 channels.
  avfilter: Move ff_get_ref_perms_string() to where it is used.
  build: Add 'check' target to run all compile and test targets.
  indeo3: validate new frame size before resetting decoder
  indeo3: when freeing buffers, set pointers referencing them to NULL as well
  indeo3: initialise pixel planes on allocation
  indeo3: ensure that decoded cell data is in 7-bit range as presumed by decoder
  fate: rename psx-str-v3-mdec to mdec-v3
  fate: convert psx-str to a demuxer test
  lavf: add mdec to is_intra_only() list

Conflicts:
	doc/developer.texi
	libavcodec/indeo3.c
	libavfilter/video.c
	libavformat/utils.c
	tests/fate/demux.mak
	tests/fate/video.mak
	tests/lavf-regression.sh
	tests/ref/vsynth1/cljr
	tests/ref/vsynth1/ffvhuff
	tests/ref/vsynth2/cljr
	tests/ref/vsynth2/ffvhuff

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-16 22:32:05 +02:00
Michael Kostylev
6797d1948b x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions. 2012-05-15 23:54:08 +02:00
Justin Ruggles
95a98ab3f0 ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16
Simplifies the code by using cpuflags and a new macro.
Also fixes the invalid use of the MMX2 pshufw operation in the MMX-only
function.
2012-05-15 15:23:59 -04:00
Michael Niedermayer
7e944159c6 Merge remote-tracking branch 'qatar/master'
* qatar/master: (25 commits)
  vcr1: Add vcr1_ prefixes to all static functions with generic names.
  vcr1: Fix return type of common_init to match the function pointer signature.
  vcr1enc: Replace obsolete get_bit_count by put_bits_count/flush_put_bits.
  motion-test: remove disabled code
  gxfenc: remove disabled half-implemented MJPEG tag
  x86: use more standard construct for setting ASM functions in FFT code
  fate: westwood-aud: disable decoding
  fate: caf: disable decoding
  fate: film-cvid: drop pcm audio and rename test
  fate: d-cinema-demux: drop unnecessary flags
  fate: split off dpcm-interplay from interplay-mve tests
  fate: rename funcom-iss to adpcm-ima-iss
  fate: rename cryo-apc to adpcm-ima-apc
  fate: rename adpcm-psx-str-v3 to adpcm-xa
  fate: split off adpcm-ms-mono test from dxa-feeble
  fate: split off adpcm-ima-ws test from vqa-cc
  fate: add adpcm-ima-smjpeg test
  fate: split off adpcm-ima-amv from amv test
  fate: separate bmv audio and video tests
  fate: separate delphine-cin audio and video tests
  ...

Conflicts:
	doc/platform.texi
	libavcodec/vcr1.c
	tests/fate/audio.mak
	tests/fate/demux.mak
	tests/fate/video.mak
	tests/ref/fate/ea-mad-pcm-planar
	tests/ref/fate/interplay-mve-16bit
	tests/ref/fate/interplay-mve-8bit
	tests/ref/fate/mtv
	tests/ref/fate/qtrle-1bit
	tests/ref/fate/qtrle-2bit
	tests/ref/fate/truemotion1-15
	tests/ref/fate/truemotion1-24
	tests/ref/fate/vqa-cc

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-14 20:17:24 +02:00