1108 Commits

Author SHA1 Message Date
Michael Niedermayer
bec180e112 Merge commit 'a1bcc76e6036e78f25cbb7323c145056cfca9d93'
* commit 'a1bcc76e6036e78f25cbb7323c145056cfca9d93': (21 commits)
  cmdutils: fix a memleak when specifying an option twice.
  x86: mpegvideo: more sensible names for optimization file and init function
  x86: mpegvideoenc: Split optimizations off into a separate file
  dnxhdenc: x86: more sensible names for optimization file and init function
  svq1/svq3: Move common code out of SVQ1 decoder-specific file
  dirac: add Comments and references to the standard
  lavr: x86: optimized 6-channel flt to fltp conversion
  lavr: x86: optimized 2-channel flt to fltp conversion
  lavr: x86: optimized 6-channel flt to s16p conversion
  lavr: x86: optimized 2-channel flt to s16p conversion
  lavr: x86: optimized 6-channel s16 to fltp conversion
  lavr: x86: optimized 2-channel s16 to fltp conversion
  lavr: x86: optimized 6-channel s16 to s16p conversion
  lavr: x86: optimized 2-channel s16 to s16p conversion
  lavr: x86: optimized 2-channel fltp to flt conversion
  lavr: x86: optimized 6-channel fltp to s16 conversion
  lavr: x86: optimized 2-channel fltp to s16 conversion
  lavr: x86: optimized 6-channel s16p to flt conversion
  lavr: x86: optimized 2-channel s16p to flt conversion
  lavr: x86: optimized 6-channel s16p to s16 conversion
  ...

Conflicts:
	libavcodec/dirac.c
	libavcodec/mpegvideo.h
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-24 14:30:40 +02:00
Diego Biurrun
dc40285427 x86: mpegvideo: more sensible names for optimization file and init function 2012-08-24 02:23:16 +02:00
Diego Biurrun
d211547ddd x86: mpegvideoenc: Split optimizations off into a separate file 2012-08-24 02:23:16 +02:00
Diego Biurrun
26ce9aec03 dnxhdenc: x86: more sensible names for optimization file and init function 2012-08-24 02:23:15 +02:00
Michael Niedermayer
3699960690 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  build: x86: Only compile mpegvideo optimizations when necessary
  configure: Drop fastdiv option
  build: Make the E-AC-3 encoder select the AC-3 encoder
  fate: flac: Only run tests requiring samples when samples are available

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-22 14:37:03 +02:00
Diego Biurrun
6fa488678f build: x86: Only compile mpegvideo optimizations when necessary 2012-08-22 01:06:33 +02:00
Michael Niedermayer
bb46b9a36f vc1dsp_mmx: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:29:29 +02:00
Michael Niedermayer
9bfeaf6f10 simple_idct_mmx: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:28:57 +02:00
Michael Niedermayer
64b23d7dec x86/motion_est_mmx: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:28:37 +02:00
Michael Niedermayer
191ffc7fe7 x86/mlpdsp: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:28:13 +02:00
Michael Niedermayer
501b681d95 lpc_mmx: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:26:41 +02:00
Michael Niedermayer
7cb9f1a8d0 idct_sse2_xvid: remove libavutil/internal.h include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 17:26:06 +02:00
Michael Niedermayer
c581cb4e4f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Fix even more missing includes after the common.h removal
  build: Factor out rangecoder dependencies to CONFIG_RANGECODER
  build: Factor out error resilience dependencies to CONFIG_ERROR_RESILIENCE
  x86: avcodec: Consistently name all init files
  Add more missing includes after removing the implicit common.h
  Add some more missing includes after removing the implicit common.h
  Don't include common.h from avutil.h
  rtmp: Automatically compute the hash for SWFVerification

Conflicts:
	configure
	doc/APIchanges
	doc/examples/decoding_encoding.c
	libavcodec/Makefile
	libavcodec/assdec.c
	libavcodec/audio_frame_queue.c
	libavcodec/avpacket.c
	libavcodec/dv_profile.c
	libavcodec/dwt.c
	libavcodec/libtheoraenc.c
	libavcodec/rawdec.c
	libavcodec/rv40dsp.c
	libavcodec/tiff.c
	libavcodec/tiffenc.c
	libavcodec/v210dec.h
	libavcodec/vc1dsp.c
	libavcodec/x86/Makefile
	libavfilter/asrc_anullsrc.c
	libavfilter/avfilter.c
	libavfilter/buffer.c
	libavfilter/formats.c
	libavfilter/vf_ass.c
	libavfilter/vf_drawtext.c
	libavfilter/vf_fade.c
	libavfilter/vf_select.c
	libavfilter/video.c
	libavfilter/vsrc_testsrc.c
	libavformat/version.h
	libavutil/audioconvert.c
	libavutil/error.h
	libavutil/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 16:20:30 +02:00
Diego Biurrun
6961bdface x86: avcodec: Consistently name all init files 2012-08-16 11:05:38 +02:00
Martin Storsjö
1d9c2dc89a Don't include common.h from avutil.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-15 22:32:06 +03:00
Michael Niedermayer
9e89bc37ed Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rtmp: Add support for SWFVerification
  api-example: use new video encoding API.
  x86: avcodec: Appropriately name files containing only init functions
  mpegvideo_mmx_template: drop some commented-out cruft
  libavresample: add mix level normalization option
  w32pthreads: Add missing #includes to make header compile standalone
  rtmp: Gracefully ignore _checkbw errors by tracking them
  rtmp: Do not send _checkbw calls as notifications
  prores: interlaced ProRes encoding

Conflicts:
	doc/examples/decoding_encoding.c
	libavcodec/proresenc_kostya.c
	libavcodec/w32pthreads.h
	libavcodec/x86/Makefile
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-15 15:55:24 +02:00
Diego Biurrun
29cfdd3767 x86: avcodec: Appropriately name files containing only init functions 2012-08-15 03:24:08 +02:00
Diego Biurrun
be12958937 mpegvideo_mmx_template: drop some commented-out cruft 2012-08-15 03:24:07 +02:00
Michael Niedermayer
7427d1ca4a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  g723.1: simplify scale_vector()
  g723.1: simplify normalize_bits()
  vda: cosmetics: fix Doxygen comment formatting
  vda: better frame allocation
  vda: Merge implementation into one file
  vda: support synchronous decoding
  vda: Reuse the bitstream buffer and reallocate it only if needed
  build: Factor out mpegvideo encoding dependencies to CONFIG_MPEGVIDEOENC
  avprobe: Include libm.h for the log2 fallback
  proresenc: use the edge emulation buffer
  rtmp: handle bytes read reports
  configure: Fix typo in mpeg2video/svq1 decoder dependency declaration
  Use log2(x) instead of log(x) / log(2)
  x86: swscale: fix fragile memory accesses
  x86: swscale: remove disabled code
  x86: yadif: fix asm with suncc
  x86: cabac: allow building with suncc
  x86: mlpdsp: avoid taking address of void
  ARM: intmath: use native-size return types for clipping functions

Conflicts:
	configure
	ffprobe.c
	libavcodec/Makefile
	libavcodec/g723_1.c
	libavcodec/v210dec.h
	libavcodec/vda.h
	libavcodec/vda_h264.c
	libavcodec/x86/cabac.h
	libavfilter/x86/yadif_template.c
	libswscale/x86/rgb2rgb_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-14 15:34:39 +02:00
Mans Rullgard
8ec0204ee4 x86: cabac: allow building with suncc
This fixes two issues preventing suncc from building this code.

The undocumented 'a' operand modifier, causing gcc to omit a $ in
front of immediate operands (as required in addresses), is not
supported by suncc.  Luckily, the also undocumented 'c' modifier
has the same effect and is supported.

On some asm statements with a large number of operands, suncc for no
obvious reason fails to correctly substitute some of the operands.
Fortunately, some of the operands in these statements are plain
numbers which can be inserted directly into the code block instead
of passed as operands.

With these changes, the code builds correctly with both gcc and
suncc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
Mans Rullgard
c8252e80eb x86: mlpdsp: avoid taking address of void
This code contains a C array of addresses of labels defined in
inline asm.  To do this, the names must be declared as external
in C.  The declared type does not matter since only the address is
used, and for some reason, the author of the code used the 'void'
type despite taking the address of a void expression being invalid.

Changing the type to char, a reasonable choice since the alignment
of the code labels cannot be known or guaranteed, eliminates gcc
warnings and allows building with suncc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
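
The declaration change described in the mlpdsp commit above can be sketched as follows. This is only an illustration: the names `sketch_label_0`, `sketch_label_1`, `sketch_table` and `sketch_lookup` are hypothetical, and the labels are defined in C here (rather than in inline asm, as in the real code) so that the sketch links on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for code labels. In the real code these would be defined in
 * inline asm; they are defined in C here purely so the sketch links. */
char sketch_label_0;
char sketch_label_1;

/* Only the address of each label is used, so the declared type is mostly
 * arbitrary. char is a valid object type with no alignment requirement,
 * whereas taking the address of a 'void' expression is invalid C. */
extern char sketch_label_0;
extern char sketch_label_1;

/* Table of label addresses, analogous to the C array of asm labels. */
static const void *const sketch_table[] = {
    &sketch_label_0,
    &sketch_label_1,
};

/* Returns the address stored for entry i, or NULL if i is out of range. */
static const void *sketch_lookup(size_t i)
{
    size_t count = sizeof(sketch_table) / sizeof(sketch_table[0]);
    return i < count ? sketch_table[i] : NULL;
}
```

Calling `sketch_lookup(0)` returns the address of `sketch_label_0`; the table never dereferences the labels, which is why the declared type only has to be something whose address may legally be taken.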
Michael Niedermayer
d8c3170c9f Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  g723.1: do not pass large structs by value
  g723.1: do not bounce intermediate values via memory
  g723.1: declare a variable in the block it is used
  g723.1: avoid saving/restoring excitation
  g723.1: avoid unnecessary memcpy() in residual_interp()
  g723.1: make postfilter write directly to output buffer
  g723.1: drop unnecessary variable buf_ptr in formant_postfilter()
  g723.1: make scale_vector() output to a separate buffer
  g723.1: make autocorr_max() work on an arbitrary buffer
  g723.1: do not needlessly use int64_t
  g723.1: use saturating addition functions
  g723.1: optimise scale_vector()
  g723.1: remove useless uses of MUL64()
  g723.1: remove unnecessary argument 'shift' from dot_product()
  g723.1: deobfuscate "(x << 4) - x" to "15 * x"
  celp: optimise ff_celp_lp_synthesis_filter()
  libavutil: add saturating addition functions
  cllc: Implement ARGB support
  cllc: Add support for QRGB
  cllc: Rename some funcs to represent what they actually do
  ...

Conflicts:
	LICENSE
	libavcodec/g723_1.c
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-13 14:38:43 +02:00
Diego Biurrun
3b9e832e17 x86: Drop silly "_yasm" suffixes from filenames 2012-08-12 17:13:05 +02:00
Michael Niedermayer
9f088a1ed4 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpegvideo: reduce excessive inlining of mpeg_motion()
  mpegvideo: convert mpegvideo_common.h to a .c file
  build: factor out mpegvideo.o dependencies to CONFIG_MPEGVIDEO
  Move MASK_ABS macro to libavcodec/mathops.h
  x86: move MANGLE() and related macros to libavutil/x86/asm.h
  x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
  aacdec: Don't fall back to the old output configuration when no old configuration is present.
  rtmp: Add message tracking
  rtsp: Support mpegts in raw udp packets
  rtsp: Support receiving plain data over UDP without any RTP encapsulation
  rtpdec: Remove an unused include
  rtpenc: Remove an av_abort() that depends on user-supplied data
  vsrc_movie: discourage its use with avconv.
  avconv: allow no input files.
  avconv: prevent invalid reads in transcode_init()
  avconv: rename OutputStream.is_past_recording_time to finished.

Conflicts:
	configure
	doc/filters.texi
	ffmpeg.c
	ffmpeg.h
	libavcodec/Makefile
	libavcodec/aacdec.c
	libavcodec/mpegvideo.c
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-09 19:31:56 +02:00
Mans Rullgard
d7a4f8f8b9 Move MASK_ABS macro to libavcodec/mathops.h
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.

Two small tweaks to the macro are made:

- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
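
Why dropping the trailing semicolon from a statement-like macro matters can be shown with a portable sketch. The macro body below is an assumption about the generic C form (it is not the x86 asm variant), it assumes a 32-bit `int` and an arithmetic right shift for negative values, and `abs_via_mask` is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Portable sketch of a MASK_ABS-style macro: mask becomes 0 or -1 depending
 * on the sign of level, and level is replaced by its absolute value.
 * Assumes 32-bit int and arithmetic right shift of negative values
 * (implementation-defined in C). No semicolon follows while (0). */
#define MASK_ABS(mask, level) do {          \
        mask  = level >> 31;                \
        level = (level ^ mask) - mask;      \
    } while (0)

/* Hypothetical helper: uses the macro in an if/else. If the macro carried
 * its own trailing semicolon, the caller's ';' would add an empty statement
 * and the 'else' below would no longer attach to the 'if'. */
static int abs_via_mask(int level, int *mask_out)
{
    int mask;

    if (level < 0)
        MASK_ABS(mask, level);
    else
        MASK_ABS(mask, level);

    *mask_out = mask;
    return level;
}
```

With these assumptions, `abs_via_mask(-5, &m)` yields 5 with `m == -1`, and `abs_via_mask(7, &m)` yields 7 with `m == 0`.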
Mans Rullgard
c318626ce2 x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
This puts x86-specific things in the x86/ subdirectory where they
belong.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Michael Niedermayer
11a1033c9f Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  build: cosmetics: Reorder some lists in a more logical fashion
  x86: pngdsp: Fix assembly for OS/2
  fate: add test for RTjpeg in nuv with frameheader
  rtmp: send check_bw as notification
  g723_1: clip argument for 15-bit version of normalize_bits()
  g723_1: use all LPC vectors in formant postfilter
  id3v2: Support v2.2 PIC
  avplay: fix build with lavfi disabled.
  avconv: split configuring filter configuration to a separate file.
  avconv: split option parsing into a separate file.
  mpc8: do not leave padding after last frame in buffer for the next decode call
  mpegaudioenc: list supported channel layouts.
  mpegaudiodec: don't print an error on > 1 frame in a packet.
  api-example: update to new audio encoding API.
  configure: add --enable/disable-random option
  doc: cygwin: Update list of FATE package requirements
  build: Remove all installed headers and header directories on uninstall
  build: change checkheaders to use regular build rules
  rtmp: Add a new option 'rtmp_subscribe'
  rtmp: Add support for subscribing live streams
  ...

Conflicts:
	Makefile
	common.mak
	configure
	doc/examples/decoding_encoding.c
	ffmpeg.c
	libavcodec/g723_1.c
	libavcodec/mpegaudiodec.c
	libavcodec/x86/pngdsp.asm
	libavformat/version.h
	library.mak
	tests/fate/video.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-09 00:51:02 +02:00
Dave Yeo
197439c1ef x86: pngdsp: Fix assembly for OS/2
The a.out object format does not allow aligning sections.
On OS/2 LD aligns sections to 16 bytes.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-08 15:45:09 +02:00
Michael Niedermayer
2fc7c818cb Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: fix build with nasm 2.08
  x86: use nop cpu directives only if supported
  x86: fix rNmp macros with nasm
  build: add trailing / to yasm/nasm -I flags
  x86: use 32-bit source registers with movd instruction
  x86: add colons after labels

Conflicts:
	Makefile
	libavutil/x86/x86inc.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-07 23:04:55 +02:00
Michael Niedermayer
7a72695c05 Merge commit '36ef5369ee9b336febc2c270f8718cec4476cb85'
* commit '36ef5369ee9b336febc2c270f8718cec4476cb85':
  Replace all CODEC_ID_* with AV_CODEC_ID_*
  lavc: add AV prefix to codec ids.

Conflicts:
	doc/APIchanges
	doc/examples/decoding_encoding.c
	doc/examples/muxing.c
	ffmpeg.c
	ffprobe.c
	ffserver.c
	libavcodec/8svx.c
	libavcodec/avcodec.h
	libavcodec/dnxhd_parser.c
	libavcodec/dvdsubdec.c
	libavcodec/error_resilience.c
	libavcodec/h263dec.c
	libavcodec/libvorbisenc.c
	libavcodec/mjpeg_parser.c
	libavcodec/mjpegenc.c
	libavcodec/mpeg12.c
	libavcodec/mpeg4videodec.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo_enc.c
	libavcodec/pcm.c
	libavcodec/r210dec.c
	libavcodec/utils.c
	libavcodec/v210dec.c
	libavcodec/version.h
	libavdevice/alsa-audio-dec.c
	libavdevice/bktr.c
	libavdevice/v4l2.c
	libavformat/asfdec.c
	libavformat/asfenc.c
	libavformat/avformat.h
	libavformat/avidec.c
	libavformat/caf.c
	libavformat/electronicarts.c
	libavformat/flacdec.c
	libavformat/flvdec.c
	libavformat/flvenc.c
	libavformat/framecrcenc.c
	libavformat/img2.c
	libavformat/img2dec.c
	libavformat/img2enc.c
	libavformat/ipmovie.c
	libavformat/isom.c
	libavformat/matroska.c
	libavformat/matroskadec.c
	libavformat/matroskaenc.c
	libavformat/mov.c
	libavformat/movenc.c
	libavformat/mp3dec.c
	libavformat/mpeg.c
	libavformat/mpegts.c
	libavformat/mxf.c
	libavformat/mxfdec.c
	libavformat/mxfenc.c
	libavformat/nsvdec.c
	libavformat/nut.c
	libavformat/oggenc.c
	libavformat/pmpdec.c
	libavformat/rawdec.c
	libavformat/rawenc.c
	libavformat/riff.c
	libavformat/sdp.c
	libavformat/utils.c
	libavformat/vocenc.c
	libavformat/wtv.c
	libavformat/xmv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-07 22:45:46 +02:00
Mans Rullgard
2b140a3d09 x86: use 32-bit source registers with movd instruction
yasm tolerates mismatch between movd/movq and source register size,
adjusting the instruction according to the register.  nasm is more
strict.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:21:20 +01:00
Mans Rullgard
a3df4781f4 x86: add colons after labels
nasm prints a warning if the colon is missing.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:20:56 +01:00
Anton Khirnov
36ef5369ee Replace all CODEC_ID_* with AV_CODEC_ID_* 2012-08-07 16:00:24 +02:00
Michael Niedermayer
b4780d03d0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2
  rational: add av_inv_q() returning the inverse of an AVRational
  dpx: Make start offset unsigned
  lavfi: properly signal out-of-memory error in ff_filter_samples
  cosmetics: Fix a few switched periods and linebreaks
  zerocodec: Fix memleak in decode_frame
  zerocodec: Cosmetics

Conflicts:
	ffmpeg.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-05 22:17:02 +02:00
Diego Biurrun
2096857551 x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2 2012-08-05 21:40:49 +02:00
Michael Niedermayer
e776ee8f29 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: fix handling of custom mix matrices
  fate: force pix_fmt in lagarith-rgb32 test
  fate: add tests for lagarith lossless video codec.
  ARMv6: vp8: fix stack allocation with Apple's assembler
  ARM: vp56: allow inline asm to build with clang
  fft: 3dnow: fix register name typo in DECL_IMDCT macro
  x86: dct32: port to cpuflags
  x86: build: replace mmx2 by mmxext
  Revert "wmapro: prevent division by zero when sample rate is unspecified"
  wmapro: prevent division by zero when sample rate is unspecified
  lagarith: fix color plane inversion for YUY2 output.
  lagarith: pad RGB buffer by 1 byte.
  dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.

Conflicts:
	doc/APIchanges
	libavcodec/lagarith.c
	libavfilter/x86/gradfun.c
	libavutil/cpu.h
	libavutil/version.h
	libswscale/utils.c
	libswscale/version.h
	libswscale/x86/yuv2rgb.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-04 23:51:43 +02:00
Ronald S. Bultje
4a8143e73c fft: 3dnow: fix register name typo in DECL_IMDCT macro
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-04 00:16:02 +02:00
Michael Niedermayer
a7acab6cda Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vc1dec: Remove separate scaling function for interlaced field MVs
  vc1dec: Invoke edge_emulation regardless of MV precision
  x86: Use consistent 3dnowext function and macro name suffixes
  g723_1: scale output as supposed for the case with postfilter disabled
  g723_1: increase excitation storage by 4
  g723_1: fix upper bound parameter from inverse maximum autocorrelation
  g723_1: make scale_vector() behave like the reference
  g723_1: fix off-by-one error in normalize_bits()
  g723_1: save/restore excitation with offset to store LPC history
  wmapro: prevent division by zero when sample rate is unspecified
  x86: proresdsp: improve SIGNEXTEND macro comments
  x86: h264dsp: K&R formatting cosmetics
  LICENSE: Document all GPL files

Conflicts:
	libavcodec/g723_1.c
	libavcodec/wmaprodec.c
	libavcodec/x86/h264dsp_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-03 23:13:06 +02:00
Diego Biurrun
0c3ff1982c x86: dct32: port to cpuflags 2012-08-03 22:51:06 +02:00
Diego Biurrun
239fdf1b4a x86: build: replace mmx2 by mmxext
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
2012-08-03 22:51:05 +02:00
Ronald S. Bultje
da6505ad2f dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
This makes add_hfyu_left_prediction_sse4() handle sources that are not
16-byte aligned in its own function rather than by proxying the call to
add_hfyu_left_prediction_ssse3(). This fixes a crash on Win64, since the
sse4 version clobbers xmm6, but the ssse3 version (which uses MMX regs)
does not restore it, thus leading to XMM clobbering and RSP being off.

Fixes bug 342.
2012-08-03 11:09:14 -07:00
Diego Biurrun
ca844b7be9 x86: Use consistent 3dnowext function and macro name suffixes
Currently there is a wild mix of 3dn2/3dnow2/3dnowext.  Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
2012-08-03 14:00:47 +02:00
Michael Niedermayer
9c6e23f5d2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: fft: fix imdct_half() for AVX
  rtmppkt: Add missing libavcodec/bytestream.h include.
  rtmp: add functions for reading AMF values
  vc1dec: remove useless #include simple_idct.h
  dct-test: always link with aandcttab.o
  vp8: pack struct VP8ThreadData more efficiently
  x86: remove libmpeg2 mmx(ext) idct functions
  eamad: Use dsputils instead of a custom bswap16_buf
  Canopus Lossless decoder

Conflicts:
	Changelog
	LICENSE
	libavcodec/avcodec.h
	libavcodec/cllc.c
	libavcodec/eamad.c
	libavcodec/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-02 23:34:01 +02:00
Diego Biurrun
03737412a3 x86: proresdsp: improve SIGNEXTEND macro comments 2012-08-02 22:30:44 +02:00
Ronald S. Bultje
9f14cd91b5 fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-08-02 22:14:40 +02:00
Diego Biurrun
81905088a1 x86: h264dsp: K&R formatting cosmetics 2012-08-02 20:20:21 +02:00
Ronald S. Bultje
c728518b3c x86: fft: fix imdct_half() for AVX
Some calculations were changed in b6a3849 to use mmsize, which was not correct
for the AVX version, which uses INIT_YMM and therefore has mmsize == 32.

Fixes Bug 341.

Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-08-02 13:40:11 -04:00
Mans Rullgard
ec7c501ed5 x86: remove libmpeg2 mmx(ext) idct functions
These functions are not faster than other mmx implementations on
any hardware I have been able to test on, and they are horribly
inaccurate.  There is thus no reason to ever use them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Michael Niedermayer
ec7ecb8811 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dca: Switch dca_sample_rates to avpriv_ prefix; it is used across libs
  ARM: use =const syntax instead of explicit literal pools
  ARM: use standard syntax for all LDRD/STRD instructions
  fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
  dct-test: allow to compile without HAVE_INLINE_ASM.
  x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM.
  dca: Move tables used outside of dcadec.c to a separate file.
  dca: Rename dca.c ---> dcadec.c
  x86: h264dsp: Remove unused variable ff_pb_3_1
  apetag: change a forgotten return to return 0

Conflicts:
	libavcodec/Makefile
	libavcodec/dca.c
	libavcodec/x86/fft_3dn.c
	libavcodec/x86/fft_3dn2.c
	libavcodec/x86/fft_mmx.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-01 23:33:06 +02:00
Ronald S. Bultje
b6a3849adb fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-07-31 21:20:47 -07:00
Ronald S. Bultje
53dfaedc01 x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM. 2012-07-31 20:28:52 -07:00
Diego Biurrun
6376a3ad24 x86: h264dsp: Remove unused variable ff_pb_3_1 2012-08-01 00:17:16 +02:00
Michael Niedermayer
d1dad7c824 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpc8: return more meaningful error codes.
  mpc: return more meaningful error codes.
  wv,mpc8: don't return apetag data in packets.
  rtmp: do not warn about receiving metadata packets
  x86: h264dsp: Adjust YASM #ifdefs
  x86: yadif: Mark mmxext optimizations as such
  h264: convert loop filter strength dsp function to yasm.
  Improve descriptiveness of a number of codec and container long names

Conflicts:
	libavcodec/flvdec.c
	libavcodec/libopenjpegdec.c
	libavformat/apetag.c
	libavformat/mp3dec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-31 22:41:00 +02:00
Diego Biurrun
8728b381cb x86: h264dsp: Adjust YASM #ifdefs
This fixes compilation with YASM disabled.
2012-07-31 13:54:07 +02:00
Ronald S. Bultje
b829b4ce29 h264: convert loop filter strength dsp function to yasm.
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles
faster (201->193) on x86-32.
2012-07-30 19:39:47 -07:00
Michael Niedermayer
706bd8ea19 Merge remote-tracking branch 'qatar/master'
* qatar/master: (35 commits)
  h264_idct_10bit: port x86 assembly to cpuflags.
  x86inc: clip num_args to 7 on x86-32.
  x86inc: sync to latest version from x264.
  fft: rename "z" to "zc" to prevent name collision.
  wv: return meaningful error codes.
  wv: return AVERROR_EOF on EOF, not EIO.
  mp3dec: forward errors for av_get_packet().
  mp3dec: remove a pointless local variable.
  mp3dec: remove commented out cruft.
  lavfi: bump minor to mark stabilizing the ABI.
  FATE: add tests for yadif.
  FATE: add a test for delogo video filter.
  FATE: add a test for amix audio filter.
  audiogen: allow specifying random seed as a commandline parameter.
  vc1dec: Override invalid macroblock quantizer
  vc1: avoid reading beyond the last line in vc1_draw_sprites()
  vc1dec: check that coded slice positions and interlacing match.
  vc1dec: Do not ignore ff_vc1_parse_frame_header_adv return value
  configure: Move parts that should not be user-selectable to CONFIG_EXTRA
  lavf: remove commented out cruft in avformat_find_stream_info()
  ...

Conflicts:
	Makefile
	configure
	libavcodec/vc1dec.c
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264dsp_mmx.c
	libavfilter/version.h
	libavformat/mp3dec.c
	libavformat/utils.c
	libavformat/wv.c
	libavutil/x86/x86inc.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-29 02:16:26 +02:00
Ronald S. Bultje
c83f44dba1 h264_idct_10bit: port x86 assembly to cpuflags. 2012-07-28 08:29:45 -07:00
Ronald S. Bultje
b3c5ae5607 fft: rename "z" to "zc" to prevent name collision.
Without this, cglobal will expand "z" to "zh" to access the high byte
in a register's word, which causes a name collision with the ZH(x) macro
further up in this file.
2012-07-28 08:29:44 -07:00
Ronald S. Bultje
4d777eedfd vp3: don't compile mmx IDCT functions on x86-64.
64-bit CPUs always have SSE2, and a SSE2 version exists, thus the MMX
version will never be used.
2012-07-27 20:12:30 -07:00
Ronald S. Bultje
a5bbb1242c h264_loopfilter: port x86 simd to cpuflags. 2012-07-27 20:12:11 -07:00
Ronald S. Bultje
d07ff3cd5a h264_chromamc_10bit: port x86 simd to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
4a26fdd852 vp3: port x86 SIMD to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
76888c64b0 rv34: port x86 SIMD to cpuflags. 2012-07-27 15:13:26 -07:00
Michael Niedermayer
c6963a220d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  proresdsp: port x86 assembly to cpuflags.
  lavr: x86: improve non-SSE4 version of S16_TO_S32_SX macro
  lavfi: better channel layout negotiation
  alac: check for truncated packets
  alac: reverse lpc coeff order, simplify filter
  lavr: add x86-optimized mixing functions
  x86: add support for fmaddps fma4 instruction with abstraction to avx/sse
  tscc2: fix typo in array index
  build: use COMPILE template for HOSTOBJS
  build: do full flag handling for all compiler-type tools
  eval: fix printing of NaN in eval fate test.
  build: Rename aandct component to more descriptive aandcttables
  mpegaudio: bury inline asm under HAVE_INLINE_ASM.
  x86inc: automatically insert vzeroupper for YMM functions.
  rtmp: Check the buffer length of ping packets
  rtmp: Allow having more unknown data at the end of a chunk size packet without failing
  rtmp: Prevent reading outside of an allocate buffer when receiving server bandwidth packets

Conflicts:
	Makefile
	configure
	libavcodec/x86/proresdsp.asm
	libavutil/eval.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-27 23:42:19 +02:00
Ronald S. Bultje
158744a4cd vp56: only compile MMX SIMD on x86-32.
All x86-64 CPUs have SSE2, so the MMX version will never be used. This
leads to smaller binaries.
2012-07-27 14:40:27 -07:00
Ronald S. Bultje
2734ba787b vp56: port x86 simd to cpuflags. 2012-07-27 14:39:07 -07:00
Ronald S. Bultje
5361e10a5e proresdsp: port x86 assembly to cpuflags. 2012-07-27 11:43:06 -07:00
jamal
52a62f9085 dwt: Fix several warnings about incompatible pointer type
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-27 19:36:17 +02:00
Ronald S. Bultje
bde73f28af mpegaudio: bury inline asm under HAVE_INLINE_ASM. 2012-07-26 13:43:16 -07:00
Ronald S. Bultje
30b45d9c38 x86inc: automatically insert vzeroupper for YMM functions. 2012-07-26 13:43:16 -07:00
Michael Niedermayer
7333798c85 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libopenjpeg: support YUV and deep RGB pixel formats
  Fix typo in v410 decoder.
  vf_yadif: unset cur_buf on the input link.
  vf_overlay: ensure the overlay frame does not get leaked.
  vf_overlay: prevent premature freeing of cur_buf
  Support urlencoded http authentication credentials
  rtmp: Return an error when the client bandwidth is incorrect
  rtmp: Return proper error code in handle_server_bw
  rtmp: Return proper error code in handle_client_bw
  rtmp: Return proper error codes in handle_chunk_size
  lavr: x86: add missing vzeroupper in ff_mix_1_to_2_fltp_flt()
  vp8: Replace x*155/100 by x*101581>>16.
  vp3: don't use calls to inline asm in yasm code.
  x86/dsputil: put inline asm under HAVE_INLINE_ASM.
  dsputil_mmx: fix incorrect assembly code
  rtmp: Factorize the code by adding handle_invoke
  rtmp: Factorize the code by adding handle_chunk_size
  rtmp: Factorize the code by adding handle_ping
  rtmp: Factorize the code by adding handle_client_bw
  rtmp: Factorize the code by adding handle_server_bw

Conflicts:
	libavcodec/libopenjpegdec.c
	libavcodec/x86/dsputil_mmx.c
	libavfilter/vf_overlay.c
	libavformat/Makefile
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-26 21:37:15 +02:00
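
The vp8 entry in the merge above replaces a division by a multiply and shift. The arithmetic can be checked with a small sketch: 101581 / 65536 is roughly 1.5500031, slightly above 155 / 100, so the two forms agree only while the accumulated excess stays below the rounding gap. The function names are hypothetical, and the range tested here is an assumption for illustration, not a statement about the values the decoder actually passes.

```c
#include <assert.h>

/* Original form: scale by 1.55 using an integer division. */
static int scale_div(int x)
{
    return x * 155 / 100;
}

/* Replacement form: scale by 101581 / 65536 using a multiply and shift. */
static int scale_shift(int x)
{
    return (x * 101581) >> 16;
}

/* Returns 1 if both forms give the same result for every x in [0, limit).
 * limit must stay small enough that x * 101581 fits in a 32-bit int. */
static int forms_agree_below(int limit)
{
    int x;

    for (x = 0; x < limit; x++)
        if (scale_div(x) != scale_shift(x))
            return 0;
    return 1;
}
```

Under these assumptions the two forms agree for every non-negative x below 16384; the first disagreement is at x = 16389, where the division form gives 25402 and the shift form gives 25403.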
Ronald S. Bultje
a1878a88a1 vp3: don't use calls to inline asm in yasm code.
Mixing yasm and inline asm is a bad idea, since if either yasm or inline
asm is not supported by your toolchain, all of the asm stops working.
Thus, better to use either one or the other alone.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:24:30 -04:00
Ronald S. Bultje
79195ce565 x86/dsputil: put inline asm under HAVE_INLINE_ASM.
This allows compiling with compilers that don't support gcc-style
inline assembly.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:24:27 -04:00
Yang Wang
845e92fd6a dsputil_mmx: fix incorrect assembly code
In ff_put_pixels_clamped_mmx(), there are two assembly code blocks.
In the first block (in the unrolled loop), the instructions
"movq 8%3, %%mm1 \n\t", and so forth, have problems.

From the above instruction, it is clear what the programmer wants: a load from
p + 8. But this assembly code doesn’t guarantee that. It only works if the
compiler puts p in a register to produce an instruction like this:
"movq 8(%edi), %mm1". During compiler optimization, it is possible that the
compiler will be able to constant propagate into p. Suppose p = &x[10000].
Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction
becomes "movq 810000(%edi)". That is, it will stride by 810000 instead of 8.

This will cause a segmentation fault.

This error was fixed in the second block of the assembly code, but not in
the unrolled loop.

How to reproduce:
    This error is exposed when we build using Intel C++ Compiler, with
    IPO+PGO optimization enabled. Crashed when decoding an MJPEG video.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:22:18 -04:00
yang
6a2bad2c4f dsputil_mmx: fix incorrect assembly code
In libavcodec/x86/dsputil_mmx.c, function ff_put_pixels_clamped_mmx() contains two assembly code blocks. In the first block (the unrolled loop), instructions such as "movq 8%3, %%mm1 \n\t" are incorrect.
From the above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: “movq 8(%edi), %mm1”. During compiler optimization, it is possible that the compiler will be able to constant-propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes “movq 810000(%edi)”. That is, it will use an offset of 810000 instead of 8.
This will cause a segmentation fault.
This error was fixed in the second block of the assembly code, but not in the unrolled loop.

How to reproduce:
This error is exposed when ffmpeg is built with the Intel C++ Compiler with IPO+PGO optimization enabled. ffmpeg crashed when decoding an MJPEG video.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-24 00:55:05 +02:00
Michael Niedermayer
2cb4d51654 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  v410dec: Implement explode mode support
  zerocodec: fix direct rendering.
  wav: init st to NULL to avoid a false-positive warning.
  wavpack: set bits_per_raw_sample for S32 samples to properly identify 24-bit
  h264: refactor NAL decode loop
  RTMPTE protocol support
  RTMPE protocol support
  rtmp: Add ff_rtmp_calc_digest_pos()
  rtmp: Rename rtmp_calc_digest to ff_rtmp_calc_digest and make it global
  swscale: add missing HAVE_INLINE_ASM check.
  lavfi: place x86 inline assembly under HAVE_INLINE_ASM.
  vc1: Add a test for interlaced field pictures
  swscale: Mark all init functions as av_cold
  swscale: x86: Drop pointless _mmx suffix from filenames
  lavf: use conditional notation for default codec in muxer declarations.
  swscale: place inline assembly bilinear scaler under HAVE_INLINE_ASM.
  dsputil: ppc: cosmetics: pretty-print
  dsputil: x86: add SHUFFLE_MASK_W macro
  configure: respect CC_O setting in check_cc

Conflicts:
	Changelog
	configure
	libavcodec/v410dec.c
	libavcodec/zerocodec.c
	libavformat/asfenc.c
	libavformat/version.h
	libswscale/utils.c
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-23 21:25:09 +02:00
Jason Garrett-Glaser
85a3c19ed1 dsputil: x86: add SHUFFLE_MASK_W macro
Simplifies pshufb masks that operate on words.
2012-07-22 16:56:58 -04:00
Michael Niedermayer
85044358f6 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Print full compiler identification, not only version number
  flacdec: reverse lpc coeff order, simplify filter
  x86: dsputil: drop some unused CPU flag debug code

Conflicts:
	cmdutils.c
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-19 22:01:31 +02:00
Diego Biurrun
9f97af2688 x86: dsputil: drop some unused CPU flag debug code 2012-07-19 10:17:56 +02:00
Michael Niedermayer
204c4e953d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ppc: fix build with altivec disabled
  vp3: move idct and loop filter pointers to new vp3dsp context
  build: add CONFIG_VP3DSP, reduce repetition in OBJS lists
  tscc2: do not add/subtract 128 bias during DCT
  tscc2: fix typo in DCT
  configure: clarify external library section of help output
  configure: mark libfdk-aac as nonfree
  configure: cosmetics: drop some unnecessary backslashes
  os_support: K&R formatting cosmetics

Conflicts:
	configure
	libavcodec/vp3.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-18 22:34:48 +02:00
Mans Rullgard
28f9ab7029 vp3: move idct and loop filter pointers to new vp3dsp context
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context.  There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:19 +01:00
Mans Rullgard
ab9f987661 build: add CONFIG_VP3DSP, reduce repetition in OBJS lists
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:18 +01:00
Michael Niedermayer
3245c8b669 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mxfdec: replace x>>av_log2(sizeof(..)) by x/sizeof(..).
  x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro

Conflicts:
	libavformat/mxfdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-07 20:29:43 +02:00
Loren Merritt
e14052dbc8 x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-07 20:26:39 +02:00
Martin Storsjö
f27386cdc7 x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro
The SPLATB_REG macro already adds the 'd' suffix internally.

This fixes building on Win64, which has been broken since 878e66902.

This worked for unix, where r2 happened to be rdx in this case, which
with the first suffix rdxd was mapped to eax, and eaxd is defined back
to eax. On win64 however, r2 happened to be R8 in this case, and
R8d maps to R8D just fine, but there's no mapping for R8Dd to anything.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-07-06 21:07:23 +03:00
Michael Niedermayer
24823a761c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  qdm2: remove broken and disabled dump_context() debug function
  x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros
  x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros
  x86inc: modify ALIGN to not generate long nops on i586
  x86: h264_intrapred: port to cpuflag macros
  avplay: update input filter pointer when the filtergraph is reset.
  avconv: fix parsing of -force_key_frames option.
  h264: use templates to avoid excessive inlining
  xtea: Make the count parameter match the documentation
  blowfish: Make the count parameter match the documentation
  mpegvideo: Don't use ff_mspel_motion() for vc1
  xtea: invert branch and loop precedence
  blowfish: invert branch and loop precedence
  flvdec: optionally trust the metadata
  avconv: Set audio filter time base to the sample rate
  vp8: Add ifdef guards around the sse2 loopfilter in the sse2slow branch too

Conflicts:
	ffmpeg.c
	ffplay.c
	libavcodec/h264.c
	libavcodec/mpegvideo_common.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-05 21:55:31 +02:00
Diego Biurrun
878e669029 x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros 2012-07-05 17:37:11 +02:00
Loren Merritt
4d4752366f x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-07-05 17:37:11 +02:00
Diego Biurrun
d20f133ef9 x86: h264_intrapred: port to cpuflag macros 2012-07-05 17:37:10 +02:00
Martin Storsjö
07eeeb1d4f vp8: Add ifdef guards around the sse2 loopfilter in the sse2slow branch too
This was missed in the previous commit, 70a1c800.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-07-05 09:39:01 +03:00
Michael Niedermayer
039e9fe01c Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  lavfi: reclassify showfiltfmts as a TESTPROG
  graph2dot: fix printf format specifier
  swscale: yuv2planeX 8bit >=sse2 functions need aligned stack on x86-32.
  vp8: loopfilter >=sse2 functions need aligned stack on x86-32.
  amr: remove shift out of the AMR_BIT() macro.
  dsputilenc: group yasm and inline asm function pointer assignment.
  mov: use forward declaration of a function instead of a table.
  Clarify Doxygen comment for FF_API_* #defines.
  configure: simplify get_version()
  Create version.h headers for libraries that lack them
  gitignore: Use full path instead of relative path to specify patterns
  mpegvideo: remove VLAs
  Add XTEA encryption support in libavutil
  Add Blowfish encryption support in libavutil
  eval: Add the isinf() function and tests for it
  flacdec: move lpc filter to flacdsp
  flacdec: split off channel decorrelation as flacdsp
  avplay: Add an option for not limiting the input buffer size
  FATE: add a test for WMA cover art.
  FATE: add a test for apetag cover art
  ...

Conflicts:
	.gitignore
	configure
	ffplay.c
	libavcodec/Makefile
	libavcodec/error_resilience.c
	libavcodec/mpegvideo.c
	libavcodec/ratecontrol.c
	libavdevice/avdevice.h
	libavfilter/Makefile
	libavfilter/filtfmts.c
	libavfilter/version.h
	libavformat/mov.c
	libavformat/version.h
	libavutil/Makefile
	libavutil/avutil.h
	libavutil/version.h
	libswscale/swscale.h
	libswscale/x86/swscale_mmx.c
	tests/fate/libavutil.mak
	tests/lavfi-regression.sh
	tools/graph2dot.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-04 21:03:28 +02:00
Martin Storsjö
70a1c8000f vp8: loopfilter >=sse2 functions need aligned stack on x86-32.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-07-04 08:25:50 -07:00
Ronald S. Bultje
723b266d72 dsputilenc: group yasm and inline asm function pointer assignment. 2012-07-04 07:46:27 -07:00
Michael Niedermayer
64b25938e9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dsputilenc_mmx: split assignment of ff_sse16_sse2 to SSE2 section.
  dnxhdenc: add space between function argument type and comment.
  x86: fmtconvert: add special asm for float_to_int16_interleave_misc_*
  attributes: Add a definition of av_always_inline for MSVC
  cmdutils: Pass the actual chosen encoder to filter_codec_opts
  os_support: Add fallback definitions for stat flags
  os_support: Rename the poll fallback function to ff_poll
  network: Check for struct pollfd
  os_support: Don't compare a negative number against socket descriptors
  os_support: Include all the necessary headers for the win32 open function
  x86: vc1: fix and enable optimised loop filter

Conflicts:
	cmdutils.c
	cmdutils.h
	ffmpeg.c
	ffplay.c
	libavformat/os_support.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-30 22:44:18 +02:00
Ronald S. Bultje
ceabc13f12 dsputilenc_mmx: split assignment of ff_sse16_sse2 to SSE2 section. 2012-06-30 09:24:52 -07:00
Ronald S. Bultje
66a02159ea x86: fmtconvert: add special asm for float_to_int16_interleave_misc_*
This gets rid of a variable-length array and a for loop in C code.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-06-30 19:10:36 +03:00
Mans Rullgard
f2fd167835 x86: vc1: fix and enable optimised loop filter
The problem is that the ssse3 psign instruction does the wrong
thing here.  Commit ea60dfe incorrectly removed a macro emulating
this instruction for pre-ssse3 code.  However, the emulation is
incorrect, and the code relies on the behaviour of the macro.
Specifically, the psign sets destination elements to zero where
the corresponding source element is zero, whereas the emulation
only negates destination elements where the source is negative.
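
The difference between the two behaviours can be shown with a scalar model
of a single 16-bit lane. This is only a sketch: psignw_lane() follows the
description of the instruction given above, and negate_if_negative_lane()
models the emulation as the message describes it, not the actual macro
text. Both function names are made up for illustration.

```c
#include <stdint.h>

/* One 16-bit lane of psignw as described above: negate where the source
 * is negative, zero where the source is zero, pass through otherwise. */
static int16_t psignw_lane(int16_t dst, int16_t src)
{
    if (src < 0)
        return (int16_t)-dst;
    if (src == 0)
        return 0;
    return dst;
}

/* The emulation as described: it only negates, so a zero source leaves
 * the destination unchanged. */
static int16_t negate_if_negative_lane(int16_t dst, int16_t src)
{
    return src < 0 ? (int16_t)-dst : dst;
}
```

The two functions agree for every nonzero source value and differ only
when the source lane is zero, which is the case the message singles out.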

Furthermore, the PSIGNW_MMX macro in x86util.asm is totally bogus,
which is why the original VC-1 code had an additional right shift
when using it.  Since the psign instruction cannot be used here,
skip all the macro hell and use the working instruction sequence
directly.

None of this was noticed due to a stray return statement in
ff_vc1dsp_init_mmx() which meant that only the mmx version of the
loop filter was ever used (before being removed in ea60dfe).

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-30 00:12:05 +01:00
Michael Niedermayer
87df986dcf Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mss1: validate number of changeable palette entries
  mss1: report palette changed when some additional colours were decoded
  x86: fft: replace call to memcpy by a loop
  udp: Support IGMPv3 source specific multicast and source blocking
  dxva2: include dxva.h if found
  libm: Provide fallback definitions for isnan() and isinf()
  tcp: Pass NULL as hostname to getaddrinfo if the string is empty
  tcp: Set AI_PASSIVE when the socket will be used for listening

Conflicts:
	configure
	libavcodec/mss1.c
	libavformat/udp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-28 01:08:52 +02:00
Christophe Gisquet
a5bfa66df5 x86: fft: replace call to memcpy by a loop
The function call was a mess to handle, and memcpy cannot make
the assumptions we do in the new code.

Tested on an IMC sample: 430c -> 370c.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-27 12:49:33 +01:00
Mans Rullgard
37c3864ef7 x86: fft: elf64: fix PIC build
In a 64-bit PIC build, external functions must be called
through the PLT.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-26 15:57:32 +02:00
Nicolas George
d4c45b8adf Revert "Revert "x86: fft: win64: fix stack alignment for memcpy() call""
This reverts commit f767658414.

The bug it introduces has been fixed.
2012-06-26 15:56:01 +02:00
Nicolas George
91765594dd Revert "Revert "x86: fft: convert sse inline asm to yasm""
This reverts commit fd91a3ec44.

The bug it introduced has been fixed.
2012-06-26 15:55:41 +02:00
Nicolas George
fd91a3ec44 Revert "x86: fft: convert sse inline asm to yasm"
This reverts commit 8299260470.

It breaks shared builds on x86_64.
2012-06-26 13:00:14 +02:00
Nicolas George
f767658414 Revert "x86: fft: win64: fix stack alignment for memcpy() call"
This reverts commit 8725da49a2.

Necessary to revert 8299260470.
2012-06-26 12:59:48 +02:00
Michael Niedermayer
3b0ad040b3 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  log: Include io.h on windows
  lavr: x86: merge some branches
  x86: cpu: whitespace (mostly) cosmetics
  x86: fft: win64: fix stack alignment for memcpy() call

Conflicts:
	libavutil/log.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-26 01:13:07 +02:00
Mans Rullgard
0595334892 x86: fft: elf64: fix PIC build
In a 64-bit PIC build, external functions must be called
through the PLT.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 22:58:18 +01:00
Michael Niedermayer
a6ff8514a9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  wtv: Check the return value from gmtime
  x86: fft: convert sse inline asm to yasm
  x86: place some inline asm under #if HAVE_INLINE_ASM

Conflicts:
	libavcodec/x86/fft_sse.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-25 16:55:31 +02:00
Mans Rullgard
8725da49a2 x86: fft: win64: fix stack alignment for memcpy() call 2012-06-25 15:10:39 +01:00
Mans Rullgard
8299260470 x86: fft: convert sse inline asm to yasm 2012-06-25 13:31:00 +01:00
Ronald S. Bultje
8123e0901f x86: place some inline asm under #if HAVE_INLINE_ASM
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 13:23:12 +01:00
Michael Niedermayer
244682dd08 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  log: Only include unistd.h if configure found it
  ape: create audio stream before reading tags.
  mov: make a length variable larger.
  image2: Add "start_number" private option to the demuxer
  image2: Add "start_number" private option to the muxer
  avconv: remove a forgotten debugging printf.
  avconv: use more descriptive names for hardcoded filters.
  avconv: remove redundant handling of async.
  doc/filters: fix typo.
  h264: use asm cabac reader under a generic condition

Conflicts:
	ffmpeg.c
	libavformat/img2dec.c
	libavformat/img2enc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-24 21:34:54 +02:00
Michael Niedermayer
1c60088885 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: Only use optimizations with cmov if the CPU supports the instruction
  x86: Add CPU flag for the i686 cmov instruction
  x86: remove unused inline asm macros from dsputil_mmx.h
  x86: move some inline asm macros to the only places they are used
  lavfi: Add the af_channelmap audio channel mapping filter.
  lavfi: add join audio filter.
  lavfi: allow audio filters to request a given number of samples.
  lavfi: support automatically inserting the fifo filter when needed.
  lavfi/audio: eliminate ff_default_filter_samples().

Conflicts:
	Changelog
	libavcodec/x86/h264dsp_mmx.c
	libavfilter/Makefile
	libavfilter/allfilters.c
	libavfilter/avfilter.h
	libavfilter/avfiltergraph.c
	libavfilter/version.h
	libavutil/x86/cpu.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-24 02:09:53 +02:00
Mans Rullgard
0b6f973635 h264: use asm cabac reader under a generic condition
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 22:14:21 +01:00
Diego Biurrun
fe07c9c6b5 x86: Only use optimizations with cmov if the CPU supports the instruction 2012-06-23 16:21:50 +02:00
Mans Rullgard
29686d6ea3 x86: remove unused inline asm macros from dsputil_mmx.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 14:14:06 +01:00
Mans Rullgard
685f5438bb x86: move some inline asm macros to the only places they are used
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 14:14:06 +01:00
Michael Niedermayer
e847f41285 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libspeexenc: add supported sample rates and channel layouts.
  Replace usleep() calls with av_usleep()
  lavu: add av_usleep() function
  utvideo: mark interlaced frames as such
  utvideo: Fix interlaced prediction for RGB utvideo.
  cosmetics: do not use full path for local headers
  lavu/file: include unistd.h only when available
  configure: check for unistd.h
  log: include unistd.h only when needed
  lavf: include libavutil/time.h instead of redeclaring av_gettime()

Conflicts:
	configure
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/utvideo.c
	libavutil/avutil.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-22 22:34:02 +02:00
Michael Niedermayer
fba18ef8cc x86/dsputil_mmx: support 4 sample edges
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-22 17:23:09 +02:00
Diego Biurrun
a5a93fa8f5 cosmetics: do not use full path for local headers 2012-06-22 10:49:40 +02:00
Michael Niedermayer
82edf6727f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: add x86-optimized functions for mixing 1-to-2 s16p with flt coeffs
  lavr: add x86-optimized functions for mixing 1-to-2 fltp with flt coeffs
  Add Dolby/DPLII downmix support to libavresample
  vorbisdec: replace div/mod in loop with a counter
  fate: vorbis: add 5.1 surround test
  rtpenc: Allow requesting H264 RTP packetization mode 0
  configure: Sort the library listings in the help text alphabetically
  dwt: remove variable-length arrays
  RTMPT protocol support
  http: Properly handle chunked transfer-encoding for replies to post data
  http: Fail reading if the connection has gone away
  amr: Mark an array const
  amr: More space cleanup
  rtpenc: Fix memory leaks in the muxer open function

Conflicts:
	Changelog
	configure
	doc/APIchanges
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-18 20:07:00 +02:00
Ronald S. Bultje
d9669eab0b dwt: remove variable-length arrays
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-17 23:20:10 +01:00
Michael Niedermayer
9946a6aa55 diracdsp: try to fix segfault
This might fix Ticket1412

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-16 23:16:54 +02:00
Michael Niedermayer
3b196bb737 libavcodec/x86/rv40dsp_init.c: add missing HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:26:24 +02:00
Michael Niedermayer
915ec91e6b libavcodec/x86/h264dsp_mmx.c: add forgotten HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:23:44 +02:00
Michael Niedermayer
63bfee8796 libavcodec/x86/dwt.c: move some missed things under HAVE_YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-10 03:20:27 +02:00
Michael Niedermayer
7e22514d98 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  float_dsp: ppc: add a separate header for Altivec function prototypes
  ARM: fix float_dsp breakage from d5a7229
  Add a float DSP framework to libavutil
  PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
  ARM: Move asm.S from libavcodec to libavutil
  vc1dsp: mark put/avg_vc1_mspel_mc() always_inline

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-08 23:59:09 +02:00
Justin Ruggles
d5a7229ba4 Add a float DSP framework to libavutil
Move vector_fmul() from DSPContext to AVFloatDSPContext.
2012-06-08 13:14:38 -04:00
Michael Niedermayer
b0387edd5e Merge commit 'f919cc7df6ab844bc12f89fe7bef4fb915a47725'
* commit 'f919cc7df6ab844bc12f89fe7bef4fb915a47725':
  fate: fix acodec/vsynth tests for make 3.81
  pcm_mpeg: fix number of consumed bytes to include the header.
  avfilter: include required header file avfilter.h in video.h
  x86: Avoid movs on BUTTERFLYPS when in AVX mode
  x86: use new schema for ASM macros
  fate: convert codec-regression.sh to makefile rules
  fate: allow tests to specify unit size for psnr comparison
  fate: teach videogen/rotozoom to output a single raw video stream
  http: Add support for reusing the http socket for subsequent requests
  http: Add support for using persistent connections

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-30 01:40:54 +02:00
Vitor Sessak
bac0729d9e x86: use new schema for ASM macros
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-05-29 14:49:45 +02:00
Vitor Sessak
2fd5e70869 x86: use new schema for ASM macros
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-27 15:42:45 +02:00
Carl Eugen Hoyos
001d9d5e93 Fix compilation with --disable-everything. 2012-05-24 08:08:31 +02:00
Michael Niedermayer
d0ad91c258 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  os_support: Define SHUT_RD, SHUT_WR and SHUT_RDWR on OS/2
  http: Add support for reading http POST reply headers
  http: Add http_shutdown() for ending writing of posts
  tcp: Allow signalling end of reading/writing
  avio: Add a function for signalling end of reading/writing
  lavfi: fix comment, audio is supported now.
  lavfi: fix incorrect comment.
  lavfi: remove avfilter_null_* from public API on next bump.
  lavfi: remove avfilter_default_* from public API on next bump.
  lavfi: deprecate default config_props() callback and refactor avfilter_config_links()
  avfiltergraph: smarter sample format selection.
  avconv: rename transcode_audio/video to decode_audio/video.
  asyncts: reset delta to 0 when it's not used.
  x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code.
  dwt: return errors from ff_slice_buffer_init()

Conflicts:
	ffmpeg.c
	libavfilter/avfilter.c
	libavfilter/avfilter.h
	libavfilter/formats.c
	libavfilter/version.h
	libavfilter/vf_blackframe.c
	libavfilter/vf_drawtext.c
	libavfilter/vf_fade.c
	libavfilter/vf_format.c
	libavfilter/vf_showinfo.c
	libavfilter/video.c
	libavfilter/video.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-23 21:48:31 +02:00
Michael Niedermayer
ea5dab58e0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dwt: check malloc calls
  ppc: Drop unused header regs.h
  af_resample: remove an extra space in the log output
  Convert vector_fmul range of functions to YASM and add AVX versions
  lavfi: add an audio split filter
  lavfi: rename vf_split.c to split.c

Conflicts:
	doc/filters.texi
	libavcodec/ppc/regs.h
	libavfilter/Makefile
	libavfilter/allfilters.c
	libavfilter/f_split.c
	libavfilter/split.c
	libavfilter/version.h
	libavfilter/vf_split.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-22 23:42:17 +02:00
Justin Ruggles
713548cbad x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code.
This is needed for older versions of yasm/nasm that do not support AVX.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-05-22 20:46:02 +02:00
Kieran Kunhya
5ff01259a8 Convert vector_fmul range of functions to YASM and add AVX versions
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-05-21 17:13:05 -04:00
Michael Niedermayer
703e920bb7 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  fate: Work around non-standard wc implementations at more places
  fate: work around non-standard wc implementations
  x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions.
  ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16
  fate: use standard diff options
  tta: Fix comment about channel number; TTA supports >2 channels.
  avfilter: Move ff_get_ref_perms_string() to where it is used.
  build: Add 'check' target to run all compile and test targets.
  indeo3: validate new frame size before resetting decoder
  indeo3: when freeing buffers, set pointers referencing them to NULL as well
  indeo3: initialise pixel planes on allocation
  indeo3: ensure that decoded cell data is in 7-bit range as presumed by decoder
  fate: rename psx-str-v3-mdec to mdec-v3
  fate: convert psx-str to a demuxer test
  lavf: add mdec to is_intra_only() list

Conflicts:
	doc/developer.texi
	libavcodec/indeo3.c
	libavfilter/video.c
	libavformat/utils.c
	tests/fate/demux.mak
	tests/fate/video.mak
	tests/lavf-regression.sh
	tests/ref/vsynth1/cljr
	tests/ref/vsynth1/ffvhuff
	tests/ref/vsynth2/cljr
	tests/ref/vsynth2/ffvhuff

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-16 22:32:05 +02:00
Michael Kostylev
6797d1948b x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions. 2012-05-15 23:54:08 +02:00
Justin Ruggles
95a98ab3f0 ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16
Simplifies the code by using cpuflags and a new macro.
Also fixes the invalid use of the MMX2 pshufw operation in the MMX-only
function.
2012-05-15 15:23:59 -04:00
Michael Niedermayer
7e944159c6 Merge remote-tracking branch 'qatar/master'
* qatar/master: (25 commits)
  vcr1: Add vcr1_ prefixes to all static functions with generic names.
  vcr1: Fix return type of common_init to match the function pointer signature.
  vcr1enc: Replace obsolete get_bit_count by put_bits_count/flush_put_bits.
  motion-test: remove disabled code
  gxfenc: remove disabled half-implemented MJPEG tag
  x86: use more standard construct for setting ASM functions in FFT code
  fate: westwood-aud: disable decoding
  fate: caf: disable decoding
  fate: film-cvid: drop pcm audio and rename test
  fate: d-cinema-demux: drop unnecessary flags
  fate: split off dpcm-interplay from interplay-mve tests
  fate: rename funcom-iss to adpcm-ima-iss
  fate: rename cryo-apc to adpcm-ima-apc
  fate: rename adpcm-psx-str-v3 to adpcm-xa
  fate: split off adpcm-ms-mono test from dxa-feeble
  fate: split off adpcm-ima-ws test from vqa-cc
  fate: add adpcm-ima-smjpeg test
  fate: split off adpcm-ima-amv from amv test
  fate: separate bmv audio and video tests
  fate: separate delphine-cin audio and video tests
  ...

Conflicts:
	doc/platform.texi
	libavcodec/vcr1.c
	tests/fate/audio.mak
	tests/fate/demux.mak
	tests/fate/video.mak
	tests/ref/fate/ea-mad-pcm-planar
	tests/ref/fate/interplay-mve-16bit
	tests/ref/fate/interplay-mve-8bit
	tests/ref/fate/mtv
	tests/ref/fate/qtrle-1bit
	tests/ref/fate/qtrle-2bit
	tests/ref/fate/truemotion1-15
	tests/ref/fate/truemotion1-24
	tests/ref/fate/vqa-cc

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-14 20:17:24 +02:00
Vitor Sessak
fcc456b829 x86: use more standard construct for setting ASM functions in FFT code
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-05-14 15:38:42 +02:00
Michael Niedermayer
1caf614bec Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavfi: autoinsert resample filter when necessary.
  lavfi: add lavr-based audio resampling filter.
  x86: vc1: drop MMX loop filter implementation, which uses MMX2 instructions.

Conflicts:
	configure
	doc/filters.texi
	libavcodec/x86/vc1dsp_mmx.c
	libavfilter/Makefile
	libavfilter/allfilters.c
	libavfilter/avfiltergraph.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-13 00:13:49 +02:00
Michael Kostylev
ea60dfe284 x86: vc1: drop MMX loop filter implementation, which uses MMX2 instructions. 2012-05-12 14:02:45 +02:00
Michael Niedermayer
015903294c Merge remote-tracking branch 'qatar/master'
* qatar/master: (25 commits)
  rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MC
  ape: Use unsigned integer maths
  arm: dsputil: fix overreads in put/avg_pixels functions
  h264: K&R formatting cosmetics for header files (part II/II)
  h264: K&R formatting cosmetics for header files (part I/II)
  rtmp: Implement check bandwidth notification.
  rtmp: Support 'rtmp_swfurl', an option which specifies the URL of the SWF player.
  rtmp: Support 'rtmp_flashver', an option which overrides the version of the Flash plugin.
  rtmp: Support 'rtmp_tcurl', an option which overrides the URL of the target stream.
  cmdutils: Add fallback case to switch in check_stream_specifier().
  sctp: be consistent with socket option level
  configure: Add _XOPEN_SOURCE=600 to Solaris preprocessor flags.
  vcr1enc: drop pointless empty encode_init() wrapper function
  vcr1: drop pointless write-only AVCodecContext member from VCR1Context
  vcr1: group encoder code together to save #ifdefs
  vcr1: cosmetics: K&R prettyprinting, typos, parentheses, dead code, comments
  mov: make one comment slightly more specific
  lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX
  lavfi: move audio-related functions to a separate file.
  lavfi: remove some audio-related function from public API.
  ...

Conflicts:
	cmdutils.c
	libavcodec/h264.h
	libavcodec/h264_mvpred.h
	libavcodec/vcr1.c
	libavfilter/avfilter.c
	libavfilter/avfilter.h
	libavfilter/defaults.c
	libavfilter/internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-05-10 23:30:42 +02:00
Christophe Gisquet
110d0cdc9d rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MC
Code mostly inspired by vp8's MC, however:
- its MMX2 horizontal filter is worse because it can't take advantage of
  the coefficient redundancy
- that same coefficient redundancy allows better code for non-SSSE3 versions

Benchmark (rounded to tens of unit):
        V8x8  H8x8  2D8x8  V16x16  H16x16  2D16x16
C        445   358    985    1785    1559     3280
MMX*     219   271    478     714     929     1443
SSE2     131   158    294     425     515      892
SSSE3    120   122    248     387     390      763

End result is overall around a 15% speedup for SSSE3 version (on 6 sequences);
all loop filter functions now take around 55% of decoding time, while luma MC
dsp functions are around 6%, chroma ones are 1.3% and biweight around 2.3%.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-05-10 18:42:43 +02:00
Ronald S. Bultje
bec207f9f9 snowdsp: explicitly state instruction size.
Fixes a compile error with clang at -O0.
2012-05-02 09:57:12 -07:00
Michael Niedermayer
dfa07e8928 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  4xm: fix invalid array indexing
  rv34dsp: factorize a multiplication in the noround inverse transform
  rv40: perform bitwise checks in loop filter
  rv34: remove inline keyword from rv34_decode_block().
  rv40: change a logical test into a bitwise one.
  rv34: remove constant parameter
  rv40: don't always do the full prev_type search
  dsputil x86: revert a test back to its previous value
  rv34dsp x86: implement MMX2 inverse transform

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-29 21:45:54 +02:00
Roland Scheidegger
82c71913e4 h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
Now that only one table is used, there's some chance even darwin as compiles
this (apparently the label arithmetic used previously doesn't work if it
involves symbols defined in a different file, thanks to Ronald S. Bultje for
helping me with this).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
Roland Scheidegger
7f668cd2b5 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
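The single-table layout described in this commit can be sketched in C as follows. This is a minimal illustration only: the names `combined_tables`, `table_a`, `table_b` and the tiny table sizes are hypothetical and are not the identifiers or dimensions used in the real CABAC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offsets and sizes, chosen only for illustration. */
enum {
    TABLE_A_OFFSET = 0,
    TABLE_A_SIZE   = 4,
    TABLE_B_OFFSET = TABLE_A_OFFSET + TABLE_A_SIZE,
    TABLE_B_SIZE   = 3
};

/* One exported table holds what used to be several separate tables. */
const uint8_t combined_tables[TABLE_A_SIZE + TABLE_B_SIZE] = {
    10, 11, 12, 13,   /* former table A */
    20, 21, 22,       /* former table B */
};

/* The old names survive as static constant pointers, so the compiler can
 * treat them as the base symbol plus an immediate offset. */
static const uint8_t *const table_a = combined_tables + TABLE_A_OFFSET;
static const uint8_t *const table_b = combined_tables + TABLE_B_OFFSET;

/* Callers keep indexing through the old names unchanged. */
int lookup_a(int i) { return table_a[i]; }
int lookup_b(int i) { return table_b[i]; }

/* Small self-check doubling as a usage example. */
void combined_tables_check(void)
{
    assert(lookup_a(2) == 12);
    assert(lookup_b(0) == 20);
    assert(table_b == combined_tables + 4);
}
```

With only one symbol to address, position-independent code needs a single base address, and every former table is reached from it by a constant displacement.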
Roland Scheidegger
5520df6a8f h264: (trivial) remove unneeded macro argument in x86/cabac.h
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
Christophe GISQUET
e75d1d4f73 dsputil x86: revert a test back to its previous value
Commit 356ee8d caused the initial inversion.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 11:00:51 -07:00
Christophe Gisquet
fe5ed69dc7 rv34dsp x86: implement MMX2 inverse transform
141 cycles down to 51.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 10:58:47 -07:00
Roland Scheidegger
9b9df1cdff h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 09:43:25 -07:00
Roland Scheidegger
14e9ffc1e4 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:26:12 -07:00
Roland Scheidegger
444f47b55c h264: (trivial) remove unneeded macro argument in x86/cabac.h
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:24:56 -07:00
Michael Niedermayer
70d54392f5 lowres2 support.
The new lowres support is limited to decoders where lowres decoding
is possible in high quality.
I was not able to measure any speed difference, but if one is found
the 2-3 lines that might affect speed can be made compile time conditional

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-22 22:26:55 +02:00
Michael Niedermayer
92ef4be4ab Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ARM: allow runtime masking of CPU features
  dsputil: remove unused functions
  mov: Treat keyframe indexes as 1-origin if starting at non-zero.
  mov: Take stps entries into consideration also about key_off.
  Remove lowres video decoding

Conflicts:
	ffmpeg.c
	ffplay.c
	libavcodec/arm/vp8dsp_init_arm.c
	libavcodec/libopenjpegdec.c
	libavcodec/mjpegdec.c
	libavcodec/mpegvideo.c
	libavcodec/utils.c
	libavformat/mov.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-22 22:26:42 +02:00
Michael Niedermayer
c047afb80c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avcodec: remove AVCodecContext.dsp_mask
  avconv: fix a segfault when default encoder for a format doesn't exist.
  utvideo: general cosmetics
  aac: Handle HE-AACv2 when sniffing a channel order.
  movenc: Support high sample rates in isomedia formats by setting the sample rate field in stsd to 0.
  xxan: Remove write-only variable in xan_decode_frame_type0().
  ivi_common: Initialize a variable at declaration in ff_ivi_decode_blocks().

Conflicts:
	ffmpeg.c
	libavcodec/utvideo.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-21 22:56:07 +02:00
Mans Rullgard
2bcbd98459 Remove lowres video decoding
This feature is complex, of questionable utility, and slows down
normal decoding.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-04-21 18:56:19 +01:00
Mans Rullgard
95510be8c3 avcodec: remove AVCodecContext.dsp_mask
This removes all references to AVCodecContext.dsp_mask and marks
it for eviction at the next version bump.  It has been superseded
by av_set_cpu_flags_mask() which, unlike this field, works everywhere.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-04-21 18:30:01 +01:00
Michael Niedermayer
9849515214 Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)"
This broke compilation on darwin, revert until a better solution is found.

This reverts commit a812b599b5.
2012-04-21 02:09:27 +02:00
Roland Scheidegger
a812b599b5 h264: assembly version of get_cabac for x86_64 with PIC (v4)
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
v2: incorporated feedback from Loren Merritt to avoid rip-relative movs
for every table, and got rid of unnecessary @GOTPCREL.
v3: apply similar fixes to the decode_significance functions, and use
same macro arguments for non-pic case.
v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect
the c code to be faster otherwise since both cmov and sbb suck hard on a
Prescott, even can't construct the mask with a 64bit shift as that's just as
terrible - it's quite difficult to find usable instructions on that chip...).
This is tested to work but not on a P4, in theory it _should_ be fast there.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-21 00:27:06 +02:00
Michael Niedermayer
15141f939d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  indeo3: add parens around some macro arguments
  h264: use proper PROLOGUE statement for a function using 8 registers.
  doc: Update sample Vim config with suitable (function) indentation settings.
  dv: Merge dvquant.h into dvdata.c where all other DV tables reside.
  dv: Move static tables only used in one place to where they are used.
  graphparser: set next to NULL on an entry extracted from inputs list
  doc/filters: update documentation.
  avconv: flush decoders immediately after an EOF.
  avconv: send EOF to vsrc_buffer.
  avconv: reindent.

Conflicts:
	doc/filters.texi
	ffmpeg.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-17 12:13:22 +02:00
Ronald S. Bultje
87a246341b h264: use proper PROLOGUE statement for a function using 8 registers.
Fixes crashes when using biweight on win64.
2012-04-16 08:07:21 -07:00
Michael Niedermayer
7432bcfe5a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vsrc_buffer: fix check from 7ae7c41.
  libxvid: Reorder functions to avoid forward declarations; make functions static.
  libxvid: drop some pointless dead code
  wmal: vertical alignment cosmetics
  wmal: Warn about missing bitstream splicing feature and ask for sample.
  wmal: Skip seekable_frame_in_packet.
  wmal: Drop unused variable num_possible_block_size.
  avfiltergraph: make the AVFilterInOut alloc/free API public
  graphparser: allow specifying sws flags in the graph description.
  graphparser: fix the order of connecting unlabeled links.
  graphparser: add avfilter_graph_parse2().
  vsrc_buffer: allow using a NULL buffer to signal EOF.
  swscale: handle last pixel if lines have an odd width.
  qdm2: fix a dubious pointer cast
  WMAL: Do not try to read rawpcm coefficients if bits is invalid
  mov: Fix detecting there is no sync sample.
  tiffdec: K&R cosmetics
  avf: has_duration does not check the global one
  dsputil: fix optimized emu_edge function on Win64.

Conflicts:
	doc/APIchanges
	libavcodec/libxvid_rc.c
	libavcodec/libxvidff.c
	libavcodec/tiff.c
	libavcodec/wmalosslessdec.c
	libavfilter/avfiltergraph.h
	libavfilter/graphparser.c
	libavfilter/version.h
	libavfilter/vsrc_buffer.c
	libswscale/output.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-14 22:37:43 +02:00
Michael Niedermayer
367d9b2957 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: K&R formatting cosmetics (part II)
  tiffdec: Add a malloc check and refactor another.
  faxcompr: Check malloc results and unify return path
  configure: escape colons in values written to config.fate
  ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSE
  matroska: Fix leaking memory allocated for laces.
  pthread: Fix crash due to fctx->delaying not being cleared.
  vp3: Assert on invalid filter_limit values.
  h264: fix 10bit biweight functions after recent x86inc.asm fixes.
  ffv1: Fix size mismatch in encode_line.
  movenc: Remove a dead initialization
  git-howto: Explain how to avoid Windows line endings in git checkouts.
  build: Move all arch OBJS declarations into arch subdirectory Makefiles.

Conflicts:
	configure
	libavcodec/vp3.c
	libavformat/matroskadec.c
	libavutil/Makefile
	libswscale/Makefile
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-13 21:50:37 +02:00
Ronald S. Bultje
b089ca871a dsputil: fix optimized emu_edge function on Win64.
Recent register allocation changes (x86inc.asm update) changed the
register order and thus opcodes for the inner loops. One of them became
>128bytes, which confuses other parts of this function where it jumps
to fixed-offset positions to extend the edge by fixed amounts. A simple
register change fixes this.
2012-04-13 11:28:30 -07:00
Justin Ruggles
de7f22ab0c ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSE
Fixes ac3-encode and eac3-encode FATE test failures with SSE2 disabled.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-12 21:33:04 -07:00
Ronald S. Bultje
76538d7a78 h264: fix 10bit biweight functions after recent x86inc.asm fixes.
This should have been updated in the x86inc.asm update, but was
accidentally forgotten.
2012-04-12 21:13:57 -07:00
Michael Niedermayer
ca19862d38 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libxvid: remove disabled code
  qdm2: make a table static const
  qdm2: simplify bitstream reader setup for some subpacket types
  qdm2: use get_bits_left()
  build: Consistently handle conditional compilation for all optimization OBJS.
  avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics
  msrle: convert MS RLE decoding function to bytestream2.
  x86inc improvements for 64-bit

Conflicts:
	common.mak
	libavcodec/avpacket.c
	libavcodec/bfi.c
	libavcodec/msrledec.c
	libavcodec/qdm2.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-13 00:39:19 +02:00
Diego Biurrun
7bb3a302fe build: Consistently handle conditional compilation for all optimization OBJS. 2012-04-12 09:00:49 +02:00
Henrik Gramner
729f90e268 x86inc improvements for 64-bit
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments

Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-04-11 15:47:00 -04:00
Michael Niedermayer
e387c9d5dd Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  rv40dsp x86: use only one register, for both increment and loop counter
  rv40dsp: implement prescaled versions for biweight.
  avconv: use default channel layouts when they are unknown
  avconv: parse channel layout string
  nutdec: K&R formatting cosmetics
  vda: Signal 4 byte NAL headers to the decoder regardless of what's in the extradata
  mem: Consistently return NULL for av_malloc(0)
  vf_overlay: implement poll_frame()
  vf_scale: support named constants for sws flags.
  lavc doxy: add all installed headers to doxy groups.
  lavc doxy: add avfft to the main lavc group.
  lavc doxy: add remaining avcodec.h functions to a misc doxygen group.
  lavc doxy: add AVPicture functions to a doxy group.
  lavc doxy: add resampling functions to a doxy group.
  lavc doxy: replace \ with /
  lavc doxy: add encoding functions to a doxy group.
  lavc doxy: add decoding functions to a doxy group.
  lavc doxy: fix formatting of AV_PKT_DATA_{PARAM_CHANGE,H263_MB_INFO}
  lavc doxy: add AVPacket-related stuff to a separate doxy group.
  lavc doxy: add core functions/definitions to a doxy group.
  ...

Conflicts:
	ffmpeg.c
	libavcodec/avcodec.h
	libavcodec/vda.c
	libavcodec/x86/rv40dsp.asm
	libavfilter/vf_scale.c
	libavformat/nutdec.c
	libavutil/mem.c
	tests/ref/acodec/pcm_s24daud

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-10 22:53:25 +02:00
Christophe GISQUET
2130bd8f5b rv40dsp x86: use only one register, for both increment and loop counter
Around 10 cycles faster for luma.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-10 10:07:09 -07:00
Christophe GISQUET
272b252c01 rv40dsp: implement prescaled versions for biweight.
Quite often, the original weights are multiples of 512. By prescaling them
by 1/512 when they are computed (once per frame), no intermediate shifting
is needed, and no prescaling on each call either.

The x86 code already used that trick.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-10 10:06:48 -07:00
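Why prescaling is lossless when the weights are multiples of 512 can be checked with a short C sketch. The blend formula below (a shift by 9 per product, a rounding constant of 0x10 and a final shift by 5) is an assumption made for illustration, as are the function names; the point is only that `(w * x) >> 9` equals `(w >> 9) * x` whenever `w` is a non-negative multiple of 512.

```c
#include <assert.h>
#include <stdint.h>

/* Unscaled form: every product is shifted down by 9 on each call. */
int blend_unscaled(int w1, int w2, uint8_t a, uint8_t b)
{
    return (((w2 * a) >> 9) + ((w1 * b) >> 9) + 0x10) >> 5;
}

/* Prescaled form: the weights were already divided by 512, once per frame,
 * so no intermediate shift is needed. */
int blend_prescaled(int w1s, int w2s, uint8_t a, uint8_t b)
{
    return (w2s * a + w1s * b + 0x10) >> 5;
}

/* Both forms agree for all weights that are multiples of 512. */
void biweight_check(void)
{
    for (int k = 0; k <= 32; k++) {
        int w1 = 512 * k, w2 = 512 * (32 - k);
        for (int a = 0; a < 256; a += 5)
            for (int b = 0; b < 256; b += 5)
                assert(blend_unscaled(w1, w2, a, b) ==
                       blend_prescaled(w1 >> 9, w2 >> 9, a, b));
    }
}
```

When a weight is not a multiple of 512 the two forms can differ in the low bits, which is presumably why the prescaled path is selected only when the weights allow it.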
Michael Niedermayer
2c5a2958e9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  h264: Factorize declaration of mb_sizes array.
  vsrc_buffer: when no frame is available, return an error instead of segfaulting.
  configure: add dl to frei0r extralibs.
  dsputil x86: use SSE float instruction instead of SSE2 integer equivalent
  dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype
  vp8dsp x86: perform rounding shift with a single instruction
  fate: add BMP tests.
  swscale: handle complete dimensions for monoblack/white.
  aacenc: Mark deinterleave_input_samples argument as const.
  vf_unsharp: Mark readonly variable as const.
  h264: fix 4:2:2 PCM-macroblocks decoding

Conflicts:
	configure
	libavcodec/h264.h
	libavcodec/x86/dsputil_mmx.c
	libavfilter/vf_unsharp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-05 22:26:50 +02:00
Christophe GISQUET
6b81da2fd0 dsputil x86: use SSE float instruction instead of SSE2 integer equivalent
All the more required since the users are pure SSE functions.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-04 11:24:27 -07:00
Christophe GISQUET
cd88105f6f dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-04 11:24:08 -07:00
Christophe GISQUET
f9888520cc vp8dsp x86: perform rounding shift with a single instruction
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-04 11:23:36 -07:00
Michael Niedermayer
226671ee2f dsputil_mmx: fix scalarproduct prototypes
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-01 22:04:05 +02:00
Michael Niedermayer
d40ff29cac Merge remote-tracking branch 'qatar/master'
* qatar/master:
  asf: only set index_read if the index contained entries.
  cabac: add overread protection to BRANCHLESS_GET_CABAC().
  cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC().
  cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE().
  cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC().
  h264: add overread protection to get_cabac_bypass_sign_x86().
  h264: reindent get_cabac_bypass_sign_x86().
  h264: use struct offsets in get_cabac_bypass_sign_x86().
  h264: fix overreads in cabac reader.
  wmall: fix seeking.
  lagarith: fix buffer overreads.
  dvdec: drop unnecessary dv_tablegen.h #include
  build: fix doc generation errors in parallel builds
  Replace memset(0) by zero initializations.
  faandct: Remove FAAN_POSTSCALE define and related code.
  dvenc: print allowed profiles if the video doesn't conform to any of them.
  avcodec_encode_{audio,video}: only reallocate output packet when it has non-zero size.
  FATE: add a test for vp8 with changing frame size.
  fate: add kgv1 fate test.
  oggdec: calculate correct timestamps in Ogg/FLAC

Conflicts:
	libavcodec/4xm.c
	libavcodec/cook.c
	libavcodec/dvdata.c
	libavcodec/dvdsubdec.c
	libavcodec/lagarith.c
	libavcodec/lagarithrac.c
	libavcodec/utils.c
	tests/fate/video.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-29 04:11:10 +02:00
Ronald S. Bultje
a940198130 cabac: add overread protection to BRANCHLESS_GET_CABAC().
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
2012-03-28 08:01:29 -07:00
Ronald S. Bultje
448dc42571 cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC(). 2012-03-28 08:01:29 -07:00
Ronald S. Bultje
16f6e83f74 cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE(). 2012-03-28 08:01:29 -07:00
Ronald S. Bultje
951014e5bb cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC(). 2012-03-28 08:01:29 -07:00
Ronald S. Bultje
a0bdcb019e h264: add overread protection to get_cabac_bypass_sign_x86(). 2012-03-28 08:01:29 -07:00
Ronald S. Bultje
95bfa4ead7 h264: reindent get_cabac_bypass_sign_x86(). 2012-03-28 08:01:29 -07:00
Ronald S. Bultje
db025929f2 h264: use struct offsets in get_cabac_bypass_sign_x86(). 2012-03-28 08:01:29 -07:00
Michael Niedermayer
7e496e1545 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  build: ppc: drop stray leftover backslash
  build: Only clean the architecture subdirectory we build for.
  build: drop some unnecessary dependencies from the H.264 parser
  build: prettyprinting cosmetics
  libavutil: Remove pointless rational test program.
  libavutil: Remove broken and pointless lzo test program.
  lavf doxy: expand AVStream.codec doxy.
  lavf doxy: improve AVStream.time_base doxy.
  lavf doxy: add some basic documentation about reading from the demuxer.
  lavf doxy: document passing options to demuxers.
  lavf doxy: clarify that an AVPacket contains encoded data.
  mpegtsenc: allow user triggered PES packet flushing
  APIchanges: mark the place where 0.7 was cut.
  APIchanges: mark the place where 0.8 was cut.
  APIchanges: fill in missing dates and hashes.
  smacker: convert palette and header reading to bytestream2.
  alac: convert extradata reading to bytestream2.

Conflicts:
	doc/APIchanges
	libavcodec/smacker.c
	libavcodec/x86/Makefile
	libavfilter/Makefile
	libavutil/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-26 20:52:52 +02:00
Diego Biurrun
ad0e31f134 build: prettyprinting cosmetics 2012-03-26 13:00:10 +02:00
Michael Niedermayer
9621646eb3 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: dsputil: prettyprint gcc inline asm
  x86: K&R prettyprinting cosmetics for dsputil_mmx.c
  x86: conditionally compile H.264 QPEL optimizations
  dsputil_mmx: Surround QPEL macros by "do { } while (0);" blocks.
  Ignore generated files below doc/.
  dpcm: convert to bytestream2.
  interplayvideo: convert to bytestream2.
  movenc: Merge if statements
  h264: fix memleak in error path.
  pthread: Immediately release all frames in ff_thread_flush()
  h264: Add check for invalid chroma_format_idc
  utvideo: port header reading to bytestream2.

Conflicts:
	.gitignore
	configure
	libavcodec/h264_ps.c
	libavcodec/interplayvideo.c
	libavcodec/pthread.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-25 23:10:30 +02:00
Diego Biurrun
62ce9defb8 x86: dsputil: prettyprint gcc inline asm 2012-03-25 11:50:48 +02:00
Diego Biurrun
3b54912113 x86: K&R prettyprinting cosmetics for dsputil_mmx.c 2012-03-25 11:50:48 +02:00
Diego Biurrun
915a2a0a65 x86: conditionally compile H.264 QPEL optimizations 2012-03-25 11:50:45 +02:00
Diego Biurrun
3816642eab dsputil_mmx: Surround QPEL macros by "do { } while (0);" blocks.
This makes them safe to use in non-fully braced if-blocks and similar.
2012-03-25 11:48:37 +02:00
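The idiom this commit applies can be illustrated with a small C sketch. `SWAP_INT` and `order_pair` are hypothetical names invented for this example and are unrelated to the actual QPEL macros; the sketch follows the common form of the idiom, in which the macro body ends at `while (0)` and the caller supplies the semicolon.

```c
#include <assert.h>

/* Multi-statement macro wrapped so that it expands to a single statement. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Puts two values in ascending order; returns 1 if they were swapped. */
int order_pair(int *lo, int *hi)
{
    /* Unbraced if/else: this compiles because the macro plus the trailing
     * semicolon forms exactly one statement. */
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        return 0;   /* already ordered */
    return 1;       /* swapped */
}

/* Small self-check doubling as a usage example. */
void order_pair_check(void)
{
    int a = 5, b = 3;
    assert(order_pair(&a, &b) == 1);
    assert(a == 3 && b == 5);
    assert(order_pair(&a, &b) == 0);
}
```

Had the macro been written with plain braces, the semicolon after the call would terminate the `if` statement and leave the `else` without a matching `if`.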
Carl Eugen Hoyos
5cddfc58d8 Fix linking without yasm. 2012-03-24 14:54:06 +01:00
Michael Niedermayer
f58f75dd92 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rv34: error out on size changes with frame threading
  aacsbr: Add a debug check to sbr_mapping.
  aac: Reset some state variables when turning SBR off
  aac: Reset PS parameters on header decode failure.
  fate: add wmalossless test.
  aacsbr: handle m_max values smaller than 4.

Conflicts:
	libavcodec/aacsbr.c
	tests/fate/lossless-audio.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-24 10:59:43 +01:00
Ronald S. Bultje
71ea26811c aacsbr: handle m_max values smaller than 4.
Prevents a signflip in the counter, and a subsequent crash because of
overreads/overwrites.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
2012-03-23 12:56:08 -07:00
Reimar Döffinger
adb98a3d22 VC1: restore optimizations broken in 9a1ced32.
They were moved into code under HAVE_YASM and most of them
even into completely disabled code with no reason given
for that in the commit message.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-03-23 19:39:02 +01:00
ami_stuff
f6b7863808 Replace SSE2 instruction in scalarproduct_float_sse() by SSE equivalent.
Fixes an AAC decoding issue with the sample from ticket #213 on machines
with SSE but without SSE2.
Based on 89411a by Reimar.
2012-03-22 19:28:52 +01:00
Reimar Döffinger
89411ae699 Replace SSE2 instruction by SSE equivalent.
This is even potentially faster in this use-case.
Should fix AAC SBR decoding on machines with SSE but not
SSE2, fixing trac issue #1041.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-03-21 20:14:50 +01:00
Michael Niedermayer
219a6fb61c dsp: fix diff_bytes_mmx() with small width
Fixes Ticket1068

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-17 20:48:56 +01:00
Michael Niedermayer
dd2631a6df dsputil: mark source of diff_bytes as const.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-15 22:17:24 +01:00
Michael Niedermayer
1bc85fb32d dirac: mark some variables const.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-15 22:12:02 +01:00
Nico Weber
599888a480 Move struc FFTContext below SECTION_RODATA
Yasm creates an implicit unaligned text section if "struc" is used
outside of any section:
http://tortall.lighthouseapp.com/projects/78676-yasm/tickets/247

Since yasm only honors the "align" annotation on the first declaration
of a section, this implicit text section causes all text section
alignments to be ignored. Also fixes a yasm warning about it ignoring
alignment.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-12 21:54:37 +01:00
Michael Niedermayer
c3c2db49a7 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  cook: expand dither_tab[], and make sure indexes into it don't overflow.
  xxan: reindent xan_unpack_luma().
  xxan: protect against chroma LUT overreads.
  xxan: convert to bytestream2 API.
  xxan: don't read before start of buffer in av_memcpy_backptr().
  vp8: convert mbedge loopfilter x86 assembly to use named arguments.
  vp8: convert inner loopfilter x86 assembly to use named arguments.

Conflicts:
	libavcodec/xxan.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-11 01:12:52 +01:00
Ronald S. Bultje
a928ed3751 vp8: convert mbedge loopfilter x86 assembly to use named arguments. 2012-03-10 11:36:33 -08:00
Ronald S. Bultje
bee330e300 vp8: convert inner loopfilter x86 assembly to use named arguments. 2012-03-10 11:36:33 -08:00
Michael Niedermayer
bf807a5e87 Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  sbrdsp.asm: convert all instructions to float/SSE ones.
  dv: cosmetics.
  dv: check buffer size before reading profile.
  Revert "AAC SBR: group some writes."
  udp: Print an error message if bind fails
  cook: extend channel uncoupling tables so the full bit range is covered.
  roqvideo: cosmetics.
  roqvideo: convert to bytestream2 API.
  dca: don't use av_clip_uintp2().
  wmall: fix build with -DDEBUG enabled.
  smc: port to bytestream2 API.
  AAC SBR: group some writes.
  dsputil: remove shift parameter from scalarproduct_int16
  SBR DSP: unroll sum_square
  rv34: remove dead code in intra availability check
  rv34: clean a bit availability checks.
  v4l2: update documentation
  tgq: convert to bytestream2 API.
  parser: remove forward declaration of MpegEncContext
  dca: prevent accessing static arrays with invalid indexes.
  ...

Conflicts:
	doc/indevs.texi
	libavcodec/Makefile
	libavcodec/dca.c
	libavcodec/dvdata.c
	libavcodec/eatgq.c
	libavcodec/mmvideo.c
	libavcodec/roqvideodec.c
	libavcodec/smc.c
	libswscale/output.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-08 02:51:45 +01:00
Reimar Döffinger
6eda85e15b sbrdsp.asm: convert all instructions to float/SSE ones.
Since the values are floats, using the float operations
makes sense, improves performance on some CPUs and
makes the code SSE compatible instead of needing SSE2.

Based on suggestion by Jason.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-07 13:50:13 -08:00
Christophe GISQUET
7e1ce6a6ac dsputil: remove shift parameter from scalarproduct_int16
There is only one caller, which does not need the shifting. Other use cases
are situations where different roundings would be needed.

The x86 and neon versions are modified accordingly.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-07 10:29:52 -08:00
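A shift-free scalar product of this kind might look like the following C sketch. The function names are hypothetical, and `rounded_shift` is just one example of the rounding a caller could apply on its own; it is not taken from the codebase.

```c
#include <assert.h>
#include <stdint.h>

/* Plain dot product of two int16 vectors, with no per-product shift. */
int32_t scalarproduct_int16_sketch(const int16_t *v1, const int16_t *v2,
                                   int order)
{
    int32_t res = 0;
    while (order--)
        res += *v1++ * *v2++;
    return res;
}

/* A caller that needs scaling applies its own rounding to the final sum
 * (round to nearest here; requires shift >= 1). */
int32_t rounded_shift(int32_t sum, int shift)
{
    return (sum + (1 << (shift - 1))) >> shift;
}

/* Small self-check doubling as a usage example. */
void scalarproduct_check(void)
{
    const int16_t v1[4] = { 1, 2, 3, 4 };
    const int16_t v2[4] = { 5, 6, 7, 8 };
    int32_t sum = scalarproduct_int16_sketch(v1, v2, 4);
    assert(sum == 70);                     /* 5 + 12 + 21 + 32 */
    assert(rounded_shift(sum, 2) == 18);   /* (70 + 2) >> 2 */
}
```

Keeping the shift out of the inner loop leaves the rounding policy to each caller, which matches the commit's remark that other use cases would need different roundings.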
Diego Biurrun
1e9d55e45e x86: Remove duplicated AVG_3DNOW_OP / AVG_MMX2_OP macros from h264_qpel_mmx.c. 2012-03-07 09:36:04 +01:00
Michael Niedermayer
6df42f9874 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  SBR DSP: fix SSE code to not use SSE2 instructions.
  cpu: initialize mask to -1, so that by default, optimizations are used.
  error_resilience: initialize s->block_index[].
  svq3: protect against negative quantizers.
  Don't use ff_cropTbl[] for IDCT.
  swscale: make filterPos 32bit.
  FATE: add CPUFLAGS variable, mapping to -cpuflags avconv option.
  avconv: add -cpuflags option for setting supported cpuflags.
  cpu: add av_set_cpu_flags_mask().
  libx264: Allow overriding the sliced threads option
  avconv: fix counting encoded video size.

Conflicts:
	doc/APIchanges
	doc/fate.texi
	doc/ffmpeg.texi
	ffmpeg.c
	libavcodec/h264idct_template.c
	libavcodec/svq3.c
	libavutil/avutil.h
	libavutil/cpu.c
	libavutil/cpu.h
	libswscale/swscale.c
	tests/Makefile
	tests/fate-run.sh
	tests/regression-funcs.sh

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-07 03:22:49 +01:00
Reimar Döffinger
b5161908e0 SBR DSP: fix SSE code to not use SSE2 instructions.
movq from SSE register _to_ memory is an SSE2 instruction.
Use the SSE movlps function instead that does the same thing.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-06 13:40:35 -08:00
Michael Niedermayer
f095391a14 Merge remote-tracking branch 'qatar/master'
* qatar/master: (31 commits)
  cdxl demux: do not create packets with uninitialized data at EOF.
  Replace computations of remaining bits with calls to get_bits_left().
  amrnb/amrwb: Remove get_bits usage.
  cosmetics: reindent
  avformat: do not require a pixel/sample format if there is no decoder
  avformat: do not fill-in audio packet duration in compute_pkt_fields()
  lavf: Use av_get_audio_frame_duration() in get_audio_frame_size()
  dca_parser: parse the sample rate and frame durations
  libspeexdec: do not set AVCodecContext.frame_size
  libopencore-amr: do not set AVCodecContext.frame_size
  alsdec: do not set AVCodecContext.frame_size
  siff: do not set AVCodecContext.frame_size
  amr demuxer: do not set AVCodecContext.frame_size.
  aiffdec: do not set AVCodecContext.frame_size
  mov: do not set AVCodecContext.frame_size
  ape: do not set AVCodecContext.frame_size.
  rdt: remove workaround for infinite loop with aac
  avformat: do not require frame_size in avformat_find_stream_info() for CELT
  avformat: do not require frame_size in avformat_find_stream_info() for MP1/2/3
  avformat: do not require frame_size in avformat_find_stream_info() for AAC
  ...

Conflicts:
	doc/APIchanges
	libavcodec/Makefile
	libavcodec/avcodec.h
	libavcodec/h264.c
	libavcodec/h264_ps.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/dsputil_mmx.c
	libavformat/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-06 06:03:32 +01:00
Mans Rullgard
356ee8d7de x86: clean up ff_dsputil_init_mmx()
This splits ff_dsputil_init_mmx() into multiple functions, one for
each MMX/SSE level, somewhat simplifying the nested conditions.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-03-05 14:40:03 +01:00
Michael Niedermayer
2af8f2cea6 Merge remote-tracking branch 'qatar/master'
* qatar/master: (27 commits)
  cmdutils: use new avcodec_is_decoder/encoder() functions.
  lavc: make codec_is_decoder/encoder() public.
  lavc: deprecate AVCodecContext.sub_id.
  libcdio: add a forgotten AVClass to the private context.
  swscale: remove "cpu flags" from -sws_flags description.
  proresenc: give user a possibility to alter some encoding parameters
  vorbisenc: add output buffer overwrite protection
  libopencore-amrnbenc: fix end-of-stream handling
  ra144enc: fix end-of-stream handling
  nellymoserenc: zero any leftover packet bytes
  nellymoserenc: use proper MDCT overlap delay
  qpeg: Use bytestream2 functions to prevent buffer overreads.
  swscale: make %rep unconditional.
  vp8: convert simple loopfilter x86 assembly to use named arguments.
  vp8: convert idct x86 assembly to use named arguments.
  vp8: convert mc x86 assembly to use named arguments.
  vp8: convert loopfilter x86 assembly to use cpuflags().
  vp8: convert idct/mc x86 assembly to use cpuflags().
  swscale: remove now unnecessary hack.
  x86inc: don't "bake" stack_offset in named arguments.
  ...

Conflicts:
	cmdutils.c
	doc/APIchanges
	libavcodec/mpeg12.c
	libavcodec/options.c
	libavcodec/qpeg.c
	libavcodec/utils.c
	libavcodec/version.h
	libavdevice/libcdio.c
	tests/lavf-regression.sh

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-05 00:15:55 +01:00
Ronald S. Bultje
b4188f0d46 vp8: convert simple loopfilter x86 assembly to use named arguments. 2012-03-03 20:40:00 -08:00
Ronald S. Bultje
8476ca3b4e vp8: convert idct x86 assembly to use named arguments. 2012-03-03 20:40:00 -08:00
Ronald S. Bultje
21ffc78fd7 vp8: convert mc x86 assembly to use named arguments. 2012-03-03 20:40:00 -08:00
Ronald S. Bultje
28170f1a39 vp8: convert loopfilter x86 assembly to use cpuflags(). 2012-03-03 20:40:00 -08:00
Ronald S. Bultje
e25be47154 vp8: convert idct/mc x86 assembly to use cpuflags(). 2012-03-03 20:39:59 -08:00
Michael Niedermayer
268098d8b2 Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  amrwb: remove duplicate arguments from extrapolate_isf().
  amrwb: error out early if mode is invalid.
  h264: change underread for 10bit QPEL to overread.
  matroska: check buffer size for RM-style byte reordering.
  vp8: disable mmx functions with sse/sse2 counterparts on x86-64.
  vp8: change int stride to ptrdiff_t stride.
  wma: fix invalid buffer size assumptions causing random overreads.
  Windows Media Audio Lossless decoder
  rv10/20: Fix slice overflow with checked bitstream reader.
  h263dec: Disallow width/height changing with frame threads.
  rv10/20: Fix a buffer overread caused by losing track of the remaining buffer size.
  rmdec: Honor .RMF tag size rather than assuming 18.
  g722: Fix the QMF scaling
  r3d: don't set codec timebase.
  electronicarts: set timebase for tgv video.
  electronicarts: parse the framerate for cmv video.
  ogg: don't set codec timebase
  electronicarts: don't set codec timebase
  avs: don't set codec timebase
  wavpack: Fix an integer overflow
  ...

Conflicts:
	libavcodec/arm/vp8dsp_init_arm.c
	libavcodec/fraps.c
	libavcodec/h264.c
	libavcodec/mpeg4videodec.c
	libavcodec/mpegvideo.c
	libavcodec/msmpeg4.c
	libavcodec/pnmdec.c
	libavcodec/qpeg.c
	libavcodec/rawenc.c
	libavcodec/ulti.c
	libavcodec/vcr1.c
	libavcodec/version.h
	libavcodec/wmalosslessdec.c
	libavformat/electronicarts.c
	libswscale/ppc/yuv2rgb_altivec.c
	tests/ref/acodec/g722
	tests/ref/fate/ea-cmv

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-03 00:23:10 +01:00
Ronald S. Bultje
291c9b6285 h264: change underread for 10bit QPEL to overread.
This prevents us from reading before the start of the buffer, and thus
prevents crashes resulting from this behaviour. Fixes bug 237.
2012-03-02 10:33:05 -08:00
Ronald S. Bultje
45549339bc vp8: disable mmx functions with sse/sse2 counterparts on x86-64.
x86-64 is guaranteed to have at least SSE2, therefore the MMX/MMX2
functions will never be used in practice.
2012-03-02 10:32:05 -08:00
Ronald S. Bultje
bd66f073fe vp8: change int stride to ptrdiff_t stride.
On 64bit platforms with 32bit int, this means we won't have to sign-
extend the integer anymore.
2012-03-02 10:31:50 -08:00
Michael Niedermayer
e3822886eb Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avcodec_default_reget_buffer(): fix compilation in DEBUG mode
  fate: Overhaul WavPack coverage
  h264: fix mmxext chroma deblock to use correct TC values.
  flvdec: Remove the now redundant check for known broken metadata creator
  flvdec: Validate index entries added from metadata while reading
  rtsp: Handle requests from server to client
  movenc: use timestamps instead of frame_size for samples-per-packet
  movenc: use the first cluster duration as the tfhd default duration
  movenc: factorize calculation of cluster duration into a separate function
  doc/APIchanges: fill in missing dates and hashes.
  lavc: reorder AVCodecContext fields.
  lavc: reorder AVFrame fields.

Conflicts:
	doc/APIchanges
	libavcodec/avcodec.h
	libavformat/flvdec.c
	libavformat/movenc.c
	tests/fate/lossless-audio.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-28 03:38:58 +01:00
Ronald S. Bultje
b0c4f04338 h264: fix mmxext chroma deblock to use correct TC values. 2012-02-27 09:38:44 -08:00
Michael Niedermayer
b008ac18bb Merge remote-tracking branch 'qatar/master'
* qatar/master:
  docs: use -bsf:[vas] instead of -[vas]bsf.
  mpegaudiodec: Prevent premature clipping of mp3 input buffer.
  lavf: move the packet keyframe setting code.
  oggenc: free comment header for all codecs
  lcl: error out if uncompressed input buffer is smaller than framesize.
  mjpeg: abort decoding if packet is too large.
  golomb: use HAVE_BITS_REMAINING() macro to prevent infloop on EOF.
  get_bits: add HAVE_BITS_REMAINING macro.
  lavf/output-example: use new audio encoding API correctly.
  lavf/output-example: more proper usage of the new API.
  tiff: Prevent overreads in the type_sizes array.
  tiff: Make the TIFF_LONG and TIFF_SHORT types unsigned.
  apetag: do not leak memory if avio_read() fails
  apetag: propagate errors.
  SBR DSP x86: implement SSE sbr_hf_g_filt
  SBR DSP x86: implement SSE sbr_sum_square_sse
  SBR DSP: use intptr_t for the ixh parameter.

Conflicts:
	doc/bitstream_filters.texi
	doc/examples/muxing.c
	doc/ffmpeg.texi
	libavcodec/golomb.h
	libavcodec/x86/Makefile
	libavformat/oggenc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-25 04:00:43 +01:00
Christophe GISQUET
2784d18791 SBR DSP x86: implement SSE sbr_hf_g_filt
Unrolling the main loop to process, instead of 4 elements:
- 8: minor gain of 2 cycles (not worth the extra object size)
- 2: loss of 8 cycles.

Assigning STEP to a register is a loss. Output address (Y) is almost always
unaligned.

Timings:
- C (32/64 bits): 117/109 cycles
- SSE: 57 cycles

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:50:09 -08:00
Christophe GISQUET
34454c761f SBR DSP x86: implement SSE sbr_sum_square_sse
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C  /32bits: 82c (unrolled)/102c
               C  /64bits: 69c (unrolled)/82c
               SSE/32bits: 42c
               SSE/64bits: 31c

Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:50:06 -08:00
Michael Niedermayer
184fc600e1 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpegvideo_enc: only allocate output packet when we know there will be output
  Add names for more channel layouts to the channel layout map.
  sunrast: Add a sample request for RMP_RAW colormap.
  avcodec: do not override pts or duration from the audio encoder
  Add prores regression test.
  Enable already existing rso regression test.
  Add regression test for "sox" format muxer/demuxer.
  Add dpx encoding regression test.
  swscale: K&R formatting cosmetics for PowerPC code (part I/II)
  img2: Use ff_guess_image2_codec(filename) shorthand where appropriate.
  Clarify licensing information about files borrowed from libjpeg.
  Mark mutable static data const where appropriate.
  avplay: fix -threads option
  dvbsubdec: avoid undefined signed left shift in RGBA macro
  mlpdec: use av_log_ask_for_sample()
  gif: K&R formatting cosmetics
  png: make .long_name more descriptive
  movdec: Adjust keyframe flagging in fragmented files
  rv34: change most "int stride" into "ptrdiff_t stride".

Conflicts:
	avprobe.c
	ffplay.c
	libavcodec/mlpdec.c
	libavcodec/mpegvideo_enc.c
	libavcodec/pngenc.c
	libavcodec/x86/v210-init.c
	libavfilter/vf_boxblur.c
	libavfilter/vf_crop.c
	libavfilter/vf_drawtext.c
	libavfilter/vf_lut.c
	libavfilter/vf_overlay.c
	libavfilter/vf_pad.c
	libavfilter/vf_scale.c
	libavfilter/vf_select.c
	libavfilter/vf_setpts.c
	libavfilter/vf_settb.c
	libavformat/img2.c
	libavutil/audioconvert.c
	tests/codec-regression.sh
	tests/lavf-regression.sh
	tests/ref/lavf/dpx
	tests/ref/vsynth1/prores
	tests/ref/vsynth2/prores

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-22 02:24:18 +01:00
Michael Niedermayer
eadd4264ee Merge remote-tracking branch 'qatar/master'
* qatar/master: (36 commits)
  adpcmenc: Use correct frame_size for Yamaha ADPCM.
  avcodec: add ff_samples_to_time_base() convenience function to internal.h
  adx parser: set duration
  mlp parser: set duration instead of frame_size
  gsm parser: set duration
  mpegaudio parser: set duration instead of frame_size
  (e)ac3 parser: set duration instead of frame_size
  flac parser: set duration instead of frame_size
  avcodec: add duration field to AVCodecParserContext
  avutil: add av_rescale_q_rnd() to allow different rounding
  pnmdec: remove useless .pix_fmts
  libmp3lame: support float and s32 sample formats
  libmp3lame: renaming, rearrangement, alignment, and comments
  libmp3lame: use the LAME default bit rate
  libmp3lame: use avpriv_mpegaudio_decode_header() for output frame parsing
  libmp3lame: cosmetics: remove some pointless comments
  libmp3lame: convert some debugging code to av_dlog()
  libmp3lame: remove outdated comment.
  libmp3lame: do not set coded_frame->key_frame.
  libmp3lame: improve error handling in MP3lame_encode_init()
  ...

Conflicts:
	doc/APIchanges
	libavcodec/libmp3lame.c
	libavcodec/pcxenc.c
	libavcodec/pnmdec.c
	libavcodec/pnmenc.c
	libavcodec/sgienc.c
	libavcodec/utils.c
	libavformat/hls.c
	libavutil/avutil.h
	libswscale/x86/swscale_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-21 05:10:12 +01:00
Ronald S. Bultje
3ab9a2a557 rv34: change most "int stride" into "ptrdiff_t stride".
This prevents having to sign-extend on 64-bit systems with 32-bit ints,
such as x86-64. Also fixes crashes on systems where we don't do it and
arguments are not in registers, such as Win64 for all weight functions.
2012-02-20 14:58:25 -08:00
Ronald S. Bultje
8fb26950ed h264: don't use redzone in loopfilter on win64.
Red zone usage is not allowed in the Win64 ABI.
2012-02-19 15:31:03 -08:00
Michael Niedermayer
f9caec0cf9 h264: change deblock_h_chroma_8_mmxext() to prevent valgrind confusion.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-17 21:36:37 +01:00
Michael Niedermayer
8c1ebdcea2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  shorten: Use separate pointers for the allocated memory for decoded samples.
  atrac3: Fix crash in tonal component decoding.
  ws_snd1: Fix wrong samples counts.
  movenc: Don't set a default sample duration when creating ismv
  rtp: Factorize the check for distinguishing RTCP packets from RTP
  golomb: avoid infinite loop on all-zero input (or end of buffer).
  bethsoftvid: synchronize video timestamps with audio sample rate
  bethsoftvid: add audio stream only after getting the first audio packet
  bethsoftvid: Set video packet duration instead of accumulating pts.
  bethsoftvid: set packet key frame flag for audio and I-frame video packets.
  bethsoftvid: fix read_packet() return codes.
  bethsoftvid: pass palette in side data instead of in a separate packet.
  sdp: Ignore RTCP packets when autodetecting RTP streams
  proresenc: initialise 'sign' variable
  mpegaudio: replace memcpy by SIMD code
  vc1: prevent using last_frame as a reference for I/P first frame.

Conflicts:
	libavcodec/atrac3.c
	libavcodec/golomb.h
	libavcodec/shorten.c
	libavcodec/ws-snd1.c
	tests/ref/fate/bethsoft-vid

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-17 00:35:06 +01:00
Christophe GISQUET
f3e084909b mpegaudio: replace memcpy by SIMD code
By replacing memcpy with an unrolled loop using the alignment knowledge
it has, some speedup can be obtained.

Before (gcc 4.6.1): ~400 cycles
After: ~370 cycles

Overall, around 2% speed increase when decoding a 2400s mp3 to f32le.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-15 20:11:54 -08:00
Michael Niedermayer
6cb2085278 Merge remote-tracking branch 'qatar/master'
* qatar/master: (27 commits)
  ppc: Add ff_ prefix to nonstatic symbols
  sh4: Add ff_ prefix to nonstatic symbols
  mpegvideo: Add ff_ prefix to nonstatic functions
  rtjpeg: Add ff_ prefix to nonstatic symbols
  rv: Add ff_ prefix to nonstatic symbols
  vp56: Add ff_ prefix to nonstatic symbols
  vorbis: Add ff_ prefix to nonstatic symbols
  msmpeg4: Add ff_ prefix to nonstatic symbols
  vc1: Add ff_ prefix to nonstatic symbols
  msmpeg4: Add ff_ prefixes to nonstatic symbols
  snow: Add ff_ prefix to nonstatic symbols
  mpeg12: Add ff_ prefix to nonstatic symbols
  mpeg4: Add ff_ prefixes to nonstatic symbols
  lagarith: Add ff_ prefix to lag_rac_init
  libavcodec: Add ff_ prefix to j_rev_dct*
  dsputil: Add ff_ prefix to inv_zigzag_direct16
  libavcodec: Prefix fdct_ifast, fdct_ifast248
  dsputil: Add ff_ prefix to the dsputil*_init* functions
  libavcodec: Add ff_ prefix to some nonstatic symbols
  vlc/rl: Add ff_ prefix to the nonstatic symbols
  ...

Conflicts:
	libavcodec/Makefile
	libavcodec/allcodecs.c
	libavcodec/dnxhddec.c
	libavcodec/ffv1.c
	libavcodec/h263.h
	libavcodec/h263dec.c
	libavcodec/h264.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo_enc.c
	libavcodec/nuv.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/proresdsp.c
	libavcodec/svq3.c
	libavcodec/version.h
	libavformat/dv.h
	libavformat/dvenc.c
	libavformat/matroskadec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-16 01:34:37 +01:00
Martin Storsjö
efd29844eb mpegvideo: Add ff_ prefix to nonstatic functions
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:07:23 +02:00
Martin Storsjö
873c89e2a6 dsputil: Add ff_ prefix to inv_zigzag_direct16
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:06:42 +02:00
Martin Storsjö
9cf0841ef3 dsputil: Add ff_ prefix to the dsputil*_init* functions
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:06:34 +02:00
Reimar Döffinger
f51a072160 Fix compilation without HAVE_AVX.
%ifdef HAVE_AVX must now be %if HAVE_AVX.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-02-12 21:42:31 +01:00
Reimar Döffinger
b223035511 Detect and check for CMOV.
Some MMX-only CPUs do not have support for CMOV.
All SSE/MMX2 CPUs should be fine, thus no check was
added to those functions.
See also https://sourceforge.net/tracker/?func=detail&aid=3358347&group_id=205275&atid=992986

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-02-12 18:56:06 +01:00
Michael Niedermayer
a78f6b8cb9 Merge remote-tracking branch 'qatar/master'
* qatar/master: (38 commits)
  v210enc: remove redundant check for pix_fmt
  wavpack: allow user to disable CRC checking
  v210enc: Use Bytestream2 functions
  cafdec: Check return value of avio_seek and avoid modifying state if it fails
  yop: Check return value of avio_seek and avoid modifying state if it fails
  tta: Check return value of avio_seek and avoid modifying state if it fails
  tmv: Check return value of avio_seek and avoid modifying state if it fails
  r3d: Check return value of avio_seek and avoid modifying state if it fails
  nsvdec: Check return value of avio_seek and avoid modifying state if it fails
  mpc8: Check return value of avio_seek and avoid modifying state if it fails
  jvdec: Check return value of avio_seek and avoid modifying state if it fails
  filmstripdec: Check return value of avio_seek and avoid modifying state if it fails
  ffmdec: Check return value of avio_seek and avoid modifying state if it fails
  dv: Check return value of avio_seek and avoid modifying state if it fails
  bink: Check return value of avio_seek and avoid modifying state if it fails
  Check AVCodec.pix_fmts in avcodec_open2()
  svq3: Prevent illegal reads while parsing extradata.
  remove ParseContext1
  vc1: use ff_parse_close
  mpegvideo parser: move specific fields into private context
  ...

Conflicts:
	libavcodec/4xm.c
	libavcodec/aacdec.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/mpeg4video_parser.c
	libavcodec/svq3.c
	libavcodec/v210enc.c
	libavformat/cafdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-11 01:22:22 +01:00
Reimar Döffinger
394d41ee30 Partially revert "Fix png decoding on x86."
This partially reverts commit 58dabf7bf2.
It is no longer necessary to use unaligned mov.
The swapped mov argument fix remains though.
2012-02-10 23:18:52 +01:00
Justin Ruggles
d483bb58c3 ac3dsp: do not use pshufb in ac3_extract_exponents_ssse3()
We need to do unsigned saturation in order to cover the corner case when the
absolute coefficient value is 16777215 (the maximum value).

Fixes Bug #216
2012-02-09 21:04:44 -05:00
Michael Niedermayer
8c6ebab747 Merge remote-tracking branch 'qatar/master'
* qatar/master: (26 commits)
  eac3dec: replace undefined 1<<31 with INT32_MIN in noise generation
  yadif: specify array size outside DECLARE_ALIGNED
  prores: specify array size outside DECLARE_ALIGNED brackets.
  WavPack demuxer: set packet duration
  tta: use skip_bits_long()
  mxfdec: Ignore the last entry in Avid's index table segments
  mxfdec: Sanity-check SampleRate
  mxfdec: Handle small EditUnitByteCount
  mxfdec: Consider OPAtom files that do not have exactly one EC to be OP1a
  mxfdec: Don't crash in mxf_packet_timestamps() if current_edit_unit overflows
  mxfdec: Zero nb_ptses in mxf_compute_ptses_fake_index()
  mxfdec: Sanity check PreviousPartition
  mxfdec: Never seek back in local sets and KLVs
  mxfdec: Move the current_partition check inside mxf_read_header()
  mxfdec: Fix infinite loop in mxf_packet_timestamps()
  mxfdec: Check eof_reached in mxf_read_local_tags()
  mxfdec: Check for NULL component
  mxfdec: Make sure mxf->nb_index_tables > 0 in mxf_packet_timestamps()
  mxfdec: Make sure x < index_table->nb_ptses
  build: Add missing directories to DIRS declarations.
  ...

Conflicts:
	doc/build_system.txt
	doc/fate.texi
	libavfilter/x86/yadif_template.c
	libavformat/mxfdec.c
	libavutil/Makefile
	tests/fate/audio.mak
	tests/fate/prores.mak
	tests/fate/screen.mak
	tests/fate/video.mak
	tests/ref/fate/bethsoft-vid
	tests/ref/fate/cscd
	tests/ref/fate/dfa4
	tests/ref/fate/nuv
	tests/ref/fate/vp8-sign-bias
	tests/ref/fate/wmv8-drm
	tests/ref/lavf/gxf

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-10 01:20:07 +01:00
Diego Biurrun
0bba26466f cosmetics: Delete empty lines at end of file. 2012-02-09 12:26:45 +01:00
Michael Niedermayer
f2b20b7a8b Merge remote-tracking branch 'qatar/master'
* qatar/master:
  pixdesc: mark pseudopaletted formats with a special flag.
  avconv: switch to avcodec_encode_video2().
  libx264: implement encode2().
  libx264: split extradata writing out of encode_nals().
  lavc: add avcodec_encode_video2() that encodes from an AVFrame -> AVPacket
  cmdutils: update copyright year to 2012.
  swscale: sign-extend integer function argument to qword on x86-64.
  x86inc: support yasm -f win64 flag also.
  h264: manually save/restore XMM registers for functions using INIT_MMX.
  x86inc: allow manual use of WIN64_SPILL_XMM.
  aacdec: Use correct speaker order for 7.1.
  aacdec: Remove incorrect comment.
  aacdec: Simplify output configuration.
  Remove Sun medialib glue code.
  dsputil: set STRIDE_ALIGN to 16 for x86 also.
  pngdsp: swap argument inversion.

Conflicts:
	cmdutils.c
	configure
	doc/APIchanges
	ffmpeg.c
	libavcodec/aacdec.c
	libavcodec/dsputil.h
	libavcodec/libx264.c
	libavcodec/mlib/dsputil_mlib.c
	libavcodec/utils.c
	libavfilter/vf_scale.c
	libavutil/avutil.h
	libswscale/mlib/yuv2rgb_mlib.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-09 01:27:12 +01:00
Ronald S. Bultje
ce1e250ee9 h264: manually save/restore XMM registers for functions using INIT_MMX.
On Win64, these registers are callee-save, so not saving/restoring them
correctly is a violation of ABI and can lead to crashes or corrupt data.
2012-02-08 10:31:14 -08:00
Michael Niedermayer
18d0a16fc9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: make yuv2yuv1 use named registers.
  h264: mark h264_idct_add8_10 with number of XMM registers.
  swscale: fix V plane memory location in bilinear/unscaled RGB/YUYV case.
  vp8: always update next_framep[] before returning from decode_frame().
  avconv: estimate next_dts from framerate if it is set.
  avconv: better next_dts usage.
  avconv: rename InputStream.pts to last_dts.
  avconv: reduce overloading for InputStream.pts.
  avconv: rename InputStream.next_pts to next_dts.
  avconv: rework -t handling for encoding.
  avconv: set encoder timebase for subtitles.
  pva-demux test: add -vn
  swscale: K&R formatting cosmetics for SPARC code
  apedec: allow the user to set the maximum number of output samples per call
  apedec: do not unnecessarily zero output samples for mono frames
  apedec: allocate a single flat buffer for decoded samples
  apedec: use sizeof(field) instead of sizeof(type)
  swscale: split C output functions into separate file.
  swscale: Split C input functions into separate file.
  bytestream: Add bytestream2 writing API.

The avconv changes have not been merged yet because they cause massive regressions and bugs.

Conflicts:
	ffmpeg.c
	libavcodec/vp8.c
	libswscale/swscale.c
	libswscale/x86/swscale_template.c
	tests/fate/demux.mak
	tests/ref/lavf/asf
	tests/ref/lavf/avi
	tests/ref/lavf/mkv
	tests/ref/lavf/mpg
	tests/ref/lavf/nut
	tests/ref/lavf/ogg
	tests/ref/lavf/rm
	tests/ref/lavf/ts
	tests/ref/seek/lavf_avi
	tests/ref/seek/lavf_mkv
	tests/ref/seek/lavf_rm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-08 05:53:35 +01:00
Ronald S. Bultje
4ff6dea390 pngdsp: swap argument inversion. 2012-02-07 14:32:26 -08:00
Michael Kostylev
3206cccc0e h264: mark h264_idct_add8_10 with number of XMM registers.
This fixes XMM register clobber problems on Win64.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-07 11:37:13 -08:00
Reimar Döffinger
58dabf7bf2 Fix png decoding on x86.
Line sizes are only 8-byte aligned, so use unaligned loads
for add_bytes_l2 pointers.
Increasing the alignment requirement to 16 seemed a bit extreme
(png may be used for rather small sizes).
Also fix a mov that had its arguments swapped, leading to
add_bytes_l2 being applied on up to 8 bytes too few.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-02-03 23:12:10 +01:00
Reimar Döffinger
da1ba4e88b Fix NASM compilation.
movd needs explicit register size prefix for NASM.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-02-03 20:42:30 +01:00
Michael Niedermayer
d77294c5e4 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  libx264: fix indentation.
  vorbis: fix overflows in floor1[] vector and inverse db table index.
  win64: add a XMM clobber test configure option.
  movdec: Parse the dvc1 atom
  ARM: ac3: fix ac3_bit_alloc_calc_bap_armv6
  swscale: K&R formatting cosmetics for Blackfin code
  frwu: lowercase the FRWU codec name
  movdec: fix dts generation in fragmented files
  fate: make acodec-ac3_fixed test output raw AC3
  APIchanges: add missing commit hashes
  swscale: implement MMX, SSE2 and AVX functions for RGB32 input.
  ra144enc: drop pointless "encoder" from .long_name
  bethsoftvideo: fix palette reading.
  mpc7: use av_fast_padded_malloc()
  mpc7: simplify handling of packet sizes that are not a multiple of 4 bytes
  doc: decoding Forward Uncompressed is supported
  Fix a typo in the x86 asm version of ff_vector_clip_int32()
  pcmenc: Do not set avpkt->size.
  ff_alloc_packet: modify the size of the packet to match the requested size

Conflicts:
	doc/APIchanges
	libavcodec/libx264.c
	libavcodec/mpc7.c
	libavformat/isom.h
	libswscale/Makefile
	libswscale/bfin/yuv2rgb_bfin.c
	tests/ref/fate/bethsoft-vid
	tests/ref/seek/ac3_ac3

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-03 03:51:32 +01:00
KO Myung-Hun
c853124fb0 Use SECTION_TEXT instead of section .text for compatibility
aout does not support 'align='.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-02 21:11:36 +01:00
Ronald S. Bultje
7e4d9d5d45 win64: add a XMM clobber test configure option.
This will be useful to test more aggressively for failures to mark XMM
registers as clobbered in Win64 builds, and prevent regressions thereof.

Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>
2012-02-02 12:00:48 -08:00
Justin Ruggles
236a550c3f Fix a typo in the x86 asm version of ff_vector_clip_int32()
Specifies the correct number of xmm registers used so that they can be saved
and restored on Win64 if necessary.
2012-02-01 19:02:32 -05:00
Michael Niedermayer
a369a6b858 Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  fate: add golomb-test
  golomb-test: K&R formatting cosmetics
  h264: Split h264-test off into a separate file - golomb-test.c.
  h264-test: cleanup: drop timer invocations, commented out code and other cruft
  h264-test: Remove unused DSP and AVCodec contexts and related init calls.
  adpcm: Add missing stdint.h #include to fix standalone header compilation.
  lavf: add functions for accessing the fourcc<->CodecID mapping tables.
  lavc: set AVCodecContext.codec in avcodec_get_context_defaults3().
  lavc: make avcodec_close() work properly on unopened codecs.
  lavc: add avcodec_is_open().
  lavf: rename AVInputFormat.value to raw_codec_id.
  lavf: remove the pointless value field from flv and iv8
  lavc/lavf: remove unnecessary symbols from the symbol version script.
  lavc: reorder AVCodec fields.
  lavf: reorder AVInput/OutputFormat fields.
  mp3dec: Fix a heap-buffer-overflow
  adpcmenc: remove some unneeded casts
  adpcmenc: use int16_t and uint8_t instead of short and unsigned char.
  adpcmenc: fix adpcm_ms extradata allocation
  adpcmenc: return proper AVERROR codes instead of -1
  ...

Conflicts:
	doc/APIchanges
	libavcodec/Makefile
	libavcodec/adpcmenc.c
	libavcodec/avcodec.h
	libavcodec/h264.c
	libavcodec/libavcodec.v
	libavcodec/mpc7.c
	libavcodec/mpegaudiodec.c
	libavcodec/options.c
	libavformat/Makefile
	libavformat/avformat.h
	libavformat/flvdec.c
	libavformat/libavformat.v

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-02-01 02:36:09 +01:00
Michael Niedermayer
151ecc2aec Merge remote-tracking branch 'qatar/master'
* qatar/master: (26 commits)
  avconv: deprecate the -deinterlace option
  doc: Fix the name of the new function
  aacenc: make sure to encode enough frames to cover all input samples.
  aacenc: only use the number of input samples provided by the user.
  wmadec: Verify bitstream size makes sense before calling init_get_bits.
  kmvc: Log into a context at a log level constant.
  mpeg12: Pad framerate tab to 16 entries.
  kgv1dec: Increase offsets array size so it is large enough.
  kmvc: Check palsize.
  nsvdec: Propagate errors
  nsvdec: Be more careful with av_malloc().
  nsvdec: Fix use of uninitialized streams.
  movenc: cosmetics: Get rid of camelCase identifiers
  swscale: more generic check for planar destination formats with alpha
  doc: Document mov/mp4 fragmentation options
  build: Use order-only prerequisites for creating FATE reference file dirs.
  x86 dsputil: provide SSE2/SSSE3 versions of bswap_buf
  rtsp: Remove some unused variables from ff_rtsp_connect().
  avutil: make intfloat api public
  avformat_write_header(): detail error message
  ...

Conflicts:
	doc/APIchanges
	doc/ffmpeg.texi
	doc/muxers.texi
	ffmpeg.c
	libavcodec/kmvc.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_yasm.asm
	libavcodec/x86/pngdsp-init.c
	libavformat/movenc.c
	libavformat/movenc.h
	libavformat/mpegtsenc.c
	libavformat/nsvdec.c
	libavformat/utils.c
	libavutil/avutil.h
	libswscale/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-31 02:46:26 +01:00
Christophe Gisquet
e5c9de2ab7 rv40: x86 SIMD for biweight
Provide MMX, SSE2 and SSSE3 versions, with a fast-path when the weights are
multiples of 512 (which is often the case when the values round up nicely).

*_TIMER report for the 16x16 and 8x8 cases:
C:
9015 decicycles in 16, 524257 runs, 31 skips
2656 decicycles in 8, 524271 runs, 17 skips
MMX:
4156 decicycles in 16, 262090 runs, 54 skips
1206 decicycles in 8, 262131 runs, 13 skips
MMX on fast-path:
2760 decicycles in 16, 524222 runs, 66 skips
995 decicycles in 8, 524252 runs, 36 skips
SSE2:
2163 decicycles in 16, 262131 runs, 13 skips
832 decicycles in 8, 262137 runs, 7 skips
SSE2 with fast path:
1783 decicycles in 16, 524276 runs, 12 skips
711 decicycles in 8, 524283 runs, 5 skips
SSSE3:
2117 decicycles in 16, 262136 runs, 8 skips
814 decicycles in 8, 262143 runs, 1 skips
SSSE3 with fast path:
1315 decicycles in 16, 524285 runs, 3 skips
578 decicycles in 8, 524286 runs, 2 skips

This means around a 4% speedup for some sequences.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-01-30 23:58:25 +01:00
Diego Biurrun
91bafb52ae x86: Give RV40 init file a more suitable name. 2012-01-30 23:58:24 +01:00
Diego Biurrun
c30b198381 x86: Place mm_flags variable declaration below the appropriate #ifdef.
This fixes some unused variable warnings with YASM disabled.
2012-01-30 23:58:23 +01:00
Christophe Gisquet
6b03900382 x86 dsputil: provide SSE2/SSSE3 versions of bswap_buf
While pshufb allows emulating bswap on XMM registers for SSSE3, more
shuffling is needed for SSE2. Alignment is critical, so specific codepaths
are provided for this case.

For the huffyuv sequence "angels_480-huffyuvcompress.avi":
C (using bswap instruction): ~ 55k cycles
SSE2:                        ~ 40k cycles
SSSE3 using unaligned loads: ~ 35k cycles
SSSE3 using aligned loads:   ~ 30k cycles

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-01-30 10:19:55 +01:00
Ronald S. Bultje
af79a0c48a png: add support for bpp>4 to paeth x86 SIMD code.
This fixes playback of e.g. RGB48 (bpp=6) content on x86 CPUs. Fixes
bug 214.
2012-01-29 21:22:50 -08:00
Michael Niedermayer
e1492151fb Merge remote-tracking branch 'qatar/master'
* qatar/master:
  png: add missing #if HAVE_SSSE3 around function pointer assignment.
  imdct36: mark SSE functions as using all 16 XMM registers.
  png: move DSP functions to their own DSP context.
  sunrast: Add a sample request for TIFF, IFF, and Experimental Rastfile formats.
  sunrast: Cosmetics
  sunrast: Remove if (unsigned int < 0) check.
  sunrast: Replace magic number by a macro.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/pngdec.c
	libavcodec/sunrast.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-30 05:20:58 +01:00
Ronald S. Bultje
f91c4b7824 png: add SSE2 version for add_bytes_l2. 2012-01-29 18:52:17 -08:00
Ronald S. Bultje
59f474b49d png: convert DSP functions to yasm. 2012-01-29 18:47:50 -08:00
Ronald S. Bultje
20a7d3178f png: add missing #if HAVE_SSSE3 around function pointer assignment. 2012-01-29 12:31:59 -08:00
Ronald S. Bultje
331e7c4cb3 imdct36: mark SSE functions as using all 16 XMM registers.
On x86-64, it indeed uses all 16 registers (and on x86-32, this gets
clipped to 8). Not marking it properly causes callers of this function
to fail randomly because of XMM register clobbering.
2012-01-29 08:14:05 -08:00
Ronald S. Bultje
e92003514d png: move DSP functions to their own DSP context. 2012-01-29 08:11:18 -08:00
Michael Niedermayer
81ab42a334 dirac_yasm: fix linking failure due to %ifndef
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-28 09:06:03 +01:00
Michael Niedermayer
e37f161e66 Merge remote-tracking branch 'qatar/master'
* qatar/master: (71 commits)
  movenc: Allow writing to a non-seekable output if using empty moov
  movenc: Support adding isml (smooth streaming live) metadata
  libavcodec: Don't crash in avcodec_encode_audio if time_base isn't set
  sunrast: Document the different Sun Raster file format types.
  sunrast: Add a check for experimental type.
  libspeexenc: use AVSampleFormat instead of deprecated/removed SampleFormat
  lavf: remove disabled FF_API_SET_PTS_INFO cruft
  lavf: remove disabled FF_API_OLD_INTERRUPT_CB cruft
  lavf: remove disabled FF_API_REORDER_PRIVATE cruft
  lavf: remove disabled FF_API_SEEK_PUBLIC cruft
  lavf: remove disabled FF_API_STREAM_COPY cruft
  lavf: remove disabled FF_API_PRELOAD cruft
  lavf: remove disabled FF_API_NEW_STREAM cruft
  lavf: remove disabled FF_API_RTSP_URL_OPTIONS cruft
  lavf: remove disabled FF_API_MUXRATE cruft
  lavf: remove disabled FF_API_FILESIZE cruft
  lavf: remove disabled FF_API_TIMESTAMP cruft
  lavf: remove disabled FF_API_LOOP_OUTPUT cruft
  lavf: remove disabled FF_API_LOOP_INPUT cruft
  lavf: remove disabled FF_API_AVSTREAM_QUALITY cruft
  ...

Conflicts:
	doc/APIchanges
	libavcodec/8bps.c
	libavcodec/avcodec.h
	libavcodec/libx264.c
	libavcodec/mjpegbdec.c
	libavcodec/options.c
	libavcodec/sunrast.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/h264_deblock.asm
	libavdevice/libdc1394.c
	libavdevice/v4l2.c
	libavformat/avformat.h
	libavformat/avio.c
	libavformat/avio.h
	libavformat/aviobuf.c
	libavformat/dv.c
	libavformat/mov.c
	libavformat/utils.c
	libavformat/version.h
	libavformat/wtv.c
	libavutil/Makefile
	libavutil/file.c
	libswscale/x86/input.asm
	libswscale/x86/swscale_mmx.c
	libswscale/x86/swscale_template.c
	tests/ref/lavf/ffm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-28 07:53:34 +01:00
Ronald S. Bultje
3b15a6d742 config.asm: change %ifdef directives to %if directives.
This allows combining multiple conditionals in a single statement.
2012-01-27 10:19:57 +08:00
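The motivation given for 3b15a6d742 can be illustrated with a C preprocessor analogy. This is only a sketch: the commit itself concerns yasm's `%ifdef`/`%if`, and the macro names below (`SKETCH_ARCH_X86_64`, `SKETCH_HAVE_AVX`) are hypothetical, not taken from config.asm. The idea is that once every feature macro is always defined as 0 or 1, a value test can combine several conditions in one directive, whereas a "defined" test cannot tell 0 from 1.

```c
/* Hypothetical feature macros: always defined, with value 0 or 1. */
#define SKETCH_ARCH_X86_64 1
#define SKETCH_HAVE_AVX    0

/* Value test: two conditions combined in a single #if. */
static int avx64_path_enabled(void)
{
#if SKETCH_ARCH_X86_64 && SKETCH_HAVE_AVX
    return 1;
#else
    return 0;
#endif
}

/* "Defined" test: true even though the macro's value is 0. */
static int avx_macro_is_defined(void)
{
#ifdef SKETCH_HAVE_AVX
    return 1;
#else
    return 0;
#endif
}
```

With the values above, `avx64_path_enabled()` returns 0 while `avx_macro_is_defined()` returns 1, which is why a switch to value tests goes together with always defining the macros.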
Michael Niedermayer
3c5fe5b527 Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  wma: Clip WMA1 and WMA2 frame length to 11 bits.
  movenc: Don't require frame_size to be set for modes other than mov
  doc: Update APIchanges with info on muxer flushing
  movenc: Reindent a block
  tools: Remove some unnecessary #undefs.
  rv20: prevent calling ff_h263_decode_mba() with unset height/width
  tools: K&R reformatting cosmetics
  Ignore generated aviocat and ismindex tools.
  build: Automatically include architecture-specific library Makefile snippets.
  indeo5: prevent null pointer dereference on broken files
  pktdumper: Use usleep instead of sleep
  cosmetics: Remove some unnecessary block braces.
  Drop unnecessary prefix from *sink* variable and struct names.
  Add a tool for creating smooth streaming manifests
  movdec: Calculate an average bit rate for fragmented streams, too
  movenc: Write the sample rate instead of time scale in the stsd atom
  movenc: Add a separate ismv/isma (smooth streaming) muxer
  movenc: Allow the caller to decide on fragmentation
  libavformat: Add a flag for muxers that support write_packet(NULL) for flushing
  movenc: Add support for writing fragmented mov files
  ...

Conflicts:
	Changelog
	cmdutils.c
	cmdutils.h
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavfilter/Makefile
	libavformat/Makefile
	libavformat/avformat.h
	libavformat/movenc.c
	libavformat/movenc.h
	libavformat/version.h
	tools/graph2dot.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-26 02:23:56 +01:00
Ronald S. Bultje
c3af52fa8b dsputil: use vertical component for drawing bottom edge.
Current code only writes 8 pixels of vertical edge for YUV422, which
causes MC artifacts when subsequent frames use data from that edge.
2012-01-25 18:06:36 +08:00
Reimar Döffinger
7e62315c91 Use correct register size.
Fixes compilation with NASM.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-01-17 08:41:39 +01:00
Michael Niedermayer
67f5650a78 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rv34: add NEON rv34_idct_add
  rv34: 1-pass inter MB reconstruction
  add SMJPEG muxer
  avformat: split out common SMJPEG code
  pictordec: Use bytestream2 functions
  avconv: use avcodec_encode_audio2()
  pcmenc: use AVCodec.encode2()
  avcodec: bump minor version and add APIChanges for the new audio encoding API
  avcodec: Add avcodec_encode_audio2() as replacement for avcodec_encode_audio()
  avcodec: add a public function, avcodec_fill_audio_frame().
  rv34: Intra 16x16 handling
  rv34: Inter/intra MB code split

Conflicts:
	Changelog
	libavcodec/avcodec.h
	libavcodec/pictordec.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/rv34dsp.asm
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-17 02:37:30 +01:00
Christophe GISQUET
9ba9c34024 rv34: 1-pass inter MB reconstruction
Implement 1-pass inverse transform and reconstruction for inter blocks.
2012-01-16 19:26:41 +01:00
Christophe GISQUET
d78062386e rv34: Intra 16x16 handling
Extract processing of intra 16x16 blocks from intra macroblock
processing.
Also implement a function performing inverse transform and block
reconstruction for DC-only blocks in 1 pass instead of 2.
2012-01-16 00:41:51 +01:00
Reimar Döffinger
7a1723086a Fix compilation without HAVE_AVX, HAVE_YASM etc.
At the very least, this should fix warnings about unused static
functions if one or more of these is not defined.
However, compilation itself might break if the compiler does
not optimize the function away completely.
This actually happens with the AVX function: its
function pointer is used in an assignment that is not under
an #if, so it is probably optimized away only after the function
has already been marked as used.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-01-14 23:09:39 +01:00
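The problem described in 7a1723086a can be sketched in C as follows. This is a minimal illustration under assumptions, not the actual FFmpeg code: `HAVE_AVX_DEMO`, `transform_c`, `transform_avx`, and `select_transform` are hypothetical names. The point is that both the optimized function and the assignment that takes its address sit under the same `#if`, so a build without the feature never references the missing symbol.

```c
/* Hypothetical build-time switch; a real build system would set it. */
#ifndef HAVE_AVX_DEMO
#define HAVE_AVX_DEMO 0
#endif

typedef int (*transform_fn)(int);

/* Portable fallback, always compiled. */
static int transform_c(int x)
{
    return x * 2;
}

#if HAVE_AVX_DEMO
/* Stand-in for an optimized variant; only compiled when enabled. */
static int transform_avx(int x)
{
    return x * 2;
}
#endif

static transform_fn select_transform(int cpu_has_avx)
{
    transform_fn fn = transform_c;
#if HAVE_AVX_DEMO
    /* The assignment is guarded too, so nothing refers to
     * transform_avx when it has not been compiled. */
    if (cpu_has_avx)
        fn = transform_avx;
#else
    (void)cpu_has_avx; /* silence unused-parameter warnings */
#endif
    return fn;
}
```

Calling `select_transform(1)(21)` yields 42 in either configuration; only the function that produced it differs.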
Reimar Döffinger
83b12c16af Use correct register size, fixes compilation with NASM.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2012-01-14 17:43:47 +01:00
Michael Niedermayer
b18e17eabf Merge remote-tracking branch 'qatar/master'
* qatar/master: (21 commits)
  utils: Check for extradata size overflows.
  ARM: rv34: fix asm syntax in dc transform functions
  avio: Fix the value of the deprecated URL_FLAG_NONBLOCK
  rv34: fix and optimise frame dependency checking
  rv34: NEON optimised dc only inverse transform
  avprobe: use avio_size() instead of deprecated AVFormatContext.file_size.
  ffmenc: remove references to deprecated AVFormatContext.timestamp.
  lavf: undeprecate read_seek().
  avserver: remove code using deprecated CODEC_CAP_PARSE_ONLY.
  lavc: replace some remaining FF_I_TYPE with AV_PICTURE_TYPE_I
  lavc: ifdef out parse_only AVOption
  nellymoserdec: SAMPLE_FMT -> AV_SAMPLE_FMT
  mpegvideo_enc: ifdef out/replace references to deprecated codec flags.
  riff: remove references to sonic codec ids
  indeo4: add some missing static and const qualifiers
  rv34: DC-only inverse transform
  avconv: use AVFrame.width/height/format instead of corresponding AVCodecContext fields
  lavfi: move version macros to a new installed header version.h
  vsrc_buffer: release the buffer on uninit.
  rgb2rgb: rgb12tobgr12()
  ...

Conflicts:
	avconv.c
	doc/APIchanges
	ffprobe.c
	libavfilter/Makefile
	libavfilter/avfilter.h
	libswscale/rgb2rgb.c
	libswscale/rgb2rgb.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-13 01:29:48 +01:00
Carl Eugen Hoyos
ef3a19d595 Fix compilation with yasm-0.6.2 2012-01-12 16:35:49 +01:00
Christophe GISQUET
3faa303a47 rv34: DC-only inverse transform
When decoding coefficients, detect whether the block is DC-only, and take
advantage of this knowledge to perform DC-only inverse transform.

This is achieved by:
- first, changing the 108x4 element modulo_three_table into a 108 element
  table (kind of base4), and accessing each value using mask and shifts.
- then, checking low bits for 0 (as they represent the presence of higher
  frequency coefficients)

Also provide x86 SIMD code for the DC-only inverse transform.

Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-01-12 09:52:33 +01:00
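The table change described in 3faa303a47 can be sketched as below. The bit layout is an assumption made for illustration (four 2-bit digits per byte, with the first digit in the top bits), and `pack4`, `digit`, and `is_dc_only` are hypothetical helpers rather than functions from the rv34 decoder. The sketch shows the two steps from the message: reading one value with a shift and a mask, and testing the low bits for zero to detect a DC-only block.

```c
#include <stdint.h>

/* Pack four digits (each 0..2) into one byte, 2 bits per digit.
 * Assumed layout: digit 0 (DC) occupies bits 7..6. */
static uint8_t pack4(unsigned d0, unsigned d1, unsigned d2, unsigned d3)
{
    return (uint8_t)((d0 << 6) | (d1 << 4) | (d2 << 2) | d3);
}

/* Extract digit idx (0..3) with a shift and a mask. */
static unsigned digit(uint8_t packed, int idx)
{
    return (packed >> (6 - 2 * idx)) & 3u;
}

/* The low 6 bits hold the three non-DC digits; if they are all
 * zero, only the DC coefficient can be present. */
static int is_dc_only(uint8_t packed)
{
    return (packed & 0x3F) == 0;
}
```

For example, `pack4(2, 0, 0, 0)` is DC-only under this layout, while `pack4(1, 0, 2, 1)` is not, and `digit(pack4(1, 0, 2, 1), 2)` recovers the value 2.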
Michael Niedermayer
794006f8fe Merge remote-tracking branch 'qatar/master'
* qatar/master:
  fft: init functions with INIT_XMM/YMM.
  pcmenc: set frame_size to 0.
  gsm demuxer: use generic seeking instead of a gsm-specific function.
  gsm demuxer: return packets with only 1 gsm block at a time.
  avcodec: add GSM parser
  doc: Replace ffmpeg references in avserver config file by avconv.
  doc: Fix names of av_log color environment variables.
  Fix a bunch of platform name and other typos.
  Add some missing changelog entries and release 0.8_beta2
  No longer build libpostproc by default
  wtv: fix memleaks during normal operation
  threads: add CODEC_CAP_AUTO_THREADS for libvpx and xavs

Conflicts:
	Changelog
	RELEASE
	cmdutils.c
	configure
	doc/ffserver.conf
	doc/platform.texi
	ffplay.c
	libavcodec/Makefile
	libavcodec/version.h
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-12 01:10:32 +01:00
Michael Niedermayer
5387f9917f cabac: Try to disable problematic ASM for gcc-llvm 4.2.1
This should fix compilation with gcc-llvm (see darwin fate box)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-11 22:30:21 +01:00
Henrik Gramner
e7d02b04dc fft: init functions with INIT_XMM/YMM.
This is required to handle clobbering of XMM registers on Win64
correctly. Fixes FFT and all tests depending on FFT on Win64.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-01-11 20:12:26 +01:00
Michael Niedermayer
dd3ca3ea15 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  fate: Add tests for more AAC features.
  aacps: Add missing newline in error message.
  fate: Add tests for vc1/wmapro in ism.
  aacdec: Add a fate test for 5.1 channel SBR.
  aacdec: Turn off PS for multichannel files that use PCE based configs.
  cabac: remove put_cabac_u/ueg from cabac-test.
  swscale: RGB4444 and BGR444 input
  FATE: add test for xWMA demuxer.
  FATE: add test for SMJPEG demuxer and associated IMA ADPCM audio decoder.
  mpegaudiodec: optimized iMDCT transform
  mpegaudiodec: change imdct window arrangement for better pointer alignment
  mpegaudiodec: move imdct and windowing function to mpegaudiodsp
  mpegaudiodec: interleave iMDCT buffer to simplify future SIMD implementations
  swscale: convert yuy2/uyvy/nv12/nv21ToY/UV from inline asm to yasm.
  FATE: test to exercise WTV demuxer.
  mjpegdec: K&R formatting cosmetics
  swscale: K&R formatting cosmetics for code examples
  swscale: K&R reformatting cosmetics for header files
  FATE test: cvid-grayscale; ensures that the grayscale Cinepak variant is exercised.

Conflicts:
	libavcodec/cabac.c
	libavcodec/mjpegdec.c
	libavcodec/mpegaudiodec.c
	libavcodec/mpegaudiodsp.c
	libavcodec/mpegaudiodsp.h
	libavcodec/mpegaudiodsp_template.c
	libavcodec/x86/Makefile
	libavcodec/x86/imdct36_sse.asm
	libavcodec/x86/mpegaudiodec_mmx.c
	libswscale/swscale-test.c
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/x86/swscale_template.c
	tests/fate/demux.mak
	tests/fate/microsoft.mak
	tests/fate/video.mak
	tests/fate/wma.mak
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-10 03:50:41 +01:00
Michael Niedermayer
f247f4cf47 cabac: 3rd try at working around a compiler bug in clang.
Switch to a broader detection of versions.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-09 16:04:26 +01:00
Michael Niedermayer
444632eae6 cabac: Disable get_cabac_inline_x86() for clang 2.9 on x86_32
This should finally fix the compilation issue on darwin

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-09 03:43:56 +01:00
Michael Niedermayer
2138a89e71 Revert "Revert commit 599b4c6efddaed33b1667c386b34b07729ba732b"
This reverts commit c4f237a981.
This didn't fix compilation on darwin with current clang.
2012-01-09 03:32:06 +01:00
Vitor Sessak
39df0c434c mpegaudiodec: optimized iMDCT transform
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-01-08 17:40:55 -08:00
Michael Niedermayer
c4f237a981 Revert commit 599b4c6efd
    Author: Mans Rullgard <mans@mansr.com>
    Date:   Sun Dec 11 21:41:59 2011 +0000

        x86: cabac: replace explicit memory references with "m" operands

        This replaces the explicit offset(reg) memory references with
        "m" operands for the same locations.  As a result, one fewer
        register operand is needed for these inline asm statements.

This change appears to have broken compilation on darwin. Subsequent
fixes by Martin did not restore compilation and removed the register
advantage, so this change does not seem worth keeping.
See: http://fate.ffmpeg.org/log.cgi?time=20120103122446&log=compile&slot=i386-darwin-llvm-gcc-4.2.1

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-01-06 01:46:51 +01:00
Michael Niedermayer
0e5fbbd776 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mpegvideo_enc: K&R cosmetics
  doxygen: remove unreplaced variables from custom header and footer
  threads: test for sys/param.h and include it for sysctl on OpenBSD
  v4l2: remove unneeded linux specific asm/types.h include
  x86: Fix constraints for decode_significance*_x86

Conflicts:
	libavcodec/mpegvideo_enc.c
	libavdevice/v4l2.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-12-28 02:38:33 +01:00
Martin Storsjö
676a9ee1d2 x86: Fix constraints for decode_significance*_x86
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:

ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)

Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.

This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.

Signed-off-by: Martin Storsjö <martin@martin.st>
2011-12-27 09:05:14 +02:00
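The simplification argued in 676a9ee1d2 can be checked by brute force. The sketch below models the quoted macros as booleans; the function names are hypothetical, and the `!defined(BROKEN_RELOCATIONS)` term is left out because it is common to every variant. On 32-bit (ARCH_X86_64 = 0) it counts the configurations in which a candidate condition disagrees with the original `HAVE_7REGS && HAVE_EBX_AVAILABLE`.

```c
#include <stdbool.h>

/* Mirrors the HAVE_7REGS definition quoted in the commit message. */
static bool have_7regs(bool arch_x86_64, bool ebx_available, bool ebp_available)
{
    return arch_x86_64 || (ebx_available && ebp_available);
}

/* The relevant part of the original guard. */
static bool original_guard(bool x64, bool ebx, bool ebp)
{
    return have_7regs(x64, ebx, ebp) && ebx;
}

/* Count 32-bit configurations where the candidate differs from the
 * original guard. use_7regs selects HAVE_7REGS as the candidate;
 * otherwise the candidate is HAVE_EBX_AVAILABLE alone. */
static int mismatches_32bit(bool use_7regs)
{
    int n = 0;
    for (int ebx = 0; ebx <= 1; ebx++) {
        for (int ebp = 0; ebp <= 1; ebp++) {
            bool orig = original_guard(false, ebx, ebp);
            bool cand = use_7regs ? have_7regs(false, ebx, ebp) : (bool)ebx;
            n += orig != cand;
        }
    }
    return n;
}
```

`mismatches_32bit(true)` is 0, while `mismatches_32bit(false)` is 1. The single disagreement is the case HAVE_EBX_AVAILABLE = 1 and HAVE_EBP_AVAILABLE = 0, which is the configuration the commit message says it fixes.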
Michael Niedermayer
52c522c720 Merge remote-tracking branch 'qatar/master'
* qatar/master: (27 commits)
  asfdec: add side data to ASFStream packet instead of output packet.
  idroqdec: set AVFMTCTX_NOHEADER and create streams as they occur.
  nellymoserdec: Indicate that the decoder can handle changed parameters
  libavcodec: Apply parameter change side data when decoding audio
  flvdec: Add param change side data if the sample rate or channels have changed
  libavformat: Add a utility function for adding parameter change side data
  libavcodec: Define a side data type for parameter changes
  aacdec: Handle new extradata passed as side data
  flvdec: Export new AAC/H.264 extradata as side data on the next packet
  libavcodec: Define a side data type for new extradata
  flacdec: skip all track indices at once instead of looping.
  mxf: Add PictureEssenceCoding UL for V210.
  mxfdec: consider QuantizationBits between 17 and 24 to be pcm_s24*
  mxfenc: Add support for MPEG-2 MP@HL-14 in mxf container.
  mxf: H.264/MPEG-4 AVC Intra support
  configure: Show whether the safe bitstream reader is enabled
  x86: Tighten register constraints for decode_significance*_x86.
  Replace Subversion revisions in comments by Git hashes.
  h264_cabac: synchronize decode_significance_*_x86 conditionals
  w32threads: wait for the waked thread in pthread_cond_signal.
  ...

Conflicts:
	libavcodec/avcodec.h
	libavcodec/version.h
	libavformat/flvdec.c
	libavformat/utils.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-12-22 01:51:53 +01:00
Diego Biurrun
6fdb2ce34a x86: Tighten register constraints for decode_significance*_x86.
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.

This reverts commit 8742a4f to a simplified version of the original constraints.
2011-12-21 12:06:37 +01:00
Michael Niedermayer
0edf7ebcd6 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  h264: clear trailing bits in partially parsed NAL units
  vc1: Handle WVC1 interlaced stream
  xl: Fix overreads
  mpegts: rename payload_index to payload_size
  segment: introduce segmented chain muxer
  lavu: add AVERROR_BUG error value
  avplay: clear pkt_temp when pkt is freed.
  qcelpdec: K&R formatting cosmetics
  qcelpdec: cosmetics: drop some pointless parentheses
  x86: conditionally compile dnxhd encoder optimizations
  Revert "h264: skip start code search if the size of the nal unit is known"
  swscale: fix formatting and indentation of unscaled conversion routines.
  h264: skip start code search if the size of the nal unit is known
  cljr: fix buf_size sanity check
  cljr: Check if width and height are positive integers

Conflicts:
	libavcodec/cljr.c
	libavcodec/vc1dec.c
	libavformat/Makefile
	libavformat/mpegtsenc.c
	libavformat/segment.c
	libswscale/swscale_unscaled.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-12-20 04:12:09 +01:00