ffmpeg

Author	SHA1	Message	Date
Michael Niedermayer	c3c2db49a7	Merge remote-tracking branch 'qatar/master' * qatar/master: cook: expand dither_tab[], and make sure indexes into it don't overflow. xxan: reindent xan_unpack_luma(). xxan: protect against chroma LUT overreads. xxan: convert to bytestream2 API. xxan: don't read before start of buffer in av_memcpy_backptr(). vp8: convert mbedge loopfilter x86 assembly to use named arguments. vp8: convert inner loopfilter x86 assembly to use named arguments. Conflicts: libavcodec/xxan.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-03-11 01:12:52 +01:00
Ronald S. Bultje	a928ed3751	vp8: convert mbedge loopfilter x86 assembly to use named arguments.	2012-03-10 11:36:33 -08:00
Ronald S. Bultje	bee330e300	vp8: convert inner loopfilter x86 assembly to use named arguments.	2012-03-10 11:36:33 -08:00
Michael Niedermayer	2af8f2cea6	Merge remote-tracking branch 'qatar/master' * qatar/master: (27 commits) cmdutils: use new avcodec_is_decoder/encoder() functions. lavc: make codec_is_decoder/encoder() public. lavc: deprecate AVCodecContext.sub_id. libcdio: add a forgotten AVClass to the private context. swscale: remove "cpu flags" from -sws_flags description. proresenc: give user a possibility to alter some encoding parameters vorbisenc: add output buffer overwrite protection libopencore-amrnbenc: fix end-of-stream handling ra144enc: fix end-of-stream handling nellymoserenc: zero any leftover packet bytes nellymoserenc: use proper MDCT overlap delay qpeg: Use bytestream2 functions to prevent buffer overreads. swscale: make %rep unconditional. vp8: convert simple loopfilter x86 assembly to use named arguments. vp8: convert idct x86 assembly to use named arguments. vp8: convert mc x86 assembly to use named arguments. vp8: convert loopfilter x86 assembly to use cpuflags(). vp8: convert idct/mc x86 assembly to use cpuflags(). swscale: remove now unnecessary hack. x86inc: don't "bake" stack_offset in named arguments. ... Conflicts: cmdutils.c doc/APIchanges libavcodec/mpeg12.c libavcodec/options.c libavcodec/qpeg.c libavcodec/utils.c libavcodec/version.h libavdevice/libcdio.c tests/lavf-regression.sh Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-03-05 00:15:55 +01:00
Ronald S. Bultje	b4188f0d46	vp8: convert simple loopfilter x86 assembly to use named arguments.	2012-03-03 20:40:00 -08:00
Ronald S. Bultje	8476ca3b4e	vp8: convert idct x86 assembly to use named arguments.	2012-03-03 20:40:00 -08:00
Ronald S. Bultje	21ffc78fd7	vp8: convert mc x86 assembly to use named arguments.	2012-03-03 20:40:00 -08:00
Ronald S. Bultje	28170f1a39	vp8: convert loopfilter x86 assembly to use cpuflags().	2012-03-03 20:40:00 -08:00
Ronald S. Bultje	e25be47154	vp8: convert idct/mc x86 assembly to use cpuflags().	2012-03-03 20:39:59 -08:00
Michael Niedermayer	268098d8b2	Merge remote-tracking branch 'qatar/master' * qatar/master: (29 commits) amrwb: remove duplicate arguments from extrapolate_isf(). amrwb: error out early if mode is invalid. h264: change underread for 10bit QPEL to overread. matroska: check buffer size for RM-style byte reordering. vp8: disable mmx functions with sse/sse2 counterparts on x86-64. vp8: change int stride to ptrdiff_t stride. wma: fix invalid buffer size assumptions causing random overreads. Windows Media Audio Lossless decoder rv10/20: Fix slice overflow with checked bitstream reader. h263dec: Disallow width/height changing with frame threads. rv10/20: Fix a buffer overread caused by losing track of the remaining buffer size. rmdec: Honor .RMF tag size rather than assuming 18. g722: Fix the QMF scaling r3d: don't set codec timebase. electronicarts: set timebase for tgv video. electronicarts: parse the framerate for cmv video. ogg: don't set codec timebase electronicarts: don't set codec timebase avs: don't set codec timebase wavpack: Fix an integer overflow ... Conflicts: libavcodec/arm/vp8dsp_init_arm.c libavcodec/fraps.c libavcodec/h264.c libavcodec/mpeg4videodec.c libavcodec/mpegvideo.c libavcodec/msmpeg4.c libavcodec/pnmdec.c libavcodec/qpeg.c libavcodec/rawenc.c libavcodec/ulti.c libavcodec/vcr1.c libavcodec/version.h libavcodec/wmalosslessdec.c libavformat/electronicarts.c libswscale/ppc/yuv2rgb_altivec.c tests/ref/acodec/g722 tests/ref/fate/ea-cmv Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-03-03 00:23:10 +01:00
Ronald S. Bultje	45549339bc	vp8: disable mmx functions with sse/sse2 counterparts on x86-64. x86-64 is guaranteed to have at least SSE2, therefore the MMX/MMX2 functions will never be used in practice.	2012-03-02 10:32:05 -08:00
Kieran Kunhya	b1766c170c	Move x264asm to libavutil. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-10-19 20:26:55 +02:00
Michael Niedermayer	1a34478b71	Merge remote-tracking branch 'qatar/master' * qatar/master: Fix NASM include directive dsputil_mmx: Honor HAVE_AMD3DNOW lavf,lavd: remove all usage of AVFormatParameters from demuxers. jack: add 'channels' private option. VC-1: fix reading of custom PAR. Remove redundant and dubious video codec detection by its extradata mpeg12: remove repeat-field code disabled since May 2002 patch checklist: suggest fate instead of regression tests Turn on resampling on sudden size change instead of bailing out during recode. avtools: reinitialise filter chain when input video stream changes dimensions Conflicts: Makefile avconv.c doc/developer.texi ffplay.c libavcodec/x86/dsputil_mmx.c libavdevice/libdc1394.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-08-15 23:35:53 +02:00
Dave Yeo	cc73511e8e	Fix NASM include directive Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2011-08-15 11:24:35 -07:00
Michael Niedermayer	0cb233cf46	Merge commit 'b2c087871dafc7d030b2d48457ddff597dfd4925' * commit 'b2c087871dafc7d030b2d48457ddff597dfd4925': Move x86util.asm from libavcodec/ to libavutil/. Move x86inc.asm to libavutil/. APIchanges: note error_recognition in lavf lavf: add support for error_recognition, use it in avidec, and bump minor API version avconv: change semantics of -map avconv: get rid of new* options. cmdutils: allow precisely specifying a stream for AVOptions. configure: add missing CFLAGS to fix building on the HURD libx264: Include hint for possible values for configuring libx264 cmdutils: allow ':'-separated modifiers in option names. avconv: make -map_metadata work consistently with the other options avconv: remove deprecated options. avconv: make -map_chapters accept only the input file index. Make a copy of ffmpeg under a new name -- avconv. ffmpeg: add a warning stating that the program is deprecated. Add weighted motion compensation for RV40 B-frames RV3/4: calculate B-frame motion weights once per frame Move RV3/4-specific DSP functions into their own context mjpeg: propagate decode errors from ff_mjpeg_decode_sos and ff_mjpeg_decode_dqt h264: notice memory allocation failure Conflicts: .gitignore Makefile cmdutils.c configure doc/ffplay.texi doc/ffprobe.texi doc/ffserver.texi libavcodec/libx264.c libavformat/avformat.h libavformat/avidec.c libavformat/version.h tests/lavf-regression.sh tests/lavfi-regression.sh Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-08-13 02:56:08 +02:00
Ronald S. Bultje	b2c087871d	Move x86util.asm from libavcodec/ to libavutil/. This allows using it in swscale also.	2011-08-12 11:43:03 -07:00
Ronald S. Bultje	3a39195b1d	Move x86inc.asm to libavutil/. This allows using it in libswscale/ also.	2011-08-12 11:43:02 -07:00
Michael Niedermayer	b4bcd1e2f1	Merge remote-tracking branch 'qatar/master' * qatar/master: Fix compilation of iirfilter-test. libx264: handle closed GOP codec flag lavf: remove duplicate assignment in avformat_alloc_context. lavf: use designated initializers for AVClasses. flvdec: clenup debug code asfdec: fix possible overread on broken files. asfdec: do not fall back to binary/generic search asfdec: reindent after previous commit `c7bd5ed` asfdec: fallback to binary search internally mpegaudio: add _fixed suffix to some names Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. dct: build dct32 as separate object files qdm2: include correct header for rdft Conflicts: ffpresets/libx264-fast.ffpreset ffpresets/libx264-fast_firstpass.ffpreset ffpresets/libx264-faster.ffpreset ffpresets/libx264-faster_firstpass.ffpreset ffpresets/libx264-medium.ffpreset ffpresets/libx264-medium_firstpass.ffpreset ffpresets/libx264-placebo.ffpreset ffpresets/libx264-placebo_firstpass.ffpreset ffpresets/libx264-slow.ffpreset ffpresets/libx264-slow_firstpass.ffpreset ffpresets/libx264-slower.ffpreset ffpresets/libx264-slower_firstpass.ffpreset ffpresets/libx264-superfast.ffpreset ffpresets/libx264-superfast_firstpass.ffpreset ffpresets/libx264-ultrafast.ffpreset ffpresets/libx264-ultrafast_firstpass.ffpreset ffpresets/libx264-veryfast.ffpreset ffpresets/libx264-veryfast_firstpass.ffpreset ffpresets/libx264-veryslow.ffpreset ffpresets/libx264-veryslow_firstpass.ffpreset libavformat/flvdec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-05-18 05:42:42 +02:00
Daniel Kang	d0005d347d	Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. Arguments for variable size instructions are added to many macros, along with other various changes. The x86util.asm code was ported from x264. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2011-05-17 20:44:48 +02:00
Michael Niedermayer	5a153604c9	Merge remote branch 'qatar/master' * qatar/master: Fix FSF address copy paste error in some license headers. Add an aac sample which uses LTP to fate-aac. DUPLICATE [PATCH] Update pixdesc_be fate refs after adding 9/10bit YUV420P formats. arm: properly mark external symbol call Conflicts: libavcodec/x86/ac3dsp.asm libavcodec/x86/deinterlace.asm libavcodec/x86/dsputil_yasm.asm libavcodec/x86/dsputilenc_yasm.asm libavcodec/x86/fft_mmx.asm libavcodec/x86/fmtconvert.asm libavcodec/x86/h264_chromamc.asm libavcodec/x86/h264_deblock.asm libavcodec/x86/h264_idct.asm libavcodec/x86/h264_intrapred.asm libavcodec/x86/h264_weight.asm libavcodec/x86/vc1dsp_yasm.asm libavcodec/x86/vp3dsp.asm libavcodec/x86/vp56dsp.asm libavcodec/x86/vp8dsp.asm libavcodec/x86/x86util.asm libswscale/ppc/swscale_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-05-15 04:44:07 +02:00
Diego Biurrun	888fa31eca	Fix FSF address copy paste error in some license headers.	2011-05-14 21:32:31 +02:00
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-03-19 13:33:20 +00:00
Reimar Döffinger	b1c32fb5e5	Use "d" suffix for general-purpose registers used with movd. This increases compatibilty with nasm and is also more consistent, e.g. with h264_intrapred.asm and h264_chromamc.asm that already do it that way. Originally committed as revision 25042 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-05 10:10:16 +00:00
Ronald S. Bultje	3611c45ab7	Mark xmm registers as clobbered in simple loopfilter. Should fix the last two VP8-related fate failures on Win64. Originally committed as revision 24908 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-24 16:52:27 +00:00
Ronald S. Bultje	684d608bde	Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures). Originally committed as revision 24871 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-23 02:41:22 +00:00
Jason Garrett-Glaser	827d43bb9d	VP8: move zeroing of luma DC block into the WHT Lets us do the zeroing in asm instead of C. Also makes it consistent with the way the regular iDCT code does it. Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-02 20:18:09 +00:00
Ronald S. Bultje	6341838f3c	Use word-writing instead of dword-writing (with two cached but otherwise unchanged bytes) in the horizontal simple loopfilter. This makes the filter quite a bit faster in itself (~30 cycles less on Core1), probably mostly because we don't need a complex 4x4 transpose, but only a simple byte interleave. Also allows using pextrw on SSE4, which speeds up even more (e.g. 25% faster on Core i7). Originally committed as revision 24638 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-31 23:13:15 +00:00
Ronald S. Bultje	ab4d031889	Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. Originally committed as revision 24514 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 21:18:19 +00:00
Jason Garrett-Glaser	e25dee602f	VP8: Much faster SSE2 MC 5-10% faster or more on Phenom, Athlon 64, and some others. Helps some on pre-SSSE3 Intel chips as well, but not as much. Originally committed as revision 24513 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 19:34:00 +00:00
Ronald S. Bultje	48adb7e7a4	Enable no-loop memory/register saving for ssse3/sse4 also. Originally committed as revision 24511 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 14:07:57 +00:00
Ronald S. Bultje	2a180c69ea	Save a register (or regsize of stackspace for x86-32) for the no-loop mbedge loopfilter functions, by re-using space that holds a variable that we no longer need. Originally committed as revision 24510 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 14:00:15 +00:00
Ronald S. Bultje	bcd4aa6498	Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this construct was always enabled, even for <ssse3 versions). Originally committed as revision 24509 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 13:56:51 +00:00
Ronald S. Bultje	2208053bd3	Split pextrw macro-spaghetti into several opt-specific macros, this will make future new optimizations (imagine a sse5) much easier. Also fix a bug where we used the direction (%2) rather than optimization (%1) to enable this, which means it wasn't ever actually used... Originally committed as revision 24507 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-26 13:50:59 +00:00
Ronald S. Bultje	6de5b7c6b8	Fix obvious bug in assignment. Somehow, the test vectors don't test this... Originally committed as revision 24489 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-25 02:42:40 +00:00
Ronald S. Bultje	e3f7bf774c	Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this splits it into small optimization-specific macros which are selected for each DSP function. The advantage of this approach is that the sse4 functions now use the ssse3 codepath also without needing an explicit sse4 codepath. Originally committed as revision 24487 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-24 19:33:05 +00:00
Jason Garrett-Glaser	3ae079a3c8	VP8: optimize DC-only chroma case in the same way as luma. Add MMX idct_dc_add4uv function for this case. ~40% faster chroma idct. Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-23 06:02:52 +00:00
Jason Garrett-Glaser	51c9156438	VP8 asm: cosmetics (spacing) Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-23 03:02:56 +00:00
Jason Garrett-Glaser	8a467b2d44	VP8: 30% faster idct_mb Take shortcuts based on statistically common situations. Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT blocks are common. TODO: tie this more directly into the MB mode, since the DC-level transform is only used for non-splitmv blocks? Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-23 02:58:27 +00:00
Jason Garrett-Glaser	c25c776708	VP8: clear DCT blocks in iDCT instead of using clear_blocks. ~0.3% faster overall. Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-23 00:07:16 +00:00
Ronald S. Bultje	dc5eec8085	Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on CPUs supporting it. Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-22 19:59:34 +00:00
Ronald S. Bultje	003243c3c2	Fix and enable horizontal >=SSE2 mbedge loopfilter. Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-22 01:35:26 +00:00
Jason Garrett-Glaser	8731dbd890	Eliminate one instruction in VP8 dc_add_sse4 Originally committed as revision 24405 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-21 22:41:37 +00:00
Jason Garrett-Glaser	7dd224a42d	Various VP8 x86 deblocking speedups SSSE3 versions, improve SSE2 versions a bit. SSE2/SSSE3 mbedge h functions are currently broken, so explicitly disable them. Originally committed as revision 24403 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-21 22:11:03 +00:00
Jason Garrett-Glaser	b8b231b5dc	Make mmx VP8 WHT faster Avoid pextrw, since it's slow on many older CPUs. Now it doesn't require mmxext either. Originally committed as revision 24397 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-21 20:51:01 +00:00
Ronald S. Bultje	e9e456d850	VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) and chroma (width=8). Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-20 22:58:56 +00:00
Ronald S. Bultje	268821e76e	Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. Originally committed as revision 24377 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-20 22:04:18 +00:00
Ronald S. Bultje	c60ed66dbe	Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's wrong with it tomorrow or so, then re-submit. Originally committed as revision 24341 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-19 23:57:09 +00:00
Ronald S. Bultje	1878f685c0	Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. Originally committed as revision 24339 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-19 21:53:28 +00:00
Ronald S. Bultje	fb9bdf048c	Be more efficient with registers or stack memory. Saves 8/16 bytes stack for x86-32, or 2 MM registers on x86-64. Originally committed as revision 24338 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-19 21:45:36 +00:00
Ronald S. Bultje	3facfc99da	Change function prototypes for width=8 inner and mbedge loopfilter functions so that it does both U and V planes at the same time. This will have speed advantages when using SSE2 (or higher) optimizations, since we can do both the U and V rows together in a single xmm register. This also renames filter16 to filter16y and filter8 to filter8uv so that it's more obvious what each function is used for. Originally committed as revision 24337 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-19 21:18:04 +00:00

68 Commits