ffmpeg

Author	SHA1	Message	Date
Roland Scheidegger	9b9df1cdff	h264: new assembly version of get_cabac for x86_64 with PIC This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. get_cabac() gets about 40% faster, for an overall speedup of about 5%. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2012-04-28 09:43:25 -07:00
Roland Scheidegger	14e9ffc1e4	h264: use one table instead of several for cabac functions The reason is this is easier for PIC code (in particular on darwin...). Keep the old names as pointers (static in cabac_functions.h so gcc knows these are just immediate offsets) so the c code can nicely stay the same (alternatively could use offsets directly in the functions needing the tables). This should produce the same code as before with non-pic and better code (confirmed) with pic. The assembly uses the new table but still won't work for PIC case. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2012-04-28 08:26:12 -07:00
Michael Niedermayer	9849515214	Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)" This broke compilation on darwin, revert until a better solution is found. This reverts commit `a812b599b5`.	2012-04-21 02:09:27 +02:00
Roland Scheidegger	a812b599b5	h264: assembly version of get_cabac for x86_64 with PIC (v4) This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. There is a surprisingly large performance improvement over the c version (more so than the generated assembly seems to suggest) just in get_cabac, I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big, I measured roughly 5% on a test clip on a K8 and a Core2. Hopefully it still compiles on x86 32bit... v2: incorporated feedback from Loren Merritt to avoid rip-relative movs for every table, and got rid of unnecessary @GOTPCREL. v3: apply similar fixes to the decode_significance functions, and use same macro arguments for non-pic case. v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect the c code to be faster otherwise since both cmov and sbb suck hard on a Prescott, even can't construct the mask with a 64bit shift as that's just as terrible - it's quite difficult to find usable instructions on that chip...). This is tested to work but not on a P4, in theory it _should_ be fast there. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-04-21 00:27:06 +02:00
Michael Niedermayer	2c5a2958e9	Merge remote-tracking branch 'qatar/master' * qatar/master: h264: Factorize declaration of mb_sizes array. vsrc_buffer: when no frame is available, return an error instead of segfaulting. configure: add dl to frei0r extralibs. dsputil x86: use SSE float instruction instead of SSE2 integer equivalent dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype vp8dsp x86: perform rounding shift with a single instruction fate: add BMP tests. swscale: handle complete dimensions for monoblack/white. aacenc: Mark deinterleave_input_samples argument as const. vf_unsharp: Mark readonly variable as const. h264: fix 4:2:2 PCM-macroblocks decoding Conflicts: configure libavcodec/h264.h libavcodec/x86/dsputil_mmx.c libavfilter/vf_unsharp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-04-05 22:26:50 +02:00
Diego Biurrun	0becb07842	h264: Factorize declaration of mb_sizes array.	2012-04-05 17:17:22 +02:00
Ronald S. Bultje	63a1b481f6	h264: fix cabac-on-stack after safe cabac reader.	2012-03-28 16:35:42 -07:00
Michael Niedermayer	79ae084e9b	Merge remote-tracking branch 'qatar/master' * qatar/master: (58 commits) amrnbdec: check frame size before decoding. cscd: use negative error values to indicate decode_init() failures. h264: prevent overreads in intra PCM decoding. FATE: do not decode audio in the nuv test. dxa: set audio stream time base using the sample rate psx-str: do not allow seeking by bytes asfdec: Do not set AVCodecContext.frame_size vqf: set packet parameters after av_new_packet() mpegaudiodec: use DSPUtil.butterflies_float(). FATE: add mp3 test for sample that exhibited false overreads fate: add cdxl test for bit line plane arrangement vmnc: return error on decode_init() failure. libvorbis: add/update error messages libvorbis: use AVFifoBuffer for output packet buffer libvorbis: remove unneeded e_o_s check libvorbis: check return values for functions that can return errors libvorbis: use float input instead of s16 libvorbis: do not flush libvorbis analysis if dsp state was not initialized libvorbis: use VBR by default, with default quality of 3 libvorbis: fix use of minrate/maxrate AVOptions ... Conflicts: Changelog doc/APIchanges libavcodec/avcodec.h libavcodec/dpxenc.c libavcodec/libvorbis.c libavcodec/vmnc.c libavformat/asfdec.c libavformat/id3v2enc.c libavformat/internal.h libavformat/mp3enc.c libavformat/utils.c libavformat/version.h libswscale/utils.c tests/fate/video.mak tests/ref/fate/nuv tests/ref/fate/prores-alpha tests/ref/lavf/ffm tests/ref/vsynth1/prores tests/ref/vsynth2/prores Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-03-01 03:17:11 +01:00
Ronald S. Bultje	d1604b3de9	h264: prevent overreads in intra PCM decoding. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC: libav-stable@libav.org	2012-02-29 13:17:34 -08:00
Michael Niedermayer	a78f6b8cb9	Merge remote-tracking branch 'qatar/master' * qatar/master: (38 commits) v210enc: remove redundant check for pix_fmt wavpack: allow user to disable CRC checking v210enc: Use Bytestream2 functions cafdec: Check return value of avio_seek and avoid modifying state if it fails yop: Check return value of avio_seek and avoid modifying state if it fails tta: Check return value of avio_seek and avoid modifying state if it fails tmv: Check return value of avio_seek and avoid modifying state if it fails r3d: Check return value of avio_seek and avoid modifying state if it fails nsvdec: Check return value of avio_seek and avoid modifying state if it fails mpc8: Check return value of avio_seek and avoid modifying state if it fails jvdec: Check return value of avio_seek and avoid modifying state if it fails filmstripdec: Check return value of avio_seek and avoid modifying state if it fails ffmdec: Check return value of avio_seek and avoid modifying state if it fails dv: Check return value of avio_seek and avoid modifying state if it fails bink: Check return value of avio_seek and avoid modifying state if it fails Check AVCodec.pix_fmts in avcodec_open2() svq3: Prevent illegal reads while parsing extradata. remove ParseContext1 vc1: use ff_parse_close mpegvideo parser: move specific fields into private context ... Conflicts: libavcodec/4xm.c libavcodec/aacdec.c libavcodec/h264.c libavcodec/h264.h libavcodec/h264_cabac.c libavcodec/h264_cavlc.c libavcodec/mpeg4video_parser.c libavcodec/svq3.c libavcodec/v210enc.c libavformat/cafdec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-02-11 01:22:22 +01:00
Ronald S. Bultje	45b7bd7c53	h264: disallow constrained intra prediction modes for luma. Conversion of the luma intra prediction mode to one of the constrained ("alzheimer") ones can happen by crafting special bitstreams, causing a crash because we'll call a NULL function pointer for 16x16 block intra prediction, since constrained intra prediction functions are only implemented for chroma (8x8 blocks). Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC: libav-stable@libav.org	2012-02-09 22:57:01 -08:00
Michael Niedermayer	e986a5d10d	Merge remote-tracking branch 'qatar/master' * qatar/master: FATE: add tests for targa ARM: fix Thumb-mode simple_idct_arm ARM: 4-byte align start of all asm functions rgb2rgb: rgb12to15() swscale-test: fix stack overread. swscale: fix invalid conversions and memory problems. cabac: split cabac.h into declarations and function definitions cabac: Mark ff_h264_mps_state array as static, it is only used within cabac.c. cabac: Remove ff_h264_lps_state array. Conflicts: libswscale/rgb2rgb.h libswscale/swscale_unscaled.c tests/fate/image.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-01-14 02:22:09 +01:00
Diego Biurrun	55b9ef18e4	cabac: split cabac.h into declarations and function definitions This fixes standalone compilation of some decoders with --disable-optimizations. cabac.h defines some inline functions that use symbols from cabac.c. Without optimizations these inline functions are not eliminated and linking fails with references to non-existing symbols. Splitting the inline functions off into their own header and only #including it in the places where the inline functions are used allows #including cabac.h from anywhere without ill effects.	2012-01-12 23:08:23 +01:00
Michael Niedermayer	0e5fbbd776	Merge remote-tracking branch 'qatar/master' * qatar/master: mpegvideo_enc: K&R cosmetics doxygen: remove unreplaced variables from custom header and footer threads: test for sys/param.h and include it for sysctl on OpenBSD v4l2: remove unneded linux specific asm/types.h include x86: Fix constraints for decode_significance*_x86 Conflicts: libavcodec/mpegvideo_enc.c libavdevice/v4l2.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-12-28 02:38:33 +01:00
Martin Storsjö	676a9ee1d2	x86: Fix constraints for decode_significance*_x86 Originally, prior to `8742a4ff8`, the caller code was compiled within this condition: ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS) Since HAVE_7REGS is defined as (ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE)) the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal to HAVE_7REGS (for 32 bit at least). The correct simplification of the original condition thus is HAVE_7REGS, not HAVE_EBX_AVAILABLE. This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0 and HAVE_EBX_AVAILABLE = 1. Signed-off-by: Martin Storsjö <martin@martin.st>	2011-12-27 09:05:14 +02:00
Michael Niedermayer	52c522c720	Merge remote-tracking branch 'qatar/master' * qatar/master: (27 commits) asfdec: add side data to ASFStream packet instead of output packet. idroqdec: set AVFMTCTX_NOHEADER and create streams as they occur. nellymoserdec: Indicate that the decoder can handle changed parameters libavcodec: Apply parameter change side data when decoding audio flvdec: Add param change side data if the sample rate or channels have changed libavformat: Add a utility function for adding parameter change side data libavcodec: Define a side data type for parameter changes aacdec: Handle new extradata passed as side data flvdec: Export new AAC/H.264 extradata as side data on the next packet libavcodec: Define a side data type for new extradata flacdec: skip all track indices at once instead of looping. mxf: Add PictureEssenceCoding UL for V210. mxfdec: consider QuantizationBits between 17 and 24 to be pcm_s24* mxfenc: Add support for MPEG-2 MP@HL-14 in mxf container. mxf: H.264/MPEG-4 AVC Intra support configure: Show whether the safe bitstream reader is enabled x86: Tighten register constraints for decode_significance_x86. Replace Subversion revisions in comments by Git hashes. h264_cabac: synchronize decode_significance__x86 conditionals w32threads: wait for the waked thread in pthread_cond_signal. ... Conflicts: libavcodec/avcodec.h libavcodec/version.h libavformat/flvdec.c libavformat/utils.c tests/ref/lavfi/pixdesc tests/ref/lavfi/pixfmts_copy tests/ref/lavfi/pixfmts_null tests/ref/lavfi/pixfmts_scale tests/ref/lavfi/pixfmts_vflip Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-12-22 01:51:53 +01:00
Diego Biurrun	6fdb2ce34a	x86: Tighten register constraints for decode_significance*_x86. On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register is not available, but required to assemble the functions. This reverts commit `8742a4f` to a simplified version of the original constraints.	2011-12-21 12:06:37 +01:00
Diego Biurrun	8742a4ff87	h264_cabac: synchronize decode_significance_*_x86 conditionals The definition and the call site were under different #ifdefs.	2011-12-21 09:04:25 +01:00
Michael Niedermayer	38331d2036	h264: disable checking reader, overreads are not possible in ffmpeg's h264 decoder. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-12-18 03:17:44 +01:00
Michael Niedermayer	3ba0bfe71f	Merge remote-tracking branch 'qatar/master' * qatar/master: ulti: Fix invalid reads lavf: dealloc private options in av_write_trailer yadif: support 10bit YUV vc1: mark with ER_MB_ERROR bits overconsumption lavc: introduce ER_MB_END and ER_MB_ERROR error_resilience: use the ER_ namespace build: move inclusion of subdir.mak to main subdir loop rv34: NEON optimised 4x4 dequant rv34: move 4x4 dequant to RV34DSPContext aacdec: Use intfloat.h rather than local punning union. Conflicts: libavcodec/h264.c libavcodec/vc1dec.c libavfilter/vf_yadif.c libavformat/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-12-13 23:21:37 +01:00
Luca Barbato	5bf2ac2b37	error_resilience: use the ER_ namespace Add the namespace to {AC_,DC_,MV_}{END,ERROR} macros Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2011-12-13 16:20:58 +01:00
Michael Niedermayer	8bc7fe4daf	Merge remote-tracking branch 'qatar/master' * qatar/master: doxygen: misc consistency, spelling and wording fixes vcr1: drop unnecessary emms_c() calls without MMX code Replace all uses of av_close_input_file() with avformat_close_input(). lavf: add avformat_close_input(). lavf: deprecate av_close_input_stream(). lavf doxy: add some basic demuxing documentation. lavf doxy: add some general lavf information. lavf doxy: add misc utility functions to a group. lavf doxy: add av_guess_codec/format to the encoding group. lavf doxy: add core functions to a doxy group. Add basic libavdevice documentation. lavc: convert error_recognition to err_recognition. avconv: update -map option help text x86: Require 7 registers for the cabac asm x86: bswap: remove test for bswap instruction bswap: make generic implementation more compiler-friendly h264: remove useless cast proresdec: fix decode_slice() prototype Conflicts: configure doc/APIchanges ffprobe.c libavcodec/avcodec.h libavcodec/celp_math.h libavcodec/h264.c libavfilter/src_movie.c libavformat/anm.c libavformat/avformat.h libavformat/version.h libavutil/avstring.h libavutil/bswap.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-12-13 00:39:48 +01:00
Diego Biurrun	58c42af722	doxygen: misc consistency, spelling and wording fixes	2011-12-12 23:06:23 +01:00
Michael Niedermayer	aedc908601	Merge remote-tracking branch 'qatar/master' * qatar/master: (35 commits) flvdec: Do not call parse_keyframes_index with a NULL stream libspeexdec: include system headers before local headers libspeexdec: return meaningful error codes libspeexdec: cosmetics: reindent libspeexdec: decode one frame at a time. swscale: fix signed shift overflows in ff_yuv2rgb_c_init_tables() Move timefilter code from lavf to lavd. mov: add support for hdvd and pgapmetadata atoms mov: rename function _stik, some indentation cosmetics mov: rename function _int8 to remove ambiguity, some indentation cosmetics mov: parse the gnre atom mp3on4: check for allocation failures in decode_init_mp3on4() mp3on4: create a separate flush function for MP3onMP4. mp3on4: ensure that the frame channel count does not exceed the codec channel count. mp3on4: set channel layout mp3on4: fix the output channel order mp3on4: allocate temp buffer with av_malloc() instead of on the stack. mp3on4: copy MPADSPContext from first context to all contexts. fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm ... Conflicts: libavcodec/arm/h264dsp_init_arm.c libavcodec/h264.c libavcodec/h264.h libavcodec/h264_cabac.c libavcodec/h264_cavlc.c libavcodec/h264_ps.c libavcodec/h264dsp_template.c libavcodec/h264idct_template.c libavcodec/h264pred.c libavcodec/h264pred_template.c libavcodec/x86/h264dsp_mmx.c libavdevice/Makefile libavdevice/jack_audio.c libavformat/Makefile libavformat/flvdec.c libavformat/flvenc.c libavutil/pixfmt.h libswscale/utils.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-10-22 01:16:41 +02:00
Baptiste Coudurier	76741b0e56	h264: 4:2:2 intra decoding support Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2011-10-21 01:00:41 -07:00
Laurent Aimar	a4fd95b5d5	h264: fix intra 16x16 mode check when using mbaff and constrained_intra_pred. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-10-02 21:20:57 +02:00
Baptiste Coudurier	231a6df9ea	h264dec: h264: 4:2:2 intra decoding Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-08-15 00:39:55 +02:00
Michael Niedermayer	2dd2abe391	Merge remote-tracking branch 'qatar/master' * qatar/master: h263dec: Propagate AV_LOG_ERRORs from slice decoding through frame decoding with sufficient error recognition x86: cabac: don't load/store context values in asm H.264: optimize CABAC x86 asm for Atom vp3/theora: flush after seek. doc/fftools-common-opts: wording fixes missing from the previous commit. doc: document using AVOptions in fftools. cmdutils: add codec_opts parameter to setup_find_stream_info_opts() cmdutils: clarify documentation for filter_codec_opts() cmdutils: clarify documentation for setup_find_stream_info_opts() lavf: add forgotten attribute_deprecated to av_find_stream_info() Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-07-29 01:50:53 +02:00
Jason Garrett-Glaser	6c32576548	H.264: optimize CABAC x86 asm for Atom	2011-07-28 13:06:13 -07:00
Michael Niedermayer	e10979ff56	Merge remote-tracking branch 'qatar/master' * qatar/master: changelog: misc typo and wording fixes H.264: add filter_mb_fast support for >8-bit decoding doc: Remove outdated comments about gcc 2.95 and gcc 3.3 support. lls: use av_lfg instead of rand() in test program build: remove unnecessary dependency on libs from 'all' target H.264: avoid redundant alpha/beta calculations in loopfilter H.264: optimize intra/inter loopfilter decision mpegts: fix Continuity Counter error detection build: remove unnecessary FFLDFLAGS variable vp8/mt: flush worker thread, not application thread context, on seek. mt: proper locking around release_buffer calls. DxVA2: unbreak build after [`657ccb5ac7`] hwaccel: unbreak build Eliminate FF_COMMON_FRAME macro. Conflicts: Changelog Makefile doc/developer.texi libavcodec/avcodec.h libavcodec/h264.c libavcodec/mpeg4videodec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-07-12 01:42:32 +02:00
Diego Biurrun	657ccb5ac7	Eliminate FF_COMMON_FRAME macro. FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied to struct Picture. Replace by an embedded AVFrame structure in struct Picture.	2011-07-11 00:19:00 +02:00
Michael Niedermayer	2f56a97f24	Merge remote-tracking branch 'qatar/master' * qatar/master: (22 commits) H.264: fix filter_mb_fast with 4:4:4 + 8x8dct alsa: limit buffer_size to 32768 frames. alsa: fallback to buffer_size/4 for period_size. doc: replace @pxref by @ref where appropriate mpeg1video: don't abort if thread_count is too high. segafilm: add support for videos with cri adx adpcm gxf: Fix 25 fps DV material in GXF being misdetected as 50 fps libxvid: Add const qualifier to silence compiler warning. H.264: improve qp_thresh check H.264: use fill_rectangle in CABAC decoding H.264: Remove redundant hl_motion_16/8 code H.264: merge fill_rectangle into P-SKIP MV prediction, to match B-SKIP H.264: faster P-SKIP decoding H.264: av_always_inline some more functions H.264: Add x86 assembly for 10-bit H.264 predict functions swscale: rename uv_off/uv_off2 to uv_off_px/byte. swscale: implement error dithering in planarCopyWrapper. swscale: error dithering for 16/9/10-bit to 8-bit. swscale: fix overflow in 16-bit vertical scaling. swscale: fix crash in 8-bpc bilinear output without alpha. ... Conflicts: doc/developer.texi libavdevice/alsa-audio.h libavformat/gxf.c libswscale/swscale.c libswscale/swscale_internal.h libswscale/swscale_unscaled.c libswscale/x86/swscale_template.c tests/ref/lavfi/pixdesc tests/ref/lavfi/pixfmts_copy tests/ref/lavfi/pixfmts_crop tests/ref/lavfi/pixfmts_hflip tests/ref/lavfi/pixfmts_null tests/ref/lavfi/pixfmts_scale tests/ref/lavfi/pixfmts_vflip Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-07-10 04:28:50 +02:00
Jason Garrett-Glaser	99b6d2c065	H.264: use fill_rectangle in CABAC decoding	2011-07-08 16:12:39 -07:00
Michael Niedermayer	976a8b2179	Merge remote-tracking branch 'qatar/master' * qatar/master: (40 commits) H.264: template left MB handling H.264: faster fill_decode_caches H.264: faster write_back_* H.264: faster fill_filter_caches H.264: make filter_mb_fast support the case of unavailable top mb Do not include log.h in avutil.h Do not include pixfmt.h in avutil.h Do not include rational.h in avutil.h Do not include mathematics.h in avutil.h Do not include intfloat_readwrite.h in avutil.h Remove return statements following infinite loops without break RTSP: Doxygen comment cleanup doxygen: Escape '\' in Doxygen documentation. md5: cosmetics md5: use AV_WL32 to write result md5: add fate test md5: include correct headers md5: fix test program doxygen: Drop array size declarations from Doxygen parameter names. doxygen: Fix parameter names to match the function prototypes. ... Conflicts: libavcodec/x86/dsputil_mmx.c libavformat/flvenc.c libavformat/oggenc.c libavformat/wtv.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-07-04 00:45:21 +02:00
Jason Garrett-Glaser	556f8a066c	H.264: template left MB handling Faster H.264 decoding with ALLOW_INTERLACE off.	2011-07-03 15:06:00 -07:00
Jason Garrett-Glaser	3b7ebeb4d5	H.264: faster write_back_* Avoid aliasing, unroll loops, and inline more functions.	2011-07-03 15:05:55 -07:00
Carl Eugen Hoyos	4d08dfefa9	Remove gcc 2.95.3 remnants.	2011-06-29 10:07:39 +02:00
Carl Eugen Hoyos	81ef892ca8	Use HAVE_TEN_OPERANDS for new decode_significance* functions.	2011-06-22 21:45:03 +02:00
Michael Niedermayer	c137fdd778	Merge remote-tracking branch 'qatar/master' * qatar/master: swscale: remove misplaced comment. ffmpeg: fix streaming to ffserver. swscale: split out RGB48 output functions from yuv2packed[12X]_c(). build: move vpath directives to main Makefile swscale: fix JPEG-range YUV scaling artifacts. build: move ALLFFLIBS to a more logical place ARM: factor some repetitive code into macros Fix SVQ3 after adding 4:4:4 H.264 support H.264: fix CODEC_FLAG_GRAY 4:4:4 H.264 decoding support ac3enc: fix allocation of floating point samples. Conflicts: ffmpeg.c libavcodec/dsputil_template.c libavcodec/h264.c libavcodec/mpegvideo.c libavcodec/snow.c libswscale/swscale.c libswscale/swscale_internal.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-06-15 02:15:25 +02:00
Jason Garrett-Glaser	c90b94424c	4:4:4 H.264 decoding support Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.	2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser	504811baea	Roll back 4:4:4 H.264 for now Needs some ARM/PPC asm modifications.	2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser	c9c493872c	4:4:4 H.264 decoding support Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.	2011-06-13 12:21:39 -07:00
Michael Niedermayer	59eb12faff	Merge remote branch 'qatar/master' * qatar/master: (30 commits) AVOptions: make default_val a union, as proposed in AVOption2. arm/h264pred: add missing argument type. h264dsp_mmx: place bracket outside #if/#endif block. lavf/utils: fix ff_interleave_compare_dts corner case. fate: add 10-bit H264 tests. h264: do not print "too many references" warning for intra-only. Enable decoding of high bit depth h264. Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder. Add support for higher QP values in h264. Add the notion of pixel size in h264 related functions. Make the h264 loop filter bit depth aware. Template dsputil_template.c with respect to pixel size, etc. Template h264idct_template.c with respect to pixel size, etc. Preparatory patch for high bit depth h264 decoding support. Move some functions in dsputil.c into a new file dsputil_template.c. Move the functions in h264idct into a new file h264idct_template.c. Move the functions in h264pred.c into a new file h264pred_template.c. Preparatory patch for high bit depth h264 decoding support. Add pixel formats for 9- and 10-bit yuv420p. Choose h264 chroma dc dequant function dynamically. ... Conflicts: doc/APIchanges ffmpeg.c ffplay.c libavcodec/alpha/dsputil_alpha.c libavcodec/arm/dsputil_init_arm.c libavcodec/arm/dsputil_init_armv6.c libavcodec/arm/dsputil_init_neon.c libavcodec/arm/dsputil_iwmmxt.c libavcodec/arm/h264pred_init_arm.c libavcodec/bfin/dsputil_bfin.c libavcodec/dsputil.c libavcodec/h264.c libavcodec/h264.h libavcodec/h264_cabac.c libavcodec/h264_cavlc.c libavcodec/h264_loopfilter.c libavcodec/h264_ps.c libavcodec/h264_refs.c libavcodec/h264dsp.c libavcodec/h264idct.c libavcodec/h264pred.c libavcodec/mlib/dsputil_mlib.c libavcodec/options.c libavcodec/ppc/dsputil_altivec.c libavcodec/ppc/dsputil_ppc.c libavcodec/ppc/h264_altivec.c libavcodec/ps2/dsputil_mmi.c libavcodec/sh4/dsputil_align.c libavcodec/sh4/dsputil_sh4.c libavcodec/sparc/dsputil_vis.c libavcodec/utils.c libavcodec/version.h libavcodec/x86/dsputil_mmx.c libavformat/options.c libavformat/utils.c libavutil/pixfmt.h libswscale/swscale.c libswscale/swscale_internal.h libswscale/swscale_template.c tests/ref/seek/lavf_avi Merged-by: Michael Niedermayer <michaelni@gmx.at>	2011-05-11 05:47:02 +02:00
Oskar Arvidsson	fcc0224e4f	Add support for higher QP values in h264. In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)). Preparatory patch for high bit depth h264 decoding support. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2011-05-10 07:24:35 -04:00
Oskar Arvidsson	6e3ef511d7	Add the notion of pixel size in h264 related functions. In high bit depth the pixels will not be stored in uint8_t like in the normal case, but in uint16_t. The pixel size is thus 1 in normal bit depth and 2 in high bit depth. Preparatory patch for high bit depth h264 decoding support. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2011-05-10 07:24:33 -04:00
Stefano Sabatini	ce5e49b0c2	replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*	2011-05-02 16:41:41 +02:00
Stefano Sabatini	975a1447f7	Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2011-05-02 12:18:44 +02:00
Michael Niedermayer	179106ed78	H264: factor if() out of coef decoding loop of decode_cabac_residual_internal() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-04-10 22:33:42 +02:00
Michael Niedermayer	e7077f5e7b	H264: replace pixel_size by pixel_shift Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-04-10 22:33:42 +02:00
Oskar Arvidsson	d268bed209	Add support for higher QP values in h264. In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)). Preparatory patch for high bit depth h264 decoding support. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-04-10 22:33:42 +02:00
Oskar Arvidsson	dc172ecc6e	Add the notion of pixel size in h264 related functions. In high bit depth the pixels will not be stored in uint8_t like in the normal case, but in uint16_t. The pixel size is thus 1 in normal bit depth and 2 in high bit depth. Preparatory patch for high bit depth h264 decoding support. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2011-04-10 22:33:41 +02:00
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	2011-03-19 13:33:20 +00:00
Ronald S. Bultje	7f8c11b005	Set gray (128) U/V planes for chroma-less samples. Fixes two fate samples when played with -flags emu_edge. (cherry picked from commit `8bcfe7f7fd`)	2011-01-21 20:36:01 +01:00
Ronald S. Bultje	772225c041	Revert `2a1f431d38`, it broke H264 lossless. (cherry picked from commit `66c6b5e2a5`)	2011-01-21 20:36:01 +01:00
Ronald S. Bultje	66c6b5e2a5	Revert `2a1f431d38`, it broke H264 lossless.	2011-01-20 17:24:44 -05:00
Ronald S. Bultje	8bcfe7f7fd	Set gray (128) U/V planes for chroma-less samples. Fixes two fate samples when played with -flags emu_edge.	2011-01-20 17:24:44 -05:00
Jason Garrett-Glaser	b9af15402d	Remove evil timers that snuck their way into r26375. Originally committed as revision 26377 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-15 18:14:36 +00:00
Jason Garrett-Glaser	fb2734c8a6	Fix r26375 on non-x86. Originally committed as revision 26376 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-15 18:13:40 +00:00
Jason Garrett-Glaser	f14bdd8e75	H.264: Partially inline CABAC residual decoding Improves CABAC performance about ~1.2%. Trick originates from x264 and has also been used in ffvp8. It's useful because coded block flags are usually zero, so it helps to have the early termination inlined into the main function. Originally committed as revision 26375 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-15 17:52:48 +00:00
Jason Garrett-Glaser	2a1f431d38	H.264/SVQ3: make chroma DC work the same way as luma DC No speed improvement, but necessary for some future stuff. Also opens up the possibility of asm chroma dc idct/dequant. Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-15 01:10:46 +00:00
Jason Garrett-Glaser	5657d14094	H.264: switch to x264-style tracking of luma/chroma DC NNZ Useful so that we don't have to run the hierarchical DC iDCT if there aren't any coefficients. Opens up some future opportunities for optimization as well. Originally committed as revision 26337 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-14 21:36:16 +00:00
Jason Garrett-Glaser	19fb234e4a	H.264: split luma dc idct out and implement MMX/SSE2 versions About 2.5x the speed. NOTE: the way that the asm code handles large qmuls is a bit suboptimal. If x264-style dequant was used (separate shift and qmul values), it might be possible to get some extra speed. Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-14 21:34:25 +00:00
Diego Biurrun	ba87f0801d	Remove explicit filename from Doxygen @file commands. Passing an explicit filename to this command is only necessary if the documentation in the @file block refers to a file different from the one the block resides in. Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-04-20 14:45:34 +00:00
Benoit Fouet	32e543f866	Replace @returns by @return. Originally committed as revision 22729 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-30 15:50:57 +00:00
Alexander Strange	767738f7a3	h264: Use + instead of | in some places 6 insns less on x86-64/gcc 4.2. Originally committed as revision 22692 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-26 05:04:03 +00:00
Alexander Strange	601ca8c55c	h264: Remove unused function argument Originally committed as revision 22690 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-26 03:31:56 +00:00
Alexander Strange	f7ba470d58	h264: Simplify decode_cabac_residual() specialization Gives more consistent inlining with some compilers (such as llvm). Originally committed as revision 22689 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-26 03:29:31 +00:00
Michael Niedermayer	8897b247a5	Remove some unneeded fill_rectangle() for 16x16 blocks. Originally committed as revision 22124 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-28 23:54:24 +00:00
Zhou Zongyi	821fe7f3e6	Optimize (amvd>2)+(amvd>32), about 1 cpu cycle faster. patch by Zhou Zongyi @ zhouzy () os punkt pku dot edu speck cn Originally committed as revision 22084 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-26 22:45:35 +00:00
Michael Niedermayer	b5bd070029	Change mvd_cache & mvd_table to 8bit, this is overall a bit faster for high resolution videos. About 20 cycles faster per MB for cathedral. Originally committed as revision 22038 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 20:43:06 +00:00
Michael Niedermayer	81b5e4ee92	Calculate mvd without abs() same speed (ask gcc why, I don't know) Originally committed as revision 22035 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 18:50:02 +00:00
Michael Niedermayer	855a1ba5e8	switch back to (amvd>2)+(amvd>32), it's 5 cpu cycles faster now. Originally committed as revision 22032 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 18:16:48 +00:00
Michael Niedermayer	01b35be14a	Factorize common code from the top of decode_cabac_mb_mvd() 10-15 cpu cycles faster. Originally committed as revision 22029 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 18:06:02 +00:00
Michael Niedermayer	6d0155c79c	Replace mvd>2 + mvd>32 by MIN((mvd+28)*17>>9, 2) same speed as far as I can measure but it might have fewer branches on some archs. Idea from x264 / jason Originally committed as revision 22027 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 16:16:08 +00:00
Michael Niedermayer	90332debfe	Replace ad-hoc fill rectangle by fill_rectangle(). Originally committed as revision 22025 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-24 13:12:09 +00:00
Michael Niedermayer	f4ce853125	get rid of an if() 1 cpu cycle faster. Originally committed as revision 21889 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-19 03:10:26 +00:00
Michael Niedermayer	e69bfde6b2	Get rid of a local variable, 10 cpu cycles faster. Originally committed as revision 21888 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-19 02:37:11 +00:00
Michael Niedermayer	a305449df6	Move abs() from decode_cabac_mb_mvd() to the code that writes mvd_cache. 4-8 cycles faster Originally committed as revision 21887 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-18 23:37:48 +00:00
Michael Niedermayer	90a5849efd	Speedup decode_cabac_field_decoding_flag() by 9 cpu cycles. Originally committed as revision 21875 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-18 12:13:21 +00:00
Michael Niedermayer	69cc31832f	Move check for and call of predict_field_decoding_flag() from the mb code to the row code. This function would only be needed on a MB basis for MBAFF+FMO Originally committed as revision 21860 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-17 02:14:02 +00:00
Michael Niedermayer	59f733d1b1	2x faster ff_h264_init_cabac_states(), 4k cpu cycles less. Sadly this is just per slice so the speedup with normal files should be negligible. Originally committed as revision 21859 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-16 23:43:08 +00:00
Michael Niedermayer	37a9719a97	2 cpu cycles faster context calculation for decode_cabac_intra_mb_type() Originally committed as revision 21845 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-16 02:51:37 +00:00
Michael Niedermayer	5806e8cd1f	Drop a few redundant slice_num checks. Originally committed as revision 21844 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-16 00:09:30 +00:00
Michael Niedermayer	053074276b	Drop compute_mb_neighbors() and move fill_decode_neighbors() up to take its role. Should be faster as this is a strict code removal. Originally committed as revision 21843 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-15 23:04:07 +00:00
Michael Niedermayer	c1bb66ac19	Split setting neighboring MBs from fill_decode_caches() no speed change. Originally committed as revision 21842 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-15 22:07:02 +00:00
Michael Niedermayer	cf55f59d5e	Simplify decode_cabac_mb_intra4x4_pred_mode(). same speed Originally committed as revision 21839 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-15 19:22:09 +00:00
Michael Niedermayer	f4060611e9	Merge decode_cabac_mb_type_b() into calling code. This avoids a conditional branch and is about 3 cpu cycles faster. Originally committed as revision 21838 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-15 19:20:49 +00:00
Michael Niedermayer	64dd1b0a1d	Merge the single line function decode_cabac_mb_transform_size() into the calling code. 8 cpu cycles faster Originally committed as revision 21828 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-15 01:04:07 +00:00
Michael Niedermayer	8b38d10761	indent Originally committed as revision 21827 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-14 23:10:02 +00:00
Michael Niedermayer	f4b8b82514	Merge decode_cabac_mb_dqp() with surrounding code. ~20 cpu cycles faster Originally committed as revision 21826 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-14 23:06:25 +00:00
Michael Niedermayer	a59b9ee33d	Set sub_mb_type in direct_cache instead of just the direct flag. Simpler, cleaner and faster. Originally committed as revision 21822 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-14 16:51:31 +00:00
Michael Niedermayer	2dc380ca8e	Store sub_mb_type in direct_cache/direct_table. This is equal complexity but could be more useful. Originally committed as revision 21821 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-14 14:41:27 +00:00
Michael Niedermayer	3d2c3ef4b4	Remove slice_table checks from decode_cabac_mb_cbp_luma() and set left/top_cbp so these checks aren't needed. Originally committed as revision 21819 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-14 02:08:48 +00:00
Michael Niedermayer	2773920698	Optimize decode_cabac_field_decoding_flag(). ~4 cpu cycles faster Originally committed as revision 21447 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-25 02:44:34 +00:00
Måns Rullgård	c67278098d	Move array specifiers outside DECLARE_ALIGNED() invocations Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-22 03:25:11 +00:00
Michael Niedermayer	7231ccf4d5	Cosmetic, get rid of &x[0] Originally committed as revision 21309 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-18 23:55:19 +00:00
Michael Niedermayer	f432b43b08	Split fill_caches() between filter and decoder. Originally committed as revision 21271 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-17 21:43:08 +00:00
Michael Niedermayer	c988f97566	Rearchitecturing the stitched up goose part 1 Run loop filter per row instead of per MB, this also should make it much easier to switch to per frame filtering and also doing so in a separate thread in the future if some volunteer wants to try. Overall decoding speedup of 1.7% (single thread on pentium dual / cathedral sample) This change also allows some optimizations to be tried that would not have been possible before. Originally committed as revision 21270 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-17 20:35:55 +00:00
Michael Niedermayer	ddd60f28d8	Replace cabac checks in inline functions from h264.h with constants. No benchmark because it's just replacing variables with literal constants (so no risk for slowdown outside gcc silliness) and I need sleep. Originally committed as revision 21237 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-16 05:41:33 +00:00
Michael Niedermayer	cc51b28299	Split cabac decoding code out of h264.c. not slower according to benchmarks. Originally committed as revision 21181 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-13 02:35:36 +00:00
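
Note on commits 6d0155c79c, 855a1ba5e8 and 821fe7f3e6: the log switches between two expressions, (amvd>2)+(amvd>32) and MIN((mvd+28)*17>>9, 2). The following is a minimal sketch, not code from the ffmpeg tree, that checks the two forms agree for non-negative inputs. The function names and the tested range are assumptions made for illustration only.

```c
/* Comparison form quoted in the log: yields 0, 1 or 2. */
static int ctx_inc_compare(int amvd)
{
    return (amvd > 2) + (amvd > 32);
}

/* Multiply-and-shift form quoted in the log: MIN((mvd+28)*17>>9, 2). */
static int ctx_inc_mulshift(int amvd)
{
    int v = ((amvd + 28) * 17) >> 9;
    return v < 2 ? v : 2;
}

/* Hypothetical helper: count inputs in [0, limit] where the forms differ. */
static int count_mismatches(int limit)
{
    int bad = 0;
    for (int amvd = 0; amvd <= limit; amvd++)
        if (ctx_inc_compare(amvd) != ctx_inc_mulshift(amvd))
            bad++;
    return bad;
}
```

The thresholds line up because (2+28)*17 = 510 is just below 512 and (3+28)*17 = 527 is just above it, while (32+28)*17 = 1020 is just below 1024 and (33+28)*17 = 1037 is just above it. Which form is faster depends on the compiler and CPU, as the commit messages themselves report.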

151 Commits