Commit Graph

86 Commits

Author SHA1 Message Date
Michael Niedermayer
976a8b2179 Merge remote-tracking branch 'qatar/master'
* qatar/master: (40 commits)
  H.264: template left MB handling
  H.264: faster fill_decode_caches
  H.264: faster write_back_*
  H.264: faster fill_filter_caches
  H.264: make filter_mb_fast support the case of unavailable top mb
  Do not include log.h in avutil.h
  Do not include pixfmt.h in avutil.h
  Do not include rational.h in avutil.h
  Do not include mathematics.h in avutil.h
  Do not include intfloat_readwrite.h in avutil.h
  Remove return statements following infinite loops without break
  RTSP: Doxygen comment cleanup
  doxygen: Escape '\' in Doxygen documentation.
  md5: cosmetics
  md5: use AV_WL32 to write result
  md5: add fate test
  md5: include correct headers
  md5: fix test program
  doxygen: Drop array size declarations from Doxygen parameter names.
  doxygen: Fix parameter names to match the function prototypes.
  ...

Conflicts:
	libavcodec/x86/dsputil_mmx.c
	libavformat/flvenc.c
	libavformat/oggenc.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 00:45:21 +02:00
Jason Garrett-Glaser
556f8a066c H.264: template left MB handling
Faster H.264 decoding with ALLOW_INTERLACE off.
2011-07-03 15:06:00 -07:00
Jason Garrett-Glaser
4320a309ce H.264: make filter_mb_fast support the case of unavailable top mb
Significantly faster deblocking in streams with lots of slices.
2011-07-03 15:05:49 -07:00
Michael Niedermayer
f211d9d839 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  build: improve rules for test programs
  build: factor out the .c and .S compile commands as a macro
  swscale: remove unused xInc/srcW arguments from hScale().
  H.264: disable 2tap qpel with CODEC_FLAG2_FAST and >8-bit
  H.264: make filter_mb_fast support 4:4:4
  mpeg4videoenc: Remove disabled variant of mpeg4_encode_block().
  configure: allow post-fixed cpu strings for athlon64, k8, and opteron when setting the -march flag.
  Move some variable declarations below the proper #ifdefs.

Conflicts:
	Makefile
	ffplay.c
	libswscale/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-27 03:32:45 +02:00
Jason Garrett-Glaser
84153d1883 H.264: make filter_mb_fast support 4:4:4 2011-06-26 14:35:36 -07:00
Michael Niedermayer
4b87a088bf Merge remote-tracking branch 'qatar/master'
* qatar/master:
  configure: add --optflags option
  build: move documentation rules to doc/Makefile
  build: move test rules to tests/Makefile
  ac3enc: remove unneeded local variable in asym_quant()
  ac3enc: remove a branch in asym_quant() by doing 2 shifts
  ac3enc: avoid masking output in asym_quant() by using signed values for quantized mantissas.
  H.264: fix 4:4:4 + deblocking + 8x8dct + cavlc + MBAFF
  H.264: fix 4:4:4 + deblocking + MBAFF
  H.264: fix 4:4:4 cropping warning
  H.264: reference the correct SPS in decode_scaling_matrices
  H.264: fix bug in lossless 4:4:4 decoding

Conflicts:
	Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-23 04:49:04 +02:00
Jason Garrett-Glaser
2702a6f114 H.264: fix 4:4:4 + deblocking + 8x8dct + cavlc + MBAFF 2011-06-22 02:39:20 -07:00
Jason Garrett-Glaser
7c9079ab4c H.264: fix 4:4:4 + deblocking + MBAFF 2011-06-22 02:39:17 -07:00
Michael Niedermayer
c137fdd778 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: remove misplaced comment.
  ffmpeg: fix streaming to ffserver.
  swscale: split out RGB48 output functions from yuv2packed[12X]_c().
  build: move vpath directives to main Makefile
  swscale: fix JPEG-range YUV scaling artifacts.
  build: move ALLFFLIBS to a more logical place
  ARM: factor some repetitive code into macros
  Fix SVQ3 after adding 4:4:4 H.264 support
  H.264: fix CODEC_FLAG_GRAY
  4:4:4 H.264 decoding support
  ac3enc: fix allocation of floating point samples.

Conflicts:
	ffmpeg.c
	libavcodec/dsputil_template.c
	libavcodec/h264.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libswscale/swscale.c
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-15 02:15:25 +02:00
Jason Garrett-Glaser
7b442ad918 H.264: fix CODEC_FLAG_GRAY
It was broken in 4:4:4, and still did chroma deblocking for no reason in 4:2:0.
2011-06-13 21:16:33 -07:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c177cfb4fb H.264: fix CODEC_FLAG_GRAY
It was broken in 4:4:4, and still did chroma deblocking for no reason in 4:2:0.
2011-06-13 12:21:49 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Michael Niedermayer
59eb12faff Merge remote branch 'qatar/master'
* qatar/master: (30 commits)
  AVOptions: make default_val a union, as proposed in AVOption2.
  arm/h264pred: add missing argument type.
  h264dsp_mmx: place bracket outside #if/#endif block.
  lavf/utils: fix ff_interleave_compare_dts corner case.
  fate: add 10-bit H264 tests.
  h264: do not print "too many references" warning for intra-only.
  Enable decoding of high bit depth h264.
  Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
  Add support for higher QP values in h264.
  Add the notion of pixel size in h264 related functions.
  Make the h264 loop filter bit depth aware.
  Template dsputil_template.c with respect to pixel size, etc.
  Template h264idct_template.c with respect to pixel size, etc.
  Preparatory patch for high bit depth h264 decoding support.
  Move some functions in dsputil.c into a new file dsputil_template.c.
  Move the functions in h264idct into a new file h264idct_template.c.
  Move the functions in h264pred.c into a new file h264pred_template.c.
  Preparatory patch for high bit depth h264 decoding support.
  Add pixel formats for 9- and 10-bit yuv420p.
  Choose h264 chroma dc dequant function dynamically.
  ...

Conflicts:
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/alpha/dsputil_alpha.c
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/arm/dsputil_init_neon.c
	libavcodec/arm/dsputil_iwmmxt.c
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/bfin/dsputil_bfin.c
	libavcodec/dsputil.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_ps.c
	libavcodec/h264_refs.c
	libavcodec/h264dsp.c
	libavcodec/h264idct.c
	libavcodec/h264pred.c
	libavcodec/mlib/dsputil_mlib.c
	libavcodec/options.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/h264_altivec.c
	libavcodec/ps2/dsputil_mmi.c
	libavcodec/sh4/dsputil_align.c
	libavcodec/sh4/dsputil_sh4.c
	libavcodec/sparc/dsputil_vis.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/dsputil_mmx.c
	libavformat/options.c
	libavformat/utils.c
	libavutil/pixfmt.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_template.c
	tests/ref/seek/lavf_avi

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-11 05:47:02 +02:00
Oskar Arvidsson
6e3ef511d7 Add the notion of pixel size in h264 related functions.
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:33 -04:00
Oskar Arvidsson
44ca80df34 Make the h264 loop filter bit depth aware.
Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:32 -04:00
Ronald S. Bultje
dd561441b1 h264: DSP'ize MBAFF loopfilter. 2011-05-10 07:24:08 -04:00
Michael Niedermayer
e7077f5e7b H264: replace pixel_size by pixel_shift
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Oskar Arvidsson
dc172ecc6e Add the notion of pixel size in h264 related functions.
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:41 +02:00
Oskar Arvidsson
86b0d9cd58 Make the h264 loop filter bit depth aware.
Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:41 +02:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Diego Biurrun
ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Måns Rullgård
4693b031a3 Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext.  The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by
non-h264 code.

Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 01:17:00 +00:00
Måns Rullgård
84dc2d8afa Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant.  All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:24:59 +00:00
Måns Rullgård
19769ece3b H264: use alias-safe macros
This eliminates all aliasing violation warnings in h264 code.
No measurable speed difference with gcc-4.4.3 on i7.

Originally committed as revision 21881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-18 16:24:31 +00:00
Måns Rullgård
40d1122752 Use LOCAL_ALIGNED macro for local arrays
Originally committed as revision 21866 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-17 20:36:20 +00:00
Alexander Strange
78998bf217 h264: Remove unused variables.
Originally committed as revision 21815 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 21:09:38 +00:00
Michael Niedermayer
9873ae0d44 Fix CAVLC+8x8DCT+MBAFF loopfiltering.
Fixes issue1250

Originally committed as revision 21665 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-07 02:00:00 +00:00
Michael Niedermayer
37b2b0d6cd Get rid of a check in one direction that cant be true in it in that part
of the code.
No meassureable speed change.

Originally committed as revision 21566 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-31 02:05:26 +00:00
Michael Niedermayer
2646814897 Split first reference list comparission from mv comparission.
about 0.5% faster MBAFF loop filtering

Originally committed as revision 21552 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 20:07:37 +00:00
Michael Niedermayer
4e992796a9 Replace h->left_type[0] by the local variable for it we have.
No meassureable speed effect.

Originally committed as revision 21541 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 14:33:25 +00:00
Michael Niedermayer
012dbcce08 slightly faster bit trickery.
Originally committed as revision 21540 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 14:10:06 +00:00
Michael Niedermayer
77821e11b3 Replace ?: by branchless code.
about 0.5% faster loop filtering

Originally committed as revision 21539 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-30 13:40:20 +00:00
Michael Niedermayer
34032e26ab factorize first filter call out, this makes the code somewhat
smaller without any speed loss.

Originally committed as revision 21514 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 19:44:13 +00:00
Michael Niedermayer
592e03a8da Change wraper functions to always inline, they are faster now that way.
1% faster MBAFF decoding overall, maybe ~0.1% faster for the cathedral sample.

Originally committed as revision 21507 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:37:35 +00:00
Michael Niedermayer
5364db2893 indent
Originally committed as revision 21506 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:18:06 +00:00
Michael Niedermayer
2cf0d46d4c Restructure check_mv()
~20 cpu cycles faster loopfilter

Originally committed as revision 21505 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 11:12:46 +00:00
Michael Niedermayer
fabd704b37 Restructure if() in check_mv()
quite a bit faster

Originally committed as revision 21504 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:38:43 +00:00
Michael Niedermayer
ca7c784fdf Unroll loops in check_mv()
~6% faster (slow path) loopfilter (should be ~2% overall)

Originally committed as revision 21503 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:34:06 +00:00
Michael Niedermayer
e814817b74 Factor mv/ref compare code out.
This is a hair slower (0.15% maybe) but i really dont want to have the
identical code duplicated 3 times because gcc adds odd threaded jumps with
register reshuffling and register safe/restore.

Originally committed as revision 21502 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 10:10:02 +00:00
Michael Niedermayer
3b84924516 Simplify first edge filter condition.
Originally committed as revision 21497 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:41:52 +00:00
Michael Niedermayer
b6302d0c55 Cosmetics, mostly indention, 2 or so new fixme comments that i was to lazy
to split out

Originally committed as revision 21496 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:20:31 +00:00
Michael Niedermayer
0a32508d90 Make the fast loop filter path work with unavailable left MBs.
This prevents the issue with having to switch between slow and
fast code paths in each row.
0.5% faster loopfilter for cathedral

Originally committed as revision 21495 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 02:15:25 +00:00
Michael Niedermayer
b304767301 get rid of the start variable.
a few cycles faster

Originally committed as revision 21494 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 01:31:06 +00:00
Michael Niedermayer
980bcc554d Unroll main loop so the edge==0 case is seperate.
This allows many things to be simplified away.
h264 decoder is overall 1% faster with a mbaff sample and
0.1% slower with the cathedral sample, probably because the slow loop
filter code must be loaded into the code cache for each first MB of each
row but isnt used for the following MBs.

Originally committed as revision 21493 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-28 01:24:25 +00:00
Michael Niedermayer
8670f84cf9 Update comment.
Originally committed as revision 21479 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 13:18:08 +00:00
Michael Niedermayer
e470ef7641 Use table to speedup access to non_zero_count in MBAFF with differing interlacing.
~4 cpu cycles speedup

Originally committed as revision 21474 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 11:14:29 +00:00
Michael Niedermayer
16e5e39ab4 Optimize loop filtering of the left edge in MBAFF.
60 cpu cycles speedup

Originally committed as revision 21467 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 22:59:19 +00:00
Michael Niedermayer
6548c939ec remove unneeded check
Originally committed as revision 21460 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 15:34:21 +00:00