Commit Graph

327 Commits

Author SHA1 Message Date
Anton Khirnov
25408b2a06 h264: make ff_h264_frame_start static.
It is not called from outside h264.c
2013-03-21 10:19:54 +01:00
Anton Khirnov
759001c534 lavc decoders: work with refcounted frames. 2013-03-08 07:38:30 +01:00
Diego Biurrun
5f401b7b71 Add missing error_resilience includes to files that use ER 2013-03-07 15:04:49 +01:00
Diego Biurrun
94ee7da08d Remove pointless av_cold attributes in header files
The init functions marked as av_cold have to be executed in any case,
so there is no gain from trying to mark paths leading to such functions
as unlikely.
2013-02-23 20:13:47 +01:00
Ronald S. Bultje
fae6fd5b87 h264/svq3: Stop using draw_edges
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:34:33 +02:00
Ronald S. Bultje
7ebfb466ae h264: Don't store intra pcm samples in h->mb
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:34:14 +02:00
Anton Khirnov
2c54155407 h264: deMpegEncContextize
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.

The nontrivial parts are:
1) extracting a simplified version of the frame management code from
   mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
   its own more complex system already and those were set only to appease
   the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
   for dxva, the draw_horiz_band() call is moved from
   ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
   because it's now different for h264 and MpegEncContext-based
   decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
   added some very simplistic frame management instead and dropped the
   use of ff_h264_frame_start(). Because of this I also had to move some
   initialization code to svq3.

Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-02-15 16:35:16 +01:00
Anton Khirnov
9782c778a2 h264: remove silly macros
They serve no useful purpose and wreak all kind of havoc when h264.h is
included elsewhere.
2013-02-06 21:45:02 +01:00
Diego Biurrun
79dad2a932 dsputil: Separate h264chroma 2013-02-06 11:30:53 +01:00
Mans Rullgard
e9d817351b dsputil: Separate h264 qpel
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Diego Biurrun
88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Anton Khirnov
940b8b5861 h264: avoid pointless copying of ref lists
ref_list is constructed from other fields per slice when needed, so do
not copy it for both frame and slice threading.
default_ref_list is constructed per frame and still needs to be copied
to per-slice contexts for slice threading, but a copy is not needed for
frame threading.
2013-01-18 07:56:05 +01:00
Anton Khirnov
ea382767ad h264: fix ff_generate_sliding_window_mmcos() prototype.
It's been returning an error value since
bad446e251

Also check for the errors it returns.
2013-01-14 21:36:08 +01:00
Ronald S. Bultje
bad446e251 h264: don't clobber mmco opcode tables for non-first slice headers.
Clobbering these tables will temporarily clobber the template used
as a basis for other threads to start decoding from. If the other
decoding thread updates from the template right at that moment,
subsequent threads will get invalid (or, usually, none at all) mmco
tables. This leads to invalid reference lists and subsequent decode
failures.

Therefore, instead, decode the mmco tables only for the first slice in
a field or frame. For other slices, decode the bits and ensure they
are identical to the mmco tables in the first slice, but don't ever
clobber the context state. This prevents other threads from using a
clobbered/invalid template as starting point for decoding, and thus
fixes decoding in these cases.

This fixes occasional (~1%) failures of h264-conformance-mr1_bt_a with
frame-multithreading enabled.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-14 19:20:47 +01:00
Janne Grunau
9e696d2e5f h264: support frame parameter changes during frame-mt
Fixes CVE-2012-2782.
2012-12-18 19:55:10 +01:00
Janne Grunau
73ad2c2fa7 h264: increase dist_scale_factor for up to 32 references
Compute dist_scale_factor_field only for MBAFF since that is the only
case in which it is used.
2012-12-18 19:36:58 +01:00
Janne Grunau
61c6eef545 h264: prevent decoding of slice NALs in extradata
It is not posible to call get_buffer during frame-mt codec
initialization. Libavformat might pass huge amounts of data as
extradata after parsing broken files. The 'extradata' for the fuzzed
sample sample_varPAR_s5374_r001-02.avi is 2.8M large and contains
multiple slices.
2012-12-18 11:01:14 +01:00
Janne Grunau
6a27ae28f9 mpegvideo: treat delayed pictures as used
This requires to move the avcodec_default_free_buffers() call to
ff_MPV_common_end() since otherwise delayed pictures would get freed
during a size change.
2012-12-13 21:02:42 +01:00
Janne Grunau
072be3e896 h264: set parameters from SPS whenever it changes
Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with
alternating bit depths.
2012-12-13 21:02:42 +01:00
Janne Grunau
a394959bbe h264: add a pointer for weighted prediction temporary buffer
Reusing MpegEncContext's obmc_scratchpad for this becomes a mess with
adaptive frame-mt.
2012-12-07 11:43:28 +01:00
Diego Biurrun
be545b8a34 h264: K&R formatting cosmetics for header files (part I/II) 2012-05-10 13:02:47 +02:00
Diego Biurrun
0becb07842 h264: Factorize declaration of mb_sizes array. 2012-04-05 17:17:22 +02:00
Diego Biurrun
9ad80ef3db h264: Make ff_h264_decode_end() static, it is not used externally.
Also drop the now unnecessary ff_ prefix from its name.
2012-03-30 17:46:52 +02:00
Ronald S. Bultje
45b7bd7c53 h264: disallow constrained intra prediction modes for luma.
Conversion of the luma intra prediction mode to one of the constrained
("alzheimer") ones can happen by crafting special bitstreams, causing
a crash because we'll call a NULL function pointer for 16x16 block intra
prediction, since constrained intra prediction functions are only
implemented for chroma (8x8 blocks).

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
2012-02-09 22:57:01 -08:00
Janne Grunau
358ea75e9e Revert "h264: skip start code search if the size of the nal unit is known"
This reverts commit 87eebb3454.
2011-12-19 03:24:32 +01:00
Janne Grunau
87eebb3454 h264: skip start code search if the size of the nal unit is known
Start code emulation prevention is only required in Annex B bytestream
packed NAL units. For other coding formats the size is already known.
Looking for a start code prefix can result in false positives like in
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4
which has a false positive in the SPS.
2011-12-18 23:52:53 +01:00
Luca Barbato
5bf2ac2b37 error_resilience: use the ER_ namespace
Add the namespace to {AC_,DC_,MV_}{END,ERROR} macros

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2011-12-13 16:20:58 +01:00
Diego Biurrun
58c42af722 doxygen: misc consistency, spelling and wording fixes 2011-12-12 23:06:23 +01:00
Ronald S. Bultje
adedd840e2 h264: fix frame reordering code.
Fixes fate-h264-conformance-{mr2_tandberg_e,mr3_tandberg_b} without
requiring -strict 1.
2011-12-03 08:24:27 -08:00
Ronald S. Bultje
ea2bb12e3e h264: improve calculation of codec delay.
Fixes the following conformance suite samples:
HCBP1_HHI_A.264, HCBP2_HHI_A.264, HCMP1_HHI_A.264 (main)
HCHP1_HHI_B.264, HCHP2_HHI_A.264, HCHP3_HHI_A.264 (frext)
2011-11-05 06:58:52 -07:00
Baptiste Coudurier
76741b0e56 h264: 4:2:2 intra decoding support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Diego Biurrun
806212498a h264: move fill_decode_neighbors()/fill_decode_caches() to h264_mvpred.h
This fixes a bunch of unused function warnings.
2011-07-14 04:09:49 +02:00
Diego Biurrun
028216b2c2 h264: move decode_mb_skip() from h264.h to h.264_mvpred.h
This resolves a circular dependency between the headers.
2011-07-12 20:36:50 +02:00
Ronald S. Bultje
c90a2538a0 h264: move h264_mvpred.h include.
Fixes the following compile error with darwin/gcc-4.2.1:
In file included from libavcodec/error_resilience.c:33:
libavcodec/h264.h: In function ‘decode_mb_skip’:
libavcodec/h264.h:773: error: ‘always_inline’ function could not be inlined in call to ‘pred_pskip_motion’: the function body must appear before caller
libavcodec/h264.h:1334: error: called from here
2011-07-12 08:15:55 -07:00
Diego Biurrun
657ccb5ac7 Eliminate FF_COMMON_FRAME macro.
FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied
to struct Picture.  Replace by an embedded AVFrame structure in struct Picture.
2011-07-11 00:19:00 +02:00
Jason Garrett-Glaser
ef0c594801 H.264: merge fill_rectangle into P-SKIP MV prediction, to match B-SKIP 2011-07-08 16:12:12 -07:00
Jason Garrett-Glaser
5136ba7c69 H.264: faster P-SKIP decoding
Inline the relevant parts of fill_decode_caches into P-SKIP mv prediction to
avoid calling the whole thing.
2011-07-08 16:11:15 -07:00
Jason Garrett-Glaser
bbdd52ed34 H.264: av_always_inline some more functions
These weren't getting inlined all the time in all gcc versions.
2011-07-08 16:09:35 -07:00
Jason Garrett-Glaser
556f8a066c H.264: template left MB handling
Faster H.264 decoding with ALLOW_INTERLACE off.
2011-07-03 15:06:00 -07:00
Jason Garrett-Glaser
ca80f11ec3 H.264: faster fill_decode_caches
Aliasing avoidance and general cleanup.
2011-07-03 15:05:57 -07:00
Jason Garrett-Glaser
3b7ebeb4d5 H.264: faster write_back_*
Avoid aliasing, unroll loops, and inline more functions.
2011-07-03 15:05:55 -07:00
Reinhard Tartler
21a19b7912 doxygen: Prefer member groups over grouping into modules
Before this, almost all module groups have been used for grouping functions
and fields in structures semantically. This causes them to not appear
properly in the file documentation and needlessly clutters up the "Modules"
index.

Additionally, this commit streamlines some spelling and appearances.
2011-07-02 13:52:29 +02:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Baptiste Coudurier
8dfc6d1f7c svq3: Move svq3-specific fields to their own context.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-03 13:55:54 +02:00
Alexander Strange
6a9c859444 H264/MPEG frame-level multi-threading.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 10:16:20 -07:00
Oskar Arvidsson
fcc0224e4f Add support for higher QP values in h264.
In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)).

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:35 -04:00
Oskar Arvidsson
6e3ef511d7 Add the notion of pixel size in h264 related functions.
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:33 -04:00
Stefano Sabatini
975a1447f7 Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-02 12:18:44 +02:00
Diego Biurrun
e6ff064845 Eliminate pointless '#if 1' statements without matching '#else'. 2011-04-26 20:18:27 +02:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Janne Grunau
fe9a3fbe42 h264: Add Intra and Constrained Baseline profiles to avctx.profile 2011-02-01 20:37:02 +01:00
Diego Elio Pettenò
8529731961 Make ff_h264_decode_rbsp_trailing static to h264.c
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-25 21:48:03 +01:00
Ronald S. Bultje
66c6b5e2a5 Revert 2a1f431d38, it broke H264 lossless. 2011-01-20 17:24:44 -05:00
Jason Garrett-Glaser
2a1f431d38 H.264/SVQ3: make chroma DC work the same way as luma DC
No speed improvement, but necessary for some future stuff.
Also opens up the possibility of asm chroma dc idct/dequant.

Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 01:10:46 +00:00
Jason Garrett-Glaser
5657d14094 H.264: switch to x264-style tracking of luma/chroma DC NNZ
Useful so that we don't have to run the hierarchical DC iDCT if there aren't
any coefficients.  Opens up some future opportunities for optimization as well.

Originally committed as revision 26337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:36:16 +00:00
Jason Garrett-Glaser
19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Eli Friedman
9049fa5479 Add av_unused to decode_mb_skip declaration to fix the following warning:
libavcodec/h264.h:1260: warning: ‘decode_mb_skip’ defined but not used
patch by Eli Friedman, eli.friedman gmail com

Originally committed as revision 24069 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-06 07:40:35 +00:00
Michael Niedermayer
733f5990d0 Factorize ff_generate_sliding_window_mmcos() out.
Originally committed as revision 24056 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-05 12:42:19 +00:00
Måns Rullgård
49bd8e4b84 Fix grammar errors in documentation
Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-30 15:38:06 +00:00
Howard Chu
82f1ffc7ba Cleanup prev commit, flag variable should start with 0
Originally committed as revision 23364 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-05-28 20:14:14 +00:00
Howard Chu
23584bec87 Parse avctx->extradata if available.
Fixes many "non-existing PPS referenced" error messages

Originally committed as revision 23363 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-05-28 18:50:39 +00:00
Howard Chu
05e953193d Factorize ff_h264_decode_extradata().
Patch by Howard Chu, hyc highlandsun com

Originally committed as revision 23340 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-05-26 19:00:59 +00:00
Diego Biurrun
ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Diego Biurrun
d02bb3ecf1 Move static function fill_filter_caches() from h264.h to h264.c.
The function is only used within that file, so it makes sense to place
it there. This fixes many warnings of the type:
h264.h:1170: warning: ‘fill_filter_caches’ defined but not used

Originally committed as revision 22876 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-13 22:15:49 +00:00
Michael Niedermayer
1052b76f0f Fix implicit weight for b frames in mbaff.
Originally committed as revision 22733 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-30 21:05:11 +00:00
Benoit Fouet
32e543f866 Replace @returns by @return.
Originally committed as revision 22729 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-30 15:50:57 +00:00
Måns Rullgård
4693b031a3 Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext.  The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by
non-h264 code.

Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 01:17:00 +00:00
Måns Rullgård
404793f4ac H264: fix signed overflow in constant multiplication
This fixes libavcodec/h264.h:1100: warning: integer overflow in expression

Originally committed as revision 22558 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-15 23:00:53 +00:00
Måns Rullgård
84dc2d8afa Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant.  All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:24:59 +00:00
Michael Niedermayer
38768cb70a Port Optimizations about *_type init from decode to filter code.
1 cpu cycle faster

Originally committed as revision 22193 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-04 02:00:05 +00:00
Michael Niedermayer
b46b5ac9f8 Optimize *_type init, 1.5 cpu cycles faster.
Originally committed as revision 22192 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-04 01:03:15 +00:00
Michael Niedermayer
3d9137c883 Reorder indexes in weight tables.
5 cpu cycles faster.

Originally committed as revision 22183 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 21:10:08 +00:00
Michael Niedermayer
bd8868e092 Move all context fields that are not used in the mb and block layers
to the end of the structure.
4 cpu cycles faster in 3k cpu cycles

Originally committed as revision 22181 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 20:36:56 +00:00
Michael Niedermayer
65f3c029b9 remove unused left_border field from context.
Originally committed as revision 22179 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 19:44:27 +00:00
Michael Niedermayer
af2b0df40f Note about luma/chroma_weight tables and their datatype.
Originally committed as revision 22177 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 19:31:58 +00:00
Michael Niedermayer
d7f5e520bf move svq3 specific fields to the end of the context
Originally committed as revision 22171 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 16:47:40 +00:00
Michael Niedermayer
70118abd68 Merge weight & offset tables, 15 cpu cycles faster.
Originally committed as revision 22169 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 14:41:43 +00:00
Michael Niedermayer
f57880d244 Another 3 useless zeroing instructions.
Originally committed as revision 22162 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 02:20:48 +00:00
Michael Niedermayer
16b802fe93 Load the whole left side of mv&ref only when needed.
30 cpu cycles faster

Originally committed as revision 22161 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-03 01:38:27 +00:00
Michael Niedermayer
ce9c691616 Merge h->slice_table[left_xy[0/1] ] checks, 4 cpu cycles speedup
Originally committed as revision 22086 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-27 04:09:48 +00:00
Michael Niedermayer
82fb5bb2ee Split *_type setting up, 4 cpu cycles faster.
Originally committed as revision 22085 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-27 03:46:16 +00:00
Michael Niedermayer
cf41a02b1b Only load the topleft mv/ref when the topright is unavailable.
8 cpu cycles faster.

Originally committed as revision 22079 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 15:26:11 +00:00
Michael Niedermayer
cf7b67bc40 Remove some useless operations from the code setting left_cbp.
maybe 0.5 cpu cycles faster

Originally committed as revision 22078 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 15:03:00 +00:00
Michael Niedermayer
59b5370f02 Simplify code to set cbp_*
this seems 1 cpu cycle slower even though we practically just remove code.
Speed loss seems caused by the merge of if(left_type), iam commiting this
anyway as i cant imagine this to be anything but compiler messup.

Originally committed as revision 22073 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 09:13:40 +00:00
Michael Niedermayer
747db4e31a Move init of right side of ref_cache from fill_caches() to init_the_darn_decoder().
Originally committed as revision 22071 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 03:27:52 +00:00
Michael Niedermayer
77c6edb846 Remove 3 mv_cache zeroing instructions that zeroed the right side.
This seems unneeded as nothing seems to ever set it to non zero values.

Originally committed as revision 22070 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 02:54:03 +00:00
Michael Niedermayer
8f8497ae78 Remove useless check of the 2 left MBs of a pair being in the same slice.
Originally committed as revision 22069 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 01:38:12 +00:00
Michael Niedermayer
6e2fe0f20a Remove unneeded line of code from the neighbor setting code in h264.
Originally committed as revision 22067 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-26 00:10:35 +00:00
Michael Niedermayer
358b5b1a59 Get rid of mb2b8_xy and b8_stride, change arrays organized based on b8_stride to
ones based on mb_stride in h264.
about 20 cpu cycles faster overall per MB

Originally committed as revision 22065 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 23:44:42 +00:00
Michael Niedermayer
5e350863cc Store data in direct_table interleaved.
seems 20cpu cycles faster

Originally committed as revision 22055 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 15:27:55 +00:00
Michael Niedermayer
013202d720 Simplify intra4x4_pred_mode_cache init.
Originally committed as revision 22054 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 14:54:31 +00:00
Michael Niedermayer
662a5b2370 Reorder intra4x4_pred_mode so that we can read/write 4 values at once.
3-7 cpu cycles faster

Originally committed as revision 22053 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 14:26:12 +00:00
Michael Niedermayer
5b0fb5244d Store intra4x4_pred_mode per row only.
about 5 cpu cycles slower in the local code but should be overall faster
due to reduced cache use. (my sample though has too few intra4x4 blocks
for this to be meassureable easily either way)

Originally committed as revision 22052 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 14:02:39 +00:00
Michael Niedermayer
c2186cbddc unroll tiny and trivial loop. Same speed but clearer.
Originally committed as revision 22051 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 12:51:32 +00:00
Michael Niedermayer
e1c88a2138 Cut the size of mvd_table by yet another factor of 2.
The code read/write code itself was 1 cycle faster, overall its
likely more due to cache effects

Originally committed as revision 22048 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 04:11:33 +00:00
Michael Niedermayer
d43c192236 Keep mvd_table values of only 2 mb rows.
Originally committed as revision 22047 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-25 02:42:25 +00:00
Michael Niedermayer
b5bd070029 Change mvd_cache & mvd_table to 8bit, this is overall a bit faster
for high resolution videos.
about 20cycles faster per MB for cathederal.

Originally committed as revision 22038 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-24 20:43:06 +00:00
Michael Niedermayer
9127a369ad Replace /2 by faster >>1 as the mvd values are now all positive.
Originally committed as revision 22013 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-24 01:57:31 +00:00
Michael Niedermayer
5c34e36a23 Remove unused variable. Seems i forgot to commit this.
Originally committed as revision 22012 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-24 01:56:27 +00:00
Diego Biurrun
dd3475682e Remove unused variable, fixes warnings of the type:
libavcodec/h264.h:816: warning: unused variable `mb_xy'

Originally committed as revision 21941 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-21 15:29:17 +00:00
Måns Rullgård
19769ece3b H264: use alias-safe macros
This eliminates all aliasing violation warnings in h264 code.
No measurable speed difference with gcc-4.4.3 on i7.

Originally committed as revision 21881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-18 16:24:31 +00:00
Michael Niedermayer
69a28f3e2b Move predict_field_decoding_flag() from h264.h to .c as its only used there and belongs
there as well.

Originally committed as revision 21861 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-17 02:25:05 +00:00
Michael Niedermayer
69cc31832f Move check for and call of predict_field_decoding_flag() from the mb code to
the row code. This function would only be needed on a MB basis for MBAFF+FMO

Originally committed as revision 21860 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-17 02:14:02 +00:00
Michael Niedermayer
c1bb66ac19 Split setting neighboring MBs from fill_decode_caches()
no speed change.

Originally committed as revision 21842 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-15 22:07:02 +00:00
Michael Niedermayer
2dc380ca8e Store sub_mb_type in direct_cache/direct_table.
This is equal complexity but could be more usefull.

Originally committed as revision 21821 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-14 14:41:27 +00:00
Michael Niedermayer
3d2c3ef4b4 Remove slice_table checks from decode_cabac_mb_cbp_luma() and set left/top_cbp so
these checks arent needed.

Originally committed as revision 21819 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-14 02:08:48 +00:00
Michael Niedermayer
056c502155 Revert r21814
Log:
	h264: Fix pointer warnings by removing redundant [0]

	Fixes:
	h264.h:1222:38: warning: initialization from incompatible pointer type
	h264.h:1299:38: warning: initialization from incompatible pointer type
	h264.h:1314:42: warning: initialization from incompatible pointer type

Reason: breaks h264 decoding & fate

Originally committed as revision 21818 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-14 02:04:41 +00:00
Michael Niedermayer
e916764675 Direct temporal skiped MBs dont need fill_decode_caches() at all so dont call it
for them.

Originally committed as revision 21816 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 22:53:44 +00:00
Alexander Strange
78998bf217 h264: Remove unused variables.
Originally committed as revision 21815 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 21:09:38 +00:00
Alexander Strange
677dab59cb h264: Fix pointer warnings by removing redundant [0]
Fixes:
h264.h:1222:38: warning: initialization from incompatible pointer type
h264.h:1299:38: warning: initialization from incompatible pointer type
h264.h:1314:42: warning: initialization from incompatible pointer type

Originally committed as revision 21814 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 21:08:17 +00:00
Alexander Strange
cd12c37729 Fix integer overflow warnings in h264.h
Fixes:
h264.h: In function 'fill_filter_caches':
h264.h:1216:73: warning: integer overflow in expression
h264.h:1307:81: warning: integer overflow in expression

Originally committed as revision 21813 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 20:57:13 +00:00
Michael Niedermayer
bb770c5b52 Merge (IS_SKIP(mb_type) || IS_DIRECT(mb_type)
Originally committed as revision 21812 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 20:13:54 +00:00
Michael Niedermayer
2e4362af14 Skiped MBs dont need the cbp stuff so skip initing that.
Originally committed as revision 21811 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 20:13:10 +00:00
Michael Niedermayer
e2b28acf89 Also skip direct/mvd_cache init for skiped blocks.
Odd thing is i thought ive tryed this already and it failed previously.

Originally committed as revision 21809 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 19:39:18 +00:00
Michael Niedermayer
cb9285a246 Move more code under if(!IS_DIRECT(mb_type)).
Originally committed as revision 21806 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 19:00:51 +00:00
Michael Niedermayer
f2b3763736 Skip some more code that isnt needed for direct MBs.
Originally committed as revision 21798 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 18:23:46 +00:00
Michael Niedermayer
5ca43c25f6 Move setting MB_TYPE_L0L1 for direct MBs up, this is simpler.
Originally committed as revision 21794 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 15:57:49 +00:00
Michael Niedermayer
da452acac6 Dont calculate any surrounding MVs for temporal MBs
Originally committed as revision 21793 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-13 15:30:27 +00:00
Michael Niedermayer
8a3b90686d Remove an apparently unneeded && !FRAME_MBAFF.
This should speed the affected cases (MBAFF temporal direct MBs) up.

Originally committed as revision 21686 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-08 04:24:50 +00:00
Michael Niedermayer
3a06e8647f Ooops, 10l forgot to commit h264.h.
Originally committed as revision 21680 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-07 23:15:53 +00:00
Rafaël Carré
881b5b80da Fix svq3_* function declarations.
Patch by Rafaël Carré, rafael D carre A gmail

Originally committed as revision 21489 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 22:22:01 +00:00
Michael Niedermayer
8652e44acd Simplify left_xy init
Originally committed as revision 21470 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-27 00:15:55 +00:00
Michael Niedermayer
599fe45b8d Split fill_caches() between loopfilter & decode, the 2 no longer where common
enough to justify the messy interleaving.

Originally committed as revision 21469 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 23:54:11 +00:00
Michael Niedermayer
dfe4dc154b use left_xy[1] in mbaff QP loop filter check, this improves the amount that can
be skiped.

Originally committed as revision 21465 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 20:28:58 +00:00
Michael Niedermayer
aebf31236e Optimize mv/ref cache init for left MB.
Originally committed as revision 21464 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 20:25:14 +00:00
Michael Niedermayer
a715af8ff4 Simplify left_xy content for the loop filter, this also makes it closer to
what is needed and its faster too.

Originally committed as revision 21458 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 14:55:19 +00:00
Michael Niedermayer
99344d4372 Set top & left types for deblock in fill_caches().
Originally committed as revision 21456 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 13:38:18 +00:00
Michael Niedermayer
66472bcde0 cosmetic
Originally committed as revision 21454 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 13:28:55 +00:00
Michael Niedermayer
3046c25ec5 Fix qp_thres loop filter check for MBAFF.
Originally committed as revision 21453 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 13:27:22 +00:00
Michael Niedermayer
806ac67b51 Optimize mb neighbor initialization for MBAFF in fill_caches().
~10 cpu cycles speedup.

Originally committed as revision 21452 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-26 10:35:36 +00:00
Alexander Strange
0b69d6254f H.264: Use 64-/128-bit write-combining macros for copies
2-3% faster decode on x86-32 core2.

Originally committed as revision 21440 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-25 00:30:44 +00:00
Laurent Aimar
0dc343d4cb Added a missing const to ff_h264_get_slice_type().
Originally committed as revision 21421 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-24 16:37:12 +00:00
Michael Niedermayer
b2b7ab32aa Prefer cbp over cbp_table.
Originally committed as revision 21418 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-24 13:43:26 +00:00
Michael Niedermayer
2c0ee01866 Remove unneeded reset of non_zero_count_cache for deblock.
Originally committed as revision 21414 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 22:11:46 +00:00
Michael Niedermayer
01c511683f Remove useless things from the deblock side of fill_caches().
Originally committed as revision 21413 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 21:57:36 +00:00
Michael Niedermayer
ea3b456dd6 make mv_cache init 64bit where possible.
Originally committed as revision 21412 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-23 21:45:12 +00:00
Måns Rullgård
c67278098d Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 03:25:11 +00:00
Michael Niedermayer
c2894fbf1c Dont waste time initializing stuff for deblocking intra mbs, none of
it is used.

Originally committed as revision 21315 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-19 03:14:45 +00:00
Michael Niedermayer
7a93858a6d Fix accumulated indention errors.
Originally committed as revision 21307 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:34:37 +00:00
Michael Niedermayer
70bd7a3d48 Optimize top non_zero_count_cache init.
Originally committed as revision 21306 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:31:14 +00:00
Michael Niedermayer
5e07aa7721 Dont init chroma elements of non_zero_count_cache for deblock.
Originally committed as revision 21305 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:30:21 +00:00
Michael Niedermayer
5cc5d9bf29 Remove unneeded for_deblock check, this code was alraedy under for_deblock.
Originally committed as revision 21304 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 23:27:53 +00:00
Michael Niedermayer
a7d7cdaac7 Set h->cbp for ff_h264_filter_mb_fast().
Originally committed as revision 21287 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 16:11:13 +00:00
Michael Niedermayer
b6ef858ec7 Move CAVLC 8x8 DCT special case from ff_h264_filter_mb() to fill_caches
that way it is also available for ff_h264_filter_mb_fast().

Originally committed as revision 21283 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 13:09:53 +00:00
Michael Niedermayer
6d7e6b2657 Perform reference remapping at fill_cache() time instead of in the
loop filter. This removes one obstacle of getting ff_h264_filter_mb_fast()
bitexact. code is maybe 0.1% faster

Originally committed as revision 21280 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 05:15:31 +00:00
Michael Niedermayer
7da0d82104 Make qp check for loop filter skiping also work with MBAFF.
Originally committed as revision 21276 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 00:34:28 +00:00
Michael Niedermayer
12be38ec18 Comment about a cornercase we ignore currently
Originally committed as revision 21275 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 00:21:58 +00:00
Michael Niedermayer
44a5e7b64c Move the qp check to skip the loop filter up.
Originally committed as revision 21274 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-18 00:20:44 +00:00