731 Commits

Author SHA1 Message Date
Michael Niedermayer
d42fc4a8ca Use the new VLC table for the first non-trailing coeff too.
Sadly only 5 cycles faster here on pentium dual. So maybe the
complexity is not worth it and this should be reverted ...

Originally committed as revision 16295 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 19:10:46 +00:00
Michael Niedermayer
593af7cdda Optimize esc removal code.
Originally committed as revision 16294 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 18:31:44 +00:00
Michael Niedermayer
2d76bf391a Indent
Originally committed as revision 16292 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 17:50:36 +00:00
Michael Niedermayer
8140955d39 unified CAVLC level decoding LUT.
Quite a bit faster (HPCVMOLQ_BRCM_B.264 was 3% faster here)

Originally committed as revision 16291 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 17:49:38 +00:00
Michael Niedermayer
abb27cfb24 100l, I broke H.264 again, forgot one hunk.
Thanks to FATE for finding it.

Originally committed as revision 16285 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 01:11:56 +00:00
Michael Niedermayer
e08715d391 Optimize 0 0 0-3 search, 45% faster on pentium dual.
Originally committed as revision 16284 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-23 00:38:45 +00:00
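For readers unfamiliar with the "0 0 0-3" pattern in e08715d391: in an H.264 byte stream, two zero bytes followed by a byte in the range 0 to 3 mark either a start code or an escape sequence. The sketch below is a plain byte-by-byte scan for that pattern. It is not the optimized code from the commit, and the function name `find_zero_zero_le3` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg's code: returns the index of the first
 * position i where buf[i] == 0, buf[i+1] == 0 and buf[i+2] <= 3,
 * or len if no such position exists. */
static size_t find_zero_zero_le3(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 2 < len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] <= 3)
            return i;
    }
    return len;
}
```

A faster variant would typically avoid testing every byte individually; the commit message does not say how the 45% gain was obtained, so no particular method is shown here.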
Michael Niedermayer
ec3686e889 Simplify decode_cabac_mb_ref() a little bit, 2 cpu cycles faster on
pentium dual.

Originally committed as revision 16279 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 17:14:13 +00:00
Michael Niedermayer
26695973c7 Indent
Originally committed as revision 16278 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 16:56:05 +00:00
Michael Niedermayer
b68a455313 inline decode_cabac_mb_type for I & P frames, 9 cycles faster on pentium dual.
Originally committed as revision 16277 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 16:46:17 +00:00
Michael Niedermayer
1952ac3713 Negate 2 more variables, 1 cpu cycle faster on pentium dual.
Originally committed as revision 16276 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 16:20:13 +00:00
Michael Niedermayer
03a035e059 Simplify if/else, no speed change
Originally committed as revision 16275 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 16:14:06 +00:00
Michael Niedermayer
6f3c50f2f9 Negate a few variables, this simplifies the code and makes it 5 cycles faster
on pentium dual.

Originally committed as revision 16274 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 16:10:35 +00:00
Michael Niedermayer
60c6ba7aea Simplify ifs(), 8 cpu cycles faster on pentium dual
Originally committed as revision 16273 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 15:54:27 +00:00
Michael Niedermayer
127a20e3b8 Simplify if(), 3 cpu cycles faster on pentium dual.
Originally committed as revision 16272 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 15:53:00 +00:00
Diego Biurrun
a6493a8fbd Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 09:12:42 +00:00
Diego Biurrun
bef05f05e4 Remove a bunch of unused variables.
Originally committed as revision 16263 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 00:10:36 +00:00
Michael Niedermayer
befc8fe086 Remove useless code.
Originally committed as revision 16253 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-21 17:49:40 +00:00
Michael Niedermayer
c212fb0cb1 Only execute clear_blocks() when needed.
+0.3% speedup for both aladin & cathedral.

Originally committed as revision 16252 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-21 15:58:42 +00:00
Michael Niedermayer
66c07ca96f Optimize get_dct8x8_allowed().
30 cpu cycles faster on pentium dual.

Originally committed as revision 16248 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-20 23:11:30 +00:00
Jason Garrett-Glaser
aac8b76983 H.264 loopfilter speed tweaks
Originally committed as revision 16240 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 14:49:17 +00:00
Michael Niedermayer
a5805aa9d1 Fix decoding with the plain C idcts of
FRExt/HPCAMOLQ_BRCM_B
FRExt/HPCAQ2LQ_BRCM_B
FRExt/HPCVMOLQ_BRCM_B

Originally committed as revision 16236 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 09:21:54 +00:00
Michael Niedermayer
a5b807a6c1 Replace /2 by >>1 in decode_cabac_mb_dqp()
3 cpu cycles speed up on pentium dual.

Originally committed as revision 16233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 02:28:51 +00:00
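As background for a5b807a6c1: for a signed `int`, `x / 2` rounds toward zero, so the compiler usually has to emit a small correction for negative inputs, while `x >> 1` is a single shift. The two agree whenever `x` is non-negative. The sketch below only illustrates that arithmetic; the helper names are hypothetical and nothing here is taken from `decode_cabac_mb_dqp()` itself.

```c
/* Hypothetical helpers that only illustrate the arithmetic. */

/* Signed division by 2: C99 requires truncation toward zero. */
static int half_div(int x)
{
    return x / 2;
}

/* Shift by 1: identical to half_div() for x >= 0.
 * For negative x the result of >> is implementation-defined in C,
 * so this form is only a safe replacement when x is known to be >= 0. */
static int half_shift(int x)
{
    return x >> 1;
}
```

The replacement is therefore only valid where the value is known to be non-negative, which is presumably the case at the call site changed by the commit.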
Michael Niedermayer
1aea5d35e5 Simplify ctx update in decode_cabac_mb_dqp().
no speed change

Originally committed as revision 16232 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 02:00:33 +00:00
Michael Niedermayer
7cfca0dfd8 Simplify ctx calculation in decode_cabac_mb_dqp()
no speed change

Originally committed as revision 16231 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 01:50:20 +00:00
Alexander Strange
d43696309a Clear FF_INPUT_BUFFER_PADDING_SIZE bytes at the end of NALs in rbsp_buffer.
Fixes valgrind uninitialized value warnings at the end of decoding H.264
frames.

Originally committed as revision 16230 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 01:11:52 +00:00
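The idea behind d43696309a can be sketched as follows: a bitstream reader may read a few bytes past the end of the payload, so the buffer is allocated with extra padding, and that padding is zeroed so those reads never touch uninitialized memory. This is a minimal sketch under assumptions: `PADDING_SIZE` is a hypothetical stand-in for `FF_INPUT_BUFFER_PADDING_SIZE`, and `copy_with_zero_padding` is not an FFmpeg function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FF_INPUT_BUFFER_PADDING_SIZE. */
#define PADDING_SIZE 8

/* Hypothetical helper: copies len payload bytes into a fresh buffer and
 * zeroes PADDING_SIZE bytes after them. Returns NULL on allocation
 * failure; the caller frees the result with free(). */
static uint8_t *copy_with_zero_padding(const uint8_t *src, size_t len)
{
    uint8_t *dst = malloc(len + PADDING_SIZE);
    if (!dst)
        return NULL;
    if (len)
        memcpy(dst, src, len);
    /* Without this memset, reads past the payload hit uninitialized
     * bytes, which is what valgrind reports. */
    memset(dst + len, 0, PADDING_SIZE);
    return dst;
}
```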
Jason Garrett-Glaser
712ca84c21 Move filter_luma_intra into dsputil for later addition of asm.
Originally committed as revision 16228 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 00:44:51 +00:00
Jason Garrett-Glaser
b9fe706305 Simplify chroma AC in CABAC residual decoding.
Originally committed as revision 16227 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 00:43:30 +00:00
Michael Niedermayer
8955b66950 Optimize ctx calculation in decode_cabac_mb_mvd(), code by dark shikari.
The case for 16x16 blocks becomes 10 cpu cycles faster on pentium dual,
I could not find a speed difference in the case of subblocks, though.

Originally committed as revision 16226 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-19 00:05:39 +00:00
Michael Niedermayer
17779f39b6 Remove unacceptable NULL pointer hack from mc code.
Originally committed as revision 16225 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 23:52:32 +00:00
Michael Niedermayer
04618b98e3 Check ref values in CABAC H.264 for validity.
Originally committed as revision 16224 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 23:31:10 +00:00
Michael Niedermayer
c25ac15a07 Move idct_(dc)add closer to where it is needed.
Originally committed as revision 16223 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 18:25:11 +00:00
Michael Niedermayer
aebb5d6d96 indent
Originally committed as revision 16222 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 18:14:38 +00:00
Michael Niedermayer
96465b90a1 Reorder ifs in chroma hl_decode_mb to avoid a duplicate transform_bypass
check.
14 cpu cycles speedup on Pentium Dual

Originally committed as revision 16221 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 18:12:59 +00:00
Michael Niedermayer
6456d6d87c s/h->cbp_table[mb_xy]/h->cbp/
Originally committed as revision 16220 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 17:42:44 +00:00
Michael Niedermayer
04824298a9 Faster CAVLC decoding of trailing_ones. Based on a patch by dark shikari.
decode_residual is about 3.3% faster.

Originally committed as revision 16219 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 16:11:19 +00:00
Jason Garrett-Glaser
93445d1617 Replace i by trailing_ones, part of a patch by dark shikari.
No speed change measurable with START/STOP_TIMER, but this is needed
for future optimizations.

Originally committed as revision 16218 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 15:51:32 +00:00
Michael Niedermayer
c375d87f6f Remove if() surrounding decode_cabac_mb_type() that can never be true.
Originally committed as revision 16217 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 15:32:07 +00:00
Michael Niedermayer
c325b5054f Remove unreachable else clause, found by dark shikari.
Originally committed as revision 16216 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 15:21:15 +00:00
Michael Niedermayer
dae006d7d7 Remove useless IS_8x8DCT check I forgot, spotted by dark shikari.
Originally committed as revision 16215 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 11:17:12 +00:00
Michael Niedermayer
1eb960352b Do not calculate idct_dc_add/idct_add when the variables are unused.
Originally committed as revision 16210 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 04:13:02 +00:00
Michael Niedermayer
62bc966f8f Remove redundant nnz variable.
Originally committed as revision 16209 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 03:04:53 +00:00
Michael Niedermayer
0a8ca22f4e indent
Originally committed as revision 16208 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 02:56:44 +00:00
Michael Niedermayer
2fd1f0e026 Use the new idct functions (except chroma as it was slower in benchmarks)
cathedral +0.5% speed
aladin +0.6% speed [note aladin has been cat-ed 10 times to reduce the influence
of init time]
Speedup also verified via START/STOP_TIMER (difference was very significant
for the changed parts)

Originally committed as revision 16207 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 02:53:18 +00:00
Michael Niedermayer
49c084a745 Skip non intra luma code when there is no coded luma.
0.7% speedup for the cathedral sample.

Originally committed as revision 16203 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 01:10:51 +00:00
Michael Niedermayer
621561cdf3 Skip chroma handling when there is no coded chroma.
0.5% overall speedup for the cathedral sample.

Originally committed as revision 16201 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-18 00:46:54 +00:00
Michael Niedermayer
4080e67c8e Replace != 0 || check by |
3 cpu cycles faster

Originally committed as revision 16183 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-17 02:53:03 +00:00
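One plausible reading of 4080e67c8e is that a test of the form `a != 0 || b != 0` was replaced by a test on `a | b`. The logical form needs two comparisons and, because `||` short-circuits, often a branch; the bitwise form combines the operands first and tests once. The sketch below shows that the two are equivalent for integers. The helper names are hypothetical and the actual expression in the commit may differ.

```c
/* Hypothetical helpers; both return 1 if either argument is non-zero. */

/* Logical form: two comparisons, short-circuit evaluation. */
static int any_nonzero_logical(int a, int b)
{
    return a != 0 || b != 0;
}

/* Bitwise form: a | b is non-zero exactly when a or b has a bit set,
 * so a single test of the combined value is enough. */
static int any_nonzero_bitwise(int a, int b)
{
    return (a | b) != 0;
}
```

The bitwise form always evaluates both operands, so it is only a drop-in replacement when neither operand has side effects.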
Michael Niedermayer
ad9ca7e720 Split filter_mb_dir() out of filter_mb().
1% overall decoding speed up for cathedral-beta2-400extra-crop-avc.mp4
no speed change for Aladin.mpg
Benchmarks done on Pentium dual

Originally committed as revision 16182 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-17 02:35:14 +00:00
Michael Niedermayer
ac0623b23c Fix indentation, also do a little vertical alignment of changed lines.
Originally committed as revision 16176 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-16 22:16:11 +00:00
Michael Niedermayer
6120a343aa Factorize 3 multiplications out, code becomes 3 cpu cycles faster.
(not significant as that's just per MB)

Originally committed as revision 16174 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-16 21:08:16 +00:00
Michael Niedermayer
1dd488e955 Move ENABLE_SMALL back to the per MB check, as otherwise gcc won't remove
the code.

Originally committed as revision 16173 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-16 20:43:39 +00:00