Michael Niedermayer
b99f3cabed
write cabac low and range variables as early as possible to prevent stalls from reading them before they were written, the P4 is said to dislike that a lot, on P3 it's 2% faster (START/STOP_TIMER over decode_residual)
...
Originally committed as revision 6657 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 16:11:41 +00:00
Michael Niedermayer
d17faef011
use ecx instead of cl (no speed change on P3 but might avoid partial register stalls on some cpus)
...
Originally committed as revision 6656 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 15:20:08 +00:00
Michael Niedermayer
d61c4e731e
make state transition tables global as they are constant and the code is slightly faster that way
...
Originally committed as revision 6655 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 14:44:17 +00:00
Michael Niedermayer
5f3eca121e
10l
...
Originally committed as revision 6654 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 13:25:29 +00:00
Michael Niedermayer
0fa352c7e6
make lps_range a global table, it's constant anyway (saves 1 addition for accessing it)
...
Originally committed as revision 6653 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 13:21:42 +00:00
Michael Niedermayer
3650b43959
enable CMOV_IS_FAST as it's faster or equal speed on every cpu (duron, athlon, PM, P3) from which I've seen benchmarks, it might be slower on P4 but no one has posted benchmarks ...
...
Originally committed as revision 6652 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 12:23:40 +00:00
Diego Biurrun
0bc2e7f081
BRANCHLESS_CABAD --> BRANCHLESS_CABAC_DECODER
...
Originally committed as revision 6623 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-10 08:16:41 +00:00
Michael Niedermayer
9ed92c65f1
moving another bit&1 out, this is as fast as with it in there, but it makes more sense with it outside of the loop
...
Originally committed as revision 6618 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-10 06:56:51 +00:00
Michael Niedermayer
f1b37db48d
move the &1 out of the asm so gcc can optimize it away in inlined cases (yes this is slightly faster)
...
Originally committed as revision 6616 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-10 01:17:39 +00:00
Michael Niedermayer
ab0151d163
replace a few and/sub/... by cmov
...
this is faster on P3, should be faster on AMD, and should be slower on P4
it's disabled by default (benchmarks welcome so we know when to enable it)
Originally committed as revision 6615 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-10 01:08:39 +00:00
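The "and/sub by cmov" replacement in the entry above can be illustrated in plain C. This is a minimal sketch, not the committed x86 asm: `select_mask` and `select_cmov` are hypothetical names, and whether a compiler actually emits `cmov` for the ternary form depends on the compiler and target.

```c
#include <assert.h>
#include <stdint.h>

/* Mask-based select (the and/sub style): returns b when take_b is 1, else a. */
uint32_t select_mask(uint32_t a, uint32_t b, int take_b)
{
    assert(take_b == 0 || take_b == 1);
    uint32_t mask = -(uint32_t)take_b;   /* 0x00000000 or 0xFFFFFFFF */
    return a + ((b - a) & mask);         /* unsigned arithmetic wraps, so this is well defined */
}

/* Conditional-move style: a compiler may (but need not) turn this into cmov. */
uint32_t select_cmov(uint32_t a, uint32_t b, int take_b)
{
    return take_b ? b : a;
}
```

Both functions return the same value for every input; which one is faster is the CPU-dependent question the commit message raises.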
Michael Niedermayer
a6672acf45
reading 8bit mem into an 8bit register needs 2 uops on P4, 8bit->32bit with zero extension needs just 1
...
Originally committed as revision 6612 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 21:57:10 +00:00
Michael Niedermayer
2d3df05ca0
on the P4 inc needs twice as much time as add
...
Originally committed as revision 6611 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 21:39:07 +00:00
Michael Niedermayer
2ee9dc65be
10l
...
Originally committed as revision 6610 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 21:21:10 +00:00
Michael Niedermayer
7822e1c1ff
reverse remainder of the failed attempt to optimize *state=c->mps_state[s]
...
Originally committed as revision 6609 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 21:14:16 +00:00
Michael Niedermayer
ef0090a998
x86 branchless cabac decoder
...
slightly faster on P3
Originally committed as revision 6608 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 20:51:33 +00:00
Michael Niedermayer
2e1aee80f4
optimize branchless C CABAC decoder
...
Originally committed as revision 6607 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 20:44:11 +00:00
Michael Niedermayer
1c2a417f6a
move commented-out START/STOP_TIMER to a hopefully better place for benchmarking ...
...
Originally committed as revision 6605 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 18:20:00 +00:00
Michael Niedermayer
30dc5f56ad
drop failed attempt to optimize *state= c->mps_state[s];
...
Originally committed as revision 6604 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 15:52:17 +00:00
Michael Niedermayer
c56d23dacf
10l bugfix for some disabled code
...
Originally committed as revision 6603 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 14:15:53 +00:00
Michael Niedermayer
f7d0b68361
first try of a handwritten get_cabac() for x86, this is 10-20% faster on P3 depending on whether you try to subtract the START/STOP_TIMER overhead
...
Originally committed as revision 6602 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 14:15:14 +00:00
Michael Niedermayer
5bbe2a5292
remove bytestream_end checks, seems to work fine without them and the bitstream reader doesn't check for the end either
...
Originally committed as revision 6599 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 12:25:24 +00:00
Michael Niedermayer
c010d69a75
decrease ff_h264_norm_shift[] size
...
Originally committed as revision 6596 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-09 00:59:42 +00:00
Michael Niedermayer
6ff042699f
cleanup
...
Originally committed as revision 6594 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-08 21:26:08 +00:00
Michael Niedermayer
260ceb6322
branchless renormalization (1% faster get_cabac); the old branchless renormalization wasn't faster because gcc was scared of the shift variable (misusing bit variable now)
...
Originally committed as revision 6587 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-08 13:20:22 +00:00
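The idea behind branchless renormalization in the entry above can be sketched in C. This is a simplified illustration, not the committed code: `CabacSketch` is a hypothetical struct, the range is kept as the unscaled 9-bit CABAC value, and the refill of `low` from the bytestream is left out. It assumes at most one shift is needed, which is the case the `_once` variant handles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified decoder state (unscaled 9-bit range). */
typedef struct {
    uint32_t range;
    uint32_t low;
} CabacSketch;

/* Reference version with a branch. */
void renorm_once_branchy(CabacSketch *c)
{
    assert(c->range >= 0x80 && c->range <= 0x1FE);
    if (c->range < 0x100) {
        c->range <<= 1;
        c->low   <<= 1;
    }
}

/* Branchless version: range - 0x100 wraps around and sets the top bit
 * exactly when range < 0x100, so the shift amount is 1 or 0. */
void renorm_once_branchless(CabacSketch *c)
{
    assert(c->range >= 0x80 && c->range <= 0x1FE);
    uint32_t shift = (c->range - 0x100u) >> 31;
    c->range <<= shift;
    c->low   <<= shift;
}
```

Both versions leave the state identical; the second only trades the conditional jump for a subtract and a shift.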
Michael Niedermayer
99ce10873d
5% faster get_cabac()
...
Originally committed as revision 6586 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-08 11:24:37 +00:00
Michael Niedermayer
400d0f8e47
disable benchmarking code
...
disable asm optims as the fastest depends on cpu type
Originally committed as revision 6582 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 22:37:34 +00:00
Michael Niedermayer
4310580db5
renorm_cabac_decoder_once START/STOP_TIMER scores for athlon
...
Originally committed as revision 6581 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 22:34:32 +00:00
Michael Niedermayer
5659b509c7
refill cabac variables in 16bit steps, 3% faster get_cabac()
...
Originally committed as revision 6578 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 15:44:14 +00:00
Diego Biurrun
b78e7197a8
Change license headers to say 'FFmpeg' instead of 'this program/this library'
...
and fix GPL/LGPL version mismatches.
Originally committed as revision 6577 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 15:30:46 +00:00
Michael Niedermayer
2ae7569dc8
() 10l
...
Originally committed as revision 6576 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 12:41:55 +00:00
Michael Niedermayer
ec8f483ab5
several x86 renorm_cabac_decoder_once optimizations
...
START/STOP_TIMER benchmarking code for them
please benchmark on P4 & athlon
(I'll remove the benchmarking code and the always slower variants as soon as p4/athlon benchmarks have been posted or committed)
Originally committed as revision 6573 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 11:15:10 +00:00
Loren Merritt
938dd84693
don't try to inline cabac functions. gcc ignored the hint anyway, and forcing it would make h264 slower.
...
Originally committed as revision 6549 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-04 07:16:10 +00:00
Loren Merritt
bfe328caf0
tweak cabac. 0.5% faster h264.
...
Originally committed as revision 6106 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-27 09:19:02 +00:00
Loren Merritt
2848ce84d2
don't force asserts in release builds. 2% faster h264.
...
Originally committed as revision 5332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-29 00:43:15 +00:00
Diego Biurrun
5509bffa88
Update licensing information: The FSF changed postal address.
...
Originally committed as revision 4842 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-12 22:43:26 +00:00
Diego Biurrun
115329f160
COSMETICS: Remove all trailing whitespace.
...
Originally committed as revision 4749 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-17 18:14:38 +00:00
Loren Merritt
6d1feb028f
decode h264 end-of-slice flag
...
Originally committed as revision 4316 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-05-29 18:18:13 +00:00
Måns Rullgård
88730be651
kill warnings patch by (Måns Rullgård <mru inprovide com>)
...
Originally committed as revision 3977 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-02-24 19:08:50 +00:00
Michael Niedermayer
ec7eb8966c
optimization
...
Originally committed as revision 3639 to svn://svn.ffmpeg.org/ffmpeg/trunk
2004-10-26 03:12:21 +00:00
Michael Niedermayer
bba8334965
overread fix
...
Originally committed as revision 3294 to svn://svn.ffmpeg.org/ffmpeg/trunk
2004-07-08 00:53:21 +00:00
Michael Niedermayer
e96682e6f4
some of the warning fixes by (Michael Roitzsch <mroi at users dot sourceforge dot net>)
...
Originally committed as revision 3140 to svn://svn.ffmpeg.org/ffmpeg/trunk
2004-05-18 17:09:46 +00:00
Alex Beregszaszi
b46243ed1c
get_bit_count -> put_bits_count
...
Originally committed as revision 2753 to svn://svn.ffmpeg.org/ffmpeg/trunk
2004-02-06 17:51:58 +00:00
Michael Niedermayer
7408ad05cc
10l
...
Originally committed as revision 1940 to svn://svn.ffmpeg.org/ffmpeg/trunk
2003-06-09 19:11:50 +00:00
Michael Niedermayer
5e20f836b3
FFV1 codec (our very simple lossless intra only codec, compresses much better than huffyuv)
...
Originally committed as revision 1939 to svn://svn.ffmpeg.org/ffmpeg/trunk
2003-06-09 02:24:51 +00:00
Michael Niedermayer
8f8c0800f8
cleanup
...
Originally committed as revision 1932 to svn://svn.ffmpeg.org/ffmpeg/trunk
2003-06-06 10:04:15 +00:00
Michael Niedermayer
61ccfcc009
(truncated) unary binarization
...
unary k-th order exp golomb binarization
Originally committed as revision 1920 to svn://svn.ffmpeg.org/ffmpeg/trunk
2003-05-30 01:05:48 +00:00
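The two binarizations named in the entry above can be sketched as follows. This is an illustration based on the H.264 definitions of truncated unary and k-th order Exp-Golomb bin strings, not the committed functions: the names are hypothetical, and the bins are written as '0'/'1' characters into a caller-supplied buffer instead of being fed to the arithmetic coder.

```c
#include <assert.h>
#include <stddef.h>

/* Truncated unary: v ones, then a terminating zero unless v == max.
 * Writes a NUL-terminated string of bins and returns the bin count. */
size_t binarize_truncated_unary(unsigned v, unsigned max, char *out)
{
    assert(v <= max);
    size_t n = 0;
    for (unsigned i = 0; i < v; i++)
        out[n++] = '1';
    if (v < max)
        out[n++] = '0';
    out[n] = '\0';
    return n;
}

/* k-th order Exp-Golomb: unary prefix of ones while v >= 2^k (growing k),
 * a zero separator, then the remaining k bits of v, most significant first. */
size_t binarize_exp_golomb(unsigned v, unsigned k, char *out)
{
    assert(k < 16 && v < (1u << 16));   /* keeps the shifts below in range */
    size_t n = 0;
    while (v >= (1u << k)) {
        out[n++] = '1';
        v -= 1u << k;
        k++;
    }
    out[n++] = '0';
    while (k--)
        out[n++] = ((v >> k) & 1u) ? '1' : '0';
    out[n] = '\0';
    return n;
}
```

For example, value 3 with max 5 becomes 1110 in truncated unary, and value 3 with k = 0 becomes 11000 in Exp-Golomb form. The output buffer has to be large enough for the longest bin string plus the terminator.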
Michael Niedermayer
d592f67fb6
CABAC
...
note, this is just the CABAC (de)coder not complete h264-cabac support
Originally committed as revision 1915 to svn://svn.ffmpeg.org/ffmpeg/trunk
2003-05-28 18:44:52 +00:00
Roland Scheidegger
14e9ffc1e4
h264: use one table instead of several for cabac functions
...
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.
The assembly uses the new table but still won't work for PIC case.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:26:12 -07:00