This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
v2: incorporated feedback from Loren Merritt to avoid rip-relative movs
for every table, and got rid of unnecessary @GOTPCREL.
v3: apply similar fixes to the the decode_significance functions, and use
same macro arguments for non-pic case.
v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect
the c code to be faster otherwise since both cmov and sbb suck hard on a
Prescott, even can't construct the mask with a 64bit shift as that's just as
terrible - it's quite difficult to find usable instructions on that chip...).
This is tested to work but not on a P4, in theory it _should_ be fast there.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
avcodec: add a cook parser to get subpacket duration
FATE: allow lavf tests to alter input parameters
FATE: replace the acodec-pcm_s24daud test with an enc_dec_pcm checksum test
FATE: replace the acodec-g726 test with 4 new encode/decode tests
FATE: replace current g722 encoding tests with an encode/decode test
FATE: add a pattern rule for generating asynth wav files
FATE: optionally write a WAVE header in audiogen
avutil: add audio fifo buffer
Conflicts:
doc/APIchanges
libavcodec/version.h
libavutil/avutil.h
tests/Makefile
tests/codec-regression.sh
tests/fate/voice.mak
tests/lavf-regression.sh
tests/ref/acodec/g722
tests/ref/acodec/g726
tests/ref/acodec/pcm_s24daud
tests/ref/lavf/dv_fmt
tests/ref/lavf/gxf
tests/ref/lavf/mxf
tests/ref/lavf/mxf_d10
tests/ref/seek/lavf_dv
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Avoids resampling and channel mixing. This only tests the behavior
with respect to input and output audio rather than also testing changes
to the encoder or muxer that do not affect the resulting decoded output.
Avoids resampling and channel mixing. This only tests the behavior
with respect to input and output audio rather than also testing changes
to the encoder or muxer that do not affect the resulting decoded output.
* qatar/master:
dv: Initialize encoder tables during encoder init.
dv: Replace some magic numbers by the appropriate #define.
FATE: pass the decoded output format and audio source file to enc_dec_pcm
FATE: specify the input format when decoding in enc_dec_pcm()
x86inc: support AVX abstraction for 2-operand instructions
configure: detect PGI compiler and set suitable flags
avconv: check for an incompatible changing channel layout
avio: make AVIOContext.av_class pointer to const
nutdec: add malloc check and fix const to non-const conversion warnings
Conflicts:
ffmpeg.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Generating warnings when casting const away leads to tight constraints
on the code if one wants to avoid warnings. This is especially true for
generic code that is supposed to work with both const and non const.
These tight constrains cause people to waste time trying to find a
way to write code so it doesnt generate any warning, while people
should rather spend their time thinking on how to write fast,
clean, maintainable and bug free code.
Removing this class of warnings fixes this issue.
Approved-by: Nicolas George <nicolas.george@normalesup.org>
Approved-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Yasm was fixed in its r2161 and yasm 0.8.0 (Apr 2010) contained this fix.
Nasm was fixed in 2.06 (Jun 2009):
https://groups.google.com/group/alt.lang.asm/browse_thread/thread/fcc85bbc3745d893
I tested with yasm 0.7.99 and yasm 1.2.0.7, where this works fine.
I also tested with nasm. The nasm shipping with Xcode is too old to understand
ffmpeg's assembly, before and after the patch. Nasm 2.10 fails to compile
fft_mmx.asm on trunk with
libavcodec/x86/fft_mmx.asm:88: panic: section ".text" has already been specified with alignment 32, conflicts with new alignment of 16
but builds fine if I change the two alignment "16"s in x86inc.asm to "32". With this patch,
nasm 2.10 fails with
libavcodec/x86/fft_mmx.asm:39: panic: section ".rodata" has already been specified with alignment 32, conflicts with new alignment of 16
instead, but again builds fine with s/16/32/.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
In normal picture decoding this does not need to be checked but as
error concealment is run in the case of errors the availability of
references is less certain. This may be fixed differently at some
point so that all references are always filled in before the EC
code, in which case this should then be changed to an assert()
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This will allow decoding to md5 and doing a diff comparison to a reference
checksum instead of a fuzzy stddev or oneoff comparison.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The output format is not always the same as the file extension,
which is sometimes required for correct probing. We can avoid
probing by specifying the format since it is already known.