This is essential for fast SIMD accesses.
The same should be done with the predict output.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
19% faster
smaller files
this may also fix possible integer overflows due to previous 32bit useage
Tested with libutvideo and our utvideo decoder, this patch does not change
decoder output in the test
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
build: x86: Only compile mpegvideo optimizations when necessary
configure: Drop fastdiv option
build: Make the E-AC-3 encoder select the AC-3 encoder
fate: flac: Only run tests requiring samples when samples are available
Conflicts:
configure
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Pass pointer to sample buffer instead of channel number to various
functions called from decode_subframe(). Also simplify a few
expressions within this function.
* qatar/master:
fate: Add FATE tests for the Ut Video encoder
lavc: add Ut Video encoder
mpegvideo_enc: remove stray duplicate line from 7f9aaa4
swscale: x86: fix #endif comments in rgb2rgb template file
avconv: mark more options as expert.
avconv: split printing "main options" into global and per-file.
avconv: refactor help printing.
Conflicts:
Changelog
ffmpeg_opt.c
ffserver.c
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The special cases in demuxers and decoders are a mess otherwise (and more
would be needed to support it fully)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The h264_vdpau decoder crashed if output colorspace was not 8-bit 420.
Add a check to error out instead (current hardware does not support
other colorspaces, so successful decoding is not possible).
Check implemented at a different place by michael, thus blame for bugs goes to michael
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mpegvideo_enc: don't use deprecated avcodec_encode_video().
cmdutils: refactor -codecs option.
avconv: make -shortest a per-output file option.
lavc: add avcodec_descriptor_get_by_name().
lavc: add const to AVCodec* function parameters.
swf(dec): replace CODEC_ID with AV_CODEC_ID
dvenc: don't use deprecated AVCODEC_MAX_AUDIO_FRAME_SIZE
rtmpdh: Do not generate the same private key every time when using libnettle
rtp: remove ff_rtp_get_rtcp_file_handle().
rtsp.c: use ffurl_get_multi_file_handle() instead of ff_rtp_get_rtcp_file_handle()
avio: add (ff)url_get_multi_file_handle() for getting more than one fd
h264: vdpau: fix crash with unsupported colorspace
amrwbdec: Decode the fr_quality bit properly
Conflicts:
Changelog
cmdutils.c
cmdutils_common_opts.h
doc/ffmpeg.texi
ffmpeg.c
ffmpeg.h
ffmpeg_opt.c
libavcodec/h264.c
libavcodec/options.c
libavcodec/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Also print a warning if neither quality nor bitrate is specified
and use the libvpx default bitrate in this case.
The idea of using the default bitrate is from Luca Barbato
Reviewed-by: James Zern <jzern@google.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The h264_vdpau decoder crashed if output colorspace was not 8-bit 420.
Add a check to error out instead (current hardware does not support
other colorspaces, so successful decoding is not possible).
Signed-off-by: Martin Storsjö <martin@martin.st>
The way this bit is decoded was accidentally flipped in b70feb405,
leading to warnings "Encountered a bad or corrupted frame" for each
decoded frame.
Signed-off-by: Martin Storsjö <martin@martin.st>
* qatar/master:
libvpxenc: use the default bitrate if not set
utvideo: Rename utvideo.c to utvideodec.c
doc: Fix syntax errors in sample Emacs config
mjpegdec: more meaningful return values
configure: clean up Altivec detection
getopt: Remove an unnecessary define
rtmp: Use int instead of ssize_t
getopt: Add missing includes
rtmp: Add support for receiving incoming streams
Add missing includes for code relying on external libraries
Conflicts:
libavcodec/libopenjpegenc.c
libavcodec/libvpxenc.c
libavcodec/mjpegdec.c
libavformat/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
Fix even more missing includes after the common.h removal
build: Factor out rangecoder dependencies to CONFIG_RANGECODER
build: Factor out error resilience dependencies to CONFIG_ERROR_RESILIENCE
x86: avcodec: Consistently name all init files
Add more missing includes after removing the implicit common.h
Add some more missing includes after removing the implicit common.h
Don't include common.h from avutil.h
rtmp: Automatically compute the hash for SWFVerification
Conflicts:
configure
doc/APIchanges
doc/examples/decoding_encoding.c
libavcodec/Makefile
libavcodec/assdec.c
libavcodec/audio_frame_queue.c
libavcodec/avpacket.c
libavcodec/dv_profile.c
libavcodec/dwt.c
libavcodec/libtheoraenc.c
libavcodec/rawdec.c
libavcodec/rv40dsp.c
libavcodec/tiff.c
libavcodec/tiffenc.c
libavcodec/v210dec.h
libavcodec/vc1dsp.c
libavcodec/x86/Makefile
libavfilter/asrc_anullsrc.c
libavfilter/avfilter.c
libavfilter/buffer.c
libavfilter/formats.c
libavfilter/vf_ass.c
libavfilter/vf_drawtext.c
libavfilter/vf_fade.c
libavfilter/vf_select.c
libavfilter/video.c
libavfilter/vsrc_testsrc.c
libavformat/version.h
libavutil/audioconvert.c
libavutil/error.h
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The logic here was off. If the packet size is exactly two, then
it's a well-formed empty subtitle, used to mark the end of the
duration of the previous subtitle.
Signed-off-by: Philip Langdale <philipl@overt.org>
Unsurprisingly, if a timing-less subrip decoder is desireable, an
encoder is as well. With this in place, we can move on to remove
the use of the old encoder/decoder with embedded timing and move
all timing handling the (de)muxer where they belong.
Signed-off-by: Philip Langdale <philipl@overt.org>
After various discussions, we concluded that, amongst other things,
it made sense to have a separate subrip decoder that did not use
in-band timing information, and rather relied on the ffmpeg level
timing.
As this is 90% the same as the existing srt decoder, it's implemented
in the same file.
Signed-off-by: Philip Langdale <philipl@overt.org>
this crashes otherwise, and can happen from try_decode_frame() in the case of decoding errors
Fixes Ticket1602
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
rtmp: Add support for SWFVerification
api-example: use new video encoding API.
x86: avcodec: Appropriately name files containing only init functions
mpegvideo_mmx_template: drop some commented-out cruft
libavresample: add mix level normalization option
w32pthreads: Add missing #includes to make header compile standalone
rtmp: Gracefully ignore _checkbw errors by tracking them
rtmp: Do not send _checkbw calls as notifications
prores: interlaced ProRes encoding
Conflicts:
doc/examples/decoding_encoding.c
libavcodec/proresenc_kostya.c
libavcodec/w32pthreads.h
libavcodec/x86/Makefile
libavformat/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This function is always called with a non-negative argument, so
those special cases are not needed. In the places the argument
might be zero, the return value for a zero argument does not matter
since it would then be used to scale an array full of zeros.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Note that the symbols used to run the hardware decoder in asynchronous mode
have been marked deprecated and will be dropped at a future version bump.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
DVD subtitles packets can only encode a single rectangle:
if there are several, copy them into a big transparent one.
DVD subtitles rely on an external 16-colors palette:
use a reasonable default one, stored in the private context,
and encode it into the extradata, as specified by Matroska.
TODO: allow to change the palette with an option.
Each packet can use four colors out of the global palette.
The old logic was to map transparent colors to the color 0
and all other colors to 3, 2, 1, cyclically in descending
frequency order, completely disregarding the original color.
Select the "best" four colors from the global palette, according
to heuristics based on frequency, opacity and brightness, and
arrange them in standard DVD order: background, foreground,
outline, other.
TODO: select the alpha value more finely; see if CHG_COLCON can
allow more than 4 colors per packet.
Reference:
http://dvd.sourceforge.net/dvdinfo/spu.html
With these changes, dvdsubenc can be used to transcode DVB subtitles
and get a very decent result.
Note that the symbols used to run the hardware decoder in asynchronous mode
has been marked as deprecated and will be dropped at a future version dump.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This fixes two issues preventing suncc from building this code.
The undocumented 'a' operand modifier, causing gcc to omit a $ in
front of immediate operands (as required in addresses), is not
supported by suncc. Luckily, the also undocumented 'c' modifer
has the same effect and is supported.
On some asm statements with a large number of operands, suncc for no
obvious reason fails to correctly substitute some of the operands.
Fortunately, some of the operands in these statements are plain
numbers which can be inserted directly into the code block instead
of passed as operands.
With these changes, the code builds correctly with both gcc and
suncc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This code contains a C array of addresses of labels defined in
inline asm. To do this, the names must be declared as external
in C. The declared type does not matter since only the address is
used, and for some reason, the author of the code used the 'void'
type despite taking the address of a void expression being invalid.
Changing the type to char, a reasonable choice since the alignment
of the code labels cannot be known or guaranteed, eliminates gcc
warnings and allows building with suncc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
many branches and cases of scale_vector are irrelevant for the case here
and by inlining they can be reliably removed.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It appears someone thinks this special case can be reached
Well, it cannot, thus not only do we not need to optimize it
we dont need it at all
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (22 commits)
g723.1: do not pass large structs by value
g723.1: do not bounce intermediate values via memory
g723.1: declare a variable in the block it is used
g723.1: avoid saving/restoring excitation
g723.1: avoid unnecessary memcpy() in residual_interp()
g723.1: make postfilter write directly to output buffer
g723.1: drop unnecessary variable buf_ptr in formant_postfilter()
g723.1: make scale_vector() output to a separate buffer
g723.1: make autocorr_max() work on an arbitrary buffer
g723.1: do not needlessly use int64_t
g723.1: use saturating addition functions
g723.1: optimise scale_vector()
g723.1: remove useless uses of MUL64()
g723.1: remove unnecessary argument 'shift' from dot_product()
g723.1: deobfuscate "(x << 4) - x" to "15 * x"
celp: optimise ff_celp_lp_synthesis_filter()
libavutil: add saturating addition functions
cllc: Implement ARGB support
cllc: Add support for QRGB
cllc: Rename some funcs to represent what they actually do
...
Conflicts:
LICENSE
libavcodec/g723_1.c
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Although a reasonable compiler will probably optimise out the
actual store and load, this operation still implies a truncation
to 16 bits which the compiler will probably not realise is not
necessary here.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Writing the scaled excitation to a scratch buffer (borrowing the
'audio' array) instead of modifying it in place avoids the need
to save and restore the unscaled values.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Use saturating addition functions instead of 64-bit intermediates
and separate clipping. This is much faster when dedicated
instructions are available.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Firstly, nothing in this function can overflow 32 bits so the use
of a 64-bit type is completely unnecessary. Secondly, the scale
is either a power of two or 0x7fff. Doing separate loops for these
cases avoids using multiplications. Finally, since only the number
of bits, not the actual value, of the maximum value is needed, the
bitwise or of all the values serves the purpose while being faster.
It is worth noting that even if overflow could happen, it was not
handled correctly anyway.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The operands in both cases are 16-bit so cannot overflow a 32-bit
destination. In gain_scale() the inputs are reduced to 14-bit,
so even the shift cannot overflow.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Adding instead of subtracting the products in the loop allows the
compiler to generate more efficient multiply-accumulate instructions
when 16-bit multiply-subtract is not available. ARM has only
multiply-accumulate for 16-bit operands. In general, if only one
variant exists, it is usually accumulate rather than subtract.
In the same spirit, using the dedicated saturation function enables
use of any special optimised versions of this.
Signed-off-by: Mans Rullgard <mans@mansr.com>