Use saturating addition functions instead of 64-bit intermediates
and separate clipping. This is much faster when dedicated
instructions are available.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Firstly, nothing in this function can overflow 32 bits so the use
of a 64-bit type is completely unnecessary. Secondly, the scale
is either a power of two or 0x7fff. Doing separate loops for these
cases avoids using multiplications. Finally, since only the number
of bits, not the actual value, of the maximum value is needed, the
bitwise or of all the values serves the purpose while being faster.
It is worth noting that even if overflow could happen, it was not
handled correctly anyway.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The operands in both cases are 16-bit so cannot overflow a 32-bit
destination. In gain_scale() the inputs are reduced to 14-bit,
so even the shift cannot overflow.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* qatar/master:
g723.1: fix addition overflow
g723.1: simplify and fix multiplication overflow
g723.1: deobfuscate an expression
g723.1: remove unused #includes
ARM: add missing "cc" clobber in av_clipl_int32_arm()
rtmp: Factorize the code by adding handle_invoke_error
rtmp: Factorize the code by adding handle_invoke_status
rtmp: Factorize the code by adding handle_invoke_result
libavutil: remove unused av_abort() macro
ffmenc: replace if/abort with assert()
libavutil: drop offsetof() fallback definition
libavutil: drop fallback definitions of INTxx_MIN/MAX
configure: Check for a sctp struct instead of just the header
configure: suncc: Add -xc99 to dependency flags, required on Solaris
doxygen: Fix function parameter names to match the code
doc: Drop obsolete shared libs cflags hint to workaround Cygwin gcc bugs
swf: Move shared table out of the header file
swf: Move swf_audio_codec_tags table to the only place it is used
fate: add G.723.1 decoder tests
Conflicts:
configure
doc/platform.texi
libavformat/Makefile
libavutil/arm/intmath.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
In 16-bit arithmetic, x * 0xffffc is simply x * -4 with extra overflows,
(and the constant was probably meant to be 0xfffc). Combined with the
shift, this simplifies to -x >> 1. Finally, clearing the low two bits
with a 32-bit mask and switching to a 32-bit type allows more efficient
code on 32-bit machines.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* qatar/master: (23 commits)
build: cosmetics: Reorder some lists in a more logical fashion
x86: pngdsp: Fix assembly for OS/2
fate: add test for RTjpeg in nuv with frameheader
rtmp: send check_bw as notification
g723_1: clip argument for 15-bit version of normalize_bits()
g723_1: use all LPC vectors in formant postfilter
id3v2: Support v2.2 PIC
avplay: fix build with lavfi disabled.
avconv: split configuring filter configuration to a separate file.
avconv: split option parsing into a separate file.
mpc8: do not leave padding after last frame in buffer for the next decode call
mpegaudioenc: list supported channel layouts.
mpegaudiodec: don't print an error on > 1 frame in a packet.
api-example: update to new audio encoding API.
configure: add --enable/disable-random option
doc: cygwin: Update list of FATE package requirements
build: Remove all installed headers and header directories on uninstall
build: change checkheaders to use regular build rules
rtmp: Add a new option 'rtmp_subscribe'
rtmp: Add support for subscribing live streams
...
Conflicts:
Makefile
common.mak
configure
doc/examples/decoding_encoding.c
ffmpeg.c
libavcodec/g723_1.c
libavcodec/mpegaudiodec.c
libavcodec/x86/pngdsp.asm
libavformat/version.h
library.mak
tests/fate/video.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
It expects maximum value to be 32767 but calculations in scale_vector()
which uses this function can give it ABS(-32768) which leads to wrong
result and thus clipping is needed.
* qatar/master:
vc1dec: Remove separate scaling function for interlaced field MVs
vc1dec: Invoke edge_emulation regardless of MV precision
x86: Use consistent 3dnowext function and macro name suffixes
g723_1: scale output as supposed for the case with postfilter disabled
g723_1: increase excitation storage by 4
g723_1: fix upper bound parameter from inverse maximum autocorrelation
g723_1: make scale_vector() behave like the reference
g723_1: fix off-by-one error in normalize_bits()
g723_1: save/restore excitation with offset to store LPC history
wmapro: prevent division by zero when sample rate is unspecified
x86: proresdsp: improve SIGNEXTEND macro comments
x86: h264dsp: K&R formatting cosmetics
LICENSE: Document all GPL files
Conflicts:
libavcodec/g723_1.c
libavcodec/wmaprodec.c
libavcodec/x86/h264dsp_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixed codebook mode in 5300 rate may write up to SUBFRAME_LEN + 4 and
that is considered normal by the reference decoder. Without that additional
padding it might overwrite first elements of LPC history.
* qatar/master:
build: fix standalone compilation of OMA muxer
build: fix standalone compilation of Microsoft XMV demuxer
build: fix standalone compilation of Core Audio Format demuxer
kvmc: fix invalid reads
4xm: Add a check in decode_i_frame to prevent buffer overreads
adpcm: fix IMA SMJPEG decoding
options: set minimum for "threads" to zero
bsd: use number of logical CPUs as automatic thread count
windows: use number of CPUs as automatic thread count
linux: use number of CPUs as automatic thread count
pthreads: reset active_thread_type when slice thread_init returrns early
v410dec: include correct headers
Drop ALT_ prefix from BITSTREAM_READER_LE name.
lavfi: always build vsrc_buffer.
ra144enc: zero the reflection coeffs if the filter is unstable
sws: readd PAL8 to isPacked()
mov: Don't stick the QuickTime field ordering atom in extradata.
truespeech: fix invalid reads in truespeech_apply_twopoint_filter()
Conflicts:
configure
libavcodec/4xm.c
libavcodec/avcodec.h
libavfilter/Makefile
libavfilter/allfilters.c
libavformat/Makefile
libswscale/swscale_internal.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>