* qatar/master:
docs: use -bsf:[vas] instead of -[vas]bsf.
mpegaudiodec: Prevent premature clipping of mp3 input buffer.
lavf: move the packet keyframe setting code.
oggenc: free comment header for all codecs
lcl: error out if uncompressed input buffer is smaller than framesize.
mjpeg: abort decoding if packet is too large.
golomb: use HAVE_BITS_REMAINING() macro to prevent infloop on EOF.
get_bits: add HAVE_BITS_REMAINING macro.
lavf/output-example: use new audio encoding API correctly.
lavf/output-example: more proper usage of the new API.
tiff: Prevent overreads in the type_sizes array.
tiff: Make the TIFF_LONG and TIFF_SHORT types unsigned.
apetag: do not leak memory if avio_read() fails
apetag: propagate errors.
SBR DSP x86: implement SSE sbr_hf_g_filt
SBR DSP x86: implement SSE sbr_sum_square_sse
SBR DSP: use intptr_t for the ixh parameter.
Conflicts:
doc/bitstream_filters.texi
doc/examples/muxing.c
doc/ffmpeg.texi
libavcodec/golomb.h
libavcodec/x86/Makefile
libavformat/oggenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Prevents a crash when trying to copy from a plane that does not exist in
the reference image, e.g. from an RGB32 reference image to a YUV420P
target image.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Instead of clipping extrasize based on EXTRABYTES, clip based on the
amount of buffer actually left. Without this fix, there are warbles
and other distortions in the test case below.
http://kevincennis.com/mix/assets/sounds/1901_voxfx.mp3
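The clipping rule described above can be sketched in C as follows. This is a minimal illustration, not the decoder's code: the function name and its parameters (wanted, capacity, used) are hypothetical, and it only shows the idea of bounding a copy by the space actually left in a buffer rather than by a fixed constant.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from mpegaudiodec): return how many of the
 * `wanted` bytes may be copied into a buffer of `capacity` bytes that
 * already holds `used` bytes. The bound is the space actually left,
 * not a fixed constant.
 */
size_t clip_to_space_left(size_t wanted, size_t capacity, size_t used)
{
    /* Guard against used > capacity so the subtraction cannot wrap. */
    size_t space_left = used < capacity ? capacity - used : 0;

    return wanted < space_left ? wanted : space_left;
}
```

With a fixed limit, a small request is never shortened even when the buffer is nearly full, and a large request is shortened even when plenty of room remains; bounding by the remaining space avoids both cases.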
This prevents crashes when trying to read beyond the end of the buffer
while decoding frame data.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
* qatar/master: (40 commits)
swf: check return values for av_get/new_packet().
wavpack: Don't shift minclip/maxclip
rtpenc: Expose the max packet size via an avoption
rtpenc: Move max_packet_size to a context variable
rtpenc: Add an option for not sending RTCP packets
lavc: drop encode() support for video.
snowenc: switch to encode2().
snowenc: don't abuse input picture for storing information.
a64multienc: switch to encode2().
a64multienc: don't write into output buffer when there's no output.
libxvid: switch to encode2().
tiffenc: switch to encode2().
tiffenc: properly forward error codes in encode_frame().
lavc: drop libdirac encoder.
gifenc: switch to encode2().
libvpxenc: switch to encode2().
flashsvenc: switch to encode2().
Remove libpostproc.
lcl: don't overwrite input memory.
swscale: take first/lastline over/underflows into account for MMX.
...
Conflicts:
.gitignore
Makefile
cmdutils.c
configure
doc/APIchanges
libavcodec/Makefile
libavcodec/allcodecs.c
libavcodec/libdiracenc.c
libavcodec/libxvidff.c
libavcodec/qtrleenc.c
libavcodec/tiffenc.c
libavcodec/utils.c
libavformat/mov.c
libavformat/movenc.c
libpostproc/Makefile
libpostproc/postprocess.c
libpostproc/postprocess.h
libpostproc/postprocess_altivec_template.c
libpostproc/postprocess_internal.h
libpostproc/postprocess_template.c
libswscale/swscale.c
libswscale/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Effect of unrolling the main loop to process, instead of 4 elements per iteration:
- 8: minor gain of 2 cycles (not worth the extra object size)
- 2: loss of 8 cycles.
Assigning STEP to a register is a loss. Output address (Y) is almost always
unaligned.
Timings:
- C (32/64 bits): 117/109 cycles
- SSE: 57 cycles
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The 32-bit targets were compiled with -mfpmath=sse to give a proper reference.
sbr_sum_square  C  /32bits: 82c (unrolled)/102c
                C  /64bits: 69c (unrolled)/82c
                SSE/32bits: 42c
                SSE/64bits: 31c
Using the SSE4.1 dpps instruction to perform the final sum is slower.
Not unrolling the loop to perform 8 operations per iteration costs 10 more cycles.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
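For readers unfamiliar with the "unrolled" C timings quoted above, here is a hedged sketch of what unrolling a sum-of-squares loop looks like in plain C. It is not the libavcodec implementation: both function names are hypothetical, the input is treated as a flat float array, and the unroll factor of 4 was chosen only for illustration.

```c
#include <stddef.h>

/* Straightforward reference: one multiply-add per iteration. */
float sum_square_plain(const float *x, size_t len)
{
    float sum = 0.0f;

    for (size_t i = 0; i < len; i++)
        sum += x[i] * x[i];
    return sum;
}

/*
 * Unrolled variant: four independent accumulators per iteration, so the
 * additions do not form a single dependency chain. The tail loop handles
 * lengths that are not a multiple of 4.
 */
float sum_square_unrolled(const float *x, size_t len)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= len; i += 4) {
        s0 += x[i + 0] * x[i + 0];
        s1 += x[i + 1] * x[i + 1];
        s2 += x[i + 2] * x[i + 2];
        s3 += x[i + 3] * x[i + 3];
    }
    for (; i < len; i++)
        s0 += x[i] * x[i];

    return (s0 + s1) + (s2 + s3);
}
```

Because floating-point addition is not associative, the two variants can differ in the last bits for general input; they agree exactly only when every intermediate sum is exactly representable.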
Since we are clipping before we shift the values to
16 or 32 bits, we should not shift the min/max clip
values to compensate.
Fixes 8 and 24 bit lossy decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
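The ordering issue described in the wavpack message above can be illustrated with a small C sketch. It is not the decoder's code: the function name and the bits/shift parameters are hypothetical, and it assumes the clip range is the signed range of the sample's native bit depth. The point is that when clipping happens before the shift, the limits must stay in the unshifted domain.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clip `sample` to the signed range of `bits` bits,
 * then scale it up by `shift` bits (e.g. 8-bit data stored in 16 bits).
 * Assumes 1 <= bits and bits + shift <= 32.
 */
int32_t clip_then_shift(int32_t sample, int bits, int shift)
{
    /* Limits in the native (unshifted) domain. */
    const int32_t maxclip = (int32_t)((1u << (bits - 1)) - 1);
    const int32_t minclip = -maxclip - 1;

    if (sample < minclip)
        sample = minclip;
    if (sample > maxclip)
        sample = maxclip;

    /* Multiply instead of << to avoid left-shifting a negative value. */
    return sample * (int32_t)(1u << shift);
}
```

If the limits were shifted as well while the clip still ran before the shift, an out-of-range 8-bit value such as 200 would pass the clip untouched and then overflow the 16-bit output range after scaling.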