* qatar/master: (40 commits)
swf: check return values for av_get/new_packet().
wavpack: Don't shift minclip/maxclip
rtpenc: Expose the max packet size via an avoption
rtpenc: Move max_packet_size to a context variable
rtpenc: Add an option for not sending RTCP packets
lavc: drop encode() support for video.
snowenc: switch to encode2().
snowenc: don't abuse input picture for storing information.
a64multienc: switch to encode2().
a64multienc: don't write into output buffer when there's no output.
libxvid: switch to encode2().
tiffenc: switch to encode2().
tiffenc: properly forward error codes in encode_frame().
lavc: drop libdirac encoder.
gifenc: switch to encode2().
libvpxenc: switch to encode2().
flashsvenc: switch to encode2().
Remove libpostproc.
lcl: don't overwrite input memory.
swscale: take first/lastline over/underflows into account for MMX.
...
Conflicts:
.gitignore
Makefile
cmdutils.c
configure
doc/APIchanges
libavcodec/Makefile
libavcodec/allcodecs.c
libavcodec/libdiracenc.c
libavcodec/libxvidff.c
libavcodec/qtrleenc.c
libavcodec/tiffenc.c
libavcodec/utils.c
libavformat/mov.c
libavformat/movenc.c
libpostproc/Makefile
libpostproc/postprocess.c
libpostproc/postprocess.h
libpostproc/postprocess_altivec_template.c
libpostproc/postprocess_internal.h
libpostproc/postprocess_template.c
libswscale/swscale.c
libswscale/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Unrolling the main loop to process, instead of 4 elements:
- 8: minor gain of 2 cycles (not worth the extra object size)
- 2: loss of 8 cycles.
Assigning STEP to a register is a loss. Output address (Y) is almost always
unaligned.
Timings:
- C (32/64 bits): 117/109 cycles
- SSE: 57 cycles
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C /32bits: 82c (unrolled)/102c
C /64bits: 69c (unrolled)/82c
SSE/32bits: 42c
SSE/64bits: 31c
Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Since we are clipping before we shift the values to
16 or 32 bits, we should not shift the min/max clip
values to compensate.
Fixes 8 and 24 bit lossy decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
If the PNG filter is enabled, a PNG-style filter will run over the
input buffer, writing into the buffer. Therefore, if no zlib compression
was used, ensure that we copy into a temporary buffer, otherwise we
overwrite user-provided input data.
This is done in preparation for the following patch implementing
encode2().
This commit is analogous to 05d699222dd5af4f5775f9890aa825ede05a144f for
libx264.
This prevents crashers and errors further down when reading nodes in the
empty tree.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
* qatar/master:
dxva2: don't check for DXVA_PictureParameters->wDecodedPictureIndex
img2: split muxer and demuxer into separate files
rm: prevent infinite loops for index parsing.
aac: fix infinite loop on end-of-frame with sequence of 1-bits.
mov: Add more HDV and XDCAM FourCCs.
lavf: don't set AVCodecContext.has_b_frames in compute_pkt_fields().
rmdec: when using INT4 deinterleaving, error out if sub_packet_h <= 1.
cdxl: correctly synchronize video timestamps to audio
mlpdec_parser: fix a few channel layouts.
Add channel names to channel_names[] array for channels added in b2890f5
movenc: Buffer the mdat for the initial moov fragment, too
flvdec: Ignore the index if the ignidx flag is set
flvdec: Fix indentation
movdec: Don't parse all fragments if ignidx is set
movdec: Restart parsing root-level atoms at the right spot
prores: use natural integer type for the codebook index
mov: Add support for MPEG2 HDV 720p24 (hdv4)
swscale: K&R formatting cosmetics (part I)
swscale: variable declaration and placement cosmetics
Conflicts:
configure
libavcodec/aacdec.c
libavcodec/mlp_parser.c
libavformat/flvdec.c
libavformat/img2.c
libavformat/isom.h
libavformat/mov.c
libavformat/movenc.c
libswscale/rgb2rgb.c
libswscale/rgb2rgb_template.c
libswscale/yuv2rgb.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The operations that use it require it to be promoted to a larger (natural)
type and thus perform sign extension on it.
While an optimal compiler may account for this, gcc 4.6 (for x86 Windows)
fails. Using the natural integer type provides a 2% speedup for Win64
and 1% for Win32.
Signed-off-by: Diego Biurrun <diego@biurrun.de>