Altivec can only load naturally aligned vectors. To handle possibly
unaligned data a second vector is loaded from an offset of the original
location and the data is recovered through a vector permutation.
Overreads are minimal if the offset for second load points to the last
element of data. This is 7 for loading eight 8-bit pixels and overreads
are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned.
For unaligned pixels the overread is reduced from 23 bytes to 15 bytes
in the worst case.
* commit '620289a20e022b9c16c10d546ef86cc0bb77cc84':
sh4: Fix silly type vs. variable name search and replace typo
configure: Group all hwaccels together in a separate variable
Add av_cold attributes to arch-specific init functions
Conflicts:
configure
libavcodec/arm/mpegvideo_armv5te.c
libavcodec/x86/mlpdsp.c
libavcodec/x86/motion_est.c
libavcodec/x86/mpegvideoenc.c
libavcodec/x86/videodsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '25841dfe806a13de526ae09c11149ab1f83555a8':
Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
Conflicts:
libavcodec/alpha/dsputil_alpha.c
libavcodec/dsputil_template.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
float_dsp: ppc: add a separate header for Altivec function prototypes
ARM: fix float_dsp breakage from d5a7229
Add a float DSP framework to libavutil
PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
ARM: Move asm.S from libavcodec to libavutil
vc1dsp: mark put/avg_vc1_mspel_mc() always_inline
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
avplay: use libavresample for sample format conversion and channel mixing
Fix compilation with YASM/NASM without AVX support.
WMAL: do not output last frame again if nothing was decoded in current packet
WMAL: do not start decoding if frame does not end in current packet
adpcm-thp: fix invalid array indexing
ppc: add const where needed in scalarproduct_int16_altivec()
ppc: remove shift parameter from scalarproduct_int16_altivec()
ppc: dsputil: do unaligned block accesses correctly
dvenc: do not call dsputil functions with stride not a multiple of 16
APIchanges: fill in some dates and commit hashes
Conflicts:
doc/APIchanges
ffplay.c
libavcodec/adpcm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
To load unaligned vector data in the usual way, explicit vec_ld()
should be used rather than dereferencing a pointer to a vector type.
When the VSX extension is enabled, gcc may compile vector pointer
dereferences using the VSX lxvw4x instruction instead of the lvx
instruction typically used with Altivec/VMX. As the behaviour of
these instructions with unaligned addresses differs, it is important
that only lvx is used here.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* qatar/master:
dnxhddec: optimise dnxhd_decode_dct_block()
rtp: remove disabled code
eac3enc: use different numbers of blocks per frame to allow higher bitrates
dnxhd: add regression test for 10-bit
dnxhd: 10-bit support
dsputil: update per-arch init funcs for non-h264 high bit depth
dsputil: template get_pixels() for different bit depths
dsputil: create 16/32-bit dctcoef versions of some functions
jfdctint: add 10-bit version
mov: add clcp type track as Subtitle stream.
mpeg4: add Mpeg4 Profiles names.
mpeg4: decode Level Profile for MPEG4 Part 2.
ffprobe: display bitstream level.
imgconvert: remove unused glue and xglue macros
Conflicts:
libavcodec/dsputil_template.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (22 commits)
arm: remove disabled function dct_unquantize_h263_inter_iwmmxt()
Remove commented-out call to non-existing function print_pow1().
Do not decode RV30 files if the extradata is too small
flashsv: split flashsv_decode_block() off from flashsv_decode_frame().
ppc: remove disabled code
libspeexdec: Drop const qualifier to silence compiler warning.
libopenjpeg: Drop const qualifier to silence compiler warning.
alac: Remove unused dummy code.
Remove unused structs and tables.
vaapi: do not assert on value read from input bitstream
flashsvenc: replace bitstream description by a link to the specification
flashsvenc: drop unnecessary cast
flashsvenc: improve some variable names and fix corresponding comments
flashsvenc: merge two consecutive if-conditions
flashsvenc: merge variable declarations and initializations
flashsvenc: convert some debug av_log() to av_dlog()
flashsvenc: whitespace cosmetics
flashsvenc: drop some unnecessary parentheses
flashsvenc: fix some comment typos
aacps: skip some memcpy() if src and dst would be equal
...
Conflicts:
libavcodec/vaapi_mpeg2.c
libavformat/aviobuf.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (30 commits)
AVOptions: make default_val a union, as proposed in AVOption2.
arm/h264pred: add missing argument type.
h264dsp_mmx: place bracket outside #if/#endif block.
lavf/utils: fix ff_interleave_compare_dts corner case.
fate: add 10-bit H264 tests.
h264: do not print "too many references" warning for intra-only.
Enable decoding of high bit depth h264.
Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
Add support for higher QP values in h264.
Add the notion of pixel size in h264 related functions.
Make the h264 loop filter bit depth aware.
Template dsputil_template.c with respect to pixel size, etc.
Template h264idct_template.c with respect to pixel size, etc.
Preparatory patch for high bit depth h264 decoding support.
Move some functions in dsputil.c into a new file dsputil_template.c.
Move the functions in h264idct into a new file h264idct_template.c.
Move the functions in h264pred.c into a new file h264pred_template.c.
Preparatory patch for high bit depth h264 decoding support.
Add pixel formats for 9- and 10-bit yuv420p.
Choose h264 chroma dc dequant function dynamically.
...
Conflicts:
doc/APIchanges
ffmpeg.c
ffplay.c
libavcodec/alpha/dsputil_alpha.c
libavcodec/arm/dsputil_init_arm.c
libavcodec/arm/dsputil_init_armv6.c
libavcodec/arm/dsputil_init_neon.c
libavcodec/arm/dsputil_iwmmxt.c
libavcodec/arm/h264pred_init_arm.c
libavcodec/bfin/dsputil_bfin.c
libavcodec/dsputil.c
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
libavcodec/h264_loopfilter.c
libavcodec/h264_ps.c
libavcodec/h264_refs.c
libavcodec/h264dsp.c
libavcodec/h264idct.c
libavcodec/h264pred.c
libavcodec/mlib/dsputil_mlib.c
libavcodec/options.c
libavcodec/ppc/dsputil_altivec.c
libavcodec/ppc/dsputil_ppc.c
libavcodec/ppc/h264_altivec.c
libavcodec/ps2/dsputil_mmi.c
libavcodec/sh4/dsputil_align.c
libavcodec/sh4/dsputil_sh4.c
libavcodec/sparc/dsputil_vis.c
libavcodec/utils.c
libavcodec/version.h
libavcodec/x86/dsputil_mmx.c
libavformat/options.c
libavformat/utils.c
libavutil/pixfmt.h
libswscale/swscale.c
libswscale/swscale_internal.h
libswscale/swscale_template.c
tests/ref/seek/lavf_avi
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Storing a single element from a vector where all elements have the same
value does not require an aligned destination. Which element is stored
depends on the alignment of the destination address, but since they all
have the same value, the result is the same regardless of the alignment.
Originally committed as revision 19696 to svn://svn.ffmpeg.org/ffmpeg/trunk
The original problem was that FSF and Apple gcc used a different syntax
for vector declarations, i.e. {} vs. (). Nowadays Apple gcc versions support
the standard {} syntax and versions that support {} are available on all
relevant Mac OS X versions. Thus the greater compatibility is no longer
worth cluttering the code with macros.
Originally committed as revision 14366 to svn://svn.ffmpeg.org/ffmpeg/trunk
This includes indentation changes, comment reformatting, consistent brace
placement and some prettyprinting.
Originally committed as revision 14318 to svn://svn.ffmpeg.org/ffmpeg/trunk