The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
This allows us to remove FF_IDCT_WMV2, which serves no practical purpose
other than to be able to select the WMV2 IDCT for MPEG (or vice versa)
and get corrupt output.
Fate tests for all wmv2-related tests change, because (for some obscure
reason) they forced use of the MPEG IDCT. You would get the same changes
previously by not using -idct simple in the fate test (or replacing it
with -idct auto).
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This changes the LOCAL_ALIGNED definition on systems where
DECLARE_ALIGNED is used so it matches the manual alignment
case, ensuring invalid use will not compile on x86 only to
fail on everything else.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This reverts commit 484a337cd7.
These functions were used in f8bed30 "VC1: merge idct8x8, coeff
adjustments and put_pixels" which was reverted in 18b6a69.
Signed-off-by: Mans Rullgard <mans@mansr.com>
These decoders use a special non-MPEG2 IDCT. Call it directly
instead of going through dsputil. There is never any reason
to use a regular IDCT with these decoders or to use the EA IDCT
with other codecs.
This also fixes the bizarre situation of eamad and eatqi decoding
incorrectly if eatgq is disabled.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context. There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.
Signed-off-by: Mans Rullgard <mans@mansr.com>
There is only one caller, which does not need the shifting. Other use cases
are situations where different roundings would be needed.
The x86 and neon versions are modified accordingly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This fixes crashes in e.g. PNG decoding with SSE2 enabled. In fact, many
x86 optimizations for codecs assume that our buffer strides are 16-byte
aligned.
High bitdepth H.264 needs 32-bit transform coefficients, whereas
dnxhd does not. This creates a conflict with the templated
functions operating on DCTELEM data. This patch adds a field
allowing the caller to choose the element size in dsputil_init()
and adds the required functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Use of these has been broken ever since the h264 idct was changed
to always use transposed inputs. Furthermore, they were only
ever used if some *other* non-default idct was requested.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This macro can cause problems in conjunction with the bitdepth
template expansion. It was presumably added to keep source
compatibility when high bitdepth support was added. However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
There are several places where a buffer is byte-swapped in 16-bit units.
This allows them to share code which can be optimised for various
architectures.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.
Signed-off-by: Mans Rullgard <mans@mansr.com>
C99 variadic macros require more arguments than there are named
parameters in the definition. This means we must use an extra
indirection to avoid having two different macros for arrays with
one resp more than one dimension.
Signed-off-by: Mans Rullgard <mans@mansr.com>