Commit Graph

388 Commits

Author SHA1 Message Date
Diego Biurrun
a650c906cb ppc: Only compile AltiVec FFT assembly when AltiVec is enabled 2013-05-02 10:25:30 +02:00
Diego Biurrun
7f75f2f2bd ppc: Drop unnecessary ff_ name prefixes from static functions 2013-04-30 16:10:06 +02:00
Diego Biurrun
38282149b6 ppc: More consistent arch initialization 2013-04-30 12:19:45 +02:00
Diego Biurrun
a053dbfcfb ppc: Move AltiVec utility headers out of AltiVec ifdefs
Now that the headers themselves have ifdef protection this is no
longer necessary and more consistent with normal include handling.
2013-04-30 12:19:44 +02:00
Diego Biurrun
6b110d3a73 ppc: More consistent names for H.264 optimizations files 2013-04-30 12:19:43 +02:00
Diego Biurrun
643e433bf7 mpegaudiosp: More consistent names for ppc/x86 optimization files 2013-04-30 12:19:43 +02:00
Martin Storsjö
6d0fbebf94 ppc: hpeldsp: Include attributes.h
This fixes building in configurations where altivec is disabled.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-20 16:43:01 +03:00
Ronald S. Bultje
47e5a98174 ppc: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:18:59 +03:00
Ronald S. Bultje
015821229f vp3: Use full transpose for all IDCTs
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.

Also remove the unused type == 0 cases from the plain C version
of the idct.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Ronald S. Bultje
62844c3fd6 h264: Integrate clear_blocks calls with IDCT
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Luca Barbato
a8b6015823 dsputil: convert remaining functions to use ptrdiff_t strides
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-03-12 18:26:42 +01:00
Diego Biurrun
c242bbd8b6 Remove unnecessary dsputil.h #includes 2013-02-26 00:51:34 +01:00
Diego Biurrun
218aefce44 dsputil: Move LOCAL_ALIGNED macros to libavutil 2013-02-08 23:13:37 +01:00
Diego Biurrun
79dad2a932 dsputil: Separate h264chroma 2013-02-06 11:30:53 +01:00
Diego Biurrun
c9f933b5b6 Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
Diego Biurrun
25841dfe80 Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
2013-02-05 12:59:12 +01:00
Diego Biurrun
4eef2ed707 ppc: fmtconvert: Drop two unused variables. 2013-02-01 12:51:13 +01:00
Mans Rullgard
e9d817351b dsputil: Separate h264 qpel
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Diego Biurrun
88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Ronald S. Bultje
1768e43ceb vorbisdsp: change block_size type from int to intptr_t.
This saves one instruction in the x86-64 assembly.
2013-01-20 22:26:42 -08:00
Diego Biurrun
d9bf716945 ppc: vorbisdsp: Drop some unnecessary #includes
Also fixes compilation with AltiVec disabled.
2013-01-20 17:38:11 +01:00
Martin Storsjö
d160a2fb4c ppc: Include string.h for memset
This fixes build failures on ppc machines with a compiler that
supports -Werror=implicit-function-declaration.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-20 18:10:21 +02:00
Ronald S. Bultje
fef906c77c Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
aeaf268e52 vp3: integrate clear_blocks with idct of previous block.
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).

Arm asm updated by Janne Grunau <janne-libav@jannau.net>.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2013-01-19 22:04:55 -08:00
Justin Ruggles
e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Ronald S. Bultje
8c53d39e7f lavc: introduce VideoDSPContext
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Mans Rullgard
a384f6a7f7 ppc: replace pointer casting with AV_COPY32
This removes warnings about strict aliasing violations.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-11-12 10:31:31 +00:00
Mans Rullgard
031aac9861 ppc: fix some unused variable warnings
The third argument of OP_U8_ALTIVEC is evaluated at most once so
there is no need for a potentially unused temporary variable.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-11-12 10:31:31 +00:00
Diego Biurrun
ac56ff9cc9 build: non-x86: Only compile mpegvideo optimizations when necessary 2012-10-09 14:45:59 +02:00
Mans Rullgard
f79364b2c3 ppc: fix Altivec build with old compilers
The vec_splat() intrinsic requires a constant argument for the
element number, and the code relies on the compiler unrolling
the loop to provide this.  Manually unrolling the loop avoids
this reliance and works with all compilers.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-08 23:14:51 +01:00
Mans Rullgard
642b4efaf7 ppc: fmtconvert: kill VLA in float_to_int16_interleave_altivec()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-05 22:33:32 +01:00
Martin Storsjö
33e112847d Add more missing includes after removing the implicit common.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-16 10:49:54 +03:00
Martin Storsjö
70766c2182 Add some more missing includes after removing the implicit common.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-15 23:48:48 +03:00
Martin Storsjö
1d9c2dc89a Don't include common.h from avutil.h
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-08-15 22:32:06 +03:00
Justin Ruggles
a35738f424 dsputil: ppc: cosmetics: pretty-print 2012-07-22 17:38:55 -04:00
Mans Rullgard
ffdd93a25e ppc: fix build with altivec disabled
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 13:34:42 +01:00
Mans Rullgard
28f9ab7029 vp3: move idct and loop filter pointers to new vp3dsp context
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context.  There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:19 +01:00
Diego Biurrun
af10feadc2 ppc: Rename H.264 optimization template file for consistency. 2012-06-12 23:20:05 +02:00
Justin Ruggles
d5a7229ba4 Add a float DSP framework to libavutil
Move vector_fmul() from DSPContext to AVFloatDSPContext.
2012-06-08 13:14:38 -04:00
Justin Ruggles
98db4e2a4e PPC: Move types_altivec.h and util_altivec.h from libavcodec to libavutil
This will allow for easier implementation of Altivec functions in libraries
other than libavcodec.
2012-06-08 13:14:38 -04:00
Diego Biurrun
3ea5429489 ppc: Drop unused header regs.h 2012-05-22 11:54:53 +02:00
Mans Rullgard
c81d1e2390 ppc: add const where needed in scalarproduct_int16_altivec()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-05-01 00:21:30 +01:00
Mans Rullgard
ce82dad7eb ppc: remove shift parameter from scalarproduct_int16_altivec()
The shift parameter was removed from this interface in 7e1ce6a.
This updates the Altivec implementation to match.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-05-01 00:21:30 +01:00
Mans Rullgard
4c387c7070 ppc: dsputil: do unaligned block accesses correctly
To load unaligned vector data in the usual way, explicit vec_ld()
should be used rather than dereferencing a pointer to a vector type.
When the VSX extension is enabled, gcc may compile vector pointer
dereferences using the VSX lxvw4x instruction instead of the lvx
instruction typically used with Altivec/VMX.  As the behaviour of
these instructions with unaligned addresses differs, it is important
that only lvx is used here.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-05-01 00:21:30 +01:00
Mans Rullgard
2bcbd98459 Remove lowres video decoding
This feature is complex, of questionable utility, and slows down
normal decoding.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-04-21 18:56:19 +01:00
Diego Biurrun
0f53601ac6 ppc: drop unused function dct_quantize_altivec()
This also allows dropping some PPC-specific ugliness from dsputil.[ch].
2012-04-18 18:53:54 +02:00
Diego Biurrun
7bb3a302fe build: Consistently handle conditional compilation for all optimization OBJS. 2012-04-12 09:00:49 +02:00
Diego Biurrun
02c39f056a ppc: Add/remove a number of const qualifiers to fix related warnings. 2012-04-09 20:39:33 +02:00