Commit Graph

332 Commits

Author SHA1 Message Date
Diego Biurrun
383fd4d478 arm: Drop unnecessary ff_ name prefixes from static functions 2013-04-30 16:02:03 +02:00
Ronald S. Bultje
7384b7a713 arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:19:08 +03:00
Ronald S. Bultje
015821229f vp3: Use full transpose for all IDCTs
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.

Also remove the unused type == 0 cases from the plain C version
of the idct.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Ronald S. Bultje
62844c3fd6 h264: Integrate clear_blocks calls with IDCT
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Luca Barbato
a8b6015823 dsputil: convert remaining functions to use ptrdiff_t strides
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-03-12 18:26:42 +01:00
Diego Biurrun
c242bbd8b6 Remove unnecessary dsputil.h #includes 2013-02-26 00:51:34 +01:00
Diego Biurrun
76b19a3984 Fix a number of incorrect intmath.h #includes. 2013-02-26 00:51:34 +01:00
Diego Biurrun
3e85b46ecf arm: vp8: Add missing #includes for header to compile standalone 2013-02-20 14:24:07 +01:00
Diego Biurrun
82bd04b170 rv34: Drop now unnecessary dsputil dependencies 2013-02-06 11:30:54 +01:00
Diego Biurrun
79dad2a932 dsputil: Separate h264chroma 2013-02-06 11:30:53 +01:00
Diego Biurrun
c9f933b5b6 Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
Diego Biurrun
25841dfe80 Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
2013-02-05 12:59:12 +01:00
Diego Biurrun
6c1a7d07eb Use proper "" quotes for local header #includes 2013-02-01 12:51:15 +01:00
Martin Storsjö
2026eb1408 arm: vp8: Fix the plain-armv6 version of vp8_luma_dc_wht
This makes the plain-armv6 version use the same registers as the
armv6t2 version above.

This fixes fate-vp8 on plain-armv6 devices.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-27 13:17:25 +02:00
Diego Biurrun
33552a5f7b arm: Add mathops.h to ARCH_HEADERS list
It is an arch-specific header not suitable for standalone compilation.
2013-01-24 20:59:22 +01:00
Janne Grunau
8e148b8742 arm: h264qpel: use neon h264 qpel functions only if supported 2013-01-24 17:06:52 +01:00
Mans Rullgard
e9d817351b dsputil: Separate h264 qpel
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Ronald S. Bultje
baf35bb4bc dsputil: remove one array dimension from avg_no_rnd_pixels_tab. 2013-01-22 18:41:36 -08:00
Ronald S. Bultje
32ff643228 dsputil: remove avg_no_rnd_pixels8.
This is never used.
2013-01-22 18:41:36 -08:00
Diego Biurrun
88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Diego Biurrun
73b704ac60 arm: Add some missing header #includes 2013-01-22 21:24:10 +01:00
Ronald S. Bultje
d56668bd80 floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
5959bfaca3 floatdsp: move butterflies_float from dsputil to avfloatdsp.
This makes wmadec/enc, twinvq and mpegaudiodec (i.e. mp2/mp3)
independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Ronald S. Bultje
1768e43ceb vorbisdsp: change block_size type from int to intptr_t.
This saves one instruction in the x86-64 assembly.
2013-01-20 22:26:42 -08:00
Janne Grunau
68f18f0351 videodsp_armv5te: remove #if HAVE_ARMV5TE_EXTERNAL
libavutil/arm/asm.S sets '.arch' depending on HAVE_ARMV5TE so that
assembling armv5te code will always succeed even if the default -march
flag does not support it. HAVE_ARMV5TE_EXTERNAL tests assembling code
with the default arch.
Fixes the missing symbol ff_prefetch_arm with --cpu= not including
armv5te.

CC: libav-stable@libav.org
2013-01-20 15:20:00 +01:00
Ronald S. Bultje
fef906c77c Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
aeaf268e52 vp3: integrate clear_blocks with idct of previous block.
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).

Arm asm updated by Janne Grunau <janne-libav@jannau.net>.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2013-01-19 22:04:55 -08:00
Justin Ruggles
e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Luca Barbato
6906b19346 lavc: add missing files for arm
Across the many retouches those did not make the main commit.
2012-12-20 14:07:23 +01:00
Ronald S. Bultje
8c53d39e7f lavc: introduce VideoDSPContext
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Diego Biurrun
523c7bd23c misc typo, style and wording fixes 2012-12-18 13:36:51 +01:00
Mans Rullgard
b326755989 arm: rename ARMVFP config symbol to VFP
This is consistent with usual ARM nomenclature as well as with the
VFPV3 and NEON symbols which both lack the ARM prefix.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:04 +00:00
Mans Rullgard
a7831d509f arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation
These macros reflect the actual capabilities required here.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:03 +00:00
Mans Rullgard
92dad6687f arm: fix use of uninitialised value in ff_fft_fixed_init_arm()
When initialising an FFTContext for a plain FFT, mdct_bits is not set
and can contain a garbage value.  Since nbits is always valid and for
MDCT operation is mdct_bits - 2 checking this instead avoids using an
uninitialised value while having the same effect.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 13:11:57 +00:00
Justin Ruggles
284ea790d8 dsputil: move vector_fmul_scalar() to AVFloatDSPContext in libavutil 2012-11-26 11:29:06 -05:00
Ronald S. Bultje
95c89da36e Use ptrdiff_t instead of int for intra pred "stride" function parameter.
This way, SIMD-optimized functions don't have to sign-extend their
stride argument manually to be able to do pointer arithmetic.
2012-10-29 17:49:13 -07:00
Mans Rullgard
1846ddf0a7 ARM: fix overreads in neon h264 chroma mc
The loops were reading ahead one line, which could end up outside the
buffer for reference blocks at the edge of the picture.  Removing
this readahead has no measurable performance impact.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-20 01:28:38 +01:00
Diego Biurrun
9734b8ba56 Move avutil tables only used in libavcodec to libavcodec. 2012-10-11 18:29:36 +02:00
Jean-Baptiste Kempf
507dce2536 arm: call arm-specific rv34dsp init functions under if (ARCH_ARM)
Assign NEON specific function pointers after runtime check via
av_get_cpu_flags().

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-10-10 15:28:50 +02:00
Diego Biurrun
ac56ff9cc9 build: non-x86: Only compile mpegvideo optimizations when necessary 2012-10-09 14:45:59 +02:00
Mans Rullgard
5e826fd65e ARM: set Tag_ABI_align_preserved in all asm files
All our ARM asm preserves alignment so setting this attribute
in a common location is simpler.  This removes numerous warnings
when linking with armcc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-02 19:47:56 +01:00
Mans Rullgard
a27a690fac ARM: swap source operands in some add instructions
This allows using a 16-bit opcode when generating Thumb2 code.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-09-20 17:07:18 +01:00
Mans Rullgard
7689eea49a flacdsp: arm optimised lpc filter 2012-09-15 23:54:21 +01:00
Anton Khirnov
36ef5369ee Replace all CODEC_ID_* with AV_CODEC_ID_* 2012-08-07 16:00:24 +02:00
Mans Rullgard
e6cd698955 ARMv6: vp8: fix stack allocation with Apple's assembler
In the GNU assembler, a relational expression, bizarrely, has the
value -1 if true, whereas in Apple's it is +1.  This patch makes
sure the correct expression is used in both cases.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-04 00:59:14 +01:00
Mans Rullgard
9829a81bcd ARM: vp56: allow inline asm to build with clang
The clang integrated assembler does not support pre-UAL syntax,
while gcc requires pre-UAL syntax for ARM code.  A patch[1] for
clang to support the old syntax as well has been ignored since
January.

This patch chooses the syntax appropriate for each compiler,
allowing both to build the code.  Notably, this change allows
building for iphone with the latest Apple Xcode update.

[1] http://llvm.org/bugs/show_bug.cgi?id=11855

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-04 00:59:14 +01:00
Mans Rullgard
faa788227f ARM: use =const syntax instead of explicit literal pools
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-01 10:32:24 +01:00
Mans Rullgard
998170913c ARM: use standard syntax for all LDRD/STRD instructions
The standard syntax requires two destination registers for
LDRD/STRD instructions.  Some versions of the GNU assembler
allow using only one with the second implicit, others are
more strict.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-01 10:32:24 +01:00