142 Commits

Author SHA1 Message Date
Michael Niedermayer
1253091d6f Merge commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc'
* commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc':
  libavutil: Add ARM av_clip_intp2_arm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-21 11:15:32 +01:00
Peter Meerwald
76ce9bd8e2 libavutil: Add ARM av_clip_intp2_arm
add ARM code for implementing av_clip_intp2 using the ssat instruction

on Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and
the generic av_clip(), about -19%

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-02-21 00:54:40 +01:00
Michael Niedermayer
16e65419ed Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46'
* commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46':
  arm: Use .data.rel.ro for const data with relocations

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-12-09 11:58:13 +01:00
Martin Storsjö
f963f80399 arm: Use .data.rel.ro for const data with relocations
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-12-09 11:43:25 +02:00
jessejiang
29d208d5d4 avutil/arm/float_dsp_init_vfp: replace restrict by av_restrict
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-11-20 11:17:42 +01:00
Michael Niedermayer
1e519b9d40 avutil: turn arm setend into a cpuflag
this allows disabling and enabling it
it also prevents crashes if vfpv3 and neon are disabled which previously
would have enabled the flag

And last but not least one can enable setend on cpus like cortex-a8 where
its fast but disabled by default

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 14:50:15 +02:00
Michael Niedermayer
7cdb3b2b79 Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c'
* commit '6869612f5c7d4d2f20f69a5658328a761deadb1c':
  arm: Macroize the test for 'setend' CPU instruction support

Conflicts:
	libavcodec/arm/h264dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 12:46:13 +02:00
Ben Avison
6869612f5c arm: Macroize the test for 'setend' CPU instruction support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-07-21 15:08:01 -07:00
Ben Avison
5a272190a0 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:38 +03:00
Ben Avison
5edad2c4a1 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-07-18 01:34:31 +03:00
Ben Avison
57641410d1 armv6: Accelerate butterflies_float
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the
same sample AAC stream:

                   Before          After
                   Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode       1542.8 43.7     1470.5 41.5    100.0%      +4.9%
butterflies_float  130.0  11.9     70.2   12.1    100.0%      +85.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:38:02 +02:00
Ben Avison
649c666137 armv6: Accelerate vector_fmul_window
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the
same sample AAC stream:

                    Before          After
                    Mean   StdDev   Mean   StdDev  Confidence  Change
Audio decode        1598.2 47.4     1529.2 25.4    100.0%      +4.5%
vector_fmul_window  244.0  22.1     188.9  22.3    100.0%      +29.2%

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-16 21:37:41 +02:00
Michael Niedermayer
01983e50c0 Merge commit '7b0c7c9163fe3dd0081696befde28617119d2590'
* commit '7b0c7c9163fe3dd0081696befde28617119d2590':
  arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 21:31:18 +02:00
Martin Storsjö
7b0c7c9163 arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel
When running on a 64 bit kernel, /proc/cpuinfo lists different
optional features than on 32 bit kernels (because some of them
are mandatory in the 64 bit implemenations).

The kernel does list the old features properly if they are queried
via /proc/self/auxv though - however this file is not always readable
(e.g. on most android systems). The getauxval function could also
provide the same info as /proc/self/auxv even if this file isn't
readable, but this function is not always available (and thus would
need to be loaded with dlsym for compatibility with older android
versions).

The android cpufeatures library does this slightly differently,
by assuming that these are available if the "CPU architecture"
line is >= 8, see [1] for details.

It has been suggested to include the old, non-optional features in
/proc/cpuinfo as well, but that suggested patch never was merged.
See [2] for the discussion around this suggestion.

[1] https://android-review.googlesource.com/91380
[2] http://marc.info/?l=linux-arm-kernel&m=139087240101974

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-06-28 22:16:59 +03:00
Michael Niedermayer
a40c338a00 Merge commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412'
* commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412':
  build: check if AS supports the '.func' directive

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 12:45:35 +02:00
Janne Grunau
d5a5598198 build: check if AS supports the '.func' directive
Not supported by Clang's integrated assembler. Since it just adds
debug information it can safely omitted.
2014-06-03 14:23:03 +02:00
Michael Niedermayer
1c788eaca9 Merge commit '831a1180785a786272cdcefb71566a770bfb879e'
* commit '831a1180785a786272cdcefb71566a770bfb879e':
  Update dsputil- and SIMD-related comments to match reality more closely

Conflicts:
	libavcodec/x86/hpeldsp.asm
	libavutil/arm/float_dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 23:59:56 +01:00
Diego Biurrun
831a118078 Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
Michael Niedermayer
a74bab7079 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: hpeldsp: prevent overreads in armv6 asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 21:35:30 +01:00
Janne Grunau
cbddee1cca arm: hpeldsp: prevent overreads in armv6 asm
Based on a patch by Russel King <rmk+libav@arm.linux.org.uk>

Bug-Id: 646
CC: libav-stable@libav.org
2014-03-05 14:30:57 +01:00
Michael Niedermayer
53d11f7b2d Merge commit '543156d7518f5e5d731123da066d86278f9fa492'
* commit '543156d7518f5e5d731123da066d86278f9fa492':
  arm: Mark the stack as non-executable

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-19 14:07:26 +01:00
Martin Storsjö
543156d751 arm: Mark the stack as non-executable
If linking in an object file without this attribute set, the
linker will assume that an executable stack might be needed.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-19 09:57:19 +02:00
Michael Niedermayer
a7574a36af Merge commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635'
* commit 'e3fec3f095ab5ea08ee662942d98526aaf5e3635':
  arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:49:28 +01:00
Martin Storsjö
e3fec3f095 arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols
This makes the generated assembly more internally consistent,
avoiding declaring two labels for the same function (for cases
where EXTERN_ASM is empty) and not declaring a separate unprefixed
label in other cases.

This also makes sure the .func and .type delcarations have the same
prefix. They have previously not been used on the platforms
that have prefixed symbols on arm (iOS), but gas-preprocessor
has recently started using the .func declarations for adding
.thumb_func declarations for such functions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-07 15:14:06 +02:00
Michael Niedermayer
9d5cc55f0f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add an option for making sure NEON registers aren't clobbered

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-11 03:08:10 +01:00
Martin Storsjö
44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Michael Niedermayer
edba54630b Merge commit '5dae4872357613a0b51120b54a4c5221e0ec3f69'
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
  arm: Allow overriding the alignment set in the function macro

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:36:56 +01:00
Martin Storsjö
5dae487235 arm: Allow overriding the alignment set in the function macro
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).

The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.

This restores the original intention, to align the loop entry
points.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:56 +02:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer
946f080b54 Merge commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613'
* commit '7ffda66fd5c81af4725bff7c2c4f207ba2aa0613':
  arm: float_dsp: Propagate cpu_flags to vfp initialization function

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:05:04 +02:00
Michael Niedermayer
2a60666d1d Merge commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e'
* commit '8410d6e93c2e074881f1c7b7e4cdefd2e497d52e':
  avutil: Refactor CPU extension availability macros

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:15:10 +02:00
Michael Niedermayer
c83d794936 Merge commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b'
* commit 'b78b10c4b78b696927f2801cf2d9f193b4eff28b':
  avutil: Move internal CPU detection function declarations to private header

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 14:05:15 +02:00
Diego Biurrun
7ffda66fd5 arm: float_dsp: Propagate cpu_flags to vfp initialization function 2013-08-29 11:24:14 +02:00
Diego Biurrun
8410d6e93c avutil: Refactor CPU extension availability macros 2013-08-28 23:54:14 +02:00
Diego Biurrun
b78b10c4b7 avutil: Move internal CPU detection function declarations to private header 2013-08-28 23:54:14 +02:00
Michael Niedermayer
c88503e3f6 Merge commit '439902e0d68a0f0d800c21b5e6b598d5fa0c51da'
* commit '439902e0d68a0f0d800c21b5e6b598d5fa0c51da':
  Employ consistent LIBAV_COMPAT_ multiple inclusion guards in compat/

Conflicts:
	compat/aix/math.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-19 10:56:10 +02:00
Diego Biurrun
439902e0d6 Employ consistent LIBAV_COMPAT_ multiple inclusion guards in compat/
Also fix a comment and an #endif comment.
2013-07-18 18:12:38 +02:00
Michael Niedermayer
b7c6d1ed90 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Only output eabi attributes if building for ELF
  fix scalarproduct_and_madd_int16_altivec() for orders > 16

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-27 08:55:24 +02:00
Martin Storsjö
be7952b5c3 arm: Only output eabi attributes if building for ELF
This matches the other eabi attribute in the same file. This is
required in order to build for arm/hardfloat with other object
file formats than ELF.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-27 00:55:33 +03:00
Michael Niedermayer
3c200aa693 Merge commit '1fda184a85178cfd7b98d9e308d18e1ded76a511'
* commit '1fda184a85178cfd7b98d9e308d18e1ded76a511':
  avutil: Add av_cold attributes to init functions missing them

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 12:53:50 +02:00
Diego Biurrun
1fda184a85 avutil: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Michael Niedermayer
3ccda2b02b Merge commit '375ef6528c9dd2db7f9881e232cb0ec3aa16970d'
* commit '375ef6528c9dd2db7f9881e232cb0ec3aa16970d':
  libfdk-aacenc: Actually check for upper bounds of cutoff
  arm: Fall back to runtime cpu feature detection via /proc/cpuinfo

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-12 12:41:09 +01:00
Martin Storsjö
ab8f1a6989 arm: Fall back to runtime cpu feature detection via /proc/cpuinfo
On recent android versions, /proc/self/auxw is unreadable
(unless the process is running running under the shell uid or
in debuggable mode, which makes it hard to notice). See
http://b.android.com/43055 and
https://android-review.googlesource.com/51271 for more information
about the issue.

This makes sure e.g. neon optimizations are enabled at runtime in
android apps even when built in release mode, if configured to
use the runtime detection.

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-11 17:15:15 +02:00
Michael Niedermayer
8102f27b5b Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9'
* commit '73b704ac609d83e0be124589f24efd9b94947cf9':
  arm: Add some missing header #includes
  floatdsp: move scalarproduct_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/acelp_pitch_delay.c
	libavcodec/amrnbdec.c
	libavcodec/amrwbdec.c
	libavcodec/ra288.c
	libavcodec/x86/dsputil_mmx.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:31:55 +01:00
Michael Niedermayer
24604ebaf8 Merge commit '5959bfaca396ecaf63a8123055f499688b79cae3'
* commit '5959bfaca396ecaf63a8123055f499688b79cae3':
  floatdsp: move butterflies_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/imc.c
	libavcodec/mpegaudiodec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:13:54 +01:00
Michael Niedermayer
6e6e170898 Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4'
* commit '42d324694883cdf1fff1612ac70fa403692a1ad4':
  floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.

Conflicts:
	libavcodec/arm/dsputil_init_vfp.c
	libavcodec/arm/dsputil_vfp.S
	libavcodec/dsputil.c
	libavcodec/ppc/float_altivec.c
	libavcodec/x86/dsputil.asm
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:04:50 +01:00
Michael Niedermayer
b1b870fbd7 Merge commit '55aa03b9f8f11ebb7535424cc0e5635558590f49'
* commit '55aa03b9f8f11ebb7535424cc0e5635558590f49':
  floatdsp: move vector_fmul_add from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/x86/dsputil.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 13:54:34 +01:00
Ronald S. Bultje
5959bfaca3 floatdsp: move butterflies_float from dsputil to avfloatdsp.
This makes wmadec/enc, twinvq and mpegaudiodec (i.e. mp2/mp3)
independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00