80 Commits

Author SHA1 Message Date
Michael Niedermayer
581b5f0b9b Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e'
* commit 'e3fcb14347466095839c2a3c47ebecff02da891e':
  dsputil: Split off IDCT bits into their own context

Conflicts:
	configure
	libavcodec/aic.c
	libavcodec/arm/Makefile
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/asvdec.c
	libavcodec/dnxhdenc.c
	libavcodec/dsputil.c
	libavcodec/dvdec.c
	libavcodec/dxva2_mpeg2.c
	libavcodec/intrax8.c
	libavcodec/mdec.c
	libavcodec/mjpegdec.c
	libavcodec/mjpegenc_common.h
	libavcodec/mpegvideo.c
	libavcodec/ppc/dsputil_altivec.h
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/idctdsp.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/dsputil_mmx.c
	libavcodec/x86/dsputil_x86.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 15:22:11 +02:00
Diego Biurrun
e3fcb14347 dsputil: Split off IDCT bits into their own context 2014-06-30 07:58:46 -07:00
Michael Niedermayer
99497b4683 Merge commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2'
* commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2':
  dsputil: Split audio operations off into a separate context

Conflicts:
	configure
	libavcodec/takdec.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/dsputil_mmx.c
	libavcodec/x86/dsputil_x86.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 17:58:28 +02:00
Diego Biurrun
9a9e2f1c8a dsputil: Split audio operations off into a separate context 2014-06-22 06:20:15 -07:00
Michael Niedermayer
2b05db4f81 Merge commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9'
* commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9':
  dsputil: Split clear_block*/fill_block* off into a separate context

Conflicts:
	configure
	libavcodec/asvdec.c
	libavcodec/dnxhddec.c
	libavcodec/dnxhdenc.c
	libavcodec/dsputil.h
	libavcodec/eamad.c
	libavcodec/intrax8.c
	libavcodec/mjpegdec.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/vc1dec.c
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 04:54:38 +02:00
Diego Biurrun
e74433a8e6 dsputil: Split clear_block*/fill_block* off into a separate context 2014-06-18 14:07:23 -07:00
Christophe Gisquet
ccff45a0d3 apedsp: move to llauddsp
APE is not the sole codec using scalarproduct_and_madd_int16.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-05 20:31:59 +02:00
Michael Niedermayer
40f3a87c10 Merge commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3'
* commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3':
  dsputil: Move APE-specific bits into apedsp

Conflicts:
	libavcodec/arm/int_neon.S
	libavcodec/x86/dsputil.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 00:59:15 +02:00
Diego Biurrun
054013a0fc dsputil: Move APE-specific bits into apedsp 2014-05-29 06:41:15 -07:00
Ben Avison
9d8ecdd8ca vc-1: Add platform-specific start code search routine to VC1DSPContext.
Initialise VC1DSPContext for parser as well as for decoder.
Note, the VC-1 code doesn't actually use the function pointer yet.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 02:36:11 +02:00
Ben Avison
270cede3f3 h264: Move start code search functions into separate source files.
This permits re-use with parsers for codecs which use similar start codes.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 02:35:56 +02:00
Michael Niedermayer
fb61ed1e9f Merge commit 'ac4b32df71bd932838043a4838b86d11e169707f'
* commit 'ac4b32df71bd932838043a4838b86d11e169707f':
  On2 VP7 decoder

Conflicts:
	Changelog
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/arm/vp8dsp.h
	libavcodec/arm/vp8dsp_init_arm.c
	libavcodec/arm/vp8dsp_init_armv6.c
	libavcodec/arm/vp8dsp_init_neon.c
	libavcodec/avcodec.h
	libavcodec/h264pred.c
	libavcodec/version.h
	libavcodec/vp8.c
	libavcodec/vp8.h
	libavcodec/vp8data.h
	libavcodec/vp8dsp.c
	libavcodec/vp8dsp.h
	libavcodec/x86/h264_intrapred_init.c
	libavcodec/x86/vp8dsp_init.c

See: 89f2f5dbd7 and others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 14:46:10 +02:00
Peter Ross
ac4b32df71 On2 VP7 decoder
Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-04-04 04:00:11 +02:00
Michael Niedermayer
68014c6ed9 Merge commit 'c3a0b3eb64be441ca897629e8ecd80d5b51fded7'
* commit 'c3a0b3eb64be441ca897629e8ecd80d5b51fded7':
  arm: build: Maintain decoder objects separate from infrastructure objects

Conflicts:
	libavcodec/arm/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 20:10:51 +01:00
Diego Biurrun
c3a0b3eb64 arm: build: Maintain decoder objects separate from infrastructure objects 2014-03-27 03:00:05 -07:00
Michael Niedermayer
50b68e323c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output.

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 21:23:09 +01:00
Michael Niedermayer
f38af0143c Merge commit '15a29c39d9ef15b0783c04b3228e1c55f6701ee3'
* commit '15a29c39d9ef15b0783c04b3228e1c55f6701ee3':
  truehd: add hand-scheduled ARM asm version of mlp_filter_channel.

Conflicts:
	libavcodec/arm/Makefile
	libavcodec/arm/mlpdsp_init_arm.c

See: 87b128d5ef
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 20:39:10 +01:00
Ben Avison
87b128d5ef truehd: add hand-scheduled ARM asm version of mlp_filter_channel.
Profiling results for overall audio decode and the mlp_filter_channel(_arm)
function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     380.4  22.0     370.8  17.0    87.4%       +2.6%  (insignificant)
6:2 function  60.7   7.2      36.6   8.1     100.0%      +65.8%
8:2 total     357.0  17.5     343.2  19.0    97.8%       +4.0%  (insignificant)
8:2 function  60.3   8.8      37.3   3.8     100.0%      +61.8%
6:6 total     717.2  23.2     658.4  15.7    100.0%      +8.9%
6:6 function  140.4  12.9     81.5   9.2     100.0%      +72.4%
8:8 total     981.9  16.2     896.2  24.5    100.0%      +9.6%
8:8 function  193.4  15.0     103.3  11.5    100.0%      +87.2%

Experiments with adding preload instructions to this function yielded no
useful benefit, so these have not been included.

The assembly version has also been tested with a fuzz tester to ensure that
any combinations of inputs not exercised by my available test streams still
generate mathematically identical results to the C version.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 20:22:18 +01:00
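
The fuzz tester mentioned in the commit message above is not part of this log. As a rough illustration of the general idea only (comparing an optimized routine against a plain reference on random inputs), a minimal sketch could look like this. Both functions, the dot-product workload, and the input ranges are hypothetical stand-ins; they are not mlp_filter_channel and not the author's tester.

```python
import random


def dot_reference(a, b):
    # Straightforward loop; hypothetical stand-in for a C reference version.
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_optimized(a, b):
    # Four-way unrolled variant; hypothetical stand-in for an optimized version.
    total = 0
    n = len(a) - len(a) % 4
    for i in range(0, n, 4):
        total += (a[i] * b[i] + a[i + 1] * b[i + 1]
                  + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3])
    for i in range(n, len(a)):  # leftover elements
        total += a[i] * b[i]
    return total


def fuzz(iterations=1000, seed=0):
    # Feed both versions the same random inputs and report the first mismatch.
    rng = random.Random(seed)
    for _ in range(iterations):
        length = rng.randint(0, 64)
        a = [rng.randint(-2**15, 2**15 - 1) for _ in range(length)]
        b = [rng.randint(-2**15, 2**15 - 1) for _ in range(length)]
        if dot_reference(a, b) != dot_optimized(a, b):
            return a, b
    return None  # no mismatch found


print(fuzz())
```

A fixed seed keeps any failure reproducible, and returning the mismatching input makes it easy to turn into a regression test.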
Ben Avison
3b5946bcce truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output.
Profiling results for overall decode and the output_data function in
particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     339.6  15.1     329.3  16.0    95.8%       +3.1%  (insignificant)
6:2 function  24.6   6.0      9.9    3.1     100.0%      +148.5%
8:2 total     324.5  15.5     323.6  14.3    15.2%       +0.3%  (insignificant)
8:2 function  20.4   3.9      9.9    3.4     100.0%      +104.7%
6:6 total     572.8  20.6     539.9  24.2    100.0%      +6.1%
6:6 function  54.5   5.6      16.0   3.8     100.0%      +240.9%
8:8 total     741.5  21.2     702.5  18.5    100.0%      +5.6%
8:8 function  63.9   7.6      18.4   4.8     100.0%      +247.3%

The assembly version has also been tested with a fuzz tester to ensure that
any combinations of inputs not exercised by my available test streams still
generate mathematically identical results to the C version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-03-26 19:54:32 +02:00
Ben Avison
15a29c39d9 truehd: add hand-scheduled ARM asm version of mlp_filter_channel.
Profiling results for overall audio decode and the mlp_filter_channel(_arm)
function in particular are as follows:

              Before          After
              Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total     380.4  22.0     370.8  17.0    87.4%       +2.6%  (insignificant)
6:2 function  60.7   7.2      36.6   8.1     100.0%      +65.8%
8:2 total     357.0  17.5     343.2  19.0    97.8%       +4.0%  (insignificant)
8:2 function  60.3   8.8      37.3   3.8     100.0%      +61.8%
6:6 total     717.2  23.2     658.4  15.7    100.0%      +8.9%
6:6 function  140.4  12.9     81.5   9.2     100.0%      +72.4%
8:8 total     981.9  16.2     896.2  24.5    100.0%      +9.6%
8:8 function  193.4  15.0     103.3  11.5    100.0%      +87.2%

Experiments with adding preload instructions to this function yielded no
useful benefit, so these have not been included.

The assembly version has also been tested with a fuzz tester to ensure that
any combinations of inputs not exercised by my available test streams still
generate mathematically identical results to the C version.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-03-26 19:53:52 +02:00
Michael Niedermayer
011d83de48 Merge commit '0e083d7e43805db1a978cb57bfa25fda62e8ff18'
* commit '0e083d7e43805db1a978cb57bfa25fda62e8ff18':
  build: Group general components separate from de/encoders in arch Makefiles

Conflicts:
	libavcodec/arm/Makefile
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 22:26:31 +01:00
Diego Biurrun
0e083d7e43 build: Group general components separate from de/encoders in arch Makefiles
This is in line with how the top-level libavcodec Makefile is structured.
2014-03-20 05:03:23 -07:00
James Darnley
623f380a18 lavc: fix flac encoder and decoder dependencies
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-13 21:00:32 +01:00
Martin Storsjö
44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Mason Carter
832e190632 vc1: arm: Add NEON assembly
For:

ff_vc1_inv_trans_{8,4}x{8,4}_{dc_,}neon
ff_put_pixels8x8_neon
ff_put_vc1_mspel_mc{0,1,2,3}{0,1,2,3}_neon (except for 00)

Based on ARM assembly code in libavcodec/arm by Rob Clark and Mans
Rullgard.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 14:53:39 +02:00
Diego Biurrun
f0389eb777 arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
Diego Biurrun
8506ff97c9 vp56: Mark VP6-only optimizations as such.
Most of our VP56 optimizations are VP6-only and will stay that way.
So avoid compiling them for VP5-only builds.
2013-08-23 14:42:19 +02:00
Ben Avison
45e10e5c8d arm: Add assembly version of h264_find_start_code_candidate
               Before          After
               Mean   StdDev   Mean   StdDev  Change
This function   508.8 23.4      185.4  9.0    +174.4%
Overall        3068.5 31.7     2752.1 29.4     +11.5%

In combination with the preceding patch:
                Before          After
                Mean   StdDev   Mean   StdDev  Change
Overall         2925.6 26.2     2752.1 29.4     +6.3%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-08-08 12:08:34 +03:00
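
The commit messages do not say how the Change column in these profiling tables is computed. The figures are consistent with a relative speedup, (Before mean / After mean - 1) x 100, which the short check below reproduces for the h264_find_start_code_candidate numbers. The formula is an inference from the published figures, and the function name is made up for this sketch.

```python
def speedup_percent(before: float, after: float) -> float:
    """Relative speedup of `after` over `before`, in percent (assumed formula)."""
    return (before / after - 1.0) * 100.0


# Mean values copied from the h264_find_start_code_candidate table.
rows = {
    "This function": (508.8, 185.4),
    "Overall": (3068.5, 2752.1),
}

for name, (before, after) in rows.items():
    print(f"{name:<14} {speedup_percent(before, after):+.1f}%")
# prints +174.4% and +11.5%, matching the table
```

Read this way, +174.4% means the function takes about 2.74 times less time than before, not that it became slower.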
Martin Storsjö
8b9eba664e arm: Add VFP-accelerated version of fft16
               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1389.3  4.2       967.8  35.1   +43.6%
Overall        15577.5 83.2     15400.0 336.4    +1.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:41 +03:00
Martin Storsjö
ba6836c966 arm: Add VFP-accelerated version of dca_lfe_fir
               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function    868.2  33.5      436.0  27.0   +99.1%
Overall        15973.0 223.2    15577.5  83.2    +2.5%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:39 +03:00
Martin Storsjö
b63bb251ea arm: Add VFP-accelerated version of imdct_half
               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   2653.0  28.5     1108.8  51.4   +139.3%
Overall        17049.5 408.2    15973.0 223.2     +6.7%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:37 +03:00
Ben Avison
41ef1d360b arm: Add VFP-accelerated version of synth_filter_float
               Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   9295.0 114.9     4853.2 83.5    +91.5%
Overall        23699.8 397.6    19285.5 292.0   +22.9%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:17 +03:00
Martin Storsjö
86113667c0 arm: Include hpeldsp_neon.o if h264qpel is enabled
A few of the h264qpel neon functions are shared with other
hpeldsp functions in this file.

This fixes standalone compilation of the h264 decoder on arm.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:37 +03:00
Martin Storsjö
efb7968cfe arm: Don't unconditionally build dsputil files
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:35 +03:00
Martin Storsjö
36a7df8cf1 arm: Only build the FFT init files if FFT is enabled
This fixes build errors in cases where FFT is disabled.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:33 +03:00
Diego Biurrun
186599ffe0 build: cosmetics: Place unconditional before conditional OBJS lines
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:31 +03:00
Diego Biurrun
9b9b2e9f30 build: arm: cosmetics: Place all OBJS declarations in alphabetical order
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:27 +03:00
Ronald S. Bultje
7384b7a713 arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:19:08 +03:00
Diego Biurrun
79dad2a932 dsputil: Separate h264chroma 2013-02-06 11:30:53 +01:00
Diego Biurrun
33552a5f7b arm: Add mathops.h to ARCH_HEADERS list
It is an arch-specific header not suitable for standalone compilation.
2013-01-24 20:59:22 +01:00
Mans Rullgard
e9d817351b dsputil: Separate h264 qpel
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
fef906c77c Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
8c53d39e7f lavc: introduce VideoDSPContext
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Mans Rullgard
b326755989 arm: rename ARMVFP config symbol to VFP
This is consistent with usual ARM nomenclature as well as with the
VFPV3 and NEON symbols which both lack the ARM prefix.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:04 +00:00
Jean-Baptiste Kempf
507dce2536 arm: call arm-specific rv34dsp init functions under if (ARCH_ARM)
Assign NEON specific function pointers after runtime check via
av_get_cpu_flags().

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-10-10 15:28:50 +02:00
Diego Biurrun
ac56ff9cc9 build: non-x86: Only compile mpegvideo optimizations when necessary 2012-10-09 14:45:59 +02:00
Mans Rullgard
7689eea49a flacdsp: arm optimised lpc filter 2012-09-15 23:54:21 +01:00
Mans Rullgard
28f9ab7029 vp3: move idct and loop filter pointers to new vp3dsp context
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context.  There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:19 +01:00
Mans Rullgard
ab9f987661 build: add CONFIG_VP3DSP, reduce repetition in OBJS lists
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-18 10:32:18 +01:00