Commit Graph

15 Commits

Author SHA1 Message Date
Christophe Gisquet
9ae8e23188 dcadsp: scan coefficients linearly instead.
This change is inspired by x86 asm, where this frees a register.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-15 20:08:08 +01:00
Christophe Gisquet
45854df9a5 dcadsp: split lfe_dir cases
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 02:04:12 +01:00
Michael Niedermayer
82ae8a44e6 Merge commit '5b59a9fc6152169599561f04b4f66370edda5c9c'
* commit '5b59a9fc6152169599561f04b4f66370edda5c9c':
  x86: dcadsp: implement int8x8_fmul_int32

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 01:20:33 +01:00
Christophe Gisquet
481a46a462 dcadsp: add int8x8_fmul_int32 to DSP context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:55:42 +01:00
Christophe Gisquet
5b59a9fc61 x86: dcadsp: implement int8x8_fmul_int32
For the callable function (as opposed to the inline one):
         C  SSE  SSE2  SSE4
Win32:  47   42   29    26
Win64:  30   33   25    23
The SSE version is neither compiled nor set for ARCH_X86_64, as the
inlinable function takes over.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:52:40 +01:00
Christophe Gisquet
2bd44cb705 dcadsp: add int8x8_fmul_int32 to dsp context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:51:59 +01:00
Michael Niedermayer
366e415836 Merge commit '800ffab48a7844dd5dc0a33b8f6b8e5ed718cf2e'
* commit '800ffab48a7844dd5dc0a33b8f6b8e5ed718cf2e':
  dcadsp: Add a new method, qmf_32_subbands

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 12:10:59 +02:00
Ben Avison
800ffab48a dcadsp: Add a new method, qmf_32_subbands
This does most of the work formerly carried out by
the static function qmf_32_subbands() in dcadec.c.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:42 +03:00
Michael Niedermayer
0aa095483d Merge commit '6fee1b90ce3bf4fbdfde7016e0890057c9000487'
* commit '6fee1b90ce3bf4fbdfde7016e0890057c9000487':
  avcodec: Add av_cold attributes to init functions missing them

Conflicts:
	libavcodec/aacpsy.c
	libavcodec/atrac3.c
	libavcodec/dvdsubdec.c
	libavcodec/ffv1.c
	libavcodec/ffv1enc.c
	libavcodec/h261enc.c
	libavcodec/h264_parser.c
	libavcodec/h264dsp.c
	libavcodec/h264pred.c
	libavcodec/libschroedingerenc.c
	libavcodec/libxvid_rc.c
	libavcodec/mpeg12.c
	libavcodec/mpeg12enc.c
	libavcodec/proresdsp.c
	libavcodec/rangecoder.c
	libavcodec/videodsp.c
	libavcodec/x86/proresdsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 11:34:29 +02:00
Diego Biurrun
6fee1b90ce avcodec: Add av_cold attributes to init functions missing them 2013-05-04 21:09:45 +02:00
Justin Ruggles
a8ae4e0e7b Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 80ba1ddb58)
2011-02-02 03:40:48 +01:00
Måns Rullgård
08255107cf DCA: ARM/NEON optimised lfe_fir
Originally committed as revision 22863 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-12 20:45:33 +00:00
Måns Rullgård
309d16a4a0 DCA: break out lfe_interpolation_fir() inner loops to a function
This enables SIMD optimisations of this function.

Originally committed as revision 22861 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-12 20:45:25 +00:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Justin Ruggles
80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00