Commit Graph

310 Commits

Author SHA1 Message Date
Oskar Arvidsson
8dbe585641 Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Justin Ruggles
539244eeb6 cosmetics: rename ff_fmt_convert_init_ppc() to ff_fmt_convert_init_altivec().
It only has Altivec functions and is not compiled if Altivec is disabled.
(cherry picked from commit d21be5f15b)
2011-03-08 02:09:31 +01:00
Mans Rullgard
644b66cd4a vp8: ppc: fix invalid reads in altivec epel mc
The 4-tap filters should only access one row/column before the
reference block.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit e0e46cae37)
2011-02-22 02:44:39 +01:00
Mans Rullgard
e407f4173a ppc: fix vc1 inverse transform, unbreak build
GCC 4.3 and later are more particular about signedness matching
in vector operations.  The operations under if(rangered) were
missing assignments and thus had no effect.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 381efba0ec)
2011-02-22 02:44:39 +01:00
Ronald S. Bultje
6a786b15c3 VC1: merge idct8x8, coeff adjustments and put_pixels.
Merging these functions allows merging some loops, which makes the
results (particularly after SIMD optimizations) much faster.
(cherry picked from commit f8bed30d8b)
2011-02-22 02:44:36 +01:00
Ronald S. Bultje
56cbc5f19f Fix PPC build.
(cherry picked from commit ed040f35f2)
2011-02-18 19:52:41 +01:00
Ronald S. Bultje
9a1ced321b dsputil: move VC1-specific stuff into VC1DSPContext.
(cherry picked from commit 12802ec060)
2011-02-18 19:52:40 +01:00
Ronald S. Bultje
2739dc5d85 VC1: transpose IDCT 8x8 coeffs while reading.
(cherry picked from commit 1da6ea3954)
2011-02-18 19:52:38 +01:00
Justin Ruggles
fe2ff6d247 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit c73d99e672)
2011-02-04 03:08:09 +01:00
Justin Ruggles
a8ae4e0e7b Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 80ba1ddb58)
2011-02-02 03:40:48 +01:00
Vitor Sessak
bc0a603c78 Fix overread in altivec DSP function sad16
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 3af1fe829e)
2011-01-30 03:41:47 +01:00
Justin Ruggles
015f9f1ad3 Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 6eabb0d3ad)
2011-01-23 19:32:08 +01:00
Janne Grunau
2c3589bfda consolidate .gitignore patters into a single file
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-18 21:32:05 +01:00
Janne Grunau
348b8218f7 convert svn:ignore properties to .gitignore files
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-17 15:50:14 +01:00
Stefano Sabatini
c6c98d0897 Move mm_support() from libavcodec to libavutil, make it a public
function and rename it to av_get_cpu_flags().

Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 15:07:14 +00:00
Stefano Sabatini
ccf22d3ed1 Merge has_altivec() function into mm_support(), remove it and use
mm_support() instead.

Reduce complexity and simplify pending move to libavutil.

Originally committed as revision 25074 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 10:02:40 +00:00
Stefano Sabatini
7160bb716b Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_
symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h.

Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-04 09:59:08 +00:00
Måns Rullgård
c0ec9918b0 Remove global mm_flags variable
Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-24 17:47:05 +00:00
Loren Merritt
1ee076b1b1 more credits to D. J. Bernstein for fft
Originally committed as revision 24308 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-18 20:06:42 +00:00
Måns Rullgård
a46b84d120 PPC: convert Altivec FFT to pure assembler
On PPC a leaf function has a 288-byte red zone below the stack pointer,
sparing these functions the chore of setting up a full stack frame.

When a function call is disguised within an inline asm block, the
compiler might not adjust the stack pointer as required before a
function call, resulting in the red zone being clobbered.

Moving the entire function to pure asm avoids this problem and also
results in somewhat better code.

Originally committed as revision 24044 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:47 +00:00
Måns Rullgård
deca86eab1 PPC: gas-preprocessor handles m[ft]spr shorthands
Originally committed as revision 24043 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:43 +00:00
Måns Rullgård
fe3d2e4b02 PPC: add some asm support macros
Originally committed as revision 24042 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:40 +00:00
Måns Rullgård
a075902f3d PPC: add _interleave versions of fft{4,6,16}_altivec
This removes the need for a post-swizzle with the small FFTs.

Originally committed as revision 24025 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-03 18:36:10 +00:00
Måns Rullgård
9bbb50648d PPC: fix build on OSX without gas-preprocessor
Originally committed as revision 23962 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-02 00:20:26 +00:00
Loren Merritt
cf61994a17 PPC: Altivec IMDCT
Patch by Loren Merritt

Originally committed as revision 23959 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:49 +00:00
Måns Rullgård
588d28ac08 Remove vestiges of radix-2 FFT
Patch (mostly) by Loren Merritt

Originally committed as revision 23957 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:42 +00:00
Måns Rullgård
bf7ba15372 PPC: Altivec split-radix FFT
1.8x faster than altivec radix-2 on a G4
8% faster vorbis decoding

Patch (mostly) by Loren Merritt

Originally committed as revision 23956 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:39 +00:00
Måns Rullgård
2f0c136e1f Check whether IBM or Apple PPC assembler syntax is used
This checks which assembler syntax is supported and defines macros
for register names accordingly.

Originally committed as revision 23952 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:27 +00:00
Vitor Sessak
060dd93000 Altivec-optimized mp{1,2,3} windowing
Originally committed as revision 23943 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 21:04:12 +00:00
Måns Rullgård
49bd8e4b84 Fix grammar errors in documentation
Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-30 15:38:06 +00:00
David Conrad
982fac7357 Altivec VP8 MC functions
Originally committed as revision 23884 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 06:42:17 +00:00
David Conrad
7bf4e9d7f7 Altivec: Add helper function to load from a constant misalignment
Originally committed as revision 23883 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 06:42:12 +00:00
Eli Friedman
b3858964d6 Add const to some pointer parameters.
Patch by Eli Friedman,  eli D friedman A gmail

Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 15:11:38 +00:00
Måns Rullgård
2829ce4b40 Remove PPC perf counter support
This functionality is better accessed through tools like oprofile.

Originally committed as revision 23808 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-26 22:23:35 +00:00
Diego Biurrun
ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Måns Rullgård
3bd74e9243 Simplify arch-specific object file lists
Originally committed as revision 22570 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 21:23:03 +00:00
Måns Rullgård
43f60eba19 Move arch-specific makefile parts into $arch/Makefile
Originally committed as revision 22569 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 21:22:59 +00:00
Måns Rullgård
4693b031a3 Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext.  The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by
non-h264 code.

Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 01:17:00 +00:00
Martin Storsjö
b81786611a Move the local includes below the system includes
This fixes a compilation issue on OS X 10.4, where some system headers were
included implicitly through dsputil_altivec.h (with _POSIX_C_SOURCE defined),
and other system headers included later, with _POSIX_C_SOURCE undefined at
that time.

Originally committed as revision 22327 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-08 15:12:36 +00:00
Måns Rullgård
ddb8c2c0f1 PPC: move prototypes to headers and make some functions static
Originally committed as revision 22267 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 22:37:14 +00:00
Måns Rullgård
1429224b04 Move FFT parts from dsputil.h to fft.h
Originally committed as revision 22235 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:34:46 +00:00
Måns Rullgård
84dc2d8afa Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant.  All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:24:59 +00:00
Måns Rullgård
c67278098d Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 03:25:11 +00:00
Loren Merritt
b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
Måns Rullgård
35de5d2412 cosmetics: fix indentation after previous commit
Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:52:00 +00:00
Måns Rullgård
952e872198 Drop unused args from vector_fmul_add_add, simpify code, and rename
The src3 and step arguments to vector_fmul_add_add() are always zero
and one, respectively.  This removes these arguments from the function,
simplifies the code accordingly, and renames the function to better
match the new operation.

Originally committed as revision 20061 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:51:54 +00:00
Måns Rullgård
f486321395 Move per-arch fft init bits into the corresponding subdirs
Originally committed as revision 19864 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-15 21:14:14 +00:00
Måns Rullgård
afe08a728a PPC: remove unnecessary alignment on local variables
Storing a single element from a vector where all elements have the same
value does not require an aligned destination.  Which element is stored
depends on the alignment of the destination address, but since they all
have the same value, the result is the same regardless of the alignment.

Originally committed as revision 19696 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-24 21:42:22 +00:00
Diego Biurrun
deb1b2b699 Add necessary #include for config.h.
Originally committed as revision 19692 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-24 10:59:14 +00:00
Måns Rullgård
b662e8395b PPC: simplify loading some values into altivec registers
Instead of filling a local array with the desired value and loading it,
load a single element and vec_splat() it to fill the vector.

Originally committed as revision 19691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-24 10:36:13 +00:00