328 Commits

Author SHA1 Message Date
Mans Rullgard
b034c95cc1 h264: fix ppc/altivec build
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-10-21 12:49:01 +01:00
Ronald S. Bultje
c2d337429c H264: change weight/biweight functions to take a height argument.
Neon parts by Mans Rullgard <mans@mansr.com>.
2011-10-21 01:00:45 -07:00
Baptiste Coudurier
76741b0e56 h264: 4:2:2 intra decoding support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Mans Rullgard
6e4a35ced9 ppc: fix 32-bit PIC build
On 32-bit ppc, the GOT pointer must be loaded manually.
This adds a "get_got" assembler macro to compute the
GOT address.  The "movrel" macro is updated to take an
additional parameter containing the GOT address since
no register is reserved for this purpose on ppc32.
These changes have no effect on ppc64 builds.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-09-25 17:27:48 +01:00
Mans Rullgard
ca6a904656 ppc: remove redundant setting of Altivec IDCT
This is already set by dsputil_init_ppc() and is best done in only
one place.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-27 20:14:12 +01:00
Mans Rullgard
a617c6aaa3 dsputil: update per-arch init funcs for non-h264 high bit depth
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
874f1a901d dsputil: template get_pixels() for different bit depths
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
0a72533e98 jfdctint: add 10-bit version
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
e7a972e113 simple_idct: add 10-bit version
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-20 17:49:48 +01:00
Diego Biurrun
21aed0ed92 ppc: remove disabled code
2011-07-16 02:56:52 +02:00
Mans Rullgard
6cbf2420b9 PPC: use Altivec IMDCT only for supported sizes
The Altivec IMDCT works with size 32 and higher only.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-05 16:01:56 +01:00
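The size restriction described in 6cbf2420b9 amounts to a guard when the transform is selected. The following is only a minimal sketch of that idea, assuming the transform size is given as a log2 bit count; every name in it (`imdct_fn`, `select_imdct`, the two stub functions) is hypothetical and not taken from the libavcodec sources.

```c
/* Hypothetical function-pointer type for an IMDCT implementation. */
typedef void (*imdct_fn)(float *out, const float *in);

/* Stubs standing in for the generic C and the Altivec implementations. */
static void imdct_c_sketch(float *out, const float *in)       { (void)out; (void)in; }
static void imdct_altivec_sketch(float *out, const float *in) { (void)out; (void)in; }

/*
 * Pick the Altivec version only when Altivec is available AND the
 * transform size (assumed here to be 1 << nbits) is at least 32.
 * Smaller sizes fall back to the generic C version.
 */
static imdct_fn select_imdct(int nbits, int have_altivec)
{
    int size = 1 << nbits;

    if (have_altivec && size >= 32)
        return imdct_altivec_sketch;
    return imdct_c_sketch;
}
```

With these assumptions, `select_imdct(5, 1)` yields the Altivec stub, while `select_imdct(4, 1)` and `select_imdct(6, 0)` both yield the C stub.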
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Diego Biurrun
1f6b9cc31d Replace some nonstandard DEBUG_* preprocessor directives by plain DEBUG.
2011-06-07 13:20:58 +02:00
Mans Rullgard
0b5e44ed29 mpegaudiodsp: fix x86 and ppc makefiles
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 16:32:24 +01:00
Mans Rullgard
c4f5c2d6f4 Move some mpegaudio functions to new mpegaudiodsp subsystem
This separation allows these functions to be used in a cleaner
fashion from other codecs (e.g. qdm2) and simplifies creating
optimised versions of them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 12:25:34 +01:00
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
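The naming scheme from 19a0729b4c (`clear_blocks_c` becoming `clear_blocks_8_c`) can be illustrated with a small macro template. This is a sketch under stated assumptions, not the patch itself: `SKETCH_FUNC`, `DEFINE_CLEAR_BLOCKS`, `SketchDSPContext` and `sketch_dsputil_init` are hypothetical names, and the coefficient types (`int16_t` for 8-bit, `int32_t` for high bit depth) and the block count of 6 x 64 are chosen for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Build a name of the form <base name>_<bit depth>_c, e.g. clear_blocks_8_c. */
#define SKETCH_FUNC(base, depth) base ## _ ## depth ## _c

/* Instantiate one clear_blocks variant for a given depth and coefficient type. */
#define DEFINE_CLEAR_BLOCKS(depth, coeff_t)                 \
static void SKETCH_FUNC(clear_blocks, depth)(void *blocks)  \
{                                                           \
    memset(blocks, 0, sizeof(coeff_t) * 6 * 64);            \
}

DEFINE_CLEAR_BLOCKS(8,  int16_t)  /* defines clear_blocks_8_c  */
DEFINE_CLEAR_BLOCKS(10, int32_t)  /* defines clear_blocks_10_c */

/* Hypothetical, heavily reduced stand-in for a DSP context. */
typedef struct SketchDSPContext {
    void (*clear_blocks)(void *blocks);
} SketchDSPContext;

/* Choose implementations by bit depth; returns 0 on success, -1 if unsupported. */
static int sketch_dsputil_init(SketchDSPContext *c, int bit_depth)
{
    switch (bit_depth) {
    case 8:
        c->clear_blocks = clear_blocks_8_c;
        return 0;
    case 9:   /* this function depends only on element size, so 9 and 10 share it */
    case 10:
        c->clear_blocks = clear_blocks_10_c;
        return 0;
    default:
        return -1;
    }
}
```

Letting 9-bit and 10-bit share one function in this sketch mirrors the commit's note that some high-bit-depth functions depend only on the pixel size; the actual patch adds separate 8-, 9- and 10-bit versions.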
Ronald S. Bultje
18b6a69ce9 Revert "VC1: merge idct8x8, coeff adjustments and put_pixels."
This reverts commit f8bed30d8b. The reason
for this is that the overlap filter, which runs after IDCT, should run
on unclamped values, and thus IDCT and put_pixels() cannot be merged if
we want to attempt to be bitexact.
2011-05-04 07:40:01 -04:00
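The reason given in 18b6a69ce9 is that a filter applied to clamped values generally does not produce the same result as the same filter applied to unclamped values. The toy example below shows only that general effect. It assumes a made-up two-sample rounding average (`toy_smooth`) in place of the real VC-1 overlap filter, and all names are hypothetical.

```c
/* Clamp an integer to the 8-bit pixel range 0..255. */
static int clip_uint8_sketch(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : v;
}

/* Toy filter: rounded average of two samples. NOT the VC-1 overlap filter. */
static int toy_smooth(int a, int b)
{
    return (a + b + 1) >> 1;
}

/* Filter the raw (unclamped) IDCT-like outputs, clamp only at the end. */
static int filter_then_clamp(int a, int b)
{
    return clip_uint8_sketch(toy_smooth(a, b));
}

/* Clamp first (as a merged IDCT + put_pixels would), then filter. */
static int clamp_then_filter(int a, int b)
{
    return clip_uint8_sketch(toy_smooth(clip_uint8_sketch(a),
                                        clip_uint8_sketch(b)));
}
```

For the inputs 300 and 100, `filter_then_clamp` gives 200 while `clamp_then_filter` gives 178, so the two orderings are not bit-exact with each other.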
Alex Converse
187a537904 Convert some undefined 1<<31 shifts into 1U<<31.
According to ISO 9899:1999 § 6.5.7/4:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits
are filled with zeros. If E1 has an unsigned type, the value of the
result is E1 × 2^E2, reduced modulo one more than the maximum value
representable in the result type. If E1 has a signed type and
nonnegative value, and E1 × 2^E2 is representable in the result type, then
that is the resulting value; otherwise, the behavior is undefined.
2011-04-11 21:47:42 -07:00
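The change in 187a537904 can be sketched as follows, assuming a platform where `int` and `unsigned int` are 32 bits wide. The helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Mask with only bit 31 set.
 *
 * Writing this as (1 << 31) shifts a signed int so that the mathematical
 * result 2^31 is not representable in a 32-bit int, which the quoted
 * paragraph of the standard makes undefined. With an unsigned operand
 * (1U) the result is defined and equals 0x80000000 for 32-bit unsigned int.
 */
static uint32_t top_bit_mask(void)
{
    return 1U << 31;
}

/* Single-bit mask for bit n, valid for n in 0..31 under the same assumption. */
static uint32_t bit_mask(unsigned n)
{
    return 1U << n;
}
```

Under the 32-bit assumption, `top_bit_mask()` returns `0x80000000` and `bit_mask(0)` returns `1`. Code that must also work where `unsigned int` is narrower could use `UINT32_C(1) << n` instead.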
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Justin Ruggles
d21be5f15b cosmetics: rename ff_fmt_convert_init_ppc() to ff_fmt_convert_init_altivec().
It only has Altivec functions and is not compiled if Altivec is disabled.
2011-03-07 11:15:29 -05:00
Mans Rullgard
e0e46cae37 vp8: ppc: fix invalid reads in altivec epel mc
The 4-tap filters should only access one row/column before the
reference block.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-21 20:28:41 +00:00
Mans Rullgard
381efba0ec ppc: fix vc1 inverse transform, unbreak build
GCC 4.3 and later are more particular about signedness matching
in vector operations.  The operations under if(rangered) were
missing assignments and thus had no effect.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-21 20:28:37 +00:00
Ronald S. Bultje
f8bed30d8b VC1: merge idct8x8, coeff adjustments and put_pixels.
Merging these functions allows merging some loops, which makes the
results (particularly after SIMD optimizations) much faster.
2011-02-21 10:23:44 -05:00
Ronald S. Bultje
ed040f35f2 Fix PPC build.
2011-02-17 20:22:39 -05:00
Ronald S. Bultje
12802ec060 dsputil: move VC1-specific stuff into VC1DSPContext.
2011-02-17 17:35:35 -05:00
Ronald S. Bultje
1da6ea3954 VC1: transpose IDCT 8x8 coeffs while reading.
2011-02-17 17:35:35 -05:00
Justin Ruggles
c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Justin Ruggles
80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Vitor Sessak
3af1fe829e Fix overread in altivec DSP function sad16
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-29 15:32:14 +00:00
Justin Ruggles
6eabb0d3ad Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
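The semantic change in 6eabb0d3ad, from an in-place multiply to a three-operand form, might look like the plain C sketch below. The function name and exact signature are assumptions made for illustration; the real `DSPContext.vector_fmul()` prototype is not shown in this log.

```c
/*
 * Three-operand element-wise multiply: dst[i] = src0[i] * src1[i].
 * The older in-place behaviour (dst = dst * src) is still available
 * by passing the same buffer as both dst and src0.
 */
static void vector_fmul_sketch(float *dst, const float *src0,
                               const float *src1, int len)
{
    int i;

    for (i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}
```

Calling it with `src0 = {1.5, 2.0}` and `src1 = {2.0, 4.0}` stores `{3.0, 8.0}` in `dst` and leaves both inputs unchanged, which the old in-place form could not do.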
Janne Grunau
2c3589bfda consolidate .gitignore patterns into a single file
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-18 21:32:05 +01:00
Janne Grunau
348b8218f7 convert svn:ignore properties to .gitignore files
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-17 15:50:14 +01:00
Stefano Sabatini
c6c98d0897 Move mm_support() from libavcodec to libavutil, make it a public
function and rename it to av_get_cpu_flags().

Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 15:07:14 +00:00
Stefano Sabatini
ccf22d3ed1 Merge has_altivec() function into mm_support(), remove it and use
mm_support() instead.

Reduce complexity and simplify pending move to libavutil.

Originally committed as revision 25074 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 10:02:40 +00:00
Stefano Sabatini
7160bb716b Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_
symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h.

Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-04 09:59:08 +00:00
Måns Rullgård
c0ec9918b0 Remove global mm_flags variable
Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-24 17:47:05 +00:00
Loren Merritt
1ee076b1b1 more credits to D. J. Bernstein for fft
Originally committed as revision 24308 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-18 20:06:42 +00:00
Måns Rullgård
a46b84d120 PPC: convert Altivec FFT to pure assembler
On PPC a leaf function has a 288-byte red zone below the stack pointer,
sparing these functions the chore of setting up a full stack frame.

When a function call is disguised within an inline asm block, the
compiler might not adjust the stack pointer as required before a
function call, resulting in the red zone being clobbered.

Moving the entire function to pure asm avoids this problem and also
results in somewhat better code.

Originally committed as revision 24044 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:47 +00:00
Måns Rullgård
deca86eab1 PPC: gas-preprocessor handles m[ft]spr shorthands
Originally committed as revision 24043 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:43 +00:00
Måns Rullgård
fe3d2e4b02 PPC: add some asm support macros
Originally committed as revision 24042 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-04 18:33:40 +00:00
Måns Rullgård
a075902f3d PPC: add _interleave versions of fft{4,6,16}_altivec
This removes the need for a post-swizzle with the small FFTs.

Originally committed as revision 24025 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-03 18:36:10 +00:00
Måns Rullgård
9bbb50648d PPC: fix build on OSX without gas-preprocessor
Originally committed as revision 23962 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-02 00:20:26 +00:00
Loren Merritt
cf61994a17 PPC: Altivec IMDCT
Patch by Loren Merritt

Originally committed as revision 23959 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:49 +00:00
Måns Rullgård
588d28ac08 Remove vestiges of radix-2 FFT
Patch (mostly) by Loren Merritt

Originally committed as revision 23957 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:42 +00:00
Måns Rullgård
bf7ba15372 PPC: Altivec split-radix FFT
1.8x faster than altivec radix-2 on a G4
8% faster vorbis decoding

Patch (mostly) by Loren Merritt

Originally committed as revision 23956 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:39 +00:00
Måns Rullgård
2f0c136e1f Check whether IBM or Apple PPC assembler syntax is used
This checks which assembler syntax is supported and defines macros
for register names accordingly.

Originally committed as revision 23952 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 23:21:27 +00:00
Vitor Sessak
060dd93000 Altivec-optimized mp{1,2,3} windowing
Originally committed as revision 23943 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 21:04:12 +00:00
Måns Rullgård
49bd8e4b84 Fix grammar errors in documentation
Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-30 15:38:06 +00:00
David Conrad
982fac7357 Altivec VP8 MC functions
Originally committed as revision 23884 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 06:42:17 +00:00
David Conrad
7bf4e9d7f7 Altivec: Add helper function to load from a constant misalignment
Originally committed as revision 23883 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 06:42:12 +00:00