Diego Biurrun
276b995d85
x86: drop pointless ARCH_X86 #ifdef from files in x86 subdirectory
2011-11-08 17:52:55 +01:00
Justin Ruggles
b8f02f5b4e
dsputil: use cpuflags in x86 versions of vector_clip_int32()
2011-11-06 20:50:06 -05:00
Ronald S. Bultje
717401aff2
h264_weight: remove duplication functions.
2011-11-05 07:16:30 -07:00
Justin Ruggles
5463e83dbc
fmtconvert: fix int32_to_float_fmul_scalar() for windows x86_64
...
The calling convention only allows 4 non-stack parameter, with each
float or int register being skipped if not used.
fixes Bug 64
2011-11-02 21:44:58 -04:00
Daniel Kang
ded3e9f054
H.264: Cometics to dsputil_mmx.c
...
Add whitespace.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-26 06:41:32 -07:00
Ronald S. Bultje
b0b3231074
h264_weight: initialize "height" function argument properly.
...
Right now it's not actually initialized on 32-bit, leading to crashes
on win32.
2011-10-22 00:23:24 -07:00
Justin Ruggles
aad3429d4e
fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Justin Ruggles
4e8e262476
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Justin Ruggles
185142a5ea
fmtconvert: check compile-time x86 instruction set flags
2011-10-21 10:13:05 -04:00
Justin Ruggles
708ab7dd69
fmtconvert: port float_to_int16() x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Ronald S. Bultje
c2d337429c
H264: change weight/biweight functions to take a height argument.
...
Neon parts by Mans Rullgard <mans@mansr.com>.
2011-10-21 01:00:45 -07:00
Ronald S. Bultje
229d263cc9
Support for lossless and inter H264 4:2:2.
2011-10-21 01:00:45 -07:00
Baptiste Coudurier
76741b0e56
h264: 4:2:2 intra decoding support
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Diego Biurrun
265980dabc
x86: Move some variable declarations below the appropriat #ifdef.
...
This avoids some unused variable warnings with YASM disabled.
2011-10-20 16:19:27 +02:00
Diego Biurrun
2cb7c81669
x86: Fix linking of ProRes DSP ASM with YASM disabled.
2011-10-20 16:19:13 +02:00
Ronald S. Bultje
05c8f119cc
proresdsp: fix function prototypes.
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-14 21:34:46 +02:00
Ronald S. Bultje
e3f530feca
prores: idct sse2/sse4 optimizations.
...
~3.0-3.5x as fast as original C version, 1.6x as fast overall.
2011-10-11 07:50:48 -07:00
Sean McGovern
c2d3f56107
fft: avoid a signed overflow
...
As a signed integer, 1<<31 overflows, so force it to unsigned.
Signed-off-by: Alex Converse <alex.converse@gmail.com>
2011-09-23 17:02:58 -07:00
Ronald S. Bultje
38e06c2969
Move clipd macros to x86util.asm.
...
This allows sharing them between multiple .asm files.
2011-08-17 20:56:06 -07:00
Dave Yeo
cc73511e8e
Fix NASM include directive
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-15 11:24:35 -07:00
Alex Converse
48f7163f13
dsputil_mmx: Honor HAVE_AMD3DNOW
2011-08-15 11:20:08 -07:00
Ronald S. Bultje
b2c087871d
Move x86util.asm from libavcodec/ to libavutil/.
...
This allows using it in swscale also.
2011-08-12 11:43:03 -07:00
Ronald S. Bultje
3a39195b1d
Move x86inc.asm to libavutil/.
...
This allows using it in libswscale/ also.
2011-08-12 11:43:02 -07:00
Kostya Shishkov
d241f51e0f
Move RV3/4-specific DSP functions into their own context
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-11 16:07:15 -07:00
Vitor Sessak
18b131de04
dct32: Add SSE2 ASM optimizations
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-02 10:17:29 -07:00
Jason Garrett-Glaser
a3bf7b864a
H.264: tweak some other x86 asm for Atom
2011-07-29 12:24:15 -07:00
Mans Rullgard
3ad1684126
x86: cabac: add operand size suffixes missing from 6c32576
...
This fixes build with clang.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 18:59:23 -07:00
Mans Rullgard
f5f004bc5a
x86: cabac: don't load/store context values in asm
...
Inspection of compiled code shows gcc handles these fine on its own.
Benchmarking also shows no measurable speed difference.
Removing the remaining cases in get_cabac_bypass_sign_x86() does
cause more substantial changes to the compiled code with uncertain
impact.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 22:25:21 +01:00
Jason Garrett-Glaser
6c32576548
H.264: optimize CABAC x86 asm for Atom
2011-07-28 13:06:13 -07:00
Mans Rullgard
da4c7cce21
x86: fix build with gcc 4.7
...
The upcoming gcc 4.7 has more advanced constant propagation
resulting some inline asm operands becoming constants and thus
emitted as literals, sometimes in contexts where this results
in invalid instructions.
This patch changes the constraints of the relevant operands
to "rm" thus forcing a valid type. While obviously suboptimal,
this is what older gcc versions already did, and there is no
change to the code generated with these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-26 22:17:43 +01:00
Daniel Kang
406fbd24dc
H.264: Add optimizations to predict x86 assembly.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-22 14:54:33 -07:00
Joseph Artsimovich
5ab21439fd
dnxhd: 10-bit support
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:44:40 +01:00
Mans Rullgard
a617c6aaa3
dsputil: update per-arch init funcs for non-h264 high bit depth
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
874f1a901d
dsputil: template get_pixels() for different bit depths
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
0a72533e98
jfdctint: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
e7a972e113
simple_idct: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-20 17:49:48 +01:00
Diego Biurrun
65083b4911
dsputil: remove disabled code
2011-07-18 11:48:35 +02:00
Martin Storsjö
8f62ef0f95
x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-07-18 00:10:45 +03:00
Diego Biurrun
e0ae2174db
simple_idct: remove disabled code
2011-07-17 17:32:37 +02:00
Daniel Kang
ac4a85f476
H.264: Add more x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-13 18:44:51 -07:00
Jason Garrett-Glaser
b5bbc84fe2
H.264: add filter_mb_fast support for >8-bit decoding
...
Much faster high bit depth deblocking.
2011-07-11 14:58:50 -07:00
Mans Rullgard
710b8df949
dsputil: remove ff_emulated_edge_mc macro used in one place
...
This macro can cause problems in conjunction with the bitdepth
template expansion. It was presumably added to keep source
compatibility when high bitdepth support was added. However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-10 17:55:58 +01:00
Daniel Kang
c0483d0c7a
H.264: Add x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-08 15:59:29 -07:00
Daniel Kang
3c7c16fde3
YASM: Shut up unused variable compiler warning with --disable-yasm.
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-07-04 18:49:09 +02:00
Daniel Kang
567a32b5b2
x86_32: Fix build on x86_32 with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-04 08:47:09 -07:00
Daniel Kang
58f7aad051
Fix build with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 22:56:09 -07:00
Daniel Kang
9bfa5363da
H.264: Add x86 assembly for 10-bit H.264 qpel functions.
...
Mainly ported from 8-bit H.264 qpel.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 07:43:38 -07:00
Justin Ruggles
f99a5ef92e
ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
2011-07-01 13:02:11 -04:00
Justin Ruggles
6054cd25b4
ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
2011-07-01 13:02:11 -04:00
Diego Biurrun
d2ee495fb2
configure: Drop check for availability of ten assembler operands.
...
This was done to support gcc 2.95, which is an old legacy compiler
that fails to compile the current codebase anyway.
2011-06-28 13:14:37 +02:00
Diego Biurrun
adbfc605f6
doxygen: Consistently use '@' instead of '\' for Doxygen markup.
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-24 00:37:49 +02:00
Daniel Kang
84e70ef004
h264: Add x86 assembly for 10-bit weight/biweight H.264 functions.
...
Mainly ported from 8-bit H.264 weight/biweight.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-21 15:24:13 +02:00
Mans Rullgard
c5ee740745
x86: cabac: fix register constraints for 32-bit mode
...
Some operands need to be accessed in byte mode, which restricts the
available registers in 32-bit mode. Using the 'q' constraint selects
a suitable register.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 23:36:40 +01:00
Mans Rullgard
2143d69bdd
cabac: move x86 asm to libavcodec/x86/cabac.h
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
d075e7d540
x86: h264: cast pointers to intptr_t rather than int
...
Only the low-order bits are used here so the type is not important,
but this avoids a compiler warning.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
3a4edb76d6
x86: h264: remove hardcoded edi in decode_significance_8x8_x86()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
b92c1a6d26
x86: h264: remove hardcoded esi in decode_significance[_8x8]_x86()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
3fc4e36c78
x86: h264: remove hardcoded edx in decode_significance[_8x8]_x86()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
e4b5a204aa
x86: h264: remove hardcoded eax in decode_significance[_8x8]_x86()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
018c33838e
x86: cabac: remove hardcoded ebx in inline asm
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
6b712acc0e
x86: cabac: remove hardcoded struct offsets from inline asm
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Ronald S. Bultje
ed63f527f2
Fix build if yasm is not available.
2011-06-18 08:34:14 -04:00
Daniel Kang
f188a1e0ca
H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.
...
Mainly ported from 8-bit H.264 MC Chroma.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-18 07:52:19 -04:00
Jason Garrett-Glaser
c90b94424c
4:4:4 H.264 decoding support
...
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea
Roll back 4:4:4 H.264 for now
...
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c
4:4:4 H.264 decoding support
...
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Oskar Arvidsson
6c031a3338
h264: Fix 10-bit H.264 x86 chroma v loopfilter asm.
...
The tc variable was not splatted correctly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-10 14:44:57 -04:00
Daniel Kang
4de83b7b6d
H264: x86 predict init cosmetics.
...
Change indentation and whitespace; also move HAVE_YASM blocks.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-08 00:22:52 +02:00
Daniel Kang
a8d44f9dd5
Add x86 assembly for some 10-bit H.264 intra predict functions.
...
Parts are inspired from the 8-bit H.264 predict code in Libav.
Other parts ported from x264 with relicensing permission from author.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-06 01:31:02 +02:00
Loren Merritt
53be7b23e9
Cosmetic changes to h264_idct_10bit.asm.
...
Removes redundant dword tags and whitespace changes.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:15 -07:00
Loren Merritt
994c3550ff
2x faster h264_idct_add8_10.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:02 -07:00
Ronald S. Bultje
e6635a9a19
h264: remove CONFIG_GPL from x86 intra prediction code.
...
The authors permitted relicensing to LGPL a long time ago (Holger,
Loren and Jason).
2011-06-02 07:02:46 -07:00
Daniel Kang
f3aa65af3a
h264/10bit: add HAVE_ALIGNED_STACK checks.
...
Fixes regression in 836f47d34b
in ICC-10.x,
since ICC<=11.0 doesn't align stack upon function calls.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-31 21:43:20 -07:00
Daniel Kang
348493db60
Update 8-bit H.264 IDCT function names to reflect bit-depth.
...
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00
Daniel Kang
836f47d34b
Add IDCT functions for 10-bit H.264.
...
Ports the majority of IDCT functions for 10-bit H.264.
Parts are inspired from 8-bit IDCT code in Libav; other parts ported from x264 with relicensing permission from author.
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00
Justin Ruggles
70bb747a57
ac3dsp: do not use the ff_* prefix when referencing ff_ac3_bap_bits.
...
this should fix the windows builds
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-05-28 22:43:40 +03:00
Justin Ruggles
6ca23db9cc
ac3enc: modify mantissa bit counting to keep bap counts for all values of bap
...
instead of just 0 to 4.
This does all the actual bit counting as a final step.
2011-05-28 12:39:28 -04:00
Diego Biurrun
5e528cffcf
x86: Add appropriate ifdefs around certain AVX functions.
...
nasm versions prior to 2.09 have trouble assembling some of our AVX code.
Protect these sections by preprocessor macros to allow compilation to pass.
2011-05-27 21:18:12 +02:00
Dave Yeo
a10fb79070
x86 asm: Add SECTION_TEXT to dct32_sse.asm.
...
This fixes the following error on OS/2:
error: segment name `.text align=16' not recognized
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-23 12:47:53 +02:00
Loren Merritt
422b2362fc
dct32_sse: eliminate some spills
...
125->104 cycles on penryn (x86_64 only)
2011-05-22 19:27:18 +02:00
Vitor Sessak
165c7c420d
Fix dct32() compilation with --disable-yasm
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-22 07:10:19 -04:00
Vitor Sessak
6204feb160
dct32: Add AVX implementation of 32-point DCT
2011-05-21 17:42:26 +02:00
Vitor Sessak
4e653b98c8
dct32: Change pass 6 permutation to allow for AVX implementation
2011-05-21 17:42:26 +02:00
Vitor Sessak
3758eb0eb9
dct32: port SSE 32-point DCT to YASM
2011-05-21 17:42:26 +02:00
Diego Biurrun
153382e1b6
multiple inclusion guard cleanup
...
Add missing multiple inclusion guards; clean up #endif comments;
add missing library prefixes; keep guard names consistent.
2011-05-21 13:48:10 +02:00
Dave Yeo
d69f9a4234
Add support for a.out object format to assembler macros.
...
This format is still used by e.g. OS/2.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-20 17:52:21 +02:00
Mans Rullgard
0b5e44ed29
mpegaudiodsp: fix x86 and ppc makefiles
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 16:32:24 +01:00
Mans Rullgard
c4f5c2d6f4
Move some mpegaudio functions to new mpegaudiodsp subsystem
...
This separation allows these functions to be used in a cleaner
fashion from other codecs (e.g. qdm2) and simplifies creating
optimised versions of them.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 12:25:34 +01:00
Justin Ruggles
e98a95e779
10l: wrap float_interleave functions in HAVE_YASM.
...
fixes compilation with --disable-yasm
2011-05-18 20:18:08 -04:00
Justin Ruggles
32f8fb8ecf
Add float_interleave() to FmtConvertContext with x86-optimized versions.
...
Partially based on patches by clsid2 in ffdshow-tryout.
ff_float_interleave6() x86 improvements by Loren Merrit.
2011-05-18 17:27:05 -04:00
Daniel Kang
d0005d347d
Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
...
Arguments for variable size instructions are added to many macros, along
with other various changes. The x86util.asm code was ported from x264.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-17 20:44:48 +02:00
Gil Pedersen
257de5fb25
h264dsp_mmx: Add #ifdefs around some mmxext functions on x86_64.
...
This fixes linking errors due to undefined symbols on x86_64 OS X.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-16 15:35:53 +02:00
Diego Biurrun
888fa31eca
Fix FSF address copy paste error in some license headers.
2011-05-14 21:32:31 +02:00
Jason Garrett-Glaser
5705b02079
10-bit H.264 x86 chroma v loopfilter asm
...
Also delete some unused deblock asm macros.
2011-05-11 11:09:10 -07:00
Jason Garrett-Glaser
9f3d6ca4f1
Port x86 10-bit H.264 deblock asm from x264
2011-05-10 20:02:15 -07:00
Jason Garrett-Glaser
8ad77b65b5
Update x86 H.264 deblock asm
...
Includes AVX versions from x264.
2011-05-10 20:01:58 -07:00
Ronald S. Bultje
86b29553f8
h264dsp_mmx: place bracket outside #if/#endif block.
...
Should fix compile on systems missing yasm/nasm.
2011-05-10 08:39:38 -04:00
Oskar Arvidsson
19a0729b4c
Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
...
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Diego Biurrun
a734fa575f
Remove disabled non-optimized code variants.
2011-04-29 20:01:13 +02:00
Vitor Sessak
9d35fa520e
Add AVX FFT implementation.
...
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-04-26 18:25:24 +02:00
Vitor Sessak
33cbfa6fa3
Update x86inc.asm from x264 to allow AVX emulation using SSE and MMX.
...
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-04-26 18:18:22 +02:00
Alexander Strange
1500be13f2
dsputil: allow to skip drawing of top/bottom edges.
2011-03-26 17:45:38 -04:00
Justin Ruggles
e6e9823488
Add apply_window_int16() to DSPContext with x86-optimized versions and use it
...
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Mans Rullgard
0aded9484d
Move dct and rdft definitions to separate files
...
This leaves fft.h with only the core FFT and MDCT definitions
thus making it more managable.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-20 17:15:33 +00:00
Mans Rullgard
2912e87a6c
Replace FFmpeg with Libav in licence headers
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Justin Ruggles
0f999cfddb
ac3enc: add float_to_fixed24() with x86-optimized versions to AC3DSPContext
...
and use in scale_coefficients() for the floating-point AC-3 encoder.
2011-03-17 16:46:48 -04:00
Justin Ruggles
79414257e2
mathops: fix MULL() when the compiler does not inline the function.
...
If the function is not inlined, an immmediate cannot be used for the
shift parameter, so the %cl register must be used instead in that case.
This fixes compilation for x86-32 using gcc with --disable-optimizations.
2011-03-15 20:49:37 -04:00
Justin Ruggles
aaff3b312e
mathops: change "g" constraint to "rm" in x86-32 version of MUL64().
...
The 1-arg imul instruction cannot take an immediate argument, only a register
or memory argument.
2011-03-15 13:43:47 -04:00
Justin Ruggles
b181b8fb96
mathops: convert MULL/MULH/MUL64 to inline functions rather than macros.
...
This fixes unexpected name collisions that were occurring with variables
declared within the macros.
It also fixes the fate-acodec-ac3_fixed regression test on x86-32.
2011-03-15 13:43:47 -04:00
Justin Ruggles
f1efbca5e9
ac3enc: add SIMD-optimized shifting functions for use with the fixed-point AC3 encoder.
2011-03-14 08:45:31 -04:00
Mans Rullgard
a5444fee06
Add CONFIG_AC3DSP symbol to simplify makefiles
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-12 11:35:26 +00:00
Ronald S. Bultje
bf6fa73245
dsputil_mmx.c: remove ff_vector128.
...
Remove ff_vector128, it is identical to ff_pb_80.
2011-02-19 10:51:15 -05:00
Ronald S. Bultje
12802ec060
dsputil: move VC1-specific stuff into VC1DSPContext.
2011-02-17 17:35:35 -05:00
Justin Ruggles
1f004fc512
ac3dsp: Change punpckhqdq to movhlps in ac3_max_msb_abs_int16().
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-16 14:08:34 -05:00
Justin Ruggles
fbb6b49dab
ac3enc: Add x86-optimized function to speed up log2_tab().
...
AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute
value of each element in an array of int16_t.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-13 16:49:39 -05:00
Loren Merritt
e6b1ed693a
FFT: factor a shuffle out of the inner loop and merge it into fft_permute.
...
6% faster SSE FFT on Conroe, 2.5% on Penryn.
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-02-13 15:36:39 +01:00
Justin Ruggles
dda3f0ef48
Add x86-optimized versions of exponent_min().
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-10 15:32:47 -05:00
Ronald S. Bultje
17cf7c68ed
Fix ff_emu_edge_core_sse() on Win64.
...
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()
function calls on Win64.
2011-02-08 18:25:12 -05:00
Justin Ruggles
c73d99e672
Separate format conversion DSP functions from DSPContext.
...
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Alex Converse
770c410fbb
Fix ff_imdct_calc_sse() on gcc-4.6
...
Gcc 4.6 only preserves the first value when using an array with an "m"
constraint.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:40:05 +00:00
Ronald S. Bultje
81f2a3f4ff
Implement a SIMD version of emulated_edge_mc() for x86.
...
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Justin Ruggles
d19b744a36
cosmetics: indentation
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:30:15 +00:00
Justin Ruggles
80ba1ddb58
Remove unneeded add bias from 3 functions.
...
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Mans Rullgard
80944df720
x86: fix overflow in h264 8x8 planar prediction
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-24 23:24:28 +00:00
Justin Ruggles
6eabb0d3ad
Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
Justin Ruggles
1c189fc533
cosmetics related to LPC changes.
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles
77a78e9bdc
Separate window function from autocorrelation.
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles
56f8952b25
Move lpc_compute_autocorr() from DSPContext to a new struct LPCContext.
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:58:59 +00:00
Ronald S. Bultje
b9c7f66e6d
Fix horizontal/horizontal_up 8x8l intra prediction x86/simd functions.
...
The original functions did not work correctly for edge pixels, e.g.
when CODEC_FLAG_EMU_EDGE is set, leading to corrupt output in e.g. VLC.
Based on a patch by Daniel Kang <daniel d kang gmail com>.
Signed-off-by: Ronald S. Bultje <rsbultje gmail com>
2011-01-19 20:34:42 -05:00
Mans Rullgard
ef4a65149d
Replace ASMALIGN() with .p2align
...
This macro has unconditionally used .p2align for a long time and
serves no useful purpose.
2011-01-18 20:48:24 +00:00
Mans Rullgard
ac3c9d0169
x86: remove VLA in ac3_downmix_sse
2011-01-18 20:48:24 +00:00
Janne Grunau
2c3589bfda
consolidate .gitignore patters into a single file
...
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-18 21:32:05 +01:00
Janne Grunau
348b8218f7
convert svn:ignore properties to .gitignore files
...
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-17 15:50:14 +01:00
Ronald S. Bultje
1b3e43e4fd
Fix overflow in pred16x16_plane x86 simd code. Fixes issue 2547.
...
Originally committed as revision 26381 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 22:00:44 +00:00
Ronald S. Bultje
ec3233a855
Fix ff_pw_3 alignment.
...
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 23:26:34 +00:00
Jason Garrett-Glaser
19fb234e4a
H.264: split luma dc idct out and implement MMX/SSE2 versions
...
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.
Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Daniel Kang
004357a11f
Fix compilation on x86-32 with --disable-optimizations,
...
fixes issue 2127.
Patch by Daniel Kang, daniel.d.kang at gmail
Originally committed as revision 26204 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-03 11:30:04 +00:00
Daniel Kang
0790caba60
Fix invalid reads in valgrind fate, patch by Daniel Kang <daniel dot d dot
...
kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26177 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-31 01:29:06 +00:00
Daniel Kang
536e9b2f58
Port pred8x8l_down_left_mmxext (H.264 intra prediction) from x264 (authors:
...
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26162 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 23:48:44 +00:00
Daniel Kang
720ea2d5b2
Port pred4x4_down_right_mmxext (H.264 intra prediction) from x264 (authors:
...
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26159 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:55:51 +00:00
Daniel Kang
d0aebe23e2
Port pred4x4_vertical_right_mmxext (H.264 intra prediction) from x264 (authors:
...
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26158 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:52:41 +00:00
Daniel Kang
76497232ef
Port pred4x4_horizontal_down_mmxext (H.264 intra prediction) from x264
...
(authors:Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26157 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:49:57 +00:00
Daniel Kang
e9c576a467
Port pred4x4_horizontal_up_mmxext (H.264 intra prediction) from x264 (authors:
...
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26156 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:42:33 +00:00
Daniel Kang
92f441ae86
Port pred4x4_vertical_left_mmxext (H.264 intra prediction) from x264 (authors:
...
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26155 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:35:34 +00:00
Ronald S. Bultje
e8d98764cc
Merge a few superfluous CONFIG_GPL checks.
...
Originally committed as revision 26154 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:30:47 +00:00
Ronald S. Bultje
42a59278cf
Whitespace cosmetics.
...
Originally committed as revision 26152 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:43:15 +00:00
Daniel Kang
57b1f334d1
Port pred8x8l_horizontal_down_sse2/ssse3 (H.264 intra prediction) from x264
...
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26151 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:42:15 +00:00
Daniel Kang
04cbdf3d24
Port pred8x8l_horizontal_down_mmxext (H.264 intra prediction) from x264
...
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26150 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:38:06 +00:00
Daniel Kang
98c6053cd0
Port pred8x8l_horizontal_up_mmxext/ssse3 (H.264 intra prediction) from x264
...
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26149 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:35:31 +00:00
Daniel Kang
ecc7efbbb6
Port pred8x8l_vertical_left_sse2/ssse3 (H.264 intra prediction) from x264
...
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26148 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:06:22 +00:00