Justin Ruggles
b8f02f5b4e
dsputil: use cpuflags in x86 versions of vector_clip_int32()
2011-11-06 20:50:06 -05:00
Ronald S. Bultje
717401aff2
h264_weight: remove duplicate functions.
2011-11-05 07:16:30 -07:00
Justin Ruggles
5463e83dbc
fmtconvert: fix int32_to_float_fmul_scalar() for windows x86_64
...
The calling convention only allows 4 non-stack parameters, with each
float or int register being skipped if not used.
fixes Bug 64
2011-11-02 21:44:58 -04:00
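The register skipping described in the commit above can be sketched in C as follows. This is only an illustration of the Windows x64 convention, not code from the commit; the enum and the function name `win64_arg_register` are hypothetical.

```c
#include <assert.h>

/* Hypothetical argument classification, for illustration only. */
enum arg_class { ARG_INT, ARG_FLOAT };

/*
 * Return the register the Windows x64 convention assigns to the argument
 * at zero-based position pos (0..3). The slot is chosen by position, so
 * using the integer register of a slot leaves the float register of the
 * same slot unused, and vice versa.
 */
static const char *win64_arg_register(unsigned pos, enum arg_class cls)
{
    static const char *const int_regs[4]   = { "rcx",  "rdx",  "r8",   "r9"   };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };

    assert(pos < 4); /* arguments past the fourth are passed on the stack */
    return cls == ARG_FLOAT ? float_regs[pos] : int_regs[pos];
}
```

Assuming a signature of the form `(float *dst, const int *src, float mul, int len)`, which the log itself does not state, the arguments would land in rcx, rdx, xmm2 and r9, so a float in the third position arrives in xmm2 rather than xmm0.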
Daniel Kang
ded3e9f054
H.264: Cosmetics to dsputil_mmx.c
...
Add whitespace.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-26 06:41:32 -07:00
Ronald S. Bultje
b0b3231074
h264_weight: initialize "height" function argument properly.
...
Right now it's not actually initialized on 32-bit, leading to crashes
on win32.
2011-10-22 00:23:24 -07:00
Justin Ruggles
aad3429d4e
fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Justin Ruggles
4e8e262476
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Justin Ruggles
185142a5ea
fmtconvert: check compile-time x86 instruction set flags
2011-10-21 10:13:05 -04:00
Justin Ruggles
708ab7dd69
fmtconvert: port float_to_int16() x86 inline asm to yasm
2011-10-21 10:13:05 -04:00
Ronald S. Bultje
c2d337429c
H264: change weight/biweight functions to take a height argument.
...
Neon parts by Mans Rullgard <mans@mansr.com>.
2011-10-21 01:00:45 -07:00
Ronald S. Bultje
229d263cc9
Support for lossless and inter H264 4:2:2.
2011-10-21 01:00:45 -07:00
Baptiste Coudurier
76741b0e56
h264: 4:2:2 intra decoding support
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Diego Biurrun
265980dabc
x86: Move some variable declarations below the appropriate #ifdef.
...
This avoids some unused variable warnings with YASM disabled.
2011-10-20 16:19:27 +02:00
Diego Biurrun
2cb7c81669
x86: Fix linking of ProRes DSP ASM with YASM disabled.
2011-10-20 16:19:13 +02:00
Ronald S. Bultje
05c8f119cc
proresdsp: fix function prototypes.
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-14 21:34:46 +02:00
Ronald S. Bultje
e3f530feca
prores: idct sse2/sse4 optimizations.
...
~3.0-3.5x as fast as original C version, 1.6x as fast overall.
2011-10-11 07:50:48 -07:00
Sean McGovern
c2d3f56107
fft: avoid a signed overflow
...
As a signed integer, 1<<31 overflows, so force it to unsigned.
Signed-off-by: Alex Converse <alex.converse@gmail.com>
2011-09-23 17:02:58 -07:00
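A minimal sketch of the overflow fix described in the commit above, assuming a 32-bit int. The helper `bit_mask` is hypothetical and does not appear in the fft code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a mask with only bit n set. */
static uint32_t bit_mask(unsigned n)
{
    assert(n < 32);
    /*
     * Writing (1 << 31) shifts a signed int into its sign bit, which is
     * undefined behaviour in C99 when int is 32 bits wide. Starting from
     * an unsigned operand keeps the shift well defined for n = 0..31.
     */
    return 1U << n;
}
```

With this form, `bit_mask(31)` yields 0x80000000 without relying on signed overflow.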
Ronald S. Bultje
38e06c2969
Move clipd macros to x86util.asm.
...
This allows sharing them between multiple .asm files.
2011-08-17 20:56:06 -07:00
Dave Yeo
cc73511e8e
Fix NASM include directive
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-15 11:24:35 -07:00
Alex Converse
48f7163f13
dsputil_mmx: Honor HAVE_AMD3DNOW
2011-08-15 11:20:08 -07:00
Ronald S. Bultje
b2c087871d
Move x86util.asm from libavcodec/ to libavutil/.
...
This allows using it in swscale also.
2011-08-12 11:43:03 -07:00
Ronald S. Bultje
3a39195b1d
Move x86inc.asm to libavutil/.
...
This allows using it in libswscale/ also.
2011-08-12 11:43:02 -07:00
Kostya Shishkov
d241f51e0f
Move RV3/4-specific DSP functions into their own context
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-11 16:07:15 -07:00
Vitor Sessak
18b131de04
dct32: Add SSE2 ASM optimizations
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-02 10:17:29 -07:00
Jason Garrett-Glaser
a3bf7b864a
H.264: tweak some other x86 asm for Atom
2011-07-29 12:24:15 -07:00
Mans Rullgard
3ad1684126
x86: cabac: add operand size suffixes missing from 6c32576
...
This fixes build with clang.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 18:59:23 -07:00
Mans Rullgard
f5f004bc5a
x86: cabac: don't load/store context values in asm
...
Inspection of compiled code shows gcc handles these fine on its own.
Benchmarking also shows no measurable speed difference.
Removing the remaining cases in get_cabac_bypass_sign_x86() does
cause more substantial changes to the compiled code with uncertain
impact.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 22:25:21 +01:00
Jason Garrett-Glaser
6c32576548
H.264: optimize CABAC x86 asm for Atom
2011-07-28 13:06:13 -07:00
Mans Rullgard
da4c7cce21
x86: fix build with gcc 4.7
...
The upcoming gcc 4.7 has more advanced constant propagation,
resulting in some inline asm operands becoming constants and thus
being emitted as literals, sometimes in contexts where this results
in invalid instructions.
This patch changes the constraints of the relevant operands
to "rm" thus forcing a valid type. While obviously suboptimal,
this is what older gcc versions already did, and there is no
change to the code generated with these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-26 22:17:43 +01:00
Daniel Kang
406fbd24dc
H.264: Add optimizations to predict x86 assembly.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-22 14:54:33 -07:00
Joseph Artsimovich
5ab21439fd
dnxhd: 10-bit support
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:44:40 +01:00
Mans Rullgard
a617c6aaa3
dsputil: update per-arch init funcs for non-h264 high bit depth
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
874f1a901d
dsputil: template get_pixels() for different bit depths
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
0a72533e98
jfdctint: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
e7a972e113
simple_idct: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-20 17:49:48 +01:00
Diego Biurrun
65083b4911
dsputil: remove disabled code
2011-07-18 11:48:35 +02:00
Martin Storsjö
8f62ef0f95
x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-07-18 00:10:45 +03:00
Diego Biurrun
e0ae2174db
simple_idct: remove disabled code
2011-07-17 17:32:37 +02:00
Daniel Kang
ac4a85f476
H.264: Add more x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-13 18:44:51 -07:00
Jason Garrett-Glaser
b5bbc84fe2
H.264: add filter_mb_fast support for >8-bit decoding
...
Much faster high bit depth deblocking.
2011-07-11 14:58:50 -07:00
Mans Rullgard
710b8df949
dsputil: remove ff_emulated_edge_mc macro used in one place
...
This macro can cause problems in conjunction with the bitdepth
template expansion. It was presumably added to keep source
compatibility when high bitdepth support was added. However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-10 17:55:58 +01:00
Daniel Kang
c0483d0c7a
H.264: Add x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-08 15:59:29 -07:00
Daniel Kang
3c7c16fde3
YASM: Shut up unused variable compiler warning with --disable-yasm.
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-07-04 18:49:09 +02:00
Daniel Kang
567a32b5b2
x86_32: Fix build on x86_32 with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-04 08:47:09 -07:00
Daniel Kang
58f7aad051
Fix build with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 22:56:09 -07:00
Daniel Kang
9bfa5363da
H.264: Add x86 assembly for 10-bit H.264 qpel functions.
...
Mainly ported from 8-bit H.264 qpel.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 07:43:38 -07:00
Justin Ruggles
f99a5ef92e
ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
2011-07-01 13:02:11 -04:00
Justin Ruggles
6054cd25b4
ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
2011-07-01 13:02:11 -04:00
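For orientation, a scalar version of an int32_t array clipping routine might look like the sketch below. The name `clip_int32_array` and its signature are assumptions made for illustration; the log does not show the actual DSPUtil prototype or the x86 implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: clamp each element of src to [min, max]. */
static void clip_int32_array(int32_t *dst, const int32_t *src,
                             int32_t min, int32_t max, size_t len)
{
    assert(min <= max);
    for (size_t i = 0; i < len; i++) {
        int32_t v = src[i];
        if (v < min)
            v = min;        /* below the range: raise to the lower bound */
        else if (v > max)
            v = max;        /* above the range: lower to the upper bound */
        dst[i] = v;
    }
}
```

A SIMD version would typically apply the same min/max pair to several elements per iteration, which is presumably what the x86 versions mentioned in the commit do.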
Diego Biurrun
d2ee495fb2
configure: Drop check for availability of ten assembler operands.
...
This was done to support gcc 2.95, which is an old legacy compiler
that fails to compile the current codebase anyway.
2011-06-28 13:14:37 +02:00
Diego Biurrun
adbfc605f6
doxygen: Consistently use '@' instead of '\' for Doxygen markup.
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-24 00:37:49 +02:00