Commit Graph

25 Commits

Author SHA1 Message Date
Justin Ruggles
9d06037d48 twinvq: add SSE/AVX optimized sum/difference stereo interleaving 2011-11-11 14:13:58 -05:00
Justin Ruggles
b8f02f5b4e dsputil: use cpuflags in x86 versions of vector_clip_int32() 2011-11-06 20:50:06 -05:00
Justin Ruggles
4e8e262476 fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm 2011-10-21 10:13:05 -04:00
Ronald S. Bultje
38e06c2969 Move clipd macros to x86util.asm.
This allows sharing them between multiple .asm files.
2011-08-17 20:56:06 -07:00
Dave Yeo
cc73511e8e Fix NASM include directive
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-15 11:24:35 -07:00
Ronald S. Bultje
3a39195b1d Move x86inc.asm to libavutil/.
This allows using it in libswscale/ also.
2011-08-12 11:43:02 -07:00
Justin Ruggles
6054cd25b4 ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions. 2011-07-01 13:02:11 -04:00
Dave Yeo
d69f9a4234 Add support for a.out object format to assembler macros.
This format is still used by e.g. OS/2.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-20 17:52:21 +02:00
Diego Biurrun
888fa31eca Fix FSF address copy paste error in some license headers. 2011-05-14 21:32:31 +02:00
Justin Ruggles
e6e9823488 Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Ronald S. Bultje
17cf7c68ed Fix ff_emu_edge_core_sse() on Win64.
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()
function calls on Win64.
2011-02-08 18:25:12 -05:00
Justin Ruggles
c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Ronald S. Bultje
81f2a3f4ff Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Jason Garrett-Glaser
2966cc1849 Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longer necessary for PIC-compliant loads.

Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-23 19:20:46 +00:00
Alex Converse
3deb53849e Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 23:07:58 +00:00
Loren Merritt
758c7455f1 fix a crash in ape decoding on x86_32 sse2
Originally committed as revision 20777 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-08 21:24:01 +00:00
Loren Merritt
a4605efdf5 slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe
Originally committed as revision 20743 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 17:53:11 +00:00
Loren Merritt
b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
Loren Merritt
b10fa1bb8b port ape dsp functions from sse2 to mmx
now requires yasm

Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-03 18:53:12 +00:00
Loren Merritt
b07781b6e4 fix linking on systems with a function name prefix (10l in r20287)
Originally committed as revision 20294 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 21:44:03 +00:00
Loren Merritt
e17ccf60fe huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:47:25 +00:00
Loren Merritt
2f77923d72 simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv

Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:10:10 +00:00
Loren Merritt
3daa434a40 ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.

Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-08 17:45:30 +00:00
Diego Biurrun
a6493a8fbd Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 09:12:42 +00:00