Zuxy Meng
663deb54af
Remove incorrect comment; MMX2 is preferred over 3DNow! on Athlon
Originally committed as revision 9079 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-20 05:07:44 +00:00
Zuxy Meng
038bfcf9d6
3DNow! and SSSE3 optimization to QNS DSP functions; use pmulhrw/pmulhrsw instead of pmulhw
Originally committed as revision 9053 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-18 08:18:56 +00:00
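Note on 038bfcf9d6: the sketch below is a rough Python model of a single 16-bit lane, not the committed assembly. It only illustrates why a rounding multiply can be preferable to the truncating pmulhw. The function names mirror the instruction mnemonics; to_int16 is a helper assumed for this sketch. pmulhrsw scales by 2^-15 rather than 2^-16, so code switching to it presumably has to adjust its constants.

```python
def to_int16(x):
    """Wrap an integer to signed 16 bits, as a packed-word lane would."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pmulhw(a, b):
    """MMX: upper 16 bits of the signed 32-bit product (truncates toward -inf)."""
    return to_int16((a * b) >> 16)


def pmulhrw(a, b):
    """3DNow!: add 0x8000 to the product before taking the upper 16 bits."""
    return to_int16((a * b + 0x8000) >> 16)


def pmulhrsw(a, b):
    """SSSE3: scale the product by 2^-15 with round-to-nearest."""
    return to_int16((((a * b) >> 14) + 1) >> 1)
```

With a = 3 and b = 0x4000, pmulhw yields 0 and pmulhrw yields 1; pmulhrsw yields 2 because of its different scale.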
Aurelien Jacobs
5b0b7054b4
better separation of vp3dsp functions from dsputil_mmx.c
Originally committed as revision 9039 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-16 23:23:45 +00:00
Ronald S. Bultje
b550bfaa61
Add libavcodec to compiler include flags in order to simplify header
include paths in the source files.
mostly from a patch by Ronald S. Bultje, rbultje ronald.bitfreak net
Originally committed as revision 9034 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-16 09:51:45 +00:00
Panagiotis Issaris
9b5dc86746
Make vp3dsp*.c compilation optional.
Originally committed as revision 9025 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-14 14:28:13 +00:00
Reimar Döffinger
e36d79c837
Change some leftover __attribute__((unused)) and __attribute__((used)) to
attribute_unused and attribute_used respectively to ease compiling on non-gcc.
Originally committed as revision 9024 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-14 14:07:50 +00:00
Zuxy Meng
25e4f8aaee
Faster SSE FFT/MDCT, patch by Zuxy Meng %zuxy P meng A gmail P com%
unrolls some loops, utilizing all 8 xmm registers. fft-test
shows ~10% speed up in (I)FFT and ~8% speed up in (I)MDCT on Dothan
Originally committed as revision 9017 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-13 16:32:32 +00:00
Loren Merritt
ff506a906e
sse2 & ssse3 versions of dct_quantize.
core2: mmx2=154 sse2=73 ssse3=66 (cycles)
k8: mmx2=179 sse2=149
p4: mmx2=284 sse2=194
Originally committed as revision 9003 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 05:55:09 +00:00
Loren Merritt
1edbfe1994
factor sum_abs_dctelem out of dct_sad, and simd it.
sum_abs_dctelem_* alone:
core2: c=186 mmx2=39 sse2=21 ssse3=13 (cycles)
k8: c=163 mmx2=33 sse2=31
p4: c=370 mmx2=60 sse2=60
dct_sad including sum_abs_dctelem_*:
core2: c=405 mmx2=258 sse2=240 ssse3=232
k8: c=624 mmx2=394 sse2=392
p4: c=849 mmx2=556 sse2=556
Originally committed as revision 9001 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 02:41:25 +00:00
Loren Merritt
561f940c03
sse2 & ssse3 versions of hadamard. unroll and inline diff_pixels.
core2: before mmx2=193 cycles. after mmx2=174 sse2=122 ssse3=115 (cycles).
k8: before mmx2=205. after mmx2=184 sse2=180.
p4: before mmx2=342. after mmx2=314 sse2=309.
Originally committed as revision 9000 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 01:16:06 +00:00
Loren Merritt
ba53071acb
10l, r8991 broke mmx1 sad
Originally committed as revision 8993 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 03:29:06 +00:00
Loren Merritt
72946825fa
sse2 version of fullpel sad.
16% faster on core2, 5% faster on p4. 10% slower (and thus disabled) on k8.
Originally committed as revision 8992 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 01:11:45 +00:00
Loren Merritt
164d75ebf3
tweak mmx2 sad.
40% faster on core2, 18% faster on k8, 5% faster on p4.
Originally committed as revision 8991 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 00:45:07 +00:00
Loren Merritt
eca3810e31
tweak mmx2 sad.
6% faster on core2 and k8, no change on p4.
Originally committed as revision 8984 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-10 22:24:19 +00:00
Loren Merritt
7c3a9fe2a3
sse2 version of fdct_col.
k8: 72->61 cycles, core2: 51->26 cycles.
Originally committed as revision 8966 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-10 03:13:41 +00:00
Loren Merritt
5adf43e47e
cosmetics: remove code duplication in hadamard8_diff_mmx
Originally committed as revision 8946 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-09 01:46:33 +00:00
Loren Merritt
bba5293bb7
cosmetics: remove duplicate transpose macro
Originally committed as revision 8939 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-08 17:55:56 +00:00
Reimar Döffinger
a1ce61108b
Fix parts missed in clip -> av_clip rename
Originally committed as revision 8760 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-04-19 16:12:06 +00:00
Diego Biurrun
fe0372296a
typos
Originally committed as revision 8642 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-04-07 14:10:02 +00:00
Loren Merritt
5900637219
mmx 16-bit ssd. 2.3x faster svq1 encoding.
Originally committed as revision 8559 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-03-30 19:15:31 +00:00
Diego Biurrun
d42f88025a
Fix wrong conditional, Snow decoding, not encoding, was SIMD-accelerated.
Originally committed as revision 8116 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-24 11:58:52 +00:00
Michael Niedermayer
58e31fb1d5
reorder a few more paddws to reduce dependency chains
chroma mc4 put 2480 -> 2460 dezicycles on duron
Originally committed as revision 8098 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:44:56 +00:00
Michael Niedermayer
b4fe97696c
reorder paddws to reduce dependency chain
put_h264_chroma_mc2_mmx2() 927 -> 902 dezicycles on duron
Originally committed as revision 8097 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:28:35 +00:00
Michael Niedermayer
0c67082e02
shortening dependency chain in chroma mc2
Originally committed as revision 8095 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:03:30 +00:00
Michael Niedermayer
af26516261
remove now wrong comment
Originally committed as revision 8094 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 14:29:59 +00:00
Michael Niedermayer
61240ae556
fix chroma mc2 bug, this is based on a patch by (Oleg Metelitsa oleg hitron co kr)
and does slow the mc2 chroma put down, avg interestingly seems unaffected speedwise on duron
this of course should rather be done in a way which doesn't slow it down but it's better a few %
slower but correct than incorrect
Originally committed as revision 8093 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 14:29:13 +00:00
Michael Niedermayer
470d2d03cc
gcc 2.95 fix
Originally committed as revision 8059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-22 00:04:36 +00:00
Måns Rullgård
459022f504
fix for x86-64
Originally committed as revision 8022 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-18 20:00:05 +00:00
Michael Niedermayer
b21e0b6dfc
rewrite H264_CHROMA_MC4_TMPL (20% faster)
Originally committed as revision 8012 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-17 23:43:02 +00:00
Michael Niedermayer
2a115873af
add a few asserts to ensure alignment
Originally committed as revision 7994 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-16 21:22:53 +00:00
Michael Niedermayer
00e210ddbb
prevent h.264 MC related functions from being inlined (yes this is much faster, the code just doesn't fit in the code cache otherwise)
Originally committed as revision 7993 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-16 21:21:07 +00:00
Reimar Döffinger
392b76ca93
Minor AMD64 compilation fix
Originally committed as revision 7907 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-10 13:33:08 +00:00
Michael Niedermayer
9bc0d3ef3e
maybe fix x86_64 (untested)
Originally committed as revision 7906 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-10 03:56:49 +00:00
Michael Niedermayer
7c4fd7eb0c
factor out common subexpression (gcc of course is too stupid to do this ...)
5% faster avg_h264_chroma_mc2_mmx2()
10% faster put_h264_chroma_mc2_mmx2()
Originally committed as revision 7898 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-09 12:37:38 +00:00
Michael Niedermayer
9301a0b4a9
merge asm fragments in H264_CHROMA_MC2_TMPL()
10% faster avg_h264_chroma_mc2_mmx2()
5% faster put_h264_chroma_mc2_mmx2()
Originally committed as revision 7897 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-09 12:24:22 +00:00
Panagiotis Issaris
9dd6c80453
Add the const specifier as needed to reduce the number of warnings.
Originally committed as revision 7764 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-01-30 10:31:34 +00:00
Diego Biurrun
9688979c51
Fix some more license headers.
Originally committed as revision 7637 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-01-22 01:16:42 +00:00
Guillaume Poirier
5a5c770d5a
Add SSSE3 (Core2 aka Conroe/Merom/Woodcrest new instructions) detection
Originally committed as revision 7332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-12-18 22:43:09 +00:00
Måns Rullgård
849f10351d
rename always_inline to av_always_inline and move to common.h
Originally committed as revision 7256 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-12-08 00:35:08 +00:00
Måns Rullgård
486497e07b
revert bad checkin
Originally committed as revision 7044 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-14 03:18:09 +00:00
Måns Rullgård
be6ed6fff4
move some CFLAGS settings away from config.* writing section
Originally committed as revision 7043 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-14 03:12:29 +00:00
Måns Rullgård
7466ed2f02
zigzag_direct_noperm doesn't exist, remove declaration
Originally committed as revision 6998 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 23:35:49 +00:00
Måns Rullgård
36cd306907
rename inverse -> ff_inverse
Originally committed as revision 6990 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 18:49:36 +00:00
Måns Rullgård
bb54f6ab62
adding more static keywords
Originally committed as revision 6976 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 03:34:12 +00:00
Michael Niedermayer
079e61db5d
ensure alignment (no speed change)
Originally committed as revision 6891 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 16:54:05 +00:00
Michael Niedermayer
f5a9e8f33d
merging mov & and (no speed change)
Originally committed as revision 6889 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 16:02:18 +00:00
Michael Niedermayer
e80cf125a7
2 instructions less (same speed)
Originally committed as revision 6888 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 15:40:57 +00:00
Michael Niedermayer
9347118237
comment about failed optimization
Originally committed as revision 6887 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 15:17:36 +00:00
Michael Niedermayer
38cfdc83f0
move luma tc0 related init into asm
5% faster filter_mb_fast() on P3
Originally committed as revision 6884 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 14:28:30 +00:00
Michael Niedermayer
25225c3773
2 instructions less in h264_loop_filter_luma_mmx2()
Originally committed as revision 6882 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 12:07:53 +00:00
Michael Niedermayer
bda2203d56
preempt possible overflow
Originally committed as revision 6881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 11:07:35 +00:00
Michael Niedermayer
5a1553dee3
1 instruction less
Originally committed as revision 6880 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 09:59:15 +00:00
Michael Niedermayer
e9f1885c21
optimize H264_DEBLOCK_P0_Q0
2.5% faster filter_mb_fast() on P3
Originally committed as revision 6877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 02:03:56 +00:00
Diego Biurrun
7c428ea681
Put libmpeg2 IDCT functions under CONFIG_GPL, fixes link failure
with --disable-opts.
Originally committed as revision 6691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-14 17:04:50 +00:00
Diego Biurrun
c26abfa541
Rename ABS macro to FFABS.
Originally committed as revision 6666 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 23:17:58 +00:00
Diego Biurrun
b78e7197a8
Change license headers to say 'FFmpeg' instead of 'this program/this library'
and fix GPL/LGPL version mismatches.
Originally committed as revision 6577 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 15:30:46 +00:00
Diego Biurrun
ade6e7f3ae
Compilation fix, printf gets redefined to please_use_av_log.
Originally committed as revision 6574 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 11:30:24 +00:00
Diego Biurrun
0eb59ddba4
Switch idct_mmx_xvid.c from GPL to LGPL as permitted by the
author, Peter Ross (pross xvid org).
Originally committed as revision 6557 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-05 00:23:24 +00:00
Loren Merritt
2833fc4646
approximate qpel functions: sacrifice some quality for some decoding speed. enabled on B-frames with -lavdopts fast.
Originally committed as revision 6412 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-01 21:25:17 +00:00
Måns Rullgård
62bb489b13
add some #ifdef CONFIG_ENCODERS/DECODERS
Originally committed as revision 6356 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-27 19:54:07 +00:00
Loren Merritt
a4eb118a41
cosmetics (indentation)
Originally committed as revision 6313 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 17:43:09 +00:00
Loren Merritt
f469094c9b
tweak ff_imdct_calc_3dn2
Originally committed as revision 6312 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 17:42:23 +00:00
Loren Merritt
ebbafcb454
sse implementation of imdct.
patch mostly by Zuxy Meng (zuxy dot meng at gmail dot com)
Originally committed as revision 6311 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 16:37:39 +00:00
Luca Barbato
99aed7c8fc
New single instruction math operation header
Originally committed as revision 6291 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-19 22:22:29 +00:00
Aurelien Jacobs
2a2311bee3
disable vp3 mmx idct for theora files to avoid artifacts
(see theora-a4_v6-k250-s0_2.ogg)
Originally committed as revision 6253 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-14 22:13:23 +00:00
Diego Biurrun
7f889a76ad
Remove the LGPL exception clause as discussed on ffmpeg-devel
and move the dependent code under CONFIG_GPL.
Originally committed as revision 6248 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-14 00:38:03 +00:00
Aurelien Jacobs
1dac8fea05
Re-enable the mmx/sse optimized version of the vp3 idct.
It generates a different md5sum than the reference C implementation,
but no visual difference, so it is enabled only when bitexact is not set.
Originally committed as revision 6241 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-12 20:58:17 +00:00
Diego Biurrun
04d7f60143
Add official LGPL license headers to the files that were missing them.
Originally committed as revision 6219 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-10 14:02:42 +00:00
Måns Rullgård
0e176c3eb5
remove redundant declarations
Originally committed as revision 6153 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-02 23:10:28 +00:00
Loren Merritt
3e20143ee7
mmx implementation of deblocking strength decision.
2-3% faster h264.
Originally committed as revision 6113 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-28 09:33:01 +00:00
Loren Merritt
1e4ecf26f5
ff_fft_calc_3dn/3dn2/sse: convert intrinsics to inline asm.
2.5% faster fft, 0.5% faster vorbis.
Originally committed as revision 6023 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-18 23:53:49 +00:00
Michael Niedermayer
cf5aed5bad
simplify
Originally committed as revision 6020 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-18 10:43:23 +00:00
Michael Niedermayer
3829a62eae
insufficient alignment
Originally committed as revision 6006 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-17 07:49:22 +00:00
Marco Manfredini
6bb9e49249
Fix building with --disable-opts but MMX enabled.
patch by Marco Manfredini mldb %at% gmx %dot% net
Originally committed as revision 5994 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-13 10:10:06 +00:00
John Dalgliesh
4454dc1b6f
Support for MacIntel, last part: balign directives
Determines whether .align's arg is power-of-two or not, then defines ASMALIGN appropriately in config.h. Changes all .baligns to ASMALIGNs.
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 11, 2006 8:00 AM
Subject: Re: [Ffmpeg-devel] Mac OS X Intel last part: balign directives
Originally committed as revision 5990 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-12 16:37:31 +00:00
Loren Merritt
069720565c
vorbis simd tweaks
Originally committed as revision 5983 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-11 18:19:37 +00:00
Michael Niedermayer
1f1aa1d955
convert vector_fmul_reverse_sse2 and vector_fmul_add_add_sse2 to sse
please complain if they are slower on sse2 cpus ...
Originally committed as revision 5976 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 20:24:58 +00:00
Loren Merritt
eb4825b5d4
sse and 3dnow implementations of float->int conversion and mdct windowing.
15% faster vorbis.
Originally committed as revision 5975 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 19:06:25 +00:00
Luca Barbato
ffad4ed154
Fix x86 SIMD asm and pic, patch from Martin von Gagern <Martin.vGagern@gmx.net>
Originally committed as revision 5973 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 16:05:29 +00:00
John Dalgliesh
347be47226
Support for MacIntel, take xx: '/nop' illegal for old versions of GAS
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 8, 2006 8:12 PM
Subject: Re: [Ffmpeg-devel] [PATCH] '/nop' illegal for old versions of GAS
Originally committed as revision 5972 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 15:26:18 +00:00
John Dalgliesh
0fc256f3d9
Add support for Mac OS X Intel part 2: Assembler macros in fdct_mmx.c
convert gas macros to cpp macros
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 10, 2006 5:39 AM
Subject: Re: [Ffmpeg-devel] Mac OS X Intel part 2: Assembler macros in fdct_mmx.c
Originally committed as revision 5971 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 11:29:57 +00:00
John Dalgliesh
fc48b6fe74
Support for Mac OS X Intel, part 3: binary integer constants:
Apple's assembler only understands the same integer constants as C does: hex, decimal, octal. It doesn't understand binary integer constants (0b...) so this patch replaces binary integer constants with hex ones.
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 10, 2006 8:16 AM
Subject: [Ffmpeg-devel] Mac OS X Intel part 3: binary integer constants
Originally committed as revision 5970 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 09:06:06 +00:00
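Note on fc48b6fe74: the kind of rewrite described there (binary integer constants replaced with hex ones) could be sketched as below. This is an illustration only, not the tool or method used for the patch; the function name binary_to_hex and the sample pshufw operand in the usage note are assumptions made up for this example.

```python
import re

# Matches a standalone binary literal such as 0b01001110 (GAS-style).
_BINARY_LITERAL = re.compile(r"\b0[bB]([01]+)\b")


def binary_to_hex(source):
    """Rewrite every 0b... integer constant in `source` as a hex constant."""
    def repl(match):
        # Parse the binary digits and re-emit the same value in hex.
        return hex(int(match.group(1), 2))
    return _BINARY_LITERAL.sub(repl, source)
```

For instance, binary_to_hex("pshufw $0b01001110, %mm0, %mm1") returns "pshufw $0x4e, %mm0, %mm1", while an existing hex constant such as 0x0b1 is left untouched.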
Loren Merritt
ee5df92750
emms -> femms
Originally committed as revision 5965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-09 06:37:25 +00:00
Loren Merritt
2494bdd90d
gcc 2.95 and 3.4.x on x86 32bit without fomit-frame-pointer can't even find 5 registers for asm input.
0.5% slower vorbis.
Originally committed as revision 5964 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-09 06:33:49 +00:00
Loren Merritt
1b87c40245
slightly faster ff_imdct_calc_3dn2() on amd64. (gcc added a bunch of useless movsxd)
Originally committed as revision 5962 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 21:47:11 +00:00
Michael Niedermayer
21bb884fb7
change vorbis_inverse_coupling_sse2() so it works on sse1 cpus
Originally committed as revision 5957 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 12:03:51 +00:00
Loren Merritt
bcfa3e58ee
3dnow2 implementation of imdct.
6% faster vorbis and wma.
Originally committed as revision 5954 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 04:01:04 +00:00
Loren Merritt
cd035a6051
10l, vorbis_inverse_coupling_sse() was really 3dnow
Originally committed as revision 5903 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-03 07:09:29 +00:00
Loren Merritt
2dac4acfc0
sse & sse2 implementations of vorbis channel coupling.
9% faster vorbis (on a K8).
Originally committed as revision 5898 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-03 03:18:47 +00:00
Stefan Gehrer
595e7bd940
some MMX optimizations for the CAVS decoder
Originally committed as revision 5846 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-29 08:45:33 +00:00
Michael Niedermayer
5ced7b80ad
disable the vp3 mmx and sse2 idcts, their output doesn't match the c idct (tested with -f crc) and the theora spec does not allow different idcts, not to mention the difference is quite visible ...
Originally committed as revision 5788 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-19 09:49:21 +00:00
Måns Rullgård
98d417cbcd
#define SBUTTERFLY outside CONFIG_ENCODERS
Originally committed as revision 5628 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-05 19:31:01 +00:00
Luca Abeni
9c39071d6d
Move REG_* macros from libavcodec/i386/mmx.h to libavutil/x86_cpu.h
Originally committed as revision 5595 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-03 10:52:07 +00:00
Måns Rullgård
3f8674a902
remove redundant macro definitions
Originally committed as revision 5589 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-02 22:01:31 +00:00
Måns Rullgård
8fb0d07339
kill warning
Originally committed as revision 5588 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-02 21:53:30 +00:00
Michael Niedermayer
e27b6e62f7
mismatch control for mpeg2 intra dequantization if bitexact=1
Originally committed as revision 5328 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-28 17:03:52 +00:00
Zuxy Meng
392f6da897
Remove unused and unsupported Cyrix's "Extended MMX",
Add SSE3 support.
Patch by Zuxy Meng < zuxy POIS meng AH gmail POIS com >
Original thread:
04/26/06 13:13:
[Ffmpeg-devel] [PATCH] Bug fix, SSE3 support in i386/cputest.c and dsputil.h
Originally committed as revision 5326 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-28 07:46:13 +00:00
Wolfram Gloger
f42635f558
gcc-2.95 compile fix, patch by Wolfram Gloger <wmglo A dent PIS med PIS uni-muenchen PIS de>
Originally committed as revision 5298 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-18 03:48:30 +00:00
Loren Merritt
75ca1a5f70
gmc_mmx tweaks
Originally committed as revision 5269 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-05 04:13:41 +00:00
Loren Merritt
703c8195a8
mmx implementation of 3-point GMC. (5x faster than C)
Originally committed as revision 5265 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-04 09:23:45 +00:00
Luca Barbato
22b48b85b6
altivec support for snow
Originally committed as revision 5228 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-27 12:51:19 +00:00
Loren Merritt
5e8b787afa
simplified and slightly faster h264_chroma_mc8_mmx
Originally committed as revision 5214 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-25 08:41:14 +00:00
Loren Merritt
513fbd8e5a
prefetch pixels for future motion compensation. 2-5% faster h264.
Originally committed as revision 5203 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-23 20:16:36 +00:00
Loren Merritt
5e6a5c4daf
10l
Originally committed as revision 5201 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 22:08:28 +00:00
Loren Merritt
fdd3057981
added mmx implementation of h264_chroma_mc2
Originally committed as revision 5200 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 22:05:00 +00:00
Robert Edele
e8600e5edc
add MMX and SSE versions of ff_snow_inner_add_yblock
Patch by Robert Edele < yartrebo AH earthlink POIS net >
Original Thread:
Date: Mar 22, 2006 3:24 AM
Subject: [Ffmpeg-devel] [PATCH] snow mmx + sse2 part 5
Originally committed as revision 5197 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 12:08:35 +00:00
Robert Edele
2c9a0285d4
snow mmx+sse2 optimizations, part 4
Patch by Robert Edele, yartrebo <<at>> earthlink <<dot>> net
Originally committed as revision 5191 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-21 21:51:07 +00:00
Robert Edele
4567b4bdab
Add the mmx and sse2 implementations of ff_snow_vertical_compose().
Patch by Robert Edele < yartrebo AH earthlink POIS net >
Original thread:
Date: Mar 20, 2006 5:54 PM
Subject: [Ffmpeg-devel] [PATCH] snow mmx + sse2 part 3
Originally committed as revision 5185 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-20 22:27:59 +00:00
Robert Edele
059715a41c
First part of a series of speed-enhancing patches.
This one sets up a snow.h and makes snow use the dsputil function pointer
framework to access the three functions that will be implemented in asm
in the other parts of the patchset.
Patch by Robert Edele < yartrebo AH earthlink POIS net>
Original thread:
Subject: [Ffmpeg-devel] [PATCH] Snow mmx+sse2 asm optimizations
Date: Sun, 05 Feb 2006 12:47:14 -0500
Originally committed as revision 5172 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-16 19:18:18 +00:00
Zuxy Meng
82eb4b0f1b
3DNow! & Extended 3DNow! versions of FFT
Patch by Zuxy Meng, zuxy <<dot>> meng >>at<< gmail <<dot>> com
Minor non-functional diff-related fixes by me.
Originally committed as revision 5125 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-08 04:13:55 +00:00
Loren Merritt
548a1c8a35
h264_idct8_add_mmx
Originally committed as revision 5123 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:45:56 +00:00
Loren Merritt
6da971f160
h264_idct_add only needs mmx1
Originally committed as revision 5122 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:33:32 +00:00
Zuxy Meng
2ffb22d2ad
use xorps instead of mulps to toggle the sign of a float, as suggested by Software Optimization Guide for AMD64 Processors.
Patch by Zuxy Meng < zuxy POIS meng AH gmail POIS com > OKed by Michael
Original thread:
Date: Mar 5, 2006 8:15 PM
Subject: [Ffmpeg-devel] [PATCH] Little optimization to fft_sse.c
Originally committed as revision 5112 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 20:25:18 +00:00
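Note on 2ffb22d2ad: toggling the sign of an IEEE-754 single-precision float by XOR-ing its sign bit, which is what xorps with a sign mask does per lane, can be modelled as below. This is a minimal sketch in Python rather than the committed SSE code; SIGN_MASK and toggle_sign are names assumed for this example.

```python
import struct

# Bit 31 is the sign bit of an IEEE-754 single-precision float.
SIGN_MASK = 0x80000000


def toggle_sign(x):
    """Flip the sign of x by XOR-ing the sign bit of its 32-bit encoding."""
    # Reinterpret the float as a 32-bit unsigned integer.
    bits, = struct.unpack("<I", struct.pack("<f", x))
    # XOR with the mask and reinterpret the result as a float again.
    flipped, = struct.unpack("<f", struct.pack("<I", bits ^ SIGN_MASK))
    return flipped
```

The result matches multiplying by -1.0 for ordinary values (toggle_sign(1.5) gives -1.5), but it needs only a bitwise operation instead of a floating-point multiply.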
Loren Merritt
d84f7c61ee
gcc2.95 workaround
Originally committed as revision 5111 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 19:02:35 +00:00
Loren Merritt
7a5b2fa812
remove some useless instructions
Originally committed as revision 5109 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-04 19:56:01 +00:00
Loren Merritt
6a8eb0f45a
4% faster h264_qpel_mc
Originally committed as revision 5094 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-02 08:21:08 +00:00
Loren Merritt
ef9d1d1575
h264: special case dc-only idct. ~1% faster overall
Originally committed as revision 4971 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-10 06:55:25 +00:00
Loren Merritt
4e295993ba
10l in 1.12
Originally committed as revision 4965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-09 02:43:23 +00:00
Loren Merritt
6ee669732d
10l (x86_64)
Originally committed as revision 4952 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 16:10:48 +00:00
Loren Merritt
e545f37527
18% faster put_h264_qpel16_mc[13]2_mmx2
Originally committed as revision 4951 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 10:52:25 +00:00
Loren Merritt
c03ce51dfb
11% faster put_h264_qpel16_v_lowpass_mmx2
Originally committed as revision 4950 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 07:35:03 +00:00
Loren Merritt
0331f09237
15% faster put_h264_qpel16_hv_lowpass_mmx2
Originally committed as revision 4949 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 06:25:14 +00:00
Steve L'Homme
68b51e58ce
MSVC-compatible __align8/__align16 declaration
patch by Steve Lhomme, steve .dot. lhomme .at. free .dot. fr
Originally committed as revision 4942 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-05 13:35:17 +00:00
Diego Biurrun
5509bffa88
Update licensing information: The FSF changed postal address.
Originally committed as revision 4842 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-12 22:43:26 +00:00
Loren Merritt
e8b562087d
tweak h264_biweight
Originally committed as revision 4835 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-09 03:38:37 +00:00
Loren Merritt
cec9395977
fix some potential arithmetic overflows in pred_direct_motion() and
ff_h264_weight_WxH_mmx2().
Originally committed as revision 4795 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-30 23:47:41 +00:00
Diego Biurrun
bb270c0896
COSMETICS: tabs --> spaces, some prettyprinting
Originally committed as revision 4764 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-22 01:10:11 +00:00
Diego Biurrun
115329f160
COSMETICS: Remove all trailing whitespace.
Originally committed as revision 4749 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-17 18:14:38 +00:00
Guillaume Poirier
f6d1338cb5
Add the rest of missing Reg_* macros to support both AMD-64 style regs and IA32 regs.
Not used yet, but should be once the SIMD code to accelerate Snow decoding is merged.
Originally committed as revision 4731 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-10 22:53:44 +00:00
Loren Merritt
ea15df8048
use sse16_sse2() in nsse
Originally committed as revision 4688 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-11-12 05:23:25 +00:00
Loren Merritt
a6624e21cb
faster h264_chroma_mc8_mmx, added h264_chroma_mc4_mmx.
2-4% overall speedup.
Originally committed as revision 4666 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-27 06:45:29 +00:00
Loren Merritt
b926572aa9
h264 mmx weighted prediction. up to 3% overall speedup.
Originally committed as revision 4630 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-09 23:38:52 +00:00
Loren Merritt
5693c08356
sse2 16x16 sum squared diff (306=>268 cycles on a K8)
faster 8x8 mmx ssd (77=>70 cycles)
Originally committed as revision 4623 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-30 02:31:47 +00:00
Michael Niedermayer
12e9668119
replace a few mov + psrlq with pshufw, there are more cases which could benefit from this but they would require us to duplicate some functions ...
the trick is from various places (my own code in libpostproc, a patch on the x264 list, ...)
Originally committed as revision 4608 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-21 21:17:09 +00:00
Reimar Döffinger
cd7af76d9e
Fix compile without CONFIG_GPL, misplaced #endif caused a missing }.
Originally committed as revision 4575 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:30:40 +00:00
Michael Niedermayer
9f211bc6d7
remove unused table entries
change non portable table access
Originally committed as revision 4574 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:03:37 +00:00
Michael Niedermayer
84740d5980
xvids mmx&mmx2 idcts
needed to decode xvid without some minor artefacts
under #ifdef CONFIG_GPL of course
Originally committed as revision 4572 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 17:01:30 +00:00
Måns Rullgård
79396ac685
Kill some compiler warnings. Compiled code verified identical after changes.
Originally committed as revision 4567 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-06 21:25:35 +00:00
Michael Niedermayer
d3a9f79871
simplify (d&a) and (d&~a) calculation, hint by skal
Originally committed as revision 4552 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-03 09:17:30 +00:00
Michael Niedermayer
b5b65df7a9
add consts (this was in my local tree, dunno where it came from, probably forgotten from some const patch)
Originally committed as revision 4551 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-02 21:13:19 +00:00
Måns Rullgård
bf4e3bd2d0
kill a bunch of compiler warnings
Originally committed as revision 4522 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-14 15:42:40 +00:00
Alexander Strasser
c11c2bc20b
libavutil: Utility code from libavcodec moved to a separate library.
Originally committed as revision 4489 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-01 20:07:05 +00:00
Loren Merritt
d2bb7db135
sort H.264 mmx dsp functions into their own file
Originally committed as revision 4338 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-02 20:45:35 +00:00
Michael Niedermayer
c26ae41db2
adding a few const
Originally committed as revision 4337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 21:19:00 +00:00
Michael Niedermayer
435b0720a8
100l for myself (breaking amd64)
Originally committed as revision 4336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 18:04:01 +00:00
Michael Niedermayer
6510f43cf3
merge a few asm blocks so gcc can't unoptimize it (658->631 dezicycles on duron)
Originally committed as revision 4334 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:56:58 +00:00
Michael Niedermayer
987ae784e6
get rid of 2 movq (680 -> 658 dezicycles on duron)
Originally committed as revision 4333 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:36:32 +00:00
Michael Niedermayer
e4b36d4434
avoid one transpose (730->680 dezicycles on duron)
Originally committed as revision 4332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 08:43:40 +00:00
Loren Merritt
85bbfcd4ee
10l (symbol mangling)
Originally committed as revision 4331 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 04:51:46 +00:00
Michael Niedermayer
1f3dbc09b1
add rounding bias before the horizontal idct (765->730 dezicycles on duron)
Originally committed as revision 4330 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 01:18:41 +00:00