Commit Graph

442 Commits

Author SHA1 Message Date
Zuxy Meng
663deb54af Remove incorrect comment; MMX2 is preferred over 3DNow! on Athlon
Originally committed as revision 9079 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-20 05:07:44 +00:00
Zuxy Meng
038bfcf9d6 3DNow! and SSSE3 optimization to QNS DSP functions; use pmulhrw/pmulhrsw instead of pmulhw
Originally committed as revision 9053 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-18 08:18:56 +00:00
Aurelien Jacobs
5b0b7054b4 better separation of vp3dsp functions from dsputil_mmx.c
Originally committed as revision 9039 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-16 23:23:45 +00:00
Ronald S. Bultje
b550bfaa61 Add libavcodec to compiler include flags in order to simplify header
include paths in the source files.
mostly from a patch by Ronald S. Bultje, rbultje ronald.bitfreak net

Originally committed as revision 9034 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-16 09:51:45 +00:00
Panagiotis Issaris
9b5dc86746 Make vp3dsp*.c compilation optional.
Originally committed as revision 9025 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-14 14:28:13 +00:00
Reimar Döffinger
e36d79c837 Change some leftover __attribute__((unused)) and __attribute__((used)) to
attribute_unused and attribute_used respectively to ease compiling on non-gcc.

Originally committed as revision 9024 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-14 14:07:50 +00:00
Zuxy Meng
25e4f8aaee Faster SSE FFT/MDCT, patch by Zuxy Meng %zuxy P meng A gmail P com%
unrolls some loops, utilizing all 8 xmm registers. fft-test
shows ~10% speed up in (I)FFT and ~8% speed up in (I)MDCT on Dothan

Originally committed as revision 9017 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-13 16:32:32 +00:00
Loren Merritt
ff506a906e sse2 & ssse3 versions of dct_quantize.
core2: mmx2=154 sse2=73 ssse3=66 (cycles)
k8: mmx2=179 sse2=149
p4: mmx2=284 sse2=194

Originally committed as revision 9003 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 05:55:09 +00:00
Loren Merritt
1edbfe1994 factor sum_abs_dctelem out of dct_sad, and simd it.
sum_abs_dctelem_* alone:
core2: c=186 mmx2=39 sse2=21 ssse3=13 (cycles)
k8: c=163 mmx2=33 sse2=31
p4: c=370 mmx2=60 sse2=60
 dct_sad including sum_abs_dctelem_*:
core2: c=405 mmx2=258 sse2=240 ssse3=232
k8: c=624 mmx2=394 sse2=392
p4: c=849 mmx2=556 sse2=556

Originally committed as revision 9001 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 02:41:25 +00:00
Loren Merritt
561f940c03 sse2 & ssse3 versions of hadamard. unroll and inline diff_pixels.
core2: before mmx2=193 cycles. after mmx2=174 sse2=122 ssse3=115 (cycles).
k8: before mmx2=205. after mmx2=184 sse2=180.
p4: before mmx2=342. after mmx2=314 sse2=309.

Originally committed as revision 9000 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-12 01:16:06 +00:00
Loren Merritt
ba53071acb 10l, r8991 broke mmx1 sad
Originally committed as revision 8993 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 03:29:06 +00:00
Loren Merritt
72946825fa sse2 version of fullpel sad.
16% faster on core2, 5% faster on p4. 10% slower (and thus disabled) on k8.

Originally committed as revision 8992 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 01:11:45 +00:00
Loren Merritt
164d75ebf3 tweak mmx2 sad.
40% faster on core2, 18% faster on k8, 5% faster on p4.

Originally committed as revision 8991 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-11 00:45:07 +00:00
Loren Merritt
eca3810e31 tweak mmx2 sad.
6% faster on core2 and k8, no change on p4.

Originally committed as revision 8984 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-10 22:24:19 +00:00
Loren Merritt
7c3a9fe2a3 sse2 version of fdct_col.
k8: 72->61 cycles, core2: 51->26 cycles.

Originally committed as revision 8966 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-10 03:13:41 +00:00
Loren Merritt
5adf43e47e cosmetics: remove code duplication in hadamard8_diff_mmx
Originally committed as revision 8946 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-09 01:46:33 +00:00
Loren Merritt
bba5293bb7 cosmetics: remove duplicate transpose macro
Originally committed as revision 8939 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-05-08 17:55:56 +00:00
Reimar Döffinger
a1ce61108b Fix parts missed in clip -> av_clip rename
Originally committed as revision 8760 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-04-19 16:12:06 +00:00
Diego Biurrun
fe0372296a typos
Originally committed as revision 8642 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-04-07 14:10:02 +00:00
Loren Merritt
5900637219 mmx 16-bit ssd. 2.3x faster svq1 encoding.
Originally committed as revision 8559 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-03-30 19:15:31 +00:00
Diego Biurrun
d42f88025a Fix wrong conditional, Snow decoding, not encoding, was SIMD-accelerated.
Originally committed as revision 8116 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-24 11:58:52 +00:00
Michael Niedermayer
58e31fb1d5 reorder a few more paddws to reduce dependancy chains
chroma mc4 put 2480 -> 2460 dezicyles on duron

Originally committed as revision 8098 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:44:56 +00:00
Michael Niedermayer
b4fe97696c reorder paddws to reduce dependancy chain
put_h264_chroma_mc2_mmx2() 927 -> 902 dezicyles on duron

Originally committed as revision 8097 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:28:35 +00:00
Michael Niedermayer
0c67082e02 shortening dependancy chain in chroma mc2
Originally committed as revision 8095 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 15:03:30 +00:00
Michael Niedermayer
af26516261 remove now wrong comment
Originally committed as revision 8094 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 14:29:59 +00:00
Michael Niedermayer
61240ae556 fix chroma mc2 bug, this is based on a patch by (Oleg Metelitsa oleg hitron co kr)
and does slow the mc2 chroma put down, avg interrestingly seems unaffected speedwise on duron
this of course should be rather done in a way which doesnt slow it down but its better a few %
slower but correct then incorrect

Originally committed as revision 8093 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-23 14:29:13 +00:00
Michael Niedermayer
470d2d03cc gcc 2.95 fix
Originally committed as revision 8059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-22 00:04:36 +00:00
Måns Rullgård
459022f504 fix for x86-64
Originally committed as revision 8022 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-18 20:00:05 +00:00
Michael Niedermayer
b21e0b6dfc rewrite H264_CHROMA_MC4_TMPL (20% faster)
Originally committed as revision 8012 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-17 23:43:02 +00:00
Michael Niedermayer
2a115873af add a few asserts to ensure alignment
Originally committed as revision 7994 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-16 21:22:53 +00:00
Michael Niedermayer
00e210ddbb prevent h.264 MC related functions from being inlined (yes this is much faster the code just doesnt fit in the code cache otherwise)
Originally committed as revision 7993 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-16 21:21:07 +00:00
Reimar Döffinger
392b76ca93 Minor AMD64 compilation fix
Originally committed as revision 7907 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-10 13:33:08 +00:00
Michael Niedermayer
9bc0d3ef3e maybe fix x86_64 (untested)
Originally committed as revision 7906 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-10 03:56:49 +00:00
Michael Niedermayer
7c4fd7eb0c factor out common subexprssion (gcc of course is too stupid to do this ...)
5% faster avg_h264_chroma_mc2_mmx2()
10% faster put_h264_chroma_mc2_mmx2()

Originally committed as revision 7898 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-09 12:37:38 +00:00
Michael Niedermayer
9301a0b4a9 merge asm fragments in H264_CHROMA_MC2_TMPL()
10% faster avg_h264_chroma_mc2_mmx2()
5% faster put_h264_chroma_mc2_mmx2()

Originally committed as revision 7897 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-02-09 12:24:22 +00:00
Panagiotis Issaris
9dd6c80453 Add the const specifier as needed to reduce the number of warnings.
Originally committed as revision 7764 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-01-30 10:31:34 +00:00
Diego Biurrun
9688979c51 Fix some more license headers.
Originally committed as revision 7637 to svn://svn.ffmpeg.org/ffmpeg/trunk
2007-01-22 01:16:42 +00:00
Guillaume Poirier
5a5c770d5a Add SSSE3 (Core2 aka Conroe/Merom/Woodcrester new instructions) detection
Originally committed as revision 7332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-12-18 22:43:09 +00:00
Måns Rullgård
849f10351d rename always_inline to av_always_inline and move to common.h
Originally committed as revision 7256 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-12-08 00:35:08 +00:00
Måns Rullgård
486497e07b revert bad checkin
Originally committed as revision 7044 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-14 03:18:09 +00:00
Måns Rullgård
be6ed6fff4 move some CFLAGS settings away from config.* writing section
Originally committed as revision 7043 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-14 03:12:29 +00:00
Måns Rullgård
7466ed2f02 zigzag_direct_noperm doesn't exist, remove declaration
Originally committed as revision 6998 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 23:35:49 +00:00
Måns Rullgård
36cd306907 rename inverse -> ff_inverse
Originally committed as revision 6990 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 18:49:36 +00:00
Måns Rullgård
bb54f6ab62 adding more static keywords
Originally committed as revision 6976 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-12 03:34:12 +00:00
Michael Niedermayer
079e61db5d ensure alignment (no speed change)
Originally committed as revision 6891 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 16:54:05 +00:00
Michael Niedermayer
f5a9e8f33d merging mov & and (no speedchange)
Originally committed as revision 6889 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 16:02:18 +00:00
Michael Niedermayer
e80cf125a7 2 instructions less (same speed)
Originally committed as revision 6888 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 15:40:57 +00:00
Michael Niedermayer
9347118237 comment about failed optimization
Originally committed as revision 6887 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 15:17:36 +00:00
Michael Niedermayer
38cfdc83f0 move luma tc0 related init into asm
5% faster filter_mb_fast() on P3

Originally committed as revision 6884 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 14:28:30 +00:00
Michael Niedermayer
25225c3773 2 instructions less in h264_loop_filter_luma_mmx2()
Originally committed as revision 6882 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 12:07:53 +00:00
Michael Niedermayer
bda2203d56 preempt possible overflow
Originally committed as revision 6881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 11:07:35 +00:00
Michael Niedermayer
5a1553dee3 1 instruction less
Originally committed as revision 6880 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 09:59:15 +00:00
Michael Niedermayer
e9f1885c21 optimize H264_DEBLOCK_P0_Q0
2.5% faster filter_mb_fast() on P3

Originally committed as revision 6877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-11-03 02:03:56 +00:00
Diego Biurrun
7c428ea681 Put libmpeg2 IDCT functions under CONFIG_GPL, fixes link failure
with --disable-opts.

Originally committed as revision 6691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-14 17:04:50 +00:00
Diego Biurrun
c26abfa541 Rename ABS macro to FFABS.
Originally committed as revision 6666 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-11 23:17:58 +00:00
Diego Biurrun
b78e7197a8 Change license headers to say 'FFmpeg' instead of 'this program/this library'
and fix GPL/LGPL version mismatches.

Originally committed as revision 6577 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 15:30:46 +00:00
Diego Biurrun
ade6e7f3ae Compilation fix, printf gets redefined to please_use_av_log.
Originally committed as revision 6574 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-07 11:30:24 +00:00
Diego Biurrun
0eb59ddba4 Switch idct_mmx_xvid.c from GPL to LGPL as permitted by the
author, Peter Ross (pross xvid org).

Originally committed as revision 6557 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-05 00:23:24 +00:00
Loren Merritt
2833fc4646 approximate qpel functions: sacrifice some quality for some decoding speed. enabled on B-frames with -lavdopts fast.
Originally committed as revision 6412 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-10-01 21:25:17 +00:00
Måns Rullgård
62bb489b13 add some #ifdef CONFIG_ENCODERS/DECODERS
Originally committed as revision 6356 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-27 19:54:07 +00:00
Loren Merritt
a4eb118a41 cosmetics (indentation)
Originally committed as revision 6313 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 17:43:09 +00:00
Loren Merritt
f469094c9b tweak ff_imdct_calc_3dn2
Originally committed as revision 6312 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 17:42:23 +00:00
Loren Merritt
ebbafcb454 sse implementation of imdct.
patch mostly by Zuxy Meng (zuxy dot meng at gmail dot com)

Originally committed as revision 6311 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-21 16:37:39 +00:00
Luca Barbato
99aed7c8fc New single instruction math operation header
Originally committed as revision 6291 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-19 22:22:29 +00:00
Aurelien Jacobs
2a2311bee3 disable vp3 mmx idct for theora files to avoid artifacts
(see theora-a4_v6-k250-s0_2.ogg)

Originally committed as revision 6253 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-14 22:13:23 +00:00
Diego Biurrun
7f889a76ad Remove the LGPL exception clause as discussed on ffmpeg-devel
and move the dependent code under CONFIG_GPL.

Originally committed as revision 6248 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-14 00:38:03 +00:00
Aurelien Jacobs
1dac8fea05 Enables back the mmx/sse optimized version of the vp3 idct.
It generates different md5sum than the reference C implementation,
but no visual difference, so enabled only when bitexact is not set.

Originally committed as revision 6241 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-12 20:58:17 +00:00
Diego Biurrun
04d7f60143 Add official LGPL license headers to the files that were missing them.
Originally committed as revision 6219 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-10 14:02:42 +00:00
Måns Rullgård
0e176c3eb5 remove redundant declarations
Originally committed as revision 6153 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-09-02 23:10:28 +00:00
Loren Merritt
3e20143ee7 mmx implementation of deblocking strength decision.
2-3% faster h264.

Originally committed as revision 6113 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-28 09:33:01 +00:00
Loren Merritt
1e4ecf26f5 ff_fft_calc_3dn/3dn2/sse: convert intrinsics to inline asm.
2.5% faster fft, 0.5% faster vorbis.

Originally committed as revision 6023 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-18 23:53:49 +00:00
Michael Niedermayer
cf5aed5bad simplify
Originally committed as revision 6020 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-18 10:43:23 +00:00
Michael Niedermayer
3829a62eae insufficient alignment
Originally committed as revision 6006 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-17 07:49:22 +00:00
Marco Manfredini
6bb9e49249 Fix building with --disable-opts but MMX enabled.
patch by Marco Manfredini mldb %at% gmx %dot% net

Originally committed as revision 5994 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-13 10:10:06 +00:00
John Dalgliesh
4454dc1b6f Support for MacIntel, last part: balign directives
Determines whether .align's arg is power-of-two or not, then defines ASMALIGN appropriately in config.h. Changes all .baligns to ASMALIGNs.
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 11, 2006 8:00 AM
Subject: Re: [Ffmpeg-devel] Mac OS X Intel last part: balign directives

Originally committed as revision 5990 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-12 16:37:31 +00:00
Loren Merritt
069720565c vorbis simd tweaks
Originally committed as revision 5983 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-11 18:19:37 +00:00
Michael Niedermayer
1f1aa1d955 convert vector_fmul_reverse_sse2 and vector_fmul_add_add_sse2 to sse
please complain if they are slower on sse2 cpus ...

Originally committed as revision 5976 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 20:24:58 +00:00
Loren Merritt
eb4825b5d4 sse and 3dnow implementations of float->int conversion and mdct windowing.
15% faster vorbis.

Originally committed as revision 5975 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 19:06:25 +00:00
Luca Barbato
ffad4ed154 Fix x86 SIMD asm and pic, patch from Martin von Gagern <Martin.vGagern@gmx.net>
Originally committed as revision 5973 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 16:05:29 +00:00
John Dalgliesh
347be47226 Support for MacIntel, take xx: '/nop' illegal for old versions of GAS
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 8, 2006 8:12 PM
Subject: Re: [Ffmpeg-devel] [PATCH] '/nop' illegal for old versions of GAS

Originally committed as revision 5972 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 15:26:18 +00:00
John Dalgliesh
0fc256f3d9 Add support for Mac OS X Intel part 2: Assembler macros in fdct_mmx.c
convert gas macros to ccp macros
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 10, 2006 5:39 AM
Subject: Re: [Ffmpeg-devel] Mac OS X Intel part 2: Assembler macros in fdct_mmx.c

Originally committed as revision 5971 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 11:29:57 +00:00
John Dalgliesh
fc48b6fe74 Support for Mac OS X Intel, part 3: binary integer constants:
Apple's assembler only understands the same integer constants as C does: hex, decimal, octal. It doesn't understand binary integer constants (0b...) so this patch replaces binary integer constants with hex ones.
Patch by John Dalgliesh % johnd AH defyne P org %
Original thread:
Date: Aug 10, 2006 8:16 AM
Subject: [Ffmpeg-devel] Mac OS X Intel part 3: binary integer constants

Originally committed as revision 5970 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-10 09:06:06 +00:00
Loren Merritt
ee5df92750 emms -> femms
Originally committed as revision 5965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-09 06:37:25 +00:00
Loren Merritt
2494bdd90d gcc 2.95 and 3.4.x on x86 32bit without fomit-frame-pointer can't even find 5 registers for asm input.
0.5% slower vorbis.

Originally committed as revision 5964 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-09 06:33:49 +00:00
Loren Merritt
1b87c40245 slightly faster ff_imdct_calc_3dn2() on amd64. (gcc added a bunch of useless movsxd)
Originally committed as revision 5962 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 21:47:11 +00:00
Michael Niedermayer
21bb884fb7 change vorbis_inverse_coupling_sse2() so it works on sse1 cpus
Originally committed as revision 5957 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 12:03:51 +00:00
Loren Merritt
bcfa3e58ee 3dnow2 implementation of imdct.
6% faster vorbis and wma.

Originally committed as revision 5954 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-08 04:01:04 +00:00
Loren Merritt
cd035a6051 10l, vorbis_inverse_coupling_sse() was really 3dnow
Originally committed as revision 5903 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-03 07:09:29 +00:00
Loren Merritt
2dac4acfc0 sse & sse2 implementations of vorbis channel coupling.
9% faster vorbis (on a K8).

Originally committed as revision 5898 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-08-03 03:18:47 +00:00
Stefan Gehrer
595e7bd940 some MMX optimizations for the CAVS decoder
Originally committed as revision 5846 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-29 08:45:33 +00:00
Michael Niedermayer
5ced7b80ad disable the vp3 mmx and sse2 idcts, their output doesnt match the c idct (tested with -f crc) and the theora spec does not allow different idcts not to mention the difference is quite vissible ...
Originally committed as revision 5788 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-19 09:49:21 +00:00
Måns Rullgård
98d417cbcd #define SBUTTERFLY outside CONFIG_ENCODERS
Originally committed as revision 5628 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-05 19:31:01 +00:00
Luca Abeni
9c39071d6d Move REG_* macros from libavcodec/i386/mmx.h to libavutil/x86_cpu.h
Originally committed as revision 5595 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-03 10:52:07 +00:00
Måns Rullgård
3f8674a902 remove redundant macro definitions
Originally committed as revision 5589 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-02 22:01:31 +00:00
Måns Rullgård
8fb0d07339 kill warning
Originally committed as revision 5588 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-07-02 21:53:30 +00:00
Michael Niedermayer
e27b6e62f7 missmatch control for mpeg2 intra dequantization if bitexact=1
Originally committed as revision 5328 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-28 17:03:52 +00:00
Zuxy Meng
392f6da897 Remove unused and unsupported Cyrix's "Extended MMX",
Add SSE3 support.
Patch by Zuxy Meng < zuxy POIS meng AH gmail POIS com >
Original thread:
04/26/06 13:13:
[Ffmpeg-devel] [PATCH] Bug fix,	SSE3 support in i386/cputest.c and dsputil.h

Originally committed as revision 5326 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-28 07:46:13 +00:00
Wolfram Gloger
f42635f558 gcc-2.95 compile fix, patch by Wolfram Gloger <wmglo A dent PIS med PIS uni-muenchen PIS de>
Originally committed as revision 5298 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-18 03:48:30 +00:00
Loren Merritt
75ca1a5f70 gmc_mmx tweaks
Originally committed as revision 5269 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-05 04:13:41 +00:00
Loren Merritt
703c8195a8 mmx implementation of 3-point GMC. (5x faster than C)
Originally committed as revision 5265 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-04-04 09:23:45 +00:00
Luca Barbato
22b48b85b6 altivec support for snow
Originally committed as revision 5228 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-27 12:51:19 +00:00
Loren Merritt
5e8b787afa simplified and slightly faster h264_chroma_mc8_mmx
Originally committed as revision 5214 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-25 08:41:14 +00:00
Loren Merritt
513fbd8e5a prefetch pixels for future motion compensation. 2-5% faster h264.
Originally committed as revision 5203 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-23 20:16:36 +00:00
Loren Merritt
5e6a5c4daf 10l
Originally committed as revision 5201 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 22:08:28 +00:00
Loren Merritt
fdd3057981 added mmx implementation of h264_chroma_mc2
Originally committed as revision 5200 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 22:05:00 +00:00
Robert Edele
e8600e5edc add MMX and SSE versions of ff_snow_inner_add_yblock
Patch by Robert Edele < yartrebo AH earthlink POIS net >
Original Thread:
Date: Mar 22, 2006 3:24 AM
Subject: [Ffmpeg-devel] [PATCH] snow mmx + sse2 part 5

Originally committed as revision 5197 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-22 12:08:35 +00:00
Robert Edele
2c9a0285d4 snow mmx+sse2 optimizations, part 4
Patch by Robert Edele, yartrebo <<at>> earthlink <<dot>> net

Originally committed as revision 5191 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-21 21:51:07 +00:00
Robert Edele
4567b4bdab Add the mmx and sse2 implementations of ff_snow_vertical_compose().
Patch by Robert Edele < yartrebo AH earthlink POIS net >
Original thread:
Date: Mar 20, 2006 5:54 PM
Subject: [Ffmpeg-devel] [PATCH] snow mmx + sse2 part 3

Originally committed as revision 5185 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-20 22:27:59 +00:00
Robert Edele
059715a41c First part of a series of speed-enchancing patches.
This one sets up a snow.h and makes snow use the dsputil function pointer
framework to access the three functions that will be implemented in asm
in the other parts of the patchset.
Patch by Robert Edele < yartrebo AH earthlink POIS net>
Original thread:
Subject: [Ffmpeg-devel] [PATCH] Snow mmx+sse2 asm optimizations
Date: Sun, 05 Feb 2006 12:47:14 -0500

Originally committed as revision 5172 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-16 19:18:18 +00:00
Zuxy Meng
82eb4b0f1b 3DNow! & Extended 3DNow! versions of FFT
Patch by Zuxy Meng, zuxy <<dot>> meng >>at<< gmail <<dot>> com
Minor non-functional diff-related fixes by me.

Originally committed as revision 5125 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-08 04:13:55 +00:00
Loren Merritt
548a1c8a35 h264_idct8_add_mmx
Originally committed as revision 5123 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:45:56 +00:00
Loren Merritt
6da971f160 h264_idct_add only needs mmx1
Originally committed as revision 5122 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-07 22:33:32 +00:00
Zuxy Meng
2ffb22d2ad use xorps instead of mulps to toggle the sign of a float, as suggested by Software Optimization Guide for AMD64 Processors.
Patch by Zuxy Meng < zuxy POIS meng AH  gmail POIS com > OKed by Michael
Original thread:
Date: Mar 5, 2006 8:15 PM
Subject: [Ffmpeg-devel] [PATCH] Little optimization to fft_sse.c

Originally committed as revision 5112 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 20:25:18 +00:00
Loren Merritt
d84f7c61ee gcc2.95 workaround
Originally committed as revision 5111 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-05 19:02:35 +00:00
Loren Merritt
7a5b2fa812 remove some useless instructions
Originally committed as revision 5109 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-04 19:56:01 +00:00
Loren Merritt
6a8eb0f45a 4% faster h264_qpel_mc
Originally committed as revision 5094 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-03-02 08:21:08 +00:00
Loren Merritt
ef9d1d1575 h264: special case dc-only idct. ~1% faster overall
Originally committed as revision 4971 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-10 06:55:25 +00:00
Loren Merritt
4e295993ba 10l in 1.12
Originally committed as revision 4965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-09 02:43:23 +00:00
Loren Merritt
6ee669732d 10l (x86_64)
Originally committed as revision 4952 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 16:10:48 +00:00
Loren Merritt
e545f37527 18% faster put_h264_qpel16_mc[13]2_mmx2
Originally committed as revision 4951 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 10:52:25 +00:00
Loren Merritt
c03ce51dfb 11% faster put_h264_qpel16_v_lowpass_mmx2
Originally committed as revision 4950 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 07:35:03 +00:00
Loren Merritt
0331f09237 15% faster put_h264_qpel16_hv_lowpass_mmx2
Originally committed as revision 4949 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-07 06:25:14 +00:00
Steve L'Homme
68b51e58ce MSVC-compatible __align8/__align16 declaration
patch by Steve Lhomme, steve .dot. lhomme .at. free .dot. fr

Originally committed as revision 4942 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-02-05 13:35:17 +00:00
Diego Biurrun
5509bffa88 Update licensing information: The FSF changed postal address.
Originally committed as revision 4842 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-12 22:43:26 +00:00
Loren Merritt
e8b562087d tweak h264_biweight
Originally committed as revision 4835 to svn://svn.ffmpeg.org/ffmpeg/trunk
2006-01-09 03:38:37 +00:00
Loren Merritt
cec9395977 fix some potential arithmetic overflows in pred_direct_motion() and
ff_h264_weight_WxH_mmx2().

Originally committed as revision 4795 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-30 23:47:41 +00:00
Diego Biurrun
bb270c0896 COSMETICS: tabs --> spaces, some prettyprinting
Originally committed as revision 4764 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-22 01:10:11 +00:00
Diego Biurrun
115329f160 COSMETICS: Remove all trailing whitespace.
Originally committed as revision 4749 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-17 18:14:38 +00:00
Guillaume Poirier
f6d1338cb5 Add the rest of missing Reg_* macros to support both AMD-64 style regs and IA32 regs.
Not used yet, but should be once the SIMD code to accelerate Snow decoding is merged.

Originally committed as revision 4731 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-12-10 22:53:44 +00:00
Loren Merritt
ea15df8048 use sse16_sse2() in nsse
Originally committed as revision 4688 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-11-12 05:23:25 +00:00
Loren Merritt
a6624e21cb faster h264_chroma_mc8_mmx, added h264_chroma_mc4_mmx.
2-4% overall speedup.

Originally committed as revision 4666 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-27 06:45:29 +00:00
Loren Merritt
b926572aa9 h264 mmx weighted prediction. up to 3% overall speedup.
Originally committed as revision 4630 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-10-09 23:38:52 +00:00
Loren Merritt
5693c08356 sse2 16x16 sum squared diff (306=>268 cycles on a K8)
faster 8x8 mmx ssd (77=>70 cycles)

Originally committed as revision 4623 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-30 02:31:47 +00:00
Michael Niedermayer
12e9668119 replace a few mov + psrlq with pshufw, there are more cases which could benefit from this but they would require us to duplicate some functions ...
the trick is from various places (my own code in libpostproc, a patch on the x264 list, ...)

Originally committed as revision 4608 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-21 21:17:09 +00:00
Reimar Döffinger
cd7af76d9e Fix compile without CONFIG_GPL, misplaced #endif caused a missing }.
Originally committed as revision 4575 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:30:40 +00:00
Michael Niedermayer
9f211bc6d7 remove unused table entries
change non portable table access

Originally committed as revision 4574 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 19:03:37 +00:00
Michael Niedermayer
84740d5980 xvids mmx&mmx2 idcts
needed to decode xvid without some minor artefacts
under #ifdef CONFIG_GPL of course

Originally committed as revision 4572 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-10 17:01:30 +00:00
Måns Rullgård
79396ac685 Kill some compiler warnings. Compiled code verified identical after changes.
Originally committed as revision 4567 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-06 21:25:35 +00:00
Michael Niedermayer
d3a9f79871 simplify (d&a) and (d&~a) calculation, hint by skal
Originally committed as revision 4552 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-03 09:17:30 +00:00
Michael Niedermayer
b5b65df7a9 add consts (this was in my local tree, dunno where it came from, probably forgoten from some const patch)
Originally committed as revision 4551 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-09-02 21:13:19 +00:00
Måns Rullgård
bf4e3bd2d0 kill a bunch of compiler warnings
Originally committed as revision 4522 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-14 15:42:40 +00:00
Alexander Strasser
c11c2bc20b libavutil: Utility code from libavcodec moved to a separate library.
Originally committed as revision 4489 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-08-01 20:07:05 +00:00
Loren Merritt
d2bb7db135 sort H.264 mmx dsp functions into their own file
Originally committed as revision 4338 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-02 20:45:35 +00:00
Michael Niedermayer
c26ae41db2 adding a few const
Originally committed as revision 4337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 21:19:00 +00:00
Michael Niedermayer
435b0720a8 100l for myself (breaking amd64)
Originally committed as revision 4336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 18:04:01 +00:00
Michael Niedermayer
6510f43cf3 merge a few asm blocks so gcc cant unoptimize it (658->631 dezicycles on duron)
Originally committed as revision 4334 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:56:58 +00:00
Michael Niedermayer
987ae784e6 get rid of 2 movq (680 -> 658 dezicycles on duron)
Originally committed as revision 4333 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 11:36:32 +00:00
Michael Niedermayer
e4b36d4434 avoid one transpose (730->680 dezicycles on duron)
Originally committed as revision 4332 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 08:43:40 +00:00
Loren Merritt
85bbfcd4ee 10l (symbol mangling)
Originally committed as revision 4331 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 04:51:46 +00:00
Michael Niedermayer
1f3dbc09b1 add rounding bias before the horizontal idct (765->730 dezicyles on duron)
Originally committed as revision 4330 to svn://svn.ffmpeg.org/ffmpeg/trunk
2005-06-01 01:18:41 +00:00