476 Commits

Author SHA1 Message Date
Alexander Strange
bc31447225 Make the function prototype visible to comply with C99 inline.
Fixes building with gcc -std=gnu99.

Originally committed as revision 14140 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-09 17:51:57 +00:00
Michael Niedermayer
e98750c373 float_to_int16_sse2()
20% faster than sse

Originally committed as revision 14138 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-09 07:21:12 +00:00
Victor Pollex
1835cda65a Make LOAD4/STORE4 macros more generic.
Patch by Victor Pollex victor pollex web de
Original thread: [PATCH] mmx implementation of vc-1 inverse transformations
Date: 06/21/2008 03:37 PM

Originally committed as revision 14108 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-08 09:24:11 +00:00
Michael Niedermayer
35ee72b1d7 1 c-asm loop less and 1x unroll of float_to_int16_sse()
25% faster

Originally committed as revision 14104 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 21:25:18 +00:00
Michael Niedermayer
560fa9bf51 Fix x86-64
Originally committed as revision 14103 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 21:04:29 +00:00
Michael Niedermayer
63b737d4f9 Don't use C-asm loops and unroll once float_to_int16_3dnow()
30% faster

Originally committed as revision 14102 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-07-07 20:46:03 +00:00
Alexander Strange
74fd9022b5 Realign newlines.
Originally committed as revision 14023 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-28 18:30:50 +00:00
Alexander Strange
00969e1c59 Use MANGLE() instead of memory operands to read globals.
(fixes out of registers with apple gcc 4.2)

Originally committed as revision 14022 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-28 18:27:31 +00:00
Reimar Döffinger
00eebe3d6a Fix add_bytes_mmx and add_bytes_l2_mmx for w < 16
Originally committed as revision 13877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-06-22 07:05:40 +00:00
Michael Niedermayer
0bd134abd3 Simplify vsad16_mmx2().
Originally committed as revision 13193 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:36:44 +00:00
Michael Niedermayer
6bf6a9301b Simplify vsad16_mmx().
Originally committed as revision 13191 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:35:14 +00:00
Michael Niedermayer
e13810223a Simplify vsad_intra16_mmx2()
Originally committed as revision 13189 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:33:01 +00:00
Michael Niedermayer
06bb35f94c Simplify vsad_intra16_mmx()
Originally committed as revision 13188 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-17 14:31:10 +00:00
Diego Biurrun
a12b44d7fb Add missing required header directly.
Originally committed as revision 13103 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 14:34:52 +00:00
Diego Biurrun
20cd685ae8 Add missing path to #include.
Originally committed as revision 13102 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 14:33:55 +00:00
Diego Biurrun
245976da2a Use full path for #includes from another directory.
Originally committed as revision 13098 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-09 11:56:36 +00:00
Ramiro Polla
40d0e665d0 Do not misuse long as the size of a register in x86.
typedef x86_reg as the appropriate size and use it instead.

Originally committed as revision 13081 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-05-08 21:11:24 +00:00
Diego Biurrun
57105ddd03 Rename i386/cputest.c --> i386/cpuid.c.
Originally committed as revision 13002 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-26 16:02:22 +00:00
Diego Biurrun
c88c253d8b cosmetics: __asm__ __volatile__ --> asm volatile
Originally committed as revision 12885 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-17 21:57:52 +00:00
Diego Biurrun
80465c7eed cosmetics: Fix nonstandard indentation.
Originally committed as revision 12863 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 20:51:39 +00:00
Jeff Downs
591d87babe Cosmetics:
Break long lines.
Correct spelling in comment (duplicatin -> duplicating)

Originally committed as revision 12862 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 20:43:37 +00:00
Jeff Downs
52cb7981e2 Redo r12838, this time using svn copy to create h264_i386.h from cabac.h.
Move decode_significance_x86() and decode_significance_8x8_x86() to
i386-specific file from cabac.h.
New file is h264-oriented and only included from h264.c
Resolves compilation when configured with --disable-optimizations due to
decode_significance_8x8_x86 using last_coeff_flag_offset_8x8, which is
only defined in h264.c

Originally committed as revision 12846 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 04:40:21 +00:00
Jeff Downs
3aa9ede400 Revert 12838 to redo it the right way (use svn copy to create new
file based on old).

Originally committed as revision 12845 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 04:26:52 +00:00
Alexander Strange
f73a6393e7 Add a new xvid-style IDCT using SSE2.
Originally committed as revision 12843 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-16 01:36:14 +00:00
Jeff Downs
e6cfd8fffb Move decode_significance_x86() and decode_significance_8x8_x86() to
i386-specific file from cabac.h.
New file is h264-oriented and only included from h264.c
Resolves compilation when configured with --disable-optimizations due to
decode_significance_8x8_x86 using last_coeff_flag_offset_8x8, which is
only defined in h264.c

Originally committed as revision 12838 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-15 13:51:41 +00:00
Luca Barbato
3fbe711832 Eliminate movdqu in vp3dsp_sse2, patch from Alexander Strange astrangeAtithinkswDoTcom
Originally committed as revision 12824 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-14 20:54:23 +00:00
Alexander Strange
54a0b6e590 Add a header file to declare Xvid IDCT functions.
patch by Alexander Strange, astrange ithinksw com

Originally committed as revision 12794 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-12 16:54:36 +00:00
Loren Merritt
96275520a3 Fix H.264 interframe decoding when compiling with icc. Patch by Loren
Merritt:

"It seems that icc copies the constants from their global var onto the
stack, at which point they're not aligned, hence the crash.
[This change] really shouldn't mean anything different, but maybe it'll
confuse icc into not performing that 'optimization'."

Originally committed as revision 12772 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-08 23:49:34 +00:00
Loren Merritt
ce53144bac h264 chroma mc ssse3
width8: 180->92, width4: 78->63 cycles (core2)

Originally committed as revision 12661 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-04-01 04:51:28 +00:00
Diego Biurrun
04932b0d97 cosmetics: typo fixes
Originally committed as revision 12554 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-22 16:46:36 +00:00
Zuxy Meng
9e8e6d318c Add missed call to ff_cavsdsp_init_3dnow() in dsputil_init_mmx()
Originally committed as revision 12540 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-21 12:36:49 +00:00
Michael Niedermayer
943032b155 Hardcode register to prevent apparent miscompilation.
Fixes regression tests with gcc 2.95.

Originally committed as revision 12512 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-20 14:24:29 +00:00
Michael Niedermayer
dea00a4623 remove unused temp
Originally committed as revision 12511 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-20 14:09:31 +00:00
Måns Rullgård
b55aa9a904 get register names from x86_cpu.h
Originally committed as revision 12482 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-17 23:08:19 +00:00
Aurelien Jacobs
5a6a9e78ab move draw_edges() into dsputil
Originally committed as revision 12309 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-03-04 00:07:41 +00:00
Aurelien Jacobs
97d1d009e2 split encoding part of dsputil_mmx into its own file
Originally committed as revision 12223 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-25 23:14:22 +00:00
Reimar Döffinger
f2217d6f90 __asm __volatile -> asm volatile part 2
Originally committed as revision 12189 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-24 14:47:42 +00:00
Reimar Döffinger
78d3d94f14 __asm __volatile -> asm volatile, improves code consistency and works
(as far as that is possible) with the Sun C compiler.

Originally committed as revision 12188 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-24 14:46:22 +00:00
Loren Merritt
4a9ca0a279 simd and unroll png_filter_row
cycles per 1000 pixels on core2:
left: 9211->5170
top: 9283->2138
avg: 12215->7611
paeth: 64024->17360
overall rgb png decoding speed: +45%
overall greyscale png decoding speed: +6%

Originally committed as revision 12164 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-21 07:10:46 +00:00
Michael Niedermayer
1435e4ccde Disabling all SSE* code for old gcc to avoid alignment issues.
Originally committed as revision 12163 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-21 00:06:07 +00:00
Reimar Döffinger
754bf3d8a1 Fix warnings:
i386/vp3dsp_sse2.c:805: warning: cast discards qualifiers from pointer target type
i386/vp3dsp_sse2.c:806: warning: cast discards qualifiers from pointer target type

Originally committed as revision 12150 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-19 21:55:13 +00:00
Diego Biurrun
5edac5dc94 cosmetics: Replace // by /* */ comments.
sync with upstream libmpeg2 0.4.1

Originally committed as revision 11915 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-13 01:18:12 +00:00
Loren Merritt
ec199cc94c asm argument that might be in memory needs a size
Originally committed as revision 11890 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-10 01:45:42 +00:00
Loren Merritt
2c70770e33 use fewer registers in apply_welch_window_sse2
Originally committed as revision 11882 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-09 05:29:47 +00:00
Loren Merritt
1d67b037f7 sse2 h264 motion compensation. not new code, just separate out the cases that didn't need ssse3.
Originally committed as revision 11877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-06 12:32:31 +00:00
Loren Merritt
20d565be6d put loop counter in a register if possible. makes some of the qpel functions 3% faster.
Originally committed as revision 11876 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-06 04:44:21 +00:00
Loren Merritt
7080ec2937 fix aliasing warnings. simpler too.
Originally committed as revision 11875 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-06 04:14:07 +00:00
Loren Merritt
a2b7bc8e71 constant was excessively aligned
Originally committed as revision 11874 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-06 03:51:53 +00:00
Loren Merritt
ddf969705f ssse3 h264 motion compensation.
25% faster than mmx on core2, 35% if you discount fullpel, 4% overall decoding.

Originally committed as revision 11871 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-05 11:22:55 +00:00
Loren Merritt
b64dfbb8d2 add qpel rounder once during hv rather than twice during hv and whatever it's averaged with
Originally committed as revision 11870 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-02-05 03:58:13 +00:00