Diego Biurrun
831a118078
Update dsputil- and SIMD-related comments to match reality more closely
2014-03-13 05:50:29 -07:00
Diego Biurrun
17608f6ee3
x86: Add some more missing headers
2014-03-13 05:50:28 -07:00
Diego Biurrun
08dba0e1c3
x86: mpegvideoenc: Remove some remnants of the long-gone libmpeg2 IDCT
2014-03-13 05:50:28 -07:00
Diego Biurrun
3bfdee00cd
x86: dcadsp: Fix linking with yasm and optimizations disabled
...
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
2014-03-05 23:16:21 +01:00
Diego Biurrun
3741aa37c2
x86: cabac: Use correct #includes to make header compile standalone
2014-03-05 13:32:25 +01:00
Christophe Gisquet
4cb6964244
dcadec: simplify decoding of VQ high frequencies
...
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.
Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.
The decode_hf implementations have the following timings:
For x86 Arrandale:
        C    SSE  SSE2  SSE4
win32:  260  162  119   104
win64:  242  N/A  89    72
The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
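
A minimal sketch of the loop restructuring described above, not the actual dcadec code. All names (dequant_checked, dequant_kernel, dequant_hoisted) and the exact form of the in-loop test are assumptions made for illustration. The per-sample test is replaced by a length computed once in the caller, so the kernel becomes a plain loop over a non-zero length.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" shape: the bound is tested for every sample. */
static void dequant_checked(float *dst, const int8_t *vq, float scale,
                            int count, int end)
{
    for (int i = 0; i < count; i++)
        if (i < end)
            dst[i] = vq[i] * scale;
}

/* Hypothetical "after" kernel: a branch-free loop body over a known,
 * non-zero length, which is the shape a SIMD version can replace. */
static void dequant_kernel(float *dst, const int8_t *vq, float scale, int len)
{
    assert(len > 0);            /* callers never pass an empty range */
    for (int i = 0; i < len; i++)
        dst[i] = vq[i] * scale;
}

/* The caller clamps the length once, outside the loop, and skips
 * empty ranges so the kernel does not have to handle them. */
static void dequant_hoisted(float *dst, const int8_t *vq, float scale,
                            int count, int end)
{
    int len = count < end ? count : end;
    if (len > 0)
        dequant_kernel(dst, vq, scale, len);
}
```

Both variants write the same samples; only the kernel in the second one is a candidate for a DSP function pointer.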
2014-02-28 13:03:22 +01:00
Christophe Gisquet
08e3ea60ff
x86: synth filter float: implement SSE2 version
...
Timings for Arrandale:
        C     SSE
win32:  2108  334
win64:  1152  322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:48 +01:00
Christophe Gisquet
ad507d7907
x86: dcadsp: implement SSE lfe_dir
...
Results for Arrandale/Windows:
32: 1670 -> 316
64: 728 -> 298
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:47 +01:00
Diego Biurrun
b23650491f
prores: Use consistent names for DSP arch initialization functions
2014-02-28 10:34:55 +01:00
Diego Biurrun
017a06a9ee
x86: dsputil: Use correct file name as multiple inclusion guard
2014-02-20 04:16:15 -08:00
Diego Biurrun
b23bc95920
x86: dca: Add missing multiple inclusion guards
2014-02-19 10:19:15 +01:00
Janne Grunau
5c1c6e8226
dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders
2014-02-08 13:38:36 +01:00
Janne Grunau
0cffd6fff5
x86: use the inline int8x8_fmul_int32 only if inline SSE2 is available
...
Fixes compilation with MSVC. Also do not rely on an earlier config.h
include, but include it directly.
2014-02-08 12:10:56 +01:00
Christophe Gisquet
5b59a9fc61
x86: dcadsp: implement int8x8_fmul_int32
...
For the callable function (as opposed to the inline one):
        C   SSE  SSE2  SSE4
Win32:  47  42   29    26
Win64:  30  33   25    23
The SSE version is neither compiled nor set for ARCH_X86_64, as the
inlinable function takes over.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:52:40 +01:00
Ronald S. Bultje
9ee9c679a7
x86: videodsp: Fix a bug in a %if statement where we used '%%' instead of '&&'.
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-01-30 15:33:23 +01:00
Ronald S. Bultje
51daafb02e
x86: videodsp: Properly mark sse2 instructions in emulated_edge_mc as such.
...
Should fix crashes or corrupt output on pre-SSE2 CPUs when they were
using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in
hfix or hvar single-edge (left/right) extension functions.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-01-30 15:30:01 +01:00
Diego Biurrun
aab40bbfd5
x86: dsputil: Simplify xvmc deprecation conditional
2014-01-15 15:23:46 +01:00
Diego Biurrun
46bacb5cc6
x86: Consistently use cpu flag detection macros in places that still miss it
2014-01-14 00:04:58 +01:00
Diego Biurrun
4c642d8d98
x86: hpeldsp: Add missing av_cold attribute to init function
2014-01-09 15:09:07 +01:00
Diego Biurrun
b0be1ae792
x86: avcodec: Add a bunch of missing #includes for av_cold
2014-01-09 15:09:07 +01:00
Anton Khirnov
a03a642d5c
h264: do not use 422 functions for monochrome
...
Fixes invalid memory access.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
2014-01-06 08:25:36 +01:00
Anton Khirnov
dfc50ac85e
x86: mpegvideo: move denoise_dct asm to mpegvideoenc
...
This function is encoding-only.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-12-20 17:16:11 +01:00
Diego Biurrun
4958f35a2e
dsputil: Move apply_window_int16 to ac3dsp
...
The (optimized) functions are used nowhere else.
2013-12-08 17:57:15 +01:00
Diego Biurrun
3d7c84747d
x86: Initialize mmxext after amd3dnow optimizations
...
The mmxext optimizations should be at least equally fast if available and
amd3dnow optimizations are being deprecated. Thus the former should
override the latter, not the other way around.
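
A minimal sketch of why the order of the checks matters, assuming the usual pattern of assigning function pointers per CPU flag. The flag values, the context struct and the stand-in functions are hypothetical and only report which implementation was selected.

```c
/* Hypothetical flag bits, for illustration only. */
#define SKETCH_CPU_3DNOW  0x1
#define SKETCH_CPU_MMXEXT 0x2

enum SketchImpl { SKETCH_IMPL_C, SKETCH_IMPL_3DNOW, SKETCH_IMPL_MMXEXT };

typedef struct SketchDSP {
    enum SketchImpl (*impl)(void);
} SketchDSP;

/* Stand-ins for the real optimized functions. */
static enum SketchImpl impl_c(void)      { return SKETCH_IMPL_C; }
static enum SketchImpl impl_3dnow(void)  { return SKETCH_IMPL_3DNOW; }
static enum SketchImpl impl_mmxext(void) { return SKETCH_IMPL_MMXEXT; }

static void sketch_dsp_init(SketchDSP *c, int cpu_flags)
{
    c->impl = impl_c;
    if (cpu_flags & SKETCH_CPU_3DNOW)
        c->impl = impl_3dnow;
    /* Checked last, so it overrides the 3DNow! pointer when both
     * extensions are available. */
    if (cpu_flags & SKETCH_CPU_MMXEXT)
        c->impl = impl_mmxext;
}
```

With both flags set, the later assignment wins, so a CPU that supports both extensions ends up with the mmxext version.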
2013-12-04 18:52:48 +01:00
Diego Biurrun
7ffaa19570
dsputil: x86: Move ff_inv_zigzag_direct16 table init to mpegvideo
...
The table is MMX-specific and used nowhere else.
2013-12-02 04:05:18 +01:00
Diego Biurrun
cf7860db60
x86: dsputil: Suppress deprecation warnings for XvMC bits
...
These parts are scheduled for removal on the next version bump.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2013-11-28 16:04:30 +01:00
Ronald S. Bultje
72ca830f51
lavc: VP9 decoder
...
Originally written by Ronald S. Bultje <rsbultje@gmail.com> and
Clément Bœsch <u@pkh.me>
Further contributions by:
Anton Khirnov <anton@khirnov.net>
Diego Biurrun <diego@biurrun.de>
Luca Barbato <lu_zero@gentoo.org>
Martin Storsjö <martin@martin.st>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-11-15 10:16:28 +01:00
Ronald S. Bultje
458446acfa
lavc: Edge emulation with dst/src linesize
...
Allow supporting files for which the image stride is smaller than
the maximum block size + number of subpel mc taps, e.g. a 64x64 VP9
file or a 16x16 VP8 file with -fflags +emu_edge.
2013-11-15 10:16:27 +01:00
Diego Biurrun
19e30a58fc
Deprecate obsolete XvMC hardware decoding support
...
XvMC has long ago been superseded by newer acceleration APIs, such as
VDPAU, and few downstreams still support it. Furthermore XvMC is not
implemented within the hwaccel framework, but requires its own specific
code in the MPEG-1/2 decoder, which is a maintenance burden.
2013-11-13 21:07:45 +01:00
Diego Biurrun
0338c39698
dsputil: Split off H.263 bits into their own H263DSPContext
2013-11-08 12:40:47 +01:00
Diego Biurrun
e2b5b09789
x86: rv40dsp: Use PAVGB instruction macro where appropriate
2013-11-04 21:14:39 +01:00
Mikulas Patocka
694d997afe
x86: hpeldsp: Use PAVGB instruction macro where necessary
...
Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-11-04 01:29:23 +01:00
Diego Biurrun
1700b4e678
x86: vp8dsp: Split loopfilter code into a separate file
2013-11-01 22:05:20 +01:00
Diego Biurrun
6405ca7d4a
x86: h264_idct: Update comments to match 8/10-bit depth optimization split
2013-10-07 21:46:46 +02:00
Henrik Gramner
bbe4a6db44
x86inc: Utilize the shadow space on 64-bit Windows
...
Store XMM6 and XMM7 in the shadow space in functions that
clobber them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:35 -04:00
Diego Biurrun
ce1e8045e0
x86: fdct: Employ more specific ifdefs
...
This avoids building mmxext and sse2 code when disabled by configure.
2013-10-06 22:02:25 +02:00
Diego Biurrun
2ddb35b911
x86: dsputil: Separate ff_add_hfyu_median_prediction_cmov from dsputil_mmx
...
The function does not depend on MMX and compilation without MMX enabled
fails if the function is compiled conditional on MMX availability.
2013-10-05 19:21:15 +02:00
Diego Biurrun
258414d077
x86: fdct: Initialize optimized fdct implementations in the standard way
2013-10-05 18:20:52 +02:00
Diego Biurrun
0b8b2ae5e9
x86: xviddct: Employ more specific ifdefs
...
This avoids building mmxext and sse2 code when disabled by configure.
2013-10-05 18:14:58 +02:00
Diego Biurrun
6cc133ec58
x86: fdct: Only build fdct code if encoders have been enabled
...
fdct is only initialized if encoders are enabled.
2013-10-04 10:50:44 +02:00
Martin Storsjö
1daea5232f
x86: Add an xmm clobbering wrapper for avcodec_encode_video2
...
This is required since 187105ff8 when we started trying to
wrap this function as well.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-16 22:22:41 +03:00
Hendrik Leppkes
a06a5b78e2
mathops/x86: work around inline asm miscompilation with GCC 4.8.1
...
The volatile is not required here, and prevents a miscompilation with GCC
4.8.1 when building on x86 with --cpu=i686
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-09-15 11:15:07 -04:00
Diego Biurrun
e998b56362
x86: avcodec: Consistently structure CPU extension initialization
2013-08-29 13:07:37 +02:00
Diego Biurrun
6369ba3c9c
x86: avcodec: Use convenience macros to check for CPU flags
2013-08-29 13:07:37 +02:00
Diego Biurrun
cd52917237
x86: rv40dsp: Move inline assembly optimizations out of YASM init section
2013-08-28 23:59:24 +02:00
Diego Biurrun
a64f6a04ac
dsputil: x86: Hide arch-specific initialization details
...
Also give consistent names to init functions.
2013-08-28 23:59:24 +02:00
Diego Biurrun
8506ff97c9
vp56: Mark VP6-only optimizations as such.
...
Most of our VP56 optimizations are VP6-only and will stay that way.
So avoid compiling them for VP5-only builds.
2013-08-23 14:42:19 +02:00
Diego Biurrun
e7b31844f6
x86: Split DCT and FFT initialization into separate files
2013-08-21 20:15:27 +02:00
Diego Biurrun
0b45269c2d
x86: h264_idct: Remove incorrect comment
2013-08-21 15:09:58 +02:00
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
2013-07-18 00:31:35 +02:00
Christophe Gisquet
b6293e2798
fmtconvert: Explicitly use int32_t instead of int
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-17 11:02:47 +03:00
Luca Barbato
2b379a9251
mlpdsp: x86: Respect cpuflags
2013-07-12 04:34:49 +02:00
Jason Garrett-Glaser
d222f6e39e
cabac: x86 version of get_cabac_bypass
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-07-04 16:06:10 +02:00
Diego Biurrun
186599ffe0
build: cosmetics: Place unconditional before conditional OBJS lines
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:31 +03:00
Diego Biurrun
004b81c465
mpegvideo: Remove commented-out PARANOID debug cruft
2013-05-15 23:53:42 +02:00
Diego Biurrun
1399931d07
x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h
...
The header is not (anymore) MMX-specific.
2013-05-12 22:28:07 +02:00
Diego Biurrun
245b76a108
x86: dsputil: Split inline assembly from init code
...
Also remove some pointless comments.
2013-05-12 22:28:07 +02:00
Diego Biurrun
46bb456853
x86: dsputil: Refactor pixels16 wrapper functions with a macro
2013-05-12 22:28:07 +02:00
Diego Biurrun
f54b55058a
configure: Rename cmov processor capability to i686
...
The goal is to make the capability slightly more general and have it
cover the availability of the nopl instruction in addition to cmov.
2013-05-12 21:23:38 +02:00
Christophe Gisquet
2c299d4165
x86: sbrdsp: implement SSE2 qmf_pre_shuffle
...
From 253 to 51 cycles on Arrandale and Win64.
44 cycles on SandyBridge.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-05-10 09:31:27 +02:00
Diego Biurrun
f243bf7aa2
x86: dsputil: Remove unused argument from QPEL_OP macro
2013-05-08 18:18:58 +02:00
Diego Biurrun
3d40c1ee74
x86: dsputil: Move TRANSPOSE4 macro to the only place it is used
2013-05-08 18:18:23 +02:00
Diego Biurrun
71469f3b63
x86: dsputil: Move constant declarations into separate header
2013-05-08 18:18:23 +02:00
Diego Biurrun
ed880050ed
x86: dsputil: Group all assembly constants together in constants.c
2013-05-08 01:04:04 +02:00
Diego Biurrun
8761466760
x86: dsputil: Move ff_pd assembly constants to the only place they are used
2013-05-08 01:04:04 +02:00
Diego Biurrun
1b343cedd7
x86: dsputil: Remove unused ff_pb_3F constant
2013-05-07 18:03:35 +02:00
Diego Biurrun
3334cbec0a
x86: dsputil: Remove unused MOVQ_BONE macro
2013-05-07 18:03:35 +02:00
Diego Biurrun
63bac48f73
x86: dsputil: Move rv40-specific functions where they belong
2013-05-07 18:03:35 +02:00
Diego Biurrun
92f8e06ecb
x86: dsputil hpeldsp: Move shared template functions into separate object
2013-05-07 18:03:34 +02:00
Diego Biurrun
7edaf4edb5
x86: rnd_template: Eliminate pointless OP_AVG macro indirection
2013-05-07 18:03:34 +02:00
Diego Biurrun
110796739a
x86: hpeldsp: Move avg_pixels8_x2_mmx() out of hpeldsp_rnd_template.c
...
The function is only instantiated once, so there is no point
in keeping it in a template file.
2013-05-06 11:02:08 +02:00
Diego Biurrun
dc1b328d0d
x86: hpeldsp: Only compile MMX hpeldsp code if MMX is enabled
2013-05-06 11:02:08 +02:00
Diego Biurrun
9e5e76ef9e
x86: More specific ifdefs for dsputil/hpeldsp init functions
2013-05-06 11:02:07 +02:00
Diego Biurrun
6fee1b90ce
avcodec: Add av_cold attributes to init functions missing them
2013-05-04 21:09:45 +02:00
Diego Biurrun
a5f8873620
silly typo fixes
2013-05-03 18:26:12 +02:00
Christophe Gisquet
5a97469a4f
x86: sbrdsp: Implement SSE2 qmf_deint_bfly
...
Sandybridge: 47 cycles
Having a loop counter is a 7 cycle gain.
Unrolling is another 7 cycle gain.
Working in reverse scan is another 6 cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-05-03 18:23:14 +02:00
Diego Biurrun
bf7c3c6b15
x86: dsputil: Move cavs and vc1-specific functions where they belong
2013-05-02 11:45:37 +02:00
Diego Biurrun
9328062321
x86: dsputil: Move avg_pixels16_mmx() out of rnd_template.c
...
The function does not do any rounding, so there is no point in
keeping it in a round template file.
2013-05-02 11:45:37 +02:00
Diego Biurrun
9c112a6158
x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c
...
The function is only instantiated once, so there is no point
in keeping it in a template file.
2013-05-02 11:45:37 +02:00
Diego Biurrun
9b3a04d306
x86: Move duplicated put_pixels{8|16}_mmx functions into their own file
2013-05-02 11:16:45 +02:00
Diego Biurrun
f2e9d44a57
x86: Drop unnecessary ff_ name prefixes from static functions
2013-04-30 16:02:03 +02:00
Diego Biurrun
643e433bf7
mpegaudiodsp: More consistent names for ppc/x86 optimization files
2013-04-30 12:19:43 +02:00
Diego Biurrun
97c56ad796
x86: dsputil: Remove a set of pointless #ifs around function declarations
2013-04-30 01:42:32 +02:00
Diego Biurrun
85f2f82af6
x86: dsputil: cosmetics: Group ff_{avg|put}_pixels16_mmxext() declarations
2013-04-30 01:41:05 +02:00
Diego Biurrun
20784aa678
x86: hpeldsp: Remove unused macro definitions
2013-04-29 15:57:00 +02:00
Diego Biurrun
7c00e9d8ae
x86: ac3dsp: Remove 3dnow version of ff_ac3_extract_exponents
...
The function requires increasing the fuzz factor for the ac3/eac3 encode
tests and even so makes fate fail. It only provides a slight encoding
speedup for legacy CPUs that do not support SSE2. Thus its benefit is not
worth the trouble it creates and fixing it would be a waste of time.
2013-04-26 21:06:52 +02:00
Martin Storsjö
74685f6783
x86: Rename dsputil_rnd_template.c to rnd_template.c
...
This makes it less confusing when this template is shared both by
dsputil and by hpeldsp.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-25 23:03:09 +03:00
Martin Storsjö
486f76f029
x86: Get rid of duplication between *_rnd_template.c
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-23 23:30:17 +03:00
Martin Storsjö
6a8561dbd7
x86: Factorize duplicated inline assembly snippets
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-23 15:07:31 +02:00
Diego Biurrun
c1ad70c3cb
x86: Move some conditional code around to avoid unused variable warnings
2013-04-22 17:50:02 +02:00
Diego Biurrun
b4ad7c54c8
x86: cavs: Refactor duplicate dspfunc macro
2013-04-22 12:05:09 +02:00
Diego Biurrun
78fa0bd0f7
x86: cavs: Put mmx-specific code into its own init function
...
Before, this code was labeled as mmxext and enabled both for the
3dnow and the mmxext case.
2013-04-22 10:42:50 +02:00
Diego Biurrun
311a592dfc
x86: Remove some duplicate function declarations
2013-04-22 02:29:57 +02:00
Martin Storsjö
b71a0507b0
x86: Remove unused inline asm instruction defines
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-20 00:44:54 +03:00
Ronald S. Bultje
8db00081a3
x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:18:53 +03:00
Ronald S. Bultje
015821229f
vp3: Use full transpose for all IDCTs
...
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.
Also remove the unused type == 0 cases from the plain C version
of the idct.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Ronald S. Bultje
c46819f229
x86: Move constants to the only place where they are used
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:17:39 +03:00
Diego Biurrun
a3cb865310
x86: dsputil: Move some ifdefs to avoid unused variable warnings
2013-04-12 09:36:47 +02:00
Diego Biurrun
2004c7c8f7
x86: dsputil: cosmetics: Remove two pointless variable indirections
2013-04-12 09:36:47 +02:00
Diego Biurrun
c51a3a5bd9
x86: dsputil: Refactor some ff_{avg|put}_pixels function declarations
2013-04-12 09:36:46 +02:00
Diego Biurrun
e027032fc6
x86: dsputil: Move ff_h263_*_loop_filter declarations to a more suitable place
2013-04-12 09:36:46 +02:00
Diego Biurrun
a89c05500f
x86: h264qpel: int --> ptrdiff_t for some line_size parameters
2013-04-12 09:30:12 +02:00
Diego Biurrun
ac9362c5d9
Move misplaced file author information where it belongs
2013-04-11 02:42:11 +02:00
Ronald S. Bultje
b93b27edb0
dsputil: Make dsputil selectable
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:04:05 +03:00
Ronald S. Bultje
62844c3fd6
h264: Integrate clear_blocks calls with IDCT
...
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Ronald S. Bultje
610b18e2e3
x86: qpel: Move fullpel and l2 functions to a separate file
...
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-08 12:38:33 +03:00
Christophe Gisquet
f4b0d12f5b
x86: sbrdsp: Implement SSE neg_odd_64
...
Timing on Arrandale:
        C   SSE
Win32:  57  44
Win64:  47  38
Unrolling and not storing mask both save some cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-05 22:47:04 +02:00
Diego Biurrun
b6649ab503
cosmetics: Remove unnecessary extern keywords from function declarations
2013-03-27 14:21:45 +01:00
Martin Storsjö
a2acadd058
x86: vc1dsp: Fix indentation
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-03-26 15:49:42 +02:00
Janne Grunau
e5c2794a71
x86: consistently use unaligned movs in the unaligned bswap
...
Fixes fate errors in asv1, ffvhuff and huffyuv on x86_32.
2013-03-25 12:11:11 +01:00
Martin Storsjö
285ff14413
x86: Change a missed occurrence of int to ptrdiff_t for strides
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-03-24 12:06:53 +02:00
Martin Storsjö
352dbdb96c
x86: Remove win64 xmm clobbering wrappers for the now removed avcodec_encode_video function
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-03-23 23:37:27 +02:00
Luca Barbato
a8b6015823
dsputil: convert remaining functions to use ptrdiff_t strides
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-03-12 18:26:42 +01:00
Diego Biurrun
e8c52271c4
Revert "Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm."
...
This reverts commit f90ff772e7.
The code should be put back in h264_qpel_8bit.asm, but unfortunately
it is unconditionally used from dsputil_mmx.c since 71155d7.
2013-02-28 21:50:02 +01:00
Diego Biurrun
ebc701993f
x86: dsputil: Drop some unused function #defines
2013-02-26 23:36:24 +01:00
Diego Biurrun
845cfc92f9
x86: dsputil: Drop aliasing of ff_put_pixels8_mmx to ff_put_pixels8_mmxext
...
The external assembly function uses mmxext instructions and should not be
masqueraded as an mmx-only function. Instead, use the mmx-only inline
assembly function.
2013-02-26 23:36:24 +01:00
Diego Biurrun
096cc11ec1
x86: vc1dsp: Move ff_avg_vc1_mspel_mc00_mmxext out of dsputil_mmx.c
2013-02-26 23:36:24 +01:00
Martin Storsjö
31a23a0dc6
x86: dsputil_mmx: Remove leftover inline assembly fragments
...
These became unused in 71155d7b.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-27 00:17:05 +02:00
Diego Biurrun
c242bbd8b6
Remove unnecessary dsputil.h #includes
2013-02-26 00:51:34 +01:00
Matt Wolenetz
311443f6c7
x86: h264: Don't use redzone in AVX h264_deblock on Win64
...
This fixes crashes in chromium on win64 on machines with AVX
(crashes that apparently aren't triggered by fate).
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-21 15:02:16 +02:00
Ronald S. Bultje
e5ffffe48d
h264chroma: Remove duplicate 9/10 bit functions
...
These functions do the same thing in 16 bit space and don't need
any depth specific clipping.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:33:19 +02:00
Daniel Kang
9acd23d655
x86: dsputil: Fix h263 loop filter link error in some configurations
...
This was caused by unconditionally referencing a conditionally compiled
table. Now the code is also compiled conditionally.
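
A minimal sketch of the pattern behind this kind of link error, assuming a configuration macro controls whether the table is built. The macro, table and function names are hypothetical. Every reference to the table sits under the same conditional as the table itself, so a build without it never emits an unresolved symbol.

```c
#include <stddef.h>

/* Hypothetical configuration switch; 1 means the table is built. */
#ifndef SKETCH_HAVE_TABLE
#define SKETCH_HAVE_TABLE 1
#endif

#if SKETCH_HAVE_TABLE
static const unsigned char sketch_strength[4] = { 0, 1, 1, 2 };

/* The only user of the table lives under the same conditional. */
static int sketch_filter(int q)
{
    return sketch_strength[q & 3];
}
#endif

typedef struct SketchCtx {
    int (*filter)(int q);
} SketchCtx;

static void sketch_init(SketchCtx *c)
{
    c->filter = NULL;
#if SKETCH_HAVE_TABLE
    /* Assigned only when both the table and its user exist. */
    c->filter = sketch_filter;
#endif
}
```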
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-02-18 17:09:00 +01:00
Daniel Kang
7a03145ed7
x86: dsputil: int --> ptrdiff_t for ff_put_pixels16_mmxext line_size param
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-02-18 15:23:03 +01:00
Daniel Kang
b3f2a3fe3f
x86: mpeg4qpel: Make movsxifnidn do the right thing
...
Fixes an instruction that does nothing by changing the
source to dword.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-02-11 20:17:15 +01:00
Diego Biurrun
5d3d39c72e
dsputil: Move fdct function declarations to dct.h
2013-02-09 00:08:28 +01:00
Diego Biurrun
218aefce44
dsputil: Move LOCAL_ALIGNED macros to libavutil
2013-02-08 23:13:37 +01:00
Daniel Kang
a1d3673034
dsputil: x86: Fix compile error
...
Accidentally prefixed ff_ with cextern.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-07 11:06:16 +02:00
Daniel Kang
659d4ba5af
dsputil: x86: Convert h263 loop filter to yasm
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-02-06 15:38:27 -08:00
Martin Storsjö
a846dccb29
h264chroma: x86: Fix building with yasm disabled
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-06 17:05:33 +02:00
Diego Biurrun
82bd04b170
rv34: Drop now unnecessary dsputil dependencies
2013-02-06 11:30:54 +01:00
Diego Biurrun
79dad2a932
dsputil: Separate h264chroma
2013-02-06 11:30:53 +01:00
Diego Biurrun
c9f933b5b6
Add av_cold attributes to arch-specific init functions
2013-02-05 17:01:05 +01:00
Diego Biurrun
25841dfe80
Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
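
A minimal C sketch of the signature change, not the actual libavcodec function. The name sketch_put_pixels8 and the 8-pixel width are assumptions. With a ptrdiff_t line size the value already has pointer width, so it can be added to a pointer directly, including when the stride is negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-pixel-wide block copy taking the stride as ptrdiff_t. */
static void sketch_put_pixels8(uint8_t *dst, const uint8_t *src,
                               ptrdiff_t line_size, int h)
{
    for (int y = 0; y < h; y++) {
        /* y * line_size is computed in ptrdiff_t, so no manual
         * widening of a 32-bit int is needed before the addition. */
        uint8_t *d       = dst + y * line_size;
        const uint8_t *s = src + y * line_size;
        for (int x = 0; x < 8; x++)
            d[x] = s[x];
    }
}
```

Passing a negative line_size together with pointers to the last row walks the block bottom-up with the same code.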
2013-02-05 12:59:12 +01:00
Diego Biurrun
52acd79165
x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp
2013-01-31 11:19:23 +01:00
Diego Biurrun
c59211b437
x86: Simplify some arch conditionals
2013-01-29 00:10:53 +01:00
Michael Niedermayer
834e9fb056
x86: hpeldsp: Fix a typo, use the right register
...
This makes the code actually work.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-28 12:49:37 +02:00
Daniel Kang
05b0998f51
dsputil: Fix error by not using redzone and register name
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-28 07:23:20 +01:00
Daniel Kang
96753bd00d
dsputil: x86: Correct the number of registers used in put_no_rnd_pixels16_l2
...
put_no_rnd_pixels16_l2 allocated 5 instead of 6 registers.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-27 08:41:48 +01:00
Daniel Kang
0eedf5d74d
dsputil: add missing HAVE_YASM guard
...
Fix compile error under
"--disable-optimizations --disable-yasm --disable-inline-asm"
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-27 08:41:46 +01:00
Daniel Kang
71155d7b41
dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-27 06:45:31 +01:00
Ronald S. Bultje
f90ff772e7
Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm.
2013-01-26 20:35:42 -08:00
Diego Biurrun
033a86f9bb
x86: h264qpel: Move stray comment to the right spot and clarify it
2013-01-26 11:19:22 +01:00
Janne Grunau
c5c2060cf5
x86: h264qpel: add cpu flag checks for init function
...
The code was copied from the per-CPU-extension init functions, so the
checks for supported extensions were overlooked.
2013-01-24 19:03:59 +01:00
Mans Rullgard
e9d817351b
dsputil: Separate h264 qpel
...
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Ronald S. Bultje
baf35bb4bc
dsputil: remove one array dimension from avg_no_rnd_pixels_tab.
2013-01-22 18:41:36 -08:00
Ronald S. Bultje
32ff643228
dsputil: remove avg_no_rnd_pixels8.
...
This is never used.
2013-01-22 18:41:36 -08:00
Diego Biurrun
88bd7fdc82
Drop DCTELEM typedef
...
It does not help as an abstraction and adds dsputil dependencies.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Ronald S. Bultje
2e4bb99f4d
vorbisdsp: convert x86 simd functions from inline asm to yasm.
2013-01-22 18:02:24 -08:00
Ronald S. Bultje
d56668bd80
floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
...
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
42d3246948
floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
...
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00