13 Commits

Author SHA1 Message Date
Ronald S. Bultje
52f84d82bd videodsp: don't overread edges in vfix3 emu_edge.
Fixes trac ticket 3226. Also see Andreas' analysis in
https://bugs.debian.org/801745, which was very helpful.
2015-10-24 14:34:50 -04:00
James Almer
70277d1d23 x86/videodsp: add ff_emu_edge_{hfix,hvar}_avx2
~15% faster than sse2.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-24 16:12:55 -03:00
Michael Niedermayer
5bca5f87d1 Revert "x86/videodsp: add emulated_edge_mc_mmxext"
The commit causes minor out-of-array reads and was mainly intended for
future optimizations, which turned out not to be measurably faster.
By itself it was just 1 CPU cycle faster.

Approved-by: jamrial

This reverts commit 057d2704e7.
2014-06-28 05:39:07 +02:00
James Almer
057d2704e7 x86/videodsp: add emulated_edge_mc_mmxext
This also changes hfix8_mmx and above to use mmx regs instead of
gprs, and makes emulated_edge_mc_sse and emulated_edge_mc_sse2 use
mmxext hfix and hvar functions instead of mmx where possible.

This is mostly in preparation for an ssse3 version.

Signed-off-by: James Almer <jamrial@gmail.com>

The code is approximately 1 CPU cycle faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-26 17:58:57 +02:00
Ronald S. Bultje
960490c0b2 avcodec/x86/videodsp: Small speedups in ff_emulated_edge_mc x86 SIMD.
Don't use word-size multiplications if size == 2, and if we're using
SIMD instructions (size >= 8), complete leftover 4byte sets using movd,
not mov. Both of these changes lead to minor speedups.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-27 15:02:48 +01:00
Ronald S. Bultje
cd86eb265f avcodec/x86/videodsp: fix a bug in a %if statement where we used '%%' instead of '&&'.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-27 15:02:48 +01:00
Ronald S. Bultje
1b3a7e1f42 avcodec/x86/videodsp: Properly mark sse2 instructions in emulated_edge_mc x86 simd as such.
Should fix crashes or corrupt output on pre-SSE2 CPUs when they were
using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in
hfix or hvar single-edge (left/right) extension functions.

Tested-by: Ingo Brückl <ib@wupperonline.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-24 13:36:55 +02:00
Ronald S. Bultje
20d78a8606 libavcodec/x86: Fix emulated_edge_mc SSE code to not contain SSE2 instructions on x86-32.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-10 13:36:06 +02:00
Ronald S. Bultje
face578d56 Rewrite emu_edge functions to have separate src/dst_stride arguments.
This allows supporting files for which the image stride is smaller than
the max. block size + number of subpel mc taps, e.g. a 64x64 VP9 file
or a 16x16 VP8 file with -fflags +emu_edge.
2013-09-28 20:28:08 -04:00
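
The idea behind face578d56 can be illustrated with a minimal scalar sketch. This is not FFmpeg's implementation or its function signature: `emu_edge_sketch` and `clamp_int` are hypothetical names, and here `src` is assumed to point at the image origin. The sketch only shows why separate strides help: the destination buffer can be wider than one row of the source image, so the block may be larger than the image stride.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): clamp v into [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Illustrative sketch only. Copies a block_w x block_h block whose top-left
 * corner is at (src_x, src_y) in a w x h image into dst. Coordinates that
 * fall outside the image are clamped, which replicates the nearest edge
 * pixel. dst and src use independent strides, so dst_stride may exceed
 * src_stride (for example an 8-wide block taken from a 4-wide image).
 * Assumption: src points at pixel (0, 0) of the image.
 */
static void emu_edge_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src, ptrdiff_t src_stride,
                            int block_w, int block_h,
                            int src_x, int src_y, int w, int h)
{
    for (int y = 0; y < block_h; y++) {
        int sy = clamp_int(src_y + y, 0, h - 1);      /* clamp row */
        for (int x = 0; x < block_w; x++) {
            int sx = clamp_int(src_x + x, 0, w - 1);  /* clamp column */
            dst[y * dst_stride + x] = src[sy * src_stride + sx];
        }
    }
}
```

With a single shared stride, the 8-wide destination rows in such a case would overlap each other, which is the situation the commit message describes for small VP8 and VP9 files.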
Michael Niedermayer
63a97d5674 Merge commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa'
* commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa':
  cosmetics: Remove unnecessary extern keywords from function declarations

Conflicts:
	libswscale/x86/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-28 11:20:41 +01:00
Diego Biurrun
b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations
2013-03-27 14:21:45 +01:00
Michael Niedermayer
e16bac7b33 videodsp: Fix project name
These are all part of the dsputil functions split out from FFmpeg.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-22 00:58:08 +01:00
Ronald S. Bultje
8c53d39e7f lavc: introduce VideoDSPContext
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00