Commit Graph

48 Commits

Author SHA1 Message Date
Michael Niedermayer
a066ff89bc swscale/x86/rgb2rgb_template: Fallback to mmx in interleaveBytes() if the alignment is insufficient for SSE*
This also as a sideeffect fixes the non aligned case

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-15 03:06:33 +01:00
Michael Niedermayer
80bfce35cc swscale/x86/rgb2rgb_template: Do not crash on misaligend stride
Fixes Ticket5013

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-15 02:32:23 +01:00
James Almer
e22edbfd41 swscale/x86/rgb2rgb_template: fix signedness of v in shuffle_bytes_2103_{mmx,mmxext}
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-23 13:28:09 -03:00
James Almer
0c15f2f158 swscale/x86/rgb2rgb_template: don't call emms on sse2/avx functions
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-23 13:28:03 -03:00
James Almer
910eeab480 swscale/x86/rgb2rgb_template: add missing xmm clobbers
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-23 13:27:56 -03:00
Michael Niedermayer
8524558858 swscale/x86/rgb2rgb_template: fix crash with tiny size and nv12 output
Fixes Ticket4151

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-12-03 20:21:56 +01:00
Michael Niedermayer
4388e78a0f swscale/x86/rgb2rgb_template: handle the first 2 lines with C in rgb24toyv12_*()
This avoids out of array accesses
Should fix Ticket3451

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-11-20 01:20:31 +01:00
Michael Niedermayer
1e3f77b53a swscale/x86/rgb2rgb_template: fix 1 byte overread in yuyvtoyuv420 and uyvytoyuv420
might fix ticket 3410

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-13 05:48:09 +02:00
Michael Niedermayer
0371eaebcd Merge commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f'
* commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f':
  Add missing headers to make template files compile (more) standalone

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 14:50:55 +01:00
Diego Biurrun
aba70bb538 Add missing headers to make template files compile (more) standalone 2014-03-26 04:31:27 -07:00
Matt Oliver
8236747511 Automatically change MANGLE() into named inline asm operands when direct symbol reference in inline asm are not supported.
This is part of the patch-set for intel C inline asm on windows support

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 23:39:30 +01:00
Michael Niedermayer
977abf9aed Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rgb2rgb_template: add MMX/SSE2/AVX-optimized deinterleaveBytes

Conflicts:
	libswscale/x86/rgb2rgb_template.c

See: 3033cd7555
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-21 22:20:26 +01:00
Michael Niedermayer
ef25595b71 Merge commit '7597e6efe492cb2449bb771054d64cc7fdf62ff5'
* commit '7597e6efe492cb2449bb771054d64cc7fdf62ff5':
  swscale/x86/rgb2rgb: add support for AVX

Conflicts:
	libswscale/x86/rgb2rgb_template.c

See: 4729b529e6
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-21 22:03:28 +01:00
Michael Niedermayer
91c981857b rgb2rgb_template: add MMX/SSE2/AVX-optimized deinterleaveBytes
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-01-21 18:03:41 +01:00
Michael Niedermayer
7597e6efe4 swscale/x86/rgb2rgb: add support for AVX
This does not yet include any actual AVX code

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-01-21 18:01:29 +01:00
Michael Niedermayer
f618cb1a4b swscale/x86/rgb2rgb_template: try to fix build failure with avx disabled
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-16 03:38:13 +01:00
Michael Niedermayer
3f4290a206 swscale/x86/rgb2rgb_template: try to fix build without AVX
Found-by: iive
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-14 02:22:44 +01:00
Michael Niedermayer
f836b0c581 swscale/x86: SIMD deinterleaveBytes() depends on YASM
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-19 22:36:27 +01:00
Michael Niedermayer
3033cd7555 swscale/x86/rgb2rgb_template: add mmx/sse2/avx optimized deinterleaveBytes
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-19 15:13:48 +01:00
Michael Niedermayer
4729b529e6 swscale/x86/rgb2rgb: extend framework to also include AVX
This does not yet include any actual AVX code

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-19 15:13:48 +01:00
Clément Bœsch
570d63eef3 lavu: add FF_CEIL_RSHIFT and use it in various places. 2013-05-09 16:59:42 +02:00
Michael Niedermayer
d5dbd84c9a Merge commit '2b677ffca54a5fbef9c8860841c32f28ecd68f70'
* commit '2b677ffca54a5fbef9c8860841c32f28ecd68f70':
  swscale: Add av_cold attributes to init functions missing them

Conflicts:
	libswscale/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-05 13:00:21 +02:00
Diego Biurrun
2b677ffca5 swscale: Add av_cold attributes to init functions missing them 2013-05-04 22:48:05 +02:00
Carl Eugen Hoyos
99818ac4d3 Fix libswscale compilation with --disable-optimizations on x86-32.
Fixes ticket #2477.
2013-04-18 12:47:16 +02:00
Michael Niedermayer
3950236332 sws/x86: update RENAME(rgb24toyv12)() to using the user provided rgb2yuv table
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-15 21:44:50 +02:00
Michael Niedermayer
920dd84bf1 sws/x86: remove 8bit rgb2yuv coefficient case for rgb24toyv12 special converter
This simplifies the code and improves quality at the expense of a slight
slowdown of a rarely used function (no fate test uses it).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-15 03:19:52 +02:00
Michael Niedermayer
a37fd7f957 sws: Update rgb24toyv12_c() to user supplied rgb2yuv tables
As the function arguments change, we also change the function name
to ensure that anyone using this (non public) function doesnt end
with hard to debug crashes. The new name also has a proper prefix.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-15 03:08:37 +02:00
Michael Niedermayer
78ec407d5a Merge commit '652f5185945c8405fc57aed353286858df8d066f'
* commit '652f5185945c8405fc57aed353286858df8d066f':
  x86: mmx2 ---> mmxext in comments and messages

Conflicts:
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 14:02:35 +01:00
Diego Biurrun
652f518594 x86: mmx2 ---> mmxext in comments and messages 2012-10-31 00:37:42 +01:00
Hans-Kristian Arntzen
f099fbf5f3 Remove redundant masks in STORE_BGR24_MMX.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-02 21:56:50 +02:00
Giorgio Vazzana
e6ee58fae6 libswscale: fix #endif comments in rgb2rgb_template.c
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-16 18:40:11 +02:00
Michael Niedermayer
7427d1ca4a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  g723.1: simplify scale_vector()
  g723.1: simplify normalize_bits()
  vda: cosmetics: fix Doxygen comment formatting
  vda: better frame allocation
  vda: Merge implementation into one file
  vda: support synchronous decoding
  vda: Reuse the bitstream buffer and reallocate it only if needed
  build: Factor out mpegvideo encoding dependencies to CONFIG_MPEGVIDEOENC
  avprobe: Include libm.h for the log2 fallback
  proresenc: use the edge emulation buffer
  rtmp: handle bytes read reports
  configure: Fix typo in mpeg2video/svq1 decoder dependency declaration
  Use log2(x) instead of log(x) / log(2)
  x86: swscale: fix fragile memory accesses
  x86: swscale: remove disabled code
  x86: yadif: fix asm with suncc
  x86: cabac: allow building with suncc
  x86: mlpdsp: avoid taking address of void
  ARM: intmath: use native-size return types for clipping functions

Conflicts:
	configure
	ffprobe.c
	libavcodec/Makefile
	libavcodec/g723_1.c
	libavcodec/v210dec.h
	libavcodec/vda.h
	libavcodec/vda_h264.c
	libavcodec/x86/cabac.h
	libavfilter/x86/yadif_template.c
	libswscale/x86/rgb2rgb_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-14 15:34:39 +02:00
Mans Rullgard
90540c2d5a x86: swscale: fix fragile memory accesses
To access data at multiple fixed offsets from a base address, this
code uses a single "m" operand and code of the form "32%0", relying on
the memory operand instantiation having no displacement, giving a final
result of the form "32(%rax)".  If the compiler uses a register and
displacement, e.g. "64(%rax)", the end result becomes "3264(%rax)",
which obviously does not work.

Replacing the "m" operands with "r" operands allows safe addition of a
displacement.  In theory, multiple memory operands could use a shared
base register with different index registers, "(%rax,%rbx)", potentially
making more efficient use of registers.  In the cases at hand, no such
sharing is possible since the addresses involved are entirely unrelated.

After this change, the code somewhat rudely accesses memory without
using a corresponding memory operand, which in some cases can lead to
unwanted "optimisations" of surrounding code.  However, the original
code also accesses memory not covered by a memory operand, so this is
not adding any defect not already present.  It is also hightly unlikely
that any such optimisations could be performed here since the memory
locations in questions are not accessed elsewhere in the same functions.

This fixes crashes with suncc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
Mans Rullgard
10b83cb653 x86: swscale: remove disabled code
This code has been disabled since 2003.  Nobody will ever look at
it again.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 14:51:52 +01:00
Michael Niedermayer
e776ee8f29 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavr: fix handling of custom mix matrices
  fate: force pix_fmt in lagarith-rgb32 test
  fate: add tests for lagarith lossless video codec.
  ARMv6: vp8: fix stack allocation with Apple's assembler
  ARM: vp56: allow inline asm to build with clang
  fft: 3dnow: fix register name typo in DECL_IMDCT macro
  x86: dct32: port to cpuflags
  x86: build: replace mmx2 by mmxext
  Revert "wmapro: prevent division by zero when sample rate is unspecified"
  wmapro: prevent division by zero when sample rate is unspecified
  lagarith: fix color plane inversion for YUY2 output.
  lagarith: pad RGB buffer by 1 byte.
  dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.

Conflicts:
	doc/APIchanges
	libavcodec/lagarith.c
	libavfilter/x86/gradfun.c
	libavutil/cpu.h
	libavutil/version.h
	libswscale/utils.c
	libswscale/version.h
	libswscale/x86/yuv2rgb.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-04 23:51:43 +02:00
Diego Biurrun
239fdf1b4a x86: build: replace mmx2 by mmxext
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
2012-08-03 22:51:05 +02:00
Michael Bradshaw
1b27b8bf6c MANGLEd swscale x86 asm to save registers
register starvation caused gcc4.2 to fail building 32 bit shared libs
on 64 bit OS X

Signed-off-by: Michael Bradshaw <mbradshaw@sorensonmedia.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-05 03:02:41 +02:00
Themaister
0827222b9c Use more accurate conversion for rgb15/16 to rgb24/32 (C/MMX).
Fate update by michael.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-09 01:58:22 +01:00
Michael Niedermayer
b8a43bc1b5 Merge remote-tracking branch 'qatar/master' into master
* qatar/master: (27 commits)
  ac3enc: fix LOCAL_ALIGNED usage in count_mantissa_bits()
  ac3dsp: do not use the ff_* prefix when referencing ff_ac3_bap_bits.
  ac3dsp: fix loop condition in ac3_update_bap_counts_c()
  ARM: unbreak build
  ac3enc: modify mantissa bit counting to keep bap counts for all values of bap instead of just 0 to 4.
  ac3enc: split mantissa bit counting into a separate function.
  ac3enc: store per-block/channel bap pointers by reference block in a 2D array rather than in the AC3Block struct.
  get_bits: add av_unused tag to cache variable
  sws: replace all long with int.
  ARM: aacdec: fix constraints on inline asm
  ARM: remove unnecessary volatile from inline asm
  ARM: add "cc" clobbers to inline asm where needed
  ARM: improve FASTDIV asm
  ac3enc: use LOCAL_ALIGNED macro
  APIchanges: fill in git hash for av_get_pix_fmt_name (0420bd7).
  lavu: add av_get_pix_fmt_name() convenience function
  cmdutils: remove OPT_FUNC2
  swscale: fix crash in bilinear scaling.
  vpxenc: add VP8E_SET_STATIC_THRESHOLD mapping
  webm: support stereo videos in matroska/webm muxer
  ...

Conflicts:
	Changelog
	cmdutils.c
	cmdutils.h
	doc/APIchanges
	doc/muxers.texi
	ffmpeg.c
	ffplay.c
	libavcodec/ac3enc.c
	libavcodec/ac3enc_float.c
	libavcodec/avcodec.h
	libavcodec/get_bits.h
	libavcodec/libvpxenc.c
	libavcodec/version.h
	libavdevice/libdc1394.c
	libavformat/matroskaenc.c
	libavutil/avutil.h
	libswscale/rgb2rgb.c
	libswscale/swscale.c
	libswscale/swscale_template.c
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-29 03:34:35 +02:00
Anton Khirnov
b8e893399f sws: replace all long with int.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-28 10:03:37 -04:00
Ronald S. Bultje
78046dadc3 rgb2rgb: remove duplicate mmx/mmx2/3dnow/sse2 functions.
Many functions have such a prefix, but do not actually use any
instructions or features from that set, thus giving the false
impression that swscale is highly optimized for a particular
system, whereas in reality it is not.
2011-05-28 11:41:32 +02:00
Ronald S. Bultje
522d65ba25 rgb2rgb: remove duplicate mmx/mmx2/3dnow/sse2 functions.
Many functions have such a prefix, but do not actually use any
instructions or features from that set, thus giving the false
impression that swscale is highly optimized for a particular
system, whereas in reality it is not.
2011-05-26 09:31:02 -04:00
Michael Niedermayer
7dc303a60e swscale: Eliminate rgb24toyv12_c() duplication.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-26 00:56:06 +02:00
Michael Niedermayer
4a056160be swscale: remove duplicatiopn of rgb24toyv12_c()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-25 20:58:59 +02:00
Michael Niedermayer
d1adad3cca Merge swscale bloatup
This will be cleaned up in the next merge

Authorship / merged commits:
commit f668afd489
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 09:12:34 2011 +0200

    swscale: fix "ISO C90 forbids mixed declarations and code" warning

    only hit with --enable-runtime-cpudetect

commit 7f2ae5c7af
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 02:09:44 2011 +0200

    swscale: fix compilation with --enable-runtime-cpudetect

commit b6cad3df82
Author: Janne Grunau <janne-libav@jannau.net>
Date:   Fri Apr 15 00:31:04 2011 +0200

    swscale: correct include path to fix ppc altivec build

commit 6216fc70b7
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Thu Apr 14 22:03:45 2011 +0200

    swscale: simplify rgb2rgb templating

    MMX is always built. Drop the ifdefs

commit 33a0421bba
Author: Josh Allmann <joshua.allmann@gmail.com>
Date:   Wed Apr 13 20:57:32 2011 +0200

    swscale: simplify initialization code

    Simplify the fallthrough case when no accelerated functions
    can be initialized.

commit 735bf19511
Author: Josh Allmann <joshua.allmann@gmail.com>
Date:   Wed Apr 13 20:57:31 2011 +0200

    swscale: further cleanup swscale.c

    Move x86-specific constants out of swscale.c

commit 86330b4c92
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Wed Apr 13 20:57:30 2011 +0200

    swscale: partially move the arch specific code left

    PPC and x86 code is split off from swscale_template.c. Lots of code is
    still duplicated and should be removed later.

    Again uniformize the init system to be more similar to the dsputil one.

    Unset h*scale_fast in the x86 init in order to make the output
    consistent with the previous status. Thanks to Josh for spotting it.

commit c003832883
Author: Luca Barbato <lu_zero@gentoo.org>
Date:   Wed Apr 13 20:57:29 2011 +0200

    swscale: move away x86 specific code from rgb2rgb

    Keep only the plain C code in the main rgb2rgb.c and move the x86
    specific optimizations to x86/rgb2rgb.c
    Change the initialization pattern a little so some of it can be
    factorized to behave more like dsputils.

Conflicts:
	libswscale/rgb2rgb.c
	libswscale/swscale_template.c
2011-05-25 06:24:55 +02:00
Luca Barbato
6216fc70b7 swscale: simplify rgb2rgb templating
MMX is always built. Drop the ifdefs
2011-04-14 22:16:47 +02:00
Luca Barbato
c003832883 swscale: move away x86 specific code from rgb2rgb
Keep only the plain C code in the main rgb2rgb.c and move the x86
specific optimizations to x86/rgb2rgb.c
Change the initialization pattern a little so some of it can be
factorized to behave more like dsputils.
2011-04-14 22:16:47 +02:00
Giorgio Vazzana
1e6072139b swscale: x86: fix #endif comments in rgb2rgb template file
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-19 21:50:09 +02:00