1569 Commits

Author SHA1 Message Date
Matt Oliver
3c3e02b8d1 x86/cavdsp: prevent named constraints appearing twice. 2014-05-03 17:47:55 +02:00
James Almer
5ac10d40fb x86/mpegaudiodsp: define apply_window_mp3 as SSE
None of the handwritten asm in this function seems to be SSE2

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 00:38:01 +02:00
Hendrik Leppkes
5809c2a99d vc1dsp: fix build without inline asm
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-22 14:01:53 +02:00
Clément Bœsch
62d31307c1 avcodec/x86/vp9lpf: add a comment above a bunch of SWAP. 2014-04-20 21:33:58 +02:00
Clément Bœsch
f0d368d758 avcodec/x86/vp9lpf: merge a few movs with other instructions. 2014-04-20 21:29:11 +02:00
Christophe Gisquet
319235c67c vc1dsp: introduce cases for 8x8 and 16x16
This allows further unrolling the DSP implementation where possible.

x86 and ARM DSP modified by simply moving the multiple calls from vc1dec
to the DSP code. Decoding improvements should only occur because the
compiler is actually able to unroll more.

Decoding time: ~8.80s -> 8.64s (i.e. around 2%)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-20 18:25:36 +02:00
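
The idea in the commit above, moving the repeated 8x8 calls out of the decoder and into a dedicated 16x16 DSP entry point, can be sketched roughly as follows. This is only an illustration under stated assumptions: `add_dc_8x8` and `add_dc_16x16` are hypothetical names with a made-up kernel, not the actual vc1dsp functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8x8 kernel (NOT the real VC-1 code): add a DC offset to
 * every pixel of an 8x8 block, clamping the result to 0..255. */
static void add_dc_8x8(uint8_t *dst, ptrdiff_t stride, int dc)
{
    assert(dst != NULL);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int v = dst[x] + dc;
            dst[x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
        dst += stride;
    }
}

/* Hypothetical 16x16 case: the four 8x8 calls that the caller would
 * otherwise issue itself now sit next to the kernel, so the compiler
 * sees them together and is free to inline and unroll them. */
static void add_dc_16x16(uint8_t *dst, ptrdiff_t stride, int dc)
{
    add_dc_8x8(dst,                  stride, dc); /* top-left     */
    add_dc_8x8(dst + 8,              stride, dc); /* top-right    */
    add_dc_8x8(dst + 8 * stride,     stride, dc); /* bottom-left  */
    add_dc_8x8(dst + 8 * stride + 8, stride, dc); /* bottom-right */
}
```

A caller would then invoke `add_dc_16x16(buf, stride, dc)` once instead of looping over four 8x8 blocks itself. Whether this gains anything depends on the compiler, as the commit message notes.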
Clément Bœsch
010732b73a vp9/x86: simplify FILTER_INIT.
In the 2 FILTER_INIT usages, the source is already preloaded so that
extra complexity taken from FILTER_UPDATE is not necessary.

Also add forgotten "mask" argument in FILTER_{INIT,UPDATE} comments.
2014-04-19 17:30:33 +02:00
Clément Bœsch
b8d002dc95 vp9/x86: clarify mixed splatb. 2014-04-19 17:00:51 +02:00
Carl Eugen Hoyos
b38910c979 Fix compilation with !HAVE_6REGS.
Can be tested with:
$ ./configure --cc='cc -m32' --disable-optimizations --enable-pic
2014-04-19 09:56:01 +02:00
Carl Eugen Hoyos
72c93abaad Use MANGLE in cavsdsp.c to save two registers using gcc.
Fixes compilation with !HAVE_6REGS.
2014-04-19 09:54:26 +02:00
James Almer
197fe392db x86/dsputil: use HADDD where applicable
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:35 +02:00
James Almer
76ed71a72b x86: move horizontal add macros to x86util
Also port relevant AVX2/XOP optimizations from x264 with permission
to relicense to LGPL from the corresponding authors

Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:09 +02:00
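
For readers unfamiliar with the term, a horizontal add sums the lanes of a single SIMD register into one value. The scalar model below is a sketch only, assuming a 128-bit register of four 32-bit lanes; `hadd_dwords` is a hypothetical name, and the real macros in x86util are assembly, not C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a horizontal add over four 32-bit lanes.
 * Unsigned arithmetic is used so that overflow wraps modulo 2^32, as
 * packed integer adds do. */
static uint32_t hadd_dwords(const uint32_t lanes[4])
{
    uint32_t t[2];

    assert(lanes != NULL);
    /* Step 1: fold the high half of the register onto the low half. */
    t[0] = lanes[0] + lanes[2];
    t[1] = lanes[1] + lanes[3];
    /* Step 2: fold the remaining two lanes into a single sum. */
    return t[0] + t[1];
}
```

The two folding steps mirror the usual shuffle-then-add pattern: each step halves the number of live lanes.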
Michael Niedermayer
46d5625f44 avcodec/x86/idct_sse2_xvid: fix non C99 inline function
Found-by: Matt Oliver <protogonoi@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 18:04:57 +02:00
Matt Oliver
d1e6e5c887 avcodec/x86: Exclude broken get_cabac under icl.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-10 17:47:22 +02:00
Matt Oliver
158a80cc0b Remove leal op to fix icl inline asm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-07 13:02:54 +02:00
Hendrik Leppkes
fc7e02f0ff dcadsp: fix SSE code to not use SSE2 instructions.
movq from SSE register to memory is an SSE2 instruction.
Instead, use SSE movlps, which does the same thing.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-06 18:31:22 +02:00
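
The same distinction shows up at the intrinsics level. The sketch below stores the low 64 bits of a vector using only the SSE header; `store_low_two_floats` is a hypothetical helper, not code from dcadsp, and the fallback branch exists only so the example builds on targets without SSE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics only; no SSE2 header is needed */
#endif

/* Hypothetical helper: write the low two floats (64 bits) of a
 * four-float vector to dst. */
static void store_low_two_floats(float *dst, const float *src4)
{
    assert(dst != NULL && src4 != NULL);
#if defined(__SSE__)
    __m128 v = _mm_loadu_ps(src4);  /* unaligned load of four floats */
    _mm_storel_pi((__m64 *)dst, v); /* SSE store of the low 64 bits;
                                       typically compiles to movlps */
#else
    memcpy(dst, src4, 2 * sizeof(float)); /* portable fallback */
#endif
}
```

The SSE2 counterpart would be `_mm_storel_epi64` from `<emmintrin.h>`, which corresponds to the `movq` form the commit removes.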
Michael Niedermayer
e6f69b324e Merge commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5'
* commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5':
  x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:36:21 +02:00
Michael Niedermayer
e3c3f277a9 Merge commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99'
* commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99':
  x86: h264_qpel: Simplify an #if conditional

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:30:44 +02:00
Michael Niedermayer
ebb21887b8 Merge commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8'
* commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8':
  x86: Drop some unnecessary YASM ifdefs

Conflicts:
	libavfilter/x86/vf_yadif_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:16:39 +02:00
Michael Niedermayer
874f27a8f7 Merge commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647'
* commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647':
  x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:05:00 +02:00
Michael Niedermayer
5440151fa4 Merge commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d'
* commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d':
  Remove a number of unnecessary dsputil.h #includes

Conflicts:
	libavcodec/h264pred.c
	libavcodec/vc1dsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 18:54:15 +02:00
James Almer
a1ac12bddd x86/dcadsp: add ff_dca_lfe_fir0_fma3
~10% faster than the SSE version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 13:55:59 +02:00
James Almer
7d2116dd09 x86/synth_filter: compile avx and fma3 functions unconditionally
Fixes compilation failures with "--disable-{avx,fma3} --disable-optimizations"

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 05:15:27 +02:00
Michael Niedermayer
490d53e335 avcodec/x86/dcadsp_init: fix compilation failure without FMA3
Alternatively, the call could be put under #if, or the #if
around the function could be removed.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 00:11:48 +02:00
Michael Niedermayer
51fd962c0b Merge commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3'
* commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3':
  x86/synth_filter: add synth_filter_fma3
  x86/synth_filter: add synth_filter_avx
  x86/synth_filter: add synth_filter_sse

Conflicts:
	libavcodec/x86/dcadsp.asm
	libavcodec/x86/dcadsp_init.c

See: 6467209836
See: 68c3ed936a
See: 7fd64e3e36
See: aa1f38015c
See: dfd865e51b
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 23:40:08 +02:00
Christophe Gisquet
dfd865e51b x86/synth_filter: remove the main loop when it's not needed
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 22:35:45 +02:00
Diego Biurrun
57b5b84e20 x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong 2014-04-04 19:08:05 +02:00
Diego Biurrun
c2c5be5749 x86: h264_qpel: Simplify an #if conditional
The extra conditions are covered by previous #ifs and conditional compilation.
2014-04-04 19:08:05 +02:00
Diego Biurrun
01c5779f56 x86: Drop some unnecessary YASM ifdefs
Dead code elimination is enough to avoid undefined references in these cases.
2014-04-04 19:08:05 +02:00
Diego Biurrun
b42f49e42f x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes 2014-04-04 19:08:05 +02:00
Diego Biurrun
3dc6272bed Remove a number of unnecessary dsputil.h #includes 2014-04-04 19:08:05 +02:00
James Almer
c74b86699c x86/synth_filter: add synth_filter_fma3
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
81e02fae6e x86/synth_filter: add synth_filter_avx
Sandy Bridge Win64:
180 cycles in ff_synth_filter_inner_sse2
150 cycles in ff_synth_filter_inner_avx

Also switch some instructions to a three operand format to avoid
assembly errors with Yasm 1.1.0 or older.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
2025d8026f x86/synth_filter: add synth_filter_sse
Build only on x86_32 targets.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
Michael Niedermayer
fb61ed1e9f Merge commit 'ac4b32df71bd932838043a4838b86d11e169707f'
* commit 'ac4b32df71bd932838043a4838b86d11e169707f':
  On2 VP7 decoder

Conflicts:
	Changelog
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/arm/vp8dsp.h
	libavcodec/arm/vp8dsp_init_arm.c
	libavcodec/arm/vp8dsp_init_armv6.c
	libavcodec/arm/vp8dsp_init_neon.c
	libavcodec/avcodec.h
	libavcodec/h264pred.c
	libavcodec/version.h
	libavcodec/vp8.c
	libavcodec/vp8.h
	libavcodec/vp8data.h
	libavcodec/vp8dsp.c
	libavcodec/vp8dsp.h
	libavcodec/x86/h264_intrapred_init.c
	libavcodec/x86/vp8dsp_init.c

See: 89f2f5dbd7 and others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 14:46:10 +02:00
Peter Ross
ac4b32df71 On2 VP7 decoder
Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-04-04 04:00:11 +02:00
Matt Oliver
0f2588d7e5 Use Intel-compliant CDQ instead of CLTD in inline asm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-30 23:14:36 +02:00
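
CLTD (AT&T spelling) and CDQ (Intel spelling) name the same instruction, so the change above should affect only which assemblers accept the source, not what the code does. What the instruction computes can be modelled in C as below; `cdq_model` is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of CDQ / CLTD: sign-extend EAX into EDX:EAX.
 * EAX is left unchanged; EDX becomes all ones if EAX is negative and
 * zero otherwise. */
static void cdq_model(int32_t eax, uint32_t *edx)
{
    assert(edx != NULL);
    *edx = eax < 0 ? UINT32_MAX : 0u;
}
```

The EDX:EAX pair then holds the 64-bit sign-extended value, which is the form a following signed divide expects for its dividend.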
Clément Bœsch
c4148a6668 x86/vp9mc: add vp9 namespace. 2014-03-29 18:13:15 +01:00
Timothy Gu
9d34dce05b x86: convert DNxHDenc inline asm to yasm
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 23:16:17 +01:00
Timothy Gu
cb11b9e89e dnxhdenc: make get_pixel_8x4_sym accept ptrdiff_t as stride
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 23:09:10 +01:00
Michael Niedermayer
4998a72b49 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init

Conflicts:
	libavcodec/x86/rnd_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 16:55:46 +01:00
Michael Niedermayer
0371eaebcd Merge commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f'
* commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f':
  Add missing headers to make template files compile (more) standalone

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 14:50:55 +01:00
Diego Biurrun
efc7290eb6 x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init
There is no point in having a separate file just for the instantiation
that provides the public functions.
2014-03-26 04:31:27 -07:00
Diego Biurrun
aba70bb538 Add missing headers to make template files compile (more) standalone 2014-03-26 04:31:27 -07:00
Diego Biurrun
d0aabeab23 x86: h264_qpel: Fix typo in CALL_2X_PIXELS macro invocation
This fixes FATE with mmxext CPUFLAGS set.
2014-03-26 12:00:01 +01:00
Peter Ross
a490970af2 libavcodec/*/vp8dsp_init: indent
Signed-off-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 13:29:29 +01:00
Peter Ross
89f2f5dbd7 On2 VP7 decoder
Signed-off-by: Peter Ross <pross@xvid.org>
Reviewed-by: BBB
previous patch reviewed by jason
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 13:29:05 +01:00
Michael Niedermayer
c25d2cd20b avcodec/x86/mpegvideoenc_template: fix integer overflow
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 00:15:52 +01:00
Michael Niedermayer
c8246d3766 avcodec/x86/h264_qpel: Fix typo introduced by 322a1dda97
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-23 15:04:53 +01:00
Michael Niedermayer
74fed968d1 Merge commit '82dd1026cfc1d72b04019185bea4c1c9621ace3f'
* commit '82dd1026cfc1d72b04019185bea4c1c9621ace3f':
  x86: dsputil: Move hpeldsp-related declarations to a separate header

Conflicts:
	libavcodec/x86/dsputil_x86.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-22 23:21:54 +01:00