Matt Oliver
3c3e02b8d1
x86/cavsdsp: prevent named constraints appearing twice.
2014-05-03 17:47:55 +02:00
James Almer
5ac10d40fb
x86/mpegaudiodsp: define apply_window_mp3 as SSE
...
None of the handwritten asm in this function seems to be SSE2
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 00:38:01 +02:00
Hendrik Leppkes
5809c2a99d
vc1dsp: fix build without inline asm
...
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-22 14:01:53 +02:00
Clément Bœsch
62d31307c1
avcodec/x86/vp9lpf: add a comment above a bunch of SWAP.
2014-04-20 21:33:58 +02:00
Clément Bœsch
f0d368d758
avcodec/x86/vp9lpf: merge a few movs with other instructions.
2014-04-20 21:29:11 +02:00
Christophe Gisquet
319235c67c
vc1dsp: introduce cases for 8x8 and 16x16
...
This allows further unrolling of the DSP implementation where possible.
The x86 and ARM DSP code is modified by simply moving the multiple calls from
vc1dec to the DSP code. Decoding improvements should only occur because the
compiler is actually able to unroll more.
Decoding time: ~8.80s -> 8.64s (i.e. around 2%)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-20 18:25:36 +02:00
Clément Bœsch
010732b73a
vp9/x86: simplify FILTER_INIT.
...
In the 2 FILTER_INIT usages, the source is already preloaded so that
extra complexity taken from FILTER_UPDATE is not necessary.
Also add forgotten "mask" argument in FILTER_{INIT,UPDATE} comments.
2014-04-19 17:30:33 +02:00
Clément Bœsch
b8d002dc95
vp9/x86: clarify mixed splatb.
2014-04-19 17:00:51 +02:00
Carl Eugen Hoyos
b38910c979
Fix compilation with !HAVE_6REGS.
...
Can be tested with:
$ ./configure --cc='cc -m32' --disable-optimizations --enable-pic
2014-04-19 09:56:01 +02:00
Carl Eugen Hoyos
72c93abaad
Use MANGLE in cavsdsp.c to save two registers using gcc.
...
Fixes compilation with !HAVE_6REGS.
2014-04-19 09:54:26 +02:00
James Almer
197fe392db
x86/dsputil: use HADDD where applicable
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:35 +02:00
James Almer
76ed71a72b
x86: move horizontal add macros to x86util
...
Also port relevant AVX2/XOP optimizations from x264 with permission
to relicense to LGPL from the corresponding authors
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:09 +02:00
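Note on the horizontal add macros mentioned in the two commits above: a horizontal add reduces the lanes of one SIMD register to a single sum. The following is a minimal scalar sketch in C, assuming four 32-bit lanes. The function name hadd_dwords is hypothetical, and the real macros work on SIMD registers rather than arrays, so this only models the arithmetic.

```c
#include <stdint.h>

/* Scalar model of a horizontal add over four 32-bit lanes.
 * hadd_dwords is a hypothetical name, not a function from the code base. */
static int32_t hadd_dwords(const int32_t lanes[4])
{
    /* Step 1: fold the high half of the vector onto the low half.
     * Unsigned arithmetic gives the wrap-around behaviour of a packed add. */
    uint32_t lo0 = (uint32_t)lanes[0] + (uint32_t)lanes[2];
    uint32_t lo1 = (uint32_t)lanes[1] + (uint32_t)lanes[3];

    /* Step 2: add the two remaining partial sums. */
    return (int32_t)(lo0 + lo1);
}
```

The two-step fold mirrors how such a reduction is usually done with a shuffle followed by a packed add, repeated until one lane remains.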
Michael Niedermayer
46d5625f44
avcodec/x86/idct_sse2_xvid: fix non C99 inline function
...
Found-by: Matt Oliver <protogonoi@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 18:04:57 +02:00
Matt Oliver
d1e6e5c887
avcodec/x86: Exclude broken get_cabac under icl.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-10 17:47:22 +02:00
Matt Oliver
158a80cc0b
Remove leal op to fix icl inline asm.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-07 13:02:54 +02:00
Hendrik Leppkes
fc7e02f0ff
dcadsp: fix SSE code to not use SSE2 instructions.
...
movq from SSE register to memory is an SSE2 instruction.
Instead, use SSE movlps, which does the same thing.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-06 18:31:22 +02:00
Michael Niedermayer
e6f69b324e
Merge commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5'
...
* commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5':
x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:36:21 +02:00
Michael Niedermayer
e3c3f277a9
Merge commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99'
...
* commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99':
x86: h264_qpel: Simplify an #if conditional
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:30:44 +02:00
Michael Niedermayer
ebb21887b8
Merge commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8'
...
* commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8':
x86: Drop some unnecessary YASM ifdefs
Conflicts:
libavfilter/x86/vf_yadif_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:16:39 +02:00
Michael Niedermayer
874f27a8f7
Merge commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647'
...
* commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647':
x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:05:00 +02:00
Michael Niedermayer
5440151fa4
Merge commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d'
...
* commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d':
Remove a number of unnecessary dsputil.h #includes
Conflicts:
libavcodec/h264pred.c
libavcodec/vc1dsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 18:54:15 +02:00
James Almer
a1ac12bddd
x86/dcadsp: add ff_dca_lfe_fir0_fma3
...
~10% faster than the SSE version.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 13:55:59 +02:00
James Almer
7d2116dd09
x86/synth_filter: compile avx and fma3 functions unconditionally
...
Fixes compilation failures with "--disable-{avx,fma3} --disable-optimizations"
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 05:15:27 +02:00
Michael Niedermayer
490d53e335
avcodec/x86/dcadsp_init: fix compilation failure without FMA3
...
Alternatively, the call could be put under #if, or the #if
around the function could be removed.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 00:11:48 +02:00
Michael Niedermayer
51fd962c0b
Merge commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3'
...
* commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3':
x86/synth_filter: add synth_filter_fma3
x86/synth_filter: add synth_filter_avx
x86/synth_filter: add synth_filter_sse
Conflicts:
libavcodec/x86/dcadsp.asm
libavcodec/x86/dcadsp_init.c
See: 6467209836
See: 68c3ed936a
See: 7fd64e3e36
See: aa1f38015c
See: dfd865e51b
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 23:40:08 +02:00
Christophe Gisquet
dfd865e51b
x86/synth_filter: remove the main loop when it's not needed
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 22:35:45 +02:00
Diego Biurrun
57b5b84e20
x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong
2014-04-04 19:08:05 +02:00
Diego Biurrun
c2c5be5749
x86: h264_qpel: Simplify an #if conditional
...
The extra conditions are covered by previous #ifs and conditional compilation.
2014-04-04 19:08:05 +02:00
Diego Biurrun
01c5779f56
x86: Drop some unnecessary YASM ifdefs
...
Dead code elimination is enough to avoid undefined references in these cases.
2014-04-04 19:08:05 +02:00
Diego Biurrun
b42f49e42f
x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes
2014-04-04 19:08:05 +02:00
Diego Biurrun
3dc6272bed
Remove a number of unnecessary dsputil.h #includes
2014-04-04 19:08:05 +02:00
James Almer
c74b86699c
x86/synth_filter: add synth_filter_fma3
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
81e02fae6e
x86/synth_filter: add synth_filter_avx
...
Sandy Bridge Win64:
180 cycles in ff_synth_filter_inner_sse2
150 cycles in ff_synth_filter_inner_avx
Also switch some instructions to a three operand format to avoid
assembly errors with Yasm 1.1.0 or older.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
2025d8026f
x86/synth_filter: add synth_filter_sse
...
Build only on x86_32 targets.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
Michael Niedermayer
fb61ed1e9f
Merge commit 'ac4b32df71bd932838043a4838b86d11e169707f'
...
* commit 'ac4b32df71bd932838043a4838b86d11e169707f':
On2 VP7 decoder
Conflicts:
Changelog
libavcodec/arm/h264pred_init_arm.c
libavcodec/arm/vp8dsp.h
libavcodec/arm/vp8dsp_init_arm.c
libavcodec/arm/vp8dsp_init_armv6.c
libavcodec/arm/vp8dsp_init_neon.c
libavcodec/avcodec.h
libavcodec/h264pred.c
libavcodec/version.h
libavcodec/vp8.c
libavcodec/vp8.h
libavcodec/vp8data.h
libavcodec/vp8dsp.c
libavcodec/vp8dsp.h
libavcodec/x86/h264_intrapred_init.c
libavcodec/x86/vp8dsp_init.c
See: 89f2f5dbd7
and others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 14:46:10 +02:00
Peter Ross
ac4b32df71
On2 VP7 decoder
...
Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-04-04 04:00:11 +02:00
Matt Oliver
0f2588d7e5
Use Intel-compliant CDQ instead of CLTD in inline asm.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-30 23:14:36 +02:00
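Note on CDQ and CLTD: both mnemonics name the same x86 instruction (Intel and AT&T spelling respectively), which sign-extends EAX into the EDX:EAX register pair. The following is a small C model of the value that ends up in EDX. The function name cdq_high_half is hypothetical and only illustrates the semantics.

```c
#include <stdint.h>

/* Model of CDQ/CLTD: returns the value EDX holds after the instruction,
 * given the signed value in EAX. cdq_high_half is a hypothetical name. */
static uint32_t cdq_high_half(int32_t eax)
{
    /* EDX is filled with copies of the sign bit of EAX. */
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}
```

Since the operation is identical, switching the mnemonic should not change the generated code; it only affects which assemblers accept the inline asm.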
Clément Bœsch
c4148a6668
x86/vp9mc: add vp9 namespace.
2014-03-29 18:13:15 +01:00
Timothy Gu
9d34dce05b
x86: convert DNxHDenc inline asm to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 23:16:17 +01:00
Timothy Gu
cb11b9e89e
dnxhdenc: make get_pixel_8x4_sym accept ptrdiff_t as stride
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 23:09:10 +01:00
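Note on ptrdiff_t strides: a stride is added to a pointer, so a pointer-sized signed type fits it naturally, and on 64-bit targets it lets assembly use the argument directly without widening a 32-bit int first. The following is a minimal sketch in C under those assumptions. The helper sum_first_column is hypothetical and is not the signature of get_pixel_8x4_sym.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum the first pixel of each of `rows` rows.
 * `stride` is the distance in bytes between row starts and may be negative
 * (for example when walking a bottom-up image). */
static int sum_first_column(const uint8_t *pixels, ptrdiff_t stride, int rows)
{
    int sum = 0;
    for (int y = 0; y < rows; y++) {
        /* y * stride is computed in ptrdiff_t, so the offset is pointer-sized. */
        sum += pixels[y * stride];
    }
    return sum;
}
```

With a 4-byte-wide, 3-row buffer, calling the helper on the first row with stride 4 and on the last row with stride -4 visits the same three pixels.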
Michael Niedermayer
4998a72b49
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init
Conflicts:
libavcodec/x86/rnd_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 16:55:46 +01:00
Michael Niedermayer
0371eaebcd
Merge commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f'
...
* commit 'aba70bb5387f12dfa5e6cd8cb861c9c7e668151f':
Add missing headers to make template files compile (more) standalone
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-26 14:50:55 +01:00
Diego Biurrun
efc7290eb6
x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init
...
There is no point in having a separate file just for the instantiation
that provides the public functions.
2014-03-26 04:31:27 -07:00
Diego Biurrun
aba70bb538
Add missing headers to make template files compile (more) standalone
2014-03-26 04:31:27 -07:00
Diego Biurrun
d0aabeab23
x86: h264_qpel: Fix typo in CALL_2X_PIXELS macro invocation
...
This fixes FATE with mmxext CPUFLAGS set.
2014-03-26 12:00:01 +01:00
Peter Ross
a490970af2
libavcodec/*/vp8dsp_init: indent
...
Signed-off-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 13:29:29 +01:00
Peter Ross
89f2f5dbd7
On2 VP7 decoder
...
Signed-off-by: Peter Ross <pross@xvid.org>
Reviewed-by: BBB
previous patch reviewed by jason
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 13:29:05 +01:00
Michael Niedermayer
c25d2cd20b
avcodec/x86/mpegvideoenc_template: fix integer overflow
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-25 00:15:52 +01:00
Michael Niedermayer
c8246d3766
avcodec/x86/h264_qpel: Fix typo introduced by 322a1dda97
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-23 15:04:53 +01:00
Michael Niedermayer
74fed968d1
Merge commit '82dd1026cfc1d72b04019185bea4c1c9621ace3f'
...
* commit '82dd1026cfc1d72b04019185bea4c1c9621ace3f':
x86: dsputil: Move hpeldsp-related declarations to a separate header
Conflicts:
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-22 23:21:54 +01:00