plepere
63832e01c3
hvcodec/x86/hevcdsp: make macros more modular to support functions that are not sse4
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-09 00:14:50 +02:00
Matt Oliver
1898c2f49d
inline asm: fix arrays as named constraints.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-07 15:02:45 +02:00
Michael Niedermayer
fc7d0d8201
avcodec/x86/hevcdsp_init: fix SSE4 checks
...
Found-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:27:49 +02:00
Michael Niedermayer
7be230b5fa
avcodec/x86/Makefile: remove duplicate line
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:42 +02:00
Michael Niedermayer
3b3db02f2e
avcodec/x86/hevcdsp_init: fix build on 32bit
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:42 +02:00
plepere
7a2491c436
HEVC : added assembly MC functions
...
pretty print x86
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:36 +02:00
Matt Oliver
ac9869ffb0
x86/mpegaudiodsp.c: msvc compilation error without sse/avx_external
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 14:15:02 +02:00
Michael Niedermayer
ebf2c2c3a8
avcodec/lossless_videodsp: fix incompatible pointer type warning
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-05 05:49:18 +02:00
Matt Oliver
3c3e02b8d1
x86/cavdsp: prevent named constraints appearing twice.
2014-05-03 17:47:55 +02:00
James Almer
5ac10d40fb
x86/mpegaudiodsp: define apply_window_mp3 as SSE
...
None of the handwritten asm in this function seems to be SSE2
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 00:38:01 +02:00
Hendrik Leppkes
5809c2a99d
vc1dsp: fix build without inline asm
...
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-22 14:01:53 +02:00
Clément Bœsch
62d31307c1
avcodec/x86/vp9lpf: add a comment above a bunch of SWAP.
2014-04-20 21:33:58 +02:00
Clément Bœsch
f0d368d758
avcodec/x86/vp9lpf: merge a few movs with other instructions.
2014-04-20 21:29:11 +02:00
Christophe Gisquet
319235c67c
vc1dsp: introduce cases for 8x8 and 16x16
...
This allows further unrolling the DSP implementation where possible.
x86 and ARM DSP modified by simply moving the multiple calls from vc1dec
to the DSP code. Decoding improvements should only occurs because of the
compiler actually able to unroll more.
Decoding time: ~8.80s -> 8.64s (ie around 2%)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-20 18:25:36 +02:00
Clément Bœsch
010732b73a
vp9/x86: simplify FILTER_INIT.
...
In the 2 FILTER_INIT usages, the source is already preloaded so that
extra complexity taken from FILTER_UPDATE is not necessary.
Also add forgotten "mask" argument in FILTER_{INIT,UPDATE} comments.
2014-04-19 17:30:33 +02:00
Clément Bœsch
b8d002dc95
vp9/x86: clarify mixed splatb.
2014-04-19 17:00:51 +02:00
Carl Eugen Hoyos
b38910c979
Fix compilation with !HAVE_6REGS.
...
Can be tested with:
$ ./configure --cc='cc -m32' --disable-optimizations --enable-pic
2014-04-19 09:56:01 +02:00
Carl Eugen Hoyos
72c93abaad
Use MANGLE in cavsdsp.c to save two registers using gcc.
...
Fixes compilation with !HAVE_6REGS.
2014-04-19 09:54:26 +02:00
James Almer
197fe392db
x86/dsputil: use HADDD where applicable
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:35 +02:00
James Almer
76ed71a72b
x86: move horizontal add macros to x86util
...
Also port relevant AVX2/XOP optimizations from x264 with permission
to relicense to LGPL from the corresponding authors
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:09 +02:00
Michael Niedermayer
46d5625f44
avcodec/x86/idct_sse2_xvid: fix non C99 inline function
...
Found-by: Matt Oliver <protogonoi@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 18:04:57 +02:00
James Almer
0f524b6c69
x86/synth_filter: remove the fma3 version ifdefs
...
This fixes compilation failures with --disable-fma3
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-13 11:29:28 +02:00
Timothy Gu
71c32ed533
DNxHD: convert inline asm to yasm
2014-04-11 12:09:09 +02:00
Timothy Gu
676856204b
DNxHD: make get_pixel_8x4_sym accept ptrdiff_t as stride
2014-04-11 12:09:09 +02:00
Matt Oliver
d1e6e5c887
avcodec/x86: Exclude broken get_cabac under icl.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-10 17:47:22 +02:00
Matt Oliver
158a80cc0b
Remove leal op to fix icl inline asm.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-07 13:02:54 +02:00
Hendrik Leppkes
fc7e02f0ff
dcadsp: fix SSE code to not use SSE2 instructions.
...
movq from SSE register to memory is an SSE2 instruction.
Instead, use SSE movlps, which does the same thing.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-06 18:31:22 +02:00
Michael Niedermayer
e6f69b324e
Merge commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5'
...
* commit '57b5b84e208ad61ffdd74ad849bed212deb92bc5':
x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:36:21 +02:00
Michael Niedermayer
e3c3f277a9
Merge commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99'
...
* commit 'c2c5be57494e6117086771bca34c8cd4c72c8e99':
x86: h264_qpel: Simplify an #if conditional
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:30:44 +02:00
Michael Niedermayer
ebb21887b8
Merge commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8'
...
* commit '01c5779f56cf708e6cb88b11cfdc248cae7e2ee8':
x86: Drop some unnecessary YASM ifdefs
Conflicts:
libavfilter/x86/vf_yadif_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:16:39 +02:00
Michael Niedermayer
874f27a8f7
Merge commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647'
...
* commit 'b42f49e42f8cde25a788b2d13d03e99ca2956647':
x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 19:05:00 +02:00
Michael Niedermayer
5440151fa4
Merge commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d'
...
* commit '3dc6272bed7890a49080e18eacf3c7a4a6594b0d':
Remove a number of unnecessary dsputil.h #includes
Conflicts:
libavcodec/h264pred.c
libavcodec/vc1dsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 18:54:15 +02:00
James Almer
a1ac12bddd
x86/dcadsp: add ff_dca_lfe_fir0_fma3
...
~10% faster than the SSE version.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 13:55:59 +02:00
James Almer
7d2116dd09
x86/synth_filter: compile avx and fma3 functions unconditionally
...
Fixes compilation failures with "--disable-{avx,fma3} --disable-optimizations"
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 05:15:27 +02:00
Michael Niedermayer
490d53e335
avcodec/x86/dcadsp_init: fix compilation failure without FMA3
...
alternatively the call could be put under #if or the #if
over the function removed
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-05 00:11:48 +02:00
Michael Niedermayer
51fd962c0b
Merge commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3'
...
* commit 'c74b86699c86bdf62e8570f41d8a38be5710baa3':
x86/synth_filter: add synth_filter_fma3
x86/synth_filter: add synth_filter_avx
x86/synth_filter: add synth_filter_sse
Conflicts:
libavcodec/x86/dcadsp.asm
libavcodec/x86/dcadsp_init.c
See: 64672098361361cd15d37e36f747ab44de5b80ca
See: 68c3ed936a76c3ff7738f602fa90237ac7e3ce08
See: 7fd64e3e36f79204c0eda7cacce6884c14ddc1fb
See: aa1f38015cb0d04a5c50a8957dd7aba79f0d8882
See: dfd865e51b890d9be394804bccddf55198f4a251
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 23:40:08 +02:00
Christophe Gisquet
dfd865e51b
x86/synth_filter: remove the main loop when it's not needed
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 22:35:45 +02:00
Diego Biurrun
57b5b84e20
x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong
2014-04-04 19:08:05 +02:00
Diego Biurrun
c2c5be5749
x86: h264_qpel: Simplify an #if conditional
...
The extra conditions are covered by previous #ifs and conditional compilation.
2014-04-04 19:08:05 +02:00
Diego Biurrun
01c5779f56
x86: Drop some unnecessary YASM ifdefs
...
Dead code elimination is enough to avoid undefined references in these cases.
2014-04-04 19:08:05 +02:00
Diego Biurrun
b42f49e42f
x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes
2014-04-04 19:08:05 +02:00
Diego Biurrun
3dc6272bed
Remove a number of unnecessary dsputil.h #includes
2014-04-04 19:08:05 +02:00
James Almer
c74b86699c
x86/synth_filter: add synth_filter_fma3
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
81e02fae6e
x86/synth_filter: add synth_filter_avx
...
Sandy Bridge Win64:
180 cycles in ff_synth_filter_inner_sse2
150 cycles in ff_synth_filter_inner_avx
Also switch some instructions to a three operand format to avoid
assembly errors with Yasm 1.1.0 or older.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
James Almer
2025d8026f
x86/synth_filter: add synth_filter_sse
...
Build only on x86_32 targets.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-04-04 17:40:51 +02:00
Michael Niedermayer
fb61ed1e9f
Merge commit 'ac4b32df71bd932838043a4838b86d11e169707f'
...
* commit 'ac4b32df71bd932838043a4838b86d11e169707f':
On2 VP7 decoder
Conflicts:
Changelog
libavcodec/arm/h264pred_init_arm.c
libavcodec/arm/vp8dsp.h
libavcodec/arm/vp8dsp_init_arm.c
libavcodec/arm/vp8dsp_init_armv6.c
libavcodec/arm/vp8dsp_init_neon.c
libavcodec/avcodec.h
libavcodec/h264pred.c
libavcodec/version.h
libavcodec/vp8.c
libavcodec/vp8.h
libavcodec/vp8data.h
libavcodec/vp8dsp.c
libavcodec/vp8dsp.h
libavcodec/x86/h264_intrapred_init.c
libavcodec/x86/vp8dsp_init.c
See: 89f2f5dbd7a23e7ec1073d3c08d46093a01a4135 and others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-04 14:46:10 +02:00
Peter Ross
ac4b32df71
On2 VP7 decoder
...
Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-04-04 04:00:11 +02:00
Matt Oliver
0f2588d7e5
Use intel compliant CDQ instead of CLTD in inline asm.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-30 23:14:36 +02:00
Clément Bœsch
c4148a6668
x86/vp9mc: add vp9 namespace.
2014-03-29 18:13:15 +01:00
Timothy Gu
9d34dce05b
x86: convert DNxHDenc inline asm to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-27 23:16:17 +01:00