1509 Commits

Author SHA1 Message Date
Michael Niedermayer
1cd107f637 avcodec/x86/snowdsp: add missing clobbers to inner_add_yblock_bw_8_obmc_16_bh_even_sse2() and inner_add_yblock_bw_16_obmc_32_sse2()
Note, these functions are currently disabled

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-21 03:38:48 +01:00
Michael Niedermayer
e98bac82e5 Merge commit '82bb3048013201c0095d2853d4623633d912252f'
* commit '82bb3048013201c0095d2853d4623633d912252f':
  dsputil: Use correct type in me_cmp_func function pointer

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 22:36:40 +01:00
Michael Niedermayer
011d83de48 Merge commit '0e083d7e43805db1a978cb57bfa25fda62e8ff18'
* commit '0e083d7e43805db1a978cb57bfa25fda62e8ff18':
  build: Group general components separate from de/encoders in arch Makefiles

Conflicts:
	libavcodec/arm/Makefile
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 22:26:31 +01:00
Michael Niedermayer
ba85bfabf3 Merge commit '5169e688956be3378adb3b16a93962fe0048f1c9'
* commit '5169e688956be3378adb3b16a93962fe0048f1c9':
  dsputil: Propagate bit depth information to all (sub)init functions

Conflicts:
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv5te.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/arm/dsputil_init_neon.c
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/x86/dsputil_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 22:06:01 +01:00
Diego Biurrun
82bb304801 dsputil: Use correct type in me_cmp_func function pointer 2014-03-20 05:03:23 -07:00
Diego Biurrun
0e083d7e43 build: Group general components separate from de/encoders in arch Makefiles
This is in line with how the top-level libavcodec Makefile is structured.
2014-03-20 05:03:23 -07:00
Diego Biurrun
5169e68895 dsputil: Propagate bit depth information to all (sub)init functions
This avoids recalculating the value over and over again.
2014-03-20 05:03:23 -07:00
Carl Eugen Hoyos
57fdc74c34 Add one forgotten named inline asm operand in libavcodec/x86/motion_est.c. 2014-03-19 03:00:19 +01:00
Matt Oliver
8236747511 Automatically change MANGLE() into named inline asm operands when direct symbol reference in inline asm are not supported.
This is part of the patch-set for intel C inline asm on windows support

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 23:39:30 +01:00
Matt Oliver
b2d3a45598 avcodec/x86/mlpdsp: Only use asm when non-local inline asm lables are supported
This is part of the patch-set for intel C inline asm on windows support

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 23:37:50 +01:00
James Almer
aa1f38015c x86/synth_filter: improve FMA version
Replace mulps+subps with fnmaddps, resulting in two less instructions inside the
inner loops.
About 1% faster FMA3 performance.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-17 21:04:15 +01:00
Matt Oliver
b73aae6fe9 avcodec/x86/idct_sse2_xvid: move offsets out of MANGLE()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-17 04:19:59 +01:00
Matt Oliver
9eb3f11c55 Add missing external declarations.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-17 00:48:09 +01:00
Matt Oliver
590805b7c3 Fixed 64bit conformance with mvzbl.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-17 00:13:50 +01:00
Michael Niedermayer
5dd97d5809 Merge commit 'db3f61a04f1f66746660f921bb2780ddf1141f3b'
* commit 'db3f61a04f1f66746660f921bb2780ddf1141f3b':
  x86: dsputil_init: Drop some unnecessary parentheses

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:25:57 +01:00
Michael Niedermayer
27cab16ce7 Merge commit '441b093915717afa7d24be34bdab2a4911b30a57'
* commit '441b093915717afa7d24be34bdab2a4911b30a57':
  x86: dsputil_init: K&R formatting cosmetics

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:25:36 +01:00
Michael Niedermayer
236874a571 Merge commit '4cb4680c1087a2cd13d4b0c9167a2eb3147f99d8'
* commit '4cb4680c1087a2cd13d4b0c9167a2eb3147f99d8':
  x86: dsputil_x86.h: K&R formatting cosmetics

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:25:19 +01:00
Michael Niedermayer
925ce6faf4 Merge commit 'f8bbebecfd7ea3dceb7c96f931beca33f80a3490'
* commit 'f8bbebecfd7ea3dceb7c96f931beca33f80a3490':
  x86: motion_est: K&R formatting cosmetics

Conflicts:
	libavcodec/x86/motion_est.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:20:43 +01:00
Michael Niedermayer
b7a5f5dc66 Merge commit 'a36947c167d7278b891453083b57dc56b7a7f5c5'
* commit 'a36947c167d7278b891453083b57dc56b7a7f5c5':
  dsputilenc_mmx: K&R formatting cosmetics

Conflicts:
	libavcodec/x86/dsputilenc_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:09:57 +01:00
Michael Niedermayer
d926c4b240 Merge commit '38675229a879aa5258a8c71891fc8cbf74cf139f'
* commit '38675229a879aa5258a8c71891fc8cbf74cf139f':
  dsputil_mmx: K&R formatting cosmetics

Conflicts:
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 01:01:37 +01:00
Michael Niedermayer
55f53f6c29 Merge commit '6a8b35dc88b4a1a452f192fbbf53ae7f59bc3f23'
* commit '6a8b35dc88b4a1a452f192fbbf53ae7f59bc3f23':
  dsputilenc_mmx: Merge two assignment blocks with identical conditions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 00:57:25 +01:00
Michael Niedermayer
4104eb44e6 Merge commit '55519926ef855c671d084ccc151056de9e3d3a77'
* commit '55519926ef855c671d084ccc151056de9e3d3a77':
  x86: Make function prototype comments in assembly code consistent

Conflicts:
	libavcodec/x86/sbrdsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 00:01:30 +01:00
Michael Niedermayer
a9b1936a4e Merge commit 'edd1f833fa145eb9c5026877c699ebe6efca00a0'
* commit 'edd1f833fa145eb9c5026877c699ebe6efca00a0':
  x86: h264_idct_10_bit: Use proper type in function prototype comments

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-14 00:00:16 +01:00
Michael Niedermayer
1c788eaca9 Merge commit '831a1180785a786272cdcefb71566a770bfb879e'
* commit '831a1180785a786272cdcefb71566a770bfb879e':
  Update dsputil- and SIMD-related comments to match reality more closely

Conflicts:
	libavcodec/x86/hpeldsp.asm
	libavutil/arm/float_dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 23:59:56 +01:00
Michael Niedermayer
d61e1156be Merge commit '17608f6ee3d2088cdb8d1e704276d8b34f01160d'
* commit '17608f6ee3d2088cdb8d1e704276d8b34f01160d':
  x86: Add some more missing headers

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 23:41:17 +01:00
Diego Biurrun
db3f61a04f x86: dsputil_init: Drop some unnecessary parentheses 2014-03-13 08:15:51 -07:00
Diego Biurrun
441b093915 x86: dsputil_init: K&R formatting cosmetics 2014-03-13 08:15:51 -07:00
Diego Biurrun
4cb4680c10 x86: dsputil_x86.h: K&R formatting cosmetics 2014-03-13 08:15:51 -07:00
Diego Biurrun
f8bbebecfd x86: motion_est: K&R formatting cosmetics 2014-03-13 08:15:51 -07:00
Diego Biurrun
a36947c167 dsputilenc_mmx: K&R formatting cosmetics 2014-03-13 08:15:51 -07:00
Diego Biurrun
38675229a8 dsputil_mmx: K&R formatting cosmetics 2014-03-13 08:15:51 -07:00
Diego Biurrun
6a8b35dc88 dsputilenc_mmx: Merge two assignment blocks with identical conditions 2014-03-13 08:15:51 -07:00
Diego Biurrun
55519926ef x86: Make function prototype comments in assembly code consistent
This helps grepping for functions, among other things.
2014-03-13 05:50:29 -07:00
Diego Biurrun
edd1f833fa x86: h264_idct_10_bit: Use proper type in function prototype comments 2014-03-13 05:50:29 -07:00
Diego Biurrun
831a118078 Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
Diego Biurrun
17608f6ee3 x86: Add some more missing headers 2014-03-13 05:50:28 -07:00
Diego Biurrun
08dba0e1c3 x86: mpegvideoenc: Remove some remnants of the long-gone libmpeg2 IDCT 2014-03-13 05:50:28 -07:00
James Almer
9e0e1f9067 x86/dsputil: add emms to ff_scalarproduct_int16_mmxext()
Also undo the changes to ra144enc.c from previous commits.
Should fix ticket #3429

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-06 18:23:55 +01:00
Michael Niedermayer
2d99de66b7 Merge commit '3bfdee00cd92ff07c364d4901c4aefda32780756'
* commit '3bfdee00cd92ff07c364d4901c4aefda32780756':
  x86: dcadsp: Fix linking with yasm and optimizations disabled

Conflicts:
	libavcodec/x86/dcadsp_init.c

See: 206167a295a5c28cec3c38f7308835b0b7e0618f
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-06 14:10:27 +01:00
Diego Biurrun
3bfdee00cd x86: dcadsp: Fix linking with yasm and optimizations disabled
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
2014-03-05 23:16:21 +01:00
Michael Niedermayer
146b476ba0 Merge commit '3741aa37c2a0d0717faff74a5c4cc357d16f6d1d'
* commit '3741aa37c2a0d0717faff74a5c4cc357d16f6d1d':
  x86: cabac: Use correct #includes to make header compile standalone

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 21:33:44 +01:00
Diego Biurrun
3741aa37c2 x86: cabac: Use correct #includes to make header compile standalone 2014-03-05 13:32:25 +01:00
James Almer
7fd64e3e36 x86/synth_filter: add synth_filter_fma3
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 01:58:16 +01:00
James Almer
206167a295 x86/synth_filter: add missing HAVE_YASM guard
Should fix compilation failures with --disable-yasm on some compilers

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-04 22:47:28 +01:00
James Almer
884e085d1e x86/synth_filter: Revert the switch to float ops with SSE2
This reverts the changes 64672098361361cd15d37e36f747ab44de5b80ca
and 68c3ed936a76c3ff7738f602fa90237ac7e3ce08 did to the SSE2 version,
which generated a hit of about 5 cycles.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-02 11:58:10 +01:00
James Almer
68c3ed936a x86/synth_filter: add synth_filter_avx
Sandy Bridge Win64:
180 cycles on ff_synth_filter_inner_sse2
150 cycles on ff_synth_filter_inner_avx

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-02 01:00:55 +01:00
James Almer
6467209836 x86/synth_filter: add synth_filter_sse
Build only on x86_32 targets.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-01 15:32:40 +01:00
Michael Niedermayer
fb3c33f3cd Merge commit '4cb6964244fd6c099383d8b7e99731e72cc844b9'
* commit '4cb6964244fd6c099383d8b7e99731e72cc844b9':
  dcadec: simplify decoding of VQ high frequencies

Conflicts:
	configure
	libavcodec/dcadec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:41:19 +01:00
Michael Niedermayer
baf3adc621 Merge commit '08e3ea60ff4059341b74be04a428a38f7c3630b0'
* commit '08e3ea60ff4059341b74be04a428a38f7c3630b0':
  x86: synth filter float: implement SSE2 version

Conflicts:
	libavcodec/x86/dcadsp.asm
	libavcodec/x86/dcadsp_init.c

See: 2cdbcc004837ce092a14f326f24d97a29512a2c3
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 20:38:39 +01:00
Christophe Gisquet
2cdbcc0048 x86: synth filter float: implement SSE2 version
Timings for Arrandale:
          C    SSE
win32:  2108   334
win64:  1152   322

Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.

Unrolling for ARCH_X86_64 is a 20 cycles gain.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 20:34:40 +01:00