Commit Graph

69 Commits

Author SHA1 Message Date
James Almer
d0f56ca071 x86/hevc_deblock: improve 8bit transpose store macros
Up to four instructions less depending on function and instruction set.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-03 04:24:15 +02:00
James Almer
1ace9573dc x86/hevc_idct: replace old and unused idct functions
Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).

Benchmarks on an Intel Core i5-4200U:

idct8x8_dc
       SSE2   MMXEXT  C
cycles 22     26      57

idct16x16_dc
       AVX2   SSE2    C
cycles 27     32      249

idct32x32_dc
       AVX2   SSE2    C
cycles 62     126     1375

Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 18:00:11 +02:00
Christophe Gisquet
9107612818 x86util: add and use RSHIFT/LSHIFT macros
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 13:19:27 +02:00
Christophe Gisquet
2267003981 x86: hpeldsp: better factorization
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 21:47:40 +02:00
James Almer
561bfc85eb x86/dsputilenc: implement SSE2 versions of pix_{sum16, norm1}
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-28 23:29:34 +02:00
James Almer
76ed71a72b x86: move horizontal add macros to x86util
Also port relevant AVX2/XOP optimizations from x264 with permission
to relicense to LGPL from the corresponding authors

Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-17 14:15:09 +02:00
James Almer
3f3d748cab x86: Move XOP emulation to x86util
We need the emulation to support the cases where the first
argument is the same as the fourth. To achieve this a fifth
argument working as a temporary may be needed.
Emulation that doesn't obey the original instruction semantics
can't be in x86inc.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-24 08:30:19 +01:00
Michael Niedermayer
e3e0e3d0c9 Merge commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497'
* commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497':
  x86inc: FMA3/4 Support

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 16:06:22 +02:00
Michael Niedermayer
9ac124c889 Merge commit '206895708ea2b464755d340e44501daf9a07c310'
* commit '206895708ea2b464755d340e44501daf9a07c310':
  x86inc: Remove our FMA4 support

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 15:54:23 +02:00
Jason Garrett-Glaser
c6908d6b4b x86inc: FMA3/4 Support
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:41:54 +01:00
Derek Buitenhuis
206895708e x86inc: Remove our FMA4 support
This is so we can sync to x264's version of FMA4 support.

This partialy reverts commit 79687079a9.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:39:29 +01:00
Michael Niedermayer
b45e0c2573 Merge commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d'
* commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d':
  x86inc: Add cvisible macro for C functions with public prefix

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-19 13:11:41 +01:00
Michael Niedermayer
1b03e09198 Merge commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b'
* commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b':
  x86inc: Rename "program_name" to "private_prefix"
  configure: Run SHFLAGS through ldflags_filter()

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-19 13:01:06 +01:00
Diego Biurrun
d633d12b2c x86inc: Add cvisible macro for C functions with public prefix
This allows defining externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 22:02:03 +01:00
Diego Biurrun
ef5d41a553 x86inc: Rename "program_name" to "private_prefix"
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-18 20:29:53 +01:00
Michael Niedermayer
68f92a70f1 Merge commit 'dae1d507af94261bafd3b11549884e5d1eca590e'
* commit 'dae1d507af94261bafd3b11549884e5d1eca590e':
  x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
  vf_fps: add final flushed frames to the dropped frame count
  rv34_parser: Adjust #if for disabling individual parsers

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-16 11:44:45 +01:00
Diego Biurrun
dae1d507af x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags 2013-01-15 17:29:43 +01:00
Michael Niedermayer
b7ede94bbd Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: ABSB2: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 16:16:18 +01:00
Michael Niedermayer
77041e2474 Merge commit '094a7405e5d8463d7d167d893e04934ec1a84ecd'
* commit '094a7405e5d8463d7d167d893e04934ec1a84ecd':
  x86: ABSB: port to cpuflags
  sdp: Include SRTP crypto params if using the srtp protocol

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 16:12:24 +01:00
Michael Niedermayer
cfc40a6aff Merge commit 'd8c772de53d29afb1bada88afa859fce8489c668'
* commit 'd8c772de53d29afb1bada88afa859fce8489c668':
  nutdec: Always return a value from nut_read_timestamp()
  configure: Make warnings from -Wreturn-type fatal errors
  x86: ABS2: port to cpuflags
  vdpau: Remove av_unused attribute from function declaration
  h264: fix ff_generate_sliding_window_mmcos() prototype.

Conflicts:
	configure
	libavformat/nutdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-15 15:23:20 +01:00
Diego Biurrun
320e1d0df3 x86: ABSB2: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun
094a7405e5 x86: ABSB: port to cpuflags 2013-01-15 11:18:51 +01:00
Diego Biurrun
51969a652c x86: ABS2: port to cpuflags 2013-01-14 21:56:55 +01:00
Michael Niedermayer
ea93ccf079 Merge commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0'
* commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0':
  x86: ABS1: port to cpuflags
  v210x: cosmetics, reformat

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-07 01:35:18 +01:00
Diego Biurrun
5b4dfbffc2 x86: ABS1: port to cpuflags 2013-01-06 13:57:01 +01:00
Michael Niedermayer
15784c2bab Merge commit '9d5c62ba5b586c80af508b5914934b1c439f6652'
* commit '9d5c62ba5b586c80af508b5914934b1c439f6652':
  lavu/opt: do not filter out the initial sign character except for flags
  eval: treat dB as decibels instead of decibytes
  float_dsp: add vector_dmul_scalar() to multiply a vector of doubles

Conflicts:
	libavutil/eval.c
	tests/ref/fate/eval

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 14:33:38 +01:00
Justin Ruggles
ac7eb4cb20 float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
Include x86-optimized versions for SSE2 and AVX.
2012-12-05 11:23:36 -05:00
Michael Niedermayer
e6d81ce22e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: h264_intrapred: Fix C function names in comments
  x86: SPLATD: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-19 14:24:20 +01:00
Diego Biurrun
87af05c575 x86: SPLATD: port to cpuflags 2012-11-18 18:34:05 +01:00
Michael Niedermayer
a1b5c9634e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: mmx2 ---> mmxext in asm constructs

Conflicts:
	libavcodec/x86/h264_chromamc_10bit.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264dsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-14 12:34:30 +01:00
Diego Biurrun
26301caaa1 x86: mmx2 ---> mmxext in asm constructs 2012-11-14 00:58:51 +01:00
Michael Niedermayer
da501ea857 Merge commit '802713c4e7b41bc2deed754d78649945c3442063'
* commit '802713c4e7b41bc2deed754d78649945c3442063':
  mss2: prevent potential uninitialized reads
  mss2: reindent after last commit
  mss2: fix handling of unmasked implicit WMV9 rectangles
  configure: add lavu dependency to lavr/lavfi .pc files
  x86inc: Set program_name outside of x86inc.asm

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-12 10:57:06 +01:00
Diego Biurrun
f0d124f005 x86inc: Set program_name outside of x86inc.asm
This reduces the local difference to the x264 upstream version.
2012-11-11 11:06:19 +01:00
Michael Niedermayer
2ce64413e2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86: PALIGNR: port to cpuflags
  x86: h264_qpel_10bit: port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-10 12:44:39 +01:00
Diego Biurrun
4b60fac419 x86: PALIGNR: port to cpuflags 2012-11-09 21:31:31 +01:00
Michael Niedermayer
e859339e7a Merge commit '930e26a3ea9d223e04bac4cdde13697cec770031'
* commit '930e26a3ea9d223e04bac4cdde13697cec770031':
  x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled
  x86: PABSW: port to cpuflags
  x86: vc1dsp: port to cpuflags
  rtmp: Use av_strlcat instead of strncat

Conflicts:
	libavcodec/x86/h264_qpel.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-05 22:36:05 +01:00
Diego Biurrun
dbb37e7711 x86: PABSW: port to cpuflags 2012-11-05 14:51:10 +01:00
Michael Niedermayer
37e81996dc Merge commit '9221efef7968463f3e3d9ce79ea72eaca082e73f'
* commit '9221efef7968463f3e3d9ce79ea72eaca082e73f':
  lavf: fix av_interleaved_write_frame() doxy.
  lavf: clarify the lifetime of demuxed packets.
  avconv: do not free muxed packet on streamcopy.
  crc: move doxy to the header
  vf_drawtext: do not use deprecated av_tree_node_size
  x86: Refactor PSWAPD fallback implementations and port to cpuflags

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-03 14:24:11 +01:00
Michael Niedermayer
1885ffb03d Merge commit '9a07c1332cfe092b57b5758f22b686ca58806c60'
* commit '9a07c1332cfe092b57b5758f22b686ca58806c60':
  parser: Move Doxygen documentation to the header files
  PGS subtitles: Expose forced flag
  x86: PMINUB: port to cpuflags

Conflicts:
	libavcodec/avcodec.h
	libavcodec/pgssubdec.c
	libavcodec/version.h
	libavcodec/x86/ac3dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-03 14:13:45 +01:00
Michael Niedermayer
1dad486714 Merge commit '9ce02e14f01de50fcc6f7f459544b140be66d615'
* commit '9ce02e14f01de50fcc6f7f459544b140be66d615':
  x86: ac3dsp: port to cpuflags
  x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
  x86inc: Only define program_name if the macro is unset

Conflicts:
	libavcodec/x86/ac3dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-03 13:38:38 +01:00
Diego Biurrun
0a7a94f2e5 x86: Refactor PSWAPD fallback implementations and port to cpuflags 2012-11-02 17:05:29 +01:00
Diego Biurrun
26f01bd106 x86: PMINUB: port to cpuflags 2012-11-02 15:38:15 +01:00
Diego Biurrun
61bc2bc7d4 x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
"mmxext" is a more sensible name and more common in outside projects.
2012-11-02 15:22:34 +01:00
Michael Niedermayer
28c0678eb7 Merge commit 'be923ed659016350592acb9b3346f706f8170ac5'
* commit 'be923ed659016350592acb9b3346f706f8170ac5':
  x86: fmtconvert: port to cpuflags
  x86: MMX2 ---> MMXEXT in macro names

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 14:16:18 +01:00
Dave Yeo
264f12342c x86: Fix assembly with NASM
Unlike YASM, NASM only looks for include files in the current
directory, not in the directory that included files reside in.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-10-31 13:50:01 +01:00
Michael Niedermayer
3174616f59 Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73'
* commit '6860b4081d046558c44b1b42f22022ea341a2a73':
  x86: include x86inc.asm in x86util.asm
  cng: Reindent some incorrectly indented lines
  cngdec: Allow flushing the decoder
  cngdec: Make the dbov variable have the right unit
  cngdec: Fix the memset size to cover the full array
  cngdec: Update the LPC coefficients after averaging the reflection coefficients
  configure: fix print_config() with broke awks

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/dct32.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil.asm
	libavcodec/x86/dsputilenc.asm
	libavcodec/x86/fft.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_deblock_10bit.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_idct_10bit.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_intrapred_10bit.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-31 13:43:33 +01:00
Dave Yeo
9c167914a1 x86: Fix assembly with NASM
Unlike YASM, NASM only looks for include files in the current
directory, not in the directory that included files reside in.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-10-31 10:20:35 +01:00
Diego Biurrun
588fafe7f3 x86: MMX2 ---> MMXEXT in macro names 2012-10-31 01:04:55 +01:00
Diego Biurrun
6860b4081d x86: include x86inc.asm in x86util.asm
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Michael Niedermayer
bec180e112 Merge commit 'a1bcc76e6036e78f25cbb7323c145056cfca9d93'
* commit 'a1bcc76e6036e78f25cbb7323c145056cfca9d93': (21 commits)
  cmdutils: fix a memleak when specifying an option twice.
  x86: mpegvideo: more sensible names for optimization file and init function
  x86: mpegvideoenc: Split optimizations off into a separate file
  dnxhdenc: x86: more sensible names for optimization file and init function
  svq1/svq3: Move common code out of SVQ1 decoder-specific file
  dirac: add Comments and references to the standard
  lavr: x86: optimized 6-channel flt to fltp conversion
  lavr: x86: optimized 2-channel flt to fltp conversion
  lavr: x86: optimized 6-channel flt to s16p conversion
  lavr: x86: optimized 2-channel flt to s16p conversion
  lavr: x86: optimized 6-channel s16 to fltp conversion
  lavr: x86: optimized 2-channel s16 to fltp conversion
  lavr: x86: optimized 6-channel s16 to s16p conversion
  lavr: x86: optimized 2-channel s16 to s16p conversion
  lavr: x86: optimized 2-channel fltp to flt conversion
  lavr: x86: optimized 6-channel fltp to s16 conversion
  lavr: x86: optimized 2-channel fltp to s16 conversion
  lavr: x86: optimized 6-channel s16p to flt conversion
  lavr: x86: optimized 2-channel s16p to flt conversion
  lavr: x86: optimized 6-channel s16p to s16 conversion
  ...

Conflicts:
	libavcodec/dirac.c
	libavcodec/mpegvideo.h
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-08-24 14:30:40 +02:00