ffmpeg

Author	SHA1	Message	Date
Michael Niedermayer	63d2be7533	avcodec/x86/lossless_videodsp: use SPLATW in add_int16 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-21 02:33:20 +01:00
Michael Niedermayer	f70d7eb20c	Move add/diff_int16 to lossless_videodsp Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-20 21:32:47 +01:00
Michael Niedermayer	a493f8541d	avcodec/x86/dsp: add_int16_mmx / add_int16_sse2 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-20 04:06:46 +01:00
James Almer	26800e3864	vp9/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext pavgb is an sse integer instruction, so the mmxext flag is enough Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-18 17:08:25 +01:00
James Almer	d2a7314f1e	vp9/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2(). Similar gains in performance as the SSSE3 version Signed-off-by: James Almer <jamrial@gmail.com>	2014-01-17 14:16:38 +01:00
Ronald S. Bultje	8173d1ffc0	vp9/x86: 16x16 iadst_idct, idct_iadst and iadst_iadst (ssse3+avx). Sample timings on ped1080p.webm (of the ssse3 functions): iadst_idct: 4672 -> 1175 cycles idct_iadst: 4736 -> 1263 cycles iadst_iadst: 4924 -> 1438 cycles Total decoding time changed from 6.565s to 6.413s.	2014-01-16 13:49:31 +01:00
Clément Bœsch	9cc8fa63dd	vp9/x86: simplify a few mc inits.	2014-01-16 07:48:27 +01:00
Michael Niedermayer	6391dec82a	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Simplify xvmc deprecation conditional Conflicts: libavcodec/x86/dsputil_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-15 20:41:08 +01:00
Diego Biurrun	aab40bbfd5	x86: dsputil: Simplify xvmc deprecation conditional	2014-01-15 15:23:46 +01:00
Clément Bœsch	8b4190da93	vp9/x86: add AVX for itxfm and lpf. 4412 decicycles in ff_vp9_loop_filter_h_16_16_ssse3, 4193462 runs, 842 skips 3600 decicycles in ff_vp9_loop_filter_h_16_16_avx, 4193621 runs, 683 skips 3010 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 4193528 runs, 776 skips 2678 decicycles in ff_vp9_loop_filter_v_16_16_avx, 4193742 runs, 562 skips 23025 decicycles in ff_vp9_idct_idct_32x32_add_ssse3, 2096871 runs, 281 skips 19943 decicycles in ff_vp9_idct_idct_32x32_add_avx, 2096815 runs, 337 skips 4675 decicycles in ff_vp9_idct_idct_16x16_add_ssse3, 4194018 runs, 286 skips 3980 decicycles in ff_vp9_idct_idct_16x16_add_avx, 4194022 runs, 282 skips 967 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 16776972 runs, 244 skips 887 decicycles in ff_vp9_idct_idct_8x8_add_avx, 16777002 runs, 214 skips	2014-01-15 15:54:03 +01:00
Michael Niedermayer	cb613657ee	avcodec/x86/proresdsp_init: x86 prores IDCT is bitexact again; reenable it for bitexact mode Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 15:59:00 +01:00
Michael Niedermayer	b148a39d55	Merge commit '46bacb5cc6169ff5e8e982495c4925467c1d8bb7' * commit '46bacb5cc6169ff5e8e982495c4925467c1d8bb7': x86: Consistently use cpu flag detection macros in places that still miss it Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-14 14:44:59 +01:00
Diego Biurrun	46bacb5cc6	x86: Consistently use cpu flag detection macros in places that still miss it	2014-01-14 00:04:58 +01:00
Clément Bœsch	af68bd1c06	vp9/x86: add ff_vp9_loop_filter_[vh]_16_16_ssse3(). 16662 decicycles in loop_filter_h_16_16_c, 8387355 runs, 1253 skips 17510 decicycles in loop_filter_v_16_16_c, 8387516 runs, 1092 skips 4941 decicycles in ff_vp9_loop_filter_h_16_16_ssse3, 8387887 runs, 721 skips 3899 decicycles in ff_vp9_loop_filter_v_16_16_ssse3, 8387980 runs, 628 skips Overall decode time goes from: ./ffmpeg -v 0 -nostats -threads 1 -i ~/samples/vp9/ped1080p.webm -f null - 8.10s user 0.02s system 99% cpu 8.126 total to: ./ffmpeg -v 0 -nostats -threads 1 -i ~/samples/vp9/ped1080p.webm -f null - 6.15s user 0.04s system 99% cpu 6.199 total (46 to 61 fps)	2014-01-12 20:20:24 +01:00
Clément Bœsch	e11ceea68f	vp9/x86: factor out some code in VP9_UNPACK_MULSUB_2W_4X.	2014-01-12 20:19:00 +01:00
Clément Bœsch	c9aa0b8f70	vp9/x86: remove reg redundancy in VP9_MULSUB_2W_2X.	2014-01-12 20:18:55 +01:00
Clément Bœsch	7c55ee6168	vp9/x86: merge IDCT coef macros.	2014-01-12 20:18:44 +01:00
Michael Niedermayer	92b2404571	Merge commit '4c642d8d98703faf52983243098f35865e15b312' * commit '4c642d8d98703faf52983243098f35865e15b312': x86: hpeldsp: Add missing av_cold attribute to init function Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-09 20:32:53 +01:00
Michael Niedermayer	390452bab6	Merge commit 'b0be1ae792ac8bbfb0fc7b9b9cb39eaf0feb489b' * commit 'b0be1ae792ac8bbfb0fc7b9b9cb39eaf0feb489b': x86: avcodec: Add a bunch of missing #includes for av_cold Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-09 20:24:15 +01:00
Diego Biurrun	4c642d8d98	x86: hpeldsp: Add missing av_cold attribute to init function	2014-01-09 15:09:07 +01:00
Diego Biurrun	b0be1ae792	x86: avcodec: Add a bunch of missing #includes for av_cold	2014-01-09 15:09:07 +01:00
Ronald S. Bultje	c6fe984f2f	vp9/x86: make STORE_2X2 macro local. Prevents this assembler warning: libavcodec/x86/vp9itxfm.asm:1208: warning: (VP9_IDCT32_1D:309) redefining multi-line macro `STORE_2X2' Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-08 14:07:15 +01:00
Ronald S. Bultje	04a187fb2a	vp9/x86: idct_32x32_add_ssse3 sub-8x8-idct. Runtime of the full 32x32 idct goes from 2446 to 2441 cycles (intra) or from 1425 to 1306 cycles (inter). Overall runtime is not significantly affected.	2014-01-07 20:43:35 -05:00
Ronald S. Bultje	37b001d14d	vp9/x86: idct_32x32_add_ssse3 sub-16x16-idct. Runtime of all IDCTs together goes from 3327 to 2473 cycles (intra, i.e. ~35% faster) or from 2312 to 1448 cycles (inter, i.e. ~60% faster). Total decode time of ped1080p.webm goes from 8.086sec to 7.974sec (1.4% faster).	2014-01-07 20:43:34 -05:00
Ronald S. Bultje	e84d14df10	vp9/x86: idct_32x32_add_ssse3. Sub-IDCTs will follow later. ped1080.webm goes from 9.295s to 8.191s (13.5% faster). The IDCT itself goes from 4372 (intra) or 4337 (inter) to 403 (intra) or 329 (inter) cycles for the DC-only form, 23755 (intra) or 23723 (inter) to 3497 (intra) or 3607 (inter) cycles for the no-DC form, which averages from 23393 (intra) or 16612 (inter) to 3449 (intra) or 2392 (inter) for all 32x32s together, i.e. about ~7x faster (all tests done on ped1080p.webm).	2014-01-07 20:43:30 -05:00
Michael Niedermayer	30056fd0be	Merge commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb' * commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb': h264: do not use 422 functions for monochrome See: `07abf13da4` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-01-06 16:51:23 +01:00
Anton Khirnov	a03a642d5c	h264: do not use 422 functions for monochrome Fixes invalid memory access. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC:libav-stable@libav.org	2014-01-06 08:25:36 +01:00
Ronald S. Bultje	18175baa54	vp9/x86: 16px MC functions (64bit only). Cycle counts for large MCs (old -> new on ped1080p.webm, mx!=0&&my!=0): 16x8: 876 -> 870 (0.7%) 16x16: 1444 -> 1435 (0.7%) 16x32: 2784 -> 2748 (1.3%) 32x16: 2455 -> 2349 (4.5%) 32x32: 4641 -> 4084 (13.6%) 32x64: 9200 -> 7834 (17.4%) 64x32: 8980 -> 7197 (24.8%) 64x64: 17330 -> 13796 (25.6%) Total decoding time goes from 9.326sec to 9.182sec.	2013-12-26 21:05:10 -05:00
Ronald S. Bultje	0d9375fc90	vp9/x86: 16x16 sub-IDCT for top-left 8x8 subblock (eob <= 38). Sub8x8 speed (w/o dc-only case) goes from ~750 cycles (inter) or ~735 cycles (intra) to ~415 cycles (inter) or ~430 cycles (intra). Average overall 16x16 idct speed goes from ~635 cycles (inter) or ~720 cycles (intra) to ~415 cycles (inter) or ~545 (intra) - all measurements done using ped1080p.webm.	2013-12-26 07:40:25 -05:00
Ivan Kalvachev	1c63aed232	Convert XvMC to hwaccel v3 Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-22 22:03:47 +01:00
Michael Niedermayer	ce612fc186	Merge commit 'dfc50ac85e9d68a771b556297b7c411650206f3b' * commit 'dfc50ac85e9d68a771b556297b7c411650206f3b': x86: mpegvideo: move denoise_dct asm to mpegvideoenc Conflicts: libavcodec/x86/mpegvideo.c libavcodec/x86/mpegvideoenc.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-20 23:44:31 +01:00
Anton Khirnov	dfc50ac85e	x86: mpegvideo: move denoise_dct asm to mpegvideoenc This function is encoding-only. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-12-20 17:16:11 +01:00
Ronald S. Bultje	8d4c616fc0	vp9/x86: idct_add_16x16_ssse3. Currently only dc-only and full 16x16. Other subforms will follow in the near future. Total decoding time of ped1080p.webm goes from 9.7 to 9.3 seconds. DC-only goes from 957 -> 131 cycles, and the full IDCT goes from ~4050 to ~745 cycles.	2013-12-14 12:13:26 -05:00
Michael Niedermayer	8e70fdab36	Merge commit '4958f35a2ebc307049ff2104ffb944f5f457feb3' * commit '4958f35a2ebc307049ff2104ffb944f5f457feb3': dsputil: Move apply_window_int16 to ac3dsp Conflicts: libavcodec/arm/ac3dsp_init_arm.c libavcodec/arm/ac3dsp_neon.S libavcodec/x86/ac3dsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-09 04:12:40 +01:00
Diego Biurrun	4958f35a2e	dsputil: Move apply_window_int16 to ac3dsp The (optimized) functions are used nowhere else.	2013-12-08 17:57:15 +01:00
Ronald S. Bultje	92436e8ad9	vp9: implement top/left half (4x4) sub-8x8-IDCT. For that specific case (eob>3&&eob<=12), runtime of idct8x8 goes from 668 to 477 cycles. For all idct8x8, runtime goes from 521 to 490 cycles.	2013-12-07 12:39:36 -05:00
Ronald S. Bultje	b2045c44a9	vp9: split pre-load of 11585x2 out of 1d idct macro. This allows us to load it only once, instead of twice, in this function.	2013-12-07 12:39:36 -05:00
Ronald S. Bultje	f9a0d4c6e0	vp9: minor refactorings in idct ssse3 assembly. Make register usage in macros explicit; change mulsub_2w_4x to use 2 instead of 3 temp registers.	2013-12-07 12:39:35 -05:00
Ronald S. Bultje	8729964b99	vp9: split x86 assembly in two files. (And in future, loopfilter or intra pred could be put in their own respective files also.)	2013-12-07 12:39:35 -05:00
Michael Niedermayer	5b4d57455d	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Initialize mmxext after amd3dnow optimizations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-05 11:55:41 +01:00
Diego Biurrun	3d7c84747d	x86: Initialize mmxext after amd3dnow optimizations The mmxext optimizations should be at least equally fast if available and amd3dnow optimizations are being deprecated. Thus the former should override the latter, not the other way around.	2013-12-04 18:52:48 +01:00
Michael Niedermayer	be2312aa8f	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputil: x86: Move ff_inv_zigzag_direct16 table init to mpegvideo If someone optimizes dct_quantize for non x86 SIMD, then this probably needs to be reverted. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-12-02 10:59:48 +01:00
Diego Biurrun	7ffaa19570	dsputil: x86: Move ff_inv_zigzag_direct16 table init to mpegvideo The table is MMX-specific and used nowhere else.	2013-12-02 04:05:18 +01:00
Michael Niedermayer	3adb825650	Merge commit 'cf7860db608df7c76471d8b61f07abbd5aad8dd5' * commit 'cf7860db608df7c76471d8b61f07abbd5aad8dd5': x86: dsputil: Suppress deprecation warnings for XvMC bits Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-28 22:47:37 +01:00
Diego Biurrun	cf7860db60	x86: dsputil: Suppress deprecation warnings for XvMC bits These parts are scheduled for removal on the next version bump. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2013-11-28 16:04:30 +01:00
Clément Bœsch	616da59542	avcodec/x86/vp9dsp: merge a few SWAP together.	2013-11-21 23:06:21 +01:00
Clément Bœsch	e0434cfcfc	avcodec/x86: remove 3 sub in pred4x4_tm_vp8_8. before: 411 decicycles in ff_pred4x4_tm_vp8_8_ssse3, 8388289 runs, 319 skips after: 389 decicycles in ff_pred4x4_tm_vp8_8_ssse3, 8388308 runs, 300 skips Tested on i7 920.	2013-11-17 23:12:35 +01:00
Clément Bœsch	d28c79b003	avcodec/x86/vp9dsp: use EXTERNAL_* macros. Original fix by one of these developers: Anton Khirnov <anton@khirnov.net> Diego Biurrun <diego@biurrun.de> Luca Barbato <lu_zero@gentoo.org> Martin Storsjö <martin@martin.st> See `97962b2` / `72ca830` Personal guess is Diego Biurrun.	2013-11-16 17:03:17 +01:00
Michael Niedermayer	91e00c4a78	Merge commit '458446acfa1441d283dacf9e6e545beb083b8bb0' * commit '458446acfa1441d283dacf9e6e545beb083b8bb0': lavc: Edge emulation with dst/src linesize Conflicts: libavcodec/cavs.c libavcodec/h264.c libavcodec/hevc.c libavcodec/mpegvideo_enc.c libavcodec/mpegvideo_motion.c libavcodec/rv34.c libavcodec/svq3.c libavcodec/vc1dec.c libavcodec/videodsp.h libavcodec/videodsp_template.c libavcodec/vp3.c libavcodec/vp8.c libavcodec/wmv2.c libavcodec/x86/videodsp.asm libavcodec/x86/videodsp_init.c Changes to the asm are not merged, they are left for volunteers or in their absence for later. The changes this merge introduces are reordering of the function arguments See: `face578d56` Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-15 15:07:10 +01:00
Ronald S. Bultje	72ca830f51	lavc: VP9 decoder Originally written by Ronald S. Bultje <rsbultje@gmail.com> and Clément Bœsch <u@pkh.me> Further contributions by: Anton Khirnov <anton@khirnov.net> Diego Biurrun <diego@biurrun.de> Luca Barbato <lu_zero@gentoo.org> Martin Storsjö <martin@martin.st> Signed-off-by: Luca Barbato <lu_zero@gentoo.org> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-11-15 10:16:28 +01:00
Ronald S. Bultje	458446acfa	lavc: Edge emulation with dst/src linesize Allow supporting files for which the image stride is smaller than the maximum block size + number of subpel mc taps, e.g. a 64x64 VP9 file or a 16x16 VP8 file with -fflags +emu_edge.	2013-11-15 10:16:27 +01:00
Michael Niedermayer	5231eecdaf	Merge remote-tracking branch 'qatar/master' * qatar/master: Deprecate obsolete XvMC hardware decoding support Conflicts: libavcodec/mpeg12.c libavcodec/mpeg12dec.c libavcodec/mpegvideo.c libavcodec/options_table.h libavutil/pixdesc.c libavutil/version.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-14 03:26:35 +01:00
Diego Biurrun	19e30a58fc	Deprecate obsolete XvMC hardware decoding support XvMC has long ago been superseded by newer acceleration APIs, such as VDPAU, and few downstreams still support it. Furthermore XvMC is not implemented within the hwaccel framework, but requires its own specific code in the MPEG-1/2 decoder, which is a maintenance burden.	2013-11-13 21:07:45 +01:00
Michael Niedermayer	a30f7918b5	Merge commit '0338c396987c82b41d322630ea9712fe5f9561d6' * commit '0338c396987c82b41d322630ea9712fe5f9561d6': dsputil: Split off H.263 bits into their own H263DSPContext Conflicts: configure libavcodec/mpegvideo.h libavcodec/mpegvideo_enc.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-08 17:42:56 +01:00
Diego Biurrun	0338c39698	dsputil: Split off H.263 bits into their own H263DSPContext	2013-11-08 12:40:47 +01:00
Clément Bœsch	87434cf373	avcodec/vp9: add ff_vp9_idct_idct_{4x4,8x8}_ssse3(). 1789 decicycles in idct_idct_4x4_add_c, 262136 runs, 8 skips 1839 decicycles in idct_idct_4x4_add_c, 524270 runs, 18 skips 1864 decicycles in idct_idct_4x4_add_c, 1048548 runs, 28 skips 529 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 262138 runs, 6 skips 516 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 524282 runs, 6 skips 474 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 1048565 runs, 11 skips (~3.9x faster) 7726 decicycles in idct_idct_8x8_add_c, 1048433 runs, 143 skips 7732 decicycles in idct_idct_8x8_add_c, 2096882 runs, 270 skips 7731 decicycles in idct_idct_8x8_add_c, 4193772 runs, 532 skips 1145 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 1048549 runs, 27 skips 1137 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 2097097 runs, 55 skips 1086 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 4194188 runs, 116 skips (~7.1x faster) Overall decode time before commit: 16.48s user 0.03s system 99% cpu 16.526 total 16.54s user 0.01s system 99% cpu 16.566 total 16.46s user 0.03s system 99% cpu 16.511 total Overall decode time after commit: 16.34s user 0.02s system 99% cpu 16.378 total 16.28s user 0.02s system 99% cpu 16.315 total 16.32s user 0.03s system 99% cpu 16.366 total Tested on i7 920 with 40s 1080p footage.	2013-11-05 19:25:40 +01:00
Michael Niedermayer	934e489ee8	Merge commit 'e2b5b097898c9155f4bdff4d83cdc54d5eef6930' * commit 'e2b5b097898c9155f4bdff4d83cdc54d5eef6930': x86: rv40dsp: Use PAVGB instruction macro where appropriate Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-05 10:26:07 +01:00
Diego Biurrun	e2b5b09789	x86: rv40dsp: Use PAVGB instruction macro where appropriate	2013-11-04 21:14:39 +01:00
Mikulas Patocka	694d997afe	x86: hpeldsp: Use PAVGB instruction macro where necessary Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-11-04 01:29:23 +01:00
Mikulas Patocka	074155360d	avcodec/x86/hpeldsp: fix crash on AMD K6-3+ There are instructions pavgb and pavgusb. Both instructions do the same operation but they have different encoding. Pavgb exists in SSE (or MMXEXT) instruction set and pavgusb exists in 3D-NOW instruction set. libavcodec uses the macro PAVGB to select the proper instruction. However, the function avg_pixels8_xy2 doesn't use this macro, it uses pavgb directly. As a consequence, the function avg_pixels8_xy2 crashes on AMD K6-2 and K6-3 processors, because they have pavgusb, but not pavgb. This bug seems to be introduced by commit `71155d7b41`, "dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm" Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-03 19:49:11 +01:00
Michael Niedermayer	7146eacfc5	Merge commit '1700b4e678ed329611a16b20d11e64b7abda4839' * commit '1700b4e678ed329611a16b20d11e64b7abda4839': x86: vp8dsp: Split loopfilter code into a separate file Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-11-02 10:13:14 +01:00
Diego Biurrun	1700b4e678	x86: vp8dsp: Split loopfilter code into a separate file	2013-11-01 22:05:20 +01:00
Michael Niedermayer	fa6fa2162b	avcodec/cabac: support UNCHECKED_BITSTREAM_READER = 0 Fixes overreads in HEVC Fixes Ticket3070 Also fixed remaining issues from Ticket3075 and Ticket3076 Some lines of code taken from 0c5f839693da2276c2da23400f67a67be4ea0af1:libavcodec/x86/cabac.h and 0c5f839693da2276c2da23400f67a67be4ea0af1:libavcodec/cabac_functions.h Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-31 11:13:27 +01:00
Ronald S. Bultje	960490c0b2	avcodec/x86/videodsp: Small speedups in ff_emulated_edge_mc x86 SIMD. Don't use word-size multiplications if size == 2, and if we're using SIMD instructions (size >= 8), complete leftover 4byte sets using movd, not mov. Both of these changes lead to minor speedups. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-27 15:02:48 +01:00
Ronald S. Bultje	cd86eb265f	avcodec/x86/videodsp: fix a bug in a %if statement where we used '%%' instead of '&&'. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-27 15:02:48 +01:00
Michael Niedermayer	41efb8d9a7	avcodec/x86/cabac: include get_cabac_bypass_sign_x86() under #if !BROKEN_COMPILER. This might fix Ticket2999 as well as some fate clients. Untested, as the original patch submitter no longer has the environment to test. This should be reverted if it does not fix the issues. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-26 15:06:55 +02:00
Ronald S. Bultje	1b3a7e1f42	avcodec/x86/videodsp: Properly mark sse2 instructions in emulated_edge_mc x86 simd as such. Should fix crashes or corrupt output on pre-SSE2 CPUs when they were using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in hfix or hvar single-edge (left/right) extension functions. Tested-by: Ingo Brückl <ib@wupperonline.de> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-24 13:36:55 +02:00
Michael Niedermayer	c35d29a9c8	avcodec/x86/dsputil_init: move ff_idct_xvid_mmxext init This decreases the diff to libav Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-15 02:06:12 +02:00
Michael Niedermayer	ab8cbfe0dd	avcodec/x86/dsputil_init: remove duplicated sse2 idct init Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-15 01:59:36 +02:00
Michael Niedermayer	1bf8fa75ee	avcodec/x86/dsputil_init: fix cpu flag checks Fixes linking failure with --disable-sse2 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-15 01:46:21 +02:00
Ronald S. Bultje	20d78a8606	libavcodec/x86: Fix emulated_edge_mc SSE code to not contain SSE2 instructions on x86-32. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-10 13:36:06 +02:00
Ronald S. Bultje	ad75d2b590	x86: Fix compilation with nasm on PPC & OS/2 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-08 12:36:19 +02:00
Michael Niedermayer	deb5addcff	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: h264_idct: Update comments to match 8/10-bit depth optimization split Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-08 12:10:02 +02:00
Michael Niedermayer	1f17619fe4	Merge commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450' * commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450': x86inc: Utilize the shadow space on 64-bit Windows Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-08 11:23:00 +02:00
Ronald S. Bultje	ba9c557b92	avcodec/x86/vp9dsp: Fix compilation with nasm. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-08 02:27:12 +02:00
Diego Biurrun	6405ca7d4a	x86: h264_idct: Update comments to match 8/10-bit depth optimization split	2013-10-07 21:46:46 +02:00
Henrik Gramner	bbe4a6db44	x86inc: Utilize the shadow space on 64-bit Windows Store XMM6 and XMM7 in the shadow space in functions that clobber them. This way we don't have to adjust the stack pointer as often, reducing the number of instructions as well as code size. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-10-07 06:25:35 -04:00
Michael Niedermayer	b67cb58520	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: fdct: Employ more specific ifdefs Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-07 11:37:47 +02:00
Diego Biurrun	ce1e8045e0	x86: fdct: Employ more specific ifdefs This avoids building mmxext and sse2 code when disabled by configure.	2013-10-06 22:02:25 +02:00
Michael Niedermayer	c86955d24a	Merge commit '2ddb35b91131115c094d90e04031451023441b4d' * commit '2ddb35b91131115c094d90e04031451023441b4d': x86: dsputil: Separate ff_add_hfyu_median_prediction_cmov from dsputil_mmx Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-06 11:50:01 +02:00
Michael Niedermayer	7fb123429e	Merge commit '258414d0771845d20f646ffe4d4e60f22fba217c' * commit '258414d0771845d20f646ffe4d4e60f22fba217c': x86: fdct: Initialize optimized fdct implementations in the standard way Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-06 11:31:01 +02:00
Michael Niedermayer	d0b2703676	Merge commit '0b8b2ae5e93d616c2ece59f7175f483154cff918' * commit '0b8b2ae5e93d616c2ece59f7175f483154cff918': x86: xviddct: Employ more specific ifdefs Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-06 11:25:22 +02:00
Diego Biurrun	2ddb35b911	x86: dsputil: Separate ff_add_hfyu_median_prediction_cmov from dsputil_mmx The function does not depend on MMX and compilation without MMX enabled fails if the function is compiled conditional on MMX availability.	2013-10-05 19:21:15 +02:00
Diego Biurrun	258414d077	x86: fdct: Initialize optimized fdct implementations in the standard way	2013-10-05 18:20:52 +02:00
Diego Biurrun	0b8b2ae5e9	x86: xviddct: Employ more specific ifdefs This avoids building mmxext and sse2 code when disabled by configure.	2013-10-05 18:14:58 +02:00
Michael Niedermayer	9d8e8495c9	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: fdct: Only build fdct code if encoders have been enabled Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-10-04 14:36:58 +02:00
Diego Biurrun	6cc133ec58	x86: fdct: Only build fdct code if encoders have been enabled fdct is only initialized if encoders are enabled.	2013-10-04 10:50:44 +02:00
Ronald S. Bultje	f1548c008f	Full-pixel MC functions. Decoding time of ped1080p.webm goes from 11.3sec to 11.1sec.	2013-10-02 21:03:15 -04:00
Ronald S. Bultje	c07ac8d467	VP9 MC (ssse3) optimizations. Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.	2013-10-02 21:03:15 -04:00
Ronald S. Bultje	face578d56	Rewrite emu_edge functions to have separate src/dst_stride arguments. This allows supporting files for which the image stride is smaller than the max. block size + number of subpel mc taps, e.g. a 64x64 VP9 file or a 16x16 VP8 file with -fflags +emu_edge.	2013-09-28 20:28:08 -04:00
Ronald S. Bultje	c341f734e5	Convert multiplier for MV from int to ptrdiff_t. This prevents emulated_edge_mc from not undoing mvy*stride-related integer overflows. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-09-28 11:28:09 +02:00
Martin Storsjö	ede42109e7	x86: Add an xmm clobbering wrapper for avcodec_encode_video2 This is required since `187105ff8` when we started trying to wrap this function as well. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-09-17 10:53:23 +02:00
Martin Storsjö	1daea5232f	x86: Add an xmm clobbering wrapper for avcodec_encode_video2 This is required since `187105ff8` when we started trying to wrap this function as well. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-09-16 22:22:41 +03:00
Hendrik Leppkes	a06a5b78e2	mathops/x86: work around inline asm miscompilation with GCC 4.8.1 The volatile is not required here, and prevents a miscompilation with GCC 4.8.1 when building on x86 with --cpu=i686 Signed-off-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2013-09-15 11:15:07 -04:00
Michael Niedermayer	2ffead98dd	avcodec: add emuedge_linesize_type. Currently all uses of the emu edge code, as well as the code itself, assume int linesize; changing some but not all would introduce a security issue. Once all use this typedef, a simple search and replace can be done to switch them all to ptrdiff_t. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-09-04 14:29:20 +02:00
Paul B Mahol	6053812814	x86/simple_idct: use LOCAL_ALIGNED instead of DECLARE_ALIGNED Signed-off-by: Paul B Mahol <onemda@gmail.com>	2013-09-03 17:02:49 +00:00
Thilo Borgmann	d814a839ac	Reinstate proper FFmpeg license for all files.	2013-08-30 15:47:38 +00:00
Carl Eugen Hoyos	8fe1fb41ac	Fix compilation with --disable-mmx.	2013-08-30 15:21:15 +02:00
Michael Niedermayer	62a6052974	Merge commit 'e998b56362c711701b3daa34e7b956e7126336f4' * commit 'e998b56362c711701b3daa34e7b956e7126336f4': x86: avcodec: Consistently structure CPU extension initialization Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-30 12:50:01 +02:00
Michael Niedermayer	7fb758cd8e	avcodec/x86/lpc: Fix cpu flag checks so they work Broken by `6369ba3c9c` Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-30 12:34:52 +02:00
Michael Niedermayer	c1913064e3	avcodec/x86/vp8dsp: Fix cpu flag checks so they work Broken by `6369ba3c9c` Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-30 12:33:56 +02:00
Michael Niedermayer	8be0e2bd43	Merge commit '6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0' * commit '6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0': x86: avcodec: Use convenience macros to check for CPU flags Conflicts: libavcodec/x86/dsputil_init.c libavcodec/x86/hpeldsp_init.c libavcodec/x86/motion_est.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-30 12:08:28 +02:00
Michael Niedermayer	37494bdb0d	Merge commit 'cd529172377229f2e86987869ccc08f426bfe114' * commit 'cd529172377229f2e86987869ccc08f426bfe114': x86: rv40dsp: Move inline assembly optimizations out of YASM init section Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-29 14:48:46 +02:00
Michael Niedermayer	477641e9f8	Merge commit 'a64f6a04ac5773aeff2003897455dadb9609f18b' * commit 'a64f6a04ac5773aeff2003897455dadb9609f18b': dsputil: x86: Hide arch-specific initialization details Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-29 14:32:05 +02:00
Diego Biurrun	e998b56362	x86: avcodec: Consistently structure CPU extension initialization	2013-08-29 13:07:37 +02:00
Diego Biurrun	6369ba3c9c	x86: avcodec: Use convenience macros to check for CPU flags	2013-08-29 13:07:37 +02:00
Diego Biurrun	cd52917237	x86: rv40dsp: Move inline assembly optimizations out of YASM init section	2013-08-28 23:59:24 +02:00
Diego Biurrun	a64f6a04ac	dsputil: x86: Hide arch-specific initialization details Also give consistent names to init functions.	2013-08-28 23:59:24 +02:00
Michael Niedermayer	f9418d156f	Merge commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9' * commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9': vp56: Mark VP6-only optimizations as such. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-24 11:04:11 +02:00
Diego Biurrun	8506ff97c9	vp56: Mark VP6-only optimizations as such. Most of our VP56 optimizations are VP6-only and will stay that way. So avoid compiling them for VP5-only builds.	2013-08-23 14:42:19 +02:00
Michael Niedermayer	f903b42663	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Split DCT and FFT initialization into separate files Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-22 14:28:42 +02:00
Michael Niedermayer	503aec1425	Merge commit '0b45269c2d732d15afa2de9c475d85fcf5561ac4' * commit '0b45269c2d732d15afa2de9c475d85fcf5561ac4': x86: h264_idct: Remove incorrect comment Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-08-22 13:04:07 +02:00
Diego Biurrun	e7b31844f6	x86: Split DCT and FFT initialization into separate files	2013-08-21 20:15:27 +02:00
Diego Biurrun	0b45269c2d	x86: h264_idct: Remove incorrect comment	2013-08-21 15:09:58 +02:00
Michael Niedermayer	9d01bf7d66	Merge remote-tracking branch 'qatar/master' * qatar/master: Consistently use "cpu_flags" as variable/parameter name for CPU flags Conflicts: libavcodec/x86/dsputil_init.c libavcodec/x86/h264dsp_init.c libavcodec/x86/hpeldsp_init.c libavcodec/x86/motion_est.c libavcodec/x86/mpegvideo.c libavcodec/x86/proresdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-18 09:53:47 +02:00
Diego Biurrun	3ac7fa81b2	Consistently use "cpu_flags" as variable/parameter name for CPU flags	2013-07-18 00:31:35 +02:00
Christophe Gisquet	b6293e2798	fmtconvert: Explicitly use int32_t instead of int Signed-off-by: Martin Storsjö <martin@martin.st>	2013-07-17 11:02:47 +03:00
Michael Niedermayer	9d47333e3e	Merge commit '2b379a925162b6783bd9a81dc03e647e8b65494c' * commit '2b379a925162b6783bd9a81dc03e647e8b65494c': mlpdsp: x86: Respect cpuflags Conflicts: libavcodec/x86/mlpdsp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-12 13:10:53 +02:00
Luca Barbato	2b379a9251	mlpdsp: x86: Respect cpuflags	2013-07-12 04:34:49 +02:00
Michael Niedermayer	707b2135fd	libavcodec/x86/mpegvideo: Move mmx functions under HAVE_MMX_INLINE should fix ticket2755 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-07 01:50:59 +02:00
Michael Niedermayer	abce6dfd9e	avcodec/x86/vp3dsp_init: move mmx functions under HAVE_MMX_INLINE Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-07 01:50:59 +02:00
Michael Niedermayer	66537c7efd	avcodec/x86/cabac: Disable get_cabac_bypass_x86() on broken llvm/clang This should fix fate on these platforms Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-06 16:24:27 +02:00
Michael Niedermayer	32de28053d	avcodec/x86/cabac: factorize broken llvm/clang check out Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-06 16:24:27 +02:00
Michael Niedermayer	cced6f4d58	Merge commit 'e6d8acf6a8fba4743eb56eabe72a741d1bbee3cb' * commit 'e6d8acf6a8fba4743eb56eabe72a741d1bbee3cb': indeo: use a typedef for the mc function pointer cabac: x86 version of get_cabac_bypass aic: use chroma scan tables while decoding luma component in progressive mode Conflicts: libavcodec/aic.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-05 11:41:30 +02:00
Jason Garrett-Glaser	d222f6e39e	cabac: x86 version of get_cabac_bypass Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-07-04 16:06:10 +02:00
Michael Niedermayer	b791a0831b	avcodec/x86/dsputil_init: only use xvid idct for lowres=0 Fixes crash Fixes Ticket2714 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-07-01 20:56:37 +02:00
Hendrik Leppkes	659df32a9d	mathops/x86: work around inline asm miscompilation with GCC 4.8.1 The volatile is not required here, and prevents a miscompilation with GCC 4.8.1 when building on x86 with --cpu=i686 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-06-21 12:47:03 +02:00
Michael Niedermayer	3a0e21f037	Merge commit '186599ffe0a94d587434e5e46e190e038357ed99' * commit '186599ffe0a94d587434e5e46e190e038357ed99': build: cosmetics: Place unconditional before conditional OBJS lines Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-30 10:49:43 +02:00
Diego Biurrun	186599ffe0	build: cosmetics: Place unconditional before conditional OBJS lines Signed-off-by: Martin Storsjö <martin@martin.st>	2013-05-30 02:17:31 +03:00
Christophe Gisquet	f49564c607	fmtconvert: int32_t input to int32_to_float_fmul_scalar It was previously declared as int. Does not change fate results for x86. Conflicts: libavcodec/ppc/fmtconvert_altivec.c Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-18 18:01:16 +02:00
Michael Niedermayer	0d83b5722e	Merge commit '9cacdabd1c8cd257a942d8289349c37d992989b7' * commit '9cacdabd1c8cd257a942d8289349c37d992989b7': jpegls: cosmetics: Drop some unnecessary parentheses mpegvideo: Remove commented-out PARANOID debug cruft Conflicts: libavcodec/jpegls.c libavcodec/mpegvideo.c libavcodec/x86/mpegvideo.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-16 09:19:11 +02:00
Diego Biurrun	004b81c465	mpegvideo: Remove commented-out PARANOID debug cruft	2013-05-15 23:53:42 +02:00
Michael Niedermayer	a887372109	Merge commit '1399931d07f0f37ef4526eb8d39d33c64e09618a' * commit '1399931d07f0f37ef4526eb8d39d33c64e09618a': x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-14 12:12:20 +02:00
Michael Niedermayer	b2da63db50	Merge commit '245b76a108585b6fb52eebc2626c472d6fa530dc' * commit '245b76a108585b6fb52eebc2626c472d6fa530dc': x86: dsputil: Split inline assembly from init code Conflicts: libavcodec/x86/dsputil_mmx.c Note, the author attribution is left in place and not removed as it is in the merged commit. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-14 11:19:20 +02:00
Michael Niedermayer	eda9d97b7a	Merge commit '46bb456853b197f4562de7acf5d42abf11ded9be' * commit '46bb456853b197f4562de7acf5d42abf11ded9be': x86: dsputil: Refactor pixels16 wrapper functions with a macro Conflicts: libavcodec/x86/hpeldsp_avg_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-14 10:18:46 +02:00
Michael Niedermayer	69d52ef8ff	Merge commit 'f54b55058a429c4eea5bae7e5bcb49bd29b34199' * commit 'f54b55058a429c4eea5bae7e5bcb49bd29b34199': configure: Rename cmov processor capability to i686 Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-13 11:29:40 +02:00
Diego Biurrun	1399931d07	x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h The header is not (anymore) MMX-specific.	2013-05-12 22:28:07 +02:00
Diego Biurrun	245b76a108	x86: dsputil: Split inline assembly from init code Also remove some pointless comments.	2013-05-12 22:28:07 +02:00
Diego Biurrun	46bb456853	x86: dsputil: Refactor pixels16 wrapper functions with a macro	2013-05-12 22:28:07 +02:00
Diego Biurrun	f54b55058a	configure: Rename cmov processor capability to i686 The goal is to make the capability slightly more general and have it cover the availability of the nopl instruction in addition to cmov.	2013-05-12 21:23:38 +02:00
Michael Niedermayer	5e1278c640	Merge commit '2c299d4165cd9653153e12270971c2368551b79e' * commit '2c299d4165cd9653153e12270971c2368551b79e': x86: sbrdsp: implement SSE2 qmf_pre_shuffle Conflicts: libavcodec/x86/sbrdsp.asm libavcodec/x86/sbrdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-10 23:16:09 +02:00
Christophe Gisquet	2c299d4165	x86: sbrdsp: implement SSE2 qmf_pre_shuffle From 253 to 51 cycles on Arrandale and Win64. 44 cycles on SandyBridge. Signed-off-by: Anton Khirnov <anton@khirnov.net>	2013-05-10 09:31:27 +02:00
Michael Niedermayer	769efe56b1	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Remove unused argument from QPEL_OP macro Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:55:46 +02:00
Michael Niedermayer	bda5487d92	Merge commit '3d40c1ee742db5f13ebcf53c2d1fa4bf4f39bcd2' * commit '3d40c1ee742db5f13ebcf53c2d1fa4bf4f39bcd2': x86: dsputil: Move TRANSPOSE4 macro to the only place it is used Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:43:00 +02:00
Michael Niedermayer	164899c6e8	Merge commit '71469f3b636fbe06b6aca5933f9fdebddd8d5f57' * commit '71469f3b636fbe06b6aca5933f9fdebddd8d5f57': x86: dsputil: Move constant declarations into separate header Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:36:29 +02:00
Michael Niedermayer	5747d835c7	Merge commit 'ed880050edf061b38d3e39e25657c59ad9108b27' * commit 'ed880050edf061b38d3e39e25657c59ad9108b27': x86: dsputil: Group all assembly constants together in constants.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:23:40 +02:00
Michael Niedermayer	3469c24a10	Merge commit '87614667606b42476f9017d79faf12b45a0bd77c' * commit '87614667606b42476f9017d79faf12b45a0bd77c': x86: dsputil: Move ff_pd assembly constants to the only place they are used Conflicts: libavcodec/x86/lpc.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-09 11:01:14 +02:00
Diego Biurrun	f243bf7aa2	x86: dsputil: Remove unused argument from QPEL_OP macro	2013-05-08 18:18:58 +02:00
Diego Biurrun	3d40c1ee74	x86: dsputil: Move TRANSPOSE4 macro to the only place it is used	2013-05-08 18:18:23 +02:00
Diego Biurrun	71469f3b63	x86: dsputil: Move constant declarations into separate header	2013-05-08 18:18:23 +02:00
Michael Niedermayer	e3869dd17e	Merge commit '1b343cedd7cd68e7865aa5280d1568c7e5d79917' * commit '1b343cedd7cd68e7865aa5280d1568c7e5d79917': x86: dsputil: Remove unused ff_pb_3F constant x86: dsputil: Remove unused MOVQ_BONE macro Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-08 14:13:48 +02:00
Michael Niedermayer	69d2eff5af	Merge commit '63bac48f734fc69cca2ef2cfada92cd9a222734d' * commit '63bac48f734fc69cca2ef2cfada92cd9a222734d': x86: dsputil: Move rv40-specific functions where they belong Conflicts: libavcodec/x86/dsputil_mmx.c libavcodec/x86/dsputil_mmx.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-08 14:08:00 +02:00
Michael Niedermayer	2f9ef60c97	Merge commit '92f8e06ecb431a427ea13d794e5a6bc927a034d2' * commit '92f8e06ecb431a427ea13d794e5a6bc927a034d2': x86: dsputil hpeldsp: Move shared template functions into separate object Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-08 14:01:58 +02:00
Michael Niedermayer	bf18810a21	Merge commit '7edaf4edb5c3c04f34ad1242680cbc32d11f4087' * commit '7edaf4edb5c3c04f34ad1242680cbc32d11f4087': x86: rnd_template: Eliminate pointless OP_AVG macro indirection Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-08 13:55:31 +02:00
Christophe Gisquet	fc37cd4333	x86: sbrdsp: force PIC addressing for Win64 MSVC complains about the 32bits addressing, while mingw/gcc does not. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-08 03:07:03 +02:00
Diego Biurrun	ed880050ed	x86: dsputil: Group all assembly constants together in constants.c	2013-05-08 01:04:04 +02:00
Diego Biurrun	8761466760	x86: dsputil: Move ff_pd assembly constants to the only place they are used	2013-05-08 01:04:04 +02:00
Diego Biurrun	1b343cedd7	x86: dsputil: Remove unused ff_pb_3F constant	2013-05-07 18:03:35 +02:00
Diego Biurrun	63bac48f73	x86: dsputil: Move rv40-specific functions where they belong	2013-05-07 18:03:35 +02:00
Diego Biurrun	3334cbec0a	x86: dsputil: Remove unused MOVQ_BONE macro	2013-05-07 18:03:35 +02:00
Diego Biurrun	92f8e06ecb	x86: dsputil hpeldsp: Move shared template functions into separate object	2013-05-07 18:03:34 +02:00
Diego Biurrun	7edaf4edb5	x86: rnd_template: Eliminate pointless OP_AVG macro indirection	2013-05-07 18:03:34 +02:00
Michael Niedermayer	108e2ae829	Merge commit '110796739ab32854dc0b6b0a1c95e6ae98889062' * commit '110796739ab32854dc0b6b0a1c95e6ae98889062': x86: hpeldsp: Move avg_pixels8_x2_mmx() out of hpeldsp_rnd_template.c Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-06 21:00:40 +02:00
Michael Niedermayer	a6e782434a	Merge commit 'dc1b328d0df6e5ad5ff0ca4ae031e08466624f9c' * commit 'dc1b328d0df6e5ad5ff0ca4ae031e08466624f9c': x86: hpeldsp: Only compile MMX hpeldsp code if MMX is enabled Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-06 20:53:37 +02:00
Michael Niedermayer	32cc7dacde	Merge commit '9e5e76ef9ea803432ef2782a3f528c3f5bab621e' * commit '9e5e76ef9ea803432ef2782a3f528c3f5bab621e': x86: More specific ifdefs for dsputil/hpeldsp init functions Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-06 20:46:27 +02:00
Diego Biurrun	110796739a	x86: hpeldsp: Move avg_pixels8_x2_mmx() out of hpeldsp_rnd_template.c The function is only instantiated once, so there is no point in keeping it in a template file.	2013-05-06 11:02:08 +02:00
Diego Biurrun	dc1b328d0d	x86: hpeldsp: Only compile MMX hpeldsp code if MMX is enabled	2013-05-06 11:02:08 +02:00
Diego Biurrun	9e5e76ef9e	x86: More specific ifdefs for dsputil/hpeldsp init functions	2013-05-06 11:02:07 +02:00
Michael Niedermayer	0aa095483d	Merge commit '6fee1b90ce3bf4fbdfde7016e0890057c9000487' * commit '6fee1b90ce3bf4fbdfde7016e0890057c9000487': avcodec: Add av_cold attributes to init functions missing them Conflicts: libavcodec/aacpsy.c libavcodec/atrac3.c libavcodec/dvdsubdec.c libavcodec/ffv1.c libavcodec/ffv1enc.c libavcodec/h261enc.c libavcodec/h264_parser.c libavcodec/h264dsp.c libavcodec/h264pred.c libavcodec/libschroedingerenc.c libavcodec/libxvid_rc.c libavcodec/mpeg12.c libavcodec/mpeg12enc.c libavcodec/proresdsp.c libavcodec/rangecoder.c libavcodec/videodsp.c libavcodec/x86/proresdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-05 11:34:29 +02:00
Diego Biurrun	6fee1b90ce	avcodec: Add av_cold attributes to init functions missing them	2013-05-04 21:09:45 +02:00
Michael Niedermayer	0104570fb6	Merge commit 'a5f8873620ce502d37d0cc3ef93ada2ea8fb8de7' * commit 'a5f8873620ce502d37d0cc3ef93ada2ea8fb8de7': silly typo fixes Conflicts: doc/protocols.texi libavcodec/aacpsy.c libavformat/utils.c tools/patcheck Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-04 10:17:52 +02:00
Michael Niedermayer	711c8ee71d	Merge commit '4a7af92cc80ced8498626401ed21f25ffe6740c8' * commit '4a7af92cc80ced8498626401ed21f25ffe6740c8': sbrdsp: Unroll and use integer operations sbrdsp: Unroll sbr_autocorrelate_c x86: sbrdsp: Implement SSE2 qmf_deint_bfly Conflicts: libavcodec/sbrdsp.c libavcodec/x86/sbrdsp.asm libavcodec/x86/sbrdsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-04 10:07:43 +02:00
Diego Biurrun	a5f8873620	silly typo fixes	2013-05-03 18:26:12 +02:00
Christophe Gisquet	5a97469a4f	x86: sbrdsp: Implement SSE2 qmf_deint_bfly Sandybridge: 47 cycles Having a loop counter is a 7 cycle gain. Unrolling is another 7 cycle gain. Working in reverse scan is another 6 cycles. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-05-03 18:23:14 +02:00
Michael Niedermayer	05599308e9	Merge commit 'bf7c3c6b157f7938578f964b62cffd5e504940be' * commit 'bf7c3c6b157f7938578f964b62cffd5e504940be': x86: dsputil: Move cavs and vc1-specific functions where they belong Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-03 10:50:27 +02:00
Michael Niedermayer	35ef98013d	Merge commit '932806232108872655556100011fe369125805d3' * commit '932806232108872655556100011fe369125805d3': x86: dsputil: Move avg_pixels16_mmx() out of rnd_template.c x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-03 10:44:09 +02:00
Michael Niedermayer	ed1697ffcb	Merge commit '9b3a04d30691e85b77e63f75f5f26a93c3a000cd' * commit '9b3a04d30691e85b77e63f75f5f26a93c3a000cd': x86: Move duplicated put_pixels{8|16}_mmx functions into their own file Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-03 10:36:52 +02:00
Diego Biurrun	bf7c3c6b15	x86: dsputil: Move cavs and vc1-specific functions where they belong	2013-05-02 11:45:37 +02:00
Diego Biurrun	9328062321	x86: dsputil: Move avg_pixels16_mmx() out of rnd_template.c The function does not do any rounding, so there is no point in keeping it in a round template file.	2013-05-02 11:45:37 +02:00
Diego Biurrun	9c112a6158	x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c The function is only instantiated once, so there is no point in keeping it in a template file.	2013-05-02 11:45:37 +02:00
Diego Biurrun	9b3a04d306	x86: Move duplicated put_pixels{8|16}_mmx functions into their own file	2013-05-02 11:16:45 +02:00
Michael Niedermayer	dbcf7e9ef7	Merge commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d' * commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d': ppc: Drop unnecessary ff_ name prefixes from static functions x86: Drop unnecessary ff_ name prefixes from static functions arm: Drop unnecessary ff_ name prefixes from static functions Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-01 18:21:35 +02:00
Michael Niedermayer	3ad5d8694c	Merge commit '6b110d3a739c31602b59887ad65c67025df3f49d' * commit '6b110d3a739c31602b59887ad65c67025df3f49d': ppc: More consistent names for H.264 optimizations files mpegaudiosp: More consistent names for ppc/x86 optimization files Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-05-01 17:53:33 +02:00
Diego Biurrun	f2e9d44a57	x86: Drop unnecessary ff_ name prefixes from static functions	2013-04-30 16:02:03 +02:00
Diego Biurrun	643e433bf7	mpegaudiosp: More consistent names for ppc/x86 optimization files	2013-04-30 12:19:43 +02:00
Michael Niedermayer	01a5a3a2e8	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Remove a set of pointless #ifs around function declarations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-30 08:42:10 +02:00
Michael Niedermayer	a3030d47e7	Merge commit '85f2f82af66fade2f5af2a03c5011d7de1b6e295' * commit '85f2f82af66fade2f5af2a03c5011d7de1b6e295': x86: dsputil: cosmetics: Group ff_{avg|put}_pixels16_mmxext() declarations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-30 08:25:24 +02:00
Diego Biurrun	97c56ad796	x86: dsputil: Remove a set of pointless #ifs around function declarations	2013-04-30 01:42:32 +02:00
Diego Biurrun	85f2f82af6	x86: dsputil: cosmetics: Group ff_{avg|put}_pixels16_mmxext() declarations	2013-04-30 01:41:05 +02:00
Michael Niedermayer	16b2472d20	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: hpeldsp: Remove unused macro definitions Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-29 22:01:53 +02:00
Diego Biurrun	20784aa678	x86: hpeldsp: Remove unused macro definitions	2013-04-29 15:57:00 +02:00
Michael Niedermayer	3fa6c992d9	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: ac3dsp: Remove 3dnow version of ff_ac3_extract_exponents Conflicts: tests/fate/ac3.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-27 11:35:03 +02:00
Diego Biurrun	7c00e9d8ae	x86: ac3dsp: Remove 3dnow version of ff_ac3_extract_exponents The function requires increasing the fuzz factor for the ac3/eac3 encode tests and even so makes fate fail. It only provides a slight encoding speedup for legacy CPUs that do not support SSE2. Thus its benefit is not worth the trouble it creates and fixing it would be a waste of time.	2013-04-26 21:06:52 +02:00
Michael Niedermayer	721ffc691a	Merge remote-tracking branch 'qatar/master' * commit '74685f6783e77f2545d48bd2124945ad5be39982': x86: Rename dsputil_rnd_template.c to rnd_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-26 11:10:57 +02:00
Martin Storsjö	74685f6783	x86: Rename dsputil_rnd_template.c to rnd_template.c This makes it less confusing when this template is shared both by dsputil and by hpeldsp. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-25 23:03:09 +03:00
Michael Niedermayer	2e789d165b	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Get rid of duplication between *_rnd_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-24 10:17:28 +02:00
Michael Niedermayer	c2a0833c09	Merge commit '6a8561dbd7c078eb75985f7011ad1ad3fda9e223' * commit '6a8561dbd7c078eb75985f7011ad1ad3fda9e223': x86: Factorize duplicated inline assembly snippets Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-24 10:01:15 +02:00
Michael Niedermayer	fc69033371	avcodec/x86/sbrdsp_init: disable using the noise code in x86_64 MSVC, Try #2 This should fix building with MSVC until someone can change the code so it works with MSVC Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-24 02:02:25 +02:00
Martin Storsjö	486f76f029	x86: Get rid of duplication between *_rnd_template.c Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-23 23:30:17 +03:00
Martin Storsjö	6a8561dbd7	x86: Factorize duplicated inline assembly snippets Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-04-23 15:07:31 +02:00
Michael Niedermayer	7a617d6c17	avcodec/x86/sbrdsp_init: disable using the noise code in x86_64 MSVC This should fix building with MSVC until someone can change the code so it works with MSVC Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-23 12:46:28 +02:00
Michael Niedermayer	0a73803c86	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Move some conditional code around to avoid unused variable warnings Conflicts: libavcodec/x86/dsputil_mmx.c libavfilter/x86/vf_yadif_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-23 11:01:46 +02:00
Michael Niedermayer	430d69c942	Merge commit 'b4ad7c54c878dead7dfa4838b912a530c1debe85' * commit 'b4ad7c54c878dead7dfa4838b912a530c1debe85': x86: cavs: Refactor duplicate dspfunc macro Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-22 18:37:44 +02:00
Michael Niedermayer	f84e373797	Merge commit '78fa0bd0f7067868943c0899907e313414492426' * commit '78fa0bd0f7067868943c0899907e313414492426': x86: cavs: Put mmx-specific code into its own init function Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-22 18:29:05 +02:00
Diego Biurrun	c1ad70c3cb	x86: Move some conditional code around to avoid unused variable warnings	2013-04-22 17:50:02 +02:00
Michael Niedermayer	2288c77689	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Remove some duplicate function declarations Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-22 13:07:20 +02:00
Diego Biurrun	b4ad7c54c8	x86: cavs: Refactor duplicate dspfunc macro	2013-04-22 12:05:09 +02:00
Diego Biurrun	78fa0bd0f7	x86: cavs: Put mmx-specific code into its own init function Before, this code was labeled as mmxext and enabled both for the 3dnow and the mmxext case.	2013-04-22 10:42:50 +02:00
Diego Biurrun	311a592dfc	x86: Remove some duplicate function declarations	2013-04-22 02:29:57 +02:00
Michael Niedermayer	0dd25e4699	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: Remove unused inline asm instruction defines vc1: Remove now unused variables Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-20 16:24:15 +02:00
Michael Niedermayer	d0aa60da10	Merge commit '8db00081a37d5b7e23918ee500bb16bc59b57197' * commit '8db00081a37d5b7e23918ee500bb16bc59b57197': x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp Conflicts: libavcodec/hpeldsp.c libavcodec/hpeldsp.h libavcodec/x86/Makefile libavcodec/x86/dsputil_mmx.c libavcodec/x86/hpeldsp_init.c libavcodec/x86/hpeldsp_rnd_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-20 13:42:51 +02:00
Martin Storsjö	b71a0507b0	x86: Remove unused inline asm instruction defines Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-20 00:44:54 +03:00
Ronald S. Bultje	8db00081a3	x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-19 23:18:53 +03:00
Christophe Gisquet	76c7277385	x86: sbrdsp: implement SSE2 hf_apply_noise 233 to 105 cycles on Arrandale and Win64. Replacing the multiplication by s_m[m] by a pand and a pxor with appropriate vectors is slower. Unrolling is a 15 cycles win. A SSE version was 4 cycles slower. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-19 13:19:45 +02:00
Michael Niedermayer	d5c31403aa	Merge commit 'c46819f2299c73cd1bfa8ef04d08b0153a5699d3' * commit 'c46819f2299c73cd1bfa8ef04d08b0153a5699d3': x86: Move constants to the only place where they are used Conflicts: libavcodec/x86/vp3dsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-16 00:44:20 +02:00
Ronald S. Bultje	015821229f	vp3: Use full transpose for all IDCTs This way, the special IDCT permutations are no longer needed. This is similar to how H264 does it, and removes the dsputil dependency imposed by the scantable code. Also remove the unused type == 0 cases from the plain C version of the idct. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-15 12:32:05 +03:00
Ronald S. Bultje	c46819f229	x86: Move constants to the only place where they are used Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-15 12:17:39 +03:00
Michael Niedermayer	34b78ad04f	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Move some ifdefs to avoid unused variable warnings Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-12 23:38:41 +02:00
Michael Niedermayer	ed3680bc9b	Merge commit '2004c7c8f763280ff3ba675ea21cf25396528fd3' * commit '2004c7c8f763280ff3ba675ea21cf25396528fd3': x86: dsputil: cosmetics: Remove two pointless variable indirections Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-12 23:28:23 +02:00
Michael Niedermayer	694fa0035a	Merge commit 'c51a3a5bd9a5b404176ff343ecadb80b2553b256' * commit 'c51a3a5bd9a5b404176ff343ecadb80b2553b256': x86: dsputil: Refactor some ff_{avg|put}_pixels function declarations Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-12 22:36:31 +02:00
Michael Niedermayer	43bf4ee9a9	Merge commit 'e027032fc6a49db5a4ce12fc3e09ffb86ff20522' * commit 'e027032fc6a49db5a4ce12fc3e09ffb86ff20522': x86: dsputil: ff_h263_*_loop_filter declarations to a more suitable place Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-12 22:29:30 +02:00
Michael Niedermayer	52bda1d903	Merge commit 'a89c05500f68d94a0269e68bc522abfd420c5497' * commit 'a89c05500f68d94a0269e68bc522abfd420c5497': x86: h264qpel: int --> ptrdiff_t for some line_size parameters Conflicts: libavcodec/x86/qpelbase.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-12 22:22:27 +02:00
Diego Biurrun	a3cb865310	x86: dsputil: Move some ifdefs to avoid unused variable warnings	2013-04-12 09:36:47 +02:00
Diego Biurrun	2004c7c8f7	x86: dsputil: cosmetics: Remove two pointless variable indirections	2013-04-12 09:36:47 +02:00
Diego Biurrun	c51a3a5bd9	x86: dsputil: Refactor some ff_{avg|put}_pixels function declarations	2013-04-12 09:36:46 +02:00
Diego Biurrun	e027032fc6	x86: dsputil: ff_h263_*_loop_filter declarations to a more suitable place	2013-04-12 09:36:46 +02:00
Diego Biurrun	a89c05500f	x86: h264qpel: int --> ptrdiff_t for some line_size parameters	2013-04-12 09:30:12 +02:00
Michael Niedermayer	580a0600ef	Merge remote-tracking branch 'qatar/master' * qatar/master: Move misplaced file author information where it belongs Conflicts: libavcodec/adpcm.c libavcodec/adpcmenc.c libavcodec/gif.c libavcodec/x86/dsputilenc_mmx.c libavcodec/x86/fmtconvert_init.c libavformat/au.c libavformat/gif.c libavformat/mov.c libavformat/nsvdec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-11 15:56:18 +02:00
Michael Niedermayer	742c392885	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputil: Make dsputil selectable Conflicts: configure libavcodec/Makefile libavcodec/x86/Makefile libavcodec/x86/constants.c libavcodec/x86/dsputil_mmx.c libavcodec/x86/dsputil_mmx.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-11 12:32:29 +02:00
Michael Niedermayer	0724b4a16d	Merge commit '62844c3fd66940c7747e9b2bb7804e265319f43f' * commit '62844c3fd66940c7747e9b2bb7804e265319f43f': h264: Integrate clear_blocks calls with IDCT Conflicts: libavcodec/arm/h264idct_neon.S libavcodec/h264idct_template.c libavcodec/x86/h264_idct.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-11 11:53:19 +02:00
Diego Biurrun	ac9362c5d9	Move misplaced file author information where it belongs	2013-04-11 02:42:11 +02:00
Ronald S. Bultje	b93b27edb0	dsputil: Make dsputil selectable Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-10 11:04:05 +03:00
Ronald S. Bultje	62844c3fd6	h264: Integrate clear_blocks calls with IDCT The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700 to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip tested (cathedral), i.e. almost 30 cycles per mb faster. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-10 11:03:06 +03:00
Christophe Gisquet	2383068cbf	x86: sbrdsp: implement SSE2 qmf_pre_shuffle From 253 to 51 cycles on Arrandale and Win64. 44 cycles on SandyBridge. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-10 02:42:22 +02:00
Ronald S. Bultje	610b18e2e3	x86: qpel: Move fullpel and l2 functions to a separate file This way, they can be shared between mpeg4qpel and h264qpel without requiring either one to be compiled unconditionally. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-08 12:38:33 +03:00
Christophe Gisquet	e2946e5c34	x86: sbrdsp: implement SSE qmf_deint_bfly From 312 to 89/68 (sse/sse2) cycles on Arrandale and Win64. Sandybridge: 68/47 cycles. Having a loop counter is a 7 cycle gain. Unrolling is another 7 cycle gain. Working in reverse scan is another 6 cycles. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-08 02:26:34 +02:00
Michael Niedermayer	32bac65ba0	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: sbrdsp: Implement SSE neg_odd_64 Conflicts: libavcodec/x86/sbrdsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-06 13:30:19 +02:00
Christophe Gisquet	f4b0d12f5b	x86: sbrdsp: Implement SSE neg_odd_64 Timing on Arrandale: C SSE Win32: 57 44 Win64: 47 38 Unrolling and not storing mask both save some cycles. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-04-05 22:47:04 +02:00
Christophe Gisquet	37a9708391	x86: sbrdsp: implement SSE neg_odd_64 Timing on Arrandale: C SSE Win32: 57 44 Win64: 47 38 Unrolling and not storing mask both save some cycles. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-04-05 16:42:12 +02:00
Carl Eugen Hoyos	670bb1c979	Fix compilation with --enable-decoder=webp --disable-decoder=vp8	2013-03-30 08:25:44 +01:00
Michael Niedermayer	63a97d5674	Merge commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa' * commit 'b6649ab5037fb55f78c2606f3d23cea0867cdeaa': cosmetics: Remove unnecessary extern keywords from function declarations Conflicts: libswscale/x86/swscale.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-28 11:20:41 +01:00
Diego Biurrun	b6649ab503	cosmetics: Remove unnecessary extern keywords from function declarations	2013-03-27 14:21:45 +01:00
Michael Niedermayer	ef8ab2f953	Merge commit '3b2d0ec473b036bdd0a5bc0d896fd5292915f44d' * commit '3b2d0ec473b036bdd0a5bc0d896fd5292915f44d': configure: Remove the mpegvideo dependency from svq1 x86: vc1dsp: Fix indentation Conflicts: configure Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-27 11:32:45 +01:00
Martin Storsjö	a2acadd058	x86: vc1dsp: Fix indentation Signed-off-by: Martin Storsjö <martin@martin.st>	2013-03-26 15:49:42 +02:00
Michael Niedermayer	9b9205e760	x86/dsputil.asm: make unaligned bswap actually work Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-26 13:07:46 +01:00
Michael Niedermayer	cb69a9dbf4	Merge commit 'e5c2794a7162e485eefd3133af5b98fd31386aeb' * commit 'e5c2794a7162e485eefd3133af5b98fd31386aeb': x86: consistently use unaligned movs in the unaligned bswap Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-26 13:07:37 +01:00
Michael Niedermayer	ea7b96af96	avcodec/x86/dsputil_qns_template: use av_assert Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-26 04:08:28 +01:00
Janne Grunau	e5c2794a71	x86: consistently use unaligned movs in the unaligned bswap Fixes fate errors in asv1, ffvhuff and huffyuv on x86_32.	2013-03-25 12:11:11 +01:00
Martin Storsjö	285ff14413	x86: Change a missed occurrence of int to ptrdiff_t for strides Signed-off-by: Martin Storsjö <martin@martin.st>	2013-03-24 12:06:53 +02:00
Martin Storsjö	352dbdb96c	x86: Remove win64 xmm clobbering wrappers for the now removed avcodec_encode_video function Signed-off-by: Martin Storsjö <martin@martin.st>	2013-03-23 23:37:27 +02:00
Michael Niedermayer	b3e9f266e8	x86/mpegvideo: switch to av_assert Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-22 22:57:23 +01:00
Michael Niedermayer	cdbf8409ef	x86/h264_qpel: switch to av_assert Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-22 22:57:08 +01:00
Carl Eugen Hoyos	d98a5318fd	Fix compilation with --disable-mmx.	2013-03-22 13:00:50 +01:00
Michael Niedermayer	0f95534669	h264_qpel: fix another forgotten int stride Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-17 05:20:35 +01:00
Michael Niedermayer	c3bb2f7296	dsputil_mmx: remove unused variables Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-13 15:07:46 +01:00
Michael Niedermayer	db4e4f766c	Merge commit 'a8b6015823e628047a45916404c00044c5e80415' * commit 'a8b6015823e628047a45916404c00044c5e80415': dsputil: convert remaining functions to use ptrdiff_t strides Conflicts: libavcodec/dsputil.h libavcodec/dsputil_template.c libavcodec/h264qpel_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-13 14:18:53 +01:00
Ronald S. Bultje	3ced55d51c	Move x86 half-pel assembly from dsputil to hpeldsp.	2013-03-13 03:59:23 +01:00
Ronald S. Bultje	d1293512cf	vp3: use hpeldsp instead of dsputil for half-pel functions. This makes vp3 independent of dsputil.	2013-03-13 03:55:33 +01:00
Michael Niedermayer	1f27053b91	Merge commit 'de27d2b92fa97deb2856d18e9f5f19586ce45a0f' * commit 'de27d2b92fa97deb2856d18e9f5f19586ce45a0f': lavc: remove disabled FF_API_LIBMPEG2 cruft Conflicts: libavcodec/avcodec.h libavcodec/version.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-13 02:56:33 +01:00
Ronald S. Bultje	d85c9b036e	vp3/x86: use full transpose for all IDCTs. This way, the special IDCT permutations are no longer needed. Bfin code is disabled until someone updates it. This is similar to how H264 does it, and removes the dsputil dependency imposed by the scantable code. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-12 22:54:10 +01:00
Ronald S. Bultje	6a701306db	dsputil: make selectable. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-12 19:56:58 +01:00
Luca Barbato	a8b6015823	dsputil: convert remaining functions to use ptrdiff_t strides Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-03-12 18:26:42 +01:00
Ronald S. Bultje	22cc8a103c	x86/qpel: move fullpel and l2 functions to separate file. This way, they can be shared between mpeg4qpel and h264qpel without requiring either one to be compiled unconditionally. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-09 17:25:30 +01:00
Michael Niedermayer	1a7166a58d	Merge commit 'e8c52271c45ec27d783e74238dcfad0c2008731c' * commit 'e8c52271c45ec27d783e74238dcfad0c2008731c': Revert "Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm." Conflicts: libavcodec/x86/dsputil.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-03-01 13:39:47 +01:00
Diego Biurrun	e8c52271c4	Revert "Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm." This reverts commit `f90ff772e7`. The code should be put back in h264_qpel_8bit.asm, but unfortunately it is unconditionally used from dsputil_mmx.c since `71155d7`.	2013-02-28 21:50:02 +01:00
Michael Niedermayer	50c2738883	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Drop some unused function #defines Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-27 12:35:18 +01:00
Michael Niedermayer	cdb9752a0f	Merge commit '845cfc92f908791714b8c4c8a49c91b8c64b685e' * commit '845cfc92f908791714b8c4c8a49c91b8c64b685e': x86: dsputil: Drop aliasing of ff_put_pixels8_mmx to ff_put_pixels8_mmxext Conflicts: libavcodec/x86/dsputil_mmx.c Note, the commit message is wrong, there are no mmxext instructions as claimed in the function. The change should do no harm though Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-27 12:22:10 +01:00
Michael Niedermayer	04ec796bda	Merge commit '096cc11ec102701a18951b4f0437d609081ca1dd' * commit '096cc11ec102701a18951b4f0437d609081ca1dd': x86: vc1dsp: Move ff_avg_vc1_mspel_mc00_mmxext out of dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-27 11:59:31 +01:00
Michael Niedermayer	f2bbc2ffc3	Merge commit '31a23a0dc663bd42bf593275971b4277a479b73d' * commit '31a23a0dc663bd42bf593275971b4277a479b73d': x86: dsputil_mmx: Remove leftover inline assembly fragments Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-27 11:50:51 +01:00
Diego Biurrun	ebc701993f	x86: dsputil: Drop some unused function #defines	2013-02-26 23:36:24 +01:00
Diego Biurrun	845cfc92f9	x86: dsputil: Drop aliasing of ff_put_pixels8_mmx to ff_put_pixels8_mmxext The external assembly function uses mmxext instructions and should not be masqueraded as an mmx-only function. Instead, use the mmx-only inline assembly function.	2013-02-26 23:36:24 +01:00
Diego Biurrun	096cc11ec1	x86: vc1dsp: Move ff_avg_vc1_mspel_mc00_mmxext out of dsputil_mmx.c	2013-02-26 23:36:24 +01:00
Martin Storsjö	31a23a0dc6	x86: dsputil_mmx: Remove leftover inline assembly fragments These became unused in `71155d7b`. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-02-27 00:17:05 +02:00
Michael Niedermayer	a984efd104	Merge commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f' * commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f': Remove unnecessary dsputil.h #includes Conflicts: libavcodec/ffv1.c libavcodec/h261dec.c libavcodec/h261enc.c libavcodec/h264pred.c libavcodec/lpc.h libavcodec/mjpegdec.c libavcodec/rectangle.h libavcodec/x86/idct_sse2_xvid.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-26 13:05:10 +01:00
Diego Biurrun	c242bbd8b6	Remove unnecessary dsputil.h #includes	2013-02-26 00:51:34 +01:00
Matt Wolenetz	82a4a4e7ca	Fix Win64 AVX h264_deblock by not using redzone on Win64 Thanks-to: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-21 22:38:56 +01:00
Matt Wolenetz	311443f6c7	x86: h264: Don't use redzone in AVX h264_deblock on Win64 This fixes crashes in chromium on win64 on machines with AVX (crashes that apparently aren't triggered by fate). Signed-off-by: Martin Storsjö <martin@martin.st>	2013-02-21 15:02:16 +02:00
Ronald S. Bultje	e5ffffe48d	h264chroma: Remove duplicate 9/10 bit functions These functions do the same thing in 16 bit space and don't need any depth specific clipping. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-02-19 22:33:19 +02:00
Ronald S. Bultje	1acd7d594c	h264: integrate clear_blocks calls with IDCT. The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700 to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip tested (cathedral), i.e. almost 30 cycles per mb faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-19 16:25:50 +01:00
Michael Niedermayer	b9237aa7b0	x86/h263_loopfilter: Fix author attribution after code has been moved/split around Reference: commit `3615e2be84` Author: Michael Niedermayer <michaelni@gmx.at> Date: Tue Dec 2 22:02:57 2003 +0000 h263_h_loop_filter_mmx Originally committed as revision 2553 to svn://svn.ffmpeg.org/ffmpeg/trunk commit `359f98ded9` Author: Michael Niedermayer <michaelni@gmx.at> Date: Tue Dec 2 20:28:10 2003 +0000 h263_v_loop_filter_mmx Originally committed as revision 2552 to svn://svn.ffmpeg.org/ffmpeg/trunk Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-19 12:51:00 +01:00
Michael Niedermayer	fa09ad5c9e	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Fix h263 loop filter link error in some configurations Conflicts: libavcodec/x86/dsputil.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-19 12:41:27 +01:00
Michael Niedermayer	cf10616cc0	Merge commit '7a03145ed7cb4f1ce794b5126559dd6f38029243' * commit '7a03145ed7cb4f1ce794b5126559dd6f38029243': x86: dsputil: int --> ptrdiff_t for ff_put_pixels16_mmxext line_size param Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-19 12:32:12 +01:00
Daniel Kang	9acd23d655	x86: dsputil: Fix h263 loop filter link error in some configurations This was caused by unconditionally referencing a conditionally compiled table. Now the code is also compiled conditionally. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-02-18 17:09:00 +01:00
Daniel Kang	7a03145ed7	x86: dsputil: int --> ptrdiff_t for ff_put_pixels16_mmxext line_size param This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-02-18 15:23:03 +01:00
Ronald S. Bultje	71ae8d50b2	x86/dsputil: fix compilation when h263 decoder/encoder are disabled. The symbol "ff_h263_loop_filter_strength" is defined in h263.c, but the h263 loopfilter functions (in the .asm file) are not optimized out (even though their function pointers are never assigned). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-18 15:18:56 +01:00
Michael Niedermayer	7491356111	Merge commit '304b806cb524fb040f8e09a241040f1af2cb820b' * commit '304b806cb524fb040f8e09a241040f1af2cb820b': build: Make library minor version visible in the Makefile x86: mpeg4qpel: Make movsxifnidn do the right thing Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-12 13:26:22 +01:00
Ronald S. Bultje	972771dcf2	h264chroma: remove duplicate 9/10 bit functions. Also use the resulting 16bpp functions for anything >8 and <=16, not just 9 and 10. This fixes 12 and 14bpp H264 support. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-11 22:54:52 +01:00
Daniel Kang	b3f2a3fe3f	x86: mpeg4qpel: Make movsxifnidn do the right thing Fixes an instruction that does nothing by changing the source to dword. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-02-11 20:17:15 +01:00
Ronald S. Bultje	c7e3e55429	Move ff_emulated_edge_mc prototypes to videodsp. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-11 01:03:33 +01:00
Michael Niedermayer	5cfc0ae825	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputil: Move fdct function declarations to dct.h Conflicts: libavcodec/dsputil.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-09 14:45:45 +01:00
Michael Niedermayer	6b2e65078c	Merge commit '218aefce4472dc02ee3f12830a9a894bf7916da9' * commit '218aefce4472dc02ee3f12830a9a894bf7916da9': dsputil: Move LOCAL_ALIGNED macros to libavutil Conflicts: libavcodec/dvdec.c libavcodec/imc.c libavcodec/mpegvideo_motion.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-09 14:33:57 +01:00
Diego Biurrun	5d3d39c72e	dsputil: Move fdct function declarations to dct.h	2013-02-09 00:08:28 +01:00
Diego Biurrun	218aefce44	dsputil: Move LOCAL_ALIGNED macros to libavutil	2013-02-08 23:13:37 +01:00
Michael Niedermayer	48870853b2	x86/dsputil: Fix author attribution after code has been moved/split around Reference: commit `3615e2be84` Author: Michael Niedermayer <michaelni@gmx.at> Date: Tue Dec 2 22:02:57 2003 +0000 h263_h_loop_filter_mmx Originally committed as revision 2553 to svn://svn.ffmpeg.org/ffmpeg/trunk commit `359f98ded9` Author: Michael Niedermayer <michaelni@gmx.at> Date: Tue Dec 2 20:28:10 2003 +0000 h263_v_loop_filter_mmx Originally committed as revision 2552 to svn://svn.ffmpeg.org/ffmpeg/trunk Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-07 16:22:51 +01:00
Michael Niedermayer	54d8322355	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputil: x86: Fix compile error dsputil: x86: Convert h263 loop filter to yasm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-07 14:35:49 +01:00
Michael Niedermayer	60a0bc46cd	Merge commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc' * commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc': h264chroma: x86: Fix building with yasm disabled rv34: Drop now unnecessary dsputil dependencies Conflicts: libavcodec/x86/Makefile Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-07 13:35:49 +01:00
Michael Niedermayer	c4e394e460	Merge commit '79dad2a932534d1155079f937649e099f9e5cc27' * commit '79dad2a932534d1155079f937649e099f9e5cc27': dsputil: Separate h264chroma Conflicts: libavcodec/dsputil_template.c libavcodec/ppc/dsputil_ppc.c libavcodec/vc1dec.c libavcodec/vc1dsp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-07 13:09:35 +01:00
Daniel Kang	a1d3673034	dsputil: x86: Fix compile error Accidentally prefixed ff_ with cextern. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-02-07 11:06:16 +02:00
Daniel Kang	659d4ba5af	dsputil: x86: Convert h263 loop filter to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-02-06 15:38:27 -08:00
Martin Storsjö	a846dccb29	h264chroma: x86: Fix building with yasm disabled Signed-off-by: Martin Storsjö <martin@martin.st>	2013-02-06 17:05:33 +02:00
Michael Niedermayer	6c38884876	Merge commit '620289a20e022b9c16c10d546ef86cc0bb77cc84' * commit '620289a20e022b9c16c10d546ef86cc0bb77cc84': sh4: Fix silly type vs. variable name search and replace typo configure: Group all hwaccels together in a separate variable Add av_cold attributes to arch-specific init functions Conflicts: configure libavcodec/arm/mpegvideo_armv5te.c libavcodec/x86/mlpdsp.c libavcodec/x86/motion_est.c libavcodec/x86/mpegvideoenc.c libavcodec/x86/videodsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-06 13:27:24 +01:00
Michael Niedermayer	0ddca7d416	dsputil: fixup half a dozen bugs with ptrdiff vs int linesize Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-06 13:22:19 +01:00
Michael Niedermayer	ede45c4e1d	Merge commit '25841dfe806a13de526ae09c11149ab1f83555a8' * commit '25841dfe806a13de526ae09c11149ab1f83555a8': Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter. Conflicts: libavcodec/alpha/dsputil_alpha.c libavcodec/dsputil_template.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-06 12:18:25 +01:00
Diego Biurrun	82bd04b170	rv34: Drop now unnecessary dsputil dependencies	2013-02-06 11:30:54 +01:00
Diego Biurrun	79dad2a932	dsputil: Separate h264chroma	2013-02-06 11:30:53 +01:00
Diego Biurrun	c9f933b5b6	Add av_cold attributes to arch-specific init functions	2013-02-05 17:01:05 +01:00
Diego Biurrun	25841dfe80	Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter. This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic.	2013-02-05 12:59:12 +01:00
Michael Niedermayer	4d37d2bfc5	put_vp_no_rnd_pixels8_l2_mmx: fix type Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-03 17:21:06 +01:00
Michael Niedermayer	cb573f7fbc	avcodec/x86: Add Daniel's copyright to the recent gcc->yasm conversions he did. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-02-03 13:50:44 +01:00
Michael Niedermayer	dd87d4a318	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp configure: Add a comment indicating why uclibc is checked before glibc Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-31 20:03:36 +01:00
Diego Biurrun	52acd79165	x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp	2013-01-31 11:19:23 +01:00
Michael Niedermayer	71f8d70456	dirac/x86: fix compile without yasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-30 06:47:09 +01:00
Michael Niedermayer	4d3d362549	dirac/x86: fix compile without inline asm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-30 02:51:59 +01:00
Michael Niedermayer	14aa358c20	Merge commit '098eed95bc1a6b2c8ac97f126f62bb74699670cf' * commit '098eed95bc1a6b2c8ac97f126f62bb74699670cf': mdec: merge mdec_common_init() into decode_init(). eatgv: use fixed-width types where appropriate. x86: Simplify some arch conditionals bfin: Separate VP3 initialization code Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-29 13:48:38 +01:00
Diego Biurrun	c59211b437	x86: Simplify some arch conditionals	2013-01-29 00:10:53 +01:00
Michael Niedermayer	94ef1667bb	dirac/x86: Fix handling blocksizes that are not a multiple of 4 Fixes out of array accesses Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-28 20:55:11 +01:00
Michael Niedermayer	5c9cae7447	dirac: Only use MMX if MMX is available. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-28 20:00:55 +01:00
Michael Niedermayer	bb2f4ae434	Merge commit '05b0998f511ffa699407465d48c7d5805f746ad2' * commit '05b0998f511ffa699407465d48c7d5805f746ad2': dsputil: Fix error by not using redzone and register name swscale: GBRP output support Conflicts: libswscale/output.c libswscale/swscale.c libswscale/swscale_internal.h libswscale/utils.c tests/ref/lavfi/pixdesc tests/ref/lavfi/pixfmts_copy tests/ref/lavfi/pixfmts_null tests/ref/lavfi/pixfmts_scale tests/ref/lavfi/pixfmts_vflip Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-28 14:11:31 +01:00
Michael Niedermayer	834e9fb056	x86: hpeldsp: Fix a typo, use the right register This makes the code actually work. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-01-28 12:49:37 +02:00
Daniel Kang	05b0998f51	dsputil: Fix error by not using redzone and register name Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-28 07:23:20 +01:00
Michael Niedermayer	edde562130	AVG_PIXELS8_XY2: fix typo, make code actually work Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 15:50:26 +01:00
Daniel Kang	5327a45552	dsputil: x86: Correct the number of registers used in put_no_rnd_pixels16_l2 put_no_rnd_pixels16_l2 allocated 5 instead of 6 registers. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 15:20:44 +01:00
Daniel Kang	d9e62f368d	dsputil: add missing HAVE_YASM guard Fix compile error under "--disable-optimizations --disable-yasm --disable-inline-asm" Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 15:20:35 +01:00
Michael Niedermayer	5934be16cc	x86/mpeg4qpel: Fix author attribution Also fix project name See git blame/log/show and commit `826f429ae9` Author: Michael Niedermayer <michaelni@gmx.at> Date: Sun Jan 5 15:57:10 2003 +0000 qpel in mmx2/3dnow qpel refinement quality parameter Originally committed as revision 1393 to svn://svn.ffmpeg.org/ffmpeg/trunk Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 15:07:02 +01:00
Michael Niedermayer	aa3f449955	x86/hpeldsp: Fix author attribution This also fixes the project name Original authors fabrice and nick go back to the initial ffmpeg commit Others for example contributed in: (for a complete list please use git blame / show / log) commit `e9c0a38ff0` Author: Zdenek Kabelac <kabi@informatics.muni.cz> Date: Tue May 28 16:35:58 2002 +0000 * optimized avg_* functions (except xy2) * minor speedup for put_pixels_x2 & cleanup Originally committed as revision 619 to svn://svn.ffmpeg.org/ffmpeg/trunk commit `607dce96c0` Author: Michael Niedermayer <michaelni@gmx.at> Date: Fri May 17 01:04:14 2002 +0000 hopefully faster mmx2&3dnow MC Originally committed as revision 506 to svn://svn.ffmpeg.org/ffmpeg/trunk Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 14:47:58 +01:00
Michael Niedermayer	91c8921d80	Merge commit '71155d7b4157fee44c0d3d0fc1b660ebfb9ccf46' * commit '71155d7b4157fee44c0d3d0fc1b660ebfb9ccf46': dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 14:24:28 +01:00
Michael Niedermayer	6b2f7fd1c7	Merge commit 'f90ff772e7e35b4923c2de429d1fab9f2569b568' * commit 'f90ff772e7e35b4923c2de429d1fab9f2569b568': Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm. doc: update the reference for the title Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-27 13:32:56 +01:00
Daniel Kang	96753bd00d	dsputil: x86: Correct the number of registers used in put_no_rnd_pixels16_l2 put_no_rnd_pixels16_l2 allocated 5 instead of 6 registers. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 08:41:48 +01:00
Daniel Kang	0eedf5d74d	dsputil: add missing HAVE_YASM guard Fix compile error under "--disable-optimizations --disable-yasm --disable-inline-asm" Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 08:41:46 +01:00
Daniel Kang	71155d7b41	dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-27 06:45:31 +01:00
Ronald S. Bultje	f90ff772e7	Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm.	2013-01-26 20:35:42 -08:00
Michael Niedermayer	446d62f0cf	Merge commit '69c25c9284645cf5189af2ede42d6f53828f3b45' * commit '69c25c9284645cf5189af2ede42d6f53828f3b45': dnxhdenc: fix invalid reads in dnxhd_mb_var_thread(). x86: h264qpel: Move stray comment to the right spot and clarify it atrac3: use correct loop variable in add_tonal_components() Conflicts: tests/ref/vsynth/vsynth1-dnxhd-1080i tests/ref/vsynth/vsynth2-dnxhd-1080i Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-26 14:33:06 +01:00
Diego Biurrun	033a86f9bb	x86: h264qpel: Move stray comment to the right spot and clarify it	2013-01-26 11:19:22 +01:00
Michael Niedermayer	e9125dd556	Merge commit '2c10e2a2f62477efaef5b641974594f7df4ca339' * commit '2c10e2a2f62477efaef5b641974594f7df4ca339': build: Make the H.264 parser select h264qpel x86: h264qpel: add cpu flag checks for init function Conflicts: libavcodec/x86/h264_qpel.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-25 14:14:21 +01:00
Carl Eugen Hoyos	a0d1440476	Fix compilation with --disable-everything on x86_32. Fixes ticket #2183.	2013-01-25 03:04:46 +01:00
Michael Niedermayer	fc8e8e5bef	h264_qpel: put cpuflags checks back. These were lost when libav moved the code out of dsputil Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-24 22:14:39 +01:00
Janne Grunau	c5c2060cf5	x86: h264qpel: add cpu flag checks for init function The code was copied from the per-CPU-extension init functions, so the checks for supported extensions were overlooked.	2013-01-24 19:03:59 +01:00
Michael Niedermayer	fc13a89654	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputil: Separate h264 qpel Conflicts: libavcodec/dsputil_template.c libavcodec/h264.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-24 15:47:47 +01:00
Mans Rullgard	e9d817351b	dsputil: Separate h264 qpel The sh4 optimizations are removed, because the code is 100% identical to the C code, so it is unlikely to provide any real practical benefit. Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-24 10:44:43 +01:00
Michael Niedermayer	1e7a92f219	Merge commit 'baf35bb4bc4fe7a2a4113c50989d11dd9ef81e76' * commit 'baf35bb4bc4fe7a2a4113c50989d11dd9ef81e76': dsputil: remove one array dimension from avg_no_rnd_pixels_tab. Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 18:15:29 +01:00
Michael Niedermayer	6d1e9d993a	Merge commit '32ff6432284f713e9f837ee5b36fc8e9f1902836' * commit '32ff6432284f713e9f837ee5b36fc8e9f1902836': dsputil: remove avg_no_rnd_pixels8. Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 17:52:21 +01:00
Michael Niedermayer	ac8987591f	Merge commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f' * commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f': Drop DCTELEM typedef Conflicts: libavcodec/alpha/dsputil_alpha.h libavcodec/alpha/motion_est_alpha.c libavcodec/arm/dsputil_init_armv6.c libavcodec/bfin/dsputil_bfin.h libavcodec/bfin/pixels_bfin.S libavcodec/cavs.c libavcodec/cavsdec.c libavcodec/dct-test.c libavcodec/dnxhdenc.c libavcodec/dsputil.c libavcodec/dsputil.h libavcodec/dsputil_template.c libavcodec/eamad.c libavcodec/h264_cavlc.c libavcodec/h264idct_template.c libavcodec/mpeg12.c libavcodec/mpegvideo.c libavcodec/mpegvideo.h libavcodec/mpegvideo_enc.c libavcodec/ppc/dsputil_altivec.c libavcodec/proresdsp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 17:44:56 +01:00
Michael Niedermayer	b90ab2b993	Merge commit '2e4bb99f4df7052b3e147ee898fcb4013a34d904' * commit '2e4bb99f4df7052b3e147ee898fcb4013a34d904': vorbisdsp: convert x86 simd functions from inline asm to yasm. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 17:00:26 +01:00
Michael Niedermayer	8102f27b5b	Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9' * commit '73b704ac609d83e0be124589f24efd9b94947cf9': arm: Add some missing header #includes floatdsp: move scalarproduct_float from dsputil to avfloatdsp. Conflicts: libavcodec/acelp_pitch_delay.c libavcodec/amrnbdec.c libavcodec/amrwbdec.c libavcodec/ra288.c libavcodec/x86/dsputil_mmx.c libavutil/x86/float_dsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 14:31:55 +01:00
Michael Niedermayer	6e6e170898	Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4' * commit '42d324694883cdf1fff1612ac70fa403692a1ad4': floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. Conflicts: libavcodec/arm/dsputil_init_vfp.c libavcodec/arm/dsputil_vfp.S libavcodec/dsputil.c libavcodec/ppc/float_altivec.c libavcodec/x86/dsputil.asm libavutil/x86/float_dsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 14:04:50 +01:00
Michael Niedermayer	b1b870fbd7	Merge commit '55aa03b9f8f11ebb7535424cc0e5635558590f49' * commit '55aa03b9f8f11ebb7535424cc0e5635558590f49': floatdsp: move vector_fmul_add from dsputil to avfloatdsp. Conflicts: libavcodec/dsputil.c libavcodec/x86/dsputil.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-23 13:54:34 +01:00
Ronald S. Bultje	baf35bb4bc	dsputil: remove one array dimension from avg_no_rnd_pixels_tab.	2013-01-22 18:41:36 -08:00
Ronald S. Bultje	32ff643228	dsputil: remove avg_no_rnd_pixels8. This is never used.	2013-01-22 18:41:36 -08:00
Diego Biurrun	88bd7fdc82	Drop DCTELEM typedef It does not help as an abstraction and adds dsputil dependencies. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2013-01-22 18:32:56 -08:00
Ronald S. Bultje	2e4bb99f4d	vorbisdsp: convert x86 simd functions from inline asm to yasm.	2013-01-22 18:02:24 -08:00
Ronald S. Bultje	42d3246948	floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. Now, nellymoserenc and aacenc no longer depends on dsputil. Independent of this patch, wmaprodec also does not depend on dsputil, so I removed it from there also.	2013-01-22 11:55:42 -08:00
Ronald S. Bultje	55aa03b9f8	floatdsp: move vector_fmul_add from dsputil to avfloatdsp.	2013-01-22 11:55:42 -08:00
Ronald S. Bultje	d56668bd80	floatdsp: move scalarproduct_float from dsputil to avfloatdsp. This makes the aac decoder and all voice codecs independent of dsputil.	2013-01-22 11:55:42 -08:00
Michael Niedermayer	26345acb0e	Merge remote-tracking branch 'qatar/master' * qatar/master: proresdec: support mixed interlaced/non-interlaced content vp3/5: move put_no_rnd_pixels_l2 from dsputil to VP3DSPContext. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-22 15:33:23 +01:00
Michael Niedermayer	b3ab281027	avcodec/x86/cabac: workaround llvm 4.2.1 bug x86_64 is affected by this too Fixes Ticket2156 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-22 03:43:29 +01:00
Diego Biurrun	4f56e773fe	x86: ac3: Fix HAVE_MMXEXT condition to only refer to external assembly CC: libav-stable@libav.org	2013-01-21 23:54:32 +01:00
Michael Niedermayer	f6a80d6e63	Merge remote-tracking branch 'qatar/master' * qatar/master: dsputilenc: x86: Convert pixel inline asm to yasm libgsm: detect libgsm header path Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-21 16:34:37 +01:00
Michael Niedermayer	cf4515ecd9	Merge commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3' * commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3': fate: Use wmv2 IDCT for wmv2 tests vorbisdsp: change block_size type from int to intptr_t. Conflicts: tests/fate-run.sh tests/fate/vcodec.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-21 16:28:04 +01:00
Michael Niedermayer	2cf9ab6555	Merge commit '8a4f26206d7914eaf2903954ce97cb7686933382' * commit '8a4f26206d7914eaf2903954ce97cb7686933382': dsputil: remove butterflies_float_interleave. srtp: Move a variable to a local scope srtp: Add tests for the crypto suite with 32/80 bit HMAC Conflicts: libavcodec/x86/dsputil.asm libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-21 15:18:57 +01:00
Michael Niedermayer	a85311ef84	Merge commit '68f18f03519ae550e25cf12661172641e9f0eaca' * commit '68f18f03519ae550e25cf12661172641e9f0eaca': videodsp_armv5te: remove #if HAVE_ARMV5TE_EXTERNAL dsputil: drop non-compliant "fast" qpel mc functions get_bits: change the failure condition in init_get_bits Conflicts: libavcodec/get_bits.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-21 13:38:57 +01:00
Daniel Kang	9f00b1cbab	dsputilenc: x86: Convert pixel inline asm to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-21 09:54:10 +01:00
Ronald S. Bultje	1768e43ceb	vorbisdsp: change block_size type from int to intptr_t. This saves one instruction in the x86-64 assembly.	2013-01-20 22:26:42 -08:00
Ronald S. Bultje	8a4f26206d	dsputil: remove butterflies_float_interleave. The function is unused.	2013-01-20 21:57:35 -08:00
Mans Rullgard	0b711ca3f3	dsputil: drop non-compliant "fast" qpel mc functions Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-01-20 14:50:42 +01:00
Michael Niedermayer	28245fb466	Merge remote-tracking branch 'qatar/master' * qatar/master: Remove put_no_rnd_pixels_l2 function pointer for w=16 from dsputil. Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-20 14:25:27 +01:00
Michael Niedermayer	c62cb1112f	Merge commit 'fef906c77c09940a2fdad155b2adc05080e17eda' * commit 'fef906c77c09940a2fdad155b2adc05080e17eda': Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. Conflicts: libavcodec/dsputil.c libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-20 14:13:16 +01:00
Michael Niedermayer	cf061a9c3b	Merge commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a' * commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a': vp3: integrate clear_blocks with idct of previous block. mpegvideo: fix loop condition in draw_line() dvdsubdec: parse the size from the extradata Conflicts: libavcodec/dvdsubdec.c libavcodec/mpegvideo.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-20 13:57:10 +01:00
Ronald S. Bultje	fef906c77c	Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. Conveniently (together with Justin's earlier patches), this makes our vorbis decoder entirely independent of dsputil.	2013-01-19 22:21:10 -08:00
Ronald S. Bultje	aeaf268e52	vp3: integrate clear_blocks with idct of previous block. This is identical to what e.g. vp8 does, and prevents the function call overhead (plus dependency on dsputil for this particular function). Arm asm updated by Janne Grunau <janne-libav@jannau.net>. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2013-01-19 22:04:55 -08:00
Michael Niedermayer	ed8ff70d9e	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: Drop some unused macro definitions x86: Add a Yasm-based emms() replacement Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-19 13:20:25 +01:00
Diego Biurrun	822b0728f0	x86: dsputil: Drop some unused macro definitions	2013-01-18 22:24:58 +01:00
Michael Niedermayer	5c7e9e16c9	Merge remote-tracking branch 'qatar/master' * qatar/master: lavc: Move vector_fmul_window to AVFloatDSPContext rtpdec_mpeg4: Check the remaining amount of data before reading Conflicts: libavcodec/dsputil.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-16 12:38:41 +01:00
Michael Niedermayer	68f92a70f1	Merge commit 'dae1d507af94261bafd3b11549884e5d1eca590e' * commit 'dae1d507af94261bafd3b11549884e5d1eca590e': x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags vf_fps: add final flushed frames to the dropped frame count rv34_parser: Adjust #if for disabling individual parsers Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-16 11:44:45 +01:00
Justin Ruggles	e034cc6c60	lavc: Move vector_fmul_window to AVFloatDSPContext Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-16 10:45:45 +01:00
Diego Biurrun	dae1d507af	x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags	2013-01-15 17:29:43 +01:00
Michael Niedermayer	cfc40a6aff	Merge commit 'd8c772de53d29afb1bada88afa859fce8489c668' * commit 'd8c772de53d29afb1bada88afa859fce8489c668': nutdec: Always return a value from nut_read_timestamp() configure: Make warnings from -Wreturn-type fatal errors x86: ABS2: port to cpuflags vdpau: Remove av_unused attribute from function declaration h264: fix ff_generate_sliding_window_mmcos() prototype. Conflicts: configure libavformat/nutdec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-15 15:23:20 +01:00
Michael Niedermayer	274e48d8ac	x86/Makefile: move dirac_dwt to right type Fix build failure without yasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-14 23:40:26 +01:00
Michael Niedermayer	9cb3c1a4d9	x86/dirac: fix asm on win64 This could also be fixed by changing the argument type if someone prefers that and wants to change it ... Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-14 23:28:01 +01:00
Michael Niedermayer	30981a966f	lavc: split snow and dirac DWTs There are only about 4 lines of common code, so it is a lot cleaner when separated. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-14 22:59:05 +01:00
Diego Biurrun	51969a652c	x86: ABS2: port to cpuflags	2013-01-14 21:56:55 +01:00
Carl Eugen Hoyos	f023003ce6	Fix compilation with --disable-everything.	2013-01-10 10:04:46 +01:00
Michael Niedermayer	c526a01c91	Merge commit '4f50646697606df39317b93c2a427603b77636ee' * commit '4f50646697606df39317b93c2a427603b77636ee': x86: sbrdsp: Implement SSE qmf_post_shuffle Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-07 01:51:10 +01:00
Michael Niedermayer	8429320313	Merge commit '44a0036d10579ed91e48df24859e54b08a582742' * commit '44a0036d10579ed91e48df24859e54b08a582742': x86: sbrdsp: Implement SSE sum64x5 Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-07 01:40:05 +01:00
Michael Niedermayer	ea93ccf079	Merge commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0' * commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0': x86: ABS1: port to cpuflags v210x: cosmetics, reformat Merged-by: Michael Niedermayer <michaelni@gmx.at>	2013-01-07 01:35:18 +01:00
Diego Biurrun	a0c5917f86	Drop Snow codec Snow is a toy codec with no real-world use and horrible code.	2013-01-06 16:30:02 +01:00
Christophe Gisquet	4f50646697	x86: sbrdsp: Implement SSE qmf_post_shuffle 255 to 174 cycles on Arrandale / Win64. Unrolling yields no gain. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-01-06 13:57:01 +01:00
Christophe Gisquet	44a0036d10	x86: sbrdsp: Implement SSE sum64x5 698 to 174 cycles on Arrandale. Unrolling is a 6 cycles gain. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2013-01-06 13:57:01 +01:00
Diego Biurrun	5b4dfbffc2	x86: ABS1: port to cpuflags	2013-01-06 13:57:01 +01:00
Michael Niedermayer	9cb887ed37	dsputil_mmx: fix pointer type for emulated_edge_mc_func() Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-25 02:04:31 +01:00
Michael Niedermayer	3e15775333	x86/ac3dsp_init: try to workaround ICC failure. The asm code is not valid for older compilers as it uses too many operands, ICC on x86_32 seems affected by this. This patch disables the affected code for ICC on x86_32 and should make it compileable again. A better fix would be to use fewer operands or to change this code to yasm, later is being worked on AFAIK so this is a temporary solution. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-23 19:27:19 +01:00
Michael Niedermayer	e16bac7b33	videodsp: Fix project name These are all part of splited out dsp utils from FFmpeg Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-22 00:58:08 +01:00
Michael Niedermayer	90eaa989f1	x86/videodsp_init: Add back lost author attribution Code originates from: `910b9f30` libavcodec/dsputil.c (David Conrad 2010-05-27 04:39:27 +0000 334) void ff_emulated_edge_mc(uint8_t *buf, const uint8_t *src, int linesize, int block_w, int block_h, `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 335) int src_x, int src_y, int w, int h){ `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 336) int x, y; `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 337) int start_y, start_x, end_y, end_x; `b5a093b3` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-25 20:22:36 +0000 338) `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 339) if(src_y>= h){ `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 340) src+= (h-1-src_y)*linesize; `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 341) src_y=h-1; `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 342) }else if(src_y<=-block_h){ `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 343) src+= (1-block_h-src_y)*linesize; `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 344) src_y=1-block_h; `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 345) } `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 346) if(src_x>= w){ `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 347) src+= (w-1-src_x); `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 348) src_x=w-1; `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 349) }else if(src_x<=-block_w){ `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 350) src+= (1-block_w-src_x); `225f9c44` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 351) src_x=1-block_w; `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 352) } `93a21abd` libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 353) `b8a78f41` libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 354) start_y= FFMAX(0, -src_y); `b8a78f41` libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 355) start_x= FFMAX(0, -src_x); `b8a78f41` libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 356) end_y= FFMIN(block_h, h-src_y); `b8a78f41` libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 357) end_x= FFMIN(block_w, w-src_x); Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-22 00:58:08 +01:00
Michael Niedermayer	a41bf09d9c	Merge commit '6906b19346ae8a330bfaa1c16ce535be10789723' * commit '6906b19346ae8a330bfaa1c16ce535be10789723': lavc: add missing files for arm lavc: introduce VideoDSPContext Conflicts: configure libavcodec/arm/dsputil_init_armv5te.c libavcodec/dsputil.c libavcodec/dsputil.h libavcodec/dsputil_template.c libavcodec/h264.c libavcodec/mpegvideo.h libavcodec/mpegvideo_enc.c libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-21 17:18:43 +01:00
Ronald S. Bultje	8c53d39e7f	lavc: introduce VideoDSPContext Move some functions from dsputil. The idea is that videodsp contains functions that are useful for a large and varied set of video decoders. Currently, it contains emulated_edge_mc() and prefetch(). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2012-12-20 13:40:45 +01:00
Ronald S. Bultje	ce58642ed0	x86inc: support stack mem allocation and re-alignment in PROLOGUE. Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-12 10:37:52 +01:00
Ronald S. Bultje	6f40e9f070	x86inc: support stack mem allocation and re-alignment in PROLOGUE Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2012-12-12 05:23:46 +01:00
Clément Bœsch	7eafd274d8	build: fix prores decoder dependencies. According to lavc/proresdsp.c, both prores and prores-lgpl decoders need lavc/x86/proresdsp_init.c:ff_proresdsp_x86_init().	2012-12-11 02:54:55 +01:00
Michael Niedermayer	e7101a7f3f	libavcodec/x86/mpegvideo: switch to av_assert2 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-10 23:30:16 +01:00
Michael Niedermayer	ddbf0702c5	dsputil_mmx: switch to av_assert2() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-10 14:41:31 +01:00
Michael Niedermayer	a933698457	Merge commit '30b39164256999efc8d77edc85e2e0b963c24834' * commit '30b39164256999efc8d77edc85e2e0b963c24834': ac3dec: make downmix() take array of pointers to channel data Conflicts: libavcodec/ac3dsp.c libavcodec/ac3dsp.h Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-10 02:06:50 +01:00
Mans Rullgard	30b3916425	ac3dec: make downmix() take array of pointers to channel data	2012-12-09 15:52:01 +00:00
Michael Niedermayer	0110108a7c	sbr_hf_gen_sse: Optimize code a bit more. Core I7 (Sandy Bridge) 135 to 107 cycles Core i5 (Arrandale) 162 to 142 (Thanks to Christophe Gisquet for testing) Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-08 17:30:11 +01:00
Michael Niedermayer	af164d7d9f	Merge commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152' * commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152': fate: dpcm: Add dependencies SBR DSP x86: implement SSE sbr_hf_gen AAC SBR: use AVFloatDSPContext's vector_fmul fate: image: Add dependencies Changelog: add an entry for deprecating the avconv -vol option x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32 Conflicts: Changelog libavutil/x86/float_dsp.asm tests/fate/image.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-07 15:21:41 +01:00
Christophe Gisquet	2aef3d66c9	SBR DSP x86: implement SSE sbr_hf_gen Start and end index are multiple of 2, therefore guaranteeing aligned access. Also, this allows to generate 4 floats per loop, keeping the alignment all along. Timing: - 32 bits: 326c -> 172c - 64 bits: 323c -> 156c Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-12-07 11:04:26 +01:00
Michael Niedermayer	b023392f34	mpegvideo: remove #if/define PARANOID code This code never did anything as far as i can remember Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-05 19:31:27 +01:00
Michael Niedermayer	599ae9995f	ff_emulated_edge_mc: fix handling of w/h being 0 Fixes assertion failure Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-12-05 03:45:10 +01:00
Michael Niedermayer	076300bf8b	Merge commit 'bfe5454cd238b16e7977085f880205229103eccb' * commit 'bfe5454cd238b16e7977085f880205229103eccb': lavf: move ff_codec_get_tag() and ff_codec_get_id() definitions to internal.h lavf: move "MP3 " fourcc from riff to nut fate: vpx: Add dependencies fate: Fix wavpack-matroskamode test dependencies x86: dsputilenc: port to cpuflags Conflicts: libavformat/internal.h libavformat/nut.c tests/fate/vpx.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-29 13:45:57 +01:00
Michael Niedermayer	7dc0ed80e8	Merge commit '1f3f896564501c23b44fcf605567c78ce066b539' * commit '1f3f896564501c23b44fcf605567c78ce066b539': fate: Add dependencies for Vorbis, ProRes, QTRLE, utvideo tests fate: real: Add dependencies fate: lossless-audio: Add dependencies x86: h264dsp: Fix linking with yasm and optimizations disabled Conflicts: libavcodec/x86/h264dsp_init.c tests/fate/lossless-audio.mak tests/fate/real.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-29 13:35:56 +01:00
Diego Biurrun	9b15c0a9b3	x86: dsputilenc: port to cpuflags	2012-11-28 16:05:44 +01:00
Diego Biurrun	89145fbbfe	x86: h264dsp: Fix linking with yasm and optimizations disabled Some optimized functions reference optimized symbols, so the functions must be explicitly disabled when those symbols are unavailable.	2012-11-28 14:45:28 +01:00
Michael Niedermayer	42d3fea65f	Merge commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1' * commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1': asink_nullsink: plug a memory leak. x86: h264_idct: port to cpuflags x86: cpu: Drop unused HAVE_RWEFLAGS condition Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-28 13:32:17 +01:00
Michael Niedermayer	264441715b	Merge commit 'f5fa03660db16f9d78abc5a626438b4d0b54f563' * commit 'f5fa03660db16f9d78abc5a626438b4d0b54f563': vble: Do not abort decoding when version is not 1 lavr: do not pass consumed samples as a parameter to ff_audio_resample() lavr: correct the documentation for the ff_audio_resample() return value lavr: do not pass sample count as a parameter to ff_audio_convert() x86: h264_weight: port to cpuflags configure: Enable avconv filter dependencies automatically Conflicts: configure libavcodec/x86/h264_weight.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-28 13:27:18 +01:00
Diego Biurrun	2e89aeed65	x86: h264_idct: port to cpuflags	2012-11-28 00:28:09 +01:00
Diego Biurrun	28e1cf19aa	x86: h264_weight: port to cpuflags	2012-11-27 21:10:38 +01:00
Michael Niedermayer	a3f30f2e99	Merge commit '5ae72f54532960cb9eae82a1c9e8d505106c022b' * commit '5ae72f54532960cb9eae82a1c9e8d505106c022b': flashsv: check for keyframe before using differential coding h264: enable low delay only if no delayed frames were seen x86: fix build without inline asm Conflicts: libavcodec/h264.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 16:11:02 +01:00
Michael Niedermayer	a13148f633	Merge commit '8e134e5104e99a69cd4cea10540a7ce9c3682a2c' * commit '8e134e5104e99a69cd4cea10540a7ce9c3682a2c': lavc: clarify get_buffer() documentation mpegaudiodec: use planar sample format for output unless packed is requested x86: h264 qpel: use the correct number of utilized xmm regs in cglobal Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 14:24:19 +01:00
Michael Niedermayer	86270236d5	dsputil_mmx: ff_put_dirac_pixels depend now on yasm. Fix compile failure without yasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 13:59:41 +01:00
Michael Niedermayer	7b29b07394	Merge remote-tracking branch 'qatar/master' * qatar/master: remove #defines to prevent use of discouraged external functions x86: h264: Convert 8-bit QPEL inline assembly to YASM Conflicts: libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 02:17:02 +01:00
Diego Biurrun	7ee4071362	x86: fix build without inline asm The qpel functions referenced here are not related to h264 and should thus never have been under CONFIG_H264QPEL. Signed-off-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-11-26 01:50:47 +01:00
Michael Niedermayer	66c3bac2b9	Merge commit 'ad01ba6ceaea7d71c4b9887795523438689b5a96' * commit 'ad01ba6ceaea7d71c4b9887795523438689b5a96': x86: h264: Remove 3dnow QPEL code Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 00:57:33 +01:00
Justin Ruggles	2d3993ce8c	x86: h264 qpel: use the correct number of utilized xmm regs in cglobal Fixes xmm register clobbering on win64.	2012-11-25 18:48:43 -05:00
Michael Niedermayer	bf2f93cdbf	Merge commit '28c8e288fa0342fdef532a7522a4707bebf831cc' * commit '28c8e288fa0342fdef532a7522a4707bebf831cc': x86: h264_chromamc: port to cpuflags yop: fix typo avconv: fix copying per-stream metadata. doc: avtools-common-opts: Fix terminology concerning metric prefixes configure: suncc: Add compiler arch support for Nehalem & Sandy Bridge riff: Make ff_riff_tags static and move under appropriate #ifdef Conflicts: libavformat/riff.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-26 00:43:45 +01:00
Daniel Kang	610e00b359	x86: h264: Convert 8-bit QPEL inline assembly to YASM Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-11-25 20:38:35 +01:00
Daniel Kang	ad01ba6cea	x86: h264: Remove 3dnow QPEL code The only CPUs that have 3dnow and don't have mmxext are 12 years old. Moreover, AMD has dropped 3dnow extensions from newer CPUs. Signed-off-by: Diego Biurrun <diego@biurrun.de>	2012-11-25 20:32:55 +01:00
Diego Biurrun	28c8e288fa	x86: h264_chromamc: port to cpuflags	2012-11-25 17:25:10 +01:00
Michael Niedermayer	533a8b2a7d	x86/mpegvideoenc_template: use av_assert Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-23 17:57:22 +01:00
Michael Niedermayer	e6d81ce22e	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: h264_intrapred: Fix C function names in comments x86: SPLATD: port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-19 14:24:20 +01:00
Diego Biurrun	89923fce70	x86: h264_intrapred: Fix C function names in comments Function names changed after switching to declaration with PRED4x4/8x8/8x8L/16x16 macros in the C code.	2012-11-18 18:34:05 +01:00
Diego Biurrun	87af05c575	x86: SPLATD: port to cpuflags	2012-11-18 18:34:05 +01:00
Michael Niedermayer	2207ea44fb	ff_emulated_edge_mc: fix integer anomalies, fix out of array reads Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-16 21:33:52 +01:00
Michael Niedermayer	ff3b59c848	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: dsputil: port to cpuflags crc: av_crc() parameter names should match between .c, .h and doxygen avserver: replace av_read_packet with av_read_frame avserver: fix constness casting warnings Conflicts: libavcodec/x86/dsputil.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-16 13:23:35 +01:00
Diego Biurrun	8c3849bc76	x86: dsputil: port to cpuflags	2012-11-16 10:38:23 +01:00
Michael Niedermayer	a1b5c9634e	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: mmx2 ---> mmxext in asm constructs Conflicts: libavcodec/x86/h264_chromamc_10bit.asm libavcodec/x86/h264_deblock.asm libavcodec/x86/h264dsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-14 12:34:30 +01:00
Michael Niedermayer	e13d5e9a4b	Merge commit '5e9c6ef8f3beb9ed7b271654a82349ac90fe43f2' * commit '5e9c6ef8f3beb9ed7b271654a82349ac90fe43f2': x86: h264_weight_10bit: port to cpuflags libtheoraenc: add missing pixdesc.h header avcodec: remove ff_is_hwaccel_pix_fmt pixdesc: add av_pix_fmt_get_chroma_sub_sample hlsenc: stand alone hls segmenter Conflicts: doc/muxers.texi libavcodec/ffv1enc.c libavcodec/imgconvert.c libavcodec/mpegvideo_enc.c libavcodec/tiffenc.c libavformat/Makefile libavformat/allformats.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-14 11:59:20 +01:00
Diego Biurrun	26301caaa1	x86: mmx2 ---> mmxext in asm constructs	2012-11-14 00:58:51 +01:00
Diego Biurrun	5e9c6ef8f3	x86: h264_weight_10bit: port to cpuflags	2012-11-13 19:07:09 +01:00
Diego Biurrun	2b479bcab0	build: Drop AVX assembly ifdefs An assembler able to cope with AVX instructions is now required.	2012-11-11 20:43:28 +01:00
Michael Niedermayer	def8588fb5	dwt_yasm/vertical_compose: fix width witdth argument. Fixes out of array accesses Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-11 12:41:35 +01:00
Michael Niedermayer	bec37935ec	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: h264_qpel_10bit: drop unused parameter from MC10/MC20/MC30 macros Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-11 12:18:05 +01:00
Diego Biurrun	6cd796049d	x86: h264_qpel_10bit: drop unused parameter from MC10/MC20/MC30 macros	2012-11-10 14:49:09 +01:00
Michael Niedermayer	2ce64413e2	Merge remote-tracking branch 'qatar/master' * qatar/master: x86: PALIGNR: port to cpuflags x86: h264_qpel_10bit: port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-10 12:44:39 +01:00
Diego Biurrun	4b60fac419	x86: PALIGNR: port to cpuflags	2012-11-09 21:31:31 +01:00
Diego Biurrun	4d1f69f244	x86: h264_qpel_10bit: port to cpuflags	2012-11-09 21:17:05 +01:00
Michael Niedermayer	1b5a6d3c49	Merge remote-tracking branch 'qatar/master' * qatar/master: flacenc: ensure the order is within the min/max range in LPC order search avconv: rescale packet duration to muxer time base when flushing encoders add 24-bit FLAC encoding to Changelog rtpenc_aac: Fix calculation of the header size x86: h264_intrapred: port to cpuflags Conflicts: Changelog libavformat/rtpenc_aac.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-09 10:29:51 +01:00
Diego Biurrun	6ca60d4ddd	x86: h264_intrapred: port to cpuflags	2012-11-08 18:05:23 +01:00
Michael Niedermayer	e859339e7a	Merge commit '930e26a3ea9d223e04bac4cdde13697cec770031' * commit '930e26a3ea9d223e04bac4cdde13697cec770031': x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled x86: PABSW: port to cpuflags x86: vc1dsp: port to cpuflags rtmp: Use av_strlcat instead of strncat Conflicts: libavcodec/x86/h264_qpel.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-05 22:36:05 +01:00
Diego Biurrun	930e26a3ea	x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled This fixes compilation with --disable-everything and components enabled.	2012-11-05 20:48:43 +01:00
Diego Biurrun	dbb37e7711	x86: PABSW: port to cpuflags	2012-11-05 14:51:10 +01:00
Diego Biurrun	6c104826bd	x86: vc1dsp: port to cpuflags	2012-11-05 14:51:10 +01:00
Michael Niedermayer	37e81996dc	Merge commit '9221efef7968463f3e3d9ce79ea72eaca082e73f' * commit '9221efef7968463f3e3d9ce79ea72eaca082e73f': lavf: fix av_interleaved_write_frame() doxy. lavf: clarify the lifetime of demuxed packets. avconv: do not free muxed packet on streamcopy. crc: move doxy to the header vf_drawtext: do not use deprecated av_tree_node_size x86: Refactor PSWAPD fallback implementations and port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>	2012-11-03 14:24:11 +01:00

1801 Commits