ffmpeg

Author	SHA1	Message	Date
Michael Niedermayer	40a4ab8ba4	rename sub_hfyu_median_prediction_int16 to sub_hfyu_median_pred_int16 This makes the naming more consistent with the 8bit variant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-29 19:48:23 +02:00
James Almer	05de4d3011	x86/dsputilenc: implement XOP version of pix_sum16 SSE2: 137 cycles XOP: 87 cycles Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-29 18:40:23 +02:00
Diego Biurrun	368f50359e	dsputil: Split off quarterpel bits into their own context	2014-05-29 06:48:31 -07:00
Diego Biurrun	054013a0fc	dsputil: Move APE-specific bits into apedsp	2014-05-29 06:41:15 -07:00
Diego Biurrun	65d5d58658	dsputil: Move SVQ1 encoding specific bits into svq1enc	2014-05-29 06:41:15 -07:00
Michael Niedermayer	b50559fc0b	libavcodec/x86/dsputilenc: drop and 0xffff that should have become redundant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-29 00:16:52 +02:00
James Almer	561bfc85eb	x86/dsputilenc: implement SSE2 versions of pix_{sum16, norm1} Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-28 23:29:34 +02:00
Christophe Gisquet	0810608e23	x86: hevc_mc: better register allocation The xmm reg count was incorrect, and manual loading of the gprs furthermore allows noticeably reducing the number needed. The modified functions are used in weighted prediction, so only a few samples like WP_* exhibit a change. For this one and Win64 (some widths removed because of too few occurrences): WP_A_Toshiba_3.bit, ff_hevc_put_hevc_uni_w 16 32 before: 2194 3872 after: 2119 3767 WP_B_Toshiba_3.bit, ff_hevc_put_hevc_bi_w 16 32 64 before: 2819 4960 9396 after: 2617 4788 9150 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-28 17:39:34 +02:00
Michael Niedermayer	48a6916308	Merge commit '512f3ffe9b4bb86767c2b1176554407c75fe1a5c' * commit '512f3ffe9b4bb86767c2b1176554407c75fe1a5c': dsputil: Split off HuffYUV encoding bits into their own context Conflicts: configure libavcodec/dsputil.c libavcodec/dsputil.h libavcodec/huffyuv.h libavcodec/huffyuvenc.c libavcodec/pngenc.c libavcodec/x86/dsputilenc_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-28 00:03:59 +02:00
Michael Niedermayer	e2abc0d5ca	Merge commit '0d439fbede03854eac8a978cccf21a3425a3c82d' * commit '0d439fbede03854eac8a978cccf21a3425a3c82d': dsputil: Split off HuffYUV decoding bits into their own context Conflicts: configure libavcodec/dsputil.c libavcodec/dsputil.h libavcodec/huffyuv.h libavcodec/huffyuvdec.c libavcodec/lagarith.c libavcodec/vble.c libavcodec/x86/Makefile libavcodec/x86/dsputil.asm libavcodec/x86/dsputil_init.c libavcodec/x86/dsputil_mmx.c Merged-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-27 23:16:06 +02:00
Diego Biurrun	512f3ffe9b	dsputil: Split off HuffYUV encoding bits into their own context Also shorten HuffYUV context member names to avoid clutter.	2014-05-27 08:54:53 -07:00
Diego Biurrun	0d439fbede	dsputil: Split off HuffYUV decoding bits into their own context Also shorten HuffYUV context member names to avoid clutter.	2014-05-27 08:52:34 -07:00
James Almer	5863207086	x86/dsputilenc: use HADDD in ff_sse16_sse2 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-27 15:12:50 +02:00
James Almer	e64e079ece	x86/dsputilenc: implement SSE2 version of diff_pixels Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-27 05:55:11 +02:00
Michael Niedermayer	a0c5cd3475	avcodec/x86/dsputilenc: set the count of SSE registers correctly for get_pixels Found-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-27 05:52:25 +02:00
Christophe Gisquet	86ae0da60c	x86: hpeldsp: propagate changes across codecs Some codecs still use mmx versions, so have them use the versions with newer instruction sets. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-26 15:37:04 +02:00
Michael Niedermayer	a3950a90f6	Revert "x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm" This reverts commit ad733089b024e4a3ff8f024d247a032f79a50ac8. breaks with --disable-yasm revert requested by: Christophe Gisquet <christophe.gisquet@gmail.com>	2014-05-25 19:42:18 +02:00
Timothy Gu	ad733089b0	x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm Signed-off-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-25 16:30:08 +02:00
James Almer	d94e255dd1	x86/dsputilenc: make the SUM_ABS_DCTELEM macro more readable Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-25 02:03:54 +02:00
James Almer	61eea421b2	x86/dsputilenc: port sum_abs_dctelem functions to yasm Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 21:46:25 +02:00
Christophe Gisquet	81aa0f4604	x86: hpeldsp: implement SSSE3 version of _xy2 Loading pb_1 rather than pw_8192 was benchmarked to be more efficient. Loading of the 2 yields no advantage. Loading of one saves ~11 cycles. decicycles count: put8: 3223(mmx) -> 2387 avg8: 2863(mmxext) -> 2125 put16: 4356(sse2) -> 3553 avg16: 4481(sse2) -> 3513 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 15:15:56 +02:00
Christophe Gisquet	9722a6a3f3	x86: hpeldsp: implement SSE2 put_pixels16_xy2 This is obviously equivalent to the avg version, without the avg. 3223(mmx) -> 2006(sse2) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 03:45:17 +02:00
Christophe Gisquet	f0aca50e0b	x86: hpeldsp: implement SSE2 versions Those are mostly used in codecs older than H.264, eg MPEG-2. put16 versions: mmx mmx2 sse2 x2: 1888 1185 552 y2: 1778 1092 510 avg16 xy2: 3509(mmx2) -> 2169(sse2) Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 03:29:48 +02:00
James Almer	7538ad2248	x86/hevc_deblock: improve chroma functions register allocation Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-24 01:16:26 +02:00
James Almer	584327f22f	x86/dsputil: fix argument declaration in vector_clipf Should fix fate failures in msvc x86_64 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-23 23:10:17 +02:00
James Almer	518cbf9b4a	x86/dsputil: fix VECTOR_CLIP_INT32 macro The inline loop was incrementing and using the value of %%i the wrong way. Disassembly of ff_vector_clip_int32_sse2 before and after this patch: movdqa (%rdx),%xmm0 | movdqa (%rdx),%xmm0 movdqa 0x10(%rdx),%xmm1 | movdqa 0x10(%rdx),%xmm1 movdqa 0x20(%rdx),%xmm2 | movdqa 0x20(%rdx),%xmm2 movdqa 0x30(%rdx),%xmm3 | movdqa 0x30(%rdx),%xmm3 [...] | movdqa %xmm0,(%rcx) | movdqa %xmm0,(%rcx) movdqa %xmm1,0x10(%rcx) | movdqa %xmm1,0x10(%rcx) movdqa %xmm2,0x20(%rcx) | movdqa %xmm2,0x20(%rcx) movdqa %xmm3,0x30(%rcx) | movdqa %xmm3,0x30(%rcx) movdqa (%rdx),%xmm0 | movdqa 0x40(%rdx),%xmm0 movdqa 0x20(%rdx),%xmm1 | movdqa 0x50(%rdx),%xmm1 movdqa 0x40(%rdx),%xmm2 | movdqa 0x60(%rdx),%xmm2 movdqa 0x60(%rdx),%xmm3 | movdqa 0x70(%rdx),%xmm3 [...] | movdqa %xmm0,(%rcx) | movdqa %xmm0,0x40(%rcx) movdqa %xmm1,0x20(%rcx) | movdqa %xmm1,0x50(%rcx) movdqa %xmm2,0x40(%rcx) | movdqa %xmm2,0x60(%rcx) movdqa %xmm3,0x60(%rcx) | movdqa %xmm3,0x70(%rcx) add $0x80,%rdx | add $0x80,%rdx add $0x80,%rcx | add $0x80,%rcx Other versions were unaffected. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-23 22:59:55 +02:00
James Almer	6a4832caae	x86/diracdsp: mark all functions as yasm No inline asm dirac code remains in the tree, so replace every relevant check. This also moves all the dirac functions from dsputil_mmx.c to diracdsp_mmx.c Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-23 15:02:42 +02:00
James Almer	1d36defe94	x86/dsputil: port ff_vector_clipf_sse to yasm Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-23 00:08:21 +02:00
Christophe Gisquet	c081ca851c	x86: hpeldsp: avg_pixels_xy2 for mmx2&3dnow This is a port of the inline assembly of the mmx version to use the pavg(us|)b instruction. 8 16 mmx 1498 4355 mmx2 1242 3509 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:49 +02:00
Christophe Gisquet	17ac998055	x86: hpeldsp: mark _xy2 versions as approximate Currently, only the mmx version is bitexact, the others (mmxext and 3dnow) are not, in spite of their naming. Therefore, make their name more obvious. Also restore a comment that was removed in 71155d7b. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:45 +02:00
Christophe Gisquet	f8de35ebc4	x86: hpeldsp: kill hpeldsp_mmx.c before: 1987 decicycles in 8_x2, 262121 runs, 23 skips after: 1902 decicycles in 8_x2, 262112 runs, 32 skips Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-22 20:17:40 +02:00
James Almer	80ee2dfcf6	x86/dsputil: port ff_put_signed_pixels_clamped_mmx to yasm Also add an SSE2 version Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-21 23:33:45 +02:00
James Almer	7b05267239	x86/dsputil: port clear_block functions to yasm Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-21 23:33:45 +02:00
Michael Niedermayer	3d4e365073	avcodec/x86/hpeldsp_init: remove redundant if() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-21 13:38:27 +02:00
Hendrik Leppkes	cd9e08e110	hpeldsp: fix build without inline asm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-21 13:37:38 +02:00
Christophe Gisquet	d1a32c3f49	x86: kill fpel_mmx.c Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-21 03:25:08 +02:00
James Almer	d43c303038	x86/hevc_deblock: use constants instead of generating values at runtime Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-19 23:09:33 +02:00
James Almer	057ebf1222	x86/hevc_deblock: remove some duplicated instructions Also remove a couple unnecessary cmps Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-18 23:28:17 +02:00
Christophe Gisquet	f1793fe9cd	x86: hevc_mc: specify coefficients registers By default, macro EPEL_FILTER loads the coefficients unconditionally into m14/m15. This forces an unneeded higher register count. Reduce that count by making them parameters of EPEL_FILTER. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-18 16:23:58 +02:00
Carl Eugen Hoyos	ef2713747f	Fix compilation of libavcodec/x86/hevc_deblock.asm with nasm. Suggested-by: Reimar	2014-05-17 12:50:55 +02:00
James Almer	be1fbc02b8	x86/hevc_deblock: use movhps instead of shuffling values Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-17 05:40:14 +02:00
James Almer	8aac77fede	x86/hevc_deblock: fix label names Also remove some unnecessary jmps Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-17 05:40:08 +02:00
James Almer	521eaea63a	x86/hevc_deblock: fix usage of ABS1 The second argument is a temp register for non-SSSE3 cases Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-17 05:39:55 +02:00
James Almer	45110d2290	x86/hevc_deblock: merge movs with other instructions Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-17 05:39:34 +02:00
plepere	ef7c4cd001	avcodec/x86/hevc: updated to use x86util macros Reviewed-by: James Almer <jamrial@gmail.com> Reviewed-by: Ronald S. Bultje Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-16 21:11:07 +02:00
plepere	de7b89fd43	avcodec/x86/hevc: added DBF assembly functions Reviewed-by: James Almer <jamrial@gmail.com> Reviewed-by: Ronald S. Bultje Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-16 21:11:03 +02:00
Michael Niedermayer	bebce653e5	avcodec/x86/dsputil_mmx: Fix build with clang-usan Found-by: Katerina Barone-Adesi Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-15 23:56:39 +02:00
Christophe Gisquet	d1310c591e	x86: sbrdsp: implement SSE qmf_deint_neg From 133 (unrolled av_intfloat32 C) to 59 cycles on Arrandale/Win64. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-15 23:11:18 +02:00
Hendrik Leppkes	87f2d8079a	hevcdsp: correctly indicate that hevc_put_hevc_bi_epel_h uses 9 GPRs Fixes FATE on Windows. Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-12 17:00:48 +02:00
James Almer	8e07800001	hevcdsp: include stddef.h for ptrdiff_t definition Including stdint.h was enough for systems like Mingw, but apparently not for Linux. This should fix make checkheaders failures on every platform Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	2014-05-10 18:23:30 +02:00


1982 Commits