ffmpeg

Author	SHA1	Message	Date
Ronald S. Bultje	f76423d097	vp9: add x86 simd (sse2/ssse3) for iadst4 10bpp functions.	2015-10-13 11:05:58 -04:00
Ronald S. Bultje	6b579cf547	vp9: add 10bpp simd (mmxext/ssse3) for idct_idct_4x4.	2015-10-13 11:05:58 -04:00
Ronald S. Bultje	1c3be32533	vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function.	2015-10-13 11:05:57 -04:00
Christophe Gisquet	b6594a9605	x86: dct-test: add more idcts In particular for 10 and 12 bits. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 16:03:04 +02:00
Christophe Gisquet	7ece8b50b1	x86: simple_idct: 12bits versions On 12 frames of a 444p 12 bits DNxHR sequence, _put function: C: 78902 decicycles in idct, 262071 runs, 73 skips avx: 32478 decicycles in idct, 262045 runs, 99 skips Difference between the 2: stddev: 0.39 PSNR:104.47 MAXDIFF: 2 This is unavoidable and due to the scale factors used in the x86 version, which cannot match the C ones. In addition, the trick of adding an initial bias to the input of a pass can overflow, as the input coefficients are already 15bits, which is the maximum this function can handle. Overall, however, the omse on 12 bits samples goes from 0.16916 to 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes overflows. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 15:34:32 +02:00
Christophe Gisquet	4369b9dc7b	x86: simple_idct(_put): 10bits versions Modeled from the prores version. Clips to [0;1023] and is bitexact. Bitexactness requires to add offsets in different places compared to prores or C, and makes the function approximately 2% slower. For 16 frames of a DNxHD 4:2:2 10bits test sequence: C: 60861 decicycles in idct, 1048205 runs, 371 skips sse2: 27567 decicycles in idct, 1048216 runs, 360 skips avx: 26272 decicycles in idct, 1048171 runs, 405 skips The add version is not implemented, so the corresponding dsp function is set to NULL to make it clear in a code executing it. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 13:32:21 +02:00
Christophe Gisquet	e652f69b35	x86: simple_idct10_template: fix overflow in pass When the input of a pass has 15 or 16 bits of precision (in particular the column pass), the addition of a bias to W4 may lead to overflows in the input to pmaddwd. This requires postponing the adding of the bias to after the first butterfly. To do so, the fact that m15, unused although zeroed, is exploited. In case the pass is safe, an address can be directly used, and the number of xmm regs can be decreased. Otherwise, the 32bits bias is loaded into it. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 12:51:10 +02:00
Christophe Gisquet	e9a68b0316	x86: prores: templatize 10 bits simple_idct This should be reused for a generic simple_idct10 function. Requires a bit of trickery to declare common constants in C. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 01:10:34 +02:00
James Almer	dab5f65b25	x86/takdsp: use arithmetic shift instructions p1 and p2 are int32_t. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-10-09 23:52:39 -03:00
Paul B Mahol	35af7add6f	avcodec/takdec: add x86 SIMD for rest of decorrelation modes Signed-off-by: Paul B Mahol <onemda@gmail.com>	2015-10-09 21:38:15 +02:00
Ronald S. Bultje	ce78729033	vp9: don't keep a stack pointer if we don't need it. This saves one register in a few cases on 32bit builds with unaligned stack (e.g. MSVC), making the code slightly easier to maintain. (Can someone please test this on 32bit+msvc and confirm make fate-vp9 and tests/checkasm/checkasm still work after this patch?)	2015-10-07 08:55:19 -04:00
James Almer	72254b19b8	x86/alacdsp: add simd optimized functions Signed-off-by: James Almer <jamrial@gmail.com>	2015-10-06 20:22:00 -03:00
Ronald S. Bultje	cb912b4521	vp9: fix msvc build by using 6 GPRs on 32bit if stack!=aligned.	2015-10-05 16:51:05 -04:00
Christophe Gisquet	f827a17005	blockdsp: reindent after parameter removal Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-03 23:34:56 +02:00
Ronald S. Bultje	061b67fb50	vp9: 10/12bpp SIMD (sse2/ssse3/avx) for directional intra prediction.	2015-10-03 14:42:39 -04:00
Ronald S. Bultje	26ece7a511	vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions.	2015-10-03 14:42:39 -04:00
Ronald S. Bultje	db7786e8ff	vp9: sse2/ssse3/avx 16bpp loopfilter x86 simd.	2015-10-03 14:42:39 -04:00
Ganesh Ajjanagadde	0493e42eb2	avcodec/x86/hpeldsp_rnd_template: silence -Wunused-function on --disable-mmx This silences some of the -Wunused-function warnings when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. Header guards are too brittle and ugly for this case. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-03 14:24:41 +02:00
Christophe Gisquet	562ba4a827	blockdsp: remove high bitdepth parameter It is only (mis-)used to set the dsp functions clear_block(s). But these functions always work on 16bits-wide elements, which makes the parameter useless and actually harmful, as it causes all content on more than 8-bits to not use accelerated functions. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-02 04:38:40 +02:00
James Almer	3178931a14	x86/hevc_sao: move 10/12bit functions into a separate file Tested-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-30 02:59:55 -03:00
Ganesh Ajjanagadde	308e7484a3	avcodec/x86/rnd_template: silence -Wunused-function on --disable-mmx This silences some of the -Wunused-function warnings when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. Header guards are too brittle and ugly for this case. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-29 19:37:26 +02:00
Michael Niedermayer	1b82b934a1	avcodec/x86/sbrdsp: Fix using uninitialized upper 32bit of noise Fixes crash Fixes: flicker-1.scout3d21443372922.28.m4a Found-by: Dale Curtis <dalecurtis@google.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-29 13:23:25 +02:00
Ganesh Ajjanagadde	07cd8d5676	avcodec/x86/cavsdsp: silence -Wunused-variable on --disable-mmx This silences -Wunused-variable when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. The alternative of header guards will make it far too ugly. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-24 04:27:50 +02:00
Ganesh Ajjanagadde	0544c95fd6	avcodec/x86/mpegaudiodsp: silence -Wunused-variable on --disable-mmx This silences -Wunused-variable when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. The alternative of header guards will make it far too ugly. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-22 23:45:03 +02:00
Ganesh Ajjanagadde	4f90818ea1	avcodec/x86/rv40dsp_init: silence -Wunused-variable on --disable-mmx This silences -Wunused-variable when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. The alternative of header guards will make it far too ugly. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-22 23:45:03 +02:00
James Almer	7086154aaa	x86/vp9dsp: fix local header include Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-21 14:37:32 -03:00
James Almer	91fcb10f08	x86/vp9dsp: add missing header include Fixes make checkheaders Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-21 14:34:08 -03:00
James Almer	4bb6cb4c7d	x86/vp9mc: fix string concatenation of fullpel function names Fixes compilation with NASM Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-20 12:32:27 -03:00
Ganesh Ajjanagadde	92fabca427	avcodec/x86/hpeldsp_rnd_template: silence -Wunused-function on --disable-mmx This silences some of the -Wunused-function warnings when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. Header guards are too brittle and ugly for this case. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-20 04:00:42 +02:00
Ganesh Ajjanagadde	e681baf638	avcodec/x86/mpegvideoenc: silence -Wunused-function on --disable-mmx This silences -Wunused-function when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-19 23:26:57 +02:00
Ganesh Ajjanagadde	f0c635f577	avcodec/x86/hpeldsp_init: silence -Wunused-function on --disable-mmx This silences some of the -Wunused-function warnings when compiled with --disable-mmx, e.g http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-19 23:10:52 +02:00
James Almer	6f9ba0cb82	x86/vp9dsp: add missing preprocessor guards Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-19 13:33:53 -03:00
James Almer	e47564828b	x86/vp9mc: add missing preprocessor guards Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-18 15:14:53 -03:00
James Almer	2f9ab15960	x86/vp9: add avx2 subpel MC SIMD for 10/12bpp Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-18 12:28:55 -03:00
Michael Niedermayer	58fe57d5a0	avcodec/mpeg12enc: Basic support for encoding non even QPs for -non_linear_quant 1 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-18 02:52:57 +02:00
Michael Niedermayer	2d35757814	avcodec/mpegvideo: Change mpeg2 unquant to work with higher precision qscale Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-18 02:39:17 +02:00
Ronald S. Bultje	344d519040	vp9: add subpel MC SIMD for 10/12bpp.	2015-09-16 21:11:34 -04:00
Ronald S. Bultje	77f359670f	vp9: add fullpel (avg) MC SIMD for 10/12bpp.	2015-09-16 21:11:34 -04:00
Ronald S. Bultje	6354ff0383	vp9: add fullpel (put) MC SIMD for 10/12bpp.	2015-09-16 21:11:34 -04:00
Hendrik Leppkes	7b865c222e	Merge commit '5d14cf199990cd378904a2618b5c72c4b02290f6' * commit '5d14cf199990cd378904a2618b5c72c4b02290f6': mpegvideo: Make sure mpegutils.h is included where needed Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-09-16 11:23:40 +02:00
Vittorio Giovara	5d14cf1999	mpegvideo: Make sure mpegutils.h is included where needed	2015-09-13 17:34:45 +02:00
James Almer	d5f8a642f6	x86: port PSIGNW to cpuflags Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-09-11 23:27:03 -03:00
Ronald S. Bultje	4b66274a86	vp9: save one (PSIGNW) instruction in iadst16_1d sse2/ssse3.	2015-09-11 20:36:51 -04:00
Ronald S. Bultje	fd8b90f5f6	vp9: fix overflow in 8x8 topleft 32x32 idct ssse3 version. Also disable the mmx/iwht optimization when the bitexact flag is set. With synthetically coded coefficients (i.e. these that lead to a residual well outside the [-255,255] range), our optimizations will overflow. It doesn't make sense to fix the overflows, since they can only occur on synthetic input, not on real fwht-generated input. Thus, add a bitexact flag that disables this optimization.	2015-09-10 07:51:16 -04:00
Hendrik Leppkes	5d8e836d0e	Replace all remaining occurrences of step/depth_minus1 and offset_plus1	2015-09-08 17:10:48 +02:00
Ronald S. Bultje	f12093fffd	vp9: fix integer overflows in sse2 version of iadst4.	2015-09-06 15:07:19 -04:00
Michael Niedermayer	8d860f9a77	avcodec/x86/w64xmmtest: Fix another build failure Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-09-05 22:15:53 +02:00
Ronald S. Bultje	086c9b78d4	vp9: fix rounding error in idct_8x8_ssse3.	2015-09-05 15:50:02 -04:00
Hendrik Leppkes	41194f065c	Merge commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798' * commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798': lavc: Drop deprecated deinterlace module Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-09-05 17:06:14 +02:00
Vittorio Giovara	cad40a3833	lavc: Drop deprecated deinterlace module Deprecated in 03/2013.	2015-08-28 16:04:19 +02:00

2095 Commits