ffmpeg

Author	SHA1	Message	Date
Janne Grunau	98fdfa9970	ppc: reduce overreads when loading 8 pixels in altivec dsp functions Altivec can only load naturally aligned vectors. To handle possibly unaligned data a second vector is loaded from an offset of the original location and the data is recovered through a vector permutation. Overreads are minimal if the offset for second load points to the last element of data. This is 7 for loading eight 8-bit pixels and overreads are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned. For unaligned pixels the overread is reduced from 23 bytes to 15 bytes in the worst case.	2014-02-14 18:34:04 +01:00
Ronald S. Bultje	2f6eec65ac	vp8: fix PPC assembly to work if src_stride != dst_stride Signed-off-by: Anton Khirnov <anton@khirnov.net> Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2014-02-09 18:50:53 +01:00
Anton Khirnov	a03a642d5c	h264: do not use 422 functions for monochrome Fixes invalid memory access. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind CC:libav-stable@libav.org	2014-01-06 08:25:36 +01:00
Diego Biurrun	a6b6501185	ppc: cosmetics: Consistently format CPU flag detection invocations	2013-08-29 11:31:32 +02:00
Diego Biurrun	6af2c351b3	ppc: Add missing AltiVec cpuflag detection invocations	2013-08-29 00:24:46 +02:00
Diego Biurrun	de81b6ae4f	ppc: fdct: Remove vim editor settings comment	2013-08-28 23:59:24 +02:00
Diego Biurrun	f61bece684	ppc: Add and use convenience macro to check for AltiVec availability	2013-08-28 23:54:15 +02:00
Kostya Shishkov	f399e406af	altivec: perform an explicit unaligned load Implicit vector loads on POWER7 hardware can use the VSX instruction set instead of classic Altivec/VMX. Let's force a VMX load in this case. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-08-16 10:08:47 +03:00
Diego Biurrun	3ac7fa81b2	Consistently use "cpu_flags" as variable/parameter name for CPU flags	2013-07-18 00:31:35 +02:00
Christophe Gisquet	b6293e2798	fmtconvert: Explicitly use int32_t instead of int Signed-off-by: Martin Storsjö <martin@martin.st>	2013-07-17 11:02:47 +03:00
Kostya Shishkov	0418cbf081	fix scalarproduct_and_madd_int16_altivec() for orders > 16 the second and third sources were incremented only by half of the needed size	2013-05-26 16:10:47 +02:00
Diego Biurrun	a650c906cb	ppc: Only compile AltiVec FFT assembly when AltiVec is enabled	2013-05-02 10:25:30 +02:00
Diego Biurrun	7f75f2f2bd	ppc: Drop unnecessary ff_ name prefixes from static functions	2013-04-30 16:10:06 +02:00
Diego Biurrun	38282149b6	ppc: More consistent arch initialization	2013-04-30 12:19:45 +02:00
Diego Biurrun	a053dbfcfb	ppc: Move AltiVec utility headers out of AltiVec ifdefs Now that the headers themselves have ifdef protection this is no longer necessary and more consistent with normal include handling.	2013-04-30 12:19:44 +02:00
Diego Biurrun	6b110d3a73	ppc: More consistent names for H.264 optimizations files	2013-04-30 12:19:43 +02:00
Diego Biurrun	643e433bf7	mpegaudiosp: More consistent names for ppc/x86 optimization files	2013-04-30 12:19:43 +02:00
Martin Storsjö	6d0fbebf94	ppc: hpeldsp: Include attributes.h This fixes building in configurations where altivec is disabled. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-20 16:43:01 +03:00
Ronald S. Bultje	47e5a98174	ppc: hpeldsp: Move half-pel assembly from dsputil to hpeldsp Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-19 23:18:59 +03:00
Ronald S. Bultje	015821229f	vp3: Use full transpose for all IDCTs This way, the special IDCT permutations are no longer needed. This is similar to how H264 does it, and removes the dsputil dependency imposed by the scantable code. Also remove the unused type == 0 cases from the plain C version of the idct. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-15 12:32:05 +03:00
Ronald S. Bultje	62844c3fd6	h264: Integrate clear_blocks calls with IDCT The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700 to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip tested (cathedral), i.e. almost 30 cycles per mb faster. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-04-10 11:03:06 +03:00
Luca Barbato	a8b6015823	dsputil: convert remaining functions to use ptrdiff_t strides Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-03-12 18:26:42 +01:00
Diego Biurrun	c242bbd8b6	Remove unnecessary dsputil.h #includes	2013-02-26 00:51:34 +01:00
Diego Biurrun	218aefce44	dsputil: Move LOCAL_ALIGNED macros to libavutil	2013-02-08 23:13:37 +01:00
Diego Biurrun	79dad2a932	dsputil: Separate h264chroma	2013-02-06 11:30:53 +01:00
Diego Biurrun	c9f933b5b6	Add av_cold attributes to arch-specific init functions	2013-02-05 17:01:05 +01:00
Diego Biurrun	25841dfe80	Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter. This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic.	2013-02-05 12:59:12 +01:00
Diego Biurrun	4eef2ed707	ppc: fmtconvert: Drop two unused variables.	2013-02-01 12:51:13 +01:00
Mans Rullgard	e9d817351b	dsputil: Separate h264 qpel The sh4 optimizations are removed, because the code is 100% identical to the C code, so it is unlikely to provide any real practical benefit. Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-24 10:44:43 +01:00
Diego Biurrun	88bd7fdc82	Drop DCTELEM typedef It does not help as an abstraction and adds dsputil dependencies. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	2013-01-22 18:32:56 -08:00
Ronald S. Bultje	42d3246948	floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. Now, nellymoserenc and aacenc no longer depends on dsputil. Independent of this patch, wmaprodec also does not depend on dsputil, so I removed it from there also.	2013-01-22 11:55:42 -08:00
Ronald S. Bultje	55aa03b9f8	floatdsp: move vector_fmul_add from dsputil to avfloatdsp.	2013-01-22 11:55:42 -08:00
Ronald S. Bultje	1768e43ceb	vorbisdsp: change block_size type from int to intptr_t. This saves one instruction in the x86-64 assembly.	2013-01-20 22:26:42 -08:00
Diego Biurrun	d9bf716945	ppc: vorbisdsp: Drop some unnecessary #includes Also fixes compilation with AltiVec disabled.	2013-01-20 17:38:11 +01:00
Martin Storsjö	d160a2fb4c	ppc: Include string.h for memset This fixes build failures on ppc machines with a compiler that supports -Werror=implicit-function-declaration. Signed-off-by: Martin Storsjö <martin@martin.st>	2013-01-20 18:10:21 +02:00
Ronald S. Bultje	fef906c77c	Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. Conveniently (together with Justin's earlier patches), this makes our vorbis decoder entirely independent of dsputil.	2013-01-19 22:21:10 -08:00
Ronald S. Bultje	aeaf268e52	vp3: integrate clear_blocks with idct of previous block. This is identical to what e.g. vp8 does, and prevents the function call overhead (plus dependency on dsputil for this particular function). Arm asm updated by Janne Grunau <janne-libav@jannau.net>. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	2013-01-19 22:04:55 -08:00
Justin Ruggles	e034cc6c60	lavc: Move vector_fmul_window to AVFloatDSPContext Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2013-01-16 10:45:45 +01:00
Ronald S. Bultje	8c53d39e7f	lavc: introduce VideoDSPContext Move some functions from dsputil. The idea is that videodsp contains functions that are useful for a large and varied set of video decoders. Currently, it contains emulated_edge_mc() and prefetch(). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2012-12-20 13:40:45 +01:00
Mans Rullgard	a384f6a7f7	ppc: replace pointer casting with AV_COPY32 This removes warnings about strict aliasing violations. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-11-12 10:31:31 +00:00
Mans Rullgard	031aac9861	ppc: fix some unused variable warnings The third argument of OP_U8_ALTIVEC is evaluated at most once so there is no need for a potentially unused temporary variable. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-11-12 10:31:31 +00:00
Diego Biurrun	ac56ff9cc9	build: non-x86: Only compile mpegvideo optimizations when necessary	2012-10-09 14:45:59 +02:00
Mans Rullgard	f79364b2c3	ppc: fix Altivec build with old compilers The vec_splat() intrinsic requires a constant argument for the element number, and the code relies on the compiler unrolling the loop to provide this. Manually unrolling the loop avoids this reliance and works with all compilers. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-10-08 23:14:51 +01:00
Mans Rullgard	642b4efaf7	ppc: fmtconvert: kill VLA in float_to_int16_interleave_altivec() Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-10-05 22:33:32 +01:00
Martin Storsjö	33e112847d	Add more missing includes after removing the implicit common.h Signed-off-by: Martin Storsjö <martin@martin.st>	2012-08-16 10:49:54 +03:00
Martin Storsjö	70766c2182	Add some more missing includes after removing the implicit common.h Signed-off-by: Martin Storsjö <martin@martin.st>	2012-08-15 23:48:48 +03:00
Martin Storsjö	1d9c2dc89a	Don't include common.h from avutil.h Signed-off-by: Martin Storsjö <martin@martin.st>	2012-08-15 22:32:06 +03:00
Justin Ruggles	a35738f424	dsputil: ppc: cosmetics: pretty-print	2012-07-22 17:38:55 -04:00
Mans Rullgard	ffdd93a25e	ppc: fix build with altivec disabled Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-07-18 13:34:42 +01:00
Mans Rullgard	28f9ab7029	vp3: move idct and loop filter pointers to new vp3dsp context This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by: Mans Rullgard <mans@mansr.com>	2012-07-18 10:32:19 +01:00
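
The overread figures quoted in commit 98fdfa9970 can be checked with a small arithmetic model. This is only a sketch of the two-load scheme the message describes, not the AltiVec code itself: each load fetches the 16-byte aligned block containing its address, and the overread is counted as the bytes fetched past the end of the eight pixels. The function names are hypothetical, and the previous second-load offset of 15 is an assumption; the log does not state it, but it is the value that reproduces the "16 bytes" and "23 bytes" figures in the message.

```python
VECTOR_BYTES = 16  # width of one AltiVec vector in bytes


def overread(addr, second_offset, data_bytes=8):
    """Bytes fetched past the end of the data by the two aligned loads.

    Hypothetical model: each load fetches the 16-byte aligned block
    that contains its address.
    """
    first_block = addr & ~(VECTOR_BYTES - 1)
    second_block = (addr + second_offset) & ~(VECTOR_BYTES - 1)
    end_of_reads = max(first_block, second_block) + VECTOR_BYTES
    return end_of_reads - (addr + data_bytes)


def worst_overread(second_offset, aligned64):
    """Worst case over every start position inside a 16-byte block."""
    if aligned64:
        positions = [0, 8]  # 64-bit aligned start addresses
    else:
        positions = [p for p in range(VECTOR_BYTES) if p % 8 != 0]
    return max(overread(p, second_offset) for p in positions)


# Offset 15 is the assumed old value; offset 7 is the one named in the commit.
for offset in (15, 7):
    print(offset, worst_overread(offset, True), worst_overread(offset, False))
```

Under these assumptions, offset 7 gives worst cases of 8 bytes (64-bit aligned) and 15 bytes (unaligned), against 16 and 23 bytes for offset 15, matching the numbers in the commit message.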

399 Commits