Diego Biurrun
82bb304801
dsputil: Use correct type in me_cmp_func function pointer
2014-03-20 05:03:23 -07:00
Diego Biurrun
0e083d7e43
build: Group general components separate from de/encoders in arch Makefiles
...
This is in line with how the top-level libavcodec Makefile is structured.
2014-03-20 05:03:23 -07:00
Diego Biurrun
54a6e08a65
dsputil: Conditionally compile dsputil code on all architectures
2014-03-20 05:03:23 -07:00
Diego Biurrun
5169e68895
dsputil: Propagate bit depth information to all (sub)init functions
...
This avoids recalculating the value over and over again.
2014-03-20 05:03:23 -07:00
Diego Biurrun
1675975216
ppc: dsputil: Drop trailing semicolon from macros
...
This allows for a more natural macro usage.
2014-03-20 05:03:22 -07:00
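Note on 1675975216: the sketch below illustrates why a macro without a trailing semicolon reads more naturally at the call site. SWAP_INT and sort2 are hypothetical names made up for this note and are not taken from the ppc dsputil code.

```c
/* Hypothetical statement-like macro (not from the ppc dsputil sources).
 * It deliberately has no trailing semicolon: the caller supplies it. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Order two values so that *lo <= *hi; returns 1 if a swap happened. */
static int sort2(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* used like any other statement */
    else
        return 0;             /* a ';' inside the macro would orphan this else */
    return 1;
}
```

If the macro body ended in its own semicolon, the call above would expand to two statements and the else branch would no longer compile.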
Diego Biurrun
b7d24fd4b2
ppc: dsputil: Merge some declarations and initializations
2014-03-20 05:03:22 -07:00
Diego Biurrun
b045283f21
ppc: dsputil: Simplify some ifdeffed function definitions
2014-03-20 05:03:22 -07:00
Diego Biurrun
8bd6f88266
ppc: dsputil: Drop some unnecessary parentheses
2014-03-20 05:03:22 -07:00
Diego Biurrun
022184a646
ppc: dsputil: more K&R formatting cosmetics
2014-03-20 05:03:22 -07:00
Diego Biurrun
30f3f95987
ppc: dsputil: K&R formatting cosmetics
2014-03-20 05:03:22 -07:00
Diego Biurrun
82ee14d2ce
ppc: dsputil: comment formatting and wording/grammar improvements
2014-03-20 05:03:22 -07:00
Diego Biurrun
fd9e2221bd
ppc: Add some missing headers
2014-03-13 05:50:28 -07:00
Diego Biurrun
49676eb730
dsputil: Remove prototypes for nonexisting optimization functions
2014-03-13 05:50:28 -07:00
Janne Grunau
98fdfa9970
ppc: reduce overreads when loading 8 pixels in altivec dsp functions
...
Altivec can only load naturally aligned vectors. To handle possibly
unaligned data a second vector is loaded from an offset of the original
location and the data is recovered through a vector permutation.
Overreads are minimal if the offset for the second load points to the last
element of data. This is 7 for loading eight 8-bit pixels and overreads
are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned.
For unaligned pixels the overread is reduced from 23 bytes to 15 bytes
in the worst case.
2014-02-14 18:34:04 +01:00
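Note on 98fdfa9970: the overread figures in the message can be reproduced with a small portable model. This is only a sketch of the arithmetic, not AltiVec code. It assumes 16-byte vectors, that the second load fetches the aligned vector containing addr + offset, and that the offset used before the change was 15 (the message does not state the old value, but 15 matches its numbers). The function names are hypothetical.

```c
#include <stddef.h>

/* Bytes read past the end of the requested data when the second aligned
 * 16-byte load is taken from (addr + offset). */
static size_t overread_bytes(size_t addr, size_t len, size_t offset)
{
    size_t last_vec_end = ((addr + offset) & ~(size_t)15) + 16;
    return last_vec_end - (addr + len);
}

/* Worst case over all start addresses that are multiples of `step`
 * within one 16-byte vector (step 8: 64-bit aligned, step 1: unaligned). */
static size_t worst_overread(size_t len, size_t offset, size_t step)
{
    size_t worst = 0;
    for (size_t addr = 0; addr < 16; addr += step) {
        size_t o = overread_bytes(addr, len, offset);
        if (o > worst)
            worst = o;
    }
    return worst;
}
```

Under these assumptions, worst_overread(8, 15, 8) and worst_overread(8, 7, 8) give the 16 and 8 bytes quoted for 64-bit aligned pixels, and a step of 1 gives the 23 and 15 bytes quoted for the unaligned worst case.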
Ronald S. Bultje
2f6eec65ac
vp8: fix PPC assembly to work if src_stride != dst_stride
...
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-09 18:50:53 +01:00
Anton Khirnov
a03a642d5c
h264: do not use 422 functions for monochrome
...
Fixes invalid memory access.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
2014-01-06 08:25:36 +01:00
Diego Biurrun
a6b6501185
ppc: cosmetics: Consistently format CPU flag detection invocations
2013-08-29 11:31:32 +02:00
Diego Biurrun
6af2c351b3
ppc: Add missing AltiVec cpuflag detection invocations
2013-08-29 00:24:46 +02:00
Diego Biurrun
de81b6ae4f
ppc: fdct: Remove vim editor settings comment
2013-08-28 23:59:24 +02:00
Diego Biurrun
f61bece684
ppc: Add and use convenience macro to check for AltiVec availability
2013-08-28 23:54:15 +02:00
Kostya Shishkov
f399e406af
altivec: perform an explicit unaligned load
...
Implicit vector loads on POWER7 hardware can use the VSX
instruction set instead of classic Altivec/VMX. Let's force
a VMX load in this case.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-08-16 10:08:47 +03:00
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
2013-07-18 00:31:35 +02:00
Christophe Gisquet
b6293e2798
fmtconvert: Explicitly use int32_t instead of int
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-17 11:02:47 +03:00
Kostya Shishkov
0418cbf081
fix scalarproduct_and_madd_int16_altivec() for orders > 16
...
The second and third sources were incremented by only half of the needed size.
2013-05-26 16:10:47 +02:00
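Note on 0418cbf081: a portable sketch of the pointer bookkeeping that the fix is about. It is not the AltiVec implementation. The function name, the chunk size of 16 elements per iteration, and the requirement that order be a multiple of 16 are assumptions made for illustration. The point is that all three pointers have to advance by the full chunk size, otherwise any order above one chunk reads the wrong elements of the second and third sources.

```c
#include <stdint.h>

/* Returns sum(v1[i] * v2[i]) and updates v1[i] += mul * v3[i],
 * processing 16 elements per outer iteration (assumed chunk size). */
static int32_t scalarproduct_and_madd_int16_sketch(int16_t *v1,
                                                   const int16_t *v2,
                                                   const int16_t *v3,
                                                   int order, int mul)
{
    int32_t res = 0;

    for (; order > 0; order -= 16) {
        for (int i = 0; i < 16; i++) {
            res   += v1[i] * v2[i];   /* uses v1 before it is updated */
            v1[i] += mul * v3[i];
        }
        /* Every pointer advances by the whole chunk, not half of it. */
        v1 += 16;
        v2 += 16;
        v3 += 16;
    }
    return res;
}
```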
Diego Biurrun
a650c906cb
ppc: Only compile AltiVec FFT assembly when AltiVec is enabled
2013-05-02 10:25:30 +02:00
Diego Biurrun
7f75f2f2bd
ppc: Drop unnecessary ff_ name prefixes from static functions
2013-04-30 16:10:06 +02:00
Diego Biurrun
38282149b6
ppc: More consistent arch initialization
2013-04-30 12:19:45 +02:00
Diego Biurrun
a053dbfcfb
ppc: Move AltiVec utility headers out of AltiVec ifdefs
...
Now that the headers themselves have ifdef protection this is no
longer necessary and more consistent with normal include handling.
2013-04-30 12:19:44 +02:00
Diego Biurrun
6b110d3a73
ppc: More consistent names for H.264 optimizations files
2013-04-30 12:19:43 +02:00
Diego Biurrun
643e433bf7
mpegaudiodsp: More consistent names for ppc/x86 optimization files
2013-04-30 12:19:43 +02:00
Martin Storsjö
6d0fbebf94
ppc: hpeldsp: Include attributes.h
...
This fixes building in configurations where altivec is disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-20 16:43:01 +03:00
Ronald S. Bultje
47e5a98174
ppc: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:18:59 +03:00
Ronald S. Bultje
015821229f
vp3: Use full transpose for all IDCTs
...
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.
Also remove the unused type == 0 cases from the plain C version
of the idct.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Ronald S. Bultje
62844c3fd6
h264: Integrate clear_blocks calls with IDCT
...
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
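Note on 62844c3fd6: the idea of folding the coefficient clearing into the IDCT can be sketched with a DC-only case. This is a simplified illustration and not the H.264 IDCT from the tree. The function names, the 4x4 block size and the (dc + 32) >> 6 rounding are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical DC-only 4x4 "IDCT add": adds the rounded DC value to each
 * pixel and zeroes the coefficient it consumed, so the caller does not
 * need a separate clear_blocks pass afterwards. */
static void idct_dc_add_and_clear(uint8_t *dst, int16_t *block,
                                  ptrdiff_t stride)
{
    int dc = (block[0] + 32) >> 6;

    block[0] = 0;   /* clearing is part of the transform call */
    for (int y = 0; y < 4; y++, dst += stride)
        for (int x = 0; x < 4; x++)
            dst[x] = clip_uint8(dst[x] + dc);
}
```

The coefficient is cleared while it is still hot in cache, which is one plausible reason for the cycle savings the message reports.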
Luca Barbato
a8b6015823
dsputil: convert remaining functions to use ptrdiff_t strides
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-03-12 18:26:42 +01:00
Diego Biurrun
c242bbd8b6
Remove unnecessary dsputil.h #includes
2013-02-26 00:51:34 +01:00
Diego Biurrun
218aefce44
dsputil: Move LOCAL_ALIGNED macros to libavutil
2013-02-08 23:13:37 +01:00
Diego Biurrun
79dad2a932
dsputil: Separate h264chroma
2013-02-06 11:30:53 +01:00
Diego Biurrun
c9f933b5b6
Add av_cold attributes to arch-specific init functions
2013-02-05 17:01:05 +01:00
Diego Biurrun
25841dfe80
Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
...
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
2013-02-05 12:59:12 +01:00
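Note on 25841dfe80: a plain C sketch of a put_pixels-style routine with a ptrdiff_t line size. The function name and the fixed width of 8 pixels are assumptions for illustration. Because the stride already has pointer width, it can be added to the pointers directly, including when it is negative, which is what lets SIMD versions skip a manual sign extension.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy h rows of 8 pixels; line_size is the distance in bytes between
 * rows and may be negative (bottom-up walk). */
static void put_pixels8_sketch(uint8_t *dst, const uint8_t *src,
                               ptrdiff_t line_size, int h)
{
    for (int i = 0; i < h; i++) {
        memcpy(dst, src, 8);
        dst += line_size;   /* pointer-width stride, no widening needed */
        src += line_size;
    }
}
```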
Diego Biurrun
4eef2ed707
ppc: fmtconvert: Drop two unused variables.
2013-02-01 12:51:13 +01:00
Mans Rullgard
e9d817351b
dsputil: Separate h264 qpel
...
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Diego Biurrun
88bd7fdc82
Drop DCTELEM typedef
...
It does not help as an abstraction and adds dsputil dependencies.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Ronald S. Bultje
42d3246948
floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
...
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8
floatdsp: move vector_fmul_add from dsputil to avfloatdsp.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
1768e43ceb
vorbisdsp: change block_size type from int to intptr_t.
...
This saves one instruction in the x86-64 assembly.
2013-01-20 22:26:42 -08:00
Diego Biurrun
d9bf716945
ppc: vorbisdsp: Drop some unnecessary #includes
...
Also fixes compilation with AltiVec disabled.
2013-01-20 17:38:11 +01:00
Martin Storsjö
d160a2fb4c
ppc: Include string.h for memset
...
This fixes build failures on ppc machines with a compiler that
supports -Werror=implicit-function-declaration.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-20 18:10:21 +02:00
Ronald S. Bultje
fef906c77c
Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
...
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
aeaf268e52
vp3: integrate clear_blocks with idct of previous block.
...
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).
Arm asm updated by Janne Grunau <janne-libav@jannau.net>.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2013-01-19 22:04:55 -08:00