GCC 4.3 and later do the right thing with the plain C code. Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value. At least
two gcc configurations miscompile the inline asm in some situations.
In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr. On Intel i7 and later, this imul is faster 32-bit mul. On
older Intel and all AMD, it is slightly slower. On Atom it is much
slower.
Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible. If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
There is no point in having the user disable any fastdiv macros.
Besides the condition implementation was broken and only disabled
the C implementation, but no platform specific assembly versions.
Fixed-point audio codecs often use saturating arithmetic, and
special instructions for these operations are common.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This adds a function to retrieve the number of entries in a
dictionary and updates the places directly accessing what should
be an opaque struct to use this new function instead.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The only compiler I have that does not define the standard
offsetof() macro is "Bruce's C Compiler", a simple compiler
for producing 8/16-bit 8086 code, usually for use in early
stages of PC booting.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This list is incomplete (we also use UINT16_MAX), so there does
not appear to be any system we care about that needs these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.
Two small tweaks to the macro are made:
- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
Some compilers do not support the Q/R modifiers used to access
the low/high parts of a 64-bit register pair. Check for this
and disable all uses of it when not supported.
Fixes bug #337.
Signed-off-by: Mans Rullgard <mans@mansr.com>
It appears that something goes wrong in old nasm versions when the
%+ operator is used in the last argument of a macro invocation and
this argument is tested with %ifdef within the macro. This patch
rearranges the macro arguments such that the %+ operator is never
used in the last argument.
nasm does not support 'CPU foonop' directives. This adds a configure
test for the directive and uses it only if supported.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
This allows us to unconditionally set the cglobal num_args
parameter to a bigger value, thus making writing yasm code
even easier than before.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Previously it was interpreted as number of bytes, while the
documentation stated that it was the number of 8 byte blocks.
This makes it behave similarly to the existing AES code.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously it was interpreted as number of bytes, while the
documentation stated that it was the number of 8 byte blocks.
This makes it behave similarly to the existing AES code.
Signed-off-by: Martin Storsjö <martin@martin.st>
intfloat.h is a public header, and is now (since a1245d5ca) included
by mathematics.h, which many external callers include.
This fixes building third party applications that include
mathematics.h in a language that doesn't support designated
initalizers.
Signed-off-by: Martin Storsjö <martin@martin.st>
These files use NAN/INFINITY but didn't include mathematics.h to get
the fallback definitions if the system lacks the macros.
Signed-off-by: Martin Storsjö <martin@martin.st>
Some compilers, MSVC among them, don't recognize the divisions by
zero as meaning infinity/nan.
These macros should, according to the standard, expand to constant
expressions, but this shouldn't matter for our usage.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds macros for accessing the EFLAGS register and uses
these instead of coding the entire check in inline asm.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This creates proper position independent code when accessing
data symbols if CONFIG_PIC is set.
References to external symbols should now use the movrelx macro.
Some additional code changes are required since this macro may
need a register to hold the GOT pointer.
Signed-off-by: Mans Rullgard <mans@mansr.com>
unistd.h is used for open/read/close, but if this header does not
exist, there's probably no use in trying to open /dev/*random
at all.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds a fallback for cbrtf() using powf(x, 1/3). Since
powf() with a non-integer exponent requires a non-negative
base, special handling of negative inputs is needed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This is required for isatty, which exists on MSVC and is found by
configure, but is provided by io.h instead of unistd.h.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds whitespace around operators, aligns line continuation
backslashes, and breaks long lines. Also fixes an ifdef halfway
through a statement. The one line of duplication this saved is
not worth the ugliness.
Signed-off-by: Mans Rullgard <mans@mansr.com>
MSVC has isatty (in io.h), but not unistd.h. (isatty isn't called
at all for windows, since there's a special case block for that.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This function implements a delay using the first available
of the following functions:
- nanosleep()
- usleep()
- Sleep() (Windows)
The conditional #includes in time.c are simplified by including
unistd.h and windows.h whenever they are available rather than
having these lines triggered by specific functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The unistd.h header is only needed for the close() declaration.
If this header is not available, the close() declaration may be
provided by another header, e.g. io.h.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The only symbol this file uses from unistd.h is isatty(). By
including the header only when this function is used, the file
can be built on systems without unistd.h (which presumably also
lack isatty).
Signed-off-by: Mans Rullgard <mans@mansr.com>
While these defines are not defined by the C standard they are
standardized as X/Open System Interfaces Extension. We use the
appropiate _XOPEN_SOURCE define to make them available. They
seem to be available on all FATE configs since the constants
are used in files where mathematics.h is not included.
The check uses check_func_header, since this function is
conditionally available depending on the targeted MSVCRT
version.
Signed-off-by: Martin Storsjö <martin@martin.st>
Introduce a new function to set binary data through AVOption,
avoiding having to convert the binary data to a string inbetween.
Signed-off-by: Martin Storsjö <martin@martin.st>
Just like gcc 4.6 and later on ARM, gcc 4.8 on MIPS generates
inefficient code when a known-unaligned location is used as a
memory input operand. This applies the same fix as has been
previously done to the ARM version of the code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
GCC actually handles unaligned accesses correctly in all cases
except, absurdly, 32-bit loads on mips64. The remaining asm is
thus not needed, and removing it results in better code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
libavcodec/utils.c:274: warning: passing argument 3 of ‘av_samples_fill_arrays’ discards qualifiers from pointer target type
./libavutil/samplefmt.h:151: note: expected ‘uint8_t *’ but argument is of type ‘const uint8_t *’
Commit adebad0 "arm: intreadwrite: fix inline asm constraints for gcc
4.6 and later" caused some older gcc versions to miscompile code.
This reverts to the old version of the code for these compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Starting with version 4.7, gcc properly supports unaligned
memory accesses on ARM. Not using the inline asm with these
compilers results in better code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
With a dereferenced type-cast pointer as memory operand, gcc 4.6
and later will sometimes copy the data to a temporary location,
the address of which is used as the operand value, if it thinks
the target address might be misaligned. Using a pointer to a
packed struct type instead does the right thing.
The 16-bit case is special since the ldrh instruction addressing
modes are limited compared to ldr. The "Uq" constraint produces a
memory reference suitable for an ldrsb instruction, which supports
the same addressing modes as ldrh. However, the restrictions appear
to apply only when the operand addresses a single byte. The memory
reference must thus be split into two operands each targeting one
byte. Finally, the "Uq" constraint is only available in ARM mode.
The Thumb-2 ldrh instruction supports most addressing modes so the
normal "m" constraint can be used there.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This allows masking CPU features with the -cpuflags avconv option
which is useful for testing different optimisations without rebuilding.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
Plain POSIX malloc(0) is allowed to return either NULL or a
non-NULL pointer. The calling code should be ready to handle
a NULL return as a correct return (instead of a failure) if the size
to allocate was 0 - this makes sure the condition is handled
in a consistent way across platforms.
This also avoids calling posix_memalign(&ptr, 32, 0) on OS X,
which returns an invalid pointer (a non-NULL pointer that causes
crashes when passed to av_free).
Abort in debug mode, to help track down issues related to
incorrect handling of this case.
Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Nicolas George <nicolas.george@normalesup.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
The were broken since August of 2010 without anyone noticing until
three weeks ago. Nobody cares about it anymore and hopefully Marvell
will support NEON like in the PXA978 from now on.
This library does not fit into Libav as a whole and its code is just a
maintenance burden. Furthermore it is now available as an external project,
which completely obviates any reason to keep it around.
URL: http://git.videolan.org/?p=libpostproc.git
This sets __OUTPUT_FORMAT__ to win64 instead of win32, even though both
(through -m amd64) produce 64-bit binary code.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Functions using INIT_MMX may still access XMM registers through direct
means (xmm0-15). Therefore, they still need to be marked for clobber
so they can be properly saved/restored.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The spec says the following speaker mapping is default:
center front speaker
left, right center front speakers,
left, right outside front speakers,
left surround, right surround rear speakers,
front low frequency effects speaker
This will be useful to test more aggressively for failures to mark XMM
registers as clobbered in Win64 builds, and prevent regressions thereof.
Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>
The functions are already av_ prefixed and intfloat header is already provided.
Install libavutil/intfloat.h
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Firstly, this test never worked as intended, always reporting
success. Secondly, bswap is available from 486 onward and can
thus be assumed present.
Signed-off-by: Mans Rullgard <mans@mansr.com>
With these changes, gcc 4.5 and later recognise it as a bswap
and use the proper instructions on ARM and x86. On x86, the
16-bit bswap is recognised from gcc 4.1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The existing functions defined in intfloat_readwrite.[ch] are
both slow and incorrect (infinities are not handled).
This introduces a new header with fast, inline conversion
functions using direct union punning assuming an IEEE-754
system, an assumption already made throughout the code.
The one use of Intel/Motorola extended 80-bit format is
replaced by simpler code sufficient under the present
constraints (positive normal values).
The old functions are marked deprecated and retained for
compatibility.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Solaris Studio (suncc) has difficulty with filling in
members of a union. Instead, let's retrieve and store the
cpuid() results separately. This is still a compiler bug,
however this fix does not cause a regression on other platforms.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
A value shifted left by >31 needs to have a 64-bit type.
As there are no defined channels in this range, the fix
is purely theoretical at this point, although it does
avoid some invalid shifts triggering the overflow
checker.
Signed-off-by: Mans Rullgard <mans@mansr.com>
It makes more sense for a bit mask to use an unsigned type.
The change should be source and binary compatible on all
supported systems, hence micro version bump.
Fixes a few invalid shifts.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Casting the left-most byte to unsigned avoids an undefined
result of the shift by 24 if bit 7 is set. This affects
the rm demuxer.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This is useful, since the normal timegm function isn't a standard
function (requiring _BSD_SOURCE or _SVID_SOURCE on glibc to
be visible, and not available on e.g. windows). The widely available
function mktime uses the local time zone, which requires ugly
workarounds to handle UTC time.
Signed-off-by: Martin Storsjö <martin@martin.st>
All current usages of it are incompatible with localization.
For example strcasecmp("i", "I") != 0 is possible, but would
break many of the places where it is used.
Instead use our own implementations that always treat the data
as ASCII.
Signed-off-by: Martin Storsjö <martin@martin.st>
We keep INIT_AVX (for backwards compatibility). 3arg AVX ops with
a memory arg can only have it in src2, whereas SSE emulation of
3arg prefers to have it in src1 (i.e. the mov). So, if the op is
symmetric and the wrong one is memory, swap them.
With the changes in 3b3ea34655,
"Remove all uses of deprecated AVOptions API", av_opt_flag_is_set
was broken, since it now uses av_opt_find, which doesn't return
named constants unless a unit to look for the constant in is given.
This broke enabling LATM encapsulated AAC output in RTP.
Signed-off-by: Martin Storsjö <martin@martin.st>
'struct AVClass' is used in the code since
641c7afe3c, but AVClass is typedeffed as
an anonymous struct.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This will allow the caller to enumerate child contexts in a generic way
and since the API is recursive, it also allows for deeper nesting (e.g.
AVFormatContext->AVIOContext->URLContext)
This will also allow the new setting/reading API to transparently apply
to children contexts.
These additions might overflow the signed range for large
input values. Converting to unsigned before the addition
rather than after avoids such undefined behaviour. The
result under normal two's complement wraparound remains
unchanged.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This fixes a signed overflow from i << 24 when i == 255 by
making i unsigned. The result of the shift is already
assigned to an variable of unsigned type.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This patch adds the possibility to calculate the DES-CBC-MAC of a
source buffer (i.e. the last block of the buffer encrypted in CBC
mode) without having to allocate a destination buffer that is as
long as the complete source buffer, but instead only 8 bytes
for the MAC.
Signed-off-by: David Goldwich <david.goldwich@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It's a hack which was created to allow for multiple options with
different defaults to refer to same field (e.g. 'b' vs 'ab'). There is
no need for it anymore.