Now that there is CPU detection in YASM, there will always be one of
inline or external assembly enabled, which obviates the need to fall
back on CPU detection through compiler intrinsics.
There are cases where strncpy() does exactly what is required.
A blanket ban forces more convoluted solutions to be used in those
cases and has been a cause of bugs.
Signed-off-by: Mans Rullgard <mans@mansr.com>
All our ARM asm preserves alignment so setting this attribute
in a common location is simpler. This removes numerous warnings
when linking with armcc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
LDR with register offset and PC as base register is not available in
the Thumb instruction set so the addition must be done separately.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When building Thumb2 code, the end of a function, where the PIC
offsets are placed, need not be aligned. Although the values
are only accessed with instructions allowing unaligned addresses,
keeping them aligned is preferable.
Signed-off-by: Mans Rullgard <mans@mansr.com>
GCC 4.3 and later do the right thing with the plain C code. Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value. At least
two gcc configurations miscompile the inline asm in some situations.
In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr. On Intel i7 and later, this imul is faster 32-bit mul. On
older Intel and all AMD, it is slightly slower. On Atom it is much
slower.
Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible. If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
There is no point in having the user disable any fastdiv macros.
Besides the condition implementation was broken and only disabled
the C implementation, but no platform specific assembly versions.
Fixed-point audio codecs often use saturating arithmetic, and
special instructions for these operations are common.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This adds a function to retrieve the number of entries in a
dictionary and updates the places directly accessing what should
be an opaque struct to use this new function instead.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The only compiler I have that does not define the standard
offsetof() macro is "Bruce's C Compiler", a simple compiler
for producing 8/16-bit 8086 code, usually for use in early
stages of PC booting.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This list is incomplete (we also use UINT16_MAX), so there does
not appear to be any system we care about that needs these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.
Two small tweaks to the macro are made:
- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
Some compilers do not support the Q/R modifiers used to access
the low/high parts of a 64-bit register pair. Check for this
and disable all uses of it when not supported.
Fixes bug #337.
Signed-off-by: Mans Rullgard <mans@mansr.com>
It appears that something goes wrong in old nasm versions when the
%+ operator is used in the last argument of a macro invocation and
this argument is tested with %ifdef within the macro. This patch
rearranges the macro arguments such that the %+ operator is never
used in the last argument.
nasm does not support 'CPU foonop' directives. This adds a configure
test for the directive and uses it only if supported.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
This allows us to unconditionally set the cglobal num_args
parameter to a bigger value, thus making writing yasm code
even easier than before.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Previously it was interpreted as number of bytes, while the
documentation stated that it was the number of 8 byte blocks.
This makes it behave similarly to the existing AES code.
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously it was interpreted as number of bytes, while the
documentation stated that it was the number of 8 byte blocks.
This makes it behave similarly to the existing AES code.
Signed-off-by: Martin Storsjö <martin@martin.st>
intfloat.h is a public header, and is now (since a1245d5ca) included
by mathematics.h, which many external callers include.
This fixes building third party applications that include
mathematics.h in a language that doesn't support designated
initalizers.
Signed-off-by: Martin Storsjö <martin@martin.st>
These files use NAN/INFINITY but didn't include mathematics.h to get
the fallback definitions if the system lacks the macros.
Signed-off-by: Martin Storsjö <martin@martin.st>
Some compilers, MSVC among them, don't recognize the divisions by
zero as meaning infinity/nan.
These macros should, according to the standard, expand to constant
expressions, but this shouldn't matter for our usage.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds macros for accessing the EFLAGS register and uses
these instead of coding the entire check in inline asm.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This creates proper position independent code when accessing
data symbols if CONFIG_PIC is set.
References to external symbols should now use the movrelx macro.
Some additional code changes are required since this macro may
need a register to hold the GOT pointer.
Signed-off-by: Mans Rullgard <mans@mansr.com>
unistd.h is used for open/read/close, but if this header does not
exist, there's probably no use in trying to open /dev/*random
at all.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds a fallback for cbrtf() using powf(x, 1/3). Since
powf() with a non-integer exponent requires a non-negative
base, special handling of negative inputs is needed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This is required for isatty, which exists on MSVC and is found by
configure, but is provided by io.h instead of unistd.h.
Signed-off-by: Martin Storsjö <martin@martin.st>
This adds whitespace around operators, aligns line continuation
backslashes, and breaks long lines. Also fixes an ifdef halfway
through a statement. The one line of duplication this saved is
not worth the ugliness.
Signed-off-by: Mans Rullgard <mans@mansr.com>
MSVC has isatty (in io.h), but not unistd.h. (isatty isn't called
at all for windows, since there's a special case block for that.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This function implements a delay using the first available
of the following functions:
- nanosleep()
- usleep()
- Sleep() (Windows)
The conditional #includes in time.c are simplified by including
unistd.h and windows.h whenever they are available rather than
having these lines triggered by specific functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The unistd.h header is only needed for the close() declaration.
If this header is not available, the close() declaration may be
provided by another header, e.g. io.h.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The only symbol this file uses from unistd.h is isatty(). By
including the header only when this function is used, the file
can be built on systems without unistd.h (which presumably also
lack isatty).
Signed-off-by: Mans Rullgard <mans@mansr.com>
While these defines are not defined by the C standard they are
standardized as X/Open System Interfaces Extension. We use the
appropiate _XOPEN_SOURCE define to make them available. They
seem to be available on all FATE configs since the constants
are used in files where mathematics.h is not included.
The check uses check_func_header, since this function is
conditionally available depending on the targeted MSVCRT
version.
Signed-off-by: Martin Storsjö <martin@martin.st>
Introduce a new function to set binary data through AVOption,
avoiding having to convert the binary data to a string inbetween.
Signed-off-by: Martin Storsjö <martin@martin.st>
Just like gcc 4.6 and later on ARM, gcc 4.8 on MIPS generates
inefficient code when a known-unaligned location is used as a
memory input operand. This applies the same fix as has been
previously done to the ARM version of the code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
GCC actually handles unaligned accesses correctly in all cases
except, absurdly, 32-bit loads on mips64. The remaining asm is
thus not needed, and removing it results in better code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
libavcodec/utils.c:274: warning: passing argument 3 of ‘av_samples_fill_arrays’ discards qualifiers from pointer target type
./libavutil/samplefmt.h:151: note: expected ‘uint8_t *’ but argument is of type ‘const uint8_t *’
Commit adebad0 "arm: intreadwrite: fix inline asm constraints for gcc
4.6 and later" caused some older gcc versions to miscompile code.
This reverts to the old version of the code for these compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Starting with version 4.7, gcc properly supports unaligned
memory accesses on ARM. Not using the inline asm with these
compilers results in better code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
With a dereferenced type-cast pointer as memory operand, gcc 4.6
and later will sometimes copy the data to a temporary location,
the address of which is used as the operand value, if it thinks
the target address might be misaligned. Using a pointer to a
packed struct type instead does the right thing.
The 16-bit case is special since the ldrh instruction addressing
modes are limited compared to ldr. The "Uq" constraint produces a
memory reference suitable for an ldrsb instruction, which supports
the same addressing modes as ldrh. However, the restrictions appear
to apply only when the operand addresses a single byte. The memory
reference must thus be split into two operands each targeting one
byte. Finally, the "Uq" constraint is only available in ARM mode.
The Thumb-2 ldrh instruction supports most addressing modes so the
normal "m" constraint can be used there.
Signed-off-by: Mans Rullgard <mans@mansr.com>