Previously this used unofficial, apple specific syntax (with fallback
macros for gnu binutils), since Xcode 5.x didn't support the official
syntax of these instructions. Since Xcode 6 has been out for quite a
number of months already, it should be safe to require this (for
building 64 bit binaries for iOS, armv7 builds can still be built
with older Xcode versions).
This clarifies the code by avoiding apple specific syntax in the
assembler instructions.
They are still used slightly differently in the encoder and decoder;
the decoder uses plain functions while the encoder uses one object
keeping track of the number of allocated bytes, and keeping track
of the requested alignment.
When generating a new version of the header, that includes the
actual git hash, don't overwrite the file that is tracked by git.
Instead create a new file, and include this only if the build system
indicates that it exists (by setting a define). This allows the
untouched source tree to be built from within an IDE even if make
has not been run.
This reduces the hassle with a file that needs to be ignored in the
git configuration.
The downside is that the generated file isn't used if building
from within an IDE, if the header has been updated by calling make
before (since the IDE configuration doesn't know whether the user
actually has run make). Since users of the IDE might not build via
make in the command line at all (in the same source checkout at least),
this should not be an issue in practice. The previous way things worked,
the version hash (generated by make) when used in an IDE could actually
be outdated and misleading.
Use the decoder versions of the functions (which are capable
of handling widths 4/8/16 for luma, not only 16 as in the
encoder). By using the more generic versions, there may be a small
performance loss since the functions need to check the width
in every call. Actual measurements show that the actual change is
very small (and the shared routines turn out to actually be faster
than the existing ones in ARM NEON setups).
When using a macro, the macro parameters get evaluated
multiple times, which means that the rand() value compared
actually isn't the same that is used as return value.
This makes sure that clipping works as intended for the
random tests.
1, ref_list_mgr_svc.cpp, Ln314: fix a wrong DeleteLTRFromLongList which should not be used under screen strategy
2, ref_list_mgr_svc.cpp, Ln910: remove the frame which is furthest in distance to the current frame in ref list
3, wels_preprocess.cpp: use scene LTR ref list when the current frame is scene LTR
-, ref_list_mgr_svc.cpp, Ln811: add DEBUG trace
reviewed at: https://rbcommons.com/s/OpenH264/r/870/
The .align directive takes an argument in number of bits, i.e. the
actual alignment is 2^n. Previously building with binutils failed,
since 16 isn't a valid parameter to .align, the maximum is 15.
Thus, this makes the code try to align to 16 bytes, instead of aligning
to 65536 bytes.
This fixes building for android.
This also clears up the same mistake in the aarch64 code, even though
that one built just fine.
This file is built on its own from within the xcode projects,
even though it isn't necessary. Previously its contents was just
empty, but now a .syntax unified was added, which failed the build
when building for arm64.
Make this file a no-op, just like the other arm assembly source files,
unless HAVE_NEON is defined.
This is the default when building with the clang built-in assembler,
but not if using the external assembler - thus always specify it,
for clarity.
Also use the three-operand for of a sub instruction in BS_NZC_CHECK.
The same is already done in the gnu version of the macro.
This fixes building most of the arm assembly with Apple's external
assembler. While this isn't a necessary goal in itself, there's no
harm in doing this either.
Interpreting data of one type via a pointer of a different type is an
aliasing violating. This means that a compiler optimizer's analyzer
can assume that data loaded into an array as uint32_t isn't related
to data read out from the same array as uint64_t, and e.g. reorder
loads/stores.
Since these structs are intentionally used to load data via pointers
of a wrong size, tell the compiler that these accesses may alias
other reads.
This fixes the GetIntraPredictorTest tests of WelsI4x4LumaPredV_c
and WelsI4x4LumaPredH_c. (The compiler optimizer did the wrong thing
as long as WelsFillingPred8to16_c or WelsFillingPred8x2to16_c were
inlined into the calling function.)