The caller of the function should not need to know exactly which
implementation of it is being used.
For the variants that don't support detecting the number of cores,
the pNumberOfLogicProcessors parameter can be left untouched
and the caller will use a higher level API for finding it out.
This simplifies all the calling code, and simplifies adding
more implementations of cpu feature detection.
The two different variants of the threadlib basically are
win32 and unix - use _WIN32 to check for this consistently,
instead of occasionally using __GNUC__ to enable the unix
codepath. (__GNUC__ is also defined on mingw, which still is
a windows platform and should use the _WIN32 code.)
When adding the (dwMilliseconds % 1000) * 1000000 part
to ts.tv_nsec, the ts.tv_nsec field can grow larger than one
whole second. Therefore first add all of dwMilliseconds to
the tv_nsec field and add all whole seconds to the tv_sec
field instead - this way we make sure that the tv_nsec field
actually is less than a second.
On processors without HTT, WelsCPUFeatureDetect can't return
a number of cores but might still return a nonzero set of
CPU feature flags. Previously the nonzero cpu feature flag
indicated that cpuid worked and the encoder wouldn't use the
higher level API for getting the number of cores, even though the
number of cores was left at 1.
TRUE/FALSE has intentionally been left in use for the few
platform specific APIs that define these constants themselves
and expect them to be used, for consistency.
Instead of byteswapping a 32 bit word and writing it out as a
whole (which could even possibly lead to crashes due to
incorrect alignment on some platforms), write it out explicitly
in the intended byte order.
This avoids having to set a define indicating the endianness.
The code interprets an array of 4 uint8_t values as one uint32_t
and does shifts on the value. The same optimization can be
kept in big endian as well, but the shift has to be done in the
other direction.
This code could be made truly independent of endianness, but
that could cause some minimal performance degradaion, at least
in theory.
This makes "make test" pass on big endian, assuming that
WORDS_BIGENDIAN is defined while building.
This is a more convenient behaviour (truncating on overflow and
always null terminating the buffer) compared to the MSVC
safe strcat_s which aborts the process if the string doesn't fit
into the target buffer.
Also mark the source buffer as const in the function prototype.
Make the MSVC "safe" version truncate instead of aborting the
process if the buffer is too small.
Update all the other functions to use the right parameter
(iSizeInBytes, not iCount) as 'n' parameter to strncpy.
(By passing iCount as parameter to the normal strncpy functions,
it meant that the resulting buffer actually never was null
terminated.)
Additionally make sure that the other implementations of WelsStrncpy
always null terminate the resulting buffer, just as the MSVC safe
version does when passed the _TRUNCATE parameter.
These were essentially useless - if strlen() ever was used as
fallback, it either indicated that those ports of the library
were insecure, or that strnlen never was required at all.
In this case it turned out to be the latter (at least after
the preceding cleanups) - all uses of it were with known null
terminated strings.
As long as WelsFileHandle* is equal to FILE* this doesn't matter,
but for consistency use the WelsF* functions for all handles
opened by WelsFopen, and use WelsFileHandle* as type for it
instead of FILE*.
Both encoder and decoder versions were functionally equivalent,
but I picked the decoder version (but added the static inline
keywords to it) since the encoder one was quite messy with a lot
of commented out code.
If the buffer is too small, there's no guarantee that it is
null terminated. The docs (on both unix and MSVC) say explicitly
that the function returns 0 and the buffer contents are
indeterminate in this case.
Otherwise builds on platforms other than MSVC might be
insecure.
Use vsnprintf_s with the _TRUNCATE flag instead of vsprintf_s
when using MSVC - this truncates the buffer instead of aborting
the whole process in case it's too small.
The decoder used WelsMedian while the encoder used WELS_MEDIAN.
The former has two different implementations, WELS_MEDIAN was
identical to the disabled version of WelsMedian.
Settle on using the same implementation for both decoder and
encoder - whichever version of the implementations is faster
should be used for both.
Also use the __APPLE__ predefined define instead of MACOS for enabling
these code paths.
This also avoids having to link to the CoreServices framework in
order to get the Gestalt function.
This gets rid of the code that parses /proc/cpuinfo, and avoids
forking within the library.
The previous code also failed build on modern glibc versions
due to ignoring the return value of the system, read and write
system calls.