54 Commits

Author SHA1 Message Date
Martin Storsjö
3cf52554f7 Backup/restore the xmm6-xmm15 SSE registers within asm functions on win64
According to the Win64 ABI, these registers need to be preserved,
and compilers are allowed to rely on their content to stay
available - not only for float usage but for any usage, anywhere,
in the calling C++ code.

This adds a macro which pushes the clobbered registers onto the
stack if targeting win64 (and a matching one which restores them).
The parameter to the macro is the number of xmm registers used
(e.g. if using xmm0 - xmm7, the parameter is 8), or in other
words, the number of the highest xmm register used plus one.

This is similar to how the same issue is handled for the NEON
registers q4-q7 with the vpush instruction, except that they needed
to be preserved on all platforms, not only on one particular platform.

This allows removing the XMMREG_PROTECT_* hacks, which can
easily fail if the compiler chooses to use the callee saved
xmm registers in an unexpected spot.
2014-03-17 13:44:33 +02:00
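
The register-count convention in commit 3cf52554f7 can be modelled with a little arithmetic. The sketch below is not the assembly macro itself. It only illustrates, under the assumption that the parameter is "highest xmm register used plus one", how many of the callee-saved registers xmm6-xmm15 would need to be saved. The function and constant names are hypothetical.

```cpp
#include <algorithm>

// Hypothetical model, not part of the code base: on Win64, xmm6..xmm15 are
// callee-saved, so a function using xmm0..xmm(N-1) clobbers max(0, N - 6) of them.
const int kFirstCalleeSavedXmm = 6;   // xmm6 is the first callee-saved register
const int kXmmRegisterCount    = 16;  // xmm0..xmm15

// usedCount: number of the highest xmm register used, plus one.
inline int CalleeSavedXmmToSave(int usedCount) {
    int clamped = std::min(std::max(usedCount, 0), kXmmRegisterCount);
    return std::max(clamped - kFirstCalleeSavedXmm, 0);
}
```

For example, a function using xmm0 - xmm7 passes 8, and under this model only xmm6 and xmm7 need to be backed up.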
Martin Storsjö
f96918283f Remove commented out code for old, 32-bit only x86 assembly function prologues/epilogues 2014-03-17 11:20:11 +02:00
Licai Guo
fc4e0cacec Merge pull request #483 from volvet/develop_b
use large/medium/similar to define scene change result
2014-03-17 16:32:31 +08:00
Martin Storsjö
d45e624cd4 Simplify code by using the arg11 and arg12 defines 2014-03-15 14:42:27 +02:00
Martin Storsjö
9199798f22 Fix a typo in a macro name, EXTENTION -> EXTENSION 2014-03-14 10:13:18 +02:00
Martin Storsjö
2bce50283f Fix a mismatched ifdef comment
This is an ifdef block for HAVE_NEON.
2014-03-14 10:01:18 +02:00
volvet
c8761c08ae use large/medium/similar to define scene change result 2014-03-13 10:43:20 +08:00
Martin Storsjö
c011890764 Push clobbered neon registers on the stack
According to the calling convention, the registers q4-q7 should be
preserved by functions. The caller (generated by the compiler) could
be using those registers anywhere for any intermediate data.

Functions that use more than 12 of the qX registers must push
the clobbered registers on the stack in order to be able to restore them
afterwards.

In functions that don't use all 16 registers, but clobber some of
the callee saved registers q4-q7, one or more of them are remapped
to reduce the number of registers that have to be saved/restored.

This incurs a very small (around 0.5%) slowdown in the decoder and
encoder.
2014-03-10 22:07:36 +02:00
Martin Storsjö
811c647c0e Remap registers to avoid clobbering the neon registers q4-q7
According to the calling convention, the registers q4-q7 should be
preserved by functions. The caller (generated by the compiler) could
be using those registers anywhere for any intermediate data.

Functions that use 12 or fewer of the qX registers can avoid
violating the calling convention by simply using other registers instead
of the callee saved registers q4-q7.

This change only remaps the registers used within functions - therefore
this does not affect performance at all. E.g. in functions using
registers q0-q7, we now use q0-q3 and q8-q11 instead.
2014-03-10 22:07:25 +02:00
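
The remapping in commit 811c647c0e can be pictured as a simple index mapping. This is a sketch only: the commit message gives the resulting register set (q0-q3 and q8-q11 for a function that used q0-q7), while the exact register-by-register assignment below is an assumption, and the function name is hypothetical.

```cpp
// Hypothetical model: map the n-th register a function needs (0..11) to a
// NEON q register number while skipping the callee-saved range q4-q7.
// Valid for logicalIndex in [0, 11]; 12 usable registers remain (q0-q3, q8-q15).
inline int RemapQRegister(int logicalIndex) {
    return logicalIndex < 4 ? logicalIndex       // q0-q3 stay as they are
                            : logicalIndex + 4;  // 4..11 become q8..q15
}
```

With this mapping, a function that needs eight registers ends up on q0-q3 and q8-q11, matching the example in the commit message.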
Ethan Hugg
3627875986 Merge pull request #456 from mstorsjo/use-common-threadlib
Make the processing lib use mutexes from WelsThreadLib from the common library
2014-03-10 09:45:51 -07:00
ruil2
e99df3377d Merge pull request #460 from mstorsjo/add-const
Mark pointers as const where possible in vaacalc
2014-03-10 15:35:57 +08:00
Martin Storsjö
e31ba7a775 Fix a typo, "heigth" -> "height" 2014-03-09 19:19:37 +02:00
Martin Storsjö
2a23a508e1 Mark pointers as const where possible in vaacalc 2014-03-09 18:52:17 +02:00
Martin Storsjö
c5390521ec Make the processing lib use mutexes from WelsThreadLib from the common library
This requires always building the WelsMutex* functions,
even if MT_ENABLED isn't set.
2014-03-08 12:46:25 +02:00
Martin Storsjö
c87bb2b449 Remove unused/undeclared arm assembly macro parameters
The SAD_VAR_16_END macro only takes 3 parameters, never 4,
and SAD_SSD_16_END is never called with more than 3 parameters
either.
2014-03-07 10:26:54 +02:00
Martin Storsjö
c0043f7053 Use the three-operand form of add/sub with shift
When using unified syntax, the two operand form with a shift
isn't allowed.
2014-03-06 16:21:54 +02:00
Martin Storsjö
8ba79262bf Rename a function to avoid conflicts between almost duplicate neon functions
There's a different version of the same function in the encoder,
but they're not identical - the encoder version has got stricter
alignment requirements.

If someone can confirm that it is ok to use the function from the
encoder, pixel_sad_neon.S in processing could be deleted, and the
encoder version moved to codec/common instead.
2014-03-06 16:19:48 +02:00
Licai Guo
e7cc8c2780 Add arm asm code for processing. 2014-03-05 16:54:05 +08:00
Licai Guo
efcee63692 Remove .DS_Store file. 2014-03-05 10:24:05 +08:00
Licai Guo
bb244d736b Partly add arm asm code to encoder. 2014-03-05 10:24:05 +08:00
Martin Storsjö
9cf34e7615 Unify the interface for the different variants of WelsCPUFeatureDetect
The caller of the function should not need to know exactly which
implementation of it is being used.

For the variants that don't support detecting the number of cores,
the pNumberOfLogicProcessors parameter can be left untouched
and the caller will use a higher level API for finding it out.

This simplifies all the calling code, and simplifies adding
more implementations of cpu feature detection.
2014-03-04 10:18:30 +02:00
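
The calling pattern described in commit 9cf34e7615 might look roughly like the sketch below. All function names here are hypothetical stand-ins, and the real signature of WelsCPUFeatureDetect is not shown in this log. The sketch assumes the caller initializes the core count to a sentinel and falls back to a higher-level API (here std::thread::hardware_concurrency, chosen only for illustration) when the detector leaves it untouched.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical detector variant that cannot report the number of cores:
// it returns a feature bitmask and leaves the output parameter untouched.
inline uint32_t ExampleCpuFeatureDetect(int32_t* pNumberOfLogicProcessors) {
    (void)pNumberOfLogicProcessors;  // intentionally not written
    return 0;                        // no features detected in this stub
}

// Caller side: identical regardless of which detector variant is linked in.
inline int32_t ExampleResolveCoreCount() {
    int32_t cores = 0;  // sentinel meaning "not reported"
    ExampleCpuFeatureDetect(&cores);
    if (cores <= 0) {
        // Assumed fallback; hardware_concurrency() may itself return 0.
        unsigned hw = std::thread::hardware_concurrency();
        cores = hw > 0 ? static_cast<int32_t>(hw) : 1;
    }
    return cores;
}
```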
Martin Storsjö
19efc59fae Remove the WELSAPI definition
There's no need to specify a custom calling convention for
these functions.
2014-02-21 09:32:46 +02:00
Martin Storsjö
099595696b Add the common processing include directories to the include path
This avoids using relative paths for including these files.
2014-02-19 14:42:03 +02:00
Martin Storsjö
3b297ec866 Remove completely unused variables and private fields 2014-02-18 13:04:13 +02:00
Martin Storsjö
80862eec77 Use the C++ constants true/false instead of defining our own
TRUE/FALSE has intentionally been left in use for the few
platform specific APIs that define these constants themselves
and expect them to be used, for consistency.
2014-02-10 08:06:37 +02:00
Martin Storsjö
f2bd22acd5 Use char instead of str_t 2014-02-10 08:06:37 +02:00
Martin Storsjö
2b77fe7f49 Use bool instead of bool_t
bool is one of the built-in standard types in C++, so there's no need
for a typedef for it.
2014-02-10 08:05:09 +02:00
Martin Storsjö
d36b10fac5 Remove typedefs for float_t and double_t
The actual float and double data types are defined in C89 and are
usable as such without any extra typedefs.

Removing the extra typedefs simplifies the compatibility typedef
headers, simplifies portability and makes the code base easier
to work with for people new to the library.
2014-02-08 14:11:44 +02:00
Martin Storsjö
f252acf8a5 Remove fallback defines for NULL
No actual (supported) compiler lacks a definition for NULL, and it
is mandated to be present in stddef.h according to the C89 standard.
2014-02-06 10:38:15 +02:00
orbitcowboy
7d29cecc0e cleanup variable initializations. 2014-01-31 23:13:21 +01:00
Martin Storsjö
b2178aacc0 Fix a typo in a directory name 2014-01-29 10:29:53 +02:00
Martin Storsjö
04dba61d22 Remove an unused assembly source file
Nothing within processing uses functions from this file.
2014-01-28 13:55:41 +02:00
Martin Storsjö
3761901ed4 Remove sad.asm from the processing lib, move satd_sad from the encoder to the common lib
sad.asm as used in processing is an exact subset of the
code in satd_sad.asm in the encoder.
2014-01-28 13:54:57 +02:00
Martin Storsjö
b468ed3c0b Unify the declaration of int8_t within the processing lib with the rest
int8_t in general should be defined as signed char, since there
are actual environments where a plain 'char' is unsigned.

This also reduces the differences between the typedef headers of
the different sub-libraries.
2014-01-27 08:44:48 +02:00
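
The signedness point in commit b468ed3c0b can be checked at compile time. The typedef name below is hypothetical and only stands in for a project-local int8_t definition. The sketch shows that signed char is always signed, whereas the signedness of plain char is implementation-defined.

```cpp
#include <limits>

// Hypothetical stand-in for a project-local int8_t typedef.
typedef signed char example_int8_t;

// signed char is signed on every conforming implementation.
static_assert(std::numeric_limits<example_int8_t>::is_signed,
              "signed char must be signed");

// Plain char may be signed or unsigned depending on the platform and
// compiler, so this value differs between environments.
const bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;
```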
Martin Storsjö
a24fd5e120 Make the bool_t typedef in the processing lib match the others
There is no specific reason to use int32_t for this type.
2014-01-27 08:44:48 +02:00
Martin Storsjö
09f0254d5e Remove the now unused long_t typedef 2014-01-27 08:44:48 +02:00
Martin Storsjö
cc4507462b Use int32_t instead of long_t in WELS_SIGN/WELS_ABS in the processing lib
This makes them match the same macros in the main decoder/encoder
libraries. long_t (which is typedeffed to long) actually is 64
bit on 64 bit unix platforms, which might not be what was
intended.
2014-01-27 08:44:48 +02:00
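
To see why the integer width matters for commit cc4507462b, consider a sign/abs helper built on a fixed shift count. The bodies of WELS_SIGN and WELS_ABS are not shown in this log, so the functions below are an assumed, illustrative formulation with hypothetical names. The shift by 31 only yields a sign mask when the operand is exactly 32 bits wide, which int32_t guarantees and long does not.

```cpp
#include <cstdint>

// Illustrative only. Relies on arithmetic right shift of negative values,
// which is implementation-defined before C++20 but is what mainstream
// compilers do.
inline int32_t ExampleSign(int32_t a) {
    return a >> 31;  // 0 for non-negative values, -1 for negative values
}

// Branch-free absolute value; undefined for INT32_MIN (overflow).
inline int32_t ExampleAbs(int32_t a) {
    int32_t s = ExampleSign(a);
    return (s ^ a) - s;
}

// On 64-bit unix platforms (LP64) long is 8 bytes; on 64-bit Windows it is 4.
const bool kLongIs64Bit = sizeof(long) == 8;
```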
Martin Storsjö
0339cd51c5 Unify the definition of WELS_THREAD_ERROR_CODE between libraries
This simplifies the code base, reduces the risk for mixups and
gets rid of the use of a local nonstandard typedef.
2014-01-27 08:44:48 +02:00
Martin Storsjö
98dd2d91d9 Use 'inline' instead of 'inline_t' in the processing lib
Just use the standard inline keyword with sufficient backwards
compatibility defines, similar to how it is done in the main decoder
and encoder libraries.
2014-01-27 08:44:48 +02:00
Varun B Patil
7c6445418b Removed unused header files in processing src 2014-01-25 15:39:05 +05:30
Martin Storsjö
c61b040c11 Remove an MSVC resource editor state file
This file contains the local UI state of the resource editor,
and should not be committed to version control.
2014-01-23 22:55:36 +02:00
Martin Storsjö
eaf95566ea Remove an unused function wrapping a standard function
This allows removing a whole file.
2014-01-23 22:55:36 +02:00
Martin Storsjö
aec2ed30cd Simplify an ifdef
We don't need to check both platform and compiler at the same time,
checking the compiler is enough here.
2014-01-23 22:55:36 +02:00
Martin Storsjö
8583e13e34 Clear the executable bit on source files 2014-01-23 09:30:50 +02:00
volvet
5c9f447c0e fix win64 float issue, enable AQ assembly 2014-01-21 11:16:48 +08:00
V
a6463be0cc Allow yasm to be used instead of nasm.
http://www.nasm.us/doc/nasmdoc3.html#section-3.4.1 says a zero should
follow $ in numeric constants, and yasm complains when the zero is
missing.
2014-01-18 13:59:24 +01:00
Varun B Patil
98ff18d15d fix typo 2014-01-17 00:50:03 +05:30
Martin Storsjö
65b339815e Get rid of trailing whitespace in the assembly source files 2014-01-13 20:12:04 +02:00
Martin Storsjö
41a251630d Use intptr_t instead of long for casting pointers to integers
This fixes building on mingw-w64.

Include stdint.h on everything except MSVC for definitions of
common standard types. On MSVC, include stddef.h instead: not all
of the older MSVC versions that are supposed to be supported have
stdint.h, but MSVC always defines intptr_t via stddef.h.
2014-01-10 14:52:09 +02:00
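
A typical place where the pointer-to-integer cast from commit 41a251630d shows up is an alignment check. The helper below is a hypothetical example, not code from the repository. It uses intptr_t because long is only 32 bits wide on 64-bit Windows, so it cannot hold a 64-bit pointer there.

```cpp
#include <cstdint>

// Hypothetical helper: true if p is aligned to a 16-byte boundary.
// intptr_t is wide enough to hold a pointer wherever it is defined,
// unlike long on 64-bit Windows (LLP64).
inline bool ExampleIsAligned16(const void* p) {
    return (reinterpret_cast<intptr_t>(p) & 15) == 0;
}
```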
Martin Storsjö
05bf57a2af Use InitializeCriticalSectionEx for Windows Store or Windows Phone apps
The old InitializeCriticalSection function isn't available in
these API partitions, and the new InitializeCriticalSectionEx
function is only available since Vista, so we want to keep using
the old function for normal desktop code.
2014-01-08 09:20:39 +02:00