6 Commits

Author SHA1 Message Date
Greg Tucker
87908c9060 mem: Move new mem_zero_detect function to avx2
New mem_zero_detect function will fail on avx only machines.

Change-Id: I3bca49bff886f9c130c89e8c74b31110e9bac76b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-09-30 17:47:57 -07:00
Nicola Torracca
0e65117138 mem_zero_detect_avx: OR multiple vector and test for non zero on the result
micro-optimizations: vpcmpeqb+vpmaskmov is faster than vptest according
to uops.info; make usually untaken branches target forward.
reduce numbers of data dependant branches and code size.

Change-Id: Ie70b4bc99685368e5131f23344348bfaf7c27d3e
Signed-off-by: Nicola Torracca <shark@bitchx.it>
2021-09-30 16:55:30 -07:00
H.J. Lu
cd888f01a4 x86: Add ENDBR32/ENDBR64 at function entries for Intel CET
To support Intel CET, all indirect branch targets must start with
ENDBR32/ENDBR64.  Here is a patch to define endbranch and add it to
function entries in x86 assembly codes which are indirect branch
targets as discovered by running testsuite on Intel CET machine and
visual inspection.

Verified with

$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8
$ make -j8 check

with both nasm and yasm on both CET and non-CET machines.

Change-Id: I9822578e7294fb5043a64ab7de5c41de81a7d337
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2020-05-26 09:16:49 -07:00
Greg Tucker
ede04f0a1f build: Fix for windows to allow nasm use
Previously windows build could only use yasm because some procedural items such
as proc_start were not supported by nasm.  This adds a few macros and fixes so
nasm can be used to build on windows.

Change-Id: Ia05dc3ff482f33b0f915bb1be3c7df5e4a753b3a
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-17 18:05:46 -07:00
Greg Tucker
2e212f28fa build: Fix for mac nasm lack of symbol types
Change-Id: I9ee86a3e32876d3860477c8365fc459d94a8920e
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2018-11-29 13:54:36 -07:00
John Kariuki
6e2013391a mem: Add zero detect memory functions
This patch introduces the base, avx and sse optimized zero detect memory function.
The zero detect memory function tests if a memory region is all zeroes. If all the
bytes in the memory region are zero, the function return a zero. Otherwise, if the
memory region has non zero bytes, the zero detect function returns a 1.

Change-Id: If965badf750377124d0067d09f888d0419554998
Signed-off-by: John Kariuki <John.K.Kariuki@intel.com>
2018-09-25 14:33:31 -07:00