Micro-optimizations: vpcmpeqb+vpmaskmov is faster than vptest according
to uops.info; make usually untaken branches target forward.
Reduce the number of data-dependent branches and the code size.
Change-Id: Ie70b4bc99685368e5131f23344348bfaf7c27d3e
Signed-off-by: Nicola Torracca <shark@bitchx.it>
The variable D= can be used to quickly add defines. This sets a null
default so it can only be overridden by the make command line.
Fixes #184
Change-Id: I84615174547f36208d6d577c1e30b6fac83139b3
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Instead of using a constant as the default zlib header, create the header on the fly. Both zlib
header bytes depend on the wbits and compression level used.
Make sure that ISA-L compression level 0 is advertised as the fastest compression in
both the gzip header (setting the XFL field to 0x04) and the zlib header (FLEVEL 0, fastest; other levels use 1, fast).
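
As an illustration of how both header bytes follow from wbits and the level hint, here is a minimal sketch based on the zlib header layout in RFC 1950. The function names `zlib_header` and `flevel_for_isal_level` are hypothetical and are not the names used in this patch; the level mapping simply restates the rule above (level 0 is fastest, everything else is fast).

```python
def flevel_for_isal_level(level: int) -> int:
    """Map an ISA-L compression level to the 2-bit FLEVEL hint.

    Hypothetical helper: level 0 is advertised as 0 (fastest),
    every other level as 1 (fast).
    """
    return 0 if level == 0 else 1


def zlib_header(wbits: int = 15, flevel: int = 0) -> bytes:
    """Build the two-byte zlib header (CMF, FLG) as laid out in RFC 1950."""
    if not 8 <= wbits <= 15:
        raise ValueError("wbits must be in 8..15")
    if not 0 <= flevel <= 3:
        raise ValueError("flevel must be in 0..3")
    cmf = ((wbits - 8) << 4) | 8  # CINFO in the high nibble, CM = 8 (deflate)
    flg = flevel << 6             # FLEVEL in bits 6-7, FDICT (bit 5) left clear
    # FCHECK (bits 0-4) makes CMF * 256 + FLG a multiple of 31.
    flg |= (31 - ((cmf << 8) | flg) % 31) % 31
    return bytes([cmf, flg])


if __name__ == "__main__":
    for level in (0, 1, 3):
        header = zlib_header(15, flevel_for_isal_level(level))
        print(level, header.hex())
```

With wbits 15 this yields the familiar 78 01 header for level 0 and 78 5e for the other levels.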
Change-Id: I1f30e4397a0f5fcf6df593c40178e7d6f6c05328
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
The file types.h has long been misnamed and overlaps with
functionality in the test helper routines.
Change-Id: I774047d3a0074198b67a6b4e909f1e2ce1938195
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Timing functions are made OS-independent by including test.h.
Change-Id: Iab7d6325254d5c32263504efc756dbbe51d77153
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Windows def file was missing an exported ec support function.
Also added path in nmake file to build extra examples.
Change-Id: I59ac1599dcb8cdb45077347c74b57aeca4751c35
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
The osx brew and older linux targets are failing the update.
This removes the older linux builds and changes the osx build to
take the latest brew that comes with the image instead of
doing a brew update on every build.
Change-Id: Ib1543296a733875c9eff798326b0d45854153923
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Both gcc and clang are showing a warning on this despite the buffer
always being set before use.
Change-Id: I0e8f6b9e3451efe69e49814abc883d49b04f2666
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
The raid functions xor_gen, pq_gen and the check functions
must have at least two sources. Fixes #175
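
To make the constraint concrete, here is a minimal sketch of XOR parity generation that rejects fewer than two sources. It is an illustration only: `xor_gen_sketch` is a hypothetical name, and its signature does not mirror the library's xor_gen interface.

```python
from typing import Sequence


def xor_gen_sketch(sources: Sequence[bytes]) -> bytes:
    """Return the byte-wise XOR parity of equally sized source buffers."""
    # Parity over a single buffer is just a copy, so require at least two.
    if len(sources) < 2:
        raise ValueError("at least two source buffers are required")
    length = len(sources[0])
    if any(len(src) != length for src in sources):
        raise ValueError("all source buffers must have the same length")
    parity = bytearray(length)
    for src in sources:
        for i, byte in enumerate(src):
            parity[i] ^= byte
    return bytes(parity)


if __name__ == "__main__":
    print(xor_gen_sketch([b"\x0f\xf0", b"\xff\x00"]).hex())
```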
Change-Id: I2e4509e037c2b1dc88f3f7449d80f4c763e1e124
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Make changed the interpretation of an escaped # in a quote, causing
warnings in the test for pthreads.
Change-Id: Ice94116713aea3c3e9725b38232e03f53d6633cc
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Here is the bug report on ceph: https://tracker.ceph.com/issues/48681
Change-Id: Ie1c60a71f28c1a169c8899a621be9bb455f5e244
Signed-off-by: luo rixin <luorixin@huawei.com>
Author of this patch is Taiju Yamada <tyamada@bi.a.u-tokyo.ac.jp>
Re-organized by Jerry Yu <jerry.h.yu@arm.com>
Clang version must be later than 9.x according to https://reviews.llvm.org/D61719
Change-Id: I7516cca17ef4556b828fb6ecfa755e6451052359
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
Clang has deprecated the option -fsanitize-coverage=trace-pc-guard
for use with fuzzing.
Change-Id: I7fe5da0f57ab44110208d098858b786450a0a5e7
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Clang with the sanitizer enabled was flagging the cast of the static header.
Switching to the uload64 macro for a better general solution.
Change-Id: I495d440407bb1773841e2f7cdc48bd95fc1a2df4
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
In the newly added function isal_deflate_process_dict(), a null check
was added to the dictionary struct but was ineffectual because of the
order.
Change-Id: I3b3e70997210794de102b1348e1467295871cee2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Performance improvement for inflate to skip the time-consuming process of decode
table expansion when the header matches a known common dynamic one such as the one
produced by level 0 compression.
Change-Id: Ia2550b812a062b7cc2eb1b72bcb609f1a631e40b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Previous new crc version missed the update for nmake.
Change-Id: Ie529ee9d70d8d0ab8a8af3bd2720405802180d1e
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
The issue with reading the header only appears when combined with the new cli
feature of multiple concatenated gzip files.
Change-Id: Id8df9150c6f27d8b22e810b511291f3fcf136723
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
This will make it easier for users to get the latest version. Installing with conda is easier than compiling it yourself. Distro packages (such as Debian's) do not always ship the latest version while conda-forge can. This badge will advertise this install method.
Change-Id: I99a1853a00e55fdf0c574c9906675738ac278121
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
Arm64 and ppc64 builds report the error below:
"configure: error: conditional "INTEL_CET_ENABLED" was never defined."
The error would be reported on all non-x86 platforms.
Change-Id: I4c1b2fc99091424cfd5c62cf4d6536222b66712d
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
We should generate a .note.gnu.property section with the x86 assembly code
for ELF outputs to mark Intel CET support when Intel CET is enabled,
since all input files must be marked with Intel CET support in order
for the linker to mark the output with Intel CET support. Since nasm and yasm
can't generate the proper .note.gnu.property section, yasm-cet-filter.sh
and yasm-filter.sh are added to generate the proper .note.gnu.property
section with the linker's help.
Verified with
$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8
on Linux/x86-64.
Change-Id: I14e03a8a9031c8397dc36939a528cf5a827d775a
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
To support Intel CET, all indirect branch targets must start with
ENDBR32/ENDBR64. Here is a patch to define endbranch and add it to
the function entries in the x86 assembly code which are indirect branch
targets, as discovered by running the testsuite on an Intel CET machine and
by visual inspection.
Verified with
$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8
$ make -j8 check
with both nasm and yasm on both CET and non-CET machines.
Change-Id: I9822578e7294fb5043a64ab7de5c41de81a7d337
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
To reduce cache misses, the mix layout is changed
to PMULL+CRC. This also relaxes the final delay caused by data
dependency.
As a result, cold performance improved by about 20% and warm
performance improved by about 4%.
Change-Id: I7756f846edcb4f1665b4643a5a0e02283938cfdf
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
The hardware folding algorithm depends on both the CRC32 and PMULL instructions,
so the dispatcher should match both flags.
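
The both-flags requirement can be sketched as a simple bitmask test. This is an illustration, not the dispatcher code from this patch: `can_use_hw_folding` is a hypothetical name, and the bit positions are assumed to be the Linux AArch64 hwcap values for PMULL and CRC32.

```python
# Assumed Linux AArch64 hwcap bits (as exposed through AT_HWCAP).
HWCAP_PMULL = 1 << 4
HWCAP_CRC32 = 1 << 7


def can_use_hw_folding(hwcap: int) -> bool:
    """Return True only when both PMULL and CRC32 are advertised."""
    required = HWCAP_PMULL | HWCAP_CRC32
    # Testing with `hwcap & required != 0` would wrongly accept a CPU
    # that has only one of the two features.
    return (hwcap & required) == required


if __name__ == "__main__":
    print(can_use_hw_folding(HWCAP_PMULL | HWCAP_CRC32))
    print(can_use_hw_folding(HWCAP_CRC32))
```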
Change-Id: I361068402db1fe6d7c0bd8d2c7048f1d94880233
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>