560 Commits

Author SHA1 Message Date
Nicola Torracca
0e65117138 mem_zero_detect_avx: OR multiple vector and test for non zero on the result
micro-optimizations: vpcmpeqb+vpmaskmov is faster than vptest according
to uops.info; make usually untaken branches target forward.
reduce numbers of data dependant branches and code size.

Change-Id: Ie70b4bc99685368e5131f23344348bfaf7c27d3e
Signed-off-by: Nicola Torracca <shark@bitchx.it>
2021-09-30 16:55:30 -07:00
Greg Tucker
f980b36655 build: Change include shortcut D to not conflict with env
The variable D= can be used to quickly add defines. This sets a null
default so it can only be overridden by the make command line.

fixes #184

Change-Id: I84615174547f36208d6d577c1e30b6fac83139b3
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-09-14 19:18:31 -07:00
Taiju Yamada
998e03bf95 Strip -isysroot and related flags from asm-filter
This helps python-isal compatibility.

Change-Id: I8a2540e330f229f65903bdb2cc47aceeb0724dc5
Signed-off-by: Taiju Yamada <tyamada@bi.a.u-tokyo.ac.jp>
2021-09-13 10:02:38 -07:00
Greg Tucker
066940a9a7 build: Add ms rc file to put extra metatdata on dll
Change-Id: Idf687c6b2f8d1dea203f01bf57c5158d19ed519e
Signed-off-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-09-02 18:27:51 -07:00
Ruben Vorderman
908726e255 More prominently feature language bindings and igzip
Change-Id: Ief814eeb6d24f16d822e22327f40756ffba05869
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
2021-08-24 18:22:54 -07:00
Ruben Vorderman
94ec6026ce Create headers based on compression parameters.
Instead of using a constant as default zlib header, create the header on the fly. Both zlib
header bytes depend on the wbits and compression level used.
Make sure that ISA-L compression level 0 is advertised as the fastest compression in
both the gzip header (setting xfl flag to 0x04) and the zlib header (as 0, fastest, other levels are 1, fast).

Change-Id: I1f30e4397a0f5fcf6df593c40178e7d6f6c05328
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
2021-08-23 09:48:10 -07:00
Greg Tucker
1db0363c49 igzip: Add compress-decompress with dictionary to perf test
Change-Id: Ic396819537f5437e6aab3ebf5d023ed5cdbe852a
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-06-14 15:55:39 -07:00
Greg Tucker
112dd72c01 build: Remove unneeded file types.h
The file types.h has long been misnamed and overlaps with
functionality in the test helper routines.

Change-Id: I774047d3a0074198b67a6b4e909f1e2ce1938195
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-06-10 09:35:43 -07:00
Greg Tucker
cfdd3497d1 perf: Remove unneeded time include
Timing functions are made os-independent with test.h include.

Change-Id: Iab7d6325254d5c32263504efc756dbbe51d77153
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-06-09 18:33:57 -07:00
Greg Tucker
d5928e3760 build: Fix missing ms function export
Windows def file was missing an exported ec support function.
Also added path in nmake file to build extra examples.

Change-Id: I59ac1599dcb8cdb45077347c74b57aeca4751c35
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-06-07 18:30:08 -07:00
Greg Tucker
628f4e91ea ex: Add makefile to build examples from installed lib
Change-Id: I10a51dfe90e0672bb33348de241a5be91c9caa37
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-06-03 17:53:20 -07:00
Greg Tucker
0f7bf1c04d doc: Update minimum nasm recommendation and details
Change-Id: Icb113242c0ab7f3c75af3e65a8d519511f4ed4c3
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-05-27 17:44:31 -07:00
Greg Tucker
393f69fcac build: Change travis osx to use std brew
The osx brew and older linux targets are failing the update.
This removes the older linux builds and change the osx to
take the latest brew that comes with the image instead of
doing a brew update on every build.

Change-Id: Ib1543296a733875c9eff798326b0d45854153923
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-05-21 19:44:39 -07:00
Greg Tucker
240ca46ffb build: Change mingw to nasm by default
Change-Id: I80053b8cf62f5f2ef7c12661086e9aeaf2eea573
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-05-21 19:44:39 -07:00
Greg Tucker
d7bac36be4 crc: Fix warning in perf test from uninitialized tmp ptr
Both gcc and clang are showing a warning on this despite the buffer
always being set before use.

Change-Id: I0e8f6b9e3451efe69e49814abc883d49b04f2666
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-05-20 11:57:56 -07:00
Greg Tucker
fe4b7f9acc Add toplevel header gen in windows
Change-Id: I3a1e5fc495266d8ba223d75384625e22c3cf66fe
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-05-06 16:44:10 -07:00
Greg Tucker
2c705a26cb raid: Fix doc and base functions for min sources
The raid functions xor_gen, pq_gen and check functions
must have at least two sources. Fixes #175

Change-Id: I2e4509e037c2b1dc88f3f7449d80f4c763e1e124
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-04-26 16:23:58 -07:00
Greg Tucker
ebb78fc99e build: Fix warning from inconsistency in gnu make
Make changed the interpretation of escaped # in a quote causing
warnings in the test for pthreads.

Change-Id: Ice94116713aea3c3e9725b38232e03f53d6633cc
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2021-03-03 10:26:06 -07:00
luo rixin
bee5180a15 erasure_code: Fix text relocation on aarch64
Here is the bug report on ceph. https://tracker.ceph.com/issues/48681

Change-Id: Ie1c60a71f28c1a169c8899a621be9bb455f5e244
Signed-off-by: luo rixin <luorixin@huawei.com>
2021-01-08 15:23:15 -07:00
Jerry Yu
bc8b2aef55 Fix clang build fail
Author of this patch is Taiju Yamada <tyamada@bi.a.u-tokyo.ac.jp>
Re-organized by Jerry Yu <jerry.h.yu@arm.com>

Clang version must be later than 9.x according to https://reviews.llvm.org/D61719

Change-Id: I7516cca17ef4556b828fb6ecfa755e6451052359
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-12-09 14:37:55 +08:00
Greg Tucker
600d8d8f77 build: Update fuzz tests for deprecated clang args
Clang has deprecated the option -fsanitize-coverage=trace-pc-guard
for use with fuzzing.

Change-Id: I7fe5da0f57ab44110208d098858b786450a0a5e7
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-12-04 15:04:02 -07:00
Greg Tucker
2df39cf5f1 build: Bump revision to 2.30
Change-Id: If6d696ee76f3949d3cf5aff34403df65bce2c6b9
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
v2.30.0
2020-11-06 18:08:16 -07:00
Greg Tucker
05f6a0bb39 Update release notes for v2.30 additions
Change-Id: Icbb1faa2b67d8d18b1c7cde9f09774ebd895a6df
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-04 14:59:33 -07:00
Greg Tucker
ece814e912 doc: Add details on build and test
Change-Id: I58401ed26ba8a0a7fad0191b4c1bbb461d0311e6
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-04 12:40:08 -07:00
Greg Tucker
dca9dd221e igzip: Use unaligned load on static header to fix usan
Clang with sanitizer on was catching on cast of static header.
Switching to uload64 macro for better general solution.

Change-Id: I495d440407bb1773841e2f7cdc48bd95fc1a2df4
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-04 12:40:08 -07:00
Greg Tucker
269df8a67d igzip: Fix order of args check in new dictionary function
In the newly added function isal_deflate_process_dict(), a null check
was added to the dictionary struct but was ineffectual because of the
order.

Change-Id: I3b3e70997210794de102b1348e1467295871cee2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-03 08:50:30 -07:00
Greg Tucker
24a98e3e87 Fix missing files in extra dist
Change-Id: I83e62344fab72afd755453d4eb43e9c236ba2b86
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-28 17:43:53 -07:00
Greg Tucker
79143208ac test: Add testing for new dictionary functions
Change-Id: I0b0a151374acfe9b44c7a2be4bb959df59356d97
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-28 17:28:43 -07:00
Greg Tucker
19035917f4 igzip: Add new functions for faster dictionary compression
Change-Id: Id55728fea286d144f8a11192ab02ccc8503d7b25
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
438ecd8187 Update custom hufftable tool for saving histogram
Change-Id: I515217b19373b8f996ff887268862cf2b102f3a4
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
89f7c46cd5 Change igzip_file_perf to accept 0 time
Change-Id: Ie2edf8e742d0bcdd9a008704f997006f8f5009ac
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
9968e7a032 Change gen cust hufftables to accept dictionary
Change-Id: I4eed03bdb91030b16b3ecfd8076adc890e4f59a2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
63dffab948 igzip: Change pre-gen inflate table to multi-symbol
Change-Id: I4b0dac1e5aa2796be17644b893e3b6c7aed05876
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
d7927673ba igzip: Inflate detect pre-gen header and use pre-expanded
Performance improvement for inflate to skip the time-consuming process of decode
table expansion when the header matches a known common dymanic one such as
produced by level 0 compression.

Change-Id: Ia2550b812a062b7cc2eb1b72bcb609f1a631e40b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
cc9ed53972 build: Fix nmake check for multiple arch
Change-Id: I36c3616163f6fec61dda9cf8b35ca561e59477c9
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-27 11:16:30 -07:00
Greg Tucker
794b8b60c1 build: Add test to check for nmake consistency
Change-Id: I1180ba749d54e7ef433b01b33450e52ac5dbb2bb
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-26 11:41:03 -07:00
Greg Tucker
24623b8b82 crc: Fix missing object omitted from nmake file
Previous new crc version missed the update for nmake.

Change-Id: Ie529ee9d70d8d0ab8a8af3bd2720405802180d1e
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-26 09:49:23 -07:00
Greg Tucker
ec73d39086 crc: Add new vclmul version of crc32_iscsi
Change-Id: I1c509c6ea312b6eb4e1c2c1c8bb7044f7b043e0d
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-21 17:15:58 -07:00
Greg Tucker
ae45f60e78 igzip: Add cli feature to inflate concatenated gz files
Change-Id: I2beade6682e78fda30a18228a8660201ae7bf718
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-13 15:21:10 -07:00
Greg Tucker
93049d0d1f igzip: Fix read header for correct null checking and init
Issue with reading header only appears when combined with new feature in cli of
multiple concatenated gzip files.

Change-Id: Id8df9150c6f27d8b22e810b511291f3fcf136723
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-13 15:21:10 -07:00
Ruben Vorderman
2049d8dc81 Add conda shield to readme
This will make it easier for users to get the latest version. Installing with conda is easier than compiling it yourself. Distro packages (such as Debian's) do not always ship the latest version while conda-forge can. This badge will advertise this install method.

Change-Id: I99a1853a00e55fdf0c574c9906675738ac278121
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
2020-07-27 11:36:55 -07:00
Jerry Yu
1c71f9c0ae crc32: tweak performance of crc32/crc32c
Tweak performances with prefetch instructions.

Below is the test results:
- Neoverse N1: ~30%
- Cortex-A72: ~3%
- Cortex-A57: ~90%
- Others: 50% - 5x

Change-Id: I3ab292a953043dbaea98af3c66778f57da3a1331
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-07-09 17:37:00 +08:00
Jerry Yu
14e0081bef build: fix build break on non-x86 platform
Arm64 and ppc64 build reports below error:
"configure: error: conditional "INTEL_CET_ENABLED" was never defined."
And the error should be report in all non-x86 platform.

Change-Id: I4c1b2fc99091424cfd5c62cf4d6536222b66712d
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-06-03 03:25:03 +00:00
H.J. Lu
8074e3fe1b x86: Generate .note.gnu.property section for ELF output
We should generate .note.gnu.property section with x86 assembly codes
for ELF outputs to mark Intel CET support when Intel CET is enabled
since all input files must be marked with Intel CET support in order
for linker to mark output with Intel CET support.  Since nasm and yasm
can't generate the proper .note.gnu.property section, yasm-cet-filter.sh
and yasm-filter.sh are added to generate the proper .note.gnu.property
with linker help.

Verified with

$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8

on Linux/x86-64.

Change-Id: I14e03a8a9031c8397dc36939a528cf5a827d775a
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2020-05-26 17:12:01 -07:00
H.J. Lu
cd888f01a4 x86: Add ENDBR32/ENDBR64 at function entries for Intel CET
To support Intel CET, all indirect branch targets must start with
ENDBR32/ENDBR64.  Here is a patch to define endbranch and add it to
function entries in x86 assembly codes which are indirect branch
targets as discovered by running testsuite on Intel CET machine and
visual inspection.

Verified with

$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8
$ make -j8 check

with both nasm and yasm on both CET and non-CET machines.

Change-Id: I9822578e7294fb5043a64ab7de5c41de81a7d337
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2020-05-26 09:16:49 -07:00
Zhiyuan Zhu
031450f697 crc32: Implement default mix mode optimization
Change-Id: Ib3bf04215cca491db522ec33905fe48df173cc2f
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2020-05-09 08:10:34 +00:00
Jerry Yu
6c4d3dbf6c crc32:NeoverseN1: Change CRC32/PMULL order to PMULL first
To reduce the cache missing events, the mix layout is changed
to PMULL+CRC. It also relaxes the final delay caused by data
dependency.
As results, the cold perf was improved about 20% and warm perf
was improved about 4%.

Change-Id: I7756f846edcb4f1665b4643a5a0e02283938cfdf
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-16 20:38:41 +08:00
Jerry Yu
92fc8733fa crc32: Fix prototype mismatch bug
Change-Id: I7c8a2348441f32a43ff386122612405e418d9947
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-10 00:46:41 +00:00
Jerry Yu
9bcd6768fd crc32:Adjust hardware folding algorithm flags
Hardware folding algorithm depend on CRC32 and PMULL instruction.
And it should match both flags .

Change-Id: I361068402db1fe6d7c0bd8d2c7048f1d94880233
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-08 13:50:15 +08:00
Jerry Yu
0033f42189 crc32:Optimize crc32/c for cortex-a72
Change-Id: Ib1658fd4b87b31d8ea6c93f697b50d9b409c186e
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-08 13:49:38 +08:00