536 Commits

Author SHA1 Message Date
Greg Tucker
dca9dd221e igzip: Use unaligned load on static header to fix usan
Clang with the sanitizer enabled flagged the cast of the static header.
Switch to the uload64 macro as a more general solution.

Change-Id: I495d440407bb1773841e2f7cdc48bd95fc1a2df4
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-04 12:40:08 -07:00
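The unaligned-load idea in dca9dd221e can be sketched as follows. This is only an illustration of the general technique (copying bytes with memcpy instead of casting a byte pointer to a wider type); the helper name load_u64_sketch is hypothetical and is not the library's uload64 macro, whose definition is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes from an address that may not be
 * 8-byte aligned. Dereferencing (const uint64_t *)p directly is what
 * sanitizers typically report; memcpy has no alignment requirement. */
uint64_t load_u64_sketch(const uint8_t *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v)); /* compilers usually lower this to one load */
    return v;
}
```

The value is returned in the host byte order, exactly as a direct load would produce.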
Greg Tucker
269df8a67d igzip: Fix order of args check in new dictionary function
In the newly added function isal_deflate_process_dict(), a null check
was added to the dictionary struct but was ineffectual because of the
order.

Change-Id: I3b3e70997210794de102b1348e1467295871cee2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-11-03 08:50:30 -07:00
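A generic sketch of the ordering issue described in 269df8a67d: a null check only helps if it runs before the pointer is first dereferenced. The struct, field, and function names below are hypothetical and do not reproduce the actual isal_deflate_process_dict() code, which this log does not show.

```c
#include <stddef.h>

/* Hypothetical stand-in for a dictionary descriptor. */
struct dict_sketch {
    size_t len;
};

/* Returns 0 on success, -1 on a null argument, -2 on an empty dictionary. */
int process_dict_sketch(const void *stream, const struct dict_sketch *dict)
{
    /* Validate every pointer first ... */
    if (stream == NULL || dict == NULL)
        return -1;

    /* ... and only then read through it. Placing this read above the
     * check would make the check ineffectual. */
    if (dict->len == 0)
        return -2;

    return 0;
}
```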
Greg Tucker
24a98e3e87 Fix missing files in extra dist
Change-Id: I83e62344fab72afd755453d4eb43e9c236ba2b86
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-28 17:43:53 -07:00
Greg Tucker
79143208ac test: Add testing for new dictionary functions
Change-Id: I0b0a151374acfe9b44c7a2be4bb959df59356d97
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-28 17:28:43 -07:00
Greg Tucker
19035917f4 igzip: Add new functions for faster dictionary compression
Change-Id: Id55728fea286d144f8a11192ab02ccc8503d7b25
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
438ecd8187 Update custom hufftable tool for saving histogram
Change-Id: I515217b19373b8f996ff887268862cf2b102f3a4
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
89f7c46cd5 Change igzip_file_perf to accept 0 time
Change-Id: Ie2edf8e742d0bcdd9a008704f997006f8f5009ac
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
9968e7a032 Change gen cust hufftables to accept dictionary
Change-Id: I4eed03bdb91030b16b3ecfd8076adc890e4f59a2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
63dffab948 igzip: Change pre-gen inflate table to multi-symbol
Change-Id: I4b0dac1e5aa2796be17644b893e3b6c7aed05876
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
d7927673ba igzip: Inflate detect pre-gen header and use pre-expanded
Performance improvement for inflate to skip the time-consuming process of decode
table expansion when the header matches a known common dynamic one such as
produced by level 0 compression.

Change-Id: Ia2550b812a062b7cc2eb1b72bcb609f1a631e40b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-10-21 18:09:49 -07:00
Greg Tucker
cc9ed53972 build: Fix nmake check for multiple arch
Change-Id: I36c3616163f6fec61dda9cf8b35ca561e59477c9
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-27 11:16:30 -07:00
Greg Tucker
794b8b60c1 build: Add test to check for nmake consistency
Change-Id: I1180ba749d54e7ef433b01b33450e52ac5dbb2bb
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-26 11:41:03 -07:00
Greg Tucker
24623b8b82 crc: Fix missing object omitted from nmake file
Previous new crc version missed the update for nmake.

Change-Id: Ie529ee9d70d8d0ab8a8af3bd2720405802180d1e
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-26 09:49:23 -07:00
Greg Tucker
ec73d39086 crc: Add new vclmul version of crc32_iscsi
Change-Id: I1c509c6ea312b6eb4e1c2c1c8bb7044f7b043e0d
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-21 17:15:58 -07:00
Greg Tucker
ae45f60e78 igzip: Add cli feature to inflate concatenated gz files
Change-Id: I2beade6682e78fda30a18228a8660201ae7bf718
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-13 15:21:10 -07:00
Greg Tucker
93049d0d1f igzip: Fix read header for correct null checking and init
Issue with reading header only appears when combined with new feature in cli of
multiple concatenated gzip files.

Change-Id: Id8df9150c6f27d8b22e810b511291f3fcf136723
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-08-13 15:21:10 -07:00
Ruben Vorderman
2049d8dc81 Add conda shield to readme
This will make it easier for users to get the latest version. Installing with conda is easier than compiling it yourself. Distro packages (such as Debian's) do not always ship the latest version while conda-forge can. This badge will advertise this install method.

Change-Id: I99a1853a00e55fdf0c574c9906675738ac278121
Signed-off-by: Ruben Vorderman <r.h.p.vorderman@lumc.nl>
2020-07-27 11:36:55 -07:00
Jerry Yu
1c71f9c0ae crc32: tweak performance of crc32/crc32c
Tune performance with prefetch instructions.

Test results:
- Neoverse N1: ~30%
- Cortex-A72: ~3%
- Cortex-A57: ~90%
- Others: 50% - 5x

Change-Id: I3ab292a953043dbaea98af3c66778f57da3a1331
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-07-09 17:37:00 +08:00
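The prefetch idea in 1c71f9c0ae can be illustrated with a portable sketch. The commit itself changes Arm assembly; the code below is instead a plain bitwise CRC32C in C with a software prefetch hint, assuming a GCC- or Clang-compatible compiler for __builtin_prefetch. The function name, the 512-byte prefetch distance, and the init/final inversion convention are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_DISTANCE_SKETCH 512u /* assumed distance, not from the commit */

/* Bitwise CRC32C (reflected polynomial 0x82F63B78) with a prefetch hint
 * issued once per 64 bytes for data that will be read later. */
uint32_t crc32c_prefetch_sketch(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        /* Only hint at addresses that are still inside the buffer. */
        if ((i & 63u) == 0 && len - i > PREFETCH_DISTANCE_SKETCH)
            __builtin_prefetch(buf + i + PREFETCH_DISTANCE_SKETCH);

        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

The prefetch does not change the result; any speedup depends on the core and memory system, as the per-core figures in the commit message suggest.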
Jerry Yu
14e0081bef build: fix build break on non-x86 platform
Arm64 and ppc64 builds report the error below:
"configure: error: conditional "INTEL_CET_ENABLED" was never defined."
The same error is expected on all non-x86 platforms.

Change-Id: I4c1b2fc99091424cfd5c62cf4d6536222b66712d
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-06-03 03:25:03 +00:00
H.J. Lu
8074e3fe1b x86: Generate .note.gnu.property section for ELF output
We should generate .note.gnu.property section with x86 assembly codes
for ELF outputs to mark Intel CET support when Intel CET is enabled
since all input files must be marked with Intel CET support in order
for linker to mark output with Intel CET support.  Since nasm and yasm
can't generate the proper .note.gnu.property section, yasm-cet-filter.sh
and yasm-filter.sh are added to generate the proper .note.gnu.property
with linker help.

Verified with

$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8

on Linux/x86-64.

Change-Id: I14e03a8a9031c8397dc36939a528cf5a827d775a
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2020-05-26 17:12:01 -07:00
H.J. Lu
cd888f01a4 x86: Add ENDBR32/ENDBR64 at function entries for Intel CET
To support Intel CET, all indirect branch targets must start with
ENDBR32/ENDBR64.  Here is a patch to define endbranch and add it to
function entries in x86 assembly codes which are indirect branch
targets as discovered by running testsuite on Intel CET machine and
visual inspection.

Verified with

$ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++ -Wl,-z,cet-report=error -fcf-protection" .../configure x86_64-linux
$ make -j8
$ make -j8 check

with both nasm and yasm on both CET and non-CET machines.

Change-Id: I9822578e7294fb5043a64ab7de5c41de81a7d337
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2020-05-26 09:16:49 -07:00
Zhiyuan Zhu
031450f697 crc32: Implement default mix mode optimization
Change-Id: Ib3bf04215cca491db522ec33905fe48df173cc2f
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2020-05-09 08:10:34 +00:00
Jerry Yu
6c4d3dbf6c crc32:NeoverseN1: Change CRC32/PMULL order to PMULL first
To reduce cache-miss events, the mix layout is changed
to PMULL+CRC. This also relaxes the final delay caused by data
dependency.
As a result, cold performance improved by about 20% and warm
performance improved by about 4%.

Change-Id: I7756f846edcb4f1665b4643a5a0e02283938cfdf
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-16 20:38:41 +08:00
Jerry Yu
92fc8733fa crc32: Fix prototype mismatch bug
Change-Id: I7c8a2348441f32a43ff386122612405e418d9947
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-10 00:46:41 +00:00
Jerry Yu
9bcd6768fd crc32:Adjust hardware folding algorithm flags
The hardware folding algorithm depends on both the CRC32 and PMULL
instructions, so it should match both flags.

Change-Id: I361068402db1fe6d7c0bd8d2c7048f1d94880233
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-08 13:50:15 +08:00
Jerry Yu
0033f42189 crc32:Optimize crc32/c for cortex-a72
Change-Id: Ib1658fd4b87b31d8ea6c93f697b50d9b409c186e
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-04-08 13:49:38 +08:00
Greg Tucker
5e586843eb build: Change ms nmake default to nasm and add pdb gen
The nmake default is changed for a modern nasm. Older nasm and yasm versions
will still work with windows but the nmake options must be changed appropriately
for max AS_FEATURE_LEVEL to match. Also now generates debug symbol pdb files.

Change-Id: I94a2dd7ecf541c6564ccbd4a184c33995d7b31ad
Signed-off-by: Poornima Kumar <poornima.kumar@intel.com>
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-31 22:55:27 +00:00
Jerry Yu
a2fc2c000d crc32:Add optimization implementation for Neoverse N1
This patch is based on the algorithm in reference (1), with some changes.
- Redefine the block number to two.
  - Only two pipelines can be used for CRC32 calculation.
- Redefine the block size:
  - The block size of CRC is 1536B and PMULL is 512B.
- Interleave CRC and PMULL instructions.
The optimization parameters are calculated based on reference (2).

References:
- https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
- https://developer.arm.com/docs/swog309707/a

Change-Id: I1c9e593d59b521f56e4b3c807b396c083c181636
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-03-30 09:20:29 -07:00
Jerry Yu
f2cf2609cd multi-binary:Add microarchitecture id reader
This patch provides microarchitecture information
and makes microarchitecture-specific optimization possible.
Reading the ID traps into the kernel because of the mrs
instruction, so it should be called only in the dispatcher,
which runs only once in the program lifecycle. HWCAP must
also match, which ensures there are no illegal
instruction errors.

Change-Id: I393ec742010bf3f10ce335482c0350aa4202c788
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-03-30 09:20:29 -07:00
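To make the microarchitecture ID in f2cf2609cd more concrete, the sketch below decodes a value laid out like the AArch64 MIDR_EL1 register (implementer in bits 31:24, variant in 23:20, architecture in 19:16, part number in 15:4, revision in 3:0). It does not execute mrs, so it runs on any host; the struct and function names are hypothetical, and the commit's actual reader is not shown in this log.

```c
#include <stdint.h>

/* Hypothetical container for the decoded MIDR_EL1 fields. */
struct midr_fields_sketch {
    uint32_t implementer;  /* bits [31:24], 0x41 is Arm */
    uint32_t variant;      /* bits [23:20] */
    uint32_t architecture; /* bits [19:16] */
    uint32_t part_num;     /* bits [15:4], e.g. 0xD0C for Neoverse N1 */
    uint32_t revision;     /* bits [3:0] */
};

/* Split a raw MIDR_EL1 value into its fields. */
struct midr_fields_sketch decode_midr_sketch(uint64_t midr)
{
    struct midr_fields_sketch f;

    f.implementer  = (uint32_t)((midr >> 24) & 0xFFu);
    f.variant      = (uint32_t)((midr >> 20) & 0xFu);
    f.architecture = (uint32_t)((midr >> 16) & 0xFu);
    f.part_num     = (uint32_t)((midr >> 4) & 0xFFFu);
    f.revision     = (uint32_t)(midr & 0xFu);
    return f;
}
```

A dispatcher could compare implementer and part_num once at startup to pick a tuned routine, after HWCAP has confirmed that the required instructions exist.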
Jerry Yu
85f947e120 ci: remove unused drone configuration
Change-Id: I20bded8111deb122757dbf259d17cd80010c2bb6
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2020-03-27 16:16:00 -07:00
Greg Tucker
af13ed6136 ec: Fix second windows reg push for avx512
Change improper stack push in windows prolog.  Error was not reachable without
windows nasm support and so went undetected.

Change-Id: I8b715195d1c8efd173843c043d42fc610ddebd17
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-20 12:36:58 -07:00
Greg Tucker
ede04f0a1f build: Fix for windows to allow nasm use
Previously windows build could only use yasm because some procedural items such
as proc_start were not supported by nasm.  This adds a few macros and fixes so
nasm can be used to build on windows.

Change-Id: Ia05dc3ff482f33b0f915bb1be3c7df5e4a753b3a
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-17 18:05:46 -07:00
Greg Tucker
5ab40c79cc ec: Fix windows reg push for avx512
Push of registers overlapped xmm push.  Error was not reachable without windows
nasm support and so went undetected.

Change-Id: I0ffd66f6d32ac37ea03fe9b11924968aa50f8fa7
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-17 18:05:46 -07:00
Greg Tucker
472e7011e8 ec: Change use of windows macro save_xmm128 to vec
For builds under windows this could emit a non-vec mov that's not optional for
AVX versions.

Change-Id: I31e6ea3b62d48c5a13f6e83f8d684f0b5551087b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-17 18:04:54 -07:00
Greg Tucker
7c0ab1d459 build: Add auto regenerate of nmake file
Change-Id: Icaa64aa35697c87779df18c3941d3df0f3256546
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-10 14:00:05 -07:00
Greg Tucker
794413ddd2 ec: Remove arch-specific redundant gf_nvect tests
The gf_{2-6}vect_dot_prod tests were kept in other_tests since the 5,6vect
functions were not strictly called by the higher level ec_encode_data() and
needed independent testing.  As this has now changed the extra tests can be
removed as redundant.

Change-Id: I8a95e31487b150a2a8f929c5586785524d951fde
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-03-06 13:45:59 -07:00
Greg Tucker
806b55ee57 build: Bump revision to 2.29
Change-Id: I78cfa77864f3fd77c3b63199bc18fd1782fe3dc2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
v2.29.0
2020-02-26 18:29:49 -07:00
Greg Tucker
2db2cd557c Update release notes for v2.29 additions
Change-Id: Id9ba5da760ee60dbb1de47162e6276f522bc0850
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-26 12:04:18 -07:00
Greg Tucker
6136a04bbe crc: Add new vclmul version of crc16_t10dif
Change-Id: Ic068f35d5d8c34b74128b7a2ea8e82f5fa693c28
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 19:54:19 -07:00
Greg Tucker
5ef6eb5c68 crc: Add new vclmul version of crc32_ieee
Change-Id: Ib761e3240d8252ce84e9abeadb568dce60742717
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Greg Tucker
25a673d75a crc: Add new vclmul version of gzip_refl
Change-Id: I8050853dcd177f4fb506f32f5fa723f7a1d3cded
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Greg Tucker
4217930338 crc: Add vec version of crc16_t10dif_copy
Change-Id: I5f73e8a38efd1ff50d30a39689d9d85da702e809
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Greg Tucker
02a41e0653 crc: Add vec version of crc32_ieee when avx avail
Change-Id: I5542ee93156c26f5a23feb89b82f4c51f282777d
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Greg Tucker
d4131bb3d3 crc: Add vec version of crc32_gzip_refl when avx avail
Change-Id: I4a069c318c809dcd21a6ebc47d3e0d1c131599ea
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Greg Tucker
ad22a90686 crc: Add vec version of crc16 when avx available
Vec versions mix much better with other avx code.

Change-Id: I2544c75d09231ee70f16c384b1e57062976199d9
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2020-02-21 10:11:16 -07:00
Hong Bo Peng
180c74aefd enable VSX SIMD in ISA-L for ppc64le
1) Implement the ErasureCode function in Altivec Intrinsics
2) Coding style update

Change-Id: I2c81d035f4083e9b011dbf3b741f628813b68606
Thanks-to: Daniel Axtens <dja@axtens.net>
Signed-off-by: Hong Bo Peng <penghb@cn.ibm.com>
2020-02-20 09:40:43 -07:00
Zhang Jinde
a3d5cd8642 igzip: Fix clang error on dep generation
Clang errors when generating dependencies due to a stray semicolon following a
function definition.

Change-Id: Iefb4aca988b643bb62a69bbbaf197aca20a2d085
Signed-off-by: Zhang Jinde <zjd5536@163.com>
2020-01-17 10:25:32 -07:00
Zhang Jinde
163b6cd934 igzip: Fix for deflate logic buffer management
Fixes invalid logic that attempted to eliminate unnecessary copy of input to the
history buffer in cases where it is not required. Correction should improve
performance and not change functionality.

Change-Id: Ife24dcc9d920ce220b1a394031e971321737a171
Signed-off-by: Zhang Jinde <zjd5536@163.com>
2020-01-08 09:46:16 -07:00
Jerry Yu
fc69e8fc79 igzip: fix deflate hash bug
If next_in equals end_in, the function should
return.

Change-Id: I59e631bb1f24835fd43f943a3736e016c4e2d0ac
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-12-31 13:15:35 -07:00
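The guard described in fc69e8fc79 can be sketched generically: when a routine reads its first input byte unconditionally, it has to return early if next_in already equals end_in. The function below is a hypothetical illustration with a toy hash, not the igzip deflate hash code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine: folds the input into *hash and returns the
 * number of bytes consumed. */
size_t hash_input_sketch(const uint8_t *next_in, const uint8_t *end_in,
                         uint32_t *hash)
{
    const uint8_t *start = next_in;
    uint32_t h;

    /* Empty input: return before touching *next_in. */
    if (next_in == end_in)
        return 0;

    h = *next_in++; /* unconditional first read, safe only after the guard */
    while (next_in < end_in)
        h = h * 31u + *next_in++;

    *hash = h;
    return (size_t)(next_in - start);
}
```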
Jerry Yu
e2b07bbd44 build: fix debug build problem
Remove strip command when lib_debug=1

Change-Id: I1203fcbfefb3b87080e9ba12ccbfb8018a008147
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-12-31 13:15:05 -07:00