763 Commits

Author SHA1 Message Date
Tim Burke
a46e3f1588 igzip: Fix aarch64 register width for bfinal
We only ever load 32 bits into it, and we only ever want to compare against
32 bits. There was no need to declare it as 64 bits.

Furthermore, there were cases where a 64 bit comparison around
isal_out_overflow_1 led us to erroneously set the block state to
ISAL_BLOCK_INPUT_DONE when it should have been left at ISAL_BLOCK_NEW_HDR.

Fixes #316

Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-08-29 23:50:35 +08:00
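The width issue described in this commit can be illustrated with a small C analogy. This is only a sketch: the real fix is in aarch64 assembly, and the helper names below (`equal_low32`, `equal_all64`, `width_demo`) are hypothetical. It shows how a comparison that is wider than the value actually loaded can give a different answer when the upper bits are not what the code expects.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare only the low 32 bits of a 64-bit register image. */
int equal_low32(uint64_t reg, uint32_t expected)
{
    return (uint32_t)reg == expected;
}

/* Hypothetical helper: compare all 64 bits, including bits never meant to be used. */
int equal_all64(uint64_t reg, uint32_t expected)
{
    return reg == (uint64_t)expected;
}

/* The low word matches, but unrelated upper bits make the wide compare fail. */
void width_demo(void)
{
    uint64_t reg = 0xdeadbeef00000001ULL;

    assert(equal_low32(reg, 1u));
    assert(!equal_all64(reg, 1u));
}
```

Declaring the value at the width it is loaded and compared at removes the chance of the two results disagreeing.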
Tim Burke
73c50447fc aarch64: Fix build on macOS
Somewhere between Command Line Tools for Xcode 16.2 and 16.3, clang
started complaining like

   <instantiation>:91:26: error: unexpected token in argument list
    movk x7, br_low_b2, lsl 32
                            ^
   crc/aarch64/crc32_ieee_norm_pmull.S:34:1: note: while in macro instantiation
   crc32_norm_func crc32_ieee_norm_pmull

It seems to have to do with some change to macro expansion; work around it by
replacing .equ directives with #defines.

Fixes #352
Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-08-21 16:20:04 +01:00
Pablo de Lara
8772e99fee Add MAINTAINERS file
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-21 09:05:44 +01:00
Pablo de Lara
fa32879c2d tests: [fuzz] fix potential null dereference
There is a possibility that zstate.msg = NULL, which is set
in inflateInit2() function. In that case, we should not
compare against another string.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-11 17:37:49 +01:00
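The guard described in this commit might look like the following sketch. The function name `msg_matches` is hypothetical and is not taken from the fuzz test; it only shows the pattern of checking a possibly-NULL message pointer before handing it to `strcmp()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 only when msg is non-NULL and equal to
 * expected. A NULL msg is treated as "no match" instead of being passed
 * to strcmp(), which would be undefined behaviour.
 */
int msg_matches(const char *msg, const char *expected)
{
    assert(expected != NULL); /* the reference string must always exist */

    if (msg == NULL)
        return 0;
    return strcmp(msg, expected) == 0;
}
```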
Pablo de Lara
768b77219f igzip: [SHIM] fix memory leaks
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-11 17:37:49 +01:00
Pablo de Lara
8f2c02ab9e igzip: fix memory leak in test
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-11 17:37:49 +01:00
vkarpenk
f0320e1c30 shim: add zlib shim library
This experimental library is a drop-in replacement for zlib that
utilizes ISA-L for improved compression/decompression performance.

Signed-off-by: Karpenko, Veronika <veronika.karpenko@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-08 07:47:35 +00:00
Pablo de Lara
5e9072107a cmake: add functional tests
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-17 09:57:45 +02:00
Pablo de Lara
612c210684 Add initial CMake build system
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2025-07-17 09:57:45 +02:00
lvshuo
d414b2702a erasure_code: optimize RVV implementation
The ISA-L EC code has been written using RVV vector instructions and the minimum multiplication table,
resulting in a performance improvement of over 10 times compared to the existing implementation.

Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>
2025-07-11 15:55:57 +02:00
Pablo de Lara
f2883f24fd raid: add cold cache test
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-09 14:15:18 +01:00
Pablo de Lara
55e25f7aa2 raid: add consolidated performance app
Added new RAID performance application which consolidates the
existing XOR and P+Q gen performance applications.

This application accepts buffer sizes to benchmark,
as a single value, list or range, and the RAID function
to test and the number of sources.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 15:00:56 +02:00
Pablo de Lara
8735bb4e20 crc: add cold cache test
To benchmark a cold cache scenario, the option `--cold`
has been added as a parameter of the CRC benchmark application,
where the addresses of the input buffers are randomized
within a 1GB preallocated memory buffer.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 12:11:56 +01:00
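The address randomization described for `--cold` could be sketched roughly as below. This is an assumption about the general technique, not the benchmark's actual code: `cold_offset` is a hypothetical name, and `rand()` stands in for whatever random source the application really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: pick a random start offset so that the range
 * [offset, offset + buf_len) fits inside a preallocated pool of pool_len
 * bytes. Two rand() calls are combined because RAND_MAX may be as small
 * as 32767, which alone could not cover a 1GB pool.
 */
size_t cold_offset(size_t pool_len, size_t buf_len)
{
    assert(buf_len <= pool_len);

    size_t span = pool_len - buf_len + 1; /* number of valid start offsets */
    size_t r = ((size_t)rand() << 15) ^ (size_t)rand();

    return r % span;
}
```

Each benchmark iteration would then read from `pool + cold_offset(pool_len, buf_len)`, making it unlikely that the data is still cached from a previous pass.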
Pablo de Lara
e97c91547f Add parentheses around parameters in macros
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 12:11:56 +01:00
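For context, the reason macro parameters need parentheses can be shown with a minimal sketch. The macros `MUL_UNSAFE` and `MUL_SAFE` are hypothetical examples and do not come from the library.

```c
#include <assert.h>

/* Parameters are pasted in textually, so operator precedence can change the result. */
#define MUL_UNSAFE(a, b) (a * b)

/* Parenthesizing each parameter keeps every argument expression intact. */
#define MUL_SAFE(a, b) ((a) * (b))

void macro_demo(void)
{
    /* Expands to (1 + 2 * 3), which is 7 rather than the intended 9. */
    assert(MUL_UNSAFE(1 + 2, 3) == 7);

    /* Expands to ((1 + 2) * (3)), which is 9. */
    assert(MUL_SAFE(1 + 2, 3) == 9);
}
```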
Pablo de Lara
199a0a8151 crc: add CRC consolidated performance benchmark
Added new CRC performance application which consolidates the
existing CRC performance applications (CRC16, CRC32 and CRC64).

This application accepts buffer sizes to benchmark,
as a single value, list or range, and the CRC function
to test (or all of them).

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
5d437d72f1 Add missing base function symbol
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
fc37bd08e3 Further memory leak fixes
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
bf18da6770 Free allocated memory in test applications
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-11 12:17:43 +01:00
Rong Tao
8054d41db5 Ignore more generated files
Ignore files generated by 'make perf'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
2025-05-26 19:28:17 +01:00
Tim Burke
9a6c32cb05 Optimize crc64_rocksoft for aarch64
Closes #326

Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
Tim Burke
86e775b3b5 Remove unnecessary .text directives
Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
Tim Burke
22810489c6 Normalize the width of some constants
This makes it easier to compare the constants used for crc/crc64_*_by8.asm,
crc/crc64_*_by16_10.asm, and crc/aarch64/crc64_*_pmull.h

Note that this revealed some discrepancies:

  ecma_refl: br_high != rk8 (92d8af2baf0e1e85 vs 92d8af2baf0e1e84)
  iso_refl: br_high != rk8 (b000000000000001 vs b000000000000000)
  jones_refl: br_high != rk8 (2b5926535897936b vs 2b5926535897936a)

but they should be innocuous.

Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
sunyuechi
f74b0d27ab Update release notes
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-13 15:39:44 +02:00
sunyuechi
a7766a91b6 mem: R-V V mem_zero_detect
banana_f3:
    rvv: mem_zero_detect_perf_warm: runtime =    3062584 usecs, bandwidth 33784 MB in 3.0626 sec = 11031.32 MB/s
    c:   mem_zero_detect_perf_warm: runtime =    3000354 usecs, bandwidth 1594 MB in 3.0004 sec = 531.34 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-13 15:39:44 +02:00
Pablo de Lara
94690d01ca Remove 32-bit x86 architecture support
As already announced in issue #296, we are removing 32-bit x86 support,
which was not being validated anyway.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 18:37:08 +01:00
Pablo de Lara
8045bee170 Bump minimum NASM version to 2.14.01
NASM version 2.14.01 supports all x86 ISA in this library.
Since this version has been out since 2018, it is safe to
only permit the library to be compiled with this minimum version,
as announced in issue #297.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 16:20:08 +01:00
Pablo de Lara
d20335bba8 Update release notes
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 16:20:08 +01:00
sunyuechi
eb130eaf6b erasure_code: R-V V ec_encode_data
banana_f3:
    rvv:
        erasure_code_encode_warm: runtime =    3065696 usecs, bandwidth 108 MB in 3.0657 sec = 35.37 MB/s
        erasure_code_decode_warm: runtime =    3001213 usecs, bandwidth 136 MB in 3.0012 sec = 45.47 MB/s
    c:
        erasure_code_encode_warm: runtime =    3002512 usecs, bandwidth 52 MB in 3.0025 sec = 17.34 MB/s
        erasure_code_decode_warm: runtime =    3065235 usecs, bandwidth 57 MB in 3.0652 sec = 18.69 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
c5d75f1e27 erasure_code: R-V V gf_vect_dot_prod
banana_f3:
    rvv: gf_vect_dot_prod_warm: runtime =    3062964 usecs, bandwidth 490 MB in 3.0630 sec = 160.25 MB/s
    c:   gf_vect_dot_prod_warm: runtime =    3000581 usecs, bandwidth 173 MB in 3.0006 sec = 57.69 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
4174804684 riscv64_multibinary support more args
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
0a68e9434a erasure_code: R-V V gf_vect_mul
banana_f3:
    rvv: gf_vect_mul_warm: runtime =    3062541 usecs, bandwidth 1889 MB in 3.0625 sec = 616.84 MB/s
    c:   gf_vect_mul_warm: runtime =    3062014 usecs, bandwidth 285 MB in 3.0620 sec = 93.29 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
5518db11a9 Fix erasure_code/gf_vect_mul_test output
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
Pablo de Lara
9b3532244b Remove YASM support
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 17:37:34 +01:00
Pablo de Lara
8401831dc4 raid: add AVX2+GFNI implementation for P+Q gen
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 13:51:12 +01:00
Pablo de Lara
55a42d7717 raid: add AVX512+GFNI implementation for P+Q gen
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 13:51:12 +01:00
sunyuechi
359e2ac1af Update release notes for v2.32
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
sunyuechi
0de3661ec0 raid: R-V V xor_gen
banana_f3:
        new: xor_gen_warm: runtime =    3006459 usecs, bandwidth 10685 MB in 3.0065 sec = 3554.17 MB/s
        old: xor_gen_warm: runtime =    3060970 usecs, bandwidth 514 MB in 3.0610 sec = 168.21 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
sunyuechi
7fafc98a37 Fix xor_gen test pass when len % 256 == 0
If len > 255, the return value is reduced modulo 256, which causes the test to incorrectly pass.

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
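The wrap-around described here follows from how exit statuses are reported on POSIX systems, where only the low 8 bits reach the caller. The sketch below models that behaviour; both function names are hypothetical, and collapsing failures to 1 is one common remedy, not necessarily the one used in this commit.

```c
#include <assert.h>

/* Models what the parent process sees: only the low 8 bits of the status survive. */
int visible_exit_status(int returned)
{
    return returned & 0xff;
}

/* Collapse any non-zero failure value to 1 so that it cannot wrap around to 0. */
int safe_exit_status(int returned)
{
    return returned != 0;
}

void exit_status_demo(void)
{
    /* A failure value of 256 looks like success once truncated. */
    assert(visible_exit_status(256) == 0);

    /* After normalization the failure stays visible. */
    assert(visible_exit_status(safe_exit_status(256)) == 1);
}
```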
sunyuechi
ba874ba762 raid: R-V V pq_gen
banana_f3:
        new: pq_gen_warm: runtime =    3062397 usecs, bandwidth 4737 MB in 3.0624 sec = 1546.92 MB/s
        old: pq_gen_warm: runtime =    3005894 usecs, bandwidth 2851 MB in 3.0059 sec = 948.80 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:00 +00:00
sunyuechi
b725bddd05 license: correct name to "ISCAS"
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:00 +00:00
sunyuechi
91da2ada9a add RISCV CI
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-24 15:29:34 +01:00
Pablo de Lara
ce957f9449 ci: update github actions to latest versions
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-24 10:11:26 +01:00
Mattias Ellert
7e01b2c812 Address type mismatch warnings on riscv64
The riscv64 dispatcher code uses the same PROVIDER_INFO macro as the
aarch64 dispatcher and has the same kind of warnings during compilation:

igzip/riscv64/igzip_multibinary_riscv64_dispatcher.c:39:24: warning: type of 'adler32_base' does not match original declaration [-Wlto-type-mismatch]
   39 |                 return PROVIDER_BASIC(adler32);
      |                        ^
igzip/adler32_base.c:34:1: note: return value type mismatch
   34 | adler32_base(uint32_t adler32, uint8_t *start, uint64_t length)
      | ^
igzip/adler32_base.c:34:1: note: type 'uint32_t' should match type 'void'
igzip/adler32_base.c:34:1: note: 'adler32_base' was previously declared here

This commit introduces the same correction for riscv64.

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
2025-04-23 20:04:05 +01:00
Pablo de Lara
6b03bc4f1e igzip: fix coding style of inflate example
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-23 13:46:12 +01:00
Pablo de Lara
4fe61d3bce Show clang-format version
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-23 13:46:12 +01:00
Pablo de Lara
aa9e15f794 aarch64: remove unneeded defines
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-22 16:12:16 +01:00
Mattias Ellert
841f9e34ad Address type mismatch warnings on aarch64
The PROVIDER_INFO macro used in the aarch64 code declares all
functions with the signature:

extern void function(void);

The actual return types and parameter lists of the functions are, however,
different. The declarations provided by the PROVIDER_INFO macro
therefore conflict with the actual declarations of the functions
elsewhere in the code, causing compiler warnings.

This commit drops the PROVIDER_INFO macro and provides proper function
declarations, either by including a header file or by providing a
forward declaration. This corresponds to how the code for the other
architectures handles this issue.

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
2025-04-22 12:55:53 +01:00
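The approach described in this commit can be sketched as follows. All names here (`checksum_example_base`, `checksum_fn`, `checksum_example_dispatcher`) are hypothetical stand-ins rather than identifiers from the library. The point is that the dispatcher sees a forward declaration whose type matches the definition, instead of a generic `extern void function(void);`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Forward declaration with the real signature, matching the definition below. */
uint32_t checksum_example_base(uint32_t seed, const uint8_t *buf, uint64_t len);

/* Function pointer type that the hypothetical dispatcher hands back. */
typedef uint32_t (*checksum_fn)(uint32_t, const uint8_t *, uint64_t);

/* The dispatcher returns a correctly typed pointer, so no type mismatch arises. */
checksum_fn checksum_example_dispatcher(void)
{
    return checksum_example_base;
}

/* Toy base implementation: adds every byte to the seed. */
uint32_t checksum_example_base(uint32_t seed, const uint8_t *buf, uint64_t len)
{
    assert(buf != NULL || len == 0);

    uint32_t sum = seed;
    for (uint64_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```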
Karpenko, Veronika
3e03e91cef igzip: add inflate example
Signed-off-by: Karpenko, Veronika <veronika.karpenko@intel.com>
2025-04-08 10:13:32 +01:00
sunyuechi
c0bd84c20e add R-V V build check
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00
sunyuechi
027be4beb9 add volatile for igzip/checksum32_funs_test
When using RISC-V GCC 14, `gcc -O0` passes the test, but `gcc -O2` fails.

The log shows that it enters the branch `if (c_dut != c_ref) {`

even though `c_dut` and `c_ref` have the same value.

Adding `volatile` allows the test to pass.

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00