757 Commits

Author SHA1 Message Date
Pablo de Lara
4edef39572 Add MAINTAINERS file
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-08-13 09:24:43 +00:00
Pablo de Lara
5e9072107a cmake: add functional tests
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-17 09:57:45 +02:00
Pablo de Lara
612c210684 Add initial CMake build system
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2025-07-17 09:57:45 +02:00
lvshuo
d414b2702a erasure_code: optimize RVV implementation
The ISA-L EC code has been written using RVV vector instructions and the minimum multiplication table,
resulting in a performance improvement of over 10 times compared to the existing implementation.

Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>
2025-07-11 15:55:57 +02:00
Pablo de Lara
f2883f24fd raid: add cold cache test
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-09 14:15:18 +01:00
Pablo de Lara
55e25f7aa2 raid: add consolidated performance app
Added new RAID performance application which consolidates the
existing XOR and P+Q gen performance applications.

This application accepts buffer sizes to benchmark,
as a single value, list or range, and the RAID function
to test and the number of sources.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 15:00:56 +02:00
Pablo de Lara
8735bb4e20 crc: add cold cache test
To benchmark a cold cache scenario, the option `--cold`
has been added as a parameter of the CRC benchmark application,
where the addresses of the input buffers are randomized
within a 1GB preallocated memory buffer.
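
The address randomization described above can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the names `COLD_REGION_SIZE`, `xorshift64` and `random_offset` are hypothetical, and the xorshift generator is an assumption chosen only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 GB region, matching the size mentioned in the commit message. */
#define COLD_REGION_SIZE ((size_t) 1 << 30)

/* Small xorshift64 PRNG (hypothetical helper); *state must be non-zero. */
static uint64_t
xorshift64(uint64_t *state)
{
        uint64_t x = *state;

        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        *state = x;
        return x;
}

/* Pick an offset so that [offset, offset + buf_len) stays inside the region. */
static size_t
random_offset(uint64_t *state, size_t region_size, size_t buf_len)
{
        assert(buf_len <= region_size);
        return (size_t) (xorshift64(state) % (region_size - buf_len + 1));
}
```

A benchmark loop would add the returned offset to the base of the preallocated region before each call, so consecutive inputs are unlikely to share cache lines. The modulo introduces a slight bias, which should not matter for this purpose.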

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 12:11:56 +01:00
Pablo de Lara
e97c91547f Add parentheses around parameters in macros
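
A generic illustration of why macro parameters are parenthesized. The macros `SQUARE_BAD` and `SQUARE` are hypothetical and are not taken from the library:

```c
#include <assert.h>

/* Unparenthesized parameter: SQUARE_BAD(1 + 2) expands to (1 + 2 * 1 + 2). */
#define SQUARE_BAD(x) (x * x)

/* Parenthesized parameter: SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)). */
#define SQUARE(x) ((x) * (x))

/* Checks the parenthesized macro and returns what the bad one computes. */
static int
check_square_macros(void)
{
        assert(SQUARE(1 + 2) == 9);
        return SQUARE_BAD(1 + 2);
}
```

With operator precedence applied, the unparenthesized form evaluates to 5 rather than 9.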
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-07-03 12:11:56 +01:00
Pablo de Lara
199a0a8151 crc: add CRC consolidated performance benchmark
Added new CRC performance application which consolidates the
existing CRC performance applications (CRC16, CRC32 and CRC64).

This application accepts buffer sizes to benchmark,
as a single value, list or range, and the CRC function
to test (or all of them).

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
5d437d72f1 Add missing base function symbol
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
fc37bd08e3 Further memory leak fixes
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-23 08:48:01 +01:00
Pablo de Lara
bf18da6770 Free allocated memory in test applications
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-06-11 12:17:43 +01:00
Rong Tao
8054d41db5 Ignore more generated files
Ignore files generated by 'make perf'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
2025-05-26 19:28:17 +01:00
Tim Burke
9a6c32cb05 Optimize crc64_rocksoft for aarch64
Closes #326

Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
Tim Burke
86e775b3b5 Remove unnecessary .text directives
Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
Tim Burke
22810489c6 Normalize the width of some constants
This makes it easier to compare the constants used for crc/crc64_*_by8.asm,
crc/crc64_*_by16_10.asm, and crc/aarch64/crc64_*_pmull.h

Note that this revealed some discrepancies:

  ecma_refl: br_high != rk8 (92d8af2baf0e1e85 vs 92d8af2baf0e1e84)
  iso_refl: br_high != rk8 (b000000000000001 vs b000000000000000)
  jones_refl: br_high != rk8 (2b5926535897936b vs 2b5926535897936a)

but they should be innocuous.
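
The three pairs listed above differ only in their lowest bit, which can be checked with a short sketch. The struct and function names are hypothetical; the constants are copied from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One reported (br_high, rk8) pair. */
struct const_pair {
        uint64_t br_high;
        uint64_t rk8;
};

static const struct const_pair pairs[] = {
        { 0x92d8af2baf0e1e85ULL, 0x92d8af2baf0e1e84ULL }, /* ecma_refl */
        { 0xb000000000000001ULL, 0xb000000000000000ULL }, /* iso_refl */
        { 0x2b5926535897936bULL, 0x2b5926535897936aULL }, /* jones_refl */
};

/* True when a and b are identical except for bit 0. */
static int
differs_only_in_bit0(uint64_t a, uint64_t b)
{
        return (a ^ b) == 1;
}

/* Asserts the property for every reported pair. */
static void
check_reported_pairs(void)
{
        for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
                assert(differs_only_in_bit0(pairs[i].br_high, pairs[i].rk8));
}
```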

Signed-off-by: Tim Burke <tim.burke@gmail.com>
2025-05-20 11:39:28 +01:00
sunyuechi
f74b0d27ab Update release notes
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-13 15:39:44 +02:00
sunyuechi
a7766a91b6 mem: R-V V mem_zero_detect
banana_f3:
    rvv: mem_zero_detect_perf_warm: runtime =    3062584 usecs, bandwidth 33784 MB in 3.0626 sec = 11031.32 MB/s
    c:   mem_zero_detect_perf_warm: runtime =    3000354 usecs, bandwidth 1594 MB in 3.0004 sec = 531.34 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-13 15:39:44 +02:00
Pablo de Lara
94690d01ca Remove 32-bit x86 architecture support
As already announced in issue #296, we are removing 32-bit x86 support,
which was not being validated anyway.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 18:37:08 +01:00
Pablo de Lara
8045bee170 Bump minimum NASM version to 2.14.01
NASM version 2.14.01 supports all x86 ISA in this library.
Since this version has been out since 2018, it is safe to
only permit the library to be compiled with this minimum version,
as announced in issue #297.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 16:20:08 +01:00
Pablo de Lara
d20335bba8 Update release notes
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-05-08 16:20:08 +01:00
sunyuechi
eb130eaf6b erasure_code: R-V V ec_encode_data
banana_f3:
    rvv:
        erasure_code_encode_warm: runtime =    3065696 usecs, bandwidth 108 MB in 3.0657 sec = 35.37 MB/s
        erasure_code_decode_warm: runtime =    3001213 usecs, bandwidth 136 MB in 3.0012 sec = 45.47 MB/s
    c:
        erasure_code_encode_warm: runtime =    3002512 usecs, bandwidth 52 MB in 3.0025 sec = 17.34 MB/s
        erasure_code_decode_warm: runtime =    3065235 usecs, bandwidth 57 MB in 3.0652 sec = 18.69 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
c5d75f1e27 erasure_code: R-V V gf_vect_dot_prod
banana_f3:
    rvv: gf_vect_dot_prod_warm: runtime =    3062964 usecs, bandwidth 490 MB in 3.0630 sec = 160.25 MB/s
    c:   gf_vect_dot_prod_warm: runtime =    3000581 usecs, bandwidth 173 MB in 3.0006 sec = 57.69 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
4174804684 riscv64_multibinary support more args
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
0a68e9434a erasure_code: R-V V gf_vect_mul
banana_f3:
    rvv: gf_vect_mul_warm: runtime =    3062541 usecs, bandwidth 1889 MB in 3.0625 sec = 616.84 MB/s
    c:   gf_vect_mul_warm: runtime =    3062014 usecs, bandwidth 285 MB in 3.0620 sec = 93.29 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
sunyuechi
5518db11a9 Fix erasure_code/gf_vect_mul_test output
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-05-01 17:44:19 +01:00
Pablo de Lara
9b3532244b Remove YASM support
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 17:37:34 +01:00
Pablo de Lara
8401831dc4 raid: add AVX2+GFNI implementation for P+Q gen
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 13:51:12 +01:00
Pablo de Lara
55a42d7717 raid: add AVX512+GFNI implementation for P+Q gen
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-29 13:51:12 +01:00
sunyuechi
359e2ac1af Update release notes for v2.32
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
sunyuechi
0de3661ec0 raid: R-V V xor_gen
banana_f3:
        new: xor_gen_warm: runtime =    3006459 usecs, bandwidth 10685 MB in 3.0065 sec = 3554.17 MB/s
        old: xor_gen_warm: runtime =    3060970 usecs, bandwidth 514 MB in 3.0610 sec = 168.21 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
sunyuechi
7fafc98a37 Fix xor_gen test pass when len % 256 == 0
If len > 255, the return value is truncated modulo 256, so the test incorrectly passes when len % 256 == 0.
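
A minimal sketch of the truncation, assuming the value in question ends up as the process exit status, of which a POSIX parent sees only the low 8 bits. The helper names are hypothetical and this is not the test's actual code.

```c
#include <assert.h>

/* Value visible to the parent process: only the low 8 bits survive. */
static int
status_seen_by_parent(int returned)
{
        assert(returned >= 0);
        return returned & 0xff;
}

/* Collapse any non-zero failure indicator to 1 so it cannot wrap to 0. */
static int
safe_exit_code(int fail)
{
        return fail != 0;
}
```

For example, a returned value of 256 is seen as 0 (success), while `safe_exit_code(256)` yields 1.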

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:01 +00:00
sunyuechi
ba874ba762 raid: R-V V pq_gen
banana_f3:
        new: pq_gen_warm: runtime =    3062397 usecs, bandwidth 4737 MB in 3.0624 sec = 1546.92 MB/s
        old: pq_gen_warm: runtime =    3005894 usecs, bandwidth 2851 MB in 3.0059 sec = 948.80 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:00 +00:00
sunyuechi
b725bddd05 license: correct name to "ISCAS"
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-29 12:16:00 +00:00
sunyuechi
91da2ada9a add RISCV CI
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-04-24 15:29:34 +01:00
Pablo de Lara
ce957f9449 ci: update github actions to latest versions
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-24 10:11:26 +01:00
Mattias Ellert
7e01b2c812 Address type mismatch warnings on riscv64
The riscv64 dispatcher code uses the same PROVIDER_INFO macro as the
aarch64 dispatcher and has the same kind of warnings during compilation:

igzip/riscv64/igzip_multibinary_riscv64_dispatcher.c:39:24: warning: type of 'adler32_base' does not match original declaration [-Wlto-type-mismatch]
   39 |                 return PROVIDER_BASIC(adler32);
      |                        ^
igzip/adler32_base.c:34:1: note: return value type mismatch
   34 | adler32_base(uint32_t adler32, uint8_t *start, uint64_t length)
      | ^
igzip/adler32_base.c:34:1: note: type 'uint32_t' should match type 'void'
igzip/adler32_base.c:34:1: note: 'adler32_base' was previously declared here

This commit introduces the same correction for riscv64.

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
2025-04-23 20:04:05 +01:00
Pablo de Lara
6b03bc4f1e igzip: fix coding style of inflate example
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-23 13:46:12 +01:00
Pablo de Lara
4fe61d3bce Show clang-format version
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-23 13:46:12 +01:00
Pablo de Lara
aa9e15f794 aarch64: remove unneeded defines
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-04-22 16:12:16 +01:00
Mattias Ellert
841f9e34ad Address type mismatch warnings on aarch64
The PROVIDER_INFO macro used in the aarch64 code declares all
functions with the signature:

extern void function(void);

The actual return type and parameter list of the functions are however
different. The declarations provided by the PROVIDER_INFO macro
therefore conflict with the actual declarations of the functions
elsewhere in the code, causing compiler warnings.

This commit drops the PROVIDER_INFO macro and provides proper function
declarations, either by including a header file or by providing a
forward declaration. This corresponds to how the code for the other
architectures handles this issue.
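
As an illustration of a properly typed forward declaration, the sketch below uses the `adler32_base` signature quoted in the riscv64 warning log as a model. The names `adler32_base_sketch`, `adler32_fn` and `select_adler32` are hypothetical, and the function body is a textbook Adler-32, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Forward declaration with the real return type and parameter list,
 * instead of a generic `extern void function(void);`. */
uint32_t
adler32_base_sketch(uint32_t adler32, uint8_t *start, uint64_t length);

/* Typed function pointer that a dispatcher could return. */
typedef uint32_t (*adler32_fn)(uint32_t, uint8_t *, uint64_t);

static adler32_fn
select_adler32(void)
{
        return adler32_base_sketch;
}

/* Textbook Adler-32 so the example links and runs on its own. */
uint32_t
adler32_base_sketch(uint32_t adler32, uint8_t *start, uint64_t length)
{
        uint32_t a = adler32 & 0xffff;
        uint32_t b = adler32 >> 16;

        assert(start != NULL || length == 0);
        for (uint64_t i = 0; i < length; i++) {
                a = (a + start[i]) % 65521;
                b = (b + a) % 65521;
        }
        return (b << 16) | a;
}
```

Because the declaration and the definition agree, the compiler (including under LTO) has a single consistent type for the symbol.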

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
2025-04-22 12:55:53 +01:00
Karpenko, Veronika
3e03e91cef igzip: add inflate example
Signed-off-by: Karpenko, Veronika <veronika.karpenko@intel.com>
2025-04-08 10:13:32 +01:00
sunyuechi
c0bd84c20e add R-V V build check
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00
sunyuechi
027be4beb9 add volatile for igzip/checksum32_funs_test
When using RISC-V GCC 14, `gcc -O0` passes the test, but `gcc -O2` fails.

The log shows that it enters the branch `if (c_dut != c_ref) {`

even though `c_dut` and `c_ref` have the same value.

Adding `volatile` allows the test to pass.

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00
sunyuechi
e0687d4964 igzip: R-V V isal_adler32
banana_f3:
	new: adler32_warm: runtime =    3062612 usecs, bandwidth 3861 MB in 3.0626 sec = 1261.01 MB/s
	old: adler32_warm: runtime =    3062505 usecs, bandwidth 1027 MB in 3.0625 sec = 335.64 MB/s

Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00
sunyuechi
83d58b856c multibinary: Add run-time cpu feature detect for riscv64
Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>
2025-03-20 19:22:40 +00:00
Daniel Gregory
726a6f7c02 build: Add riscv64 support
Use the base implementations for every function.

Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>
2025-03-20 19:22:40 +00:00
Pablo de Lara
633add1b56 igzip: fix header construction in Big Endian systems
When a file contains a number of repeated '0x00' or '0xff'
bytes, the block header is copied from a precomputed header,
which only worked for Little-Endian systems.
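
One common way to make such a copy independent of host byte order is to store multi-byte values byte by byte. The sketch below shows that general technique under the assumption that the header fields are little-endian in the stream; `store_le32` is a hypothetical helper and this is not necessarily how the fix was implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value to dst in little-endian order, regardless of host endianness. */
static void
store_le32(uint8_t *dst, uint32_t value)
{
        assert(dst != NULL);
        dst[0] = (uint8_t) (value & 0xff);
        dst[1] = (uint8_t) ((value >> 8) & 0xff);
        dst[2] = (uint8_t) ((value >> 16) & 0xff);
        dst[3] = (uint8_t) ((value >> 24) & 0xff);
}
```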

Fixes #311.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2025-02-04 10:13:32 +00:00
Mattias Ellert
e3c2d243a1 Address compiler warnings on ppc64le and s390x
igzip/igzip_icf_body.c:7:1: warning: type of 'gen_icf_map_lh1' does not match original declaration [-Wlto-type-mismatch]
    7 | gen_icf_map_lh1(struct isal_zstream *, struct deflate_icf *, uint32_t);
      | ^
igzip/igzip_base_aliases.c:177:1: note: return value type mismatch
  177 | gen_icf_map_lh1(struct isal_zstream *stream, struct deflate_icf *matches_icf_lookup,
      | ^
igzip/igzip_base_aliases.c:177:1: note: type 'void' should match type 'uint64_t'
igzip/igzip_base_aliases.c:177:1: note: 'gen_icf_map_lh1' was previously declared here
igzip/igzip_base_aliases.c:177:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used
igzip/igzip_icf_body.c:9:1: warning: type of 'set_long_icf_fg' does not match original declaration [-Wlto-type-mismatch]
    9 | set_long_icf_fg(uint8_t *, uint64_t, uint64_t, struct deflate_icf *);
      | ^
igzip/igzip_base_aliases.c:170:1: note: type mismatch in parameter 2
  170 | set_long_icf_fg(uint8_t *next_in, uint8_t *end_in, struct deflate_icf *match_lookup,
      | ^
igzip/igzip_base_aliases.c:170:1: note: 'set_long_icf_fg' was previously declared here
igzip/igzip_base_aliases.c:170:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used
igzip/igzip_base_aliases.c:62:1: warning: type of 'set_long_icf_fg_base' does not match original declaration [-Wlto-type-mismatch]
   62 | set_long_icf_fg_base(uint8_t *next_in, uint8_t *end_in, struct deflate_icf *match_lookup,
      | ^
igzip/igzip_icf_body.c:34:1: note: type mismatch in parameter 2
   34 | set_long_icf_fg_base(uint8_t *next_in, uint64_t processed, uint64_t input_size,
      | ^
igzip/igzip_icf_body.c:34:1: note: 'set_long_icf_fg_base' was previously declared here
igzip/igzip_icf_body.c:34:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used
igzip/igzip_base_aliases.c:54:1: warning: type of 'adler32_base' does not match original declaration [-Wlto-type-mismatch]
   54 | adler32_base(uint32_t init, const unsigned char *buf, uint64_t len);
      | ^
igzip/adler32_base.c:34:1: note: type mismatch in parameter 3
   34 | adler32_base(uint32_t adler32, uint8_t *start, uint32_t length)
      | ^
igzip/adler32_base.c:34:1: note: type 'uint32_t' should match type 'uint64_t'
igzip/adler32_base.c:34:1: note: 'adler32_base' was previously declared here
igzip/adler32_base.c:34:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
2025-01-27 23:01:00 +01:00
Mattias Ellert
c387163fcb Revert soname change
The soname is equal to current minus age.
In version 2.31.0 current is 2 and age is set to 0.
In version 2.31.1 current is 2 and age is set to 1.
This means the soname goes backwards from 2 to 1.
The full library version changes from 2.0.31 to 1.1.31.

The soname should not go backwards, so this soname change looks like a
mistake that should be reverted.

The current, revision, age for a library should change in one of three ways:

1) increase current by one, reset revision and age to 0.
2) increase current by one, reset revision to 0 and increase age by 1.
3) increase revision by 1, retain the values of current and age.

1) is for non-backward compatible changes to the library (changes to or
removals from the old ABI). The soname changes, and applications using the
library must be recompiled.

2) is for when there are ABI additions to the library, but no ABI
changes or removals. Applications compiled against the old version of
the library don't need to be recompiled, and the soname (current minus
age) does not change.

3) is for minor updates with no ABI additions, changes or removals.

The major, minor, patch version of the software project should not be
used as current, revision, age for the library. This is especially true
for using the patch version as age, because that means the soname goes
backwards for patch releases, as happened here.
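
The arithmetic described in this message can be sketched as follows. The struct and function names are hypothetical; the rule (soname equals current minus age) and the version strings are the ones stated above, with revision 31 taken from the quoted versions.

```c
#include <assert.h>
#include <stdio.h>

/* libtool-style version triple. */
struct lt_version {
        unsigned current;
        unsigned revision;
        unsigned age;
};

/* soname number: current minus age. */
static unsigned
soname_major(struct lt_version v)
{
        assert(v.age <= v.current);
        return v.current - v.age;
}

/* Format "<current - age>.<age>.<revision>" into buf; returns snprintf's result. */
static int
format_full_version(struct lt_version v, char *buf, size_t len)
{
        return snprintf(buf, len, "%u.%u.%u", soname_major(v), v.age, v.revision);
}
```

With current 2, revision 31 and age 0 this gives soname 2 and version 2.0.31; raising age to 1 gives soname 1 and version 1.1.31, which is the backwards step this commit reverts.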

Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>
v2.31.1
2025-01-08 15:33:59 +00:00