isa-l

mirror of https://github.com/intel/isa-l.git synced 2025-09-06 06:10:54 +02:00

Author	SHA1	Message	Date
Pablo de Lara	4edef39572	Add MAINTAINERS file Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-08-13 09:24:43 +00:00
Pablo de Lara	5e9072107a	cmake: add functional tests Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-17 09:57:45 +02:00
Pablo de Lara	612c210684	Add initial CMake build system Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>	2025-07-17 09:57:45 +02:00
lvshuo	d414b2702a	erasure_code: optimize RVV implementation The ISA-L EC code has been written using RVV vector instructions and the minimum multiplication table, resulting in a performance improvement of over 10 times compared to the existing implementation. Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>	2025-07-11 15:55:57 +02:00
Pablo de Lara	f2883f24fd	raid: add cold cache test Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-09 14:15:18 +01:00
Pablo de Lara	55e25f7aa2	raid: add consolidated performance app Added new RAID performance application which consolidates the existing XOR and P+Q gen performance applications. This application accepts buffer sizes to benchmark, as a single value, list or range, and the RAID function to test and the number of sources. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-03 15:00:56 +02:00
Pablo de Lara	8735bb4e20	crc: add cold cache test To benchmark a cold cache scenario, the option `--cold` has been added as a parameter of the CRC benchmark application, where the addresses of the input buffers are randomized within a 1GB preallocated memory buffer. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-03 12:11:56 +01:00
Pablo de Lara	e97c91547f	Add parenthesis around parameters in macros Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-03 12:11:56 +01:00
Pablo de Lara	199a0a8151	crc: add CRC consolidated performance benchmark Added new CRC performance application which consolidates the existing CRC performance applications (CRC16, CRC32 and CRC64). This application accepts buffer sizes to benchmark, as a single value, list or range, and the CRC function to test (or all of them). Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-23 08:48:01 +01:00
Pablo de Lara	5d437d72f1	Add missing base function symbol Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-23 08:48:01 +01:00
Pablo de Lara	fc37bd08e3	Further memory leak fixes Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-23 08:48:01 +01:00
Pablo de Lara	bf18da6770	Free allocated memory in test applications Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-11 12:17:43 +01:00
Rong Tao	8054d41db5	Ignore more generated files ignore files generated by 'make perf'. Signed-off-by: Rong Tao <rongtao@cestc.cn>	2025-05-26 19:28:17 +01:00
Tim Burke	9a6c32cb05	Optimize crc64_rocksoft for aarch64 Closes #326 Signed-off-by: Tim Burke <tim.burke@gmail.com>	2025-05-20 11:39:28 +01:00
Tim Burke	86e775b3b5	Remove unnecessary .text directives Signed-off-by: Tim Burke <tim.burke@gmail.com>	2025-05-20 11:39:28 +01:00
Tim Burke	22810489c6	Normalize the width of some constants This makes it easier to compare the constants used for crc/crc64__by8.asm, crc/crc64__by16_10.asm, and crc/aarch64/crc64_*_pmull.h Note that this revealed some discrepancies: ecma_refl: br_high != rk8 (92d8af2baf0e1e85 vs 92d8af2baf0e1e84) iso_refl: br_high != rk8 (b000000000000001 vs b000000000000000) jones_refl: br_high != rk8 (2b5926535897936b vs 2b5926535897936a) but they should be innocuous. Signed-off-by: Tim Burke <tim.burke@gmail.com>	2025-05-20 11:39:28 +01:00
sunyuechi	f74b0d27ab	Update release notes Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-13 15:39:44 +02:00
sunyuechi	a7766a91b6	mem: R-V V mem_zero_detect banana_f3: rvv: mem_zero_detect_perf_warm: runtime = 3062584 usecs, bandwidth 33784 MB in 3.0626 sec = 11031.32 MB/s c: mem_zero_detect_perf_warm: runtime = 3000354 usecs, bandwidth 1594 MB in 3.0004 sec = 531.34 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-13 15:39:44 +02:00
Pablo de Lara	94690d01ca	Remove 32-bit x86 architecture support As already announced in issue #296, we are removing 32-bit x86 support, which was not being validated anyway. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-05-08 18:37:08 +01:00
Pablo de Lara	8045bee170	Bump minimum NASM version to 2.14.01 NASM version 2.14.01 supports all x86 ISA in this library. Since this version has been out since 2018, it is safe to only permit the library to be compiled with this minimum version, as announced in issue #297. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-05-08 16:20:08 +01:00
Pablo de Lara	d20335bba8	Update release notes Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-05-08 16:20:08 +01:00
sunyuechi	eb130eaf6b	erasure_code: R-V V ec_encode_data banana_f3: rvv: erasure_code_encode_warm: runtime = 3065696 usecs, bandwidth 108 MB in 3.0657 sec = 35.37 MB/s erasure_code_decode_warm: runtime = 3001213 usecs, bandwidth 136 MB in 3.0012 sec = 45.47 MB/s c: erasure_code_encode_warm: runtime = 3002512 usecs, bandwidth 52 MB in 3.0025 sec = 17.34 MB/s erasure_code_decode_warm: runtime = 3065235 usecs, bandwidth 57 MB in 3.0652 sec = 18.69 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	c5d75f1e27	erasure_code: R-V V gf_vect_dot_prod banana_f3: rvv: gf_vect_dot_prod_warm: runtime = 3062964 usecs, bandwidth 490 MB in 3.0630 sec = 160.25 MB/s c: gf_vect_dot_prod_warm: runtime = 3000581 usecs, bandwidth 173 MB in 3.0006 sec = 57.69 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	4174804684	riscv64_multibinary support more args Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	0a68e9434a	erasure_code: R-V V gf_vect_mul banana_f3: rvv: gf_vect_mul_warm: runtime = 3062541 usecs, bandwidth 1889 MB in 3.0625 sec = 616.84 MB/s c: gf_vect_mul_warm: runtime = 3062014 usecs, bandwidth 285 MB in 3.0620 sec = 93.29 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	5518db11a9	Fix erasure_code/gf_vect_mul_test output Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
Pablo de Lara	9b3532244b	Remove YASM support Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-29 17:37:34 +01:00
Pablo de Lara	8401831dc4	raid: add AVX2+GFNI implementation for P+Q gen Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-29 13:51:12 +01:00
Pablo de Lara	55a42d7717	raid: add AVX512+GFNI implementation for P+Q gen Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-29 13:51:12 +01:00
sunyuechi	359e2ac1af	Update release notes for v2.32 Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-29 12:16:01 +00:00
sunyuechi	0de3661ec0	raid: R-V V xor_gen banana_f3: new: xor_gen_warm: runtime = 3006459 usecs, bandwidth 10685 MB in 3.0065 sec = 3554.17 MB/s old: xor_gen_warm: runtime = 3060970 usecs, bandwidth 514 MB in 3.0610 sec = 168.21 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-29 12:16:01 +00:00
sunyuechi	7fafc98a37	Fix xor_gen test pass when len % 256 == 0 If len > 255, the return value will be % 256, which causes the test to incorrectly pass Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-29 12:16:01 +00:00
sunyuechi	ba874ba762	raid: R-V V pq_gen banana_f3: new: pq_gen_warm: runtime = 3062397 usecs, bandwidth 4737 MB in 3.0624 sec = 1546.92 MB/s old: pq_gen_warm: runtime = 3005894 usecs, bandwidth 2851 MB in 3.0059 sec = 948.80 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-29 12:16:00 +00:00
sunyuechi	b725bddd05	license: correct name to "ISCAS" Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-29 12:16:00 +00:00
sunyuechi	91da2ada9a	add RISCV CI Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-04-24 15:29:34 +01:00
Pablo de Lara	ce957f9449	ci: update github actions to latest versions Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-24 10:11:26 +01:00
Mattias Ellert	7e01b2c812	Address type mismatch warnings on riscv64 The riscv64 dispatcher code uses the same PROVIDER_INFO macro as the aarch64 dispatcher and has the same kind of warnings during compilation: igzip/riscv64/igzip_multibinary_riscv64_dispatcher.c:39:24: warning: type of 'adler32_base' does not match original declaration [-Wlto-type-mismatch] 39 \| return PROVIDER_BASIC(adler32); \| ^ igzip/adler32_base.c:34:1: note: return value type mismatch 34 \| adler32_base(uint32_t adler32, uint8_t *start, uint64_t length) \| ^ igzip/adler32_base.c:34:1: note: type 'uint32_t' should match type 'void' igzip/adler32_base.c:34:1: note: 'adler32_base' was previously declared here This commit introduces the same correction for riscv64. Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>	2025-04-23 20:04:05 +01:00
Pablo de Lara	6b03bc4f1e	igzip: fix coding style of inflate example Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-23 13:46:12 +01:00
Pablo de Lara	4fe61d3bce	Show clang-format version Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-23 13:46:12 +01:00
Pablo de Lara	aa9e15f794	aarch64: remove unneeded defines Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-04-22 16:12:16 +01:00
Mattias Ellert	841f9e34ad	Address type mismatch warnings on aarch64 The PROVIDER_INFO macro used in the aarch64 code declares all functions with the signature: extern void function(void); The actual return type and parameter list of the functions are however different. The declarations provided by the PROVIDER_INFO macro therefore conflict with the actual declarations of the functions elsewhere in the code, causing compiler warnings. This commit drops the PROVIDER_INFO macro and provides proper function declarations, either by including a header file or by providing a forward declaration. This corresponds to how the code for the other architectures handles this issue. Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>	2025-04-22 12:55:53 +01:00
Karpenko, Veronika	3e03e91cef	igzip: add inflate example Signed-off-by: Karpenko, Veronika <veronika.karpenko@intel.com>	2025-04-08 10:13:32 +01:00
sunyuechi	c0bd84c20e	add R-V V build check Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-03-20 19:22:40 +00:00
sunyuechi	027be4beb9	add volatile for igzip/checksum32_funs_test When using RISC-V GCC 14, `gcc -O0` passes the test, but `gcc -O2` fails. The log shows that it enters the branch `if (c_dut != c_ref) {` even though `c_dut` and `c_ref` have the same value. Adding `volatile` allows the test to pass. Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-03-20 19:22:40 +00:00
sunyuechi	e0687d4964	igzip: R-V V isal_adler32 banana_f3: new: adler32_warm: runtime = 3062612 usecs, bandwidth 3861 MB in 3.0626 sec = 1261.01 MB/s old: adler32_warm: runtime = 3062505 usecs, bandwidth 1027 MB in 3.0625 sec = 335.64 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-03-20 19:22:40 +00:00
sunyuechi	83d58b856c	multibinary: Add run-time cpu feature detect for riscv64 Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-03-20 19:22:40 +00:00
Daniel Gregory	726a6f7c02	build: Add riscv64 support Use the base implementations for every function. Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>	2025-03-20 19:22:40 +00:00
Pablo de Lara	633add1b56	igzip: fix header construction in Big Endian systems When a file contains a number of repeated '0x00' or '0xff' bytes, the block header is copied from a precomputed header, which only worked for Little-Endian systems. Fixes #311. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-02-04 10:13:32 +00:00
Mattias Ellert	e3c2d243a1	Address compiler warnings on ppc64le and s390x igzip/igzip_icf_body.c:7:1: warning: type of 'gen_icf_map_lh1' does not match original declaration [-Wlto-type-mismatch] 7 \| gen_icf_map_lh1(struct isal_zstream , struct deflate_icf , uint32_t); \| ^ igzip/igzip_base_aliases.c:177:1: note: return value type mismatch 177 \| gen_icf_map_lh1(struct isal_zstream stream, struct deflate_icf matches_icf_lookup, \| ^ igzip/igzip_base_aliases.c:177:1: note: type 'void' should match type 'uint64_t' igzip/igzip_base_aliases.c:177:1: note: 'gen_icf_map_lh1' was previously declared here igzip/igzip_base_aliases.c:177:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used igzip/igzip_icf_body.c:9:1: warning: type of 'set_long_icf_fg' does not match original declaration [-Wlto-type-mismatch] 9 \| set_long_icf_fg(uint8_t , uint64_t, uint64_t, struct deflate_icf ); \| ^ igzip/igzip_base_aliases.c:170:1: note: type mismatch in parameter 2 170 \| set_long_icf_fg(uint8_t next_in, uint8_t end_in, struct deflate_icf match_lookup, \| ^ igzip/igzip_base_aliases.c:170:1: note: 'set_long_icf_fg' was previously declared here igzip/igzip_base_aliases.c:170:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used igzip/igzip_base_aliases.c:62:1: warning: type of 'set_long_icf_fg_base' does not match original declaration [-Wlto-type-mismatch] 62 \| set_long_icf_fg_base(uint8_t next_in, uint8_t end_in, struct deflate_icf match_lookup, \| ^ igzip/igzip_icf_body.c:34:1: note: type mismatch in parameter 2 34 \| set_long_icf_fg_base(uint8_t next_in, uint64_t processed, uint64_t input_size, \| ^ igzip/igzip_icf_body.c:34:1: note: 'set_long_icf_fg_base' was previously declared here igzip/igzip_icf_body.c:34:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used igzip/igzip_base_aliases.c:54:1: warning: type of 'adler32_base' does not match original declaration [-Wlto-type-mismatch] 54 \| adler32_base(uint32_t init, const unsigned char buf, uint64_t len); \| ^ igzip/adler32_base.c:34:1: note: type mismatch in parameter 3 34 \| adler32_base(uint32_t adler32, uint8_t *start, uint32_t length) \| ^ igzip/adler32_base.c:34:1: note: type 'uint32_t' should match type 'uint64_t' igzip/adler32_base.c:34:1: note: 'adler32_base' was previously declared here igzip/adler32_base.c:34:1: note: code may be misoptimized unless '-fno-strict-aliasing' is used Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>	2025-01-27 23:01:00 +01:00
Mattias Ellert	c387163fcb	Revert soname change The soname is equal to current minus age. In version 2.31.0 current is 2 and age is set to 0. In version 2.31.1 current is 2 and age is set to 1. This means the soname goes backwards from 2 to 1. The full library version changes from 2.0.31 to 1.1.31 The soname should not go backwards, so this soname change looks like a mistake that should be reverted. The current, revision, age for a library should change in one of three ways: 1) increase current by one, reset revision and age to 0. 2) increase current by one, reset revision to 0 and increase age by 1. 3) increase revision by 1, retain the values of current and age. 1) is for non-backward compatible changes to the library (changes or removals to the old ABI). Soname changes and applications using the library must be recompiled. 2) is for when there are ABI additions to the library, but no ABI changes or removals. Application compiled against the old version of the library don't need to be recompiled, and the soname (current minus age) does not change. 3) is for minor updates with no ABI additions, changes or removals. The major, minor, patch version of the software project should not be used as current, revision, age for the library. Especially true for using the patch version as age, because that means the soname goes backwards for patch releases as happened here. Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se> v2.31.1	2025-01-08 15:33:59 +00:00
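
The current/revision/age rules in the last commit message (c387163fcb) can be checked with a little arithmetic. The following is a minimal sketch, not code from the repository: the `LibtoolVersion` class and its method names are hypothetical, the revision value 31 is inferred from the full versions 2.0.31 and 1.1.31 quoted in the message, and the file-version layout `(current - age).age.revision` is the one libtool uses on Linux.

```python
from typing import NamedTuple


class LibtoolVersion(NamedTuple):
    """Hypothetical helper modelling libtool's current:revision:age triple."""

    current: int
    revision: int
    age: int

    def soname_major(self) -> int:
        # The soname number is current minus age, as stated in the commit message.
        return self.current - self.age

    def file_version(self) -> str:
        # Linux layout of the full library version: (current - age).age.revision
        return f"{self.current - self.age}.{self.age}.{self.revision}"

    def bump_incompatible(self) -> "LibtoolVersion":
        # Case 1: ABI changed or removed; the soname number increases.
        return LibtoolVersion(self.current + 1, 0, 0)

    def bump_additions(self) -> "LibtoolVersion":
        # Case 2: ABI additions only; the soname number stays the same.
        return LibtoolVersion(self.current + 1, 0, self.age + 1)

    def bump_patch(self) -> "LibtoolVersion":
        # Case 3: no ABI change at all; only the revision moves.
        return LibtoolVersion(self.current, self.revision + 1, self.age)


# Values described in the commit message (revision 31 is an inferred assumption).
v2_31_0 = LibtoolVersion(current=2, revision=31, age=0)
v2_31_1_as_released = LibtoolVersion(current=2, revision=31, age=1)

print(v2_31_0.file_version(), v2_31_0.soname_major())
print(v2_31_1_as_released.file_version(), v2_31_1_as_released.soname_major())
```

Under these assumptions, setting age to the patch number moves the soname number from 2 to 1, which is the backwards step the commit reverts. Of the three bump helpers, only `bump_incompatible` changes the soname number, and it only ever increases it.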

757 Commits