isa-l

mirror of https://github.com/intel/isa-l.git synced 2025-12-10 00:56:01 +01:00

Author	SHA1	Message	Date
lvshuo	e64ed02065	erasure_code: set vsetvli to default parameter and add space Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>	2025-10-13 09:49:38 +01:00
lvshuo	7684934179	erasure_code: add optimized implementation Remove one slli instruction and remove the dependence between the vle8.v and ld instructions. gf5 and gf7 are not modified, since +5 and +7 are not used in actual scenarios. Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>	2025-10-13 09:49:38 +01:00
Jonathan Swinney	aedcd375ba	aarch64: Use NEON when SVE width is 128 bits On AArch64 systems with SVE support, 128-bit SVE implementations can perform significantly worse than equivalent NEON code due to the different optimization strategies used in each implementation. The NEON version is unrolled 4 times, providing excellent performance at the fixed 128-bit width. The SVE version can achieve similar or better performance through its variable-width operations on systems with 256-bit or 512-bit SVE, but on 128-bit SVE systems, the NEON unrolled implementation is faster due to reduced overhead. This change adds runtime detection of SVE vector length and falls back to the optimized NEON implementation when SVE is operating at 128-bit width, ensuring optimal performance across all AArch64 configurations. This implementation checks the vector length with an intrinsic if the compiler supports it (which works on Apple as well) and falls back to using prctl otherwise. This optimization ensures that systems benefit from: - 4x unrolled NEON code on 128-bit SVE systems - Variable-width SVE optimizations on wider SVE implementations - Maintained compatibility across different AArch64 configurations Performance improvement on systems with 128-bit SVE: - Encode: 7509.80 MB/s → 8995.59 MB/s (+19.8% improvement) - Decode: 9383.67 MB/s → 12272.38 MB/s (+30.8% improvement) Signed-off-by: Jonathan Swinney <jswinney@amazon.com>	2025-09-18 17:11:00 +01:00
Pablo de Lara	09cec64707	erasure_code: improve verbose output of test application Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-09-17 13:36:06 +01:00
Pablo de Lara	612c210684	Add initial CMake build system Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>	2025-07-17 09:57:45 +02:00
lvshuo	d414b2702a	erasure_code: optimize RVV implementation The ISA-L EC code has been written using RVV vector instructions and the minimum multiplication table, resulting in a performance improvement of over 10 times compared to the existing implementation. Signed-off-by: Shuo Lv <lv.shuo@sanechips.com.cn>	2025-07-11 15:55:57 +02:00
Pablo de Lara	e97c91547f	Add parenthesis around parameters in macros Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-07-03 12:11:56 +01:00
Pablo de Lara	fc37bd08e3	Further memory leak fixes Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-23 08:48:01 +01:00
Pablo de Lara	bf18da6770	Free allocated memory in test applications Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-06-11 12:17:43 +01:00
Pablo de Lara	94690d01ca	Remove 32-bit x86 architecture support As already announced in issue #296, we are removing 32-bit x86 support, which was not being validated anyway. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-05-08 18:37:08 +01:00
Pablo de Lara	8045bee170	Bump minimum NASM version to 2.14.01 NASM version 2.14.01 supports all x86 ISA in this library. Since this version has been out since 2018, it is safe to only permit the library to be compiled with this minimum version, as announced in issue #297. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2025-05-08 16:20:08 +01:00
sunyuechi	eb130eaf6b	erasure_code: R-V V ec_encode_data banana_f3: rvv: erasure_code_encode_warm: runtime = 3065696 usecs, bandwidth 108 MB in 3.0657 sec = 35.37 MB/s erasure_code_decode_warm: runtime = 3001213 usecs, bandwidth 136 MB in 3.0012 sec = 45.47 MB/s c: erasure_code_encode_warm: runtime = 3002512 usecs, bandwidth 52 MB in 3.0025 sec = 17.34 MB/s erasure_code_decode_warm: runtime = 3065235 usecs, bandwidth 57 MB in 3.0652 sec = 18.69 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	c5d75f1e27	erasure_code: R-V V gf_vect_dot_prod banana_f3: rvv: gf_vect_dot_prod_warm: runtime = 3062964 usecs, bandwidth 490 MB in 3.0630 sec = 160.25 MB/s c: gf_vect_dot_prod_warm: runtime = 3000581 usecs, bandwidth 173 MB in 3.0006 sec = 57.69 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	0a68e9434a	erasure_code: R-V V gf_vect_mul banana_f3: rvv: gf_vect_mul_warm: runtime = 3062541 usecs, bandwidth 1889 MB in 3.0625 sec = 616.84 MB/s c: gf_vect_mul_warm: runtime = 3062014 usecs, bandwidth 285 MB in 3.0620 sec = 93.29 MB/s Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
sunyuechi	5518db11a9	Fix erasure_code/gf_vect_mul_test output Signed-off-by: sunyuechi <sunyuechi@iscas.ac.cn>	2025-05-01 17:44:19 +01:00
Mattias Ellert	841f9e34ad	Address type mismatch warnings on aarch64 The PROVIDER_INFO macro used in the aarch64 code declares all functions with the signature: extern void function(void); The actual return type and parameter list of the functions are however different. The declarations provided by the PROVIDER_INFO macro therefore conflict with the actual declarations of the functions elsewhere in the code, causing compiler warnings. This commit drops the PROVIDER_INFO macro and provides proper function declarations, either by including a header file or by providing a forward declaration. This corresponds to how the code for the other architectures handles this issue. Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>	2025-04-22 12:55:53 +01:00
Daniel Gregory	726a6f7c02	build: Add riscv64 support Use the base implementations for every function. Signed-off-by: Daniel Gregory <daniel.gregory@bytedance.com>	2025-03-20 19:22:40 +00:00
Cornu, Marcel D	07f8028743	erasure_code: fix unaligned free error in perf apps on windows Signed-off-by: Cornu, Marcel D <marcel.d.cornu@intel.com>	2024-11-19 14:20:33 +00:00
Marcel Cornu	300260a4d9	erasure_code: reformat using new code style Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2024-04-22 11:35:03 +02:00
Taiju Yamada	38279f5e9e	Avoid using x18 register Signed-off-by: Taiju Yamada <tyamada@bi.a.u-tokyo.ac.jp>	2024-03-25 15:34:01 +00:00
Mattias Ellert	1b1ee1e18f	erasure_code: fix wrong return type erasure_code/ppc64le/gf_vect_mul_vsx.c: In function '_gf_vect_mul_base': erasure_code/ppc64le/gf_vect_mul_vsx.c:14:16: error: 'return' with a value, in function returning void [-Wreturn-mismatch] 14 | return 0; | ^ erasure_code/ppc64le/gf_vect_mul_vsx.c:6:13: note: declared here 6 | static void _gf_vect_mul_base(int len, unsigned char a, unsigned char src, | ^~~~~~~~~~~~~~~~~ Signed-off-by: Mattias Ellert <mattias.ellert@physics.uu.se>	2024-01-23 12:01:14 +00:00
Pablo de Lara	e0fd782974	erasure_code: use internal gf_vect_mul_base for ppc64le encoding gf_vect_mul_base is expected to work for all buffer sizes. However, this function is checking for size alignment to 32 bytes, to follow the other gf_vect_mul implementations. Therefore, another implementation for this function is included inside ppc64le folder to be used by the encoding functions. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2024-01-15 15:48:14 +00:00
Pablo de Lara	b8d5633e51	erasure_code: check for size alignment on powerpc gf_vect_mul_vsx implementation Follows the rest of the gf_vect_mul implementations for other architectures, and checks for size alignment, stated in the documentation. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2024-01-15 15:48:14 +00:00
Pablo de Lara	91e7906f3f	erasure_code: check for size on gf_vect_mul_sse/avx gf_vect_mul requires length to be multiple of 32 bytes, so this check is added in the SSE/AVX implementations. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2024-01-15 13:52:08 +00:00
liuqinfei	275977156d	gf_vect_mul_sve: fix error and enable unit tests for aarch64 Signed-off-by: liuqinfei <lucas.liuqinfei@huawei.com>	2024-01-12 15:18:37 +00:00
Pablo de Lara	e0fffbe48b	erasure_code: disable unit tests temporarily for aarch64/ppc64le Some aarch64 and ppc64le implementations of gf_vect_mul do not check for invalid sizes, so the unit test checking for negative return value from this function is disabled temporarily on these architectures. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2024-01-10 15:53:14 +00:00
Pablo de Lara	455fdded4e	erasure_code: add missing aarch64 and powerpc interface for ec_init_tables ec_init_tables is now a multi-implementation function, so it requires a dispatcher for all architectures. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2024-01-09 13:38:43 +00:00
Tomasz Kantecki	402bd4f773	erasure_code: various fixes for static code analysis issues Signed-off-by: Tomasz Kantecki <tomasz.kantecki@intel.com>	2023-12-19 20:36:39 +00:00
Pablo de Lara	a3e260436a	erasure_code: [test] fix memory leak Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-12-18 14:25:22 +00:00
Pablo de Lara	abd80d3c5a	erasure_code: check for size in gf_Xvect_mad_avx512_gfni Length of data was not checked in implementation with AVX512+GFNI, at the start of the gf_Xvect_mad_avx512_gfni functions, resulting in buffer overflow if length was less than 64 bytes. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-12-18 14:25:22 +00:00
Marcel Cornu	561a419bc8	erasure_code: fix modules using incorrect unsigned jump Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-14 17:55:49 +00:00
Marcel Cornu	a53a20ea2a	erasure_code: add AVX2 5vect mad with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-14 17:55:49 +00:00
Marcel Cornu	47ed2847af	erasure_code: add AVX2 4vect mad with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-14 17:55:49 +00:00
Marcel Cornu	22b7f33d68	erasure_code: add AVX2 3vect mad with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-14 17:55:49 +00:00
Marcel Cornu	d22bb198f3	erasure_code: optimize AVX2-GFNI single vector mad implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-13 17:03:16 +00:00
Marcel Cornu	a0a149d674	erasure_code: add AVX2 2vect mad with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-13 17:03:16 +00:00
Marcel Cornu	0052080f53	erasure_code: optimize AVX2 GFNI 2 vector dot product Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-11 22:44:07 +00:00
Marcel Cornu	3f87141d03	erasure_code: optimize AVX2 GFNI single vector dot product Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-11 22:44:07 +00:00
Marcel Cornu	164d9ff1f0	erasure_code: add 2 vector AVX2 dot product with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-11 22:44:07 +00:00
Marcel Cornu	307d737bf2	erasure_code: add 3 vector AVX2 dot product with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-07 14:01:18 +00:00
Pablo de Lara	2ca781df19	lib: reduce verbosity by default in tests Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-12-01 14:33:29 +00:00
Marcel Cornu	5f23c03415	erasure_code: add initial AVX2 mad with GFNI implementation Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-01 14:20:56 +00:00
Pablo de Lara	447d9af75b	erasure_code: add initial AVX2 dot product with GFNI implementation Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-01 14:20:56 +00:00
Marcel Cornu	bc34d87427	erasure_code: update GF_MUL_XOR macro to support VEX encoding Signed-off-by: Marcel Cornu <marcel.d.cornu@intel.com>	2023-12-01 14:20:56 +00:00
Pablo de Lara	f971f02309	erasure_code: expose base implementation of init_tables Expose ec_init_tables_base(), which should be used with ec_encode_data_base() and ec_encode_data_update_base(). Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00
Pablo de Lara	65e89717df	erasure_code: implement EC update with AVX512 + GFNI Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00
Pablo de Lara	1eff12dddb	erasure_code: implement EC with AVX512 + GFNI Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00
Pablo de Lara	9d487fd6db	erasure_code: [perf] get parameters for number of buffers Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00
Pablo de Lara	07af4032ff	erasure_code: fix stack allocation Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00
Pablo de Lara	801df41929	erasure_code: fix vmovdqa instruction vmovdqa needs to be vmovdqa32/64 when used on ZMMs (EVEX encoded). Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2023-11-23 10:56:28 +00:00

96 Commits