isa-l/crc
Samuel Lee 4785428d2f crc: arm64 implementation tweaks
+ Utilise `pmull2` instruction in main loops of arm64 crc functions and
avoid the need for `dup` to align multiplicands.
  + Use just 1 ASIMD register to hold both 64b p4 constants,
appropriately aligned.
+ Interleave quadword `ldr` with `pmull{2}` to avoid unnecessary stalls
on existing LITTLE uarch (which can only issue these instructions every
other cycle).
+ Similarly interleave scalar instructions with ASIMD instructions to
increase likelihood of instruction level parallelism on a variety of
uarch.
+ Cut down on needless instructions in non-critical sections to help
performance for small buffers.
+ Extract common instruction sequences into inner macros and move
them into a shared header, crc_common_pmull.h
+ Use the same human readable register aliases and register allocation
in all 4 implementations; never refer to registers without using a human
readable alias.
  + Use #defines rather than .req to allow use of same names across
several implementations
+ Reduce tail case size from 1024B to 64B

+ Phrase the `eor` instructions in the main loop to show more clearly
that pairs of `eor` instructions can be replaced with a single `eor3`
instruction in the presence of Armv8.2-SHA (should probably be an option
in multibinary in future).

Change-Id: I3688193ea4ad88b53cf47e5bd9a7fd5c2b4401e1
Signed-off-by: Samuel Lee <samuel.lee@microsoft.com>
2019-11-13 10:58:19 -07:00
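
The `pmull`/`pmull2` pairing and the `eor`/`eor3` equivalence described in the commit message can be modelled in portable C. This is only a minimal sketch, not the code in the aarch64 directory: the `vec128` type and every function name below are hypothetical, and the assumption that the low lane of the constant register pairs with the low lane of the accumulator (and high with high) is an illustration of "appropriately aligned", not something taken from the source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit ASIMD register as two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } vec128;

/* Carry-less (polynomial) 64x64 -> 128-bit multiply, done bit by bit. */
static vec128 clmul64(uint64_t a, uint64_t b)
{
    vec128 r = {0, 0};
    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            if (i)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}

/* Model of `pmull`: multiplies the low lanes of both operands. */
static vec128 pmull_model(vec128 x, vec128 k) { return clmul64(x.lo, k.lo); }

/* Model of `pmull2`: multiplies the high lanes of both operands, so both
 * constants can live in one register and no `dup` is needed. */
static vec128 pmull2_model(vec128 x, vec128 k) { return clmul64(x.hi, k.hi); }

/* Model of `eor`: two-way XOR. */
static vec128 eor_model(vec128 a, vec128 b)
{
    vec128 r = { a.lo ^ b.lo, a.hi ^ b.hi };
    return r;
}

/* Model of `eor3`: three-way XOR in a single operation. */
static vec128 eor3_model(vec128 a, vec128 b, vec128 c)
{
    vec128 r = { a.lo ^ b.lo ^ c.lo, a.hi ^ b.hi ^ c.hi };
    return r;
}

/* One folding step written with a pair of `eor`s. */
static vec128 fold_step_eor(vec128 acc, vec128 k, vec128 data)
{
    vec128 t = eor_model(pmull_model(acc, k), pmull2_model(acc, k));
    return eor_model(t, data);
}

/* The same folding step written with a single `eor3`. */
static vec128 fold_step_eor3(vec128 acc, vec128 k, vec128 data)
{
    return eor3_model(pmull_model(acc, k), pmull2_model(acc, k), data);
}

/* Self-check: both phrasings of the fold step give the same result. */
static void fold_self_check(void)
{
    vec128 acc = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };
    vec128 k = { 0x1b, 0x87 };
    vec128 data = { 0xdeadbeefULL, 0xcafef00dULL };
    vec128 a = fold_step_eor(acc, k, data);
    vec128 b = fold_step_eor3(acc, k, data);
    assert(a.lo == b.lo && a.hi == b.hi);
}
```

Calling `fold_self_check()` from a test harness exercises both variants. The model ignores everything the commit is actually tuning (instruction scheduling, load interleaving, register allocation) and only shows why the two phrasings are interchangeable.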
Name | Last commit | Date
aarch64 | crc: arm64 implementation tweaks | 2019-11-13 10:58:19 -07:00
crc16_t10dif_01.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc16_t10dif_by4.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc16_t10dif_copy_by4.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc16_t10dif_copy_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc16_t10dif_copy_test.c | crc: implement table-driven crc algorithm | 2019-05-08 17:50:03 -07:00
crc16_t10dif_op_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc16_t10dif_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc16_t10dif_test.c | test: Increase size of crc tests and simplify output | 2019-09-14 16:01:28 -07:00
crc32_funcs_test.c | test: Increase size of crc tests and simplify output | 2019-09-14 16:01:28 -07:00
crc32_gzip_refl_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc32_gzip_refl_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc32_ieee_01.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc32_ieee_by4.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc32_ieee_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc32_iscsi_00.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc32_iscsi_01.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc32_iscsi_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc64_base.c | crc: implement table-driven crc algorithm | 2019-05-08 17:50:03 -07:00
crc64_ecma_norm_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_ecma_norm_by16_10.asm | crc: Add new ecma_norm | 2019-09-16 17:01:25 -07:00
crc64_ecma_refl_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_ecma_refl_by16_10.asm | crc: Add new ecma_refl | 2019-09-16 17:01:25 -07:00
crc64_example.c | crc64: add jones and iso format, crc64 code clean | 2016-12-06 13:48:13 -07:00
crc64_funcs_perf.c | all: Revamp performance testing to be time based | 2019-03-07 09:28:04 -07:00
crc64_funcs_test.c | test: Increase size of crc tests and simplify output | 2019-09-14 16:01:28 -07:00
crc64_iso_norm_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_iso_norm_by16_10.asm | crc: Fix symbol conflict with older assemblers | 2019-10-28 14:39:44 -07:00
crc64_iso_refl_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_iso_refl_by16_10.asm | crc: Fix symbol conflict with older assemblers | 2019-10-28 14:39:44 -07:00
crc64_jones_norm_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_jones_norm_by16_10.asm | crc: Add new jones_norm | 2019-09-16 17:01:25 -07:00
crc64_jones_refl_by8.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc64_jones_refl_by16_10.asm | crc: Add new jones_refl | 2019-09-16 17:01:25 -07:00
crc64_multibinary.asm | crc: Add new ecma_norm | 2019-09-16 17:01:25 -07:00
crc64_ref.h | crc: implement table-driven crc algorithm | 2019-05-08 17:50:03 -07:00
crc_base_aliases.c | crc: Add t10dif+copy function | 2017-12-18 15:59:17 -07:00
crc_base.c | crc: implement table-driven crc algorithm | 2019-05-08 17:50:03 -07:00
crc_multibinary.asm | build: Fix for mac nasm lack of symbol types | 2018-11-29 13:54:36 -07:00
crc_ref.h | crc: implement table-driven crc algorithm | 2019-05-08 17:50:03 -07:00
crc_simple_test.c | igzip: Add unit tests for adler and crc32_gzip | 2017-06-26 04:03:35 -04:00
Makefile.am | crc: Add new ecma_norm | 2019-09-16 17:01:25 -07:00