689 Commits

Author SHA1 Message Date
Zhang Jinde
163b6cd934 igzip: Fix for deflate logic buffer management
Fixes invalid logic that attempted to eliminate unnecessary copy of input to the
history buffer in cases where it is not required. Correction should improve
performance and not change functionality.

Change-Id: Ife24dcc9d920ce220b1a394031e971321737a171
Signed-off-by: Zhang Jinde <zjd5536@163.com>
2020-01-08 09:46:16 -07:00
Jerry Yu
fc69e8fc79 igzip: fix deflate hash bug
If next_in equals end_in, the function should
return.

Change-Id: I59e631bb1f24835fd43f943a3736e016c4e2d0ac
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-12-31 13:15:35 -07:00
Jerry Yu
e2b07bbd44 build: fix debug build problem
Remove strip command when lib_debug=1

Change-Id: I1203fcbfefb3b87080e9ba12ccbfb8018a008147
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-12-31 13:15:05 -07:00
Jerry Yu
936d05fc4f igzip: Add decode huffman code for aarch64
Change-Id: If26cc4fd97b078b5f3b02e5f6f121a12ec73f671
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-12-19 16:10:04 +08:00
Greg Tucker
ad49e580dc doc: Fix missing description of gf_matrix_inverse
The documentation did not mention that the input matrix is destroyed.
Fixes #116

Change-Id: Ic840b27532d90518dd21ec2701c278a1c3b61a8b
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-12-13 16:24:05 -07:00
Zhiyuan Zhu
2b8cc393af igzip: implement gen_icf_map with assembly
Change-Id: I74e6200a732acfaac44b7f5a82bd4a2215ba1535
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-12-13 07:54:12 +00:00
Zhiyuan Zhu
f430953f0a igzip: cleanup perf test related code
This patch addresses some cppcheck issues and makes
some minor changes to maintain code consistency.

- Cleanup cppcheck issues.
  [log][igzip/igzip_perf.c] (error) Shifting signed 32-bit value by 31 bits is undefined behaviour
  [log][igzip/igzip_hist_perf.c:132]: (error) Memory leak: outbuf

- Some minor changes to maintain code consistency.
  igzip/igzip_build_hash_table_perf.c
  igzip/igzip_hist_perf.c
  igzip/igzip_semi_dyn_file_perf.c

- delete unused variable
  outbuf and outbuf_size from igzip/igzip_hist_perf.c

Change-Id: Icbbd8f70de689931c8a844d89e457af8d97c6793
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-12-06 15:33:20 +08:00
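The first cppcheck finding in the commit above is a common C pitfall: with a 32-bit `int`, an expression such as `1 << 31` shifts a signed value into its sign bit, which is undefined behaviour. A minimal sketch of the usual remedy, shifting an unsigned operand instead, is shown below. The helper name `bit_mask` is hypothetical and is not taken from igzip_perf.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a 32-bit mask with only the given bit set.
 * The operand is unsigned, so shifting by 31 is well defined, whereas
 * (1 << 31) on a signed 32-bit int is undefined behaviour. */
static uint32_t bit_mask(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << bit;
}
```

Calling `bit_mask(31)` yields `0x80000000` without relying on signed overflow.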
Zhiyuan Zhu
683364c47b igzip: implement encode_deflate_icf with assembly
Change-Id: I90b12da2d2a96bfdb47d29ab329648247a756585
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-11-29 14:45:45 -07:00
John Kariuki
5eeb33f69c ec: add AVX512 ec functions with 5 and 6 outputs
Added AVX512 optimized functions to calculate the
GF(2^8) vector dot product with 5 and 6 outputs
at a time. Also added GF(2^8) vector multiply
AVX512 optimized functions with 5 and 6 accumulate.

Change-Id: I6d2c080f4f4f8e4823ad9a9be2c65c3b5b3bb1f8
Signed-off-by: John Kariuki <John.K.Kariuki@intel.com>
2019-11-19 10:12:14 -07:00
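For readers unfamiliar with the operation named in the commit above, a GF(2^8) vector dot product can be sketched in plain scalar C. This is only an illustration, not the AVX512 code: the function names are hypothetical, the reduction polynomial 0x11d is an assumption, and it produces a single output where the optimized versions compute 5 or 6 outputs per pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two GF(2^8) elements by shift-and-xor.
 * Assumption: reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul_sketch(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        /* Multiply a by x, reducing when the top bit falls out. */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

/* dest[i] = XOR over j of coeff[j] * src[j][i], for k sources of len bytes. */
static void gf_dot_prod_sketch(size_t len, size_t k, const uint8_t *coeff,
                               const uint8_t *const *src, uint8_t *dest)
{
    assert(k > 0);
    for (size_t i = 0; i < len; i++) {
        uint8_t acc = 0;
        for (size_t j = 0; j < k; j++)
            acc ^= gf_mul_sketch(coeff[j], src[j][i]);
        dest[i] = acc;
    }
}
```

Addition in GF(2^8) is XOR, so the accumulate ("mad") variants mentioned in the commit amount to XOR-ing one more product into an existing destination buffer.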
Samuel Lee
4785428d2f crc: arm64 implementation tweaks
+ Utilise `pmull2` instruction in main loops of arm64 crc functions and
avoid the need for `dup` to align multiplicands.
  + Use just 1 ASIMD register to hold both 64b p4 constants,
appropriately aligned.
+ Interleave quadword `ldr` with `pmull{2}` to avoid unnecessary stalls
on existing LITTLE uarch (which can only issue these instructions every
other cycle).
+ Similarly interleave scalar instructions with ASIMD instructions to
increase likelihood of instruction level parallelism on a variety of
uarch.
+ Cut down on needless instructions in non-critical sections to help
performance for small buffers.
+ Extract common instruction sequences into inner macros and move
them into a shared header - crc_common_pmull.h
+ Use the same human readable register aliases and register allocation
in all 4 implementations, never refer to registers without using human
readable alias.
  + Use #defines rather than .req to allow use of same names across
several implementations
+ Reduce tail case size from 1024B to 64B

+ Phrased the `eor` instructions in the main loop to more clearly show
that we can rewrite pairs of `eor` instructions with a single `eor3`
instruction in the presence of Armv8.2-SHA (should probably be an option
in multibinary in future).

Change-Id: I3688193ea4ad88b53cf47e5bd9a7fd5c2b4401e1
Signed-off-by: Samuel Lee <samuel.lee@microsoft.com>
2019-11-13 10:58:19 -07:00
Greg Tucker
0a8d05a81e doc: Move arch-dependent build instructions to readme
Removed the redundant parts that apply to all arch.

Change-Id: I2015c436cc8ea09913a8d0d4ce2cf1f112d71dde
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-11-01 15:55:44 -07:00
Hang Li
02a86dfb3f erasure_code: modify eor way in aarch64 neon codes
Change-Id: I9fb9219c5f280ed88194ec63234af046a5a036ae
Signed-off-by: Hang Li <lihang48@hisilicon.com>
2019-11-01 15:31:33 -07:00
Jerry Yu
ce9e56054a igzip: implement deflate hash with assembly
Change-Id: I39b3a37cd291c40f597750839c27db2a6a571fe5
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-11-01 14:41:46 -07:00
Jerry Yu
216d0f929b build: fix cross compile issue
Replace hardcoded gcc with $(CC). as_filter
will then work correctly when cross compiling.

Change-Id: I484d5074abdfc80ed5cd14fdd1358274f306bcfd
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-11-01 18:11:05 +08:00
Jerry Yu
5d7724898d build: fix wrong use of the register name
The third parameter must be a 32-bit register. The assembly
put a 64-bit register here, which is wrong.

Change-Id: Iebe17516b555a6a9b94ea7baa4778ad4b9dd0878
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-11-01 18:11:00 +08:00
Jerry Yu
b441659879 multibinary: fix strict-prototype warning
With the -Wstrict-prototypes option, GCC reports the
warning.

Change-Id: Ic2d1adb566ad21deec65c66552e2863254e1376a
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-11-01 18:10:57 +08:00
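As background for the commit above: GCC's `-Wstrict-prototypes` warns when a function or function pointer is declared with empty parentheses, because before C23 that form says nothing about the parameters. A minimal sketch of the prototype form follows. The names `level_fn`, `base_level`, and `call_level` are hypothetical and do not come from the multibinary sources.

```c
#include <assert.h>
#include <stddef.h>

/* Old style, which triggers the warning:
 *     typedef int (*level_fn)();
 * Prototype style: (void) states that the function takes no arguments. */
typedef int (*level_fn)(void);

/* Hypothetical base implementation used as the dispatch target. */
static int base_level(void)
{
    return 3;
}

/* Calls through the function pointer, as a dispatcher would. */
static int call_level(level_fn fn)
{
    assert(fn != NULL);
    return fn();
}
```

With the prototype in place, the compiler can also reject calls that pass arguments the function does not accept.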
Jerry Yu
f0104600a0 build: disable clang support in ci
- Disable clang test for travis and drone.io
- Add document about compiler requirement

Change-Id: I81f8dc31088d40f315dd4ec062bed5df8ab7b633
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-11-01 18:10:50 +08:00
Zhiyuan Zhu
6b70da5051 igzip: implement set_long_icf_fg with assembly
Change-Id: I21ac55985a56c2b7b0a684934c076600d90f8b0a
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-10-31 11:02:54 -07:00
Greg Tucker
4ed944c4b1 build: Fix travis osx issue with brew update
Bug in Homebrew auto-update causes post-update install to use the old
environment.

Change-Id: I03e20d899f558f71579dfd4be3f96903b77f1998
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-30 11:16:49 -07:00
Hang Li
621cf92c52 erasure_code: modify perf benchmark loop
Change-Id: Ie45ceb3ac55ab943a155e2a3f9f6b765cd94d7a1
Signed-off-by: Hang Li <lihang48@hisilicon.com>
2019-10-30 10:34:40 -07:00
Greg Tucker
2f9eef537c build: Fix autoconf build for mingw target
Change-Id: Ie5ae17556f8cc95af8e59c8bd81a958c94455cd1
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-28 15:53:14 -07:00
Greg Tucker
e6848434ae test: Fix issue keeping mingw tests from running
Change-Id: I1e72ed99c2f09cbad488774313cddafdb1ce5de8
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-28 15:52:48 -07:00
Greg Tucker
533ba53f11 crc: Fix symbol conflict with older assemblers
Change-Id: I6f1322a5fecdf21b2c774454cd51cb56767f30b8
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-28 14:39:44 -07:00
Zhou Xiong
d7848c1d05 Implement aarch64 neon for erasure code.
1. Replace the erasure code interfaces below with Arm NEON implementations via the mbin_interface function.
	ec_encode_data
	gf_vect_mul
	gf_vect_dot_prod
	gf_vect_mad
	ec_encode_data_update

2. Utilise Arm NEON instructions to accelerate GF(2^8) computation using 128-bit registers.

Change-Id: Ib0ecbfbd1837d2b1f823d26815c896724d2d22e4
Signed-off-by: Zhou Xiong <zhouxiong13@huawei.com>
2019-10-25 11:09:03 -07:00
Jun He
c680d3aba7 Add arm64 to Travis matrix
Enable new arm64 architecture in TravisCI, add tests for
following compilers:
gcc: v5.4.0
clang: v3.8.0

Change-Id: Id0b2f2231fabcbeff7061f85050db99df12c9a67
Signed-off-by: Jun He <jun.he@arm.com>
2019-10-24 10:09:19 +08:00
Greg Tucker
5f698e9e41 doc: Update mailing list link
Change-Id: I57fdf1ab4ca9f57c11f361c873094c5c22dc5410
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-16 17:13:54 -07:00
Greg Tucker
66cff99954 doc: Remove non-extern headers and add treeview
Change-Id: Icee001e66d48f7a47b36ded5550c66832f81a4cc
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-10-16 17:13:54 -07:00
Bernd Schubert
d32d3f6902 Make variables in ec_base.h (file) static
ec_base.h has several variables, which were defined with
a global scope. Exactly those global variables caused issues
on linking a static compilation of libisal.a to a shared lib.
Adding -fPIC to CFLAGS somehow didn't help.
As all the variables in ec_base.h are only included
and used by a single C file, all of these can be
(file) static, which will also help the compiler
make further optimizations. It also solves the issue
of linking the static libisal to a shared lib.

Also make the variables const, as these are constants and
must not be modified.

Change-Id: I2b8141dabc1c7a528401f2778cdbdbed6c93c36b
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
2019-10-11 15:39:56 -07:00
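The change described in the commit above can be illustrated with a small sketch. Marking a table defined in a header as `static const` gives it internal linkage, so it exports no global symbol, and lets the compiler place it in read-only data. The table name, its values, and the lookup helper are hypothetical and are not the contents of ec_base.h.

```c
#include <assert.h>
#include <stddef.h>

/* Would live in a header that is included by exactly one C file.
 * Before: 'unsigned char demo_tbl[4] = {...};' defined a global symbol.
 * After: internal linkage and read-only contents. Values are illustrative. */
static const unsigned char demo_tbl[4] = { 0x01, 0x02, 0x04, 0x08 };

/* Hypothetical accessor in the single C file that uses the table. */
static unsigned char demo_lookup(size_t i)
{
    assert(i < sizeof(demo_tbl));
    return demo_tbl[i];
}
```

Because the table is `const`, an accidental write such as `demo_tbl[0] = 0;` becomes a compile error.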
Zhiyuan Zhu
f3993f5c0b crc: Fix dynamic relocation link failure on Arm
This issue occurs when dynamic compilation is used
and gcc's -fsanitize memory detection option is turned on.

[Log] relocation truncated to fit: R_AARCH64_LD_PREL_LO19 against `.rodata'

Change-Id: Ic2f82264610552f347e043f82ac5ebafc93748e2
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-10-11 15:37:29 -07:00
Zhiyuan Zhu
be4d035227 igzip: Optimize isal update histogram with arm64
Change-Id: I944f9497d990e831de5e066055a21ea7e8d6693b
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-10-11 09:59:47 -07:00
Zhiyuan Zhu
290456231c igzip: Implement deflate icf body/finish with assembly
Change-Id: I40e4a9be2ae654c881460056de9730176d3d097c
Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
2019-10-11 09:59:40 -07:00
Jerry Yu
f3bb041799 igzip: Implement deflate body/finish with assembly
Change-Id: I556af7976294f31abd72ac49366f7259e3baf399
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-10-11 09:59:30 -07:00
Greg Tucker
fae4c3a499 Update release notes for v2.28 additions
Change-Id: Id295d5e615712f41d67d1130d5bcab1abed4c29f
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-17 11:01:17 -07:00
Greg Tucker
36502ec33b build: Bump revision to 2.28
Change-Id: I57443be6b0f6dff6129943cd6e1508d73bc1aa80
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-17 10:43:53 -07:00
Greg Tucker
600b6d8d99 crc: Add new ecma_norm
Change-Id: I7747bfdca24bcd604c3eb118e7f1bcd98b2b6211
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
121bc635c9 crc: Add new jones_norm
Change-Id: I66118baeec2a1d63423c74edc3aa20a3e8955c6e
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
ed528bb2ad crc: Add new iso_norm
Change-Id: If0b05d1a1029b02842935c5c43966d81c59fbbca
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
ea4cbf0ffa crc: Add new ecma_refl
Change-Id: Ifef4f8c6ce7da328b0cc03040b17e7443febf44d
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
42bbc5a37e crc: Add new jones_refl
Change-Id: Ia4837b9125bce4e38ef6bae0a8c852d02e9b0bf2
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
5c546ecddf crc: Add new arch CRC
Change-Id: I31d3a7e61eeed9d13a0cadd6d1ed25b0dbb39415
Signed-off-by: Chunyang Hui <chunyang.hui@intel.com>
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-16 17:01:25 -07:00
Greg Tucker
7a28c83879 test: Increase size of crc tests and simplify output
Change-Id: Ia0418b7889e591a0164c335e273caff263cdf640
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-14 16:01:28 -07:00
Greg Tucker
ae3c91ab85 build: Set assembler feature level in std make
Also fix multibinary to try each available arch

Change-Id: Icd8496d169665bded478a33a02e739d1f8349b6f
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-14 16:01:28 -07:00
Roy Oursler
198b026a55 build: Add multi-binary checking for new arch
Change-Id: I8bb8d9e9ae28987ee583976871ff84ee205bdbdc
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
2019-09-14 16:01:28 -07:00
Roy Oursler
e4b8f164ae build: Setup as_feature_level
Change-Id: I7443058c577cf8eafe10acc2b2bfdfe76e2ce264
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
2019-09-14 16:01:28 -07:00
Roy Oursler
d3caab9c3a build: Avoid requiring AVX512 define when using dispatch functions
Change-Id: I76af2d6ab7eb61ae531bbc7427650d08737c20ab
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
2019-09-14 16:01:28 -07:00
Greg Tucker
1ba280fa09 igzip: Fix and clarify a few code issues in the cli tool
Fixes a few scan build hits. A few are false positives such as a missed free but
better to clarify the code in this case. Others such as calling no-null
functions are made explicit.

Change-Id: Icb001a2bf7024dbaa4b4c87089eda818de830c78
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2019-09-04 14:39:01 -07:00
Jerry Yu
5f45f3f310 igzip: Optimize adler32 with arm neon
Change-Id: I9b8932eb02ed6bc44756f6505e7efbfad1706b46
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-08-29 10:11:06 +08:00
Jerry Yu
a2005c1fd6 igzip: enable multibinary interfaces
- Add dispatcher layer
- Alias functions with assembly

Change-Id: I84da1be539d890db0df64e5ea989b2fd1f276949
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-08-29 10:08:58 +08:00
Jerry Yu
183385f02f multibinary: Add run-time cpu feature detect for aarch64
Some CPUs report an "illegal instruction" error in the crc test because
they do not support the relevant optional feature. This can be fixed by
introducing CPU feature detection for AArch64.

The difference from the x86 implementation is the dispatcher. It is based
on the glibc calls `getauxval(AT_HWCAP)` and `getauxval(AT_HWCAP2)`, not on
registers or instructions.

On a heterogeneous system (big.LITTLE), it is dangerous to detect CPU
features using identification registers. And while it is possible to use
architectural feature registers from userspace on recent kernels, this
won't necessarily work with older platforms. Thus we use the HW_CAPs
exported from the kernel (and visible in getauxval) as the solution.

- According to the kernel's recommendation, getauxval should be used for this purpose.
  - [CPU Feature detection](https://github.com/torvalds/linux/blob/master/Documentation/arm64/cpu-feature-registers.rst)
- According to the AAPCS, result/parameter registers should be saved/restored across function calls.
  - [AAPCS](http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf)
  - [GLibc](https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/aarch64/dl-trampoline.S)

Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
Change-Id: Ic9abe0d2268ac95537e1abf10acc642fc58a5054
2019-08-26 17:58:42 +08:00
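The detection approach described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dispatcher from the commit: the function names are hypothetical, and the choice of the `HWCAP_CRC32` and `HWCAP_PMULL` bits is an assumption about which features a CRC routine would need. On targets other than Linux AArch64 the sketch simply reports that the optimized path is unavailable.

```c
#include <assert.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32, HWCAP_PMULL */
#endif

/* Returns 1 if every bit in mask is set in hwcap, else 0. */
static int hwcap_has(unsigned long hwcap, unsigned long mask)
{
    assert(mask != 0);
    return (hwcap & mask) == mask;
}

/* Hypothetical dispatcher check: 1 if the CRC32 and PMULL instructions
 * may be used, 0 if the portable base implementation should be chosen. */
static int can_use_crc_pmull(void)
{
#if defined(__linux__) && defined(__aarch64__)
    /* The kernel reports what is usable on every core, which avoids
     * reading identification registers on big.LITTLE systems. */
    unsigned long hwcap = getauxval(AT_HWCAP);
    return hwcap_has(hwcap, HWCAP_CRC32 | HWCAP_PMULL);
#else
    return 0;
#endif
}
```

A dispatcher would typically run such a check once, on first call, and cache the chosen function pointer.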
Jerry Yu
0c22fcd3e2 build: fix compile break for unsupported CPUs
Building with Makefile.unx on unsupported CPUs fails. It reports
"undefined reference" errors. Fix it by adding the base alias files
to the sources list.

Change-Id: I9fbdeee7cb82edc9d5d8461bee3f648be83feaa6
Signed-off-by: Jerry Yu <jerry.h.yu@arm.com>
2019-08-23 17:28:22 +08:00