v2.30 Intel Intelligent Storage Acceleration Library Release Notes
==================================================================
RELEASE NOTE CONTENTS
1. KNOWN ISSUES
2. FIXED ISSUES
3. CHANGE LOG & FEATURES ADDED

1. KNOWN ISSUES
----------------
* Perf tests do not run in Windows environment.
* 32-bit lib is not supported in Windows.

2. FIXED ISSUES
---------------
v2.30
* Intel CET support.
* Windows nasm support fix.
v2.28
* Fix documentation on gf_vect_mad(). Min length was listed as 32 bytes instead
of the required minimum of 64 bytes.
v2.27
* Fix lack of install for pkg-config files.
v2.26
* Fixes for sanitizer warnings.
v2.25
* Fix for nasm on Mac OS X/darwin.
v2.24
* Fix for crc32_iscsi(). Potential read-over for small buffer. For an input
buffer length of less than 8 bytes and aligned to an 8 byte boundary, function
could read past length. Previously this could cause a seg fault only when
length 0 and an invalid buffer were passed. Calculated CRC is unchanged.
* Fix for compression/decompression of > 4GB files. For streaming compression
of extremely large files, the total_out parameter would wrap and could
potentially flag an otherwise valid lookback distance as being invalid.
The total_out field is still 32-bit for zlib compatibility. No inconsistent compressed
buffers were generated by the issue.
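
The wrap described above can be sketched in a few lines. This is only an
illustration, assuming a validity check that compares the lookback distance
with the running output count; the helper names are hypothetical and this is
not the ISA-L code.

```c
#include <stdint.h>

/* Hypothetical helper, not an ISA-L function: a lookback distance is valid
 * if it is nonzero and does not reach back past the bytes produced so far
 * (capped at the history window size). */
int lookback_is_valid(uint64_t bytes_out, uint32_t distance, uint32_t window)
{
    uint64_t available = bytes_out < window ? bytes_out : window;
    return distance != 0 && distance <= available;
}

/* Models a 32-bit total_out counter after `produced` bytes of output. */
uint32_t wrapped_total_out(uint64_t produced)
{
    return (uint32_t)produced;  /* wraps modulo 2^32 */
}
```

With 4 GiB + 100 bytes produced, the wrapped counter reads 100, so a check
driven by it would reject a distance of 1000 even though far more history is
available. Tracking the count in 64 bits internally is one way to avoid this.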
v2.23
* Fix for histogram generation base function.
* Fix library build warnings on macOS.
* Fix igzip to use bsf instruction when tzcnt is not available.
v2.22
* Fix ISA-L builds for other architectures. Base function and examples
sanitized for non-IA builds.
* Fix fuzz test script to work with llvm 6.0 builtin libFuzzer.
v2.20
* Inflate total_out behavior corrected for in-progress decompression.
Previously total_out represented the total bytes decompressed into the output
buffer or temp internal buffer. This is changed to be only the bytes put into
the output buffer.
* Fixed issue with isal_create_hufftables_subset. Affects semi-dynamic
compression use case when explicitly creating hufftables from histogram. The
_hufftables_subset function could fail to generate length symbols for any
lengths that were never seen.
v2.19
* Fix erasure code test that violates rs matrix bounds.
* Fix 0 length file and looping errors in igzip_inflate_test.
v2.18
* Mac OS X/darwin systems no longer require the --target=darwin config option.
The autoconf canonical build should detect the target automatically.
v2.17
* Fix igzip using 32K window and a shared object.
* Fix igzip undefined instruction error on Nehalem.
* Fixed issue in crc performance tests where OS optimizations turned cold cache
tests into warm tests.
v2.15
* Fix for windows register save in gf_6vect_mad_avx2.asm. Only affects windows
versions of ec_encode_data_update() running with AVX2. A GP register was not
properly restored resulting in corruption on return.
v2.14
* Building in unit directories is no longer supported, removing the issue of
leftover object files causing the top-level make build to fail.
v2.10
* Fix for windows register save overlap in gf_{3-6}vect_dot_prod_sse.asm. Only
affects windows versions of erasure code. GP register saves/restores were
pushed to the same stack area as XMM.

3. CHANGE LOG & FEATURES ADDED
------------------------------
v2.30
* Igzip compression enhancements.
- New functions for dictionary acceleration. Split dictionary processing and
resetting can greatly accelerate the performance of compressing many small
files with a dictionary.
- New static level 0 header decode tables. Accelerates decompressing small
files that are level 0 compressed by skipping the known header parsing.
- New feature for igzip cli tool: support for concatenated .gz files. On
decompression, igzip will process a series of independent, concatenated .gz
files into one output stream.
* CRC Improvements
- New vclmul version of crc32_iscsi().
- Updates for aarch64.
v2.29
* CRC Improvements
- New AVX512 vclmul versions of crc16_t10dif(), crc32_ieee(), crc32_gzip_refl().
* Erasure code improvements
- Added AVX512 ec functions with 5 and 6 outputs. Can improve performance for
codes with 5 or more parity by running in batches of up to 6 at a time.
v2.28
* New next-arch versions of 64-bit CRC. All norm and reflected 64-bit
polynomials are expanded to utilize vpclmulqdq.
v2.27
* New multi-threaded compression option for igzip cli tool.
v2.26
* Adler32 added to external API.
* Multi-arch improvements.
* Performance test improvements.
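
For readers unfamiliar with Adler-32, a portable reference version of the
checksum (as defined in RFC 1950) is sketched below. The function name is
hypothetical; the name and signature of the ISA-L API may differ, so consult
the library headers. The optimized library version is expected to produce the
same values.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u  /* largest prime below 2^16 */

/* Reference Adler-32. Pass 1 as `init` to start a new checksum, or a
 * previous return value to continue over the next chunk of data. */
uint32_t adler32_ref(uint32_t init, const unsigned char *buf, size_t len)
{
    uint32_t a = init & 0xffff;          /* running byte sum */
    uint32_t b = (init >> 16) & 0xffff;  /* running sum of the `a` values */

    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Because the state is carried in the return value, checksumming a buffer in
several pieces gives the same result as one call over the whole buffer.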
v2.25
* Igzip performance improvements and features.
- Performance improvements for incompressible files. Random or incompressible
files can be up to 3x faster in level 1 or 2 compression.
- Additional small file performance improvements.
- New options in igzip cli: use name from header or not, test compressed file.
* Multi-arch autoconf script.
- Autoconf should detect architecture and run base functions at minimum.
v2.24
* Igzip small file performance improvements and new features.
- Better performance on small files.
- New gzip/zlib header and trailer handling.
- New gzip/zlib header parsing helper functions.
- New user-space compression/decompression tool igzip.
* New mem unit added with first function isal_zero_detect().
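
What a zero-detect routine computes can be shown with a portable byte loop.
This is a sketch only: the function name is hypothetical, and the return
convention (0 when the region is all zeros, nonzero otherwise) is an
assumption to be checked against the ISA-L header.

```c
#include <stddef.h>

/* Hypothetical reference, not the ISA-L implementation.
 * Returns 0 if every byte in [mem, mem + len) is zero, 1 otherwise. */
int zero_detect_ref(const void *mem, size_t len)
{
    const unsigned char *p = mem;
    unsigned char acc = 0;

    /* OR all bytes together; any set bit makes the result nonzero. */
    for (size_t i = 0; i < len; i++)
        acc |= p[i];

    return acc != 0;
}
```

An optimized version would typically process wide words or vector registers
per iteration instead of single bytes.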
v2.23
* Igzip inflate (decompression) performance improvements.
- Implemented multi-byte decode for inflate. Decode can pack up to three
symbols into the decode table making some compressed streams decompress much
faster depending on the prevalence of short codes.
v2.22
* Igzip: AVX2 version of level 3 compression added.
* Erasure code examples
- New examples for standard EC encode and decode.
- Example of piggyback EC encode and decode.
v2.21
* Igzip improvements
- New compression levels added. ISA-L fast deflate now has more levels to
balance speed vs. target compression level. Levels 0 and 1 are as in previous
generations. New levels 2 & 3 target higher compression roughly comparable
to zlib levels 2-3. Level 3 is currently only optimized for processors with
AVX512 instructions.
* New T10dif & copy function - crc16_t10dif_copy()
- CRC and copy was added to emulate T10dif operations such as DIF insert and
strip. This function stitches together CRC and memcpy operations
eliminating an extra data read.
* CRC32 iscsi performance improvements
- Fixes issue under some distributions where warm cache performance was
reduced.
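
The idea of stitching CRC and copy together, noted above for
crc16_t10dif_copy(), can be sketched with a bitwise reference loop. The helper
names below are hypothetical and the code is not the ISA-L implementation; it
assumes the standard T10-DIF parameters (polynomial 0x8BB7, MSB first, initial
value 0).

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into a CRC16 T10-DIF value, one bit at a time. */
uint16_t crc16_t10dif_byte(uint16_t crc, uint8_t byte)
{
    crc ^= (uint16_t)(byte << 8);
    for (int bit = 0; bit < 8; bit++)
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
                             : (uint16_t)(crc << 1);
    return crc;
}

/* Copy src to dst and compute the CRC in the same pass, so each source
 * byte is read from memory only once. */
uint16_t crc16_copy_ref(uint16_t init_crc, uint8_t *dst,
                        const uint8_t *src, size_t len)
{
    uint16_t crc = init_crc;

    for (size_t i = 0; i < len; i++) {
        uint8_t b = src[i];  /* single read of the source byte */
        dst[i] = b;
        crc = crc16_t10dif_byte(crc, b);
    }
    return crc;
}
```

The result matches running a separate CRC pass followed by memcpy(); the
fused loop only removes the second read of the data.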
v2.20
* Igzip improvements
- Optimized deflate_hash in compression functions.
Improves performance of using preset dictionary.
- Removed alignment restrictions on input structure.
v2.19
* Igzip improvements
- Add optimized Adler-32 checksum.
- Implement zlib compression format.
- Add stateful dictionary support.
- Add struct reset functions for both deflate and inflate.
* Reflected IEEE format CRC32 is now released. The function is named
crc32_gzip_refl().
* The exact working conditions of the erasure code Reed-Solomon matrix are
determined by the newly added program gen_rs_matrix_limits.
v2.18
* New 2-pass fully-dynamic deflate compression (level 1). ISA-L fast deflate
now has two levels. Level 0 (default) is the same as previous generations.
Setting to level 1 will switch to the fully-dynamic compression that will
typically reach higher compression ratios.
* RAID AVX512 functions.
v2.17
* New fast decompression (inflate)
* Compression improvements (deflate)
- Speed and compression ratio improvements.
- Fast custom Huffman code generation.
- New features:
  * Run-time option of gzip crc calculation and headers/trailer.
  * Choice of static header (BTYPE 01) blocks.
  * LARGE_WINDOW, 32K history, now default.
  * Stateless full flush mode.
* CRC64
- Six new 64-bit polynomials supported. Normal and reflected versions of ECMA,
ISO and Jones polynomials.
v2.16
* Units added: crc, raid, igzip (deflate compression).
v2.15
* Erasure code updates. New AVX512 versions.
* Nasm support. ISA-L ported to build with nasm or yasm assembler.
* Windows DLL support. Windows builds DLL by default.
v2.14
* Autoconf and autotools build allows easier porting to additional systems.
Previous make system still available to embedded users with Makefile.unx.
* Includes update for building on Mac OS X/darwin systems. Add --target=darwin
to ./configure step.
v2.13
* Erasure code improvements
- 32-bit port of optimized gf_vect_dot_prod() functions. This makes
ec_encode_data() functions much faster on 32-bit processors.
- Avoton performance improvements. Performance on Avoton for
gf_vect_dot_prod() and ec_encode_data() can improve by as much as 20%.
v2.11
* Incremental erasure code. New functions added to erasure code to handle
single source update of code blocks. The function ec_encode_data_update()
works with parameters similar to ec_encode_data() but is called incrementally
with each source block. These versions are useful when source blocks are not
all available at once.
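
Incremental encoding works because each parity byte is a GF(2^8) sum (XOR) of
per-source products, so the products can be folded in one source block at a
time. The sketch below illustrates that equivalence with hypothetical helper
names and a single parity row; it assumes the field polynomial 0x11d and does
not use the ISA-L functions or their table-based interface.

```c
#include <stddef.h>
#include <stdint.h>

/* GF(2^8) multiply by shift-and-add, reducing by 0x11d (assumed polynomial). */
uint8_t gf_mul_ref(uint8_t a, uint8_t b)
{
    uint8_t p = 0;

    while (b) {
        if (b & 1)
            p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

/* All at once: parity[i] = XOR over j of coef[j] * src[j][i]. */
void encode_all(size_t len, int k, const uint8_t *coef,
                const uint8_t *const *src, uint8_t *parity)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t acc = 0;
        for (int j = 0; j < k; j++)
            acc ^= gf_mul_ref(coef[j], src[j][i]);
        parity[i] = acc;
    }
}

/* Incremental: fold one source block into the running parity.
 * The parity buffer must be zeroed before the first call. */
void encode_update(size_t len, uint8_t coef, const uint8_t *src,
                   uint8_t *parity)
{
    for (size_t i = 0; i < len; i++)
        parity[i] ^= gf_mul_ref(coef, src[i]);
}
```

Calling encode_update() once per source block, in any order, yields the same
parity as encode_all() over all blocks together.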
v2.10
* Erasure code updates
- New AVX and AVX2 support functions.
- Changes min len requirement on gf_vect_dot_prod() to 32 from 16.
- Tests include both source and parity recovery with ec_encode_data().
- New encoding examples with Vandermonde or Cauchy matrix.
v2.8
* First open release of erasure code unit that is part of ISA-L.