isa-l/Release_notes.txt

v2.31 Intel Intelligent Storage Acceleration Library Release Notes
==================================================================

RELEASE NOTE CONTENTS
1. KNOWN ISSUES
2. FIXED ISSUES
3. CHANGE LOG & FEATURES ADDED

1. KNOWN ISSUES
----------------

* Perf tests do not run in Windows environment.

* 32-bit lib is not supported in Windows.

* 32-bit lib is not validated.

2. FIXED ISSUES
---------------
v2.31

* Fixed various compilation issues/warnings for different platforms.
* Fixed documentation on xor/pq gen/check functions, with minimum
  number of vectors.
* Fixed potential out-of-bounds read on Adler32 Neon implementation.
* Fixed potential out-of-bounds read on gf_vect_mul Neon implementation.
* Fixed x86 load/store instructions in erasure coding functions (aligned moves
  that should be unaligned).
* Fixed memory leaks in unit tests.

v2.30

* Intel CET support.
* Windows nasm support fix.

v2.28

* Fix documentation on gf_vect_mad().  The minimum length was listed as 32
  bytes instead of the required minimum of 64 bytes.

v2.27

* Fix lack of install for pkg-config files.

v2.26

* Fixes for sanitizer warnings.

v2.25

* Fix for nasm on Mac OS X/darwin.

v2.24

* Fix for crc32_iscsi().  Potential read-over for small buffer.  For an input
  buffer length of less than 8 bytes and aligned to an 8 byte boundary, function
  could read past length.  Previously had the possibility to cause a seg fault
  only for length 0 and invalid buffer passed.  Calculated CRC is unchanged.

* Fix for compression/decompression of > 4GB files.  For streaming compression
  of extremely large files, the total_out parameter would wrap and could
  potentially flag an otherwise valid lookback distance as being invalid.
  Total_out is still 32bit for zlib compatibility.  No inconsistent compressed
  buffers were generated by the issue.
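
  The wrap described above can be modelled in a few lines.  The following is a
  simplified Python sketch of the arithmetic only, not ISA-L code: the two
  check functions and the 32 KiB window constant are hypothetical stand-ins
  used to show how a counter that has wrapped past 4 GB can make a valid
  lookback distance look invalid.

```python
MASK32 = 0xFFFFFFFF  # total_out is a 32-bit counter for zlib compatibility
WINDOW = 32 * 1024   # deflate allows lookback distances up to 32 KiB


def wrapped_total_out(true_total: int) -> int:
    """Value a 32-bit counter holds after true_total bytes were written."""
    return true_total & MASK32


def naive_distance_ok(distance: int, total_out: int) -> bool:
    """Hypothetical check that trusts the wrapped counter as history size."""
    return distance <= min(total_out, WINDOW)


def wrap_aware_distance_ok(distance: int, history_len: int) -> bool:
    """Hypothetical check that uses the real history length instead."""
    return distance <= min(history_len, WINDOW)


true_total = 2**32 + 100  # just past the 4 GB boundary
counter = wrapped_total_out(true_total)

print(counter)                                   # 100
print(naive_distance_ok(1000, counter))          # False: valid match rejected
print(wrap_aware_distance_ok(1000, true_total))  # True
```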

v2.23

* Fix for histogram generation base function.
* Fix library build warnings on macOS.
* Fix igzip to use bsf instruction when tzcnt is not available.

v2.22

* Fix ISA-L builds for other architectures.  Base function and examples
  sanitized for non-IA builds.

* Fix fuzz test script to work with llvm 6.0 builtin libFuzzer.

v2.20

* Inflate total_out behavior corrected for in-progress decompression.
  Previously total_out represented the total bytes decompressed into the output
  buffer or temp internal buffer.  This is changed to be only the bytes put into
  the output buffer.

* Fixed issue with isal_create_hufftables_subset.  Affects semi-dynamic
  compression use case when explicitly creating hufftables from histogram.  The
  _hufftables_subset function could fail to generate length symbols for any
  lengths that were never seen.

v2.19

* Fix erasure code test that violates rs matrix bounds.

* Fix 0 length file and looping errors in igzip_inflate_test.

v2.18

* Mac OS X/darwin systems no longer require the --target=darwin config option.
  The autoconf canonical build should detect the target automatically.

v2.17

* Fix igzip using 32K window and a shared object.

* Fix igzip undefined instruction error on Nehalem.

* Fixed issue in crc performance tests where OS optimizations turned cold cache
  tests into warm tests.

v2.15

* Fix for Windows register save in gf_6vect_mad_avx2.asm.  Only affects Windows
  versions of ec_encode_data_update() running with AVX2.  A GP register was not
  properly restored, resulting in corruption on return.

v2.14

* Building in unit directories is no longer supported, removing the issue of
  leftover object files causing the top-level make build to fail.

v2.10

* Fix for Windows register save overlap in gf_{3-6}vect_dot_prod_sse.asm. Only
  affects Windows versions of erasure code.  GP register saves/restores were
  pushed to the same stack area as the XMM registers.

3. CHANGE LOG & FEATURES ADDED
------------------------------

v2.31

* API changes:
  - gf_vect_mul_base() function now returns an integer, matching the return type
    of gf_vect_mul() function (not a breaking change).

* Igzip compression improvements:
  - Added compress/decompress with dictionary to perf test app.
  - Zlib header can now be created on the fly when starting the compression.
  - Added isal_zlib_hdr_init() function to initialize the zlib header to 0.

* Zero-memory detection improvements:
  - Optimized AVX implementation.
  - Added new AVX2 and AVX512 implementations.

* Erasure coding improvements:
  - Added new AVX512 and AVX2 implementations using GFNI instructions.
  - Added new SVE implementation.

* CRC improvements:
  - Added new CRC64 Rocksoft algorithm.
  - CRC x86 implementations optimized using ternary logic instructions and
    folding of bigger data on the last bytes.
  - CRC16 T10dif aarch64 implementation improved.
  - CRC aarch64 implementations optimized using XOR fusion feature.

* Documentation:
  - Added function overview documentation page.
  - Added security file.

* Performance apps:
  - Changed performance tests to warm by default.

* Example apps:
  - Added CRC combine example `crc_combine_example` for multiple polynomials.
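
  As an illustration of what combining CRCs means, the sketch below uses
  Python's zlib module rather than ISA-L, and it does not reproduce the shipped
  crc_combine_example.  It covers only the reflected IEEE (gzip) CRC32
  polynomial.  The helper crc32_combine() is hypothetical and runs in time
  proportional to the length of the second block, whereas production combine
  routines typically use a faster method.

```python
import zlib

ALL_ONES = 0xFFFFFFFF


def crc32_combine(crc_a: int, crc_b: int, len_b: int) -> int:
    """Return crc32(A + B) given crc32(A), crc32(B) and len(B).

    Feeding len_b zero bytes through the CRC register advances crc_a past
    block B.  The XORs with ALL_ONES cancel zlib's pre- and post-inversion
    so that only the linear register update is applied to crc_a.
    """
    shifted = zlib.crc32(bytes(len_b), crc_a ^ ALL_ONES) ^ ALL_ONES
    return shifted ^ crc_b


a = b"intelligent storage "
b = b"acceleration library"

# The CRCs of the two blocks can be computed independently, e.g. in parallel.
combined = crc32_combine(zlib.crc32(a), zlib.crc32(b), len(b))

print(combined == zlib.crc32(a + b))  # True
```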

v2.30

* Igzip compression enhancements.
  - New functions for dictionary acceleration. Split dictionary processing and
    resetting can greatly accelerate the performance of compressing many small
    files with a dictionary.
  - New static level 0 header decode tables. Accelerates decompressing small
    files that are level 0 compressed by skipping the known header parsing.
  - New feature for igzip cli tool: support for concatenated .gz files. On
    decompression, igzip will process a series of independent, concatenated .gz
    files into one output stream.

* CRC Improvements
  - New vclmul version of crc32_iscsi().
  - Updates for aarch64.
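
  The concatenated .gz behaviour noted above for the igzip cli tool can be
  pictured with Python's gzip module, used here purely as a stand-in for
  illustration.  Each member is an independent gzip stream with its own header
  and trailer, and decompression yields a single output stream.

```python
import gzip

# Two independent gzip members, each with its own header and trailer.
member_1 = gzip.compress(b"first file\n")
member_2 = gzip.compress(b"second file\n")

# Concatenating the members yields a valid multi-member .gz stream.
concatenated = member_1 + member_2

# gzip.decompress() reads every member and joins the results in order.
output = gzip.decompress(concatenated)
print(output)  # b'first file\nsecond file\n'
```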

v2.29

* CRC Improvements
  - New AVX512 vclmul versions of crc16_t10dif(), crc32_ieee(),
    crc32_gzip_refl().

* Erasure code improvements
  - Added AVX512 ec functions with 5 and 6 outputs. Can improve performance for
    codes with 5 or more parity by running in batches of up to 6 at a time.

v2.28

* New next-arch versions of 64-bit CRC. All norm and reflected 64-bit
  polynomials are expanded to utilize vpclmulqdq.

v2.27

* New multi-threaded compression option for igzip cli tool.

v2.26

* Adler32 added to external API.
* Multi-arch improvements.
* Performance test improvements.

v2.25

* Igzip performance improvements and features.
  - Performance improvements for incompressible files. Random or incompressible
    files can be up to 3x faster in level 1 or 2 compression.
  - Additional small file performance improvements.
  - New options in igzip cli: use name from header or not, test compressed file.

* Multi-arch autoconf script.
  - Autoconf should detect architecture and run base functions at minimum.

v2.24

* Igzip small file performance improvements and new features.
  - Better performance on small files.
  - New gzip/zlib header and trailer handling.
  - New gzip/zlib header parsing helper functions.
  - New user-space compression/decompression tool igzip.

* New mem unit added with first function isal_zero_detect().

v2.23

* Igzip inflate (decompression) performance improvements.
  - Implemented multi-byte decode for inflate.  Decode can pack up to three
    symbols into the decode table making some compressed streams decompress much
    faster depending on the prevalence of short codes.

v2.22

* Igzip: AVX2 version of level 3 compression added.

* Erasure code examples
  - New examples for standard EC encode and decode.
  - Example of piggyback EC encode and decode.

v2.21

* Igzip improvements
  - New compression levels added.  ISA-L fast deflate now has more levels to
    balance speed vs. target compression level.  Level 0, 1 are as in previous
    generations.  New levels 2 & 3 target higher compression roughly comparable
    to zlib levels 2-3.  Level 3 is currently only optimized for processors with
    AVX512 instructions.

* New T10dif & copy function - crc16_t10dif_copy()
  - CRC and copy was added to emulate T10dif operations such as DIF insert and
    strip.  This function stitches together CRC and memcpy operations
    eliminating an extra data read.

* CRC32 iscsi performance improvements
  - Fixes issue under some distributions where warm cache performance was
    reduced.
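
  The idea of stitching CRC and copy together can be sketched as a single loop
  that touches each source byte once.  This is a bit-at-a-time Python model
  for illustration only; the function name crc16_t10dif_copy_sketch() and its
  argument order are assumptions, and the optimized library routine works very
  differently internally.

```python
T10DIF_POLY = 0x8BB7  # CRC-16/T10-DIF generator polynomial


def crc16_t10dif_copy_sketch(init_crc: int, dst: bytearray, src: bytes) -> int:
    """Copy src into dst and return the T10-DIF CRC16 of src in one pass."""
    crc = init_crc & 0xFFFF
    for i, byte in enumerate(src):
        dst[i] = byte     # the memcpy half: store the byte just read
        crc ^= byte << 8  # the CRC half: fold the same byte in, MSB first
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


src = b"123456789"
dst = bytearray(len(src))
crc = crc16_t10dif_copy_sketch(0, dst, src)

print(hex(crc))           # 0xd0db, the standard CRC-16/T10-DIF check value
print(bytes(dst) == src)  # True
```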

v2.20

* Igzip improvements
  - Optimized deflate_hash in compression functions.
    Improves performance of using preset dictionary.
  - Removed alignment restrictions on input structure.
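
  For readers unfamiliar with preset dictionaries, the sketch below shows the
  general deflate mechanism using Python's zlib module, not the igzip API.  The
  dictionary and payload contents are made up for the example.  Priming the
  history with expected byte strings lets even a very small input be encoded
  mostly as matches.

```python
import zlib

# Hypothetical preset dictionary: byte strings expected to recur in the data.
dictionary = b'{"name": "", "status": "active", "role": "admin"}'
payload = b'{"name": "bob", "status": "active", "role": "admin"}'

# Compress with the dictionary primed into the deflate history.
comp = zlib.compressobj(zdict=dictionary)
with_dict = comp.compress(payload) + comp.flush()

# The decompressor must be given the same dictionary.
decomp = zlib.decompressobj(zdict=dictionary)
restored = decomp.decompress(with_dict)

# Baseline for comparison: the same payload without a dictionary.
without_dict = zlib.compress(payload)

print(restored == payload)                 # True
print(len(with_dict) < len(without_dict))  # True
```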

v2.19

* Igzip improvements

  - Add optimized Adler-32 checksum.

  - Implement zlib compression format.

  - Add stateful dictionary support.

  - Add struct reset functions for both deflate and inflate.

* Reflected IEEE format CRC32 is released.  The function is named
  crc32_gzip_refl().

* The exact working conditions of the erasure code Reed-Solomon matrix are
  determined by the newly added program gen_rs_matrix_limits.

v2.18

* New 2-pass fully-dynamic deflate compression (level 1).  ISA-L fast deflate
  now has two levels.  Level 0 (default) is the same as previous generations.
  Setting to level 1 will switch to the fully-dynamic compression that will
  typically reach higher compression ratios.

* RAID AVX512 functions.

v2.17

* New fast decompression (inflate)

* Compression improvements (deflate)
  - Speed and compression ratio improvements.
  - Fast custom Huffman code generation.
  - New features:
    * Run-time option of gzip crc calculation and headers/trailer.
    * Choice of static header (BTYPE 01) blocks.
    * LARGE_WINDOW, 32K history, now default.
    * Stateless full flush mode.

* CRC64
  - Six new 64-bit polynomials supported. Normal and reflected versions of ECMA,
    ISO and Jones polynomials.

v2.16

* Units added: crc, raid, igzip (deflate compression).

v2.15

* Erasure code updates. New AVX512 versions.

* Nasm support.  ISA-L ported to build with nasm or yasm assembler.

* Windows DLL support.  Windows builds DLL by default.

v2.14

* Autoconf and autotools build allows easier porting to additional systems.
  Previous make system still available to embedded users with Makefile.unx.

* Includes update for building on Mac OS X/darwin systems. Add --target=darwin
  to ./configure step.

v2.13

* Erasure code improvements
  - 32-bit port of optimized gf_vect_dot_prod() functions.  This makes
    ec_encode_data() functions much faster on 32-bit processors.
  - Avoton performance improvements.  Performance on Avoton for
    gf_vect_dot_prod() and ec_encode_data() can improve by as much as 20%.

v2.11

* Incremental erasure code.  New functions added to erasure code to handle
  single source update of code blocks.  The function ec_encode_data_update()
  takes parameters similar to ec_encode_data() but is called incrementally,
  once for each source block.  It is useful when source blocks are not all
  available at once.
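
  Incremental encoding works because each parity block is a linear combination
  of the source blocks over GF(2^8), so contributions can be accumulated one
  source at a time in any order.  The Python sketch below illustrates that
  property only.  The helper names, the coefficient matrix and the field
  polynomial 0x11D are assumptions for the example and do not mirror the
  library's function signatures.

```python
GF_POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return result


def encode_update(coeffs, j, src, parity):
    """Incremental encode: fold source block j into the running parity."""
    for p, row in enumerate(coeffs):
        for i, byte in enumerate(src):
            parity[p][i] ^= gf_mul(row[j], byte)


def encode_all(coeffs, sources):
    """Full encode: every source block is available at once."""
    parity = [bytearray(len(sources[0])) for _ in coeffs]
    for p, row in enumerate(coeffs):
        for j, src in enumerate(sources):
            for i, byte in enumerate(src):
                parity[p][i] ^= gf_mul(row[j], byte)
    return parity


coeffs = [[1, 1, 1], [1, 2, 4]]  # hypothetical 2-parity, 3-source matrix
sources = [b"abcd", b"efgh", b"ijkl"]

full = encode_all(coeffs, sources)

running = [bytearray(4) for _ in coeffs]  # parity starts zeroed
for j in (2, 0, 1):                       # blocks may arrive in any order
    encode_update(coeffs, j, sources[j], running)

print(running == full)  # True
```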

v2.10

* Erasure code updates
  - New AVX and AVX2 support functions.
  - Changes min len requirement on gf_vect_dot_prod() to 32 from 16.
  - Tests include both source and parity recovery with ec_encode_data().
  - New encoding examples with Vandermonde or Cauchy matrix.

v2.8

* First open release of erasure code unit that is part of ISA-L.