Commit Graph

21 Commits

Author SHA1 Message Date
Roy Oursler
5a7c97a930 igzip: Document inflate_huff_code data structure.
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:52:56 -07:00
Roy Oursler
40b5104397 igzip: More optimizations by speeding up rarely taken branch
For some reason optimizing the rarely taken branch speeds up the program.

Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:51:22 -07:00
Roy Oursler
84ffaead82 igzip: Optimize inflate more
Some general optimizations, including speculatively loading the distance huffman
code and interleaving inflate_in_load with other instructions.

Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:48:47 -07:00
Roy Oursler
6fc9029f83 inflate: Optimize inflate_stateless
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:47:38 -07:00
Roy Oursler
c233946b0a igzip: Add some compile time options to igzip_inflate_perf
Allow for running igzip_inflate_perf with zlib -1, zlib -9, and igzip
compression. Also add a parameter to allow comparing igzip inflate's performance
to zlib inflate's performance.

Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:47:01 -07:00
Roy Oursler
09a5a243bf igzip: Implement statelesss inflate in assembly
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 12:45:53 -07:00
Roy Oursler
17dac9f641 igzip: Implement xmm loads to preload data in update_histogram
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 11:07:28 -07:00
Roy Oursler
4d40cd360d igzip: Speed up hash table initialization in igzip_update_histogram.
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 11:02:23 -07:00
Roy Oursler
39ce31de30 igzip: Fix minor bug in igzip_rand_test
Stop igzip_rand_test from trying to free a null pointer.

Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 11:01:40 -07:00
Roy Oursler
af9c0c0f46 igzip: update update_hash to match stateless and stateful compression.
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-05 11:01:01 -07:00
Roy Oursler
f97de75fc6 igzip: Port improvements to stateless compress to stateful compress
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 18:40:27 -07:00
Roy Oursler
eb1b7788d0 igzip: Move code in igzip_stateless to hide latencies more in ivybridge.
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:23:29 -07:00
Roy Oursler
cf30138c7b igzip: Load memory into xmm in stateless registers
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:22:52 -07:00
Roy Oursler
4d1fe78bfa igzip: Modify stateless hash updating to match the length 4 short match
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:19:15 -07:00
Roy Oursler
31814483c0 igzip: Create assembly version of isal_update_histogram
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:16:55 -07:00
Roy Oursler
7c91df5e50 igzip: add macros for shrx, shlx, and bzhi instructions on old architectures
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:11:06 -07:00
Roy Oursler
bbc886cf01 igzip: Modify igzip to ignore matches which are shorter than 4
Signed-off-by: Roy Oursler <roy.j.oursler@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:07:45 -07:00
Greg Tucker
45311ea249 Change common and igzip multibinary check from sse4.1 to sse4.2
SSE optimized compression function is using crc32 instruction that is in SSE 4.2
not SSE4.1 as stated.  Fixes incorect choice on core 2 duo that is SSE 4.1 only.

Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 17:05:23 -07:00
Xiaodong Liu
b34cb054fd build: Fix an include path to be srcdir relative
Allows configure to again build in an external directory.  When building ISAL in
an external path, assembler or compiler needs relative include paths.

Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Greg Tucker <greg.b.tucker@intel.com>
2016-12-02 16:54:40 -07:00
Greg Tucker
1abf68b7db igzip: Fix missing .note.gnu-stack for non-executable stack
One asm file in the source list for compression missed an include that adds
.note.gnu-stack.  Without it an executable could be marked has having an
executable stack and miss some hardware protections.

Reported-by: Ondřej Nový <novy@ondrej.org>
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2016-07-11 14:42:25 -07:00
Greg Tucker
660f49b02d Add data compression unit
Include fast DEFLATE compatable compression functions.

Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
2016-06-10 17:03:38 -07:00