igzip: fix neon adler32 load beyond buffer end

In the adler32_neon function, during the last iteration of the
loop through "accum32_neon", we would load data after the end of the
buffer (in the ld1 instruction, the "start" register points to the end
of the buffer).

If this memory is unmapped, the load causes a segfault. If the memory
is mapped, the checksum is still correct, because the loaded value would
only be consumed by a following iteration, and after the last iteration
there is none.

To fix this, we can simply do the load before incrementing "start". And
while we're at it, we can load directly into d0_v/d1_v, saving a couple
of mov's.

Finally, the ld1 done during the function initialization can be removed
as the values aren't used for anything.
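
The two loop shapes can be sketched in scalar C. This is only an
illustration of the access pattern, not the NEON code: the function names,
load_block(), and the read_end bookkeeping are hypothetical, only whole
32-byte blocks are processed (the tail handling is left out), and the
caller of the old-style variant must supply 32 readable bytes past len so
the over-read can be observed safely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 32

/* Hypothetical stand-in for the 32-byte ld1: copies one block and records
 * the furthest offset read so far. */
static void load_block(uint8_t dst[BLOCK], const uint8_t *buf, size_t off,
                       size_t *read_end)
{
    memcpy(dst, buf + off, BLOCK);
    if (off + BLOCK > *read_end)
        *read_end = off + BLOCK;
}

/* Plain scalar Adler-32 update over one block. */
static void accumulate(uint32_t *a, uint32_t *b, const uint8_t blk[BLOCK])
{
    for (int i = 0; i < BLOCK; i++) {
        *a = (*a + blk[i]) % 65521u;
        *b = (*b + *a) % 65521u;
    }
}

/* Old shape: preload one block, then load the NEXT block in every pass.
 * In the last pass, start already equals the end of the data. */
uint32_t adler32_preload(const uint8_t *buf, size_t len, size_t *read_end)
{
    uint32_t a = 1, b = 0;
    uint8_t back[BLOCK], cur[BLOCK];
    size_t start = 0;

    *read_end = 0;
    if (len < BLOCK)
        return (b << 16) | a;
    load_block(back, buf, start, read_end);     /* load during init */
    while (start + BLOCK <= len) {
        start += BLOCK;
        memcpy(cur, back, BLOCK);               /* the two mov's */
        load_block(back, buf, start, read_end); /* reads past len at the end */
        accumulate(&a, &b, cur);
    }
    return (b << 16) | a;
}

/* New shape: load the current block first, then advance start. */
uint32_t adler32_load_first(const uint8_t *buf, size_t len, size_t *read_end)
{
    uint32_t a = 1, b = 0;
    uint8_t cur[BLOCK];
    size_t start = 0;

    *read_end = 0;
    while (start + BLOCK <= len) {
        load_block(cur, buf, start, read_end);
        start += BLOCK;
        accumulate(&a, &b, cur);
    }
    return (b << 16) | a;
}
```

For a 64-byte input placed in a 96-byte allocation, both functions return
the same checksum, but the first reports read_end == 96 while the second
reports read_end == 64.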

Change-Id: I4a0f2811adc523852ebe774da0a6fb1f5419192f
Signed-off-by: Martin Oliveira <martin.oliveira@eideticom.com>
Martin Oliveira 2022-04-20 15:57:10 -06:00 committed by Greg Tucker
parent 5b1a519ffc
commit 8b7c1b80b2

@@ -70,8 +70,6 @@ local variables
 declare_var_vector_reg s2acc , 3
 declare_var_vector_reg zero , 16
 declare_var_vector_reg adler , 17
-declare_var_vector_reg back_d0 , 18
-declare_var_vector_reg back_d1 , 19
 declare_var_vector_reg sum2 , 20
 declare_var_vector_reg tmp2 , 20
@@ -100,7 +98,6 @@ local variables
 add end,start,length
 cbz loop_cnt,final_accum32
-ld1 {back_d0_v.16b-back_d1_v.16b},[start]
 mov loop_const,173
 movi v16.4s,0
@@ -118,10 +115,8 @@ great_than_32:
 add tmp_x,start,loop_const,lsl 5
 accum32_neon:
+ld1 {d0_v.16b-d1_v.16b},[start]
 add start,start,32
-mov d0_v.16b,back_d0_v.16b
-mov d1_v.16b,back_d1_v.16b
-ld1 {back_d0_v.16b-back_d1_v.16b},[start]
 shl tmp2_v.4s,adacc_v.4s,5
 add s2acc_v.4s,s2acc_v.4s,tmp2_v.4s