022323bf85
the previous commit laid the groundwork by doing two sets of idcts together. this one takes that further by grouping the interesting data (q[0], q+16[0]) together so that wider instructions can be used. it also drops a few instructions by recognizing that the constant for sinpi8sqrt2 can always be pre-shifted down, which avoids a downshift as well as the workarounds for a function that only accepts signed data. looks like a modest performance gain: at qcif, went from ~180 fps to ~183. Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf |
||
---|---|---|
.. | ||
dboolhuff_neon.asm | ||
dequant_idct_neon.asm | ||
dequantizeb_neon.asm | ||
idct_blk_neon.c | ||
idct_dequant_0_2x_neon.asm | ||
idct_dequant_dc_0_2x_neon.asm | ||
idct_dequant_dc_full_2x_neon.asm | ||
idct_dequant_full_2x_neon.asm |
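
The sinpi8sqrt2 remark in the commit message can be checked with a small scalar sketch. This is not the code from the `.asm` files. It assumes the constant is 35468 (the value used by the VP8 reference IDCT, which does not fit in a signed 16-bit lane) and that the "function which only accepted signed data" is a signed doubling multiply-high such as NEON `vqdmulh.s16`. The "workaround" variant is a guess at what the older code had to do, and all function names here are hypothetical.

```c
#include <assert.h>  /* only needed if you assert on count_mismatches() */
#include <stdint.h>

/* Scalar model of a signed saturating doubling multiply-high:
 * (2*a*b) >> 16, which equals (a*b) >> 15.
 * Assumes arithmetic right shift of negative values. */
static int16_t qdmulh_s16(int16_t a, int16_t b) {
    if (a == INT16_MIN && b == INT16_MIN) return INT16_MAX; /* only overflow case */
    return (int16_t)(((int32_t)a * (int32_t)b) >> 15);
}

/* Reference result: (x * 35468) >> 16 computed in 32 bits. */
static int16_t sinpi8sqrt2_ref(int16_t x) {
    return (int16_t)(((int32_t)x * 35468) >> 16);
}

/* Assumed older form: 35468 wraps to -30068 in a signed lane, so the
 * product is shifted down once more and x is added back to compensate. */
static int16_t sinpi8sqrt2_workaround(int16_t x) {
    int16_t c = (int16_t)(35468 - 65536);  /* -30068 */
    int16_t t = qdmulh_s16(x, c);          /* (x * -30068) >> 15 */
    t = (int16_t)(t >> 1);                 /* the extra downshift */
    return (int16_t)(t + x);               /* undo the wrap-around */
}

/* Pre-shifted form: 35468 is even, so 35468 >> 1 = 17734 loses nothing,
 * fits in a signed lane, and the doubling restores the factor of two. */
static int16_t sinpi8sqrt2_preshifted(int16_t x) {
    return qdmulh_s16(x, 35468 >> 1);
}

/* Compare both forms against the reference for every 16-bit input. */
static int count_mismatches(void) {
    int bad = 0;
    for (int32_t x = INT16_MIN; x <= INT16_MAX; ++x) {
        int16_t r = sinpi8sqrt2_ref((int16_t)x);
        if (sinpi8sqrt2_workaround((int16_t)x) != r) ++bad;
        if (sinpi8sqrt2_preshifted((int16_t)x) != r) ++bad;
    }
    return bad;
}
```

Under these assumptions the pre-shifted form replaces a multiply, shift, and add with a single multiply, which matches the "drop a few instructions" claim in the message.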