ba843a92e7
* prevent 64bit overflow by controlling the 32b->64b conversions and preventively descaling by 8bit before the final multiply * adjust the threshold constants C1 and C2 to de-emphasis the dark areas * use a hat-like filter instead of box-filtering to avoid blockiness during averaging SSIM distortion calc is actually *faster* now in SSE2, because of the unrolling during the function rewrite. The C-version is quite slower because still un-optimized. Change-Id: I96e2715827f79d26faae354cc28c7406c6800c90