The performance gain of idct16x16_10_add_sse2 function is not
noticeable. However since both functions use the IDCT16_1D,
idct16x16_10_add_sse2 should be modified as well.
Tested with: park_joy_420_720p50.y4m
Change-Id: I02b957e36fcf997c677d15baf496533895271bff