use new horizontal mmx scaler instead of old x86 asm if mmx2 can't be used (FAST_BILINEAR only)
fixed overflow in init function ... using double precision fp now :)
using C scaler for the last 1-2 lines if there is a chance of writing past the end of the dst array
Originally committed as revision 3353 to svn://svn.mplayerhq.hu/mplayer/trunk/postproc