Geza Lore f1342a7b07 Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version which is about 80%
faster than the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
relies on three properties of the cost function lookup tables, which
are constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (equal per-component cost)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (cost function is even)
These must hold; otherwise the AVX version of the function cannot be used.
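
The three properties above can be checked mechanically before selecting the
AVX path. The following is only a minimal sketch, not code from this change:
the helper name 'sad_cost_tables_allow_avx', its parameters, and the
'max_abs_mv' bound are hypothetical, and it assumes each component cost
pointer is centered so that negative indices are valid.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libvpx: returns true when the cost
 * tables satisfy the three properties the AVX version depends on.
 * Assumption: comp_cost0 and comp_cost1 point at the entry for index 0
 * of their tables, so indices in [-max_abs_mv, max_abs_mv] are valid. */
static bool sad_cost_tables_allow_avx(const int joint_cost[4],
                                      const int *comp_cost0,
                                      const int *comp_cost1,
                                      int max_abs_mv) {
  /* Joint cost: entries 1, 2 and 3 must all be equal. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3]) {
    return false;
  }
  for (int i = -max_abs_mv; i <= max_abs_mv; ++i) {
    /* Equal per-component cost: both components share one table. */
    if (comp_cost0[i] != comp_cost1[i]) return false;
    /* Even cost function: the cost of i equals the cost of -i. */
    if (comp_cost0[i] != comp_cost0[-i]) return false;
  }
  return true;
}
```

A caller could run such a check once after the tables are built and fall back
to the C implementation if it returns false.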

Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
2015-11-05 10:02:17 +00:00