Geza Lore 5eefd3ebfd Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version that is about 80%
faster than the C implementation. This gives a 2-4% overall encode
speed-up, depending on encoding parameters. The AVX version relies
on three properties of the cost function lookup tables constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (both components use the same cost table)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (the cost function is even)
These must hold; otherwise the AVX version of the function cannot be used.
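
The three conditions above can be checked at run time before choosing the
AVX path. The following is a minimal sketch only, not code from this change:
the helper name 'sad_cost_tables_allow_avx' and the 'max_abs' bound are
hypothetical, and it assumes each component table pointer refers to the
middle of its array so that indices in [-max_abs, max_abs] are valid.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libvpx: returns true when the SAD cost
 * tables satisfy the three properties listed in the commit message.
 *
 * joint_cost - the four joint costs (mvjointsadcost in the message)
 * comp_cost  - two per-component tables (mvsadcost in the message), each
 *              ASSUMED to point at the centre of its array so that
 *              comp_cost[c][-max_abs .. max_abs] is readable
 * max_abs    - largest absolute index to check (an assumption of this sketch)
 */
static bool sad_cost_tables_allow_avx(const int joint_cost[4],
                                      const int *const comp_cost[2],
                                      int max_abs) {
  /* Property 1: all non-zero joint types share one cost. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3])
    return false;

  for (int i = 0; i <= max_abs; ++i) {
    /* Property 2: both components cost the same, for positive and
     * negative indices alike. */
    if (comp_cost[0][i] != comp_cost[1][i]) return false;
    if (comp_cost[0][-i] != comp_cost[1][-i]) return false;
    /* Property 3: the cost function is even. */
    if (comp_cost[0][i] != comp_cost[0][-i]) return false;
  }
  return true;
}
```

A caller could fall back to the C implementation whenever this check
returns false.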

Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
2015-11-11 14:03:47 +00:00