vpx/vp9 at 5215b83aead928f66d9133e846ede6fd1b52aa89 - vpx

A.Mahfoodh 5215b83aea Simplifying and inlining k_cvtlo_epi16 and k_cvthi_epi16

Simplify k_cvtlo_epi16 and k_cvthi_epi16 to only two
instructions each, then inline them.

Quoting from Intel's MMX_App_Compute_16bit_Vector.pdf:
"The PMADDWD instruction multiplies four
pairs of 16-bit numbers and produces partial sums of the results
and can do so once per clock (with a three-clock latency)."
So I am assuming that there will be a three-clock overhead after the
last _mm_madd_pi16 command.
Even with that overhead, the total number of clocks should generally
be smaller. I am not sure, though, because I could not find information
about the number of clocks required for the instructions in
k_cvtlo_epi16 and k_cvthi_epi16. I will run a test and compare the
execution time.

Change-Id: Ieda4aa338f69ad3dd196ac6e7892da3cf1b47ea7
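
The commit message does not show the resulting code. As a rough illustration only, here is one way a signed 16-bit to 32-bit widening can be done in two instructions: an unpack against zero followed by PMADDWD against a vector of ones. This is a sketch under assumptions, not the code from this commit. It assumes SSE2 and uses _mm_madd_epi16, the 128-bit form of PMADDWD, whereas the message names _mm_madd_pi16. The helper names widen_lo_epi16, widen_hi_epi16 and widen_epi16 are hypothetical.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helpers for illustration; names are not from libvpx. */

/* Sign-extend the lower four signed 16-bit lanes of a to 32 bits. */
static inline __m128i widen_lo_epi16(__m128i a) {
  const __m128i zero = _mm_setzero_si128();
  const __m128i one = _mm_set1_epi16(1);
  /* Interleave with zero: each 32-bit lane holds the 16-bit pair (a[i], 0). */
  const __m128i pairs = _mm_unpacklo_epi16(a, zero);
  /* PMADDWD: a[i] * 1 + 0 * 1, summed as a signed 32-bit value. */
  return _mm_madd_epi16(pairs, one);
}

/* Sign-extend the upper four signed 16-bit lanes of a to 32 bits. */
static inline __m128i widen_hi_epi16(__m128i a) {
  const __m128i zero = _mm_setzero_si128();
  const __m128i one = _mm_set1_epi16(1);
  const __m128i pairs = _mm_unpackhi_epi16(a, zero);
  return _mm_madd_epi16(pairs, one);
}

/* Array wrapper so the result is easy to inspect from scalar code. */
static void widen_epi16(const int16_t in[8], int32_t out[8]) {
  const __m128i a = _mm_loadu_si128((const __m128i *)in);
  _mm_storeu_si128((__m128i *)(out + 0), widen_lo_epi16(a));
  _mm_storeu_si128((__m128i *)(out + 4), widen_hi_epi16(a));
}
```

Because the multiply treats each 16-bit lane as signed, the sign extension comes from the multiply itself and needs no separate mask or compare step. Whether this is actually faster is the timing question the message leaves open.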

2013-10-02 20:02:03 -04:00

common

BITSTREAM - CLARIFICATION OF MV SIZE RANGE

2013-10-02 10:29:45 -07:00

decoder

Merge "Removing memset calls inside idct/iht functions."

2013-10-02 12:45:27 -07:00

encoder

Simplifying and inlining k_cvtlo_epi16 and k_cvthi_epi16

2013-10-02 20:02:03 -04:00

exports_dec

support building vp8 and vp9 into a single lib