James Almer
b7863c972c
x86/hevc_mc: use fewer instructions in hevc_put_hevc_{uni, bi}_w[24]_{8, 10, 12}
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:47:15 +02:00
James Almer
b1a44e6bf5
x86/hevc_mc: remove an unnecessary pxor
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:35:08 +02:00
Christophe Gisquet
a507623bad
x86: hevc_mc: fix register count usage
...
A macro was using a fixed register, causing too many GPRs to be
declared as used.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 22:50:50 +02:00
Christophe Gisquet
81943a10b5
x86: hevc_mc: load less data in epel filters
...
Before:
5679 decicycles in epel_bi, 2059976 runs, 37176 skips
3468 decicycles in epel_uni, 1040886 runs, 7690 skips
After:
5323 decicycles in epel_bi, 2059493 runs, 37659 skips
3262 decicycles in epel_uni, 1040871 runs, 7705 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 18:34:39 +02:00
Christophe Gisquet
36284ae981
x86: hevc_mc: replace one lea by add
...
Should have been in 036f11bdb5.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 17:42:56 +02:00
Christophe Gisquet
036f11bdb5
x86: hevc_mc: replace simple leas by adds
...
lea is detrimental for those simple cases. The change has no
significant overall impact though.
Before:
15017 decicycles in q, 1016152 runs, 32424 skips
15382 decicycles in q_bi, 1013673 runs, 34903 skips
3713 decicycles in e, 2074534 runs, 22618 skips
3901 decicycles in e_bi, 2065509 runs, 31643 skips
7852 decicycles in q_uni, 520165 runs, 4123 skips
2398 decicycles in e_uni, 1043339 runs, 5237 skips
After:
14898 decicycles in q, 1016295 runs, 32281 skips
15119 decicycles in q_bi, 1015392 runs, 33184 skips
3682 decicycles in e, 2073224 runs, 23928 skips
3720 decicycles in e_bi, 2065043 runs, 32109 skips
7643 decicycles in q_uni, 520280 runs, 4008 skips
2363 decicycles in e_uni, 1043780 runs, 4796 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 05:41:04 +02:00
Mickaël Raulet
bd0f2d316f
x86/hevc: add 12bits support for MC
...
cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:55:20 +02:00
Christophe Gisquet
dcd2a6ca36
x86: hevc_mc: remove unneeded shift
...
The immediate value may be 0.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 23:34:33 +02:00
Christophe Gisquet
0810608e23
x86: hevc_mc: better register allocation
...
The xmm register count was incorrect, and manually loading the GPRs
furthermore allows noticeably reducing the number needed.
The modified functions are used in weighted prediction, so only a
few samples like WP_* exhibit a change. For this one and Win64
(some widths removed because of too few occurrences):
WP_A_Toshiba_3.bit, ff_hevc_put_hevc_uni_w
width:    16    32
before: 2194  3872
after:  2119  3767
WP_B_Toshiba_3.bit, ff_hevc_put_hevc_bi_w
width:    16    32    64
before: 2819  4960  9396
after:  2617  4788  9150
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-28 17:39:34 +02:00
Christophe Gisquet
f1793fe9cd
x86: hevc_mc: specify coefficients registers
...
By default, macro EPEL_FILTER loads the coefficients unconditionally
into m14/m15. This forces an unnecessarily high register count.
Reduce that count by making them parameters of EPEL_FILTER.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-18 16:23:58 +02:00
Hendrik Leppkes
87f2d8079a
hevcdsp: correctly indicate that hevc_put_hevc_bi_epel_h uses 9 GPRs
...
Fixes FATE on Windows.
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-12 17:00:48 +02:00
plepere
7a2491c436
HEVC : added assembly MC functions
...
pretty print x86
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:36 +02:00