Commit Graph

6 Commits

Author SHA1 Message Date
James Almer
52ec81c67d x86/hevc_res_add: add missing guards to hevc_transform_add32_8_avx2
Should fix compilation with old Yasm/Nasm versions.

Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-04 23:34:01 -03:00
James Almer
c3d2426cca x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2
~20% faster than AVX.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-04 20:21:29 -03:00
James Almer
54ca4dd43b x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8
* Reduced xmm register count to 7 (as such, they are now enabled for x86_32).
* Removed four movdqa (affects the sse2 version only).
* pxor is now used to clear m0 only once.

~5% faster.

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-21 15:01:33 -03:00
James Almer
76a99d467f x86/hevc_res_add: add ff_hevc_transform_add{8,16,32}_8_avx
~15% faster than sse2

Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-20 16:54:52 -03:00
James Almer
9f498f4e6f x86/hevc_res_add: fix register count in hevc_transform_add{16,32}_10_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-19 21:34:52 -03:00
Pierre Edouard Lepere
a6af4bf64d x86: hevc: adding transform_add
Reviewed-by: James Almer <jamrial@gmail.com>
Approved-by: Ronald S. Bultje
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-20 01:28:56 +02:00
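
For orientation, the operation these SIMD routines speed up can be sketched in scalar C. This is a minimal sketch, not the code from any of the commits above: it assumes the 8-bit transform_add step adds a square block of 16-bit residuals to the destination pixels and clamps each result to [0, 255]. The names clip_u8 and transform_add_8_ref, and the explicit size parameter, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range [0, 255]. */
static uint8_t clip_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical scalar reference (assumed behaviour, not FFmpeg's code):
 * add a size x size block of residuals to the destination block in place.
 *   dst    - top-left pixel of the destination block
 *   res    - residuals, stored contiguously row by row
 *   stride - distance in bytes between destination rows
 *   size   - block width and height (e.g. 8, 16 or 32)
 */
static void transform_add_8_ref(uint8_t *dst, const int16_t *res,
                                ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]);
        dst += stride; /* next destination row */
        res += size;   /* next residual row */
    }
}
```

A vectorized version would typically process a full row of pixels per instruction, with saturating arithmetic doing the clamping. The speedups quoted in the commit messages refer to the assembly versions, not to a scalar loop like this one.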