Commit Graph

34 Commits

Author SHA1 Message Date
Christophe Gisquet
ed450d4acf x86: lavc: share more constants through defines
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-07 17:48:14 +01:00
Christophe Gisquet
4e128ab0b1 x86: vpx/h264/hevc/mpeg2: share constants
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 18:36:31 +02:00
Christophe Gisquet
6786848585 hevc_deblock: change tc type
The x86 asm expects int32_t so use that type.

Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 12:38:26 +02:00
Christophe Gisquet
e8c003edd2 x86: hevc_deblock: remove unnecessary masking
The unpacks/shuffles later on make it unnecessary.

Before:
1508 decicycles in h, 2096759 runs, 393 skips
2512 decicycles in v, 2095422 runs, 1730 skips

After:
1477 decicycles in h, 2096745 runs, 407 skips
2484 decicycles in v, 2095297 runs, 1855 skips

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 17:46:04 +02:00
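To put the before/after figures of e8c003edd2 in perspective, the relative saving can be worked out directly from the decicycle counts quoted in the commit message. This is only a back-of-the-envelope sketch: the dictionary names and the helper function are illustrative and not part of the commit, and the run and skip counts are ignored.

```python
# Decicycle counts copied from the commit message above (lower is faster).
before = {"h": 1508, "v": 2512}
after = {"h": 1477, "v": 2484}


def relative_gain(old, new):
    """Fraction of the old cost saved by the new version."""
    return (old - new) / old


gains = {name: relative_gain(before[name], after[name]) for name in before}

for name, gain in gains.items():
    print(f"{name}: {before[name]} -> {after[name]} decicycles ({gain:.1%} fewer)")
```

Under these numbers the saving is roughly 2% for the h filter and roughly 1% for the v filter.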
James Almer
d0f56ca071 x86/hevc_deblock: improve 8bit transpose store macros
Up to four fewer instructions, depending on function and instruction set.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-03 04:24:15 +02:00
James Almer
62baf5b853 x86/hevc_deblock: use existing x86util transpose macro in chroma_{10, 12}
Cosmetic change. No measurable difference in speed.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-31 22:56:21 +02:00
James Almer
73c4f63ba5 x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10, 12}_avx
~5% faster than SSSE3

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 14:04:59 +02:00
James Almer
88ba821f23 x86/hevc_deblock: improve luma functions register allocation
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:38:05 +02:00
James Almer
c74b08c5c6 x86/hevc_deblock: remove some unnecessary instructions
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:27:44 +02:00
James Almer
4f91bb0ff0 x86/hevc_deblock: use psignw instead of pmullw where possible
It's slightly faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 03:42:29 +02:00
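The commit message for 4f91bb0ff0 does not say when the substitution is possible. One plausible reading, assumed here, is that it applies when the multiplier lanes only hold -1, 0 or 1, in which case psignw (an SSSE3 instruction) and pmullw give the same 16-bit result. The scalar model below is a sketch of that equivalence for a single lane; the function names are illustrative and do not come from the FFmpeg sources.

```python
def to_int16(x):
    """Wrap a Python int to a signed 16-bit lane value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def psignw_lane(a, b):
    """Scalar model of one psignw lane: negate, zero or keep a by the sign of b."""
    if b < 0:
        return to_int16(-a)
    if b == 0:
        return 0
    return a


def pmullw_lane(a, b):
    """Scalar model of one pmullw lane: low 16 bits of the signed product."""
    return to_int16(a * b)


samples = [-32768, -1234, -1, 0, 1, 255, 32767]
signs = [-1, 0, 1]  # assumed multiplier values; not stated in the commit

# Collect every (value, sign) pair where the two models disagree.
mismatches = [
    (a, s)
    for a in samples
    for s in signs
    if psignw_lane(a, s) != pmullw_lane(a, s)
]
print("mismatches:", mismatches)
```

For any other multiplier value the two instructions differ, so the model says nothing about those cases.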
James Almer
664e9e4331 x86/hevc_deblock: load less data in hevc_h_loop_filter_luma_8
Reading 8 bytes is enough.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-28 21:55:22 +02:00
Michael Niedermayer
706f81a2c2 Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d'
* commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d':
  hevc: SSE2 and SSSE3 loop filters

Conflicts:
	libavcodec/hevcdsp.c
	libavcodec/hevcdsp.h
	libavcodec/x86/Makefile
	libavcodec/x86/hevc_deblock.asm
	libavcodec/x86/hevcdsp_init.c

See: de7b89fd43 and several others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:20:48 +02:00
Pierre Edouard Lepere
1a880b2fb8 hevc: SSE2 and SSSE3 loop filters
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Mickaël Raulet
7df98d8c4d x86/hevc: remove unused constant in deblocking filter
cherry picked from commit a3f7282eaa6f1ab0524fb966c6eade50c3025f99

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:20:40 +02:00
Mickaël Raulet
7bdcf5c934 x86/hevc: add 12bits support for deblocking filter
cherry picked from commit 97d46afe320c7d61d7b9525e5f5588355cde4bb0

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:19:42 +02:00
Carl Eugen Hoyos
c75fdee747 avcodec/x86/hevc_deblock: Fix compilation with nasm.
2014-07-23 10:32:27 +02:00
Anton Khirnov
d7e162d46b hevcdsp: remove an unneeded variable in the loop filter
beta0 and beta1 will always be the same within a CU

Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>

cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:27:26 +02:00
Anton Khirnov
ae2f048fd7 avcodec/x86/hevc_deblock: cosmetics
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:18:05 +02:00
Anton Khirnov
b435043abb hevc: cleanups in SSE2 and SSSE3 loop filters, use fewer instructions
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:17:29 +02:00
Anton Khirnov
e8581b17a8 avcodec/x86/hevc_deblock: use test instead of cmp 0
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:16:05 +02:00
Anton Khirnov
dc69247de4 avcodec/x86/hevc_deblock: use paddw instead of psllw
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:14:53 +02:00
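The message for dc69247de4 does not spell out the shift count. Assuming the replaced shift was a left shift by one, adding a register to itself with paddw produces the same wrapped 16-bit value in every lane. The sketch below models a single unsigned 16-bit lane and checks all 65536 inputs; the helper names are illustrative only.

```python
def paddw_lane(a, b):
    """Scalar model of one paddw lane: 16-bit wrapping add."""
    return (a + b) & 0xFFFF


def psllw_lane(a, count):
    """Scalar model of one psllw lane: logical left shift, zero for counts above 15."""
    return (a << count) & 0xFFFF if count < 16 else 0


# x + x and x << 1 agree on every possible 16-bit lane value.
all_equal = all(paddw_lane(x, x) == psllw_lane(x, 1) for x in range(0x10000))
print("paddw x,x == psllw x,1 for all lanes:", all_equal)
```

The identity only covers a shift by one; larger shift counts would need repeated additions.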
Anton Khirnov
500a0394d5 avcodec/x86/hevc_deblock: add %ifs to avoid "do nothing instructions"
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:13:28 +02:00
Anton Khirnov
7a4cf67117 hevc: cleaning up SSE2 and SSSE3 deblocking filters
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>

cherry picked from commit b432041d7d1eca38831590f13b4e5baffff8186f
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:00:48 +02:00
James Almer
276bef5340 x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10}_sse2
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-13 13:48:31 +02:00
James Almer
7538ad2248 x86/hevc_deblock: improve chroma functions register allocation
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-24 01:16:26 +02:00
James Almer
d43c303038 x86/hevc_deblock: use constants instead of generating values at runtime
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-19 23:09:33 +02:00
James Almer
057ebf1222 x86/hevc_deblock: remove some duplicated instructions
Also remove a couple unnecessary cmps

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-18 23:28:17 +02:00
Carl Eugen Hoyos
ef2713747f Fix compilation of libavcodec/x86/hevc_deblock.asm with nasm.
Suggested-by: Reimar
2014-05-17 12:50:55 +02:00
James Almer
be1fbc02b8 x86/hevc_deblock: use movhps instead of shuffling values
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-17 05:40:14 +02:00
James Almer
8aac77fede x86/hevc_deblock: fix label names
Also remove some unnecessary jmps

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-17 05:40:08 +02:00
James Almer
521eaea63a x86/hevc_deblock: fix usage of ABS1
The second argument is a temp register for non-SSSE3 cases

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-17 05:39:55 +02:00
James Almer
45110d2290 x86/hevc_deblock: merge movs with other instructions
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-17 05:39:34 +02:00
plepere
ef7c4cd001 avcodec/x86/hevc: updated to use x86util macros
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Ronald S. Bultje
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-16 21:11:07 +02:00
plepere
de7b89fd43 avcodec/x86/hevc: added DBF assembly functions
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: Ronald S. Bultje
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-16 21:11:03 +02:00