Michael Niedermayer
5db23c07a3
Merge commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b'
...
* commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b':
idctdsp: Add global function pointers for {add|put}_pixels_clamped functions
Conflicts:
libavcodec/arm/idctdsp_init_arm.c
libavcodec/dct.h
libavcodec/idctdsp.c
libavcodec/jrevdct.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-09-03 03:19:40 +02:00
Pascal Massimino
7a1d6ddd2c
xvid: Add C IDCT
...
Thanks to Pascal Massimino and Michael Militzer for relicensing as LGPL.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2014-09-02 14:41:13 -07:00
Diego Biurrun
95c0cec03a
idctdsp: Add global function pointers for {add|put}_pixels_clamped functions
...
These function pointers already existed in the ARM code. Adding them globally
allows calls to the function pointers to access arch-optimized versions of the
functions transparently.
2014-09-02 14:41:13 -07:00
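As an illustration of 95c0cec03a above, the following is a minimal sketch of dispatch through a global function pointer. The names put_clamped_fn, put_pixels_clamped_ptr, example_dsp_init and idct_put_example are hypothetical and are not FFmpeg's actual symbols or prototypes; only the general idea (a global pointer that defaults to C code and can be overridden by an arch-specific init) is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for illustration; not FFmpeg's actual prototype. */
typedef void (*put_clamped_fn)(const int16_t *block, uint8_t *pixels,
                               ptrdiff_t line_size);

/* Portable C fallback: clamp each coefficient of an 8x8 block to 0..255. */
static void put_pixels_clamped_c(const int16_t *block, uint8_t *pixels,
                                 ptrdiff_t line_size)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int v = block[y * 8 + x];
            pixels[x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
        pixels += line_size;
    }
}

/* Global pointer (assumed name); it defaults to the C version. */
static put_clamped_fn put_pixels_clamped_ptr = put_pixels_clamped_c;

/* Hypothetical init hook: an arch-specific init passes its optimized
 * function here; passing NULL keeps the current pointer. */
static void example_dsp_init(put_clamped_fn arch_version)
{
    if (arch_version)
        put_pixels_clamped_ptr = arch_version;
}

/* Generic caller: it never needs to know which version is installed. */
static void idct_put_example(const int16_t *block, uint8_t *dst,
                             ptrdiff_t line_size)
{
    assert(put_pixels_clamped_ptr != NULL);
    put_pixels_clamped_ptr(block, dst, line_size);
}
```

Callers such as idct_put_example() stay identical on every architecture; only the init step decides which implementation runs.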
Reimar Döffinger
d9e2aceb7f
Add missing "const" all over the place.
...
Only "./configure --enable-gpl" on x86 was tested.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2014-08-29 18:57:25 +02:00
Michael Niedermayer
5403a288a7
Merge commit '8d27bf1cff35be406b0fd89d832e1852d4c573bc'
...
* commit '8d27bf1cff35be406b0fd89d832e1852d4c573bc':
x86: xvid: K&R formatting cosmetics
Conflicts:
libavcodec/x86/xvididct_sse2.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:20:39 +02:00
Michael Niedermayer
b3b05a11d3
Merge commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5'
...
* commit 'dcb7c868ec7af7d3a138b3254ef2e08f074d8ec5':
cosmetics: Make naming scheme of Xvid IDCT consistent with other IDCTs
Conflicts:
libavcodec/mpeg4videodec.c
libavcodec/x86/Makefile
libavcodec/x86/dct-test.c
libavcodec/x86/xvididct_sse2.c
libavcodec/xvididct.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:09:30 +02:00
Michael Niedermayer
3ff5ca89fc
Merge commit '1f156af4274dc72d588620f6bedb4e9e66023c92'
...
* commit '1f156af4274dc72d588620f6bedb4e9e66023c92':
x86: xvid_idct: Drop unused definitions
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-27 21:01:54 +02:00
Diego Biurrun
8d27bf1cff
x86: xvid: K&R formatting cosmetics
2014-08-27 05:58:04 -07:00
Diego Biurrun
dcb7c868ec
cosmetics: Make naming scheme of Xvid IDCT consistent with other IDCTs
2014-08-27 04:54:05 -07:00
Diego Biurrun
1f156af427
x86: xvid_idct: Drop unused definitions
2014-08-27 04:36:41 -07:00
Christophe Gisquet
3e892b2bcd
x86: hevc_mc: split calls differently
...
In some cases, 2 or 3 calls are performed to functions for unusual
widths. Instead, perform 2 calls for different widths to split the
workload.
The 8+16 and 4+8 widths for respectively 8 and more than 8 bits can't
be processed that way without modifications: some calls use unaligned
buffers, and having branches to handle this was resulting in no
micro-benchmark benefit.
For block_w == 12 (around 1% of the pixels of the sequence):
Before:
12758 decicycles in epel_uni, 4093 runs, 3 skips
19389 decicycles in qpel_uni, 8187 runs, 5 skips
22699 decicycles in epel_bi, 32743 runs, 25 skips
34736 decicycles in qpel_bi, 32733 runs, 35 skips
After:
11929 decicycles in epel_uni, 4096 runs, 0 skips
18131 decicycles in qpel_uni, 8184 runs, 8 skips
20065 decicycles in epel_bi, 32750 runs, 18 skips
31458 decicycles in qpel_bi, 32753 runs, 15 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-24 12:05:33 +02:00
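A rough C sketch of the idea in 3e892b2bcd above, assuming a width-12 block is handled as one 8-wide call plus one 4-wide call instead of several calls of the same narrow width. The functions copy_w4, copy_w8 and copy_w12 are hypothetical stand-ins for the SIMD kernels and only copy pixels; the real functions perform motion-compensation filtering.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width kernel standing in for a 4-pixel SIMD function. */
static void copy_w4(uint8_t *dst, ptrdiff_t dststride,
                    const uint8_t *src, ptrdiff_t srcstride, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, 4);
        dst += dststride;
        src += srcstride;
    }
}

/* Hypothetical fixed-width kernel standing in for an 8-pixel SIMD function. */
static void copy_w8(uint8_t *dst, ptrdiff_t dststride,
                    const uint8_t *src, ptrdiff_t srcstride, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, 8);
        dst += dststride;
        src += srcstride;
    }
}

/* Width 12 handled as 8 + 4: two calls with different widths rather than
 * three calls to the 4-wide kernel. */
static void copy_w12(uint8_t *dst, ptrdiff_t dststride,
                     const uint8_t *src, ptrdiff_t srcstride, int height)
{
    copy_w8(dst,     dststride, src,     srcstride, height);
    copy_w4(dst + 8, dststride, src + 8, srcstride, height);
}
```

The second call starts at an offset of 8 pixels, which is the kind of offset that raises the alignment concerns the commit message mentions for the 8+16 and 4+8 cases.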
Christophe Gisquet
38e2aa3759
x86: hevc_mc: correct unneeded use of SSE4 code
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-24 11:43:33 +02:00
Christophe Gisquet
2346f2b5db
x86: hevcdsp: use compilation-time-fixed constant
...
The stride for some buffers is known.
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 16:26:30 +02:00
Christophe Gisquet
dad7f15567
hevcdsp: remove more instances of compile-time-fixed parameters
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 15:22:42 +02:00
Christophe Gisquet
d4f44b66d3
hevcdsp: remove compilation-time-fixed parameter
...
The dststride parameter is always MAX_PB_SIZE.
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 14:57:37 +02:00
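The change in d4f44b66d3 above can be sketched as follows, assuming MAX_PB_SIZE is 64 and using hypothetical function names (put_pixels_before, put_pixels_after) with a simplified body. When every caller passes the same stride, the parameter can be dropped and the constant used directly, which frees a register in hand-written assembly and lets the compiler fold the address arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PB_SIZE 64 /* assumed value for this sketch */

/* Before: the destination stride is a run-time parameter, although every
 * caller passes MAX_PB_SIZE. */
static void put_pixels_before(int16_t *dst, ptrdiff_t dststride,
                              const uint8_t *src, ptrdiff_t srcstride,
                              int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (int16_t)(src[x] << 6);
        dst += dststride;
        src += srcstride;
    }
}

/* After: the stride is a compile-time constant and the parameter is gone. */
static void put_pixels_after(int16_t *dst,
                             const uint8_t *src, ptrdiff_t srcstride,
                             int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (int16_t)(src[x] << 6);
        dst += MAX_PB_SIZE;
        src += srcstride;
    }
}
```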
Christophe Gisquet
fb1a98ec5b
x86: hevc_mc: assume 2nd source stride is 64
...
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-22 13:21:37 +02:00
James Almer
54ca4dd43b
x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8
...
* Reduced xmm register count to 7 (As such they are now enabled for x86_32).
* Removed four movdqa (affects the sse2 version only).
* pxor is now used to clear m0 only once.
~5% faster.
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-21 15:01:33 -03:00
James Almer
76a99d467f
x86/hevc_res_add: add ff_hevc_transform_add{8,16,32}_8_avx
...
~15% faster than sse2
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-20 16:54:52 -03:00
James Almer
9f498f4e6f
x86/hevc_res_add: fix register count in hevc_transform_add{16,32}_10_avx2
...
Signed-off-by: James Almer <jamrial@gmail.com>
2014-08-19 21:34:52 -03:00
Pierre Edouard Lepere
a6af4bf64d
x86: hevc: adding transform_add
...
Reviewed-by: James Almer <jamrial@gmail.com>
Approved-by: Ronald S. Bultje
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-20 01:28:56 +02:00
Michael Niedermayer
3bb2297351
Merge commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6'
...
* commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6':
build: Add explanatory comments to (optimization) blocks in the Makefiles
Conflicts:
libavcodec/ppc/Makefile
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-15 20:25:12 +02:00
Michael Niedermayer
c1df467d73
Merge commit '835f798c7d20bca89eb4f3593846251ad0d84e4b'
...
* commit '835f798c7d20bca89eb4f3593846251ad0d84e4b':
mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes
Conflicts:
libavcodec/h261dec.c
libavcodec/intrax8.c
libavcodec/mjpegenc.c
libavcodec/mpeg12dec.c
libavcodec/mpeg12enc.c
libavcodec/mpeg4videoenc.c
libavcodec/mpegvideo.c
libavcodec/mpegvideo.h
libavcodec/mpegvideo_enc.c
libavcodec/rv10.c
libavcodec/x86/mpegvideoenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-15 20:11:56 +02:00
Diego Biurrun
efd26bedec
build: Add explanatory comments to (optimization) blocks in the Makefiles
2014-08-15 02:55:21 -07:00
Diego Biurrun
835f798c7d
mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes
2014-08-15 01:26:33 -07:00
James Darnley
54a51d3840
lavc/flacenc: partially unroll loop in flac_enc_lpc_16
...
It now does 12 samples per iteration, up from 4.
From 1.8 to 3.2 times faster again. 3.6 to 5.7 times faster overall.
Runtime is reduced by a further 2 to 18%. Overall runtime reduced by
4 to 50%.
Same conditions as before apply.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 03:09:26 +02:00
James Darnley
0081a14e7d
lavc/flacenc: add sse4 version of the 16-bit lpc encoder
...
From 1.8 to 2.4 times faster. Runtime is reduced by 2 to 39%. The
speed-up generally increases with compression_level.
This lpc encoder is not used with levels < 3 so it provides no speed-up
in these cases.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-13 01:14:47 +02:00
Ronald S. Bultje
45bed0ab30
vp9/x86: fix bug in intra_pred_hd_32x32.
...
Fixes mismatch in first keyframe in sample
ffvp9_fails_where_libvpx.succeeds.webm from ticket 3849. There's still
a second mismatch a few frames into the sample.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 13:11:21 +02:00
James Almer
c97870d1a1
x86/dca: remove unused header
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 12:46:53 +02:00
James Almer
e20ff251a6
x86/ttadsp: remove an unnecessary mova
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-12 12:29:05 +02:00
Michael Niedermayer
3841f2ae66
Merge commit 'd35b94fbabd8beb5d566c0b5d01688aff62c3b36'
...
* commit 'd35b94fbabd8beb5d566c0b5d01688aff62c3b36':
avcodec: Rename xvidmmx IDCT to xvid
Conflicts:
doc/APIchanges
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-09 12:11:13 +02:00
Michael Niedermayer
0dcebb9f63
Merge commit '84d173d3de97c753234ab0c0b50551d51413d663'
...
* commit '84d173d3de97c753234ab0c0b50551d51413d663':
xvididct: Ensure that the scantable permutation is always set correctly
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-08 22:17:04 +02:00
Diego Biurrun
d35b94fbab
avcodec: Rename xvidmmx IDCT to xvid
...
The Xvid IDCT is not MMX-specific.
2014-08-08 11:13:30 -07:00
Diego Biurrun
84d173d3de
xvididct: Ensure that the scantable permutation is always set correctly
...
This fixes cases where the scantable permutation would get overwritten by
the general idctdsp initialization.
2014-08-08 11:13:29 -07:00
Christophe Gisquet
75837e9add
x86: sbrdsp/fft: reuse ps_neg constant
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 19:25:08 +02:00
Christophe Gisquet
51dd80e751
x86: diracdsp: reuse constants
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 19:25:02 +02:00
Christophe Gisquet
6622a6cff3
x86: dwt: better share constants
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 19:24:57 +02:00
Christophe Gisquet
71db2d08b1
x86: better share ff_pw_2
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 19:24:49 +02:00
Christophe Gisquet
4e128ab0b1
x86: vpx/h264/hevc/mpeg2: share constants
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 18:36:31 +02:00
Michael Niedermayer
305f72aee7
avcodec: Change get_pixels() to ptrdiff_t linesize
...
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 15:50:54 +02:00
Christophe Gisquet
6786848585
hevc_deblock: change tc type
...
The x86 asm expects int32_t so use that type.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-06 12:38:26 +02:00
James Almer
de417982e8
x86/vp9lpf: use fewer instructions in SPLATB_MIX
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-05 02:47:54 +02:00
Christophe Gisquet
e8c003edd2
x86: hevc_deblock: remove unnecessary masking
...
The unpacks/shuffles later on make it unnecessary.
Before:
1508 decicycles in h, 2096759 runs, 393 skips
2512 decicycles in v, 2095422 runs, 1730 skips
After:
1477 decicycles in h, 2096745 runs, 407 skips
2484 decicycles in v, 2095297 runs, 1855 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 17:46:04 +02:00
James Almer
b7863c972c
x86/hevc_mc: use fewer instructions in hevc_put_hevc_{uni, bi}_w[24]_{8, 10, 12}
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:47:15 +02:00
James Almer
b1a44e6bf5
x86/hevc_mc: remove an unnecessary pxor
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-04 14:35:08 +02:00
James Almer
d0f56ca071
x86/hevc_deblock: improve 8bit transpose store macros
...
Up to four fewer instructions, depending on function and instruction set.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-03 04:24:15 +02:00
Michael Niedermayer
f54e01c24e
Merge commit 'a786c8259dafeca9744252230b5d78f67810770c'
...
* commit 'a786c8259dafeca9744252230b5d78f67810770c':
idct: Split off Xvid IDCT
Conflicts:
libavcodec/Makefile
libavcodec/mpeg4videodec.c
libavcodec/x86/Makefile
libavcodec/x86/idctdsp_init.c
This split is somewhat restructured leaving the xvid IDCT available
outside mpeg4 if manually selected.
The code also could not be merged unchanged as it conflicted with a
bugfix in FFmpeg
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-01 16:21:52 +02:00
Diego Biurrun
a786c8259d
idct: Split off Xvid IDCT
...
The Xvid IDCT is only required to decode some Xvid-encoded MPEG-4 files,
so there is no point in having it as an unconditional part of idctdsp.
2014-08-01 01:25:18 -07:00
James Almer
62baf5b853
x86/hevc_deblock: use existing x86util transpose macro in chroma_{10, 12}
...
Cosmetic change. No measurable difference in speed.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-31 22:56:21 +02:00
Christophe Gisquet
a507623bad
x86: hevc_mc: fix register count usage
...
A macro was using a fixed register, causing too many GPRs to be
declared as used.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 22:50:50 +02:00
James Almer
73c4f63ba5
x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10, 12}_avx
...
~5% faster than SSSE3
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 14:04:59 +02:00
James Almer
88ba821f23
x86/hevc_deblock: improve luma functions register allocation
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:38:05 +02:00
James Almer
c74b08c5c6
x86/hevc_deblock: remove some unnecessary instructions
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 13:27:44 +02:00
James Almer
4f91bb0ff0
x86/hevc_deblock: use psignw instead of pmullw where possible
...
It's slightly faster
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 03:42:29 +02:00
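For 4f91bb0ff0 above, a scalar C model of why psignw can stand in for pmullw, assuming the multiplier lanes only ever hold -1, 0 or +1 (the commit message does not spell out the condition; "where possible" is read that way here). The helpers pmullw_lane, psignw_lane and sign_matches_multiply are hypothetical and model one 16-bit lane as a bit pattern; they are not intrinsics.

```c
#include <stdint.h>

/* Low 16 bits of a 16-bit multiply: the per-lane result of pmullw. */
static uint16_t pmullw_lane(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* Per-lane psignw: negate a if b is negative, zero it if b is zero,
 * otherwise keep a unchanged. */
static uint16_t psignw_lane(uint16_t a, uint16_t b)
{
    if (b == 0)
        return 0;
    if (b & 0x8000)
        return (uint16_t)(0u - a);
    return a;
}

/* Returns 1 if both operations agree for every 16-bit value of a when
 * b is -1 (0xFFFF), 0 or +1; returns 0 otherwise. */
static int sign_matches_multiply(void)
{
    static const uint16_t signs[3] = { 0xFFFF, 0, 1 };

    for (uint32_t a = 0; a <= 0xFFFF; a++)
        for (int i = 0; i < 3; i++)
            if (pmullw_lane((uint16_t)a, signs[i]) !=
                psignw_lane((uint16_t)a, signs[i]))
                return 0;
    return 1;
}
```

For any other multiplier the two instructions differ (for example a = 5, b = 2 gives 10 with pmullw but 5 with psignw), so the substitution only applies where the second operand is known to be a sign.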
Michael Niedermayer
a91c5ed008
Merge commit '4f8cf0dc4ef6110174056df7edd9dc2f2a988b6d'
...
* commit '4f8cf0dc4ef6110174056df7edd9dc2f2a988b6d':
x86: build: Restore ordering of OBJS lines
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-29 00:34:53 +02:00
Diego Biurrun
4f8cf0dc4e
x86: build: Restore ordering of OBJS lines
2014-07-28 13:19:04 -07:00
James Almer
664e9e4331
x86/hevc_deblock: load less data in hevc_h_loop_filter_luma_8
...
Reading 8 bytes is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-28 21:55:22 +02:00
James Almer
f137876182
x86/hevc_idct: add a colon to labels
...
This fixes a warning spam when using NASM
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-28 21:43:32 +02:00
Christophe Gisquet
81943a10b5
x86: hevc_mc: load less data in epel filters
...
Before:
5679 decicycles in epel_bi, 2059976 runs, 37176 skips
3468 decicycles in epel_uni, 1040886 runs, 7690 skips
After:
5323 decicycles in epel_bi, 2059493 runs, 37659 skips
3262 decicycles in epel_uni, 1040871 runs, 7705 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 18:34:39 +02:00
Christophe Gisquet
36284ae981
x86: hevc_mc: replace one lea by add
...
Should have been in 036f11bdb5.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 17:42:56 +02:00
James Almer
bfb3b2b7a6
x86/hevc_idct: add 12bit idct_dc
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:30:56 +02:00
Michael Niedermayer
d4a9e89b27
avcodec/x86/hevcdsp_init: make license header consistent
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:28:44 +02:00
Michael Niedermayer
706f81a2c2
Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d'
...
* commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d':
hevc: SSE2 and SSSE3 loop filters
Conflicts:
libavcodec/hevcdsp.c
libavcodec/hevcdsp.h
libavcodec/x86/Makefile
libavcodec/x86/hevc_deblock.asm
libavcodec/x86/hevcdsp_init.c
See: de7b89fd43 and several others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:20:48 +02:00
James Almer
1ace9573dc
x86/hevc_idct: replace old and unused idct functions
...
Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).
Benchmarks on an Intel Core i5-4200U:
idct8x8_dc
        SSE2  MMXEXT  C
cycles  22    26      57
idct16x16_dc
        AVX2  SSE2    C
cycles  27    32      249
idct32x32_dc
        AVX2  SSE2    C
cycles  62    126     1375
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 18:00:11 +02:00
Pierre Edouard Lepere
1a880b2fb8
hevc: SSE2 and SSSE3 loop filters
...
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Christophe Gisquet
036f11bdb5
x86: hevc_mc: replace simple leas by adds
...
lea is detrimental for those simple cases. No impact overall to
the change though.
Before:
15017 decicycles in q, 1016152 runs, 32424 skips
15382 decicycles in q_bi, 1013673 runs, 34903 skips
3713 decicycles in e, 2074534 runs, 22618 skips
3901 decicycles in e_bi, 2065509 runs, 31643 skips
7852 decicycles in q_uni, 520165 runs, 4123 skips
2398 decicycles in e_uni, 1043339 runs, 5237 skips
After:
14898 decicycles in q, 1016295 runs, 32281 skips
15119 decicycles in q_bi, 1015392 runs, 33184 skips
3682 decicycles in e, 2073224 runs, 23928 skips
3720 decicycles in e_bi, 2065043 runs, 32109 skips
7643 decicycles in q_uni, 520280 runs, 4008 skips
2363 decicycles in e_uni, 1043780 runs, 4796 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 05:41:04 +02:00
Mickaël Raulet
bd0f2d316f
x86/hevc: add 12bits support for MC
...
cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:55:20 +02:00
Mickaël Raulet
7df98d8c4d
x86/hevc: remove unused constant in deblocking filter
...
cherry picked from commit a3f7282eaa6f1ab0524fb966c6eade50c3025f99
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:20:40 +02:00
Mickaël Raulet
7bdcf5c934
x86/hevc: add 12bits support for deblocking filter
...
cherry picked from commit 97d46afe320c7d61d7b9525e5f5588355cde4bb0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-26 01:19:42 +02:00
Michael Niedermayer
2904d052b7
Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac'
...
* commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac':
qpeldsp: Mark source pointer in qpel_mc_func function pointer const
Conflicts:
libavcodec/h264qpel_template.c
libavcodec/x86/cavsdsp.c
libavcodec/x86/rv40dsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-25 13:05:08 +02:00
Diego Biurrun
7fb993d338
qpeldsp: Mark source pointer in qpel_mc_func function pointer const
2014-07-25 02:52:54 -07:00
Christophe Gisquet
670b7f203a
x86: hevcdsp: align
...
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-23 22:18:08 +02:00
Carl Eugen Hoyos
c75fdee747
avcodec/x86/hevc_deblock: Fix compilation with nasm.
2014-07-23 10:32:27 +02:00
Michael Niedermayer
ca6b33b8bd
avcodec/x86/hevcdsp_init: Fix "warning: assignment from incompatible pointer type"
2014-07-22 16:36:12 +02:00
Anton Khirnov
d7e162d46b
hevcdsp: remove an unneeded variable in the loop filter
...
beta0 and beta1 will always be the same within a CU
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:27:26 +02:00
Anton Khirnov
ae2f048fd7
avcodec/x86/hevc_deblock: cosmetics
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:18:05 +02:00
Anton Khirnov
b435043abb
hevc: cleanups in SSE2 and SSSE3 loop filters, use fewer instructions
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:17:29 +02:00
Anton Khirnov
e8581b17a8
avcodec/x86/hevc_deblock: use test instead of cmp 0
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:16:05 +02:00
Anton Khirnov
dc69247de4
avcodec/x86/hevc_deblock: use of paddw instead of psllw
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:14:53 +02:00
Anton Khirnov
500a0394d5
avcodec/x86/hevc_deblock: add %ifs to avoid "do nothing instructions"
...
cherry picked from commit f7843356253459e6010320292dbbc1e888a5249b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:13:28 +02:00
Anton Khirnov
7a4cf67117
hevc: cleaning up SSE2 and SSSE3 deblocking filters
...
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit b432041d7d1eca38831590f13b4e5baffff8186f
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-22 16:00:48 +02:00
Michael Niedermayer
d986c414de
Merge commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce'
...
* commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce':
dct-test: Move arch-specific bits into arch-specific subdirectories
Conflicts:
libavcodec/dct-test.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-21 13:33:51 +02:00
Diego Biurrun
81b9bf3192
dct-test: Move arch-specific bits into arch-specific subdirectories
2014-07-21 01:10:11 -07:00
Michael Niedermayer
776647360d
Merge commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0'
...
* commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0':
simple_idct: Move x86-specific declarations to a header in the x86 directory
Conflicts:
libavcodec/x86/simple_idct.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-19 13:56:29 +02:00
Michael Niedermayer
6da96a9fc9
Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314'
...
* commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314':
fdct: Move x86-specific declarations to a header in the x86 directory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-19 13:45:59 +02:00
Diego Biurrun
5dcc201505
simple_idct: Move x86-specific declarations to a header in the x86 directory
2014-07-19 02:33:36 -07:00
Diego Biurrun
85cabb8d00
fdct: Move x86-specific declarations to a header in the x86 directory
2014-07-19 02:25:59 -07:00
Michael Niedermayer
097bf834ba
Merge commit '9e0b29911f1f167381a7dbdfca68bf417b8c767b'
...
* commit '9e0b29911f1f167381a7dbdfca68bf417b8c767b':
x86: dnxhdenc: Eliminate some unnecessary ifdefs
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:33:24 +02:00
Michael Niedermayer
521f569734
Merge commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273'
...
* commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273':
idctdsp: prettyprinting cosmetics
Conflicts:
libavcodec/idctdsp.c
libavcodec/ppc/idctdsp.c
libavcodec/x86/idctdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:16:04 +02:00
Michael Niedermayer
42d326353c
Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae'
...
* commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae':
idct: Convert IDCT permutation #defines to an enum
Conflicts:
libavcodec/idctdsp.c
libavcodec/x86/cavsdsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:01:17 +02:00
Diego Biurrun
9e0b29911f
x86: dnxhdenc: Eliminate some unnecessary ifdefs
2014-07-18 09:58:17 -07:00
Diego Biurrun
8b0dd4942a
idctdsp: prettyprinting cosmetics
2014-07-18 07:51:03 -07:00
Diego Biurrun
b4987f7219
idct: Convert IDCT permutation #defines to an enum
...
Also rename the enum values to be consistent with other DCT permutations.
2014-07-18 07:51:03 -07:00
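A small sketch of the kind of conversion b4987f7219 above describes, using hypothetical names throughout (the EXAMPLE_ prefix marks everything that is not taken from the source). Moving from #defines to an enum gives the permutation type a name that can appear in prototypes and lets the compiler warn about unhandled cases in a switch.

```c
#include <stdint.h>

/* Before: loose preprocessor constants (hypothetical names). */
#define EXAMPLE_NO_PERM_OLD        1
#define EXAMPLE_TRANSPOSE_PERM_OLD 2

/* After: a named enum with consistently prefixed values (hypothetical). */
enum example_idct_perm {
    EXAMPLE_IDCT_PERM_NONE,
    EXAMPLE_IDCT_PERM_TRANSPOSE,
};

/* Fill an 8x8 coefficient permutation table for the given type. */
static void example_init_perm(uint8_t perm[64], enum example_idct_perm type)
{
    for (int i = 0; i < 64; i++) {
        switch (type) {
        case EXAMPLE_IDCT_PERM_NONE:
            perm[i] = (uint8_t)i;                           /* identity */
            break;
        case EXAMPLE_IDCT_PERM_TRANSPOSE:
            perm[i] = (uint8_t)(((i & 7) << 3) | (i >> 3)); /* swap row/column */
            break;
        }
    }
}
```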
Michael Niedermayer
3a2d1465c8
Merge commit '2d60444331fca1910510038dd3817bea885c2367'
...
* commit '2d60444331fca1910510038dd3817bea885c2367':
dsputil: Split motion estimation compare bits off into their own context
Conflicts:
configure
libavcodec/Makefile
libavcodec/arm/Makefile
libavcodec/dvenc.c
libavcodec/error_resilience.c
libavcodec/h264.h
libavcodec/h264_slice.c
libavcodec/me_cmp.c
libavcodec/me_cmp.h
libavcodec/motion_est.c
libavcodec/motion_est_template.c
libavcodec/mpeg4videoenc.c
libavcodec/mpegvideo.c
libavcodec/mpegvideo_enc.c
libavcodec/x86/Makefile
libavcodec/x86/me_cmp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-17 23:27:40 +02:00
Michael Niedermayer
d6676a1605
Merge commit 'c23ce454b3e33634a188d6facfd2b7182af5af93'
...
* commit 'c23ce454b3e33634a188d6facfd2b7182af5af93':
x86: dsputil: Coalesce all init files
Conflicts:
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_x86.h
libavcodec/x86/motion_est.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-17 22:07:52 +02:00
Diego Biurrun
2d60444331
dsputil: Split motion estimation compare bits off into their own context
2014-07-17 09:07:10 -07:00
Diego Biurrun
c23ce454b3
x86: dsputil: Coalesce all init files
...
This makes the init files match the structure of the dsputil split.
2014-07-17 03:32:56 -07:00
Michael Niedermayer
cc3e7a4c3d
Merge commit 'acf91215c74a91eb3b86af01dcb1d3c78d0e2310'
...
* commit 'acf91215c74a91eb3b86af01dcb1d3c78d0e2310':
x86: dsputil: Avoid pointless CONFIG_ENCODERS indirection
Conflicts:
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-13 21:51:20 +02:00
Diego Biurrun
acf91215c7
x86: dsputil: Avoid pointless CONFIG_ENCODERS indirection
...
The remaining dsputil bits are encoding-specific anyway.
2014-07-13 07:01:05 -07:00
James Almer
276bef5340
x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10}_sse2
...
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-13 13:48:31 +02:00
James Almer
123649dd19
x86/dsputilenc: remove some empty if statements
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-12 15:04:58 +02:00
Michael Niedermayer
b8cdf04726
Merge commit '1173320249745eab01c901a39054fc0fced33c87'
...
* commit '1173320249745eab01c901a39054fc0fced33c87':
dsputil: Drop unused bit_depth parameter from all init functions
Conflicts:
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/ppc/dsputil_ppc.c
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-11 20:29:40 +02:00
Diego Biurrun
1173320249
dsputil: Drop unused bit_depth parameter from all init functions
2014-07-11 06:38:26 -07:00
Michael Niedermayer
2d5e9451de
Merge commit 'f46bb608d9d76c543e4929dc8cffe36b84bd789e'
...
* commit 'f46bb608d9d76c543e4929dc8cffe36b84bd789e':
dsputil: Split off pixel block routines into their own context
Conflicts:
configure
libavcodec/dsputil.c
libavcodec/mpegvideo_enc.c
libavcodec/pixblockdsp_template.c
libavcodec/x86/dsputilenc.asm
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-10 01:22:14 +02:00
Diego Biurrun
f46bb608d9
dsputil: Split off pixel block routines into their own context
2014-07-09 08:05:26 -07:00
Michael Niedermayer
14e2406de7
Merge commit 'a9aee08d900f686e966c64afec5d88a7d9d130a3'
...
* commit 'a9aee08d900f686e966c64afec5d88a7d9d130a3':
dsputil: Split off FDCT bits into their own context
Conflicts:
configure
libavcodec/Makefile
libavcodec/asvenc.c
libavcodec/dnxhdenc.c
libavcodec/dsputil.c
libavcodec/mpegvideo.h
libavcodec/mpegvideo_enc.c
libavcodec/x86/Makefile
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-08 03:19:06 +02:00
Diego Biurrun
a9aee08d90
dsputil: Split off FDCT bits into their own context
2014-07-07 12:28:45 -07:00
Michael Niedermayer
3790801f9c
Merge commit '3c650efb81aaa3b395ba4606ee68a47ee4efb57b'
...
* commit '3c650efb81aaa3b395ba4606ee68a47ee4efb57b':
dsputil: Move draw_edges() to mpegvideoencdsp
Conflicts:
libavcodec/mpegvideo_enc.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-07 16:17:27 +02:00
Michael Niedermayer
020865f557
Merge commit 'c166148409fe8f0dbccef2fe684286a40ba1e37d'
...
* commit 'c166148409fe8f0dbccef2fe684286a40ba1e37d':
dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc
Conflicts:
libavcodec/dsputil.c
libavcodec/mpegvideo_enc.c
libavcodec/x86/dsputilenc.asm
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-07 15:36:58 +02:00
Michael Niedermayer
462c6cdb8e
Merge commit '8d686ca59db14900ad5c12b547fb8a7afc8b0b94'
...
* commit '8d686ca59db14900ad5c12b547fb8a7afc8b0b94':
dsputil: Split off *_8x8basis to a separate context
Conflicts:
libavcodec/dsputil.c
libavcodec/mpegvideo_enc.c
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-07 15:08:55 +02:00
Diego Biurrun
3c650efb81
dsputil: Move draw_edges() to mpegvideoencdsp
2014-07-06 14:48:50 -07:00
Diego Biurrun
c166148409
dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc
2014-07-06 14:26:53 -07:00
Diego Biurrun
8d686ca59d
dsputil: Split off *_8x8basis to a separate context
2014-07-06 13:09:24 -07:00
James Almer
195f7bd23d
x86/svq1enc: use unaligned mov on SSE2
...
Might fix fate failures on some systems
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-06 20:27:57 +02:00
James Almer
dad31083ae
x86/svq1enc: port ssd_int8_vs_int16 to yasm
...
Also add an SSE2 version
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-05 21:43:40 +02:00
Michael Niedermayer
19b79c1429
Merge commit 'b0de1c766329dd8c9960ad1722e2f653160abc1b'
...
* commit 'b0de1c766329dd8c9960ad1722e2f653160abc1b':
x86: build: Only compile FDCT code if MMX is enabled
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-05 20:12:31 +02:00
Michael Niedermayer
5036c8b17b
Merge commit '12f129e545e5a5844b6ad7f3eb6a438015cad8bc'
...
* commit '12f129e545e5a5844b6ad7f3eb6a438015cad8bc':
x86: Unconditionally compile blockdsp and svq1enc init files
Conflicts:
libavcodec/x86/Makefile
blockdsp_mmx is renamed to blockdsp_init as we already have a blockdsp file
and _init is how all other such files are called
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-05 19:50:05 +02:00
Michael Niedermayer
6bef3e55bd
Merge commit '009331303a6462d07cbe94aef9c446f1a1695519'
...
* commit '009331303a6462d07cbe94aef9c446f1a1695519':
x86: huffyuvdsp: Move inline assembly to init file
Conflicts:
libavcodec/x86/Makefile
libavcodec/x86/huffyuvdsp.h
libavcodec/x86/huffyuvdsp_init.c
libavcodec/x86/huffyuvdsp_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-05 19:11:26 +02:00
Diego Biurrun
b0de1c7663
x86: build: Only compile FDCT code if MMX is enabled
...
All other files containing purely inline assembly are treated the same way.
2014-07-05 04:18:34 -07:00
Diego Biurrun
12f129e545
x86: Unconditionally compile blockdsp and svq1enc init files
...
This avoids a link failure with MMX disabled as the init functions
are referenced unconditionally.
2014-07-05 04:18:34 -07:00
Diego Biurrun
009331303a
x86: huffyuvdsp: Move inline assembly to init file
...
This avoids a link failure with MMX disabled as now code and
initialization are compiled under the same condition.
2014-07-05 04:18:34 -07:00
Michael Niedermayer
5c65aed7fd
Merge commit '391ecc961ced2bde7aecb3053ac35191f838fae8'
...
* commit '391ecc961ced2bde7aecb3053ac35191f838fae8':
x86: mpegvideoenc: Change SIMD optimization name suffixes to lowercase
Conflicts:
libavcodec/x86/mpegvideoenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-04 01:17:39 +02:00
Diego Biurrun
391ecc961c
x86: mpegvideoenc: Change SIMD optimization name suffixes to lowercase
2014-07-03 13:41:41 -07:00
James Almer
a441a2437b
x86: rename dsputil.asm to idctdsp.asm
...
Its only function is no longer part of dsputil.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-02 01:08:04 +02:00
Michael Niedermayer
8d0c7031a8
Merge commit '79793f833784121d574454af4871866576c0749d'
...
* commit '79793f833784121d574454af4871866576c0749d':
Update Fiona's name in copyright statements.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 15:43:40 +02:00
Michael Niedermayer
581b5f0b9b
Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e'
...
* commit 'e3fcb14347466095839c2a3c47ebecff02da891e':
dsputil: Split off IDCT bits into their own context
Conflicts:
configure
libavcodec/aic.c
libavcodec/arm/Makefile
libavcodec/arm/dsputil_init_arm.c
libavcodec/arm/dsputil_init_armv6.c
libavcodec/asvdec.c
libavcodec/dnxhdenc.c
libavcodec/dsputil.c
libavcodec/dvdec.c
libavcodec/dxva2_mpeg2.c
libavcodec/intrax8.c
libavcodec/mdec.c
libavcodec/mjpegdec.c
libavcodec/mjpegenc_common.h
libavcodec/mpegvideo.c
libavcodec/ppc/dsputil_altivec.h
libavcodec/ppc/dsputil_ppc.c
libavcodec/ppc/idctdsp.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 15:22:11 +02:00
Diego Biurrun
79793f8337
Update Fiona's name in copyright statements.
2014-07-01 03:26:51 -07:00
Diego Biurrun
e3fcb14347
dsputil: Split off IDCT bits into their own context
2014-06-30 07:58:46 -07:00
Michael Niedermayer
5bca5f87d1
Revert "x86/videodsp: add emulated_edge_mc_mmxext"
...
The commit causes minor out of array reads and was mainly intended for
future optimizations which turned out not to be measurably faster.
By itself it was just 1 cpu cycle faster
Approved-by: jamrial
This reverts commit 057d2704e7.
2014-06-28 05:39:07 +02:00
Michael Niedermayer
09a7a4704e
Merge commit 'd2869aea0494d3a20d53d5034cd41dbb488eb133'
...
* commit 'd2869aea0494d3a20d53d5034cd41dbb488eb133':
dsputil: Move MMX/SSE2-optimized IDCT bits to the x86 subdirectory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-27 03:05:33 +02:00
Diego Biurrun
d2869aea04
dsputil: Move MMX/SSE2-optimized IDCT bits to the x86 subdirectory
2014-06-26 16:15:07 -07:00
James Almer
057d2704e7
x86/videodsp: add emulated_edge_mc_mmxext
...
This also changes hfix8_mmx and above to use mmx regs instead of
gprs, and makes emulated_edge_mc_sse and emulated_edge_mc_sse2 use
mmxext hfix and hvar functions instead of mmx where possible.
This is mostly in preparation for an ssse3 version.
Signed-off-by: James Almer <jamrial@gmail.com>
code is approximately 1 cpu cycle faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-26 17:58:57 +02:00
Michael Niedermayer
11ba0c8207
Merge commit '5ab03e41e553452118113d0c224fa32b325e45e5'
...
* commit '5ab03e41e553452118113d0c224fa32b325e45e5':
x86: h264dsp: Fix link failure with optimizations disabled
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-26 02:58:59 +02:00
Diego Biurrun
5ab03e41e5
x86: h264dsp: Fix link failure with optimizations disabled
...
With optimizations disabled, compilers have trouble doing dead code
elimination on 'if (foo && 0)' expressions, while 'if (0 && foo)'
still works, so use the latter to avoid problems.
Bug-Id: 707
2014-06-25 15:24:51 -07:00
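The operand-order point in 5ab03e41e5 can be sketched in C. This is a minimal illustration, not the h264dsp code: `HAVE_FAST_PATH`, `fast_add`, `c_add` and `select_add` are hypothetical names. In the situation the commit describes, the optimized routine would have no definition when the feature is configured out, so the reference to it has to disappear at compile time even without optimizations; here it is defined so that the sketch links with any compiler.

```c
/* Hypothetical configure-time switch: 0 means the fast path is disabled. */
#define HAVE_FAST_PATH 0

typedef int (*add_fn)(int a, int b);

/* Portable fallback. */
int c_add(int a, int b) { return a + b; }

/* Stand-in for an arch-specific routine. In a real build with the
 * feature disabled, this symbol might not exist at all. */
int fast_add(int a, int b) { return a + b; }

/* runtime_flag stands in for a CPU-feature check known only at run time. */
add_fn select_add(int runtime_flag)
{
    add_fn fn = c_add;

    /* Constant first: the condition is already false before runtime_flag
     * is evaluated. This is the '0 && foo' form that the commit message
     * says compilers still eliminate with optimizations disabled. */
    if (HAVE_FAST_PATH && runtime_flag)
        fn = fast_add;

    return fn;
}
```

With `HAVE_FAST_PATH` set to 0, `select_add()` returns `c_add` regardless of `runtime_flag`; writing the condition as `runtime_flag && HAVE_FAST_PATH` gives the same result at run time and differs only in what an unoptimized build leaves behind.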
Michael Niedermayer
1ace0ca60f
avcodec/x86/hevc_idct: fix function name in comment
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-25 18:22:25 +02:00
plepere
9ba6b17add
avcodec/x86/hevc_idct: fix number of sse registers
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-25 14:59:23 +02:00
plepere
942e22c651
avcodec/x86/hevc: add avx2 dc idct
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-25 14:49:44 +02:00
Michael Niedermayer
eab2509f8c
avcodec/x86/h264_qpel_10bit: locally define pb_0
...
somehow old llvm-gcc manages to ignore the alignment from ff_pb_0 causing a crash on freebsd
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-24 02:13:43 +02:00
James Almer
476bd3c7e4
x86/dsputil: move put_signed_pixels_clamped out of bswapdsp.asm
...
It's still a dsputil function
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-23 22:11:18 +02:00
Michael Niedermayer
d7463c6813
Merge commit 'fab9df63a3156ffe1f9490aafaea41e03ef60ddf'
...
* commit 'fab9df63a3156ffe1f9490aafaea41e03ef60ddf':
dsputil: Split off global motion compensation bits into a separate context
Conflicts:
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/ppc/dsputil_altivec.h
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-23 21:10:10 +02:00
Diego Biurrun
fab9df63a3
dsputil: Split off global motion compensation bits into a separate context
2014-06-23 09:58:17 -07:00
Michael Niedermayer
35bb74900b
Merge commit 'c67b449bebbe0b35c73b203683e77a0a649bc765'
...
* commit 'c67b449bebbe0b35c73b203683e77a0a649bc765':
dsputil: Split bswap*_buf() off into a separate context
Conflicts:
configure
libavcodec/4xm.c
libavcodec/ac3dec.c
libavcodec/ac3dec.h
libavcodec/apedec.c
libavcodec/eamad.c
libavcodec/flacenc.c
libavcodec/fraps.c
libavcodec/huffyuv.c
libavcodec/huffyuvdec.c
libavcodec/motionpixels.c
libavcodec/truemotion2.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-23 13:31:26 +02:00
Diego Biurrun
c67b449beb
dsputil: Split bswap*_buf() off into a separate context
2014-06-22 18:22:31 -07:00
James Almer
c172683bf4
x86/dsputil: remove redundant global motion compensation code
...
The SSE version has been no different than the mmx one since commit a41bf09d
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-23 02:15:06 +02:00
James Almer
6ec3dc97fc
x86/audiodsp: move asm code out of dsputil
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 19:53:09 +02:00
Michael Niedermayer
99497b4683
Merge commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2'
...
* commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2':
dsputil: Split audio operations off into a separate context
Conflicts:
configure
libavcodec/takdec.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil.asm
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 17:58:28 +02:00
Diego Biurrun
9a9e2f1c8a
dsputil: Split audio operations off into a separate context
2014-06-22 06:20:15 -07:00
Michael Niedermayer
33f83a2157
avcodec/x86/rv40dsp_init: fix () in macros
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-20 21:36:43 +02:00
James Almer
a5ce608fc7
x86/blockdsp: restore author attribution
...
See commits 649c00c96d, 5fecfb7d58, 73b02e2460
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 18:31:44 +02:00
Michael Niedermayer
08c5859f17
avcodec: add simpleauto idct
...
This will pick the "best" simple idct compatible idct
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 14:28:01 +02:00
James Almer
454c019cb5
x86/hevc_idct: fix movd parameter size in DC_ADD_INIT
...
Fixes compilation with NASM x86_64
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 13:18:13 +02:00
James Almer
fe782233aa
x86/blockdsp: move asm code out of dsputil
...
Also replace INLINE_<opt> with EXTERNAL_<opt> that were wrongly
changed by commit 2b05db4f81
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 13:09:03 +02:00
Michael Niedermayer
042a82ca37
avcodec/x86/lossless_videodsp: Fix size of values read for left/left_top
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 05:53:41 +02:00
Michael Niedermayer
2b05db4f81
Merge commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9'
...
* commit 'e74433a8e6fc00c8dbde293c97a3e45384c2c1d9':
dsputil: Split clear_block*/fill_block* off into a separate context
Conflicts:
configure
libavcodec/asvdec.c
libavcodec/dnxhddec.c
libavcodec/dnxhdenc.c
libavcodec/dsputil.h
libavcodec/eamad.c
libavcodec/intrax8.c
libavcodec/mjpegdec.c
libavcodec/ppc/dsputil_ppc.c
libavcodec/vc1dec.c
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-19 04:54:38 +02:00
Diego Biurrun
e74433a8e6
dsputil: Split clear_block*/fill_block* off into a separate context
2014-06-18 14:07:23 -07:00
plepere
92cccb7bcd
avcodec/hevc: new idct + asm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-17 13:23:36 +02:00
Christophe Gisquet
9107612818
x86util: add and use RSHIFT/LSHIFT macros
...
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 13:19:27 +02:00
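Why a byte-count wrapper helps can be modelled in C. This is only a sketch of the unit conversion: `rshift_imm` and `mmx_rshift_bytes` are hypothetical helpers, not the x86util macros, and it assumes the MMX path uses the 64-bit `psrlq` (which takes a bit count) while the SSE2 path uses the whole-register `psrldq` (which takes a byte count).

```c
#include <stdint.h>

/* Immediate to encode for a right shift by `bytes` bytes.
 * mmsize is the register width in bytes: 8 for MMX, 16 for SSE2.
 * Assumption: the wide path counts in bytes, the MMX path in bits. */
int rshift_imm(int mmsize, int bytes)
{
    return mmsize > 8 ? bytes : 8 * bytes;
}

/* Model of a 64-bit MMX register shifted right by whole bytes (0..7). */
uint64_t mmx_rshift_bytes(uint64_t reg, int bytes)
{
    return reg >> rshift_imm(8, bytes);
}
```

A caller asks for "2 bytes" in both cases; the helper yields an immediate of 16 for the 8-byte register and 2 for the 16-byte one.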
Ronald S. Bultje
385a3420d1
vp9/x86: fix overwrite in ipred_vl_4x4_ssse3.
...
Fixes trac ticket 3717.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-12 04:11:20 +02:00
Christophe Gisquet
508e7a5c16
x86: huffyuv: fix {add,diff}_int16
...
They used an extra, undeclared register. Fixes a crash in
fate-vsynth3-ffvhuff444p16
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-12 00:26:19 +02:00
Michael Niedermayer
1a2ff62859
Merge commit '570d4b21863b6254d6bbca9c528bede471bb4478'
...
* commit '570d4b21863b6254d6bbca9c528bede471bb4478':
x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-10 18:35:49 +02:00
Martin Storsjö
570d4b2186
x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
...
We know that the called function (ff_chroma_inter_body_mmxext)
doesn't touch the redzone, so the redzone is kept intact; thus,
this doesn't fix any bug per se.
However, valgrind's memcheck tool intentionally assumes that the
redzone is clobbered on every function call and function return
(see a long comment in valgrind/memcheck/mc_main.c). This avoids
false positives in that tool, at the cost of an extra stack pointer
adjustment.
The other alternative would be a valgrind suppression for this issue,
but that's an extra burden for everybody that wants to run libavcodec
within valgrind.
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-06-10 16:31:48 +03:00
Michael Niedermayer
06f576c4ab
avcodec/x86/dct_init: fix build failure with clang && disable-optimizations
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 19:32:41 +02:00
James Almer
6d408495b5
x86/dct32: don't build ff_dct32_float_sse on x86_64
...
There's an SSE2 version already, and technically the SSE version
on x86_64 was wrong (using pshufd and pshuflw, SSE2 instructions).
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 00:51:43 +02:00
James Almer
fc8db12a73
x86/vp9: initial AVX2 intra_pred
...
tos3k-vp9-b10000.webm on a Core i5-4200U @1.6GHz
1219 decicycles in ff_vp9_ipred_dc_32x32_ssse3, 131070 runs, 2 skips
439 decicycles in ff_vp9_ipred_dc_32x32_avx2, 131070 runs, 2 skips
3570 decicycles in ff_vp9_ipred_dc_top_32x32_ssse3, 4096 runs, 0 skips
2494 decicycles in ff_vp9_ipred_dc_top_32x32_avx2, 4096 runs, 0 skips
1419 decicycles in ff_vp9_ipred_dc_left_32x32_ssse3, 16384 runs, 0 skips
717 decicycles in ff_vp9_ipred_dc_left_32x32_avx2, 16384 runs, 0 skips
2737 decicycles in ff_vp9_ipred_tm_32x32_avx, 1024 runs, 0 skips
2088 decicycles in ff_vp9_ipred_tm_32x32_avx2, 1024 runs, 0 skips
3090 decicycles in ff_vp9_ipred_v_32x32_avx, 512 runs, 0 skips
2226 decicycles in ff_vp9_ipred_v_32x32_avx2, 512 runs, 0 skips
1565 decicycles in ff_vp9_ipred_h_32x32_avx, 1024 runs, 0 skips
922 decicycles in ff_vp9_ipred_h_32x32_avx2, 1024 runs, 0 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-08 02:37:20 +02:00
James Almer
ec98f80af4
x86/dsputil: move some mmx init code inside dsputil_init_mmx()
...
This reduces differences with the fork
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-06 05:26:04 +02:00
Christophe Gisquet
ccff45a0d3
apedsp: move to llauddsp
...
APE is not the sole codec using scalarproduct_and_madd_int16.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-05 20:31:59 +02:00
Michael Niedermayer
d5c9d055ea
avcodec/x86/dsputilenc_mmx: fix build without yasm
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 05:39:03 +02:00
James Almer
625ffa1457
x86/motion_est: sad_{x, y}2_mmxext functions are bitexact
...
Only the xy2 functions aren't.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-04 00:48:35 +02:00
Timothy Gu
108dec3055
x86: dsputilenc: convert hf_noise*_mmx to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Several bugfixes by: Christophe Gisquet <christophe.gisquet@gmail.com>
See: [FFmpeg-devel] [WIP] [PATCH 4/4] x86: dsputilenc: convert hf_noise*_mmx to yasm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-03 23:59:43 +02:00
Christophe Gisquet
dcd2a6ca36
x86: hevc_mc: remove unneeded shift
...
The immediate value may be 0.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 23:34:33 +02:00
Christophe Gisquet
09fc28aed1
x86: hevcdsp_init: fix macro usage
...
The macro was not using the parameter but unconditionally using sse4.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 23:20:07 +02:00
James Almer
e1bd40fe6b
x86/motion_est: enable sad16_sse2 on k10 CPUs
...
The check is meant for k8 CPUs. sad16_sse2 is ~20% faster than sad16_mmxext on k10.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-01 02:10:32 +02:00
James Almer
f128342df2
build: fix compilation of svq1enc_mmx.c with --disable-mmx
...
It's needed for ff_svq1enc_init_x86() even if simd functions are disabled.
Alternatively, svq1enc_init.c could be made and the relevant code moved there.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-31 00:38:24 +02:00
James Almer
4ac41a52e2
x86/huffyuvdsp: fix some prototypes
...
Remove duplicate prototypes and fix int -> intptr_t in another
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-31 00:29:00 +02:00
Christophe Gisquet
d136fe6fd7
x86: huffyuvdsp: fewer functions for x86_64
...
When there are 2 functions that are <= SSE2, only one is needed for x86_64.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 21:39:06 +02:00
Timothy Gu
154cee9292
x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 16:57:52 +02:00
Timothy Gu
0b6292b7b8
x86: dsputilenc: move all the function prototypes together
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 16:18:10 +02:00
Christophe Gisquet
f743fa9c7f
x86: huffyuvdsp: add_hfyu_left_pred_bgr32
...
          C     MMX   SSE2
Cycles:   3092  1053  578
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 15:20:36 +02:00
Michael Niedermayer
7be79c76d3
avcodec/huffyuvdsp: Change w to intptr in add_hfyu_median_pred() and add_hfyu_left_pred()
...
This avoids potential issues with the high 32bits being random in x86-64 asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 15:12:58 +02:00
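The concern in 7be79c76d3 can be modelled in plain C. This is a sketch under stated assumptions, not the huffyuvdsp code: the helper names are hypothetical, and it assumes an x86-64 calling convention where a 32-bit `int` argument occupies the low half of a 64-bit register and the high half is left unspecified. Declaring the width as `intptr_t` makes the caller responsible for all 64 bits, so assembly can use the full register directly.

```c
#include <stdint.h>

/* Model of a 64-bit argument register: the caller wrote a 32-bit int
 * into the low half; the high half holds whatever was there before. */
uint64_t make_register(uint32_t high_garbage, int32_t w)
{
    return ((uint64_t)high_garbage << 32) | (uint32_t)w;
}

/* What assembly sees if it uses the full 64-bit register as the width. */
uint64_t width_from_full_register(uint64_t reg)
{
    return reg;
}

/* What it sees if it uses only the low 32 bits. */
uint64_t width_from_low_half(uint64_t reg)
{
    return (uint32_t)reg;
}
```

With a width of 16 and a non-zero high half, only `width_from_low_half()` recovers 16; the full-register read is correct only when the high half happens to be zero.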
Christophe Gisquet
884078d2df
x86: huffyuvdsp: add SSE2 median prediction
...
From 5010c to 4566 on lagarith YUY2.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 14:57:57 +02:00
Michael Niedermayer
8c891d90ca
avcodec/x86/qpeldsp_init: Restore author attribution
...
See: 368f50359e
See: 44eb495128, and many others
See:
similarity index 83%
copy from libavcodec/x86/dsputil_init.c
copy to libavcodec/x86/qpeldsp_init.c
index ebbf97f..8f296a1 100644
--- a/libavcodec/x86/dsputil_init.c
+++ b/libavcodec/x86/qpeldsp_init.c
@@ -1,6 +1,5 @@
/*
- * Copyright (c) 2000, 2001 Fabrice Bellard
- * Copyright (c) 2002-2004 Michael Niedermayer <michaelni@gmx.at>
+ * quarterpel DSP functions
*
* This file is part of FFmpeg.
*
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 04:05:29 +02:00
Michael Niedermayer
98a6806fdd
Merge commit '368f50359eb328b0b9d67451f56fda20b3255f9a'
...
* commit '368f50359eb328b0b9d67451f56fda20b3255f9a':
dsputil: Split off quarterpel bits into their own context
Conflicts:
configure
libavcodec/dsputil.c
libavcodec/h263dec.c
libavcodec/mpegvideo.c
libavcodec/mpegvideo_enc.c
libavcodec/vc1dec.c
libavcodec/vc1dsp.c
libavcodec/x86/dsputil_init.c
libavcodec/x86/qpeldsp.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 02:43:34 +02:00
Michael Niedermayer
40f3a87c10
Merge commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3'
...
* commit '054013a0fc6f2b52c60cee3e051be8cc7f82cef3':
dsputil: Move APE-specific bits into apedsp
Conflicts:
libavcodec/arm/int_neon.S
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 00:59:15 +02:00
Michael Niedermayer
c814a6c778
avcodec/x86/svq1enc_mmx: Add author attribution
...
See: 5900637219
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 00:30:05 +02:00
Michael Niedermayer
ea0931fb96
Merge commit '65d5d5865845f057cc6530a8d0f34db952d9009c'
...
* commit '65d5d5865845f057cc6530a8d0f34db952d9009c':
dsputil: Move SVQ1 encoding specific bits into svq1enc
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-30 00:01:45 +02:00
James Almer
02a3e327f1
x86/dsputilenc: add missing guards to ff_pix_sum16_xop
...
XOP support was added in Yasm 1.0.0 and Nasm 2.06, and we still
support older versions.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 22:31:28 +02:00
Christophe Gisquet
99a319c4e7
x86: huffyuvdsp: port add_bytes to yasm
...
          C     MMX   SSE2
Cycles:   2972  587   302
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 21:56:00 +02:00
Christophe Gisquet
2267003981
x86: hpeldsp: better factorization
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 21:47:40 +02:00
Michael Niedermayer
7b4c46050e
rename add_hfyu_left_prediction_int16 to add_hfyu_left_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 19:50:44 +02:00
Michael Niedermayer
550ae6c02f
rename add_hfyu_median_prediction_int16 to add_hfyu_median_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 19:49:29 +02:00
Michael Niedermayer
40a4ab8ba4
rename sub_hfyu_median_prediction_int16 to sub_hfyu_median_pred_int16
...
This makes the naming more consistent with the 8bit variant
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 19:48:23 +02:00
James Almer
05de4d3011
x86/dsputilenc: implement XOP version of pix_sum16
...
SSE2: 137 cycles
XOP: 87 cycles
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 18:40:23 +02:00
Diego Biurrun
368f50359e
dsputil: Split off quarterpel bits into their own context
2014-05-29 06:48:31 -07:00
Diego Biurrun
054013a0fc
dsputil: Move APE-specific bits into apedsp
2014-05-29 06:41:15 -07:00
Diego Biurrun
65d5d58658
dsputil: Move SVQ1 encoding specific bits into svq1enc
2014-05-29 06:41:15 -07:00
Michael Niedermayer
b50559fc0b
libavcodec/x86/dsputilenc: drop "and 0xffff" that should have become redundant
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-29 00:16:52 +02:00
James Almer
561bfc85eb
x86/dsputilenc: implement SSE2 versions of pix_{sum16, norm1}
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-28 23:29:34 +02:00
Christophe Gisquet
0810608e23
x86: hevc_mc: better register allocation
...
The xmm reg count was incorrect, and manual loading of the gprs
furthermore allows a noticeable reduction in the number needed.
The modified functions are used in weighted prediction, so only a
few samples like WP_* exhibit a change. For this one and Win64
(some widths removed because of too few occurrences):
WP_A_Toshiba_3.bit, ff_hevc_put_hevc_uni_w
         16    32
before:  2194  3872
after:   2119  3767
WP_B_Toshiba_3.bit, ff_hevc_put_hevc_bi_w
         16    32    64
before:  2819  4960  9396
after:   2617  4788  9150
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-28 17:39:34 +02:00
Michael Niedermayer
48a6916308
Merge commit '512f3ffe9b4bb86767c2b1176554407c75fe1a5c'
...
* commit '512f3ffe9b4bb86767c2b1176554407c75fe1a5c':
dsputil: Split off HuffYUV encoding bits into their own context
Conflicts:
configure
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/huffyuv.h
libavcodec/huffyuvenc.c
libavcodec/pngenc.c
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-28 00:03:59 +02:00
Michael Niedermayer
e2abc0d5ca
Merge commit '0d439fbede03854eac8a978cccf21a3425a3c82d'
...
* commit '0d439fbede03854eac8a978cccf21a3425a3c82d':
dsputil: Split off HuffYUV decoding bits into their own context
Conflicts:
configure
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/huffyuv.h
libavcodec/huffyuvdec.c
libavcodec/lagarith.c
libavcodec/vble.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil.asm
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-27 23:16:06 +02:00
Diego Biurrun
512f3ffe9b
dsputil: Split off HuffYUV encoding bits into their own context
...
Also shorten HuffYUV context member names to avoid clutter.
2014-05-27 08:54:53 -07:00