Commit Graph

14 Commits

Author SHA1 Message Date
Henrik Gramner
f0b7882ceb x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
2015-08-04 20:13:09 +02:00
Michael Niedermayer
29d147c94d Merge commit '059a934806d61f7af9ab3fd9f74994b838ea5eba'
* commit '059a934806d61f7af9ab3fd9f74994b838ea5eba':
  lavc: Consistently prefix input buffer defines

Conflicts:
	doc/examples/decoding_encoding.c
	libavcodec/4xm.c
	libavcodec/aac_adtstoasc_bsf.c
	libavcodec/aacdec.c
	libavcodec/aacenc.c
	libavcodec/ac3dec.h
	libavcodec/asvenc.c
	libavcodec/avcodec.h
	libavcodec/avpacket.c
	libavcodec/dvdec.c
	libavcodec/ffv1enc.c
	libavcodec/g2meet.c
	libavcodec/gif.c
	libavcodec/h264.c
	libavcodec/h264_mp4toannexb_bsf.c
	libavcodec/huffyuvdec.c
	libavcodec/huffyuvenc.c
	libavcodec/jpeglsenc.c
	libavcodec/libxvid.c
	libavcodec/mdec.c
	libavcodec/motionpixels.c
	libavcodec/mpeg4videodec.c
	libavcodec/mpegvideo.c
	libavcodec/noise_bsf.c
	libavcodec/nuv.c
	libavcodec/nvenc.c
	libavcodec/options.c
	libavcodec/parser.c
	libavcodec/pngenc.c
	libavcodec/proresenc_kostya.c
	libavcodec/qsvdec.c
	libavcodec/svq1enc.c
	libavcodec/tiffenc.c
	libavcodec/truemotion2.c
	libavcodec/utils.c
	libavcodec/utvideoenc.c
	libavcodec/vc1dec.c
	libavcodec/wmalosslessdec.c
	libavformat/adxdec.c
	libavformat/aiffdec.c
	libavformat/apc.c
	libavformat/apetag.c
	libavformat/avidec.c
	libavformat/bink.c
	libavformat/cafdec.c
	libavformat/flvdec.c
	libavformat/id3v2.c
	libavformat/isom.c
	libavformat/matroskadec.c
	libavformat/mov.c
	libavformat/mpc.c
	libavformat/mpc8.c
	libavformat/mpegts.c
	libavformat/mvi.c
	libavformat/mxfdec.c
	libavformat/mxg.c
	libavformat/nutdec.c
	libavformat/oggdec.c
	libavformat/oggparsecelt.c
	libavformat/oggparseflac.c
	libavformat/oggparseopus.c
	libavformat/oggparsespeex.c
	libavformat/omadec.c
	libavformat/rawdec.c
	libavformat/riffdec.c
	libavformat/rl2.c
	libavformat/rmdec.c
	libavformat/rtpdec_latm.c
	libavformat/rtpdec_mpeg4.c
	libavformat/rtpdec_qdm2.c
	libavformat/rtpdec_svq3.c
	libavformat/sierravmd.c
	libavformat/smacker.c
	libavformat/smush.c
	libavformat/spdifenc.c
	libavformat/takdec.c
	libavformat/tta.c
	libavformat/utils.c
	libavformat/vqf.c
	libavformat/westwood_vqa.c
	libavformat/xmv.c
	libavformat/xwma.c
	libavformat/yop.c

Merged-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-27 23:15:19 +02:00
James Almer
844bef578e avcodec/x86: add missing colon to labels
Silences warnings with Nasm

Signed-off-by: James Almer <jamrial@gmail.com>
2015-07-26 02:50:14 -03:00
James Almer
5c8f747085 x86/hevc_sao: use unaligned movs for sao_{band,filter} with width 8
Suggested-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-03-01 20:02:43 -03:00
James Almer
14b44c1614 x86/hevc_sao: make sao_edge_filter_{10,12} work on x86_32
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-12 13:21:30 -03:00
James Almer
06fe6dfe12 x86/hevc_sao: make sao_band_filter work on x86_32
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-09 20:41:21 -03:00
Christophe Gisquet
9dc45d1f42 x86: lavc: share more constants
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-06 23:35:02 +01:00
James Almer
aea29a891f x86/hevc_sao: fix loading of RIP address
pb_eo must be handled as a rip relative address for MSVC64, so an
intermediate register is needed. Should fix link failures.

Suggested by Hendrik Leppkes and Christophe Gisquet.

Tested-By: Hendrik Leppkes <h.leppkes@gmail.com>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-06 15:06:15 -03:00
James Almer
15574c505b x86/hevcdsp: add ff_hevc_sao_edge_filter_{10,12}_{sse2,avx2}
Original x86 intrinsics code by Pierre-Edouard Lepere.
Yasm port, refactoring and optimizations by James Almer.

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

Width 32
342694 decicycles in sao_edge_filter_10, 16384 runs, 0 skips
29476 decicycles in ff_hevc_sao_edge_filter_32_10_ssse3, 16384 runs, 0 skips
13996 decicycles in ff_hevc_sao_edge_filter_32_10_avx2, 16381 runs, 3 skips

Width 64
581163 decicycles in sao_edge_filter_10, 8192 runs, 0 skips
59774 decicycles in ff_hevc_sao_edge_filter_64_10_ssse3, 8192 runs, 0 skips
28383 decicycles in ff_hevc_sao_edge_filter_64_10_avx2, 8191 runs, 1 skips

Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-05 15:02:33 -03:00
James Almer
042c1159fc x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2}
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips

Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips

Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-05 15:02:27 -03:00
James Almer
aa945dc112 x86/hevcdsp: add missing vzeroupper in ff_hevc_sao_band_filter_48_*_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-02 00:01:35 -03:00
James Almer
71e2cb4706 x86/hevcdsp: add missing guards to ff_hevc_sao_band_filter_avx2
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 21:45:52 -03:00
Christophe Gisquet
bff7feb328 x86: hevc/sao: aligned source buffers
Usefull for at least band filter, for which:
- Band filter call only:
           32      64
Before:  16556    54015
After:   16497    52355
- Whole case:
           32      64
Before:  37031   103008
After:   32045    93952
2015-02-01 20:22:54 -03:00
James Almer
fa3eccb4f9 x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2}
Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
10/12bit yasm ports, refactoring and optimizations by James Almer

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

width 32
40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips

width 64
136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips
26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips
14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 20:22:35 -03:00