Christophe Gisquet
4f50646697
x86: sbrdsp: Implement SSE qmf_post_shuffle
...
255 to 174 cycles on Arrandale / Win64. Unrolling yields no gain.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-06 13:57:01 +01:00
Christophe Gisquet
44a0036d10
x86: sbrdsp: Implement SSE sum64x5
...
698 to 174 cycles on Arrandale. Unrolling gains 6 cycles.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-01-06 13:57:01 +01:00
Diego Biurrun
5b4dfbffc2
x86: ABS1: port to cpuflags
2013-01-06 13:57:01 +01:00
Michael Niedermayer
9cb887ed37
dsputil_mmx: fix pointer type for emulated_edge_mc_func()
...
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-25 02:04:31 +01:00
Michael Niedermayer
3e15775333
x86/ac3dsp_init: try to workaround ICC failure.
...
The asm code is not valid for older compilers because it uses too many
operands; ICC on x86_32 seems to be affected by this.
This patch disables the affected code for ICC on x86_32 and should
make it compilable again.
A better fix would be to use fewer operands or to convert this code
to yasm; the latter is being worked on AFAIK, so this is a temporary
solution.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-23 19:27:19 +01:00
Michael Niedermayer
e16bac7b33
videodsp: Fix project name
...
These are all part of the DSP utils split out from FFmpeg
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-22 00:58:08 +01:00
Michael Niedermayer
90eaa989f1
x86/videodsp_init: Add back lost author attribution
...
Code originates from:
910b9f30 libavcodec/dsputil.c (David Conrad 2010-05-27 04:39:27 +0000 334) void ff_emulated_edge_mc(uint8_t *buf, const uint8_t *src, int linesize, int block_w, int block_h,
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 335) int src_x, int src_y, int w, int h){
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 336) int x, y;
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 337) int start_y, start_x, end_y, end_x;
b5a093b3 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-25 20:22:36 +0000 338)
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 339) if(src_y>= h){
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 340) src+= (h-1-src_y)*linesize;
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 341) src_y=h-1;
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 342) }else if(src_y<=-block_h){
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 343) src+= (1-block_h-src_y)*linesize;
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 344) src_y=1-block_h;
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 345) }
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 346) if(src_x>= w){
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 347) src+= (w-1-src_x);
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 348) src_x=w-1;
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 349) }else if(src_x<=-block_w){
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 350) src+= (1-block_w-src_x);
225f9c44 libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-15 00:25:53 +0000 351) src_x=1-block_w;
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 352) }
93a21abd libavcodec/mpegvideo.c (Michael Niedermayer 2002-07-14 18:37:35 +0000 353)
b8a78f41 libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 354) start_y= FFMAX(0, -src_y);
b8a78f41 libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 355) start_x= FFMAX(0, -src_x);
b8a78f41 libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 356) end_y= FFMIN(block_h, h-src_y);
b8a78f41 libavcodec/mpegvideo.c (Michael Niedermayer 2002-11-10 11:46:59 +0000 357) end_x= FFMIN(block_w, w-src_x);
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-22 00:58:08 +01:00
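The clamping logic quoted in the blame excerpt of the entry above (90eaa989f1) can be sketched in isolation as follows. This is a minimal illustration only: `EdgeRange` and `edge_copy_range()` are hypothetical names that do not exist in FFmpeg, the `src` pointer adjustment from the original is omitted, and the positive-size assertion is an assumption added here rather than part of the quoted code.

```c
#include <assert.h>

#define FFMAX(a, b) ((a) > (b) ? (a) : (b))
#define FFMIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper type (not part of FFmpeg): the part of the block
 * that overlaps the source picture, in block coordinates. */
typedef struct EdgeRange {
    int start_x, start_y, end_x, end_y;
} EdgeRange;

/* Hypothetical helper (not part of FFmpeg): mirrors the clamping and
 * range computation shown in the blame excerpt, without touching src. */
static EdgeRange edge_copy_range(int block_w, int block_h,
                                 int src_x, int src_y, int w, int h)
{
    EdgeRange r;

    /* Assumption for this sketch: all sizes are strictly positive. */
    assert(w > 0 && h > 0 && block_w > 0 && block_h > 0);

    /* Clamp the block position so that at least one row and one
     * column of the block still touch the picture. */
    if (src_y >= h)
        src_y = h - 1;
    else if (src_y <= -block_h)
        src_y = 1 - block_h;

    if (src_x >= w)
        src_x = w - 1;
    else if (src_x <= -block_w)
        src_x = 1 - block_w;

    /* Rows/columns in [start, end) come from the picture; everything
     * outside that range has to be filled by edge replication. */
    r.start_y = FFMAX(0, -src_y);
    r.start_x = FFMAX(0, -src_x);
    r.end_y   = FFMIN(block_h, h - src_y);
    r.end_x   = FFMIN(block_w, w - src_x);
    return r;
}
```

For example, an 8x8 block at (-3, 10) in a 16x16 picture yields columns 3..8 and rows 0..6 as the region that can be copied directly.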
Michael Niedermayer
a41bf09d9c
Merge commit '6906b19346ae8a330bfaa1c16ce535be10789723'
...
* commit '6906b19346ae8a330bfaa1c16ce535be10789723':
lavc: add missing files for arm
lavc: introduce VideoDSPContext
Conflicts:
configure
libavcodec/arm/dsputil_init_armv5te.c
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/dsputil_template.c
libavcodec/h264.c
libavcodec/mpegvideo.h
libavcodec/mpegvideo_enc.c
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-21 17:18:43 +01:00
Ronald S. Bultje
8c53d39e7f
lavc: introduce VideoDSPContext
...
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Ronald S. Bultje
ce58642ed0
x86inc: support stack mem allocation and re-alignment in PROLOGUE.
...
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-12 10:37:52 +01:00
Ronald S. Bultje
6f40e9f070
x86inc: support stack mem allocation and re-alignment in PROLOGUE
...
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-12 05:23:46 +01:00
Clément Bœsch
7eafd274d8
build: fix prores decoder dependencies.
...
According to lavc/proresdsp.c, both prores and prores-lgpl decoders need
lavc/x86/proresdsp_init.c:ff_proresdsp_x86_init().
2012-12-11 02:54:55 +01:00
Michael Niedermayer
e7101a7f3f
libavcodec/x86/mpegvideo: switch to av_assert2
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-10 23:30:16 +01:00
Michael Niedermayer
ddbf0702c5
dsputil_mmx: switch to av_assert2()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-10 14:41:31 +01:00
Michael Niedermayer
a933698457
Merge commit '30b39164256999efc8d77edc85e2e0b963c24834'
...
* commit '30b39164256999efc8d77edc85e2e0b963c24834':
ac3dec: make downmix() take array of pointers to channel data
Conflicts:
libavcodec/ac3dsp.c
libavcodec/ac3dsp.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-10 02:06:50 +01:00
Mans Rullgard
30b3916425
ac3dec: make downmix() take array of pointers to channel data
2012-12-09 15:52:01 +00:00
Michael Niedermayer
0110108a7c
sbr_hf_gen_sse: Optimize code a bit more.
...
Core I7 (Sandy Bridge) 135 to 107 cycles
Core i5 (Arrandale) 162 to 142 (Thanks to Christophe Gisquet for testing)
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 17:30:11 +01:00
Michael Niedermayer
af164d7d9f
Merge commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152'
...
* commit 'c25fc5c2bb6ae8c93541c9427df3e47206d95152':
fate: dpcm: Add dependencies
SBR DSP x86: implement SSE sbr_hf_gen
AAC SBR: use AVFloatDSPContext's vector_fmul
fate: image: Add dependencies
Changelog: add an entry for deprecating the avconv -vol option
x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Conflicts:
Changelog
libavutil/x86/float_dsp.asm
tests/fate/image.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-07 15:21:41 +01:00
Christophe Gisquet
2aef3d66c9
SBR DSP x86: implement SSE sbr_hf_gen
...
The start and end indices are multiples of 2, which guarantees aligned access.
This also allows generating 4 floats per loop iteration while keeping the
alignment throughout.
Timing:
- 32 bits: 326c -> 172c
- 64 bits: 323c -> 156c
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-12-07 11:04:26 +01:00
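The alignment argument in the entry above (2aef3d66c9) can be checked with a little arithmetic. The sketch below assumes that each sample is a complex value stored as two 4-byte floats and that the buffer itself starts on a 16-byte boundary; the function names are hypothetical and are not taken from the actual SBR code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: one complex sample = 2 floats (re, im). */
enum { FLOATS_PER_SAMPLE = 2, SSE_ALIGN = 16 };

/* Byte offset of sample `index` from the start of the buffer. */
static size_t sample_byte_offset(int index)
{
    return (size_t)index * FLOATS_PER_SAMPLE * sizeof(float);
}

/* Nonzero if a 16-byte aligned load at this sample index is legal,
 * given a 16-byte aligned buffer start and 4-byte floats. */
static int offset_is_sse_aligned(int index)
{
    return sample_byte_offset(index) % SSE_ALIGN == 0;
}

/* Number of loop iterations when each iteration produces 4 floats,
 * i.e. 2 complex samples, over the half-open range [start, end). */
static int sse_iterations(int start, int end)
{
    /* Precondition stated in the commit message: both are even. */
    assert(start % 2 == 0 && end % 2 == 0 && start <= end);
    return (end - start) / 2;
}
```

With 4-byte floats, an even index always lands on a multiple of 16 bytes, while an odd index is off by 8, which is why even bounds are enough to keep every 4-float step aligned.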
Michael Niedermayer
b023392f34
mpegvideo: remove #if/define PARANOID code
...
This code never did anything as far as I can remember
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-05 19:31:27 +01:00
Michael Niedermayer
599ae9995f
ff_emulated_edge_mc: fix handling of w/h being 0
...
Fixes assertion failure
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-05 03:45:10 +01:00
Michael Niedermayer
076300bf8b
Merge commit 'bfe5454cd238b16e7977085f880205229103eccb'
...
* commit 'bfe5454cd238b16e7977085f880205229103eccb':
lavf: move ff_codec_get_tag() and ff_codec_get_id() definitions to internal.h
lavf: move "MP3 " fourcc from riff to nut
fate: vpx: Add dependencies
fate: Fix wavpack-matroskamode test dependencies
x86: dsputilenc: port to cpuflags
Conflicts:
libavformat/internal.h
libavformat/nut.c
tests/fate/vpx.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-29 13:45:57 +01:00
Michael Niedermayer
7dc0ed80e8
Merge commit '1f3f896564501c23b44fcf605567c78ce066b539'
...
* commit '1f3f896564501c23b44fcf605567c78ce066b539':
fate: Add dependencies for Vorbis, ProRes, QTRLE, utvideo tests
fate: real: Add dependencies
fate: lossless-audio: Add dependencies
x86: h264dsp: Fix linking with yasm and optimizations disabled
Conflicts:
libavcodec/x86/h264dsp_init.c
tests/fate/lossless-audio.mak
tests/fate/real.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-29 13:35:56 +01:00
Diego Biurrun
9b15c0a9b3
x86: dsputilenc: port to cpuflags
2012-11-28 16:05:44 +01:00
Diego Biurrun
89145fbbfe
x86: h264dsp: Fix linking with yasm and optimizations disabled
...
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
2012-11-28 14:45:28 +01:00
Michael Niedermayer
42d3fea65f
Merge commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1'
...
* commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1':
asink_nullsink: plug a memory leak.
x86: h264_idct: port to cpuflags
x86: cpu: Drop unused HAVE_RWEFLAGS condition
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-28 13:32:17 +01:00
Michael Niedermayer
264441715b
Merge commit 'f5fa03660db16f9d78abc5a626438b4d0b54f563'
...
* commit 'f5fa03660db16f9d78abc5a626438b4d0b54f563':
vble: Do not abort decoding when version is not 1
lavr: do not pass consumed samples as a parameter to ff_audio_resample()
lavr: correct the documentation for the ff_audio_resample() return value
lavr: do not pass sample count as a parameter to ff_audio_convert()
x86: h264_weight: port to cpuflags
configure: Enable avconv filter dependencies automatically
Conflicts:
configure
libavcodec/x86/h264_weight.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-28 13:27:18 +01:00
Diego Biurrun
2e89aeed65
x86: h264_idct: port to cpuflags
2012-11-28 00:28:09 +01:00
Diego Biurrun
28e1cf19aa
x86: h264_weight: port to cpuflags
2012-11-27 21:10:38 +01:00
Michael Niedermayer
a3f30f2e99
Merge commit '5ae72f54532960cb9eae82a1c9e8d505106c022b'
...
* commit '5ae72f54532960cb9eae82a1c9e8d505106c022b':
flashsv: check for keyframe before using differential coding
h264: enable low delay only if no delayed frames were seen
x86: fix build without inline asm
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 16:11:02 +01:00
Michael Niedermayer
a13148f633
Merge commit '8e134e5104e99a69cd4cea10540a7ce9c3682a2c'
...
* commit '8e134e5104e99a69cd4cea10540a7ce9c3682a2c':
lavc: clarify get_buffer() documentation
mpegaudiodec: use planar sample format for output unless packed is requested
x86: h264 qpel: use the correct number of utilized xmm regs in cglobal
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 14:24:19 +01:00
Michael Niedermayer
86270236d5
dsputil_mmx: ff_put_dirac_pixels depend now on yasm.
...
Fix compile failure without yasm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 13:59:41 +01:00
Michael Niedermayer
7b29b07394
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
remove #defines to prevent use of discouraged external functions
x86: h264: Convert 8-bit QPEL inline assembly to YASM
Conflicts:
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 02:17:02 +01:00
Diego Biurrun
7ee4071362
x86: fix build without inline asm
...
The qpel functions referenced here are not related to h264 and should
thus never have been under CONFIG_H264QPEL.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-26 01:50:47 +01:00
Michael Niedermayer
66c3bac2b9
Merge commit 'ad01ba6ceaea7d71c4b9887795523438689b5a96'
...
* commit 'ad01ba6ceaea7d71c4b9887795523438689b5a96':
x86: h264: Remove 3dnow QPEL code
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 00:57:33 +01:00
Justin Ruggles
2d3993ce8c
x86: h264 qpel: use the correct number of utilized xmm regs in cglobal
...
Fixes xmm register clobbering on win64.
2012-11-25 18:48:43 -05:00
Michael Niedermayer
bf2f93cdbf
Merge commit '28c8e288fa0342fdef532a7522a4707bebf831cc'
...
* commit '28c8e288fa0342fdef532a7522a4707bebf831cc':
x86: h264_chromamc: port to cpuflags
yop: fix typo
avconv: fix copying per-stream metadata.
doc: avtools-common-opts: Fix terminology concerning metric prefixes
configure: suncc: Add compiler arch support for Nehalem & Sandy Bridge
riff: Make ff_riff_tags static and move under appropriate #ifdef
Conflicts:
libavformat/riff.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-26 00:43:45 +01:00
Daniel Kang
610e00b359
x86: h264: Convert 8-bit QPEL inline assembly to YASM
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-25 20:38:35 +01:00
Daniel Kang
ad01ba6cea
x86: h264: Remove 3dnow QPEL code
...
The only CPUs that have 3dnow and don't have mmxext are 12 years old.
Moreover, AMD has dropped 3dnow extensions from newer CPUs.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-11-25 20:32:55 +01:00
Diego Biurrun
28c8e288fa
x86: h264_chromamc: port to cpuflags
2012-11-25 17:25:10 +01:00
Michael Niedermayer
533a8b2a7d
x86/mpegvideoenc_template: use av_assert
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-23 17:57:22 +01:00
Michael Niedermayer
e6d81ce22e
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
x86: h264_intrapred: Fix C function names in comments
x86: SPLATD: port to cpuflags
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-19 14:24:20 +01:00
Diego Biurrun
89923fce70
x86: h264_intrapred: Fix C function names in comments
...
Function names changed after switching to declaration with
PRED4x4/8x8/8x8L/16x16 macros in the C code.
2012-11-18 18:34:05 +01:00
Diego Biurrun
87af05c575
x86: SPLATD: port to cpuflags
2012-11-18 18:34:05 +01:00
Michael Niedermayer
2207ea44fb
ff_emulated_edge_mc: fix integer anomalies, fix out of array reads
...
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-16 21:33:52 +01:00
Michael Niedermayer
ff3b59c848
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
x86: dsputil: port to cpuflags
crc: av_crc() parameter names should match between .c, .h and doxygen
avserver: replace av_read_packet with av_read_frame
avserver: fix constness casting warnings
Conflicts:
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-16 13:23:35 +01:00
Diego Biurrun
8c3849bc76
x86: dsputil: port to cpuflags
2012-11-16 10:38:23 +01:00
Michael Niedermayer
a1b5c9634e
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
x86: mmx2 ---> mmxext in asm constructs
Conflicts:
libavcodec/x86/h264_chromamc_10bit.asm
libavcodec/x86/h264_deblock.asm
libavcodec/x86/h264dsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-14 12:34:30 +01:00
Michael Niedermayer
e13d5e9a4b
Merge commit '5e9c6ef8f3beb9ed7b271654a82349ac90fe43f2'
...
* commit '5e9c6ef8f3beb9ed7b271654a82349ac90fe43f2':
x86: h264_weight_10bit: port to cpuflags
libtheoraenc: add missing pixdesc.h header
avcodec: remove ff_is_hwaccel_pix_fmt
pixdesc: add av_pix_fmt_get_chroma_sub_sample
hlsenc: stand alone hls segmenter
Conflicts:
doc/muxers.texi
libavcodec/ffv1enc.c
libavcodec/imgconvert.c
libavcodec/mpegvideo_enc.c
libavcodec/tiffenc.c
libavformat/Makefile
libavformat/allformats.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-14 11:59:20 +01:00
Diego Biurrun
26301caaa1
x86: mmx2 ---> mmxext in asm constructs
2012-11-14 00:58:51 +01:00