* commit 'e5c2794a7162e485eefd3133af5b98fd31386aeb':
x86: consistently use unaligned movs in the unaligned bswap
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This way, the special IDCT permutations are no longer needed. Bfin code
is disabled until someone updates it. This is similar to how H264 does
it, and removes the dsputil dependency imposed by the scantable code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e8c52271c45ec27d783e74238dcfad0c2008731c':
Revert "Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm."
Conflicts:
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This reverts commit f90ff772e7.
The code should be put back in h264_qpel_8bit.asm, but unfortunately
it is unconditionally used from dsputil_mmx.c since 71155d7.
* commit '845cfc92f908791714b8c4c8a49c91b8c64b685e':
x86: dsputil: Drop aliasing of ff_put_pixels8_mmx to ff_put_pixels8_mmxext
Conflicts:
libavcodec/x86/dsputil_mmx.c
Note, the commit message is wrong, there are no mmxext instructions as
claimed in the function. The change should do no harm though
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '096cc11ec102701a18951b4f0437d609081ca1dd':
x86: vc1dsp: Move ff_avg_vc1_mspel_mc00_mmxext out of dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The external assembly function uses mmxext instructions and should not be
masqueraded as an mmx-only function. Instead, use the mmx-only inline
assembly function.
This fixes crashes in chromium on win64 on machines with AVX
(crashes that apparently aren't triggered by fate).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Reference:
commit 3615e2be84
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 22:02:57 2003 +0000
h263_h_loop_filter_mmx
Originally committed as revision 2553 to svn://svn.ffmpeg.org/ffmpeg/trunk
commit 359f98ded9
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 20:28:10 2003 +0000
h263_v_loop_filter_mmx
Originally committed as revision 2552 to svn://svn.ffmpeg.org/ffmpeg/trunk
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
x86: dsputil: Fix h263 loop filter link error in some configurations
Conflicts:
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This was caused by unconditionally referencing a conditionally compiled
table. Now the code is also compiled conditionally.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The symbol "ff_h263_loop_filter_strength" is defined in h263.c, but
the h263 loopfilter functions (in the .asm file) are not optimized
out (even though their function pointers are never assigned).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '304b806cb524fb040f8e09a241040f1af2cb820b':
build: Make library minor version visible in the Makefile
x86: mpeg4qpel: Make movsxifnidn do the right thing
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Also use the resulting 16bpp functions for anything >8 and <=16, not just
9 and 10. This fixes 12 and 14bpp H264 support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Reference:
commit 3615e2be84
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 22:02:57 2003 +0000
h263_h_loop_filter_mmx
Originally committed as revision 2553 to svn://svn.ffmpeg.org/ffmpeg/trunk
commit 359f98ded9
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 20:28:10 2003 +0000
h263_v_loop_filter_mmx
Originally committed as revision 2552 to svn://svn.ffmpeg.org/ffmpeg/trunk
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc':
h264chroma: x86: Fix building with yasm disabled
rv34: Drop now unnecessary dsputil dependencies
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '620289a20e022b9c16c10d546ef86cc0bb77cc84':
sh4: Fix silly type vs. variable name search and replace typo
configure: Group all hwaccels together in a separate variable
Add av_cold attributes to arch-specific init functions
Conflicts:
configure
libavcodec/arm/mpegvideo_armv5te.c
libavcodec/x86/mlpdsp.c
libavcodec/x86/motion_est.c
libavcodec/x86/mpegvideoenc.c
libavcodec/x86/videodsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '25841dfe806a13de526ae09c11149ab1f83555a8':
Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
Conflicts:
libavcodec/alpha/dsputil_alpha.c
libavcodec/dsputil_template.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
x86: hpel: Move {avg,put}_pixels16_sse2 to hpeldsp
configure: Add a comment indicating why uclibc is checked before glibc
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '098eed95bc1a6b2c8ac97f126f62bb74699670cf':
mdec: merge mdec_common_init() into decode_init().
eatgv: use fixed-width types where appropriate.
x86: Simplify some arch conditionals
bfin: Separate VP3 initialization code
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '05b0998f511ffa699407465d48c7d5805f746ad2':
dsputil: Fix error by not using redzone and register name
swscale: GBRP output support
Conflicts:
libswscale/output.c
libswscale/swscale.c
libswscale/swscale_internal.h
libswscale/utils.c
tests/ref/lavfi/pixdesc
tests/ref/lavfi/pixfmts_copy
tests/ref/lavfi/pixfmts_null
tests/ref/lavfi/pixfmts_scale
tests/ref/lavfi/pixfmts_vflip
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Also fix project name
See git blame/log/show and
commit 826f429ae9
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Sun Jan 5 15:57:10 2003 +0000
qpel in mmx2/3dnow
qpel refinement quality parameter
Originally committed as revision 1393 to svn://svn.ffmpeg.org/ffmpeg/trunk
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This also fixes the project name
Original authors fabrice and nick go back to the initial ffmpeg commit
Others for example contributed in: (for a complete list please use git blame / show / log)
commit e9c0a38ff0
Author: Zdenek Kabelac <kabi@informatics.muni.cz>
Date: Tue May 28 16:35:58 2002 +0000
* optimized avg_* functions (except xy2)
* minor speedup for put_pixels_x2 & cleanup
Originally committed as revision 619 to svn://svn.ffmpeg.org/ffmpeg/trunk
commit 607dce96c0
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Fri May 17 01:04:14 2002 +0000
hopefully faster mmx2&3dnow MC
Originally committed as revision 506 to svn://svn.ffmpeg.org/ffmpeg/trunk
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f90ff772e7e35b4923c2de429d1fab9f2569b568':
Move H264/QPEL specific asm from dsputil.asm to h264_qpel_*.asm.
doc: update the reference for the title
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '69c25c9284645cf5189af2ede42d6f53828f3b45':
dnxhdenc: fix invalid reads in dnxhd_mb_var_thread().
x86: h264qpel: Move stray comment to the right spot and clarify it
atrac3: use correct loop variable in add_tonal_components()
Conflicts:
tests/ref/vsynth/vsynth1-dnxhd-1080i
tests/ref/vsynth/vsynth2-dnxhd-1080i
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2c10e2a2f62477efaef5b641974594f7df4ca339':
build: Make the H.264 parser select h264qpel
x86: h264qpel: add cpu flag checks for init function
Conflicts:
libavcodec/x86/h264_qpel.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit '2e4bb99f4df7052b3e147ee898fcb4013a34d904':
vorbisdsp: convert x86 simd functions from inline asm to yasm.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
* qatar/master:
proresdec: support mixed interlaced/non-interlaced content
vp3/5: move put_no_rnd_pixels_l2 from dsputil to VP3DSPContext.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3':
fate: Use wmv2 IDCT for wmv2 tests
vorbisdsp: change block_size type from int to intptr_t.
Conflicts:
tests/fate-run.sh
tests/fate/vcodec.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8a4f26206d7914eaf2903954ce97cb7686933382':
dsputil: remove butterflies_float_interleave.
srtp: Move a variable to a local scope
srtp: Add tests for the crypto suite with 32/80 bit HMAC
Conflicts:
libavcodec/x86/dsputil.asm
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a':
vp3: integrate clear_blocks with idct of previous block.
mpegvideo: fix loop condition in draw_line()
dvdsubdec: parse the size from the extradata
Conflicts:
libavcodec/dvdsubdec.c
libavcodec/mpegvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).
Arm asm updated by Janne Grunau <janne-libav@jannau.net>.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
* qatar/master:
x86: dsputil: Drop some unused macro definitions
x86: Add a Yasm-based emms() replacement
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
lavc: Move vector_fmul_window to AVFloatDSPContext
rtpdec_mpeg4: Check the remaining amount of data before reading
Conflicts:
libavcodec/dsputil.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'dae1d507af94261bafd3b11549884e5d1eca590e':
x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
vf_fps: add final flushed frames to the dropped frame count
rv34_parser: Adjust #if for disabling individual parsers
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd8c772de53d29afb1bada88afa859fce8489c668':
nutdec: Always return a value from nut_read_timestamp()
configure: Make warnings from -Wreturn-type fatal errors
x86: ABS2: port to cpuflags
vdpau: Remove av_unused attribute from function declaration
h264: fix ff_generate_sliding_window_mmcos() prototype.
Conflicts:
configure
libavformat/nutdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This could also be fixed by changing the argument type if
someone prefers that and wants to change it ...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0':
x86: ABS1: port to cpuflags
v210x: cosmetics, reformat
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The asm code is not valid for older compilers as it uses too many
operands, ICC on x86_32 seems affected by this.
This patch disables the affected code for ICC on x86_32 and should
make it compileable again.
A better fix would be to use fewer operands or to change this code
to yasm, later is being worked on AFAIK so this is a temporary
solution.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit '30b39164256999efc8d77edc85e2e0b963c24834':
ac3dec: make downmix() take array of pointers to channel data
Conflicts:
libavcodec/ac3dsp.c
libavcodec/ac3dsp.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Start and end index are multiple of 2, therefore guaranteeing aligned access.
Also, this allows to generate 4 floats per loop, keeping the alignment all
along.
Timing:
- 32 bits: 326c -> 172c
- 64 bits: 323c -> 156c
Signed-off-by: Diego Biurrun <diego@biurrun.de>
* commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1':
asink_nullsink: plug a memory leak.
x86: h264_idct: port to cpuflags
x86: cpu: Drop unused HAVE_RWEFLAGS condition
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f5fa03660db16f9d78abc5a626438b4d0b54f563':
vble: Do not abort decoding when version is not 1
lavr: do not pass consumed samples as a parameter to ff_audio_resample()
lavr: correct the documentation for the ff_audio_resample() return value
lavr: do not pass sample count as a parameter to ff_audio_convert()
x86: h264_weight: port to cpuflags
configure: Enable avconv filter dependencies automatically
Conflicts:
configure
libavcodec/x86/h264_weight.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5ae72f54532960cb9eae82a1c9e8d505106c022b':
flashsv: check for keyframe before using differential coding
h264: enable low delay only if no delayed frames were seen
x86: fix build without inline asm
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8e134e5104e99a69cd4cea10540a7ce9c3682a2c':
lavc: clarify get_buffer() documentation
mpegaudiodec: use planar sample format for output unless packed is requested
x86: h264 qpel: use the correct number of utilized xmm regs in cglobal
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The qpel functions referenced here are not related to h264 and should
thus never have been under CONFIG_H264QPEL.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The only CPUs that have 3dnow and don't have mmxext are 12 years old.
Moreover, AMD has dropped 3dnow extensions from newer CPUs.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
* qatar/master:
x86: dsputil: port to cpuflags
crc: av_crc() parameter names should match between .c, .h and doxygen
avserver: replace av_read_packet with av_read_frame
avserver: fix constness casting warnings
Conflicts:
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>