* commit '16b7328058fa600d5158c84d9cc621a134eb88bc':
build: Conditionally build and run DCT test program
Conflicts:
libavcodec/Makefile
libavcodec/dct-test.c
tests/fate/libavcodec.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'bd499d9af668aef979ec9f3f3215b8dd508c7ec1':
build: Conditionally build and test iirfilter
Conflicts:
libavcodec/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
They are unneeded and make adding elements slightly harder as they
would need to be constantly updated
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c39059bea3adebcd888571d1181db215eee54495':
h264: Fix direct temporal mvs for bottom-field-first poc order
Conflicts:
libavcodec/h264_direct.c
See: ebd1c505d2
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4de8b60684ce13dff3e3d372dae4f49b9e53f755':
idct: Move arm-specific declarations to a header in the arm directory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Such files can be created using the --bff x264 option.
Sample-Id: h264_direct_temporal_mvs_bff.mkv
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
* commit '9f99a5f1d078721a30a76aec27c58805b7b87e58':
mpegencconetxt: Move rv10-specific orig_width/orig_height where they belong
Merged-by: Michael Niedermayer <michaelni@gmx.at>
When dealing with MVs, both components may be processed at a time.
On Win64, 560 to 539 cycles for derive_spatial_merge_candidates.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
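One way to read "both components at a time" is to treat a motion vector as a single 32-bit word. The sketch below assumes a hypothetical Mv struct made of two int16_t fields; the real layout and helper names may differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical motion vector layout: two 16-bit components, 4 bytes total. */
typedef struct Mv {
    int16_t x;
    int16_t y;
} Mv;

_Static_assert(sizeof(Mv) == sizeof(uint32_t), "Mv must pack into 32 bits");

/* Compare both components with one 32-bit comparison instead of two
 * 16-bit ones. memcpy() keeps the access free of aliasing problems and
 * is normally compiled down to a plain load. */
static int mv_equal(const Mv *a, const Mv *b)
{
    uint32_t wa, wb;
    memcpy(&wa, a, sizeof(wa));
    memcpy(&wb, b, sizeof(wb));
    return wa == wb;
}
```

The same idea applies to copying a vector: one 32-bit store replaces two 16-bit ones.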
There's a lag of one CTB line for SAO behind deblocking filter, except for
last line. However, once SAO has been completed on a line, all its pixels,
i.e. up to y+ctb_size are filtered and ready to be used as reference.
Without SAO, when the deblocking filter finishes a CTB line, only the
bottom 4 pixels may be filtered when the next CTB is processed by the deblocking filter.
The await_progress for hevc then checks whether the bottom pixels of a PU
require access beyond that point, so the reporting should effectively
report up to the above limits.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
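A minimal sketch of the reporting limits described above. The function name, its parameters and the exact row convention are assumptions made for illustration, not the decoder's actual code.

```c
/* Hypothetical helper: given the top row y of a CTB line whose last
 * in-loop filter stage has just finished, return the row up to which
 * pixels can be reported as ready for use as reference. */
static int hevc_ready_rows(int y, int ctb_size, int sao_enabled)
{
    if (sao_enabled)
        return y + ctb_size;     /* SAO done: the whole CTB line is final */

    /* Deblocking only: the bottom 4 rows may still be modified when the
     * next CTB line is deblocked, so they are not reported yet. */
    return y + ctb_size - 4;
}
```

A thread waiting on a PU would then compare the bottom row it needs against this value.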
* commit '1a583c0c60240adb8fa6620c6df33f1a0a0fe5d9':
fdct: Move ppc-specific declarations to a header in the ppc directory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0':
simple_idct: Move x86-specific declarations to a header in the x86 directory
Conflicts:
libavcodec/x86/simple_idct.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314':
fdct: Move x86-specific declarations to a header in the x86 directory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
- adding one extra pixel all around the frame
- do not copy when SAO is not applied
5% improvement
cherry picked from commit 10fc29fc19a12c4d8168fbe1a954b76386db12d0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '24af1aa0f70362a66cda04c9d7cd012e019f5572':
fft: Convert FFT/MDCT permutation type #defines to enums
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '746ad4e0df7faf93329804e412ec53c1d929a75b':
dct-test: Improve CPU flags struct member name
Conflicts:
libavcodec/dct-test.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'cb44b21da1f59923be577f08c267ec270529be97':
dct-test: Move cpu_flags variable out of global scope
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7e18a727d2c2a19f22fcf68875d1b05fd2eafcef':
arm: cosmetics: Consistently use lowercase for shift operators
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '87552d54d3337c3241e8a9e1a05df16eaa821496':
armv6: Accelerate ff_fft_calc for general case (nbits != 4)
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The previous implementation targeted DTS Coherent Acoustics, which only
requires nbits == 4 (fft16()). This case was (and still is) linked directly
rather than being indirected through ff_fft_calc_vfp(), but now the full
range from radix-4 up to radix-65536 is available. This benefits other codecs
such as AAC and AC3.
The implementation is based upon the C version, with each routine larger than
radix-16 calling a hierarchy of smaller FFT functions, then performing a
post-processing pass. This pass benefits a lot from loop unrolling to
counter the long pipelines in the VFP. A relaxed calling standard also
reduces the overhead of the call hierarchy, and avoiding the excessive
inlining performed by GCC probably helps with I-cache utilisation too.
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in the FFT routines (fft4() to fft512() and pass()) for the
same sample AAC stream:
                  Before            After
                  Mean    StdDev    Mean    StdDev   Confidence   Change
Audio decode      2245.5  53.1      1599.6  43.8     100.0%       +40.4%
FFT routines      940.6   22.0      348.1   20.8     100.0%       +170.2%
Signed-off-by: Martin Storsjö <martin@martin.st>
The previous implementation targeted DTS Coherent Acoustics, which only
requires mdct_bits == 6. This relatively small size lent itself to
unrolling the loops a small number of times, and encoding offsets
calculated at assembly time within the load/store instructions of each
iteration.
In the more general case (codecs such as AAC and AC3) much larger arrays
are used - mdct_bits == [8, 9, 11]. The old method does not scale for
these cases, so more integer registers are used with non-unrolled versions
of the loops (and with some stack spillage). The postrotation filter loop
is still unrolled by a factor of 2 to permit the double-buffering of some
VFP registers to facilitate overlap of neighbouring iterations.
I benchmarked the result by measuring the number of gperftools samples
that hit anywhere in the AAC decoder (starting from aac_decode_frame())
or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
example AAC stream:
                  Before            After
                  Mean    StdDev    Mean    StdDev   Confidence   Change
aac_decode_frame  2368.1  35.8      2117.2  35.3     100.0%       +11.8%
ff_imdct_half_*   457.5   22.4      251.2   16.2     100.0%       +82.1%
Signed-off-by: Martin Storsjö <martin@martin.st>
These were removed by libav in
See: git show -C 2d60444331
diff --git a/libavcodec/dsputil.c b/libavcodec/me_cmp.c
similarity index 98%
rename from libavcodec/dsputil.c
rename to libavcodec/me_cmp.c
index ba71a99..9fcc937 100644
--- a/libavcodec/dsputil.c
+++ b/libavcodec/me_cmp.c
@@ -1,8 +1,4 @@
/*
- * DSP utils
- * Copyright (c) 2000, 2001 Fabrice Bellard
- * Copyright (c) 2002-2004 Michael Niedermayer <michaelni@gmx.at>
- *
* This file is part of Libav.
*
* Libav is free software; you can redistribute it and/or
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '14b4e64eabc84c5a5e57c8ccc56bbeb95380823b':
g2meet: allow size changes within original sizes
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes fate failure with --enable-memory-poisoning && make THREAD_TYPE=slice THREADS=7 fate-hevc-conformance-ENTP_C_Qualcomm_1
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
- support for 4:2:2 and 4:4:4 up to 12 bits
- add a new profile for range extension
(cherry picked from commit d3c067fa65bbc871758d28aa07f54123430ca346)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f43789b76e661acd93c21664678f140e53cfa1fa':
hevc: set the keyframe flag on output frames
See: e2760de605
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f46bb608d9d76c543e4929dc8cffe36b84bd789e':
dsputil: Split off pixel block routines into their own context
Conflicts:
configure
libavcodec/dsputil.c
libavcodec/mpegvideo_enc.c
libavcodec/pixblockdsp_template.c
libavcodec/x86/dsputilenc.asm
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f6ee61fb05482c617f5deee29a190d8ff483b3d1':
lavc: export DV profile API used by muxer/demuxer as public
Conflicts:
configure
doc/APIchanges
libavcodec/Makefile
libavcodec/dv_profile.c
libavcodec/dv_profile.h
libavcodec/version.h
libavformat/dvenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The issue affects dvdsub subtitles (a.k.a. VOBSUB).
Some players -- in particular hardware players -- cut off
the lowest row of pixels if the number of rows in the subtitle
is odd.
The patch below implements a work-around for that. If the
number of rows is odd, it is simply rounded up to an even
number, adding an invisible (i.e. fully transparent) row.
The work-around can be enabled or disabled with a new
option -even_rows_fix. The default is disabled, so there
is no change of behaviour for users who don't care about it.
The overhead for the fix is low, and in many cases even zero:
For subtitles with an odd number of rows (i.e. in 50% of
cases on average), the size increases by two bytes because
a fully transparent row is encoded as 0x00 0x00. However,
in the VOBSUB standard, all data packets are padded to 2KB
anyway, so in most cases the additional bytes just use some
part of the padding, so there is no overhead. Only in the
rare case that the 2KB boundary is hit (0.1% chance), a full
2KB block is added.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
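The rounding rule described above can be sketched as follows. The function and parameter names are hypothetical; only the "round an odd row count up to even when the option is enabled" behaviour is taken from the description.

```c
/* Hypothetical helper: number of rows to encode for a subtitle.
 * With the work-around enabled, an odd row count is rounded up by one;
 * the extra row is meant to be fully transparent. */
static int dvdsub_encoded_rows(int rows, int even_rows_fix)
{
    if (even_rows_fix && (rows & 1))
        return rows + 1;
    return rows;
}
```

With the option disabled the row count is returned unchanged, which matches the "no change of behaviour" default.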
Disable moved functions to prevent build/test failure.
A patch to update and re-enable them is welcome.
A volunteer to maintain the alpha code is welcome too.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The threshold was chosen so that no further size decrease happened with larger lambda
with the test video.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '79fce1ec8abd017593c003917fc123f7119a78d6':
arm: Avoid using the 'setend' instruction on ARMv7 and newer
Conflicts:
libavcodec/arm/h264dsp_init_arm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '1e9a93bfca2c2f43a07e01f2ef9fd5cbafe6c22d':
libfdk-aacdec: Decode the first AAC frame to reliably identify the bitstream
Conflicts:
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
For implicit signaling cases (as possible for Spectral Band Replication
and Parametric Stereo Tools), the decoder must decode the first frame to
correctly identify the stream configuration (as called from
avformat_find_stream_info). The mechanism for this is built-in and only
requires adding CODEC_CAP_CHANNEL_CONF to the libfdk-aacdec AVCodec
struct.
Signed-off-by: Omer Osman <omer.osman@iis.fraunhofer.de>
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '246f869590b8c7313d26e1c2ef56db01f6fd2503':
vmd: Split audio and video decoder
Conflicts:
libavcodec/vmdvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8d686ca59db14900ad5c12b547fb8a7afc8b0b94':
dsputil: Split off *_8x8basis to a separate context
Conflicts:
libavcodec/dsputil.c
libavcodec/mpegvideo_enc.c
libavcodec/x86/dsputilenc_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2fc85fe96e7e0e5fc433b98eacebf4d3511d2d58':
bmv: Split audio and video decoder
Conflicts:
libavcodec/bmvvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b0633f83f277c05bf1f617a99c7aedd2db8306e3':
paf: split audio and video decoder
Conflicts:
libavcodec/pafvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
These were removed by libav while splitting the file in adcb8392c9
See: de6d9b6404
See: 723106b279
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
libutvideo.h is not a C header and thus fails building as a C file
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Currently, http://ffmpeg.org/doxygen/trunk/group__preproc__misc.html is
broken, because of an inclusion bug in Doxygen's preprocessing mode. The
now-unincluded header file is included anyway in avcodec.h, the only
place where it is used, so there should be no problem removing it.
One concern is API-compatibility, because old_codec_ids.h is a public
header. However, IMO this is not a problem because currently users
cannot include only this header and not `avcodec.h` anyway, because of
missing AV_CODEC_ID_NONE symbol.
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b0de1c766329dd8c9960ad1722e2f653160abc1b':
x86: build: Only compile FDCT code if MMX is enabled
Conflicts:
libavcodec/x86/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '12f129e545e5a5844b6ad7f3eb6a438015cad8bc':
x86: Unconditionally compile blockdsp and svq1enc init files
Conflicts:
libavcodec/x86/Makefile
blockdsp_mmx is renamed to blockdsp_init as we already have a blockdsp file
and _init is how all other such files are called
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* cehoyos/master:
Use os/2 palette even if it contains less than 256 entries.
Assume that old bmps do not contain transparency information.
Do not detect jp2 images as mov files.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e0bfe34ea8ccf333ec5b17961fd58eb575e74f8b':
libfdk-aacdec: Reduce the default decoder delay by one frame
Conflicts:
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The default error concealment method if none is set via
aacDecoder_SetParam(AAC_CONCEAL_METHOD) is set in
CConcealment_InitCommonData within the fdk-aac library
and is set to Energy Interpolation. This method requires one frame
delay to the output. To reduce the default decoder output delay and
avoid missing the last frame in file based decoding, use Noise
Substitution as the default concealment method.
Signed-off-by: Omer Osman <omer.osman@iis.fraunhofer.de>
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit 'c6698dfe7cdbc7634f33245875488ed3fa4a8ced':
webpdec: Fix decoding of the huffman group indices.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Improve the debugging function for saving subtitles
to PPM files: actually use the alpha channel.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fix an off-by-one error that causes the height of decoded
subtitles to be too small, thus cutting off the lowest row
of pixels.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'df949b645b12d62bb4da56d629a887c81f67f2e5':
hevc: Use the local context variable when needed
Conflicts:
libavcodec/hevc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'adcb8392c9b185fd8a91a95fa256d15ab1432a30':
mjpeg: Split off bits shared by MJPEG and LJPEG encoders
Conflicts:
libavcodec/mjpegenc.c
libavcodec/mjpegenc.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '253d0be6a1ecc343d29ff8e1df0ddf961ab9c772':
pgssubdec: handle more complex PGS scenarios
Conflicts:
libavcodec/pgssubdec.c
Some of this has been split out and committed in cleanly split patches immediately
before this merge
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Add ability to handle multiple palettes and objects simultaneously.
Each simultaneous object is given its own AVSubtitleRect.
Note that there can be up to 64 currently valid objects, but only
2 at any one time can be "presented".
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The commit causes minor out of array reads and was mainly intended for
future optimizations which turned out not to be measurably faster.
By itself it was just 1 cpu cycle faster.
Approved-by: jamrial
This reverts commit 057d2704e7.
* commit '5dd8c08fd5e4c04d7a08d8934f0098a8a4a35c28':
mpeg: Change ff_convert_matrix() to take an MpegEncContext parameter
Conflicts:
libavcodec/mpegvideo.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd2869aea0494d3a20d53d5034cd41dbb488eb133':
dsputil: Move MMX/SSE2-optimized IDCT bits to the x86 subdirectory
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '422e14f721c22cf9c19a8e7aae051ba9d559f6b6':
indeo2: rename stride to pitch for consistency with other Indeo decoders
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '1b04eb20f7e3f0a71f73ba91efcc3d60a435e443':
lavc: do not allocate edges in the default get_buffer2()
Conflicts:
libavcodec/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This also changes hfix8_mmx and above to use mmx regs instead of
gprs, and makes emulated_edge_mc_sse and emulated_edge_mc_sse2 use
mmxext hfix and hvar functions instead of mmx where possible.
This is mostly in preparation for an ssse3 version.
Signed-off-by: James Almer <jamrial@gmail.com>
The code is approximately 1 cpu cycle faster
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
in this case current MB size is forced to 16x16 (AVS standard section 9.9.1)
Signed-off-by: Yao Wang <jiayaowang@gmail.com>
Fixes Ticket 1901
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '593d2326ef985cdffe413df629419938f7b07c4c':
dv: Replace a magic number by sizeof()
Conflicts:
libavcodec/dv.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5ab03e41e553452118113d0c224fa32b325e45e5':
x86: h264dsp: Fix link failure with optimizations disabled
Merged-by: Michael Niedermayer <michaelni@gmx.at>
With optimizations disabled compilers have trouble doing dead code
elimination on 'if (foo && 0)' expressions, while 'if (0 && foo)'
still works, so use the latter to avoid problems.
Bug-Id: 707
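A small illustration of the ordering mentioned above. HAVE_FAST_PATH and both functions are hypothetical stand-ins; in a real build the constant would come from configuration and the guarded function might not be linked in at all, which is where the ordering matters.

```c
/* Hypothetical compile-time switch, 0 when the optimized code is not built. */
#define HAVE_FAST_PATH 0

/* Stand-ins that only report which path ran: 1 = fast, 0 = generic. */
static int fast_path(void) { return 1; }
static int slow_path(void) { return 0; }

static int pick_path(int runtime_ok)
{
    /* Constant first: "0 && foo" is recognised as dead even without
     * optimizations, while "foo && 0" may leave a reference behind. */
    if (HAVE_FAST_PATH && runtime_ok)
        return fast_path();
    return slow_path();
}
```

Both functions are defined here so the sketch links everywhere; it shows the operand order, not the link failure itself.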
* commit '772d150a6e82542c06b0c251e73dd299d98d1027':
h264: error out from decode_nal_units() when AV_EF_EXPLODE is set
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'fab9df63a3156ffe1f9490aafaea41e03ef60ddf':
dsputil: Split off global motion compensation bits into a separate context
Conflicts:
libavcodec/dsputil.c
libavcodec/dsputil.h
libavcodec/ppc/dsputil_altivec.h
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f23d26a6864128001b03876b0b92fffe131f2060':
h264: avoid using uninitialized memory in NEON chroma mc
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e121ac634ba324a318f4a97f978dcfb48da6b735':
indeo45: use is_indeo4 context flag instead of checking codec ID
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7b9ef8d701c319c26f7d0664fe977e176764c74e':
mpeg: Split error resilience bits off into a separate file
Conflicts:
configure
libavcodec/Makefile
libavcodec/mpegvideo.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '44127546b0a81dc9dd6190739a62d48f0044c6f3':
Check if an mp3 header is using a reserved sample rate.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
As indicated in the function documentation, the header MUST be
checked prior to calling it because no consistency check is done
there.
CC:libav-stable@libav.org
Fixes an invalid read past the end of avpriv_mpa_freq_tab.
Fixes divide-by-zero due to sample_rate being set to 0.
Bug-Id: 705
CC:libav-stable@libav.org
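A sketch of the kind of header check meant here, based on the standard MPEG audio frame header layout (sync in bits 31-21, layer in bits 18-17, bitrate index in bits 15-12, sample rate index in bits 11-10). The function name is hypothetical and this is not claimed to be the exact check in the tree.

```c
#include <stdint.h>

/* Hypothetical validity check for a 32-bit MPEG audio frame header.
 * Returns 1 if no reserved field value is used, 0 otherwise. */
static int mpa_header_looks_valid(uint32_t header)
{
    if ((header & 0xffe00000u) != 0xffe00000u)
        return 0;                       /* missing frame sync */
    if ((header & (3u << 17)) == 0)
        return 0;                       /* reserved layer */
    if ((header & (0xfu << 12)) == (0xfu << 12))
        return 0;                       /* invalid bitrate index */
    if ((header & (3u << 10)) == (3u << 10))
        return 0;                       /* reserved sample rate index */
    return 1;
}
```

Rejecting the reserved sample rate index before decoding the header keeps the index inside the frequency table and avoids a sample rate of 0.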
The SSE version has been no different than the mmx one since commit a41bf09d
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c54e118722cbbdc04945538d1796d4472a1ff406':
build: Have the eatqi decoder depend on the MPEG-1 decoder
Conflicts:
configure
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9a9e2f1c8aa4539a261625145e5c1f46a8106ac2':
dsputil: Split audio operations off into a separate context
Conflicts:
configure
libavcodec/takdec.c
libavcodec/x86/Makefile
libavcodec/x86/dsputil.asm
libavcodec/x86/dsputil_init.c
libavcodec/x86/dsputil_mmx.c
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9e500efdbe0deeff1602500ebc229a0a6b6bb1a2':
Add av_image_check_sar() and use it to validate SAR
Conflicts:
libavcodec/dpx.c
libavcodec/dvdec.c
libavcodec/ffv1dec.c
libavcodec/utils.c
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '650dee63c8b1e6693c6cf5983f4a5ed3f571379f':
dv: get rid of global non-const tables
Conflicts:
libavcodec/dv_profile.h
libavcodec/dvdec.c
libavcodec/dvenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '778111592bf5f38630858ee6dfcfd097cd6c6da9':
dvenc: initialize the profile only once, at init
Conflicts:
libavcodec/dvenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '874390e163427c1fe7682ab27924a7843780dbb3':
lavc: add a convenience function for rescaling timestamps in a packet
Conflicts:
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The blockdsp split did not cover Alpha optimizations
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Intel i263 codec has special 8-byte dummy frames that should not be decoded,
so do not even attempt to decode them and skip them instead.
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
Also replace INLINE_<opt> with EXTERNAL_<opt> that were wrongly
changed by commit 2b05db4f81
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Normally, a Laplace distribution is more typical of the residuals
encoded, but for noisy input, it's both better and simpler to be
safe and use a 1/d^2 distribution. Second hunk could use some
renormalization but it has effectively little impact.
Output size of ffvhuff on various 4:2:0 sequences:
context=0,1/d: 851974 27226 1137281
context=0,1/d²: 619081 25069 1051500
context=0,1/d³: 501983 30454 1290561
context=0,lapl: 500650 31754 1304082
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
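For illustration, an initial statistics table following a 1/d^2 shape could be filled as below. The scale constant, the "+ 1" that avoids a division by zero, and the use of a wrap-around distance for d are all assumptions of this sketch, not values taken from the encoder.

```c
#include <stdint.h>

/* Hypothetical weight for residual value i out of n possible values.
 * d is the distance of i from 0, taken modulo n (assumed convention). */
static uint64_t inv_square_weight(int i, int n)
{
    int d = i < n - i ? i : n - i;
    return UINT64_C(100000000) / ((uint64_t)d * d + 1);
}

/* Fill a statistics table with the 1/d^2-shaped weights. */
static void init_stats_inv_square(uint64_t *stats, int n)
{
    for (int i = 0; i < n; i++)
        stats[i] = inv_square_weight(i, n);
}
```

Small residuals get the largest weights and therefore the shortest codes, with a heavier tail than a Laplace-shaped table would give.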
This avoids the following libass warning when using the subtitles
filter: "Neither PlayResX nor PlayResY defined. Assuming 384x288"
Subtitles tests change because the output is ASS and the PlayRes[XY]
ends up in the output.
POSIX guarantees >=32bit so it all fits in signed int
Also >=32bit ints are assumed throughout the codebase
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f2ce63246f5c934429f9cb857a794e07624d7912':
dcadec: replace ldexpf with a multiplication by a constant
Conflicts:
libavcodec/dcadec.c
See: 6da06ef6bb
See: 9ccb5455ca
See: 6b88f22e89
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'fe4d5fe9361162f9033ff1bd84bfc1b2091ba785':
jpeg2000: Mark static data init functions as av_cold
Conflicts:
libavcodec/jpeg2000.c
libavcodec/jpeg2000dec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The reader reads in chunks of 11 bits at most, and at most 3 times. The unsafe
reader therefore may read 6 chunks instead of 1 in the worst case, i.e. 8 bytes,
which is within the padding tolerance.
The reader ends up being ~10% faster. Cumulative effect of unsafe reading and
code block swapping on 3 sequences is for 1 thread, decoding time goes from
23.3s to 19.0s.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The old code was reserving the 0xFFFF entry to represent a nonexistent
entry/codeword. These entries are now detected through their length
being <= 0. As this entry is often used for the residuals (-1,-1), which
should be among the most frequent, it is particularly important to not
reserve it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
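The change in how missing entries are detected can be sketched like this. The struct and function names are hypothetical; only the rule "an entry is missing when its length is <= 0" comes from the description.

```c
#include <stdint.h>

/* Hypothetical table entry: code length in bits plus the decoded value. */
typedef struct ToyVlcEntry {
    int16_t  len;
    uint16_t value;
} ToyVlcEntry;

/* An entry is usable when its length is positive. No value is reserved
 * as a marker, so 0xFFFF can be an ordinary decoded value. */
static int entry_is_valid(const ToyVlcEntry *e)
{
    return e->len > 0;
}
```

This frees the 0xFFFF slot for real data such as the frequent (-1,-1) residual pair.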
The effect is not really deterministic, as it seems to be a combination
on x86_64 of fewer registers used, different jump offsets and, for all
archs, of likely branches.
Speedup is around 15%.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Those macros take a byte number as shift argument, as this argument
differs between MMX and SSE2 instructions.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '27631796c9d1b8146ad4a16e6539ecc08afa7565':
ac3: Only initialize float_dsp for the float encoder variant
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This fixes a race condition and use of the wrong field, which become shared
instead of per thread during some AVFrame changes.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Sometimes the input buffers get directly used as raw images and
SIMD optimized video/image filters can sometimes read more than 16 bytes
over the end.
a specific example is the AVX 24bpp to yuv code
This also fixes fate-vsynth3-rgb
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Since error resilience uses AVFrame pointers instead of references it
has to copy NULL pointers too. After a codec flush the last/next frame
pointers in MpegEncContext are NULL and the old pointers remaining in
ERContext are invalid. Fixes a crash in vlc for android thumbnailer.
Reported and debugged by Adrien Maglo <magsoft@videolan.org>.
Grayscale is coded 4 pixels at a time. The encoder lacks support
for the case where width%4 != 0 and will simply encode less data,
leaving random data at the right side after decoding.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '27860819d508068f056cf48473af04868791ad77':
ppc: Consistently use convenience macro for runtime CPU detection
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '570d4b21863b6254d6bbca9c528bede471bb4478':
x86: h264: Don't keep data in the redzone across function calls on 64 bit unix
Merged-by: Michael Niedermayer <michaelni@gmx.at>
We know that the called function (ff_chroma_inter_body_mmxext)
doesn't touch the redzone, so the data there will be kept intact; thus,
this doesn't fix any bug per se.
However, valgrind's memcheck tool intentionally assumes that the
redzone is clobbered on every function call and function return
(see a long comment in valgrind/memcheck/mc_main.c). This avoids
false positives in that tool, at the cost of an extra stack pointer
adjustment.
The other alternative would be a valgrind suppression for this issue,
but that's an extra burden for everybody that wants to run libavcodec
within valgrind.
Signed-off-by: Martin Storsjö <martin@martin.st>
The actual predictor value, set by the trellis code, never
was written back into the variable that was written into
the block header. This was accidentally removed in b304244b.
This significantly improves the audio quality of the trellis
case, which was plain broken since b304244b.
Encoding IMA QT with trellis still actually gives a slightly
worse quality than without trellis, since the trellis encoder
doesn't use the exact same way of rounding as in
adpcm_ima_qt_compress_sample and adpcm_ima_qt_expand_nibble.
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
This very slightly improves compression
Found-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is probably not the simplest solution but as this is needed for a bugfix,
simplification is left for later.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes a regression since fb3e380 similar to ticket #2661,
reported by fluffrabbit at aol dot com.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
There's an SSE2 version already, and technically the SSE version
on x86_64 was wrong (using pshufd and pshuflw, SSE2 instructions).
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This avoids returning an initial frame after seeking which does
not match what would be received when decoding from the beginning.
Suggested-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was broken in 095be4fb - samples+ch (for the previous
non-planar case) equals &samples_p[ch][0]. The confusion
probably stemmed from the IMA WAV case where it originally
was &samples[avctx->channels + ch], which was correctly
changed into &samples_p[ch][1].
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
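The pointer equivalence above follows from how interleaved and planar buffers are indexed. The helpers below are hypothetical and only illustrate the mapping; they are not code from the encoder.

```c
#include <stdint.h>

/* Interleaved layout: samples of all channels for frame 0, then frame 1, ... */
static int16_t *interleaved_at(int16_t *samples, int channels, int frame, int ch)
{
    return &samples[frame * channels + ch];
}

/* Planar layout: one separate array per channel. */
static int16_t *planar_at(int16_t **samples_p, int frame, int ch)
{
    return &samples_p[ch][frame];
}
```

For frame 0 the interleaved address is samples + ch, which corresponds to &samples_p[ch][0]; for frame 1 it is &samples[channels + ch], which corresponds to &samples_p[ch][1].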
The actual predictor value, set by the trellis code, never
was written back into the variable that was written into
the block header. This was accidentally removed in b304244b.
This significantly improves the audio quality of the trellis
case, which was plain broken since b304244b.
Encoding IMA QT with trellis still actually gives a slightly
worse quality than without trellis, since the trellis encoder
doesn't use the exact same way of rounding as in
adpcm_ima_qt_compress_sample and adpcm_ima_qt_expand_nibble.
Fixes part of Ticket3701
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was broken in 095be4fb - samples+ch (for the previous
non-planar case) equals &samples_p[ch][0]. The confusion
probably stemmed from the IMA WAV case where it originally
was &samples[avctx->channels + ch], which was correctly
changed into &samples_p[ch][1].
Fixes part of Ticket3701
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is done by padding the coefficient buffer with 0s, because the order
may be only a multiple of 4, and the DSP function requires batches of 8.
However, no sample with such a case was found, so request one if it uses
that kind of order.
Approximate relative speedup depending on instruction set:
plain C: -6%
mmxext: 51%
sse2: 54%
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
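The zero padding described above can be sketched as follows. The element type, the function names and the caller-provided destination buffer are assumptions of this example.

```c
#include <stdint.h>
#include <string.h>

/* Round a filter order up to the next multiple of 8. */
static int padded_order(int order)
{
    return (order + 7) & ~7;
}

/* Copy `order` coefficients into dst and zero-fill up to the padded
 * length. dst must have room for padded_order(order) elements.
 * Returns the padded length. */
static int pad_coeffs(int16_t *dst, const int16_t *src, int order)
{
    int padded = padded_order(order);

    memcpy(dst, src, order * sizeof(*dst));
    memset(dst + order, 0, (padded - order) * sizeof(*dst));
    return padded;
}
```

The zero coefficients contribute nothing to a dot product, so a routine that works in batches of 8 gives the same result as one using the original order.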
* commit '2f7065190ad48744014a02288799d03adfa613e0':
libfdk-aac: Relicense the library wrappers to the ISC license
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This reduces the number of different licenses used within libav,
and is preferable since it has less ambiguous wordings than
the BSD license with respect to the duties of the user of the code.
Fraunhofer have now indicated that they're allowed to contribute
code under this license as well.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes the mp3 decoder produce the same result when repeatedly flushing and decoding
Suggested-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
An invalid entry already has the property of having a negative number
of bits, so remove the check on the reserved value, and rearrange the
code as a consequence.
Before:
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Overall time: 7.878s
After:
319076 decicycles in 422, 262096 runs, 48 skips
159875 decicycles in gray, 262057 runs, 87 skips
Overall time: 7.394s
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It's no longer used inside another specific macro, so rename it.
Also remove duplicated definition and realign code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These were intended to maintain the previous behavior before dca_dmix_code()
but it is unclear (to me) which way is correct and no sample seems to trigger
the case. They are also incomplete for the purpose of error checking.
Found-by: Niels Möller <nisse@lysator.liu.se>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
When the joint table does not contain a valid entry, the decoding restarts
from scratch. By implementing the trick of jumping to the 2nd level of the
individual table (and inlining the whole), a speed improvement of 5-10%
is possible.
On a 1000-frames YUV4:2:0 video, before:
362851 decicycles in 422, 262094 runs, 50 skips
182488 decicycles in gray, 262087 runs, 57 skips
Object size: 23584
Overall time: 8.377
After:
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Object size: 23188
Overall time: 7.878
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It was instead using the highest available asm functions, which completely
kills the point of being a reference C context.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Move the GNU as check before the arch specific asm checks since the .dn
check requires gas compatible assembler.
Disable the VC-1 motion compensation NEON asm which is the only part
using that directive. The integrated assembler in the upcoming clang 3.5
does not support .dn/.qn without plans to change that. Too much effort
to implement it while it is rarely used.
http://llvm.org/bugs/show_bug.cgi?id=18199.
* commit 'b88cc5cca111132b42c2ee99662bfefe7652e3da':
bink: Rename BinkDSPContext member so as not to clash with BlockDSPContext
Conflicts:
libavcodec/bink.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Blackfin is a painful platform to work with, no test machines are available
and the range of multimedia applications is dubious. Thus it only represents
a maintenance burden.
It is obviously nonsense since it produces wrong results
or even crashes (crashes should be encode-only though).
Fixes trac issue #1092.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ed39cda02923316b6710c1bcc34d3445370be5b4':
flacenc: send final extradata in packet side data
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '0957b274e312e985d69cb490bee2a7ff820acaa6':
lavc: add an option to enable side data-only packets during encoding
Conflicts:
libavcodec/avcodec.h
libavcodec/options_table.h
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8c02adc62d71dfbb079a04753d8c16152c49de88':
lavu: add all color-related enums to AVFrame
Conflicts:
libavcodec/avcodec.h
libavutil/frame.c
libavutil/frame.h
libavutil/version.h
The version check is changed so they are available with the current ABI
FFmpeg libs should have no problems with added fields, nor should any
application using the libs, and we regularly added fields in the past.
We also moved 2 of these fields to AVFrame already previously without issues.
See: a80e622924
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Some encoders (e.g. flac) need to send side data when there is no more
data to be output. This enables them to output a packet with no data in
it, only side data.
I do not know on which side to place the padding to encode with 16x16 MBs.
If someone knows or has a known-to-be-correct sample, please contact me.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The check is meant for k8 CPUs. sad16_sse2 is ~20% faster than sad16_mmxext on k10.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
the mpeg encoder would try to use them if vsad or vsse were selected for frame_skip_cmp
and frame_skip_threshold or frame_skip_factor were set to values != 0
example: "ffmpeg -i INPUT -c:v mpeg2video -skipcmp vsad -skip_threshold 1 -f null -"
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>