This may improve the precision of the fixed point encoder/decoder for some
compilers and architectures.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Marton Balint <cus@passwd.hu>
Should fix xvid/cache on windows with --enable-shared
May be related to Ticket 4780
Tested-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The maximum number of bits int the prefix code for
p(0) is 4. By setting it as 3, we were missing the
last 0 bit.
This fixes bug #4715 present on the trac.
Signed-off-by: Umair Khan <omerjerk@gmail.com>
Reviewed-by: Thilo Borgmann <thilo.borgmann@mail.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Yields 2x improvement in function performance, and boosts aac encoding
speed by ~ 4% overall. Sample benchmark (Haswell+GCC under -march=native):
after:
ffmpeg -i sin.flac -acodec aac -y sin_new.aac 5.22s user 0.03s system 105% cpu 4.970 total
before:
ffmpeg -i sin.flac -acodec aac -y sin_new.aac 5.40s user 0.05s system 105% cpu 5.162 total
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanag@gmail.com>
Understanding the mips32r6 and mips64r6 ISAs in the configure script is
not enough. In order to have full support for MIPS R6 in FFmpeg we need
to be able to build it, and for that we need to make sure we don't use
incompatible assembler code which makes the build fail. Ifdefing the
offending code is sufficient to fix the problem.
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This ensures gcc does not create unnecessary
loads or stores and possibly even does not vectorize
the negation.
Speeds up mp3 to aac transcoding with default settings
by 10% when using "gcc (Debian 5.3.1-10) 5.3.1 20160224".
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
The change of bps from 0 doesn't contain any info useful to the
user. This message is now at info log level only if the original
value is !=0, otherwise pushed back to debug log level. The
original value is displayed additionally.
Signed-off-by: Moritz Barsnick <barsnick@gmx.net>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The first X96 channel set can have more channels than core, causing X96
decoding to be skipped. Clear the number of decoded X96 channels to zero
in this rudimentary case.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Umair Khan <omerjerk@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
I cannot see any point whatsoever to use
double here instead of float, the results
are likely identical in all cases..
Using float allows for much more
efficient use of SIMD.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Fixes compilation of fft with hardcoded tables
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
This was a leftover from before the slices were encoded in parallel.
Since the put_bits context is initialized per slice aligning it
aferwards is pointless.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit solves most of the crashes and issues with the encoder and
the bitrate setting. Now the encoder will always allocate the absolute
lowest amount of memory regardless of what the bitrate has been set to.
Therefore if a user inputs a very low bitrate the encoder will use the
maximum possible quantization (basically zero out all coefficients),
allocate a packet and encode it. There is no coupling between the
bitrate and the allocation size and so no crashes because the buffer
isn't large enough.
The maximum quantizer was raised to the size of the table now to both
keep the overshoot at ridiculous bitrates low and to improve quality
with higher bit depths (since the coefficients grow larger per transform
quantizing them to the same relative level requires larger quantization
indices).
Since the quantization index start follows the previous quantization
index for that slice, the quantization step was reduced to a static 1
to improve performance. Previously with quant/5 the step was usually
set to 0 upon start (and was later clipped to 1), that isn't a big change.
As the step size increases so does the amount of bits leftover and so
the redistribution algorithm has to iterate more and thus waste more
time.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
AVClass is now a const, the rest are no-op.
* commit '5e555f93009f0605db120eec78262d0fe337e645':
mpeg12enc: always write closed gops for intra only outputs
h264: Add an AVClass pointer to H264Context
libx264: Fix noise_reduction option assignment
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
These macros were added in OS X 10.11, and the file compiles without warnings
on both 10.10 and 10.11 with them removed.
Thanks to mark4o on IRC for pointing out the failure and testing the patch.
Autodetected by default. Encode using -codec:v h264_videotoolbox.
Signed-off-by: Rick Kern <kernrj@gmail.com>
Signed-off-by: wm4 <nfxjfg@googlemail.com>
It makes no sense whatsoever to do this at each function call; we
already have a table for this.
Yields a 2x improvement in find_min_book (x86-64, Haswell+GCC):
ffmpeg -i sin.flac -acodec aac -y sin.aac
find_min_book
old
605 decicycles in find_min_book, 8388453 runs, 155 skips.9x
606 decicycles in find_min_book,16776912 runs, 304 skips.9x
607 decicycles in find_min_book,33553819 runs, 613 skips.2x
607 decicycles in find_min_book,67107668 runs, 1196 skips.3x
607 decicycles in find_min_book,134215360 runs, 2368 skips3x
new
359 decicycles in find_min_book, 8388552 runs, 56 skips.3x
360 decicycles in find_min_book,16777112 runs, 104 skips.1x
361 decicycles in find_min_book,33554218 runs, 214 skips.4x
361 decicycles in find_min_book,67108381 runs, 483 skips.5x
361 decicycles in find_min_book,134216725 runs, 1003 skips5x
and more importantly a non-negligible speedup (~ 8%) to overall AAC encoding:
old:
ffmpeg -i sin.flac -acodec aac -strict -2 -y sin_new.aac 6.82s user 0.03s system 104% cpu 6.565 total
new:
ffmpeg -i sin.flac -acodec aac -strict -2 -y sin_old.aac 6.24s user 0.03s system 104% cpu 5.993 total
This also improves accuracy of the expression by ~ 2 ulp in some cases.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanag@gmail.com>
This fixes a data race warning by ThreadSanitizer.
FrameThreadContext.die is read by all the worker threads but is not
protected by any mutex. Move it to PerThreadContext so that each worker
thread reads its own copy of |die|, which can then be protected with
PerThreadContext.mutex.
Signed-off-by: Wan-Teh Chang <wtc@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This was a regression introduced by commit e7345abe05 which
enabled full use of the allocated packet but due to the overhead of
using field coding the buffer was too small and triggered warnings and
crashes.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The fact that now all quantization indices costs are cached justifies
storing 20 more integers in a structure already allocated on heap.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit redistributes the leftover bytes amongst the top 150 slices
in terms of size (in the hopes that they'll be the ones pretty bitrate
starved).
A more perceptual method would probably need to cut bits off from slices
which don't need much, but that'll be implemented later.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Previously a global average was used. Using the previous quantizer
resulted in a fairly significant speedup as slice size selection settled
down quicker.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
16 bits were definitely not enough and caused artifacts to appear on
images at barely compressed images.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Check that the required plane pointers and only
those are set up.
Currently does not enforce anything for the palette
pointer of pseudopal formats as I am unsure about the
requirements.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
We do neither document nor check such a requirement
and for application-provided get_buffer2 they could
contain the result of a malloc(0) or whatever value
they had previously.
This fixes a use-after-free in e.g. MPlayer:
https://trac.mplayerhq.hu/ticket/2262
We might want to consider changing the (documented)
API in addition though.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Reported as https://trac.mplayerhq.hu/ticket/2264 but have
not been able to reproduce with FFmpeg-only.
I have no idea what coded_height is used for here exactly,
so this might not be the best fix.
Fixes the following chain of events:
ff_mss12_decode_init sets coded_height while not setting height.
ff_mpv_decode_init then copies coded_height into MpegEncContext height.
This is then used by init_context_frame to allocate the data structures.
However the wmv9rects are validated/initialized based on avctx->height, not
avctx->coded_height.
Thus the decode_wmv9 function will try to decode a larger video that we
allocated data structures for, causing out-of-bounds writes.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
This should return an error to the decoder if the struct it tried to getbuffer is dirty
Reviewed-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit moves the minimum bits per slice calculations outside of the
rate control function as it is identical for every slice.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since coefficients differ only in the last bit when writing to the
bitstream it was possible to remove the sign from the tables, thus
halving them. Also now all quantization is done in the unsigned domain
as the sign is completely separate, which gets rid of the need to do
quantization on 32 bit signed integers.
Overall, this slightly speeds up the encoder depending on the machine.
The commit still generates bit-identical files as before the commit.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Some containers, like webm/mkv, will contain this mastering metadata.
This is analogous to the way 3D fpa data is handled (in frame and
packet side data).
Signed-off-by: Neil Birkbeck <neil.birkbeck@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'f9fbd474676e903e12efe83203697d60a9d28cf9':
msmpeg4data: Move WMV2 data tables to their own file
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>