76988 Commits

Author SHA1 Message Date
Ganesh Ajjanagadde
96786a12f6 avcodec/aac_tablegen: speed up table initialization
This speeds up aac_tablegen to a ludicrous degree (~97%), i.e. to the point
where it can be argued that runtime initialization can always be done instead of
hard-coded tables. The only cost is essentially a trivial increase in
the stack size.

Even if one does not care about this, the patch also improves accuracy
as detailed below.

Performance:
Benchmark obtained by looping 10^4 times over ff_aac_tableinit.

Sample benchmark (x86-64, Haswell, GNU/Linux):
old:
1295292 decicycles in ff_aac_tableinit,     512 runs,      0 skips
1275981 decicycles in ff_aac_tableinit,    1024 runs,      0 skips
1272932 decicycles in ff_aac_tableinit,    2048 runs,      0 skips
1262164 decicycles in ff_aac_tableinit,    4096 runs,      0 skips
1256720 decicycles in ff_aac_tableinit,    8192 runs,      0 skips

new:
21112 decicycles in ff_aac_tableinit,     511 runs,      1 skips
21269 decicycles in ff_aac_tableinit,    1023 runs,      1 skips
21352 decicycles in ff_aac_tableinit,    2043 runs,      5 skips
21386 decicycles in ff_aac_tableinit,    4080 runs,     16 skips
21299 decicycles in ff_aac_tableinit,    8173 runs,     19 skips

Accuracy:
The previous code was resulting in needless loss of
accuracy due to the pow being called in succession. As an illustration
of this:
ff_aac_pow34sf_tab[3]
old : 0.000000000007598092294225
new : 0.000000000007598091426864
real: 0.000000000007598091778545

truncated to float
old : 0.000000000007598092294225
new : 0.000000000007598091426864
real: 0.000000000007598091426864

showing that the old value was not correctly rounded. This affects a
large number of elements of the array.

Patch tested with FATE.

Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-27 06:38:06 -05:00
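A minimal sketch of one way to fill such a table without calling pow per entry; it is not necessarily the approach taken in the patch. The names pow2_quarter_tab, init_pow2_quarter_tab, TAB_SIZE and TAB_ZERO are hypothetical, and the table length and zero offset are assumptions. Each entry 2^((i - TAB_ZERO) / 4) is split into one of four precomputed fractional powers and an integer power of two, which ldexp applies exactly, so no rounding error accumulates across entries.

```c
#include <math.h>

#define TAB_SIZE 428   /* assumed table length */
#define TAB_ZERO 200   /* assumed index whose entry is 2^0; must be a multiple of 4 */

static float pow2_quarter_tab[TAB_SIZE];

/* Fill tab[i] = 2^((i - TAB_ZERO) / 4) without one pow() call per element. */
static void init_pow2_quarter_tab(void)
{
    /* 2^(0/4), 2^(1/4), 2^(2/4), 2^(3/4), each rounded to double. */
    static const double root[4] = {
        1.0,
        1.18920711500272106672,
        1.41421356237309504880,
        1.68179283050742908606,
    };

    for (int i = 0; i < TAB_SIZE; i++) {
        /* (i - TAB_ZERO) / 4 == (i / 4 - TAB_ZERO / 4) + (i % 4) / 4 */
        int whole = i / 4 - TAB_ZERO / 4;

        /* ldexp scales by an exact power of two; the only rounding left
         * is the final conversion from double to float. */
        pow2_quarter_tab[i] = (float)ldexp(root[i % 4], whole);
    }
}
```

Calling init_pow2_quarter_tab() once at startup is enough; entries four indices apart then differ by exactly a factor of two.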
Matthieu Bouron
72eaf72623 lavf/utils: avoid decoding a frame to get the codec parameters
Avoid decoding a frame to get the codec parameters when the codec
supports FF_CODEC_CAP_SKIP_FRAME_FILL_PARAM. This is particularly useful
to avoid decoding images twice (once in avformat_find_stream_info and
once when the actual decode is made).
2015-11-26 21:50:55 +01:00
Rostislav Pehlivanov
f5b7a29ae8 aac_ltp: actually signal LTP as off during EIGHT_SHORT windows
This hugely reduces the echo which was introduced with the previous
commit (though likely because previously everything was broken).
Makes LTP actually worthwhile now.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-11-26 18:20:42 +00:00
Rostislav Pehlivanov
1e5dbb3409 aac_ltp: split, reorder and improve prediction algorithm
This commit attempts to mirror what the decoder does more closely
in addition to fixing some shortcomings.
2015-11-26 17:40:04 +00:00
Ganesh Ajjanagadde
a239ce7074 avcodec/faandct: remove L suffixes for floating point literal
Should fix issues with ppc, tested by bug reporter.

Reported-by: John Warburton <john@johnwarburton.net>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 11:19:03 -05:00
Matthieu Bouron
39290f2715 fate: add FF_CODEC_CAP_SKIP_FRAME_FILL_PARAM tests 2015-11-26 17:05:54 +01:00
Ganesh Ajjanagadde
74b79dcf51 avfilter/vsrc_mptestsrc: use hypot()
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 09:20:46 -05:00
Ganesh Ajjanagadde
352bd18dff avfilter/af_dynaudnorm: remove wasteful pow
This removes a wasteful pow(x, 2.0) call which, although not terribly
important for speed, is still useless.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 09:20:46 -05:00
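As a sketch of the kind of change described, squaring by multiplication gives the same value as pow(x, 2.0) without a libm call. The helper mean_square below is hypothetical and only stands in for whatever expression af_dynaudnorm actually contains.

```c
#include <stddef.h>

/* Mean of the squared samples; returns 0 for an empty buffer. */
static double mean_square(const double *samples, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += samples[i] * samples[i];   /* instead of pow(samples[i], 2.0) */

    return n ? sum / (double)n : 0.0;
}
```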
Ganesh Ajjanagadde
9ee1feaa7c avfilter/af_afade: improve accuracy and speed of gain computation
Gain computation for various curves was being done in a needlessly
inaccurate fashion. Of course these are all subjective curves, but when
a curve is advertised to the user, it should be matched as closely as
possible within the limitations of libm. In particular, the constants
kept here were pretty inaccurate for double precision.

Speed improvements are mainly due to the avoidance of pow, the most
notorious of the libm functions in terms of performance. To be fair, it
is the GNU libm that is among the worst, but it is not really GNU libm's fault
since others simply yield a higher error as measured in ULP.

"Magic" constants are also accordingly documented, since they take at
least a minute of thought for a casual reader.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 09:20:46 -05:00
Ganesh Ajjanagadde
68e79b27a5 avutil/lls: speed up performance of solve_lls
This is a trivial rewrite of the loops that results in better
prefetching and associated cache efficiency. Essentially, the problem is
that modern prefetching logic is based on finite state Markov memory, a reasonable
assumption that is also used elsewhere in CPUs, for instance in branch
predictors.

Surrounding loops all iterate forward through the array, making the
predictor think of prefetching in the forward direction, but the
intermediate loop is unnecessarily in the backward direction.

Speedup is nontrivial. Benchmarks obtained by 10^6 iterations within
solve_lls, with START/STOP_TIMER. File is tests/data/fate/flac-16-lpc-cholesky.err.
Hardware: x86-64, Haswell, GNU/Linux.

new:
  17291 decicycles in solve_lls, 2096706 runs,    446 skips
  17255 decicycles in solve_lls, 4193657 runs,    647 skips
  17231 decicycles in solve_lls, 8384997 runs,   3611 skips
  17189 decicycles in solve_lls,16771010 runs,   6206 skips
  17132 decicycles in solve_lls,33544757 runs,   9675 skips
  17092 decicycles in solve_lls,67092404 runs,  16460 skips
  17058 decicycles in solve_lls,134188213 runs,  29515 skips

old:
  18009 decicycles in solve_lls, 2096665 runs,    487 skips
  17805 decicycles in solve_lls, 4193320 runs,    984 skips
  17779 decicycles in solve_lls, 8386855 runs,   1753 skips
  18289 decicycles in solve_lls,16774280 runs,   2936 skips
  18158 decicycles in solve_lls,33548104 runs,   6328 skips
  18420 decicycles in solve_lls,67091793 runs,  17071 skips
  18310 decicycles in solve_lls,134187219 runs,  30509 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-26 09:20:46 -05:00
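A minimal sketch of the loop-direction idea, not the actual solve_lls code. The functions row_dot_backward and row_dot_forward are hypothetical; they compute the same dot product and differ only in traversal order. Integer data is used so that both orders give bit-identical results; with floating point, reordering a sum can change the rounding.

```c
#include <stddef.h>

/* Backward traversal: walks the row from its last element to its first,
 * against the direction of the surrounding forward loops. */
static long row_dot_backward(const long *row, const long *x, size_t n)
{
    long sum = 0;

    for (size_t k = n; k-- > 0; )
        sum += row[k] * x[k];
    return sum;
}

/* Forward traversal: same arithmetic, but memory is touched in increasing
 * address order, matching a prefetcher trained by forward loops. */
static long row_dot_forward(const long *row, const long *x, size_t n)
{
    long sum = 0;

    for (size_t k = 0; k < n; k++)
        sum += row[k] * x[k];
    return sum;
}
```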
Paul B Mahol
a330430238 avfilter/vf_stack: make it possible to stop with shortest stream
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-11-26 10:02:00 +01:00
Timothy Gu
9078a694f3 aaccoder_twoloop: Mark sfdiff as av_unused
Silences warning when building without assertions

Signed-off-by: Claudio Freire <klaussfreire@gmail.com>
2015-11-26 03:46:09 -03:00
Claudio Freire
3b1cab9351 AAC encoder: fix wrong gain scalefactor being set
In some conditions, where the first band was being zeroed
mainly, the wrong global gain scalefactor would be written
to the stream since it's always taken from the first band
regardless of whether it's been marked as zero or not.

So, always make sure it contains something useful.
2015-11-26 03:37:29 -03:00
Claudio Freire
fc36d852ee AAC encoder: Fix application of M/S with PNS
When both M/S coding and PNS are enabled, scalefactors
and coding books would be mistakenly clobbered when setting
the M/S flag on PNS'd bands. The flag needs to be set to
signal the generation of correlated noise, but the scalefactors,
coefficients and the coding books need to be kept intact.
2015-11-26 03:27:06 -03:00
Timothy Gu
04deaef293 fate-run: Fix indentation 2015-11-25 21:03:14 -08:00
Rodger Combs
362c17e656 lavf/http: fix incorrect warning in range requests 2015-11-25 19:34:01 -06:00
Michael Niedermayer
b3494e3c3e avcodec/pthread_slice: Remove rets_count
It appears rets_count is redundant

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-11-26 00:56:12 +01:00
James Almer
3885ef0c6c avcodec/mjpegdec: fix typo on a warning 2015-11-25 19:24:24 -03:00
Paul B Mahol
56ff563f3b avfilter: add '.' at end of long filter description where it is missing
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-11-25 22:22:17 +01:00
Paul B Mahol
142894d720 avfilter: do not leak frame if ff_get_audio_buffer() fails
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-11-25 21:59:33 +01:00
Paul B Mahol
fd3df296c1 avfilter/af_alimiter: make description a bit longer
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-11-25 21:52:36 +01:00
Stefano Sabatini
5f2c233a85 doc/indevs: fix x11grab options consistency 2015-11-25 18:08:40 +01:00
Paul B Mahol
5b106215ba avfilter/af_sidechaincompress: add forgotten option
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-11-25 12:57:13 +01:00
Luca Barbato
0e2395293b nut: Mark non-fatal errors as warnings
And make one more informative.
2015-11-25 09:01:25 +01:00
Luca Barbato
62f72b40c0 nut: Provide more information on failure 2015-11-25 09:01:25 +01:00
Luca Barbato
2c17fb61ce rtsp: Log getaddrinfo failures
And forward the logging contexts when needed.
2015-11-25 09:01:25 +01:00
Luca Barbato
12b1438286 udp: Provide additional information on getaddrinfo failure 2015-11-25 09:01:25 +01:00
Luca Barbato
34af7813f7 udp: Use the logging context 2015-11-25 09:01:25 +01:00
Luca Barbato
98063bcf15 rtsp: Do not assume getnameinfo cannot fail
And properly report the error when it happens.
2015-11-25 09:01:25 +01:00
Ganesh Ajjanagadde
29af74e4e3 avutil/libm: fix isnan compatibility hack
Commit 14ea4151d7 had a bug in that the
conversion of the uint64_t result to an int (the return signature) would
lead to implementation defined behavior, and in this case simply
returned 0 for NAN. A fix via AND'ing the result with 1 does the trick,
simply by ensuring a 0 or 1 return value.

Patch tested with FATE on x86-64, GNU/Linux by forcing the compatibility
code via an ifdef hack suggested by Michael.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 21:33:13 -05:00
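The failure mode described above can be sketched as follows. This is an illustration under assumptions, not the code in avutil/libm: all function names are hypothetical, double is assumed to be IEEE-754 binary64, and the truncation is modelled by keeping only the low 31 bits of the mantissa so that the example itself stays well defined. A common quiet NaN has only a high mantissa bit set, so the truncated value is 0; normalising the test to 0 or 1 before returning it as int avoids the problem.

```c
#include <stdint.h>
#include <string.h>

#define EXP_MASK  0x7ff0000000000000ULL   /* exponent bits of a binary64 */
#define MANT_MASK 0x000fffffffffffffULL   /* mantissa bits of a binary64 */

/* Reinterpret a double as its bit pattern (assumes IEEE-754 binary64). */
static uint64_t bits_from_double(double x)
{
    uint64_t v;
    memcpy(&v, &x, sizeof(v));
    return v;
}

/* Build a double from a bit pattern, handy for constructing test values. */
static double double_from_bits(uint64_t v)
{
    double x;
    memcpy(&x, &v, sizeof(x));
    return x;
}

/* Models the bug: the 52-bit mantissa is squeezed into an int, so a NaN
 * whose only set mantissa bits are high ones is reported as "not NaN". */
static int isnan_truncating(double x)
{
    uint64_t v = bits_from_double(x);

    if ((v & EXP_MASK) != EXP_MASK)
        return 0;
    return (int)(v & MANT_MASK & 0x7fffffffULL);
}

/* Fixed variant: reduce the mantissa test to 0 or 1 before returning. */
static int isnan_fixed(double x)
{
    uint64_t v = bits_from_double(x);

    if ((v & EXP_MASK) != EXP_MASK)
        return 0;
    return (v & MANT_MASK) != 0;
}
```

With the quiet NaN pattern 0x7ff8000000000000, isnan_truncating returns 0 while isnan_fixed returns 1; infinities and ordinary values return 0 from both.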
Timothy Gu
4f99308ed3 doc/indevs: Fix German 2015-11-24 18:18:12 -08:00
Timothy Gu
798920033e configure: Fix pseudo-German 2015-11-24 18:18:12 -08:00
Ganesh Ajjanagadde
990619968a avfilter/vsrc_mandelbrot: change sin to sinf for color computation
lrintf is anyway used, suggesting we only care up to floating precision.
Furthermore, there is a compat hack in avutil/libm for this function,
and it is used in avcodec/aacps_tablegen.h.

This yields a non-negligible speedup. Sample benchmark:
x86-64, Haswell, GNU/Linux:

old (draw_mandelbrot):
274635709 decicycles in draw_mandelbrot,     256 runs,      0 skips
300287046 decicycles in draw_mandelbrot,     512 runs,      0 skips
371819935 decicycles in draw_mandelbrot,    1024 runs,      0 skips
336663765 decicycles in draw_mandelbrot,    2048 runs,      0 skips
581851016 decicycles in draw_mandelbrot,    4096 runs,      0 skips

new (draw_mandelbrot):
269882717 decicycles in draw_mandelbrot,     256 runs,      0 skips
296359285 decicycles in draw_mandelbrot,     512 runs,      0 skips
370076599 decicycles in draw_mandelbrot,    1024 runs,      0 skips
331478354 decicycles in draw_mandelbrot,    2048 runs,      0 skips
571904318 decicycles in draw_mandelbrot,    4096 runs,      0 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 20:36:40 -05:00
Ganesh Ajjanagadde
e9c7493f19 avfilter/vsrc_mandelbrot: avoid sqrt for epsilon calculation
This rewrites into a similar expression avoiding sqrt. Similarity is
assured since sqrt(x^2 + y^2)/(x+y) lies in [1/sqrt(2), 1] for x, y > 0.

Tested on x86-64, Haswell, GNU/Linux.
Command:
ffmpeg -f lavfi -i mandelbrot -f null -

old (draw_mandelbrot):
277625266 decicycles in draw_mandelbrot,     256 runs,      0 skips
304527322 decicycles in draw_mandelbrot,     512 runs,      0 skips
377593582 decicycles in draw_mandelbrot,    1024 runs,      0 skips
338539499 decicycles in draw_mandelbrot,    2048 runs,      0 skips
583630357 decicycles in draw_mandelbrot,    4096 runs,      0 skips

new (draw_mandelbrot):
274635709 decicycles in draw_mandelbrot,     256 runs,      0 skips
300287046 decicycles in draw_mandelbrot,     512 runs,      0 skips
371819935 decicycles in draw_mandelbrot,    1024 runs,      0 skips
336663765 decicycles in draw_mandelbrot,    2048 runs,      0 skips
581851016 decicycles in draw_mandelbrot,    4096 runs,      0 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 20:36:40 -05:00
Ganesh Ajjanagadde
81a0aec29e avcodec/aacps_tablegen: use hypot()
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 20:36:40 -05:00
Ganesh Ajjanagadde
aececd11ab avcodec/aacps_tablegen_template: replace #define by typedef
See e.g. https://stackoverflow.com/questions/1666353/are-typedef-and-define-the-same-in-c
for rationale.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 20:36:40 -05:00
Ganesh Ajjanagadde
5472de5ca8 avcodec/aac_defines: replace #define by typedef
See e.g. https://stackoverflow.com/questions/1666353/are-typedef-and-define-the-same-in-c
for rationale.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-11-24 20:36:40 -05:00
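One well-known difference between the two forms, sketched with hypothetical names that do not come from the AAC headers: a #define is pasted in as text, so in a declaration with several declarators it applies the pointer only to the first one, whereas a typedef names a real type that applies to all of them. The sketch uses C11 _Generic to report the resulting types.

```c
/* Textual substitution versus a real type alias. */
#define INT_PTR_MACRO int *
typedef int *IntPtr;

/* Evaluates to 1 if the expression has type int *, else 0 (C11). */
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

/* With the macro, "INT_PTR_MACRO a, b;" expands to "int *a, b;",
 * so b is a plain int. */
static int second_is_pointer_with_macro(void)
{
    INT_PTR_MACRO a = 0, b = 0;

    (void)a;
    (void)b;
    return IS_INT_PTR(b);
}

/* With the typedef, both declarators get the pointer type. */
static int second_is_pointer_with_typedef(void)
{
    IntPtr c = 0, d = 0;

    (void)c;
    (void)d;
    return IS_INT_PTR(d);
}
```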
Timothy Gu
15dcc506d7 vsrc_mandelbrot: Don't use German in comments 2015-11-24 17:33:07 -08:00
Marton Balint
839eb1c77d lavfi/select: add support for concatdec_select option
This option can be used to select useful frames from an ffconcat file which is
using inpoints and outpoints but where the source files are not intra frame
only.

Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
2015-11-25 00:34:29 +01:00
Marton Balint
65406b0bed concatdec: add option for adding segment start time and duration metadata
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Marton Balint <cus@passwd.hu>
2015-11-25 00:34:29 +01:00
Marton Balint
ba9191ab3a concatdec: simplify duration calculation in open_next_file
If duration is still AV_NOPTS_VALUE when opening the next file, we can assume
that outpoint is not set.

Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Marton Balint <cus@passwd.hu>
2015-11-25 00:34:29 +01:00
Marton Balint
8f60663c8b concatdec: calculate duration early if outpoint is known
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Marton Balint <cus@passwd.hu>
2015-11-25 00:34:29 +01:00
Michael Niedermayer
4ea4d2f438 avcodec/h264_slice: Limit max_contexts when slice_context_count is initialized
Fixes out of array access
Fixes: 1430e9c43fae47a24c179c7c54f94918/signal_sigsegv_421427_2049_f2192b6829ab6e0eefcb035329c03c60.264

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-11-24 22:17:36 +01:00
Michael Niedermayer
5b70fb8fee movenc-test: Fix integer overflows
Signed-off-by: Martin Storsjö <martin@martin.st>
2015-11-24 20:57:11 +02:00
Vittorio Giovara
fdd5c48ebd texturedsp: Explicitly cast RGBA parameters to unsigned
Silences warnings when using -Wshift-overflow (GCC 6+).
Found-by: James Almer <jamrial@gmail.com>
2015-11-24 09:24:48 -05:00
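A minimal sketch of the issue behind this warning; the function pack_rgba and its channel order are hypothetical, and uint32_t is used here where the commit speaks of unsigned. An 8-bit value is promoted to signed int before shifting, so shifting a value of 128 or more left by 24 overflows int. Casting to an unsigned type first keeps the shift well defined.

```c
#include <stdint.h>

/* Pack four 8-bit channels into one 32-bit word. Each channel is cast
 * to uint32_t before shifting so that a << 24 never overflows a signed
 * int after integer promotion. */
static uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r <<  0) |
           ((uint32_t)g <<  8) |
           ((uint32_t)b << 16) |
           ((uint32_t)a << 24);
}
```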
Vittorio Giovara
eef38316ca texturedspenc: Avoid using separate variables
Use the result directly, removing an unneeded cast.
2015-11-24 09:24:39 -05:00
Vittorio Giovara
7831fb9050 textureencdsp: cosmetics: Use normal static const for tables 2015-11-24 09:24:30 -05:00
Vittorio Giovara
99cb833fc2 sgi: Correctly propagate meaningful error values 2015-11-24 09:05:01 -05:00
Vittorio Giovara
823fa70045 fate: Rework sgi tests into a suite and add the missing ones
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-11-24 09:05:01 -05:00
Vittorio Giovara
4a0918cae6 sgienc: Support encoding high bit depth images with RLE
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-11-24 09:05:01 -05:00