Commit Graph

34679 Commits

Author SHA1 Message Date
Andreas Cadhalpun
7ea2db6eaf mjpegdec: extend check for incompatible values of s->rgb and s->ls
This can happen if s->ls changes from 0 to 1, but picture allocation is
skipped due to s->interlaced.

In that case ff_jpegls_decode_picture could be called even though the
s->picture_ptr frame has the wrong pixel format and thus a wrong
linesize, which results in a too small zero buffer being allocated.

This fixes an out-of-bounds read in ls_decode_line.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-31 17:30:25 +01:00
Alexandra Hájková
40d9496773 dca: use defines for subband related constants
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2015-12-31 11:40:32 +01:00
Ganesh Ajjanagadde
b492fbcc6e lavc/dsd_tablegen: always generate tables at runtime
Commit b272c3a5aa has sped up dsd_tablegen, and now table generation takes
~ 40k cycles. Thus, these tables can always be generated at runtime.

Tested with/without --enable-hardcoded-tables.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-30 20:37:13 -08:00
Rostislav Pehlivanov
8de5b0d966 dirac_dwt: remove unnecessary undefs
They're all undefined within the template file.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-31 00:35:06 +00:00
Ganesh Ajjanagadde
05434b0eea lavc/cook: get rid of wasteful pow in init_pow2table
The table is highly structured, so pow (or exp2 for that matter) can entirely
be avoided, yielding a ~ 40x speedup with no loss of accuracy.

sample benchmark (Haswell, GNU/Linux):
new:
4449 decicycles in init_pow2table(loop 1000),     254 runs,      2 skips
4411 decicycles in init_pow2table(loop 1000),     510 runs,      2 skips
4391 decicycles in init_pow2table(loop 1000),    1022 runs,      2 skips

old:
183673 decicycles in init_pow2table(loop 1000),     256 runs,      0 skips
182142 decicycles in init_pow2table(loop 1000),     512 runs,      0 skips
182104 decicycles in init_pow2table(loop 1000),    1024 runs,      0 skips

Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-30 08:52:19 -08:00
Ganesh Ajjanagadde
b272c3a5aa lavc/dsd_tablegen: speed up table generation
Tables are bit identical.
Sample benchmark (Haswell, GNU/Linux+gcc):
old:
 814485 decicycles in dsd_ctables_tableinit,     512 runs,      0 skips

new:
 356808 decicycles in dsd_ctable_tableinit,     512 runs,      0 skips

Binary size should essentially be identical, and is in fact identical on
the configuration I tested on.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-30 08:45:19 -08:00
Janne Grunau
8563f98871 x86: use emms after ff_int32_to_float_fmul_scalar_sse
Intel's Instruction Set Reference (as of September 2015) clearly states
that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the
source is a memory location. The Instruction Set Reference from 1999
(Order Number 243191) describes this behaviour but all later versions
I've seen have make no distinction whether MMX registers or memory is
used as source.
The documentation for the matching SSE2 instruction to convert to double
(cvtpi2pd) was fixed (see the valgrind bug
https://bugs.kde.org/show_bug.cgi?id=210264).

It will take time to get a clarification and fixes in place. In the
meantime it makes sense to change ff_int32_to_float_fmul_scalar_sse to
be correct according to the documentation. The vast majority of users
will have SSE2 so a change to the SSE version has little effect.

Fixes fate-checkasm on x86 valgrind targets.

Valgrind 'bug' reported as https://bugs.kde.org/show_bug.cgi?id=357059
2015-12-30 13:37:57 +01:00
Mark Harris
c51c08e0e7 avcodec: Use get_ue_golomb_long() when needed
get_ue_golomb() cannot decode values larger than 8190 (the maximum
value that can be golomb encoded in 25 bits) and produces the error
"Invalid UE golomb code" if a larger value is encountered.  Use
get_ue_golomb_long() instead (which supports 63 bits, up to 4294967294)
when valid h264/hevc values can exceed 8190.

This updates decoding of the following values:   (maximum)
  first_mb_in_slice                                36863* for level 5.2
  abs_diff_pic_num_minus1                         131071
  difference_of_pic_nums_minus1                   131071
  idr_pic_id                                       65535
  recovery_frame_cnt                               65535
  frame_packing_arrangement_id                4294967294
  frame_packing_arrangement_repetition_period      16384
  display_orientation_repetition_period            16384

An alternative would be to modify get_ue_golomb() to handle encoded
values of up to 49 bits as was done for get_se_golomb() in a92816c.
In that case get_ue_golomb() could continue to be used for all of
these except frame_packing_arrangement_id.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-29 13:11:51 +01:00
Janne Grunau
f4f27e4cf1 x86: zero extend the 32-bit length in int32_to_float_fmul_scalar implicitly
This reverts commit 5dfe4edad6.
2015-12-29 11:42:51 +01:00
Hendrik Leppkes
50401f5fb7 avcodec: properly check pkt_timebase for validity
Unset/invalid timebases have a zero numerator.
This makes the checks consistent with other timebase checks and fixes an
integer division by 0.
2015-12-28 10:24:15 +01:00
Michael Niedermayer
3215342121 avcodec/on2avc: Fix stability issues with scale_tab generation
This also simplifies the code
the resulting values are binary identical to what pow(10, i/10.0) produces
2015-12-27 16:44:48 +01:00
Ganesh Ajjanagadde
c5b3c4c741 lavc/snowenc: replace rint by lrint
avoids float to int cast.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-26 20:29:22 -08:00
Ganesh Ajjanagadde
5979c740f5 lavc/dds: replace rint by lrint
avoids float to int cast.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-26 20:29:22 -08:00
Ganesh Ajjanagadde
e09edc62cd lavc/texturedsp: replace rint by lrint
avoids float to int cast.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-26 20:29:22 -08:00
Ganesh Ajjanagadde
71af38954b avcodec/on2avc: fix regression on icc since 5495c7f
Should fix the regression, and also speeds up table generation.
Tables tested on GNU/Linux+clang: they are identical to the ones prior
to 5495c7f. ff_exp10 caused one slight change in one entry, 50000 became
50001 due to somewhat incorrect rounding.

Untested on ICC; passes FATE on GNU/Linux+gcc.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-26 17:20:55 -08:00
Diego Biurrun
69a68593ce Remove stray line breaks from avpriv_{report_missing_feature|request_samples} 2015-12-26 10:28:03 +01:00
Ganesh Ajjanagadde
0abdcae5a9 lavc/acelp_pitch_delay: replace exp2f(M_LOG2_10 *x) by ff_exp10f(x)
Suggested-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
25ae086db2 lavc/wmaprodec: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
3343e4e607 lavc/wmaenc: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
62765c0014 lavc/wmadec: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
a0ea801dc3 lavc/opus: replace pow(10,x) by ff_exp10(x)
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
5495c7f2a3 lavc/on2avc: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
26ac80d235 lavc/imc: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
717eeb77e1 lavc/dcaenc: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
b0e28da37c lavc/cngdec: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
cb3a994bb1 lavc/aacpsy: replace pow(10,x) by ff_exp10(x)
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:48:18 -08:00
Ganesh Ajjanagadde
48cd3d233b lavc/libopusdec: replace pow(10,x) by ff_exp10(x)
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-25 10:18:43 -08:00
Michael Bradshaw
99eabcdd5f avcodec: add OpenJPEG 2.x compatibility
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-25 10:55:34 +01:00
Michael Niedermayer
052e692e85 avcodec/ac3dec: Print the value of out of range exponents
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-24 20:47:25 +01:00
Alexandra Hájková
2008f76054 dca: remove unused decode_hf function and quant_d tables
They were superseded with their integer equivalents. Rename integer
decode_hf to decode_hf.
2015-12-24 13:58:18 +01:00
Paul B Mahol
57787f5ef8 avcodec/s302menc: comment out allowed channel layouts
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-23 12:46:23 +01:00
Alexandra Hájková
aebf07075f dca: change the core to work with integer coefficients.
The DCA core decoder converts integer coefficients read from the
bitstream to floats just after reading them (along with dequantization).
All the other steps of the audio reconstruction are done with floats
which makes the output for the DTS lossless extension (XLL)
actually lossy.
This patch changes the DCA core to work with integer coefficients
until QMF. At this point the integer coefficients are converted to floats.
The coefficients for the LFE channel (lfe_data) are not touched.
This is the first step for the really lossless XLL decoding.
2015-12-23 11:50:18 +01:00
Alexandra Hájková
85990140e7 dca: Add math helpers.
They will be used by the integer core decoder.
2015-12-23 11:50:08 +01:00
Hendrik Leppkes
b942845bee avcodec/libschroedingerenc: add missing AVClass to private context
Fixes ticket #5104.
2015-12-23 10:22:00 +01:00
Andreas Cadhalpun
b648b246f0 diracdec: add missing check for pixel_range_index
This fixes an out-of-bounds read introduced in commit 0379603.

Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-22 20:23:36 +01:00
Michael Niedermayer
05af8608c2 avcodec/ass: check for av_mallocz() failure
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-22 14:41:27 +01:00
Timo Rothenpieler
d7c2b75681 vaapi: Add VP9 hwaccell support
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2015-12-22 12:54:23 +01:00
Claudio Freire
4720a562c8 AAC encoder: fix possible assertion failure in PNS
Fix possible SF delta violation that would cause an
eventual assertion failure in some corner cases (esp
on very low bitrates) when marking bands for PNS due
to misuse of the sf_delta utilities
2015-12-22 05:26:12 -03:00
Rostislav Pehlivanov
c0f67e1123 aacenc_is: rename variable
'erf' is far from the best name for a variable and is not very
descriptive since the actual variable points to the comparitively best
IS phase. Therefore rename it to 'best'.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-21 17:23:38 +00:00
Ganesh Ajjanagadde
b9c2ebeee4 lavc/libvpxenc: replace round by lrint
Mostly cosmetic here.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-21 08:20:26 -08:00
Janne Grunau
cc29d96d5a arm64: fix inverted register order in transpose_4x4H
Fix related register order issue in ff_h264_idct_add_neon.

Found-by: zjh8890 <243186085@qq.com>
2015-12-21 13:44:20 +01:00
Clément Bœsch
f122ba36cb lavc: add text encoder 2015-12-21 11:14:02 +01:00
Paul B Mahol
484cc66f57 avcodec/indeo2: use init_get_bits8
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-20 21:31:55 +01:00
James Almer
d4c47333e1 x86/hevc_sao: add ff_hevc_sao_edge_filter_{8,16}_{10,12}
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-20 17:01:15 -03:00
James Almer
3ff2beff65 x86/hevc_sao: simplify sao_edge_filter 10/12bit
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-20 16:45:37 -03:00
James Almer
34b2bd03cf x86/hevc_sao: simplify sao_band_filter 10/12bit
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-20 16:42:36 -03:00
Paul B Mahol
367ffa0c15 avcodec/flacenc: use designated initializers for AVClass
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-20 17:47:21 +01:00
Paul B Mahol
db6e337b41 avcodec/s302menc: check if buf_size can actually be put into 16bit size
This disallows creating unplayable audio.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-20 16:05:37 +01:00
Paul B Mahol
577f057355 avcodec/s302menc: set supported channel layouts by codec
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2015-12-20 16:05:37 +01:00
Andreas Cadhalpun
699e68371e rawdec: only exempt BIT0 with need_copy from buffer sanity check
Otherwise the too small buffer is directly used in the frame, causing
segmentation faults, when trying to use the frame.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-20 12:15:56 +01:00
Michael Niedermayer
70f13abb4f avcodec/mpeg4videodec: also for empty partitioned slices
Fixes assertion failure
Fixes: id_acf3e47f864e1ee4c7b86c0653e0ff31e5bde56e.m4v

Found-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-19 23:54:10 +01:00
James Almer
0f520e489a avcodec/Makefile: add missing dep for g723_1 encoder
Signed-off-by: James Almer <jamrial@gmail.com>
2015-12-19 18:50:47 -03:00
Michael Niedermayer
b92b4775a0 avcodec/h264_refs: Fix long_idx check
Fixes out of array read
Fixes mozilla bug 1233606

Found-by: Tyson Smith
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-19 22:17:54 +01:00
Michael Niedermayer
2d2b41d169 avcodec/ffv1enc: Fix 2 pass mode with the default rc table
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-19 20:42:54 +01:00
Ganesh Ajjanagadde
def3c83e1b lavc/aacsbr: sbr_dequant optimization
This uses ff_exp2fi to get a speedup (~ 6x).

sample benchmark (Haswell, GNU/Linux):
old:
  19102 decicycles in sbr_dequant,    1023 runs,      1 skips
  19002 decicycles in sbr_dequant,    2045 runs,      3 skips
  17638 decicycles in sbr_dequant,    4093 runs,      3 skips
  15825 decicycles in sbr_dequant,    8189 runs,      3 skips
  16404 decicycles in sbr_dequant,   16379 runs,      5 skips

new:
   3063 decicycles in sbr_dequant,    1024 runs,      0 skips
   3049 decicycles in sbr_dequant,    2048 runs,      0 skips
   2968 decicycles in sbr_dequant,    4096 runs,      0 skips
   2818 decicycles in sbr_dequant,    8191 runs,      1 skips
   2853 decicycles in sbr_dequant,   16383 runs,      1 skips

Reviewed-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-19 09:32:53 -08:00
Andreas Cadhalpun
9d38f06d05 xwddec: prevent overflow of lsize * avctx->height
This is used to check if the input buffer is large enough, so if this
overflows it can cause a false negative leading to a segmentation fault
in bytestream2_get_bufferu.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-19 14:28:51 +01:00
Janne Grunau
2dba0407fd avcodec/arm64: fix inverted register order in transpose_4x4H
Fix related register order issue in ff_h264_idct_add_neon.

Found-by: zjh8890 <243186085@qq.com>

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-19 03:58:46 +01:00
Michael Niedermayer
1c878474fb avcodec/ffv1enc: unbreak -coder option
This fixes a segfault caused by moving the coder option and changing its semantics

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-18 18:08:00 +01:00
Andreas Cadhalpun
90b99a8107 exr: fix out of bounds read in get_code
This macro unconditionally used out[-1], which causes an out of bounds
read, if out is the very beginning of the buffer.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-18 15:30:04 +01:00
Andreas Cadhalpun
4d5c3b02e9 on2avc: limit number of bits to 30 in get_egolomb
More don't fit into the integer output.

Also use get_bits_long, since get_bits only supports reading up to 25
bits, while get_bits_long supports the full integer range.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-18 15:29:57 +01:00
Rostislav Pehlivanov
4386f17bbd acenc: remove deprecated avctx->frame_bits use
The type of last_frame_pb_count was chosen to be an int since overflow
is impossible (the spec says the maximum bits per frame is 6144 per
channel and the encoder checks for that).

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
2015-12-18 14:28:40 +00:00
Hendrik Leppkes
06d69a2d22 Merge commit '458e53f51fc75d08df884f8e9eb3d7ded23e97b3'
* commit '458e53f51fc75d08df884f8e9eb3d7ded23e97b3':
  mpegvideo_enc: actually add the side data with vbv_delay to the packet

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:53:19 +01:00
Hendrik Leppkes
4a80f0bdb0 Merge commit '81c95eb8eee856d98d4ac37367dbc761f2faf875'
* commit '81c95eb8eee856d98d4ac37367dbc761f2faf875':
  openh264: Directly include the deprecation guards header

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:52:18 +01:00
Hendrik Leppkes
a38b50c3ff Merge commit '34138ece23c8ddae543269212a051c00d49e67d7'
* commit '34138ece23c8ddae543269212a051c00d49e67d7':
  log: Use a do {} while (0) for tlog

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:50:54 +01:00
Hendrik Leppkes
67ebc88fb5 lavc/sunrastenc: fix private codec options
The options were not actually hooked up.
2015-12-18 14:47:19 +01:00
Hendrik Leppkes
ef9ae0e748 Merge commit 'c34df422628e6b7b657faee241fe7bb2629e0f57'
* commit 'c34df422628e6b7b657faee241fe7bb2629e0f57':
  sgienc: Make sure to initialize skipped header portions

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:39:59 +01:00
Hendrik Leppkes
362028cac9 Merge commit '16216b713f9a21865cc07993961cf5d0ece24916'
* commit '16216b713f9a21865cc07993961cf5d0ece24916':
  lavc: Drop exporting 2-pass encoding stats

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:39:15 +01:00
Hendrik Leppkes
2630f7f709 Merge commit 'be00ec832c519427cd92218abac77dafdc1d5487'
* commit 'be00ec832c519427cd92218abac77dafdc1d5487':
  lavc: Deprecate coder_type and its symbols

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-18 14:27:41 +01:00
Michael Niedermayer
c8ea57664f avcodec/h264_mc_template: prefetch list1 only if it is used in the MB
Fixes ubsan warning
Fixes Mozilla bug 1230276

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-18 00:54:37 +01:00
Michael Niedermayer
ef8f6464a5 avcodec/h264_slice: Simplify ref2frm indexing
This also suppresses a ubsan warning
Fixes Mozilla bug 1230247

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-18 00:54:28 +01:00
Ganesh Ajjanagadde
97d2c2d678 lavc/opus_celt: replace pow by exp2
exp2 is faster.

It may be possible to optimize further; e.g the exponents seem to be
multiples of 0.25. This requires study though.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-17 14:06:46 -08:00
Michael Niedermayer
95b59bfb9d Revert "avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H"
The change was not correct and broke H264

This reverts commit cd83f899c9.
2015-12-17 21:26:37 +01:00
Andreas Cadhalpun
9637c2531f sonic: make sure num_taps * channels is not larger than frame_size
If that is the case, the loop setting predictor_state in
sonic_decode_frame causes out of bounds reads of int_samples, which has
only frame_size number of elements.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-17 19:55:09 +01:00
Michael Niedermayer
73840bbe4e avcodec/diracdec: Check ff_set_dimensions() for failure
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-17 19:00:33 +01:00
Michael Niedermayer
ffad6f6b89 avcodec/diracdec: fix aspect ratio (it was lost after efcc8fddd6)
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-17 19:00:18 +01:00
Ganesh Ajjanagadde
07a8fbaa55 lavc/nellymoserenc: avoid wasteful pow
exp2 suffices here. Some trivial speedup is done in addition here by
reusing results.

This retains accuracy, and in particular results in identical values
with GNU libm + gcc/clang.

sample benchmark (Haswell, GNU/Linux):
proposed : 424160 decicycles in pow_table,     512 runs,      0 skips
exp2 only: 1262093 decicycles in pow_table,     512 runs,      0 skips
old      : 2849085 decicycles in pow_table,     512 runs,      0 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-17 08:12:41 -08:00
Hendrik Leppkes
74b8fa103d Merge commit '68e547ae8b455e5e2b60839f35c359d77a6d94bc'
* commit '68e547ae8b455e5e2b60839f35c359d77a6d94bc':
  avpacket: use ERANGE instead of EOVERFLOW

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 14:52:28 +01:00
Hendrik Leppkes
efcc8fddd6 Merge commit 'e02de9df4b218bd6e1e927b67fd4075741545688'
* commit 'e02de9df4b218bd6e1e927b67fd4075741545688':
  lavc: export Dirac parsing API used by the ogg demuxer as public

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 14:48:46 +01:00
Hendrik Leppkes
b2d8b91cf0 Merge commit '825900248b4053515803152d3165efdb034b660b'
* commit '825900248b4053515803152d3165efdb034b660b':
  qsvenc: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:42:53 +01:00
Hendrik Leppkes
5fc17edc7d Merge commit '1520c6ff05d835da4b793318fc88bbbc129c86a1'
* commit '1520c6ff05d835da4b793318fc88bbbc129c86a1':
  nvenc: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:41:29 +01:00
Hendrik Leppkes
31ae2308b3 Merge commit '2507b5dd674834be7261772996f47ae3b95cca69'
* commit '2507b5dd674834be7261772996f47ae3b95cca69':
  mpegvideo_enc: export vbv_delay in side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:36:09 +01:00
Hendrik Leppkes
b799619f48 Merge commit '3f5c99fcbb2c366d7bdef8500c19f43a33bdb6b9'
* commit '3f5c99fcbb2c366d7bdef8500c19f43a33bdb6b9':
  mpegvideo_enc: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:35:34 +01:00
Hendrik Leppkes
b77061b5ca Merge commit '732a37d1466d45b3812509d68c82e783530e291a'
* commit '732a37d1466d45b3812509d68c82e783530e291a':
  libx264: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:34:51 +01:00
Hendrik Leppkes
d6322710c5 Merge commit '03afb62e83516141ba999536fc97575faefb98af'
* commit '03afb62e83516141ba999536fc97575faefb98af':
  libvpxenc: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:33:20 +01:00
Hendrik Leppkes
f49264a1c5 Merge commit '11c9bd633f635f07a762be1ecd672de55daf4edc'
* commit '11c9bd633f635f07a762be1ecd672de55daf4edc':
  libopenh264enc: export CPB props side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:31:21 +01:00
Hendrik Leppkes
dd6ee019ea Merge commit 'f0b769c16daafa64720dcba7fa81a9f5255e1d29'
* commit 'f0b769c16daafa64720dcba7fa81a9f5255e1d29':
  lavc: add a packet side data type for VBV-like parameters

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:25:52 +01:00
Hendrik Leppkes
a7d5b9f1c3 Merge commit '84adab333cddeefc3cfd843089dee23f58bd372c'
* commit '84adab333cddeefc3cfd843089dee23f58bd372c':
  lavc: add stream-global packet side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:18:18 +01:00
Hendrik Leppkes
30833d121e Merge commit '31c51f7441de07b88cfea2550245bf1f5140cb8f'
* commit '31c51f7441de07b88cfea2550245bf1f5140cb8f':
  avpacket: add a function for wrapping existing data as side data

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:12:07 +01:00
Hendrik Leppkes
10e55bd658 Merge commit 'b09ad37c83841c399abb7f2503a2ab214d0c2d48'
* commit 'b09ad37c83841c399abb7f2503a2ab214d0c2d48':
  h264: derive the delay from the level when it's not present

Merged without changing the strict_std_compliance check, as it breaks FATE
and changes decoding behavior.

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 13:07:08 +01:00
Hendrik Leppkes
c6f1f334cb Merge commit '792b9c9dfcf44b657d7854368d975b5ca3bc22ca'
* commit '792b9c9dfcf44b657d7854368d975b5ca3bc22ca':
  h264: set frame_num in start_frame(), not decode_slice_header()

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 12:59:25 +01:00
Hendrik Leppkes
bc66451e5e Merge commit '741b494fa8cd28a7d096349bac183893c236e3f9'
* commit '741b494fa8cd28a7d096349bac183893c236e3f9':
  h264: eliminate default_ref_list

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-17 12:45:28 +01:00
Kieran Kunhya
25f6ccccd6 diracdec: Fix codeblock parameters reading 2015-12-16 23:26:03 +00:00
Kieran Kunhya
a349a10edf diracdec: Add support for HQ profile 2015-12-16 21:35:12 +00:00
Kieran Kunhya
0379603632 diracdec: Add 10-bits to pix_fmt table 2015-12-16 21:35:12 +00:00
Andreas Cadhalpun
5ea59b1f42 exr: fix out of bounds read in get_code
This macro unconditionally used out[-1], which causes an out of bounds
read, if out is the very beginning of the buffer.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-12-16 22:22:06 +01:00
Andreas Cadhalpun
17776638c3 opus: Fix typo causing overflow in silk_stabilize_lsf
Due to this typo max_center can be too large, causing nlsf to be set to
too large values, which in turn can cause nlsf[i - 1] + min_delta[i] to
overflow to a negative value, which is not allowed for nlsf and can
cause an out of bounds read in silk_lsf2lpc.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-12-16 22:19:58 +01:00
Andreas Cadhalpun
f61d44b74a opus_silk: fix typo causing overflow in silk_stabilize_lsf
Due to this typo max_center can be too large, causing nlsf to be set to
too large values, which in turn can cause nlsf[i - 1] + min_delta[i] to
overflow to a negative value, which is not allowed for nlsf and can
cause an out of bounds read in silk_lsf2lpc.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-16 19:29:17 +01:00
Ganesh Ajjanagadde
83a04f103d lavc: move exp2fi to ff_exp2fi in internal.h
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-16 07:57:26 -05:00
Stefano Sabatini
6e891d51f4 lavc/libopenh264: apply minor options text consistency fixes 2015-12-16 10:48:28 +01:00
Ganesh Ajjanagadde
65877ab935 lavc: typo fix uncliped -> unclipped
Untested due to lack of ppc.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-15 22:45:15 -05:00
Matthieu Bouron
ae1c750cb4 lavc/utils: use AVPixFmtDescriptor to probe palette formats
Also use the input frame format instead of the AVCodecContext one according
to the documentation of AVCodecContext.get_buffer2().
2015-12-15 10:35:47 +01:00
Andreas Cadhalpun
22e960ad47 golomb: always check for invalid UE golomb codes in get_ue_golomb
Also correct the check to reject log < 7, because UPDATE_CACHE only
guarantees 25 meaningful bits.

This fixes undefined behavior:
runtime error: shift exponent is negative

Testing with START/STOP timers in get_ue_golomb, one for the first
branch (A) and one for the second (B), shows that there is practically no
slowdown, e.g. for the cavs decoder:

With the check in the B branch:
    629 decicycles in get_ue_golomb B, 4194260 runs,     44 skips
    433 decicycles in get_ue_golomb A,268434102 runs,   1354 skips

Without the check:
    624 decicycles in get_ue_golomb B, 4194273 runs,     31 skips
    433 decicycles in get_ue_golomb A,268434203 runs,   1253 skips

Since the B branch is executed far less often than the A branch, this
change is negligible, even more so for the h264 decoder, where the ratio
B/A is a lot smaller.

Fixes: mozilla bug 1230239
Fixes: fbeb8b2c7c996e9b91c6b1af319d7ebc/asan_heap-oob_195450f_2743_e8856ece4579ea486670be2b236099a0.bit

Found-by: Tyson Smith
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-14 20:51:39 +01:00
Rostislav Pehlivanov
ade31b9424 aacenc: switch to using the RNG from libavutil
PSNR doesn't change as expected. The AAC spec doesn't really say
anything about how exactly to generate noise.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-14 18:53:09 +00:00
Janne Grunau
90b1b9350c arm: add ff_int32_to_float_fmul_array8_neon
Quite a bit faster than int32_to_float_fmul_array8_c calling
ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
Number of cycles per int32_to_float_fmul_array8 call while decoding
padded.dts on exynos5422:

               before  after   change
cortex-a7:     1270     951    -25%
cortex-a15:     434     285    -34%

checkasm --bench cycle counts:     cortex-a15   cortex-a7
int32_to_float_fmul_array8_c:      1730.4       4384.5
int32_to_float_fmul_array8_neon_c:  571.5       1694.3
int32_to_float_fmul_array8_neon:    374.0       1448.8

Interesting are the differences between
int32_to_float_fmul_array8_neon_c and int32_to_float_fmul_array8_neon.
The former is current behaviour of calling
ff_int32_to_float_fmul_scalar_neon repeatedly from the c function,
The raw numbers differ since checkasm uses different lengths than the
dca decoder.
2015-12-14 16:45:02 +01:00
Janne Grunau
a0fc780a20 arm64: int32_to_float_fmul neon asm
3% faster dts decoding on a cortex-a57.

                                 cortex-a57   cortex-a53
int32_to_float_fmul_array8_c:    1270.9       4475.6
int32_to_float_fmul_array8_neon:  328.6        569.2
int32_to_float_fmul_scalar_c:     928.5       4119.6
int32_to_float_fmul_scalar_neon:  309.1        524.1
2015-12-14 16:45:02 +01:00
Janne Grunau
705f5e5e15 arm64: port synth_filter_float_neon from arm
~25% faster dts decoding overall. The checkasm CPU cycles numbers are
not that useful since synth_filter_float() calls FFTContext.imdct_half().

                         cortex-a57   cortex-a53
synth_filter_float_c:    1866.2       3490.9
synth_filter_float_neon:  915.0       1531.5

With fftc.imdct_half forced to imdct_half_neon:
                         cortex-a57   cortex-a53
synth_filter_float_c:    1718.4       3025.3
synth_filter_float_neon:  926.2       1530.1
2015-12-14 16:45:01 +01:00
Janne Grunau
c33c1fa8af arm64: convert dcadsp neon asm from arm
~2% faster dts decoding overall.

                    cortex-a57   cortex-a53
dca_decode_hf_c:    474.8        1659.9
dca_decode_hf_neon: 225.2         301.1
dca_lfe_fir0_c:     913.2        1537.7
dca_lfe_fir0_neon:  286.8         451.9
dca_lfe_fir1_c:     848.7        1711.5
dca_lfe_fir1_neon:  387.1         506.4
2015-12-14 16:45:01 +01:00
Janne Grunau
e2710e790c arm: add a cpu flag for the VFPv2 vector mode
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.

Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
2015-12-14 16:42:35 +01:00
Janne Grunau
5dfe4edad6 x86_64: int32_to_float_fmul_scalar sign extend integer length 2015-12-14 16:42:35 +01:00
Agatha Hu
758be45756 avcodec/nvenc: clamp initial qp value to [1, 51]
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2015-12-14 10:34:59 +01:00
Agatha Hu
f1a8897375 avcodec/nvenc: set slice number to 1 to improve encoding quality
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2015-12-14 10:27:36 +01:00
Kieran Kunhya
906c0b7716 get_bits: Support max_depth > 2 in GET_RL_VLC_INTERNAL 2015-12-13 22:56:49 +00:00
Anton Khirnov
de9e199a03 lavc: make avpriv_mpa_decode_header private on next bump
It's not used by anything outside of lavc anymore.
2015-12-12 21:26:29 +01:00
Anton Khirnov
955aec3c7c mpegaudiodecheader: check the header in avpriv_mpegaudio_decode_header
Almost all the places from which this function is called already check
the header manually and in the two that don't (the mp3 muxer) the check
should not cause any problems.
2015-12-12 21:25:42 +01:00
Anton Khirnov
cea1eef25c lavc: get the profile name through the codec descriptor in avcodec_string() 2015-12-12 21:24:29 +01:00
Anton Khirnov
2c6811397b lavc: add profiles to AVCodecDescriptor
The profiles are a property of the codec, so it makes sense to export
them through AVCodecDescriptors, not just the codec implementations.
2015-12-12 21:22:49 +01:00
Anton Khirnov
cdc9ce098e lavc: print the name of the codec, not its implementation, in avcodec_string 2015-12-12 21:21:54 +01:00
Anton Khirnov
458e53f51f mpegvideo_enc: actually add the side data with vbv_delay to the packet
Fixes 2507b5dd67
2015-12-12 21:16:41 +01:00
Michael Niedermayer
625b582d5a avcodec/aacsbr_template: Add Check to read_sbr_envelope()
The limit is a conservative guess, the spec does not seem to specify a limit

Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-12 19:05:07 +01:00
zjh8890
c18176bd55 avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H
The transpose_4x4H is wrong which cost me much time to find this bug. The orders of r2 and r3 are wrong,
this bug waste me much time while I make aarch64 arm instruction which used the function.
2015-12-12 14:20:01 +01:00
Michael Niedermayer
b78885a3c5 avcodec/aacsbr: Split the env_facs table
This also removes a #ifdef and special case for the fixed point case

Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-12 12:19:07 +01:00
Ganesh Ajjanagadde
b4f1636a4d lavc: typo fix cliping -> clipping, saftey -> safety
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-11 19:10:00 -05:00
Ganesh Ajjanagadde
b8e5b1d786 lavc/mdct_template: use lrint instead of floor hack
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-11 10:35:15 -05:00
Ganesh Ajjanagadde
df679f1264 lavc/dcaenc: avoid wasteful cos calls
cos has symmetry; use this.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-11 10:22:09 -05:00
Ganesh Ajjanagadde
a0ddebfedf lavc/nellymoserdec: replace pow by exp2
exp2 suffices here.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-11 10:21:47 -05:00
Dave Yeo
b0b133b8c0 hevcdsp: use a macro for .rodata section
fixes assembling on OS/2

Signed-off-by: Dave Yeo <dave.r.yeo@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2015-12-11 16:19:30 +01:00
Andreas Cadhalpun
fdc94db37e sbr_qmf_analysis: sanitize input for 32-bit imdct
If the input contains too many too large values, the imdct can overflow.
Even if it didn't, the output would be larger than the valid range of 29
bits.

Note that this is a very delicate limit: Allowing values up to 1<<25
does not prevent input larger than 1<<29 from arriving at
sbr_sum_square, while limiting values to 1<<23 breaks the
fate-aac-fixed-al_sbr_hq_cm_48_5.1 test.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-11 00:04:04 +01:00
Andreas Cadhalpun
a9c20e922c sbrdsp_fixed: assert that input values are in the valid range
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-11 00:04:04 +01:00
Andreas Cadhalpun
ff8816f717 aacsbr: ensure strictly monotone time borders
This fixes a division by zero in the aac_fixed decoder.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-11 00:04:04 +01:00
Rostislav Pehlivanov
d8f13e783a diracdec: remove duplicate codeblock decoding
Broken by commit 7424a6d0a5

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-10 22:50:58 +00:00
Kieran Kunhya
3652dd5d0c diracdec: Fix FPE on invalid low_delay data 2015-12-10 22:14:03 +00:00
Kieran Kunhya
cdf8c9038d diracdec: Replace dirac parse codes with better ones 2015-12-10 21:47:01 +00:00
Kieran Kunhya
7424a6d0a5 diracdec: Read picture types by using parse_code 2015-12-10 21:42:13 +00:00
Kieran Kunhya
8880ca2307 diracdec: Store version major/minor flags 2015-12-10 21:39:06 +00:00
Kieran Kunhya
8eb6acef92 diracdec: Support new extended quantiser range 2015-12-10 21:37:24 +00:00
Kieran Kunhya
8dcc99dc68 diracdec: Extract version parameters 2015-12-10 21:26:35 +00:00
Kieran Kunhya
9f374c5906 diracdec: Make slice parameters common between lowdelay and future hq profile 2015-12-10 21:04:04 +00:00
Kieran Kunhya
3bb6ce1af9 diracdec: Rename lowdelay_subband to decode_subband because it is shared with HQ profile 2015-12-10 19:11:21 +00:00
Kieran Kunhya
3f07f12f65 diracdec: Template DSP functions adding 10-bit versions 2015-12-10 18:25:02 +00:00
Kieran Kunhya
9553689854 diracdec: Move strides to bytes, and pointer types to uint8_t.
Start templating functions for move to support 10-bit
Parts of this patch were written by Rostislav Pehlivanov
2015-12-10 16:52:48 +00:00
Claudio Freire
124c375938 AAC encoder: fix OOB access in search_for_pns
Fix OOB access in search_for_pns which was using
w2 outside the window group loop, and fix a typo
in which it was checking sf_idx instead of band_type

Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-09 22:29:18 +01:00
Ganesh Ajjanagadde
cb93df0dcb avcodec/aacsbr_tablegen: always initialize tables at runtime
This gets rid of virtually useless hardcoded tables hackery. The reason
it is useless is that a 320 element lut is anyway placed regardless of
--enable-hardcoded-tables, from which all necessary tables are trivially
derived at runtime at very low cost:

sample benchmark (x86-64, Haswell, GNU/Linux, single run is really
what is relevant here since looping drastically changes the bench). Fluctuations
are on the order of 10% for the single run test:
39400 decicycles in aacsbr_tableinit,       1 runs,      0 skips
25325 decicycles in aacsbr_tableinit,       2 runs,      0 skips
18475 decicycles in aacsbr_tableinit,       4 runs,      0 skips
15008 decicycles in aacsbr_tableinit,       8 runs,      0 skips
13016 decicycles in aacsbr_tableinit,      16 runs,      0 skips
12005 decicycles in aacsbr_tableinit,      32 runs,      0 skips
11546 decicycles in aacsbr_tableinit,      64 runs,      0 skips
11506 decicycles in aacsbr_tableinit,     128 runs,      0 skips
11500 decicycles in aacsbr_tableinit,     256 runs,      0 skips
11183 decicycles in aacsbr_tableinit,     509 runs,      3 skips

Tested with FATE with/without --enable-hardcoded-tables.

Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-09 07:36:58 -05:00
Ganesh Ajjanagadde
42868ca569 avcodec/jpeg2000: replace naive pow call with smarter exp2fi
pow is a very wasteful function for this purpose. A low hanging fruit
would be simply to replace with exp2f, and that does yield some speedup.
However, there are 2 drawbacks of this:
1. It does not exploit the integer nature of the argument.
2. (minor) Some platforms lack a proper exp2f routine, making benefits available
only to non broken libm.
3. exp2f does not solve the same issue that plagues pow, namely terrible
worst case performance. This is a fundamental issue known as the
"table-maker's dilemma" recognized by Prof. Kahan himself and
subsequently elaborated and researched by many others. All this is clear from benchmarks below.

This exploits the IEEE-754 format to get very good performance even in
the worst case for integer powers of 2. This solves all the issues noted
above. Function tested with clang usan over [-1000, 1000] (beyond range of
relevance for this, which is [-255, 255]), patch itself with FATE.

Benchmarks obtained on x86-64, Haswell, GNU-Linux via 10^5 iterations of
the pow call, START/STOP, and command ffplay ~/samples/jpeg2000/chiens_dcinema2K.mxf.
Low number of runs also given to prove the point about worst case:

pow:
 216270 decicycles in pow,       1 runs,      0 skips
 110175 decicycles in pow,       2 runs,      0 skips
  56085 decicycles in pow,       4 runs,      0 skips
  29013 decicycles in pow,       8 runs,      0 skips
  15472 decicycles in pow,      16 runs,      0 skips
   8689 decicycles in pow,      32 runs,      0 skips
   5295 decicycles in pow,      64 runs,      0 skips
   3599 decicycles in pow,     128 runs,      0 skips
   2748 decicycles in pow,     256 runs,      0 skips
   2304 decicycles in pow,     511 runs,      1 skips
   2072 decicycles in pow,    1022 runs,      2 skips
   1963 decicycles in pow,    2044 runs,      4 skips
   1894 decicycles in pow,    4091 runs,      5 skips
   1860 decicycles in pow,    8184 runs,      8 skips

exp2f:
 134140 decicycles in pow,       1 runs,      0 skips
  68110 decicycles in pow,       2 runs,      0 skips
  34530 decicycles in pow,       4 runs,      0 skips
  17677 decicycles in pow,       8 runs,      0 skips
   9175 decicycles in pow,      16 runs,      0 skips
   4931 decicycles in pow,      32 runs,      0 skips
   2808 decicycles in pow,      64 runs,      0 skips
   1747 decicycles in pow,     128 runs,      0 skips
   1208 decicycles in pow,     256 runs,      0 skips
    952 decicycles in pow,     512 runs,      0 skips
    822 decicycles in pow,    1024 runs,      0 skips
    765 decicycles in pow,    2047 runs,      1 skips
    722 decicycles in pow,    4094 runs,      2 skips
    693 decicycles in pow,    8190 runs,      2 skips

exp2fi:
   2740 decicycles in pow,       1 runs,      0 skips
   1530 decicycles in pow,       2 runs,      0 skips
    955 decicycles in pow,       4 runs,      0 skips
    622 decicycles in pow,       8 runs,      0 skips
    477 decicycles in pow,      16 runs,      0 skips
    368 decicycles in pow,      32 runs,      0 skips
    317 decicycles in pow,      64 runs,      0 skips
    291 decicycles in pow,     128 runs,      0 skips
    277 decicycles in pow,     256 runs,      0 skips
    268 decicycles in pow,     512 runs,      0 skips
    265 decicycles in pow,    1024 runs,      0 skips
    263 decicycles in pow,    2048 runs,      0 skips
    263 decicycles in pow,    4095 runs,      1 skips
    260 decicycles in pow,    8191 runs,      1 skips

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-08 22:00:05 -05:00
Andreas Cadhalpun
5b0da6999f aacenc: update max_sfb when num_swb changes
This fixes out-of-bounds reads in avoid_clipping.

Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-08 22:53:09 +01:00
Luca Barbato
81c95eb8ee openh264: Directly include the deprecation guards header
Make easier to avoid compile failure when reworking the internal
headers.
2015-12-08 18:12:33 +01:00
Sebastian Dröge
9aebea0a4d avcodec/h264: Set CORRUPT flag on output frames that are not fully recovered
In the merge commit 78265fcfee this behaviour
was broken and the CORRUPT flag would never ever be set on a frame. However
the flag on the AVCodecContext was taken into account properly, including
AV_CODEC_FLAG2_SHOW_ALL.

The reason for this was that the recovered field of the next output picture
was always set to TRUE whenever one of the two AVCodecContext flags was set,
which made it impossible to detect later, before outputting, if the frame was
really recovered or not. Now don't set it to TRUE unless the frame is really
recovered and check the AVCodecContext flags right before outputting.

Signed-off-by: Sebastian Dröge <sebastian@centricular.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-08 18:09:27 +01:00
Rostislav Pehlivanov
4c5136a48b aacenc_ltp: disable LTP with high lambda values
Makes no sense to enable for high bitrates, the coder does well enough.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-08 13:31:55 +00:00
Rostislav Pehlivanov
6e5dbe7267 aacenc_tns: use 4 bits for short windows
With only 7 coefficients per short window at most the extra precision
makes a difference and seems to reduce crackling and stddev even
further.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-08 13:31:50 +00:00
Luca Barbato
34138ece23 log: Use a do {} while (0) for tlog
Avoid the warning `-Wempty-body`.
2015-12-08 11:26:21 +01:00
Hendrik Leppkes
c9dbff60c6 Merge commit '7d36474d1908d6267d4e11d4d9909f9604bd0c81'
* commit '7d36474d1908d6267d4e11d4d9909f9604bd0c81':
  imgconvert: Re-enable the deprecation warnings

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-08 10:03:51 +01:00
Hendrik Leppkes
68cb0dc8a3 Merge commit 'f7edcac040f73635fc1127489c9bb29ca8b43532'
* commit 'f7edcac040f73635fc1127489c9bb29ca8b43532':
  avpicture: Suppress warning from deprecated code

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-08 10:03:08 +01:00
Hendrik Leppkes
92186f2d10 Merge commit 'b805482b1fba1d82fbe47023a24c9261f18979b6'
* commit 'b805482b1fba1d82fbe47023a24c9261f18979b6':
  aac: Provide more information on the failure message

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-08 09:59:45 +01:00
Hendrik Leppkes
3f9c64831c Merge commit 'c5eb279e240af6b7228a624cd7193732f2d5adaa'
* commit 'c5eb279e240af6b7228a624cd7193732f2d5adaa':
  g723: Add missing header

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-08 09:50:59 +01:00
foo86
ff6dd5851b avcodec/libdcadec: honor -err_detect option
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-08 03:37:04 +01:00
foo86
de12aa51b6 avcodec/libdcadec: add some useful options
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-08 03:37:04 +01:00
foo86
704b278361 avcodec/libdcadec: implement logging callback
Don't print a warning when dcadec_context_filter() returns positive
warning code. Most relevant warnings are now output through the callback
function.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-08 03:37:04 +01:00
foo86
704654ea17 avcodec/libdcadec: fix request_channel_layout
Take request_channel_layout as a hint and don't force 2.0 downmix by
using both the 2CH and 6CH flags together.

Remove warnings about missing coefficients because they are no longer
relevant.

Honor AV_CH_LAYOUT_NATIVE and make it possible for native DTS channel
layout to be output.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-08 03:37:04 +01:00
Vittorio Giovara
c34df42262 sgienc: Make sure to initialize skipped header portions
Fix fate tests with asan. Introduced during bytestream2 porting
(in revision 62cc8f4d79).

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-12-07 11:27:42 -05:00
Vittorio Giovara
16216b713f lavc: Drop exporting 2-pass encoding stats
These variables are coming from mpegvideoenc where are supposedly used
as bit counters on various frame properties. However their use is
unclear as they lack documentation, are available only from a very small
subset of encoders, and they are hardly used in the wild. Also frame_bits
in aacenc is employed in a similar way.

Remove this functionality from AVCodecContex, these variable are mostly
frame properties, and too few encoders support setting them with anything
useful.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-12-07 11:27:42 -05:00
Clément Bœsch
a8bb81a05c lavc, lavu: use avutil/thread.h instead of redundant conditional includes 2015-12-07 17:25:51 +01:00
Vittorio Giovara
be00ec832c lavc: Deprecate coder_type and its symbols
Most option values are simply unused or ignored and in practice the
majory of codecs only need to check whether to enable rle or not.

Add appropriate codec private options which better expose the allowed
features.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2015-12-07 11:01:22 -05:00
Hendrik Leppkes
312c83e057 avcodec/g723_1: fix license header 2015-12-07 16:10:51 +01:00
Hendrik Leppkes
90c93fb129 Merge commit 'f023d57d355ff3b917f1aad9b03db5c293ec4244'
* commit 'f023d57d355ff3b917f1aad9b03db5c293ec4244':
  lavc: G.723.1 encoder

Split existing FFmpeg G.723.1 encoder into a new file.

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-07 15:50:45 +01:00
Hendrik Leppkes
6c9cc21bcc Merge commit '165cc6fb9defcd79fd71c08167f3e8df26b058ff'
* commit '165cc6fb9defcd79fd71c08167f3e8df26b058ff':
  g723_1: Move sharable functions to a separate file

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-07 15:26:41 +01:00
Hendrik Leppkes
9cf74191ed Merge commit 'aac996cc01042194bf621d845bbe684549b5882e'
* commit 'aac996cc01042194bf621d845bbe684549b5882e':
  g723_1: Rename files to better reflect their purpose

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-07 15:04:13 +01:00
Hendrik Leppkes
2730a2013d Merge commit 'b74b88f30da2389f333a31815d8326d5576d3331'
* commit 'b74b88f30da2389f333a31815d8326d5576d3331':
  g723_1: Handle values at the ends of the table in lsp2lpc()

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2015-12-07 14:58:38 +01:00
Anton Khirnov
68e547ae8b avpacket: use ERANGE instead of EOVERFLOW
EOVERFLOW seems to be unavailable on certain platforms.
2015-12-07 11:42:26 +01:00
Anton Khirnov
f1ccd07680 h264: do not call frame_start() for missing frames
We do not need to do a full setup like for a real frame, just allocate a
buffer and set cur_pic(_ptr).
2015-12-07 11:42:26 +01:00
Anton Khirnov
d6dc5d15af aacdec: fix aac_static_table_init() prototype 2015-12-07 11:42:26 +01:00
Hendrik Leppkes
1e6cf7272f avcodec: implement vp9 dxva2 hwaccel 2015-12-07 09:38:59 +01:00
Hendrik Leppkes
585083dd1f vp9: add hwaccel hooks 2015-12-07 09:25:02 +01:00
Hendrik Leppkes
cd1b7e2bd7 vp9: fix pixel format changes with threading 2015-12-07 09:23:18 +01:00
Andreas Cadhalpun
5adb5d9d89 mjpegdec: consider chroma subsampling in size check
If the chroma components are subsampled, smaller buffers are allocated
for them. In that case the maximal block_offset for the chroma
components is not as large as for the luma component.

This fixes out of bounds writes causing segmentation faults or memory
corruption.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2015-12-06 22:40:41 +01:00
Rostislav Pehlivanov
d55f83de4d aacenc_tns: tune and reduce artifacts
There are a couple of major changes here:

1. Start using TNS coefficient compression.
2. Start using 3 bits per coefficient maximum for short windows.
The bits we save from these 2 changes seem to make a nice impact on the
rest of the file/windows.

3. Remove special case gain checking for short windows.
4. Modify the coefficient loop to support up to 3 windows.
The additional restrictions on TNS were something that was no in the
specifications and furthermore restricting TNS to only low energy short
windows was done to compensate for bugs elsewhere in the code.

Overall, the improvements here reduce crackling artifacts heard in very
noisy tracks.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-06 20:16:48 +00:00
Rostislav Pehlivanov
b32e989e6c aacenc: move the TNS search and filtering before PNS
The original plan was to have TNS use data from the PNS search to better
tune itself to noise but this was never used nor necessary. This should
slightly boost the PNS accuracy if TNS was used.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-06 20:16:48 +00:00
Ganesh Ajjanagadde
14886bebfe avcodec/dvdsubdec: fix typo in dlog message
Likely accidental in 764900d645.

Fixes: CID 1341578.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-06 08:11:47 -05:00
Anton Khirnov
e02de9df4b lavc: export Dirac parsing API used by the ogg demuxer as public
Also, stop using AVCodecContext for storing the stream parameters.
2015-12-06 10:28:04 +01:00
Anton Khirnov
825900248b qsvenc: export CPB props side data 2015-12-06 10:25:49 +01:00
Anton Khirnov
1520c6ff05 nvenc: export CPB props side data 2015-12-06 10:25:43 +01:00
Anton Khirnov
2507b5dd67 mpegvideo_enc: export vbv_delay in side data
Deprecate AVCodecContext.vbv_delay
2015-12-06 10:25:23 +01:00
Anton Khirnov
3f5c99fcbb mpegvideo_enc: export CPB props side data 2015-12-06 10:25:08 +01:00
Anton Khirnov
732a37d146 libx264: export CPB props side data 2015-12-06 10:25:00 +01:00
Anton Khirnov
03afb62e83 libvpxenc: export CPB props side data 2015-12-06 10:24:47 +01:00
Anton Khirnov
11c9bd633f libopenh264enc: export CPB props side data 2015-12-06 10:24:21 +01:00
Anton Khirnov
f0b769c16d lavc: add a packet side data type for VBV-like parameters 2015-12-06 10:23:45 +01:00
Anton Khirnov
84adab333c lavc: add stream-global packet side data
This is similar to what is done for AVStream.
2015-12-06 10:22:43 +01:00
Anton Khirnov
31c51f7441 avpacket: add a function for wrapping existing data as side data 2015-12-06 10:22:18 +01:00
Anton Khirnov
b09ad37c83 h264: derive the delay from the level when it's not present
Fall back to maximum DPB size if the level is unknown.

This should be more spec-compliant and does not depend on the caller
setting has_b_frames before opening the decoder.

The old behaviour, when the delay is supplied by the caller setting
has_b_frames, can still be obtained by setting strict_std_compliance
below normal.
2015-12-06 09:43:52 +01:00
Anton Khirnov
792b9c9dfc h264: set frame_num in start_frame(), not decode_slice_header()
That is a more appropriate place for it, since it is not allowed to
change between slices.
2015-12-06 09:43:45 +01:00
Anton Khirnov
741b494fa8 h264: eliminate default_ref_list
According to the spec, the reference list for a slice should be
constructed by first generating an initial (what we now call "default")
reference list and then optionally applying modifications to it.

Our code has an optimization where the initial reference list is
constructed for the first inter slice and then rebuilt for other slices
if needed. This, however, adds complexity to the code, requires an extra
2.5kB array in the codec context and there is no reason to think that it
has any positive effect on performance. Therefore, simplify the code by
generating the reference list from scratch for each slice.
2015-12-06 09:42:39 +01:00
Michael Niedermayer
2140858524 avcodec/hevc: Fix integer overflow of entry_point_offset
Fixes out of array read
Fixes: d41d8cd98f00b204e9800998ecf8427e/signal_sigsegv_321165b_7641_077dfcd8cbc80b1c0b470c8554cd6ffb.bit

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-05 22:34:30 +01:00
Anton Khirnov
e7078e842d hevcdsp: add x86 SIMD for MC 2015-12-05 21:11:52 +01:00
Anton Khirnov
a853388d2f hevc: change the stride of the MC buffer to be in bytes instead of elements
Currently, the frame stride is passed in bytes, while the MC buffer size
is in int16_t elements, This can be confusing, so pass both strides in
bytes.
2015-12-05 21:11:12 +01:00
Anton Khirnov
688417399c hevcdsp: split the pred functions by width
This should allow for more efficient SIMD.
2015-12-05 21:10:41 +01:00
Anton Khirnov
818bfe7f0a hevcdsp: split the epel functions by width
This should allow for more efficient SIMD.
2015-12-05 21:09:57 +01:00
Anton Khirnov
1f821750f0 hevcdsp: split the qpel functions by width instead of by the subpixel fraction
This should allow for more efficient SIMD.

Keep the C versions as they are now, to allow the compiler to inline the
interpolation coefficients.
2015-12-05 21:08:04 +01:00
Rostislav Pehlivanov
dcbe8d8abc aacenc_ltp: use an AR filter for LTP encoding as well
Seems to work better. Information on why the decoder does this is
lacking.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-05 19:06:39 +00:00
Rostislav Pehlivanov
3112501daf aacenc: fix aac_pred option triggering an error
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2015-12-05 18:43:17 +00:00
Michael Niedermayer
a08681f1e6 avcodec/dirac_parser: Check that there is a previous PU before accessing it
Fixes out of array read
Fixes: 99d142c47e6ba3510a74b872a1a2ae72/asan_heap-oob_11b36f4_3811_0f5c69e7609a88a580135678de1df844.dxa

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-05 17:42:45 +01:00
Michael Niedermayer
c7d6ec947c avcodec/dirac_parser: Add basic validity checks for next_pu_offset and prev_pu_offset
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-05 17:42:45 +01:00