Tables are bit identical.
Sample benchmark (Haswell, GNU/Linux+gcc):
old:
814485 decicycles in dsd_ctables_tableinit, 512 runs, 0 skips
new:
356808 decicycles in dsd_ctable_tableinit, 512 runs, 0 skips
Binary size should essentially be identical, and is in fact identical on
the configuration I tested on.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Intel's Instruction Set Reference (as of September 2015) clearly states
that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the
source is a memory location. The Instruction Set Reference from 1999
(Order Number 243191) describes this behaviour but all later versions
I've seen have make no distinction whether MMX registers or memory is
used as source.
The documentation for the matching SSE2 instruction to convert to double
(cvtpi2pd) was fixed (see the valgrind bug
https://bugs.kde.org/show_bug.cgi?id=210264).
It will take time to get a clarification and fixes in place. In the
meantime it makes sense to change ff_int32_to_float_fmul_scalar_sse to
be correct according to the documentation. The vast majority of users
will have SSE2 so a change to the SSE version has little effect.
Fixes fate-checkasm on x86 valgrind targets.
Valgrind 'bug' reported as https://bugs.kde.org/show_bug.cgi?id=357059
get_ue_golomb() cannot decode values larger than 8190 (the maximum
value that can be golomb encoded in 25 bits) and produces the error
"Invalid UE golomb code" if a larger value is encountered. Use
get_ue_golomb_long() instead (which supports 63 bits, up to 4294967294)
when valid h264/hevc values can exceed 8190.
This updates decoding of the following values: (maximum)
first_mb_in_slice 36863* for level 5.2
abs_diff_pic_num_minus1 131071
difference_of_pic_nums_minus1 131071
idr_pic_id 65535
recovery_frame_cnt 65535
frame_packing_arrangement_id 4294967294
frame_packing_arrangement_repetition_period 16384
display_orientation_repetition_period 16384
An alternative would be to modify get_ue_golomb() to handle encoded
values of up to 49 bits as was done for get_se_golomb() in a92816c.
In that case get_ue_golomb() could continue to be used for all of
these except frame_packing_arrangement_id.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Should fix the regression, and also speeds up table generation.
Tables tested on GNU/Linux+clang: they are identical to the ones prior
to 5495c7f. ff_exp10 caused one slight change in one entry, 50000 became
50001 due to somewhat incorrect rounding.
Untested on ICC; passes FATE on GNU/Linux+gcc.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
The DCA core decoder converts integer coefficients read from the
bitstream to floats just after reading them (along with dequantization).
All the other steps of the audio reconstruction are done with floats
which makes the output for the DTS lossless extension (XLL)
actually lossy.
This patch changes the DCA core to work with integer coefficients
until QMF. At this point the integer coefficients are converted to floats.
The coefficients for the LFE channel (lfe_data) are not touched.
This is the first step for the really lossless XLL decoding.
This fixes an out-of-bounds read introduced in commit 0379603.
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Fix possible SF delta violation that would cause an
eventual assertion failure in some corner cases (esp
on very low bitrates) when marking bands for PNS due
to misuse of the sf_delta utilities
'erf' is far from the best name for a variable and is not very
descriptive since the actual variable points to the comparitively best
IS phase. Therefore rename it to 'best'.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Otherwise the too small buffer is directly used in the frame, causing
segmentation faults, when trying to use the frame.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>