428 Commits

Author SHA1 Message Date
James Almer
dd2c9034b1 x86/swr: convert resample_{common, linear}_double_sse2 to yasm
Signed-off-by: James Almer <jamrial@gmail.com>

312531 -> 311528 dezicycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 17:57:36 +02:00
Ronald S. Bultje
847bb638c0 swr: convert resample_common/linear_int16_mmx2/sse2 to yasm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-30 20:11:50 +02:00
Michael Niedermayer
418e5768c6 swresample/resample_template: move division out of loop for float/double swri_resample_linear()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-30 04:30:10 +02:00
Michael Niedermayer
c5a405c4f0 swresample/resample_template: flip order of operations in swri_resample_linear() for 32bit
Fixes integer overflow

Found-by: BBB
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-29 22:19:57 +02:00
Ronald S. Bultje
faa1471ffc swr: rewrite resample_common/linear_float_sse/avx in yasm.
Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm)
cycles/sample on 32bit. Non-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 17:06:47 +02:00
Ronald S. Bultje
ddb7b4435a swr: move dst_size == 0 handling outside DSP function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-28 15:30:01 +02:00
Ronald S. Bultje
0dae193d3e swr: remove another forgotten division in DSP function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-22 05:20:22 +02:00
Ronald S. Bultje
cbf21628a5 swr: remove div/mod from DSP functions.
Also fix a bug with resample_compensation resetting dst_incr.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-18 14:15:52 +02:00
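
The idea behind removing div/mod from a per-sample resampling step can be sketched as follows. This is only an illustration under assumptions, not the libswresample code: the `Pos` struct, both function names, and the parameters `incr_div` / `incr_mod` are hypothetical. The sketch assumes the position is tracked as an integer index plus a fraction in units of `1/src_incr`, and that the increment is split once, outside the loop, into a quotient and a remainder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical resampler position: whole input index plus a
 * fractional part counted in units of 1/src_incr. */
typedef struct {
    int index;
    int frac;
} Pos;

/* Reference version: one division and one modulo per output sample. */
static Pos advance_divmod(Pos p, int dst_incr, int src_incr)
{
    int64_t t = (int64_t)p.frac + dst_incr;
    p.index += (int)(t / src_incr);
    p.frac   = (int)(t % src_incr);
    return p;
}

/* Division-free version: the caller computes
 *   incr_div = dst_incr / src_incr
 *   incr_mod = dst_incr % src_incr
 * once. Because frac < src_incr and incr_mod < src_incr, their sum is
 * below 2 * src_incr, so a single conditional subtract is enough. */
static Pos advance_nodiv(Pos p, int incr_div, int incr_mod, int src_incr)
{
    assert(incr_mod >= 0 && incr_mod < src_incr);
    assert(p.frac >= 0 && p.frac < src_incr);

    p.index += incr_div;
    p.frac  += incr_mod;
    if (p.frac >= src_incr) {
        p.frac -= src_incr;
        p.index++;
    }
    return p;
}
```

Both functions should yield the same position sequence as long as the preconditions checked by the asserts hold; the second one trades the per-sample division for an add and a compare.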
Michael Niedermayer
0608bc6502 swresample/audioconvert: fix () in FMT_PAIR_FUNC()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-18 03:13:37 +02:00
Ronald S. Bultje
edf930472b swr: reindent.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-16 01:33:32 +02:00
Ronald S. Bultje
083cd3d1f7 swr: compile mmx2 s16p functions only on x86-32.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 13:34:53 +02:00
James Almer
7f4dfbd080 swr: add prototypes for resample dsp functions
Should fix compilation failures with MSVC and any other compiler
without inline asm support.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 01:33:17 +02:00
Ronald S. Bultje
ada8f9c046 swr: remove obsolete function prototypes.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-15 00:07:25 +02:00
Ronald S. Bultje
7128a35f8c swr: split out DSP functions.
DSP bits of swri_resample go into their own mini-DSP functions; DSP
init goes from a per-call branch in multiple_resample to a proper
DSP init routine; x86 bits go into x86/; swri_resample() moves out of
resample_template.c into resample.c because it's independent of DSP
code or sample type; multiple_resample() is simplified.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 20:21:39 +02:00
Michael Niedermayer
4411928c64 swresample/resample: replace assert by av_assert
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 16:33:09 +02:00
Ronald S. Bultje
b785c62681 swr: handle initial negative sample index outside DSP function.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 14:36:18 +02:00
Ronald S. Bultje
6b9685de3a swr: remove unnecessary assignment.
I don't see dst_incr/dst_incr_frac ever being changed from their
initial value (which is the inverse of this operation), so it seems
to me that this is a no-op.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-14 04:30:53 +02:00
Ronald S. Bultje
f341340552 swr: handle 64bit overflow check in multiple_resample().
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-09 15:24:51 +02:00
Lou Logan
88f2586adb fix various typos
Signed-off-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-03 10:58:19 -08:00
Ronald S. Bultje
cdfd9717ed swr: move compensation_distance handling to swri_resample caller.
I think there's an off-by-one in terms of the switchpoint where we
switch from dst_incr to ideal_dst_incr, I don't think that's a massive
issue, but just be aware of that. It's probably trivial to prevent but
I don't care.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>

I could not reproduce any off by 1 error, results are bit exact (michael)
2014-06-02 15:06:24 +02:00
Michael Niedermayer
2c23f87c85 swr/resample_template: prevent end_index from overflowing and add check for delta_frac overflow
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-02 00:57:08 +02:00
Ronald S. Bultje
9b53853756 Rewrite main resampling loop (common and linear).
This removes a branch at a performance-sensitive point (in the middle
of the loop). In fate-swr-resample-s32p-8000-2626, this makes the code
about 10% faster. It also simplifies the loops, allowing us to rewrite
it in yasm at some later point.

The compensation_distance != 0 code and index < 0 code are still kind
of hairy. For compensation_distance != 0, this should likely be handled
in the caller, so that it calls swri_resample twice (once until the
dst_incr switch-point, and once with the remainder of the samples). For
index < 0, the code should probably be rewritten to break out of the
loop once sample_index >= 0, and then resume (e.g. as a tail-call) to
the common or linear resampling loops.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-02 00:47:54 +02:00
James Almer
a9bf713d35 swresample: add swri_resample_float_avx
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-16 05:27:03 +02:00
Michael Niedermayer
96cb4c8718 swresample: swr_close()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-15 18:27:23 +02:00
Matt Oliver
1898c2f49d inline asm: fix arrays as named constraints.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-07 15:02:45 +02:00
James Almer
4cdea92976 swresample/resample: add missing xmm clobbers
Might fix fate-swr on ICL

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-07 01:32:40 +02:00
Michael Niedermayer
68c3e6025f Fix convertion typos
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-29 00:09:08 +02:00
James Almer
cdac3ab59f swresample: add swri_resample_double_sse2
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 16:46:07 +02:00
Michael Niedermayer
291d464161 swresample: fix AV_CH_LAYOUT_STEREO_DOWNMIX input
Fixes Ticket 3542

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-24 01:25:46 +02:00
Michael Niedermayer
2b58c9c945 swresample/resample_template: try to consider src_size more exactly
This should avoid slight differences in the output caused by input
size alignment differences between archs.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-15 06:35:35 +02:00
Michael Niedermayer
5e379cd3ee swresample/resample: simplify index/consumed calculation for the filter = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:23:04 +02:00
Michael Niedermayer
6c8ee74af2 swresample/resample: Fix fractional part of index in the filter_size = 1 filters = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:22:17 +02:00
Michael Niedermayer
5027f39712 swresample/resample: use av_malloc_array() where appropriate
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-08 00:29:26 +02:00
Michael Niedermayer
a5290cb1ac swresample/dither: use av_malloc_array()
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-08 00:29:20 +02:00
Michael Niedermayer
f9158b01d0 swresample/resample: Limit filter length
Related to CID1197063

The limit chosen is arbitrary and much larger than what makes sense.
It avoids the need for checking arithmetic operations with the length for overflow.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-08 00:25:49 +02:00
James Almer
63dbba655e swresample/resample: sse float linear interpolation
About two times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:34:02 +01:00
James Almer
fa25c4c400 swresample/resample: mmx2/sse2 int16 linear interpolation
About three times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:33:16 +01:00
James Almer
32291ba6ea swresample: add swri_resample_float_sse
At least two times faster than the C version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 06:01:06 +01:00
Matt Oliver
8236747511 Automatically change MANGLE() into named inline asm operands when direct symbol references in inline asm are not supported.
This is part of the patch-set for intel C inline asm on windows support

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 23:39:30 +01:00
James Almer
3d48cbc56c swresample: reuse COMMON_CORE asm where possible
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 16:08:34 +01:00
James Almer
7c8bf09edd swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2
pshuf+paddd is slightly faster than phaddd.
The real gain is in pre-ssse3 processors like AMD K8 and K10, which get
a big boost in performance compared to the mmxext version

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 15:00:50 +01:00
Michael Niedermayer
6c6e4dd139 swr: check that the context for swr_convert() has been initialized
Reviewed-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-24 08:23:42 +01:00
Michael Niedermayer
a66be60888 swresample: add swr_is_initialized()
Idea-from/based-on: 7e86c27b4e
Reviewed-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-24 08:23:22 +01:00
Michael Niedermayer
f284e2a58a swresample: factorize clear_context() out
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-22 21:36:30 +01:00
Reimar Döffinger
e535897fad Fix libswresample compilation with Apple Neon assembler.
Signed-off-by: Carl Eugen Hoyos <cehoyos@ag.or.at>
2014-02-17 17:40:10 +01:00
Martin Storsjö
3dd04cbcf7 swresample: Add arm&x86 clobber tests
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-18 18:38:57 +01:00
Reimar Döffinger
cbeaf67888 Avoid using empty macro arguments.
These are not supported by all compilers (gcc 2.95 but also older SPARC
compilers, see gcc bug #33304 for example), and there is no real need for them.
One use of this feature remains in libavdevice/v4l2.c which can't be
replaced quite as easily.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2013-12-31 12:19:59 +01:00
Stefano Sabatini
334e2e2363 lavu,lavc,lswr: do not hardcode AV_SAMPLE_FMT_NB value when setting sample format max value
The constant may change in libavutil but the library may be compiled
against an older version, thus rejecting a value which is otherwise
supported by the new libavutil.

INT_MAX is used here to denote the max allowed value for a sample format.

The opt-test code is changed to provide a valid reference example.
2013-12-26 11:35:27 +01:00
James Almer
56572787ae Add Windows resource file support for shared libraries
Originally written by James Almer <jamrial@gmail.com>

With the following contributions by Timothy Gu <timothygu99@gmail.com>

* Use descriptions of libraries from the pkg-config file generation function
* Use "FFmpeg Project" as CompanyName (suggested by Alexander Strasser)
* Use "FFmpeg" for ProductName as MSDN says "name of the product with which the
  file is distributed" [1].
* Use FFmpeg's version (N-xxxxx-gxxxxxxx) for ProductVersion per MSDN [1].
* Only build the .rc files when --enable-small is not enabled.

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa381058.aspx

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-05 23:42:07 +01:00
Michael Niedermayer
a6af5da7a2 swresample: use the internal buffer for resampling the last few samples
Fixes out of array read
Fixes Ticket3193

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-04 20:40:42 +01:00