Commit Graph

21 Commits

Author SHA1 Message Date
Ronald S. Bultje
9b53853756 Rewrite main resampling loop (common and linear).
This removes a branch at a performance-sensitive point (in the middle
of the loop). In fate-swr-resample-s32p-8000-2626, this makes the code
about 10% faster. It also simplifies the loops, allowing us to rewrite
them in yasm at some later point.

The compensation_distance != 0 code and index < 0 code are still kind
of hairy. For compensation_distance != 0, this should likely be handled
in the caller, so that it calls swri_resample twice (once until the
dst_incr switch-point, and once with the remainder of the samples). For
index < 0, the code should probably be rewritten to break out of the
loop once sample_index >= 0, and then resume (e.g. as a tail-call) to
the common or linear resampling loops.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-02 00:47:54 +02:00
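
One way to keep a bounds check out of the middle of a resampling loop is to compute the number of producible output samples up front and then run a fixed-count loop. The sketch below is a simplified illustration under that assumption, not the libswresample code: the `Resampler` struct, `max_outputs`, and `resample_common` are hypothetical names, and only the field names `index`, `frac`, `src_incr`, `dst_incr`, `filter_length`, and `phase_shift` loosely follow the terminology of the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified state; not the real libswresample context. */
typedef struct {
    const float *filter_bank; /* (1 << phase_shift) filters of filter_length taps */
    int filter_length;
    int phase_shift;
    int index;                /* current phase, 0 .. (1 << phase_shift) - 1 */
    int frac;                 /* remainder, 0 .. src_incr - 1 */
    int src_incr;
    int dst_incr;             /* phase advance per output, in units of 1/src_incr */
} Resampler;

/* How many outputs can be produced before the filter window leaves src. */
static int max_outputs(const Resampler *c, int src_size)
{
    int64_t end_index  = ((int64_t)src_size - c->filter_length + 1) *
                         ((int64_t)1 << c->phase_shift);
    int64_t delta_frac = (end_index - c->index) * c->src_incr - c->frac;
    if (delta_frac <= 0)
        return 0;
    return (int)((delta_frac + c->dst_incr - 1) / c->dst_incr);
}

/* Writes up to dst_size samples; returns the count and sets *consumed. */
static int resample_common(Resampler *c, float *dst, int dst_size,
                           const float *src, int src_size, int *consumed)
{
    assert(c->src_incr > 0 && c->dst_incr > 0);
    const int phase_mask   = (1 << c->phase_shift) - 1;
    const int dst_incr_div = c->dst_incr / c->src_incr;
    const int dst_incr_mod = c->dst_incr % c->src_incr;
    int index = c->index, frac = c->frac, sample_index = 0;
    int n = max_outputs(c, src_size);
    if (n > dst_size)
        n = dst_size;

    /* No bounds check in here: n already guarantees every read is valid. */
    for (int dst_index = 0; dst_index < n; dst_index++) {
        const float *filter = c->filter_bank + c->filter_length * index;
        float val = 0;
        for (int i = 0; i < c->filter_length; i++)
            val += src[sample_index + i] * filter[i];
        dst[dst_index] = val;

        frac  += dst_incr_mod;
        index += dst_incr_div;
        if (frac >= c->src_incr) {   /* carry the fractional remainder */
            frac -= c->src_incr;
            index++;
        }
        sample_index += index >> c->phase_shift;
        index        &= phase_mask;
    }
    c->index  = index;
    c->frac   = frac;
    *consumed = sample_index;
    return n;
}
```

With a single one-tap filter, `phase_shift = 0`, `src_incr = 1` and `dst_incr = 2`, this sketch decimates by two: eight input samples yield four outputs and all eight inputs are reported as consumed.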
James Almer
a9bf713d35 swresample: add swri_resample_float_avx
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-16 05:27:03 +02:00
James Almer
cdac3ab59f swresample: add swri_resample_double_sse2
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-25 16:46:07 +02:00
Michael Niedermayer
2b58c9c945 swresample/resample_template: try to consider src_size more exactly
This should avoid slight differences in the output caused by input
size alignment differences between archs.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-15 06:35:35 +02:00
Michael Niedermayer
5e379cd3ee swresample/resample: simplify index/consumed calculation for the filter = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:23:04 +02:00
Michael Niedermayer
6c8ee74af2 swresample/resample: Fix fractional part of index in the filter_size = 1 filters = 1 case
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-04-14 02:22:17 +02:00
James Almer
63dbba655e swresample/resample: sse float linear interpolation
About two times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:34:02 +01:00
James Almer
fa25c4c400 swresample/resample: mmx2/sse2 int16 linear interpolation
About three times faster

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-24 02:33:16 +01:00
James Almer
32291ba6ea swresample: add swri_resample_float_sse
At least two times faster than the C version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-20 06:01:06 +01:00
James Almer
3d48cbc56c swresample: reuse COMMON_CORE asm where possible
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 16:08:34 +01:00
James Almer
7c8bf09edd swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2
pshuf+paddd is slightly faster than phaddd.
The real gain is on pre-SSSE3 processors like AMD K8 and K10, which get
a big boost in performance compared to the mmxext version.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-18 15:00:50 +01:00
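
The pshufd+paddd reduction mentioned above can be illustrated with SSE2 intrinsics. This is a minimal sketch, not the asm from the commit: `dot8_int16` is a hypothetical helper that multiplies eight int16 pairs with `_mm_madd_epi16` (pmaddwd) and then sums the four 32-bit partial results using two shuffle+add steps instead of the SSSE3-only phaddd. A scalar fallback is included as an assumption so the block also builds where SSE2 is unavailable.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Dot product of two 8-element int16 vectors (hypothetical helper). */
static int32_t dot8_int16(const int16_t *a, const int16_t *b)
{
    assert(a && b);
#if defined(__SSE2__)
    __m128i va  = _mm_loadu_si128((const __m128i *)a);
    __m128i vb  = _mm_loadu_si128((const __m128i *)b);
    /* pmaddwd: four 32-bit sums of adjacent 16-bit products */
    __m128i sum = _mm_madd_epi16(va, vb);
    /* pshufd + paddd: add the upper 64-bit half onto the lower half */
    sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(1, 0, 3, 2)));
    /* pshufd + paddd: add the neighbouring 32-bit lane */
    sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(sum);
#else
    int32_t sum = 0;
    for (int i = 0; i < 8; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
#endif
}
```

Both code paths return the same value as long as the true sum fits in 32 bits.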
Michael Niedermayer
b8c55590d5 swr/resample: fix integer overflow, add missing cast
The effects of this are limited to numeric errors in the output

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-04 04:05:59 +01:00
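
As a generic illustration of how a missing cast leads to this kind of overflow (not the actual change in this commit), the product of two `int` values is computed in `int` unless one operand is widened first. The helper name `rescale` below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Computes a * b / c without overflowing the intermediate product.
 * Writing "a * b / c" with plain int operands would multiply in int,
 * and signed overflow there is undefined behaviour. */
static int rescale(int a, int b, int c)
{
    assert(c > 0);
    return (int)((int64_t)a * b / c);
}
```

For example, `rescale(100000, 100000, 100000)` needs an intermediate value of 10^10, which does not fit in 32 bits but is handled correctly once the first operand is cast to `int64_t`.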
Michael Niedermayer
b6a7f66f93 resample: remove disabled debug code
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-06 02:51:26 +01:00
Clément Bœsch
8ea8833979 swr/resample: move templating parameters to template itself.
It has various benefits such as allowing some refactoring, clarifying
the code in the inclusion part, and making the template understandable
on its own.

This commit is based on the templating method used by Justin Ruggles for
libavresample.
2012-11-15 21:24:49 +01:00
Michael Niedermayer
d53f447130 swr: move if() block into the only branch where it can be true.
This should make the code a tiny tiny bit faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-15 12:33:40 +01:00
Michael Niedermayer
17da2d9eee swr: reorder/redesign operations to avoid integer overflow.
This fixes an out-of-array read.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-15 12:33:40 +01:00
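
A common way to reorder operations so that a bounds check cannot overflow is to move the addition to the other side of the comparison as a subtraction of known-small values. The sketch below shows that general technique only; `window_in_bounds` is a hypothetical helper, and the actual change in this commit may differ.

```c
#include <assert.h>

/* Returns 1 if src[sample_index .. sample_index + filter_length - 1] lies
 * inside an array of src_size elements, else 0.
 * The naive test "sample_index + filter_length <= src_size" can overflow
 * when sample_index is near INT_MAX; the rearranged form below cannot,
 * because src_size - filter_length is computed from two non-negative
 * values with filter_length <= src_size. */
static int window_in_bounds(int sample_index, int filter_length, int src_size)
{
    assert(filter_length > 0);
    if (sample_index < 0 || src_size < filter_length)
        return 0;
    return sample_index <= src_size - filter_length;
}
```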
Michael Niedermayer
4ccf6e3971 swr: MMX2 & SSSE3 int16 resample core
about 4 times faster

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-28 00:36:27 +02:00
Michael Niedermayer
0c142e4cda swr: introduce filter_alloc in preparation of SIMD resample optimisations
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-19 03:09:24 +02:00
Michael Niedermayer
80e857c967 swr/resample: optimize C code for the most common case
15% speedup

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-19 03:09:24 +02:00
Michael Niedermayer
6e6dd9995b resample_template: use av_assert
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-06 20:08:57 +02:00
Michael Niedermayer
7f1ae79d38 swr: support float & int32 in the resampler
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-10 13:18:49 +02:00