Commit Graph

16358 Commits

Author SHA1 Message Date
Mans Rullgard
69665bd6f4 g723.1: do not pass large structs by value
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
138914dcd8 g723.1: do not bounce intermediate values via memory
Although a reasonable compiler will probably optimise out the
actual store and load, this operation still implies a truncation
to 16 bits which the compiler will probably not realise is not
necessary here.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
cbcf1b411f g723.1: declare a variable in the block it is used
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
35b533e4de g723.1: avoid saving/restoring excitation
Writing the scaled excitation to a scratch buffer (borrowing the
'audio' array) instead of modifying it in place avoids the need
to save and restore the unscaled values.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
4b728b4712 g723.1: avoid unnecessary memcpy() in residual_interp()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
f645710cf3 g723.1: make postfilter write directly to output buffer
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
1953264331 g723.1: drop unnecessary variable buf_ptr in formant_postfilter()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
b2af2c4bee g723.1: make scale_vector() output to a separate buffer
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
783da0d696 g723.1: make autocorr_max() work on an arbitrary buffer
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
3716105103 g723.1: do not needlessly use int64_t
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
47c73a73b0 g723.1: use saturating addition functions
Use saturating addition functions instead of 64-bit intermediates
and separate clipping.  This is much faster when dedicated
instructions are available.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
4aca716a53 g723.1: optimise scale_vector()
Firstly, nothing in this function can overflow 32 bits so the use
of a 64-bit type is completely unnecessary.  Secondly, the scale
is either a power of two or 0x7fff.  Doing separate loops for these
cases avoids using multiplications.  Finally, since only the number
of bits, not the actual value, of the maximum value is needed, the
bitwise or of all the values serves the purpose while being faster.

It is worth noting that even if overflow could happen, it was not
handled correctly anyway.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
1eb1f6f281 g723.1: remove useless uses of MUL64()
The operands in both cases are 16-bit so cannot overflow a 32-bit
destination.  In gain_scale() the inputs are reduced to 14-bit,
so even the shift cannot overflow.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
5a43eba956 g723.1: remove unnecessary argument 'shift' from dot_product()
The 'shift' argument is always 1 so there is no need to pass it
explicitly in every call.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
8b0de73464 g723.1: deobfuscate "(x << 4) - x" to "15 * x"
The compiler performs this optimisation.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Mans Rullgard
fddc5b9bea celp: optimise ff_celp_lp_synthesis_filter()
Adding instead of subtracting the products in the loop allows the
compiler to generate more efficient multiply-accumulate instructions
when 16-bit multiply-subtract is not available. ARM has only
multiply-accumulate for 16-bit operands.  In general, if only one
variant exists, it is usually accumulate rather than subtract.

In the same spirit, using the dedicated saturation function enables
use of any special optimised versions of this.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-13 01:03:25 +01:00
Derek Buitenhuis
17c11cef9f cllc: Implement ARGB support
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-08-12 15:21:15 -04:00
Derek Buitenhuis
7fda47d53b cllc: Add support for QRGB
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-08-12 15:07:00 -04:00
Derek Buitenhuis
f4bb38cc26 cllc: Rename some funcs to represent what they actually do
This is in preparation for adding support for other colorspaces
and coding types.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-08-12 15:07:00 -04:00
Diego Biurrun
3b9e832e17 x86: Drop silly "_yasm" suffixes from filenames 2012-08-12 17:13:05 +02:00
Anton Khirnov
51efed152d lavc: add an intra-only codec property. 2012-08-11 11:34:09 +02:00
Anton Khirnov
c223d79945 lavc: add codec descriptors.
They describe properties that are inherent to a codec (as described by
an AVCodecID) without referring to a specific implementation.
2012-08-11 11:32:11 +02:00
Anton Khirnov
2ff67c909c lavc: fix mixing CODEC_ID/AV_CODEC_ID in C++ code.
C++ does not allow to mix different enums, so e.g. code comparing
ACodecID with CodecID would fail to compile with gcc.

This very evil hack should fix this problem.
2012-08-10 18:48:40 +02:00
Mans Rullgard
05c36e0e5f g723.1: fix addition overflow
This addition must be done as 64-bit to avoid overflow and for
the subsequent clipping to be meaningful.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-10 12:18:38 +01:00
Mans Rullgard
52aa3015a3 g723.1: simplify and fix multiplication overflow
In 16-bit arithmetic, x * 0xffffc is simply x * -4 with extra overflows,
(and the constant was probably meant to be 0xfffc).  Combined with the
shift, this simplifies to -x >> 1.  Finally, clearing the low two bits
with a 32-bit mask and switching to a 32-bit type allows more efficient
code on 32-bit machines.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-10 12:18:38 +01:00
Mans Rullgard
e141cf2c57 g723.1: deobfuscate an expression
(x << 2) - x is just an optimisation of 3 * x the compiler is
perfectly capable of doing on its own.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-10 12:18:38 +01:00
Mans Rullgard
e2b7c5783d g723.1: remove unused #includes
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-10 12:18:38 +01:00
Diego Biurrun
804d7a1aa6 doxygen: Fix function parameter names to match the code 2012-08-09 20:05:55 +02:00
Mans Rullgard
0db9eba48c motion_est: drop inline from sad_hpel_motion_search()
This function is only ever called through a function pointer,
so marking it inline makes no sense.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 12:27:49 +01:00
Mans Rullgard
5bf7bc625b motion_est: remove unused macros
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 12:27:49 +01:00
Mans Rullgard
74f82f92a4 motion_est: remove useless no_motion_search() function
At both places this function is called, mb_[xy] == s->mb_[xy]
making the call together with following code equivalent to
simply assigning zeros.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 12:27:49 +01:00
Hendrik Leppkes
f9150c8ac0 lagarith: frame multithreading
About 2x speedup going from 1 to 2 threads.
1.7s to 0.85s on foreman CIF.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-08-09 10:50:27 +02:00
Diego Biurrun
36a8c43073 doxygen: qdm2: Drop documentation for non-existing function parameters 2012-08-09 03:49:44 +02:00
Mans Rullgard
f69f4036f8 mpegvideo: reduce excessive inlining of mpeg_motion()
The main benefit of inlining this function is from constant
propagation for the 'field_based' argument.  Instead of inlining
all calls, create two versions of the function for field_based
values of 0 and 1.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 01:31:37 +01:00
Mans Rullgard
7a851153d3 mpegvideo: convert mpegvideo_common.h to a .c file
This file defines a single, huge function, MPV_motion(), which
although being declared inline is not actually inlined by the
compiler (for good reason).  There is thus no sense in defining
this function in a header file, resulting in multiple copies of
it in the final library.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 01:31:37 +01:00
Mans Rullgard
18bbca1fd3 build: factor out mpegvideo.o dependencies to CONFIG_MPEGVIDEO
This adds a hidden config variable for the mpegvideo.o dependency
and selects from the codecs which require it.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 01:31:37 +01:00
Mans Rullgard
d7a4f8f8b9 Move MASK_ABS macro to libavcodec/mathops.h
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.

Two small tweaks to the macro are made:

- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Mans Rullgard
c318626ce2 x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h
This puts x86-specific things in the x86/ subdirectory where they
belong.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-09 00:58:20 +01:00
Alex Converse
122d5c526a aacdec: Don't fall back to the old output configuration when no old configuration is present.
Fixes MP4 files where the first frame is broken.
2012-08-08 16:55:41 -07:00
Dave Yeo
197439c1ef x86: pngdsp: Fix assembly for OS/2
The a.out object format does not allow aligning sections.
On OS/2 LD aligns sections to 16 bytes.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2012-08-08 15:45:09 +02:00
Kostya Shishkov
e78e6c37ef g723_1: clip argument for 15-bit version of normalize_bits()
It expects maximum value to be 32767 but calculations in scale_vector()
which uses this function can give it ABS(-32768) which leads to wrong
result and thus clipping is needed.
2012-08-08 13:24:19 +02:00
Kostya Shishkov
f86b2f3661 g723_1: use all LPC vectors in formant postfilter
Due to some mistake LPC vector for the first subframe was used for all
subframes instead of their own LPC vectors.
2012-08-08 13:23:22 +02:00
Kostya Shishkov
8f2aa89a5d mpc8: do not leave padding after last frame in buffer for the next decode call 2012-08-08 09:11:38 +02:00
Anton Khirnov
94364b7d42 mpegaudioenc: list supported channel layouts. 2012-08-08 07:53:48 +02:00
Anton Khirnov
927e92cdc7 mpegaudiodec: don't print an error on > 1 frame in a packet.
It's a perfectly normal situation, nothing to spam about.
2012-08-08 07:53:48 +02:00
Anton Khirnov
5702c8670e api-example: update to new audio encoding API. 2012-08-08 07:53:47 +02:00
Mans Rullgard
2b140a3d09 x86: use 32-bit source registers with movd instruction
yasm tolerates mismatch between movd/movq and source register size,
adjusting the instruction according to the register.  nasm is more
strict.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:21:20 +01:00
Mans Rullgard
a3df4781f4 x86: add colons after labels
nasm prints a warning if the colon is missing.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:20:56 +01:00
Anton Khirnov
36ef5369ee Replace all CODEC_ID_* with AV_CODEC_ID_* 2012-08-07 16:00:24 +02:00
Anton Khirnov
104e10fb42 lavc: add AV prefix to codec ids. 2012-08-07 15:56:39 +02:00