Janne Grunau
363bd1c62c
remove iwmmxt optimizations
...
The were broken since August of 2010 without anyone noticing until
three weeks ago. Nobody cares about it anymore and hopefully Marvell
will support NEON like in the PXA978 from now on.
2012-03-12 22:46:56 +01:00
Christophe GISQUET
7e1ce6a6ac
dsputil: remove shift parameter from scalarproduct_int16
...
There is only one caller, which does not need the shifting. Other use cases
are situations where different roundings would be needed.
The x86 and neon versions are modified accordingly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-07 10:29:52 -08:00
Ronald S. Bultje
bd66f073fe
vp8: change int stride to ptrdiff_t stride.
...
On 64bit platforms with 32bit int, this means we won't have to sign-
extend the integer anymore.
2012-03-02 10:31:50 -08:00
Christophe GISQUET
2e74a5abc2
SBR DSP: use intptr_t for the ixh parameter.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:48:40 -08:00
Ronald S. Bultje
3ab9a2a557
rv34: change most "int stride" into "ptrdiff_t stride".
...
This prevents having to sign-extend on 64-bit systems with 32-bit ints,
such as x86-64. Also fixes crashes on systems where we don't do it and
arguments are not in registers, such as Win64 for all weight functions.
2012-02-20 14:58:25 -08:00
Martin Storsjö
efd29844eb
mpegvideo: Add ff_ prefix to nonstatic functions
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:07:23 +02:00
Martin Storsjö
9cf0841ef3
dsputil: Add ff_ prefix to the dsputil*_init* functions
...
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:06:34 +02:00
Diego Biurrun
aa06d65693
arm: Add missing #include to vp8.h to fix a make checkheaders warning.
2012-02-09 12:26:47 +01:00
Diego Biurrun
32f3c541bc
doxygen: Do not include license boilerplates in Doxygen comment blocks.
2012-02-06 19:39:24 +01:00
Mans Rullgard
cd2f98f365
ARM: ac3: fix ac3_bit_alloc_calc_bap_armv6
...
This function was broken when the start bin was not at the start
of a band.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-02-02 18:50:42 +00:00
Mans Rullgard
be822d77b6
aacsbr: ARM NEON optimised sbrdsp functions
...
Overall speedup of HE-AAC decoding 2.3x on Cortex-A8, 1.2x on A9.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-28 14:56:18 +00:00
Felipe Contreras
c3d5e290ca
ARM: fix build with FFT enabled and MDCT disabled
...
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-20 16:14:01 +00:00
Janne Grunau
9e12002f11
rv34: add NEON rv34_idct_add
...
Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add
down from 83 to 30 cycles.
squash: rv34 idct rearrange partial register loads
2012-01-16 19:26:41 +01:00
Christophe GISQUET
9ba9c34024
rv34: 1-pass inter MB reconstruction
...
Implement 1-pass inverse transform and reconstruction for inter blocks.
2012-01-16 19:26:41 +01:00
Mans Rullgard
71b3a63e9c
ARM: fix Thumb-mode simple_idct_arm
...
The alignment directive must obviously precede the label.
This was never noticed in ARM mode since the location is
already aligned there.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-13 19:09:59 +00:00
Mans Rullgard
5c5e1ea3cd
ARM: 4-byte align start of all asm functions
...
Due to apprent bugs in the GNU assembler and/or linker, relocations
can be incorrectly processed if the alignment of a Thumb instruction
is changed in the output file compared to the input object.
This fixes crashes in h264 decoding with Thumb enabled. No effect in
ARM mode since everything is 4-byte aligned there.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-13 19:09:59 +00:00
Mans Rullgard
81dc6a2a3c
ARM: rv34: fix asm syntax in dc transform functions
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-01-12 22:11:13 +01:00
Janne Grunau
e1e369049e
rv34: NEON optimised dc only inverse transform
...
30-50% faster than the C implementation, 0.5% overall speedup on
bourne.rmvb.
2012-01-12 18:33:55 +01:00
Christophe GISQUET
98f24ecd6c
rv34: joint coefficient decoding and dequantization
...
Perform dequantization while decoding coefficients instead of performing it
on the entire coefficients buffer.
Since quantized coefficients are very sparse, this usually causes a small
speedup. Speedup of around 1% on Panda board compared to the removed here
neon code. Global speedup is probably around 3%.
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-01-04 10:30:01 +01:00
Mans Rullgard
11b1db2759
rv40: NEON optimised weak loop filter
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-16 14:36:01 +00:00
Mans Rullgard
b536c7a3e1
ARM: fix external symbol refs in rv40 asm
...
External symbol references need prefixes on some systems.
This should fix build errors on Darwin.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-15 11:02:59 +00:00
Mans Rullgard
f7de52354f
ARM: dca: disable optimised decode_blockcodes() for old gcc
...
Old gcc versions have trouble compiling this function, and
no simple, targeted test is possible.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-15 01:02:58 +00:00
Mans Rullgard
71ce76027d
rv40: NEON optimised loop filter strength selection
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-14 11:26:30 +00:00
Mans Rullgard
4722a03c75
rv34: NEON optimised 4x4 dequant
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-13 12:06:21 +00:00
Mans Rullgard
392107ad07
rv40: NEON optimised rv40 qpel motion compensation
...
Based on patch by Janne Grunau.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-07 22:38:14 +00:00
Janne Grunau
6c88988866
rv40: NEON optimised weighted prediction
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-06 13:48:25 +00:00
Janne Grunau
f5c05b9aa5
rv40: NEON optimised chroma MC
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-06 13:48:25 +00:00
Mans Rullgard
f054a82727
ARM: move NEON H264 chroma mc to a separate file
...
This allows sharing code with the rv40 version of these functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-06 13:48:24 +00:00
Janne Grunau
42d32cf53c
rv34: NEON optimised inverse transform functions
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-06 13:48:24 +00:00
Mans Rullgard
59807fee6d
ARM: h264dsp_neon cosmetics
...
- Replace 'ip' with 'r12'.
- Use correct size designators for vld1/vst1.
- Whitespace fixes.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-02 19:59:18 +00:00
Janne Grunau
a760f530bb
ARM: make some NEON macros reusable
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-02 19:59:18 +00:00
Mans Rullgard
3adba2de3d
ARM: fix indentation in ff_dsputil_init_neon()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-01 19:41:36 +00:00
Mans Rullgard
96fef6cf31
ARM: NEON put/avg_pixels8/16 cosmetics
...
This makes whitespace and register names consistent with
the style used in more recent code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-01 19:41:36 +00:00
Mans Rullgard
716f1705e9
ARM: add remaining NEON avg_pixels8/16 functions
2011-12-01 19:41:36 +00:00
Mans Rullgard
94267ddfb2
ARM: clean up NEON put/avg_pixels macros
...
Although this adds a few lines, the macro calls are less convoluted.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-01 19:41:35 +00:00
Mans Rullgard
00a856e3f9
dca: ARMv6 optimised decode_blockcode()
...
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function ommitted.
2-5% faster overall on Cortex-A8.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-11-25 13:19:53 +00:00
Mans Rullgard
3a0b72dee0
ARM: remove needless .text/.align directives
...
The 'function' macro already includes the appropriate
directives.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-11-23 15:06:50 +00:00
Mans Rullgard
8ee2b4672f
ARM: add explicit .arch and .fpu directives to asm.S
...
This prevents build errors when compiler and assembler default
targets differ. Ideally each file would declare the highest
level it requires. This is however not easily possible as it
complicates assembling pre-armv6t2 code in Thumb-2 mode.
HAVE_NEON is used as indicator for ARMv7-A since no other
symbol exists for this and NEON is only available in this
variant.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-11-22 12:13:02 +00:00
Diego Biurrun
ce33320b30
Remove redundant filename self-references inside files.
...
Filenames are brittle across renames and add no useful information.
2011-11-08 17:52:56 +01:00
Anton Khirnov
acffe45732
mpegvideo: remove some unused variables from MpegEncContext.
2011-10-23 14:13:40 +02:00
Ronald S. Bultje
c2d337429c
H264: change weight/biweight functions to take a height argument.
...
Neon parts by Mans Rullgard <mans@mansr.com>.
2011-10-21 01:00:45 -07:00
Baptiste Coudurier
76741b0e56
h264: 4:2:2 intra decoding support
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Mans Rullgard
6308729e68
ARM: check for inline asm 'y' operand modifier support
...
The inline asm added in bf5d46d
uses the 'y' modifier which
is only supported from gcc 4.5. This check allows building
with older compilers.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-10-03 08:56:24 +01:00
Mans Rullgard
bf5d46d8e6
dca: NEON optimised high freq VQ decoding
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-09-30 19:01:23 +01:00
Mans Rullgard
baf6b738f2
ARM: NEON optimised vector_fmac_scalar()
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-09-28 15:56:09 +01:00
Anton Khirnov
297d9cb3dc
mpeg12enc: add intra_vlc private option.
...
Deprecate CODEC_FLAG2_INTRA_VLC.
2011-08-31 13:19:14 +02:00
Måns Rullgård
9a83adaf34
arm: Avoid using the movw instruction needlessly
...
This fixes building for ARM11 without Thumb2.
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-08-03 11:56:58 +03:00
Martin Storsjö
d0a2f0af9d
Move an int64_t down in MpegEncContext
...
This allows using the same arm assembler offsets for both EABI
and the mach-o ABI.
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-08-03 11:56:56 +03:00
Mans Rullgard
cbd58a872d
dsputil: remove some unused functions
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-27 16:05:49 +01:00
Mans Rullgard
a617c6aaa3
dsputil: update per-arch init funcs for non-h264 high bit depth
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00