Commit Graph

754 Commits

Author SHA1 Message Date
Michael Niedermayer
1306359ea9 Merge commit '49676eb7301e775d08bdbba5380159b106ee258f'
* commit '49676eb7301e775d08bdbba5380159b106ee258f':
  dsputil: Remove prototypes for nonexisting optimization functions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-13 22:56:08 +01:00
Diego Biurrun
831a118078 Update dsputil- and SIMD-related comments to match reality more closely 2014-03-13 05:50:29 -07:00
Diego Biurrun
d1184b8110 arm: dsputil: Add a bunch of missing #includes 2014-03-13 05:50:28 -07:00
Diego Biurrun
49676eb730 dsputil: Remove prototypes for nonexisting optimization functions 2014-03-13 05:50:28 -07:00
Michael Niedermayer
e5920425b0 Merge commit '5a7f382a5d33d9a26890affe6c8c5070a48dfc22'
* commit '5a7f382a5d33d9a26890affe6c8c5070a48dfc22':
  armv6: vp8: use explicit labels in motion compensation asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-12 22:03:00 +01:00
Janne Grunau
5a7f382a5d armv6: vp8: use explicit labels in motion compensation asm
The integrated arm assembler in clang-503.0.38 (Xcode-5.1) fails
to assemble a branch to 'label + offset' in thumb mode.
2014-03-12 15:06:05 +01:00
Michael Niedermayer
fa4f573997 Merge commit '634d9d8b398982647b3d7160641198744901d8d8'
* commit '634d9d8b398982647b3d7160641198744901d8d8':
  arm: get_cabac inline asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-09 13:37:29 +01:00
Michael Niedermayer
fc1d7811ef Merge commit '4506a854a4d846692ba71daeeff661dc214c8fa2'
* commit '4506a854a4d846692ba71daeeff661dc214c8fa2':
  arm: vp3: remove incorrect const in ff_vp3_idct_dc_add_neon declaration

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-09 13:06:31 +01:00
Michael Niedermayer
b39e895024 Merge commit '61985ad72c47bbb668f2d3923bf5c9df83e79323'
* commit '61985ad72c47bbb668f2d3923bf5c9df83e79323':
  arm: hpeldsp: fix put_pixels8_y2_{,no_rnd_}armv6

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-09 01:16:21 +01:00
Janne Grunau
634d9d8b39 arm: get_cabac inline asm
Based on the aarch64 asm. CPU cycle counts on cortex-a9 compared to
gcc 4.8.2:
before: 475 decicycles in get_cabac_noinline, 67106035 runs, 2829 skips
after:  393 decicycles in get_cabac_noinline, 67106474 runs, 2390 skips

Overall speedup is above 2%. Code generated by clang 3.4 is slower on
the same hardware and the relative change is a little larger.
2014-03-09 00:45:34 +01:00
Janne Grunau
4506a854a4 arm: vp3: remove incorrect const in ff_vp3_idct_dc_add_neon declaration
Was missed in aeaf268e52 when integrating
clear_blocks into the idct.
2014-03-09 00:45:33 +01:00
Janne Grunau
61985ad72c arm: hpeldsp: fix put_pixels8_y2_{,no_rnd_}armv6
The overread avoidance fix in cbddee1cca
broke the computation for the last row since it prevented the safe
reading from the height+1-th row.

CC: libav-stable@libav.org
2014-03-08 18:31:57 +01:00
Michael Niedermayer
a74bab7079 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: hpeldsp: prevent overreads in armv6 asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-05 21:35:30 +01:00
Janne Grunau
cbddee1cca arm: hpeldsp: prevent overreads in armv6 asm
Based on a patch by Russel King <rmk+libav@arm.linux.org.uk>

Bug-Id: 646
CC: libav-stable@libav.org
2014-03-05 14:30:57 +01:00
Michael Niedermayer
fe6603745e Merge commit '6e4009d4cdf5927bdaedf58fcfc5e813b14c366b'
* commit '6e4009d4cdf5927bdaedf58fcfc5e813b14c366b':
  arm: dcadsp: implement decode_hf as external NEON asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:52:25 +01:00
Michael Niedermayer
fb3c33f3cd Merge commit '4cb6964244fd6c099383d8b7e99731e72cc844b9'
* commit '4cb6964244fd6c099383d8b7e99731e72cc844b9':
  dcadec: simplify decoding of VQ high frequencies

Conflicts:
	configure
	libavcodec/dcadec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:41:19 +01:00
Michael Niedermayer
90f674d55b Merge commit '87ec849fe9acba075c843e67bcd01f256f481a18'
* commit '87ec849fe9acba075c843e67bcd01f256f481a18':
  dcadec: remove scaling in lfe_interpolation_fir

Conflicts:
	libavcodec/dcadec.c
	libavcodec/dcadsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 18:14:12 +01:00
Janne Grunau
6e4009d4cd arm: dcadsp: implement decode_hf as external NEON asm 2014-02-28 13:12:19 +01:00
Christophe Gisquet
4cb6964244 dcadec: simplify decoding of VQ high frequencies
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.

Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.

The decode_hf implementations have following timings:

For x86 Arrandale:
        C  SSE SSE2 SSE4
win32: 260 162  119  104
win64: 242 N/A   89   72

The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
2014-02-28 13:03:22 +01:00
Christophe Gisquet
87ec849fe9 dcadec: remove scaling in lfe_interpolation_fir
The scaling factor is constant so it is faster to scale the
FIR coefficients in the tables during compilation.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:47 +01:00
Peter Ross
b8664c9294 avcodec/vp8dsp: add VP7 idct and loop filter
Signed-off-by: Peter Ross <pross@xvid.org>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-15 02:15:35 +01:00
James Darnley
623f380a18 lavc: fix flac encoder and decoder dependencies
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-13 21:00:32 +01:00
Michael Niedermayer
ccc48b318b avcodec/arm/int_neon: fix handling sizes % 16 != 0
This assumes the array is sufficiently padded with 0

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-13 02:20:08 +01:00
Michael Niedermayer
b7c6ccc1e6 Merge commit 'd6eac2f1bcce0cb85fac5d50fcfe94bc490d4a5e'
* commit 'd6eac2f1bcce0cb85fac5d50fcfe94bc490d4a5e':
  arm: Remove a stray .fpu directive

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-09 20:57:55 +01:00
Martin Storsjö
d6eac2f1bc arm: Remove a stray .fpu directive
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2014-02-09 18:36:16 +01:00
Michael Niedermayer
df98b36aa6 Merge commit '5c1c6e82261b856214499b9fef3a08bf3ff6e0ae'
* commit '5c1c6e82261b856214499b9fef3a08bf3ff6e0ae':
  dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 17:25:31 +01:00
Janne Grunau
5c1c6e8226 dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders 2014-02-08 13:38:36 +01:00
Michael Niedermayer
5794e9fce2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dcadsp: split lfe_dir cases

Conflicts:
	libavcodec/arm/dcadsp_init_arm.c

See: 45854df9a5
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 02:27:27 +01:00
Christophe Gisquet
45854df9a5 dcadsp: split lfe_dir cases
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 02:04:12 +01:00
Christophe Gisquet
481a46a462 dcadsp: add int8x8_fmul_int32 to DSP context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:55:42 +01:00
Michael Niedermayer
bf90abe1dd Merge commit '5bcbb516f2ff45290ef7995b081762e668693672'
* commit '5bcbb516f2ff45290ef7995b081762e668693672':
  arm: Add X() around all references to extern symbols

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:48:26 +01:00
Christophe Gisquet
5fdbfcb5b7 dcadsp: split lfe_dir cases
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.

Arm asm changes by Janne Grunau.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:54:18 +01:00
Christophe Gisquet
2bd44cb705 dcadsp: add int8x8_fmul_int32 to dsp context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:51:59 +01:00
Martin Storsjö
5bcbb516f2 arm: Add X() around all references to extern symbols
Don't rely on the fact that an unprefixed label currently exists.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-07 15:13:58 +02:00
Michael Niedermayer
c73445a45c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter

Conflicts:
	libavcodec/arm/vp8dsp_neon.S

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-06 15:33:54 +01:00
Martin Storsjö
49ec551595 vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter
Based on a patch by Ronald S. Bultje.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-06 09:32:26 +02:00
Michael Niedermayer
7766c7b7a0 Merge commit '7151c5d04aed3b496c21f713dcb603e2cbdb9c49'
* commit '7151c5d04aed3b496c21f713dcb603e2cbdb9c49':
  arm: Use full filenames as multiple inclusion guards

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-14 14:38:10 +01:00
Diego Biurrun
7151c5d04a arm: Use full filenames as multiple inclusion guards 2014-01-14 00:04:52 +01:00
Michael Niedermayer
9d5cc55f0f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add an option for making sure NEON registers aren't clobbered

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-11 03:08:10 +01:00
Martin Storsjö
44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Michael Niedermayer
8be8dddd13 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add a missing # as prefix for an immediate constant

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:44:56 +01:00
Michael Niedermayer
edba54630b Merge commit '5dae4872357613a0b51120b54a4c5221e0ec3f69'
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
  arm: Allow overriding the alignment set in the function macro

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:36:56 +01:00
Michael Niedermayer
3c00d4c5f0 Merge commit 'b7b932f5e3602bd34c3cc634b71c8bbbc0fb8dc0'
* commit 'b7b932f5e3602bd34c3cc634b71c8bbbc0fb8dc0':
  arm: Remove a leftover define for the pld instruction

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:21:18 +01:00
Michael Niedermayer
9ef7c0c551 Merge commit '67bb3a4e285a5871770cbaa2d78bf9024961dd0f'
* commit '67bb3a4e285a5871770cbaa2d78bf9024961dd0f':
  arm: cosmetics: Reindent the h264dsp neon init function

Conflicts:
	libavcodec/arm/h264dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:13:25 +01:00
Martin Storsjö
952d3187d8 arm: Add a missing # as prefix for an immediate constant
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:30:13 +02:00
Martin Storsjö
5dae487235 arm: Allow overriding the alignment set in the function macro
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).

The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.

This restores the original intention, to align the loop entry
points.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:56 +02:00
Martin Storsjö
b7b932f5e3 arm: Remove a leftover define for the pld instruction
This file no longer uses the pld instruction at all, all such uses
have been split into hpeldsp_arm.S.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:42 +02:00
Martin Storsjö
67bb3a4e28 arm: cosmetics: Reindent the h264dsp neon init function
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:31 +02:00
Michael Niedermayer
7778979f6f Merge commit '794fcf79a89eca2d4e889803b2c804a0b1defbb3'
* commit '794fcf79a89eca2d4e889803b2c804a0b1defbb3':
  Rename CONFIG_FFT_FLOAT ---> FFT_FLOAT

Conflicts:
	libavcodec/fft-internal.h
	libavcodec/fft-test.c
	libavcodec/fft_fixed.c
	libavcodec/fft_float.c
	libavcodec/fft_template.c
	libavcodec/mdct_fixed.c
	libavcodec/mdct_float.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-07 13:10:56 +01:00
Diego Biurrun
794fcf79a8 Rename CONFIG_FFT_FLOAT ---> FFT_FLOAT
The define does not originate from configure, so it should not
have a name that is CONFIG_-prefixed.
2014-01-06 19:12:48 +01:00
Michael Niedermayer
30056fd0be Merge commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb'
* commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb':
  h264: do not use 422 functions for monochrome

See: 07abf13da4
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-06 16:51:23 +01:00
Anton Khirnov
a03a642d5c h264: do not use 422 functions for monochrome
Fixes invalid memory access.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
2014-01-06 08:25:36 +01:00
Michael Niedermayer
d8e65e9224 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Use the matching endfunc macro instead of the assembler directive directly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-04 13:19:54 +01:00
Michael Niedermayer
63ce041a7d Merge commit '2ad4ee345a4216aef3999f57dd14c56128d27a13'
* commit '2ad4ee345a4216aef3999f57dd14c56128d27a13':
  arm: Add a missing endfunc macro call

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-04 13:13:39 +01:00
Martin Storsjö
3348e3492d arm: Use the matching endfunc macro instead of the assembler directive directly
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-04 13:53:08 +02:00
Martin Storsjö
2ad4ee345a arm: Add a missing endfunc macro call
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-04 13:53:02 +02:00
Michael Niedermayer
70d6ce783c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Don't clobber callee saved registers in scalarproduct

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-21 01:19:59 +01:00
Michael Niedermayer
4fa91b88c6 Merge commit 'b254490bdabb21bd517c05b1a68717f9952ac8c4'
* commit 'b254490bdabb21bd517c05b1a68717f9952ac8c4':
  vc1: arm: Add NEON no_rnd chroma MC

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-20 23:33:15 +01:00
Michael Niedermayer
69278d94c4 Merge commit '832e19063209a5f355af733d1a45f5051f49ce33'
* commit '832e19063209a5f355af733d1a45f5051f49ce33':
  vc1: arm: Add NEON assembly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-20 23:12:16 +01:00
Martin Storsjö
d307e408d4 arm: Don't clobber callee saved registers in scalarproduct
q4-q7/d8-d15 are supposed to not be clobbered by the callee.

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 20:48:30 +02:00
Mason Carter
b254490bda vc1: arm: Add NEON no_rnd chroma MC
Apply David Conrad's old patch to the modern codebase.

http://ffmpeg.org/pipermail/ffmpeg-devel/2009-April/059877.html

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 14:53:42 +02:00
Mason Carter
832e190632 vc1: arm: Add NEON assembly
For:

ff_vc1_inv_trans_{8,4}x{8,4}_{dc_,}neon
ff_put_pixels8x8_neon
ff_put_vc1_mspel_mc{0,1,2,3}{0,1,2,3}_neon (except for 00)

Based on ARM assembly code in libavcodec/arm by Rob Clark and Mans
Rullgard.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 14:53:39 +02:00
Michael Niedermayer
8e70fdab36 Merge commit '4958f35a2ebc307049ff2104ffb944f5f457feb3'
* commit '4958f35a2ebc307049ff2104ffb944f5f457feb3':
  dsputil: Move apply_window_int16 to ac3dsp

Conflicts:
	libavcodec/arm/ac3dsp_init_arm.c
	libavcodec/arm/ac3dsp_neon.S
	libavcodec/x86/ac3dsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-09 04:12:40 +01:00
Diego Biurrun
4958f35a2e dsputil: Move apply_window_int16 to ac3dsp
The (optimized) functions are used nowhere else.
2013-12-08 17:57:15 +01:00
Ronald S. Bultje
f574b4da56 vp8: use 2 registers for dst_stride and src_stride in neon bilin filter. 2013-09-28 20:25:29 -04:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer
e1c3e6277f Merge commit 'f0389eb777b1ab4291329d4f709098cdfa7384dc'
* commit 'f0389eb777b1ab4291329d4f709098cdfa7384dc':
  arm: fmtconvert: Split armv6 fmtconvert code off from vfp code

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:10:39 +02:00
Michael Niedermayer
99e7c702db Merge commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7'
* commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7':
  arm: dcadsp: Move synth filter initialization to dcadsp file

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:00:45 +02:00
Diego Biurrun
f0389eb777 arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
Diego Biurrun
bd549cbaac arm: dcadsp: Move synth filter initialization to dcadsp file 2013-08-29 11:24:14 +02:00
Michael Niedermayer
6067186f3a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: h264chroma: Do not compile h264_chroma_mc* dependent on h264 decoder

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-24 11:30:57 +02:00
Michael Niedermayer
f9418d156f Merge commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9'
* commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9':
  vp56: Mark VP6-only optimizations as such.

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-24 11:04:11 +02:00
Diego Biurrun
f407856968 arm: h264chroma: Do not compile h264_chroma_mc* dependent on h264 decoder
The functions are used by all codecs that enable the h264chroma component
and the file is already compiled conditional on h264chroma being enabled.
2013-08-23 17:21:14 +02:00
Diego Biurrun
8506ff97c9 vp56: Mark VP6-only optimizations as such.
Most of our VP56 optimizations are VP6-only and will stay that way.
So avoid compiling them for VP5-only builds.
2013-08-23 14:42:19 +02:00
Michael Niedermayer
e1a5ee25e0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add assembly version of h264_find_start_code_candidate

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-08 12:52:35 +02:00
Ben Avison
45e10e5c8d arm: Add assembly version of h264_find_start_code_candidate
Before          After
               Mean   StdDev   Mean   StdDev  Change
This function   508.8 23.4      185.4  9.0    +174.4%
Overall        3068.5 31.7     2752.1 29.4     +11.5%

In combination with the preceding patch:
                Before          After
                Mean   StdDev   Mean   StdDev  Change
Overall         2925.6 26.2     2752.1 29.4     +6.3%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-08-08 12:08:34 +03:00
Michael Niedermayer
0815b23062 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Comment out unused labels in simple_idct_arm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-25 10:15:08 +02:00
Martin Storsjö
54ba52077c arm: Comment out unused labels in simple_idct_arm
When building for iOS in thumb mode, gas-preprocessor.pl doesn't
mark unused labels as thumb functions (as it does for other
local labels, where it can figure out that they are functions
due to being referenced in branch instructions). This leads to
linker warnings for some of those local labels, such as:

ld: warning: ARM function not 4-byte aligned: __a_evaluation from
libavcodec/libavcodec.a(simple_idct_arm.o)

Therefore, comment them out since they don't have any function.
They do still have a value in documenting key points in the
assembly source though.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-24 22:43:21 +03:00
Martin Storsjö
69e6702c01 arm: Mangle external symbols properly in new vfp assembly files
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 14:48:30 +03:00
Martin Storsjö
47d57f24e3 arm: Mangle external symbols properly in new vfp assembly files
Reviewed-by: Kostya Shishkov
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 13:16:21 +02:00
Michael Niedermayer
23f250d2bc Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add VFP-accelerated version of qmf_32_subbands

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 12:11:23 +02:00
Michael Niedermayer
c30eb74d18 Merge commit '8b9eba664edaddf9a304d3acbf0388b5c520781d'
* commit '8b9eba664edaddf9a304d3acbf0388b5c520781d':
  arm: Add VFP-accelerated version of fft16

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 12:05:39 +02:00
Michael Niedermayer
bacba8e5d1 Merge commit 'ba6836c966debc56314ce2ef133c7f0c1fdfdeac'
* commit 'ba6836c966debc56314ce2ef133c7f0c1fdfdeac':
  arm: Add VFP-accelerated version of dca_lfe_fir

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 12:04:28 +02:00
Michael Niedermayer
2305a6775d Merge commit 'b63bb251ea6d6ba23295294e37a92625c0192206'
* commit 'b63bb251ea6d6ba23295294e37a92625c0192206':
  arm: Add VFP-accelerated version of imdct_half

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 11:57:05 +02:00
Michael Niedermayer
3cab228005 Merge commit 'd6e4f5fef0d811e180fd7541941e07dca9e11dc0'
* commit 'd6e4f5fef0d811e180fd7541941e07dca9e11dc0':
  arm: Add VFP-accelerated version of int32_to_float_fmul_array8

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 11:55:40 +02:00
Michael Niedermayer
877bbe05ce Merge commit 'ce9ed10ac27b9cf32a6257e083ea2f052692d971'
* commit 'ce9ed10ac27b9cf32a6257e083ea2f052692d971':
  arm: Add VFP-accelerated version of int32_to_float_fmul_scalar

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 11:42:19 +02:00
Michael Niedermayer
0129026c4e Merge commit '41ef1d360bac65032aa32f6b43ae137666507ae5'
* commit '41ef1d360bac65032aa32f6b43ae137666507ae5':
  arm: Add VFP-accelerated version of synth_filter_float

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-22 11:41:37 +02:00
Ben Avison
ff30d12159 arm: Add VFP-accelerated version of qmf_32_subbands
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1323.0  98.0      746.2  60.6   +77.3%
Overall        15400.0 336.4    14147.5 288.4    +8.9%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:44 +03:00
Martin Storsjö
8b9eba664e arm: Add VFP-accelerated version of fft16
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1389.3  4.2       967.8  35.1   +43.6%
Overall        15577.5 83.2     15400.0 336.4    +1.2%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:41 +03:00
Martin Storsjö
ba6836c966 arm: Add VFP-accelerated version of dca_lfe_fir
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function    868.2  33.5      436.0  27.0   +99.1%
Overall        15973.0 223.2    15577.5  83.2    +2.5%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:39 +03:00
Martin Storsjö
b63bb251ea arm: Add VFP-accelerated version of imdct_half
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   2653.0  28.5     1108.8  51.4   +139.3%
Overall        17049.5 408.2    15973.0 223.2     +6.7%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:37 +03:00
Ben Avison
d6e4f5fef0 arm: Add VFP-accelerated version of int32_to_float_fmul_array8
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function    366.2  18.3      277.8  13.7   +31.9%
Overall        18420.5 489.1    17049.5 408.2    +8.0%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:36 +03:00
Ben Avison
ce9ed10ac2 arm: Add VFP-accelerated version of int32_to_float_fmul_scalar
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   1175.0   4.4      366.2  18.3   +220.8%
Overall        19285.5 292.0    18420.5 489.1     +4.7%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:30 +03:00
Ben Avison
41ef1d360b arm: Add VFP-accelerated version of synth_filter_float
Before           After
               Mean    StdDev   Mean    StdDev  Change
This function   9295.0 114.9     4853.2 83.5    +91.5%
Overall        23699.8 397.6    19285.5 292.0   +22.9%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-22 10:15:17 +03:00
Christophe Gisquet
b6293e2798 fmtconvert: Explicitly use int32_t instead of int
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-17 11:02:47 +03:00
Michael Niedermayer
a40370661d use Kostyas full name in copyrights
This fixes 2 files that where not part of the original change
See: de421b2085

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-06-02 11:25:56 +02:00
Michael Niedermayer
1a7ae6be2a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Include hpeldsp_neon.o if h264qpel is enabled

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-30 11:21:45 +02:00
Michael Niedermayer
e119d1b345 Merge commit 'efb7968cfe8b285ab4f27b363719b7c92d19ec74'
* commit 'efb7968cfe8b285ab4f27b363719b7c92d19ec74':
  arm: Don't unconditionally build dsputil files

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-30 11:02:21 +02:00
Michael Niedermayer
0b539da4c7 Merge commit '36a7df8cf1115aa37a1b0d42324ecde5ab6c2304'
* commit '36a7df8cf1115aa37a1b0d42324ecde5ab6c2304':
  arm: Only build the FFT init files if FFT is enabled

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-30 10:57:14 +02:00
Michael Niedermayer
3a0e21f037 Merge commit '186599ffe0a94d587434e5e46e190e038357ed99'
* commit '186599ffe0a94d587434e5e46e190e038357ed99':
  build: cosmetics: Place unconditional before conditional OBJS lines

Conflicts:
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-30 10:49:43 +02:00
Michael Niedermayer
1bbbbb0a32 Merge commit '9b9b2e9f3036abfd42916bcf734af14b4cb686aa'
* commit '9b9b2e9f3036abfd42916bcf734af14b4cb686aa':
  build: arm: cosmetics: Place all OBJS declarations in alphabetical order

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-30 10:37:52 +02:00
Martin Storsjö
86113667c0 arm: Include hpeldsp_neon.o if h264qpel is enabled
A few of the h264qpel neon functions are shared with other
hpeldsp functions in this file.

This fixes standalone compilation of the h264 decoder on arm.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:37 +03:00
Martin Storsjö
efb7968cfe arm: Don't unconditionally build dsputil files
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:35 +03:00
Martin Storsjö
36a7df8cf1 arm: Only build the FFT init files if FFT is enabled
This fixes build errors in cases where FFT is disabled.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:33 +03:00
Diego Biurrun
186599ffe0 build: cosmetics: Place unconditional before conditional OBJS lines
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:31 +03:00
Diego Biurrun
9b9b2e9f30 build: arm: cosmetics: Place all OBJS declarations in alphabetical order
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-05-30 02:17:27 +03:00
Christophe Gisquet
f49564c607 fmtconvert: int32_t input to int32_to_float_fmul_scalar
It was previously declared as int.
Does not change fate results for x86.

Conflicts:

	libavcodec/ppc/fmtconvert_altivec.c

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-18 18:01:16 +02:00
Michael Niedermayer
dbcf7e9ef7 Merge commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d'
* commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d':
  ppc: Drop unnecessary ff_ name prefixes from static functions
  x86: Drop unnecessary ff_ name prefixes from static functions
  arm: Drop unnecessary ff_ name prefixes from static functions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-01 18:21:35 +02:00
Diego Biurrun
383fd4d478 arm: Drop unnecessary ff_ name prefixes from static functions 2013-04-30 16:02:03 +02:00
Michael Niedermayer
c4010972c4 Merge commit '7384b7a71338d960e421d6dc3d77da09b0a442cb'
* commit '7384b7a71338d960e421d6dc3d77da09b0a442cb':
  arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp

Conflicts:
	libavcodec/arm/Makefile
	libavcodec/arm/hpeldsp_arm.S
	libavcodec/arm/hpeldsp_arm.h
	libavcodec/arm/hpeldsp_armv6.S
	libavcodec/arm/hpeldsp_init_arm.c
	libavcodec/arm/hpeldsp_init_armv6.c
	libavcodec/arm/hpeldsp_init_neon.c
	libavcodec/arm/hpeldsp_neon.S
	libavcodec/hpeldsp.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-20 14:19:08 +02:00
Ronald S. Bultje
7384b7a713 arm: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:19:08 +03:00
Ronald S. Bultje
015821229f vp3: Use full transpose for all IDCTs
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.

Also remove the unused type == 0 cases from the plain C version
of the idct.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Reimar Döffinger
8067f55edf Fix compilation on ARM with android gcc 4.7
With the current code it fails due to running out
of registers.
So code the store offsets manually into the assembler
instead.
Passes "make fate-dts".

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2013-04-15 09:04:07 +02:00
Michael Niedermayer
0724b4a16d Merge commit '62844c3fd66940c7747e9b2bb7804e265319f43f'
* commit '62844c3fd66940c7747e9b2bb7804e265319f43f':
  h264: Integrate clear_blocks calls with IDCT

Conflicts:
	libavcodec/arm/h264idct_neon.S
	libavcodec/h264idct_template.c
	libavcodec/x86/h264_idct.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-11 11:53:19 +02:00
Ronald S. Bultje
62844c3fd6 h264: Integrate clear_blocks calls with IDCT
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Michael Niedermayer
db4e4f766c Merge commit 'a8b6015823e628047a45916404c00044c5e80415'
* commit 'a8b6015823e628047a45916404c00044c5e80415':
  dsputil: convert remaining functions to use ptrdiff_t strides

Conflicts:
	libavcodec/dsputil.h
	libavcodec/dsputil_template.c
	libavcodec/h264qpel_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-13 14:18:53 +01:00
Ronald S. Bultje
de99545f46 Move arm half-pel assembly from dsputil to hpeldsp. 2013-03-13 04:04:13 +01:00
Ronald S. Bultje
d85c9b036e vp3/x86: use full transpose for all IDCTs.
This way, the special IDCT permutations are no longer needed. Bfin code
is disabled until someone updates it. This is similar to how H264 does
it, and removes the dsputil dependency imposed by the scantable code.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-12 22:54:10 +01:00
Luca Barbato
a8b6015823 dsputil: convert remaining functions to use ptrdiff_t strides
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-03-12 18:26:42 +01:00
Michael Niedermayer
a984efd104 Merge commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f'
* commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f':
  Remove unnecessary dsputil.h #includes

Conflicts:
	libavcodec/ffv1.c
	libavcodec/h261dec.c
	libavcodec/h261enc.c
	libavcodec/h264pred.c
	libavcodec/lpc.h
	libavcodec/mjpegdec.c
	libavcodec/rectangle.h
	libavcodec/x86/idct_sse2_xvid.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-26 13:05:10 +01:00
Michael Niedermayer
9b1a0c2ee8 Merge commit '76b19a3984359b3be44d4f7e4e69b7b86729a622'
* commit '76b19a3984359b3be44d4f7e4e69b7b86729a622':
  Fix a number of incorrect intmath.h #includes.
  avconv: remove an unused variable

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-26 12:56:55 +01:00
Diego Biurrun
c242bbd8b6 Remove unnecessary dsputil.h #includes 2013-02-26 00:51:34 +01:00
Diego Biurrun
76b19a3984 Fix a number of incorrect intmath.h #includes. 2013-02-26 00:51:34 +01:00
Michael Niedermayer
6b8f21190d Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dxva2: Add missing #define to make header compile standalone
  arm: vp8: Add missing #includes for header to compile standalone
  doc: filters: Correct BNF FILTER description

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-21 12:59:38 +01:00
Diego Biurrun
3e85b46ecf arm: vp8: Add missing #includes for header to compile standalone 2013-02-20 14:24:07 +01:00
Ronald S. Bultje
1acd7d594c h264: integrate clear_blocks calls with IDCT.
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-19 16:25:50 +01:00
Carl Eugen Hoyos
cf36180143 Only set accelerated arm fft functions if fft is enabled.
Fixes lavc compilation (linking) for configurations without fft.

Reported-by: tyler wear
Tested-by: Gavin Kinsey
2013-02-17 17:29:55 +01:00
Michael Niedermayer
60a0bc46cd Merge commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc'
* commit 'a846dccb29d2bb0798af1d47d06100eda9ca87cc':
  h264chroma: x86: Fix building with yasm disabled
  rv34: Drop now unnecessary dsputil dependencies

Conflicts:
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-07 13:35:49 +01:00
Michael Niedermayer
c4e394e460 Merge commit '79dad2a932534d1155079f937649e099f9e5cc27'
* commit '79dad2a932534d1155079f937649e099f9e5cc27':
  dsputil: Separate h264chroma

Conflicts:
	libavcodec/dsputil_template.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/vc1dec.c
	libavcodec/vc1dsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-07 13:09:35 +01:00
Michael Niedermayer
6c38884876 Merge commit '620289a20e022b9c16c10d546ef86cc0bb77cc84'
* commit '620289a20e022b9c16c10d546ef86cc0bb77cc84':
  sh4: Fix silly type vs. variable name search and replace typo
  configure: Group all hwaccels together in a separate variable
  Add av_cold attributes to arch-specific init functions

Conflicts:
	configure
	libavcodec/arm/mpegvideo_armv5te.c
	libavcodec/x86/mlpdsp.c
	libavcodec/x86/motion_est.c
	libavcodec/x86/mpegvideoenc.c
	libavcodec/x86/videodsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-06 13:27:24 +01:00
Michael Niedermayer
ede45c4e1d Merge commit '25841dfe806a13de526ae09c11149ab1f83555a8'
* commit '25841dfe806a13de526ae09c11149ab1f83555a8':
  Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.

Conflicts:
	libavcodec/alpha/dsputil_alpha.c
	libavcodec/dsputil_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-06 12:18:25 +01:00
Diego Biurrun
82bd04b170 rv34: Drop now unnecessary dsputil dependencies 2013-02-06 11:30:54 +01:00
Diego Biurrun
79dad2a932 dsputil: Separate h264chroma 2013-02-06 11:30:53 +01:00
Diego Biurrun
c9f933b5b6 Add av_cold attributes to arch-specific init functions 2013-02-05 17:01:05 +01:00
Diego Biurrun
25841dfe80 Use ptrdiff_t instead of int for {avg, put}_pixels line_size parameter.
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
2013-02-05 12:59:12 +01:00
Michael Niedermayer
911e270688 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Use proper "" quotes for local header #includes
  ppc: fmtconvert: Drop two unused variables.
  bink demuxer: set framerate.

Conflicts:
	libavcodec/kbdwin.c
	libavcodec/ppc/fmtconvert_altivec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-01 14:34:18 +01:00
Diego Biurrun
6c1a7d07eb Use proper "" quotes for local header #includes 2013-02-01 12:51:15 +01:00
Michael Niedermayer
2b14344ab3 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: vp8: Fix the plain-armv6 version of vp8_luma_dc_wht

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-27 16:47:56 +01:00
Martin Storsjö
2026eb1408 arm: vp8: Fix the plain-armv6 version of vp8_luma_dc_wht
This makes the plain-armv6 version use the same registers as the
armv6t2 version above.

This fixes fate-vp8 on plain-armv6 devices.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-01-27 13:17:25 +02:00
Michael Niedermayer
b2d0c5bd13 Merge commit '33552a5f7b6ec7057516f487b1a902331f8c353e'
* commit '33552a5f7b6ec7057516f487b1a902331f8c353e':
  arm: Add mathops.h to ARCH_HEADERS list
  avstring: K&R formatting cosmetics

Conflicts:
	libavutil/avstring.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-25 14:24:38 +01:00
Michael Niedermayer
91ae9bc51e Merge commit '6bdb841b46d170d58488deaed720729b79223b1d'
* commit '6bdb841b46d170d58488deaed720729b79223b1d':
  arm: h264qpel: use neon h264 qpel functions only if supported

* bug was fixed previously (in merge of buggy code):
  h264: copy h264qpel dsp context to slice thread copies

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-25 13:45:22 +01:00
Diego Biurrun
33552a5f7b arm: Add mathops.h to ARCH_HEADERS list
It is an arch-specific header not suitable for standalone compilation.
2013-01-24 20:59:22 +01:00
Janne Grunau
8e148b8742 arm: h264qpel: use neon h264 qpel functions only if supported 2013-01-24 17:06:52 +01:00
Michael Niedermayer
fc13a89654 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dsputil: Separate h264 qpel

Conflicts:
	libavcodec/dsputil_template.c
	libavcodec/h264.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-24 15:47:47 +01:00
Mans Rullgard
e9d817351b dsputil: Separate h264 qpel
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-24 10:44:43 +01:00
Michael Niedermayer
1e7a92f219 Merge commit 'baf35bb4bc4fe7a2a4113c50989d11dd9ef81e76'
* commit 'baf35bb4bc4fe7a2a4113c50989d11dd9ef81e76':
  dsputil: remove one array dimension from avg_no_rnd_pixels_tab.

Conflicts:
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 18:15:29 +01:00
Michael Niedermayer
6d1e9d993a Merge commit '32ff6432284f713e9f837ee5b36fc8e9f1902836'
* commit '32ff6432284f713e9f837ee5b36fc8e9f1902836':
  dsputil: remove avg_no_rnd_pixels8.

Conflicts:
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 17:52:21 +01:00
Michael Niedermayer
ac8987591f Merge commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f'
* commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f':
  Drop DCTELEM typedef

Conflicts:
	libavcodec/alpha/dsputil_alpha.h
	libavcodec/alpha/motion_est_alpha.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/bfin/dsputil_bfin.h
	libavcodec/bfin/pixels_bfin.S
	libavcodec/cavs.c
	libavcodec/cavsdec.c
	libavcodec/dct-test.c
	libavcodec/dnxhdenc.c
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/dsputil_template.c
	libavcodec/eamad.c
	libavcodec/h264_cavlc.c
	libavcodec/h264idct_template.c
	libavcodec/mpeg12.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo.h
	libavcodec/mpegvideo_enc.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/proresdsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 17:44:56 +01:00
Michael Niedermayer
8102f27b5b Merge commit '73b704ac609d83e0be124589f24efd9b94947cf9'
* commit '73b704ac609d83e0be124589f24efd9b94947cf9':
  arm: Add some missing header #includes
  floatdsp: move scalarproduct_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/acelp_pitch_delay.c
	libavcodec/amrnbdec.c
	libavcodec/amrwbdec.c
	libavcodec/ra288.c
	libavcodec/x86/dsputil_mmx.c
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:31:55 +01:00
Michael Niedermayer
24604ebaf8 Merge commit '5959bfaca396ecaf63a8123055f499688b79cae3'
* commit '5959bfaca396ecaf63a8123055f499688b79cae3':
  floatdsp: move butterflies_float from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/imc.c
	libavcodec/mpegaudiodec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:13:54 +01:00
Michael Niedermayer
6e6e170898 Merge commit '42d324694883cdf1fff1612ac70fa403692a1ad4'
* commit '42d324694883cdf1fff1612ac70fa403692a1ad4':
  floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.

Conflicts:
	libavcodec/arm/dsputil_init_vfp.c
	libavcodec/arm/dsputil_vfp.S
	libavcodec/dsputil.c
	libavcodec/ppc/float_altivec.c
	libavcodec/x86/dsputil.asm
	libavutil/x86/float_dsp.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 14:04:50 +01:00
Michael Niedermayer
b1b870fbd7 Merge commit '55aa03b9f8f11ebb7535424cc0e5635558590f49'
* commit '55aa03b9f8f11ebb7535424cc0e5635558590f49':
  floatdsp: move vector_fmul_add from dsputil to avfloatdsp.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/x86/dsputil.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 13:54:34 +01:00
Ronald S. Bultje
baf35bb4bc dsputil: remove one array dimension from avg_no_rnd_pixels_tab. 2013-01-22 18:41:36 -08:00
Ronald S. Bultje
32ff643228 dsputil: remove avg_no_rnd_pixels8.
This is never used.
2013-01-22 18:41:36 -08:00
Diego Biurrun
88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Diego Biurrun
73b704ac60 arm: Add some missing header #includes 2013-01-22 21:24:10 +01:00
Ronald S. Bultje
5959bfaca3 floatdsp: move butterflies_float from dsputil to avfloatdsp.
This makes wmadec/enc, twinvq and mpegaudiodec (i.e. mp2/mp3)
independent of dsputil.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
42d3246948 floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
2013-01-22 11:55:42 -08:00
Ronald S. Bultje
55aa03b9f8 floatdsp: move vector_fmul_add from dsputil to avfloatdsp. 2013-01-22 11:55:42 -08:00
Ronald S. Bultje
d56668bd80 floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
This makes the aac decoder and all voice codecs independent of dsputil.
2013-01-22 11:55:42 -08:00
Michael Niedermayer
cf4515ecd9 Merge commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3'
* commit 'ce378f0dd0c4e5350b3280e6b3e8d6b46fe4b0a3':
  fate: Use wmv2 IDCT for wmv2 tests
  vorbisdsp: change block_size type from int to intptr_t.

Conflicts:
	tests/fate-run.sh
	tests/fate/vcodec.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-21 16:28:04 +01:00
Michael Niedermayer
a85311ef84 Merge commit '68f18f03519ae550e25cf12661172641e9f0eaca'
* commit '68f18f03519ae550e25cf12661172641e9f0eaca':
  videodsp_armv5te: remove #if HAVE_ARMV5TE_EXTERNAL
  dsputil: drop non-compliant "fast" qpel mc functions
  get_bits: change the failure condition in init_get_bits

Conflicts:
	libavcodec/get_bits.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-21 13:38:57 +01:00
Ronald S. Bultje
1768e43ceb vorbisdsp: change block_size type from int to intptr_t.
This saves one instruction in the x86-64 assembly.
2013-01-20 22:26:42 -08:00
Janne Grunau
68f18f0351 videodsp_armv5te: remove #if HAVE_ARMV5TE_EXTERNAL
libavutil/arm/asm.S sets '.arch' depending on HAVE_ARMV5TE so that
assembling armv5te code will always succeed even if the default -march
flag does not support it. HAVE_ARMV5TE_EXTERNAL tests assembling code
with the default arch.
Fixes the missing symbol ff_prefetch_arm with --cpu= not including
armv5te.

CC: libav-stable@libav.org
2013-01-20 15:20:00 +01:00
Michael Niedermayer
c62cb1112f Merge commit 'fef906c77c09940a2fdad155b2adc05080e17eda'
* commit 'fef906c77c09940a2fdad155b2adc05080e17eda':
  Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-20 14:13:16 +01:00
Michael Niedermayer
cf061a9c3b Merge commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a'
* commit 'aeaf268e52fc11c1f64914a319e0edddf1346d6a':
  vp3: integrate clear_blocks with idct of previous block.
  mpegvideo: fix loop condition in draw_line()
  dvdsubdec: parse the size from the extradata

Conflicts:
	libavcodec/dvdsubdec.c
	libavcodec/mpegvideo.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-20 13:57:10 +01:00
Ronald S. Bultje
fef906c77c Move vorbis_inverse_coupling from dsputil to vorbisdspcontext.
Conveniently (together with Justin's earlier patches), this makes
our vorbis decoder entirely independent of dsputil.
2013-01-19 22:21:10 -08:00
Ronald S. Bultje
aeaf268e52 vp3: integrate clear_blocks with idct of previous block.
This is identical to what e.g. vp8 does, and prevents the function call
overhead (plus dependency on dsputil for this particular function).

Arm asm updated by Janne Grunau <janne-libav@jannau.net>.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2013-01-19 22:04:55 -08:00
Michael Niedermayer
5c7e9e16c9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavc: Move vector_fmul_window to AVFloatDSPContext
  rtpdec_mpeg4: Check the remaining amount of data before reading

Conflicts:
	libavcodec/dsputil.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-16 12:38:41 +01:00
Justin Ruggles
e034cc6c60 lavc: Move vector_fmul_window to AVFloatDSPContext
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-01-16 10:45:45 +01:00
Michael Niedermayer
e16bac7b33 videodsp: Fix project name
These are all part of splited out dsp utils from FFmpeg

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-22 00:58:08 +01:00
Michael Niedermayer
a41bf09d9c Merge commit '6906b19346ae8a330bfaa1c16ce535be10789723'
* commit '6906b19346ae8a330bfaa1c16ce535be10789723':
  lavc: add missing files for arm
  lavc: introduce VideoDSPContext

Conflicts:
	configure
	libavcodec/arm/dsputil_init_armv5te.c
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/dsputil_template.c
	libavcodec/h264.c
	libavcodec/mpegvideo.h
	libavcodec/mpegvideo_enc.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-21 17:18:43 +01:00
Luca Barbato
6906b19346 lavc: add missing files for arm
Across the many retouches those did not make the main commit.
2012-12-20 14:07:23 +01:00
Ronald S. Bultje
8c53d39e7f lavc: introduce VideoDSPContext
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-12-20 13:40:45 +01:00
Michael Niedermayer
af804dbe9e Merge commit '523c7bd23c781aa0f3a85044896f5e18e8b52534'
* commit '523c7bd23c781aa0f3a85044896f5e18e8b52534':
  misc typo, style and wording fixes

Conflicts:
	libavcodec/options_table.h
	libavutil/pixfmt.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-19 14:28:58 +01:00
Diego Biurrun
523c7bd23c misc typo, style and wording fixes 2012-12-18 13:36:51 +01:00
Michael Niedermayer
3193a5cdbf arm: put prefetch under matching #ifdef as the actual code
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-09 17:14:15 +01:00
Michael Niedermayer
aba1a48cc5 Merge commit 'b326755989b346d0d935e0628e8865f9b2951c30'
* commit 'b326755989b346d0d935e0628e8865f9b2951c30':
  arm: rename ARMVFP config symbol to VFP

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 14:24:16 +01:00
Michael Niedermayer
89c8eaa321 Merge commit '637606de2d2e0af0a9fa2f23f943765d7d7c5cd5'
* commit '637606de2d2e0af0a9fa2f23f943765d7d7c5cd5':
  configure: arm: make _inline arch ext symbols depend on inline_asm
  arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation

Conflicts:
	configure
	libavcodec/arm/dca.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 14:19:55 +01:00
Michael Niedermayer
7181806dc1 Merge commit '9ebd45c2d58ad9241ad09718679f0cf7fb57da52'
* commit '9ebd45c2d58ad9241ad09718679f0cf7fb57da52':
  configure: do not bypass cpuflags section if --cpu not given
  dct-test: arm: indicate required cpu features for optimised funcs
  snow: fix build after 594d4d5df3
  arm: fix use of uninitialised value in ff_fft_fixed_init_arm()
  avpicture: Don't assume a valid pix fmt in avpicture_get_size

Conflicts:
	libavcodec/avpicture.c
	libavcodec/snow.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 13:49:26 +01:00
Mans Rullgard
b326755989 arm: rename ARMVFP config symbol to VFP
This is consistent with usual ARM nomenclature as well as with the
VFPV3 and NEON symbols which both lack the ARM prefix.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:04 +00:00
Mans Rullgard
a7831d509f arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation
These macros reflect the actual capabilities required here.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:03 +00:00
Mans Rullgard
92dad6687f arm: fix use of uninitialised value in ff_fft_fixed_init_arm()
When initialising an FFTContext for a plain FFT, mdct_bits is not set
and can contain a garbage value.  Since nbits is always valid and for
MDCT operation is mdct_bits - 2 checking this instead avoids using an
uninitialised value while having the same effect.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 13:11:57 +00:00
Michael Niedermayer
2684d2e3ea Merge commit '284ea790d89441fa1e6b2d72d3c1ed6d61972f0b'
* commit '284ea790d89441fa1e6b2d72d3c1ed6d61972f0b':
  dsputil: move vector_fmul_scalar() to AVFloatDSPContext in libavutil
  aacenc: use the correct output buffer
  aacdec: fix signed overflows in lcg_random()
  base64: fix signed overflow in shift

Conflicts:
	libavcodec/dsputil.c
	libavutil/base64.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-27 13:39:52 +01:00
Justin Ruggles
284ea790d8 dsputil: move vector_fmul_scalar() to AVFloatDSPContext in libavutil 2012-11-26 11:29:06 -05:00
Michael Niedermayer
a201639a01 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  pixfmt: support more yuva formats
  swscale: support gray to 9bit and 10bit formats
  configure: rewrite print_config() function using awk
  FATE: fix (AD)PCM test dependencies broken in e519990
  Use ptrdiff_t instead of int for intra pred "stride" function parameter.
  x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes.

Conflicts:
	libavcodec/h264pred.c
	libavcodec/h264pred_template.c
	libavutil/pixfmt.h
	libswscale/swscale_unscaled.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-30 17:47:36 +01:00
Ronald S. Bultje
95c89da36e Use ptrdiff_t instead of int for intra pred "stride" function parameter.
This way, SIMD-optimized functions don't have to sign-extend their
stride argument manually to be able to do pointer arithmetic.
2012-10-29 17:49:13 -07:00
Michael Niedermayer
04c6ecb7da Merge commit 'c9ef43215c7d68c2cdcdbe02287aa114f27a32ed'
* commit 'c9ef43215c7d68c2cdcdbe02287aa114f27a32ed':
  fate-vc1: add dependencies
  ARM: fix overreads in neon h264 chroma mc
  rtsp: Make sure the ret variable is initialized in ff_rtsp_fetch_packet
  gitignore: ignore files created by msvc
  fate: Add proper dependencies for the tests in video.mak
  configure: Disable Snow decoder and encoder by default
  lzo: Drop obsolete fast_memcpy reference
  build: Drop OBJS declaration for non-existing PCM_DVD encoder
  mpeg4videodec: Disable frame multithreading for GMC, its not implemented at all

Conflicts:
	libavcodec/mpegvideo.c
	libavformat/rtsp.c
	tests/fate/microsoft.mak
	tests/fate/video.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-20 12:37:52 +02:00
Mans Rullgard
1846ddf0a7 ARM: fix overreads in neon h264 chroma mc
The loops were reading ahead one line, which could end up outside the
buffer for reference blocks at the edge of the picture.  Removing
this readahead has no measurable performance impact.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-20 01:28:38 +01:00
Michael Niedermayer
e335658370 Merge commit '9734b8ba56d05e970c353dfd5baafa43fdb08024'
* commit '9734b8ba56d05e970c353dfd5baafa43fdb08024':
  Move avutil tables only used in libavcodec to libavcodec.

Conflicts:
	libavcodec/mathtables.c
	libavutil/intmath.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-12 14:26:46 +02:00
Diego Biurrun
9734b8ba56 Move avutil tables only used in libavcodec to libavcodec. 2012-10-11 18:29:36 +02:00
Michael Niedermayer
de31814ab0 Merge commit 'b522000e9b2ca36fe5b2751096b9a5f5ed8f87e6'
* commit 'b522000e9b2ca36fe5b2751096b9a5f5ed8f87e6':
  avio: introduce avio_closep
  mpegtsenc: set muxing type notification to verbose
  vc1dec: Use correct spelling of "opposite"
  a64multienc: change mc_frame_counter to unsigned
  arm: call arm-specific rv34dsp init functions under if (ARCH_ARM)
  svq1: Drop a bunch of useless parentheses
  parseutils-test: do not print numerical error codes
  svq1: K&R formatting cosmetics

Conflicts:
	doc/APIchanges
	libavcodec/svq1dec.c
	libavcodec/svq1enc.c
	libavformat/version.h
	libavutil/parseutils.c
	tests/ref/fate/parseutils

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-11 14:03:12 +02:00
Jean-Baptiste Kempf
507dce2536 arm: call arm-specific rv34dsp init functions under if (ARCH_ARM)
Assign NEON specific function pointers after runtime check via
av_get_cpu_flags().

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-10-10 15:28:50 +02:00
Michael Niedermayer
eadba3e94d Merge commit 'cbcd497f384f0f8ef3f76f85b29b644b900d6b9f'
* commit 'cbcd497f384f0f8ef3f76f85b29b644b900d6b9f':
  adxdec: use planar sample format
  adpcmdec: use planar sample format for adpcm_thp
  adpcmdec: use planar sample format for adpcm_ea_xas
  adpcmdec: use planar sample format for adpcm_ea_r1/r2/r3
  adpcmdec: use planar sample format for adpcm_xa
  adpcmdec: use planar sample format for adpcm_ima_ws for vqa version 3
  adpcmdec: use planar sample format for adpcm_4xm
  adpcmdec: use planar sample format for adpcm_ima_wav
  adpcmdec: use planar sample format for adpcm_ima_qt
  pcmdec: use planar sample format for pcm_lxf
  mace: use planar sample format
  atrac1: use planar sample format
  build: non-x86: Only compile mpegvideo optimizations when necessary
  rtpdec_mpeg4: au_headers is a single array, simple av_free is enough
  avcodec: free extended_data instead address of it
  fate: Add tests of the ff_make_absolute_url function
  url: Handle relative urls starting with two slashes
  url: Handle relative urls being just a new query string
  url: Don't treat slashes in query parameters as directory separators

Conflicts:
	libavcodec/adxdec.c
	libavcodec/mips/Makefile
	libavcodec/pcm.c
	libavcodec/utils.c
	libavformat/Makefile
	libavformat/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-10 13:01:27 +02:00
Diego Biurrun
ac56ff9cc9 build: non-x86: Only compile mpegvideo optimizations when necessary 2012-10-09 14:45:59 +02:00
Michael Niedermayer
5de75336a1 mpegvideo_armv5te: change asserts to av_asserts
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-06 04:45:57 +02:00
Michael Niedermayer
7e5496fc41 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ARM: use numeric ID for Tag_ABI_align_preserved
  segment: Pass the interrupt callback on to the chained AVFormatContext, too
  ARM: bswap: drop armcc version of av_bswap16()
  ARM: set Tag_ABI_align_preserved in all asm files

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-03 13:35:02 +02:00
Mans Rullgard
5e826fd65e ARM: set Tag_ABI_align_preserved in all asm files
All our ARM asm preserves alignment so setting this attribute
in a common location is simpler.  This removes numerous warnings
when linking with armcc.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-10-02 19:47:56 +01:00
Michael Niedermayer
3b92075e6c Revert "arm/h264: fix overreads in h264_chroma_mc8-and-h264_chroma_mc4"
This reverts commit d25f87f517.

This breaks decoding of some h264 files
I have tested the original patch with fate but by mistake have
forgotten to specify the fate samples so testing was limited to
the internal regression tests.
2012-09-26 17:31:25 +02:00
bruce-wu
d25f87f517 arm/h264: fix overreads in h264_chroma_mc8-and-h264_chroma_mc4
Fixes Ticket1227
2012-09-25 20:09:30 +02:00