Commit Graph

577 Commits

Author SHA1 Message Date
Michael Niedermayer
5794e9fce2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dcadsp: split lfe_dir cases

Conflicts:
	libavcodec/arm/dcadsp_init_arm.c

See: 45854df9a5
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 02:27:27 +01:00
Christophe Gisquet
45854df9a5 dcadsp: split lfe_dir cases
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 02:04:12 +01:00
Christophe Gisquet
481a46a462 dcadsp: add int8x8_fmul_int32 to DSP context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:55:42 +01:00
Michael Niedermayer
bf90abe1dd Merge commit '5bcbb516f2ff45290ef7995b081762e668693672'
* commit '5bcbb516f2ff45290ef7995b081762e668693672':
  arm: Add X() around all references to extern symbols

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:48:26 +01:00
Christophe Gisquet
5fdbfcb5b7 dcadsp: split lfe_dir cases
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.

Arm asm changes by Janne Grunau.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:54:18 +01:00
Christophe Gisquet
2bd44cb705 dcadsp: add int8x8_fmul_int32 to dsp context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:51:59 +01:00
Martin Storsjö
5bcbb516f2 arm: Add X() around all references to extern symbols
Don't rely on the fact that an unprefixed label currently exists.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-07 15:13:58 +02:00
Michael Niedermayer
c73445a45c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter

Conflicts:
	libavcodec/arm/vp8dsp_neon.S

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-06 15:33:54 +01:00
Martin Storsjö
49ec551595 vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter
Based on a patch by Ronald S. Bultje.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-02-06 09:32:26 +02:00
Michael Niedermayer
7766c7b7a0 Merge commit '7151c5d04aed3b496c21f713dcb603e2cbdb9c49'
* commit '7151c5d04aed3b496c21f713dcb603e2cbdb9c49':
  arm: Use full filenames as multiple inclusion guards

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-14 14:38:10 +01:00
Diego Biurrun
7151c5d04a arm: Use full filenames as multiple inclusion guards 2014-01-14 00:04:52 +01:00
Michael Niedermayer
9d5cc55f0f Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add an option for making sure NEON registers aren't clobbered

Conflicts:
	configure

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-11 03:08:10 +01:00
Martin Storsjö
44a0a98f92 arm: Add an option for making sure NEON registers aren't clobbered
This is pretty much based on the same test for XMM registers.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-11 00:03:00 +02:00
Michael Niedermayer
8be8dddd13 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add a missing # as prefix for an immediate constant

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:44:56 +01:00
Michael Niedermayer
edba54630b Merge commit '5dae4872357613a0b51120b54a4c5221e0ec3f69'
* commit '5dae4872357613a0b51120b54a4c5221e0ec3f69':
  arm: Allow overriding the alignment set in the function macro

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:36:56 +01:00
Michael Niedermayer
3c00d4c5f0 Merge commit 'b7b932f5e3602bd34c3cc634b71c8bbbc0fb8dc0'
* commit 'b7b932f5e3602bd34c3cc634b71c8bbbc0fb8dc0':
  arm: Remove a leftover define for the pld instruction

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:21:18 +01:00
Michael Niedermayer
9ef7c0c551 Merge commit '67bb3a4e285a5871770cbaa2d78bf9024961dd0f'
* commit '67bb3a4e285a5871770cbaa2d78bf9024961dd0f':
  arm: cosmetics: Reindent the h264dsp neon init function

Conflicts:
	libavcodec/arm/h264dsp_init_arm.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-08 05:13:25 +01:00
Martin Storsjö
952d3187d8 arm: Add a missing # as prefix for an immediate constant
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:30:13 +02:00
Martin Storsjö
5dae487235 arm: Allow overriding the alignment set in the function macro
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).

The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.

This restores the original intention, to align the loop entry
points.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:56 +02:00
Martin Storsjö
b7b932f5e3 arm: Remove a leftover define for the pld instruction
This file no longer uses the pld instruction at all, all such uses
have been split into hpeldsp_arm.S.

Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:42 +02:00
Martin Storsjö
67bb3a4e28 arm: cosmetics: Reindent the h264dsp neon init function
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-07 19:29:31 +02:00
Michael Niedermayer
7778979f6f Merge commit '794fcf79a89eca2d4e889803b2c804a0b1defbb3'
* commit '794fcf79a89eca2d4e889803b2c804a0b1defbb3':
  Rename CONFIG_FFT_FLOAT ---> FFT_FLOAT

Conflicts:
	libavcodec/fft-internal.h
	libavcodec/fft-test.c
	libavcodec/fft_fixed.c
	libavcodec/fft_float.c
	libavcodec/fft_template.c
	libavcodec/mdct_fixed.c
	libavcodec/mdct_float.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-07 13:10:56 +01:00
Diego Biurrun
794fcf79a8 Rename CONFIG_FFT_FLOAT ---> FFT_FLOAT
The define does not originate from configure, so it should not
have a name that is CONFIG_-prefixed.
2014-01-06 19:12:48 +01:00
Michael Niedermayer
30056fd0be Merge commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb'
* commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb':
  h264: do not use 422 functions for monochrome

See: 07abf13da4
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-06 16:51:23 +01:00
Anton Khirnov
a03a642d5c h264: do not use 422 functions for monochrome
Fixes invalid memory access.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
2014-01-06 08:25:36 +01:00
Michael Niedermayer
d8e65e9224 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Use the matching endfunc macro instead of the assembler directive directly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-04 13:19:54 +01:00
Michael Niedermayer
63ce041a7d Merge commit '2ad4ee345a4216aef3999f57dd14c56128d27a13'
* commit '2ad4ee345a4216aef3999f57dd14c56128d27a13':
  arm: Add a missing endfunc macro call

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-04 13:13:39 +01:00
Martin Storsjö
3348e3492d arm: Use the matching endfunc macro instead of the assembler directive directly
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-04 13:53:08 +02:00
Martin Storsjö
2ad4ee345a arm: Add a missing endfunc macro call
Signed-off-by: Martin Storsjö <martin@martin.st>
2014-01-04 13:53:02 +02:00
Michael Niedermayer
70d6ce783c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Don't clobber callee saved registers in scalarproduct

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-21 01:19:59 +01:00
Michael Niedermayer
4fa91b88c6 Merge commit 'b254490bdabb21bd517c05b1a68717f9952ac8c4'
* commit 'b254490bdabb21bd517c05b1a68717f9952ac8c4':
  vc1: arm: Add NEON no_rnd chroma MC

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-20 23:33:15 +01:00
Michael Niedermayer
69278d94c4 Merge commit '832e19063209a5f355af733d1a45f5051f49ce33'
* commit '832e19063209a5f355af733d1a45f5051f49ce33':
  vc1: arm: Add NEON assembly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-20 23:12:16 +01:00
Martin Storsjö
d307e408d4 arm: Don't clobber callee saved registers in scalarproduct
q4-q7/d8-d15 are supposed to not be clobbered by the callee.

CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 20:48:30 +02:00
Mason Carter
b254490bda vc1: arm: Add NEON no_rnd chroma MC
Apply David Conrad's old patch to the modern codebase.

http://ffmpeg.org/pipermail/ffmpeg-devel/2009-April/059877.html

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 14:53:42 +02:00
Mason Carter
832e190632 vc1: arm: Add NEON assembly
For:

ff_vc1_inv_trans_{8,4}x{8,4}_{dc_,}neon
ff_put_pixels8x8_neon
ff_put_vc1_mspel_mc{0,1,2,3}{0,1,2,3}_neon (except for 00)

Based on ARM assembly code in libavcodec/arm by Rob Clark and Mans
Rullgard.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-12-20 14:53:39 +02:00
Michael Niedermayer
8e70fdab36 Merge commit '4958f35a2ebc307049ff2104ffb944f5f457feb3'
* commit '4958f35a2ebc307049ff2104ffb944f5f457feb3':
  dsputil: Move apply_window_int16 to ac3dsp

Conflicts:
	libavcodec/arm/ac3dsp_init_arm.c
	libavcodec/arm/ac3dsp_neon.S
	libavcodec/x86/ac3dsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-09 04:12:40 +01:00
Diego Biurrun
4958f35a2e dsputil: Move apply_window_int16 to ac3dsp
The (optimized) functions are used nowhere else.
2013-12-08 17:57:15 +01:00
Ronald S. Bultje
f574b4da56 vp8: use 2 registers for dst_stride and src_stride in neon bilin filter. 2013-09-28 20:25:29 -04:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Michael Niedermayer
e1c3e6277f Merge commit 'f0389eb777b1ab4291329d4f709098cdfa7384dc'
* commit 'f0389eb777b1ab4291329d4f709098cdfa7384dc':
  arm: fmtconvert: Split armv6 fmtconvert code off from vfp code

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:10:39 +02:00
Michael Niedermayer
99e7c702db Merge commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7'
* commit 'bd549cbaacd33dfb7be81d0619c9b107b8a85be7':
  arm: dcadsp: Move synth filter initialization to dcadsp file

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-29 16:00:45 +02:00
Diego Biurrun
f0389eb777 arm: fmtconvert: Split armv6 fmtconvert code off from vfp code 2013-08-29 11:24:14 +02:00
Diego Biurrun
bd549cbaac arm: dcadsp: Move synth filter initialization to dcadsp file 2013-08-29 11:24:14 +02:00
Michael Niedermayer
6067186f3a Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: h264chroma: Do not compile h264_chroma_mc* dependent on h264 decoder

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-24 11:30:57 +02:00
Michael Niedermayer
f9418d156f Merge commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9'
* commit '8506ff97c9ea4a1f52983497ecf8d4ef193403a9':
  vp56: Mark VP6-only optimizations as such.

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-24 11:04:11 +02:00
Diego Biurrun
f407856968 arm: h264chroma: Do not compile h264_chroma_mc* dependent on h264 decoder
The functions are used by all codecs that enable the h264chroma component
and the file is already compiled conditional on h264chroma being enabled.
2013-08-23 17:21:14 +02:00
Diego Biurrun
8506ff97c9 vp56: Mark VP6-only optimizations as such.
Most of our VP56 optimizations are VP6-only and will stay that way.
So avoid compiling them for VP5-only builds.
2013-08-23 14:42:19 +02:00
Michael Niedermayer
e1a5ee25e0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Add assembly version of h264_find_start_code_candidate

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-08 12:52:35 +02:00
Ben Avison
45e10e5c8d arm: Add assembly version of h264_find_start_code_candidate
Before          After
               Mean   StdDev   Mean   StdDev  Change
This function   508.8 23.4      185.4  9.0    +174.4%
Overall        3068.5 31.7     2752.1 29.4     +11.5%

In combination with the preceding patch:
                Before          After
                Mean   StdDev   Mean   StdDev  Change
Overall         2925.6 26.2     2752.1 29.4     +6.3%

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-08-08 12:08:34 +03:00
Michael Niedermayer
0815b23062 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  arm: Comment out unused labels in simple_idct_arm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-07-25 10:15:08 +02:00