Michael Niedermayer
fe6603745e
Merge commit '6e4009d4cdf5927bdaedf58fcfc5e813b14c366b'
...
* commit '6e4009d4cdf5927bdaedf58fcfc5e813b14c366b':
arm: dcadsp: implement decode_hf as external NEON asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:52:25 +01:00
Michael Niedermayer
fb3c33f3cd
Merge commit '4cb6964244fd6c099383d8b7e99731e72cc844b9'
...
* commit '4cb6964244fd6c099383d8b7e99731e72cc844b9':
dcadec: simplify decoding of VQ high frequencies
Conflicts:
configure
libavcodec/dcadec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:41:19 +01:00
Michael Niedermayer
ffb7d7195b
avcodec/dcadec: use brackets to ensure that no slow division is used
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:32:24 +01:00
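The idea behind the subject line above can be illustrated with a minimal sketch. This is not the code touched by the commit; the function names and the two constants are hypothetical. In C, `x * a / b` parses as `(x * a) / b`, so for floating-point operands the compiler generally has to keep a runtime division unless value-changing floating-point optimizations are enabled. Bracketing the constant quotient lets it be folded at compile time.

```c
/* Hypothetical scale constants, chosen only for illustration.
 * The denominator is deliberately not a power of two, since a division
 * by a power of two can be turned into an exact multiplication anyway. */
#define SCALE_NUM 2.0f
#define SCALE_DEN 3.0f

/* Parsed as (x * SCALE_NUM) / SCALE_DEN: a division remains at run time. */
static float scale_unbracketed(float x)
{
    return x * SCALE_NUM / SCALE_DEN;
}

/* (SCALE_NUM / SCALE_DEN) is a constant expression, so the compiler can
 * fold it and only a multiplication remains at run time. */
static float scale_bracketed(float x)
{
    return x * (SCALE_NUM / SCALE_DEN);
}
```

The two forms can differ in the last bit of the result, because the rounding happens at a different point.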
Michael Niedermayer
747b0337e7
Merge commit '7686afd049be98d18663682b92d983340fa2c305'
...
* commit '7686afd049be98d18663682b92d983340fa2c305':
dca: factorize scaling in inverse ADPCM
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:32:14 +01:00
Michael Niedermayer
baf3adc621
Merge commit '08e3ea60ff4059341b74be04a428a38f7c3630b0'
...
* commit '08e3ea60ff4059341b74be04a428a38f7c3630b0':
x86: synth filter float: implement SSE2 version
Conflicts:
libavcodec/x86/dcadsp.asm
libavcodec/x86/dcadsp_init.c
See: 2cdbcc004837ce092a14f326f24d97a29512a2c3
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 20:38:39 +01:00
Christophe Gisquet
2cdbcc0048
x86: synth filter float: implement SSE2 version
...
Timings for Arrandale:
            C     SSE
win32:   2108     334
win64:   1152     322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 20:34:40 +01:00
Michael Niedermayer
5333e0dd66
Merge commit '57b1eb9f75b04571063ddec316e290c216c114ac'
...
* commit '57b1eb9f75b04571063ddec316e290c216c114ac':
dcadsp: scan coefficients linearly in dca_lfe_fir
Conflicts:
libavcodec/dcadsp.c
See: 9ae8e23188fc2e533eea74757c9060557941d3d9
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 19:40:40 +01:00
Michael Niedermayer
e346a59383
Merge commit 'ad507d7907457e678900bac132122ba7be4644cb'
...
* commit 'ad507d7907457e678900bac132122ba7be4644cb':
x86: dcadsp: implement SSE lfe_dir
Conflicts:
libavcodec/x86/dcadsp.asm
See: 169243112c1e310d90c030fb258092f6d2e46117
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 19:22:00 +01:00
Christophe Gisquet
169243112c
x86: dcadsp: implement SSE lfe_dir
...
Results for Arrandale/Windows:
32: 1670 -> 316
64: 728 -> 298
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 19:20:03 +01:00
Michael Niedermayer
90f674d55b
Merge commit '87ec849fe9acba075c843e67bcd01f256f481a18'
...
* commit '87ec849fe9acba075c843e67bcd01f256f481a18':
dcadec: remove scaling in lfe_interpolation_fir
Conflicts:
libavcodec/dcadec.c
libavcodec/dcadsp.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 18:14:12 +01:00
Michael Niedermayer
810eb285e3
Merge commit 'a55546f48d55e3d1155840541b2be5f4f8cf18ab'
...
* commit 'a55546f48d55e3d1155840541b2be5f4f8cf18ab':
proresenc: Reuse proper dsputil infrastructure for FDCT
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 18:06:12 +01:00
Michael Niedermayer
2e88f82a8a
Merge commit '92e598a57a7ce4b8ac9ea56274af39f5fd888311'
...
* commit '92e598a57a7ce4b8ac9ea56274af39f5fd888311':
prores: Drop DSP infrastructure for prores encoder bits
Conflicts:
libavcodec/Makefile
libavcodec/proresdsp.c
libavcodec/proresenc_kostya.c
Note: these changes affect only one of the two prores encoders we have.
If someone wants to add optimizations to the affected encoder, or needs/wants
this infrastructure, then I am happy to revert this.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 18:02:00 +01:00
Michael Niedermayer
18d870da83
Merge commit 'd6acefe05862af244fd5a30ae946ed507c063994'
...
* commit 'd6acefe05862af244fd5a30ae946ed507c063994':
proresenc: Drop unnecessary DCT permutation bits
Conflicts:
libavcodec/proresenc_kostya.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 17:26:37 +01:00
Michael Niedermayer
5ba1648318
Merge commit 'b23650491fbd579a4365f42bd42575afb7b53f7e'
...
* commit 'b23650491fbd579a4365f42bd42575afb7b53f7e':
prores: Use consistent names for DSP arch initialization functions
Conflicts:
libavcodec/proresdsp.c
libavcodec/proresdsp.h
libavcodec/x86/proresdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 17:13:00 +01:00
Michael Niedermayer
f3eef02746
avcodec/msvideo1: Fix palette in case of seek before decode
...
Fixes Ticket3212
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 13:20:22 +01:00
Janne Grunau
6e4009d4cd
arm: dcadsp: implement decode_hf as external NEON asm
2014-02-28 13:12:19 +01:00
Christophe Gisquet
4cb6964244
dcadec: simplify decoding of VQ high frequencies
...
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.
Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.
The decode_hf implementations have the following timings:
For x86 Arrandale:
            C     SSE    SSE2    SSE4
win32:    260     162     119     104
win64:    242     N/A      89      72
The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
2014-02-28 13:03:22 +01:00
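The restructuring described in the message above (moving a test out of the loop so that the loop body becomes a plain kernel) can be sketched as follows. This is an illustration under assumptions, not the dcadec code: the function names, the `have_hf` flag and the zero-fill fallback are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Before: a loop-invariant test is evaluated on every iteration, which
 * gets in the way of a straightforward SIMD version of the loop. */
static void dequant_branchy(float *dst, const int8_t *vq, float scale,
                            int n, int have_hf)
{
    for (int i = 0; i < n; i++) {
        if (have_hf)
            dst[i] = vq[i] * scale;
        else
            dst[i] = 0.0f;
    }
}

/* Branch-free kernel: the kind of loop that can sit behind a DSP function
 * pointer. It assumes n > 0, so it need not handle empty loops. */
static void decode_hf_kernel(float *dst, const int8_t *vq, float scale, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = vq[i] * scale;
}

/* After: the caller tests once and rejects empty ranges itself. */
static void dequant_hoisted(float *dst, const int8_t *vq, float scale,
                            int n, int have_hf)
{
    if (n <= 0)
        return;
    if (have_hf)
        decode_hf_kernel(dst, vq, scale, n);
    else
        memset(dst, 0, (size_t)n * sizeof(*dst)); /* all-zero bits are 0.0f on IEEE 754 */
}
```

Both variants produce the same output; only the second leaves a kernel that is easy to replace with an SSE or NEON implementation.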
Janne Grunau
7686afd049
dca: factorize scaling in inverse ADPCM
...
Based on a patch from Christophe Gisquet.
Unrolling of the m == 0 case avoids a possible use of the uninitialized
value sum when s->predictor_history is not set. I failed to find a
sample for it. It also reduced the cycle count from 220 to 150 on
sandy bridge, x86_64 linux, gcc 4.8.2 compared to his patch.
2014-02-28 13:00:48 +01:00
Christophe Gisquet
08e3ea60ff
x86: synth filter float: implement SSE2 version
...
Timings for Arrandale:
            C     SSE
win32:   2108     334
win64:   1152     322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:48 +01:00
Christophe Gisquet
57b1eb9f75
dcadsp: scan coefficients linearly in dca_lfe_fir
...
This change is inspired by x86 asm where it frees a register.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:47 +01:00
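One way a linear coefficient scan can free a register is sketched below. This is a guess at the general technique, not the actual dca_lfe_fir code: the table contents, the sizes and the function names are hypothetical. With a strided table the inner loop needs a base pointer plus a computed index; once the table is stored in the order it is consumed, a single incrementing pointer is enough.

```c
#define PHASES 2
#define TAPS   3

/* Hypothetical filter taps in natural order h[0..5]. */
static const float coefs_strided[PHASES * TAPS] = { 1, 2, 3, 4, 5, 6 };

/* The same taps regrouped by phase: h[0], h[2], h[4], then h[1], h[3], h[5]. */
static const float coefs_linear[PHASES * TAPS] = { 1, 3, 5, 2, 4, 6 };

/* Strided access: the index p + k * PHASES has to be formed for every tap. */
static void fir_strided(float *out, const float *in)
{
    for (int p = 0; p < PHASES; p++) {
        float sum = 0.0f;
        for (int k = 0; k < TAPS; k++)
            sum += in[k] * coefs_strided[p + k * PHASES];
        out[p] = sum;
    }
}

/* Linear access: one pointer walks the whole table from start to end. */
static void fir_linear(float *out, const float *in)
{
    const float *c = coefs_linear;
    for (int p = 0; p < PHASES; p++) {
        float sum = 0.0f;
        for (int k = 0; k < TAPS; k++)
            sum += in[k] * *c++;
        out[p] = sum;
    }
}
```

The reordering is done once in the table, so the two functions compute identical sums.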
Christophe Gisquet
ad507d7907
x86: dcadsp: implement SSE lfe_dir
...
Results for Arrandale/Windows:
32: 1670 -> 316
64: 728 -> 298
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:47 +01:00
Christophe Gisquet
87ec849fe9
dcadec: remove scaling in lfe_interpolation_fir
...
The scaling factor is constant so it is faster to scale the
FIR coefficients in the tables during compilation.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:47 +01:00
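The effect of folding a constant scale into the table can be shown with a small sketch. The coefficient values, the `LFE_SCALE` constant and the function names are hypothetical and are not taken from the DCA tables. The point is only that the multiplication by the constant moves from run time into the table initializers, where the compiler evaluates it.

```c
#define NUM_COEFFS 4
#define LFE_SCALE  0.5f   /* hypothetical constant scaling factor */

/* Unscaled table: the result has to be scaled at run time. */
static const float fir_raw[NUM_COEFFS] = { 0.25f, 0.5f, 0.5f, 0.25f };

/* Same table with the scale folded into each initializer. Every product
 * is a constant expression, so no scaling code is left in the filter. */
static const float fir_prescaled[NUM_COEFFS] = {
    0.25f * LFE_SCALE, 0.5f * LFE_SCALE, 0.5f * LFE_SCALE, 0.25f * LFE_SCALE,
};

static float fir_scaled_at_runtime(const float *in)
{
    float sum = 0.0f;
    for (int k = 0; k < NUM_COEFFS; k++)
        sum += in[k] * fir_raw[k];
    return sum * LFE_SCALE;          /* extra multiply per output sample */
}

static float fir_with_prescaled_table(const float *in)
{
    float sum = 0.0f;
    for (int k = 0; k < NUM_COEFFS; k++)
        sum += in[k] * fir_prescaled[k];
    return sum;                      /* no scaling left at run time */
}
```

With general coefficients the two versions may differ by rounding in the last bits; the values used here are exactly representable, so the results match exactly.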
Diego Biurrun
a55546f48d
proresenc: Reuse proper dsputil infrastructure for FDCT
2014-02-28 11:19:47 +01:00
Diego Biurrun
92e598a57a
prores: Drop DSP infrastructure for prores encoder bits
...
None of the encoder bits are arch-optimized.
2014-02-28 11:17:25 +01:00
Diego Biurrun
d6acefe058
proresenc: Drop unnecessary DCT permutation bits
...
No permutation is necessary for the FDCT.
2014-02-28 11:00:24 +01:00
Diego Biurrun
b23650491f
prores: Use consistent names for DSP arch initialization functions
2014-02-28 10:34:55 +01:00
Michael Niedermayer
5c634cbeb7
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
Give IDCT matrix transpose macro a more descriptive name
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 02:08:11 +01:00
James Almer
2163a40a46
x86/imdct36: use sse3 instructions in the last BUTTERF step when possible
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 23:28:15 +01:00
James Almer
fbf98375e4
x86/imdct36: don't build imdct36_float_sse on x86_64 targets
...
There's an SSE2 version as well, and x86_64 guarantees that
instruction set is present.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 22:54:03 +01:00
Diego Biurrun
f2408ec9d7
Give IDCT matrix transpose macro a more descriptive name
...
This also avoids a macro name clash and related warning on ARM.
2014-02-27 13:38:00 -08:00
Michael Niedermayer
a05635ee01
avcodec/mjpegdec: convert CMYK to GBRAP
...
Fixes Ticket2799
This should be moved into swscale once we have a CMYK pixel format
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 22:18:34 +01:00
Michael Niedermayer
501beae6f9
avcodec/mjpegdec: fix decoding 4th plane
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 21:49:46 +01:00
Michael Niedermayer
6904168c79
avcodec/mjpegdec: Print error in case of CMYK
...
Also fail if AV_EF_EXPLODE is set.
We do not fail by default, but rather return some image, as it may be useful to the
end user to see what is in the image; for example, text could still be read quite well and
objects recognized.
Possibly fixes Ticket3424
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 20:24:39 +01:00
Michael Niedermayer
681e72a668
avcodec/mjpegdec: parse adobe_transform
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 20:13:48 +01:00
Michael Niedermayer
7e8be7081f
avcodec/mjpegdec: Print human readable string for APPx
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 19:35:37 +01:00
Michael Niedermayer
4f4cc43fd8
avcodec/h264: allow mixing idr and non idr slices with frame threading again
...
This combination exists in the wild
Fixes Ticket3131
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 15:49:25 +01:00
Michael Niedermayer
649686d89b
avcodec/h264_refs: remove lost frames instead of disfavoring them
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 11:16:23 +01:00
Michael Niedermayer
64bb64f704
avcodec/h264: fix dropped frame handling also for threads > 1
...
It seems I mistakenly tested only with threads=1
Fixes part of Ticket3386
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 08:07:46 +01:00
Michael Niedermayer
b5005def8a
avcodec/h264: avoid using lost frames as references
...
Fixes Ticket3386
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-27 03:26:03 +01:00
Michael Niedermayer
c4c5351f08
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
build: Do not redundantly specify H.263-related object files for MSMPEG4v*
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-26 22:02:23 +01:00
Diego Biurrun
a63ac1106d
build: Do not redundantly specify H.263-related object files for MSMPEG4v*
...
These are already covered through dependencies specified in configure.
2014-02-26 19:44:55 +01:00
Peter Ross
1524b0fa68
libavcodec/rawdec: avoid memcpy when performing 16-bit samples shift
...
Signed-off-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-26 19:44:34 +01:00
Michael Niedermayer
bdadf05ec8
avcodec/parser: put lost comments back
...
Found-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-26 18:57:43 +01:00
Michael Niedermayer
2673357048
Merge remote-tracking branch 'qatar/master'
...
* qatar/master:
parser: cosmetics: Drop some unnecessary parentheses
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 20:52:45 +01:00
Michael Niedermayer
72d580f819
Merge commit 'a1c699659d56b76c0bf399307f642c6fd6d28281'
...
* commit 'a1c699659d56b76c0bf399307f642c6fd6d28281':
parser: K&R formatting cosmetics
Conflicts:
libavcodec/parser.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 19:12:30 +01:00
Michael Niedermayer
0306436416
Merge commit 'ed61f3ca8a0664a697782253b354055136c5d303'
...
* commit 'ed61f3ca8a0664a697782253b354055136c5d303':
parser: Remove commented-out cruft
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 19:03:45 +01:00
Hendrik Leppkes
bc249bd673
mpegvideo: re-indent buffer clearing code
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 13:47:16 +01:00
Hendrik Leppkes
fa84231ee8
mpegvideo: fix overwriting hwaccel surface objects
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 13:47:10 +01:00
Diego Biurrun
4ec336484d
parser: cosmetics: Drop some unnecessary parentheses
2014-02-25 13:40:47 +01:00
Peter Ross
bef6b27f10
avcodec/vp8dsp: use AV_ZERO64 to clear idct coefficient rows
...
Signed-off-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-25 12:49:35 +01:00