Commit Graph

16288 Commits

Author SHA1 Message Date
Ronald S. Bultje
98d0d19208 lagarith: pad RGB buffer by 1 byte.
For left HFYU prediction, we predict from the buffer buf+1 using 8- or
16-byte reads. This means that aligning the buffer by 16 bytes is in
itself not sufficient, because if the width itself is 16- or 8-byte
aligned, the buffer will not be padded, and thus a read of size 16 at
buf+1 will overflow boundaries at the right edge. Padding the buffer by
1 byte is sufficient to not overflow its boundaries.

Fixes bug 342.
2012-08-03 11:09:17 -07:00
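
The overflow described in 98d0d19208 can be modelled with a little arithmetic. This is a minimal sketch, not the lagarith code: it assumes one row of `width` bytes is consumed in fixed-size reads starting at offset 1, and that the buffer is sized by rounding `width` up to 16. The helper names `align_up`, `last_read_end` and `reads_fit` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Offset one past the last byte touched when n bytes starting at offset 1
 * are consumed in whole reads of `step` bytes each (assumed access model). */
static size_t last_read_end(size_t n, size_t step)
{
    size_t reads = (n + step - 1) / step;
    return 1 + reads * step;
}

/* Returns 1 if every read stays inside a buffer of
 * align_up(width, 16) + pad bytes, 0 otherwise. */
static int reads_fit(size_t width, size_t step, size_t pad)
{
    size_t allocated = align_up(width, 16) + pad;
    return last_read_end(width, step) <= allocated;
}
```

Under these assumptions, `reads_fit(32, 16, 0)` is 0 because the last 16-byte read ends one byte past a 32-byte buffer, while `reads_fit(32, 16, 1)` is 1, which matches the commit's claim that a single byte of padding is enough.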
Ronald S. Bultje
da6505ad2f dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
This makes add_hfyu_left_prediction_sse4() handle sources that are not
16-byte aligned in its own function rather than by proxying the call to
add_hfyu_left_prediction_ssse3(). This fixes a crash on Win64, since the
sse4 version clobbers xmm6, but the ssse3 version (which uses MMX regs)
does not restore it, thus leading to XMM clobbering and RSP being off.

Fixes bug 342.
2012-08-03 11:09:14 -07:00
Mashiat Sarker Shakkhar
9cc74c9f6e vc1dec: Remove separate scaling function for interlaced field MVs
The scaling process for obtaining direct MVs from co-located field MVs
is the same for interlaced field and progressive pictures.

Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-08-03 17:21:54 +02:00
Mashiat Sarker Shakkhar
8379ea5e9f vc1dec: Invoke edge_emulation regardless of MV precision
In VC-1 interlaced field pictures, chroma motion vectors can extend beyond
picture boundary even if luma vectors are bounded. The problem shows up
only for hpel interpolated MVs, and may be due to the way motion vectors
are scaled / cropped.

Thanks to Konstantin Shishkov for suggesting the fix. This fixes
long-known segfaults in MC-VC1.ts from videolan streams archive.

Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-08-03 17:21:54 +02:00
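
The condition that 8379ea5e9f guards against can be sketched as a per-plane bounds test. This is an illustration only, not the vc1dec code: `block_leaves_plane` and its parameters are hypothetical, and the extra margin that interpolation filters need is left out.

```c
#include <assert.h>

/* Returns 1 if a block_w x block_h read whose top-left corner is at
 * (src_x, src_y) touches any sample outside a pic_w x pic_h plane,
 * and 0 if the whole block lies inside the plane. */
static int block_leaves_plane(int src_x, int src_y,
                              int block_w, int block_h,
                              int pic_w, int pic_h)
{
    assert(block_w > 0 && block_h > 0 && pic_w > 0 && pic_h > 0);
    return src_x < 0 || src_y < 0 ||
           src_x + block_w > pic_w ||
           src_y + block_h > pic_h;
}
```

For example, a 16x16 luma block at x = 304 fits a 320-wide plane, while an 8x8 chroma block at x = 153 does not fit a 160-wide plane. A bounded luma vector therefore says nothing about the chroma read, so the test has to be run for the chroma plane as well.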
Diego Biurrun
ca844b7be9 x86: Use consistent 3dnowext function and macro name suffixes
Currently there is a wild mix of 3dn2/3dnow2/3dnowext.  Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
2012-08-03 14:00:47 +02:00
Kostya Shishkov
d3e0766fc0 g723_1: scale output as intended when the postfilter is disabled 2012-08-03 07:07:07 +02:00
Kostya Shishkov
94bfdfd6f0 g723_1: increase excitation storage by 4
Fixed codebook mode in 5300 rate may write up to SUBFRAME_LEN + 4 and
that is considered normal by the reference decoder. Without that additional
padding it might overwrite first elements of LPC history.
2012-08-03 07:07:07 +02:00
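
The sizing argument in 94bfdfd6f0 can be sketched as follows. This is a simplified model, not the decoder's buffer layout: it assumes four 60-sample subframes stored back to back and ignores any history kept in front of them. `subframe_write_fits` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define SUBFRAME_LEN 60  /* assumed: 240-sample frame split into 4 subframes */
#define SUBFRAMES     4

/* Returns 1 if writing write_len samples into subframe i stays inside an
 * excitation buffer of SUBFRAMES * SUBFRAME_LEN + padding samples. */
static int subframe_write_fits(size_t i, size_t write_len, size_t padding)
{
    size_t capacity = SUBFRAMES * SUBFRAME_LEN + padding;
    assert(i < SUBFRAMES);
    return i * SUBFRAME_LEN + write_len <= capacity;
}
```

In this model a write of `SUBFRAME_LEN + 4` samples into the last subframe overruns an unpadded buffer and would land in whatever is stored next, while 4 samples of padding absorb it.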
Kostya Shishkov
802bcdcb2f g723_1: fix upper bound parameter from inverse maximum autocorrelation 2012-08-03 07:07:07 +02:00
Kostya Shishkov
8ddadea171 g723_1: make scale_vector() behave like the reference 2012-08-03 07:07:07 +02:00
Kostya Shishkov
8772d2511a g723_1: fix off-by-one error in normalize_bits() 2012-08-03 07:07:07 +02:00
Kostya Shishkov
7f92db14f9 g723_1: save/restore excitation with offset to store LPC history
The same buffer with saved data is used later in LPC reconstruction, so
it should have some head space for LPC history.
2012-08-03 07:07:06 +02:00
Sean McGovern
3680b24351 wmapro: prevent division by zero when sample rate is unspecified
This fixes Bugzilla #327:

Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-08-03 07:07:00 +02:00
Diego Biurrun
03737412a3 x86: proresdsp: improve SIGNEXTEND macro comments 2012-08-02 22:30:44 +02:00
Diego Biurrun
81905088a1 x86: h264dsp: K&R formatting cosmetics 2012-08-02 20:20:21 +02:00
Ronald S. Bultje
c728518b3c x86: fft: fix imdct_half() for AVX
Some calculations were changed in b6a3849 to use mmsize, which was not correct
for the AVX version, which uses INIT_YMM and therefore has mmsize == 32.

Fixes Bug 341.

Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-08-02 13:40:11 -04:00
Mans Rullgard
cfb1091898 vc1dec: remove useless #include simple_idct.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Mans Rullgard
af500c08bb dct-test: always link with aandcttab.o
This allows building dct-test even if aandcttab.o is not pulled in
by any enabled codec.  The DCT with which these tables are used does
not use them directly, so building it without the tables is possible.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Mans Rullgard
cf5781fad0 vp8: pack struct VP8ThreadData more efficiently
Reordering the members in this struct reduces the holes required
to maintain alignment.  With this order, the only remaining, and
unavoidable, hole is 3 bytes following left_nnz.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
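
The effect of member reordering described in cf5781fad0 can be shown on a small example. The two structs below are hypothetical and are not the real VP8ThreadData; they only illustrate how ordering members by decreasing alignment removes padding holes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in an order that forces padding before wider members. */
typedef struct LooseLayout {
    uint8_t  flag_a;
    uint32_t counter;
    uint8_t  flag_b;
    uint16_t position;
} LooseLayout;

/* Same members, ordered from widest to narrowest. */
typedef struct PackedLayout {
    uint32_t counter;
    uint16_t position;
    uint8_t  flag_a;
    uint8_t  flag_b;
} PackedLayout;

/* Sum of the member sizes, identical for both layouts. */
#define MEMBER_BYTES (2 * sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t))

/* Padding bytes = total struct size minus the bytes used by members. */
static size_t loose_padding(void)  { return sizeof(LooseLayout)  - MEMBER_BYTES; }
static size_t packed_padding(void) { return sizeof(PackedLayout) - MEMBER_BYTES; }

/* Bytes saved per struct instance by the reordering. */
static size_t padding_saved(void)
{
    assert(sizeof(PackedLayout) <= sizeof(LooseLayout));
    return sizeof(LooseLayout) - sizeof(PackedLayout);
}
```

On common ABIs the first layout is typically larger than the second, but the exact sizes depend on the platform's alignment rules, so the sketch compares the two rather than hard-coding numbers.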
Mans Rullgard
ec7c501ed5 x86: remove libmpeg2 mmx(ext) idct functions
These functions are not faster than other mmx implementations on
any hardware I have been able to test on, and they are horribly
inaccurate.  There is thus no reason to ever use them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-02 12:14:52 +01:00
Derek Buitenhuis
a675d73d57 eamad: Use dsputils instead of a custom bswap16_buf
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-08-01 22:07:04 -04:00
Derek Buitenhuis
45eaac02cb Canopus Lossless decoder
At the moment it only does BGR24, but I plan to add the rest after.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-08-01 22:06:16 -04:00
Diego Biurrun
19cf7163c1 dca: Switch dca_sample_rates to avpriv_ prefix; it is used across libs 2012-08-01 11:43:31 +02:00
Mans Rullgard
faa788227f ARM: use =const syntax instead of explicit literal pools
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-01 10:32:24 +01:00
Mans Rullgard
998170913c ARM: use standard syntax for all LDRD/STRD instructions
The standard syntax requires two destination registers for
LDRD/STRD instructions.  Some versions of the GNU assembler
allow using only one with the second implicit, others are
more strict.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-01 10:32:24 +01:00
Ronald S. Bultje
b6a3849adb fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
64-bit CPUs always have SSE available, thus there is no need to compile
in the 3dnow functions. This results in smaller binaries.
2012-07-31 21:20:47 -07:00
Ronald S. Bultje
ddbe71b44f dct-test: allow compiling without HAVE_INLINE_ASM. 2012-07-31 20:30:29 -07:00
Ronald S. Bultje
53dfaedc01 x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM. 2012-07-31 20:28:52 -07:00
Diego Biurrun
9e4bca16f8 dca: Move tables used outside of dcadec.c to a separate file. 2012-08-01 00:17:17 +02:00
Diego Biurrun
13a79cf84e dca: Rename dca.c ---> dcadec.c
This will allow adding dca.c with tables used from other files.
2012-08-01 00:17:16 +02:00
Diego Biurrun
6376a3ad24 x86: h264dsp: Remove unused variable ff_pb_3_1 2012-08-01 00:17:16 +02:00
Diego Biurrun
8728b381cb x86: h264dsp: Adjust YASM #ifdefs
This fixes compilation with YASM disabled.
2012-07-31 13:54:07 +02:00
Ronald S. Bultje
b829b4ce29 h264: convert loop filter strength dsp function to yasm.
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles
faster (201->193) on x86-32.
2012-07-30 19:39:47 -07:00
Diego Biurrun
0177b7d23a Improve descriptiveness of a number of codec and container long names 2012-07-30 20:46:55 +02:00
Ronald S. Bultje
be391fb6df h264_ps: declare array of colorspace strings on its own line. 2012-07-29 14:53:42 -07:00
Mans Rullgard
f3eb008343 eamad/eatgq/eatqi: call special EA IDCT directly
These decoders use a special non-MPEG2 IDCT.  Call it directly
instead of going through dsputil.  There is never any reason
to use a regular IDCT with these decoders or to use the EA IDCT
with other codecs.

This also fixes the bizarre situation of eamad and eatqi decoding
incorrectly if eatgq is disabled.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-29 21:30:57 +01:00
Mans Rullgard
591766a3a9 eamad: remove use of MpegEncContext
There is no sense in pulling in this monster struct just for
a handful of fields.  The code does not call any functions
expecting an MpegEncContext.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-29 21:30:47 +01:00
Mans Rullgard
87cf481aa8 mpegvideo: remove unnecessary inclusions of faandct.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-07-29 21:28:45 +01:00
Ronald S. Bultje
c83f44dba1 h264_idct_10bit: port x86 assembly to cpuflags. 2012-07-28 08:29:45 -07:00
Ronald S. Bultje
b3c5ae5607 fft: rename "z" to "zc" to prevent name collision.
Without this, cglobal will expand "z" to "zh" to access the high byte
in a register's word, which causes a name collision with the ZH(x) macro
further up in this file.
2012-07-28 08:29:44 -07:00
Michael Niedermayer
45838561f2 vc1dec: Override invalid macroblock quantizer
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-07-28 14:13:22 +02:00
Michael Niedermayer
2bf369b60c vc1: avoid reading beyond the last line in vc1_draw_sprites()
Fixes overread

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-07-28 13:35:12 +02:00
Michael Niedermayer
1100acbab2 vc1dec: check that coded slice positions and interlacing match.
This fixes out-of-array writes.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-07-28 13:34:05 +02:00
Michael Niedermayer
0aa907cfb1 vc1dec: Do not ignore ff_vc1_parse_frame_header_adv return value
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-07-28 13:34:05 +02:00
Ronald S. Bultje
4d777eedfd vp3: don't compile mmx IDCT functions on x86-64.
64-bit CPUs always have SSE2, and an SSE2 version exists, thus the MMX
version will never be used.
2012-07-27 20:12:30 -07:00
Ronald S. Bultje
a5bbb1242c h264_loopfilter: port x86 simd to cpuflags. 2012-07-27 20:12:11 -07:00
Ronald S. Bultje
d07ff3cd5a h264_chromamc_10bit: port x86 simd to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
4a26fdd852 vp3: port x86 SIMD to cpuflags. 2012-07-27 17:35:49 -07:00
Ronald S. Bultje
76888c64b0 rv34: port x86 SIMD to cpuflags. 2012-07-27 15:13:26 -07:00
Ronald S. Bultje
158744a4cd vp56: only compile MMX SIMD on x86-32.
All x86-64 CPUs have SSE2, so the MMX version will never be used. This
leads to smaller binaries.
2012-07-27 14:40:27 -07:00
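
The reasoning in 158744a4cd can be sketched as a compile-time guard around the MMX path. This is an illustration under assumptions, not the vp56 init code: the `SKETCH_*` macros, the flag values and the `filter_*` functions are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed architecture test; the project's own config macros are not used. */
#if defined(__x86_64__) || defined(_M_X64)
#define SKETCH_X86_64 1
#else
#define SKETCH_X86_64 0
#endif

enum { SKETCH_CPU_MMX = 1, SKETCH_CPU_SSE2 = 2 };

typedef void (*filter_fn)(unsigned char *dst, int n);

/* Empty stand-ins for the C, SSE2 and MMX implementations. */
static void filter_c(unsigned char *dst, int n)    { (void)dst; (void)n; }
static void filter_sse2(unsigned char *dst, int n) { (void)dst; (void)n; }
#if !SKETCH_X86_64
static void filter_mmx(unsigned char *dst, int n)  { (void)dst; (void)n; }
#endif

/* Pick the best implementation; later assignments override earlier ones. */
static filter_fn select_filter(int cpu_flags)
{
    filter_fn fn = filter_c;
#if !SKETCH_X86_64
    /* Only built on 32-bit x86, where SSE2 may be missing. */
    if (cpu_flags & SKETCH_CPU_MMX)
        fn = filter_mmx;
#endif
    if (cpu_flags & SKETCH_CPU_SSE2)
        fn = filter_sse2;
    assert(fn != NULL);
    return fn;
}
```

Because SSE2 always overrides MMX when both flags are set, and every x86-64 CPU reports SSE2, the MMX function is dead code on 64-bit builds and can be left out of the binary.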
Ronald S. Bultje
2734ba787b vp56: port x86 simd to cpuflags. 2012-07-27 14:39:07 -07:00