31 Commits

Author SHA1 Message Date
Ronald S. Bultje
8123e0901f x86: place some inline asm under #if HAVE_INLINE_ASM
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-25 13:23:12 +01:00
Mans Rullgard
0b6f973635 h264: use asm cabac reader under a generic condition
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 22:14:21 +01:00
Roland Scheidegger
9b9df1cdff h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 09:43:25 -07:00
Roland Scheidegger
14e9ffc1e4 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:26:12 -07:00
Ronald S. Bultje
a940198130 cabac: add overread protection to BRANCHLESS_GET_CABAC().
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
2012-03-28 08:01:29 -07:00
Ronald S. Bultje
448dc42571 cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC().
2012-03-28 08:01:29 -07:00
Ronald S. Bultje
951014e5bb cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC().
2012-03-28 08:01:29 -07:00
Martin Storsjö
676a9ee1d2 x86: Fix constraints for decode_significance*_x86
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:

ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)

Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.

This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.

Signed-off-by: Martin Storsjö <martin@martin.st>
2011-12-27 09:05:14 +02:00
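The simplification argued in 676a9ee1d2 can be checked by brute force. The sketch below models only the HAVE_7REGS definition quoted in that message; the function names are made up for illustration and are not part of the code base. It confirms that on 32-bit (ARCH_X86_64 = 0) the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE always equals HAVE_7REGS, and that the case EBX = 1, EBP = 0 is exactly where HAVE_EBX_AVAILABLE alone gives a different answer.

```c
#include <stdbool.h>

/* HAVE_7REGS as quoted in the commit message, modelled as a function.
 * (Illustrative helper, not taken from the source tree.) */
static bool have_7regs(bool x86_64, bool ebx, bool ebp)
{
    return x86_64 || (ebx && ebp);
}

/* Counts the 32-bit configurations (x86_64 == false) in which
 * (HAVE_7REGS && HAVE_EBX_AVAILABLE) differs from HAVE_7REGS alone. */
static int count_mismatches_32bit(void)
{
    int mismatches = 0;
    for (int ebx = 0; ebx <= 1; ebx++) {
        for (int ebp = 0; ebp <= 1; ebp++) {
            bool r7 = have_7regs(false, ebx, ebp);
            if ((r7 && ebx) != r7)
                mismatches++;
        }
    }
    return mismatches;
}
```

With EBX available and EBP unavailable, have_7regs(false, true, false) is false while HAVE_EBX_AVAILABLE is 1, which is the configuration the commit message says failed to compile under the looser guard.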
Diego Biurrun
6fdb2ce34a x86: Tighten register constraints for decode_significance*_x86.
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.

This reverts commit 8742a4f to a simplified version of the original constraints.
2011-12-21 12:06:37 +01:00
Mans Rullgard
599b4c6efd x86: cabac: replace explicit memory references with "m" operands
This replaces the explicit offset(reg) memory references with
"m" operands for the same locations.  As a result, one fewer
register operand is needed for these inline asm statements.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-11 22:29:22 +00:00
Diego Biurrun
276b995d85 x86: drop pointless ARCH_X86 #ifdef from files in x86 subdirectory
2011-11-08 17:52:55 +01:00
Mans Rullgard
3ad1684126 x86: cabac: add operand size suffixes missing from 6c32576
This fixes build with clang.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 18:59:23 -07:00
Mans Rullgard
f5f004bc5a x86: cabac: don't load/store context values in asm
Inspection of compiled code shows gcc handles these fine on its own.
Benchmarking also shows no measurable speed difference.

Removing the remaining cases in get_cabac_bypass_sign_x86() does
cause more substantial changes to the compiled code with uncertain
impact.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-28 22:25:21 +01:00
Jason Garrett-Glaser
6c32576548 H.264: optimize CABAC x86 asm for Atom
2011-07-28 13:06:13 -07:00
Mans Rullgard
c5ee740745 x86: cabac: fix register constraints for 32-bit mode
Some operands need to be accessed in byte mode, which restricts the
available registers in 32-bit mode.  Using the 'q' constraint selects
a suitable register.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 23:36:40 +01:00
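As a rough illustration of the 'q' constraint mentioned in c5ee740745: in GCC-style inline asm on x86, 'q' restricts the operand to a register that has a byte-addressable low part (a, b, c or d in 32-bit mode), whereas plain 'r' could pick esi or edi, which have no byte form there. The function below is a hypothetical example, not code from libavcodec, and falls back to plain C on other targets or compilers.

```c
/* Hypothetical helper: adds 1 to the low byte of v, wrapping at 8 bits. */
static unsigned char low_byte_plus_one(unsigned v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char b = (unsigned char)v;
    /* "+q": read/write operand in a byte-accessible register,
     * so that %0 expands to a valid operand for addb. */
    __asm__("addb $1, %0" : "+q"(b) : : "cc");
    return b;
#else
    /* Portable fallback with the same result. */
    return (unsigned char)(v + 1);
#endif
}
```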
Mans Rullgard
2143d69bdd cabac: move x86 asm to libavcodec/x86/cabac.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
d075e7d540 x86: h264: cast pointers to intptr_t rather than int
Only the low-order bits are used here so the type is not important,
but this avoids a compiler warning.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
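A minimal sketch of the point made in d075e7d540, using a hypothetical helper rather than the actual decoder code: when only the low-order bits of an address are needed, casting the pointer to int gives the right bits but draws a "cast from pointer to integer of different size" warning on targets where pointers are wider than int. Casting to intptr_t avoids the warning without changing the result.

```c
#include <stdint.h>

/* Hypothetical helper: returns the low 4 bits of a pointer value. */
static unsigned low_bits(const void *p)
{
    /* (int)p would typically warn on 64-bit targets; intptr_t is
     * pointer-sized, and the mask discards the upper bits either way. */
    return (unsigned)((intptr_t)p & 15);
}
```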
Mans Rullgard
3a4edb76d6 x86: h264: remove hardcoded edi in decode_significance_8x8_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
b92c1a6d26 x86: h264: remove hardcoded esi in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
3fc4e36c78 x86: h264: remove hardcoded edx in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
e4b5a204aa x86: h264: remove hardcoded eax in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
018c33838e x86: cabac: remove hardcoded ebx in inline asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
6b712acc0e x86: cabac: remove hardcoded struct offsets from inline asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Diego Biurrun
ba87f0801d Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-20 14:45:34 +00:00
Diego Biurrun
bad5537e2c Use full internal pathname in doxygen @file directives.
Otherwise doxygen complains about ambiguous filenames when files exist
under the same name in different subdirectories.

Originally committed as revision 16912 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-01 02:00:19 +00:00
Aurelien Jacobs
b250f9c66d Change semantic of CONFIG_*, HAVE_* and ARCH_*.
They are now always defined to either 0 or 1.

Originally committed as revision 16590 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-13 23:44:16 +00:00
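To illustrate what the "always defined to either 0 or 1" convention from b250f9c66d implies for code that tests these macros, here is a small sketch. The macro names HAVE_FEATURE_A and HAVE_FEATURE_B are invented for the example. Under this convention #ifdef is no longer a meaningful test, because a disabled feature is still defined; #if, or an ordinary C condition, has to be used instead.

```c
#include <assert.h>

/* Hypothetical feature macros, as a configure step might emit them
 * under the "always 0 or 1" convention. */
#define HAVE_FEATURE_A 1
#define HAVE_FEATURE_B 0

static_assert(HAVE_FEATURE_B == 0, "B is disabled in this sketch");

/* Preprocessor test on the value: selects the enabled branch. */
static int feature_a_enabled(void)
{
#if HAVE_FEATURE_A
    return 1;
#else
    return 0;
#endif
}

/* #ifdef only asks whether the macro exists, so it is true even
 * though the feature is disabled. */
static int feature_b_is_defined(void)
{
#ifdef HAVE_FEATURE_B
    return 1;
#else
    return 0;
#endif
}

/* The macro can also be used in a plain C condition; both branches
 * are still compiled and type-checked. */
static int feature_b_enabled(void)
{
    if (HAVE_FEATURE_B)
        return 1;
    return 0;
}
```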
Diego Biurrun
a6493a8fbd Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 09:12:42 +00:00