Ronald S. Bultje
62844c3fd6
h264: Integrate clear_blocks calls with IDCT
...
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Diego Biurrun
26301caaa1
x86: mmx2 ---> mmxext in asm constructs
2012-11-14 00:58:51 +01:00
Diego Biurrun
2b479bcab0
build: Drop AVX assembly ifdefs
...
An assembler able to cope with AVX instructions is now required.
2012-11-11 20:43:28 +01:00
Diego Biurrun
04581c8c77
x86: yasm: Use complete source path for macro helper %includes
...
This is more consistent with the way we handle C #includes and
it simplifies the build system.
2012-10-31 00:37:42 +01:00
Diego Biurrun
6860b4081d
x86: include x86inc.asm in x86util.asm
...
This is necessary to allow refactoring some x86util macros with cpuflags.
2012-10-31 00:37:42 +01:00
Diego Biurrun
17337f54c0
x86: Split inline and external assembly #ifdefs
2012-08-31 01:53:25 +02:00
Mans Rullgard
a3df4781f4
x86: add colons after labels
...
nasm prints a warning if the colon is missing.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-08-07 15:20:56 +01:00
Ronald S. Bultje
c83f44dba1
h264_idct_10bit: port x86 assembly to cpuflags.
2012-07-28 08:29:45 -07:00
Henrik Gramner
729f90e268
x86inc improvements for 64-bit
...
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
2012-04-11 15:47:00 -04:00
Michael Kostylev
3206cccc0e
h264: mark h264_idct_add8_10 with number of XMM registers.
...
This fixes XMM register clobber problems on Win64.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-07 11:37:13 -08:00
Ronald S. Bultje
3b15a6d742
config.asm: change %ifdef directives to %if directives.
...
This allows combining multiple conditionals in a single statement.
2012-01-27 10:19:57 +08:00
Dave Yeo
cc73511e8e
Fix NASM include directive
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-15 11:24:35 -07:00
Ronald S. Bultje
b2c087871d
Move x86util.asm from libavcodec/ to libavutil/.
...
This allows using it in swscale also.
2011-08-12 11:43:03 -07:00
Ronald S. Bultje
3a39195b1d
Move x86inc.asm to libavutil/.
...
This allows using it in libswscale/ also.
2011-08-12 11:43:02 -07:00
Jason Garrett-Glaser
c90b94424c
4:4:4 H.264 decoding support
...
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea
Roll back 4:4:4 H.264 for now
...
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c
4:4:4 H.264 decoding support
...
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Loren Merritt
53be7b23e9
Cosmetic changes to h264_idct_10bit.asm.
...
Removes redundant dword tags and whitespace changes.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:15 -07:00
Loren Merritt
994c3550ff
2x faster h264_idct_add8_10.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:02 -07:00
Daniel Kang
836f47d34b
Add IDCT functions for 10-bit H.264.
...
Ports the majority of IDCT functions for 10-bit H.264.
Parts are inspired from 8-bit IDCT code in Libav; other parts ported from x264 with relicensing permission from author.
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00