11 Commits

Author SHA1 Message Date
Sindre Aamås
48a520915a [Encoder/x86] Add AVX2 SATD routines
WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell).
WelsSampleSatd16x8_avx2  (~2.19x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x16_avx2  (~1.68x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x8_avx2   (~1.53x speedup over SSE4.1 on Haswell).
2016-03-08 11:31:17 +01:00
Sindre Aamås
9909c306f1 [Common/x86] DeblockChromaLt4H_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

~5.72x speedup on Haswell (x86-64).
~1.85x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
Sindre Aamås
732e1c5f78 [Common/x86] DeblockLumaLt4_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Avoid spills.

~1.97x speedup on Haswell (x86-64).
~3.09x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:18 +01:00
Sindre Aamås
3088d96978 [Encoder] Add an AVX2 4x4 IDCT implementation
~2.03x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Martin Storsjö
3243a78959 Move DEFAULT REL into the x86_64 cases
This fixes warnings when building for x86_32 using yasm, which says
the "DEFAULT REL" is ignored for non-64-bit targets.
2015-02-02 00:49:37 +02:00
Martin Storsjö
d5a45ec513 Mark the x86 assembly object files as not requiring an executable stack
This avoids having to add extra linker flags in order to specify this.

This is similar to how this already is handled for the arm assembly.
2014-07-25 00:56:39 +03:00
Martin Storsjö
57f6bcc4b0 Convert all tabs to spaces in assembly sources, unify indentation
Previously the assembly sources had mixed indentation consisting
of both spaces and tabs, making it quite hard to read unless
the right tab size was used in the editor.

Tabs have been interpreted as 4 spaces in most cases, matching
the surrounding code.
2014-06-01 01:35:43 +03:00
Martin Storsjö
faaf62afad Get rid of double spaces in macro declarations 2014-06-01 01:13:01 +03:00
Martin Storsjö
ac03b8b503 Avoid unnecessary tabs in macro declarations 2014-06-01 01:13:01 +03:00
Licai Guo
485b2b5b43 Add IntraSad asm code.
Enable intraSad ASM code

Refine format

Add X86_ASM pretect for intraSad ASM code UT

remove duplicated code.
2014-05-04 12:12:38 +08:00
Licai Guo
e39de8d404 reoranize common to inc/src/x86/arm 2014-03-18 19:41:32 -07:00