Sindre Aamås
48a520915a
[Encoder/x86] Add AVX2 SATD routines
...
WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell).
WelsSampleSatd16x8_avx2 (~2.19x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x16_avx2 (~1.68x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x8_avx2 (~1.53x speedup over SSE4.1 on Haswell).
2016-03-08 11:31:17 +01:00
Sindre Aamås
9909c306f1
[Common/x86] DeblockChromaLt4H_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
~5.72x speedup on Haswell (x86-64).
~1.85x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
Sindre Aamås
732e1c5f78
[Common/x86] DeblockLumaLt4_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~1.97x speedup on Haswell (x86-64).
~3.09x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:18 +01:00
Sindre Aamås
3088d96978
[Encoder] Add an AVX2 4x4 IDCT implementation
...
~2.03x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Martin Storsjö
3243a78959
Move DEFAULT REL into the x86_64 cases
...
This fixes warnings when building for x86_32 using yasm, which reports
that "DEFAULT REL" is ignored for non-64-bit targets.
2015-02-02 00:49:37 +02:00
Martin Storsjö
d5a45ec513
Mark the x86 assembly object files as not requiring an executable stack
...
This avoids having to add extra linker flags in order to specify this.
This is similar to how it is already handled for the ARM assembly.
2014-07-25 00:56:39 +03:00
Martin Storsjö
57f6bcc4b0
Convert all tabs to spaces in assembly sources, unify indentation
...
Previously the assembly sources had mixed indentation consisting
of both spaces and tabs, making it quite hard to read unless
the right tab size was used in the editor.
Tabs have been interpreted as 4 spaces in most cases, matching
the surrounding code.
2014-06-01 01:35:43 +03:00
Martin Storsjö
faaf62afad
Get rid of double spaces in macro declarations
2014-06-01 01:13:01 +03:00
Martin Storsjö
ac03b8b503
Avoid unnecessary tabs in macro declarations
2014-06-01 01:13:01 +03:00
Licai Guo
485b2b5b43
Add IntraSad asm code.
...
Enable intraSad ASM code
Refine format
Add X86_ASM protection for intraSad ASM code UT
Remove duplicated code.
2014-05-04 12:12:38 +08:00
Licai Guo
e39de8d404
Reorganize common into inc/src/x86/arm
2014-03-18 19:41:32 -07:00