281 Commits

Author SHA1 Message Date
Martin Storsjö
062937ac5a Unify the indentation of the new aarch64 assembly 2014-06-17 10:01:23 +03:00
Martin Storsjö
d15534ecb8 Get rid of mixed tabs and spaces in the aarch64 assembly 2014-06-17 10:00:07 +03:00
ruil2
1111757977 Merge pull request #967 from dongzha/Deblock_AArch64
add arm 64 deblock code and Unit Test code
2014-06-16 17:19:25 +08:00
huili2
91cd93e5d0 Merge pull request #962 from dongzha/UseIntInRC
Use Int instead of Double in Rate Control and Modify anchor SHA1 value
2014-06-13 10:59:50 +08:00
dongzha
f6ce43f83b Use Int instead of Double in Rate Control and Modify anchor SHA1 value 2014-06-12 17:30:13 +08:00
ruil2
44b048edd6 move trace related info to interface header 2014-06-11 17:05:40 +08:00
Martin Storsjö
dc91e0958b Integrate the lone function from logging.cpp into welsCodecTrace.cpp 2014-06-11 08:08:56 +03:00
Martin Storsjö
6e5f31214a Add a method for overriding the logging function in welsCodecTrace 2014-06-11 08:08:56 +03:00
Martin Storsjö
ce8065fe68 Don't use global variables in welsCodecTrace
This allows actually honoring the requested log level
properly if there are multiple codec instances within
the same process.
2014-06-11 08:08:56 +03:00
Martin Storsjö
cb5ee6c239 Remove the global log callback function
Now all logging should use a non-null log context, allowing to
pass the messages to the right recipient.
2014-06-11 08:08:56 +03:00
Martin Storsjö
8bac9315e6 Expose a SLogContext from welsCodecTrace 2014-06-11 08:08:29 +03:00
Martin Storsjö
4e428ab020 Add a log context to the encoder and decoder contexts
This will allow setting non-global logging callbacks, that
are different for each encoder or decoder instance.
2014-06-11 08:08:29 +03:00
Martin Storsjö
c8b81b4239 Only keep one single trace function pointer in welsCodecTrace 2014-06-11 08:08:29 +03:00
Martin Storsjö
20e889fadb Change CM_WELS_TRACE to take a plain string, not a format and variadic arguments
The format string was always "%s" anyway.
2014-06-11 08:08:29 +03:00
Martin Storsjö
5e22d5366e Remove the unused level parameter to welsStderrLevelTrace 2014-06-11 08:08:29 +03:00
Martin Storsjö
cfc9367610 Remove WelsStderrSetTraceLevel
The logging level is checked in welsCodecTrace anyway.

Previously, error logging wasn't ever shown if the trace
level was set to WELS_LOG_ERROR (as it was by default),
since welsStderrLevelTrace required the message level to
be strictly lower than the trace level.
2014-06-11 08:08:29 +03:00
Martin Storsjö
90be3d8215 Don't treat log levels as a bitmask
All use of log levels in the library just do a numerical
greater-than comparison between the set log level and the
level of the current message.
2014-06-11 08:08:29 +03:00
Martin Storsjö
4f1ea1c4f8 Remove some unused typedefs 2014-06-11 08:08:29 +03:00
Martin Storsjö
cc65a1d76c Don't include a [ENCODER]: prefix in all logging
The same trace module is used for the decoder now as well.
2014-06-10 10:52:26 +03:00
Martin Storsjö
17fc6bd66e Remove the unnecessary method WelsTraceModuleIsExist(), which always returned true 2014-06-10 10:52:26 +03:00
Martin Storsjö
d93488448e Remove some commented out lines 2014-06-10 10:52:26 +03:00
Martin Storsjö
968d87045d Remove an unnecessary local function 2014-06-10 10:52:26 +03:00
Martin Storsjö
40af75c19d Remove the unnecessary WelsSet/GetLogLevel functions
Nothing actually used the variable that these functions
handled.
2014-06-10 10:52:06 +03:00
Martin Storsjö
ba1de16ac2 Make internal logging variables static
This avoids polluting the global namespace.
2014-06-10 09:28:45 +03:00
Martin Storsjö
ab4fe3fdf4 Remove an unused variable 2014-06-10 09:27:54 +03:00
dongzhang
0e0c8b5569 add arm 64 deblock code and Unit Test code 2014-06-10 11:23:51 +08:00
ruil2
4c12f8970c cleanup trace module 2014-06-10 10:24:45 +08:00
Martin Storsjö
7bc3e944ad Get rid of uneven spacing after WELS_EXTERN 2014-06-09 11:03:25 +03:00
Martin Storsjö
57f6bcc4b0 Convert all tabs to spaces in assembly sources, unify indentation
Previously the assembly sources had mixed indentation consisting
of both spaces and tabs, making it quite hard to read unless
the right tab size was used in the editor.

Tabs have been interpreted as 4 spaces in most cases, matching
the surrounding code.
2014-06-01 01:35:43 +03:00
Martin Storsjö
faaf62afad Get rid of double spaces in macro declarations 2014-06-01 01:13:01 +03:00
Martin Storsjö
ac03b8b503 Avoid unnecessary tabs in macro declarations 2014-06-01 01:13:01 +03:00
Martin Storsjö
932a38abc0 Reformat the copyright header of deblocking_neon.S
This makes it identical to the ones in the other files.
2014-05-31 13:44:21 +03:00
ruil2
14e5d740cd clean up expand picture. 2014-05-30 11:05:31 +08:00
dongzha
80fdf09b26 Merge pull request #903 from zhilwang/arm64-sad
Add Arm64 sad code
2014-05-30 09:26:04 +08:00
Sijia Chen
7413032185 using WelsRound for all the double-int32_t conversion 2014-05-20 14:06:31 +08:00
zhiliang wang
e6c9eb9824 Add Sad arm64 code 2014-05-14 17:06:48 +08:00
Martin Storsjö
3cc01c6239 Use CCASFLAGS when assembling .S sources
This allows overriding whether all of CFLAGS should be passed
when assembling.
2014-05-13 19:39:26 +03:00
sijchen
31a4d2aa3e Merge pull request #829 from dongzha/FixBugforDeblocking
Fix a bug in deblocking for neon 32 bit arm implementation for master
2014-05-13 17:21:48 +08:00
Martin Storsjö
6b9167199f Use the built-in define __linux__ instead of the manually set LINUX 2014-05-12 12:14:33 +03:00
dongzhang
218adc7e29 Fix a bug in deblocking for neon 32 bit arm implementation 2014-05-09 14:06:16 +08:00
Martin Storsjö
6e715ddc10 Make an endif comment match the actual condition 2014-05-08 11:14:24 +03:00
huili2
5ed24f216b astyle all files 2014-05-05 19:30:21 -07:00
Martin Storsjö
b8eeda1740 Properly back up and restore XMM registers on win64 in WelsSampleSadFour4x4_sse2 2014-05-04 15:47:56 +03:00
Licai Guo
fe5b8d1a69 refine format 2014-05-04 14:51:05 +08:00
Licai Guo
485b2b5b43 Add IntraSad asm code.
Enable intraSad ASM code

Refine format

Add X86_ASM pretect for intraSad ASM code UT

remove duplicated code.
2014-05-04 12:12:38 +08:00
Martin Storsjö
23f57adaea Do full register loads instead of single-lane loads in DeblockLumaEq4H_neon
Instead of loading the registers one lane at a time, load full
registers and then transpose them.

This is faster, reducing the runtime for the function from about
506 cycles to 434 cycles (tested on a Cortex A8).

This also avoids an issue which seems like a cpu bug, present
on Sony Xperia T (cpu implementer 0x51 architecture 7 variant 0x1
part 0x04d). On such a device, it seemed like the "vswp q9, q10"
could start executing before the previous
vld4.u8 {d20[x],d21[x],d22[x],d23[x]}, [r3], r1
had finished and written back their result. Changing the
"vswp q9, q10" into "vswp q10, q9", or into separate
"vswp d18, d20; vswp d19, d21" (or the other way around) seemed to
avoid the issue. This happened occasionally (a couple times per
100000 invocations or so).
2014-04-28 10:12:16 +03:00
volvet
c65e286036 Merge pull request #738 from mstorsjo/gnu-aarch64
Fix building the aarch64 assembly using gnu binutils
2014-04-25 09:07:43 +08:00
Martin Storsjö
66f58e8357 Add macros for the non-standard mov.16b/mov.8b/ext.16b/ext.8b
This fixes building with gnu binutils, which don't support this
nonstandard form of the instructions.

Once Apple's tools support the proper standard form of the
instructions, the code should be updated to use that everywhere
instead, and these macros should be removed.
2014-04-23 11:47:12 +03:00
Martin Storsjö
7cd175d097 Use the correct ext syntax in the gnu version of macros 2014-04-23 11:47:12 +03:00
Martin Storsjö
b13a399ab5 Use a plain "ret" instead of "ret lr"
This fixes an issue with assembling with gnu binutils.
2014-04-23 11:47:12 +03:00