zhilwang
eb221eb3d1
Merge pull request #1030 from mstorsjo/cpuid-32bit-param
...
Don't load undefined bits into rcx before calling the cpuid instruction
2014-06-30 10:05:26 +08:00
zhilwang
58349156b1
Merge pull request #1035 from mstorsjo/aarch64-cpufeatures
...
Implement WelsCPUFeatureDetect for AArch64
2014-06-30 10:04:03 +08:00
Martin Storsjö
68b4b09ae6
Don't load undefined bits into rcx before calling the cpuid instruction
...
The pFeatureC pointer is a uint32_t pointer, therefore only
load 32 bits into ecx.
This avoids loading potentially uninitialized data into the upper
half of the rcx register, fixing valgrind warnings in some build
setups (depending on how the compiler chooses to lay out the stack
in the calling function).
2014-06-29 20:46:55 +03:00
Martin Storsjö
b5407915cc
Avoid warnings when building for iOS
...
Get rid of warnings by avoiding mixing data types unnecessarily,
and by adding casts.
2014-06-29 16:03:34 +03:00
Martin Storsjö
263833b3bf
Remove the now unused macros __align16, ALIGNED_DECLARE_MATRIX_1D and ALIGNED_DECLARE_MATRIX_2D
2014-06-29 00:36:29 +03:00
Martin Storsjö
66deed24b8
Implement WelsCPUFeatureDetect for AArch64
...
Previously it actually didn't return any cpu flags at all.
2014-06-27 23:57:42 +03:00
Martin Storsjö
b406f4471a
Fix a typo in arm64 assembly macros, ARCH64 -> AARCH64
2014-06-27 23:52:20 +03:00
ruil2
1ffcd36c70
change void* to explicit type definition
2014-06-27 11:34:47 +08:00
huili2
dc3fae4477
astyle all
2014-06-25 18:50:41 -07:00
Haibo Zhu
daf67d607f
Remove win64 warnings
2014-06-19 18:31:39 -07:00
Martin Storsjö
720f8dcc52
Fix building the deblocking aarch64 assembly with gnu binutils
2014-06-17 10:10:50 +03:00
Martin Storsjö
b9477cdb94
Unify the copyright header in the aarch64 deblocking assembly
...
This file was the only one that had a differently formatted
copyright header.
2014-06-17 10:02:57 +03:00
Martin Storsjö
062937ac5a
Unify the indentation of the new aarch64 assembly
2014-06-17 10:01:23 +03:00
Martin Storsjö
d15534ecb8
Get rid of mixed tabs and spaces in the aarch64 assembly
2014-06-17 10:00:07 +03:00
ruil2
1111757977
Merge pull request #967 from dongzha/Deblock_AArch64
...
add arm 64 deblock code and Unit Test code
2014-06-16 17:19:25 +08:00
huili2
91cd93e5d0
Merge pull request #962 from dongzha/UseIntInRC
...
Use Int instead of Double in Rate Control and Modify anchor SHA1 value
2014-06-13 10:59:50 +08:00
dongzha
f6ce43f83b
Use Int instead of Double in Rate Control and Modify anchor SHA1 value
2014-06-12 17:30:13 +08:00
ruil2
44b048edd6
move trace related info to interface header
2014-06-11 17:05:40 +08:00
Martin Storsjö
dc91e0958b
Integrate the lone function from logging.cpp into welsCodecTrace.cpp
2014-06-11 08:08:56 +03:00
Martin Storsjö
6e5f31214a
Add a method for overriding the logging function in welsCodecTrace
2014-06-11 08:08:56 +03:00
Martin Storsjö
ce8065fe68
Don't use global variables in welsCodecTrace
...
This allows actually honoring the requested log level
properly if there are multiple codec instances within
the same process.
2014-06-11 08:08:56 +03:00
Martin Storsjö
cb5ee6c239
Remove the global log callback function
...
Now all logging should use a non-null log context, allowing
messages to be passed to the right recipient.
2014-06-11 08:08:56 +03:00
Martin Storsjö
8bac9315e6
Expose a SLogContext from welsCodecTrace
2014-06-11 08:08:29 +03:00
Martin Storsjö
4e428ab020
Add a log context to the encoder and decoder contexts
...
This will allow setting non-global logging callbacks, that
are different for each encoder or decoder instance.
2014-06-11 08:08:29 +03:00
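A minimal sketch of the per-instance logging idea described in the entry above. All type and function names here (LogContext, CodecInstance, LogMessage, CollectMessage) are hypothetical and chosen for illustration; they are not the library's actual declarations.

```cpp
#include <string>
#include <vector>

// Hypothetical names throughout; not the library's actual types.
typedef void (*LogCallback)(void* userData, int level, const char* message);

struct LogContext {
  LogCallback callback;  // where messages for this instance go
  void* userData;        // opaque pointer handed back to the callback
  int level;             // highest message level this instance accepts
};

struct CodecInstance {
  LogContext log;  // each encoder or decoder instance owns its own context
};

// Forward a message to the instance's own callback if its level permits.
inline void LogMessage(const LogContext& ctx, int level, const char* message) {
  if (ctx.callback != nullptr && level <= ctx.level)
    ctx.callback(ctx.userData, level, message);
}

// Example callback: append each message to the vector passed as userData.
inline void CollectMessage(void* userData, int /*level*/, const char* message) {
  static_cast<std::vector<std::string>*>(userData)->push_back(message);
}
```

Because the callback and level live in the instance rather than in globals, two instances in the same process can log to different recipients at different verbosity.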
Martin Storsjö
c8b81b4239
Only keep one single trace function pointer in welsCodecTrace
2014-06-11 08:08:29 +03:00
Martin Storsjö
20e889fadb
Change CM_WELS_TRACE to take a plain string, not a format and variadic arguments
...
The format string was always "%s" anyway.
2014-06-11 08:08:29 +03:00
Martin Storsjö
5e22d5366e
Remove the unused level parameter to welsStderrLevelTrace
2014-06-11 08:08:29 +03:00
Martin Storsjö
cfc9367610
Remove WelsStderrSetTraceLevel
...
The logging level is checked in welsCodecTrace anyway.
Previously, error logging wasn't ever shown if the trace
level was set to WELS_LOG_ERROR (as it was by default),
since welsStderrLevelTrace required the message level to
be strictly lower than the trace level.
2014-06-11 08:08:29 +03:00
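The off-by-one described in the entry above can be sketched as follows. The level values and function names are hypothetical, and the inclusive comparison is an assumption about what a corrected check could look like, not the library's code.

```cpp
// Hypothetical level values, ordered from most to least severe.
enum LogLevel { kLogError = 1, kLogWarning = 2, kLogInfo = 3 };

// The behaviour described above: the message level had to be strictly lower
// than the trace level, so kLogError messages were dropped at kLogError.
inline bool ShownWithStrictCheck(int messageLevel, int traceLevel) {
  return messageLevel < traceLevel;
}

// An inclusive comparison shows messages at the configured level as well.
inline bool ShownWithInclusiveCheck(int messageLevel, int traceLevel) {
  return messageLevel <= traceLevel;
}
```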
Martin Storsjö
90be3d8215
Don't treat log levels as a bitmask
...
All uses of log levels in the library just do a numerical
greater-than comparison between the set log level and the
level of the current message.
2014-06-11 08:08:29 +03:00
Martin Storsjö
4f1ea1c4f8
Remove some unused typedefs
2014-06-11 08:08:29 +03:00
Martin Storsjö
cc65a1d76c
Don't include a [ENCODER]: prefix in all logging
...
The same trace module is used for the decoder now as well.
2014-06-10 10:52:26 +03:00
Martin Storsjö
17fc6bd66e
Remove the unnecessary method WelsTraceModuleIsExist(), which always returned true
2014-06-10 10:52:26 +03:00
Martin Storsjö
d93488448e
Remove some commented out lines
2014-06-10 10:52:26 +03:00
Martin Storsjö
968d87045d
Remove an unnecessary local function
2014-06-10 10:52:26 +03:00
Martin Storsjö
40af75c19d
Remove the unnecessary WelsSet/GetLogLevel functions
...
Nothing actually used the variable that these functions
handled.
2014-06-10 10:52:06 +03:00
Martin Storsjö
ba1de16ac2
Make internal logging variables static
...
This avoids polluting the global namespace.
2014-06-10 09:28:45 +03:00
Martin Storsjö
ab4fe3fdf4
Remove an unused variable
2014-06-10 09:27:54 +03:00
dongzhang
0e0c8b5569
add arm 64 deblock code and Unit Test code
2014-06-10 11:23:51 +08:00
ruil2
4c12f8970c
cleanup trace module
2014-06-10 10:24:45 +08:00
Martin Storsjö
7bc3e944ad
Get rid of uneven spacing after WELS_EXTERN
2014-06-09 11:03:25 +03:00
Martin Storsjö
57f6bcc4b0
Convert all tabs to spaces in assembly sources, unify indentation
...
Previously the assembly sources had mixed indentation consisting
of both spaces and tabs, making it quite hard to read unless
the right tab size was used in the editor.
Tabs have been interpreted as 4 spaces in most cases, matching
the surrounding code.
2014-06-01 01:35:43 +03:00
Martin Storsjö
faaf62afad
Get rid of double spaces in macro declarations
2014-06-01 01:13:01 +03:00
Martin Storsjö
ac03b8b503
Avoid unnecessary tabs in macro declarations
2014-06-01 01:13:01 +03:00
Martin Storsjö
932a38abc0
Reformat the copyright header of deblocking_neon.S
...
This makes it identical to the ones in the other files.
2014-05-31 13:44:21 +03:00
ruil2
14e5d740cd
clean up expand picture.
2014-05-30 11:05:31 +08:00
dongzha
80fdf09b26
Merge pull request #903 from zhilwang/arm64-sad
...
Add Arm64 sad code
2014-05-30 09:26:04 +08:00
Sijia Chen
7413032185
using WelsRound for all the double-int32_t conversion
2014-05-20 14:06:31 +08:00
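The log does not show how WelsRound is defined. As a rough illustration of why a rounding helper matters for double-to-int32_t conversion, the sketch below uses a hypothetical RoundToInt32 built on std::lround (round to nearest, halves away from zero); a plain cast truncates toward zero instead.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not the library's WelsRound: round to nearest integer.
inline int32_t RoundToInt32(double value) {
  return static_cast<int32_t>(std::lround(value));
}

// For comparison: a plain cast discards the fractional part.
inline int32_t TruncateToInt32(double value) {
  return static_cast<int32_t>(value);
}
```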
zhiliang wang
e6c9eb9824
Add Sad arm64 code
2014-05-14 17:06:48 +08:00
Martin Storsjö
3cc01c6239
Use CCASFLAGS when assembling .S sources
...
This allows overriding whether all of CFLAGS should be passed
when assembling.
2014-05-13 19:39:26 +03:00
sijchen
31a4d2aa3e
Merge pull request #829 from dongzha/FixBugforDeblocking
...
Fix a bug in deblocking for neon 32 bit arm implementation for master
2014-05-13 17:21:48 +08:00
Martin Storsjö
6b9167199f
Use the built-in define __linux__ instead of the manually set LINUX
2014-05-12 12:14:33 +03:00
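A small sketch of relying on compiler-provided macros rather than build-system defines, as the entry above describes. PlatformName is a hypothetical helper; __linux__, __APPLE__ and _WIN32 are predefined by the usual compilers for those targets.

```cpp
// Hypothetical helper that reports the platform from compiler-provided macros.
inline const char* PlatformName() {
#if defined(__linux__)
  return "linux";
#elif defined(__APPLE__)
  return "apple";
#elif defined(_WIN32)
  return "windows";
#else
  return "other";
#endif
}
```

No -DLINUX style flag is needed in the makefiles with this approach, since the compiler sets the macro itself.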
dongzhang
218adc7e29
Fix a bug in deblocking for neon 32 bit arm implementation
2014-05-09 14:06:16 +08:00
Martin Storsjö
6e715ddc10
Make an endif comment match the actual condition
2014-05-08 11:14:24 +03:00
huili2
5ed24f216b
astyle all files
2014-05-05 19:30:21 -07:00
Martin Storsjö
b8eeda1740
Properly back up and restore XMM registers on win64 in WelsSampleSadFour4x4_sse2
2014-05-04 15:47:56 +03:00
Licai Guo
fe5b8d1a69
refine format
2014-05-04 14:51:05 +08:00
Licai Guo
485b2b5b43
Add IntraSad asm code.
...
Enable intraSad ASM code
Refine format
Add X86_ASM protection for intraSad ASM code UT
remove duplicated code.
2014-05-04 12:12:38 +08:00
Martin Storsjö
23f57adaea
Do full register loads instead of single-lane loads in DeblockLumaEq4H_neon
...
Instead of loading the registers one lane at a time, load full
registers and then transpose them.
This is faster, reducing the runtime for the function from about
506 cycles to 434 cycles (tested on a Cortex A8).
This also avoids an issue which seems like a cpu bug, present
on Sony Xperia T (cpu implementer 0x51 architecture 7 variant 0x1
part 0x04d). On such a device, it seemed like the "vswp q9, q10"
could start executing before the previous
vld4.u8 {d20[x],d21[x],d22[x],d23[x]}, [r3], r1
had finished and written back their result. Changing the
"vswp q9, q10" into "vswp q10, q9", or into separate
"vswp d18, d20; vswp d19, d21" (or the other way around) seemed to
avoid the issue. This happened occasionally (a couple times per
100000 invocations or so).
2014-04-28 10:12:16 +03:00
volvet
c65e286036
Merge pull request #738 from mstorsjo/gnu-aarch64
...
Fix building the aarch64 assembly using gnu binutils
2014-04-25 09:07:43 +08:00
Martin Storsjö
66f58e8357
Add macros for the non-standard mov.16b/mov.8b/ext.16b/ext.8b
...
This fixes building with gnu binutils, which don't support this
nonstandard form of the instructions.
Once Apple's tools support the proper standard form of the
instructions, the code should be updated to use that everywhere
instead, and these macros should be removed.
2014-04-23 11:47:12 +03:00
Martin Storsjö
7cd175d097
Use the correct ext syntax in the gnu version of macros
2014-04-23 11:47:12 +03:00
Martin Storsjö
b13a399ab5
Use a plain "ret" instead of "ret lr"
...
This fixes an issue with assembling with gnu binutils.
2014-04-23 11:47:12 +03:00
Martin Storsjö
f2642b308a
Add correct arguments to the gnu version of UNPACK_FILTER_SINGLE_TAG_16BITS
2014-04-23 11:47:12 +03:00
Martin Storsjö
90fad9fd98
Add \() to macro arguments to separate the argument from the following .8h or similar
2014-04-23 11:47:12 +03:00
Martin Storsjö
80bd541cbe
Remove .syntax unified from the aarch64 common header
...
This directive isn't available in aarch64 code, only in arm code.
2014-04-23 11:47:12 +03:00
Martin Storsjö
3c2e9cd7bf
Regenerate makefiles to include the new arm64 assembly files
2014-04-23 11:44:47 +03:00
Martin Storsjö
764f787dcb
Rename the makefile variable for arm assembly sources
...
This is in preparation for adding support for the aarch64 assembly
files as well.
2014-04-23 10:55:30 +03:00
Martin Storsjö
a842f14a3c
Remove .orig files left over from running astyle
2014-04-23 09:24:23 +03:00
Martin Storsjö
45aef90d26
Remove the executable bit from source files
2014-04-23 09:23:56 +03:00
dongzhang
ad9e2dab4f
Add Motion Compensation ARM64 Neon Code
2014-04-23 13:26:28 +08:00
Licai Guo
b47606a4ff
Merge pull request #733 from dongzha/ExpandPic_ARM64
...
Add expand picture support for ARM64 NEON
2014-04-23 09:57:39 +08:00
dongzhang
2444327a6c
Add expand picture support for ARM64 NEON
...
Remove duplicate MACROS
2014-04-23 09:14:32 +08:00
Martin Storsjö
564d16c2ef
Make Wels*Snprintf return values be non-negative
...
This makes sure the windows version of these functions behave
more like the posix version. The posix *snprintf returns how
much would have been written if the buffer had been large
enough, which we don't know easily in the windows versions.
This basically means that we can assume that the return value is
>= 0 now, which can simplify the calling code.
2014-04-21 22:03:20 +03:00
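A minimal sketch of a formatting wrapper whose return value is never negative, in the spirit of the entry above. NonNegativeSnprintf is a hypothetical name, and substituting the buffer size for a negative result is an assumption made for illustration, not necessarily what the library does.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical wrapper: never returns a negative value.
inline int NonNegativeSnprintf(char* buffer, int size, const char* format, ...) {
  va_list args;
  va_start(args, format);
  int written = std::vsnprintf(buffer, static_cast<std::size_t>(size), format, args);
  va_end(args);
  // Assumption: if the underlying call reports failure with a negative
  // value, report the buffer size instead.
  if (written < 0)
    written = size;
  return written;
}
```

Callers can then use the result in arithmetic (for example, advancing a write offset) without first checking for a negative value.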
Licai Guo
3f2ea77908
Merge pull request #719 from dongzha/MC
...
Modify ARM32 Neon code for Expand Chroma Picture, when UVWidth%16==8.
2014-04-21 14:38:51 +08:00
Licai Guo
039a547804
give accurate align information for mc copy functions
...
this can improve the performance for target like javascript
2014-04-19 00:33:23 -07:00
Licai Guo
2f8c539e60
Merge pull request #707 from dongzha/FixIssueMcNEON
...
Fix potential issue for neon implement on encoder mode decision.
2014-04-17 17:26:25 +08:00
dongzhang
a4f59bc0d7
Modify ARM32 Neon code for Expand Chroma Picture, when UVWidth%16==8.
2014-04-17 15:58:30 +08:00
Licai Guo
4062fa9d34
Merge pull request #703 from zhilwang/pf-test
...
Move copy_mb neon code to common folder
2014-04-17 11:08:56 +08:00
Licai Guo
3d9d00b27c
Update targets.mk
2014-04-17 10:43:10 +08:00
Licai Guo
c8e1a41c29
Move copy_mb neon code to common folder
2014-04-17 10:06:48 +08:00
ruil2
b553468ad3
keep the declaration and definition in the same namespace
2014-04-17 09:45:26 +08:00
huili2
4ab8c88e98
Split the copy_mb functions out of the encoder into a new file for decoder use, and add files for EC in the decoder only.
2014-04-14 20:17:41 -07:00
Dong Zhang
8a4300be50
Fix potential issue for neon implement on encoder mode decision.
...
Error happens when ME_REFINE_BUF_STRIDE is not equal to 32.
2014-04-13 19:41:29 -07:00
Martin Storsjö
b35c21201b
Use the Windows Runtime ThreadPool API for creating threads on Windows Phone
...
Windows Phone lacks the old CreateThread/beginthreadex APIs for
creating threads. (Technically, the functions still do exist,
but they aren't officially supported and aren't visible in the
headers when targeting Windows Phone.)
Building code that uses the Windows Runtime language extensions
requires building with the -ZW option.
2014-04-01 11:18:49 +03:00
Martin Storsjö
f293d26a62
Use more modern versions of functions that don't exist on Windows Phone
2014-04-01 11:18:48 +03:00
Martin Storsjö
4bcb03c5a0
Remove the unused function WelsSleep
...
Windows Phone 8 doesn't have Sleep(), but there's no need to
use the function at all.
2014-04-01 11:18:48 +03:00
volvet
9f50e0c91e
clean multi-threading macro
2014-03-31 18:24:10 -07:00
ruil2
6b3f89d582
move some common functions to common.cpp and add some functions in common
2014-03-25 15:35:55 +08:00
Licai Guo
e39de8d404
reorganize common to inc/src/x86/arm
2014-03-18 19:41:32 -07:00
volvet
7313ecdbd0
Merge pull request #538 from mstorsjo/use-apple-builtin-define
...
Use __APPLE__ instead of APPLE_IOS for apple/arm specific features
2014-03-19 09:45:56 +08:00
Licai Guo
d897d362ab
Merge pull request #532 from huili2/WELS_CLIP1
...
Modify MACRO WELS_CLIP1 as inline functions
2014-03-19 08:50:04 +08:00
Martin Storsjö
9586c59b9e
Use __APPLE__ instead of APPLE_IOS in the arm assembly sources
2014-03-18 23:15:49 +02:00
Martin Storsjö
73ed237d73
Use __APPLE__ instead of APPLE_IOS for using the apple cpu feature detection
2014-03-18 23:15:49 +02:00
Ethan Hugg
197423f271
Merge pull request #520 from ylatuya/master
...
Fix compiler warnings and remove dead code
2014-03-18 13:28:02 -07:00
Andoni Morales Alastruey
703c69de81
codec: add a new macro for unused functions
...
Variables used only for tracing logs can trigger
-Werror=unused-variable when tracing is disabled.
This macro helps to silence gcc in those cases.
2014-03-18 19:15:25 +01:00
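The situation described in the entry above can be sketched like this. The macro names SKETCH_UNUSED, SKETCH_TRACE and SKETCH_ENABLE_TRACE are hypothetical; the commit's actual macro is not shown in this log.

```cpp
// Hypothetical macro name; marks a variable as intentionally unused.
#define SKETCH_UNUSED(x) (void)(x)

// Hypothetical trace macro that expands to nothing when tracing is disabled.
#ifdef SKETCH_ENABLE_TRACE
#include <cstdio>
#define SKETCH_TRACE(value) std::printf("value=%d\n", (value))
#else
#define SKETCH_TRACE(value) ((void)0)
#endif

inline int Process(int input) {
  int debugValue = input * 2;  // only consumed by the trace macro
  SKETCH_TRACE(debugValue);
  SKETCH_UNUSED(debugValue);   // keeps -Wunused-variable quiet without tracing
  return input + 1;
}
```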
Martin Storsjö
e1b5e038d2
Use .obj as suffix for object files on MSVC
...
This avoids warnings when linking about "unrecognized source file
type, object file assumed".
2014-03-18 19:41:06 +02:00
huili2
090e8cc1ed
modify WELS_CLIP1 to be inline functions
2014-03-18 01:54:25 -07:00
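As a rough illustration of turning a clipping macro into an inline function, the sketch below clamps a value to the 8-bit range. Clip1 is a hypothetical name and the body is an assumption, not the library's implementation. Unlike a macro, the function evaluates its argument exactly once and has a checked parameter and return type.

```cpp
#include <cstdint>

// Hypothetical inline replacement for a clipping macro: clamp to [0, 255].
static inline uint8_t Clip1(int32_t value) {
  if (value < 0) return 0;
  if (value > 255) return 255;
  return static_cast<uint8_t>(value);
}
```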
volvet
b21411ad7c
Merge pull request #511 from mstorsjo/remove-unused-define
...
Remove the unused FORMAT_COFF define
2014-03-18 16:11:22 +08:00
volvet
fb1958ad13
Merge pull request #519 from mstorsjo/push-xmm-registers
...
Backup/restore the xmm6-xmm15 SSE registers within asm functions on win64
Reviewed by zhiliang
2014-03-18 15:04:54 +08:00
volvet
b5353c8455
Merge pull request #516 from mstorsjo/fix-yasm-64bit
...
Fix building with yasm in 64 bit mode
2014-03-18 09:29:42 +08:00