Commit Graph

458 Commits

Author SHA1 Message Date
sijchen
a917444b2e [Encoder ME] add memory allocation basics for FME 2014-04-01 15:52:13 +08:00
ruil2
e7603d8fbb use function pointer in reference frame management 2014-04-01 14:52:55 +08:00
Licai Guo
9a81260b24 Merge pull request #605 from volvet/clean_mt_macro
clean multi-threading macro
2014-04-01 14:50:31 +08:00
ganyang
eb6f6ecf40 Add EncoderMB UT test file, and reformat UT files. 2014-04-01 13:55:22 +08:00
Licai Guo
fa9735b331 Merge pull request #602 from sijchen/fme_merge22
[Encoder ME] Add alternative search methods
2014-04-01 10:23:24 +08:00
ruil2
6bb23f5df4 use WelsLog instead of fprintf to have a unified trace output 2014-04-01 09:24:57 +08:00
volvet
9f50e0c91e clean multi-threading macro 2014-03-31 18:24:10 -07:00
volvet
cad753d871 Merge pull request #603 from ruil2/vp_update2
add scene change for screen content

Approved by Sijia.
2014-03-31 13:59:29 -07:00
ruil2
aed919a65a add scene change for screen content 2014-03-31 17:14:00 +08:00
sijchen
69983d6df4 Add alternative search methods 2014-03-31 16:11:31 +08:00
unknown
a128d7f790 add usagetype setting for screen content 2014-03-31 14:00:25 +08:00
Licai Guo
881298ed31 Merge pull request #595 from sijchen/fme_merge12
[Encoder ME] Add feature search basic functions
2014-03-31 08:59:09 +08:00
Varun B Patil
6663743f4c Remove compiler warnings 2014-03-30 15:13:29 +05:30
ruil2
4751fe7690 add scene change detection in workflow for screen content 2014-03-28 11:30:51 +08:00
sijchen
12616019b6 Add feature search basic functions 2014-03-28 11:21:30 +08:00
ruil2
63cef0f0f4 add preprocessing parameter for screen content 2014-03-28 10:06:42 +08:00
sijchen
a60af6a750 add function pointer 2014-03-28 09:09:21 +08:00
volvet
f7fba4b122 Merge pull request #580 from ylatuya/api
Prefix API with the Wels namespace
2014-03-26 15:45:02 -07:00
sijchen
59f243b487 Adjust function interface and add void function for further coworking, adjust test accordingly 2014-03-26 16:52:53 +08:00
sijchen
bbe016543f Add basic cross search functions and its unit tests 2014-03-26 16:23:44 +08:00
Andoni Morales Alastruey
328740f294 Prefix API with the Wels namespace 2014-03-25 17:40:01 +01:00
ruil2
6b3f89d582 move some common functions to common.cpp and add some functions in common 2014-03-25 15:35:55 +08:00
sijchen
fcae0c7c48 Change the output of diamond search from qpel to interpel 2014-03-25 11:03:37 +08:00
sijchen
99f3bd69c4 Add checking directional MV in ME initial point 2014-03-24 14:16:16 +08:00
Martin Storsjö
b6883b4ef8 Make the iRCMode field use the RC_MODES type instead of plain int
This makes it even clearer for users about how to set this field.
2014-03-21 09:19:30 +02:00
Martin Storsjö
2bc8e61fcf Move the RC_MODES enum to the public header
This allows users to know what values to set for the iRCMode
parameter.
2014-03-21 09:18:17 +02:00
Licai Guo
65e8560dc7 Merge pull request #560 from ruil2/encoder_nal
add uiMaxNalSize to support the maximum nal size setting
2014-03-21 12:52:32 +08:00
Licai Guo
7a29b1f55a Merge pull request #549 from lyao2/rc_tune
RC LOWBR mode merge
2014-03-21 09:15:18 +08:00
ruil2
fd2c950778 add uiMaxNalSize to support the maximum nal size setting 2014-03-21 08:59:38 +08:00
Licai Guo
58966cb2e8 Merge pull request #558 from ruil2/encoder_level
add leve parameter, update profile and usagetype type
2014-03-20 17:16:40 +08:00
ruil2
e6c072b364 add leve parameter, update profile and usagetype type 2014-03-20 17:02:32 +08:00
sijchen
0ea480323e expand MVD table and rename some macros 2014-03-20 16:56:43 +08:00
Licai Guo
9d73d273ff Merge pull request #554 from ruil2/encoder_update
add maxbitrate parameter
2014-03-20 14:57:54 +08:00
ruil2
258185f8c2 add maxbitrate parameter 2014-03-20 14:30:20 +08:00
sijchen
e0aed6e4e7 add static 2014-03-20 14:19:55 +08:00
lyao2
071254748f avoid QP sudden fluctates 2014-03-20 13:13:32 +08:00
sijchen
c00bec2aa6 refactor the setting of function pointer for simplification 2014-03-20 09:51:57 +08:00
lyao2
4bc881c3ae RC LOWBR mode merge 2014-03-20 09:26:16 +08:00
Ethan Hugg
e8540af9eb Merge pull request #541 from licaiguo/disable-warnings
disable most warnings produced by -Wall
2014-03-19 09:17:34 -07:00
ruil2
028d39077f Merge pull request #545 from mstorsjo/remove-extra-parentheses
Remove unnecessary/superfluous parentheses in slice_multi_threading.cpp
2014-03-19 16:39:01 +08:00
Licai Guo
a688f5278a fix most of the warnings 2014-03-19 01:16:08 -07:00
Martin Storsjö
d75f677034 Remove unnecessary/superfluous parentheses in slice_multi_threading.cpp 2014-03-19 10:15:29 +02:00
ruil2
e74f01ad47 use the same frame type EVideoFrameType in encoder internal 2014-03-19 16:11:06 +08:00
ruil2
3238c913cc Merge pull request #535 from volvet/add-scene-change-detector
Add scene change detector
2014-03-19 14:52:08 +08:00
volvet
7313ecdbd0 Merge pull request #538 from mstorsjo/use-apple-builtin-define
Use __APPLE__ instead of APPLE_IOS for apple/arm specific features
2014-03-19 09:45:56 +08:00
Licai Guo
4bbe61a783 Merge pull request #537 from mstorsjo/rename-x86-asm
Rename the asm subdirectories to x86
2014-03-19 08:51:39 +08:00
Licai Guo
d897d362ab Merge pull request #532 from huili2/WELS_CLIP1
Modify MACRO WELS_CLIP1 as inline functions
2014-03-19 08:50:04 +08:00
Martin Storsjö
9586c59b9e Use __APPLE__ instead of APPLE_IOS in the arm assembly sources 2014-03-18 23:15:49 +02:00
Martin Storsjö
ed9c03408f Rename the asm subdirectories to x86
This is consistent with having the arm assembly in a subdirectory
called arm.
2014-03-18 23:09:45 +02:00
Ethan Hugg
197423f271 Merge pull request #520 from ylatuya/master
Fix compiler warnings and remove dead code
2014-03-18 13:28:02 -07:00
Andoni Morales Alastruey
ae60f1bee9 Fix compiler warnings and remove dead code
Fix several -Werror=unused-variable and -Werror=unused-but-set-variable
and removed dead code found with this warnings
2014-03-18 19:15:25 +01:00
Martin Storsjö
e1b5e038d2 Use .obj as suffix for object files on MSVC
This avoids warnings when linking about "unrecognized source file
type, object file assumed".
2014-03-18 19:41:06 +02:00
volvet
d7b7419040 add scene change detector for further extension 2014-03-18 17:54:58 +08:00
ruil2
9cf9238cfc fix bug that there is no output in encoder console 2014-03-18 17:38:48 +08:00
huili2
3b270aa901 remove unncessary cast 2014-03-18 02:15:57 -07:00
huili2
090e8cc1ed modify WELS_CLIP1 to be inline functions 2014-03-18 01:54:25 -07:00
volvet
b21411ad7c Merge pull request #511 from mstorsjo/remove-unused-define
Remove the unused FORMAT_COFF define
2014-03-18 16:11:22 +08:00
Martin Storsjö
4c829a12e2 Fix the comment in welsEncoderExt.h about the EncodeFrame return value
This was changed in 36d56b6638 in the public api, but the
internal implementation header was missed and left inconsistent.
2014-03-18 10:07:23 +02:00
volvet
fb1958ad13 Merge pull request #519 from mstorsjo/push-xmm-registers
Backup/restore the xmm6-xmm15 SSE registers within asm functions on win64

Reviewed by zhiliang
2014-03-18 15:04:54 +08:00
Licai Guo
3956a32d41 Merge pull request #524 from sijchen/me_refactor33
Expand structure of MD and ME
2014-03-18 12:51:05 +08:00
Licai Guo
37fa5f554e Merge pull request #513 from ruil2/encoder_interface
Encoder interface
2014-03-18 09:51:32 +08:00
sijchen
7f0c7daad9 expand structure of MD and ME 2014-03-18 09:47:05 +08:00
volvet
b5353c8455 Merge pull request #516 from mstorsjo/fix-yasm-64bit
Fix building with yasm in 64 bit mode
2014-03-18 09:29:42 +08:00
volvet
e75cd2298b Merge pull request #517 from mstorsjo/simplify-x86-asm-func-macro
Fold ALIGN 16 and the function label into WELS_EXTERN
2014-03-18 09:29:17 +08:00
Martin Storsjö
29a0c77acf Don't clobber q4-q7 in WelsIntra16x16Combined3Satd_neon
This is similar to what is done in other neon functions. This
function was missed since it isn't covered by the current
set of unittests.
2014-03-17 20:04:53 +02:00
Martin Storsjö
4633626d69 Remove XMMREG_PROTECT
This isn't necessary any longer, when all the assembly routines
take care of restoring registers as necessary.
2014-03-17 13:47:01 +02:00
Martin Storsjö
3cf52554f7 Backup/restore the xmm6-xmm15 SSE registers within asm functions on win64
According to the Win64 ABI, these registers need to be preserved,
and compilers are allowed to rely on their content to stay
available - not only for float usage but for any usage, anywhere,
in the calling C++ code.

This adds a macro which pushes the clobbered registers onto the
stack if targeting win64 (and a matching one which restores them).
The parameter to the macro is the number of xmm registers used
(e.g. if using xmm0 - xmm7, the parameter is 8), or in other
words, the number of the highest xmm register used plus one.

This is similar to how the same issue is handled for the NEON
registers q4-q7 with the vpush instruction, except that they needed
to be preserved on all platforms, not only on one particular platform.

This allows removing the XMMREG_PROTECT_* hacks, which can
easily fail if the compiler chooses to use the callee saved
xmm registers in an unexpected spot.
2014-03-17 13:44:33 +02:00
Martin Storsjö
9293f2f947 Remove commented out rodata sections and tables in assembly files 2014-03-17 13:42:18 +02:00
Martin Storsjö
eec968234d Fold ALIGN 16 and the function label into WELS_EXTERN
This simplifies the structure for all x86 assembly functions,
reducing the amount of duplicated code structure.
2014-03-17 13:35:00 +02:00
Martin Storsjö
04f5bcd68d Use movsxd in SIGN_EXTENSION
This is what nasm ended up assembling movsx with 32 bit input to
anyway.

Keep using plain movsx for 16 bit input.

This fixes building with yasm in 64 bit mode.
2014-03-17 13:26:46 +02:00
Martin Storsjö
f96918283f Remove commented out code for old, 32-bit only x86 assembly function prologues/epilogues 2014-03-17 11:20:11 +02:00
Licai Guo
fc4e0cacec Merge pull request #483 from volvet/develop_b
use large/medium/similar to define scene change result
2014-03-17 16:32:31 +08:00
Licai Guo
b5a4d706b9 Merge pull request #496 from mstorsjo/use-sign-extend-macro
Use the SIGN_EXTENSION macro where possible
2014-03-17 16:31:03 +08:00
ruil2
895c0ff635 fix typo 2014-03-17 12:09:52 +08:00
ruil2
36abe317a5 modify unit test for return type modification 2014-03-17 11:56:19 +08:00
ruil2
36d56b6638 modify EncoderFrame return type.
commit b99a307ab94183c32a293ad5fda8b0e3323546a0
Author: ruil2 <ruil2@cisco.com>
Date:   Wed Mar 12 13:34:27 2014 +0800

    fix typo
2014-03-17 10:46:38 +08:00
Licai Guo
2c796337ba Merge pull request #510 from huili2/remove_basemb
remove BASE_MB related code
2014-03-17 08:46:25 +08:00
Martin Storsjö
fc260b39e0 Remove the unused FORMAT_COFF define
Nothing in the project currently sets FORMAT_COFF - the other generic
branch works just fine on windows.
2014-03-16 17:54:55 +02:00
Martin Storsjö
eb238e6549 Use the SIGN_EXTENSION macro where possible
This shortens the x86 assembly by 134 lines in total.
2014-03-16 17:54:24 +02:00
volvet
e654bf6b7f Merge pull request #490 from ruil2/encoder_slice_auto
fix dump file issue
2014-03-16 15:41:26 +08:00
volvet
e606bae0e9 Merge pull request #504 from mstorsjo/fix-function-name-typo
Fix a typo, Smple -> Sample
2014-03-16 10:18:03 +08:00
Licai Guo
6f2b98975e Merge pull request #502 from mstorsjo/fix-macro-indentation
Fix the indentation of some nasm macros
2014-03-15 07:16:37 +08:00
Martin Storsjö
f4fdb15397 Fix a typo, Smple -> Sample 2014-03-14 23:30:09 +02:00
Martin Storsjö
4d120781c1 Fix the indentation of some nasm macros 2014-03-14 22:26:33 +02:00
Martin Storsjö
b3d04d88a0 Check for the right function pointer
This code checked whether one function pointer was non-null,
but the went on to call a different function pointer. Check
for the one that actually was called.
2014-03-14 22:20:40 +02:00
volvet
6da9a9e5c8 Merge pull request #489 from sijchen/me_refactor22
refactor ME for easier adding other search methods
2014-03-14 17:53:10 +08:00
huili2
b1f596fd69 remove BASE_MB related code 2014-03-14 02:03:41 -07:00
Martin Storsjö
9199798f22 Fix a typo in a macro name, EXTENTION -> EXTENSION 2014-03-14 10:13:18 +02:00
unknown
94f8c351ca fix dump file issue 2014-03-14 15:13:24 +08:00
sijchen
6c3d83a8ac refactor ME for easier adding other search methods 2014-03-14 15:04:35 +08:00
volvet
6714b8ae99 Merge pull request #463 from mstorsjo/dont-clobber-neon-registers
Avoid clobbering the neon registers q4-q7

Review and verified by zhilwang
2014-03-14 10:28:55 +08:00
volvet
fc5c48830a fix the condition of scene change flag and comments 2014-03-14 09:53:24 +08:00
volvet
c8761c08ae use large/medium/similar to define scene change result 2014-03-13 10:43:20 +08:00
volvet
8962b7c98b Merge pull request #482 from sijchen/me_refactor1
mv range setting refactor
2014-03-13 10:21:39 +08:00
sijchen
d809a7981b mv range setting refactor 2014-03-13 10:18:01 +08:00
volvet
8b907c18fd fix idr interval issue 2014-03-12 17:38:25 +08:00
ruil2
c7f2a0b7f6 3Author: ruil2 <ruil2@cisco.com>
modify the parameter verification for SM_AUTO_SLICE mode -- uiSliceNum
 iis ignored
2014-03-12 10:44:13 +08:00
ruil2
7c8ce799c0 fix SM_FIXEDSLCNUM_SLICE bug, add SM_AUTO_SLICE mode 2014-03-11 10:23:46 +08:00
Martin Storsjö
c011890764 Push clobbered neon registers on the stack
According to the calling convention, the registers q4-q7 should be
preserved by functions. The caller (generated by the compiler) could
be using those registers anywhere for any intermediate data.

Functions that use more than 12 of the qX registers must push
the clobbered registers on the stack in order to be able to restore them
afterwards.

In functions that don't use all 16 registers, but clobber some of
the callee saved registers q4-q7, one or more of them are remapped
to reduce the number of registers that have to be saved/restored.

This incurs a very small (around 0.5%) slowdown in the decoder and
encoder.
2014-03-10 22:07:36 +02:00
Martin Storsjö
811c647c0e Remap registers to avoid clobbering the neon registers q4-q7
According to the calling convention, the registers q4-q7 should be
preserved by functions. The caller (generated by the compiler) could
be using those registers anywhere for any intermediate data.

Functions that use 12 or less of the qX registers can avoid
violating the calling convention by simply using other registers instead
of the callee saved registers q4-q7.

This change only remaps the registers used within functions - therefore
this does not affect performance at all. E.g. in functions using
registers q0-q7, we now use q0-q3 and q8-q11 instead.
2014-03-10 22:07:25 +02:00