Compare commits

557 Commits

Author SHA1 Message Date
HaiboZhu
8fcef67c70 Merge pull request #2539 from HaiboZhu/Update_binary_files_location_v1.6.0
Update the binary files location for release v1.6.0
2016-07-22 09:16:23 +08:00
HaiboZhu
241700cb96 Merge pull request #2537 from mstorsjo/msvc-warning-fix
Silence warnings about conversions in MSVC
2016-07-21 16:09:55 +08:00
HaiboZhu
0f21b2b02e Merge pull request #2535 from mstorsjo/silence-warnings
Silence warnings with GCC 5.4
2016-07-21 16:08:38 +08:00
Haibo Zhu
9b6476e98a Update the binary files location for release v1.6.0 2016-07-21 09:11:21 +08:00
Martin Storsjö
350cafd69a Silence warnings about conversions in MSVC
The kind of cast (static_cast vs C style cast) is different to
match the closely surrounding code.
2016-07-20 12:10:34 +03:00
Martin Storsjö
91331e1ba4 Silence warnings with GCC 5.4
This fixes warnings like the following:

codec/decoder/core/src/mv_pred.cpp: In function ‘void WelsDec::PredPSkipMvFromNeighbor(WelsDec::PDqLayer, int16_t*)’:
codec/decoder/core/src/mv_pred.cpp:158:51: warning: ‘iLeftTopXy’ may be used uninitialized in this function [-Wmaybe-uninitialized]

codec/processing/src/backgrounddetection/BackgroundDetection.cpp: In member function ‘void WelsVP::CBackgroundDetection::ForegroundDilation(WelsVP::SBackgroundOU*, WelsVP::SBackgroundOU**, WelsVP::CBackgroundDetection::vBGDParam*, int32_t)’:
codec/processing/src/backgrounddetection/BackgroundDetection.cpp:281:63: warning: suggest parentheses around operand of ‘!’ or change ‘|’ to ‘||’ or ‘!’ to ‘~’ [-Wparentheses]

For the possibly uninitialized variables, this is similar to earlier
commits 8be8fe17 and af2666fd.
2016-07-20 11:53:10 +03:00
HaiboZhu
1f770c488c Merge pull request #2531 from GuangweiWang/enable-disable-AVX2
add option for enable/disable AVX2
2016-07-20 13:49:31 +08:00
HaiboZhu
5a8f5e8cf1 Merge pull request #2520 from ruil2/rc1
change FrameQP control and complexity calculation
2016-07-18 15:37:02 +08:00
Guangwei Wang
7d00e8bc42 add option for enable/disable AVX2 2016-07-15 12:15:57 +08:00
HaiboZhu
8980731be0 Merge pull request #2527 from ruil2/init
init samplebuffer
2016-07-13 14:21:01 +08:00
huili2
a43841d0e9 Merge pull request #2526 from HaiboZhu/Bugfix_CHP_support
Fixes for CHP support
2016-07-13 13:31:17 +08:00
Karina
9d89a6976e init samplebuffer 2016-07-13 11:20:36 +08:00
Haibo Zhu
687f9eff1b (1) remove the weighted prediction syntax limit
(2) fix the 4:0:0 support bug
2016-07-13 09:58:58 +08:00
HaiboZhu
d97f4c5b68 Merge pull request #2524 from HaiboZhu/Change_SharedLibVersion
Update the SharedLibVersion to make the compatibility
2016-07-12 09:01:28 +08:00
Haibo Zhu
03add69386 Update the SharedLibVersion to make the compatibility 2016-07-11 15:20:22 +08:00
Karina
ffe11835fc change FrameQP control and complexity calculation 2016-07-11 10:20:17 +08:00
HaiboZhu
44d8560698 Merge pull request #2519 from HaiboZhu/Update_ReadMe_for_master
Update the wrong description in README.md
2016-07-11 08:39:50 +08:00
Haibo Zhu
3a6ed92a35 Update the wrong description in README.md 2016-07-08 17:13:48 +08:00
huili2
842b4f0243 Merge pull request #2516 from HaiboZhu/Add_release_binary_files_v1.6.0
Update the binary files location for openh264 release 1.6.0 in master branch
2016-07-08 09:25:11 +08:00
Haibo Zhu
d5a8c84409 Update the binary files location for openh264 release 1.6.0 2016-07-08 08:58:50 +08:00
HaiboZhu
4e3df2619a Merge pull request #2512 from HaiboZhu/Update_v1.6_information
Update the release note and readme files for version 1.6
2016-07-05 08:41:52 +08:00
Haibo Zhu
af8240a440 Update the release note and readme files for version 1.6 2016-07-04 10:45:23 +08:00
HaiboZhu
ec70649261 Merge pull request #2509 from pengyanhai/master
Make sure the output resolution of encoder doesn't exceed the OpenH264 capability
2016-07-01 14:42:13 +08:00
Hank Peng
00e89b89f0 Make sure the output resolution of encoder doesn't exceed the OpenH264 capability 2016-06-30 22:32:30 -07:00
HaiboZhu
acab999b45 Merge pull request #2507 from GuangweiWang/master
rename debug symbols file's name
2016-06-29 10:00:30 +08:00
Guangwei Wang
f8516bb8af fix 2016-06-28 15:58:25 +08:00
Guangwei Wang
e77a101885 fix bug 2016-06-28 15:03:00 +08:00
Guangwei Wang
ac65f3adc8 rename debug symbols file's name 2016-06-27 12:41:59 +08:00
HaiboZhu
b2d6902176 Merge pull request #2504 from ruil2/adaptive
fix QP issue when adaptive quant turns on
2016-06-22 14:13:18 +08:00
Karina
35e073714d fix QP issue when adaptive quant turns on 2016-06-22 13:33:50 +08:00
HaiboZhu
60cbb77583 Merge pull request #2500 from ruil2/downsampling
use average downsampling first, then general downsampling
2016-06-21 10:11:44 +08:00
ruil2
c6356ca8fc Merge pull request #2499 from saamas/encoder-avoid-valgrind-downsampling-false-positives
[Encoder] Avoid valgrind downsampling false positives
2016-06-21 09:17:37 +08:00
Karina
7c0ca2fc14 use average downsampling first, then general downsampling, when dst resolution > 1/4 source resolution and dst resolution < 1/2 source resolution 2016-06-17 10:30:47 +08:00
ruil2
6a86e37849 Merge pull request #2495 from saamas/processing-dyadic-bilinear-downsample-optimizations
[UT] Allow for different output depending on downsample average order
2016-06-17 09:18:46 +08:00
Sindre Aamås
f14fb2cfbc [UT] Allow for different output depending on downsample average order
Avoid X86_ASM ifdef.

Ideally, we may want to update all routines to average vertically
first, which would make this unnecessary. In the interim, this
enables the tests to run successfully on x86 without SSSE3 support
again.
2016-06-16 22:07:58 +02:00
Sindre Aamås
0f7b8365b9 [Encoder] Avoid valgrind downsampling false positives
X86 SIMD downsampling routines may, for convenience, read slightly beyond
the input data and into the alignment padding area beyond each line. This
causes valgrind to warn about uninitialized values even if these values
only affect lanes of a SIMD vector that are effectively never used.

Avoid these false positives by zero-initializing the padding area beyond
each line of the source buffer used for downsampling.
2016-06-16 21:19:17 +02:00
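Note on 0f7b8365b9: the fix it describes (zero-initializing the padding beyond each line) can be pictured with a minimal sketch. The helper name `ZeroLinePadding` and its parameters are hypothetical and are not taken from the OpenH264 source; the sketch only assumes a buffer whose lines start `iStride` bytes apart and hold `iWidth` valid bytes each.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not from the OpenH264 source): clear the alignment
// padding that follows the valid pixels of every line, so that SIMD loads
// reading slightly past iWidth only ever see initialized bytes.
void ZeroLinePadding (uint8_t* pBuf, int iWidth, int iHeight, int iStride) {
  assert (iWidth >= 0 && iWidth <= iStride);
  for (int y = 0; y < iHeight; ++y) {
    uint8_t* pPad = pBuf + (size_t)y * iStride + iWidth;
    memset (pPad, 0, (size_t) (iStride - iWidth));  // bytes [iWidth, iStride)
  }
}
```

Only the padding is written; the valid pixels of each line are left untouched, so the visible output is unchanged.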
HaiboZhu
5637df510e Merge pull request #2498 from mstorsjo/android-avoid-stl-include
Use assert.h instead of cassert
2016-06-16 09:21:03 +08:00
Martin Storsjö
e945654f06 Use assert.h instead of cassert
This fixes building for android differently than in f5e483ce.

On android, <cassert> isn't available in the normal include path,
only when the STL headers are available.

We intentionally avoid using STL within the main libopenh264.so, to
simplify dependency chains for users of the library (which otherwise
could run into conflicts if the surrounding app would want to use
a different STL implementation).

The previous fix only provided headers, not actually linking
against STL, so at this point it's not a real issue yet, but it's
still a very slippery slope towards accidentally starting relying on
STL within the core library.

Instead explicitly avoid using STL within the core library, by not
even providing the include path.
2016-06-15 21:06:11 +03:00
HaiboZhu
8661e358c0 Merge pull request #2497 from GuangweiWang/master
fix android build issue
2016-06-15 14:05:53 +08:00
Guangwei Wang
f5e483ce95 fix android build issue 2016-06-15 13:19:41 +08:00
HaiboZhu
2e6c9f7cd3 Merge pull request #2496 from saamas/processing-relax-downsample-buffer-size-requirement
[Processing] Relax downsample buffer size requirement
2016-06-15 10:31:53 +08:00
HaiboZhu
d35647ec3b Merge pull request #2491 from ruil2/nalsize
add nalsize checking UT and fix nalsize control when cabac on
2016-06-15 10:24:18 +08:00
HaiboZhu
151a7ff643 Merge pull request #2490 from sijchen/refactor_ref4
[Encoder] refactor: to avoid only use idx0 in syntax writing, for now it has no impact on bs
2016-06-15 10:23:38 +08:00
HaiboZhu
84a7669b63 Merge pull request #2464 from bumblebritches57/MVC
MVC aka Stereoscopic 3D support
2016-06-15 10:05:15 +08:00
ruil2
4b6f037020 Merge pull request #2489 from saamas/processing-dyadic-bilinear-downsample-optimizations
[Processing] DyadicBilinearDownsample optimizations
2016-06-12 10:02:55 +08:00
Sindre Aamås
fe4a47a979 [UT] Add comment on X86_ASM checksum ifdef 2016-06-08 21:53:30 +02:00
Karina
b5cef5d49c modify reserved nal header size and change source frame in NalSizeChecking UT 2016-06-08 10:12:27 +08:00
sijchen
94c94ca3b1 Merge pull request #2493 from ruil2/configure
modify comments in configure file
2016-06-07 14:41:21 -07:00
sijchen
4c8458f7ff Merge pull request #2494 from ruil2/stat
use the correct frametype in statistics info
2016-06-07 14:41:12 -07:00
Karina
40f4fc05bb get each spatial layer qp 2016-06-06 17:13:22 +08:00
Karina
c1255451d7 use the correct frametype in statistics info 2016-06-06 17:06:56 +08:00
Karina
02218e2dbd modify configure file comments 2016-06-06 16:22:09 +08:00
ruil2
106d13d26c Merge pull request #2492 from saamas/processing-x86-downsample-use-lddqu
[Processing/x86] Use lddqu in case we still run on anything that benefits
2016-06-06 12:46:55 +08:00
Sindre Aamås
f183891c5b [Processing/x86] Use lddqu in case we still run on anything that benefits 2016-06-04 00:41:35 +02:00
Sindre Aamås
5a9c6db335 [Processing] Relax downsample buffer size requirement
AFAICT, it is sufficient that the sample buffer has space for half
the source width/height. With the current sample buffer size, this
enables its use for resolutions up to 3840x2176.
2016-06-03 15:14:09 +02:00
Sindre Aamås
68a5910f8f [Processing] Clear LSB before rounding up dyadic downsample width 2016-06-03 12:03:01 +02:00
Karina
2171d84f1e add nalsize checking UT and fix nalsize control when cabac on 2016-06-03 17:36:14 +08:00
ruil2
3eba80765c Merge pull request #2487 from sijchen/refactor_ref31
[Encoder] Preprocess: refactor to improve code readability
2016-06-03 13:39:04 +08:00
sijchen
1fa02f6b07 Merge pull request #2488 from ruil2/codingIdx1
fix codingIdx update issue
2016-06-02 10:00:56 -07:00
Karina
4f41c3a5bf fix codingIdx update issue 2016-06-02 21:17:31 +08:00
Sindre Aamås
8a0af4a3f2 [Processing/x86] DyadicBilinearDownsample optimizations
Average vertically before horizontally; horizontal averaging is more
worksome. Doing the vertical averaging first reduces the number of
horizontal averages by half.

Use pmaddubsw and pavgw to do the horizontal averaging for a slight
performance improvement.

Minor tweaks.

Improve the SSSE3 dyadic downsample routines and drop the SSE4 routines.
The non-temporal loads used in the SSE4 routines do nothing for cache-
backed memory AFAIK.

Adjust tests because averaging vertically first gives slightly different
output.

~2.39x speedup for the widthx32 routine on Haswell when not memory-bound.
~2.20x speedup for the widthx16 routine on Haswell when not memory-bound.

Note that the widthx16 routine can be unrolled for further speedup.
2016-06-02 13:44:28 +02:00
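Note on 8a0af4a3f2: the remark that averaging vertically first "gives slightly different output" follows from rounding. A scalar sketch, assuming each averaging step rounds as `(a + b + 1) >> 1` (the behaviour of the x86 `pavgb` instruction); the function names below are hypothetical and this is not the SIMD code from the commit.

```cpp
#include <assert.h>
#include <stdint.h>

// Rounded average of two bytes, same rounding as pavgb.
static inline uint8_t RoundAvg (uint8_t a, uint8_t b) {
  return (uint8_t) ((a + b + 1) >> 1);
}

// 2x2 block reduced by averaging each column first, then the two results.
static inline uint8_t Block2x2VerticalFirst (uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br) {
  return RoundAvg (RoundAvg (tl, bl), RoundAvg (tr, br));
}

// 2x2 block reduced by averaging each row first, then the two results.
static inline uint8_t Block2x2HorizontalFirst (uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br) {
  return RoundAvg (RoundAvg (tl, tr), RoundAvg (bl, br));
}

// Hypothetical scalar 2:1 downsampler using the vertical-first order.
void DyadicDownsampleSketch (uint8_t* pDst, int iDstStride,
                             const uint8_t* pSrc, int iSrcStride,
                             int iSrcWidth, int iSrcHeight) {
  assert (iSrcWidth % 2 == 0 && iSrcHeight % 2 == 0);
  for (int y = 0; y < iSrcHeight / 2; ++y) {
    const uint8_t* pTop = pSrc + 2 * y * iSrcStride;
    const uint8_t* pBot = pTop + iSrcStride;
    for (int x = 0; x < iSrcWidth / 2; ++x) {
      pDst[y * iDstStride + x] = Block2x2VerticalFirst (pTop[2 * x], pTop[2 * x + 1],
                                                        pBot[2 * x], pBot[2 * x + 1]);
    }
  }
}
```

For the block with top row (0, 1) and bottom row (2, 1), the vertical-first order yields 1 while the horizontal-first order yields 2, which is the kind of one-step difference the adjusted tests have to tolerate. In a SIMD routine the vertical step halves the amount of data before the more expensive horizontal step runs.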
Sindre Aamås
7cbb75eac6 [Processing] Pick dyadic downsample function based on stride
Assume that data can be written into the padding area following each
line. This enables the use of faster routines for more cases.

Align downsample buffer stride to a multiple of 32.

With this all strides used should be a multiple of 16, which means
that use of narrower downsample routines can be dropped altogether.
2016-06-02 13:44:28 +02:00
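Note on 7cbb75eac6: rounding a stride up to a multiple of 32 is commonly done with a mask, and any multiple of 32 is also a multiple of 16. A minimal sketch; `AlignUp` is a hypothetical helper for illustration and is not the project's `WELS_ALIGN` macro.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: round iValue up to the next multiple of iAlign.
// iAlign must be a power of two for the mask trick to be valid.
static inline int32_t AlignUp (int32_t iValue, int32_t iAlign) {
  assert (iAlign > 0 && (iAlign & (iAlign - 1)) == 0);
  return (iValue + iAlign - 1) & ~ (iAlign - 1);
}
```

For example, a width of 961 would get a stride of 992 under this scheme, which leaves 31 bytes of padding after each line for routines that write past the visible width.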
Sindre Aamås
770e48ac2b [Processing] Remove unused align macros
The WELS_ALIGN macro here aliases the WELS_ALIGN macro in macros.h
which is inconvenient. Just remove these unused macros.
2016-06-02 13:44:28 +02:00
sijchen@cisco.com
a7ae1efc3a add back the missing part after merging and formatting 2016-06-01 21:33:33 -07:00
sijchen@cisco.com
8bacc3d4d0 Preprocess: refactor to improve code readability 2016-06-01 21:26:24 -07:00
sijchen
f6b6a0f6aa Merge pull request #2485 from ruil2/init
remove redundant initialization
2016-06-01 09:28:02 -07:00
sijchen@cisco.com
8537a9274d fix a prob 2016-06-01 09:21:12 -07:00
sijchen@cisco.com
a9601cdc59 refactor to avoid only use idx0 in syntax writing, for now it has no impact on bs, may benefit future usage 2016-06-01 09:21:12 -07:00
Karina
268a0eb6f4 remove redundant initialization 2016-06-01 10:52:51 +08:00
HaiboZhu
515eeb41e4 Merge pull request #2481 from ruil2/maxbitrate1
fix iContinualSkipFrames calculation
2016-06-01 09:03:57 +08:00
HaiboZhu
7ccc377d55 Merge pull request #2480 from ruil2/fix
fix removing parameter setting wrongly
2016-06-01 09:03:49 +08:00
ruil2
2d3fc37a07 Merge pull request #2484 from sijchen/refactor_preprocess13
[Encoder] Refactor: add class for diff preprocess strategy
2016-06-01 08:31:02 +08:00
Karina
87e81a7a40 use the same name to avoid confusion. 2016-06-01 08:21:03 +08:00
sijchen@cisco.com
03863ae4c6 different preprocess actually used diff source picture management 2016-05-31 14:36:21 -07:00
sijchen@cisco.com
a1cae49732 add class for diff preprocess strategy 2016-05-31 13:48:45 -07:00
sijchen
c29da290b9 Merge pull request #2479 from ruil2/refine_rc1
get the correct did for savc case
2016-05-31 10:58:38 -07:00
Karina
dd021b6ca8 fix iContinualSkipFrames calculation 2016-05-31 21:01:11 +08:00
Karina
8effa45edd fix removing parameter setting 2016-05-31 20:46:13 +08:00
Karina
64ad70b0ea get the correct did for savc case 2016-05-31 17:35:20 +08:00
HaiboZhu
df77a5d587 Merge pull request #2478 from ruil2/refine_rc1
refine RC
2016-05-31 17:20:46 +08:00
Karina
4fc2b1f636 refine RC 2016-05-31 16:44:04 +08:00
HaiboZhu
3f199f92a9 Merge pull request #2477 from ruil2/add_param_configure
add savc setting in configure file and command line
2016-05-31 16:33:40 +08:00
Karina
7f2ba4dcb6 add savc setting in configure file and command line 2016-05-31 13:53:31 +08:00
HaiboZhu
1d2b52e4cc Merge pull request #2476 from ruil2/did1
fix dependency ID mapping issue
2016-05-31 11:08:16 +08:00
Karina
e3c306608c fix dependency ID mapping issue 2016-05-30 15:03:39 +08:00
ruil2
39c2fb3d6b Merge pull request #2472 from saamas/processing-x86-general-bilinear-downsample-optimizations
[Processing/x86] GeneralBilinearDownsample optimizations
2016-05-27 15:17:31 +08:00
Sindre Aamås
563376df0c [UT] Test downsampling routines with a wider variety of height ratios 2016-05-25 14:16:29 +02:00
HaiboZhu
c17a58efdf Merge pull request #2473 from ruil2/update_interface
modify the interface to use an independent subseqID for each layer
2016-05-25 10:00:13 +08:00
HaiboZhu
780101fcfd Merge pull request #2474 from ruil2/overflow
avoid overflow
2016-05-25 09:59:36 +08:00
Karina
2ef9613e55 avoid overflow 2016-05-24 13:25:05 +08:00
Sindre Aamås
4fec6d581e [UT] Test generic downsampling routines with a wider variety of width ratios
Get coverage of all code paths for routines that branch to different
paths for different scaling ratios.
2016-05-23 20:23:47 +02:00
Sindre Aamås
e490215990 [Processing/x86] Add an AVX2 implementation of GeneralBilinearAccurateDownsample
Keep track of relative pixel offsets and utilize pshufb to efficiently
extract relevant pixels for horizontal scaling ratios <= 8. Because
pshufb does not cross 128-bit lanes, the overhead of address
calculations and loads is relatively greater as compared with an
SSSE3/SSE4.1 implementation.

Fall back to a generic approach for ratios > 8.

The implementation assumes that data beyond the end of each line,
before the next line begins, can be dirtied; which AFAICT is safe with
the current usage of these routines.

Speedup is ~8.52x/~6.89x (32-bit/64-bit) for horizontal ratios <= 2,
~7.81x/~6.13x for ratios within (2, 4], ~5.81x/~4.52x for ratios
within (4, 8], and ~5.06x/~4.09x for ratios > 8 when not memory-bound
on Haswell as compared with the current SSE2 implementation.
2016-05-23 20:23:47 +02:00
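Note on e490215990: "pshufb does not cross 128-bit lanes" refers to the 256-bit AVX2 form of the instruction, where each 16-byte half of the register is shuffled independently. A scalar model of that behaviour is sketched below; the function name is hypothetical and this is not code from the commit.

```cpp
#include <assert.h>
#include <stdint.h>

// Scalar model of a 256-bit byte shuffle (vpshufb). For each output byte the
// control byte selects a source byte from the SAME 16-byte lane (low 4 bits),
// or produces zero when bit 7 of the control byte is set.
// pDst must not alias pSrc.
void Pshufb256Model (uint8_t pDst[32], const uint8_t pSrc[32], const uint8_t pCtrl[32]) {
  for (int iLane = 0; iLane < 2; ++iLane) {
    for (int i = 0; i < 16; ++i) {
      const uint8_t uiCtrl = pCtrl[iLane * 16 + i];
      pDst[iLane * 16 + i] = (uiCtrl & 0x80) ? 0 : pSrc[iLane * 16 + (uiCtrl & 0x0F)];
    }
  }
}
```

Because an index can only reach bytes in its own lane, a routine that gathers pixels this way has to load suitably positioned data into each lane separately, which is consistent with the commit's remark about relatively greater address calculation and load overhead.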
Sindre Aamås
b43e58a366 [Processing/x86] Add an AVX2 implementation of GeneralBilinearFastDownsample
Keep track of relative pixel offsets and utilize pshufb to efficiently
extract relevant pixels for horizontal scaling ratios <= 8. Because
pshufb does not cross 128-bit lanes, the overhead of address
calculations and loads is relatively greater as compared with an
SSSE3 implementation.

Fall back to a generic approach for ratios > 8.

The implementation assumes that data beyond the end of each line,
before the next line begins, can be dirtied; which AFAICT is safe with
the current usage of these routines.

Speedup is ~10.42x/~5.23x (32-bit/64-bit) for horizontal ratios <= 2,
~9.49x/~4.64x for ratios within (2, 4], ~6.43x/~3.18x for ratios
within (4, 8], and ~5.42x/~2.50x for ratios > 8 when not memory-bound
on Haswell as compared with the current SSE2 implementation.
2016-05-23 20:23:47 +02:00
Sindre Aamås
b1013095b1 [Processing/x86] Add an SSE4.1 implementation of GeneralBilinearAccurateDownsample
Keep track of relative pixel offsets and utilize pshufb to efficiently
extract relevant pixels for horizontal scaling ratios <= 4.

Fall back to a generic approach for ratios > 4.

The use of blendps makes this require SSE4.1. The pshufb path can be
backported to SSSE3 and the generic path to SSE2 for a minor reduction
in performance by replacing blendps and preceding instructions with an
equivalent sequence.

The implementation assumes that data beyond the end of each line,
before the next line begins, can be dirtied; which AFAICT is safe with
the current usage of these routines.

Speedup is ~5.32x/~4.25x (32-bit/64-bit) for horizontal ratios <= 2,
~5.06x/~3.97x for ratios within (2, 4], and ~3.93x/~3.13x for ratios
> 4 when not memory-bound on Haswell as compared with the current SSE2
implementation.
2016-05-23 20:23:39 +02:00
Sindre Aamås
1995e03d91 [Processing/x86] Add an SSSE3 implementation of GeneralBilinearFastDownsample
Keep track of relative pixel offsets and utilize pshufb to efficiently
extract relevant pixels for horizontal scaling ratios <= 4.

Fall back to a generic approach for ratios > 4. Note that the generic
approach can be backported to SSE2.

The implementation assumes that data beyond the end of each line,
before the next line begins, can be dirtied; which AFAICT is safe with
the current usage of these routines.

Speedup is ~6.67x/~3.26x (32-bit/64-bit) for horizontal ratios <= 2,
~6.24x/~3.00x for ratios within (2, 4], and ~4.89x/~2.17x for ratios
> 4 when not memory-bound on Haswell as compared with the current SSE2
implementation.
2016-05-23 20:23:31 +02:00
Sindre Aamås
cbaf087583 [Processing] Reduce duplication in downsampling wrappers 2016-05-23 13:19:17 +02:00
ruil2
c96c8b05a8 Merge pull request #2468 from sijchen/refactor_pre
[Encoder] Refactor: create diff func for diff case to make logic clean
2016-05-23 13:21:40 +08:00
HaiboZhu
685b6144a5 Merge pull request #2469 from ruil2/fix_bitrate
add GetBsPostion for cabac and cavlc
2016-05-23 09:49:45 +08:00
Karina
9b2dd55324 add GetBsPostion for cabac and cavlc 2016-05-20 14:29:48 +08:00
sijchen
27e803f6f4 refactor to make logic clean 2016-05-19 09:42:39 -07:00
Karina
ac37666cf1 modify the interface to use an independent subseqID for each layer 2016-05-19 17:17:17 +08:00
sijchen
a5e4cca710 Merge pull request #2467 from ruil2/overflow
fix overflow issue
2016-05-18 21:35:32 -07:00
Karina
8a341070f2 fix overflow issue 2016-05-19 12:00:49 +08:00
sijchen
3fd490dbed Merge pull request #2460 from sijchen/refactor_ref2
[Encoder] move strategy related pointer to class
2016-05-18 11:40:08 -07:00
sijchen
1ac02f3002 fix conflict with master 2016-05-18 10:57:39 -07:00
sijchen
7188e50acf Merge pull request #2465 from ruil2/skip_layers
fix temporal layer skip issue
2016-05-18 09:34:09 -07:00
Karina
c298d66d48 fix temporal layer skip issue 2016-05-18 09:47:49 +08:00
sijchen
6d79601d93 Merge pull request #2463 from HaiboZhu/Fix_build_error_windows_debug
Fix the wrong variable name which causes the build error
2016-05-16 22:57:32 -07:00
Haibo Zhu
85f4beb9a8 Fix the wrong variable name which causes the build error 2016-05-17 13:46:04 +08:00
HaiboZhu
46220cfb3b Merge pull request #2461 from HaiboZhu/Bugfix_remove_undefined_behavior_warning
Remove the undefined behavior warning in parse_cabac
2016-05-17 10:51:18 +08:00
Haibo Zhu
86c1f0d2c6 Remove the undefined behavior warning in parse_cabac 2016-05-17 09:40:03 +08:00
ruil2
0ec686f7ec Merge pull request #2452 from sijchen/refactor_sps2
Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class
2016-05-17 09:19:14 +08:00
sijchen
1eb735299a Merge pull request #2458 from ruil2/downsampling2
add one new downsampling algorithm
2016-05-16 10:59:35 -07:00
sijchen
00747540fb move strategy related pointer to class 2016-05-16 10:55:13 -07:00
HaiboZhu
f623aa318d Merge pull request #2459 from ruil2/fix_crash
fix crash when temporal layer is skipped, the frame should not be encoded
2016-05-16 15:35:38 +08:00
Karina
3b55d64902 fix crash when temporal layer is skipped, the frame should not be encoded 2016-05-16 14:43:13 +08:00
Karina
96b2a87030 add one new downsampling algorithm 2016-05-16 09:28:19 +08:00
sijchen
3fa9a4840a Merge pull request #2433 from hzwangsiyu/master
Update .gitignore
2016-05-05 16:27:56 -07:00
sijchen
ffb85046b4 Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class, to improve code readability 2016-05-04 15:06:02 -07:00
HaiboZhu
c30cc41261 Merge pull request #2448 from saamas/encoder-getnonzerocount-sse42
[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
2016-05-04 09:49:47 +08:00
ruil2
e9dc97803d Merge pull request #2447 from saamas/encoder-cavlcparamcal-sse42
[Encoder] Add an SSE4.2 implementation of CavlcParamCal
2016-04-28 09:08:44 +08:00
ruil2
7d65687284 Merge pull request #2441 from saamas/encoder-add-avx2-4x4-quantization-routines
[Encoder] Add AVX2 4x4 quantization routines
2016-04-28 09:08:31 +08:00
ruil2
56618249d7 Merge pull request #2436 from saamas/processing-add-avx2-vaa-routines
[Processing] Add AVX2 VAA routines
2016-04-28 09:08:03 +08:00
Sindre Aamås
fb0b2b3f41 [Encoder/x86] Drop unneeded LOAD_4_PARA in CavlcParamCal_sse42 2016-04-24 22:59:35 +02:00
Sindre Aamås
d1c7713191 [Encoder/x86] Minor CavlcParamCal_sse42 tweak
Do more elaborate register allocation to avoid a few mov instructions.
2016-04-24 22:36:23 +02:00
Sindre Aamås
f56bdc3aa4 [Encoder/x86] Minor CavlcParamCal_sse42 tweak
Avoid loading single-use parameter.
2016-04-21 16:29:02 +02:00
Sindre Aamås
2eb8800712 [Encoder/x86] Remove a leftover mov instruction in CavlcParamCal_sse42 2016-04-21 15:53:33 +02:00
Sindre Aamås
4645bd26aa [Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
Avoid touching some cache lines by using popcnt instead of table
lookups.

Also gives a speedup of ~1.4x on Haswell as compared with SSE2.
2016-04-20 19:10:24 +02:00
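Note on 4645bd26aa: the idea of counting with popcnt instead of a lookup table can be sketched in portable C++ as "build a bitmask of non-zero positions, then count the set bits". The function name is hypothetical, the 16-coefficient layout is an assumption, and the bit count is done in software here rather than with the popcnt instruction.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical sketch: count the non-zero entries among 16 coefficients.
int CountNonZero16Sketch (const int16_t* pLevel) {
  // Bit i of the mask is set when coefficient i is non-zero.
  uint32_t uiMask = 0;
  for (int i = 0; i < 16; ++i)
    uiMask |= (uint32_t) (pLevel[i] != 0) << i;

  // Software population count: each iteration clears the lowest set bit.
  // A popcnt instruction would do this step without any table in memory.
  int iCount = 0;
  while (uiMask) {
    uiMask &= uiMask - 1;
    ++iCount;
  }
  return iCount;
}
```

Counting bits in a register needs no memory access, which matches the commit's stated goal of not touching the cache lines a table would occupy.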
Sindre Aamås
d906dda224 [UT] Improve GetNonZeroCount tests
Reduce duplication.
Test more combinations.
Always test boundary cases.
2016-04-20 19:10:24 +02:00
Sindre Aamås
3f31aff4dc [Encoder] Add an SSE4.2 implementation of CavlcParamCal
Use a combination of table lookups and pshufb to convert coefficients
to zero run/level format. Two 16-entry lookup tables are used for a
total of 192 bytes worth of tables. (The existing SSE2 version uses a
table of size 2048 bytes.)

Speedup is ~1.5x-3x as compared with the SSE2 version on Haswell (the
speedup is greater for input with many trailing zeros).

The use of popcnt makes it require SSE4.2. This can be replaced with
a small LUT and accumulation which would reduce the requirement to
SSSE3.
2016-04-20 18:37:08 +02:00
Sindre Aamås
502b16925e [UT] Add tests for CavlcParamCal_c and CavlcParamCal_sse2 2016-04-20 18:37:08 +02:00
HaiboZhu
98c6c6de11 Merge pull request #2446 from HaiboZhu/Reduce_log_size_for_parse_only_mode
Add the log reduce logic into parse only mode
2016-04-20 10:48:57 +08:00
HaiboZhu
3b68840d5f Merge pull request #2444 from GuangweiWang/fix-assembly-arm64
Fix assembly arm64
Code review at: https://rbcommons.com/s/OpenH264/r/1594/
2016-04-20 10:03:01 +08:00
Haibo Zhu
3ccecfbdbe Add the log reduce logic into parse only mode 2016-04-20 09:58:12 +08:00
HaiboZhu
c9433ee73b Merge pull request #2442 from ruil2/deblocking_fix
fix 32-bit parameters issue on arm64 assembly function
2016-04-18 09:21:24 +08:00
Guangwei Wang
cc407b4b21 fix code style 2016-04-17 19:47:55 +08:00
Guangwei Wang
0b8cdcaff8 extend 32-bit parameters to 64-bit on arm64 assembly function 2016-04-17 19:41:57 +08:00
Karina
1ecb9582df update arm assembly comments 2016-04-14 14:57:21 +08:00
Karina
dd340b7fe7 modify neon comment 2016-04-14 14:49:11 +08:00
Karina
525dbe7093 add 32-bit parameter sign-extensions for block_add_aarch64_neon.S 2016-04-14 10:06:57 +08:00
Karina
d34e209266 fix 32-bit parameters issue on arm64 assembly function 2016-04-13 19:30:08 +08:00
Sindre Aamås
bb49e23719 [Encoder] Add AVX2 4x4 quantization routines
WelsQuantFour4x4Max_avx2 (~2.06x speedup over SSE2)
WelsQuantFour4x4_avx2    (~2.32x speedup over SSE2)
WelsQuant4x4Dc_avx2      (~1.49x speedup over SSE2)
WelsQuant4x4_avx2        (~1.42x speedup over SSE2)
2016-04-13 11:56:47 +02:00
Sindre Aamås
1e83bec860 [UT] Add some missing quantization tests 2016-04-13 11:56:44 +02:00
Sindre Aamås
abaf3a4104 [UT] Reduce duplication in quantization tests 2016-04-13 08:59:16 +02:00
HaiboZhu
50daa8f737 Merge pull request #2439 from ruil2/deblocking_fix
add missing sign extension for arm64 on deblocking_aarch64_neon.S
2016-04-12 16:48:54 +08:00
Karina
7943764869 add missing sign extension for arm64 2016-04-12 16:27:58 +08:00
Sindre Aamås
93db6511a8 [UT] Test VAA routines with a wider variety of resolutions
Test even and odd multiples of 32 width because some AVX2 routines
have conditional logic based on that.
2016-04-11 16:40:36 +02:00
Sindre Aamås
57fc3e9917 [Processing] Add AVX2 VAA routines
Process 8 lines at a time rather than 16 lines at a time because
this appears to give more reliable memory subsystem performance on
Haswell.

Speedup is > 2x as compared to SSE2 when not memory-bound on Haswell.
On my Haswell MBP, VAACalcSadSsdBgd is about ~3x faster when uncached,
which appears to be related to processing 8 lines at a time as opposed
to 16 lines at a time. The other routines are also faster as compared
to the SSE2 routines in this case but to a lesser extent.
2016-04-11 16:09:56 +02:00
HaiboZhu
eb9f56584f Merge pull request #2432 from ruil2/refine_encode1
refine the workflow for encode one frame
2016-04-06 08:59:48 +08:00
hzwangsiyu
6d2d031fca Update .gitignore 2016-04-04 10:32:29 +08:00
Karina
7e14400b0b refine the workflow for encode one frame 2016-03-31 16:58:20 +08:00
HaiboZhu
c423a80ba4 Merge pull request #2431 from ruil2/temporal_layer
fix frame rate issue
2016-03-30 17:02:43 +08:00
Karina
3927d91b85 fix temporal layer issue when output frame rate is different from input frame rate 2016-03-29 15:48:06 +08:00
ruil2
17d7aa13e4 Merge pull request #2427 from mstorsjo/mktargets
Refresh regenerating targets.mk
2016-03-25 09:36:53 +08:00
sijchen
30da4f196e Merge pull request #2426 from ruil2/fix_trace
fix skip frames statistics issue
2016-03-24 09:49:48 -07:00
zhilwang
25818b0fc2 Merge pull request #2428 from mstorsjo/sign-extension
Add missing sign extension for x86_64 in mb_copy.asm
2016-03-24 17:07:42 +08:00
Martin Storsjö
a4e71d6662 Add missing sign extension for x86_64 in mb_copy.asm
This fixes running the code built for x86_64 OS X with Xcode 7.3.
2016-03-24 10:20:42 +02:00
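Note on a4e71d6662: on 64-bit targets a 32-bit `int` argument occupies the low half of a 64-bit register, and hand-written assembly cannot rely on the upper half being meaningful. Using the full register as an address offset without sign extension can therefore go wrong, most visibly for negative strides. The C++ model below is only an illustration with hypothetical names; it is not the assembly from mb_copy.asm.

```cpp
#include <assert.h>
#include <stdint.h>

// Models a 64-bit register whose low 32 bits carry a signed int argument and
// whose upper 32 bits are arbitrary. The narrowing cast relies on two's
// complement wraparound (guaranteed since C++20, universal in practice).
static inline int64_t SignExtendLow32 (uint64_t uiReg) {
  return (int64_t) (int32_t) (uint32_t)uiReg;
}

// Advance a line pointer by a stride held in such a register.
static inline const uint8_t* NextLineSketch (const uint8_t* pLine, uint64_t uiStrideReg) {
  return pLine + SignExtendLow32 (uiStrideReg);
}
```

With a register value of 0xDEADBEEFFFFFFFF0 the sign-extended stride is -16, whereas using all 64 bits directly would produce a wildly out-of-range offset. The arm64 commits in this list that mention sign extension of 32-bit parameters appear to address the same class of problem.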
Martin Storsjö
81493590f8 Remove a stray empty line
This disappears when regenerating the makefiles.
2016-03-24 10:01:48 +02:00
Martin Storsjö
d7bc4f5f03 Make sure that gtest-targets.mk gets regenerated with the right directory 2016-03-24 10:01:21 +02:00
Karina
10dfb2670b fix skip frames statistics issue 2016-03-24 14:17:47 +08:00
sijchen
22bec09507 Merge pull request #2425 from sijchen/ruil_rc_update
fix bitrate overflow issue when adaptive quality turns on
2016-03-23 10:54:51 -07:00
sijchen
47d310539f Squashed commit of the following:
commit c8111942e07437034a74b33887c33b5ad78e476a
Author: Karina <ruil2@cisco.com>
Date:   Wed Mar 23 14:31:18 2016 +0800

    update SHA table

commit f36a25344c25a131581dcbcd2d103fc4b131012e
Author: Karina <ruil2@cisco.com>
Date:   Wed Mar 23 13:45:58 2016 +0800

    fix bitrate overflow issue when adaptive quality turns on
2016-03-23 10:23:33 -07:00
HaiboZhu
c0641f40d9 Merge pull request #2423 from shihuade/SPSUpdate
fix bug for debug mode
2016-03-23 13:27:35 +08:00
HaiboZhu
e52c6eacb0 Merge pull request #2422 from HaiboZhu/Bugfix_level_check_error_fmo_return_value
Fix the level limit check bug and fmo return overflow bug
2016-03-23 12:14:20 +08:00
Haibo Zhu
e4a4fb6577 (1) Fix the level limit check wrong condition
(2) Fix the FMO return value overflow bug
2016-03-23 11:15:26 +08:00
Forrest Shi
47ad929c25 fix bug for debug mode 2016-03-23 11:14:40 +08:00
sijchen
22d6a94919 Merge pull request #2414 from ksb2go/master
Google has deprecated using SVN. Move over to GitHub
2016-03-22 16:20:58 -07:00
sijchen
40e1a69fae Merge pull request #2421 from shihuade/MultiThread_V5.2_Pull_V2
refactor for slice buffer init/allocate/free
2016-03-22 16:20:37 -07:00
huade
a7a5b7b0f4 refactor for slice buffer init/allocate/free 2016-03-22 13:51:20 +08:00
sijchen
33bb96f604 Merge pull request #2420 from sijchen/fix_sps
[Encoder] fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on
2016-03-21 21:51:07 -07:00
sijchen
8103988cde Merge pull request #2418 from ruil2/refine_init
fix preprocessing initialization logic
2016-03-21 11:32:24 -07:00
Karina
228cdeba1b refine reset function 2016-03-21 10:48:41 +08:00
sijchen
38313b913d Merge pull request #2419 from ruil2/bitrate_update
fix bitrate update issue
2016-03-18 16:07:59 -07:00
Karina
7c15d68e24 fix preprocessing initialization logic 2016-03-18 16:43:11 +08:00
Karina
316ab31882 fix bitrate update issue 2016-03-18 14:28:32 +08:00
zhilwang
d7570bfa52 Merge pull request #2401 from saamas/decoder-use-encoder-x86-idct-routines
[Decoder] Use encoder x86 IDCT routines
2016-03-18 08:50:33 +08:00
David Chen
7112938a28 Google has deprecated using SVN. Move over to GitHub 2016-03-17 17:25:22 -07:00
HaiboZhu
a8ab4afe5b Merge pull request #2410 from HaiboZhu/Add_disable_assert_in_release
Disable assert in release with -DNDEBUG macro
2016-03-17 15:46:25 +08:00
HaiboZhu
c441f6f390 Merge pull request #2411 from huili2/memory_leak_fix
fix memory leak when alloc failed in decoder
2016-03-17 15:46:12 +08:00
Haibo Zhu
43f767d06e Disable assert in release with -DNDEBUG macro
Update the code to avoid the function unused warning
2016-03-17 11:24:01 +08:00
unknown
693fd14272 fix memory leak when alloc failed in decoder 2016-03-17 10:31:25 +08:00
Sindre Aamås
b6c4a5447c [Decoder/x86] IDCT one block at a time with SSE2
At lower bitrates, it is overall faster to conditionally do one block
at a time with SSE2 on Haswell and likely other common architectures.
At higher bitrates, it is faster to use the wider routine that IDCTs
four blocks at a time. To avoid potential performance regressions
as compared to MMX, stick with single-block IDCTs with SSE2. There
is still a performance advantage as compared to MMX because the
single-block SSE2 routine is faster than the corresponding MMX
routine.

Stick with four blocks at a time with AVX2 for which that appears
to be consistently faster on Haswell.
2016-03-16 19:55:11 +01:00
huili2
a8d9576297 Merge pull request #2405 from HaiboZhu/Fix_UT_decoder_init_fail
Fix the decoder init failed case in UT
2016-03-16 16:28:14 +08:00
Marcus Johnson
4d6b1c23fe MVC support 2 2016-03-16 01:32:56 -04:00
Marcus Johnson
69bae68698 Add support for MVC NALs to EWelsNalUnitType 2016-03-16 01:28:55 -04:00
HaiboZhu
7a3b3fdbe7 Merge pull request #2403 from ruil2/downsampling1
change downsampling logic
2016-03-16 09:48:08 +08:00
sijchen
90deb80b50 rename the functions 2016-03-14 21:41:08 -07:00
sijchen
c009183e97 fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on 2016-03-14 11:28:44 -07:00
Haibo Zhu
46f42ec5f3 Fix the decoder init failed case in UT 2016-03-14 17:06:58 +08:00
Karina
f84f2315ab change downsampling logic that downsampling source is from the nearest layer instead of the highest layer 2016-03-14 09:55:36 +08:00
HaiboZhu
25f53a2e3d Merge pull request #2399 from saamas/encoder-x86-add-avx2-satd-routines
[Encoder/x86] Add AVX2 SATD routines
2016-03-10 09:59:33 +08:00
Sindre Aamås
98042f1600 [Decoder] Use encoder x86 IDCT routines
Move asm routines to common. Delete obsolete decoder routines.

Use wider routines where applicable.

~1.07x overall faster decode on a quick 720p30 4Mbps test on Haswell.
2016-03-09 10:41:42 +01:00
HaiboZhu
bffda9ec02 Merge pull request #2397 from HaiboZhu/Remove_level_limit_check
Change the level limit check behavior to make the compatibility
2016-03-09 09:50:44 +08:00
Haibo Zhu
31de8bb3a0 Change the level limit check behavior to make the compatibility 2016-03-09 08:34:07 +08:00
Sindre Aamås
48a520915a [Encoder/x86] Add AVX2 SATD routines
WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell).
WelsSampleSatd16x8_avx2  (~2.19x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x16_avx2  (~1.68x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x8_avx2   (~1.53x speedup over SSE4.1 on Haswell).
2016-03-08 11:31:17 +01:00
volvet
d4c68527b1 Merge pull request #2389 from saamas/common-x86-deblock-chroma-horizontal-ssse3-optimizations
[Common/x86] Deblock chroma horizontal ssse3 optimizations
2016-03-08 17:09:08 +08:00
HaiboZhu
d9bfc9204b Merge pull request #2394 from sijchen/th021
[Common] remove sink in WelsThreadPool and hide the constructor to finish the s…
2016-03-08 16:29:40 +08:00
HaiboZhu
74b8a66140 Merge pull request #2395 from ruil2/stat_output
format update and fix build issue when turn on STAT_OUTPUT macro
2016-03-07 13:46:27 +08:00
Karina
fee9d502bb format update and fix build issue when turn on STAT_OUTPUT macro 2016-03-04 13:55:14 +08:00
sijchen
316f740630 Merge pull request #2390 from sijchen/th012
[Common] put CWelsThreadPool to singleTon for future usage
2016-03-03 09:47:20 -08:00
huili2
ac6cf877d6 Merge pull request #2392 from mstorsjo/decoder-error-return
Fix a return value check
2016-03-03 16:40:55 +08:00
Martin Storsjö
7f53c29302 Fix a return value check
In 9cb4f4e8e21af, the error code returned from CheckIntraNxNPredMode
was changed - therefore, these return value checks, that look
for a specific error code, need to be updated accordingly.

This fixes crashes in DecodeCrashTestAPI.DecoderCrashTest
with some seeds.
2016-03-03 10:15:34 +02:00
sijchen
4db9c32976 remove sink in WelsThreadPool and hide the constructor to finish the singleton 2016-03-02 17:08:09 -08:00
sijchen
d4f09d9048 put CWelsThreadPool to singleTon for future usage (including add sink for IWelsTask) 2016-02-29 11:40:25 -08:00
HaiboZhu
52d25f544a Merge pull request #2386 from huili2/return_info_change
modify return value check inside decoder
2016-02-29 09:21:31 +08:00
sijchen
7e88b13809 Merge pull request #2380 from mstorsjo/fix-slice-realloc
Avoid reading iCountMbNumInSlice out of bounds on slice realloc
2016-02-26 09:46:13 -08:00
Sindre Aamås
a009153741 [Common/x86] DeblockChromaEq4H_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

~5.80x speedup on Haswell (x86-64).
~1.69x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
Sindre Aamås
9909c306f1 [Common/x86] DeblockChromaLt4H_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

~5.72x speedup on Haswell (x86-64).
~1.85x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
unknown
9cb4f4e8e2 modify return value check inside decoder 2016-02-26 16:29:35 +08:00
Martin Storsjö
69e3fac093 Avoid reading iCountMbNumInSlice out of bounds on slice realloc
Prior to 7bcb3ba4f4abf18a,
pCurLayer->sLayerInfo.pSliceInLayer[uiSliceIdx].iCountMbNumInSlice
was read after setting pCurLayer->sLayerInfo.pSliceInLayer to
the newly allocated, larger array. After this commit, it is read
before the array has been switched, and thus is read from the
old array (which only holds elements up to iMaxSliceNumOld, not
up to iMaxSliceNum).

This fixes reads out of bounds, and crashes in the test suite.
2016-02-25 10:31:58 +02:00
HaiboZhu
040974f735 Merge pull request #2378 from shihuade/MultiThread_V4.9_V5
add thread-based slice buffer and  refactor reallocate process
2016-02-25 14:40:56 +08:00
HaiboZhu
321c772536 Merge pull request #2372 from ruil2/refine_trace
update trace for ENCODER_OPTION_TRACE_CALLBACK
2016-02-25 10:50:12 +08:00
HaiboZhu
027f027c25 Merge pull request #2371 from GregoryJWolfe/master
Added support for "video signal type present" information.
2016-02-25 10:49:34 +08:00
huade
5e8a716c1d add thread-based slice buffer and refactor reallocate process for further change 2016-02-25 10:08:41 +08:00
Gregory J. Wolfe
03890fe86f Added support for "video signal type present" information.
The "Video signal type present" information is written to the output
video file when it is created, and later is used by the decoder to
properly decode the compressed video data. The saved attributes
are:

- format type (PAL, NTSC, etc.)
- color primaries (BT709, SMPTE170M, etc.)
- transfer characteristics (BT709, SMPTE170M, etc.)
- color matrix (BT709, SMPTE170M, etc.)

These modifications allow the client to specify these attributes
and, if specified, make sure they are written to the output file.
2016-02-24 10:33:18 -05:00
Gregory J. Wolfe
c7fcba06c7 Added support for "video signal type present" information.
The "Video signal type present" information is written to the output
video file when it is created, and later is used by the decoder to
properly decode the compressed video data. The saved attributes
are:

- format type (PAL, NTSC, etc.)
- color primaries (BT709, SMPTE170M, etc.)
- transfer characteristics (BT709, SMPTE170M, etc.)
- color matrix (BT709, SMPTE170M, etc.)

These modifications allow the client to specify these attributes
and, if specified, make sure they are written to the output file.
2016-02-23 13:21:06 -05:00
ruil2
3e538617cd Merge pull request #2374 from sijchen/for_ts0
[Encoder] fix timestamp = 0 issue when rc mode is BITRATE mode
2016-02-23 17:26:20 +08:00
ruil2
78ae48c686 Merge pull request #2375 from shihuade/MultiThread_V4.8_v4
refactor slice level rc statistic info structure
2016-02-23 17:25:57 +08:00
huade
7bcb3ba4f4 refactor slice level rc structure 2016-02-23 16:49:37 +08:00
sijchen
881fc11c48 finish the remaining prob of fixing ts=0 2016-02-22 10:40:35 -08:00
sijchen
9816e3302d fix timestamp = 0 issue when rc mode is BITRATE mode 2016-02-22 10:33:55 -08:00
Karina
597b4eef73 fix timestamp = 0 issue when rc mode is BITRATE mode. 2016-02-22 10:33:55 -08:00
Karina
65218a3c35 update trace for ENCODER_OPTION_TRACE_CALLBACK 2016-02-22 14:33:10 +08:00
ruil2
2754129064 Merge pull request #2360 from saamas/common-x86-deblock-optimizations
[Common/x86] Deblocking optimizations
2016-02-19 09:52:39 +08:00
Gregory J. Wolfe
f35a0daccf Added support for "video signal type present" information.
The "Video signal type present" information is written to the output
video file when it is created, and later is used by the decoder to
properly decode the compressed video data.  The saved attributes
are:

- format type (PAL, NTSC, etc.)
- color primaries (BT709, SMPTE170M, etc.)
- transfer characteristics (BT709, SMPTE170M, etc.)
- color matrix (BT709, SMPTE170M, etc.)

These modifications allow the client to specify these attributes
and, if specified, make sure they are written to the output file.
2016-02-18 11:51:51 -05:00
ruil2
13586a3dfc Merge pull request #2366 from sijchen/fix_free6
[Encoder] add error handling in memory allocation failed case for multi-threading
2016-02-18 10:25:19 +08:00
ruil2
f791ac28ec Merge pull request #2365 from sijchen/fix_free42
[Encoder] avoid memory problem when mem alloc failed during initializing pRefList
2016-02-18 10:25:07 +08:00
ruil2
de1a70d164 Merge pull request #2363 from sijchen/fix_free5
[Encoder] add input parameter check as protection for an encoder interface
2016-02-18 10:24:55 +08:00
sijchen
4537682042 Merge pull request #2362 from ruil2/trace1
trace cleanup
2016-02-17 14:52:46 -08:00
sijchen
e07ee9c096 use WELS_DELETE_OP for deleting 2016-02-17 10:07:33 -08:00
sijchen
74955c877f set pointers to null and call uninit 2016-02-17 10:07:33 -08:00
sijchen
cc675f9fd1 add error handling in memory allocation failed case 2016-02-17 10:07:33 -08:00
sijchen
41b4ecb06b Avoid memory problem when mem alloc failed during initializing pRefList 2016-02-17 09:52:30 -08:00
sijchen
4b97dcb367 avoid memory problem when mem alloc failed during initializing pRefList 2016-02-16 10:05:49 -08:00
Karina
18728a4876 trace cleanup 2016-02-16 10:52:37 +08:00
ruil2
a26955e444 Merge pull request #2358 from sijchen/fix_free2
[Encoder]  avoid memory problem if mem alloc failed in the middle of InitDqLayer
2016-02-16 10:47:23 +08:00
ruil2
6cf240237b Merge pull request #2361 from sijchen/fix_free00
[Encoder] multiple protection if memory allocation failed
2016-02-16 10:47:02 +08:00
sijchen
855d1cf8c2 add input parameter check as protection for an encoder interface 2016-02-15 11:54:51 -08:00
sijchen
b76a79c726 move the rc free to the correct condition to avoid access to invalid memory 2016-02-15 10:13:50 -08:00
sijchen
025500d5aa move the assigning m_uiSpatialPicNum earlier to cover the memory leak if error in allocating pic 2016-02-15 10:13:23 -08:00
sijchen
36722c553b use WelsMallocz instead of WelsMalloc to avoid non-null pointer at init 2016-02-15 10:12:44 -08:00
sijchen
71aa533038 move the printing of MEMORY_CHECK part to more reasonable 2016-02-15 10:12:34 -08:00
sijchen
6a0f0811ae use WelsUninitEncoderExt in all free process in WelsInitEncoderExt 2016-02-15 10:06:43 -08:00
sijchen
408b7cad17 use WelsUninitEncoderExt rather than FreeMemorySvc which correctly deals with release of vpp memory 2016-02-15 10:04:52 -08:00
Sindre Aamås
e96a7b5c92 [Common/x86] DeblockChromaEq4V_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Avoid spills.

~2.07x speedup on Haswell (x86-64).
~2.12x speedup on Haswell (x86 32-bit).
2016-02-15 02:08:03 +01:00
Sindre Aamås
fc16010583 [Common/x86] DeblockChromaLt4V_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Avoid spills.

~2.68x speedup on Haswell (x86-64).
~2.38x speedup on Haswell (x86 32-bit).
2016-02-15 02:07:25 +01:00
Sindre Aamås
62fb37d096 [Common/x86] DeblockLumaEq4_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Minimize spills.

~2.31x speedup on Haswell (x86-64).
~2.40x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:39 +01:00
Sindre Aamås
732e1c5f78 [Common/x86] DeblockLumaLt4_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Avoid spills.

~1.97x speedup on Haswell (x86-64).
~3.09x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:18 +01:00
sijchen
8b1206001c Merge pull request #2355 from pra85/patch-1
Fix a typo
2016-02-12 16:42:02 -08:00
sijchen
2b9a250fbd include the free-ing of pointer into FreeDqLayer 2016-02-12 16:23:57 -08:00
Prayag Verma
2d378b9db8 Fix a typo
`Availabe` → `Available`
2016-02-11 12:04:58 +05:30
sijchen
a1a3873a62 improve the code structure 2016-02-10 22:25:41 -08:00
sijchen
43fdf74fa6 fix a miss of assigning and remove an unused line 2016-02-10 21:54:53 -08:00
sijchen
914302a462 avoid memory problem if mem alloc failed in the middle of InitDqLayer 2016-02-10 21:54:53 -08:00
sijchen
aaa25160ec Merge pull request #2353 from saamas/encoder-x86-dct-opt2
[Encoder] x86 DCT optimizations
2016-02-08 15:00:12 -08:00
sijchen
e5e7013b73 Merge pull request #2350 from sijchen/th00
[Common] Add sink to IWelsTask
2016-02-08 14:59:38 -08:00
HaiboZhu
ad9ca3824f Merge pull request #2354 from ruil2/remove_trace
fix error width and height issue
2016-02-04 12:00:20 +08:00
Karina
ae508b9724 fix error width and height issue 2016-02-04 10:25:03 +08:00
sijchen
f5fd7420a9 Merge pull request #2351 from huili2/fix_width_height_enc_constraint
fix frame size constraints for width and height
2016-02-02 16:31:05 -08:00
sijchen
fb901269ef Merge pull request #2352 from ruil2/remove_trace
remove trace
2016-02-02 16:30:50 -08:00
Sindre Aamås
db9fa9154c Update README.md nasm version requirement
Version 2.10.06 has some RIP-relative relocation fixes for macho64
that are needed to generate correct code on 64-bit OS X with recent
code changes.
2016-02-02 17:22:49 +01:00
Sindre Aamås
c8c74903f8 [Encoder] Add single-block AVX2 4x4 DCT/IDCT routines
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.

~3.15x speedup over MMX for the DCT on Haswell.
~2.94x speedup over MMX for the IDCT on Haswell.

Returns diminish with increasing vector length because a larger
proportion of the time is spent on load/store/shuffling.
2016-02-02 17:22:49 +01:00
Sindre Aamås
f90960983c [Encoder] Add single-block SSE2 4x4 DCT/IDCT routines
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.

~2.31x speedup over MMX for the DCT on Haswell.
~1.92x speedup over MMX for the IDCT on Haswell.
2016-02-02 17:22:48 +01:00
Sindre Aamås
7486de2844 [Encoder] AVX2 DCT tweaks
Do some shuffling in load/store unpack/pack to save some
work in horizontal DCTs.

Use a few 128-bit broadcasts to compact data vectors a bit.

~1.04x speedup for the DCT case on Haswell.
~1.12x speedup for the IDCT case on Haswell.
2016-02-02 17:22:48 +01:00
Karina
2d4cbcf060 remove trace 2016-02-02 17:34:59 +08:00
unknown
3873addc3d fix frame size constraints for width and height 2016-02-01 15:55:53 +08:00
HaiboZhu
1030820ec4 Merge pull request #2342 from sijchen/enh_ut_tem
[UT] correct and enhance the ut template and trace improvement
2016-02-01 09:08:05 +08:00
zhilwang
c420d72443 Merge pull request #2341 from saamas/encoder-x86-dct-opt
[Encoder] x86 DCT optimizations
2016-01-28 10:33:34 +08:00
HaiboZhu
51f3bbdfde Merge pull request #2345 from shihuade/WP8ScriptUpdate
update build script for wp8 under multi-vc version
2016-01-24 07:56:23 +08:00
Forrest Shi
21402ca419 update build script for wp8 under multi-vc version 2016-01-23 16:56:53 +08:00
HaiboZhu
3174e2a220 Merge pull request #2344 from mstorsjo/cleanup-map
Ignore the MSVC generated map file, remove it on make clean
2016-01-22 09:45:57 +08:00
Martin Storsjö
fa52fbfc9d Ignore the MSVC generated map file, remove it on make clean 2016-01-21 10:23:34 +02:00
HaiboZhu
77c40e09e0 Merge pull request #2343 from HaiboZhu/Add_map_file_msvc
Generate map file for msvc build
2016-01-21 14:34:50 +08:00
sijchen
ef329e33c3 add simulcastAvc setting in setting trace 2016-01-20 14:24:16 -08:00
sijchen
47e3f4c45c correct and enhance the ut template 2016-01-19 17:16:39 -08:00
Sindre Aamås
cc8d541432 [UT] Utilize DCT function pointer typedefs 2016-01-19 22:00:24 +01:00
Sindre Aamås
e22d731f26 [Encoder] yasm-compatible vinserti128 syntax in DCT asm 2016-01-19 21:48:23 +01:00
Sindre Aamås
a45c10cf91 [UT] Only run AVX2 tests if host supports AVX2 2016-01-19 14:27:46 +01:00
Sindre Aamås
144ff0fd51 [Encoder] SSE2 4x4 IDCT optimizations
Use a combination of instruction types that distributes more
evenly across execution ports on common architectures.

Do the horizontal IDCT without transposing back and forth.

Minor tweaks.

~1.14x faster on Haswell. Should be faster on other architectures
as well.
2016-01-19 13:12:29 +01:00
Sindre Aamås
991e344d8c [Encoder] SSE2 4x4 DCT optimizations
Use a combination of instruction types that distributes more
evenly across execution ports on common architectures.

Do the horizontal DCT without transposing back and forth.

Minor tweaks.

~1.54x faster on Haswell. Should be faster on other architectures
as well.
2016-01-19 13:12:28 +01:00
Sindre Aamås
3088d96978 [Encoder] Add an AVX2 4x4 IDCT implementation
~2.03x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Sindre Aamås
b267163f10 [Encoder] Add an AVX2 4x4 DCT implementation
~2.52x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Sindre Aamås
b9adbcf37c [UT] Add missing SSE2 4x4 IDCT test
IDCT input is defined in such a way that the intermediate values
cannot legally overflow an int16_t. The use of random values
as input causes such overflows. This results in implementation-
dependent output depending on which type is used to hold
intermediate results. Use a template for the test reference
implementation to test implementations with different
intermediate representation.
2016-01-19 13:12:28 +01:00
Sindre Aamås
8764231784 [UT] Improve DCT tests
Initialize input arrays with different random values.

Otherwise, the input to the DCT routines is effectively
all zero values after taking the difference.

Reduce duplication.
2016-01-19 13:12:28 +01:00
Sindre Aamås
7739184dfd Update nasm requirement in README.md
We need version 2.10 or above for AVX2 support.
2016-01-19 13:12:28 +01:00
Sindre Aamås
496de8bf09 Use dist: trusty with travis
Trusty has a newer nasm version with AVX2 support.
2016-01-19 12:10:39 +01:00
Haibo Zhu
3206010a89 Generate map file for msvc build 2016-01-19 17:03:50 +08:00
HaiboZhu
21c1c02441 Merge pull request #2334 from sijchen/fix_ut
[UT] fix the prob in case that the task uID is too big
2016-01-19 15:39:17 +08:00
HaiboZhu
8eb4de10a2 Merge pull request #2337 from HaiboZhu/Add_Protection_wrong_API_call
Add protection for wrong API call without initialize
2016-01-19 13:42:49 +08:00
HaiboZhu
5e3e975ffb Merge pull request #2331 from ruil2/return_value
add return value judgment
2016-01-19 12:25:10 +08:00
Haibo Zhu
6d7bd2daf4 Add protection for wrong API call without initialize 2016-01-19 12:00:54 +08:00
huili2
91fa9fad63 Merge pull request #2335 from mstorsjo/fix-msvc-warnings
Avoid warnings in MSVC about implicitly casting floats to integers
2016-01-18 08:48:15 +08:00
Martin Storsjö
fbe35cffca Avoid warnings in MSVC about implicitly casting floats to integers 2016-01-16 11:10:25 +02:00
sijchen
d46cd07511 fix the prob in case that the task uID is too big 2016-01-15 16:06:09 -08:00
Karina
559e786fa4 add return value judgment 2016-01-15 10:30:41 +08:00
HaiboZhu
d11f12db54 Merge pull request #2330 from ruil2/mt_build_1
fix build issue when some macro turn on
2016-01-15 09:28:07 +08:00
HaiboZhu
67f925674a Merge pull request #2329 from ruil2/layer4
using independent encoder control logic for SAVC case
2016-01-15 09:27:58 +08:00
sijchen
5eb18b101e change the output way of debug trace 2016-01-13 22:13:43 -08:00
Karina
67f4dcf2e2 fix build issue when some macro turn on 2016-01-14 09:40:20 +08:00
Karina
0f0d54ef51 using independent encoder control logic for SAVC case 2016-01-14 09:16:12 +08:00
sijchen
cce1c29844 add sink to IWelsTask (for further enhancements) 2016-01-13 16:24:54 -08:00
ruil2
7bfb96b2b6 Merge pull request #2327 from sijchen/th41
Multiple enhancements and a bug fix
2016-01-13 13:07:58 +08:00
sijchen
5cad0f9bba enhance a UT to cover more case 2016-01-11 22:01:02 -08:00
sijchen
bf35b6fee7 add a debug trace if encoder returns error 2016-01-11 22:00:24 -08:00
sijchen
19f5eb0932 complete a debug trace in load-balancing task 2016-01-11 22:00:14 -08:00
sijchen
7a8da6a468 remove unneeded code after new task management 2016-01-11 21:59:49 -08:00
sijchen
dcdd496082 fix a bug in multi-layer case in task-management 2016-01-11 21:58:10 -08:00
HaiboZhu
b940e2cdf8 Merge pull request #2325 from ruil2/trace1
separate each layer trace output
2016-01-11 14:05:55 +08:00
ruil2
737548fe06 Merge pull request #2326 from shihuade/Win10_V1.0_Push
update auto build script for windows 10
2016-01-08 17:15:40 +08:00
ruil2
c32263e06b Merge pull request #2322 from HaiboZhu/Fix_Encoder_Info_Output
Fix the build errors when open the encoder info output
2016-01-08 17:15:15 +08:00
huade
1d9497b7f6 update auto build script for windows 10 2016-01-08 09:38:15 +08:00
Karina
d4f979c495 separate each layer trace output 2016-01-05 14:02:58 +08:00
HaiboZhu
303fbfeb55 Merge pull request #2324 from ruil2/update_style
update format
2016-01-05 13:08:53 +08:00
Karina
57c87f1845 update format 2016-01-05 11:40:59 +08:00
HaiboZhu
cd75541c8f Merge pull request #2323 from ruil2/rc_timestamp
resolve abnormal timestamp(rollback or jump case)
2015-12-31 09:55:58 +08:00
Haibo Zhu
a6a504f944 Fix the build errors when open the encoder info output 2015-12-31 09:06:59 +08:00
HaiboZhu
539818101f Merge pull request #2321 from huili2/modify_ec_option_comment
modify EC method comment in API
2015-12-30 14:21:58 +08:00
huili2
740968d1f6 modify EC method comment in API 2015-12-30 13:41:29 +08:00
Karina
0d5db3d986 resolve abnormal timestamp(rollback or jump case) 2015-12-29 15:05:42 +08:00
ruil2
e3c2cb00a5 Merge pull request #2317 from shihuade/Scripts_V3
update scripts
2015-12-18 14:50:18 +08:00
huade
f79361ac35 update scripts 2015-12-18 09:05:12 +08:00
sijchen
100e952231 Merge pull request #2314 from shihuade/MultiThread_V4.5_SliceBsRefact_V1
remove pSliceBs from ctx
2015-12-17 12:02:00 -08:00
sijchen
1b0735c3a9 Merge pull request #2315 from shihuade/Scripts_V2
add scripts for multi-encoder comparison
2015-12-17 12:01:49 -08:00
huade
74d73ac7ec add scripts for multi-encoder comparison 2015-12-17 16:22:55 +08:00
huade
f161566458 remove pSliceBs from ctx 2015-12-15 17:10:52 +08:00
HaiboZhu
04bfacd7e1 Merge pull request #2313 from shihuade/MultiThread_V4.4_ThreadIdcUnify
refactor threadIdc and CPU cores logic in init module
2015-12-15 13:56:49 +08:00
huade
ef38c2abf8 refactor threadIdc and CPU cores logic in init module 2015-12-15 11:27:00 +08:00
sijchen
e75c5852e8 Merge pull request #2312 from shihuade/TravisTestCase
reduce one test sequences and let travis jobs num to 4, thus reduce test time
2015-12-14 09:44:38 -08:00
sijchen
406f89ec54 Merge pull request #2309 from shihuade/MultiThread_V4.4_ThreadSliceNum_V3_Pull
remove iCountThreadsNum and unify with iMultipleThreadIdc
2015-12-14 09:44:13 -08:00
sijchen
2620f4bcfd Merge pull request #2310 from shihuade/MultiThread_V4.5_LayerSizeFixed
fixed layer size update bugs
2015-12-14 09:44:01 -08:00
huade
0f24b80af8 reduce one test sequences and let travis jobs num to 4, thus reduce test time 2015-12-14 17:18:21 +08:00
huade
549a1b9bf4 fixed layer size update bugs 2015-12-14 14:56:09 +08:00
huade
e8536c6b73 remove iCountThreadsNum and unify with iMultipleThreadIdc 2015-12-14 12:26:02 +08:00
HaiboZhu
ee01b3afaf Merge pull request #2307 from huili2/fix_decstat
fix iAvgLumaQp in decStat
2015-12-14 10:26:16 +08:00
HaiboZhu
762d1812bb Merge pull request #2306 from shihuade/MultiThread_V4.4_ThreadSliceNum_V2_Pull
refact validate and init logic for fixed sliceMode
2015-12-14 09:44:38 +08:00
HaiboZhu
92637b4912 Merge pull request #2304 from sijchen/th21
[Encoder] Add tasks and thread pool call for SM_SIZELIMITED_SLICE mode
2015-12-11 16:16:16 +08:00
HaiboZhu
045e51b075 Merge pull request #2305 from ruil2/qp_layer
update temporal layer quant
2015-12-11 16:10:06 +08:00
huili2
b2d4a95537 fix iAvgLumaQp in decStat 2015-12-11 14:14:42 +08:00
huade
14d89eb48c refact validate and init logic for fixed sliceMode 2015-12-11 13:08:05 +08:00
Karina
fde8bd2554 update temporal layer quant 2015-12-10 15:07:19 +08:00
sijchen
0c820f4c06 adjust encoder test case to cover multi-thread without loadbalancing 2015-12-09 09:58:03 -08:00
sijchen
76ca56498a Add tasks and thread pool call for SM_SIZELIMITED_SLICE mode 2015-12-09 09:55:04 -08:00
HaiboZhu
76b428a453 Merge pull request #2302 from GuangweiWang/platform
add stripped lib for firefox and modify README for the usage of DEBUG…
2015-12-09 12:36:12 +08:00
Guangwei Wang
8af088af93 merge changed Makefile to master 2015-12-09 11:36:02 +08:00
Guangwei Wang
3bcf6069ab add stripped lib for firefox and modify README for the usage of DEBUGSYMBOLS 2015-12-09 10:54:09 +08:00
HaiboZhu
7e9fdc181f Merge pull request #2301 from huili2/simple_parseonly_ctx
remove parseonly in decoder ctx
2015-12-09 10:33:29 +08:00
sijchen
fb241569df Merge pull request #2300 from pengyanhai/master
Avoid a potential deadlock between the main thread and worker thread when encoding or decoding complete
2015-12-08 15:50:37 -08:00
Hank Peng
d58ac746a0 Avoid a potential deadlock between the main thread and worker thread when encoding or decoding complete 2015-12-08 11:53:22 -08:00
HaiboZhu
9272995143 Merge pull request #2298 from GuangweiWang/platform
add stripped lib for firefox
2015-12-08 16:16:14 +08:00
HaiboZhu
7999219f61 Merge pull request #2297 from shihuade/MultiThread_V4.3_SliceBs_V5_Pull
unify slice bs writing for multi-thread(sliceindex==0 is the same wi…
2015-12-08 16:16:05 +08:00
huade
dcfe76d1ff unify slice bs writing for multi-thread (sliceindex==0 is the same as others) 2015-12-08 14:09:43 +08:00
Guangwei Wang
703ed1d86e add stripped lib for firefox 2015-12-08 09:52:03 +08:00
HaiboZhu
6dc3f72ef8 Merge pull request #2291 from sijchen/api5
[Encoder] Console: update help info in console to sync with recent api change
2015-12-07 14:26:54 +08:00
HaiboZhu
642d5aa453 Merge pull request #2295 from HaiboZhu/Add_Debug_symbols_in_makefile
Add DEBUGSYMBOLS option for makefile under release mode
2015-12-07 14:26:09 +08:00
HaiboZhu
8ee4fdef28 Merge pull request #2294 from HaiboZhu/Add_Bitcode_Enable
Enable bitcode for iOS 9
2015-12-07 14:25:59 +08:00
Haibo Zhu
95491a7584 Enable bitcode for iOS 9 2015-12-07 08:44:16 +08:00
Haibo Zhu
6ebacd0cbf Add DEBUGSYMBOLS option for makefile under release mode 2015-12-04 09:47:39 +08:00
sijchen
b5e9e7e823 Merge pull request #2292 from ruil2/trace
add parameter output trace
2015-12-03 13:47:48 -08:00
Karina
5ac58e8dc9 add parameter output trace 2015-12-03 16:47:57 +08:00
HaiboZhu
988784ffe1 Merge pull request #2289 from ruil2/interface1
change output interface
2015-12-02 17:15:59 +08:00
Karina
fd43759fc2 change output interface 2015-12-02 09:58:56 +08:00
sijchen
c55e5f6130 update help info in console to sync with recent api change 2015-12-01 16:45:37 -08:00
sijchen
4cdbc20da0 Merge pull request #2285 from sijchen/ut13
[UT] moving test cases to specific files to avoid the too long file
2015-12-01 09:23:33 -08:00
sijchen
f38d24f036 fix the conflict with the current master 2015-11-30 23:42:26 -08:00
HaiboZhu
ece95c815c Merge pull request #2286 from sijchen/ut3
[Encoder] adjust the input para judgement of iMaxNalSize
2015-12-01 15:24:02 +08:00
HaiboZhu
5bf677c9f7 Merge pull request #2284 from sijchen/rf2
[Encoder] Refactor: remove CWelsTaskManageMultiD
2015-12-01 15:23:21 +08:00
sijchen
6a75152a9a Merge pull request #2287 from GuangweiWang/bugfix
fix bug in UT code
2015-11-30 23:04:59 -08:00
Guangwei Wang
c917d09263 fix bug in UT code 2015-12-01 08:55:00 +08:00
sijchen
420778f4d8 add valid adjustment in test to avoid outputing warning trace 2015-11-30 11:33:13 -08:00
sijchen
42ac53b5fc update win UT project after UT structure change 2015-11-30 11:29:47 -08:00
sijchen
46667588e3 moving test cases to specific files to avoid the too long encode_decode_api_test.cpp 2015-11-30 10:47:10 -08:00
sijchen
89752ff62f Refactor: remove CWelsTaskManageMultiD 2015-11-30 10:32:48 -08:00
HaiboZhu
f679da900f Merge pull request #2281 from sijchen/th11
[Encoder] remove duplicated operation after thread pool
2015-11-27 12:13:33 +08:00
HaiboZhu
b749fe7160 Merge pull request #2273 from sijchen/th0
[Encoder] use different task when load-balancing or not, to save computation
2015-11-27 09:29:22 +08:00
HaiboZhu
921443ead8 Merge pull request #2272 from sijchen/rf0
[Encoder] put duplicated codes into one function
2015-11-27 09:27:37 +08:00
huili2
926fc67451 remove parseonly in decoder ctx 2015-11-27 08:56:20 +08:00
ruil2
6696022028 Merge pull request #2283 from shihuade/MultiThread_V4.3_SliceBs_V2
initial for iReturn and refact PPS Sps bs write function
2015-11-26 17:03:54 +08:00
huade
436da21ccf initial for iReturn and refact PPS Sps bs write function 2015-11-26 14:06:01 +08:00
ruil2
60aaf48744 Merge pull request #2282 from shihuade/MultiThread_V4.3_SliceBs_V1_Pull
refact WriteSliceBs()
2015-11-26 12:12:33 +08:00
huade
4a4ade1201 refact WriteSliceBs() 2015-11-26 09:32:33 +08:00
sijchen
8667452940 adjust the input para judgement of iMaxNalSize 2015-11-25 14:21:32 -08:00
sijchen
05c89b75f0 remove duplicated operation after thread pool and rename a task for clearer meaning 2015-11-25 13:46:21 -08:00
sijchen
67dab5d70e Merge pull request #2266 from sijchen/ut0
[UT] put class notification to header file
2015-11-25 09:57:43 -08:00
HaiboZhu
a422180695 Merge pull request #2277 from ruil2/qp_trace
add minqp and maxqp parameters in console
2015-11-25 15:05:12 +08:00
HaiboZhu
3dccfabce3 Merge pull request #2276 from shihuade/MultiThread_V4.2_SSliceCtx_pSliceCountInMB_V3
remove pCountMbNumInSlice from SSliceCtx
2015-11-25 15:02:50 +08:00
Karina
ab7eb1535d add minqp and maxqp parameters in console 2015-11-25 14:21:44 +08:00
huade
d02addd90f remove pCountMbNumInSlice from SSliceCtx 2015-11-25 13:36:37 +08:00
HaiboZhu
60f36eb25a Merge pull request #2275 from HaiboZhu/Fix_Emulation_Prevention_Bytes_Profiles_Bugs
Add protection for emulation prevention bytes and profile_id
2015-11-25 12:30:51 +08:00
HaiboZhu
f47be08065 Merge pull request #2271 from sijchen/rf1
[Encoder] refactor multi-thread logic and add error-dealing
2015-11-25 12:04:00 +08:00
unknown
cc6b409f12 Add protection for emulation prevention bytes and profile_id 2015-11-25 11:48:07 +08:00
HaiboZhu
d85b1f6863 Merge pull request #2274 from shihuade/MultiThread_V4.2_SSliceCtx_PFirstMBInSlice_Pull_BugFixed
fixed bug for firsMbIndex in multi-thread-slice encoding with slicemo…
2015-11-25 11:12:24 +08:00
HaiboZhu
404315ab19 Merge pull request #2270 from huili2/parseonly_api_bugfix
disable wrongly calling for parseonly related
2015-11-25 09:00:54 +08:00
sijchen
13cb84e695 use different task when load-balancing or not to save computation 2015-11-24 14:19:15 -08:00
sijchen
1247006cbb remove unneeded variable 2015-11-24 13:39:27 -08:00
sijchen
2df092bcae refactor multi-thread logic 2015-11-24 13:35:55 -08:00
sijchen
2fc9c08710 put duplicated codes into one function 2015-11-24 11:14:58 -08:00
huade
29dd5e71be fixed bug for firsMbIndex in multi-thread-slice encoding with slicemode==SM_SIZELIMITED_SLICE 2015-11-24 17:55:30 +08:00
HaiboZhu
01016b1c83 Merge pull request #2264 from sijchen/api41
[Encoder] put bUseLoadBalancing into actual usage and add test case for it
2015-11-24 14:16:21 +08:00
ruil2
9ea1f0c7ea Merge pull request #2269 from shihuade/MultiThread_V4.2_SSliceCtx_pSliceComplexRatio_pull
remove pSliceComplexRatio from SliceThreading
2015-11-24 12:37:58 +08:00
huili2
9fade10d77 disable wrongly calling for parseonly related 2015-11-24 11:11:27 +08:00
huade
f263f0710a remove pSliceComplexRatio from SliceThreading 2015-11-24 10:44:23 +08:00
HaiboZhu
aeb55e07fe Merge pull request #2268 from HaiboZhu/Update_ftell_fseek_support_long_file
Update encoder console to support 64bit file length
2015-11-24 10:35:54 +08:00
HaiboZhu
4c19823d44 Merge pull request #2267 from shihuade/MultiThread_V4.2_SSliceCtx_SliceConSumeTime_Pull
remove pSliceConsumeTime in SSliceCtx and SliceThreading
2015-11-24 10:35:43 +08:00
Haibo Zhu
d7644664a6 Update the ftell and fseek to support 64bit length 2015-11-24 09:23:18 +08:00
huade
b001785eee remove pSliceConsumeTime in SSliceCtx and pSliceThreading 2015-11-24 08:58:37 +08:00
sijchen
5d03a8a692 put class notification to header file 2015-11-23 15:55:24 -08:00
sijchen
f3c4b878ff update the usage of flag and MD5 value 2015-11-23 11:54:43 -08:00
ruil2
2d3071e37c Merge pull request #2262 from shihuade/MultiThread_V4.2_SSliceCtx_PFirstMBInSlice_Pull
remove pFirstMbInSlice in SSliceCtx
2015-11-20 14:35:00 +08:00
huade
9ef07c5b99 remove pFirstMbInSlice in SSliceCtx 2015-11-20 09:51:01 +08:00
huili2
6f15550b9e Merge pull request #2261 from mstorsjo/fix-test-init-uninit
Fix DecoderInterfaceTest::TestInitUninit()
2015-11-20 08:35:33 +08:00
Martin Storsjö
eaf4798119 Readd a test for GetOption in TestInitUninit
In dc2cbe4, the previous test for GetOption that succeeds when the
decoder is initialized was removed. Add a GetOption call for a different
option, now that DECODER_OPTION_DATAFORMAT is removed.
2015-11-20 00:17:43 +02:00
Martin Storsjö
b3b083c883 Fully initialize m_sDecParam in TestInitUninit
Before dc2cbe4, the DecoderConfigParam function returned early
since DecoderSetCsp signaled a failure, which is why the uninitialized
parameters weren't read before.

This fixes valgrind warnings about conditional jumps depending on
uninitialized values.
2015-11-20 00:13:42 +02:00
sijchen
222c84c193 Merge pull request #2260 from shihuade/MultiThread_V4.1_SliceCtx_V10V11_Pull_V4
change input parameters for  UpdateMbNeighbourInfoForNextSlice etc.
2015-11-19 13:20:28 -08:00
huade
b77b68ffa0 change input parameters for UpdateMbNeighbourInfoForNextSlice etc. 2015-11-19 17:18:03 +08:00
HaiboZhu
54a194ce66 Merge pull request #2258 from shihuade/MultiThread_V4.1_SliceCtx_V6V7V8V9_Pull_V2
change input parameters for AssignMbMapMultipleSlices etc,
2015-11-19 16:10:54 +08:00
huade
c842c5c946 change input parameters for DynamicAdjustSlicePEncCtxAll etc, SSliceCtx refactoring 2015-11-19 15:00:38 +08:00
HaiboZhu
e4229db53d Merge pull request #2257 from shihuade/MultiThread_V4.1_SliceCtx_V5_Pull
change input parameters for AssignMbMapMultipleSlices
2015-11-19 14:51:52 +08:00
HaiboZhu
40b2cc85f3 Merge pull request #2256 from shihuade/MultiThread_V4.1_SliceCtx_V4_Pull_V3
change input parameters for Init/UninitSlicePEncCtx()
2015-11-19 14:07:29 +08:00
huade
c298755da5 SSliceCtx structure refactoring----change input parameters for AssignMbMapMultipleSlices 2015-11-19 13:29:16 +08:00
huade
b60bb67b4e SSliceCtx structure refactoring----change input parameters for Init/UninitSlicePEncCtx() 2015-11-19 13:19:34 +08:00
HaiboZhu
268e6cf09f Merge pull request #2255 from shihuade/MultiThread_V4.1_SliceCtx_V3_Pull_V3
remove (ppCtx)->pSliceCtxList and only keep DqLayer->sSliceCtx
2015-11-19 13:13:01 +08:00
huade
35ab32b1a3 remove (ppCtx)->pSliceCtxList and only keep DqLayer->sSliceCtx to simplify the structure management 2015-11-19 11:03:50 +08:00
HaiboZhu
f9d8e9a76e Merge pull request #2249 from huili2/remove_output_colorformat
remove data format in decoder API
2015-11-19 09:11:29 +08:00
sijchen
e0282587d1 Merge pull request #2251 from luser/plugin-name
add an echo-plugin-name target
2015-11-18 10:03:42 -08:00
Ted Mielczarek
bdb837ffaf add an echo-plugin-name target 2015-11-18 06:47:39 -05:00
ruil2
174f09bd10 Merge pull request #2246 from shihuade/MultiThread_V4.1_SliceCtx_V2_Pull
SSliceCtx structure refactoring----change input parameters for UpdateSl…
2015-11-18 13:42:20 +08:00
ruil2
a8584b530f Merge pull request #2245 from shihuade/MultiThread_V4.1_SliceCtx_V1_Pull_V2
SSliceCtx structure refactoring----change input parameters for UpdateMb…
2015-11-18 13:42:02 +08:00
huade
8d44427dc6 SSliceCtx structure refactoring----change input parameters for UpdateSlicepEncCtxWithPartition 2015-11-17 20:54:27 +08:00
huade
06eb03578d SSliceCtx structure refactoring----change input parameters for UpdateMbListNeighborParallel 2015-11-17 17:54:58 +08:00
HaiboZhu
148f86f3b0 Merge pull request #2244 from pengyanhai/master
Merge back the changes of v1.5.2 to master and generate PDB file on Windows
2015-11-17 14:35:16 +08:00
huili2
dc2cbe4a22 remove API data format in decoder in 1.6 2015-11-17 13:58:57 +08:00
Hank Peng
28bdcc3871 Update the plugin version to v1.5.2 2015-11-16 16:59:21 -08:00
Hank Peng
baa69f3cd0 Shut down the encoder/decoder thread when Encoding/DecodingComplete is invoked, to avoid potential crash on Android 2015-11-16 16:56:11 -08:00
Hank Peng
e014b5ea43 Avoid to call any host API after Encoding/DecodingComplete(), to avoid potential crash in the browser 2015-11-16 16:55:58 -08:00
Hank Peng
545612e4d7 Merge branch 'master' of github.com:pengyanhai/openh264 2015-11-16 16:53:58 -08:00
sijchen
7bc1b7abf5 Merge pull request #2240 from ruil2/qp_trace
add qp related trace
2015-11-16 10:36:24 -08:00
sijchen
18fdf6292d Merge pull request #2239 from ruil2/remove_trace
remove iAbsDiffPicNumMinus1 processing for no reference frame
2015-11-16 10:36:00 -08:00
ruil2
8f785ebcd5 Merge pull request #2241 from shihuade/MultiThread_V4.0_ThreadPoolChange_V5_astyle
astyle for codec/encoder/core/src/slice_multi_threading.cpp
2015-11-16 17:39:40 +08:00
huade
953f74a8a2 astyle for codec/encoder/core/src/slice_multi_threading.cpp 2015-11-16 15:20:24 +08:00
Karina
96b5b3965e add qp related trace 2015-11-16 13:08:48 +08:00
Karina
42222b8e7e remove iAbsDiffPicNumMinus1 processing for no reference frame 2015-11-16 12:22:10 +08:00
HaiboZhu
8d2883277c Merge pull request #2236 from sijchen/thp82
[Encoder] add error handling of task returns
2015-11-16 10:26:45 +08:00
HaiboZhu
991b05fb69 Merge pull request #2238 from shihuade/MultiThread_V4.0_ThreadPoolChange_V2
fixed bug for NeedDynamicAdjust()
2015-11-16 10:13:02 +08:00
huade
0d4d32efbd fixed bug for NeedDynamicAdjust() 2015-11-16 08:57:38 +08:00
sijchen
6fe05b0996 add error handling of task returns 2015-11-13 12:05:06 -08:00
sijchen
b5d890c1ea Merge pull request #2224 from sijchen/thp73
[Encoder] put the logic related to multiple D layer into a class …
2015-11-13 11:57:07 -08:00
HaiboZhu
da0965c42f Merge pull request #2234 from HaiboZhu/Revert_Simply_Dec_Ctx
Revert "Merge pull request #2217 from huili2/simply_dec_ctx"
2015-11-13 20:45:11 +08:00
Haibo Zhu
628befe8be Revert "Merge pull request #2217 from huili2/simply_dec_ctx"
This reverts commit 27172bafd7ff2cc80b08768a32a23470f3d6d3fd, reversing
changes made to 24916a652ee5d3e36d931c222df20966f7c158fa.
2015-11-13 20:16:03 +08:00
HaiboZhu
513a34069d Merge pull request #2232 from ruil2/fix_crash_1
fix crash
2015-11-13 17:19:16 +08:00
Karina
7c1fbad53a fix crash 2015-11-13 17:16:26 +08:00
huili2
d2e66deb66 Merge pull request #2227 from sijchen/thp92
[Encoder] fix the missing loadbalancing part
2015-11-13 07:38:06 +08:00
sijchen
e508c86dac fix the missing loadbalancing part 2015-11-12 13:15:07 -08:00
sijchen
aeb5ab4b99 [Encoder] put the logic related to multiple D layer into a class for better structure 2015-11-11 22:55:16 -08:00
HaiboZhu
beacba76e3 Merge pull request #2220 from sijchen/thp61
[Encoder] add preencodingtasklist in task management
2015-11-12 13:54:49 +08:00
pengyanhai
b5792a09f9 Generate PDB file for openh264.dll and gmpopenh264.dll 2015-11-10 20:37:17 -08:00
HaiboZhu
1a2606f45d Merge pull request #2219 from sijchen/api3
[Encoder] change API for slicing part for easier usage
2015-11-11 09:19:03 +08:00
HaiboZhu
27172bafd7 Merge pull request #2217 from huili2/simply_dec_ctx
remove bParseonly in ctx using that in param, and slightly modify the…
2015-11-11 09:18:04 +08:00
sijchen
33c378f7b7 change API for slicing part for easier usage (the UseLoadBalancing flag is still under working) 2015-11-10 09:50:06 -08:00
huili2
24916a652e Merge pull request #2215 from pengyanhai/master
Tear down the OpenH264 encoder and decoder properly to avoid potential crash and memory leak
2015-11-10 09:07:41 +08:00
pengyanhai
b5f1460dd1 Tear down the OpenH264 encoder and decoder properly to avoid potential crash and memory leak 2015-11-09 11:52:11 -08:00
HaiboZhu
643df65c58 Merge pull request #2212 from ruil2/rc2
remove an useless code line
2015-11-09 09:53:28 +08:00
HaiboZhu
f1b10e454d Merge pull request #2213 from ruil2/rc4
do GOM rate control for I frame
2015-11-09 09:53:05 +08:00
Karina
e20ce63778 do GOM rate control for I frame 2015-11-06 16:08:52 +08:00
Karina
a251504aa2 remove an useless code line 2015-11-06 13:43:21 +08:00
huili2
17e610da9f Merge pull request #2209 from sijchen/fixslc
[UT] add autolock in ThreadPoolTest to avoid possible conflict
2015-11-05 13:46:51 +08:00
huili2
c47d235942 Merge pull request #2210 from pengyanhai/master
Never call GMPVideoDecoderCallback after Encoding/DecodingComplete, to fix bug #1204588 in Bugzilla
2015-11-05 09:09:55 +08:00
Hank Peng
955fce60a1 Never call GMPVideoDecoderCallback after DecodingComplete, to fix bug #1204588 in Bugzilla 2015-11-04 11:29:02 -08:00
sijchen
59779539e7 add autolock in ThreadPoolTest to avoid possible conflict 2015-11-04 10:29:08 -08:00
HaiboZhu
f13f502203 Merge pull request #2208 from sijchen/fixslc
[Encoder] Fix for a slicing and multi-threading setting
2015-11-04 09:24:08 +08:00
HaiboZhu
e6d9d44344 Merge pull request #2204 from sijchen/ut_template2
[UT] add a .template file for codec UT
2015-11-04 09:23:26 +08:00
Sijia Chen
2dab8bf087 fix for a slicing and multi-threading setting 2015-11-03 14:42:56 -08:00
Sijia Chen
ee27d13262 add preencodingtasklist in task management
add interface to enable different task list
2015-11-03 09:33:26 -08:00
sijchen
597adfd98c Merge pull request #2207 from sijchen/thp53
[Encoder] remove unneeded codes and add some logs (basing on PR2206)
2015-11-03 09:05:55 -08:00
sijchen
b0c6ea9385 Merge pull request #2206 from sijchen/thp42
[Encoder] adjust encoder tasks, add ut and enable new thread pool under some cases
2015-11-03 09:05:43 -08:00
Sijia Chen
3d3884641c use the correct commit number in comment 2015-11-02 23:19:02 -08:00
Sijia Chen
3e0ee69812 remove unneeded codes and add some logs 2015-11-02 23:15:29 -08:00
HaiboZhu
cda6a1fa76 Merge pull request #2191 from mstorsjo/cabac-warnings
Avoid warnings in the cabac code
2015-10-29 14:19:22 +08:00
HaiboZhu
17934b9843 Merge pull request #2192 from sijchen/fix_slc
[Encoder] change an improper setting of max_slice_count
2015-10-29 14:17:26 +08:00
HaiboZhu
0292647449 Merge pull request #2195 from sijchen/add_stat_log
[Encoder] Log enhancement for easier debugging
2015-10-29 14:17:19 +08:00
sijchen
8eed27a357 Merge pull request #2197 from HaiboZhu/Add_binary_address_RELEASES
Add binary package address in RELEASES
2015-10-28 21:54:03 -07:00
Sijia Chen
3350cf75a5 add one test case 2015-10-28 21:51:47 -07:00
sijchen
1ed0e8c37b Merge pull request #2196 from shihuade/PSliceRefact_V1.5
refact WelsMarkPicScreen  based on pSlice buffer refactoring
2015-10-28 21:28:42 -07:00
Haibo Zhu
45f26e4fb7 Add binary package address in RELEASES 2015-10-29 09:18:23 +08:00
huade
d962ff1ed1 refact WelsMarkPicScreen based on pSlice buffer refactoring 2015-10-29 09:17:39 +08:00
Sijia Chen
32669bc941 change an improper setting of max_slice_count 2015-10-28 13:55:21 -07:00
Sijia Chen
054a297ca7 adjust encoder tasks, add ut and enable new thread pool under some slice modes 2015-10-28 09:39:26 -07:00
Martin Storsjö
1661a60090 Avoid warnings in the cabac code
Use int32_t for a parameter that is always 0 or 1, because it is
negated. This fixes "warning C4146: unary minus operator applied
to unsigned type, result still unsigned" in MSVC.

Also add casts to silence MSVC warnings about "conversion from
'WelsEnc::cabac_low_t' to 'uint8_t', possible loss of data".

The generated code still is identical to before, on both gcc
and clang.
2015-10-28 14:39:30 +02:00
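The C4146 fix described in commit 1661a60090 can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the OpenH264 sources; the only assumption carried over from the commit message is that the flag is always 0 or 1 and gets negated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not from the OpenH264 sources.
// iFlag is assumed to be 0 or 1. With a signed type, -iFlag is 0 or -1,
// i.e. an all-zeros or all-ones mask, and MSVC has nothing to warn about.
inline int32_t MaskFromFlag (int32_t iFlag) {
  return -iFlag;
}

// The unsigned counterpart. Writing "-uiFlag" here is what triggers C4146;
// "0u - uiFlag" computes the same wrapped value (0 or 0xFFFFFFFF).
inline uint32_t MaskFromFlagUnsigned (uint32_t uiFlag) {
  return 0u - uiFlag;
}

// Both masks have the same bit pattern, so switching the parameter type
// need not change the generated code.
inline void CheckMasks() {
  assert (MaskFromFlag (1) == -1);
  assert (static_cast<uint32_t> (MaskFromFlag (1)) == MaskFromFlagUnsigned (1u));
  assert (MaskFromFlag (0) == 0 && MaskFromFlagUnsigned (0u) == 0u);
}
```

Calling `CheckMasks()` passes silently, which is consistent with the commit's remark that the generated code stays identical.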
HaiboZhu
1a7a3e2462 Merge pull request #2188 from shihuade/PSliceRefact_V1.2
refact WelsMarkPic based on pSlice buffer refactoring
2015-10-28 11:04:22 +08:00
HaiboZhu
fe7684bf37 Merge pull request #2187 from shihuade/PSliceRefact_V1.1
refact slice header init
2015-10-28 09:58:31 +08:00
huade
ff8bb6238d refact WelsMarkPic based on pSlice buffer refactoring 2015-10-27 17:44:55 +08:00
huili2
777dbc09d4 remove bParseonly in ctx using that in param, and slightly modify the initialize process of decoder 2015-10-27 16:12:08 +08:00
HaiboZhu
a3e60a1c6f Merge pull request #2186 from shihuade/PSliceRefact_V1.0
change the AbsDiffPicNumMinus1 check logic
2015-10-27 15:07:54 +08:00
sijchen
a0328cda80 Merge pull request #2184 from mstorsjo/fix-readme
Remove a false claim about older android versions not being supported
2015-10-26 13:36:01 -07:00
Martin Storsjö
0a57ec3c40 Remove a false claim about older android versions not being supported
Building for older android versions works just fine.
2015-10-26 16:54:00 +02:00
HaiboZhu
319552db48 Merge pull request #2160 from cisco/cj-build1
Cj build1
2015-10-26 13:58:29 +08:00
HaiboZhu
906627a029 Merge pull request #2179 from alexcohn/patch-1
Update README.md
2015-10-26 13:56:01 +08:00
HaiboZhu
51d8e00564 Merge pull request #2180 from saamas/cabac_encode_opt
[Encoder] CABAC optimizations
2015-10-26 09:02:51 +08:00
huade
08f7ad3f1f refact slice header init 2015-10-23 15:46:06 +08:00
huade
741c122399 change the AbsDiffPicNumMinus1 check logic 2015-10-23 14:45:18 +08:00
HaiboZhu
e0cee02d77 Merge pull request #2177 from sijchen/thp21
[Encoder] add encoder tasks and task-management class
2015-10-23 13:21:42 +08:00
Sindre Aamås
ed133d4c3d [Encoder] CABAC optimizations
~2.4x speedup (time attributed to all CABAC-related functions) on x86
(Ivy Bridge) with GCC version 4.9.2 (Debian 4.9.2-10).

~1.3x overall faster encode on a quick 720p30 6Mbps test.

Reviewed at https://rbcommons.com/s/OpenH264/r/1347/
2015-10-21 12:53:12 +02:00
Alex Cohn
a9605ac063 Update README.md
See https://github.com/cisco/openh264/issues/1807#issuecomment-149794717.
2015-10-21 09:49:58 +03:00
sijchen
b700b67bba Merge pull request #2178 from mstorsjo/add-missing-include
Add a missing include of stdlib.h
2015-10-19 23:15:46 -07:00
Martin Storsjö
80c8b7b1cc Add a missing include of stdlib.h
This is required for malloc in this header.

This fixes building for Windows Phone.
2015-10-20 08:59:41 +03:00
Sijia Chen
819f6f5d93 [Encoder] add encoder tasks and task-management class
https://rbcommons.com/s/OpenH264/r/1334/
2015-10-19 22:48:28 -07:00
HaiboZhu
490098915f Merge pull request #2176 from HaiboZhu/Bugfix_CHECK_SE_UPPER_conditions
Fix the macro UPPER_CHECK conditions
2015-10-20 10:31:57 +08:00
Haibo Zhu
151c1d9ffd Fix the macro UPPER_CHECK conditions 2015-10-19 18:12:53 -07:00
sijchen
9befe7b1a3 Merge pull request #2173 from mstorsjo/remove-includes
Remove unused STL includes
2015-10-19 10:41:13 -07:00
HaiboZhu
3ee8784c01 Merge pull request #2170 from HaiboZhu/Bugfix_entropy_decoding_upper_check
Add protection for unsigned int output
2015-10-19 16:31:15 +08:00
Martin Storsjö
dac26cf923 Remove unused STL includes
This fixes building for Android, where libopenh264.so is intended
not to link to any particular STL implementation.
2015-10-19 11:21:29 +03:00
Haibo Zhu
9ba2c9825c (1) add protection for golomb GetUe output value
(2) change the max length of cabac bypass to 16
2015-10-18 20:12:34 -07:00
HaiboZhu
fb61733b27 Merge pull request #2163 from HaiboZhu/Remove_cabac_shift_exponent_too_large
Remove the shift exponent too large warning
2015-10-16 21:16:56 +08:00
HaiboZhu
ea52112d45 Merge pull request #2158 from sijchen/thp0a
[Common] basic thread pool functions
2015-10-16 16:50:39 +08:00
HaiboZhu
7cbc31a0bf Merge pull request #2161 from huili2/MMCO_overflow
prevent too many MMCO num overflow
2015-10-16 16:50:26 +08:00
HaiboZhu
10179539cf Merge pull request #2157 from sijchen/mb
Sync from release 1.5 on API and release notes, etc.
2015-10-16 16:47:04 +08:00
Haibo Zhu
f1d92ef363 Remove the shift exponent too large warning 2015-10-16 01:13:23 -07:00
huili2
4bafe1c430 prevent too many MMCO num overflow 2015-10-16 10:36:13 +08:00
unknown
cb6ab3211d add new ut to win ut proj 2015-10-16 02:37:39 +08:00
Sijia Chen
b29760ee31 remove unneeded parts 2015-10-15 11:31:34 -07:00
Sijia Chen
ade32f5c48 implementation for WelsSleep on WP8.0
https://rbcommons.com/s/OpenH264/r/1315/
2015-10-15 11:27:43 -07:00
Sijia Chen
a3f606e58a replacement of std::list for m_cBusyThreads
https://rbcommons.com/s/OpenH264/r/1320/
2015-10-15 11:17:29 -07:00
Sijia Chen
bc566f0923 put m_cIdleThreads to CWelsCircleQueue rather than std::map
https://rbcommons.com/s/OpenH264/r/1313/
2015-10-15 10:24:48 -07:00
Sijia Chen
eb00d5cb9e change std::list to internal implementation and add the new ut file for CWelsCircleQueue
https://rbcommons.com/s/OpenH264/r/1310/
2015-10-15 10:11:29 -07:00
Sijia Chen
757a596e97 add basic threadpool functions
https://rbcommons.com/s/OpenH264/r/1294/
2015-10-15 10:04:00 -07:00
Sijia Chen
6ca397e758 correct a typo along with the in-plan v1.5 release 2015-10-15 09:55:06 -07:00
Sijia Chen
9d25161f40 add version updates after 1.5 release 2015-10-15 09:54:10 -07:00
Sijia Chen
2f836fc787 add release notes for 1.5 2015-10-15 09:53:02 -07:00
HaiboZhu
af6a9a838f Merge pull request #2152 from mstorsjo/remove-unused-code
Remove unused source files from the encoder
2015-10-15 12:03:41 +08:00
HaiboZhu
813cfca95f Merge pull request #2155 from HaiboZhu/Remove_UBSAN_negative_left_shift_warning
Remove UBSAN warnings about negative left shift
2015-10-15 11:31:55 +08:00
Haibo Zhu
03d16bb4d1 Remove UBSAN warnings about negative left shift 2015-10-14 19:43:19 -07:00
HaiboZhu
3067d127aa Merge pull request #2153 from mstorsjo/fix-warnings
Fix warnings when building for iOS with xcode
2015-10-13 18:26:56 +08:00
HaiboZhu
6c13a2c496 Merge pull request #2151 from mstorsjo/fix-msvc
Revert an accidental change that broke MSVC compilation
2015-10-13 18:26:14 +08:00
Martin Storsjö
8363d43588 Fix warnings when building for iOS with xcode 2015-10-13 12:27:11 +03:00
Martin Storsjö
5ff8af6883 Remove unused source files from the encoder 2015-10-13 12:21:34 +03:00
Martin Storsjö
837599becc Revert an accidental change that broke MSVC compilation
This reverts an unrelated part of e7e3b4f37f0.

Since the function still is declared as taking an int32_t parameter
in the header, changing the function implementation makes it end
up as a different function.
2015-10-13 12:15:01 +03:00
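The failure mode described in commit 837599becc can be sketched as follows. The names are hypothetical and this is not the function the commit touched; it only shows that a definition whose parameter type differs from the header declaration is a separate overload in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.

// What the "header" promises to callers.
int ScaleBy (int32_t iValue);

// An "implementation" whose parameter type drifted to uint32_t. C++ treats
// this as a separate overload, not as the definition of the declaration above.
int ScaleBy (uint32_t uiValue) {
  return static_cast<int> (uiValue) * 2;
}

// The definition that actually matches the declaration. Without it, a call
// with an int32_t argument would compile but fail at link time.
int ScaleBy (int32_t iValue) {
  return iValue * 3;
}

// The two overloads are resolved independently by argument type.
inline void CheckOverloads() {
  assert (ScaleBy (int32_t (2)) == 6);
  assert (ScaleBy (uint32_t (2)) == 4);
}
```

Keeping the declaration and the definition on the same parameter type, as the revert does, avoids ending up with two unrelated functions.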
HaiboZhu
6239fbe131 Merge pull request #2150 from huili2/log_memory_decrease
decrease log output for decoder memory info
2015-10-13 16:42:03 +08:00
HaiboZhu
df936ad73b Merge pull request #2131 from sijchen/fix_simul3
[Encoder] Add fix for simulcast for 3 spatial layers
2015-10-13 16:41:15 +08:00
huili2
410689f7ca Merge pull request #2147 from HaiboZhu/Bugfix_uninit_strcat
Init the string value and add protection for WelsStrcat()
2015-10-13 09:04:51 +08:00
huili2
042ac9aba1 decrease log output for decoder memory info 2015-10-12 10:53:33 +08:00
Haibo Zhu
e7e3b4f37f Init the string value and add protection for WelsStrcat() 2015-10-10 08:45:48 -07:00
sijchen
b37cda2482 Merge pull request #2138 from HaiboZhu/Bugfix_SPS_update_logic_under_EC
Fix a SPS update logic bug under EC mode
2015-10-08 10:11:32 -07:00
Haibo Zhu
4ffdca6b06 Fix the SPS update logic bug under EC mode 2015-10-08 02:01:15 -07:00
huili2
4901821328 Merge pull request #2137 from HaiboZhu/Bugfix_CAVCL_8x8_init_error
Fix the 8x8 init bug under CAVCL when scalinglist enable
2015-10-08 16:36:16 +08:00
Haibo Zhu
2cd3fc805d Fix the 8x8 init bug under CAVCL when scalinglist enable 2015-10-07 19:49:37 -07:00
Sijia Chen
542fd232cc add a .template file for codec UT 2015-10-06 11:10:06 -07:00
Sijia Chen
b86bd5f7f6 modify forceIDR log 2015-10-05 16:22:30 -07:00
Sijia Chen
f230c63777 add one more log statistics 2015-10-05 16:16:27 -07:00
Sijia Chen
82cc0535ae Add fix for simulcast if frame rate in the middle spatial layer is smaller 2015-09-30 17:26:50 -07:00
HaiboZhu
f9f2bbf805 Merge pull request #2127 from huili2/repos_DecoderConfigParam
move DecoderConfigParam into InitDecoder
2015-09-23 17:41:52 +08:00
huili2
ecab683f0f move DecoderConfigParam into InitDecoder 2015-09-23 14:37:53 +08:00
huili2
6efeb0ef95 Merge pull request #2124 from HaiboZhu/Bugfix_Duplicate_frame_num
Check the duplicate frame_num in short ref list
2015-09-22 15:45:22 +08:00
HaiboZhu
936747e9a4 Merge pull request #2122 from sijchen/fixsimul
[Encoder] fix for simulcast case when frame rate of lower resolution is higher
2015-09-22 14:51:29 +08:00
unknown
868c8e45a1 Check the duplicate frame_num in short ref list
Add more judgement for return value in WelsMarkAsRef()
2015-09-21 21:31:59 -07:00
Cullen Jennings
f9bfb31fd2 Update Dockerfile
Junk the hash of .a file
2014-11-29 15:48:23 -07:00
Cullen Jennings
9baaaa2a64 fix line continuation problem 2014-11-29 11:40:18 -07:00
Cullen Jennings
d744a1ab19 update .gitignore to ignore emacs temp files 2014-11-27 15:46:29 -07:00
Cullen Jennings
df07fa8adc Test dockerfile to see if we get consistent builds 2014-11-27 15:45:52 -07:00
201 changed files with 25605 additions and 20657 deletions

9
.gitignore vendored
View File

@ -30,6 +30,7 @@ codec_unittest
# Other files generated by the MSVC compiler
*.exp
*.pdb
*.map
# Executables built by the MSVC project files
bin
@ -46,5 +47,13 @@ testbin/test_vd_rc.yuv
testbin/test.264
testbin/test.yuv
# iOS output files
codec/build/iOS/common/build/
codec/build/iOS/dec/welsdec/build/
# pkg-config file
*.pc
# editor files
*~

View File

@ -1,4 +1,5 @@
language: cpp
dist: trusty
compiler:
- g++
@ -6,7 +7,7 @@ compiler:
before_install:
- sudo apt-get update -qq
- sudo apt-get install -qq nasm g++-4.6-multilib gcc-multilib libc6-dev-i386
- sudo apt-get install -qq nasm g++-multilib gcc-multilib libc6-dev-i386
install:
- make gmp-bootstrap
@ -21,15 +22,12 @@ before_script:
env:
- TASK=UnitTest; TestParameter=""
- TASK=BinaryCompare; TestParameter=BA_MW_D.264
- TASK=BinaryCompare; TestParameter=LS_SVA_D.264
- TASK=BinaryCompare; TestParameter=CVPCMNL1_SVA_C.264
matrix:
exclude:
- compiler: clang
env: TASK=BinaryCompare; TestParameter=BA_MW_D.264
- compiler: clang
env: TASK=BinaryCompare; TestParameter=LS_SVA_D.264
- compiler: clang
env: TASK=BinaryCompare; TestParameter=CVPCMNL1_SVA_C.264
script:

View File

@ -1,4 +1,4 @@
# Contributors to the OpenH264 project
# Contributors to the OpenH264 project
Patrick Ai
Sijia Chen
@ -35,6 +35,7 @@ James Wang
Juanny Wang
Zhiliang Wang
Hervé Willems
Gregory J Wolfe
Katherine Wu
Guang Xu
Jeffery Xu

View File

@ -33,8 +33,8 @@ GMP_API_BRANCH=Firefox39
CCASFLAGS=$(CFLAGS)
STATIC_LDFLAGS=-lstdc++
VERSION=1.5.1
SHAREDLIBVERSION=1
VERSION=1.6
SHAREDLIBVERSION=3
ifeq (,$(wildcard $(SRC_PATH)gmp-api))
HAVE_GMP_API=No
@ -51,7 +51,14 @@ endif
# Configurations
ifeq ($(BUILDTYPE), Release)
CFLAGS += $(CFLAGS_OPT)
CFLAGS += -DNDEBUG
USE_ASM = Yes
ifeq ($(DEBUGSYMBOLS), True)
CFLAGS += -g
CXXFLAGS += -g
DEBUGSYMBOLS_TAG := _debug_symbols
PROCESS_FILES := True
endif
else
CFLAGS += $(CFLAGS_DEBUG)
USE_ASM = No
@ -62,11 +69,17 @@ CFLAGS += -fsanitize=address
LDFLAGS += -fsanitize=address
endif
STRIP_FLAGS := -S
ifeq (linux, $((OS)))
STRIP_FLAGS := -g
endif
# Make sure the all target is the first one
all: libraries binaries
include $(SRC_PATH)build/platform-$(OS).mk
MODULE := $(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIX)
CFLAGS += -DGENERATED_VERSION_HEADER
LDFLAGS +=
@ -108,11 +121,11 @@ PROCESSING_INCLUDES += \
-I$(SRC_PATH)codec/processing/src/vaacalc
GTEST_INCLUDES += \
-I$(SRC_PATH)gtest \
-I$(SRC_PATH)gtest/include
-I$(SRC_PATH)gtest/googletest \
-I$(SRC_PATH)gtest/googletest/include
CODEC_UNITTEST_INCLUDES += \
-I$(SRC_PATH)gtest/include \
-I$(SRC_PATH)gtest/googletest/include \
-I$(SRC_PATH)codec/common/inc \
-I$(SRC_PATH)test
@ -154,14 +167,14 @@ clean:
ifeq (android,$(OS))
clean: clean_Android
endif
$(QUIET)rm -f $(OBJS) $(OBJS:.$(OBJ)=.d) $(OBJS:.$(OBJ)=.obj) $(LIBRARIES) $(BINARIES) *.lib *.a *.dylib *.dll *.so *.exe *.pdb *.exp *.pc *.res
$(QUIET)rm -f $(OBJS) $(OBJS:.$(OBJ)=.d) $(OBJS:.$(OBJ)=.obj) $(LIBRARIES) $(BINARIES) *.lib *.a *.dylib *.dll *.so *.exe *.pdb *.exp *.pc *.res *.map
gmp-bootstrap:
if [ ! -d gmp-api ] ; then git clone https://github.com/mozilla/gmp-api gmp-api ; fi
cd gmp-api && git fetch origin && git checkout $(GMP_API_BRANCH)
gtest-bootstrap:
svn co https://googletest.googlecode.com/svn/trunk/ gtest
git clone https://github.com/google/googletest.git gtest
ifeq ($(HAVE_GTEST),Yes)
@ -215,18 +228,29 @@ LIBRARIES += $(LIBPREFIX)$(PROJECT_NAME).$(LIBSUFFIX) $(LIBPREFIX)$(PROJECT_NAME
$(LIBPREFIX)$(PROJECT_NAME).$(LIBSUFFIX): $(ENCODER_OBJS) $(DECODER_OBJS) $(PROCESSING_OBJS) $(COMMON_OBJS)
$(QUIET)rm -f $@
$(QUIET_AR)$(AR) $(AR_OPTS) $+
ifeq (True, $(PROCESS_FILES))
cp $@ $(LIBPREFIX)$(PROJECT_NAME)$(DEBUGSYMBOLS_TAG).$(LIBSUFFIX)
strip $(STRIP_FLAGS) $@ -o $@
endif
$(LIBPREFIX)$(PROJECT_NAME).$(SHAREDLIBSUFFIXVER): $(ENCODER_OBJS) $(DECODER_OBJS) $(PROCESSING_OBJS) $(COMMON_OBJS)
$(QUIET)rm -f $@
$(QUIET_CXX)$(CXX) $(SHARED) $(CXX_LINK_O) $+ $(LDFLAGS) $(SHLDFLAGS)
ifeq (True, $(PROCESS_FILES))
cp $@ $(LIBPREFIX)$(PROJECT_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIXVER)
strip $(STRIP_FLAGS) $@ -o $@
endif
ifneq ($(SHAREDLIBSUFFIXVER),$(SHAREDLIBSUFFIX))
$(LIBPREFIX)$(PROJECT_NAME).$(SHAREDLIBSUFFIX): $(LIBPREFIX)$(PROJECT_NAME).$(SHAREDLIBSUFFIXVER)
$(QUIET)ln -sfn $+ $@
ifeq (True, $(PROCESS_FILES))
$(QUIET)ln -sfn $(LIBPREFIX)$(PROJECT_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIXVER) $(LIBPREFIX)$(PROJECT_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIX)
endif
endif
ifeq ($(HAVE_GMP_API),Yes)
plugin: $(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIX)
plugin: $(MODULE)
LIBRARIES += $(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIXVER)
else
plugin:
@ -234,13 +258,23 @@ plugin:
@echo "You do not have gmp-api. Run make gmp-bootstrap to get the gmp-api headers."
endif
echo-plugin-name:
@echo $(MODULE)
$(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIXVER): $(MODULE_OBJS) $(ENCODER_OBJS) $(DECODER_OBJS) $(PROCESSING_OBJS) $(COMMON_OBJS)
$(QUIET)rm -f $@
$(QUIET_CXX)$(CXX) $(SHARED) $(CXX_LINK_O) $+ $(LDFLAGS) $(SHLDFLAGS) $(MODULE_LDFLAGS)
ifeq (True, $(PROCESS_FILES))
cp $@ $(LIBPREFIX)$(MODULE_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIXVER)
strip $(STRIP_FLAGS) $@ -o $@
endif
ifneq ($(SHAREDLIBSUFFIXVER),$(SHAREDLIBSUFFIX))
$(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIX): $(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIXVER)
$(MODULE): $(LIBPREFIX)$(MODULE_NAME).$(SHAREDLIBSUFFIXVER)
$(QUIET)ln -sfn $+ $@
ifeq (True, $(PROCESS_FILES))
$(QUIET)ln -sfn $(LIBPREFIX)$(MODULE_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIXVER) $(LIBPREFIX)$(MODULE_NAME)$(DEBUGSYMBOLS_TAG).$(SHAREDLIBSUFFIX)
endif
endif
$(PROJECT_NAME).pc: $(PROJECT_NAME).pc.in

View File

@ -4,12 +4,13 @@ OpenH264 is a codec library which supports H.264 encoding and decoding. It is su
Encoder Features
----------------
- Constrained Baseline Profile up to Level 5.2 (4096x2304)
- Constrained Baseline Profile up to Level 5.2 (Max frame size is 36864 macro-blocks)
- Arbitrary resolution, not constrained to multiples of 16x16
- Rate control with adaptive quantization, or constant quantization
- Slice options: 1 slice per frame, N slices per frame, N macroblocks per slice, or N bytes per slice
- Multiple threads automatically used for multiple slices
- Temporal scalability up to 4 layers in a dyadic hierarchy
- Simulcast AVC up to 4 resolutions from a single input
- Spatial simulcast up to 4 resolutions from a single input
- Long Term Reference (LTR) frames
- Memory Management Control Operation (MMCO)
@ -23,7 +24,7 @@ Encoder Features
Decoder Features
----------------
- Constrained Baseline Profile up to Level 5.2 (4096x2304)
- Constrained Baseline Profile up to Level 5.2 (Max frame size is 36864 macro-blocks)
- Arbitrary resolution, not constrained to multiples of 16x16
- Single thread for all slices
- Long Term Reference (LTR) frames
@ -50,7 +51,7 @@ Processor Support
Building the Library
--------------------
NASM needs to be installed for assembly code: workable version 2.07 or above, nasm can be downloaded from http://www.nasm.us/
NASM needs to be installed for assembly code: workable version 2.10.06 or above, nasm can be downloaded from http://www.nasm.us/
For Mac OSX 64-bit, NASM needs to be below version 2.11.08, as nasm 2.11.08 introduces errors when using RIP-relative addresses in Mac OSX 64-bit
To build the arm assembly for Windows Phone, gas-preprocessor is required. It can be downloaded from git://git.libav.org/gas-preprocessor.git
@ -68,11 +69,12 @@ The codec and demo can be built by
Valid `**ANDROID_TARGET**` can be found in `**ANDROID_SDK**/platforms`, such as `android-12`.
You can also set `ARCH`, `NDKLEVEL` according to your device and NDK version.
`ARCH` specifies the architecture of android device. Currently `arm`, `arm64`, `x86` and `x86_64` are supported, the default is `arm`. (`mips` and `mips64` can also be used, but there's no specific optimization for those architectures.)
`NDKLEVEL` specifies android api level, the api level can be 12-19, the default is 12.
`NDKLEVEL` specifies android api level, the default is 12. Available possibilities can be found in `**ANDROID_NDK**/platforms`, such as `android-21` (strip away the `android-` prefix).
By default these commands build for the `armeabi-v7a` ABI. To build for the other android
ABIs, add `ARCH=arm64`, `ARCH=x86`, `ARCH=x86_64`, `ARCH=mips` or `ARCH=mips64`.
To build for the older `armeabi` ABI (which has armv5te as baseline), add `APP_ABI=armeabi` (`ARCH=arm` is implicit).
To build for 64-bit ABI, such as `arm64`, explicitly set `NDKLEVEL` to 21 or higher.
For iOS Builds
--------------
@ -129,6 +131,7 @@ From the main project directory:
- `make ARCH=i386` for x86 32bit builds
- `make ARCH=x86_64` for x86 64bit builds
- `make V=No` for a silent build (not showing the actual compiler commands)
- `make DEBUGSYMBOLS=True` to build two sets of libraries: one that keeps the debugging symbol table entries (those created by the `-g` option) and a normal one with those entries stripped
The command line programs `h264enc` and `h264dec` will appear in the main project directory.

View File

@ -1,6 +1,31 @@
Releases
-----------
v1.6.0
------
- Adjusted the encoder API structures
- Removed the unused data format in decoder API
- Encoder support of simulcast AVC
- Added support of video signal type present information
- Added support of encoder load-balancing
- Improved encoder multi-threads, rate control and down-sampling
- Fixed the frame size constraint in encoder
- Bug fixes for rate control, multi-threading, simulcasting in encoder
- Bug fixes for interface call, return value check, memory leak in decoder
- Bug fixes for UT and statistic information
- Bug fixes for assembly code
- Remove the unused and redundant code
- Improvements on UT, memory allocation failed protection, error-protection in decoder, input parameters checking in encoder, assembly for AVX2 support, assembly code performance, logging and documentation
- Correct some typos in source code and documents
v1.5.3
------
- Bug fixes for GMP Plugin
v1.5.2
------
- Fix GMP Plugin causing the Browser crash on Android
v1.5.1
------
- Bug fixes for GMP Plugin
@ -99,6 +124,17 @@ Binaries
These binary releases are distributed under this license:
http://www.openh264.org/BINARY_LICENSE.txt
v1.6.0
------
http://ciscobinary.openh264.org/libopenh264-1.6.0-android19.so.bz2
http://ciscobinary.openh264.org/libopenh264-1.6.0-ios.a.bz2
http://ciscobinary.openh264.org/libopenh264-1.6.0-linux32.3.so.bz2
http://ciscobinary.openh264.org/libopenh264-1.6.0-linux64.3.so.bz2
http://ciscobinary.openh264.org/libopenh264-1.6.0-osx32.3.dylib.bz2
http://ciscobinary.openh264.org/libopenh264-1.6.0-osx64.3.dylib.bz2
http://ciscobinary.openh264.org/openh264-1.6.0-win32msvc.dll.bz2
http://ciscobinary.openh264.org/openh264-1.6.0-win64msvc.dll.bz2
v1.5.0
------
http://ciscobinary.openh264.org/libopenh264-1.5.0-android19.so.bz2

View File

@ -0,0 +1,431 @@
#!/bin/bash
#*******************************************************************
# brief: multi-encoder comparison for openh264
# (one given sequence only)
# comparison among encoders in \$TestEncoderList
#
# more detail, please refer to runUsage() and runBrief()
#
# date: 2015-12-16
#*******************************************************************
runUsage()
{
echo -e "\033[32m ********************************************************************* \033[0m"
echo " Usage: "
echo " --$0 \$TestPicW \$TestPicH \$TestEncoderList"
echo ""
echo " --example:"
echo " $0 1280 720 h264enc_master h264enc_target1 h264enc_target2 "
echo ""
echo " Pre-test:"
echo " --1) copy welsenc.cfg from ./openh264/testbin/ to current dir"
echo " --2) set test YUV path in welsenc.cfg "
echo " --3) copy layer0.cfg from ./openh264/testbin/layer2.cfg to current dir"
echo " --4) copy layer1.cfg from ./openh264/testbin/layer2.cfg to current dir"
echo " --5) copy layer2.cfg from ./openh264/testbin/layer2.cfg to current dir"
echo " --6) copy layer3.cfg from ./openh264/testbin/layer2.cfg to current dir"
echo " layer0.cfg~layer3.cfg are used for multi-layers test cases"
echo ""
echo " --7) generate at least one encoder, "
echo " eg. h264enc_master----master branch as benchmark codec"
echo " h264enc_target----your branch CodecChanged as target codec"
echo ""
echo " --8) copy all tests codec to folder ./Encoder"
echo " --9) run below command line:"
echo " $0 \$TestPicW \$TestPicH \$TestEncoderList"
echo ""
echo " Post-test:"
echo " --1) temp cases log will be output in ./Trace-AllTestData"
echo " --2) all comparison data parsed from log files will be output to "
echo " related .csv file under ./Trace-AllTestData "
echo ""
echo " example:"
echo " --comparison among h264enc_master h264enc_target1 h264enc_target2"
echo " for Zhuling_1280x720.yuv"
echo ""
echo " --run command as below:"
echo " $0 1280 720 h264enc_master h264enc_target1 h264enc_target2 "
echo ""
echo " --get final result files(.csv) under ./Trace-AllTestData"
echo ""
echo -e "\033[32m ********************************************************************* \033[0m"
}
runBrief()
{
echo -e "\033[32m ********************************************************************* \033[0m"
echo " brief:"
echo ""
echo " encoder version comparison "
echo " --comparison among encoders in \$TestEncoderList"
echo " --please generate at least one encoder and copy to folder ./Encoder"
echo " --script will run all test cases for each test encoder"
echo " and generate related trace log files for each encoder"
echo " --script will parse and extract data based on keyword from trace log file"
echo " and output to related .csv files for all encoders"
echo " --the test output file will be put under ./Trace-AllTestData"
echo ""
echo " test cases:"
echo " --add more cases in function runGlobleInit()"
echo " --add new arguments (e.g. rc) with a for loop in function "
echo " runAllEncodeCasesAndGenerateLog()"
echo ""
echo " new data:"
echo " --currently only memory usage, you can add new data for your comparison"
echo " --new data needs a related parse step in function runParseTraceLog()"
echo ""
echo -e "\033[32m ********************************************************************* \033[0m"
}
runPrompt()
{
echo -e "\033[32m ********************************************************************* \033[0m"
echo ""
echo " ------Test completed!--------"
echo ""
echo -e "\033[32m ********************************************************************* \033[0m"
echo " "
echo " --Total ${iTotalCaseNum} cases run for encoders: ${aEncoderList[@]}"
echo ""
echo " --Statistic files for comparison are listed below:"
echo " ${MemoryUsageStatic}"
echo ""
echo " --trace log files can be found under:"
echo " ${LogDir}"
echo ""
echo -e "\033[32m ********************************************************************* \033[0m"
}
runGlobleInit()
{
CurrenDir=`pwd`
LogDir="${CurrenDir}/Trace-AllTestData"
EncoderDir="${CurrenDir}/Encoder"
if [ ! -d ${LogDir} ]
then
mkdir ${LogDir}
fi
LogFile="Log_EncTraceInfo.log"
MemoryUsageStatic="${LogDir}/MemoryUsage.csv"
TempEncoderList=""
for((i=0; i<${#aEncoderList[@]}; i++))
do
if [ -z "${TempEncoderList}" ]
then
TempEncoderList="${aEncoderList[$i]},"
else
TempEncoderList="${TempEncoderList} ${aEncoderList[$i]},"
fi
done
let "iTotalCaseNum=0"
let "MemoryUsage = 0"
echo "LogDir is ${LogDir}"
echo "MemoryUsageStatic file is ${MemoryUsageStatic}"
echo "SpatialLayerNum, ThreadNum, SliceMode, SliceNum, SliceMbNum, ${TempEncoderList}" >${MemoryUsageStatic}
let "iTraceLevel=4"
let "iFrameToBeEncoded = 32"
let "iMaxNalSize=0"
#you can add more test cases like rc, gop size, etc.
#and add a "for loop" in function runAllEncodeCasesAndGenerateLog()
aSpatialLayerNum=(1 2 3 4 )
aThreadIdc=(1 4)
aSliceMode=(0 1 2 3)
aSliceNum=(0 8 16 32)
aSliceMbNum=(0 960)
Encoder=""
sEncoderCommand=""
}
runCheck()
{
if [ ! -d ${EncoderDir} ]
then
echo "encoder folder does not exist, please use the commands below to copy encoders to the test folder--./Encoder"
echo " mkdir Encoder"
echo " cp \${AllVersionEncoders} ./Encoder "
exit 1
fi
let "bEncoderFlag = 0"
echo "aEncoderList is ${aEncoderList[@]} "
for file in ${aEncoderList[@]}
do
if [ -x ${EncoderDir}/${file} ]
then
let "bEncoderFlag = 1"
fi
done
if [ ${bEncoderFlag} -eq 0 ]
then
echo "no encoder under test folder, please use the commands below to copy encoders to the test folder--./Encoder"
echo " cp \${AllVersionEncoders} ./Encoder "
echo " chmod u+x ./Encoder/* "
exit 1
fi
}
runGenerateSpatialLayerResolution()
{
SpatialLayerNum=$1
if [ -z "${SpatialLayerNum}" ]
then
let "SpatialLayerNum =1"
fi
let "PicW_L0= PicW / 8"
let "PicW_L1= PicW / 4"
let "PicW_L2= PicW / 2"
let "PicW_L3= PicW"
let "PicH_L0= PicH / 8"
let "PicH_L1= PicH / 4"
let "PicH_L2= PicH / 2"
let "PicH_L3= PicH"
if [ ${SpatialLayerNum} -eq 1 ]
then
aPicW=( ${PicW_L3} 0 0 0 )
aPicH=( ${PicH_L3} 0 0 0 )
elif [ ${SpatialLayerNum} -eq 2 ]
then
aPicW=( ${PicW_L2} ${PicW_L3} 0 0 )
aPicH=( ${PicH_L2} ${PicH_L3} 0 0 )
elif [ ${SpatialLayerNum} -eq 3 ]
then
aPicW=( ${PicW_L1} ${PicW_L2} ${PicW_L3} 0 )
aPicH=( ${PicH_L1} ${PicH_L2} ${PicH_L3} 0 )
elif [ ${SpatialLayerNum} -eq 4 ]
then
aPicW=( ${PicW_L0} ${PicW_L1} ${PicW_L2} ${PicW_L3} )
aPicH=( ${PicH_L0} ${PicH_L1} ${PicH_L2} ${PicH_L3} )
fi
echo "*************************************************************************"
echo " ${SpatialLayerNum} layers spatial resolution for ${PicW}x${PicH} are:"
echo ""
echo " aPicW is ${aPicW[@]}"
echo " aPicH is ${aPicH[@]}"
echo "*************************************************************************"
}
#parse data from encoder trace log
#you can add more keywords to extract data from the log file
runParseTraceLog()
{
TempLogFile=$1
let "MemoryUsage = 0"
echo "*****************************************"
echo "parsing trace log file"
echo "log file name is ${TempLogFile}"
echo "*****************************************"
if [ ! -e ${TempLogFile} ]
then
echo "LogFile ${TempLogFile} does not exist, please double check!"
return 1
fi
MemUsageInLog=""
while read line
do
if [[ "${line}" =~ "overall memory usage" ]]
then
#[OpenH264] this = 0x0x7fa4d2c04c30, Info:WelsInitEncoderExt() exit, overall memory usage: 40907254 bytes
MemUsageInLog=(`echo $line | awk 'BEGIN {FS="usage:"} {print $2}' `)
fi
# you can add more keywords to extract data from the log file
# e.g.: bit rate, fps, encoder time, psnr etc.
# add a script block like:
# ****************************************************
# if [[ "${line}" =~ "KeywordYouWantToSearch" ]]
# then
# # $line is a log line which contains the data you want
# DataYouWant=(`echo $line | awk 'BEGIN {FS="KeywordYouWantToSearch"} {print $2}' `)
# fi
# ****************************************************
done < ${TempLogFile}
#fall back to 0 when the log contains no "overall memory usage" line
let "MemoryUsage = ${MemUsageInLog:-0}"
echo "MemoryUsage is ${MemoryUsage}"
}
runEncodeOneCase()
{
#encoding process
echo "------------------------------------------------------"
echo "${Encoder} welsenc.cfg ${sEncoderCommand}" >${LogFile}
#run the encoder once and capture both stdout and stderr in the log
${Encoder} welsenc.cfg ${sEncoderCommand} >>${LogFile} 2>&1
echo "------------------------------------------------------"
}
runAllEncodeCasesAndGenerateLog()
{
echo "aSpatialLayerNum is ${aSpatialLayerNum[@]}"
echo "aThreadIdc is ${aThreadIdc[@]}"
echo "aSliceMode is ${aSliceMode[@]}"
echo "aSliceNum is ${aSliceNum[@]}"
echo "aSliceMbNum is ${aSliceMbNum[@]}"
sEncoderCommand1="-lconfig 0 layer0.cfg -lconfig 1 layer1.cfg -lconfig 2 layer2.cfg -lconfig 3 layer3.cfg"
TempMemoryUsage=""
OtherDataYouWant=""
TempTestCase=""
let "CaseNum=1"
for iSLayerNum in ${aSpatialLayerNum[@]}
do
for iThreadNum in ${aThreadIdc[@]}
do
for iSliceMode in ${aSliceMode[@]}
do
for iSliceNum in ${aSliceNum[@]}
do
#raster slice mb mode, slice-mb-num =0, switch to row-mb-mode
if [ ${iSliceMode} -eq 2 ]
then
aSliceMbNum=(0 960)
else
aSliceMbNum=(960)
fi
for iSlicMbNum in ${aSliceMbNum[@]}
do
TempMemoryUsage=""
#for cases output to statistic file
TempTestCase="${iSLayerNum}, ${iThreadNum}, ${iSliceMode}, ${iSliceNum}, ${iSlicMbNum}"
for eEncoder in ${aEncoderList[@]}
do
Encoder=${EncoderDir}/${eEncoder}
if [ -x ${Encoder} ]
then
if [ ${iSliceMode} -eq 3 ]
then
iMaxNalSize=1000
else
iMaxNalSize=0
fi
runGenerateSpatialLayerResolution ${iSLayerNum}
sEncoderCommand2="-slcmd 0 ${iSliceMode} -slcmd 1 ${iSliceMode} -slcmd 2 ${iSliceMode} -slcmd 3 ${iSliceMode}"
sEncoderCommand3="-slcnum 0 ${iSliceNum} -slcnum 1 ${iSliceNum} -slcnum 2 ${iSliceNum} -slcnum 3 ${iSliceNum}"
sEncoderCommand4="-slcmbnum 0 ${iSlicMbNum} -slcmbnum 1 ${iSlicMbNum} -slcmbnum 2 ${iSlicMbNum} -slcmbnum 3 ${iSlicMbNum} "
sEncoderCommand5="-trace ${iTraceLevel} -numl ${iSLayerNum} -thread ${iThreadNum} -nalsize ${iMaxNalSize}"
sEncoderCommand6="-dw 0 ${aPicW[0]} -dw 1 ${aPicW[1]} -dw 2 ${aPicW[2]} -dw 3 ${aPicW[3]}"
sEncoderCommand7="-dh 0 ${aPicH[0]} -dh 1 ${aPicH[1]} -dh 2 ${aPicH[2]} -dh 3 ${aPicH[3]}"
sEncoderCommand="-frms ${iFrameToBeEncoded} ${sEncoderCommand1} ${sEncoderCommand2} ${sEncoderCommand3} ${sEncoderCommand4} ${sEncoderCommand5} ${sEncoderCommand6} ${sEncoderCommand7}"
LogFile="${LogDir}/${CaseNum}_LogInfo_iSLNum_${iSLayerNum}_ThrNum_${iThreadNum}_SlcM_${iSliceMode}_SlcN_${iSliceNum}_${eEncoder}.log"
echo "Encode command is: "
echo "${Encoder} welsenc.cfg ${sEncoderCommand}"
echo ""
echo "log file is ${LogFile}"
#encode one case
runEncodeOneCase
#parse trace log
runParseTraceLog ${LogFile}
#data extracted from log
#you can add new data here like rc, fps , etc.
echo "memory usage is ${MemoryUsage}"
if [ -z "${TempMemoryUsage}" ]
then
TempMemoryUsage="${MemoryUsage},"
else
TempMemoryUsage="${TempMemoryUsage} ${MemoryUsage},"
fi
echo "TempMemoryUsage is ${TempMemoryUsage}"
fi
done
#output memory usage for all encoders
echo "${TempTestCase}, ${TempMemoryUsage}, ${OtherDataYouWant}" >>${MemoryUsageStatic}
let " CaseNum ++"
let "iTotalCaseNum ++"
done
done
done
done
done
}
runMain()
{
runGlobleInit
runCheck
runAllEncodeCasesAndGenerateLog
runPrompt
}
#*************************************************************
if [ $# -lt 3 ]
then
runUsage
runBrief
exit 1
fi
declare -a aEncoderList
declare -a aParamList
aParamList=( $@ )
ParamNum=$#
PicW=${aParamList[0]}
PicH=${aParamList[1]}
for((i=2;i<$#;i++))
do
echo "encoder is ${aParamList[$i]}"
aEncoderList="${aEncoderList} ${aParamList[$i]}"
done
aEncoderList=(${aEncoderList})
echo -e "\033[32m ********************************* \033[0m"
echo ""
echo " --num parameters is ${ParamNum} "
echo " --input parameters are:"
echo " $0 $@"
echo ""
echo -e "\033[32m ********************************* \033[0m"
runMain
#*************************************************************
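Note on runParseTraceLog(): the function splits each matching trace line on "usage:" and keeps the number that follows. A minimal sketch of that one step as a standalone function is given here so it can be tried on a single line. The name extract_memory_usage is hypothetical and not part of the script; the sample line is the one quoted in the script's own comment.

```shell
# Hypothetical helper (not part of the script above): print the byte count
# from one trace line, or nothing if the line carries no memory report.
extract_memory_usage() {
    # Split on "usage:" like the script's awk call, then keep only the first
    # whitespace-separated token of the remainder, which is the number.
    printf '%s\n' "$1" | awk 'BEGIN {FS="usage:"} /overall memory usage/ {split($2, a, " "); print a[1]}'
}

sample='[OpenH264] this = 0x0x7fa4d2c04c30, Info:WelsInitEncoderExt() exit, overall memory usage: 40907254 bytes'
extract_memory_usage "${sample}"
```

A line without the keyword prints nothing, so a caller can test for an empty result before doing arithmetic on it.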

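Note on runGenerateSpatialLayerResolution(): the top layer always gets the full picture size and each lower layer is half the size of the one above it, with unused slots set to 0. The sketch below expresses the same rule as one formula for a single dimension. The name layer_sizes is hypothetical, and the four-slot limit is taken from the script's aPicW/aPicH arrays.

```shell
# Hypothetical helper: print the four per-layer sizes for one dimension,
# lowest layer first, padding unused layers with 0.
layer_sizes() {
    ls_full=$1      # full-resolution width or height
    ls_layers=$2    # number of spatial layers, 1..4
    ls_out=""
    ls_i=0
    while [ "${ls_i}" -lt 4 ]; do
        if [ "${ls_i}" -lt "${ls_layers}" ]; then
            # Layer i of N is the full size divided by 2^(N-1-i).
            ls_size=$(( ls_full / (1 << (ls_layers - 1 - ls_i)) ))
        else
            ls_size=0
        fi
        ls_out="${ls_out}${ls_out:+ }${ls_size}"
        ls_i=$(( ls_i + 1 ))
    done
    echo "${ls_out}"
}

layer_sizes 1280 3   # prints: 320 640 1280 0
layer_sizes 720 4    # prints: 90 180 360 720
```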
@ -3,29 +3,29 @@ rem ****************************************************************************
rem usage:
rem AutoBuildForWPAndWindows.bat % Configuration %
rem --For debug version:
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Debug-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Debug-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Debug-C
rem Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Debug-ASM
rem ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-Debug-C
rem ARM-ASM: AutoBuildForWPAndWindows.bat ARM-Debug-ASM
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Debug-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Debug-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Debug-C
rem Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Debug-ASM
rem ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-Debug-C
rem ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-Debug-ASM
rem --For release version:
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Release-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Release-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Release-C
rem Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Release-ASM
rem ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-Release-C
rem ARM-ASM: AutoBuildForWPAndWindows.bat ARM-Release-ASM
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Release-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Release-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Release-C
rem Win64-ASM(WP8): AutoBuildForWPAndWindows.bat Win64-Release-ASM
rem ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-Release-C
rem ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-Release-ASM
rem --For debug and release version:
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-All-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-All-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-All-C
rem Win64-ASM: AutoBuildForWPAndWindows.bat Win64-All-ASM
rem ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-All-C
rem ARM-ASM: AutoBuildForWPAndWindows.bat ARM-All-ASM
rem Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-All-C
rem Win32-ASM: AutoBuildForWPAndWindows.bat Win32-All-ASM
rem Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-All-C
rem Win64-ASM: AutoBuildForWPAndWindows.bat Win64-All-ASM
rem ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-All-C
rem ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-All-ASM
rem --For default:
rem AutoBuildForWPAndWindows.bat
rem ARM-All-ASM
rem ARM-All-ASM(WP8)
rem
rem --lib/dll files will be copied to folder .\bin
rem --win32 folder bin\i386*
@ -46,6 +46,8 @@ rem --more detail, please refer to http://www.mingw.org/
rem
rem 2015/03/15 huashi@cisco.com
rem *************************************************************************************************
set WP8Flag=0
call :BasicSetting
call :PathSetting
call :SetBuildOption %1
@ -96,8 +98,8 @@ goto :EOF
set MinGWPath=C:\MinGW\bin
set MsysPath=C:\MinGW\msys\1.0\bin
set GitPath=C:\Program Files (x86)\Git\bin
set GasScriptPath=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
set VC14Path=C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC
set VC12Path=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC
set VC11Path=C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC
set VC10Path=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC
@ -107,10 +109,14 @@ goto :EOF
set VC12ArmLib02=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\lib\arm
set WP8KitLib=C:\Program Files (x86)\Windows Phone Kits\8.1\lib\arm
if exist "%VC9Path%" set VCPATH=%VC9Path%
if exist "%VC10Path%" set VCPATH=%VC10Path%
if exist "%VC11Path%" set VCPATH=%VC11Path%
if exist "%VC12Path%" set VCPATH=%VC12Path%
if exist "%VC9Path%" set VCPATH=%VC9Path%
if exist "%VC10Path%" set VCPATH=%VC10Path%
if exist "%VC11Path%" set VCPATH=%VC11Path%
if exist "%VC12Path%" set VCPATH=%VC12Path%
if exist "%VC14Path%" set VCPATH=%VC14Path%
if %WP8Flag%==1 set VCPATH=%VC12Path%
set GasScriptPath=%VCPATH%\bin
if "%vArcType%" =="i386" set PATH=%MinGWPath%;%MsysPath%;%VCPATH%\bin;%GitPath%;%PATH%
if "%vArcType%" =="x86_64" set PATH=%MinGWPath%;%MsysPath%;%VCPATH%\bin;%GitPath%;%PATH%
@ -120,7 +126,7 @@ goto :EOF
if "%vArcType%" =="i386" call "%VCPATH%\vcvarsall.bat" x86
if "%vArcType%" =="x86_64" call "%VCPATH%\vcvarsall.bat" x64
if "%vArcType%" =="arm" call "%VCPATH%\vcvarsall.bat" x86_arm
if "%vArcType%" =="arm" call :WPSetting
if %WP8Flag%==1 call :WPSetting
echo PATH is %PATH%
echo LIB is %LIB%
@ -128,6 +134,8 @@ goto :EOF
:WPSetting
set LIB=%VC12ArmLib01%;%VC12ArmLib02%;%WP8KitLib%
echo LIB setting for wp8 is:
echo %LIB%
if not exist "%VC12Path%" (
echo VC12 does not exist,
echo ******************************************
@ -145,69 +153,73 @@ goto :EOF
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=Yes
echo default setting
set WP8Flag=1
echo default setting
) else if "%1"=="Win32-Debug-C" (
set aConfigurationList=Debug
set vArcType=i386
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=No
echo Win32-Debug-C setting
echo Win32-Debug-C setting
) else if "%1"=="Win32-Release-C" (
set aConfigurationList=Release
set vArcType=i386
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=No
echo Win32-Release-C setting
echo Win32-Release-C setting
) else if "%1"=="Win64-Debug-C" (
set aConfigurationList=Debug
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=No
echo All-C setting
echo All-C setting
) else if "%1"=="Win64-Release-C" (
set aConfigurationList=Release
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=No
echo Win64-Release-C setting
echo Win64-Release-C setting
) else if "%1"=="ARM-Debug-C" (
set aConfigurationList=Debug
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=No
echo ARM-Debug-C setting
set WP8Flag=1
echo ARM-Debug-C setting
) else if "%1"=="ARM-Release-C" (
set aConfigurationList=Debug Release
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=No
echo ARM-Release-C setting
set WP8Flag=1
echo ARM-Release-C setting
) else if "%1"=="Win32-All-C" (
set aConfigurationList=Debug Release
set vArcType=i386
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=No
echo Win32-All-C setting
echo Win32-All-C setting
) else if "%1"=="Win64-All-C" (
set aConfigurationList=Debug Release
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=No
echo All-C setting
echo All-C setting
) else if "%1"=="ARM-All-C" (
set aConfigurationList=Debug Release
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=No
set WP8Flag=1
echo ARM-All-C setting
) else if "%1"=="Win32-Debug-ASM" (
set aConfigurationList=Debug
@ -215,63 +227,66 @@ goto :EOF
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=Yes
echo Win32-Debug-ASM setting
echo Win32-Debug-ASM setting
) else if "%1"=="Win32-Release-ASM" (
set aConfigurationList=Release
set vArcType=i386
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=Yes
echo Win32-Release-ASM setting
echo Win32-Release-ASM setting
) else if "%1"=="Win64-Debug-ASM" (
set aConfigurationList=Debug
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=Yes
echo All-ASM setting
echo All-ASM setting
) else if "%1"=="Win64-Release-ASM" (
set aConfigurationList=Release
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=Yes
echo Win64-Release-ASM setting
echo Win64-Release-ASM setting
) else if "%1"=="ARM-Debug-ASM" (
set aConfigurationList=Debug
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=Yes
echo ARM-Debug-ASM setting
set WP8Flag=1
echo ARM-Debug-ASM setting
) else if "%1"=="ARM-Release-ASM" (
set aConfigurationList=Release
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=Yes
echo ARM-Release-ASM setting
set WP8Flag=1
echo ARM-Release-ASM setting
) else if "%1"=="Win32-All-ASM" (
set aConfigurationList=Debug Release
set vArcType=i386
set vOSType=msvc
set vEnable64BitFlag=No
set vASMFlag=Yes
echo Win32-All-ASM setting
echo Win32-All-ASM setting
) else if "%1"=="Win64-All-ASM" (
set aConfigurationList=Debug Release
set vArcType=x86_64
set vOSType=msvc
set vEnable64BitFlag=Yes
set vASMFlag=Yes
echo All-ASM setting
echo All-ASM setting
) else if "%1"=="ARM-All-ASM" (
set aConfigurationList=Debug Release
set vArcType=arm
set vOSType=msvc-wp
set vEnable64BitFlag=No
set vASMFlag=Yes
echo ARM-All-ASM setting
set WP8Flag=1
echo ARM-All-ASM setting
) else (
call :help
goto :ErrorReturn
@ -338,29 +353,29 @@ rem ***********************************************
echo usage:
echo AutoBuildForWPAndWindows.bat % Configuration %
echo --For debug version:
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Debug-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Debug-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Debug-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Debug-ASM
echo ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-Debug-C
echo ARM-ASM: AutoBuildForWPAndWindows.bat ARM-Debug-ASM
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Debug-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Debug-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Debug-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Debug-ASM
echo ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-Debug-C
echo ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-Debug-ASM
echo --For release version:
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Release-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Release-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Release-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Release-ASM
echo ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-Release-C
echo ARM-ASM: AutoBuildForWPAndWindows.bat ARM-Release-ASM
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-Release-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-Release-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-Release-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-Release-ASM
echo ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-Release-C
echo ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-Release-ASM
echo --For debug and release version:
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-All-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-All-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-All-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-All-ASM
echo ARM-C-Only: AutoBuildForWPAndWindows.bat ARM-All-C
echo ARM-ASM: AutoBuildForWPAndWindows.bat ARM-All-ASM
echo Win32-C-Only: AutoBuildForWPAndWindows.bat Win32-All-C
echo Win32-ASM: AutoBuildForWPAndWindows.bat Win32-All-ASM
echo Win64-C-Only: AutoBuildForWPAndWindows.bat Win64-All-C
echo Win64-ASM: AutoBuildForWPAndWindows.bat Win64-All-ASM
echo ARM-C-Only(WP8): AutoBuildForWPAndWindows.bat ARM-All-C
echo ARM-ASM(WP8): AutoBuildForWPAndWindows.bat ARM-All-ASM
echo --For default:
echo AutoBuildForWPAndWindows.bat
echo ARM-All-ASM
echo ARM-All-ASM(WP8)
echo *******************************************************************************
goto :EOF

build/Dockerfile

@ -0,0 +1,51 @@
# This is a docker image with all the tools to build openh264 for linux
# build the docker image with: sudo docker build -t openh264tools - < Dockerfile
# get the result with: sudo docker run -t -i -v /tmp/openH264:/build openh264tools /bin/cp libopenh264.so log /build
# the results will be left in /tmp/openH264
# have a look at the log file and check whether the hash matches the "Fluffy Got" hash
FROM ubuntu:14.04
MAINTAINER Cullen Jennings <fluffy@cisco.com>
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y bison flex g++ gcc git libgmp3-dev libmpc-dev libmpfr-dev libz-dev make wget
WORKDIR /tmp
RUN wget http://ftp.gnu.org/gnu/gcc/gcc-4.9.2/gcc-4.9.2.tar.gz
RUN tar xvfz gcc-4.9.2.tar.gz
WORKDIR /tmp/gcc-4.9.2/
RUN mkdir build
WORKDIR /tmp/gcc-4.9.2/build
RUN ../configure --disable-checking --enable-languages=c,c++ --enable-multiarch --enable-shared --enable-threads=posix --with-gmp=/usr/local/lib --with-mpc=/usr/lib --with-mpfr=/usr/lib --without-included-gettext --with-system-zlib --with-tune=generic --disable-multilib --disable-nls
RUN make -j 8
RUN make install
WORKDIR /tmp
RUN wget http://www.nasm.us/pub/nasm/releasebuilds/2.11.06/nasm-2.11.06.tar.gz
RUN tar xvfz nasm-2.11.06.tar.gz
WORKDIR /tmp/nasm-2.11.06/
RUN ./configure
RUN make
RUN make install
WORKDIR /tmp
RUN git clone https://github.com/cisco/openh264.git
WORKDIR /tmp/openh264
RUN git checkout v1.1
RUN make ENABLE64BIT=Yes
RUN date > log
RUN uname -a >> log
RUN nasm -v >> log
RUN gcc -v 2>> log
RUN git status -v >> log
RUN openssl dgst -sha1 libopenh264.so >> log
RUN echo "Fluffy Got hash of - 3b6280fce36111ab9c911453f4ee1fd99ce6f841" >> log
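Note on the hash check: the Dockerfile leaves the comparison of the two hashes in the log to the reader. One way to automate it is sketched below. The function name verify_sha1 is hypothetical, and it uses sha1sum from GNU coreutils rather than the openssl dgst call in the Dockerfile, which is an assumption about the host tools. The demo hashes a small temporary file, not libopenh264.so.

```shell
# Hypothetical helper: succeed only if the file's SHA-1 equals the expected
# hex digest. Assumes sha1sum (GNU coreutils) is installed.
verify_sha1() {
    # sha1sum prints "<hash>  <file>"; keep the hash field only.
    vs_actual=$(sha1sum "$1" | awk '{print $1}')
    [ "${vs_actual}" = "$2" ]
}

# Demo on a throwaway file containing the three bytes "abc".
demo_file=$(mktemp)
printf 'abc' > "${demo_file}"
if verify_sha1 "${demo_file}" a9993e364706816aba3e25717850c26c9cd0d89d; then
    echo "hash matches"
else
    echo "hash differs"
fi
rm -f "${demo_file}"
```

For the build above, the call would take libopenh264.so and the hash recorded in the last RUN line as its two arguments.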

@ -1,6 +1,18 @@
#for x86
HAVE_AVX2 := true
ifneq ($(filter %86 x86_64, $(ARCH)),)
include $(SRC_PATH)build/x86-common.mk
ifeq ($(USE_ASM), Yes)
ifeq ($(HAVE_AVX2), true)
CFLAGS += -DHAVE_AVX2
CXXFLAGS += -DHAVE_AVX2
ASMFLAGS += -DHAVE_AVX2
endif
endif
endif
#for arm
ifneq ($(filter-out arm64, $(filter arm%, $(ARCH))),)
ifeq ($(USE_ASM), Yes)
ASM_ARCH = arm
@ -8,6 +20,8 @@ ASMFLAGS += -I$(SRC_PATH)codec/common/arm/
CFLAGS += -DHAVE_NEON
endif
endif
#for arm64
ifneq ($(filter arm64 aarch64, $(ARCH)),)
ifeq ($(USE_ASM), Yes)
ASM_ARCH = arm64
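Note on HAVE_AVX2: the hunk above sets HAVE_AVX2 := true inside the makefile and adds -DHAVE_AVX2 only when it equals true. In GNU make, a variable given on the command line overrides an ordinary assignment in the makefile, so passing HAVE_AVX2=false should skip the define. The sketch below shows that mechanism on a throwaway makefile; avx2_demo and DEMO_FLAGS are hypothetical names, and whether the project documents this exact invocation is an assumption.

```shell
# Hypothetical demo: write a tiny makefile that mirrors the HAVE_AVX2 guard,
# run it with any extra make arguments, then clean up.
avx2_demo() {
    ad_mk=$(mktemp)
    cat > "${ad_mk}" <<'EOF'
HAVE_AVX2 := true
ifeq ($(HAVE_AVX2), true)
DEMO_FLAGS += -DHAVE_AVX2
endif
$(info DEMO_FLAGS=$(DEMO_FLAGS))
all: ;
EOF
    # "$@" carries command-line variable overrides such as HAVE_AVX2=false.
    make -s -f "${ad_mk}" "$@"
    rm -f "${ad_mk}"
}

avx2_demo                   # DEMO_FLAGS contains -DHAVE_AVX2
avx2_demo HAVE_AVX2=false   # DEMO_FLAGS is empty
```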

@ -1,4 +1,4 @@
GTEST_SRCDIR=gtest
GTEST_SRCDIR=gtest/googletest
GTEST_CPP_SRCS=\
$(GTEST_SRCDIR)/src/gtest-all.cc\

@ -14,4 +14,4 @@ python build/mktargets.py --directory test/processing --prefix processing_unitte
python build/mktargets.py --directory test/api --prefix api_test
python build/mktargets.py --directory test/common --prefix common_unittest
python build/mktargets.py --directory module --prefix module
python build/mktargets.py --directory gtest --library gtest --out build/gtest-targets.mk --cpp-suffix .cc --include gtest-all.cc
python build/mktargets.py --directory gtest/googletest --library gtest --out build/gtest-targets.mk --cpp-suffix .cc --include gtest-all.cc

@ -22,7 +22,7 @@ CXX_O=-Fo$@
# it unconditionally. The same issue can also be worked around by adding
# -DGTEST_HAS_TR1_TUPLE=0 instead, but we prefer this version since it
# matches what gtest itself does.
CFLAGS += -nologo -Fd$(PROJECT_NAME).pdb -W3 -EHsc -fp:precise -Zc:wchar_t -Zc:forScope -D_VARIADIC_MAX=10
CFLAGS += -nologo -W3 -EHsc -fp:precise -Zc:wchar_t -Zc:forScope -D_VARIADIC_MAX=10
CXX_LINK_O=-nologo -Fe$@
AR_OPTS=-nologo -out:$@
CFLAGS_OPT=-O2 -Ob1 -Oy- -Zi -GF -Gm- -GS -Gy -DNDEBUG
@ -41,7 +41,7 @@ SHAREDLIBSUFFIXVER=$(SHAREDLIBSUFFIX)
SHARED=-LD
EXTRA_LIBRARY=$(PROJECT_NAME)_dll.lib
LDFLAGS += -link
SHLDFLAGS=-pdb:$(PROJECT_NAME).pdb -def:$(SRC_PATH)openh264.def -implib:$(EXTRA_LIBRARY)
SHLDFLAGS=-debug -map -opt:ref -opt:icf -def:$(SRC_PATH)openh264.def -implib:$(EXTRA_LIBRARY)
STATIC_LDFLAGS=
CODEC_UNITTEST_CFLAGS=-D_CRT_SECURE_NO_WARNINGS

@ -12,6 +12,6 @@ SDK_MIN = 5.1
XCODE=$(shell xcode-select -p)
SDKROOT = $(XCODE)/Platforms/$(SDKTYPE).platform/Developer/SDKs/$(SDKTYPE)$(SDK).sdk
CFLAGS += -arch $(ARCH) -isysroot $(SDKROOT) -miphoneos-version-min=$(SDK_MIN) -DAPPLE_IOS
CFLAGS += -arch $(ARCH) -isysroot $(SDKROOT) -miphoneos-version-min=$(SDK_MIN) -DAPPLE_IOS -fembed-bitcode
LDFLAGS += -arch $(ARCH) -isysroot $(SDKROOT) -miphoneos-version-min=$(SDK_MIN)

@ -148,15 +148,14 @@ typedef enum {
* @brief Option types introduced in decoder application
*/
typedef enum {
DECODER_OPTION_DATAFORMAT = 0, ///< color format, now supports 23 only (I420)
DECODER_OPTION_END_OF_STREAM, ///< end of stream flag
DECODER_OPTION_END_OF_STREAM = 1, ///< end of stream flag
DECODER_OPTION_VCL_NAL, ///< feedback whether or not have VCL NAL in current AU for application layer
DECODER_OPTION_TEMPORAL_ID, ///< feedback temporal id for application layer
DECODER_OPTION_FRAME_NUM, ///< feedback current decoded frame number
DECODER_OPTION_IDR_PIC_ID, ///< feedback current frame belong to which IDR period
DECODER_OPTION_LTR_MARKING_FLAG, ///< feedback whether current frame marks a LTR
DECODER_OPTION_LTR_MARKED_FRAME_NUM, ///< feedback frame num marked by current Frame
DECODER_OPTION_ERROR_CON_IDC, ///< not finished yet, indicate decoder error concealment status, in progress
DECODER_OPTION_ERROR_CON_IDC, ///< indicate decoder error concealment method
DECODER_OPTION_TRACE_LEVEL,
DECODER_OPTION_TRACE_CALLBACK, ///< a void (*)(void* context, int level, const char* message) function which receives log messages
DECODER_OPTION_TRACE_CALLBACK_CONTEXT,///< context info of trace callback
@ -254,29 +253,6 @@ typedef struct {
int iLTRRefNum; ///< TODO: not supported to set it arbitrary yet
} SLTRConfig;
/**
* @brief Structure for slice argument
*/
typedef struct {
unsigned int
uiSliceMbNum[MAX_SLICES_NUM_TMP]; ///< only used when uiSliceMode=2;here we use a tmp fixed value since MAX_SLICES_NUM is not defined here and its definition may be changed;
unsigned int uiSliceNum; ///< only used when uiSliceMode=1
unsigned int uiSliceSizeConstraint; ///< only used when uiSliceMode=4
} SSliceArgument; ///< not all the elements in this argument will be used, how it will be used depends on uiSliceMode; please refer to SliceModeEnum
/**
* @brief Enumerate the type of slice mode
*/
typedef enum {
SM_SINGLE_SLICE = 0, ///< | SliceNum==1
SM_FIXEDSLCNUM_SLICE = 1, ///< | according to SliceNum | enabled dynamic slicing for multi-thread
SM_RASTER_SLICE = 2, ///< | according to SlicesAssign | need input of MB numbers each slice. In addition, if other constraint in SSliceArgument is presented, need to follow the constraints. Typically if MB num and slice size are both constrained, re-encoding may be involved.
SM_ROWMB_SLICE = 3, ///< | according to PictureMBHeight | typical of single row of mbs each slice + slice size constraint which including re-encoding
SM_DYN_SLICE = 4, ///< | according to SliceSize | dynamic slicing (have no idea about slice_nums until encoding current frame)
SM_AUTO_SLICE = 5, ///< | according to thread number
SM_RESERVED = 6
} SliceModeEnum;
/**
* @brief Enumerate the type of rate control mode
*/
@ -347,12 +323,97 @@ enum {
};
/**
* @brief Structure for slice configuration
*/
* @brief Enumerate the type of slice mode
*/
typedef enum {
SM_SINGLE_SLICE = 0, ///< | SliceNum==1
SM_FIXEDSLCNUM_SLICE = 1, ///< | according to SliceNum | enabled dynamic slicing for multi-thread
SM_RASTER_SLICE = 2, ///< | according to SlicesAssign | need input of MB numbers each slice. In addition, if other constraint in SSliceArgument is presented, need to follow the constraints. Typically if MB num and slice size are both constrained, re-encoding may be involved.
SM_SIZELIMITED_SLICE = 3, ///< | according to SliceSize | slicing according to size, the slicing will be dynamic(have no idea about slice_nums until encoding current frame)
SM_RESERVED = 4
} SliceModeEnum;
/**
* @brief Structure for slice argument
*/
typedef struct {
SliceModeEnum uiSliceMode; ///< by default, uiSliceMode will be SM_SINGLE_SLICE
SSliceArgument sSliceArgument;
} SSliceConfig;
unsigned int uiSliceNum; ///< only used when uiSliceMode=1, when uiSliceNum=0 means auto design it with cpu core number
unsigned int uiSliceMbNum[MAX_SLICES_NUM_TMP]; ///< only used when uiSliceMode=2; when =0 means setting one MB row a slice
unsigned int uiSliceSizeConstraint; ///< now only used when uiSliceMode=3 (SM_SIZELIMITED_SLICE)
} SSliceArgument;
/**
* @brief Enumerate the type of video format
*/
typedef enum {
VF_COMPONENT,
VF_PAL,
VF_NTSC,
VF_SECAM,
VF_MAC,
VF_UNDEF,
VF_NUM_ENUM
} EVideoFormatSPS; // EVideoFormat is already defined/used elsewhere!
/**
* @brief Enumerate the type of color primaries
*/
typedef enum {
CP_RESERVED0,
CP_BT709,
CP_UNDEF,
CP_RESERVED3,
CP_BT470M,
CP_BT470BG,
CP_SMPTE170M,
CP_SMPTE240M,
CP_FILM,
CP_BT2020,
CP_NUM_ENUM
} EColorPrimaries;
/**
* @brief Enumerate the type of transfer characteristics
*/
typedef enum {
TRC_RESERVED0,
TRC_BT709,
TRC_UNDEF,
TRC_RESERVED3,
TRC_BT470M,
TRC_BT470BG,
TRC_SMPTE170M,
TRC_SMPTE240M,
TRC_LINEAR,
TRC_LOG100,
TRC_LOG316,
TRC_IEC61966_2_4,
TRC_BT1361E,
TRC_IEC61966_2_1,
TRC_BT2020_10,
TRC_BT2020_12,
TRC_NUM_ENUM
} ETransferCharacteristics;
/**
* @brief Enumerate the type of color matrix
*/
typedef enum {
CM_GBR,
CM_BT709,
CM_UNDEF,
CM_RESERVED3,
CM_FCC,
CM_BT470BG,
CM_SMPTE170M,
CM_SMPTE240M,
CM_YCGCO,
CM_BT2020NC,
CM_BT2020C,
CM_NUM_ENUM
} EColorMatrix;
/**
* @brief Structure for spatial layer configuration
*/
@ -366,7 +427,19 @@ typedef struct {
ELevelIdc uiLevelIdc; ///< value of level IDC (0 for auto-detection)
int iDLayerQp; ///< QP value of this spatial layer
SSliceConfig sSliceCfg; ///< slice configuration for a layer
SSliceArgument sSliceArgument;
// Note: members bVideoSignalTypePresent through uiColorMatrix below are also defined in SWelsSPS in parameter_sets.h.
bool bVideoSignalTypePresent; // false => do not write any of the following information to the header
unsigned char uiVideoFormat; // EVideoFormatSPS; 3 bits in header; 0-5 => component, kpal, ntsc, secam, mac, undef
bool bFullRange; // false => analog video data range [16, 235]; true => full data range [0,255]
bool bColorDescriptionPresent; // false => do not write any of the following three items to the header
unsigned char uiColorPrimaries; // EColorPrimaries; 8 bits in header; 0 - 9 => reserved, bt709, undef, reserved, bt470m, bt470bg,
// smpte170m, smpte240m, film, bt2020
unsigned char uiTransferCharacteristics; // ETransferCharacteristics; 8 bits in header; 0 - 15 => reserved, bt709, undef, reserved, bt470m, bt470bg, smpte170m,
// smpte240m, linear, log100, log316, iec61966-2-4, bt1361e, iec61966-2-1, bt2020-10, bt2020-12
unsigned char uiColorMatrix; // EColorMatrix; 8 bits in header (corresponds to FFmpeg "colorspace"); 0 - 10 => GBR, bt709,
// undef, reserved, fcc, bt470bg, smpte170m, smpte240m, YCgCo, bt2020nc, bt2020c
} SSpatialLayerConfig;
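
The new video signal members are plain flags and code points, so filling them in is mostly a matter of picking matching enumerator values. A minimal sketch of that, assuming a caller wants to signal limited-range BT.709: `VideoSignalInfo` and `MakeBt709LimitedRange` are hypothetical names used only so the example compiles on its own, and the enumerator values are repeated from the enums in this diff.

```cpp
#include <cassert>

// Values repeated from EColorPrimaries, ETransferCharacteristics and
// EColorMatrix as shown in this diff (only the ones used here).
enum { CP_BT709 = 1 };
enum { TRC_BT709 = 1 };
enum { CM_BT709 = 1 };

// Hypothetical stand-in for the members added to SSpatialLayerConfig.
struct VideoSignalInfo {
  bool bVideoSignalTypePresent;
  unsigned char uiVideoFormat;
  bool bFullRange;
  bool bColorDescriptionPresent;
  unsigned char uiColorPrimaries;
  unsigned char uiTransferCharacteristics;
  unsigned char uiColorMatrix;
};

// Hypothetical helper: describe limited-range BT.709 content.
VideoSignalInfo MakeBt709LimitedRange() {
  VideoSignalInfo s = {};
  s.bVideoSignalTypePresent = true;   // write the video signal type fields
  s.uiVideoFormat = 5;                // 5 => undef, per the field comment
  s.bFullRange = false;               // limited range, not [0,255]
  s.bColorDescriptionPresent = true;  // write the three items below
  s.uiColorPrimaries = CP_BT709;
  s.uiTransferCharacteristics = TRC_BT709;
  s.uiColorMatrix = CM_BT709;
  assert(s.uiVideoFormat <= 5);       // the field is documented as 0-5
  return s;
}
```

In real code the same assignments would presumably target the `SSpatialLayerConfig` entry of each spatial layer rather than a separate struct.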
/**
@ -438,7 +511,7 @@ typedef struct TagEncParamExt {
eSpsPpsIdStrategy; ///< different strategies for adjusting the ID in SPS/PPS: 0- constant ID, 1-additional ID, 6-mapping and additional
bool bPrefixNalAddingCtrl; ///< false:not use Prefix NAL; true: use Prefix NAL
bool bEnableSSEI; ///< false:not use SSEI; true: use SSEI -- TODO: planning to remove the interface of SSEI
bool bSimulcastAVC; ///< (when encoding more than 1 spatial layer) false: use SVC syntax for higher layers; true: use Simulcast AVC -- coming soon
bool bSimulcastAVC; ///< (when encoding more than 1 spatial layer) false: use SVC syntax for higher layers; true: use Simulcast AVC
int iPaddingFlag; ///< 0:disable padding;1:padding
int iEntropyCodingModeFlag; ///< 0:CAVLC 1:CABAC.
@ -456,6 +529,7 @@ typedef struct TagEncParamExt {
/* multi-thread settings*/
unsigned short
iMultipleThreadIdc; ///< 1 # 0: auto (dynamic implementation inside the encoder); 1: multi-threaded implementation disabled; larger than 1: number of threads
bool bUseLoadBalancing; ///< only used when uiSliceMode=1 or 3, will change slicing of a picture during the run-time of multi-thread encoding, so the result of each run may be different
/* Deblocking loop filter */
int iLoopFilterDisableIdc; ///< 0: on, 1: off, 2: on except for slice boundaries
@ -485,7 +559,6 @@ typedef struct {
typedef struct TagSVCDecodingParam {
char* pFileNameRestructed; ///< file name of reconstructed frame used for PSNR calculation based debug
EVideoFormatType eOutputColorFormat; ///< color space format to be output, EVideoFormatType specified in codec_def.h
unsigned int uiCpuLoad; ///< CPU load
unsigned char uiTargetDqLayer; ///< setting target dq layer id
@ -502,9 +575,14 @@ typedef struct {
unsigned char uiTemporalId;
unsigned char uiSpatialId;
unsigned char uiQualityId;
EVideoFrameType eFrameType;
unsigned char uiLayerType;
/**
* The sub sequence layers are ordered hierarchically based on their dependency on each other so that any picture in a layer shall not be
* predicted from any picture on any higher layer.
*/
int iSubSeqId; ///< refer to D.2.11 Sub-sequence information SEI message semantics
int iNalCount; ///< number of NALs already coded
int* pNalLengthInByte; ///< length of each NAL in bytes, indexed from 0 to iNalCount-1
unsigned char* pBsBuf; ///< buffer of bitstream contained
@ -514,14 +592,6 @@ typedef struct {
* @brief Frame bit stream info
*/
typedef struct {
int iTemporalId; ///< temporal ID
/**
* The sub sequence layers are ordered hierarchically based on their dependency on each other so that any picture in a layer shall not be
* predicted from any picture on any higher layer.
*/
int iSubSeqId; ///< refer to D.2.11 Sub-sequence information SEI message semantics
int iLayerNum;
SLayerBSInfo sLayerInfo[MAX_LAYER_NUM_OF_FRAME];
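
The layer and frame structures above carry per-NAL lengths rather than a single byte count, so a consumer typically sums them. A minimal sketch of that bookkeeping, assuming trimmed mirror structs: `LayerInfo`, `FrameInfo`, `kMaxLayers` and both helper functions are hypothetical, and only the members visible in this diff are repeated.

```cpp
#include <cstddef>

// Placeholder for MAX_LAYER_NUM_OF_FRAME; the real value is not shown here.
const int kMaxLayers = 4;

// Hypothetical, trimmed mirror of SLayerBSInfo.
struct LayerInfo {
  int iNalCount;             // number of NALs in this layer
  int* pNalLengthInByte;     // one length per NAL, 0 .. iNalCount-1
  unsigned char* pBsBuf;     // bitstream bytes of this layer
};

// Hypothetical, trimmed mirror of SFrameBSInfo.
struct FrameInfo {
  int iLayerNum;
  LayerInfo sLayerInfo[kMaxLayers];
};

// Sum of all NAL lengths in one layer.
std::size_t LayerSizeInBytes(const LayerInfo& layer) {
  std::size_t total = 0;
  for (int i = 0; i < layer.iNalCount; ++i)
    total += static_cast<std::size_t>(layer.pNalLengthInByte[i]);
  return total;
}

// Sum over every layer reported for the frame.
std::size_t FrameSizeInBytes(const FrameInfo& frame) {
  std::size_t total = 0;
  for (int i = 0; i < frame.iLayerNum; ++i)
    total += LayerSizeInBytes(frame.sLayerInfo[i]);
  return total;
}
```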


@ -4,12 +4,12 @@
#include "codec_app_def.h"
static const OpenH264Version g_stCodecVersion = {1, 5, 1, 0};
static const char* const g_strCodecVer = "OpenH264 version:1.5.1.0";
static const OpenH264Version g_stCodecVersion = {1, 6, 0, 0};
static const char* const g_strCodecVer = "OpenH264 version:1.6.0.0";
#define OPENH264_MAJOR (1)
#define OPENH264_MINOR (5)
#define OPENH264_REVISION (1)
#define OPENH264_MINOR (6)
#define OPENH264_REVISION (0)
#define OPENH264_RESERVED (0)
#endif // CODEC_VER_H
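
Because the version is exposed as preprocessor macros, client code can branch on it at compile time. A minimal sketch, assuming the three macros are redefined locally so the example stands alone: `MY_OPENH264_AT_LEAST` and `kBuiltAgainst160OrNewer` are hypothetical names and not part of the header.

```cpp
// Repeated from the new side of the diff so the sketch compiles on its own.
#define OPENH264_MAJOR (1)
#define OPENH264_MINOR (6)
#define OPENH264_REVISION (0)

// Hypothetical helper: pack major/minor/revision into one integer so that
// versions compare with a single >=. Assumes each component stays below 100.
#define MY_OPENH264_AT_LEAST(maj, min, rev) \
  ((OPENH264_MAJOR * 10000 + OPENH264_MINOR * 100 + OPENH264_REVISION) >= \
   ((maj) * 10000 + (min) * 100 + (rev)))

#if MY_OPENH264_AT_LEAST(1, 6, 0)
static const bool kBuiltAgainst160OrNewer = true;
#else
static const bool kBuiltAgainst160OrNewer = false;
#endif
```

The same guard could wrap code that touches members introduced in this comparison, so that it still builds against the older header.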


@ -7,6 +7,9 @@
objects = {
/* Begin PBXBuildFile section */
0DD32A861B467902009181A1 /* WelsThread.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A851B467902009181A1 /* WelsThread.cpp */; };
0DD32A881B467911009181A1 /* WelsTaskThread.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A871B467911009181A1 /* WelsTaskThread.cpp */; };
0DD32A941B468F77009181A1 /* WelsThreadPool.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A931B468F77009181A1 /* WelsThreadPool.cpp */; };
4C3406C918D96EA600DFA14A /* arm_arch_common_macro.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C3406B218D96EA600DFA14A /* arm_arch_common_macro.S */; };
4C3406CA18D96EA600DFA14A /* deblocking_neon.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C3406B318D96EA600DFA14A /* deblocking_neon.S */; };
4C3406CB18D96EA600DFA14A /* expand_picture_neon.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C3406B418D96EA600DFA14A /* expand_picture_neon.S */; };
@ -46,6 +49,16 @@
/* End PBXCopyFilesBuildPhase section */
/* Begin PBXFileReference section */
0DB71EF31BAB273500EABC51 /* WelsCircleQueue.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsCircleQueue.h; sourceTree = "<group>"; };
0DD32A851B467902009181A1 /* WelsThread.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WelsThread.cpp; sourceTree = "<group>"; };
0DD32A871B467911009181A1 /* WelsTaskThread.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WelsTaskThread.cpp; sourceTree = "<group>"; };
0DD32A8E1B467B83009181A1 /* WelsLock.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsLock.h; sourceTree = "<group>"; };
0DD32A8F1B467C73009181A1 /* WelsTask.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsTask.h; sourceTree = "<group>"; };
0DD32A901B467C73009181A1 /* WelsTaskThread.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsTaskThread.h; sourceTree = "<group>"; };
0DD32A911B467C73009181A1 /* WelsThread.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsThread.h; sourceTree = "<group>"; };
0DD32A921B467C73009181A1 /* WelsThreadPool.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsThreadPool.h; sourceTree = "<group>"; };
0DD32A931B468F77009181A1 /* WelsThreadPool.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WelsThreadPool.cpp; sourceTree = "<group>"; };
0DEA477E1BB36FE100ADD134 /* WelsList.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WelsList.h; sourceTree = "<group>"; };
4C3406B218D96EA600DFA14A /* arm_arch_common_macro.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; path = arm_arch_common_macro.S; sourceTree = "<group>"; };
4C3406B318D96EA600DFA14A /* deblocking_neon.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; path = deblocking_neon.S; sourceTree = "<group>"; };
4C3406B418D96EA600DFA14A /* expand_picture_neon.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; path = expand_picture_neon.S; sourceTree = "<group>"; };
@ -134,6 +147,13 @@
5BD896B81A7B837700D32B7D /* memory_align.h */,
4C3406C118D96EA600DFA14A /* typedefs.h */,
5BA8F2BE19603F3500011CE4 /* wels_common_defs.h */,
0DB71EF31BAB273500EABC51 /* WelsCircleQueue.h */,
0DEA477E1BB36FE100ADD134 /* WelsList.h */,
0DD32A8E1B467B83009181A1 /* WelsLock.h */,
0DD32A8F1B467C73009181A1 /* WelsTask.h */,
0DD32A901B467C73009181A1 /* WelsTaskThread.h */,
0DD32A911B467C73009181A1 /* WelsThread.h */,
0DD32A921B467C73009181A1 /* WelsThreadPool.h */,
5B9196F91A7F8BA40075D641 /* wels_const_common.h */,
4C3406C218D96EA600DFA14A /* WelsThreadLib.h */,
);
@ -153,6 +173,9 @@
4C3406C618D96EA600DFA14A /* deblocking_common.cpp */,
5BDD15EC1A79027600B6CA2E /* mc.cpp */,
5BD896B91A7B839B00D32B7D /* memory_align.cpp */,
0DD32A871B467911009181A1 /* WelsTaskThread.cpp */,
0DD32A931B468F77009181A1 /* WelsThreadPool.cpp */,
0DD32A851B467902009181A1 /* WelsThread.cpp */,
4C3406C818D96EA600DFA14A /* WelsThreadLib.cpp */,
);
path = src;
@ -260,6 +283,7 @@
isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647;
files = (
0DD32A941B468F77009181A1 /* WelsThreadPool.cpp in Sources */,
F5B8D82D190757290037849A /* mc_aarch64_neon.S in Sources */,
4C3406C918D96EA600DFA14A /* arm_arch_common_macro.S in Sources */,
F556A8241906673900E156A8 /* arm_arch64_common_macro.S in Sources */,
@ -268,8 +292,10 @@
4C3406CE18D96EA600DFA14A /* crt_util_safe_x.cpp in Sources */,
F791965919D3BE2200F60C6B /* intra_pred_common.cpp in Sources */,
5BD896BA1A7B839B00D32B7D /* memory_align.cpp in Sources */,
0DD32A881B467911009181A1 /* WelsTaskThread.cpp in Sources */,
4C3406CF18D96EA600DFA14A /* deblocking_common.cpp in Sources */,
5BA8F2C019603F5F00011CE4 /* common_tables.cpp in Sources */,
0DD32A861B467902009181A1 /* WelsThread.cpp in Sources */,
F791965419D3B89D00F60C6B /* intra_pred_common_aarch64_neon.S in Sources */,
4C3406D118D96EA600DFA14A /* WelsThreadLib.cpp in Sources */,
4C3406CC18D96EA600DFA14A /* mc_neon.S in Sources */,


@ -7,6 +7,10 @@
objects = {
/* Begin PBXBuildFile section */
0D6970BE1CA5BCFB001D88F8 /* paraset_strategy.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0D6970BC1CA5BCFB001D88F8 /* paraset_strategy.cpp */; };
0DD32A961B4A478B009181A1 /* wels_task_base.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A951B4A478B009181A1 /* wels_task_base.cpp */; };
0DD32A991B4A4997009181A1 /* wels_task_management.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A981B4A4997009181A1 /* wels_task_management.cpp */; };
0DD32A9C1B4A4E8F009181A1 /* wels_task_encoder.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0DD32A9B1B4A4E8F009181A1 /* wels_task_encoder.cpp */; };
4C23BC60195A77E0003B81FC /* intra_pred_sad_3_opt_aarch64_neon.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C23BC5F195A77E0003B81FC /* intra_pred_sad_3_opt_aarch64_neon.S */; };
4C34066D18C57D0400DFA14A /* intra_pred_neon.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C34066618C57D0400DFA14A /* intra_pred_neon.S */; };
4C34066E18C57D0400DFA14A /* intra_pred_sad_3_opt_neon.S in Sources */ = {isa = PBXBuildFile; fileRef = 4C34066718C57D0400DFA14A /* intra_pred_sad_3_opt_neon.S */; };
@ -28,7 +32,6 @@
4CE4471A18BC605C0017DF25 /* mv_pred.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446E918BC605C0017DF25 /* mv_pred.cpp */; };
4CE4471B18BC605C0017DF25 /* nal_encap.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446EA18BC605C0017DF25 /* nal_encap.cpp */; };
4CE4471C18BC605C0017DF25 /* picture_handle.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446EB18BC605C0017DF25 /* picture_handle.cpp */; };
4CE4471D18BC605C0017DF25 /* property.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446EC18BC605C0017DF25 /* property.cpp */; };
4CE4471E18BC605C0017DF25 /* ratectl.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446ED18BC605C0017DF25 /* ratectl.cpp */; };
4CE4471F18BC605C0017DF25 /* ref_list_mgr_svc.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446EE18BC605C0017DF25 /* ref_list_mgr_svc.cpp */; };
4CE4472018BC605C0017DF25 /* sample.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 4CE446EF18BC605C0017DF25 /* sample.cpp */; };
@ -67,6 +70,14 @@
/* Begin PBXFileReference section */
04FE0684196FD9370004D7CE /* version.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = version.h; path = ../../../common/inc/version.h; sourceTree = "<group>"; };
0D6970BC1CA5BCFB001D88F8 /* paraset_strategy.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = paraset_strategy.cpp; sourceTree = "<group>"; };
0D6970BF1CA5BD26001D88F8 /* paraset_strategy.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = paraset_strategy.h; sourceTree = "<group>"; };
0DD32A951B4A478B009181A1 /* wels_task_base.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = wels_task_base.cpp; sourceTree = "<group>"; };
0DD32A971B4A47D0009181A1 /* wels_task_base.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = wels_task_base.h; sourceTree = "<group>"; };
0DD32A981B4A4997009181A1 /* wels_task_management.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = wels_task_management.cpp; sourceTree = "<group>"; };
0DD32A9A1B4A49AC009181A1 /* wels_task_management.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = wels_task_management.h; sourceTree = "<group>"; };
0DD32A9B1B4A4E8F009181A1 /* wels_task_encoder.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = wels_task_encoder.cpp; sourceTree = "<group>"; };
0DD32A9D1B4A4E9C009181A1 /* wels_task_encoder.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = wels_task_encoder.h; sourceTree = "<group>"; };
4C23BC5F195A77E0003B81FC /* intra_pred_sad_3_opt_aarch64_neon.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; name = intra_pred_sad_3_opt_aarch64_neon.S; path = arm64/intra_pred_sad_3_opt_aarch64_neon.S; sourceTree = "<group>"; };
4C34066618C57D0400DFA14A /* intra_pred_neon.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; path = intra_pred_neon.S; sourceTree = "<group>"; };
4C34066718C57D0400DFA14A /* intra_pred_sad_3_opt_neon.S */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.asm; path = intra_pred_sad_3_opt_neon.S; sourceTree = "<group>"; };
@ -98,7 +109,6 @@
4CE446C018BC605C0017DF25 /* parameter_sets.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = parameter_sets.h; sourceTree = "<group>"; };
4CE446C118BC605C0017DF25 /* picture.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = picture.h; sourceTree = "<group>"; };
4CE446C218BC605C0017DF25 /* picture_handle.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = picture_handle.h; sourceTree = "<group>"; };
4CE446C318BC605C0017DF25 /* property.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = property.h; sourceTree = "<group>"; };
4CE446C418BC605C0017DF25 /* rc.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = rc.h; sourceTree = "<group>"; };
4CE446C518BC605C0017DF25 /* ref_list_mgr_svc.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ref_list_mgr_svc.h; sourceTree = "<group>"; };
4CE446C618BC605C0017DF25 /* sample.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = sample.h; sourceTree = "<group>"; };
@ -133,7 +143,6 @@
4CE446E918BC605C0017DF25 /* mv_pred.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = mv_pred.cpp; sourceTree = "<group>"; };
4CE446EA18BC605C0017DF25 /* nal_encap.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = nal_encap.cpp; sourceTree = "<group>"; };
4CE446EB18BC605C0017DF25 /* picture_handle.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = picture_handle.cpp; sourceTree = "<group>"; };
4CE446EC18BC605C0017DF25 /* property.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = property.cpp; sourceTree = "<group>"; };
4CE446ED18BC605C0017DF25 /* ratectl.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ratectl.cpp; sourceTree = "<group>"; };
4CE446EE18BC605C0017DF25 /* ref_list_mgr_svc.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ref_list_mgr_svc.cpp; sourceTree = "<group>"; };
4CE446EF18BC605C0017DF25 /* sample.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = sample.cpp; sourceTree = "<group>"; };
@ -275,9 +284,9 @@
4CE446BD18BC605C0017DF25 /* nal_encap.h */,
4CE446BF18BC605C0017DF25 /* param_svc.h */,
4CE446C018BC605C0017DF25 /* parameter_sets.h */,
0D6970BF1CA5BD26001D88F8 /* paraset_strategy.h */,
4CE446C118BC605C0017DF25 /* picture.h */,
4CE446C218BC605C0017DF25 /* picture_handle.h */,
4CE446C318BC605C0017DF25 /* property.h */,
4CE446C418BC605C0017DF25 /* rc.h */,
4CE446C518BC605C0017DF25 /* ref_list_mgr_svc.h */,
4CE446C618BC605C0017DF25 /* sample.h */,
@ -300,6 +309,9 @@
4CE446D918BC605C0017DF25 /* wels_const.h */,
4CE446DA18BC605C0017DF25 /* wels_func_ptr_def.h */,
4CE446DB18BC605C0017DF25 /* wels_preprocess.h */,
0DD32A971B4A47D0009181A1 /* wels_task_base.h */,
0DD32A9D1B4A4E9C009181A1 /* wels_task_encoder.h */,
0DD32A9A1B4A49AC009181A1 /* wels_task_management.h */,
);
path = inc;
sourceTree = "<group>";
@ -321,8 +333,8 @@
4CE446E718BC605C0017DF25 /* md.cpp */,
4CE446E918BC605C0017DF25 /* mv_pred.cpp */,
4CE446EA18BC605C0017DF25 /* nal_encap.cpp */,
0D6970BC1CA5BCFB001D88F8 /* paraset_strategy.cpp */,
4CE446EB18BC605C0017DF25 /* picture_handle.cpp */,
4CE446EC18BC605C0017DF25 /* property.cpp */,
4CE446ED18BC605C0017DF25 /* ratectl.cpp */,
4CE446EE18BC605C0017DF25 /* ref_list_mgr_svc.cpp */,
4CE446EF18BC605C0017DF25 /* sample.cpp */,
@ -336,6 +348,9 @@
4CE446F718BC605C0017DF25 /* svc_motion_estimate.cpp */,
4CE446F818BC605C0017DF25 /* svc_set_mb_syn_cavlc.cpp */,
4CE446FA18BC605C0017DF25 /* wels_preprocess.cpp */,
0DD32A951B4A478B009181A1 /* wels_task_base.cpp */,
0DD32A9B1B4A4E8F009181A1 /* wels_task_encoder.cpp */,
0DD32A981B4A4997009181A1 /* wels_task_management.cpp */,
);
path = src;
sourceTree = "<group>";
@ -424,10 +439,10 @@
4CE4471118BC605C0017DF25 /* encode_mb_aux.cpp in Sources */,
4CE4472718BC605C0017DF25 /* svc_mode_decision.cpp in Sources */,
4CE4472818BC605C0017DF25 /* svc_motion_estimate.cpp in Sources */,
4CE4471D18BC605C0017DF25 /* property.cpp in Sources */,
4CE4471018BC605C0017DF25 /* decode_mb_aux.cpp in Sources */,
4CE4472018BC605C0017DF25 /* sample.cpp in Sources */,
6CA38DA31991CACE003EAAE0 /* svc_motion_estimation.S in Sources */,
0DD32A9C1B4A4E8F009181A1 /* wels_task_encoder.cpp in Sources */,
4CE4471318BC605C0017DF25 /* encoder_data_tables.cpp in Sources */,
4C34067118C57D0400DFA14A /* pixel_neon.S in Sources */,
9AED665019469FC1009A3567 /* welsCodecTrace.cpp in Sources */,
@ -460,9 +475,12 @@
4CE4471618BC605C0017DF25 /* get_intra_predictor.cpp in Sources */,
4CE4472E18BC605C0017DF25 /* welsEncoderExt.cpp in Sources */,
6CA38DA51991D31A003EAAE0 /* svc_motion_estimation_aarch64_neon.S in Sources */,
0DD32A991B4A4997009181A1 /* wels_task_management.cpp in Sources */,
4CE4471418BC605C0017DF25 /* encoder_ext.cpp in Sources */,
4C34067218C57D0400DFA14A /* reconstruct_neon.S in Sources */,
0DD32A961B4A478B009181A1 /* wels_task_base.cpp in Sources */,
F7E9994919EBD1F8009B1021 /* set_mb_syn_cabac.cpp in Sources */,
0D6970BE1CA5BCFB001D88F8 /* paraset_strategy.cpp in Sources */,
);
runOnlyForDeploymentPostprocessing = 0;
};


@ -355,6 +355,46 @@
/>
</FileConfiguration>
</File>
<File
RelativePath="..\..\..\common\x86\dct.asm"
>
<FileConfiguration
Name="Release|Win32"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win32 -DPREFIX -DX86_32 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Release|x64"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win64 -DWIN64 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Debug|Win32"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win32 -DPREFIX -DX86_32 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Debug|x64"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win64 -DWIN64 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
</File>
<File
RelativePath="..\..\..\decoder\core\x86\dct.asm"
>


@ -406,11 +406,11 @@
>
</File>
<File
RelativePath="..\..\..\encoder\core\src\picture_handle.cpp"
RelativePath="..\..\..\encoder\core\src\paraset_strategy.cpp"
>
</File>
<File
RelativePath="..\..\..\encoder\core\src\property.cpp"
RelativePath="..\..\..\encoder\core\src\picture_handle.cpp"
>
</File>
<File
@ -477,10 +477,34 @@
RelativePath="..\..\..\common\src\utils.cpp"
>
</File>
<File
RelativePath="..\..\..\encoder\core\src\wels_task_base.cpp"
>
</File>
<File
RelativePath="..\..\..\encoder\core\src\wels_task_encoder.cpp"
>
</File>
<File
RelativePath="..\..\..\encoder\core\src\wels_task_management.cpp"
>
</File>
<File
RelativePath="..\..\..\common\src\WelsTaskThread.cpp"
>
</File>
<File
RelativePath="..\..\..\common\src\WelsThread.cpp"
>
</File>
<File
RelativePath="..\..\..\common\src\WelsThreadLib.cpp"
>
</File>
<File
RelativePath="..\..\..\common\src\WelsThreadPool.cpp"
>
</File>
</Filter>
<Filter
Name="Header Files"
@ -598,6 +622,10 @@
RelativePath="..\..\..\encoder\core\inc\parameter_sets.h"
>
</File>
<File
RelativePath="..\..\..\encoder\core\inc\paraset_strategy.h"
>
</File>
<File
RelativePath="..\..\..\encoder\core\inc\picture.h"
>
@ -606,10 +634,6 @@
RelativePath="..\..\..\encoder\core\inc\picture_handle.h"
>
</File>
<File
RelativePath="..\..\..\encoder\core\inc\property.h"
>
</File>
<File
RelativePath="..\..\..\encoder\core\inc\rc.h"
>
@ -811,6 +835,46 @@
/>
</FileConfiguration>
</File>
<File
RelativePath="..\..\..\common\x86\dct.asm"
>
<FileConfiguration
Name="Debug|Win32"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win32 -DPREFIX -DX86_32 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Debug|x64"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win64 -DWIN64 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Release|Win32"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win32 -DPREFIX -DX86_32 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
<FileConfiguration
Name="Release|x64"
>
<Tool
Name="VCCustomBuildTool"
CommandLine="nasm -I$(InputDir) -I$(InputDir)/../../../common/x86/ -f win64 -DWIN64 -o $(IntDir)\$(InputName)_common.obj $(InputPath)&#x0D;&#x0A;"
Outputs="$(IntDir)\$(InputName)_common.obj"
/>
</FileConfiguration>
</File>
<File
RelativePath="..\..\..\encoder\core\x86\dct.asm"
>


@ -62,3 +62,7 @@ ret
.endm
#endif
.macro SIGN_EXTENSION arg0, arg1
sxtw \arg0, \arg1
.endm
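
The `SIGN_EXTENSION` macro matters because the AArch64 procedure call standard leaves the upper 32 bits of a register unspecified when it carries an `int32_t` argument, so a stride read through the full `x` register may be wrong unless it is sign-extended first. A minimal C++ model of what `sxtw` does, assuming a 64-bit register is represented as a `uint64_t`: `SignExtendWord` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>

// Models `sxtw xN, wN`: keep the low 32 bits of the register and
// sign-extend them to 64 bits, ignoring whatever the upper half held.
int64_t SignExtendWord(uint64_t reg) {
  const uint64_t low = reg & 0xFFFFFFFFu;
  if (low & 0x80000000u) {
    // Negative 32-bit value: subtract 2^32 to get the signed result.
    return static_cast<int64_t>(low) - (int64_t{1} << 32);
  }
  return static_cast<int64_t>(low);
}
```

For example, a stride of -32 arriving with stale upper bits (`0xDEADBEEFFFFFFFE0`) becomes -32 again after the extension, which is the value the address arithmetic in the functions below needs.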


@ -105,9 +105,10 @@
// }
.endm
//void WelsCopy8x8_AArch64_neon (uint8_t* pDst, int32_t iStrideD, uint8_t* pSrc, int32_t iStrideS);
WELS_ASM_AARCH64_FUNC_BEGIN WelsCopy8x8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
LOAD_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x2, x3
STORE_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x0, x1
@ -120,7 +121,8 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsCopy16x16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
LOAD16_ALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x2, x3
STORE16_ALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x0, x1
@ -141,7 +143,8 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsCopy16x16NotAligned_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
LOAD16_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x2, x3
STORE16_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x0, x1
@ -162,7 +165,8 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsCopy16x8NotAligned_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
LOAD16_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x2, x3
STORE16_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x0, x1
@ -175,7 +179,8 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsCopy8x16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
LOAD_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x2, x3
STORE_UNALIGNED_DATA_WITH_STRIDE v0, v1, v2, v3, x0, x1


@ -305,6 +305,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN DeblockLumaLt4V_AArch64_neon //uint8_t* pPix, int32_t iStride, int32_t iAlpha, int32_t iBeta, int8_t* tc
dup v16.16b, w2 //alpha
dup v17.16b, w3 //beta
SIGN_EXTENSION x1,w1
add x2, x1, x1, lsl #1
sub x2, x0, x2
movi v23.16b, #128
@ -363,8 +364,8 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN DeblockLumaEq4V_AArch64_neon
dup v16.16b, w2 //alpha
dup v17.16b, w3 //beta
SIGN_EXTENSION x1,w1
sub x3, x0, x1, lsl #2
ld1 {v0.16b}, [x3], x1
ld1 {v4.16b}, [x0], x1
ld1 {v1.16b}, [x3], x1
@ -431,7 +432,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN DeblockLumaLt4H_AArch64_neon //uint8_t* pPix, int32_
dup v17.16b, w3 //beta
sub x2, x0, #3
movi v23.16b, #128
SIGN_EXTENSION x1,w1
LOAD_LUMA_DATA_3 v0, v1, v2, v3, v4, v5, 0
LOAD_LUMA_DATA_3 v0, v1, v2, v3, v4, v5, 1
LOAD_LUMA_DATA_3 v0, v1, v2, v3, v4, v5, 2
@ -515,7 +516,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN DeblockLumaEq4H_AArch64_neon
dup v16.16b, w2 //alpha
dup v17.16b, w3 //beta
sub x3, x0, #4
SIGN_EXTENSION x1,w1
LOAD_LUMA_DATA_4 v0, v1, v2, v3, v4, v5, v6, v7, 0
LOAD_LUMA_DATA_4 v0, v1, v2, v3, v4, v5, v6, v7, 1
LOAD_LUMA_DATA_4 v0, v1, v2, v3, v4, v5, v6, v7, 2


@ -32,8 +32,11 @@
#ifdef HAVE_NEON_AARCH64
#include "arm_arch64_common_macro.S"
//void ExpandPictureLuma_AArch64_neon (uint8_t* pDst, const int32_t kiStride, const int32_t kiPicW, const int32_t kiPicH);
WELS_ASM_AARCH64_FUNC_BEGIN ExpandPictureLuma_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x2,w2
SIGN_EXTENSION x3,w3
mov x7, x0
mov x8, x3
add x4, x7, x2
@ -73,8 +76,13 @@ _expand_picture_luma_loop1:
cbnz x2, _expand_picture_luma_loop0
WELS_ASM_AARCH64_FUNC_END
//void ExpandPictureChroma_AArch64_neon (uint8_t* pDst, const int32_t kiStride, const int32_t kiPicW,
// const int32_t kiPicH);
WELS_ASM_AARCH64_FUNC_BEGIN ExpandPictureChroma_AArch64_neon
//Save the dst
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x2,w2
SIGN_EXTENSION x3,w3
mov x7, x0
mov x8, x3
mov x10, #16


@ -34,7 +34,9 @@
#include "arm_arch64_common_macro.S"
//for Luma 16x16
//void WelsI16x16LumaPredV_AArch64_neon (uint8_t* pPred, uint8_t* pRef, const int32_t kiStride);
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredV_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.16b}, [x3]
.rept 16
@ -42,7 +44,9 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredV_AArch64_neon
.endr
WELS_ASM_AARCH64_FUNC_END
//void WelsI16x16LumaPredH_AArch64_neon (uint8_t* pPred, uint8_t* pRef, const int32_t kiStride);
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredH_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
.rept 16
ld1r {v0.16b}, [x3], x2


@ -294,6 +294,9 @@ WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20WidthEq16_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w16_h_mc_luma_loop:
ld1 {v2.8b, v3.8b, v4.8b}, [x0], x1 //only use 21(16+5); v2=src[-2]
trn1 v2.2d, v2.2d, v3.2d
@ -312,11 +315,15 @@ w16_h_mc_luma_loop:
cbnz x4, w16_h_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
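
The constants 20 and 5 loaded above, together with the `src[-2]` starting offset, match the H.264 six-tap luma half-sample filter (1, -5, 20, 20, -5, 1). A scalar sketch of one output sample, assuming that is what the NEON routine vectorizes: `HalfPelHorizontal` is a hypothetical reference function, not code from the repository.

```cpp
#include <cstdint>

// One horizontally interpolated half-sample. pSrc points at the integer
// sample immediately left of the half-sample position, so the filter
// reads pSrc[-2] .. pSrc[3].
uint8_t HalfPelHorizontal(const uint8_t* pSrc) {
  const int sum = pSrc[-2] - 5 * pSrc[-1] + 20 * pSrc[0]
                + 20 * pSrc[1] - 5 * pSrc[2] + pSrc[3];
  const int rounded = sum + 16;        // rounding offset before the >> 5
  if (rounded < 0) return 0;           // clip low (also avoids shifting a negative)
  const int value = rounded >> 5;      // divide by 32
  return static_cast<uint8_t>(value > 255 ? 255 : value);  // clip high
}
```

On a flat row the taps sum to 32, so the filter returns the input value unchanged, which is a quick sanity check for any vectorized version.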
//void McHorVer20WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20WidthEq8_AArch64_neon
sub x0, x0, #2
stp d8,d9, [sp,#-16]!
movi v8.8h, #20, lsl #0
movi v9.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w8_h_mc_luma_loop:
VEC4_LD1_8BITS_16ELEMENT x0, x1, v16, v20, v24, v28 //load src[-2] in v16,v20,v24,v28 for 4 row; only use 13(8+5);
sub x4, x4, #4
@ -366,10 +373,15 @@ w8_h_mc_luma_loop:
ldp d8,d9,[sp],#16
WELS_ASM_AARCH64_FUNC_END
//void McHorVer20WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20WidthEq4_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
asr x4, x4, #1
w4_h_mc_luma_loop:
ld1 {v2.16b}, [x0], x1 //only use 9(4+5); 1st row src[-2:6]
@ -401,10 +413,15 @@ w4_h_mc_luma_loop:
cbnz x4, w4_h_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer10WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer10WidthEq16_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w16_xy_10_mc_luma_loop:
ld1 {v2.8b, v3.8b, v4.8b}, [x0], x1 //only use 21(16+5); v2=src[-2]
trn1 v2.2d, v2.2d, v3.2d
@ -423,11 +440,16 @@ w16_xy_10_mc_luma_loop:
cbnz x4, w16_xy_10_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer10WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer10WidthEq8_AArch64_neon
sub x0, x0, #2
stp d8,d9, [sp,#-16]!
movi v8.8h, #20, lsl #0
movi v9.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w8_xy_10_mc_luma_loop:
VEC4_LD1_8BITS_16ELEMENT x0, x1, v16, v20, v24, v28 //load src[-2] in v16,v20,v24,v28 for 4 row; only use 13(8+5);
sub x4, x4, #4
@ -479,10 +501,15 @@ w8_xy_10_mc_luma_loop:
ldp d8,d9,[sp],#16
WELS_ASM_AARCH64_FUNC_END
//void McHorVer10WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer10WidthEq4_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
asr x4, x4, #1
w4_xy_10_mc_luma_loop:
ld1 {v2.16b}, [x0], x1 //only use 9(4+5); 1st row src[-2:6]
@ -514,11 +541,15 @@ w4_xy_10_mc_luma_loop:
cbnz x4, w4_xy_10_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer30WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer30WidthEq16_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w16_xy_30_mc_luma_loop:
ld1 {v2.8b, v3.8b, v4.8b}, [x0], x1 //only use 21(16+5); v2=src[-2]
trn1 v2.2d, v2.2d, v3.2d
@ -537,11 +568,16 @@ w16_xy_30_mc_luma_loop:
cbnz x4, w16_xy_30_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer30WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer30WidthEq8_AArch64_neon
sub x0, x0, #2
stp d8,d9, [sp,#-16]!
movi v8.8h, #20, lsl #0
movi v9.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w8_xy_30_mc_luma_loop:
VEC4_LD1_8BITS_16ELEMENT x0, x1, v16, v20, v24, v28 //load src[-2] in v16,v20,v24,v28 for 4 row; only use 13(8+5);
sub x4, x4, #4
@ -593,10 +629,15 @@ w8_xy_30_mc_luma_loop:
ldp d8,d9,[sp],#16
WELS_ASM_AARCH64_FUNC_END
//void McHorVer30WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer30WidthEq4_AArch64_neon
sub x0, x0, #2
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
asr x4, x4, #1
w4_xy_30_mc_luma_loop:
ld1 {v2.16b}, [x0], x1 //only use 9(4+5); 1st row src[-2:6]
@ -628,8 +669,12 @@ w4_xy_30_mc_luma_loop:
cbnz x4, w4_xy_30_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer01WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer01WidthEq16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -711,7 +756,12 @@ w16_xy_01_mc_luma_loop:
cbnz x4, w16_xy_01_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer01WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer01WidthEq8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v30.8h, #20, lsl #0
movi v31.8h, #5, lsl #0
@ -750,7 +800,12 @@ w8_xy_01_mc_luma_loop:
cbnz x4, w8_xy_01_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer01WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer01WidthEq4_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -805,8 +860,12 @@ w4_xy_01_mc_luma_loop:
cbnz x4, w4_xy_01_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer03WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer03WidthEq16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -888,7 +947,12 @@ w16_xy_03_mc_luma_loop:
cbnz x4, w16_xy_03_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer03WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer03WidthEq8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v30.8h, #20, lsl #0
movi v31.8h, #5, lsl #0
@ -927,7 +991,12 @@ w8_xy_03_mc_luma_loop:
cbnz x4, w8_xy_03_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer03WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer03WidthEq4_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -982,8 +1051,12 @@ w4_xy_03_mc_luma_loop:
cbnz x4, w4_xy_03_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02WidthEq16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -1065,7 +1138,12 @@ w16_xy_02_mc_luma_loop:
cbnz x4, w16_xy_02_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02WidthEq8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v30.8h, #20, lsl #0
movi v31.8h, #5, lsl #0
@ -1100,7 +1178,12 @@ w8_xy_02_mc_luma_loop:
cbnz x4, w8_xy_02_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02WidthEq4_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -1155,8 +1238,12 @@ w4_xy_02_mc_luma_loop:
cbnz x4, w4_xy_02_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22WidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22WidthEq16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
stp d8, d9, [sp,#-16]!
stp d10, d11, [sp,#-16]!
stp d12, d13, [sp,#-16]!
@ -1321,7 +1408,12 @@ w16_hv_mc_luma_loop:
ldp d8, d9, [sp], #16
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22WidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22WidthEq8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
@ -1391,9 +1483,13 @@ w8_hv_mc_luma_loop:
sub x4, x4, #4
cbnz x4, w8_hv_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22WidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22WidthEq4_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
@ -1462,9 +1558,13 @@ w4_hv_mc_luma_loop:
sub x4, x4, #4
cbnz x4, w4_hv_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McCopyWidthEq16_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McCopyWidthEq16_AArch64_neon
//prfm pldl1strm, [x0]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w16_copy_loop:
//prfm pldl1strm, [x0, x1]
ld1 {v0.16b}, [x0], x1 //read 16Byte : 0 line
@ -1476,9 +1576,13 @@ w16_copy_loop:
sub x4, x4, #2
cbnz x4, w16_copy_loop
WELS_ASM_AARCH64_FUNC_END
//void McCopyWidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McCopyWidthEq8_AArch64_neon
//prfm pldl1strm, [x0]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w8_copy_loop:
//prfm pldl1strm, [x0, x1]
ld1 {v0.8b}, [x0], x1 //read 8Byte : 0 line
@ -1493,6 +1597,9 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN McCopyWidthEq4_AArch64_neon
//prfm pldl1strm, [x0]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
w4_copy_loop:
//prfm pldl1strm, [x0, x1]
ld1 {v0.s}[0], [x0], x1 //read 4Byte : 0 line
@ -1505,8 +1612,14 @@ w4_copy_loop:
cbnz x4, w4_copy_loop
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN PixStrideAvgWidthEq16_AArch64_neon
//void PixStrideAvgWidthEq16_AArch64_neon (uint8_t* pDst, int32_t iDstStride, const uint8_t* pSrcA, int32_t iSrcStrideA,
//const uint8_t* pSrcB, int32_t iSrcStrideB, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN PixStrideAvgWidthEq16_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
SIGN_EXTENSION x6,w6
enc_w16_pix_avg_loop:
ld1 {v0.16b}, [x2], x3 //read 16Byte : src0: 0 line
ld1 {v1.16b}, [x4], x5 //read 16Byte : src1: 0 line
@ -1538,9 +1651,15 @@ enc_w16_pix_avg_loop:
cbnz x6, enc_w16_pix_avg_loop
WELS_ASM_AARCH64_FUNC_END
//void PixStrideAvgWidthEq8_AArch64_neon (uint8_t* pDst, int32_t iDstStride, const uint8_t* pSrcA, int32_t iSrcStrideA,
// const uint8_t* pSrcB, int32_t iSrcStrideB, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN PixStrideAvgWidthEq8_AArch64_neon
//prfm pldl1strm, [x2]
//prfm pldl1strm, [x4]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
SIGN_EXTENSION x6,w6
enc_w8_pix_avg_loop:
//prfm pldl1strm, [x2, x3]
//prfm pldl1strm, [x4, x5]
@ -1574,10 +1693,15 @@ enc_w8_pix_avg_loop:
sub x6, x6, #4
cbnz x6, enc_w8_pix_avg_loop
WELS_ASM_AARCH64_FUNC_END
//void PixelAvgWidthEq16_AArch64_neon (uint8_t* pDst, int32_t iDstStride, const uint8_t* pSrcA, int32_t iSrcAStride,
// const uint8_t* pSrcB, int32_t iSrcBStride, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN PixelAvgWidthEq16_AArch64_neon
//prfm pldl1strm, [x2]
//prfm pldl1strm, [x4]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
SIGN_EXTENSION x6,w6
w16_pix_avg_loop:
//prfm pldl1strm, [x2, x3]
//prfm pldl1strm, [x4, x5]
@ -1616,10 +1740,15 @@ w16_pix_avg_loop:
sub x6, x6, #4
cbnz x6, w16_pix_avg_loop
WELS_ASM_AARCH64_FUNC_END
//void PixelAvgWidthEq8_AArch64_neon (uint8_t* pDst, int32_t iDstStride, const uint8_t* pSrcA, int32_t iSrcAStride,
// const uint8_t* pSrcB, int32_t iSrcBStride, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN PixelAvgWidthEq8_AArch64_neon
//prfm pldl1strm, [x2]
//prfm pldl1strm, [x4]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
SIGN_EXTENSION x6,w6
w8_pix_avg_loop:
//prfm pldl1strm, [x2, x3]
//prfm pldl1strm, [x4, x5]
@ -1654,10 +1783,15 @@ w8_pix_avg_loop:
cbnz x6, w8_pix_avg_loop
WELS_ASM_AARCH64_FUNC_END
//void PixelAvgWidthEq4_AArch64_neon (uint8_t* pDst, int32_t iDstStride, const uint8_t* pSrcA, int32_t iSrcAStride,
// const uint8_t* pSrcB, int32_t iSrcBStride, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN PixelAvgWidthEq4_AArch64_neon
//prfm pldl1strm, [x2]
//prfm pldl1strm, [x4]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
SIGN_EXTENSION x6,w6
w4_pix_avg_loop:
//prfm pldl1strm, [x2, x3]
//prfm pldl1strm, [x4, x5]
@ -1674,8 +1808,12 @@ w4_pix_avg_loop:
sub x6, x6, #2
cbnz x6, w4_pix_avg_loop
WELS_ASM_AARCH64_FUNC_END
//void McChromaWidthEq8_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t* pWeights, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McChromaWidthEq8_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
ld4r {v28.8b, v29.8b, v30.8b, v31.8b}, [x4] //load A/B/C/D
ld1 {v16.16b}, [x0], x1 // src[x]
ext v17.16b, v16.16b, v16.16b, #1 // src[x+1]
@ -1729,8 +1867,12 @@ w8_mc_chroma_loop:
sub x5, x5, #4
cbnz x5, w8_mc_chroma_loop
WELS_ASM_AARCH64_FUNC_END
//void McChromaWidthEq4_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t* pWeights, int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McChromaWidthEq4_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
ld4r {v4.8b, v5.8b, v6.8b, v7.8b}, [x4] //load A/B/C/D
ld1 {v0.8b}, [x0], x1 // src[x]
ext v1.8b, v0.8b, v0.8b, #1 // src[x+1]
@ -1759,8 +1901,12 @@ w4_mc_chroma_loop:
cbnz x5, w4_mc_chroma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer20Width17_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// width+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20Width17_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x3, x3, #16
mov x5, #16
@ -1789,7 +1935,12 @@ w17_h_mc_luma_loop:
cbnz x4, w17_h_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer20Width9_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// width+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20Width9_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x3, x3, #8
mov x5, #8
@ -1817,8 +1968,12 @@ w9_h_mc_luma_loop:
cbnz x4, w9_h_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer20Width5_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// width+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer20Width5_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x3, x3, #4
mov x5, #4
@ -1841,12 +1996,16 @@ w5_h_mc_luma_loop:
cbnz x4, w5_h_mc_luma_loop
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22Width17_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22Width17_AArch64_neon
stp d8, d9, [sp,#-16]!
stp d10, d11, [sp,#-16]!
stp d12, d13, [sp,#-16]!
stp d14, d15, [sp,#-16]!
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
@ -2044,8 +2203,12 @@ w17_hv_mc_luma_loop:
ldp d8, d9, [sp], #16
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22Width9_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);//width+1&&height+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22Width9_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
@ -2140,8 +2303,12 @@ w9_hv_mc_luma_loop:
st1 {v26.b}[0], [x2], x3 //write 8th Byte : 0 line
WELS_ASM_AARCH64_FUNC_END
//void McHorVer22Width5_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);//width+1&&height+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer22Width5_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, #2
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
@ -2231,8 +2398,12 @@ w5_hv_mc_luma_loop:
st1 {v26.b}[4], [x2], x3 //write 5th Byte : 0 line
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02Height17_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// height+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02Height17_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -2320,8 +2491,12 @@ w17_v_mc_luma_loop:
FILTER_6TAG_8BITS2 v2, v3, v4, v5, v6, v7, v20, v0, v1
st1 {v20.16b}, [x2], x3 //write 16Byte : last line
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02Height9_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// height+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02Height9_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
@ -2375,8 +2550,12 @@ w9_v_mc_luma_loop:
st1 {v20.8b}, [x2], x3 //write 8Byte : 0 line
WELS_ASM_AARCH64_FUNC_END
//void McHorVer02Height5_AArch64_neon (const uint8_t* pSrc, int32_t iSrcStride, uint8_t* pDst, int32_t iDstStride,
// int32_t iHeight);// height+1
WELS_ASM_AARCH64_FUNC_BEGIN McHorVer02Height5_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x4,w4
sub x0, x0, x1, lsl #1
movi v0.8h, #20, lsl #0
movi v1.8h, #5, lsl #0
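
The SIGN_EXTENSION lines added in the hunks above widen the int32_t arguments (strides and height) to 64 bits before they are used as post-index offsets or loop counters. Under the AArch64 procedure call standard the upper 32 bits of a register carrying a 32-bit argument are unspecified, so using x1 directly in address arithmetic could add garbage to the pointer. The sketch below models this in C++. It assumes the macro expands to an sxtw-style instruction; MakeRegister, SignExtend32 and DemoSignExtension are hypothetical names used only for illustration, not part of the codebase.

```cpp
#include <cassert>
#include <cstdint>

// Models an X register whose low 32 bits carry an int32_t argument and whose
// upper 32 bits hold leftover garbage from earlier code.
inline uint64_t MakeRegister (int32_t iValue, uint32_t uiUpperGarbage) {
  return (static_cast<uint64_t> (uiUpperGarbage) << 32) | static_cast<uint32_t> (iValue);
}

// Models `sxtw xN, wN`: keep the low 32 bits and replicate the sign bit upward.
inline int64_t SignExtend32 (uint64_t uiRegister) {
  return static_cast<int64_t> (static_cast<int32_t> (static_cast<uint32_t> (uiRegister)));
}

// Shows that the raw 64-bit value is unusable as a stride, while the
// sign-extended value matches the int32_t the caller passed.
inline void DemoSignExtension() {
  const uint64_t uiStrideReg = MakeRegister (-16, 0xDEADBEEFu);
  assert (static_cast<int64_t> (uiStrideReg) != -16); // garbage leaks into the offset
  assert (SignExtend32 (uiStrideReg) == -16);         // correct after widening
}
```

Negative strides are the case where zero extension would also fail, which is presumably why a sign-extending form is used for every int32_t parameter.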


@ -0,0 +1,180 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsCircleQueue.h
*
* \brief for the queue function needed in ThreadPool
*
* \date 9/27/2015 Created
*
*************************************************************************************
*/
#ifndef _WELS_CIRCLE_QUEUE_H_
#define _WELS_CIRCLE_QUEUE_H_
#include "typedefs.h"
#include <stdlib.h>
#include <string.h>
namespace WelsCommon {
template<typename TNodeType>
class CWelsCircleQueue {
public:
CWelsCircleQueue() {
m_iMaxNodeCount = 50;
m_pCurrentQueue = static_cast<TNodeType**> (malloc (m_iMaxNodeCount * sizeof (TNodeType*)));
//an array is used to simulate the list, to avoid frequent malloc/free of nodes, which may fragment memory
m_iCurrentListStart = m_iCurrentListEnd = 0;
};
~CWelsCircleQueue() {
free (m_pCurrentQueue);
};
int32_t size() {
return ((m_iCurrentListEnd >= m_iCurrentListStart)
? (m_iCurrentListEnd - m_iCurrentListStart)
: (m_iMaxNodeCount - m_iCurrentListStart + m_iCurrentListEnd));
}
int32_t push_back (TNodeType* pNode) {
if ((NULL != pNode) && (find (pNode))) { //NULL nodes skip the duplicate check and are pushed as-is, for easier testing
return 1;
}
return InternalPushBack (pNode);
}
bool find (TNodeType* pNode) {
if (size() > 0) {
if (m_iCurrentListEnd > m_iCurrentListStart) {
for (int32_t idx = m_iCurrentListStart; idx < m_iCurrentListEnd; idx++) {
if (pNode == m_pCurrentQueue[idx]) {
return true;
}
}
} else {
for (int32_t idx = m_iCurrentListStart; idx < m_iMaxNodeCount; idx++) {
if (pNode == m_pCurrentQueue[idx]) {
return true;
}
}
for (int32_t idx = 0; idx < m_iCurrentListEnd; idx++) {
if (pNode == m_pCurrentQueue[idx]) {
return true;
}
}
}
}
return false;
}
void pop_front() {
if (size() > 0) {
m_pCurrentQueue[m_iCurrentListStart] = NULL;
m_iCurrentListStart = ((m_iCurrentListStart < (m_iMaxNodeCount - 1))
? (m_iCurrentListStart + 1)
: 0);
}
}
TNodeType* begin() {
if (size() > 0) {
return m_pCurrentQueue[m_iCurrentListStart];
}
return NULL;
}
TNodeType* GetIndexNode (const int32_t iIdx) {
if (size() > 0) {
if ((iIdx + m_iCurrentListStart) < m_iMaxNodeCount) {
return m_pCurrentQueue[m_iCurrentListStart + iIdx];
} else {
return m_pCurrentQueue[m_iCurrentListStart + iIdx - m_iMaxNodeCount];
}
}
return NULL;
}
private:
int32_t InternalPushBack (TNodeType* pNode) {
m_pCurrentQueue[m_iCurrentListEnd] = pNode;
m_iCurrentListEnd ++;
if (m_iCurrentListEnd == m_iMaxNodeCount) {
m_iCurrentListEnd = 0;
}
if (m_iCurrentListEnd == m_iCurrentListStart) {
int32_t ret = ExpandQueue();
if (ret) {
return 1;
}
}
return 0;
}
int32_t ExpandQueue() {
TNodeType** tmpCurrentTaskQueue = static_cast<TNodeType**> (malloc (m_iMaxNodeCount * 2 * sizeof (TNodeType*)));
if (tmpCurrentTaskQueue == NULL) {
return 1;
}
memcpy (tmpCurrentTaskQueue,
(m_pCurrentQueue + m_iCurrentListStart),
(m_iMaxNodeCount - m_iCurrentListStart)*sizeof (TNodeType*));
if (m_iCurrentListEnd > 0) {
memcpy (tmpCurrentTaskQueue + m_iMaxNodeCount - m_iCurrentListStart,
m_pCurrentQueue,
m_iCurrentListEnd * sizeof (TNodeType*));
}
free (m_pCurrentQueue);
m_pCurrentQueue = tmpCurrentTaskQueue;
m_iCurrentListEnd = m_iMaxNodeCount;
m_iCurrentListStart = 0;
m_iMaxNodeCount = m_iMaxNodeCount * 2;
return 0;
}
int32_t m_iCurrentListStart;
int32_t m_iCurrentListEnd;
int32_t m_iMaxNodeCount;
TNodeType** m_pCurrentQueue;
};
}
#endif
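
The index arithmetic in size(), InternalPushBack() and ExpandQueue() above can be sketched in isolation. The block below is a minimal illustration with plain ints instead of node pointers; RingSize, RingNext and UnwrapAndDouble are hypothetical names, and std::vector stands in for the malloc'd array.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of occupied slots when start/end indices live in [0, iCapacity).
inline int32_t RingSize (int32_t iStart, int32_t iEnd, int32_t iCapacity) {
  assert (iCapacity > 0);
  return (iEnd >= iStart) ? (iEnd - iStart) : (iCapacity - iStart + iEnd);
}

// Index following iIdx, wrapping back to 0 at the end of the array.
inline int32_t RingNext (int32_t iIdx, int32_t iCapacity) {
  assert (iCapacity > 0);
  return (iIdx + 1 == iCapacity) ? 0 : (iIdx + 1);
}

// Copies a full ring into a buffer of twice the capacity so that the oldest
// element lands at index 0. Afterwards start would be 0 and end would be the
// old capacity, matching what ExpandQueue() sets.
inline std::vector<int> UnwrapAndDouble (const std::vector<int>& kRing, int32_t iStart) {
  const int32_t iCapacity = static_cast<int32_t> (kRing.size());
  std::vector<int> vGrown (2 * kRing.size(), 0);
  for (int32_t i = 0; i < iCapacity; i++) {
    vGrown[i] = kRing[ (iStart + i) % iCapacity];
  }
  return vGrown;
}
```

Because start == end is used to mean "empty", the queue in the header grows as soon as a push makes the two indices meet, so a full ring is never left in that ambiguous state.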

codec/common/inc/WelsList.h Normal file

@ -0,0 +1,233 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsList.h
*
* \brief for the list function needed in ThreadPool
*
* \date 9/27/2015 Created
*
*************************************************************************************
*/
#ifndef _WELS_LIST_H_
#define _WELS_LIST_H_
#include "typedefs.h"
namespace WelsCommon {
template<typename TNodeType>
struct SNode {
TNodeType* pPointer;
SNode* pPrevNode;
SNode* pNextNode;
};
template<typename TNodeType>
class CWelsList {
public:
CWelsList() {
m_iCurrentNodeCount = 0;
m_iMaxNodeCount = 50;
m_pCurrentList = static_cast<SNode<TNodeType>*> (malloc (m_iMaxNodeCount * sizeof (SNode<TNodeType>)));
//array storage is used to simulate the list, to avoid frequent malloc/free of nodes, which may fragment memory
ResetStorage();
};
~CWelsList() {
free (m_pCurrentList);
};
int32_t size() {
return m_iCurrentNodeCount;
}
bool push_back (TNodeType* pNode) {
m_pCurrent->pPointer = pNode;
if (0 == m_iCurrentNodeCount) {
m_pFirst = m_pCurrent;
}
m_iCurrentNodeCount++;
if (m_iCurrentNodeCount == m_iMaxNodeCount) {
if (!ExpandList()) {
return false;
}
}
SNode<TNodeType>* pNext = FindNextStorage();
m_pCurrent->pNextNode = pNext;
pNext->pPrevNode = m_pCurrent;
m_pCurrent = pNext;
return true;
}
TNodeType* begin() {
if (m_pFirst) {
return m_pFirst->pPointer;
}
return NULL;
}
void pop_front() {
if (m_iCurrentNodeCount == 0) {
return;
}
SNode<TNodeType>* pTemp = m_pFirst;
if (m_iCurrentNodeCount > 0) {
m_iCurrentNodeCount --;
}
if (0 == m_iCurrentNodeCount) {
ResetStorage();
} else {
m_pFirst = m_pFirst->pNextNode;
m_pFirst->pPrevNode = NULL;
CleanOneNode (pTemp);
}
}
bool erase (TNodeType* pNode) {
if (0 == m_iCurrentNodeCount) {
return false;
}
SNode<TNodeType>* pTemp = m_pFirst;
do {
if (pNode == pTemp->pPointer) {
if (pTemp->pPrevNode) {
pTemp->pPrevNode->pNextNode = pTemp->pNextNode;
} else {
m_pFirst = pTemp->pNextNode;
}
if (pTemp->pNextNode) {
pTemp->pNextNode->pPrevNode = pTemp->pPrevNode;
}
CleanOneNode (pTemp);
m_iCurrentNodeCount --;
return true;
}
if (pTemp->pNextNode) {
pTemp = pTemp->pNextNode;
} else {
break;
}
} while (pTemp->pPointer && pTemp->pNextNode);
return false;
}
private:
bool ExpandList() {
SNode<TNodeType>* tmpCurrentList = static_cast<SNode<TNodeType>*> (malloc (m_iMaxNodeCount * 2 * sizeof (
SNode<TNodeType>)));
if (tmpCurrentList == NULL) {
return false;
}
InitStorage (tmpCurrentList, (m_iMaxNodeCount * 2) - 1);
SNode<TNodeType>* pTemp = m_pFirst;
for (int i = 0; ((i < m_iMaxNodeCount) && pTemp); i++) {
tmpCurrentList[i].pPointer = pTemp->pPointer;
pTemp = pTemp->pNextNode;
}
free (m_pCurrentList);
m_pCurrentList = tmpCurrentList;
m_iCurrentNodeCount = m_iMaxNodeCount;
m_iMaxNodeCount = m_iMaxNodeCount * 2;
m_pFirst = m_pCurrentList;
m_pCurrent = & (m_pCurrentList[m_iCurrentNodeCount - 1]);
return true;
}
void InitStorage (SNode<TNodeType>* pList, const int32_t iMaxIndex) {
pList[0].pPrevNode = NULL;
pList[0].pPointer = NULL;
pList[0].pNextNode = & (pList[1]);
for (int i = 1; i < iMaxIndex; i++) {
pList[i].pPrevNode = & (pList[i - 1]);
pList[i].pPointer = NULL;
pList[i].pNextNode = & (pList[i + 1]);
}
pList[iMaxIndex].pPrevNode = & (pList[iMaxIndex - 1]);
pList[iMaxIndex].pPointer = NULL;
pList[iMaxIndex].pNextNode = NULL;
}
SNode<TNodeType>* FindNextStorage() {
if (NULL != m_pCurrent->pNextNode) {
if (NULL == m_pCurrent->pNextNode->pPointer) {
return (m_pCurrent->pNextNode);
}
}
for (int32_t i = 0; i < m_iMaxNodeCount; i++) {
if (NULL == m_pCurrentList[i].pPointer) {
return (&m_pCurrentList[i]);
}
}
return NULL;
}
void CleanOneNode (SNode<TNodeType>* pSNode) {
pSNode->pPointer = NULL;
pSNode->pPrevNode = NULL;
pSNode->pNextNode = NULL;
}
void ResetStorage() {
m_pFirst = NULL;
m_pCurrent = m_pCurrentList;
InitStorage (m_pCurrentList, m_iMaxNodeCount - 1);
}
int32_t m_iCurrentNodeCount;
int32_t m_iMaxNodeCount;
SNode<TNodeType>* m_pCurrentList;
SNode<TNodeType>* m_pFirst;
SNode<TNodeType>* m_pCurrent;
};
}
#endif


@ -0,0 +1,97 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsLock.h
*
* \brief class wrapping for locks
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#ifndef _WELS_LOCK_H_
#define _WELS_LOCK_H_
#include "macros.h"
#include "typedefs.h"
#include "WelsThreadLib.h"
namespace WelsCommon {
class CWelsLock {
DISALLOW_COPY_AND_ASSIGN (CWelsLock);
public:
CWelsLock() {
WelsMutexInit (&m_cMutex);
}
virtual ~CWelsLock() {
WelsMutexDestroy (&m_cMutex);
}
WELS_THREAD_ERROR_CODE Lock() {
return WelsMutexLock (&m_cMutex);
}
WELS_THREAD_ERROR_CODE Unlock() {
return WelsMutexUnlock (&m_cMutex);
}
private:
WELS_MUTEX m_cMutex;
};
class CWelsAutoLock {
DISALLOW_COPY_AND_ASSIGN (CWelsAutoLock);
public:
CWelsAutoLock (CWelsLock& cLock) : m_cLock (cLock) {
m_cLock.Lock();
}
virtual ~CWelsAutoLock() {
m_cLock.Unlock();
}
private:
CWelsLock& m_cLock;
};
}
#endif
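
CWelsAutoLock above is a scoped guard: the constructor takes the lock and the destructor releases it, so every exit path from a function unlocks. The sketch below shows the effect with a stub lock that only records whether it is held; CStubLock, CScopedLock and GuardedIncrement are hypothetical names, and no real mutex is involved.

```cpp
#include <cassert>

// Stub standing in for CWelsLock: tracks held/not held instead of wrapping a mutex.
class CStubLock {
 public:
  CStubLock() : m_bHeld (false) {}
  void Lock() {
    assert (!m_bHeld);
    m_bHeld = true;
  }
  void Unlock() {
    assert (m_bHeld);
    m_bHeld = false;
  }
  bool IsHeld() const {
    return m_bHeld;
  }
 private:
  bool m_bHeld;
};

// Same shape as CWelsAutoLock: acquire in the constructor, release in the destructor.
class CScopedLock {
 public:
  explicit CScopedLock (CStubLock& cLock) : m_cLock (cLock) {
    m_cLock.Lock();
  }
  ~CScopedLock() {
    m_cLock.Unlock();
  }
 private:
  CScopedLock (const CScopedLock&);            // non-copyable
  CScopedLock& operator= (const CScopedLock&); // non-assignable
  CStubLock& m_cLock;
};

// Returns whether the lock was held inside the function; the guard releases it
// on both the early return and the normal return.
inline bool GuardedIncrement (CStubLock& cLock, int& iCounter, bool bEarlyReturn) {
  CScopedLock cAuto (cLock);
  if (bEarlyReturn) {
    return cLock.IsHeld();
  }
  ++iCounter;
  return cLock.IsHeld();
}
```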


@ -0,0 +1,75 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsTask.h
*
* \brief Interfaces introduced in thread pool
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#ifndef _WELS_TASK_H_
#define _WELS_TASK_H_
#include "codec_def.h"
namespace WelsCommon {
class IWelsTaskSink {
public:
virtual int OnTaskExecuted() = 0;
virtual int OnTaskCancelled() = 0;
};
class IWelsTask {
public:
IWelsTask (IWelsTaskSink* pSink) {
m_pSink = pSink;
};
virtual ~IWelsTask() { }
virtual int Execute() = 0;
IWelsTaskSink* GetSink() {
return m_pSink;
};
protected:
IWelsTaskSink* m_pSink;
};
}
#endif
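
A task in this header does its work in Execute(), and its sink is notified when the task has run or been cancelled. The sketch below re-declares simplified copies of the two interfaces so that it compiles on its own. CCountingSink, CSumTask and RunTask are hypothetical, and the notification order shown is an assumption about how a thread pool might drive these interfaces, not the pool's actual logic.

```cpp
#include <cassert>
#include <cstddef>

// Simplified copies of IWelsTaskSink / IWelsTask, for illustration only.
class ITaskSinkSketch {
 public:
  virtual ~ITaskSinkSketch() {}
  virtual int OnTaskExecuted() = 0;
  virtual int OnTaskCancelled() = 0;
};

class ITaskSketch {
 public:
  explicit ITaskSketch (ITaskSinkSketch* pSink) : m_pSink (pSink) {}
  virtual ~ITaskSketch() {}
  virtual int Execute() = 0;
  ITaskSinkSketch* GetSink() {
    return m_pSink;
  }
 protected:
  ITaskSinkSketch* m_pSink;
};

// Sink that counts the notifications it receives.
class CCountingSink : public ITaskSinkSketch {
 public:
  CCountingSink() : iExecuted (0), iCancelled (0) {}
  virtual int OnTaskExecuted() {
    return ++iExecuted;
  }
  virtual int OnTaskCancelled() {
    return ++iCancelled;
  }
  int iExecuted;
  int iCancelled;
};

// Task that adds two numbers when executed.
class CSumTask : public ITaskSketch {
 public:
  CSumTask (ITaskSinkSketch* pSink, int iA, int iB)
    : ITaskSketch (pSink), iResult (0), m_iA (iA), m_iB (iB) {}
  virtual int Execute() {
    iResult = m_iA + m_iB;
    return 0;
  }
  int iResult;
 private:
  int m_iA;
  int m_iB;
};

// Assumed driver: either cancel the task or execute it, then tell the sink.
inline void RunTask (ITaskSketch* pTask, bool bCancel) {
  assert (pTask != NULL);
  ITaskSinkSketch* pSink = pTask->GetSink();
  if (bCancel) {
    if (pSink) pSink->OnTaskCancelled();
    return;
  }
  pTask->Execute();
  if (pSink) pSink->OnTaskExecuted();
}
```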


@ -0,0 +1,83 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsTaskThread.h
*
* \brief connecting task and thread
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#ifndef _WELS_TASK_THREAD_H_
#define _WELS_TASK_THREAD_H_
#include "WelsTask.h"
#include "WelsThread.h"
namespace WelsCommon {
class CWelsTaskThread;
class IWelsTaskThreadSink {
public:
virtual WELS_THREAD_ERROR_CODE OnTaskStart (CWelsTaskThread* pThread, IWelsTask* pTask) = 0;
virtual WELS_THREAD_ERROR_CODE OnTaskStop (CWelsTaskThread* pThread, IWelsTask* pTask) = 0;
};
class CWelsTaskThread : public CWelsThread {
public:
CWelsTaskThread (IWelsTaskThreadSink* pSink);
virtual ~CWelsTaskThread();
WELS_THREAD_ERROR_CODE SetTask (IWelsTask* pTask);
virtual void ExecuteTask();
uintptr_t GetID() const {
return m_uiID;
}
private:
CWelsLock m_cLockTask;
IWelsTaskThreadSink* m_pSink;
IWelsTask* m_pTask;
uintptr_t m_uiID;
DISALLOW_COPY_AND_ASSIGN (CWelsTaskThread);
};
}
#endif


@ -0,0 +1,105 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsThread.h
*
* \brief Interfaces introduced in threads
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#ifndef _WELS_THREAD_H_
#define _WELS_THREAD_H_
#include "macros.h"
#include "WelsLock.h"
#include "WelsThreadLib.h"
namespace WelsCommon {
class CWelsThread {
public:
CWelsThread();
virtual ~CWelsThread();
virtual void Thread();
virtual void ExecuteTask() = 0;
virtual WELS_THREAD_ERROR_CODE Start();
virtual void Kill();
protected:
static WELS_THREAD_ROUTINE_TYPE TheThread (void* pParam);
void SetRunning (bool bRunning) {
CWelsAutoLock cLock (m_cLockStatus);
m_bRunning = bRunning;
}
void SetEndFlag (bool bEndFlag) {
CWelsAutoLock cLock (m_cLockStatus);
m_bEndFlag = bEndFlag;
}
bool GetRunning() const {
return m_bRunning;
}
bool GetEndFlag() const {
return m_bEndFlag;
}
void SignalThread() {
WelsEventSignal (&m_hEvent);
}
private:
WELS_THREAD_HANDLE m_hThread;
WELS_EVENT m_hEvent;
CWelsLock m_cLockStatus;
bool m_bRunning;
bool m_bEndFlag;
DISALLOW_COPY_AND_ASSIGN (CWelsThread);
};
}
#endif

View File

@ -104,8 +104,9 @@ WELS_THREAD_ERROR_CODE WelsMutexLock (WELS_MUTEX* mutex);
WELS_THREAD_ERROR_CODE WelsMutexUnlock (WELS_MUTEX* mutex);
WELS_THREAD_ERROR_CODE WelsMutexDestroy (WELS_MUTEX* mutex);
WELS_THREAD_ERROR_CODE WelsEventOpen (WELS_EVENT* p_event, const char* event_name);
WELS_THREAD_ERROR_CODE WelsEventClose (WELS_EVENT* event, const char* event_name);
WELS_THREAD_ERROR_CODE WelsEventOpen (WELS_EVENT* p_event, const char* event_name = NULL);
WELS_THREAD_ERROR_CODE WelsEventClose (WELS_EVENT* event, const char* event_name = NULL);
WELS_THREAD_ERROR_CODE WelsEventSignal (WELS_EVENT* event);
WELS_THREAD_ERROR_CODE WelsEventWait (WELS_EVENT* event);
WELS_THREAD_ERROR_CODE WelsEventWaitWithTimeOut (WELS_EVENT* event, uint32_t dwMilliseconds);
@ -125,6 +126,7 @@ WELS_THREAD_HANDLE WelsThreadSelf();
WELS_THREAD_ERROR_CODE WelsQueryLogicalProcessInfo (WelsLogicalProcessInfo* pInfo);
void WelsSleep (uint32_t dwMilliSecond);
#ifdef __cplusplus
}

View File

@ -0,0 +1,125 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsThreadPool.h
*
* \brief Interfaces introduced in thread pool
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#ifndef _WELS_THREAD_POOL_H_
#define _WELS_THREAD_POOL_H_
#include <stdio.h>
#include "WelsTask.h"
#include "WelsTaskThread.h"
#include "WelsCircleQueue.h"
#include "WelsList.h"
namespace WelsCommon {
class CWelsThreadPool : public CWelsThread, public IWelsTaskThreadSink {
public:
enum {
DEFAULT_THREAD_NUM = 4,
};
static WELS_THREAD_ERROR_CODE SetThreadNum (int32_t iMaxThreadNum);
static CWelsThreadPool& AddReference();
void RemoveInstance();
static bool IsReferenced();
//IWelsTaskThreadSink
virtual WELS_THREAD_ERROR_CODE OnTaskStart (CWelsTaskThread* pThread, IWelsTask* pTask);
virtual WELS_THREAD_ERROR_CODE OnTaskStop (CWelsTaskThread* pThread, IWelsTask* pTask);
// CWelsThread
virtual void ExecuteTask();
WELS_THREAD_ERROR_CODE QueueTask (IWelsTask* pTask);
int32_t GetThreadNum() const {
return m_iMaxThreadNum;
}
protected:
WELS_THREAD_ERROR_CODE Init();
WELS_THREAD_ERROR_CODE Uninit();
WELS_THREAD_ERROR_CODE CreateIdleThread();
void DestroyThread (CWelsTaskThread* pThread);
WELS_THREAD_ERROR_CODE AddThreadToIdleQueue (CWelsTaskThread* pThread);
WELS_THREAD_ERROR_CODE AddThreadToBusyList (CWelsTaskThread* pThread);
WELS_THREAD_ERROR_CODE RemoveThreadFromBusyList (CWelsTaskThread* pThread);
void AddTaskToWaitedList (IWelsTask* pTask);
CWelsTaskThread* GetIdleThread();
IWelsTask* GetWaitedTask();
int32_t GetIdleThreadNum();
int32_t GetBusyThreadNum();
int32_t GetWaitedTaskNum();
void ClearWaitedTasks();
private:
CWelsThreadPool();
virtual ~CWelsThreadPool();
WELS_THREAD_ERROR_CODE StopAllRunning();
static int32_t m_iRefCount;
static CWelsLock m_cInitLock;
static int32_t m_iMaxThreadNum;
CWelsCircleQueue<IWelsTask>* m_cWaitedTasks;
CWelsCircleQueue<CWelsTaskThread>* m_cIdleThreads;
CWelsList<CWelsTaskThread>* m_cBusyThreads;
CWelsLock m_cLockPool;
CWelsLock m_cLockWaitedTasks;
CWelsLock m_cLockIdleTasks;
CWelsLock m_cLockBusyTasks;
DISALLOW_COPY_AND_ASSIGN (CWelsThreadPool);
};
}
#endif

View File

@ -56,7 +56,6 @@
#define WELS_CPU_SSE42 0x00000400 /* sse 4.2 */
/* CPU features application extensive */
#define WELS_CPU_AVX 0x00000800 /* Advanced Vector eXtensions */
#define WELS_CPU_FPU 0x00001000 /* x87-FPU on chip */
#define WELS_CPU_HTT 0x00002000 /* Hyper-Threading Technology (HTT), Multi-threading enabled feature:
physical processor package is capable of supporting more than one logical processor
@ -67,7 +66,13 @@
#define WELS_CPU_MOVBE 0x00008000 /* MOVBE instruction */
#define WELS_CPU_AES 0x00010000 /* AES instruction extensions */
#define WELS_CPU_FMA 0x00020000 /* AVX VEX FMA instruction sets */
#define WELS_CPU_AVX 0x00000800 /* Advanced Vector eXtensions */
#ifdef HAVE_AVX2
#define WELS_CPU_AVX2 0x00040000 /* AVX2 */
#else
#define WELS_CPU_AVX2 0x00000000 /* !AVX2 */
#endif
#define WELS_CPU_CACHELINE_16 0x10000000 /* CacheLine Size 16 */
#define WELS_CPU_CACHELINE_32 0x20000000 /* CacheLine Size 32 */
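Note on the hunk above: when HAVE_AVX2 is not defined, WELS_CPU_AVX2 expands to 0. A minimal sketch of why that is enough to switch off AVX2 dispatch, assuming callers test the flag with a bitwise AND. The `SKETCH_` macro and the helper `SketchHasAvx2` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the conditional definition in the hunk under an illustrative name.
#ifdef HAVE_AVX2
#define SKETCH_CPU_AVX2 0x00040000u /* AVX2 */
#else
#define SKETCH_CPU_AVX2 0x00000000u /* !AVX2 */
#endif

// True only if the AVX2 bit is set in uiCpuFlags and AVX2 support was
// compiled in. With the mask defined as 0, the AND is always 0.
inline bool SketchHasAvx2 (uint32_t uiCpuFlags) {
  return (uiCpuFlags & SKETCH_CPU_AVX2) != 0;
}

// Small self-check: a flag word with no bits set never reports AVX2.
inline void SketchCheckNoFlags() {
  assert (!SketchHasAvx2 (0u));
}
```

With this shape, code that selects an AVX2 routine behind such a test needs no extra `#ifdef` at the call site, although the AVX2 routines themselves still have to be guarded where they are declared.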

View File

@ -205,6 +205,11 @@ template<typename T> T WelsClip3(T iX, T iY, T iZ) {
return iX;
}
#define DISALLOW_COPY_AND_ASSIGN(cclass) \
private: \
cclass(const cclass &); \
cclass& operator=(const cclass &);
/*
* Description: to check variable validation and return the specified result
* iResult: value to be checked
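Note on DISALLOW_COPY_AND_ASSIGN, added in the hunk above: the macro declares a private copy constructor and copy assignment operator without defining them. A minimal sketch of the effect, using a macro of the same shape under a hypothetical `SKETCH_` name and a hypothetical class `CSketchHandle`:

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the macro in the hunk; renamed so it cannot clash with it.
#define SKETCH_DISALLOW_COPY_AND_ASSIGN(cclass) \
  private: \
  cclass (const cclass&); \
  cclass& operator= (const cclass&);

class CSketchHandle {
 public:
  // Declaring a copy constructor suppresses the implicit default
  // constructor, so one is provided explicitly.
  CSketchHandle() : m_iValue (0) {}
  int Value() const { return m_iValue; }
 private:
  int m_iValue;
  // The macro already ends in ';', so no extra semicolon is added here.
  SKETCH_DISALLOW_COPY_AND_ASSIGN (CSketchHandle)
};

// Private, undefined copy operations make the type non-copyable from outside.
static_assert (!std::is_copy_constructible<CSketchHandle>::value, "copy construction is blocked");
static_assert (!std::is_copy_assignable<CSketchHandle>::value, "copy assignment is blocked");

// Default construction still works.
inline int SketchDefaultValue() {
  CSketchHandle cHandle;
  assert (cHandle.Value() == 0);
  return cHandle.Value();
}
```

Because the macro switches the access level to `private:`, it is placed last in the class body, as the classes in this change set do.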

View File

@ -104,6 +104,12 @@ void WelsFree (void* pPtr, const char* kpTag);
#define WELS_SAFE_FREE(pPtr, pTag) if (pPtr) { WelsFree(pPtr, pTag); pPtr = NULL; }
#define WELS_NEW_OP(object, type) \
(type*)(new object);
#define WELS_DELETE_OP(p) \
delete p; \
p = NULL;
}

View File

@ -43,8 +43,7 @@
#include "typedefs.h"
#define MAX_LOG_SIZE 1024
#define MAX_WIDTH (4096)
#define MAX_HEIGHT (2304)//MAX_FS_LEVEL51 (36864); MAX_FS_LEVEL51*256/4096 = 2304
#define MAX_MBS_PER_FRAME 36864 //in accordance with max level support in Rec
/*
* Function pointer declaration for various tool sets
*/

View File

@ -97,12 +97,12 @@ enum EWelsNalUnitType {
NAL_UNIT_SPS_EXT = 13,
NAL_UNIT_PREFIX = 14,
NAL_UNIT_SUBSET_SPS = 15,
NAL_UNIT_RESV_16 = 16,
NAL_UNIT_DEPTH_PARAM = 16, // NAL_UNIT_RESV_16
NAL_UNIT_RESV_17 = 17,
NAL_UNIT_RESV_18 = 18,
NAL_UNIT_AUX_CODED_SLICE = 19,
NAL_UNIT_CODED_SLICE_EXT = 20,
NAL_UNIT_RESV_21 = 21,
NAL_UNIT_MVC_SLICE_EXT = 21, // NAL_UNIT_RESV_21
NAL_UNIT_RESV_22 = 22,
NAL_UNIT_RESV_23 = 23,
NAL_UNIT_UNSPEC_24 = 24,

View File

@ -0,0 +1,88 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsTaskThread.cpp
*
* \brief functions for TaskThread
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#include "WelsTaskThread.h"
namespace WelsCommon {
CWelsTaskThread::CWelsTaskThread (IWelsTaskThreadSink* pSink) : m_pSink (pSink) {
WelsThreadSetName ("CWelsTaskThread");
m_uiID = (uintptr_t) (this);
m_pTask = NULL;
}
CWelsTaskThread::~CWelsTaskThread() {
}
void CWelsTaskThread::ExecuteTask() {
CWelsAutoLock cLock (m_cLockTask);
if (m_pSink) {
m_pSink->OnTaskStart (this, m_pTask);
}
if (m_pTask) {
m_pTask->Execute();
}
if (m_pSink) {
m_pSink->OnTaskStop (this, m_pTask);
}
m_pTask = NULL;
}
WELS_THREAD_ERROR_CODE CWelsTaskThread::SetTask (WelsCommon::IWelsTask* pTask) {
CWelsAutoLock cLock (m_cLockTask);
if (!GetRunning()) {
return WELS_THREAD_ERROR_GENERAL;
}
m_pTask = pTask;
SignalThread();
return WELS_THREAD_ERROR_OK;
}
}

View File

@ -0,0 +1,125 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsThread.cpp
*
* \brief functions for Thread
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#include "WelsThread.h"
namespace WelsCommon {
CWelsThread::CWelsThread() :
m_hThread (0),
m_bRunning (false),
m_bEndFlag (false) {
WELS_THREAD_ERROR_CODE rc = WelsEventOpen (&m_hEvent);
if (WELS_THREAD_ERROR_OK != rc) {
m_hEvent = NULL;
}
}
CWelsThread::~CWelsThread() {
Kill();
WelsEventClose (&m_hEvent);
m_hEvent = NULL;
}
void CWelsThread::Thread() {
while (true) {
WelsEventWait (&m_hEvent);
if (GetEndFlag()) {
break;
}
ExecuteTask();
}
SetRunning (false);
}
WELS_THREAD_ERROR_CODE CWelsThread::Start() {
if (NULL == m_hEvent) {
return WELS_THREAD_ERROR_GENERAL;
}
if (GetRunning()) {
return WELS_THREAD_ERROR_OK;
}
SetEndFlag (false);
WELS_THREAD_ERROR_CODE rc = WelsThreadCreate (&m_hThread,
(LPWELS_THREAD_ROUTINE)TheThread, this, 0);
if (WELS_THREAD_ERROR_OK != rc) {
return rc;
}
while (!GetRunning()) {
WelsSleep (1);
}
return WELS_THREAD_ERROR_OK;
}
void CWelsThread::Kill() {
if (!GetRunning()) {
return;
}
SetEndFlag (true);
WelsEventSignal (&m_hEvent);
WelsThreadJoin (m_hThread);
return;
}
WELS_THREAD_ROUTINE_TYPE CWelsThread::TheThread (void* pParam) {
CWelsThread* pThis = static_cast<CWelsThread*> (pParam);
pThis->SetRunning (true);
pThis->Thread();
WELS_THREAD_ROUTINE_RETURN (NULL);
}
}
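Note on the loop in CWelsThread::Thread(): the thread blocks on its event, checks the end flag first, and only then runs ExecuteTask(). A minimal sketch of the same control flow, assuming the C++11 standard library in place of WELS_EVENT and WelsThreadCreate. The class `CSketchThread` and the helper `SketchRunOnce` are hypothetical and not part of the codebase.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for CWelsThread; a condition variable replaces WELS_EVENT.
class CSketchThread {
 public:
  CSketchThread() : m_bSignaled (false), m_bEndFlag (false), m_iExecuted (0) {}
  ~CSketchThread() { Kill(); }

  void Start() {
    if (!m_cThread.joinable()) {
      m_cThread = std::thread (&CSketchThread::Loop, this);
    }
  }
  // Counterpart of SignalThread(): wake the loop to run one unit of work.
  void Signal() {
    {
      std::lock_guard<std::mutex> cLock (m_cMutex);
      m_bSignaled = true;
    }
    m_cCond.notify_one();
  }
  // Counterpart of Kill(): set the end flag, wake the loop, then join.
  void Kill() {
    if (!m_cThread.joinable()) return;
    {
      std::lock_guard<std::mutex> cLock (m_cMutex);
      m_bEndFlag = true;
      m_bSignaled = true;
    }
    m_cCond.notify_one();
    m_cThread.join();
  }
  int Executed() {
    std::lock_guard<std::mutex> cLock (m_cMutex);
    return m_iExecuted;
  }
 private:
  void Loop() {
    for (;;) {
      std::unique_lock<std::mutex> cLock (m_cMutex);
      m_cCond.wait (cLock, [this] { return m_bSignaled; });
      m_bSignaled = false;
      if (m_bEndFlag) break;  // checked before any work, as in Thread()
      ++m_iExecuted;          // stands in for ExecuteTask()
    }
  }
  std::mutex m_cMutex;
  std::condition_variable m_cCond;
  std::thread m_cThread;
  bool m_bSignaled;
  bool m_bEndFlag;
  int m_iExecuted;
};

// Signals once, waits (bounded) for the work to run, then shuts down.
inline int SketchRunOnce() {
  CSketchThread cThread;
  cThread.Start();
  cThread.Signal();
  for (int i = 0; i < 2000 && cThread.Executed() < 1; ++i) {
    std::this_thread::sleep_for (std::chrono::milliseconds (1));
  }
  cThread.Kill();
  return cThread.Executed();
}
```

The sketch keeps the signaled state and the end flag under one mutex, which differs from the original, where the getters read the flags without taking m_cLockStatus.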

View File

@ -71,6 +71,7 @@
#ifdef WINAPI_FAMILY
#if !WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)
#define WP80
using namespace Platform;
using namespace Windows::Foundation;
using namespace Windows::System::Threading;
@ -174,6 +175,30 @@ WELS_THREAD_ERROR_CODE WelsEventClose (WELS_EVENT* event, const char* event_n
return WELS_THREAD_ERROR_OK;
}
#ifndef WP80
void WelsSleep (uint32_t dwMilliSecond) {
::Sleep (dwMilliSecond);
}
#else
void WelsSleep (uint32_t dwMilliSecond) {
static WELS_EVENT hSleepEvent = NULL;
if (!hSleepEvent) {
WELS_EVENT hLocalSleepEvent = NULL;
WELS_THREAD_ERROR_CODE ret = WelsEventOpen (&hLocalSleepEvent);
if (WELS_THREAD_ERROR_OK != ret) {
return;
}
WELS_EVENT hPreviousEvent = InterlockedCompareExchangePointerRelease (&hSleepEvent, hLocalSleepEvent, NULL);
if (hPreviousEvent) {
WelsEventClose (&hLocalSleepEvent);
}
// Singleton idiom: the first thread to publish its event through
// InterlockedCompareExchangePointerRelease wins; a thread that loses the race closes its local event.
// A similar idiom is described in an MSDN blog post introducing InterlockedCompareExchangePointerRelease.
}
WaitForSingleObject (hSleepEvent, dwMilliSecond);
}
#endif
WELS_THREAD_ERROR_CODE WelsThreadCreate (WELS_THREAD_HANDLE* thread, LPWELS_THREAD_ROUTINE routine,
void* arg, WELS_THREAD_ATTR attr) {
@ -226,7 +251,7 @@ WELS_THREAD_ERROR_CODE WelsQueryLogicalProcessInfo (WelsLogicalProcessInfo* p
return WELS_THREAD_ERROR_OK;
}
#else
#else //platform: #ifdef _WIN32
WELS_THREAD_ERROR_CODE WelsThreadCreate (WELS_THREAD_HANDLE* thread, LPWELS_THREAD_ROUTINE routine,
void* arg, WELS_THREAD_ATTR attr) {
@ -274,14 +299,21 @@ WELS_THREAD_HANDLE WelsThreadSelf() {
WELS_THREAD_ERROR_CODE WelsEventOpen (WELS_EVENT* p_event, const char* event_name) {
#ifdef __APPLE__
if (p_event == NULL || event_name == NULL)
if (p_event == NULL) {
return WELS_THREAD_ERROR_GENERAL;
}
char strSuffix[100] = { 0 };
if (NULL == event_name) {
sprintf (strSuffix, "WelsSem%ld_p%ld", (intptr_t)p_event, (long) (getpid()));
event_name = &strSuffix[0];
}
*p_event = sem_open (event_name, O_CREAT, (S_IRUSR | S_IWUSR)/*0600*/, 0);
if (*p_event == (sem_t*)SEM_FAILED) {
sem_unlink (event_name);
*p_event = NULL;
return WELS_THREAD_ERROR_GENERAL;
} else {
//printf("event_open:%x, %s\n", p_event, event_name);
return WELS_THREAD_ERROR_OK;
}
#else
@ -298,6 +330,7 @@ WELS_THREAD_ERROR_CODE WelsEventOpen (WELS_EVENT* p_event, const char* event_
#endif
}
WELS_THREAD_ERROR_CODE WelsEventClose (WELS_EVENT* event, const char* event_name) {
//printf("event_close:%x, %s\n", event, event_name);
#ifdef __APPLE__
WELS_THREAD_ERROR_CODE err = sem_close (*event); // match with sem_open
if (event_name)
@ -310,6 +343,10 @@ WELS_THREAD_ERROR_CODE WelsEventClose (WELS_EVENT* event, const char* event_n
#endif
}
void WelsSleep (uint32_t dwMilliSecond) {
usleep (dwMilliSecond * 1000);
}
WELS_THREAD_ERROR_CODE WelsEventSignal (WELS_EVENT* event) {
WELS_THREAD_ERROR_CODE err = 0;
// int32_t val = 0;
@ -321,7 +358,7 @@ WELS_THREAD_ERROR_CODE WelsEventSignal (WELS_EVENT* event) {
return err;
}
WELS_THREAD_ERROR_CODE WelsEventWait (WELS_EVENT* event) {
WELS_THREAD_ERROR_CODE WelsEventWait (WELS_EVENT* event) {
return sem_wait (*event); // blocking until signaled
}
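Note on the WP80 variant of WelsSleep in this file: it creates its sleep event lazily and publishes it with a compare-and-swap, so that a thread losing the race closes its own event. A minimal sketch of that idiom, assuming `std::atomic` with its default sequentially consistent ordering in place of InterlockedCompareExchangePointerRelease. The type `SSketchEvent` and both function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

struct SSketchEvent {
  int iId;
};

static std::atomic<SSketchEvent*> g_pSharedEvent (nullptr);

// Tries to publish pCandidate as the shared event. Returns the event that is
// shared after the call and deletes pCandidate if another one was already there.
inline SSketchEvent* SketchPublishOnce (SSketchEvent* pCandidate) {
  SSketchEvent* pExpected = nullptr;
  if (g_pSharedEvent.compare_exchange_strong (pExpected, pCandidate)) {
    return pCandidate;  // won the race: this event is now the singleton
  }
  delete pCandidate;    // lost the race: discard it (WelsEventClose in the original)
  return pExpected;     // on failure, pExpected holds the published pointer
}

// Lazily creates the shared event on first use.
inline SSketchEvent* SketchGetSharedEvent() {
  SSketchEvent* pEvent = g_pSharedEvent.load();
  if (pEvent == nullptr) {
    pEvent = SketchPublishOnce (new SSketchEvent());
  }
  assert (pEvent != nullptr);
  return pEvent;
}
```

As in the original, the published object is never destroyed; it lives until the process exits.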

View File

@ -0,0 +1,350 @@
/*!
* \copy
* Copyright (c) 2009-2015, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* \file WelsThreadPool.cpp
*
* \brief functions for Thread Pool
*
* \date 5/09/2012 Created
*
*************************************************************************************
*/
#include "typedefs.h"
#include "memory_align.h"
#include "WelsThreadPool.h"
namespace WelsCommon {
int32_t CWelsThreadPool::m_iRefCount = 0;
CWelsLock CWelsThreadPool::m_cInitLock;
int32_t CWelsThreadPool::m_iMaxThreadNum = DEFAULT_THREAD_NUM;
CWelsThreadPool::CWelsThreadPool() :
m_cWaitedTasks (NULL), m_cIdleThreads (NULL), m_cBusyThreads (NULL) {
}
CWelsThreadPool::~CWelsThreadPool() {
//fprintf(stdout, "CWelsThreadPool::~CWelsThreadPool: delete %x, %x, %x\n", m_cWaitedTasks, m_cIdleThreads, m_cBusyThreads);
if (0 != m_iRefCount) {
m_iRefCount = 0;
Uninit();
}
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::SetThreadNum (int32_t iMaxThreadNum) {
CWelsAutoLock cLock (m_cInitLock);
if (m_iRefCount != 0) {
return WELS_THREAD_ERROR_GENERAL;
}
if (iMaxThreadNum <= 0) {
iMaxThreadNum = 1;
}
m_iMaxThreadNum = iMaxThreadNum;
return WELS_THREAD_ERROR_OK;
}
CWelsThreadPool& CWelsThreadPool::AddReference () {
CWelsAutoLock cLock (m_cInitLock);
static CWelsThreadPool m_cThreadPoolSelf;
if (m_iRefCount == 0) {
//TODO: will remove this afterwards
if (WELS_THREAD_ERROR_OK != m_cThreadPoolSelf.Init()) {
m_cThreadPoolSelf.Uninit();
}
}
//fprintf(stdout, "m_iRefCount=%d, pSink=%x, iMaxThreadNum=%d\n", m_iRefCount, pSink, iMaxThreadNum);
++ m_iRefCount;
//fprintf(stdout, "m_iRefCount2=%d\n", m_iRefCount);
return m_cThreadPoolSelf;
}
void CWelsThreadPool::RemoveInstance() {
CWelsAutoLock cLock (m_cInitLock);
//fprintf(stdout, "m_iRefCount=%d\n", m_iRefCount);
-- m_iRefCount;
if (0 == m_iRefCount) {
StopAllRunning();
Uninit();
//fprintf(stdout, "m_iRefCount=%d, IdleThreadNum=%d, BusyThreadNum=%d, WaitedTask=%d\n", m_iRefCount, GetIdleThreadNum(), GetBusyThreadNum(), GetWaitedTaskNum());
}
}
bool CWelsThreadPool::IsReferenced() {
CWelsAutoLock cLock (m_cInitLock);
return (m_iRefCount>0);
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::OnTaskStart (CWelsTaskThread* pThread, IWelsTask* pTask) {
AddThreadToBusyList (pThread);
//fprintf(stdout, "CWelsThreadPool::AddThreadToBusyList: Task %x at Thread %x\n", pTask, pThread);
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::OnTaskStop (CWelsTaskThread* pThread, IWelsTask* pTask) {
//fprintf(stdout, "CWelsThreadPool::OnTaskStop 0: Task %x at Thread %x Finished\n", pTask, pThread);
RemoveThreadFromBusyList (pThread);
AddThreadToIdleQueue (pThread);
//fprintf(stdout, "CWelsThreadPool::OnTaskStop 1: Task %x at Thread %x Finished, m_pSink=%x\n", pTask, pThread, m_pSink);
if (pTask->GetSink()) {
pTask->GetSink()->OnTaskExecuted();
}
//if (m_pSink) {
// m_pSink->OnTaskExecuted (pTask);
//}
//fprintf(stdout, "CWelsThreadPool::OnTaskStop 2: Task %x at Thread %x Finished\n", pTask, pThread);
SignalThread();
//fprintf(stdout, "ThreadPool: Task %x at Thread %x Finished\n", pTask, pThread);
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::Init () {
//fprintf(stdout, "Enter WelsThreadPool Init\n");
CWelsAutoLock cLock (m_cLockPool);
m_cWaitedTasks = new CWelsCircleQueue<IWelsTask>();
m_cIdleThreads = new CWelsCircleQueue<CWelsTaskThread>();
m_cBusyThreads = new CWelsList<CWelsTaskThread>();
if (NULL == m_cWaitedTasks || NULL == m_cIdleThreads || NULL == m_cBusyThreads) {
return WELS_THREAD_ERROR_GENERAL;
}
for (int32_t i = 0; i < m_iMaxThreadNum; i++) {
if (WELS_THREAD_ERROR_OK != CreateIdleThread()) {
return WELS_THREAD_ERROR_GENERAL;
}
}
if (WELS_THREAD_ERROR_OK != Start()) {
return WELS_THREAD_ERROR_GENERAL;
}
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::StopAllRunning() {
WELS_THREAD_ERROR_CODE iReturn = WELS_THREAD_ERROR_OK;
ClearWaitedTasks();
while (GetBusyThreadNum() > 0) {
//WELS_INFO_TRACE ("CWelsThreadPool::Uninit - Waiting all thread to exit");
WelsSleep (10);
}
if (GetIdleThreadNum() != m_iMaxThreadNum) {
iReturn = WELS_THREAD_ERROR_GENERAL;
}
return iReturn;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::Uninit() {
WELS_THREAD_ERROR_CODE iReturn = WELS_THREAD_ERROR_OK;
CWelsAutoLock cLock (m_cLockPool);
iReturn = StopAllRunning();
if (WELS_THREAD_ERROR_OK != iReturn) {
return iReturn;
}
m_cLockIdleTasks.Lock();
while (m_cIdleThreads->size() > 0) {
DestroyThread (m_cIdleThreads->begin());
m_cIdleThreads->pop_front();
}
m_cLockIdleTasks.Unlock();
Kill();
WELS_DELETE_OP(m_cWaitedTasks);
WELS_DELETE_OP(m_cIdleThreads);
WELS_DELETE_OP(m_cBusyThreads);
return iReturn;
}
void CWelsThreadPool::ExecuteTask() {
//fprintf(stdout, "ThreadPool: scheduled tasks: ExecuteTask\n");
CWelsTaskThread* pThread = NULL;
IWelsTask* pTask = NULL;
while (GetWaitedTaskNum() > 0) {
pThread = GetIdleThread();
if (pThread == NULL) {
break;
}
pTask = GetWaitedTask();
//fprintf(stdout, "ThreadPool: ExecuteTask = %x at thread %x\n", pTask, pThread);
pThread->SetTask (pTask);
}
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::QueueTask (IWelsTask* pTask) {
CWelsAutoLock cLock (m_cLockPool);
//fprintf(stdout, "CWelsThreadPool::QueueTask: %d, pTask=%x\n", m_iRefCount, pTask);
if (GetWaitedTaskNum() == 0) {
CWelsTaskThread* pThread = GetIdleThread();
if (pThread != NULL) {
//fprintf(stdout, "ThreadPool: ExecuteTask = %x at thread %x\n", pTask, pThread);
pThread->SetTask (pTask);
return WELS_THREAD_ERROR_OK;
}
}
//fprintf(stdout, "ThreadPool: AddTaskToWaitedList: %x\n", pTask);
AddTaskToWaitedList (pTask);
//fprintf(stdout, "ThreadPool: SignalThread: %x\n", pTask);
SignalThread();
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::CreateIdleThread() {
CWelsTaskThread* pThread = new CWelsTaskThread (this);
if (NULL == pThread) {
return WELS_THREAD_ERROR_GENERAL;
}
if (WELS_THREAD_ERROR_OK != pThread->Start()) {
return WELS_THREAD_ERROR_GENERAL;
}
//fprintf(stdout, "ThreadPool: AddThreadToIdleQueue: %x\n", pThread);
AddThreadToIdleQueue (pThread);
return WELS_THREAD_ERROR_OK;
}
void CWelsThreadPool::DestroyThread (CWelsTaskThread* pThread) {
pThread->Kill();
WELS_DELETE_OP(pThread);
return;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::AddThreadToIdleQueue (CWelsTaskThread* pThread) {
CWelsAutoLock cLock (m_cLockIdleTasks);
m_cIdleThreads->push_back (pThread);
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::AddThreadToBusyList (CWelsTaskThread* pThread) {
CWelsAutoLock cLock (m_cLockBusyTasks);
m_cBusyThreads->push_back (pThread);
return WELS_THREAD_ERROR_OK;
}
WELS_THREAD_ERROR_CODE CWelsThreadPool::RemoveThreadFromBusyList (CWelsTaskThread* pThread) {
CWelsAutoLock cLock (m_cLockBusyTasks);
if (m_cBusyThreads->erase (pThread)) {
return WELS_THREAD_ERROR_OK;
} else {
return WELS_THREAD_ERROR_GENERAL;
}
}
void CWelsThreadPool::AddTaskToWaitedList (IWelsTask* pTask) {
CWelsAutoLock cLock (m_cLockWaitedTasks);
m_cWaitedTasks->push_back (pTask);
//fprintf(stdout, "CWelsThreadPool::AddTaskToWaitedList=%d, pTask=%x\n", m_cWaitedTasks->size(), pTask);
return;
}
CWelsTaskThread* CWelsThreadPool::GetIdleThread() {
CWelsAutoLock cLock (m_cLockIdleTasks);
//fprintf(stdout, "CWelsThreadPool::GetIdleThread=%d\n", m_cIdleThreads->size());
if (m_cIdleThreads->size() == 0) {
return NULL;
}
CWelsTaskThread* pThread = m_cIdleThreads->begin();
m_cIdleThreads->pop_front();
return pThread;
}
int32_t CWelsThreadPool::GetBusyThreadNum() {
return m_cBusyThreads->size();
}
int32_t CWelsThreadPool::GetIdleThreadNum() {
return m_cIdleThreads->size();
}
int32_t CWelsThreadPool::GetWaitedTaskNum() {
//fprintf(stdout, "CWelsThreadPool::m_cWaitedTasks=%d\n", m_cWaitedTasks->size());
return m_cWaitedTasks->size();
}
IWelsTask* CWelsThreadPool::GetWaitedTask() {
CWelsAutoLock lock (m_cLockWaitedTasks);
if (m_cWaitedTasks->size() == 0) {
return NULL;
}
IWelsTask* pTask = m_cWaitedTasks->begin();
m_cWaitedTasks->pop_front();
return pTask;
}
void CWelsThreadPool::ClearWaitedTasks() {
CWelsAutoLock cLock (m_cLockWaitedTasks);
IWelsTask* pTask = NULL;
while (0 != m_cWaitedTasks->size()) {
pTask = m_cWaitedTasks->begin();
if (pTask->GetSink()) {
pTask->GetSink()->OnTaskCancelled();
}
m_cWaitedTasks->pop_front();
}
}
}
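Note on AddReference() and RemoveInstance(): the pool is a function-local static that is initialized when the reference count goes from 0 to 1 and torn down when it returns to 0. A minimal sketch of that lifetime rule, assuming `std::mutex` in place of CWelsLock and a boolean in place of the real Init()/Uninit() work. The class `CSketchPool` is hypothetical.

```cpp
#include <cassert>
#include <mutex>

class CSketchPool {
 public:
  static CSketchPool& AddReference() {
    std::lock_guard<std::mutex> cLock (InitLock());
    static CSketchPool cSelf;
    if (RefCount() == 0) {
      cSelf.m_bReady = true;   // stands in for Init()
    }
    ++RefCount();
    return cSelf;
  }
  void RemoveInstance() {
    std::lock_guard<std::mutex> cLock (InitLock());
    assert (RefCount() > 0);
    if (--RefCount() == 0) {
      m_bReady = false;        // stands in for StopAllRunning() and Uninit()
    }
  }
  static bool IsReferenced() {
    std::lock_guard<std::mutex> cLock (InitLock());
    return RefCount() > 0;
  }
  bool IsReady() const { return m_bReady; }
 private:
  CSketchPool() : m_bReady (false) {}
  CSketchPool (const CSketchPool&);             // not copyable
  CSketchPool& operator= (const CSketchPool&);  // not assignable
  // Function-local statics sidestep static initialization order in this sketch.
  static std::mutex& InitLock() { static std::mutex cLock; return cLock; }
  static int& RefCount() { static int iCount = 0; return iCount; }
  bool m_bReady;
};
```

Every AddReference() call has to be paired with exactly one RemoveInstance() call; the sketch asserts on an unbalanced release, whereas the original simply decrements the counter.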

View File

@ -165,12 +165,12 @@ const EVclType g_keTypeMap[32][2] = {
{ NON_VCL, NON_VCL }, // 13: NAL_UNIT_SPS_EXT
{ NON_VCL, NON_VCL }, // 14: NAL_UNIT_PREFIX, must be associated with the succeeding NAL to form a VCL
{ NON_VCL, NON_VCL }, // 15: NAL_UNIT_SUBSET_SPS
{ NON_VCL, NON_VCL }, // 16: NAL_UNIT_RESV_16
{ NON_VCL, NON_VCL }, // 16: NAL_UNIT_DEPTH_PARAM
{ NON_VCL, NON_VCL }, // 17: NAL_UNIT_RESV_17
{ NON_VCL, NON_VCL }, // 18: NAL_UNIT_RESV_18
{ NON_VCL, NON_VCL }, // 19: NAL_UNIT_AUX_CODED_SLICE
{ NON_VCL, VCL }, // 20: NAL_UNIT_CODED_SLICE_EXT
{ NON_VCL, NON_VCL }, // 21: NAL_UNIT_RESV_21
{ NON_VCL, NON_VCL }, // 21: NAL_UNIT_MVC_SLICE_EXT
{ NON_VCL, NON_VCL }, // 22: NAL_UNIT_RESV_22
{ NON_VCL, NON_VCL }, // 23: NAL_UNIT_RESV_23
{ NON_VCL, NON_VCL }, // 24: NAL_UNIT_UNSPEC_24

View File

@ -70,6 +70,9 @@ void* WelsMalloc (const uint32_t kuiSize, const char* kpTag, const uint32_t kiAl
const uint32_t kiPayloadSize = kuiSize;
uint8_t* pBuf = (uint8_t*) malloc (kiActualRequestedSize);
if (NULL == pBuf)
return NULL;
#ifdef MEMORY_CHECK
if (fpMemChkPoint == NULL) {
fpMemChkPoint = fopen ("./enc_mem_check_point.txt", "at+");
@ -87,10 +90,6 @@ void* WelsMalloc (const uint32_t kuiSize, const char* kpTag, const uint32_t kiAl
}
#endif
uint8_t* pAlignedBuffer;
if (NULL == pBuf)
return NULL;
pAlignedBuffer = pBuf + kiAlignedBytes + kiSizeOfVoidPointer + kiSizeOfInt;
pAlignedBuffer -= ((uintptr_t) pAlignedBuffer & kiAlignedBytes);
* ((void**) (pAlignedBuffer - kiSizeOfVoidPointer)) = pBuf;
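Note on the two hunks above: the NULL check now runs directly after malloc, before the MEMORY_CHECK block touches the result, and the pointer arithmetic that follows rounds the start address down to an aligned one and stores the original malloc pointer just in front of it. A minimal sketch of that technique, assuming `kiAlignedBytes` is the alignment minus one (its definition is not part of this hunk) and leaving out the size bookkeeping (`kiSizeOfInt`) and the logging. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns kuiSize bytes aligned to kuiAlignment
// (a power of two), or NULL if the allocation fails.
inline void* SketchAlignedMalloc (size_t kuiSize, size_t kuiAlignment) {
  assert (kuiAlignment != 0 && (kuiAlignment & (kuiAlignment - 1)) == 0);
  const size_t kuiMask = kuiAlignment - 1;  // assumed equivalent of kiAlignedBytes
  uint8_t* pBuf = static_cast<uint8_t*> (std::malloc (kuiSize + kuiMask + sizeof (void*)));
  if (NULL == pBuf)                         // checked before pBuf is used
    return NULL;
  uint8_t* pAligned = pBuf + kuiMask + sizeof (void*);
  pAligned -= reinterpret_cast<uintptr_t> (pAligned) & kuiMask;   // round down
  std::memcpy (pAligned - sizeof (void*), &pBuf, sizeof (pBuf));  // remember the malloc pointer
  return pAligned;
}

// Recovers the malloc pointer stored in front of the aligned block and frees it.
inline void SketchAlignedFree (void* pPtr) {
  if (NULL == pPtr) return;
  uint8_t* pBuf = NULL;
  std::memcpy (&pBuf, static_cast<uint8_t*> (pPtr) - sizeof (void*), sizeof (pBuf));
  std::free (pBuf);
}
```

The extra `kuiMask + sizeof (void*)` bytes guarantee that the rounded-down address still leaves room for the stored pointer and for the full payload.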

View File

@ -12,12 +12,16 @@ COMMON_CPP_SRCS=\
$(COMMON_SRCDIR)/src/sad_common.cpp\
$(COMMON_SRCDIR)/src/utils.cpp\
$(COMMON_SRCDIR)/src/welsCodecTrace.cpp\
$(COMMON_SRCDIR)/src/WelsTaskThread.cpp\
$(COMMON_SRCDIR)/src/WelsThread.cpp\
$(COMMON_SRCDIR)/src/WelsThreadLib.cpp\
$(COMMON_SRCDIR)/src/WelsThreadPool.cpp\
COMMON_OBJS += $(COMMON_CPP_SRCS:.cpp=.$(OBJ))
COMMON_ASM_SRCS=\
$(COMMON_SRCDIR)/x86/cpuid.asm\
$(COMMON_SRCDIR)/x86/dct.asm\
$(COMMON_SRCDIR)/x86/deblock.asm\
$(COMMON_SRCDIR)/x86/expand_picture.asm\
$(COMMON_SRCDIR)/x86/intra_pred_com.asm\

View File

@ -79,6 +79,19 @@ BITS 64
%define arg11 [rsp + push_num*8 + 88]
%define arg12 [rsp + push_num*8 + 96]
%define arg1d ecx
%define arg2d edx
%define arg3d r8d
%define arg4d r9d
%define arg5d arg5
%define arg6d arg6
%define arg7d arg7
%define arg8d arg8
%define arg9d arg9
%define arg10d arg10
%define arg11d arg11
%define arg12d arg12
%define r0 rcx
%define r1 rdx
%define r2 r8
@ -100,6 +113,7 @@ BITS 64
%define r1w dx
%define r2w r8w
%define r3w r9w
%define r4w ax
%define r6w r11w
%define r0b cl
@ -135,6 +149,19 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
%define arg11 [rsp + push_num*8 + 40]
%define arg12 [rsp + push_num*8 + 48]
%define arg1d edi
%define arg2d esi
%define arg3d edx
%define arg4d ecx
%define arg5d r8d
%define arg6d r9d
%define arg7d arg7
%define arg8d arg8
%define arg9d arg9
%define arg10d arg10
%define arg11d arg11
%define arg12d arg12
%define r0 rdi
%define r1 rsi
%define r2 rdx
@ -156,6 +183,7 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
%define r1w si
%define r2w dx
%define r3w cx
%define r4w r8w
%define r6w r10w
%define r0b dil
@ -189,6 +217,19 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
%define arg11 [esp + push_num*4 + 44]
%define arg12 [esp + push_num*4 + 48]
%define arg1d arg1
%define arg2d arg2
%define arg3d arg3
%define arg4d arg4
%define arg5d arg5
%define arg6d arg6
%define arg7d arg7
%define arg8d arg8
%define arg9d arg9
%define arg10d arg10
%define arg11d arg11
%define arg12d arg12
%define r0 eax
%define r1 ecx
%define r2 edx
@ -210,6 +251,7 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
%define r1w cx
%define r2w dx
%define r3w bx
%define r4w si
%define r6w bp
%define r0b al
@ -436,8 +478,14 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
%endif
%endmacro
%macro ZERO_EXTENSION 1
%ifndef X86_32
mov dword %1, %1
%endif
%endmacro
%macro WELS_EXTERN 1
ALIGN 16
ALIGN 16, nop
%ifdef PREFIX
global _%1
%define %1 _%1
@ -605,8 +653,18 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits ; Mark the stack as non-
packuswb %1,%1
%endmacro
%macro WELS_DW1_VEX 1
vpcmpeqw %1, %1, %1
vpsrlw %1, %1, 15
%endmacro
%macro WELS_DW32_VEX 1
vpcmpeqw %1, %1, %1
vpsrlw %1, %1, 15
vpsllw %1, %1, 5
%endmacro
%macro WELS_DW32767_VEX 1
vpcmpeqw %1, %1, %1
vpsrlw %1, %1, 1
%endmacro
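Note on the three VEX macros above: each builds a per-word constant without a memory load by comparing a register with itself (all bits set) and then shifting. A minimal sketch of the arithmetic for one 16-bit lane; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// One 16-bit lane after vpcmpeqw x, x, x: every bit is set.
inline uint16_t SketchAllOnes() {
  return 0xFFFFu;
}
// WELS_DW1_VEX: logical shift right by 15 leaves only the lowest bit.
inline uint16_t SketchDw1() {
  return static_cast<uint16_t> (SketchAllOnes() >> 15);
}
// WELS_DW32_VEX: the value 1 from above shifted left by 5.
inline uint16_t SketchDw32() {
  return static_cast<uint16_t> (SketchDw1() << 5);
}
// WELS_DW32767_VEX: logical shift right by 1 clears only the top bit.
inline uint16_t SketchDw32767() {
  return static_cast<uint16_t> (SketchAllOnes() >> 1);
}

// Self-check of the three lane values.
inline void SketchCheckConstants() {
  assert (SketchDw1() == 1);
  assert (SketchDw32() == 32);
  assert (SketchDw32767() == 32767);
}
```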

1016
codec/common/x86/dct.asm Normal file

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -68,6 +68,8 @@ WELS_EXTERN WelsCopy16x16_sse2
%assign push_num 2
LOAD_4_PARA
PUSH_XMM 8
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
lea r4, [r1+2*r1] ;ebx, [eax+2*eax] ; x3
lea r5, [r3+2*r3] ;edx, [ecx+2*ecx] ; x3
@ -132,6 +134,8 @@ WELS_EXTERN WelsCopy16x16NotAligned_sse2
%assign push_num 2
LOAD_4_PARA
PUSH_XMM 8
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
lea r4, [r1+2*r1] ;ebx, [eax+2*eax] ; x3
lea r5, [r3+2*r3] ;edx, [ecx+2*ecx] ; x3
@ -196,6 +200,8 @@ WELS_EXTERN WelsCopy16x8NotAligned_sse2
%assign push_num 2
LOAD_4_PARA
PUSH_XMM 8
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
lea r4, [r1+2*r1] ;ebx, [eax+2*eax] ; x3
lea r5, [r3+2*r3] ;edx, [ecx+2*ecx] ; x3
@ -235,6 +241,8 @@ WELS_EXTERN WelsCopy16x8NotAligned_sse2
WELS_EXTERN WelsCopy8x16_mmx
%assign push_num 0
LOAD_4_PARA
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
movq mm0, [r2]
movq mm1, [r2+r3]
@ -300,6 +308,8 @@ WELS_EXTERN WelsCopy8x8_mmx
push r4
%assign push_num 1
LOAD_4_PARA
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
lea r4, [r3+2*r3] ;edx, [ebx+2*ebx]
; to prefetch next loop
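Note on the SIGN_EXTENSION lines added in these hunks: r1 and r3 are used in 64-bit address arithmetic (`lea r4, [r1+2*r1]`), so a 32-bit argument has to be widened first. A minimal sketch of the difference between sign and zero extension, assuming r1 and r3 carry 32-bit signed stride arguments; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Widening that preserves the sign: a negative stride stays negative.
inline int64_t SketchSignExtend (int32_t iStride) {
  return static_cast<int64_t> (iStride);
}

// Widening that fills the upper half with zeros: a negative stride
// turns into a large positive offset.
inline int64_t SketchZeroExtend (int32_t iStride) {
  return static_cast<int64_t> (static_cast<uint32_t> (iStride));
}

// Self-check: both widenings agree for non-negative strides only.
inline void SketchCheckExtension() {
  assert (SketchSignExtend (16) == SketchZeroExtend (16));
  assert (SketchSignExtend (-16) != SketchZeroExtend (-16));
}
```

For a stride of -16 the sign-extended value is -16, while the zero-extended value is 4294967280, which would address memory far outside the picture buffer.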

View File

@ -1498,6 +1498,240 @@ loop_get_satd_16x16_right:
;
;***********************************************************************
;***********************************************************************
;
;Pixel_satd_wxh_avx2 BEGIN
;
;***********************************************************************
%ifdef HAVE_AVX2
; out=%1 pSrcA=%2 pSrcB=%3 HSumSubDB1_256=%4 ymm_clobber=%5
%macro AVX2_LoadDiffSatd16x1 5
vbroadcasti128 %1, [%2]
vpmaddubsw %1, %1, %4 ; hadamard neighboring horizontal sums and differences
vbroadcasti128 %5, [%3]
vpmaddubsw %5, %5, %4 ; hadamard neighboring horizontal sums and differences
vpsubw %1, %1, %5 ; diff srcA srcB
%endmacro
; out=%1 pSrcA=%2 pSrcA+4*iStride=%3 pSrcB=%4 pSrcB+4*iStride=%5 HSumSubDB1_128x2=%6 ymm_clobber=%7,%8
%macro AVX2_LoadDiffSatd8x2 8
vpbroadcastq %1, [%2]
vpbroadcastq %7, [%3]
vpblendd %1, %1, %7, 11110000b
vpmaddubsw %1, %1, %6 ; hadamard neighboring horizontal sums and differences
vpbroadcastq %7, [%4]
vpbroadcastq %8, [%5]
vpblendd %7, %7, %8, 11110000b
vpmaddubsw %7, %7, %6 ; hadamard neighboring horizontal sums and differences
vpsubw %1, %1, %7 ; diff srcA srcB
%endmacro
; in/out=%1,%2,%3,%4 clobber=%5
%macro AVX2_HDMFour4x4 5
vpsubw %5, %1, %4 ; s3 = x0 - x3
vpaddw %1, %1, %4 ; s0 = x0 + x3
vpsubw %4, %2, %3 ; s2 = x1 - x2
vpaddw %2, %2, %3 ; s1 = x1 + x2
vpsubw %3, %1, %2 ; y2 = s0 - s1
vpaddw %1, %1, %2 ; y0 = s0 + s1
vpaddw %2, %5, %4 ; y1 = s3 + s2
vpsubw %4, %5, %4 ; y3 = s3 - s2
%endmacro
; out=%1 in=%1,%2,%3,%4 clobber=%5
%macro AVX2_SatdFour4x4 5
AVX2_HDMFour4x4 %1, %2, %3, %4, %5
vpabsw %1, %1
vpabsw %2, %2
vpabsw %3, %3
vpabsw %4, %4
; second stage of horizontal hadamard.
; utilizes that |a + b| + |a - b| = 2 * max(|a|, |b|)
vpblendw %5, %1, %2, 10101010b
vpslld %2, %2, 16
vpsrld %1, %1, 16
vpor %2, %2, %1
vpmaxuw %2, %2, %5
vpblendw %5, %3, %4, 10101010b
vpslld %4, %4, 16
vpsrld %3, %3, 16
vpor %4, %4, %3
vpmaxuw %3, %5, %4
vpaddw %1, %2, %3
%endmacro
; out=%1 pSrcA=%2 iStrideA=%3 3*iStrideA=%4 pSrcB=%5 iStrideB=%6 3*iStrideB=%7 HSumSubDB1_256=%8 ymm_clobber=%9,%10,%11,%12
%macro AVX2_GetSatd16x4 12
AVX2_LoadDiffSatd16x1 %1, %2 + 0 * %3, %5 + 0 * %6, %8, %12
AVX2_LoadDiffSatd16x1 %9, %2 + 1 * %3, %5 + 1 * %6, %8, %12
AVX2_LoadDiffSatd16x1 %10, %2 + 2 * %3, %5 + 2 * %6, %8, %12
AVX2_LoadDiffSatd16x1 %11, %2 + 1 * %4, %5 + 1 * %7, %8, %12
AVX2_SatdFour4x4 %1, %9, %10, %11, %12
%endmacro
; out=%1 pSrcA=%2 iStrideA=%3 3*iStrideA=%4 pSrcB=%5 iStrideB=%6 3*iStrideB=%7 HSumSubDB1_128x2=%8 ymm_clobber=%9,%10,%11,%12,%13
%macro AVX2_GetSatd8x8 13
AVX2_LoadDiffSatd8x2 %1, %2 + 0 * %3, %2 + 4 * %3, %5 + 0 * %6, %5 + 4 * %6, %8, %12, %13
AVX2_LoadDiffSatd8x2 %10, %2 + 2 * %3, %2 + 2 * %4, %5 + 2 * %6, %5 + 2 * %7, %8, %12, %13
add %2, %3
add %5, %6
AVX2_LoadDiffSatd8x2 %9, %2 + 0 * %3, %2 + 4 * %3, %5 + 0 * %6, %5 + 4 * %6, %8, %12, %13
AVX2_LoadDiffSatd8x2 %11, %2 + 2 * %3, %2 + 2 * %4, %5 + 2 * %6, %5 + 2 * %7, %8, %12, %13
AVX2_SatdFour4x4 %1, %9, %10, %11, %12
%endmacro
; d_out=%1 mm_in=%2 mm_clobber=%3
%macro AVX2_SumWHorizon 3
WELS_DW1_VEX y%3
vpmaddwd y%2, y%2, y%3
vextracti128 x%3, y%2, 1
vpaddd x%2, x%2, x%3
vpunpckhqdq x%3, x%2, x%2
vpaddd x%2, x%2, x%3
vpsrldq x%3, x%2, 4
vpaddd x%2, x%2, x%3
vmovd %1, x%2
%endmacro
;***********************************************************************
;
;int32_t WelsSampleSatd8x16_avx2( uint8_t *, int32_t, uint8_t *, int32_t, );
;
;***********************************************************************
WELS_EXTERN WelsSampleSatd8x16_avx2
%assign push_num 0
%ifdef X86_32
push r4
%assign push_num 1
%endif
mov r4, 2 ; loop cnt
jmp WelsSampleSatd8x8N_avx2
;***********************************************************************
;
;int32_t WelsSampleSatd8x8_avx2( uint8_t *, int32_t, uint8_t *, int32_t, );
;
;***********************************************************************
WELS_EXTERN WelsSampleSatd8x8_avx2
%assign push_num 0
%ifdef X86_32
push r4
%assign push_num 1
%endif
mov r4, 1 ; loop cnt
; fall through
WelsSampleSatd8x8N_avx2:
%ifdef X86_32
push r5
push r6
%assign push_num push_num+2
%endif
LOAD_4_PARA
PUSH_XMM 8
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
vbroadcasti128 ymm7, [HSumSubDB1]
lea r5, [3 * r1]
lea r6, [3 * r3]
vpxor ymm6, ymm6, ymm6
.loop:
AVX2_GetSatd8x8 ymm0, r0, r1, r5, r2, r3, r6, ymm7, ymm1, ymm2, ymm3, ymm4, ymm5
vpaddw ymm6, ymm6, ymm0
sub r4, 1
jbe .loop_end
add r0, r5
add r2, r6
lea r0, [r0 + 4 * r1]
lea r2, [r2 + 4 * r3]
jmp .loop
.loop_end:
AVX2_SumWHorizon retrd, mm6, mm5
vzeroupper
POP_XMM
LOAD_4_PARA_POP
%ifdef X86_32
pop r6
pop r5
pop r4
%endif
ret
;***********************************************************************
;
;int32_t WelsSampleSatd16x16_avx2( uint8_t *, int32_t, uint8_t *, int32_t, );
;
;***********************************************************************
WELS_EXTERN WelsSampleSatd16x16_avx2
%assign push_num 0
%ifdef X86_32
push r4
%assign push_num 1
%endif
mov r4, 4 ; loop cnt
jmp WelsSampleSatd16x4N_avx2
;***********************************************************************
;
;int32_t WelsSampleSatd16x8_avx2( uint8_t *, int32_t, uint8_t *, int32_t, );
;
;***********************************************************************
WELS_EXTERN WelsSampleSatd16x8_avx2
%assign push_num 0
%ifdef X86_32
push r4
%assign push_num 1
%endif
mov r4, 2 ; loop cnt
; fall through
WelsSampleSatd16x4N_avx2:
%ifdef X86_32
push r5
push r6
%assign push_num push_num+2
%endif
LOAD_4_PARA
PUSH_XMM 7
SIGN_EXTENSION r1, r1d
SIGN_EXTENSION r3, r3d
vpbroadcastq xmm0, [HSumSubDB1]
vpbroadcastq ymm6, [HSumSubDB1 + 8]
vpblendd ymm6, ymm0, ymm6, 11110000b
lea r5, [3 * r1]
lea r6, [3 * r3]
vpxor ymm5, ymm5, ymm5
.loop:
AVX2_GetSatd16x4 ymm0, r0, r1, r5, r2, r3, r6, ymm6, ymm1, ymm2, ymm3, ymm4
vpaddw ymm5, ymm5, ymm0
lea r0, [r0 + 4 * r1]
lea r2, [r2 + 4 * r3]
sub r4, 1
ja .loop
AVX2_SumWHorizon retrd, mm5, mm0
vzeroupper
POP_XMM
LOAD_4_PARA_POP
%ifdef X86_32
pop r6
pop r5
pop r4
%endif
ret
%endif
;***********************************************************************
;
;Pixel_satd_wxh_avx2 END
;
;***********************************************************************
;***********************************************************************
;
;Pixel_sad_wxh_sse2 BEGIN
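
The AVX2 SATD macros above rest on two small pieces of arithmetic: the 4-point butterfly annotated in `AVX2_HDMFour4x4`, and the identity `|a + b| + |a - b| = 2 * max(|a|, |b|)` quoted in `AVX2_SatdFour4x4`. A scalar C++ sketch of both follows. The function names are hypothetical, and this models only the arithmetic in the comments, not the register layout or the exact scaling of the assembly:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Scalar model of the butterfly annotated in AVX2_HDMFour4x4 (in place).
// Naming follows the assembly comments: s* are stage-1 values, outputs are y0..y3.
inline void Hadamard4 (int32_t x[4]) {
  const int32_t s0 = x[0] + x[3];
  const int32_t s3 = x[0] - x[3];
  const int32_t s1 = x[1] + x[2];
  const int32_t s2 = x[1] - x[2];
  x[0] = s0 + s1; // y0
  x[1] = s3 + s2; // y1
  x[2] = s0 - s1; // y2
  x[3] = s3 - s2; // y3
}

// Direct form of the last butterfly stage followed by absolute values.
inline int32_t AbsSumPlusAbsDiff (int32_t a, int32_t b) {
  return std::abs (a + b) + std::abs (a - b);
}

// Shortcut form named in the assembly comment.
inline int32_t TwiceMaxAbs (int32_t a, int32_t b) {
  return 2 * std::max (std::abs (a), std::abs (b));
}
```

The shortcut replaces an add, a subtract and two absolute values with a single maximum over already-computed absolute values.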


@ -102,7 +102,6 @@ void H264DecodeInstance (ISVCDecoder* pDecoder, const char* kpH264FileName, cons
int32_t iLastWidth = 0, iLastHeight = 0;
int32_t iFrameCount = 0;
int32_t iEndOfStreamFlag = 0;
int32_t iColorFormat = videoFormatInternal;
//for coverage test purpose
int32_t iErrorConMethod = (int32_t) ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE;
pDecoder->SetOption (DECODER_OPTION_ERROR_CON_IDC, &iErrorConMethod);
@ -168,11 +167,6 @@ void H264DecodeInstance (ISVCDecoder* pDecoder, const char* kpH264FileName, cons
memcpy (pBuf + iFileSize, &uiStartCode[0], 4); //confirmed_safe_unsafe_usage
if (pDecoder->SetOption (DECODER_OPTION_DATAFORMAT, &iColorFormat)) {
fprintf (stderr, "SetOption() failed, opt_id : %d ..\n", DECODER_OPTION_DATAFORMAT);
goto label_exit;
}
while (true) {
if (iBufPos >= iFileSize) {
@ -201,8 +195,6 @@ void H264DecodeInstance (ISVCDecoder* pDecoder, const char* kpH264FileName, cons
}
//for coverage test purpose
int32_t iOutputColorFormat;
pDecoder->GetOption (DECODER_OPTION_DATAFORMAT, &iOutputColorFormat);
int32_t iEndOfStreamFlag;
pDecoder->GetOption (DECODER_OPTION_END_OF_STREAM, &iEndOfStreamFlag);
int32_t iCurIdrPicId;
@ -376,8 +368,6 @@ int32_t main (int32_t iArgC, char* pArgV[]) {
strncpy (sDecParam.pFileNameRestructed, strReconFile.c_str(), iLen); //confirmed_safe_unsafe_usage
} else if (strTag[0].compare ("TargetDQID") == 0) {
sDecParam.uiTargetDqLayer = (uint8_t)atol (strTag[1].c_str());
} else if (strTag[0].compare ("OutColorFormat") == 0) {
sDecParam.eOutputColorFormat = (EVideoFormatType) atoi (strTag[1].c_str());
} else if (strTag[0].compare ("ErrorConcealmentIdc") == 0) {
sDecParam.eEcActiveIdc = (ERROR_CON_IDC)atol (strTag[1].c_str());
} else if (strTag[0].compare ("CPULoad") == 0) {
@ -394,7 +384,6 @@ int32_t main (int32_t iArgC, char* pArgV[]) {
} else if (strstr (pArgV[1],
".264")) { // no output dump yuv file, just try to render the decoded pictures //confirmed_safe_unsafe_usage
strInputFile = pArgV[1];
sDecParam.eOutputColorFormat = videoFormatI420;
sDecParam.uiTargetDqLayer = (uint8_t) - 1;
sDecParam.eEcActiveIdc = ERROR_CON_SLICE_COPY;
sDecParam.sVideoProperty.eVideoBsType = VIDEO_BITSTREAM_DEFAULT;
@ -402,7 +391,6 @@ int32_t main (int32_t iArgC, char* pArgV[]) {
} else { //iArgC > 2
strInputFile = pArgV[1];
strOutputFile = pArgV[2];
sDecParam.eOutputColorFormat = videoFormatI420;
sDecParam.uiTargetDqLayer = (uint8_t) - 1;
sDecParam.eEcActiveIdc = ERROR_CON_SLICE_COPY;
sDecParam.sVideoProperty.eVideoBsType = VIDEO_BITSTREAM_DEFAULT;


@ -88,6 +88,10 @@ int g_iEncodedFrame = 0;
#endif
#endif /* _WIN32 */
#if defined(__linux__) || defined(__unix__)
#define _FILE_OFFSET_BITS 64
#endif
#include <iostream>
using namespace std;
using namespace WelsEnc;
@ -97,7 +101,7 @@ using namespace WelsEnc;
*/
typedef struct LayerpEncCtx_s {
int32_t iDLayerQp;
-SSliceConfig sSliceCfg;
+SSliceArgument sSliceArgument;
} SLayerPEncCtx;
typedef struct tagFilesSet {
@ -183,26 +187,26 @@ int ParseLayerConfig (CReadConfig& cRdLayerCfg, const int iLayer, SEncParamExt&
} else if (strTag[0].compare ("InitialQP") == 0) {
sLayerCtx.iDLayerQp = atoi (strTag[1].c_str());
} else if (strTag[0].compare ("SliceMode") == 0) {
-sLayerCtx.sSliceCfg.uiSliceMode = (SliceModeEnum)atoi (strTag[1].c_str());
-} else if (strTag[0].compare ("SliceSize") == 0) { //SM_DYN_SLICE
-sLayerCtx.sSliceCfg.sSliceArgument.uiSliceSizeConstraint = atoi (strTag[1].c_str());
+sLayerCtx.sSliceArgument.uiSliceMode = (SliceModeEnum)atoi (strTag[1].c_str());
+} else if (strTag[0].compare ("SliceSize") == 0) { //SM_SIZELIMITED_SLICE
+sLayerCtx.sSliceArgument.uiSliceSizeConstraint = atoi (strTag[1].c_str());
continue;
} else if (strTag[0].compare ("SliceNum") == 0) {
-sLayerCtx.sSliceCfg.sSliceArgument.uiSliceNum = atoi (strTag[1].c_str());
+sLayerCtx.sSliceArgument.uiSliceNum = atoi (strTag[1].c_str());
} else if (strTag[0].compare (0, kiSize, str_) == 0) {
const char* kpString = strTag[0].c_str();
int uiSliceIdx = atoi (&kpString[kiSize]);
assert (uiSliceIdx < MAX_SLICES_NUM);
-sLayerCtx.sSliceCfg.sSliceArgument.uiSliceMbNum[uiSliceIdx] = atoi (strTag[1].c_str());
+sLayerCtx.sSliceArgument.uiSliceMbNum[uiSliceIdx] = atoi (strTag[1].c_str());
}
}
}
pDLayer->iDLayerQp = sLayerCtx.iDLayerQp;
-pDLayer->sSliceCfg.uiSliceMode = sLayerCtx.sSliceCfg.uiSliceMode;
+pDLayer->sSliceArgument.uiSliceMode = sLayerCtx.sSliceArgument.uiSliceMode;
-memcpy (&pDLayer->sSliceCfg, &sLayerCtx.sSliceCfg, sizeof (SSliceConfig)); // confirmed_safe_unsafe_usage
-memcpy (&pDLayer->sSliceCfg.sSliceArgument.uiSliceMbNum[0], &sLayerCtx.sSliceCfg.sSliceArgument.uiSliceMbNum[0],
-sizeof (sLayerCtx.sSliceCfg.sSliceArgument.uiSliceMbNum)); // confirmed_safe_unsafe_usage
+memcpy (&pDLayer->sSliceArgument, &sLayerCtx.sSliceArgument, sizeof (SSliceArgument)); // confirmed_safe_unsafe_usage
+memcpy (&pDLayer->sSliceArgument.uiSliceMbNum[0], &sLayerCtx.sSliceArgument.uiSliceMbNum[0],
+sizeof (sLayerCtx.sSliceArgument.uiSliceMbNum)); // confirmed_safe_unsafe_usage
return 0;
}
@ -219,7 +223,9 @@ int ParseConfig (CReadConfig& cRdCfg, SSourcePicture* pSrcPic, SEncParamExt& pSv
if (strTag[0].compare ("UsageType") == 0) {
pSvcParam.iUsageType = (EUsageType)atoi (strTag[1].c_str());
-} else if (strTag[0].compare ("SourceWidth") == 0) {
+}else if (strTag[0].compare ("SimulcastAVC") == 0) {
+pSvcParam.bSimulcastAVC = atoi (strTag[1].c_str()) ? true : false;
+}else if (strTag[0].compare ("SourceWidth") == 0) {
pSrcPic->iPicWidth = atoi (strTag[1].c_str());
} else if (strTag[0].compare ("SourceHeight") == 0) {
pSrcPic->iPicHeight = atoi (strTag[1].c_str());
@ -241,24 +247,24 @@ int ParseConfig (CReadConfig& cRdCfg, SSourcePicture* pSrcPic, SEncParamExt& pSv
} else if (strTag[0].compare ("SpsPpsIDStrategy") == 0) {
int32_t iValue = atoi (strTag[1].c_str());
switch (iValue) {
-case 0:
-pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
-break;
-case 0x01:
-pSvcParam.eSpsPpsIdStrategy = INCREASING_ID;
-break;
-case 0x02:
-pSvcParam.eSpsPpsIdStrategy = SPS_LISTING;
-break;
-case 0x03:
-pSvcParam.eSpsPpsIdStrategy = SPS_LISTING_AND_PPS_INCREASING;
-break;
-case 0x06:
-pSvcParam.eSpsPpsIdStrategy = SPS_PPS_LISTING;
-break;
-default:
-pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
-break;
+case 0:
+pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
+break;
+case 0x01:
+pSvcParam.eSpsPpsIdStrategy = INCREASING_ID;
+break;
+case 0x02:
+pSvcParam.eSpsPpsIdStrategy = SPS_LISTING;
+break;
+case 0x03:
+pSvcParam.eSpsPpsIdStrategy = SPS_LISTING_AND_PPS_INCREASING;
+break;
+case 0x06:
+pSvcParam.eSpsPpsIdStrategy = SPS_PPS_LISTING;
+break;
+default:
+pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
+break;
}
} else if (strTag[0].compare ("EnableScalableSEI") == 0) {
pSvcParam.bEnableSSEI = atoi (strTag[1].c_str()) ? true : false;
@ -292,6 +298,8 @@ int ParseConfig (CReadConfig& cRdCfg, SSourcePicture* pSrcPic, SEncParamExt& pSv
pSvcParam.iMultipleThreadIdc = 0;
else if (pSvcParam.iMultipleThreadIdc > MAX_THREADS_NUM)
pSvcParam.iMultipleThreadIdc = MAX_THREADS_NUM;
} else if (strTag[0].compare ("UseLoadBalancing") == 0) {
pSvcParam.bUseLoadBalancing = (atoi (strTag[1].c_str())) ? true : false;
} else if (strTag[0].compare ("RCMode") == 0) {
pSvcParam.iRCMode = (RC_MODES) atoi (strTag[1].c_str());
} else if (strTag[0].compare ("TargetBitrate") == 0) {
@ -306,6 +314,10 @@ int ParseConfig (CReadConfig& cRdCfg, SSourcePicture* pSrcPic, SEncParamExt& pSv
fprintf (stderr, "Invalid max overall bitrate setting due to RC enabled. Check MaxOverallBitrate field please!\n");
return 1;
}
} else if (strTag[0].compare ("MaxQp") == 0) {
pSvcParam.iMaxQp = atoi (strTag[1].c_str());
} else if (strTag[0].compare ("MinQp") == 0) {
pSvcParam.iMinQp = atoi (strTag[1].c_str());
} else if (strTag[0].compare ("EnableDenoise") == 0) {
pSvcParam.bEnableDenoise = atoi (strTag[1].c_str()) ? true : false;
} else if (strTag[0].compare ("EnableSceneChangeDetection") == 0) {
@ -377,11 +389,13 @@ void PrintHelp() {
printf (" -org Original file, example: -org src.yuv\n");
printf (" -sw the source width\n");
printf (" -sh the source height\n");
+printf (" -utype usage type\n");
+printf (" -savc simulcast avc\n");
printf (" -frms Number of total frames to be encoded\n");
printf (" -frin input frame rate\n");
printf (" -numtl Temporal layer number (default: 1)\n");
printf (" -iper Intra period (default: -1) : must be a power of 2 of GOP size (or -1)\n");
-printf (" -nalsize the Maximum NAL size. which should be larger than the each layer slicesize when slice mode equals to SM_DYN_SLICE\n");
+printf (" -nalsize the Maximum NAL size. which should be larger than the each layer slicesize when slice mode equals to SM_SIZELIMITED_SLICE\n");
printf (" -spsid Enable id adding in SPS/PPS per IDR \n");
printf (" -cabac Entropy coding mode(0:cavlc 1:cabac \n");
printf (" -denois Control denoising (default: 0)\n");
@ -391,12 +405,15 @@ void PrintHelp() {
printf (" -ltr Control long term reference (default: 0)\n");
printf (" -ltrnum Control the number of long term reference((1-4):screen LTR,(1-2):video LTR \n");
printf (" -threadIdc 0: auto(dynamic imp. internal encoder); 1: multiple threads imp. disabled; > 1: count number of threads \n");
printf (" -loadbalancing 0: turn off loadbalancing between slices when multi-threading available; 1: (default value) turn on loadbalancing between slices when multi-threading available\n");
printf (" -deblockIdc Loop filter idc (0: on, 1: off, \n");
printf (" -alphaOffset AlphaOffset(-6..+6): valid range \n");
printf (" -betaOffset BetaOffset (-6..+6): valid range\n");
printf (" -rc rate control mode: 0-quality mode; 1-bitrate mode; 2-bitrate limited mode; -1-rc off \n");
printf (" -tarb Overall target bitrate\n");
printf (" -maxbrTotal Overall max bitrate\n");
printf (" -maxqp Maximum Qp\n");
printf (" -minqp Minimum Qp\n");
printf (" -numl Number Of Layers: Must exist with layer_cfg file and the number of input layer_cfg file must equal to the value set by this command\n");
printf (" The options below are layer-based: (need to be set with layer id)\n");
printf (" -lconfig (Layer) (spatial layer configure file)\n");
@ -407,7 +424,7 @@ void PrintHelp() {
printf (" -lqp (Layer) (base quality layer qp : must work with -ldeltaqp or -lqparr)\n");
printf (" -ltarb (Layer) (spatial layer target bitrate)\n");
printf (" -lmaxb (Layer) (spatial layer max bitrate)\n");
-printf (" -slcmd (Layer) (spatial layer slice mode): pls refer to layerX.cfg for details ( -slcnum: set target slice num; -slcsize: set target slice size constraint ) \n");
+printf (" -slcmd (Layer) (spatial layer slice mode): pls refer to layerX.cfg for details ( -slcnum: set target slice num; -slcsize: set target slice size constraint ; -slcmbnum: set the first slice mb num under some slice modes) \n");
printf (" -trace (Level)\n");
printf ("\n");
}
@ -426,6 +443,9 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
else if (!strcmp (pCommand, "-utype") && (n < argc))
pSvcParam.iUsageType = (EUsageType)atoi (argv[n++]);
else if (!strcmp (pCommand, "-savc") && (n < argc))
pSvcParam.bSimulcastAVC = atoi (argv[n++]) ? true : false;
else if (!strcmp (pCommand, "-org") && (n < argc))
sFileSet.strSeqFile.assign (argv[n++]);
@ -453,27 +473,26 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
else if (!strcmp (pCommand, "-spsid") && (n < argc)) {
int32_t iValue = atoi (argv[n++]);
switch (iValue) {
-case 0:
-pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
-break;
-case 0x01:
-pSvcParam.eSpsPpsIdStrategy = INCREASING_ID;
-break;
-case 0x02:
-pSvcParam.eSpsPpsIdStrategy = SPS_LISTING;
-break;
-case 0x03:
-pSvcParam.eSpsPpsIdStrategy = SPS_LISTING_AND_PPS_INCREASING;
-break;
-case 0x06:
-pSvcParam.eSpsPpsIdStrategy = SPS_PPS_LISTING;
-break;
-default:
-pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
-break;
+case 0:
+pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
+break;
+case 0x01:
+pSvcParam.eSpsPpsIdStrategy = INCREASING_ID;
+break;
+case 0x02:
+pSvcParam.eSpsPpsIdStrategy = SPS_LISTING;
+break;
+case 0x03:
+pSvcParam.eSpsPpsIdStrategy = SPS_LISTING_AND_PPS_INCREASING;
+break;
+case 0x06:
+pSvcParam.eSpsPpsIdStrategy = SPS_PPS_LISTING;
+break;
+default:
+pSvcParam.eSpsPpsIdStrategy = CONSTANT_ID;
+break;
}
-}
-else if (!strcmp (pCommand, "-cabac") && (n < argc))
+} else if (!strcmp (pCommand, "-cabac") && (n < argc))
pSvcParam.iEntropyCodingModeFlag = atoi (argv[n++]);
else if (!strcmp (pCommand, "-denois") && (n < argc))
@ -502,8 +521,9 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
else if (!strcmp (pCommand, "-threadIdc") && (n < argc))
pSvcParam.iMultipleThreadIdc = atoi (argv[n++]);
-else if (!strcmp (pCommand, "-deblockIdc") && (n < argc))
+else if (!strcmp (pCommand, "-loadbalancing") && (n + 1 < argc)) {
+pSvcParam.bUseLoadBalancing = (atoi (argv[n++])) ? true : false;
+} else if (!strcmp (pCommand, "-deblockIdc") && (n < argc))
pSvcParam.iLoopFilterDisableIdc = atoi (argv[n++]);
else if (!strcmp (pCommand, "-alphaOffset") && (n < argc))
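
In the hunk above, `-loadbalancing` consumes one value but is guarded with `(n + 1 < argc)`, while the neighbouring one-value options use `(n < argc)`. Assuming `n` already points at the value when the guard runs, as the surrounding branches suggest, the stricter test would skip the option when it is the last pair on the command line. A minimal sketch of a uniform guard follows; `HasValues` and `ParseLoadBalancing` are hypothetical helpers, not part of the encoder console:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true when kiCount values remain at argv[n .. n+kiCount-1].
inline bool HasValues (int n, int argc, int kiCount) {
  return n >= 0 && kiCount >= 0 && n + kiCount <= argc;
}

// Hypothetical, reduced parser: returns the -loadbalancing setting,
// or bDefault when the option is absent or has no value.
inline bool ParseLoadBalancing (int argc, const char* const* argv, bool bDefault) {
  for (int n = 1; n < argc;) {
    const char* pCommand = argv[n++]; // n now indexes the option's value
    if (!std::strcmp (pCommand, "-loadbalancing") && HasValues (n, argc, 1))
      return std::atoi (argv[n++]) != 0;
  }
  return bDefault;
}
```

With this guard, one-value options need `HasValues (n, argc, 1)` and layer-indexed options such as `-slcsize` need `HasValues (n, argc, 2)`.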
@ -524,6 +544,12 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
else if (!strcmp (pCommand, "-maxbrTotal") && (n < argc))
pSvcParam.iMaxBitrate = 1000 * atoi (argv[n++]);
else if (!strcmp (pCommand, "-maxqp") && (n < argc))
pSvcParam.iMaxQp = atoi (argv[n++]);
else if (!strcmp (pCommand, "-minqp") && (n < argc))
pSvcParam.iMinQp = atoi (argv[n++]);
else if (!strcmp (pCommand, "-numl") && (n < argc)) {
pSvcParam.iSpatialLayerNum = atoi (argv[n++]);
} else if (!strcmp (pCommand, "-lconfig") && (n < argc)) {
@ -583,25 +609,19 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
switch (atoi (argv[n++])) {
case 0:
-pDLayer->sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
+pDLayer->sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
break;
case 1:
-pDLayer->sSliceCfg.uiSliceMode = SM_FIXEDSLCNUM_SLICE;
+pDLayer->sSliceArgument.uiSliceMode = SM_FIXEDSLCNUM_SLICE;
break;
case 2:
-pDLayer->sSliceCfg.uiSliceMode = SM_RASTER_SLICE;
+pDLayer->sSliceArgument.uiSliceMode = SM_RASTER_SLICE;
break;
case 3:
-pDLayer->sSliceCfg.uiSliceMode = SM_ROWMB_SLICE;
-break;
-case 4:
-pDLayer->sSliceCfg.uiSliceMode = SM_DYN_SLICE;
-break;
-case 5:
-pDLayer->sSliceCfg.uiSliceMode = SM_AUTO_SLICE;
+pDLayer->sSliceArgument.uiSliceMode = SM_SIZELIMITED_SLICE;
break;
default:
-pDLayer->sSliceCfg.uiSliceMode = SM_RESERVED;
+pDLayer->sSliceArgument.uiSliceMode = SM_RESERVED;
break;
}
}
@ -609,13 +629,17 @@ int ParseCommandLine (int argc, char** argv, SSourcePicture* pSrcPic, SEncParamE
else if (!strcmp (pCommand, "-slcsize") && (n + 1 < argc)) {
unsigned int iLayer = atoi (argv[n++]);
SSpatialLayerConfig* pDLayer = &pSvcParam.sSpatialLayers[iLayer];
-pDLayer->sSliceCfg.sSliceArgument.uiSliceSizeConstraint = atoi (argv[n++]);
+pDLayer->sSliceArgument.uiSliceSizeConstraint = atoi (argv[n++]);
}
else if (!strcmp (pCommand, "-slcnum") && (n + 1 < argc)) {
unsigned int iLayer = atoi (argv[n++]);
SSpatialLayerConfig* pDLayer = &pSvcParam.sSpatialLayers[iLayer];
-pDLayer->sSliceCfg.sSliceArgument.uiSliceNum = atoi (argv[n++]);
+pDLayer->sSliceArgument.uiSliceNum = atoi (argv[n++]);
+} else if (!strcmp (pCommand, "-slcmbnum") && (n + 1 < argc)) {
+unsigned int iLayer = atoi (argv[n++]);
+SSpatialLayerConfig* pDLayer = &pSvcParam.sSpatialLayers[iLayer];
+pDLayer->sSliceArgument.uiSliceMbNum[0] = atoi (argv[n++]);
}
}
return 0;
@ -644,6 +668,7 @@ int FillSpecificParameters (SEncParamExt& sParam) {
sParam.eSpsPpsIdStrategy = INCREASING_ID;
sParam.bPrefixNalAddingCtrl = 0;
sParam.iComplexityMode = MEDIUM_COMPLEXITY;
sParam.bSimulcastAVC = false;
int iIndexLayer = 0;
sParam.sSpatialLayers[iIndexLayer].uiProfileIdc = PRO_BASELINE;
sParam.sSpatialLayers[iIndexLayer].iVideoWidth = 160;
@ -651,7 +676,7 @@ int FillSpecificParameters (SEncParamExt& sParam) {
sParam.sSpatialLayers[iIndexLayer].fFrameRate = 7.5f;
sParam.sSpatialLayers[iIndexLayer].iSpatialBitrate = 64000;
sParam.sSpatialLayers[iIndexLayer].iMaxSpatialBitrate = UNSPECIFIED_BIT_RATE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
++ iIndexLayer;
sParam.sSpatialLayers[iIndexLayer].uiProfileIdc = PRO_SCALABLE_BASELINE;
@ -660,7 +685,7 @@ int FillSpecificParameters (SEncParamExt& sParam) {
sParam.sSpatialLayers[iIndexLayer].fFrameRate = 15.0f;
sParam.sSpatialLayers[iIndexLayer].iSpatialBitrate = 160000;
sParam.sSpatialLayers[iIndexLayer].iMaxSpatialBitrate = UNSPECIFIED_BIT_RATE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
++ iIndexLayer;
sParam.sSpatialLayers[iIndexLayer].uiProfileIdc = PRO_SCALABLE_BASELINE;
@ -669,8 +694,8 @@ int FillSpecificParameters (SEncParamExt& sParam) {
sParam.sSpatialLayers[iIndexLayer].fFrameRate = 30.0f;
sParam.sSpatialLayers[iIndexLayer].iSpatialBitrate = 512000;
sParam.sSpatialLayers[iIndexLayer].iMaxSpatialBitrate = UNSPECIFIED_BIT_RATE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.sSliceArgument.uiSliceNum = 1;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceNum = 1;
++ iIndexLayer;
sParam.sSpatialLayers[iIndexLayer].uiProfileIdc = PRO_SCALABLE_BASELINE;
@ -679,8 +704,8 @@ int FillSpecificParameters (SEncParamExt& sParam) {
sParam.sSpatialLayers[iIndexLayer].fFrameRate = 30.0f;
sParam.sSpatialLayers[iIndexLayer].iSpatialBitrate = 1500000;
sParam.sSpatialLayers[iIndexLayer].iMaxSpatialBitrate = UNSPECIFIED_BIT_RATE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
-sParam.sSpatialLayers[iIndexLayer].sSliceCfg.sSliceArgument.uiSliceNum = 1;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
+sParam.sSpatialLayers[iIndexLayer].sSliceArgument.uiSliceNum = 1;
float fMaxFr = sParam.sSpatialLayers[sParam.iSpatialLayerNum - 1].fFrameRate;
for (int32_t i = sParam.iSpatialLayerNum - 2; i >= 0; -- i) {
@ -793,7 +818,7 @@ int ProcessEncoding (ISVCEncoder* pPtrEnc, int argc, char** argv, bool bConfigFi
sSvcParam.iPicHeight = (!sSvcParam.iPicHeight) ? iSourceHeight : sSvcParam.iPicHeight;
iTotalFrameMax = (int32_t)fs.uiFrameToBeCoded;
// sSvcParam.bSimulcastAVC = true;
if (cmResultSuccess != pPtrEnc->InitializeExt (&sSvcParam)) { // SVC encoder initialization
fprintf (stderr, "SVC encoder Initialize failed\n");
iRet = 1;
@ -832,11 +857,27 @@ int ProcessEncoding (ISVCEncoder* pPtrEnc, int argc, char** argv, bool bConfigFi
pFileYUV = fopen (fs.strSeqFile.c_str(), "rb");
if (pFileYUV != NULL) {
#if defined(_WIN32) || defined(_WIN64)
#if _MSC_VER >= 1400
if (!_fseeki64 (pFileYUV, 0, SEEK_END)) {
int64_t i_size = _ftelli64 (pFileYUV);
_fseeki64 (pFileYUV, 0, SEEK_SET);
iTotalFrameMax = WELS_MAX ((int32_t) (i_size / kiPicResSize), iTotalFrameMax);
}
#else
if (!fseek (pFileYUV, 0, SEEK_END)) {
int64_t i_size = ftell (pFileYUV);
fseek (pFileYUV, 0, SEEK_SET);
iTotalFrameMax = WELS_MAX ((int32_t) (i_size / kiPicResSize), iTotalFrameMax);
}
#endif
#else
if (!fseeko (pFileYUV, 0, SEEK_END)) {
int64_t i_size = ftello (pFileYUV);
fseeko (pFileYUV, 0, SEEK_SET);
iTotalFrameMax = WELS_MAX ((int32_t) (i_size / kiPicResSize), iTotalFrameMax);
}
#endif
} else {
fprintf (stderr, "Unable to open source sequence file (%s), check corresponding path!\n",
fs.strSeqFile.c_str());
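
The hunk above moves the file-size probe to 64-bit offsets (`_fseeki64`/`_ftelli64` on MSVC, `fseeko`/`ftello` elsewhere, together with `_FILE_OFFSET_BITS 64`), presumably so that sources larger than 2 GiB report a correct size. A minimal sketch of the frame-count arithmetic once a 64-bit size is available. `FramesFromFileSize` is a hypothetical helper, and the guard against non-positive inputs is an addition that the original expression does not have:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring
//   WELS_MAX ((int32_t) (i_size / kiPicResSize), iTotalFrameMax)
// iFileSize   : size of the raw YUV file in bytes (64-bit)
// iPicResSize : bytes per frame, e.g. width * height * 3 / 2 for I420
inline int32_t FramesFromFileSize (int64_t iFileSize, int32_t iPicResSize, int32_t iTotalFrameMax) {
  if (iFileSize <= 0 || iPicResSize <= 0) // added guard, not in the original
    return iTotalFrameMax;
  // Divide in 64 bits first; only the quotient is narrowed.
  return std::max (static_cast<int32_t> (iFileSize / iPicResSize), iTotalFrameMax);
}
```

For example, a 3 GiB file of 1920x1080 I420 frames (3110400 bytes each) holds 1035 whole frames, a size that does not fit in a signed 32-bit offset.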


@ -68,7 +68,7 @@
// uint8_t *pred, const int32_t stride, int16_t *rs
WELS_ASM_AARCH64_FUNC_BEGIN IdctResAddPred_AArch64_neon
SIGN_EXTENSION x1,w1
ld4 {v0.4h, v1.4h, v2.4h, v3.4h}, [x2] // cost 3 cycles!
ROW_TRANSFORM_1_STEP v0, v1, v2, v3, v16, v17, v18, v19, v4, v5
TRANSFORM_4BYTES v0, v1, v2, v3, v16, v17, v18, v19
@ -113,6 +113,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsBlockZero16x16_AArch64_neon
eor v0.16b, v0.16b, v0.16b
eor v1.16b, v1.16b, v1.16b
SIGN_EXTENSION x1,w1
lsl x1, x1, 1
.rept 16
st1 {v0.16b, v1.16b}, [x0], x1
@ -121,6 +122,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsBlockZero8x8_AArch64_neon
eor v0.16b, v0.16b, v0.16b
SIGN_EXTENSION x1, w1
lsl x1, x1, 1
.rept 8
st1 {v0.16b}, [x0], x1


@ -47,6 +47,11 @@ extern "C" {
#if defined(X86_ASM)
void IdctResAddPred_mmx (uint8_t* pPred, const int32_t kiStride, int16_t* pRs);
void IdctResAddPred_sse2 (uint8_t* pPred, const int32_t kiStride, int16_t* pRs);
#if defined(HAVE_AVX2)
void IdctResAddPred_avx2 (uint8_t* pPred, const int32_t kiStride, int16_t* pRs);
void IdctFourResAddPred_avx2 (uint8_t* pPred, int32_t iStride, int16_t* pRs, const int8_t* pNzc);
#endif
#endif//X86_ASM
#if defined(HAVE_NEON)


@ -54,6 +54,11 @@ extern "C" {
*/
int32_t DecoderConfigParam (PWelsDecoderContext pCtx, const SDecodingParam* kpParam);
/*!
* \brief fill in default values of decoder context
*/
void WelsDecoderDefaults (PWelsDecoderContext pCtx, SLogContext* pLogCtx);
/*!
*************************************************************************************
* \brief Initialize Wels decoder parameters and memory
@ -68,7 +73,7 @@ int32_t DecoderConfigParam (PWelsDecoderContext pCtx, const SDecodingParam* kpPa
* \note N/A
*************************************************************************************
*/
-int32_t WelsInitDecoder (PWelsDecoderContext pCtx, const bool bParseOnly, SLogContext* pLogCtx);
+int32_t WelsInitDecoder (PWelsDecoderContext pCtx, SLogContext* pLogCtx);
/*!
*************************************************************************************
@ -106,18 +111,13 @@ int32_t WelsDecodeBs (PWelsDecoderContext pCtx, const uint8_t* kpBsBuf, const in
/*
* request memory blocks for decoder avc part
*/
-int32_t WelsRequestMem (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const int32_t kiMbHeight);
+int32_t WelsRequestMem (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const int32_t kiMbHeight, bool& bReallocFlag);
/*
-* free memory blocks in avc
+* free memory dynamically allocated during decoder
*/
-void WelsFreeMem (PWelsDecoderContext pCtx);
-/*
-* set colorspace format in decoder
-*/
-int32_t DecoderSetCsp (PWelsDecoderContext pCtx, const int32_t kiColorFormat);
+void WelsFreeDynamicMemory (PWelsDecoderContext pCtx);
/*!
* \brief make sure synchonozization picture resolution (get from slice header) among different parts (i.e, memory related and so on)
@ -130,7 +130,19 @@ int32_t DecoderSetCsp (PWelsDecoderContext pCtx, const int32_t kiColorFormat);
*/
int32_t SyncPictureResolutionExt (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const int32_t kiMbHeight);
-void AssignFuncPointerForRec (PWelsDecoderContext pCtx);
+/*!
+* \brief init decoder predictive function pointers including ASM functions during MB reconstruction
+* \param pCtx Wels decoder context
+* \param uiCpuFlag cpu assembly indication
+*/
+void InitPredFunc (PWelsDecoderContext pCtx, uint32_t uiCpuFlag);
+/*!
+* \brief init decoder internal function pointers including ASM functions
+* \param pCtx Wels decoder context
+* \param uiCpuFlag cpu assembly indication
+*/
+void InitDecFuncs (PWelsDecoderContext pCtx, uint32_t uiCpuFlag);
void GetVclNalTemporalId (PWelsDecoderContext pCtx); //get the info that whether or not have VCL NAL in current AU,
//and if YES, get the temporal ID


@ -136,6 +136,7 @@ typedef struct TagPpsBsInfo {
/*typedef for get intra predictor func pointer*/
typedef void (*PGetIntraPredFunc) (uint8_t* pPred, const int32_t kiLumaStride);
typedef void (*PIdctResAddPredFunc) (uint8_t* pPred, const int32_t kiStride, int16_t* pRs);
typedef void (*PIdctFourResAddPredFunc) (uint8_t* pPred, int32_t iStride, int16_t* pRs, const int8_t* pNzc);
typedef void (*PExpandPictureFunc) (uint8_t* pDst, const int32_t kiStride, const int32_t kiPicWidth,
const int32_t kiPicHeight);
@ -242,7 +243,6 @@ typedef struct TagWelsDecoderContext {
SDecodingParam* pParam;
uint32_t uiCpuFlag; // CPU compatibility detected
EVideoFormatType eOutputColorFormat; // color space format to be outputed
VIDEO_BITSTREAM_TYPE eVideoType; //indicate the type of video to decide whether or not to do qp_delta error detection.
bool bHaveGotMemory; // global memory for decoder context related ever requested?
@ -376,7 +376,6 @@ typedef struct TagWelsDecoderContext {
ERROR_CON_IDC eErrorConMethod; //
//for Parse only
bool bParseOnly;
bool bFramePending;
bool bFrameFinish;
int32_t iNalNum;
@ -391,6 +390,7 @@ typedef struct TagWelsDecoderContext {
PGetIntraPredFunc pGetI4x4LumaPredFunc[14]; // h264_predict_4x4_t
PGetIntraPredFunc pGetIChromaPredFunc[7]; // h264_predict_8x8_t
PIdctResAddPredFunc pIdctResAddPredFunc;
PIdctFourResAddPredFunc pIdctFourResAddPredFunc;
SMcFunc sMcFunc;
//Transform8x8
PGetIntraPred8x8Func pGetI8x8LumaPredFunc[14];


@ -72,21 +72,21 @@ int32_t ExpandBsBuffer (PWelsDecoderContext pCtx, const int32_t kiSrcLen);
int32_t CheckBsBuffer (PWelsDecoderContext pCtx, const int32_t kiSrcLen);
/*
-* WelsInitMemory
-* Memory request for introduced data
+* WelsInitStaticMemory
+* Memory request for introduced data at decoder start
* Especially for:
* rbsp_au_buffer, cur_dq_layer_ptr and ref_dq_layer_ptr in MB info cache.
* return:
* 0 - success; otherwise returned error_no defined in error_no.h.
*/
-int32_t WelsInitMemory (PWelsDecoderContext pCtx);
+int32_t WelsInitStaticMemory (PWelsDecoderContext pCtx);
/*
-* WelsFreeMemory
-* Free memory introduced in WelsInitMemory at destruction of decoder.
+* WelsFreeStaticMemory
+* Free memory introduced in WelsInitStaticMemory at destruction of decoder.
*
*/
-void WelsFreeMemory (PWelsDecoderContext pCtx);
+void WelsFreeStaticMemory (PWelsDecoderContext pCtx);
/*!
* \brief request memory when maximal picture width and height are available


@ -129,6 +129,7 @@ ERR_INFO_INVALID_LOG2_MAX_PIC_ORDER_CNT_LSB_MINUS4,
ERR_INFO_INVALID_NUM_REF_FRAME_IN_PIC_ORDER_CNT_CYCLE,
ERR_INFO_INVALID_DBLOCKING_IDC,
ERR_INFO_INVALID_MB_TYPE,
ERR_INFO_INVALID_MB_SKIP_RUN,
ERR_INFO_INVALID_SPS_ID,
ERR_INFO_INVALID_PPS_ID,
ERR_INFO_INVALID_SUB_MB_TYPE,
@ -199,6 +200,16 @@ ERR_CABAC_NO_BS_TO_READ,
ERR_CABAC_UNEXPECTED_VALUE,
//for scaling list
ERR_SCALING_LIST_DELTA_SCALE,
//logic error related to multi-layer
ERR_INFO_WIDTH_MISMATCH,
//reconstruction error
ERR_INFO_MB_RECON_FAIL,
ERR_INFO_MB_NUM_EXCEED_FAIL,
ERR_INFO_BS_INCOMPLETE,
ERR_INFO_MB_NUM_INADEQUATE,
//parse only error
ERR_INFO_PARSEONLY_PENDING,
ERR_INFO_PARSEONLY_ERROR,
};
//-----------------------------------------------------------------------------------------------------------


@ -48,7 +48,7 @@
namespace WelsDec {
#ifndef MB_XY_T
-#define MB_XY_T int16_t
+#define MB_XY_T int32_t
#endif//MB_XY_T
/*!
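Note on the MB_XY_T change above: one plausible motivation (an assumption, the diff itself does not state a reason) is that a signed 16-bit macroblock index cannot address every macroblock of the largest frames the decoder now tolerates. The sketch below uses a hypothetical helper, MbIndexFits, which is not part of the decoder, together with the level 5.2 MaxFS value from the H.264 level table.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of the decoder: reports whether the largest
// macroblock index (mbCount - 1) of a frame is representable in type T.
template <typename T>
constexpr bool MbIndexFits (uint32_t mbCount) {
  return mbCount == 0 ||
         static_cast<uint64_t> (mbCount - 1) <=
             static_cast<uint64_t> (std::numeric_limits<T>::max());
}

// MaxFS (maximum frame size in macroblocks) for level 5.2 in the H.264 level table.
constexpr uint32_t kMaxFsLevel52 = 36864;

// A level 5.2 frame has indices up to 36863, which exceeds INT16_MAX (32767).
static_assert (!MbIndexFits<int16_t> (kMaxFsLevel52), "int16_t is too narrow");
static_assert (MbIndexFits<int32_t> (kMaxFsLevel52), "int32_t is wide enough");
```

Under these assumptions, any frame with more than 32768 macroblocks needs the wider type.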


@ -317,7 +317,7 @@ uint8_t* ParseNalHeader (PWelsDecoderContext pCtx, SNalUnitHeader* pNalUnitHeade
iNalSize -= NAL_UNIT_HEADER_EXT_SIZE;
*pConsumedBytes += NAL_UNIT_HEADER_EXT_SIZE;
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
pCurNal->sNalData.sVclNal.pNalPos = pSavedData->pCurPos;
int32_t iTrailingZeroByte = 0;
while (pSrcNal[iSrcNalLen - iTrailingZeroByte - 1] == 0x0) //remove final trailing 0 bytes
@ -346,7 +346,7 @@ uint8_t* ParseNalHeader (PWelsDecoderContext pCtx, SNalUnitHeader* pNalUnitHeade
pSavedData->pCurPos += iActualLen - iOffset;
}
} else {
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
pCurNal->sNalData.sVclNal.pNalPos = pSavedData->pCurPos;
int32_t iTrailingZeroByte = 0;
while (pSrcNal[iSrcNalLen - iTrailingZeroByte - 1] == 0x0) //remove final trailing 0 bytes
@ -782,7 +782,7 @@ int32_t DecodeSpsSvcExt (PWelsDecoderContext pCtx, PSubsetSps pSpsExt, PBitStrin
return 0;
return ERR_NONE;
}
const SLevelLimits* GetLevelLimits (int32_t iLevelIdx, bool bConstraint3) {
@ -915,11 +915,18 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
int32_t iSpsId;
uint32_t uiCode;
int32_t iCode;
int32_t iRet = ERR_NONE;
bool bConstraintSetFlags[6] = { false };
const bool kbUseSubsetFlag = IS_SUBSET_SPS_NAL (pNalHead->eNalUnitType);
WELS_READ_VERIFY (BsGetBits (pBs, 8, &uiCode)); //profile_idc
uiProfileIdc = uiCode;
if (uiProfileIdc != PRO_BASELINE && uiProfileIdc != PRO_MAIN && uiProfileIdc != PRO_SCALABLE_BASELINE
&& uiProfileIdc != PRO_SCALABLE_HIGH
&& uiProfileIdc != PRO_EXTENDED && uiProfileIdc != PRO_HIGH) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "SPS ID can not be supported!\n");
return false;
}
WELS_READ_VERIFY (BsGetOneBit (pBs, &uiCode)); //constraint_set0_flag
bConstraintSetFlags[0] = !!uiCode;
WELS_READ_VERIFY (BsGetOneBit (pBs, &uiCode)); //constraint_set1_flag
@ -944,6 +951,8 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
pSubsetSps = &sTempSubsetSps;
pSps = &sTempSubsetSps.sSps;
memset (pSubsetSps, 0, sizeof (SSubsetSps));
// Use the level 5.2 for compatibility
const SLevelLimits* pSMaxLevelLimits = GetLevelLimits (52, false);
const SLevelLimits* pSLevelLimits = GetLevelLimits (uiLevelIdc, bConstraintSetFlags[3]);
if (NULL == pSLevelLimits) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "ParseSps(): level_idx (%d).\n", uiLevelIdc);
@ -1048,8 +1057,12 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
}
if (((uint64_t)pSps->iMbWidth * (uint64_t)pSps->iMbWidth) > (uint64_t) (8 * pSLevelLimits->uiMaxFS)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_ERROR, " the pic_width_in_mbs exceeds the level limits!");
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
if (((uint64_t)pSps->iMbWidth * (uint64_t)pSps->iMbWidth) > (uint64_t) (8 * pSMaxLevelLimits->uiMaxFS)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_ERROR, "the pic_width_in_mbs exceeds the level limits!");
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
} else {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "the pic_width_in_mbs exceeds the level limits!");
}
}
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //pic_height_in_map_units_minus1
pSps->iMbHeight = PIC_HEIGHT_IN_MAP_UNITS_OFFSET + uiCode;
@ -1058,14 +1071,23 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
}
if (((uint64_t)pSps->iMbHeight * (uint64_t)pSps->iMbHeight) > (uint64_t) (8 * pSLevelLimits->uiMaxFS)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_ERROR, " the pic_height_in_mbs exceeds the level limits!");
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
if (((uint64_t)pSps->iMbHeight * (uint64_t)pSps->iMbHeight) > (uint64_t) (8 * pSMaxLevelLimits->uiMaxFS)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_ERROR, "the pic_height_in_mbs exceeds the level limits!");
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
} else {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "the pic_height_in_mbs exceeds the level limits!");
}
}
uint32_t uiTmp32 = pSps->iMbWidth * pSps->iMbHeight;
if (uiTmp32 > (uint32_t)pSLevelLimits->uiMaxFS) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, " the total count of mb exceeds the level limits!");
uint64_t uiTmp64 = (uint64_t)pSps->iMbWidth * (uint64_t)pSps->iMbHeight;
if (uiTmp64 > (uint64_t)pSLevelLimits->uiMaxFS) {
if (uiTmp64 > (uint64_t)pSMaxLevelLimits->uiMaxFS) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_ERROR, "the total count of mb exceeds the level limits!");
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_MB_SIZE);
} else {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "the total count of mb exceeds the level limits!");
}
}
pSps->uiTotalMbCount = uiTmp32;
pSps->uiTotalMbCount = (uint32_t)uiTmp64;
WELS_CHECK_SE_UPPER_ERROR (pSps->iNumRefFrames, SPS_MAX_NUM_REF_FRAMES_MAX, "max_num_ref_frames",
GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_MAX_NUM_REF_FRAMES));
// here we check max_num_ref_frames
@ -1113,7 +1135,7 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
WELS_READ_VERIFY (BsGetOneBit (pBs, &uiCode)); //vui_parameters_present_flag
pSps->bVuiParamPresentFlag = !!uiCode;
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
if (kSrcNalLen >= SPS_PPS_BS_SIZE - 4) { //sps bs exceeds!
pCtx->iErrorCode |= dsOutOfMemory;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_OUT_OF_MEMORY);
@ -1201,8 +1223,8 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
}
// Check if SPS SVC extension applicated
if (kbUseSubsetFlag && (PRO_SCALABLE_BASELINE == uiProfileIdc || PRO_SCALABLE_HIGH == uiProfileIdc)) {
if (DecodeSpsSvcExt (pCtx, pSubsetSps, pBs) != ERR_NONE) {
return -1;
if ((iRet = DecodeSpsSvcExt (pCtx, pSubsetSps, pBs)) != ERR_NONE) {
return iRet;
}
WELS_READ_VERIFY (BsGetOneBit (pBs, &uiCode)); //svc_vui_parameters_present_flag
@ -1265,7 +1287,7 @@ int32_t ParseSps (PWelsDecoderContext pCtx, PBitStringAux pBsAux, int32_t* pPicW
pCtx->bSpsAvailFlags[iSpsId] = true;
pCtx->bSpsExistAheadFlag = true;
}
return 0;
return ERR_NONE;
}
/*!
@ -1295,7 +1317,7 @@ int32_t ParsePps (PWelsDecoderContext pCtx, PPps pPpsList, PBitStringAux pBsAux,
WELS_READ_VERIFY (BsGetUe (pBsAux, &uiCode)); //pic_parameter_set_id
uiPpsId = uiCode;
if (uiPpsId >= MAX_PPS_COUNT) {
return ERR_INFO_PPS_ID_OVERFLOW;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_PPS_ID_OVERFLOW);
}
pPps = &sTempPps;
memset (pPps, 0, sizeof (SPps));
@ -1305,7 +1327,7 @@ int32_t ParsePps (PWelsDecoderContext pCtx, PPps pPpsList, PBitStringAux pBsAux,
pPps->iSpsId = uiCode;
if (pPps->iSpsId >= MAX_SPS_COUNT) {
return ERR_INFO_SPS_ID_OVERFLOW;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_SPS_ID_OVERFLOW);
}
WELS_READ_VERIFY (BsGetOneBit (pBsAux, &uiCode)); //entropy_coding_mode_flag
@ -1317,7 +1339,7 @@ int32_t ParsePps (PWelsDecoderContext pCtx, PPps pPpsList, PBitStringAux pBsAux,
pPps->uiNumSliceGroups = NUM_SLICE_GROUPS_OFFSET + uiCode;
if (pPps->uiNumSliceGroups > MAX_SLICEGROUP_IDS) {
return ERR_INFO_INVALID_SLICEGROUP;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_INVALID_SLICEGROUP);
}
if (pPps->uiNumSliceGroups > 1) {
@ -1348,19 +1370,14 @@ int32_t ParsePps (PWelsDecoderContext pCtx, PPps pPpsList, PBitStringAux pBsAux,
if (pPps->uiNumRefIdxL0Active > MAX_REF_PIC_COUNT ||
pPps->uiNumRefIdxL1Active > MAX_REF_PIC_COUNT) {
return ERR_INFO_REF_COUNT_OVERFLOW;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_REF_COUNT_OVERFLOW);
}
WELS_READ_VERIFY (BsGetOneBit (pBsAux, &uiCode)); //weighted_pred_flag
pPps->bWeightedPredFlag = !!uiCode;
WELS_READ_VERIFY (BsGetBits (pBsAux, 2, &uiCode)); //weighted_bipred_idc
pPps->uiWeightedBipredIdc = uiCode;
if (pPps->uiWeightedBipredIdc != 0) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"ParsePps(): weighted_bipred_idc (%d) not supported.\n",
pPps->uiWeightedBipredIdc);
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_UNSUPPORTED_WP);
}
// weighted_bipred_idc > 0 NOT supported now, but no impact when we ignore it
WELS_READ_VERIFY (BsGetSe (pBsAux, &iCode)); //pic_init_qp_minus26
pPps->iPicInitQp = PIC_INIT_QP_OFFSET + iCode;
@ -1416,7 +1433,7 @@ int32_t ParsePps (PWelsDecoderContext pCtx, PPps pPpsList, PBitStringAux pBsAux,
memcpy (&pCtx->sPpsBuffer[uiPpsId], pPps, sizeof (SPps));
pCtx->bPpsAvailFlags[uiPpsId] = true;
}
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
if (kSrcNalLen >= SPS_PPS_BS_SIZE - 4) { //pps bs exceeds
pCtx->iErrorCode |= dsOutOfMemory;
return GENERATE_ERROR_NO (ERR_LEVEL_PARAM_SETS, ERR_INFO_OUT_OF_MEMORY);
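Note on the ParseSps hunks above: the new checks are two-tiered. Exceeding the limit of the level the stream declares only logs a warning, while exceeding the level 5.2 ceiling is an error, and the macroblock count is now computed in 64 bits. A minimal standalone sketch of that pattern follows; LimitCheck, MbCount64 and CheckTotalMbCount are hypothetical names used for illustration and do not appear in the decoder.

```cpp
#include <cstdint>

enum class LimitCheck { kOk, kWarn, kError };

// Widen before multiplying so that large dimensions cannot wrap around.
constexpr uint64_t MbCount64 (uint32_t mbWidth, uint32_t mbHeight) {
  return static_cast<uint64_t> (mbWidth) * static_cast<uint64_t> (mbHeight);
}

// declaredMaxFs: MaxFS of the level signalled in the stream.
// ceilingMaxFs:  MaxFS of the fallback level (level 5.2 in the diff above).
constexpr LimitCheck CheckTotalMbCount (uint32_t mbWidth, uint32_t mbHeight,
                                        uint32_t declaredMaxFs, uint32_t ceilingMaxFs) {
  return MbCount64 (mbWidth, mbHeight) <= declaredMaxFs ? LimitCheck::kOk
       : MbCount64 (mbWidth, mbHeight) <= ceilingMaxFs  ? LimitCheck::kWarn
                                                        : LimitCheck::kError;
}
```

For example, a 120x68 macroblock frame (1920x1088 pixels) in a stream that declares level 3.0 (MaxFS 1620) yields kWarn against a 36864 ceiling. A 65536x65536 frame yields kError, whereas a 32-bit product would have wrapped to 0 and passed the check.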


@ -107,7 +107,7 @@ int32_t Read32BitsCabac (PWelsCabacDecEngine pDecEngine, uint32_t& uiValue, int3
iNumBitsRead = 0;
uiValue = 0;
if (iLeftBytes <= 0) {
return ERR_CABAC_NO_BS_TO_READ;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_CABAC_NO_BS_TO_READ);
}
switch (iLeftBytes) {
case 3:
@ -275,7 +275,7 @@ int32_t DecodeExpBypassCabac (PWelsCabacDecEngine pDecEngine, int32_t iCount, ui
}
} while (uiCode != 0 && iCount != 16);
if (iCount == 16) {
return ERR_CABAC_UNEXPECTED_VALUE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_CABAC_UNEXPECTED_VALUE);
}
while (iCount--) {
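Note on the GENERATE_ERROR_NO changes above: bare ERR_INFO_* and ERR_CABAC_* values are now wrapped together with an error level. The sketch below illustrates the general idea of packing a level and an info code into one 32-bit return value. The bit layout (level in the upper 16 bits, info in the lower 16) and the numeric constants are assumptions made for illustration. The macro's real definition is not shown in this diff, and ComposeError, ErrorLevel and ErrorInfo are hypothetical names.

```cpp
#include <cstdint>

// Assumed layout, for illustration only: level in bits 16..31, info in bits 0..15.
constexpr int32_t ComposeError (int32_t level, int32_t info) {
  return (level << 16) | (info & 0xFFFF);
}

// Recover the two parts from a composed code.
constexpr int32_t ErrorLevel (int32_t code) { return (code >> 16) & 0xFFFF; }
constexpr int32_t ErrorInfo (int32_t code)  { return code & 0xFFFF; }

// Hypothetical values; the decoder's real enum values are not shown in the diff.
constexpr int32_t kLevelMbData = 5;
constexpr int32_t kInfoNoBsToRead = 42;
constexpr int32_t kComposed = ComposeError (kLevelMbData, kInfoNoBsToRead);
```

With a packing like this, a composed code no longer compares equal to the bare info value, so callers have to compare against the composed form or extract the info part first.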


@ -70,7 +70,7 @@ int32_t WelsTargetSliceConstruction (PWelsDecoderContext pCtx) {
PDeblockingFilterMbFunc pDeblockMb;
if (!pCtx->bAvcBasedFlag && iCurLayerWidth != pCtx->iCurSeqIntervalMaxPicWidth) {
return -1;
return ERR_INFO_WIDTH_MISMATCH;
}
iNextMbXyIndex = pSliceHeader->iFirstMbInSlice;
@ -90,13 +90,13 @@ int32_t WelsTargetSliceConstruction (PWelsDecoderContext pCtx) {
break;
}
if (!pCtx->bParseOnly) { //for parse only, actual recon MB unnecessary
if (!pCtx->pParam->bParseOnly) { //for parse only, actual recon MB unnecessary
if (WelsTargetMbConstruction (pCtx)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"WelsTargetSliceConstruction():::MB(%d, %d) construction error. pCurSlice_type:%d",
pCurLayer->iMbX, pCurLayer->iMbY, pCurSlice->eSliceType);
return -1;
return ERR_INFO_MB_RECON_FAIL;
}
}
@ -112,7 +112,7 @@ int32_t WelsTargetSliceConstruction (PWelsDecoderContext pCtx) {
"WelsTargetSliceConstruction():::pCtx->iTotalNumMbRec:%d, iTotalMbTargetLayer:%d",
pCtx->iTotalNumMbRec, iTotalMbTargetLayer);
return -1;
return ERR_INFO_MB_NUM_EXCEED_FAIL;
}
if (pSliceHeader->pPps->uiNumSliceGroups > 1) {
@ -132,22 +132,22 @@ int32_t WelsTargetSliceConstruction (PWelsDecoderContext pCtx) {
pCtx->pDec->iHeightInPixel = iCurLayerHeight;
if ((pCurSlice->eSliceType != I_SLICE) && (pCurSlice->eSliceType != P_SLICE))
return 0;
return ERR_NONE; //no error but just ignore the type unsupported
if (pCtx->bParseOnly) //for parse only, deblocking should not go on
return 0;
if (pCtx->pParam->bParseOnly) //for parse only, deblocking should not go on
return ERR_NONE;
pDeblockMb = WelsDeblockingMb;
if (1 == pSliceHeader->uiDisableDeblockingFilterIdc
|| pCtx->pCurDqLayer->sLayerInfo.sSliceInLayer.iTotalMbInCurSlice <= 0) {
return 0;//NO_SUPPORTED_FILTER_IDX
return ERR_NONE;//NO_SUPPORTED_FILTER_IDX
} else {
WelsDeblockingFilterSlice (pCtx, pDeblockMb);
}
// any other filter_idc not supported here, 7/22/2010
return 0;
return ERR_NONE;
}
int32_t WelsMbInterSampleConstruction (PWelsDecoderContext pCtx, PDqLayer pCurLayer,
@ -168,30 +168,23 @@ int32_t WelsMbInterSampleConstruction (PWelsDecoderContext pCtx, PDqLayer pCurLa
}
}
} else {
for (i = 0; i < 16; i++) { //luma
iIndex = g_kuiMbCountScan4Idx[i];
if (pCurLayer->pNzc[iMbXy][iIndex]) {
iOffset = ((iIndex >> 2) << 2) * iStrideL + ((iIndex % 4) << 2);
pCtx->pIdctResAddPredFunc (pDstY + iOffset, iStrideL, pCurLayer->pScaledTCoeff[iMbXy] + (i << 4));
}
}
// luma.
const int8_t* pNzc = pCurLayer->pNzc[iMbXy];
int16_t* pScaledTCoeff = pCurLayer->pScaledTCoeff[iMbXy];
pCtx->pIdctFourResAddPredFunc (pDstY + 0 * iStrideL + 0, iStrideL, pScaledTCoeff + 0 * 64, pNzc + 0);
pCtx->pIdctFourResAddPredFunc (pDstY + 0 * iStrideL + 8, iStrideL, pScaledTCoeff + 1 * 64, pNzc + 2);
pCtx->pIdctFourResAddPredFunc (pDstY + 8 * iStrideL + 0, iStrideL, pScaledTCoeff + 2 * 64, pNzc + 8);
pCtx->pIdctFourResAddPredFunc (pDstY + 8 * iStrideL + 8, iStrideL, pScaledTCoeff + 3 * 64, pNzc + 10);
}
for (i = 0; i < 4; i++) { //chroma
iIndex = g_kuiMbCountScan4Idx[i + 16]; //Cb
if (pCurLayer->pNzc[iMbXy][iIndex] || * (pCurLayer->pScaledTCoeff[iMbXy] + ((i + 16) << 4))) {
iOffset = (((iIndex - 16) >> 2) << 2) * iStrideC + (((iIndex - 16) % 4) << 2);
pCtx->pIdctResAddPredFunc (pDstU + iOffset, iStrideC, pCurLayer->pScaledTCoeff[iMbXy] + ((i + 16) << 4));
}
const int8_t* pNzc = pCurLayer->pNzc[iMbXy];
int16_t* pScaledTCoeff = pCurLayer->pScaledTCoeff[iMbXy];
// Cb.
pCtx->pIdctFourResAddPredFunc (pDstU, iStrideC, pScaledTCoeff + 4 * 64, pNzc + 16);
// Cr.
pCtx->pIdctFourResAddPredFunc (pDstV, iStrideC, pScaledTCoeff + 5 * 64, pNzc + 18);
iIndex = g_kuiMbCountScan4Idx[i + 20]; //Cr
if (pCurLayer->pNzc[iMbXy][iIndex] || * (pCurLayer->pScaledTCoeff[iMbXy] + ((i + 20) << 4))) {
iOffset = (((iIndex - 18) >> 2) << 2) * iStrideC + (((iIndex - 18) % 4) << 2);
pCtx->pIdctResAddPredFunc (pDstV + iOffset, iStrideC , pCurLayer->pScaledTCoeff[iMbXy] + ((i + 20) << 4));
}
}
return 0;
return ERR_NONE;
}
int32_t WelsMbInterConstruction (PWelsDecoderContext pCtx, PDqLayer pCurLayer) {
int32_t iMbX = pCurLayer->iMbX;
@ -210,7 +203,7 @@ int32_t WelsMbInterConstruction (PWelsDecoderContext pCtx, PDqLayer pCurLayer) {
pCtx->sBlockFunc.pWelsSetNonZeroCountFunc (
pCurLayer->pNzc[pCurLayer->iMbXyIndex]); // set all none-zero nzc to 1; dbk can be opti!
return 0;
return ERR_NONE;
}
void WelsLumaDcDequantIdct (int16_t* pBlock, int32_t iQp, PWelsDecoderContext pCtx) {
@ -265,7 +258,7 @@ int32_t WelsMbIntraPredictionConstruction (PWelsDecoderContext pCtx, PDqLayer pC
WelsLumaDcDequantIdct (pCurLayer->pScaledTCoeff[iMbXy], pCurLayer->pLumaQp[iMbXy], pCtx);
RecI16x16Mb (iMbXy, pCtx, pCurLayer->pScaledTCoeff[iMbXy], pCurLayer);
return 0;
return ERR_NONE;
}
if (IS_INTRA8x8 (pCurLayer->pMbType[iMbXy])) {
@ -275,7 +268,7 @@ int32_t WelsMbIntraPredictionConstruction (PWelsDecoderContext pCtx, PDqLayer pC
if (IS_INTRA4x4 (pCurLayer->pMbType[iMbXy]))
RecI4x4Mb (iMbXy, pCtx, pCurLayer->pScaledTCoeff[iMbXy], pCurLayer);
return 0;
return ERR_NONE;
}
int32_t WelsMbInterPrediction (PWelsDecoderContext pCtx, PDqLayer pCurLayer) {
@ -292,14 +285,14 @@ int32_t WelsMbInterPrediction (PWelsDecoderContext pCtx, PDqLayer pCurLayer) {
GetInterPred (pDstY, pDstCb, pDstCr, pCtx);
return 0;
return ERR_NONE;
}
int32_t WelsTargetMbConstruction (PWelsDecoderContext pCtx) {
PDqLayer pCurLayer = pCtx->pCurDqLayer;
if (MB_TYPE_INTRA_PCM == pCurLayer->pMbType[pCurLayer->iMbXyIndex]) {
//already decoded and reconstructed when parsing
return 0;
return ERR_NONE;
} else if (IS_INTRA (pCurLayer->pMbType[pCurLayer->iMbXyIndex])) {
WelsMbIntraPredictionConstruction (pCtx, pCurLayer, 1);
} else if (IS_INTER (pCurLayer->pMbType[pCurLayer->iMbXyIndex])) { //InterMB
@ -311,10 +304,10 @@ int32_t WelsTargetMbConstruction (PWelsDecoderContext pCtx) {
} else {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "WelsTargetMbConstruction():::::Unknown MB type: %d",
pCurLayer->pMbType[pCurLayer->iMbXyIndex]);
return -1;
return ERR_INFO_MB_RECON_FAIL;
}
return 0;
return ERR_NONE;
}
void WelsChromaDcIdct (int16_t* pBlock) {
@ -444,8 +437,8 @@ int32_t ParseIntra4x4Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAvail
}
iFinalMode = CheckIntraNxNPredMode (&iSampleAvail[0], &iBestMode, i, false);
if (iFinalMode == ERR_INVALID_INTRA4X4_MODE) {
return ERR_INFO_INVALID_I4x4_PRED_MODE;
if (iFinalMode == GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INVALID_INTRA4X4_MODE)) {
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I4x4_PRED_MODE);
}
pCurDqLayer->pIntra4x4FinalMode[iMbXy][g_kuiScan4[i]] = iFinalMode;
@ -465,20 +458,20 @@ int32_t ParseIntra4x4Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAvail
if (pCurDqLayer->sLayerInfo.pPps->bEntropyCodingModeFlag) {
WELS_READ_VERIFY (ParseIntraPredModeChromaCabac (pCtx, uiNeighAvail, iCode));
if (iCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = iCode;
} else {
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //intra_chroma_pred_mode
if (uiCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = uiCode;
}
if (-1 == pCurDqLayer->pChromaPredMode[iMbXy]
|| CheckIntraChromaPredMode (uiNeighAvail, &pCurDqLayer->pChromaPredMode[iMbXy])) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
return ERR_NONE;
}
@ -528,8 +521,8 @@ int32_t ParseIntra8x8Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAvail
iFinalMode = CheckIntraNxNPredMode (&iSampleAvail[0], &iBestMode, i << 2, true);
if (iFinalMode == ERR_INVALID_INTRA4X4_MODE) {
return ERR_INFO_INVALID_I4x4_PRED_MODE;
if (iFinalMode == GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INVALID_INTRA4X4_MODE)) {
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I4x4_PRED_MODE);
}
for (int j = 0; j < 4; j++) {
@ -542,23 +535,27 @@ int32_t ParseIntra8x8Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAvail
pCurDqLayer->pIntraPredMode[iMbXy][4] = pIntraPredMode[4 + 8 * 1];
pCurDqLayer->pIntraPredMode[iMbXy][5] = pIntraPredMode[4 + 8 * 2];
pCurDqLayer->pIntraPredMode[iMbXy][6] = pIntraPredMode[4 + 8 * 3];
if (pCtx->pSps->uiChromaFormatIdc == 0)
return ERR_NONE;
if (pCurDqLayer->sLayerInfo.pPps->bEntropyCodingModeFlag) {
WELS_READ_VERIFY (ParseIntraPredModeChromaCabac (pCtx, uiNeighAvail, iCode));
if (iCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = iCode;
} else {
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //intra_chroma_pred_mode
if (uiCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = uiCode;
}
if (-1 == pCurDqLayer->pChromaPredMode[iMbXy]
|| CheckIntraChromaPredMode (uiNeighAvail, &pCurDqLayer->pChromaPredMode[iMbXy])) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
return ERR_NONE;
@ -574,7 +571,7 @@ int32_t ParseIntra16x16Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAva
if (CheckIntra16x16PredMode (uiNeighAvail,
&pCurDqLayer->pIntraPredMode[iMbXy][7])) { //invalid iPredMode, must stop decoding
return ERR_INFO_INVALID_I16x16_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I16x16_PRED_MODE);
}
if (pCtx->pSps->uiChromaFormatIdc == 0)
return ERR_NONE;
@ -582,19 +579,19 @@ int32_t ParseIntra16x16Mode (PWelsDecoderContext pCtx, PWelsNeighAvail pNeighAva
if (pCurDqLayer->sLayerInfo.pPps->bEntropyCodingModeFlag) {
WELS_READ_VERIFY (ParseIntraPredModeChromaCabac (pCtx, uiNeighAvail, iCode));
if (iCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = iCode;
} else {
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //intra_chroma_pred_mode
if (uiCode > MAX_PRED_MODE_ID_CHROMA) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
pCurDqLayer->pChromaPredMode[iMbXy] = uiCode;
}
if (-1 == pCurDqLayer->pChromaPredMode[iMbXy]
|| CheckIntraChromaPredMode (uiNeighAvail, &pCurDqLayer->pChromaPredMode[iMbXy])) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
return ERR_NONE;
@ -622,10 +619,10 @@ int32_t WelsDecodeMbCabacISliceBaseMode0 (PWelsDecoderContext pCtx, uint32_t& ui
GetNeighborAvailMbType (&sNeighAvail, pCurLayer);
WELS_READ_VERIFY (ParseMBTypeISliceCabac (pCtx, &sNeighAvail, uiMbType));
if (uiMbType > 25) {
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
} else if (!pCtx->pSps->uiChromaFormatIdc && ((uiMbType >= 5 && uiMbType <= 12) || (uiMbType >= 17
&& uiMbType <= 24))) {
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
} else if (25 == uiMbType) { //I_PCM
WELS_READ_VERIFY (ParseIPCMInfoCabac (pCtx));
pSlice->iLastDeltaQp = 0;
@ -688,7 +685,7 @@ int32_t WelsDecodeMbCabacISliceBaseMode0 (PWelsDecoderContext pCtx, uint32_t& ui
int32_t iQpDelta, iId8x8, iId4x4;
WELS_READ_VERIFY (ParseDeltaQpCabac (pCtx, iQpDelta));
if (iQpDelta > 25 || iQpDelta < -26) {//out of iQpDelta range
return ERR_INFO_INVALID_QP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_QP);
}
pCurLayer->pLumaQp[iMbXy] = (pSlice->iLastMbQp + iQpDelta + 52) % 52; //update last_mb_qp
pSlice->iLastMbQp = pCurLayer->pLumaQp[iMbXy];
@ -841,9 +838,9 @@ int32_t WelsDecodeMbCabacPSliceBaseMode0 (PWelsDecoderContext pCtx, PWelsNeighAv
} else { //Intra mode
uiMbType -= 5;
if (uiMbType > 25)
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (!pCtx->pSps->uiChromaFormatIdc && ((uiMbType >= 5 && uiMbType <= 12) || (uiMbType >= 17 && uiMbType <= 24)))
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (25 == uiMbType) { //I_PCM
WELS_READ_VERIFY (ParseIPCMInfoCabac (pCtx));
@ -922,7 +919,7 @@ int32_t WelsDecodeMbCabacPSliceBaseMode0 (PWelsDecoderContext pCtx, PWelsNeighAv
WELS_READ_VERIFY (ParseDeltaQpCabac (pCtx, iQpDelta));
if (iQpDelta > 25 || iQpDelta < -26) { //out of iQpDelta range
return ERR_INFO_INVALID_QP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_QP);
}
pCurLayer->pLumaQp[iMbXy] = (pSlice->iLastMbQp + iQpDelta + 52) % 52; //update last_mb_qp
pSlice->iLastMbQp = pCurLayer->pLumaQp[iMbXy];
@ -1276,6 +1273,7 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
const int32_t iMbXy = pCurLayer->iMbXyIndex;
int8_t* pNzc = pCurLayer->pNzc[iMbXy];
int32_t i;
int32_t iRet = ERR_NONE;
uint32_t uiMbType = 0, uiCbp = 0, uiCbpL = 0, uiCbpC = 0;
uint32_t uiCode;
int32_t iCode;
@ -1291,9 +1289,9 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //uiMbType
uiMbType = uiCode;
if (uiMbType > 25)
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (!pCtx->pSps->uiChromaFormatIdc && ((uiMbType >= 5 && uiMbType <= 12) || (uiMbType >= 17 && uiMbType <= 24)))
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (25 == uiMbType) {
int32_t iDecStrideL = pCurLayer->pDec->iLinesize[0];
@ -1345,7 +1343,7 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
memset (pCurLayer->pChromaQp[iMbXy], 0, sizeof (pCurLayer->pChromaQp[iMbXy]));
memset (pNzc, 16, sizeof (pCurLayer->pNzc[iMbXy])); //Rec. 9.2.1 for PCM, nzc=16
WELS_READ_VERIFY (InitReadBits (pBs, 0));
return 0;
return ERR_NONE;
} else if (0 == uiMbType) { //reference to JM
ENFORCE_STACK_ALIGN_1D (int8_t, pIntraPredMode, 48, 16);
pCurLayer->pMbType[iMbXy] = MB_TYPE_INTRA4x4;
@ -1369,9 +1367,9 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
uiCbp = uiCode;
//G.9.1 Alternative parsing process for coded pBlock pattern
if (pCtx->pSps->uiChromaFormatIdc && (uiCbp > 47))
return ERR_INFO_INVALID_CBP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_CBP);
if (!pCtx->pSps->uiChromaFormatIdc && (uiCbp > 15))
return ERR_INFO_INVALID_CBP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_CBP);
if (pCtx->pSps->uiChromaFormatIdc)
uiCbp = g_kuiIntra4x4CbpTable[uiCbp];
@ -1416,7 +1414,7 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
iQpDelta = iCode;
if (iQpDelta > 25 || iQpDelta < -26) { //out of iQpDelta range
return ERR_INFO_INVALID_QP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_QP);
}
pCurLayer->pLumaQp[iMbXy] = (pSlice->iLastMbQp + iQpDelta + 52) % 52; //update last_mb_qp
@ -1432,17 +1430,17 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
if (MB_TYPE_INTRA16x16 == pCurLayer->pMbType[iMbXy]) {
//step1: Luma DC
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 0, 16,
g_kuiLumaDcZigzagScan, I16_LUMA_DC, pCurLayer->pScaledTCoeff[iMbXy], pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 0, 16, g_kuiLumaDcZigzagScan, I16_LUMA_DC,
pCurLayer->pScaledTCoeff[iMbXy], pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
//step2: Luma AC
if (uiCbpL) {
for (i = 0; i < 16; i++) {
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, i,
iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1),
I16_LUMA_AC, pCurLayer->pScaledTCoeff[iMbXy] + (i << 4), pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, i, iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1,
g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1), I16_LUMA_AC, pCurLayer->pScaledTCoeff[iMbXy] + (i << 4),
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
}
ST32A4 (&pNzc[0], LD32 (&pNonZeroCount[1 + 8 * 1]));
@ -1457,10 +1455,10 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
if (uiCbpL & (1 << iId8x8)) {
int32_t iIndex = (iId8x8 << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
if (WelsResidualBlockCavlc8x8 (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - iScanIdxStart + 1, g_kuiZigzagScan8x8 + iScanIdxStart, iMbResProperty,
pCurLayer->pScaledTCoeff[iMbXy] + (iId8x8 << 6), iId4x4, pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;
if ((iRet = WelsResidualBlockCavlc8x8 (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - iScanIdxStart + 1,
g_kuiZigzagScan8x8 + iScanIdxStart, iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iId8x8 << 6), iId4x4,
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;
}
iIndex++;
}
@ -1479,10 +1477,10 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
int32_t iIndex = (iId8x8 << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
//Luma (DC and AC decoding together)
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - iScanIdxStart + 1, g_kuiZigzagScan + iScanIdxStart,
LUMA_DC_AC_INTRA, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4), pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - iScanIdxStart + 1,
g_kuiZigzagScan + iScanIdxStart, LUMA_DC_AC_INTRA, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4),
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
iIndex++;
}
@ -1503,10 +1501,9 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
if (1 == uiCbpC || 2 == uiCbpC) {
for (i = 0; i < 2; i++) { //Cb Cr
iMbResProperty = i ? CHROMA_DC_V : CHROMA_DC_U;
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs,
16 + (i << 2), 4, g_kuiChromaDcScan, iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + 256 + (i << 6),
pCurLayer->pChromaQp[iMbXy][i], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 16 + (i << 2), 4, g_kuiChromaDcScan, iMbResProperty,
pCurLayer->pScaledTCoeff[iMbXy] + 256 + (i << 6), pCurLayer->pChromaQp[iMbXy][i], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
}
}
@ -1517,10 +1514,10 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
iMbResProperty = i ? CHROMA_AC_V : CHROMA_AC_U;
int32_t iIndex = 16 + (i << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1),
iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4), pCurLayer->pChromaQp[iMbXy][i], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - WELS_MAX (iScanIdxStart,
1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1), iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4),
pCurLayer->pChromaQp[iMbXy][i], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
iIndex++;
}
@ -1533,7 +1530,7 @@ int32_t WelsActualDecodeMbCavlcISlice (PWelsDecoderContext pCtx) {
BsEndCavlc (pBs);
}
return 0;
return ERR_NONE;
}
int32_t WelsDecodeMbCavlcISlice (PWelsDecoderContext pCtx, PNalUnit pNalCur, uint32_t& uiEosFlag) {
@ -1572,9 +1569,9 @@ int32_t WelsDecodeMbCavlcISlice (PWelsDecoderContext pCtx, PNalUnit pNalCur, uin
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"WelsDecodeMbCavlcISlice()::::pBs incomplete, iUsedBits:%" PRId64 " > pBs->iBits:%d, MUST stop decoding.",
(int64_t) iUsedBits, pBs->iBits);
return -1;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_BS_INCOMPLETE);
}
return 0;
return ERR_NONE;
}
int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
@ -1593,6 +1590,7 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
const int32_t iMbXy = pCurLayer->iMbXyIndex;
int8_t* pNzc = pCurLayer->pNzc[iMbXy];
int32_t i;
int32_t iRet = ERR_NONE;
uint32_t uiMbType = 0, uiCbp = 0, uiCbpL = 0, uiCbpC = 0;
uint32_t uiCode;
int32_t iCode;
@ -1609,8 +1607,8 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
pCurLayer->pMbType[iMbXy] = g_ksInterMbTypeInfo[uiMbType].iType;
WelsFillCacheInter (&sNeighAvail, pNonZeroCount, iMotionVector, iRefIndex, pCurLayer);
if (ParseInterInfo (pCtx, iMotionVector, iRefIndex, pBs)) {
return -1;//abnormal
if ((iRet = ParseInterInfo (pCtx, iMotionVector, iRefIndex, pBs)) != ERR_NONE) {
return iRet;//abnormal
}
if (pSlice->sSliceHeaderExt.bAdaptiveResidualPredFlag == 1) {
@ -1624,14 +1622,14 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
pCurLayer->pInterPredictionDoneFlag[iMbXy] = 0;
} else {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "residual_pred_flag = 1 not supported.");
return -1;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_UNSUPPORTED_ILP);
}
} else { //intra MB type
uiMbType -= 5;
if (uiMbType > 25)
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (!pCtx->pSps->uiChromaFormatIdc && ((uiMbType >= 5 && uiMbType <= 12) || (uiMbType >= 17 && uiMbType <= 24)))
return ERR_INFO_INVALID_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_TYPE);
if (25 == uiMbType) {
int32_t iDecStrideL = pCurLayer->pDec->iLinesize[0];
@ -1689,7 +1687,7 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
ST32A4 (&pNzc[16], 0x10101010);
ST32A4 (&pNzc[20], 0x10101010);
WELS_READ_VERIFY (InitReadBits (pBs, 0));
return 0;
return ERR_NONE;
} else {
if (0 == uiMbType) {
ENFORCE_STACK_ALIGN_1D (int8_t, pIntraPredMode, 48, 16);
@ -1717,8 +1715,8 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
uiCbpC = pCtx->pSps->uiChromaFormatIdc ? pCurLayer->pCbp[iMbXy] >> 4 : 0;
uiCbpL = pCurLayer->pCbp[iMbXy] & 15;
WelsFillCacheNonZeroCount (&sNeighAvail, pNonZeroCount, pCurLayer);
if (ParseIntra16x16Mode (pCtx, &sNeighAvail, pBs, pCurLayer)) {
return -1;
if ((iRet = ParseIntra16x16Mode (pCtx, &sNeighAvail, pBs, pCurLayer)) != ERR_NONE) {
return iRet;
}
}
}
@ -1729,9 +1727,9 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
uiCbp = uiCode;
{
if (pCtx->pSps->uiChromaFormatIdc && (uiCbp > 47))
return ERR_INFO_INVALID_CBP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_CBP);
if (!pCtx->pSps->uiChromaFormatIdc && (uiCbp > 15))
return ERR_INFO_INVALID_CBP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_CBP);
if (MB_TYPE_INTRA4x4 == pCurLayer->pMbType[iMbXy] || MB_TYPE_INTRA8x8 == pCurLayer->pMbType[iMbXy]) {
uiCbp = pCtx->pSps->uiChromaFormatIdc ? g_kuiIntra4x4CbpTable[uiCbp] : g_kuiIntra4x4CbpTable400[uiCbp];
@ -1779,7 +1777,7 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
iQpDelta = iCode;
if (iQpDelta > 25 || iQpDelta < -26) { //out of iQpDelta range
return ERR_INFO_INVALID_QP;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_QP);
}
pCurLayer->pLumaQp[iMbXy] = (pSlice->iLastMbQp + iQpDelta + 52) % 52; //update last_mb_qp
@ -1794,17 +1792,17 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
if (MB_TYPE_INTRA16x16 == pCurLayer->pMbType[iMbXy]) {
//step1: Luma DC
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 0, 16, g_kuiLumaDcZigzagScan,
I16_LUMA_DC, pCurLayer->pScaledTCoeff[iMbXy], pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 0, 16, g_kuiLumaDcZigzagScan, I16_LUMA_DC,
pCurLayer->pScaledTCoeff[iMbXy], pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
//step2: Luma AC
if (uiCbpL) {
for (i = 0; i < 16; i++) {
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, i,
iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1),
I16_LUMA_AC, pCurLayer->pScaledTCoeff[iMbXy] + (i << 4), pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, i, iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1,
g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1), I16_LUMA_AC, pCurLayer->pScaledTCoeff[iMbXy] + (i << 4),
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
}
ST32A4 (&pNzc[0], LD32 (&pNonZeroCount[1 + 8 * 1]));
@ -1819,10 +1817,10 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
if (uiCbpL & (1 << iId8x8)) {
int32_t iIndex = (iId8x8 << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
if (WelsResidualBlockCavlc8x8 (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - iScanIdxStart + 1, g_kuiZigzagScan8x8 + iScanIdxStart, iMbResProperty,
pCurLayer->pScaledTCoeff[iMbXy] + (iId8x8 << 6), iId4x4, pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;
if ((iRet = WelsResidualBlockCavlc8x8 (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - iScanIdxStart + 1,
g_kuiZigzagScan8x8 + iScanIdxStart, iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iId8x8 << 6), iId4x4,
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;
}
iIndex++;
}
@ -1842,10 +1840,10 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
int32_t iIndex = (iId8x8 << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
//Luma (DC and AC decoding together)
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - iScanIdxStart + 1, g_kuiZigzagScan + iScanIdxStart, iMbResProperty,
pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4), pCurLayer->pLumaQp[iMbXy], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - iScanIdxStart + 1,
g_kuiZigzagScan + iScanIdxStart, iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4),
pCurLayer->pLumaQp[iMbXy], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
iIndex++;
}
@ -1871,10 +1869,9 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
else
iMbResProperty = i ? CHROMA_DC_V_INTER : CHROMA_DC_U_INTER;
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs,
16 + (i << 2), 4, g_kuiChromaDcScan, iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + 256 + (i << 6),
pCurLayer->pChromaQp[iMbXy][i], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, 16 + (i << 2), 4, g_kuiChromaDcScan, iMbResProperty,
pCurLayer->pScaledTCoeff[iMbXy] + 256 + (i << 6), pCurLayer->pChromaQp[iMbXy][i], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
}
} else {
@ -1889,10 +1886,10 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
int32_t iIndex = 16 + (i << 2);
for (iId4x4 = 0; iId4x4 < 4; iId4x4++) {
if (WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex,
iScanIdxEnd - WELS_MAX (iScanIdxStart, 1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1),
iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4), pCurLayer->pChromaQp[iMbXy][i], pCtx)) {
return -1;//abnormal
if ((iRet = WelsResidualBlockCavlc (pVlcTable, pNonZeroCount, pBs, iIndex, iScanIdxEnd - WELS_MAX (iScanIdxStart,
1) + 1, g_kuiZigzagScan + WELS_MAX (iScanIdxStart, 1), iMbResProperty, pCurLayer->pScaledTCoeff[iMbXy] + (iIndex << 4),
pCurLayer->pChromaQp[iMbXy][i], pCtx)) != ERR_NONE) {
return iRet;//abnormal
}
iIndex++;
}
@ -1905,7 +1902,7 @@ int32_t WelsActualDecodeMbCavlcPSlice (PWelsDecoderContext pCtx) {
BsEndCavlc (pBs);
}
return 0;
return ERR_NONE;
}
int32_t WelsDecodeMbCavlcPSlice (PWelsDecoderContext pCtx, PNalUnit pNalCur, uint32_t& uiEosFlag) {
@ -1928,7 +1925,7 @@ int32_t WelsDecodeMbCavlcPSlice (PWelsDecoderContext pCtx, PNalUnit pNalCur, uin
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //mb_skip_run
pSlice->iMbSkipRun = uiCode;
if (-1 == pSlice->iMbSkipRun) {
return -1;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_MB_SKIP_RUN);
}
}
if (pSlice->iMbSkipRun--) {
@ -1995,9 +1992,9 @@ int32_t WelsDecodeMbCavlcPSlice (PWelsDecoderContext pCtx, PNalUnit pNalCur, uin
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"WelsDecodeMbCavlcISlice()::::pBs incomplete, iUsedBits:%" PRId64 " > pBs->iBits:%d, MUST stop decoding.",
(int64_t) iUsedBits, pBs->iBits);
return -1;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_BS_INCOMPLETE);
}
return 0;
return ERR_NONE;
}
void WelsBlockFuncInit (SBlockFunc* pFunc, int32_t iCpu) {
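
Review note: the hunks above replace bare `return -1;` with either a forwarded callee result (`if ((iRet = ...) != ERR_NONE) return iRet;`) or a code built with `GENERATE_ERROR_NO (level, info)`. The sketch below illustrates that pattern in isolation. The diff does not show how `GENERATE_ERROR_NO` is defined, so the bit layout here (level in the high 16 bits, detail code in the low 16 bits) is an assumption, and `MakeError`, `ErrorLevel`, `ErrorInfo`, `CheckQpDelta`, `DecodeMb` and the numeric constants are hypothetical names and values chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants; the real values live in the decoder's error header.
constexpr int32_t kErrNone       = 0;
constexpr int32_t kLevelMbData   = 7;   // assumed value, for illustration
constexpr int32_t kInfoInvalidQp = 42;  // assumed value, for illustration

// Assumed layout: error level in the high 16 bits, detail code in the low 16.
constexpr int32_t MakeError (int32_t iLevel, int32_t iInfo) {
  return (iLevel << 16) | (iInfo & 0xFFFF);
}
constexpr int32_t ErrorLevel (int32_t iErr) { return iErr >> 16; }
constexpr int32_t ErrorInfo (int32_t iErr)  { return iErr & 0xFFFF; }

// Leaf check: same range test as the iQpDelta check in the diff,
// returning a structured code instead of a bare detail value.
int32_t CheckQpDelta (int32_t iQpDelta) {
  if (iQpDelta > 25 || iQpDelta < -26)
    return MakeError (kLevelMbData, kInfoInvalidQp);
  return kErrNone;
}

// Caller: forwards the callee's code unchanged instead of collapsing it to -1.
int32_t DecodeMb (int32_t iQpDelta) {
  int32_t iRet = kErrNone;
  if ((iRet = CheckQpDelta (iQpDelta)) != kErrNone)
    return iRet;
  return kErrNone;
}
```

With this shape a top-level caller can still tell which stage failed (`ErrorLevel`) and why (`ErrorInfo`), which a uniform `-1` cannot convey.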


@ -64,7 +64,7 @@ static int32_t CreatePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, cons
PPicBuff pPicBuf = NULL;
int32_t iPicIdx = 0;
if (kiSize <= 0 || kiPicWidth <= 0 || kiPicHeight <= 0) {
return 1;
return ERR_INFO_INVALID_PARAM;
}
CMemoryAlign* pMa = pCtx->pMemAlign;
@ -72,7 +72,7 @@ static int32_t CreatePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, cons
pPicBuf = (PPicBuff)pMa->WelsMallocz (sizeof (SPicBuff), "PPicBuff");
if (NULL == pPicBuf) {
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
pPicBuf->ppPic = (PPicture*)pMa->WelsMallocz (kiSize * sizeof (PPicture), "PPicture*");
@ -80,7 +80,7 @@ static int32_t CreatePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, cons
if (NULL == pPicBuf->ppPic) {
pPicBuf->iCapacity = 0;
DestroyPicBuff (&pPicBuf, pMa);
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
for (iPicIdx = 0; iPicIdx < kiSize; ++ iPicIdx) {
@ -89,7 +89,7 @@ static int32_t CreatePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, cons
// init capacity first for free memory
pPicBuf->iCapacity = iPicIdx;
DestroyPicBuff (&pPicBuf, pMa);
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
pPicBuf->ppPic[iPicIdx] = pPic;
}
@ -99,7 +99,7 @@ static int32_t CreatePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, cons
pPicBuf->iCurrentIdx = 0;
* ppPicBuf = pPicBuf;
return 0;
return ERR_NONE;
}
static int32_t IncreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, const int32_t kiOldSize,
@ -108,14 +108,14 @@ static int32_t IncreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
PPicBuff pPicNewBuf = NULL;
int32_t iPicIdx = 0;
if (kiOldSize <= 0 || kiNewSize <= 0 || kiPicWidth <= 0 || kiPicHeight <= 0) {
return 1;
return ERR_INFO_INVALID_PARAM;
}
CMemoryAlign* pMa = pCtx->pMemAlign;
pPicNewBuf = (PPicBuff)pMa->WelsMallocz (sizeof (SPicBuff), "PPicBuff");
if (NULL == pPicNewBuf) {
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
pPicNewBuf->ppPic = (PPicture*)pMa->WelsMallocz (kiNewSize * sizeof (PPicture), "PPicture*");
@ -123,7 +123,7 @@ static int32_t IncreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
if (NULL == pPicNewBuf->ppPic) {
pPicNewBuf->iCapacity = 0;
DestroyPicBuff (&pPicNewBuf, pMa);
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
// increase new PicBuf
@ -133,7 +133,7 @@ static int32_t IncreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
// Set maximum capacity as the new malloc memory at the tail
pPicNewBuf->iCapacity = iPicIdx;
DestroyPicBuff (&pPicNewBuf, pMa);
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
pPicNewBuf->ppPic[iPicIdx] = pPic;
}
@ -162,7 +162,7 @@ static int32_t IncreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
pPicOldBuf->iCurrentIdx = 0;
pMa->WelsFree (pPicOldBuf, "pPicOldBuf");
pPicOldBuf = NULL;
return 0;
return ERR_NONE;
}
static int32_t DecreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, const int32_t kiOldSize,
@ -171,7 +171,7 @@ static int32_t DecreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
PPicBuff pPicNewBuf = NULL;
int32_t iPicIdx = 0;
if (kiOldSize <= 0 || kiNewSize <= 0 || kiPicWidth <= 0 || kiPicHeight <= 0) {
return 1;
return ERR_INFO_INVALID_PARAM;
}
CMemoryAlign* pMa = pCtx->pMemAlign;
@ -179,7 +179,7 @@ static int32_t DecreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
pPicNewBuf = (PPicBuff)pMa->WelsMallocz (sizeof (SPicBuff), "PPicBuff");
if (NULL == pPicNewBuf) {
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
pPicNewBuf->ppPic = (PPicture*)pMa->WelsMallocz (kiNewSize * sizeof (PPicture), "PPicture*");
@ -187,7 +187,7 @@ static int32_t DecreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
if (NULL == pPicNewBuf->ppPic) {
pPicNewBuf->iCapacity = 0;
DestroyPicBuff (&pPicNewBuf, pMa);
return 1;
return ERR_INFO_OUT_OF_MEMORY;
}
int32_t iPrevPicIdx = -1;
@ -239,7 +239,7 @@ static int32_t DecreasePicBuff (PWelsDecoderContext pCtx, PPicBuff* ppPicBuf, co
pMa->WelsFree (pPicOldBuf, "pPicOldBuf");
pPicOldBuf = NULL;
return 0;
return ERR_NONE;
}
void DestroyPicBuff (PPicBuff* ppPicBuf, CMemoryAlign* pMa) {
@ -272,18 +272,16 @@ void DestroyPicBuff (PPicBuff* ppPicBuf, CMemoryAlign* pMa) {
pPicBuf = NULL;
*ppPicBuf = NULL;
}
/*
* fill data fields in default for decoder context
*/
void WelsDecoderDefaults (PWelsDecoderContext pCtx, SLogContext* pLogCtx, CMemoryAlign* pMa) {
void WelsDecoderDefaults (PWelsDecoderContext pCtx, SLogContext* pLogCtx) {
int32_t iCpuCores = 1;
memset (pCtx, 0, sizeof (SWelsDecoderContext)); // fill zero first
pCtx->sLogCtx = *pLogCtx;
pCtx->pMemAlign = pMa;
pCtx->pArgDec = NULL;
pCtx->eOutputColorFormat = videoFormatI420; // yuv in default
pCtx->bHaveGotMemory = false; // not ever request memory blocks for decoder context related
pCtx->uiCpuFlag = 0;
@ -358,13 +356,15 @@ static inline int32_t GetTargetRefListSize (PWelsDecoderContext pCtx) {
/*
* request memory blocks for decoder avc part
*/
int32_t WelsRequestMem (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const int32_t kiMbHeight) {
int32_t WelsRequestMem (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const int32_t kiMbHeight,
bool& bReallocFlag) {
const int32_t kiPicWidth = kiMbWidth << 4;
const int32_t kiPicHeight = kiMbHeight << 4;
int32_t iErr = ERR_NONE;
int32_t iListIdx = 0; //, mb_blocks = 0;
int32_t iPicQueueSize = 0; // adaptive size of picture queue, = (pSps->iNumRefFrames x 2)
bReallocFlag = false;
bool bNeedChangePicQueue = true;
CMemoryAlign* pMa = pCtx->pMemAlign;
@ -434,22 +434,26 @@ int32_t WelsRequestMem (PWelsDecoderContext pCtx, const int32_t kiMbWidth, const
if (pCtx->pCabacDecEngine == NULL)
pCtx->pCabacDecEngine = (SWelsCabacDecEngine*) pMa->WelsMallocz (sizeof (SWelsCabacDecEngine), "pCtx->pCabacDecEngine");
WELS_VERIFY_RETURN_IF (ERR_INFO_OUT_OF_MEMORY, (NULL == pCtx->pCabacDecEngine))
bReallocFlag = true; // memory re-allocation successfully finished
return ERR_NONE;
}
/*
* free memory blocks in avc
* free memory dynamically allocated during decoder
*/
void WelsFreeMem (PWelsDecoderContext pCtx) {
void WelsFreeDynamicMemory (PWelsDecoderContext pCtx) {
int32_t iListIdx = 0;
CMemoryAlign* pMa = pCtx->pMemAlign;
/* TODO: free memory blocks introduced in avc */
//free dq layer memory
UninitialDqLayersContext (pCtx);
//free FMO memory
ResetFmoList (pCtx);
//free ref-pic list & picture memory
WelsResetRefPic (pCtx);
// for sPicBuff
for (iListIdx = LIST_0; iListIdx < LIST_A; ++ iListIdx) {
PPicBuff* pPicBuff = &pCtx->pPicBuff[iListIdx];
if (NULL != pPicBuff && NULL != *pPicBuff) {
@ -464,6 +468,8 @@ void WelsFreeMem (PWelsDecoderContext pCtx) {
pCtx->iLastImgHeightInPixel = 0;
pCtx->bFreezeOutput = true;
pCtx->bHaveGotMemory = false;
//free CABAC memory
pMa->WelsFree (pCtx->pCabacDecEngine, "pCtx->pCabacDecEngine");
}
@ -471,19 +477,15 @@ void WelsFreeMem (PWelsDecoderContext pCtx) {
* \brief Open decoder
*/
int32_t WelsOpenDecoder (PWelsDecoderContext pCtx) {
// function pointers
//initial MC function pointer--
int iRet = ERR_NONE;
InitMcFunc (& (pCtx->sMcFunc), pCtx->uiCpuFlag);
InitExpandPictureFunc (& (pCtx->sExpandPicFunc), pCtx->uiCpuFlag);
AssignFuncPointerForRec (pCtx);
// function pointers
InitDecFuncs (pCtx, pCtx->uiCpuFlag);
// vlc tables
InitVlcTable (&pCtx->sVlcTable);
// startup memory
iRet = WelsInitMemory (pCtx);
// static memory
iRet = WelsInitStaticMemory (pCtx);
if (ERR_NONE != iRet)
return iRet;
@ -503,11 +505,9 @@ int32_t WelsOpenDecoder (PWelsDecoderContext pCtx) {
* \brief Close decoder
*/
void WelsCloseDecoder (PWelsDecoderContext pCtx) {
WelsFreeMem (pCtx);
WelsFreeDynamicMemory (pCtx);
WelsFreeMemory (pCtx);
UninitialDqLayersContext (pCtx);
WelsFreeStaticMemory (pCtx);
#ifdef LONG_TERM_REF
pCtx->bParamSetsLostFlag = false;
@ -523,25 +523,20 @@ void WelsCloseDecoder (PWelsDecoderContext pCtx) {
*/
int32_t DecoderConfigParam (PWelsDecoderContext pCtx, const SDecodingParam* kpParam) {
if (NULL == pCtx || NULL == kpParam)
return 1;
CMemoryAlign* pMa = pCtx->pMemAlign;
pCtx->pParam = (SDecodingParam*)pMa->WelsMallocz (sizeof (SDecodingParam), "SDecodingParam");
if (NULL == pCtx->pParam)
return 1;
return ERR_INFO_INVALID_PARAM;
memcpy (pCtx->pParam, kpParam, sizeof (SDecodingParam));
pCtx->eOutputColorFormat = pCtx->pParam->eOutputColorFormat;
if (!pCtx->bParseOnly) {
int32_t iRet = DecoderSetCsp (pCtx, pCtx->pParam->eOutputColorFormat);
if (iRet)
return iRet;
if ((pCtx->pParam->eEcActiveIdc > ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE)
|| (pCtx->pParam->eEcActiveIdc < ERROR_CON_DISABLE)) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"eErrorConMethod (%d) not in range: (%d - %d). Set as default value: (%d).", pCtx->pParam->eEcActiveIdc,
ERROR_CON_DISABLE, ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE,
ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE);
pCtx->pParam->eEcActiveIdc = ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE;
}
pCtx->eErrorConMethod = pCtx->pParam->eEcActiveIdc;
if (pCtx->bParseOnly) //parse only, disable EC method
if (pCtx->pParam->bParseOnly) //parse only, disable EC method
pCtx->eErrorConMethod = ERROR_CON_DISABLE;
InitErrorCon (pCtx);
@ -554,7 +549,7 @@ int32_t DecoderConfigParam (PWelsDecoderContext pCtx, const SDecodingParam* kpPa
WelsLog (& (pCtx->sLogCtx), WELS_LOG_INFO, "eVideoType: %d", pCtx->eVideoType);
return 0;
return ERR_NONE;
}
/*!
@ -569,15 +564,11 @@ int32_t DecoderConfigParam (PWelsDecoderContext pCtx, const SDecodingParam* kpPa
* \note N/A
*************************************************************************************
*/
int32_t WelsInitDecoder (PWelsDecoderContext pCtx, const bool bParseOnly, SLogContext* pLogCtx) {
int32_t WelsInitDecoder (PWelsDecoderContext pCtx, SLogContext* pLogCtx) {
if (pCtx == NULL) {
return ERR_INFO_INVALID_PTR;
}
// default
WelsDecoderDefaults (pCtx, pLogCtx, pCtx->pMemAlign);
pCtx->bParseOnly = bParseOnly;
// open decoder
return WelsOpenDecoder (pCtx);
}
@ -654,7 +645,7 @@ int32_t WelsDecodeBs (PWelsDecoderContext pCtx, const uint8_t* kpBsBuf, const in
pRawData->pCurPos = pRawData->pHead;
}
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
pSavedData = &pCtx->sSavedData;
if ((kiBsLen + 4) > (pSavedData->pEnd - pSavedData->pCurPos)) {
pSavedData->pCurPos = pSavedData->pHead;
@ -664,11 +655,23 @@ int32_t WelsDecodeBs (PWelsDecoderContext pCtx, const uint8_t* kpBsBuf, const in
//0x03 removal and extract all of NAL Unit from current raw data
pDstNal = pRawData->pCurPos;
bool bNalStartBytes = false;
while (iSrcConsumed < iSrcLength) {
if ((2 + iSrcConsumed < iSrcLength) &&
(0 == LD16 (pSrcNal + iSrcIdx)) &&
((pSrcNal[2 + iSrcIdx] == 0x03) || (pSrcNal[2 + iSrcIdx] == 0x01))) {
if (pSrcNal[2 + iSrcIdx] == 0x03) {
if ((2 + iSrcConsumed < iSrcLength) && (0 == LD16 (pSrcNal + iSrcIdx)) && (pSrcNal[2 + iSrcIdx] <= 0x03)) {
if (bNalStartBytes && (pSrcNal[2 + iSrcIdx] != 0x00 && pSrcNal[2 + iSrcIdx] != 0x01)) {
pCtx->iErrorCode |= dsBitstreamError;
return pCtx->iErrorCode;
}
if (pSrcNal[2 + iSrcIdx] == 0x02) {
pCtx->iErrorCode |= dsBitstreamError;
return pCtx->iErrorCode;
} else if (pSrcNal[2 + iSrcIdx] == 0x00) {
pDstNal[iDstIdx++] = pSrcNal[iSrcIdx++];
iSrcConsumed++;
bNalStartBytes = true;
} else if (pSrcNal[2 + iSrcIdx] == 0x03) {
if ((3 + iSrcConsumed < iSrcLength) && pSrcNal[3 + iSrcIdx] > 0x03) {
pCtx->iErrorCode |= dsBitstreamError;
return pCtx->iErrorCode;
@ -678,7 +681,8 @@ int32_t WelsDecodeBs (PWelsDecoderContext pCtx, const uint8_t* kpBsBuf, const in
iSrcIdx += 3;
iSrcConsumed += 3;
}
} else {
} else { // 0x01
bNalStartBytes = false;
iConsumedBytes = 0;
pDstNal[iDstIdx] = pDstNal[iDstIdx + 1] = pDstNal[iDstIdx + 2] = pDstNal[iDstIdx + 3] =
@ -798,29 +802,6 @@ int32_t WelsDecodeBs (PWelsDecoderContext pCtx, const uint8_t* kpBsBuf, const in
return pCtx->iErrorCode;
}
/*
* set colorspace format in decoder
*/
int32_t DecoderSetCsp (PWelsDecoderContext pCtx, const int32_t kiColorFormat) {
WELS_VERIFY_RETURN_IF (1, (NULL == pCtx));
pCtx->eOutputColorFormat = (EVideoFormatType) kiColorFormat;
if (pCtx->pParam != NULL) {
pCtx->pParam->eOutputColorFormat = (EVideoFormatType) kiColorFormat;
}
//For now, support only videoFormatI420!
if (kiColorFormat == (int32_t) videoFormatInternal) {
pCtx->pParam->eOutputColorFormat = pCtx->eOutputColorFormat = videoFormatI420;
} else if (kiColorFormat != (int32_t) videoFormatI420) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "Support I420 output only for now! Change to I420...");
pCtx->pParam->eOutputColorFormat = pCtx->eOutputColorFormat = videoFormatI420;
return cmUnsupportedData;
}
return 0;
}
/*!
* \brief make sure synchronization of picture resolution (get from slice header) among different parts (i.e, memory related and so on)
* over decoder internal
@ -835,7 +816,8 @@ int32_t SyncPictureResolutionExt (PWelsDecoderContext pCtx, const int32_t kiMbWi
const int32_t kiPicWidth = kiMbWidth << 4;
const int32_t kiPicHeight = kiMbHeight << 4;
iErr = WelsRequestMem (pCtx, kiMbWidth, kiMbHeight); // common memory used
bool bReallocFlag = false;
iErr = WelsRequestMem (pCtx, kiMbWidth, kiMbHeight, bReallocFlag); // common memory used
if (ERR_NONE != iErr) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING,
"SyncPictureResolutionExt()::WelsRequestMem--buffer allocated failure.");
@ -850,13 +832,39 @@ int32_t SyncPictureResolutionExt (PWelsDecoderContext pCtx, const int32_t kiMbWi
pCtx->iErrorCode = dsOutOfMemory;
}
#if defined(MEMORY_MONITOR)
WelsLog (& (pCtx->sLogCtx), WELS_LOG_INFO, "SyncPictureResolutionExt(), overall memory usage: %llu bytes",
static_cast<unsigned long long> (sizeof (SWelsDecoderContext) + pCtx->pMemAlign->WelsGetMemoryUsage()));
if (bReallocFlag) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_INFO, "SyncPictureResolutionExt(), overall memory usage: %llu bytes",
static_cast<unsigned long long> (sizeof (SWelsDecoderContext) + pCtx->pMemAlign->WelsGetMemoryUsage()));
}
#endif//MEMORY_MONITOR
return iErr;
}
void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
void InitDecFuncs (PWelsDecoderContext pCtx, uint32_t uiCpuFlag) {
WelsBlockFuncInit (&pCtx->sBlockFunc, uiCpuFlag);
InitPredFunc (pCtx, uiCpuFlag);
InitMcFunc (& (pCtx->sMcFunc), uiCpuFlag);
InitExpandPictureFunc (& (pCtx->sExpandPicFunc), uiCpuFlag);
DeblockingInit (&pCtx->sDeblockingFunc, uiCpuFlag);
}
namespace {
template<void pfIdctResAddPred (uint8_t* pPred, int32_t iStride, int16_t* pRs)>
void IdctFourResAddPred_ (uint8_t* pPred, int32_t iStride, int16_t* pRs, const int8_t* pNzc) {
if (pNzc[0] || pRs[0 * 16])
pfIdctResAddPred (pPred + 0 * iStride + 0, iStride, pRs + 0 * 16);
if (pNzc[1] || pRs[1 * 16])
pfIdctResAddPred (pPred + 0 * iStride + 4, iStride, pRs + 1 * 16);
if (pNzc[4] || pRs[2 * 16])
pfIdctResAddPred (pPred + 4 * iStride + 0, iStride, pRs + 2 * 16);
if (pNzc[5] || pRs[3 * 16])
pfIdctResAddPred (pPred + 4 * iStride + 4, iStride, pRs + 3 * 16);
}
} // anon ns
void InitPredFunc (PWelsDecoderContext pCtx, uint32_t uiCpuFlag) {
pCtx->pGetI16x16LumaPredFunc[I16_PRED_V ] = WelsI16x16LumaPredV_c;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_H ] = WelsI16x16LumaPredH_c;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_DC ] = WelsI16x16LumaPredDc_c;
@ -904,12 +912,14 @@ void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
pCtx->pGetIChromaPredFunc[C_PRED_DC_128] = WelsIChromaPredDcNA_c;
pCtx->pIdctResAddPredFunc = IdctResAddPred_c;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_<IdctResAddPred_c>;
pCtx->pIdctResAddPredFunc8x8 = IdctResAddPred8x8_c;
#if defined(HAVE_NEON)
if (pCtx->uiCpuFlag & WELS_CPU_NEON) {
if (uiCpuFlag & WELS_CPU_NEON) {
pCtx->pIdctResAddPredFunc = IdctResAddPred_neon;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_<IdctResAddPred_neon>;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_DC] = WelsDecoderI16x16LumaPredDc_neon;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_P] = WelsDecoderI16x16LumaPredPlane_neon;
@ -933,8 +943,9 @@ void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
#endif//HAVE_NEON
#if defined(HAVE_NEON_AARCH64)
if (pCtx->uiCpuFlag & WELS_CPU_NEON) {
if (uiCpuFlag & WELS_CPU_NEON) {
pCtx->pIdctResAddPredFunc = IdctResAddPred_AArch64_neon;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_<IdctResAddPred_AArch64_neon>;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_DC] = WelsDecoderI16x16LumaPredDc_AArch64_neon;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_P] = WelsDecoderI16x16LumaPredPlane_AArch64_neon;
@ -963,8 +974,9 @@ void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
#endif//HAVE_NEON_AARCH64
#if defined(X86_ASM)
if (pCtx->uiCpuFlag & WELS_CPU_MMXEXT) {
if (uiCpuFlag & WELS_CPU_MMXEXT) {
pCtx->pIdctResAddPredFunc = IdctResAddPred_mmx;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_<IdctResAddPred_mmx>;
///////mmx code opt---
pCtx->pGetIChromaPredFunc[C_PRED_H] = WelsDecoderIChromaPredH_mmx;
@ -978,8 +990,10 @@ void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
pCtx->pGetI4x4LumaPredFunc[I4_PRED_DDL] = WelsDecoderI4x4LumaPredDDL_mmx;
pCtx->pGetI4x4LumaPredFunc[I4_PRED_VL ] = WelsDecoderI4x4LumaPredVL_mmx;
}
if (pCtx->uiCpuFlag & WELS_CPU_SSE2) {
/////////sse2 code opt---
if (uiCpuFlag & WELS_CPU_SSE2) {
pCtx->pIdctResAddPredFunc = IdctResAddPred_sse2;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_<IdctResAddPred_sse2>;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_DC] = WelsDecoderI16x16LumaPredDc_sse2;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_P] = WelsDecoderI16x16LumaPredPlane_sse2;
pCtx->pGetI16x16LumaPredFunc[I16_PRED_H] = WelsDecoderI16x16LumaPredH_sse2;
@ -991,10 +1005,14 @@ void AssignFuncPointerForRec (PWelsDecoderContext pCtx) {
pCtx->pGetIChromaPredFunc[C_PRED_DC_T] = WelsDecoderIChromaPredDcTop_sse2;
pCtx->pGetI4x4LumaPredFunc[I4_PRED_H] = WelsDecoderI4x4LumaPredH_sse2;
}
#if defined(HAVE_AVX2)
if (uiCpuFlag & WELS_CPU_AVX2) {
pCtx->pIdctResAddPredFunc = IdctResAddPred_avx2;
pCtx->pIdctFourResAddPredFunc = IdctFourResAddPred_avx2;
}
#endif
DeblockingInit (&pCtx->sDeblockingFunc, pCtx->uiCpuFlag);
WelsBlockFuncInit (&pCtx->sBlockFunc, pCtx->uiCpuFlag);
#endif
}
//reset decoder number related statistics info
@ -1028,10 +1046,15 @@ void UpdateDecStatNoFreezingInfo (PWelsDecoderContext pCtx) {
//update QP info
int32_t iTotalQp = 0;
const int32_t kiMbNum = pCurDq->iMbWidth * pCurDq->iMbHeight;
int32_t iCorrectMbNum = 0;
for (int32_t iMb = 0; iMb < kiMbNum; ++iMb) {
iCorrectMbNum += (int32_t) pCurDq->pMbCorrectlyDecodedFlag[iMb];
iTotalQp += pCurDq->pLumaQp[iMb] * pCurDq->pMbCorrectlyDecodedFlag[iMb];
}
iTotalQp /= kiMbNum;
if (iCorrectMbNum == 0) //no MB is correct, should keep the previous QP statistic info
iTotalQp = pDecStat->iAvgLumaQp;
else
iTotalQp /= iCorrectMbNum;
if (pDecStat->uiDecodedFrameCount + 1 == 0) { //maximum uint32_t reached
ResetDecStatNums (pDecStat);
pDecStat->iAvgLumaQp = iTotalQp;
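
Review note: the `UpdateDecStatNoFreezingInfo` hunk above changes the QP average so that it divides by the number of correctly decoded macroblocks rather than by the total macroblock count, and keeps the previous average when no macroblock was decoded correctly. A minimal standalone sketch of that calculation follows; `AverageLumaQp` and its parameters are hypothetical names, and the use of `std::vector` is an illustration choice, not the decoder's data layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average the luma QP over correctly decoded macroblocks only.
// Assumes kCorrect has at least as many entries as kQp.
// Returns iPrevAvg unchanged when no macroblock is marked correct,
// which also avoids a division by zero.
int32_t AverageLumaQp (const std::vector<int8_t>& kQp,
                       const std::vector<bool>& kCorrect,
                       int32_t iPrevAvg) {
  int32_t iTotalQp = 0;
  int32_t iCorrectMbNum = 0;
  for (std::size_t iMb = 0; iMb < kQp.size(); ++iMb) {
    if (kCorrect[iMb]) {
      iTotalQp += kQp[iMb];
      ++iCorrectMbNum;
    }
  }
  if (iCorrectMbNum == 0)
    return iPrevAvg;
  return iTotalQp / iCorrectMbNum;
}
```

Dividing by the total count, as the old line did, pulls the average toward zero whenever some macroblocks are concealed, because their QP contributes nothing to the sum but still counts in the denominator.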


@ -72,7 +72,7 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
}
}
if (pCtx->bParseOnly) { //should exit for parse only to prevent access NULL pDstInfo
if (pCtx->pParam->bParseOnly) { //should exit for parse only to prevent access NULL pDstInfo
PAccessUnit pCurAu = pCtx->pAccessUnitList;
if (dsErrorFree == pCtx->iErrorCode) { //correct decoding, add to data buffer
SParserBsInfo* pParser = pCtx->pParserBsInfo;
@ -133,7 +133,7 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
pCtx->pDec->bIsComplete = false;
pCtx->bFrameFinish = false; //current frame not finished
pCtx->iErrorCode |= dsFramePending;
return -1;
return ERR_INFO_PARSEONLY_PENDING;
//pCtx->pParserBsInfo->iNalNum = 0;
}
} else { //error
@ -141,9 +141,9 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
pCtx->pParserBsInfo->iNalNum = 0;
pCtx->pParserBsInfo->iSpsWidthInPixel = 0;
pCtx->pParserBsInfo->iSpsHeightInPixel = 0;
return -1;
return ERR_INFO_PARSEONLY_ERROR;
}
return 0;
return ERR_NONE;
}
if (pCtx->iTotalNumMbRec != kiTotalNumMbInCurLayer) {
@ -152,7 +152,7 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
pCtx->iTotalNumMbRec, kiTotalNumMbInCurLayer, pCurDq->iMbWidth, pCurDq->iMbHeight);
bFrameCompleteFlag = false; //return later after output buffer is done
if (pCtx->bInstantDecFlag) //no-delay decoding, wait for new slice
return -1;
return ERR_INFO_MB_NUM_INADEQUATE;
} else if (pCurDq->sLayerInfo.sNalHeaderExt.bIdrFlag
&& (pCtx->iErrorCode == dsErrorFree)) { //complete non-ECed IDR frame done
pCtx->pDec->bIsComplete = true;
@ -193,7 +193,7 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
if (pDstInfo->iBufferStatus == 0) {
if (!bFrameCompleteFlag)
pCtx->iErrorCode |= dsBitstreamError;
return -1;
return ERR_INFO_MB_NUM_INADEQUATE;
}
if (pCtx->bFreezeOutput) {
pDstInfo->iBufferStatus = 0;
@ -206,7 +206,7 @@ static inline int32_t DecodeFrameConstruction (PWelsDecoderContext pCtx, uint8_t
pCtx->iMbEcedPropNum = pPic->iMbEcedPropNum;
UpdateDecStat (pCtx, pDstInfo->iBufferStatus != 0);
return 0;
return ERR_NONE;
}
inline bool CheckSliceNeedReconstruct (uint8_t uiLayerDqId, uint8_t uiTargetDqId) {
@ -467,7 +467,7 @@ int32_t InitBsBuffer (PWelsDecoderContext pCtx) {
}
pCtx->sRawData.pStartPos = pCtx->sRawData.pCurPos = pCtx->sRawData.pHead;
pCtx->sRawData.pEnd = pCtx->sRawData.pHead + pCtx->iMaxBsBufferSizeInByte;
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
pCtx->pParserBsInfo = static_cast<SParserBsInfo*> (pMa->WelsMallocz (sizeof (SParserBsInfo), "pCtx->pParserBsInfo"));
if (pCtx->pParserBsInfo == NULL) {
return ERR_INFO_OUT_OF_MEMORY;
@ -538,14 +538,14 @@ int32_t CheckBsBuffer (PWelsDecoderContext pCtx, const int32_t kiSrcLen) {
}
/*
* WelsInitMemory
* WelsInitStaticMemory
* Memory request for new introduced data
* Especially for:
* rbsp_au_buffer, cur_dq_layer_ptr and ref_dq_layer_ptr in MB info cache.
* return:
* 0 - success; otherwise returned error_no defined in error_no.h.
*/
int32_t WelsInitMemory (PWelsDecoderContext pCtx) {
int32_t WelsInitStaticMemory (PWelsDecoderContext pCtx) {
if (pCtx == NULL) {
return ERR_INFO_INVALID_PTR;
}
@ -563,22 +563,16 @@ int32_t WelsInitMemory (PWelsDecoderContext pCtx) {
}
/*
* WelsFreeMemory
* Free memory introduced in WelsInitMemory at destruction of decoder.
* WelsFreeStaticMemory
* Free memory introduced in WelsInitStaticMemory at destruction of decoder.
*
*/
void WelsFreeMemory (PWelsDecoderContext pCtx) {
void WelsFreeStaticMemory (PWelsDecoderContext pCtx) {
if (pCtx == NULL)
return;
CMemoryAlign* pMa = pCtx->pMemAlign;
if (NULL != pCtx->pParam) {
pMa->WelsFree (pCtx->pParam, "pCtx->pParam");
pCtx->pParam = NULL;
}
MemFreeNalList (&pCtx->pAccessUnitList, pMa);
if (pCtx->sRawData.pHead) {
@ -588,7 +582,7 @@ void WelsFreeMemory (PWelsDecoderContext pCtx) {
pCtx->sRawData.pEnd = NULL;
pCtx->sRawData.pStartPos = NULL;
pCtx->sRawData.pCurPos = NULL;
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
if (pCtx->sSavedData.pHead) {
pMa->WelsFree (pCtx->sSavedData.pHead, "pCtx->sSavedData->pHead");
}
@ -605,6 +599,12 @@ void WelsFreeMemory (PWelsDecoderContext pCtx) {
pCtx->pParserBsInfo = NULL;
}
}
if (NULL != pCtx->pParam) {
pMa->WelsFree (pCtx->pParam, "pCtx->pParam");
pCtx->pParam = NULL;
}
}
/*
* DecodeNalHeaderExt
@ -723,8 +723,9 @@ int32_t ParseSliceHeaderSyntaxs (PWelsDecoderContext pCtx, PBitStringAux pBs, co
pSliceHead->eSliceType = static_cast <EWelsSliceType> (uiSliceType);
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //pic_parameter_set_id
WELS_CHECK_SE_UPPER_ERROR (uiCode, (MAX_PPS_COUNT - 1), "iPpsId out of range", GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER,
ERR_INFO_PPS_ID_OVERFLOW));
WELS_CHECK_SE_UPPER_ERROR (uiCode, (MAX_PPS_COUNT - 1), "iPpsId out of range",
GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER,
ERR_INFO_PPS_ID_OVERFLOW));
iPpsId = uiCode;
//add check PPS available here
@ -970,7 +971,7 @@ int32_t ParseSliceHeaderSyntaxs (PWelsDecoderContext pCtx, PBitStringAux pBs, co
if (pSliceHead->uiDisableDeblockingFilterIdc > 6) {
WelsLog (pLogCtx, WELS_LOG_WARNING, "disable_deblock_filter_idc (%d) out of range [0, 6]",
pSliceHead->uiDisableDeblockingFilterIdc);
return ERR_INFO_INVALID_DBLOCKING_IDC;
return GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER, ERR_INFO_INVALID_DBLOCKING_IDC);
}
if (pSliceHead->uiDisableDeblockingFilterIdc != 1) {
WELS_READ_VERIFY (BsGetSe (pBs, &iCode)); //slice_alpha_c0_offset_div2
@ -1017,7 +1018,7 @@ int32_t ParseSliceHeaderSyntaxs (PWelsDecoderContext pCtx, PBitStringAux pBs, co
if (pSliceHeadExt->uiDisableInterLayerDeblockingFilterIdc > 6) {
WelsLog (& (pCtx->sLogCtx), WELS_LOG_WARNING, "disable_inter_layer_deblock_filter_idc (%d) out of range [0, 6]",
pSliceHeadExt->uiDisableInterLayerDeblockingFilterIdc);
return ERR_INFO_INVALID_DBLOCKING_IDC;
return GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER, ERR_INFO_INVALID_DBLOCKING_IDC);
}
if (pSliceHeadExt->uiDisableInterLayerDeblockingFilterIdc != 1) {
WELS_READ_VERIFY (BsGetSe (pBs, &iCode)); //inter_layer_slice_alpha_c0_offset_div2
@ -1246,6 +1247,7 @@ int32_t InitialDqLayersContext (PWelsDecoderContext pCtx, const int32_t kiMaxWid
if (pDq == NULL)
return ERR_INFO_OUT_OF_MEMORY;
pCtx->pDqLayersList[i] = pDq; //to keep consistence with in UninitialDqLayersContext()
memset (pDq, 0, sizeof (SDqLayer));
pCtx->sMb.pMbType[i] = (int16_t*)pMa->WelsMallocz (pCtx->sMb.iMbWidth * pCtx->sMb.iMbHeight * sizeof (int16_t),
@ -1297,7 +1299,6 @@ int32_t InitialDqLayersContext (PWelsDecoderContext pCtx, const int32_t kiMaxWid
"pCtx->sMb.pSliceIdc[]"); // using int32_t for slice_idc, 4/21/2010
pCtx->sMb.pResidualPredFlag[i] = (int8_t*) pMa->WelsMallocz (pCtx->sMb.iMbWidth * pCtx->sMb.iMbHeight * sizeof (int8_t),
"pCtx->sMb.pResidualPredFlag[]");
//pCtx->sMb.pMotionPredFlag[i] = (uint8_t *) pMa->WelsMallocz(pCtx->sMb.iMbWidth * pCtx->sMb.iMbHeight * sizeof(uint8_t), "pCtx->sMb.pMotionPredFlag[]");
pCtx->sMb.pInterPredictionDoneFlag[i] = (int8_t*) pMa->WelsMallocz (pCtx->sMb.iMbWidth * pCtx->sMb.iMbHeight * sizeof (
int8_t), "pCtx->sMb.pInterPredictionDoneFlag[]");
@ -1313,6 +1314,8 @@ int32_t InitialDqLayersContext (PWelsDecoderContext pCtx, const int32_t kiMaxWid
(NULL == pCtx->sMb.pMv[i][0]) ||
(NULL == pCtx->sMb.pRefIndex[i][0]) ||
(NULL == pCtx->sMb.pLumaQp[i]) ||
(NULL == pCtx->sMb.pNoSubMbPartSizeLessThan8x8Flag[i]) ||
(NULL == pCtx->sMb.pTransformSize8x8Flag[i]) ||
(NULL == pCtx->sMb.pChromaQp[i]) ||
(NULL == pCtx->sMb.pMvd[i][0]) ||
(NULL == pCtx->sMb.pCbfDc[i]) ||
@ -1321,6 +1324,7 @@ int32_t InitialDqLayersContext (PWelsDecoderContext pCtx, const int32_t kiMaxWid
(NULL == pCtx->sMb.pScaledTCoeff[i]) ||
(NULL == pCtx->sMb.pIntraPredMode[i]) ||
(NULL == pCtx->sMb.pIntra4x4FinalMode[i]) ||
(NULL == pCtx->sMb.pIntraNxNAvailFlag[i]) ||
(NULL == pCtx->sMb.pChromaPredMode[i]) ||
(NULL == pCtx->sMb.pCbp[i]) ||
(NULL == pCtx->sMb.pSubMbType[i]) ||
@ -1334,7 +1338,6 @@ int32_t InitialDqLayersContext (PWelsDecoderContext pCtx, const int32_t kiMaxWid
memset (pCtx->sMb.pSliceIdc[i], 0xff, (pCtx->sMb.iMbWidth * pCtx->sMb.iMbHeight * sizeof (int32_t)));
pCtx->pDqLayersList[i] = pDq;
++ i;
} while (i < LAYER_NUM_EXCHANGEABLE);
@ -1974,7 +1977,7 @@ int32_t ConstructAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBufferI
if (ERR_NONE != iErr) {
ForceResetCurrentAccessUnit (pCtx->pAccessUnitList);
if (!pCtx->bParseOnly)
if (!pCtx->pParam->bParseOnly)
pDstInfo->iBufferStatus = 0;
pCtx->bNewSeqBegin = pCtx->bNewSeqBegin || pCtx->bNextNewSeqBegin;
pCtx->bNextNewSeqBegin = false; // reset it
@ -2006,7 +2009,7 @@ int32_t ConstructAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBufferI
return iErr;
}
return 0;
return ERR_NONE;
}
static inline void InitDqLayerInfo (PDqLayer pDqLayer, PLayerInfo pLayerInfo, PNalUnit pNalUnit, PPicture pPicDec) {
@ -2250,7 +2253,7 @@ int32_t DecodeCurrentAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBuf
#else
pCtx->bReferenceLostAtT0Flag = true;
#endif
return ERR_INFO_REFERENCE_PIC_LOST;
return GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER, ERR_INFO_REFERENCE_PIC_LOST);
}
}
}
@ -2289,9 +2292,9 @@ int32_t DecodeCurrentAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBuf
}
if (bReconstructSlice) {
if (WelsDecodeConstructSlice (pCtx, pNalCur)) {
if ((iRet = WelsDecodeConstructSlice (pCtx, pNalCur)) != ERR_NONE) {
pCtx->pDec->bIsComplete = false; // reconstruction error, directly set the flag false
return -1;
return iRet;
}
}
if (bAllRefComplete && pCtx->eSliceType != I_SLICE) {
@ -2339,7 +2342,7 @@ int32_t DecodeCurrentAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBuf
if (dq_cur->uiLayerDqId == kuiTargetLayerDqId) {
if (!pCtx->bInstantDecFlag) {
if (!pCtx->bParseOnly) {
if (!pCtx->pParam->bParseOnly) {
//Do error concealment here
if ((NeedErrorCon (pCtx)) && (pCtx->eErrorConMethod != ERROR_CON_DISABLE)) {
ImplementErrorCon (pCtx);
@ -2365,7 +2368,7 @@ int32_t DecodeCurrentAccessUnit (PWelsDecoderContext pCtx, uint8_t** ppDst, SBuf
return iRet;
}
}
if (!pCtx->bParseOnly)
if (!pCtx->pParam->bParseOnly)
ExpandReferencingPicture (pCtx->pDec->pData, pCtx->pDec->iWidthInPixel, pCtx->pDec->iHeightInPixel,
pCtx->pDec->iLinesize,
pCtx->sExpandPicFunc.pfExpandLumaPicture, pCtx->sExpandPicFunc.pfExpandChromaPicture);
@ -2421,7 +2424,7 @@ bool CheckAndFinishLastPic (PWelsDecoderContext pCtx, uint8_t** ppDst, SBufferIn
if (pCtx->sLastNalHdrExt.sNalUnitHeader.uiNalRefIdc > 0) {
MarkECFrameAsRef (pCtx);
}
} else if (pCtx->bParseOnly) { //clear parse only internal data status
} else if (pCtx->pParam->bParseOnly) { //clear parse only internal data status
pCtx->pParserBsInfo->iNalNum = 0;
pCtx->bFrameFinish = true; //clear frame pending status here!
} else {
@ -2478,6 +2481,8 @@ bool CheckRefPicturesComplete (PWelsDecoderContext pCtx) {
}
iRealMbIdx = (pCtx->pPps->uiNumSliceGroups > 1) ? FmoNextMb (pCtx->pFmo, iRealMbIdx) :
(pCtx->pCurDqLayer->sLayerInfo.sSliceInLayer.sSliceHeaderExt.sSliceHeader.iFirstMbInSlice + iMbIdx);
if (iRealMbIdx == -1) //caused by abnormal return of FmoNextMb()
return false;
}
return bAllRefComplete;
}

@ -246,7 +246,7 @@ void DoMbECMvCopy (PWelsDecoderContext pCtx, PPicture pDec, PPicture pRef, int32
iMVs[1] = iFullMVy - (iMbYInPix << 2);
BaseMC (pMCRefMem, iMbXInPix, iMbYInPix, &pCtx->sMcFunc, 16, 16, iMVs);
}
return ;
return;
}
void GetAvilInfoFromCorrectMb (PWelsDecoderContext pCtx) {

@ -40,6 +40,7 @@
#include "fmo.h"
#include "memory_align.h"
#include "error_code.h"
namespace WelsDec {
@ -56,10 +57,11 @@ static inline int32_t FmoGenerateMbAllocMapType0 (PFmo pFmo, PPps pPps) {
int32_t iMbNum = 0;
int32_t i = 0;
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo || NULL == pPps))
WELS_VERIFY_RETURN_IF (ERR_INFO_INVALID_PARAM, (NULL == pFmo || NULL == pPps))
uiNumSliceGroups = pPps->uiNumSliceGroups;
iMbNum = pFmo->iCountMbNum;
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo->pMbAllocMap || iMbNum <= 0 || uiNumSliceGroups > MAX_SLICEGROUP_IDS))
WELS_VERIFY_RETURN_IF (ERR_INFO_INVALID_PARAM, (NULL == pFmo->pMbAllocMap || iMbNum <= 0
|| uiNumSliceGroups > MAX_SLICEGROUP_IDS))
do {
uint8_t uiGroup = 0;
@ -75,7 +77,7 @@ static inline int32_t FmoGenerateMbAllocMapType0 (PFmo pFmo, PPps pPps) {
} while (uiGroup < uiNumSliceGroups && i < iMbNum);
} while (i < iMbNum);
return 0; // well here
return ERR_NONE; // well here
}
/*!
@ -91,18 +93,18 @@ static inline int32_t FmoGenerateMbAllocMapType1 (PFmo pFmo, PPps pPps, const in
uint32_t uiNumSliceGroups = 0;
int32_t iMbNum = 0;
int32_t i = 0;
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo || NULL == pPps))
WELS_VERIFY_RETURN_IF (ERR_INFO_INVALID_PARAM, (NULL == pFmo || NULL == pPps))
uiNumSliceGroups = pPps->uiNumSliceGroups;
iMbNum = pFmo->iCountMbNum;
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo->pMbAllocMap || iMbNum <= 0 || kiMbWidth == 0
|| uiNumSliceGroups > MAX_SLICEGROUP_IDS))
WELS_VERIFY_RETURN_IF (ERR_INFO_INVALID_PARAM, (NULL == pFmo->pMbAllocMap || iMbNum <= 0 || kiMbWidth == 0
|| uiNumSliceGroups > MAX_SLICEGROUP_IDS))
do {
pFmo->pMbAllocMap[i] = (uint8_t) (((i % kiMbWidth) + (((i / kiMbWidth) * uiNumSliceGroups) >> 1)) % uiNumSliceGroups);
++ i;
} while (i < iMbNum);
return 0; // well here
return ERR_NONE; // well here
}
/*!
@ -122,18 +124,18 @@ static inline int32_t FmoGenerateSliceGroup (PFmo pFmo, const PPps kpPps, const
bool bResolutionChanged = false;
// the cases we would not like
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo || NULL == kpPps))
WELS_VERIFY_RETURN_IF (ERR_INFO_INVALID_PARAM, (NULL == pFmo || NULL == kpPps))
iNumMb = pFmo->iCountMbNum;
iNumMb = kiMbWidth * kiMbHeight;
if (0 == iNumMb)
return 1;
return ERR_INFO_INVALID_PARAM;
pMa->WelsFree (pFmo->pMbAllocMap, "_fmo->pMbAllocMap");
pFmo->pMbAllocMap = (uint8_t*)pMa->WelsMallocz (iNumMb * sizeof (uint8_t), "_fmo->pMbAllocMap");
WELS_VERIFY_RETURN_IF (1, (NULL == pFmo->pMbAllocMap)) // out of memory
WELS_VERIFY_RETURN_IF (ERR_INFO_OUT_OF_MEMORY, (NULL == pFmo->pMbAllocMap)) // out of memory
pFmo->iCountMbNum = iNumMb;
@ -142,7 +144,7 @@ static inline int32_t FmoGenerateSliceGroup (PFmo pFmo, const PPps kpPps, const
pFmo->iSliceGroupCount = 1;
return 0;
return ERR_NONE;
}
if (bResolutionChanged || ((int32_t)kpPps->uiSliceGroupMapType != pFmo->iSliceGroupType)
@ -163,7 +165,7 @@ static inline int32_t FmoGenerateSliceGroup (PFmo pFmo, const PPps kpPps, const
iErr = 1;
break;
default:
return 1;
return ERR_INFO_UNSUPPORTED_FMOTYPE;
}
}

@ -40,6 +40,7 @@
*****************************************************************************/
#include "memmgr_nal_unit.h"
#include "memory_align.h"
#include "error_code.h"
namespace WelsDec {
@ -52,7 +53,7 @@ int32_t MemInitNalList (PAccessUnit* ppAu, const uint32_t kuiSize, CMemoryAlign*
const uint32_t kuiCountSize = (kuiSizeAu + kuiSizeNalUnitPtr + kuiSize * kuiSizeNalUnit) * sizeof (uint8_t);
if (kuiSize == 0)
return 1;
return ERR_INFO_INVALID_PARAM;
if (*ppAu != NULL) {
MemFreeNalList (ppAu, pMa);
@ -60,7 +61,7 @@ int32_t MemInitNalList (PAccessUnit* ppAu, const uint32_t kuiSize, CMemoryAlign*
pBase = (uint8_t*)pMa->WelsMallocz (kuiCountSize, "Access Unit");
if (pBase == NULL)
return 1;
return ERR_INFO_OUT_OF_MEMORY;
pPtr = pBase;
*ppAu = (PAccessUnit)pPtr;
pPtr += kuiSizeAu;
@ -79,7 +80,7 @@ int32_t MemInitNalList (PAccessUnit* ppAu, const uint32_t kuiSize, CMemoryAlign*
(*ppAu)->uiEndPos = 0;
(*ppAu)->bCompletedAuFlag = false;
return 0;
return ERR_NONE;
}
int32_t MemFreeNalList (PAccessUnit* ppAu, CMemoryAlign* pMa) {
@ -90,19 +91,19 @@ int32_t MemFreeNalList (PAccessUnit* ppAu, CMemoryAlign* pMa) {
*ppAu = NULL;
}
}
return 0;
return ERR_NONE;
}
int32_t ExpandNalUnitList (PAccessUnit* ppAu, const int32_t kiOrgSize, const int32_t kiExpSize, CMemoryAlign* pMa) {
if (kiExpSize <= kiOrgSize)
return 1;
return ERR_INFO_INVALID_PARAM;
else {
PAccessUnit pTmp = NULL;
int32_t iIdx = 0;
if (MemInitNalList (&pTmp, kiExpSize, pMa)) // request new list with expanding
return 1;
int32_t iRet = ERR_NONE;
if ((iRet = MemInitNalList (&pTmp, kiExpSize, pMa)) != ERR_NONE) // request new list with expanding
return iRet;
do {
memcpy (pTmp->pNalUnitsList[iIdx], (*ppAu)->pNalUnitsList[iIdx], sizeof (SNalUnit)); //confirmed_safe_unsafe_usage
@ -117,7 +118,7 @@ int32_t ExpandNalUnitList (PAccessUnit* ppAu, const int32_t kiOrgSize, const int
MemFreeNalList (ppAu, pMa); // free old list
*ppAu = pTmp;
return 0;
return ERR_NONE;
}
}
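
Note on the hunks above: the bare `return 1` / `return 0` values are replaced by named codes, and a caller now forwards whatever code its callee produced instead of collapsing it to 1. A minimal sketch of that pattern follows. The enum, its values, and both helper functions are hypothetical stand-ins built on `std::vector`, not the decoder's allocator or its real error codes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical codes for illustration; the values are invented.
enum ErrSketch : int32_t { kErrNone = 0, kErrInvalidParam = 1, kErrOutOfMemory = 2 };

// Allocates a zero-filled list and reports why it failed, if it did.
inline int32_t InitList (std::vector<int32_t>* pList, uint32_t uiSize) {
  assert (pList != nullptr);
  if (uiSize == 0)
    return kErrInvalidParam;
  try {
    pList->assign (uiSize, 0);
  } catch (const std::bad_alloc&) {
    return kErrOutOfMemory;
  }
  return kErrNone;
}

// Grows the list; any failure from InitList is passed through unchanged.
inline int32_t ExpandList (std::vector<int32_t>* pList, uint32_t uiNewSize) {
  assert (pList != nullptr);
  if (uiNewSize <= pList->size())
    return kErrInvalidParam;
  std::vector<int32_t> sTmp;
  const int32_t iRet = InitList (&sTmp, uiNewSize);
  if (iRet != kErrNone)
    return iRet;  // forward the callee's code instead of a generic 1
  std::copy (pList->begin(), pList->end(), sTmp.begin());
  pList->swap (sTmp);
  return kErrNone;
}
```

With this shape the caller can tell a bad argument from an allocation failure without extra logging.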

@ -48,7 +48,7 @@ void PredPSkipMvFromNeighbor (PDqLayer pCurLayer, int16_t iMvp[2]) {
int32_t iCurSliceIdc, iTopSliceIdc, iLeftTopSliceIdc, iRightTopSliceIdc, iLeftSliceIdc;
int32_t iLeftTopType, iRightTopType, iTopType, iLeftType;
int32_t iCurX, iCurY, iCurXy, iLeftXy, iTopXy, iLeftTopXy, iRightTopXy = 0;
int32_t iCurX, iCurY, iCurXy, iLeftXy, iTopXy = 0, iLeftTopXy = 0, iRightTopXy = 0;
int8_t iLeftRef;
int8_t iTopRef;

@ -401,7 +401,7 @@ int32_t ParseInterMotionInfoCabac (PWelsDecoderContext pCtx, PWelsNeighAvail pNe
iRef[0] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRef[0]]
@ -427,7 +427,7 @@ int32_t ParseInterMotionInfoCabac (PWelsDecoderContext pCtx, PWelsNeighAvail pNe
iRef[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRef[i]]
@ -457,7 +457,7 @@ int32_t ParseInterMotionInfoCabac (PWelsDecoderContext pCtx, PWelsNeighAvail pNe
iRef[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRef[i]]
@ -485,7 +485,7 @@ int32_t ParseInterMotionInfoCabac (PWelsDecoderContext pCtx, PWelsNeighAvail pNe
for (i = 0; i < 4; i++) {
WELS_READ_VERIFY (ParseSubMBTypeCabac (pCtx, pNeighAvail, uiSubMbType));
if (uiSubMbType >= 4) { //invalid sub_mb_type
return ERR_INFO_INVALID_SUB_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_SUB_MB_TYPE);
}
pCurDqLayer->pSubMbType[iMbXy][i] = g_ksInterSubMbTypeInfo[uiSubMbType].iType;
pSubPartCount[i] = g_ksInterSubMbTypeInfo[uiSubMbType].iPartCount;
@ -505,7 +505,7 @@ int32_t ParseInterMotionInfoCabac (PWelsDecoderContext pCtx, PWelsNeighAvail pNe
pRefIdx[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[pRefIdx[i]]
@ -934,14 +934,16 @@ int32_t ParseResidualBlockCabac (PWelsNeighAvail pNeighAvail, uint8_t* pNonZeroC
} else if (iResProperty == CHROMA_DC_U || iResProperty == CHROMA_DC_V) {
do {
if (pSignificantMap[j] != 0)
sTCoeff[pScanTable[j]] = pCtx->bUseScalingList ? (pSignificantMap[j] * pDeQuantMul[0]) >> 4 :
sTCoeff[pScanTable[j]] = pCtx->bUseScalingList ? (int16_t) ((int64_t)pSignificantMap[j] *
(int64_t)pDeQuantMul[0] >> 4) :
(pSignificantMap[j] * pDeQuantMul[0]);
++j;
} while (j < 16);
} else { //luma ac, chroma ac
do {
if (pSignificantMap[j] != 0)
sTCoeff[pScanTable[j]] = pCtx->bUseScalingList ? (pSignificantMap[j] * pDeQuantMul[pScanTable[j]] >> 4) :
sTCoeff[pScanTable[j]] = pCtx->bUseScalingList ? (int16_t) ((int64_t)pSignificantMap[j] *
(int64_t)pDeQuantMul[pScanTable[j]] >> 4) :
pSignificantMap[j] * pDeQuantMul[pScanTable[j] & 0x07];
++j;
} while (j < 16);
@ -973,7 +975,7 @@ int32_t ParseIPCMInfoCabac (PWelsDecoderContext pCtx) {
RestoreCabacDecEngineToBS (pCabacDecEngine, pBsAux);
intX_t iBytesLeft = pBsAux->pEndBuf - pBsAux->pCurBuf;
if (iBytesLeft < 384) {
return ERR_CABAC_NO_BS_TO_READ;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_CABAC_NO_BS_TO_READ);
}
pPtrSrc = pBsAux->pCurBuf;
for (i = 0; i < 16; i++) { //luma
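
Note on the `ParseResidualBlockCabac` hunk above: the scaled coefficient is now computed with both operands cast to `int64_t` before the multiply, and narrowed to `int16_t` only after the `>> 4`. The sketch below illustrates why the order matters. `ScaleWide` and `ScaleToCoeff` are hypothetical helpers and the operand types are assumptions, not the decoder's declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiplies in 64 bits so that a large level times a
// large multiplier cannot overflow a 32-bit intermediate, then drops
// iShift fractional bits.
inline int64_t ScaleWide (int32_t iLevel, int32_t iMul, int iShift) {
  assert (iShift >= 0 && iShift < 63);
  const int64_t iProduct = static_cast<int64_t> (iLevel) * static_cast<int64_t> (iMul);
  return iProduct >> iShift;
}

// Narrowing to the 16-bit coefficient type happens only after the shift,
// in the same order as the changed lines in the hunk.
inline int16_t ScaleToCoeff (int32_t iLevel, int32_t iMul) {
  return static_cast<int16_t> (ScaleWide (iLevel, iMul, 4));
}
```

For example, `ScaleWide (1 << 20, 1 << 12, 4)` forms a product of 2^32, which does not fit in 32 bits, yet the 64-bit path still yields 2^28.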

@ -518,12 +518,12 @@ int32_t CheckIntra16x16PredMode (uint8_t uiSampleAvail, int8_t* pMode) {
int32_t iTopAvail = uiSampleAvail & 0x01;
if ((*pMode < 0) || (*pMode > MAX_PRED_MODE_ID_I16x16)) {
return ERR_INFO_INVALID_I16x16_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I16x16_PRED_MODE);
}
if (I16_PRED_DC == *pMode) {
if (iLeftAvail && iTopAvail) {
return 0;
return ERR_NONE;
} else if (iLeftAvail) {
*pMode = I16_PRED_DC_L;
} else if (iTopAvail) {
@ -534,10 +534,10 @@ int32_t CheckIntra16x16PredMode (uint8_t uiSampleAvail, int8_t* pMode) {
} else {
bool bModeAvail = CHECK_I16_MODE (*pMode, iLeftAvail, iTopAvail, bLeftTopAvail);
if (0 == bModeAvail) {
return ERR_INFO_INVALID_I16x16_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I16x16_PRED_MODE);
}
}
return 0;
return ERR_NONE;
}
@ -548,7 +548,7 @@ int32_t CheckIntraChromaPredMode (uint8_t uiSampleAvail, int8_t* pMode) {
if (C_PRED_DC == *pMode) {
if (iLeftAvail && iTopAvail) {
return 0;
return ERR_NONE;
} else if (iLeftAvail) {
*pMode = C_PRED_DC_L;
} else if (iTopAvail) {
@ -559,10 +559,10 @@ int32_t CheckIntraChromaPredMode (uint8_t uiSampleAvail, int8_t* pMode) {
} else {
bool bModeAvail = CHECK_CHROMA_MODE (*pMode, iLeftAvail, iTopAvail, bLeftTopAvail);
if (0 == bModeAvail) {
return ERR_INFO_INVALID_I_CHROMA_PRED_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_I_CHROMA_PRED_MODE);
}
}
return 0;
return ERR_NONE;
}
int32_t CheckIntraNxNPredMode (int32_t* pSampleAvail, int8_t* pMode, int32_t iIndex, bool b8x8) {
@ -576,7 +576,7 @@ int32_t CheckIntraNxNPredMode (int32_t* pSampleAvail, int8_t* pMode, int32_t iIn
int8_t iFinalMode;
if ((*pMode < 0) || (*pMode > MAX_PRED_MODE_ID_I4x4)) {
return ERR_INVALID_INTRA4X4_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INVALID_INTRA4X4_MODE);
}
if (I4_PRED_DC == *pMode) {
@ -592,7 +592,7 @@ int32_t CheckIntraNxNPredMode (int32_t* pSampleAvail, int8_t* pMode, int32_t iIn
} else {
bool bModeAvail = CHECK_I4_MODE (*pMode, iLeftAvail, iTopAvail, bLeftTopAvail);
if (0 == bModeAvail) {
return ERR_INVALID_INTRA4X4_MODE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INVALID_INTRA4X4_MODE);
}
iFinalMode = *pMode;
@ -848,13 +848,13 @@ int32_t WelsResidualBlockCavlc (SVlcTable* pVlcTable, uint8_t* pNonZeroCountCach
}
if (0 == uiTotalCoeff) {
pBs->iIndex += iUsedBits;
return 0;
return ERR_NONE;
}
if ((uiTrailingOnes > 3) || (uiTotalCoeff > 16)) { /////////////////check uiTrailingOnes and uiTotalCoeff
return ERR_INFO_CAVLC_INVALID_TOTAL_COEFF_OR_TRAILING_ONES;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_TOTAL_COEFF_OR_TRAILING_ONES);
}
if ((i = CavlcGetLevelVal (iLevel, &sReadBitsCache, uiTotalCoeff, uiTrailingOnes)) == -1) {
return ERR_INFO_CAVLC_INVALID_LEVEL;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_LEVEL);
}
iUsedBits += i;
if (uiTotalCoeff < iMaxNumCoeff) {
@ -864,10 +864,10 @@ int32_t WelsResidualBlockCavlc (SVlcTable* pVlcTable, uint8_t* pNonZeroCountCach
}
if ((iZerosLeft < 0) || ((iZerosLeft + uiTotalCoeff) > iMaxNumCoeff)) {
return ERR_INFO_CAVLC_INVALID_ZERO_LEFT;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_ZERO_LEFT);
}
if ((i = CavlcGetRunBefore (iRun, &sReadBitsCache, uiTotalCoeff, pVlcTable, iZerosLeft)) == -1) {
return ERR_INFO_CAVLC_INVALID_RUN_BEFORE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_RUN_BEFORE);
}
iUsedBits += i;
pBs->iIndex += iUsedBits;
@ -898,7 +898,7 @@ int32_t WelsResidualBlockCavlc (SVlcTable* pVlcTable, uint8_t* pNonZeroCountCach
}
}
return 0;
return ERR_NONE;
}
int32_t WelsResidualBlockCavlc8x8 (SVlcTable* pVlcTable, uint8_t* pNonZeroCountCache, PBitStringAux pBs, int32_t iIndex,
@ -951,13 +951,13 @@ int32_t WelsResidualBlockCavlc8x8 (SVlcTable* pVlcTable, uint8_t* pNonZeroCountC
}
if (0 == uiTotalCoeff) {
pBs->iIndex += iUsedBits;
return 0;
return ERR_NONE;
}
if ((uiTrailingOnes > 3) || (uiTotalCoeff > 16)) { /////////////////check uiTrailingOnes and uiTotalCoeff
return ERR_INFO_CAVLC_INVALID_TOTAL_COEFF_OR_TRAILING_ONES;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_TOTAL_COEFF_OR_TRAILING_ONES);
}
if ((i = CavlcGetLevelVal (iLevel, &sReadBitsCache, uiTotalCoeff, uiTrailingOnes)) == -1) {
return ERR_INFO_CAVLC_INVALID_LEVEL;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_LEVEL);
}
iUsedBits += i;
if (uiTotalCoeff < iMaxNumCoeff) {
@ -967,10 +967,10 @@ int32_t WelsResidualBlockCavlc8x8 (SVlcTable* pVlcTable, uint8_t* pNonZeroCountC
}
if ((iZerosLeft < 0) || ((iZerosLeft + uiTotalCoeff) > iMaxNumCoeff)) {
return ERR_INFO_CAVLC_INVALID_ZERO_LEFT;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_ZERO_LEFT);
}
if ((i = CavlcGetRunBefore (iRun, &sReadBitsCache, uiTotalCoeff, pVlcTable, iZerosLeft)) == -1) {
return ERR_INFO_CAVLC_INVALID_RUN_BEFORE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_CAVLC_INVALID_RUN_BEFORE);
}
iUsedBits += i;
pBs->iIndex += iUsedBits;
@ -985,7 +985,7 @@ int32_t WelsResidualBlockCavlc8x8 (SVlcTable* pVlcTable, uint8_t* pNonZeroCountC
: ((iLevel[i] * kpDequantCoeff[j] + (1 << (5 - uiQp / 6))) >> (6 - uiQp / 6));
}
return 0;
return ERR_NONE;
}
int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][MV_A], int8_t iRefIdxArray[LIST_A][30],
@ -1026,7 +1026,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
iRefIdx = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRefIdx]
@ -1067,7 +1067,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
iRefIdx[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRefIdx[i]]
@ -1104,7 +1104,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
iRefIdx[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRefIdx[i]]
@ -1142,7 +1142,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
WELS_READ_VERIFY (BsGetUe (pBs, &uiCode)); //sub_mb_type[ mbPartIdx ]
uiSubMbType = uiCode;
if (uiSubMbType >= 4) { //invalid uiSubMbType
return ERR_INFO_INVALID_SUB_MB_TYPE;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_SUB_MB_TYPE);
}
pCurDqLayer->pSubMbType[iMbXy][i] = g_ksInterSubMbTypeInfo[uiSubMbType].iType;
iSubPartCount[i] = g_ksInterSubMbTypeInfo[uiSubMbType].iPartCount;
@ -1176,7 +1176,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
iRefIdx[i] = 0;
pCtx->iErrorCode |= dsBitstreamError;
} else {
return ERR_INFO_INVALID_REF_INDEX;
return GENERATE_ERROR_NO (ERR_LEVEL_MB_DATA, ERR_INFO_INVALID_REF_INDEX);
}
}
pCtx->bMbRefConcealed = pCtx->bRPLRError || pCtx->bMbRefConcealed || ! (ppRefPic[iRefIdx[i]]
@ -1245,7 +1245,7 @@ int32_t ParseInterInfo (PWelsDecoderContext pCtx, int16_t iMvArray[LIST_A][30][M
break;
}
return 0;
return ERR_NONE;
}
} // namespace WelsDec
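
Note on the `GENERATE_ERROR_NO (level, info)` returns introduced throughout this file: the macro's definition is not part of this diff. The sketch below assumes one plausible packing, level in the upper 16 bits and detail code in the lower 16 bits, purely to show how a caller could separate the two again. All names and numeric values here are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Invented values; the real enumerators live in a header not shown here.
enum ErrLevelSketch : int32_t { kLevelSliceHeader = 3, kLevelMbData = 5 };
enum ErrInfoSketch : int32_t { kInfoInvalidRefIndex = 42 };

// Assumed layout: [level : 16 bits][info : 16 bits].
inline int32_t MakeErrorNo (int32_t iLevel, int32_t iInfo) {
  assert (iLevel >= 0 && iLevel < 0x8000);  // keep the sign bit clear
  return (iLevel << 16) | (iInfo & 0xFFFF);
}

// Recover the syntax level at which the error was raised.
inline int32_t ErrorLevel (int32_t iErrNo) { return iErrNo >> 16; }

// Recover the detail code.
inline int32_t ErrorInfo (int32_t iErrNo) { return iErrNo & 0xFFFF; }
```

Under this assumption a combined code is never zero for a non-zero level, so existing `if (iRet)` checks keep working while logs gain the level.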

@ -83,7 +83,7 @@ PPicture AllocPicture (PWelsDecoderContext pCtx, const int32_t kiPicWidth, const
iLumaSize = iPicWidth * iPicHeight;
iChromaSize = iPicChromaWidth * iPicChromaHeight;
if (pCtx->bParseOnly) {
if (pCtx->pParam->bParseOnly) {
pPic->pBuffer[0] = pPic->pBuffer[1] = pPic->pBuffer[2] = NULL;
pPic->pData[0] = pPic->pData[1] = pPic->pData[2] = NULL;
pPic->iLinesize[0] = iPicWidth;

@ -186,27 +186,21 @@ int32_t RecI16x16Mb (int32_t iMBXY, PWelsDecoderContext pCtx, int16_t* pScoeffLe
/*common use by decoder&encoder*/
int32_t iYStride = pDqLayer->iLumaStride;
int32_t* pBlockOffset = pCtx->iDecBlockOffsetArray;
int16_t* pRS = pScoeffLevel;
uint8_t* pPred = pDqLayer->pPred[0];
PIdctResAddPredFunc pIdctResAddPredFunc = pCtx->pIdctResAddPredFunc;
uint8_t i = 0;
PIdctFourResAddPredFunc pIdctFourResAddPredFunc = pCtx->pIdctFourResAddPredFunc;
/*decode i16x16 y*/
pGetI16x16LumaPredFunc[iI16x16PredMode] (pPred, iYStride);
/*1 mb is divided 16 4x4_block to idct*/
for (i = 0; i < 16; i++) {
int16_t* pRSI4x4 = pRS + (i << 4);
uint8_t* pPredI4x4 = pPred + pBlockOffset[i];
if (pDqLayer->pNzc[iMBXY][g_kuiMbCountScan4Idx[i]] || pRSI4x4[0]) {
pIdctResAddPredFunc (pPredI4x4, iYStride, pRSI4x4);
}
}
const int8_t* pNzc = pDqLayer->pNzc[iMBXY];
pIdctFourResAddPredFunc (pPred + 0 * iYStride + 0, iYStride, pRS + 0 * 64, pNzc + 0);
pIdctFourResAddPredFunc (pPred + 0 * iYStride + 8, iYStride, pRS + 1 * 64, pNzc + 2);
pIdctFourResAddPredFunc (pPred + 8 * iYStride + 0, iYStride, pRS + 2 * 64, pNzc + 8);
pIdctFourResAddPredFunc (pPred + 8 * iYStride + 8, iYStride, pRS + 3 * 64, pNzc + 10);
/*decode intra mb cb&cr*/
pPred = pDqLayer->pPred[1];
@ -541,9 +535,9 @@ void GetInterPred (uint8_t* pPredY, uint8_t* pPredCb, uint8_t* pPredCr, PWelsDec
int32_t RecChroma (int32_t iMBXY, PWelsDecoderContext pCtx, int16_t* pScoeffLevel, PDqLayer pDqLayer) {
int32_t iChromaStride = pCtx->pCurDqLayer->pDec->iLinesize[1];
PIdctResAddPredFunc pIdctResAddPredFunc = pCtx->pIdctResAddPredFunc;
PIdctFourResAddPredFunc pIdctFourResAddPredFunc = pCtx->pIdctFourResAddPredFunc;
uint8_t i = 0, j = 0;
uint8_t i = 0;
uint8_t uiCbpC = pDqLayer->pCbp[iMBXY] >> 4;
if (1 == uiCbpC || 2 == uiCbpC) {
@ -552,17 +546,10 @@ int32_t RecChroma (int32_t iMBXY, PWelsDecoderContext pCtx, int16_t* pScoeffLeve
for (i = 0; i < 2; i++) {
int16_t* pRS = pScoeffLevel + 256 + (i << 6);
uint8_t* pPred = pDqLayer->pPred[i + 1];
int32_t* pBlockOffset = i == 0 ? &pCtx->iDecBlockOffsetArray[16] : &pCtx->iDecBlockOffsetArray[20];
const int8_t* pNzc = pDqLayer->pNzc[iMBXY] + 16 + 2 * i;
/*1 chroma is divided 4 4x4_block to idct*/
for (j = 0; j < 4; j++) {
int16_t* pRSI4x4 = &pRS[j << 4];
uint8_t* pPredI4x4 = pPred + pBlockOffset[j];
if (pDqLayer->pNzc[iMBXY][g_kuiMbCountScan4Idx[16 + (i << 2) + j]] || pRSI4x4[0]) {
pIdctResAddPredFunc (pPredI4x4, iChromaStride, pRSI4x4);
}
}
pIdctFourResAddPredFunc (pPred, iChromaStride, pRS, pNzc);
}
}
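
Note on the `pIdctFourResAddPredFunc` calls above: the per-block loops are replaced by one call per 8x8 region. The function body is not in this diff, so the sketch below is an inference from the call sites: the `pNzc` offsets (0, 2, 8, 10 for luma) suggest a non-zero-count raster with a row pitch of 4, and the `pRS + k * 64` steps suggest four consecutive runs of 16 coefficients per region. `FourBlockAdd`, `BlockKernel`, and `AddDcOnly` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

typedef void (*BlockKernel) (uint8_t* pPred, int32_t iStride, int16_t* pRs);

// Dispatches a 4x4 kernel over the four blocks of one 8x8 region and
// returns how many blocks were actually processed.
inline int FourBlockAdd (BlockKernel pfKernel, uint8_t* pPred, int32_t iStride,
                         int16_t* pRs, const int8_t* pNzc) {
  assert (pfKernel != nullptr);
  static const int kNzcOffset[4] = { 0, 1, 4, 5 };  // assumed raster, pitch 4
  int iCalls = 0;
  for (int k = 0; k < 4; ++k) {
    int16_t* pBlock = pRs + k * 16;
    // Skip blocks with no non-zero count and no DC term, like the removed loop.
    if (pNzc[kNzcOffset[k]] || pBlock[0]) {
      pfKernel (pPred + (k >> 1) * 4 * iStride + (k & 1) * 4, iStride, pBlock);
      ++iCalls;
    }
  }
  return iCalls;
}

// Trivial stand-in kernel: adds the DC term to the top-left sample only.
inline void AddDcOnly (uint8_t* pPred, int32_t, int16_t* pRs) {
  pPred[0] = static_cast<uint8_t> (pPred[0] + pRs[0]);
}
```

Grouping four blocks behind one entry point lets a SIMD implementation handle the whole region in one pass, while a scalar fallback can keep the per-block skip test.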

@ -42,78 +42,8 @@
%include "asm_inc.asm"
;*******************************************************************************
; Macros and other preprocessor constants
;*******************************************************************************
%macro MMX_SumSubDiv2 3
movq %3, %2
psraw %3, $01
paddw %3, %1
psraw %1, $01
psubw %1, %2
%endmacro
%macro MMX_SumSub 3
movq %3, %2
psubw %2, %1
paddw %1, %3
%endmacro
%macro MMX_IDCT 6
MMX_SumSub %4, %5, %6
MMX_SumSubDiv2 %3, %2, %1
MMX_SumSub %1, %4, %6
MMX_SumSub %3, %5, %6
%endmacro
%macro MMX_StoreDiff4P 5
movd %2, %5
punpcklbw %2, %4
paddw %1, %3
psraw %1, $06
paddsw %1, %2
packuswb %1, %2
movd %5, %1
%endmacro
;*******************************************************************************
; Code
;*******************************************************************************
SECTION .text
;*******************************************************************************
; void IdctResAddPred_mmx( uint8_t *pPred, const int32_t kiStride, int16_t *pRs )
;*******************************************************************************
WELS_EXTERN IdctResAddPred_mmx
%assign push_num 0
LOAD_3_PARA
SIGN_EXTENSION r1, r1d
movq mm0, [r2+ 0]
movq mm1, [r2+ 8]
movq mm2, [r2+16]
movq mm3, [r2+24]
MMX_Trans4x4W mm0, mm1, mm2, mm3, mm4
MMX_IDCT mm1, mm2, mm3, mm4, mm0, mm6
MMX_Trans4x4W mm1, mm3, mm0, mm4, mm2
MMX_IDCT mm3, mm0, mm4, mm2, mm1, mm6
WELS_Zero mm7
WELS_DW32 mm6
MMX_StoreDiff4P mm3, mm0, mm6, mm7, [r0]
MMX_StoreDiff4P mm4, mm0, mm6, mm7, [r0+r1]
lea r0, [r0+2*r1]
MMX_StoreDiff4P mm1, mm0, mm6, mm7, [r0]
MMX_StoreDiff4P mm2, mm0, mm6, mm7, [r0+r1]
emms
ret
;void WelsBlockZero16x16_sse2(int16_t * block, int32_t stride);
WELS_EXTERN WelsBlockZero16x16_sse2
%assign push_num 0

@ -109,7 +109,7 @@ virtual long EXTAPI GetOption (DECODER_OPTION eOptID, void* pOption);
PWelsDecoderContext m_pDecContext;
welsCodecTrace* m_pWelsTrace;
int32_t InitDecoder (const bool);
int32_t InitDecoder (const SDecodingParam* pParam);
void UninitDecoder (void);
int32_t ResetDecoder();

@ -198,11 +198,7 @@ long CWelsDecoder::Initialize (const SDecodingParam* pParam) {
}
// H.264 decoder initialization, including memory allocation, then open it ready to decode
iRet = InitDecoder (pParam->bParseOnly);
if (iRet)
return iRet;
iRet = DecoderConfigParam (m_pDecContext, pParam);
iRet = InitDecoder (pParam);
if (iRet)
return iRet;
@ -241,12 +237,13 @@ void CWelsDecoder::UninitDecoder (void) {
}
// the return value of this function is not suitable; it needs to report failure info to the upper layer.
int32_t CWelsDecoder::InitDecoder (const bool bParseOnly) {
int32_t CWelsDecoder::InitDecoder (const SDecodingParam* pParam) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO,
"CWelsDecoder::init_decoder(), openh264 codec version = %s, ParseOnly = %d",
VERSION_NUMBER, (int32_t)bParseOnly);
VERSION_NUMBER, (int32_t)pParam->bParseOnly);
//reset decoder context
if (m_pDecContext) //free
UninitDecoder();
m_pDecContext = (PWelsDecoderContext)WelsMallocz (sizeof (SWelsDecoderContext), "m_pDecContext");
@ -256,7 +253,20 @@ int32_t CWelsDecoder::InitDecoder (const bool bParseOnly) {
m_pDecContext->pMemAlign = new CMemoryAlign (iCacheLineSize);
WELS_VERIFY_RETURN_PROC_IF (1, (NULL == m_pDecContext->pMemAlign), UninitDecoder())
return WelsInitDecoder (m_pDecContext, bParseOnly, &m_pWelsTrace->m_sLogCtx);
//fill in default value into context
WelsDecoderDefaults (m_pDecContext, &m_pWelsTrace->m_sLogCtx);
//check param and update decoder context
m_pDecContext->pParam = (SDecodingParam*) m_pDecContext->pMemAlign->WelsMallocz (sizeof (SDecodingParam),
"SDecodingParam");
WELS_VERIFY_RETURN_PROC_IF (cmMallocMemeError, (NULL == m_pDecContext->pParam), UninitDecoder());
int32_t iRet = DecoderConfigParam (m_pDecContext, pParam);
WELS_VERIFY_RETURN_IFNEQ (iRet, cmResultSuccess);
//init decoder
WELS_VERIFY_RETURN_PROC_IF (cmInitParaError, WelsInitDecoder (m_pDecContext, &m_pWelsTrace->m_sLogCtx), UninitDecoder())
return cmResultSuccess;
}
int32_t CWelsDecoder::ResetDecoder() {
@ -267,11 +277,7 @@ int32_t CWelsDecoder::ResetDecoder() {
SDecodingParam sPrevParam;
memcpy (&sPrevParam, m_pDecContext->pParam, sizeof (SDecodingParam));
int32_t iRet = InitDecoder (m_pDecContext->bParseOnly);
if (iRet)
return iRet;
return DecoderConfigParam (m_pDecContext, &sPrevParam);
WELS_VERIFY_RETURN_PROC_IF (cmInitParaError, InitDecoder (&sPrevParam), UninitDecoder());
} else if (m_pWelsTrace != NULL) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_ERROR, "ResetDecoder() failed as decoder context null");
}
@ -287,20 +293,7 @@ long CWelsDecoder::SetOption (DECODER_OPTION eOptID, void* pOption) {
if (m_pDecContext == NULL && eOptID != DECODER_OPTION_TRACE_LEVEL &&
eOptID != DECODER_OPTION_TRACE_CALLBACK && eOptID != DECODER_OPTION_TRACE_CALLBACK_CONTEXT)
return dsInitialOptExpected;
if (eOptID == DECODER_OPTION_DATAFORMAT) { // Set color space of decoding output frame
if (m_pDecContext->bParseOnly) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_WARNING,
"CWelsDecoder::SetOption for data format meaningless for parseonly.");
return cmResultSuccess;
}
if (pOption == NULL)
return cmInitParaError;
iVal = * ((int*)pOption); // is_rgb
return DecoderSetCsp (m_pDecContext, iVal);
} else if (eOptID == DECODER_OPTION_END_OF_STREAM) { // Indicate bit-stream of the final frame to be decoded
if (eOptID == DECODER_OPTION_END_OF_STREAM) { // Indicate bit-stream of the final frame to be decoded
if (pOption == NULL)
return cmInitParaError;
@ -315,13 +308,13 @@ long CWelsDecoder::SetOption (DECODER_OPTION eOptID, void* pOption) {
iVal = * ((int*)pOption); // int value for error concealment idc
iVal = WELS_CLIP3 (iVal, (int32_t) ERROR_CON_DISABLE, (int32_t) ERROR_CON_SLICE_MV_COPY_CROSS_IDR_FREEZE_RES_CHANGE);
m_pDecContext->pParam->eEcActiveIdc = m_pDecContext->eErrorConMethod = (ERROR_CON_IDC) iVal;
if ((m_pDecContext->bParseOnly) && (m_pDecContext->eErrorConMethod != ERROR_CON_DISABLE)) {
if ((m_pDecContext->pParam->bParseOnly) && (iVal != (int32_t) ERROR_CON_DISABLE)) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO,
"CWelsDecoder::SetOption for ERROR_CON_IDC = %d not allowd for parse only!.", iVal);
return cmInitParaError;
}
m_pDecContext->pParam->eEcActiveIdc = m_pDecContext->eErrorConMethod = (ERROR_CON_IDC) iVal;
InitErrorCon (m_pDecContext);
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO,
"CWelsDecoder::SetOption for ERROR_CON_IDC = %d.", iVal);
@ -337,8 +330,9 @@ long CWelsDecoder::SetOption (DECODER_OPTION eOptID, void* pOption) {
if (m_pWelsTrace) {
WelsTraceCallback callback = * ((WelsTraceCallback*)pOption);
m_pWelsTrace->SetTraceCallback (callback);
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO, "CWelsDecoder::SetOption(), openh264 codec version = %s.",
VERSION_NUMBER);
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO,
"CWelsDecoder::SetOption():DECODER_OPTION_TRACE_CALLBACK callback = %p.",
callback);
}
return cmResultSuccess;
} else if (eOptID == DECODER_OPTION_TRACE_CALLBACK_CONTEXT) {
@ -369,11 +363,7 @@ long CWelsDecoder::GetOption (DECODER_OPTION eOptID, void* pOption) {
if (pOption == NULL)
return cmInitParaError;
if (DECODER_OPTION_DATAFORMAT == eOptID) {
iVal = (int32_t) m_pDecContext->eOutputColorFormat;
* ((int*)pOption) = iVal;
return cmResultSuccess;
} else if (DECODER_OPTION_END_OF_STREAM == eOptID) {
if (DECODER_OPTION_END_OF_STREAM == eOptID) {
iVal = m_pDecContext->bEndOfStreamFlag;
* ((int*)pOption) = iVal;
return cmResultSuccess;
@ -414,11 +404,13 @@ long CWelsDecoder::GetOption (DECODER_OPTION eOptID, void* pOption) {
memcpy (pDecoderStatistics, &m_pDecContext->sDecoderStatistics, sizeof (SDecoderStatistics));
pDecoderStatistics->fAverageFrameSpeedInMs = (float) (m_pDecContext->dDecTime) /
(m_pDecContext->sDecoderStatistics.uiDecodedFrameCount);
pDecoderStatistics->fActualAverageFrameSpeedInMs = (float) (m_pDecContext->dDecTime) /
(m_pDecContext->sDecoderStatistics.uiDecodedFrameCount + m_pDecContext->sDecoderStatistics.uiFreezingIDRNum +
m_pDecContext->sDecoderStatistics.uiFreezingNonIDRNum);
if (m_pDecContext->sDecoderStatistics.uiDecodedFrameCount != 0) { //not original status
pDecoderStatistics->fAverageFrameSpeedInMs = (float) (m_pDecContext->dDecTime) /
(m_pDecContext->sDecoderStatistics.uiDecodedFrameCount);
pDecoderStatistics->fActualAverageFrameSpeedInMs = (float) (m_pDecContext->dDecTime) /
(m_pDecContext->sDecoderStatistics.uiDecodedFrameCount + m_pDecContext->sDecoderStatistics.uiFreezingIDRNum +
m_pDecContext->sDecoderStatistics.uiFreezingNonIDRNum);
}
return cmResultSuccess;
}
@ -453,6 +445,18 @@ DECODING_STATE CWelsDecoder::DecodeFrame2 (const unsigned char* kpSrc,
const int kiSrcLen,
unsigned char** ppDst,
SBufferInfo* pDstInfo) {
if (m_pDecContext == NULL || m_pDecContext->pParam == NULL) {
if (m_pWelsTrace != NULL) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_ERROR, "Call DecodeFrame2 without Initialize.\n");
}
return dsInitialOptExpected;
}
if (m_pDecContext->pParam->bParseOnly) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_ERROR, "bParseOnly should be false for this API calling! \n");
m_pDecContext->iErrorCode |= dsInvalidArgument;
return dsInvalidArgument;
}
if (CheckBsBuffer (m_pDecContext, kiSrcLen)) {
return dsOutOfMemory;
}
@ -507,7 +511,8 @@ DECODING_STATE CWelsDecoder::DecodeFrame2 (const unsigned char* kpSrc,
eNalType = m_pDecContext->sCurNalHead.eNalUnitType;
if (m_pDecContext->iErrorCode & dsOutOfMemory) {
ResetDecoder();
if (ResetDecoder())
return dsOutOfMemory;
}
//for AVC bitstream (excluding AVC with temporal scalability, including TP), whenever an error occurs, SHOULD notify the upper layer of key frame loss.
if ((IS_PARAM_SETS_NALS (eNalType) || NAL_UNIT_CODED_SLICE_IDR == eNalType) ||
@ -593,6 +598,18 @@ DECODING_STATE CWelsDecoder::DecodeFrame2 (const unsigned char* kpSrc,
DECODING_STATE CWelsDecoder::DecodeParser (const unsigned char* kpSrc,
const int kiSrcLen,
SParserBsInfo* pDstInfo) {
if (m_pDecContext == NULL || m_pDecContext->pParam == NULL) {
if (m_pWelsTrace != NULL) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_ERROR, "Call DecodeParser without Initialize.\n");
}
return dsInitialOptExpected;
}
if (!m_pDecContext->pParam->bParseOnly) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_ERROR, "bParseOnly should be true for this API calling! \n");
m_pDecContext->iErrorCode |= dsInvalidArgument;
return dsInvalidArgument;
}
if (CheckBsBuffer (m_pDecContext, kiSrcLen)) {
return dsOutOfMemory;
}
@ -632,6 +649,11 @@ DECODING_STATE CWelsDecoder::DecodeParser (const unsigned char* kpSrc,
m_pDecContext->bInstantDecFlag = false; //reset no-delay flag
if (m_pDecContext->iErrorCode && m_pDecContext->bPrintFrameErrorTraceFlag) {
WelsLog (&m_pWelsTrace->m_sLogCtx, WELS_LOG_INFO, "decode failed, failure type:%d \n", m_pDecContext->iErrorCode);
m_pDecContext->bPrintFrameErrorTraceFlag = false;
}
return (DECODING_STATE) m_pDecContext->iErrorCode;
}
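Both entry points in this file now apply the same two checks before touching the bitstream: the decoder must be initialized, and the `bParseOnly` setting must match the API being called (false for `DecodeFrame2`, true for `DecodeParser`). The sketch below restates that rule as one helper. All type and function names in it (`EntryState`, `Params`, `Context`, `CheckEntry`) are hypothetical and do not exist in the code base; the real functions return `dsInitialOptExpected` and `dsInvalidArgument`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result codes standing in for the DECODING_STATE values used in the diff.
enum EntryState { kEntryOk, kEntryInitialOptExpected, kEntryInvalidArgument };

// Hypothetical, reduced versions of the decoder parameter and context structs.
struct Params {
  bool parseOnly;
};
struct Context {
  Params* params;
};

// apiIsParser is true for the parse-only entry point and false for the decoding one.
EntryState CheckEntry (const Context* ctx, bool apiIsParser) {
  if (ctx == NULL || ctx->params == NULL)
    return kEntryInitialOptExpected; // called before Initialize
  if (ctx->params->parseOnly != apiIsParser)
    return kEntryInvalidArgument;    // mode does not match the API being called
  return kEntryOk;
}
```

Sharing one helper like this is only a design option; the diff keeps the checks inline so that each function can log its own message.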

View File

@ -35,6 +35,7 @@
// for Luma 4x4
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredH_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
.rept 4
ld1r {v0.8b}, [x3], x2
@ -43,6 +44,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredH_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDc_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub x4, x1, #1
ldr s0, [x3]
@ -59,6 +61,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDc_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDcTop_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub v0.8b, v0.8b, v0.8b
ldr s0, [x3]
@ -71,6 +74,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDcTop_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDDL_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
dup v1.8b, v0.b[7]
@ -90,6 +94,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDDL_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDDLTop_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
dup v1.8b, v0.b[3]
@ -110,6 +115,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredDDLTop_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredVL_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
ext v1.8b, v0.8b, v0.8b, #1
@ -127,6 +133,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredVL_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredVLTop_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
dup v1.8b, v0.b[3]
@ -146,6 +153,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredVLTop_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredVR_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.s}[1], [x3]
sub x3, x3, #1
@ -177,6 +185,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredHU_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
mov x4, #3
mul x4, x4, x2
@ -203,6 +212,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredHU_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI4x4LumaPredHD_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
sub x3, x3, x2 // x3 points to top left
ld1 {v0.s}[1], [x3], x2
@ -228,6 +238,7 @@ WELS_ASM_AARCH64_FUNC_END
// for Chroma 8x8
WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredV_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
.rept 8
@ -236,6 +247,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredV_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredH_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
.rept 8
ld1r {v0.8b}, [x3], x2
@ -245,6 +257,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredDc_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub x4, x1, #1
ld1 {v0.8b}, [x3]
@ -280,6 +293,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredDc_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredDcTop_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.8b}, [x3]
uaddlp v0.4h, v0.8b
@ -298,6 +312,7 @@ intra_1_to_4: .short 17*1, 17*2, 17*3, 17*4, 17*1, 17*2, 17*3, 17*4
intra_m3_to_p4: .short -3, -2, -1, 0, 1, 2, 3, 4
WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredPlane_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub x3, x3, #1
mov x4, x3
@ -349,6 +364,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIChromaPredPlane_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredDc_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub x4, x1, #1
ld1 {v0.16b}, [x3]
@ -380,6 +396,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredDc_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredDcTop_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
ld1 {v0.16b}, [x3]
// reduce instruction
@ -392,6 +409,7 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredDcTop_AArch64_neon
WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredDcLeft_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, #1
ld1 {v1.b}[0], [x3], x2
ld1 {v1.b}[1], [x3], x2
@ -422,8 +440,9 @@ WELS_ASM_AARCH64_FUNC_END
.align 4
intra_1_to_8: .short 5, 10, 15, 20, 25, 30, 35, 40
intra_m7_to_p8: .short -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8
//void WelsI16x16LumaPredPlane_AArch64_neon (uint8_t* pPred, uint8_t* pRef, const int32_t kiStride);
WELS_ASM_AARCH64_FUNC_BEGIN WelsI16x16LumaPredPlane_AArch64_neon
SIGN_EXTENSION x2,w2
sub x3, x1, x2
sub x3, x3, #1
mov x4, x3
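The `SIGN_EXTENSION x2,w2` lines added throughout this file widen the 32-bit stride argument before it is used in 64-bit address arithmetic. Under the AArch64 procedure call standard, the upper 32 bits of a register carrying an `int32_t` argument are not guaranteed to be clean, so using `x2` directly could produce a wrong address. The C++ sketch below models this, assuming the macro expands to a sign-extending move such as `sxtw` (the macro body is not part of this diff). `SignExtendLow32` and `RowAbove` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Models a sign-extending move: keep the low 32 bits of the register
// and interpret them as a signed value widened to 64 bits.
int64_t SignExtendLow32 (uint64_t reg) {
  const uint32_t low = (uint32_t) (reg & 0xFFFFFFFFu);
  int64_t value = (int64_t) low;
  if (low & 0x80000000u)
    value -= 4294967296LL; // subtract 2^32 when the sign bit is set
  return value;
}

// Address of the row above pRef, as `sub x3, x1, x2` computes it
// once x2 holds a properly extended stride.
const uint8_t* RowAbove (const uint8_t* pRef, uint64_t strideReg) {
  return pRef - SignExtendLow32 (strideReg);
}
```

With a stride register holding garbage in its upper half, the helper still yields the intended row, and a negative stride in the low half is handled as well.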

View File

@ -179,9 +179,12 @@
add \arg7, \arg7, v4.4s
.endm
//int32_t WelsIntra8x8Combined3Sad_AArch64_neon (uint8_t*, int32_t, uint8_t*, int32_t, int32_t*, int32_t, uint8_t*, uint8_t*,uint8_t*);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra8x8Combined3Sad_AArch64_neon
ldr x11, [sp, #0]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
LOAD_CHROMA_DATA x0, v0.8b, v0.b
uaddlp v1.8h, v0.16b
@ -279,8 +282,11 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra8x8Combined3Sad_AArch64_neon
str w7, [x4]
WELS_ASM_AARCH64_FUNC_END
//int32_t WelsIntra16x16Combined3Sad_AArch64_neon (uint8_t*, int32_t, uint8_t*, int32_t, int32_t*, int32_t, uint8_t*);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra16x16Combined3Sad_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
LOAD_LUMA_DATA
uaddlv h2, v0.16b
@ -331,7 +337,13 @@ sad_intra_16x16_x3_opt_loop0:
str w7, [x4]
WELS_ASM_AARCH64_FUNC_END
//int32_t WelsIntra4x4Combined3Satd_AArch64_neon (uint8_t*, int32_t, uint8_t*, int32_t, uint8_t*, int32_t*, int32_t, int32_t,int32_t);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra4x4Combined3Satd_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x6,w6
SIGN_EXTENSION x7,w7
sub x9, x0, x1
ld1 {v16.s}[0], [x9] //top
sub x9, x0, #1
@ -421,9 +433,13 @@ satd_intra_4x4_x3_opt_end:
WELS_ASM_AARCH64_FUNC_END
//int32_t WelsIntra8x8Combined3Satd_AArch64_neon (uint8_t*, int32_t, uint8_t*, int32_t, int32_t*, int32_t, uint8_t*, uint8_t*,uint8_t*);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra8x8Combined3Satd_AArch64_neon
ldr x11, [sp, #0]
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
LOAD_CHROMA_DATA x0, v0.8b, v0.b
LOAD_CHROMA_DATA x7, v1.8b, v1.b
@ -511,8 +527,11 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra8x8Combined3Satd_AArch64_neon
str w7, [x4]
WELS_ASM_AARCH64_FUNC_END
//int32_t WelsIntra16x16Combined3Satd_AArch64_neon (uint8_t*, int32_t, uint8_t*, int32_t, int32_t*, int32_t, uint8_t*);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIntra16x16Combined3Satd_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
SIGN_EXTENSION x5,w5
LOAD_LUMA_DATA
uaddlv h2, v0.16b

View File

@ -33,9 +33,10 @@
#ifdef HAVE_NEON_AARCH64
#include "arm_arch64_common_macro.S"
//void WelsSetMemZero_AArch64_neon (void* pDst, int32_t iSize);
WELS_ASM_AARCH64_FUNC_BEGIN WelsSetMemZero_AArch64_neon
eor v0.16b, v0.16b, v0.16b
SIGN_EXTENSION x1,w1
cmp x1, #32
b.eq mem_zero_32_neon_start
b.lt mem_zero_24_neon_start

View File

@ -469,7 +469,10 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsDequantIHadamard4x4_AArch64_neon
st1 {v0.16b, v1.16b}, [x0]
WELS_ASM_AARCH64_FUNC_END
//void WelsDctT4_AArch64_neon (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
WELS_ASM_AARCH64_FUNC_BEGIN WelsDctT4_AArch64_neon
SIGN_EXTENSION x2, w2
SIGN_EXTENSION x4, w4
LOAD_4x4_DATA_FOR_DCT v0, v1, x1, x2, x3, x4
usubl v2.8h, v0.8b, v1.8b
usubl2 v4.8h, v0.16b, v1.16b
@ -490,8 +493,10 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsDctT4_AArch64_neon
st4 {v0.d, v1.d, v2.d, v3.d}[0], [x0]
WELS_ASM_AARCH64_FUNC_END
//void WelsDctFourT4_AArch64_neon (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
WELS_ASM_AARCH64_FUNC_BEGIN WelsDctFourT4_AArch64_neon
SIGN_EXTENSION x2,w2
SIGN_EXTENSION x4,w4
.rept 2
LOAD_8x4_DATA_FOR_DCT v0, v1, v2, v3, v4, v5, v6, v7, x1, x3
usubl v0.8h, v0.8b, v4.8b
@ -518,8 +523,10 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsDctFourT4_AArch64_neon
st1 {v6.16b, v7.16b}, [x0], #32
.endr
WELS_ASM_AARCH64_FUNC_END
//void WelsIDctT4Rec_AArch64_neon (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct)
WELS_ASM_AARCH64_FUNC_BEGIN WelsIDctT4Rec_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
ld1 {v16.s}[0], [x2], x3
ld1 {v16.s}[1], [x2], x3
ld1 {v16.s}[2], [x2], x3
@ -552,8 +559,10 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsIDctT4Rec_AArch64_neon
st1 {v1.s}[0],[x0],x1
st1 {v1.s}[1],[x0],x1
WELS_ASM_AARCH64_FUNC_END
//void WelsIDctFourT4Rec_AArch64_neon (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIDctFourT4Rec_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
.rept 2
ld1 {v16.d}[0], [x2], x3
ld1 {v16.d}[1], [x2], x3
@ -644,7 +653,11 @@ WELS_ASM_AARCH64_FUNC_BEGIN WelsHadamardT4Dc_AArch64_neon
st1 {v4.16b, v5.16b}, [x0] //store
WELS_ASM_AARCH64_FUNC_END
//void WelsIDctRecI16x16Dc_AArch64_neon (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride,
// int16_t* pDctDc);
WELS_ASM_AARCH64_FUNC_BEGIN WelsIDctRecI16x16Dc_AArch64_neon
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x3,w3
ld1 {v16.16b,v17.16b}, [x4]
srshr v16.8h, v16.8h, #6
srshr v17.8h, v17.8h, #6

View File

@ -32,8 +32,9 @@
#ifdef HAVE_NEON_AARCH64
#include "arm_arch64_common_macro.S"
//int32_t SumOf8x8SingleBlock_AArch64_neon (uint8_t* pRef, const int32_t kiRefStride);
WELS_ASM_AARCH64_FUNC_BEGIN SumOf8x8SingleBlock_AArch64_neon
SIGN_EXTENSION x1,w1
ld1 {v0.d}[0], [x0], x1
ld1 {v0.d}[1], [x0], x1
ld1 {v1.d}[0], [x0], x1
@ -50,7 +51,9 @@ WELS_ASM_AARCH64_FUNC_BEGIN SumOf8x8SingleBlock_AArch64_neon
mov x0, v0.d[0]
WELS_ASM_AARCH64_FUNC_END
//int32_t SumOf16x16SingleBlock_AArch64_neon (uint8_t* pRef, const int32_t kiRefStride);
WELS_ASM_AARCH64_FUNC_BEGIN SumOf16x16SingleBlock_AArch64_neon
SIGN_EXTENSION x1,w1
ld1 {v0.16b}, [x0], x1
uaddlp v0.8h, v0.16b
.rept 15
@ -61,11 +64,17 @@ WELS_ASM_AARCH64_FUNC_BEGIN SumOf16x16SingleBlock_AArch64_neon
mov x0, v0.d[0]
WELS_ASM_AARCH64_FUNC_END
//void SumOf8x8BlockOfFrame_AArch64_neon (uint8_t* pRefPicture, const int32_t kiWidth, const int32_t kiHeight,
// const int32_t kiRefStride,
// uint16_t* pFeatureOfBlock, uint32_t pTimesOfFeatureValue[]);
WELS_ASM_AARCH64_FUNC_BEGIN SumOf8x8BlockOfFrame_AArch64_neon
//(uint8_t* pRefPicture, const int32_t kiWidth, const int32_t kiHeight,const int32_t kiRefStride,uint16_t* pFeatureOfBlock, uint32_t pTimesOfFeatureValue[])
//x5: pTimesOfFeatureValue
//x4: pFeatureOfBlock
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x2,w2
SIGN_EXTENSION x3,w3
mov x8, x0
mov x6, x1
add x8, x8, x6
@ -147,6 +156,9 @@ WELS_ASM_AARCH64_FUNC_BEGIN SumOf16x16BlockOfFrame_AArch64_neon
//x5: pTimesOfFeatureValue
//x4: pFeatureOfBlock
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x2,w2
SIGN_EXTENSION x3,w3
mov x8, x0
mov x6, x1
add x8, x8, x6
@ -219,6 +231,7 @@ WELS_ASM_AARCH64_FUNC_END
WELS_ASM_AARCH64_FUNC_BEGIN InitializeHashforFeature_AArch64_neon
// (uint32_t* pTimesOfFeatureValue, uint16_t* pBuf, const int32_t kiListSize, uint16_t** pLocationOfFeature, uint16_t** pFeatureValuePointerList);
SIGN_EXTENSION x2,w2
mov x9, #3
bic x5, x2, x9
mov x8, #0
@ -280,7 +293,8 @@ WELS_ASM_AARCH64_FUNC_BEGIN FillQpelLocationByFeatureValue_AArch64_neon
ldr q7, mv_x_inc_x4
ldr q6, mv_y_inc_x4
ldr q5, mx_x_offset_x4
SIGN_EXTENSION x1,w1
SIGN_EXTENSION x2,w2
eor v4.16b, v4.16b, v4.16b
eor v3.16b, v3.16b, v3.16b
dup v16.2d, x3 // v8->v16

View File

@ -43,6 +43,7 @@
#define WELS_ACCESS_UNIT_WRITER_H__
#include "parameter_sets.h"
#include "paraset_strategy.h"
#include "param_svc.h"
#include "utils.h"
namespace WelsEnc {
@ -92,7 +93,7 @@ int32_t WelsWriteSubsetSpsSyntax (SSubsetSps* pSubsetSps, SBitStringAux* pBitStr
* \note Call it in case EWelsNalUnitType is PPS.
*************************************************************************************
*/
int32_t WelsWritePpsSyntax (SWelsPPS* pPps, SBitStringAux* pBitStringAux, SParaSetOffset* sPSOVector);
int32_t WelsWritePpsSyntax (SWelsPPS* pPps, SBitStringAux* pBitStringAux, IWelsParametersetStrategy* pParametersetStrategy);
/*!
* \brief initialize pSps based on configurable parameters in svc
@ -147,21 +148,5 @@ int32_t WelsCheckRefFrameLimitationLevelIdcFirst (SLogContext* pLogCtx, SWelsSvc
int32_t WelsAdjustLevel (SSpatialLayerConfig* pSpatialLayer);
/*!
* \brief check whether an existing SPS matches the current parameters
* \param pParam the current encoding parameter in SWelsSvcCodingParam
* \param kbUseSubsetSps bool
* \param iDlayerIndex int, the index of current D layer
* \param iDlayerCount int, the number of total D layer
* \param pSpsArray array of all the stored SPSs
* \param pSubsetArray array of all the stored Subset-SPSs
* \return 0 - successful
* -1 - cannot find existing SPS for current encoder parameter
*/
int32_t FindExistingSps (SWelsSvcCodingParam* pParam, const bool kbUseSubsetSps, const int32_t iDlayerIndex,
const int32_t iDlayerCount, const int32_t iSpsNumInUse,
SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray,
bool bSVCBaselayer);
}
#endif//WELS_ACCESS_UNIT_PARSER_H__

View File

@ -65,9 +65,12 @@ void WelsDequantFour4x4_sse2 (int16_t* pDct, const uint16_t* kpMF);
void WelsDequantIHadamard4x4_sse2 (int16_t* pRes, const uint16_t kuiMF);
void WelsIDctT4Rec_mmx (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
void WelsIDctT4Rec_sse2 (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
void WelsIDctFourT4Rec_sse2 (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
void WelsIDctRecI16x16Dc_sse2 (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride,
int16_t* pDctDc);
void WelsIDctT4Rec_avx2 (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
void WelsIDctFourT4Rec_avx2 (uint8_t* pRec, int32_t iStride, uint8_t* pPrediction, int32_t iPredStride, int16_t* pDct);
#endif//X86_ASM
#ifdef HAVE_NEON

View File

@ -76,6 +76,7 @@ extern "C" {
#ifdef X86_ASM
int32_t WelsGetNoneZeroCount_sse2 (int16_t* pLevel);
int32_t WelsGetNoneZeroCount_sse42 (int16_t* pLevel);
/****************************************************************************
* Scan and Score functions
@ -89,7 +90,10 @@ int32_t WelsCalculateSingleCtr4x4_sse2 (int16_t* pDct);
* DCT functions
****************************************************************************/
void WelsDctT4_mmx (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
void WelsDctT4_sse2 (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
void WelsDctFourT4_sse2 (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
void WelsDctT4_avx2 (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
void WelsDctFourT4_avx2 (int16_t* pDct, uint8_t* pPixel1, int32_t iStride1, uint8_t* pPixel2, int32_t iStride2);
/****************************************************************************
* HDM and Quant functions
@ -103,6 +107,11 @@ void WelsQuant4x4Dc_sse2 (int16_t* pDct, int16_t iFF, int16_t iMF);
void WelsQuantFour4x4_sse2 (int16_t* pDct, const int16_t* pFF, const int16_t* pMF);
void WelsQuantFour4x4Max_sse2 (int16_t* pDct, const int16_t* pFF, const int16_t* pMF, int16_t* pMax);
void WelsQuant4x4_avx2 (int16_t* pDct, const int16_t* pFF, const int16_t* pMF);
void WelsQuant4x4Dc_avx2 (int16_t* pDct, int16_t iFF, int16_t iMF);
void WelsQuantFour4x4_avx2 (int16_t* pDct, const int16_t* pFF, const int16_t* pMF);
void WelsQuantFour4x4Max_avx2 (int16_t* pDct, const int16_t* pFF, const int16_t* pMF, int16_t* pMax);
#endif
#ifdef HAVE_NEON

View File

@ -82,11 +82,11 @@ int32_t InitFunctionPointers (sWelsEncCtx* pEncCtx, SWelsSvcCodingParam* _param,
/*!
* \brief initialize frame coding
*/
void InitFrameCoding (sWelsEncCtx* pEncCtx, const EVideoFrameType keFrameType);
void LoadBackFrameNum(sWelsEncCtx* pEncCtx);
EVideoFrameType DecideFrameType (sWelsEncCtx* pEncCtx, const int8_t kiSpatialNum);
void InitFrameCoding (sWelsEncCtx* pEncCtx, const EVideoFrameType keFrameType,const int32_t kiDidx);
void LoadBackFrameNum(sWelsEncCtx* pEncCtx,const int32_t kiDidx);
EVideoFrameType DecideFrameType (sWelsEncCtx* pEncCtx, const int8_t kiSpatialNum,const int32_t kiDidx, bool bSkipFrameFlag);
void InitBitStream(sWelsEncCtx* pEncCtx);
int32_t GetTemporalLevel (SSpatialLayerInternal* fDlp, const int32_t kiFrameNum, const int32_t kiGopSize);
/*!
* \brief Dump reconstruction for dependency layer

View File

@ -45,6 +45,7 @@
#include "param_svc.h"
#include "nal_encap.h"
#include "picture.h"
#include "paraset_strategy.h"
#include "dq_map.h"
#include "stat.h"
#include "macros.h"
@ -57,9 +58,13 @@
#include "mt_defs.h" // for multiple threading
#include "WelsThreadLib.h"
#include "wels_task_management.h"
namespace WelsEnc {
class IWelsTaskManage;
class IWelsReferenceStrategy;
/*
* reference list for each quality layer in SVC
*/
@ -112,7 +117,6 @@ typedef struct TagWelsEncCtx {
SLogContext sLogCtx;
// Input
SWelsSvcCodingParam* pSvcParam; // SVC parameter, WelsSVCParamConfig in svc_param_settings.h
SWelsSliceBs* pSliceBs; // bitstream buffering for various slices, [uiSliceIdx]
int32_t* pSadCostMb;
/* MVD cost tables for Inter MB */
@ -133,10 +137,10 @@ typedef struct TagWelsEncCtx {
SWelsFuncPtrList* pFuncList;
SSliceThreading* pSliceThreading;
IWelsTaskManage* pTaskManage; //was planning to put it under CWelsH264SVCEncoder but it may be updated (lock/no lock) when param is changed
IWelsReferenceStrategy* pReferenceStrategy;
// SSlice context
SSliceCtx* pSliceCtxList;// slice context table for each dependency quality layer
// pointers
// pointers
SPicture* pEncPic; // pointer to current picture to be encoded
SPicture* pDecPic; // pointer to current picture being reconstructed
SPicture* pRefPic; // pointer to current reference picture
@ -149,10 +153,7 @@ typedef struct TagWelsEncCtx {
SLTRState* pLtr;//[MAX_DEPENDENCY_LAYER];
bool bCurFrameMarkedAsSceneLtr;
// Derived
int32_t iCodingIndex;
int32_t iFrameIndex; // count how many frames elapsed during coding context currently
int32_t iFrameNum; // current frame number coding
int32_t iPOC; // frame iPOC
EWelsSliceType eSliceType; // currently coding slice type
EWelsNalUnitType eNalType; // NAL type
EWelsNalRefIdc eNalPriority; // NAL_Reference_Idc currently
@ -162,7 +163,6 @@ typedef struct TagWelsEncCtx {
uint8_t uiDependencyId; // Idc of dependency layer to be coded
uint8_t uiTemporalId; // Idc of temporal layer to be coded
bool bNeedPrefixNalFlag; // whether add prefix nal
bool bEncCurFrmAsIdrFlag;
// Rate control routine
SWelsSvcRc* pWelsSvcRc;
@ -172,8 +172,6 @@ typedef struct TagWelsEncCtx {
int32_t iCheckWindowInterval;
int32_t iCheckWindowIntervalShift;
bool bCheckWindowShiftResetFlag;
int32_t iSkipFrameFlag; //_GOM_RC_
int32_t iContinualSkipFrames;
int32_t iGlobalQp; // global qp
// VAA
@ -198,10 +196,10 @@ typedef struct TagWelsEncCtx {
int32_t iPosBsBuffer; // current writing position of frame bs pBuffer
SSpatialPicIndex sSpatialIndexMap[MAX_DEPENDENCY_LAYER];
int32_t iSliceBufferSize[MAX_DEPENDENCY_LAYER];
bool bRefOfCurTidIsLtr[MAX_DEPENDENCY_LAYER][MAX_TEMPORAL_LEVEL];
uint16_t uiIdrPicId; // IDR picture id: [0, 65535], this one is used for LTR
int32_t iMaxSliceCount;// maximal count number of slices for all layers observation
int16_t iActiveThreadsNum; // number of threads active so far
@ -224,12 +222,12 @@ typedef struct TagWelsEncCtx {
//related to Statistics
int64_t uiStartTimestamp;
SEncoderStatistics sEncoderStatistics;
SEncoderStatistics sEncoderStatistics[MAX_DEPENDENCY_LAYER];
int32_t iStatisticsLogInterval;
int64_t iLastStatisticsLogTs;
int64_t iTotalEncodedBytes;
int64_t iLastStatisticsBytes;
int64_t iLastStatisticsFrameCount;
int64_t iTotalEncodedBytes[MAX_DEPENDENCY_LAYER];
int64_t iLastStatisticsBytes[MAX_DEPENDENCY_LAYER];
int64_t iLastStatisticsFrameCount[MAX_DEPENDENCY_LAYER];
int32_t iEncoderError;
WELS_MUTEX mutexEncoderError;
@ -239,31 +237,8 @@ typedef struct TagWelsEncCtx {
bool bDependencyRecFlag[MAX_DEPENDENCY_LAYER];
bool bRecFlag;
#endif
int64_t uiLastTimestamp;
uint32_t GetNeededSpsNum() {
if (0 == sPSOVector.uiNeededSpsNum) {
sPSOVector.uiNeededSpsNum = ((SPS_LISTING & pSvcParam->eSpsPpsIdStrategy) ? (MAX_SPS_COUNT) : (1));
sPSOVector.uiNeededSpsNum *= ((pSvcParam->bSimulcastAVC) ? (pSvcParam->iSpatialLayerNum) : (1));
}
return sPSOVector.uiNeededSpsNum;
}
uint32_t GetNeededSubsetSpsNum() {
if (0 == sPSOVector.uiNeededSubsetSpsNum) {
sPSOVector.uiNeededSubsetSpsNum = ((pSvcParam->bSimulcastAVC) ? (0) :
((SPS_LISTING & pSvcParam->eSpsPpsIdStrategy) ? (MAX_SPS_COUNT) : (pSvcParam->iSpatialLayerNum - 1)));
}
return sPSOVector.uiNeededSubsetSpsNum;
}
uint32_t GetNeededPpsNum() {
if (0 == sPSOVector.uiNeededPpsNum) {
sPSOVector.uiNeededPpsNum = ((pSvcParam->eSpsPpsIdStrategy & SPS_PPS_LISTING) ? (MAX_PPS_COUNT) :
(1 + pSvcParam->iSpatialLayerNum));
sPSOVector.uiNeededPpsNum *= ((pSvcParam->bSimulcastAVC) ? (pSvcParam->iSpatialLayerNum) : (1));
}
return sPSOVector.uiNeededPpsNum;
}
} sWelsEncCtx/*, *PWelsEncCtx*/;
}
#endif//sWelsEncCtx_H__
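The three `GetNeeded*Num` helpers removed from `sWelsEncCtx` in this file derived parameter-set counts from the ID strategy, the simulcast flag and the spatial layer count. The sketch below restates that arithmetic as one pure function, which may help when comparing against the strategy class that takes over the job. The struct and function names are hypothetical, the boolean flags stand in for the `SPS_LISTING` and `SPS_PPS_LISTING` bit tests, and the maximum counts are passed in rather than assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type; the removed code cached these values in sPSOVector.
struct ParasetNeeds {
  uint32_t sps;
  uint32_t subsetSps;
  uint32_t pps;
};

// Mirrors the removed helpers: listing strategies reserve the maximum count,
// and simulcast AVC multiplies SPS/PPS needs by the number of spatial layers.
ParasetNeeds ComputeParasetNeeds (bool spsListing, bool spsPpsListing, bool simulcastAvc,
                                  uint32_t spatialLayers, uint32_t maxSps, uint32_t maxPps) {
  ParasetNeeds n;
  n.sps = spsListing ? maxSps : 1;
  if (simulcastAvc)
    n.sps *= spatialLayers;
  // Simulcast AVC uses no subset SPS; otherwise one per enhancement layer unless listing.
  n.subsetSps = simulcastAvc ? 0 : (spsListing ? maxSps : spatialLayers - 1);
  n.pps = spsPpsListing ? maxPps : 1 + spatialLayers;
  if (simulcastAvc)
    n.pps *= spatialLayers;
  return n;
}
```

Unlike the removed members, this version does not cache its result, so it can be called again after parameters change.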

View File

@ -44,7 +44,9 @@
#include "codec_app_def.h"
#include "wels_const.h"
#include "WelsThreadLib.h"
#include "slice.h"
using namespace WelsEnc;
/*
* MT_DEBUG: output trace MT related into log file
*/
@ -80,14 +82,14 @@ WELS_EVENT pThreadMasterEvent[MAX_THREADS_NUM]; // event
WELS_MUTEX mutexSliceNumUpdate; // for dynamic slicing mode MT
uint32_t* pSliceConsumeTime[MAX_DEPENDENCY_LAYER]; // consuming time for each slice, [iSpatialIdx][uiSliceIdx]
int32_t* pSliceComplexRatio[MAX_DEPENDENCY_LAYER]; // *INT_MULTIPLY
#ifdef MT_DEBUG
FILE* pFSliceDiff; // file handle for debug
#endif//MT_DEBUG
uint8_t* pThreadBsBuffer[MAX_THREADS_NUM]; //actual memory for slice buffer
bool bThreadBsBufferUsage[MAX_THREADS_NUM];
WELS_MUTEX mutexThreadBsBufferUsage;
} SSliceThreading;
#endif//MULTIPLE_THREADING_DEFINES_H__

View File

@ -86,7 +86,12 @@ typedef struct TagDLayerParam {
int8_t iHighestTemporalId;
float fInputFrameRate; // input frame rate
float fOutputFrameRate; // output frame rate
// uint16_t uiIdrPicId; // IDR picture id: [0, 65535], this one is used for LTR
int32_t iCodingIndex;
int32_t iFrameIndex; // count how many frames elapsed during coding context currently
bool bEncCurFrmAsIdrFlag;
int32_t iFrameNum; // current frame number coding
int32_t iPOC; // frame iPOC
#ifdef ENABLE_FRAME_DUMP
char sRecFileName[MAX_FNAME_LEN]; // file to be constructed
#endif//ENABLE_FRAME_DUMP
@ -111,8 +116,6 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
bool bDeblockingParallelFlag; // deblocking filter parallelization control flag
int32_t iBitsVaryPercentage;
short
iCountThreadsNum; // # derived from disable_multiple_slice_idc (=0 or >1) means;
int8_t iDecompStages; // GOP size dependency
int32_t iMaxNumRefFrame;
@ -137,6 +140,7 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
param.iTargetBitrate = UNSPECIFIED_BIT_RATE; // overall target bitrate introduced in RC module
param.iMaxBitrate = UNSPECIFIED_BIT_RATE;
param.iMultipleThreadIdc = 1;
param.bUseLoadBalancing = true;
param.iLTRRefNum = 0;
param.iLtrMarkPeriod = 30; //the min distance between two long-term references
@ -176,13 +180,25 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
param.sSpatialLayers[iLayer].uiLevelIdc = LEVEL_UNKNOWN;
param.sSpatialLayers[iLayer].iDLayerQp = SVC_QUALITY_BASE_QP;
param.sSpatialLayers[iLayer].fFrameRate = param.fMaxFrameRate;
param.sSpatialLayers[iLayer].sSliceCfg.uiSliceMode = SM_SINGLE_SLICE;
param.sSpatialLayers[iLayer].sSliceCfg.sSliceArgument.uiSliceSizeConstraint = 1500;
param.sSpatialLayers[iLayer].sSliceCfg.sSliceArgument.uiSliceNum = 1;
param.sSpatialLayers[iLayer].iMaxSpatialBitrate = UNSPECIFIED_BIT_RATE;
param.sSpatialLayers[iLayer].sSliceArgument.uiSliceMode = SM_SINGLE_SLICE;
param.sSpatialLayers[iLayer].sSliceArgument.uiSliceNum = 0; //AUTO, using number of CPU cores
param.sSpatialLayers[iLayer].sSliceArgument.uiSliceSizeConstraint = 1500;
const int32_t kiLesserSliceNum = ((MAX_SLICES_NUM < MAX_SLICES_NUM_TMP) ? MAX_SLICES_NUM : MAX_SLICES_NUM_TMP);
for (int32_t idx = 0; idx < kiLesserSliceNum; idx++)
param.sSpatialLayers[iLayer].sSliceCfg.sSliceArgument.uiSliceMbNum[idx] = 960;
param.sSpatialLayers[iLayer].sSliceArgument.uiSliceMbNum[idx] = 0; //default, using one row a slice if uiSliceMode is SM_RASTER_MODE
// See codec_app_def.h for more info about members bVideoSignalTypePresent through uiColorMatrix. The default values
// used below preserve the previous behavior; i.e., no additional information will be written to the output file.
param.sSpatialLayers[iLayer].bVideoSignalTypePresent = false; // do not write any of the following information to the header
param.sSpatialLayers[iLayer].uiVideoFormat = VF_UNDEF; // undefined
param.sSpatialLayers[iLayer].bFullRange = false; // analog video data range [16, 235]
param.sSpatialLayers[iLayer].bColorDescriptionPresent = false; // do not write any of the following three items to the header
param.sSpatialLayers[iLayer].uiColorPrimaries = CP_UNDEF; // undefined
param.sSpatialLayers[iLayer].uiTransferCharacteristics = TRC_UNDEF; // undefined
param.sSpatialLayers[iLayer].uiColorMatrix = CM_UNDEF; // undefined
}
}
@ -199,10 +215,8 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
bDeblockingParallelFlag = false;// deblocking filter parallelization control flag
iCountThreadsNum = 1; // # derived from disable_multiple_slice_idc (=0 or >1) means;
iDecompStages = 0; // GOP size dependency, unknown here and be revised later
iBitsVaryPercentage = 0;
iBitsVaryPercentage = 10;
}
int32_t ParamBaseTranscode (const SEncParamBase& pCodingParam) {
@ -278,6 +292,7 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
SUsedPicRect.iHeight = ((iPicHeight >> 1) << 1);
iMultipleThreadIdc = pCodingParam.iMultipleThreadIdc;
bUseLoadBalancing = pCodingParam.bUseLoadBalancing;
/* Deblocking loop filter */
iLoopFilterDisableIdc = pCodingParam.iLoopFilterDisableIdc; // 0: on, 1: off, 2: on except for slice boundaries,
@ -394,18 +409,23 @@ typedef struct TagWelsSvcCodingParam: SEncParamExt {
pCodingParam.sSpatialLayers[iIdxSpatial].iMaxSpatialBitrate;
//multi slice
pSpatialLayer->sSliceCfg.uiSliceMode = pCodingParam.sSpatialLayers[iIdxSpatial].sSliceCfg.uiSliceMode;
pSpatialLayer->sSliceCfg.sSliceArgument.uiSliceSizeConstraint
= (uint32_t) (pCodingParam.sSpatialLayers[iIdxSpatial].sSliceCfg.sSliceArgument.uiSliceSizeConstraint);
pSpatialLayer->sSliceCfg.sSliceArgument.uiSliceNum
= pCodingParam.sSpatialLayers[iIdxSpatial].sSliceCfg.sSliceArgument.uiSliceNum;
const int32_t kiLesserSliceNum = ((MAX_SLICES_NUM < MAX_SLICES_NUM_TMP) ? MAX_SLICES_NUM : MAX_SLICES_NUM_TMP);
memcpy (pSpatialLayer->sSliceCfg.sSliceArgument.uiSliceMbNum,
pCodingParam.sSpatialLayers[iIdxSpatial].sSliceCfg.sSliceArgument.uiSliceMbNum, // confirmed_safe_unsafe_usage
kiLesserSliceNum * sizeof (uint32_t)) ;
pSpatialLayer->sSliceArgument = pCodingParam.sSpatialLayers[iIdxSpatial].sSliceArgument;
memcpy (&(pSpatialLayer->sSliceArgument),
&(pCodingParam.sSpatialLayers[iIdxSpatial].sSliceArgument), // confirmed_safe_unsafe_usage
sizeof (SSliceArgument)) ;
pSpatialLayer->iDLayerQp = pCodingParam.sSpatialLayers[iIdxSpatial].iDLayerQp;
// See codec_app_def.h and parameter_sets.h for more info about members bVideoSignalTypePresent through uiColorMatrix.
pSpatialLayer->bVideoSignalTypePresent = pCodingParam.sSpatialLayers[iIdxSpatial].bVideoSignalTypePresent;
pSpatialLayer->uiVideoFormat = pCodingParam.sSpatialLayers[iIdxSpatial].uiVideoFormat;
pSpatialLayer->bFullRange = pCodingParam.sSpatialLayers[iIdxSpatial].bFullRange;
pSpatialLayer->bColorDescriptionPresent = pCodingParam.sSpatialLayers[iIdxSpatial].bColorDescriptionPresent;
pSpatialLayer->uiColorPrimaries = pCodingParam.sSpatialLayers[iIdxSpatial].uiColorPrimaries;
pSpatialLayer->uiTransferCharacteristics = pCodingParam.sSpatialLayers[iIdxSpatial].uiTransferCharacteristics;
pSpatialLayer->uiColorMatrix = pCodingParam.sSpatialLayers[iIdxSpatial].uiColorMatrix;
uiProfileIdc = (!bSimulcastAVC) ? PRO_SCALABLE_BASELINE : PRO_BASELINE;
++ pDlp;
++ pSpatialLayer;

View File

@ -79,6 +79,19 @@ bool bVuiParamPresentFlag;
// bool bTimingInfoPresentFlag;
// bool bFixedFrameRateFlag;
// Note: members bVideoSignalTypePresent through uiColorMatrix below are also defined in SSpatialLayerConfig in codec_app_def.h,
// along with definitions for enumerators EVideoFormatSPS, EColorPrimaries, ETransferCharacteristics, and EColorMatrix.
bool bVideoSignalTypePresent; // false => do not write any of the following information to the header
uint8_t uiVideoFormat; // EVideoFormatSPS; 3 bits in header; 0-5 => component, kpal, ntsc, secam, mac, undef
bool bFullRange; // false => analog video data range [16, 235]; true => full data range [0,255]
bool bColorDescriptionPresent; // false => do not write any of the following three items to the header
uint8_t uiColorPrimaries; // EColorPrimaries; 8 bits in header; 0 - 9 => ???, bt709, undef, ???, bt470m, bt470bg,
// smpte170m, smpte240m, film, bt2020
uint8_t uiTransferCharacteristics; // ETransferCharacteristics; 8 bits in header; 0 - 15 => ???, bt709, undef, ???, bt470m, bt470bg, smpte170m,
// smpte240m, linear, log100, log316, iec61966-2-4, bt1361e, iec61966-2-1, bt2020-10, bt2020-12
uint8_t uiColorMatrix; // EColorMatrix; 8 bits in header (corresponds to FFmpeg "colorspace"); 0 - 10 => GBR, bt709,
// undef, ???, fcc, bt470bg, smpte170m, smpte240m, YCgCo, bt2020nc, bt2020c
bool bConstraintSet0Flag;
bool bConstraintSet1Flag;
bool bConstraintSet2Flag;

View File

@ -0,0 +1,310 @@
/*!
* \copy
* Copyright (c) 2013, Cisco Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*/
#ifndef WELS_PARASET_STRATEGY_H
#define WELS_PARASET_STRATEGY_H
#include "param_svc.h"
#include "utils.h"
namespace WelsEnc {
class IWelsParametersetStrategy {
public:
virtual ~IWelsParametersetStrategy() { }
static IWelsParametersetStrategy* CreateParametersetStrategy (EParameterSetStrategy eSpsPpsIdStrategy,
const bool bSimulcastAVC, const int32_t kiSpatialLayerNum);
//virtual SParaSetOffset* GetParaSetOffset() = 0;
virtual int32_t GetPpsIdOffset (const int32_t iPpsId) = 0;
virtual int32_t GetSpsIdOffset (const int32_t iPpsId, const int32_t iSpsId) = 0;
virtual int32_t* GetSpsIdOffsetList (const int iParasetType) = 0;
virtual uint32_t GetAllNeededParasetNum() = 0;
virtual uint32_t GetNeededSpsNum() = 0;
virtual uint32_t GetNeededSubsetSpsNum() = 0;
virtual uint32_t GetNeededPpsNum() = 0;
virtual void LoadPrevious (SExistingParasetList* pExistingParasetList, SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray,
SWelsPPS* pPpsArray) = 0;
virtual void Update (const uint32_t kuiId, const int iParasetType) = 0;
virtual void UpdatePpsList (sWelsEncCtx* pCtx) = 0;
virtual bool CheckParamCompatibility (SWelsSvcCodingParam* pCodingParam, SLogContext* pLogCtx) = 0;
virtual uint32_t GenerateNewSps (sWelsEncCtx* pCtx, const bool kbUseSubsetSps, const int32_t iDlayerIndex,
const int32_t iDlayerCount,
uint32_t kuiSpsId,
SWelsSPS*& pSps, SSubsetSps*& pSubsetSps, bool bSVCBaselayer) = 0;
virtual uint32_t InitPps (sWelsEncCtx* pCtx, uint32_t kiSpsId,
SWelsSPS* pSps,
SSubsetSps* pSubsetSps,
uint32_t kuiPpsId,
const bool kbDeblockingFilterPresentFlag,
const bool kbUsingSubsetSps,
const bool kbEntropyCodingModeFlag) = 0;
virtual void SetUseSubsetFlag (const uint32_t iPpsId, const bool bUseSubsetSps) = 0;
virtual void UpdateParaSetNum (sWelsEncCtx* pCtx) = 0;
virtual int32_t GetCurrentPpsId (const int32_t iPpsId, const int32_t iIdrLoop) = 0;
virtual void OutputCurrentStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList,
sWelsEncCtx* pCtx, SExistingParasetList* pExistingParasetList) = 0;
virtual void LoadPreviousStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList) = 0;
virtual int32_t GetSpsIdx (const int32_t iIdx) = 0;
};
class CWelsParametersetIdConstant : public IWelsParametersetStrategy {
public:
CWelsParametersetIdConstant (const bool bSimulcastAVC, const int32_t kiSpatialLayerNum);
virtual ~ CWelsParametersetIdConstant();
virtual int32_t GetPpsIdOffset (const int32_t iPpsId);
virtual int32_t GetSpsIdOffset (const int32_t iPpsId, const int32_t iSpsId);
int32_t* GetSpsIdOffsetList (const int iParasetType);
uint32_t GetAllNeededParasetNum();
virtual uint32_t GetNeededSpsNum();
virtual uint32_t GetNeededSubsetSpsNum();
virtual uint32_t GetNeededPpsNum();
virtual void LoadPrevious (SExistingParasetList* pExistingParasetList, SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray,
SWelsPPS* pPpsArray);
virtual void Update (const uint32_t kuiId, const int iParasetType);
virtual void UpdatePpsList (sWelsEncCtx* pCtx) {};
bool CheckParamCompatibility (SWelsSvcCodingParam* pCodingParam, SLogContext* pLogCtx) {
return true;
};
virtual uint32_t GenerateNewSps (sWelsEncCtx* pCtx, const bool kbUseSubsetSps, const int32_t iDlayerIndex,
const int32_t iDlayerCount, uint32_t kuiSpsId,
SWelsSPS*& pSps, SSubsetSps*& pSubsetSps, bool bSVCBaselayer);
virtual uint32_t InitPps (sWelsEncCtx* pCtx, uint32_t kiSpsId,
SWelsSPS* pSps,
SSubsetSps* pSubsetSps,
uint32_t kuiPpsId,
const bool kbDeblockingFilterPresentFlag,
const bool kbUsingSubsetSps,
const bool kbEntropyCodingModeFlag);
virtual void SetUseSubsetFlag (const uint32_t iPpsId, const bool bUseSubsetSps);
virtual void UpdateParaSetNum (sWelsEncCtx* pCtx) {};
virtual int32_t GetCurrentPpsId (const int32_t iPpsId, const int32_t iIdrLoop) {
return iPpsId;
};
virtual void OutputCurrentStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList,
sWelsEncCtx* pCtx,
SExistingParasetList* pExistingParasetList) {};
virtual void LoadPreviousStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList) {};
virtual int32_t GetSpsIdx (const int32_t iIdx) {
return 0;
};
protected:
virtual void LoadPreviousSps (SExistingParasetList* pExistingParasetList, SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray) {};
virtual void LoadPreviousPps (SExistingParasetList* pExistingParasetList, SWelsPPS* pPpsArray) {};
protected:
SParaSetOffset m_sParaSetOffset;
bool m_bSimulcastAVC;
int32_t m_iSpatialLayerNum;
uint32_t m_iBasicNeededSpsNum;
uint32_t m_iBasicNeededPpsNum;
};
/*
typedef struct TagParaSetOffsetVariable {
int32_t iParaSetIdDelta[MAX_DQ_LAYER_NUM+1];//mark delta between SPS_ID_in_bs and sps_id_in_encoder, can be minus, for each dq-layer
//need not extra +1 due no MGS and FMO case so far
bool bUsedParaSetIdInBs[MAX_PPS_COUNT]; //mark the used SPS_ID with 1
uint32_t uiNextParaSetIdToUseInBs; //mark the next SPS_ID_in_bs, for all layers
} SParaSetOffsetVariable;
typedef struct TagParaSetOffset {
//in PS0 design, "sParaSetOffsetVariable" record the previous paras before current IDR, AND NEED to be stacked and recover across IDR
SParaSetOffsetVariable
sParaSetOffsetVariable[PARA_SET_TYPE]; //PARA_SET_TYPE=3; paraset_type = 0: AVC_SPS; =1: Subset_SPS; =2: PPS
//in PSO design, "bPpsIdMappingIntoSubsetsps" uses the current para of current IDR period
bool
bPpsIdMappingIntoSubsetsps[MAX_DQ_LAYER_NUM+1]; // need not extra +1 due no MGS and FMO case so far
int32_t iPpsIdList[MAX_DQ_LAYER_NUM][MAX_PPS_COUNT]; //index0: max pps types; index1: for different IDRs, if only index0=1, index1 can reach MAX_PPS_COUNT
//#if _DEBUG
int32_t eSpsPpsIdStrategy;
//#endif
uint32_t uiNeededSpsNum;
uint32_t uiNeededSubsetSpsNum;
uint32_t uiNeededPpsNum;
uint32_t uiInUseSpsNum;
uint32_t uiInUseSubsetSpsNum;
uint32_t uiInUsePpsNum;
} SParaSetOffset;
*/
class CWelsParametersetIdNonConstant : public CWelsParametersetIdConstant {
public:
CWelsParametersetIdNonConstant (const bool bSimulcastAVC,
const int32_t kiSpatialLayerNum): CWelsParametersetIdConstant (bSimulcastAVC, kiSpatialLayerNum) {};
virtual void OutputCurrentStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList,
sWelsEncCtx* pCtx,
SExistingParasetList* pExistingParasetList);
virtual void LoadPreviousStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList);
};
class CWelsParametersetIdIncreasing : public CWelsParametersetIdNonConstant {
public:
CWelsParametersetIdIncreasing (const bool bSimulcastAVC,
const int32_t kiSpatialLayerNum): CWelsParametersetIdNonConstant (bSimulcastAVC, kiSpatialLayerNum) {};
virtual int32_t GetPpsIdOffset (const int32_t iPpsId);
virtual int32_t GetSpsIdOffset (const int32_t iPpsId, const int32_t iSpsId);
virtual void Update (const uint32_t kuiId, const int iParasetType);
protected:
//void ParasetIdAdditionIdAdjust (SParaSetOffsetVariable* sParaSetOffsetVariable, const int32_t kiCurEncoderParaSetId,
// const uint32_t kuiMaxIdInBs);
private:
void DebugPps (const int32_t kiPpsId);
void DebugSpsPps (const int32_t iPpsId, const int32_t iSpsId);
};
class CWelsParametersetSpsListing : public CWelsParametersetIdNonConstant {
public:
CWelsParametersetSpsListing (const bool bSimulcastAVC, const int32_t kiSpatialLayerNum);
virtual uint32_t GetNeededSubsetSpsNum();
virtual void LoadPrevious (SExistingParasetList* pExistingParasetList, SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray,
SWelsPPS* pPpsArray);
bool CheckParamCompatibility (SWelsSvcCodingParam* pCodingParam, SLogContext* pLogCtx);
virtual uint32_t GenerateNewSps (sWelsEncCtx* pCtx, const bool kbUseSubsetSps, const int32_t iDlayerIndex,
const int32_t iDlayerCount, uint32_t kuiSpsId,
SWelsSPS*& pSps, SSubsetSps*& pSubsetSps, bool bSVCBaselayer);
virtual void UpdateParaSetNum (sWelsEncCtx* pCtx);
int32_t GetSpsIdx (const int32_t iIdx) {
return iIdx;
};
virtual void OutputCurrentStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList,
sWelsEncCtx* pCtx,
SExistingParasetList* pExistingParasetList);
protected:
virtual void LoadPreviousSps (SExistingParasetList* pExistingParasetList, SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray);
virtual bool CheckPpsGenerating();
virtual int32_t SpsReset (sWelsEncCtx* pCtx, bool kbUseSubsetSps);
};
class CWelsParametersetSpsPpsListing : public CWelsParametersetSpsListing {
public:
CWelsParametersetSpsPpsListing (const bool bSimulcastAVC, const int32_t kiSpatialLayerNum);
//uint32_t GetNeededPpsNum();
virtual void UpdatePpsList (sWelsEncCtx* pCtx);
virtual uint32_t InitPps (sWelsEncCtx* pCtx, uint32_t kiSpsId,
SWelsSPS* pSps,
SSubsetSps* pSubsetSps,
uint32_t kuiPpsId,
const bool kbDeblockingFilterPresentFlag,
const bool kbUsingSubsetSps,
const bool kbEntropyCodingModeFlag);
virtual void UpdateParaSetNum (sWelsEncCtx* pCtx);
virtual int32_t GetCurrentPpsId (const int32_t iPpsId, const int32_t iIdrLoop);
virtual void OutputCurrentStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList,
sWelsEncCtx* pCtx,
SExistingParasetList* pExistingParasetList);
virtual void LoadPreviousStructure (SParaSetOffsetVariable* pParaSetOffsetVariable, int32_t* pPpsIdList);
protected:
virtual void LoadPreviousPps (SExistingParasetList* pExistingParasetList, SWelsPPS* pPpsArray);
virtual bool CheckPpsGenerating();
virtual int32_t SpsReset (sWelsEncCtx* pCtx, bool kbUseSubsetSps);
};
class CWelsParametersetSpsListingPpsIncreasing : public CWelsParametersetSpsListing {
public:
CWelsParametersetSpsListingPpsIncreasing (const bool bSimulcastAVC,
const int32_t kiSpatialLayerNum): CWelsParametersetSpsListing (bSimulcastAVC, kiSpatialLayerNum) {};
virtual int32_t GetPpsIdOffset (const int32_t kiPpsId);
virtual void Update (const uint32_t kuiId, const int iParasetType);
};
int32_t FindExistingSps (SWelsSvcCodingParam* pParam, const bool kbUseSubsetSps, const int32_t iDlayerIndex,
const int32_t iDlayerCount, const int32_t iSpsNumInUse,
SWelsSPS* pSpsArray,
SSubsetSps* pSubsetArray, bool bSVCBaseLayer);
}
#endif
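The header above follows a strategy pattern: a static factory returns one of several subclasses, and callers only see the interface. A minimal sketch of that shape, assuming hypothetical names (`IStrategySketch`, `EStrategySketch` and the two classes are illustrative stand-ins, not openh264 types); only the `GetSpsIdx` behaviour (0 for the constant-id class, `iIdx` for the SPS-listing class) is taken from the header:

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical selector; the real code uses EParameterSetStrategy.
enum EStrategySketch { kConstantId, kSpsListing };

class IStrategySketch {
 public:
  virtual ~IStrategySketch() {}
  virtual int32_t GetSpsIdx (const int32_t iIdx) = 0;
  // Factory: callers never name a concrete class.
  static IStrategySketch* Create (EStrategySketch eStrategy);
};

// Mirrors CWelsParametersetIdConstant::GetSpsIdx, which always returns 0.
class CConstantIdSketch : public IStrategySketch {
 public:
  virtual int32_t GetSpsIdx (const int32_t) { return 0; }
};

// Mirrors CWelsParametersetSpsListing::GetSpsIdx, which returns iIdx.
class CSpsListingSketch : public CConstantIdSketch {
 public:
  virtual int32_t GetSpsIdx (const int32_t iIdx) { return iIdx; }
};

IStrategySketch* IStrategySketch::Create (EStrategySketch eStrategy) {
  switch (eStrategy) {
  case kSpsListing:
    return new CSpsListingSketch();
  case kConstantId:
  default:
    return new CConstantIdSketch();
  }
}
```

The caller owns the returned pointer and must `delete` it; the virtual destructor in the interface makes that safe through a base pointer.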


@ -119,7 +119,7 @@ enum {
#define FRAME_iTargetBits_VARY_RANGE 50 // *INT_MULTIPLY
//R-Q Model
#define LINEAR_MODEL_DECAY_FACTOR 80 // *INT_MULTIPLY
#define FRAME_CMPLX_RATIO_RANGE 10 // *INT_MULTIPLY
#define FRAME_CMPLX_RATIO_RANGE 20 // *INT_MULTIPLY
#define SMOOTH_FACTOR_MIN_VALUE 2 // *INT_MULTIPLY
//#define VGOP_BITS_MIN_RATIO 0.8
//skip and padding
@ -140,21 +140,6 @@ enum {
TIME_WINDOW_TOTAL =2
};
typedef struct TagRCSlicing {
int32_t iComplexityIndexSlice;
int32_t iCalculatedQpSlice;
int32_t iStartMbSlice;
int32_t iEndMbSlice;
int32_t iTotalQpSlice;
int32_t iTotalMbSlice;
int32_t iTargetBitsSlice;
int32_t iBsPosSlice;
int32_t iFrameBitsSlice;
int32_t iGomBitsSlice;
int32_t iGomTargetBits;
//int32_t gom_coded_mb;
} SRCSlicing;
typedef struct TagRCTemporal {
int32_t iMinBitsTl;
int32_t iMaxBitsTl;
@ -164,7 +149,8 @@ int32_t iGopBitsDq;
int64_t iLinearCmplx; // *INT_MULTIPLY
int32_t iPFrameNum;
int32_t iFrameCmplxMean;
int32_t iMaxQp;
int32_t iMinQp;
} SRCTemporal;
typedef struct TagWelsRc {
@ -188,6 +174,7 @@ int32_t iCurrentBitsLevel;//0:normal; 1:limited; 2:exceeded.
int32_t iIdrNum;
int64_t iIntraComplexity; //255*255(MaxMbSAD)*36864(MaxFS) make the highest bit of 32-bit integer 1
int32_t iIntraMbCount;
int64_t iIntraComplxMean;
int8_t iTlOfFrames[VGOP_SIZE];
int32_t iRemainingWeights;
@ -198,6 +185,7 @@ int32_t* pGomForegroundBlockNum;
int32_t* pCurrentFrameGomSad;
int32_t* pGomCost;
int32_t bEnableGomQp;
int32_t iAverageFrameQp;
int32_t iMinFrameQp;
int32_t iMaxFrameQp;
@ -236,8 +224,7 @@ int32_t iBufferFullnessPadding;
int32_t iPaddingSize;
int32_t iPaddingBitrateStat;
bool bSkipFlag;
SRCSlicing* pSlicingOverRc;
int32_t iContinualSkipFrames;
SRCTemporal* pTemporalOverRc;
//for scc
@ -252,12 +239,11 @@ float fLatestFrameRate; // TODO: to complete later
} SWelsSvcRc;
typedef void (*PWelsRCPictureInitFunc) (sWelsEncCtx* pCtx,long long uiTimeStamp);
typedef void (*PWelsRCPictureDelayJudgeFunc) (sWelsEncCtx* pCtx, EVideoFrameType eFrameType, long long uiTimeStamp);
typedef void (*PWelsRCPictureDelayJudgeFunc) (sWelsEncCtx* pCtx,long long uiTimeStamp,int32_t iDidIdx);
typedef void (*PWelsRCPictureInfoUpdateFunc) (sWelsEncCtx* pCtx, int32_t iLayerSize);
typedef void (*PWelsRCMBInfoUpdateFunc) (sWelsEncCtx* pCtx, SMB* pCurMb, int32_t iCostLuma, SSlice* pSlice);
typedef void (*PWelsRCMBInitFunc) (sWelsEncCtx* pCtx, SMB* pCurMb, SSlice* pSlice);
typedef bool (*PWelsCheckFrameSkipBasedMaxbrFunc) (sWelsEncCtx* pCtx, int32_t iSpatialNum, EVideoFrameType eFrameType,
const uint32_t uiTimeStamp);
typedef void (*PWelsCheckFrameSkipBasedMaxbrFunc) (sWelsEncCtx* pCtx, const long long uiTimeStamp, int32_t iDidIdx);
typedef void (*PWelsUpdateBufferWhenFrameSkippedFunc)(sWelsEncCtx* pCtx, int32_t iSpatialNum);
typedef void (*PWelsUpdateMaxBrCheckWindowStatusFunc)(sWelsEncCtx* pCtx, int32_t iSpatialNum, const long long uiTimeStamp);
typedef bool (*PWelsRCPostFrameSkippingFunc)(sWelsEncCtx* pCtx, const int32_t iDid, const long long uiTimeStamp);
@ -275,8 +261,7 @@ PWelsUpdateMaxBrCheckWindowStatusFunc pfWelsUpdateMaxBrWindowStatus;
PWelsRCPostFrameSkippingFunc pfWelsRcPostFrameSkipping;
} SWelsRcFunc;
bool CheckFrameSkipBasedMaxbr (sWelsEncCtx* pCtx, int32_t iSpatialNum, EVideoFrameType eFrameType,
const uint32_t uiTimeStamp);
void CheckFrameSkipBasedMaxbr (sWelsEncCtx* pCtx,const long long uiTimeStamp, int32_t iDidIdx);
void UpdateBufferWhenFrameSkipped(sWelsEncCtx* pCtx, int32_t iSpatialNum);
void UpdateMaxBrCheckWindowStatus(sWelsEncCtx* pCtx, int32_t iSpatialNum, const long long uiTimeStamp);
bool WelsRcPostFrameSkipping(sWelsEncCtx* pCtx, const int32_t iDid, const long long uiTimeStamp);
@ -286,6 +271,9 @@ void RcTraceFrameBits (sWelsEncCtx* pEncCtx, long long uiTimeStamp);
void WelsRcInitModule (sWelsEncCtx* pCtx, RC_MODES iRcMode);
void WelsRcInitFuncPointers (sWelsEncCtx* pEncCtx, RC_MODES iRcMode);
void WelsRcFreeMemory (sWelsEncCtx* pCtx);
bool WelsRcCheckFrameStatus (sWelsEncCtx* pEncCtx,long long uiTimeStamp,int32_t iSpatialNum,int32_t iCurDid);
bool WelsUpdateSkipFrameStatus();
long long GetTimestampForRc(const long long uiTimeStamp, const long long uiLastTimeStamp, const float fFrameRate);
}
#endif //RC_H
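The `// *INT_MULTIPLY` comments mark constants stored in fixed point. A minimal sketch of how a range such as `FRAME_CMPLX_RATIO_RANGE` could bound a complexity ratio, assuming the scale factor is 100 (so 20 stands for 0.20); the helper name and its rounding are hypothetical and are not taken from the diff:

```cpp
#include <cassert>
#include <stdint.h>

// Assumption: INT_MULTIPLY is a fixed-point scale of 100.
static const int32_t kiIntMultiplySketch = 100;
// Value after this diff (previously 10).
static const int32_t kiFrameCmplxRatioRangeSketch = 20;

// Hypothetical helper: ratio of current to mean complexity, scaled by
// kiIntMultiplySketch and clamped to [1 - range, 1 + range].
inline int32_t ClampCmplxRatioSketch (const int64_t kiCurCmplx, const int64_t kiMeanCmplx) {
  if (kiMeanCmplx <= 0)
    return kiIntMultiplySketch; // no history yet: treat the ratio as 1.0
  // Rounded integer division keeps the computation free of floating point.
  const int64_t kiRatio = (kiCurCmplx * kiIntMultiplySketch + kiMeanCmplx / 2) / kiMeanCmplx;
  const int64_t kiLow = kiIntMultiplySketch - kiFrameCmplxRatioRangeSketch;
  const int64_t kiHigh = kiIntMultiplySketch + kiFrameCmplxRatioRangeSketch;
  if (kiRatio < kiLow)
    return (int32_t) kiLow;
  if (kiRatio > kiHigh)
    return (int32_t) kiHigh;
  return (int32_t) kiRatio;
}
```

Under these assumptions, widening the range from 10 to 20 lets a single frame's complexity move the ratio by up to 20% instead of 10%.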


@ -93,10 +93,56 @@ bool CheckCurMarkFrameNumUsed (sWelsEncCtx* pCtx);
*/
void WelsMarkPic (sWelsEncCtx* pCtx);
void InitRefListMgrFunc (SWelsFuncPtrList* pFuncList, const bool bEnableLongTermReference, const bool bScreenContent);
#ifdef LONG_TERM_REF_DUMP
void DumpRef (sWelsEncCtx* ctx);
#endif
class IWelsReferenceStrategy {
public:
IWelsReferenceStrategy() {};
virtual ~IWelsReferenceStrategy() { };
static IWelsReferenceStrategy* CreateReferenceStrategy (sWelsEncCtx* pCtx, const EUsageType keUsageType,
const bool kbLtrEnabled);
virtual bool BuildRefList (const int32_t iPOC, int32_t iBestLtrRefIdx) = 0;
virtual void MarkPic() = 0;
virtual bool UpdateRefList() = 0;
virtual void EndofUpdateRefList() = 0;
virtual void AfterBuildRefList() = 0;
protected:
virtual void Init (sWelsEncCtx* pCtx) = 0;
};
class CWelsReference_TemporalLayer : public IWelsReferenceStrategy {
public:
virtual bool BuildRefList (const int32_t iPOC, int32_t iBestLtrRefIdx);
virtual void MarkPic();
virtual bool UpdateRefList();
virtual void EndofUpdateRefList();
virtual void AfterBuildRefList();
void Init (sWelsEncCtx* pCtx);
protected:
sWelsEncCtx* m_pEncoderCtx;
};
class CWelsReference_Screen : public CWelsReference_TemporalLayer {
public:
virtual bool BuildRefList (const int32_t iPOC, int32_t iBestLtrRefIdx);
virtual void MarkPic();
virtual bool UpdateRefList();
virtual void EndofUpdateRefList();
virtual void AfterBuildRefList();
};
class CWelsReference_LosslessWithLtr : public CWelsReference_Screen {
public:
virtual bool BuildRefList (const int32_t iPOC, int32_t iBestLtrRefIdx);
virtual void MarkPic();
virtual bool UpdateRefList();
virtual void EndofUpdateRefList();
};
}
#endif//REFERENCE_PICTURE_LIST_MANAGEMENT_SVC_H__


@ -82,6 +82,11 @@ int32_t WelsIntra16x16Combined3Sad_ssse3 (uint8_t*, int32_t, uint8_t*, int32_t,
int32_t WelsIntraChroma8x8Combined3Satd_sse41 (uint8_t*, int32_t, uint8_t*, int32_t, int32_t*, int32_t, uint8_t*,
uint8_t*, uint8_t*);
int32_t WelsSampleSatd8x8_avx2 (uint8_t*, int32_t, uint8_t*, int32_t);
int32_t WelsSampleSatd8x16_avx2 (uint8_t*, int32_t, uint8_t*, int32_t);
int32_t WelsSampleSatd16x8_avx2 (uint8_t*, int32_t, uint8_t*, int32_t);
int32_t WelsSampleSatd16x16_avx2 (uint8_t*, int32_t, uint8_t*, int32_t);
#endif//X86_ASM
#if defined (HAVE_NEON)


@ -50,29 +50,33 @@ namespace WelsEnc {
#define WELS_QP_MAX 51
typedef uint64_t cabac_low_t;
enum { CABAC_LOW_WIDTH = sizeof (cabac_low_t) / sizeof (uint8_t) * 8 };
typedef struct TagStateCtx {
uint8_t m_uiState;
uint8_t m_uiValMps;
// Packed representation of state and MPS as state << 1 | MPS.
uint8_t m_uiStateMps;
uint8_t Mps() const { return m_uiStateMps & 1; }
uint8_t State() const { return m_uiStateMps >> 1; }
void Set (uint8_t uiState, uint8_t uiMps) { m_uiStateMps = uiState * 2 + uiMps; }
} SStateCtx;
typedef struct TagCabacCtx {
uint32_t m_uiLow;
cabac_low_t m_uiLow;
int32_t m_iLowBitCnt;
int32_t m_iRenormCnt;
uint32_t m_uiRange;
SStateCtx m_sStateCtx[WELS_CONTEXT_COUNT];
uint8_t* m_pBufStart;
uint8_t* m_pBufEnd;
uint8_t* m_pBufCur;
uint8_t m_iBitsOutstanding;
uint32_t m_uData;
uint32_t m_uiBitsUsed;
uint32_t m_iFirstFlag;
uint32_t m_uiBinCountsInNalUnits;
} SCabacCtx;
void WelsCabacContextInit (void* pCtx, SCabacCtx* pCbCtx, int32_t iModel);
void WelsCabacEncodeInit (SCabacCtx* pCbCtx, uint8_t* pBuf, uint8_t* pEnd);
void WelsCabacEncodeDecision (SCabacCtx* pCbCtx, int32_t iCtx, uint32_t uiBin);
void WelsCabacEncodeBypassOne (SCabacCtx* pCbCtx, uint32_t uiBin);
inline void WelsCabacEncodeDecision (SCabacCtx* pCbCtx, int32_t iCtx, uint32_t uiBin);
inline void WelsCabacEncodeBypassOne (SCabacCtx* pCbCtx, int32_t uiBin);
void WelsCabacEncodeTerminate (SCabacCtx* pCbCtx, uint32_t uiBin);
void WelsCabacEncodeUeBypass (SCabacCtx* pCbCtx, int32_t iExpBits, uint32_t uiVal);
void WelsCabacEncodeFlush (SCabacCtx* pCbCtx);
@ -81,5 +85,43 @@ int32_t WriteBlockResidualCabac (void* pEncCtx, int16_t* pCoffLevel, int32_t i
int32_t iCalRunLevelFlag,
int32_t iResidualProperty, int8_t iNC, SBitStringAux* pBs);
// private functions used by public inline functions.
void WelsCabacEncodeDecisionLps_ (SCabacCtx* pCbCtx, int32_t iCtx);
void WelsCabacEncodeUpdateLowNontrivial_ (SCabacCtx* pCbCtx);
inline void WelsCabacEncodeUpdateLow_ (SCabacCtx* pCbCtx) {
if (pCbCtx->m_iLowBitCnt + pCbCtx->m_iRenormCnt < CABAC_LOW_WIDTH) {
pCbCtx->m_iLowBitCnt += pCbCtx->m_iRenormCnt;
pCbCtx->m_uiLow <<= pCbCtx->m_iRenormCnt;
} else {
WelsCabacEncodeUpdateLowNontrivial_ (pCbCtx);
}
pCbCtx->m_iRenormCnt = 0;
}
// inline function definitions.
void WelsCabacEncodeDecision (SCabacCtx* pCbCtx, int32_t iCtx, uint32_t uiBin) {
if (uiBin == pCbCtx->m_sStateCtx[iCtx].Mps()) {
const int32_t kiState = pCbCtx->m_sStateCtx[iCtx].State();
uint32_t uiRange = pCbCtx->m_uiRange;
uint32_t uiRangeLps = g_kuiCabacRangeLps[kiState][(uiRange & 0xff) >> 6];
uiRange -= uiRangeLps;
const int32_t kiRenormAmount = uiRange >> 8 ^ 1;
pCbCtx->m_uiRange = uiRange << kiRenormAmount;
pCbCtx->m_iRenormCnt += kiRenormAmount;
pCbCtx->m_sStateCtx[iCtx].Set (g_kuiStateTransTable[kiState][1], uiBin);
} else {
WelsCabacEncodeDecisionLps_ (pCbCtx, iCtx);
}
}
void WelsCabacEncodeBypassOne (SCabacCtx* pCbCtx, int32_t uiBin) {
const uint32_t kuiBinBitmask = -uiBin;
pCbCtx->m_iRenormCnt++;
WelsCabacEncodeUpdateLow_ (pCbCtx);
pCbCtx->m_uiLow += kuiBinBitmask & pCbCtx->m_uiRange;
}
}
#endif
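Three bit-level idioms appear in the CABAC changes above: the packed `state << 1 | MPS` byte, the branch-free renormalization amount `uiRange >> 8 ^ 1`, and the bypass mask `-uiBin`. A standalone sketch of each, using hypothetical helper names; it assumes the range passed to the renormalization helper lies in [128, 512), which is the interval in which a single shift is enough:

```cpp
#include <cassert>
#include <stdint.h>

// Same layout as SStateCtx above: state in bits 1..7, MPS in bit 0.
struct SPackedStateSketch {
  uint8_t m_uiStateMps;
  uint8_t Mps() const { return m_uiStateMps & 1; }
  uint8_t State() const { return m_uiStateMps >> 1; }
  void Set (uint8_t uiState, uint8_t uiMps) { m_uiStateMps = uiState * 2 + uiMps; }
};

// '>>' binds tighter than '^', so this is (uiRange >> 8) ^ 1:
// 0 when bit 8 is set (range >= 256), 1 when one left shift is needed.
// Assumption: 128 <= uiRange < 512.
inline int32_t MpsRenormAmountSketch (const uint32_t uiRange) {
  return (int32_t) ((uiRange >> 8) ^ 1);
}

// Negating a 0/1 bin gives an all-zeros or all-ones mask, so the range is
// added to 'low' only for bin == 1, without a branch.
inline uint32_t BypassAddendSketch (const int32_t iBin, const uint32_t uiRange) {
  const uint32_t kuiBinBitmask = (uint32_t) -iBin;
  return kuiBinBitmask & uiRange;
}
```

Packing both fields into one byte halves the size of the context table and lets state and MPS be read with a single load, which is presumably why the diff replaces the two separate members.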


@ -75,9 +75,13 @@ int32_t WriteBlockResidualCavlc (SWelsFuncPtrList* pFuncList, int16_t* pCoffLev
extern "C" {
#endif//__cplusplus
int32_t CavlcParamCal_c (int16_t* pCoffLevel, uint8_t* pRun, int16_t* pLevel, int32_t* pTotalCoeffs ,
int32_t iEndIdx);
#ifdef X86_ASM
int32_t CavlcParamCal_sse2 (int16_t* pCoffLevel, uint8_t* pRun, int16_t* pLevel, int32_t* pTotalCoeffs ,
int32_t iEndIdx);
int32_t CavlcParamCal_sse42 (int16_t* pCoffLevel, uint8_t* pRun, int16_t* pLevel, int32_t* pTotalCoeffs ,
int32_t iEndIdx);
#endif
#if defined(__cplusplus)


@ -42,6 +42,7 @@
#include "parameter_sets.h"
#include "svc_enc_slice_segment.h"
#include "set_mb_syn_cabac.h"
#include "nal_encap.h"
namespace WelsEnc {
@ -79,6 +80,21 @@ bool bLongTermRefFlag;
bool bAdaptiveRefPicMarkingModeFlag;
} SRefPicMarking;
// slice level rc statistic info
typedef struct TagRCSlicing {
int32_t iComplexityIndexSlice;
int32_t iCalculatedQpSlice;
int32_t iStartMbSlice;
int32_t iEndMbSlice;
int32_t iTotalQpSlice;
int32_t iTotalMbSlice;
int32_t iTargetBitsSlice;
int32_t iBsPosSlice;
int32_t iFrameBitsSlice;
int32_t iGomBitsSlice;
int32_t iGomTargetBits;
//int32_t gom_coded_mb;
} SRCSlicing;
/* Header of slice syntax elements, refer to Page 63 in JVT X201wcm */
typedef struct TagSliceHeader {
@ -157,6 +173,7 @@ typedef struct TagSlice {
// mainly for multiple threads imp.
SMbCache sMbCacheInfo; // MBCache is introduced within slice dependency
SBitStringAux* pSliceBsa;
SWelsSliceBs sSliceBs;
/*******************************sSliceHeader****************************/
SSliceHeaderExt sSliceHeaderExt;
@ -181,6 +198,12 @@ uint8_t uiReservedFillByte; // reserved to meet 4 bytes alignment
SCabacCtx sCabacCtx;
int32_t iCabacInitIdc;
int32_t iMbSkipRun;
int32_t iCountMbNumInSlice;
uint32_t uiSliceConsumeTime;
int32_t iSliceComplexRatio;
SRCSlicing sSlicingOverRc; //slice level rc statistic info
} SSlice, *PSlice;
}


@ -51,17 +51,16 @@
#include "WelsThreadLib.h"
namespace WelsEnc {
void UpdateMbListNeighborParallel (SSliceCtx* pSliceCtx,
void UpdateMbListNeighborParallel (SDqLayer* pCurDq,
SMB* pMbList,
const int32_t kiSliceIdc);
void CalcSliceComplexRatio (void* pRatio, SSliceCtx* pSliceCtx, uint32_t* pSliceConsume);
void CalcSliceComplexRatio (SDqLayer* pCurDq);
int32_t NeedDynamicAdjust (void* pConsumeTime, const int32_t kiSliceNum);
void DynamicAdjustSlicing (sWelsEncCtx* pCtx,
SDqLayer* pCurDqLayer,
void* pComplexRatio,
int32_t iCurDid);
int32_t RequestMtResource (sWelsEncCtx** ppCtx, SWelsSvcCodingParam* pParam, const int32_t kiCountBsLen,
@ -99,6 +98,7 @@ void TrackSliceConsumeTime (sWelsEncCtx* pCtx, int32_t* pDidList, const int32_t
#endif//defined(MT_DEBUG)
void SetOneSliceBsBufferUnderMultithread(sWelsEncCtx* pCtx, const int32_t kiThreadIdx, const int32_t iSliceIdx);
int32_t WriteSliceBs (sWelsEncCtx* pCtx,SWelsSliceBs* pSliceBs,const int32_t iSliceIdx,int32_t& iSliceSize);
}
#endif//SVC_SLICE_MULTIPLE_THREADING_H__


@ -68,10 +68,17 @@ uint8_t uiFMEGoodFrameCount;
int32_t iHighFreMbCount;
} SFeatureSearchPreparation; //maintain only one
typedef struct TagSliceThreadInfo {
SSlice* pSliceInThread[MAX_THREADS_NUM];// slice buffer for multi thread,
// will not be allocated when multi thread is off
int32_t iMaxSliceNumInThread[MAX_THREADS_NUM];
int32_t iEncodedSliceNumInThread[MAX_THREADS_NUM];
}SSliceThreadInfo;
typedef struct TagLayerInfo {
SNalUnitHeaderExt sNalHeaderExt;
SSlice*
pSliceInLayer;// Here SSlice identify to Frame on concept, [iSliceIndex], need memory block external side for MT
SSlice* pSliceInLayer; // Here SSlice identify to Frame on concept, [iSliceIndex],
// may need extend list size for sliceMode=SM_SIZELIMITED_SLICE
SSubsetSps* pSubsetSpsP; // current pSubsetSps used, memory alloc in external
SWelsSPS* pSpsP; // current pSps based avc used, memory alloc in external
SWelsPPS* pPpsP; // current pPps used
@ -79,7 +86,9 @@ SWelsPPS* pPpsP; // current pPps used
/* Layer Representation */
struct TagDqLayer {
SLayerInfo sLayerInfo;
SSliceThreadInfo sSliceThreadInfo;
SSlice** ppSliceInLayer;
SSliceCtx sSliceEncCtx; // current slice context
uint8_t* pCsData[3]; // pointer to reconstructed picture pData
int32_t iCsStride[3]; // Cs stride
@ -105,11 +114,11 @@ SPicture* pRefPic; // reference picture pointer
SPicture* pDecPic; // reconstruction picture pointer for layer
SPicture* pRefOri[MAX_REF_PIC_COUNT];
SSliceCtx* pSliceEncCtx; // current slice context
int32_t iMaxSliceNum;
int32_t* pNumSliceCodedOfPartition; // for dynamic slicing mode
int32_t* pLastCodedMbIdxOfPartition; // for dynamic slicing mode
int32_t* pLastMbIdxOfPartition; // for dynamic slicing mode
bool bNeedAdjustingSlicing;
SFeatureSearchPreparation* pFeatureSearchPreparation;


@ -69,7 +69,7 @@ SMVUnitXY sP16x16Mv;
uint8_t uiLumaQp; // uiLumaQp: pPps->iInitialQp + sSliceHeader->delta_qp + mb->dquant.
uint8_t uiChromaQp;
uint16_t uiSliceIdc; // 2^16=65536 > MaxFS(36864) of level 5.1; AVC: pFirstMbInSlice?; SVC: (pFirstMbInSlice << 7) | ((uiDependencyId << 4) | uiQualityId);
uint16_t uiSliceIdc; // 2^16=65536 > MaxFS(36864) of level 5.1; AVC: iFirstMbInSlice?; SVC: (iFirstMbInSlice << 7) | ((uiDependencyId << 4) | uiQualityId);
uint32_t uiChromPredMode;
int32_t iLumaDQp;
SMVUnitXY sMvd[MB_BLOCK4x4_NUM]; //only for CABAC writing; storage structure the same as sMv, in 4x4 scan order.
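The `uiSliceIdc` comment describes an SVC layout of `(iFirstMbInSlice << 7) | ((uiDependencyId << 4) | uiQualityId)`. A sketch of that packing and its inverse, with hypothetical helper names; it uses a 32-bit value for clarity, since a `uint16_t` leaves only 9 bits for the first-MB field, and it assumes a 3-bit dependency id and a 4-bit quality id:

```cpp
#include <cassert>
#include <stdint.h>

// Bits 0..3: quality id, bits 4..6: dependency id, bits 7 and up: first MB.
inline uint32_t PackSliceIdcSketch (const uint32_t uiFirstMb, const uint32_t uiDependencyId,
                                    const uint32_t uiQualityId) {
  return (uiFirstMb << 7) | ((uiDependencyId & 7) << 4) | (uiQualityId & 15);
}

inline uint32_t SliceIdcFirstMbSketch (const uint32_t uiIdc) { return uiIdc >> 7; }
inline uint32_t SliceIdcDependencyIdSketch (const uint32_t uiIdc) { return (uiIdc >> 4) & 7; }
inline uint32_t SliceIdcQualityIdSketch (const uint32_t uiIdc) { return uiIdc & 15; }
```

The question mark in the original comment suggests the layout was tentative, so treat this as an illustration of the formula rather than a description of what the encoder stores.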


@ -71,7 +71,9 @@ namespace WelsEnc {
#define JUMPPACKETSIZE_CONSTRAINT(max_byte) ( max_byte - AVER_MARGIN_BYTES ) //in bytes
#define JUMPPACKETSIZE_JUDGE(len,mb_idx,max_byte) ( (len) > JUMPPACKETSIZE_CONSTRAINT(max_byte) ) //( (mb_idx+1)%40/*16slice for compare*/ == 0 ) //
//cur_mb_idx is for early tests, can be omitted in optimization
typedef struct TagSlice SSlice;
typedef struct TagDqLayer SDqLayer;
typedef struct TagWelsEncCtx sWelsEncCtx;
/*!
* \brief SSlice context
*/
@ -83,10 +85,9 @@ int16_t iMbHeight; /* height of picture size in mb
int32_t iSliceNumInFrame; /* count number of slices in frame; */
int32_t iMbNumInFrame; /* count number of MBs in frame */
uint16_t* pOverallMbMap; /* overall MB map in frame, store virtual slice idc; */
int32_t* pFirstMbInSlice; /* first MB address top-left based in every slice respectively; */
int32_t* pCountMbNumInSlice; /* count number of MBs in every slice respectively; */
uint32_t uiSliceSizeConstraint; /* in byte */
int32_t iMaxSliceNumConstraint; /* maximal number of slices constraint */
} SSliceCtx;
@ -106,7 +107,7 @@ uint8_t uiLastMbQp;
/*!
* \brief Initialize Wels SSlice context (Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context to be initialized
* \param pCurDq current layer which its SSlice context will be initialized
* \param bFmoUseFlag flag of using fmo
* \param iMbWidth MB width
* \param iMbHeight MB height
@ -116,80 +117,81 @@ uint8_t uiLastMbQp;
*
* \return 0 - successful; nonzero - failed;
*/
int32_t InitSlicePEncCtx (SSliceCtx* pSliceCtx,
int32_t InitSlicePEncCtx (SDqLayer* pCurDq,
CMemoryAlign* pMa,
bool bFmoUseFlag,
int32_t iMbWidth,
int32_t iMbHeight,
SSliceConfig* pMulSliceOption,
SSliceArgument* pSliceArgument,
void* pPpsArg);
/*!
* \brief Uninitialize Wels SSlice context (Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context to be initialized
* \param pCurDq current layer whose SSlice context will be uninitialized
*
* \return NONE;
*/
void UninitSlicePEncCtx (SSliceCtx* pSliceCtx, CMemoryAlign* pMa);
void UninitSlicePEncCtx (SDqLayer* pCurDq, CMemoryAlign* pMa);
/*!
* \brief Get slice idc for given iMbXY (apply in Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context
* \param kiMbXY MB xy index
* \param pCurDq current layer info
* \param kiMbXY MB xy index
*
* \return uiSliceIdc - successful; (uint8_t)(-1) - failed;
*/
uint16_t WelsMbToSliceIdc (SSliceCtx* pSliceCtx, const int32_t kiMbXY);
uint16_t WelsMbToSliceIdc (SDqLayer* pCurDq, const int32_t kiMbXY);
/*!
* \brief Get first mb in slice/slice_group: uiSliceIdc (apply in Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context
* \param pSliceInLayer slice list in current layer
* \param kiSliceIdc slice idc
*
* \return first_mb - successful; -1 - failed;
*/
int32_t WelsGetFirstMbOfSlice (SSliceCtx* pSliceCtx, const int32_t kiSliceIdc);
int32_t WelsGetFirstMbOfSlice (SSlice* pSliceInLayer, const int32_t kiSliceIdc);
/*!
* \brief Get successive mb to be processed in slice/slice_group: uiSliceIdc (apply in Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context
* \param kiMbXY MB xy index
* \param pCurDq current layer info
* \param kiMbXY MB xy index
*
* \return next_mb - successful; -1 - failed;
*/
int32_t WelsGetNextMbOfSlice (SSliceCtx* pSliceCtx, const int32_t kiMbXY);
int32_t WelsGetNextMbOfSlice (SDqLayer* pCurDq, const int32_t kiMbXY);
/*!
* \brief Get previous mb to be processed in slice/slice_group: uiSliceIdc (apply in Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context
* \param pCurDq current layer info
* \param kiMbXY MB xy index
*
* \return prev_mb - successful; -1 - failed;
*/
int32_t WelsGetPrevMbOfSlice (SSliceCtx* pSliceCtx, const int32_t kiMbXY);
int32_t WelsGetPrevMbOfSlice (SDqLayer* pCurDq, const int32_t kiMbXY);
/*!
* \brief Get number of mb in slice/slice_group: uiSliceIdc (apply in Single/multiple slices and FMO)
*
* \param pSliceCtx SSlice context
* \param pCurDq current layer info
* \param kiSliceIdc slice/slice_group idc
*
* \return count_num_of_mb - successful; -1 - failed;
*/
int32_t WelsGetNumMbInSlice (SSliceCtx* pSliceCtx, const int32_t kiSliceIdc);
int32_t WelsGetNumMbInSlice (SDqLayer* pCurDq, const int32_t kiSliceIdc);
/*!
* Get slice count for multiple slice segment
*
*/
int32_t GetInitialSliceNum (const int32_t kiMbWidth, const int32_t kiMbHeight, SSliceConfig* pMso);
int32_t GetCurrentSliceNum (const SSliceCtx* kpSliceCtx);
int32_t GetInitialSliceNum (const int32_t kiMbWidth, const int32_t kiMbHeight, SSliceArgument* pSliceArgument);
int32_t GetCurrentSliceNum (const SDqLayer* pCurDq);
SSlice* GetSliceByIndex(sWelsEncCtx* pCtx, const int32_t kiSliceIdc);
//checking valid para
int32_t DynamicMaxSliceNumConstraint (uint32_t uiMaximumNum, int32_t uiConsumedNum, uint32_t uiDulplicateTimes);
@ -202,7 +204,7 @@ bool GomValidCheckSliceNum (const int32_t kiMbWidth, const int32_t kiMbHeight, u
bool GomValidCheckSliceMbNum (const int32_t kiMbWidth, const int32_t kiMbHeight, SSliceArgument* pSliceArg);
//end of checking valid para
int32_t DynamicAdjustSlicePEncCtxAll (SSliceCtx* pSliceCtx,
int32_t DynamicAdjustSlicePEncCtxAll (SDqLayer* pCurDq,
int32_t* pRunLength);
}
#endif//WELS_SLICE_SEGMENT_H__
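The lookup functions declared above rely on `pOverallMbMap`, which stores one slice idc per macroblock. A minimal sketch of how such a map can answer "which slice owns this MB" and "what is the next MB in the same slice", assuming raster-order slices; the struct, the function names and the `(uint16_t)(-1)` failure sentinel are assumptions for illustration, not the library's implementation:

```cpp
#include <cassert>
#include <stdint.h>
#include <vector>

// Hypothetical stand-in for the per-layer map: slice idc for each MB,
// indexed by MB address in raster order.
struct SSliceMapSketch {
  std::vector<uint16_t> vOverallMbMap;
};

// Returns the slice idc of MB kiMbXY, or (uint16_t)(-1) if out of range.
inline uint16_t MbToSliceIdcSketch (const SSliceMapSketch& sMap, const int32_t kiMbXY) {
  if (kiMbXY >= 0 && kiMbXY < (int32_t) sMap.vOverallMbMap.size())
    return sMap.vOverallMbMap[kiMbXY];
  return (uint16_t) (-1);
}

// Returns the following MB if it belongs to the same slice, otherwise -1.
inline int32_t NextMbOfSliceSketch (const SSliceMapSketch& sMap, const int32_t kiMbXY) {
  const int32_t kiMbNum = (int32_t) sMap.vOverallMbMap.size();
  const int32_t kiNext = kiMbXY + 1;
  if (kiMbXY < 0 || kiNext >= kiMbNum)
    return -1;
  return sMap.vOverallMbMap[kiNext] == sMap.vOverallMbMap[kiMbXY] ? kiNext : -1;
}
```

With a four-MB map `{0, 0, 1, 1}`, MB 0 is followed by MB 1 in slice 0, while MB 1 has no successor because MB 2 starts slice 1.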
