Fixes issue on iPad Pro 10.5 (and probably other places) where threads
are not properly synchronized. On x86 this data race was benign as load
and store instructions are atomic, they were being atomic in practice as
the program hasn't been observed to be miscompiled.
Such guarantees are not made outside x86, and real problems manifested
where libvpx reliably reproduced a broken bitstream for even just the
initial keyframe. This was detected in WebRTC where this device started
using multithreading (as its CPU count is higher than earlier devices,
where the problem did not manifest as single-threading was used in
practice).
This issue was not detected under thread-sanitizer bots as mutexes were
conditionally used under this platform to simulate the protected read
and write semantics that were in practice provided on x86 platforms.
This change also removes several mutexes, so encoder/decoder state is
lighter-weight after this change and we do not need to initialize so
many mutexes (this was done even on non-thread-sanitizer platforms where
they were unused).
Change-Id: If41fcb0d99944f7bbc8ec40877cdc34d672ae72a
Reapply this patch:
ff0107f Amend and improve VP8 multithreading implementation
Amended the patch to add a unit test, and fix an asan error.
BUG=webm:851
Change-Id: I6572c03256169c64e80248bf5a5e99f59a2fc93c
For some reason allocated_decoding_thread_count is signed, but decoding_thread_count is not.
Cleans -Wextra/-Wsign-compare:
comparison between signed and unsigned integer expressions
Change-Id: Id0ada78100acff27c1c4ed7493c563d13c55cdcd
1 - stops de allocating before threads are closed.
2 - limits threads to mb_rows when mb_rows < partitions
BUG=webm:851
Change-Id: I7ead53e80cc0f8c2e4c1c53506eff8431de2a37e
There are flaws in current implementation of VP8 multithreading encoder
and decoder as reported in the following issue:
https://code.google.com/p/chromium/issues/detail?id=158922
Although the data race warnings are harmless, and wouldn't cause real
problems while encoding and decoding videos, it is better to fix the
warnings so that VP8 code could pass the TSan test.
To synchronize the thread-shared data access and maintain the speed
(i.e. decoding speed), use multiple mutexes based on mb_rows to reduce
the number of synchronizations needed, make the reads and writes of
the shared data protected, and reduce the number of mb_col writes by
nsync times.
The decoder speed tests showed < 3% speed loss while using 2 ~ 4
threads.
Change-Id: Ie296defffcd86a693188b668270d811964227882
and denoising.c
Adding -Wshadow to CFLAGS generated a bunch of warnings. This patch
removes these warnings.
Change-Id: I434a9f4bfac9ad4ab7d2a67a35ef21e6636280da
Writing all #define guards using the same style. Inlining macro
VP8DX_BOOL_DECODER_FILL into vp8dx_bool_decoder_fill. Removing unnecessary
includes.
Change-Id: I483fa979ab34008bf7835b5f34c6471c44daf956
So that, in case of error, the arrays are not filled with trash
pointers that are attempted a free() during vp8mt_de_alloc_temp_buffers()
Change-Id: Ic074549c2903a43316510eb42e4f393e7d3ee528
Initialize the top line at the beginning of each frame and
the left column at the beginning of each row.
Change-Id: I5412f7ea49ffc490215cf65a62715a6c5e3a5a29
predict_d has become canonical. Remove previous helper function.
Disable ARM assembly pending update.
Change-Id: Idd84ac8a28f9b0221ea97904a77de1e705d06a7d
Allows building the library with the gcc -pedantic option, for improved
portabilty. In particular, this commit removes usage of C99/C++ style
single-line comments and dynamic struct initializers. This is a
continuation of the work done in commit 97b766a46, which removed most
of these warnings for decode only builds.
Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
Make functions only referenced from one translation unit static. Other
symbols with extern linkage get a vp8/vpx prefix.
Change-Id: I928c7e0d0d36e89ac78cb54ff8bb28748727834f
Increment the last_row_mb_col counter by nsync after last MB of row is
ready. This way we dont need to check for last MB of row when
synching.
Set last MB of row ready just after row extension is done, This avoids
o potential race condition whit the processing of last MB of next row.
Change-Id: I19c44fd6041116ee5483be2143b4f4bfcd149eac
The block pointers and offset do not need to be calculated for every
frame. Block internal predictors can be update once when decoder is
allocated. Destination and previous buffer offsets have to be updated
just when frame size is changing.
Change-Id: I92ca8df0e6aaac4cc35ab890751d446760bf82e2
Eliminated some mb branches along with other code cleanups.
This is part of an ongoing effort to remove cut/paste
code in the decoder.
Change-Id: Ifabb0f67cafa6922b5a0e89a0d03a9b34e9e5752
Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead. vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict. Future commits will fix this
code duplication.
Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5
Reworked the code to use vp8_build_intra_predictors_mbuv_s
instead. This is WIP with the goal of eliminating all
functions in reconintra_mt.h
Change-Id: I61c4a132684544b24a38c4a90044597c6ec0dd52