Fixes issue on iPad Pro 10.5 (and probably other places) where threads
are not properly synchronized. On x86 this data race was benign as load
and store instructions are atomic, they were being atomic in practice as
the program hasn't been observed to be miscompiled.
Such guarantees are not made outside x86, and real problems manifested
where libvpx reliably reproduced a broken bitstream for even just the
initial keyframe. This was detected in WebRTC where this device started
using multithreading (as its CPU count is higher than earlier devices,
where the problem did not manifest as single-threading was used in
practice).
This issue was not detected under thread-sanitizer bots as mutexes were
conditionally used under this platform to simulate the protected read
and write semantics that were in practice provided on x86 platforms.
This change also removes several mutexes, so encoder/decoder state is
lighter-weight after this change and we do not need to initialize so
many mutexes (this was done even on non-thread-sanitizer platforms where
they were unused).
Change-Id: If41fcb0d99944f7bbc8ec40877cdc34d672ae72a
Reapply this patch:
ff0107f Amend and improve VP8 multithreading implementation
Amended the patch to add a unit test, and fix an asan error.
BUG=webm:851
Change-Id: I6572c03256169c64e80248bf5a5e99f59a2fc93c
There are flaws in current implementation of VP8 multithreading encoder
and decoder as reported in the following issue:
https://code.google.com/p/chromium/issues/detail?id=158922
Although the data race warnings are harmless, and wouldn't cause real
problems while encoding and decoding videos, it is better to fix the
warnings so that VP8 code could pass the TSan test.
To synchronize the thread-shared data access and maintain the speed
(i.e. decoding speed), use multiple mutexes based on mb_rows to reduce
the number of synchronizations needed, make the reads and writes of
the shared data protected, and reduce the number of mb_col writes by
nsync times.
The decoder speed tests showed < 3% speed loss while using 2 ~ 4
threads.
Change-Id: Ie296defffcd86a693188b668270d811964227882
The obj_int_extract code is no longer worth maintaining. It creates
significant issues when adapting for different build systems and no
longer offers as significant of a performance benefit due to
improvements in intrinsics.
Source files will remain until the various third-party builds are updated.
The neon fast quantizer has been moved to intrinsics. The armv6 version
has been removed because so few remaining targets require it.
Compilers and processors have improved significantly since the
pack_tokens code was written. The assembly is no longer faster than the
C code.
pack_tokens were the only optimizations for the armv5te targets so the targets
will be removed after the test infrastructure has been updated.
BUG=710
Change-Id: Ic785b167cd9f95eeff31c7c76b7b736c07fb30eb
Allow for an option to selectively apply the deblocking loop filter to the denoised
raw block, based on the denoised state (no-filter, filter with zero motion, or filter with non-zero motion)
of the current block and its upper and left denoised block.
This helps to reduce some blocking artifacts from the motion-compensated denoising.
Change-Id: I0ac4e70076df69a98c5391979e739a2681e24ae6
Condition the existing zbin boost logic for gf/altf mode to temporal layers==1,
since gf/altf reference frames are used in temporal layers as reference frames.
Change-Id: I618bb20730e5f193e078215d06f54997c363dd7b
ratectrl.c and quantize.c
Adding -Wshadow to CFLAGS generated a bunch of warnings. This patch
removes these warnings.
Change-Id: I8c8faa9fde57c1c49662d332a90bc8d9a0f4a2ce
WIP: Fixing unsafe threading in VP8 encoder.
Use the passed in macroblock instead of the macroblock located in
cpi.
Change-Id: I1bfa07de6ea463f2baeaae1bae5d950691bc4afc
Multi-threaded code was not updated to disable background
refresh for non base-layer frames at the time it was
disabled in the main C-code.
Change-Id: Id6cc376130b7def046942121cfd0526b4f0a71d4
The current way of counting inter_zz_count doesn't work correctly
in multi-threaded encoding. Calculating it after the frame is
encoded fixed the problem.
Change-Id: Ifcb1972cde950b8cc194f75c6d7b6af09e8b0e65
Precalculated block ptrs do not need updates during encoding.
Set these at init stage.
Moved the allocation of 'mt_current_mb_col' (last encoded MB on each
row) to vp8_alloc_compressor_data(), so that it is correctly
reallocated when frame size is changing.
Change-Id: Idcdaa2d0cf3a7f782b7d888626b7cf22a4ffb5c1
Changes relating to Issue 411
Removed code that was clearing down the segmentation data each
frame.
Added range/parameter checking in vp8_set_roimap(); Return error
if called when cyclic_refresh is enabled.
Correct setup_features() so that it sets or clears the segment update
flags as appropriate.
Change-Id: Ib31ac53006640ddf1ba7b9ec8f8b952e3eff860a
Allows building the library with the gcc -pedantic option, for improved
portabilty. In particular, this commit removes usage of C99/C++ style
single-line comments and dynamic struct initializers. This is a
continuation of the work done in commit 97b766a46, which removed most
of these warnings for decode only builds.
Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966