The obj_int_extract code is no longer worth maintaining. It creates
significant issues when adapting for different build systems and no
longer offers as significant of a performance benefit due to
improvements in intrinsics.
Source files will remain until the various third-party builds are updated.
The neon fast quantizer has been moved to intrinsics. The armv6 version
has been removed because so few remaining targets require it.
Compilers and processors have improved significantly since the
pack_tokens code was written. The assembly is no longer faster than the
C code.
pack_tokens were the only optimizations for the armv5te targets so the targets
will be removed after the test infrastructure has been updated.
BUG=710
Change-Id: Ic785b167cd9f95eeff31c7c76b7b736c07fb30eb
This patch was to fix the vpxdec fuzzing3 test failure. When an
error occurs, setjmp() is invoked, which calls the decoder
removing routine. In multiple thread situation, other threads
could try to access the frame context memory that is already
deallocated, thus causing a segfault.
An invalid unit test was added for this issue.
Change-Id: Ida7442154f3d89759483f0f4fe0324041fffb952
This commit removes the cyclic aq mode dependency on
in_static_area and reworks the corresponding cut-off thresholds.
It improves the compression performance of speed -5 by 1.47% in
PSNR and 2.07% in SSIM, and the compression performance of speed
-6 by 3.10% in PSNR and 5.25% in SSIM. Speed wise, about 1% faster
in both settings at high bit-rates.
Change-Id: I1ffc775afdc047964448d9dff5751491ba4ff4a9
This will save the memory and improve the decode speed due to
removing unnecessary memset of big prev_mi array for
all the key frames.
Decoding a all key frames 1080p video shows speed improve around 2%.
Change-Id: I6284a445c1291056e3c15135c3c20d502f791c10
Check that the numerator is not zero. If it is, guess 30fps.
Fixes a clang IOC error in the quantize test. It's very unlikely for
this to occur in the wild because the setup in the quantize test is very
nonstandard.
Change-Id: Icdab7b81d4e168d3423e14db20787f960052e0c3
This commit makes the RTC coding mode to conditionally skip the
reference frame mode search, when the predicted motion vector of
the current reference frame gives more than two times sum of
absolute difference compared to that of other reference frames.
It reduces the runtim by 1% - 4% for speed -5 and -6. The average
compression performance is improved by about 0.1% in both settings.
It is of particular benefit to light change scenarios. The
compression performance of test clip mmmovingvga.y4m is improved by
6.39% and 15.69% at high bit rates for speed -5 and -6, respectively.
Speed -5
vidyo1 16555 b/f, 40.818 dB, 12422 ms ->
16552 b/f, 40.804 dB, 12100 ms
nik 33211 b/f, 39.138 dB, 11341 ms ->
33228 b/f, 39.139 dB, 11023 ms
mmmoving 33263 b/f, 40.935 dB, 13508 ms ->
33256 b/f, 41.068 dB, 12861 ms
Speed -6
vidyo1 16541 b/f, 40.227 dB, 8437 ms ->
16540 b/f, 40.220 dB, 8216 ms
nik 33272 b/f, 38.399 dB, 7610 ms ->
33267 b/f, 38.414 dB, 7490 ms
mmmoving 33255 b/f, 40.555 dB, 7523 ms ->
33257 b/f, 40.975 dB, 7493 ms
Change-Id: Id2aef76ef74a3cba5e9a82a83b792144948c6a91
This commit unfolds the legacy macro definitions used in the
sub-pixel motion search and refactors the operational flow for
later optimizations.
Change-Id: I3e3f770cad961d03d1a6eb0b2a0186cc77eaf2b8
The current logic was allowing for disabling golden refresh only
for two pass svc encoding. This change disables it as long as
more than 1 layer encoding is used (for example temporal layers under 1pass CBR).
Change-Id: I4dc5204a7ad365c821ec7963e93b59da82e1826b
In the function mb_lpf_horizontal_edge_w_avx2_16 the usage of the intrinsic
_mm256_cvtepu8_epi16 cause a compiler bug in gcc 4.9.1.
until it will be fixed I created a workaround that create the up convert by
using broadcast128+shuffle.
The bug was reported here:
https://code.google.com/p/webm/issues/detail?id=867
Change-Id: I73452e6806f42e0fadcde96b804ea3afa7eeb351
A recent change has introduced big quality drops for speed 7 and 12
for --rt mode. The change reverted the big drop and improved quality
by 9.5% for speed 7 and 13.4% for speed 12.
Change-Id: I07b82e3bb6002a73af486a083458c88877bdad01
This will save a lot of memory for decoder due to removing of prev_mi,
but prev_mi is still needed in encoder. So this will increase a little bit
memory for encoder.
Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed
Use intrinsics for neon quantization. Slight loss (<5%) of performance
compared to the assembly. Roughly 10x faster on arm64 because that was
running C code before.
Change-Id: I7cf5242d8f29b7eab5bca6a1c20c89c9fc9ca66d
This commit makes the inter prediction buffer system to support
hybrid partition search. It reduces the runtime of speed -5 by
about 3%. No compression performance change.
vidyo1 720p 1000 kbps
11831 ms -> 11497 ms
nik 720p 1000 kbps
10919 ms -> 10645 ms
Change-Id: I5b2da747c6395c253cd074d3907f5402e1840c36