this is consistent with other threaded tests and ensures gtest_filters
meant to operate on these pick them up
Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b
vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster
than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are
code relocations, no new code.
Change-Id: I5dac0e98cc411a4ce05660406921118986638d19
'in' is used for the reference fdct. 'coeff' is input to the idct being
tested and 'dst[16]' is output
Fixes a segfault on unaligned memory access on x86.
Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962
this was deleted in:
98967645a Remove vpx_idct8x8_64_add_ssse3()
but this was merged in:
9e03eedf6 Merge changes Ib26dd515,Ie60dabc3
after:
a92991133 Merge "dct tests: run all possible sizes in one test"
which added a new reference
Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25
Modify fdct4x4_test.cc to support all size combinations. This does not
add any new tests and in fact fails a few. There were minimal changes
made to the tests so it's not entirely surprising that some of the
larger 12 bit transforms are failing since it was initially only used
for 4x4.
In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and
dct32x32_test.cc will be evaluated and moved to dct_test.cc.
BUG=webm:1424
Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f
It's almost identical with vpx_idct8x8_64_add_sse2(), except little
difference in instructions order.
Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f
the check for error correction being disabled was overriding the data
length checks. this avoids returning incorrect information (width /
height) for the decoded frame which could result in inconsistent sizes
returned in to an application causing it to read beyond the bounds of
the frame allocation.
BUG=webm:1443
BUG=b/62458770
Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342
Roughly 2x speedup. Since the only change for HBD is to store(), the
improvement appears to hold there as well.
BUG=webm:1424
Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.
BUG=webm:1435
Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
Continue processing sets of 16 values. Plenty of improvement for 4x8
(doubles the speed) but only about 30% for 4x4.
BUG=webm:1422
Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905
Add PartialIDctTest::PrintDiff() to help debugging.
In RunQuantCheck, try all combinations of +/-mask_ input for 4x4 idct.
Update PartialIDctTest::InitInput().
Change-Id: I13fd163954a4c1a3a6cfeb5e4a4d3d0e7ff901f4
For SVC 1 pass non-rd pickmode, the interpolation filter for the
upsampling of the golden (spatial) reference was not being explicitly
set and instead was takin gwhatever value was set in the previous
mode/block (which would be either EIGHTTAP or EIGHTAP_SMOOTH).
Fix it to the default EIGHTTAP for now, to be updated/selected
adaptively in a later change.
Minor adjustmemt to rate targeting thresholds in datarate unittests.
Change-Id: I52085048674072c6cfb7163e11e9a2658d773826
Makes more sense to call the corresponding partial idct C function
instead of the full idct C function as the reference.
Change-Id: Ibb7681dd063edd6307ba582c10c26c4c6a4b78c6
An intended behavior change disabling exhaustive searches in speed
feature causes VP9/DatarateTestVP9LargeDenoiser.4threads test failure.
Change the threshold to make it pass.
BUG=webm:1429
Change-Id: Ibcbe2314c6b2525799894f5d7204fc8eb4ec2a1e
Add support for everything except block sizes of 4.
Performance is better but numbers will improve again when the variance
optimizations land.
BUG=webm:1423
Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80
Some of the mixed sizes were missing. They can be implemented trivially
using the existing helper function.
When comparing the previous 16x8 and 8x16 implementations, the helper
function is about 10% faster than the 16x8 version. The 8x16 is very
close, but the existing version appears to be faster.
BUG=webm:1422
Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004
Approximates division using multiply and shift.
Speeds up both sizes (8x8 and 16x16) by 30 times.
Fix the call sites to use the RTCD function.
Delete sse2 and mips implementation. They were based on a previous
implementation of the filter. It was changed in Dec 2015:
ece4fd5d22
BUG=webm:1378
Change-Id: I0818e767a802966520b5c6e7999584ad13159276
This patch followed allow_exhaustive_searches feature modification and
continued to modify the encoder to achieve the determinism in the row
based multi-threaded encoding. While row-mt = 1 and using multiple
threads, the adaptive feature in encoder was disabled, which gave
BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at
speed 2). These speed losses were acceptable considering the speed
gains obtained from row-mt.
Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
Re-enable the SVC tests, wrap the non-zero expectation
in GetMismatchFrames around #if CONFIG_VP9_DECODER.
Change-Id: I0e8a2d78b868c32f18fe597540f397d3a1b303b5
Slightly faster, the other dc predictors cannot be faster since
the computation speedup is overwhelmed by the time spent reading
dst to write just the 8x8 part.
Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66
Replace by CAST_TO_BYTEPTR/SHORTPTR.
The rule is: if a short ptr is casted to a byte ptr, any offset
operation on the byte ptr must be doubled. We do this by casting to
short ptr first, adding offset, then casting back to byte ptr.
BUG=webm:1388
Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
Disable the 1 pass CBR SVC tests with temporal_layers > 1.
Issue with the commit 863f860, which will cause encoder/decoder
mismatch due to skipping encoder loopfilter for non-reference frames.
Will re-enable the tests when fixed.
Change-Id: I74918a0045a17976b069c4be63fbeb921974df0d
Modify the frame flags to update the ARF on top layer,
for the tests:
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayersFrameDropping
This is needed to fix the encode/decoder mismatches caused by 863f860,
and removed in the revert e9b7f98.
Change-Id: I6b9fecfdd17315fc0179e29949338c77636026c0
Buffers on 32 bit x86 builds only guaranteed 8 byte alignment. Fixed
with "AvgPred test: use aligned buffers" and "sad avg: align
intermediate buffer"
Also re-enable asserts on the C version.
BUG=webm:1390
Change-Id: I93081f1b0002a352bb0a3371ac35452417fa8514
This reverts commit 863f860bfc.
This causes encoder / decoder mismatches in various
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests
BUG=webm:1408
Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
Useful for SVC, where the top layer enhancement frames may
not update any reference buffers, as is the case for the
patterns in the 1 pass CBR SVC when #temporal_layers > 1.
~3% encoder speedup for SVC patterns with temporal layers
in 1 pass CBR mode.
Updated the SVC datarate tests for the mismatch frames.
Set the frame-dropper off in some tests with #temporal_layers > 1
so we can correctly set #mismatch frames. Adjusted rate target
threshold for tests where frame-dropper was turned off.
Change-Id: Ia0c142f02100be0fed61cd2049691be9c59d6793
Provides over 15x speedup for width > 8.
Due to smaller loads and shifting for width == 8 it gets about 8x
speedup.
For width == 4 it's only about 4x speedup because there is a lot of
shuffling and shifting to get the data properly situated.
BUG=webm:1390
Change-Id: Ice0b3dbbf007be3d9509786a61e7f35e94bdffa8
To prevent the motion vector out of range bug, added a motion vector unit
test in VP9. In the 4k video encoding, always forced to use extreme motion
vectors and also encouraged to use INTER modes. In the decoding, checked if
the motion vector was valid, and also checked the encoder/decoder mismatch.
The tests showed that this unit test could reveal the issue we saw before.
Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
Make the source_sad feature work properly for cases of VBR or
screen_content with SVC.
Added unittest for SVC with screen-content on.
Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
Change tests to reflect use. Input sizes will be 8 or 16 (but not
necessarily square).
filter_weight is capped at 2 and filter_strength at 6
Speed test, disabled by default.
Change-Id: Idfde9d6c4b7d93aaf0e641b0f4862c15e2a2af7a
Enable row-mt for SVC for real-time mode, speed >=5.
Add the controls to the sample encoders, but keep it off for now.
Add the control and enable it for the 1 pass CBR unittests.
For speed 7, 3 layer SVC, 2 threads, row-mt enabled gives about ~5% speedup.
Change-Id: Ie8e77323c17263e3e7a7b9858aec12a3a93ec0c1