add a dependency on *_rtcd.h to ensure they're generated before
attempting to build the test files
Change-Id: Ibbbd1f6ea77912bfd297129e7c83b9a80923ea12
fails unit tests:
[ FAILED ] NEON/VP8SubpelVarianceTest.ExtremeRef/0, where GetParam() = (3, 3, 0x14e36d, 0)
[ FAILED ] NEON/VP8SubpelVarianceTest.Ref/0, where GetParam() = (3, 3, 0x14e36d, 0)
the tests were recently enabled in:
eb88b17 Make vp9 subpixel match vp8
the functions likely haven't changed since being converted from assembly
Change-Id: I6141717b111b8f735f436c160d74270af53ef722
This control allows the application to skip the loop filter in the
decoder. This is an advanced control that should only be used in
extreme circumstances as it may introduce and accumulate decode
artifacts.
Change-Id: I278c65c60826f84c9141ebe06c6eeed3c2335fa8
WebM files will adjust the display width and height according to the
input pixel aspect ratio. The default pixel aspect ratio is 1:1.
BUG=https://code.google.com/p/webm/issues/detail?id=1005
Change-Id: I23e0a601b7259fa9513cb86110c41b8437769808
The only difference between the two was that the vp9 function allowed
for every step in the bilinear filter (16 steps) while vp8 only allowed
for half of those. Since all the call sites in vp9 (<< 1) the input, it
only ever used the same steps as vp8.
This will allow moving the subpel variance to vpx_dsp with the rest of
the variance functions.
Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75
Updated sources according to improved version of common MSA macros.
Enabled respective convolve MSA hooks and tests.
Overall, this is just upgrading the code with styling changes.
Change-Id: If5ad6ef8ea7ca47feed6d2fc9f34f0f0e8b6694d
The larger internal variables are required for the intermediates
but RoundHighBitDepth brings them down to uint32_t/unsigned int.
Fixes type warnings in visual studio.
Change-Id: I48d35284d6cbde330ccdc1f46b6215a645d5eb00
Updated sources according to improved version of common MSA macros.
Enabled idct MSA hooks and tests.
Overall, this is just upgrading the code with styling changes.
Change-Id: I1f488ab2c741f6c622b7a855388a202168082209
Done little restructuring/styling changes to the sources like generic macro definitions, their use to reduce code lines, better code alignments etc.
Disabled all MSA hooks and tests
Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190
ROUND_POWER_OF_TWO has some poor side effects when used
with [u]int64_t such as doing the shifting in 32bits.
Change-Id: Ic85a19765cd316fb43657cb21c86f35ceb772773
Various header/test files had to be re-worked in order to
build "Remove cm parameter from vp9_decode_block_tokens()".
This patch reverts the "Remove cm" part and only contains
the re-worked header files.
Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32
this allows test_libvpx's simd caps check to be used; it also fixes a
link error on OS X with -fcommon.
Change-Id: I1a62a3e74ba06b8f3b37a22fcfdebf90c04ab289
in addition to <arch>/*. this will pick up tests defined with TEST()
instead of INSTANTIATE_TEST_CASE_P()
Change-Id: I0917741baac89d9ce857f4d4aa53790e8a0c6c12
useful for speed testing / verifying individual function optimizations;
currently tests non-high-bitdepth VP9 intra predictors
Change-Id: Ibd247765e43a31894697d43f1d39d312e0ba2090
With the sad functions, and hopefully the variance functions soon,
moving to the vpx_dsp location, place the defines used in the
reference C code in a common location.
Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
The rotation computation using 2X of cos(pi/16) has a potential to
overflow 32 bit, this commit disable the function to allow further
investigation and optimization.
Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf
The version is currently producing different result from c version
for some input. Disable the use of it for now to allow time for
investigation the source of mismatch.
Change-Id: Id039455494ee531db4886a9f1fa4761174ef6df3
This partially reverts commit 14ef4aeafb
Including the rtcd headers to get the function definitions causes
problems on VS9.
Change-Id: I780874d9e03af2d3124192ab0e3907301f22674c
add a check for the status line to awk and better report failure given
the program output will be lost in this case
Change-Id: I1348a80108c81099d609f2e2227dd2c31bd8cd54
Reset the reached_eos flag in webm_guess_framerate in case it ends
up consuming the entire file. Also adding a vpxdec shell test to
verify this behavior.
Change-Id: I371eebd2105231dc0f60e65da1f71b233ad14be5
The following functions use the count parameter to either loop or select
dedicated paths:
vp9_lpf_horizontal_16_c
vp9_lpf_horizontal_16_sse2
vp9_lpf_horizontal_16_avx2
vp9_lpf_horizontal_16_neon
vp9_highbd_lpf_horizontal_16_c
vp9_highbd_lpf_horizontal_16_sse2
Change-Id: I7abfd2cb30baa292b4ebe11c847968481103c037
the TODO around CONFIG_SPATIAL_SVC has been resolved by changing the
CONFIG_* checks to use an ABI based check
Change-Id: If2638baf361b863186177a453beec9af9231e69e
this removes the CONFIG_* checks from public headers, but means
'--enable-experimental --enable-spatial-svc' builds will fail without a
local change to the ABI in vpx_encoder.h. this should be all right for
testing this experiment.
Change-Id: Ief55e7b9d1e8332cfce990275e04c29b30af0c4a
add explicit returns in cases where ASSERT_* can't be used due to the
function returning a value; retain the EXPECT_* for reporting purposes.
Change-Id: I1f514728537fee42a99277d3aba538e832d3b65b
Crash occured on very first key frame, because denoiser
temporal function was beng entered.
Updated denoiser unittest to set cpu_used from first frame,
and verified fix fixes the crash.
Change-Id: I3be1124b52846fbbe7248d2c3d6136e086c80bc1
both the encode and decode perf tests require niklas_1280_720_30.yuv
broken since:
28eebf3 Merge "tests: add a shorter 720p test clip"
7839d03 tests: add a shorter 720p test clip
Change-Id: I51ebbf7261832e25d8f2c1da5c7df5c2e47f748e
+ add a helper function to reduce the duplication
this is a bit clearer when the environment variable is set, but the
directory is missing
Change-Id: I08f9b56122b5741bb40a5f795f7f82f5b49f1047
niklas_1280_720_30.y4m 60 frames @ 30fps
only a small number of frames are being used; this reduces the test data
download size in non-perf-test cases by >500M.
retain niklas_1280_720_30.yuv for encode+decode perf tests
Change-Id: I56b3433104acd462f952a9554280de5a3ec0b6d2
Modified the thresholds of deciding whether or not to skip
the transforms in model_rd_for_sb_y(). Used zbin[] instead
of dequant[] to be more precise. Also, modified the checking
coditions.
Rtc set borg test results (at speed 6) showed:
average PSNR gain: 0.138%, overall PSNR gain: 0.158%,
and SSIM gain: 0.177%.
The data rate test was modified slightly as suggested by
Marco.
Change-Id: Ieaf633ab77f4838cb3c45cf69065b29d55f8ae6c
There is a corner case that when a frame is corrupted, the following
inter frame decode worker will miss the previous failure. To solve
this problem, a need_resync flag needs to be added to master thread
to keep control of that.
Change-Id: Iea9309b2562e7b59a83dd6b720607410286c90a6
use VP[89]_INSTANTIATE_TEST_CASE case when possible to disable the tests if
the codec is unavailable.
broken since:
be6aead Try again to merge branch 'frame-parallel' into master branch.
Change-Id: I8d81c5ba3b951f82be94bfaed6be194e4289baec
cm->frame_bufs[].idx values were made consistent in:
61c5e94 Use -1 consistently as invalid buffer idx
update the initialization in swap_frame_buffers() to match.
additionally:
- remove some shadowed variables in the former and marked them volatile
Change-Id: Ie3f9636c405bd822112bb56bd22d28024ae98909
In frame parallel decode, libvpx decoder decodes several frames on all
cpus in parallel fashion. If not being flushed, it will only return frame
when all the cpus are busy. If getting flushed, it will return all the
frames in the decoder. Compare with current serial decode mode in which
libvpx decoder is idle between decode calls, libvpx decoder is busy
between decode calls.
Current frame parallel decode will only speed up the decoding for frame
parallel encoded videos. For non frame parallel encoded videos, frame
parallel decode is slower than serial decode due to lack of loopfilter
worker thread.
There are still some known issues that need to be addressed. For example:
decode frame parallel videos with segmentation enabled is not right sometimes.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
This reverts commit a18da9760a.
Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
On Nexus 7 speed -6 saw ~18% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
On Nexus 7 speed -6 saw ~15% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
On Nexus 7 speed -6 saw ~30% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
In frame parallel decode, libvpx decoder decodes several frames on all
cpus in parallel fashion. If not being flushed, it will only return frame
when all the cpus are busy. If getting flushed, it will return all the
frames in the decoder. Compare with current serial decode mode in which
libvpx decoder is idle between decode calls, libvpx decoder is busy
between decode calls. VP9 frame parallel decode is >30% faster than serial
decode with tile parallel threading which will makes devices play 1080P
VP9 videos more easily.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64