Compare commits

..

1482 Commits

Author SHA1 Message Date
Johann
f80be22a10 Release 1.7.0 Mandarin Duck
Change-Id: I186440f3643a85694f45400393efb661f6d012fc
2018-01-24 14:25:44 -08:00
James Zern
81d66e2cc6 vpx_codec_enc_init_multi: fix segfault w/vp9
vp9 does not support multi-res encoding, the request should not crash.

+ encode_api_test: unconditionally expose multi-res test

vpx_codec_enc_init_multi should fail independent of
CONFIG_MULTI_RES_ENCODING if not for the same reason.

Change-Id: I44fc58ef70ee4e0e482cb6a5736885f4cb2a8517
(cherry picked from commit 004fb91416)
2018-01-23 18:14:23 -08:00
Jerome Jiang
9f36419bf2 Fix crash invalid params for vp8 multres. Add test.
Fix is from the patch in the issue.
Release memories allocated before early exit.

BUG=webm:1482

Change-Id: I64952af99c58241496e03fa55da09fd129a07c77
(cherry picked from commit 5b6ae020b6)
2018-01-23 18:14:14 -08:00
Johann Koenig
740883897a Merge "Revert "Add frame width & height to frame pkt. Add test."" into mandarinduck 2018-01-17 20:17:33 +00:00
Johann
f87a4594fb Revert "Add frame width & height to frame pkt. Add test."
This reverts commit bd1d995cd3.

Remove the feature from the release as it requires additional work.

BUG=webm:1485

Change-Id: I1a01ac2525703af97a456a3eed85718306c0f734
2018-01-17 11:28:24 -08:00
Vignesh Venkatasubramanian
eedda5f924 vp8dx.h: Add macro for skipping loop filter
Without this applications cannot use the vpx_codec_control macro
for VP9_SET_SKIP_LOOP_FILTER. The tests only cover the underscored
version vpx_codec_control_().

Change-Id: I3e6c1888307b76636fdc1a8deae70b5c14238163
(cherry picked from commit 373e08f921)
2018-01-16 18:27:42 -08:00
paulwilkins
7d19739949 Fix bug in use of zoom metric as part of arf breakout.
The in/out (or zoom metrics) in accumulate_frame_motion_stats()
are in effect a % of the blocks that have a motion vector pointing
either towards or away from the center. As such they are already
normalized in terms of image size and the thresholds against which
these are tested should be image size independent.

In practice a zoom either in or out is an indicator for a shorter group
length so the abs value is more important as a breakout clause.

This patch fixes the threshold test. Clips without noticeable zoom show
no effect but some  with strong zooms such as "station" show a big
gain (5-10%). Average psnr-hvs gain on hdres set was 0.292%

Change-Id: I4f97a72b0e273e4e844ade15285749c32cd81c1c
(cherry picked from commit 0226ce79e9)
2018-01-16 11:21:29 -08:00
Johann
c5dc3373db work around pic issue with gcc 6
Enable pic when building sse2 or higher optimizations.

BUG=webm:1464

Change-Id: I36c6e83ed716649f3d9ee10ce3aa9bb847cac2d9
2018-01-09 12:46:45 -08:00
Scott LaVarnway
8a4336ed2e Add vp9_quantize_fp_nz_c() -- 2
This c version uses the shortcuts found in the x86
vp9_quantize_fp functions.

The test was updated to use the correct quant/round range.

Change-Id: Ie5871f710d9eb39047d8d9f48b907c0633e1f830
2017-12-21 15:26:36 -08:00
James Zern
1a7bf0d1f9 Merge "vp9_quantize_ssse3_x86_64: fix out of bounds write" 2017-12-21 23:02:32 +00:00
Ralph Giles
117893a717 Don't force inlining for msvc targets.
INLINE is defined as __forceinline for vs* configs, but is the
normal, compiler-discretion inline for gcc/clang configs. This
makes many functions very large when building for windows targets,
much larger than they are elsewhere.

Use '__inline' as a consistent definition to get consistent function
sizes. Although Visual Studio documentation says that 'inline' is
only available in C+ code. This is probably incorrect, since Visual
Studio 2017 accepts C99 'inline' even when passed /TC. Nevertheless,
this commit uses the recommended '__inline' for consistency.

Thanks to David Major for the diagnosis.

Change-Id: Ib0b31a3afcea77822c84fe3c6cd452add66d825a
2017-12-21 13:55:18 -08:00
James Zern
84a7263d4c vp9_quantize_ssse3_x86_64: fix out of bounds write
eob is a pointer to a uint16_t. previously the code would store 64-bits
causing a crash or test failure with the right stack layout.

Change-Id: Ibd653baf323db114f2444951b9d8b00c596bf15a
2017-12-21 16:53:14 -05:00
James Zern
7a245adb18 Revert "Add vp9_quantize_fp_nz_c()"
This reverts commit 86842855d3.

SSSE3/VP9QuantizeTest.EOBCheck/1 fails on Mac and the build breaks under
visual studio due to a #if within another macro.

Change-Id: I475095a04aafcc714fade2b24e4df7b682be2cd1
2017-12-21 06:05:19 -08:00
Scott LaVarnway
de50e8052c Merge "Add vp9_quantize_fp_nz_c()" 2017-12-20 23:15:11 +00:00
James Zern
1a9c7bee88 Merge "lpf_test: correct threshold ranges" 2017-12-20 20:22:34 +00:00
Marco
9ca9c12dbd vp9-svc: Add layer bitrate targeting to SVC datarate tests.
Modify and update the SVC datarate unittests to verify the
rate targeting for each spatial-temporal layer.
The current tests were only verifying the rate targeting
of the full SVC stream, not individual layers.
Also re-enabled a test that was disabled.

This is a stronger verification of the layered rate control
for SVC for 1 pass CBR encoding.

Added PostEncodeFrameHook, needed to get the layer_id and
update the layer buffer level.

Change-Id: I9fd54ad474686b20a6de3250d587e2cec194a56f
2017-12-19 19:48:47 -08:00
Scott LaVarnway
86842855d3 Add vp9_quantize_fp_nz_c()
This c version uses the shortcuts found in the x86
vp9_quantize_fp functions.

The test was updated to use the correct quant/round range.

Change-Id: I5d19f8af2fddda8e50910249eafb740acb29415b
2017-12-19 12:48:45 -08:00
Marco
a2127236ae vp9: Reset buffer level on large bitrate changes.
For a large change in the target avg_frame_bandwidth,
via the update in change_config()), reset the buffer_level
to optimal_level.

This fix prevents possible frame drops, where for example,
encoder suddenly goes from lower to higher bitrate.

Change-Id: I2f844c41d04c01240e85f574e59d2b9075c7eb6d
2017-12-19 09:57:21 -08:00
James Zern
5203b40a2a lpf_test: correct threshold ranges
the random number generator creates values from [0, range) add 1 to all
and make hev more realistic by mirroring its calculation of level >> 4,
i.e., [0, 3]

Change-Id: Ic19be5d7ba668deb17c96f143b739116a4b5d21c
2017-12-18 23:17:45 -08:00
Shiyou Yin
08a668af32 vp8: [loongson] optimize loopfilter v2.
Optimize function vp8_mbloop_filter_vertical_edge_mmi and
function vp8_mbloop_filter_horizontal_edge_mmi.
Make full use of memory loading delay slot and reduce unnecessary
instructions.

Change-Id: I61da2c3a44c06044225461f46bf487d83cba6c16
2017-12-15 17:06:47 +08:00
Shiyou Yin
09519a55c7 Merge "vp8: [loongson] optimize sixtab predict v2." 2017-12-15 00:53:21 +00:00
Johann Koenig
7970cc02df Merge "add copyright to rtcd files" 2017-12-14 23:44:30 +00:00
Johann Koenig
d95ddc7c71 Merge "mark generated version header" 2017-12-14 23:44:04 +00:00
Johann
e4b3f03c64 add copyright to rtcd files
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-14 22:50:08 +00:00
Johann Koenig
7d1bf5d12a Merge "remove unused tools" 2017-12-14 21:19:59 +00:00
Johann Koenig
9f8433ffe2 Merge "fix typo in boilerplate" 2017-12-14 21:19:47 +00:00
Johann
920ba82409 remove unused tools
all_builds.py has been more or less replaced by Jenkins.

author_first_release.sh is unused.

ftfy.sh has been obviated by having the whole tree clang-format clean.

Change-Id: I741315ad9042e6e901f07410e93f28371db703b2
2017-12-14 20:34:14 +00:00
Johann
fe4de1ff63 mark generated version header
Allows it to pass the license check in chromium.

BUG=chromium:98319

Change-Id: I5ba9c8c81ab9eb4168df09db9d2eab846e99e981
2017-12-14 11:58:10 -08:00
Johann
6746ba6d01 fix typo in boilerplate
The extra 'e' was causing the chromium license check to flag this file.

BUG=chromium:98319

Change-Id: Ic875ba66370298bf998438d14ff5f7e760293706
2017-12-14 11:54:16 -08:00
Johann
05e6e9ac83 mark generated rtcd headers
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Ib37bf45bdac8cf1edc62037dea17b734a5e37fa7
2017-12-14 11:48:46 -08:00
Shiyou Yin
f2ad523461 vp8: [loongson] optimize sixtab predict v2.
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.

Change-Id: I8bab801416e7f4958c28c6d080e3cf785a50f82b
2017-12-14 16:29:58 +08:00
Marco
c58f01724c vp9: Update to SVC datarate tests.
With recent fixes to rate control for SVC the
buffer underrun in the tests does not happen,
so comment and TODO can be removed.

Also, in some of these SVC tests, replace the HD clip
with the corresponding VGA clip, which has > 400 frames.
For the (niklas) HD clip: it has only 60 frames but the
test was running up to 300 frames. Fixed it to 60 frames.

Keep some tests with the HD clip, needed for the 4 thread
and 5 level scaling test.

Change-Id: I0a2356a908e8b2271c7a422eb8b15c0d56eec968
2017-12-13 14:07:52 -08:00
Marco Paniconi
028429310a Merge "vp9: Reset rc flags on some configuration changes." 2017-12-13 21:03:40 +00:00
Marco
e9ad5d2aee vp9: Cleanup/remove TODO comment.
Change-Id: I2bd43e996909ad688b7e00b81ee19a5fc4df460b
2017-12-13 11:30:09 -08:00
Marco
a40fa1f95d vp9: Reset rc flags on some configuration changes.
For large dynamic changes in target avg_frame_bandwidth, or
a change in resolution, via the update in change_config()),
reset the under/overshoot flags (rc_1_frame, rc_2_frame)
to prevent constraining the QP for the first few frames
following the change.

For SVC use the spatial stream avg_frame_bandwidth in
reset condition.

For the avg_frame_bandwidth condition, use fairly large
threshold (~50%) for now in reset.

This allows for better/faster QP response if, for example,
application dynamically changes bitrate by large amount.

Change-Id: Ib6e3761732d956949d79c9247e50dba744a535c0
2017-12-13 10:41:38 -08:00
Paul Wilkins
94eaecaa91 Merge "Bug fix for second reference stats." 2017-12-12 11:56:10 +00:00
Jerome Jiang
f9ecdc35ec Merge "vp9 svc: Allow denoising next to highest resolution." 2017-12-12 05:27:11 +00:00
Jerome Jiang
c1e511fd82 vp9 svc: Allow denoising next to highest resolution.
Denoise 2 spatial layes at most.

Add noise sensitivity level 2 for vp9 such that applications can control
whether to denoise the second highest spatial layer.

Add tests to cover this case.

Change-Id: Ic327d14b29adeba3f0dae547629f43b98d22997f
2017-12-11 15:20:19 -08:00
Jerome Jiang
a1689ed16b Merge "Fix build warnings for gcc 6.3" 2017-12-11 18:27:17 +00:00
paulwilkins
f1ce050f44 Bug fix for second reference stats.
Immediately following a key frame the trailing second reference
error in the first pass stats will be based on a reference frame from
the prior key frame group and will thus usually be much larger.

This fix eliminates that effect (which typically triggers a short arf
group immediately after a key frame). It also changes the accounting
for the first frame in each new arf group.

This change gives large gains on a couple of clips that contain mid
sequence key frames (e.g. 6% on 1080P tennis). Overall there was
a net gain in PSNR and PSNR-HVS ~(0.05- 0.4%) and mixed results for
SSIM (+/- 0.2%).

Change-Id: I8e00538ac2c0b5c2e7e637903cac329ce5c2a375
2017-12-08 10:05:36 +00:00
Jerome Jiang
2a602f745d Fix build warnings for gcc 6.3
Clean up some alias.

BUG=webm:1465

Change-Id: I99e186162db9f9e15375fef01564692434eda619
2017-12-07 13:42:10 -08:00
Jerome Jiang
14dbdd95e6 Merge "Add frame width & height to frame pkt. Add test." 2017-12-06 22:37:15 +00:00
Jerome Jiang
bd1d995cd3 Add frame width & height to frame pkt. Add test.
Used to return correct frame width and height when dynamic resizing happens.

BUG=webm:1474

Change-Id: Ia2043f7e1635b3821848a67b9b134f47f14b0f3a
2017-12-06 13:55:18 -08:00
Marco
3562d6b0a2 vp9-svc: Set downsampling filter for VGA layer.
Downsampling filter for SVC was set to subsample (phase 0)
for HD -> VGA, and bilinear averaging (phase 8) for VGA -> QVGA.
This change makes it bilinear averaging for HD -> VGA.

Given the recent commit 9f9d4f8, quality is improved with
this change: avgPSNR/SSIM up ~1-3% on HD clips in RTC set.
Speed decrease of ~1% for 3 layer SVC.

Change-Id: If834a320e372b8b922a6bf7cab4227703b1beae6
2017-12-06 12:01:24 -08:00
Marco Paniconi
575c1933ea Merge "vp9: Nonrd-pickmode: move some early exits up." 2017-12-06 19:18:51 +00:00
Hui Su
2e44f16443 Merge "Add max luma picture width/height constraint in VP9 level" 2017-12-06 18:46:19 +00:00
Marco
33953f310e vp9: Nonrd-pickmode: move some early exits up.
Move the early exit checks on usable_ref_frame and
skip_ref_find_pref up before the check on flag_svc_subpel.
The code under flag_svc_subpel requires frame_mv to be set
for the golden/spatial reference, which is only set if the
both those exits don't pass.

No change in behavior.

Change-Id: Id304276c745eeb389ff85fa2dcf510d5976bc413
2017-12-06 10:18:44 -08:00
Marco
9f9d4f8dc9 vp9-svc: Allow for nonzero motion on spatial reference.
For nonrd pickmode on a given spatial layer, the spatial
(golden) reference was always only using zeromv for prediction.
In this patch if the downsampling filter used for generating
the lower spatial layer is an averaging filter (nonzero phase),
we allow for subpel motion on the spatial (golden) reference to
compensate for the shift. This is done by forcing the testing of
nonzero motion mode to compensate for spatial downsampling shift.

Improvement for cases where the downsampling is averaging filter.
In the current code this is only done for generating
resolutions <= QVGA.

Improvement for avgPSNR/SSIM on RTC set for speed 7: ~1.2%.
Gain is larger (~2-3%) for VGA clips with 2 spatial layers.
~1% speed slowdown for 3 layer SVC on mac.

Change-Id: I9ec4fa20a38947934fc650594596c25280c3b289
2017-12-05 22:41:07 -08:00
Shiyou Yin
90ce21e519 Merge "vpx_dsp: [loongson] optimize variance v2." 2017-12-04 01:30:06 +00:00
Hui Su
07b12aad77 Add max luma picture width/height constraint in VP9 level
BUG=b/65412009

Change-Id: I9e1478dcbd2ef9e97f5f8fb5a1c733b5f5cdf396
2017-12-01 16:29:40 -08:00
Johann
e83d00f584 filter out asm includes
Don't add include files to the archive. Avoids build failures for
Windows such as:
the input file 'libvpx_g.a(x86_abi_support.asm.o)' has no sections

Change-Id: If9c8e70c0ec913b7ad7dd6a08d4fa19011114ad2
2017-12-01 15:03:51 -08:00
Johann
bdbecea1ba explicitly label .text sections
nasm should infer .text but does not for windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451

Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 14:33:04 -08:00
Johann
65df957df6 nasm defaults to -Ox
No need to specify default behaviour. The original change introducing nasm:
7be093ea4d
mentions requiring 2.0.9, which was the first release to default to this behaviour:
http://www.nasm.us/doc/nasmdoc2.html
"The -Ox mode is recommended for most uses, and is the default since NASM 2.09."

Change-Id: Ia914c4deede5aa447277b5189bb4fcf7e54c338d
2017-12-01 14:33:04 -08:00
Johann Koenig
401e00792f Merge "pass 'win64' instead of 'x64' to the assembler" 2017-12-01 22:07:03 +00:00
Johann
460dbc01b5 pass 'win64' instead of 'x64' to the assembler
nasm does not accept x64

yasm has accepted (and appears to prefer) win64 at least as far back as
1.0.0:
http://yasm.tortall.net/releases/Release1.0.0.html

Change-Id: Ied881b1df0570da256b1bd7e131e7817e47f768f
2017-12-01 10:58:54 -08:00
Marco
8d0e7ac29a vp9-svc: Set num_inter_modes in non-rd pickmode.
Set num_inter_modes based on ref_mode_set_svc, which is
smaller set than ref_mode_set (which may use alt-ref).

No change in behavior.

Change-Id: I31169bb09028db230552c6fca0a86959d1ade692
2017-12-01 10:30:45 -08:00
Shiyou Yin
298f5ca47d vpx_dsp: [loongson] optimize variance v2.
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.

Change-Id: I58890c6a2ed1543379acb48e03e620c144f6515f
2017-12-01 13:44:48 +08:00
Kaustubh Raste
8099220e6c Merge "mips msa optimize vpx_scaled_2d function" 2017-12-01 01:24:25 +00:00
Marco Paniconi
c22ab8ab9f Merge "Nonrd-pickmode: avoid duplicate computation of UV predictor." 2017-12-01 00:23:52 +00:00
James Zern
9dbefc4b57 Merge "decouple spatial-svc from encoder abi" 2017-12-01 00:22:34 +00:00
Marco
2e701f7c29 Nonrd-pickmode: avoid duplicate computation of UV predictor.
Avoids duplicate computation of UV predictor.

Bit-exact when static_threshold is zero.
Small/neutral difference on RTC set with nonzero static_threshold
(since UV predictor won't be skipped with this change).

Small speed gain, ~1-2%, at speed 8.

Change-Id: Iba8d22a307768b391e29d63c9826aac5a4d9c285
2017-11-30 12:41:58 -08:00
James Zern
5044779e77 decouple spatial-svc from encoder abi
this is only meant for testing. along with --enable-experimental
--enable-spatial-svc require VPX_TEST_SPATIAL_SVC to be defined rather
than bumping the encoder ABI.

Change-Id: I7f34d9f60300fa31ccf22e1a4aa619392c391b2e
2017-11-30 10:52:25 -08:00
Marco
b409863c48 Fix to copy partition.
Update the prev_partition on early exits in
choose_partitioning().

Change-Id: I382ffcab8e647c00b14283d15c3dd11bb0ac6f50
2017-11-30 10:27:34 -08:00
Shiyou Yin
392e0188f6 Merge changes Icd9c866b,I81717e47
* changes:
  vp8: [loongson] optimize regular quantize v2.
  vp8: [loongson] optimize vp8_short_fdct4x4_mmi v2.
2017-11-30 00:53:50 +00:00
Shiyou Yin
8d70aef05f Merge "vpx: [loongson] fix bug in var_filter_block2d_bil_16x" 2017-11-30 00:53:37 +00:00
Jingning Han
9116e3d957 Merge "Add PSNR Cb and Cr metric to opsnr.stt" 2017-11-29 22:56:47 +00:00
Marco Paniconi
3437fe484a Merge "vp9-svc: Don't allow encode_breakout on golden ref." 2017-11-29 22:41:31 +00:00
Marco
49f51af4c9 vp9-svc: Don't allow encode_breakout on golden ref.
For 1 pass cbr SVC: GOLDEN is the spatial reference,
better not to check for encoder_breakout on this reference.

Small positive ~0.075% (mostly neutral) gain in avgPSNR/SSIM metrics.
No observed change in encoder speed.

Change-Id: Ib337f16d6771105bf06384c6a23ad047fc690418
2017-11-29 13:58:43 -08:00
Marco
0e94522338 vp9-svc: Clean conditon for allowing copy_partition.
Make condition explicit on non_reference_frame.

No change in behavior.

Change-Id: Iec5068bccd93c7c7be67634c5c090580b2dbb20d
2017-11-29 13:19:09 -08:00
Kyle Siefring
3ae909b0f9 Merge "Remove unnecessary includes of emmintrin_compat.h" 2017-11-29 19:14:45 +00:00
Kyle Siefring
a60da3a2eb Remove unnecessary includes of emmintrin_compat.h
Change-Id: Ie60381a0c6ee01f828cd364a43f01517f4cb03e9
2017-11-29 11:48:24 -05:00
Shiyou Yin
d49bf26b1c vp8: [loongson] optimize regular quantize v2.
1. Optimize the memset with mmi.
2. Optimize macro REGULAR_SELECT_EOB.

Change-Id: Icd9c866b0e6aef08874b2f123e9b0e09919445ff
2017-11-29 17:06:00 +08:00
Kaustubh Raste
339f4dcaee mips msa optimize vpx_scaled_2d function
Change-Id: I638507b360c71489ab0e87bd558d2719ad995333
2017-11-29 13:27:04 +05:30
Shiyou Yin
9966cc8d12 vp8: [loongson] optimize vp8_short_fdct4x4_mmi v2.
Optimize the calculate process of a,b,c,d.

Change-Id: I81717e47bc988ace1412d478513e7dd3cb6b0cc9
2017-11-29 12:58:37 +08:00
James Zern
c5f5f4ed17 vpx{enc,dec}: add --help
only output short usage to stderr on error, with --help use stdout

Change-Id: I7089f3bca829817e14b14c766f4f3eaee6f54e5c
2017-11-28 20:49:54 -08:00
Jingning Han
9bd3f1e30d Add PSNR Cb and Cr metric to opsnr.stt
Change-Id: I24e1741c00f9514647c7db2758a7ababd4e96932
2017-11-28 20:03:59 -08:00
Shiyou Yin
a0ca2a4079 vpx: [loongson] fix bug in var_filter_block2d_bil_16x
Which cause failed case:
1. MMI/VpxSubpelVarianceTest.Ref/6
2. MMI/VpxSubpelVarianceTest.Ref/7
3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
4. MMI/VpxSubpelVarianceTest.ExtremeRef/7

Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
2017-11-29 10:26:43 +08:00
Marco
f0b4868625 vp9-svc: Fix condition for setting downsampling filter.
Use (width * height) for setting downsampling filter type.

Change-Id: If4acfde7ff9339e0584155f8a4d15b2f134211f2
2017-11-28 16:28:29 -08:00
Johann
bd990cad72 quantize x86: dedup some parts
Change-Id: I9f95f47bc7ecbb7980f21cbc3a91f699624141af
2017-11-27 13:09:21 -08:00
Marco
cbe62b9c2d vp9-svc: Fix to the layer buffer settings.
For the case when the number of temporal layers > 1,
the buffer levels (starting/optimal_buffer_level,
and maximum_buffer_size) were not scaled properly.

In vp9_update_layer_context_change_config():
when setting the layer-buffer levels, fix is to scale
the layer-target_bandwidth by the target_bandwidth
(which is the full stream bandwidth) instead of the
spatial_layer_target.

This is needed because prior to the call
vp9_update_layer_context_change_config(), set_rc_buffer_sizes()
is called which sets the buffer levels based on target bandwidth
(which is the full bandwidth for the SVC stream).

This fix properly sets the layer-buffer levels based on the
layer-bandwidth, and leads to better rate targeting.

Small/neutral change in avgPSNR/SSIM metrics on RTC set.

Change-Id: Ic0f4f7f3487c37b9a9adb4781ae5edfed7140a57
2017-11-26 22:17:48 -08:00
Peter Collingbourne
9639641cd4 Merge "[CFI] Remove function pointer casts" 2017-11-21 18:42:40 +00:00
Jerome Jiang
50fc0d896b Merge "vp8 simulcast: fix compile warnings." 2017-11-21 01:22:46 +00:00
Vlad Tsyrklevich
bc29863b96 [CFI] Remove function pointer casts
Control Flow Integrity [1] indirect call checking verifies that function
pointers only call valid functions with a matching type signature. This
change eliminates function pointer casts to make libvpx CFI-safe.

[1] https://www.chromium.org/developers/testing/control-flow-integrity

Change-Id: I7e08522d195a43c88cda06fa20414426c8c4372c
2017-11-20 16:36:29 -08:00
Jerome Jiang
f49360d740 vp8 simulcast: fix compile warnings.
Clean up some prints.

Change-Id: I199350e34a8b6fbff9601fcbd11ec68d24da5073
2017-11-20 16:18:31 -08:00
Kyle Siefring
dd4cc5b596 Merge "Optimize AVX2 get16x16var and get32x16var functions" 2017-11-20 22:37:57 +00:00
Jerome Jiang
0cc23242b0 Merge "vp9 svc: fix a few compile warnings." 2017-11-20 18:52:58 +00:00
Marco
559166acfe vp9-svc: Enbale scale partition reference frames.
For reference frames: enable scale partition for
superblocks with low source sad or if bsize on lower-resoln
is at least 32x32.

Keep feature disabled for base temporal layer.

Small regression in avgPNSR/SSIM metrics, ~0.5-1%.
Speedup ~2-3% on mac for SVC (3 spatial/3 temporal layers) at speed 7.

Change-Id: I5987eb7763845b680059128b538bb5188be0cca5
2017-11-17 14:52:20 -08:00
Jerome Jiang
8b7a6ca60a vp9 svc: fix a few compile warnings.
Change-Id: I4cb878600038066513ab73f3658990d1245ff2fb
2017-11-17 14:40:05 -08:00
Kyle Siefring
07a0bf038f Optimize AVX2 get16x16var and get32x16var functions
Change-Id: If8b91aaa883c01107f0ea3468139fa24cfb301d2
2017-11-17 13:55:49 -05:00
Paul Wilkins
849b3c238d Merge "Disable allow_partition_search_skip for speed 2." 2017-11-17 10:34:56 +00:00
Paul Wilkins
c66eeab30e Merge "Code cleanup." 2017-11-17 10:34:46 +00:00
Paul Wilkins
55eacca945 Merge "Remove decay_accumulator clause from alt ref breakout." 2017-11-17 10:34:37 +00:00
Paul Wilkins
4bd2a59e9b Merge "Add clause to alt ref group breakout." 2017-11-17 10:34:26 +00:00
Jerome Jiang
ea14a1a965 Merge "vp9: Fix mem rel for non-ref for external buffer." 2017-11-17 00:31:16 +00:00
paulwilkins
44473e7eb9 Disable allow_partition_search_skip for speed 2.
When allow_partition_search_skip  is set the two pass code
can optionally skip the partition search in the rd loop if the image
appears static (based on selection of 0,0 motion).

Unfortunately 0,0 motion does not necessarily mean that there are
no meaningful changes or that motion or intra modes will not be selected
in the second pass.

Disabling "allow_partition_search_skip" may hurt the encode speed a little
for a small number of clips but can have a big impact on compression.
The most notable example of this in our test sets is "bridge_close_cif"
where this change gives a gains of 18%, 12% and 16% in opsnr, ssim and
psnr-hvs.

Change-Id: I765e288b5c0cd82bce00a148e7653a21e9203024
2017-11-16 16:17:57 +00:00
Jerome Jiang
1aea1675c0 vp9 svc: Rework/fix scale partitioning on boundary.
Enable partition copy on boundary and scale blocks along the boundary.
Rename copy_partition_svc to scale_partition_svc.

Do not copy if the block crosses the boundary.

Change-Id: I37a04d48f11b15c4ea67facd7631193ec2f62150
2017-11-15 20:34:58 -08:00
Johann
3e3a568616 fwd txfm ssse3: use GLOBAL() for loading constants
Fixes a build issue when relocation is not allowed:
relocation R_X86_64_32 against '.rodata' can not be used when making a shared object

Change-Id: Ica3e90c926847bc384e818d7854f0030f4d69aa0
2017-11-15 13:01:44 -08:00
paulwilkins
05302360c9 Code cleanup.
Removal of parameters to and code in calc_frame_boost() that is no
longer required.

No change to results from previous patch.

Change-Id: Ic92da35613fdc247d22fddf24d09679fc5329017
2017-11-15 17:07:28 +00:00
paulwilkins
03c1a827ac Remove decay_accumulator clause from alt ref breakout.
The decay accumulator clause covers similar ground to the
new clause that tests the accumulated second reference error
so it has been removed to reduce complexity.

Change-Id: I4ec1cce32d72bd4ee463ad7def2831a68447d525
2017-11-15 16:58:05 +00:00
paulwilkins
607e45f420 Add clause to alt ref group breakout.
Add a clause to the breakout test for alt ref groups that
examines the size of the accumulated second reference
frame error compared to the cost of intra coding.

This clause causes a reduction in the average group length for many
clips. Alongside the change to the group length the minimum
boost is increased.

On balance the results are positive for psnr and psnr-hvs
but is negative for ssim/fast ssim for the smaller image formats.

Strong gains on some harder clips (eg ducks take off (midres) ~20%,
husky (lowres) 6-17%. Most of the negative cases are lower motion
clips. Subsequent patch hopefully will help with those.

Change-Id: Ic1f5dbb9153d5089e58b1540470e799f91a65dc4
2017-11-15 16:40:12 +00:00
Marco
b3c93d60c2 vp9-svc: Fix flag for usage of reuse-lowres partition
Fix/cleaup the conditioning for usage of the reuse-lowres
partition feature.

Replace the non-reference condition with the top temporal
layer, and put this condition in the speed feature.

This prevents doing update_partition_svc() on every
VGA frame, instead it will now only do update for VGA in
the top temporal layer frames.

Also this makes it easier to test/enable this feature
for lower layer temporal frames.

Change-Id: Ia897afbc6fe5c84c5693e310bcaa6a87ce017be5
2017-11-14 20:08:10 -08:00
Scott LaVarnway
8d471fcee2 tiny_ssim.c : clang compile error fix
Change-Id: Ic10ba580fd5da7d6ff7fa0f33db72fb0c1a97801
2017-11-14 04:38:00 -08:00
James Bankoski
7839fb98a8 Merge "add 10 and 12 bit to tiny_ssim" 2017-11-14 00:15:24 +00:00
Jerome Jiang
9df11a7c52 Merge "vp9 svc: Change conditions on VPX_ENCODER_ABI_VERSION." 2017-11-13 21:04:41 +00:00
Jerome Jiang
0d2555bd2e vp9 svc: Change conditions on VPX_ENCODER_ABI_VERSION.
VPX_ENCODER_ABI_VERSION was bumped up in 93e83f.

Change-Id: Id5707f9f9db56fa96549bc8f54e1cfa04e7fa4cd
2017-11-13 11:05:20 -08:00
Jim Bankoski
becab42eee add 10 and 12 bit to tiny_ssim
Change-Id: I92e4dba2d1682a0d77ad9a214ec4312b1cf4d42e
2017-11-13 10:56:42 -08:00
paulwilkins
a73cee2870 New content type to improve grain retention.
For new VP9 only content type adjust  the rate distortion and ARF
filter based on the relative spatial variance of the source and
reconstruction.

In regards to the RD loop the method favors modes where the
reconstruction variance is similar to the source variance. However it
is currently only applied to regions where the source variance is quite
low.

For very low variance blocks it applies a further bias against intra
coding and large prediction block sizes (the later in particular limit
the usefulness of the loop filter).

The final part of this change is to lower the strength of the ARF
filter for blocks where the source has very low spatial variance, to
encourage some low amplitude texture or noise to pass through
the filter.

This change improves the retention of film grain and fine noise /
texture in spatially flat regions, but as expected causes a significant
drop in PSNR on many clips. This is to be expected because similar
but misaligned noise or texture will give a lower PSNR than a flat
noise free reconstruction. However, it is worth noting that most clips
show a strong gain in FAST SSIM.

The features are enabled on the vpxenc command line by setting
--tune-content=film.

VPX_ENCODER_ABI_VERSION bumped for this change and cvbr.

Change-Id: I26a4e4edfa3dc5cacead82fa701fe7a9118ccd0a
2017-11-13 16:57:23 +00:00
paulwilkins
55fc4d95af Small parameter clean up.
Removed three parameters that are no longer needed in calls
to calc_arf_boost() and associated minor changes.

No impact on encode results.

Change-Id: Ieaf31d0d2e1990b99cf69647170145a1bbfbb9fb
2017-11-13 16:53:57 +00:00
Paul Wilkins
2eddfb46a9 Merge "Fix to frames considered in arf boost calculation." 2017-11-13 16:36:43 +00:00
Paul Wilkins
f5817fa612 Merge "CVBR command line option." 2017-11-13 16:32:39 +00:00
Scott LaVarnway
8e6022844f vpx: [x86] add vpx_satd_avx2()
SSE2 instrinsic vs AVX2 intrinsic speed gains:
blocksize   16: ~1.33
blocksize   64: ~1.51
blocksize  256: ~3.03
blocksize 1024: ~3.71

Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
2017-11-10 12:24:12 -08:00
Scott LaVarnway
8c7213bc00 Merge "vpx: [x86] add vp9_block_error_fp_avx2()" 2017-11-10 00:45:47 +00:00
Marco Paniconi
1ff68ec035 Merge "vp9-svc: Avoid minmax variance for non-reference frames." 2017-11-10 00:30:04 +00:00
Marco
6c0011a255 vp9-svc: Avoid minmax variance for non-reference frames.
For choose_partitioning (speed >= 6): avoid computation
of minmax variance for non-reference frames in SVC.

Existing condition only avoided this for speed >= 8.
Combine that existing logic with non-reference condition.

Small speedup (~0.5-1%) for 3 layer SVC,
neutral change on avgPSNR/SSIM metrics.

Change-Id: I3e9f3a1af0647b15e475cf170d9402908d672ee5
2017-11-09 16:27:27 -08:00
James Zern
10cb17aec0 Merge "runtime error fix: bitdepth_conversion_avx2.h" 2017-11-10 00:15:03 +00:00
Jerome Jiang
6246d8aa76 vp9: Fix mem rel for non-ref for external buffer.
Release frame buffers for non-ref when the decoder is destroyed.

Enable the non ref test.

BUG=b/68819248

Change-Id: Id87ef3b0a62318f9812e927cd957c05c859047fa
2017-11-09 15:47:21 -08:00
Jerome Jiang
0665b09661 Merge "vp9: SVC feature to use partition from lower resolution." 2017-11-09 23:28:44 +00:00
Jerome Jiang
fdb054a05d vp9: SVC feature to use partition from lower resolution.
For SVC with 3 spatial layers:
Add feature to copy/upscale partition from middle spatial layer
to the upper/highest resolution, when superblock sad is not high.

Enabled for speed >= 7 and only for non-reference frames.

Speedup ~3-4%, small loss in avgPNSR/SSIM of ~1%.

Change-Id: I7f0a2716c0fde28bade0f86159d11b7e31d6ab8d
2017-11-09 14:16:50 -08:00
Scott LaVarnway
2387024f41 runtime error fix: bitdepth_conversion_avx2.h
Change-Id: I7364a157de39eb7137b599808474b8d46d19d376
2017-11-09 12:26:43 -08:00
Johann Koenig
bdb8b3ad86 Merge "fail early on oversize frames" 2017-11-09 19:50:04 +00:00
Scott LaVarnway
62ab5e99c1 vpx: [x86] add vp9_block_error_fp_avx2()
SSE2 asm vs AVX2 intrinsics speed gains:
blocksize   16: ~1.00
blocksize   64: ~1.17
blocksize  256: ~1.67
blocksize 1024: ~1.81

Change-Id: I2a86db239cf57e3ff617890ccb2d236aba83ad5e
2017-11-09 05:02:31 -08:00
paulwilkins
d6e29868ac Fix to frames considered in arf boost calculation.
For a chosen interval "i" the existing arf boost calculation examined frames
+/- (i-1) frames from the current location in the second pass.

This change checks to make sure that the forward search does not extend
beyond the next key frame in the event that the distance to the next key
frame is < (i - 1).

Small metrics gains on all our  test sets but these are localized to a few clips
(e.g. midres set psnr-hvs sintel -2.59% but overall average was only -0.185%)

Change-Id: I26fc9ce582b6d58fa1113a238395e12ad3123cf6
2017-11-09 10:46:10 +00:00
Jerome Jiang
adbb4c4d32 Merge "vp9: Add nonref frame buffer test." 2017-11-09 04:41:10 +00:00
Jerome Jiang
a68bbcff29 vp9: Add nonref frame buffer test.
The new test will run a SVC bitstream which has non ref frames.
It checks the number of buffer acquired and released to make sure all
external frame buffers are released.

Add a new test bitstream:
vp90-2-22-svc_1280x720_1.webm
which has 400 frames in total, and 1 spatial layer and 2 temporal layers.
There is one non ref frame every other frame.

Disabled for now. Will be enabled with the fix.

BUG=b/68819248

Change-Id: I0515336fd9809a9e1fceba90e4dce53dabaf53a5
2017-11-08 18:41:33 -08:00
Johann Koenig
cf8039c25f Merge "Support building AVX-512 and implement sadx4 for AVX-512" 2017-11-08 16:28:40 +00:00
paulwilkins
93e83fd7cf CVBR command line option.
Added command line control of Corpus VBR.

The new corpus vbr mode is a variant of standard
VBR (end-usage=0) where the complexity distribution
mid point is passed in rather than calculated for a specific
clip or chunk.

The new variant is enabled by setting a new command line
parameter --corpus-complexity to a zero value. Omitting
this parameter or setting it to 0 will cause the codec to use
standard vbr mode.

The correct value for a given corpus needs to be derived
experimentally using a training set such that the average
rate for the corpus is close to the target value.

For example our using our low res test set with upper and lower
vbr limits of 50%-150% and a corpus complexity value of 650
gives a similar average data rate across the set to using standard
vbr. However, with the corpus mode easier clips will be allocated
fewer bits and harder clips more bits rather than having the same
rate target for all.

Change-Id: I03f0fc8c6fb0ee32dc03720fea6a3f1949118589
2017-11-08 10:41:04 +00:00
Marco
6fbc354c97 Nonrd_pickmode: avoid computing UV cost when early_term is set.
For nonrd_pickmode: if early_term is set there should be
no need to include UV in rdcost (when color_sensitivity is set).

Neutral change on RTC and RTC_derf metrics, for speed >= 5.
No change for ytlive metrics.

Very small speed gain (~0.5%) on some clips with strong color content.

Change-Id: Ifc00928ecd935fc71e94935ceef0ae7481249f07
2017-11-06 10:22:14 -08:00
Kyle Siefring
b383a17fa4 Support building AVX-512 and implement sadx4 for AVX-512
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.

Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00
Marco
eb7d431cb5 Compound prediction mode for nonrd pickmode.
Allow for compound prediction mode in nonrd_pickmode for ZEROMV.
For real-time encoding, 1 pass with non-zero lag-in-frames.

Added speed feature to control the feature.
Enabled for speed >=6 for now, under VBR mode.

avgPSNR/SSIM metrics positive on ytlive set, for speed 6:
some clips up by ~3-5%, some clips neutral gain, average gain
across clips is ~1%.

Small/negligible decrease in speed.

Change-Id: I7a60c7596e69b9a928410c5ee2f9141eecd8613d
2017-11-03 10:13:05 -07:00
Johann
5fe82459ec fail early on oversize frames
Even though frame_size is calculated in uint64_t, it winds up in an int
size value.

This was exposed with the msan test because the memset is called with
(int)frame_size, leading to a segfault.

Change-Id: I7fd930360dca274adb8f3e43e5e6785204808861
2017-11-03 09:49:13 -07:00
Jerome Jiang
3ba9a2c8b2 Merge "vp9: Move allocation of vt2 after early exits." 2017-11-01 16:58:01 +00:00
Jerome Jiang
34805d6d0d vp9: Move allocation of vt2 after early exits.
Remove the memory deallocation on the early exits.

Change-Id: I00b4a814ae6705105ecab89644d055ca3311d9f4
2017-10-31 17:04:04 -07:00
Jerome Jiang
0c84b9b703 Merge "vp9: Reduce stack usage of choose_partitioning." 2017-10-31 21:42:18 +00:00
Jerome Jiang
18b470f486 vp9: Reduce stack usage of choose_partitioning.
Move vt2 to heap.
Reduce the stack usage from ~87K to ~44K.

BUG=b/68362457

Change-Id: I8f5f93712934d59a8cc4564378172d409a736a2e
2017-10-31 13:10:27 -07:00
Jerome Jiang
c77822615e Merge "vp9: Reduce stack usage of choose_partioning." 2017-10-30 23:39:41 +00:00
Jerome Jiang
cc47231187 vp9: Reduce stack usage of choose_partioning.
Change type of sum_square_error from int64_t to uint32_t.
Change type of sum_error from int64_t to int32_t.

This reduces the stack usage from ~131K to ~87K.

BUG=b/68362457

Change-Id: I147d7c7b226bceb4f0817bb86848e1fa9d9ac149
2017-10-30 13:53:20 -07:00
James Zern
acb9460929 vp8: correct if/else '{' placement
swap '{' and c-style comments removing a few redundant ones along the
way; covers most leftovers from the clang-tidy run against an
x86_64-linux config.

Change-Id: I67a45596f80a12389faca49c5be440875092a7df
2017-10-27 12:27:10 -07:00
Scott LaVarnway
3bf02ad74a vpx: hadamard: use ptrdiff_t instead of int for stride
Eliminates the following instruction for the x86 (64 bit)
intrinsic code:

movslq %esi,%rax

Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae
2017-10-26 11:41:48 -07:00
Kyle Siefring
037e596f04 Merge "Optimize convolve8 SSSE3 and AVX2 intrinsics" 2017-10-24 19:22:36 +00:00
Kyle Siefring
ae35425ae6 Optimize convolve8 SSSE3 and AVX2 intrinsics
Changed the intrinsics to perform summation similiar to the way the assembly does.

The new code diverges from the assembly by preferring unsaturated additions.

Results for haswell

SSSE3
Horiz/Vert  Size  Speedup
Horiz       x4    ~32%
Horiz       x8    ~6%
Vert        x8    ~4%

AVX2
Horiz/Vert  Size  Speedup
Horiz       x16   ~16%
Vert        x16   ~14%

BUG=webm:1471

Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668
2017-10-24 10:39:48 -04:00
Scott LaVarnway
e0aa6b24aa Merge "vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix" 2017-10-23 22:02:59 +00:00
Marco
0738d90169 vp9-svc: Allow for adapt_rd_thresh with row-mt.
Set adaptive_row_thresh_mt = 1 at speed >= 7,
for svc when multi-threading is used with row-mt.
This allow the adaptive_rd_thresh feature to be used
in the nonrd-pickmode.

~1-2% speedup for SVC encoding with small quality
loss (< 0.6%) on RTC set.

Change-Id: Iab9878dff117bccdaef3e4d0645165db9808cdfc
2017-10-23 11:47:18 -07:00
Scott LaVarnway
512bf4e029 vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix
Use an intermediate buffer before storing to coeffs when
highbitdepth is enabled.

Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b
2017-10-23 08:49:32 -07:00
Scott LaVarnway
4906cea027 vpx: [x86] vpx_hadamard_16x16_avx2() improvements
~10% performance gain.  Fixed the cosmetics noted in the
previous commit.

Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13
2017-10-20 08:55:06 -07:00
Scott LaVarnway
b58259ab55 Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()" 2017-10-19 23:32:10 +00:00
Paul Wilkins
199971d606 Merge "Corpus VBR tweak for undershoot." 2017-10-19 10:07:45 +00:00
Paul Wilkins
0c493cbe2b Merge "Increase precision of some debug stats output for corpus VBR." 2017-10-19 10:07:30 +00:00
Paul Wilkins
d8c34a2552 Merge "Prevent double application of min rate in two pass." 2017-10-19 10:06:33 +00:00
Scott LaVarnway
55c126a5d7 vpx: [x86] add vpx_hadamard_16x16_avx2()
This version is ~1.91x faster than the sse2 version.  When
highbitdepth is enabled, it is ~1.74x.

Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd
2017-10-18 18:00:00 -07:00
Jerome Jiang
401e6d48bf Merge "Add datarate test for vp8 ROI." 2017-10-18 19:39:26 +00:00
Jerome Jiang
bd6d82e881 Add datarate test for vp8 ROI.
BUG=webm:1470

Change-Id: Icbc848837e64eacc49491dcc26b4c5802af2ee13
2017-10-18 11:19:59 -07:00
Jerome Jiang
ec2fced451 Merge "vp8: Enable use of ROI map." 2017-10-18 18:16:44 +00:00
Kyle Siefring
b3a36f7946 Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c" 2017-10-18 16:19:52 +00:00
Shiyou Yin
df15220a89 Merge "vp8: [loongson] optimize idct with mmi" 2017-10-18 00:55:36 +00:00
Jerome Jiang
dbb8926b86 vp8: Enable use of ROI map.
Disable cyclic refresh if ROI is used and add flag to properly handle
the static_thresh deltas.
Remove the ROI test for cyclic refresh (it's allowed but disabled if ROI
is used).
Add an example in vpx_temporal_svc_encoder.c. Turned off by default.

BUG=webm:1470

Change-Id: Ief9ba1d7f967bc00511b412b491c3f70943bfbda
2017-10-17 15:23:03 -07:00
Linfeng Zhang
9336e01621 Merge changes I17fff122,Ic149e3cb
* changes:
  Add 4 to 3 scaling SSSE3 optimization
  Test extreme inputs in frame scale functions
2017-10-17 16:03:29 +00:00
Linfeng Zhang
0d2e95193b Merge "Generalize CheckScalingFiltering in ConvolveTest" 2017-10-17 16:03:07 +00:00
Kyle Siefring
55805e2786 Refactor x86/vpx_subpixel_8t_intrin_avx2.c
Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305
2017-10-17 11:57:40 -04:00
Shiyou Yin
577d4fa792 vp8: [loongson] optimize idct with mmi
1. vp8_dequant_idct_add_y_block_mmi
2. vp8_dequant_idct_add_uv_block_mmi

Change-Id: I9987147be2685ac79d4b045d1d56f6709ee1223c
2017-10-17 03:27:31 +00:00
Linfeng Zhang
580d32240f Add 4 to 3 scaling SSSE3 optimization
Note this change will trigger the different C version on SSSE3 and
generate different scaled output.

Its speed is 2x compared with the version calling vpx_scaled_2d_ssse3().

Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
2017-10-16 15:42:42 -07:00
Marco
a9248457b1 Adjust threshold in gf_boost for 1 pass vbr
Small inncrease the sad_thresh1, avoids some false
detection of possible scene changes within lag.

Small improvement in few clips on ytlive, otherwise neutral change.

Change-Id: Ia79b53bb657bbce65a7aac7d20666b6373d5af8b
2017-10-13 15:33:51 -07:00
Paul Wilkins
12df840777 Merge "Further Corpus VBR change." 2017-10-13 15:59:58 +00:00
Paul Wilkins
eaa593d293 Merge "Corpus Wide VBR test implementation." 2017-10-13 15:59:45 +00:00
paulwilkins
8842ee0b0d Corpus VBR tweak for undershoot.
In cases of strong undershoot adjust Q range down faster.

Change-Id: I84982beceb3c9b6dc50e52e4a6e891c7dd395d03
2017-10-13 10:27:15 +01:00
Shiyou Yin
3e2770de4f Merge "vp8: [loongson] optimize dct with mmi" 2017-10-13 00:37:57 +00:00
Marco Paniconi
28d1c0535d Merge "Adjust to scene detection for 1 pass vbr." 2017-10-12 19:36:33 +00:00
Marco
a673b4f4af Adjust to scene detection for 1 pass vbr.
Expose the threshold for setting key frame on cut,
and increase it for speed 5.
Also small adjustment to min_thresh.

No change in overall metrics or fps.
Small quality improvement and lower encode time on scene cuts.

Change-Id: I36e06ff3b26b6c29aede39c23fce454525fc9026
2017-10-12 10:59:23 -07:00
Jerome Jiang
175b36cb6d Merge "vp9: use nonrd pick_intra for small blocks on keyframes." 2017-10-12 17:29:27 +00:00
Kyle Siefring
caa116c9be Merge changes I38783d97,If5160c0c
* changes:
  Extend 16 wide AVX2 convolve8 code to support averaging.
  Add AVX2 version of vpx_convolve8_avg.
2017-10-12 16:12:38 +00:00
paulwilkins
2b247ae91c Increase precision of some debug stats output for corpus VBR.
Change-Id: I75841797cc0c215781b5b36e3a3e9f4b0e35ba63
2017-10-12 10:07:21 +01:00
Jerome Jiang
288890cd43 vp9: use nonrd pick_intra for small blocks on keyframes.
Keyframe encoding is more than 2x faster.
Disabled on Speed 8.

Change-Id: I2157318b6ac8253fa5398322c72d98cd7fa9b2b6
2017-10-11 21:38:01 -07:00
Shiyou Yin
f70de09f2a vp8: [loongson] optimize dct with mmi
1. vp8_short_fdct4x4_mmi
2. vp8_short_fdct8x4_mmi
3. vp8_short_walsh4x4_mmi

Change-Id: I89a7df25cfd09fae309fac257ad8b6a3dc1c8acb
2017-10-12 08:50:04 +08:00
Shiyou Yin
bc4098a8e9 Merge "vp8: [loongson] optimize quantize with mmi" 2017-10-12 00:33:17 +00:00
Marco
72c69e14ad Adjust threshold in datarate tests for 1 pass VBR
Small increase in threshold for the 1 pass VBR datarate tests.
Needed due to commit:
<017257a Adjustment to scene detection and key frame>

Change-Id: I28b3bd7db2192a8cc2bccc3cb0e3b8dbb910ca16
2017-10-11 11:48:36 -07:00
Linfeng Zhang
1fa3ec3023 Test extreme inputs in frame scale functions
Change-Id: Ic149e3cb59be2ee0f98a3fcfd83226ad5ea30c99
2017-10-11 11:35:19 -07:00
paulwilkins
416b7051d7 Prevent double application of min rate in two pass.
The initial allocation of bits in the two pass code to each frame
should be within the min max limits on the command line. However,
when forming an ARF group the cost of the ARF is shared by frames
in that group such that the residual bits for a frame could drop below
the min value. This change prevents the minimum being re-applied
after the cost of the ARF has been deducted as this may otherwise
cause low rate sections to overshoot their target.

Test runs comparing to a baseline run with min and max section pct
0-2000% vs one closer to the YT use case (50-150%) suggest that
this fix not only results in better rate control but also gives a better
rd outcome.

For example the HD set vs 0-2000% baseline (opsnr, ssim).
Old code (50-150):  +0.751, +1.099
New code(50-150): +0.241, -0.009

Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef
2017-10-11 18:00:44 +01:00
Shiyou Yin
e8ed2bb762 vp8: [loongson] optimize quantize with mmi
1. vp8_fast_quantize_b_mmi
2. vp8_regular_quantize_b_mmi

Change-Id: Ic6e21593075f92c1004acd67184602d2aa5d5646
2017-10-11 16:45:58 +08:00
Linfeng Zhang
16166bfdaa Add 4 to 1 scaling x86 optimization
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 16:24:06 -07:00
Jerome Jiang
dcfae2cc64 Merge "Fix alignment in vpx_image without external allocation." 2017-10-10 23:02:05 +00:00
Jerome Jiang
33c598990b Fix alignment in vpx_image without external allocation.
This restores behaviors prior to
<40c8fde Fix image width alignment. Enable ImageSizeSetting test.>.

BUG=b/64710201

Change-Id: I559557afe80d5ff5ea6ac24021561715068e7786
2017-10-10 14:26:17 -07:00
Linfeng Zhang
54f7d68c5c Generalize CheckScalingFiltering in ConvolveTest
Let it test extreme inputs and all filter types.
In the future ConvolveTest should test regular 8-bit functions in
high bitdepth mode.

Change-Id: I1042564d1d390589ca203070fe332c6da3315d75
2017-10-10 14:12:43 -07:00
Marco
017257a317 Adjustment to scene detection and key frame.
For 1 pass vbr: use higher threshold on avg_sad
and force key frame under scene cut detection if
above the threshold. Allow it for speed >= 6 for now,
since it does not use the full nonrd_pickmode partition
(as in speed 5).

Improves quality somewhat on scene cut frames.
Neutral on overall metrics and fps for speed 6 on
ytlive set.

Change-Id: I12626f7627419ca14f9d0d249df86c7104438162
2017-10-10 11:20:05 -07:00
Linfeng Zhang
963cc22cef Merge changes I9d4c1af5,I882da3a0
* changes:
  Rename some inline functions in NEON scaling
  Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 17:29:50 +00:00
paulwilkins
06d231c9fa Further Corpus VBR change.
Change to the bit allocation within a GF/ARF group.

Normal VBR and CQ mode allocate bits to a GF/ARF group based of the mean
complexity score of the frames in that group but then share bits evenly between
the "normal" frames in that group regardless of the individual frame complexity
scores (with the exception of the middle and last frames).

This patch alters the behavior for the experimental "Corpus VBR" mode such that
the allocation is always based on the individual complexity scores.

Change-Id: I5045a143eadeb452302886cc5ccffd0906b75708
2017-10-10 10:41:35 +01:00
paulwilkins
741bd6df4f Corpus Wide VBR test implementation.
This patch makes further changes to support an experimental
corpus wide VBR mode that uses a corpus complexity
number as the midpoint of the distribution used to allocate bits
within a clip, rather than some average error score derived from the
clip itself.

At the moment the midpoint number is hard wired for testing and
the mode is enabled or disabled through a #ifdef.  Ultimately this
would need to be controlled by command line parameters.

Change-Id: I9383b76ac9fc646eb35a5d2c5b7d8bc645bfa873
2017-10-10 10:40:44 +01:00
Kyle Siefring
1b2f92ee8e Extend 16 wide AVX2 convolve8 code to support averaging.
Also adds vpx_convolve8_avg_horiz_avx2.

Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf
2017-10-09 19:10:03 -04:00
Linfeng Zhang
27d21a3d13 Rename some inline functions in NEON scaling
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-09 11:23:00 -07:00
Linfeng Zhang
e1ae3772da Merge "Update vp9_scale_and_extend_frame_ssse3()" 2017-10-09 16:20:00 +00:00
Kyle Siefring
9ca06bcdd2 Add AVX2 version of vpx_convolve8_avg.
vpx_convolve8_avg works by first running a normal horizontal filter then a
vertical filter averages at the end.

The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
horizontal step.

vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code.

Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
2017-10-07 23:37:48 -04:00
James Zern
807248ec81 Merge "ppc: Add vpx_idct32x32_1024_add_vsx" 2017-10-07 19:08:26 +00:00
Marco Paniconi
5bc4c37a89 Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."" 2017-10-06 22:41:34 +00:00
Marco Paniconi
bcbc6ed82d Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."
This reverts commit 9311ef18b4.

Reason for revert:
Notice small regression in some clips.
Will revisit in another change.

Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
> 
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
> 
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-10-06 22:14:56 +00:00
Marco
e405eb06b1 Adjust threshold in scene detection
For 1 pass vbr: increase min_thresh slightly, and also add
condition on golden/arf update for using full nonrd_pick_partition.

Reduces possible false detection for scene cut detection.

Neutral/small change in metrics or speed for speed 5.

Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76
2017-10-06 11:08:56 -07:00
Marco Paniconi
7af6c6c9ca Merge "Speed >=5 real-time: add TM intra mode for high_source_sad." 2017-10-06 06:29:46 +00:00
Marco
9311ef18b4 Speed >=5 real-time: add TM intra mode for high_source_sad.
Small/neutral change in metrics or speed for ytlive.
Some improvement in quality on frames with big content change.

Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d
2017-10-05 23:07:03 -07:00
James Zern
d2fb834ebd Merge "vpx_codec.h: namespace local defines" 2017-10-06 05:30:16 +00:00
James Zern
e8ed030da3 vpx_codec.h: namespace local defines
add VPX_ to UNUSED/*DEPRECATED to avoid conflicts with other headers.

Change-Id: Ie16bdac3575bc1af57a05d37e65b994370585377
2017-10-05 15:12:20 -07:00
James Zern
107eb6a9d4 vp9_ethread_test: abort early/add more detailed output
in the case compare_fp_stats fails report the 2 values and their index

Change-Id: I927a832b7a1e24c392961093b7caee1134223def
2017-10-05 15:02:51 -07:00
Marco Paniconi
e095bcce44 Merge "Adjust threshold for adapt_partition for speed 6." 2017-10-05 03:28:06 +00:00
Marco
18262a8576 Adjust threshold for adapt_partition for speed 6.
Lower SAD threshold to select non_rd pickmode partition
at superblock level more often.
Small gain in metrics, small/negligible decrease in speed.

Change-Id: I0f728236b91a604e4ca7e02039adc54d5985c4dc
2017-10-04 18:04:09 -07:00
Marco Paniconi
014976c251 Merge "Avoid nonrd_pick_partition for speed >= 6." 2017-10-04 23:36:27 +00:00
Marco
4bc1fc58b6 Avoid nonrd_pick_partition for speed >= 6.
For 1 pass vbr speed >= 6: when REFERENCE_PARTITION is selected,
avoid doing the full nonrd_pickmode based partition.
No change in overall metrics or speed.
Reduces encode times on scene cuts by 10-20%.

Change-Id: I0310b1610cc1c83793a509e0a9059840e8f18308
2017-10-04 15:31:54 -07:00
Marco Paniconi
6a42bdd25f Merge "Modify early exit for alt_ref in nonrd_pickmode." 2017-10-04 19:38:49 +00:00
Linfeng Zhang
127864deb3 Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
Change-Id: I882da3a04884d5fabd4cd591c28682cbb2d76aa5
2017-10-04 12:35:39 -07:00
Linfeng Zhang
b809442521 Update vp9_scale_and_extend_frame_ssse3()
Change-Id: I22622faebfcc36f7a4d1f37e3800ae8ab87c8cd4
2017-10-04 12:32:30 -07:00
Marco
77e51e2035 Modify early exit for alt_ref in nonrd_pickmode.
For 1 pass vbr mode:
On no-show_frame/ARF: instead of skipping alt_ref_frame
completely in mode testing, allow for checking (0, 0) on alt_ref.

Small gain in metrics, ~0.18%, no change in speed.

Change-Id: I32a3c24faca64ab70dd5091071a0dc301db7dd1e
2017-10-04 11:53:39 -07:00
Linfeng Zhang
9a71811d98 Merge changes Id6a8c549,Ib1e0650b,Ic369dd86
* changes:
  Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
  Add vpx_dsp/x86/mem_sse2.h
  Add transpose_8bit_{4x4,8x8}() x86 optimization
2017-10-04 16:15:14 +00:00
Jerome Jiang
ffa3a3c441 Merge "Fix image width alignment. Enable ImageSizeSetting test." 2017-10-04 14:48:03 +00:00
Marco
98dbf31c87 Enable arf usage for speed >= 6, 1 pass vbr.
For speed 6 on ytlive set:
On average, speed slowdown ~5%, quality gain ~2%.

Change-Id: Ia18237cc1d52c54d7e2cb3c71f571cf37ef61b44
2017-10-03 17:18:33 -07:00
Marco
ab2bd340ac vp9: 1 pass vbr: Limit qpdelta on high_source_sad.
For 1 pass vbr: when significant content/scene change is detected
(high_source_sad = 1) reduce/turnoff the additional qdelta on the
active_worst_quality. This helps somewhat to reduce the occurrence
of large frame sizes and large encode times.
Allow it only when use_altef_onepass is enabled.

Neutral/no change on metrics.

Change-Id: I1dd97dd2ab892d65f707b841b27a5de300b714ea
2017-10-03 16:27:17 -07:00
James Zern
66b6b87471 Merge "vpx: fix nasm build errors" 2017-10-03 21:47:49 +00:00
Scott LaVarnway
bc4bc9b622 vpx: fix nasm build errors
BUG=webm:1462,766721

Change-Id: Icfa536a8e38623636b96c396e3c94889bfde7a98
2017-10-03 20:02:21 +00:00
Linfeng Zhang
6543213e87 Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac
2017-10-03 13:02:05 -07:00
Linfeng Zhang
0f756a307d Add vpx_dsp/x86/mem_sse2.h
Add some load and store sse2 inline functions.

Change-Id: Ib1e0650b5a3d8e2b3736ab7c7642d6e384354222
2017-10-03 12:59:05 -07:00
Marco
c8678fb7f3 Use adapt_partition for ARF in 1 pass.
For speed 6 real-time mode: use adapt_partition
on ARF frame instead of REFERENCE_PARTITION (which is slower).
This requires enabling compute_source_sad_onepass for no-show_frames.

Speedup of ~3-5% on some clips that heavily use ARF,
small loss (~0.2%) in quality on ytlive set.

Change-Id: Ib50acc97df06458244a6ac55d2bd882c30012536
2017-10-03 11:49:55 -07:00
Linfeng Zhang
67c38c92e7 Add transpose_8bit_{4x4,8x8}() x86 optimization
Change-Id: Ic369dd86b3b81686f68fbc13ad34ab8ea8846878
2017-10-03 10:00:30 -07:00
Marco Paniconi
fe7b869104 Merge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode." 2017-10-03 03:01:14 +00:00
Marco
33e10dfa7e ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.
Speedup of ~2-3% on 1080p clips speed 6.
Neutral/negligible loss in metrics on ytlive.

Change-Id: I7ac47a4d8b58c566920bae29a94a0e8d59c36dee
2017-10-02 19:04:03 -07:00
Linfeng Zhang
0e55b0b0a7 Add 4 to 3 scaling NEON optimization
Speed comparing with the one calling vpx_scaled_2d_neon()
  ~1.7 x in general
  ~2.8x for BILINEAR filter

BUG=webm:1419

Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6
2017-10-02 15:04:09 -07:00
Linfeng Zhang
2c560c3c22 Specialize 4 to 3 frame scaling in C
Scale 3x3 block instead of 16x16 block in each loop. Disabled by
default.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3.
   Optimization code will be smaller and faster.
2. Maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: I59a1f7496d89a1b090498c935d30cfcf1d0c282b
2017-10-02 11:56:15 -07:00
Scott LaVarnway
3c700052be Merge "vpxdsp: [x86] add highbd_d135_predictor functions" 2017-10-02 15:00:19 +00:00
Alexandra Hájková
fb7fc1dbda ppc: Add vpx_idct32x32_1024_add_vsx
Change-Id: I55cd0a1569ccc47a53d0ecf751aac259d510e10d
2017-09-30 19:31:20 +00:00
Marco
c8f6e7b99e Fix partition selection in speed features for arf overlay frame.
For real-time mode. Move the switch to fixed partition
for is_src_frame_alt_ref so all speeds may use it
if use_altref_onepass is set.

Improves metrics by ~2% for ytlive set at speed 4
(where use_altref_onepass is currently used).

Change-Id: I033240386598c9dbd0364da89ccbcca64bc663ee
2017-09-29 15:02:28 -07:00
Marco
f2c3d0a7a3 Enable use_altref_onepass for speed 4 real-time mode.
Used for VBR mode with lag-in-frames > 0.
On ytlive set at speed 4: ~3% average gain.

Change-Id: I45dad1700bf8be9d8f177815dc062774f6f2f0de
2017-09-29 10:56:14 -07:00
Scott LaVarnway
3bbd62ed27 vpxdsp: [x86] add highbd_d135_predictor functions
C vs SSE2 speed gains:
_4x4 : ~1.81x

C vs SSSE3 speed gains:
_8x8 : ~1.96x
_16x16 : ~1.88x
_32x32 : ~2.02x

BUG=webm:1411

Change-Id: Iefaf8b39afbbfe34c1ad1d21e3a003b20f1f61e0
2017-09-29 08:56:38 -07:00
Scott LaVarnway
4cae64c32c vpxdsp: [x86] add highbd_d117_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.04x

C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x

BUG=webm:1411

Change-Id: I31d949695991c067dac89d91e0bed3e666c94993
2017-09-28 14:45:28 -07:00
Jerome Jiang
5a40c8fde1 Fix image width alignment. Enable ImageSizeSetting test.
BUG=b/64710201

Change-Id: I5465f6c6481d3c9a5e00fcab024cf4ae562b6b01
2017-09-28 11:25:24 -07:00
Marco
a2ef180dd0 Set rc->high_source_sad = 0 before scene detection.
Only has effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frame
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f
2017-09-28 10:49:45 -07:00
Marco Paniconi
3b8cc214ef Merge "vp9: Modification to adapt the ARF usage for 1 pass vbr" 2017-09-28 16:52:28 +00:00
Marco
03e8f13337 vp9: Modification to adapt the ARF usage for 1 pass vbr
Add stats for past ARF usage, and use it to disable
ARF usage based on some conditions.

Overall improvement on ytlive set, reduces the regression
on the problem clips for this feature.

Only affects when sf->use_altref_onepass is enabled
(currently off by default).

Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb
2017-09-28 09:10:30 -07:00
Marco
c493ea1a6b Add use_svc condition to the scene detection in 1 pass.
Scene detection is not currently used in SVC 1 pass code.
Speedup of ~0.4%.

Change-Id: I0ab769300919de710cd2da1402014fa3f22a1f86
2017-09-27 14:51:46 -07:00
Marco Paniconi
786b124e20 Merge "Revert "Remove the speed condition on scene detection in 1 pass code."" 2017-09-27 20:42:48 +00:00
Scott LaVarnway
80992a746c Merge "vpxdsp: [x86] add highbd_d153_predictor functions" 2017-09-27 20:40:21 +00:00
Marco Paniconi
8d438dc313 Revert "Remove the speed condition on scene detection in 1 pass code."
This reverts commit 535b7b915a.

This is actually used in CBR to reset the rate control if high source sad is detected.

Original change's description:
> Remove the speed condition on scene detection in 1 pass code.
> 
> Scene detection is used for VBR mode and for screen_content mode.
> 
> It was also enabled for CBR mode via the speed condition,
> but currently the analysis in the scene detection is not used
> in CRB mode (similar computations are done locally at superblock level
> when the source_sad feature is enabled).
> 
> For 1 pass code.
> No change in behavior. Small speed gain, ~0.5%.
> 
> Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ib4e6b02047f75632503e7b0fc870af97fa9291c3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-09-27 19:42:48 +00:00
James Zern
690fa6bb6e Merge "fix signed integer overflow of idct" 2017-09-27 19:39:11 +00:00
James Zern
6175f285a9 Merge "vp9_dx_iface: Stop using iter parameter incorrectly" 2017-09-27 18:37:20 +00:00
Linfeng Zhang
dbbbd44304 fix signed integer overflow of idct
Exposed by fuzz test in high bitdepth.
The bug is introduced in commit 64653fa.

BUG=webm:1466

Change-Id: Idd77d5c6a60efb9241471611ce1aba0646cb6ff5
2017-09-27 11:17:54 -07:00
Scott LaVarnway
19c45ccd43 vpxdsp: [x86] add highbd_d153_predictor functions
C vs SSE2 speed gains:
_4x4 : ~1.95x

C vs SSSE3 speed gains:
_8x8 : ~3.30x
_16x16 : ~5.67x
_32x32 : ~3.87x

BUG=webm:1411

Change-Id: Ib483989b25614aa89b635e8c087d0879a5d71904
2017-09-27 11:01:11 -07:00
Marco
535b7b915a Remove the speed condition on scene detection in 1 pass code.
Scene detection is used for VBR mode and for screen_content mode.

It was also enabled for CBR mode via the speed condition,
but currently the analysis in the scene detection is not used
in CRB mode (similar computations are done locally at superblock level
when the source_sad feature is enabled).

For 1 pass code.
No change in behavior. Small speed gain, ~0.5%.

Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f
2017-09-27 10:32:54 -07:00
Vignesh Venkatasubramanian
530e60143a vp9_dx_iface: Stop using iter parameter incorrectly
'iter' parameter is being checked for NULL in every call to
decoder_get_frame which is quite pointless because it is always
going to be NULL unless the application changed it. The code works
as described only because vp9_get_raw_frame returns -1 on all
subsequent calls after the first.

Change-Id: Ic736b9e8fe36fc1430fc11d6a9b292be02497248
2017-09-27 09:59:39 -07:00
Linfeng Zhang
d203a91a09 Merge "Add vpx_scaled_2d_neon()" 2017-09-27 16:12:48 +00:00
Jerome Jiang
878464150b Merge "Add unit test to expose vp8 bug when width is set odd." 2017-09-27 01:26:59 +00:00
Shiyou Yin
f12e786c54 Merge "vp8: [loongson] optimize copymen with mmi" 2017-09-27 00:49:28 +00:00
Jerome Jiang
767503504f Add unit test to expose vp8 bug when width is set odd.
BUG=b/64710201

Change-Id: Ia518af5494a42e80949cf1165244fbed59606cf7
2017-09-26 17:40:13 -07:00
Marco
819c5b365d Remove the speed condition in setting compute_source_sad.
The speed condition is not needed, feature can used for any
speed in 1 pass code.

Change-Id: I878ef3f63a075302eda48c0343fa243c80aab9ba
2017-09-26 15:48:34 -07:00
Marco
d5094cfde8 Replace flag USE_ALTREF_FOR_ONE_PASS with speed feature.
To be used for 1 pass VBR.
Off by default in speed features.

Change-Id: I5d6110d6d191990db526fe68ec9715379a4d1754
2017-09-26 11:16:50 -07:00
Marco Paniconi
9e52d3910b Merge "SVC: Add setting for max_intra_rate_pct in sample encoder." 2017-09-26 16:28:30 +00:00
Linfeng Zhang
9d0d13e939 Add vpx_scaled_2d_neon()
BUG=webm:1419

Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96
2017-09-26 09:22:39 -07:00
Linfeng Zhang
28762341ac Merge changes Ib9105462,Idfac00ed,If8d8a0e2
* changes:
  cosmetics: NEON scaling code
  Refactor convolve NEON code
  Refactor convolve code
2017-09-26 16:10:46 +00:00
Shiyou Yin
73102d1ed2 vp8: [loongson] optimize copymen with mmi
1. vp8_copy_mem16x16_mmi
2. vp8_copy_mem8x8_mmi
3. vp8_copy_mem8x4_mmi

Change-Id: I3de29a11fa7402df0e48bbb944440b1e66498a65
2017-09-26 08:40:11 +08:00
Marco
23eccb3ca7 SVC: Add setting for max_intra_rate_pct in sample encoder.
Set it as default to 900.

Change-Id: Id2d990925dccff1f6762411c66ea95973440c92f
2017-09-25 13:39:18 -07:00
Scott LaVarnway
a059dc0986 Merge "vpxdsp: [x86] add highbd_d45_predictor functions" 2017-09-25 11:34:14 +00:00
Scott LaVarnway
cf82f7276e vpxdsp: [x86] add highbd_d45_predictor functions
C vs SSSE3 speed gains:
_4x4 : ~2.45x
_8x8 : ~10.61x
_16x16 : ~11.34x
_32x32 : ~6.36x

BUG=webm:1411

Change-Id: Ic91389a4f1a8ad093f498afe53765b897fb9be09
2017-09-22 05:20:12 -07:00
James Zern
691585f6b8 Merge changes If59743aa,Ib046fe28,Ia2345752
* changes:
  Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
  Remove the unnecessary upcasts of (int)cospi_{1...31}_64
  Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
2017-09-22 07:35:55 +00:00
Andrew Lewis
10bab1ec29 Merge "Comma-separate VP9 encoder tmp.stt output" 2017-09-21 08:50:53 +00:00
Marco Paniconi
0b08f8892f Merge "vp9: Modify pickmode early exit for ARF in 1pass." 2017-09-21 01:33:12 +00:00
Marco
42373b21ce vp9: Modify pickmode early exit for ARF in 1pass.
Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves quality of first frame following ARF, where
frame_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc
2017-09-20 15:02:37 -07:00
Linfeng Zhang
d586cdb4d4 Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
BUG=webm:1450

Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8
2017-09-20 14:13:26 -07:00
Linfeng Zhang
76a3d3fcc5 Remove the unnecessary upcasts of (int)cospi_{1...31}_64
BUG=webm:1450

Change-Id: Ib046fe28caec5b9ebdc9d0152df7c54ff4266858
2017-09-20 14:13:26 -07:00
Linfeng Zhang
64653fa133 Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
The unnecessary upcast to (int) will be cleaned later.

BUG=webm:1450

Change-Id: Ia234575206d5a74540526924b06ed3939322d063
2017-09-20 14:13:26 -07:00
James Zern
f7b276c26b Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c" 2017-09-20 21:12:45 +00:00
Linfeng Zhang
24afb5d036 Bug fix: fadst4() in vp9/encoder/vp9_dct.c
A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991
2017-09-20 12:27:29 -07:00
Marco Paniconi
f407b30490 Merge "vp9: Modify simple_block_yrd condition for SVC" 2017-09-20 16:42:31 +00:00
Scott LaVarnway
b85e391ac8 Merge "vpxdsp: [x86] add highbd_d63_predictor functions" 2017-09-20 11:39:28 +00:00
James Zern
15bea62176 temporal_filter_apply_sse2.asm: add ':' to label
quiets nasm warning:
label alone on a line without a colon might be in error

BUG=webm:1462

Change-Id: I660407ca60e8c9a810dba9d76afb65852029a29c
2017-09-19 18:59:11 -07:00
Linfeng Zhang
7c0529728a cosmetics: NEON scaling code
Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618
2017-09-19 16:39:17 -07:00
Linfeng Zhang
f357335c38 Refactor convolve NEON code
Rename a couple of hbd static functions.
Move the position of NEON function convolve8_4().

Change-Id: Idfac00edf2e99cdd8e0a73b9f895402f60be6349
2017-09-19 16:28:36 -07:00
Linfeng Zhang
bf8bdae913 Refactor convolve code
Extract a couple of static functions into their caller functions.

Change-Id: If8d8a0e217fba6b402d2a79ede13b5b444ff08a0
2017-09-19 16:28:31 -07:00
Scott LaVarnway
bc86e2c6a2 vpxdsp: [x86] add highbd_d63_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.94x

C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x

BUG=webm:1411

Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4
2017-09-19 15:47:22 -07:00
Marco
aaa6cdcc2e vp9: Modify simple_block_yrd condition for SVC
Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speed up of about ~2% for 3 layer SVC, with little/negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344
2017-09-19 15:39:05 -07:00
Marco Paniconi
2a7c0e1c4b Merge "Add datarate test for frame_parallel_decoding mode off." 2017-09-19 22:31:08 +00:00
Marco
cd463c7acb vp9: Fix condition for limiting ARF 1 pass vbr.
Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae
2017-09-19 11:12:37 -07:00
Marco Paniconi
310e388423 Merge "vp9: Adjustments for ARF usage in 1 pass vbr." 2017-09-19 16:29:19 +00:00
Marco
ebb015a539 vp9: Adjustments for ARF usage in 1 pass vbr.
Only when USE_ALT_REF_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_alt_ref_frame is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d
2017-09-18 18:46:41 -07:00
Linfeng Zhang
a80bdfd081 Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154
2017-09-18 09:26:03 -07:00
Linfeng Zhang
9d278465b5 Merge "cosmetics: vp9_rtcd_defs.pl" 2017-09-18 16:23:33 +00:00
Shiyou Yin
2aacfa1acd Merge "vp8: [loongson] optimize dequantize with mmi" 2017-09-15 23:53:40 +00:00
Marco
ad31fe36a8 Add datarate test for frame_parallel_decoding mode off.
Add datarate test, for both VBR and CBR mode, with the
frame_parallel_decoding mode disabled (and error_resilience off).

Change-Id: I54feec3248a68ecff4bef8d9a31bb1616fab77df
2017-09-15 11:38:38 -07:00
Paul Wilkins
65f1c90652 Merge "Fix bug in intra mode rd penalty." 2017-09-15 15:43:29 +00:00
Kaustubh Raste
08fda52e18 Merge "mips msa clean-up msa macros" 2017-09-15 01:27:02 +00:00
James Zern
90ed0d2f73 Merge "vp9_scale_test: add C config" 2017-09-15 00:27:58 +00:00
James Zern
c12b39626f Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"" 2017-09-15 00:27:41 +00:00
Hui Su
293734b755 Merge "VP9 level targeting: add a new AUTO mode" 2017-09-14 21:02:38 +00:00
James Zern
c24d911847 vp9_scale_test: add C config
Change-Id: I9dfe8255d1c096d246bf9719729f57dbae779ffc
2017-09-14 13:08:04 -07:00
James Zern
baf658ec4c Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
This reverts commit afee58f2c4.

This causes ~8x slowdown in 4:3 in the C-code

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e
2017-09-14 13:07:21 -07:00
Hui Su
c3a6943c16 VP9 level targeting: add a new AUTO mode
In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimum compression quality lost. However, the
bitstream is not guaranteed to be level compatible, e.g., the average bitrate
may exceed level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c
2017-09-14 16:20:29 +00:00
Shiyou Yin
b81de66171 vp8: [loongson] optimize dequantize with mmi
1. vp8_dequantize_b_mmi
2. vp8_dequant_idct_add_mmi

Change-Id: I505f8afb7a444173392b325906e6a4f420f00709
2017-09-14 20:56:06 +08:00
Shiyou Yin
5b558592f5 vp8: [loongson] optimize idctllm with mmi
1. vp8_short_idct4x4llm_mmi
2. vp8_short_inv_walsh4x4_mmi
3. vp8_dc_only_idct_add_mmi

Change-Id: I616923681e79d78607a4988608fc39df77b093f4
2017-09-14 16:51:11 +08:00
Kaustubh Raste
4ca8f8f5e2 mips msa clean-up msa macros
Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store

Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448
2017-09-14 12:29:19 +05:30
Linfeng Zhang
535dee0fb6 cosmetics: vp9_rtcd_defs.pl
Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086
2017-09-13 12:13:55 -07:00
Linfeng Zhang
0726dd97d3 Merge "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()" 2017-09-13 17:21:45 +00:00
Andrew Lewis
949730e2dc Comma-separate VP9 encoder tmp.stt output
Also add column headings so that the output can still be parsed if the
set of headers changes later.

Change-Id: I4beaf266521e093db4acf5f715b18fdfb7e3d1cd
2017-09-13 16:26:40 +01:00
Johann Koenig
ed3a80cb5e Merge "Revert "Revert "quantize avx: copy 32x32 implementation""" 2017-09-13 14:44:53 +00:00
Kaustubh Raste
83e59914e5 Merge "Optimize mips msa vp9 average mc functions" 2017-09-13 06:02:49 +00:00
Shiyou Yin
fa01426ade Merge "vp8: [loongson] optimize loopfilter with mmi" 2017-09-13 01:05:46 +00:00
Johann
eb4238ac70 Revert "Revert "quantize avx: copy 32x32 implementation""
This reverts commit 8c42237bb2.

Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.

Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
2017-09-12 14:25:38 -07:00
Linfeng Zhang
afee58f2c4 Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()
Scale 3x3 block instead of 16x16 block in each loop.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3. Optimization code
   will be smaller and faster.
2. The maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: Ibb9242a629ddb03e1ff93b859bece738255e698c
2017-09-12 12:05:16 -07:00
Kaustubh Raste
30f1ff94e0 Optimize mips msa vp9 average mc functions
Load the specific destination loads instead of vector load

Change-Id: I65ca13ae8f608fad07121fef848e2a18f54171fe
2017-09-12 16:12:11 +05:30
Scott LaVarnway
c39cd9235e Merge "vpxdsp: [x86] add highbd_d207_predictor functions" 2017-09-11 22:32:23 +00:00
Linfeng Zhang
a9bbe53dbb Add 4 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: If82a93935d2453e61b7647aae70983db1740bec7
2017-09-11 10:17:28 -07:00
Scott LaVarnway
d6c9bbc2b6 vpxdsp: [x86] add highbd_d207_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.31x

C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x

BUG=webm:1411

Change-Id: I0bac29db261079181ddabc6814bd62c463109caf
2017-09-11 07:36:24 -07:00
Shiyou Yin
761f2f5cb4 vp8: [loongson] optimize loopfilter with mmi
1. vp8_loop_filter_horizontal_edge_mmi
2. vp8_loop_filter_vertical_edge_mmi
3. vp8_mbloop_filter_horizontal_edge_mmi
4. vp8_mbloop_filter_vertical_edge_mmi
5. vp8_loop_filter_simple_horizontal_edge_mmi
6. vp8_loop_filter_simple_vertical_edge_mmi

Change-Id: Ie34bbff3a16cff64e39a50798afd2b7dac9bcdc3
2017-09-11 11:08:09 +08:00
James Zern
fb40b5d7a7 intrapred: sync highbd_d63_predictor w/d63_
8/16/32: ~6%/~18%/~33% faster

previously:
7012ba639 vp9_reconintra: simplify d63_predictor

BUG=webm:1411

Change-Id: Ie775f3a4f7fd74df44754e65686d826a51c2cdc2
2017-09-08 19:28:01 -07:00
James Zern
9dfa76f948 vpx_mem: make vpx_memset16 inline
Change-Id: Ibb2cab930c95836e6d6e66300c33e7d08e4474d4
2017-09-08 19:11:46 -07:00
James Zern
5c95fd921e intrapred: sync highbd_d45_predictor w/d45_
8/16/32:: ~19%/~54%/~75.5% faster

previously:
acc481eaa vp9_reconintra: simplify d45_predictor

BUG=webm:1411

Change-Id: Ie8340b0c5070ae640f124733f025e4e749b660d8
2017-09-08 19:09:07 -07:00
James Zern
9a2dd7e67e Merge changes I9ec438aa,I99c954ff
* changes:
  Update convolve functions' assertions
  Add 2 to 1 scaling NEON optimization
2017-09-08 19:23:40 +00:00
paulwilkins
0657f4732c Fix bug in intra mode rd penalty.
The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.

The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.

This bug fix improves average metrics  on our standard test
sets by about 0.1%

Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
2017-09-08 15:10:53 +01:00
James Zern
d7caee2170 vpx_scale_test.h: remove #if from inside macro
fixes visual studio error

Change-Id: I86206f17ca951b15e247c1b92561847d8c21ec7a
2017-09-08 00:06:25 -07:00
Shiyou Yin
43cbdc216d Merge "vp8: [loongson] optimize sixtap predict with mmi" 2017-09-08 00:59:31 +00:00
Shiyou Yin
2c7b7424c5 Merge "vpxdsp: [loongson] optimize sad functions with mmi" 2017-09-08 00:55:14 +00:00
Linfeng Zhang
ef41c6286d Update convolve functions' assertions
So that 4 to 1 frame scaling can call them.

Change-Id: I9ec438aa63b923ba164ad3c59d7ecfa12789eab5
2017-09-07 12:33:58 -07:00
Linfeng Zhang
71b38a144e Add 2 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b
2017-09-07 12:33:50 -07:00
Linfeng Zhang
3ec20445b2 Refactor convolve8 NEON functions
Change-Id: I4ac576875c91fee7cb150d298fae4a2c156d374c
2017-09-06 15:55:17 -07:00
Linfeng Zhang
d5d2cbcc75 Add ScaleFrameTest
Move class VpxScaleBase to new file test/vpx_scale_test.h.
Add new file test/vp9_scale_test.cc with ScaleFrameTest.

BUG=webm:1419

Change-Id: Iec2098eafcef99b94047de525e5da47bcab519c1
2017-09-06 15:54:58 -07:00
Linfeng Zhang
7219f31904 Merge "Remove get_filter_base() and get_filter_offset() in convolve" 2017-09-06 22:39:15 +00:00
Scott LaVarnway
0e95039bd9 Merge "vpxdsp: [x86] add highbd_dc_128_predictor functions" 2017-09-06 21:53:32 +00:00
Peter Boström
6822fb2f09 Remove support for stdatomic.h.
This header doesn't build on g++ v6 as it's a C and not C++ header
(_Atomic is not a keyword in C++11). Since the C and C++ invocations
cannot be guaranteed to point to the same underlying atomic_int
implementation, remove support for them and use compiler intrinsics
instead.

BUG=webm:1461

Change-Id: Ie1cd6759c258042efc87f51f036b9aa53e4ea9d5
2017-09-06 11:59:50 -04:00
Linfeng Zhang
d331e7a1c0 Remove get_filter_base() and get_filter_offset() in convolve
so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-09-05 15:22:36 -07:00
Scott LaVarnway
bc4bcca3fd vpxdsp: [x86] add highbd_dc_128_predictor functions
C vs SSE2 speed gains:
_4x4 : ~7.64x
_8x8 : ~16.60x
_16x16 : ~8.15x
_32x32 : ~5.05x

BUG=webm:1411

Change-Id: If165d419711cfda901bd428a05ca1560a009e62e
2017-09-05 07:57:42 -07:00
Shiyou Yin
0095213790 vp8: [loongson] optimize sixtap predict with mmi
1. vp8_sixtap_predict16x16_mmi
2. vp8_sixtap_predict8x8_mmi
3. vp8_sixtap_predict8x4_mmi
4. vp8_sixtap_predict4x4_mmi

Change-Id: I186669d1a1d998a0f3ba3a548e25eee8b52c251b
2017-09-02 19:08:20 +00:00
Shiyou Yin
f4150163a2 vpxdsp: [loongson] optimize sad functions with mmi
1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c

Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a
2017-09-02 15:11:32 +00:00
James Zern
d49a1a5329 test,Android.mk: export gtest include path
fixes test file builds

Change-Id: Iaa725ad95d56cf77d9fef8994981a80102e9a966
2017-09-01 19:44:12 -07:00
clang-format
7587a97551 apply clang-format
Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c
2017-09-01 01:24:03 -07:00
James Zern
053bd263eb .clang-format: update to 4.0.1
based on Google style with the following differences:

3a4
> # Generated with clang-format 4.0.1
13c14
< AllowShortCaseLabelsOnASingleLine: false
---
> AllowShortCaseLabelsOnASingleLine: true
23c24
< BraceWrapping:
---
> BraceWrapping:
43c44
< ConstructorInitializerAllOnOneLineOrOnePerLine: true
---
> ConstructorInitializerAllOnOneLineOrOnePerLine: false
46,47c47,48
< Cpp11BracedListStyle: true
< DerivePointerAlignment: true
---
> Cpp11BracedListStyle: false
> DerivePointerAlignment: false
51c52
< IncludeCategories:
---
> IncludeCategories:
78c79
< PointerAlignment: Left
---
> PointerAlignment: Right
80c81
< SortIncludes:    true
---
> SortIncludes:    false

Change-Id: Ibc0ef87a516b8eae88d426dfdd7624be57e7b87c
2017-09-01 01:24:03 -07:00
Peter Boström
be2ba48cac Merge "Prevent data race from low-pass filter." 2017-09-01 05:37:51 +00:00
James Zern
334e9abb0b Merge "inv_txfm_vsx: fix loads in high-bitdepth" 2017-09-01 03:09:49 +00:00
Peter Boström
9ab4d9df38 Prevent data race from low-pass filter.
Makes main thread wait for the filter level to be picked to avoid a race
between the LPF thread and update_reference_frames(). This also
re-enables the failing tests under thread_sanitizer where this data race
was detected.

BUG=webm:1460

Change-Id: I7f5797142ea0200394309842ce3e91a480be4fbc
2017-08-31 18:37:55 -07:00
Peter Boström
03191f738e Merge "Add atomics to vp8 synchronization primitives." 2017-09-01 01:36:22 +00:00
Peter Boström
d42e876164 Add atomics to vp8 synchronization primitives.
Fixes issue on iPad Pro 10.5 (and probably other places) where threads
are not properly synchronized. On x86 this data race was benign as load
and store instructions are atomic, they were being atomic in practice as
the program hasn't been observed to be miscompiled.

Such guarantees are not made outside x86, and real problems manifested
where libvpx reliably reproduced a broken bitstream for even just the
initial keyframe. This was detected in WebRTC where this device started
using multithreading (as its CPU count is higher than earlier devices,
where the problem did not manifest as single-threading was used in
practice).

This issue was not detected under thread-sanitizer bots as mutexes were
conditionally used under this platform to simulate the protected read
and write semantics that were in practice provided on x86 platforms.

This change also removes several mutexes, so encoder/decoder state is
lighter-weight after this change and we do not need to initialize so
many mutexes (this was done even on non-thread-sanitizer platforms where
they were unused).

Change-Id: If41fcb0d99944f7bbc8ec40877cdc34d672ae72a
2017-08-31 17:55:57 -07:00
Scott LaVarnway
ab5704f02c Merge "vpxdsp: [x86] add highbd_dc_left_predictor functions" 2017-08-31 21:34:27 +00:00
Jerome Jiang
20973508da Merge "vp9: Skip testing duplicate zero mv in nonrd-pickmode." 2017-08-31 17:16:19 +00:00
Jerome Jiang
ebf3ae1a29 vp9: Skip testing duplicate zero mv in nonrd-pickmode.
Neutral on rtc set for speed 8. Neutral on ytlive for speed 5.

Saves some computation cycles but no speed gain observed on Pixel.

Change-Id: I34c4642cd543aa89c5b9c4bff6b7113577c64c91
2017-08-31 17:13:31 +00:00
James Zern
f8f64c309b inv_txfm_vsx: fix loads in high-bitdepth
vec_vsx_ld -> load_tran_low

Change-Id: Id3144cdd528d2d406a515e5812e2ea9e4db64bf1
2017-08-30 23:47:56 -07:00
Jerome Jiang
297c110dcb Merge "Revert "Re-enable disabled tests under TSan."" 2017-08-31 01:52:42 +00:00
Jerome Jiang
d7ba519b9f Revert "Re-enable disabled tests under TSan."
This reverts commit df9ce12259.

Reason for revert: 

Re-enabled tests still fail tsan in high bitdepth.

Original change's description:
> Re-enable disabled tests under TSan.
> 
> These tests point to an already-fixed bug, this should no longer have a
> data race.
> 
> BUG=webm:1049
> 
> Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8

TBR=jzern@google.com,pbos@chromium.org,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: webm:1049
Change-Id: I232f1f7726bf795b301abfb2e07cad6756642e53
2017-08-30 23:44:21 +00:00
Scott LaVarnway
c39a05ff61 vpxdsp: [x86] add highbd_dc_left_predictor functions
C vs SSE2 speed gains:
_4x4 : ~6.49x
_8x8 : ~10.82x
_16x16 : ~7.61x
_32x32 : ~5.29x

BUG=webm:1411

Change-Id: Ibc30c50cb7139049bf05298010803499e6ef949b
2017-08-30 09:29:06 -07:00
Scott LaVarnway
2d0c11093e Merge "vpxdsp: [x86] add highbd_dc_top_predictor functions" 2017-08-30 11:25:07 +00:00
Scott LaVarnway
f783e3a75d vpxdsp: [x86] add highbd_dc_top_predictor functions
C vs SSE2 speed gains:
_4x4 : ~7.39x
_8x8 : ~11.36x
_16x16 : ~8.68x
_32x32 : ~4.33x

BUG=webm:1411

Change-Id: I7f1487cd1531d4e7f0fbb4596fed3bfb72a59d58
2017-08-29 12:53:30 -07:00
Jerome Jiang
fed52670a2 Merge "vp9: Speed 8: Enable skip_encode_sb" 2017-08-29 16:45:09 +00:00
Peter Boström
2f5fb37dac Merge "Re-enable disabled tests under TSan." 2017-08-29 15:42:39 +00:00
Scott LaVarnway
81f45ff798 Merge "vpxdsp: [x86] add highbd_h_predictor functions" 2017-08-29 14:05:13 +00:00
Scott LaVarnway
30d9a1916c vpxdsp: [x86] add highbd_h_predictor functions
C vs SSE2 speed gains:
_4x4 : ~8.12x
_8x8 : ~9.71x
_16x16 : ~8.21x
_32x32 : ~5.0x

BUG=webm:1422

Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993
2017-08-28 17:31:18 -07:00
Jerome Jiang
7c10251f22 vp9: Speed 8: Enable skip_encode_sb
Neutral in borg tests.

Some clips show 3-4% speed gain on 2 threads on Pixel.

Change-Id: Ic959f34e44892a854551de6e9a3d9ec819ffed00
2017-08-28 17:05:48 -07:00
Peter Boström
df9ce12259 Re-enable disabled tests under TSan.
These tests point to an already-fixed bug, this should no longer have a
data race.

BUG=webm:1049

Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8
2017-08-28 16:24:38 -07:00
Jerome Jiang
64c55576b7 vp9: Remove resolution condition for using source_sad in speed 6.
Rev d147771 fixed the test failure. So remove the resolution condition
for using source_sad in speed 6.

BUG=webm:1452

Change-Id: I1efba97e1ef5bd4de5f886299f6fcb907187abcd
2017-08-28 12:49:54 -07:00
Marco Paniconi
255241c6d0 Merge "vp9: Speed 6 adapt_partition for live/vbr usage." 2017-08-25 22:00:08 +00:00
Marco Paniconi
43b9e785ba Merge "vp9: SVC: Modify mv search condition in speed features." 2017-08-25 21:46:35 +00:00
Marco
a0de2692fc vp9: Speed 6 adapt_partition for live/vbr usage.
Enable adapt_partition for vbr mode for speed 6.
This allows the usage of the pickmode-based partition
(used in speed 5), but only selectively for superblocks
with high source sad, otherwise the faster variance based
partition scheme is used.

For speed 6 on ytlive set: avgPSNR/SSIM metrics up by ~0.6%,
several clips up by ~1.5%. Small/negligible decrease in speed.

Change-Id: I12f3efef6b3e059391de330fdbe5a44c2587f1f8
2017-08-25 11:36:34 -07:00
Marco Paniconi
3e069846b9 Merge "Revert "quantize avx: copy 32x32 implementation"" 2017-08-25 18:20:31 +00:00
Marco
a74593b30c vp9: SVC: Modify mv search condition in speed features.
For SVC at speed >= 7: only use the improved mv search
on base spatial layer, if top layer resolution is above 640x360.

~2.3% speedup
Small/negligible loss in avgPSNR metrics on rtc set.

Change-Id: Iaef75a57ebf1c248931bc1aa28d20b7fecac1851
2017-08-25 10:12:38 -07:00
Marco Paniconi
8c42237bb2 Revert "quantize avx: copy 32x32 implementation"
This reverts commit f60d1dcd3d.

Reason for revert: <INSERT REASONING HERE>
Failures in AVX/VP9QuantizeTest in nightly tests.
Original change's description:
> quantize avx: copy 32x32 implementation
> 
> Ensure avx and ssse3 stay in sync by testing them against each other.
> 
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org

Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-08-25 16:56:08 +00:00
Shiyou Yin
ece1989fa2 Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi." 2017-08-25 06:44:02 +00:00
Marco Paniconi
34e48d6115 Merge "vp9: Adjust 16x16 splot threshold for variance partition" 2017-08-24 22:26:43 +00:00
Tom Finegan
ccae8da7c6 Make sure diff is present at configure time.
This avoids an endless build loop at vpx_version.h
creation time when diff is not present.

Change-Id: I16ae386dbdaf14f9a2b85e4c5d1aaa6c08f52a45
2017-08-24 12:11:48 -07:00
Johann Koenig
6c21650c0e Merge "quantize avx: copy 32x32 implementation" 2017-08-24 18:55:03 +00:00
Shiyou Yin
9e4647c7ab vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi.
Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174
2017-08-25 01:58:49 +08:00
Marco
d14777157e vp9: Adjust 16x16 splot threshold for variance partition
For speeds < 7, increase threshold that controls the split
of 16x16->8x8 blocks, for resolutions 720p and higher.

Minor change for speed 5 (since it uses reference partition scheme
which only uses variance partition as first step).
For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlvie set.
No change in speed.

Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b
2017-08-24 10:44:05 -07:00
Johann Koenig
258122fdc6 Merge "quantize test: skip block was removed" 2017-08-24 17:43:10 +00:00
Johann
f60d1dcd3d quantize avx: copy 32x32 implementation
Ensure avx and ssse3 stay in sync by testing them against each other.

Change-Id: I699f3b48785c83260825402d7826231f475f697c
2017-08-24 10:42:34 -07:00
Johann
1787e7dbe0 quantize ssse3: copy implementation to intrinsics
Still does not pass tests. Does match the previous assembly, although
saving the sign before multiplying is dubious.

Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a
2017-08-24 07:47:51 -07:00
Johann
92aafefa1e quantize test: skip block was removed
Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87
2017-08-24 07:21:42 -07:00
Johann Koenig
2dc0a5132d Merge "quantize test: set threshold for 32x32" 2017-08-24 14:04:29 +00:00
Shiyou Yin
d080c92524 Merge "vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi." 2017-08-24 00:55:11 +00:00
Marco Paniconi
30c261b1eb Merge "vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv." 2017-08-23 23:09:33 +00:00
Johann
e89344d61a quantize test: set threshold for 32x32
Change-Id: I77be617c7d7c64929dd51c6077322f4f8ad23897
2017-08-23 15:59:11 -07:00
Johann Koenig
f53b656207 Merge "quantize avx: copy implementation to intrinsics" 2017-08-23 21:14:13 +00:00
Marco
c9ff7b6637 vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv.
For SVC encoding:
average speedup ~1.5%, with small ~0.57 loss in avgPSNR metrics.

Change-Id: Icebce6f6ef4e819d7dfcf8db898c583167351de4
2017-08-23 13:08:27 -07:00
Scott LaVarnway
1aad50c092 Merge "vpx_dsp: get32x32var_avx2() cleanup" 2017-08-23 19:59:25 +00:00
Johann Koenig
dfafd10ef5 Merge "quantize neon: round dqcoeff towards zero" 2017-08-23 19:20:53 +00:00
Johann
7c27872164 quantize avx: copy implementation to intrinsics
Adds an early exit based on ptest. Slightly slower than ssse3 in the
full case because of the extra check, but potentially faster if lots of
rows can be skipped.

Very close in speed to the assembly.

Can run in 32 bit, unlike the assembly. Allows reworking the function
prototype to use structs.

Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
2017-08-23 09:19:16 -07:00
Johann
2a5aa98a35 quantize neon: round dqcoeff towards zero
Add 1 if negative to get dqcoeff to round towards zero.

10-15% faster than converting to positive before shifting.

Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d
2017-08-23 08:05:50 -07:00
Johann
e83d99d7b8 quantize fp: neon implementation
About 4x faster when values are below the dequant threshold and 10x
faster if everything needs to be calculated.

Both numbers would improve if the division for dqcoeff could be
simplified.

BUG=webm:1426

Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-23 08:01:30 -07:00
Shiyou Yin
59e065b6ed vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi.
Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2
2017-08-23 15:14:15 +08:00
Marco Paniconi
0207f17144 Merge "vp9: Condition lighting change detection on CBR mode." 2017-08-22 22:52:05 +00:00
Johann Koenig
103e4e50a8 Merge changes I53f8a160,I48f282bf
* changes:
  quantize ssse3: copy style from sse2
  quantize sse2: copy opts from ssse3
2017-08-22 22:27:56 +00:00
Marco
a31461c853 vp9: Condition lighting change detection on CBR mode.
This feature is used for the CBR RTC encoding mode
at speed >= 6. This change will exclude it for VBR mode.

For speed 6 live encoding (VBR):
avgPSNR/SSIM metrics on ytlive set up by ~1% (few clips up by 2/3%).
No change in speed.

Change-Id: I1a0dd94c334f7df309ab5a48d477d7e25355b798
2017-08-22 14:59:37 -07:00
Johann
b9c1dcc5fa quantize ssse3: copy style from sse2
Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598
2017-08-22 14:25:27 -07:00
Johann Koenig
7f2993f5e4 Merge "quantize: capture skip block early" 2017-08-22 20:03:02 +00:00
Johann
75752ab7c0 quantize sse2: copy opts from ssse3
Simplify eob calculations based on ssse3 implementation.

General clean up and re-scoping.

Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee
2017-08-22 13:01:44 -07:00
Johann Koenig
ab27b68693 Merge changes Icfb70687,I9a963e99,Ie8ac00ef,I1272917c
* changes:
  quantize: ignore skip_block in arm
  quantize: ignore skip_block in x86
  quantize fp: ignore skip_block in arm
  quantize fp: ignore skip_block in x86
2017-08-22 19:19:14 +00:00
Johann
7a178a5631 quantize: capture skip block early
This should probably be handled before vp9_regular_quantize_b_4x4 even
gets called.

Fixes an assert resulting from removing skip_block from the quantize
functions.

BUG=webm:1459

Change-Id: I7f52b53f959b4654b3d4517ebda31a678f4d0fde
2017-08-22 12:10:55 -07:00
James Zern
419ce36294 Merge "ppc: Add vpx_idct16x16_256_add_vsx" 2017-08-22 00:48:39 +00:00
Shiyou Yin
bff5aa9827 Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi." 2017-08-22 00:37:23 +00:00
Johann
2c56bb97f2 quantize: ignore skip_block in arm
Change-Id: Icfb70687476b2edb25d255793ba325b261d40584
2017-08-21 14:37:50 -07:00
Johann
c02fdd0258 quantize: ignore skip_block in x86
Change-Id: I9a963e99f08761f0c8d6a305619270b2f1c4edf8
2017-08-21 14:37:03 -07:00
Johann
b527b47312 quantize fp: ignore skip_block in arm
Change-Id: Ie8ac00efa826eead2a227726a1add816e04ff147
2017-08-21 14:34:48 -07:00
Johann
7b13d99b98 quantize fp: ignore skip_block in x86
Change-Id: I1272917c49cf6e6710e52c36535b2fc8c8dced78
2017-08-21 14:33:41 -07:00
Johann
661efeca97 quantize test: test _fp_ version of quantize
None of the x86 optimizations pass the tests.

Change-Id: Ic67f2ba1977b657e68f2a13b0711fc5fcbafd909
2017-08-21 12:29:41 -07:00
Johann
13eed991f9 Remove skip_block from quantize
This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.

Add assert() and comments regarding the usage of skip_block.

Removing the parameter is a fairly involved process so leave it be for
the moment.

Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-08-21 09:49:04 -07:00
Scott LaVarnway
eab3f5e0cc vpx_dsp: get32x32var_avx2() cleanup
renamed to get32x16var_avx2()

BUG=webm:1404

Change-Id: Icb8f3986c9c9c646e13a69430db7235fc7e1a036
2017-08-18 13:44:09 -07:00
Scott LaVarnway
2c5478e383 Merge "vpx_dsp: vpx_get16x16var_avx2() cleanup" 2017-08-18 20:30:59 +00:00
Scott LaVarnway
2f7497f341 vpx_dsp: vpx_get16x16var_avx2() cleanup
BUG=webm:1404

Change-Id: I88aceb07f4db4870a06eee21d87296974ce3221a
2017-08-18 12:23:49 -07:00
Johann Koenig
1426f04e91 Merge "quantize: normalize intermediate types" 2017-08-18 16:00:28 +00:00
Shiyou Yin
7d82e57f5b vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi.
Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e
2017-08-18 09:06:49 +08:00
James Zern
bb15fd51be highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo
135 -> 34

fixes unused function warnings for highbd_idct32_34_4x32_quarter_[12]

Change-Id: I4f50ff6ea514200af93dd59ff94c7f9717409682
2017-08-17 15:37:38 -07:00
Johann
7f602d6114 quantize: normalize intermediate types
Despite abs_coeff being a positive value, all the other implementations
treat it as signed which simplifies restoring the sign.

HBD builds cast qcoeff to avoid a visual studio warning. Match
vp9_quantize.c style of casting the entire expression.

Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5
2017-08-17 12:34:28 -07:00
James Zern
e038d1610e inv_txfm_sse2.h: correct idct*/iadst* prototypes
fixes mismatch between prototypes and definitions

Change-Id: Ib5e7dfcce244dbb8401815be2cdd183d96792652
2017-08-16 23:06:09 -07:00
Paul Wilkins
f64e14047d Merge "Prevent parameters that can cause invalid ARF groups." 2017-08-16 18:25:57 +00:00
Paul Wilkins
372336d1e5 Merge "Fix corrupt arf groups due to low "lag_in_frames"" 2017-08-16 18:25:29 +00:00
Linfeng Zhang
f95686895b Merge changes I08b562b6,Ia275940a,I51106e90
* changes:
  Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
  Update highbd idct x86 optimizations.
  Update 32x32 idct sse2 and ssse3 optimizations.
2017-08-16 16:36:37 +00:00
paulwilkins
b814e2d898 Prevent parameters that can cause invalid ARF groups.
Having a very low "lag_in_frames" value could cause the encoder to create
incorrect / corrupt ARF groups including displayed frames that update the
ARF buffer and false overlay frames that are coded at low rate but are not
actually overlays of a real ARF frame.

This is linked to a reported unit test "slow down" where the chosen parameters
(lag of 3 frames) gave rise to such "broken" ARF group(s).

See also BUG=webm:1454

Change-Id: If52d0236243ed5552537d1ea9ed3fed8c867232c
2017-08-16 14:33:59 +01:00
paulwilkins
48110d0f79 Fix corrupt arf groups due to low "lag_in_frames"
Having a very small value for "lag_in_frames" can result in
corrupt arf groups including displayed frames that update
the arf buffer and fake overlay frames that are not in fact
overlays of real arfs but are nevertheless starved of bits.

Leaving lag_in_frames at the default of 25 for these 5 frame two
pass VBR tests should now give rise to a valid ARF coding pattern
as follows:-  K(ey), A(rf), N(ormal), N, N, O(verlay).

This change is part of a response to BUG=webm:1454 where broken
arf groups interacted badly with a change that corrects for large rate
misses. However, it may still in some cases increase encode time by
virtue of the fact that the unit test now codes a correct coding pattern
with "hidden" ARF frames.

Change-Id: Ifd0246a4c1d0be247247c754024d7a4ed5f66a6b
2017-08-16 14:07:24 +01:00
Paul Wilkins
0472382dbe Merge "Fix for encoder slowdown (for speeds >= 3)" 2017-08-16 13:01:38 +00:00
paulwilkins
e15be3025b Fix for encoder slowdown (for speeds >= 3)
Some clips in nightly unit test exhibiting significant encoder slowdown which
appears to bisect to Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a.

The above change allowed for emergency iterations of the recode loop and
adjustment of the Q range if there is a large rate miss.

This patch disables the above adaptation for cases of cpu_speed >= 3 or more
specifically where cpi->sf.recode_loop >= ALLOW_RECODE_KFARFGF.

For speeds >= 3 the code does not currently run a dummy bit pack operation
inside the recode loop. Without this dummy pack operation there is no up to
date estimate of the current frame's size to use as a basis for assessing the
requirement for a recode. In practice it was using the previous frames size (or 0
for the first frame) which could cause odd behavior.

If we require the emergency rate correction added in  Change-Id: I6923.. for
the higher speed settings it will be necessary to enable the dummy pack
which will in turn hurt encode speed.

BUG=webm:1454

Change-Id: I4fb3c6062ca9508325a6f31582f8e80f1a9b126f
2017-08-16 10:56:52 +01:00
Jerome Jiang
6b9c691daf Merge "Clean up writing YUV files for debug purpose." 2017-08-15 18:28:54 +00:00
Marco Paniconi
14437d0fa6 Merge "vp9: Denoiser fix: use correct bsize for skin detection." 2017-08-15 17:53:08 +00:00
Jerome Jiang
a153080b55 Clean up writing YUV files for debug purpose.
Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.

To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure
--enable-vp9-temporal-denoising

For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'

Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528
2017-08-15 10:44:03 -07:00
Johann Koenig
c59d1a4dc7 Merge changes I1f1edeaa,I89313cac
* changes:
  quantize: silence unsigned overflow warning
  quantize test: quiet overflow warning
2017-08-15 17:37:59 +00:00
Marco
e9ccc6fe79 vp9: Denoiser fix: use correct bsize for skin detection.
Change-Id: I9d201fa3a4b00ebd147b57ed519fab8d59b0a802
2017-08-15 10:02:19 -07:00
Johann
77ed4414d6 quantize: silence unsigned overflow warning
The result of the xor operation is unsigned. If coeff was negative,
this results in an unsigned value - INT_MIN.

Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0
2017-08-15 09:48:24 -07:00
Scott LaVarnway
7e8357d664 Merge "vp9: strip temporal filter code" 2017-08-15 15:35:33 +00:00
Johann
08cb7b5c68 quantize test: quiet overflow warning
Promote the result of RandRange to signed

Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128
2017-08-15 08:28:09 -07:00
Paul Wilkins
ca393c9726 Merge "Patch relating to Issue 1456." 2017-08-15 14:57:56 +00:00
Paul Wilkins
5009302bce Merge "Enable emergency fast Q adaptation for VBR test case." 2017-08-15 14:57:22 +00:00
Linfeng Zhang
d72e20b123 Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
BUG=webm:1412

Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f
2017-08-14 17:05:22 -07:00
Linfeng Zhang
69775d2f40 Update highbd idct x86 optimizations.
BUG=webm:1412

Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069
2017-08-14 16:59:50 -07:00
Linfeng Zhang
3f05a70c41 Update 32x32 idct sse2 and ssse3 optimizations.
Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70
2017-08-14 16:59:31 -07:00
Scott LaVarnway
fa85cf131c vp9: strip temporal filter code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52
2017-08-14 14:27:53 -07:00
Johann Koenig
ff184e482a Merge changes I4b4beab1,I02f74dec
* changes:
  quantize test: check skip_block
  quantize test: use negative input
2017-08-14 20:52:52 +00:00
Johann Koenig
45b39750d6 Merge "temporal filter test: adjust inputs and runtime" 2017-08-14 20:46:22 +00:00
Jerome Jiang
2caff16151 vp9 svc: Fix the stats output when sl = 1.
Actual frame size and bitrate is all 0 when using SVC sample encoder
with sl = 1 because the stats are set in parse_superframe_index which
will not caculate properly when sl = 1 since there is no superframe.

Use pkt->data.frame.sz instead when sl = 1.

Change-Id: I93f5e98a4c779e32b007e1564ba5396af9e34ad6
2017-08-14 12:00:00 -07:00
Scott LaVarnway
1ab60466ec Merge "vp9: strip mb graph code" 2017-08-14 18:01:44 +00:00
Johann
c06d6649c5 temporal filter test: adjust inputs and runtime
Use input with a narrow range because the filter only applies when the
frames are similar.

Run CompareReferenceRandom more times. Especially before narrowing the
input range, the filter frequently did not apply.

Change-Id: Ie249bedf6d0d33dfa5884611cb1835788e418b38
2017-08-14 17:24:11 +00:00
James Zern
746c0eab3b disable SSSE3/VP9QuantizeTest* in hbd builds
this test fails with the configuration similar to the assembly prior to:
d52cb5972 quantize: copy ssse3 optimizations to intrinsics

BUG=webm:1458

Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf
2017-08-14 09:31:14 -07:00
Scott LaVarnway
e702b68b6c vp9: strip mb graph code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: I4b1b8e9a456830ba1b1bd3a8882e038d37ee7903
2017-08-11 12:59:40 -07:00
Johann
e022ce84ac Rename vp8 quantize file
BUG=webm:1457

Change-Id: Ie8fae018ad8417724fde087055b90228850d631d
2017-08-11 10:44:36 -07:00
Jerome Jiang
d48be6ad73 Merge "vp9 SVC: Fix the denoiser frame buffer management." 2017-08-11 00:54:35 +00:00
Jerome Jiang
0f8ebddec4 vp9 SVC: Fix the denoiser frame buffer management.
Change the denoiser frame buffer management for SVC to more generally
handle the layer patterns in SVC (where last is not always refreshed).

This change is only for SVC with denoising and is bitexact.

Change-Id: Ic2b146a924cdf6e7114609158afa3d4880fe3fae
2017-08-10 16:56:46 -07:00
Linfeng Zhang
15193ce51f Merge "Clean highbd idct x86 code with inline functions" 2017-08-10 20:25:18 +00:00
Johann Koenig
9bb8ce5efb Merge "neon: vpx_quantize_b_32x32" 2017-08-10 15:42:49 +00:00
Johann Koenig
0b393ae505 Merge "quantize: copy ssse3 optimizations to intrinsics" 2017-08-10 15:42:20 +00:00
paulwilkins
db8fa86a6c Patch relating to Issue 1456.
Testing of 4k videos encoded with a fixed arbitrary chunking interval
uncovered a bug where by if a chunk ends 1 frame before a real scene cut,
the next chunk may be encoded with two consecutive key frames at the start
with the first being assigned 0 bits.

This fix insures that where there is a key frame group of length 1 it is
at least assigned 1 frames worth of bits not 0.

See also patch Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
which by virtue of allowing fast adaptation  of Q made this bug more visible.

BUG=webm:1456

Change-Id: Ic9e016cb66d489b829412052273238975dc6f6ab
2017-08-09 16:34:43 +01:00
Linfeng Zhang
39da7fb786 Clean highbd idct x86 code with inline functions
Created inline functions highbd_butterfly_cospi16_sse2()
and highbd_butterfly_cospi16_sse4_1()

BUG=webm:1412

Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a
2017-08-08 17:53:28 -07:00
Marco Paniconi
68805583e9 Merge "vp9: Partition logic adjustment for speed 6 feature." 2017-08-08 23:08:10 +00:00
Johann
357adb68b2 quantize test: check skip_block
Not all sizes were tested previously. Only 4x4 and 32x32

Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315
2017-08-08 14:21:58 -07:00
Johann
1092cc7f1a quantize test: use negative input
coeff contains signed values.

Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23
2017-08-08 14:19:56 -07:00
Johann
93166c5e51 neon: vpx_quantize_b_32x32
With skip block the neon is about twice as fast as C.

The neon has no shortcut for coeff < zbin so it always takes the
same amount of time. Even if the C can take the shortcut, it is over
twice as fast in neon. If it can't, that gap increases to over 10x.

BUG=webm:1426

Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6
2017-08-08 14:05:18 -07:00
Johann
d52cb59729 quantize: copy ssse3 optimizations to intrinsics
Fairly minor differences from sse2. pabsw and psignw are the big gains.
Also re-uses some values in eob calculation to avoid an extra pcmp.

Fixes test failures in HBD and OS X builds.

Allows using it in 32bit builds, where it is about 40% faster than sse2.

Substantially faster than the assembly for skip_block. 10-20% faster the
rest of the time.

Change-Id: If783bb3567e561e47667e10133b9c84414a334e2
2017-08-08 12:22:14 -07:00
Marco
427de67e63 vp9: Partition logic adjustment for speed 6 feature.
When adapt_partition_source_sad is enabled (currently only at
speed 6 for resoln <= 360p): use lower subsize (8x8 instead of 16x16)
for nonrd_select_partition on 32X32 blocks.

And force avoiding rectangular partition checks in
nonrd_pick_partition for speed >= 6.

Small increase ~0.5 in metrics for speed 6 on rtc_derf,
no change in speed.

Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae
2017-08-08 11:31:27 -07:00
Linfeng Zhang
853165ba39 Update 32x32 idct sse2 funcs, add partial case 135
Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a
2017-08-07 17:37:02 -07:00
Linfeng Zhang
d670678f26 Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx()
in idct x86 code

Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083
2017-08-04 15:33:37 -07:00
Linfeng Zhang
fa829e0e5a Replace multiplication_and_add() with butterfly() in idct x86 code
Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e
2017-08-04 15:33:34 -07:00
Linfeng Zhang
c9fb719ee1 Update butterfly() in idct x86 optimizations.
Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701
2017-08-04 15:33:28 -07:00
Linfeng Zhang
7f20c3ac44 Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1
BUG=webm:1412

Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca
2017-08-04 15:31:17 -07:00
Linfeng Zhang
22b6dc9fdf Update for loop increment of idct x86 functions
Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4
2017-08-04 15:29:19 -07:00
Linfeng Zhang
0c61331244 Update high bitdepth 16x16 idct x86 code
Prepare for high bitdepth 16x16 idct sse4.1 code.
Just functions moving and renaming.

BUG=webm:1412

Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a
2017-08-04 15:12:33 -07:00
Johann Koenig
cbb83ba4aa Merge "quantize test: consolidate sizes" 2017-08-04 20:34:50 +00:00
Johann
9578a84205 quantize test: consolidate sizes
Pass a max txfm size parameter and combine the base quantize
test with the 32x32 test.

Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b
2017-08-04 12:45:32 -07:00
Scott LaVarnway
c42517568d vpx_dsp: merge avx2 variance files
BUG=webm:1404

Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4
2017-08-04 07:49:30 -07:00
Kaustubh Raste
39e8b8dac6 Fix mips dspr2 6 tap filter clobber list
Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646
2017-08-04 10:56:56 +05:30
Linfeng Zhang
e921c7ba8d Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function" 2017-08-04 01:16:35 +00:00
Scott LaVarnway
f6c6f37e0c Merge "vpx_dsp: Use correct check for halfpel in" 2017-08-03 23:17:09 +00:00
Linfeng Zhang
563d58ab84 Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function
BUG=webm:1412

Change-Id: I945f0fb6807b8948747243794dc7352b959221f7
2017-08-03 13:59:47 -07:00
Linfeng Zhang
6624f20785 Merge changes I76727df0,I66297d78,I1d000c6b
* changes:
  Extract inlined 16x16 idct sse2 code into header file
  Add transpose_32bit_8x4() sse2 optimization
  Update x86 idct optimization
2017-08-03 20:51:02 +00:00
Scott LaVarnway
8334a48d3a vpx_dsp: Use correct check for halfpel in
vpx_sub_pixel_variance32xh_avx2() and
vpx_sub_pixel_avg_variance32xh_avx2

see:
17fae3a Change to use correct check for halfpel

Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c
2017-08-03 06:57:40 -07:00
paulwilkins
76d77aa013 Enable emergency fast Q adaptation for VBR test case.
Enable fast adaptation of Q when there is a large overshoot
for the  #ifdef AGGRESSIVE_VBR test case.

AGGRESSIVE_VBR  is not currently enabled by default.

Change-Id: I7240bb6589795964b6b0b66df4468e4f21504e0f
2017-08-03 12:06:07 +01:00
Yunqing Wang
6843e7c7f3 Merge "Force the bit exactness in the first pass" 2017-08-03 00:03:10 +00:00
Linfeng Zhang
15a47db730 Extract inlined 16x16 idct sse2 code into header file
Will be called by high bitdepth functions.

Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c
2017-08-02 16:17:43 -07:00
Linfeng Zhang
8c0ab7607e Add transpose_32bit_8x4() sse2 optimization
Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955
2017-08-02 16:15:58 -07:00
Yunqing Wang
bfd0f41f9b Force the bit exactness in the first pass
Originally, for the purpose of keeping a fast first pass, the first-pass
stats between row_mt_mode = 0 and row_mt_mode = 1 are not bit exact, but
that difference is very small that doesn't cause a mismatch between the
final bitstreams. However, if the encoder changes, this minor difference
may cause a mismatch. Thus, this patch always forces the first pass to
be bit exact.

BUG=webm:1453

Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8
2017-08-02 15:58:39 -07:00
Johann Koenig
787970a625 Merge "quantize test: add speed comparison" 2017-08-02 21:16:35 +00:00
Marco
b9577e07fc vp8: Drop due to overshoot for non-screen content.
For 1 pass CBR mode:
Apply the logic for dropping (and re-adjusting rate control)
due to large overshoot to the case of non-screen content when
drop_frames_allowed is enabled.

For the non-screen content case: add additional condition that
rate correction factor is close to minimum state, and flag to
constrain the frequency of the dropping.

Also handle the case of temporal layers and multi-res encoding.
Add some flags/counters to the layer context for temporal layers.
For multi-res: drop due to overshoot is checked on lowest stream,
and if overshoot is detected we force drops on all upper streams
for that frame.

This feature is to avoid large frame sizes on big content
changes following low content period.

No change in behavior for screen_content_mode = 2.

Change-Id: I797ab236cbbf3b15cad439e9a227fbebced632e6
2017-08-02 13:12:48 -07:00
Scott LaVarnway
698e56f26c Merge "vpxdsp: variance_impl_avx2.c cleanup" 2017-08-02 19:08:10 +00:00
Johann
1059b5cc52 quantize test: add speed comparison
Test some possible scenarios.

Change-Id: I1a612e7153b31756be66390ceea55877856d5a33
2017-08-02 09:33:35 -07:00
Scott LaVarnway
632fe8286a vpxdsp: variance_impl_avx2.c cleanup
BUG=webm:1404

Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02
2017-08-02 05:57:39 -07:00
shiyou yin
0e87b16022 Merge "loongson mmi configuration patch." 2017-08-02 01:08:43 +00:00
Linfeng Zhang
6738ad7aaf Update x86 idct optimization
Move constant coefficients preparation into inline function.

Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1
2017-08-01 14:40:12 -07:00
Linfeng Zhang
c0490b52b1 Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2" 2017-08-01 21:39:39 +00:00
Johann Koenig
847394fe77 Merge "neon: vpx_quantize_b" 2017-08-01 16:44:31 +00:00
Paul Wilkins
3be14200fc Merge "Respond more rapidly to excessive local overshoot." 2017-08-01 08:58:36 +00:00
Marco Paniconi
c22b17dcef Merge "vp9: Adjust noise estimation for 360p." 2017-08-01 02:48:13 +00:00
Marco
5d6c1c2d8f vp9: Adjust noise estimation for 360p.
Change-Id: Ib76875232491b14f7114061e8e913e87004427a0
2017-07-31 17:12:58 -07:00
Linfeng Zhang
bf14d468c1 Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
This replaces commit aa1c4cd, which has a bug and was reverted in
commit 3c73e58.

The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d().

Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8
2017-07-31 16:36:13 -07:00
James Zern
78e2da3e42 Merge "highbd_inv_txfm_sse4: make << of neg. val a multiply" 2017-07-31 22:43:41 +00:00
Johann
2d6b5df657 neon: vpx_quantize_b
With skip block or coeff < zbin it is about twice as fast as C.

If most coeff values are > zbin it is about 10-15x as fast as C.

BUG=webm:1426

Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7
2017-07-31 10:38:46 -07:00
YinShiyou
2758de5cb2 loongson mmi configuration patch.
enable loongson mmi optimization: ../configure --enable-mmi

Change-Id: I7792c3adeac1d5b573917d7857bba6c1cc05fea5
2017-07-31 17:29:36 +00:00
Marco Paniconi
ebb023deb6 Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""" 2017-07-31 14:58:15 +00:00
Marco
999bd6ea84 vp9: Fix denoising condition when pickmode partition is used.
When the superblock partition is based on the nonrd-pickmode,
we need to avoid the denoising. Current condition was based on
the speed level. This change is to make the condition at the
superblock level, as the switch in partitioning may be done at
sb level based on source_sad (e.g., in speed 6).

Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
2017-07-30 23:16:38 -07:00
Jerome Jiang
f027908ad0 Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""
This reverts commit c9266b8547.

Disable source_sad when resolution > 1080P. The test should
pass now.

BUG=webm:1452

Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082
2017-07-30 19:49:31 -07:00
James Zern
78155b7ed5 highbd_inv_txfm_sse4: make << of neg. val a multiply
left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.

Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254
2017-07-30 12:48:28 -07:00
James Zern
facb124941 Merge "Revert "vp9: Speed feature to adapt partition based on source_sad."" 2017-07-30 03:26:10 +00:00
James Zern
c9266b8547 Revert "vp9: Speed feature to adapt partition based on source_sad."
This reverts commit 064fc570ff.

This causes an assertion failure in vp9_mcomp.c when running
gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
`mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2))
- 1)'

Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
2017-07-29 11:36:58 -07:00
James Zern
d35b627340 Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"
This reverts commit aa1c4cd140.

This fails the following tests with extreme input coefficients:
SSE2/InvTrans8x8DCT.CompareReference/0
SSE2/InvTrans8x8DCT.CompareReference/2

previously the optimized path was skipped in this range

Change-Id: I9af015a46eba96208834a219fafd651d37556a80
2017-07-29 11:12:27 -07:00
Marco Paniconi
5d0bef4763 Merge "vp9: Adjust logic in source sad for screen content." 2017-07-29 01:46:58 +00:00
Marco Paniconi
e48dfcead1 Merge "vp9: Speed feature to adapt partition based on source_sad." 2017-07-29 01:45:19 +00:00
Jerome Jiang
ac211fe23e vp9: Adjust logic in source sad for screen content.
Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3
2017-07-28 17:25:41 -07:00
Marco
064fc570ff vp9: Speed feature to adapt partition based on source_sad.
Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.

Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.

Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
2017-07-29 00:20:26 +00:00
Urvang Joshi
7105e66d19 Remove the DP version of vp9_optimize_b().
The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/

And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/

Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97
2017-07-28 23:12:57 +00:00
Linfeng Zhang
75653b7032 Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34
* changes:
  Update vpx_idct16x16_10_add_sse2()
  Add vpx_idct16x16_38_add_sse2()
  Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
  Refactor highbd idct 4x4 and 8x8 x86 functions
2017-07-28 22:43:00 +00:00
James Zern
3c73e587d1 Revert "quantize ssse3: declare all variables"
This reverts commit 03f5e300d6.

This causes test failures under OSX:
SSSE3/VP9QuantizeTest.EOBCheck/0
SSSE3/VP9QuantizeTest.OperationCheck/0

Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b
2017-07-28 01:22:16 -07:00
Linfeng Zhang
5232e35bc2 Update vpx_idct16x16_10_add_sse2()
Change-Id: Ia0e20f5fa47382af5785221eebb05212b40bd35c
2017-07-27 18:03:25 -07:00
Linfeng Zhang
7f4acf8700 Add vpx_idct16x16_38_add_sse2()
Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898
2017-07-27 18:02:43 -07:00
Linfeng Zhang
aa1c4cd140 Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
BUG=webm:1412

Change-Id: I35df041b757d42278ac7a5cdbd909e8ffcee1455
2017-07-27 18:02:36 -07:00
Linfeng Zhang
9c43d81bc2 Refactor highbd idct 4x4 and 8x8 x86 functions
BUG=webm:1412

Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec
2017-07-27 18:01:03 -07:00
Johann Koenig
a83e1f1d53 Merge "quantize ssse3: declare all variables" 2017-07-27 21:18:35 +00:00
Jerome Jiang
905b8ec27f Merge "vp8: Remove isolated skin & non skin blocks." 2017-07-27 20:24:08 +00:00
Jerome Jiang
56d95b77f5 vp8: Remove isolated skin & non skin blocks.
Neutral on RTC metrics and speed on Pixel.

Change-Id: I26b907483fe133e6e4c1009d147631f0d0e0f2fb
2017-07-26 14:44:36 -07:00
James Zern
1c666465af inv_txfm_{sse2,ssse3}: clear conversion warnings
visual studio reports tran_high_t (int64) -> short in calls to
_mm_set1_epi16

Change-Id: Icb8d1baee77ad3d45edb1477a443d3e648f0b745
2017-07-25 20:13:49 -07:00
James Zern
62682ac8ad highbd_idct*_sse*.c: clear conversion warnings
visual studio reports tran_high_t (int64) -> int in calls to
_mm_setr_epi32

Change-Id: Ic2247c8e3800991202151790d78bd94c4f4aed05
2017-07-25 20:11:09 -07:00
James Zern
85736e616e vpx_variance16x16_sse2: correct cast order
allow the right shift to operate on 64-bits, this matches the rest of
the implementations

previously:
b0f1ae147 vpx_get16x16var_avx2: correct cast order

Change-Id: I632ee5e418f3f9b30e79ecd05588eb172b0783aa
2017-07-25 16:45:40 -07:00
Alexandra Hájková
666c543f7b ppc: Add vpx_idct16x16_256_add_vsx
Change-Id: Ibc3f7965423fd91179f8d8e77c7ae3e6d7f80572
2017-07-25 12:34:15 +00:00
James Zern
b0f1ae1475 vpx_get16x16var_avx2: correct cast order
allow the right shift to operate on 64-bits, this matches the rest of
the implementations

missed in:
6acd061aa variance_avx2: sync variance functions with c-code

Change-Id: Icae436b881251ccb9f9ed64fcbf8d358c58a4617
2017-07-24 16:29:44 -07:00
James Zern
8836e46ffd set_var_thresh_from_histogram: prevent negative variance
For 8-bit the subtrahend is small enough to fit into uint32_t.

For 10/12-bit apply:
63a37d16f Prevent negative variance

previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.

Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
2017-07-22 13:27:32 -07:00
Marco
8c7a60e04d vp8: Fix compile warning in vp8_multi_resolution_encoder.c
Change-Id: I49c960179dfc1902aa5e5c99915789878c06bc3d
2017-07-20 14:19:43 -07:00
Johann Koenig
e8bd534c42 Merge "quantize test: promote RandRange() result to signed" 2017-07-20 19:46:05 +00:00
Johann Koenig
0c30b75f40 Merge "quantize test: lowbd functions do not pass in highbd" 2017-07-20 19:45:59 +00:00
Jerome Jiang
494188505b Merge "vp9: Removed unused skin detection function." 2017-07-20 16:58:01 +00:00
Johann
af08fbb444 quantize test: promote RandRange() result to signed
Avoid unsigned overflow warning:
unsigned integer overflow: 19974 - 32703 cannot be represented in type
'unsigned int'

Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12
2017-07-20 08:17:48 -07:00
Johann
c782f27ead quantize test: lowbd functions do not pass in highbd
qcoeff output looks OK but dqcoeff is no good.

BUG=webm:1448

Change-Id: I07211db8a8b74f1f45fdd059852e2de0e5ee18fd
2017-07-20 08:17:48 -07:00
Johann Koenig
4702bb26be Merge "quantize test: eob is output" 2017-07-20 15:17:26 +00:00
Johann Koenig
e1809501d0 Merge "Earmark extra space for VSX." 2017-07-19 21:35:57 +00:00
Jerome Jiang
9dd992b6f0 Merge "Roll libwebm: Fix android build failure with NDK r15b." 2017-07-19 21:30:21 +00:00
Johann
bde2e4aa36 quantize test: eob is output
eob values are generated by the function.

Change-Id: I8ce92100e83022bff99888a5a7e6ef378c49fda3
2017-07-19 14:17:19 -07:00
Han Shen
b72d3e8a25 Earmark extra space for VSX.
Backend specific optimization for PPC VSX reads 16 bytes, whereas arm neon /
sse2 only reads <= 8 bytes. Although the extra bytes read are actually never
used, this is not a warrant for groping around.  Fixed by allocating more when
building for VSX. This is reported by asan.

Also note - PPC does have assembly that loads 64-bit content from memory - lxsdx
loads one 64-bit doubleword (whereas lxvd2x loads two 64-bit doubleword) from
memory. However, we only have "vec_vsx_ld" builtins that mapped to lxvd2x, no
builtins to lxsdx. The only way to access lxsdx is through inline assembly,
which does not fit well in the origin paradigm.

Refer:
  vsx:
    vpx_tm_predictor_4x4_vsx @ third_party/libvpx/git_root/vpx_dsp/ppc/intrapred_vsx.c
  neon:
    vpx_tm_predictor_4x4_neon @ third_party/libvpx/git_root/vpx_dsp/arm/intrapred_neon_asm.asm
  sse2:
    tm_predictor_4x4 @ third_party/libvpx/git_root/vpx_dsp/x86/intrapred_sse2.asm

BUG=b/63112600

Tested:
  asan tests passed.

Change-Id: I5f74b56e35c05b67851de8b5530aece213f2ce9d
2017-07-19 13:59:32 -07:00
Johann Koenig
89a116f4cb Merge "variance: call C comp_avg_pred" 2017-07-19 20:34:13 +00:00
Jerome Jiang
8ad9338e2e Roll libwebm: Fix android build failure with NDK r15b.
BUG=webm:1447

Change-Id: I8defe45cb94eb9c209ba72ce446786f24c14c0b8
2017-07-18 16:52:46 -07:00
Jerome Jiang
4526644615 vp9: Removed unused skin detection function.
Change-Id: I6702b7b11aa4ac9aac5fd54deef4377cdcb29c64
2017-07-18 14:52:04 -07:00
Jerome Jiang
59e461db1f Merge "vp9: Allocate alt-ref in denoiser for SVC." 2017-07-18 21:30:04 +00:00
Jerome Jiang
babef23a5f Merge "vp9: Remove isolated skin & non-skin blocks." 2017-07-18 20:48:32 +00:00
Johann Koenig
56d3f1573a Merge changes I62c2e313,Ibd7a0337,I94e1d886
* changes:
  quantize test: test sse2 and avx optimizations
  quantize test: extend arrays
  quantize test: restrict and correct input
2017-07-18 20:42:39 +00:00
Johann
4b9a848bb3 variance: call C comp_avg_pred
Keep optimized code out of the reference implementation. This matches
the style of the other sub calls.

Change-Id: I3da6acd4f2c647b029c420e22ac9410a18259689
2017-07-18 20:22:53 +00:00
Jerome Jiang
fd216268ad vp9: Allocate alt-ref in denoiser for SVC.
When SVC is used, allocate alt-ref in denoiser.

Change-Id: I1b17221b55b9444cd23b97d481b54ff8d296d857
2017-07-18 13:22:47 -07:00
Johann
03f5e300d6 quantize ssse3: declare all variables
Copy missing line from avx implementation.

Change-Id: I9755c5b4d4034867de6fa9f741c24bf49dce3a27
2017-07-18 12:32:57 -07:00
Johann
101981b736 quantize test: test sse2 and avx optimizations
ssse3 does not pass either of the tests.

avx 32x32 does not pass.

Change-Id: I62c2e31336fd2327327afaa0da896ad79a3def44
2017-07-18 12:08:16 -07:00
Jerome Jiang
adbfc4308a vp9: Remove isolated skin & non-skin blocks.
0.007% regression on rtc and 0.004% gain on rtc_derf.
1 thread on QVGA,VGA and HD has ~0.2% speed regression while 2 threads has
~0.2% speed gain on Google Pixel.

Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc
2017-07-18 11:29:14 -07:00
Johann
c7ebe82253 quantize test: extend arrays
Officially the quant structures are 8 elements, with one dc element and
7 repeated ac elements. The low bit depth optimizations take advantage
of this to fill the xmm registers. The high bit depth version manually
duplicates the values.

If all the optimizations were unified, the structure sizes could be
greatly reduced.

Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae
2017-07-18 09:55:47 -07:00
Johann
cb61ba02f4 quantize test: restrict and correct input
Use only valid values for quantize inputs. These were determined by
looping over vp9_init_quantizer and looking for max and min values.

This allows extending the test to the low bit depth functions which were
not designed to handle all possible inputs but only valid inputs.

Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e
2017-07-18 09:40:45 -07:00
Marco
817f68cdcf vp9: Disable usage of sb_use_mv_part for SVC.
To fix valgrind issueis with SVC tests.
SVC encoding uses prune_evenmore which is causing uinit value.

Will re-enable later when issue is resolved.

Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44
2017-07-18 09:28:56 -07:00
Marco
ad56371343 vp9: Fix to setting content_state for real-time mode.
When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.

Small/negligible change in RTC metrics and speed.

Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
2017-07-17 16:21:25 -07:00
Marco
0c9e2f4c15 vp9: Reuse motion from choose_partitioning in NEWMV search.
When int_pro_motion_estimation is done for superblock in
choose_partitioning, use it to avoid the full_pixel_search
for NEWMV mode, if bsize is >= 32X32.

For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.

Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
2017-07-17 13:15:48 -07:00
James Zern
9223b947ca Merge "fix 'make exampletest' w/CONFIG_REALTIME_ONLY" 2017-07-15 18:37:10 +00:00
Jerome Jiang
682135fa60 vp9: Compute skin only for blocks eligible for noise estimation.
Change-Id: Iddcb83a5968db57cfd312c5bc44b2a226a2a3264
2017-07-14 15:14:30 -07:00
Marco
666e394d41 vp9: Adjust minmax threshold for variance partitioning.
Only affects speed 7. Improvement on high motion clips.

Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687
2017-07-13 21:19:37 -07:00
Johann
e3fa4ae8e3 quantize test: use Buffer
Although the low bitdepth functions are identical (excepting the need
for larger intermediate values) they do not pass these tests. This
improves the error output to aid debugging.

Simplify buffer usage with Buffer and removing unnecessarily aligned
variables.

eob is a single element and never written using aligned instructions.

BUG=webm:1426

Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35
2017-07-13 15:54:48 -07:00
James Zern
960466939d fix 'make exampletest' w/CONFIG_REALTIME_ONLY
for tests that aren't explicitly testing 2-pass behavior use --passes=1
with this configuration

Change-Id: I6a1520ecc65d0f626486604310af29dacb9f197f
2017-07-13 10:47:20 -07:00
James Zern
b578d59623 Merge "remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY" 2017-07-12 23:30:04 +00:00
Johann Koenig
e0d79bc7b5 Merge "sad4d neon: 64x[32,64]" 2017-07-12 20:15:00 +00:00
Marco Paniconi
f6586b8bf8 Merge "vp9: Fix to SVC and denoising for fixed pattern case." 2017-07-12 19:13:05 +00:00
Johann Koenig
3158752980 Merge changes Ibf5e61dc,I44b48512,I7de2500c,I5081b5ce
* changes:
  sad4d neon: 32x[16,32,64]
  sad4d neon: 16x[8,16,32]
  sad4d neon: 8x[4,8,16]
  sad4d neon: 4x4, 4x8
2017-07-12 15:01:30 +00:00
Johann
e381753926 sad4d neon: 64x[32,64]
Rewrite 64x64.

BUG=webm:1425

Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf
2017-07-12 13:26:39 +00:00
Johann
e1bde306c8 sad4d neon: 32x[16,32,64]
Rewrite 32x32. Use half the accumulator registers.

BUG=webm:1425

Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13
2017-07-12 13:25:18 +00:00
Johann
807ce8fb1e sad4d neon: 16x[8,16,32]
Rewrite 16x16. Use half the accumulator registers.

BUG=webm:1425

Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2
2017-07-12 13:25:11 +00:00
Johann
8152b0904d sad4d neon: 8x[4,8,16]
BUG=webm:1425

Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4
2017-07-12 13:25:03 +00:00
Johann
dd4347e9ec sad4d neon: 4x4, 4x8
BUG=webm:1425

Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468
2017-07-12 03:38:03 +00:00
Urvang Joshi
1dee320446 Merge "Remove the token state array from greedy optimize_b." 2017-07-12 00:08:56 +00:00
James Zern
df18412f32 remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY
BUG=webm:1446

Change-Id: I6e0ea9342c715d354c641109737172afa649b85b
2017-07-11 13:10:16 -07:00
Urvang Joshi
5322a31b18 Remove the token state array from greedy optimize_b.
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.

Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501

Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
2017-07-11 13:05:29 -07:00
James Bankoski
7d5afa227a Merge "Reintroduce fix for max qindex calculation of a gf interval" 2017-07-11 19:47:16 +00:00
Jerome Jiang
1a4d8f2033 Merge "vp9: Move skinmap computation into multithreading loop." 2017-07-11 19:44:22 +00:00
Jim Bankoski
689ad89e86 Reintroduce fix for max qindex calculation of a gf interval
This reintroduces the fix:
  https://chromium-review.googlesource.com/c/422807/
and later reverted here:
  https://chromium-review.googlesource.com/c/447843/

BUG=webm:1355

This time behind a compile time flag :

configure --disable-always_adjust_bpm
configure --enable-always_adjust_bpm

This should make side by side testing easier and let users of the
lib pick which way they want to go.

Change-Id: I7d7b37b83015dc001810af84c132cbc1e71ba8d6
2017-07-11 18:40:26 +00:00
Marco
3818a3723b vp9: Fix to SVC and denoising for fixed pattern case.
For fixed pattern SVC: keep track of denoised last_frame buffer
for base temporal layer, and if alt_ref is updated on middle/upper
temporal layers, force an update to denoised last_frame buffer.
This allows for improved denoising on top temporal layers.

Change-Id: Icbd08566027d4d2eabc024d3b7a0d959d2f8c18b
2017-07-11 11:27:04 -07:00
Jerome Jiang
3d6b0cb825 vp9: Move skinmap computation into multithreading loop.
Change-Id: Iebc9dd293d8b1449c0674c0295349297e9b90646
2017-07-10 17:18:15 -07:00
Johann
66a96fd3de avg_neon: fix 4x4, update 8x8
4x4 was failing with a bus error. Most likely due to clang alignment
hints on 32bit loads.

Change-Id: Ib191ce0e6239fc55d85f10e4dbe15876e5052edb
2017-07-10 15:29:34 -07:00
Johann
87610ac45e neon: consolidate horizontal adds
Change-Id: Iaf9e88ff636ccf8f0ef310869c6827f3f205cca8
2017-07-10 15:29:13 -07:00
Johann Koenig
4b78c6e6f7 Merge "remove vp9_full_sad_search" 2017-07-10 20:42:40 +00:00
Jerome Jiang
125a532b34 Merge "vp9: Remove alt-ref from denoiser." 2017-07-10 20:03:51 +00:00
Johann
109faffe9b remove vp9_full_sad_search
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.

Remove the remaining sizes and all the highbitdepth versions.

BUG=webm:1425

Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-07-10 11:20:35 -07:00
Jerome Jiang
2ac7c549e9 vp9: Remove alt-ref from denoiser.
Denoiser is used in real-time mode which does not use alt-ref.
Reduce memory usage when denoiser is enabled.

Change-Id: I54ba3bcaeeb1818bbdf718ef90e97d4897ff793d
2017-07-10 10:56:03 -07:00
Johann Koenig
4e16f70703 Merge changes Id84d9780,Iaa6ea75b,I3362e0dd,I0020a49e,Ia42e4f36, ...
* changes:
  sad neon: avg for 64x[32,64]
  sad neon: macroize 64xN definitions
  sad neon: avg for 32x[16,32,64]
  sad neon: macroize 32xN definitions
  sad neon: avg for 16x[8,16,32]
  sad neon: macroize 16xN definitions
2017-07-07 21:01:23 +00:00
James Zern
5d6060b62f Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming" 2017-07-07 19:15:02 +00:00
Johann Koenig
6c375b9cd0 Merge "fdct neon: 32x32_rd" 2017-07-07 14:05:51 +00:00
Johann
e4e08556db sad neon: avg for 64x[32,64]
BUG=webm:1425

Change-Id: Id84d97807a6a0fbcc889c4dfe11929d54f85493d
2017-07-07 07:04:04 -07:00
Johann
6ae8f8dbe8 sad neon: macroize 64xN definitions
Change-Id: Iaa6ea75b10e75784f31b1e08637eecf0dcb5cff9
2017-07-07 07:04:04 -07:00
Johann
67cffc1ef6 sad neon: avg for 32x[16,32,64]
BUG=webm:1425

Change-Id: I3362e0dded3b46ca032caa7f44db42f324bc596d
2017-07-07 07:04:04 -07:00
Johann
b0d15713be sad neon: macroize 32xN definitions
Change-Id: I0020a49e77d27514375a03095d5821dc0aa7d128
2017-07-07 07:04:04 -07:00
Johann
527e0c9b1c sad neon: avg for 16x[8,16,32]
BUG=webm:1425

Change-Id: Ia42e4f36547c5fe12114fb58379e34bce82eb2f2
2017-07-07 07:04:04 -07:00
Johann
3c18acf452 sad neon: macroize 16xN definitions
Change-Id: I5aea6ffbfa48eb1970afe3be54f0bba275d7fa58
2017-07-07 07:04:04 -07:00
Johann Koenig
9b253f9f0a Merge changes I7b36a57e,If2ab51e3,Ifc685a96
* changes:
  sad neon: macroize 8xN definitions
  sad neon: avg for 8x[4,8,16]
  sad neon: avg for 4x4 and 4x8
2017-07-07 14:03:13 +00:00
Marco Paniconi
2075af4b16 Merge "vp9: Nonrd mode: use content_state_sb for high motion." 2017-07-07 03:00:59 +00:00
James Zern
80b83c73ba cosmetics,vp9/: normalize inv/fwd_txfm naming
+ vpx_dsp/, test/

itxfm -> inv_txfm, ftxfm -> fwd_txfm

Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-07-06 18:35:44 -07:00
James Zern
777ca80f0a Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9: remove FrameWorkerData & vp9_dthread.h
  vp9: remove (un)lock_buffer_pool
2017-07-06 23:31:30 +00:00
James Zern
a15c6a7ebf vp8cx,cosmetics: correct VP9_SET_TILE_COLUMNS docs
this has been set to max since:
f5c36a5ce VP9: turn on tile-columns and frame-parallel-mode by default
~v1.4.0

Change-Id: Ic796fc05abe73a58700ec50e3f8e72d3462898ec
2017-07-06 16:24:35 -07:00
Marco
8c3f18efa1 vp9: Nonrd mode: use content_state_sb for high motion.
In the content_state for a superblock is set to HighSad,
use that to bias some decisions in variance partition and
nonrd pickmde: use int_pro_motion for sad computation in
choose_partitioning, and set large_block in pickmode based
on the content_state_sb.

Only affects speed >= 7.

Immprovement for high motion content.
Small gain (~1%) in RTC metrics.
Speedup of ~5 for high motion clip on android (speed 8, 1 thread).

Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
2017-07-06 15:05:19 -07:00
James Zern
26a9a4cd64 vp8cx,cosmetics: correct VP9_SET_FRAME_PARALLEL_DECODING docs
this has been on by default since:
f5c36a5ce VP9: turn on tile-columns and frame-parallel-mode by default
~v1.4.0

Change-Id: I52017ab0157feaf429dce3d9e1af8a53bb5c1b65
2017-07-06 10:40:18 -07:00
Johann
d6423b3166 sad neon: macroize 8xN definitions
Change-Id: I7b36a57e893c1795a37ba7994995bec7ff021409
2017-07-06 07:51:59 -07:00
Johann
63bdc574e5 sad neon: avg for 8x[4,8,16]
BUG=webm:1425

Change-Id: If2ab51e3050e078b0011b174efe41fcb65a15f44
2017-07-06 07:43:09 -07:00
Johann
6bac3f80ee sad neon: avg for 4x4 and 4x8
BUG=webm:1425

Change-Id: Ifc685a96cb34f7fd9243b4c674027480564b84fb
2017-07-06 07:12:47 -07:00
Johann
75b00592c7 fdct neon: 32x32_rd
About 40% faster than the non-rd version.

BUG=webm:1424

Change-Id: Ia99d14eb9532302eeaab8cd3e503395b0374b5a2
2017-07-06 06:30:50 -07:00
James Zern
5227b8200b vp9: remove FrameWorkerData & vp9_dthread.h
the file was empty after the struct removal. the only remaining use was
within vp9_dx_iface, but the wrapper became unnecessary after the
removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd
2017-07-05 22:32:00 -07:00
James Zern
48c4a038eb vp9: remove (un)lock_buffer_pool
there is no threaded access to this pool after the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: I710769b87102edc898c59eb9a2e7a91d8c49107f
2017-07-05 21:07:00 -07:00
James Zern
af3cab7b24 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_onyxc_int,RefCntBuffer: rm unused members
  remove vp9_dthread.c
  vp9: reduce FRAME_BUFFERS by 3
2017-07-06 04:06:30 +00:00
James Zern
4ffd8350be Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  VP9_COMMON: rm frame_parallel_decode
  VP9Decoder: rm frame_parallel_decode
  vp9_dx: rm worker thread creation
2017-07-05 23:53:22 +00:00
James Zern
0d245d42c4 Merge "test_vector_test,vp8: correct thread range" 2017-07-05 22:33:51 +00:00
Yaowu Xu
f2b1dc529f Merge "Further refactoring of mod error calculation." 2017-07-05 21:43:50 +00:00
Yaowu Xu
e3cafbc8df Merge "Fix incorrect index test in GF group rate assignment." 2017-07-05 21:43:43 +00:00
James Zern
24d0391efb Merge "googletest: suppress unsigned overflow in the LCG" 2017-07-05 21:19:44 +00:00
Johann Koenig
9a05f9771a Merge "test/buffer.h: move range checking to compiler" 2017-07-05 21:15:13 +00:00
James Zern
a22bb9809e Merge "dct_partial_test: cover vpx_fdct8x8_1_msa in hbd" 2017-07-05 21:08:46 +00:00
Hui Su
3e08a88854 Merge "level tests: allow level undershoot" 2017-07-05 20:47:20 +00:00
James Zern
23d60be414 dct_partial_test: cover vpx_fdct8x8_1_msa in hbd
this was enabled in:
5ac88162b partial fdct test

Change-Id: Ibae2031ec1308fe3a3b84a1ce6e7bacda3a7cb82
2017-07-05 13:01:41 -07:00
James Zern
a6531cbc54 Merge changes from topic 'missing-proto'
* changes:
  fwd_txfm_msa.c: add missing vpx_dsp_rtcd.h
  vpx_convolve_*_msa.c: add missing vpx_dsp_rtcd.h
  loopfilter_*_msa.c: add missing vpx_dsp_rtcd.h
2017-07-05 20:00:25 +00:00
Johann Koenig
b6321025cd Merge "partial fdct neon: maintain neon registers" 2017-07-05 19:12:38 +00:00
Johann
da2ad47d66 test/buffer.h: move range checking to compiler
Pass low/high values as type T. Out of range values should be caught by
static analysis instead.

Change-Id: I0a3ee8820af05f4c791ab097626174e2206fa6d5
2017-07-05 11:21:18 -07:00
paulwilkins
5b44ef0c50 Respond more rapidly to excessive local overshoot.
This patch attempts to address a bug reported for 4K video.
https://b.corp.google.com/issues/62215394

In this instance a perfect storm of a moderate complexity section
followed by a much easier section where a CGI overlay helped to
suppress film grain noise, followed by a much harder and very grainy
section at the end, cause a massive local rate spike that pushed a chunk
over the upper allowed rate limit.

This patch detects cases where the rate for a frame is much higher than
expected and allows, in this special case, for rapid adjustment of the active
Q range.

For the example chunk in the bug report the target rate was 18Mb/s and the
observed rate was over 37 Mb/s with a surge for the last few frames to over
100Mb/s. This patch brings the overall chunk rate right back down to ~18.2 Mbit/s
and  almost completely eliminates the rate spike at the end. (See graphs appended
to bug report)

Also see  I108da7ca42f3bc95c5825dd33c9d84583227dac1 which fixes a bug
unearthed during testing of this patch and also has a bearing on high rate
encodes such as 4K.

This patch does have a negative impact on some metrics. Most notably there are
clips in our standard test set where it hurts global psnr (though in many cases it
conversely helps SSIM, FAST SSIM and PSNR-HVS). It is also worth noting that
the clips (and data rates) where there is a big metric impact, are almost all cases
where there is currently a significant overshoot vs the target rate and overall rate
accuracy is greatly improved.

Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
2017-07-05 16:51:52 +01:00
paulwilkins
a1af335f44 Further refactoring of mod error calculation.
Further refactoring to support alternative error distributions.

Change-Id: I0f7fa3fd6f3baa4b0a1e53c6aa3be63966e97b82
2017-07-05 16:49:37 +01:00
paulwilkins
b0459ec8ea Fix incorrect index test in GF group rate assignment.
Correct test for middle frame in the group.

Change-Id: I1ee49fa33968eb3c4a01d6a27a60bb1409e3e68c
2017-07-05 16:45:36 +01:00
James Zern
7d526c1654 Merge "buffer.h: incorrect RandRange results" 2017-07-02 03:48:53 +00:00
Johann
6cb3178192 buffer.h: incorrect RandRange results
'low' was promoted to unsigned, triggering a ubsan warning

Change-Id: Id49340079d39c105da93cf13e96cf852a93a94ba
2017-07-01 20:01:22 -07:00
James Zern
fb135ff050 Merge changes I4ed1312f,Id2673eec
* changes:
  ppc: Add vpx_idct8x8_64_add_vsx
  ppc: Add vpx_idct4x4_16_add_vsx
2017-07-02 02:38:39 +00:00
Alexandra Hájková
c757d6dde4 ppc: Add vpx_idct8x8_64_add_vsx
Change-Id: I4ed1312f365509e0595dcc09890ecb050f6f2069
2017-07-01 12:55:47 -07:00
Alexandra Hájková
d8c277030c ppc: Add vpx_idct4x4_16_add_vsx
Change-Id: Id2673eece32027fb245919c7a5c81994a4a19fd8
2017-07-01 12:32:18 -07:00
Alex Converse
f7645138d4 googletest: suppress unsigned overflow in the LCG
Local application of:
https://github.com/google/googletest/pull/1066

Suppress unsigned overflow instrumentation in the LCG

The rest of the (covered) codebase is already integer overflow clean.

TESTED=gtest_shuffle_test goes from fail to pass with -fsanitize=integer

Change-Id: I8a6db02a7c274160adb08b7dfd528b87b5b53050
2017-07-01 12:24:32 -07:00
James Zern
3dd993e4be highbd_idct8x8_add_sse4: make << of neg. val a multiply
left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.

Change-Id: Ia17a7672d4832463decbc4afd6cd42974d02698e
2017-07-01 11:56:56 -07:00
Johann
3ae458f2f3 partial fdct neon: maintain neon registers
Finish the calulations in neon registers. This avoids a potentially
expensive move from neon to gp and allows at least clang to store
directly to memory.

BUG=webm:1424

Change-Id: Idef25eec95f7610947167818e9194bde8b00d282
2017-07-01 09:29:38 -07:00
James Zern
a876d04072 fwd_txfm_msa.c: add missing vpx_dsp_rtcd.h
+ only expose compatible functions in high-bitdepth build

quiets -Wmissing-prototypes warnings

Change-Id: I8ef7db08a34c5c54b5cde6e732c0d70f4287c89a
2017-06-30 18:53:30 -07:00
James Zern
8710c6d884 vpx_convolve_*_msa.c: add missing vpx_dsp_rtcd.h
quiets -Wmissing-prototypes warnings

Change-Id: I1ab5b8ae4a62f54e0f9eb3fc81371c9b99972c30
2017-06-30 18:50:56 -07:00
James Zern
329dabf57e loopfilter_*_msa.c: add missing vpx_dsp_rtcd.h
+ make some functions static

quiets -Wmissing-prototypes warnings

Change-Id: I2130e06142e71a004a1eb30e173feba4f6fe68a0
2017-06-30 18:50:52 -07:00
James Zern
27e37e1a8a fwd_txfm_msa.c: correct vpx_fdct8x8_1_msa prototype
this makes the function compatible with high-bitdepth and fixes test
failures since:
5ac88162b partial fdct test

Change-Id: Ib630694608237f0c515948942e05dbea259ba338
2017-06-30 18:50:47 -07:00
James Zern
af3ab45867 test_vector_test,vp8: correct thread range
testing::Range does not include the end parameter in the set of values.
also adjust the start to 2 as the single threaded case is already
covered in another instantiation

Change-Id: Iae3bf3ed4363dd434eccfa5ad4e3c5e553fbee60
2017-06-30 16:21:06 -07:00
James Zern
5a8e4110c7 Merge "gen_msvs_sln: fix solution version for 2015/17" 2017-06-30 22:05:32 +00:00
James Zern
37e03b1d13 Merge "cosmetics,vp9/encoder: s/txm/txfm/" 2017-06-30 21:57:16 +00:00
Jerome Jiang
f781cc8a46 Merge "vp9: Adjust condition for checking intra mode." 2017-06-30 21:55:00 +00:00
Marco
2290898ac7 vp9: Adjust condition for checking intra mode.
For nonrd_pickmode: add condition for checking
intra mode if the sb content state is VeryHighSad.

Reduces artifacts when sudden change in content.

Metrics on RTC/RTC_derf neutral (small gain).
No speed loss observed.

Change-Id: I07006d28fd2dc06c1d06b07630102b0fece50c40
2017-06-30 14:52:00 -07:00
Linfeng Zhang
1e3a93e72e Merge changes I5d038b4f,I9d00d1dd,I0722841d,I1f640db7
* changes:
  Add vpx_highbd_idct8x8_{12, 64}_add_sse4_1
  sse2: Add transpose_32bit_4x4x2() and update transpose_32bit_4x4()
  Refactor highbd idct 4x4 sse4.1 code and add highbd_inv_txfm_sse4.h
  Refactor vpx_idct8x8_12_add_ssse3() and add inv_txfm_ssse3.h
2017-06-30 20:49:19 +00:00
James Zern
70b7713059 gen_msvs_sln: fix solution version for 2015/17
these are rewritten to 12; 15 causes the open to fail under vs2017

Change-Id: I9c3fd38b632180fa10f1713d4a5d9d15aefd8569
2017-06-30 12:36:46 -07:00
James Zern
303cb3106b vp9_onyxc_int,RefCntBuffer: rm unused members
the last frame_worker_owner, row and col references were removed in:
131bd06e6 remove vp9_dthread.c

BUG=webm:1395

Change-Id: Ia7fb2e8782b12a58d2a2263849d20a8abf06aef6
2017-06-30 12:03:07 -07:00
James Zern
bc837b223b VP9_COMMON: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3a562b2c6b82050064d2b2ccb18a3e77c700b2da
2017-06-30 12:03:07 -07:00
James Zern
78fe5ca360 remove vp9_dthread.c
and the related prototypes in vp9_dthread.h. the last references were
removed in:
09dabc58d VP9_COMMON: rm frame_parallel_decode

vp9_dx_iface.c still uses FrameWorkerData

BUG=webm:1395

Change-Id: Ica8e98ae776fc0105f1fbbed9e0a729808980810
2017-06-30 12:03:07 -07:00
James Zern
ba76b662af VP9Decoder: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3f579766ecfa4777395b99686738e1c5610f86ef
2017-06-30 12:03:07 -07:00
James Zern
fb134e759a vp9: reduce FRAME_BUFFERS by 3
the additional buffers are unneeded with the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: Id9ec4cb6462af5d07a0d3cf939bd216db27d9d9e
2017-06-30 12:03:07 -07:00
James Zern
86d51dfbf5 vp9_dx: rm worker thread creation
creating a thread associated with the sole worker isn't necessary when
only execute() is being used after the removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I2255ce72607321e5708bc82a632dc6825d4eff5c
2017-06-30 12:03:07 -07:00
James Zern
469986f963 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_dx,vpx_codec_alg_priv: rm *worker_id*
  vp9_dx,vpx_codec_alg_priv: rm *cache*
  vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
2017-06-30 19:02:05 +00:00
Johann
c2044fda1d buffer.h: use stride_ instead of stride()
Change-Id: Ib51231349bf0ff3e23672762dc7bfa49b5fe4083
2017-06-30 07:37:20 -07:00
Johann
ce5b17f9ad testing: ranges for random values
Add a method to acm_random.h to generate ranges of values

Add a way to call that method to buffer.h

Adjust dct_[partial_]test.cc to use it.

Change-Id: I8c23ae9d27612c28f050b0e44c41cb4ad2494086
2017-06-30 07:25:30 -07:00
Johann Koenig
89d3dc043e Merge changes Id5beb35d,I2945fe54,Ib0f3cfd6,I78a2eba8
* changes:
  partial fdct neon: add 32x32_1
  partial fdct neon: add 16x16_1
  partial fdct neon: add 4x4_1
  partial fdct neon: move 8x8_1 and enable hbd tests
2017-06-30 01:00:07 +00:00
Linfeng Zhang
c338f3635e Add vpx_highbd_idct8x8_{12, 64}_add_sse4_1
BUG=webm:1412

Change-Id: I5d038b4fa842ce2f6b9bd5c8c44c70647bda9591
2017-06-29 17:19:34 -07:00
Linfeng Zhang
ee5cb8d87f sse2: Add transpose_32bit_4x4x2() and update transpose_32bit_4x4()
BUG=webm:1412

Change-Id: I9d00d1ddbd724fd5f825fd974c4cf46a9bca6cb3
2017-06-29 17:18:01 -07:00
Linfeng Zhang
0fa59a4baf Refactor highbd idct 4x4 sse4.1 code and add highbd_inv_txfm_sse4.h
Also clean highbd_inv_txfm_sse2.h

BUG=webm:1412

Change-Id: I0722841d824ce602874019bd9779b10d49d10c0b
2017-06-29 17:17:43 -07:00
Linfeng Zhang
9ac78ae35f Refactor vpx_idct8x8_12_add_ssse3() and add inv_txfm_ssse3.h
BUG=webm:1412

Change-Id: I1f640db71ad4c644b7521305a781f2218eb1ba9d
2017-06-29 17:13:28 -07:00
James Zern
e3c8f2f152 vp9_dx,vpx_codec_alg_priv: rm *worker_id*
+ available_threads

these are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: I59c5075542a5a74d4a539c213682f566b005f5a6
2017-06-29 16:21:50 -07:00
James Zern
dcdb013b55 vp9_dx,vpx_codec_alg_priv: rm *cache*
these fields are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: Ia3821f7fb81d17b20033b094e5265b1030ee4030
2017-06-29 16:21:50 -07:00
James Zern
ab76f1f224 vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
this field has been 0 since:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

BUG=webm:1395

Change-Id: I15448e9401e15329b54c6878dda033b17be5ec6b
2017-06-29 16:21:50 -07:00
James Zern
67d7a6df2d Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  rm vp9_frame_parallel_test.cc
  test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING
2017-06-29 23:21:18 +00:00
James Zern
e5bdab98e9 rm vp9_frame_parallel_test.cc
VPX_CODEC_USE_FRAME_THREADING was made a no-op in:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

and the tests in this file have been disabled since:
6ab0870d4 disable VP9MultiThreadedFrameParallel tests

BUG=webm:1395

Change-Id: I2c7a250acb65cf9522cf8a7bb724bb92070e41c6
2017-06-29 15:15:56 -07:00
James Zern
508ef2a6e3 test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING
this was made a no-op in:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

and the test hitting this branch has been disabled since:
6ab0870d4 disable VP9MultiThreadedFrameParallel tests

rename the test to VP9MultiThreaded to exercise the tile-based threading

BUG=webm:1395

Change-Id: I35564a75eb5a7d7f7ccb923133b1b07295201f4c
2017-06-29 15:15:48 -07:00
James Zern
8d1bda93f4 cosmetics,vp9/encoder: s/txm/txfm/
txfm is more commonly used as an abbreviation through the codebase

Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-06-29 15:08:47 -07:00
James Zern
bd77931421 dct_partial_test,fwd_txfm: change << to *
left shift of a negative number is undefined in C; quiets a ubsan
warning

Change-Id: Ib1624ad5326ac8e0eead9348468ef7fe5d4df9a4
2017-06-29 14:42:03 -07:00
Jerome Jiang
74dc640565 vp9: Remove avg2x2 in skin detection and clean up.
Change-Id: I6510e36866138f8ac4cb82c207e58e0b9522e499
2017-06-29 03:06:23 +00:00
Johann
9fe510c12a partial fdct neon: add 32x32_1
Always return an int32_t. Since it needs to be moved to a register for
shifting, this doesn't really penalize the smaller transforms.

The values could potentially be summed and shifted in place.

BUG=webm:1424

Change-Id: Id5beb35d79c7574ebd99285fc4182788cf2bb972
2017-06-28 15:37:44 -07:00
Johann
f310ddc470 partial fdct neon: add 16x16_1
For the 8x8_1, the highbd output fit nicely in the existing function. 12
bit input will overflow this implementation of 16x16_1.

BUG=webm:1424

Change-Id: I2945fe5478b18f996f1a5de80110fa30f3f4e7ec
2017-06-28 15:37:44 -07:00
Johann
4959dd3eb3 partial fdct neon: add 4x4_1
BUG=webm:1424

Change-Id: Ib0f3cfd6116fc1f5a99acb8bfd76e25b90177ffc
2017-06-28 15:37:44 -07:00
Johann
cf75ab6ccd partial fdct neon: move 8x8_1 and enable hbd tests
The function was originally written with HBD in mind. Enable it and
configure the tests.

BUG=webm:1424

Change-Id: I78a2eba8d4d9d59db98a344ba0840d4a60ebe9a1
2017-06-28 15:37:43 -07:00
Johann Koenig
81e25512c3 Merge changes Ib454762d,I966650df,Ie126553e,I068f06c6,Icb72a94e
* changes:
  sad neon: rewrite 64x64 and add 64x32
  sad neon: rewrite 32x32, add 32x16 and 32x64
  sad neon: rewrite 16x8, 16x16, add 16x32
  sad neon: rewrite 8x8 and 8x16
  sad neon: rewrite 4x4 and add 4x8
2017-06-28 22:37:00 +00:00
Johann Koenig
d91af5f905 Merge "buffer.h: Only allow Init() to be called once." 2017-06-28 22:36:05 +00:00
Johann Koenig
35f8515c3f Merge "partial fdct test" 2017-06-28 22:34:53 +00:00
Johann
5ac88162b9 partial fdct test
Test the _1 variant of the fdct, which simply sums the block and applies
a modifying shift based on the block size.

BUG=webm:1424

Change-Id: Ic80d6008abba0c596b575fa0484d5b5855321468
2017-06-28 20:32:20 +00:00
Johann
ad011aaab8 sad neon: rewrite 64x64 and add 64x32
BUG=webm:1425

Change-Id: Ib454762d1c61b05a98324fe81ad58c9e09784717
2017-06-28 12:21:34 -07:00
Johann
77a648885c sad neon: rewrite 32x32, add 32x16 and 32x64
BUG=webm:1425

Change-Id: I966650df7e3face93e1e771634d1cc5458a35f85
2017-06-28 12:20:27 -07:00
Johann
469643757f sad neon: rewrite 16x8, 16x16, add 16x32
BUG=webm:1425

Change-Id: Ie126553e5fffcdfaf3d82a85b368ac10ce9ab082
2017-06-28 12:16:00 -07:00
Johann
e40e78be24 sad neon: rewrite 8x8 and 8x16
BUG=webm:1425

Change-Id: I068f06c67b841f09ea07c04ada0c2f1706102138
2017-06-28 12:15:57 -07:00
Johann
46d8660ce3 sad neon: rewrite 4x4 and add 4x8
The previous implementation loaded 8 values (discarding half)

BUG=webm:1425

Change-Id: Icb72a94e2557a4ee2db7091266ab58fd92f72158
2017-06-28 11:14:59 -07:00
Jerome Jiang
8582d33a0d Merge "vp9: compute skinmap only once before encoding." 2017-06-28 18:01:46 +00:00
Johann
e0330c4810 buffer.h: Only allow Init() to be called once.
Change-Id: I041c8b6f314802833c5287a176dbfeec9461b08e
2017-06-28 10:59:39 -07:00
Marco
88d11f473c vp9: Speed >= 8: Remove logic on reducing subpel.
Existing logic was only affecting resolutions above 720p.
Needs more testing for reducing subpel for speed >= 8.

No change on RTC metrics.

Change-Id: I2f4bf9f25891614aafa9a86aa5a5063a3ccfce4d
2017-06-27 20:27:02 -07:00
James Zern
515fed8f38 Merge "highbd_quantize_fp_32x32: normalize abs_qcoeff type" 2017-06-27 23:30:17 +00:00
Jerome Jiang
a220b931f5 vp9: compute skinmap only once before encoding.
This could save some cycles since skin detection is used in multiple
places in vp9.

1~2% speed up on ARM.

Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
2017-06-27 16:16:02 -07:00
hui su
d4595de5db level tests: allow level undershoot
Obtaining a level that is lower than the target should be tolerated.

Change-Id: I90a55ee6d7142e9f6cc525ebbd1e0501defcbe28
2017-06-26 15:17:04 -07:00
Linfeng Zhang
0bb31a46a4 Update vpx_idct8x8_12_add_ssse3()
Change-Id: I0f38801c391db87ddae168602a786a062cd34b1d
2017-06-26 14:57:41 -07:00
Linfeng Zhang
39972d999d Merge "Update load_input_data() in x86" 2017-06-26 21:48:50 +00:00
Jerome Jiang
ddae8f7632 Merge "vp8: Clean up skinmap debugging codes." 2017-06-26 21:32:52 +00:00
Linfeng Zhang
a76b6b232c Update load_input_data() in x86
Split to load_input_data4() and load_input_data8().
Use pack with signed saturation instruction for high bitdepth.

Change-Id: Icda3e0129a6fdb4a51d1cafbdc652ae3a65f4e06
2017-06-26 13:38:33 -07:00
James Zern
f749905d0a roll libwebm snapshot
git log --no-merges --oneline 9732ae9..a97c484
9096786 mkvparser: fix float conversion warning
84e8257 disable -Wdeprecated-declarations in legacy code
a98f495 AddGenericFrame: fix memory leak on failure
da131dd AddCuePoint: fix memory leak on failure
b0cea9c Add(Audio|Video)Track: fix memory leak on failure
5261a67 webm_info: check vp9 ParseUncompressedHeader return
85f7e2e webm_info,PrintVP9Info: validate alt ref sizes
9b97ca1 vp9_header_parser_tests: check parser return
300d6d8 CuePoint::Find: check Track pointer
50c44bb webm_info,OutputCues: fix indexing of tracks
a0d27f0 mkvparser,Block::Parse: remove incorrect assert
784fc1b vttdemux,CloseFiles: check file pointer before closing
b4522c1 .gitattributes: force mkv/webm to be treated as binary
a118f3d Add test for projection parse failures.
d398479 Add test for primary chromaticity parse failures.
9bbec4c Fix permissions on test file.
2cef4d5 mkvparser:Parse: s/FLT_MIN/-FLT_MAX/
35a3c88 mkvmuxer: Turn off estimate_file_duration_ by default
5a41830 mkvparser: Avoid double free when Chromaticity parse fails.
67e3ffa mkvparser: Avoid casts of values too large for float in
Projection elements.
87bcddf vttdemux::ChapterAtomParser: check for NULL display string
a534a24 Update .gitignore
a0d67d0 mkvmuxer: Fix hard-coded data size in EbmlElementSize
c36112c mkvparser: #include sys/type.h
686664e Fix cmake generation warnings on Windows.
2b2c196 cmake: Fix required flag check.
166e40f Cmake refactor.
9fb774a Add missing include in webm2pes.cc.
4956b2d mkvmuxer: Force new clusters when audio queue gets too long.
54f1559 cmake: Cache results of CXX flag tests.
81c73fc mkvparser: Avoid alloc failures in SeekHead::Parse.

Change-Id: Ib81b1772ec81e7af3852dcfef2d312416f6db53d
2017-06-26 18:38:47 +00:00
Linfeng Zhang
ec4afbf74a Merge "Add vpx_highbd_idct4x4_16_add_sse4_1()" 2017-06-24 01:15:14 +00:00
Urvang Joshi
30eeb4dc32 Merge "Enable greedy version of optimize_b() in VP9 by default." 2017-06-24 00:58:24 +00:00
Linfeng Zhang
6ff5de68dd Merge "Cosmetics, 8x8 idct SSE2 optimization" 2017-06-24 00:51:08 +00:00
Urvang Joshi
4bb99ee27e Enable greedy version of optimize_b() in VP9 by default.
Improvements were already mentioned in the previous patch:
https://chromium-review.googlesource.com/#/c/531675/

Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
2017-06-23 17:04:58 -07:00
James Zern
ee1fcb0e69 Merge "variance_test: move Subpel* from tuples to TestParams" 2017-06-23 22:48:40 +00:00
Marco Paniconi
8ad23a072a Merge "vp9: Use scene detection for CBR mode." 2017-06-23 21:59:32 +00:00
Linfeng Zhang
8253a27904 Add vpx_highbd_idct4x4_16_add_sse4_1()
BUG=webm:1412

Change-Id: Ie33482409351a01be4e89466b0441834eb1e905a
2017-06-23 14:30:12 -07:00
Linfeng Zhang
b8a4b5dd8d Cosmetics, 8x8 idct SSE2 optimization
Change-Id: Id21fa94fd323e36cd19a2d890bf4a0cafb7d964d
2017-06-23 14:30:12 -07:00
Jerome Jiang
b0c4d87ac7 vp8: Clean up skinmap debugging codes.
Use the computed skinmap.

Change-Id: I8aabb5922ef5190ec85b9e01807cb79b4803b925
2017-06-23 14:05:33 -07:00
James Zern
0d1c782306 Merge "datarate_test: rename thread -> Thread in test name" 2017-06-23 20:00:51 +00:00
James Zern
54bcd98314 variance_test: move Subpel* from tuples to TestParams
this normalizes these tests with the regular variance ones both in
implementation and test list output

Change-Id: I387aea81456f94b8223b8fb2a28cab94bc1aa9d5
2017-06-23 12:54:18 -07:00
Marco
18805eee6c vp9: Use scene detection for CBR mode.
Use the scene detection for CBR mode, and use it to reset the
rate control if large source sad is detected and rate
correctioni fact/QP is at minimum state.

Avoids large frame sizes after big content change following
low content period.

Only affects CBR mode for 1 pass at speeds 5, 6, 7.
Change-Id: I56dd853478cd5849b32db776e9221e258998d874
2017-06-23 11:44:50 -07:00
Jerome Jiang
e187b27438 Merge "vp8: Compute skinmap only once before encoding." 2017-06-23 17:17:02 +00:00
James Zern
88a302e743 Merge changes from topic 'missing-proto'
* changes:
  onyxd_int.h: add missing prototypes
  onyxd.h: add vp8dx_references_buffer prototype
  vp[89],vpx_dsp: add missing includes
  vp8,encodeframe.h: correct prototypes
  vp8: add temporal_filter.h
  add picklpf.h
  add ethreading.h
  vp8,bitstream.h: add missing prototypes
  vp8: remove vp8_fast_quantize_b_mmx
  vp8,loopfilter_filters: make some functions static
  vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
  vp9_encodeframe: make scale_part_thresh_sumdiff static
  vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
  tiny_ssim: make some functions static
2017-06-23 05:44:24 +00:00
James Zern
a377f99088 onyxd_int.h: add missing prototypes
vp8cx_init_de_quantizer, vp8_mb_init_dequantizer
quiets -Wmissing-prototypes

Change-Id: Ib63d14caf0144eff31a75b7cdb667b7e1f9d83ae
2017-06-22 20:20:23 -07:00
Johann Koenig
794a5ad713 Merge "fdct32x32 neon implementation" 2017-06-23 01:58:00 +00:00
Jerome Jiang
b28fc490aa vp8: Compute skinmap only once before encoding.
Get ready for other uses (i.e. cyclic refresh).
Then use it when needed.

Change-Id: Id0519a9921045e5fb7f3badb54e9f04e903f28f9
2017-06-22 17:12:15 -07:00
Marco Paniconi
4f917912b9 Merge "vp9: Add high source sad to content state." 2017-06-22 22:18:48 +00:00
Linfeng Zhang
c5f9de573f Merge changes I783c5f4f,I365f8e53,I5dac0e98
* changes:
  Clean vpx_idct16x16_256_add_sse2()
  Update vpx_idct{8x8,16x16,32x32}_1_add_sse2()
  Clean 32x32 full idct sse2 and ssse3 code
2017-06-22 21:42:23 +00:00
Paul Wilkins
92145006c3 Merge "Fix int overflow in rate control for high bit rates." 2017-06-22 16:30:40 +00:00
Johann
e67660cf37 fdct32x32 neon implementation
Almost 3x faster in constrained loop testing. Over 10x faster in HBD
builds.

BUG=webm:1424

Change-Id: I2b7f8453e1d4ada63cde729d8115d684c4a71ff9
2017-06-22 06:40:17 -07:00
paulwilkins
efe1982e63 Fix int overflow in rate control for high bit rates.
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.

In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k  clip the encode time for speeds 0 and 1 was
reduced by >25%

Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
2017-06-22 10:34:21 +01:00
Marco
d7515b1187 vp9: Add high source sad to content state.
Use it to limit NEWMV early exit in nonrd pickmode

Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
2017-06-21 20:57:17 -07:00
Marco Paniconi
33a9394eb1 Merge "vp9: Adjustments for aq-mode and pickmode for speed >= 8." 2017-06-22 03:27:47 +00:00
James Zern
dd88bd87db datarate_test: rename thread -> Thread in test name
this is consistent with other threaded tests and ensures gtest_filters
meant to operate on these pick them up

Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b
2017-06-21 20:05:31 -07:00
James Zern
828a1fa6de Merge "vp8_dx_iface: clear -Wclobbered warnings" 2017-06-22 02:01:11 +00:00
James Zern
7c0788b07f onyxd.h: add vp8dx_references_buffer prototype
quiets -Wmissing-prototypes

Change-Id: I6bee535f3fb67e54a390266d787a5a92127aeadc
2017-06-21 19:00:15 -07:00
James Zern
44418c659f vp[89],vpx_dsp: add missing includes
quiets -Wmissing-prototypes

Change-Id: I841cfc019d592f2bc6b3fec5818051a31f4c53b5
2017-06-21 19:00:15 -07:00
James Zern
b24ed95f44 vp8,encodeframe.h: correct prototypes
+ add missing include
quiets -Wmissing-prototypes

Change-Id: I64af0368ba3d7f1d4de22a5887b631bb2cf15b8a
2017-06-21 19:00:15 -07:00
James Zern
b093d998fc vp8: add temporal_filter.h
quiets -Wmissing-prototypes

Change-Id: Iffa77467720affe030de5335e9335232b9e70af1
2017-06-21 19:00:15 -07:00
James Zern
eb8226b903 add picklpf.h
quiets -Wmissing-prototypes

Change-Id: Ic24164aa1f86fe99a493a633d64606e6f44ecdc1
2017-06-21 19:00:14 -07:00
James Zern
864bc77e7a add ethreading.h
quiets -Wmissing-prototypes in encodeframe.c

Change-Id: Ic216d0bdd6130eac44f2183639a715b2f1088ebe
2017-06-21 19:00:14 -07:00
James Zern
d5d6a609d0 vp8,bitstream.h: add missing prototypes
quiets -Wmissing:prototypes

Change-Id: I835a80eddca2b16280780e18558c321df3272c43
2017-06-21 19:00:14 -07:00
James Zern
1d86383512 vp8: remove vp8_fast_quantize_b_mmx
and vp8_fast_quantize_b_impl_mmx; this was never enabled in rtcd
an sse2 version exists so there isn't much reason to keep a mmx
implementation around.

Change-Id: I8b3ee7f46ba194ffa0d0a6225a0f299f2a4dea90
2017-06-21 19:00:14 -07:00
James Zern
18335f193d vp8,loopfilter_filters: make some functions static
quiets -Wmissing-prototypes

Change-Id: Ie5b00537f64a05e68a38dc558463691523988994
2017-06-21 19:00:14 -07:00
James Zern
07f847873b vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
quiets -Wmissing-prototypes

Change-Id: I72d899c2d8de1ddc52d90ac081f2629374b3a6e9
2017-06-21 19:00:14 -07:00
James Zern
9a329b5285 vp9_encodeframe: make scale_part_thresh_sumdiff static
quiets -Wmissing-prototypes

Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
2017-06-21 19:00:14 -07:00
James Zern
3f296533f6 vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
quiets -Wmissing-prototypes

Change-Id: Ib2d4f294f1982739bb2ac98155e789e040d309a1
2017-06-21 19:00:04 -07:00
James Zern
9e1d2de67c highbd_quantize_fp_32x32: normalize abs_qcoeff type
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c

Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
2017-06-21 18:56:10 -07:00
Marco
21afafa31a vp9: Put skin detection usage around cpi flag.
Skin detection usage in choose_partitioning should be
around the cpi->use_skin_detection.

Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
2017-06-21 17:32:56 -07:00
Marco
8cf6f78fce vp9: Adjustments for aq-mode and pickmode for speed >= 8.
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.

No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
2017-06-21 16:01:24 -07:00
James Zern
fbba31e241 vp8_dx_iface: clear -Wclobbered warnings
with gcc 6.x

Change-Id: Ib2070421603a6777892d4ea01f4b0921696f38b3
2017-06-21 15:09:58 -07:00
Johann Koenig
355432b0d2 Merge "dct tests: align InvAccuracyCheck buffers" 2017-06-21 21:16:23 +00:00
Linfeng Zhang
466b667ff3 Clean vpx_idct16x16_256_add_sse2()
Remove macro IDCT16 which is redundant with idct16_8col().

Change-Id: I783c5f4fda038a22d5ee5c2b22e8c2cdfb38432c
2017-06-21 13:47:15 -07:00
Linfeng Zhang
42522ce0b7 Update vpx_idct{8x8,16x16,32x32}_1_add_sse2()
Change-Id: I365f8e53d9ccd028cef0f561d4de9e5916278609
2017-06-21 13:47:05 -07:00
Linfeng Zhang
2b43a1ee18 Clean 32x32 full idct sse2 and ssse3 code
vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster
than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are
code relocations, no new code.

Change-Id: I5dac0e98cc411a4ce05660406921118986638d19
2017-06-21 13:46:49 -07:00
Hui Su
96ec8a425b Merge "VP9 level targeting: properly handle max_gf_interval" 2017-06-21 20:38:45 +00:00
Johann
1c48915233 dct tests: align InvAccuracyCheck buffers
'in' is used for the reference fdct. 'coeff' is input to the idct being
tested and 'dst[16]' is output

Fixes a segfault on unaligned memory access on x86.

Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962
2017-06-21 11:47:00 -07:00
James Zern
0aa3677d9d fix build, rm ref to vpx_idct8x8_64_add_ssse3
this was deleted in:
98967645a Remove vpx_idct8x8_64_add_ssse3()

but this was merged in:
9e03eedf6 Merge changes Ib26dd515,Ie60dabc3

after:
a92991133 Merge "dct tests: run all possible sizes in one test"

which added a new reference

Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25
2017-06-20 19:46:45 -07:00
Linfeng Zhang
9e03eedf62 Merge changes Ib26dd515,Ie60dabc3
* changes:
  Clean 8x8 idct x86 optimization
  Remove vpx_idct8x8_64_add_ssse3()
2017-06-21 00:38:25 +00:00
hui su
d96ed96c0f VP9 level targeting: properly handle max_gf_interval
Don't overide max_gf_interval if it's not specified. It will
be assigned with a default value in vp9_rc_set_gf_interval_range().

BUG=b/62803416

Change-Id: Ide46ce00279ed076865fc54ce98c55a994f0c798
2017-06-20 16:29:04 -07:00
Marco
492d52b9cc vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1000 and
buf-initial to 600. Paramaters affect target size of key frame.

Change-Id: I2be6bc2927f5fa74e19e1efa3fb574d23a503300
2017-06-20 12:22:03 -07:00
Marco
ae3a173352 vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1500 and
buf-initial to 700. Paramaters affect target size of key frame.

Change-Id: I01e238378b63eeef28dfc2178baadffcd3cc7561
2017-06-20 09:08:13 -07:00
Johann Koenig
a929911339 Merge "dct tests: run all possible sizes in one test" 2017-06-20 15:04:25 +00:00
Marco Paniconi
737aa5c9e4 Merge "vp9: SVC: Rework the usage of base_mv for SVC." 2017-06-20 03:08:32 +00:00
Marco
b55240057f vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Adjust some parameters in sample encoder: vpx_temporal_svc_encoder.
Parameters adjusted to set lower QP for initial key frame,
and allow for larger target size on subsequent key frames.

Change-Id: I092ad968e5b51b9f495dadb6ee96e810663c910e
2017-06-19 18:29:39 -07:00
Marco Paniconi
782aacc3d3 Merge "vp9: Speed >= 8: Adjust resolution threshold for subpel." 2017-06-19 23:45:35 +00:00
Johann
4ebb9a36f1 dct tests: run all possible sizes in one test
Modify fdct4x4_test.cc to support all size combinations. This does not
add any new tests and in fact fails a few. There were minimal changes
made to the tests so it's not entirely surprising that some of the
larger 12 bit transforms are failing since it was initially only used
for 4x4.

In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and
dct32x32_test.cc will be evaluated and moved to dct_test.cc.

BUG=webm:1424

Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f
2017-06-19 15:39:35 -07:00
James Zern
ed56ddfef8 Merge "libs.mk: retry partial testdata download" 2017-06-19 22:15:06 +00:00
James Zern
ada640a508 libs.mk: retry partial testdata download
attempt retry on transient failures uncaught by --retry

Change-Id: I7cd8846ff88daf0f521af9ee182e30bfd79f51f3
2017-06-19 14:40:39 -07:00
Marco
ff7fb4b280 vp9: Speed >= 8: Adjust resolution threshold for subpel.
Get some quality gain on RTC metrics (~7%), with
~5-8% speed slowdown.

Change-Id: I0d02942a77074424ee0326b6e110ddff09f2df5e
2017-06-19 13:58:08 -07:00
Jerome Jiang
bf41a982b4 Merge "Enable 8x8 skin detection for vp8." 2017-06-19 16:42:14 +00:00
Marco
112cd95507 vp9: SVC: Rework the usage of base_mv for SVC.
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.

Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.

Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
2017-06-18 22:35:46 -07:00
James Zern
31d6ba9a54 tiny_ssim: make some functions static
quiets -Wmissing-prototypes

Change-Id: If2e77c921b2fba456ed8d94119773e360d90b878
2017-06-16 15:36:32 -07:00
James Zern
038522e4a0 Merge "configure: test for -Wparentheses-equality" 2017-06-16 20:07:52 +00:00
Jerome Jiang
a36017e007 Enable 8x8 skin detection for vp8.
If 2 or more 8x8 blocks are identified as skin, the macroblock will be
labeled as skin.

Change-Id: I596542c81a2df9e96270cab39d920bbfeb02bc6e
2017-06-15 20:53:03 -07:00
James Zern
27c2185954 configure: test for -Wparentheses-equality
Change-Id: I36de79c58461907deaea70d6131da9119bc0bc69
2017-06-15 16:05:20 -07:00
Linfeng Zhang
c7e4917e97 Clean 8x8 idct x86 optimization
Create load_buffer_8x8() and write_buffer_8x8().

Change-Id: Ib26dd515d734a5402971c91de336ab481b213fdf
2017-06-15 14:30:00 -07:00
Linfeng Zhang
98967645a1 Remove vpx_idct8x8_64_add_ssse3()
It's almost identical with vpx_idct8x8_64_add_sse2(), except little
difference in instructions order.

Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f
2017-06-15 14:09:33 -07:00
Urvang Joshi
a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00
Linfeng Zhang
8d391a111a Merge changes Ibf9d120b,I341399ec,Iaa5dd63b,Id59865fd
* changes:
  Update high bitdepth load_input_data() in x86
  Clean array_transpose_{4X8,16x16,16x16_2) in x86
  Remove array_transpose_8x8() in x86
  Convert 8x8 idct x86 macros to inline functions
2017-06-15 17:57:50 +00:00
Marco Paniconi
8b48f68c0d Merge "vp8: Adjust the pred_err threhsold for drop on overshoot." 2017-06-14 15:59:55 +00:00
Linfeng Zhang
6da6a23291 Update high bitdepth load_input_data() in x86
BUG=webm:1412

Change-Id: Ibf9d120b80c7d3a7637e79e123cf2f0aae6dd78c
2017-06-13 16:53:53 -07:00
Linfeng Zhang
d6eeef9ee6 Clean array_transpose_{4X8,16x16,16x16_2) in x86
Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0
2017-06-13 16:50:44 -07:00
Linfeng Zhang
9c72e85e4c Remove array_transpose_8x8() in x86
Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f
2017-06-13 16:50:44 -07:00
Linfeng Zhang
cbb991b6b8 Convert 8x8 idct x86 macros to inline functions
Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce
2017-06-13 16:50:43 -07:00
James Zern
4f9d852759 vp8_skin_detection: add 'vp8_' prefix to public fns
BUG=webm:1438

Change-Id: I5feb31c254d02e116e624cfe702e73ba5a1f7aca
2017-06-12 20:13:28 -07:00
James Zern
98666368ee rename vp8/common/skin_detection.[hc] -> vp8_*
some build systems have trouble with duplicate basenames.
vpx_dsp/skin_detection.[hc] were added in:
658e85425 Merge skin detection code in vp8/9.

BUG=webm:1438

Change-Id: Ieaa70b40bda409ec23e6d179b47a930ac6243b05
2017-06-12 20:13:23 -07:00
Marco
b6e1bdfc76 vp8: Adjust the pred_err threhsold for drop on overshoot.
Change-Id: Ica2a09ac87160936b6f7bd01f167f464ea3ac41c
2017-06-12 09:54:16 -07:00
Hui Su
21e1661b54 Merge "vp9 level targeting: more strict constraint on min_gf_interval" 2017-06-12 16:38:02 +00:00
Jerome Jiang
a46bc0268b Merge "Remove duplication on vp8/9_write_yuv_frame." 2017-06-10 04:50:19 +00:00
Marco
e540ca7155 vp9: SVC: Use prune_evenemore only for non_reference.
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only effects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
2017-06-09 17:52:20 -07:00
Jerome Jiang
ff2d220d21 Remove duplication on vp8/9_write_yuv_frame.
Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8
2017-06-09 17:08:26 -07:00
Johann Koenig
6dcd9b37ea Merge "idct_test: don't use std::nothrow anymore" 2017-06-09 20:42:39 +00:00
Johann Koenig
8aa4ee1f10 Merge "buffer.h: allow declaring an alignment" 2017-06-09 20:42:21 +00:00
Johann Koenig
65f4299d65 Merge "Remove some dead code. Coverity CID 1310058" 2017-06-09 20:41:57 +00:00
Johann
92373a5bb2 idct_test: don't use std::nothrow anymore
But still check for NULL before calling Init()

Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76
2017-06-09 11:09:06 -07:00
Johann
5aee8ea752 buffer.h: allow declaring an alignment
x86 simd register operations generally prefer and may require 16 byte
alignment.

Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba
2017-06-09 11:03:15 -07:00
Sylvestre Ledru
c12d1d9b98 Remove some dead code. Coverity CID 1310058
Change-Id: I1186cf1dd8cde42f5970928f43edfc852298289d
2017-06-09 17:56:38 +00:00
James Zern
b3a262dff3 Merge "vp8_decode_frame: fix oob read on truncated key frame" 2017-06-08 23:17:50 +00:00
James Zern
45daecb4f7 vp8_decode_frame: fix oob read on truncated key frame
the check for error correction being disabled was overriding the data
length checks. this avoids returning incorrect information (width /
height) for the decoded frame which could result in inconsistent sizes
returned in to an application causing it to read beyond the bounds of
the frame allocation.

BUG=webm:1443
BUG=b/62458770

Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342
2017-06-08 23:16:04 +00:00
Johann
e50ea014c3 Revert "buffer.h: use size_t"
This reverts commit f08581c1d0.

type conversion warnings abound.

Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00
2017-06-08 10:20:21 -07:00
Jerome Jiang
943f9ee25c Merge "Merge skin detection code in vp8/9." 2017-06-08 16:36:00 +00:00
Johann Koenig
903375a48a Merge "fdct16x16 neon optimization" 2017-06-08 15:19:36 +00:00
Jerome Jiang
658e854252 Merge skin detection code in vp8/9.
BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14
2017-06-07 21:20:34 -07:00
hui su
21d2273efa vp9 level targeting: more strict constraint on min_gf_interval
min_gf_interval should be no less than min_altref_distance + 1,
as the encoder may produce bitstream with alt-ref distance being
min_gf_interval - 1.

BUG=b/38450599

Change-Id: Ifb733daa643ebc668d1b23e1ce92db94b66dabe8
2017-06-07 17:40:25 -07:00
Johann
eae7cf2368 fdct16x16 neon optimization
Roughly 2x speedup. Since the only change for HBD is to store(), the
improvement appears to hold there as well.

BUG=webm:1424

Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19
2017-06-07 14:59:55 -07:00
Marco Paniconi
9cea3a3c4e Merge "vp9: SVC: Enable simple_block_yrd for temporal layers." 2017-06-07 21:12:14 +00:00
Johann Koenig
0c4f74d129 Merge changes Iade45f69,I18d90658,Ieca3f1ef
* changes:
  buffer.h: add num_elements_
  buffer.h: zero-init all values
  buffer.h: use size_t
2017-06-07 19:20:16 +00:00
Marco
14d4718043 vp9: SVC: Enable simple_block_yrd for temporal layers.
Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condiiton for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed
2017-06-07 11:41:50 -07:00
Johann
902d63759e buffer.h: add num_elements_
raw_size_ was being incorrectly computed and used

Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d
2017-06-07 11:31:20 -07:00
Johann
4a37e3e2a0 buffer.h: zero-init all values
Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8
2017-06-07 11:27:26 -07:00
Johann
f08581c1d0 buffer.h: use size_t
Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f
2017-06-07 11:24:27 -07:00
Marco
13b02a8efe vp9: SVC: Enable row-mt in sample encoder.
Change-Id: I4b51043cb3f5955efe947fe4685aed4a21adb8bd
2017-06-07 10:32:44 -07:00
James Zern
ff42e04f9c Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}" 2017-06-06 23:52:39 +00:00
Marco Paniconi
27b34a109d Merge "vp9: SVC: Adjust some speed settings for SVC speed >= 7." 2017-06-06 23:07:45 +00:00
Marco
7d2f5f8e9d vp9: SVC: Adjust some speed settings for SVC speed >= 7.
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
2017-06-06 15:30:24 -07:00
Johann
de4cb716ee buffer.h: split out init
Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287
2017-06-06 15:02:50 -07:00
Johann
8659764a07 buffer.h: Use T for values
Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9
2017-06-06 12:05:14 -07:00
Jerome Jiang
cf07d85809 Initialize cost_list all to INT_MAX.
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
2017-06-06 10:42:37 -07:00
James Zern
6df142e2ab vp9_mcomp,get_cost_surf_min: quiet conversion warning
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-05 22:52:58 -07:00
Jerome Jiang
968a5d6bc2 Merge "Fix valgrind failure on uninitialized variables." 2017-06-06 03:47:31 +00:00
James Zern
4753c23983 Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx" 2017-06-06 02:19:41 +00:00
Jerome Jiang
ffe0f9b7fb Fix valgrind failure on uninitialized variables.
BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-05 13:09:29 -07:00
Jerome Jiang
f3a9ae5baa Fix ubsan failure in vp9_mcomp.c.
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-06-02 21:37:13 -07:00
Marco
e30781ff80 vp9: SVC: Force subpel search off under certain conditions.
For SVC 1 pass non-rd mode:
Force subpel seach off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quaity loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
2017-06-01 20:48:52 -07:00
Marco Paniconi
ff637d1903 Merge "vp9: Speed >8: Set subpel_search_method for low motion." 2017-06-01 23:57:19 +00:00
Marco
8c6fa5c5e3 vp9: Speed >8: Set subpel_search_method for low motion.
Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.

Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.

Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc
2017-06-01 16:16:13 -07:00
Jerome Jiang
68f035026f vp8 skin detection: Fix visual studio build failure.
Change-Id: I510b755550ebbfa2aaf9b974920d7f1c6454a845
2017-06-01 13:46:46 -07:00
Jerome Jiang
e254969df2 Fix corruption in skin map debugging output yuv.
For both vp8 and vp9.

BUG=webm:1437

Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28
2017-06-01 16:59:43 +00:00
Jerome Jiang
f1a300acc4 vp8: Clean up skin detection.
Use only the average of center 2x2 pixels in vp8.

Change-Id: I2b23ff19a90827226273e0fca49e90c734eda59b
2017-05-31 14:57:10 -07:00
Johann Koenig
755b3daf90 Merge "comp_avg_pred neon: used by sub pixel avg variance" 2017-05-31 18:17:28 +00:00
Jerome Jiang
32d8992147 Merge "Write skin map of vp8 skin detection for debug." 2017-05-31 16:37:07 +00:00
Linfeng Zhang
30ea3ef283 Merge "Update vpx_highbd_idct4x4_16_add_sse2()" 2017-05-31 15:56:20 +00:00
Johann
f695b30ac2 comp_avg_pred neon: used by sub pixel avg variance
BUG=webm:1423

Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804
2017-05-30 22:47:34 +00:00
Jerome Jiang
c39526da8a Write skin map of vp8 skin detection for debug.
Change-Id: Ica1b4e918aa759cd0ce65920f9d88452bbf9e3b4
2017-05-30 10:30:05 -07:00
Linfeng Zhang
45048dc9dc Update vpx_highbd_idct4x4_16_add_sse2()
BUG=webm:1412

Change-Id: I26e4b34ae9bc1ae80c24f56d740d737a95f1ab84
2017-05-30 09:25:30 -07:00
Johann Koenig
b9649d2407 Merge "comp_avg_pred: alignment" 2017-05-30 16:21:05 +00:00
Johann Koenig
48c0e13286 Merge "remove DECLARE_ALIGNED from neon code" 2017-05-30 15:58:17 +00:00
Johann
ea8b4a450d comp_avg_pred: alignment
x86 requires 16 byte alignment for some vector loads/stores.

arm does not have the same requirement.

The asserts are still in avg_pred_sse2.c. This just removes them from
the common code.

Change-Id: Ic5175c607a94d2abf0b80d431c4e30c8a6f731b6
2017-05-30 07:46:43 -07:00
Jerome Jiang
a5ab38093f Merge "Fix vp8 race when build --enable-vp9-highbitdepth." 2017-05-30 05:47:44 +00:00
Johann
42ce25821d remove DECLARE_ALIGNED from neon code
Unlike x86 neon only requires type alignment when loading into vectors.

Change-Id: I7bbbe4d51f78776e499ce137578d8c0effdbc02f
2017-05-26 10:41:57 -07:00
Johann Koenig
2693b89c19 Merge "subpel variance neon: reduce stack usage" 2017-05-26 17:25:47 +00:00
Johann Koenig
47174d60c8 Merge "Use vdup instead of vmov" 2017-05-26 17:25:24 +00:00
Jerome Jiang
0afa2dad76 Fix vp8 race when build --enable-vp9-highbitdepth.
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
2017-05-26 09:45:01 -07:00
Marco
146005a911 vp9: SVC: Fix to condiiton on using source_sad.
Fix the condition on usage of source_sad for temporal layers.
FIx allows it to be used for the case of 1 temporal layer.

Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3
2017-05-26 08:46:50 -07:00
Marco Paniconi
9ec9415fd9 Merge "vp9: Use source_sad only on top temporal enhancement layer." 2017-05-26 05:24:06 +00:00
Marco Paniconi
4be18ab295 Merge "vp9: SVC: Enable copy partition for SVC speed >= 7." 2017-05-26 05:23:47 +00:00
Marco
ea914456af vp9: Use source_sad only on top temporal enhancement layer.
For 1 pass CBR SVC mode.

Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d
2017-05-25 16:32:05 -07:00
Jerome Jiang
327c9bb1da Refactor: Move vp8 skin detection to new files.
Change-Id: If760f28cbbf22beac1cc9bd1546f13831e9dd3f0
2017-05-25 16:12:27 -07:00
Marco
747cf7a505 vp9: SVC: Enable copy partition for SVC speed >= 7.
Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partiton for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).

~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.

Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780
2017-05-25 12:21:46 -07:00
Johann
f3c97ed32e subpel variance neon: reduce stack usage
Unlike x86, arm does not impose additional alignment restrictions on
vector loads. For incoming values to the first pass, it uses vld1_u32()
which typically does impose a 4 byte alignment. However, as the first
pass operates on user-supplied values we must prepare for unaligned
values anyway (and have, see mem_neon.h).

But for the local temporary values there is no stride and the load will
use vld1_u8 which does not require 4 byte alignment.

There are 3 temporary structures. In the C, one is uint16_t. The arm
saturates between passes but still passes tests. If this becomes an
issue new functions will be needed.

Change-Id: I3c9d4701bfeb14b77c783d0164608e621bfecfb1
2017-05-24 13:28:13 -07:00
Johann
d204c4bf01 Use vdup instead of vmov
Change-Id: Idb6248c1429b55176bb3e9f4e8365ea0ed2be62a
2017-05-24 11:38:15 -07:00
Johann Koenig
de1a9c77a7 Merge changes Iaab2b9a1,Idfb458d3
* changes:
  sub pel avg variance neon: 4x block sizes
  sub pel variance neon: 4x block sizes
2017-05-24 18:33:53 +00:00
Johann Koenig
b11a37f540 Merge changes I31fa6ef8,I228c6f29
* changes:
  sub pel avg variance neon: add neon optimizations
  sub pel variance neon: normalize variable names
2017-05-24 18:32:02 +00:00
James Zern
f0279ceb92 Merge "partial_idct_test,InitInput: fix rollover in mult" 2017-05-24 16:27:21 +00:00
James Zern
566f6d75bd partial_idct_test,InitInput: fix rollover in mult
promote coeff to signed 64-bit to avoid exceeding integer bounds when
squaring the value

Change-Id: If77bef6bc0a6a4c39ca3013e5e2ddb426a1c6e1f
2017-05-24 15:27:38 +02:00
Alexandra Hájková
8bf6eaf433 ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}
Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123
2017-05-24 13:27:09 +00:00
Linfeng Zhang
6444958f62 Update inv_txfm_sse2.h and inv_txfm_sse2.c
Extract shared code into inline functions.

Change-Id: Iee1e5a4bc6396aeed0d301163095c9b21aa66b2f
2017-05-23 14:54:46 -07:00
Linfeng Zhang
36f1b183e4 Update InitInput() in test/partial_idct_test.cc
Make it work in high bit depth.

BUG=webm:1412

Change-Id: Ic5cfd410a69709f01e2924774356a108a349d273
2017-05-23 14:24:23 -07:00
Gregor Jasny
bcfd9c9750 Add support for Visual Studio 2017
BUG=webm:1428

Change-Id: Iba98aef1159724d106cf39b94d7b69843d76cd48
2017-05-23 11:32:27 +02:00
Johann
f6fcd3410d sub pel avg variance neon: 4x block sizes
BUG=webm:1423

Change-Id: Iaab2b9a183fdb54aae5f717aba95d90dc36a9e3b
2017-05-22 14:40:05 -07:00
Johann
188d58eaa9 sub pel variance neon: 4x block sizes
Add optimizations for blocks of width 4

BUG=webm:1423

Change-Id: Idfb458d36db3014d48fbfbe7f5462aa6eb249938
2017-05-22 14:40:01 -07:00
Johann
9b0d306a2f sub pel avg variance neon: add neon optimizations
These are missing an optimized version of vpx_comp_avg_pred

BUG=webm:1423

Change-Id: I31fa6ef842e98f7ff3ea079ffed51ae33178e2ed
2017-05-22 13:58:43 -07:00
Johann
e0d294c3af sub pel variance neon: normalize variable names
match vpx_dsp/variance.c variable names

Change-Id: I228c6f296c183af147b079b7c8bcdf97bd09cf3a
2017-05-22 13:58:43 -07:00
Linfeng Zhang
27beada6d0 Merge "Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2" 2017-05-22 20:58:18 +00:00
Johann
67ac68e399 variance neon: assert overflow conditions
Change-Id: I12faca82d062eb33dc48dfeb39739b25112316cd
2017-05-22 11:25:06 -07:00
Linfeng Zhang
c167345ffb Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2
BUG=webm:1412

Change-Id: Ia338a6057d36f9ed7eaa9cbd4dfbf0c3cbdc6468
2017-05-22 11:24:21 -07:00
Johann
d217c87139 neon variance: special case 4x
The sub pixel variance uses a temp buffer which guarantees width ==
stride. Take advantage of this with the 4x and avoid the very costly
lane loads.

Change-Id: Ia0c97eb8c29dc8dfa6e51a29dff9b75b3c6726f1
2017-05-22 10:51:31 -07:00
Johann Koenig
e7cac13016 Merge changes Ib8dd96f7,Ie9854b77
* changes:
  neon variance: process 4x blocks
  use memcpy for unaligned neon stores
2017-05-22 17:48:33 +00:00
Marco Paniconi
b3bf91bdc6 Merge "vp9: Adjustments to cyclic refresh for high motion." 2017-05-22 06:27:30 +00:00
Marco
2adc0443dd vp9: Adjustments to cyclic refresh for high motion.
For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.

No/little change in RTC metrics, only affects high motion case.

Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b
2017-05-21 22:21:44 -07:00
Marco
ff9395eb3b vp9: Speed >= 8: Modify condition for low-resoln.
No change on RTC metrics.

Change-Id: I5abc573cb56572188d900645d13ba479f55a1ea0
2017-05-21 22:14:38 -07:00
Johann Koenig
b5055002d7 Merge "neon 4 byte helper functions" 2017-05-19 17:11:30 +00:00
Johann Koenig
3c603eadb4 Merge "neon fdct: 4x4 implementation" 2017-05-19 17:08:58 +00:00
Paul Wilkins
a7977ece93 Merge "Changes to modified error." 2017-05-19 12:24:32 +00:00
Marco
1205e3207e vp9: SVC: Modify condition to allow for copy partition.
When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.

Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
2017-05-18 14:19:31 -07:00
Jerome Jiang
6b6ff9c969 Merge "vp9: Make copy partition work for SVC and dynamic resize." 2017-05-18 19:37:30 +00:00
Marco
2ba4729ef8 vp9: Make copy partition work for SVC and dynamic resize.
Only affects speed 8.

Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.

Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
2017-05-18 11:33:56 -07:00
paulwilkins
5680b4517f Changes to modified error.
The modified error was a derivative of the "coded_error"
that was used to allocate bits between different frames on the
assumption that the allocation should be linear in terms of this
modified error.  I.e. a frame with double the modified error score
should all things being equal get double the number of bits. The
code also included upper and lower caps derived from input
VBR parameters.

This patch improves the initial calculation of the clip mean error
(now called "mean_mod_score" as it is no longer a prediction error)
used as the midpoint for the rate distribution function and normalizes
the output "modified scores" scores such that 1.0 indicates a frame
in the middle of the distribution.  The VBR upper and lower caps are
then applied directly to a  frame's normalized score.

This refactoring is intended to make it easier to drop in alternative
distribution functions or to base the rate allocation on a corpus wide
midpoint (rather than the clip mean).

Change-Id: I4fb09de637e93566bfc4e022b2e7d04660817195
2017-05-18 12:56:02 +01:00
Johann
7b742da63e neon variance: process 4x blocks
Continue processing sets of 16 values. Plenty of improvement for 4x8
(doubles the speed) but only about 30% for 4x4.

BUG=webm:1422

Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905
2017-05-17 17:35:01 -07:00
Johann
2057d3ef75 use memcpy for unaligned neon stores
Advise the compiler that the store is eventually going to a uint8_t
buffer. This helps avoid getting alignment hints which would cause the
memory access to fail.

Originally added as a workaround for clang:
https://bugs.llvm.org//show_bug.cgi?id=24421

Change-Id: Ie9854b777cfb2f4baaee66764f0e51dcb094d51e
2017-05-17 12:11:31 -07:00
Marco Paniconi
a2dfbbd7d6 Merge "vp9: Modify ChangingDropFrameThresh unittest." 2017-05-17 18:42:51 +00:00
Linfeng Zhang
13918a9ccc Merge "Update partial idct testing code" 2017-05-17 17:53:03 +00:00
Yaowu Xu
bde2c04fb7 Merge "Experiment. Store first pass errors as per MB values." 2017-05-17 17:38:15 +00:00
Marco
4733df333f vp9: Modify ChangingDropFrameThresh unittest.
Add another (lower) bitrate to the test, to cover
frame drop behavior at low bitrate range.

Change-Id: Iaad003974159daf3d2d65ef3a6575a3e72e498d6
2017-05-17 09:38:21 -07:00
Linfeng Zhang
3210ca6d60 Update partial idct testing code
Add PartialIDctTest::PrintDiff() to help debugging.
In RunQuantCheck, try all combinations of +/-mask_ input for 4x4 idct.
Update PartialIDctTest::InitInput().

Change-Id: I13fd163954a4c1a3a6cfeb5e4a4d3d0e7ff901f4
2017-05-17 09:28:32 -07:00
Johann
105503b839 neon fdct: 4x4 implementation
Approximately twice as fast as C implementation.

BUG=webm:1424

Change-Id: I3c0307fb08ddc23df42545cd089a78e2ed5c9d3f
2017-05-17 07:38:18 -07:00
paulwilkins
42e5073f94 Experiment. Store first pass errors as per MB values.
Most existing first pass stats are stored in a form normalized to a
macro-block scale. However the error scores for intra / inter etc were
stored as frame level values but mainly used as MB level values.

This change  fixes that. Normalized per MB values make comparisons
between different formats easier and in any case this is usually what is
wanted.

An change in results should be limited to slight differences in rounding.

*** Change after patch 8 +2 requiring new approval.

Final pre-submit testing showed  one 4K clip with above expected change.
Investigation showed this was due to a value used to test for ultra low intra
complexity in key frame detection. This was a per frame not per MB value but
also did not scale with frame size. Replacement with a small per MB value
(based on original per frame value and cif frame size) resolved the KF detection
problem.

Also converted kf_group_error_left to a double in line with other error values
to reduce rounding problems in KF group bit allocation

All clips and sets now show nominal (or 0) change as expected.

Change-Id: Ic2d57980398c99ade2b7380e3e6ca6b32186901f
2017-05-17 12:00:18 +01:00
Linfeng Zhang
18e8baa5c0 Add transpose_32bit_4x4() and rename transpose_4x4() for vpx_dsp/x86
Change-Id: Ib57377f6cf6573c04720d3cc5dea4285362b4220
2017-05-16 17:46:37 -07:00
Johann Koenig
31cb852a90 Merge "Revert "Add visibility="protected" attribute for global variables referenced in asm files."" 2017-05-16 23:39:37 +00:00
Johann Koenig
2300e16675 Revert "Add visibility="protected" attribute for global variables referenced in asm files."
This reverts commit 0d88e15454.

Reason for revert: chromium builds are failing to locate vpx_rv during dlopen()

dlopen failed: cannot locate symbol "vpx_rv" referenced by "libstandalonelibwebviewchromium.so"

Original change's description:
> Add visibility="protected" attribute for global variables referenced in asm files.
>
> During aosp builds with binutils-2.27, we're seeing linker error
> messages of this form:
> libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
> symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
> object
>
> subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
> Other messages refer to symbol references from deblock_sse2.o and
> subpixel_sse2.o, also assembled from asm files.
>
> This change marks such symbols as having "protected" visibility. This
> satisfies the linker as the symbols are not preemptible from outside
> the shared library now, which I think is the original intent anyway.
>
> Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
>

TBR=jzern@google.com,johannkoenig@google.com,rahulchaudhry@chromium.org,builds@webmproject.org

Change-Id: I0c2ea375aa7ef5fda15b9d9e23e654bb315c941b
2017-05-16 15:54:33 -07:00
Marco Paniconi
baef5486bf Merge "Revert "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.""" 2017-05-16 22:50:29 +00:00
Marco Paniconi
13d4a0d011 Revert "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.""
This reverts commit 3704807805.

Reason for revert: <INSERT REASONING HERE>
Does not look to be the cause of the test failures.

Original change's description:
> Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."
> 
> This reverts commit 4a7424adba.
> 
> Reason for revert: <INSERT REASONING HERE>
> Possibly causing test failures in roll into chromium.
> 
> Original change's description:
> > vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
> > 
> > Reduces quality regression at speed 10 for real-time mode.
> > 
> > Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
> > 
> 
> TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
> # Not skipping CQ checks because original CL landed > 1 day ago.
> 
> Change-Id: I1defcb74e78a5a3bd29b7d1b21a96a79fa26a457
> 

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Change-Id: I13d86a2a68b8aa8c0c7465e6e58cff0e00bc7862
2017-05-16 22:50:19 +00:00
Marco Paniconi
b9987a7c25 Merge "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."" 2017-05-16 22:48:39 +00:00
Marco Paniconi
3704807805 Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."
This reverts commit 4a7424adba.

Reason for revert: <INSERT REASONING HERE>
Possibly causing test failures in roll into chromium.

Original change's description:
> vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
> 
> Reduces quality regression at speed 10 for real-time mode.
> 
> Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
> 

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: I1defcb74e78a5a3bd29b7d1b21a96a79fa26a457
2017-05-16 22:48:13 +00:00
Johann Koenig
dac3b59721 Merge "'protected' visibility unsupported on macho" 2017-05-15 21:21:45 +00:00
Johann
7498fe2e54 neon 4 byte helper functions
When data is guaranteed to be aligned, use helper functions which
assert that requirement.

Change-Id: Ic4b188593aea0799d5bd8eda64f9858a1592a2a3
2017-05-15 13:42:31 -07:00
Johann
3fbc371e99 'protected' visibility unsupported on macho
Mac builds must not specify 'protected' visibility. Then only support
'default' and 'hidden'.

https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/CppRuntimeEnv/Articles/SymbolVisibility.html

Change-Id: I94eccfaa29af0ddcc4a5c1c0e14cf63ef7146462
2017-05-15 11:29:22 -07:00
Johann Koenig
8739a182c8 Merge "move neon load/stores to a new file" 2017-05-15 18:15:27 +00:00
Johann
1088b4f87c move neon load/stores to a new file
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.

Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
2017-05-15 08:29:43 -07:00
Marco
4a7424adba vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
Reduces quality regression at speed 10 for real-time mode.

Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
2017-05-14 18:19:06 -07:00
Alexandra Hájková
bcbc3929ae ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx
Change-Id: Ic9639b1331d8c5cbc207c2a036891ff0137fc56f
2017-05-13 13:13:15 +00:00
Jerome Jiang
6b9d130214 Merge "vp9: speed 8: Fix seg fault in partition copy when drop frames." 2017-05-13 03:20:49 +00:00
Cheng Chen
4c0655f26b Merge "Speed up encoding by skipping altref recode" 2017-05-13 01:29:59 +00:00
Jerome Jiang
1fcd5cca3c vp9: speed 8: Fix seg fault in partition copy when drop frames.
BUG=webm:1433

Change-Id: I4f3984ef28660d3218d48007d7c977bdbdaf8af6
2017-05-12 15:57:23 -07:00
Rahul Chaudhry
0d88e15454 Add visibility="protected" attribute for global variables referenced in asm files.
During aosp builds with binutils-2.27, we're seeing linker error
messages of this form:
libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
object

subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
Other messages refer to symbol references from deblock_sse2.o and
subpixel_sse2.o, also assembled from asm files.

This change marks such symbols as having "protected" visibility. This
satisfies the linker as the symbols are not preemptible from outside
the shared library now, which I think is the original intent anyway.

Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
2017-05-12 11:11:16 -07:00
Marco Paniconi
9a66582604 Merge "vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl" 2017-05-12 17:02:50 +00:00
James Zern
ac8f58f6ab Merge changes I1b54a7a5,I3028bdad,I59788cd9
* changes:
  ppc: Add get_mb_ss_vsx
  ppc: Add get4x4sse_cs_vsx
  ppc: Add comp_avg_pred_vsx
2017-05-12 15:24:59 +00:00
Luca Barbato
143b21e362 ppc: Add get_mb_ss_vsx
Change-Id: I1b54a7a5bb642e4b836d786ea1ae506eed025e3f
2017-05-12 17:23:00 +02:00
Luca Barbato
6d225eb5f9 ppc: Add get4x4sse_cs_vsx
Change-Id: I3028bdadf653665d18e781d28e9625f62804b3d8
2017-05-12 17:23:00 +02:00
Luca Barbato
a7f8bd451b ppc: Add comp_avg_pred_vsx
Change-Id: I59788cd98231e707239c2ad95ae54f67cfe24e10
2017-05-12 17:22:55 +02:00
Alexandra Hájková
f48532e271 ppc: Add vpx_sad64x32/64_vsx
Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2
2017-05-12 16:10:16 +02:00
Alexandra Hájková
0b15bf1e54 ppc Add vpx_sad32x16/32/64_vsx
Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd
2017-05-12 16:10:11 +02:00
James Zern
a12ea1d5e9 Merge "ppc: Add vpx_sad16x8/16/32_vsx" 2017-05-12 13:33:51 +00:00
Marco Paniconi
629279a45c Merge "vp9: Adjust speed features for speed 8 at low resoln." 2017-05-12 00:35:40 +00:00
Marco Paniconi
c64667c338 Merge "vp9: SVC: Increase the partiiton and acskip thresholds" 2017-05-11 23:37:32 +00:00
Marco Paniconi
37cdd3bfc2 Merge "vp9; Adjust noise estimation thresholds." 2017-05-11 21:58:40 +00:00
Marco
c5c31b9eb6 vp9: SVC: Increase the partiiton and acskip thresholds
Increase the partition and acskip thresholds for temporal
enhancement layers.

~1-2% speedup, with negligible loss in quality.

Change-Id: Id527398a05855298ad9ddac10ada972482415627
2017-05-11 12:28:19 -07:00
Marco
c5a4376aed vp9: SVC: allow for setting the interp_filter in non-rd pickmode.
For SVC 1 pass non-rd pickmode, the interpolation filter for the
upsampling of the golden (spatial) reference was not being explicitly
set and instead was takin gwhatever value was set in the previous
mode/block (which would be either EIGHTTAP or EIGHTAP_SMOOTH).

Fix it to the default EIGHTTAP for now, to be updated/selected
adaptively in a later change.

Minor adjustmemt to rate targeting thresholds in datarate unittests.

Change-Id: I52085048674072c6cfb7163e11e9a2658d773826
2017-05-11 11:45:09 -07:00
Paul Wilkins
3caaf21c5b Merge "Tuning of factor used to calculate Q range in two pass." 2017-05-11 18:25:45 +00:00
Jerome Jiang
d35541fe29 Merge "vp9: Fix ubsan failure in denoiser." 2017-05-11 16:38:59 +00:00
paulwilkins
9a7625652c Tuning of factor used to calculate Q range in two pass.
A more detailed explanation of the experimentation
leading to this change can be found in:-

https://docs.google.com/a/google.com/document/d/13lsYhxgPyxUHvEess6wg9nikaonIZKY9Ak_Lpafv5Mo/edit?usp=sharing

This change gives gains across all our standard test sets for
overall psnr, ssim, fast ssim and psnr-HVS.

Values expressed as % reduction in bitrate.

Low res set     -0.257, -0.192, -0.173, -0.101
Mid res set     -0.233, -0.336, -0.367, -0.139
High res set    -0.999, -1.039, -1.111, -0.567
NetFlix 2K set -0.734, -0.174, -0.389, -0.820
Netflix 4K set  -0.814, -0.485, -0.796, -0.839

Change-Id: Ie981fb3c895c9dfcfc8682640d201a86375db5c8
2017-05-11 16:19:59 +01:00
Cheng Chen
76567d84ce Speed up encoding by skipping altref recode
Speed up for speed 0.
Reduce 10+% of encoding time for hdres in speed 0,
with less than 0.1% PSNR loss.
Compute total difference of previous and current frame context probability
model. If the diff is less than the threshold, skip recoding the frame.

Borg test (positive number means performance loss):
		lowres    midres    hdres
PSNR:		0.030     0.032     0.065

Local speed test: bitrate set at 1200
		blue_sky  pedestrian  rush_hour
Encoding time:	 -10.0%     -16.5%      -16.5%

Change-Id: I4e2d200ea3115d48b2c3e890143596b31b8ef9e9
2017-05-10 22:15:01 -07:00
Marco Paniconi
f7e767d8ee Merge "vp9: SVC: Fix setting in sample encoder." 2017-05-10 23:51:44 +00:00
Marco
2f11a65c99 vp9; Adjust noise estimation thresholds.
Change-Id: Ia41a11df18e5a58d2b8bbecd11c249d357de2a8f
2017-05-10 16:48:10 -07:00
Marco
eaa6715b02 vp9: SVC: Fix setting in sample encoder.
For 1 spatial layer case, scaling_num/den was not set properly.

Change-Id: I139bf70c6dffde89eed24e435bcb5d98d2029bcd
2017-05-10 16:19:23 -07:00
Jerome Jiang
597d1f4c03 vp9: Fix ubsan failure in denoiser.
Fix the overflow for subtraction between two unsigned integers.

BUG=webm:1432

Change-Id: I7b665e93ba5850548810eff23258782c4f5ee15a
2017-05-10 13:43:17 -07:00
Linfeng Zhang
8477a66fc8 Merge "Update specializations of idct functions" 2017-05-10 20:31:13 +00:00
Alexandra Hájková
cc7f0c0f3e ppc: Add vpx_sad16x8/16/32_vsx
Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9
2017-05-10 19:57:30 +00:00
Linfeng Zhang
764b3b8090 Update specializations of idct functions
Introduced append situation in Commit 0178d97 which could be
confusing. Clean a little bit and add some comments.

Change-Id: I69ad336f805aca7ce9d45515b8cd237423fadbb2
2017-05-10 12:51:18 -07:00
Jerome Jiang
2574573fea vp9: Wrap threshold tuning for HD only when denoiser is enabled.
Fixes a speed regression.

Change-Id: I23d942e4af17fa81fe4a366c7369b3ad537e59b0
2017-05-10 12:15:41 -07:00
Marco
d3aebeee4e vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl
Change-Id: I259d152c62864b365490368051f3c3b7d7f2f1c5
2017-05-10 12:06:44 -07:00
Johann Koenig
d713ec3c46 Merge changes I92eb4312,Ibb2afe4e
* changes:
  subpel variance neon: add mixed sizes
  sub pixel variance neon: use generic variance
2017-05-10 18:19:52 +00:00
Marco Paniconi
db2fad7516 Merge "vp9: Adjustment to noise estimation." 2017-05-10 17:11:18 +00:00
Marco Paniconi
fcd6e4a1c2 Merge "vp9: SVC: Add option to set downsampling filter type." 2017-05-10 17:10:52 +00:00
Marco
1b59964162 vp9: Adjustment to noise estimation.
When the noise estimate is forced off due to large motion,
reset the counter and set smaller window for next estimate.

Change-Id: Ifa4ec95396134173a00d48353ad52f1b6a40c217
2017-05-10 09:39:08 -07:00
Marco
4e23998fb4 vp9: SVC: Add option to set downsampling filter type.
Add option in SVC to set the filter type and phase for
the frame level downsampling filters.

For 3 spatial layers: set downsampling filter type to bilinear
and set phase to 8, for lowest spatial layer.

Change-Id: Id81f4b1ba93db19c1cd37b6a46d1281a2c61bc43
2017-05-09 17:22:44 -07:00
Linfeng Zhang
870cf4356c Update test/partial_idct_test.cc
Makes more sense to call the corresponding partial idct C function
instead of the full idct C function as the reference.

Change-Id: Ibb7681dd063edd6307ba582c10c26c4c6a4b78c6
2017-05-09 13:07:47 -07:00
Linfeng Zhang
f532504864 Clean 32x32 idct C code
Change-Id: I73b8104a9e7a70ffe827c1b7ff43618f24f5d7bd
2017-05-09 11:05:51 -07:00
Linfeng Zhang
ecd1eb2162 Update 4x4 idct sse2 functions
It's a bit faster to call idct4_sse2() in vpx_idct4x4_16_add_sse2()

Change-Id: I1513be7a895cd2fc190f4a8297c240b17de0f876
2017-05-08 16:16:52 -07:00
Marco Paniconi
8053dba5a1 Merge "vp9: SVC: Modify conditon for setting downsample filter type." 2017-05-08 21:45:58 +00:00
Marco
9586d5e682 vp9: SVC: Modify conditon for setting downsample filter type.
Base the condition on the resolution of the spatial layer.
And remove restriction on scaling factor.

Change-Id: Iad00177ce364279d85661654bff00ce7f48a672e
2017-05-08 14:13:49 -07:00
Johann
f7d1486f48 neon variance: process 16 values at a time
Read in a Q register. Works on blocks of 16 and larger.

Improvement of about 20% for 64x64. The smaller blocks are faster, but
don't have quite the same level of improvement. 16x32 is only about 5%

BUG=webm:1422

Change-Id: Ie11a877c7b839e66690a48117a46657b2ac82d4b
2017-05-08 18:48:55 +00:00
Johann Koenig
1814463864 Merge changes Id602909a,Ib0e85608
* changes:
  neon variance: process two rows of 8 at a time
  neon variance: add small missing sizes
2017-05-08 17:34:20 +00:00
Linfeng Zhang
2c3a2ad6f1 Merge changes I0cfe4117,I3581d80d,Ida62c941
* changes:
  Split dsp/x86/inv_txfm_sse2.c
  Update highbd idct functions arguments to use uint16_t dst
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
2017-05-08 16:15:57 +00:00
Marco Paniconi
f4653c1efc Merge "vp9: SVC: Set downsample filtertype for lowest spatial layer." 2017-05-06 02:31:00 +00:00
Marco
9b729748ac vp9: SVC: Set downsample filtertype for lowest spatial layer.
For lowest spatial layer, in 3 layer SVC, set the
downsampling filtertype to get averaging filter.
Needed for reducing aliasing on low-res layer,
small increase in overall encoder time.

Change-Id: Ia31460123bd91b72eca49b46dd924b9f226d4563
2017-05-05 19:29:09 -07:00
Jerome Jiang
3453c8d6c4 Merge "vp9: Neon optimization for denoiser. Add unit tests." 2017-05-06 01:28:32 +00:00
Jerome Jiang
83a2bfd7dc Merge "Change target bitrate thresh in denoiser test." 2017-05-06 01:28:15 +00:00
Jerome Jiang
fff358fb06 Change target bitrate thresh in denoiser test.
An intended behavior change disabling exhaustive searches in speed
feature causes VP9/DatarateTestVP9LargeDenoiser.4threads test failure.
Change the threshold to make it pass.

BUG=webm:1429

Change-Id: Ibcbe2314c6b2525799894f5d7204fc8eb4ec2a1e
2017-05-05 16:50:19 -07:00
Jerome Jiang
069eedb3a0 vp9: Neon optimization for denoiser. Add unit tests.
Denoiser on Neon is 5x faster than C code.

BUG=webm:1420

Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb
2017-05-05 16:40:52 -07:00
Marco Paniconi
a082c562d0 Merge "vp9: Adjust some thresholds for noise estimation." 2017-05-05 20:02:41 +00:00
Marco
34cce144d8 vp9: Adjust some thresholds for noise estimation.
Adjust thresholds for noise estimation, for resolutions above VGA.
Tends to push cleaner/low noise clips to LowLow state.

No change in RTC metrics.

Change-Id: I739ca6b797d0a60ccd1c6c6a2775269b1f007e5e
2017-05-05 12:00:12 -07:00
Johann Koenig
38f440120c Merge "fdct 8x8 neon: minor comment cleanup" 2017-05-05 18:22:45 +00:00
Jerome Jiang
af69ed20c4 vp9: Enable noise estimation on low res.
Set noise level to kLowLow for high motion low res clips.
Change the normalization in noise metric for low res.
Reduce the initial time-window for all resolutions.

Change-Id: Iaed39dbb50b205cd9c735dc5b84822304fb01987
2017-05-04 15:38:23 -07:00
Johann
2346a6da4a subpel variance neon: add mixed sizes
Add support for everything except block sizes of 4.

Performance is better but numbers will improve again when the variance
optimizations land.

BUG=webm:1423

Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80
2017-05-04 15:30:01 -07:00
Johann
19e1ec8359 sub pixel variance neon: use generic variance
When a neon version is available it will be called. This allows
decoupling the variance implementations and has no real downside. For
most configurations, the call will be #define'd to the neon
implementation.

Change-Id: Ibb2afe4e156c5610e89488504d366b3e6d1ba712
2017-05-04 15:30:01 -07:00
Johann
462e29703c fdct 8x8 neon: minor comment cleanup
Simplify HBD/non distinction in test.

Document why transpose_neon.h is not used

Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52
2017-05-04 15:14:23 -07:00
Johann
d6a7489dd5 neon variance: process two rows of 8 at a time
When the width is equal to 8, process two rows at a time. This doubles
the speed of 8x4 and improves 8x8 by about 20%.

8x16 was using this technique already, but still improved a little bit
with the rewrite.

Also use this for vpx_get8x8var_neon

BUG=webm:1422

Change-Id: Id602909afcec683665536d11298b7387ac0a1207
2017-05-04 08:59:46 -07:00
Johann
cb9133c72f neon variance: add small missing sizes
Some of the mixed sizes were missing. They can be implemented trivially
using the existing helper function.

When comparing the previous 16x8 and 8x16 implementations, the helper
function is about 10% faster than the 16x8 version. The 8x16 is very
close, but the existing version appears to be faster.

BUG=webm:1422

Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004
2017-05-04 08:59:42 -07:00
Yi Luo
a24e1e8027 Merge "High bit depth inter prediction horizontal/vertical filters AVX2" 2017-05-04 15:43:21 +00:00
Linfeng Zhang
2231669a83 Split dsp/x86/inv_txfm_sse2.c
Spin out highbd idct functions.

BUG=webm:1412

Change-Id: I0cfe4117c00039b6778c59c022eee79ad089a2af
2017-05-03 15:43:02 -07:00
Linfeng Zhang
d5de63d2be Update highbd idct functions arguments to use uint16_t dst
BUG=webm:1388

Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03 13:59:16 -07:00
Linfeng Zhang
081b39f2b7 Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
BUG=webm:1388

Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-05-03 13:58:31 -07:00
Hui Su
5048d6e7ee Merge "vp9 level: add tentative max cpb values for high levels" 2017-05-03 20:51:03 +00:00
Hui Su
f701a44305 Merge "Adjust alt-ref selection in define_gf_group()" 2017-05-03 20:50:29 +00:00
Yi Luo
a3452996a1 High bit depth inter prediction horizontal/vertical filters AVX2
User level speed improvement on i7-6700, cpu-used=1,
  x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps:
- Decoder:
  1080p: ~4%
  4K: ~5%
- Encoder:
  1080p: ~1%
  4K: ~3%

Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640
2017-05-03 12:18:01 -07:00
Linfeng Zhang
a10a5cb356 Merge changes I8bb660de,Ica51d780,I6037525d
* changes:
  Clean specializes of idct functions
  Clean add_protos of highbd idct functions
  Clean add_protos of idct functions
2017-05-03 19:17:55 +00:00
James Zern
5599e4275a Merge changes Ia5293d94,I90d481d3,Ia509d622,I54549b03,I89b635d6
* changes:
  ppc: Add convolve8_vsx and convolve8_avg_vsx
  ppc: Add convolve8_avg_vert_vsx
  ppc: Add convolve8_vert
  ppc: Add convolve8_horiz_avg
  ppc: Add convolve8_horiz
2017-05-03 03:31:19 +00:00
Luca Barbato
e2ad89092d ppc: Add convolve8_vsx and convolve8_avg_vsx
Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857
2017-05-02 20:27:47 -07:00
Luca Barbato
e6ca81ee67 ppc: Add convolve8_avg_vert_vsx
Only the generic one again, speedups for 8x8 and larger blocks to
come later.

Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d
2017-05-02 20:27:42 -07:00
Luca Barbato
a65f1771ad ppc: Add convolve8_vert
Only the generic one again, speedups for 8x8 and larger blocks
to come later.

Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f
2017-05-02 20:27:33 -07:00
Luca Barbato
77772350f3 ppc: Add convolve8_horiz_avg
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15
2017-05-02 20:27:28 -07:00
Luca Barbato
08edb85bd0 ppc: Add convolve8_horiz
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046
2017-05-02 20:27:16 -07:00
Linfeng Zhang
0178d974e5 Clean specializes of idct functions
Change-Id: I8bb660de47b5f97263ec381dc428db96e9c9a4b2
2017-05-02 18:01:19 -07:00
Linfeng Zhang
4412996d59 Clean add_protos of highbd idct functions
Change-Id: Ica51d780b92b316ce9112740c56cdf7670816371
2017-05-02 17:59:38 -07:00
Linfeng Zhang
a7a57d9756 Clean add_protos of idct functions
Change-Id: I6037525d92ec172810edab720389eb1865ed3b1a
2017-05-02 17:58:40 -07:00
Johann Koenig
240a5a15ef Merge "block error sse2: sum in 32 bits when possible" 2017-05-02 14:16:47 +00:00
Johann
cd94d5f68e block error avx2: rename variables
Change-Id: I2b8a9253f2c3d1fd85304c2970ebe70213870fe9
2017-05-01 17:54:29 -07:00
Johann Koenig
b1a31f8066 Merge "block error avx2: sum in 32 bits when possible" 2017-05-02 00:52:59 +00:00
Marco Paniconi
1e112bce37 Merge "vp9: SVC: Early exit on golden ref in non-rd pickmode." 2017-05-01 21:04:52 +00:00
Linfeng Zhang
e8655d49f5 Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()" 2017-05-01 19:54:40 +00:00
Johann Koenig
3d33a462b3 Merge "move vp9_error_intrin_avx2.c" 2017-05-01 19:52:36 +00:00
Kyle Siefring
760c214519 block error avx2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

AVX2 code provides a very minor performance improvement.

BUG=webm:1210

Change-Id: I4c82308eaf65741dca2f5c6db9be9c85f905073a
2017-05-01 12:51:33 -07:00
James Zern
ee3df31d74 Merge "vpx_scale_test: fix segfault on alloc failure" 2017-05-01 19:22:22 +00:00
Marco
ae0215f945 vp9: SVC: Early exit on golden ref in non-rd pickmode.
For SVC 1 pass real-time: add condition to skip the
golden (spatial) reference mode in non-rd pickmode.
Condition is to skip golden if the sse of zeromv-last mode
is below threshold. And change order in ref_mode_set_svc
to make sure golden zeromv is tested after last-nearest.

Speedup ~3-4% with little/negligible quality loss.

Change-Id: I6cbe314a93210454ba2997945f714015f1b2fca3
2017-05-01 10:36:54 -07:00
Kyle Siefring
8394990b27 block error sse2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

BUG=webm:1210

Change-Id: I5ca8c7f7775585a17fe09d6bbfc25e1f2955eb0a
2017-05-01 09:59:18 -07:00
Johann
2ff01aa1e4 move vp9_error_intrin_avx2.c
There is only one avx2 implementation. Drop '_intrin'

Change-Id: I887a0d27d58567eaad49f749f127eca61313f312
2017-05-01 09:13:01 -07:00
James Zern
2930903d51 vpx_scale_test: fix segfault on alloc failure
check the return of ResetImage() before continuing

Change-Id: Iff0b038f7b9761113b8cf33a511a5306640d1273
2017-04-29 13:12:53 -07:00
Luca Barbato
d51d3934f5 ppc: Add convolve_avg
Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014
2017-04-29 15:47:25 +02:00
Luca Barbato
63860ba7b8 ppc: Add convolve_copy
Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405
2017-04-29 15:47:25 +02:00
Johann Koenig
ef5918098d Merge "Use uint32_t for accumulator" 2017-04-28 18:32:09 +00:00
Jerome Jiang
ce2e278059 Merge "vp9: Fix condition for disabling adaptive_rd_thresh." 2017-04-28 18:10:36 +00:00
Jerome Jiang
04de501229 vp9: Fix condition for disabling adaptive_rd_thresh.
Add speed constrains for disabling adaptive_rd_thresh when
row_mt_bit_exact is set.

Change-Id: I2445115c2f9a2e46b8a0966031a0fea488d4964e
2017-04-28 10:26:20 -07:00
Jerome Jiang
bea27a5809 Merge "Generalize vp9 sse2 denoiser test for other platforms." 2017-04-28 15:45:52 +00:00
Johann
657f3e9f14 Use uint32_t for accumulator
Be specific about the data type size.

Use convenience macro vp9_zero_array.

Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd
2017-04-28 06:36:59 -07:00
Johann Koenig
94ebdba71d Merge "vp9 temporal filter: sse4 implementation" 2017-04-28 13:22:41 +00:00
Jerome Jiang
26aebd77b8 Generalize vp9 sse2 denoiser test for other platforms.
Renamed to vp9_denoiser_test.

Change-Id: I0d8f4c94bcb81a60949a13d9fe839cee95d03f77
2017-04-27 22:47:41 -07:00
Yaowu Xu
0e8fea6c13 Merge "VP9: enable trellis for high bitdepth intra" 2017-04-28 00:16:56 +00:00
James Zern
ef15d38df0 Merge "webm_read_frame: avoid NULL dereference" 2017-04-27 21:47:10 +00:00
Johann
6dfeea6592 vp9 temporal filter: sse4 implementation
Approximates division using multiply and shift.

Speeds up both sizes (8x8 and 16x16) by 30 times.

Fix the call sites to use the RTCD function.

Delete sse2 and mips implementation. They were based on a previous
implementation of the filter. It was changed in Dec 2015:
ece4fd5d22

BUG=webm:1378

Change-Id: I0818e767a802966520b5c6e7999584ad13159276
2017-04-26 22:03:05 -07:00
Jerome Jiang
43e0e082d1 vp9: Don't force disabling of adaptive_rd_thresh for realtime.
Don't force disabling of adaptive_rd_thresh for realtime when
row_mt_bit_exact is set.

Row based adaptive rd is made usable in CL
454882(https://chromium-review.googlesource.com/c/454882) for REALTIME.

Change-Id: Ief023414f0fd6eb86f299dd46ae58f4436875af5
2017-04-26 13:17:57 -07:00
Yunqing Wang
b68f14d0ed Merge "Make the row based multi-threaded encoder deterministic" 2017-04-26 16:12:14 +00:00
Linfeng Zhang
54c4e0f7a5 Merge "Update highbd convolve functions arguments to use uint16_t src/dst" 2017-04-26 15:50:46 +00:00
Marco Paniconi
004fab120a Merge "vp9: SVC: Adjust some speed settings for temporal layers." 2017-04-26 15:45:06 +00:00
Peter de Rivaz
66117b97c5 VP9: enable trellis for high bitdepth intra
BUG=webm:1409

Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7
2017-04-26 11:43:01 +01:00
hui su
d01c9febe9 vp9 level: add tentative max cpb values for high levels
Add tentative max cpb size values for levels 5.2 and up. Otherwise
encoding will fail when targeting for these levels.

Change-Id: Ib7e0ba4b9836ea1ac900b6822543812843d48463
2017-04-25 18:03:55 -07:00
hui su
8069f31076 Adjust alt-ref selection in define_gf_group()
107de19698 changes the encoder alt-ref selection behavior. Assuming
min_gf_interval = max_gf_interval = 4, the frame order would be
frm_1  arf_1  frm_2  frm_3  frm_4  frm_5  arf_2 before 107de19698;
frm_1  arf_1  frm_2  frm_3  frm_4  arf_2  frm_5 after 107de19698.

This patch reverts such alt-ref placement change.

Change-Id: I93a4a65036575151286f004d455d4fcea88a1550
2017-04-25 18:03:47 -07:00
Jerome Jiang
15ee8a8c45 Merge "Fix the decoder seg fault when frame is corrupted." 2017-04-26 00:09:29 +00:00
Jerome Jiang
997e54ea43 Merge "vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large" 2017-04-26 00:09:22 +00:00
Marco
c614164cb6 vp9: SVC: Adjust some speed settings for temporal layers.
Make some speed setting changes for temporal enhancement layers,
and remove the switch in subpel_force_stop for the aggressive_base_mv
in non-rd pickmode.

Gain some 2-3% speed with little/negligible quality loss.

Change-Id: I3e2a7f80ff45f38c0a6ceb01b34dbca2f53edbf0
2017-04-25 16:27:01 -07:00
Jerome Jiang
69b0242e9a vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large
For speed >= 8 and color_sensitivity not set, skip the transform
skipping test in UV planes.
Add a new condition to check noise level to skip chroma check
for speed >= 8 if y_sad is high.

1~2% speedup on ARM for speed 8.

Borg tests show neutral results in both rtc and rtc_derf.

Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
2017-04-25 16:21:36 -07:00
Linfeng Zhang
4758d20227 Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()
BUG=webm:1388

Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d
2017-04-25 14:32:20 -07:00
Linfeng Zhang
51dc998f3a Update highbd convolve functions arguments to use uint16_t src/dst
BUG=webm:1388

Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-25 14:22:19 -07:00
James Zern
0be513e8e8 webm_read_frame: avoid NULL dereference
block may be NULL with block_entry_eos or from return of GetBlock()

Change-Id: Ia0dd3ffa46305ee70efcdc55c05c2ad24efc993b
2017-04-25 12:34:23 -07:00
Marco
92ec0674fd vp9; Reduce artifact in non-rd pickmode for lighting changes.
Add a low-variance high-sumdiff to the superblock content state
and use it to limit the mv and bias some decisions in non-rd pickmode.
Only affects speed >= 6.

Reduces artifact for lighting changes.
Small/no difference in metrics on RTC set.

Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
2017-04-24 17:08:43 -07:00
Yunqing Wang
10a497bd38 Make the row based multi-threaded encoder deterministic
This patch followed allow_exhaustive_searches feature modification and
continued to modify the encoder to achieve the determinism in the row
based multi-threaded encoding. While row-mt = 1 and using multiple
threads, the adaptive feature in encoder was disabled, which gave
BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at
speed 2). These speed losses were acceptable considering the speed
gains obtained from row-mt.

Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
2017-04-24 16:28:27 -07:00
Yunqing Wang
c530208ae3 Merge "Make allow_exhaustive_searches feature no longer adaptive" 2017-04-24 17:41:10 +00:00
Marco Paniconi
b35f64241f Merge "vp9: SVC: fix condition for partition/skip threshold when denoising." 2017-04-21 21:28:17 +00:00
Yunqing Wang
bca4564683 Make allow_exhaustive_searches feature no longer adaptive
A previous patch turned on allow_exhaustive_searches feature only for
FC_GRAPHICS_ANIMATION content. This patch further modified the feature
by removing the exhaustive search limit, and made it no longer adaptive.
As a result, the 2 counts that recorded the number of motion searches
were removed, which helped achieve the determinism in the row based
multi-threading encoding. Tests showed that this patch didn't cause
the encoder much slower.

Used exhaustive_searches_thresh for this speed feature, and removed
allow_exhaustive_searches. Also, refactored the speed feature code
to follow the general speed feature setting style.

Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
2017-04-21 11:14:02 -07:00
Jerome Jiang
58fe1bde59 Merge "vp9: Non-rd pickmode: Avoid computation duplication." 2017-04-21 00:51:47 +00:00
Marco
5de0e9ed08 vp9: SVC: fix condition for partition/skip threshold when denoising.
The more aggressive settings should only be used when denoise_svc
condition is satisfied (which means top spatial layer).

Change-Id: Ia8e3515b27f31bf21b1976ca80a2fa826daece3a
2017-04-20 16:36:55 -07:00
Jerome Jiang
7ae1e321a1 vp9: Non-rd pickmode: Avoid computation duplication.
In non-rd pickmode (speed >= 5), avoid duplication of computations in
model_rd_for_sb_y when the speed feature use_simple_block_yrd is
enabled (or for high bitdepth build under certain conditions).

QVGA, VGA and HD have 1.23%, 2.68% and 1.7% speedup on ARM for speed 8,
respectively.

Encoding results are bitexact for speed >= 5.

Change-Id: I3f9130810c21439f5ad7e159e21cb2243dcd05f1
2017-04-20 16:20:59 -07:00
Jerome Jiang
25c1bada72 Fix the decoder seg fault when frame is corrupted.
BUG=webm:1399

Change-Id: I1e006e0260d9b56a4d2273659ca19b86c69c474b
2017-04-20 14:55:42 -07:00
Marco
29938b3a5a vp9: 1 pass SVC: Fix comment and condition for up-sampling reference.
No change in behavior.

Change-Id: I218fb30289091da623acb23324027435b8510d0e
2017-04-20 14:21:05 -07:00
Yunqing Wang
30ef50b522 Merge "Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content" 2017-04-20 19:57:46 +00:00
Marco Paniconi
17559cd8b5 Merge "vp9: Re-enable SVC datarate tests." 2017-04-20 19:53:20 +00:00
Marco
85ca2e8a8b vp9: Re-enable SVC datarate tests.
Re-enable the SVC tests, wrap the non-zero expectation
in GetMismatchFrames around #if CONFIG_VP9_DECODER.

Change-Id: I0e8a2d78b868c32f18fe597540f397d3a1b303b5
2017-04-20 12:08:08 -07:00
Marco
3134a52d26 vp9: SVC: Redefine the source downsample filter choice.
Rename the source downsampling filter, and define it
per spatial layers. Used 1 pass CBR SVC.

Change-Id: I8135f2ab89c535c53429b9c58b586f746bb668c7
2017-04-20 10:17:13 -07:00
Luca Barbato
8975436466 ppc: Add the intra predictor tests
Change-Id: Idea15b916044ab3d8e74519337880a484ecfd87e
2017-04-19 20:21:40 -07:00
Luca Barbato
914b160fb5 ppc: h predictor 8x8
Slightly faster with the current compiler.

Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47
2017-04-19 19:57:51 -07:00
Luca Barbato
0b9be93205 ppc: d63 predictor 8x8
10x faster.

Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8
2017-04-19 19:57:51 -07:00
Luca Barbato
ee9325b0bd ppc: tm predictor 4x4
Slightly faster.

Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191
2017-04-19 19:57:51 -07:00
Luca Barbato
2904eb5800 ppc: h predictor 4x4
2x faster.

Change-Id: I0583dec353299c6797401b646099f18db4e0420d
2017-04-19 19:57:51 -07:00
Luca Barbato
58245d7050 ppc: dc predictor 8x8
Slightly faster, the other dc predictors cannot be faster since
the computation speedup is overwhelmed by the time spent reading
dst to write just the 8x8 part.

Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66
2017-04-19 19:57:51 -07:00
Luca Barbato
6b4a65e8b1 ppc: d45 predictor 8x8
11x faster.

Change-Id: I5b8f39213ee1f5260724fc254e3fb5c462435798
2017-04-19 19:57:51 -07:00
Luca Barbato
92e33c7b31 ppc: d63 predictor 32x32
About 10x faster.

Change-Id: If7d0645f75c5d7deb9751edd0bf47e2f9068e9e7
2017-04-19 19:57:51 -07:00
Luca Barbato
a5469a00a8 ppc: d63 predictor 16x16
About 18x faster.

Change-Id: Id043bf76c011e03e992085bb5e20f330d3e98cd4
2017-04-19 19:57:51 -07:00
Luca Barbato
cc868da526 ppc: d45 predictor 32x32
About 12x faster.

Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed
2017-04-19 19:57:51 -07:00
Luca Barbato
7a7dc9e624 ppc: d45 predictor 16x16
About 16x faster.

Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9
2017-04-19 19:57:51 -07:00
Luca Barbato
c08baa2900 ppc: dc predictor 32x32
10x and 5x faster.

Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf
2017-04-19 19:57:47 -07:00
Luca Barbato
22ca468c7c ppc: dc top and left predictor 32x32
6x faster.

Change-Id: I717995b4056e5579c68191d11b495372971fe1ae
2017-04-19 19:49:31 -07:00
Luca Barbato
ad9dea1f6d ppc: dc top and left predictor 16x16
13x faster.

Change-Id: I1771ac39fda599153f933cb3f0506c9f97a6cbe6
2017-04-19 19:49:31 -07:00
Luca Barbato
d68d37872c ppc: dc_128 predictor 32x32
6x faster.

Change-Id: I1da8f51b4262871cb98f0aa03ccda41b0ac2b08b
2017-04-19 19:49:31 -07:00
Luca Barbato
f9d20e6df2 ppc: dc_128 predictor 16x16
20x faster.

Change-Id: I05f0deb2d38ae7966eae6b71fbc0aa51880e5709
2017-04-19 19:49:31 -07:00
Luca Barbato
0d9417de4a ppc: tm predictor 32x32
About 8x faster.

Change-Id: I9bad827ccbdf47ec95406e961c74ac2ff45f80cf
2017-04-19 19:49:26 -07:00
James Zern
a81f037f15 Merge changes I1f5a3752,I95123051,I3bb724e0,Ie81077fa,Ic80f3c05, ...
* changes:
  ppc: tm predictor 16x16
  ppc: tm predictor 8x8
  ppc: horizontal predictor 32x32
  ppc: horizontal predictor 16x16
  ppc: vertical intrapred 16x16 and 32x32
  configure: Workaround clang not enabling altivec on -mvsx
  configure: Match power*64* as ppc64
2017-04-20 02:45:45 +00:00
Yunqing Wang
e96e49c2f9 Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content
The allow_exhaustive_searches feature improves the encoding quality
of FC_GRAPHICS_ANIMATION content a lot. For non-FC_GRAPHICS_ANIMATION
content, the quality test result is almost neutral. This patch makes
this feature to be used only for FC_GRAPHICS_ANIMATION content.

The motivation of doing that is to make this feature no longer adaptive,
which will be implemented in the following patch.

Change-Id: Ic911df6dd757402b6480789cc247801e99840369
2017-04-20 00:03:27 +00:00
Linfeng Zhang
fbbdba3b04 Merge changes I9e18a73b,Ie47c8cd4
* changes:
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
  Create CAST_TO_BYTEPTR/SHORTPTR
2017-04-19 23:55:58 +00:00
Linfeng Zhang
bf8a49abbd Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
Replace by CAST_TO_BYTEPTR/SHORTPTR.
The rule is: if a short ptr is casted to a byte ptr, any offset
operation on the byte ptr must be doubled. We do this by casting to
short ptr first, adding offset, then casting back to byte ptr.

BUG=webm:1388

Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
2017-04-19 12:13:49 -07:00
Marco Paniconi
977356a72b Merge "vp9: Add phase to get averaging filter for 1:2 downsampling." 2017-04-19 15:27:55 +00:00
Marco
f34be01190 vp9: Fix the disabling of a SVC 3TL datarate test.
Change-Id: Ib42d23ab5ee39ab3c85e1d9a84e36249e59fe74e
2017-04-19 08:01:44 -07:00
Marco
348bdc0195 vp9: Add phase to get averaging filter for 1:2 downsampling.
The scaling filter with zero shift will give sub-sampling for
2x downsampling. Allow for a phase shift to get an averaging filter.

Usage is for source scaling in 1 pass SVC mode for 1:2 downscale.
Reduces aliasing in downsampled image.

Keep the phase to 0/off for now.

Change-Id: Ic547ea0748d151b675f877527e656407fcf4d51e
2017-04-18 16:56:15 -07:00
Luca Barbato
479443a570 ppc: tm predictor 16x16
About 10x faster.

Change-Id: I1f5a3752d346459df3b45f92963208bf3e520f06
2017-04-19 01:48:10 +02:00
Luca Barbato
c8f5a55df4 ppc: tm predictor 8x8
About 5x faster.

Change-Id: I951230517f49c0dca9ac9eac2efa8916a303b85a
2017-04-19 01:48:09 +02:00
Luca Barbato
7b0e12934e ppc: horizontal predictor 32x32
About 5x faster.

Change-Id: I3bb724e07baffd901aa2d0f65060ba48882cc9b8
2017-04-19 01:48:09 +02:00
Luca Barbato
a7a2d1653b ppc: horizontal predictor 16x16
About 10x faster.

Change-Id: Ie81077fa32ad214cdb46bdcb0be4e9e2c7df47c2
2017-04-19 01:48:09 +02:00
Luca Barbato
7ad1faa6f8 ppc: vertical intrapred 16x16 and 32x32
Change-Id: Ic80f3c050cfbe7697e81a311b4edaaa597b85cab
2017-04-19 01:48:09 +02:00
Luca Barbato
a39b723eb3 configure: Workaround clang not enabling altivec on -mvsx
The flag `-mvsx` implies `-maltivec`.

Change-Id: I7544553eba131a533467b387f8bf329d57f5af5c
2017-04-19 01:48:04 +02:00
Luca Barbato
3252e6b63d configure: Match power*64* as ppc64
Change-Id: Ie640dff50a5db935bb57c5a2570b423ce8946f2c
2017-04-19 01:47:56 +02:00
Linfeng Zhang
a02f391cbe Create CAST_TO_BYTEPTR/SHORTPTR
They will replace CONVERT_TO_BYTEPTR/SHORTPTR module by module.

BUG=webm:1388

Change-Id: Ie47c8cd4897696481b9cbbf9e2d439dc22dc85ec
2017-04-18 14:48:11 -07:00
Marco
15afee1938 vp9: Disable some SVC tests for now.
Disable the 1 pass CBR SVC tests with temporal_layers > 1.
Issue with the commit 863f860, which will cause encoder/decoder
mismatch due to skipping encoder loopfilter for non-reference frames.

Will re-enable the tests when fixed.

Change-Id: I74918a0045a17976b069c4be63fbeb921974df0d
2017-04-18 09:51:42 -07:00
Marco
ad2e3598d2 vp9: Add key_frame condition to is_reference check for loopfilter.
This condiiton is not needed as key_frame should set the refresh
of the reference frames, but good to have for clarity in condition.

Change-Id: Icf9838e7e4f0ff5cf0a9562ae3b5d6c7e6f78702
2017-04-17 15:18:46 -07:00
Johann Koenig
a6095333a7 Merge "re-enable vpx_comp_avg_pred_sse2" 2017-04-17 22:07:34 +00:00
Marco Paniconi
9aa429a66d Revert "Revert "vp9: Avoid encoder loopfilter for non-reference frames.""
This reverts commit e9b7f98c56.

Reason for revert:
Commit d578bdad fixes the issue (encoder/decoder mismatch
in 3TL datarate test) that causes the original revert.

Original change's description:
> Revert "vp9: Avoid encoder loopfilter for non-reference frames."
>
> This reverts commit 863f860bfc.
>
> This causes encoder / decoder mismatches in various
> VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests
>
> BUG=webm:1408
>
> Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
>

TBR=jzern@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com
BUG=webm:1408

Change-Id: Ifeb81460856d1d56482d4e0477a70ee98f8bfaa6
2017-04-17 11:02:02 -07:00
Marco
d578bdad02 vp9: Datarate test: modify frame flags for 3 TL.
Modify the frame flags to update the ARF on top layer,
for the tests:
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayersFrameDropping

This is needed to fix the encode/decoder mismatches caused by 863f860,
and removed in the revert e9b7f98.

Change-Id: I6b9fecfdd17315fc0179e29949338c77636026c0
2017-04-17 09:33:20 -07:00
Johann
9fa24f03b5 re-enable vpx_comp_avg_pred_sse2
Buffers on 32 bit x86 builds only guaranteed 8 byte alignment. Fixed
with "AvgPred test: use aligned buffers" and "sad avg: align
intermediate buffer"

Also re-enable asserts on the C version.

BUG=webm:1390

Change-Id: I93081f1b0002a352bb0a3371ac35452417fa8514
2017-04-17 08:40:43 -07:00
Johann Koenig
9e19102972 Merge "AvgPred test: use aligned buffers" 2017-04-17 15:36:41 +00:00
Johann
069b772915 sad avg: align intermediate buffer
comp_avg_pred has started declaring a requirement for aligned buffers.

BUG=webm:1390

Change-Id: Idaf6667498ea343e8d49b32bc9d8b9d0aa43ef5c
2017-04-17 14:26:33 +00:00
James Zern
4ba20da8b1 Merge "Add AVX2 optimization to copy/avg functions" 2017-04-15 00:26:08 +00:00
Yi Luo
aa5a941992 Add AVX2 optimization to copy/avg functions
Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074
2017-04-14 16:50:10 -07:00
Johann Koenig
7178e68bbe Merge "Disable vpx_comp_avg_pred_sse2" 2017-04-14 22:01:39 +00:00
Johann
e3b2710b04 AvgPred test: use aligned buffers
BUG=webm:1390

Change-Id: Idb6d1ce119a09c5e7c9f3c58bbbae3de63463d1d
2017-04-14 12:49:56 -07:00
James Zern
e9b7f98c56 Revert "vp9: Avoid encoder loopfilter for non-reference frames."
This reverts commit 863f860bfc.

This causes encoder / decoder mismatches in various
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests

BUG=webm:1408

Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
2017-04-14 11:50:06 -07:00
Marco
5f39262dcc vp9: Adjust speed features for speed 8 at low resoln.
For low resolutions (<= CIF): use quarter-pixel and simple_block_yrd.

~5% gain on RTC_derf.
~6-7% slowdown on ARM.

Change-Id: I4439ebd1116b9decac04786503f978840b68a60c
2017-04-14 11:35:47 -07:00
Marco Paniconi
b937f1c839 Merge "vp9: SVC: fix to allow use_base_mv to be used for 3 layers." 2017-04-14 17:12:58 +00:00
Johann
eaa7cdf05d Disable vpx_comp_avg_pred_sse2
Failures on windows:
unknown file: error: SEH exception with code 0xc0000005 thrown in the
test body.

Alignment check errors on linux:
test_libvpx: ../libvpx/vpx_dsp/variance.c:230: void
vpx_comp_avg_pred_c(uint8_t *, const uint8_t *, int, int, const uint8_t
*, int): Assertion `((intptr_t)comp_pred & 0xf) == 0' failed.

BUG=webm:1390

Change-Id: I5eed5381c0f1a8fe594a128eb415e77232f544ea
2017-04-14 08:43:06 -07:00
Johann Koenig
bdb593ab20 Merge "vpx_comp_avg_pred: sse2 optimization" 2017-04-14 04:10:56 +00:00
Marco
adb9b4eddf vp9: SVC: fix to allow use_base_mv to be used for 3 layers.
Allow use_base_mv to be used for 3 spatial layers where
base is 4x4 scale from the top layer.

Change-Id: If6641baf8b8e4d0fd5dc67619d873c6d75065f43
2017-04-13 20:43:43 -07:00
Marco Paniconi
f0ccaff553 Merge "vp9: Avoid encoder loopfilter for non-reference frames." 2017-04-14 00:45:42 +00:00
Marco
6bff6cb5a9 vp9: 1 pass VBR: Fix to rate control at low min-q.
Fix to avoid getting stuck at very low Q even
though content is changing, which can happen for --min-q=0.

Fix is to more aggressively increase active_worst_quality
when detecting significant rate_deviation at very low Q.

Change will only affect 1 pass VBR for --min-q < 4, so no
change in ytlive metrics for --min-q >= 4.

Change-Id: I4dd77dd7c08a30a4390da0ff2c8bda6fccfa76d7
2017-04-13 11:44:35 -07:00
Marco
863f860bfc vp9: Avoid encoder loopfilter for non-reference frames.
Useful for SVC, where the top layer enhancement frames may
not update any reference buffers, as is the case for the
patterns in the 1 pass CBR SVC when #temporal_layers > 1.

~3% encoder speedup for SVC patterns with temporal layers
in 1 pass CBR mode.

Updated the SVC datarate tests for the mismatch frames.
Set the frame-dropper off in some tests with #temporal_layers > 1
so we can correctly set #mismatch frames. Adjusted rate target
threshold for tests where frame-dropper was turned off.

Change-Id: Ia0c142f02100be0fed61cd2049691be9c59d6793
2017-04-13 09:51:55 -07:00
Johann
28a8622143 vpx_comp_avg_pred: sse2 optimization
Provides over 15x speedup for width > 8.

Due to smaller loads and shifting for width == 8 it gets about 8x
speedup.

For width == 4 it's only about 4x speedup because there is a lot of
shuffling and shifting to get the data properly situated.

BUG=webm:1390

Change-Id: Ice0b3dbbf007be3d9509786a61e7f35e94bdffa8
2017-04-13 08:44:52 -07:00
Yunqing Wang
f22b828d68 Fix an integer overflow in vp9_mcomp.c
The MV unit test revealed an integer overflow issue in vp9_mcomp.c.
This was caused if the MV was very large. In mv_err_cost(), when
mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363
and error_per_bit = 132412, causing the overflow.

BUG=webm:1406

Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e
2017-04-10 18:09:50 -07:00
Jerome Jiang
2420f44342 Merge "vp9: speed >= 8: Adjust speed settings on ARM." 2017-04-11 00:45:21 +00:00
Jerome Jiang
f16f08e55b vp9: speed >= 8: Adjust speed settings on ARM.
Set adaptive_rd_thresh to 2 when simple block yrd is not used.

Fix regression caused by computing y sad without
int_pro_motion_estimation on low res motion clips.

Overall 0.07% quality loss on rtc_derf.

Change only affects low res on speed 8.

Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
2017-04-11 00:26:56 +00:00
Marco
6557baf336 vp9: 1 pass CBR: avoid nonrd_pick_partition on segment.
For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition
on the segment, rather use choose_partitioning followed by
nonrd_select_partition (as is done on base segment).

Little/no quality loss on RTC and RTC_derf (< 0.3%),
speedup of at least 5%.

Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
2017-04-10 15:02:49 -07:00
Marco Paniconi
ff1fef9607 Merge "vp9: Fix to noise estimation for temporal denoising." 2017-04-07 17:13:22 +00:00
Yunqing Wang
f496032686 Merge "VP9 motion vector unit test" 2017-04-07 16:46:22 +00:00
Marco
349c3118bd vp9: Fix to noise estimation for temporal denoising.
If the noise estimation is avoided due to large motion,
the last_source for denoising should still be updated.

Change-Id: I67155ea7dbe9ac2785978e64a27bdafd7d57aac0
2017-04-07 09:23:30 -07:00
Marco
18b54ef468 vp9: Adjust consec_zeromv threshold for aq-mode=3.
To reduce refresh on partial super-blocks on boundary,
for noisy input. Reduces some artifacts on noisy input.

Change-Id: I10b5808a296874e08c7f378b3df58466591d8dbe
Edit
2017-04-07 08:54:09 -07:00
James Zern
04e9456567 Merge changes from topic 'Wshorten'
* changes:
  configure: enable -Wshorten-64-to-32 for hbd
  vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
  Resolve -Wshorten-64-to-32 in highbd variance.
2017-04-07 07:32:14 +00:00
Jerome Jiang
6af42f5102 Merge "Fix compile warnings with enable-internal-stats flag." 2017-04-07 03:34:55 +00:00
Jerome Jiang
b82b574e76 Fix compile warnings with enable-internal-stats flag.
BUG=webm:1402

Change-Id: Ibe9ecb1b559a4b989f6ccedbd097e369f6edde1e
2017-04-06 14:00:01 -07:00
Marco
3227a9be5f vp9; Move the denoising condition for speed 5.
Move the condition for effectively disabling the denoising
for speed 5 into the vp9_denoiser_denoise().

This is cleaner, and also moving the condition into vp9_denoiser_denoise
will keep the denoiser buffer updated with the current source.
This allows for more consistent behavior if speed is changed midstream.

Change-Id: Ia001f591c56e454bf724c3ae73c024badb183ef8
2017-04-06 11:03:04 -07:00
Jerome Jiang
c9fbb1881a Merge "vp9: speed 8: Compute y sad without int_pro_motion_estimation." 2017-04-06 02:57:16 +00:00
Jerome Jiang
705fc9f107 Merge "Refactor: Clean memory allocation for copy partition." 2017-04-06 02:57:08 +00:00
Yunqing Wang
1aa46abbdf VP9 motion vector unit test
To prevent the motion vector out of range bug, added a motion vector unit
test in VP9. In the 4k video encoding, always forced to use extreme motion
vectors and also encouraged to use INTER modes. In the decoding, checked if
the motion vector was valid, and also checked the encoder/decoder mismatch.

The tests showed that this unit test could reveal the issue we saw before.

Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
2017-04-06 00:50:56 +00:00
James Zern
7ac3acf762 configure: enable -Wshorten-64-to-32 for hbd
Change-Id: I4041f3fc4aabc7a2251c44c75a477b659284c3cf
2017-04-05 17:34:06 -07:00
James Zern
b3e2eb14c5 vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
vp9_high_get_sby_perpixel_variance the variance operated on in is
already in 32-bits

Change-Id: I97006eb9c08dbd0f88ee35e1a1ca205737508296
2017-04-05 17:34:06 -07:00
James Zern
47b9a09120 Resolve -Wshorten-64-to-32 in highbd variance.
For 8-bit the subtrahend is small enough to fit into uint32_t.

This is the same that was done for:
c0241664a Resolve -Wshorten-64-to-32 in variance.

For 10/12-bit apply:
63a37d16f Prevent negative variance

Change-Id: Iab35e3f3f269035e17c711bd6cc01272c3137e1d
2017-04-05 17:34:02 -07:00
Jerome Jiang
288d73c861 vp9: speed 8: Compute y sad without int_pro_motion_estimation.
Little change in overall PSNR in rtc. 2-4% speedup on VGA on ARM.

Change-Id: I3395806d7afd456deacd4077c330adca13ab0645
2017-04-05 17:25:47 -07:00
Marco Paniconi
511a207444 Merge "vp9: Temporal denoising: avoid denoising for speed <= 5." 2017-04-06 00:25:45 +00:00
Marco
2136de9374 vp9: Temporal denoising: avoid denoising for speed <= 5.
Temporal denoiser runs in non-rd pickmode, so it is only used
for speed >= 5. Regression exists for speed 5, due to use of
reference_partition (which use non-rd pickmode for partitioning).
Avoid denoising for now at speed 5.

Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
2017-04-05 16:43:39 -07:00
Jerome Jiang
58ba880b94 Refactor: Clean memory allocation for copy partition.
Move the memory allocation from setting speed features.

Change-Id: I2e89dfaeb46daee63effe5a5df62feed732aa990
2017-04-05 15:33:24 -07:00
Linfeng Zhang
6fc2e57c2c Update 32x32 high bitdepth idct NEON optimization
Preparation of CONVERT_TO_BYTEPTR/SHORTPTR clean up.

BUG=webm:1388

Change-Id: I928d30a5698023bb90888d783cf81c51ec183760
2017-04-05 15:28:11 -07:00
Jerome Jiang
fb60204d4c vp9: Remove legacy comments for avg_source_sad.
Change-Id: Ia6e8614535a097f17f37fc382cef8e22e03b70f6
2017-04-04 16:28:27 -07:00
Marco
8097b49997 vp9: Adjust condition of golden update with cyclic refresh.
Base the low_content_frame metric on the motion vectors,
and adjust the logic for preventing golden update.

Small change in behavior: small positive gain (~0.2-1%) on clips
with high activity.

Change-Id: I0b861c8e9666cd82b45cde5ee57ee8a1e5ab453c
2017-04-04 09:55:24 -07:00
Marco
6b3f4bc794 vp9: 1 pass CBR: cleanup to cyclic refresh.
Code cleanup: merged two functions that were doing postencode
update for cylic refresh, remove some unused code and fix comments.

No change in behavior.

Change-Id: I9be0d7e346d34dec29bf4e5bb380a7bf81c8480a
2017-04-03 16:37:45 -07:00
Yunqing Wang
41fac44707 Merge "Fix for out of range motion vector bug in sub-pel motion estimation" 2017-04-03 18:27:57 +00:00
Marco Paniconi
9d403d6f48 Merge "vp9: SVC: Fix issue with artifact for svc-denoising." 2017-04-03 16:23:25 +00:00
Ranjit Kumar Tulabandu
bf15ca1091 Fix for out of range motion vector bug in sub-pel motion estimation
BUG=webm:1397

(yunqingwang)
To verify that this patch wouldn't cause much performance change,
the Borg tests were run. Here was the result:
       avg_psnr   overall_psnr  ssim
hdres: -0.002     0.006         0.013
midres:   0         0             0
lowres:   0         0             0

Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d
2017-04-03 16:16:49 +00:00
Yunqing Wang
002cf38837 Merge "Enhance the row mt sync read to accept the sync_range greater than 1" 2017-04-03 15:59:51 +00:00
Yunqing Wang
f1600db3e4 Enhance the row mt sync read to accept the sync_range greater than 1
The row mt sync read uses sync_range = 1, and wouldn't work if we want
to use a sync_range that is greater than 1. To make it work, this sync
read code is modified. Pass in col instead of col - 1 to make it
consistent with other row mt code in VP9, and then add 1 in "while"
codition.

Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
2017-03-31 10:48:38 -07:00
Marco
c824eda6cc vp9: SVC: Fix issue with artifact for svc-denoising.
Issue/bug happens for denoising with spatial layers, where
the golden (spatial) reference is used in pickmode, but
denoising is only done wrt to last (temporal).

Fix is to make sure set_ref_ptrs is set before build predictors
in denoiser.

Change-Id: I793cf441341edf7c4a88b8ab1e1b22b3cb0eb508
2017-03-31 10:05:32 -07:00
James Zern
8e7c5a3c8d Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vpxdec: silently ignore -frame-parallel
  vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op
2017-03-30 19:07:44 +00:00
James Zern
6fed5692d2 vpxdec: silently ignore -frame-parallel
BUG=webm:1395

Change-Id: Ibf47cc931e51b71e49067c6d7b7a39ab57c11c96
2017-03-29 23:39:12 -07:00
James Zern
01d23109ab vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op
this is unmaintained and due to be removed

BUG=webm:1395

Change-Id: Iaffa6aa057c820fd1a182b93ebb45d4286e1306e
2017-03-29 23:38:49 -07:00
James Zern
7fde349b25 rtcd,unit tests: fix ppc64 build
match ppc* in rtcd to ensure vsx is included

Change-Id: I331a5d35e7160eeb69ebd14b98ba03ec5be6c600
2017-03-29 22:51:22 -07:00
Marco
fc83fcb7c4 vp9: SVC: fix to allow output of denoised result.
Change-Id: Iaf55cfb5e9621d074eb33d6a32f184e4777968f8
2017-03-29 14:02:54 -07:00
Marco
32b3d2f174 vp9: 1 pass SVC: Modify condition for intra-mode search.
Temporary override to condition for disallowing intra-search in SVC,
since golden (spatial) reference is currently suppressed due to
artifact issue.

Change-Id: I28ed7fdddc9fcdbcc0a4175a247a3ecc94c11767
2017-03-29 09:24:50 -07:00
James Zern
fe432cacf8 Merge "rate_hist: add parameter validation" 2017-03-29 02:41:07 +00:00
James Zern
ddd75573a7 Merge changes from topic 'sync-highbd-intrapred'
* changes:
  intrapred: sync highbd_d135_predictor w/d135_
  intrapred: specialize highbd 4x4 predictors
  intrapred: rename d63f to d63e
  remove CONFIG_MISC_FIXES
2017-03-29 02:39:21 +00:00
James Zern
3cc57e67a9 Merge ".mailmap: add an additional entry for Yaowu Xu" 2017-03-29 01:56:57 +00:00
Johann Koenig
eec92e8a5b Merge "vpx_comp_avg_pred: add test" 2017-03-28 21:50:01 +00:00
Johann
6e99ed72a5 vpx_comp_avg_pred: add test
BUG=webm:1389

Change-Id: I23cd65f1939db026958ccb5d70b8c5cc9aa5bc51
2017-03-28 14:11:14 -07:00
Marco
0169a985d9 vp9: Speed >= 8: avoid chrome check under some condition.
For non-rd variance partition, avoid the chrome check
unless y_sad is below some threshold.

Small decrease in avgPSNR (~0.3) on RTC set.
Small/negligible decrease on RTC_derf.

Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
2017-03-27 13:18:21 -07:00
Marco
66c6b4d6fc vp9: 1 pass: Move source sad computation into encodeframe loop.
Refactor to split the 1 passs source sad computation into scene
detection (currently used for VBR and screen-content mode), and
superblock based source sad computation (used in non-rd CBR mode).

This allows the source sad computation for CBR mode to be
multi-threaded.

No change in compression.

Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
2017-03-27 11:11:05 -07:00
James Zern
aefc1088a2 intrapred: sync highbd_d135_predictor w/d135_
previously:
05437805f intrapred/d135: flatten border results before storing

BUG=webm:1316

Change-Id: I3b8bd89117ad7f2f4560b57f7c148da781e86f85
2017-03-24 20:45:44 -07:00
James Zern
67cde46dd7 intrapred: specialize highbd 4x4 predictors
d207/d63/d45/d117/d135/d153

~9-45% better depending on the predictor on 32-bit ARM, similar range on
x86-64

this matches the non-highbitdepth implementation

BUG=webm:1316

Change-Id: Iddebdf7c58c6f31c47cae04da95c6e5318200e4c
2017-03-24 20:45:36 -07:00
James Zern
e05f4cf8f4 intrapred: rename d63f to d63e
this is consistent with he/ve/d45e

Change-Id: I75641ae5667430b0ecd370db86fff6e666cb577d
2017-03-24 20:41:39 -07:00
James Zern
d45617c702 remove CONFIG_MISC_FIXES
this belonged to vp10 with the changes now migrated to av1.

Change-Id: Ie30ead3e7b71f465bc14136e1b6f156ea978c43f
2017-03-24 20:41:39 -07:00
Marco
07ad5a15c2 vp9: Fix to condition on using source_sad for 1 pass real-time.
Make the source_sad feature work properly for cases of VBR or
screen_content with SVC.

Added unittest for SVC with screen-content on.

Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
2017-03-24 10:21:47 -07:00
James Zern
4ffdf60b85 rate_hist: add parameter validation
tolerate a NULL hist being passed as a result of invalid parameters
passed to init_rate_histogram(). this fixes a divide by zero in
init_rate_histogram() with an invalid fps.

BUG=webm:1383

Change-Id: Id203e0f3b18d67a4a09aaf206abcce4708f966ec
2017-03-23 14:50:00 -07:00
Johann Koenig
10164407fb Merge "vp9 temporal filter: additional test" 2017-03-23 18:38:36 +00:00
Alex Converse
d7b220b467 Merge changes Ie989e60c,Ifc110b12
* changes:
  Backport "Optimize the use case of token_cost table" to VP9
  Drop vp9_get_token_extracost
2017-03-23 18:05:13 +00:00
Marco Paniconi
d856157388 Merge "vp9: Non-rd partition: avoid unneeded call to chrome_check" 2017-03-23 14:17:48 +00:00
Kaustubh Raste
8ee9b855a0 Merge "Fix mips msa fwd xform mismatch" 2017-03-23 07:44:16 +00:00
Marco
4863e07c01 vp9: Non-rd partition: avoid unneeded call to chrome_check
Since y_sad is not computed yet (on the early exit due to source_sad),
no need to check for setting color_sensitiviy.

Only affects speed >=8. No change in behavior.

Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
2017-03-22 22:40:28 -07:00
James Zern
f16ea6a6eb Merge "vp9_rdopt: correct size to vpx_sum_squares_2d_i16" 2017-03-23 00:53:22 +00:00
Marco Paniconi
ff0e0a76e8 Merge "vp9: Adjust some speed settings for speed 8." 2017-03-22 22:56:17 +00:00
Marco
4d50991320 vp9: Adjust some speed settings for speed 8.
Allow for simple_block_rd for VGA resoln, and reduce
adaptive_rd_thresh to 1.

On average no loss on RTC set, ~4% speedup on mac.

Change-Id: Ib549c4061c853776062b5e34040f839d470fbebc
2017-03-22 15:16:15 -07:00
Jerome Jiang
dcd6c87b80 Merge "vp9: Enable adaptive_rd_threshold for row mt for realtime speed 8." 2017-03-22 22:02:24 +00:00
Johann
83dd9b36f4 vp9 temporal filter: additional test
Change tests to reflect use. Input sizes will be 8 or 16 (but not
necessarily square).

filter_weight is capped at 2 and filter_strength at 6

Speed test, disabled by default.

Change-Id: Idfde9d6c4b7d93aaf0e641b0f4862c15e2a2af7a
2017-03-22 19:37:04 +00:00
Johann Koenig
c099d6be1c Merge "vp9 temporal filter: add const to function prototype" 2017-03-22 19:36:40 +00:00
James Zern
e097bb1d39 Merge "idct_neon: prefix non-static functions w/'vpx_'" 2017-03-22 19:30:11 +00:00
James Zern
5661cd8ff4 vp9_rdopt: correct size to vpx_sum_squares_2d_i16
the current implementations expect pixel size, not the block type

BUG=webm:1392

Change-Id: Ib91e9f30a1f56e13566b1fb76f089dae9bb50cdc
2017-03-22 12:04:33 -07:00
James Zern
f91c3bb3ab idct_neon: prefix non-static functions w/'vpx_'
Change-Id: I94fcdeae18468e6ef0cb7119b8142d982a048031
2017-03-22 11:49:23 -07:00
Johann
36d732c22b vp9 temporal filter: add const to function prototype
The input frames are not modified.

Change-Id: Ideb810e3c5afeb4dbdc4c7d54024c43a8129ad39
2017-03-22 18:14:21 +00:00
Kaustubh Raste
e45c1f55b4 Fix mips msa fwd xform mismatch
Change-Id: I32a6df11463144aa1a562256ee7d57a41fd678d6
2017-03-22 14:01:03 +05:30
Jerome Jiang
20c2892693 vp9: Enable adaptive_rd_threshold for row mt for realtime speed 8.
Change it to row based array to avoid the slow down cause by sync.
row-mt on, speed 8, 2 threads: ~4% speedup for VGA on ARM benefited
from adaptive_rd_threshold.

Change-Id: I887e65a53af20a6c4f48d293daaee09dab3512cf
2017-03-21 18:49:47 -07:00
Marco Paniconi
2fac50fa0e Merge "vp9: Modify datarate tests to cover denoising with multi-threading." 2017-03-21 23:44:05 +00:00
Jerome Jiang
74f4c5cd12 Merge "Fix the data race caused by vp9 denoiser." 2017-03-21 23:27:48 +00:00
Marco
4ddde47d8c vp9: Modify datarate tests to cover denoising with multi-threading.
Change-Id: I6ed48a630edf9923c25a05deaca50e0afec43918
2017-03-21 15:57:33 -07:00
Jerome Jiang
dbed479d79 Fix the data race caused by vp9 denoiser.
BUG=webm:1391

Change-Id: I9669ae62fe9c695d4c6f9973094cb0f39bed51c7
2017-03-21 15:46:25 -07:00
Yi Luo
cb9b277b2f Merge "Make butterfly_self() signature consistent with butterfly()" 2017-03-21 22:32:20 +00:00
Yunqing Wang
1935dfb294 Code refactoring in the partition search
Computed the partition search early termination score in a separate
function.

Change-Id: I1894b517ff179a38b1c05e054d373ac4b7f4cbb4
2017-03-21 10:00:44 -07:00
Yi Luo
266868a40b Make butterfly_self() signature consistent with butterfly()
- Refer to patch: 48fca113d inv_txfm_ssse3,butterfly: fix win32 abi
  compatibility.
- Change four butterfly() calls to butterfly_self(), to simplify the
  operations.

Change-Id: Ib2a8cfe6cddcaf0a59e6e6270d8380055ea42ef3
2017-03-21 09:36:35 -07:00
James Zern
6f13620761 .mailmap: add an additional entry for Yaowu Xu
Change-Id: I26b928848a7e72ff5ca25001ed6991c95f5992a5
2017-03-20 22:39:40 -07:00
James Zern
e0b4c4d1ae Merge "Add vpx_highbd_idct32x32_1024_add_neon()" 2017-03-21 03:27:35 +00:00
James Zern
6d71d33d55 Merge "Add vpx_highbd_idct32x32_34_add_neon()" 2017-03-21 03:02:51 +00:00
Marco Paniconi
05c7259525 Merge "vp9: Nonrd variance partition: improve split to 16x16." 2017-03-21 00:17:35 +00:00
Yunqing Wang
bf43b4c4b4 Merge "Record the sum of tx block eobs in the partition block" 2017-03-20 23:20:12 +00:00
Marco
3135b85423 vp9: Nonrd variance partition: improve split to 16x16.
Add additional condition to split to 16x16, for resolutions <= 360p,
reduces dragging artifact near moving boundary.

Small/no change on RTC metrics.

Change-Id: I314694f2166435d918f74e7ab42f002b07f40dae
2017-03-20 15:44:46 -07:00
Marco Paniconi
8dc1523057 Merge "vp9: Use sb content measure to bias against golden." 2017-03-20 21:35:12 +00:00
Marco
06c8713e89 vp9: Use sb content measure to bias against golden.
For each superblock, keep track of how far from current frame
was the last significant content change, and use that (along
with GF distance), to turnoff GF search in non-rd pickmode.

Only enabled for speed >= 8.

avgPNSR on RTC/RTC_derf down by ~0.9/1.2.
Speedup on mac: ~3-5%.
Speedup on arm: 3.6% for VGA and 4.4% for HD.

Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
2017-03-20 12:42:26 -07:00
Johann Koenig
d642dd4311 Merge "temporal filter test: update types" 2017-03-20 19:05:55 +00:00
Yunqing Wang
9c2552a1c1 Record the sum of tx block eobs in the partition block
The sum of tx bloxk eobs is needed in the machine learning based partition
early termination. The eobs are first accumulated during tx search, and
then the value associated with the best tx_size is copied to ctx for later
use.

After the sum of eobs are calculated correctly, re-enabled
ml_partition_search_early_termination speed feature.

Re-did the quality/speed test to check the impact of the fix.

1. Borg test BDRATE result:
4k set:     PSNR: +0.183%; SSIM: +0.100%;
hdres set:  PSNR: +0.168%; SSIM: +0.256%;
midres set: PSNR: +0.186%; SSIM: +0.326%;

2.Average speed gain result:
4k clips: 21%;
hd clips: 26%;
midres clips: 15%.

The result is in line with the original result.

Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
2017-03-20 17:12:15 +00:00
Jingning Han
ca9bedd538 Backport "Optimize the use case of token_cost table" to VP9
cherry picked from nextgenv2 90ea281f29

Change-Id: Ie989e60c6479ac3251cadaac9c7e795ccba52f4e
2017-03-17 16:54:22 -07:00
Alex Converse
ab71181545 Drop vp9_get_token_extracost
vp9_get_token_cost does the same thing with one fewer lookup.

Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
2017-03-17 16:53:09 -07:00
James Zern
36533e8c5a Merge "inv_txfm_sse2: clear conversion warning in hbd build" 2017-03-17 21:48:20 +00:00
Johann
775569473d temporal filter test: update types
Use 'int' for w/h since it is that way everywhere else.

Pass Buffer pointers

Change-Id: I9eef6890af657baba171c6bcfcc85fc976173399
2017-03-17 13:22:28 -07:00
Johann Koenig
9675affae0 Merge "test: add vp9_temporal_filter_apply test" 2017-03-17 18:18:06 +00:00
Alex Converse
0842daa24e Merge "vp9_optimize_b: Combine extrabits cost with token lookup" 2017-03-17 16:18:21 +00:00
James Zern
5da2e500d7 inv_txfm_sse2: clear conversion warning in hbd build
tran_high -> tran_low in return from dct_const_round_shift()

Change-Id: I2fe06c4b604823b1d1fe40a487017c3c2819a440
2017-03-17 01:16:38 -07:00
Linfeng Zhang
27530d484e Add vpx_highbd_idct32x32_1024_add_neon()
BUG=webm:1301

Change-Id: Ib90af0c1712e56b301d0e981dbe9a641e15e36ca
2017-03-17 00:27:46 -07:00
Linfeng Zhang
50b13f75b8 Add vpx_highbd_idct32x32_34_add_neon()
BUG=webm:1301

Change-Id: I74dd16c6c64e7bb71aa991cedccddf0663ef5e06
2017-03-17 00:27:46 -07:00
James Zern
2882778310 Merge "Add vpx_highbd_idct32x32_135_add_neon()" 2017-03-17 07:26:52 +00:00
Linfeng Zhang
65e9fb65e8 Add vpx_highbd_idct32x32_135_add_neon()
BUG=webm:1301

Change-Id: I58c2d65d385080711c3666d6d8f9d241dac7b21a
2017-03-16 22:37:55 -07:00
James Zern
68efc64b72 Merge "Clean vpx_idct32x32_1024_add_neon()" 2017-03-17 05:24:58 +00:00
Marco
02975a604c vp9: Fix speed 8 condition for enabling copy_partition.
Change-Id: I2c090e6ba853a30fef1957b620853315f9471753
2017-03-16 17:08:37 -07:00
Alex Converse
3a6ec9ea72 vp9_optimize_b: Combine extrabits cost with token lookup
About 0.6% fewer cycles spent in vp9_optimize_b.

Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
2017-03-16 17:03:22 -07:00
Gabriel Marin
976ddb61d3 Add a vector form of routine vp9_model_rd_from_var_lapndz
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb
to model the rate and distortion for MAX_MB_PLANE Laplacian sources in
parallel. The caller ensures that all sources have non-zero variance.

Measured a 18% to 25% reduction in retired instructions, and 17% to 24%
reduction in instruction execution cost with different compilers for the
Laplacian modeling.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
2017-03-16 22:19:44 +00:00
Marco Paniconi
83ba1880bf Merge "vp9: Fixes in non-rd pickmode for denoising with SVC." 2017-03-16 21:53:38 +00:00
Johann Koenig
eeeb71ed97 Merge "Remove ppc-linux-gcc target" 2017-03-16 21:53:17 +00:00
Johann Koenig
cd3d7cf4ac Merge "Add Hadamard for Power8" 2017-03-16 21:52:15 +00:00
Marco
bc7d4935bb vp9: Fixes in non-rd pickmode for denoising with SVC.
Don't denoise spatial layer frames whose base layer is a key frame.

Disallow golden reference for SVC with denoising on frames
that will be denoised (highest layer), as this removes bad artifact.
Will re-enable when issue is resolved.

Change-Id: I87a6597812330500966458172acfce54af65f70f
2017-03-16 12:59:41 -07:00
Marco
ba8bfaafa7 vpx_codec.h: include vpx/*.h -> ./*.h
This matches the other includes and also fixes a compile issue in
chromium.

Change-Id: I45e00a1454f7ed948aa3b96b04cc5946b1d02985
2017-03-16 16:55:56 +00:00
Jerome Jiang
bf40776aa4 Merge "Refactor: Change cpi->resize_state to enum values." 2017-03-16 16:43:42 +00:00
Marco Paniconi
ec73bf53a5 Merge "vp8: Fix compiler warning in vp8 pickinter.c" 2017-03-16 05:13:38 +00:00
Rafael de Lucena Valle
405b94c661 Add Hadamard for Power8
Change-Id: I3b4b043c1402b4100653ace4869847e030861b18
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
2017-03-15 23:46:18 -03:00
Marco Paniconi
cd47c1942e Merge "vp9: Fix some issues with denoiser and SVC." 2017-03-16 02:42:55 +00:00
Marco
a340c64a79 vp9: Fix some issues with denoiser and SVC.
Fix the update of the denoiser buffer when the base
spatial layer is a key frame. And allow for better/lower
QP on high spatial layers when their base layer is key frame.

Change-Id: I96b2426f1eaa43b8b8d4c31a68b0c6d68c3024a2
2017-03-15 17:19:17 -07:00
Jerome Jiang
b5f7f7737a Refactor: Change cpi->resize_state to enum values.
Change-Id: Iab1409b0fc1175bc5a14afc4749a08c536c98c41
2017-03-15 17:16:17 -07:00
Marco
2c8430e223 vp9: Turn off ml_partition_search_early_termination.
Fails on nightly ubsan, valgrind tests.
Enabled on commit:6701014

Change-Id: Ied3f5cb38e39cba54ac134f4514107cdfdfce159
2017-03-15 15:00:38 -07:00
Marco
deea4ede59 vp8: Fix compiler warning in vp8 pickinter.c
Change-Id: I0e5714538fe53d885a2201d808846901ae8fc288
2017-03-15 11:50:14 -07:00
Linfeng Zhang
e54231d613 Clean vpx_idct32x32_1024_add_neon()
Change-Id: I05921e16d6a3e4e7e5b00a90624735050a186636
2017-03-15 11:24:31 -07:00
Yi Luo
8440cc4817 Merge "Improve idct32x32_1024_add SSSE3 intrinsics performance" 2017-03-15 02:32:52 +00:00
Linfeng Zhang
d9a9a4ffea Merge "Fix overflow issue in 32x32 idct NEON intrinsics" 2017-03-15 00:38:17 +00:00
Jerome Jiang
27d5a57072 Merge "vp9: Using source sad for speedup for dynamic resizing." 2017-03-15 00:03:52 +00:00
Linfeng Zhang
c756eb01c8 Fix overflow issue in 32x32 idct NEON intrinsics
Similar issue as Change bc1c18e.

The PartialIDctTest.ResultsMatch test on vpx_idct32x32_135_add_neon()
in high bit-depth mode exposes 16-bit overflow in final stage of pass
2, when changing the test number from 1,000 to 1,000,000.

Change to use saturating add/sub for vpx_idct32x32_34_add_neon(),
vpx_idct32x32_135_add_neon and vpx_idct32x32_1024_add_neon() in high
bit-depth mode.

Change-Id: Iaec0e9aeab41a3fdb4e170d7e9b3ad1fda922f6f
2017-03-14 16:59:14 -07:00
Jerome Jiang
2fa7092808 Merge "vp9: Enable row multithreading for SVC in real-time mode." 2017-03-14 23:29:46 +00:00
Jerome Jiang
02463273c9 vp9: Using source sad for speedup for dynamic resizing.
Only for speed >= 7.

Change-Id: I3ac85fbb4023cf7e6f8333806b345b0174382a09
2017-03-14 15:47:19 -07:00
Yi Luo
fedcf83f33 Improve idct32x32_1024_add SSSE3 intrinsics performance
- Function level speed improves ~12%.

Change-Id: I9b7dbddabf08c7d0f6b25264e6074d5ccbe39290
2017-03-14 14:04:08 -07:00
James Zern
1b91f41935 Merge "vp9/encoder: fix segfault on win32 using vs < 2015" 2017-03-14 19:21:42 +00:00
Yunqing Wang
c3e290963d Merge "Apply machine learning-based early termination in VP9 partition search" 2017-03-14 18:07:05 +00:00
Marco Paniconi
78a6946904 Merge "vp9: Speed >= 8: Enable simple_block_yrd speed feature." 2017-03-14 17:50:17 +00:00
Marco
c0c789ab50 vp9: Adjust copy partition threshold, for speed 8.
Reduce it from 5 to 4, small/no change in metrics or speed.
Small reduction in dragging artifact near moving head.

Change-Id: Ic3bc5ca67c70bf0c89fc2ed14454840a28ae5b6a
2017-03-14 09:18:53 -07:00
Marco
c216c8d6f2 vp9: Speed >= 8: Enable simple_block_yrd speed feature.
Enable speed feature for resolutions > VGA.
avgPSNR on RTC down by ~1.7%.
Speedup on ARM: ~5%.

Change-Id: I7a3fe5f7425aa8df3f4a2eced1afa355bc0d4c95
2017-03-14 09:10:28 -07:00
Johann
a14a987c82 test: add vp9_temporal_filter_apply test
Add an independent implementation of the filter.

BUG=webm:1379

Change-Id: I309c459b493c3011273b78b127a786bb23c59f9c
2017-03-13 15:26:26 -07:00
Marco Paniconi
507204316a Merge "vp9: Fix to source_sad feature for SVC." 2017-03-13 19:18:31 +00:00
Linfeng Zhang
b0bfcc368c Merge "Add vpx_highbd_idct32x32_135_add_c()" 2017-03-13 18:49:01 +00:00
Marco
f0a22b23fe vp9: Fix to source_sad feature for SVC.
Allow speed feature sf->use_source_sad to be used
on highest spatial layer for SVC.

Change-Id: I260eb0478902764f49f83e43b17024fe86ff3b22
2017-03-13 11:00:40 -07:00
Yunqing Wang
670101439f Apply machine learning-based early termination in VP9 partition search
This patch was based on Yang Xian's intern project code. Further modifications
were done.
1. Moved machine-learning related parameters into the context structure.
2. Corrected the calculation of sum_eobs.
3. Removed unused parameters and calculations.
4. Made it work with multiple tiles.
5. Added a speed feature for the machine-learning based partition search
early termination.
6. Re-organized the code.

The patch was rebased to the top-of-tree.

Borg test BDRATE result:
4k set:     PSNR: +0.144%; SSIM: +0.043%;
hdres set:  PSNR: +0.149%; SSIM: +0.269%;
midres set: PSNR: +0.127%; SSIM: +0.257%;

Average speed gain result:
4k clips: 22%;
hd clips: 23%;
midres clips: 15%.

Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
2017-03-13 09:54:18 -07:00
Marco Paniconi
b39f7c3364 Merge "vp9: Fix condition for intra search in non-rd pickmode." 2017-03-13 06:11:13 +00:00
Marco
8c18df7fcd vp9: Fix condition for intra search in non-rd pickmode.
Fixes an issue when the LAST and golden is not used as a reference,
in which case its possible no encoding mode is set (since intra may be
skipped under certain codtions). Fix is to make sure intra is searched
if no inter mode is checked.

Issue can happen for temporal layer pattern#7 in vpx_temporal_svc_encoder.c

Change-Id: I5ab4999b2f9dbd739044888e0916b5ec491d966b
2017-03-12 22:30:39 -07:00
James Zern
48fca113d1 inv_txfm_ssse3,butterfly: fix win32 abi compatibility
only the first 3 parameters can be aligned to 16 as required by __m128i,
make them all pointers for consistency.

since:
07c48ccfe Improve idct32x32_34_add SSSE3 intrinsics performance

BUG=webm:1384

Change-Id: I0324f701e723a27cb470036a180693ba8829d01d
2017-03-10 19:57:17 -08:00
James Zern
c09b290cea vp9/encoder: fix segfault on win32 using vs < 2015
shift the bsse[] member of the macroblock struct to the front to avoid
an incorrect offset (0) to the upper half of bsse[0] which leads to a
negative resulting in a crash. restrict this to visual studio versions
before 2015 (the bug was observed with 2013, fixed in 2015) to avoid any
potential cache impact on other platforms.

https://connect.microsoft.com/VisualStudio/feedback/details/2396360/bad-structure-offset-in-32-bit-code

BUG=webm:1054

Change-Id: I40f68a1d421ccc503cc712192263bab4f7dde076
2017-03-10 17:37:17 -08:00
Marco Paniconi
0af189c00d Merge "vp9: Sample encoder vpx_temporal_svc_encoder: enable row-mt" 2017-03-10 18:26:06 +00:00
Marco
169c846575 vp9: Sample encoder vpx_temporal_svc_encoder: enable row-mt
Enable row-mt in the sample encoder vpx_temporal_svc_encoder.c,
under certain condiitons.

Change-Id: Ic103ee81a9d80be5bf6e5778cc21fc3199db909d
2017-03-10 10:11:39 -08:00
Yi Luo
018290a344 Merge "Improve idct32x32_135_add SSSE3 intrinsics performance" 2017-03-10 17:14:30 +00:00
Marco
ffb3c50da1 vp9: Enable row multithreading for SVC in real-time mode.
Enable row-mt for SVC for real-time mode, speed >=5.

Add the controls to the sample encoders, but keep it off for now.
Add the control and enable it for the 1 pass CBR unittests.

For speed 7, 3 layer SVC, 2 threads, row-mt enabled gives about ~5% speedup.

Change-Id: Ie8e77323c17263e3e7a7b9858aec12a3a93ec0c1
2017-03-10 01:01:07 +00:00
Yi Luo
327add990f Improve idct32x32_135_add SSSE3 intrinsics performance
- Split the inv txfm into three parts to avoid stack spillover.
- Function level speed improves ~12%.
- Use function and macro to remove some repeated code.

Change-Id: I14f5f072334fd766808cb52bf648df792e7379ee
2017-03-09 16:17:54 -08:00
Johann Koenig
f951881e8c Merge "ppc: include ppc.h for ppc_simd_caps()" 2017-03-09 23:12:37 +00:00
James Zern
cb60e66085 Merge "move vp9_scale_and_extend_frame_c to vp9_frame_scale.c" 2017-03-09 22:51:08 +00:00
Johann
94655569fe Remove ppc-linux-gcc target
Change-Id: Iec2430966f54e2e5ba79f6bb703f47adde46479f
2017-03-09 11:33:33 -08:00
Johann
ccd23215ed ppc: include ppc.h for ppc_simd_caps()
Change-Id: Idc829eb066cf4e905d062cb9c08424e0f1b7e1a7
2017-03-09 09:26:45 -08:00
James Zern
2f31a16445 move vp9_scale_and_extend_frame_c to vp9_frame_scale.c
this is similar to the x86 configuration and helps mitigate an issue
with a circular dependency between this function and the ssse3 variant
causing an outsized increase in binary size (~300K for chrome)
chrome.dll:
.text 255B000 -> 252B000
.data 7B000 -> 75000
-221184 bytes

BUG=chromium:697956

Change-Id: Ic95b142ecd62dd4f1795788aa27dd8fab59b708c
2017-03-08 21:13:50 -08:00
Marco Paniconi
04aa9e28d5 Merge "vp9: Enable two speed features for SVC real-time mode." 2017-03-09 03:58:14 +00:00
Marco
ea3c817ac2 vp9: Enable two speed features for SVC real-time mode.
Enable short_circuit_low_temp_var and limit_newmv_early_exit
for SVC, 1 pass CBR mode.

Change-Id: I77df2b2c6cc40657bb8ea76e19dfc2fdaad6389e
2017-03-08 16:13:59 -08:00
Marco
97b6a6f037 vp9: Add control to vpx_temporal_svc_encoder for row-mt.
Keep it off as default for now.

Change-Id: Ia2518a8ce96c9735c3fe67215dde25a35e8620af
2017-03-08 16:03:44 -08:00
Jerome Jiang
834f26c3b9 Merge "Shift speed 2 from non-large VP9 tests to large ones." 2017-03-08 23:14:27 +00:00
Johann Koenig
42a1b310e1 Merge "Add support for POWER8/VSX" 2017-03-08 22:38:21 +00:00
Yunqing Wang
6a86492adf Merge "Make the partition search early termination feature to be frame size dependent" 2017-03-08 22:31:30 +00:00
Yunqing Wang
099e9bf1ff Make the partition search early termination feature to be frame size dependent
The 2 thresholds(i.e. partition_search_breakout_dist_thr and
partition_search_breakout_rate_thr) are used as the partition search
early termination speed feature. This refactoring patch made this
feature to be frame size dependent consistently throughout the code.

Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9
2017-03-08 12:56:41 -08:00
Linfeng Zhang
77311e0dff Update vpx_idct32x32_1024_add_neon()
Most are cosmetics changes.
Speed has no change with clang 3.8, and about 5% faster with gcc 4.8.4

Tried the strategy used in 8x8 and 16x16 (which operations' orders are
similar to the C code), though speed gets better with gcc, it's worse
with clang.

Tried to remove store_in_output(), but speed gets worse.

Change-Id: I93c8d284e90836f98962bb23d63a454cd40f776e
2017-03-08 12:39:04 -08:00
Rafael de Lucena Valle
51289302ab Add support for POWER8/VSX
Add ppc, ppc64 and ppc64le on all_platforms and ARCH_LIST

Add VSX flags and check for -mvsx

Define empty setup_rtcd_internal

Add Altivec detection based on:
http://freevec.org/function/altivec_runtime_detection_linux

Detect VSX at runtime when enabled

Change-Id: I304f4d8c5fee0ff19b6483cd2e9cc50d6ddec472
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
2017-03-08 20:28:08 +00:00
Linfeng Zhang
48f5886605 Add vpx_highbd_idct32x32_135_add_c()
When eob is less than or equal to 135 for high-bitdepth 32x32 idct,
call this function.

BUG=webm:1301

Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6
2017-03-08 10:46:33 -08:00
Marco Paniconi
2fa710aa6d Merge "vp9: Fix for denoising with SVC." 2017-03-08 18:26:12 +00:00
Marco
45de35fc58 vp9: Fix for denoising with SVC.
Fix the conditon for getting last_source when denoising is on.
This avoids unneeded scaling in the case of SVC.

No change in quality.

Change-Id: I32c1c2c9085104da51af8535716bcc4d55fb0f42
2017-03-08 09:45:58 -08:00
Linfeng Zhang
c4e5c54d69 cosmetics,dsp/arm/: vpx_idct32x32_{34,135}_add_neon()
No speed changes and disassembly is almost identical.

Change-Id: Id07996237d2607ca6004da5906b7d288b8307e1f
2017-03-08 08:58:32 -08:00
Linfeng Zhang
3cf5c213f1 cosmetics,dsp/arm/: rename a variable
Rename cospi_6_26_14_18N to cospi_6_26N_14_18N for consistency.

Change-Id: I00498b43bb612b368219a489b3adaa41729bf31a
2017-03-08 08:55:41 -08:00
Jerome Jiang
c4c0331f65 Shift speed 2 from non-large VP9 tests to large ones.
This may fix the time out failure of valgrind tests in nightly
since more coverages were added on row-mt.

Change-Id: Id9414e66d1a266602c7495243d9f5cb69e17ccdc
2017-03-07 13:58:11 -08:00
James Bankoski
88a888f022 Merge "tiny_ssim.c : adds y4m support to tiny_ssim." 2017-03-07 18:49:14 +00:00
Jim Bankoski
393d9d0195 tiny_ssim.c : adds y4m support to tiny_ssim.
Change-Id: I7a13b7e3a1e11ddbe4be3009edf03528e1bc7647
2017-03-07 08:37:00 -08:00
James Zern
47cf7c25a2 Merge "vp8_create_decoder_instances: correct pbi[] memset" 2017-03-04 00:47:18 +00:00
Alex Converse
15dac923b9 Merge "Narrow cat6_high_cost tables to uint16_t" 2017-03-03 23:45:39 +00:00
James Zern
9c3c1f3725 vp8_create_decoder_instances: correct pbi[] memset
clear the entire array on error. the size used previously was equal to
the number of elements.

BUG=webm:1364

Change-Id: I2f2e16ed6e867f41d4774a5a8ac9cedaee11ce46
2017-03-03 15:23:32 -08:00
Alex Converse
bcd12de6c3 Narrow cat6_high_cost tables to uint16_t
Saves 2688 bytes of rodata.

Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
2017-03-03 23:09:12 +00:00
Vignesh Venkatasubramanian
9e7140b451 Merge "vp9,realtime: Enable row multithreading for non-rd" 2017-03-03 19:05:52 +00:00
Marco Paniconi
1bb63bf669 Merge "vp9: Speed 8: reduce the adaptive_rd_thresh level." 2017-03-02 22:25:03 +00:00
Marco
b60617f5ff vp9: Speed 8: reduce the adaptive_rd_thresh level.
Reduce the level from 4 to 2.
This gives ~1-2% quality gain on RTC set, with small decreaee in speed (~1-2% on mac).

Change-Id: I7d959731badcee3d45b2f4a08efe378765016a13
2017-03-02 13:34:10 -08:00
Vignesh Venkatasubramanian
453f18040f vp9,realtime: Enable row multithreading for non-rd
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).

Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
2017-03-02 11:03:56 -08:00
Yi Luo
07c48ccfe0 Improve idct32x32_34_add SSSE3 intrinsics performance
- Split the transform into first half and second half.
- Reschedule the instructions to avoid stack spillover.
- Function level speed improves ~16%.

Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
2017-03-01 11:14:48 -08:00
Chrome Cunningham
b71245683b Merge "VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface" 2017-03-01 18:01:14 +00:00
Chris Cunningham
bcd0c49af3 VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface
Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value
is changed as part of this move.

Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH.

Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419
2017-02-28 17:10:34 -08:00
James Zern
8697d14ec8 Revert "Fix for max qindex calculation of a gf interval"
This reverts commit d3db846cc5.

This change causes a large drop in psnr (4-5db) on low framerate
difficult content (tested at 360/480p)

BUG=b/35804225

Change-Id: I8e90012d3b9c8a0cddb062ba93b01b36c0e0c0a0
2017-02-28 16:26:13 -08:00
James Zern
66919e370b vp9_ethread_test,cosmetics: s/new-mt/row-mt/
Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0
2017-02-28 15:13:11 -08:00
James Zern
3ab8a05b37 stress.sh: add vp9_stress_test_row_mt
vp9_stress_test now forces --row-mt=0 to cover both versions

Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2
2017-02-28 15:09:30 -08:00
James Zern
b58a8ccb02 stress.sh: parameterize thread count
Change-Id: Iae45266cea86585f0935af4012335198cf93719f
2017-02-28 15:09:30 -08:00
James Zern
4684d286de stress.sh: add one pass encodes
Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90
2017-02-28 15:09:30 -08:00
Yunqing Wang
3833905ff2 Add a comment in encoder thread test
Added a comment.

Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81
2017-02-28 11:13:09 -08:00
Yunqing Wang
3fa7e5c62c Set row_mt to 0 by default
Set row_mt to 0 for now.

Change-Id: I922536a6d71a765e435daeaf4d932ef14363d19a
2017-02-28 11:00:56 -08:00
Marco
defe094e9e vp9: Fix an issue with setting variance thresholds.
From commit:
https://chromium-review.googlesource.com/c/441393/

On non-segment the set_vbp_thresholds() should be called
again to adjust thresholds based on content_state of superblock.
This was the intended behavior from 441393.

Small change in RTC metrics and speed.

Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
2017-02-27 12:09:51 -08:00
Vignesh Venkatasubramanian
ddfe906be2 vp9_ethread_test: Rename new_mt to row_mt
Rename left over occurences of new_mt.

Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375
2017-02-27 10:50:02 -08:00
Vignesh Venkatasubramanian
5881601488 vp9: Rename new_mt to row_mt
new_mt is a very generic name that will get obsolete soon enough.
Since this is exposed as a codec control, renaming it to row_mt to
signify row level paralellism. Also renaming the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.

Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-27 09:43:26 -08:00
Yunqing Wang
8121f85473 Remove an old leftover comment
Removed an old comment that wasn't true anymore.

Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7
2017-02-24 18:31:21 -08:00
James Zern
47d6f16a04 get_prob(): rationalize int types
promote the unsigned int calculation to uint64_t rather than int64_t for
type consistency

Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
2017-02-24 15:36:52 -08:00
Yunqing Wang
af9002dd16 Merge "Improve VP9 encoder threading test for better coverage" 2017-02-24 23:26:23 +00:00
Yunqing Wang
cc168054a8 Improve VP9 encoder threading test for better coverage
Re-organized the encoder threading tests and grouped tests into
4 parts. Added PSNR checking test to make sure the PSNR variation
is within a small range.

BUG=webm:1376

Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff
2017-02-24 09:48:29 -08:00
Jerome Jiang
e96ab22462 Merge "Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8." 2017-02-24 16:56:33 +00:00
Johann
904b957ae9 consolidate block_error functions
vp9_highbd_block_error_8bit_c was a very simple wrapper around
vp9_block_error_c. The SSE2 implemention was practically identical to
the non-HBD one. It was missing some minor improvements which only
went into the original version.

In quick speed tests, the AVX implementation showed minimal
improvement over SSE2 when it does not detect overflow. However, when
overflow is detected the function is run a second time. The
OperationCheck test seems to trigger this case and reverses any
speed benefits by running ~60% slower. AVX2 on the other hand is
always 30-40% faster.

Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
2017-02-24 05:25:26 +00:00
Johann Koenig
aa911e8b41 Merge "block error sse2: use tran_low_t" 2017-02-24 05:24:34 +00:00
Jerome Jiang
0998a146d4 Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8.
Only works for bitdepth = 8 when compiled with high bitdepth flag.
4x speed ups for handling 1:2 down/upsampling.

Validated manually for:
1) Dynamic resize for a single layer encoding
2) SVC encoding with 3 spatial layers
Results are bitexact with the patch and the speed gain (~4x) in the
scaling was verified.

BUG=webm:1371

Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
2017-02-23 20:40:28 -08:00
Johann
3c16bbb73b block error sse2: use tran_low_t
Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb
2017-02-24 01:33:35 +00:00
Johann Koenig
57e987576f Merge "vp8_fdct4x4 test: fix segfault again" 2017-02-23 07:41:21 +00:00
Marco Paniconi
1d12a125e7 Merge "vp9: 1pass CBR: modify condition for reducing loop filter." 2017-02-23 03:24:26 +00:00
Jerome Jiang
a6b6258284 Merge "vp9: Non-rd pickmode: use simple block_yrd under some conditons." 2017-02-22 23:19:29 +00:00
Marco
84f106f198 vp9: 1pass CBR: modify condition for reducing loop filter.
The reduction showed improvement on RTC when aq-mode=3 is on.
Add that (cyclic refresh enabled) to the condition.

Only affects 1 pass CBR.

Change-Id: I5d0843002d8e31d7c165098a62e7a71146b08664
2017-02-22 15:09:45 -08:00
Marco
7e7d820d5b vp9: Non-rd pickmode: use simple block_yrd under some conditons.
For speed 8 only.
3% speed up for QVGA and 6.3% for VGA on Nexus 6.
~3% avgPSNR decrease on rtc_derf and 2.9% on rtc.

Disabled for now.

Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
2017-02-22 13:22:53 -08:00
Marco Paniconi
0acc270830 Merge "vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0." 2017-02-22 19:52:24 +00:00
Marco
7e79831016 vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0.
This prevent possible reduction of cyclic refresh after key frame.

Change-Id: Idd4e49b69cd95476e7eccfa31b2bd8669569e9e8
2017-02-22 10:50:08 -08:00
Johann
672100a84e vp8_fdct4x4 test: fix segfault again
The output needs to be aligned. Input is read with 'movq' not 'movqda'
so it is not expected to be aligned.

Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4
2017-02-22 18:29:11 +00:00
Jerome Jiang
3d1fa00fce vp9: Only compute y_sad for golden in variance partition for speed < 8.
Only affects speed 8. No obvious quality regression. Systematic speed
ups by ~1% on Nexus 6.

Change-Id: Ia904ca28ea041c3281c532911ec38fb7d7f46a17
2017-02-22 10:19:09 -08:00
Yunqing Wang
66f36f4735 Merge "Refactored the row based multi-threading code" 2017-02-22 16:55:04 +00:00
Jerome Jiang
b1dcaf7f1e Merge "Fix segmentation fault caused by denoiser working with spatial SVC." 2017-02-22 04:44:55 +00:00
Marco
7f2daa74a0 vp9: Incorporate source sum_diff into non-rd partition thresholds.
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.

Small change on metrics for rtc set: less than ~0.1 avgPNSR decrease
on RTC set, for both speed 7 and 8.

Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
2017-02-21 17:22:11 -08:00
Yi Luo
6036a0d24f Following SSSE3 intrinsics functions also work for HBD
- vpx_idct8x8_12_add_ssse3
  vpx_idct8x8_64_add_ssse3
  vpx_idct32x32_34_add_ssse3
  vpx_idct32x32_135_add_ssse3
  vpx_idct32x32_1024_add_ssse3
- turn on unit tests.

Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
2017-02-21 12:37:53 -08:00
Johann Koenig
1e224dcb83 Merge "Drop zbin_ptr and quant_shift_ptr" 2017-02-21 18:16:38 +00:00
Jerome Jiang
0d1e5a21c4 Fix segmentation fault caused by denoiser working with spatial SVC.
Re-enable the affected test.
BUG=webm:1374

Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
2017-02-21 09:38:28 -08:00
Yi Luo
62a332160f Merge "Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests" 2017-02-21 16:36:06 +00:00
Paul Wilkins
4d4231352c Merge "Change to prediction decay calculation." 2017-02-21 09:42:38 +00:00
Marco Paniconi
f091752a9c Merge "vp9: Fix for non-rd pickmode for high-bitdepth build." 2017-02-21 05:37:23 +00:00
Marco
4e1ba35458 vp9: Fix for non-rd pickmode for high-bitdepth build.
Use the simple block_yrd under certain conditions.
The optimization code is completed but the speed is still slower
(~6% on 720p) than the low-bitdepth build.

For now, use the more complex block_yrd under certain conditions
(always use it for speed <= 5, otherwise use it on key frames and for
bsize >= 32x32).

This gives about ~2-3% gain in quality for speed 7 on RTC set
(over high bitdepth build), with about the same encoder fps as the
low bitdepth build.

Change-Id: Ibe92a1945d0bd635f880befb4c815727df62d754
2017-02-20 20:25:36 -08:00
Ranjit Kumar Tulabandu
97d6a4cbd1 Refactored the row based multi-threading code
Modified the code to facilitate bit-match tests in first pass
Added unit-tests to test the row based multi-threading behavior for bit-exactness

Change-Id: Ieaf6a8f935bb1075597e0a3b52d9989c8546d7df
2017-02-20 16:13:45 +05:30
James Zern
bf6fcebfed vp8_fdct4x4_test: align input and output buffers
fixes segfault in 32-bit builds

Change-Id: I5b3cc5a335cb236a6ec4cb11fa8feb54ae0182c7
2017-02-18 13:30:28 -08:00
James Zern
52b3e1a633 datarate_test: disable OnePassCbrSvc2SpatialLayersDenoiserOn
segfaults

BUG=webm:1374

Change-Id: I3790c6cb8a539d13dee6a8225ef09b1575dea26c
2017-02-17 16:23:22 -08:00
Johann Koenig
9cb470eba7 Merge "vp8_short_fdct4x4: verify optimized functions" 2017-02-17 22:11:08 +00:00
Yi Luo
1f8e8e5bf1 Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests
- In SSSE3 optimization, 16-bit addition and subtraction would
  overflow when input coefficient is 16-bit signed extreme values.
- Function-level speed becomes slower (unit ms):
  idct8x8_64: 284 -> 294
  idct8x8_12: 145 -> 158.

BUG=webm:1332

Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b
2017-02-17 14:05:05 -08:00
James Zern
3e7025022e Merge "Add vpx_highbd_idct16x16_10_add_neon()" 2017-02-17 20:29:37 +00:00
paulwilkins
a63adac604 Change to prediction decay calculation.
This change subtracts out low complexity intra regions that are also low
error in the inter domain, in the calculation of the frame prediction decay.
The rationale here his that low complexity regions (such as sky) do not imply
high prediction decay in the same way as high error intra or neutral blocks.

The effect of this is small in most clips but in a few clips it can be > 10%.
(E.g. In to tree)

Change-Id: If67ac23d17fca14285cad2defa464c61c9ea861c
2017-02-17 09:29:24 +00:00
Johann
bf05cd3c99 vp8_short_fdct4x4: verify optimized functions
Change-Id: I7c7f5dfabde65c09f111fb0ced0e3ad231ee716e
2017-02-16 19:34:50 -08:00
Johann
c7342f35c8 tiny_ssim: clean up on failure
Clears up clang static analysis warnings about memory leaks.

Change-Id: I60d4d0f3794735a8b81d9da4a30d19e7a9cba9cf
2017-02-17 03:28:34 +00:00
Yi Luo
f62dcc9c33 Replace idct32x32_1024_add_ssse3 assembly with intrinsics
- Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on
  i7-6700, no obvious user-level speed performance downgrade.
- Passed unit tests.

Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
2017-02-16 16:10:40 -08:00
James Zern
b5bc9ee02d Merge "cosmetics: Fix spelling mistake in compile flag name." 2017-02-17 00:04:42 +00:00
Johann Koenig
a9b81da575 Merge "block error avx2: use tran_low_t" 2017-02-16 23:51:14 +00:00
Linfeng Zhang
0620081731 Add vpx_highbd_idct16x16_10_add_neon()
BUG=webm:1301

Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab
2017-02-16 15:13:50 -08:00
James Zern
0f014c97e5 Merge "Fix mips vpx_post_proc_down_and_across_mb_row_msa function" 2017-02-16 23:02:10 +00:00
James Zern
e9d07c0c2a Merge "disable VP9MultiThreadedFrameParallel tests" 2017-02-16 22:56:02 +00:00
paulwilkins
d218b0914e cosmetics: Fix spelling mistake in compile flag name.
agressive -> aggressive

after:
ce7b38459 Aggressive VBR method.

Change-Id: Ie0f30b1bbc77ed9f32bec047b4a9b3d0cf4853f5
2017-02-16 14:51:31 -08:00
Johann Koenig
06a82af0de Merge "correct bitdepth_conversion_sse2.h header guard" 2017-02-16 21:41:28 +00:00
Johann
ca4e27f5da Drop zbin_ptr and quant_shift_ptr
vp9[_highbd]_quantize]_fp[_32x32] and vp9_fdct8x8_quant do not make use
of these parameters.

scan is used for C code and iscan is used for SIMD implementations.

Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
2017-02-16 13:20:32 -08:00
James Zern
6ab0870d45 disable VP9MultiThreadedFrameParallel tests
these are flaky and cause TSan warnings with clang-3.9.1

BUG=webm:1372

Change-Id: I8a7047552ba2ccd2d8c45f8795818c74562e5990
2017-02-16 12:56:04 -08:00
Johann
6c2d732bf4 correct bitdepth_conversion_sse2.h header guard
Change-Id: Ic4ffd861608e67fe59bcb3a86010ce3ef11a5519
2017-02-16 12:43:33 -08:00
Yi Luo
1cb44945fb Merge "Add idct32x32_135_add SSSE3 intrinsics" 2017-02-16 20:43:29 +00:00
Johann
2104454607 block error avx2: use tran_low_t
Change-Id: Ic5f3a1f569d6f82afeaf4fcd7235374bb460db3c
2017-02-16 12:39:02 -08:00
Johann Koenig
cc43012674 Merge changes I267050a5,Iebade0ef,Id96a8df3
* changes:
  quantize_fp_32x32 highbd ssse3: enable existing function
  quantize_fp highbd ssse3: use tran_low_t for coeff
  quantize_fp highbd sse2: use tran_low_t for coeff
2017-02-16 20:34:48 +00:00
Yi Luo
72a43e2378 Add idct32x32_135_add SSSE3 intrinsics
- Replace the corresponding assembly code.
- No user level speed performance degrade.
- Unit tests passed.

Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
2017-02-16 11:29:34 -08:00
Yunqing Wang
0bf6b51572 Merge "Structured the mode ordering code to avoid redundant memcpy" 2017-02-16 16:22:54 +00:00
Johann
ff37a911ce quantize_fp_32x32 highbd ssse3: enable existing function
This was created as part of the quantize_fp_ssse3 change. Both
functions use the same source file with different macro parameters.

Change-Id: I267050a559426a85955d215aa0aaca270439c5ab
2017-02-16 07:40:56 -08:00
Johann
4682130b60 quantize_fp highbd ssse3: use tran_low_t for coeff
Change-Id: Iebade0efc0efbb0a80a0f3adbef4962e3a2f25e8
2017-02-16 07:40:56 -08:00
Johann
ac3996a6d1 quantize_fp highbd sse2: use tran_low_t for coeff
Change-Id: Id96a8df33354a7987ce890a3d6798c7375ffa4aa
2017-02-16 07:40:55 -08:00
Johann
44600442dc bitdepth conversion: really use num elements
The previous implementation confused bit/bytes/elements. It was using
'32' as the multiplier but that was mistakenly adopted because a 32x32
transform embedded the stride.

Change-Id: Ieeb867a332416b9a40580b5e7c9b20088e9e691a
2017-02-16 15:02:48 +00:00
Ranjit Kumar Tulabandu
5127e58dab Structured the mode ordering code to avoid redundant memcpy
Change-Id: I4f5d6b54018bd1928cd9e5e42619e6f55b334803
2017-02-16 14:12:33 +00:00
Paul Wilkins
60a10116d1 Merge "Disconnect ARF breakout from frame boost." 2017-02-16 10:02:09 +00:00
Paul Wilkins
543ebc900f Merge "Remove unnecessary factor." 2017-02-16 10:01:58 +00:00
Paul Wilkins
9216ba58d8 Merge "Bug in scale_sse_threshold()" 2017-02-16 10:01:46 +00:00
Paul Wilkins
e6c1993f1b Merge "Additional first pass stats." 2017-02-16 09:39:29 +00:00
Kaustubh Raste
fddf66b741 Fix mips vpx_post_proc_down_and_across_mb_row_msa function
Added fix to handle non-multiple of 16 cols case for size 16

Change-Id: If3a6d772d112077c5e0a9be9e612e1148f04338c
2017-02-16 13:17:00 +05:30
Johann Koenig
b63e88e506 Merge "Use 'packssdw' for loading tran_low_t values" 2017-02-16 02:41:00 +00:00
Johann Koenig
61d05c1e67 Merge "vp8_dx_iface: remove unused 'else' condition" 2017-02-16 01:00:45 +00:00
James Zern
cc04ae1565 Merge "vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism" 2017-02-16 00:21:19 +00:00
Marco Paniconi
e6cf741ae6 Merge "vp9: Some code cleanup for aq-mode = 3." 2017-02-15 23:03:27 +00:00
Marco
158b300952 vp9: Some code cleanup for aq-mode = 3.
The weight segment needs to only be computed once per frame,
so remove it from the funciton vp9_cyclic_refresh_rc_bits_per_mb(),
which is called within a loop inside vp9_rc_regulate_q.

Change-Id: Ia0e18b89abb97e42c466d4dbc47700d7f76555db
2017-02-15 14:07:04 -08:00
Jerome Jiang
2865de86ec vpx_temporal_svc_encoder: Expose error resilient control to cmd line.
Change-Id: Ic74a8690b136ffbc370080f70b2d5a6b1572bf63
2017-02-15 21:45:52 +00:00
Linfeng Zhang
d12f25f216 Merge "cosmetics,dsp/inv_txfm.c: reorder functions" 2017-02-15 20:18:23 +00:00
Marco Paniconi
725606a678 Merge "vp9. Use same source_sad threshold for all speeds." 2017-02-15 20:07:19 +00:00
Linfeng Zhang
106c342659 cosmetics,dsp/inv_txfm.c: reorder functions
Change-Id: Ie0f7689ebe230c68eadb22a32b14838c1a7543a6
2017-02-15 11:40:35 -08:00
Linfeng Zhang
d5edf56bb5 Merge "Add vpx_highbd_idct16x16_38_add_neon()" 2017-02-15 19:34:18 +00:00
Marco
f82280820a vp9. Use same source_sad threshold for all speeds.
Only affects real-time mode.

Change-Id: Iba836f110c4da936f5173cc0f54424d5b6121bff
2017-02-15 11:28:26 -08:00
Marco
716c1d5ff5 Vp9: Speed 8 aq-mode=3: Reduce computation in estimating bits per mb.
vp9_compute_qdelta_by_rate has almost 2% overhead in profiling on Nexus 6.
Reduce the calling of that function in speed 8 by estimating the delta-q.
Both rtc and rtc_derf show little/no change in avg psnr/ssim.
Encoding speed is 2~3% faster on Nexus 6.

Change-Id: If25933715783f31104a18a5092ea347b1221b5f5
2017-02-15 09:28:16 -08:00
Linfeng Zhang
81914ce68a Add vpx_highbd_idct16x16_38_add_neon()
BUG=webm:1301

Change-Id: Ic6cd8c1e63e1b7a997cbed221e20fff4c599e0fe
2017-02-15 09:12:02 -08:00
Linfeng Zhang
ccada0636b Merge "Add vpx_highbd_idct16x16_38_add_c()" 2017-02-15 17:06:17 +00:00
paulwilkins
cfc79a357a Disconnect ARF breakout from frame boost.
This small change replaces the frame boost check in the arf group
length break out clause with a test against a prediction decay value.

The boost value is in fact partly dependent on the decay value but
this change means that the per frame boost calculation can be adjusted
without influencing the group length calculation.

The value chosen gives a close match on all the test sets with the previous
code (on average) but it was noted that a lower threshold was slightly better
for 1080P and up and a slightly higher value for small image sizes.

Change-Id: I4d5b9f67d5b17b0d99ea3f796d3d6202fd61ee0c
2017-02-15 10:46:14 +00:00
paulwilkins
b89ba05ab4 Remove unnecessary factor.
Removed unnecessary scaling factor to simplify.

Change-Id: I3fc9c5975a2597e72f1324e09dd586dea1facfa7
2017-02-15 10:45:43 +00:00
paulwilkins
76550dfdc0 Bug in scale_sse_threshold()
The function scale_sse_threshold() returns a threshold scaled
if necessary for use with 10 and 12 bit from an 8 bit baseline.

SSE error values would be expected to rise for the 10 and 12
bit cases where there are more bits of precision.

Hence the threshold used for the test should also be scaled up.

Change-Id: I4009c98b6eecd1bf64c3c38aaa56598e0136b03d
2017-02-15 10:45:03 +00:00
paulwilkins
945ccfee59 Additional first pass stats.
Added counts that split the intra coded blocks into low and high variance.

Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14
2017-02-15 10:44:37 +00:00
Paul Wilkins
7635ee0f37 Merge "Aggressive VBR method." 2017-02-15 10:37:02 +00:00
James Zern
1cd926d665 vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism
replace with an explicit output file prefix that matches the function
name

Change-Id: I7f6a4105adb34327b1099a5fbf132aa8d1ad5b90
2017-02-14 23:44:00 -08:00
Johann Koenig
61927ba4ac Merge "vp9 fdct higbd neon: connect existing highbd calls" 2017-02-15 01:33:00 +00:00
Linfeng Zhang
e07e74fb0f Add vpx_highbd_idct16x16_38_add_c()
When eob is less than or equal to 38 for high-bitdepth 16x16 idct,
call this function.

BUG=webm:1301

Change-Id: I09167f89d29c401f9c36710b0fd2d02644052060
2017-02-14 17:25:52 -08:00
Yunqing Wang
f2c1aea118 Merge "Row based multi-threading of encoding stage" 2017-02-15 00:54:10 +00:00
Ranjit Kumar Tulabandu
71061e9332 Row based multi-threading of encoding stage
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.

Speed tests at speed 1 on STDHD(using 4 tiles) set show that the
average speedups of the encoding pass(second pass in the 2-pass
encoding) is 7% while using 2 threads, 16% while using 4 threads,
85% while using 8 threads, and 116% while using 16 threads.

Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-15 00:49:34 +00:00
Linfeng Zhang
615566aa81 Merge "Replace 14 with DCT_CONST_BITS in idct NEON functions' shifts" 2017-02-15 00:46:29 +00:00
Johann
86fed469ec vp8_dx_iface: remove unused 'else' condition
Clears up static clang analysis warning regarding a dead store.

Change-Id: If4fe7a9a7f94c6e2001d46136944f90712e543b4
2017-02-15 00:05:41 +00:00
Johann
327a02d77e Use 'packssdw' for loading tran_low_t values
This matches bitdepth_conversion_sse2.asm and produces substantially
better assembly. The old way had lots of 'movzwl' and 'shl' and storing
back to memory before loading into an xmm register.

Change-Id: Ib33e35354dfd691a4f8b1e39f4dbcbb14cd5302b
2017-02-14 22:39:49 +00:00
Johann
3e7aa8fda9 vp9 fdct higbd neon: connect existing highbd calls
Change-Id: Ia8f822bd6e70b3911bc433a5a750bfb6f9a3a75c
2017-02-14 22:11:49 +00:00
Johann Koenig
9c2bb7f342 Merge "quantize_fp highbd neon: use tran_low_t for coeff" 2017-02-14 21:28:23 +00:00
Linfeng Zhang
429e652809 Replace 14 with DCT_CONST_BITS in idct NEON functions' shifts
Change-Id: I2a39a3bb87516b04d273bc1c0f4a634e3fb6f0f6
2017-02-14 13:08:41 -08:00
clang-format
4b402746ca apply clang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14 12:45:52 -08:00
James Zern
f670628ca5 .clang-format: update to 3.9.1
Change-Id: Ia51f2201df897651067d09122075953382b59139
2017-02-14 12:39:54 -08:00
Yi Luo
c1a90dc160 Merge "Replace idct32x32_34_add_ssse3 assembly with intrinsics" 2017-02-14 20:13:27 +00:00
Yi Luo
bd86de1ac8 Replace idct32x32_34_add_ssse3 assembly with intrinsics
- No user-level speed performance change.
- Pass unit tests.

Change-Id: Idfc598e00f354265e41f6b3219f4734216c115c6
2017-02-14 10:38:36 -08:00
Johann
2b24aa87d9 quantize_fp highbd neon: use tran_low_t for coeff
Change-Id: I90fd815f15884490ad138f35df575a00d31e8c95
2017-02-14 10:26:10 -08:00
Johann
25301a84a8 vp8 onyx_if: assert divide by zero
Clears up static clang analysis warning regarding divide by zero.

Trying to explain to the compiler how it's impossible to avoid
incrementing num_blocks at least once is difficult.

Change-Id: Ibaae43be572e5cd7a689b440dcd341c17d33443b
2017-02-14 04:27:31 +00:00
Johann Koenig
eeb288d568 Merge "Remove UNINITIALIZED_IS_SAFE" 2017-02-14 03:02:51 +00:00
Linfeng Zhang
de9ae32b93 Merge "Add vpx_highbd_idct16x16_256_add_neon()" 2017-02-14 01:15:34 +00:00
Johann
8a1fb40273 Remove UNINITIALIZED_IS_SAFE
Where clang static analysis or gcc -Wmaybe-uninitialized warns of
uninitialized values, assign 0 to ints, MB_MODE_COUNT to
MB_PREDICTION_MODE, and B_MODE_COUNT to B_PREDICTION_MODE.

Assert that the modes have been changed from the invalid value by
the end of the function.

Change-Id: Ib11e1ffb08f0a6fe4b6c6729dc93b83b1c4b6350
2017-02-14 00:56:08 +00:00
Linfeng Zhang
5ad4159ebb Add vpx_highbd_idct16x16_256_add_neon()
BUG=webm:1301

Change-Id: I6bb755552a39bdd26eef3f449601f6a9766c65ec
2017-02-13 15:50:33 -08:00
Johann Koenig
4526ec7907 Merge "fdct8x8 highbd neon: use tran_low_t for output" 2017-02-13 23:11:30 +00:00
Johann
5ecde212a8 fdct8x8 highbd neon: use tran_low_t for output
Change-Id: I100c4a1955d80bec4d28e82796b3e7f57e84d0ba
2017-02-13 22:16:14 +00:00
Yunqing Wang
318ca07657 The bitstream bit match test in multi-threaded encoder
While the new-mt mode is enabled(namely, allowing to use row-based
multi-threading in encoder), several speed features that adaptively
adjust encoding parameters during encoding would cause mismatch
between single-thread encoded bitstream and multi-thread encoded
bitstream. This patch provides a set_control API to disable these
features, so that the bit match bitstream is obtained in the unit
test.

Change-Id: Ie9868bafdfe196296d1dd29e0dca517f6a9a4d60
2017-02-13 13:02:26 -08:00
Yunqing Wang
e7db593a46 Merge "Minor code style refactoring" 2017-02-13 21:01:41 +00:00
James Zern
45664383f1 Merge "cosmetics,vp9_ratectrl: apply clang-format" 2017-02-13 21:01:18 +00:00
James Zern
7a48bfab47 Merge "vpx_usec_timer_elapsed: use 64-bit math" 2017-02-13 21:00:33 +00:00
Yunqing Wang
f024518387 Minor code style refactoring
Change-Id: I20107693d0a87e08a10520bfb573ff3dcef69fdb
2017-02-13 12:59:01 -08:00
James Zern
3c4ea94210 cosmetics,vp9_ratectrl: apply clang-format
broken since:
c3f095c8b Merge "Fix to avoid abrupt relaxation of max qindex in recode path"
5f21aba4b Fix to avoid abrupt relaxation of max qindex in recode path

the original change pre-dated the addition of .clang-format

Change-Id: If5e399d9a805bcad9147360b13b36fbc8c560a7c
2017-02-13 11:29:39 -08:00
Linfeng Zhang
016933ad48 Add vpx_highbd_idct{16x16,32x32}_1_add_neon()
and update vpx_highbd_idct8x8_1_add_neon()

BUG=webm:1301

Change-Id: I18d1a0cbe98ba822d5194c1b4e13a4c29c5c75f4
2017-02-13 10:25:22 -08:00
paulwilkins
ce7b38459a Aggressive VBR method.
VBR method that allows a wider Q range for the first normal frame
in each ARF group and then centers the min - max range for the rest of
the arf group on the chosen Q value for that first frame.

This allows for quite rapid adjustment of the active Q range even if the
initial estimate is poor.

In some cases where the ARF frames themselves are tending to
undershoot but the normal frames are overshooting this can still give
net undershoot. This can be corrected by allowing a larger Q delta for
arf frames but is usually is a sign that the allocation to the arfs was to
high.

Change-Id: Icec87758925d8f7aeb2dca29aac0ff9496237469
2017-02-13 15:42:11 +00:00
James Zern
91f87e7513 Merge "Add vpx_idct16x16_38_add_neon()" 2017-02-11 03:42:36 +00:00
Marco
22dcfa80aa vp9: Non-rd mode: use simple block_yrd for 8 bit high bitdepth builds
Temporary fix until optimization work for block_yrd is completed.
This essentially reverts back to the state before the change:
https://chromium-review.googlesource.com/c/433821/

Compression loss is about ~5-6% on RTC set.
Speed-up (from using this simple/model-based block_yrd) over the low
bitdepth builds (which uses more complex block_yrd) is ~5% on 720p.

Change-Id: Ie0af9eb0d111e5595f587870c44f08317403b8d8
2017-02-10 10:15:35 -08:00
James Zern
943f9c0356 vpx_usec_timer_elapsed: use 64-bit math
this prevents a rollover when tv_sec is a long:
signed integer overflow: 2776 * 1000000 cannot be represented in type
'long'

Change-Id: I03dc4476ee122b02e2856dad28358a20cf16a9f8
2017-02-09 19:28:59 -08:00
Paul Wilkins
c3f095c8b3 Merge "Fix to avoid abrupt relaxation of max qindex in recode path" 2017-02-09 17:17:55 +00:00
Paul Wilkins
82b88a7fd0 Merge "Fix for max qindex calculation of a gf interval" 2017-02-09 17:17:44 +00:00
Linfeng Zhang
bc1c18e18c Add vpx_idct16x16_38_add_neon()
The RunQuantCheck() test on it exposes 16-bit overflow in stage 7 of
pass 2. Change to use saturating add/sub for both
vpx_idct16x16_38_add_neon() and vpx_idct16x16_256_add_neon() for high
bitdepth.

Change-Id: Ibf4c107a887553a52852cc582e28d38a5a5a2712
2017-02-08 12:15:22 -08:00
Yi Luo
ac04d11abc Replace idct8x8_12_add_ssse3 assembly code with intrinsics
- Performance achieves the same as assembly.
- Unit tests pass.

Change-Id: I6eacfbbd826b3946c724d78fbef7948af6406ccd
2017-02-08 10:07:45 -08:00
Linfeng Zhang
0fefc6873a Merge "Add vpx_idct16x16_38_add_c()" 2017-02-08 17:20:19 +00:00
Johann Koenig
b73f99745b Merge "block_error_fp highbd sse2: use tran_low_t for coeff" 2017-02-07 23:26:10 +00:00
Marco Paniconi
71f5314993 Merge "vp9: Denoiser speed-up: increase partition and ac skip thresholds." 2017-02-07 22:25:00 +00:00
Yunqing Wang
b106abe570 Merge "Row based multi-threading of ARNR filtering stage" 2017-02-07 19:55:41 +00:00
Marco Paniconi
259e835b1b Merge "vp9: Adjust rate_err threshold for setting active_worst factor." 2017-02-07 19:25:47 +00:00
Marco
1a5482d4d8 vp9: Denoiser speed-up: increase partition and ac skip thresholds.
Add factor to increase varianace partition and ac skip thresholds,
under certain conditions (noise level and sum_diff), to increase
denoiser speed.

Change-Id: I7671140ef3598bf5f114a72623d68792bcd7b77b
2017-02-07 10:33:13 -08:00
Linfeng Zhang
cf76ee2cb7 Add vpx_idct16x16_38_add_c()
When eob is less than or equal to 38 for 16x16 idct, call this function.

Change-Id: Ief6f3fb16a49ace3c92cebf4e220bf5bf52a6087
2017-02-07 09:40:51 -08:00
Marco
3c2f076ad0 vp9: Adjust rate_err threshold for setting active_worst factor.
Only affects 1 pass vbr.
Small improvement on ytlive set.

Change-Id: I09a7456fe658fbea82ece1035cf683bd8bd8bd14
2017-02-07 09:38:16 -08:00
Linfeng Zhang
66695533a8 Merge "Update 16x16 8-bit idct NEON intrinsics" 2017-02-07 16:52:40 +00:00
Johann
537949a9df block_error_fp highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Id2ed3ebaaaa6a4b68628c23e08b64ea5f1341761
2017-02-07 15:03:28 +00:00
Ranjit Kumar Tulabandu
91f01a2060 Row based multi-threading of ARNR filtering stage
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
2017-02-07 14:03:19 +05:30
Johann Koenig
85f3a82355 Merge "highbd x86: consolidate tran_low_t conversions" 2017-02-07 02:49:58 +00:00
Jerome Jiang
aa327a1ed4 vp9: speed 8: Tune threshold of ac skip and partitioning.
Threshold for partitioning only affects VGA and lower res.
0.07% quality regression is observed in borg tests on rtc_derf
and 0.2% regression on rtc.
5.6% speed up for low res and 6.8% for VGA on Nexus 6.

Change-Id: If85a2919b48c991de66059c90f32ed06980452be
2017-02-06 16:27:53 -08:00
Johann
641fda79bb highbd x86: consolidate tran_low_t conversions
Create new helper files specifically for converting tran_low_t types.

Change-Id: I7c4c458ef910f3b3d10a3cfbf9df4de7682fd905
2017-02-06 10:43:26 -08:00
Yunqing Wang
dbc5090b5e Merge "Changes to facilitate multi-threading of encoding stage" 2017-02-04 01:02:29 +00:00
Yunqing Wang
2a21b45fdc Fix visual studio build failure
Fixed the following issue.
..\test\vp9_ethread_test.cc(69): warning C4805: '|=' : unsafe mix of type 'bool' and type 'int' in operation [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]
..\test\vp9_ethread_test.cc(69): warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning) [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]

Change-Id: I37f897cf12a0b7500d2fcbac9e4615f08a83fdb4
2017-02-03 08:36:55 -08:00
Jerome Jiang
a16ca80b09 Merge "Add unit tests for vp9_block_error_fp." 2017-02-02 22:20:42 +00:00
Jingning Han
bb40844e32 Merge "Add SSSE3 intrinsic 8x8 inverse 2D-DCT" 2017-02-02 22:18:32 +00:00
Jerome Jiang
0b60d3ffa5 Add unit tests for vp9_block_error_fp.
BUG=webm:1365

Change-Id: I004e5cd7ca331d14b31b7fc3edeee45fce064026
2017-02-02 12:41:51 -08:00
Johann Koenig
8d5d21aaec Merge "Update third_party/googletest to 1.8.0" 2017-02-02 20:15:46 +00:00
Johann
d89b4f5ece Update third_party/googletest to 1.8.0
Change-Id: If61137e28291f2a0911e9260eb58f234e0d8594c
2017-02-02 07:27:11 -08:00
Ranjit Kumar Tulabandu
12ec948490 Changes to facilitate multi-threading of encoding stage
Modified the encoding stage to have row level entry points with relevant
initializations and to access the token information at row level

Change-Id: Ife10e55a7c1a420ee906d711caf75002688d9e39
2017-02-02 14:47:13 +05:30
Kaustubh Raste
5b10674b5c Merge "Add mips msa sum_squares_2d_i16 function" 2017-02-02 08:09:21 +00:00
Johann Koenig
726556dde9 Merge "Remove neon assembly for idct 16x16 and 8x8" 2017-02-02 03:25:31 +00:00
Johann Koenig
ce6318f254 Merge changes I43521ad3,I013659f6
* changes:
  satd highbd neon: use tran_low_t for coeff
  satd highbd sse2: use tran_low_t for coeff
2017-02-02 03:03:58 +00:00
Linfeng Zhang
e4985cf619 Update 16x16 8-bit idct NEON intrinsics
Remove redundant memory accesses.

Change-Id: I8049074bdba5f49eab7e735b2b377423a69cd4c8
2017-02-01 17:04:33 -08:00
Jingning Han
8f95389742 Add SSSE3 intrinsic 8x8 inverse 2D-DCT
The intrinsic version reduces the average cycles from 183 to 175.

Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03
2017-02-01 14:47:53 -08:00
Yunqing Wang
770c6663d6 Merge "Changes to facilitate row based multi-threading of ARNR filtering" 2017-02-01 22:04:15 +00:00
Johann Koenig
dc90501ba3 Merge changes I374dfc08,I7e15192e,Ica414007
* changes:
  hadamard highbd ssse3: use tran_low_t for coeff
  hadamard highbd neon: use tran_low_t for coeff
  hadamard highbd sse2: use tran_low_t for coeff
2017-02-01 21:56:36 +00:00
Ranjit Kumar Tulabandu
359a6796da Changes to facilitate row based multi-threading of ARNR filtering
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
2017-02-01 13:03:52 -08:00
Johann Koenig
5cc0a364ae Merge "vp9_rdopt: declare 'c' closer to use" 2017-02-01 20:55:12 +00:00
Johann
bfd62cdaff vp9_rdopt: declare 'c' closer to use
Clears up static clang analysis warning regarding a dead store. Only
declare 'c' when it will be used.

Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3
2017-02-01 19:58:24 +00:00
Johann Koenig
f60171bb4f Merge "deblock: annotate postproc parameters" 2017-02-01 19:57:29 +00:00
Johann
f8d744d91a satd highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I43521ad32b6c96737a8ef2b8c327f901fd7eaf84
2017-02-01 11:55:47 -08:00
Johann
2ba383474d satd highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I013659f6b9fbf9cc52ab840eae520fe0b5f883fb
2017-02-01 11:55:16 -08:00
Johann
0f751ecee3 hadamard highbd ssse3: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I374dfc08732932382043905f128e928b08cb4f57
2017-02-01 11:51:15 -08:00
Johann
1eb8a718bf hadamard highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I7e15192ead3a3631755b386f102c979f06e26279
2017-02-01 11:50:46 -08:00
Johann
2dac808dd1 hadamard highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Ica414007d8412ceebfffa9e58e8416226a3fe934
2017-02-01 11:46:57 -08:00
Johann Koenig
3bda634576 Merge "quantize ssse3: remove unused pxor" 2017-02-01 19:41:41 +00:00
Jingning Han
a7949f2dd2 Make satd unit test support all bit-depth settings
Turn on satd unit test for c function in both regular and high
bit-depth settings.

Change-Id: I4b0c56addfb84964ede0da3ab760fe0ee640cfd0
2017-01-31 23:21:32 -08:00
Jingning Han
59917dd18e Unify the hadamard transform unit test for bit-depth settings
Unify the 8x8 and 16x16 Hadamard unit test system for both 8-bit
and high bit-depth settings.

Change-Id: I53373c1d43f3ced514ad1e53e03f0fb9b25d9ead
2017-01-31 23:21:32 -08:00
Jingning Han
969957f9f2 Fix real-time compression regression in hbd mode
This commit resolves the compression performance regression in
real-time encoding setting when high bit-depth mode is enabled.

The current solution temporarily disables the SIMD implementations
of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.

The commit makes the coding results bit-wise identical between
regular coding pipeline and high bit-depth at profile 0.

BUG=webm:1365

Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2017-01-31 23:17:09 -08:00
Johann
32f68cc58c deblock: annotate postproc parameters
Clears a clang static analyzer warning where 'cols' is assumed to be
less than 0, preventing the for loop from executing.

The assembly already requires that the size be 8 or 16 (U/V or Y plane)
and cols is a multiple of 8.

Change-Id: Ica4612690ead1638c94cfe56b306e87f8ce644f9
2017-01-31 15:58:57 -08:00
Johann Koenig
9efc42f4f8 Merge "Use Buffer class for post proc tests" 2017-01-31 15:28:28 +00:00
Kaustubh Raste
750e753134 Add mips msa sum_squares_2d_i16 function
average improvement ~4x-5x

Change-Id: I8d91b71d0677009be52b412e4f52b40b98573a53
2017-01-31 12:22:43 +00:00
Kaustubh Raste
df7e1fecc1 Add mips msa vpx_minmax_8x8 function
average improvement ~4x-5x

Change-Id: I83aee9977534fddb8a9b80d31af646c0b6b1a8c3
2017-01-31 10:00:43 +05:30
Kaustubh Raste
280ad35553 Merge "Add mips msa vpx_vector_var function" 2017-01-31 02:34:51 +00:00
Johann
dcfff3ccc8 quantize ssse3: remove unused pxor
Change-Id: Ifa22d77fd530827de0b32ae71810dc2213ab2937
2017-01-30 17:02:57 -08:00
Marco
d47f257484 vp9: Modify bsize condition for using model_rd_large for speed 7.
In non-rd pickmode: Allow speed 7 to also use larger block size in
model_rd. Small change in behavior for speed 7.

Change-Id: I8c5523e424308e8f0bc71b3f6324dec42a464cc8
2017-01-30 11:16:51 -08:00
Yunqing Wang
106c620a23 Merge "Disable multi-threading in first pass for SVC encoding" 2017-01-28 19:29:01 +00:00
Kaustubh Raste
4ce20fb3f4 Add mips msa vpx_vector_var function
average improvement ~4x-5x

Change-Id: I2f63ef83d816052ca8dc42421e7e9d42f7a7af6b
2017-01-28 08:53:20 +00:00
Marco Paniconi
164db8278f Merge "vp9: Fix to pick_filter_level for highbitdepth build." 2017-01-27 22:47:45 +00:00
Jerome Jiang
9b1a2aa82b Merge "Add macOS Sierra support in configure" 2017-01-27 21:15:52 +00:00
Marco
d94d0ed12f vp9: Fix to pick_filter_level for highbitdepth build.
Change-Id: I53b3fa8bfc0a0717eb1b730c29f2b70060b1b1b7
2017-01-27 10:44:07 -08:00
Jerome Jiang
6d38ad4146 Add macOS Sierra support in configure
BUG=webm:1367

Change-Id: I3000b6d9f93ec49ca86d08151348d33d86bf0034
2017-01-27 10:41:38 -08:00
Ranjit Kumar Tulabandu
6985a0f516 Disable multi-threading in first pass for SVC encoding
BUG=webm:1366

Change-Id: I204ef8496884ba7c4debe64f23f50d298b4090c3
2017-01-27 09:41:53 -08:00
Marco Paniconi
ad1aad69fb Merge "vp9: Modify bsize condition for using model_rd_large." 2017-01-27 15:15:36 +00:00
Marco Paniconi
d0495132aa Merge "vp9: Fixes for usage of skin_map for high bit depth." 2017-01-27 15:15:15 +00:00
Marco
b16c77cdc4 vp9: Modify bsize condition for using model_rd_large.
In non-rd pickmode: small change in behavior for speed 6 and 7.
Remove condition on HIGHBITDEPTH flag.

Change-Id: I360a13fcc313d72612fe9b918162ef4bb278cdea
2017-01-26 22:45:27 -08:00
Kaustubh Raste
407fad2356 Add mips msa vpx Integer projection row/col functions
average improvement ~4x-5x

Change-Id: I17c41383250282b39f5ecae0197ef1df7de20801
2017-01-27 11:11:42 +05:30
Kaustubh Raste
c1553f859f Merge "Add mips msa vpx satd function" 2017-01-27 04:08:51 +00:00
Marco
db99840bf6 vp9: Fixes for usage of skin_map for high bit depth.
Also avoid noise_estimation and source_sad if use_highbitdepth is set.

Change-Id: I5fea396b8f8380ea377045d99ba22a52b92daa46
2017-01-26 19:57:59 -08:00
Johann
f380a1658d Use Buffer class for post proc tests
Add Buffer features for:
Setting the buffer to the output of an ACMRandom function.
Copying a buffer.
Comparing two buffers.
Printing two buffers.

Change-Id: Ib53fb602451a3abdcee279ea2b65b51fbc02d3df
2017-01-26 09:50:49 -08:00
Jerome Jiang
eacc3b4ccf Merge "vp9: Refactor copy partitioning to reduce duplication." 2017-01-26 17:46:11 +00:00
Jerome Jiang
fe4791b0d5 vp9: Refactor copy partitioning to reduce duplication.
Change-Id: Ia1b3c118adec5eccbd2900c8e4b9ea6b1e3e9b7c
2017-01-25 17:33:04 -08:00
Yunqing Wang
4d50dc5ab5 Merge "Remove marco MVC in mcomp.c" 2017-01-26 00:32:55 +00:00
Hui Su
37cd112b0f Merge "Fix an overflow warning in optimize_b()" 2017-01-25 22:49:30 +00:00
Marco
3b2d08a93b vp9-denoiser: Modify skip denoising condition for small blocks.
Skip denoising for blocks < 16x16, and for block = 16x16
skip denoising for low noise levels and width > 480 for now.
Allow for some speed-up in denoiser.

Change-Id: Ib46cefe4741962d145fa08775defea3a9c928567
2017-01-25 11:48:09 -08:00
hui su
519b2e48a8 Fix an overflow warning in optimize_b()
BUG=webm:1361

Change-Id: Ib840bf3b39f7b3c8c017d3488a83434e9a0f45f5
2017-01-25 10:54:39 -08:00
Jerome Jiang
70a3652693 Merge "vp9: Adjust threshold for y sad used in copying partition." 2017-01-25 17:54:15 +00:00
Yunqing Wang
a762cef917 Merge "Initialize errorperbit and sabperbit in ARNR filtering" 2017-01-25 16:43:02 +00:00
Yunqing Wang
633dbcb458 Merge "Multi-threading of first pass stats collection" 2017-01-25 16:40:32 +00:00
Jerome Jiang
3a7ad43fb8 vp9: Adjust threshold for y sad used in copying partition.
Visual quality improvement is observed for noisy clips. Little effects
on speed tests on Nexus 6.

Change-Id: Ib38e04002220708c34102de7b5c36e9940775d89
2017-01-24 17:20:05 -08:00
Ranjit Kumar Tulabandu
8b0c11c358 Multi-threading of first pass stats collection
(yunqingwang)
1. Rebased the patch. Incorporated recent first pass changes.
2. Turned on the first pass unit test.

Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2017-01-24 15:48:02 -08:00
Marco
8d0c8c5e6b vp9: Adjust some parameters in aq-mode=3 mode.
Increase the qp-delta, mainly for low resolutions,
excluding case of very low bitrates.

avgPSNR/SSSIM gain of ~3-5% on rtc_derf set.
Small change on rtc set.

Change-Id: Ice03d04bd0340404d1957666ef154fd64fed0606
2017-01-24 14:18:02 -08:00
Jerome Jiang
8e8e2d11bf Merge "vp9: Copy partition using avg_source_sad." 2017-01-24 20:58:09 +00:00
Jerome Jiang
ac1358cd56 vp9: Copy partition using avg_source_sad.
Affecting only speed 8.
Speed tests on Nexus 6 show 4% faster for QVGA and 2.4% faster for VGA.
Little/negligible quality regression observed on both rtc and rtc_derf sets.

Change-Id: I337f301a2db49a568d18ba7623160f7678399ae1
2017-01-24 10:31:22 -08:00
Yunqing Wang
91aa1fae2a Merge "Add the multi-threaded first pass encoder unit test" 2017-01-24 17:14:07 +00:00
Ranjit Kumar Tulabandu
75d2443bf0 Initialize errorperbit and sabperbit in ARNR filtering
(Yunqing)
This patch added the missing initialization in temporal filter.
Borg test BDRate results:
PSNR: -0.019%(lowres); -0.013%(hdres);
SSIM: -0.001%(lowres); -0.010%(hdres).
Other q values gave comparable but no better results.

Change-Id: I7ad0c18b39e6f558342688e2fe1e12fdb133ce9b
2017-01-24 08:58:17 -08:00
Kaustubh Raste
182ea677a0 Add mips msa vpx satd function
average improvement ~4x-5x

Change-Id: If8683d636fe2606d4ca1038e28185bca53bbe244
2017-01-24 10:44:22 +05:30
Jerome Jiang
d82b9f62a9 Merge "vp9: Adjust the threshold to set avg_source_sad_sb flag." 2017-01-24 03:43:12 +00:00
Yunqing Wang
b987bc36af Remove marco MVC in mcomp.c
Removed MVC so that mv_err_cost() is always called while calculating
the mv cost.

Change-Id: I28123e05fbfc2352128e266c985d2ab093940071
2017-01-23 17:03:12 -08:00
Jerome Jiang
40ffa2839f vp9: Adjust the threshold to set avg_source_sad_sb flag.
Affect only speed 8. Small/Negligible regression on rtc set.

Change-Id: I67a6b6b4008a22ed798bd980336d95bb799f64b4
2017-01-23 16:11:28 -08:00
Johann
270fadc135 PartialIDctTest: reduce number of RunQuantCheck iterations
This currently runs 1000 * 1000 = one *million* times which is quite
unnecessary. It's one of the slowest items in Jenkins and takes over an
hour for each of the larger transforms.

Change-Id: I01653b5e610683e1a2d778ec60cf5065562ab8db
2017-01-23 13:32:09 -08:00
Marco
f38ed0c560 vp9: Non-rd pickmode: fix to add ARF mode entries to THR_MODES.
BUG=webm:1359

Change-Id: Ie0c66efa2e19d1ec9c744d14e3fa8f1e6214cdd6
2017-01-23 10:56:29 -08:00
Marco
b71ff28a1a vp9: Small threshold adjustment to unittest BasicRateTargeting444
Due to recent change to speed >=7 from commit:219cdab.

Change-Id: I366e7750ec91119881050ff6c05849504c7959e8
2017-01-21 18:19:45 -08:00
Kaustubh Raste
881bef00c7 Merge "Add mips msa vpx hadamard functions" 2017-01-21 03:16:39 +00:00
Jerome Jiang
f4169936ee Merge "vp9: Add feature to use block source_sad for realtime mode." 2017-01-20 20:35:07 +00:00
Marco
219cdab676 vp9: Add feature to use block source_sad for realtime mode.
Only for speed >= 7, and affects skipping of intra modes.
Threshold is set low for now, needs to be tuned.
Small/no difference in metrics on rtc clips.

Change-Id: If9bdbd43f08d1f80407cdd2e9e5e96780dcd2424
2017-01-20 11:57:02 -08:00
Yunqing Wang
b0d8a75e48 Add the multi-threaded first pass encoder unit test
Added the multi-threaded first pass encoder unit test in VP9. The test is
to check if the new multi-threaded first pass encoder(namely, new-mt = 1)
still generates matching stats. In the unit test, the new-mt mode will be
turned on once the multi-threaded first pass implementation is checked in.

Change-Id: Ic21bb1a55c454f024cfd2b397a4c148cfe638218
2017-01-20 10:06:24 -08:00
James Zern
b608c09781 tools_common.h: add missing ';' in generic branch
missed in:
380a26112 Fix compile warnings for target=armv7-android-gcc

Change-Id: I2820fff00858a19f7dcf6e0fff189d455b7d640f
2017-01-19 15:09:59 -08:00
Johann
13234d3c43 Remove neon assembly for idct 16x16 and 8x8
Tested using test/partial_idct_test.cc:DISABLED_Speed

Both gcc 4.9 and clang 3.8 from the r13 Android NDK offer improvements
using the intrinsics:
<function>    <clang asm> <gcc asm> <clang intrin> <gcc intrin>
idct16x16_256  1720ms      1703ms    1546ms         1554ms
idct16x16_10   1320ms      1247ms     518ms          488ms
idct16x16_1     107ms       108ms      64ms           68ms
idct8x8_64      924ms       931ms     866ms          989ms
idct8x8_12      826ms       824ms     519ms          514ms
idct8x8_1       172ms       166ms     110ms          125ms

idct8x8_64 isn't quite perfect (slight regression with gcc intrinsics)
but as a counter example idct16x16_10 goes from ~1300ms to ~500ms

On a sample clip, clang improved from 48.5 to 49fps and gcc stayed roughly
stable.

BUG=webm:1303

Change-Id: I9d4fd2b41b46ea6174a887b40a82c8e6e4769ed4
2017-01-19 12:27:31 -08:00
Marco
0f9760ab6f vp9: Modify usage of force_skip under low temporal variance in non-rd pickmode.
For short_circuit set to level 1, skip newmv for 64x64 blocks if the
low temporal variance flag is set. Also modify threshold for 64x64 split
in variance partitioning.

Overall speed-up on noisy clips of 2-4%.
Only affect speed >= 7.

Change-Id: I384b3772007e84de6f8707e480d2ddf1fe1f907d
2017-01-19 11:21:15 -08:00
Kaustubh Raste
e0c0e65378 Add mips msa vpx hadamard functions
average improvement ~4x-5x

Change-Id: I167132d894c04fa85dda8dde7906ff9c61b3a65d
2017-01-19 14:44:03 +05:30
Jerome Jiang
ee5b29ae30 vp9: Stop copying partition every a fixed number of frames.
Avoid quality loss when copying partition of superblock with large motions.
Maximum consecutively copied frames can be set (currently 5).

Change-Id: I11c30575514f02194c0f001444cf4021609e5049
2017-01-18 11:23:59 -08:00
Peter Boström
e758f9d457 Merge "Add CSV per-frame stats to vpxdec." 2017-01-18 16:32:34 +00:00
James Zern
70c9b3c668 Merge "vp9_cx_iface,encoder_encode: check validate_img return" 2017-01-18 07:36:53 +00:00
Jerome Jiang
9152d434dc vp9: Disable partition copy when resizing is enabled.
Change-Id: I4fa3262e0f1c4018604c954b020ec5d1e3d1465c
2017-01-17 18:21:31 -08:00
Jerome Jiang
255866419d Merge "vp9: Set low variance flag when partition is copied." 2017-01-17 21:02:52 +00:00
Jerome Jiang
0c65aed099 vp9: Set low variance flag when partition is copied.
Also set the flag to 1 when exit early choosing 64x64 block
such that skipping new mv for golden works in these scenerios.

Change the size of prev_segment_id to the number of superblocks
to save memory.

Borg test shows quality regression of 0.012% on average PSNR
and 0.035% on SSIM.

Change-Id: I5014224c8617d439d35c66ece3fed9ae30b31d23
2017-01-17 11:14:50 -08:00
Johann Koenig
add0587fae Merge "Cygwin x86_64 support." 2017-01-17 17:45:55 +00:00
Moriyoshi Koizumi
34be6057da Cygwin x86_64 support.
This should have been taken into account at 64347a10

Change-Id: Ie8e3ad7cbaab3e5799e04bd50f2639390b0a2428
2017-01-17 09:04:37 -08:00
Peter Boström
a9ae351667 Add per-frame SSIM/PSNR stats to tools/tiny_ssim.
Adds an optional output framestats.csv file that prints comparions
per-frame instead of averaged over the entire clip. It prints
per-channel and combined metrics for SSIM and PSNR.

Change-Id: Id28dfade27bc5775b59a9d83cfe8b37d1d52b686
2017-01-17 10:47:50 -05:00
Ranjit Kumar Tulabandu
5f21aba4b0 Fix to avoid abrupt relaxation of max qindex in recode path
The fix relaxes the max qindex based on the data from previous loop of
coding if output frame size is greater than maximum frame size allowed

Change-Id: Iac1f63ec67559d68766e090a7cbb80b812b2560f
2017-01-16 18:03:27 +05:30
James Zern
c42a281439 vp9_cx_iface,encoder_encode: check validate_img return
before calling vp9_apply_encoding_flags() which may crash if the
resolution was invalid. this is the same change as:
c0523090b vp8e_encode: check validate_config return

BUG=https://bugzilla.mozilla.org/show_bug.cgi?id=1315288

Change-Id: Icd2aab322422e83d3a778fca6d7789e5000239d7
2017-01-13 16:53:03 -08:00
Marco
159cc3b33c vp9: Add speed feature flag for computing average source sad.
If enabled will compute source_sad for every superblock on every frame,
prior to encoding. Off by default, only on for speed=8 when
copy_partition is set.

Change-Id: Iab7903180a23dad369135e8234b7f896f20e1231
2017-01-13 11:52:12 -08:00
Marco Paniconi
f217049dbe Merge "vp9: Adjust threshold for copy partiton, for speed=8." 2017-01-13 19:07:57 +00:00
Marco
47270b6858 vp9: Adjust threshold for copy partiton, for speed=8.
Change-Id: I4799cb2b67d911ee385e6d6992c61633ca77e69d
2017-01-13 10:29:31 -08:00
Jingning Han
b6fe63a505 Merge "Rework 8x8 transpose SSSE3 for avg computation" 2017-01-13 18:25:17 +00:00
Jingning Han
553e9e291f Merge "Rework 8x8 transpose SSSE3 for inverse 2D-DCT" 2017-01-13 18:25:09 +00:00
Peter Boström
a981cb2809 Add CSV per-frame stats to vpxdec.
Used with --framestats=file.csv. Currently prints raw codec QP (not
internal 0-63 range) and bytes per frame.

Change-Id: Ifbb90129c218dda869eaf5b810bad12a32ebd82d
2017-01-13 08:14:49 -05:00
Marco Paniconi
888bb6c133 Merge "vp9: Update threshold for partition copy." 2017-01-13 06:22:53 +00:00
Jerome Jiang
2ff2376fbc vp9: Update threshold for partition copy.
Avoid many visual artifacts. Compression quality is improved by more
than 1%. Encode speed is about 4% for QVGA and 6% for VGA faster on
android.

Change-Id: I4dd0a81429ddf7efdef1e80a191da5fb8de8e8af
2017-01-12 18:48:38 -08:00
Johann
d630cda597 Merge remote-tracking branch 'origin/longtailedduck' 2017-01-12 15:40:14 -08:00
Jingning Han
39fff1bea0 Rework 8x8 transpose SSSE3 for avg computation
Use same transpose process as inv_txfm_sse2 does.

Change-Id: I2db05f0b254628a11f621c4c09abb89501ba6d3c
2017-01-12 15:16:07 -08:00
Jingning Han
f65170ea84 Rework 8x8 transpose SSSE3 for inverse 2D-DCT
Use same transpose process as inv_txfm_sse2 does.

Change-Id: Ic4827825bd174cba57a0a80e19bf458a648e7d94
2017-01-12 15:13:18 -08:00
Peter Boström
297dfd8696 Add decoder getters for the last quantizer.
To be used for frame stats output of vpxdec.

Change-Id: I0739a01bd3635c4b3fedd58f3e27363ce8fb1b1e
2017-01-12 16:16:22 -05:00
Marco Paniconi
baa4a290eb Merge "vp9: Make the denoiser work with spatial SVC." 2017-01-12 17:54:41 +00:00
Johann Koenig
3628975a15 Merge "Create a class for buffers used in tests" 2017-01-12 01:02:58 +00:00
Peter Boström
70ead526f1 Merge "Add Y,U,V channel metrics and unweighted metrics." 2017-01-11 21:05:47 +00:00
Jerome Jiang
6e07bf3634 Merge "vp9: Turn on the partition copy for speed 8. Tune threshold." 2017-01-11 20:50:43 +00:00
Johann Koenig
9f27d1f843 Merge "arm idct16x16: remove extra config guards" 2017-01-11 20:22:27 +00:00
Peter Boström
605ca82e7f Add Y,U,V channel metrics and unweighted metrics.
Renames SSIM to VpxSSIM as an upscaled weighted SSIM metric, then prints
Y, U and V channels unweighted as well as a weighted but not scaled SSIM
score that's 8/1/1 parts Y/U/V (same as VpxSSIM).

Change-Id: Iff800cc8f145314eeb1a9b4af1e11a25bec095ca
2017-01-11 14:51:18 -05:00
Jingning Han
d98bd7cc5f Merge "Rework forward 8x8 2D-DCT ssse3 implementation" 2017-01-11 19:28:39 +00:00
Jerome Jiang
f129e09529 vp9: Turn on the partition copy for speed 8. Tune threshold.
For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.

Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732
2017-01-11 10:48:16 -08:00
Johann
68d0f46ec0 arm idct16x16: remove extra config guards
This file is guarded by HAVE_NEON_ASM in the .mk file now.

Change-Id: I513a621c234aa90ad52e426c8ed494d8a7d4b74a
2017-01-11 10:17:14 -08:00
Johann
6886da7547 Create a class for buffers used in tests
Demonstrate its use with the IDCT test.

Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b
2017-01-11 08:28:39 -08:00
hui su
7a0bfa6ec6 Add "Large" label to VP9 target level tests
Also reduce the number of test frames.

Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08
2017-01-10 17:29:43 -08:00
Marco
7e3a82c384 vp9: Make the denoiser work with spatial SVC.
If enabled denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b
2017-01-10 17:23:58 -08:00
Jingning Han
9a780fa7db Rework forward 8x8 2D-DCT ssse3 implementation
This commit reworks the SSSE3 implementation of the forward 8x8
2D-DCT. It uses a cyclic rotation approach to the temporary xmm
registers. It reduces the average cycles from 158 to 154. The SSE2
version uses 169 cycles.

Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa
2017-01-10 12:50:55 -08:00
Marco
91fc730d83 vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.
When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segement cost in qp setting
on the boosted golden frame.

Better performance on RTC set with gf_cbr_boost on,
for example with gf_cbr_boost=50, gains from ~0.5-3%.

Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460
2017-01-10 09:42:06 -08:00
Jerome Jiang
299ef2f8eb Merge "vp9: Set less aggresive short_circuit_low_temp_var for HD at speed 8." 2017-01-10 00:51:09 +00:00
Jerome Jiang
198b834c97 vp9: Set less aggresive short_circuit_low_temp_var for HD at speed 8.
Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.

Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44
2017-01-09 16:44:07 -08:00
Jerome Jiang
5b1a8ca5e8 Merge "Fix compile warnings for target=armv7-android-gcc" 2017-01-09 23:53:41 +00:00
James Zern
9480da21e8 Merge "Refine 8-bit 16x16 idct NEON intrinsics" 2017-01-09 23:52:29 +00:00
Marco Paniconi
62cce50d55 Merge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used." 2017-01-09 23:30:32 +00:00
Marco
35c4a13eb7 vp9: Fix comment in speed features.
Change-Id: I65d79c06b152922d725bf559adaa508f91cd5766
2017-01-09 13:05:31 -08:00
Marco
bea22782e9 vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.
Avoid the qp-clamping on gf/alt frame if gf_cbr_boost_pct is set.

Change only affect CBR mode when  gf_cbr_boost_pct is set.

Change-Id: I0655ed4f2b047c8ed1ed33a070c17960ad776704
2017-01-09 12:52:50 -08:00
Johann Koenig
371a64bfe7 Merge "postproc: vpx_mbpost_proc_down_neon" 2017-01-09 19:53:15 +00:00
Johann
c23970ec25 postproc: vpx_mbpost_proc_down_neon
This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x

BUG=webm:1320

Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4
2017-01-09 10:21:56 -08:00
Linfeng Zhang
6abdd31555 Refine 8-bit 16x16 idct NEON intrinsics
Speed test shows 25% gain on vpx_idct16x16_256_add_neon(),
and vpx_idct16x16_10_add_neon() got trippled.

Change-Id: If8518d9b6a3efab74031297b8d40cd83c4a49541
2017-01-06 17:52:07 -08:00
Ranjit Kumar Tulabandu
d3db846cc5 Fix for max qindex calculation of a gf interval
Calculation of active_worst_quality of a gf interval is modified
for coherency

BUG=webm:1355

Change-Id: I84cc2b47a8713f102a69419fb33ab020cffa3e71
2017-01-03 10:24:02 -08:00
Jerome Jiang
380a26112c Fix compile warnings for target=armv7-android-gcc
Fix compile warnings about implicit type conversion for
target=armv7-android-gcc in vpxenc.c.

BUG=webm:1348

Change-Id: I9fbabd843512f2a1a09f4bb934cd091e834eed9c
2016-12-22 14:56:20 -08:00
522 changed files with 86132 additions and 55585 deletions

View File

@@ -1,7 +1,7 @@
---
Language: Cpp
# BasedOnStyle: Google
# Generated with clang-format 3.8.1
# Generated with clang-format 4.0.1
AccessModifierOffset: -1
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
@@ -37,6 +37,8 @@ BreakBeforeBinaryOperators: None
BreakBeforeBraces: Attach
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: false
@@ -54,9 +56,12 @@ IncludeCategories:
Priority: 2
- Regex: '.*'
Priority: 3
IncludeIsMainRegex: '([-_](test|unittest))?$'
IndentCaseLabels: true
IndentWidth: 2
IndentWrappedFunctionNames: false
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
@@ -75,6 +80,7 @@ PointerAlignment: Right
ReflowComments: true
SortIncludes: false
SpaceAfterCStyleCast: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeParens: ControlStatements
SpaceInEmptyParentheses: false

View File

@@ -3,6 +3,7 @@ Aex Converse <aconverse@google.com>
Aex Converse <aconverse@google.com> <alex.converse@gmail.com>
Alexis Ballier <aballier@gentoo.org> <alexis.ballier@gmail.com>
Alpha Lam <hclam@google.com> <hclam@chromium.org>
Chris Cunningham <chcunningham@chromium.org>
Daniele Castagna <dcastagna@chromium.org> <dcastagna@google.com>
Deb Mukherjee <debargha@google.com>
Erik Niemeyer <erik.a.niemeyer@intel.com> <erik.a.niemeyer@gmail.com>
@@ -21,17 +22,21 @@ Marco Paniconi <marpan@google.com>
Marco Paniconi <marpan@google.com> <marpan@chromium.org>
Pascal Massimino <pascal.massimino@gmail.com>
Paul Wilkins <paulwilkins@google.com>
Peter Boström <pbos@chromium.org> <pbos@google.com>
Peter de Rivaz <peter.derivaz@gmail.com>
Peter de Rivaz <peter.derivaz@gmail.com> <peter.derivaz@argondesign.com>
Ralph Giles <giles@xiph.org> <giles@entropywave.com>
Ralph Giles <giles@xiph.org> <giles@mozilla.com>
Ronald S. Bultje <rsbultje@gmail.com> <rbultje@google.com>
Sami Pietilä <samipietila@google.com>
Shiyou Yin <yinshiyou-hf@loongson.cn>
Tamar Levy <tamar.levy@intel.com>
Tamar Levy <tamar.levy@intel.com> <levytamar82@gmail.com>
Tero Rintaluoma <teror@google.com> <tero.rintaluoma@on2.com>
Timothy B. Terriberry <tterribe@xiph.org> <tterriberry@mozilla.com>
Tom Finegan <tomfinegan@google.com>
Tom Finegan <tomfinegan@google.com> <tomfinegan@chromium.org>
Urvang Joshi <urvang@google.com> <urvang@chromium.org>
Yaowu Xu <yaowu@google.com> <adam@xuyaowu.com>
Yaowu Xu <yaowu@google.com> <yaowu@xuyaowu.com>
Yaowu Xu <yaowu@google.com> <Yaowu Xu>

17
AUTHORS
View File

@@ -3,13 +3,13 @@
Aaron Watry <awatry@gmail.com>
Abo Talib Mahfoodh <ab.mahfoodh@gmail.com>
Adam Xu <adam@xuyaowu.com>
Adrian Grange <agrange@google.com>
Aex Converse <aconverse@google.com>
Ahmad Sharif <asharif@google.com>
Aleksey Vasenev <margtu-fivt@ya.ru>
Alexander Potapenko <glider@google.com>
Alexander Voronov <avoronov@graphics.cs.msu.ru>
Alexandra Hájková <alexandra.khirnova@gmail.com>
Alexis Ballier <aballier@gentoo.org>
Alok Ahuja <waveletcoeff@gmail.com>
Alpha Lam <hclam@google.com>
@@ -17,6 +17,7 @@ A.Mahfoodh <ab.mahfoodh@gmail.com>
Ami Fischman <fischman@chromium.org>
Andoni Morales Alastruey <ylatuya@gmail.com>
Andres Mejia <mcitadel@gmail.com>
Andrew Lewis <andrewlewis@google.com>
Andrew Russell <anrussell@google.com>
Angie Chiang <angiebird@google.com>
Aron Rosenberg <arosenberg@logitech.com>
@@ -24,7 +25,9 @@ Attila Nagy <attilanagy@google.com>
Brion Vibber <bvibber@wikimedia.org>
changjun.yang <changjun.yang@intel.com>
Charles 'Buck' Krasic <ckrasic@google.com>
Cheng Chen <chengchen@google.com>
chm <chm@rock-chips.com>
Chris Cunningham <chcunningham@chromium.org>
Christian Duvivier <cduvivier@google.com>
Daniele Castagna <dcastagna@chromium.org>
Daniel Kang <ddkang@google.com>
@@ -46,10 +49,12 @@ Geza Lore <gezalore@gmail.com>
Ghislain MARY <ghislainmary2@gmail.com>
Giuseppe Scrivano <gscrivano@gnu.org>
Gordana Cmiljanovic <gordana.cmiljanovic@imgtec.com>
Gregor Jasny <gjasny@gmail.com>
Guillaume Martres <gmartres@google.com>
Guillermo Ballester Valor <gbvalor@gmail.com>
Hangyu Kuang <hkuang@google.com>
Hanno Böck <hanno@hboeck.de>
Han Shen <shenhan@google.com>
Henrik Lundin <hlundin@google.com>
Hui Su <huisu@google.com>
Ivan Krasin <krasin@chromium.org>
@@ -83,6 +88,7 @@ Justin Clift <justin@salasaga.org>
Justin Lebar <justin.lebar@gmail.com>
Kaustubh Raste <kaustubh.raste@imgtec.com>
KO Myung-Hun <komh@chollian.net>
Kyle Siefring <kylesiefring@gmail.com>
Lawrence Velázquez <larryv@macports.org>
Linfeng Zhang <linfengz@google.com>
Lou Quillio <louquillio@google.com>
@@ -101,6 +107,7 @@ Mikhal Shemer <mikhal@google.com>
Min Chen <chenm003@gmail.com>
Minghai Shang <minghai@google.com>
Min Ye <yeemmi@google.com>
Moriyoshi Koizumi <mozo@mozo.jp>
Morton Jonuschat <yabawock@gmail.com>
Nathan E. Egge <negge@mozilla.com>
Nico Weber <thakis@chromium.org>
@@ -111,12 +118,15 @@ Paul Wilkins <paulwilkins@google.com>
Pavol Rusnak <stick@gk2.sk>
Paweł Hajdan <phajdan@google.com>
Pengchong Jin <pengchong@google.com>
Peter Boström <pbos@google.com>
Peter Boström <pbos@chromium.org>
Peter Collingbourne <pcc@chromium.org>
Peter de Rivaz <peter.derivaz@gmail.com>
Philip Jägenstedt <philipj@opera.com>
Priit Laes <plaes@plaes.org>
Rafael Ávila de Espíndola <rafael.espindola@gmail.com>
Rafaël Carré <funman@videolan.org>
Rafael de Lucena Valle <rafaeldelucena@gmail.com>
Rahul Chaudhry <rahulchaudhry@google.com>
Ralph Giles <giles@xiph.org>
Ranjit Kumar Tulabandu <ranjit.tulabandu@ittiam.com>
Rob Bradford <rob@linux.intel.com>
@@ -131,9 +141,11 @@ Sean McGovern <gseanmcg@gmail.com>
Sergey Kolomenkin <kolomenkin@gmail.com>
Sergey Ulanov <sergeyu@chromium.org>
Shimon Doodkin <helpmepro1@gmail.com>
Shiyou Yin <yinshiyou-hf@loongson.cn>
Shunyao Li <shunyaoli@google.com>
Stefan Holmer <holmer@google.com>
Suman Sunkara <sunkaras@google.com>
Sylvestre Ledru <sylvestre@mozilla.com>
Taekhyun Kim <takim@nvidia.com>
Takanori MATSUURA <t.matsuu@gmail.com>
Tamar Levy <tamar.levy@intel.com>
@@ -146,6 +158,7 @@ Tom Finegan <tomfinegan@google.com>
Tristan Matthews <le.businessman@gmail.com>
Urvang Joshi <urvang@google.com>
Vignesh Venkatasubramanian <vigneshv@google.com>
Vlad Tsyrklevich <vtsyrklevich@chromium.org>
Yaowu Xu <yaowu@google.com>
Yi Luo <luoyi@google.com>
Yongzhe Wang <yongzhe@google.com>

View File

@@ -1,3 +1,28 @@
2017-01-04 v1.7.0 "Mandarin Duck"
This release focused on high bit depth performance (10/12 bit) and vp9
encoding improvements.
- Upgrading:
This release is ABI incompatible due to new vp9 encoder features.
Frame parallel decoding for vp9 has been removed.
- Enhancements:
vp9 encoding supports additional threads with --row-mt. This can be greater
than the number of tiles.
Two new vp9 encoder options have been added:
--corpus-complexity
--tune-content=film
Additional tooling for respecting the vp9 "level" profiles has been added.
- Bug fixes:
A variety of fuzzing issues.
vp8 threading fix for ARM.
Codec control VP9_SET_SKIP_LOOP_FILTER fixed.
Reject invalid multi resolution configurations.
2017-01-09 v1.6.1 "Long Tailed Duck"
This release improves upon the VP9 encoder and speeds up the encoding and
decoding processes.

9
README
View File

@@ -1,4 +1,4 @@
README - 9 January 2017
README - 24 January 2018
Welcome to the WebM VP8/VP9 Codec SDK!
@@ -58,10 +58,13 @@ COMPILING THE APPLICATIONS/LIBRARIES:
armv7-win32-vs11
armv7-win32-vs12
armv7-win32-vs14
armv7-win32-vs15
armv7s-darwin-gcc
armv8-linux-gcc
mips32-linux-gcc
mips64-linux-gcc
ppc64-linux-gcc
ppc64le-linux-gcc
sparc-solaris-gcc
x86-android-gcc
x86-darwin8-gcc
@@ -74,6 +77,7 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86-darwin13-gcc
x86-darwin14-gcc
x86-darwin15-gcc
x86-darwin16-gcc
x86-iphonesimulator-gcc
x86-linux-gcc
x86-linux-icc
@@ -84,6 +88,7 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86-win32-vs11
x86-win32-vs12
x86-win32-vs14
x86-win32-vs15
x86_64-android-gcc
x86_64-darwin9-gcc
x86_64-darwin10-gcc
@@ -92,6 +97,7 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86_64-darwin13-gcc
x86_64-darwin14-gcc
x86_64-darwin15-gcc
x86_64-darwin16-gcc
x86_64-iphonesimulator-gcc
x86_64-linux-gcc
x86_64-linux-icc
@@ -101,6 +107,7 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86_64-win64-vs11
x86_64-win64-vs12
x86_64-win64-vs14
x86_64-win64-vs15
generic-gnu
The generic-gnu target, in conjunction with the CROSS environment variable,

View File

@@ -124,6 +124,7 @@ ifeq ($(TOOLCHAIN), x86-os2-gcc)
CFLAGS += -mstackrealign
endif
# x86[_64]
$(BUILD_PFX)%_mmx.c.d: CFLAGS += -mmmx
$(BUILD_PFX)%_mmx.c.o: CFLAGS += -mmmx
$(BUILD_PFX)%_sse2.c.d: CFLAGS += -msse2
@@ -138,6 +139,12 @@ $(BUILD_PFX)%_avx.c.d: CFLAGS += -mavx
$(BUILD_PFX)%_avx.c.o: CFLAGS += -mavx
$(BUILD_PFX)%_avx2.c.d: CFLAGS += -mavx2
$(BUILD_PFX)%_avx2.c.o: CFLAGS += -mavx2
$(BUILD_PFX)%_avx512.c.d: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
$(BUILD_PFX)%_avx512.c.o: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
# POWER
$(BUILD_PFX)%_vsx.c.d: CFLAGS += -maltivec -mvsx
$(BUILD_PFX)%_vsx.c.o: CFLAGS += -maltivec -mvsx
$(BUILD_PFX)%.c.d: %.c
$(if $(quiet),@echo " [DEP] $@")

View File

@@ -403,6 +403,23 @@ check_gcc_machine_option() {
fi
}
# tests for -m$2, -m$3, -m$4... toggling the feature given in $1.
check_gcc_machine_options() {
feature="$1"
shift
flags="-m$1"
shift
for opt in $*; do
flags="$flags -m$opt"
done
if enabled gcc && ! disabled "$feature" && ! check_cflags $flags; then
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-$feature "
else
soft_enable "$feature"
fi
}
write_common_config_banner() {
print_webm_license config.mk "##" ""
echo '# This file automatically generated by configure. Do not edit!' >> config.mk
@@ -674,7 +691,6 @@ check_xcode_minimum_version() {
process_common_toolchain() {
if [ -z "$toolchain" ]; then
gcctarget="${CHOST:-$(gcc -dumpmachine 2> /dev/null)}"
# detect tgt_isa
case "$gcctarget" in
aarch64*)
@@ -697,6 +713,18 @@ process_common_toolchain() {
*sparc*)
tgt_isa=sparc
;;
power*64*-*)
tgt_isa=ppc64
;;
power*)
tgt_isa=ppc
;;
*mips64el*)
tgt_isa=mips64
;;
*mips32el*)
tgt_isa=mips32
;;
esac
# detect tgt_os
@@ -725,9 +753,16 @@ process_common_toolchain() {
tgt_isa=x86_64
tgt_os=darwin15
;;
*darwin16*)
tgt_isa=x86_64
tgt_os=darwin16
;;
x86_64*mingw32*)
tgt_os=win64
;;
x86_64*cygwin*)
tgt_os=win64
;;
*mingw32*|*cygwin*)
[ -z "$tgt_isa" ] && tgt_isa=x86
tgt_os=win32
@@ -775,6 +810,9 @@ process_common_toolchain() {
mips*)
enable_feature mips
;;
ppc*)
enable_feature ppc
;;
esac
# PIC is probably what we want when building shared libs
@@ -843,6 +881,10 @@ process_common_toolchain() {
add_cflags "-mmacosx-version-min=10.11"
add_ldflags "-mmacosx-version-min=10.11"
;;
*-darwin16-*)
add_cflags "-mmacosx-version-min=10.12"
add_ldflags "-mmacosx-version-min=10.12"
;;
*-iphonesimulator-*)
add_cflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
add_ldflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
@@ -1144,10 +1186,20 @@ EOF
fi
fi
if enabled mmi; then
tgt_isa=loongson3a
check_add_ldflags -march=loongson3a
fi
check_add_cflags -march=${tgt_isa}
check_add_asflags -march=${tgt_isa}
check_add_asflags -KPIC
;;
ppc*)
link_with_cc=gcc
setup_gnu_toolchain
check_gcc_machine_option "vsx"
;;
x86*)
case ${tgt_os} in
win*)
@@ -1202,6 +1254,13 @@ EOF
AS=msvs
msvs_arch_dir=x86-msvs
vc_version=${tgt_cc##vs}
case $vc_version in
7|8|9|10|11|12|13|14)
echo "${tgt_cc} does not support avx512, disabling....."
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-avx512 "
soft_disable avx512
;;
esac
case $vc_version in
7|8|9|10)
echo "${tgt_cc} does not support avx/avx2, disabling....."
@@ -1246,9 +1305,18 @@ EOF
elif disabled $ext; then
disable_exts="yes"
else
# use the shortened version for the flag: sse4_1 -> sse4
check_gcc_machine_option ${ext%_*} $ext
if [ "$ext" = "avx512" ]; then
check_gcc_machine_options $ext avx512f avx512cd avx512bw avx512dq avx512vl
else
# use the shortened version for the flag: sse4_1 -> sse4
check_gcc_machine_option ${ext%_*} $ext
fi
fi
# https://bugs.chromium.org/p/webm/issues/detail?id=1464
# The assembly optimizations for vpx_sub_pixel_variance do not link with
# gcc 6.
enabled sse2 && soft_enable pic
done
if enabled external_build; then
@@ -1273,7 +1341,6 @@ EOF
esac
log_echo " using $AS"
fi
[ "${AS##*/}" = nasm ] && add_asflags -Ox
AS_SFX=.asm
case ${tgt_os} in
win32)
@@ -1282,7 +1349,7 @@ EOF
EXE_SFX=.exe
;;
win64)
add_asflags -f x64
add_asflags -f win64
enabled debug && add_asflags -g cv8
EXE_SFX=.exe
;;
@@ -1416,6 +1483,10 @@ EOF
echo "msa optimizations are available only for little endian platforms"
disable_feature msa
fi
if enabled mmi; then
echo "mmi optimizations are available only for little endian platforms"
disable_feature mmi
fi
fi
;;
esac

View File

@@ -25,7 +25,7 @@ files.
Options:
--help Print this message
--out=outfile Redirect output to a file
--ver=version Version (7,8,9,10,11,12,14) of visual studio to generate for
--ver=version Version (7,8,9,10,11,12,14,15) of visual studio to generate for
--target=isa-os-cc Target specifier
EOF
exit 1
@@ -215,7 +215,7 @@ for opt in "$@"; do
;;
--ver=*) vs_ver="$optval"
case $optval in
10|11|12|14)
10|11|12|14|15)
;;
*) die Unrecognized Visual Studio Version in $opt
;;
@@ -240,9 +240,12 @@ case "${vs_ver:-10}" in
12) sln_vers="12.00"
sln_vers_str="Visual Studio 2013"
;;
14) sln_vers="14.00"
14) sln_vers="12.00"
sln_vers_str="Visual Studio 2015"
;;
15) sln_vers="12.00"
sln_vers_str="Visual Studio 2017"
;;
esac
sfx=vcxproj

View File

@@ -34,7 +34,7 @@ Options:
--name=project_name Name of the project (required)
--proj-guid=GUID GUID to use for the project
--module-def=filename File containing export definitions (for DLLs)
--ver=version Version (10,11,12,14) of visual studio to generate for
--ver=version Version (10,11,12,14,15) of visual studio to generate for
--src-path-bare=dir Path to root of source tree
-Ipath/to/include Additional include directories
-DFLAG[=value] Preprocessor macros to define
@@ -168,7 +168,7 @@ for opt in "$@"; do
--ver=*)
vs_ver="$optval"
case "$optval" in
10|11|12|14)
10|11|12|14|15)
;;
*) die Unrecognized Visual Studio Version in $opt
;;
@@ -218,7 +218,7 @@ guid=${guid:-`generate_uuid`}
asm_use_custom_step=false
uses_asm=${uses_asm:-false}
case "${vs_ver:-11}" in
10|11|12|14)
10|11|12|14|15)
asm_use_custom_step=$uses_asm
;;
esac
@@ -347,6 +347,9 @@ generate_vcxproj() {
if [ "$vs_ver" = "14" ]; then
tag_content PlatformToolset v140
fi
if [ "$vs_ver" = "15" ]; then
tag_content PlatformToolset v141
fi
tag_content CharacterSet Unicode
if [ "$config" = "Release" ]; then
tag_content WholeProgramOptimization true

View File

@@ -35,8 +35,8 @@ ARM_TARGETS="arm64-darwin-gcc
armv7s-darwin-gcc"
SIM_TARGETS="x86-iphonesimulator-gcc
x86_64-iphonesimulator-gcc"
OSX_TARGETS="x86-darwin15-gcc
x86_64-darwin15-gcc"
OSX_TARGETS="x86-darwin16-gcc
x86_64-darwin16-gcc"
TARGETS="${ARM_TARGETS} ${SIM_TARGETS}"
# Configures for the target specified by $1, and invokes make with the dist
@@ -271,7 +271,7 @@ cat << EOF
--help: Display this message and exit.
--enable-shared: Build a dynamic framework for use on iOS 8 or later.
--extra-configure-args <args>: Extra args to pass when configuring libvpx.
--macosx: Uses darwin15 targets instead of iphonesimulator targets for x86
--macosx: Uses darwin16 targets instead of iphonesimulator targets for x86
and x86_64. Allows linking to framework when builds target MacOSX
instead of iOS.
--preserve-build-output: Do not delete the build directory.

View File

@@ -1,4 +1,13 @@
#!/usr/bin/env perl
##
## Copyright (c) 2017 The WebM project authors. All Rights Reserved.
##
## Use of this source code is governed by a BSD-style license
## that can be found in the LICENSE file in the root of the source
## tree. An additional intellectual property rights grant can be found
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
no strict 'refs';
use warnings;
@@ -200,6 +209,7 @@ sub filter {
sub common_top() {
my $include_guard = uc($opts{sym})."_H_";
print <<EOF;
// This file is generated. Do not edit.
#ifndef ${include_guard}
#define ${include_guard}
@@ -335,6 +345,36 @@ EOF
common_bottom;
}
sub ppc() {
determine_indirection("c", @ALL_ARCHS);
# Assign the helper variable for each enabled extension
foreach my $opt (@ALL_ARCHS) {
my $opt_uc = uc $opt;
eval "\$have_${opt}=\"flags & HAS_${opt_uc}\"";
}
common_top;
print <<EOF;
#include "vpx_config.h"
#ifdef RTCD_C
#include "vpx_ports/ppc.h"
static void setup_rtcd_internal(void)
{
int flags = ppc_simd_caps();
(void)flags;
EOF
set_function_pointers("c", @ALL_ARCHS);
print <<EOF;
}
#endif
EOF
common_bottom;
}
sub unoptimized() {
determine_indirection "c";
common_top;
@@ -361,10 +401,10 @@ EOF
&require("c");
if ($opts{arch} eq 'x86') {
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2/);
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2 avx512/);
x86;
} elsif ($opts{arch} eq 'x86_64') {
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2/);
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2 avx512/);
@REQUIRES = filter(keys %required ? keys %required : qw/mmx sse sse2/);
&require(@REQUIRES);
x86;
@@ -381,6 +421,10 @@ if ($opts{arch} eq 'x86') {
@ALL_ARCHS = filter("$opts{arch}", qw/msa/);
last;
}
if (/HAVE_MMI=yes/) {
@ALL_ARCHS = filter("$opts{arch}", qw/mmi/);
last;
}
}
close CONFIG_FILE;
mips;
@@ -390,6 +434,9 @@ if ($opts{arch} eq 'x86') {
} elsif ($opts{arch} eq 'armv8' || $opts{arch} eq 'arm64' ) {
@ALL_ARCHS = filter(qw/neon/);
arm;
} elsif ($opts{arch} =~ /^ppc/ ) {
@ALL_ARCHS = filter(qw/vsx/);
ppc;
} else {
unoptimized;
}

View File

@@ -60,6 +60,7 @@ if [ ${bare} ]; then
echo "${changelog_version}${git_version_id}" > $$.tmp
else
cat<<EOF>$$.tmp
// This file is generated. Do not edit.
#define VERSION_MAJOR $major_version
#define VERSION_MINOR $minor_version
#define VERSION_PATCH $patch_version

37
configure vendored
View File

@@ -109,10 +109,13 @@ all_platforms="${all_platforms} armv7-none-rvct" #neon Cortex-A8
all_platforms="${all_platforms} armv7-win32-vs11"
all_platforms="${all_platforms} armv7-win32-vs12"
all_platforms="${all_platforms} armv7-win32-vs14"
all_platforms="${all_platforms} armv7-win32-vs15"
all_platforms="${all_platforms} armv7s-darwin-gcc"
all_platforms="${all_platforms} armv8-linux-gcc"
all_platforms="${all_platforms} mips32-linux-gcc"
all_platforms="${all_platforms} mips64-linux-gcc"
all_platforms="${all_platforms} ppc64-linux-gcc"
all_platforms="${all_platforms} ppc64le-linux-gcc"
all_platforms="${all_platforms} sparc-solaris-gcc"
all_platforms="${all_platforms} x86-android-gcc"
all_platforms="${all_platforms} x86-darwin8-gcc"
@@ -125,6 +128,7 @@ all_platforms="${all_platforms} x86-darwin12-gcc"
all_platforms="${all_platforms} x86-darwin13-gcc"
all_platforms="${all_platforms} x86-darwin14-gcc"
all_platforms="${all_platforms} x86-darwin15-gcc"
all_platforms="${all_platforms} x86-darwin16-gcc"
all_platforms="${all_platforms} x86-iphonesimulator-gcc"
all_platforms="${all_platforms} x86-linux-gcc"
all_platforms="${all_platforms} x86-linux-icc"
@@ -135,6 +139,7 @@ all_platforms="${all_platforms} x86-win32-vs10"
all_platforms="${all_platforms} x86-win32-vs11"
all_platforms="${all_platforms} x86-win32-vs12"
all_platforms="${all_platforms} x86-win32-vs14"
all_platforms="${all_platforms} x86-win32-vs15"
all_platforms="${all_platforms} x86_64-android-gcc"
all_platforms="${all_platforms} x86_64-darwin9-gcc"
all_platforms="${all_platforms} x86_64-darwin10-gcc"
@@ -143,6 +148,7 @@ all_platforms="${all_platforms} x86_64-darwin12-gcc"
all_platforms="${all_platforms} x86_64-darwin13-gcc"
all_platforms="${all_platforms} x86_64-darwin14-gcc"
all_platforms="${all_platforms} x86_64-darwin15-gcc"
all_platforms="${all_platforms} x86_64-darwin16-gcc"
all_platforms="${all_platforms} x86_64-iphonesimulator-gcc"
all_platforms="${all_platforms} x86_64-linux-gcc"
all_platforms="${all_platforms} x86_64-linux-icc"
@@ -152,6 +158,7 @@ all_platforms="${all_platforms} x86_64-win64-vs10"
all_platforms="${all_platforms} x86_64-win64-vs11"
all_platforms="${all_platforms} x86_64-win64-vs12"
all_platforms="${all_platforms} x86_64-win64-vs14"
all_platforms="${all_platforms} x86_64-win64-vs15"
all_platforms="${all_platforms} generic-gnu"
# all_targets is a list of all targets that can be configured
@@ -163,11 +170,14 @@ for t in ${all_targets}; do
[ -f "${source_path}/${t}.mk" ] && enable_feature ${t}
done
if ! diff --version >/dev/null; then
die "diff missing: Try installing diffutils via your package manager."
fi
if ! perl --version >/dev/null; then
die "Perl is required to build"
fi
if [ "`cd \"${source_path}\" && pwd`" != "`pwd`" ]; then
# test to see if source_path already configured
if [ -f "${source_path}/vpx_config.h" ]; then
@@ -223,6 +233,7 @@ ARCH_LIST="
mips
x86
x86_64
ppc
"
ARCH_EXT_LIST_X86="
mmx
@@ -233,7 +244,13 @@ ARCH_EXT_LIST_X86="
sse4_1
avx
avx2
avx512
"
ARCH_EXT_LIST_LOONGSON="
mmi
"
ARCH_EXT_LIST="
neon
neon_asm
@@ -244,6 +261,10 @@ ARCH_EXT_LIST="
mips64
${ARCH_EXT_LIST_X86}
vsx
${ARCH_EXT_LIST_LOONGSON}
"
HAVE_LIST="
${ARCH_EXT_LIST}
@@ -255,7 +276,6 @@ EXPERIMENT_LIST="
spatial_svc
fp_mb_stats
emulate_hardware
misc_fixes
"
CONFIG_LIST="
dependency_tracking
@@ -310,6 +330,7 @@ CONFIG_LIST="
better_hw_compatibility
experimental
size_limit
always_adjust_bpm
${EXPERIMENT_LIST}
"
CMDLINE_SELECT="
@@ -369,6 +390,7 @@ CMDLINE_SELECT="
better_hw_compatibility
vp9_highbitdepth
experimental
always_adjust_bpm
"
process_cmdline() {
@@ -570,6 +592,7 @@ process_toolchain() {
check_add_cflags -Wdeclaration-after-statement
check_add_cflags -Wdisabled-optimization
check_add_cflags -Wfloat-conversion
check_add_cflags -Wparentheses-equality
check_add_cflags -Wpointer-arith
check_add_cflags -Wtype-limits
check_add_cflags -Wcast-qual
@@ -588,11 +611,9 @@ process_toolchain() {
if enabled mips || [ -z "${INLINE}" ]; then
enabled extra_warnings || check_add_cflags -Wno-unused-function
fi
if ! enabled vp9_highbitdepth; then
# Avoid this warning for third_party C++ sources. Some reorganization
# would be needed to apply this only to test/*.cc.
check_cflags -Wshorten-64-to-32 && add_cflags_only -Wshorten-64-to-32
fi
# Avoid this warning for third_party C++ sources. Some reorganization
# would be needed to apply this only to test/*.cc.
check_cflags -Wshorten-64-to-32 && add_cflags_only -Wshorten-64-to-32
fi
if enabled icc; then
@@ -644,7 +665,7 @@ process_toolchain() {
gen_vcproj_cmd=${source_path}/build/make/gen_msvs_vcxproj.sh
enabled werror && gen_vcproj_cmd="${gen_vcproj_cmd} --enable-werror"
all_targets="${all_targets} solution"
INLINE="__forceinline"
INLINE="__inline"
;;
esac

View File

@@ -151,7 +151,7 @@ static void write_ivf_frame_header(FILE *outfile,
if (pkt->kind != VPX_CODEC_CX_FRAME_PKT) return;
pts = pkt->data.frame.pts;
mem_put_le32(header, pkt->data.frame.sz);
mem_put_le32(header, (int)pkt->data.frame.sz);
mem_put_le32(header + 4, pts & 0xFFFFFFFF);
mem_put_le32(header + 8, pts >> 32);
@@ -190,7 +190,7 @@ static void set_temporal_layer_pattern(int num_temporal_layers,
cfg->ts_layer_id[0] = 0;
cfg->ts_layer_id[1] = 1;
// Use 60/40 bit allocation as example.
cfg->ts_target_bitrate[0] = 0.6f * bitrate;
cfg->ts_target_bitrate[0] = (int)(0.6f * bitrate);
cfg->ts_target_bitrate[1] = bitrate;
/* 0=L, 1=GF */
@@ -241,8 +241,8 @@ static void set_temporal_layer_pattern(int num_temporal_layers,
cfg->ts_layer_id[2] = 1;
cfg->ts_layer_id[3] = 2;
// Use 45/20/35 bit allocation as example.
cfg->ts_target_bitrate[0] = 0.45f * bitrate;
cfg->ts_target_bitrate[1] = 0.65f * bitrate;
cfg->ts_target_bitrate[0] = (int)(0.45f * bitrate);
cfg->ts_target_bitrate[1] = (int)(0.65f * bitrate);
cfg->ts_target_bitrate[2] = bitrate;
/* 0=L, 1=GF, 2=ARF */
@@ -294,8 +294,8 @@ int main(int argc, char **argv) {
vpx_codec_err_t res[NUM_ENCODERS];
int i;
long width;
long height;
int width;
int height;
int length_frame;
int frame_avail;
int got_data;
@@ -347,9 +347,9 @@ int main(int argc, char **argv) {
printf("Using %s\n", vpx_codec_iface_name(interface));
width = strtol(argv[1], NULL, 0);
height = strtol(argv[2], NULL, 0);
framerate = strtol(argv[3], NULL, 0);
width = (int)strtol(argv[1], NULL, 0);
height = (int)strtol(argv[2], NULL, 0);
framerate = (int)strtol(argv[3], NULL, 0);
if (width < 16 || width % 2 || height < 16 || height % 2)
die("Invalid resolution: %ldx%ld", width, height);
@@ -371,12 +371,13 @@ int main(int argc, char **argv) {
// Bitrates per spatial layer: overwrite default rates above.
for (i = 0; i < NUM_ENCODERS; i++) {
target_bitrate[i] = strtol(argv[NUM_ENCODERS + 5 + i], NULL, 0);
target_bitrate[i] = (int)strtol(argv[NUM_ENCODERS + 5 + i], NULL, 0);
}
// Temporal layers per spatial layers: overwrite default settings above.
for (i = 0; i < NUM_ENCODERS; i++) {
num_temporal_layers[i] = strtol(argv[2 * NUM_ENCODERS + 5 + i], NULL, 0);
num_temporal_layers[i] =
(int)strtol(argv[2 * NUM_ENCODERS + 5 + i], NULL, 0);
if (num_temporal_layers[i] < 1 || num_temporal_layers[i] > 3)
die("Invalid temporal layers: %d, Must be 1, 2, or 3. \n",
num_temporal_layers);
@@ -391,9 +392,9 @@ int main(int argc, char **argv) {
downsampled_input[i] = fopen(filename, "wb");
}
key_frame_insert = strtol(argv[3 * NUM_ENCODERS + 5], NULL, 0);
key_frame_insert = (int)strtol(argv[3 * NUM_ENCODERS + 5], NULL, 0);
show_psnr = strtol(argv[3 * NUM_ENCODERS + 6], NULL, 0);
show_psnr = (int)strtol(argv[3 * NUM_ENCODERS + 6], NULL, 0);
/* Populate default encoder configuration */
for (i = 0; i < NUM_ENCODERS; i++) {
@@ -469,7 +470,7 @@ int main(int argc, char **argv) {
if (!vpx_img_alloc(&raw[i], VPX_IMG_FMT_I420, cfg[i].g_w, cfg[i].g_h, 32))
die("Failed to allocate image", cfg[i].g_w, cfg[i].g_h);
if (raw[0].stride[VPX_PLANE_Y] == raw[0].d_w)
if (raw[0].stride[VPX_PLANE_Y] == (int)raw[0].d_w)
read_frame_p = read_frame;
else
read_frame_p = read_frame_by_row;
@@ -558,7 +559,8 @@ int main(int argc, char **argv) {
/* Write out down-sampled input. */
length_frame = cfg[i].g_w * cfg[i].g_h * 3 / 2;
if (fwrite(raw[i].planes[0], 1, length_frame,
downsampled_input[NUM_ENCODERS - i - 1]) != length_frame) {
downsampled_input[NUM_ENCODERS - i - 1]) !=
(unsigned int)length_frame) {
return EXIT_FAILURE;
}
}
@@ -619,10 +621,6 @@ int main(int argc, char **argv) {
break;
default: break;
}
printf(pkt[i]->kind == VPX_CODEC_CX_FRAME_PKT &&
(pkt[i]->data.frame.flags & VPX_FRAME_IS_KEY)
? "K"
: "");
fflush(stdout);
}
}
@@ -663,7 +661,6 @@ int main(int argc, char **argv) {
write_ivf_file_header(outfile[i], &cfg[i], frame_cnt - 1);
fclose(outfile[i]);
}
printf("\n");
return EXIT_SUCCESS;
}

View File

@@ -168,7 +168,7 @@ void usage_exit(void) {
static void parse_command_line(int argc, const char **argv_,
AppInput *app_input, SvcContext *svc_ctx,
vpx_codec_enc_cfg_t *enc_cfg) {
struct arg arg = { 0 };
struct arg arg;
char **argv = NULL;
char **argi = NULL;
char **argj = NULL;
@@ -509,7 +509,7 @@ static void printout_rate_control_summary(struct RateControlStats *rc,
}
vpx_codec_err_t parse_superframe_index(const uint8_t *data, size_t data_sz,
uint32_t sizes[8], int *count) {
uint64_t sizes[8], int *count) {
// A chunk ending with a byte matching 0xc0 is an invalid chunk unless
// it is a super frame index. If the last byte of real video compression
// data is 0xc0 the encoder must add a 0 byte. If we have the marker but
@@ -606,9 +606,9 @@ void set_frame_flags_bypass_mode(int sl, int tl, int num_spatial_layers,
}
int main(int argc, const char **argv) {
AppInput app_input = { 0 };
AppInput app_input;
VpxVideoWriter *writer = NULL;
VpxVideoInfo info = { 0 };
VpxVideoInfo info;
vpx_codec_ctx_t codec;
vpx_codec_enc_cfg_t enc_cfg;
SvcContext svc_ctx;
@@ -640,8 +640,9 @@ int main(int argc, const char **argv) {
// Allocate image buffer
#if CONFIG_VP9_HIGHBITDEPTH
if (!vpx_img_alloc(&raw, enc_cfg.g_input_bit_depth == 8 ? VPX_IMG_FMT_I420
: VPX_IMG_FMT_I42016,
if (!vpx_img_alloc(&raw,
enc_cfg.g_input_bit_depth == 8 ? VPX_IMG_FMT_I420
: VPX_IMG_FMT_I42016,
enc_cfg.g_w, enc_cfg.g_h, 32)) {
die("Failed to allocate image %dx%d\n", enc_cfg.g_w, enc_cfg.g_h);
}
@@ -697,12 +698,18 @@ int main(int argc, const char **argv) {
if (svc_ctx.speed != -1)
vpx_codec_control(&codec, VP8E_SET_CPUUSED, svc_ctx.speed);
if (svc_ctx.threads)
if (svc_ctx.threads) {
vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, (svc_ctx.threads >> 1));
if (svc_ctx.threads > 1)
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 1);
else
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 0);
}
if (svc_ctx.speed >= 5 && svc_ctx.aqmode == 1)
vpx_codec_control(&codec, VP9E_SET_AQ_MODE, 3);
if (svc_ctx.speed >= 5)
vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
vpx_codec_control(&codec, VP8E_SET_MAX_INTRA_BITRATE_PCT, 900);
// Encode frames
while (!end_of_stream) {
@@ -763,7 +770,7 @@ int main(int argc, const char **argv) {
SvcInternal_t *const si = (SvcInternal_t *)svc_ctx.internal;
if (cx_pkt->data.frame.sz > 0) {
#if OUTPUT_RC_STATS
uint32_t sizes[8];
uint64_t sizes[8];
int count = 0;
#endif
vpx_video_writer_write_frame(writer, cx_pkt->data.frame.buf,
@@ -775,6 +782,8 @@ int main(int argc, const char **argv) {
vpx_codec_control(&codec, VP9E_GET_SVC_LAYER_ID, &layer_id);
parse_superframe_index(cx_pkt->data.frame.buf,
cx_pkt->data.frame.sz, sizes, &count);
if (enc_cfg.ss_number_layers == 1)
sizes[0] = cx_pkt->data.frame.sz;
// Note computing input_layer_frames here won't account for frame
// drops in rate control stats.
// TODO(marpan): Fix this for non-bypass mode so we can get stats

View File

@@ -26,17 +26,27 @@
#include "../tools_common.h"
#include "../video_writer.h"
#define VP8_ROI_MAP 0
static const char *exec_name;
void usage_exit(void) { exit(EXIT_FAILURE); }
// Denoiser states, for temporal denoising.
enum denoiserState {
kDenoiserOff,
kDenoiserOnYOnly,
kDenoiserOnYUV,
kDenoiserOnYUVAggressive,
kDenoiserOnAdaptive
// Denoiser states for vp8, for temporal denoising.
enum denoiserStateVp8 {
kVp8DenoiserOff,
kVp8DenoiserOnYOnly,
kVp8DenoiserOnYUV,
kVp8DenoiserOnYUVAggressive,
kVp8DenoiserOnAdaptive
};
// Denoiser states for vp9, for temporal denoising.
enum denoiserStateVp9 {
kVp9DenoiserOff,
kVp9DenoiserOnYOnly,
// For SVC: denoise the top two spatial layers.
kVp9DenoiserOnYTwoSpatialLayers
};
static int mode_to_num_layers[13] = { 1, 2, 2, 3, 3, 3, 3, 5, 2, 3, 3, 3, 3 };
@@ -154,6 +164,53 @@ static void printout_rate_control_summary(struct RateControlMetrics *rc,
die("Error: Number of input frames not equal to output! \n");
}
#if VP8_ROI_MAP
static void vp8_set_roi_map(vpx_codec_enc_cfg_t *cfg, vpx_roi_map_t *roi) {
unsigned int i, j;
memset(roi, 0, sizeof(*roi));
// ROI is based on the segments (4 for vp8, 8 for vp9), smallest unit for
// segment is 16x16 for vp8, 8x8 for vp9.
roi->rows = (cfg->g_h + 15) / 16;
roi->cols = (cfg->g_w + 15) / 16;
// Applies delta QP on the segment blocks, varies from -63 to 63.
// Setting to negative means lower QP (better quality).
// Below we set delta_q to the extreme (-63) to show strong effect.
roi->delta_q[0] = 0;
roi->delta_q[1] = -63;
roi->delta_q[2] = 0;
roi->delta_q[3] = 0;
// Applies delta loopfilter strength on the segment blocks, varies from -63 to
// 63. Setting to positive means stronger loopfilter.
roi->delta_lf[0] = 0;
roi->delta_lf[1] = 0;
roi->delta_lf[2] = 0;
roi->delta_lf[3] = 0;
// Applies skip encoding threshold on the segment blocks, varies from 0 to
// UINT_MAX. Larger value means more skipping of encoding is possible.
// This skip threshold only applies on delta frames.
roi->static_threshold[0] = 0;
roi->static_threshold[1] = 0;
roi->static_threshold[2] = 0;
roi->static_threshold[3] = 0;
// Use 2 states: 1 is center square, 0 is the rest.
roi->roi_map =
(uint8_t *)calloc(roi->rows * roi->cols, sizeof(*roi->roi_map));
for (i = 0; i < roi->rows; ++i) {
for (j = 0; j < roi->cols; ++j) {
if (i > (roi->rows >> 2) && i < ((roi->rows * 3) >> 2) &&
j > (roi->cols >> 2) && j < ((roi->cols * 3) >> 2)) {
roi->roi_map[i * roi->cols + j] = 1;
}
}
}
}
#endif
// Temporal scaling parameters:
// NOTE: The 3 prediction frames cannot be used interchangeably due to
// differences in the way they are handled throughout the code. The
@@ -495,6 +552,7 @@ int main(int argc, char **argv) {
vpx_codec_err_t res;
unsigned int width;
unsigned int height;
uint32_t error_resilient = 0;
int speed;
int frame_avail;
int got_data;
@@ -505,16 +563,15 @@ int main(int argc, char **argv) {
int layering_mode = 0;
int layer_flags[VPX_TS_MAX_PERIODICITY] = { 0 };
int flag_periodicity = 1;
#if VPX_ENCODER_ABI_VERSION > (4 + VPX_CODEC_ABI_VERSION)
vpx_svc_layer_id_t layer_id = { 0, 0 };
#else
vpx_svc_layer_id_t layer_id = { 0 };
#if VP8_ROI_MAP
vpx_roi_map_t roi;
#endif
vpx_svc_layer_id_t layer_id = { 0, 0 };
const VpxInterface *encoder = NULL;
FILE *infile = NULL;
struct RateControlMetrics rc;
int64_t cx_time = 0;
const int min_args_base = 12;
const int min_args_base = 13;
#if CONFIG_VP9_HIGHBITDEPTH
vpx_bit_depth_t bit_depth = VPX_BITS_8;
int input_bit_depth = 8;
@@ -531,12 +588,14 @@ int main(int argc, char **argv) {
if (argc < min_args) {
#if CONFIG_VP9_HIGHBITDEPTH
die("Usage: %s <infile> <outfile> <codec_type(vp8/vp9)> <width> <height> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> <threads> <mode> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> "
"<error_resilient> <threads> <mode> "
"<Rate_0> ... <Rate_nlayers-1> <bit-depth> \n",
argv[0]);
#else
die("Usage: %s <infile> <outfile> <codec_type(vp8/vp9)> <width> <height> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> <threads> <mode> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> "
"<error_resilient> <threads> <mode> "
"<Rate_0> ... <Rate_nlayers-1> \n",
argv[0]);
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -553,9 +612,9 @@ int main(int argc, char **argv) {
die("Invalid resolution: %d x %d", width, height);
}
layering_mode = (int)strtol(argv[11], NULL, 0);
layering_mode = (int)strtol(argv[12], NULL, 0);
if (layering_mode < 0 || layering_mode > 13) {
die("Invalid layering mode (0..12) %s", argv[11]);
die("Invalid layering mode (0..12) %s", argv[12]);
}
if (argc != min_args + mode_to_num_layers[layering_mode]) {
@@ -619,11 +678,11 @@ int main(int argc, char **argv) {
for (i = min_args_base;
(int)i < min_args_base + mode_to_num_layers[layering_mode]; ++i) {
rc.layer_target_bitrate[i - 12] = (int)strtol(argv[i], NULL, 0);
rc.layer_target_bitrate[i - 13] = (int)strtol(argv[i], NULL, 0);
if (strncmp(encoder->name, "vp8", 3) == 0)
cfg.ts_target_bitrate[i - 12] = rc.layer_target_bitrate[i - 12];
cfg.ts_target_bitrate[i - 13] = rc.layer_target_bitrate[i - 13];
else if (strncmp(encoder->name, "vp9", 3) == 0)
cfg.layer_target_bitrate[i - 12] = rc.layer_target_bitrate[i - 12];
cfg.layer_target_bitrate[i - 13] = rc.layer_target_bitrate[i - 13];
}
// Real time parameters.
@@ -634,7 +693,7 @@ int main(int argc, char **argv) {
if (strncmp(encoder->name, "vp9", 3) == 0) cfg.rc_max_quantizer = 52;
cfg.rc_undershoot_pct = 50;
cfg.rc_overshoot_pct = 50;
cfg.rc_buf_initial_sz = 500;
cfg.rc_buf_initial_sz = 600;
cfg.rc_buf_optimal_sz = 600;
cfg.rc_buf_sz = 1000;
@@ -642,10 +701,14 @@ int main(int argc, char **argv) {
cfg.rc_resize_allowed = 0;
// Use 1 thread as default.
cfg.g_threads = (unsigned int)strtoul(argv[10], NULL, 0);
cfg.g_threads = (unsigned int)strtoul(argv[11], NULL, 0);
error_resilient = (uint32_t)strtoul(argv[10], NULL, 0);
if (error_resilient != 0 && error_resilient != 1) {
die("Invalid value for error resilient (0, 1): %d.", error_resilient);
}
// Enable error resilient mode.
cfg.g_error_resilient = 1;
cfg.g_error_resilient = error_resilient;
cfg.g_lag_in_frames = 0;
cfg.kf_mode = VPX_KF_AUTO;
@@ -700,9 +763,15 @@ int main(int argc, char **argv) {
if (strncmp(encoder->name, "vp8", 3) == 0) {
vpx_codec_control(&codec, VP8E_SET_CPUUSED, -speed);
vpx_codec_control(&codec, VP8E_SET_NOISE_SENSITIVITY, kDenoiserOff);
vpx_codec_control(&codec, VP8E_SET_NOISE_SENSITIVITY, kVp8DenoiserOff);
vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
vpx_codec_control(&codec, VP8E_SET_GF_CBR_BOOST_PCT, 0);
#if VP8_ROI_MAP
vp8_set_roi_map(&cfg, &roi);
if (vpx_codec_control(&codec, VP8E_SET_ROI_MAP, &roi))
die_codec(&codec, "Failed to set ROI map");
#endif
} else if (strncmp(encoder->name, "vp9", 3) == 0) {
vpx_svc_extra_cfg_t svc_params;
memset(&svc_params, 0, sizeof(svc_params));
@@ -711,10 +780,16 @@ int main(int argc, char **argv) {
vpx_codec_control(&codec, VP9E_SET_GF_CBR_BOOST_PCT, 0);
vpx_codec_control(&codec, VP9E_SET_FRAME_PARALLEL_DECODING, 0);
vpx_codec_control(&codec, VP9E_SET_FRAME_PERIODIC_BOOST, 0);
vpx_codec_control(&codec, VP9E_SET_NOISE_SENSITIVITY, kDenoiserOff);
vpx_codec_control(&codec, VP9E_SET_NOISE_SENSITIVITY, kVp9DenoiserOff);
vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
vpx_codec_control(&codec, VP9E_SET_TUNE_CONTENT, 0);
vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, (cfg.g_threads >> 1));
// TODO(marpan/jianj): There is an issue with row-mt for low resolutons at
// high speed settings, disable its use for those cases for now.
if (cfg.g_threads > 1 && ((cfg.g_w > 320 && cfg.g_h > 240) || speed < 7))
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 1);
else
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 0);
if (vpx_codec_control(&codec, VP9E_SET_SVC, layering_mode > 0 ? 1 : 0))
die_codec(&codec, "Failed to set SVC");
for (i = 0; i < cfg.ts_number_layers; ++i) {
@@ -733,7 +808,7 @@ int main(int argc, char **argv) {
// For generating smaller key frames, use a smaller max_intra_size_pct
// value, like 100 or 200.
{
const int max_intra_size_pct = 900;
const int max_intra_size_pct = 1000;
vpx_codec_control(&codec, VP8E_SET_MAX_INTRA_BITRATE_PCT,
max_intra_size_pct);
}
@@ -743,10 +818,8 @@ int main(int argc, char **argv) {
struct vpx_usec_timer timer;
vpx_codec_iter_t iter = NULL;
const vpx_codec_cx_pkt_t *pkt;
#if VPX_ENCODER_ABI_VERSION > (4 + VPX_CODEC_ABI_VERSION)
// Update the temporal layer_id. No spatial layers in this test.
layer_id.spatial_layer_id = 0;
#endif
layer_id.temporal_layer_id =
cfg.ts_layer_id[frame_cnt % cfg.ts_periodicity];
if (strncmp(encoder->name, "vp9", 3) == 0) {

33
libs.mk
View File

@@ -149,6 +149,7 @@ CODEC_SRCS-yes += $(BUILD_PFX)vpx_config.c
INSTALL-SRCS-no += $(BUILD_PFX)vpx_config.c
ifeq ($(ARCH_X86)$(ARCH_X86_64),yes)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += third_party/x86inc/x86inc.asm
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += vpx_dsp/x86/bitdepth_conversion_sse2.asm
endif
CODEC_EXPORTS-yes += vpx/exports_com
CODEC_EXPORTS-$(CONFIG_ENCODERS) += vpx/exports_enc
@@ -187,6 +188,13 @@ libvpx_srcs.txt:
@echo $(CODEC_SRCS) | xargs -n1 echo | LC_ALL=C sort -u > $@
CLEAN-OBJS += libvpx_srcs.txt
# Assembly files that are included, but don't define symbols themselves.
# Filtered out to avoid Windows build warnings.
ASM_INCLUDES := \
third_party/x86inc/x86inc.asm \
vpx_config.asm \
vpx_ports/x86_abi_support.asm \
vpx_dsp/x86/bitdepth_conversion_sse2.asm \
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
ifeq ($(CONFIG_MSVS),yes)
@@ -198,13 +206,6 @@ vpx.def: $(call enabled,CODEC_EXPORTS)
--out=$@ $^
CLEAN-OBJS += vpx.def
# Assembly files that are included, but don't define symbols themselves.
# Filtered out to avoid Visual Studio build warnings.
ASM_INCLUDES := \
third_party/x86inc/x86inc.asm \
vpx_config.asm \
vpx_ports/x86_abi_support.asm \
vpx.$(VCPROJ_SFX): $(CODEC_SRCS) vpx.def
@echo " [CREATE] $@"
$(qexec)$(GEN_VCPROJ) \
@@ -227,13 +228,13 @@ vpx.$(VCPROJ_SFX): $(RTCD)
endif
else
LIBVPX_OBJS=$(call objs,$(CODEC_SRCS))
LIBVPX_OBJS=$(call objs, $(filter-out $(ASM_INCLUDES), $(CODEC_SRCS)))
OBJS-yes += $(LIBVPX_OBJS)
LIBS-$(if yes,$(CONFIG_STATIC)) += $(BUILD_PFX)libvpx.a $(BUILD_PFX)libvpx_g.a
$(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
SO_VERSION_MAJOR := 4
SO_VERSION_MINOR := 1
SO_VERSION_MAJOR := 5
SO_VERSION_MINOR := 0
SO_VERSION_PATCH := 0
ifeq ($(filter darwin%,$(TGT_OS)),$(TGT_OS))
LIBVPX_SO := libvpx.$(SO_VERSION_MAJOR).dylib
@@ -404,8 +405,16 @@ CLEAN-OBJS += libvpx_test_srcs.txt
$(LIBVPX_TEST_DATA): $(SRC_PATH_BARE)/test/test-data.sha1
@echo " [DOWNLOAD] $@"
$(qexec)trap 'rm -f $@' INT TERM &&\
curl --retry 1 -L -o $@ $(call libvpx_test_data_url,$(@F))
# Attempt to download the file using curl, retrying once if it fails for a
# partial file (18).
$(qexec)( \
trap 'rm -f $@' INT TERM; \
curl="curl --retry 1 -L -o $@ $(call libvpx_test_data_url,$(@F))"; \
$$curl; \
case "$$?" in \
18) $$curl -C -;; \
esac \
)
testdata:: $(LIBVPX_TEST_DATA)
$(qexec)[ -x "$$(which sha1sum)" ] && sha1sum=sha1sum;\

View File

@@ -37,7 +37,13 @@ struct rate_hist {
struct rate_hist *init_rate_histogram(const vpx_codec_enc_cfg_t *cfg,
const vpx_rational_t *fps) {
int i;
struct rate_hist *hist = malloc(sizeof(*hist));
struct rate_hist *hist = calloc(1, sizeof(*hist));
if (hist == NULL || cfg == NULL || fps == NULL || fps->num == 0 ||
fps->den == 0) {
destroy_rate_histogram(hist);
return NULL;
}
// Determine the number of samples in the buffer. Use the file's framerate
// to determine the number of frames in rc_buf_sz milliseconds, with an
@@ -80,7 +86,11 @@ void update_rate_histogram(struct rate_hist *hist,
(uint64_t)cfg->g_timebase.num /
(uint64_t)cfg->g_timebase.den;
int idx = hist->frames++ % hist->samples;
int idx;
if (hist == NULL || cfg == NULL || pkt == NULL) return;
idx = hist->frames++ % hist->samples;
hist->pts[idx] = now;
hist->sz[idx] = (int)pkt->data.frame.sz;
@@ -116,9 +126,14 @@ void update_rate_histogram(struct rate_hist *hist,
static int merge_hist_buckets(struct hist_bucket *bucket, int max_buckets,
int *num_buckets) {
int small_bucket = 0, merge_bucket = INT_MAX, big_bucket = 0;
int buckets = *num_buckets;
int buckets;
int i;
assert(bucket != NULL);
assert(num_buckets != NULL);
buckets = *num_buckets;
/* Find the extrema for this list of buckets */
big_bucket = small_bucket = 0;
for (i = 0; i < buckets; i++) {
@@ -181,6 +196,8 @@ static void show_histogram(const struct hist_bucket *bucket, int buckets,
const char *pat1, *pat2;
int i;
assert(bucket != NULL);
switch ((int)(log(bucket[buckets - 1].high) / log(10)) + 1) {
case 1:
case 2:
@@ -259,6 +276,8 @@ void show_rate_histogram(struct rate_hist *hist, const vpx_codec_enc_cfg_t *cfg,
int i, scale;
int buckets = 0;
if (hist == NULL || cfg == NULL) return;
for (i = 0; i < RATE_BINS; i++) {
if (hist->bucket[i].low == INT_MAX) continue;
hist->bucket[buckets++] = hist->bucket[i];

View File

@@ -11,6 +11,10 @@
#ifndef TEST_ACM_RANDOM_H_
#define TEST_ACM_RANDOM_H_
#include <assert.h>
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "vpx/vpx_integer.h"
@@ -50,6 +54,13 @@ class ACMRandom {
return r < 128 ? r << 4 : r >> 4;
}
uint32_t RandRange(const uint32_t range) {
// testing::internal::Random::Generate provides values in the range
// testing::internal::Random::kMaxRange.
assert(range <= testing::internal::Random::kMaxRange);
return random_.Generate(range);
}
int PseudoUniform(int range) { return random_.Generate(range); }
int operator()(int n) { return PseudoUniform(n); }

View File

@@ -32,6 +32,7 @@ LOCAL_CPP_EXTENSION := .cc
LOCAL_MODULE := gtest
LOCAL_C_INCLUDES := $(LOCAL_PATH)/third_party/googletest/src/
LOCAL_C_INCLUDES += $(LOCAL_PATH)/third_party/googletest/src/include/
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/third_party/googletest/src/include/
LOCAL_SRC_FILES := ./third_party/googletest/src/src/gtest-all.cc
include $(BUILD_STATIC_LIBRARY)

View File

@@ -14,6 +14,7 @@
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
@@ -22,6 +23,7 @@
#include "test/register_state_check.h"
#include "test/util.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/vpx_timer.h"
using libvpx_test::ACMRandom;
@@ -186,7 +188,7 @@ class IntProColTest : public AverageTestBase,
int16_t sum_c_;
};
typedef int (*SatdFunc)(const int16_t *coeffs, int length);
typedef int (*SatdFunc)(const tran_low_t *coeffs, int length);
typedef std::tr1::tuple<int, SatdFunc> SatdTestParam;
class SatdTest : public ::testing::Test,
@@ -196,7 +198,7 @@ class SatdTest : public ::testing::Test,
satd_size_ = GET_PARAM(0);
satd_func_ = GET_PARAM(1);
rnd_.Reset(ACMRandom::DeterministicSeed());
src_ = reinterpret_cast<int16_t *>(
src_ = reinterpret_cast<tran_low_t *>(
vpx_memalign(16, sizeof(*src_) * satd_size_));
ASSERT_TRUE(src_ != NULL);
}
@@ -206,12 +208,15 @@ class SatdTest : public ::testing::Test,
vpx_free(src_);
}
void FillConstant(const int16_t val) {
void FillConstant(const tran_low_t val) {
for (int i = 0; i < satd_size_; ++i) src_[i] = val;
}
void FillRandom() {
for (int i = 0; i < satd_size_; ++i) src_[i] = rnd_.Rand16();
for (int i = 0; i < satd_size_; ++i) {
const int16_t tmp = rnd_.Rand16();
src_[i] = (tran_low_t)tmp;
}
}
void Check(const int expected) {
@@ -223,11 +228,66 @@ class SatdTest : public ::testing::Test,
int satd_size_;
private:
int16_t *src_;
tran_low_t *src_;
SatdFunc satd_func_;
ACMRandom rnd_;
};
typedef int64_t (*BlockErrorFunc)(const tran_low_t *coeff,
const tran_low_t *dqcoeff, int block_size);
typedef std::tr1::tuple<int, BlockErrorFunc> BlockErrorTestFPParam;
class BlockErrorTestFP
: public ::testing::Test,
public ::testing::WithParamInterface<BlockErrorTestFPParam> {
protected:
virtual void SetUp() {
txfm_size_ = GET_PARAM(0);
block_error_func_ = GET_PARAM(1);
rnd_.Reset(ACMRandom::DeterministicSeed());
coeff_ = reinterpret_cast<tran_low_t *>(
vpx_memalign(16, sizeof(*coeff_) * txfm_size_));
dqcoeff_ = reinterpret_cast<tran_low_t *>(
vpx_memalign(16, sizeof(*dqcoeff_) * txfm_size_));
ASSERT_TRUE(coeff_ != NULL);
ASSERT_TRUE(dqcoeff_ != NULL);
}
virtual void TearDown() {
libvpx_test::ClearSystemState();
vpx_free(coeff_);
vpx_free(dqcoeff_);
}
void FillConstant(const tran_low_t coeff_val, const tran_low_t dqcoeff_val) {
for (int i = 0; i < txfm_size_; ++i) coeff_[i] = coeff_val;
for (int i = 0; i < txfm_size_; ++i) dqcoeff_[i] = dqcoeff_val;
}
void FillRandom() {
// Just two fixed seeds
rnd_.Reset(0xb0b9);
for (int i = 0; i < txfm_size_; ++i) coeff_[i] = rnd_.Rand16() >> 1;
rnd_.Reset(0xb0c8);
for (int i = 0; i < txfm_size_; ++i) dqcoeff_[i] = rnd_.Rand16() >> 1;
}
void Check(const int64_t expected) {
int64_t total;
ASM_REGISTER_STATE_CHECK(
total = block_error_func_(coeff_, dqcoeff_, txfm_size_));
EXPECT_EQ(expected, total);
}
int txfm_size_;
private:
tran_low_t *coeff_;
tran_low_t *dqcoeff_;
BlockErrorFunc block_error_func_;
ACMRandom rnd_;
};
uint8_t *AverageTestBase::source_data_ = NULL;
TEST_P(AverageTest, MinValue) {
@@ -308,6 +368,66 @@ TEST_P(SatdTest, Random) {
Check(expected);
}
TEST_P(SatdTest, DISABLED_Speed) {
const int kCountSpeedTestBlock = 20000;
vpx_usec_timer timer;
DECLARE_ALIGNED(16, tran_low_t, coeff[1024]);
const int blocksize = GET_PARAM(0);
vpx_usec_timer_start(&timer);
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
GET_PARAM(1)(coeff, blocksize);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("blocksize: %4d time: %4d us\n", blocksize, elapsed_time);
}
TEST_P(BlockErrorTestFP, MinValue) {
const int64_t kMin = -32640;
const int64_t expected = kMin * kMin * txfm_size_;
FillConstant(kMin, 0);
Check(expected);
}
TEST_P(BlockErrorTestFP, MaxValue) {
const int64_t kMax = 32640;
const int64_t expected = kMax * kMax * txfm_size_;
FillConstant(kMax, 0);
Check(expected);
}
TEST_P(BlockErrorTestFP, Random) {
int64_t expected;
switch (txfm_size_) {
case 16: expected = 2051681432; break;
case 64: expected = 11075114379; break;
case 256: expected = 44386271116; break;
case 1024: expected = 184774996089; break;
default:
FAIL() << "Invalid satd size (" << txfm_size_
<< ") valid: 16/64/256/1024";
}
FillRandom();
Check(expected);
}
TEST_P(BlockErrorTestFP, DISABLED_Speed) {
const int kCountSpeedTestBlock = 20000;
vpx_usec_timer timer;
DECLARE_ALIGNED(16, tran_low_t, coeff[1024]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff[1024]);
const int blocksize = GET_PARAM(0);
vpx_usec_timer_start(&timer);
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
GET_PARAM(1)(coeff, dqcoeff, blocksize);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("blocksize: %4d time: %4d us\n", blocksize, elapsed_time);
}
using std::tr1::make_tuple;
INSTANTIATE_TEST_CASE_P(
@@ -321,6 +441,13 @@ INSTANTIATE_TEST_CASE_P(C, SatdTest,
make_tuple(256, &vpx_satd_c),
make_tuple(1024, &vpx_satd_c)));
INSTANTIATE_TEST_CASE_P(
C, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_c),
make_tuple(64, &vp9_block_error_fp_c),
make_tuple(256, &vp9_block_error_fp_c),
make_tuple(1024, &vp9_block_error_fp_c)));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2, AverageTest,
@@ -350,6 +477,28 @@ INSTANTIATE_TEST_CASE_P(SSE2, SatdTest,
make_tuple(64, &vpx_satd_sse2),
make_tuple(256, &vpx_satd_sse2),
make_tuple(1024, &vpx_satd_sse2)));
INSTANTIATE_TEST_CASE_P(
SSE2, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_sse2),
make_tuple(64, &vp9_block_error_fp_sse2),
make_tuple(256, &vp9_block_error_fp_sse2),
make_tuple(1024, &vp9_block_error_fp_sse2)));
#endif // HAVE_SSE2
#if HAVE_AVX2
INSTANTIATE_TEST_CASE_P(AVX2, SatdTest,
::testing::Values(make_tuple(16, &vpx_satd_avx2),
make_tuple(64, &vpx_satd_avx2),
make_tuple(256, &vpx_satd_avx2),
make_tuple(1024, &vpx_satd_avx2)));
INSTANTIATE_TEST_CASE_P(
AVX2, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_avx2),
make_tuple(64, &vp9_block_error_fp_avx2),
make_tuple(256, &vp9_block_error_fp_avx2),
make_tuple(1024, &vp9_block_error_fp_avx2)));
#endif
#if HAVE_NEON
@@ -381,7 +530,18 @@ INSTANTIATE_TEST_CASE_P(NEON, SatdTest,
make_tuple(64, &vpx_satd_neon),
make_tuple(256, &vpx_satd_neon),
make_tuple(1024, &vpx_satd_neon)));
#endif
// TODO(jianj): Remove the highbitdepth flag once the SIMD functions are
// in place.
#if !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
NEON, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_neon),
make_tuple(64, &vp9_block_error_fp_neon),
make_tuple(256, &vp9_block_error_fp_neon),
make_tuple(1024, &vp9_block_error_fp_neon)));
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(
@@ -392,6 +552,30 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(16, 16, 0, 4, &vpx_avg_4x4_msa),
make_tuple(16, 16, 5, 4, &vpx_avg_4x4_msa),
make_tuple(32, 32, 15, 4, &vpx_avg_4x4_msa)));
#endif
INSTANTIATE_TEST_CASE_P(
MSA, IntProRowTest,
::testing::Values(make_tuple(16, &vpx_int_pro_row_msa, &vpx_int_pro_row_c),
make_tuple(32, &vpx_int_pro_row_msa, &vpx_int_pro_row_c),
make_tuple(64, &vpx_int_pro_row_msa,
&vpx_int_pro_row_c)));
INSTANTIATE_TEST_CASE_P(
MSA, IntProColTest,
::testing::Values(make_tuple(16, &vpx_int_pro_col_msa, &vpx_int_pro_col_c),
make_tuple(32, &vpx_int_pro_col_msa, &vpx_int_pro_col_c),
make_tuple(64, &vpx_int_pro_col_msa,
&vpx_int_pro_col_c)));
// TODO(jingning): Remove the highbitdepth flag once the SIMD functions are
// in place.
#if !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(MSA, SatdTest,
::testing::Values(make_tuple(16, &vpx_satd_msa),
make_tuple(64, &vpx_satd_msa),
make_tuple(256, &vpx_satd_msa),
make_tuple(1024, &vpx_satd_msa)));
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_MSA
} // namespace

382
test/buffer.h Normal file
View File

@@ -0,0 +1,382 @@
/*
* Copyright (c) 2016 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef TEST_BUFFER_H_
#define TEST_BUFFER_H_
#include <stdio.h>
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/acm_random.h"
#include "vpx/vpx_integer.h"
#include "vpx_mem/vpx_mem.h"
namespace libvpx_test {
template <typename T>
class Buffer {
public:
Buffer(int width, int height, int top_padding, int left_padding,
int right_padding, int bottom_padding)
: width_(width), height_(height), top_padding_(top_padding),
left_padding_(left_padding), right_padding_(right_padding),
bottom_padding_(bottom_padding), alignment_(0), padding_value_(0),
stride_(0), raw_size_(0), num_elements_(0), raw_buffer_(NULL) {}
Buffer(int width, int height, int top_padding, int left_padding,
int right_padding, int bottom_padding, unsigned int alignment)
: width_(width), height_(height), top_padding_(top_padding),
left_padding_(left_padding), right_padding_(right_padding),
bottom_padding_(bottom_padding), alignment_(alignment),
padding_value_(0), stride_(0), raw_size_(0), num_elements_(0),
raw_buffer_(NULL) {}
Buffer(int width, int height, int padding)
: width_(width), height_(height), top_padding_(padding),
left_padding_(padding), right_padding_(padding),
bottom_padding_(padding), alignment_(0), padding_value_(0), stride_(0),
raw_size_(0), num_elements_(0), raw_buffer_(NULL) {}
Buffer(int width, int height, int padding, unsigned int alignment)
: width_(width), height_(height), top_padding_(padding),
left_padding_(padding), right_padding_(padding),
bottom_padding_(padding), alignment_(alignment), padding_value_(0),
stride_(0), raw_size_(0), num_elements_(0), raw_buffer_(NULL) {}
~Buffer() {
if (alignment_) {
vpx_free(raw_buffer_);
} else {
delete[] raw_buffer_;
}
}
T *TopLeftPixel() const;
int stride() const { return stride_; }
// Set the buffer (excluding padding) to 'value'.
void Set(const T value);
// Set the buffer (excluding padding) to the output of ACMRandom function
// 'rand_func'.
void Set(ACMRandom *rand_class, T (ACMRandom::*rand_func)());
// Set the buffer (excluding padding) to the output of ACMRandom function
// 'RandRange' with range 'low' to 'high' which typically must be within
// testing::internal::Random::kMaxRange (1u << 31). However, because we want
// to allow negative low (and high) values, it is restricted to INT32_MAX
// here.
void Set(ACMRandom *rand_class, const T low, const T high);
// Copy the contents of Buffer 'a' (excluding padding).
void CopyFrom(const Buffer<T> &a);
void DumpBuffer() const;
// Highlight the differences between two buffers if they are the same size.
void PrintDifference(const Buffer<T> &a) const;
bool HasPadding() const;
// Sets all the values in the buffer to 'padding_value'.
void SetPadding(const T padding_value);
// Checks if all the values (excluding padding) are equal to 'value' if the
// Buffers are the same size.
bool CheckValues(const T value) const;
// Check that padding matches the expected value or there is no padding.
bool CheckPadding() const;
// Compare the non-padding portion of two buffers if they are the same size.
bool CheckValues(const Buffer<T> &a) const;
bool Init() {
if (raw_buffer_ != NULL) return false;
EXPECT_GT(width_, 0);
EXPECT_GT(height_, 0);
EXPECT_GE(top_padding_, 0);
EXPECT_GE(left_padding_, 0);
EXPECT_GE(right_padding_, 0);
EXPECT_GE(bottom_padding_, 0);
stride_ = left_padding_ + width_ + right_padding_;
num_elements_ = stride_ * (top_padding_ + height_ + bottom_padding_);
raw_size_ = num_elements_ * sizeof(T);
if (alignment_) {
EXPECT_GE(alignment_, sizeof(T));
// Ensure alignment of the first value will be preserved.
EXPECT_EQ((left_padding_ * sizeof(T)) % alignment_, 0u);
// Ensure alignment of the subsequent rows will be preserved when there is
// a stride.
if (stride_ != width_) {
EXPECT_EQ((stride_ * sizeof(T)) % alignment_, 0u);
}
raw_buffer_ = reinterpret_cast<T *>(vpx_memalign(alignment_, raw_size_));
} else {
raw_buffer_ = new (std::nothrow) T[num_elements_];
}
EXPECT_TRUE(raw_buffer_ != NULL);
SetPadding(std::numeric_limits<T>::max());
return !::testing::Test::HasFailure();
}
private:
bool BufferSizesMatch(const Buffer<T> &a) const;
const int width_;
const int height_;
const int top_padding_;
const int left_padding_;
const int right_padding_;
const int bottom_padding_;
const unsigned int alignment_;
T padding_value_;
int stride_;
int raw_size_;
int num_elements_;
T *raw_buffer_;
};
template <typename T>
T *Buffer<T>::TopLeftPixel() const {
if (!raw_buffer_) return NULL;
return raw_buffer_ + (top_padding_ * stride_) + left_padding_;
}
template <typename T>
void Buffer<T>::Set(const T value) {
if (!raw_buffer_) return;
T *src = TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
src[width] = value;
}
src += stride_;
}
}
template <typename T>
void Buffer<T>::Set(ACMRandom *rand_class, T (ACMRandom::*rand_func)()) {
if (!raw_buffer_) return;
T *src = TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
src[width] = (*rand_class.*rand_func)();
}
src += stride_;
}
}
template <typename T>
void Buffer<T>::Set(ACMRandom *rand_class, const T low, const T high) {
if (!raw_buffer_) return;
EXPECT_LE(low, high);
EXPECT_LE(static_cast<int64_t>(high) - low,
std::numeric_limits<int32_t>::max());
T *src = TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
// 'low' will be promoted to unsigned given the return type of RandRange.
// Store the value as an int to avoid unsigned overflow warnings when
// 'low' is negative.
const int32_t value =
static_cast<int32_t>((*rand_class).RandRange(high - low));
src[width] = static_cast<T>(value + low);
}
src += stride_;
}
}
template <typename T>
void Buffer<T>::CopyFrom(const Buffer<T> &a) {
if (!raw_buffer_) return;
if (!BufferSizesMatch(a)) return;
T *a_src = a.TopLeftPixel();
T *b_src = this->TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
b_src[width] = a_src[width];
}
a_src += a.stride();
b_src += this->stride();
}
}
template <typename T>
void Buffer<T>::DumpBuffer() const {
if (!raw_buffer_) return;
for (int height = 0; height < height_ + top_padding_ + bottom_padding_;
++height) {
for (int width = 0; width < stride_; ++width) {
printf("%4d", raw_buffer_[height + width * stride_]);
}
printf("\n");
}
}
template <typename T>
bool Buffer<T>::HasPadding() const {
if (!raw_buffer_) return false;
return top_padding_ || left_padding_ || right_padding_ || bottom_padding_;
}
template <typename T>
void Buffer<T>::PrintDifference(const Buffer<T> &a) const {
if (!raw_buffer_) return;
if (!BufferSizesMatch(a)) return;
T *a_src = a.TopLeftPixel();
T *b_src = TopLeftPixel();
printf("This buffer:\n");
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
if (a_src[width] != b_src[width]) {
printf("*%3d", b_src[width]);
} else {
printf("%4d", b_src[width]);
}
}
printf("\n");
a_src += a.stride();
b_src += this->stride();
}
a_src = a.TopLeftPixel();
b_src = TopLeftPixel();
printf("Reference buffer:\n");
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
if (a_src[width] != b_src[width]) {
printf("*%3d", a_src[width]);
} else {
printf("%4d", a_src[width]);
}
}
printf("\n");
a_src += a.stride();
b_src += this->stride();
}
}
template <typename T>
void Buffer<T>::SetPadding(const T padding_value) {
if (!raw_buffer_) return;
padding_value_ = padding_value;
T *src = raw_buffer_;
for (int i = 0; i < num_elements_; ++i) {
src[i] = padding_value;
}
}
template <typename T>
bool Buffer<T>::CheckValues(const T value) const {
if (!raw_buffer_) return false;
T *src = TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
if (value != src[width]) {
return false;
}
}
src += stride_;
}
return true;
}
template <typename T>
bool Buffer<T>::CheckPadding() const {
if (!raw_buffer_) return false;
if (!HasPadding()) return true;
// Top padding.
T const *top = raw_buffer_;
for (int i = 0; i < stride_ * top_padding_; ++i) {
if (padding_value_ != top[i]) {
return false;
}
}
// Left padding.
T const *left = TopLeftPixel() - left_padding_;
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < left_padding_; ++width) {
if (padding_value_ != left[width]) {
return false;
}
}
left += stride_;
}
// Right padding.
T const *right = TopLeftPixel() + width_;
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < right_padding_; ++width) {
if (padding_value_ != right[width]) {
return false;
}
}
right += stride_;
}
// Bottom padding
T const *bottom = raw_buffer_ + (top_padding_ + height_) * stride_;
for (int i = 0; i < stride_ * bottom_padding_; ++i) {
if (padding_value_ != bottom[i]) {
return false;
}
}
return true;
}
template <typename T>
bool Buffer<T>::CheckValues(const Buffer<T> &a) const {
if (!raw_buffer_) return false;
if (!BufferSizesMatch(a)) return false;
T *a_src = a.TopLeftPixel();
T *b_src = this->TopLeftPixel();
for (int height = 0; height < height_; ++height) {
for (int width = 0; width < width_; ++width) {
if (a_src[width] != b_src[width]) {
return false;
}
}
a_src += a.stride();
b_src += this->stride();
}
return true;
}
template <typename T>
bool Buffer<T>::BufferSizesMatch(const Buffer<T> &a) const {
if (!raw_buffer_) return false;
if (a.width_ != this->width_ || a.height_ != this->height_) {
printf(
"Reference buffer of size %dx%d does not match this buffer which is "
"size %dx%d\n",
a.width_, a.height_, this->width_, this->height_);
return false;
}
return true;
}
} // namespace libvpx_test
#endif // TEST_BUFFER_H_

View File

@@ -128,8 +128,8 @@ class ByteAlignmentTest
// TODO(fgalligan): Move the MD5 testing code into another class.
void OpenMd5File(const std::string &md5_file_name_) {
md5_file_ = libvpx_test::OpenTestDataFile(md5_file_name_);
ASSERT_TRUE(md5_file_ != NULL) << "MD5 file open failed. Filename: "
<< md5_file_name_;
ASSERT_TRUE(md5_file_ != NULL)
<< "MD5 file open failed. Filename: " << md5_file_name_;
}
void CheckMd5(const vpx_image_t &img) {

182
test/comp_avg_pred_test.cc Normal file
View File

@@ -0,0 +1,182 @@
/*
* Copyright (c) 2017 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/register_state_check.h"
#include "vpx_ports/vpx_timer.h"
namespace {
using ::libvpx_test::ACMRandom;
using ::libvpx_test::Buffer;
typedef void (*AvgPredFunc)(uint8_t *a, const uint8_t *b, int w, int h,
const uint8_t *c, int c_stride);
uint8_t avg_with_rounding(uint8_t a, uint8_t b) { return (a + b + 1) >> 1; }
void reference_pred(const Buffer<uint8_t> &pred, const Buffer<uint8_t> &ref,
int width, int height, Buffer<uint8_t> *avg) {
for (int y = 0; y < height; ++y) {
for (int x = 0; x < width; ++x) {
avg->TopLeftPixel()[y * avg->stride() + x] =
avg_with_rounding(pred.TopLeftPixel()[y * pred.stride() + x],
ref.TopLeftPixel()[y * ref.stride() + x]);
}
}
}
class AvgPredTest : public ::testing::TestWithParam<AvgPredFunc> {
public:
virtual void SetUp() {
avg_pred_func_ = GetParam();
rnd_.Reset(ACMRandom::DeterministicSeed());
}
protected:
AvgPredFunc avg_pred_func_;
ACMRandom rnd_;
};
TEST_P(AvgPredTest, SizeCombinations) {
// This is called as part of the sub pixel variance. As such it must be one of
// the variance block sizes.
for (int width_pow = 2; width_pow <= 6; ++width_pow) {
for (int height_pow = width_pow - 1; height_pow <= width_pow + 1;
++height_pow) {
// Don't test 4x2 or 64x128
if (height_pow == 1 || height_pow == 7) continue;
// The sse2 special-cases when ref width == stride, so make sure to test
// it.
for (int ref_padding = 0; ref_padding < 2; ref_padding++) {
const int width = 1 << width_pow;
const int height = 1 << height_pow;
// Only the reference buffer may have a stride not equal to width.
Buffer<uint8_t> ref =
Buffer<uint8_t>(width, height, ref_padding ? 8 : 0);
ASSERT_TRUE(ref.Init());
Buffer<uint8_t> pred = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(pred.Init());
Buffer<uint8_t> avg_ref = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(avg_ref.Init());
Buffer<uint8_t> avg_chk = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(avg_chk.Init());
ref.Set(&rnd_, &ACMRandom::Rand8);
pred.Set(&rnd_, &ACMRandom::Rand8);
reference_pred(pred, ref, width, height, &avg_ref);
ASM_REGISTER_STATE_CHECK(
avg_pred_func_(avg_chk.TopLeftPixel(), pred.TopLeftPixel(), width,
height, ref.TopLeftPixel(), ref.stride()));
EXPECT_TRUE(avg_chk.CheckValues(avg_ref));
if (HasFailure()) {
printf("Width: %d Height: %d\n", width, height);
avg_chk.PrintDifference(avg_ref);
return;
}
}
}
}
}
TEST_P(AvgPredTest, CompareReferenceRandom) {
const int width = 64;
const int height = 32;
Buffer<uint8_t> ref = Buffer<uint8_t>(width, height, 8);
ASSERT_TRUE(ref.Init());
Buffer<uint8_t> pred = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(pred.Init());
Buffer<uint8_t> avg_ref = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(avg_ref.Init());
Buffer<uint8_t> avg_chk = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(avg_chk.Init());
for (int i = 0; i < 500; ++i) {
ref.Set(&rnd_, &ACMRandom::Rand8);
pred.Set(&rnd_, &ACMRandom::Rand8);
reference_pred(pred, ref, width, height, &avg_ref);
ASM_REGISTER_STATE_CHECK(avg_pred_func_(avg_chk.TopLeftPixel(),
pred.TopLeftPixel(), width, height,
ref.TopLeftPixel(), ref.stride()));
EXPECT_TRUE(avg_chk.CheckValues(avg_ref));
if (HasFailure()) {
printf("Width: %d Height: %d\n", width, height);
avg_chk.PrintDifference(avg_ref);
return;
}
}
}
TEST_P(AvgPredTest, DISABLED_Speed) {
for (int width_pow = 2; width_pow <= 6; ++width_pow) {
for (int height_pow = width_pow - 1; height_pow <= width_pow + 1;
++height_pow) {
// Don't test 4x2 or 64x128
if (height_pow == 1 || height_pow == 7) continue;
for (int ref_padding = 0; ref_padding < 2; ref_padding++) {
const int width = 1 << width_pow;
const int height = 1 << height_pow;
Buffer<uint8_t> ref =
Buffer<uint8_t>(width, height, ref_padding ? 8 : 0);
ASSERT_TRUE(ref.Init());
Buffer<uint8_t> pred = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(pred.Init());
Buffer<uint8_t> avg = Buffer<uint8_t>(width, height, 0, 16);
ASSERT_TRUE(avg.Init());
ref.Set(&rnd_, &ACMRandom::Rand8);
pred.Set(&rnd_, &ACMRandom::Rand8);
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int i = 0; i < 10000000 / (width * height); ++i) {
avg_pred_func_(avg.TopLeftPixel(), pred.TopLeftPixel(), width, height,
ref.TopLeftPixel(), ref.stride());
}
vpx_usec_timer_mark(&timer);
const int elapsed_time =
static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("Average Test (ref_padding: %d) %dx%d time: %5d us\n",
ref_padding, width, height, elapsed_time);
}
}
}
}
INSTANTIATE_TEST_CASE_P(C, AvgPredTest,
::testing::Values(&vpx_comp_avg_pred_c));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, AvgPredTest,
::testing::Values(&vpx_comp_avg_pred_sse2));
#endif // HAVE_SSE2
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, AvgPredTest,
::testing::Values(&vpx_comp_avg_pred_neon));
#endif // HAVE_NEON
#if HAVE_VSX
INSTANTIATE_TEST_CASE_P(VSX, AvgPredTest,
::testing::Values(&vpx_comp_avg_pred_vsx));
#endif // HAVE_VSX
} // namespace

View File

@@ -25,6 +25,7 @@
#include "vpx_dsp/vpx_filter.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/mem.h"
#include "vpx_ports/vpx_timer.h"
namespace {
@@ -32,9 +33,9 @@ static const unsigned int kMaxDimension = 64;
typedef void (*ConvolveFunc)(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x, int filter_x_stride,
const int16_t *filter_y, int filter_y_stride,
int w, int h);
const InterpKernel *filter, int x0_q4,
int x_step_q4, int y0_q4, int y_step_q4, int w,
int h);
typedef void (*WrapperFilterBlock2d8Func)(
const uint8_t *src_ptr, const unsigned int src_stride,
@@ -300,9 +301,9 @@ void wrapper_filter_average_block2d_8_c(
filter_average_block2d_8_c(src_ptr, src_stride, hfilter, vfilter, dst_ptr,
dst_stride, output_width, output_height);
} else {
highbd_filter_average_block2d_8_c(CONVERT_TO_SHORTPTR(src_ptr), src_stride,
highbd_filter_average_block2d_8_c(CAST_TO_SHORTPTR(src_ptr), src_stride,
hfilter, vfilter,
CONVERT_TO_SHORTPTR(dst_ptr), dst_stride,
CAST_TO_SHORTPTR(dst_ptr), dst_stride,
output_width, output_height, use_highbd);
}
#else
@@ -323,8 +324,8 @@ void wrapper_filter_block2d_8_c(const uint8_t *src_ptr,
filter_block2d_8_c(src_ptr, src_stride, hfilter, vfilter, dst_ptr,
dst_stride, output_width, output_height);
} else {
highbd_filter_block2d_8_c(CONVERT_TO_SHORTPTR(src_ptr), src_stride, hfilter,
vfilter, CONVERT_TO_SHORTPTR(dst_ptr), dst_stride,
highbd_filter_block2d_8_c(CAST_TO_SHORTPTR(src_ptr), src_stride, hfilter,
vfilter, CAST_TO_SHORTPTR(dst_ptr), dst_stride,
output_width, output_height, use_highbd);
}
#else
@@ -459,7 +460,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
if (UUT_->use_highbd_ == 0) {
return input_ + offset;
} else {
return CONVERT_TO_BYTEPTR(input16_) + offset;
return CAST_TO_BYTEPTR(input16_ + offset);
}
#else
return input_ + offset;
@@ -472,7 +473,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
if (UUT_->use_highbd_ == 0) {
return output_ + offset;
} else {
return CONVERT_TO_BYTEPTR(output16_) + offset;
return CAST_TO_BYTEPTR(output16_ + offset);
}
#else
return output_ + offset;
@@ -485,7 +486,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
if (UUT_->use_highbd_ == 0) {
return output_ref_ + offset;
} else {
return CONVERT_TO_BYTEPTR(output16_ref_) + offset;
return CAST_TO_BYTEPTR(output16_ref_ + offset);
}
#else
return output_ref_ + offset;
@@ -497,7 +498,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
if (UUT_->use_highbd_ == 0) {
return list[index];
} else {
return CONVERT_TO_SHORTPTR(list)[index];
return CAST_TO_SHORTPTR(list)[index];
}
#else
return list[index];
@@ -509,7 +510,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
if (UUT_->use_highbd_ == 0) {
list[index] = (uint8_t)val;
} else {
CONVERT_TO_SHORTPTR(list)[index] = val;
CAST_TO_SHORTPTR(list)[index] = val;
}
#else
list[index] = (uint8_t)val;
@@ -539,12 +540,167 @@ uint16_t *ConvolveTest::output16_ref_ = NULL;
TEST_P(ConvolveTest, GuardBlocks) { CheckGuardBlocks(); }
TEST_P(ConvolveTest, DISABLED_Copy_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->copy_[0](in, kInputStride, out, kOutputStride, NULL, 0, 0, 0, 0,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve_copy_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_Avg_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->copy_[1](in, kInputStride, out, kOutputStride, NULL, 0, 0, 0, 0,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve_avg_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_Scale_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP];
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
SetConstantInput(127);
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->shv8_[0](in, kInputStride, out, kOutputStride, eighttap, 8, 16, 8, 16,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve_scale_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_8Tap_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP_SHARP];
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
SetConstantInput(127);
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->hv8_[0](in, kInputStride, out, kOutputStride, eighttap, 8, 16, 8, 16,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve8_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_8Tap_Horiz_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP_SHARP];
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
SetConstantInput(127);
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->h8_[0](in, kInputStride, out, kOutputStride, eighttap, 8, 16, 8, 16,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve8_horiz_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_8Tap_Vert_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP_SHARP];
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
SetConstantInput(127);
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->v8_[0](in, kInputStride, out, kOutputStride, eighttap, 8, 16, 8, 16,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve8_vert_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, DISABLED_8Tap_Avg_Speed) {
const uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP_SHARP];
const int kNumTests = 5000000;
const int width = Width();
const int height = Height();
vpx_usec_timer timer;
SetConstantInput(127);
vpx_usec_timer_start(&timer);
for (int n = 0; n < kNumTests; ++n) {
UUT_->hv8_[1](in, kInputStride, out, kOutputStride, eighttap, 8, 16, 8, 16,
width, height);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("convolve8_avg_%dx%d_%d: %d us\n", width, height,
UUT_->use_highbd_ ? UUT_->use_highbd_ : 8, elapsed_time);
}
TEST_P(ConvolveTest, Copy) {
uint8_t *const in = input();
uint8_t *const out = output();
ASM_REGISTER_STATE_CHECK(UUT_->copy_[0](in, kInputStride, out, kOutputStride,
NULL, 0, NULL, 0, Width(), Height()));
NULL, 0, 0, 0, 0, Width(), Height()));
CheckGuardBlocks();
@@ -563,7 +719,7 @@ TEST_P(ConvolveTest, Avg) {
CopyOutputToRef();
ASM_REGISTER_STATE_CHECK(UUT_->copy_[1](in, kInputStride, out, kOutputStride,
NULL, 0, NULL, 0, Width(), Height()));
NULL, 0, 0, 0, 0, Width(), Height()));
CheckGuardBlocks();
@@ -580,12 +736,10 @@ TEST_P(ConvolveTest, Avg) {
TEST_P(ConvolveTest, CopyHoriz) {
uint8_t *const in = input();
uint8_t *const out = output();
DECLARE_ALIGNED(256, const int16_t,
filter8[8]) = { 0, 0, 0, 128, 0, 0, 0, 0 };
ASM_REGISTER_STATE_CHECK(UUT_->sh8_[0](in, kInputStride, out, kOutputStride,
filter8, 16, filter8, 16, Width(),
Height()));
vp9_filter_kernels[0], 0, 16, 0, 16,
Width(), Height()));
CheckGuardBlocks();
@@ -600,12 +754,10 @@ TEST_P(ConvolveTest, CopyHoriz) {
TEST_P(ConvolveTest, CopyVert) {
uint8_t *const in = input();
uint8_t *const out = output();
DECLARE_ALIGNED(256, const int16_t,
filter8[8]) = { 0, 0, 0, 128, 0, 0, 0, 0 };
ASM_REGISTER_STATE_CHECK(UUT_->sv8_[0](in, kInputStride, out, kOutputStride,
filter8, 16, filter8, 16, Width(),
Height()));
vp9_filter_kernels[0], 0, 16, 0, 16,
Width(), Height()));
CheckGuardBlocks();
@@ -620,12 +772,10 @@ TEST_P(ConvolveTest, CopyVert) {
TEST_P(ConvolveTest, Copy2D) {
uint8_t *const in = input();
uint8_t *const out = output();
DECLARE_ALIGNED(256, const int16_t,
filter8[8]) = { 0, 0, 0, 128, 0, 0, 0, 0 };
ASM_REGISTER_STATE_CHECK(UUT_->shv8_[0](in, kInputStride, out, kOutputStride,
filter8, 16, filter8, 16, Width(),
Height()));
vp9_filter_kernels[0], 0, 16, 0, 16,
Width(), Height()));
CheckGuardBlocks();
@@ -661,7 +811,6 @@ TEST(ConvolveTest, FiltersWontSaturateWhenAddedPairwise) {
}
}
const int16_t kInvalidFilter[8] = { 0 };
const WrapperFilterBlock2d8Func wrapper_filter_block2d_8[2] = {
wrapper_filter_block2d_8_c, wrapper_filter_average_block2d_8_c
};
@@ -677,7 +826,7 @@ TEST_P(ConvolveTest, MatchesReferenceSubpixelFilter) {
if (UUT_->use_highbd_ == 0) {
ref = ref8;
} else {
ref = CONVERT_TO_BYTEPTR(ref16);
ref = CAST_TO_BYTEPTR(ref16);
}
#else
uint8_t ref[kOutputStride * kMaxDimension];
@@ -714,21 +863,21 @@ TEST_P(ConvolveTest, MatchesReferenceSubpixelFilter) {
Width(), Height(), UUT_->use_highbd_);
if (filter_x && filter_y)
ASM_REGISTER_STATE_CHECK(UUT_->hv8_[i](
in, kInputStride, out, kOutputStride, filters[filter_x], 16,
filters[filter_y], 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->hv8_[i](in, kInputStride, out, kOutputStride, filters,
filter_x, 16, filter_y, 16, Width(), Height()));
else if (filter_y)
ASM_REGISTER_STATE_CHECK(UUT_->v8_[i](
in, kInputStride, out, kOutputStride, kInvalidFilter, 16,
filters[filter_y], 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->v8_[i](in, kInputStride, out, kOutputStride, filters, 0,
16, filter_y, 16, Width(), Height()));
else if (filter_x)
ASM_REGISTER_STATE_CHECK(UUT_->h8_[i](
in, kInputStride, out, kOutputStride, filters[filter_x], 16,
kInvalidFilter, 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->h8_[i](in, kInputStride, out, kOutputStride, filters,
filter_x, 16, 0, 16, Width(), Height()));
else
ASM_REGISTER_STATE_CHECK(UUT_->copy_[i](
in, kInputStride, out, kOutputStride, kInvalidFilter, 0,
kInvalidFilter, 0, Width(), Height()));
ASM_REGISTER_STATE_CHECK(UUT_->copy_[i](in, kInputStride, out,
kOutputStride, NULL, 0, 0,
0, 0, Width(), Height()));
CheckGuardBlocks();
@@ -756,7 +905,7 @@ TEST_P(ConvolveTest, FilterExtremes) {
if (UUT_->use_highbd_ == 0) {
ref = ref8;
} else {
ref = CONVERT_TO_BYTEPTR(ref16);
ref = CAST_TO_BYTEPTR(ref16);
}
#else
uint8_t ref[kOutputStride * kMaxDimension];
@@ -812,21 +961,21 @@ TEST_P(ConvolveTest, FilterExtremes) {
filters[filter_y], ref, kOutputStride,
Width(), Height(), UUT_->use_highbd_);
if (filter_x && filter_y)
ASM_REGISTER_STATE_CHECK(UUT_->hv8_[0](
in, kInputStride, out, kOutputStride, filters[filter_x], 16,
filters[filter_y], 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->hv8_[0](in, kInputStride, out, kOutputStride, filters,
filter_x, 16, filter_y, 16, Width(), Height()));
else if (filter_y)
ASM_REGISTER_STATE_CHECK(UUT_->v8_[0](
in, kInputStride, out, kOutputStride, kInvalidFilter, 16,
filters[filter_y], 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->v8_[0](in, kInputStride, out, kOutputStride, filters, 0,
16, filter_y, 16, Width(), Height()));
else if (filter_x)
ASM_REGISTER_STATE_CHECK(UUT_->h8_[0](
in, kInputStride, out, kOutputStride, filters[filter_x], 16,
kInvalidFilter, 16, Width(), Height()));
ASM_REGISTER_STATE_CHECK(
UUT_->h8_[0](in, kInputStride, out, kOutputStride, filters,
filter_x, 16, 0, 16, Width(), Height()));
else
ASM_REGISTER_STATE_CHECK(UUT_->copy_[0](
in, kInputStride, out, kOutputStride, kInvalidFilter, 0,
kInvalidFilter, 0, Width(), Height()));
ASM_REGISTER_STATE_CHECK(UUT_->copy_[0](in, kInputStride, out,
kOutputStride, NULL, 0, 0,
0, 0, Width(), Height()));
for (int y = 0; y < Height(); ++y) {
for (int x = 0; x < Width(); ++x)
@@ -845,33 +994,51 @@ TEST_P(ConvolveTest, FilterExtremes) {
/* This test exercises that enough rows and columns are filtered with every
possible initial fractional positions and scaling steps. */
#if !CONFIG_VP9_HIGHBITDEPTH
static const ConvolveFunc scaled_2d_c_funcs[2] = { vpx_scaled_2d_c,
vpx_scaled_avg_2d_c };
TEST_P(ConvolveTest, CheckScalingFiltering) {
uint8_t *const in = input();
uint8_t *const out = output();
const InterpKernel *const eighttap = vp9_filter_kernels[EIGHTTAP];
uint8_t ref[kOutputStride * kMaxDimension];
SetConstantInput(127);
::libvpx_test::ACMRandom prng;
for (int y = 0; y < Height(); ++y) {
for (int x = 0; x < Width(); ++x) {
const uint16_t r = prng.Rand8Extremes();
assign_val(in, y * kInputStride + x, r);
}
}
for (int frac = 0; frac < 16; ++frac) {
for (int step = 1; step <= 32; ++step) {
/* Test the horizontal and vertical filters in combination. */
ASM_REGISTER_STATE_CHECK(
UUT_->shv8_[0](in, kInputStride, out, kOutputStride, eighttap[frac],
step, eighttap[frac], step, Width(), Height()));
for (int i = 0; i < 2; ++i) {
for (INTERP_FILTER filter_type = 0; filter_type < 4; ++filter_type) {
const InterpKernel *const eighttap = vp9_filter_kernels[filter_type];
for (int frac = 0; frac < 16; ++frac) {
for (int step = 1; step <= 32; ++step) {
/* Test the horizontal and vertical filters in combination. */
scaled_2d_c_funcs[i](in, kInputStride, ref, kOutputStride, eighttap,
frac, step, frac, step, Width(), Height());
ASM_REGISTER_STATE_CHECK(
UUT_->shv8_[i](in, kInputStride, out, kOutputStride, eighttap,
frac, step, frac, step, Width(), Height()));
CheckGuardBlocks();
CheckGuardBlocks();
for (int y = 0; y < Height(); ++y) {
for (int x = 0; x < Width(); ++x) {
ASSERT_EQ(lookup(in, y * kInputStride + x),
lookup(out, y * kOutputStride + x))
<< "x == " << x << ", y == " << y << ", frac == " << frac
<< ", step == " << step;
for (int y = 0; y < Height(); ++y) {
for (int x = 0; x < Width(); ++x) {
ASSERT_EQ(lookup(ref, y * kOutputStride + x),
lookup(out, y * kOutputStride + x))
<< "x == " << x << ", y == " << y << ", frac == " << frac
<< ", step == " << step;
}
}
}
}
}
}
}
#endif
using std::tr1::make_tuple;
@@ -879,10 +1046,11 @@ using std::tr1::make_tuple;
#define WRAP(func, bd) \
void wrap_##func##_##bd( \
const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, \
ptrdiff_t dst_stride, const int16_t *filter_x, int filter_x_stride, \
const int16_t *filter_y, int filter_y_stride, int w, int h) { \
vpx_highbd_##func(src, src_stride, dst, dst_stride, filter_x, \
filter_x_stride, filter_y, filter_y_stride, w, h, bd); \
ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, \
int x_step_q4, int y0_q4, int y_step_q4, int w, int h) { \
vpx_highbd_##func(reinterpret_cast<const uint16_t *>(src), src_stride, \
reinterpret_cast<uint16_t *>(dst), dst_stride, filter, \
x0_q4, x_step_q4, y0_q4, y_step_q4, w, h, bd); \
}
#if HAVE_SSE2 && ARCH_X86_64
@@ -912,6 +1080,35 @@ WRAP(convolve8_sse2, 12)
WRAP(convolve8_avg_sse2, 12)
#endif // HAVE_SSE2 && ARCH_X86_64
#if HAVE_AVX2
WRAP(convolve_copy_avx2, 8)
WRAP(convolve_avg_avx2, 8)
WRAP(convolve8_horiz_avx2, 8)
WRAP(convolve8_avg_horiz_avx2, 8)
WRAP(convolve8_vert_avx2, 8)
WRAP(convolve8_avg_vert_avx2, 8)
WRAP(convolve8_avx2, 8)
WRAP(convolve8_avg_avx2, 8)
WRAP(convolve_copy_avx2, 10)
WRAP(convolve_avg_avx2, 10)
WRAP(convolve8_avx2, 10)
WRAP(convolve8_horiz_avx2, 10)
WRAP(convolve8_vert_avx2, 10)
WRAP(convolve8_avg_avx2, 10)
WRAP(convolve8_avg_horiz_avx2, 10)
WRAP(convolve8_avg_vert_avx2, 10)
WRAP(convolve_copy_avx2, 12)
WRAP(convolve_avg_avx2, 12)
WRAP(convolve8_avx2, 12)
WRAP(convolve8_horiz_avx2, 12)
WRAP(convolve8_vert_avx2, 12)
WRAP(convolve8_avg_avx2, 12)
WRAP(convolve8_avg_horiz_avx2, 12)
WRAP(convolve8_avg_vert_avx2, 12)
#endif // HAVE_AVX2
#if HAVE_NEON
WRAP(convolve_copy_neon, 8)
WRAP(convolve_avg_neon, 8)
@@ -1057,18 +1254,48 @@ INSTANTIATE_TEST_CASE_P(SSSE3, ConvolveTest,
::testing::ValuesIn(kArrayConvolve8_ssse3));
#endif
#if HAVE_AVX2 && HAVE_SSSE3
#if HAVE_AVX2
#if CONFIG_VP9_HIGHBITDEPTH
const ConvolveFunctions convolve8_avx2(
wrap_convolve_copy_avx2_8, wrap_convolve_avg_avx2_8,
wrap_convolve8_horiz_avx2_8, wrap_convolve8_avg_horiz_avx2_8,
wrap_convolve8_vert_avx2_8, wrap_convolve8_avg_vert_avx2_8,
wrap_convolve8_avx2_8, wrap_convolve8_avg_avx2_8, wrap_convolve8_horiz_c_8,
wrap_convolve8_avg_horiz_c_8, wrap_convolve8_vert_c_8,
wrap_convolve8_avg_vert_c_8, wrap_convolve8_c_8, wrap_convolve8_avg_c_8, 8);
const ConvolveFunctions convolve10_avx2(
wrap_convolve_copy_avx2_10, wrap_convolve_avg_avx2_10,
wrap_convolve8_horiz_avx2_10, wrap_convolve8_avg_horiz_avx2_10,
wrap_convolve8_vert_avx2_10, wrap_convolve8_avg_vert_avx2_10,
wrap_convolve8_avx2_10, wrap_convolve8_avg_avx2_10,
wrap_convolve8_horiz_c_10, wrap_convolve8_avg_horiz_c_10,
wrap_convolve8_vert_c_10, wrap_convolve8_avg_vert_c_10, wrap_convolve8_c_10,
wrap_convolve8_avg_c_10, 10);
const ConvolveFunctions convolve12_avx2(
wrap_convolve_copy_avx2_12, wrap_convolve_avg_avx2_12,
wrap_convolve8_horiz_avx2_12, wrap_convolve8_avg_horiz_avx2_12,
wrap_convolve8_vert_avx2_12, wrap_convolve8_avg_vert_avx2_12,
wrap_convolve8_avx2_12, wrap_convolve8_avg_avx2_12,
wrap_convolve8_horiz_c_12, wrap_convolve8_avg_horiz_c_12,
wrap_convolve8_vert_c_12, wrap_convolve8_avg_vert_c_12, wrap_convolve8_c_12,
wrap_convolve8_avg_c_12, 12);
const ConvolveParam kArrayConvolve8_avx2[] = { ALL_SIZES(convolve8_avx2),
ALL_SIZES(convolve10_avx2),
ALL_SIZES(convolve12_avx2) };
INSTANTIATE_TEST_CASE_P(AVX2, ConvolveTest,
::testing::ValuesIn(kArrayConvolve8_avx2));
#else // !CONFIG_VP9_HIGHBITDEPTH
const ConvolveFunctions convolve8_avx2(
vpx_convolve_copy_c, vpx_convolve_avg_c, vpx_convolve8_horiz_avx2,
vpx_convolve8_avg_horiz_ssse3, vpx_convolve8_vert_avx2,
vpx_convolve8_avg_vert_ssse3, vpx_convolve8_avx2, vpx_convolve8_avg_ssse3,
vpx_convolve8_avg_horiz_avx2, vpx_convolve8_vert_avx2,
vpx_convolve8_avg_vert_avx2, vpx_convolve8_avx2, vpx_convolve8_avg_avx2,
vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c, vpx_scaled_vert_c,
vpx_scaled_avg_vert_c, vpx_scaled_2d_c, vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve8_avx2[] = { ALL_SIZES(convolve8_avx2) };
INSTANTIATE_TEST_CASE_P(AVX2, ConvolveTest,
::testing::ValuesIn(kArrayConvolve8_avx2));
#endif // HAVE_AVX2 && HAVE_SSSE3
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_AVX2
#if HAVE_NEON
#if CONFIG_VP9_HIGHBITDEPTH
@@ -1105,7 +1332,7 @@ const ConvolveFunctions convolve8_neon(
vpx_convolve8_avg_horiz_neon, vpx_convolve8_vert_neon,
vpx_convolve8_avg_vert_neon, vpx_convolve8_neon, vpx_convolve8_avg_neon,
vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c, vpx_scaled_vert_c,
vpx_scaled_avg_vert_c, vpx_scaled_2d_c, vpx_scaled_avg_2d_c, 0);
vpx_scaled_avg_vert_c, vpx_scaled_2d_neon, vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve_neon[] = { ALL_SIZES(convolve8_neon) };
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -1132,10 +1359,22 @@ const ConvolveFunctions convolve8_msa(
vpx_convolve8_avg_horiz_msa, vpx_convolve8_vert_msa,
vpx_convolve8_avg_vert_msa, vpx_convolve8_msa, vpx_convolve8_avg_msa,
vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c, vpx_scaled_vert_c,
vpx_scaled_avg_vert_c, vpx_scaled_2d_c, vpx_scaled_avg_2d_c, 0);
vpx_scaled_avg_vert_c, vpx_scaled_2d_msa, vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve8_msa[] = { ALL_SIZES(convolve8_msa) };
INSTANTIATE_TEST_CASE_P(MSA, ConvolveTest,
::testing::ValuesIn(kArrayConvolve8_msa));
#endif // HAVE_MSA
#if HAVE_VSX
const ConvolveFunctions convolve8_vsx(
vpx_convolve_copy_vsx, vpx_convolve_avg_vsx, vpx_convolve8_horiz_vsx,
vpx_convolve8_avg_horiz_vsx, vpx_convolve8_vert_vsx,
vpx_convolve8_avg_vert_vsx, vpx_convolve8_vsx, vpx_convolve8_avg_vsx,
vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c, vpx_scaled_vert_c,
vpx_scaled_avg_vert_c, vpx_scaled_2d_c, vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve_vsx[] = { ALL_SIZES(convolve8_vsx) };
INSTANTIATE_TEST_CASE_P(VSX, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_vsx));
#endif // HAVE_VSX
} // namespace

View File

@@ -44,6 +44,7 @@ class DatarateTestLarge
denoiser_offon_test_ = 0;
denoiser_offon_period_ = -1;
gf_boost_ = 0;
use_roi_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
@@ -54,6 +55,12 @@ class DatarateTestLarge
encoder->Control(VP8E_SET_GF_CBR_BOOST_PCT, gf_boost_);
}
#if CONFIG_VP8_ENCODER
if (use_roi_ == 1) {
encoder->Control(VP8E_SET_ROI_MAP, &roi_);
}
#endif
if (denoiser_offon_test_) {
ASSERT_GT(denoiser_offon_period_, 0)
<< "denoiser_offon_period_ is not positive.";
@@ -91,8 +98,8 @@ class DatarateTestLarge
const bool key_frame =
(pkt->data.frame.flags & VPX_FRAME_IS_KEY) ? true : false;
if (!key_frame) {
ASSERT_GE(bits_in_buffer_model_, 0) << "Buffer Underrun at frame "
<< pkt->data.frame.pts;
ASSERT_GE(bits_in_buffer_model_, 0)
<< "Buffer Underrun at frame " << pkt->data.frame.pts;
}
const int64_t frame_size_in_bits = pkt->data.frame.sz * 8;
@@ -145,6 +152,8 @@ class DatarateTestLarge
int denoiser_offon_period_;
int set_cpu_used_;
int gf_boost_;
int use_roi_;
vpx_roi_map_t roi_;
};
#if CONFIG_TEMPORAL_DENOISING
@@ -258,14 +267,6 @@ TEST_P(DatarateTestLarge, ChangingDropFrameThresh) {
}
}
// Disabled for tsan, see:
// https://bugs.chromium.org/p/webm/issues/detail?id=1049
#if defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define BUILDING_WITH_TSAN
#endif
#endif
#ifndef BUILDING_WITH_TSAN
TEST_P(DatarateTestLarge, DropFramesMultiThreads) {
denoiser_on_ = 0;
cfg_.rc_buf_initial_sz = 500;
@@ -285,7 +286,6 @@ TEST_P(DatarateTestLarge, DropFramesMultiThreads) {
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.4)
<< " The datarate for the file missed the target!";
}
#endif // !BUILDING_WITH_TSAN
class DatarateTestRealTime : public DatarateTestLarge {
public:
@@ -402,10 +402,6 @@ TEST_P(DatarateTestRealTime, ChangingDropFrameThresh) {
}
}
// Disabled for tsan, see:
// https://bugs.chromium.org/p/webm/issues/detail?id=1049
#ifndef BUILDING_WITH_TSAN
TEST_P(DatarateTestRealTime, DropFramesMultiThreads) {
denoiser_on_ = 0;
cfg_.rc_buf_initial_sz = 500;
@@ -426,7 +422,67 @@ TEST_P(DatarateTestRealTime, DropFramesMultiThreads) {
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.4)
<< " The datarate for the file missed the target!";
}
#endif
TEST_P(DatarateTestRealTime, RegionOfInterest) {
denoiser_on_ = 0;
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_dropframe_thresh = 0;
cfg_.rc_max_quantizer = 56;
cfg_.rc_end_usage = VPX_CBR;
// Encode using multiple threads.
cfg_.g_threads = 2;
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 300);
cfg_.rc_target_bitrate = 450;
cfg_.g_w = 352;
cfg_.g_h = 288;
ResetModel();
// Set ROI parameters
use_roi_ = 1;
memset(&roi_, 0, sizeof(roi_));
roi_.rows = (cfg_.g_h + 15) / 16;
roi_.cols = (cfg_.g_w + 15) / 16;
roi_.delta_q[0] = 0;
roi_.delta_q[1] = -20;
roi_.delta_q[2] = 0;
roi_.delta_q[3] = 0;
roi_.delta_lf[0] = 0;
roi_.delta_lf[1] = -20;
roi_.delta_lf[2] = 0;
roi_.delta_lf[3] = 0;
roi_.static_threshold[0] = 0;
roi_.static_threshold[1] = 1000;
roi_.static_threshold[2] = 0;
roi_.static_threshold[3] = 0;
// Use 2 states: 1 is center square, 0 is the rest.
roi_.roi_map =
(uint8_t *)calloc(roi_.rows * roi_.cols, sizeof(*roi_.roi_map));
for (unsigned int i = 0; i < roi_.rows; ++i) {
for (unsigned int j = 0; j < roi_.cols; ++j) {
if (i > (roi_.rows >> 2) && i < ((roi_.rows * 3) >> 2) &&
j > (roi_.cols >> 2) && j < ((roi_.cols * 3) >> 2)) {
roi_.roi_map[i * roi_.cols + j] = 1;
}
}
}
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, effective_datarate_ * 0.95)
<< " The datarate for the file exceeds the target!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.4)
<< " The datarate for the file missed the target!";
free(roi_.roi_map);
}
TEST_P(DatarateTestRealTime, GFBoost) {
denoiser_on_ = 0;
@@ -482,6 +538,7 @@ class DatarateTestVP9Large
}
denoiser_offon_test_ = 0;
denoiser_offon_period_ = -1;
frame_parallel_decoding_mode_ = 1;
}
//
@@ -496,8 +553,8 @@ class DatarateTestVP9Large
// 2 6
// 0 4 ....
// LAST is always update on base/layer 0, GOLDEN is updated on layer 1.
// For this 3 layer example, the 2nd enhancement layer (layer 2) does not
// update any reference frames.
// For this 3 layer example, the 2nd enhancement layer (layer 2) updates
// the altref frame.
int SetFrameFlags(int frame_num, int num_temp_layers) {
int frame_flags = 0;
if (num_temp_layers == 2) {
@@ -519,9 +576,8 @@ class DatarateTestVP9Large
// Layer 1: predict from L, G, ARF; update G.
frame_flags = VP8_EFLAG_NO_UPD_ARF | VP8_EFLAG_NO_UPD_LAST;
} else if ((frame_num - 1) % 2 == 0) {
// Layer 2: predict from L, G, ARF; update none.
frame_flags =
VP8_EFLAG_NO_UPD_GF | VP8_EFLAG_NO_UPD_ARF | VP8_EFLAG_NO_UPD_LAST;
// Layer 2: predict from L, G, ARF; update ARF.
frame_flags = VP8_EFLAG_NO_UPD_GF | VP8_EFLAG_NO_UPD_LAST;
}
}
return frame_flags;
@@ -561,6 +617,9 @@ class DatarateTestVP9Large
}
encoder->Control(VP9E_SET_NOISE_SENSITIVITY, denoiser_on_);
encoder->Control(VP9E_SET_TILE_COLUMNS, (cfg_.g_threads >> 1));
encoder->Control(VP9E_SET_FRAME_PARALLEL_DECODING,
frame_parallel_decoding_mode_);
if (cfg_.ts_number_layers > 1) {
if (video->frame() == 0) {
@@ -599,8 +658,8 @@ class DatarateTestVP9Large
duration * timebase_ * cfg_.rc_target_bitrate * 1000);
// Buffer should not go negative.
ASSERT_GE(bits_in_buffer_model_, 0) << "Buffer Underrun at frame "
<< pkt->data.frame.pts;
ASSERT_GE(bits_in_buffer_model_, 0)
<< "Buffer Underrun at frame " << pkt->data.frame.pts;
const size_t frame_size_in_bits = pkt->data.frame.sz * 8;
@@ -641,6 +700,7 @@ class DatarateTestVP9Large
int denoiser_on_;
int denoiser_offon_test_;
int denoiser_offon_period_;
int frame_parallel_decoding_mode_;
};
// Check basic rate targeting for VBR mode with 0 lag.
@@ -659,7 +719,7 @@ TEST_P(DatarateTestVP9Large, BasicRateTargetingVBRLagZero) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.75)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.25)
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.30)
<< " The datarate for the file is greater than target by too much!";
}
}
@@ -686,7 +746,37 @@ TEST_P(DatarateTestVP9Large, BasicRateTargetingVBRLagNonZero) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.75)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.25)
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.30)
<< " The datarate for the file is greater than target by too much!";
}
}
// Check basic rate targeting for VBR mode with non-zero lag, with
// frame_parallel_decoding_mode off. This enables the adapt_coeff/mode/mv probs
// since error_resilience is off.
TEST_P(DatarateTestVP9Large, BasicRateTargetingVBRLagNonZeroFrameParDecOff) {
cfg_.rc_min_quantizer = 0;
cfg_.rc_max_quantizer = 63;
cfg_.g_error_resilient = 0;
cfg_.rc_end_usage = VPX_VBR;
// For non-zero lag, rate control will work (be within bounds) for
// real-time mode.
if (deadline_ == VPX_DL_REALTIME) {
cfg_.g_lag_in_frames = 15;
} else {
cfg_.g_lag_in_frames = 0;
}
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 300);
for (int i = 400; i <= 800; i += 400) {
cfg_.rc_target_bitrate = i;
ResetModel();
frame_parallel_decoding_mode_ = 0;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.75)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.30)
<< " The datarate for the file is greater than target by too much!";
}
}
@@ -715,6 +805,33 @@ TEST_P(DatarateTestVP9Large, BasicRateTargeting) {
}
}
// Check basic rate targeting for CBR mode, with frame_parallel_decoding_mode
// off( and error_resilience off).
TEST_P(DatarateTestVP9Large, BasicRateTargetingFrameParDecOff) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
cfg_.rc_dropframe_thresh = 1;
cfg_.rc_min_quantizer = 0;
cfg_.rc_max_quantizer = 63;
cfg_.rc_end_usage = VPX_CBR;
cfg_.g_lag_in_frames = 0;
cfg_.g_error_resilient = 0;
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 140);
for (int i = 150; i < 800; i += 200) {
cfg_.rc_target_bitrate = i;
ResetModel();
frame_parallel_decoding_mode_ = 0;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.85)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.15)
<< " The datarate for the file is greater than target by too much!";
}
}
// Check basic rate targeting for CBR mode, with 2 threads and dropped frames.
TEST_P(DatarateTestVP9Large, BasicRateTargetingDropFramesMultiThreads) {
cfg_.rc_buf_initial_sz = 500;
@@ -759,7 +876,7 @@ TEST_P(DatarateTestVP9Large, BasicRateTargeting444) {
ResetModel();
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(static_cast<double>(cfg_.rc_target_bitrate),
effective_datarate_[0] * 0.85)
effective_datarate_[0] * 0.80)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(static_cast<double>(cfg_.rc_target_bitrate),
effective_datarate_[0] * 1.15)
@@ -792,26 +909,29 @@ TEST_P(DatarateTestVP9Large, ChangingDropFrameThresh) {
30, 1, 0, 140);
const int kDropFrameThreshTestStep = 30;
vpx_codec_pts_t last_drop = 140;
int last_num_drops = 0;
for (int i = 10; i < 100; i += kDropFrameThreshTestStep) {
cfg_.rc_dropframe_thresh = i;
ResetModel();
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.85)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.15)
<< " The datarate for the file is greater than target by too much!";
ASSERT_LE(first_drop_, last_drop)
<< " The first dropped frame for drop_thresh " << i
<< " > first dropped frame for drop_thresh "
<< i - kDropFrameThreshTestStep;
ASSERT_GE(num_drops_, last_num_drops * 0.85)
<< " The number of dropped frames for drop_thresh " << i
<< " < number of dropped frames for drop_thresh "
<< i - kDropFrameThreshTestStep;
last_drop = first_drop_;
last_num_drops = num_drops_;
for (int j = 50; j <= 150; j += 100) {
cfg_.rc_target_bitrate = j;
vpx_codec_pts_t last_drop = 140;
int last_num_drops = 0;
for (int i = 10; i < 100; i += kDropFrameThreshTestStep) {
cfg_.rc_dropframe_thresh = i;
ResetModel();
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.85)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.25)
<< " The datarate for the file is greater than target by too much!";
ASSERT_LE(first_drop_, last_drop)
<< " The first dropped frame for drop_thresh " << i
<< " > first dropped frame for drop_thresh "
<< i - kDropFrameThreshTestStep;
ASSERT_GE(num_drops_, last_num_drops * 0.85)
<< " The number of dropped frames for drop_thresh " << i
<< " < number of dropped frames for drop_thresh "
<< i - kDropFrameThreshTestStep;
last_drop = first_drop_;
last_num_drops = num_drops_;
}
}
}
@@ -988,7 +1108,7 @@ TEST_P(DatarateTestVP9LargeDenoiser, LowNoise) {
}
// Check basic datarate targeting, for a single bitrate, when denoiser is on,
// for clip with high noise level.
// for clip with high noise level. Use 2 threads.
TEST_P(DatarateTestVP9LargeDenoiser, HighNoise) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
@@ -998,9 +1118,39 @@ TEST_P(DatarateTestVP9LargeDenoiser, HighNoise) {
cfg_.rc_max_quantizer = 56;
cfg_.rc_end_usage = VPX_CBR;
cfg_.g_lag_in_frames = 0;
cfg_.g_threads = 2;
::libvpx_test::Y4mVideoSource video("noisy_clip_640_360.y4m", 0, 200);
// For the temporal denoiser (#if CONFIG_VP9_TEMPORAL_DENOISING),
// there is only one denoiser mode: kDenoiserOnYOnly(which is 1),
// but may add more modes in the future.
cfg_.rc_target_bitrate = 1000;
ResetModel();
// Turn on the denoiser.
denoiser_on_ = 1;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.85)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.15)
<< " The datarate for the file is greater than target by too much!";
}
// Check basic datarate targeting, for a single bitrate, when denoiser is on,
// for 1280x720 clip with 4 threads.
TEST_P(DatarateTestVP9LargeDenoiser, 4threads) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
cfg_.rc_dropframe_thresh = 1;
cfg_.rc_min_quantizer = 2;
cfg_.rc_max_quantizer = 56;
cfg_.rc_end_usage = VPX_CBR;
cfg_.g_lag_in_frames = 0;
cfg_.g_threads = 4;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
// For the temporal denoiser (#if CONFIG_VP9_TEMPORAL_DENOISING),
// there is only one denoiser mode: denoiserYonly(which is 1),
// but may add more modes in the future.
@@ -1011,7 +1161,7 @@ TEST_P(DatarateTestVP9LargeDenoiser, HighNoise) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(effective_datarate_[0], cfg_.rc_target_bitrate * 0.85)
<< " The datarate for the file is lower than target by too much!";
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.15)
ASSERT_LE(effective_datarate_[0], cfg_.rc_target_bitrate * 1.29)
<< " The datarate for the file is greater than target by too much!";
}
@@ -1066,13 +1216,17 @@ class DatarateOnePassCbrSvc
}
virtual void ResetModel() {
last_pts_ = 0;
bits_in_buffer_model_ = cfg_.rc_target_bitrate * cfg_.rc_buf_initial_sz;
frame_number_ = 0;
first_drop_ = 0;
bits_total_ = 0;
duration_ = 0.0;
mismatch_psnr_ = 0.0;
mismatch_nframes_ = 0;
denoiser_on_ = 0;
tune_content_ = 0;
base_speed_setting_ = 5;
spatial_layer_id_ = 0;
temporal_layer_id_ = 0;
memset(bits_in_buffer_model_, 0, sizeof(bits_in_buffer_model_));
memset(bits_total_, 0, sizeof(bits_total_));
memset(layer_target_avg_bandwidth_, 0, sizeof(layer_target_avg_bandwidth_));
}
virtual void BeginPassHook(unsigned int /*pass*/) {}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
@@ -1083,48 +1237,114 @@ class DatarateOnePassCbrSvc
svc_params_.max_quantizers[i] = 63;
svc_params_.min_quantizers[i] = 0;
}
svc_params_.speed_per_layer[0] = 5;
svc_params_.speed_per_layer[0] = base_speed_setting_;
for (i = 1; i < VPX_SS_MAX_LAYERS; ++i) {
svc_params_.speed_per_layer[i] = speed_setting_;
}
encoder->Control(VP9E_SET_NOISE_SENSITIVITY, denoiser_on_);
encoder->Control(VP9E_SET_SVC, 1);
encoder->Control(VP9E_SET_SVC_PARAMETERS, &svc_params_);
encoder->Control(VP8E_SET_CPUUSED, speed_setting_);
encoder->Control(VP9E_SET_TILE_COLUMNS, 0);
encoder->Control(VP8E_SET_MAX_INTRA_BITRATE_PCT, 300);
encoder->Control(VP9E_SET_TILE_COLUMNS, (cfg_.g_threads >> 1));
encoder->Control(VP9E_SET_ROW_MT, 1);
encoder->Control(VP8E_SET_STATIC_THRESHOLD, 1);
encoder->Control(VP9E_SET_TUNE_CONTENT, tune_content_);
}
const vpx_rational_t tb = video->timebase();
timebase_ = static_cast<double>(tb.num) / tb.den;
duration_ = 0;
}
virtual void PostEncodeFrameHook(::libvpx_test::Encoder *encoder) {
vpx_svc_layer_id_t layer_id;
encoder->Control(VP9E_GET_SVC_LAYER_ID, &layer_id);
spatial_layer_id_ = layer_id.spatial_layer_id;
temporal_layer_id_ = layer_id.temporal_layer_id;
// Update buffer with per-layer target frame bandwidth, this is done
// for every frame passed to the encoder (encoded or dropped).
// For temporal layers, update the cumulative buffer level.
for (int sl = 0; sl < number_spatial_layers_; ++sl) {
for (int tl = temporal_layer_id_; tl < number_temporal_layers_; ++tl) {
const int layer = sl * number_temporal_layers_ + tl;
bits_in_buffer_model_[layer] +=
static_cast<int64_t>(layer_target_avg_bandwidth_[layer]);
}
}
}
vpx_codec_err_t parse_superframe_index(const uint8_t *data, size_t data_sz,
uint32_t sizes[8], int *count) {
uint8_t marker;
marker = *(data + data_sz - 1);
*count = 0;
if ((marker & 0xe0) == 0xc0) {
const uint32_t frames = (marker & 0x7) + 1;
const uint32_t mag = ((marker >> 3) & 0x3) + 1;
const size_t index_sz = 2 + mag * frames;
// This chunk is marked as having a superframe index but doesn't have
// enough data for it, thus it's an invalid superframe index.
if (data_sz < index_sz) return VPX_CODEC_CORRUPT_FRAME;
{
const uint8_t marker2 = *(data + data_sz - index_sz);
// This chunk is marked as having a superframe index but doesn't have
// the matching marker byte at the front of the index therefore it's an
// invalid chunk.
if (marker != marker2) return VPX_CODEC_CORRUPT_FRAME;
}
{
uint32_t i, j;
const uint8_t *x = &data[data_sz - index_sz + 1];
for (i = 0; i < frames; ++i) {
uint32_t this_sz = 0;
for (j = 0; j < mag; ++j) this_sz |= (*x++) << (j * 8);
sizes[i] = this_sz;
}
*count = frames;
}
}
return VPX_CODEC_OK;
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
vpx_codec_pts_t duration = pkt->data.frame.pts - last_pts_;
if (last_pts_ == 0) duration = 1;
bits_in_buffer_model_ += static_cast<int64_t>(
duration * timebase_ * cfg_.rc_target_bitrate * 1000);
uint32_t sizes[8] = { 0 };
int count = 0;
last_pts_ = pkt->data.frame.pts;
const bool key_frame =
(pkt->data.frame.flags & VPX_FRAME_IS_KEY) ? true : false;
if (!key_frame) {
// TODO(marpan): This check currently fails for some of the SVC tests,
// re-enable when issue (webm:1350) is resolved.
// ASSERT_GE(bits_in_buffer_model_, 0) << "Buffer Underrun at frame "
// << pkt->data.frame.pts;
parse_superframe_index(static_cast<const uint8_t *>(pkt->data.frame.buf),
pkt->data.frame.sz, sizes, &count);
ASSERT_EQ(count, number_spatial_layers_);
for (int sl = 0; sl < number_spatial_layers_; ++sl) {
sizes[sl] = sizes[sl] << 3;
// Update the total encoded bits per layer.
// For temporal layers, update the cumulative encoded bits per layer.
for (int tl = temporal_layer_id_; tl < number_temporal_layers_; ++tl) {
const int layer = sl * number_temporal_layers_ + tl;
bits_total_[layer] += static_cast<int64_t>(sizes[sl]);
// Update the per-layer buffer level with the encoded frame size.
bits_in_buffer_model_[layer] -= static_cast<int64_t>(sizes[sl]);
// There should be no buffer underrun, except on the base
// temporal layer, since there may be key frames there.
if (!key_frame && tl > 0) {
ASSERT_GE(bits_in_buffer_model_[layer], 0)
<< "Buffer Underrun at frame " << pkt->data.frame.pts;
}
}
}
const size_t frame_size_in_bits = pkt->data.frame.sz * 8;
bits_in_buffer_model_ -= static_cast<int64_t>(frame_size_in_bits);
bits_total_ += frame_size_in_bits;
if (!first_drop_ && duration > 1) first_drop_ = last_pts_ + 1;
last_pts_ = pkt->data.frame.pts;
bits_in_last_frame_ = frame_size_in_bits;
++frame_number_;
}
virtual void EndPassHook(void) {
if (bits_total_) {
const double file_size_in_kb = bits_total_ / 1000.; // bits per kilobit
duration_ = (last_pts_ + 1) * timebase_;
file_datarate_ = file_size_in_kb / duration_;
for (int sl = 0; sl < number_spatial_layers_; ++sl) {
for (int tl = 0; tl < number_temporal_layers_; ++tl) {
const int layer = sl * number_temporal_layers_ + tl;
const double file_size_in_kb = bits_total_[layer] / 1000.;
duration_ = (last_pts_ + 1) * timebase_;
file_datarate_[layer] = file_size_in_kb / duration_;
}
}
}
@@ -1137,26 +1357,35 @@ class DatarateOnePassCbrSvc
unsigned int GetMismatchFrames() { return mismatch_nframes_; }
vpx_codec_pts_t last_pts_;
int64_t bits_in_buffer_model_;
int64_t bits_in_buffer_model_[VPX_MAX_LAYERS];
double timebase_;
int frame_number_;
vpx_codec_pts_t first_drop_;
int64_t bits_total_;
int64_t bits_total_[VPX_MAX_LAYERS];
double duration_;
double file_datarate_;
double file_datarate_[VPX_MAX_LAYERS];
size_t bits_in_last_frame_;
vpx_svc_extra_cfg_t svc_params_;
int speed_setting_;
double mismatch_psnr_;
int mismatch_nframes_;
int denoiser_on_;
int tune_content_;
int base_speed_setting_;
int spatial_layer_id_;
int temporal_layer_id_;
int number_spatial_layers_;
int number_temporal_layers_;
int layer_target_avg_bandwidth_[VPX_MAX_LAYERS];
};
static void assign_layer_bitrates(vpx_codec_enc_cfg_t *const enc_cfg,
const vpx_svc_extra_cfg_t *svc_params,
int spatial_layers, int temporal_layers,
int temporal_layering_mode) {
int temporal_layering_mode,
int *layer_target_avg_bandwidth,
int64_t *bits_in_buffer_model) {
int sl, spatial_layer_target;
float total = 0;
float alloc_ratio[VPX_MAX_LAYERS] = { 0 };
float framerate = 30.0;
for (sl = 0; sl < spatial_layers; ++sl) {
if (svc_params->scaling_factor_den[sl] > 0) {
alloc_ratio[sl] = (float)(svc_params->scaling_factor_num[sl] * 1.0 /
@@ -1176,13 +1405,85 @@ static void assign_layer_bitrates(vpx_codec_enc_cfg_t *const enc_cfg,
} else if (temporal_layering_mode == 2) {
enc_cfg->layer_target_bitrate[index] = spatial_layer_target * 2 / 3;
enc_cfg->layer_target_bitrate[index + 1] = spatial_layer_target;
} else if (temporal_layering_mode <= 1) {
enc_cfg->layer_target_bitrate[index] = spatial_layer_target;
}
}
for (sl = 0; sl < spatial_layers; ++sl) {
for (int tl = 0; tl < temporal_layers; ++tl) {
const int layer = sl * temporal_layers + tl;
float layer_framerate = framerate;
if (temporal_layers == 2 && tl == 0) layer_framerate = framerate / 2;
if (temporal_layers == 3 && tl == 0) layer_framerate = framerate / 4;
if (temporal_layers == 3 && tl == 1) layer_framerate = framerate / 2;
layer_target_avg_bandwidth[layer] = static_cast<int>(
enc_cfg->layer_target_bitrate[layer] * 1000.0 / layer_framerate);
bits_in_buffer_model[layer] =
enc_cfg->layer_target_bitrate[layer] * enc_cfg->rc_buf_initial_sz;
}
}
}
static void CheckLayerRateTargeting(vpx_codec_enc_cfg_t *const cfg,
int number_spatial_layers,
int number_temporal_layers,
double *file_datarate,
double thresh_overshoot,
double thresh_undershoot) {
for (int sl = 0; sl < number_spatial_layers; ++sl)
for (int tl = 0; tl < number_temporal_layers; ++tl) {
const int layer = sl * number_temporal_layers + tl;
ASSERT_GE(cfg->layer_target_bitrate[layer],
file_datarate[layer] * thresh_overshoot)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg->layer_target_bitrate[layer],
file_datarate[layer] * thresh_undershoot)
<< " The datarate for the file is lower than the target by too much!";
}
}
// Check basic rate targeting for 1 pass CBR SVC: 2 spatial layers and 1
// temporal layer, with screen content mode on and same speed setting for all
// layers.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL1TLScreenContent1) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
cfg_.rc_min_quantizer = 0;
cfg_.rc_max_quantizer = 63;
cfg_.rc_end_usage = VPX_CBR;
cfg_.g_lag_in_frames = 0;
cfg_.ss_number_layers = 2;
cfg_.ts_number_layers = 1;
cfg_.ts_rate_decimator[0] = 1;
cfg_.g_error_resilient = 1;
cfg_.g_threads = 1;
cfg_.temporal_layering_mode = 0;
svc_params_.scaling_factor_num[0] = 144;
svc_params_.scaling_factor_den[0] = 288;
svc_params_.scaling_factor_num[1] = 288;
svc_params_.scaling_factor_den[1] = 288;
cfg_.rc_dropframe_thresh = 10;
cfg_.kf_max_dist = 9999;
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 60);
cfg_.rc_target_bitrate = 500;
ResetModel();
tune_content_ = 1;
base_speed_setting_ = speed_setting_;
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78, 1.15);
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
}
// Check basic rate targeting for 1 pass CBR SVC: 2 spatial layers and
// 3 temporal layers. Run CIF clip with 1 thread.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL3TL) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1202,29 +1503,93 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers) {
svc_params_.scaling_factor_den[0] = 288;
svc_params_.scaling_factor_num[1] = 288;
svc_params_.scaling_factor_den[1] = 288;
cfg_.rc_dropframe_thresh = 10;
cfg_.rc_dropframe_thresh = 0;
cfg_.kf_max_dist = 9999;
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 200);
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::I420VideoSource video("niklas_640_480_30.yuv", 640, 480, 30, 1,
0, 400);
// TODO(marpan): Check that effective_datarate for each layer hits the
// layer target_bitrate.
for (int i = 200; i <= 800; i += 200) {
cfg_.rc_target_bitrate = i;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.15)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78,
1.15);
#if CONFIG_VP9_DECODER
// Number of temporal layers > 1, so half of the frames in this SVC pattern
// will be non-reference frame and hence encoder will avoid loopfilter.
// Since frame dropper is off, we can expect 200 (half of the sequence)
// mismatched frames.
EXPECT_EQ(static_cast<unsigned int>(200), GetMismatchFrames());
#endif
}
}
// Check basic rate targeting for 1 pass CBR SVC with denoising.
// 2 spatial layers and 3 temporal layer. Run HD clip with 2 threads.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL3TLDenoiserOn) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
cfg_.rc_min_quantizer = 0;
cfg_.rc_max_quantizer = 63;
cfg_.rc_end_usage = VPX_CBR;
cfg_.g_lag_in_frames = 0;
cfg_.ss_number_layers = 2;
cfg_.ts_number_layers = 3;
cfg_.ts_rate_decimator[0] = 4;
cfg_.ts_rate_decimator[1] = 2;
cfg_.ts_rate_decimator[2] = 1;
cfg_.g_error_resilient = 1;
cfg_.g_threads = 2;
cfg_.temporal_layering_mode = 3;
svc_params_.scaling_factor_num[0] = 144;
svc_params_.scaling_factor_den[0] = 288;
svc_params_.scaling_factor_num[1] = 288;
svc_params_.scaling_factor_den[1] = 288;
cfg_.rc_dropframe_thresh = 0;
cfg_.kf_max_dist = 9999;
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::I420VideoSource video("niklas_640_480_30.yuv", 640, 480, 30, 1,
0, 400);
// TODO(marpan): Check that effective_datarate for each layer hits the
// layer target_bitrate.
// For SVC, noise_sen = 1 means denoising only the top spatial layer
// noise_sen = 2 means denoising the two top spatial layers.
for (int noise_sen = 1; noise_sen <= 2; noise_sen++) {
for (int i = 600; i <= 1000; i += 200) {
cfg_.rc_target_bitrate = i;
ResetModel();
denoiser_on_ = noise_sen;
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78,
1.15);
#if CONFIG_VP9_DECODER
// Number of temporal layers > 1, so half of the frames in this SVC
// pattern
// will be non-reference frame and hence encoder will avoid loopfilter.
// Since frame dropper is off, we can expect 200 (half of the sequence)
// mismatched frames.
EXPECT_EQ(static_cast<unsigned int>(200), GetMismatchFrames());
#endif
}
}
}
// Check basic rate targeting for 1 pass CBR SVC: 2 spatial layers and 3
// temporal layers. Run CIF clip with 1 thread, and few short key frame periods.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayersSmallKf) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL3TLSmallKf) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1245,28 +1610,29 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayersSmallKf) {
svc_params_.scaling_factor_num[1] = 288;
svc_params_.scaling_factor_den[1] = 288;
cfg_.rc_dropframe_thresh = 10;
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 200);
cfg_.rc_target_bitrate = 400;
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::I420VideoSource video("niklas_640_480_30.yuv", 640, 480, 30, 1,
0, 400);
// For this 3 temporal layer case, pattern repeats every 4 frames, so choose
// 4 key neighboring key frame periods (so key frame will land on 0-2-1-2).
for (int j = 64; j <= 67; j++) {
cfg_.kf_max_dist = j;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.15)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78,
1.15);
}
}
// Check basic rate targeting for 1 pass CBR SVC: 2 spatial layers and
// 3 temporal layers. Run HD clip with 4 threads.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers4threads) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL3TL4Threads) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1286,24 +1652,31 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers4threads) {
svc_params_.scaling_factor_den[0] = 288;
svc_params_.scaling_factor_num[1] = 288;
svc_params_.scaling_factor_den[1] = 288;
cfg_.rc_dropframe_thresh = 10;
cfg_.rc_dropframe_thresh = 0;
cfg_.kf_max_dist = 9999;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 60);
cfg_.rc_target_bitrate = 800;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.15)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78, 1.15);
#if CONFIG_VP9_DECODER
// Number of temporal layers > 1, so half of the frames in this SVC pattern
// will be non-reference frame and hence encoder will avoid loopfilter.
// Since frame dropper is off, we can expect 30 (half of the sequence)
// mismatched frames.
EXPECT_EQ(static_cast<unsigned int>(30), GetMismatchFrames());
#endif
}
// Check basic rate targeting for 1 pass CBR SVC: 3 spatial layers and
// 3 temporal layers. Run CIF clip with 1 thread.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayers) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SL3TL) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1325,24 +1698,32 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayers) {
svc_params_.scaling_factor_den[1] = 288;
svc_params_.scaling_factor_num[2] = 288;
svc_params_.scaling_factor_den[2] = 288;
cfg_.rc_dropframe_thresh = 10;
cfg_.rc_dropframe_thresh = 0;
cfg_.kf_max_dist = 9999;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::I420VideoSource video("niklas_640_480_30.yuv", 640, 480, 30, 1,
0, 400);
cfg_.rc_target_bitrate = 800;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.22)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78, 1.15);
#if CONFIG_VP9_DECODER
// Number of temporal layers > 1, so half of the frames in this SVC pattern
// will be non-reference frame and hence encoder will avoid loopfilter.
// Since frame dropper is off, we can expect 200 (half of the sequence)
// mismatched frames.
EXPECT_EQ(static_cast<unsigned int>(200), GetMismatchFrames());
#endif
}
// Check basic rate targeting for 1 pass CBR SVC: 3 spatial layers and 3
// temporal layers. Run CIF clip with 1 thread, and few short key frame periods.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayersSmallKf) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SL3TLSmallKf) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1365,27 +1746,29 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayersSmallKf) {
svc_params_.scaling_factor_num[2] = 288;
svc_params_.scaling_factor_den[2] = 288;
cfg_.rc_dropframe_thresh = 10;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
cfg_.rc_target_bitrate = 800;
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::I420VideoSource video("niklas_640_480_30.yuv", 640, 480, 30, 1,
0, 400);
// For this 3 temporal layer case, pattern repeats every 4 frames, so choose
// 4 key neighboring key frame periods (so key frame will land on 0-2-1-2).
for (int j = 32; j <= 35; j++) {
cfg_.kf_max_dist = j;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.30)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78,
1.15);
}
}
// Check basic rate targeting for 1 pass CBR SVC: 3 spatial layers and
// 3 temporal layers. Run HD clip with 4 threads.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayers4threads) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SL3TL4threads) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1407,24 +1790,31 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc3SpatialLayers4threads) {
svc_params_.scaling_factor_den[1] = 288;
svc_params_.scaling_factor_num[2] = 288;
svc_params_.scaling_factor_den[2] = 288;
cfg_.rc_dropframe_thresh = 10;
cfg_.rc_dropframe_thresh = 0;
cfg_.kf_max_dist = 9999;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 60);
cfg_.rc_target_bitrate = 800;
ResetModel();
assign_layer_bitrates(&cfg_, &svc_params_, cfg_.ss_number_layers,
cfg_.ts_number_layers, cfg_.temporal_layering_mode);
cfg_.ts_number_layers, cfg_.temporal_layering_mode,
layer_target_avg_bandwidth_, bits_in_buffer_model_);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(cfg_.rc_target_bitrate, file_datarate_ * 0.85)
<< " The datarate for the file exceeds the target by too much!";
ASSERT_LE(cfg_.rc_target_bitrate, file_datarate_ * 1.22)
<< " The datarate for the file is lower than the target by too much!";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78, 1.15);
#if CONFIG_VP9_DECODER
// Number of temporal layers > 1, so half of the frames in this SVC pattern
// will be non-reference frame and hence encoder will avoid loopfilter.
// Since frame dropper is off, we can expect 30 (half of the sequence)
// mismatched frames.
EXPECT_EQ(static_cast<unsigned int>(30), GetMismatchFrames());
#endif
}
// Run SVC encoder for 1 temporal layer, 2 spatial layers, with spatial
// downscale 5x5.
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers5x5MultipleRuns) {
TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SL1TL5x5MultipleRuns) {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 500;
cfg_.rc_buf_sz = 1000;
@@ -1442,7 +1832,7 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers5x5MultipleRuns) {
svc_params_.scaling_factor_den[0] = 1280;
svc_params_.scaling_factor_num[1] = 1280;
svc_params_.scaling_factor_den[1] = 1280;
cfg_.rc_dropframe_thresh = 0;
cfg_.rc_dropframe_thresh = 10;
cfg_.kf_max_dist = 999999;
cfg_.kf_min_dist = 0;
cfg_.ss_target_bitrate[0] = 300;
@@ -1450,9 +1840,19 @@ TEST_P(DatarateOnePassCbrSvc, OnePassCbrSvc2SpatialLayers5x5MultipleRuns) {
cfg_.layer_target_bitrate[0] = 300;
cfg_.layer_target_bitrate[1] = 1400;
cfg_.rc_target_bitrate = 1700;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 300);
number_spatial_layers_ = cfg_.ss_number_layers;
number_temporal_layers_ = cfg_.ts_number_layers;
ResetModel();
layer_target_avg_bandwidth_[0] = cfg_.layer_target_bitrate[0] * 1000 / 30;
bits_in_buffer_model_[0] =
cfg_.layer_target_bitrate[0] * cfg_.rc_buf_initial_sz;
layer_target_avg_bandwidth_[1] = cfg_.layer_target_bitrate[1] * 1000 / 30;
bits_in_buffer_model_[1] =
cfg_.layer_target_bitrate[1] * cfg_.rc_buf_initial_sz;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 60);
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
CheckLayerRateTargeting(&cfg_, number_spatial_layers_,
number_temporal_layers_, file_datarate_, 0.78, 1.15);
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
}

View File

@@ -255,11 +255,11 @@ void iht16x16_ref(const tran_low_t *in, uint8_t *dest, int stride,
#if CONFIG_VP9_HIGHBITDEPTH
void idct16x16_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_256_add_c(in, out, stride, 10);
vpx_highbd_idct16x16_256_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct16x16_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_256_add_c(in, out, stride, 12);
vpx_highbd_idct16x16_256_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void idct16x16_10_ref(const tran_low_t *in, uint8_t *out, int stride,
@@ -273,36 +273,36 @@ void idct16x16_12_ref(const tran_low_t *in, uint8_t *out, int stride,
}
void iht16x16_10(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht16x16_256_add_c(in, out, stride, tx_type, 10);
vp9_highbd_iht16x16_256_add_c(in, CAST_TO_SHORTPTR(out), stride, tx_type, 10);
}
void iht16x16_12(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht16x16_256_add_c(in, out, stride, tx_type, 12);
vp9_highbd_iht16x16_256_add_c(in, CAST_TO_SHORTPTR(out), stride, tx_type, 12);
}
#if HAVE_SSE2
void idct16x16_10_add_10_c(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_10_add_c(in, out, stride, 10);
vpx_highbd_idct16x16_10_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct16x16_10_add_12_c(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_10_add_c(in, out, stride, 12);
vpx_highbd_idct16x16_10_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void idct16x16_256_add_10_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_256_add_sse2(in, out, stride, 10);
vpx_highbd_idct16x16_256_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct16x16_256_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_256_add_sse2(in, out, stride, 12);
vpx_highbd_idct16x16_256_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void idct16x16_10_add_10_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_10_add_sse2(in, out, stride, 10);
vpx_highbd_idct16x16_10_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct16x16_10_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct16x16_10_add_sse2(in, out, stride, 12);
vpx_highbd_idct16x16_10_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 12);
}
#endif // HAVE_SSE2
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -353,7 +353,7 @@ class Trans16x16TestBase {
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(test_temp_block, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
@@ -475,10 +475,10 @@ class Trans16x16TestBase {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(output_ref_block, dst, pitch_));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
inv_txfm_ref(output_ref_block, CONVERT_TO_BYTEPTR(ref16), pitch_,
inv_txfm_ref(output_ref_block, CAST_TO_BYTEPTR(ref16), pitch_,
tx_type_);
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(output_ref_block, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(output_ref_block, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
if (bit_depth_ == VPX_BITS_8) {
@@ -530,8 +530,7 @@ class Trans16x16TestBase {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, dst, 16));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CONVERT_TO_BYTEPTR(dst16), 16));
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), 16));
#endif // CONFIG_VP9_HIGHBITDEPTH
}
@@ -543,8 +542,8 @@ class Trans16x16TestBase {
const uint32_t diff = dst[j] - src[j];
#endif // CONFIG_VP9_HIGHBITDEPTH
const uint32_t error = diff * diff;
EXPECT_GE(1u, error) << "Error: 16x16 IDCT has error " << error
<< " at index " << j;
EXPECT_GE(1u, error)
<< "Error: 16x16 IDCT has error " << error << " at index " << j;
}
}
}
@@ -585,9 +584,9 @@ class Trans16x16TestBase {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, dst, pitch_));
} else {
#if CONFIG_VP9_HIGHBITDEPTH
ref_txfm(coeff, CONVERT_TO_BYTEPTR(ref16), pitch_);
ref_txfm(coeff, CAST_TO_BYTEPTR(ref16), pitch_);
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_));
#endif // CONFIG_VP9_HIGHBITDEPTH
}
@@ -745,66 +744,6 @@ TEST_P(InvTrans16x16DCT, CompareReference) {
CompareInvReference(ref_txfm_, thresh_);
}
class PartialTrans16x16Test : public ::testing::TestWithParam<
std::tr1::tuple<FdctFunc, vpx_bit_depth_t> > {
public:
virtual ~PartialTrans16x16Test() {}
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
bit_depth_ = GET_PARAM(1);
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
vpx_bit_depth_t bit_depth_;
FdctFunc fwd_txfm_;
};
TEST_P(PartialTrans16x16Test, Extremes) {
#if CONFIG_VP9_HIGHBITDEPTH
const int16_t maxval =
static_cast<int16_t>(clip_pixel_highbd(1 << 30, bit_depth_));
#else
const int16_t maxval = 255;
#endif
const int minval = -maxval;
DECLARE_ALIGNED(16, int16_t, input[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output[kNumCoeffs]);
for (int i = 0; i < kNumCoeffs; ++i) input[i] = maxval;
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 16));
EXPECT_EQ((maxval * kNumCoeffs) >> 1, output[0]);
for (int i = 0; i < kNumCoeffs; ++i) input[i] = minval;
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 16));
EXPECT_EQ((minval * kNumCoeffs) >> 1, output[0]);
}
TEST_P(PartialTrans16x16Test, Random) {
#if CONFIG_VP9_HIGHBITDEPTH
const int16_t maxval =
static_cast<int16_t>(clip_pixel_highbd(1 << 30, bit_depth_));
#else
const int16_t maxval = 255;
#endif
DECLARE_ALIGNED(16, int16_t, input[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output[kNumCoeffs]);
ACMRandom rnd(ACMRandom::DeterministicSeed());
int sum = 0;
for (int i = 0; i < kNumCoeffs; ++i) {
const int val = (i & 1) ? -rnd(maxval + 1) : rnd(maxval + 1);
input[i] = val;
sum += val;
}
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 16));
EXPECT_EQ(sum >> 1, output[0]);
}
using std::tr1::make_tuple;
#if CONFIG_VP9_HIGHBITDEPTH
@@ -837,11 +776,6 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 3, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(
C, PartialTrans16x16Test,
::testing::Values(make_tuple(&vpx_highbd_fdct16x16_1_c, VPX_BITS_8),
make_tuple(&vpx_highbd_fdct16x16_1_c, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct16x16_1_c, VPX_BITS_12)));
#else
INSTANTIATE_TEST_CASE_P(
C, Trans16x16HT,
@@ -850,17 +784,14 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 3, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(C, PartialTrans16x16Test,
::testing::Values(make_tuple(&vpx_fdct16x16_1_c,
VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
NEON, Trans16x16DCT,
::testing::Values(make_tuple(&vpx_fdct16x16_c, &vpx_idct16x16_256_add_neon,
0, VPX_BITS_8)));
#endif
::testing::Values(make_tuple(&vpx_fdct16x16_neon,
&vpx_idct16x16_256_add_neon, 0, VPX_BITS_8)));
#endif // HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
@@ -877,9 +808,6 @@ INSTANTIATE_TEST_CASE_P(
2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2,
3, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(SSE2, PartialTrans16x16Test,
::testing::Values(make_tuple(&vpx_fdct16x16_1_sse2,
VPX_BITS_8)));
#endif // HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -914,9 +842,6 @@ INSTANTIATE_TEST_CASE_P(
&idct16x16_10_add_12_sse2, 3167, VPX_BITS_12),
make_tuple(&idct16x16_12, &idct16x16_256_add_12_sse2,
3167, VPX_BITS_12)));
INSTANTIATE_TEST_CASE_P(SSE2, PartialTrans16x16Test,
::testing::Values(make_tuple(&vpx_fdct16x16_1_sse2,
VPX_BITS_8)));
#endif // HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -932,8 +857,12 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_fht16x16_msa, &vp9_iht16x16_256_add_msa, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_msa, &vp9_iht16x16_256_add_msa, 3,
VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(MSA, PartialTrans16x16Test,
::testing::Values(make_tuple(&vpx_fdct16x16_1_msa,
VPX_BITS_8)));
#endif // HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(VSX, Trans16x16DCT,
::testing::Values(make_tuple(&vpx_fdct16x16_c,
&vpx_idct16x16_256_add_vsx,
0, VPX_BITS_8)));
#endif // HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
} // namespace

View File

@@ -71,11 +71,11 @@ typedef std::tr1::tuple<FwdTxfmFunc, InvTxfmFunc, int, vpx_bit_depth_t>
#if CONFIG_VP9_HIGHBITDEPTH
void idct32x32_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct32x32_1024_add_c(in, out, stride, 10);
vpx_highbd_idct32x32_1024_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct32x32_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct32x32_1024_add_c(in, out, stride, 12);
vpx_highbd_idct32x32_1024_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -137,7 +137,7 @@ TEST_P(Trans32x32Test, AccuracyCheck) {
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
inv_txfm_(test_temp_block, CONVERT_TO_BYTEPTR(dst16), 32));
inv_txfm_(test_temp_block, CAST_TO_BYTEPTR(dst16), 32));
#endif
}
@@ -275,7 +275,7 @@ TEST_P(Trans32x32Test, InverseAccuracy) {
ASM_REGISTER_STATE_CHECK(inv_txfm_(coeff, dst, 32));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(inv_txfm_(coeff, CONVERT_TO_BYTEPTR(dst16), 32));
ASM_REGISTER_STATE_CHECK(inv_txfm_(coeff, CAST_TO_BYTEPTR(dst16), 32));
#endif
}
for (int j = 0; j < kNumCoeffs; ++j) {
@@ -292,67 +292,6 @@ TEST_P(Trans32x32Test, InverseAccuracy) {
}
}
class PartialTrans32x32Test
: public ::testing::TestWithParam<
std::tr1::tuple<FwdTxfmFunc, vpx_bit_depth_t> > {
public:
virtual ~PartialTrans32x32Test() {}
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
bit_depth_ = GET_PARAM(1);
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
vpx_bit_depth_t bit_depth_;
FwdTxfmFunc fwd_txfm_;
};
TEST_P(PartialTrans32x32Test, Extremes) {
#if CONFIG_VP9_HIGHBITDEPTH
const int16_t maxval =
static_cast<int16_t>(clip_pixel_highbd(1 << 30, bit_depth_));
#else
const int16_t maxval = 255;
#endif
const int minval = -maxval;
DECLARE_ALIGNED(16, int16_t, input[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output[kNumCoeffs]);
for (int i = 0; i < kNumCoeffs; ++i) input[i] = maxval;
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 32));
EXPECT_EQ((maxval * kNumCoeffs) >> 3, output[0]);
for (int i = 0; i < kNumCoeffs; ++i) input[i] = minval;
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 32));
EXPECT_EQ((minval * kNumCoeffs) >> 3, output[0]);
}
TEST_P(PartialTrans32x32Test, Random) {
#if CONFIG_VP9_HIGHBITDEPTH
const int16_t maxval =
static_cast<int16_t>(clip_pixel_highbd(1 << 30, bit_depth_));
#else
const int16_t maxval = 255;
#endif
DECLARE_ALIGNED(16, int16_t, input[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output[kNumCoeffs]);
ACMRandom rnd(ACMRandom::DeterministicSeed());
int sum = 0;
for (int i = 0; i < kNumCoeffs; ++i) {
const int val = (i & 1) ? -rnd(maxval + 1) : rnd(maxval + 1);
input[i] = val;
sum += val;
}
output[0] = 0;
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input, output, 32));
EXPECT_EQ(sum >> 3, output[0]);
}
using std::tr1::make_tuple;
#if CONFIG_VP9_HIGHBITDEPTH
@@ -366,11 +305,6 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_c, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_c, &vpx_idct32x32_1024_add_c, 1,
VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(
C, PartialTrans32x32Test,
::testing::Values(make_tuple(&vpx_highbd_fdct32x32_1_c, VPX_BITS_8),
make_tuple(&vpx_highbd_fdct32x32_1_c, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct32x32_1_c, VPX_BITS_12)));
#else
INSTANTIATE_TEST_CASE_P(
C, Trans32x32Test,
@@ -378,19 +312,16 @@ INSTANTIATE_TEST_CASE_P(
VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_c, &vpx_idct32x32_1024_add_c,
1, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(C, PartialTrans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_1_c,
VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
NEON, Trans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_neon,
0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_c,
::testing::Values(make_tuple(&vpx_fdct32x32_neon,
&vpx_idct32x32_1024_add_neon, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_neon,
&vpx_idct32x32_1024_add_neon, 1, VPX_BITS_8)));
#endif // HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#endif // HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
@@ -399,9 +330,6 @@ INSTANTIATE_TEST_CASE_P(
&vpx_idct32x32_1024_add_sse2, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_sse2,
&vpx_idct32x32_1024_add_sse2, 1, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(SSE2, PartialTrans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_1_sse2,
VPX_BITS_8)));
#endif // HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -418,9 +346,6 @@ INSTANTIATE_TEST_CASE_P(
VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_sse2, &vpx_idct32x32_1024_add_c, 1,
VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(SSE2, PartialTrans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_1_sse2,
VPX_BITS_8)));
#endif // HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_AVX2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -439,8 +364,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_idct32x32_1024_add_msa, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_msa,
&vpx_idct32x32_1024_add_msa, 1, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(MSA, PartialTrans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_1_msa,
VPX_BITS_8)));
#endif // HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
VSX, Trans32x32Test,
::testing::Values(make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_vsx,
0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_c,
&vpx_idct32x32_1024_add_vsx, 1, VPX_BITS_8)));
#endif // HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
} // namespace

169
test/dct_partial_test.cc Normal file
View File

@@ -0,0 +1,169 @@
/*
* Copyright (c) 2017 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_dsp/vpx_dsp_common.h"
using libvpx_test::ACMRandom;
using libvpx_test::Buffer;
using std::tr1::tuple;
using std::tr1::make_tuple;
namespace {
typedef void (*PartialFdctFunc)(const int16_t *in, tran_low_t *out, int stride);
typedef tuple<PartialFdctFunc, int /* size */, vpx_bit_depth_t>
PartialFdctParam;
tran_low_t partial_fdct_ref(const Buffer<int16_t> &in, int size) {
int64_t sum = 0;
for (int y = 0; y < size; ++y) {
for (int x = 0; x < size; ++x) {
sum += in.TopLeftPixel()[y * in.stride() + x];
}
}
switch (size) {
case 4: sum *= 2; break;
case 8: /*sum = sum;*/ break;
case 16: sum >>= 1; break;
case 32: sum >>= 3; break;
}
return static_cast<tran_low_t>(sum);
}
class PartialFdctTest : public ::testing::TestWithParam<PartialFdctParam> {
public:
PartialFdctTest() {
fwd_txfm_ = GET_PARAM(0);
size_ = GET_PARAM(1);
bit_depth_ = GET_PARAM(2);
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
void RunTest() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int16_t maxvalue =
clip_pixel_highbd(std::numeric_limits<int16_t>::max(), bit_depth_);
const int16_t minvalue = -maxvalue;
Buffer<int16_t> input_block =
Buffer<int16_t>(size_, size_, 8, size_ == 4 ? 0 : 16);
ASSERT_TRUE(input_block.Init());
Buffer<tran_low_t> output_block = Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(output_block.Init());
for (int i = 0; i < 100; ++i) {
if (i == 0) {
input_block.Set(maxvalue);
} else if (i == 1) {
input_block.Set(minvalue);
} else {
input_block.Set(&rnd, minvalue, maxvalue);
}
ASM_REGISTER_STATE_CHECK(fwd_txfm_(input_block.TopLeftPixel(),
output_block.TopLeftPixel(),
input_block.stride()));
EXPECT_EQ(partial_fdct_ref(input_block, size_),
output_block.TopLeftPixel()[0]);
}
}
PartialFdctFunc fwd_txfm_;
vpx_bit_depth_t bit_depth_;
int size_;
};
TEST_P(PartialFdctTest, PartialFdctTest) { RunTest(); }
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, PartialFdctTest,
::testing::Values(make_tuple(&vpx_highbd_fdct32x32_1_c, 32, VPX_BITS_12),
make_tuple(&vpx_highbd_fdct32x32_1_c, 32, VPX_BITS_10),
make_tuple(&vpx_fdct32x32_1_c, 32, VPX_BITS_8),
make_tuple(&vpx_highbd_fdct16x16_1_c, 16, VPX_BITS_12),
make_tuple(&vpx_highbd_fdct16x16_1_c, 16, VPX_BITS_10),
make_tuple(&vpx_fdct16x16_1_c, 16, VPX_BITS_8),
make_tuple(&vpx_highbd_fdct8x8_1_c, 8, VPX_BITS_12),
make_tuple(&vpx_highbd_fdct8x8_1_c, 8, VPX_BITS_10),
make_tuple(&vpx_fdct8x8_1_c, 8, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_1_c, 4, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
C, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct32x32_1_c, 32, VPX_BITS_8),
make_tuple(&vpx_fdct16x16_1_c, 16, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_1_c, 8, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_1_c, 4, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct32x32_1_sse2, 32, VPX_BITS_8),
make_tuple(&vpx_fdct16x16_1_sse2, 16, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_1_sse2, 8, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_1_sse2, 4, VPX_BITS_8)));
#endif // HAVE_SSE2
#if HAVE_NEON
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
NEON, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct32x32_1_neon, 32, VPX_BITS_8),
make_tuple(&vpx_fdct16x16_1_neon, 16, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_1_neon, 8, VPX_BITS_12),
make_tuple(&vpx_fdct8x8_1_neon, 8, VPX_BITS_10),
make_tuple(&vpx_fdct8x8_1_neon, 8, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_1_neon, 4, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
NEON, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct32x32_1_neon, 32, VPX_BITS_8),
make_tuple(&vpx_fdct16x16_1_neon, 16, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_1_neon, 8, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_1_neon, 4, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON
#if HAVE_MSA
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(MSA, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct8x8_1_msa, 8,
VPX_BITS_8)));
#else // !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
MSA, PartialFdctTest,
::testing::Values(make_tuple(&vpx_fdct32x32_1_msa, 32, VPX_BITS_8),
make_tuple(&vpx_fdct16x16_1_msa, 16, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_1_msa, 8, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_MSA
} // namespace

737
test/dct_test.cc Normal file
View File

@@ -0,0 +1,737 @@
/*
* Copyright (c) 2017 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "vp9/common/vp9_entropy.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
using libvpx_test::ACMRandom;
using libvpx_test::Buffer;
using std::tr1::tuple;
using std::tr1::make_tuple;
namespace {
typedef void (*FdctFunc)(const int16_t *in, tran_low_t *out, int stride);
typedef void (*IdctFunc)(const tran_low_t *in, uint8_t *out, int stride);
typedef void (*FhtFunc)(const int16_t *in, tran_low_t *out, int stride,
int tx_type);
typedef void (*FhtFuncRef)(const Buffer<int16_t> &in, Buffer<tran_low_t> *out,
int size, int tx_type);
typedef void (*IhtFunc)(const tran_low_t *in, uint8_t *out, int stride,
int tx_type);
/* forward transform, inverse transform, size, transform type, bit depth */
typedef tuple<FdctFunc, IdctFunc, int, int, vpx_bit_depth_t> DctParam;
typedef tuple<FhtFunc, IhtFunc, int, int, vpx_bit_depth_t> HtParam;
void fdct_ref(const Buffer<int16_t> &in, Buffer<tran_low_t> *out, int size,
int /*tx_type*/) {
const int16_t *i = in.TopLeftPixel();
const int i_stride = in.stride();
tran_low_t *o = out->TopLeftPixel();
if (size == 4) {
vpx_fdct4x4_c(i, o, i_stride);
} else if (size == 8) {
vpx_fdct8x8_c(i, o, i_stride);
} else if (size == 16) {
vpx_fdct16x16_c(i, o, i_stride);
} else if (size == 32) {
vpx_fdct32x32_c(i, o, i_stride);
}
}
void fht_ref(const Buffer<int16_t> &in, Buffer<tran_low_t> *out, int size,
int tx_type) {
const int16_t *i = in.TopLeftPixel();
const int i_stride = in.stride();
tran_low_t *o = out->TopLeftPixel();
if (size == 4) {
vp9_fht4x4_c(i, o, i_stride, tx_type);
} else if (size == 8) {
vp9_fht8x8_c(i, o, i_stride, tx_type);
} else if (size == 16) {
vp9_fht16x16_c(i, o, i_stride, tx_type);
}
}
void fwht_ref(const Buffer<int16_t> &in, Buffer<tran_low_t> *out, int size,
int /*tx_type*/) {
ASSERT_EQ(size, 4);
vp9_fwht4x4_c(in.TopLeftPixel(), out->TopLeftPixel(), in.stride());
}
#if CONFIG_VP9_HIGHBITDEPTH
#define idctNxN(n, coeffs, bitdepth) \
void idct##n##x##n##_##bitdepth(const tran_low_t *in, uint8_t *out, \
int stride) { \
vpx_highbd_idct##n##x##n##_##coeffs##_add_c(in, CAST_TO_SHORTPTR(out), \
stride, bitdepth); \
}
idctNxN(4, 16, 10);
idctNxN(4, 16, 12);
idctNxN(8, 64, 10);
idctNxN(8, 64, 12);
idctNxN(16, 256, 10);
idctNxN(16, 256, 12);
idctNxN(32, 1024, 10);
idctNxN(32, 1024, 12);
#define ihtNxN(n, coeffs, bitdepth) \
void iht##n##x##n##_##bitdepth(const tran_low_t *in, uint8_t *out, \
int stride, int tx_type) { \
vp9_highbd_iht##n##x##n##_##coeffs##_add_c(in, CAST_TO_SHORTPTR(out), \
stride, tx_type, bitdepth); \
}
ihtNxN(4, 16, 10);
ihtNxN(4, 16, 12);
ihtNxN(8, 64, 10);
ihtNxN(8, 64, 12);
ihtNxN(16, 256, 10);
// ihtNxN(16, 256, 12);
void iwht4x4_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_iwht4x4_16_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void iwht4x4_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_iwht4x4_16_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
#endif // CONFIG_VP9_HIGHBITDEPTH
class TransTestBase {
public:
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
virtual void RunFwdTxfm(const Buffer<int16_t> &in,
Buffer<tran_low_t> *out) = 0;
virtual void RunInvTxfm(const Buffer<tran_low_t> &in, uint8_t *out) = 0;
void RunAccuracyCheck(int limit) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
Buffer<int16_t> test_input_block =
Buffer<int16_t>(size_, size_, 8, size_ == 4 ? 0 : 16);
ASSERT_TRUE(test_input_block.Init());
Buffer<tran_low_t> test_temp_block =
Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(test_temp_block.Init());
Buffer<uint8_t> dst = Buffer<uint8_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst.Init());
Buffer<uint8_t> src = Buffer<uint8_t>(size_, size_, 0, 16);
ASSERT_TRUE(src.Init());
#if CONFIG_VP9_HIGHBITDEPTH
Buffer<uint16_t> dst16 = Buffer<uint16_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst16.Init());
Buffer<uint16_t> src16 = Buffer<uint16_t>(size_, size_, 0, 16);
ASSERT_TRUE(src16.Init());
#endif // CONFIG_VP9_HIGHBITDEPTH
uint32_t max_error = 0;
int64_t total_error = 0;
const int count_test_block = 10000;
for (int i = 0; i < count_test_block; ++i) {
if (bit_depth_ == 8) {
src.Set(&rnd, &ACMRandom::Rand8);
dst.Set(&rnd, &ACMRandom::Rand8);
// Initialize a test block with input range [-255, 255].
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
test_input_block.TopLeftPixel()[h * test_input_block.stride() + w] =
src.TopLeftPixel()[h * src.stride() + w] -
dst.TopLeftPixel()[h * dst.stride() + w];
}
}
#if CONFIG_VP9_HIGHBITDEPTH
} else {
src16.Set(&rnd, 0, max_pixel_value_);
dst16.Set(&rnd, 0, max_pixel_value_);
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
test_input_block.TopLeftPixel()[h * test_input_block.stride() + w] =
src16.TopLeftPixel()[h * src16.stride() + w] -
dst16.TopLeftPixel()[h * dst16.stride() + w];
}
}
#endif // CONFIG_VP9_HIGHBITDEPTH
}
ASM_REGISTER_STATE_CHECK(RunFwdTxfm(test_input_block, &test_temp_block));
if (bit_depth_ == VPX_BITS_8) {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, dst.TopLeftPixel()));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, CAST_TO_BYTEPTR(dst16.TopLeftPixel())));
#endif // CONFIG_VP9_HIGHBITDEPTH
}
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
int diff;
#if CONFIG_VP9_HIGHBITDEPTH
if (bit_depth_ != 8) {
diff = dst16.TopLeftPixel()[h * dst16.stride() + w] -
src16.TopLeftPixel()[h * src16.stride() + w];
} else {
#endif // CONFIG_VP9_HIGHBITDEPTH
diff = dst.TopLeftPixel()[h * dst.stride() + w] -
src.TopLeftPixel()[h * src.stride() + w];
#if CONFIG_VP9_HIGHBITDEPTH
}
#endif // CONFIG_VP9_HIGHBITDEPTH
const uint32_t error = diff * diff;
if (max_error < error) max_error = error;
total_error += error;
}
}
}
EXPECT_GE(static_cast<uint32_t>(limit), max_error)
<< "Error: 4x4 FHT/IHT has an individual round trip error > " << limit;
EXPECT_GE(count_test_block * limit, total_error)
<< "Error: 4x4 FHT/IHT has average round trip error > " << limit
<< " per block";
}
void RunCoeffCheck() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
Buffer<int16_t> input_block =
Buffer<int16_t>(size_, size_, 8, size_ == 4 ? 0 : 16);
ASSERT_TRUE(input_block.Init());
Buffer<tran_low_t> output_ref_block = Buffer<tran_low_t>(size_, size_, 0);
ASSERT_TRUE(output_ref_block.Init());
Buffer<tran_low_t> output_block = Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(output_block.Init());
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-max_pixel_value_,
// max_pixel_value_].
input_block.Set(&rnd, -max_pixel_value_, max_pixel_value_);
fwd_txfm_ref(input_block, &output_ref_block, size_, tx_type_);
ASM_REGISTER_STATE_CHECK(RunFwdTxfm(input_block, &output_block));
// The minimum quant value is 4.
EXPECT_TRUE(output_block.CheckValues(output_ref_block));
if (::testing::Test::HasFailure()) {
printf("Size: %d Transform type: %d\n", size_, tx_type_);
output_block.PrintDifference(output_ref_block);
return;
}
}
}
void RunMemCheck() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
Buffer<int16_t> input_extreme_block =
Buffer<int16_t>(size_, size_, 8, size_ == 4 ? 0 : 16);
ASSERT_TRUE(input_extreme_block.Init());
Buffer<tran_low_t> output_ref_block = Buffer<tran_low_t>(size_, size_, 0);
ASSERT_TRUE(output_ref_block.Init());
Buffer<tran_low_t> output_block = Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(output_block.Init());
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with -max_pixel_value_ or max_pixel_value_.
if (i == 0) {
input_extreme_block.Set(max_pixel_value_);
} else if (i == 1) {
input_extreme_block.Set(-max_pixel_value_);
} else {
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
input_extreme_block
.TopLeftPixel()[h * input_extreme_block.stride() + w] =
rnd.Rand8() % 2 ? max_pixel_value_ : -max_pixel_value_;
}
}
}
fwd_txfm_ref(input_extreme_block, &output_ref_block, size_, tx_type_);
ASM_REGISTER_STATE_CHECK(RunFwdTxfm(input_extreme_block, &output_block));
// The minimum quant value is 4.
EXPECT_TRUE(output_block.CheckValues(output_ref_block));
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
EXPECT_GE(
4 * DCT_MAX_VALUE << (bit_depth_ - 8),
abs(output_block.TopLeftPixel()[h * output_block.stride() + w]))
<< "Error: 4x4 FDCT has coefficient larger than "
"4*DCT_MAX_VALUE"
<< " at " << w << "," << h;
if (::testing::Test::HasFailure()) {
printf("Size: %d Transform type: %d\n", size_, tx_type_);
output_block.DumpBuffer();
return;
}
}
}
}
}
void RunInvAccuracyCheck(int limit) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 1000;
Buffer<int16_t> in = Buffer<int16_t>(size_, size_, 4);
ASSERT_TRUE(in.Init());
Buffer<tran_low_t> coeff = Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(coeff.Init());
Buffer<uint8_t> dst = Buffer<uint8_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst.Init());
Buffer<uint8_t> src = Buffer<uint8_t>(size_, size_, 0);
ASSERT_TRUE(src.Init());
Buffer<uint16_t> dst16 = Buffer<uint16_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst16.Init());
Buffer<uint16_t> src16 = Buffer<uint16_t>(size_, size_, 0);
ASSERT_TRUE(src16.Init());
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-max_pixel_value_,
// max_pixel_value_].
if (bit_depth_ == VPX_BITS_8) {
src.Set(&rnd, &ACMRandom::Rand8);
dst.Set(&rnd, &ACMRandom::Rand8);
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
in.TopLeftPixel()[h * in.stride() + w] =
src.TopLeftPixel()[h * src.stride() + w] -
dst.TopLeftPixel()[h * dst.stride() + w];
}
}
#if CONFIG_VP9_HIGHBITDEPTH
} else {
src16.Set(&rnd, 0, max_pixel_value_);
dst16.Set(&rnd, 0, max_pixel_value_);
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
in.TopLeftPixel()[h * in.stride() + w] =
src16.TopLeftPixel()[h * src16.stride() + w] -
dst16.TopLeftPixel()[h * dst16.stride() + w];
}
}
#endif // CONFIG_VP9_HIGHBITDEPTH
}
fwd_txfm_ref(in, &coeff, size_, tx_type_);
if (bit_depth_ == VPX_BITS_8) {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, dst.TopLeftPixel()));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16.TopLeftPixel())));
#endif // CONFIG_VP9_HIGHBITDEPTH
}
for (int h = 0; h < size_; ++h) {
for (int w = 0; w < size_; ++w) {
int diff;
#if CONFIG_VP9_HIGHBITDEPTH
if (bit_depth_ != 8) {
diff = dst16.TopLeftPixel()[h * dst16.stride() + w] -
src16.TopLeftPixel()[h * src16.stride() + w];
} else {
#endif // CONFIG_VP9_HIGHBITDEPTH
diff = dst.TopLeftPixel()[h * dst.stride() + w] -
src.TopLeftPixel()[h * src.stride() + w];
#if CONFIG_VP9_HIGHBITDEPTH
}
#endif // CONFIG_VP9_HIGHBITDEPTH
const uint32_t error = diff * diff;
EXPECT_GE(static_cast<uint32_t>(limit), error)
<< "Error: " << size_ << "x" << size_ << " IDCT has error "
<< error << " at " << w << "," << h;
}
}
}
}
FhtFuncRef fwd_txfm_ref;
vpx_bit_depth_t bit_depth_;
int tx_type_;
int max_pixel_value_;
int size_;
};
class TransDCT : public TransTestBase,
public ::testing::TestWithParam<DctParam> {
public:
TransDCT() {
fwd_txfm_ref = fdct_ref;
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
size_ = GET_PARAM(2);
tx_type_ = GET_PARAM(3);
bit_depth_ = GET_PARAM(4);
max_pixel_value_ = (1 << bit_depth_) - 1;
}
protected:
void RunFwdTxfm(const Buffer<int16_t> &in, Buffer<tran_low_t> *out) {
fwd_txfm_(in.TopLeftPixel(), out->TopLeftPixel(), in.stride());
}
void RunInvTxfm(const Buffer<tran_low_t> &in, uint8_t *out) {
inv_txfm_(in.TopLeftPixel(), out, in.stride());
}
FdctFunc fwd_txfm_;
IdctFunc inv_txfm_;
};
TEST_P(TransDCT, AccuracyCheck) { RunAccuracyCheck(1); }
TEST_P(TransDCT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(TransDCT, MemCheck) { RunMemCheck(); }
TEST_P(TransDCT, InvAccuracyCheck) { RunInvAccuracyCheck(1); }
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, TransDCT,
::testing::Values(
make_tuple(&vpx_highbd_fdct32x32_c, &idct32x32_10, 32, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct32x32_c, &idct32x32_12, 32, 0, VPX_BITS_10),
make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_c, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_highbd_fdct16x16_c, &idct16x16_10, 16, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct16x16_c, &idct16x16_12, 16, 0, VPX_BITS_10),
make_tuple(&vpx_fdct16x16_c, &vpx_idct16x16_256_add_c, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_highbd_fdct8x8_c, &idct8x8_10, 8, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct8x8_c, &idct8x8_12, 8, 0, VPX_BITS_10),
make_tuple(&vpx_fdct8x8_c, &vpx_idct8x8_64_add_c, 8, 0, VPX_BITS_8),
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_10, 4, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_12, 4, 0, VPX_BITS_12),
make_tuple(&vpx_fdct4x4_c, &vpx_idct4x4_16_add_c, 4, 0, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
C, TransDCT,
::testing::Values(
make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_c, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct16x16_c, &vpx_idct16x16_256_add_c, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct8x8_c, &vpx_idct8x8_64_add_c, 8, 0, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_c, &vpx_idct4x4_16_add_c, 4, 0, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSE2
#if !CONFIG_EMULATE_HARDWARE
#if CONFIG_VP9_HIGHBITDEPTH
/* TODO:(johannkoenig) Determine why these fail AccuracyCheck
make_tuple(&vpx_highbd_fdct32x32_sse2, &idct32x32_12, 32, 0, VPX_BITS_12),
make_tuple(&vpx_highbd_fdct16x16_sse2, &idct16x16_12, 16, 0, VPX_BITS_12),
*/
INSTANTIATE_TEST_CASE_P(
SSE2, TransDCT,
::testing::Values(
make_tuple(&vpx_highbd_fdct32x32_sse2, &idct32x32_10, 32, 0,
VPX_BITS_10),
make_tuple(&vpx_fdct32x32_sse2, &vpx_idct32x32_1024_add_sse2, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_highbd_fdct16x16_sse2, &idct16x16_10, 16, 0,
VPX_BITS_10),
make_tuple(&vpx_fdct16x16_sse2, &vpx_idct16x16_256_add_sse2, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_highbd_fdct8x8_sse2, &idct8x8_10, 8, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct8x8_sse2, &idct8x8_12, 8, 0, VPX_BITS_12),
make_tuple(&vpx_fdct8x8_sse2, &vpx_idct8x8_64_add_sse2, 8, 0,
VPX_BITS_8),
make_tuple(&vpx_highbd_fdct4x4_sse2, &idct4x4_10, 4, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct4x4_sse2, &idct4x4_12, 4, 0, VPX_BITS_12),
make_tuple(&vpx_fdct4x4_sse2, &vpx_idct4x4_16_add_sse2, 4, 0,
VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
SSE2, TransDCT,
::testing::Values(make_tuple(&vpx_fdct32x32_sse2,
&vpx_idct32x32_1024_add_sse2, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct16x16_sse2,
&vpx_idct16x16_256_add_sse2, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct8x8_sse2, &vpx_idct8x8_64_add_sse2, 8,
0, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_sse2, &vpx_idct4x4_16_add_sse2, 4,
0, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // !CONFIG_EMULATE_HARDWARE
#endif // HAVE_SSE2
#if !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSSE3 && !CONFIG_EMULATE_HARDWARE
#if !ARCH_X86_64
// TODO(johannkoenig): high bit depth fdct8x8.
INSTANTIATE_TEST_CASE_P(
SSSE3, TransDCT,
::testing::Values(make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_sse2,
32, 0, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_c, &vpx_idct8x8_64_add_sse2, 8, 0,
VPX_BITS_8)));
#else
// vpx_fdct8x8_ssse3 is only available in 64 bit builds.
INSTANTIATE_TEST_CASE_P(
SSSE3, TransDCT,
::testing::Values(make_tuple(&vpx_fdct32x32_c, &vpx_idct32x32_1024_add_sse2,
32, 0, VPX_BITS_8),
make_tuple(&vpx_fdct8x8_ssse3, &vpx_idct8x8_64_add_sse2,
8, 0, VPX_BITS_8)));
#endif // !ARCH_X86_64
#endif // HAVE_SSSE3 && !CONFIG_EMULATE_HARDWARE
#endif // !CONFIG_VP9_HIGHBITDEPTH
#if !CONFIG_VP9_HIGHBITDEPTH && HAVE_AVX2 && !CONFIG_EMULATE_HARDWARE
// TODO(johannkoenig): high bit depth fdct32x32.
INSTANTIATE_TEST_CASE_P(
AVX2, TransDCT, ::testing::Values(make_tuple(&vpx_fdct32x32_avx2,
&vpx_idct32x32_1024_add_sse2,
32, 0, VPX_BITS_8)));
#endif // !CONFIG_VP9_HIGHBITDEPTH && HAVE_AVX2 && !CONFIG_EMULATE_HARDWARE
#if HAVE_NEON
#if !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
NEON, TransDCT,
::testing::Values(make_tuple(&vpx_fdct32x32_neon,
&vpx_idct32x32_1024_add_neon, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct16x16_neon,
&vpx_idct16x16_256_add_neon, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct8x8_neon, &vpx_idct8x8_64_add_neon, 8,
0, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_neon, &vpx_idct4x4_16_add_neon, 4,
0, VPX_BITS_8)));
#endif // !CONFIG_EMULATE_HARDWARE
#endif // HAVE_NEON
#if HAVE_MSA
#if !CONFIG_VP9_HIGHBITDEPTH
#if !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
MSA, TransDCT,
::testing::Values(
make_tuple(&vpx_fdct32x32_msa, &vpx_idct32x32_1024_add_msa, 32, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct16x16_msa, &vpx_idct16x16_256_add_msa, 16, 0,
VPX_BITS_8),
make_tuple(&vpx_fdct8x8_msa, &vpx_idct8x8_64_add_msa, 8, 0, VPX_BITS_8),
make_tuple(&vpx_fdct4x4_msa, &vpx_idct4x4_16_add_msa, 4, 0,
VPX_BITS_8)));
#endif // !CONFIG_EMULATE_HARDWARE
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_MSA
#if HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(VSX, TransDCT,
::testing::Values(make_tuple(&vpx_fdct4x4_c,
&vpx_idct4x4_16_add_vsx, 4,
0, VPX_BITS_8)));
#endif // HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
class TransHT : public TransTestBase, public ::testing::TestWithParam<HtParam> {
public:
TransHT() {
fwd_txfm_ref = fht_ref;
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
size_ = GET_PARAM(2);
tx_type_ = GET_PARAM(3);
bit_depth_ = GET_PARAM(4);
max_pixel_value_ = (1 << bit_depth_) - 1;
}
protected:
void RunFwdTxfm(const Buffer<int16_t> &in, Buffer<tran_low_t> *out) {
fwd_txfm_(in.TopLeftPixel(), out->TopLeftPixel(), in.stride(), tx_type_);
}
void RunInvTxfm(const Buffer<tran_low_t> &in, uint8_t *out) {
inv_txfm_(in.TopLeftPixel(), out, in.stride(), tx_type_);
}
FhtFunc fwd_txfm_;
IhtFunc inv_txfm_;
};
TEST_P(TransHT, AccuracyCheck) { RunAccuracyCheck(1); }
TEST_P(TransHT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(TransHT, MemCheck) { RunMemCheck(); }
TEST_P(TransHT, InvAccuracyCheck) { RunInvAccuracyCheck(1); }
/* TODO:(johannkoenig) Determine why these fail AccuracyCheck
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_12, 16, 0, VPX_BITS_12),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_12, 16, 1, VPX_BITS_12),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_12, 16, 2, VPX_BITS_12),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_12, 16, 3, VPX_BITS_12),
*/
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, TransHT,
::testing::Values(
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_10, 16, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_10, 16, 1, VPX_BITS_10),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_10, 16, 2, VPX_BITS_10),
make_tuple(&vp9_highbd_fht16x16_c, &iht16x16_10, 16, 3, VPX_BITS_10),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 0, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 1, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 3, VPX_BITS_8),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_10, 8, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_10, 8, 1, VPX_BITS_10),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_10, 8, 2, VPX_BITS_10),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_10, 8, 3, VPX_BITS_10),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_12, 8, 0, VPX_BITS_12),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_12, 8, 1, VPX_BITS_12),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_12, 8, 2, VPX_BITS_12),
make_tuple(&vp9_highbd_fht8x8_c, &iht8x8_12, 8, 3, VPX_BITS_12),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 0, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 1, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 2, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 3, VPX_BITS_8),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 4, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 4, 1, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 4, 2, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 4, 3, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 4, 0, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 4, 1, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 4, 2, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 4, 3, VPX_BITS_12),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 3, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
C, TransHT,
::testing::Values(
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 0, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 1, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 16, 3, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 0, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 1, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 2, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_c, 8, 3, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 4, 3, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2, TransHT,
::testing::Values(
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2, 16, 0,
VPX_BITS_8),
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2, 16, 1,
VPX_BITS_8),
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2, 16, 2,
VPX_BITS_8),
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2, 16, 3,
VPX_BITS_8),
make_tuple(&vp9_fht8x8_sse2, &vp9_iht8x8_64_add_sse2, 8, 0, VPX_BITS_8),
make_tuple(&vp9_fht8x8_sse2, &vp9_iht8x8_64_add_sse2, 8, 1, VPX_BITS_8),
make_tuple(&vp9_fht8x8_sse2, &vp9_iht8x8_64_add_sse2, 8, 2, VPX_BITS_8),
make_tuple(&vp9_fht8x8_sse2, &vp9_iht8x8_64_add_sse2, 8, 3, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 4, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 4, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 4, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 4, 3,
VPX_BITS_8)));
#endif // HAVE_SSE2
class TransWHT : public TransTestBase,
public ::testing::TestWithParam<DctParam> {
public:
TransWHT() {
fwd_txfm_ref = fwht_ref;
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
size_ = GET_PARAM(2);
tx_type_ = GET_PARAM(3);
bit_depth_ = GET_PARAM(4);
max_pixel_value_ = (1 << bit_depth_) - 1;
}
protected:
void RunFwdTxfm(const Buffer<int16_t> &in, Buffer<tran_low_t> *out) {
fwd_txfm_(in.TopLeftPixel(), out->TopLeftPixel(), in.stride());
}
void RunInvTxfm(const Buffer<tran_low_t> &in, uint8_t *out) {
inv_txfm_(in.TopLeftPixel(), out, in.stride());
}
FdctFunc fwd_txfm_;
IdctFunc inv_txfm_;
};
TEST_P(TransWHT, AccuracyCheck) { RunAccuracyCheck(0); }
TEST_P(TransWHT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(TransWHT, MemCheck) { RunMemCheck(); }
TEST_P(TransWHT, InvAccuracyCheck) { RunInvAccuracyCheck(0); }
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, TransWHT,
::testing::Values(
make_tuple(&vp9_highbd_fwht4x4_c, &iwht4x4_10, 4, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fwht4x4_c, &iwht4x4_12, 4, 0, VPX_BITS_12),
make_tuple(&vp9_fwht4x4_c, &vpx_iwht4x4_16_add_c, 4, 0, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(C, TransWHT,
::testing::Values(make_tuple(&vp9_fwht4x4_c,
&vpx_iwht4x4_16_add_c, 4,
0, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, TransWHT,
::testing::Values(make_tuple(&vp9_fwht4x4_sse2,
&vpx_iwht4x4_16_add_sse2,
4, 0, VPX_BITS_8)));
#endif // HAVE_SSE2
} // namespace

View File

@@ -172,4 +172,21 @@ TEST(DecodeAPI, Vp9PeekSI) {
}
#endif // CONFIG_VP9_DECODER
TEST(DecodeAPI, HighBitDepthCapability) {
// VP8 should not claim VP9 HBD as a capability.
#if CONFIG_VP8_DECODER
const vpx_codec_caps_t vp8_caps = vpx_codec_get_caps(&vpx_codec_vp8_dx_algo);
EXPECT_EQ(vp8_caps & VPX_CODEC_CAP_HIGHBITDEPTH, 0);
#endif
#if CONFIG_VP9_DECODER
const vpx_codec_caps_t vp9_caps = vpx_codec_get_caps(&vpx_codec_vp9_dx_algo);
#if CONFIG_VP9_HIGHBITDEPTH
EXPECT_EQ(vp9_caps & VPX_CODEC_CAP_HIGHBITDEPTH, VPX_CODEC_CAP_HIGHBITDEPTH);
#else
EXPECT_EQ(vp9_caps & VPX_CODEC_CAP_HIGHBITDEPTH, 0);
#endif
#endif
}
} // namespace

View File

@@ -53,13 +53,13 @@ void DecoderTest::HandlePeekResult(Decoder *const decoder,
* pass it is not a keyframe, so we only expect VPX_CODEC_OK on the first
* frame, which must be a keyframe. */
if (video->frame_number() == 0)
ASSERT_EQ(VPX_CODEC_OK, res_peek) << "Peek return failed: "
<< vpx_codec_err_to_string(res_peek);
ASSERT_EQ(VPX_CODEC_OK, res_peek)
<< "Peek return failed: " << vpx_codec_err_to_string(res_peek);
} else {
/* The Vp9 implementation of PeekStream returns an error only if the
* data passed to it isn't a valid Vp9 chunk. */
ASSERT_EQ(VPX_CODEC_OK, res_peek) << "Peek return failed: "
<< vpx_codec_err_to_string(res_peek);
ASSERT_EQ(VPX_CODEC_OK, res_peek)
<< "Peek return failed: " << vpx_codec_err_to_string(res_peek);
}
}

View File

@@ -62,4 +62,134 @@ TEST(EncodeAPI, InvalidParams) {
}
}
TEST(EncodeAPI, HighBitDepthCapability) {
// VP8 should not claim VP9 HBD as a capability.
#if CONFIG_VP8_ENCODER
const vpx_codec_caps_t vp8_caps = vpx_codec_get_caps(&vpx_codec_vp8_cx_algo);
EXPECT_EQ(vp8_caps & VPX_CODEC_CAP_HIGHBITDEPTH, 0);
#endif
#if CONFIG_VP9_ENCODER
const vpx_codec_caps_t vp9_caps = vpx_codec_get_caps(&vpx_codec_vp9_cx_algo);
#if CONFIG_VP9_HIGHBITDEPTH
EXPECT_EQ(vp9_caps & VPX_CODEC_CAP_HIGHBITDEPTH, VPX_CODEC_CAP_HIGHBITDEPTH);
#else
EXPECT_EQ(vp9_caps & VPX_CODEC_CAP_HIGHBITDEPTH, 0);
#endif
#endif
}
#if CONFIG_VP8_ENCODER
TEST(EncodeAPI, ImageSizeSetting) {
const int width = 711;
const int height = 360;
const int bps = 12;
vpx_image_t img;
vpx_codec_ctx_t enc;
vpx_codec_enc_cfg_t cfg;
uint8_t *img_buf = reinterpret_cast<uint8_t *>(
calloc(width * height * bps / 8, sizeof(*img_buf)));
vpx_codec_enc_config_default(vpx_codec_vp8_cx(), &cfg, 0);
cfg.g_w = width;
cfg.g_h = height;
vpx_img_wrap(&img, VPX_IMG_FMT_I420, width, height, 1, img_buf);
vpx_codec_enc_init(&enc, vpx_codec_vp8_cx(), &cfg, 0);
EXPECT_EQ(VPX_CODEC_OK, vpx_codec_encode(&enc, &img, 0, 1, 0, 0));
free(img_buf);
vpx_codec_destroy(&enc);
}
#endif
// Set up 2 spatial streams with 2 temporal layers per stream, and generate
// invalid configuration by setting the temporal layer rate allocation
// (ts_target_bitrate[]) to 0 for both layers. This should fail independent of
// CONFIG_MULTI_RES_ENCODING.
TEST(EncodeAPI, MultiResEncode) {
static const vpx_codec_iface_t *kCodecs[] = {
#if CONFIG_VP8_ENCODER
&vpx_codec_vp8_cx_algo,
#endif
#if CONFIG_VP9_ENCODER
&vpx_codec_vp9_cx_algo,
#endif
};
const int width = 1280;
const int height = 720;
const int width_down = width / 2;
const int height_down = height / 2;
const int target_bitrate = 1000;
const int framerate = 30;
for (int c = 0; c < NELEMENTS(kCodecs); ++c) {
const vpx_codec_iface_t *const iface = kCodecs[c];
vpx_codec_ctx_t enc[2];
vpx_codec_enc_cfg_t cfg[2];
vpx_rational_t dsf[2] = { { 2, 1 }, { 2, 1 } };
memset(enc, 0, sizeof(enc));
for (int i = 0; i < 2; i++) {
vpx_codec_enc_config_default(iface, &cfg[i], 0);
}
/* Highest-resolution encoder settings */
cfg[0].g_w = width;
cfg[0].g_h = height;
cfg[0].rc_dropframe_thresh = 0;
cfg[0].rc_end_usage = VPX_CBR;
cfg[0].rc_resize_allowed = 0;
cfg[0].rc_min_quantizer = 2;
cfg[0].rc_max_quantizer = 56;
cfg[0].rc_undershoot_pct = 100;
cfg[0].rc_overshoot_pct = 15;
cfg[0].rc_buf_initial_sz = 500;
cfg[0].rc_buf_optimal_sz = 600;
cfg[0].rc_buf_sz = 1000;
cfg[0].g_error_resilient = 1; /* Enable error resilient mode */
cfg[0].g_lag_in_frames = 0;
cfg[0].kf_mode = VPX_KF_AUTO;
cfg[0].kf_min_dist = 3000;
cfg[0].kf_max_dist = 3000;
cfg[0].rc_target_bitrate = target_bitrate; /* Set target bitrate */
cfg[0].g_timebase.num = 1; /* Set fps */
cfg[0].g_timebase.den = framerate;
memcpy(&cfg[1], &cfg[0], sizeof(cfg[0]));
cfg[1].rc_target_bitrate = 500;
cfg[1].g_w = width_down;
cfg[1].g_h = height_down;
for (int i = 0; i < 2; i++) {
cfg[i].ts_number_layers = 2;
cfg[i].ts_periodicity = 2;
cfg[i].ts_rate_decimator[0] = 2;
cfg[i].ts_rate_decimator[1] = 1;
cfg[i].ts_layer_id[0] = 0;
cfg[i].ts_layer_id[1] = 1;
// Invalid parameters.
cfg[i].ts_target_bitrate[0] = 0;
cfg[i].ts_target_bitrate[1] = 0;
}
// VP9 should report incapable, VP8 invalid for all configurations.
const char kVP9Name[] = "WebM Project VP9";
const bool is_vp9 = strncmp(kVP9Name, vpx_codec_iface_name(iface),
sizeof(kVP9Name) - 1) == 0;
EXPECT_EQ(is_vp9 ? VPX_CODEC_INCAPABLE : VPX_CODEC_INVALID_PARAM,
vpx_codec_enc_init_multi(&enc[0], iface, &cfg[0], 2, 0, &dsf[0]));
for (int i = 0; i < 2; i++) {
vpx_codec_destroy(&enc[i]);
}
}
}
} // namespace

View File

@@ -201,6 +201,8 @@ void EncoderTest::RunLoop(VideoSource *video) {
PreEncodeFrameHook(video, encoder.get());
encoder->EncodeFrame(video, frame_flags_);
PostEncodeFrameHook(encoder.get());
CxDataIterator iter = encoder->GetCxData();
bool has_cxdata = false;
@@ -226,6 +228,8 @@ void EncoderTest::RunLoop(VideoSource *video) {
case VPX_CODEC_PSNR_PKT: PSNRPktHook(pkt); break;
case VPX_CODEC_STATS_PKT: StatsPktHook(pkt); break;
default: break;
}
}

View File

@@ -139,6 +139,13 @@ class Encoder {
}
#endif
#if CONFIG_VP8_ENCODER
void Control(int ctrl_id, vpx_roi_map_t *arg) {
const vpx_codec_err_t res = vpx_codec_control_(&encoder_, ctrl_id, arg);
ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
}
#endif
void Config(const vpx_codec_enc_cfg_t *cfg) {
const vpx_codec_err_t res = vpx_codec_enc_config_set(&encoder_, cfg);
ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
@@ -212,12 +219,17 @@ class EncoderTest {
virtual void PreEncodeFrameHook(VideoSource * /*video*/,
Encoder * /*encoder*/) {}
virtual void PostEncodeFrameHook(Encoder * /*encoder*/) {}
// Hook to be called on every compressed data packet.
virtual void FramePktHook(const vpx_codec_cx_pkt_t * /*pkt*/) {}
// Hook to be called on every PSNR packet.
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t * /*pkt*/) {}
// Hook to be called on every first pass stats packet.
virtual void StatsPktHook(const vpx_codec_cx_pkt_t * /*pkt*/) {}
// Hook to determine whether the encode loop should continue.
virtual bool Continue() const {
return !(::testing::Test::HasFatalFailure() || abort_);

View File

@@ -34,7 +34,8 @@ struct ExternalFrameBuffer {
// Class to manipulate a list of external frame buffers.
class ExternalFrameBufferList {
public:
ExternalFrameBufferList() : num_buffers_(0), ext_fb_list_(NULL) {}
ExternalFrameBufferList()
: num_buffers_(0), num_used_buffers_(0), ext_fb_list_(NULL) {}
virtual ~ExternalFrameBufferList() {
for (int i = 0; i < num_buffers_; ++i) {
@@ -71,6 +72,8 @@ class ExternalFrameBufferList {
}
SetFrameBuffer(idx, fb);
num_used_buffers_++;
return 0;
}
@@ -106,6 +109,7 @@ class ExternalFrameBufferList {
}
EXPECT_EQ(1, ext_fb->in_use);
ext_fb->in_use = 0;
num_used_buffers_--;
return 0;
}
@@ -121,6 +125,8 @@ class ExternalFrameBufferList {
}
}
int num_used_buffers() const { return num_used_buffers_; }
private:
// Returns the index of the first free frame buffer. Returns |num_buffers_|
// if there are no free frame buffers.
@@ -145,6 +151,7 @@ class ExternalFrameBufferList {
}
int num_buffers_;
int num_used_buffers_;
ExternalFrameBuffer *ext_fb_list_;
};
@@ -220,8 +227,8 @@ class ExternalFrameBufferMD5Test
void OpenMD5File(const std::string &md5_file_name_) {
md5_file_ = libvpx_test::OpenTestDataFile(md5_file_name_);
ASSERT_TRUE(md5_file_ != NULL) << "Md5 file open failed. Filename: "
<< md5_file_name_;
ASSERT_TRUE(md5_file_ != NULL)
<< "Md5 file open failed. Filename: " << md5_file_name_;
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
@@ -273,6 +280,7 @@ class ExternalFrameBufferMD5Test
#if CONFIG_WEBM_IO
const char kVP9TestFile[] = "vp90-2-02-size-lf-1920x1080.webm";
const char kVP9NonRefTestFile[] = "vp90-2-22-svc_1280x720_1.webm";
// Class for testing passing in external frame buffers to libvpx.
class ExternalFrameBufferTest : public ::testing::Test {
@@ -292,7 +300,9 @@ class ExternalFrameBufferTest : public ::testing::Test {
virtual void TearDown() {
delete decoder_;
decoder_ = NULL;
delete video_;
video_ = NULL;
}
// Passes the external frame buffer information to libvpx.
@@ -325,7 +335,7 @@ class ExternalFrameBufferTest : public ::testing::Test {
return VPX_CODEC_OK;
}
private:
protected:
void CheckDecodedFrames() {
libvpx_test::DxDataIterator dec_iter = decoder_->GetDxData();
const vpx_image_t *img = NULL;
@@ -341,6 +351,25 @@ class ExternalFrameBufferTest : public ::testing::Test {
int num_buffers_;
ExternalFrameBufferList fb_list_;
};
class ExternalFrameBufferNonRefTest : public ExternalFrameBufferTest {
protected:
virtual void SetUp() {
video_ = new libvpx_test::WebMVideoSource(kVP9NonRefTestFile);
ASSERT_TRUE(video_ != NULL);
video_->Init();
video_->Begin();
vpx_codec_dec_cfg_t cfg = vpx_codec_dec_cfg_t();
decoder_ = new libvpx_test::VP9Decoder(cfg, 0);
ASSERT_TRUE(decoder_ != NULL);
}
virtual void CheckFrameBufferRelease() {
TearDown();
ASSERT_EQ(0, fb_list_.num_used_buffers());
}
};
#endif // CONFIG_WEBM_IO
// This test runs through the set of test vectors, and decodes them.
@@ -419,6 +448,8 @@ TEST_F(ExternalFrameBufferTest, NotEnoughBuffers) {
SetFrameBufferFunctions(num_buffers, get_vp9_frame_buffer,
release_vp9_frame_buffer));
ASSERT_EQ(VPX_CODEC_OK, DecodeOneFrame());
// Only run this on long clips. Decoding a very short clip will return
// VPX_CODEC_OK even with only 2 buffers.
ASSERT_EQ(VPX_CODEC_MEM_ERROR, DecodeRemainingFrames());
}
@@ -467,6 +498,15 @@ TEST_F(ExternalFrameBufferTest, SetAfterDecode) {
SetFrameBufferFunctions(num_buffers, get_vp9_frame_buffer,
release_vp9_frame_buffer));
}
TEST_F(ExternalFrameBufferNonRefTest, ReleaseNonRefFrameBuffer) {
const int num_buffers = VP9_MAXIMUM_REF_BUFFERS + VPX_MAXIMUM_WORK_BUFFERS;
ASSERT_EQ(VPX_CODEC_OK,
SetFrameBufferFunctions(num_buffers, get_vp9_frame_buffer,
release_vp9_frame_buffer));
ASSERT_EQ(VPX_CODEC_OK, DecodeRemainingFrames());
CheckFrameBufferRelease();
}
#endif // CONFIG_WEBM_IO
VP9_INSTANTIATE_TEST_CASE(

View File

@@ -1,511 +0,0 @@
/*
* Copyright (c) 2012 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "vp9/common/vp9_entropy.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
using libvpx_test::ACMRandom;
namespace {
const int kNumCoeffs = 16;
typedef void (*FdctFunc)(const int16_t *in, tran_low_t *out, int stride);
typedef void (*IdctFunc)(const tran_low_t *in, uint8_t *out, int stride);
typedef void (*FhtFunc)(const int16_t *in, tran_low_t *out, int stride,
int tx_type);
typedef void (*IhtFunc)(const tran_low_t *in, uint8_t *out, int stride,
int tx_type);
typedef std::tr1::tuple<FdctFunc, IdctFunc, int, vpx_bit_depth_t> Dct4x4Param;
typedef std::tr1::tuple<FhtFunc, IhtFunc, int, vpx_bit_depth_t> Ht4x4Param;
void fdct4x4_ref(const int16_t *in, tran_low_t *out, int stride,
int /*tx_type*/) {
vpx_fdct4x4_c(in, out, stride);
}
void fht4x4_ref(const int16_t *in, tran_low_t *out, int stride, int tx_type) {
vp9_fht4x4_c(in, out, stride, tx_type);
}
void fwht4x4_ref(const int16_t *in, tran_low_t *out, int stride,
int /*tx_type*/) {
vp9_fwht4x4_c(in, out, stride);
}
#if CONFIG_VP9_HIGHBITDEPTH
void idct4x4_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct4x4_16_add_c(in, out, stride, 10);
}
void idct4x4_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct4x4_16_add_c(in, out, stride, 12);
}
void iht4x4_10(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht4x4_16_add_c(in, out, stride, tx_type, 10);
}
void iht4x4_12(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht4x4_16_add_c(in, out, stride, tx_type, 12);
}
void iwht4x4_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_iwht4x4_16_add_c(in, out, stride, 10);
}
void iwht4x4_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_iwht4x4_16_add_c(in, out, stride, 12);
}
#if HAVE_SSE2
void idct4x4_10_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct4x4_16_add_sse2(in, out, stride, 10);
}
void idct4x4_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct4x4_16_add_sse2(in, out, stride, 12);
}
#endif // HAVE_SSE2
#endif // CONFIG_VP9_HIGHBITDEPTH
class Trans4x4TestBase {
public:
virtual ~Trans4x4TestBase() {}
protected:
virtual void RunFwdTxfm(const int16_t *in, tran_low_t *out, int stride) = 0;
virtual void RunInvTxfm(const tran_low_t *out, uint8_t *dst, int stride) = 0;
void RunAccuracyCheck(int limit) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
uint32_t max_error = 0;
int64_t total_error = 0;
const int count_test_block = 10000;
for (int i = 0; i < count_test_block; ++i) {
DECLARE_ALIGNED(16, int16_t, test_input_block[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, test_temp_block[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, src[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
DECLARE_ALIGNED(16, uint16_t, dst16[kNumCoeffs]);
DECLARE_ALIGNED(16, uint16_t, src16[kNumCoeffs]);
#endif
// Initialize a test block with input range [-255, 255].
for (int j = 0; j < kNumCoeffs; ++j) {
if (bit_depth_ == VPX_BITS_8) {
src[j] = rnd.Rand8();
dst[j] = rnd.Rand8();
test_input_block[j] = src[j] - dst[j];
#if CONFIG_VP9_HIGHBITDEPTH
} else {
src16[j] = rnd.Rand16() & mask_;
dst16[j] = rnd.Rand16() & mask_;
test_input_block[j] = src16[j] - dst16[j];
#endif
}
}
ASM_REGISTER_STATE_CHECK(
RunFwdTxfm(test_input_block, test_temp_block, pitch_));
if (bit_depth_ == VPX_BITS_8) {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(test_temp_block, dst, pitch_));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, CONVERT_TO_BYTEPTR(dst16), pitch_));
#endif
}
for (int j = 0; j < kNumCoeffs; ++j) {
#if CONFIG_VP9_HIGHBITDEPTH
const int diff =
bit_depth_ == VPX_BITS_8 ? dst[j] - src[j] : dst16[j] - src16[j];
#else
ASSERT_EQ(VPX_BITS_8, bit_depth_);
const int diff = dst[j] - src[j];
#endif
const uint32_t error = diff * diff;
if (max_error < error) max_error = error;
total_error += error;
}
}
EXPECT_GE(static_cast<uint32_t>(limit), max_error)
<< "Error: 4x4 FHT/IHT has an individual round trip error > " << limit;
EXPECT_GE(count_test_block * limit, total_error)
<< "Error: 4x4 FHT/IHT has average round trip error > " << limit
<< " per block";
}
void RunCoeffCheck() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
DECLARE_ALIGNED(16, int16_t, input_block[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_ref_block[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_block[kNumCoeffs]);
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-mask_, mask_].
for (int j = 0; j < kNumCoeffs; ++j) {
input_block[j] = (rnd.Rand16() & mask_) - (rnd.Rand16() & mask_);
}
fwd_txfm_ref(input_block, output_ref_block, pitch_, tx_type_);
ASM_REGISTER_STATE_CHECK(RunFwdTxfm(input_block, output_block, pitch_));
// The minimum quant value is 4.
for (int j = 0; j < kNumCoeffs; ++j)
EXPECT_EQ(output_block[j], output_ref_block[j]);
}
}
void RunMemCheck() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
DECLARE_ALIGNED(16, int16_t, input_extreme_block[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_ref_block[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_block[kNumCoeffs]);
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-mask_, mask_].
for (int j = 0; j < kNumCoeffs; ++j) {
input_extreme_block[j] = rnd.Rand8() % 2 ? mask_ : -mask_;
}
if (i == 0) {
for (int j = 0; j < kNumCoeffs; ++j) input_extreme_block[j] = mask_;
} else if (i == 1) {
for (int j = 0; j < kNumCoeffs; ++j) input_extreme_block[j] = -mask_;
}
fwd_txfm_ref(input_extreme_block, output_ref_block, pitch_, tx_type_);
ASM_REGISTER_STATE_CHECK(
RunFwdTxfm(input_extreme_block, output_block, pitch_));
// The minimum quant value is 4.
for (int j = 0; j < kNumCoeffs; ++j) {
EXPECT_EQ(output_block[j], output_ref_block[j]);
EXPECT_GE(4 * DCT_MAX_VALUE << (bit_depth_ - 8), abs(output_block[j]))
<< "Error: 4x4 FDCT has coefficient larger than 4*DCT_MAX_VALUE";
}
}
}
void RunInvAccuracyCheck(int limit) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 1000;
DECLARE_ALIGNED(16, int16_t, in[kNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, src[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
DECLARE_ALIGNED(16, uint16_t, dst16[kNumCoeffs]);
DECLARE_ALIGNED(16, uint16_t, src16[kNumCoeffs]);
#endif
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-mask_, mask_].
for (int j = 0; j < kNumCoeffs; ++j) {
if (bit_depth_ == VPX_BITS_8) {
src[j] = rnd.Rand8();
dst[j] = rnd.Rand8();
in[j] = src[j] - dst[j];
#if CONFIG_VP9_HIGHBITDEPTH
} else {
src16[j] = rnd.Rand16() & mask_;
dst16[j] = rnd.Rand16() & mask_;
in[j] = src16[j] - dst16[j];
#endif
}
}
fwd_txfm_ref(in, coeff, pitch_, tx_type_);
if (bit_depth_ == VPX_BITS_8) {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, dst, pitch_));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CONVERT_TO_BYTEPTR(dst16), pitch_));
#endif
}
for (int j = 0; j < kNumCoeffs; ++j) {
#if CONFIG_VP9_HIGHBITDEPTH
const int diff =
bit_depth_ == VPX_BITS_8 ? dst[j] - src[j] : dst16[j] - src16[j];
#else
const int diff = dst[j] - src[j];
#endif
const uint32_t error = diff * diff;
EXPECT_GE(static_cast<uint32_t>(limit), error)
<< "Error: 4x4 IDCT has error " << error << " at index " << j;
}
}
}
int pitch_;
int tx_type_;
FhtFunc fwd_txfm_ref;
vpx_bit_depth_t bit_depth_;
int mask_;
};
class Trans4x4DCT : public Trans4x4TestBase,
public ::testing::TestWithParam<Dct4x4Param> {
public:
virtual ~Trans4x4DCT() {}
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
pitch_ = 4;
fwd_txfm_ref = fdct4x4_ref;
bit_depth_ = GET_PARAM(3);
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(const int16_t *in, tran_low_t *out, int stride) {
fwd_txfm_(in, out, stride);
}
void RunInvTxfm(const tran_low_t *out, uint8_t *dst, int stride) {
inv_txfm_(out, dst, stride);
}
FdctFunc fwd_txfm_;
IdctFunc inv_txfm_;
};
TEST_P(Trans4x4DCT, AccuracyCheck) { RunAccuracyCheck(1); }
TEST_P(Trans4x4DCT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(Trans4x4DCT, MemCheck) { RunMemCheck(); }
TEST_P(Trans4x4DCT, InvAccuracyCheck) { RunInvAccuracyCheck(1); }
class Trans4x4HT : public Trans4x4TestBase,
public ::testing::TestWithParam<Ht4x4Param> {
public:
virtual ~Trans4x4HT() {}
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
pitch_ = 4;
fwd_txfm_ref = fht4x4_ref;
bit_depth_ = GET_PARAM(3);
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(const int16_t *in, tran_low_t *out, int stride) {
fwd_txfm_(in, out, stride, tx_type_);
}
void RunInvTxfm(const tran_low_t *out, uint8_t *dst, int stride) {
inv_txfm_(out, dst, stride, tx_type_);
}
FhtFunc fwd_txfm_;
IhtFunc inv_txfm_;
};
TEST_P(Trans4x4HT, AccuracyCheck) { RunAccuracyCheck(1); }
TEST_P(Trans4x4HT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(Trans4x4HT, MemCheck) { RunMemCheck(); }
TEST_P(Trans4x4HT, InvAccuracyCheck) { RunInvAccuracyCheck(1); }
class Trans4x4WHT : public Trans4x4TestBase,
public ::testing::TestWithParam<Dct4x4Param> {
public:
virtual ~Trans4x4WHT() {}
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
pitch_ = 4;
fwd_txfm_ref = fwht4x4_ref;
bit_depth_ = GET_PARAM(3);
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(const int16_t *in, tran_low_t *out, int stride) {
fwd_txfm_(in, out, stride);
}
void RunInvTxfm(const tran_low_t *out, uint8_t *dst, int stride) {
inv_txfm_(out, dst, stride);
}
FdctFunc fwd_txfm_;
IdctFunc inv_txfm_;
};
TEST_P(Trans4x4WHT, AccuracyCheck) { RunAccuracyCheck(0); }
TEST_P(Trans4x4WHT, CoeffCheck) { RunCoeffCheck(); }
TEST_P(Trans4x4WHT, MemCheck) { RunMemCheck(); }
TEST_P(Trans4x4WHT, InvAccuracyCheck) { RunInvAccuracyCheck(0); }
using std::tr1::make_tuple;
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, Trans4x4DCT,
::testing::Values(
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_10, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_12, 0, VPX_BITS_12),
make_tuple(&vpx_fdct4x4_c, &vpx_idct4x4_16_add_c, 0, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(C, Trans4x4DCT,
::testing::Values(make_tuple(&vpx_fdct4x4_c,
&vpx_idct4x4_16_add_c, 0,
VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 1, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 2, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_10, 3, VPX_BITS_10),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 0, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 1, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 2, VPX_BITS_12),
make_tuple(&vp9_highbd_fht4x4_c, &iht4x4_12, 3, VPX_BITS_12),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 3, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(
C, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_c, 3, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
C, Trans4x4WHT,
::testing::Values(
make_tuple(&vp9_highbd_fwht4x4_c, &iwht4x4_10, 0, VPX_BITS_10),
make_tuple(&vp9_highbd_fwht4x4_c, &iwht4x4_12, 0, VPX_BITS_12),
make_tuple(&vp9_fwht4x4_c, &vpx_iwht4x4_16_add_c, 0, VPX_BITS_8)));
#else
INSTANTIATE_TEST_CASE_P(C, Trans4x4WHT,
::testing::Values(make_tuple(&vp9_fwht4x4_c,
&vpx_iwht4x4_16_add_c, 0,
VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(NEON, Trans4x4DCT,
::testing::Values(make_tuple(&vpx_fdct4x4_c,
&vpx_idct4x4_16_add_neon,
0, VPX_BITS_8)));
#if !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
NEON, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_neon, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_neon, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_neon, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_c, &vp9_iht4x4_16_add_neon, 3, VPX_BITS_8)));
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
SSE2, Trans4x4WHT,
::testing::Values(
make_tuple(&vp9_fwht4x4_sse2, &vpx_iwht4x4_16_add_c, 0, VPX_BITS_8),
make_tuple(&vp9_fwht4x4_c, &vpx_iwht4x4_16_add_sse2, 0, VPX_BITS_8)));
#endif
#if HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(SSE2, Trans4x4DCT,
::testing::Values(make_tuple(&vpx_fdct4x4_sse2,
&vpx_idct4x4_16_add_sse2,
0, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(
SSE2, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_sse2, 3, VPX_BITS_8)));
#endif // HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(
SSE2, Trans4x4DCT,
::testing::Values(
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_10_sse2, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct4x4_sse2, &idct4x4_10_sse2, 0, VPX_BITS_10),
make_tuple(&vpx_highbd_fdct4x4_c, &idct4x4_12_sse2, 0, VPX_BITS_12),
make_tuple(&vpx_highbd_fdct4x4_sse2, &idct4x4_12_sse2, 0, VPX_BITS_12),
make_tuple(&vpx_fdct4x4_sse2, &vpx_idct4x4_16_add_c, 0, VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(
SSE2, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_c, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_sse2, &vp9_iht4x4_16_add_c, 3, VPX_BITS_8)));
#endif // HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(MSA, Trans4x4DCT,
::testing::Values(make_tuple(&vpx_fdct4x4_msa,
&vpx_idct4x4_16_add_msa, 0,
VPX_BITS_8)));
INSTANTIATE_TEST_CASE_P(
MSA, Trans4x4HT,
::testing::Values(
make_tuple(&vp9_fht4x4_msa, &vp9_iht4x4_16_add_msa, 0, VPX_BITS_8),
make_tuple(&vp9_fht4x4_msa, &vp9_iht4x4_16_add_msa, 1, VPX_BITS_8),
make_tuple(&vp9_fht4x4_msa, &vp9_iht4x4_16_add_msa, 2, VPX_BITS_8),
make_tuple(&vp9_fht4x4_msa, &vp9_iht4x4_16_add_msa, 3, VPX_BITS_8)));
#endif // HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
} // namespace

View File

@@ -88,45 +88,45 @@ void fht8x8_ref(const int16_t *in, tran_low_t *out, int stride, int tx_type) {
#if CONFIG_VP9_HIGHBITDEPTH
void idct8x8_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_64_add_c(in, out, stride, 10);
vpx_highbd_idct8x8_64_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct8x8_12(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_64_add_c(in, out, stride, 12);
vpx_highbd_idct8x8_64_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void iht8x8_10(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht8x8_64_add_c(in, out, stride, tx_type, 10);
vp9_highbd_iht8x8_64_add_c(in, CAST_TO_SHORTPTR(out), stride, tx_type, 10);
}
void iht8x8_12(const tran_low_t *in, uint8_t *out, int stride, int tx_type) {
vp9_highbd_iht8x8_64_add_c(in, out, stride, tx_type, 12);
vp9_highbd_iht8x8_64_add_c(in, CAST_TO_SHORTPTR(out), stride, tx_type, 12);
}
#if HAVE_SSE2
void idct8x8_12_add_10_c(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_12_add_c(in, out, stride, 10);
vpx_highbd_idct8x8_12_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct8x8_12_add_12_c(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_12_add_c(in, out, stride, 12);
vpx_highbd_idct8x8_12_add_c(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void idct8x8_12_add_10_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_12_add_sse2(in, out, stride, 10);
vpx_highbd_idct8x8_12_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct8x8_12_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_12_add_sse2(in, out, stride, 12);
vpx_highbd_idct8x8_12_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 12);
}
void idct8x8_64_add_10_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_64_add_sse2(in, out, stride, 10);
vpx_highbd_idct8x8_64_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 10);
}
void idct8x8_64_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct8x8_64_add_sse2(in, out, stride, 12);
vpx_highbd_idct8x8_64_add_sse2(in, CAST_TO_SHORTPTR(out), stride, 12);
}
#endif // HAVE_SSE2
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -257,7 +257,7 @@ class FwdTrans8x8TestBase {
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(test_temp_block, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
@@ -340,7 +340,7 @@ class FwdTrans8x8TestBase {
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(test_temp_block, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(test_temp_block, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
@@ -413,7 +413,7 @@ class FwdTrans8x8TestBase {
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
@@ -497,9 +497,9 @@ class FwdTrans8x8TestBase {
ASM_REGISTER_STATE_CHECK(RunInvTxfm(coeff, dst, pitch_));
#if CONFIG_VP9_HIGHBITDEPTH
} else {
ref_txfm(coeff, CONVERT_TO_BYTEPTR(ref16), pitch_);
ref_txfm(coeff, CAST_TO_BYTEPTR(ref16), pitch_);
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CONVERT_TO_BYTEPTR(dst16), pitch_));
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_));
#endif
}
@@ -511,8 +511,8 @@ class FwdTrans8x8TestBase {
const int diff = dst[j] - ref[j];
#endif
const uint32_t error = diff * diff;
EXPECT_EQ(0u, error) << "Error: 8x8 IDCT has error " << error
<< " at index " << j;
EXPECT_EQ(0u, error)
<< "Error: 8x8 IDCT has error " << error << " at index " << j;
}
}
}
@@ -671,16 +671,11 @@ INSTANTIATE_TEST_CASE_P(
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(NEON, FwdTrans8x8DCT,
::testing::Values(make_tuple(&vpx_fdct8x8_c,
&vpx_idct8x8_64_add_neon,
0, VPX_BITS_8)));
#else // !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(NEON, FwdTrans8x8DCT,
::testing::Values(make_tuple(&vpx_fdct8x8_neon,
&vpx_idct8x8_64_add_neon,
0, VPX_BITS_8)));
#if !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
NEON, FwdTrans8x8HT,
::testing::Values(
@@ -688,8 +683,8 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_neon, 1, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_neon, 2, VPX_BITS_8),
make_tuple(&vp9_fht8x8_c, &vp9_iht8x8_64_add_neon, 3, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(SSE2, FwdTrans8x8DCT,
@@ -744,7 +739,7 @@ INSTANTIATE_TEST_CASE_P(
!CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(SSSE3, FwdTrans8x8DCT,
::testing::Values(make_tuple(&vpx_fdct8x8_ssse3,
&vpx_idct8x8_64_add_ssse3,
&vpx_idct8x8_64_add_sse2,
0, VPX_BITS_8)));
#endif
@@ -761,4 +756,11 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_fht8x8_msa, &vp9_iht8x8_64_add_msa, 2, VPX_BITS_8),
make_tuple(&vp9_fht8x8_msa, &vp9_iht8x8_64_add_msa, 3, VPX_BITS_8)));
#endif // HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_CASE_P(VSX, FwdTrans8x8DCT,
::testing::Values(make_tuple(&vpx_fdct8x8_c,
&vpx_idct8x8_64_add_vsx, 0,
VPX_BITS_8)));
#endif // HAVE_VSX && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
} // namespace

View File

@@ -13,6 +13,7 @@
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "vpx_ports/vpx_timer.h"
#include "test/acm_random.h"
#include "test/register_state_check.h"
@@ -21,7 +22,8 @@ namespace {
using ::libvpx_test::ACMRandom;
typedef void (*HadamardFunc)(const int16_t *a, int a_stride, int16_t *b);
typedef void (*HadamardFunc)(const int16_t *a, ptrdiff_t a_stride,
tran_low_t *b);
void hadamard_loop(const int16_t *a, int a_stride, int16_t *out) {
int16_t b[8];
@@ -46,18 +48,16 @@ void hadamard_loop(const int16_t *a, int a_stride, int16_t *out) {
out[5] = c[3] - c[7];
}
void reference_hadamard8x8(const int16_t *a, int a_stride, int16_t *b) {
void reference_hadamard8x8(const int16_t *a, int a_stride, tran_low_t *b) {
int16_t buf[64];
for (int i = 0; i < 8; ++i) {
hadamard_loop(a + i, a_stride, buf + i * 8);
}
int16_t buf2[64];
for (int i = 0; i < 8; ++i) hadamard_loop(a + i, a_stride, buf + i * 8);
for (int i = 0; i < 8; ++i) hadamard_loop(buf + i, 8, buf2 + i * 8);
for (int i = 0; i < 8; ++i) {
hadamard_loop(buf + i, 8, b + i * 8);
}
for (int i = 0; i < 64; ++i) b[i] = (tran_low_t)buf2[i];
}
void reference_hadamard16x16(const int16_t *a, int a_stride, int16_t *b) {
void reference_hadamard16x16(const int16_t *a, int a_stride, tran_low_t *b) {
/* The source is a 16x16 block. The destination is rearranged to 8x32.
* Input is 9 bit. */
reference_hadamard8x8(a + 0 + 0 * a_stride, a_stride, b + 0);
@@ -68,16 +68,16 @@ void reference_hadamard16x16(const int16_t *a, int a_stride, int16_t *b) {
/* Overlay the 8x8 blocks and combine. */
for (int i = 0; i < 64; ++i) {
/* 8x8 steps the range up to 15 bits. */
const int16_t a0 = b[0];
const int16_t a1 = b[64];
const int16_t a2 = b[128];
const int16_t a3 = b[192];
const tran_low_t a0 = b[0];
const tran_low_t a1 = b[64];
const tran_low_t a2 = b[128];
const tran_low_t a3 = b[192];
/* Prevent the result from escaping int16_t. */
const int16_t b0 = (a0 + a1) >> 1;
const int16_t b1 = (a0 - a1) >> 1;
const int16_t b2 = (a2 + a3) >> 1;
const int16_t b3 = (a2 - a3) >> 1;
const tran_low_t b0 = (a0 + a1) >> 1;
const tran_low_t b1 = (a0 - a1) >> 1;
const tran_low_t b2 = (a2 + a3) >> 1;
const tran_low_t b3 = (a2 - a3) >> 1;
/* Store a 16 bit value. */
b[0] = b0 + b2;
@@ -101,12 +101,35 @@ class HadamardTestBase : public ::testing::TestWithParam<HadamardFunc> {
ACMRandom rnd_;
};
void HadamardSpeedTest(const char *name, HadamardFunc const func,
const int16_t *input, int stride, tran_low_t *output,
int times) {
int i;
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (i = 0; i < times; ++i) {
func(input, stride, output);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("%s[%12d runs]: %d us\n", name, times, elapsed_time);
}
class Hadamard8x8Test : public HadamardTestBase {};
void HadamardSpeedTest8x8(HadamardFunc const func, int times) {
DECLARE_ALIGNED(16, int16_t, input[64]);
DECLARE_ALIGNED(16, tran_low_t, output[64]);
memset(input, 1, sizeof(input));
HadamardSpeedTest("Hadamard8x8", func, input, 8, output, times);
}
TEST_P(Hadamard8x8Test, CompareReferenceRandom) {
DECLARE_ALIGNED(16, int16_t, a[64]);
DECLARE_ALIGNED(16, int16_t, b[64]);
int16_t b_ref[64];
DECLARE_ALIGNED(16, tran_low_t, b[64]);
tran_low_t b_ref[64];
for (int i = 0; i < 64; ++i) {
a[i] = rnd_.Rand9Signed();
}
@@ -124,8 +147,8 @@ TEST_P(Hadamard8x8Test, CompareReferenceRandom) {
TEST_P(Hadamard8x8Test, VaryStride) {
DECLARE_ALIGNED(16, int16_t, a[64 * 8]);
DECLARE_ALIGNED(16, int16_t, b[64]);
int16_t b_ref[64];
DECLARE_ALIGNED(16, tran_low_t, b[64]);
tran_low_t b_ref[64];
for (int i = 0; i < 64 * 8; ++i) {
a[i] = rnd_.Rand9Signed();
}
@@ -144,6 +167,12 @@ TEST_P(Hadamard8x8Test, VaryStride) {
}
}
TEST_P(Hadamard8x8Test, DISABLED_Speed) {
HadamardSpeedTest8x8(h_func_, 10);
HadamardSpeedTest8x8(h_func_, 10000);
HadamardSpeedTest8x8(h_func_, 10000000);
}
INSTANTIATE_TEST_CASE_P(C, Hadamard8x8Test,
::testing::Values(&vpx_hadamard_8x8_c));
@@ -162,12 +191,33 @@ INSTANTIATE_TEST_CASE_P(NEON, Hadamard8x8Test,
::testing::Values(&vpx_hadamard_8x8_neon));
#endif // HAVE_NEON
// TODO(jingning): Remove highbitdepth flag when the SIMD functions are
// in place and turn on the unit test.
#if !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, Hadamard8x8Test,
::testing::Values(&vpx_hadamard_8x8_msa));
#endif // HAVE_MSA
#endif // !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_VSX
INSTANTIATE_TEST_CASE_P(VSX, Hadamard8x8Test,
::testing::Values(&vpx_hadamard_8x8_vsx));
#endif // HAVE_VSX
class Hadamard16x16Test : public HadamardTestBase {};
void HadamardSpeedTest16x16(HadamardFunc const func, int times) {
DECLARE_ALIGNED(16, int16_t, input[256]);
DECLARE_ALIGNED(16, tran_low_t, output[256]);
memset(input, 1, sizeof(input));
HadamardSpeedTest("Hadamard16x16", func, input, 16, output, times);
}
TEST_P(Hadamard16x16Test, CompareReferenceRandom) {
DECLARE_ALIGNED(16, int16_t, a[16 * 16]);
DECLARE_ALIGNED(16, int16_t, b[16 * 16]);
int16_t b_ref[16 * 16];
DECLARE_ALIGNED(16, tran_low_t, b[16 * 16]);
tran_low_t b_ref[16 * 16];
for (int i = 0; i < 16 * 16; ++i) {
a[i] = rnd_.Rand9Signed();
}
@@ -185,8 +235,8 @@ TEST_P(Hadamard16x16Test, CompareReferenceRandom) {
TEST_P(Hadamard16x16Test, VaryStride) {
DECLARE_ALIGNED(16, int16_t, a[16 * 16 * 8]);
DECLARE_ALIGNED(16, int16_t, b[16 * 16]);
int16_t b_ref[16 * 16];
DECLARE_ALIGNED(16, tran_low_t, b[16 * 16]);
tran_low_t b_ref[16 * 16];
for (int i = 0; i < 16 * 16 * 8; ++i) {
a[i] = rnd_.Rand9Signed();
}
@@ -205,6 +255,12 @@ TEST_P(Hadamard16x16Test, VaryStride) {
}
}
TEST_P(Hadamard16x16Test, DISABLED_Speed) {
HadamardSpeedTest16x16(h_func_, 10);
HadamardSpeedTest16x16(h_func_, 10000);
HadamardSpeedTest16x16(h_func_, 10000000);
}
INSTANTIATE_TEST_CASE_P(C, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_c));
@@ -213,8 +269,25 @@ INSTANTIATE_TEST_CASE_P(SSE2, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_sse2));
#endif // HAVE_SSE2
#if HAVE_AVX2
INSTANTIATE_TEST_CASE_P(AVX2, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_avx2));
#endif // HAVE_AVX2
#if HAVE_VSX
INSTANTIATE_TEST_CASE_P(VSX, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_vsx));
#endif // HAVE_VSX
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_neon));
#endif // HAVE_NEON
#if !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, Hadamard16x16Test,
::testing::Values(&vpx_hadamard_16x16_msa));
#endif // HAVE_MSA
#endif // !CONFIG_VP9_HIGHBITDEPTH
} // namespace

View File

@@ -13,6 +13,7 @@
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "vpx/vpx_integer.h"
@@ -21,110 +22,156 @@ typedef void (*IdctFunc)(int16_t *input, unsigned char *pred_ptr,
int pred_stride, unsigned char *dst_ptr,
int dst_stride);
namespace {
using libvpx_test::Buffer;
class IDCTTest : public ::testing::TestWithParam<IdctFunc> {
protected:
virtual void SetUp() {
int i;
UUT = GetParam();
memset(input, 0, sizeof(input));
/* Set up guard blocks */
for (i = 0; i < 256; i++) output[i] = ((i & 0xF) < 4 && (i < 64)) ? 0 : -1;
input = new Buffer<int16_t>(4, 4, 0);
ASSERT_TRUE(input != NULL);
ASSERT_TRUE(input->Init());
predict = new Buffer<uint8_t>(4, 4, 3);
ASSERT_TRUE(predict != NULL);
ASSERT_TRUE(predict->Init());
output = new Buffer<uint8_t>(4, 4, 3);
ASSERT_TRUE(output != NULL);
ASSERT_TRUE(output->Init());
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
virtual void TearDown() {
delete input;
delete predict;
delete output;
libvpx_test::ClearSystemState();
}
IdctFunc UUT;
int16_t input[16];
unsigned char output[256];
unsigned char predict[256];
Buffer<int16_t> *input;
Buffer<uint8_t> *predict;
Buffer<uint8_t> *output;
};
TEST_P(IDCTTest, TestGuardBlocks) {
int i;
for (i = 0; i < 256; i++) {
if ((i & 0xF) < 4 && i < 64)
EXPECT_EQ(0, output[i]) << i;
else
EXPECT_EQ(255, output[i]);
}
}
TEST_P(IDCTTest, TestAllZeros) {
int i;
// When the input is '0' the output will be '0'.
input->Set(0);
predict->Set(0);
output->Set(0);
ASM_REGISTER_STATE_CHECK(UUT(input, output, 16, output, 16));
ASM_REGISTER_STATE_CHECK(UUT(input->TopLeftPixel(), predict->TopLeftPixel(),
predict->stride(), output->TopLeftPixel(),
output->stride()));
for (i = 0; i < 256; i++) {
if ((i & 0xF) < 4 && i < 64)
EXPECT_EQ(0, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
}
ASSERT_TRUE(input->CheckValues(0));
ASSERT_TRUE(input->CheckPadding());
ASSERT_TRUE(output->CheckValues(0));
ASSERT_TRUE(output->CheckPadding());
}
TEST_P(IDCTTest, TestAllOnes) {
int i;
input->Set(0);
// When the first element is '4' it will fill the output buffer with '1'.
input->TopLeftPixel()[0] = 4;
predict->Set(0);
output->Set(0);
input[0] = 4;
ASM_REGISTER_STATE_CHECK(UUT(input, output, 16, output, 16));
ASM_REGISTER_STATE_CHECK(UUT(input->TopLeftPixel(), predict->TopLeftPixel(),
predict->stride(), output->TopLeftPixel(),
output->stride()));
for (i = 0; i < 256; i++) {
if ((i & 0xF) < 4 && i < 64)
EXPECT_EQ(1, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
}
ASSERT_TRUE(output->CheckValues(1));
ASSERT_TRUE(output->CheckPadding());
}
TEST_P(IDCTTest, TestAddOne) {
int i;
// Set the transform output to '1' and make sure it gets added to the
// prediction buffer.
input->Set(0);
input->TopLeftPixel()[0] = 4;
output->Set(0);
for (i = 0; i < 256; i++) predict[i] = i;
input[0] = 4;
ASM_REGISTER_STATE_CHECK(UUT(input, predict, 16, output, 16));
for (i = 0; i < 256; i++) {
if ((i & 0xF) < 4 && i < 64)
EXPECT_EQ(i + 1, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
uint8_t *pred = predict->TopLeftPixel();
for (int y = 0; y < 4; ++y) {
for (int x = 0; x < 4; ++x) {
pred[y * predict->stride() + x] = y * 4 + x;
}
}
ASM_REGISTER_STATE_CHECK(UUT(input->TopLeftPixel(), predict->TopLeftPixel(),
predict->stride(), output->TopLeftPixel(),
output->stride()));
uint8_t const *out = output->TopLeftPixel();
for (int y = 0; y < 4; ++y) {
for (int x = 0; x < 4; ++x) {
EXPECT_EQ(1 + y * 4 + x, out[y * output->stride() + x]);
}
}
if (HasFailure()) {
output->DumpBuffer();
}
ASSERT_TRUE(output->CheckPadding());
}
TEST_P(IDCTTest, TestWithData) {
int i;
// Test a single known input.
predict->Set(0);
for (i = 0; i < 16; i++) input[i] = i;
ASM_REGISTER_STATE_CHECK(UUT(input, output, 16, output, 16));
for (i = 0; i < 256; i++) {
if ((i & 0xF) > 3 || i > 63)
EXPECT_EQ(255, output[i]) << "i==" << i;
else if (i == 0)
EXPECT_EQ(11, output[i]) << "i==" << i;
else if (i == 34)
EXPECT_EQ(1, output[i]) << "i==" << i;
else if (i == 2 || i == 17 || i == 32)
EXPECT_EQ(3, output[i]) << "i==" << i;
else
EXPECT_EQ(0, output[i]) << "i==" << i;
int16_t *in = input->TopLeftPixel();
for (int y = 0; y < 4; ++y) {
for (int x = 0; x < 4; ++x) {
in[y * input->stride() + x] = y * 4 + x;
}
}
ASM_REGISTER_STATE_CHECK(UUT(input->TopLeftPixel(), predict->TopLeftPixel(),
predict->stride(), output->TopLeftPixel(),
output->stride()));
uint8_t *out = output->TopLeftPixel();
for (int y = 0; y < 4; ++y) {
for (int x = 0; x < 4; ++x) {
switch (y * 4 + x) {
case 0: EXPECT_EQ(11, out[y * output->stride() + x]); break;
case 2:
case 5:
case 8: EXPECT_EQ(3, out[y * output->stride() + x]); break;
case 10: EXPECT_EQ(1, out[y * output->stride() + x]); break;
default: EXPECT_EQ(0, out[y * output->stride() + x]);
}
}
}
if (HasFailure()) {
output->DumpBuffer();
}
ASSERT_TRUE(output->CheckPadding());
}
INSTANTIATE_TEST_CASE_P(C, IDCTTest, ::testing::Values(vp8_short_idct4x4llm_c));
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_neon));
#endif
#endif // HAVE_NEON
#if HAVE_MMX
INSTANTIATE_TEST_CASE_P(MMX, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_mmx));
#endif
#endif // HAVE_MMX
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_msa));
#endif
#endif // HAVE_MSA
#if HAVE_MMI
INSTANTIATE_TEST_CASE_P(MMI, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_mmi));
#endif // HAVE_MMI
}

View File

@@ -45,8 +45,8 @@ class InvalidFileTest : public ::libvpx_test::DecoderTest,
void OpenResFile(const std::string &res_file_name_) {
res_file_ = libvpx_test::OpenTestDataFile(res_file_name_);
ASSERT_TRUE(res_file_ != NULL) << "Result file open failed. Filename: "
<< res_file_name_;
ASSERT_TRUE(res_file_ != NULL)
<< "Result file open failed. Filename: " << res_file_name_;
}
virtual bool HandleDecodeResult(
@@ -120,11 +120,23 @@ class InvalidFileTest : public ::libvpx_test::DecoderTest,
TEST_P(InvalidFileTest, ReturnCode) { RunTest(); }
#if CONFIG_VP8_DECODER
const DecodeParam kVP8InvalidFileTests[] = {
{ 1, "invalid-bug-1443.ivf" },
};
VP8_INSTANTIATE_TEST_CASE(InvalidFileTest,
::testing::ValuesIn(kVP8InvalidFileTests));
#endif // CONFIG_VP8_DECODER
#if CONFIG_VP9_DECODER
const DecodeParam kVP9InvalidFileTests[] = {
{ 1, "invalid-vp90-02-v2.webm" },
#if CONFIG_VP9_HIGHBITDEPTH
{ 1, "invalid-vp90-2-00-quantizer-00.webm.ivf.s5861_r01-05_b6-.v2.ivf" },
{ 1,
"invalid-vp90-2-21-resize_inter_320x180_5_3-4.webm.ivf.s45551_r01-05_b6-."
"ivf" },
#endif
{ 1, "invalid-vp90-03-v3.webm" },
{ 1, "invalid-vp90-2-00-quantizer-11.webm.ivf.s52984_r01-05_b6-.ivf" },
@@ -164,12 +176,12 @@ class InvalidFileInvalidPeekTest : public InvalidFileTest {
TEST_P(InvalidFileInvalidPeekTest, ReturnCode) { RunTest(); }
#if CONFIG_VP8_DECODER
const DecodeParam kVP8InvalidFileTests[] = {
const DecodeParam kVP8InvalidPeekTests[] = {
{ 1, "invalid-vp80-00-comprehensive-018.ivf.2kf_0x6.ivf" },
};
VP8_INSTANTIATE_TEST_CASE(InvalidFileInvalidPeekTest,
::testing::ValuesIn(kVP8InvalidFileTests));
::testing::ValuesIn(kVP8InvalidPeekTests));
#endif // CONFIG_VP8_DECODER
#if CONFIG_VP9_DECODER

View File

@@ -47,8 +47,8 @@ class IVFVideoSource : public CompressedVideoSource {
virtual void Begin() {
input_file_ = OpenTestDataFile(file_name_);
ASSERT_TRUE(input_file_ != NULL) << "Input file open failed. Filename: "
<< file_name_;
ASSERT_TRUE(input_file_ != NULL)
<< "Input file open failed. Filename: " << file_name_;
// Read file header
uint8_t file_hdr[kIvfFileHdrSize];

View File

@@ -135,8 +135,8 @@ TEST_P(KeyframeTest, TestAutoKeyframe) {
for (std::vector<vpx_codec_pts_t>::const_iterator iter = kf_pts_list_.begin();
iter != kf_pts_list_.end(); ++iter) {
if (deadline_ == VPX_DL_REALTIME && *iter > 0)
EXPECT_EQ(0, (*iter - 1) % 30) << "Unexpected keyframe at frame "
<< *iter;
EXPECT_EQ(0, (*iter - 1) % 30)
<< "Unexpected keyframe at frame " << *iter;
else
EXPECT_EQ(0, *iter % 30) << "Unexpected keyframe at frame " << *iter;
}

View File

@@ -66,34 +66,34 @@ class LevelTest
int level_;
};
TEST_P(LevelTest, TestTargetLevel11) {
TEST_P(LevelTest, TestTargetLevel11Large) {
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("hantro_odd.yuv", 208, 144, 30, 1, 0,
90);
60);
target_level_ = 11;
cfg_.rc_target_bitrate = 150;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_EQ(target_level_, level_);
ASSERT_GE(target_level_, level_);
}
TEST_P(LevelTest, TestTargetLevel20) {
TEST_P(LevelTest, TestTargetLevel20Large) {
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 90);
30, 1, 0, 60);
target_level_ = 20;
cfg_.rc_target_bitrate = 1200;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_EQ(target_level_, level_);
ASSERT_GE(target_level_, level_);
}
TEST_P(LevelTest, TestTargetLevel31) {
TEST_P(LevelTest, TestTargetLevel31Large) {
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("niklas_1280_720_30.y4m", 1280, 720, 30,
1, 0, 60);
target_level_ = 31;
cfg_.rc_target_bitrate = 8000;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_EQ(target_level_, level_);
ASSERT_GE(target_level_, level_);
}
// Test for keeping level stats only
@@ -103,11 +103,11 @@ TEST_P(LevelTest, TestTargetLevel0) {
target_level_ = 0;
min_gf_internal_ = 4;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_EQ(11, level_);
ASSERT_GE(11, level_);
cfg_.rc_target_bitrate = 1600;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_EQ(20, level_);
ASSERT_GE(20, level_);
}
// Test for level control being turned off
@@ -130,7 +130,7 @@ TEST_P(LevelTest, TestTargetLevelApi) {
if (level == 10 || level == 11 || level == 20 || level == 21 ||
level == 30 || level == 31 || level == 40 || level == 41 ||
level == 50 || level == 51 || level == 52 || level == 60 ||
level == 61 || level == 62 || level == 0 || level == 255)
level == 61 || level == 62 || level == 0 || level == 1 || level == 255)
EXPECT_EQ(VPX_CODEC_OK,
vpx_codec_control(&enc, VP9E_SET_TARGET_LEVEL, level));
else

View File

@@ -114,6 +114,18 @@ void InitInput(Pixel *s, Pixel *ref_s, ACMRandom *rnd, const uint8_t limit,
}
}
uint8_t GetOuterThresh(ACMRandom *rnd) {
return static_cast<uint8_t>(rnd->RandRange(3 * MAX_LOOP_FILTER + 5));
}
uint8_t GetInnerThresh(ACMRandom *rnd) {
return static_cast<uint8_t>(rnd->RandRange(MAX_LOOP_FILTER + 1));
}
uint8_t GetHevThresh(ACMRandom *rnd) {
return static_cast<uint8_t>(rnd->RandRange(MAX_LOOP_FILTER + 1) >> 4);
}
class Loop8Test6Param : public ::testing::TestWithParam<loop8_param_t> {
public:
virtual ~Loop8Test6Param() {}
@@ -162,15 +174,15 @@ TEST_P(Loop8Test6Param, OperationCheck) {
int first_failure = -1;
for (int i = 0; i < count_test_block; ++i) {
int err_count = 0;
uint8_t tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
uint8_t tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
@@ -221,15 +233,15 @@ TEST_P(Loop8Test6Param, ValueCheck) {
for (int i = 0; i < count_test_block; ++i) {
int err_count = 0;
uint8_t tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
uint8_t tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
@@ -271,27 +283,27 @@ TEST_P(Loop8Test9Param, OperationCheck) {
int first_failure = -1;
for (int i = 0; i < count_test_block; ++i) {
int err_count = 0;
uint8_t tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
uint8_t tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
@@ -334,27 +346,27 @@ TEST_P(Loop8Test9Param, ValueCheck) {
int first_failure = -1;
for (int i = 0; i < count_test_block; ++i) {
int err_count = 0;
uint8_t tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
uint8_t tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh0[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(3 * MAX_LOOP_FILTER + 4));
tmp = GetOuterThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
blimit1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = static_cast<uint8_t>(rnd(MAX_LOOP_FILTER));
tmp = GetInnerThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
limit1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };
tmp = rnd.Rand8();
tmp = GetHevThresh(&rnd);
DECLARE_ALIGNED(16, const uint8_t,
thresh1[16]) = { tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp,
tmp, tmp, tmp, tmp, tmp, tmp, tmp, tmp };

View File

@@ -107,10 +107,10 @@ TEST_P(MinMaxTest, CompareReferenceAndVaryStride) {
int min_ref, max_ref, min, max;
reference_minmax(a, a_stride, b, b_stride, &min_ref, &max_ref);
ASM_REGISTER_STATE_CHECK(mm_func_(a, a_stride, b, b_stride, &min, &max));
EXPECT_EQ(max_ref, max) << "when a_stride = " << a_stride
<< " and b_stride = " << b_stride;
EXPECT_EQ(min_ref, min) << "when a_stride = " << a_stride
<< " and b_stride = " << b_stride;
EXPECT_EQ(max_ref, max)
<< "when a_stride = " << a_stride << " and b_stride = " << b_stride;
EXPECT_EQ(min_ref, min)
<< "when a_stride = " << a_stride << " and b_stride = " << b_stride;
}
}
}
@@ -127,4 +127,9 @@ INSTANTIATE_TEST_CASE_P(NEON, MinMaxTest,
::testing::Values(&vpx_minmax_8x8_neon));
#endif
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, MinMaxTest,
::testing::Values(&vpx_minmax_8x8_msa));
#endif
} // namespace

View File

@@ -43,9 +43,11 @@ void wrapper(const tran_low_t *in, uint8_t *out, int stride, int bd) {
}
#if CONFIG_VP9_HIGHBITDEPTH
template <InvTxfmWithBdFunc fn>
typedef void (*InvTxfmHighbdFunc)(const tran_low_t *in, uint16_t *out,
int stride, int bd);
template <InvTxfmHighbdFunc fn>
void highbd_wrapper(const tran_low_t *in, uint8_t *out, int stride, int bd) {
fn(in, CONVERT_TO_BYTEPTR(out), stride, bd);
fn(in, CAST_TO_SHORTPTR(out), stride, bd);
}
#endif
@@ -55,41 +57,14 @@ typedef std::tr1::tuple<FwdTxfmFunc, InvTxfmWithBdFunc, InvTxfmWithBdFunc,
const int kMaxNumCoeffs = 1024;
const int kCountTestBlock = 1000;
// https://bugs.chromium.org/p/webm/issues/detail?id=1332
// The functions specified do not pass with INT16_MIN/MAX. They fail at the
// value specified, but pass when 1 is added/subtracted.
int16_t MaxSupportedCoeff(InvTxfmWithBdFunc a) {
#if HAVE_SSSE3 && ARCH_X86_64 && !CONFIG_EMULATE_HARDWARE
if (a == &wrapper<vpx_idct8x8_64_add_ssse3> ||
a == &wrapper<vpx_idct8x8_12_add_ssse3>) {
return 23625 - 1;
}
#else
(void)a;
#endif
return std::numeric_limits<int16_t>::max();
}
int16_t MinSupportedCoeff(InvTxfmWithBdFunc a) {
#if HAVE_SSSE3 && ARCH_X86_64 && !CONFIG_EMULATE_HARDWARE
if (a == &wrapper<vpx_idct8x8_64_add_ssse3> ||
a == &wrapper<vpx_idct8x8_12_add_ssse3>) {
return -23625 + 1;
}
#else
(void)a;
#endif
return std::numeric_limits<int16_t>::min();
}
class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
public:
virtual ~PartialIDctTest() {}
virtual void SetUp() {
rnd_.Reset(ACMRandom::DeterministicSeed());
ftxfm_ = GET_PARAM(0);
full_itxfm_ = GET_PARAM(1);
partial_itxfm_ = GET_PARAM(2);
fwd_txfm_ = GET_PARAM(0);
full_inv_txfm_ = GET_PARAM(1);
partial_inv_txfm_ = GET_PARAM(2);
tx_size_ = GET_PARAM(3);
last_nonzero_ = GET_PARAM(4);
bit_depth_ = GET_PARAM(5);
@@ -153,12 +128,12 @@ class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
}
void InitInput() {
const int max_coeff = 32766 / 4;
int max_energy_leftover = max_coeff * max_coeff;
const int64_t max_coeff = (32766 << (bit_depth_ - 8)) / 4;
int64_t max_energy_leftover = max_coeff * max_coeff;
for (int j = 0; j < last_nonzero_; ++j) {
int16_t coeff = static_cast<int16_t>(sqrt(1.0 * max_energy_leftover) *
(rnd_.Rand16() - 32768) / 65536);
max_energy_leftover -= coeff * coeff;
tran_low_t coeff = static_cast<tran_low_t>(
sqrt(1.0 * max_energy_leftover) * (rnd_.Rand16() - 32768) / 65536);
max_energy_leftover -= static_cast<int64_t>(coeff) * coeff;
if (max_energy_leftover < 0) {
max_energy_leftover = 0;
coeff = 0;
@@ -167,6 +142,36 @@ class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
}
}
void PrintDiff() {
if (memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_)) {
uint16_t ref, opt;
for (int y = 0; y < size_; y++) {
for (int x = 0; x < size_; x++) {
if (pixel_size_ == 1) {
ref = output_block_ref_[y * stride_ + x];
opt = output_block_[y * stride_ + x];
} else {
ref = reinterpret_cast<uint16_t *>(
output_block_ref_)[y * stride_ + x];
opt = reinterpret_cast<uint16_t *>(output_block_)[y * stride_ + x];
}
if (ref != opt) {
printf("dest[%d][%d] diff:%6d (ref),%6d (opt)\n", y, x, ref, opt);
}
}
}
printf("\ninput_block_:\n");
for (int y = 0; y < size_; y++) {
for (int x = 0; x < size_; x++) {
printf("%6d,", input_block_[y * size_ + x]);
}
printf("\n");
}
}
}
protected:
int last_nonzero_;
TX_SIZE tx_size_;
@@ -180,34 +185,43 @@ class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
int output_block_size_;
int bit_depth_;
int mask_;
FwdTxfmFunc ftxfm_;
InvTxfmWithBdFunc full_itxfm_;
InvTxfmWithBdFunc partial_itxfm_;
FwdTxfmFunc fwd_txfm_;
InvTxfmWithBdFunc full_inv_txfm_;
InvTxfmWithBdFunc partial_inv_txfm_;
ACMRandom rnd_;
};
TEST_P(PartialIDctTest, RunQuantCheck) {
const int count_test_block = (size_ != 4) ? kCountTestBlock : 65536;
DECLARE_ALIGNED(16, int16_t, input_extreme_block[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_ref_block[kMaxNumCoeffs]);
InitMem();
for (int i = 0; i < kCountTestBlock * kCountTestBlock; ++i) {
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-mask_, mask_].
if (i == 0) {
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = mask_;
}
} else if (i == 1) {
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = -mask_;
if (size_ != 4) {
if (i == 0) {
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = mask_;
}
} else if (i == 1) {
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = -mask_;
}
} else {
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = rnd_.Rand8() % 2 ? mask_ : -mask_;
}
}
} else {
// Try all possible combinations.
for (int k = 0; k < input_block_size_; ++k) {
input_extreme_block[k] = rnd_.Rand8() % 2 ? mask_ : -mask_;
input_extreme_block[k] = (i & (1 << k)) ? mask_ : -mask_;
}
}
ftxfm_(input_extreme_block, output_ref_block, size_);
fwd_txfm_(input_extreme_block, output_ref_block, size_);
// quantization with minimum allowed step sizes
input_block_[0] = (output_ref_block[0] / 4) * 4;
@@ -217,9 +231,9 @@ TEST_P(PartialIDctTest, RunQuantCheck) {
}
ASM_REGISTER_STATE_CHECK(
full_itxfm_(input_block_, output_block_ref_, stride_, bit_depth_));
full_inv_txfm_(input_block_, output_block_ref_, stride_, bit_depth_));
ASM_REGISTER_STATE_CHECK(
partial_itxfm_(input_block_, output_block_, stride_, bit_depth_));
partial_inv_txfm_(input_block_, output_block_, stride_, bit_depth_));
ASSERT_EQ(0, memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_))
<< "Error: partial inverse transform produces different results";
@@ -232,9 +246,9 @@ TEST_P(PartialIDctTest, ResultsMatch) {
InitInput();
ASM_REGISTER_STATE_CHECK(
full_itxfm_(input_block_, output_block_ref_, stride_, bit_depth_));
full_inv_txfm_(input_block_, output_block_ref_, stride_, bit_depth_));
ASM_REGISTER_STATE_CHECK(
partial_itxfm_(input_block_, output_block_, stride_, bit_depth_));
partial_inv_txfm_(input_block_, output_block_, stride_, bit_depth_));
ASSERT_EQ(0, memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_))
<< "Error: partial inverse transform produces different results";
@@ -249,9 +263,9 @@ TEST_P(PartialIDctTest, AddOutputBlock) {
}
ASM_REGISTER_STATE_CHECK(
full_itxfm_(input_block_, output_block_ref_, stride_, bit_depth_));
full_inv_txfm_(input_block_, output_block_ref_, stride_, bit_depth_));
ASM_REGISTER_STATE_CHECK(
partial_itxfm_(input_block_, output_block_, stride_, bit_depth_));
partial_inv_txfm_(input_block_, output_block_, stride_, bit_depth_));
ASSERT_EQ(0, memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_))
<< "Error: Transform results are not correctly added to output.";
@@ -259,8 +273,8 @@ TEST_P(PartialIDctTest, AddOutputBlock) {
}
TEST_P(PartialIDctTest, SingleExtremeCoeff) {
const int16_t max_coeff = MaxSupportedCoeff(partial_itxfm_);
const int16_t min_coeff = MinSupportedCoeff(partial_itxfm_);
const int16_t max_coeff = std::numeric_limits<int16_t>::max();
const int16_t min_coeff = std::numeric_limits<int16_t>::min();
for (int i = 0; i < last_nonzero_; ++i) {
memset(input_block_, 0, sizeof(*input_block_) * input_block_size_);
// Run once for min and once for max.
@@ -272,9 +286,9 @@ TEST_P(PartialIDctTest, SingleExtremeCoeff) {
input_block_[vp9_default_scan_orders[tx_size_].scan[i]] = coeff;
ASM_REGISTER_STATE_CHECK(
full_itxfm_(input_block_, output_block_ref_, stride_, bit_depth_));
full_inv_txfm_(input_block_, output_block_ref_, stride_, bit_depth_));
ASM_REGISTER_STATE_CHECK(
partial_itxfm_(input_block_, output_block_, stride_, bit_depth_));
partial_inv_txfm_(input_block_, output_block_, stride_, bit_depth_));
ASSERT_EQ(0, memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_))
<< "Error: Fails with single coeff of " << coeff << " at " << i
@@ -291,20 +305,20 @@ TEST_P(PartialIDctTest, DISABLED_Speed) {
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
ASM_REGISTER_STATE_CHECK(
full_itxfm_(input_block_, output_block_ref_, stride_, bit_depth_));
full_inv_txfm_(input_block_, output_block_ref_, stride_, bit_depth_));
}
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
partial_itxfm_(input_block_, output_block_, stride_, bit_depth_);
partial_inv_txfm_(input_block_, output_block_, stride_, bit_depth_);
}
libvpx_test::ClearSystemState();
vpx_usec_timer_mark(&timer);
const int elapsed_time =
static_cast<int>(vpx_usec_timer_elapsed(&timer) / 1000);
printf("idct%dx%d_%d (bitdepth %d) time: %5d ms ", size_, size_,
last_nonzero_, bit_depth_, elapsed_time);
printf("idct%dx%d_%d (%s %d) time: %5d ms\n", size_, size_, last_nonzero_,
(pixel_size_ == 1) ? "bitdepth" : "high bitdepth", bit_depth_,
elapsed_time);
ASSERT_EQ(0, memcmp(output_block_ref_, output_block_,
pixel_size_ * output_block_size_))
<< "Error: partial inverse transform produces different results";
@@ -323,6 +337,15 @@ const PartialInvTxfmParam c_partial_idct_tests[] = {
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>, TX_32X32, 1024, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>, TX_32X32, 135, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>, TX_32X32, 135, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>, TX_32X32, 135, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_c>, TX_32X32, 34, 8, 2),
@@ -350,6 +373,15 @@ const PartialInvTxfmParam c_partial_idct_tests[] = {
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_c>, TX_16X16, 256, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_c>, TX_16X16, 38, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_c>, TX_16X16, 38, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_c>, TX_16X16, 38, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_c>, TX_16X16, 10, 8, 2),
@@ -424,6 +456,8 @@ const PartialInvTxfmParam c_partial_idct_tests[] = {
&wrapper<vpx_idct32x32_1_add_c>, TX_32X32, 1, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_256_add_c>, TX_16X16, 256, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_38_add_c>, TX_16X16, 38, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_10_add_c>, TX_16X16, 10, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
@@ -443,9 +477,86 @@ const PartialInvTxfmParam c_partial_idct_tests[] = {
INSTANTIATE_TEST_CASE_P(C, PartialIDctTest,
::testing::ValuesIn(c_partial_idct_tests));
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#if !CONFIG_EMULATE_HARDWARE
#if HAVE_NEON
const PartialInvTxfmParam neon_partial_idct_tests[] = {
#if CONFIG_VP9_HIGHBITDEPTH
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_neon>, TX_32X32,
1024, 8, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_neon>, TX_32X32,
1024, 10, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_neon>, TX_32X32,
1024, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_neon>, TX_32X32, 135, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_neon>, TX_32X32, 135, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_neon>, TX_32X32, 135, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_neon>, TX_32X32, 34, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_neon>, TX_32X32, 34, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_neon>, TX_32X32, 34, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_neon>, TX_32X32, 1, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_neon>, TX_32X32, 1, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_neon>, TX_32X32, 1, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_neon>, TX_16X16, 256, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_neon>, TX_16X16, 256, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_neon>, TX_16X16, 256, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_neon>, TX_16X16, 38, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_neon>, TX_16X16, 38, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_neon>, TX_16X16, 38, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_neon>, TX_16X16, 10, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_neon>, TX_16X16, 10, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_neon>, TX_16X16, 10, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_neon>, TX_16X16, 1, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_neon>, TX_16X16, 1, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_neon>, TX_16X16, 1, 12, 2),
make_tuple(&vpx_highbd_fdct8x8_c,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_neon>, TX_8X8, 64, 8, 2),
@@ -488,46 +599,78 @@ const PartialInvTxfmParam neon_partial_idct_tests[] = {
#endif // CONFIG_VP9_HIGHBITDEPTH
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_neon>, TX_32X32, 1024, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_135_add_c>,
&wrapper<vpx_idct32x32_135_add_neon>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_34_add_c>,
&wrapper<vpx_idct32x32_34_add_neon>, TX_32X32, 34, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1_add_c>,
&wrapper<vpx_idct32x32_1_add_neon>, TX_32X32, 1, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_256_add_neon>, TX_16X16, 256, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_38_add_c>,
&wrapper<vpx_idct16x16_38_add_neon>, TX_16X16, 38, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_10_add_c>,
&wrapper<vpx_idct16x16_10_add_neon>, TX_16X16, 10, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_1_add_c>,
&wrapper<vpx_idct16x16_1_add_neon>, TX_16X16, 1, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
&wrapper<vpx_idct8x8_64_add_neon>, TX_8X8, 64, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_12_add_c>,
&wrapper<vpx_idct8x8_12_add_neon>, TX_8X8, 12, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_1_add_c>,
&wrapper<vpx_idct8x8_1_add_neon>, TX_8X8, 1, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
&wrapper<vpx_idct4x4_16_add_neon>, TX_4X4, 16, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_1_add_c>,
&wrapper<vpx_idct4x4_1_add_neon>, TX_4X4, 1, 8, 1)
};
INSTANTIATE_TEST_CASE_P(NEON, PartialIDctTest,
::testing::ValuesIn(neon_partial_idct_tests));
#endif // HAVE_NEON && !CONFIG_EMULATE_HARDWARE
#endif // HAVE_NEON
#if HAVE_SSE2 && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2
// 32x32_135_ is implemented using the 1024 version.
const PartialInvTxfmParam sse2_partial_idct_tests[] = {
#if CONFIG_VP9_HIGHBITDEPTH
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse2>, TX_32X32,
1024, 8, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse2>, TX_32X32,
1024, 10, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse2>, TX_32X32,
1024, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse2>, TX_32X32, 135, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse2>, TX_32X32, 135, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse2>, TX_32X32, 135, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse2>, TX_32X32, 34, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse2>, TX_32X32, 34, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse2>, TX_32X32, 34, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_sse2>, TX_32X32, 1, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_sse2>, TX_32X32, 1, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_1_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1_add_sse2>, TX_32X32, 1, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
@@ -539,14 +682,32 @@ const PartialInvTxfmParam sse2_partial_idct_tests[] = {
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_sse2>, TX_16X16, 256, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse2>, TX_16X16, 38, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse2>, TX_16X16, 38, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse2>, TX_16X16, 38, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse2>, TX_16X16, 10, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse2>, TX_16X16, 10, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse2>, TX_16X16, 10, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_sse2>, TX_16X16, 1, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_sse2>, TX_16X16, 1, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_1_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_1_add_sse2>, TX_16X16, 1, 12, 2),
make_tuple(&vpx_highbd_fdct8x8_c,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_sse2>, TX_8X8, 64, 8, 2),
@@ -557,14 +718,20 @@ const PartialInvTxfmParam sse2_partial_idct_tests[] = {
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_sse2>, TX_8X8, 64, 12, 2),
make_tuple(&vpx_highbd_fdct8x8_c,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse2>, TX_8X8, 12, 8, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse2>, TX_8X8, 12, 10, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse2>, TX_8X8, 12, 12, 2),
make_tuple(&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_1_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_1_add_sse2>, TX_8X8, 1, 8, 2),
make_tuple(&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_1_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_1_add_sse2>, TX_8X8, 1, 10, 2),
make_tuple(&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_1_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_1_add_sse2>, TX_8X8, 1, 12, 2),
make_tuple(&vpx_highbd_fdct4x4_c,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_sse2>, TX_4X4, 16, 8, 2),
@@ -574,119 +741,219 @@ const PartialInvTxfmParam sse2_partial_idct_tests[] = {
make_tuple(
&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_16_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_sse2>, TX_4X4, 16, 12, 2),
make_tuple(&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_1_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_1_add_sse2>, TX_4X4, 1, 8, 2),
make_tuple(&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_1_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_1_add_sse2>, TX_4X4, 1, 10, 2),
make_tuple(&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_1_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_1_add_sse2>, TX_4X4, 1, 12, 2),
#endif // CONFIG_VP9_HIGHBITDEPTH
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_sse2>, TX_32X32, 1024, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_sse2>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_135_add_c>,
&wrapper<vpx_idct32x32_135_add_sse2>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_34_add_c>,
&wrapper<vpx_idct32x32_34_add_sse2>, TX_32X32, 34, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1_add_c>,
&wrapper<vpx_idct32x32_1_add_sse2>, TX_32X32, 1, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_256_add_sse2>, TX_16X16, 256, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_38_add_c>,
&wrapper<vpx_idct16x16_38_add_sse2>, TX_16X16, 38, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_10_add_c>,
&wrapper<vpx_idct16x16_10_add_sse2>, TX_16X16, 10, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_1_add_c>,
&wrapper<vpx_idct16x16_1_add_sse2>, TX_16X16, 1, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
&wrapper<vpx_idct8x8_64_add_sse2>, TX_8X8, 64, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_12_add_c>,
&wrapper<vpx_idct8x8_12_add_sse2>, TX_8X8, 12, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_1_add_c>,
&wrapper<vpx_idct8x8_1_add_sse2>, TX_8X8, 1, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
&wrapper<vpx_idct4x4_16_add_sse2>, TX_4X4, 16, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_1_add_c>,
&wrapper<vpx_idct4x4_1_add_sse2>, TX_4X4, 1, 8, 1)
};
INSTANTIATE_TEST_CASE_P(SSE2, PartialIDctTest,
::testing::ValuesIn(sse2_partial_idct_tests));
#endif // HAVE_SSE2 && !CONFIG_EMULATE_HARDWARE
#endif // HAVE_SSE2
#if HAVE_SSSE3 && ARCH_X86_64 && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSSE3
const PartialInvTxfmParam ssse3_partial_idct_tests[] = {
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_ssse3>, TX_32X32, 1024, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_135_add_c>,
&wrapper<vpx_idct32x32_135_add_ssse3>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_34_add_c>,
&wrapper<vpx_idct32x32_34_add_ssse3>, TX_32X32, 34, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
&wrapper<vpx_idct8x8_64_add_ssse3>, TX_8X8, 64, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_12_add_c>,
&wrapper<vpx_idct8x8_12_add_ssse3>, TX_8X8, 12, 8, 1)
};
INSTANTIATE_TEST_CASE_P(SSSE3, PartialIDctTest,
::testing::ValuesIn(ssse3_partial_idct_tests));
#endif // HAVE_SSSE3 && ARCH_X86_64 && !CONFIG_EMULATE_HARDWARE
#endif // HAVE_SSSE3
#if HAVE_DSPR2 && !CONFIG_EMULATE_HARDWARE && !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_SSE4_1 && CONFIG_VP9_HIGHBITDEPTH
const PartialInvTxfmParam sse4_1_partial_idct_tests[] = {
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse4_1>, TX_32X32,
1024, 8, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse4_1>, TX_32X32,
1024, 10, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_1024_add_sse4_1>, TX_32X32,
1024, 12, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse4_1>, TX_32X32,
135, 8, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse4_1>, TX_32X32,
135, 10, 2),
make_tuple(&vpx_highbd_fdct32x32_c,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_135_add_sse4_1>, TX_32X32,
135, 12, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse4_1>, TX_32X32, 34, 8, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse4_1>, TX_32X32, 34, 10, 2),
make_tuple(
&vpx_highbd_fdct32x32_c, &highbd_wrapper<vpx_highbd_idct32x32_34_add_c>,
&highbd_wrapper<vpx_highbd_idct32x32_34_add_sse4_1>, TX_32X32, 34, 12, 2),
make_tuple(&vpx_highbd_fdct16x16_c,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_sse4_1>, TX_16X16,
256, 8, 2),
make_tuple(&vpx_highbd_fdct16x16_c,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_sse4_1>, TX_16X16,
256, 10, 2),
make_tuple(&vpx_highbd_fdct16x16_c,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_256_add_sse4_1>, TX_16X16,
256, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse4_1>, TX_16X16, 38, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse4_1>, TX_16X16, 38, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_38_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_38_add_sse4_1>, TX_16X16, 38, 12, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse4_1>, TX_16X16, 10, 8, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse4_1>, TX_16X16, 10, 10, 2),
make_tuple(
&vpx_highbd_fdct16x16_c, &highbd_wrapper<vpx_highbd_idct16x16_10_add_c>,
&highbd_wrapper<vpx_highbd_idct16x16_10_add_sse4_1>, TX_16X16, 10, 12, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_sse4_1>, TX_8X8, 64, 8, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_sse4_1>, TX_8X8, 64, 10, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_64_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_64_add_sse4_1>, TX_8X8, 64, 12, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse4_1>, TX_8X8, 12, 8, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse4_1>, TX_8X8, 12, 10, 2),
make_tuple(
&vpx_highbd_fdct8x8_c, &highbd_wrapper<vpx_highbd_idct8x8_12_add_c>,
&highbd_wrapper<vpx_highbd_idct8x8_12_add_sse4_1>, TX_8X8, 12, 12, 2),
make_tuple(
&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_16_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_sse4_1>, TX_4X4, 16, 8, 2),
make_tuple(
&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_16_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_sse4_1>, TX_4X4, 16, 10, 2),
make_tuple(
&vpx_highbd_fdct4x4_c, &highbd_wrapper<vpx_highbd_idct4x4_16_add_c>,
&highbd_wrapper<vpx_highbd_idct4x4_16_add_sse4_1>, TX_4X4, 16, 12, 2)
};
INSTANTIATE_TEST_CASE_P(SSE4_1, PartialIDctTest,
::testing::ValuesIn(sse4_1_partial_idct_tests));
#endif // HAVE_SSE4_1 && CONFIG_VP9_HIGHBITDEPTH
#if HAVE_DSPR2 && !CONFIG_VP9_HIGHBITDEPTH
const PartialInvTxfmParam dspr2_partial_idct_tests[] = {
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_dspr2>, TX_32X32, 1024, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_dspr2>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_34_add_c>,
&wrapper<vpx_idct32x32_34_add_dspr2>, TX_32X32, 34, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1_add_c>,
&wrapper<vpx_idct32x32_1_add_dspr2>, TX_32X32, 1, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_256_add_dspr2>, TX_16X16, 256, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_10_add_c>,
&wrapper<vpx_idct16x16_10_add_dspr2>, TX_16X16, 10, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_1_add_c>,
&wrapper<vpx_idct16x16_1_add_dspr2>, TX_16X16, 1, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
&wrapper<vpx_idct8x8_64_add_dspr2>, TX_8X8, 64, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_12_add_c>,
&wrapper<vpx_idct8x8_12_add_dspr2>, TX_8X8, 12, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_1_add_c>,
&wrapper<vpx_idct8x8_1_add_dspr2>, TX_8X8, 1, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
&wrapper<vpx_idct4x4_16_add_dspr2>, TX_4X4, 16, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_1_add_c>,
&wrapper<vpx_idct4x4_1_add_dspr2>, TX_4X4, 1, 8, 1)
};
INSTANTIATE_TEST_CASE_P(DSPR2, PartialIDctTest,
::testing::ValuesIn(dspr2_partial_idct_tests));
#endif // HAVE_DSPR2 && !CONFIG_EMULATE_HARDWARE && !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_DSPR2 && !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_MSA && !CONFIG_EMULATE_HARDWARE && !CONFIG_VP9_HIGHBITDEPTH
#if HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH
// 32x32_135_ is implemented using the 1024 version.
const PartialInvTxfmParam msa_partial_idct_tests[] = {
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_msa>, TX_32X32, 1024, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
&wrapper<vpx_idct32x32_1024_add_msa>, TX_32X32, 135, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_34_add_c>,
&wrapper<vpx_idct32x32_34_add_msa>, TX_32X32, 34, 8, 1),
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1024_add_c>,
make_tuple(&vpx_fdct32x32_c, &wrapper<vpx_idct32x32_1_add_c>,
&wrapper<vpx_idct32x32_1_add_msa>, TX_32X32, 1, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
&wrapper<vpx_idct16x16_256_add_msa>, TX_16X16, 256, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_10_add_c>,
&wrapper<vpx_idct16x16_10_add_msa>, TX_16X16, 10, 8, 1),
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_256_add_c>,
make_tuple(&vpx_fdct16x16_c, &wrapper<vpx_idct16x16_1_add_c>,
&wrapper<vpx_idct16x16_1_add_msa>, TX_16X16, 1, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
&wrapper<vpx_idct8x8_64_add_msa>, TX_8X8, 64, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_12_add_c>,
&wrapper<vpx_idct8x8_12_add_msa>, TX_8X8, 12, 8, 1),
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_64_add_c>,
make_tuple(&vpx_fdct8x8_c, &wrapper<vpx_idct8x8_1_add_c>,
&wrapper<vpx_idct8x8_1_add_msa>, TX_8X8, 1, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
&wrapper<vpx_idct4x4_16_add_msa>, TX_4X4, 16, 8, 1),
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_16_add_c>,
make_tuple(&vpx_fdct4x4_c, &wrapper<vpx_idct4x4_1_add_c>,
&wrapper<vpx_idct4x4_1_add_msa>, TX_4X4, 1, 8, 1)
};
INSTANTIATE_TEST_CASE_P(MSA, PartialIDctTest,
::testing::ValuesIn(msa_partial_idct_tests));
#endif // HAVE_MSA && !CONFIG_EMULATE_HARDWARE && !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH
#endif // !CONFIG_EMULATE_HARDWARE
} // namespace

View File

@@ -11,6 +11,7 @@
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
@@ -18,6 +19,7 @@
#include "vpx_mem/vpx_mem.h"
using libvpx_test::ACMRandom;
using libvpx_test::Buffer;
typedef void (*VpxPostProcDownAndAcrossMbRowFunc)(
unsigned char *src_ptr, unsigned char *dst_ptr, int src_pixels_per_line,
@@ -54,31 +56,16 @@ TEST_P(VpxPostProcDownAndAcrossMbRowTest, CheckFilterOutput) {
const int block_height = 16;
// 5-tap filter needs 2 padding rows above and below the block in the input.
const int input_width = block_width;
const int input_height = block_height + 4;
const int input_stride = input_width;
const int input_size = input_width * input_height;
Buffer<uint8_t> src_image = Buffer<uint8_t>(block_width, block_height, 2);
ASSERT_TRUE(src_image.Init());
// Filter extends output block by 8 samples at left and right edges.
const int output_width = block_width + 16;
const int output_height = block_height;
const int output_stride = output_width;
const int output_size = output_width * output_height;
uint8_t *const src_image = new uint8_t[input_size];
ASSERT_TRUE(src_image != NULL);
// Though the left padding is only 8 bytes, the assembly code tries to
// read 16 bytes before the pointer.
uint8_t *const dst_image = new uint8_t[output_size + 8];
ASSERT_TRUE(dst_image != NULL);
Buffer<uint8_t> dst_image =
Buffer<uint8_t>(block_width, block_height, 8, 16, 8, 8);
ASSERT_TRUE(dst_image.Init());
// Pointers to top-left pixel of block in the input and output images.
uint8_t *const src_image_ptr = src_image + (input_stride << 1);
// The assembly works in increments of 16. The first read may be offset by
// this amount.
uint8_t *const dst_image_ptr = dst_image + 16;
uint8_t *const flimits =
reinterpret_cast<uint8_t *>(vpx_memalign(16, block_width));
(void)memset(flimits, 255, block_width);
@@ -86,37 +73,29 @@ TEST_P(VpxPostProcDownAndAcrossMbRowTest, CheckFilterOutput) {
// Initialize pixels in the input:
// block pixels to value 1,
// border pixels to value 10.
(void)memset(src_image, 10, input_size);
uint8_t *pixel_ptr = src_image_ptr;
for (int i = 0; i < block_height; ++i) {
for (int j = 0; j < block_width; ++j) {
pixel_ptr[j] = 1;
}
pixel_ptr += input_stride;
}
src_image.SetPadding(10);
src_image.Set(1);
// Initialize pixels in the output to 99.
(void)memset(dst_image, 99, output_size);
dst_image.Set(99);
ASM_REGISTER_STATE_CHECK(GetParam()(src_image_ptr, dst_image_ptr,
input_stride, output_stride, block_width,
flimits, 16));
ASM_REGISTER_STATE_CHECK(GetParam()(
src_image.TopLeftPixel(), dst_image.TopLeftPixel(), src_image.stride(),
dst_image.stride(), block_width, flimits, 16));
static const uint8_t kExpectedOutput[block_height] = {
4, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4
};
pixel_ptr = dst_image_ptr;
uint8_t *pixel_ptr = dst_image.TopLeftPixel();
for (int i = 0; i < block_height; ++i) {
for (int j = 0; j < block_width; ++j) {
ASSERT_EQ(kExpectedOutput[i], pixel_ptr[j]) << "at (" << i << ", " << j
<< ")";
ASSERT_EQ(kExpectedOutput[i], pixel_ptr[j])
<< "at (" << i << ", " << j << ")";
}
pixel_ptr += output_stride;
pixel_ptr += dst_image.stride();
}
delete[] src_image;
delete[] dst_image;
vpx_free(flimits);
};
@@ -129,35 +108,20 @@ TEST_P(VpxPostProcDownAndAcrossMbRowTest, CheckCvsAssembly) {
// 5-tap filter needs 2 padding rows above and below the block in the input.
// SSE2 reads in blocks of 16. Pad an extra 8 in case the width is not %16.
const int input_width = block_width;
const int input_height = block_height + 4 + 8;
const int input_stride = input_width;
const int input_size = input_stride * input_height;
Buffer<uint8_t> src_image =
Buffer<uint8_t>(block_width, block_height, 2, 2, 10, 2);
ASSERT_TRUE(src_image.Init());
// Filter extends output block by 8 samples at left and right edges.
// Though the left padding is only 8 bytes, there is 'above' padding as well
// so when the assembly code tries to read 16 bytes before the pointer it is
// not a problem.
// SSE2 reads in blocks of 16. Pad an extra 8 in case the width is not %16.
const int output_width = block_width + 24;
const int output_height = block_height;
const int output_stride = output_width;
const int output_size = output_stride * output_height;
uint8_t *const src_image = new uint8_t[input_size];
ASSERT_TRUE(src_image != NULL);
// Though the left padding is only 8 bytes, the assembly code tries to
// read 16 bytes before the pointer.
uint8_t *const dst_image = new uint8_t[output_size + 8];
ASSERT_TRUE(dst_image != NULL);
uint8_t *const dst_image_ref = new uint8_t[output_size + 8];
ASSERT_TRUE(dst_image_ref != NULL);
// Pointers to top-left pixel of block in the input and output images.
uint8_t *const src_image_ptr = src_image + (input_stride << 1);
// The assembly works in increments of 16. The first read may be offset by
// this amount.
uint8_t *const dst_image_ptr = dst_image + 16;
uint8_t *const dst_image_ref_ptr = dst_image + 16;
Buffer<uint8_t> dst_image =
Buffer<uint8_t>(block_width, block_height, 8, 8, 16, 8);
ASSERT_TRUE(dst_image.Init());
Buffer<uint8_t> dst_image_ref = Buffer<uint8_t>(block_width, block_height, 8);
ASSERT_TRUE(dst_image_ref.Init());
// Filter values are set in blocks of 16 for Y and 8 for U/V. Each macroblock
// can have a different filter. SSE2 assembly reads flimits in blocks of 16 so
@@ -171,14 +135,8 @@ TEST_P(VpxPostProcDownAndAcrossMbRowTest, CheckCvsAssembly) {
// Initialize pixels in the input:
// block pixels to random values.
// border pixels to value 10.
(void)memset(src_image, 10, input_size);
uint8_t *pixel_ptr = src_image_ptr;
for (int i = 0; i < block_height; ++i) {
for (int j = 0; j < block_width; ++j) {
pixel_ptr[j] = rnd.Rand8();
}
pixel_ptr += input_stride;
}
src_image.SetPadding(10);
src_image.Set(&rnd, &ACMRandom::Rand8);
for (int blocks = 0; blocks < block_width; blocks += 8) {
(void)memset(flimits, 0, sizeof(*flimits) * flimits_width);
@@ -186,29 +144,22 @@ TEST_P(VpxPostProcDownAndAcrossMbRowTest, CheckCvsAssembly) {
for (int f = 0; f < 255; f++) {
(void)memset(flimits + blocks, f, sizeof(*flimits) * 8);
(void)memset(dst_image, 0, output_size);
(void)memset(dst_image_ref, 0, output_size);
dst_image.Set(0);
dst_image_ref.Set(0);
vpx_post_proc_down_and_across_mb_row_c(
src_image_ptr, dst_image_ref_ptr, input_stride, output_stride,
block_width, flimits, block_height);
ASM_REGISTER_STATE_CHECK(GetParam()(src_image_ptr, dst_image_ptr,
input_stride, output_stride,
block_width, flimits, 16));
src_image.TopLeftPixel(), dst_image_ref.TopLeftPixel(),
src_image.stride(), dst_image_ref.stride(), block_width, flimits,
block_height);
ASM_REGISTER_STATE_CHECK(
GetParam()(src_image.TopLeftPixel(), dst_image.TopLeftPixel(),
src_image.stride(), dst_image.stride(), block_width,
flimits, block_height));
for (int i = 0; i < block_height; ++i) {
for (int j = 0; j < block_width; ++j) {
ASSERT_EQ(dst_image_ref_ptr[j + i * output_stride],
dst_image_ptr[j + i * output_stride])
<< "at (" << i << ", " << j << ")";
}
}
ASSERT_TRUE(dst_image.CheckValues(dst_image_ref));
}
}
delete[] src_image;
delete[] dst_image;
delete[] dst_image_ref;
vpx_free(flimits);
}
@@ -231,8 +182,8 @@ class VpxMbPostProcAcrossIpTest
int rows, int cols, int src_pitch) {
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
ASSERT_EQ(expected_output[c], src_c[c]) << "at (" << r << ", " << c
<< ")";
ASSERT_EQ(expected_output[c], src_c[c])
<< "at (" << r << ", " << c << ")";
}
src_c += src_pitch;
}
@@ -249,108 +200,83 @@ class VpxMbPostProcAcrossIpTest
TEST_P(VpxMbPostProcAcrossIpTest, CheckLowFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_left_padding = 8;
const int src_right_padding = 17;
const int src_width = cols + src_left_padding + src_right_padding;
const int src_size = rows * src_width;
unsigned char *const src = new unsigned char[src_size];
ASSERT_TRUE(src != NULL);
memset(src, 10, src_size);
unsigned char *const s = src + src_left_padding;
SetCols(s, rows, cols, src_width);
Buffer<uint8_t> src = Buffer<uint8_t>(cols, rows, 8, 8, 17, 8);
ASSERT_TRUE(src.Init());
src.SetPadding(10);
SetCols(src.TopLeftPixel(), rows, cols, src.stride());
unsigned char *expected_output = new unsigned char[rows * cols];
ASSERT_TRUE(expected_output != NULL);
SetCols(expected_output, rows, cols, cols);
Buffer<uint8_t> expected_output = Buffer<uint8_t>(cols, rows, 0);
ASSERT_TRUE(expected_output.Init());
SetCols(expected_output.TopLeftPixel(), rows, cols, expected_output.stride());
RunFilterLevel(s, rows, cols, src_width, q2mbl(0), expected_output);
delete[] src;
delete[] expected_output;
RunFilterLevel(src.TopLeftPixel(), rows, cols, src.stride(), q2mbl(0),
expected_output.TopLeftPixel());
}
TEST_P(VpxMbPostProcAcrossIpTest, CheckMediumFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_left_padding = 8;
const int src_right_padding = 17;
const int src_width = cols + src_left_padding + src_right_padding;
const int src_size = rows * src_width;
unsigned char *const src = new unsigned char[src_size];
ASSERT_TRUE(src != NULL);
memset(src, 10, src_size);
unsigned char *const s = src + src_left_padding;
Buffer<uint8_t> src = Buffer<uint8_t>(cols, rows, 8, 8, 17, 8);
ASSERT_TRUE(src.Init());
src.SetPadding(10);
SetCols(src.TopLeftPixel(), rows, cols, src.stride());
SetCols(s, rows, cols, src_width);
static const unsigned char kExpectedOutput[cols] = {
2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 13
};
RunFilterLevel(s, rows, cols, src_width, q2mbl(70), kExpectedOutput);
delete[] src;
RunFilterLevel(src.TopLeftPixel(), rows, cols, src.stride(), q2mbl(70),
kExpectedOutput);
}
TEST_P(VpxMbPostProcAcrossIpTest, CheckHighFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_left_padding = 8;
const int src_right_padding = 17;
const int src_width = cols + src_left_padding + src_right_padding;
const int src_size = rows * src_width;
unsigned char *const src = new unsigned char[src_size];
ASSERT_TRUE(src != NULL);
unsigned char *const s = src + src_left_padding;
Buffer<uint8_t> src = Buffer<uint8_t>(cols, rows, 8, 8, 17, 8);
ASSERT_TRUE(src.Init());
src.SetPadding(10);
SetCols(src.TopLeftPixel(), rows, cols, src.stride());
memset(src, 10, src_size);
SetCols(s, rows, cols, src_width);
static const unsigned char kExpectedOutput[cols] = {
2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 10, 11, 11, 12, 13, 13
};
RunFilterLevel(s, rows, cols, src_width, INT_MAX, kExpectedOutput);
RunFilterLevel(src.TopLeftPixel(), rows, cols, src.stride(), INT_MAX,
kExpectedOutput);
memset(src, 10, src_size);
SetCols(s, rows, cols, src_width);
RunFilterLevel(s, rows, cols, src_width, q2mbl(100), kExpectedOutput);
SetCols(src.TopLeftPixel(), rows, cols, src.stride());
delete[] src;
RunFilterLevel(src.TopLeftPixel(), rows, cols, src.stride(), q2mbl(100),
kExpectedOutput);
}
TEST_P(VpxMbPostProcAcrossIpTest, CheckCvsAssembly) {
const int rows = 16;
const int cols = 16;
const int src_left_padding = 8;
const int src_right_padding = 17;
const int src_width = cols + src_left_padding + src_right_padding;
const int src_size = rows * src_width;
unsigned char *const c_mem = new unsigned char[src_size];
unsigned char *const asm_mem = new unsigned char[src_size];
ASSERT_TRUE(c_mem != NULL);
ASSERT_TRUE(asm_mem != NULL);
unsigned char *const src_c = c_mem + src_left_padding;
unsigned char *const src_asm = asm_mem + src_left_padding;
Buffer<uint8_t> c_mem = Buffer<uint8_t>(cols, rows, 8, 8, 17, 8);
ASSERT_TRUE(c_mem.Init());
Buffer<uint8_t> asm_mem = Buffer<uint8_t>(cols, rows, 8, 8, 17, 8);
ASSERT_TRUE(asm_mem.Init());
// When level >= 100, the filter behaves the same as the level = INT_MAX
// When level < 20, it behaves the same as the level = 0
for (int level = 0; level < 100; level++) {
memset(c_mem, 10, src_size);
memset(asm_mem, 10, src_size);
SetCols(src_c, rows, cols, src_width);
SetCols(src_asm, rows, cols, src_width);
c_mem.SetPadding(10);
asm_mem.SetPadding(10);
SetCols(c_mem.TopLeftPixel(), rows, cols, c_mem.stride());
SetCols(asm_mem.TopLeftPixel(), rows, cols, asm_mem.stride());
vpx_mbpost_proc_across_ip_c(src_c, src_width, rows, cols, q2mbl(level));
ASM_REGISTER_STATE_CHECK(
GetParam()(src_asm, src_width, rows, cols, q2mbl(level)));
vpx_mbpost_proc_across_ip_c(c_mem.TopLeftPixel(), c_mem.stride(), rows,
cols, q2mbl(level));
ASM_REGISTER_STATE_CHECK(GetParam()(
asm_mem.TopLeftPixel(), asm_mem.stride(), rows, cols, q2mbl(level)));
RunComparison(src_c, src_asm, rows, cols, src_width);
ASSERT_TRUE(asm_mem.CheckValues(c_mem));
}
delete[] c_mem;
delete[] asm_mem;
}
class VpxMbPostProcDownTest
@@ -359,44 +285,10 @@ class VpxMbPostProcDownTest
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
void SetRows(unsigned char *src_c, int rows, int cols) {
void SetRows(unsigned char *src_c, int rows, int cols, int src_width) {
for (int r = 0; r < rows; r++) {
memset(src_c, r, cols);
src_c += cols;
}
}
void SetRandom(unsigned char *src_c, unsigned char *src_asm, int rows,
int cols, int src_pitch) {
ACMRandom rnd;
rnd.Reset(ACMRandom::DeterministicSeed());
// Add some random noise to the input
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
const int noise = rnd(4);
src_c[c] = r + noise;
src_asm[c] = r + noise;
}
src_c += src_pitch;
src_asm += src_pitch;
}
}
void SetRandomSaturation(unsigned char *src_c, unsigned char *src_asm,
int rows, int cols, int src_pitch) {
ACMRandom rnd;
rnd.Reset(ACMRandom::DeterministicSeed());
// Add some random noise to the input
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
const int noise = 3 * rnd(2);
src_c[c] = r + noise;
src_asm[c] = r + noise;
}
src_c += src_pitch;
src_asm += src_pitch;
src_c += src_width;
}
}
@@ -404,24 +296,13 @@ class VpxMbPostProcDownTest
int rows, int cols, int src_pitch) {
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
ASSERT_EQ(expected_output[r * rows + c], src_c[c]) << "at (" << r
<< ", " << c << ")";
ASSERT_EQ(expected_output[r * rows + c], src_c[c])
<< "at (" << r << ", " << c << ")";
}
src_c += src_pitch;
}
}
void RunComparison(unsigned char *src_c, unsigned char *src_asm, int rows,
int cols, int src_pitch) {
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
ASSERT_EQ(src_c[c], src_asm[c]) << "at (" << r << ", " << c << ")";
}
src_c += src_pitch;
src_asm += src_pitch;
}
}
void RunFilterLevel(unsigned char *s, int rows, int cols, int src_width,
int filter_level, const unsigned char *expected_output) {
ASM_REGISTER_STATE_CHECK(
@@ -433,17 +314,12 @@ class VpxMbPostProcDownTest
TEST_P(VpxMbPostProcDownTest, CheckHighFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_pitch = cols;
const int src_top_padding = 8;
const int src_bottom_padding = 17;
const int src_size = cols * (rows + src_top_padding + src_bottom_padding);
unsigned char *const c_mem = new unsigned char[src_size];
ASSERT_TRUE(c_mem != NULL);
memset(c_mem, 10, src_size);
unsigned char *const src_c = c_mem + src_top_padding * src_pitch;
Buffer<uint8_t> src_c = Buffer<uint8_t>(cols, rows, 8, 8, 8, 17);
ASSERT_TRUE(src_c.Init());
src_c.SetPadding(10);
SetRows(src_c, rows, cols);
SetRows(src_c.TopLeftPixel(), rows, cols, src_c.stride());
static const unsigned char kExpectedOutput[rows * cols] = {
2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2,
@@ -462,29 +338,24 @@ TEST_P(VpxMbPostProcDownTest, CheckHighFilterOutput) {
13, 13, 13, 13, 14, 13, 13, 13, 13
};
RunFilterLevel(src_c, rows, cols, src_pitch, INT_MAX, kExpectedOutput);
RunFilterLevel(src_c.TopLeftPixel(), rows, cols, src_c.stride(), INT_MAX,
kExpectedOutput);
memset(c_mem, 10, src_size);
SetRows(src_c, rows, cols);
RunFilterLevel(src_c, rows, cols, src_pitch, q2mbl(100), kExpectedOutput);
delete[] c_mem;
src_c.SetPadding(10);
SetRows(src_c.TopLeftPixel(), rows, cols, src_c.stride());
RunFilterLevel(src_c.TopLeftPixel(), rows, cols, src_c.stride(), q2mbl(100),
kExpectedOutput);
}
TEST_P(VpxMbPostProcDownTest, CheckMediumFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_pitch = cols;
const int src_top_padding = 8;
const int src_bottom_padding = 17;
const int src_size = cols * (rows + src_top_padding + src_bottom_padding);
unsigned char *const c_mem = new unsigned char[src_size];
ASSERT_TRUE(c_mem != NULL);
memset(c_mem, 10, src_size);
unsigned char *const src_c = c_mem + src_top_padding * src_pitch;
Buffer<uint8_t> src_c = Buffer<uint8_t>(cols, rows, 8, 8, 8, 17);
ASSERT_TRUE(src_c.Init());
src_c.SetPadding(10);
SetRows(src_c, rows, cols);
SetRows(src_c.TopLeftPixel(), rows, cols, src_c.stride());
static const unsigned char kExpectedOutput[rows * cols] = {
2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2,
@@ -503,70 +374,65 @@ TEST_P(VpxMbPostProcDownTest, CheckMediumFilterOutput) {
13, 13, 13, 13, 14, 13, 13, 13, 13
};
RunFilterLevel(src_c, rows, cols, src_pitch, q2mbl(70), kExpectedOutput);
delete[] c_mem;
RunFilterLevel(src_c.TopLeftPixel(), rows, cols, src_c.stride(), q2mbl(70),
kExpectedOutput);
}
TEST_P(VpxMbPostProcDownTest, CheckLowFilterOutput) {
const int rows = 16;
const int cols = 16;
const int src_pitch = cols;
const int src_top_padding = 8;
const int src_bottom_padding = 17;
const int src_size = cols * (rows + src_top_padding + src_bottom_padding);
unsigned char *const c_mem = new unsigned char[src_size];
ASSERT_TRUE(c_mem != NULL);
memset(c_mem, 10, src_size);
unsigned char *const src_c = c_mem + src_top_padding * src_pitch;
Buffer<uint8_t> src_c = Buffer<uint8_t>(cols, rows, 8, 8, 8, 17);
ASSERT_TRUE(src_c.Init());
src_c.SetPadding(10);
SetRows(src_c, rows, cols);
SetRows(src_c.TopLeftPixel(), rows, cols, src_c.stride());
unsigned char *expected_output = new unsigned char[rows * cols];
ASSERT_TRUE(expected_output != NULL);
SetRows(expected_output, rows, cols);
SetRows(expected_output, rows, cols, cols);
RunFilterLevel(src_c, rows, cols, src_pitch, q2mbl(0), expected_output);
RunFilterLevel(src_c.TopLeftPixel(), rows, cols, src_c.stride(), q2mbl(0),
expected_output);
delete[] c_mem;
delete[] expected_output;
}
TEST_P(VpxMbPostProcDownTest, CheckCvsAssembly) {
const int rows = 16;
const int cols = 16;
const int src_pitch = cols;
const int src_top_padding = 8;
const int src_bottom_padding = 17;
const int src_size = cols * (rows + src_top_padding + src_bottom_padding);
unsigned char *const c_mem = new unsigned char[src_size];
unsigned char *const asm_mem = new unsigned char[src_size];
ASSERT_TRUE(c_mem != NULL);
ASSERT_TRUE(asm_mem != NULL);
unsigned char *const src_c = c_mem + src_top_padding * src_pitch;
unsigned char *const src_asm = asm_mem + src_top_padding * src_pitch;
ACMRandom rnd;
rnd.Reset(ACMRandom::DeterministicSeed());
Buffer<uint8_t> src_c = Buffer<uint8_t>(cols, rows, 8, 8, 8, 17);
ASSERT_TRUE(src_c.Init());
Buffer<uint8_t> src_asm = Buffer<uint8_t>(cols, rows, 8, 8, 8, 17);
ASSERT_TRUE(src_asm.Init());
for (int level = 0; level < 100; level++) {
memset(c_mem, 10, src_size);
memset(asm_mem, 10, src_size);
SetRandom(src_c, src_asm, rows, cols, src_pitch);
vpx_mbpost_proc_down_c(src_c, src_pitch, rows, cols, q2mbl(level));
ASM_REGISTER_STATE_CHECK(
GetParam()(src_asm, src_pitch, rows, cols, q2mbl(level)));
RunComparison(src_c, src_asm, rows, cols, src_pitch);
src_c.SetPadding(10);
src_asm.SetPadding(10);
src_c.Set(&rnd, &ACMRandom::Rand8);
src_asm.CopyFrom(src_c);
memset(c_mem, 10, src_size);
memset(asm_mem, 10, src_size);
SetRandomSaturation(src_c, src_asm, rows, cols, src_pitch);
vpx_mbpost_proc_down_c(src_c, src_pitch, rows, cols, q2mbl(level));
ASM_REGISTER_STATE_CHECK(
GetParam()(src_asm, src_pitch, rows, cols, q2mbl(level)));
RunComparison(src_c, src_asm, rows, cols, src_pitch);
vpx_mbpost_proc_down_c(src_c.TopLeftPixel(), src_c.stride(), rows, cols,
q2mbl(level));
ASM_REGISTER_STATE_CHECK(GetParam()(
src_asm.TopLeftPixel(), src_asm.stride(), rows, cols, q2mbl(level)));
ASSERT_TRUE(src_asm.CheckValues(src_c));
src_c.SetPadding(10);
src_asm.SetPadding(10);
src_c.Set(&rnd, &ACMRandom::Rand8Extremes);
src_asm.CopyFrom(src_c);
vpx_mbpost_proc_down_c(src_c.TopLeftPixel(), src_c.stride(), rows, cols,
q2mbl(level));
ASM_REGISTER_STATE_CHECK(GetParam()(
src_asm.TopLeftPixel(), src_asm.stride(), rows, cols, q2mbl(level)));
ASSERT_TRUE(src_asm.CheckValues(src_c));
}
delete[] c_mem;
delete[] asm_mem;
}
INSTANTIATE_TEST_CASE_P(
@@ -598,6 +464,9 @@ INSTANTIATE_TEST_CASE_P(
INSTANTIATE_TEST_CASE_P(NEON, VpxMbPostProcAcrossIpTest,
::testing::Values(vpx_mbpost_proc_across_ip_neon));
INSTANTIATE_TEST_CASE_P(NEON, VpxMbPostProcDownTest,
::testing::Values(vpx_mbpost_proc_down_neon));
#endif // HAVE_NEON
#if HAVE_MSA

View File

@@ -324,6 +324,15 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(4, 4, &vp8_sixtap_predict4x4_msa)));
#endif
#if HAVE_MMI
INSTANTIATE_TEST_CASE_P(
MMI, SixtapPredictTest,
::testing::Values(make_tuple(16, 16, &vp8_sixtap_predict16x16_mmi),
make_tuple(8, 8, &vp8_sixtap_predict8x8_mmi),
make_tuple(8, 4, &vp8_sixtap_predict8x4_mmi),
make_tuple(4, 4, &vp8_sixtap_predict4x4_mmi)));
#endif
class BilinearPredictTest : public PredictTestBase {};
TEST_P(BilinearPredictTest, TestWithRandomData) {

View File

@@ -200,4 +200,12 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp8_fast_quantize_b_msa, &vp8_fast_quantize_b_c),
make_tuple(&vp8_regular_quantize_b_msa, &vp8_regular_quantize_b_c)));
#endif // HAVE_MSA
#if HAVE_MMI
INSTANTIATE_TEST_CASE_P(
MMI, QuantizeTest,
::testing::Values(
make_tuple(&vp8_fast_quantize_b_mmi, &vp8_fast_quantize_b_c),
make_tuple(&vp8_regular_quantize_b_mmi, &vp8_regular_quantize_b_c)));
#endif // HAVE_MMI
} // namespace

View File

@@ -113,8 +113,8 @@ class RegisterStateCheck {
int64_t post_store[8];
vpx_push_neon(post_store);
for (int i = 0; i < 8; ++i) {
EXPECT_EQ(pre_store_[i], post_store[i]) << "d" << i + 8
<< " has been modified";
EXPECT_EQ(pre_store_[i], post_store[i])
<< "d" << i + 8 << " has been modified";
}
}

View File

@@ -298,10 +298,10 @@ TEST_P(ResizeTest, TestExternalResizeWorks) {
unsigned int expected_h;
ScaleForFrameNumber(frame, kInitialWidth, kInitialHeight, &expected_w,
&expected_h, 0);
EXPECT_EQ(expected_w, info->w) << "Frame " << frame
<< " had unexpected width";
EXPECT_EQ(expected_h, info->h) << "Frame " << frame
<< " had unexpected height";
EXPECT_EQ(expected_w, info->w)
<< "Frame " << frame << " had unexpected width";
EXPECT_EQ(expected_h, info->h)
<< "Frame " << frame << " had unexpected height";
}
}
@@ -513,10 +513,10 @@ TEST_P(ResizeRealtimeTest, TestExternalResizeWorks) {
unsigned int expected_h;
ScaleForFrameNumber(frame, kInitialWidth, kInitialHeight, &expected_w,
&expected_h, 1);
EXPECT_EQ(expected_w, info->w) << "Frame " << frame
<< " had unexpected width";
EXPECT_EQ(expected_h, info->h) << "Frame " << frame
<< " had unexpected height";
EXPECT_EQ(expected_w, info->w)
<< "Frame " << frame << " had unexpected width";
EXPECT_EQ(expected_h, info->h)
<< "Frame " << frame << " had unexpected height";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
}
}

View File

@@ -644,19 +644,50 @@ INSTANTIATE_TEST_CASE_P(C, SADx4Test, ::testing::ValuesIn(x4d_c_tests));
#if HAVE_NEON
const SadMxNParam neon_tests[] = {
SadMxNParam(64, 64, &vpx_sad64x64_neon),
SadMxNParam(64, 32, &vpx_sad64x32_neon),
SadMxNParam(32, 32, &vpx_sad32x32_neon),
SadMxNParam(16, 32, &vpx_sad16x32_neon),
SadMxNParam(16, 16, &vpx_sad16x16_neon),
SadMxNParam(16, 8, &vpx_sad16x8_neon),
SadMxNParam(8, 16, &vpx_sad8x16_neon),
SadMxNParam(8, 8, &vpx_sad8x8_neon),
SadMxNParam(8, 4, &vpx_sad8x4_neon),
SadMxNParam(4, 8, &vpx_sad4x8_neon),
SadMxNParam(4, 4, &vpx_sad4x4_neon),
};
INSTANTIATE_TEST_CASE_P(NEON, SADTest, ::testing::ValuesIn(neon_tests));
const SadMxNAvgParam avg_neon_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_neon),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_neon),
SadMxNAvgParam(32, 64, &vpx_sad32x64_avg_neon),
SadMxNAvgParam(32, 32, &vpx_sad32x32_avg_neon),
SadMxNAvgParam(32, 16, &vpx_sad32x16_avg_neon),
SadMxNAvgParam(16, 32, &vpx_sad16x32_avg_neon),
SadMxNAvgParam(16, 16, &vpx_sad16x16_avg_neon),
SadMxNAvgParam(16, 8, &vpx_sad16x8_avg_neon),
SadMxNAvgParam(8, 16, &vpx_sad8x16_avg_neon),
SadMxNAvgParam(8, 8, &vpx_sad8x8_avg_neon),
SadMxNAvgParam(8, 4, &vpx_sad8x4_avg_neon),
SadMxNAvgParam(4, 8, &vpx_sad4x8_avg_neon),
SadMxNAvgParam(4, 4, &vpx_sad4x4_avg_neon),
};
INSTANTIATE_TEST_CASE_P(NEON, SADavgTest, ::testing::ValuesIn(avg_neon_tests));
const SadMxNx4Param x4d_neon_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_neon),
SadMxNx4Param(64, 32, &vpx_sad64x32x4d_neon),
SadMxNx4Param(32, 64, &vpx_sad32x64x4d_neon),
SadMxNx4Param(32, 32, &vpx_sad32x32x4d_neon),
SadMxNx4Param(32, 16, &vpx_sad32x16x4d_neon),
SadMxNx4Param(16, 32, &vpx_sad16x32x4d_neon),
SadMxNx4Param(16, 16, &vpx_sad16x16x4d_neon),
SadMxNx4Param(16, 8, &vpx_sad16x8x4d_neon),
SadMxNx4Param(8, 16, &vpx_sad8x16x4d_neon),
SadMxNx4Param(8, 8, &vpx_sad8x8x4d_neon),
SadMxNx4Param(8, 4, &vpx_sad8x4x4d_neon),
SadMxNx4Param(4, 8, &vpx_sad4x8x4d_neon),
SadMxNx4Param(4, 4, &vpx_sad4x4x4d_neon),
};
INSTANTIATE_TEST_CASE_P(NEON, SADx4Test, ::testing::ValuesIn(x4d_neon_tests));
#endif // HAVE_NEON
@@ -865,6 +896,14 @@ const SadMxNx4Param x4d_avx2_tests[] = {
INSTANTIATE_TEST_CASE_P(AVX2, SADx4Test, ::testing::ValuesIn(x4d_avx2_tests));
#endif // HAVE_AVX2
#if HAVE_AVX512
const SadMxNx4Param x4d_avx512_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_avx512),
};
INSTANTIATE_TEST_CASE_P(AVX512, SADx4Test,
::testing::ValuesIn(x4d_avx512_tests));
#endif // HAVE_AVX512
//------------------------------------------------------------------------------
// MIPS functions
#if HAVE_MSA
@@ -920,4 +959,98 @@ const SadMxNx4Param x4d_msa_tests[] = {
INSTANTIATE_TEST_CASE_P(MSA, SADx4Test, ::testing::ValuesIn(x4d_msa_tests));
#endif // HAVE_MSA
//------------------------------------------------------------------------------
// VSX functions
#if HAVE_VSX
const SadMxNParam vsx_tests[] = {
SadMxNParam(64, 64, &vpx_sad64x64_vsx),
SadMxNParam(64, 32, &vpx_sad64x32_vsx),
SadMxNParam(32, 64, &vpx_sad32x64_vsx),
SadMxNParam(32, 32, &vpx_sad32x32_vsx),
SadMxNParam(32, 16, &vpx_sad32x16_vsx),
SadMxNParam(16, 32, &vpx_sad16x32_vsx),
SadMxNParam(16, 16, &vpx_sad16x16_vsx),
SadMxNParam(16, 8, &vpx_sad16x8_vsx),
};
INSTANTIATE_TEST_CASE_P(VSX, SADTest, ::testing::ValuesIn(vsx_tests));
const SadMxNAvgParam avg_vsx_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_vsx),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_vsx),
SadMxNAvgParam(32, 64, &vpx_sad32x64_avg_vsx),
SadMxNAvgParam(32, 32, &vpx_sad32x32_avg_vsx),
SadMxNAvgParam(32, 16, &vpx_sad32x16_avg_vsx),
SadMxNAvgParam(16, 32, &vpx_sad16x32_avg_vsx),
SadMxNAvgParam(16, 16, &vpx_sad16x16_avg_vsx),
SadMxNAvgParam(16, 8, &vpx_sad16x8_avg_vsx),
};
INSTANTIATE_TEST_CASE_P(VSX, SADavgTest, ::testing::ValuesIn(avg_vsx_tests));
const SadMxNx4Param x4d_vsx_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_vsx),
SadMxNx4Param(64, 32, &vpx_sad64x32x4d_vsx),
SadMxNx4Param(32, 64, &vpx_sad32x64x4d_vsx),
SadMxNx4Param(32, 32, &vpx_sad32x32x4d_vsx),
SadMxNx4Param(32, 16, &vpx_sad32x16x4d_vsx),
SadMxNx4Param(16, 32, &vpx_sad16x32x4d_vsx),
SadMxNx4Param(16, 16, &vpx_sad16x16x4d_vsx),
SadMxNx4Param(16, 8, &vpx_sad16x8x4d_vsx),
};
INSTANTIATE_TEST_CASE_P(VSX, SADx4Test, ::testing::ValuesIn(x4d_vsx_tests));
#endif // HAVE_VSX
//------------------------------------------------------------------------------
// Loongson functions
#if HAVE_MMI
const SadMxNParam mmi_tests[] = {
SadMxNParam(64, 64, &vpx_sad64x64_mmi),
SadMxNParam(64, 32, &vpx_sad64x32_mmi),
SadMxNParam(32, 64, &vpx_sad32x64_mmi),
SadMxNParam(32, 32, &vpx_sad32x32_mmi),
SadMxNParam(32, 16, &vpx_sad32x16_mmi),
SadMxNParam(16, 32, &vpx_sad16x32_mmi),
SadMxNParam(16, 16, &vpx_sad16x16_mmi),
SadMxNParam(16, 8, &vpx_sad16x8_mmi),
SadMxNParam(8, 16, &vpx_sad8x16_mmi),
SadMxNParam(8, 8, &vpx_sad8x8_mmi),
SadMxNParam(8, 4, &vpx_sad8x4_mmi),
SadMxNParam(4, 8, &vpx_sad4x8_mmi),
SadMxNParam(4, 4, &vpx_sad4x4_mmi),
};
INSTANTIATE_TEST_CASE_P(MMI, SADTest, ::testing::ValuesIn(mmi_tests));
const SadMxNAvgParam avg_mmi_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_mmi),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_mmi),
SadMxNAvgParam(32, 64, &vpx_sad32x64_avg_mmi),
SadMxNAvgParam(32, 32, &vpx_sad32x32_avg_mmi),
SadMxNAvgParam(32, 16, &vpx_sad32x16_avg_mmi),
SadMxNAvgParam(16, 32, &vpx_sad16x32_avg_mmi),
SadMxNAvgParam(16, 16, &vpx_sad16x16_avg_mmi),
SadMxNAvgParam(16, 8, &vpx_sad16x8_avg_mmi),
SadMxNAvgParam(8, 16, &vpx_sad8x16_avg_mmi),
SadMxNAvgParam(8, 8, &vpx_sad8x8_avg_mmi),
SadMxNAvgParam(8, 4, &vpx_sad8x4_avg_mmi),
SadMxNAvgParam(4, 8, &vpx_sad4x8_avg_mmi),
SadMxNAvgParam(4, 4, &vpx_sad4x4_avg_mmi),
};
INSTANTIATE_TEST_CASE_P(MMI, SADavgTest, ::testing::ValuesIn(avg_mmi_tests));
const SadMxNx4Param x4d_mmi_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_mmi),
SadMxNx4Param(64, 32, &vpx_sad64x32x4d_mmi),
SadMxNx4Param(32, 64, &vpx_sad32x64x4d_mmi),
SadMxNx4Param(32, 32, &vpx_sad32x32x4d_mmi),
SadMxNx4Param(32, 16, &vpx_sad32x16x4d_mmi),
SadMxNx4Param(16, 32, &vpx_sad16x32x4d_mmi),
SadMxNx4Param(16, 16, &vpx_sad16x16x4d_mmi),
SadMxNx4Param(16, 8, &vpx_sad16x8x4d_mmi),
SadMxNx4Param(8, 16, &vpx_sad8x16x4d_mmi),
SadMxNx4Param(8, 8, &vpx_sad8x8x4d_mmi),
SadMxNx4Param(8, 4, &vpx_sad8x4x4d_mmi),
SadMxNx4Param(4, 8, &vpx_sad4x8x4d_mmi),
SadMxNx4Param(4, 4, &vpx_sad4x4x4d_mmi),
};
INSTANTIATE_TEST_CASE_P(MMI, SADx4Test, ::testing::ValuesIn(x4d_mmi_tests));
#endif // HAVE_MMI
} // namespace

View File

@@ -146,14 +146,6 @@ TEST(VP8RoiMapTest, ParameterCheck) {
if (deltas_valid != roi_retval) break;
}
// Test that we report and error if cyclic refresh is enabled.
cpi.cyclic_refresh_mode_enabled = 1;
roi_retval =
vp8_set_roimap(&cpi, roi_map, cpi.common.mb_rows, cpi.common.mb_cols,
delta_q, delta_lf, threshold);
EXPECT_EQ(-1, roi_retval) << "cyclic refresh check error";
cpi.cyclic_refresh_mode_enabled = 0;
// Test invalid number of rows or colums.
roi_retval =
vp8_set_roimap(&cpi, roi_map, cpi.common.mb_rows + 1,

View File

@@ -8,8 +8,10 @@
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
## This file performs a stress test. It runs 5 encodes and 30 decodes in
## parallel.
## This file performs a stress test. It runs (STRESS_ONEPASS_MAX_JOBS,
## default=5) one, (STRESS_TWOPASS_MAX_JOBS, default=5) two pass &
## (STRESS_RT_MAX_JOBS, default=5) encodes and (STRESS_<codec>_DECODE_MAX_JOBS,
## default=30) decodes in parallel.
. $(dirname $0)/tools_common.sh
@@ -75,20 +77,33 @@ stress() {
local readonly codec="$1"
local readonly webm="$2"
local readonly decode_count="$3"
local readonly threads="$4"
local readonly enc_args="$5"
local pids=""
local rt_max_jobs=${STRESS_RT_MAX_JOBS:-5}
local onepass_max_jobs=${STRESS_ONEPASS_MAX_JOBS:-5}
local twopass_max_jobs=${STRESS_TWOPASS_MAX_JOBS:-5}
# Enable job control, so we can run multiple processes.
set -m
# Start $onepass_max_jobs encode jobs in parallel.
for i in $(seq ${onepass_max_jobs}); do
bitrate=$(($i * 20 + 300))
eval "${VPX_TEST_PREFIX}" "${encoder}" "--codec=${codec} -w 1280 -h 720" \
"${YUV}" "-t ${threads} --limit=150 --test-decode=fatal --passes=1" \
"--target-bitrate=${bitrate} -o ${VPX_TEST_OUTPUT_DIR}/${i}.1pass.webm" \
"${enc_args}" ${devnull} &
pids="${pids} $!"
done
# Start $twopass_max_jobs encode jobs in parallel.
for i in $(seq ${twopass_max_jobs}); do
bitrate=$(($i * 20 + 300))
eval "${VPX_TEST_PREFIX}" "${encoder}" "--codec=${codec} -w 1280 -h 720" \
"${YUV}" "-t 4 --limit=150 --test-decode=fatal " \
"--target-bitrate=${bitrate} -o ${VPX_TEST_OUTPUT_DIR}/${i}.webm" \
${devnull} &
"${YUV}" "-t ${threads} --limit=150 --test-decode=fatal --passes=2" \
"--target-bitrate=${bitrate} -o ${VPX_TEST_OUTPUT_DIR}/${i}.2pass.webm" \
"${enc_args}" ${devnull} &
pids="${pids} $!"
done
@@ -96,7 +111,7 @@ stress() {
for i in $(seq ${rt_max_jobs}); do
bitrate=$(($i * 20 + 300))
eval "${VPX_TEST_PREFIX}" "${encoder}" "--codec=${codec} -w 1280 -h 720" \
"${YUV}" "-t 4 --limit=150 --test-decode=fatal " \
"${YUV}" "-t ${threads} --limit=150 --test-decode=fatal " \
"--target-bitrate=${bitrate} --lag-in-frames=0 --error-resilient=1" \
"--kf-min-dist=3000 --kf-max-dist=3000 --cpu-used=-6 --static-thresh=1" \
"--end-usage=cbr --min-q=2 --max-q=56 --undershoot-pct=100" \
@@ -109,7 +124,7 @@ stress() {
# Start $decode_count decode jobs in parallel.
for i in $(seq "${decode_count}"); do
eval "${decoder}" "-t 4" "${webm}" "--noblit" ${devnull} &
eval "${decoder}" "-t ${threads}" "${webm}" "--noblit" ${devnull} &
pids="${pids} $!"
done
@@ -125,17 +140,30 @@ vp8_stress_test() {
local vp8_max_jobs=${STRESS_VP8_DECODE_MAX_JOBS:-40}
if [ "$(vp8_decode_available)" = "yes" -a \
"$(vp8_encode_available)" = "yes" ]; then
stress vp8 "${VP8}" "${vp8_max_jobs}"
stress vp8 "${VP8}" "${vp8_max_jobs}" 4
fi
}
vp9_stress_test() {
vp9_stress() {
local vp9_max_jobs=${STRESS_VP9_DECODE_MAX_JOBS:-25}
if [ "$(vp9_decode_available)" = "yes" -a \
"$(vp9_encode_available)" = "yes" ]; then
stress vp9 "${VP9}" "${vp9_max_jobs}"
stress vp9 "${VP9}" "${vp9_max_jobs}" "$@"
fi
}
run_tests stress_verify_environment "vp8_stress_test vp9_stress_test"
vp9_stress_test() {
for threads in 4 8 100; do
vp9_stress "$threads" "--row-mt=0"
done
}
vp9_stress_test_row_mt() {
for threads in 4 8 100; do
vp9_stress "$threads" "--row-mt=1"
done
}
run_tests stress_verify_environment \
"vp8_stress_test vp9_stress_test vp9_stress_test_row_mt"

View File

@@ -110,4 +110,10 @@ INSTANTIATE_TEST_CASE_P(
::testing::Values(make_tuple(&vpx_sum_squares_2d_i16_c,
&vpx_sum_squares_2d_i16_sse2)));
#endif // HAVE_SSE2
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, SumSquaresTest, ::testing::Values(make_tuple(
&vpx_sum_squares_2d_i16_c,
&vpx_sum_squares_2d_i16_msa)));
#endif // HAVE_MSA
} // namespace

View File

@@ -0,0 +1,277 @@
/*
* Copyright (c) 2016 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/register_state_check.h"
#include "vpx_ports/vpx_timer.h"
namespace {
using ::libvpx_test::ACMRandom;
using ::libvpx_test::Buffer;
typedef void (*TemporalFilterFunc)(const uint8_t *a, unsigned int stride,
const uint8_t *b, unsigned int w,
unsigned int h, int filter_strength,
int filter_weight, unsigned int *accumulator,
uint16_t *count);
// Calculate the difference between 'a' and 'b', sum in blocks of 9, and apply
// filter based on strength and weight. Store the resulting filter amount in
// 'count' and apply it to 'b' and store it in 'accumulator'.
void reference_filter(const Buffer<uint8_t> &a, const Buffer<uint8_t> &b, int w,
int h, int filter_strength, int filter_weight,
Buffer<unsigned int> *accumulator,
Buffer<uint16_t> *count) {
Buffer<int> diff_sq = Buffer<int>(w, h, 0);
ASSERT_TRUE(diff_sq.Init());
diff_sq.Set(0);
int rounding = 0;
if (filter_strength > 0) {
rounding = 1 << (filter_strength - 1);
}
// Calculate all the differences. Avoids re-calculating a bunch of extra
// values.
for (int height = 0; height < h; ++height) {
for (int width = 0; width < w; ++width) {
int diff = a.TopLeftPixel()[height * a.stride() + width] -
b.TopLeftPixel()[height * b.stride() + width];
diff_sq.TopLeftPixel()[height * diff_sq.stride() + width] = diff * diff;
}
}
// For any given point, sum the neighboring values and calculate the
// modifier.
for (int height = 0; height < h; ++height) {
for (int width = 0; width < w; ++width) {
// Determine how many values are being summed.
int summed_values = 9;
if (height == 0 || height == (h - 1)) {
summed_values -= 3;
}
if (width == 0 || width == (w - 1)) {
if (summed_values == 6) { // corner
summed_values -= 2;
} else {
summed_values -= 3;
}
}
// Sum the diff_sq of the surrounding values.
int sum = 0;
for (int idy = -1; idy <= 1; ++idy) {
for (int idx = -1; idx <= 1; ++idx) {
const int y = height + idy;
const int x = width + idx;
// If inside the border.
if (y >= 0 && y < h && x >= 0 && x < w) {
sum += diff_sq.TopLeftPixel()[y * diff_sq.stride() + x];
}
}
}
sum *= 3;
sum /= summed_values;
sum += rounding;
sum >>= filter_strength;
// Clamp the value and invert it.
if (sum > 16) sum = 16;
sum = 16 - sum;
sum *= filter_weight;
count->TopLeftPixel()[height * count->stride() + width] += sum;
accumulator->TopLeftPixel()[height * accumulator->stride() + width] +=
sum * b.TopLeftPixel()[height * b.stride() + width];
}
}
}
class TemporalFilterTest : public ::testing::TestWithParam<TemporalFilterFunc> {
public:
virtual void SetUp() {
filter_func_ = GetParam();
rnd_.Reset(ACMRandom::DeterministicSeed());
}
protected:
TemporalFilterFunc filter_func_;
ACMRandom rnd_;
};
TEST_P(TemporalFilterTest, SizeCombinations) {
// Depending on subsampling this function may be called with values of 8 or 16
// for width and height, in any combination.
Buffer<uint8_t> a = Buffer<uint8_t>(16, 16, 8);
ASSERT_TRUE(a.Init());
const int filter_weight = 2;
const int filter_strength = 6;
for (int width = 8; width <= 16; width += 8) {
for (int height = 8; height <= 16; height += 8) {
// The second buffer must not have any border.
Buffer<uint8_t> b = Buffer<uint8_t>(width, height, 0);
ASSERT_TRUE(b.Init());
Buffer<unsigned int> accum_ref = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_ref.Init());
Buffer<unsigned int> accum_chk = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_chk.Init());
Buffer<uint16_t> count_ref = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_ref.Init());
Buffer<uint16_t> count_chk = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_chk.Init());
// The difference between the buffers must be small to pass the threshold
// to apply the filter.
a.Set(&rnd_, 0, 7);
b.Set(&rnd_, 0, 7);
accum_ref.Set(rnd_.Rand8());
accum_chk.CopyFrom(accum_ref);
count_ref.Set(rnd_.Rand8());
count_chk.CopyFrom(count_ref);
reference_filter(a, b, width, height, filter_strength, filter_weight,
&accum_ref, &count_ref);
ASM_REGISTER_STATE_CHECK(
filter_func_(a.TopLeftPixel(), a.stride(), b.TopLeftPixel(), width,
height, filter_strength, filter_weight,
accum_chk.TopLeftPixel(), count_chk.TopLeftPixel()));
EXPECT_TRUE(accum_chk.CheckValues(accum_ref));
EXPECT_TRUE(count_chk.CheckValues(count_ref));
if (HasFailure()) {
printf("Width: %d Height: %d\n", width, height);
count_chk.PrintDifference(count_ref);
accum_chk.PrintDifference(accum_ref);
return;
}
}
}
}
TEST_P(TemporalFilterTest, CompareReferenceRandom) {
for (int width = 8; width <= 16; width += 8) {
for (int height = 8; height <= 16; height += 8) {
Buffer<uint8_t> a = Buffer<uint8_t>(width, height, 8);
ASSERT_TRUE(a.Init());
// The second buffer must not have any border.
Buffer<uint8_t> b = Buffer<uint8_t>(width, height, 0);
ASSERT_TRUE(b.Init());
Buffer<unsigned int> accum_ref = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_ref.Init());
Buffer<unsigned int> accum_chk = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_chk.Init());
Buffer<uint16_t> count_ref = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_ref.Init());
Buffer<uint16_t> count_chk = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_chk.Init());
for (int filter_strength = 0; filter_strength <= 6; ++filter_strength) {
for (int filter_weight = 0; filter_weight <= 2; ++filter_weight) {
for (int repeat = 0; repeat < 100; ++repeat) {
if (repeat < 50) {
a.Set(&rnd_, 0, 7);
b.Set(&rnd_, 0, 7);
} else {
// Check large (but close) values as well.
a.Set(&rnd_, std::numeric_limits<uint8_t>::max() - 7,
std::numeric_limits<uint8_t>::max());
b.Set(&rnd_, std::numeric_limits<uint8_t>::max() - 7,
std::numeric_limits<uint8_t>::max());
}
accum_ref.Set(rnd_.Rand8());
accum_chk.CopyFrom(accum_ref);
count_ref.Set(rnd_.Rand8());
count_chk.CopyFrom(count_ref);
reference_filter(a, b, width, height, filter_strength,
filter_weight, &accum_ref, &count_ref);
ASM_REGISTER_STATE_CHECK(filter_func_(
a.TopLeftPixel(), a.stride(), b.TopLeftPixel(), width, height,
filter_strength, filter_weight, accum_chk.TopLeftPixel(),
count_chk.TopLeftPixel()));
EXPECT_TRUE(accum_chk.CheckValues(accum_ref));
EXPECT_TRUE(count_chk.CheckValues(count_ref));
if (HasFailure()) {
printf("Weight: %d Strength: %d\n", filter_weight,
filter_strength);
count_chk.PrintDifference(count_ref);
accum_chk.PrintDifference(accum_ref);
return;
}
}
}
}
}
}
}
TEST_P(TemporalFilterTest, DISABLED_Speed) {
Buffer<uint8_t> a = Buffer<uint8_t>(16, 16, 8);
ASSERT_TRUE(a.Init());
const int filter_weight = 2;
const int filter_strength = 6;
for (int width = 8; width <= 16; width += 8) {
for (int height = 8; height <= 16; height += 8) {
// The second buffer must not have any border.
Buffer<uint8_t> b = Buffer<uint8_t>(width, height, 0);
ASSERT_TRUE(b.Init());
Buffer<unsigned int> accum_ref = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_ref.Init());
Buffer<unsigned int> accum_chk = Buffer<unsigned int>(width, height, 0);
ASSERT_TRUE(accum_chk.Init());
Buffer<uint16_t> count_ref = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_ref.Init());
Buffer<uint16_t> count_chk = Buffer<uint16_t>(width, height, 0);
ASSERT_TRUE(count_chk.Init());
a.Set(&rnd_, 0, 7);
b.Set(&rnd_, 0, 7);
accum_chk.Set(0);
count_chk.Set(0);
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int i = 0; i < 10000; ++i) {
filter_func_(a.TopLeftPixel(), a.stride(), b.TopLeftPixel(), width,
height, filter_strength, filter_weight,
accum_chk.TopLeftPixel(), count_chk.TopLeftPixel());
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
printf("Temporal filter %dx%d time: %5d us\n", width, height,
elapsed_time);
}
}
}
INSTANTIATE_TEST_CASE_P(C, TemporalFilterTest,
::testing::Values(&vp9_temporal_filter_apply_c));
#if HAVE_SSE4_1
INSTANTIATE_TEST_CASE_P(SSE4_1, TemporalFilterTest,
::testing::Values(&vp9_temporal_filter_apply_sse4_1));
#endif // HAVE_SSE4_1
} // namespace

View File

@@ -23,6 +23,7 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += niklas_1280_720_30.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += noisy_clip_640_360.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += rush_hour_444.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += screendata.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += niklas_640_480_30.yuv
# Test vectors
LIBVPX_TEST_DATA-$(CONFIG_VP8_DECODER) += vp80-00-comprehensive-001.ivf
@@ -731,6 +732,8 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp93-2-20-12bit-yuv444.webm.md5
endif # CONFIG_VP9_HIGHBITDEPTH
# Invalid files for testing libvpx error checking.
LIBVPX_TEST_DATA-$(CONFIG_VP8_DECODER) += invalid-bug-1443.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP8_DECODER) += invalid-bug-1443.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP8_DECODER) += invalid-vp80-00-comprehensive-018.ivf.2kf_0x6.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP8_DECODER) += invalid-vp80-00-comprehensive-018.ivf.2kf_0x6.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-01-v3.webm
@@ -771,6 +774,8 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-12-droppable_1.ivf.s367
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-12-droppable_1.ivf.s3676_r01-05_b6-.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-12-droppable_1.ivf.s73804_r01-05_b6-.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-12-droppable_1.ivf.s73804_r01-05_b6-.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-21-resize_inter_320x180_5_3-4.webm.ivf.s45551_r01-05_b6-.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-21-resize_inter_320x180_5_3-4.webm.ivf.s45551_r01-05_b6-.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp91-2-mixedrefcsp-444to420.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp91-2-mixedrefcsp-444to420.ivf.res
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += invalid-vp90-2-07-frame_parallel-1.webm
@@ -814,7 +819,6 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += kirland_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += macmarcomoving_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += macmarcostationary_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += niklas_1280_720_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += niklas_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += tacomanarrows_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += tacomasmallcameramovement_640_480_30.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += thaloundeskmtg_640_480_30.yuv
@@ -874,3 +878,5 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_3-4
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-22-svc_1280x720_3.ivf
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-22-svc_1280x720_3.ivf.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-22-svc_1280x720_1.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-22-svc_1280x720_1.webm.md5

View File

@@ -6,6 +6,8 @@ b87815bf86020c592ccc7a846ba2e28ec8043902 *hantro_odd.yuv
456d1493e52d32a5c30edf44a27debc1fa6b253a *invalid-vp90-2-00-quantizer-11.webm.ivf.s52984_r01-05_b6-.ivf.res
c123d1f9f02fb4143abb5e271916e3a3080de8f6 *invalid-vp90-2-00-quantizer-11.webm.ivf.s52984_r01-05_b6-z.ivf
456d1493e52d32a5c30edf44a27debc1fa6b253a *invalid-vp90-2-00-quantizer-11.webm.ivf.s52984_r01-05_b6-z.ivf.res
efafb92b7567bc04c3f1432ea6c268c1c31affd5 *invalid-vp90-2-21-resize_inter_320x180_5_3-4.webm.ivf.s45551_r01-05_b6-.ivf
5d9474c0309b7ca09a182d888f73b37a8fe1362c *invalid-vp90-2-21-resize_inter_320x180_5_3-4.webm.ivf.s45551_r01-05_b6-.ivf.res
fe346136b9b8c1e6f6084cc106485706915795e4 *invalid-vp90-01-v3.webm
5d9474c0309b7ca09a182d888f73b37a8fe1362c *invalid-vp90-01-v3.webm.res
d78e2fceba5ac942246503ec8366f879c4775ca5 *invalid-vp90-02-v2.webm
@@ -848,3 +850,7 @@ a000d568431d07379dd5a8ec066061c07e560b47 *invalid-vp90-2-00-quantizer-63.ivf.kf_
6fa3d3ac306a3d9ce1d610b78441dc00d2c2d4b9 *tos_vp8.webm
e402cbbf9e550ae017a1e9f1f73931c1d18474e8 *invalid-crbug-667044.webm
d3964f9dad9f60363c81b688324d95b4ec7c8038 *invalid-crbug-667044.webm.res
fd9df7f3f6992af1d7a9dde975c9a0d6f28c053d *invalid-bug-1443.ivf
fd3020fa6e9ca5966206738654c97dec313b0a95 *invalid-bug-1443.ivf.res
17696cd21e875f1d6e5d418cbf89feab02c8850a *vp90-2-22-svc_1280x720_1.webm
e2f9e1e47a791b4e939a9bdc50bf7a25b3761f77 *vp90-2-22-svc_1280x720_1.webm.md5

View File

@@ -1,4 +1,5 @@
LIBVPX_TEST_SRCS-yes += acm_random.h
LIBVPX_TEST_SRCS-yes += buffer.h
LIBVPX_TEST_SRCS-yes += clear_system_state.h
LIBVPX_TEST_SRCS-yes += codec_factory.h
LIBVPX_TEST_SRCS-yes += md5_helper.h
@@ -38,7 +39,6 @@ LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += byte_alignment_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += decode_svc_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += external_frame_buffer_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += user_priv_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_frame_parallel_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += active_map_refresh_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += active_map_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += borders_test.cc
@@ -47,6 +47,7 @@ LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += frame_size_tests.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_lossless_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_end_to_end_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_ethread_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_motion_vector_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += level_test.cc
LIBVPX_TEST_SRCS-yes += decode_test_driver.cc
@@ -122,6 +123,7 @@ LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += vp8_fdct4x4_test.cc
LIBVPX_TEST_SRCS-yes += idct_test.cc
LIBVPX_TEST_SRCS-yes += predict_test.cc
LIBVPX_TEST_SRCS-yes += vpx_scale_test.cc
LIBVPX_TEST_SRCS-yes += vpx_scale_test.h
ifeq ($(CONFIG_VP8_ENCODER)$(CONFIG_TEMPORAL_DENOISING),yesyes)
LIBVPX_TEST_SRCS-$(HAVE_SSE2) += vp8_denoiser_sse2_test.cc
@@ -149,14 +151,20 @@ LIBVPX_TEST_SRCS-yes += vp9_intrapred_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_decrypt_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_thread_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += avg_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += comp_avg_pred_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct16x16_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct32x32_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += fdct4x4_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct_partial_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += fdct8x8_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += hadamard_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += minmax_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_scale_test.cc
ifneq ($(CONFIG_REALTIME_ONLY),yes)
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += temporal_filter_test.cc
endif
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += variance_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_error_block_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_block_error_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_quantize_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_subtract_test.cc
@@ -167,7 +175,7 @@ LIBVPX_TEST_SRCS-$(CONFIG_INTERNAL_STATS) += consistency_test.cc
endif
ifeq ($(CONFIG_VP9_ENCODER)$(CONFIG_VP9_TEMPORAL_DENOISING),yesyes)
LIBVPX_TEST_SRCS-$(HAVE_SSE2) += vp9_denoiser_sse2_test.cc
LIBVPX_TEST_SRCS-yes += vp9_denoiser_test.cc
endif
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_arf_freq_test.cc

View File

@@ -312,6 +312,31 @@ INTRA_PRED_TEST(MSA, TestIntraPred32, vpx_dc_predictor_32x32_msa,
vpx_tm_predictor_32x32_msa)
#endif // HAVE_MSA
#if HAVE_VSX
INTRA_PRED_TEST(VSX, TestIntraPred4, NULL, NULL, NULL, NULL, NULL,
vpx_h_predictor_4x4_vsx, NULL, NULL, NULL, NULL, NULL, NULL,
vpx_tm_predictor_4x4_vsx)
INTRA_PRED_TEST(VSX, TestIntraPred8, vpx_dc_predictor_8x8_vsx, NULL, NULL, NULL,
NULL, vpx_h_predictor_8x8_vsx, vpx_d45_predictor_8x8_vsx, NULL,
NULL, NULL, NULL, vpx_d63_predictor_8x8_vsx,
vpx_tm_predictor_8x8_vsx)
INTRA_PRED_TEST(VSX, TestIntraPred16, vpx_dc_predictor_16x16_vsx,
vpx_dc_left_predictor_16x16_vsx, vpx_dc_top_predictor_16x16_vsx,
vpx_dc_128_predictor_16x16_vsx, vpx_v_predictor_16x16_vsx,
vpx_h_predictor_16x16_vsx, vpx_d45_predictor_16x16_vsx, NULL,
NULL, NULL, NULL, vpx_d63_predictor_16x16_vsx,
vpx_tm_predictor_16x16_vsx)
INTRA_PRED_TEST(VSX, TestIntraPred32, vpx_dc_predictor_32x32_vsx,
vpx_dc_left_predictor_32x32_vsx, vpx_dc_top_predictor_32x32_vsx,
vpx_dc_128_predictor_32x32_vsx, vpx_v_predictor_32x32_vsx,
vpx_h_predictor_32x32_vsx, vpx_d45_predictor_32x32_vsx, NULL,
NULL, NULL, NULL, vpx_d63_predictor_32x32_vsx,
vpx_tm_predictor_32x32_vsx)
#endif // HAVE_VSX
// -----------------------------------------------------------------------------
#if CONFIG_VP9_HIGHBITDEPTH
@@ -455,29 +480,70 @@ HIGHBD_INTRA_PRED_TEST(
vpx_highbd_d63_predictor_32x32_c, vpx_highbd_tm_predictor_32x32_c)
#if HAVE_SSE2
HIGHBD_INTRA_PRED_TEST(SSE2, TestHighbdIntraPred4,
vpx_highbd_dc_predictor_4x4_sse2, NULL, NULL, NULL,
vpx_highbd_v_predictor_4x4_sse2, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, vpx_highbd_tm_predictor_4x4_c)
HIGHBD_INTRA_PRED_TEST(
SSE2, TestHighbdIntraPred4, vpx_highbd_dc_predictor_4x4_sse2,
vpx_highbd_dc_left_predictor_4x4_sse2, vpx_highbd_dc_top_predictor_4x4_sse2,
vpx_highbd_dc_128_predictor_4x4_sse2, vpx_highbd_v_predictor_4x4_sse2,
vpx_highbd_h_predictor_4x4_sse2, NULL, vpx_highbd_d135_predictor_4x4_sse2,
vpx_highbd_d117_predictor_4x4_sse2, vpx_highbd_d153_predictor_4x4_sse2,
vpx_highbd_d207_predictor_4x4_sse2, vpx_highbd_d63_predictor_4x4_sse2,
vpx_highbd_tm_predictor_4x4_c)
HIGHBD_INTRA_PRED_TEST(SSE2, TestHighbdIntraPred8,
vpx_highbd_dc_predictor_8x8_sse2, NULL, NULL, NULL,
vpx_highbd_v_predictor_8x8_sse2, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, vpx_highbd_tm_predictor_8x8_sse2)
vpx_highbd_dc_predictor_8x8_sse2,
vpx_highbd_dc_left_predictor_8x8_sse2,
vpx_highbd_dc_top_predictor_8x8_sse2,
vpx_highbd_dc_128_predictor_8x8_sse2,
vpx_highbd_v_predictor_8x8_sse2,
vpx_highbd_h_predictor_8x8_sse2, NULL, NULL, NULL, NULL,
NULL, NULL, vpx_highbd_tm_predictor_8x8_sse2)
HIGHBD_INTRA_PRED_TEST(SSE2, TestHighbdIntraPred16,
vpx_highbd_dc_predictor_16x16_sse2, NULL, NULL, NULL,
vpx_highbd_v_predictor_16x16_sse2, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
vpx_highbd_tm_predictor_16x16_sse2)
vpx_highbd_dc_predictor_16x16_sse2,
vpx_highbd_dc_left_predictor_16x16_sse2,
vpx_highbd_dc_top_predictor_16x16_sse2,
vpx_highbd_dc_128_predictor_16x16_sse2,
vpx_highbd_v_predictor_16x16_sse2,
vpx_highbd_h_predictor_16x16_sse2, NULL, NULL, NULL,
NULL, NULL, NULL, vpx_highbd_tm_predictor_16x16_sse2)
HIGHBD_INTRA_PRED_TEST(SSE2, TestHighbdIntraPred32,
vpx_highbd_dc_predictor_32x32_sse2, NULL, NULL, NULL,
vpx_highbd_v_predictor_32x32_sse2, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
vpx_highbd_tm_predictor_32x32_sse2)
vpx_highbd_dc_predictor_32x32_sse2,
vpx_highbd_dc_left_predictor_32x32_sse2,
vpx_highbd_dc_top_predictor_32x32_sse2,
vpx_highbd_dc_128_predictor_32x32_sse2,
vpx_highbd_v_predictor_32x32_sse2,
vpx_highbd_h_predictor_32x32_sse2, NULL, NULL, NULL,
NULL, NULL, NULL, vpx_highbd_tm_predictor_32x32_sse2)
#endif // HAVE_SSE2
#if HAVE_SSSE3
HIGHBD_INTRA_PRED_TEST(SSSE3, TestHighbdIntraPred4, NULL, NULL, NULL, NULL,
NULL, NULL, vpx_highbd_d45_predictor_4x4_ssse3, NULL,
NULL, NULL, NULL, NULL, NULL)
HIGHBD_INTRA_PRED_TEST(SSSE3, TestHighbdIntraPred8, NULL, NULL, NULL, NULL,
NULL, NULL, vpx_highbd_d45_predictor_8x8_ssse3,
vpx_highbd_d135_predictor_8x8_ssse3,
vpx_highbd_d117_predictor_8x8_ssse3,
vpx_highbd_d153_predictor_8x8_ssse3,
vpx_highbd_d207_predictor_8x8_ssse3,
vpx_highbd_d63_predictor_8x8_ssse3, NULL)
HIGHBD_INTRA_PRED_TEST(SSSE3, TestHighbdIntraPred16, NULL, NULL, NULL, NULL,
NULL, NULL, vpx_highbd_d45_predictor_16x16_ssse3,
vpx_highbd_d135_predictor_16x16_ssse3,
vpx_highbd_d117_predictor_16x16_ssse3,
vpx_highbd_d153_predictor_16x16_ssse3,
vpx_highbd_d207_predictor_16x16_ssse3,
vpx_highbd_d63_predictor_16x16_ssse3, NULL)
HIGHBD_INTRA_PRED_TEST(SSSE3, TestHighbdIntraPred32, NULL, NULL, NULL, NULL,
NULL, NULL, vpx_highbd_d45_predictor_32x32_ssse3,
vpx_highbd_d135_predictor_32x32_ssse3,
vpx_highbd_d117_predictor_32x32_ssse3,
vpx_highbd_d153_predictor_32x32_ssse3,
vpx_highbd_d207_predictor_32x32_ssse3,
vpx_highbd_d63_predictor_32x32_ssse3, NULL)
#endif // HAVE_SSSE3
#if HAVE_NEON
HIGHBD_INTRA_PRED_TEST(
NEON, TestHighbdIntraPred4, vpx_highbd_dc_predictor_4x4_neon,

View File

@@ -53,6 +53,9 @@ int main(int argc, char **argv) {
}
if (!(simd_caps & HAS_AVX)) append_negative_gtest_filter(":AVX.*:AVX/*");
if (!(simd_caps & HAS_AVX2)) append_negative_gtest_filter(":AVX2.*:AVX2/*");
if (!(simd_caps & HAS_AVX512)) {
append_negative_gtest_filter(":AVX512.*:AVX512/*");
}
#endif // ARCH_X86 || ARCH_X86_64
#if !CONFIG_SHARED

View File

@@ -28,13 +28,10 @@
namespace {
enum DecodeMode { kSerialMode, kFrameParallelMode };
const int kThreads = 0;
const int kFileName = 1;
const int kDecodeMode = 0;
const int kThreads = 1;
const int kFileName = 2;
typedef std::tr1::tuple<int, int, const char *> DecodeParam;
typedef std::tr1::tuple<int, const char *> DecodeParam;
class TestVectorTest : public ::libvpx_test::DecoderTest,
public ::libvpx_test::CodecTestWithParam<DecodeParam> {
@@ -53,8 +50,8 @@ class TestVectorTest : public ::libvpx_test::DecoderTest,
void OpenMD5File(const std::string &md5_file_name_) {
md5_file_ = libvpx_test::OpenTestDataFile(md5_file_name_);
ASSERT_TRUE(md5_file_ != NULL) << "Md5 file open failed. Filename: "
<< md5_file_name_;
ASSERT_TRUE(md5_file_ != NULL)
<< "Md5 file open failed. Filename: " << md5_file_name_;
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
@@ -92,29 +89,14 @@ class TestVectorTest : public ::libvpx_test::DecoderTest,
TEST_P(TestVectorTest, MD5Match) {
const DecodeParam input = GET_PARAM(1);
const std::string filename = std::tr1::get<kFileName>(input);
const int threads = std::tr1::get<kThreads>(input);
const int mode = std::tr1::get<kDecodeMode>(input);
vpx_codec_flags_t flags = 0;
vpx_codec_dec_cfg_t cfg = vpx_codec_dec_cfg_t();
char str[256];
if (mode == kFrameParallelMode) {
flags |= VPX_CODEC_USE_FRAME_THREADING;
#if CONFIG_VP9_DECODER
// TODO(hkuang): Fix frame parallel decode bug. See issue 1086.
if (resize_clips_.find(filename) != resize_clips_.end()) {
printf("Skipping the test file: %s, due to frame parallel decode bug.\n",
filename.c_str());
return;
}
#endif
}
cfg.threads = std::tr1::get<kThreads>(input);
cfg.threads = threads;
snprintf(str, sizeof(str) / sizeof(str[0]) - 1,
"file: %s mode: %s threads: %d", filename.c_str(),
mode == 0 ? "Serial" : "Parallel", threads);
snprintf(str, sizeof(str) / sizeof(str[0]) - 1, "file: %s threads: %d",
filename.c_str(), cfg.threads);
SCOPED_TRACE(str);
// Open compressed video file.
@@ -145,13 +127,10 @@ TEST_P(TestVectorTest, MD5Match) {
ASSERT_NO_FATAL_FAILURE(RunLoop(video.get(), cfg));
}
// Test VP8 decode in serial mode with single thread.
// NOTE: VP8 only support serial mode.
#if CONFIG_VP8_DECODER
VP8_INSTANTIATE_TEST_CASE(
TestVectorTest,
::testing::Combine(
::testing::Values(0), // Serial Mode.
::testing::Values(1), // Single thread.
::testing::ValuesIn(libvpx_test::kVP8TestVectors,
libvpx_test::kVP8TestVectors +
@@ -164,33 +143,28 @@ INSTANTIATE_TEST_CASE_P(
::testing::Values(
static_cast<const libvpx_test::CodecFactory *>(&libvpx_test::kVP8)),
::testing::Combine(
::testing::Values(0), // Serial Mode.
::testing::Range(1, 8), // With 1 ~ 8 threads.
::testing::Range(2, 9), // With 2 ~ 8 threads.
::testing::ValuesIn(libvpx_test::kVP8TestVectors,
libvpx_test::kVP8TestVectors +
libvpx_test::kNumVP8TestVectors))));
#endif // CONFIG_VP8_DECODER
// Test VP9 decode in serial mode with single thread.
#if CONFIG_VP9_DECODER
VP9_INSTANTIATE_TEST_CASE(
TestVectorTest,
::testing::Combine(
::testing::Values(0), // Serial Mode.
::testing::Values(1), // Single thread.
::testing::ValuesIn(libvpx_test::kVP9TestVectors,
libvpx_test::kVP9TestVectors +
libvpx_test::kNumVP9TestVectors)));
// Test VP9 decode in frame parallel mode with different number of threads.
INSTANTIATE_TEST_CASE_P(
VP9MultiThreadedFrameParallel, TestVectorTest,
VP9MultiThreaded, TestVectorTest,
::testing::Combine(
::testing::Values(
static_cast<const libvpx_test::CodecFactory *>(&libvpx_test::kVP9)),
::testing::Combine(
::testing::Values(1), // Frame Parallel mode.
::testing::Range(2, 9), // With 2 ~ 8 threads.
::testing::ValuesIn(libvpx_test::kVP9TestVectors,
libvpx_test::kVP9TestVectors +

View File

@@ -371,6 +371,7 @@ const char *const kVP9TestVectors[] = {
#endif // CONFIG_VP9_HIGHBITDEPTH
"vp90-2-20-big_superframe-01.webm",
"vp90-2-20-big_superframe-02.webm",
"vp90-2-22-svc_1280x720_1.webm",
RESIZE_TEST_VECTORS
};
const char *const kVP9TestVectorsSvc[] = { "vp90-2-22-svc_1280x720_3.ivf" };

View File

@@ -54,7 +54,10 @@ twopass_encoder_vp9() {
fi
}
twopass_encoder_tests="twopass_encoder_vp8
twopass_encoder_vp9"
run_tests twopass_encoder_verify_environment "${twopass_encoder_tests}"
if [ "$(vpx_config_option_enabled CONFIG_REALTIME_ONLY)" != "yes" ]; then
twopass_encoder_tests="twopass_encoder_vp8
twopass_encoder_vp9"
run_tests twopass_encoder_verify_environment "${twopass_encoder_tests}"
fi

File diff suppressed because it is too large Load Diff

View File

@@ -17,12 +17,16 @@
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h"
#include "./vp8_rtcd.h"
#include "test/acm_random.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
namespace {
typedef void (*FdctFunc)(int16_t *a, int16_t *b, int a_stride);
const int cospi8sqrt2minus1 = 20091;
const int sinpi8sqrt2 = 35468;
@@ -68,10 +72,21 @@ void reference_idct4x4(const int16_t *input, int16_t *output) {
using libvpx_test::ACMRandom;
TEST(VP8FdctTest, SignBiasCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
class FdctTest : public ::testing::TestWithParam<FdctFunc> {
public:
virtual void SetUp() {
fdct_func_ = GetParam();
rnd_.Reset(ACMRandom::DeterministicSeed());
}
protected:
FdctFunc fdct_func_;
ACMRandom rnd_;
};
TEST_P(FdctTest, SignBiasCheck) {
int16_t test_input_block[16];
int16_t test_output_block[16];
DECLARE_ALIGNED(16, int16_t, test_output_block[16]);
const int pitch = 8;
int count_sign_block[16][2];
const int count_test_block = 1000000;
@@ -81,10 +96,10 @@ TEST(VP8FdctTest, SignBiasCheck) {
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-255, 255].
for (int j = 0; j < 16; ++j) {
test_input_block[j] = rnd.Rand8() - rnd.Rand8();
test_input_block[j] = rnd_.Rand8() - rnd_.Rand8();
}
vp8_short_fdct4x4_c(test_input_block, test_output_block, pitch);
fdct_func_(test_input_block, test_output_block, pitch);
for (int j = 0; j < 16; ++j) {
if (test_output_block[j] < 0) {
@@ -110,10 +125,10 @@ TEST(VP8FdctTest, SignBiasCheck) {
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-15, 15].
for (int j = 0; j < 16; ++j) {
test_input_block[j] = (rnd.Rand8() >> 4) - (rnd.Rand8() >> 4);
test_input_block[j] = (rnd_.Rand8() >> 4) - (rnd_.Rand8() >> 4);
}
vp8_short_fdct4x4_c(test_input_block, test_output_block, pitch);
fdct_func_(test_input_block, test_output_block, pitch);
for (int j = 0; j < 16; ++j) {
if (test_output_block[j] < 0) {
@@ -135,23 +150,22 @@ TEST(VP8FdctTest, SignBiasCheck) {
<< "Error: 4x4 FDCT has a sign bias > 10% for input range [-15, 15]";
};
TEST(VP8FdctTest, RoundTripErrorCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
TEST_P(FdctTest, RoundTripErrorCheck) {
int max_error = 0;
double total_error = 0;
const int count_test_block = 1000000;
for (int i = 0; i < count_test_block; ++i) {
int16_t test_input_block[16];
int16_t test_temp_block[16];
int16_t test_output_block[16];
DECLARE_ALIGNED(16, int16_t, test_temp_block[16]);
// Initialize a test block with input range [-255, 255].
for (int j = 0; j < 16; ++j) {
test_input_block[j] = rnd.Rand8() - rnd.Rand8();
test_input_block[j] = rnd_.Rand8() - rnd_.Rand8();
}
const int pitch = 8;
vp8_short_fdct4x4_c(test_input_block, test_temp_block, pitch);
fdct_func_(test_input_block, test_temp_block, pitch);
reference_idct4x4(test_temp_block, test_output_block);
for (int j = 0; j < 16; ++j) {
@@ -169,4 +183,24 @@ TEST(VP8FdctTest, RoundTripErrorCheck) {
<< "Error: FDCT/IDCT has average roundtrip error > 1 per block";
};
INSTANTIATE_TEST_CASE_P(C, FdctTest, ::testing::Values(vp8_short_fdct4x4_c));
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, FdctTest,
::testing::Values(vp8_short_fdct4x4_neon));
#endif // HAVE_NEON
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, FdctTest,
::testing::Values(vp8_short_fdct4x4_sse2));
#endif // HAVE_SSE2
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, FdctTest,
::testing::Values(vp8_short_fdct4x4_msa));
#endif // HAVE_MSA
#if HAVE_MMI
INSTANTIATE_TEST_CASE_P(MMI, FdctTest,
::testing::Values(vp8_short_fdct4x4_mmi));
#endif // HAVE_MMI
} // namespace

View File

@@ -23,36 +23,36 @@
#include "vp9/common/vp9_entropy.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_dsp/vpx_dsp_common.h"
using libvpx_test::ACMRandom;
namespace {
#if CONFIG_VP9_HIGHBITDEPTH
const int kNumIterations = 1000;
typedef int64_t (*ErrorBlockFunc)(const tran_low_t *coeff,
typedef int64_t (*HBDBlockErrorFunc)(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size, int64_t *ssz,
int bps);
typedef std::tr1::tuple<HBDBlockErrorFunc, HBDBlockErrorFunc, vpx_bit_depth_t>
BlockErrorParam;
typedef int64_t (*BlockErrorFunc)(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size, int64_t *ssz, int bps);
intptr_t block_size, int64_t *ssz);
typedef std::tr1::tuple<ErrorBlockFunc, ErrorBlockFunc, vpx_bit_depth_t>
ErrorBlockParam;
// wrapper for 8-bit block error functions without a 'bps' param.
typedef int64_t (*HighBdBlockError8bit)(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size, int64_t *ssz);
template <HighBdBlockError8bit fn>
int64_t HighBdBlockError8bitWrapper(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size, int64_t *ssz,
int bps) {
EXPECT_EQ(8, bps);
template <BlockErrorFunc fn>
int64_t BlockError8BitWrapper(const tran_low_t *coeff,
const tran_low_t *dqcoeff, intptr_t block_size,
int64_t *ssz, int bps) {
EXPECT_EQ(bps, 8);
return fn(coeff, dqcoeff, block_size, ssz);
}
class ErrorBlockTest : public ::testing::TestWithParam<ErrorBlockParam> {
class BlockErrorTest : public ::testing::TestWithParam<BlockErrorParam> {
public:
virtual ~ErrorBlockTest() {}
virtual ~BlockErrorTest() {}
virtual void SetUp() {
error_block_op_ = GET_PARAM(0);
ref_error_block_op_ = GET_PARAM(1);
@@ -63,11 +63,11 @@ class ErrorBlockTest : public ::testing::TestWithParam<ErrorBlockParam> {
protected:
vpx_bit_depth_t bit_depth_;
ErrorBlockFunc error_block_op_;
ErrorBlockFunc ref_error_block_op_;
HBDBlockErrorFunc error_block_op_;
HBDBlockErrorFunc ref_error_block_op_;
};
TEST_P(ErrorBlockTest, OperationCheck) {
TEST_P(BlockErrorTest, OperationCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff[4096]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff[4096]);
@@ -110,7 +110,7 @@ TEST_P(ErrorBlockTest, OperationCheck) {
<< "First failed at test case " << first_failure;
}
TEST_P(ErrorBlockTest, ExtremeValues) {
TEST_P(BlockErrorTest, ExtremeValues) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff[4096]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff[4096]);
@@ -171,29 +171,28 @@ TEST_P(ErrorBlockTest, ExtremeValues) {
using std::tr1::make_tuple;
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2, ErrorBlockTest,
::testing::Values(
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_10),
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_12),
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_8),
make_tuple(
&HighBdBlockError8bitWrapper<vp9_highbd_block_error_8bit_sse2>,
&HighBdBlockError8bitWrapper<vp9_highbd_block_error_8bit_c>,
VPX_BITS_8)));
const BlockErrorParam sse2_block_error_tests[] = {
#if CONFIG_VP9_HIGHBITDEPTH
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_10),
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_12),
make_tuple(&vp9_highbd_block_error_sse2, &vp9_highbd_block_error_c,
VPX_BITS_8),
#endif // CONFIG_VP9_HIGHBITDEPTH
make_tuple(&BlockError8BitWrapper<vp9_block_error_sse2>,
&BlockError8BitWrapper<vp9_block_error_c>, VPX_BITS_8)
};
INSTANTIATE_TEST_CASE_P(SSE2, BlockErrorTest,
::testing::ValuesIn(sse2_block_error_tests));
#endif // HAVE_SSE2
#if HAVE_AVX
#if HAVE_AVX2
INSTANTIATE_TEST_CASE_P(
AVX, ErrorBlockTest,
::testing::Values(make_tuple(
&HighBdBlockError8bitWrapper<vp9_highbd_block_error_8bit_avx>,
&HighBdBlockError8bitWrapper<vp9_highbd_block_error_8bit_c>,
VPX_BITS_8)));
#endif // HAVE_AVX
#endif // CONFIG_VP9_HIGHBITDEPTH
AVX2, BlockErrorTest,
::testing::Values(make_tuple(&BlockError8BitWrapper<vp9_block_error_avx2>,
&BlockError8BitWrapper<vp9_block_error_c>,
VPX_BITS_8)));
#endif // HAVE_AVX2
} // namespace

View File

@@ -29,11 +29,21 @@ using libvpx_test::ACMRandom;
namespace {
const int kNumPixels = 64 * 64;
class VP9DenoiserTest : public ::testing::TestWithParam<BLOCK_SIZE> {
typedef int (*Vp9DenoiserFilterFunc)(const uint8_t *sig, int sig_stride,
const uint8_t *mc_avg, int mc_avg_stride,
uint8_t *avg, int avg_stride,
int increase_denoising, BLOCK_SIZE bs,
int motion_magnitude);
typedef std::tr1::tuple<Vp9DenoiserFilterFunc, BLOCK_SIZE> VP9DenoiserTestParam;
class VP9DenoiserTest
: public ::testing::Test,
public ::testing::WithParamInterface<VP9DenoiserTestParam> {
public:
virtual ~VP9DenoiserTest() {}
virtual void SetUp() { bs_ = GetParam(); }
virtual void SetUp() { bs_ = GET_PARAM(1); }
virtual void TearDown() { libvpx_test::ClearSystemState(); }
@@ -76,9 +86,9 @@ TEST_P(VP9DenoiserTest, BitexactCheck) {
64, avg_block_c, 64, 0, bs_,
motion_magnitude_random));
ASM_REGISTER_STATE_CHECK(vp9_denoiser_filter_sse2(
sig_block, 64, mc_avg_block, 64, avg_block_sse2, 64, 0, bs_,
motion_magnitude_random));
ASM_REGISTER_STATE_CHECK(GET_PARAM(0)(sig_block, 64, mc_avg_block, 64,
avg_block_sse2, 64, 0, bs_,
motion_magnitude_random));
// Test bitexactness.
for (int h = 0; h < (4 << b_height_log2_lookup[bs_]); ++h) {
@@ -89,10 +99,36 @@ TEST_P(VP9DenoiserTest, BitexactCheck) {
}
}
using std::tr1::make_tuple;
// Test for all block size.
INSTANTIATE_TEST_CASE_P(SSE2, VP9DenoiserTest,
::testing::Values(BLOCK_8X8, BLOCK_8X16, BLOCK_16X8,
BLOCK_16X16, BLOCK_16X32, BLOCK_32X16,
BLOCK_32X32, BLOCK_32X64, BLOCK_64X32,
BLOCK_64X64));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2, VP9DenoiserTest,
::testing::Values(make_tuple(&vp9_denoiser_filter_sse2, BLOCK_8X8),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_8X16),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_16X8),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_16X16),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_16X32),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_32X16),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_32X32),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_32X64),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_64X32),
make_tuple(&vp9_denoiser_filter_sse2, BLOCK_64X64)));
#endif // HAVE_SSE2
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(
NEON, VP9DenoiserTest,
::testing::Values(make_tuple(&vp9_denoiser_filter_neon, BLOCK_8X8),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_8X16),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_16X8),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_16X16),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_16X32),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_32X16),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_32X32),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_32X64),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_64X32),
make_tuple(&vp9_denoiser_filter_neon, BLOCK_64X64)));
#endif
} // namespace

View File

@@ -99,9 +99,7 @@ class VpxEncoderParmsGetToDecoder
vpx_codec_ctx_t *const vp9_decoder = decoder->GetDecoder();
vpx_codec_alg_priv_t *const priv =
reinterpret_cast<vpx_codec_alg_priv_t *>(vp9_decoder->priv);
FrameWorkerData *const worker_data =
reinterpret_cast<FrameWorkerData *>(priv->frame_workers[0].data1);
VP9_COMMON *const common = &worker_data->pbi->common;
VP9_COMMON *const common = &priv->pbi->common;
if (encode_parms.lossless) {
EXPECT_EQ(0, common->base_qindex);

View File

@@ -16,8 +16,207 @@
#include "test/md5_helper.h"
#include "test/util.h"
#include "test/y4m_video_source.h"
#include "vp9/encoder/vp9_firstpass.h"
namespace {
// FIRSTPASS_STATS struct:
// {
// 25 double members;
// 1 int64_t member;
// }
// Whenever FIRSTPASS_STATS struct is modified, the following constants need to
// be revisited.
const int kDbl = 25;
const int kInt = 1;
const size_t kFirstPassStatsSz = kDbl * sizeof(double) + kInt * sizeof(int64_t);
class VPxFirstPassEncoderThreadTest
: public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
VPxFirstPassEncoderThreadTest()
: EncoderTest(GET_PARAM(0)), encoder_initialized_(false), tiles_(0),
encoding_mode_(GET_PARAM(1)), set_cpu_used_(GET_PARAM(2)) {
init_flags_ = VPX_CODEC_USE_PSNR;
row_mt_mode_ = 1;
first_pass_only_ = true;
firstpass_stats_.buf = NULL;
firstpass_stats_.sz = 0;
}
virtual ~VPxFirstPassEncoderThreadTest() { free(firstpass_stats_.buf); }
virtual void SetUp() {
InitializeConfig();
SetMode(encoding_mode_);
cfg_.rc_end_usage = VPX_VBR;
cfg_.rc_2pass_vbr_minsection_pct = 5;
cfg_.rc_2pass_vbr_maxsection_pct = 2000;
cfg_.rc_max_quantizer = 56;
cfg_.rc_min_quantizer = 0;
}
virtual void BeginPassHook(unsigned int /*pass*/) {
encoder_initialized_ = false;
abort_ = false;
}
virtual void EndPassHook() {
// For first pass stats test, only run first pass encoder.
if (first_pass_only_ && cfg_.g_pass == VPX_RC_FIRST_PASS)
abort_ |= first_pass_only_;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource * /*video*/,
::libvpx_test::Encoder *encoder) {
if (!encoder_initialized_) {
// Encode in 2-pass mode.
encoder->Control(VP9E_SET_TILE_COLUMNS, tiles_);
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
encoder->Control(VP8E_SET_ARNR_MAXFRAMES, 7);
encoder->Control(VP8E_SET_ARNR_STRENGTH, 5);
encoder->Control(VP8E_SET_ARNR_TYPE, 3);
encoder->Control(VP9E_SET_FRAME_PARALLEL_DECODING, 0);
if (encoding_mode_ == ::libvpx_test::kTwoPassGood)
encoder->Control(VP9E_SET_ROW_MT, row_mt_mode_);
encoder_initialized_ = true;
}
}
virtual void StatsPktHook(const vpx_codec_cx_pkt_t *pkt) {
const uint8_t *const pkt_buf =
reinterpret_cast<uint8_t *>(pkt->data.twopass_stats.buf);
const size_t pkt_size = pkt->data.twopass_stats.sz;
// First pass stats size equals sizeof(FIRSTPASS_STATS)
EXPECT_EQ(pkt_size, kFirstPassStatsSz)
<< "Error: First pass stats size doesn't equal kFirstPassStatsSz";
firstpass_stats_.buf =
realloc(firstpass_stats_.buf, firstpass_stats_.sz + pkt_size);
memcpy((uint8_t *)firstpass_stats_.buf + firstpass_stats_.sz, pkt_buf,
pkt_size);
firstpass_stats_.sz += pkt_size;
}
bool encoder_initialized_;
int tiles_;
::libvpx_test::TestMode encoding_mode_;
int set_cpu_used_;
int row_mt_mode_;
bool first_pass_only_;
vpx_fixed_buf_t firstpass_stats_;
};
static void compare_fp_stats(vpx_fixed_buf_t *fp_stats, double factor) {
// fp_stats consists of 2 set of first pass encoding stats. These 2 set of
// stats are compared to check if the stats match or at least are very close.
FIRSTPASS_STATS *stats1 = reinterpret_cast<FIRSTPASS_STATS *>(fp_stats->buf);
int nframes_ = (int)(fp_stats->sz / sizeof(FIRSTPASS_STATS));
FIRSTPASS_STATS *stats2 = stats1 + nframes_ / 2;
int i, j;
// The total stats are also output and included in the first pass stats. Here
// ignore that in the comparison.
for (i = 0; i < (nframes_ / 2 - 1); ++i) {
const double *frame_stats1 = reinterpret_cast<double *>(stats1);
const double *frame_stats2 = reinterpret_cast<double *>(stats2);
for (j = 0; j < kDbl; ++j) {
ASSERT_LE(fabs(*frame_stats1 - *frame_stats2),
fabs(*frame_stats1) / factor)
<< "First failure @ frame #" << i << " stat #" << j << " ("
<< *frame_stats1 << " vs. " << *frame_stats2 << ")";
frame_stats1++;
frame_stats2++;
}
stats1++;
stats2++;
}
// Reset firstpass_stats_ to 0.
memset((uint8_t *)fp_stats->buf, 0, fp_stats->sz);
fp_stats->sz = 0;
}
static void compare_fp_stats_md5(vpx_fixed_buf_t *fp_stats) {
// fp_stats consists of 2 set of first pass encoding stats. These 2 set of
// stats are compared to check if the stats match.
uint8_t *stats1 = reinterpret_cast<uint8_t *>(fp_stats->buf);
uint8_t *stats2 = stats1 + fp_stats->sz / 2;
::libvpx_test::MD5 md5_row_mt_0, md5_row_mt_1;
md5_row_mt_0.Add(stats1, fp_stats->sz / 2);
const char *md5_row_mt_0_str = md5_row_mt_0.Get();
md5_row_mt_1.Add(stats2, fp_stats->sz / 2);
const char *md5_row_mt_1_str = md5_row_mt_1.Get();
// Check md5 match.
ASSERT_STREQ(md5_row_mt_0_str, md5_row_mt_1_str)
<< "MD5 checksums don't match";
// Reset firstpass_stats_ to 0.
memset((uint8_t *)fp_stats->buf, 0, fp_stats->sz);
fp_stats->sz = 0;
}
TEST_P(VPxFirstPassEncoderThreadTest, FirstPassStatsTest) {
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 0, 60);
first_pass_only_ = true;
cfg_.rc_target_bitrate = 1000;
// Test row_mt_mode: 0 vs 1 at single thread case(threads = 1, tiles_ = 0)
tiles_ = 0;
cfg_.g_threads = 1;
row_mt_mode_ = 0;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
row_mt_mode_ = 1;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
// Compare to check if using or not using row-mt generates close stats.
ASSERT_NO_FATAL_FAILURE(compare_fp_stats(&firstpass_stats_, 1000.0));
// Test single thread vs multiple threads
row_mt_mode_ = 1;
tiles_ = 0;
cfg_.g_threads = 1;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
cfg_.g_threads = 4;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
// Compare to check if single-thread and multi-thread stats are close enough.
ASSERT_NO_FATAL_FAILURE(compare_fp_stats(&firstpass_stats_, 1000.0));
// Bit exact test in row_mt mode.
// When row_mt_mode_=1 and using >1 threads, the encoder generates bit exact
// result.
row_mt_mode_ = 1;
tiles_ = 2;
cfg_.g_threads = 2;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
cfg_.g_threads = 8;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
// Compare to check if stats match with row-mt=0/1.
compare_fp_stats_md5(&firstpass_stats_);
}
class VPxEncoderThreadTest
: public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith4Params<libvpx_test::TestMode, int,
@@ -29,6 +228,9 @@ class VPxEncoderThreadTest
encoding_mode_(GET_PARAM(1)), set_cpu_used_(GET_PARAM(2)) {
init_flags_ = VPX_CODEC_USE_PSNR;
md5_.clear();
row_mt_mode_ = 1;
psnr_ = 0.0;
nframes_ = 0;
}
virtual ~VPxEncoderThreadTest() {}
@@ -37,7 +239,6 @@ class VPxEncoderThreadTest
SetMode(encoding_mode_);
if (encoding_mode_ != ::libvpx_test::kRealTime) {
cfg_.g_lag_in_frames = 3;
cfg_.rc_end_usage = VPX_VBR;
cfg_.rc_2pass_vbr_minsection_pct = 5;
cfg_.rc_2pass_vbr_maxsection_pct = 2000;
@@ -52,6 +253,8 @@ class VPxEncoderThreadTest
virtual void BeginPassHook(unsigned int /*pass*/) {
encoder_initialized_ = false;
psnr_ = 0.0;
nframes_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource * /*video*/,
@@ -70,10 +273,17 @@ class VPxEncoderThreadTest
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 0);
encoder->Control(VP9E_SET_AQ_MODE, 3);
}
encoder->Control(VP9E_SET_ROW_MT, row_mt_mode_);
encoder_initialized_ = true;
}
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
psnr_ += pkt->data.psnr.psnr[0];
nframes_++;
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t /*pts*/) {
::libvpx_test::MD5 md5_res;
@@ -92,38 +302,102 @@ class VPxEncoderThreadTest
return true;
}
double GetAveragePsnr() const { return nframes_ ? (psnr_ / nframes_) : 0.0; }
bool encoder_initialized_;
int tiles_;
int threads_;
::libvpx_test::TestMode encoding_mode_;
int set_cpu_used_;
int row_mt_mode_;
double psnr_;
unsigned int nframes_;
std::vector<std::string> md5_;
};
TEST_P(VPxEncoderThreadTest, EncoderResultTest) {
std::vector<std::string> single_thr_md5, multi_thr_md5;
::libvpx_test::Y4mVideoSource video("niklas_1280_720_30.y4m", 15, 20);
cfg_.rc_target_bitrate = 1000;
// Part 1: Bit exact test for row_mt_mode_ = 0.
// This part keeps original unit tests done before row-mt code is checked in.
row_mt_mode_ = 0;
// Encode using single thread.
cfg_.g_threads = 1;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
single_thr_md5 = md5_;
const std::vector<std::string> single_thr_md5 = md5_;
md5_.clear();
// Encode using multiple threads.
cfg_.g_threads = threads_;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
multi_thr_md5 = md5_;
const std::vector<std::string> multi_thr_md5 = md5_;
md5_.clear();
// Compare to check if two vectors are equal.
ASSERT_EQ(single_thr_md5, multi_thr_md5);
// Part 2: row_mt_mode_ = 0 vs row_mt_mode_ = 1 single thread bit exact test.
row_mt_mode_ = 1;
// Encode using single thread
cfg_.g_threads = 1;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
std::vector<std::string> row_mt_single_thr_md5 = md5_;
md5_.clear();
ASSERT_EQ(single_thr_md5, row_mt_single_thr_md5);
// Part 3: Bit exact test with row-mt on
// When row_mt_mode_=1 and using >1 threads, the encoder generates bit exact
// result.
row_mt_mode_ = 1;
row_mt_single_thr_md5.clear();
// Encode using 2 threads.
cfg_.g_threads = 2;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
row_mt_single_thr_md5 = md5_;
md5_.clear();
// Encode using multiple threads.
cfg_.g_threads = threads_;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
const std::vector<std::string> row_mt_multi_thr_md5 = md5_;
md5_.clear();
// Compare to check if two vectors are equal.
ASSERT_EQ(row_mt_single_thr_md5, row_mt_multi_thr_md5);
// Part 4: PSNR test with bit_match_mode_ = 0
row_mt_mode_ = 1;
// Encode using single thread.
cfg_.g_threads = 1;
init_flags_ = VPX_CODEC_USE_PSNR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
const double single_thr_psnr = GetAveragePsnr();
// Encode using multiple threads.
cfg_.g_threads = threads_;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
const double multi_thr_psnr = GetAveragePsnr();
EXPECT_NEAR(single_thr_psnr, multi_thr_psnr, 0.1);
}
INSTANTIATE_TEST_CASE_P(
VP9, VPxFirstPassEncoderThreadTest,
::testing::Combine(
::testing::Values(
static_cast<const libvpx_test::CodecFactory *>(&libvpx_test::kVP9)),
::testing::Values(::libvpx_test::kTwoPassGood),
::testing::Range(0, 4))); // cpu_used
// Split this into two instantiations so that we can distinguish
// between very slow runs ( ie cpu_speed 0 ) vs ones that can be
// run nightly by adding Large to the title.
@@ -135,7 +409,7 @@ INSTANTIATE_TEST_CASE_P(
::testing::Values(::libvpx_test::kTwoPassGood,
::libvpx_test::kOnePassGood,
::libvpx_test::kRealTime),
::testing::Range(2, 9), // cpu_used
::testing::Range(3, 9), // cpu_used
::testing::Range(0, 3), // tile_columns
::testing::Range(2, 5))); // threads
@@ -147,7 +421,7 @@ INSTANTIATE_TEST_CASE_P(
::testing::Values(::libvpx_test::kTwoPassGood,
::libvpx_test::kOnePassGood,
::libvpx_test::kRealTime),
::testing::Range(0, 2), // cpu_used
::testing::Range(0, 3), // cpu_used
::testing::Range(0, 3), // tile_columns
::testing::Range(2, 5))); // threads

View File

@@ -1,217 +0,0 @@
/*
* Copyright (c) 2014 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <cstdio>
#include <cstdlib>
#include <string>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h"
#include "test/codec_factory.h"
#include "test/decode_test_driver.h"
#include "test/ivf_video_source.h"
#include "test/md5_helper.h"
#include "test/util.h"
#if CONFIG_WEBM_IO
#include "test/webm_video_source.h"
#endif
#include "vpx_mem/vpx_mem.h"
namespace {
using std::string;
#if CONFIG_WEBM_IO
struct PauseFileList {
const char *name;
// md5 sum for decoded frames which does not include skipped frames.
const char *expected_md5;
const int pause_frame_num;
};
// Decodes |filename| with |num_threads|. Pause at the specified frame_num,
// seek to next key frame and then continue decoding until the end. Return
// the md5 of the decoded frames which does not include skipped frames.
string DecodeFileWithPause(const string &filename, int num_threads,
int pause_num) {
libvpx_test::WebMVideoSource video(filename);
video.Init();
int in_frames = 0;
int out_frames = 0;
vpx_codec_dec_cfg_t cfg = vpx_codec_dec_cfg_t();
cfg.threads = num_threads;
vpx_codec_flags_t flags = 0;
flags |= VPX_CODEC_USE_FRAME_THREADING;
libvpx_test::VP9Decoder decoder(cfg, flags);
libvpx_test::MD5 md5;
video.Begin();
do {
++in_frames;
const vpx_codec_err_t res =
decoder.DecodeFrame(video.cxdata(), video.frame_size());
if (res != VPX_CODEC_OK) {
EXPECT_EQ(VPX_CODEC_OK, res) << decoder.DecodeError();
break;
}
// Pause at specified frame number.
if (in_frames == pause_num) {
// Flush the decoder and then seek to next key frame.
decoder.DecodeFrame(NULL, 0);
video.SeekToNextKeyFrame();
} else {
video.Next();
}
// Flush the decoder at the end of the video.
if (!video.cxdata()) decoder.DecodeFrame(NULL, 0);
libvpx_test::DxDataIterator dec_iter = decoder.GetDxData();
const vpx_image_t *img;
// Get decompressed data
while ((img = dec_iter.Next())) {
++out_frames;
md5.Add(img);
}
} while (video.cxdata() != NULL);
EXPECT_EQ(in_frames, out_frames)
<< "Input frame count does not match output frame count";
return string(md5.Get());
}
void DecodeFilesWithPause(const PauseFileList files[]) {
for (const PauseFileList *iter = files; iter->name != NULL; ++iter) {
SCOPED_TRACE(iter->name);
for (int t = 2; t <= 8; ++t) {
EXPECT_EQ(iter->expected_md5,
DecodeFileWithPause(iter->name, t, iter->pause_frame_num))
<< "threads = " << t;
}
}
}
TEST(VP9MultiThreadedFrameParallel, PauseSeekResume) {
// vp90-2-07-frame_parallel-1.webm is a 40 frame video file with
// one key frame for every ten frames.
static const PauseFileList files[] = {
{ "vp90-2-07-frame_parallel-1.webm", "6ea7c3875d67252e7caf2bc6e75b36b1",
6 },
{ "vp90-2-07-frame_parallel-1.webm", "4bb634160c7356a8d7d4299b6dc83a45",
12 },
{ "vp90-2-07-frame_parallel-1.webm", "89772591e6ef461f9fa754f916c78ed8",
26 },
{ NULL, NULL, 0 },
};
DecodeFilesWithPause(files);
}
struct FileList {
const char *name;
// md5 sum for decoded frames which does not include corrupted frames.
const char *expected_md5;
// Expected number of decoded frames which does not include corrupted frames.
const int expected_frame_count;
};
// Decodes |filename| with |num_threads|. Return the md5 of the decoded
// frames which does not include corrupted frames.
string DecodeFile(const string &filename, int num_threads,
int expected_frame_count) {
libvpx_test::WebMVideoSource video(filename);
video.Init();
vpx_codec_dec_cfg_t cfg = vpx_codec_dec_cfg_t();
cfg.threads = num_threads;
const vpx_codec_flags_t flags = VPX_CODEC_USE_FRAME_THREADING;
libvpx_test::VP9Decoder decoder(cfg, flags);
libvpx_test::MD5 md5;
video.Begin();
int out_frames = 0;
do {
const vpx_codec_err_t res =
decoder.DecodeFrame(video.cxdata(), video.frame_size());
// TODO(hkuang): frame parallel mode should return an error on corruption.
if (res != VPX_CODEC_OK) {
EXPECT_EQ(VPX_CODEC_OK, res) << decoder.DecodeError();
break;
}
video.Next();
// Flush the decoder at the end of the video.
if (!video.cxdata()) decoder.DecodeFrame(NULL, 0);
libvpx_test::DxDataIterator dec_iter = decoder.GetDxData();
const vpx_image_t *img;
// Get decompressed data
while ((img = dec_iter.Next())) {
++out_frames;
md5.Add(img);
}
} while (video.cxdata() != NULL);
EXPECT_EQ(expected_frame_count, out_frames)
<< "Input frame count does not match expected output frame count";
return string(md5.Get());
}
void DecodeFiles(const FileList files[]) {
for (const FileList *iter = files; iter->name != NULL; ++iter) {
SCOPED_TRACE(iter->name);
for (int t = 2; t <= 8; ++t) {
EXPECT_EQ(iter->expected_md5,
DecodeFile(iter->name, t, iter->expected_frame_count))
<< "threads = " << t;
}
}
}
TEST(VP9MultiThreadedFrameParallel, InvalidFileTest) {
static const FileList files[] = {
// invalid-vp90-2-07-frame_parallel-1.webm is a 40 frame video file with
// one key frame for every ten frames. The 11th frame has corrupted data.
{ "invalid-vp90-2-07-frame_parallel-1.webm",
"0549d0f45f60deaef8eb708e6c0eb6cb", 30 },
// invalid-vp90-2-07-frame_parallel-2.webm is a 40 frame video file with
// one key frame for every ten frames. The 1st and 31st frames have
// corrupted data.
{ "invalid-vp90-2-07-frame_parallel-2.webm",
"6a1f3cf6f9e7a364212fadb9580d525e", 20 },
// invalid-vp90-2-07-frame_parallel-3.webm is a 40 frame video file with
// one key frame for every ten frames. The 5th and 13th frames have
// corrupted data.
{ "invalid-vp90-2-07-frame_parallel-3.webm",
"8256544308de926b0681e04685b98677", 27 },
{ NULL, NULL, 0 },
};
DecodeFiles(files);
}
TEST(VP9MultiThreadedFrameParallel, ValidFileTest) {
static const FileList files[] = {
#if CONFIG_VP9_HIGHBITDEPTH
{ "vp92-2-20-10bit-yuv420.webm", "a16b99df180c584e8db2ffeda987d293", 10 },
#endif
{ NULL, NULL, 0 },
};
DecodeFiles(files);
}
#endif // CONFIG_WEBM_IO
} // namespace

View File

@@ -378,6 +378,60 @@ INSTANTIATE_TEST_CASE_P(
8)));
#endif // HAVE_MSA
#if HAVE_VSX
INSTANTIATE_TEST_CASE_P(
VSX, VP9IntraPredTest,
::testing::Values(
IntraPredParam(&vpx_d45_predictor_8x8_vsx, &vpx_d45_predictor_8x8_c, 8,
8),
IntraPredParam(&vpx_d45_predictor_16x16_vsx, &vpx_d45_predictor_16x16_c,
16, 8),
IntraPredParam(&vpx_d45_predictor_32x32_vsx, &vpx_d45_predictor_32x32_c,
32, 8),
IntraPredParam(&vpx_d63_predictor_8x8_vsx, &vpx_d63_predictor_8x8_c, 8,
8),
IntraPredParam(&vpx_d63_predictor_16x16_vsx, &vpx_d63_predictor_16x16_c,
16, 8),
IntraPredParam(&vpx_d63_predictor_32x32_vsx, &vpx_d63_predictor_32x32_c,
32, 8),
IntraPredParam(&vpx_dc_128_predictor_16x16_vsx,
&vpx_dc_128_predictor_16x16_c, 16, 8),
IntraPredParam(&vpx_dc_128_predictor_32x32_vsx,
&vpx_dc_128_predictor_32x32_c, 32, 8),
IntraPredParam(&vpx_dc_left_predictor_16x16_vsx,
&vpx_dc_left_predictor_16x16_c, 16, 8),
IntraPredParam(&vpx_dc_left_predictor_32x32_vsx,
&vpx_dc_left_predictor_32x32_c, 32, 8),
IntraPredParam(&vpx_dc_predictor_8x8_vsx, &vpx_dc_predictor_8x8_c, 8,
8),
IntraPredParam(&vpx_dc_predictor_16x16_vsx, &vpx_dc_predictor_16x16_c,
16, 8),
IntraPredParam(&vpx_dc_predictor_32x32_vsx, &vpx_dc_predictor_32x32_c,
32, 8),
IntraPredParam(&vpx_dc_top_predictor_16x16_vsx,
&vpx_dc_top_predictor_16x16_c, 16, 8),
IntraPredParam(&vpx_dc_top_predictor_32x32_vsx,
&vpx_dc_top_predictor_32x32_c, 32, 8),
IntraPredParam(&vpx_h_predictor_4x4_vsx, &vpx_h_predictor_4x4_c, 4, 8),
IntraPredParam(&vpx_h_predictor_8x8_vsx, &vpx_h_predictor_8x8_c, 8, 8),
IntraPredParam(&vpx_h_predictor_16x16_vsx, &vpx_h_predictor_16x16_c, 16,
8),
IntraPredParam(&vpx_h_predictor_32x32_vsx, &vpx_h_predictor_32x32_c, 32,
8),
IntraPredParam(&vpx_tm_predictor_4x4_vsx, &vpx_tm_predictor_4x4_c, 4,
8),
IntraPredParam(&vpx_tm_predictor_8x8_vsx, &vpx_tm_predictor_8x8_c, 8,
8),
IntraPredParam(&vpx_tm_predictor_16x16_vsx, &vpx_tm_predictor_16x16_c,
16, 8),
IntraPredParam(&vpx_tm_predictor_32x32_vsx, &vpx_tm_predictor_32x32_c,
32, 8),
IntraPredParam(&vpx_v_predictor_16x16_vsx, &vpx_v_predictor_16x16_c, 16,
8),
IntraPredParam(&vpx_v_predictor_32x32_vsx, &vpx_v_predictor_32x32_c, 32,
8)));
#endif // HAVE_VSX
#if CONFIG_VP9_HIGHBITDEPTH
typedef void (*HighbdIntraPred)(uint16_t *dst, ptrdiff_t stride,
const uint16_t *above, const uint16_t *left,
@@ -413,10 +467,164 @@ TEST_P(VP9HighbdIntraPredTest, HighbdIntraPredTests) {
RunTest(left_col, above_data, dst, ref_dst);
}
#if HAVE_SSSE3
INSTANTIATE_TEST_CASE_P(
SSSE3_TO_C_8, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_d45_predictor_4x4_ssse3,
&vpx_highbd_d45_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_8x8_ssse3,
&vpx_highbd_d45_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_16x16_ssse3,
&vpx_highbd_d45_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_32x32_ssse3,
&vpx_highbd_d45_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_8x8_ssse3,
&vpx_highbd_d63_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_16x16_ssse3,
&vpx_highbd_d63_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_32x32_c,
&vpx_highbd_d63_predictor_32x32_ssse3, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_8x8_ssse3,
&vpx_highbd_d117_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_16x16_ssse3,
&vpx_highbd_d117_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_32x32_c,
&vpx_highbd_d117_predictor_32x32_ssse3, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_8x8_ssse3,
&vpx_highbd_d135_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_16x16_ssse3,
&vpx_highbd_d135_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_32x32_ssse3,
&vpx_highbd_d135_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_8x8_ssse3,
&vpx_highbd_d153_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_16x16_ssse3,
&vpx_highbd_d153_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_32x32_ssse3,
&vpx_highbd_d153_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_8x8_ssse3,
&vpx_highbd_d207_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_16x16_ssse3,
&vpx_highbd_d207_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_32x32_ssse3,
&vpx_highbd_d207_predictor_32x32_c, 32, 8)));
INSTANTIATE_TEST_CASE_P(
SSSE3_TO_C_10, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_d45_predictor_4x4_ssse3,
&vpx_highbd_d45_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_8x8_ssse3,
&vpx_highbd_d45_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_16x16_ssse3,
&vpx_highbd_d45_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_32x32_ssse3,
&vpx_highbd_d45_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_8x8_ssse3,
&vpx_highbd_d63_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_16x16_ssse3,
&vpx_highbd_d63_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_32x32_c,
&vpx_highbd_d63_predictor_32x32_ssse3, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_8x8_ssse3,
&vpx_highbd_d117_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_16x16_ssse3,
&vpx_highbd_d117_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_32x32_c,
&vpx_highbd_d117_predictor_32x32_ssse3, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_8x8_ssse3,
&vpx_highbd_d135_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_16x16_ssse3,
&vpx_highbd_d135_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_32x32_ssse3,
&vpx_highbd_d135_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_8x8_ssse3,
&vpx_highbd_d153_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_16x16_ssse3,
&vpx_highbd_d153_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_32x32_ssse3,
&vpx_highbd_d153_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_8x8_ssse3,
&vpx_highbd_d207_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_16x16_ssse3,
&vpx_highbd_d207_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_32x32_ssse3,
&vpx_highbd_d207_predictor_32x32_c, 32, 10)));
INSTANTIATE_TEST_CASE_P(
SSSE3_TO_C_12, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_d45_predictor_4x4_ssse3,
&vpx_highbd_d45_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_8x8_ssse3,
&vpx_highbd_d45_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_16x16_ssse3,
&vpx_highbd_d45_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d45_predictor_32x32_ssse3,
&vpx_highbd_d45_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_8x8_ssse3,
&vpx_highbd_d63_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_16x16_ssse3,
&vpx_highbd_d63_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_32x32_c,
&vpx_highbd_d63_predictor_32x32_ssse3, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_8x8_ssse3,
&vpx_highbd_d117_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_16x16_ssse3,
&vpx_highbd_d117_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_32x32_c,
&vpx_highbd_d117_predictor_32x32_ssse3, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_8x8_ssse3,
&vpx_highbd_d135_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_16x16_ssse3,
&vpx_highbd_d135_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_32x32_ssse3,
&vpx_highbd_d135_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_8x8_ssse3,
&vpx_highbd_d153_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_16x16_ssse3,
&vpx_highbd_d153_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_32x32_ssse3,
&vpx_highbd_d153_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_8x8_ssse3,
&vpx_highbd_d207_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_16x16_ssse3,
&vpx_highbd_d207_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_32x32_ssse3,
&vpx_highbd_d207_predictor_32x32_c, 32, 12)));
#endif // HAVE_SSSE3
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(
SSE2_TO_C_8, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_4x4_sse2,
&vpx_highbd_dc_128_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_8x8_sse2,
&vpx_highbd_dc_128_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_16x16_sse2,
&vpx_highbd_dc_128_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_32x32_sse2,
&vpx_highbd_dc_128_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_4x4_sse2,
&vpx_highbd_d63_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_4x4_sse2,
&vpx_highbd_d117_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_4x4_sse2,
&vpx_highbd_d135_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_4x4_sse2,
&vpx_highbd_d153_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_4x4_sse2,
&vpx_highbd_d207_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_4x4_sse2,
&vpx_highbd_dc_left_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_8x8_sse2,
&vpx_highbd_dc_left_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_16x16_sse2,
&vpx_highbd_dc_left_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_32x32_sse2,
&vpx_highbd_dc_left_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_8x8_sse2,
@@ -425,6 +633,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_dc_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_32x32_sse2,
&vpx_highbd_dc_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_4x4_sse2,
&vpx_highbd_dc_top_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_8x8_sse2,
&vpx_highbd_dc_top_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_16x16_sse2,
&vpx_highbd_dc_top_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_32x32_sse2,
&vpx_highbd_dc_top_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_8x8_sse2,
@@ -433,6 +649,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_tm_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_h_predictor_4x4_sse2,
&vpx_highbd_h_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_h_predictor_8x8_sse2,
&vpx_highbd_h_predictor_8x8_c, 8, 8),
HighbdIntraPredParam(&vpx_highbd_h_predictor_16x16_sse2,
&vpx_highbd_h_predictor_16x16_c, 16, 8),
HighbdIntraPredParam(&vpx_highbd_h_predictor_32x32_sse2,
&vpx_highbd_h_predictor_32x32_c, 32, 8),
HighbdIntraPredParam(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 8),
HighbdIntraPredParam(&vpx_highbd_v_predictor_8x8_sse2,
@@ -445,6 +669,32 @@ INSTANTIATE_TEST_CASE_P(
INSTANTIATE_TEST_CASE_P(
SSE2_TO_C_10, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_4x4_sse2,
&vpx_highbd_dc_128_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_8x8_sse2,
&vpx_highbd_dc_128_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_16x16_sse2,
&vpx_highbd_dc_128_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_32x32_sse2,
&vpx_highbd_dc_128_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_4x4_sse2,
&vpx_highbd_d63_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_4x4_sse2,
&vpx_highbd_d117_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_4x4_sse2,
&vpx_highbd_d135_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_4x4_sse2,
&vpx_highbd_d153_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_4x4_sse2,
&vpx_highbd_d207_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_4x4_sse2,
&vpx_highbd_dc_left_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_8x8_sse2,
&vpx_highbd_dc_left_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_16x16_sse2,
&vpx_highbd_dc_left_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_32x32_sse2,
&vpx_highbd_dc_left_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_8x8_sse2,
@@ -453,6 +703,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_dc_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_32x32_sse2,
&vpx_highbd_dc_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_4x4_sse2,
&vpx_highbd_dc_top_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_8x8_sse2,
&vpx_highbd_dc_top_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_16x16_sse2,
&vpx_highbd_dc_top_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_32x32_sse2,
&vpx_highbd_dc_top_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_8x8_sse2,
@@ -461,6 +719,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_tm_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_h_predictor_4x4_sse2,
&vpx_highbd_h_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_h_predictor_8x8_sse2,
&vpx_highbd_h_predictor_8x8_c, 8, 10),
HighbdIntraPredParam(&vpx_highbd_h_predictor_16x16_sse2,
&vpx_highbd_h_predictor_16x16_c, 16, 10),
HighbdIntraPredParam(&vpx_highbd_h_predictor_32x32_sse2,
&vpx_highbd_h_predictor_32x32_c, 32, 10),
HighbdIntraPredParam(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 10),
HighbdIntraPredParam(&vpx_highbd_v_predictor_8x8_sse2,
@@ -473,6 +739,32 @@ INSTANTIATE_TEST_CASE_P(
INSTANTIATE_TEST_CASE_P(
SSE2_TO_C_12, VP9HighbdIntraPredTest,
::testing::Values(
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_4x4_sse2,
&vpx_highbd_dc_128_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_8x8_sse2,
&vpx_highbd_dc_128_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_16x16_sse2,
&vpx_highbd_dc_128_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_dc_128_predictor_32x32_sse2,
&vpx_highbd_dc_128_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_d63_predictor_4x4_sse2,
&vpx_highbd_d63_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_d117_predictor_4x4_sse2,
&vpx_highbd_d117_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_d135_predictor_4x4_sse2,
&vpx_highbd_d135_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_d153_predictor_4x4_sse2,
&vpx_highbd_d153_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_d207_predictor_4x4_sse2,
&vpx_highbd_d207_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_4x4_sse2,
&vpx_highbd_dc_left_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_8x8_sse2,
&vpx_highbd_dc_left_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_16x16_sse2,
&vpx_highbd_dc_left_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_dc_left_predictor_32x32_sse2,
&vpx_highbd_dc_left_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_8x8_sse2,
@@ -481,6 +773,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_dc_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_dc_predictor_32x32_sse2,
&vpx_highbd_dc_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_4x4_sse2,
&vpx_highbd_dc_top_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_8x8_sse2,
&vpx_highbd_dc_top_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_16x16_sse2,
&vpx_highbd_dc_top_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_dc_top_predictor_32x32_sse2,
&vpx_highbd_dc_top_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_8x8_sse2,
@@ -489,6 +789,14 @@ INSTANTIATE_TEST_CASE_P(
&vpx_highbd_tm_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_h_predictor_4x4_sse2,
&vpx_highbd_h_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_h_predictor_8x8_sse2,
&vpx_highbd_h_predictor_8x8_c, 8, 12),
HighbdIntraPredParam(&vpx_highbd_h_predictor_16x16_sse2,
&vpx_highbd_h_predictor_16x16_c, 16, 12),
HighbdIntraPredParam(&vpx_highbd_h_predictor_32x32_sse2,
&vpx_highbd_h_predictor_32x32_c, 32, 12),
HighbdIntraPredParam(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 12),
HighbdIntraPredParam(&vpx_highbd_v_predictor_8x8_sse2,

View File

@@ -0,0 +1,97 @@
/*
* Copyright (c) 2017 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
#include "test/yuv_video_source.h"
namespace {
#define MAX_EXTREME_MV 1
#define MIN_EXTREME_MV 2
// Encoding modes
const libvpx_test::TestMode kEncodingModeVectors[] = {
::libvpx_test::kTwoPassGood, ::libvpx_test::kOnePassGood,
::libvpx_test::kRealTime,
};
// Encoding speeds
const int kCpuUsedVectors[] = { 0, 1, 2, 3, 4, 5, 6 };
// MV test modes: 1 - always use maximum MV; 2 - always use minimum MV.
const int kMVTestModes[] = { MAX_EXTREME_MV, MIN_EXTREME_MV };
class MotionVectorTestLarge
: public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith3Params<libvpx_test::TestMode, int,
int> {
protected:
MotionVectorTestLarge()
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)),
cpu_used_(GET_PARAM(2)), mv_test_mode_(GET_PARAM(3)) {}
virtual ~MotionVectorTestLarge() {}
virtual void SetUp() {
InitializeConfig();
SetMode(encoding_mode_);
if (encoding_mode_ != ::libvpx_test::kRealTime) {
cfg_.g_lag_in_frames = 3;
cfg_.rc_end_usage = VPX_VBR;
} else {
cfg_.g_lag_in_frames = 0;
cfg_.rc_end_usage = VPX_CBR;
cfg_.rc_buf_sz = 1000;
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 600;
}
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
if (video->frame() == 1) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_ENABLE_MOTION_VECTOR_UNIT_TEST, mv_test_mode_);
if (encoding_mode_ != ::libvpx_test::kRealTime) {
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
encoder->Control(VP8E_SET_ARNR_MAXFRAMES, 7);
encoder->Control(VP8E_SET_ARNR_STRENGTH, 5);
encoder->Control(VP8E_SET_ARNR_TYPE, 3);
}
}
}
libvpx_test::TestMode encoding_mode_;
int cpu_used_;
int mv_test_mode_;
};
TEST_P(MotionVectorTestLarge, OverallTest) {
cfg_.rc_target_bitrate = 24000;
cfg_.g_profile = 0;
init_flags_ = VPX_CODEC_USE_PSNR;
testing::internal::scoped_ptr<libvpx_test::VideoSource> video;
video.reset(new libvpx_test::YUVVideoSource(
"niklas_640_480_30.yuv", VPX_IMG_FMT_I420, 3840, 2160, // 2048, 1080,
30, 1, 0, 5));
ASSERT_TRUE(video.get() != NULL);
ASSERT_NO_FATAL_FAILURE(RunLoop(video.get()));
}
VP9_INSTANTIATE_TEST_CASE(MotionVectorTestLarge,
::testing::ValuesIn(kEncodingModeVectors),
::testing::ValuesIn(kCpuUsedVectors),
::testing::ValuesIn(kMVTestModes));
} // namespace

View File

@@ -14,9 +14,11 @@
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
@@ -24,11 +26,12 @@
#include "vp9/common/vp9_scan.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/vpx_timer.h"
using libvpx_test::ACMRandom;
using libvpx_test::Buffer;
namespace {
#if CONFIG_VP9_HIGHBITDEPTH
const int number_of_iterations = 100;
typedef void (*QuantizeFunc)(const tran_low_t *coeff, intptr_t count,
@@ -38,307 +41,494 @@ typedef void (*QuantizeFunc)(const tran_low_t *coeff, intptr_t count,
tran_low_t *dqcoeff, const int16_t *dequant,
uint16_t *eob, const int16_t *scan,
const int16_t *iscan);
typedef std::tr1::tuple<QuantizeFunc, QuantizeFunc, vpx_bit_depth_t>
typedef std::tr1::tuple<QuantizeFunc, QuantizeFunc, vpx_bit_depth_t,
int /*max_size*/, bool /*is_fp*/>
QuantizeParam;
class VP9QuantizeTest : public ::testing::TestWithParam<QuantizeParam> {
// Wrapper for FP version which does not use zbin or quant_shift.
typedef void (*QuantizeFPFunc)(const tran_low_t *coeff, intptr_t count,
int skip_block, const int16_t *round,
const int16_t *quant, tran_low_t *qcoeff,
tran_low_t *dqcoeff, const int16_t *dequant,
uint16_t *eob, const int16_t *scan,
const int16_t *iscan);
template <QuantizeFPFunc fn>
void QuantFPWrapper(const tran_low_t *coeff, intptr_t count, int skip_block,
const int16_t *zbin, const int16_t *round,
const int16_t *quant, const int16_t *quant_shift,
tran_low_t *qcoeff, tran_low_t *dqcoeff,
const int16_t *dequant, uint16_t *eob, const int16_t *scan,
const int16_t *iscan) {
(void)zbin;
(void)quant_shift;
fn(coeff, count, skip_block, round, quant, qcoeff, dqcoeff, dequant, eob,
scan, iscan);
}
class VP9QuantizeBase {
public:
virtual ~VP9QuantizeTest() {}
virtual void SetUp() {
quantize_op_ = GET_PARAM(0);
ref_quantize_op_ = GET_PARAM(1);
bit_depth_ = GET_PARAM(2);
mask_ = (1 << bit_depth_) - 1;
VP9QuantizeBase(vpx_bit_depth_t bit_depth, int max_size, bool is_fp)
: bit_depth_(bit_depth), max_size_(max_size), is_fp_(is_fp) {
max_value_ = (1 << bit_depth_) - 1;
zbin_ptr_ =
reinterpret_cast<int16_t *>(vpx_memalign(16, 8 * sizeof(*zbin_ptr_)));
round_fp_ptr_ = reinterpret_cast<int16_t *>(
vpx_memalign(16, 8 * sizeof(*round_fp_ptr_)));
quant_fp_ptr_ = reinterpret_cast<int16_t *>(
vpx_memalign(16, 8 * sizeof(*quant_fp_ptr_)));
round_ptr_ =
reinterpret_cast<int16_t *>(vpx_memalign(16, 8 * sizeof(*round_ptr_)));
quant_ptr_ =
reinterpret_cast<int16_t *>(vpx_memalign(16, 8 * sizeof(*quant_ptr_)));
quant_shift_ptr_ = reinterpret_cast<int16_t *>(
vpx_memalign(16, 8 * sizeof(*quant_shift_ptr_)));
dequant_ptr_ = reinterpret_cast<int16_t *>(
vpx_memalign(16, 8 * sizeof(*dequant_ptr_)));
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
vpx_bit_depth_t bit_depth_;
int mask_;
QuantizeFunc quantize_op_;
QuantizeFunc ref_quantize_op_;
};
class VP9Quantize32Test : public ::testing::TestWithParam<QuantizeParam> {
public:
virtual ~VP9Quantize32Test() {}
virtual void SetUp() {
quantize_op_ = GET_PARAM(0);
ref_quantize_op_ = GET_PARAM(1);
bit_depth_ = GET_PARAM(2);
mask_ = (1 << bit_depth_) - 1;
~VP9QuantizeBase() {
vpx_free(zbin_ptr_);
vpx_free(round_fp_ptr_);
vpx_free(quant_fp_ptr_);
vpx_free(round_ptr_);
vpx_free(quant_ptr_);
vpx_free(quant_shift_ptr_);
vpx_free(dequant_ptr_);
zbin_ptr_ = NULL;
round_fp_ptr_ = NULL;
quant_fp_ptr_ = NULL;
round_ptr_ = NULL;
quant_ptr_ = NULL;
quant_shift_ptr_ = NULL;
dequant_ptr_ = NULL;
libvpx_test::ClearSystemState();
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
int16_t *zbin_ptr_;
int16_t *round_fp_ptr_;
int16_t *quant_fp_ptr_;
int16_t *round_ptr_;
int16_t *quant_ptr_;
int16_t *quant_shift_ptr_;
int16_t *dequant_ptr_;
const vpx_bit_depth_t bit_depth_;
int max_value_;
const int max_size_;
const bool is_fp_;
};
class VP9QuantizeTest : public VP9QuantizeBase,
public ::testing::TestWithParam<QuantizeParam> {
public:
VP9QuantizeTest()
: VP9QuantizeBase(GET_PARAM(2), GET_PARAM(3), GET_PARAM(4)),
quantize_op_(GET_PARAM(0)), ref_quantize_op_(GET_PARAM(1)) {}
protected:
vpx_bit_depth_t bit_depth_;
int mask_;
QuantizeFunc quantize_op_;
QuantizeFunc ref_quantize_op_;
const QuantizeFunc quantize_op_;
const QuantizeFunc ref_quantize_op_;
};
// This quantizer compares the AC coefficients to the quantization step size to
// determine if further multiplication operations are needed.
// Based on vp9_quantize_fp_sse2().
void quantize_fp_nz_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs,
int skip_block, const int16_t *round_ptr,
const int16_t *quant_ptr, tran_low_t *qcoeff_ptr,
tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr,
uint16_t *eob_ptr, const int16_t *scan,
const int16_t *iscan) {
int i, eob = -1;
const int thr = dequant_ptr[1] >> 1;
(void)iscan;
(void)skip_block;
assert(!skip_block);
// Quantization pass: All coefficients with index >= zero_flag are
// skippable. Note: zero_flag can be zero.
for (i = 0; i < n_coeffs; i += 16) {
int y;
int nzflag_cnt = 0;
int abs_coeff[16];
int coeff_sign[16];
// count nzflag for each row (16 tran_low_t)
for (y = 0; y < 16; ++y) {
const int rc = i + y;
const int coeff = coeff_ptr[rc];
coeff_sign[y] = (coeff >> 31);
abs_coeff[y] = (coeff ^ coeff_sign[y]) - coeff_sign[y];
// The first 16 are skipped in the sse2 code. Do the same here to match.
if (i >= 16 && (abs_coeff[y] <= thr)) {
nzflag_cnt++;
}
}
for (y = 0; y < 16; ++y) {
const int rc = i + y;
// If all of the AC coeffs in a row has magnitude less than the
// quantization step_size/2, quantize to zero.
if (nzflag_cnt < 16) {
int tmp =
clamp(abs_coeff[y] + round_ptr[rc != 0], INT16_MIN, INT16_MAX);
tmp = (tmp * quant_ptr[rc != 0]) >> 16;
qcoeff_ptr[rc] = (tmp ^ coeff_sign[y]) - coeff_sign[y];
dqcoeff_ptr[rc] = qcoeff_ptr[rc] * dequant_ptr[rc != 0];
} else {
qcoeff_ptr[rc] = 0;
dqcoeff_ptr[rc] = 0;
}
}
}
// Scan for eob.
for (i = 0; i < n_coeffs; i++) {
// Use the scan order to find the correct eob.
const int rc = scan[i];
if (qcoeff_ptr[rc]) {
eob = i;
}
}
*eob_ptr = eob + 1;
}
void GenerateHelperArrays(ACMRandom *rnd, int16_t *zbin, int16_t *round,
int16_t *quant, int16_t *quant_shift,
int16_t *dequant, int16_t *round_fp,
int16_t *quant_fp) {
// Max when q == 0. Otherwise, it is 48 for Y and 42 for U/V.
const int max_qrounding_factor_fp = 64;
for (int j = 0; j < 2; j++) {
// The range is 4 to 1828 in the VP9 tables.
const int qlookup = rnd->RandRange(1825) + 4;
round_fp[j] = (max_qrounding_factor_fp * qlookup) >> 7;
quant_fp[j] = (1 << 16) / qlookup;
// Values determined by deconstructing vp9_init_quantizer().
// zbin may be up to 1143 for 8 and 10 bit Y values, or 1200 for 12 bit Y
// values or U/V values of any bit depth. This is because y_delta is not
// factored into the vp9_ac_quant() call.
zbin[j] = rnd->RandRange(1200);
// round may be up to 685 for Y values or 914 for U/V.
round[j] = rnd->RandRange(914);
// quant ranges from 1 to -32703
quant[j] = static_cast<int>(rnd->RandRange(32704)) - 32703;
// quant_shift goes up to 1 << 16.
quant_shift[j] = rnd->RandRange(16384);
// dequant maxes out at 1828 for all cases.
dequant[j] = rnd->RandRange(1828);
}
for (int j = 2; j < 8; j++) {
zbin[j] = zbin[1];
round_fp[j] = round_fp[1];
quant_fp[j] = quant_fp[1];
round[j] = round[1];
quant[j] = quant[1];
quant_shift[j] = quant_shift[1];
dequant[j] = dequant[1];
}
}
TEST_P(VP9QuantizeTest, OperationCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff_ptr[256]);
DECLARE_ALIGNED(16, int16_t, zbin_ptr[2]);
DECLARE_ALIGNED(16, int16_t, round_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_shift_ptr[2]);
DECLARE_ALIGNED(16, tran_low_t, qcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, ref_qcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, ref_dqcoeff_ptr[256]);
DECLARE_ALIGNED(16, int16_t, dequant_ptr[2]);
DECLARE_ALIGNED(16, uint16_t, eob_ptr[1]);
DECLARE_ALIGNED(16, uint16_t, ref_eob_ptr[1]);
int err_count_total = 0;
int first_failure = -1;
for (int i = 0; i < number_of_iterations; ++i) {
const int skip_block = i == 0;
const TX_SIZE sz = (TX_SIZE)(i % 3); // TX_4X4, TX_8X8 TX_16X16
const TX_TYPE tx_type = (TX_TYPE)((i >> 2) % 3);
const scan_order *scan_order = &vp9_scan_orders[sz][tx_type];
const int count = (4 << sz) * (4 << sz); // 16, 64, 256
int err_count = 0;
*eob_ptr = rnd.Rand16();
*ref_eob_ptr = *eob_ptr;
for (int j = 0; j < count; j++) {
coeff_ptr[j] = rnd.Rand16() & mask_;
}
for (int j = 0; j < 2; j++) {
zbin_ptr[j] = rnd.Rand16() & mask_;
round_ptr[j] = rnd.Rand16();
quant_ptr[j] = rnd.Rand16();
quant_shift_ptr[j] = rnd.Rand16();
dequant_ptr[j] = rnd.Rand16();
}
ref_quantize_op_(coeff_ptr, count, skip_block, zbin_ptr, round_ptr,
quant_ptr, quant_shift_ptr, ref_qcoeff_ptr,
ref_dqcoeff_ptr, dequant_ptr, ref_eob_ptr,
scan_order->scan, scan_order->iscan);
ASM_REGISTER_STATE_CHECK(quantize_op_(
coeff_ptr, count, skip_block, zbin_ptr, round_ptr, quant_ptr,
quant_shift_ptr, qcoeff_ptr, dqcoeff_ptr, dequant_ptr, eob_ptr,
scan_order->scan, scan_order->iscan));
for (int j = 0; j < sz; ++j) {
err_count += (ref_qcoeff_ptr[j] != qcoeff_ptr[j]) |
(ref_dqcoeff_ptr[j] != dqcoeff_ptr[j]);
}
err_count += (*ref_eob_ptr != *eob_ptr);
if (err_count && !err_count_total) {
first_failure = i;
}
err_count_total += err_count;
}
EXPECT_EQ(0, err_count_total)
<< "Error: Quantization Test, C output doesn't match SSE2 output. "
<< "First failed at test case " << first_failure;
}
Buffer<tran_low_t> coeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 16);
ASSERT_TRUE(coeff.Init());
Buffer<tran_low_t> qcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(qcoeff.Init());
Buffer<tran_low_t> dqcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(dqcoeff.Init());
Buffer<tran_low_t> ref_qcoeff =
Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(ref_qcoeff.Init());
Buffer<tran_low_t> ref_dqcoeff =
Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(ref_dqcoeff.Init());
uint16_t eob, ref_eob;
TEST_P(VP9Quantize32Test, OperationCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff_ptr[1024]);
DECLARE_ALIGNED(16, int16_t, zbin_ptr[2]);
DECLARE_ALIGNED(16, int16_t, round_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_shift_ptr[2]);
DECLARE_ALIGNED(16, tran_low_t, qcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, ref_qcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, ref_dqcoeff_ptr[1024]);
DECLARE_ALIGNED(16, int16_t, dequant_ptr[2]);
DECLARE_ALIGNED(16, uint16_t, eob_ptr[1]);
DECLARE_ALIGNED(16, uint16_t, ref_eob_ptr[1]);
int err_count_total = 0;
int first_failure = -1;
for (int i = 0; i < number_of_iterations; ++i) {
const int skip_block = i == 0;
const TX_SIZE sz = TX_32X32;
const TX_TYPE tx_type = (TX_TYPE)(i % 4);
// Test skip block for the first three iterations to catch all the different
// sizes.
const int skip_block = 0;
TX_SIZE sz;
if (max_size_ == 16) {
sz = static_cast<TX_SIZE>(i % 3); // TX_4X4, TX_8X8 TX_16X16
} else {
sz = TX_32X32;
}
const TX_TYPE tx_type = static_cast<TX_TYPE>((i >> 2) % 3);
const scan_order *scan_order = &vp9_scan_orders[sz][tx_type];
const int count = (4 << sz) * (4 << sz); // 1024
int err_count = 0;
*eob_ptr = rnd.Rand16();
*ref_eob_ptr = *eob_ptr;
for (int j = 0; j < count; j++) {
coeff_ptr[j] = rnd.Rand16() & mask_;
}
for (int j = 0; j < 2; j++) {
zbin_ptr[j] = rnd.Rand16() & mask_;
round_ptr[j] = rnd.Rand16();
quant_ptr[j] = rnd.Rand16();
quant_shift_ptr[j] = rnd.Rand16();
dequant_ptr[j] = rnd.Rand16();
}
ref_quantize_op_(coeff_ptr, count, skip_block, zbin_ptr, round_ptr,
quant_ptr, quant_shift_ptr, ref_qcoeff_ptr,
ref_dqcoeff_ptr, dequant_ptr, ref_eob_ptr,
const int count = (4 << sz) * (4 << sz);
coeff.Set(&rnd, -max_value_, max_value_);
GenerateHelperArrays(&rnd, zbin_ptr_, round_ptr_, quant_ptr_,
quant_shift_ptr_, dequant_ptr_, round_fp_ptr_,
quant_fp_ptr_);
int16_t *r_ptr = (is_fp_) ? round_fp_ptr_ : round_ptr_;
int16_t *q_ptr = (is_fp_) ? quant_fp_ptr_ : quant_ptr_;
ref_quantize_op_(coeff.TopLeftPixel(), count, skip_block, zbin_ptr_, r_ptr,
q_ptr, quant_shift_ptr_, ref_qcoeff.TopLeftPixel(),
ref_dqcoeff.TopLeftPixel(), dequant_ptr_, &ref_eob,
scan_order->scan, scan_order->iscan);
ASM_REGISTER_STATE_CHECK(quantize_op_(
coeff_ptr, count, skip_block, zbin_ptr, round_ptr, quant_ptr,
quant_shift_ptr, qcoeff_ptr, dqcoeff_ptr, dequant_ptr, eob_ptr,
scan_order->scan, scan_order->iscan));
for (int j = 0; j < sz; ++j) {
err_count += (ref_qcoeff_ptr[j] != qcoeff_ptr[j]) |
(ref_dqcoeff_ptr[j] != dqcoeff_ptr[j]);
coeff.TopLeftPixel(), count, skip_block, zbin_ptr_, r_ptr, q_ptr,
quant_shift_ptr_, qcoeff.TopLeftPixel(), dqcoeff.TopLeftPixel(),
dequant_ptr_, &eob, scan_order->scan, scan_order->iscan));
EXPECT_TRUE(qcoeff.CheckValues(ref_qcoeff));
EXPECT_TRUE(dqcoeff.CheckValues(ref_dqcoeff));
EXPECT_EQ(eob, ref_eob);
if (HasFailure()) {
printf("Failure on iteration %d.\n", i);
qcoeff.PrintDifference(ref_qcoeff);
dqcoeff.PrintDifference(ref_dqcoeff);
return;
}
err_count += (*ref_eob_ptr != *eob_ptr);
if (err_count && !err_count_total) {
first_failure = i;
}
err_count_total += err_count;
}
EXPECT_EQ(0, err_count_total)
<< "Error: Quantization Test, C output doesn't match SSE2 output. "
<< "First failed at test case " << first_failure;
}
TEST_P(VP9QuantizeTest, EOBCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff_ptr[256]);
DECLARE_ALIGNED(16, int16_t, zbin_ptr[2]);
DECLARE_ALIGNED(16, int16_t, round_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_shift_ptr[2]);
DECLARE_ALIGNED(16, tran_low_t, qcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, ref_qcoeff_ptr[256]);
DECLARE_ALIGNED(16, tran_low_t, ref_dqcoeff_ptr[256]);
DECLARE_ALIGNED(16, int16_t, dequant_ptr[2]);
DECLARE_ALIGNED(16, uint16_t, eob_ptr[1]);
DECLARE_ALIGNED(16, uint16_t, ref_eob_ptr[1]);
int err_count_total = 0;
int first_failure = -1;
Buffer<tran_low_t> coeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 16);
ASSERT_TRUE(coeff.Init());
Buffer<tran_low_t> qcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(qcoeff.Init());
Buffer<tran_low_t> dqcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(dqcoeff.Init());
Buffer<tran_low_t> ref_qcoeff =
Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(ref_qcoeff.Init());
Buffer<tran_low_t> ref_dqcoeff =
Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(ref_dqcoeff.Init());
uint16_t eob, ref_eob;
for (int i = 0; i < number_of_iterations; ++i) {
int skip_block = i == 0;
TX_SIZE sz = (TX_SIZE)(i % 3); // TX_4X4, TX_8X8 TX_16X16
TX_TYPE tx_type = (TX_TYPE)((i >> 2) % 3);
const int skip_block = 0;
TX_SIZE sz;
if (max_size_ == 16) {
sz = static_cast<TX_SIZE>(i % 3); // TX_4X4, TX_8X8 TX_16X16
} else {
sz = TX_32X32;
}
const TX_TYPE tx_type = static_cast<TX_TYPE>((i >> 2) % 3);
const scan_order *scan_order = &vp9_scan_orders[sz][tx_type];
int count = (4 << sz) * (4 << sz); // 16, 64, 256
int err_count = 0;
*eob_ptr = rnd.Rand16();
*ref_eob_ptr = *eob_ptr;
int count = (4 << sz) * (4 << sz);
// Two random entries
for (int j = 0; j < count; j++) {
coeff_ptr[j] = 0;
}
coeff_ptr[rnd(count)] = rnd.Rand16() & mask_;
coeff_ptr[rnd(count)] = rnd.Rand16() & mask_;
for (int j = 0; j < 2; j++) {
zbin_ptr[j] = rnd.Rand16() & mask_;
round_ptr[j] = rnd.Rand16();
quant_ptr[j] = rnd.Rand16();
quant_shift_ptr[j] = rnd.Rand16();
dequant_ptr[j] = rnd.Rand16();
}
ref_quantize_op_(coeff_ptr, count, skip_block, zbin_ptr, round_ptr,
quant_ptr, quant_shift_ptr, ref_qcoeff_ptr,
ref_dqcoeff_ptr, dequant_ptr, ref_eob_ptr,
coeff.Set(0);
coeff.TopLeftPixel()[rnd(count)] =
static_cast<int>(rnd.RandRange(max_value_ * 2)) - max_value_;
coeff.TopLeftPixel()[rnd(count)] =
static_cast<int>(rnd.RandRange(max_value_ * 2)) - max_value_;
GenerateHelperArrays(&rnd, zbin_ptr_, round_ptr_, quant_ptr_,
quant_shift_ptr_, dequant_ptr_, round_fp_ptr_,
quant_fp_ptr_);
int16_t *r_ptr = (is_fp_) ? round_fp_ptr_ : round_ptr_;
int16_t *q_ptr = (is_fp_) ? quant_fp_ptr_ : quant_ptr_;
ref_quantize_op_(coeff.TopLeftPixel(), count, skip_block, zbin_ptr_, r_ptr,
q_ptr, quant_shift_ptr_, ref_qcoeff.TopLeftPixel(),
ref_dqcoeff.TopLeftPixel(), dequant_ptr_, &ref_eob,
scan_order->scan, scan_order->iscan);
ASM_REGISTER_STATE_CHECK(quantize_op_(
coeff_ptr, count, skip_block, zbin_ptr, round_ptr, quant_ptr,
quant_shift_ptr, qcoeff_ptr, dqcoeff_ptr, dequant_ptr, eob_ptr,
scan_order->scan, scan_order->iscan));
for (int j = 0; j < sz; ++j) {
err_count += (ref_qcoeff_ptr[j] != qcoeff_ptr[j]) |
(ref_dqcoeff_ptr[j] != dqcoeff_ptr[j]);
ASM_REGISTER_STATE_CHECK(quantize_op_(
coeff.TopLeftPixel(), count, skip_block, zbin_ptr_, r_ptr, q_ptr,
quant_shift_ptr_, qcoeff.TopLeftPixel(), dqcoeff.TopLeftPixel(),
dequant_ptr_, &eob, scan_order->scan, scan_order->iscan));
EXPECT_TRUE(qcoeff.CheckValues(ref_qcoeff));
EXPECT_TRUE(dqcoeff.CheckValues(ref_dqcoeff));
EXPECT_EQ(eob, ref_eob);
if (HasFailure()) {
printf("Failure on iteration %d.\n", i);
qcoeff.PrintDifference(ref_qcoeff);
dqcoeff.PrintDifference(ref_dqcoeff);
return;
}
err_count += (*ref_eob_ptr != *eob_ptr);
if (err_count && !err_count_total) {
first_failure = i;
}
err_count_total += err_count;
}
EXPECT_EQ(0, err_count_total)
<< "Error: Quantization Test, C output doesn't match SSE2 output. "
<< "First failed at test case " << first_failure;
}
TEST_P(VP9Quantize32Test, EOBCheck) {
TEST_P(VP9QuantizeTest, DISABLED_Speed) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
DECLARE_ALIGNED(16, tran_low_t, coeff_ptr[1024]);
DECLARE_ALIGNED(16, int16_t, zbin_ptr[2]);
DECLARE_ALIGNED(16, int16_t, round_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_ptr[2]);
DECLARE_ALIGNED(16, int16_t, quant_shift_ptr[2]);
DECLARE_ALIGNED(16, tran_low_t, qcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, dqcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, ref_qcoeff_ptr[1024]);
DECLARE_ALIGNED(16, tran_low_t, ref_dqcoeff_ptr[1024]);
DECLARE_ALIGNED(16, int16_t, dequant_ptr[2]);
DECLARE_ALIGNED(16, uint16_t, eob_ptr[1]);
DECLARE_ALIGNED(16, uint16_t, ref_eob_ptr[1]);
int err_count_total = 0;
int first_failure = -1;
for (int i = 0; i < number_of_iterations; ++i) {
int skip_block = i == 0;
TX_SIZE sz = TX_32X32;
TX_TYPE tx_type = (TX_TYPE)(i % 4);
const scan_order *scan_order = &vp9_scan_orders[sz][tx_type];
int count = (4 << sz) * (4 << sz); // 1024
int err_count = 0;
*eob_ptr = rnd.Rand16();
*ref_eob_ptr = *eob_ptr;
for (int j = 0; j < count; j++) {
coeff_ptr[j] = 0;
}
// Two random entries
coeff_ptr[rnd(count)] = rnd.Rand16() & mask_;
coeff_ptr[rnd(count)] = rnd.Rand16() & mask_;
for (int j = 0; j < 2; j++) {
zbin_ptr[j] = rnd.Rand16() & mask_;
round_ptr[j] = rnd.Rand16();
quant_ptr[j] = rnd.Rand16();
quant_shift_ptr[j] = rnd.Rand16();
dequant_ptr[j] = rnd.Rand16();
}
Buffer<tran_low_t> coeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 16);
ASSERT_TRUE(coeff.Init());
Buffer<tran_low_t> qcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(qcoeff.Init());
Buffer<tran_low_t> dqcoeff = Buffer<tran_low_t>(max_size_, max_size_, 0, 32);
ASSERT_TRUE(dqcoeff.Init());
uint16_t eob;
TX_SIZE starting_sz, ending_sz;
ref_quantize_op_(coeff_ptr, count, skip_block, zbin_ptr, round_ptr,
quant_ptr, quant_shift_ptr, ref_qcoeff_ptr,
ref_dqcoeff_ptr, dequant_ptr, ref_eob_ptr,
scan_order->scan, scan_order->iscan);
ASM_REGISTER_STATE_CHECK(quantize_op_(
coeff_ptr, count, skip_block, zbin_ptr, round_ptr, quant_ptr,
quant_shift_ptr, qcoeff_ptr, dqcoeff_ptr, dequant_ptr, eob_ptr,
scan_order->scan, scan_order->iscan));
for (int j = 0; j < sz; ++j) {
err_count += (ref_qcoeff_ptr[j] != qcoeff_ptr[j]) |
(ref_dqcoeff_ptr[j] != dqcoeff_ptr[j]);
}
err_count += (*ref_eob_ptr != *eob_ptr);
if (err_count && !err_count_total) {
first_failure = i;
}
err_count_total += err_count;
if (max_size_ == 16) {
starting_sz = TX_4X4;
ending_sz = TX_16X16;
} else {
starting_sz = TX_32X32;
ending_sz = TX_32X32;
}
for (TX_SIZE sz = starting_sz; sz <= ending_sz; ++sz) {
// zbin > coeff, zbin < coeff.
for (int i = 0; i < 2; ++i) {
const int skip_block = 0;
// TX_TYPE defines the scan order. That is not relevant to the speed test.
// Pick the first one.
const TX_TYPE tx_type = DCT_DCT;
const scan_order *scan_order = &vp9_scan_orders[sz][tx_type];
const int count = (4 << sz) * (4 << sz);
GenerateHelperArrays(&rnd, zbin_ptr_, round_ptr_, quant_ptr_,
quant_shift_ptr_, dequant_ptr_, round_fp_ptr_,
quant_fp_ptr_);
int16_t *r_ptr = (is_fp_) ? round_fp_ptr_ : round_ptr_;
int16_t *q_ptr = (is_fp_) ? quant_fp_ptr_ : quant_ptr_;
if (i == 0) {
// When |coeff values| are less than zbin the results are 0.
int threshold = 100;
if (max_size_ == 32) {
// For 32x32, the threshold is halved. Double it to keep the values
// from clearing it.
threshold = 200;
}
for (int j = 0; j < 8; ++j) zbin_ptr_[j] = threshold;
coeff.Set(&rnd, -99, 99);
} else if (i == 1) {
for (int j = 0; j < 8; ++j) zbin_ptr_[j] = 50;
coeff.Set(&rnd, -500, 500);
}
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int j = 0; j < 100000000 / count; ++j) {
quantize_op_(coeff.TopLeftPixel(), count, skip_block, zbin_ptr_, r_ptr,
q_ptr, quant_shift_ptr_, qcoeff.TopLeftPixel(),
dqcoeff.TopLeftPixel(), dequant_ptr_, &eob,
scan_order->scan, scan_order->iscan);
}
vpx_usec_timer_mark(&timer);
const int elapsed_time = static_cast<int>(vpx_usec_timer_elapsed(&timer));
if (i == 0) printf("Bypass calculations.\n");
if (i == 1) printf("Full calculations.\n");
printf("Quantize %dx%d time: %5d ms\n", 4 << sz, 4 << sz,
elapsed_time / 1000);
}
printf("\n");
}
EXPECT_EQ(0, err_count_total)
<< "Error: Quantization Test, C output doesn't match SSE2 output. "
<< "First failed at test case " << first_failure;
}
using std::tr1::make_tuple;
#if HAVE_SSE2
#if CONFIG_VP9_HIGHBITDEPTH
// TODO(johannkoenig): Fix vpx_quantize_b_sse2 in highbitdepth builds.
// make_tuple(&vpx_quantize_b_sse2, &vpx_highbd_quantize_b_c, VPX_BITS_8),
INSTANTIATE_TEST_CASE_P(
SSE2, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_highbd_quantize_b_sse2,
&vpx_highbd_quantize_b_c, VPX_BITS_8),
make_tuple(&vpx_highbd_quantize_b_sse2,
&vpx_highbd_quantize_b_c, VPX_BITS_10),
make_tuple(&vpx_highbd_quantize_b_sse2,
&vpx_highbd_quantize_b_c, VPX_BITS_12)));
::testing::Values(
make_tuple(&vpx_highbd_quantize_b_sse2, &vpx_highbd_quantize_b_c,
VPX_BITS_8, 16, false),
make_tuple(&vpx_highbd_quantize_b_sse2, &vpx_highbd_quantize_b_c,
VPX_BITS_10, 16, false),
make_tuple(&vpx_highbd_quantize_b_sse2, &vpx_highbd_quantize_b_c,
VPX_BITS_12, 16, false),
make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_8, 32, false),
make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_10, 32, false),
make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_12, 32, false)));
#else
INSTANTIATE_TEST_CASE_P(
SSE2, VP9Quantize32Test,
::testing::Values(make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_8),
make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_10),
make_tuple(&vpx_highbd_quantize_b_32x32_sse2,
&vpx_highbd_quantize_b_32x32_c, VPX_BITS_12)));
#endif // HAVE_SSE2
SSE2, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_sse2, &vpx_quantize_b_c,
VPX_BITS_8, 16, false),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_sse2>,
&QuantFPWrapper<quantize_fp_nz_c>, VPX_BITS_8,
16, true)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_SSE2
#if HAVE_SSSE3 && !CONFIG_VP9_HIGHBITDEPTH
#if ARCH_X86_64
INSTANTIATE_TEST_CASE_P(
SSSE3, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_ssse3, &vpx_quantize_b_c,
VPX_BITS_8, 16, false),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_ssse3>,
&QuantFPWrapper<quantize_fp_nz_c>, VPX_BITS_8,
16, true)));
#else
INSTANTIATE_TEST_CASE_P(SSSE3, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_ssse3,
&vpx_quantize_b_c,
VPX_BITS_8, 16, false)));
#endif
#if ARCH_X86_64
// TODO(johannkoenig): SSSE3 optimizations do not yet pass this test.
INSTANTIATE_TEST_CASE_P(
DISABLED_SSSE3, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_32x32_ssse3,
&vpx_quantize_b_32x32_c, VPX_BITS_8, 32,
false),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_32x32_ssse3>,
&QuantFPWrapper<vp9_quantize_fp_32x32_c>,
VPX_BITS_8, 32, true)));
#endif // ARCH_X86_64
#endif // HAVE_SSSE3 && !CONFIG_VP9_HIGHBITDEPTH
// TODO(johannkoenig): AVX optimizations do not yet pass the 32x32 test or
// highbitdepth configurations.
#if HAVE_AVX && !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
AVX, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_avx, &vpx_quantize_b_c,
VPX_BITS_8, 16, false),
// Even though SSSE3 and AVX do not match the reference
// code, we can keep them in sync with each other.
make_tuple(&vpx_quantize_b_32x32_avx,
&vpx_quantize_b_32x32_ssse3, VPX_BITS_8, 32,
false)));
#endif // HAVE_AVX && !CONFIG_VP9_HIGHBITDEPTH
// TODO(webm:1448): dqcoeff is not handled correctly in HBD builds.
#if HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_CASE_P(
NEON, VP9QuantizeTest,
::testing::Values(make_tuple(&vpx_quantize_b_neon, &vpx_quantize_b_c,
VPX_BITS_8, 16, false),
make_tuple(&vpx_quantize_b_32x32_neon,
&vpx_quantize_b_32x32_c, VPX_BITS_8, 32,
false),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_neon>,
&QuantFPWrapper<vp9_quantize_fp_c>, VPX_BITS_8,
16, true),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_32x32_neon>,
&QuantFPWrapper<vp9_quantize_fp_32x32_c>,
VPX_BITS_8, 32, true)));
#endif // HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH
// Only useful to compare "Speed" test results.
INSTANTIATE_TEST_CASE_P(
DISABLED_C, VP9QuantizeTest,
::testing::Values(
make_tuple(&vpx_quantize_b_c, &vpx_quantize_b_c, VPX_BITS_8, 16, false),
make_tuple(&vpx_quantize_b_32x32_c, &vpx_quantize_b_32x32_c, VPX_BITS_8,
32, false),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_c>,
&QuantFPWrapper<vp9_quantize_fp_c>, VPX_BITS_8, 16, true),
make_tuple(&QuantFPWrapper<quantize_fp_nz_c>,
&QuantFPWrapper<quantize_fp_nz_c>, VPX_BITS_8, 16, true),
make_tuple(&QuantFPWrapper<vp9_quantize_fp_32x32_c>,
&QuantFPWrapper<vp9_quantize_fp_32x32_c>, VPX_BITS_8, 32,
true)));
} // namespace

214
test/vp9_scale_test.cc Normal file
View File

@@ -0,0 +1,214 @@
/*
* Copyright (c) 2017 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
#include "./vpx_scale_rtcd.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/vpx_scale_test.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/vpx_timer.h"
#include "vpx_scale/yv12config.h"
namespace libvpx_test {
typedef void (*ScaleFrameFunc)(const YV12_BUFFER_CONFIG *src,
YV12_BUFFER_CONFIG *dst,
INTERP_FILTER filter_type, int phase_scaler);
class ScaleTest : public VpxScaleBase,
public ::testing::TestWithParam<ScaleFrameFunc> {
public:
virtual ~ScaleTest() {}
protected:
virtual void SetUp() { scale_fn_ = GetParam(); }
void ReferenceScaleFrame(INTERP_FILTER filter_type, int phase_scaler) {
vp9_scale_and_extend_frame_c(&img_, &ref_img_, filter_type, phase_scaler);
}
void ScaleFrame(INTERP_FILTER filter_type, int phase_scaler) {
ASM_REGISTER_STATE_CHECK(
scale_fn_(&img_, &dst_img_, filter_type, phase_scaler));
}
void RunTest() {
static const int kNumSizesToTest = 20;
static const int kNumScaleFactorsToTest = 4;
static const int kSizesToTest[] = {
2, 4, 6, 8, 10, 12, 14, 16, 18, 20,
22, 24, 26, 28, 30, 32, 34, 68, 128, 134
};
static const int kScaleFactors[] = { 1, 2, 3, 4 };
for (INTERP_FILTER filter_type = 0; filter_type < 4; ++filter_type) {
for (int phase_scaler = 0; phase_scaler < 16; ++phase_scaler) {
for (int h = 0; h < kNumSizesToTest; ++h) {
const int src_height = kSizesToTest[h];
for (int w = 0; w < kNumSizesToTest; ++w) {
const int src_width = kSizesToTest[w];
for (int sf_up_idx = 0; sf_up_idx < kNumScaleFactorsToTest;
++sf_up_idx) {
const int sf_up = kScaleFactors[sf_up_idx];
for (int sf_down_idx = 0; sf_down_idx < kNumScaleFactorsToTest;
++sf_down_idx) {
const int sf_down = kScaleFactors[sf_down_idx];
const int dst_width = src_width * sf_up / sf_down;
const int dst_height = src_height * sf_up / sf_down;
if (sf_up == sf_down && sf_up != 1) {
continue;
}
// I420 frame width and height must be even.
if (!dst_width || !dst_height || dst_width & 1 ||
dst_height & 1) {
continue;
}
// vpx_convolve8_c() has restriction on the step which cannot
// exceed 64 (ratio 1 to 4).
if (src_width > 4 * dst_width || src_height > 4 * dst_height) {
continue;
}
ASSERT_NO_FATAL_FAILURE(ResetScaleImages(
src_width, src_height, dst_width, dst_height));
ReferenceScaleFrame(filter_type, phase_scaler);
ScaleFrame(filter_type, phase_scaler);
if (memcmp(dst_img_.buffer_alloc, ref_img_.buffer_alloc,
ref_img_.frame_size)) {
printf(
"filter_type = %d, phase_scaler = %d, src_width = %4d, "
"src_height = %4d, dst_width = %4d, dst_height = %4d, "
"scale factor = %d:%d\n",
filter_type, phase_scaler, src_width, src_height,
dst_width, dst_height, sf_down, sf_up);
PrintDiff();
}
CompareImages(dst_img_);
DeallocScaleImages();
}
}
}
}
}
}
}
void PrintDiffComponent(const uint8_t *const ref, const uint8_t *const opt,
const int stride, const int width, const int height,
const int plane_idx) const {
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
if (ref[y * stride + x] != opt[y * stride + x]) {
printf("Plane %d pixel[%d][%d] diff:%6d (ref),%6d (opt)\n", plane_idx,
y, x, ref[y * stride + x], opt[y * stride + x]);
break;
}
}
}
}
void PrintDiff() const {
assert(ref_img_.y_stride == dst_img_.y_stride);
assert(ref_img_.y_width == dst_img_.y_width);
assert(ref_img_.y_height == dst_img_.y_height);
assert(ref_img_.uv_stride == dst_img_.uv_stride);
assert(ref_img_.uv_width == dst_img_.uv_width);
assert(ref_img_.uv_height == dst_img_.uv_height);
if (memcmp(dst_img_.buffer_alloc, ref_img_.buffer_alloc,
ref_img_.frame_size)) {
PrintDiffComponent(ref_img_.y_buffer, dst_img_.y_buffer,
ref_img_.y_stride, ref_img_.y_width, ref_img_.y_height,
0);
PrintDiffComponent(ref_img_.u_buffer, dst_img_.u_buffer,
ref_img_.uv_stride, ref_img_.uv_width,
ref_img_.uv_height, 1);
PrintDiffComponent(ref_img_.v_buffer, dst_img_.v_buffer,
ref_img_.uv_stride, ref_img_.uv_width,
ref_img_.uv_height, 2);
}
}
ScaleFrameFunc scale_fn_;
};
TEST_P(ScaleTest, ScaleFrame) { ASSERT_NO_FATAL_FAILURE(RunTest()); }
TEST_P(ScaleTest, DISABLED_Speed) {
static const int kCountSpeedTestBlock = 100;
static const int kNumScaleFactorsToTest = 4;
static const int kScaleFactors[] = { 1, 2, 3, 4 };
const int src_width = 1280;
const int src_height = 720;
for (INTERP_FILTER filter_type = 2; filter_type < 4; ++filter_type) {
for (int phase_scaler = 0; phase_scaler < 2; ++phase_scaler) {
for (int sf_up_idx = 0; sf_up_idx < kNumScaleFactorsToTest; ++sf_up_idx) {
const int sf_up = kScaleFactors[sf_up_idx];
for (int sf_down_idx = 0; sf_down_idx < kNumScaleFactorsToTest;
++sf_down_idx) {
const int sf_down = kScaleFactors[sf_down_idx];
const int dst_width = src_width * sf_up / sf_down;
const int dst_height = src_height * sf_up / sf_down;
if (sf_up == sf_down && sf_up != 1) {
continue;
}
// I420 frame width and height must be even.
if (dst_width & 1 || dst_height & 1) {
continue;
}
ASSERT_NO_FATAL_FAILURE(
ResetScaleImages(src_width, src_height, dst_width, dst_height));
ASM_REGISTER_STATE_CHECK(
ReferenceScaleFrame(filter_type, phase_scaler));
vpx_usec_timer timer;
vpx_usec_timer_start(&timer);
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
ScaleFrame(filter_type, phase_scaler);
}
libvpx_test::ClearSystemState();
vpx_usec_timer_mark(&timer);
const int elapsed_time =
static_cast<int>(vpx_usec_timer_elapsed(&timer) / 1000);
CompareImages(dst_img_);
DeallocScaleImages();
printf(
"filter_type = %d, phase_scaler = %d, src_width = %4d, "
"src_height = %4d, dst_width = %4d, dst_height = %4d, "
"scale factor = %d:%d, scale time: %5d ms\n",
filter_type, phase_scaler, src_width, src_height, dst_width,
dst_height, sf_down, sf_up, elapsed_time);
}
}
}
}
}
INSTANTIATE_TEST_CASE_P(C, ScaleTest,
::testing::Values(vp9_scale_and_extend_frame_c));
#if HAVE_SSSE3
INSTANTIATE_TEST_CASE_P(SSSE3, ScaleTest,
::testing::Values(vp9_scale_and_extend_frame_ssse3));
#endif // HAVE_SSSE3
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, ScaleTest,
::testing::Values(vp9_scale_and_extend_frame_neon));
#endif // HAVE_NEON
} // namespace libvpx_test

View File

@@ -85,8 +85,8 @@ class SkipLoopFilterTest {
// TODO(fgalligan): Move the MD5 testing code into another class.
void OpenMd5File(const std::string &md5_file_name) {
md5_file_ = libvpx_test::OpenTestDataFile(md5_file_name);
ASSERT_TRUE(md5_file_ != NULL) << "MD5 file open failed. Filename: "
<< md5_file_name;
ASSERT_TRUE(md5_file_ != NULL)
<< "MD5 file open failed. Filename: " << md5_file_name;
}
// Reads the next line of the MD5 file.

View File

@@ -101,4 +101,9 @@ INSTANTIATE_TEST_CASE_P(MSA, VP9SubtractBlockTest,
::testing::Values(vpx_subtract_block_msa));
#endif
#if HAVE_MMI
INSTANTIATE_TEST_CASE_P(MMI, VP9SubtractBlockTest,
::testing::Values(vpx_subtract_block_mmi));
#endif
} // namespace vp9

View File

@@ -187,8 +187,8 @@ void DecodeFiles(const FileList files[]) {
for (const FileList *iter = files; iter->name != NULL; ++iter) {
SCOPED_TRACE(iter->name);
for (int t = 1; t <= 8; ++t) {
EXPECT_EQ(iter->expected_md5, DecodeFile(iter->name, t)) << "threads = "
<< t;
EXPECT_EQ(iter->expected_md5, DecodeFile(iter->name, t))
<< "threads = " << t;
}
}
}

View File

@@ -14,149 +14,17 @@
#include "./vpx_scale_rtcd.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/vpx_scale_test.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/vpx_timer.h"
#include "vpx_scale/yv12config.h"
namespace {
namespace libvpx_test {
typedef void (*ExtendFrameBorderFunc)(YV12_BUFFER_CONFIG *ybf);
typedef void (*CopyFrameFunc)(const YV12_BUFFER_CONFIG *src_ybf,
YV12_BUFFER_CONFIG *dst_ybf);
class VpxScaleBase {
public:
virtual ~VpxScaleBase() { libvpx_test::ClearSystemState(); }
void ResetImage(int width, int height) {
width_ = width;
height_ = height;
memset(&img_, 0, sizeof(img_));
ASSERT_EQ(0, vp8_yv12_alloc_frame_buffer(&img_, width_, height_,
VP8BORDERINPIXELS));
memset(img_.buffer_alloc, kBufFiller, img_.frame_size);
FillPlane(img_.y_buffer, img_.y_crop_width, img_.y_crop_height,
img_.y_stride);
FillPlane(img_.u_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
FillPlane(img_.v_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
memset(&ref_img_, 0, sizeof(ref_img_));
ASSERT_EQ(0, vp8_yv12_alloc_frame_buffer(&ref_img_, width_, height_,
VP8BORDERINPIXELS));
memset(ref_img_.buffer_alloc, kBufFiller, ref_img_.frame_size);
memset(&cpy_img_, 0, sizeof(cpy_img_));
ASSERT_EQ(0, vp8_yv12_alloc_frame_buffer(&cpy_img_, width_, height_,
VP8BORDERINPIXELS));
memset(cpy_img_.buffer_alloc, kBufFiller, cpy_img_.frame_size);
ReferenceCopyFrame();
}
void DeallocImage() {
vp8_yv12_de_alloc_frame_buffer(&img_);
vp8_yv12_de_alloc_frame_buffer(&ref_img_);
vp8_yv12_de_alloc_frame_buffer(&cpy_img_);
}
protected:
static const int kBufFiller = 123;
static const int kBufMax = kBufFiller - 1;
static void FillPlane(uint8_t *buf, int width, int height, int stride) {
for (int y = 0; y < height; ++y) {
for (int x = 0; x < width; ++x) {
buf[x + (y * stride)] = (x + (width * y)) % kBufMax;
}
}
}
static void ExtendPlane(uint8_t *buf, int crop_width, int crop_height,
int width, int height, int stride, int padding) {
// Copy the outermost visible pixel to a distance of at least 'padding.'
// The buffers are allocated such that there may be excess space outside the
// padding. As long as the minimum amount of padding is achieved it is not
// necessary to fill this space as well.
uint8_t *left = buf - padding;
uint8_t *right = buf + crop_width;
const int right_extend = padding + (width - crop_width);
const int bottom_extend = padding + (height - crop_height);
// Fill the border pixels from the nearest image pixel.
for (int y = 0; y < crop_height; ++y) {
memset(left, left[padding], padding);
memset(right, right[-1], right_extend);
left += stride;
right += stride;
}
left = buf - padding;
uint8_t *top = left - (stride * padding);
// The buffer does not always extend as far as the stride.
// Equivalent to padding + width + padding.
const int extend_width = padding + crop_width + right_extend;
// The first row was already extended to the left and right. Copy it up.
for (int y = 0; y < padding; ++y) {
memcpy(top, left, extend_width);
top += stride;
}
uint8_t *bottom = left + (crop_height * stride);
for (int y = 0; y < bottom_extend; ++y) {
memcpy(bottom, left + (crop_height - 1) * stride, extend_width);
bottom += stride;
}
}
void ReferenceExtendBorder() {
ExtendPlane(ref_img_.y_buffer, ref_img_.y_crop_width,
ref_img_.y_crop_height, ref_img_.y_width, ref_img_.y_height,
ref_img_.y_stride, ref_img_.border);
ExtendPlane(ref_img_.u_buffer, ref_img_.uv_crop_width,
ref_img_.uv_crop_height, ref_img_.uv_width, ref_img_.uv_height,
ref_img_.uv_stride, ref_img_.border / 2);
ExtendPlane(ref_img_.v_buffer, ref_img_.uv_crop_width,
ref_img_.uv_crop_height, ref_img_.uv_width, ref_img_.uv_height,
ref_img_.uv_stride, ref_img_.border / 2);
}
void ReferenceCopyFrame() {
// Copy img_ to ref_img_ and extend frame borders. This will be used for
// verifying extend_fn_ as well as copy_frame_fn_.
EXPECT_EQ(ref_img_.frame_size, img_.frame_size);
for (int y = 0; y < img_.y_crop_height; ++y) {
for (int x = 0; x < img_.y_crop_width; ++x) {
ref_img_.y_buffer[x + y * ref_img_.y_stride] =
img_.y_buffer[x + y * img_.y_stride];
}
}
for (int y = 0; y < img_.uv_crop_height; ++y) {
for (int x = 0; x < img_.uv_crop_width; ++x) {
ref_img_.u_buffer[x + y * ref_img_.uv_stride] =
img_.u_buffer[x + y * img_.uv_stride];
ref_img_.v_buffer[x + y * ref_img_.uv_stride] =
img_.v_buffer[x + y * img_.uv_stride];
}
}
ReferenceExtendBorder();
}
void CompareImages(const YV12_BUFFER_CONFIG actual) {
EXPECT_EQ(ref_img_.frame_size, actual.frame_size);
EXPECT_EQ(0, memcmp(ref_img_.buffer_alloc, actual.buffer_alloc,
ref_img_.frame_size));
}
YV12_BUFFER_CONFIG img_;
YV12_BUFFER_CONFIG ref_img_;
YV12_BUFFER_CONFIG cpy_img_;
int width_;
int height_;
};
class ExtendBorderTest
: public VpxScaleBase,
public ::testing::TestWithParam<ExtendFrameBorderFunc> {
@@ -178,11 +46,11 @@ class ExtendBorderTest
static const int kSizesToTest[] = { 1, 15, 33, 145, 512, 1025, 16383 };
for (int h = 0; h < kNumSizesToTest; ++h) {
for (int w = 0; w < kNumSizesToTest; ++w) {
ResetImage(kSizesToTest[w], kSizesToTest[h]);
ASSERT_NO_FATAL_FAILURE(ResetImages(kSizesToTest[w], kSizesToTest[h]));
ReferenceCopyFrame();
ExtendBorder();
ReferenceExtendBorder();
CompareImages(img_);
DeallocImage();
DeallocImages();
}
}
}
@@ -204,7 +72,7 @@ class CopyFrameTest : public VpxScaleBase,
virtual void SetUp() { copy_frame_fn_ = GetParam(); }
void CopyFrame() {
ASM_REGISTER_STATE_CHECK(copy_frame_fn_(&img_, &cpy_img_));
ASM_REGISTER_STATE_CHECK(copy_frame_fn_(&img_, &dst_img_));
}
void RunTest() {
@@ -217,11 +85,11 @@ class CopyFrameTest : public VpxScaleBase,
static const int kSizesToTest[] = { 1, 15, 33, 145, 512, 1025, 16383 };
for (int h = 0; h < kNumSizesToTest; ++h) {
for (int w = 0; w < kNumSizesToTest; ++w) {
ResetImage(kSizesToTest[w], kSizesToTest[h]);
ASSERT_NO_FATAL_FAILURE(ResetImages(kSizesToTest[w], kSizesToTest[h]));
ReferenceCopyFrame();
CopyFrame();
CompareImages(cpy_img_);
DeallocImage();
CompareImages(dst_img_);
DeallocImages();
}
}
}
@@ -233,4 +101,5 @@ TEST_P(CopyFrameTest, CopyFrame) { ASSERT_NO_FATAL_FAILURE(RunTest()); }
INSTANTIATE_TEST_CASE_P(C, CopyFrameTest,
::testing::Values(vp8_yv12_copy_frame_c));
} // namespace
} // namespace libvpx_test

200
test/vpx_scale_test.h Normal file
View File

@@ -0,0 +1,200 @@
/*
* Copyright (c) 2014 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef TEST_VPX_SCALE_TEST_H_
#define TEST_VPX_SCALE_TEST_H_
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h"
#include "./vpx_scale_rtcd.h"
#include "test/acm_random.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_scale/yv12config.h"
using libvpx_test::ACMRandom;
namespace libvpx_test {
class VpxScaleBase {
public:
virtual ~VpxScaleBase() { libvpx_test::ClearSystemState(); }
void ResetImage(YV12_BUFFER_CONFIG *const img, const int width,
const int height) {
memset(img, 0, sizeof(*img));
ASSERT_EQ(
0, vp8_yv12_alloc_frame_buffer(img, width, height, VP8BORDERINPIXELS));
memset(img->buffer_alloc, kBufFiller, img->frame_size);
}
void ResetImages(const int width, const int height) {
ResetImage(&img_, width, height);
ResetImage(&ref_img_, width, height);
ResetImage(&dst_img_, width, height);
FillPlane(img_.y_buffer, img_.y_crop_width, img_.y_crop_height,
img_.y_stride);
FillPlane(img_.u_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
FillPlane(img_.v_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
}
void ResetScaleImage(YV12_BUFFER_CONFIG *const img, const int width,
const int height) {
memset(img, 0, sizeof(*img));
#if CONFIG_VP9_HIGHBITDEPTH
ASSERT_EQ(0, vpx_alloc_frame_buffer(img, width, height, 1, 1, 0,
VP9_ENC_BORDER_IN_PIXELS, 0));
#else
ASSERT_EQ(0, vpx_alloc_frame_buffer(img, width, height, 1, 1,
VP9_ENC_BORDER_IN_PIXELS, 0));
#endif
memset(img->buffer_alloc, kBufFiller, img->frame_size);
}
void ResetScaleImages(const int src_width, const int src_height,
const int dst_width, const int dst_height) {
ResetScaleImage(&img_, src_width, src_height);
ResetScaleImage(&ref_img_, dst_width, dst_height);
ResetScaleImage(&dst_img_, dst_width, dst_height);
FillPlaneExtreme(img_.y_buffer, img_.y_crop_width, img_.y_crop_height,
img_.y_stride);
FillPlaneExtreme(img_.u_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
FillPlaneExtreme(img_.v_buffer, img_.uv_crop_width, img_.uv_crop_height,
img_.uv_stride);
}
void DeallocImages() {
vp8_yv12_de_alloc_frame_buffer(&img_);
vp8_yv12_de_alloc_frame_buffer(&ref_img_);
vp8_yv12_de_alloc_frame_buffer(&dst_img_);
}
void DeallocScaleImages() {
vpx_free_frame_buffer(&img_);
vpx_free_frame_buffer(&ref_img_);
vpx_free_frame_buffer(&dst_img_);
}
protected:
static const int kBufFiller = 123;
static const int kBufMax = kBufFiller - 1;
static void FillPlane(uint8_t *const buf, const int width, const int height,
const int stride) {
for (int y = 0; y < height; ++y) {
for (int x = 0; x < width; ++x) {
buf[x + (y * stride)] = (x + (width * y)) % kBufMax;
}
}
}
static void FillPlaneExtreme(uint8_t *const buf, const int width,
const int height, const int stride) {
ACMRandom rnd;
for (int y = 0; y < height; ++y) {
for (int x = 0; x < width; ++x) {
buf[x + (y * stride)] = rnd.Rand8() % 2 ? 255 : 0;
}
}
}
static void ExtendPlane(uint8_t *buf, int crop_width, int crop_height,
int width, int height, int stride, int padding) {
// Copy the outermost visible pixel to a distance of at least 'padding.'
// The buffers are allocated such that there may be excess space outside the
// padding. As long as the minimum amount of padding is achieved it is not
// necessary to fill this space as well.
uint8_t *left = buf - padding;
uint8_t *right = buf + crop_width;
const int right_extend = padding + (width - crop_width);
const int bottom_extend = padding + (height - crop_height);
// Fill the border pixels from the nearest image pixel.
for (int y = 0; y < crop_height; ++y) {
memset(left, left[padding], padding);
memset(right, right[-1], right_extend);
left += stride;
right += stride;
}
left = buf - padding;
uint8_t *top = left - (stride * padding);
// The buffer does not always extend as far as the stride.
// Equivalent to padding + width + padding.
const int extend_width = padding + crop_width + right_extend;
// The first row was already extended to the left and right. Copy it up.
for (int y = 0; y < padding; ++y) {
memcpy(top, left, extend_width);
top += stride;
}
uint8_t *bottom = left + (crop_height * stride);
for (int y = 0; y < bottom_extend; ++y) {
memcpy(bottom, left + (crop_height - 1) * stride, extend_width);
bottom += stride;
}
}
void ReferenceExtendBorder() {
ExtendPlane(ref_img_.y_buffer, ref_img_.y_crop_width,
ref_img_.y_crop_height, ref_img_.y_width, ref_img_.y_height,
ref_img_.y_stride, ref_img_.border);
ExtendPlane(ref_img_.u_buffer, ref_img_.uv_crop_width,
ref_img_.uv_crop_height, ref_img_.uv_width, ref_img_.uv_height,
ref_img_.uv_stride, ref_img_.border / 2);
ExtendPlane(ref_img_.v_buffer, ref_img_.uv_crop_width,
ref_img_.uv_crop_height, ref_img_.uv_width, ref_img_.uv_height,
ref_img_.uv_stride, ref_img_.border / 2);
}
void ReferenceCopyFrame() {
// Copy img_ to ref_img_ and extend frame borders. This will be used for
// verifying extend_fn_ as well as copy_frame_fn_.
EXPECT_EQ(ref_img_.frame_size, img_.frame_size);
for (int y = 0; y < img_.y_crop_height; ++y) {
for (int x = 0; x < img_.y_crop_width; ++x) {
ref_img_.y_buffer[x + y * ref_img_.y_stride] =
img_.y_buffer[x + y * img_.y_stride];
}
}
for (int y = 0; y < img_.uv_crop_height; ++y) {
for (int x = 0; x < img_.uv_crop_width; ++x) {
ref_img_.u_buffer[x + y * ref_img_.uv_stride] =
img_.u_buffer[x + y * img_.uv_stride];
ref_img_.v_buffer[x + y * ref_img_.uv_stride] =
img_.v_buffer[x + y * img_.uv_stride];
}
}
ReferenceExtendBorder();
}
void CompareImages(const YV12_BUFFER_CONFIG actual) {
EXPECT_EQ(ref_img_.frame_size, actual.frame_size);
EXPECT_EQ(0, memcmp(ref_img_.buffer_alloc, actual.buffer_alloc,
ref_img_.frame_size));
}
YV12_BUFFER_CONFIG img_;
YV12_BUFFER_CONFIG ref_img_;
YV12_BUFFER_CONFIG dst_img_;
};
} // namespace libvpx_test
#endif // TEST_VPX_SCALE_TEST_H_

View File

@@ -41,6 +41,7 @@ vpx_tsvc_encoder() {
local speed="6"
local frame_drop_thresh="30"
local max_threads="4"
local error_resilient="1"
shift 2
@@ -51,11 +52,19 @@ vpx_tsvc_encoder() {
# TODO(tomfinegan): Verify file output for all thread runs.
for threads in $(seq $max_threads); do
eval "${VPX_TEST_PREFIX}" "${encoder}" "${YUV_RAW_INPUT}" "${output_file}" \
"${codec}" "${YUV_RAW_INPUT_WIDTH}" "${YUV_RAW_INPUT_HEIGHT}" \
"${timebase_num}" "${timebase_den}" "${speed}" "${frame_drop_thresh}" \
"${threads}" "$@" \
${devnull}
if [ "$(vpx_config_option_enabled CONFIG_VP9_HIGHBITDEPTH)" != "yes" ]; then
eval "${VPX_TEST_PREFIX}" "${encoder}" "${YUV_RAW_INPUT}" \
"${output_file}" "${codec}" "${YUV_RAW_INPUT_WIDTH}" \
"${YUV_RAW_INPUT_HEIGHT}" "${timebase_num}" "${timebase_den}" \
"${speed}" "${frame_drop_thresh}" "${error_resilient}" "${threads}" \
"$@" ${devnull}
else
eval "${VPX_TEST_PREFIX}" "${encoder}" "${YUV_RAW_INPUT}" \
"${output_file}" "${codec}" "${YUV_RAW_INPUT_WIDTH}" \
"${YUV_RAW_INPUT_HEIGHT}" "${timebase_num}" "${timebase_den}" \
"${speed}" "${frame_drop_thresh}" "${error_resilient}" "${threads}" \
"$@" "8" ${devnull}
fi
done
}
@@ -76,193 +85,217 @@ files_exist() {
vpx_tsvc_encoder_vp8_mode_0() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 0 200 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_0"
vpx_tsvc_encoder vp8 "${output_basename}" 0 200 || return 1
# Mode 0 produces 1 stream
files_exist "${FUNCNAME}" 1 || return 1
files_exist "${output_basename}" 1 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_1() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 1 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_1"
vpx_tsvc_encoder vp8 "${output_basename}" 1 200 400 || return 1
# Mode 1 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_2() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 2 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_2"
vpx_tsvc_encoder vp8 "${output_basename}" 2 200 400 || return 1
# Mode 2 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_3() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 3 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_3"
vpx_tsvc_encoder vp8 "${output_basename}" 3 200 400 600 || return 1
# Mode 3 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_4() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 4 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_4"
vpx_tsvc_encoder vp8 "${output_basename}" 4 200 400 600 || return 1
# Mode 4 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_5() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 5 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_5"
vpx_tsvc_encoder vp8 "${output_basename}" 5 200 400 600 || return 1
# Mode 5 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_6() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 6 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_6"
vpx_tsvc_encoder vp8 "${output_basename}" 6 200 400 600 || return 1
# Mode 6 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_7() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 7 200 400 600 800 1000 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_7"
vpx_tsvc_encoder vp8 "${output_basename}" 7 200 400 600 800 1000 || return 1
# Mode 7 produces 5 streams
files_exist "${FUNCNAME}" 5 || return 1
files_exist "${output_basename}" 5 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_8() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 8 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_8"
vpx_tsvc_encoder vp8 "${output_basename}" 8 200 400 || return 1
# Mode 8 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_9() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 9 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_9"
vpx_tsvc_encoder vp8 "${output_basename}" 9 200 400 600 || return 1
# Mode 9 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_10() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 10 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_10"
vpx_tsvc_encoder vp8 "${output_basename}" 10 200 400 600 || return 1
# Mode 10 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp8_mode_11() {
if [ "$(vp8_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp8 "${FUNCNAME}" 11 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp8_mode_11"
vpx_tsvc_encoder vp8 "${output_basename}" 11 200 400 600 || return 1
# Mode 11 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_0() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 0 200 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_0"
vpx_tsvc_encoder vp9 "${output_basename}" 0 200 || return 1
# Mode 0 produces 1 stream
files_exist "${FUNCNAME}" 1 || return 1
files_exist "${output_basename}" 1 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_1() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 1 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_1"
vpx_tsvc_encoder vp9 "${output_basename}" 1 200 400 || return 1
# Mode 1 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_2() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 2 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_2"
vpx_tsvc_encoder vp9 "${output_basename}" 2 200 400 || return 1
# Mode 2 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_3() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 3 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_3"
vpx_tsvc_encoder vp9 "${output_basename}" 3 200 400 600 || return 1
# Mode 3 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_4() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 4 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_4"
vpx_tsvc_encoder vp9 "${output_basename}" 4 200 400 600 || return 1
# Mode 4 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_5() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 5 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_5"
vpx_tsvc_encoder vp9 "${output_basename}" 5 200 400 600 || return 1
# Mode 5 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_6() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 6 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_6"
vpx_tsvc_encoder vp9 "${output_basename}" 6 200 400 600 || return 1
# Mode 6 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_7() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 7 200 400 600 800 1000 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_7"
vpx_tsvc_encoder vp9 "${output_basename}" 7 200 400 600 800 1000 || return 1
# Mode 7 produces 5 streams
files_exist "${FUNCNAME}" 5 || return 1
files_exist "${output_basename}" 5 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_8() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 8 200 400 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_8"
vpx_tsvc_encoder vp9 "${output_basename}" 8 200 400 || return 1
# Mode 8 produces 2 streams
files_exist "${FUNCNAME}" 2 || return 1
files_exist "${output_basename}" 2 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_9() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 9 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_9"
vpx_tsvc_encoder vp9 "${output_basename}" 9 200 400 600 || return 1
# Mode 9 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_10() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 10 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_10"
vpx_tsvc_encoder vp9 "${output_basename}" 10 200 400 600 || return 1
# Mode 10 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}
vpx_tsvc_encoder_vp9_mode_11() {
if [ "$(vp9_encode_available)" = "yes" ]; then
vpx_tsvc_encoder vp9 "${FUNCNAME}" 11 200 400 600 || return 1
local readonly output_basename="vpx_tsvc_encoder_vp9_mode_11"
vpx_tsvc_encoder vp9 "${output_basename}" 11 200 400 600 || return 1
# Mode 11 produces 3 streams
files_exist "${FUNCNAME}" 3 || return 1
files_exist "${output_basename}" 3 || return 1
fi
}

View File

@@ -90,6 +90,15 @@ vpxenc_rt_params() {
--undershoot-pct=50"
}
# Forces --passes to 1 with CONFIG_REALTIME_ONLY.
vpxenc_passes_param() {
if [ "$(vpx_config_option_enabled CONFIG_REALTIME_ONLY)" = "yes" ]; then
echo "--passes=1"
else
echo "--passes=2"
fi
}
# Wrapper function for running vpxenc with pipe input. Requires that
# LIBVPX_BIN_PATH points to the directory containing vpxenc. $1 is used as the
# input file path and shifted away. All remaining parameters are passed through
@@ -218,9 +227,11 @@ vpxenc_vp8_ivf_piped_input() {
vpxenc_vp9_ivf() {
if [ "$(vpxenc_can_encode_vp9)" = "yes" ]; then
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9.ivf"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(yuv_input_hantro_collage) \
--codec=vp9 \
--limit="${TEST_FRAMES}" \
"${passes}" \
--ivf \
--output="${output}"
@@ -235,9 +246,11 @@ vpxenc_vp9_webm() {
if [ "$(vpxenc_can_encode_vp9)" = "yes" ] && \
[ "$(webm_io_available)" = "yes" ]; then
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9.webm"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(yuv_input_hantro_collage) \
--codec=vp9 \
--limit="${TEST_FRAMES}" \
"${passes}" \
--output="${output}"
if [ ! -e "${output}" ]; then
@@ -339,11 +352,13 @@ vpxenc_vp9_webm_2pass() {
vpxenc_vp9_ivf_lossless() {
if [ "$(vpxenc_can_encode_vp9)" = "yes" ]; then
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9_lossless.ivf"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(yuv_input_hantro_collage) \
--codec=vp9 \
--limit="${TEST_FRAMES}" \
--ivf \
--output="${output}" \
"${passes}" \
--lossless=1
if [ ! -e "${output}" ]; then
@@ -356,11 +371,13 @@ vpxenc_vp9_ivf_lossless() {
vpxenc_vp9_ivf_minq0_maxq0() {
if [ "$(vpxenc_can_encode_vp9)" = "yes" ]; then
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9_lossless_minq0_maxq0.ivf"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(yuv_input_hantro_collage) \
--codec=vp9 \
--limit="${TEST_FRAMES}" \
--ivf \
--output="${output}" \
"${passes}" \
--min-q=0 \
--max-q=0
@@ -377,12 +394,13 @@ vpxenc_vp9_webm_lag10_frames20() {
local readonly lag_total_frames=20
local readonly lag_frames=10
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9_lag10_frames20.webm"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(yuv_input_hantro_collage) \
--codec=vp9 \
--limit="${lag_total_frames}" \
--lag-in-frames="${lag_frames}" \
--output="${output}" \
--passes=2 \
"${passes}" \
--auto-alt-ref=1
if [ ! -e "${output}" ]; then
@@ -397,9 +415,11 @@ vpxenc_vp9_webm_non_square_par() {
if [ "$(vpxenc_can_encode_vp9)" = "yes" ] && \
[ "$(webm_io_available)" = "yes" ]; then
local readonly output="${VPX_TEST_OUTPUT_DIR}/vp9_non_square_par.webm"
local readonly passes=$(vpxenc_passes_param)
vpxenc $(y4m_input_non_square_par) \
--codec=vp9 \
--limit="${TEST_FRAMES}" \
"${passes}" \
--output="${output}"
if [ ! -e "${output}" ]; then
@@ -412,18 +432,21 @@ vpxenc_vp9_webm_non_square_par() {
vpxenc_tests="vpxenc_vp8_ivf
vpxenc_vp8_webm
vpxenc_vp8_webm_rt
vpxenc_vp8_webm_2pass
vpxenc_vp8_webm_lag10_frames20
vpxenc_vp8_ivf_piped_input
vpxenc_vp9_ivf
vpxenc_vp9_webm
vpxenc_vp9_webm_rt
vpxenc_vp9_webm_rt_multithread_tiled
vpxenc_vp9_webm_rt_multithread_tiled_frameparallel
vpxenc_vp9_webm_2pass
vpxenc_vp9_ivf_lossless
vpxenc_vp9_ivf_minq0_maxq0
vpxenc_vp9_webm_lag10_frames20
vpxenc_vp9_webm_non_square_par"
if [ "$(vpx_config_option_enabled CONFIG_REALTIME_ONLY)" != "yes" ]; then
vpxenc_tests="$vpxenc_tests
vpxenc_vp8_webm_2pass
vpxenc_vp8_webm_lag10_frames20
vpxenc_vp9_webm_2pass"
fi
run_tests vpxenc_verify_environment "${vpxenc_tests}"

View File

@@ -40,8 +40,8 @@ class WebMVideoSource : public CompressedVideoSource {
virtual void Begin() {
vpx_ctx_->file = OpenTestDataFile(file_name_);
ASSERT_TRUE(vpx_ctx_->file != NULL) << "Input file open failed. Filename: "
<< file_name_;
ASSERT_TRUE(vpx_ctx_->file != NULL)
<< "Input file open failed. Filename: " << file_name_;
ASSERT_EQ(file_is_webm(webm_ctx_, vpx_ctx_), 1) << "file is not WebM";

View File

@@ -34,8 +34,8 @@ class Y4mVideoSource : public VideoSource {
virtual void OpenSource() {
CloseSource();
input_file_ = OpenTestDataFile(file_name_);
ASSERT_TRUE(input_file_ != NULL) << "Input file open failed. Filename: "
<< file_name_;
ASSERT_TRUE(input_file_ != NULL)
<< "Input file open failed. Filename: " << file_name_;
}
virtual void ReadSourceToStart() {

View File

@@ -43,8 +43,8 @@ class YUVVideoSource : public VideoSource {
virtual void Begin() {
if (input_file_) fclose(input_file_);
input_file_ = OpenTestDataFile(file_name_);
ASSERT_TRUE(input_file_ != NULL) << "Input file open failed. Filename: "
<< file_name_;
ASSERT_TRUE(input_file_ != NULL)
<< "Input file open failed. Filename: " << file_name_;
if (start_) {
fseek(input_file_, static_cast<unsigned>(raw_size_) * start_, SEEK_SET);
}

View File

@@ -1,7 +1,7 @@
URL: http://code.google.com/p/googletest/
Version: 1.7.0
URL: https://github.com/google/googletest
Version: 1.8.0
License: BSD
License File: COPYING
License File: LICENSE
Description:
Google's framework for writing C++ tests on a variety of platforms
@@ -12,10 +12,13 @@ failures, various options for running the tests, and XML test report
generation.
Local Modifications:
- Removed unused declarations of kPathSeparatorString to have warning
free build.
- Added GTEST_ATTRIBUTE_UNUSED_ to test registering dummies in TEST_P
and INSTANTIATE_TEST_CASE_P to remove warnings about unused variables
under GCC 5.
- Only define g_in_fast_death_test_child for non-Windows builds; quiets an
unused variable warning.
- Remove everything but:
googletest-release-1.8.0/googletest/
CHANGES
CONTRIBUTORS
include
LICENSE
README.md
src
- Suppress unsigned overflow instrumentation in the LCG
https://github.com/google/googletest/pull/1066

View File

@@ -1,435 +0,0 @@
Google C++ Testing Framework
============================
http://code.google.com/p/googletest/
Overview
--------
Google's framework for writing C++ tests on a variety of platforms
(Linux, Mac OS X, Windows, Windows CE, Symbian, etc). Based on the
xUnit architecture. Supports automatic test discovery, a rich set of
assertions, user-defined assertions, death tests, fatal and non-fatal
failures, various options for running the tests, and XML test report
generation.
Please see the project page above for more information as well as the
mailing list for questions, discussions, and development. There is
also an IRC channel on OFTC (irc.oftc.net) #gtest available. Please
join us!
Requirements for End Users
--------------------------
Google Test is designed to have fairly minimal requirements to build
and use with your projects, but there are some. Currently, we support
Linux, Windows, Mac OS X, and Cygwin. We will also make our best
effort to support other platforms (e.g. Solaris, AIX, and z/OS).
However, since core members of the Google Test project have no access
to these platforms, Google Test may have outstanding issues there. If
you notice any problems on your platform, please notify
googletestframework@googlegroups.com. Patches for fixing them are
even more welcome!
### Linux Requirements ###
These are the base requirements to build and use Google Test from a source
package (as described below):
* GNU-compatible Make or gmake
* POSIX-standard shell
* POSIX(-2) Regular Expressions (regex.h)
* A C++98-standard-compliant compiler
### Windows Requirements ###
* Microsoft Visual C++ 7.1 or newer
### Cygwin Requirements ###
* Cygwin 1.5.25-14 or newer
### Mac OS X Requirements ###
* Mac OS X 10.4 Tiger or newer
* Developer Tools Installed
Also, you'll need CMake 2.6.4 or higher if you want to build the
samples using the provided CMake script, regardless of the platform.
Requirements for Contributors
-----------------------------
We welcome patches. If you plan to contribute a patch, you need to
build Google Test and its own tests from an SVN checkout (described
below), which has further requirements:
* Python version 2.3 or newer (for running some of the tests and
re-generating certain source files from templates)
* CMake 2.6.4 or newer
Getting the Source
------------------
There are two primary ways of getting Google Test's source code: you
can download a stable source release in your preferred archive format,
or directly check out the source from our Subversion (SVN) repositary.
The SVN checkout requires a few extra steps and some extra software
packages on your system, but lets you track the latest development and
make patches much more easily, so we highly encourage it.
### Source Package ###
Google Test is released in versioned source packages which can be
downloaded from the download page [1]. Several different archive
formats are provided, but the only difference is the tools used to
manipulate them, and the size of the resulting file. Download
whichever you are most comfortable with.
[1] http://code.google.com/p/googletest/downloads/list
Once the package is downloaded, expand it using whichever tools you
prefer for that type. This will result in a new directory with the
name "gtest-X.Y.Z" which contains all of the source code. Here are
some examples on Linux:
tar -xvzf gtest-X.Y.Z.tar.gz
tar -xvjf gtest-X.Y.Z.tar.bz2
unzip gtest-X.Y.Z.zip
### SVN Checkout ###
To check out the main branch (also known as the "trunk") of Google
Test, run the following Subversion command:
svn checkout http://googletest.googlecode.com/svn/trunk/ gtest-svn
Setting up the Build
--------------------
To build Google Test and your tests that use it, you need to tell your
build system where to find its headers and source files. The exact
way to do it depends on which build system you use, and is usually
straightforward.
### Generic Build Instructions ###
Suppose you put Google Test in directory ${GTEST_DIR}. To build it,
create a library build target (or a project as called by Visual Studio
and Xcode) to compile
${GTEST_DIR}/src/gtest-all.cc
with ${GTEST_DIR}/include in the system header search path and ${GTEST_DIR}
in the normal header search path. Assuming a Linux-like system and gcc,
something like the following will do:
g++ -isystem ${GTEST_DIR}/include -I${GTEST_DIR} \
-pthread -c ${GTEST_DIR}/src/gtest-all.cc
ar -rv libgtest.a gtest-all.o
(We need -pthread as Google Test uses threads.)
Next, you should compile your test source file with
${GTEST_DIR}/include in the system header search path, and link it
with gtest and any other necessary libraries:
g++ -isystem ${GTEST_DIR}/include -pthread path/to/your_test.cc libgtest.a \
-o your_test
As an example, the make/ directory contains a Makefile that you can
use to build Google Test on systems where GNU make is available
(e.g. Linux, Mac OS X, and Cygwin). It doesn't try to build Google
Test's own tests. Instead, it just builds the Google Test library and
a sample test. You can use it as a starting point for your own build
script.
If the default settings are correct for your environment, the
following commands should succeed:
cd ${GTEST_DIR}/make
make
./sample1_unittest
If you see errors, try to tweak the contents of make/Makefile to make
them go away. There are instructions in make/Makefile on how to do
it.
### Using CMake ###
Google Test comes with a CMake build script (CMakeLists.txt) that can
be used on a wide range of platforms ("C" stands for cross-platofrm.).
If you don't have CMake installed already, you can download it for
free from http://www.cmake.org/.
CMake works by generating native makefiles or build projects that can
be used in the compiler environment of your choice. The typical
workflow starts with:
mkdir mybuild # Create a directory to hold the build output.
cd mybuild
cmake ${GTEST_DIR} # Generate native build scripts.
If you want to build Google Test's samples, you should replace the
last command with
cmake -Dgtest_build_samples=ON ${GTEST_DIR}
If you are on a *nix system, you should now see a Makefile in the
current directory. Just type 'make' to build gtest.
If you use Windows and have Vistual Studio installed, a gtest.sln file
and several .vcproj files will be created. You can then build them
using Visual Studio.
On Mac OS X with Xcode installed, a .xcodeproj file will be generated.
### Legacy Build Scripts ###
Before settling on CMake, we have been providing hand-maintained build
projects/scripts for Visual Studio, Xcode, and Autotools. While we
continue to provide them for convenience, they are not actively
maintained any more. We highly recommend that you follow the
instructions in the previous two sections to integrate Google Test
with your existing build system.
If you still need to use the legacy build scripts, here's how:
The msvc\ folder contains two solutions with Visual C++ projects.
Open the gtest.sln or gtest-md.sln file using Visual Studio, and you
are ready to build Google Test the same way you build any Visual
Studio project. Files that have names ending with -md use DLL
versions of Microsoft runtime libraries (the /MD or the /MDd compiler
option). Files without that suffix use static versions of the runtime
libraries (the /MT or the /MTd option). Please note that one must use
the same option to compile both gtest and the test code. If you use
Visual Studio 2005 or above, we recommend the -md version as /MD is
the default for new projects in these versions of Visual Studio.
On Mac OS X, open the gtest.xcodeproj in the xcode/ folder using
Xcode. Build the "gtest" target. The universal binary framework will
end up in your selected build directory (selected in the Xcode
"Preferences..." -> "Building" pane and defaults to xcode/build).
Alternatively, at the command line, enter:
xcodebuild
This will build the "Release" configuration of gtest.framework in your
default build location. See the "xcodebuild" man page for more
information about building different configurations and building in
different locations.
If you wish to use the Google Test Xcode project with Xcode 4.x and
above, you need to either:
* update the SDK configuration options in xcode/Config/General.xconfig.
Comment options SDKROOT, MACOS_DEPLOYMENT_TARGET, and GCC_VERSION. If
you choose this route you lose the ability to target earlier versions
of MacOS X.
* Install an SDK for an earlier version. This doesn't appear to be
supported by Apple, but has been reported to work
(http://stackoverflow.com/questions/5378518).
Tweaking Google Test
--------------------
Google Test can be used in diverse environments. The default
configuration may not work (or may not work well) out of the box in
some environments. However, you can easily tweak Google Test by
defining control macros on the compiler command line. Generally,
these macros are named like GTEST_XYZ and you define them to either 1
or 0 to enable or disable a certain feature.
We list the most frequently used macros below. For a complete list,
see file include/gtest/internal/gtest-port.h.
### Choosing a TR1 Tuple Library ###
Some Google Test features require the C++ Technical Report 1 (TR1)
tuple library, which is not yet available with all compilers. The
good news is that Google Test implements a subset of TR1 tuple that's
enough for its own need, and will automatically use this when the
compiler doesn't provide TR1 tuple.
Usually you don't need to care about which tuple library Google Test
uses. However, if your project already uses TR1 tuple, you need to
tell Google Test to use the same TR1 tuple library the rest of your
project uses, or the two tuple implementations will clash. To do
that, add
-DGTEST_USE_OWN_TR1_TUPLE=0
to the compiler flags while compiling Google Test and your tests. If
you want to force Google Test to use its own tuple library, just add
-DGTEST_USE_OWN_TR1_TUPLE=1
to the compiler flags instead.
If you don't want Google Test to use tuple at all, add
-DGTEST_HAS_TR1_TUPLE=0
and all features using tuple will be disabled.
### Multi-threaded Tests ###
Google Test is thread-safe where the pthread library is available.
After #include "gtest/gtest.h", you can check the GTEST_IS_THREADSAFE
macro to see whether this is the case (yes if the macro is #defined to
1, no if it's undefined.).
If Google Test doesn't correctly detect whether pthread is available
in your environment, you can force it with
-DGTEST_HAS_PTHREAD=1
or
-DGTEST_HAS_PTHREAD=0
When Google Test uses pthread, you may need to add flags to your
compiler and/or linker to select the pthread library, or you'll get
link errors. If you use the CMake script or the deprecated Autotools
script, this is taken care of for you. If you use your own build
script, you'll need to read your compiler and linker's manual to
figure out what flags to add.
### As a Shared Library (DLL) ###
Google Test is compact, so most users can build and link it as a
static library for the simplicity. You can choose to use Google Test
as a shared library (known as a DLL on Windows) if you prefer.
To compile *gtest* as a shared library, add
-DGTEST_CREATE_SHARED_LIBRARY=1
to the compiler flags. You'll also need to tell the linker to produce
a shared library instead - consult your linker's manual for how to do
it.
To compile your *tests* that use the gtest shared library, add
-DGTEST_LINKED_AS_SHARED_LIBRARY=1
to the compiler flags.
Note: while the above steps aren't technically necessary today when
using some compilers (e.g. GCC), they may become necessary in the
future, if we decide to improve the speed of loading the library (see
http://gcc.gnu.org/wiki/Visibility for details). Therefore you are
recommended to always add the above flags when using Google Test as a
shared library. Otherwise a future release of Google Test may break
your build script.
### Avoiding Macro Name Clashes ###
In C++, macros don't obey namespaces. Therefore two libraries that
both define a macro of the same name will clash if you #include both
definitions. In case a Google Test macro clashes with another
library, you can force Google Test to rename its macro to avoid the
conflict.
Specifically, if both Google Test and some other code define macro
FOO, you can add
-DGTEST_DONT_DEFINE_FOO=1
to the compiler flags to tell Google Test to change the macro's name
from FOO to GTEST_FOO. Currently FOO can be FAIL, SUCCEED, or TEST.
For example, with -DGTEST_DONT_DEFINE_TEST=1, you'll need to write
GTEST_TEST(SomeTest, DoesThis) { ... }
instead of
TEST(SomeTest, DoesThis) { ... }
in order to define a test.
Upgrating from an Earlier Version
---------------------------------
We strive to keep Google Test releases backward compatible.
Sometimes, though, we have to make some breaking changes for the
users' long-term benefits. This section describes what you'll need to
do if you are upgrading from an earlier version of Google Test.
### Upgrading from 1.3.0 or Earlier ###
You may need to explicitly enable or disable Google Test's own TR1
tuple library. See the instructions in section "Choosing a TR1 Tuple
Library".
### Upgrading from 1.4.0 or Earlier ###
The Autotools build script (configure + make) is no longer officially
supportted. You are encouraged to migrate to your own build system or
use CMake. If you still need to use Autotools, you can find
instructions in the README file from Google Test 1.4.0.
On platforms where the pthread library is available, Google Test uses
it in order to be thread-safe. See the "Multi-threaded Tests" section
for what this means to your build script.
If you use Microsoft Visual C++ 7.1 with exceptions disabled, Google
Test will no longer compile. This should affect very few people, as a
large portion of STL (including <string>) doesn't compile in this mode
anyway. We decided to stop supporting it in order to greatly simplify
Google Test's implementation.
Developing Google Test
----------------------
This section discusses how to make your own changes to Google Test.
### Testing Google Test Itself ###
To make sure your changes work as intended and don't break existing
functionality, you'll want to compile and run Google Test's own tests.
For that you can use CMake:
mkdir mybuild
cd mybuild
cmake -Dgtest_build_tests=ON ${GTEST_DIR}
Make sure you have Python installed, as some of Google Test's tests
are written in Python. If the cmake command complains about not being
able to find Python ("Could NOT find PythonInterp (missing:
PYTHON_EXECUTABLE)"), try telling it explicitly where your Python
executable can be found:
cmake -DPYTHON_EXECUTABLE=path/to/python -Dgtest_build_tests=ON ${GTEST_DIR}
Next, you can build Google Test and all of its own tests. On *nix,
this is usually done by 'make'. To run the tests, do
make test
All tests should pass.
### Regenerating Source Files ###
Some of Google Test's source files are generated from templates (not
in the C++ sense) using a script. A template file is named FOO.pump,
where FOO is the name of the file it will generate. For example, the
file include/gtest/internal/gtest-type-util.h.pump is used to generate
gtest-type-util.h in the same directory.
Normally you don't need to worry about regenerating the source files,
unless you need to modify them. In that case, you should modify the
corresponding .pump files instead and run the pump.py Python script to
regenerate them. You can find pump.py in the scripts/ directory.
Read the Pump manual [2] for how to use it.
[2] http://code.google.com/p/googletest/wiki/PumpManual
### Contributing a Patch ###
We welcome patches. Please read the Google Test developer's guide [3]
for how you can contribute. In particular, make sure you have signed
the Contributor License Agreement, or we won't be able to accept the
patch.
[3] http://code.google.com/p/googletest/wiki/GoogleTestDevGuide
Happy testing!

280
third_party/googletest/src/README.md vendored Normal file
View File

@@ -0,0 +1,280 @@
### Generic Build Instructions ###
#### Setup ####
To build Google Test and your tests that use it, you need to tell your
build system where to find its headers and source files. The exact
way to do it depends on which build system you use, and is usually
straightforward.
#### Build ####
Suppose you put Google Test in directory `${GTEST_DIR}`. To build it,
create a library build target (or a project as called by Visual Studio
and Xcode) to compile
${GTEST_DIR}/src/gtest-all.cc
with `${GTEST_DIR}/include` in the system header search path and `${GTEST_DIR}`
in the normal header search path. Assuming a Linux-like system and gcc,
something like the following will do:
g++ -isystem ${GTEST_DIR}/include -I${GTEST_DIR} \
-pthread -c ${GTEST_DIR}/src/gtest-all.cc
ar -rv libgtest.a gtest-all.o
(We need `-pthread` as Google Test uses threads.)
Next, you should compile your test source file with
`${GTEST_DIR}/include` in the system header search path, and link it
with gtest and any other necessary libraries:
g++ -isystem ${GTEST_DIR}/include -pthread path/to/your_test.cc libgtest.a \
-o your_test
As an example, the make/ directory contains a Makefile that you can
use to build Google Test on systems where GNU make is available
(e.g. Linux, Mac OS X, and Cygwin). It doesn't try to build Google
Test's own tests. Instead, it just builds the Google Test library and
a sample test. You can use it as a starting point for your own build
script.
If the default settings are correct for your environment, the
following commands should succeed:
cd ${GTEST_DIR}/make
make
./sample1_unittest
If you see errors, try to tweak the contents of `make/Makefile` to make
them go away. There are instructions in `make/Makefile` on how to do
it.
### Using CMake ###
Google Test comes with a CMake build script (
[CMakeLists.txt](CMakeLists.txt)) that can be used on a wide range of platforms ("C" stands for
cross-platform.). If you don't have CMake installed already, you can
download it for free from <http://www.cmake.org/>.
CMake works by generating native makefiles or build projects that can
be used in the compiler environment of your choice. The typical
workflow starts with:
mkdir mybuild # Create a directory to hold the build output.
cd mybuild
cmake ${GTEST_DIR} # Generate native build scripts.
If you want to build Google Test's samples, you should replace the
last command with
cmake -Dgtest_build_samples=ON ${GTEST_DIR}
If you are on a \*nix system, you should now see a Makefile in the
current directory. Just type 'make' to build gtest.
If you use Windows and have Visual Studio installed, a `gtest.sln` file
and several `.vcproj` files will be created. You can then build them
using Visual Studio.
On Mac OS X with Xcode installed, a `.xcodeproj` file will be generated.
### Legacy Build Scripts ###
Before settling on CMake, we have been providing hand-maintained build
projects/scripts for Visual Studio, Xcode, and Autotools. While we
continue to provide them for convenience, they are not actively
maintained any more. We highly recommend that you follow the
instructions in the previous two sections to integrate Google Test
with your existing build system.
If you still need to use the legacy build scripts, here's how:
The msvc\ folder contains two solutions with Visual C++ projects.
Open the `gtest.sln` or `gtest-md.sln` file using Visual Studio, and you
are ready to build Google Test the same way you build any Visual
Studio project. Files that have names ending with -md use DLL
versions of Microsoft runtime libraries (the /MD or the /MDd compiler
option). Files without that suffix use static versions of the runtime
libraries (the /MT or the /MTd option). Please note that one must use
the same option to compile both gtest and the test code. If you use
Visual Studio 2005 or above, we recommend the -md version as /MD is
the default for new projects in these versions of Visual Studio.
On Mac OS X, open the `gtest.xcodeproj` in the `xcode/` folder using
Xcode. Build the "gtest" target. The universal binary framework will
end up in your selected build directory (selected in the Xcode
"Preferences..." -> "Building" pane and defaults to xcode/build).
Alternatively, at the command line, enter:
xcodebuild
This will build the "Release" configuration of gtest.framework in your
default build location. See the "xcodebuild" man page for more
information about building different configurations and building in
different locations.
If you wish to use the Google Test Xcode project with Xcode 4.x and
above, you need to either:
* update the SDK configuration options in xcode/Config/General.xconfig.
Comment options `SDKROOT`, `MACOS_DEPLOYMENT_TARGET`, and `GCC_VERSION`. If
you choose this route you lose the ability to target earlier versions
of MacOS X.
* Install an SDK for an earlier version. This doesn't appear to be
supported by Apple, but has been reported to work
(http://stackoverflow.com/questions/5378518).
### Tweaking Google Test ###
Google Test can be used in diverse environments. The default
configuration may not work (or may not work well) out of the box in
some environments. However, you can easily tweak Google Test by
defining control macros on the compiler command line. Generally,
these macros are named like `GTEST_XYZ` and you define them to either 1
or 0 to enable or disable a certain feature.
We list the most frequently used macros below. For a complete list,
see file [include/gtest/internal/gtest-port.h](include/gtest/internal/gtest-port.h).
### Choosing a TR1 Tuple Library ###
Some Google Test features require the C++ Technical Report 1 (TR1)
tuple library, which is not yet available with all compilers. The
good news is that Google Test implements a subset of TR1 tuple that's
enough for its own need, and will automatically use this when the
compiler doesn't provide TR1 tuple.
Usually you don't need to care about which tuple library Google Test
uses. However, if your project already uses TR1 tuple, you need to
tell Google Test to use the same TR1 tuple library the rest of your
project uses, or the two tuple implementations will clash. To do
that, add
-DGTEST_USE_OWN_TR1_TUPLE=0
to the compiler flags while compiling Google Test and your tests. If
you want to force Google Test to use its own tuple library, just add
-DGTEST_USE_OWN_TR1_TUPLE=1
to the compiler flags instead.
If you don't want Google Test to use tuple at all, add
-DGTEST_HAS_TR1_TUPLE=0
and all features using tuple will be disabled.
### Multi-threaded Tests ###
Google Test is thread-safe where the pthread library is available.
After `#include "gtest/gtest.h"`, you can check the `GTEST_IS_THREADSAFE`
macro to see whether this is the case (yes if the macro is `#defined` to
1, no if it's undefined.).
If Google Test doesn't correctly detect whether pthread is available
in your environment, you can force it with
-DGTEST_HAS_PTHREAD=1
or
-DGTEST_HAS_PTHREAD=0
When Google Test uses pthread, you may need to add flags to your
compiler and/or linker to select the pthread library, or you'll get
link errors. If you use the CMake script or the deprecated Autotools
script, this is taken care of for you. If you use your own build
script, you'll need to read your compiler and linker's manual to
figure out what flags to add.
### As a Shared Library (DLL) ###
Google Test is compact, so most users can build and link it as a
static library for the simplicity. You can choose to use Google Test
as a shared library (known as a DLL on Windows) if you prefer.
To compile *gtest* as a shared library, add
-DGTEST_CREATE_SHARED_LIBRARY=1
to the compiler flags. You'll also need to tell the linker to produce
a shared library instead - consult your linker's manual for how to do
it.
To compile your *tests* that use the gtest shared library, add
-DGTEST_LINKED_AS_SHARED_LIBRARY=1
to the compiler flags.
Note: while the above steps aren't technically necessary today when
using some compilers (e.g. GCC), they may become necessary in the
future, if we decide to improve the speed of loading the library (see
<http://gcc.gnu.org/wiki/Visibility> for details). Therefore you are
recommended to always add the above flags when using Google Test as a
shared library. Otherwise a future release of Google Test may break
your build script.
### Avoiding Macro Name Clashes ###
In C++, macros don't obey namespaces. Therefore two libraries that
both define a macro of the same name will clash if you `#include` both
definitions. In case a Google Test macro clashes with another
library, you can force Google Test to rename its macro to avoid the
conflict.
Specifically, if both Google Test and some other code define macro
FOO, you can add
-DGTEST_DONT_DEFINE_FOO=1
to the compiler flags to tell Google Test to change the macro's name
from `FOO` to `GTEST_FOO`. Currently `FOO` can be `FAIL`, `SUCCEED`,
or `TEST`. For example, with `-DGTEST_DONT_DEFINE_TEST=1`, you'll
need to write
GTEST_TEST(SomeTest, DoesThis) { ... }
instead of
TEST(SomeTest, DoesThis) { ... }
in order to define a test.
## Developing Google Test ##
This section discusses how to make your own changes to Google Test.
### Testing Google Test Itself ###
To make sure your changes work as intended and don't break existing
functionality, you'll want to compile and run Google Test's own tests.
For that you can use CMake:
mkdir mybuild
cd mybuild
cmake -Dgtest_build_tests=ON ${GTEST_DIR}
Make sure you have Python installed, as some of Google Test's tests
are written in Python. If the cmake command complains about not being
able to find Python (`Could NOT find PythonInterp (missing:
PYTHON_EXECUTABLE)`), try telling it explicitly where your Python
executable can be found:
cmake -DPYTHON_EXECUTABLE=path/to/python -Dgtest_build_tests=ON ${GTEST_DIR}
Next, you can build Google Test and all of its own tests. On \*nix,
this is usually done by 'make'. To run the tests, do
make test
All tests should pass.
Normally you don't need to worry about regenerating the source files,
unless you need to modify them. In that case, you should modify the
corresponding .pump files instead and run the pump.py Python script to
regenerate them. You can find pump.py in the [scripts/](scripts/) directory.
Read the [Pump manual](docs/PumpManual.md) for how to use it.

View File

@@ -0,0 +1,294 @@
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// The Google C++ Testing Framework (Google Test)
//
// This header file defines the public API for death tests. It is
// #included by gtest.h so a user doesn't need to include this
// directly.
#ifndef GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_
#include "gtest/internal/gtest-death-test-internal.h"
namespace testing {
// This flag controls the style of death tests. Valid values are "threadsafe",
// meaning that the death test child process will re-execute the test binary
// from the start, running only a single death test, or "fast",
// meaning that the child process will execute the test logic immediately
// after forking.
GTEST_DECLARE_string_(death_test_style);
#if GTEST_HAS_DEATH_TEST
namespace internal {
// Returns a Boolean value indicating whether the caller is currently
// executing in the context of the death test child process. Tools such as
// Valgrind heap checkers may need this to modify their behavior in death
// tests. IMPORTANT: This is an internal utility. Using it may break the
// implementation of death tests. User code MUST NOT use it.
GTEST_API_ bool InDeathTestChild();
} // namespace internal
// The following macros are useful for writing death tests.
// Here's what happens when an ASSERT_DEATH* or EXPECT_DEATH* is
// executed:
//
// 1. It generates a warning if there is more than one active
// thread. This is because it's safe to fork() or clone() only
// when there is a single thread.
//
// 2. The parent process clone()s a sub-process and runs the death
// test in it; the sub-process exits with code 0 at the end of the
// death test, if it hasn't exited already.
//
// 3. The parent process waits for the sub-process to terminate.
//
// 4. The parent process checks the exit code and error message of
// the sub-process.
//
// Examples:
//
// ASSERT_DEATH(server.SendMessage(56, "Hello"), "Invalid port number");
// for (int i = 0; i < 5; i++) {
// EXPECT_DEATH(server.ProcessRequest(i),
// "Invalid request .* in ProcessRequest()")
// << "Failed to die on request " << i;
// }
//
// ASSERT_EXIT(server.ExitNow(), ::testing::ExitedWithCode(0), "Exiting");
//
// bool KilledBySIGHUP(int exit_code) {
// return WIFSIGNALED(exit_code) && WTERMSIG(exit_code) == SIGHUP;
// }
//
// ASSERT_EXIT(client.HangUpServer(), KilledBySIGHUP, "Hanging up!");
//
// On the regular expressions used in death tests:
//
// On POSIX-compliant systems (*nix), we use the <regex.h> library,
// which uses the POSIX extended regex syntax.
//
// On other platforms (e.g. Windows), we only support a simple regex
// syntax implemented as part of Google Test. This limited
// implementation should be enough most of the time when writing
// death tests; though it lacks many features you can find in PCRE
// or POSIX extended regex syntax. For example, we don't support
// union ("x|y"), grouping ("(xy)"), brackets ("[xy]"), and
// repetition count ("x{5,7}"), among others.
//
// Below is the syntax that we do support. We chose it to be a
// subset of both PCRE and POSIX extended regex, so it's easy to
// learn wherever you come from. In the following: 'A' denotes a
// literal character, period (.), or a single \\ escape sequence;
// 'x' and 'y' denote regular expressions; 'm' and 'n' are for
// natural numbers.
//
// c matches any literal character c
// \\d matches any decimal digit
// \\D matches any character that's not a decimal digit
// \\f matches \f
// \\n matches \n
// \\r matches \r
// \\s matches any ASCII whitespace, including \n
// \\S matches any character that's not a whitespace
// \\t matches \t
// \\v matches \v
// \\w matches any letter, _, or decimal digit
// \\W matches any character that \\w doesn't match
// \\c matches any literal character c, which must be a punctuation
// . matches any single character except \n
// A? matches 0 or 1 occurrences of A
// A* matches 0 or many occurrences of A
// A+ matches 1 or many occurrences of A
// ^ matches the beginning of a string (not that of each line)
// $ matches the end of a string (not that of each line)
// xy matches x followed by y
//
// If you accidentally use PCRE or POSIX extended regex features
// not implemented by us, you will get a run-time failure. In that
// case, please try to rewrite your regular expression within the
// above syntax.
//
// This implementation is *not* meant to be as highly tuned or robust
// as a compiled regex library, but should perform well enough for a
// death test, which already incurs significant overhead by launching
// a child process.
//
// Known caveats:
//
// A "threadsafe" style death test obtains the path to the test
// program from argv[0] and re-executes it in the sub-process. For
// simplicity, the current implementation doesn't search the PATH
// when launching the sub-process. This means that the user must
// invoke the test program via a path that contains at least one
// path separator (e.g. path/to/foo_test and
// /absolute/path/to/bar_test are fine, but foo_test is not). This
// is rarely a problem as people usually don't put the test binary
// directory in PATH.
//
// TODO(wan@google.com): make thread-safe death tests search the PATH.
// Asserts that a given statement causes the program to exit, with an
// integer exit status that satisfies predicate, and emitting error output
// that matches regex.
# define ASSERT_EXIT(statement, predicate, regex) \
GTEST_DEATH_TEST_(statement, predicate, regex, GTEST_FATAL_FAILURE_)
// Like ASSERT_EXIT, but continues on to successive tests in the
// test case, if any:
# define EXPECT_EXIT(statement, predicate, regex) \
GTEST_DEATH_TEST_(statement, predicate, regex, GTEST_NONFATAL_FAILURE_)
// Asserts that a given statement causes the program to exit, either by
// explicitly exiting with a nonzero exit code or being killed by a
// signal, and emitting error output that matches regex.
# define ASSERT_DEATH(statement, regex) \
ASSERT_EXIT(statement, ::testing::internal::ExitedUnsuccessfully, regex)
// Like ASSERT_DEATH, but continues on to successive tests in the
// test case, if any:
# define EXPECT_DEATH(statement, regex) \
EXPECT_EXIT(statement, ::testing::internal::ExitedUnsuccessfully, regex)
// Two predicate classes that can be used in {ASSERT,EXPECT}_EXIT*:
// Tests that an exit code describes a normal exit with a given exit code.
class GTEST_API_ ExitedWithCode {
public:
explicit ExitedWithCode(int exit_code);
bool operator()(int exit_status) const;
private:
// No implementation - assignment is unsupported.
void operator=(const ExitedWithCode& other);
const int exit_code_;
};
# if !GTEST_OS_WINDOWS
// Tests that an exit code describes an exit due to termination by a
// given signal.
class GTEST_API_ KilledBySignal {
public:
explicit KilledBySignal(int signum);
bool operator()(int exit_status) const;
private:
const int signum_;
};
# endif // !GTEST_OS_WINDOWS
// EXPECT_DEBUG_DEATH asserts that the given statements die in debug mode.
// The death testing framework causes this to have interesting semantics,
// since the sideeffects of the call are only visible in opt mode, and not
// in debug mode.
//
// In practice, this can be used to test functions that utilize the
// LOG(DFATAL) macro using the following style:
//
// int DieInDebugOr12(int* sideeffect) {
// if (sideeffect) {
// *sideeffect = 12;
// }
// LOG(DFATAL) << "death";
// return 12;
// }
//
// TEST(TestCase, TestDieOr12WorksInDgbAndOpt) {
// int sideeffect = 0;
// // Only asserts in dbg.
// EXPECT_DEBUG_DEATH(DieInDebugOr12(&sideeffect), "death");
//
// #ifdef NDEBUG
// // opt-mode has sideeffect visible.
// EXPECT_EQ(12, sideeffect);
// #else
// // dbg-mode no visible sideeffect.
// EXPECT_EQ(0, sideeffect);
// #endif
// }
//
// This will assert that DieInDebugReturn12InOpt() crashes in debug
// mode, usually due to a DCHECK or LOG(DFATAL), but returns the
// appropriate fallback value (12 in this case) in opt mode. If you
// need to test that a function has appropriate side-effects in opt
// mode, include assertions against the side-effects. A general
// pattern for this is:
//
// EXPECT_DEBUG_DEATH({
// // Side-effects here will have an effect after this statement in
// // opt mode, but none in debug mode.
// EXPECT_EQ(12, DieInDebugOr12(&sideeffect));
// }, "death");
//
# ifdef NDEBUG
# define EXPECT_DEBUG_DEATH(statement, regex) \
GTEST_EXECUTE_STATEMENT_(statement, regex)
# define ASSERT_DEBUG_DEATH(statement, regex) \
GTEST_EXECUTE_STATEMENT_(statement, regex)
# else
# define EXPECT_DEBUG_DEATH(statement, regex) \
EXPECT_DEATH(statement, regex)
# define ASSERT_DEBUG_DEATH(statement, regex) \
ASSERT_DEATH(statement, regex)
# endif // NDEBUG for EXPECT_DEBUG_DEATH
#endif // GTEST_HAS_DEATH_TEST
// EXPECT_DEATH_IF_SUPPORTED(statement, regex) and
// ASSERT_DEATH_IF_SUPPORTED(statement, regex) expand to real death tests if
// death tests are supported; otherwise they just issue a warning. This is
// useful when you are combining death test assertions with normal test
// assertions in one test.
#if GTEST_HAS_DEATH_TEST
# define EXPECT_DEATH_IF_SUPPORTED(statement, regex) \
EXPECT_DEATH(statement, regex)
# define ASSERT_DEATH_IF_SUPPORTED(statement, regex) \
ASSERT_DEATH(statement, regex)
#else
# define EXPECT_DEATH_IF_SUPPORTED(statement, regex) \
GTEST_UNSUPPORTED_DEATH_TEST_(statement, regex, )
# define ASSERT_DEATH_IF_SUPPORTED(statement, regex) \
GTEST_UNSUPPORTED_DEATH_TEST_(statement, regex, return)
#endif
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_DEATH_TEST_H_

View File

@@ -0,0 +1,250 @@
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// The Google C++ Testing Framework (Google Test)
//
// This header file defines the Message class.
//
// IMPORTANT NOTE: Due to limitation of the C++ language, we have to
// leave some internal implementation details in this header file.
// They are clearly marked by comments like this:
//
// // INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
//
// Such code is NOT meant to be used by a user directly, and is subject
// to CHANGE WITHOUT NOTICE. Therefore DO NOT DEPEND ON IT in a user
// program!
#ifndef GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_
#define GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_
#include <limits>
#include "gtest/internal/gtest-port.h"
// Ensures that there is at least one operator<< in the global namespace.
// See Message& operator<<(...) below for why.
void operator<<(const testing::internal::Secret&, int);
namespace testing {
// The Message class works like an ostream repeater.
//
// Typical usage:
//
// 1. You stream a bunch of values to a Message object.
// It will remember the text in a stringstream.
// 2. Then you stream the Message object to an ostream.
// This causes the text in the Message to be streamed
// to the ostream.
//
// For example;
//
// testing::Message foo;
// foo << 1 << " != " << 2;
// std::cout << foo;
//
// will print "1 != 2".
//
// Message is not intended to be inherited from. In particular, its
// destructor is not virtual.
//
// Note that stringstream behaves differently in gcc and in MSVC. You
// can stream a NULL char pointer to it in the former, but not in the
// latter (it causes an access violation if you do). The Message
// class hides this difference by treating a NULL char pointer as
// "(null)".
class GTEST_API_ Message {
private:
// The type of basic IO manipulators (endl, ends, and flush) for
// narrow streams.
typedef std::ostream& (*BasicNarrowIoManip)(std::ostream&);
public:
// Constructs an empty Message.
Message();
// Copy constructor.
Message(const Message& msg) : ss_(new ::std::stringstream) { // NOLINT
*ss_ << msg.GetString();
}
// Constructs a Message from a C-string.
explicit Message(const char* str) : ss_(new ::std::stringstream) {
*ss_ << str;
}
#if GTEST_OS_SYMBIAN
// Streams a value (either a pointer or not) to this object.
template <typename T>
inline Message& operator <<(const T& value) {
StreamHelper(typename internal::is_pointer<T>::type(), value);
return *this;
}
#else
// Streams a non-pointer value to this object.
template <typename T>
inline Message& operator <<(const T& val) {
// Some libraries overload << for STL containers. These
// overloads are defined in the global namespace instead of ::std.
//
// C++'s symbol lookup rule (i.e. Koenig lookup) says that these
// overloads are visible in either the std namespace or the global
// namespace, but not other namespaces, including the testing
// namespace which Google Test's Message class is in.
//
// To allow STL containers (and other types that has a << operator
// defined in the global namespace) to be used in Google Test
// assertions, testing::Message must access the custom << operator
// from the global namespace. With this using declaration,
// overloads of << defined in the global namespace and those
// visible via Koenig lookup are both exposed in this function.
using ::operator <<;
*ss_ << val;
return *this;
}
// Streams a pointer value to this object.
//
// This function is an overload of the previous one. When you
// stream a pointer to a Message, this definition will be used as it
// is more specialized. (The C++ Standard, section
// [temp.func.order].) If you stream a non-pointer, then the
// previous definition will be used.
//
// The reason for this overload is that streaming a NULL pointer to
// ostream is undefined behavior. Depending on the compiler, you
// may get "0", "(nil)", "(null)", or an access violation. To
// ensure consistent result across compilers, we always treat NULL
// as "(null)".
template <typename T>
inline Message& operator <<(T* const& pointer) { // NOLINT
if (pointer == NULL) {
*ss_ << "(null)";
} else {
*ss_ << pointer;
}
return *this;
}
#endif // GTEST_OS_SYMBIAN
// Since the basic IO manipulators are overloaded for both narrow
// and wide streams, we have to provide this specialized definition
// of operator <<, even though its body is the same as the
// templatized version above. Without this definition, streaming
// endl or other basic IO manipulators to Message will confuse the
// compiler.
Message& operator <<(BasicNarrowIoManip val) {
*ss_ << val;
return *this;
}
// Instead of 1/0, we want to see true/false for bool values.
Message& operator <<(bool b) {
return *this << (b ? "true" : "false");
}
// These two overloads allow streaming a wide C string to a Message
// using the UTF-8 encoding.
Message& operator <<(const wchar_t* wide_c_str);
Message& operator <<(wchar_t* wide_c_str);
#if GTEST_HAS_STD_WSTRING
// Converts the given wide string to a narrow string using the UTF-8
// encoding, and streams the result to this Message object.
Message& operator <<(const ::std::wstring& wstr);
#endif // GTEST_HAS_STD_WSTRING
#if GTEST_HAS_GLOBAL_WSTRING
// Converts the given wide string to a narrow string using the UTF-8
// encoding, and streams the result to this Message object.
Message& operator <<(const ::wstring& wstr);
#endif // GTEST_HAS_GLOBAL_WSTRING
// Gets the text streamed to this object so far as an std::string.
// Each '\0' character in the buffer is replaced with "\\0".
//
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
std::string GetString() const;
private:
#if GTEST_OS_SYMBIAN
// These are needed as the Nokia Symbian Compiler cannot decide between
// const T& and const T* in a function template. The Nokia compiler _can_
// decide between class template specializations for T and T*, so a
// tr1::type_traits-like is_pointer works, and we can overload on that.
template <typename T>
inline void StreamHelper(internal::true_type /*is_pointer*/, T* pointer) {
if (pointer == NULL) {
*ss_ << "(null)";
} else {
*ss_ << pointer;
}
}
template <typename T>
inline void StreamHelper(internal::false_type /*is_pointer*/,
const T& value) {
// See the comments in Message& operator <<(const T&) above for why
// we need this using statement.
using ::operator <<;
*ss_ << value;
}
#endif // GTEST_OS_SYMBIAN
// We'll hold the text streamed to this object here.
const internal::scoped_ptr< ::std::stringstream> ss_;
// We declare (but don't implement) this to prevent the compiler
// from implementing the assignment operator.
void operator=(const Message&);
};
// Streams a Message to an ostream.
inline std::ostream& operator <<(std::ostream& os, const Message& sb) {
return os << sb.GetString();
}
namespace internal {
// Converts a streamable value to an std::string. A NULL pointer is
// converted to "(null)". When the input value is a ::string,
// ::std::string, ::wstring, or ::std::wstring object, each NUL
// character in it is replaced with "\\0".
template <typename T>
std::string StreamableToString(const T& streamable) {
return (Message() << streamable).GetString();
}
} // namespace internal
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_MESSAGE_H_

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,510 @@
$$ -*- mode: c++; -*-
$var n = 50 $$ Maximum length of Values arguments we want to support.
$var maxtuple = 10 $$ Maximum number of Combine arguments we want to support.
// Copyright 2008, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: vladl@google.com (Vlad Losev)
//
// Macros and functions for implementing parameterized tests
// in Google C++ Testing Framework (Google Test)
//
// This file is generated by a SCRIPT. DO NOT EDIT BY HAND!
//
#ifndef GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_
// Value-parameterized tests allow you to test your code with different
// parameters without writing multiple copies of the same test.
//
// Here is how you use value-parameterized tests:
#if 0
// To write value-parameterized tests, first you should define a fixture
// class. It is usually derived from testing::TestWithParam<T> (see below for
// another inheritance scheme that's sometimes useful in more complicated
// class hierarchies), where the type of your parameter values.
// TestWithParam<T> is itself derived from testing::Test. T can be any
// copyable type. If it's a raw pointer, you are responsible for managing the
// lifespan of the pointed values.
class FooTest : public ::testing::TestWithParam<const char*> {
// You can implement all the usual class fixture members here.
};
// Then, use the TEST_P macro to define as many parameterized tests
// for this fixture as you want. The _P suffix is for "parameterized"
// or "pattern", whichever you prefer to think.
TEST_P(FooTest, DoesBlah) {
// Inside a test, access the test parameter with the GetParam() method
// of the TestWithParam<T> class:
EXPECT_TRUE(foo.Blah(GetParam()));
...
}
TEST_P(FooTest, HasBlahBlah) {
...
}
// Finally, you can use INSTANTIATE_TEST_CASE_P to instantiate the test
// case with any set of parameters you want. Google Test defines a number
// of functions for generating test parameters. They return what we call
// (surprise!) parameter generators. Here is a summary of them, which
// are all in the testing namespace:
//
//
// Range(begin, end [, step]) - Yields values {begin, begin+step,
// begin+step+step, ...}. The values do not
// include end. step defaults to 1.
// Values(v1, v2, ..., vN) - Yields values {v1, v2, ..., vN}.
// ValuesIn(container) - Yields values from a C-style array, an STL
// ValuesIn(begin,end) container, or an iterator range [begin, end).
// Bool() - Yields sequence {false, true}.
// Combine(g1, g2, ..., gN) - Yields all combinations (the Cartesian product
// for the math savvy) of the values generated
// by the N generators.
//
// For more details, see comments at the definitions of these functions below
// in this file.
//
// The following statement will instantiate tests from the FooTest test case
// each with parameter values "meeny", "miny", and "moe".
INSTANTIATE_TEST_CASE_P(InstantiationName,
FooTest,
Values("meeny", "miny", "moe"));
// To distinguish different instances of the pattern, (yes, you
// can instantiate it more then once) the first argument to the
// INSTANTIATE_TEST_CASE_P macro is a prefix that will be added to the
// actual test case name. Remember to pick unique prefixes for different
// instantiations. The tests from the instantiation above will have
// these names:
//
// * InstantiationName/FooTest.DoesBlah/0 for "meeny"
// * InstantiationName/FooTest.DoesBlah/1 for "miny"
// * InstantiationName/FooTest.DoesBlah/2 for "moe"
// * InstantiationName/FooTest.HasBlahBlah/0 for "meeny"
// * InstantiationName/FooTest.HasBlahBlah/1 for "miny"
// * InstantiationName/FooTest.HasBlahBlah/2 for "moe"
//
// You can use these names in --gtest_filter.
//
// This statement will instantiate all tests from FooTest again, each
// with parameter values "cat" and "dog":
const char* pets[] = {"cat", "dog"};
INSTANTIATE_TEST_CASE_P(AnotherInstantiationName, FooTest, ValuesIn(pets));
// The tests from the instantiation above will have these names:
//
// * AnotherInstantiationName/FooTest.DoesBlah/0 for "cat"
// * AnotherInstantiationName/FooTest.DoesBlah/1 for "dog"
// * AnotherInstantiationName/FooTest.HasBlahBlah/0 for "cat"
// * AnotherInstantiationName/FooTest.HasBlahBlah/1 for "dog"
//
// Please note that INSTANTIATE_TEST_CASE_P will instantiate all tests
// in the given test case, whether their definitions come before or
// AFTER the INSTANTIATE_TEST_CASE_P statement.
//
// Please also note that generator expressions (including parameters to the
// generators) are evaluated in InitGoogleTest(), after main() has started.
// This allows the user on one hand, to adjust generator parameters in order
// to dynamically determine a set of tests to run and on the other hand,
// give the user a chance to inspect the generated tests with Google Test
// reflection API before RUN_ALL_TESTS() is executed.
//
// You can see samples/sample7_unittest.cc and samples/sample8_unittest.cc
// for more examples.
//
// In the future, we plan to publish the API for defining new parameter
// generators. But for now this interface remains part of the internal
// implementation and is subject to change.
//
//
// A parameterized test fixture must be derived from testing::Test and from
// testing::WithParamInterface<T>, where T is the type of the parameter
// values. Inheriting from TestWithParam<T> satisfies that requirement because
// TestWithParam<T> inherits from both Test and WithParamInterface. In more
// complicated hierarchies, however, it is occasionally useful to inherit
// separately from Test and WithParamInterface. For example:
class BaseTest : public ::testing::Test {
// You can inherit all the usual members for a non-parameterized test
// fixture here.
};
class DerivedTest : public BaseTest, public ::testing::WithParamInterface<int> {
// The usual test fixture members go here too.
};
TEST_F(BaseTest, HasFoo) {
// This is an ordinary non-parameterized test.
}
TEST_P(DerivedTest, DoesBlah) {
// GetParam works just the same here as if you inherit from TestWithParam.
EXPECT_TRUE(foo.Blah(GetParam()));
}
#endif // 0
#include "gtest/internal/gtest-port.h"
#if !GTEST_OS_SYMBIAN
# include <utility>
#endif
// scripts/fuse_gtest.py depends on gtest's own header being #included
// *unconditionally*. Therefore these #includes cannot be moved
// inside #if GTEST_HAS_PARAM_TEST.
#include "gtest/internal/gtest-internal.h"
#include "gtest/internal/gtest-param-util.h"
#include "gtest/internal/gtest-param-util-generated.h"
#if GTEST_HAS_PARAM_TEST
namespace testing {
// Functions producing parameter generators.
//
// Google Test uses these generators to produce parameters for value-
// parameterized tests. When a parameterized test case is instantiated
// with a particular generator, Google Test creates and runs tests
// for each element in the sequence produced by the generator.
//
// In the following sample, tests from test case FooTest are instantiated
// each three times with parameter values 3, 5, and 8:
//
// class FooTest : public TestWithParam<int> { ... };
//
// TEST_P(FooTest, TestThis) {
// }
// TEST_P(FooTest, TestThat) {
// }
// INSTANTIATE_TEST_CASE_P(TestSequence, FooTest, Values(3, 5, 8));
//
// Range() returns generators providing sequences of values in a range.
//
// Synopsis:
// Range(start, end)
// - returns a generator producing a sequence of values {start, start+1,
// start+2, ..., }.
// Range(start, end, step)
// - returns a generator producing a sequence of values {start, start+step,
// start+step+step, ..., }.
// Notes:
// * The generated sequences never include end. For example, Range(1, 5)
// returns a generator producing a sequence {1, 2, 3, 4}. Range(1, 9, 2)
// returns a generator producing {1, 3, 5, 7}.
// * start and end must have the same type. That type may be any integral or
// floating-point type or a user defined type satisfying these conditions:
// * It must be assignable (have operator=() defined).
// * It must have operator+() (operator+(int-compatible type) for
// two-operand version).
// * It must have operator<() defined.
// Elements in the resulting sequences will also have that type.
// * Condition start < end must be satisfied in order for resulting sequences
// to contain any elements.
//
template <typename T, typename IncrementT>
internal::ParamGenerator<T> Range(T start, T end, IncrementT step) {
return internal::ParamGenerator<T>(
new internal::RangeGenerator<T, IncrementT>(start, end, step));
}
template <typename T>
internal::ParamGenerator<T> Range(T start, T end) {
return Range(start, end, 1);
}
// ValuesIn() function allows generation of tests with parameters coming from
// a container.
//
// Synopsis:
// ValuesIn(const T (&array)[N])
// - returns a generator producing sequences with elements from
// a C-style array.
// ValuesIn(const Container& container)
// - returns a generator producing sequences with elements from
// an STL-style container.
// ValuesIn(Iterator begin, Iterator end)
// - returns a generator producing sequences with elements from
// a range [begin, end) defined by a pair of STL-style iterators. These
// iterators can also be plain C pointers.
//
// Please note that ValuesIn copies the values from the containers
// passed in and keeps them to generate tests in RUN_ALL_TESTS().
//
// Examples:
//
// This instantiates tests from test case StringTest
// each with C-string values of "foo", "bar", and "baz":
//
// const char* strings[] = {"foo", "bar", "baz"};
// INSTANTIATE_TEST_CASE_P(StringSequence, SrtingTest, ValuesIn(strings));
//
// This instantiates tests from test case StlStringTest
// each with STL strings with values "a" and "b":
//
// ::std::vector< ::std::string> GetParameterStrings() {
// ::std::vector< ::std::string> v;
// v.push_back("a");
// v.push_back("b");
// return v;
// }
//
// INSTANTIATE_TEST_CASE_P(CharSequence,
// StlStringTest,
// ValuesIn(GetParameterStrings()));
//
//
// This will also instantiate tests from CharTest
// each with parameter values 'a' and 'b':
//
// ::std::list<char> GetParameterChars() {
// ::std::list<char> list;
// list.push_back('a');
// list.push_back('b');
// return list;
// }
// ::std::list<char> l = GetParameterChars();
// INSTANTIATE_TEST_CASE_P(CharSequence2,
// CharTest,
// ValuesIn(l.begin(), l.end()));
//
template <typename ForwardIterator>
internal::ParamGenerator<
typename ::testing::internal::IteratorTraits<ForwardIterator>::value_type>
ValuesIn(ForwardIterator begin, ForwardIterator end) {
typedef typename ::testing::internal::IteratorTraits<ForwardIterator>
::value_type ParamType;
return internal::ParamGenerator<ParamType>(
new internal::ValuesInIteratorRangeGenerator<ParamType>(begin, end));
}
template <typename T, size_t N>
internal::ParamGenerator<T> ValuesIn(const T (&array)[N]) {
return ValuesIn(array, array + N);
}
template <class Container>
internal::ParamGenerator<typename Container::value_type> ValuesIn(
const Container& container) {
return ValuesIn(container.begin(), container.end());
}
// Values() allows generating tests from explicitly specified list of
// parameters.
//
// Synopsis:
// Values(T v1, T v2, ..., T vN)
// - returns a generator producing sequences with elements v1, v2, ..., vN.
//
// For example, this instantiates tests from test case BarTest each
// with values "one", "two", and "three":
//
// INSTANTIATE_TEST_CASE_P(NumSequence, BarTest, Values("one", "two", "three"));
//
// This instantiates tests from test case BazTest each with values 1, 2, 3.5.
// The exact type of values will depend on the type of parameter in BazTest.
//
// INSTANTIATE_TEST_CASE_P(FloatingNumbers, BazTest, Values(1, 2, 3.5));
//
// Currently, Values() supports from 1 to $n parameters.
//
$range i 1..n
$for i [[
$range j 1..i
template <$for j, [[typename T$j]]>
internal::ValueArray$i<$for j, [[T$j]]> Values($for j, [[T$j v$j]]) {
return internal::ValueArray$i<$for j, [[T$j]]>($for j, [[v$j]]);
}
]]
// Bool() allows generating tests with parameters in a set of (false, true).
//
// Synopsis:
// Bool()
// - returns a generator producing sequences with elements {false, true}.
//
// It is useful when testing code that depends on Boolean flags. Combinations
// of multiple flags can be tested when several Bool()'s are combined using
// Combine() function.
//
// In the following example all tests in the test case FlagDependentTest
// will be instantiated twice with parameters false and true.
//
// class FlagDependentTest : public testing::TestWithParam<bool> {
// virtual void SetUp() {
// external_flag = GetParam();
// }
// }
// INSTANTIATE_TEST_CASE_P(BoolSequence, FlagDependentTest, Bool());
//
inline internal::ParamGenerator<bool> Bool() {
return Values(false, true);
}
# if GTEST_HAS_COMBINE
// Combine() allows the user to combine two or more sequences to produce
// values of a Cartesian product of those sequences' elements.
//
// Synopsis:
// Combine(gen1, gen2, ..., genN)
// - returns a generator producing sequences with elements coming from
// the Cartesian product of elements from the sequences generated by
// gen1, gen2, ..., genN. The sequence elements will have a type of
// tuple<T1, T2, ..., TN> where T1, T2, ..., TN are the types
// of elements from sequences produces by gen1, gen2, ..., genN.
//
// Combine can have up to $maxtuple arguments. This number is currently limited
// by the maximum number of elements in the tuple implementation used by Google
// Test.
//
// Example:
//
// This will instantiate tests in test case AnimalTest each one with
// the parameter values tuple("cat", BLACK), tuple("cat", WHITE),
// tuple("dog", BLACK), and tuple("dog", WHITE):
//
// enum Color { BLACK, GRAY, WHITE };
// class AnimalTest
// : public testing::TestWithParam<tuple<const char*, Color> > {...};
//
// TEST_P(AnimalTest, AnimalLooksNice) {...}
//
// INSTANTIATE_TEST_CASE_P(AnimalVariations, AnimalTest,
// Combine(Values("cat", "dog"),
// Values(BLACK, WHITE)));
//
// This will instantiate tests in FlagDependentTest with all variations of two
// Boolean flags:
//
// class FlagDependentTest
// : public testing::TestWithParam<tuple<bool, bool> > {
// virtual void SetUp() {
// // Assigns external_flag_1 and external_flag_2 values from the tuple.
// tie(external_flag_1, external_flag_2) = GetParam();
// }
// };
//
// TEST_P(FlagDependentTest, TestFeature1) {
// // Test your code using external_flag_1 and external_flag_2 here.
// }
// INSTANTIATE_TEST_CASE_P(TwoBoolSequence, FlagDependentTest,
// Combine(Bool(), Bool()));
//
$range i 2..maxtuple
$for i [[
$range j 1..i
template <$for j, [[typename Generator$j]]>
internal::CartesianProductHolder$i<$for j, [[Generator$j]]> Combine(
$for j, [[const Generator$j& g$j]]) {
return internal::CartesianProductHolder$i<$for j, [[Generator$j]]>(
$for j, [[g$j]]);
}
]]
# endif // GTEST_HAS_COMBINE
# define TEST_P(test_case_name, test_name) \
class GTEST_TEST_CLASS_NAME_(test_case_name, test_name) \
: public test_case_name { \
public: \
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)() {} \
virtual void TestBody(); \
private: \
static int AddToRegistry() { \
::testing::UnitTest::GetInstance()->parameterized_test_registry(). \
GetTestCasePatternHolder<test_case_name>(\
#test_case_name, \
::testing::internal::CodeLocation(\
__FILE__, __LINE__))->AddTestPattern(\
#test_case_name, \
#test_name, \
new ::testing::internal::TestMetaFactory< \
GTEST_TEST_CLASS_NAME_(\
test_case_name, test_name)>()); \
return 0; \
} \
static int gtest_registering_dummy_ GTEST_ATTRIBUTE_UNUSED_; \
GTEST_DISALLOW_COPY_AND_ASSIGN_(\
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)); \
}; \
int GTEST_TEST_CLASS_NAME_(test_case_name, \
test_name)::gtest_registering_dummy_ = \
GTEST_TEST_CLASS_NAME_(test_case_name, test_name)::AddToRegistry(); \
void GTEST_TEST_CLASS_NAME_(test_case_name, test_name)::TestBody()
// The optional last argument to INSTANTIATE_TEST_CASE_P allows the user
// to specify a function or functor that generates custom test name suffixes
// based on the test parameters. The function should accept one argument of
// type testing::TestParamInfo<class ParamType>, and return std::string.
//
// testing::PrintToStringParamName is a builtin test suffix generator that
// returns the value of testing::PrintToString(GetParam()).
//
// Note: test names must be non-empty, unique, and may only contain ASCII
// alphanumeric characters or underscore. Because PrintToString adds quotes
// to std::string and C strings, it won't work for these types.
# define INSTANTIATE_TEST_CASE_P(prefix, test_case_name, generator, ...) \
::testing::internal::ParamGenerator<test_case_name::ParamType> \
gtest_##prefix##test_case_name##_EvalGenerator_() { return generator; } \
::std::string gtest_##prefix##test_case_name##_EvalGenerateName_( \
const ::testing::TestParamInfo<test_case_name::ParamType>& info) { \
return ::testing::internal::GetParamNameGen<test_case_name::ParamType> \
(__VA_ARGS__)(info); \
} \
int gtest_##prefix##test_case_name##_dummy_ GTEST_ATTRIBUTE_UNUSED_ = \
::testing::UnitTest::GetInstance()->parameterized_test_registry(). \
GetTestCasePatternHolder<test_case_name>(\
#test_case_name, \
::testing::internal::CodeLocation(\
__FILE__, __LINE__))->AddTestCaseInstantiation(\
#prefix, \
&gtest_##prefix##test_case_name##_EvalGenerator_, \
&gtest_##prefix##test_case_name##_EvalGenerateName_, \
__FILE__, __LINE__)
} // namespace testing
#endif // GTEST_HAS_PARAM_TEST
#endif // GTEST_INCLUDE_GTEST_GTEST_PARAM_TEST_H_

View File

@@ -0,0 +1,993 @@
// Copyright 2007, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
// Google Test - The Google C++ Testing Framework
//
// This file implements a universal value printer that can print a
// value of any type T:
//
// void ::testing::internal::UniversalPrinter<T>::Print(value, ostream_ptr);
//
// A user can teach this function how to print a class type T by
// defining either operator<<() or PrintTo() in the namespace that
// defines T. More specifically, the FIRST defined function in the
// following list will be used (assuming T is defined in namespace
// foo):
//
// 1. foo::PrintTo(const T&, ostream*)
// 2. operator<<(ostream&, const T&) defined in either foo or the
// global namespace.
//
// If none of the above is defined, it will print the debug string of
// the value if it is a protocol buffer, or print the raw bytes in the
// value otherwise.
//
// To aid debugging: when T is a reference type, the address of the
// value is also printed; when T is a (const) char pointer, both the
// pointer value and the NUL-terminated string it points to are
// printed.
//
// We also provide some convenient wrappers:
//
// // Prints a value to a string. For a (const or not) char
// // pointer, the NUL-terminated string (but not the pointer) is
// // printed.
// std::string ::testing::PrintToString(const T& value);
//
// // Prints a value tersely: for a reference type, the referenced
// // value (but not the address) is printed; for a (const or not) char
// // pointer, the NUL-terminated string (but not the pointer) is
// // printed.
// void ::testing::internal::UniversalTersePrint(const T& value, ostream*);
//
// // Prints value using the type inferred by the compiler. The difference
// // from UniversalTersePrint() is that this function prints both the
// // pointer and the NUL-terminated string for a (const or not) char pointer.
// void ::testing::internal::UniversalPrint(const T& value, ostream*);
//
// // Prints the fields of a tuple tersely to a string vector, one
// // element for each field. Tuple support must be enabled in
// // gtest-port.h.
// std::vector<string> UniversalTersePrintTupleFieldsToStrings(
// const Tuple& value);
//
// Known limitation:
//
// The print primitives print the elements of an STL-style container
// using the compiler-inferred type of *iter where iter is a
// const_iterator of the container. When const_iterator is an input
// iterator but not a forward iterator, this inferred type may not
// match value_type, and the print output may be incorrect. In
// practice, this is rarely a problem as for most containers
// const_iterator is a forward iterator. We'll fix this if there's an
// actual need for it. Note that this fix cannot rely on value_type
// being defined as many user-defined container types don't have
// value_type.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_
#define GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_
#include <ostream> // NOLINT
#include <sstream>
#include <string>
#include <utility>
#include <vector>
#include "gtest/internal/gtest-port.h"
#include "gtest/internal/gtest-internal.h"
#if GTEST_HAS_STD_TUPLE_
# include <tuple>
#endif
namespace testing {
// Definitions in the 'internal' and 'internal2' name spaces are
// subject to change without notice. DO NOT USE THEM IN USER CODE!
namespace internal2 {
// Prints the given number of bytes in the given object to the given
// ostream.
GTEST_API_ void PrintBytesInObjectTo(const unsigned char* obj_bytes,
size_t count,
::std::ostream* os);
// For selecting which printer to use when a given type has neither <<
// nor PrintTo().
enum TypeKind {
kProtobuf, // a protobuf type
kConvertibleToInteger, // a type implicitly convertible to BiggestInt
// (e.g. a named or unnamed enum type)
kOtherType // anything else
};
// TypeWithoutFormatter<T, kTypeKind>::PrintValue(value, os) is called
// by the universal printer to print a value of type T when neither
// operator<< nor PrintTo() is defined for T, where kTypeKind is the
// "kind" of T as defined by enum TypeKind.
template <typename T, TypeKind kTypeKind>
class TypeWithoutFormatter {
public:
// This default version is called when kTypeKind is kOtherType.
static void PrintValue(const T& value, ::std::ostream* os) {
PrintBytesInObjectTo(reinterpret_cast<const unsigned char*>(&value),
sizeof(value), os);
}
};
// We print a protobuf using its ShortDebugString() when the string
// doesn't exceed this many characters; otherwise we print it using
// DebugString() for better readability.
const size_t kProtobufOneLinerMaxLength = 50;
template <typename T>
class TypeWithoutFormatter<T, kProtobuf> {
public:
static void PrintValue(const T& value, ::std::ostream* os) {
const ::testing::internal::string short_str = value.ShortDebugString();
const ::testing::internal::string pretty_str =
short_str.length() <= kProtobufOneLinerMaxLength ?
short_str : ("\n" + value.DebugString());
*os << ("<" + pretty_str + ">");
}
};
template <typename T>
class TypeWithoutFormatter<T, kConvertibleToInteger> {
public:
// Since T has no << operator or PrintTo() but can be implicitly
// converted to BiggestInt, we print it as a BiggestInt.
//
// Most likely T is an enum type (either named or unnamed), in which
// case printing it as an integer is the desired behavior. In case
// T is not an enum, printing it as an integer is the best we can do
// given that it has no user-defined printer.
static void PrintValue(const T& value, ::std::ostream* os) {
const internal::BiggestInt kBigInt = value;
*os << kBigInt;
}
};
// Prints the given value to the given ostream. If the value is a
// protocol message, its debug string is printed; if it's an enum or
// of a type implicitly convertible to BiggestInt, it's printed as an
// integer; otherwise the bytes in the value are printed. This is
// what UniversalPrinter<T>::Print() does when it knows nothing about
// type T and T has neither << operator nor PrintTo().
//
// A user can override this behavior for a class type Foo by defining
// a << operator in the namespace where Foo is defined.
//
// We put this operator in namespace 'internal2' instead of 'internal'
// to simplify the implementation, as much code in 'internal' needs to
// use << in STL, which would conflict with our own << were it defined
// in 'internal'.
//
// Note that this operator<< takes a generic std::basic_ostream<Char,
// CharTraits> type instead of the more restricted std::ostream. If
// we define it to take an std::ostream instead, we'll get an
// "ambiguous overloads" compiler error when trying to print a type
// Foo that supports streaming to std::basic_ostream<Char,
// CharTraits>, as the compiler cannot tell whether
// operator<<(std::ostream&, const T&) or
// operator<<(std::basic_stream<Char, CharTraits>, const Foo&) is more
// specific.
template <typename Char, typename CharTraits, typename T>
::std::basic_ostream<Char, CharTraits>& operator<<(
::std::basic_ostream<Char, CharTraits>& os, const T& x) {
TypeWithoutFormatter<T,
(internal::IsAProtocolMessage<T>::value ? kProtobuf :
internal::ImplicitlyConvertible<const T&, internal::BiggestInt>::value ?
kConvertibleToInteger : kOtherType)>::PrintValue(x, &os);
return os;
}
} // namespace internal2
} // namespace testing
// This namespace MUST NOT BE NESTED IN ::testing, or the name look-up
// magic needed for implementing UniversalPrinter won't work.
namespace testing_internal {
// Used to print a value that is not an STL-style container when the
// user doesn't define PrintTo() for it.
template <typename T>
void DefaultPrintNonContainerTo(const T& value, ::std::ostream* os) {
// With the following statement, during unqualified name lookup,
// testing::internal2::operator<< appears as if it was declared in
// the nearest enclosing namespace that contains both
// ::testing_internal and ::testing::internal2, i.e. the global
// namespace. For more details, refer to the C++ Standard section
// 7.3.4-1 [namespace.udir]. This allows us to fall back onto
// testing::internal2::operator<< in case T doesn't come with a <<
// operator.
//
// We cannot write 'using ::testing::internal2::operator<<;', which
// gcc 3.3 fails to compile due to a compiler bug.
using namespace ::testing::internal2; // NOLINT
// Assuming T is defined in namespace foo, in the next statement,
// the compiler will consider all of:
//
// 1. foo::operator<< (thanks to Koenig look-up),
// 2. ::operator<< (as the current namespace is enclosed in ::),
// 3. testing::internal2::operator<< (thanks to the using statement above).
//
// The operator<< whose type matches T best will be picked.
//
// We deliberately allow #2 to be a candidate, as sometimes it's
// impossible to define #1 (e.g. when foo is ::std, defining
// anything in it is undefined behavior unless you are a compiler
// vendor.).
*os << value;
}
} // namespace testing_internal
namespace testing {
namespace internal {
// FormatForComparison<ToPrint, OtherOperand>::Format(value) formats a
// value of type ToPrint that is an operand of a comparison assertion
// (e.g. ASSERT_EQ). OtherOperand is the type of the other operand in
// the comparison, and is used to help determine the best way to
// format the value. In particular, when the value is a C string
// (char pointer) and the other operand is an STL string object, we
// want to format the C string as a string, since we know it is
// compared by value with the string object. If the value is a char
// pointer but the other operand is not an STL string object, we don't
// know whether the pointer is supposed to point to a NUL-terminated
// string, and thus want to print it as a pointer to be safe.
//
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
// The default case.
template <typename ToPrint, typename OtherOperand>
class FormatForComparison {
public:
static ::std::string Format(const ToPrint& value) {
return ::testing::PrintToString(value);
}
};
// Array.
template <typename ToPrint, size_t N, typename OtherOperand>
class FormatForComparison<ToPrint[N], OtherOperand> {
public:
static ::std::string Format(const ToPrint* value) {
return FormatForComparison<const ToPrint*, OtherOperand>::Format(value);
}
};
// By default, print C string as pointers to be safe, as we don't know
// whether they actually point to a NUL-terminated string.
#define GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_(CharType) \
template <typename OtherOperand> \
class FormatForComparison<CharType*, OtherOperand> { \
public: \
static ::std::string Format(CharType* value) { \
return ::testing::PrintToString(static_cast<const void*>(value)); \
} \
}
GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_(char);
GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_(const char);
GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_(wchar_t);
GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_(const wchar_t);
#undef GTEST_IMPL_FORMAT_C_STRING_AS_POINTER_
// If a C string is compared with an STL string object, we know it's meant
// to point to a NUL-terminated string, and thus can print it as a string.
#define GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(CharType, OtherStringType) \
template <> \
class FormatForComparison<CharType*, OtherStringType> { \
public: \
static ::std::string Format(CharType* value) { \
return ::testing::PrintToString(value); \
} \
}
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(char, ::std::string);
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(const char, ::std::string);
#if GTEST_HAS_GLOBAL_STRING
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(char, ::string);
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(const char, ::string);
#endif
#if GTEST_HAS_GLOBAL_WSTRING
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(wchar_t, ::wstring);
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(const wchar_t, ::wstring);
#endif
#if GTEST_HAS_STD_WSTRING
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(wchar_t, ::std::wstring);
GTEST_IMPL_FORMAT_C_STRING_AS_STRING_(const wchar_t, ::std::wstring);
#endif
#undef GTEST_IMPL_FORMAT_C_STRING_AS_STRING_
// Formats a comparison assertion (e.g. ASSERT_EQ, EXPECT_LT, and etc)
// operand to be used in a failure message. The type (but not value)
// of the other operand may affect the format. This allows us to
// print a char* as a raw pointer when it is compared against another
// char* or void*, and print it as a C string when it is compared
// against an std::string object, for example.
//
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
template <typename T1, typename T2>
std::string FormatForComparisonFailureMessage(
const T1& value, const T2& /* other_operand */) {
return FormatForComparison<T1, T2>::Format(value);
}
// UniversalPrinter<T>::Print(value, ostream_ptr) prints the given
// value to the given ostream. The caller must ensure that
// 'ostream_ptr' is not NULL, or the behavior is undefined.
//
// We define UniversalPrinter as a class template (as opposed to a
// function template), as we need to partially specialize it for
// reference types, which cannot be done with function templates.
template <typename T>
class UniversalPrinter;
template <typename T>
void UniversalPrint(const T& value, ::std::ostream* os);
// Used to print an STL-style container when the user doesn't define
// a PrintTo() for it.
template <typename C>
void DefaultPrintTo(IsContainer /* dummy */,
false_type /* is not a pointer */,
const C& container, ::std::ostream* os) {
const size_t kMaxCount = 32; // The maximum number of elements to print.
*os << '{';
size_t count = 0;
for (typename C::const_iterator it = container.begin();
it != container.end(); ++it, ++count) {
if (count > 0) {
*os << ',';
if (count == kMaxCount) { // Enough has been printed.
*os << " ...";
break;
}
}
*os << ' ';
// We cannot call PrintTo(*it, os) here as PrintTo() doesn't
// handle *it being a native array.
internal::UniversalPrint(*it, os);
}
if (count > 0) {
*os << ' ';
}
*os << '}';
}
// Used to print a pointer that is neither a char pointer nor a member
// pointer, when the user doesn't define PrintTo() for it. (A member
// variable pointer or member function pointer doesn't really point to
// a location in the address space. Their representation is
// implementation-defined. Therefore they will be printed as raw
// bytes.)
template <typename T>
void DefaultPrintTo(IsNotContainer /* dummy */,
true_type /* is a pointer */,
T* p, ::std::ostream* os) {
if (p == NULL) {
*os << "NULL";
} else {
// C++ doesn't allow casting from a function pointer to any object
// pointer.
//
// IsTrue() silences warnings: "Condition is always true",
// "unreachable code".
if (IsTrue(ImplicitlyConvertible<T*, const void*>::value)) {
// T is not a function type. We just call << to print p,
// relying on ADL to pick up user-defined << for their pointer
// types, if any.
*os << p;
} else {
// T is a function type, so '*os << p' doesn't do what we want
// (it just prints p as bool). We want to print p as a const
// void*. However, we cannot cast it to const void* directly,
// even using reinterpret_cast, as earlier versions of gcc
// (e.g. 3.4.5) cannot compile the cast when p is a function
// pointer. Casting to UInt64 first solves the problem.
*os << reinterpret_cast<const void*>(
reinterpret_cast<internal::UInt64>(p));
}
}
}
// Used to print a non-container, non-pointer value when the user
// doesn't define PrintTo() for it.
template <typename T>
void DefaultPrintTo(IsNotContainer /* dummy */,
false_type /* is not a pointer */,
const T& value, ::std::ostream* os) {
::testing_internal::DefaultPrintNonContainerTo(value, os);
}
// Prints the given value using the << operator if it has one;
// otherwise prints the bytes in it. This is what
// UniversalPrinter<T>::Print() does when PrintTo() is not specialized
// or overloaded for type T.
//
// A user can override this behavior for a class type Foo by defining
// an overload of PrintTo() in the namespace where Foo is defined. We
// give the user this option as sometimes defining a << operator for
// Foo is not desirable (e.g. the coding style may prevent doing it,
// or there is already a << operator but it doesn't do what the user
// wants).
template <typename T>
void PrintTo(const T& value, ::std::ostream* os) {
// DefaultPrintTo() is overloaded. The type of its first two
// arguments determine which version will be picked. If T is an
// STL-style container, the version for container will be called; if
// T is a pointer, the pointer version will be called; otherwise the
// generic version will be called.
//
// Note that we check for container types here, prior to we check
// for protocol message types in our operator<<. The rationale is:
//
// For protocol messages, we want to give people a chance to
// override Google Mock's format by defining a PrintTo() or
// operator<<. For STL containers, other formats can be
// incompatible with Google Mock's format for the container
// elements; therefore we check for container types here to ensure
// that our format is used.
//
// The second argument of DefaultPrintTo() is needed to bypass a bug
// in Symbian's C++ compiler that prevents it from picking the right
// overload between:
//
// PrintTo(const T& x, ...);
// PrintTo(T* x, ...);
DefaultPrintTo(IsContainerTest<T>(0), is_pointer<T>(), value, os);
}
// The following list of PrintTo() overloads tells
// UniversalPrinter<T>::Print() how to print standard types (built-in
// types, strings, plain arrays, and pointers).
// Overloads for various char types.
GTEST_API_ void PrintTo(unsigned char c, ::std::ostream* os);
GTEST_API_ void PrintTo(signed char c, ::std::ostream* os);
inline void PrintTo(char c, ::std::ostream* os) {
// When printing a plain char, we always treat it as unsigned. This
// way, the output won't be affected by whether the compiler thinks
// char is signed or not.
PrintTo(static_cast<unsigned char>(c), os);
}
// Overloads for other simple built-in types.
inline void PrintTo(bool x, ::std::ostream* os) {
*os << (x ? "true" : "false");
}
// Overload for wchar_t type.
// Prints a wchar_t as a symbol if it is printable or as its internal
// code otherwise and also as its decimal code (except for L'\0').
// The L'\0' char is printed as "L'\\0'". The decimal code is printed
// as signed integer when wchar_t is implemented by the compiler
// as a signed type and is printed as an unsigned integer when wchar_t
// is implemented as an unsigned type.
GTEST_API_ void PrintTo(wchar_t wc, ::std::ostream* os);
// Overloads for C strings.
GTEST_API_ void PrintTo(const char* s, ::std::ostream* os);
inline void PrintTo(char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const char*>(s), os);
}
// signed/unsigned char is often used for representing binary data, so
// we print pointers to it as void* to be safe.
inline void PrintTo(const signed char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(signed char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(const unsigned char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
inline void PrintTo(unsigned char* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const void*>(s), os);
}
// MSVC can be configured to define wchar_t as a typedef of unsigned
// short. It defines _NATIVE_WCHAR_T_DEFINED when wchar_t is a native
// type. When wchar_t is a typedef, defining an overload for const
// wchar_t* would cause unsigned short* be printed as a wide string,
// possibly causing invalid memory accesses.
#if !defined(_MSC_VER) || defined(_NATIVE_WCHAR_T_DEFINED)
// Overloads for wide C strings
GTEST_API_ void PrintTo(const wchar_t* s, ::std::ostream* os);
inline void PrintTo(wchar_t* s, ::std::ostream* os) {
PrintTo(ImplicitCast_<const wchar_t*>(s), os);
}
#endif
// Overload for C arrays. Multi-dimensional arrays are printed
// properly.
// Prints the given number of elements in an array, without printing
// the curly braces.
template <typename T>
void PrintRawArrayTo(const T a[], size_t count, ::std::ostream* os) {
UniversalPrint(a[0], os);
for (size_t i = 1; i != count; i++) {
*os << ", ";
UniversalPrint(a[i], os);
}
}
// Overloads for ::string and ::std::string.
#if GTEST_HAS_GLOBAL_STRING
GTEST_API_ void PrintStringTo(const ::string&s, ::std::ostream* os);
inline void PrintTo(const ::string& s, ::std::ostream* os) {
PrintStringTo(s, os);
}
#endif // GTEST_HAS_GLOBAL_STRING
GTEST_API_ void PrintStringTo(const ::std::string&s, ::std::ostream* os);
inline void PrintTo(const ::std::string& s, ::std::ostream* os) {
PrintStringTo(s, os);
}
// Overloads for ::wstring and ::std::wstring.
#if GTEST_HAS_GLOBAL_WSTRING
GTEST_API_ void PrintWideStringTo(const ::wstring&s, ::std::ostream* os);
inline void PrintTo(const ::wstring& s, ::std::ostream* os) {
PrintWideStringTo(s, os);
}
#endif // GTEST_HAS_GLOBAL_WSTRING
#if GTEST_HAS_STD_WSTRING
GTEST_API_ void PrintWideStringTo(const ::std::wstring&s, ::std::ostream* os);
inline void PrintTo(const ::std::wstring& s, ::std::ostream* os) {
PrintWideStringTo(s, os);
}
#endif // GTEST_HAS_STD_WSTRING
#if GTEST_HAS_TR1_TUPLE || GTEST_HAS_STD_TUPLE_
// Helper function for printing a tuple. T must be instantiated with
// a tuple type.
template <typename T>
void PrintTupleTo(const T& t, ::std::ostream* os);
#endif // GTEST_HAS_TR1_TUPLE || GTEST_HAS_STD_TUPLE_
#if GTEST_HAS_TR1_TUPLE
// Overload for ::std::tr1::tuple. Needed for printing function arguments,
// which are packed as tuples.
// Overloaded PrintTo() for tuples of various arities. We support
// tuples of up-to 10 fields. The following implementation works
// regardless of whether tr1::tuple is implemented using the
// non-standard variadic template feature or not.
inline void PrintTo(const ::std::tr1::tuple<>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1>
void PrintTo(const ::std::tr1::tuple<T1>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2>
void PrintTo(const ::std::tr1::tuple<T1, T2>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8, typename T9>
void PrintTo(const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8, T9>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
template <typename T1, typename T2, typename T3, typename T4, typename T5,
typename T6, typename T7, typename T8, typename T9, typename T10>
void PrintTo(
const ::std::tr1::tuple<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>& t,
::std::ostream* os) {
PrintTupleTo(t, os);
}
#endif // GTEST_HAS_TR1_TUPLE
#if GTEST_HAS_STD_TUPLE_
template <typename... Types>
void PrintTo(const ::std::tuple<Types...>& t, ::std::ostream* os) {
PrintTupleTo(t, os);
}
#endif // GTEST_HAS_STD_TUPLE_
// Overload for std::pair.
template <typename T1, typename T2>
void PrintTo(const ::std::pair<T1, T2>& value, ::std::ostream* os) {
*os << '(';
// We cannot use UniversalPrint(value.first, os) here, as T1 may be
// a reference type. The same for printing value.second.
UniversalPrinter<T1>::Print(value.first, os);
*os << ", ";
UniversalPrinter<T2>::Print(value.second, os);
*os << ')';
}
// Implements printing a non-reference type T by letting the compiler
// pick the right overload of PrintTo() for T.
template <typename T>
class UniversalPrinter {
public:
// MSVC warns about adding const to a function type, so we want to
// disable the warning.
GTEST_DISABLE_MSC_WARNINGS_PUSH_(4180)
// Note: we deliberately don't call this PrintTo(), as that name
// conflicts with ::testing::internal::PrintTo in the body of the
// function.
static void Print(const T& value, ::std::ostream* os) {
// By default, ::testing::internal::PrintTo() is used for printing
// the value.
//
// Thanks to Koenig look-up, if T is a class and has its own
// PrintTo() function defined in its namespace, that function will
// be visible here. Since it is more specific than the generic ones
// in ::testing::internal, it will be picked by the compiler in the
// following statement - exactly what we want.
PrintTo(value, os);
}
GTEST_DISABLE_MSC_WARNINGS_POP_()
};
// UniversalPrintArray(begin, len, os) prints an array of 'len'
// elements, starting at address 'begin'.
template <typename T>
void UniversalPrintArray(const T* begin, size_t len, ::std::ostream* os) {
if (len == 0) {
*os << "{}";
} else {
*os << "{ ";
const size_t kThreshold = 18;
const size_t kChunkSize = 8;
// If the array has more than kThreshold elements, we'll have to
// omit some details by printing only the first and the last
// kChunkSize elements.
// TODO(wan@google.com): let the user control the threshold using a flag.
if (len <= kThreshold) {
PrintRawArrayTo(begin, len, os);
} else {
PrintRawArrayTo(begin, kChunkSize, os);
*os << ", ..., ";
PrintRawArrayTo(begin + len - kChunkSize, kChunkSize, os);
}
*os << " }";
}
}
// This overload prints a (const) char array compactly.
GTEST_API_ void UniversalPrintArray(
const char* begin, size_t len, ::std::ostream* os);
// This overload prints a (const) wchar_t array compactly.
GTEST_API_ void UniversalPrintArray(
const wchar_t* begin, size_t len, ::std::ostream* os);
// Implements printing an array type T[N].
template <typename T, size_t N>
class UniversalPrinter<T[N]> {
public:
// Prints the given array, omitting some elements when there are too
// many.
static void Print(const T (&a)[N], ::std::ostream* os) {
UniversalPrintArray(a, N, os);
}
};
// Implements printing a reference type T&.
template <typename T>
class UniversalPrinter<T&> {
public:
// MSVC warns about adding const to a function type, so we want to
// disable the warning.
GTEST_DISABLE_MSC_WARNINGS_PUSH_(4180)
static void Print(const T& value, ::std::ostream* os) {
// Prints the address of the value. We use reinterpret_cast here
// as static_cast doesn't compile when T is a function type.
*os << "@" << reinterpret_cast<const void*>(&value) << " ";
// Then prints the value itself.
UniversalPrint(value, os);
}
GTEST_DISABLE_MSC_WARNINGS_POP_()
};
// Prints a value tersely: for a reference type, the referenced value
// (but not the address) is printed; for a (const) char pointer, the
// NUL-terminated string (but not the pointer) is printed.
template <typename T>
class UniversalTersePrinter {
public:
static void Print(const T& value, ::std::ostream* os) {
UniversalPrint(value, os);
}
};
template <typename T>
class UniversalTersePrinter<T&> {
public:
static void Print(const T& value, ::std::ostream* os) {
UniversalPrint(value, os);
}
};
template <typename T, size_t N>
class UniversalTersePrinter<T[N]> {
public:
static void Print(const T (&value)[N], ::std::ostream* os) {
UniversalPrinter<T[N]>::Print(value, os);
}
};
template <>
class UniversalTersePrinter<const char*> {
public:
static void Print(const char* str, ::std::ostream* os) {
if (str == NULL) {
*os << "NULL";
} else {
UniversalPrint(string(str), os);
}
}
};
template <>
class UniversalTersePrinter<char*> {
public:
static void Print(char* str, ::std::ostream* os) {
UniversalTersePrinter<const char*>::Print(str, os);
}
};
#if GTEST_HAS_STD_WSTRING
template <>
class UniversalTersePrinter<const wchar_t*> {
public:
static void Print(const wchar_t* str, ::std::ostream* os) {
if (str == NULL) {
*os << "NULL";
} else {
UniversalPrint(::std::wstring(str), os);
}
}
};
#endif
template <>
class UniversalTersePrinter<wchar_t*> {
public:
static void Print(wchar_t* str, ::std::ostream* os) {
UniversalTersePrinter<const wchar_t*>::Print(str, os);
}
};
template <typename T>
void UniversalTersePrint(const T& value, ::std::ostream* os) {
UniversalTersePrinter<T>::Print(value, os);
}
// Prints a value using the type inferred by the compiler. The
// difference between this and UniversalTersePrint() is that for a
// (const) char pointer, this prints both the pointer and the
// NUL-terminated string.
template <typename T>
void UniversalPrint(const T& value, ::std::ostream* os) {
// A workarond for the bug in VC++ 7.1 that prevents us from instantiating
// UniversalPrinter with T directly.
typedef T T1;
UniversalPrinter<T1>::Print(value, os);
}
typedef ::std::vector<string> Strings;
// TuplePolicy<TupleT> must provide:
// - tuple_size
// size of tuple TupleT.
// - get<size_t I>(const TupleT& t)
// static function extracting element I of tuple TupleT.
// - tuple_element<size_t I>::type
// type of element I of tuple TupleT.
template <typename TupleT>
struct TuplePolicy;
#if GTEST_HAS_TR1_TUPLE
template <typename TupleT>
struct TuplePolicy {
typedef TupleT Tuple;
static const size_t tuple_size = ::std::tr1::tuple_size<Tuple>::value;
template <size_t I>
struct tuple_element : ::std::tr1::tuple_element<I, Tuple> {};
template <size_t I>
static typename AddReference<
const typename ::std::tr1::tuple_element<I, Tuple>::type>::type get(
const Tuple& tuple) {
return ::std::tr1::get<I>(tuple);
}
};
template <typename TupleT>
const size_t TuplePolicy<TupleT>::tuple_size;
#endif // GTEST_HAS_TR1_TUPLE
#if GTEST_HAS_STD_TUPLE_
template <typename... Types>
struct TuplePolicy< ::std::tuple<Types...> > {
typedef ::std::tuple<Types...> Tuple;
static const size_t tuple_size = ::std::tuple_size<Tuple>::value;
template <size_t I>
struct tuple_element : ::std::tuple_element<I, Tuple> {};
template <size_t I>
static const typename ::std::tuple_element<I, Tuple>::type& get(
const Tuple& tuple) {
return ::std::get<I>(tuple);
}
};
template <typename... Types>
const size_t TuplePolicy< ::std::tuple<Types...> >::tuple_size;
#endif // GTEST_HAS_STD_TUPLE_
#if GTEST_HAS_TR1_TUPLE || GTEST_HAS_STD_TUPLE_
// This helper template allows PrintTo() for tuples and
// UniversalTersePrintTupleFieldsToStrings() to be defined by
// induction on the number of tuple fields. The idea is that
// TuplePrefixPrinter<N>::PrintPrefixTo(t, os) prints the first N
// fields in tuple t, and can be defined in terms of
// TuplePrefixPrinter<N - 1>.
//
// The inductive case.
template <size_t N>
struct TuplePrefixPrinter {
// Prints the first N fields of a tuple.
template <typename Tuple>
static void PrintPrefixTo(const Tuple& t, ::std::ostream* os) {
TuplePrefixPrinter<N - 1>::PrintPrefixTo(t, os);
GTEST_INTENTIONAL_CONST_COND_PUSH_()
if (N > 1) {
GTEST_INTENTIONAL_CONST_COND_POP_()
*os << ", ";
}
UniversalPrinter<
typename TuplePolicy<Tuple>::template tuple_element<N - 1>::type>
::Print(TuplePolicy<Tuple>::template get<N - 1>(t), os);
}
// Tersely prints the first N fields of a tuple to a string vector,
// one element for each field.
template <typename Tuple>
static void TersePrintPrefixToStrings(const Tuple& t, Strings* strings) {
TuplePrefixPrinter<N - 1>::TersePrintPrefixToStrings(t, strings);
::std::stringstream ss;
UniversalTersePrint(TuplePolicy<Tuple>::template get<N - 1>(t), &ss);
strings->push_back(ss.str());
}
};
// Base case.
template <>
struct TuplePrefixPrinter<0> {
template <typename Tuple>
static void PrintPrefixTo(const Tuple&, ::std::ostream*) {}
template <typename Tuple>
static void TersePrintPrefixToStrings(const Tuple&, Strings*) {}
};
// Helper function for printing a tuple.
// Tuple must be either std::tr1::tuple or std::tuple type.
template <typename Tuple>
void PrintTupleTo(const Tuple& t, ::std::ostream* os) {
*os << "(";
TuplePrefixPrinter<TuplePolicy<Tuple>::tuple_size>::PrintPrefixTo(t, os);
*os << ")";
}
// Prints the fields of a tuple tersely to a string vector, one
// element for each field. See the comment before
// UniversalTersePrint() for how we define "tersely".
template <typename Tuple>
Strings UniversalTersePrintTupleFieldsToStrings(const Tuple& value) {
Strings result;
TuplePrefixPrinter<TuplePolicy<Tuple>::tuple_size>::
TersePrintPrefixToStrings(value, &result);
return result;
}
#endif // GTEST_HAS_TR1_TUPLE || GTEST_HAS_STD_TUPLE_
} // namespace internal
template <typename T>
::std::string PrintToString(const T& value) {
::std::stringstream ss;
internal::UniversalTersePrinter<T>::Print(value, &ss);
return ss.str();
}
} // namespace testing
// Include any custom printer added by the local installation.
// We must include this header at the end to make sure it can use the
// declarations from this file.
#include "gtest/internal/custom/gtest-printers.h"
#endif // GTEST_INCLUDE_GTEST_GTEST_PRINTERS_H_

View File

@@ -0,0 +1,232 @@
// Copyright 2007, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// Utilities for testing Google Test itself and code that uses Google Test
// (e.g. frameworks built on top of Google Test).
#ifndef GTEST_INCLUDE_GTEST_GTEST_SPI_H_
#define GTEST_INCLUDE_GTEST_GTEST_SPI_H_
#include "gtest/gtest.h"
namespace testing {
// This helper class can be used to mock out Google Test failure reporting
// so that we can test Google Test or code that builds on Google Test.
//
// An object of this class appends a TestPartResult object to the
// TestPartResultArray object given in the constructor whenever a Google Test
// failure is reported. It can either intercept only failures that are
// generated in the same thread that created this object or it can intercept
// all generated failures. The scope of this mock object can be controlled with
// the second argument to the two arguments constructor.
class GTEST_API_ ScopedFakeTestPartResultReporter
: public TestPartResultReporterInterface {
public:
// The two possible mocking modes of this object.
enum InterceptMode {
INTERCEPT_ONLY_CURRENT_THREAD, // Intercepts only thread local failures.
INTERCEPT_ALL_THREADS // Intercepts all failures.
};
// The c'tor sets this object as the test part result reporter used
// by Google Test. The 'result' parameter specifies where to report the
// results. This reporter will only catch failures generated in the current
// thread. DEPRECATED
explicit ScopedFakeTestPartResultReporter(TestPartResultArray* result);
// Same as above, but you can choose the interception scope of this object.
ScopedFakeTestPartResultReporter(InterceptMode intercept_mode,
TestPartResultArray* result);
// The d'tor restores the previous test part result reporter.
virtual ~ScopedFakeTestPartResultReporter();
// Appends the TestPartResult object to the TestPartResultArray
// received in the constructor.
//
// This method is from the TestPartResultReporterInterface
// interface.
virtual void ReportTestPartResult(const TestPartResult& result);
private:
void Init();
const InterceptMode intercept_mode_;
TestPartResultReporterInterface* old_reporter_;
TestPartResultArray* const result_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(ScopedFakeTestPartResultReporter);
};
namespace internal {
// A helper class for implementing EXPECT_FATAL_FAILURE() and
// EXPECT_NONFATAL_FAILURE(). Its destructor verifies that the given
// TestPartResultArray contains exactly one failure that has the given
// type and contains the given substring. If that's not the case, a
// non-fatal failure will be generated.
class GTEST_API_ SingleFailureChecker {
public:
// The constructor remembers the arguments.
SingleFailureChecker(const TestPartResultArray* results,
TestPartResult::Type type,
const string& substr);
~SingleFailureChecker();
private:
const TestPartResultArray* const results_;
const TestPartResult::Type type_;
const string substr_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(SingleFailureChecker);
};
} // namespace internal
} // namespace testing
// A set of macros for testing Google Test assertions or code that's expected
// to generate Google Test fatal failures. It verifies that the given
// statement will cause exactly one fatal Google Test failure with 'substr'
// being part of the failure message.
//
// There are two different versions of this macro. EXPECT_FATAL_FAILURE only
// affects and considers failures generated in the current thread and
// EXPECT_FATAL_FAILURE_ON_ALL_THREADS does the same but for all threads.
//
// The verification of the assertion is done correctly even when the statement
// throws an exception or aborts the current function.
//
// Known restrictions:
// - 'statement' cannot reference local non-static variables or
// non-static members of the current object.
// - 'statement' cannot return a value.
// - You cannot stream a failure message to this macro.
//
// Note that even though the implementations of the following two
// macros are much alike, we cannot refactor them to use a common
// helper macro, due to some peculiarity in how the preprocessor
// works. The AcceptsMacroThatExpandsToUnprotectedComma test in
// gtest_unittest.cc will fail to compile if we do that.
#define EXPECT_FATAL_FAILURE(statement, substr) \
do { \
class GTestExpectFatalFailureHelper {\
public:\
static void Execute() { statement; }\
};\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kFatalFailure, (substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ONLY_CURRENT_THREAD, &gtest_failures);\
GTestExpectFatalFailureHelper::Execute();\
}\
} while (::testing::internal::AlwaysFalse())
#define EXPECT_FATAL_FAILURE_ON_ALL_THREADS(statement, substr) \
do { \
class GTestExpectFatalFailureHelper {\
public:\
static void Execute() { statement; }\
};\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kFatalFailure, (substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ALL_THREADS, &gtest_failures);\
GTestExpectFatalFailureHelper::Execute();\
}\
} while (::testing::internal::AlwaysFalse())
// A macro for testing Google Test assertions or code that's expected to
// generate Google Test non-fatal failures. It asserts that the given
// statement will cause exactly one non-fatal Google Test failure with 'substr'
// being part of the failure message.
//
// There are two different versions of this macro. EXPECT_NONFATAL_FAILURE only
// affects and considers failures generated in the current thread and
// EXPECT_NONFATAL_FAILURE_ON_ALL_THREADS does the same but for all threads.
//
// 'statement' is allowed to reference local variables and members of
// the current object.
//
// The verification of the assertion is done correctly even when the statement
// throws an exception or aborts the current function.
//
// Known restrictions:
// - You cannot stream a failure message to this macro.
//
// Note that even though the implementations of the following two
// macros are much alike, we cannot refactor them to use a common
// helper macro, due to some peculiarity in how the preprocessor
// works. If we do that, the code won't compile when the user gives
// EXPECT_NONFATAL_FAILURE() a statement that contains a macro that
// expands to code containing an unprotected comma. The
// AcceptsMacroThatExpandsToUnprotectedComma test in gtest_unittest.cc
// catches that.
//
// For the same reason, we have to write
// if (::testing::internal::AlwaysTrue()) { statement; }
// instead of
// GTEST_SUPPRESS_UNREACHABLE_CODE_WARNING_BELOW_(statement)
// to avoid an MSVC warning on unreachable code.
#define EXPECT_NONFATAL_FAILURE(statement, substr) \
do {\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kNonFatalFailure, \
(substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter:: \
INTERCEPT_ONLY_CURRENT_THREAD, &gtest_failures);\
if (::testing::internal::AlwaysTrue()) { statement; }\
}\
} while (::testing::internal::AlwaysFalse())
#define EXPECT_NONFATAL_FAILURE_ON_ALL_THREADS(statement, substr) \
do {\
::testing::TestPartResultArray gtest_failures;\
::testing::internal::SingleFailureChecker gtest_checker(\
&gtest_failures, ::testing::TestPartResult::kNonFatalFailure, \
(substr));\
{\
::testing::ScopedFakeTestPartResultReporter gtest_reporter(\
::testing::ScopedFakeTestPartResultReporter::INTERCEPT_ALL_THREADS, \
&gtest_failures);\
if (::testing::internal::AlwaysTrue()) { statement; }\
}\
} while (::testing::internal::AlwaysFalse())
#endif // GTEST_INCLUDE_GTEST_GTEST_SPI_H_

View File

@@ -0,0 +1,179 @@
// Copyright 2008, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: mheule@google.com (Markus Heule)
//
#ifndef GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_
#define GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_
#include <iosfwd>
#include <vector>
#include "gtest/internal/gtest-internal.h"
#include "gtest/internal/gtest-string.h"
namespace testing {
// A copyable object representing the result of a test part (i.e. an
// assertion or an explicit FAIL(), ADD_FAILURE(), or SUCCESS()).
//
// Don't inherit from TestPartResult as its destructor is not virtual.
class GTEST_API_ TestPartResult {
public:
// The possible outcomes of a test part (i.e. an assertion or an
// explicit SUCCEED(), FAIL(), or ADD_FAILURE()).
enum Type {
kSuccess, // Succeeded.
kNonFatalFailure, // Failed but the test can continue.
kFatalFailure // Failed and the test should be terminated.
};
// C'tor. TestPartResult does NOT have a default constructor.
// Always use this constructor (with parameters) to create a
// TestPartResult object.
TestPartResult(Type a_type,
const char* a_file_name,
int a_line_number,
const char* a_message)
: type_(a_type),
file_name_(a_file_name == NULL ? "" : a_file_name),
line_number_(a_line_number),
summary_(ExtractSummary(a_message)),
message_(a_message) {
}
// Gets the outcome of the test part.
Type type() const { return type_; }
// Gets the name of the source file where the test part took place, or
// NULL if it's unknown.
const char* file_name() const {
return file_name_.empty() ? NULL : file_name_.c_str();
}
// Gets the line in the source file where the test part took place,
// or -1 if it's unknown.
int line_number() const { return line_number_; }
// Gets the summary of the failure message.
const char* summary() const { return summary_.c_str(); }
// Gets the message associated with the test part.
const char* message() const { return message_.c_str(); }
// Returns true iff the test part passed.
bool passed() const { return type_ == kSuccess; }
// Returns true iff the test part failed.
bool failed() const { return type_ != kSuccess; }
// Returns true iff the test part non-fatally failed.
bool nonfatally_failed() const { return type_ == kNonFatalFailure; }
// Returns true iff the test part fatally failed.
bool fatally_failed() const { return type_ == kFatalFailure; }
private:
Type type_;
// Gets the summary of the failure message by omitting the stack
// trace in it.
static std::string ExtractSummary(const char* message);
// The name of the source file where the test part took place, or
// "" if the source file is unknown.
std::string file_name_;
// The line in the source file where the test part took place, or -1
// if the line number is unknown.
int line_number_;
std::string summary_; // The test failure summary.
std::string message_; // The test failure message.
};
// Prints a TestPartResult object.
std::ostream& operator<<(std::ostream& os, const TestPartResult& result);
// An array of TestPartResult objects.
//
// Don't inherit from TestPartResultArray as its destructor is not
// virtual.
class GTEST_API_ TestPartResultArray {
public:
TestPartResultArray() {}
// Appends the given TestPartResult to the array.
void Append(const TestPartResult& result);
// Returns the TestPartResult at the given index (0-based).
const TestPartResult& GetTestPartResult(int index) const;
// Returns the number of TestPartResult objects in the array.
int size() const;
private:
std::vector<TestPartResult> array_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(TestPartResultArray);
};
// This interface knows how to report a test part result.
class TestPartResultReporterInterface {
public:
virtual ~TestPartResultReporterInterface() {}
virtual void ReportTestPartResult(const TestPartResult& result) = 0;
};
namespace internal {
// This helper class is used by {ASSERT|EXPECT}_NO_FATAL_FAILURE to check if a
// statement generates new fatal failures. To do so it registers itself as the
// current test part result reporter. Besides checking if fatal failures were
// reported, it only delegates the reporting to the former result reporter.
// The original result reporter is restored in the destructor.
// INTERNAL IMPLEMENTATION - DO NOT USE IN A USER PROGRAM.
class GTEST_API_ HasNewFatalFailureHelper
: public TestPartResultReporterInterface {
public:
HasNewFatalFailureHelper();
virtual ~HasNewFatalFailureHelper();
virtual void ReportTestPartResult(const TestPartResult& result);
bool has_new_fatal_failure() const { return has_new_fatal_failure_; }
private:
bool has_new_fatal_failure_;
TestPartResultReporterInterface* original_reporter_;
GTEST_DISALLOW_COPY_AND_ASSIGN_(HasNewFatalFailureHelper);
};
} // namespace internal
} // namespace testing
#endif // GTEST_INCLUDE_GTEST_GTEST_TEST_PART_H_

View File

@@ -0,0 +1,263 @@
// Copyright 2008 Google Inc.
// All Rights Reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
#ifndef GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_
#define GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_
// This header implements typed tests and type-parameterized tests.
// Typed (aka type-driven) tests repeat the same test for types in a
// list. You must know which types you want to test with when writing
// typed tests. Here's how you do it:
#if 0
// First, define a fixture class template. It should be parameterized
// by a type. Remember to derive it from testing::Test.
template <typename T>
class FooTest : public testing::Test {
public:
...
typedef std::list<T> List;
static T shared_;
T value_;
};
// Next, associate a list of types with the test case, which will be
// repeated for each type in the list. The typedef is necessary for
// the macro to parse correctly.
typedef testing::Types<char, int, unsigned int> MyTypes;
TYPED_TEST_CASE(FooTest, MyTypes);
// If the type list contains only one type, you can write that type
// directly without Types<...>:
// TYPED_TEST_CASE(FooTest, int);
// Then, use TYPED_TEST() instead of TEST_F() to define as many typed
// tests for this test case as you want.
TYPED_TEST(FooTest, DoesBlah) {
// Inside a test, refer to TypeParam to get the type parameter.
// Since we are inside a derived class template, C++ requires use to
// visit the members of FooTest via 'this'.
TypeParam n = this->value_;
// To visit static members of the fixture, add the TestFixture::
// prefix.
n += TestFixture::shared_;
// To refer to typedefs in the fixture, add the "typename
// TestFixture::" prefix.
typename TestFixture::List values;
values.push_back(n);
...
}
TYPED_TEST(FooTest, HasPropertyA) { ... }
#endif // 0
// Type-parameterized tests are abstract test patterns parameterized
// by a type. Compared with typed tests, type-parameterized tests
// allow you to define the test pattern without knowing what the type
// parameters are. The defined pattern can be instantiated with
// different types any number of times, in any number of translation
// units.
//
// If you are designing an interface or concept, you can define a
// suite of type-parameterized tests to verify properties that any
// valid implementation of the interface/concept should have. Then,
// each implementation can easily instantiate the test suite to verify
// that it conforms to the requirements, without having to write
// similar tests repeatedly. Here's an example:
#if 0
// First, define a fixture class template. It should be parameterized
// by a type. Remember to derive it from testing::Test.
template <typename T>
class FooTest : public testing::Test {
...
};
// Next, declare that you will define a type-parameterized test case
// (the _P suffix is for "parameterized" or "pattern", whichever you
// prefer):
TYPED_TEST_CASE_P(FooTest);
// Then, use TYPED_TEST_P() to define as many type-parameterized tests
// for this type-parameterized test case as you want.
TYPED_TEST_P(FooTest, DoesBlah) {
// Inside a test, refer to TypeParam to get the type parameter.
TypeParam n = 0;
...
}
TYPED_TEST_P(FooTest, HasPropertyA) { ... }
// Now the tricky part: you need to register all test patterns before
// you can instantiate them. The first argument of the macro is the
// test case name; the rest are the names of the tests in this test
// case.
REGISTER_TYPED_TEST_CASE_P(FooTest,
DoesBlah, HasPropertyA);
// Finally, you are free to instantiate the pattern with the types you
// want. If you put the above code in a header file, you can #include
// it in multiple C++ source files and instantiate it multiple times.
//
// To distinguish different instances of the pattern, the first
// argument to the INSTANTIATE_* macro is a prefix that will be added
// to the actual test case name. Remember to pick unique prefixes for
// different instances.
typedef testing::Types<char, int, unsigned int> MyTypes;
INSTANTIATE_TYPED_TEST_CASE_P(My, FooTest, MyTypes);
// If the type list contains only one type, you can write that type
// directly without Types<...>:
// INSTANTIATE_TYPED_TEST_CASE_P(My, FooTest, int);
#endif // 0
#include "gtest/internal/gtest-port.h"
#include "gtest/internal/gtest-type-util.h"
// Implements typed tests.
#if GTEST_HAS_TYPED_TEST
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the name of the typedef for the type parameters of the
// given test case.
# define GTEST_TYPE_PARAMS_(TestCaseName) gtest_type_params_##TestCaseName##_
// The 'Types' template argument below must have spaces around it
// since some compilers may choke on '>>' when passing a template
// instance (e.g. Types<int>)
# define TYPED_TEST_CASE(CaseName, Types) \
typedef ::testing::internal::TypeList< Types >::type \
GTEST_TYPE_PARAMS_(CaseName)
# define TYPED_TEST(CaseName, TestName) \
template <typename gtest_TypeParam_> \
class GTEST_TEST_CLASS_NAME_(CaseName, TestName) \
: public CaseName<gtest_TypeParam_> { \
private: \
typedef CaseName<gtest_TypeParam_> TestFixture; \
typedef gtest_TypeParam_ TypeParam; \
virtual void TestBody(); \
}; \
bool gtest_##CaseName##_##TestName##_registered_ GTEST_ATTRIBUTE_UNUSED_ = \
::testing::internal::TypeParameterizedTest< \
CaseName, \
::testing::internal::TemplateSel< \
GTEST_TEST_CLASS_NAME_(CaseName, TestName)>, \
GTEST_TYPE_PARAMS_(CaseName)>::Register(\
"", ::testing::internal::CodeLocation(__FILE__, __LINE__), \
#CaseName, #TestName, 0); \
template <typename gtest_TypeParam_> \
void GTEST_TEST_CLASS_NAME_(CaseName, TestName)<gtest_TypeParam_>::TestBody()
#endif // GTEST_HAS_TYPED_TEST
// Implements type-parameterized tests.
#if GTEST_HAS_TYPED_TEST_P
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the namespace name that the type-parameterized tests for
// the given type-parameterized test case are defined in. The exact
// name of the namespace is subject to change without notice.
# define GTEST_CASE_NAMESPACE_(TestCaseName) \
gtest_case_##TestCaseName##_
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE.
//
// Expands to the name of the variable used to remember the names of
// the defined tests in the given test case.
# define GTEST_TYPED_TEST_CASE_P_STATE_(TestCaseName) \
gtest_typed_test_case_p_state_##TestCaseName##_
// INTERNAL IMPLEMENTATION - DO NOT USE IN USER CODE DIRECTLY.
//
// Expands to the name of the variable used to remember the names of
// the registered tests in the given test case.
# define GTEST_REGISTERED_TEST_NAMES_(TestCaseName) \
gtest_registered_test_names_##TestCaseName##_
// The variables defined in the type-parameterized test macros are
// static as typically these macros are used in a .h file that can be
// #included in multiple translation units linked together.
# define TYPED_TEST_CASE_P(CaseName) \
static ::testing::internal::TypedTestCasePState \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName)
# define TYPED_TEST_P(CaseName, TestName) \
namespace GTEST_CASE_NAMESPACE_(CaseName) { \
template <typename gtest_TypeParam_> \
class TestName : public CaseName<gtest_TypeParam_> { \
private: \
typedef CaseName<gtest_TypeParam_> TestFixture; \
typedef gtest_TypeParam_ TypeParam; \
virtual void TestBody(); \
}; \
static bool gtest_##TestName##_defined_ GTEST_ATTRIBUTE_UNUSED_ = \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName).AddTestName(\
__FILE__, __LINE__, #CaseName, #TestName); \
} \
template <typename gtest_TypeParam_> \
void GTEST_CASE_NAMESPACE_(CaseName)::TestName<gtest_TypeParam_>::TestBody()
# define REGISTER_TYPED_TEST_CASE_P(CaseName, ...) \
namespace GTEST_CASE_NAMESPACE_(CaseName) { \
typedef ::testing::internal::Templates<__VA_ARGS__>::type gtest_AllTests_; \
} \
static const char* const GTEST_REGISTERED_TEST_NAMES_(CaseName) = \
GTEST_TYPED_TEST_CASE_P_STATE_(CaseName).VerifyRegisteredTestNames(\
__FILE__, __LINE__, #__VA_ARGS__)
// The 'Types' template argument below must have spaces around it
// since some compilers may choke on '>>' when passing a template
// instance (e.g. Types<int>)
# define INSTANTIATE_TYPED_TEST_CASE_P(Prefix, CaseName, Types) \
bool gtest_##Prefix##_##CaseName GTEST_ATTRIBUTE_UNUSED_ = \
::testing::internal::TypeParameterizedTestCase<CaseName, \
GTEST_CASE_NAMESPACE_(CaseName)::gtest_AllTests_, \
::testing::internal::TypeList< Types >::type>::Register(\
#Prefix, \
::testing::internal::CodeLocation(__FILE__, __LINE__), \
&GTEST_TYPED_TEST_CASE_P_STATE_(CaseName), \
#CaseName, GTEST_REGISTERED_TEST_NAMES_(CaseName))
#endif // GTEST_HAS_TYPED_TEST_P
#endif // GTEST_INCLUDE_GTEST_GTEST_TYPED_TEST_H_

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,358 @@
// Copyright 2006, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// This file is AUTOMATICALLY GENERATED on 10/31/2011 by command
// 'gen_gtest_pred_impl.py 5'. DO NOT EDIT BY HAND!
//
// Implements a family of generic predicate assertion macros.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_
#define GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_
// Makes sure this header is not included before gtest.h.
#ifndef GTEST_INCLUDE_GTEST_GTEST_H_
# error Do not include gtest_pred_impl.h directly. Include gtest.h instead.
#endif // GTEST_INCLUDE_GTEST_GTEST_H_
// This header implements a family of generic predicate assertion
// macros:
//
// ASSERT_PRED_FORMAT1(pred_format, v1)
// ASSERT_PRED_FORMAT2(pred_format, v1, v2)
// ...
//
// where pred_format is a function or functor that takes n (in the
// case of ASSERT_PRED_FORMATn) values and their source expression
// text, and returns a testing::AssertionResult. See the definition
// of ASSERT_EQ in gtest.h for an example.
//
// If you don't care about formatting, you can use the more
// restrictive version:
//
// ASSERT_PRED1(pred, v1)
// ASSERT_PRED2(pred, v1, v2)
// ...
//
// where pred is an n-ary function or functor that returns bool,
// and the values v1, v2, ..., must support the << operator for
// streaming to std::ostream.
//
// We also define the EXPECT_* variations.
//
// For now we only support predicates whose arity is at most 5.
// Please email googletestframework@googlegroups.com if you need
// support for higher arities.
// GTEST_ASSERT_ is the basic statement to which all of the assertions
// in this file reduce. Don't use this in your code.
#define GTEST_ASSERT_(expression, on_failure) \
GTEST_AMBIGUOUS_ELSE_BLOCKER_ \
if (const ::testing::AssertionResult gtest_ar = (expression)) \
; \
else \
on_failure(gtest_ar.failure_message())
// Helper function for implementing {EXPECT|ASSERT}_PRED1. Don't use
// this in your code.
template <typename Pred,
typename T1>
AssertionResult AssertPred1Helper(const char* pred_text,
const char* e1,
Pred pred,
const T1& v1) {
if (pred(v1)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT1.
// Don't use this in your code.
#define GTEST_PRED_FORMAT1_(pred_format, v1, on_failure)\
GTEST_ASSERT_(pred_format(#v1, v1), \
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED1. Don't use
// this in your code.
#define GTEST_PRED1_(pred, v1, on_failure)\
GTEST_ASSERT_(::testing::AssertPred1Helper(#pred, \
#v1, \
pred, \
v1), on_failure)
// Unary predicate assertion macros.
#define EXPECT_PRED_FORMAT1(pred_format, v1) \
GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED1(pred, v1) \
GTEST_PRED1_(pred, v1, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT1(pred_format, v1) \
GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED1(pred, v1) \
GTEST_PRED1_(pred, v1, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED2. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2>
AssertionResult AssertPred2Helper(const char* pred_text,
const char* e1,
const char* e2,
Pred pred,
const T1& v1,
const T2& v2) {
if (pred(v1, v2)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT2.
// Don't use this in your code.
#define GTEST_PRED_FORMAT2_(pred_format, v1, v2, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, v1, v2), \
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED2. Don't use
// this in your code.
#define GTEST_PRED2_(pred, v1, v2, on_failure)\
GTEST_ASSERT_(::testing::AssertPred2Helper(#pred, \
#v1, \
#v2, \
pred, \
v1, \
v2), on_failure)
// Binary predicate assertion macros.
#define EXPECT_PRED_FORMAT2(pred_format, v1, v2) \
GTEST_PRED_FORMAT2_(pred_format, v1, v2, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED2(pred, v1, v2) \
GTEST_PRED2_(pred, v1, v2, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT2(pred_format, v1, v2) \
GTEST_PRED_FORMAT2_(pred_format, v1, v2, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED2(pred, v1, v2) \
GTEST_PRED2_(pred, v1, v2, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED3. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3>
AssertionResult AssertPred3Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3) {
if (pred(v1, v2, v3)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT3.
// Don't use this in your code.
#define GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, v1, v2, v3), \
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED3. Don't use
// this in your code.
#define GTEST_PRED3_(pred, v1, v2, v3, on_failure)\
GTEST_ASSERT_(::testing::AssertPred3Helper(#pred, \
#v1, \
#v2, \
#v3, \
pred, \
v1, \
v2, \
v3), on_failure)
// Ternary predicate assertion macros.
#define EXPECT_PRED_FORMAT3(pred_format, v1, v2, v3) \
GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED3(pred, v1, v2, v3) \
GTEST_PRED3_(pred, v1, v2, v3, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT3(pred_format, v1, v2, v3) \
GTEST_PRED_FORMAT3_(pred_format, v1, v2, v3, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED3(pred, v1, v2, v3) \
GTEST_PRED3_(pred, v1, v2, v3, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED4. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3,
typename T4>
AssertionResult AssertPred4Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
const char* e4,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3,
const T4& v4) {
if (pred(v1, v2, v3, v4)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ", "
<< e4 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3
<< "\n" << e4 << " evaluates to " << v4;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT4.
// Don't use this in your code.
#define GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, #v4, v1, v2, v3, v4), \
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED4. Don't use
// this in your code.
#define GTEST_PRED4_(pred, v1, v2, v3, v4, on_failure)\
GTEST_ASSERT_(::testing::AssertPred4Helper(#pred, \
#v1, \
#v2, \
#v3, \
#v4, \
pred, \
v1, \
v2, \
v3, \
v4), on_failure)
// 4-ary predicate assertion macros.
#define EXPECT_PRED_FORMAT4(pred_format, v1, v2, v3, v4) \
GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED4(pred, v1, v2, v3, v4) \
GTEST_PRED4_(pred, v1, v2, v3, v4, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT4(pred_format, v1, v2, v3, v4) \
GTEST_PRED_FORMAT4_(pred_format, v1, v2, v3, v4, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED4(pred, v1, v2, v3, v4) \
GTEST_PRED4_(pred, v1, v2, v3, v4, GTEST_FATAL_FAILURE_)
// Helper function for implementing {EXPECT|ASSERT}_PRED5. Don't use
// this in your code.
template <typename Pred,
typename T1,
typename T2,
typename T3,
typename T4,
typename T5>
AssertionResult AssertPred5Helper(const char* pred_text,
const char* e1,
const char* e2,
const char* e3,
const char* e4,
const char* e5,
Pred pred,
const T1& v1,
const T2& v2,
const T3& v3,
const T4& v4,
const T5& v5) {
if (pred(v1, v2, v3, v4, v5)) return AssertionSuccess();
return AssertionFailure() << pred_text << "("
<< e1 << ", "
<< e2 << ", "
<< e3 << ", "
<< e4 << ", "
<< e5 << ") evaluates to false, where"
<< "\n" << e1 << " evaluates to " << v1
<< "\n" << e2 << " evaluates to " << v2
<< "\n" << e3 << " evaluates to " << v3
<< "\n" << e4 << " evaluates to " << v4
<< "\n" << e5 << " evaluates to " << v5;
}
// Internal macro for implementing {EXPECT|ASSERT}_PRED_FORMAT5.
// Don't use this in your code.
#define GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, on_failure)\
GTEST_ASSERT_(pred_format(#v1, #v2, #v3, #v4, #v5, v1, v2, v3, v4, v5), \
on_failure)
// Internal macro for implementing {EXPECT|ASSERT}_PRED5. Don't use
// this in your code.
#define GTEST_PRED5_(pred, v1, v2, v3, v4, v5, on_failure)\
GTEST_ASSERT_(::testing::AssertPred5Helper(#pred, \
#v1, \
#v2, \
#v3, \
#v4, \
#v5, \
pred, \
v1, \
v2, \
v3, \
v4, \
v5), on_failure)
// 5-ary predicate assertion macros.
#define EXPECT_PRED_FORMAT5(pred_format, v1, v2, v3, v4, v5) \
GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, GTEST_NONFATAL_FAILURE_)
#define EXPECT_PRED5(pred, v1, v2, v3, v4, v5) \
GTEST_PRED5_(pred, v1, v2, v3, v4, v5, GTEST_NONFATAL_FAILURE_)
#define ASSERT_PRED_FORMAT5(pred_format, v1, v2, v3, v4, v5) \
GTEST_PRED_FORMAT5_(pred_format, v1, v2, v3, v4, v5, GTEST_FATAL_FAILURE_)
#define ASSERT_PRED5(pred, v1, v2, v3, v4, v5) \
GTEST_PRED5_(pred, v1, v2, v3, v4, v5, GTEST_FATAL_FAILURE_)
#endif // GTEST_INCLUDE_GTEST_GTEST_PRED_IMPL_H_

View File

@@ -0,0 +1,58 @@
// Copyright 2006, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Author: wan@google.com (Zhanyong Wan)
//
// Google C++ Testing Framework definitions useful in production code.
#ifndef GTEST_INCLUDE_GTEST_GTEST_PROD_H_
#define GTEST_INCLUDE_GTEST_GTEST_PROD_H_
// When you need to test the private or protected members of a class,
// use the FRIEND_TEST macro to declare your tests as friends of the
// class. For example:
//
// class MyClass {
// private:
// void MyMethod();
// FRIEND_TEST(MyClassTest, MyMethod);
// };
//
// class MyClassTest : public testing::Test {
// // ...
// };
//
// TEST_F(MyClassTest, MyMethod) {
// // Can call MyClass::MyMethod() here.
// }
#define FRIEND_TEST(test_case_name, test_name)\
friend class test_case_name##_##test_name##_Test
#endif // GTEST_INCLUDE_GTEST_GTEST_PROD_H_

Some files were not shown because too many files have changed in this diff Show More