1895 Commits

Author SHA1 Message Date
Linfeng Zhang
cf76ee2cb7 Add vpx_idct16x16_38_add_c()
When eob is less than or equal to 38 for 16x16 idct, call this function.

Change-Id: Ief6f3fb16a49ace3c92cebf4e220bf5bf52a6087
2017-02-07 09:40:51 -08:00
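
The dispatch rule described in the commit above can be sketched as follows. This is a minimal illustration, not the libvpx code: only the "eob <= 38" rule comes from the commit message. The enum, the function name `SelectIdct16x16`, and the thresholds 1 and 10 are assumptions inferred from the `_1`, `_10`, and `_256` function-name suffixes that appear elsewhere in this log.

```cpp
// Hypothetical sketch: choose a 16x16 inverse-DCT variant from the
// end-of-block (eob) position. Only the "eob <= 38" rule is taken from the
// commit message; the 1 and 10 thresholds are assumptions based on the
// _1 and _10 function-name suffixes.
enum class Idct16x16Variant { kDcOnly, kUpTo10, kUpTo38, kFull256 };

constexpr Idct16x16Variant SelectIdct16x16(int eob) {
  return eob == 1    ? Idct16x16Variant::kDcOnly
         : eob <= 10 ? Idct16x16Variant::kUpTo10
         : eob <= 38 ? Idct16x16Variant::kUpTo38
                     : Idct16x16Variant::kFull256;
}
```

Under these assumptions, a block with eob = 38 takes the reduced path and a block with eob = 39 falls through to the full 256-coefficient transform.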
Johann
537949a9df block_error_fp highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Id2ed3ebaaaa6a4b68628c23e08b64ea5f1341761
2017-02-07 15:03:28 +00:00
Yunqing Wang
2a21b45fdc Fix visual studio build failure
Fixed the following issue.
..\test\vp9_ethread_test.cc(69): warning C4805: '|=' : unsafe mix of type 'bool' and type 'int' in operation [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]
..\test\vp9_ethread_test.cc(69): warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning) [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]

Change-Id: I37f897cf12a0b7500d2fcbac9e4615f08a83fdb4
2017-02-03 08:36:55 -08:00
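
For context, warnings C4805 and C4800 typically appear when an `int` result is OR-ed into a `bool` with `|=`. The sketch below shows one common way to avoid that pattern. It is not the code from vp9_ethread_test.cc or the fix actually applied there; `CompareStatus` and `AccumulateMismatch` are hypothetical names used only for illustration.

```cpp
// Hypothetical stand-in for a C-style API that reports status as an int
// (nonzero means "a mismatch was found").
constexpr int CompareStatus(int a, int b) { return a - b; }

// The pattern that draws C4805 looks like:
//   bool mismatch = false;
//   mismatch |= CompareStatus(a, b);   // bool |= int
// One way to avoid it is to convert to bool explicitly before combining.
constexpr bool AccumulateMismatch(bool mismatch_so_far, int status) {
  return mismatch_so_far || (status != 0);
}
```

The explicit `status != 0` comparison makes the int-to-bool conversion visible, so neither operand of the combination is an `int`.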
Jerome Jiang
a16ca80b09 Merge "Add unit tests for vp9_block_error_fp." 2017-02-02 22:20:42 +00:00
Jingning Han
bb40844e32 Merge "Add SSSE3 intrinsic 8x8 inverse 2D-DCT" 2017-02-02 22:18:32 +00:00
Jerome Jiang
0b60d3ffa5 Add unit tests for vp9_block_error_fp.
BUG=webm:1365

Change-Id: I004e5cd7ca331d14b31b7fc3edeee45fce064026
2017-02-02 12:41:51 -08:00
Kaustubh Raste
5b10674b5c Merge "Add mips msa sum_squares_2d_i16 function" 2017-02-02 08:09:21 +00:00
Johann Koenig
726556dde9 Merge "Remove neon assembly for idct 16x16 and 8x8" 2017-02-02 03:25:31 +00:00
Johann Koenig
ce6318f254 Merge changes I43521ad3,I013659f6
* changes:
  satd highbd neon: use tran_low_t for coeff
  satd highbd sse2: use tran_low_t for coeff
2017-02-02 03:03:58 +00:00
Jingning Han
8f95389742 Add SSSE3 intrinsic 8x8 inverse 2D-DCT
The intrinsic version reduces the average cycles from 183 to 175.

Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03
2017-02-01 14:47:53 -08:00
Johann
f8d744d91a satd highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I43521ad32b6c96737a8ef2b8c327f901fd7eaf84
2017-02-01 11:55:47 -08:00
Johann
2ba383474d satd highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I013659f6b9fbf9cc52ab840eae520fe0b5f883fb
2017-02-01 11:55:16 -08:00
Johann
0f751ecee3 hadamard highbd ssse3: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I374dfc08732932382043905f128e928b08cb4f57
2017-02-01 11:51:15 -08:00
Johann
1eb8a718bf hadamard highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I7e15192ead3a3631755b386f102c979f06e26279
2017-02-01 11:50:46 -08:00
Johann
2dac808dd1 hadamard highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Ica414007d8412ceebfffa9e58e8416226a3fe934
2017-02-01 11:46:57 -08:00
Jingning Han
a7949f2dd2 Make satd unit test support all bit-depth settings
Turn on the satd unit test for the C function in both regular and high
bit-depth settings.

Change-Id: I4b0c56addfb84964ede0da3ab760fe0ee640cfd0
2017-01-31 23:21:32 -08:00
Jingning Han
59917dd18e Unify the hadamard transform unit test for bit-depth settings
Unify the 8x8 and 16x16 Hadamard unit test system for both 8-bit
and high bit-depth settings.

Change-Id: I53373c1d43f3ced514ad1e53e03f0fb9b25d9ead
2017-01-31 23:21:32 -08:00
Jingning Han
969957f9f2 Fix real-time compression regression in hbd mode
This commit resolves the compression performance regression in
real-time encoding setting when high bit-depth mode is enabled.

The current solution temporarily disables the SIMD implementations
of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.

The commit makes the coding results bit-wise identical between
regular coding pipeline and high bit-depth at profile 0.

BUG=webm:1365

Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2017-01-31 23:17:09 -08:00
Johann Koenig
9efc42f4f8 Merge "Use Buffer class for post proc tests" 2017-01-31 15:28:28 +00:00
Kaustubh Raste
750e753134 Add mips msa sum_squares_2d_i16 function
average improvement ~4x-5x

Change-Id: I8d91b71d0677009be52b412e4f52b40b98573a53
2017-01-31 12:22:43 +00:00
Kaustubh Raste
df7e1fecc1 Add mips msa vpx_minmax_8x8 function
average improvement ~4x-5x

Change-Id: I83aee9977534fddb8a9b80d31af646c0b6b1a8c3
2017-01-31 10:00:43 +05:30
Kaustubh Raste
407fad2356 Add mips msa vpx Integer projection row/col functions
average improvement ~4x-5x

Change-Id: I17c41383250282b39f5ecae0197ef1df7de20801
2017-01-27 11:11:42 +05:30
Kaustubh Raste
c1553f859f Merge "Add mips msa vpx satd function" 2017-01-27 04:08:51 +00:00
Johann
f380a1658d Use Buffer class for post proc tests
Add Buffer features for:
Setting the buffer to the output of an ACMRandom function.
Copying a buffer.
Comparing two buffers.
Printing two buffers.

Change-Id: Ib53fb602451a3abdcee279ea2b65b51fbc02d3df
2017-01-26 09:50:49 -08:00
Ranjit Kumar Tulabandu
8b0c11c358 Multi-threading of first pass stats collection
(yunqingwang)
1. Rebased the patch. Incorporated recent first pass changes.
2. Turned on the first pass unit test.

Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2017-01-24 15:48:02 -08:00
Yunqing Wang
91aa1fae2a Merge "Add the multi-threaded first pass encoder unit test" 2017-01-24 17:14:07 +00:00
Kaustubh Raste
182ea677a0 Add mips msa vpx satd function
average improvement ~4x-5x

Change-Id: If8683d636fe2606d4ca1038e28185bca53bbe244
2017-01-24 10:44:22 +05:30
Johann
270fadc135 PartialIDctTest: reduce number of RunQuantCheck iterations
This currently runs 1000 * 1000 = one *million* times which is quite
unnecessary. It's one of the slowest items in Jenkins and takes over an
hour for each of the larger transforms.

Change-Id: I01653b5e610683e1a2d778ec60cf5065562ab8db
2017-01-23 13:32:09 -08:00
Marco
b71ff28a1a vp9: Small threshold adjustment to unittest BasicRateTargeting444
Due to the recent change to speed >= 7 from commit 219cdab.

Change-Id: I366e7750ec91119881050ff6c05849504c7959e8
2017-01-21 18:19:45 -08:00
Yunqing Wang
b0d8a75e48 Add the multi-threaded first pass encoder unit test
Added a multi-threaded first pass encoder unit test for VP9. The test
checks whether the new multi-threaded first pass encoder (namely, new-mt = 1)
still generates matching stats. In the unit test, the new-mt mode will be
turned on once the multi-threaded first pass implementation is checked in.

Change-Id: Ic21bb1a55c454f024cfd2b397a4c148cfe638218
2017-01-20 10:06:24 -08:00
Johann
13234d3c43 Remove neon assembly for idct 16x16 and 8x8
Tested using test/partial_idct_test.cc:DISABLED_Speed

Both gcc 4.9 and clang 3.8 from the r13 Android NDK offer improvements
using the intrinsics:
<function>    <clang asm> <gcc asm> <clang intrin> <gcc intrin>
idct16x16_256  1720ms      1703ms    1546ms         1554ms
idct16x16_10   1320ms      1247ms     518ms          488ms
idct16x16_1     107ms       108ms      64ms           68ms
idct8x8_64      924ms       931ms     866ms          989ms
idct8x8_12      826ms       824ms     519ms          514ms
idct8x8_1       172ms       166ms     110ms          125ms

idct8x8_64 isn't quite perfect (slight regression with gcc intrinsics),
but as a counterexample idct16x16_10 goes from ~1300ms to ~500ms.

On a sample clip, clang improved from 48.5 to 49fps and gcc stayed roughly
stable.

BUG=webm:1303

Change-Id: I9d4fd2b41b46ea6174a887b40a82c8e6e4769ed4
2017-01-19 12:27:31 -08:00
Kaustubh Raste
e0c0e65378 Add mips msa vpx hadamard functions
average improvement ~4x-5x

Change-Id: I167132d894c04fa85dda8dde7906ff9c61b3a65d
2017-01-19 14:44:03 +05:30
Marco Paniconi
baa4a290eb Merge "vp9: Make the denoiser work with spatial SVC." 2017-01-12 17:54:41 +00:00
Johann Koenig
3628975a15 Merge "Create a class for buffers used in tests" 2017-01-12 01:02:58 +00:00
Johann
6886da7547 Create a class for buffers used in tests
Demonstrate its use with the IDCT test.

Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b
2017-01-11 08:28:39 -08:00
hui su
7a0bfa6ec6 Add "Large" label to VP9 target level tests
Also reduce the number of test frames.

Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08
2017-01-10 17:29:43 -08:00
Marco
7e3a82c384 vp9: Make the denoiser work with spatial SVC.
If enabled, the denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b
2017-01-10 17:23:58 -08:00
Johann Koenig
371a64bfe7 Merge "postproc: vpx_mbpost_proc_down_neon" 2017-01-09 19:53:15 +00:00
Johann Koenig
cabc29ba24 Merge "Add mips dspr2 partial idct tests" 2017-01-09 19:49:02 +00:00
Johann Koenig
7b18202e74 Merge "Add mips dspr2 vp9 intrapred tests" 2017-01-09 19:39:13 +00:00
Johann
c23970ec25 postproc: vpx_mbpost_proc_down_neon
This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x

BUG=webm:1320

Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4
2017-01-09 10:21:56 -08:00
Johann Koenig
9af97fb630 Merge "postproc: vpx_mbpost_proc_across_ip_neon" 2017-01-09 18:17:26 +00:00
Kaustubh Raste
6377f9d966 Add mips dspr2 partial idct tests
Change-Id: Idf4003ea6f9a2a42a9f26e156bee73697acb7a37
2017-01-09 17:30:16 +05:30
Kaustubh Raste
c6ccd1e939 Add mips dspr2 vp9 intrapred tests
Change-Id: I6be8c59ee220af0597bc2d7213f2779ac2e88db9
2017-01-09 14:11:57 +05:30
Johann
4dca923454 postproc: vpx_mbpost_proc_across_ip_neon
The speedup is pretty poor. I would be concerned except the SSE2 is
worse:
Existing SSE2 improvement: 22%
New neon improvement: 35%

BUG=webm:1320

Change-Id: Ied598a261134aa6cbe69f96f58589d2bae17bf62
2017-01-06 16:39:17 -08:00
hui su
337ad83e58 Add support for VP9 level targeting
Constraints on encoder config:
-target_bandwidth is no larger than 80% of level bitrate limit
-target_bandwidth * (1 + max_over_shoot_pct) is no larger than
88% of level bitrate limit
-min_gf_interval is no smaller than level limit
-tile_columns is no larger than level limit

Constraints on rate control:
-current frame size plus previous three frames' size is no larger
than the CPB level limit
-current frame size is no larger than 50%/40%/20% of the CPB
level limit if it's a key/alt-ref/other frame.

Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b
2017-01-06 10:07:31 -08:00
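
The constraints listed in the commit above can be sketched as two predicates. This is a minimal illustration under stated assumptions, not the libvpx implementation: the struct and function names are hypothetical, the field names are borrowed from the commit message, `max_over_shoot_pct` is assumed to be a percentage (so 10 means 10%), and all sizes and bitrates are assumed to share the same units.

```cpp
// Hypothetical types; field names follow the commit message, not libvpx.
struct LevelLimits {
  double max_bitrate;      // level bitrate limit
  int min_gf_interval;     // smallest golden-frame interval the level allows
  int max_tile_columns;    // largest tile column count the level allows
};

struct EncoderConfig {
  double target_bandwidth;
  double max_over_shoot_pct;  // assumed to be a percentage, e.g. 10 for 10%
  int min_gf_interval;
  int tile_columns;
};

enum class FrameKind { kKey, kAltRef, kOther };

// Encoder-config constraints: 80% / 88% bitrate caps, gf interval, tiles.
constexpr bool ConfigFitsLevel(const EncoderConfig& c, const LevelLimits& l) {
  return c.target_bandwidth <= 0.80 * l.max_bitrate &&
         c.target_bandwidth * (1.0 + c.max_over_shoot_pct / 100.0) <=
             0.88 * l.max_bitrate &&
         c.min_gf_interval >= l.min_gf_interval &&
         c.tile_columns <= l.max_tile_columns;
}

// Per-frame share of the CPB limit: 50% key, 40% alt-ref, 20% other.
constexpr double MaxFrameFraction(FrameKind kind) {
  return kind == FrameKind::kKey      ? 0.50
         : kind == FrameKind::kAltRef ? 0.40
                                      : 0.20;
}

// Rate-control constraints: the current frame plus the previous three must
// fit in the CPB limit, and the current frame must fit its per-kind share.
constexpr bool FrameFitsCpb(double current_size, double previous_three_sum,
                            double cpb_limit, FrameKind kind) {
  return current_size + previous_three_sum <= cpb_limit &&
         current_size <= MaxFrameFraction(kind) * cpb_limit;
}
```

For example, with a level bitrate limit of 1000, a target bandwidth of 790 with 20% overshoot passes the 80% check but fails the 88% check, since 790 * 1.2 = 948 exceeds 880.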
Linfeng Zhang
2d12a52ff0 Merge "Add high bitdepth 8x8 idct NEON intrinsics" 2017-01-06 16:47:23 +00:00
Marco
63a8257fb7 vp9: SVC unittests: fix to use y4m source.
Comment out check on buffer underrun, as it currently fails
on some of the svc tests.

Also cast the update of bits_in_buffer_model_, as this can
go negative now due to the buffer underrun.
This fixes the issue in #1352.

BUG=webm:1350
BUG=webm:1352

Change-Id: Ibd4ef23921daf09e5c15b000aca904aa4573599c
2017-01-03 15:29:04 -08:00
Linfeng Zhang
9b187954df Add high bitdepth 8x8 idct NEON intrinsics
BUG=webm:1301

Change-Id: I56e3bc3aab9214e2debac93796389a7194991084
2016-12-27 16:28:53 -08:00
James Zern
78a24171a6 Revert "vp9: SVC unittests: fix to use y4m source."
This reverts commit f0b491a52405abb1b3dbb6b2c74dd6a4c7a7ddb1.

This change results in unsigned integer overflows (as reported by
-fsanitize=integer) in datarate_test.cc,
for many of --gtest_filter=VP9/DatarateOnePassCbrSvc.OnePassCbrSvc*:
unsigned integer overflow: 167198 - 185560 cannot be represented in type
'unsigned long'

As the encoder didn't change, but the input did with the change to
(correctly) use Y4mVideoSource, this revert is merely masking the issue.

BUG=webm:1352

Change-Id: Iecd9a6c83b3fca67c566732a5c92d36193cc2060
2016-12-23 14:18:18 -08:00
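
The overflow reported in the revert above can be reproduced in isolation. The sketch below uses the two values from the sanitizer message and assumes a 64-bit `unsigned long`, modeled here as `std::uint64_t`. The function names are hypothetical and this is not the datarate_test.cc code.

```cpp
#include <cstdint>

// Unsigned subtraction wraps modulo 2^64 instead of going negative, which is
// what -fsanitize=integer flags as an unsigned integer overflow.
constexpr std::uint64_t WrappedDifference(std::uint64_t a, std::uint64_t b) {
  return a - b;
}

// Converting both operands to a signed type first keeps a negative result
// representable (valid while both values fit in int64_t).
constexpr std::int64_t SignedDifference(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);
}
```

With the values from the message, `SignedDifference(167198, 185560)` is -18362, while `WrappedDifference(167198, 185560)` wraps to 2^64 - 18362.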