generic-library/vpx

Author	SHA1	Message	Date
Johann	e3fa4ae8e3	quantize test: use Buffer Although the low bitdepth functions are identical (excepting the need for larger intermediate values) they do not pass these tests. This improves the error output to aid debugging. Simplify buffer usage with Buffer and removing unnecessarily aligned variables. eob is a single element and never written using aligned instructions. BUG=webm:1426 Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35	2017-07-13 15:54:48 -07:00
James Zern	960466939d	fix 'make exampletest' w/CONFIG_REALTIME_ONLY for tests that aren't explicitly testing 2-pass behavior use --passes=1 with this configuration Change-Id: I6a1520ecc65d0f626486604310af29dacb9f197f	2017-07-13 10:47:20 -07:00
Johann	e381753926	sad4d neon: 64x[32,64] Rewrite 64x64. BUG=webm:1425 Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf	2017-07-12 13:26:39 +00:00
Johann	e1bde306c8	sad4d neon: 32x[16,32,64] Rewrite 32x32. Use half the accumulator registers. BUG=webm:1425 Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13	2017-07-12 13:25:18 +00:00
Johann	807ce8fb1e	sad4d neon: 16x[8,16,32] Rewrite 16x16. Use half the accumulator registers. BUG=webm:1425 Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2	2017-07-12 13:25:11 +00:00
Johann	8152b0904d	sad4d neon: 8x[4,8,16] BUG=webm:1425 Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4	2017-07-12 13:25:03 +00:00
Johann	dd4347e9ec	sad4d neon: 4x4, 4x8 BUG=webm:1425 Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468	2017-07-12 03:38:03 +00:00
Johann Koenig	4e16f70703	Merge changes Id84d9780,Iaa6ea75b,I3362e0dd,I0020a49e,Ia42e4f36, ... * changes: sad neon: avg for 64x[32,64] sad neon: macroize 64xN definitions sad neon: avg for 32x[16,32,64] sad neon: macroize 32xN definitions sad neon: avg for 16x[8,16,32] sad neon: macroize 16xN definitions	2017-07-07 21:01:23 +00:00
James Zern	5d6060b62f	Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming"	2017-07-07 19:15:02 +00:00
Johann Koenig	6c375b9cd0	Merge "fdct neon: 32x32_rd"	2017-07-07 14:05:51 +00:00
Johann	e4e08556db	sad neon: avg for 64x[32,64] BUG=webm:1425 Change-Id: Id84d97807a6a0fbcc889c4dfe11929d54f85493d	2017-07-07 07:04:04 -07:00
Johann	67cffc1ef6	sad neon: avg for 32x[16,32,64] BUG=webm:1425 Change-Id: I3362e0dded3b46ca032caa7f44db42f324bc596d	2017-07-07 07:04:04 -07:00
Johann	527e0c9b1c	sad neon: avg for 16x[8,16,32] BUG=webm:1425 Change-Id: Ia42e4f36547c5fe12114fb58379e34bce82eb2f2	2017-07-07 07:04:04 -07:00
Johann Koenig	9b253f9f0a	Merge changes I7b36a57e,If2ab51e3,Ifc685a96 * changes: sad neon: macroize 8xN definitions sad neon: avg for 8x[4,8,16] sad neon: avg for 4x4 and 4x8	2017-07-07 14:03:13 +00:00
James Zern	80b83c73ba	cosmetics,vp9/: normalize inv/fwd_txfm naming + vpx_dsp/, test/ itxfm -> inv_txfm, ftxfm -> fwd_txfm Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e	2017-07-06 18:35:44 -07:00
Johann	63bdc574e5	sad neon: avg for 8x[4,8,16] BUG=webm:1425 Change-Id: If2ab51e3050e078b0011b174efe41fcb65a15f44	2017-07-06 07:43:09 -07:00
Johann	6bac3f80ee	sad neon: avg for 4x4 and 4x8 BUG=webm:1425 Change-Id: Ifc685a96cb34f7fd9243b4c674027480564b84fb	2017-07-06 07:12:47 -07:00
Johann	75b00592c7	fdct neon: 32x32_rd About 40% faster than the non-rd version. BUG=webm:1424 Change-Id: Ia99d14eb9532302eeaab8cd3e503395b0374b5a2	2017-07-06 06:30:50 -07:00
James Zern	5227b8200b	vp9: remove FrameWorkerData & vp9_dthread.h the file was empty after the struct removal. the only remaining use was within vp9_dx_iface, but the wrapper became unnecessary after the removal of frame_parallel_decode. BUG=webm:1395 Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd	2017-07-05 22:32:00 -07:00
James Zern	0d245d42c4	Merge "test_vector_test,vp8: correct thread range"	2017-07-05 22:33:51 +00:00
Johann Koenig	9a05f9771a	Merge "test/buffer.h: move range checking to compiler"	2017-07-05 21:15:13 +00:00
James Zern	a22bb9809e	Merge "dct_partial_test: cover vpx_fdct8x8_1_msa in hbd"	2017-07-05 21:08:46 +00:00
Hui Su	3e08a88854	Merge "level tests: allow level undershoot"	2017-07-05 20:47:20 +00:00
James Zern	23d60be414	dct_partial_test: cover vpx_fdct8x8_1_msa in hbd this was enabled in: `5ac88162b` partial fdct test Change-Id: Ibae2031ec1308fe3a3b84a1ce6e7bacda3a7cb82	2017-07-05 13:01:41 -07:00
Johann	da2ad47d66	test/buffer.h: move range checking to compiler Pass low/high values as type T. Out of range values should be caught by static analysis instead. Change-Id: I0a3ee8820af05f4c791ab097626174e2206fa6d5	2017-07-05 11:21:18 -07:00
James Zern	7d526c1654	Merge "buffer.h: incorrect RandRange results"	2017-07-02 03:48:53 +00:00
Johann	6cb3178192	buffer.h: incorrect RandRange results 'low' was promoted to unsigned, triggering a ubsan warning Change-Id: Id49340079d39c105da93cf13e96cf852a93a94ba	2017-07-01 20:01:22 -07:00
Alexandra Hájková	c757d6dde4	ppc: Add vpx_idct8x8_64_add_vsx Change-Id: I4ed1312f365509e0595dcc09890ecb050f6f2069	2017-07-01 12:55:47 -07:00
Alexandra Hájková	d8c277030c	ppc: Add vpx_idct4x4_16_add_vsx Change-Id: Id2673eece32027fb245919c7a5c81994a4a19fd8	2017-07-01 12:32:18 -07:00
James Zern	af3ab45867	test_vector_test,vp8: correct thread range testing::Range does not include the end parameter in the set of values. also adjust the start to 2 as the single threaded case is already covered in another instantiation Change-Id: Iae3bf3ed4363dd434eccfa5ad4e3c5e553fbee60	2017-06-30 16:21:06 -07:00
Johann	c2044fda1d	buffer.h: use stride_ instead of stride() Change-Id: Ib51231349bf0ff3e23672762dc7bfa49b5fe4083	2017-06-30 07:37:20 -07:00
Johann	ce5b17f9ad	testing: ranges for random values Add a method to acm_random.h to generate ranges of values Add a way to call that method to buffer.h Adjust dct_[partial_]test.cc to use it. Change-Id: I8c23ae9d27612c28f050b0e44c41cb4ad2494086	2017-06-30 07:25:30 -07:00
Johann Koenig	89d3dc043e	Merge changes Id5beb35d,I2945fe54,Ib0f3cfd6,I78a2eba8 * changes: partial fdct neon: add 32x32_1 partial fdct neon: add 16x16_1 partial fdct neon: add 4x4_1 partial fdct neon: move 8x8_1 and enable hbd tests	2017-06-30 01:00:07 +00:00
James Zern	67d7a6df2d	Merge changes from topic 'rm-dec-frame-parallel' * changes: rm vp9_frame_parallel_test.cc test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING	2017-06-29 23:21:18 +00:00
James Zern	e5bdab98e9	rm vp9_frame_parallel_test.cc VPX_CODEC_USE_FRAME_THREADING was made a no-op in: `01d23109a` vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op and the tests in this file have been disabled since: `6ab0870d4` disable VP9MultiThreadedFrameParallel tests BUG=webm:1395 Change-Id: I2c7a250acb65cf9522cf8a7bb724bb92070e41c6	2017-06-29 15:15:56 -07:00
James Zern	508ef2a6e3	test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING this was made a no-op in: `01d23109a` vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op and the test hitting this branch has been disabled since: `6ab0870d4` disable VP9MultiThreadedFrameParallel tests rename the test to VP9MultiThreaded to exercise the tile-based threading BUG=webm:1395 Change-Id: I35564a75eb5a7d7f7ccb923133b1b07295201f4c	2017-06-29 15:15:48 -07:00
James Zern	bd77931421	dct_partial_test,fwd_txfm: change << to * left shift of a negative number is undefined in C; quiets a ubsan warning Change-Id: Ib1624ad5326ac8e0eead9348468ef7fe5d4df9a4	2017-06-29 14:42:03 -07:00
Johann	9fe510c12a	partial fdct neon: add 32x32_1 Always return an int32_t. Since it needs to be moved to a register for shifting, this doesn't really penalize the smaller transforms. The values could potentially be summed and shifted in place. BUG=webm:1424 Change-Id: Id5beb35d79c7574ebd99285fc4182788cf2bb972	2017-06-28 15:37:44 -07:00
Johann	f310ddc470	partial fdct neon: add 16x16_1 For the 8x8_1, the highbd output fit nicely in the existing function. 12 bit input will overflow this implementation of 16x16_1. BUG=webm:1424 Change-Id: I2945fe5478b18f996f1a5de80110fa30f3f4e7ec	2017-06-28 15:37:44 -07:00
Johann	4959dd3eb3	partial fdct neon: add 4x4_1 BUG=webm:1424 Change-Id: Ib0f3cfd6116fc1f5a99acb8bfd76e25b90177ffc	2017-06-28 15:37:44 -07:00
Johann	cf75ab6ccd	partial fdct neon: move 8x8_1 and enable hbd tests The function was originally written with HBD in mind. Enable it and configure the tests. BUG=webm:1424 Change-Id: I78a2eba8d4d9d59db98a344ba0840d4a60ebe9a1	2017-06-28 15:37:43 -07:00
Johann Koenig	81e25512c3	Merge changes Ib454762d,I966650df,Ie126553e,I068f06c6,Icb72a94e * changes: sad neon: rewrite 64x64 and add 64x32 sad neon: rewrite 32x32, add 32x16 and 32x64 sad neon: rewrite 16x8, 16x16, add 16x32 sad neon: rewrite 8x8 and 8x16 sad neon: rewrite 4x4 and add 4x8	2017-06-28 22:37:00 +00:00
Johann Koenig	d91af5f905	Merge "buffer.h: Only allow Init() to be called once."	2017-06-28 22:36:05 +00:00
Johann Koenig	35f8515c3f	Merge "partial fdct test"	2017-06-28 22:34:53 +00:00
Johann	5ac88162b9	partial fdct test Test the _1 variant of the fdct, which simply sums the block and applies a modifying shift based on the block size. BUG=webm:1424 Change-Id: Ic80d6008abba0c596b575fa0484d5b5855321468	2017-06-28 20:32:20 +00:00
Johann	ad011aaab8	sad neon: rewrite 64x64 and add 64x32 BUG=webm:1425 Change-Id: Ib454762d1c61b05a98324fe81ad58c9e09784717	2017-06-28 12:21:34 -07:00
Johann	469643757f	sad neon: rewrite 16x8, 16x16, add 16x32 BUG=webm:1425 Change-Id: Ie126553e5fffcdfaf3d82a85b368ac10ce9ab082	2017-06-28 12:16:00 -07:00
Johann	e40e78be24	sad neon: rewrite 8x8 and 8x16 BUG=webm:1425 Change-Id: I068f06c67b841f09ea07c04ada0c2f1706102138	2017-06-28 12:15:57 -07:00
Johann	46d8660ce3	sad neon: rewrite 4x4 and add 4x8 The previous implementation loaded 8 values (discarding half) BUG=webm:1425 Change-Id: Icb72a94e2557a4ee2db7091266ab58fd92f72158	2017-06-28 11:14:59 -07:00
Johann	e0330c4810	buffer.h: Only allow Init() to be called once. Change-Id: I041c8b6f314802833c5287a176dbfeec9461b08e	2017-06-28 10:59:39 -07:00
hui su	d4595de5db	level tests: allow level undershoot Obtaining a level that is lower than the target should be tolerated. Change-Id: I90a55ee6d7142e9f6cc525ebbd1e0501defcbe28	2017-06-26 15:17:04 -07:00
Linfeng Zhang	ec4afbf74a	Merge "Add vpx_highbd_idct4x4_16_add_sse4_1()"	2017-06-24 01:15:14 +00:00
James Zern	ee1fcb0e69	Merge "variance_test: move Subpel* from tuples to TestParams"	2017-06-23 22:48:40 +00:00
Linfeng Zhang	8253a27904	Add vpx_highbd_idct4x4_16_add_sse4_1() BUG=webm:1412 Change-Id: Ie33482409351a01be4e89466b0441834eb1e905a	2017-06-23 14:30:12 -07:00
James Zern	0d1c782306	Merge "datarate_test: rename thread -> Thread in test name"	2017-06-23 20:00:51 +00:00
James Zern	54bcd98314	variance_test: move Subpel* from tuples to TestParams this normalizes these tests with the regular variance ones both in implementation and test list output Change-Id: I387aea81456f94b8223b8fb2a28cab94bc1aa9d5	2017-06-23 12:54:18 -07:00
Johann Koenig	794a5ad713	Merge "fdct32x32 neon implementation"	2017-06-23 01:58:00 +00:00
Linfeng Zhang	c5f9de573f	Merge changes I783c5f4f,I365f8e53,I5dac0e98 * changes: Clean vpx_idct16x16_256_add_sse2() Update vpx_idct{8x8,16x16,32x32}_1_add_sse2() Clean 32x32 full idct sse2 and ssse3 code	2017-06-22 21:42:23 +00:00
Johann	e67660cf37	fdct32x32 neon implementation Almost 3x faster in constrained loop testing. Over 10x faster in HBD builds. BUG=webm:1424 Change-Id: I2b7f8453e1d4ada63cde729d8115d684c4a71ff9	2017-06-22 06:40:17 -07:00
James Zern	dd88bd87db	datarate_test: rename thread -> Thread in test name this is consistent with other threaded tests and ensures gtest_filters meant to operate on these pick them up Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b	2017-06-21 20:05:31 -07:00
Linfeng Zhang	2b43a1ee18	Clean 32x32 full idct sse2 and ssse3 code vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are code relocations, no new code. Change-Id: I5dac0e98cc411a4ce05660406921118986638d19	2017-06-21 13:46:49 -07:00
Johann	1c48915233	dct tests: align InvAccuracyCheck buffers 'in' is used for the reference fdct. 'coeff' is input to the idct being tested and 'dst[16]' is output Fixes a segfault on unaligned memory access on x86. Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962	2017-06-21 11:47:00 -07:00
James Zern	0aa3677d9d	fix build, rm ref to vpx_idct8x8_64_add_ssse3 this was deleted in: `98967645a` Remove vpx_idct8x8_64_add_ssse3() but this was merged in: `9e03eedf6` Merge changes Ib26dd515,Ie60dabc3 after: `a92991133` Merge "dct tests: run all possible sizes in one test" which added a new reference Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25	2017-06-20 19:46:45 -07:00
Linfeng Zhang	9e03eedf62	Merge changes Ib26dd515,Ie60dabc3 * changes: Clean 8x8 idct x86 optimization Remove vpx_idct8x8_64_add_ssse3()	2017-06-21 00:38:25 +00:00
Johann	4ebb9a36f1	dct tests: run all possible sizes in one test Modify fdct4x4_test.cc to support all size combinations. This does not add any new tests and in fact fails a few. There were minimal changes made to the tests so it's not entirely surprising that some of the larger 12 bit transforms are failing since it was initially only used for 4x4. In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and dct32x32_test.cc will be evaluated and moved to dct_test.cc. BUG=webm:1424 Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f	2017-06-19 15:39:35 -07:00
Linfeng Zhang	98967645a1	Remove vpx_idct8x8_64_add_ssse3() It's almost identical to vpx_idct8x8_64_add_sse2(), except for a small difference in instruction order. Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f	2017-06-15 14:09:33 -07:00
Johann Koenig	6dcd9b37ea	Merge "idct_test: don't use std::nothrow anymore"	2017-06-09 20:42:39 +00:00
Johann Koenig	8aa4ee1f10	Merge "buffer.h: allow declaring an alignment"	2017-06-09 20:42:21 +00:00
Johann	92373a5bb2	idct_test: don't use std::nothrow anymore But still check for NULL before calling Init() Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76	2017-06-09 11:09:06 -07:00
Johann	5aee8ea752	buffer.h: allow declaring an alignment x86 simd register operations generally prefer and may require 16 byte alignment. Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba	2017-06-09 11:03:15 -07:00
James Zern	b3a262dff3	Merge "vp8_decode_frame: fix oob read on truncated key frame"	2017-06-08 23:17:50 +00:00
James Zern	45daecb4f7	vp8_decode_frame: fix oob read on truncated key frame the check for error correction being disabled was overriding the data length checks. this avoids returning incorrect information (width / height) for the decoded frame which could result in inconsistent sizes returned to an application causing it to read beyond the bounds of the frame allocation. BUG=webm:1443 BUG=b/62458770 Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342	2017-06-08 23:16:04 +00:00
Johann	e50ea014c3	Revert "buffer.h: use size_t" This reverts commit `f08581c1d0`. type conversion warnings abound. Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00	2017-06-08 10:20:21 -07:00
Johann Koenig	903375a48a	Merge "fdct16x16 neon optimization"	2017-06-08 15:19:36 +00:00
Johann	eae7cf2368	fdct16x16 neon optimization Roughly 2x speedup. Since the only change for HBD is to store(), the improvement appears to hold there as well. BUG=webm:1424 Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19	2017-06-07 14:59:55 -07:00
Johann Koenig	0c4f74d129	Merge changes Iade45f69,I18d90658,Ieca3f1ef * changes: buffer.h: add num_elements_ buffer.h: zero-init all values buffer.h: use size_t	2017-06-07 19:20:16 +00:00
Johann	902d63759e	buffer.h: add num_elements_ raw_size_ was being incorrectly computed and used Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d	2017-06-07 11:31:20 -07:00
Johann	4a37e3e2a0	buffer.h: zero-init all values Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8	2017-06-07 11:27:26 -07:00
Johann	f08581c1d0	buffer.h: use size_t Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f	2017-06-07 11:24:27 -07:00
James Zern	ff42e04f9c	Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}"	2017-06-06 23:52:39 +00:00
Johann	de4cb716ee	buffer.h: split out init Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287	2017-06-06 15:02:50 -07:00
Johann	8659764a07	buffer.h: Use T for values Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9	2017-06-06 12:05:14 -07:00
James Zern	4753c23983	Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx"	2017-06-06 02:19:41 +00:00
Johann Koenig	755b3daf90	Merge "comp_avg_pred neon: used by sub pixel avg variance"	2017-05-31 18:17:28 +00:00
Johann	f695b30ac2	comp_avg_pred neon: used by sub pixel avg variance BUG=webm:1423 Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804	2017-05-30 22:47:34 +00:00
Jerome Jiang	a5ab38093f	Merge "Fix vp8 race when build --enable-vp9-highbitdepth."	2017-05-30 05:47:44 +00:00
Jerome Jiang	0afa2dad76	Fix vp8 race when build --enable-vp9-highbitdepth. Split vp8/vp9 implementations on yv12_copy_frame_c. Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c. Clean up vp8 codes usage in vp9. BUG=webm:1435 Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d	2017-05-26 09:45:01 -07:00
Johann Koenig	de1a9c77a7	Merge changes Iaab2b9a1,Idfb458d3 * changes: sub pel avg variance neon: 4x block sizes sub pel variance neon: 4x block sizes	2017-05-24 18:33:53 +00:00
Johann Koenig	b11a37f540	Merge changes I31fa6ef8,I228c6f29 * changes: sub pel avg variance neon: add neon optimizations sub pel variance neon: normalize variable names	2017-05-24 18:32:02 +00:00
James Zern	566f6d75bd	partial_idct_test,InitInput: fix rollover in mult promote coeff to signed 64-bit to avoid exceeding integer bounds when squaring the value Change-Id: If77bef6bc0a6a4c39ca3013e5e2ddb426a1c6e1f	2017-05-24 15:27:38 +02:00
Alexandra Hájková	8bf6eaf433	ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64} Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123	2017-05-24 13:27:09 +00:00
Linfeng Zhang	36f1b183e4	Update InitInput() in test/partial_idct_test.cc Make it work in high bit depth. BUG=webm:1412 Change-Id: Ic5cfd410a69709f01e2924774356a108a349d273	2017-05-23 14:24:23 -07:00
Johann	f6fcd3410d	sub pel avg variance neon: 4x block sizes BUG=webm:1423 Change-Id: Iaab2b9a183fdb54aae5f717aba95d90dc36a9e3b	2017-05-22 14:40:05 -07:00
Johann	188d58eaa9	sub pel variance neon: 4x block sizes Add optimizations for blocks of width 4 BUG=webm:1423 Change-Id: Idfb458d36db3014d48fbfbe7f5462aa6eb249938	2017-05-22 14:40:01 -07:00
Johann	9b0d306a2f	sub pel avg variance neon: add neon optimizations These are missing an optimized version of vpx_comp_avg_pred BUG=webm:1423 Change-Id: I31fa6ef842e98f7ff3ea079ffed51ae33178e2ed	2017-05-22 13:58:43 -07:00
Linfeng Zhang	c167345ffb	Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2 BUG=webm:1412 Change-Id: Ia338a6057d36f9ed7eaa9cbd4dfbf0c3cbdc6468	2017-05-22 11:24:21 -07:00
Johann Koenig	e7cac13016	Merge changes Ib8dd96f7,Ie9854b77 * changes: neon variance: process 4x blocks use memcpy for unaligned neon stores	2017-05-22 17:48:33 +00:00
Johann Koenig	3c603eadb4	Merge "neon fdct: 4x4 implementation"	2017-05-19 17:08:58 +00:00
Johann	7b742da63e	neon variance: process 4x blocks Continue processing sets of 16 values. Plenty of improvement for 4x8 (doubles the speed) but only about 30% for 4x4. BUG=webm:1422 Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905	2017-05-17 17:35:01 -07:00
Marco Paniconi	a2dfbbd7d6	Merge "vp9: Modify ChangingDropFrameThresh unittest."	2017-05-17 18:42:51 +00:00
Marco	4733df333f	vp9: Modify ChangingDropFrameThresh unittest. Add another (lower) bitrate to the test, to cover frame drop behavior at low bitrate range. Change-Id: Iaad003974159daf3d2d65ef3a6575a3e72e498d6	2017-05-17 09:38:21 -07:00
Linfeng Zhang	3210ca6d60	Update partial idct testing code Add PartialIDctTest::PrintDiff() to help debugging. In RunQuantCheck, try all combinations of +/-mask_ input for 4x4 idct. Update PartialIDctTest::InitInput(). Change-Id: I13fd163954a4c1a3a6cfeb5e4a4d3d0e7ff901f4	2017-05-17 09:28:32 -07:00
Johann	105503b839	neon fdct: 4x4 implementation Approximately twice as fast as C implementation. BUG=webm:1424 Change-Id: I3c0307fb08ddc23df42545cd089a78e2ed5c9d3f	2017-05-17 07:38:18 -07:00
Alexandra Hájková	bcbc3929ae	ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx Change-Id: Ic9639b1331d8c5cbc207c2a036891ff0137fc56f	2017-05-13 13:13:15 +00:00
James Zern	ac8f58f6ab	Merge changes I1b54a7a5,I3028bdad,I59788cd9 * changes: ppc: Add get_mb_ss_vsx ppc: Add get4x4sse_cs_vsx ppc: Add comp_avg_pred_vsx	2017-05-12 15:24:59 +00:00
Luca Barbato	143b21e362	ppc: Add get_mb_ss_vsx Change-Id: I1b54a7a5bb642e4b836d786ea1ae506eed025e3f	2017-05-12 17:23:00 +02:00
Luca Barbato	6d225eb5f9	ppc: Add get4x4sse_cs_vsx Change-Id: I3028bdadf653665d18e781d28e9625f62804b3d8	2017-05-12 17:23:00 +02:00
Luca Barbato	a7f8bd451b	ppc: Add comp_avg_pred_vsx Change-Id: I59788cd98231e707239c2ad95ae54f67cfe24e10	2017-05-12 17:22:55 +02:00
Alexandra Hájková	f48532e271	ppc: Add vpx_sad64x32/64_vsx Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2	2017-05-12 16:10:16 +02:00
Alexandra Hájková	0b15bf1e54	ppc: Add vpx_sad32x16/32/64_vsx Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd	2017-05-12 16:10:11 +02:00
James Zern	a12ea1d5e9	Merge "ppc: Add vpx_sad16x8/16/32_vsx"	2017-05-12 13:33:51 +00:00
Marco	c5a4376aed	vp9: SVC: allow for setting the interp_filter in non-rd pickmode. For SVC 1 pass non-rd pickmode, the interpolation filter for the upsampling of the golden (spatial) reference was not being explicitly set and instead was taking whatever value was set in the previous mode/block (which would be either EIGHTTAP or EIGHTTAP_SMOOTH). Fix it to the default EIGHTTAP for now, to be updated/selected adaptively in a later change. Minor adjustment to rate targeting thresholds in datarate unittests. Change-Id: I52085048674072c6cfb7163e11e9a2658d773826	2017-05-11 11:45:09 -07:00
Alexandra Hájková	cc7f0c0f3e	ppc: Add vpx_sad16x8/16/32_vsx Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9	2017-05-10 19:57:30 +00:00
Johann Koenig	d713ec3c46	Merge changes I92eb4312,Ibb2afe4e * changes: subpel variance neon: add mixed sizes sub pixel variance neon: use generic variance	2017-05-10 18:19:52 +00:00
Linfeng Zhang	870cf4356c	Update test/partial_idct_test.cc Makes more sense to call the corresponding partial idct C function instead of the full idct C function as the reference. Change-Id: Ibb7681dd063edd6307ba582c10c26c4c6a4b78c6	2017-05-09 13:07:47 -07:00
Johann Koenig	1814463864	Merge changes Id602909a,Ib0e85608 * changes: neon variance: process two rows of 8 at a time neon variance: add small missing sizes	2017-05-08 17:34:20 +00:00
Linfeng Zhang	2c3a2ad6f1	Merge changes I0cfe4117,I3581d80d,Ida62c941 * changes: Split dsp/x86/inv_txfm_sse2.c Update highbd idct functions arguments to use uint16_t dst Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct	2017-05-08 16:15:57 +00:00
Jerome Jiang	3453c8d6c4	Merge "vp9: Neon optimization for denoiser. Add unit tests."	2017-05-06 01:28:32 +00:00
Jerome Jiang	83a2bfd7dc	Merge "Change target bitrate thresh in denoiser test."	2017-05-06 01:28:15 +00:00
Jerome Jiang	fff358fb06	Change target bitrate thresh in denoiser test. An intended behavior change disabling exhaustive searches in speed feature causes VP9/DatarateTestVP9LargeDenoiser.4threads test failure. Change the threshold to make it pass. BUG=webm:1429 Change-Id: Ibcbe2314c6b2525799894f5d7204fc8eb4ec2a1e	2017-05-05 16:50:19 -07:00
Jerome Jiang	069eedb3a0	vp9: Neon optimization for denoiser. Add unit tests. Denoiser on Neon is 5x faster than C code. BUG=webm:1420 Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb	2017-05-05 16:40:52 -07:00
Johann	2346a6da4a	subpel variance neon: add mixed sizes Add support for everything except block sizes of 4. Performance is better but numbers will improve again when the variance optimizations land. BUG=webm:1423 Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80	2017-05-04 15:30:01 -07:00
Johann	462e29703c	fdct 8x8 neon: minor comment cleanup Simplify HBD/non distinction in test. Document why transpose_neon.h is not used Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52	2017-05-04 15:14:23 -07:00
Johann	cb9133c72f	neon variance: add small missing sizes Some of the mixed sizes were missing. They can be implemented trivially using the existing helper function. When comparing the previous 16x8 and 8x16 implementations, the helper function is about 10% faster than the 16x8 version. The 8x16 is very close, but the existing version appears to be faster. BUG=webm:1422 Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004	2017-05-04 08:59:42 -07:00
Linfeng Zhang	d5de63d2be	Update highbd idct functions arguments to use uint16_t dst BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5	2017-05-03 13:59:16 -07:00
Linfeng Zhang	081b39f2b7	Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct BUG=webm:1388 Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112	2017-05-03 13:58:31 -07:00
Yi Luo	a3452996a1	High bit depth inter prediction horizontal/vertical filters AVX2 User level speed improvement on i7-6700, cpu-used=1, x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps: - Decoder: 1080p: ~4% 4K: ~5% - Encoder: 1080p: ~1% 4K: ~3% Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640	2017-05-03 12:18:01 -07:00
James Zern	5599e4275a	Merge changes Ia5293d94,I90d481d3,Ia509d622,I54549b03,I89b635d6 * changes: ppc: Add convolve8_vsx and convolve8_avg_vsx ppc: Add convolve8_avg_vert_vsx ppc: Add convolve8_vert ppc: Add convolve8_horiz_avg ppc: Add convolve8_horiz	2017-05-03 03:31:19 +00:00
Luca Barbato	e2ad89092d	ppc: Add convolve8_vsx and convolve8_avg_vsx Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857	2017-05-02 20:27:47 -07:00
Luca Barbato	e6ca81ee67	ppc: Add convolve8_avg_vert_vsx Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d	2017-05-02 20:27:42 -07:00
Luca Barbato	a65f1771ad	ppc: Add convolve8_vert Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f	2017-05-02 20:27:33 -07:00
Luca Barbato	77772350f3	ppc: Add convolve8_horiz_avg The 8x8 and larger blocks cases can be sped up further. Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15	2017-05-02 20:27:28 -07:00
Luca Barbato	08edb85bd0	ppc: Add convolve8_horiz The 8x8 and larger blocks cases can be sped up further. Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046	2017-05-02 20:27:16 -07:00
James Zern	ee3df31d74	Merge "vpx_scale_test: fix segfault on alloc failure"	2017-05-01 19:22:22 +00:00
James Zern	2930903d51	vpx_scale_test: fix segfault on alloc failure check the return of ResetImage() before continuing Change-Id: Iff0b038f7b9761113b8cf33a511a5306640d1273	2017-04-29 13:12:53 -07:00
Luca Barbato	d51d3934f5	ppc: Add convolve_avg Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014	2017-04-29 15:47:25 +02:00
Luca Barbato	63860ba7b8	ppc: Add convolve_copy Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405	2017-04-29 15:47:25 +02:00
Jerome Jiang	bea27a5809	Merge "Generalize vp9 sse2 denoiser test for other platforms."	2017-04-28 15:45:52 +00:00
Johann Koenig	94ebdba71d	Merge "vp9 temporal filter: sse4 implementation"	2017-04-28 13:22:41 +00:00
Jerome Jiang	26aebd77b8	Generalize vp9 sse2 denoiser test for other platforms. Renamed to vp9_denoiser_test. Change-Id: I0d8f4c94bcb81a60949a13d9fe839cee95d03f77	2017-04-27 22:47:41 -07:00
Johann	6dfeea6592	vp9 temporal filter: sse4 implementation Approximates division using multiply and shift. Speeds up both sizes (8x8 and 16x16) by 30 times. Fix the call sites to use the RTCD function. Delete sse2 and mips implementation. They were based on a previous implementation of the filter. It was changed in Dec 2015: `ece4fd5d22` BUG=webm:1378 Change-Id: I0818e767a802966520b5c6e7999584ad13159276	2017-04-26 22:03:05 -07:00
Yunqing Wang	b68f14d0ed	Merge "Make the row based multi-threaded encoder deterministic"	2017-04-26 16:12:14 +00:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
Yunqing Wang	10a497bd38	Make the row based multi-threaded encoder deterministic This patch followed allow_exhaustive_searches feature modification and continued to modify the encoder to achieve determinism in the row based multi-threaded encoding. While row-mt = 1 and using multiple threads, the adaptive feature in the encoder was disabled, which gave a BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%), but some encoder speed losses (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses were acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb	2017-04-24 16:28:27 -07:00
Marco	85ca2e8a8b	vp9: Re-enable SVC datarate tests. Re-enable the SVC tests, wrap the non-zero expectation in GetMismatchFrames around #if CONFIG_VP9_DECODER. Change-Id: I0e8a2d78b868c32f18fe597540f397d3a1b303b5	2017-04-20 12:08:08 -07:00
Luca Barbato	8975436466	ppc: Add the intra predictor tests Change-Id: Idea15b916044ab3d8e74519337880a484ecfd87e	2017-04-19 20:21:40 -07:00
Luca Barbato	914b160fb5	ppc: h predictor 8x8 Slightly faster with the current compiler. Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47	2017-04-19 19:57:51 -07:00
Luca Barbato	0b9be93205	ppc: d63 predictor 8x8 10x faster. Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8	2017-04-19 19:57:51 -07:00
Luca Barbato	ee9325b0bd	ppc: tm predictor 4x4 Slightly faster. Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191	2017-04-19 19:57:51 -07:00
Luca Barbato	2904eb5800	ppc: h predictor 4x4 2x faster. Change-Id: I0583dec353299c6797401b646099f18db4e0420d	2017-04-19 19:57:51 -07:00
Luca Barbato	58245d7050	ppc: dc predictor 8x8 Slightly faster, the other dc predictors cannot be faster since the computation speedup is overwhelmed by the time spent reading dst to write just the 8x8 part. Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66	2017-04-19 19:57:51 -07:00
Luca Barbato	6b4a65e8b1	ppc: d45 predictor 8x8 11x faster. Change-Id: I5b8f39213ee1f5260724fc254e3fb5c462435798	2017-04-19 19:57:51 -07:00
Luca Barbato	92e33c7b31	ppc: d63 predictor 32x32 About 10x faster. Change-Id: If7d0645f75c5d7deb9751edd0bf47e2f9068e9e7	2017-04-19 19:57:51 -07:00
Luca Barbato	a5469a00a8	ppc: d63 predictor 16x16 About 18x faster. Change-Id: Id043bf76c011e03e992085bb5e20f330d3e98cd4	2017-04-19 19:57:51 -07:00
Luca Barbato	cc868da526	ppc: d45 predictor 32x32 About 12x faster. Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed	2017-04-19 19:57:51 -07:00
Luca Barbato	7a7dc9e624	ppc: d45 predictor 16x16 About 16x faster. Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9	2017-04-19 19:57:51 -07:00
Luca Barbato	c08baa2900	ppc: dc predictor 32x32 10x and 5x faster. Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf	2017-04-19 19:57:47 -07:00
Luca Barbato	22ca468c7c	ppc: dc top and left predictor 32x32 6x faster. Change-Id: I717995b4056e5579c68191d11b495372971fe1ae	2017-04-19 19:49:31 -07:00
Luca Barbato	ad9dea1f6d	ppc: dc top and left predictor 16x16 13x faster. Change-Id: I1771ac39fda599153f933cb3f0506c9f97a6cbe6	2017-04-19 19:49:31 -07:00
Luca Barbato	d68d37872c	ppc: dc_128 predictor 32x32 6x faster. Change-Id: I1da8f51b4262871cb98f0aa03ccda41b0ac2b08b	2017-04-19 19:49:31 -07:00
Luca Barbato	f9d20e6df2	ppc: dc_128 predictor 16x16 20x faster. Change-Id: I05f0deb2d38ae7966eae6b71fbc0aa51880e5709	2017-04-19 19:49:31 -07:00
Luca Barbato	0d9417de4a	ppc: tm predictor 32x32 About 8x faster. Change-Id: I9bad827ccbdf47ec95406e961c74ac2ff45f80cf	2017-04-19 19:49:26 -07:00
James Zern	a81f037f15	Merge changes I1f5a3752,I95123051,I3bb724e0,Ie81077fa,Ic80f3c05, ... * changes: ppc: tm predictor 16x16 ppc: tm predictor 8x8 ppc: horizontal predictor 32x32 ppc: horizontal predictor 16x16 ppc: vertical intrapred 16x16 and 32x32 configure: Workaround clang not enabling altivec on -mvsx configure: Match power64 as ppc64	2017-04-20 02:45:45 +00:00
Linfeng Zhang	fbbdba3b04	Merge changes I9e18a73b,Ie47c8cd4 * changes: Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Create CAST_TO_BYTEPTR/SHORTPTR	2017-04-19 23:55:58 +00:00
Linfeng Zhang	bf8a49abbd	Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Replace by CAST_TO_BYTEPTR/SHORTPTR. The rule is: if a short ptr is casted to a byte ptr, any offset operation on the byte ptr must be doubled. We do this by casting to short ptr first, adding offset, then casting back to byte ptr. BUG=webm:1388 Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248	2017-04-19 12:13:49 -07:00
Marco	f34be01190	vp9: Fix the disabling of a SVC 3TL datarate test. Change-Id: Ib42d23ab5ee39ab3c85e1d9a84e36249e59fe74e	2017-04-19 08:01:44 -07:00
Luca Barbato	479443a570	ppc: tm predictor 16x16 About 10x faster. Change-Id: I1f5a3752d346459df3b45f92963208bf3e520f06	2017-04-19 01:48:10 +02:00
Luca Barbato	c8f5a55df4	ppc: tm predictor 8x8 About 5x faster. Change-Id: I951230517f49c0dca9ac9eac2efa8916a303b85a	2017-04-19 01:48:09 +02:00
Luca Barbato	7b0e12934e	ppc: horizontal predictor 32x32 About 5x faster. Change-Id: I3bb724e07baffd901aa2d0f65060ba48882cc9b8	2017-04-19 01:48:09 +02:00
Luca Barbato	a7a2d1653b	ppc: horizontal predictor 16x16 About 10x faster. Change-Id: Ie81077fa32ad214cdb46bdcb0be4e9e2c7df47c2	2017-04-19 01:48:09 +02:00
Luca Barbato	7ad1faa6f8	ppc: vertical intrapred 16x16 and 32x32 Change-Id: Ic80f3c050cfbe7697e81a311b4edaaa597b85cab	2017-04-19 01:48:09 +02:00
Marco	15afee1938	vp9: Disable some SVC tests for now. Disable the 1 pass CBR SVC tests with temporal_layers > 1. Issue with the commit `863f860`, which will cause encoder/decoder mismatch due to skipping encoder loopfilter for non-reference frames. Will re-enable the tests when fixed. Change-Id: I74918a0045a17976b069c4be63fbeb921974df0d	2017-04-18 09:51:42 -07:00
Johann Koenig	a6095333a7	Merge "re-enable vpx_comp_avg_pred_sse2"	2017-04-17 22:07:34 +00:00
Marco Paniconi	9aa429a66d	Revert "Revert "vp9: Avoid encoder loopfilter for non-reference frames."" This reverts commit `e9b7f98c56`. Reason for revert: Commit `d578bdad` fixes the issue (encoder/decoder mismatch in 3TL datarate test) that causes the original revert. Original change's description: > Revert "vp9: Avoid encoder loopfilter for non-reference frames." > > This reverts commit `863f860bfc`. > > This causes encoder / decoder mismatches in various > VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests > > BUG=webm:1408 > > Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14 > TBR=jzern@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com BUG=webm:1408 Change-Id: Ifeb81460856d1d56482d4e0477a70ee98f8bfaa6	2017-04-17 11:02:02 -07:00
Marco	d578bdad02	vp9: Datarate test: modify frame flags for 3 TL. Modify the frame flags to update the ARF on top layer, for the tests: VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayersFrameDropping This is needed to fix the encode/decoder mismatches caused by `863f860`, and removed in the revert `e9b7f98`. Change-Id: I6b9fecfdd17315fc0179e29949338c77636026c0	2017-04-17 09:33:20 -07:00
Johann	9fa24f03b5	re-enable vpx_comp_avg_pred_sse2 Buffers on 32 bit x86 builds only guaranteed 8 byte alignment. Fixed with "AvgPred test: use aligned buffers" and "sad avg: align intermediate buffer" Also re-enable asserts on the C version. BUG=webm:1390 Change-Id: I93081f1b0002a352bb0a3371ac35452417fa8514	2017-04-17 08:40:43 -07:00
Johann Koenig	9e19102972	Merge "AvgPred test: use aligned buffers"	2017-04-17 15:36:41 +00:00
James Zern	4ba20da8b1	Merge "Add AVX2 optimization to copy/avg functions"	2017-04-15 00:26:08 +00:00
Yi Luo	aa5a941992	Add AVX2 optimization to copy/avg functions Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074	2017-04-14 16:50:10 -07:00
Johann Koenig	7178e68bbe	Merge "Disable vpx_comp_avg_pred_sse2"	2017-04-14 22:01:39 +00:00
Johann	e3b2710b04	AvgPred test: use aligned buffers BUG=webm:1390 Change-Id: Idb6d1ce119a09c5e7c9f3c58bbbae3de63463d1d	2017-04-14 12:49:56 -07:00
James Zern	e9b7f98c56	Revert "vp9: Avoid encoder loopfilter for non-reference frames." This reverts commit `863f860bfc`. This causes encoder / decoder mismatches in various VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests BUG=webm:1408 Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14	2017-04-14 11:50:06 -07:00
Johann	eaa7cdf05d	Disable vpx_comp_avg_pred_sse2 Failures on windows: unknown file: error: SEH exception with code 0xc0000005 thrown in the test body. Alignment check errors on linux: test_libvpx: ../libvpx/vpx_dsp/variance.c:230: void vpx_comp_avg_pred_c(uint8_t *, const uint8_t *, int, int, const uint8_t *, int): Assertion `((intptr_t)comp_pred & 0xf) == 0' failed. BUG=webm:1390 Change-Id: I5eed5381c0f1a8fe594a128eb415e77232f544ea	2017-04-14 08:43:06 -07:00
Johann Koenig	bdb593ab20	Merge "vpx_comp_avg_pred: sse2 optimization"	2017-04-14 04:10:56 +00:00
Marco	863f860bfc	vp9: Avoid encoder loopfilter for non-reference frames. Useful for SVC, where the top layer enhancement frames may not update any reference buffers, as is the case for the patterns in the 1 pass CBR SVC when #temporal_layers > 1. ~3% encoder speedup for SVC patterns with temporal layers in 1 pass CBR mode. Updated the SVC datarate tests for the mismatch frames. Set the frame-dropper off in some tests with #temporal_layers > 1 so we can correctly set #mismatch frames. Adjusted rate target threshold for tests where frame-dropper was turned off. Change-Id: Ia0c142f02100be0fed61cd2049691be9c59d6793	2017-04-13 09:51:55 -07:00
Johann	28a8622143	vpx_comp_avg_pred: sse2 optimization Provides over 15x speedup for width > 8. Due to smaller loads and shifting for width == 8 it gets about 8x speedup. For width == 4 it's only about 4x speedup because there is a lot of shuffling and shifting to get the data properly situated. BUG=webm:1390 Change-Id: Ice0b3dbbf007be3d9509786a61e7f35e94bdffa8	2017-04-13 08:44:52 -07:00
Yunqing Wang	1aa46abbdf	VP9 motion vector unit test To prevent the motion vector out of range bug, added a motion vector unit test in VP9. In the 4k video encoding, always forced to use extreme motion vectors and also encouraged to use INTER modes. In the decoding, checked if the motion vector was valid, and also checked the encoder/decoder mismatch. The tests showed that this unit test could reveal the issue we saw before. Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4	2017-04-06 00:50:56 +00:00
Johann Koenig	eec92e8a5b	Merge "vpx_comp_avg_pred: add test"	2017-03-28 21:50:01 +00:00
Johann	6e99ed72a5	vpx_comp_avg_pred: add test BUG=webm:1389 Change-Id: I23cd65f1939db026958ccb5d70b8c5cc9aa5bc51	2017-03-28 14:11:14 -07:00
Marco	07ad5a15c2	vp9: Fix to condition on using source_sad for 1 pass real-time. Make the source_sad feature work properly for cases of VBR or screen_content with SVC. Added unittest for SVC with screen-content on. Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b	2017-03-24 10:21:47 -07:00
Johann	83dd9b36f4	vp9 temporal filter: additional test Change tests to reflect use. Input sizes will be 8 or 16 (but not necessarily square). filter_weight is capped at 2 and filter_strength at 6 Speed test, disabled by default. Change-Id: Idfde9d6c4b7d93aaf0e641b0f4862c15e2a2af7a	2017-03-22 19:37:04 +00:00
Johann	36d732c22b	vp9 temporal filter: add const to function prototype The input frames are not modified. Change-Id: Ideb810e3c5afeb4dbdc4c7d54024c43a8129ad39	2017-03-22 18:14:21 +00:00
Marco	4ddde47d8c	vp9: Modify datarate tests to cover denoising with multi-threading. Change-Id: I6ed48a630edf9923c25a05deaca50e0afec43918	2017-03-21 15:57:33 -07:00
James Zern	e0b4c4d1ae	Merge "Add vpx_highbd_idct32x32_1024_add_neon()"	2017-03-21 03:27:35 +00:00
James Zern	6d71d33d55	Merge "Add vpx_highbd_idct32x32_34_add_neon()"	2017-03-21 03:02:51 +00:00
Johann	775569473d	temporal filter test: update types Use 'int' for w/h since it is that way everywhere else. Pass Buffer pointers Change-Id: I9eef6890af657baba171c6bcfcc85fc976173399	2017-03-17 13:22:28 -07:00
Johann Koenig	9675affae0	Merge "test: add vp9_temporal_filter_apply test"	2017-03-17 18:18:06 +00:00
Linfeng Zhang	27530d484e	Add vpx_highbd_idct32x32_1024_add_neon() BUG=webm:1301 Change-Id: Ib90af0c1712e56b301d0e981dbe9a641e15e36ca	2017-03-17 00:27:46 -07:00
Linfeng Zhang	50b13f75b8	Add vpx_highbd_idct32x32_34_add_neon() BUG=webm:1301 Change-Id: I74dd16c6c64e7bb71aa991cedccddf0663ef5e06	2017-03-17 00:27:46 -07:00
James Zern	2882778310	Merge "Add vpx_highbd_idct32x32_135_add_neon()"	2017-03-17 07:26:52 +00:00
Linfeng Zhang	65e9fb65e8	Add vpx_highbd_idct32x32_135_add_neon() BUG=webm:1301 Change-Id: I58c2d65d385080711c3666d6d8f9d241dac7b21a	2017-03-16 22:37:55 -07:00
Rafael de Lucena Valle	405b94c661	Add Hadamard for Power8 Change-Id: I3b4b043c1402b4100653ace4869847e030861b18 Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>	2017-03-15 23:46:18 -03:00
Jerome Jiang	2fa7092808	Merge "vp9: Enable row multithreading for SVC in real-time mode."	2017-03-14 23:29:46 +00:00
Johann	a14a987c82	test: add vp9_temporal_filter_apply test Add an independent implementation of the filter. BUG=webm:1379 Change-Id: I309c459b493c3011273b78b127a786bb23c59f9c	2017-03-13 15:26:26 -07:00
Linfeng Zhang	b0bfcc368c	Merge "Add vpx_highbd_idct32x32_135_add_c()"	2017-03-13 18:49:01 +00:00
Marco	ffb3c50da1	vp9: Enable row multithreading for SVC in real-time mode. Enable row-mt for SVC for real-time mode, speed >=5. Add the controls to the sample encoders, but keep it off for now. Add the control and enable it for the 1 pass CBR unittests. For speed 7, 3 layer SVC, 2 threads, row-mt enabled gives about ~5% speedup. Change-Id: Ie8e77323c17263e3e7a7b9858aec12a3a93ec0c1	2017-03-10 01:01:07 +00:00
Linfeng Zhang	48f5886605	Add vpx_highbd_idct32x32_135_add_c() When eob is less than or equal to 135 for high-bitdepth 32x32 idct, call this function. BUG=webm:1301 Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6	2017-03-08 10:46:33 -08:00
Jerome Jiang	c4c0331f65	Shift speed 2 from non-large VP9 tests to large ones. This may fix the time out failure of valgrind tests in nightly since more coverages were added on row-mt. Change-Id: Id9414e66d1a266602c7495243d9f5cb69e17ccdc	2017-03-07 13:58:11 -08:00
Vignesh Venkatasubramanian	453f18040f	vp9,realtime: Enable row multithreading for non-rd Enable row level multithreading for realtime encodes where non-rd path is used (speed >= 5). Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41	2017-03-02 11:03:56 -08:00
Chrome Cunningham	b71245683b	Merge "VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface"	2017-03-01 18:01:14 +00:00
Chris Cunningham	bcd0c49af3	VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value is changed as part of this move. Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH. Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419	2017-02-28 17:10:34 -08:00
James Zern	66919e370b	vp9_ethread_test,cosmetics: s/new-mt/row-mt/ Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0	2017-02-28 15:13:11 -08:00
James Zern	3ab8a05b37	stress.sh: add vp9_stress_test_row_mt vp9_stress_test now forces --row-mt=0 to cover both versions Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2	2017-02-28 15:09:30 -08:00
James Zern	b58a8ccb02	stress.sh: parameterize thread count Change-Id: Iae45266cea86585f0935af4012335198cf93719f	2017-02-28 15:09:30 -08:00
James Zern	4684d286de	stress.sh: add one pass encodes Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90	2017-02-28 15:09:30 -08:00
Yunqing Wang	3833905ff2	Add a comment in encoder thread test Added a comment. Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81	2017-02-28 11:13:09 -08:00
Vignesh Venkatasubramanian	ddfe906be2	vp9_ethread_test: Rename new_mt to row_mt Rename left over occurrences of new_mt. Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375	2017-02-27 10:50:02 -08:00
Vignesh Venkatasubramanian	5881601488	vp9: Rename new_mt to row_mt new_mt is a very generic name that will get obsolete soon enough. Since this is exposed as a codec control, renaming it to row_mt to signify row level parallelism. Also renaming the ETHREAD_BIT_MATCH codec control to ROW_MT_BIT_EXACT. Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558	2017-02-27 09:43:26 -08:00
Yunqing Wang	8121f85473	Remove an old leftover comment Removed an old comment that wasn't true anymore. Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7	2017-02-24 18:31:21 -08:00
Yunqing Wang	af9002dd16	Merge "Improve VP9 encoder threading test for better coverage"	2017-02-24 23:26:23 +00:00
Yunqing Wang	cc168054a8	Improve VP9 encoder threading test for better coverage Re-organized the encoder threading tests and grouped tests into 4 parts. Added PSNR checking test to make sure the PSNR variation is within a small range. BUG=webm:1376 Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff	2017-02-24 09:48:29 -08:00
Johann	904b957ae9	consolidate block_error functions vp9_highbd_block_error_8bit_c was a very simple wrapper around vp9_block_error_c. The SSE2 implementation was practically identical to the non-HBD one. It was missing some minor improvements which only went into the original version. In quick speed tests, the AVX implementation showed minimal improvement over SSE2 when it does not detect overflow. However, when overflow is detected the function is run a second time. The OperationCheck test seems to trigger this case and reverses any speed benefits by running ~60% slower. AVX2 on the other hand is always 30-40% faster. Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1	2017-02-24 05:25:26 +00:00
Johann Koenig	57e987576f	Merge "vp8_fdct4x4 test: fix segfault again"	2017-02-23 07:41:21 +00:00
Johann	672100a84e	vp8_fdct4x4 test: fix segfault again The output needs to be aligned. Input is read with 'movq' not 'movdqa' so it is not expected to be aligned. Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4	2017-02-22 18:29:11 +00:00
Yunqing Wang	66f36f4735	Merge "Refactored the row based multi-threading code"	2017-02-22 16:55:04 +00:00
Jerome Jiang	b1dcaf7f1e	Merge "Fix segmentation fault caused by denoiser working with spatial SVC."	2017-02-22 04:44:55 +00:00
Yi Luo	6036a0d24f	Following SSSE3 intrinsics functions also work for HBD - vpx_idct8x8_12_add_ssse3 vpx_idct8x8_64_add_ssse3 vpx_idct32x32_34_add_ssse3 vpx_idct32x32_135_add_ssse3 vpx_idct32x32_1024_add_ssse3 - turn on unit tests. Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7	2017-02-21 12:37:53 -08:00
Jerome Jiang	0d1e5a21c4	Fix segmentation fault caused by denoiser working with spatial SVC. Re-enable the affected test. BUG=webm:1374 Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb	2017-02-21 09:38:28 -08:00
Yi Luo	62a332160f	Merge "Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests"	2017-02-21 16:36:06 +00:00
Ranjit Kumar Tulabandu	97d6a4cbd1	Refactored the row based multi-threading code Modified the code to facilitate bit-match tests in first pass Added unit-tests to test the row based multi-threading behavior for bit-exactness Change-Id: Ieaf6a8f935bb1075597e0a3b52d9989c8546d7df	2017-02-20 16:13:45 +05:30
James Zern	bf6fcebfed	vp8_fdct4x4_test: align input and output buffers fixes segfault in 32-bit builds Change-Id: I5b3cc5a335cb236a6ec4cb11fa8feb54ae0182c7	2017-02-18 13:30:28 -08:00
James Zern	52b3e1a633	datarate_test: disable OnePassCbrSvc2SpatialLayersDenoiserOn segfaults BUG=webm:1374 Change-Id: I3790c6cb8a539d13dee6a8225ef09b1575dea26c	2017-02-17 16:23:22 -08:00
Johann Koenig	9cb470eba7	Merge "vp8_short_fdct4x4: verify optimized functions"	2017-02-17 22:11:08 +00:00
Yi Luo	1f8e8e5bf1	Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests - In SSSE3 optimization, 16-bit addition and subtraction would overflow when input coefficient is 16-bit signed extreme values. - Function-level speed becomes slower (unit ms): idct8x8_64: 284 -> 294 idct8x8_12: 145 -> 158. BUG=webm:1332 Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b	2017-02-17 14:05:05 -08:00
James Zern	3e7025022e	Merge "Add vpx_highbd_idct16x16_10_add_neon()"	2017-02-17 20:29:37 +00:00
Johann	bf05cd3c99	vp8_short_fdct4x4: verify optimized functions Change-Id: I7c7f5dfabde65c09f111fb0ced0e3ad231ee716e	2017-02-16 19:34:50 -08:00
Yi Luo	f62dcc9c33	Replace idct32x32_1024_add_ssse3 assembly with intrinsics - Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on i7-6700, no obvious user-level speed performance downgrade. - Passed unit tests. Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc	2017-02-16 16:10:40 -08:00
Linfeng Zhang	0620081731	Add vpx_highbd_idct16x16_10_add_neon() BUG=webm:1301 Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab	2017-02-16 15:13:50 -08:00
James Zern	6ab0870d45	disable VP9MultiThreadedFrameParallel tests these are flaky and cause TSan warnings with clang-3.9.1 BUG=webm:1372 Change-Id: I8a7047552ba2ccd2d8c45f8795818c74562e5990	2017-02-16 12:56:04 -08:00
Paul Wilkins	e6c1993f1b	Merge "Additional first pass stats."	2017-02-16 09:39:29 +00:00
James Zern	cc04ae1565	Merge "vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism"	2017-02-16 00:21:19 +00:00
Jerome Jiang	2865de86ec	vpx_temporal_svc_encoder: Expose error resilient control to cmd line. Change-Id: Ic74a8690b136ffbc370080f70b2d5a6b1572bf63	2017-02-15 21:45:52 +00:00
Linfeng Zhang	81914ce68a	Add vpx_highbd_idct16x16_38_add_neon() BUG=webm:1301 Change-Id: Ic6cd8c1e63e1b7a997cbed221e20fff4c599e0fe	2017-02-15 09:12:02 -08:00
paulwilkins	945ccfee59	Additional first pass stats. Added counts that split the intra coded blocks into low and high variance. Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14	2017-02-15 10:44:37 +00:00
James Zern	1cd926d665	vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism replace with an explicit output file prefix that matches the function name Change-Id: I7f6a4105adb34327b1099a5fbf132aa8d1ad5b90	2017-02-14 23:44:00 -08:00
Linfeng Zhang	e07e74fb0f	Add vpx_highbd_idct16x16_38_add_c() When eob is less than or equal to 38 for high-bitdepth 16x16 idct, call this function. BUG=webm:1301 Change-Id: I09167f89d29c401f9c36710b0fd2d02644052060	2017-02-14 17:25:52 -08:00
Linfeng Zhang	de9ae32b93	Merge "Add vpx_highbd_idct16x16_256_add_neon()"	2017-02-14 01:15:34 +00:00
Linfeng Zhang	5ad4159ebb	Add vpx_highbd_idct16x16_256_add_neon() BUG=webm:1301 Change-Id: I6bb755552a39bdd26eef3f449601f6a9766c65ec	2017-02-13 15:50:33 -08:00
Johann Koenig	4526ec7907	Merge "fdct8x8 highbd neon: use tran_low_t for output"	2017-02-13 23:11:30 +00:00
Johann	5ecde212a8	fdct8x8 highbd neon: use tran_low_t for output Change-Id: I100c4a1955d80bec4d28e82796b3e7f57e84d0ba	2017-02-13 22:16:14 +00:00
Yunqing Wang	318ca07657	The bitstream bit match test in multi-threaded encoder While the new-mt mode is enabled(namely, allowing to use row-based multi-threading in encoder), several speed features that adaptively adjust encoding parameters during encoding would cause mismatch between single-thread encoded bitstream and multi-thread encoded bitstream. This patch provides a set_control API to disable these features, so that the bit match bitstream is obtained in the unit test. Change-Id: Ie9868bafdfe196296d1dd29e0dca517f6a9a4d60	2017-02-13 13:02:26 -08:00
Linfeng Zhang	016933ad48	Add vpx_highbd_idct{16x16,32x32}_1_add_neon() and update vpx_highbd_idct8x8_1_add_neon() BUG=webm:1301 Change-Id: I18d1a0cbe98ba822d5194c1b4e13a4c29c5c75f4	2017-02-13 10:25:22 -08:00
Linfeng Zhang	bc1c18e18c	Add vpx_idct16x16_38_add_neon() The RunQuantCheck() test on it exposes 16-bit overflow in stage 7 of pass 2. Change to use saturating add/sub for both vpx_idct16x16_38_add_neon() and vpx_idct16x16_256_add_neon() for high bitdepth. Change-Id: Ibf4c107a887553a52852cc582e28d38a5a5a2712	2017-02-08 12:15:22 -08:00
Linfeng Zhang	0fefc6873a	Merge "Add vpx_idct16x16_38_add_c()"	2017-02-08 17:20:19 +00:00
Linfeng Zhang	cf76ee2cb7	Add vpx_idct16x16_38_add_c() When eob is less than or equal to 38 for 16x16 idct, call this function. Change-Id: Ief6f3fb16a49ace3c92cebf4e220bf5bf52a6087	2017-02-07 09:40:51 -08:00
Johann	537949a9df	block_error_fp highbd sse2: use tran_low_t for coeff BUG=webm:1365 Change-Id: Id2ed3ebaaaa6a4b68628c23e08b64ea5f1341761	2017-02-07 15:03:28 +00:00
Yunqing Wang	2a21b45fdc	Fix visual studio build failure Fixed the following issue. ..\test\vp9_ethread_test.cc(69): warning C4805: '\|=' : unsafe mix of type 'bool' and type 'int' in operation [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj] ..\test\vp9_ethread_test.cc(69): warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning) [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj] Change-Id: I37f897cf12a0b7500d2fcbac9e4615f08a83fdb4	2017-02-03 08:36:55 -08:00
Jerome Jiang	a16ca80b09	Merge "Add unit tests for vp9_block_error_fp."	2017-02-02 22:20:42 +00:00
Jingning Han	bb40844e32	Merge "Add SSSE3 intrinsic 8x8 inverse 2D-DCT"	2017-02-02 22:18:32 +00:00
Jerome Jiang	0b60d3ffa5	Add unit tests for vp9_block_error_fp. BUG=webm:1365 Change-Id: I004e5cd7ca331d14b31b7fc3edeee45fce064026	2017-02-02 12:41:51 -08:00
Kaustubh Raste	5b10674b5c	Merge "Add mips msa sum_squares_2d_i16 function"	2017-02-02 08:09:21 +00:00
Johann Koenig	726556dde9	Merge "Remove neon assembly for idct 16x16 and 8x8"	2017-02-02 03:25:31 +00:00
Johann Koenig	ce6318f254	Merge changes I43521ad3,I013659f6 * changes: satd highbd neon: use tran_low_t for coeff satd highbd sse2: use tran_low_t for coeff	2017-02-02 03:03:58 +00:00
Jingning Han	8f95389742	Add SSSE3 intrinsic 8x8 inverse 2D-DCT The intrinsic version reduces the average cycles from 183 to 175. Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03	2017-02-01 14:47:53 -08:00
Johann	f8d744d91a	satd highbd neon: use tran_low_t for coeff BUG=webm:1365 Change-Id: I43521ad32b6c96737a8ef2b8c327f901fd7eaf84	2017-02-01 11:55:47 -08:00
Johann	2ba383474d	satd highbd sse2: use tran_low_t for coeff BUG=webm:1365 Change-Id: I013659f6b9fbf9cc52ab840eae520fe0b5f883fb	2017-02-01 11:55:16 -08:00
Johann	0f751ecee3	hadamard highbd ssse3: use tran_low_t for coeff BUG=webm:1365 Change-Id: I374dfc08732932382043905f128e928b08cb4f57	2017-02-01 11:51:15 -08:00
Johann	1eb8a718bf	hadamard highbd neon: use tran_low_t for coeff BUG=webm:1365 Change-Id: I7e15192ead3a3631755b386f102c979f06e26279	2017-02-01 11:50:46 -08:00
Johann	2dac808dd1	hadamard highbd sse2: use tran_low_t for coeff BUG=webm:1365 Change-Id: Ica414007d8412ceebfffa9e58e8416226a3fe934	2017-02-01 11:46:57 -08:00
Jingning Han	a7949f2dd2	Make satd unit test support all bit-depth settings Turn on satd unit test for c function in both regular and high bit-depth settings. Change-Id: I4b0c56addfb84964ede0da3ab760fe0ee640cfd0	2017-01-31 23:21:32 -08:00
Jingning Han	59917dd18e	Unify the hadamard transform unit test for bit-depth settings Unify the 8x8 and 16x16 Hadamard unit test system for both 8-bit and high bit-depth settings. Change-Id: I53373c1d43f3ced514ad1e53e03f0fb9b25d9ead	2017-01-31 23:21:32 -08:00
Jingning Han	969957f9f2	Fix real-time compression regression in hbd mode This commit resolves the compression performance regression in real-time encoding setting when high bit-depth mode is enabled. The current solution temporarily disables the SIMD implementations of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode. The commit makes the coding results bit-wise identical between regular coding pipeline and high bit-depth at profile 0. BUG=webm:1365 Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf	2017-01-31 23:17:09 -08:00
Johann Koenig	9efc42f4f8	Merge "Use Buffer class for post proc tests"	2017-01-31 15:28:28 +00:00
Kaustubh Raste	750e753134	Add mips msa sum_squares_2d_i16 function average improvement ~4x-5x Change-Id: I8d91b71d0677009be52b412e4f52b40b98573a53	2017-01-31 12:22:43 +00:00
Kaustubh Raste	df7e1fecc1	Add mips msa vpx_minmax_8x8 function average improvement ~4x-5x Change-Id: I83aee9977534fddb8a9b80d31af646c0b6b1a8c3	2017-01-31 10:00:43 +05:30
Kaustubh Raste	407fad2356	Add mips msa vpx Integer projection row/col functions average improvement ~4x-5x Change-Id: I17c41383250282b39f5ecae0197ef1df7de20801	2017-01-27 11:11:42 +05:30
Kaustubh Raste	c1553f859f	Merge "Add mips msa vpx satd function"	2017-01-27 04:08:51 +00:00
Johann	f380a1658d	Use Buffer class for post proc tests Add Buffer features for: Setting the buffer to the output of an ACMRandom function. Copying a buffer. Comparing two buffers. Printing two buffers. Change-Id: Ib53fb602451a3abdcee279ea2b65b51fbc02d3df	2017-01-26 09:50:49 -08:00
Ranjit Kumar Tulabandu	8b0c11c358	Multi-threading of first pass stats collection (yunqingwang) 1. Rebased the patch. Incorporated recent first pass changes. 2. Turned on the first pass unit test. Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee	2017-01-24 15:48:02 -08:00
Yunqing Wang	91aa1fae2a	Merge "Add the multi-threaded first pass encoder unit test"	2017-01-24 17:14:07 +00:00
Kaustubh Raste	182ea677a0	Add mips msa vpx satd function average improvement ~4x-5x Change-Id: If8683d636fe2606d4ca1038e28185bca53bbe244	2017-01-24 10:44:22 +05:30
Johann	270fadc135	PartialIDctTest: reduce number of RunQuantCheck iterations This currently runs 1000 * 1000 = one million times which is quite unnecessary. It's one of the slowest items in Jenkins and takes over an hour for each of the larger transforms. Change-Id: I01653b5e610683e1a2d778ec60cf5065562ab8db	2017-01-23 13:32:09 -08:00
Marco	b71ff28a1a	vp9: Small threshold adjustment to unittest BasicRateTargeting444 Due to recent change to speed >=7 from commit:219cdab. Change-Id: I366e7750ec91119881050ff6c05849504c7959e8	2017-01-21 18:19:45 -08:00
Yunqing Wang	b0d8a75e48	Add the multi-threaded first pass encoder unit test Added the multi-threaded first pass encoder unit test in VP9. The test is to check if the new multi-threaded first pass encoder(namely, new-mt = 1) still generates matching stats. In the unit test, the new-mt mode will be turned on once the multi-threaded first pass implementation is checked in. Change-Id: Ic21bb1a55c454f024cfd2b397a4c148cfe638218	2017-01-20 10:06:24 -08:00
Johann	13234d3c43	Remove neon assembly for idct 16x16 and 8x8 Tested using test/partial_idct_test.cc:DISABLED_Speed Both gcc 4.9 and clang 3.8 from the r13 Android NDK offer improvements using the intrinsics (columns: clang asm, gcc asm, clang intrin, gcc intrin): idct16x16_256: 1720ms, 1703ms, 1546ms, 1554ms; idct16x16_10: 1320ms, 1247ms, 518ms, 488ms; idct16x16_1: 107ms, 108ms, 64ms, 68ms; idct8x8_64: 924ms, 931ms, 866ms, 989ms; idct8x8_12: 826ms, 824ms, 519ms, 514ms; idct8x8_1: 172ms, 166ms, 110ms, 125ms. idct8x8_64 isn't quite perfect (slight regression with gcc intrinsics) but as a counter example idct16x16_10 goes from ~1300ms to ~500ms On a sample clip, clang improved from 48.5 to 49fps and gcc stayed roughly stable. BUG=webm:1303 Change-Id: I9d4fd2b41b46ea6174a887b40a82c8e6e4769ed4	2017-01-19 12:27:31 -08:00
Kaustubh Raste	e0c0e65378	Add mips msa vpx hadamard functions average improvement ~4x-5x Change-Id: I167132d894c04fa85dda8dde7906ff9c61b3a65d	2017-01-19 14:44:03 +05:30
Marco Paniconi	baa4a290eb	Merge "vp9: Make the denoiser work with spatial SVC."	2017-01-12 17:54:41 +00:00
Johann Koenig	3628975a15	Merge "Create a class for buffers used in tests"	2017-01-12 01:02:58 +00:00
Johann	6886da7547	Create a class for buffers used in tests Demonstrate its use with the IDCT test. Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b	2017-01-11 08:28:39 -08:00
hui su	7a0bfa6ec6	Add "Large" label to VP9 target level tests Also reduce the number of test frames. Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08	2017-01-10 17:29:43 -08:00
Marco	7e3a82c384	vp9: Make the denoiser work with spatial SVC. If enabled denoiser will only denoise the top spatial layer for now. Added unittest for SVC with denoising. Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b	2017-01-10 17:23:58 -08:00
Johann Koenig	371a64bfe7	Merge "postproc: vpx_mbpost_proc_down_neon"	2017-01-09 19:53:15 +00:00
Johann Koenig	cabc29ba24	Merge "Add mips dspr2 partial idct tests"	2017-01-09 19:49:02 +00:00
Johann Koenig	7b18202e74	Merge "Add mips dspr2 vp9 intrapred tests"	2017-01-09 19:39:13 +00:00
Johann	c23970ec25	postproc: vpx_mbpost_proc_down_neon This was much more amenable to optimization than the across filter. Speedup of almost 2.5x BUG=webm:1320 Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4	2017-01-09 10:21:56 -08:00
Johann Koenig	9af97fb630	Merge "postproc: vpx_mbpost_proc_across_ip_neon"	2017-01-09 18:17:26 +00:00
Kaustubh Raste	6377f9d966	Add mips dspr2 partial idct tests Change-Id: Idf4003ea6f9a2a42a9f26e156bee73697acb7a37	2017-01-09 17:30:16 +05:30
Kaustubh Raste	c6ccd1e939	Add mips dspr2 vp9 intrapred tests Change-Id: I6be8c59ee220af0597bc2d7213f2779ac2e88db9	2017-01-09 14:11:57 +05:30
Johann	4dca923454	postproc: vpx_mbpost_proc_across_ip_neon The speedup is pretty poor. I would be concerned except the SSE2 is worse: Existing SSE2 improvement: 22% New neon improvement: 35% BUG=webm:1320 Change-Id: Ied598a261134aa6cbe69f96f58589d2bae17bf62	2017-01-06 16:39:17 -08:00
hui su	337ad83e58	Add support for VP9 level targeting Constraints on encoder config: -target_bandwidth is no larger than 80% of level bitrate limit -target_bandwidth * (1 + max_over_shoot_pct) is no larger than 88% of level bitrate limit -min_gf_interval is no smaller than level limit -tile_columns is no larger than level limit Constraints on rate control: -current frame size plus previous three frames' size is no larger than the CPB level limit -current frame size is no larger than 50%/40%/20% of the CPB level limit if it's a key/alt-ref/other frame. Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b	2017-01-06 10:07:31 -08:00
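
The level-targeting constraints listed in commit 337ad83e58 can be sketched in C as below. This is only an illustration of the stated rules, not the libvpx implementation: the struct names, field names, and function names are hypothetical, `max_over_shoot_pct` is assumed to be a whole-number percentage, and integer arithmetic is used so the 80%/88% and 50%/40%/20% thresholds compare exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-level limits; names are illustrative, not libvpx's. */
typedef struct {
  int64_t max_bitrate;   /* level bitrate limit */
  int64_t max_cpb_size;  /* level CPB limit, same units as frame sizes */
  int min_gf_interval;   /* smallest golden-frame interval the level allows */
  int max_tile_columns;  /* largest tile-column count the level allows */
} LevelLimits;

/* Hypothetical subset of the encoder configuration. */
typedef struct {
  int64_t target_bandwidth;
  int max_over_shoot_pct; /* assumed to be a whole percent, e.g. 10 = 10% */
  int min_gf_interval;
  int tile_columns;
} EncoderConfig;

typedef enum { FRAME_KEY = 0, FRAME_ALT_REF = 1, FRAME_OTHER = 2 } FrameKind;

/* Encoder-config constraints from the commit message. */
bool config_fits_level(const EncoderConfig *cfg, const LevelLimits *lim) {
  assert(cfg != NULL && lim != NULL);
  /* target_bandwidth <= 80% of the level bitrate limit */
  if (cfg->target_bandwidth * 100 > lim->max_bitrate * 80) return false;
  /* target_bandwidth * (1 + overshoot) <= 88% of the level bitrate limit */
  if (cfg->target_bandwidth * (100 + cfg->max_over_shoot_pct) >
      lim->max_bitrate * 88) {
    return false;
  }
  if (cfg->min_gf_interval < lim->min_gf_interval) return false;
  if (cfg->tile_columns > lim->max_tile_columns) return false;
  return true;
}

/* Rate-control constraints from the commit message. */
bool frame_fits_level(int64_t frame_size, const int64_t prev_sizes[3],
                      FrameKind kind, const LevelLimits *lim) {
  /* Per-frame cap as a percent of the CPB limit: key / alt-ref / other. */
  static const int kMaxPct[3] = { 50, 40, 20 };
  assert(prev_sizes != NULL && lim != NULL);
  /* Current frame plus the previous three must fit in the CPB limit. */
  const int64_t window =
      frame_size + prev_sizes[0] + prev_sizes[1] + prev_sizes[2];
  if (window > lim->max_cpb_size) return false;
  return frame_size * 100 <= lim->max_cpb_size * kMaxPct[kind];
}
```

A caller would run `config_fits_level` once when the configuration is set and `frame_fits_level` per encoded frame, passing the sizes of the three preceding frames.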

2249 Commits