generic-library/vpx

Author	SHA1	Message	Date
Johann Koenig	f53b656207	Merge "quantize avx: copy implementation to intrinsics"	2017-08-23 21:14:13 +00:00
Johann	7c27872164	quantize avx: copy implementation to intrinsics Adds an early exit based on ptest. Slightly slower than ssse3 in the full case because of the extra check, but potentially faster if lots of rows can be skipped. Very close in speed to the assembly. Can run in 32 bit, unlike the assembly. Allows reworking the function prototype to use structs. Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e	2017-08-23 09:19:16 -07:00
Johann	e83d99d7b8	quantize fp: neon implementation About 4x faster when values are below the dequant threshold and 10x faster if everything needs to be calculated. Both numbers would improve if the division for dqcoeff could be simplified. BUG=webm:1426 Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2	2017-08-23 08:01:30 -07:00
Shiyou Yin	59e065b6ed	vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi. Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2	2017-08-23 15:14:15 +08:00
James Zern	419ce36294	Merge "ppc: Add vpx_idct16x16_256_add_vsx"	2017-08-22 00:48:39 +00:00
Shiyou Yin	bff5aa9827	Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi."	2017-08-22 00:37:23 +00:00
Johann	661efeca97	quantize test: test _fp_ version of quantize None of the x86 optimizations pass the tests. Change-Id: Ic67f2ba1977b657e68f2a13b0711fc5fcbafd909	2017-08-21 12:29:41 -07:00
Johann	13eed991f9	Remove skip_block from quantize This condition is handled before this code is reached. The ssse3 version of the function has always crashed when attempting to handle the skip_block condition. Add assert() and comments regarding the usage of skip_block. Removing the parameter is a fairly involved process so leave it be for the moment. Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a	2017-08-21 09:49:04 -07:00
Shiyou Yin	7d82e57f5b	vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi. Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e	2017-08-18 09:06:49 +08:00
Paul Wilkins	372336d1e5	Merge "Fix corrupt arf groups due to low "lag_in_frames""	2017-08-16 18:25:29 +00:00
Linfeng Zhang	f95686895b	Merge changes I08b562b6,Ia275940a,I51106e90 * changes: Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} Update highbd idct x86 optimizations. Update 32x32 idct sse2 and ssse3 optimizations.	2017-08-16 16:36:37 +00:00
paulwilkins	48110d0f79	Fix corrupt arf groups due to low "lag_in_frames" Having a very small value for "lag_in_frames" can result in corrupt arf groups including displayed frames that update the arf buffer and fake overlay frames that are not in fact overlays of real arfs but are nevertheless starved of bits. Leaving lag_in_frames at the default of 25 for these 5 frame two pass VBR tests should now give rise to a valid ARF coding pattern as follows:- K(ey), A(rf), N(ormal), N, N, O(verlay). This change is part of a response to BUG=webm:1454 where broken arf groups interacted badly with a change that corrects for large rate misses. However, it may still in some cases increase encode time by virtue of the fact that the unit test now codes a correct coding pattern with "hidden" ARF frames. Change-Id: Ifd0246a4c1d0be247247c754024d7a4ed5f66a6b	2017-08-16 14:07:24 +01:00
Johann Koenig	c59d1a4dc7	Merge changes I1f1edeaa,I89313cac * changes: quantize: silence unsigned overflow warning quantize test: quiet overflow warning	2017-08-15 17:37:59 +00:00
Johann	08cb7b5c68	quantize test: quiet overflow warning Promote the result of RandRange to signed Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128	2017-08-15 08:28:09 -07:00
Linfeng Zhang	d72e20b123	Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} BUG=webm:1412 Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f	2017-08-14 17:05:22 -07:00
Scott LaVarnway	fa85cf131c	vp9: strip temporal filter code when CONFIG_REALTIME_ONLY is enabled. BUG=webm:1446 Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52	2017-08-14 14:27:53 -07:00
Johann Koenig	ff184e482a	Merge changes I4b4beab1,I02f74dec * changes: quantize test: check skip_block quantize test: use negative input	2017-08-14 20:52:52 +00:00
Johann Koenig	45b39750d6	Merge "temporal filter test: adjust inputs and runtime"	2017-08-14 20:46:22 +00:00
Johann	c06d6649c5	temporal filter test: adjust inputs and runtime Use input with a narrow range because the filter only applies when the frames are similar. Run CompareReferenceRandom more times. Especially before narrowing the input range, the filter frequently did not apply. Change-Id: Ie249bedf6d0d33dfa5884611cb1835788e418b38	2017-08-14 17:24:11 +00:00
James Zern	746c0eab3b	disable SSSE3/VP9QuantizeTest* in hbd builds this test fails with the configuration similar to the assembly prior to: `d52cb5972` quantize: copy ssse3 optimizations to intrinsics BUG=webm:1458 Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf	2017-08-14 09:31:14 -07:00
Johann Koenig	9bb8ce5efb	Merge "neon: vpx_quantize_b_32x32"	2017-08-10 15:42:49 +00:00
Johann Koenig	0b393ae505	Merge "quantize: copy ssse3 optimizations to intrinsics"	2017-08-10 15:42:20 +00:00
Johann	357adb68b2	quantize test: check skip_block Not all sizes were tested previously. Only 4x4 and 32x32 Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315	2017-08-08 14:21:58 -07:00
Johann	1092cc7f1a	quantize test: use negative input coeff contains signed values. Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23	2017-08-08 14:19:56 -07:00
Johann	93166c5e51	neon: vpx_quantize_b_32x32 With skip block the neon is about twice as fast as C. The neon has no shortcut for coeff < zbin so it always takes the same amount of time. Even if the C can take the shortcut, it is over twice as fast in neon. If it can't, that gap increases to over 10x. BUG=webm:1426 Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6	2017-08-08 14:05:18 -07:00
Johann	d52cb59729	quantize: copy ssse3 optimizations to intrinsics Fairly minor differences from sse2. pabsw and psignw are the big gains. Also re-uses some values in eob calculation to avoid an extra pcmp. Fixes test failures in HBD and OS X builds. Allows using it in 32bit builds, where it is about 40% faster than sse2. Substantially faster than the assembly for skip_block. 10-20% faster the rest of the time. Change-Id: If783bb3567e561e47667e10133b9c84414a334e2	2017-08-08 12:22:14 -07:00
Linfeng Zhang	853165ba39	Update 32x32 idct sse2 funcs, add partial case 135 Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a	2017-08-07 17:37:02 -07:00
Linfeng Zhang	7f20c3ac44	Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1 BUG=webm:1412 Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca	2017-08-04 15:31:17 -07:00
Johann Koenig	cbb83ba4aa	Merge "quantize test: consolidate sizes"	2017-08-04 20:34:50 +00:00
Johann	9578a84205	quantize test: consolidate sizes Pass a max txfm size parameter and combine the base quantize test with the 32x32 test. Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b	2017-08-04 12:45:32 -07:00
Linfeng Zhang	563d58ab84	Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function BUG=webm:1412 Change-Id: I945f0fb6807b8948747243794dc7352b959221f7	2017-08-03 13:59:47 -07:00
Yunqing Wang	6843e7c7f3	Merge "Force the bit exactness in the first pass"	2017-08-03 00:03:10 +00:00
Yunqing Wang	bfd0f41f9b	Force the bit exactness in the first pass Originally, to keep the first pass fast, the first-pass stats for row_mt_mode = 0 and row_mt_mode = 1 were not bit exact, but the difference was so small that it did not cause a mismatch between the final bitstreams. However, if the encoder changes, this minor difference may cause a mismatch. Thus, this patch always forces the first pass to be bit exact. BUG=webm:1453 Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8	2017-08-02 15:58:39 -07:00
Johann	1059b5cc52	quantize test: add speed comparison Test some possible scenarios. Change-Id: I1a612e7153b31756be66390ceea55877856d5a33	2017-08-02 09:33:35 -07:00
Johann Koenig	847394fe77	Merge "neon: vpx_quantize_b"	2017-08-01 16:44:31 +00:00
Johann	2d6b5df657	neon: vpx_quantize_b With skip block or coeff < zbin it is about twice as fast as C. If most coeff values are > zbin it is about 10-15x as fast as C. BUG=webm:1426 Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7	2017-07-31 10:38:46 -07:00
Linfeng Zhang	75653b7032	Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34 * changes: Update vpx_idct16x16_10_add_sse2() Add vpx_idct16x16_38_add_sse2() Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2 Refactor highbd idct 4x4 and 8x8 x86 functions	2017-07-28 22:43:00 +00:00
James Zern	3c73e587d1	Revert "quantize ssse3: declare all variables" This reverts commit `03f5e300d6`. This causes test failures under OSX: SSSE3/VP9QuantizeTest.EOBCheck/0 SSSE3/VP9QuantizeTest.OperationCheck/0 Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b	2017-07-28 01:22:16 -07:00
Linfeng Zhang	7f4acf8700	Add vpx_idct16x16_38_add_sse2() Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898	2017-07-27 18:02:43 -07:00
Linfeng Zhang	9c43d81bc2	Refactor highbd idct 4x4 and 8x8 x86 functions BUG=webm:1412 Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec	2017-07-27 18:01:03 -07:00
Johann Koenig	a83e1f1d53	Merge "quantize ssse3: declare all variables"	2017-07-27 21:18:35 +00:00
Alexandra Hájková	666c543f7b	ppc: Add vpx_idct16x16_256_add_vsx Change-Id: Ibc3f7965423fd91179f8d8e77c7ae3e6d7f80572	2017-07-25 12:34:15 +00:00
Johann	af08fbb444	quantize test: promote RandRange() result to signed Avoid unsigned overflow warning: unsigned integer overflow: 19974 - 32703 cannot be represented in type 'unsigned int' Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12	2017-07-20 08:17:48 -07:00
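The overflow quoted in the message above (19974 - 32703 in `unsigned int`) can be reproduced in isolation. The following is a minimal sketch, not the test's actual code: `signed_offset` and `wrapped_offset` are hypothetical names, and the only point is that both operands are converted to a signed type before the subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: subtract two unsigned samples and return a signed
 * result. Both operands are converted to int32_t first, so a difference
 * below zero is represented directly rather than wrapping around. */
int32_t signed_offset(uint32_t sample, uint32_t bias) {
  /* Values above INT32_MAX would not convert cleanly; the sketch assumes
   * small inputs such as 16-bit coefficients. */
  assert(sample <= (uint32_t)INT32_MAX && bias <= (uint32_t)INT32_MAX);
  return (int32_t)sample - (int32_t)bias;
}

/* For contrast: the same subtraction carried out in unsigned arithmetic
 * wraps modulo 2^32, which is what the sanitizer reported. */
uint32_t wrapped_offset(uint32_t sample, uint32_t bias) {
  return sample - bias;
}
```

With the numbers from the warning, `signed_offset(19974, 32703)` yields -12729, while `wrapped_offset(19974, 32703)` yields 4294954567.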
Johann	c782f27ead	quantize test: lowbd functions do not pass in highbd qcoeff output looks OK but dqcoeff is no good. BUG=webm:1448 Change-Id: I07211db8a8b74f1f45fdd059852e2de0e5ee18fd	2017-07-20 08:17:48 -07:00
Johann	bde2e4aa36	quantize test: eob is output eob values are generated by the function. Change-Id: I8ce92100e83022bff99888a5a7e6ef378c49fda3	2017-07-19 14:17:19 -07:00
Johann	03f5e300d6	quantize ssse3: declare all variables Copy missing line from avx implementation. Change-Id: I9755c5b4d4034867de6fa9f741c24bf49dce3a27	2017-07-18 12:32:57 -07:00
Johann	101981b736	quantize test: test sse2 and avx optimizations ssse3 does not pass either of the tests. avx 32x32 does not pass. Change-Id: I62c2e31336fd2327327afaa0da896ad79a3def44	2017-07-18 12:08:16 -07:00
Johann	c7ebe82253	quantize test: extend arrays Officially the quant structures are 8 elements, with one dc element and 7 repeated ac elements. The low bit depth optimizations take advantage of this to fill the xmm registers. The high bit depth version manually duplicates the values. If all the optimizations were unified, the structure sizes could be greatly reduced. Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae	2017-07-18 09:55:47 -07:00
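The layout described above, one dc element followed by seven repeated ac elements, can be sketched as follows. The names `QUANT_ENTRIES` and `fill_quant_row` are hypothetical and not taken from the encoder; the sketch assumes 16-bit entries, so that eight of them occupy 128 bits, the width of one xmm register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: one dc entry plus seven copies of the ac entry. */
#define QUANT_ENTRIES 8

/* Fill one quantizer row so that a single 128-bit load picks up the dc
 * value in lane 0 and the ac value in lanes 1..7. */
void fill_quant_row(int16_t row[QUANT_ENTRIES], int16_t dc, int16_t ac) {
  int i;
  assert(row != NULL);
  row[0] = dc;
  for (i = 1; i < QUANT_ENTRIES; ++i) {
    row[i] = ac;
  }
}
```

An implementation that duplicates the values itself would only need the two distinct entries, which is the size reduction the message alludes to.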
Johann	cb61ba02f4	quantize test: restrict and correct input Use only valid values for quantize inputs. These were determined by looping over vp9_init_quantizer and looking for max and min values. This allows extending the test to the low bit depth functions which were not designed to handle all possible inputs but only valid inputs. Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e	2017-07-18 09:40:45 -07:00
James Zern	9223b947ca	Merge "fix 'make exampletest' w/CONFIG_REALTIME_ONLY"	2017-07-15 18:37:10 +00:00
Johann	e3fa4ae8e3	quantize test: use Buffer Although the low bitdepth functions are identical (excepting the need for larger intermediate values) they do not pass these tests. This improves the error output to aid debugging. Simplify buffer usage with Buffer and removing unnecessarily aligned variables. eob is a single element and never written using aligned instructions. BUG=webm:1426 Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35	2017-07-13 15:54:48 -07:00
James Zern	960466939d	fix 'make exampletest' w/CONFIG_REALTIME_ONLY for tests that aren't explicitly testing 2-pass behavior use --passes=1 with this configuration Change-Id: I6a1520ecc65d0f626486604310af29dacb9f197f	2017-07-13 10:47:20 -07:00
Johann	e381753926	sad4d neon: 64x[32,64] Rewrite 64x64. BUG=webm:1425 Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf	2017-07-12 13:26:39 +00:00
Johann	e1bde306c8	sad4d neon: 32x[16,32,64] Rewrite 32x32. Use half the accumulator registers. BUG=webm:1425 Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13	2017-07-12 13:25:18 +00:00
Johann	807ce8fb1e	sad4d neon: 16x[8,16,32] Rewrite 16x16. Use half the accumulator registers. BUG=webm:1425 Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2	2017-07-12 13:25:11 +00:00
Johann	8152b0904d	sad4d neon: 8x[4,8,16] BUG=webm:1425 Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4	2017-07-12 13:25:03 +00:00
Johann	dd4347e9ec	sad4d neon: 4x4, 4x8 BUG=webm:1425 Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468	2017-07-12 03:38:03 +00:00
Johann Koenig	4e16f70703	Merge changes Id84d9780,Iaa6ea75b,I3362e0dd,I0020a49e,Ia42e4f36, ... * changes: sad neon: avg for 64x[32,64] sad neon: macroize 64xN definitions sad neon: avg for 32x[16,32,64] sad neon: macroize 32xN definitions sad neon: avg for 16x[8,16,32] sad neon: macroize 16xN definitions	2017-07-07 21:01:23 +00:00
James Zern	5d6060b62f	Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming"	2017-07-07 19:15:02 +00:00
Johann Koenig	6c375b9cd0	Merge "fdct neon: 32x32_rd"	2017-07-07 14:05:51 +00:00
Johann	e4e08556db	sad neon: avg for 64x[32,64] BUG=webm:1425 Change-Id: Id84d97807a6a0fbcc889c4dfe11929d54f85493d	2017-07-07 07:04:04 -07:00
Johann	67cffc1ef6	sad neon: avg for 32x[16,32,64] BUG=webm:1425 Change-Id: I3362e0dded3b46ca032caa7f44db42f324bc596d	2017-07-07 07:04:04 -07:00
Johann	527e0c9b1c	sad neon: avg for 16x[8,16,32] BUG=webm:1425 Change-Id: Ia42e4f36547c5fe12114fb58379e34bce82eb2f2	2017-07-07 07:04:04 -07:00
Johann Koenig	9b253f9f0a	Merge changes I7b36a57e,If2ab51e3,Ifc685a96 * changes: sad neon: macroize 8xN definitions sad neon: avg for 8x[4,8,16] sad neon: avg for 4x4 and 4x8	2017-07-07 14:03:13 +00:00
James Zern	80b83c73ba	cosmetics,vp9/: normalize inv/fwd_txfm naming + vpx_dsp/, test/ itxfm -> inv_txfm, ftxfm -> fwd_txfm Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e	2017-07-06 18:35:44 -07:00
Johann	63bdc574e5	sad neon: avg for 8x[4,8,16] BUG=webm:1425 Change-Id: If2ab51e3050e078b0011b174efe41fcb65a15f44	2017-07-06 07:43:09 -07:00
Johann	6bac3f80ee	sad neon: avg for 4x4 and 4x8 BUG=webm:1425 Change-Id: Ifc685a96cb34f7fd9243b4c674027480564b84fb	2017-07-06 07:12:47 -07:00
Johann	75b00592c7	fdct neon: 32x32_rd About 40% faster than the non-rd version. BUG=webm:1424 Change-Id: Ia99d14eb9532302eeaab8cd3e503395b0374b5a2	2017-07-06 06:30:50 -07:00
James Zern	5227b8200b	vp9: remove FrameWorkerData & vp9_dthread.h the file was empty after the struct removal. the only remaining use was within vp9_dx_iface, but the wrapper became unnecessary after the removal of frame_parallel_decode. BUG=webm:1395 Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd	2017-07-05 22:32:00 -07:00
James Zern	0d245d42c4	Merge "test_vector_test,vp8: correct thread range"	2017-07-05 22:33:51 +00:00
Johann Koenig	9a05f9771a	Merge "test/buffer.h: move range checking to compiler"	2017-07-05 21:15:13 +00:00
James Zern	a22bb9809e	Merge "dct_partial_test: cover vpx_fdct8x8_1_msa in hbd"	2017-07-05 21:08:46 +00:00
Hui Su	3e08a88854	Merge "level tests: allow level undershoot"	2017-07-05 20:47:20 +00:00
James Zern	23d60be414	dct_partial_test: cover vpx_fdct8x8_1_msa in hbd this was enabled in: `5ac88162b` partial fdct test Change-Id: Ibae2031ec1308fe3a3b84a1ce6e7bacda3a7cb82	2017-07-05 13:01:41 -07:00
Johann	da2ad47d66	test/buffer.h: move range checking to compiler Pass low/high values as type T. Out of range values should be caught by static analysis instead. Change-Id: I0a3ee8820af05f4c791ab097626174e2206fa6d5	2017-07-05 11:21:18 -07:00
James Zern	7d526c1654	Merge "buffer.h: incorrect RandRange results"	2017-07-02 03:48:53 +00:00
Johann	6cb3178192	buffer.h: incorrect RandRange results 'low' was promoted to unsigned, triggering a ubsan warning Change-Id: Id49340079d39c105da93cf13e96cf852a93a94ba	2017-07-01 20:01:22 -07:00
Alexandra Hájková	c757d6dde4	ppc: Add vpx_idct8x8_64_add_vsx Change-Id: I4ed1312f365509e0595dcc09890ecb050f6f2069	2017-07-01 12:55:47 -07:00
Alexandra Hájková	d8c277030c	ppc: Add vpx_idct4x4_16_add_vsx Change-Id: Id2673eece32027fb245919c7a5c81994a4a19fd8	2017-07-01 12:32:18 -07:00
James Zern	af3ab45867	test_vector_test,vp8: correct thread range testing::Range does not include the end parameter in the set of values. also adjust the start to 2 as the single threaded case is already covered in another instantiation Change-Id: Iae3bf3ed4363dd434eccfa5ad4e3c5e553fbee60	2017-06-30 16:21:06 -07:00
Johann	c2044fda1d	buffer.h: use stride_ instead of stride() Change-Id: Ib51231349bf0ff3e23672762dc7bfa49b5fe4083	2017-06-30 07:37:20 -07:00
Johann	ce5b17f9ad	testing: ranges for random values Add a method to acm_random.h to generate ranges of values Add a way to call that method to buffer.h Adjust dct_[partial_]test.cc to use it. Change-Id: I8c23ae9d27612c28f050b0e44c41cb4ad2494086	2017-06-30 07:25:30 -07:00
Johann Koenig	89d3dc043e	Merge changes Id5beb35d,I2945fe54,Ib0f3cfd6,I78a2eba8 * changes: partial fdct neon: add 32x32_1 partial fdct neon: add 16x16_1 partial fdct neon: add 4x4_1 partial fdct neon: move 8x8_1 and enable hbd tests	2017-06-30 01:00:07 +00:00
James Zern	67d7a6df2d	Merge changes from topic 'rm-dec-frame-parallel' * changes: rm vp9_frame_parallel_test.cc test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING	2017-06-29 23:21:18 +00:00
James Zern	e5bdab98e9	rm vp9_frame_parallel_test.cc VPX_CODEC_USE_FRAME_THREADING was made a no-op in: `01d23109a` vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op and the tests in this file have been disabled since: `6ab0870d4` disable VP9MultiThreadedFrameParallel tests BUG=webm:1395 Change-Id: I2c7a250acb65cf9522cf8a7bb724bb92070e41c6	2017-06-29 15:15:56 -07:00
James Zern	508ef2a6e3	test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING this was made a no-op in: `01d23109a` vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op and the test hitting this branch has been disabled since: `6ab0870d4` disable VP9MultiThreadedFrameParallel tests rename the test to VP9MultiThreaded to exercise the tile-based threading BUG=webm:1395 Change-Id: I35564a75eb5a7d7f7ccb923133b1b07295201f4c	2017-06-29 15:15:48 -07:00
James Zern	bd77931421	dct_partial_test,fwd_txfm: change << to * left shift of a negative number is undefined in C; quiets a ubsan warning Change-Id: Ib1624ad5326ac8e0eead9348468ef7fe5d4df9a4	2017-06-29 14:42:03 -07:00
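A minimal sketch of the substitution described above, using a hypothetical helper name (`scale_by_pow2`) that is not taken from the test sources: multiplying by a power of two gives the value that a left shift is meant to produce, and remains defined when the operand is negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scale `sum` by 2^shift.
 * `sum << shift` is undefined in C when `sum` is negative, whereas the
 * multiplication is well defined as long as the product fits in int32_t. */
int32_t scale_by_pow2(int32_t sum, int shift) {
  assert(shift >= 0 && shift < 31);
  return sum * (int32_t)(1 << shift);
}
```

For example, `scale_by_pow2(-5, 1)` returns -10 without triggering a sanitizer report.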
Johann	9fe510c12a	partial fdct neon: add 32x32_1 Always return an int32_t. Since it needs to be moved to a register for shifting, this doesn't really penalize the smaller transforms. The values could potentially be summed and shifted in place. BUG=webm:1424 Change-Id: Id5beb35d79c7574ebd99285fc4182788cf2bb972	2017-06-28 15:37:44 -07:00
Johann	f310ddc470	partial fdct neon: add 16x16_1 For the 8x8_1, the highbd output fit nicely in the existing function. 12 bit input will overflow this implementation of 16x16_1. BUG=webm:1424 Change-Id: I2945fe5478b18f996f1a5de80110fa30f3f4e7ec	2017-06-28 15:37:44 -07:00
Johann	4959dd3eb3	partial fdct neon: add 4x4_1 BUG=webm:1424 Change-Id: Ib0f3cfd6116fc1f5a99acb8bfd76e25b90177ffc	2017-06-28 15:37:44 -07:00
Johann	cf75ab6ccd	partial fdct neon: move 8x8_1 and enable hbd tests The function was originally written with HBD in mind. Enable it and configure the tests. BUG=webm:1424 Change-Id: I78a2eba8d4d9d59db98a344ba0840d4a60ebe9a1	2017-06-28 15:37:43 -07:00
Johann Koenig	81e25512c3	Merge changes Ib454762d,I966650df,Ie126553e,I068f06c6,Icb72a94e * changes: sad neon: rewrite 64x64 and add 64x32 sad neon: rewrite 32x32, add 32x16 and 32x64 sad neon: rewrite 16x8, 16x16, add 16x32 sad neon: rewrite 8x8 and 8x16 sad neon: rewrite 4x4 and add 4x8	2017-06-28 22:37:00 +00:00
Johann Koenig	d91af5f905	Merge "buffer.h: Only allow Init() to be called once."	2017-06-28 22:36:05 +00:00
Johann Koenig	35f8515c3f	Merge "partial fdct test"	2017-06-28 22:34:53 +00:00
Johann	5ac88162b9	partial fdct test Test the _1 variant of the fdct, which simply sums the block and applies a modifying shift based on the block size. BUG=webm:1424 Change-Id: Ic80d6008abba0c596b575fa0484d5b5855321468	2017-06-28 20:32:20 +00:00
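The `_1` variant described above can be sketched as a plain block sum followed by a size-dependent scale. The log does not state the per-size shift values, so `scale_shift` is left as a parameter; `fdct_dc_only` and its signature are hypothetical and are not the library's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: DC-only forward transform of a size x size block.
 * scale_shift >= 0 shifts the sum right by that many bits;
 * scale_shift <  0 multiplies the sum by 2^(-scale_shift). */
int32_t fdct_dc_only(const int16_t *input, int stride, int size,
                     int scale_shift) {
  int32_t sum = 0;
  int r, c;
  assert(input != NULL);
  for (r = 0; r < size; ++r) {
    for (c = 0; c < size; ++c) {
      sum += input[r * stride + c];
    }
  }
  if (scale_shift < 0) {
    /* Multiply rather than left-shift so negative sums stay defined. */
    return sum * (int32_t)(1 << -scale_shift);
  }
  /* Right-shifting a negative value is implementation-defined in C; this
   * sketch assumes the arithmetic shift that common compilers provide. */
  return sum >> scale_shift;
}
```

Only the first output coefficient is produced, which is why the commit calls this a partial transform.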
Johann	ad011aaab8	sad neon: rewrite 64x64 and add 64x32 BUG=webm:1425 Change-Id: Ib454762d1c61b05a98324fe81ad58c9e09784717	2017-06-28 12:21:34 -07:00
Johann	469643757f	sad neon: rewrite 16x8, 16x16, add 16x32 BUG=webm:1425 Change-Id: Ie126553e5fffcdfaf3d82a85b368ac10ce9ab082	2017-06-28 12:16:00 -07:00
Johann	e40e78be24	sad neon: rewrite 8x8 and 8x16 BUG=webm:1425 Change-Id: I068f06c67b841f09ea07c04ada0c2f1706102138	2017-06-28 12:15:57 -07:00
Johann	46d8660ce3	sad neon: rewrite 4x4 and add 4x8 The previous implementation loaded 8 values (discarding half) BUG=webm:1425 Change-Id: Icb72a94e2557a4ee2db7091266ab58fd92f72158	2017-06-28 11:14:59 -07:00
Johann	e0330c4810	buffer.h: Only allow Init() to be called once. Change-Id: I041c8b6f314802833c5287a176dbfeec9461b08e	2017-06-28 10:59:39 -07:00
hui su	d4595de5db	level tests: allow level undershoot Obtaining a level that is lower than the target should be tolerated. Change-Id: I90a55ee6d7142e9f6cc525ebbd1e0501defcbe28	2017-06-26 15:17:04 -07:00
Linfeng Zhang	ec4afbf74a	Merge "Add vpx_highbd_idct4x4_16_add_sse4_1()"	2017-06-24 01:15:14 +00:00
James Zern	ee1fcb0e69	Merge "variance_test: move Subpel* from tuples to TestParams"	2017-06-23 22:48:40 +00:00
Linfeng Zhang	8253a27904	Add vpx_highbd_idct4x4_16_add_sse4_1() BUG=webm:1412 Change-Id: Ie33482409351a01be4e89466b0441834eb1e905a	2017-06-23 14:30:12 -07:00
James Zern	0d1c782306	Merge "datarate_test: rename thread -> Thread in test name"	2017-06-23 20:00:51 +00:00
James Zern	54bcd98314	variance_test: move Subpel* from tuples to TestParams this normalizes these tests with the regular variance ones both in implementation and test list output Change-Id: I387aea81456f94b8223b8fb2a28cab94bc1aa9d5	2017-06-23 12:54:18 -07:00
Johann Koenig	794a5ad713	Merge "fdct32x32 neon implementation"	2017-06-23 01:58:00 +00:00
Linfeng Zhang	c5f9de573f	Merge changes I783c5f4f,I365f8e53,I5dac0e98 * changes: Clean vpx_idct16x16_256_add_sse2() Update vpx_idct{8x8,16x16,32x32}_1_add_sse2() Clean 32x32 full idct sse2 and ssse3 code	2017-06-22 21:42:23 +00:00
Johann	e67660cf37	fdct32x32 neon implementation Almost 3x faster in constrained loop testing. Over 10x faster in HBD builds. BUG=webm:1424 Change-Id: I2b7f8453e1d4ada63cde729d8115d684c4a71ff9	2017-06-22 06:40:17 -07:00
James Zern	dd88bd87db	datarate_test: rename thread -> Thread in test name this is consistent with other threaded tests and ensures gtest_filters meant to operate on these pick them up Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b	2017-06-21 20:05:31 -07:00
Linfeng Zhang	2b43a1ee18	Clean 32x32 full idct sse2 and ssse3 code vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are code relocations, no new code. Change-Id: I5dac0e98cc411a4ce05660406921118986638d19	2017-06-21 13:46:49 -07:00
Johann	1c48915233	dct tests: align InvAccuracyCheck buffers 'in' is used for the reference fdct. 'coeff' is input to the idct being tested and 'dst[16]' is output Fixes a segfault on unaligned memory access on x86. Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962	2017-06-21 11:47:00 -07:00
James Zern	0aa3677d9d	fix build, rm ref to vpx_idct8x8_64_add_ssse3 this was deleted in: `98967645a` Remove vpx_idct8x8_64_add_ssse3() but this was merged in: `9e03eedf6` Merge changes Ib26dd515,Ie60dabc3 after: `a92991133` Merge "dct tests: run all possible sizes in one test" which added a new reference Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25	2017-06-20 19:46:45 -07:00
Linfeng Zhang	9e03eedf62	Merge changes Ib26dd515,Ie60dabc3 * changes: Clean 8x8 idct x86 optimization Remove vpx_idct8x8_64_add_ssse3()	2017-06-21 00:38:25 +00:00
Johann	4ebb9a36f1	dct tests: run all possible sizes in one test Modify fdct4x4_test.cc to support all size combinations. This does not add any new tests and in fact fails a few. There were minimal changes made to the tests so it's not entirely surprising that some of the larger 12 bit transforms are failing since it was initially only used for 4x4. In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and dct32x32_test.cc will be evaluated and moved to dct_test.cc. BUG=webm:1424 Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f	2017-06-19 15:39:35 -07:00
Linfeng Zhang	98967645a1	Remove vpx_idct8x8_64_add_ssse3() It is almost identical to vpx_idct8x8_64_add_sse2(), apart from a small difference in instruction order. Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f	2017-06-15 14:09:33 -07:00
Johann Koenig	6dcd9b37ea	Merge "idct_test: don't use std::nothrow anymore"	2017-06-09 20:42:39 +00:00
Johann Koenig	8aa4ee1f10	Merge "buffer.h: allow declaring an alignment"	2017-06-09 20:42:21 +00:00
Johann	92373a5bb2	idct_test: don't use std::nothrow anymore But still check for NULL before calling Init() Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76	2017-06-09 11:09:06 -07:00
Johann	5aee8ea752	buffer.h: allow declaring an alignment x86 simd register operations generally prefer and may require 16 byte alignment. Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba	2017-06-09 11:03:15 -07:00
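The 16-byte requirement mentioned above can be checked, or satisfied inside an over-allocated buffer, with simple address arithmetic. This is a sketch with hypothetical helper names (`is_aligned`, `align_up`); it does not show how buffer.h itself declares its alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when `ptr` sits on an `alignment`-byte boundary. */
int is_aligned(const void *ptr, size_t alignment) {
  assert(alignment != 0);
  return ((uintptr_t)ptr % alignment) == 0;
}

/* Rounds `addr` up to the next multiple of `alignment`, which must be a
 * power of two (16 for the SSE loads and stores discussed here). */
uintptr_t align_up(uintptr_t addr, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (addr + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
}
```

A test buffer that allocates `alignment - 1` extra bytes can hand out `align_up` of its raw address and still stay inside the allocation.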
James Zern	b3a262dff3	Merge "vp8_decode_frame: fix oob read on truncated key frame"	2017-06-08 23:17:50 +00:00
James Zern	45daecb4f7	vp8_decode_frame: fix oob read on truncated key frame the check for error correction being disabled was overriding the data length checks. this avoids returning incorrect information (width / height) for the decoded frame which could result in inconsistent sizes being returned to an application, causing it to read beyond the bounds of the frame allocation. BUG=webm:1443 BUG=b/62458770 Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342	2017-06-08 23:16:04 +00:00
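The principle behind this fix, that a length check must hold regardless of any other decoder setting, can be sketched as below. This is not the decoder's code: `FrameSize` and `parse_keyframe_size` are hypothetical. The byte layout follows the VP8 key frame header in RFC 6386: a 3-byte start code `9d 01 2a`, then 14-bit width and height fields stored little-endian. The sketch leaves out error concealment entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
  int width;
  int height;
} FrameSize;

/* `data` points just past the 3-byte frame tag of a key frame.
 * Returns 1 and fills *out only when the start code and both dimension
 * fields are fully present; returns 0 and leaves *out untouched otherwise. */
int parse_keyframe_size(const uint8_t *data, size_t len, FrameSize *out) {
  assert(data != NULL && out != NULL);
  /* The length check is unconditional: no other flag may bypass it. */
  if (len < 7) return 0;
  if (data[0] != 0x9d || data[1] != 0x01 || data[2] != 0x2a) return 0;
  /* The upper two bits of each 16-bit field carry scaling information. */
  out->width = (data[3] | (data[4] << 8)) & 0x3fff;
  out->height = (data[5] | (data[6] << 8)) & 0x3fff;
  return 1;
}
```

Because a truncated buffer never updates `*out`, a caller cannot end up with dimensions that disagree with the frame it actually allocated.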
Johann	e50ea014c3	Revert "buffer.h: use size_t" This reverts commit `f08581c1d0`. type conversion warnings abound. Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00	2017-06-08 10:20:21 -07:00
Johann Koenig	903375a48a	Merge "fdct16x16 neon optimization"	2017-06-08 15:19:36 +00:00
Johann	eae7cf2368	fdct16x16 neon optimization Roughly 2x speedup. Since the only change for HBD is to store(), the improvement appears to hold there as well. BUG=webm:1424 Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19	2017-06-07 14:59:55 -07:00
Johann Koenig	0c4f74d129	Merge changes Iade45f69,I18d90658,Ieca3f1ef * changes: buffer.h: add num_elements_ buffer.h: zero-init all values buffer.h: use size_t	2017-06-07 19:20:16 +00:00
Johann	902d63759e	buffer.h: add num_elements_ raw_size_ was being incorrectly computed and used Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d	2017-06-07 11:31:20 -07:00
Johann	4a37e3e2a0	buffer.h: zero-init all values Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8	2017-06-07 11:27:26 -07:00
Johann	f08581c1d0	buffer.h: use size_t Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f	2017-06-07 11:24:27 -07:00
James Zern	ff42e04f9c	Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}"	2017-06-06 23:52:39 +00:00
Johann	de4cb716ee	buffer.h: split out init Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287	2017-06-06 15:02:50 -07:00
Johann	8659764a07	buffer.h: Use T for values Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9	2017-06-06 12:05:14 -07:00
James Zern	4753c23983	Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx"	2017-06-06 02:19:41 +00:00
Johann Koenig	755b3daf90	Merge "comp_avg_pred neon: used by sub pixel avg variance"	2017-05-31 18:17:28 +00:00
Johann	f695b30ac2	comp_avg_pred neon: used by sub pixel avg variance BUG=webm:1423 Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804	2017-05-30 22:47:34 +00:00
Jerome Jiang	a5ab38093f	Merge "Fix vp8 race when build --enable-vp9-highbitdepth."	2017-05-30 05:47:44 +00:00
Jerome Jiang	0afa2dad76	Fix vp8 race when build --enable-vp9-highbitdepth. Split vp8/vp9 implementations on yv12_copy_frame_c. Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c. Clean up vp8 codes usage in vp9. BUG=webm:1435 Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d	2017-05-26 09:45:01 -07:00
Johann Koenig	de1a9c77a7	Merge changes Iaab2b9a1,Idfb458d3 * changes: sub pel avg variance neon: 4x block sizes sub pel variance neon: 4x block sizes	2017-05-24 18:33:53 +00:00
Johann Koenig	b11a37f540	Merge changes I31fa6ef8,I228c6f29 * changes: sub pel avg variance neon: add neon optimizations sub pel variance neon: normalize variable names	2017-05-24 18:32:02 +00:00
James Zern	566f6d75bd	partial_idct_test,InitInput: fix rollover in mult promote coeff to signed 64-bit to avoid exceeding integer bounds when squaring the value Change-Id: If77bef6bc0a6a4c39ca3013e5e2ddb426a1c6e1f	2017-05-24 15:27:38 +02:00
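The promotion described above can be illustrated with a short sketch. The helper names are hypothetical and not taken from partial_idct_test.cc; the point is that the cast to `int64_t` happens before the multiplication, so the product is computed in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Square a 32-bit coefficient in 64-bit arithmetic. Casting one operand
 * first makes the whole multiplication 64-bit; casting the result of a
 * 32-bit multiply would be too late. */
int64_t square_coeff(int32_t coeff) {
  return (int64_t)coeff * coeff;
}

/* Hypothetical companion: accumulate squared coefficients without
 * exceeding 32-bit bounds along the way. */
int64_t sum_of_squares(const int32_t *coeff, size_t n) {
  int64_t total = 0;
  size_t i;
  assert(coeff != NULL);
  for (i = 0; i < n; ++i) {
    total += square_coeff(coeff[i]);
  }
  return total;
}
```

For instance, `square_coeff(-46341)` returns 2147488281, which is already larger than `INT32_MAX`.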
Alexandra Hájková	8bf6eaf433	ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64} Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123	2017-05-24 13:27:09 +00:00
Linfeng Zhang	36f1b183e4	Update InitInput() in test/partial_idct_test.cc Make it work in high bit depth. BUG=webm:1412 Change-Id: Ic5cfd410a69709f01e2924774356a108a349d273	2017-05-23 14:24:23 -07:00
Johann	f6fcd3410d	sub pel avg variance neon: 4x block sizes BUG=webm:1423 Change-Id: Iaab2b9a183fdb54aae5f717aba95d90dc36a9e3b	2017-05-22 14:40:05 -07:00
Johann	188d58eaa9	sub pel variance neon: 4x block sizes Add optimizations for blocks of width 4 BUG=webm:1423 Change-Id: Idfb458d36db3014d48fbfbe7f5462aa6eb249938	2017-05-22 14:40:01 -07:00
Johann	9b0d306a2f	sub pel avg variance neon: add neon optimizations These are missing an optimized version of vpx_comp_avg_pred BUG=webm:1423 Change-Id: I31fa6ef842e98f7ff3ea079ffed51ae33178e2ed	2017-05-22 13:58:43 -07:00
Linfeng Zhang	c167345ffb	Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2 BUG=webm:1412 Change-Id: Ia338a6057d36f9ed7eaa9cbd4dfbf0c3cbdc6468	2017-05-22 11:24:21 -07:00
Johann Koenig	e7cac13016	Merge changes Ib8dd96f7,Ie9854b77 * changes: neon variance: process 4x blocks use memcpy for unaligned neon stores	2017-05-22 17:48:33 +00:00
Johann Koenig	3c603eadb4	Merge "neon fdct: 4x4 implementation"	2017-05-19 17:08:58 +00:00
Johann	7b742da63e	neon variance: process 4x blocks Continue processing sets of 16 values. Plenty of improvement for 4x8 (doubles the speed) but only about 30% for 4x4. BUG=webm:1422 Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905	2017-05-17 17:35:01 -07:00
Marco Paniconi	a2dfbbd7d6	Merge "vp9: Modify ChangingDropFrameThresh unittest."	2017-05-17 18:42:51 +00:00
Marco	4733df333f	vp9: Modify ChangingDropFrameThresh unittest. Add another (lower) bitrate to the test, to cover frame drop behavior at low bitrate range. Change-Id: Iaad003974159daf3d2d65ef3a6575a3e72e498d6	2017-05-17 09:38:21 -07:00
Linfeng Zhang	3210ca6d60	Update partial idct testing code Add PartialIDctTest::PrintDiff() to help debugging. In RunQuantCheck, try all combinations of +/-mask_ input for 4x4 idct. Update PartialIDctTest::InitInput(). Change-Id: I13fd163954a4c1a3a6cfeb5e4a4d3d0e7ff901f4	2017-05-17 09:28:32 -07:00
Johann	105503b839	neon fdct: 4x4 implementation Approximately twice as fast as C implementation. BUG=webm:1424 Change-Id: I3c0307fb08ddc23df42545cd089a78e2ed5c9d3f	2017-05-17 07:38:18 -07:00
Alexandra Hájková	bcbc3929ae	ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx Change-Id: Ic9639b1331d8c5cbc207c2a036891ff0137fc56f	2017-05-13 13:13:15 +00:00
James Zern	ac8f58f6ab	Merge changes I1b54a7a5,I3028bdad,I59788cd9 * changes: ppc: Add get_mb_ss_vsx ppc: Add get4x4sse_cs_vsx ppc: Add comp_avg_pred_vsx	2017-05-12 15:24:59 +00:00
Luca Barbato	143b21e362	ppc: Add get_mb_ss_vsx Change-Id: I1b54a7a5bb642e4b836d786ea1ae506eed025e3f	2017-05-12 17:23:00 +02:00
Luca Barbato	6d225eb5f9	ppc: Add get4x4sse_cs_vsx Change-Id: I3028bdadf653665d18e781d28e9625f62804b3d8	2017-05-12 17:23:00 +02:00
Luca Barbato	a7f8bd451b	ppc: Add comp_avg_pred_vsx Change-Id: I59788cd98231e707239c2ad95ae54f67cfe24e10	2017-05-12 17:22:55 +02:00
Alexandra Hájková	f48532e271	ppc: Add vpx_sad64x32/64_vsx Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2	2017-05-12 16:10:16 +02:00
Alexandra Hájková	0b15bf1e54	ppc: Add vpx_sad32x16/32/64_vsx Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd	2017-05-12 16:10:11 +02:00
James Zern	a12ea1d5e9	Merge "ppc: Add vpx_sad16x8/16/32_vsx"	2017-05-12 13:33:51 +00:00
Marco	c5a4376aed	vp9: SVC: allow for setting the interp_filter in non-rd pickmode. For SVC 1 pass non-rd pickmode, the interpolation filter for the upsampling of the golden (spatial) reference was not being explicitly set and instead was taking whatever value was set in the previous mode/block (which would be either EIGHTTAP or EIGHTTAP_SMOOTH). Fix it to the default EIGHTTAP for now, to be updated/selected adaptively in a later change. Minor adjustment to rate targeting thresholds in datarate unittests. Change-Id: I52085048674072c6cfb7163e11e9a2658d773826	2017-05-11 11:45:09 -07:00
Alexandra Hájková	cc7f0c0f3e	ppc: Add vpx_sad16x8/16/32_vsx Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9	2017-05-10 19:57:30 +00:00
Johann Koenig	d713ec3c46	Merge changes I92eb4312,Ibb2afe4e * changes: subpel variance neon: add mixed sizes sub pixel variance neon: use generic variance	2017-05-10 18:19:52 +00:00
Linfeng Zhang	870cf4356c	Update test/partial_idct_test.cc Makes more sense to call the corresponding partial idct C function instead of the full idct C function as the reference. Change-Id: Ibb7681dd063edd6307ba582c10c26c4c6a4b78c6	2017-05-09 13:07:47 -07:00
Johann Koenig	1814463864	Merge changes Id602909a,Ib0e85608 * changes: neon variance: process two rows of 8 at a time neon variance: add small missing sizes	2017-05-08 17:34:20 +00:00
Linfeng Zhang	2c3a2ad6f1	Merge changes I0cfe4117,I3581d80d,Ida62c941 * changes: Split dsp/x86/inv_txfm_sse2.c Update highbd idct functions arguments to use uint16_t dst Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct	2017-05-08 16:15:57 +00:00
Jerome Jiang	3453c8d6c4	Merge "vp9: Neon optimization for denoiser. Add unit tests."	2017-05-06 01:28:32 +00:00
Jerome Jiang	83a2bfd7dc	Merge "Change target bitrate thresh in denoiser test."	2017-05-06 01:28:15 +00:00
Jerome Jiang	fff358fb06	Change target bitrate thresh in denoiser test. An intended behavior change disabling exhaustive searches in speed feature causes VP9/DatarateTestVP9LargeDenoiser.4threads test failure. Change the threshold to make it pass. BUG=webm:1429 Change-Id: Ibcbe2314c6b2525799894f5d7204fc8eb4ec2a1e	2017-05-05 16:50:19 -07:00
Jerome Jiang	069eedb3a0	vp9: Neon optimization for denoiser. Add unit tests. Denoiser on Neon is 5x faster than C code. BUG=webm:1420 Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb	2017-05-05 16:40:52 -07:00
Johann	2346a6da4a	subpel variance neon: add mixed sizes Add support for everything except block sizes of 4. Performance is better but numbers will improve again when the variance optimizations land. BUG=webm:1423 Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80	2017-05-04 15:30:01 -07:00
Johann	462e29703c	fdct 8x8 neon: minor comment cleanup Simplify HBD/non distinction in test. Document why transpose_neon.h is not used Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52	2017-05-04 15:14:23 -07:00
Johann	cb9133c72f	neon variance: add small missing sizes Some of the mixed sizes were missing. They can be implemented trivially using the existing helper function. When comparing the previous 16x8 and 8x16 implementations, the helper function is about 10% faster than the 16x8 version. The 8x16 is very close, but the existing version appears to be faster. BUG=webm:1422 Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004	2017-05-04 08:59:42 -07:00
Linfeng Zhang	d5de63d2be	Update highbd idct functions arguments to use uint16_t dst BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5	2017-05-03 13:59:16 -07:00
Linfeng Zhang	081b39f2b7	Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct BUG=webm:1388 Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112	2017-05-03 13:58:31 -07:00
Yi Luo	a3452996a1	High bit depth inter prediction horizontal/vertical filters AVX2 User level speed improvement on i7-6700, cpu-used=1, x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps: - Decoder: 1080p: ~4% 4K: ~5% - Encoder: 1080p: ~1% 4K: ~3% Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640	2017-05-03 12:18:01 -07:00
James Zern	5599e4275a	Merge changes Ia5293d94,I90d481d3,Ia509d622,I54549b03,I89b635d6 * changes: ppc: Add convolve8_vsx and convolve8_avg_vsx ppc: Add convolve8_avg_vert_vsx ppc: Add convolve8_vert ppc: Add convolve8_horiz_avg ppc: Add convolve8_horiz	2017-05-03 03:31:19 +00:00
Luca Barbato	e2ad89092d	ppc: Add convolve8_vsx and convolve8_avg_vsx Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857	2017-05-02 20:27:47 -07:00
Luca Barbato	e6ca81ee67	ppc: Add convolve8_avg_vert_vsx Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d	2017-05-02 20:27:42 -07:00
Luca Barbato	a65f1771ad	ppc: Add convolve8_vert Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f	2017-05-02 20:27:33 -07:00
Luca Barbato	77772350f3	ppc: Add convolve8_horiz_avg The 8x8 and larger blocks cases can be sped up further. Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15	2017-05-02 20:27:28 -07:00
Luca Barbato	08edb85bd0	ppc: Add convolve8_horiz The 8x8 and larger blocks cases can be sped up further. Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046	2017-05-02 20:27:16 -07:00
James Zern	ee3df31d74	Merge "vpx_scale_test: fix segfault on alloc failure"	2017-05-01 19:22:22 +00:00
James Zern	2930903d51	vpx_scale_test: fix segfault on alloc failure check the return of ResetImage() before continuing Change-Id: Iff0b038f7b9761113b8cf33a511a5306640d1273	2017-04-29 13:12:53 -07:00
Luca Barbato	d51d3934f5	ppc: Add convolve_avg Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014	2017-04-29 15:47:25 +02:00
Luca Barbato	63860ba7b8	ppc: Add convolve_copy Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405	2017-04-29 15:47:25 +02:00
Jerome Jiang	bea27a5809	Merge "Generalize vp9 sse2 denoiser test for other platforms."	2017-04-28 15:45:52 +00:00
Johann Koenig	94ebdba71d	Merge "vp9 temporal filter: sse4 implementation"	2017-04-28 13:22:41 +00:00
Jerome Jiang	26aebd77b8	Generalize vp9 sse2 denoiser test for other platforms. Renamed to vp9_denoiser_test. Change-Id: I0d8f4c94bcb81a60949a13d9fe839cee95d03f77	2017-04-27 22:47:41 -07:00
Johann	6dfeea6592	vp9 temporal filter: sse4 implementation Approximates division using multiply and shift. Speeds up both sizes (8x8 and 16x16) by 30 times. Fix the call sites to use the RTCD function. Delete sse2 and mips implementation. They were based on a previous implementation of the filter. It was changed in Dec 2015: `ece4fd5d22` BUG=webm:1378 Change-Id: I0818e767a802966520b5c6e7999584ad13159276	2017-04-26 22:03:05 -07:00
Yunqing Wang	b68f14d0ed	Merge "Make the row based multi-threaded encoder deterministic"	2017-04-26 16:12:14 +00:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
Yunqing Wang	10a497bd38	Make the row based multi-threaded encoder deterministic This patch followed the allow_exhaustive_searches feature modification and continued to modify the encoder to achieve determinism in the row based multi-threaded encoding. While row-mt = 1 and using multiple threads, the adaptive feature in the encoder was disabled, which gave BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%), but some encoder speed losses (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses were acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb	2017-04-24 16:28:27 -07:00
Marco	85ca2e8a8b	vp9: Re-enable SVC datarate tests. Re-enable the SVC tests, wrap the non-zero expectation in GetMismatchFrames around #if CONFIG_VP9_DECODER. Change-Id: I0e8a2d78b868c32f18fe597540f397d3a1b303b5	2017-04-20 12:08:08 -07:00
Luca Barbato	8975436466	ppc: Add the intra predictor tests Change-Id: Idea15b916044ab3d8e74519337880a484ecfd87e	2017-04-19 20:21:40 -07:00
Luca Barbato	914b160fb5	ppc: h predictor 8x8 Slightly faster with the current compiler. Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47	2017-04-19 19:57:51 -07:00
Luca Barbato	0b9be93205	ppc: d63 predictor 8x8 10x faster. Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8	2017-04-19 19:57:51 -07:00
Luca Barbato	ee9325b0bd	ppc: tm predictor 4x4 Slightly faster. Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191	2017-04-19 19:57:51 -07:00
Luca Barbato	2904eb5800	ppc: h predictor 4x4 2x faster. Change-Id: I0583dec353299c6797401b646099f18db4e0420d	2017-04-19 19:57:51 -07:00

2199 Commits