generic-library/vpx

Author	SHA1	Message	Date
Linfeng Zhang	201dcefafe	Update idct test code to test 8-bit & high bitdepth simultaneously Change-Id: Icc0eb9c0ddf2a13ec832877a089450972134e8ec	2016-12-13 17:25:04 -08:00
Linfeng Zhang	834feffe08	Update TEST_P(PartialIDctTest, RunQuantCheck) 1. Use correct projections when copying real dct/quant outputs. 2. Remove local random number generator and combine loops. 3. Quantization with minimum allowed step sizes instead of maximum. This may generate larger inputs. Change-Id: I154afc26230c894d564671cff4b8fd5485b69598	2016-12-07 11:34:00 -08:00
Linfeng Zhang	17a8cf5cc3	Add high bitdepth 4x4 idct NEON intrinsics Change-Id: I4afc130effa05b8be2e9f982967216b1beb2ce4b	2016-11-30 13:07:13 -08:00
Linfeng Zhang	6cc76ec73f	Update vpx_idct4x4_16_add_neon() to pass SingleExtremeCoeff test Change-Id: Icc4ead05506797d12bf134e8790443676fef5c10	2016-11-22 11:35:05 -08:00
Linfeng Zhang	45876b4550	Add idct speed test. Change-Id: I3b5fd3b36cac1fb3a93e27fd8fd0781c91d412ce	2016-11-22 11:19:24 -08:00
Linfeng Zhang	d479c9653e	Update partial_idct_test.cc to support high bitdepth BUG=webm:1301 Change-Id: Ieedadee221ce539e39bf806c41331f749f891a3c	2016-11-22 11:11:58 -08:00
James Zern	f6921412d4	partial_idct_test: s/SingleLargeCoef/SingleExtremeCoeff/ tests with 'Large' in the name are reserved for slow running tests which may not be run on all platforms Change-Id: I2a7d6dd46b29b50469893e46433844132fb727c2	2016-11-17 12:28:57 -08:00
James Zern	2218a4c292	partial_idct_test: use <limits> for int16_min/max this removes the need for __STDC_LIMIT_MACROS which is defined in vpx_integer.h, but may be preceded by earlier includes of stdint.h; fixes build with the r13 ndk Change-Id: I3950c8837cf90d5584a20ce370ae370581c2182c	2016-11-15 12:18:38 -08:00
James Zern	c344dee463	partial_idct_test,NEON: add missing idct variants idct4x4 and idct8x8 were universally enabled for high-bitdepth builds in: `3ae2597` idct,NEON: add a tran_low_t->s16 load adapter BUG=webm:1294 Change-Id: If142afb169c48728cc4b222e7c41aa4a63f95f0f	2016-11-08 18:29:35 -08:00
James Zern	738c8f23c6	enable vpx_idct32x32_34_add_neon in hbd builds replace load_and_transpose_s16_8x8() in idct32_6_neon() with a separate load_tran_low_to_s16() and transpose_s16_8x8(). the combined function is used in idct32_8_neon() where the input is the correctly sized output from the earlier stage. BUG=webm:1294 Change-Id: I4257c4b3a421b2cf5d13651f966eee0680ef98a9	2016-11-08 17:03:36 -08:00
Johann	50b40f114c	Optimize idct32x32_135_add for NEON BUG=webm:1295 Change-Id: I7f80ef4d29813fcb401fc6075babf19e3c195462	2016-11-08 22:06:07 +00:00
James Zern	40bcb96abd	partial_idct_test: set MinSupportedCoeff for NEON vpx_idct4x4_16_add_neon fails with INT16_MIN, +1 is all right BUG=webm:1335 Change-Id: I25830c8ab0782822fc3c9db6cc669c2e65f2700e	2016-11-07 15:47:09 -08:00
Johann	e851160642	idct test: use coeff consistently Change-Id: I913a13066993a3315a0ff8310b3cad1572d4cdd7	2016-11-04 18:41:59 -07:00
Johann	9ad3e14015	partial_idct_test: Add large coefficient test Two functions do not pass this test: vpx_idct8x8_64_add_ssse3 vpx_idct8x8_12_add_ssse3 The test has been modified to avoid triggering an issue with those functions but they still must be investigated. BUG=webm:1332 Change-Id: I52569a81e8e6e0b33c4a4d060d0b69c3fc4f578e	2016-11-04 18:37:58 -07:00
Johann	7994dba6c0	partial_idct_test: add _add_ test The result of the transform is added to the destination buffers. In the existing tests the destination buffer is always empty so that portion of the code was never exercised. Change-Id: I1858c4fed2274f1b9faf834d2ba4186a4510492a	2016-10-26 21:35:49 -07:00
Johann	ed2c240538	partial_idct_test: consolidate block size Use *input_block_ for sizeof() calculation like the other test Change-Id: I1e4bd227131662056405af78c5052ad6ef769e9f	2016-10-26 21:35:03 -07:00
Johann	08e0da30ca	Refactor partial idct test Switch to using correctly sized inputs and outputs. This simplifies adding tests with varying strides. Change-Id: I716a0d8173dcf6a86d56656ac9d3101b7ec27642	2016-10-26 12:28:18 -07:00
Johann	9720b58aac	Optimize idct32x32_34_add for NEON Approximately 3 times faster than the 1024 version which was used previously. BUG=webm:1295 Change-Id: Id15fb3d096029ec38ef01c53e5f6eb08254347c9	2016-10-25 15:43:58 -07:00
James Zern	a6be7ba1aa	enable idct*_1_add_neon in high-bitdepth builds these are compatible as they only load one element of the input so the larger size of tran_low_t makes no difference in little endian builds. note the asm is incompatible with big-endian, but there are other points of failure there so currently it's considered unsupported. BUG=webm:1294 Change-Id: Icd2665a0699bccae92d1bea43a95b0a83fb17028	2016-10-05 11:14:25 -07:00
Johann	24c0146403	Connect partial IDCT tests Change-Id: Ie8d5d9123f5a9d39db4ec9c74f77ee979ae4e685	2016-10-04 10:31:01 -07:00
clang-format	9c9d92ae3a	test: apply clang-tidy google-readability-braces-around-statements applied against a x86_64 configure with and without --enable-vp9-highbitdepth clang-tidy-3.7.1 \ -checks='-,google-readability-braces-around-statements' \ -header-filter='.' -fix + clang-format afterward Change-Id: Ia2993ec64cf1eb3505d3bfb39068d9e44cfbce8d	2016-08-05 20:02:28 -07:00
clang-format	33e40cb5db	test: apply clang-format Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f	2016-07-27 01:58:52 +00:00
Johann	0266e70c52	test: remove x86inc.asm distinction BUG=b:29583530 Change-Id: I296a0b81755e3086bc0a40cb126d0200ff03c095	2016-06-30 11:14:10 -07:00
Jingning Han	08a453b9de	Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names This commit clears the function naming convention in vpx_dsp. It replaces vp9_ prefix of global functions with vpx_ prefix. It also removes the vp9_ prefix from static functions. Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf	2015-08-04 13:46:11 -07:00
Jingning Han	097d59c28c	Cosmetics - Fix header file order in unit tests Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798	2015-07-29 20:48:25 -07:00
Jingning Han	4b5109cd73	Replace vp9_ prefix in 2D-DCT functions with vpx_ Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25	2015-07-28 16:06:44 -07:00
Jingning Han	b67821f37b	Factor forward 2D-DCT transforms into vpx_dsp This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward transform operations into vpx_dsp folder. Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d	2015-07-22 15:48:17 -07:00
Johann	ff8505a54d	Fix --disable-use-x86inc Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06	2015-07-01 13:15:51 -07:00
Parag Salasakar	54a6f73958	mips msa vp9 idct4x4 and iwht4x4 optimization average improvement ~3x-4x moved assert to respective files Change-Id: I6c915059d456a00bdd76fab0dd2eede8b6c6ea58	2015-06-02 12:16:28 +05:30
Parag Salasakar	6af9d7f2e2	mips msa vp9 updated idct 8x8, 16x16 and 32x32 module Updated sources according to improved version of common MSA macros. Enabled idct MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: I1f488ab2c741f6c622b7a855388a202168082209	2015-06-01 09:24:23 +05:30
Parag Salasakar	f9f078ebb6	mips msa vp9 updated macros and disable all MSA functions Done little restructuring/styling changes to the sources like generic macro definitions, their use to reduce code lines, better code alignments etc. Disabled all MSA hooks and tests Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190	2015-05-29 13:34:33 +05:30
Parag Salasakar	7c5f00f868	mips msa vp9 idct 8x8 optimization average improvement ~4x-6x Change-Id: I5edf713721b9e24c7e0ce2e69d8fc3ecab625d91	2015-05-08 12:23:27 +05:30
Parag Salasakar	a8a9c2bb45	Merge "mips msa vp9 idct 32x32 optimization"	2015-05-08 04:27:44 +00:00
James Zern	fd3658b0e4	replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79	2015-05-07 11:55:08 -07:00
Parag Salasakar	1601c1385a	mips msa vp9 idct 32x32 optimization average improvement ~4x-6x Change-Id: Idaba7e49fbd7f388caee0d73773ccf6e4807ef17	2015-05-07 12:42:23 +05:30
Parag Salasakar	60052b618f	mips msa vp9 idct 16x16 optimization average improvement ~4x-6x Change-Id: I55e95b7f2ba403dff11813958dc7c73a900dd022	2015-05-05 12:37:06 +05:30
Yaowu Xu	47767609fe	Remove vp9_idct16x16_10_add_ssse3() The rotation computation using 2X of cos(pi/16) has a potential to overflow 32 bit, this commit disable the function to allow further investigation and optimization. Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf	2015-04-30 09:07:30 -07:00
James Zern	8845334097	vp9: fix high-bitdepth NEON build remove incorrect specializations in rtcd and update a configuration check in partial_idct_test.cc Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0	2015-03-31 17:45:25 -07:00
Johann	26a0721268	Enable neon idct tests for intrinsics Change-Id: I45d4a22f3ecb9af172e37c95f168805e492c5493	2014-12-10 18:20:04 -08:00
James Zern	8da5088da1	Revert "Fix SSSE3 partial_idct_test detection" This reverts commit `7d07f512cd`. this breaks visual studio builds: '#' : invalid character : possibly the result of a macro expansion Change-Id: I77170d549afb71e75a878fa0f6acd204fe8d9e67	2014-11-13 11:32:02 -08:00
Johann	7d07f512cd	Fix SSSE3 partial_idct_test detection The test filter is not a prefix matcher. It requires test type to contain no more than the optimization type. In this example, SSSE3_64 fails to match and the test is not skipped even when SSSE3 is not available. Change-Id: Ia74229a167c88da4e6da169012a7a77d438c3f75	2014-11-05 12:58:08 -08:00
Deb Mukherjee	d50716face	Incorporate WRAPLOW macro into non-highbitdepth tx Incorporates the WRAPLOW macro into the non-highbitdepth transforms to aid hardware verification between a software C model and an intended hardware implementation through the use of the configure options: --enable-experimental --enable-emulate-hardware. Note that to avoid further discrepancies between the sse/sse2 implementations of the transforms and the C implementation, when the emulate hardware option is invoked, we also disable sse/sse2/etc. Also includes some minor cleanups/renaming etc. Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287	2014-10-03 11:38:05 -07:00
Deb Mukherjee	10783d4f3a	Adds high bitdepth transform functions and tests Adds various high bitdepth transform functions and tests. Much of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform coefficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map to int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8	2014-09-11 19:56:33 -07:00
James Zern	49135d3748	partial_idct_test: drop '_t' from local typenames _t is reserved by posix + switch to camelcase http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Type_Names Change-Id: Ied220e09fceef53039f234ebbb7e51c8e081c84e	2014-07-18 20:39:06 -07:00
James Zern	29e1b1a4b0	tests: add API_REGISTER_STATE_CHECK used to wrap API functions to ensure full environment consistency as opposed to the renamed ASM_REGISTER_STATE_CHECK which is used with assembly functions. currently checks the FPU tag word in x86/x86_64 gcc builds to ensure emms has been called. Change-Id: Ie241772dbf903d33d516a1add4c8c6783f2e1490	2014-07-10 12:40:31 -07:00
Jingning Han	7eaad70bf7	Enable unit test for partial 16x16 inverse 2D-DCT This commit enables unit test for SSSE3 16x16 inverse 2D-DCT with 10 non-zero coefficients. It includes a new test condition to cover the potential overflow issue due to extremely coarse quantization. Change-Id: I945e16f05dfbe19500f0da5f15990feba8e26d99	2014-06-03 19:06:39 -07:00
Jingning Han	6d21cbd20b	Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs This commit enables SSSE3 implementation of the inverse 2D-DCT with only first 10 coefficients non-zero. It reduces the runtime of SSE2 version from 745 cycles to 538 cycles, i.e., 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe	2014-05-28 10:53:33 -07:00
Johann	ce23931a3f	Only build neon assembly for armv7 targets Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477	2014-05-12 08:52:02 -07:00
Jingning Han	b466ad5efc	Turn on unit tests for SSSE3 8x8 forward and inverse 2D-DCT Change-Id: I3edd4b956a1273d65547771bf43c5cdaea25e5d6	2014-05-08 10:53:27 -07:00
Jingning Han	41a350a83d	Change eob threshold for partial inverse 8x8 2D-DCT to 12 The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT allows to handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2	2014-05-08 09:48:58 -07:00

56 Commits