generic-library/vpx

Author	SHA1	Message	Date
Johann	d52cb59729	quantize: copy ssse3 optimizations to intrinsics Fairly minor differences from sse2. pabsw and psignw are the big gains. Also re-uses some values in eob calculation to avoid an extra pcmp. Fixes test failures in HBD and OS X builds. Allows using it in 32bit builds, where it is about 40% faster than sse2. Substantially faster than the assembly for skip_block. 10-20% faster the rest of the time. Change-Id: If783bb3567e561e47667e10133b9c84414a334e2	2017-08-08 12:22:14 -07:00
Marco	427de67e63	vp9: Partition logic adjustment for speed 6 feature. When adapt_partition_source_sad is enabled (currently only at speed 6 for resoln <= 360p): use lower subsize (8x8 instead of 16x16) for nonrd_select_partition on 32X32 blocks. And force avoiding rectangular partition checks in nonrd_pick_partition for speed >= 6. Small increase ~0.5 in metrics for speed 6 on rtc_derf, no change in speed. Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae	2017-08-08 11:31:27 -07:00
Linfeng Zhang	853165ba39	Update 32x32 idct sse2 funcs, add partial case 135 Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a	2017-08-07 17:37:02 -07:00
Linfeng Zhang	d670678f26	Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx() in idct x86 code Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083	2017-08-04 15:33:37 -07:00
Linfeng Zhang	fa829e0e5a	Replace multiplication_and_add() with butterfly() in idct x86 code Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e	2017-08-04 15:33:34 -07:00
Linfeng Zhang	c9fb719ee1	Update butterfly() in idct x86 optimizations. Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701	2017-08-04 15:33:28 -07:00
Linfeng Zhang	7f20c3ac44	Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1 BUG=webm:1412 Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca	2017-08-04 15:31:17 -07:00
Linfeng Zhang	22b6dc9fdf	Update for loop increment of idct x86 functions Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4	2017-08-04 15:29:19 -07:00
Linfeng Zhang	0c61331244	Update high bitdepth 16x16 idct x86 code Prepare for high bitdepth 16x16 idct sse4.1 code. Just functions moving and renaming. BUG=webm:1412 Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a	2017-08-04 15:12:33 -07:00
Johann Koenig	cbb83ba4aa	Merge "quantize test: consolidate sizes"	2017-08-04 20:34:50 +00:00
Johann	9578a84205	quantize test: consolidate sizes Pass a max txfm size parameter and combine the base quantize test with the 32x32 test. Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b	2017-08-04 12:45:32 -07:00
Scott LaVarnway	c42517568d	vpx_dsp: merge avx2 variance files BUG=webm:1404 Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4	2017-08-04 07:49:30 -07:00
Kaustubh Raste	39e8b8dac6	Fix mips dspr2 6 tap filter clobber list Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646	2017-08-04 10:56:56 +05:30
Linfeng Zhang	e921c7ba8d	Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function"	2017-08-04 01:16:35 +00:00
Scott LaVarnway	f6c6f37e0c	Merge "vpx_dsp: Use correct check for halfpel in"	2017-08-03 23:17:09 +00:00
Linfeng Zhang	563d58ab84	Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function BUG=webm:1412 Change-Id: I945f0fb6807b8948747243794dc7352b959221f7	2017-08-03 13:59:47 -07:00
Linfeng Zhang	6624f20785	Merge changes I76727df0,I66297d78,I1d000c6b * changes: Extract inlined 16x16 idct sse2 code into header file Add transpose_32bit_8x4() sse2 optimization Update x86 idct optimization	2017-08-03 20:51:02 +00:00
Scott LaVarnway	8334a48d3a	vpx_dsp: Use correct check for halfpel in vpx_sub_pixel_variance32xh_avx2() and vpx_sub_pixel_avg_variance32xh_avx2 see: `17fae3a` Change to use correct check for halfpel Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c	2017-08-03 06:57:40 -07:00
paulwilkins	76d77aa013	Enable emergency fast Q adaptation for VBR test case. Enable fast adaptation of Q when there is a large overshoot for the #ifdef AGGRESSIVE_VBR test case. AGGRESSIVE_VBR is not currently enabled by default. Change-Id: I7240bb6589795964b6b0b66df4468e4f21504e0f	2017-08-03 12:06:07 +01:00
Yunqing Wang	6843e7c7f3	Merge "Force the bit exactness in the first pass"	2017-08-03 00:03:10 +00:00
Linfeng Zhang	15a47db730	Extract inlined 16x16 idct sse2 code into header file Will be called by high bitdepth functions. Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c	2017-08-02 16:17:43 -07:00
Linfeng Zhang	8c0ab7607e	Add transpose_32bit_8x4() sse2 optimization Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955	2017-08-02 16:15:58 -07:00
Yunqing Wang	bfd0f41f9b	Force the bit exactness in the first pass Originally, for the purpose of keeping a fast first pass, the first-pass stats between row_mt_mode = 0 and row_mt_mode = 1 are not bit exact, but that difference is very small that doesn't cause a mismatch between the final bitstreams. However, if the encoder changes, this minor difference may cause a mismatch. Thus, this patch always forces the first pass to be bit exact. BUG=webm:1453 Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8	2017-08-02 15:58:39 -07:00
Johann Koenig	787970a625	Merge "quantize test: add speed comparison"	2017-08-02 21:16:35 +00:00
Marco	b9577e07fc	vp8: Drop due to overshoot for non-screen content. For 1 pass CBR mode: Apply the logic for dropping (and re-adjusting rate control) due to large overshoot to the case of non-screen content when drop_frames_allowed is enabled. For the non-screen content case: add additional condition that rate correction factor is close to minimum state, and flag to constrain the frequency of the dropping. Also handle the case of temporal layers and multi-res encoding. Add some flags/counters to the layer context for temporal layers. For multi-res: drop due to overshoot is checked on lowest stream, and if overshoot is detected we force drops on all upper streams for that frame. This feature is to avoid large frame sizes on big content changes following low content period. No change in behavior for screen_content_mode = 2. Change-Id: I797ab236cbbf3b15cad439e9a227fbebced632e6	2017-08-02 13:12:48 -07:00
Scott LaVarnway	698e56f26c	Merge "vpxdsp: variance_impl_avx2.c cleanup"	2017-08-02 19:08:10 +00:00
Johann	1059b5cc52	quantize test: add speed comparison Test some possible scenarios. Change-Id: I1a612e7153b31756be66390ceea55877856d5a33	2017-08-02 09:33:35 -07:00
Scott LaVarnway	632fe8286a	vpxdsp: variance_impl_avx2.c cleanup BUG=webm:1404 Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02	2017-08-02 05:57:39 -07:00
shiyou yin	0e87b16022	Merge "loongson mmi configuration patch."	2017-08-02 01:08:43 +00:00
Linfeng Zhang	6738ad7aaf	Update x86 idct optimization Move constant coefficients preparation into inline function. Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1	2017-08-01 14:40:12 -07:00
Linfeng Zhang	c0490b52b1	Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"	2017-08-01 21:39:39 +00:00
Johann Koenig	847394fe77	Merge "neon: vpx_quantize_b"	2017-08-01 16:44:31 +00:00
Paul Wilkins	3be14200fc	Merge "Respond more rapidly to excessive local overshoot."	2017-08-01 08:58:36 +00:00
Marco Paniconi	c22b17dcef	Merge "vp9: Adjust noise estimation for 360p."	2017-08-01 02:48:13 +00:00
Marco	5d6c1c2d8f	vp9: Adjust noise estimation for 360p. Change-Id: Ib76875232491b14f7114061e8e913e87004427a0	2017-07-31 17:12:58 -07:00
Linfeng Zhang	bf14d468c1	Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2 This replaces commit `aa1c4cd`, which has a bug and was reverted in commit `3c73e58`. The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d(). Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8	2017-07-31 16:36:13 -07:00
James Zern	78e2da3e42	Merge "highbd_inv_txfm_sse4: make << of neg. val a multiply"	2017-07-31 22:43:41 +00:00
Johann	2d6b5df657	neon: vpx_quantize_b With skip block or coeff < zbin it is about twice as fast as C. If most coeff values are > zbin it is about 10-15x as fast as C. BUG=webm:1426 Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7	2017-07-31 10:38:46 -07:00
YinShiyou	2758de5cb2	loongson mmi configuration patch. enable loongson mmi optimization: ../configure --enable-mmi Change-Id: I7792c3adeac1d5b573917d7857bba6c1cc05fea5	2017-07-31 17:29:36 +00:00
Marco Paniconi	ebb023deb6	Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad."""	2017-07-31 14:58:15 +00:00
Marco	999bd6ea84	vp9: Fix denoising condition when pickmode partition is used. When the superblock partition is based on the nonrd-pickmode, we need to avoid the denoising. Current condition was based on the speed level. This change is to make the condition at the superblock level, as the switch in partitioning may be done at sb level based on source_sad (e.g., in speed 6). Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04	2017-07-30 23:16:38 -07:00
Jerome Jiang	f027908ad0	Revert "Revert "vp9: Speed feature to adapt partition based on source_sad."" This reverts commit `c9266b8547`. Disable source_sad when resolution > 1080P. The test should pass now. BUG=webm:1452 Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082	2017-07-30 19:49:31 -07:00
James Zern	78155b7ed5	highbd_inv_txfm_sse4: make << of neg. val a multiply left shifting a negative value is undefined; quiets a ubsan warning. this is applied to a constant, no change in the generated code. Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254	2017-07-30 12:48:28 -07:00
James Zern	facb124941	Merge "Revert "vp9: Speed feature to adapt partition based on source_sad.""	2017-07-30 03:26:10 +00:00
James Zern	c9266b8547	Revert "vp9: Speed feature to adapt partition based on source_sad." This reverts commit `064fc570ff`. This causes an assertion failure in vp9_mcomp.c when running gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41: `mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2)) - 1)' Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6	2017-07-29 11:36:58 -07:00
James Zern	d35b627340	Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2" This reverts commit `aa1c4cd140`. This fails the following tests with extreme input coefficients: SSE2/InvTrans8x8DCT.CompareReference/0 SSE2/InvTrans8x8DCT.CompareReference/2 previously the optimized path was skipped in this range Change-Id: I9af015a46eba96208834a219fafd651d37556a80	2017-07-29 11:12:27 -07:00
Marco Paniconi	5d0bef4763	Merge "vp9: Adjust logic in source sad for screen content."	2017-07-29 01:46:58 +00:00
Marco Paniconi	e48dfcead1	Merge "vp9: Speed feature to adapt partition based on source_sad."	2017-07-29 01:45:19 +00:00
Jerome Jiang	ac211fe23e	vp9: Adjust logic in source sad for screen content. Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3	2017-07-28 17:25:41 -07:00
Marco	064fc570ff	vp9: Speed feature to adapt partition based on source_sad. Move the source_sad feature to speed 6 (from speed 7), and add speed feature to switch from the variance-based partition to reference_partition (which uses nonrd-pickmode for bsize selection) if source_sad is high. Currently used only for speed 6 for resoln <= 360p. About 4-5% improvement on 360p in RTC set. Some speed slowdown, but still ~30% faster than speed 5. Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c	2017-07-29 00:20:26 +00:00
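
Note on `78155b7ed5` above: that commit replaces a left shift of a negative value with a multiplication, because left-shifting a negative signed integer is undefined behavior in C. The sketch below shows the general pattern only. It is not the libvpx code, and the helper name `scale_by_pow2` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libvpx: scale `value` by 2^bits.
 * Writing `value << bits` is undefined when `value` is negative, so the
 * shift is applied to the positive constant 1 and the result is multiplied.
 * The caller must still make sure the product fits in int32_t. */
static int32_t scale_by_pow2(int32_t value, int bits) {
  assert(bits >= 0 && bits < 31); /* keeps 1 << bits itself well defined */
  return value * (1 << bits);
}
```

For example, `scale_by_pow2(-3, 4)` evaluates to -48 without shifting a negative operand. When the operand is a compile-time constant, as the commit message says of its own case, the generated code does not change.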

17809 Commits