generic-library/vpx

Author	SHA1	Message	Date
Johann	eae7cf2368	fdct16x16 neon optimization Roughly 2x speedup. Since the only change for HBD is to store(), the improvement appears to hold there as well. BUG=webm:1424 Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19	2017-06-07 14:59:55 -07:00
Marco Paniconi	9cea3a3c4e	Merge "vp9: SVC: Enable simple_block_yrd for temporal layers."	2017-06-07 21:12:14 +00:00
Johann Koenig	0c4f74d129	Merge changes Iade45f69,I18d90658,Ieca3f1ef * changes: buffer.h: add num_elements_ buffer.h: zero-init all values buffer.h: use size_t	2017-06-07 19:20:16 +00:00
Marco	14d4718043	vp9: SVC: Enable simple_block_yrd for temporal layers. Enable simple_block_yrd for temporal enhancement layers (TL > 0). And remove block size condition for SVC mode. Only affects speed >= 7 SVC. Speedup ~3-4%. avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%. Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed	2017-06-07 11:41:50 -07:00
Johann	902d63759e	buffer.h: add num_elements_ raw_size_ was being incorrectly computed and used Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d	2017-06-07 11:31:20 -07:00
Johann	4a37e3e2a0	buffer.h: zero-init all values Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8	2017-06-07 11:27:26 -07:00
Johann	f08581c1d0	buffer.h: use size_t Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f	2017-06-07 11:24:27 -07:00
Marco	13b02a8efe	vp9: SVC: Enable row-mt in sample encoder. Change-Id: I4b51043cb3f5955efe947fe4685aed4a21adb8bd	2017-06-07 10:32:44 -07:00
James Zern	ff42e04f9c	Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}"	2017-06-06 23:52:39 +00:00
Marco Paniconi	27b34a109d	Merge "vp9: SVC: Adjust some speed settings for SVC speed >= 7."	2017-06-06 23:07:45 +00:00
Marco	7d2f5f8e9d	vp9: SVC: Adjust some speed settings for SVC speed >= 7. Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE for all temporal enhancement layer frames. Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a	2017-06-06 15:30:24 -07:00
Johann	de4cb716ee	buffer.h: split out init Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287	2017-06-06 15:02:50 -07:00
Johann	8659764a07	buffer.h: Use T for values Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9	2017-06-06 12:05:14 -07:00
Jerome Jiang	cf07d85809	Initialize cost_list all to INT_MAX. It is initialized to be { INT_MAX, 0, ... } in ffe0f9b. No effect on encoders. Make it consistent with other initializations. BUG=webm:1440 Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922	2017-06-06 10:42:37 -07:00
James Zern	6df142e2ab	vp9_mcomp,get_cost_surf_min: quiet conversion warning visual studio will warn if a 32-bit shift is implicitly converted to 64. in this case integer storage is enough for the result. since: f3a9ae5ba Fix ubsan failure in vp9_mcomp.c. Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc	2017-06-05 22:52:58 -07:00
Jerome Jiang	968a5d6bc2	Merge "Fix valgrind failure on uninitialized variables."	2017-06-06 03:47:31 +00:00
James Zern	4753c23983	Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx"	2017-06-06 02:19:41 +00:00
Jerome Jiang	ffe0f9b7fb	Fix valgrind failure on uninitialized variables. BUG=webm:1440 Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4	2017-06-05 13:09:29 -07:00
Jerome Jiang	f3a9ae5baa	Fix ubsan failure in vp9_mcomp.c. Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d	2017-06-02 21:37:13 -07:00
Marco	e30781ff80	vp9: SVC: Force subpel search off under certain conditions. For SVC 1 pass non-rd mode: Force subpel search off for SVC for non-reference frames under motion threshold. Add flag to svc context to indicate if the frame is not used as a reference. Little/no quality loss, ~2% speedup. Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9	2017-06-01 20:48:52 -07:00
Marco Paniconi	ff637d1903	Merge "vp9: Speed >8: Set subpel_search_method for low motion."	2017-06-01 23:57:19 +00:00
Marco	8c6fa5c5e3	vp9: Speed >8: Set subpel_search_method for low motion. Speed >=8: for resolutions above CIF, and for low motion content, set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE. Small speed gain (~2%) on vga clips, RTC metrics up by ~2-3% on average. Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc	2017-06-01 16:16:13 -07:00
Jerome Jiang	68f035026f	vp8 skin detection: Fix visual studio build failure. Change-Id: I510b755550ebbfa2aaf9b974920d7f1c6454a845	2017-06-01 13:46:46 -07:00
Jerome Jiang	e254969df2	Fix corruption in skin map debugging output yuv. For both vp8 and vp9. BUG=webm:1437 Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28	2017-06-01 16:59:43 +00:00
Jerome Jiang	f1a300acc4	vp8: Clean up skin detection. Use only the average of center 2x2 pixels in vp8. Change-Id: I2b23ff19a90827226273e0fca49e90c734eda59b	2017-05-31 14:57:10 -07:00
Johann Koenig	755b3daf90	Merge "comp_avg_pred neon: used by sub pixel avg variance"	2017-05-31 18:17:28 +00:00
Jerome Jiang	32d8992147	Merge "Write skin map of vp8 skin detection for debug."	2017-05-31 16:37:07 +00:00
Linfeng Zhang	30ea3ef283	Merge "Update vpx_highbd_idct4x4_16_add_sse2()"	2017-05-31 15:56:20 +00:00
Johann	f695b30ac2	comp_avg_pred neon: used by sub pixel avg variance BUG=webm:1423 Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804	2017-05-30 22:47:34 +00:00
Jerome Jiang	c39526da8a	Write skin map of vp8 skin detection for debug. Change-Id: Ica1b4e918aa759cd0ce65920f9d88452bbf9e3b4	2017-05-30 10:30:05 -07:00
Linfeng Zhang	45048dc9dc	Update vpx_highbd_idct4x4_16_add_sse2() BUG=webm:1412 Change-Id: I26e4b34ae9bc1ae80c24f56d740d737a95f1ab84	2017-05-30 09:25:30 -07:00
Johann Koenig	b9649d2407	Merge "comp_avg_pred: alignment"	2017-05-30 16:21:05 +00:00
Johann Koenig	48c0e13286	Merge "remove DECLARE_ALIGNED from neon code"	2017-05-30 15:58:17 +00:00
Johann	ea8b4a450d	comp_avg_pred: alignment x86 requires 16 byte alignment for some vector loads/stores. arm does not have the same requirement. The asserts are still in avg_pred_sse2.c. This just removes them from the common code. Change-Id: Ic5175c607a94d2abf0b80d431c4e30c8a6f731b6	2017-05-30 07:46:43 -07:00
Jerome Jiang	a5ab38093f	Merge "Fix vp8 race when build --enable-vp9-highbitdepth."	2017-05-30 05:47:44 +00:00
Johann	42ce25821d	remove DECLARE_ALIGNED from neon code Unlike x86 neon only requires type alignment when loading into vectors. Change-Id: I7bbbe4d51f78776e499ce137578d8c0effdbc02f	2017-05-26 10:41:57 -07:00
Johann Koenig	2693b89c19	Merge "subpel variance neon: reduce stack usage"	2017-05-26 17:25:47 +00:00
Johann Koenig	47174d60c8	Merge "Use vdup instead of vmov"	2017-05-26 17:25:24 +00:00
Jerome Jiang	0afa2dad76	Fix vp8 race when build --enable-vp9-highbitdepth. Split vp8/vp9 implementations on yv12_copy_frame_c. Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c. Clean up vp8 codes usage in vp9. BUG=webm:1435 Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d	2017-05-26 09:45:01 -07:00
Marco	146005a911	vp9: SVC: Fix to condition on using source_sad. Fix the condition on usage of source_sad for temporal layers. Fix allows it to be used for the case of 1 temporal layer. Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3	2017-05-26 08:46:50 -07:00
Marco Paniconi	9ec9415fd9	Merge "vp9: Use source_sad only on top temporal enhancement layer."	2017-05-26 05:24:06 +00:00
Marco Paniconi	4be18ab295	Merge "vp9: SVC: Enable copy partition for SVC speed >= 7."	2017-05-26 05:23:47 +00:00
Marco	ea914456af	vp9: Use source_sad only on top temporal enhancement layer. For 1 pass CBR SVC mode. Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d	2017-05-25 16:32:05 -07:00
Jerome Jiang	327c9bb1da	Refactor: Move vp8 skin detection to new files. Change-Id: If760f28cbbf22beac1cc9bd1546f13831e9dd3f0	2017-05-25 16:12:27 -07:00
Marco	747cf7a505	vp9: SVC: Enable copy partition for SVC speed >= 7. Adjust the max_copied_frame setting for temporal layers. Keep the same setting for non-SVC at speed 8. This change also enables copy_partition for non-SVC at speed 7, but with smaller value of max_copied_frame (=2). ~2% speedup for SVC speed 7, 3 layers, with little/no quality loss. Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780	2017-05-25 12:21:46 -07:00
Johann	f3c97ed32e	subpel variance neon: reduce stack usage Unlike x86, arm does not impose additional alignment restrictions on vector loads. For incoming values to the first pass, it uses vld1_u32() which typically does impose a 4 byte alignment. However, as the first pass operates on user-supplied values we must prepare for unaligned values anyway (and have, see mem_neon.h). But for the local temporary values there is no stride and the load will use vld1_u8 which does not require 4 byte alignment. There are 3 temporary structures. In the C, one is uint16_t. The arm saturates between passes but still passes tests. If this becomes an issue new functions will be needed. Change-Id: I3c9d4701bfeb14b77c783d0164608e621bfecfb1	2017-05-24 13:28:13 -07:00
Johann	d204c4bf01	Use vdup instead of vmov Change-Id: Idb6248c1429b55176bb3e9f4e8365ea0ed2be62a	2017-05-24 11:38:15 -07:00
Johann Koenig	de1a9c77a7	Merge changes Iaab2b9a1,Idfb458d3 * changes: sub pel avg variance neon: 4x block sizes sub pel variance neon: 4x block sizes	2017-05-24 18:33:53 +00:00
Johann Koenig	b11a37f540	Merge changes I31fa6ef8,I228c6f29 * changes: sub pel avg variance neon: add neon optimizations sub pel variance neon: normalize variable names	2017-05-24 18:32:02 +00:00
James Zern	f0279ceb92	Merge "partial_idct_test,InitInput: fix rollover in mult"	2017-05-24 16:27:21 +00:00

17488 Commits