generic-library/vpx

Author	SHA1	Message	Date
Johann	d6a7489dd5	neon variance: process two rows of 8 at a time When the width is equal to 8, process two rows at a time. This doubles the speed of 8x4 and improves 8x8 by about 20%. 8x16 was using this technique already, but still improved a little bit with the rewrite. Also use this for vpx_get8x8var_neon BUG=webm:1422 Change-Id: Id602909afcec683665536d11298b7387ac0a1207	2017-05-04 08:59:46 -07:00
Johann	cb9133c72f	neon variance: add small missing sizes Some of the mixed sizes were missing. They can be implemented trivially using the existing helper function. When comparing the previous 16x8 and 8x16 implementations, the helper function is about 10% faster than the 16x8 version. The 8x16 is very close, but the existing version appears to be faster. BUG=webm:1422 Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004	2017-05-04 08:59:42 -07:00
Linfeng Zhang	a10a5cb356	Merge changes I8bb660de,Ica51d780,I6037525d * changes: Clean specializes of idct functions Clean add_protos of highbd idct functions Clean add_protos of idct functions	2017-05-03 19:17:55 +00:00
James Zern	5599e4275a	Merge changes Ia5293d94,I90d481d3,Ia509d622,I54549b03,I89b635d6 * changes: ppc: Add convolve8_vsx and convolve8_avg_vsx ppc: Add convolve8_avg_vert_vsx ppc: Add convolve8_vert ppc: Add convolve8_horiz_avg ppc: Add convolve8_horiz	2017-05-03 03:31:19 +00:00
Luca Barbato	e2ad89092d	ppc: Add convolve8_vsx and convolve8_avg_vsx Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857	2017-05-02 20:27:47 -07:00
Luca Barbato	e6ca81ee67	ppc: Add convolve8_avg_vert_vsx Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d	2017-05-02 20:27:42 -07:00
Luca Barbato	a65f1771ad	ppc: Add convolve8_vert Only the generic one again, speedups for 8x8 and larger blocks to come later. Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f	2017-05-02 20:27:33 -07:00
Luca Barbato	77772350f3	ppc: Add convolve8_horiz_avg The 8x8 and larger blocks cases can be sped up further. Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15	2017-05-02 20:27:28 -07:00
Luca Barbato	08edb85bd0	ppc: Add convolve8_horiz The 8x8 and larger blocks cases can be sped up further. Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046	2017-05-02 20:27:16 -07:00
Linfeng Zhang	0178d974e5	Clean specializes of idct functions Change-Id: I8bb660de47b5f97263ec381dc428db96e9c9a4b2	2017-05-02 18:01:19 -07:00
Linfeng Zhang	4412996d59	Clean add_protos of highbd idct functions Change-Id: Ica51d780b92b316ce9112740c56cdf7670816371	2017-05-02 17:59:38 -07:00
Linfeng Zhang	a7a57d9756	Clean add_protos of idct functions Change-Id: I6037525d92ec172810edab720389eb1865ed3b1a	2017-05-02 17:58:40 -07:00
Johann Koenig	240a5a15ef	Merge "block error sse2: sum in 32 bits when possible"	2017-05-02 14:16:47 +00:00
Johann	cd94d5f68e	block error avx2: rename variables Change-Id: I2b8a9253f2c3d1fd85304c2970ebe70213870fe9	2017-05-01 17:54:29 -07:00
Johann Koenig	b1a31f8066	Merge "block error avx2: sum in 32 bits when possible"	2017-05-02 00:52:59 +00:00
Marco Paniconi	1e112bce37	Merge "vp9: SVC: Early exit on golden ref in non-rd pickmode."	2017-05-01 21:04:52 +00:00
Linfeng Zhang	e8655d49f5	Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()"	2017-05-01 19:54:40 +00:00
Johann Koenig	3d33a462b3	Merge "move vp9_error_intrin_avx2.c"	2017-05-01 19:52:36 +00:00
Kyle Siefring	760c214519	block error avx2: sum in 32 bits when possible Add 31bit pairs before unpacking in x86 block error code AVX2 code provides a very minor performance improvement. BUG=webm:1210 Change-Id: I4c82308eaf65741dca2f5c6db9be9c85f905073a	2017-05-01 12:51:33 -07:00
James Zern	ee3df31d74	Merge "vpx_scale_test: fix segfault on alloc failure"	2017-05-01 19:22:22 +00:00
Marco	ae0215f945	vp9: SVC: Early exit on golden ref in non-rd pickmode. For SVC 1 pass real-time: add condition to skip the golden (spatial) reference mode in non-rd pickmode. Condition is to skip golden if the sse of zeromv-last mode is below threshold. And change order in ref_mode_set_svc to make sure golden zeromv is tested after last-nearest. Speedup ~3-4% with little/negligible quality loss. Change-Id: I6cbe314a93210454ba2997945f714015f1b2fca3	2017-05-01 10:36:54 -07:00
Kyle Siefring	8394990b27	block error sse2: sum in 32 bits when possible Add 31bit pairs before unpacking in x86 block error code BUG=webm:1210 Change-Id: I5ca8c7f7775585a17fe09d6bbfc25e1f2955eb0a	2017-05-01 09:59:18 -07:00
Johann	2ff01aa1e4	move vp9_error_intrin_avx2.c There is only one avx2 implementation. Drop '_intrin' Change-Id: I887a0d27d58567eaad49f749f127eca61313f312	2017-05-01 09:13:01 -07:00
James Zern	2930903d51	vpx_scale_test: fix segfault on alloc failure check the return of ResetImage() before continuing Change-Id: Iff0b038f7b9761113b8cf33a511a5306640d1273	2017-04-29 13:12:53 -07:00
Luca Barbato	d51d3934f5	ppc: Add convolve_avg Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014	2017-04-29 15:47:25 +02:00
Luca Barbato	63860ba7b8	ppc: Add convolve_copy Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405	2017-04-29 15:47:25 +02:00
Johann Koenig	ef5918098d	Merge "Use uint32_t for accumulator"	2017-04-28 18:32:09 +00:00
Jerome Jiang	ce2e278059	Merge "vp9: Fix condition for disabling adaptive_rd_thresh."	2017-04-28 18:10:36 +00:00
Jerome Jiang	04de501229	vp9: Fix condition for disabling adaptive_rd_thresh. Add speed constraints for disabling adaptive_rd_thresh when row_mt_bit_exact is set. Change-Id: I2445115c2f9a2e46b8a0966031a0fea488d4964e	2017-04-28 10:26:20 -07:00
Jerome Jiang	bea27a5809	Merge "Generalize vp9 sse2 denoiser test for other platforms."	2017-04-28 15:45:52 +00:00
Johann	657f3e9f14	Use uint32_t for accumulator Be specific about the data type size. Use convenience macro vp9_zero_array. Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd	2017-04-28 06:36:59 -07:00
Johann Koenig	94ebdba71d	Merge "vp9 temporal filter: sse4 implementation"	2017-04-28 13:22:41 +00:00
Jerome Jiang	26aebd77b8	Generalize vp9 sse2 denoiser test for other platforms. Renamed to vp9_denoiser_test. Change-Id: I0d8f4c94bcb81a60949a13d9fe839cee95d03f77	2017-04-27 22:47:41 -07:00
Yaowu Xu	0e8fea6c13	Merge "VP9: enable trellis for high bitdepth intra"	2017-04-28 00:16:56 +00:00
James Zern	ef15d38df0	Merge "webm_read_frame: avoid NULL dereference"	2017-04-27 21:47:10 +00:00
Johann	6dfeea6592	vp9 temporal filter: sse4 implementation Approximates division using multiply and shift. Speeds up both sizes (8x8 and 16x16) by 30 times. Fix the call sites to use the RTCD function. Delete sse2 and mips implementation. They were based on a previous implementation of the filter. It was changed in Dec 2015: `ece4fd5d22` BUG=webm:1378 Change-Id: I0818e767a802966520b5c6e7999584ad13159276	2017-04-26 22:03:05 -07:00
Jerome Jiang	43e0e082d1	vp9: Don't force disabling of adaptive_rd_thresh for realtime. Don't force disabling of adaptive_rd_thresh for realtime when row_mt_bit_exact is set. Row based adaptive rd is made usable in CL 454882(https://chromium-review.googlesource.com/c/454882) for REALTIME. Change-Id: Ief023414f0fd6eb86f299dd46ae58f4436875af5	2017-04-26 13:17:57 -07:00
Yunqing Wang	b68f14d0ed	Merge "Make the row based multi-threaded encoder deterministic"	2017-04-26 16:12:14 +00:00
Linfeng Zhang	54c4e0f7a5	Merge "Update highbd convolve functions arguments to use uint16_t src/dst"	2017-04-26 15:50:46 +00:00
Marco Paniconi	004fab120a	Merge "vp9: SVC: Adjust some speed settings for temporal layers."	2017-04-26 15:45:06 +00:00
Peter de Rivaz	66117b97c5	VP9: enable trellis for high bitdepth intra BUG=webm:1409 Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7	2017-04-26 11:43:01 +01:00
Jerome Jiang	15ee8a8c45	Merge "Fix the decoder seg fault when frame is corrupted."	2017-04-26 00:09:29 +00:00
Jerome Jiang	997e54ea43	Merge "vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large"	2017-04-26 00:09:22 +00:00
Marco	c614164cb6	vp9: SVC: Adjust some speed settings for temporal layers. Make some speed setting changes for temporal enhancement layers, and remove the switch in subpel_force_stop for the aggressive_base_mv in non-rd pickmode. Gain some 2-3% speed with little/negligible quality loss. Change-Id: I3e2a7f80ff45f38c0a6ceb01b34dbca2f53edbf0	2017-04-25 16:27:01 -07:00
Jerome Jiang	69b0242e9a	vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large For speed >= 8 and color_sensitivity not set, skip the transform skipping test in UV planes. Add a new condition to check noise level to skip chroma check for speed >= 8 if y_sad is high. 1~2% speedup on ARM for speed 8. Borg tests show neutral results in both rtc and rtc_derf. Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c	2017-04-25 16:21:36 -07:00
Linfeng Zhang	4758d20227	Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor() BUG=webm:1388 Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d	2017-04-25 14:32:20 -07:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
James Zern	0be513e8e8	webm_read_frame: avoid NULL dereference block may be NULL with block_entry_eos or from return of GetBlock() Change-Id: Ia0dd3ffa46305ee70efcdc55c05c2ad24efc993b	2017-04-25 12:34:23 -07:00
Marco	92ec0674fd	vp9; Reduce artifact in non-rd pickmode for lighting changes. Add a low-variance high-sumdiff to the superblock content state and use it to limit the mv and bias some decisions in non-rd pickmode. Only affects speed >= 6. Reduces artifact for lighting changes. Small/no difference in metrics on RTC set. Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b	2017-04-24 17:08:43 -07:00
Yunqing Wang	10a497bd38	Make the row based multi-threaded encoder deterministic This patch followed allow_exhaustive_searches feature modification and continued to modify the encoder to achieve the determinism in the row based multi-threaded encoding. While row-mt = 1 and using multiple threads, the adaptive feature in encoder was disabled, which gave BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%), but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses were acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb	2017-04-24 16:28:27 -07:00
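
The "vp9 temporal filter: sse4 implementation" entry (6dfeea6592) says it approximates division using multiply and shift. The C sketch below illustrates that general idea only; it is not the libvpx code. The names `reciprocal_q16`, `approx_div`, `max_abs_error`, the 16-bit fixed-point shift, and the divisor range 1..16 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits for this sketch; not taken from libvpx. */
#define MAX_DIVISOR 16u
#define RECIP_SHIFT 16u

/* Fixed-point reciprocal of d, rounded up: ceil(2^16 / d). */
static uint32_t reciprocal_q16(uint32_t d) {
  assert(d >= 1 && d <= MAX_DIVISOR);
  return ((1u << RECIP_SHIFT) + d - 1) / d;
}

/* Approximates x / d for x < 2^16 with one multiply and one shift.
 * The product fits in 32 bits because x < 2^16 and recip <= 2^16. */
static uint32_t approx_div(uint32_t x, uint32_t recip) {
  assert(x < (1u << RECIP_SHIFT));
  return (x * recip) >> RECIP_SHIFT;
}

/* Exhaustively measures the largest gap between the approximation and
 * true integer division over every divisor and input this sketch allows. */
static uint32_t max_abs_error(void) {
  uint32_t worst = 0;
  for (uint32_t d = 1; d <= MAX_DIVISOR; ++d) {
    const uint32_t recip = reciprocal_q16(d);
    for (uint32_t x = 0; x < (1u << RECIP_SHIFT); ++x) {
      const uint32_t approx = approx_div(x, recip);
      const uint32_t exact = x / d;
      const uint32_t diff = approx >= exact ? approx - exact : exact - approx;
      if (diff > worst) worst = diff;
    }
  }
  return worst;
}
```

Because the reciprocal is rounded up, the result in this sketch is never below the exact quotient and exceeds it by at most 1 over the stated range. The reciprocals can be precomputed into a small table, so the per-sample work reduces to a multiply and a shift, both of which vectorize well where integer division does not.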

17175 Commits