generic-library/vpx

Author	SHA1	Message	Date
Yunqing Wang	770c6663d6	Merge "Changes to facilitate row based multi-threading of ARNR filtering"	2017-02-01 22:04:15 +00:00
Ranjit Kumar Tulabandu	359a6796da	Changes to facilitate row based multi-threading of ARNR filtering Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb	2017-02-01 13:03:52 -08:00
Johann	bfd62cdaff	vp9_rdopt: declare 'c' closer to use Clears up static clang analysis warning regarding a dead store. Only declare 'c' when it will be used. Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3	2017-02-01 19:58:24 +00:00
Jingning Han	969957f9f2	Fix real-time compression regression in hbd mode This commit resolves the compression performance regression in real-time encoding setting when high bit-depth mode is enabled. The current solution temporarily disables the SIMD implementations of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode. The commit makes the coding results bit-wise identical between regular coding pipeline and high bit-depth at profile 0. BUG=webm:1365 Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf	2017-01-31 23:17:09 -08:00
Debargha Mukherjee	e6446b4b60	Refactor uv tx size with lookup arrays Change-Id: Ife6a3d301c5faaba89d16d188d638631083511f7	2016-08-31 13:15:38 -07:00
paulwilkins	635ae8bdc1	Adjust coefficient optimization and tx_domain rd speed features. Previously Tx domain rd was used in all cases above speed 0. Coefficient optimization was only enabled for best and speed 0. This patch selectively sets these features at other speed settings based on block complexity. For the Netflix and HD sets in particular the quality gains are large compared to the speed hit. At speed 1 the average psnr gain in the NF set is > 2.5% with one clip coming in at 18% and some points almost 30%. Average gains for the lower resolution test sets are around 1%. The gains are biggest at low Q so some further optimization may be possible. Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576	2016-08-25 15:36:16 +01:00
Yunqing Wang	a413dbe594	Fix another motion vector out of range bug This patch fixed a motion vector out of range bug: vpxenc: ../libvpx/vp9/encoder/vp9_mcomp.c:69: mv_cost: Assertion `mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2)) - 1)' failed. For blocks that returned without having full-pixel search, the original MV limits were not restored, which caused the failure. Moved the set MV limit function down to fix the bug. Change-Id: Id7d798fc7214e95c6e4846c588f0233fcf1a4223	2016-08-12 09:27:58 -07:00
Alex Converse	6554333b59	Refactor mv limits. Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987	2016-08-08 11:54:00 -07:00
Yunqing Wang	2fb826c4d5	Fix a motion vector out of range bug This patch fixed a motion vector(MV) out of range bug, which was caused by not restoring the original values of the MV min/max thresholds after the sub8x8 full pixel motion search. It occurred rarely and only was seen while encoding a 4k clip for 200 frames. BUG=webm:1271 Change-Id: Ibc4e0de80846f297431923cef8a0c80fe8dcc6a5	2016-08-05 15:23:05 -07:00
Yaowu Xu	7a79fa1362	Fix msvc compiler warnings MSVC 2013 complained about using 32 shift where 64 bit shift should be used. Change-Id: I7a2b165d1a92d3c0a91dd4511b27aba7709b5e55	2016-08-03 18:33:06 -07:00
clang-format	e0cc52db3f	vp9/encoder: apply clang-format Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2	2016-08-02 16:47:11 -07:00
Alex Converse	335cf67d8b	Fix 64 to 32 narrowing warning. - Solves potential integer overflow on 12-bit - Fixes Visual Studio build Change-Id: I26dd660451bbab23040e4123920d59e82585795c	2016-07-27 12:40:23 -07:00
Alex Converse	d6c5ef4557	Only consider visible 4x4s in pixel domain error. BDRATE change derf144: -0.327 lowres: -0.048 midres: -0.125 hdres: -0.238 Change-Id: I789aba9870b5c2952373a7dd4fc8ed45590c3c54	2016-07-25 21:44:06 +00:00
Scott LaVarnway	c969b2b02b	VP9: get_pred_context_switchable_interp() -- encoder side Change-Id: I7217c90d5cf38c51b76759a2dc4f10070f3a40ac	2016-07-21 11:47:51 -07:00
Scott LaVarnway	2e93fcf893	Merge "vp9_rd_pick_intra_mode_sb(): set interp_filter to"	2016-07-11 22:31:06 +00:00
Scott LaVarnway	ed7786869a	vp9_rd_pick_intra_mode_sb(): set interp_filter to SWITCHABLE_FILTERS. This is a partial fix for the build issues with Change 357240. Change-Id: I4e507c196175bae729a4f1397878ec8776b0146c	2016-07-09 09:47:34 -07:00
Jingning Han	2f28f9072e	Enable coeff optimization for intra modes This further improves the coding performance by lowres 0.3% midres 0.5% hdres 0.6% Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68	2016-07-07 12:25:41 -07:00
Jingning Han	62aa642d71	Enable uniform quantization with trellis optimization in speed 0 This commit allows the inter prediction residual to use uniform quantization followed by trellis coefficient optimization in speed 0. It improves the coding performance by lowres 0.79% midres 1.07% hdres 1.44% Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619	2016-07-07 12:25:33 -07:00
Jingning Han	541eb78994	Refactor coeff_cost() function Move the operations that update the context buffers outside this function. The coeff_cost() takes all input as const value and returns the coefficient cost. This makes preparation for the next coefficient optimization CLs. Change-Id: I850eec6e5470b91ea84646ff26b9231b09f70a0c	2016-07-07 18:09:39 +00:00
Jingning Han	e357b9efe0	Support measure distortion in the pixel domain Use pixel domain distortion metric in speed 0. This improves the compression performance by 0.3% for both low and high resolution test sets. Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d	2016-07-06 18:25:17 -07:00
Jingning Han	14011f037d	Remove txfrm_block_to_raster_xy() from vp9 encoder The transform block row and column positions are always available outside the callees. There is no need to re-compute these values again. This approach has been used by the decoder. This commit removes txfrm_block_to_raster_xy() function. Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108	2016-07-04 18:41:47 -07:00
Scott LaVarnway	74bb78df82	Merge "VP9: handle_inter_mode()... Use interp_filter"	2016-06-29 11:41:52 +00:00
Scott LaVarnway	feb7e9a372	VP9: handle_inter_mode()... Use interp_filter only if above/left is inter. Change-Id: I0cc1f926425c021c84536df8271e9ee5f3f87caf	2016-06-28 14:09:59 -07:00
James Zern	ca88d22f39	s/UINT32_MAX/UINT_MAX/ provides better toolchain compatibility Change-Id: I8561a6de668a68ff54fe3886a4ee6300f0ae9c04	2016-06-25 12:15:51 -07:00
James Zern	b34705f64f	Merge "cosmetics: Beautify whitespaces and line wrapping"	2016-06-24 21:51:01 +00:00
Yury Gitman	67611119b5	cosmetics: Beautify whitespaces and line wrapping Change-Id: I9afa02cae671bd3527cf344695e53d0cc767f549	2016-06-24 10:18:06 -07:00
Yaowu Xu	7738bcb350	Rationalize type to avoid integer out of range BUG=webm:1250 Change-Id: Id5bb2762ca1bf996ba4f9a60eec977a7994c1d94	2016-06-24 13:58:02 +00:00
Yaowu Xu	b3933e2d3c	Merge "Fix ubsan warnings: vp9/encoder/vp9_mcomp.c"	2016-06-22 00:12:58 +00:00
Yaowu Xu	87bf1a149c	Fix ubsan warnings: vp9/encoder/vp9_mcomp.c This commit fixes a number of ubsan warnings in HBD build. BUG=webm:1219 Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc	2016-06-21 15:37:59 -07:00
hui su	a5af392aae	Add a hardware compatibility feature This commit adds an encoder workaround to support better compatibility with a non-compliant hardware vp9 profile 2 decoder. The known issue with this decoder is: The decoder assumes a wrong value, 127 instead of the correct value of 511 and 2047, for any assumed top-left corner pixel in UV planes for 10 and 12 bit, respectively. Such assumed top-left corner pixel is used for INTRA prediction when a real decoded/reconstructed pixel is not available, e.g. when it is located inside the row above the top row or inside the column to the left of the leftmost column of a video image. Change-Id: Ic15a938a3107e1b85e96cb7903a5c4220986b99d	2016-06-21 10:33:57 -07:00
Scott LaVarnway	ba962a5f37	VP9: Eliminate up_available and left_available Use above_mi and left_mi instead. Change-Id: I0b50e232c31d11da30aa2fb6f91a695aaf725e0c	2016-03-30 04:47:39 -07:00
Julia Robson	74a679de6f	Port "cost_coeff speed improvements" to vp9. About a 5% faster overall encode (perf cycles) at speed zero! Change-Id: Iaf013ba75884415cd824e98349f654ffb1c3ef33	2016-02-26 14:47:18 -08:00
Jingning Han	d642294b1c	Fix tsan error in VP9 sub8x8 intra mode search This commit fixes issue 1141. The issue was triggered in multi-tile encoding. The change properly saves and restores the block context information in the real-time mode selection process. It removes several redundant memcpy operations in sub8x8 intra block mode search. Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010	2016-02-16 11:24:09 -08:00
Jingning Han	f032c7eaed	Merge "Account for sub8x8 block skip mode cost in RD decision"	2016-02-08 19:40:01 +00:00
Jingning Han	203bdd20fb	Account for sub8x8 block skip mode cost in RD decision Make this consistent with regular block size rate-distortion optimization. It improves the compression performance: derf 0.055% hevcmr 0.129% Change-Id: I112fe734f592c21bc7aa6efb7e3f269c4214ee7b	2016-02-08 10:18:51 -08:00
Jingning Han	ac6d40ece8	Clean up in vp9_rd_pick_inter_mode_sb Use local variable. Change-Id: I0d3df36cf4536958a0cda422f6c30da50f0e0bbf	2016-02-08 10:15:02 -08:00
Jingning Han	bcce658d31	Use precise rate cost estimate for skip block mode It improves the compression performance of VP9 by 0.1% across all test sets. No speed change is observed. Change-Id: I59338c5c9e67bae22188f35fc3afbfe2a6bba6b0	2016-02-03 11:09:16 -08:00
Alex Converse	d13385cee7	Switch to 9-bit rate cost constants built on a 256 probability denominator. -.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html -.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f	2016-01-27 19:34:30 +00:00
Alex Converse	4326cffa65	Merge "Tie the bit cost scale to a define."	2016-01-21 19:17:56 +00:00
Scott LaVarnway	5232326716	VP9: Eliminate MB_MODE_INFO Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185	2016-01-19 16:40:20 -08:00
Alex Converse	269428e35c	Tie the bit cost scale to a define. This is a pure-refactor in preparation to potentially raise the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9	2016-01-15 15:59:31 -08:00
Scott LaVarnway	a85e552d95	VP9: Remove decoder args from find_mv_refs_idx() The decoder does not use this function. Change-Id: Ie67f909c0f4108ef286789c70df867d4b960a780	2016-01-13 13:30:40 -08:00
Yaowu Xu	9cac17d157	Enable encoder to avoid 8x4 or 4x8 partitions This commit enables encoder to avoid 8x4 and 4x8 partitions for scaled reference frames when libvpx is configured and built with --enable-better-hw-compatibility Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413	2016-01-07 09:53:14 -08:00
Yaowu Xu	650a2d7628	Fix a typo Change-Id: I12de2dd5e5f375551804166188d76a9ad8067b41	2016-01-07 09:29:34 -08:00
Jingning Han	27bbfd652d	Fix sub8x8 motion search on scaled reference frame This commit makes the sub8x8 block rate-distortion optimization scheme use precise motion compensated prediction to compute the rd cost. It fixes a potential buffer overflow issue related to sub8x8 motion search on scaled reference frame. Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de	2015-12-11 10:08:51 -08:00
Alex Converse	b1fcd1751e	Fix unsigned overflow in rd_variance_adjustment. Found with clang -fsanitize=integer Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1	2015-11-19 15:00:59 -08:00
paulwilkins	0149fb3d6b	Changes to exhaustive motion search. This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa	2015-11-13 10:16:31 +00:00
hui su	6ab6ac450b	Use accurate bit cost for uv_mode in UV intra mode RD selection On derflr, +0.1% for VP10; however, -0.03% on VP9. Change-Id: I09c724232ede74254043d61d3cadc506256af0af	2015-11-06 14:45:43 -08:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
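Note on 0149fb3d6b ("Changes to exhaustive motion search"): the message describes a multi-stage mesh search in which each stage scans a grid with a given range and interval around the best position found so far, and the search only runs when the preceding step search leaves the distortion above a threshold. The following is a minimal sketch of that idea in C, not the libvpx implementation. The names Mv, MeshStage, CostFn, toy_cost and mesh_search are hypothetical and exist only for illustration, and the toy cost (squared distance to a hidden target) stands in for a real distortion metric.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not libvpx structures. */
typedef struct { int row, col; } Mv;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*CostFn)(Mv mv, const void *ctx);

/* Toy distortion: squared distance to a hidden target vector (assumption). */
static unsigned int toy_cost(Mv mv, const void *ctx) {
  const Mv *target = (const Mv *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Runs the mesh stages in order, starting from the step-search result.
 * The mesh is skipped entirely when the starting cost is already at or
 * below `threshold`. Each stage is centered on the best position so far. */
static Mv mesh_search(Mv start, const MeshStage *stages, int num_stages,
                      unsigned int threshold, CostFn cost, const void *ctx,
                      unsigned int *best_cost) {
  Mv best = start;
  unsigned int best_c = cost(start, ctx);
  assert(stages != NULL && num_stages > 0);

  if (best_c > threshold) {
    for (int s = 0; s < num_stages; ++s) {
      const Mv center = best; /* fixed for the whole stage */
      const int range = stages[s].range;
      const int interval = stages[s].interval;
      assert(interval > 0);
      for (int r = -range; r <= range; r += interval) {
        for (int c = -range; c <= range; c += interval) {
          const Mv cand = { center.row + r, center.col + c };
          const unsigned int cand_c = cost(cand, ctx);
          if (cand_c < best_c) {
            best_c = cand_c;
            best = cand;
          }
        }
      }
    }
  }
  *best_cost = best_c;
  return best;
}
```

With the example stages from the message (range 64 interval 4, range 32 interval 2, range 15 interval 1), the coarse stage lands near the minimum and the later stages refine it. A real encoder would also clamp candidates to the allowed motion vector limits, which this sketch omits.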

1341 Commits
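
Note on aa8f85223b ("Optimize vp9_highbd_block_error_8bit assembly"): the message describes doing all arithmetic speculatively in 32 bits and re-running the computation at higher precision if overflow might have occurred. The sketch below illustrates that pattern in portable C; it is not the AVX assembly from the commit. The function names block_error_wide and block_error_speculative are hypothetical, and detecting overflow by checking for unsigned wrap on each addition is an assumption made here for clarity (the commit does not say how the assembly detects it).

```c
#include <stddef.h>
#include <stdint.h>

/* Reference path: 64-bit accumulation, cannot overflow for realistic sizes. */
static uint64_t block_error_wide(const int16_t *coeff, const int16_t *dqcoeff,
                                 size_t n) {
  uint64_t err = 0;
  for (size_t i = 0; i < n; ++i) {
    const int64_t d = (int64_t)coeff[i] - (int64_t)dqcoeff[i];
    err += (uint64_t)(d * d);
  }
  return err;
}

/* Speculative path: accumulate in 32 bits and note any wrap-around.
 * If a wrap is seen, discard the result and recompute in 64 bits.
 * `used_fallback` (optional) reports whether the slow path ran. */
static uint64_t block_error_speculative(const int16_t *coeff,
                                        const int16_t *dqcoeff, size_t n,
                                        int *used_fallback) {
  uint32_t err32 = 0;
  int overflow = 0;
  for (size_t i = 0; i < n; ++i) {
    const int32_t d = (int32_t)coeff[i] - (int32_t)dqcoeff[i];
    /* |d| <= 65535, so the square always fits in 32 unsigned bits. */
    const uint32_t ad = (uint32_t)(d < 0 ? -d : d);
    const uint32_t sq = ad * ad;
    const uint32_t sum = err32 + sq; /* unsigned wrap is well defined */
    if (sum < err32) overflow = 1;
    err32 = sum;
  }
  if (used_fallback != NULL) *used_fallback = overflow;
  return overflow ? block_error_wide(coeff, dqcoeff, n) : (uint64_t)err32;
}
```

The design bet is the one stated in the commit message: overflow is extremely rare in real use, so the common case pays only for narrow arithmetic while the occasional rerun keeps the result exact.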