generic-library/vpx

Author	SHA1	Message	Date
James Zern	b34705f64f	Merge "cosmetics: Beautify whitespaces and line wrapping"	2016-06-24 21:51:01 +00:00
Yury Gitman	67611119b5	cosmetics: Beautify whitespaces and line wrapping Change-Id: I9afa02cae671bd3527cf344695e53d0cc767f549	2016-06-24 10:18:06 -07:00
Yaowu Xu	7738bcb350	Rationalize type to avoid integer out of range BUG=webm:1250 Change-Id: Id5bb2762ca1bf996ba4f9a60eec977a7994c1d94	2016-06-24 13:58:02 +00:00
Yaowu Xu	b3933e2d3c	Merge "Fix ubsan warnings: vp9/encoder/vp9_mcomp.c"	2016-06-22 00:12:58 +00:00
Yaowu Xu	87bf1a149c	Fix ubsan warnings: vp9/encoder/vp9_mcomp.c This commit fixes a number of ubsan warnings in HBD build. BUG=webm:1219 Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc	2016-06-21 15:37:59 -07:00
hui su	a5af392aae	Add a hardware compatibility feature This commit adds an encoder workaround to support better compatibility with a non-compliant hardware vp9 profile 2 decoder. The known issue with this decoder is: The decoder assumes a wrong value, 127 instead of the correct values of 511 and 2047, for any assumed top-left corner pixel in UV planes for 10 and 12 bit, respectively. Such assumed top-left corner pixel is used for INTRA prediction when a real decoded/reconstructed pixel is not available, e.g. when it is located inside the row above the top row or inside the column left of the leftmost column of a video image. Change-Id: Ic15a938a3107e1b85e96cb7903a5c4220986b99d	2016-06-21 10:33:57 -07:00
Scott LaVarnway	ba962a5f37	VP9: Eliminate up_available and left_available Use above_mi and left_mi instead. Change-Id: I0b50e232c31d11da30aa2fb6f91a695aaf725e0c	2016-03-30 04:47:39 -07:00
Julia Robson	74a679de6f	Port "cost_coeff speed improvements" to vp9. About a 5% faster overall encode (perf cycles) at speed zero! Change-Id: Iaf013ba75884415cd824e98349f654ffb1c3ef33	2016-02-26 14:47:18 -08:00
Jingning Han	d642294b1c	Fix tsan error in VP9 sub8x8 intra mode search This commit fixes issue 1141. The issue was triggered in multi-tile encoding. The change properly saves and restores the block context information in the real-time mode selection process. It removes several redundant memcpy operations in sub8x8 intra block mode search. Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010	2016-02-16 11:24:09 -08:00
Jingning Han	f032c7eaed	Merge "Account for sub8x8 block skip mode cost in RD decision"	2016-02-08 19:40:01 +00:00
Jingning Han	203bdd20fb	Account for sub8x8 block skip mode cost in RD decision Make this consistent with regular block size rate-distortion optimization. It improves the compression performance: derf 0.055% hevcmr 0.129% Change-Id: I112fe734f592c21bc7aa6efb7e3f269c4214ee7b	2016-02-08 10:18:51 -08:00
Jingning Han	ac6d40ece8	Clean up in vp9_rd_pick_inter_mode_sb Use local variable. Change-Id: I0d3df36cf4536958a0cda422f6c30da50f0e0bbf	2016-02-08 10:15:02 -08:00
Jingning Han	bcce658d31	Use precise rate cost estimate for skip block mode It improves the compression performance of VP9 by 0.1% across all test sets. No speed change is observed. Change-Id: I59338c5c9e67bae22188f35fc3afbfe2a6bba6b0	2016-02-03 11:09:16 -08:00
Alex Converse	d13385cee7	Switch to 9-bit rate cost constants built on a 256 probability denominator. -.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html -.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f	2016-01-27 19:34:30 +00:00
Alex Converse	4326cffa65	Merge "Tie the bit cost scale to a define."	2016-01-21 19:17:56 +00:00
Scott LaVarnway	5232326716	VP9: Eliminate MB_MODE_INFO Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185	2016-01-19 16:40:20 -08:00
Alex Converse	269428e35c	Tie the bit cost scale to a define. This is a pure-refactor in preparation to potentially raise the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9	2016-01-15 15:59:31 -08:00
Scott LaVarnway	a85e552d95	VP9: Remove decoder args from find_mv_refs_idx() The decoder does not use this function. Change-Id: Ie67f909c0f4108ef286789c70df867d4b960a780	2016-01-13 13:30:40 -08:00
Yaowu Xu	9cac17d157	Enable encoder to avoid 8x4 or 4x8 partitions This commit enables encoder to avoid 8x4 and 4x8 partitions for scaled reference frames when libvpx is configured and built with --enable-better-hw-compatibility Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413	2016-01-07 09:53:14 -08:00
Yaowu Xu	650a2d7628	Fix a typo Change-Id: I12de2dd5e5f375551804166188d76a9ad8067b41	2016-01-07 09:29:34 -08:00
Jingning Han	27bbfd652d	Fix sub8x8 motion search on scaled reference frame This commit makes the sub8x8 block rate-distortion optimization scheme use precise motion compensated prediction to compute the rd cost. It fixes a potential buffer overflow issue related to sub8x8 motion search on scaled reference frame. Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de	2015-12-11 10:08:51 -08:00
Alex Converse	b1fcd1751e	Fix unsigned overflow in rd_variance_adjustment. Found with clang -fsanitize=integer Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1	2015-11-19 15:00:59 -08:00
paulwilkins	0149fb3d6b	Changes to exhaustive motion search. This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa	2015-11-13 10:16:31 +00:00
hui su	6ab6ac450b	Use accurate bit cost for uv_mode in UV intra mode RD selection On derflr, +0.1% for VP10; however, -0.03% on VP9. Change-Id: I09c724232ede74254043d61d3cadc506256af0af	2015-11-06 14:45:43 -08:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
Scott LaVarnway	2f8625d824	VP9: remove plane_type from macroblockd_plane Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec	2015-09-30 15:15:11 -07:00
hui su	38cc168822	Adjust rd calculation in choose_tx_size_from_rd Coding gain: derflr 0.142% hevclr 0.153% hevcmr 0.124% Change-Id: I63b56ae3a9002c3a266e10e2964135ed43b0ba53	2015-09-23 10:54:28 -07:00
Jingning Han	b6d71a308c	Fix ioc warnings related to sub8x8 reference frame Access scaled reference frame in the sub8x8 rate-distortion optimization loop only when the current test mode is an inter mode. This prevents an ioc warning triggered by sending intra_frame index to fetch scaled reference frame. Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604	2015-09-09 15:48:00 -07:00
Jingning Han	50461166b7	Enable sub8x8 inter mode with scaled ref frame in RD optimization This commit allows the encoder to include sub8x8 inter mode with scaled reference frame in the rate-distortion optimization scheme. Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5	2015-09-09 00:29:06 +00:00
Johann	c5f11912ae	Include vpx_dsp_common.h when using VPXMIN/MAX Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee	2015-08-31 14:36:35 -07:00
James Zern	5e16d397bd	vpx_dsp_common: add VPX prefix to MIN/MAX prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c	2015-08-26 20:11:32 -07:00
Shunyao Li	aa006d7149	Add transform size rate for intra skip mode in rdopt stdhd +0.226 hevchr +0.091 hevcmr +0.052 derflr +0.033 Change-Id: I84034209c5760609a99bd6e0ce55e02534b72cac	2015-08-24 18:15:09 -07:00
hui su	088b05fd99	Use sizeof(variable) instead of sizeof(type) Change-Id: Ia069da11eebb271063e9eb837bdb3e7175ecce13	2015-08-12 11:25:38 -07:00
Alex Converse	a8a08ce57e	Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4	2015-08-10 15:37:14 -07:00
Zoe Liu	c21cab39c8	Fixed a comment on the compound ref frames. Change-Id: I77e397ac9f594c9c4c1db442e334a6ea5f53f588	2015-08-06 17:36:57 -07:00
Jingning Han	b4f2c567c8	Cosmetic - align format in vp9 Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e	2015-08-06 15:56:11 -07:00
Alex Converse	ab20c98e84	Compute skippable inside the block_rd_txfm loop. Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2	2015-07-31 11:45:59 -07:00
Alex Converse	c62228f273	Simplify model_rd_for_sb HBD ifdefs Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d	2015-07-31 11:16:59 -07:00
Alex Converse	da9c73c293	Simplify dist_block HBD ifdefs Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114	2015-07-31 11:04:01 -07:00
Aℓex Converse	8abd0c2a12	Merge "Short circuit rate_block in block_rd_txfm."	2015-07-31 17:59:22 +00:00
Alex Converse	4ac5058afc	Give skip_txfm constants names. This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792	2015-07-31 10:08:08 -07:00
Alex Converse	73422d3b2d	Short circuit rate_block in block_rd_txfm. Don't run rate_block (cost_coeffs) if distortion alone is enough to surpass best_rd. This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is zero effect on output if tx_cache is removed. Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784	2015-07-31 10:05:51 -07:00
Yunqing Wang	3b2e73b9a4	Remove tx cache and speed up tx size selection 1. The RD scores obtained during the tx size selection were stored in the tx cache, and used to help make the tx decision for the following frames. This wasn't used anymore in VP9 encoder. Recovered the related decision making code from 1.5+ years ago, and borg tests didn't show any quality gain. This patch removed it to lower the complexity. 2. An optimization was done after the above refactoring. If the tx_mode is not TX_MODE_SELECT, we only need to test the chosen tx size instead of all possible tx sizes. This gave a 1.5% average speed gain at speed 2, and a 1% average speed gain at speed 3. Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c	2015-07-30 18:53:40 -07:00
Aℓex Converse	eb6b443bd2	Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."	2015-07-30 23:04:28 +00:00
Alex Converse	c827c59eaf	Convert simple_model_rd_from_var from a speed check to a speed feature. Change-Id: I8877025e172fff29bc4e270790211463b676b4d7	2015-07-30 13:53:26 -07:00
Alex Converse	b7f441a0bc	Cleanup rdcost_block_args Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba	2015-07-30 12:55:51 -07:00
Jingning Han	4b5109cd73	Replace vp9_ prefix in 2D-DCT functions with vpx_ Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25	2015-07-28 16:06:44 -07:00
Yaowu Xu	bf82514b54	vpx_dsp/bitreader.h: vp9_->vpx_ Replace vp9_ in names to vpx_ as they are not codec specific. Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb	2015-07-20 18:06:31 -07:00
Jingning Han	389ed6da10	Refactor highbd forward transform use case Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf	2015-07-20 10:31:17 -07:00
Jingning Han	81452cf0b7	Refactor intra block prediction function This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138	2015-07-13 15:20:47 -07:00
paulwilkins	8dd466edc8	Changes to use of rectangular partitions. Changes to allow more use of rectangular partitions at speeds 1 and 2 for content classed by the first pass as animation and for blocks near the active image edge. This has quite a big impact in quality for the animated test sequence but also hurts encode speed for speed 2. For other content types the impact on both speed and quality is small. Added some plumbing for detection of internal vertical image edges. Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a	2015-07-08 18:14:12 +01:00
paulwilkins	a126b6ce7d	Change speed and rd features for formatting bars. Change speed features / behavior for split mode when there is an internal active edge (e.g. formatting bars). Remove some threshold constraints in rd code near the active edge of the image. Add some plumbing for left and right active edge detection. Patch set 5. Limit rd pass through for sub 8x8 to internal active edges. This takes away any speed penalty for most clips but keeps the enhanced edge coding for the more critical case of internal image edges Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5	2015-07-08 17:51:42 +01:00
Johann	6a82f0d7fb	Move sub pixel variance to vpx_dsp Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1	2015-07-07 15:51:04 -07:00
Jingning Han	fcb5a8692a	Merge "Move subtract functions from vp9 to vpx_dsp"	2015-07-06 22:39:26 +00:00
James Zern	017253b7a3	remove vp9_get_interp_kernel() expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20	2015-07-06 13:04:05 -07:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
Scott LaVarnway	c06d56cc7d	VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107	2015-06-29 12:46:47 -07:00
Scott LaVarnway	86f4a3d8af	Remove tile param and added to MACROBLOCKD. Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db	2015-06-22 06:09:38 -07:00
Scott LaVarnway	cca866f578	inline vp9_get_segdata() and change name. Change-Id: I706645cf9d9dc04f1b3b6ac80df80edb7f101854	2015-06-11 09:52:00 -07:00
Scott LaVarnway	42c0b1b1f1	inline vp9_segfeature_active() and changed name. Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3	2015-06-11 04:20:55 -07:00
Scott LaVarnway	baaaa57533	Reducing size of MODE_INFO struct Reduced size from 124 bytes to 104 bytes. For decode only builds, it is reduced to 68 bytes. Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3	2015-06-04 07:32:16 -07:00
Scott LaVarnway	b962646fc5	Re-worked header files Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files. Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32	2015-05-22 11:19:51 -07:00
Johann	1d7ccd5325	Relocate memory operations for common code With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca	2015-05-13 11:41:15 -07:00
James Zern	fd3658b0e4	replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79	2015-05-07 11:55:08 -07:00
James Zern	f58011ada5	vpx_mem: remove vpx_memset vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201	2015-04-28 20:00:59 -07:00
James Zern	f274c2199b	vpx_mem: remove vpx_memcpy vestigial. replace instances with memcpy() which they already were being defined to. Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c	2015-04-28 19:59:41 -07:00
James Zern	fbd3b89488	vpx_mem: remove vpx_memmove vestigial. replace instances with memmove() which they already were being defined to. Change-Id: If396d3f9e3cf79c0ee5d7429615ef3d6b2a34afa	2015-04-28 19:59:40 -07:00
Scott LaVarnway	8b17f7f4eb	Revert "Remove mi_grid_* structures." (see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d	2015-04-21 11:16:45 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
Adrian Grange	ad18b2b641	Remove 8-bit array in HBD Creating both 8- and 16-bit arrays and then only using one of them is wasteful. Change-Id: Ic5b397c283efaff7bcfff2d2413838ba3e065561	2015-03-25 15:37:03 -07:00
Adrian Grange	65df3d138a	Replace heap with stack memory allocation Replaced the dynamic memory allocation of the second_pred buffer with an allocation on the stack. Change-Id: I2716c46b71e8587714ca5733a99eca2c68419b23	2015-03-25 15:36:43 -07:00
Adrian Grange	8d8d7bfde5	Fix use of scaling in joint motion search To enable us to use the scale-invariant motion estimation code during mode selection, each of the reference buffers is scaled to match the size of the frame being encoded. This fix ensures that a unit scaling factor is used in this case rather than the one calculated assuming that the reference frame is not scaled. Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf	2015-03-25 15:35:29 -07:00
paulwilkins	8ea7bafdaa	Merge "Revised rd adjustment for variance."	2015-03-24 03:12:56 -07:00
paulwilkins	c0b71cf82f	Merge "Experimental rd bias based on source vs recon variance."	2015-03-24 03:12:41 -07:00
paulwilkins	7e234b9228	Revised rd adjustment for variance. Revised adjustment for rd based on source complexity. Two cases: 1) Bias against low variance intra predictors when the actual source variance is higher. 2) When the source variance is very low to give a slight bias against predictors that might introduce false texture or features. The impact on metrics of this change across the test sets is small and mixed. derf -0.073%, -0.049%, -0.291% std hd -0.093%, -0.1%, -0.557% yt +0.186%, +0.04%, - 0.074% ythd +0.625%, + 0.563%, +0.584% Medium to strong psycho-visual improvements in some problem clips. This feature and intra weight on GF group length now turned on by default. Change-Id: Idefc8b633a7b7bc56c42dbe19f6b2f872d73851e	2015-03-20 11:59:39 +00:00
paulwilkins	9a1ce7be7d	Experimental rd bias based on source vs recon variance. This experiment biases the rd decision based on the impact a mode decision has on the relative spatial complexity of the reconstruction vs the source. The aim is to better retain a semblance of texture even if it is slightly misaligned / wrong, rather than use a simple rd measure that tends to favor use of a flat predictor if a perfect match can't be found. This improves the appearance of texture and visual quality on specific test clips but is hidden under a flag and currently off by default pending visual quality testing on a wider Yt set. Change-Id: Idf6e754a8949bf39ed9d314c6f2daaa20c888aad	2015-03-20 11:57:36 +00:00
Adrian Grange	12d946df89	Restore first ref frame pointer to the correct value The joint_motion_search function alternates prediction between two reference frames. In order to reuse existing code, a pointer to the appropriate reference frame is written into xd->plane[0].pre[0], that the motion estimation code assumes points to the reference frame. If this first reference frame was scaled then the pointer was incorrectly being reset to point to the unscaled reference frame rather than the scaled version. Change-Id: I76f73a8d8f4f15c1f3a5e7e08a35140cdb7886ab	2015-03-19 16:17:31 -07:00
Adrian Grange	53c9ebe609	Move joint_motion_search & delete function prototype Change-Id: I7fb3a78ed0e0bc940d8b4a57c470302f8369782f	2015-03-19 14:28:52 -07:00
Alex Converse	ad01d275e9	Merge "Don't inline cost_coeffs."	2015-03-05 13:54:44 -08:00
Adrian Grange	6e3be5c3b6	Merge "Fix valgrind memcpy memory overlaps warning"	2015-03-05 12:52:57 -08:00
Alex Converse	2eb113d00a	Don't inline cost_coeffs. It was tiny when it was originally marked INLINE. Forcing this function to be inlined prevents the compiler from inlining its much smaller callers. No measurable speed impact, 28320 byte smaller libvpx.a Change-Id: I6bf4c917157d15cbadb3cd3e20a9e82d35dc7d6f	2015-03-05 12:39:02 -08:00
Adrian Grange	3807dd82ab	Make encoder buffer allocation dynamic Frame buffers are now allocated dynamically on-demand. Entries in the reference frame map, cm->ref_frame_map, may now be set to -1 (INVALID_IDX) to indicate that there is not a valid reference buffer in that "slot". All slots in the reference frame map are now initialized to the empty state (-1) and each buffer is initialized to have a reference count of 0. Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582	2015-03-04 07:58:32 -08:00
Adrian Grange	852f62fde5	Fix valgrind memcpy memory overlaps warning Change-Id: Id0bb162b48b891c5c849f0411ef2ac0aa4bbe261	2015-03-03 15:06:34 -08:00
Jingning Han	5041aa0fbe	Fix ioc issue in block_rd_txfm Force 64-bit precision in the intermediate steps. Change-Id: I666113d9adcef8975da201d5aa1a13b783d09594	2015-02-12 12:51:39 -08:00
Adrian Grange	23ebacdb81	Auto-adaptive encoder frame resizing logic Note: This feature is still in development. Add an option for the encoder to decide the resolution at which to encode each frame. Each KF/GF/ARF group is tested to see if it would be better encoded at a lower resolution. At present, each KF/GF/ARF is coded first at full-size and if the coded size exceeds a threshold (twice target data rate) at the maximum active Q then the entire group is encoded at lower resolution. This feature is enabled in vpxenc by setting: --resize-allowed=1 In addition, if the vpxenc command line also specifies valid frame dimensions using: --resize-width=XXXX & --resize_height=YYYY then all frames will be encoded at this resolution. Change-Id: I13f341e0a82512f9e84e144e0f3b5aed8a65402b	2015-02-10 09:59:32 -08:00
hkuang	be6aeadaf4	Try again to merge branch 'frame-parallel' into master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of loopfilter worker thread. There are still some known issues that need to be addressed. For example: decode frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure. 
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit `a18da9760a`. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02	2015-01-30 21:00:13 -08:00
Jingning Han	9bdc0ae2b2	Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8 Add parentheses to bit operations. Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749	2015-01-23 11:48:58 -08:00
Johann	a18da9760a	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch." This reverts commit `bde04ce503` Change-Id: I053dae04c761b04a36dc239558503905a14d2470	2015-01-23 08:42:02 -08:00
hkuang	bde04ce503	Merge branch 'frame-parallel' to enable frame parallel decode in master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading which will make devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure. 
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64	2015-01-22 18:18:53 -08:00
Jingning Han	5c31fd5c6d	Merge "Enable sub8x8 inter block search for RTC coding mode"	2015-01-02 10:00:35 -08:00
Jingning Han	dad89d5ca1	Enable sub8x8 inter block search for RTC coding mode This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a	2014-12-24 17:40:31 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Paul Wilkins	2e39817f5e	Merge "Improve motion detection for low complexity regions."	2014-12-18 08:38:21 -08:00
Jingning Han	01613aa753	Set second ref frame to be NONE in key frame coding This commit explicitly set the second reference frame type to be NONE in key frame coding mode. This fixes a subtle dependency of reference motion vector used by next inter frame on mode_info reset before key frame coding. Change-Id: I5ff0359753fdc9992b0bfe889490f7a32d7d5f6a	2014-12-16 15:49:58 -08:00
Paul Wilkins	b6c75c5a8d	Improve motion detection for low complexity regions. Where there is very subtle motion, especially when combined with low spatial complexity, the codec sometimes fails to quickly pick up the ambient motion field. Once it has been established though the field propagates well using Nearest and Near MV. This patch looks specifically at the case where the Nearest and Near have not been established as non zero vectors and in this case discounts the cost of searching for a new vector in the rd code. This will almost certainly have some implications in terms of encode speed but it should be possible to mitigate the impact in a subsequent patch using first pass stats and the local spatial complexity. Average results for test sets approximately neutral. Change-Id: I44a29e20f11f7ab10f8c93ffbdc50183d9801524	2014-12-16 17:22:54 +00:00
Jingning Han	eefe869291	Simplify rate-distortion modeling function Use left shift to replace one multiplication. The computation outcome remains identical. Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1	2014-12-15 11:51:16 -08:00
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
hkuang	382f86f945	Improve the performance by caching the left_mi and right_mi in macroblockd. This improves the decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3	2014-12-05 16:25:42 -08:00
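
Note on commit 0149fb3d6b ("Changes to exhaustive motion search"): the multi-stage mesh search that message describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the libvpx implementation. The type names (MotionVec, MeshStage, CostFn), the functions mesh_search, toy_cost and demo_mesh_search, and the squared-distance cost are all hypothetical; only the three range/interval pairs come from the commit message. The preceding step search, the distortion threshold that gates the exhaustive search, and any clamping to frame bounds are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; these are not the libvpx definitions. */
typedef struct { int row, col; } MotionVec;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*CostFn)(MotionVec mv, const void *ctx);

/* Run each mesh stage in turn. Every stage scans a square grid of
 * +/- range around the best position found so far, stepping by interval,
 * and keeps the lowest-cost candidate as the center for the next stage. */
static MotionVec mesh_search(MotionVec start, const MeshStage *stages,
                             int num_stages, CostFn cost, const void *ctx) {
  assert(stages != NULL && num_stages > 0 && cost != NULL);

  MotionVec best = start;
  unsigned int best_cost = cost(start, ctx);

  for (int s = 0; s < num_stages; ++s) {
    const MotionVec center = best; /* fixed for the whole stage */
    assert(stages[s].interval > 0);
    for (int r = -stages[s].range; r <= stages[s].range; r += stages[s].interval) {
      for (int c = -stages[s].range; c <= stages[s].range; c += stages[s].interval) {
        const MotionVec cand = { center.row + r, center.col + c };
        const unsigned int cand_cost = cost(cand, ctx);
        if (cand_cost < best_cost) {
          best_cost = cand_cost;
          best = cand;
        }
      }
    }
  }
  return best;
}

/* Toy cost (an assumption for the demo): squared distance to a target vector.
 * A real encoder would measure prediction distortion plus vector cost here. */
static unsigned int toy_cost(MotionVec mv, const void *ctx) {
  const MotionVec *target = (const MotionVec *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Stage table taken from the example in the commit message:
 * +/-64 interval 4, +/-32 interval 2, +/-15 interval 1. */
static MotionVec demo_mesh_search(void) {
  static const MeshStage stages[] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };
  const MotionVec target = { 37, -21 };
  const MotionVec start = { 0, 0 };
  return mesh_search(start, stages, 3, toy_cost, &target);
}
```

With this toy cost the coarse stage lands on (36, -20), the interval-2 stage cannot improve on it, and the interval-1 stage reaches the target (37, -21). The three stages evaluate far fewer positions than a single interval-1 scan over +/-64 would, which is the complexity saving the commit message refers to.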

1367 Commits