generic-library/vpx

Author	SHA1	Message	Date
Zoe Liu	9c62f9282f	Merge "Added 3 more reference frames for inter prediction." into nextgenv2	2015-11-24 19:47:03 +00:00
Yaowu Xu	c1629ca53b	Merge branch 'master' into nextgenv2	2015-11-21 05:00:05 -08:00
Zoe Liu	3ec1601e37	Added 3 more reference frames for inter prediction. Under the experiment of EXT_REFS: LAST2_FRAME, LAST3_FRAME, and LAST4_FRAME. Coding efficiency: derflr +1.601%; hevchr +1.895% Speed: Encoder slowed down by ~75% Change-Id: Ifeee5f049c2c1f7cb29bc897622ef88897082ecf	2015-11-20 17:00:24 -08:00
Alex Converse	b1fcd1751e	Fix unsigned overflow in rd_variance_adjustment. Found with clang -fsanitize=integer Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1	2015-11-19 15:00:59 -08:00
Yaowu Xu	7eeb7671d5	Merge branch 'master' into nextgenv2	2015-11-18 05:00:05 -08:00
paulwilkins	0149fb3d6b	Changes to exhaustive motion search. This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa	2015-11-13 10:16:31 +00:00
Yaowu Xu	b49ac0b160	Merge branch 'master' into nextgenv2 Change-Id: I8811bfd8fc132b9f515707e795bb6308e4bf263b	2015-11-09 09:52:18 -08:00
hui su	6ab6ac450b	Use accurate bit cost for uv_mode in UV intra mode RD selection On derflr, +0.1% for VP10; however, -0.03% on VP9. Change-Id: I09c724232ede74254043d61d3cadc506256af0af	2015-11-06 14:45:43 -08:00
Yaowu Xu	b6da40ad82	Merge branch 'master' into nextgenv2 Change-Id: I0e4030a37354bb23b3aa8be5cc1473770b9e7b06	2015-10-27 08:28:09 -07:00
Yaowu Xu	4ac2ae3a4d	Merge branch 'masterbase' into nextgenv2 Conflicts: configure test/vp9_encoder_parms_get_to_decoder.cc vp10/common/blockd.h vp10/common/entropymode.c vp10/common/entropymode.h vp10/common/idct.c vp10/decoder/decodeframe.c vp10/decoder/decodemv.c vp10/encoder/bitstream.c vp10/encoder/encodeframe.c vp10/encoder/encodemb.c vp10/encoder/encoder.c vp10/encoder/encoder.h vp10/encoder/rd.c vp10/encoder/rdopt.c vp10/encoder/tokenize.c vp10/encoder/tokenize.h vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_aq_cyclicrefresh.c vp9/encoder/vp9_encoder.h vp9/vp9_cx_iface.c vpx/vp8cx.h vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c vpx_scale/yv12config.h Change-Id: I604a329d38badec7a11e8ede16ca1404476e9b93	2015-10-22 11:40:44 -07:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
Zoe Liu	8806955dbd	Added is_compound_ref() to identify compound prediction Change-Id: I7e3bf9f181e0cfbebf7afe93dabb03384b595b79	2015-10-02 13:47:15 -07:00
Scott LaVarnway	2f8625d824	VP9: remove plane_type from macroblockd_plane Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec	2015-09-30 15:15:11 -07:00
Yaowu Xu	7c514e2dfd	Merged branch 'master' into nextgenv2 Resolved Conflicts in the following files: configure vp10/common/idct.c vp10/encoder/dct.c vp10/encoder/encodemb.c vp10/encoder/rdopt.c Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316	2015-09-29 16:17:32 -07:00
hui su	38cc168822	Adjust rd calculation in choose_tx_size_from_rd Coding gain: derflr 0.142% hevclr 0.153% hevcmr 0.124% Change-Id: I63b56ae3a9002c3a266e10e2964135ed43b0ba53	2015-09-23 10:54:28 -07:00
Jingning Han	b6d71a308c	Fix ioc warnings related to sub8x8 reference frame Access scaled reference frame in the sub8x8 rate-distortion optimization loop only when the current test mode is an inter mode. This prevents an ioc warning triggered by sending intra_frame index to fetch scaled reference frame. Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604	2015-09-09 15:48:00 -07:00
Jingning Han	50461166b7	Enable sub8x8 inter mode with scaled ref frame in RD optimization This commit allows the encoder to include sub8x8 inter mode with scaled reference frame in the rate-distortion optimization scheme. Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5	2015-09-09 00:29:06 +00:00
Johann	c5f11912ae	Include vpx_dsp_common.h when using VPXMIN/MAX Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee	2015-08-31 14:36:35 -07:00
James Zern	5e16d397bd	vpx_dsp_common: add VPX prefix to MIN/MAX prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c	2015-08-26 20:11:32 -07:00
Shunyao Li	aa006d7149	Add transform size rate for intra skip mode in rdopt stdhd +0.226 hevchr +0.091 hevcmr +0.052 derflr +0.033 Change-Id: I84034209c5760609a99bd6e0ce55e02534b72cac	2015-08-24 18:15:09 -07:00
hui su	088b05fd99	Use sizeof(variable) instead of sizeof(type) Change-Id: Ia069da11eebb271063e9eb837bdb3e7175ecce13	2015-08-12 11:25:38 -07:00
Alex Converse	a8a08ce57e	Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4	2015-08-10 15:37:14 -07:00
Zoe Liu	c21cab39c8	Fixed a comment on the compound ref frames. Change-Id: I77e397ac9f594c9c4c1db442e334a6ea5f53f588	2015-08-06 17:36:57 -07:00
Jingning Han	b4f2c567c8	Cosmetic - align format in vp9 Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e	2015-08-06 15:56:11 -07:00
Alex Converse	ab20c98e84	Compute skippable inside the block_rd_txfm loop. Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2	2015-07-31 11:45:59 -07:00
Alex Converse	c62228f273	Simplify model_rd_for_sb HBD ifdefs Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d	2015-07-31 11:16:59 -07:00
Alex Converse	da9c73c293	Simplify dist_block HBD ifdefs Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114	2015-07-31 11:04:01 -07:00
Aℓex Converse	8abd0c2a12	Merge "Short circuit rate_block in block_rd_txfm."	2015-07-31 17:59:22 +00:00
Alex Converse	4ac5058afc	Give skip_txfm constants names. This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792	2015-07-31 10:08:08 -07:00
Alex Converse	73422d3b2d	Short circuit rate_block in block_rd_txfm. Don't run rate_block (cost_coeffs) if distortion alone is enough to surpass best_rd. This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is zero effect on output if tx_cache is removed. Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784	2015-07-31 10:05:51 -07:00
Yunqing Wang	3b2e73b9a4	Remove tx cache and speed up tx size selection 1. The RD scores obtained during the tx size selection were stored in the tx cache, and used to help make the tx decision for the following frames. This wasn't used anymore in VP9 encoder. Recovered the related decision making code from 1.5+ years ago, and borg tests didn't show any quality gain. This patch removed it to lower the complexity. 2. An optimization was done after the above refactoring. If the tx_mode is not TX_MODE_SELECT, we only need to test the chosen tx size instead of all possible tx sizes. This gave a 1.5% average speed gain at speed 2, and a 1% average speed gain at speed 3. Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c	2015-07-30 18:53:40 -07:00
Aℓex Converse	eb6b443bd2	Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."	2015-07-30 23:04:28 +00:00
Alex Converse	c827c59eaf	Convert simple_model_rd_from_var from a speed check to a speed feature. Change-Id: I8877025e172fff29bc4e270790211463b676b4d7	2015-07-30 13:53:26 -07:00
Alex Converse	b7f441a0bc	Cleanup rdcost_block_args Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba	2015-07-30 12:55:51 -07:00
Jingning Han	4b5109cd73	Replace vp9_ prefix in 2D-DCT functions with vpx_ Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25	2015-07-28 16:06:44 -07:00
Yaowu Xu	bf82514b54	vpx_dsp/bitreader.h: vp9_->vpx_ Replace vp9_ in names to vpx_ as they are not codec specific. Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb	2015-07-20 18:06:31 -07:00
Jingning Han	389ed6da10	Refactor highbd forward transform use case Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf	2015-07-20 10:31:17 -07:00
Jingning Han	81452cf0b7	Refactor intra block prediction function This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138	2015-07-13 15:20:47 -07:00
paulwilkins	8dd466edc8	Changes to use of rectangular partitions. Changes to allow more use of rectangular partitions at speeds 1 and 2 for content classed by the first pass as animation and for blocks near the active image edge. This has quite a big impact in quality for the animated test sequence but also hurts encode speed for speed 2. For other content types the impact on both speed and quality is small. Added some plumbing for detection of internal vertical image edges. Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a	2015-07-08 18:14:12 +01:00
paulwilkins	a126b6ce7d	Change speed and rd features for formatting bars. Change speed features / behavior for split mode when there is an internal active edge (e.g. formatting bars). Remove some threshold constraints in rd code near the active edge of the image. Add some plumbing for left and right active edge detection. Patch set 5. Limit rd pass through for sub 8x8 to internal active edges. This takes away any speed penalty for most clips but keeps the enhanced edge coding for the more critical case of internal image edges Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5	2015-07-08 17:51:42 +01:00
Johann	6a82f0d7fb	Move sub pixel variance to vpx_dsp Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1	2015-07-07 15:51:04 -07:00
Jingning Han	fcb5a8692a	Merge "Move subtract functions from vp9 to vpx_dsp"	2015-07-06 22:39:26 +00:00
James Zern	017253b7a3	remove vp9_get_interp_kernel() expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20	2015-07-06 13:04:05 -07:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
Scott LaVarnway	c06d56cc7d	VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107	2015-06-29 12:46:47 -07:00
Scott LaVarnway	86f4a3d8af	Remove tile param and added to MACROBLOCKD. Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db	2015-06-22 06:09:38 -07:00
Scott LaVarnway	cca866f578	inline vp9_get_segdata() and change name. Change-Id: I706645cf9d9dc04f1b3b6ac80df80edb7f101854	2015-06-11 09:52:00 -07:00
Scott LaVarnway	42c0b1b1f1	inline vp9_segfeature_active() and changed name. Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3	2015-06-11 04:20:55 -07:00
Scott LaVarnway	baaaa57533	Reducing size of MODE_INFO struct Reduced size from 124 bytes to 104 bytes. For decode only builds, it is reduced to 68 bytes. Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3	2015-06-04 07:32:16 -07:00


1305 Commits
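
The multi-stage mesh search described in commit 0149fb3d6b can be illustrated with a small sketch. This is not the libvpx implementation: the names `MV`, `MeshStage`, `CostFn`, `mesh_stage`, `mesh_search`, `toy_cost`, and `kExampleStages` are hypothetical, the cost callback stands in for whatever distortion metric the encoder uses, and the threshold is an arbitrary caller-supplied value. Only the stage table (range 64 / interval 4, range 32 / interval 2, range 15 / interval 1) and the "skip the mesh unless the step search distortion exceeds a threshold" rule are taken from the commit message.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not the libvpx definitions. */
typedef struct { int row, col; } MV;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*CostFn)(MV mv, const void *ctx);

/* Stage table from the commit message's example. */
static const MeshStage kExampleStages[3] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };

/* Toy distortion: squared distance to a target vector passed via ctx. */
static unsigned int toy_cost(MV mv, const void *ctx) {
  const MV *target = (const MV *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Evaluate a square grid of +/- range around *center with the given spacing.
 * On return *center holds the best position found; its cost is returned. */
static unsigned int mesh_stage(MV *center, int range, int interval,
                               CostFn cost, const void *ctx) {
  MV best = *center;
  unsigned int best_cost = cost(best, ctx);
  for (int r = -range; r <= range; r += interval) {
    for (int c = -range; c <= range; c += interval) {
      const MV cand = { center->row + r, center->col + c };
      const unsigned int d = cost(cand, ctx);
      if (d < best_cost) {
        best_cost = d;
        best = cand;
      }
    }
  }
  *center = best;
  return best_cost;
}

/* Run the mesh stages only when the preceding step search result
 * (*mv with cost step_cost) is worse than the threshold. Each stage is
 * centered on the best position from the previous stage. */
static unsigned int mesh_search(MV *mv, unsigned int step_cost,
                                unsigned int threshold,
                                const MeshStage *stages, size_t num_stages,
                                CostFn cost, const void *ctx) {
  unsigned int best_cost = step_cost;
  if (step_cost <= threshold) return best_cost; /* step search was good enough */
  for (size_t i = 0; i < num_stages; ++i) {
    best_cost = mesh_stage(mv, stages[i].range, stages[i].interval, cost, ctx);
  }
  return best_cost;
}
```

With the example table, the three stages evaluate 33x33, 33x33, and 31x31 grid points (3139 in total), compared with 129x129 = 16641 points for a single +/- 64 search at step 1. The trade-off is that a coarse stage can settle on a local minimum that the finer stages cannot escape, which is consistent with the commit's description of the result as "almost as effective" as the full search.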