generic-library/vpx

Author	SHA1	Message	Date
Geza Lore	f1342a7b07	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6	2015-11-05 10:02:17 +00:00
Geza Lore	965a8dea0b	Convert motion search config from AoS to SoA This is a prerequisite for vectorizing vp9_diamond_search_sad_c. Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410	2015-10-28 15:30:43 +00:00
Marco Paniconi	dc9d36c0a6	Merge "Code cleanup for vp9-denoiser."	2015-10-26 20:52:16 +00:00
Paul Wilkins	cce3982c48	Merge "Incorrect frame used in KF boost loop."	2015-10-26 19:12:34 +00:00
Paul Wilkins	26abc15e04	Merge "Bug in clamping of base_frame_target."	2015-10-26 19:12:08 +00:00
Marco	f2845ed83c	Code cleanup for vp9-denoiser. Change-Id: Ibb573f50c4bf2cfb382b589803f3363db0ac1285	2015-10-26 12:04:54 -07:00
Debargha Mukherjee	65dd056e41	Merge "Optimize vpx_quantize_{b,b_32x32} assembler."	2015-10-26 18:04:49 +00:00
Debargha Mukherjee	35cae7f1b3	Merge "Optimize vp9_highbd_block_error_8bit assembly."	2015-10-26 18:03:46 +00:00
Paul Wilkins	762c0f2264	Bug in clamping of base_frame_target. Bug relating to issue:- http://b/25090786 base_frame_target is supposed to track the idealized bit allocation based on error score and not the actual bits allocated to each frame. The clamping of this value based on the VBR min and max pct values was causing a bug where in some cases the loop that adjusts the active max quantizer for each GF group was running out of bits at the end of a KF group. This caused a spike in Q and some ugly artifacts. A second change makes sure that the calculation of the active Q range for a group DOES, however, take account of clamping. Change-Id: I31035e97d18853530b0874b433c1da7703f607d1	2015-10-23 14:45:48 -07:00
Marco	d162934bdc	VP9: Estimate noise level for denoiser. Periodically estimate noise level in source, and only denoise if estimated noise level is above threshold. Change-Id: I54f967b3003b0c14d0b1d3dc83cb82ce8cc2d381	2015-10-23 11:03:30 -07:00
Paul Wilkins	4e887f032d	Incorrect frame used in KF boost loop. Fixes a bug in the calculation of the boost for key frames. Change-Id: I75e9c96a9e86379239fbbbecb56ccd529783dc7c	2015-10-21 22:17:53 +01:00
Geza Lore	aa8f85223b	Optimize vp9_highbd_block_error_8bit assembly. A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7	2015-10-21 12:30:40 +01:00
Geza Lore	9cfba09ac0	Optimize vpx_quantize_{b,b_32x32} assembler. Added optimization of the 8 bit assembly quantizer routines. This makes these functions up to 100% faster, depending on encoding parameters. This patch makes the encoder faster in both the high bitdepth and 8bit configurations. In the high bitdepth configuration, it affects profile 0 only. Based on my profiling using 1080p input the net gain is between 1-3% for the 8 bit config, and around 2.5-4.5% for the high bitdepth config, depending on target bitrate. The difference between the 8 bit and high bitdepth configurations for the same encoder run is reduced by 1% in all cases I have profiled. Change-Id: I86714a6b7364da20cd468cd784247009663a5140	2015-10-20 10:11:19 +01:00
James Zern	849e54cedd	Merge "vp8cx: remove deprecated reference/entropy controls"	2015-10-20 02:46:36 +00:00
James Zern	a046f56491	vp8cx: remove deprecated reference/entropy controls VP8E_UPD_ENTROPY, VP8E_UPD_REFERENCE and VP8E_USE_REFERENCE have been deprecated since the initial public release Change-Id: Ied16b441eec13434d85f1ab115d49ccaf5f2f7b0	2015-10-16 17:02:36 -07:00
Yaowu Xu	568429512e	Add a new enum type vpx_color_range_t to make meaning of color_range obvious. Change-Id: I303582e448b82b3203b497e27b22601cc718dfff	2015-10-16 16:27:18 -07:00
Marco	b44c5cf639	Adjustment on limiting cyclic refresh on steady blocks. Adjust the qp threshold and consec_zeromv threshold for limiting cyclic refresh. Also increase the refresh period when the limit amount is significant, and some code-cleanup. Small gain in PSNR/SSIM metrics: ~0.25/0.3 gain on RTC set, speed 7. Change only affects non-screen content. Change-Id: I1ced87a89a132684c071e722616e445b2d18236a	2015-10-16 10:16:44 -07:00
Yaowu Xu	1832ba7509	Restore partial changes from previous commit This portion was tested to have no effect on asan test failures. Change-Id: I3de1dab7479148bdffc24c4568cb2e7e9963f099	2015-10-16 00:28:37 +00:00
Jacky Chen	a5d74843eb	Merge "VP9_resizing: adjust the threshold and another improvement."	2015-10-15 21:35:02 +00:00
Marco Paniconi	cff15f9d3c	Merge "Fix resetting of cyclic refresh on dynamic resize change."	2015-10-15 21:09:06 +00:00
JackyChen	dc002cb7b4	VP9_resizing: adjust the threshold and another improvement. Adjust the qp threshold based on the denoising setting; do not allow scaling directly from the original resolution to one half and vice versa. Change-Id: I032a9b22f8e1c88de6bb81cf8351367223a3e40d	2015-10-15 09:27:22 -07:00
Marco	d6bbda4bc2	Fix resetting of cyclic refresh on dynamic resize change. Put the reset at the right place, during the setup and prior to updating the map. Change-Id: I75e550ae9d8cc15081330b8857edc04c23947875	2015-10-15 09:03:51 -07:00
Marco	1a0a10cf3d	VP9: Rate control update for re-encode screen-content. For the re-encoding (at max-qp) on the detected high-content change: update rate correction factor, reset rate over/under-shoot flags, and update/reset the rate control for layered coding. Change-Id: I5dc72bb235427344dc87b5235f2b0f31704a034a	2015-10-15 08:26:15 -07:00
Yaowu Xu	4727fa2a75	Fix two asan failures Change-Id: I57865e9604ac162ef0d97deb16e81ca436a98428	2015-10-14 18:03:31 -07:00
Yaowu Xu	c2b8b5bfe2	Merge "Changes to partition breakout rules."	2015-10-13 22:31:56 +00:00
paulwilkins	cdc359989a	Changes to partition breakout rules. Changes to the breakout behavior for partition selection. The biggest impact is on speed 0 where encode speed in some cases more than doubles with typically less than 1% impact on quality. Speed 0 encode speed impact examples: Animation test clip: +128%, Park Joy: +59%, Old Town Cross: +109%. Change-Id: I222720657e56cede1b2a5539096f788ffb2df3a1	2015-10-13 14:19:06 -07:00
Marco	1ce01eaaf7	VP9-SVC: Bugfix to allow skipping lower layer(s) encoding. The setting of svc->spatial_layer_to_encode was missing in VP9E_SET_SVC_LAYER_ID. Change-Id: I015b1a64adb9ef2644d6477a02d9d9364c8462b9	2015-10-12 16:11:34 -07:00
James Zern	ba7ea4456f	tile_worker_hook: fix -Wclobbered warning *tile should be marked volatile like the others due to the use of setjmp() Change-Id: I5dbf8e6792e4c0f34a683434b4fd06e3b4c75c4b	2015-10-10 11:17:08 -07:00
James Zern	65055a5fbd	Merge "vp9/decode_tiles_mt: remove unnecessary local"	2015-10-09 17:52:34 +00:00
Debargha Mukherjee	94bedd013e	Merge "Optimization of 8bit block error for high bitdepth"	2015-10-09 13:36:47 +00:00
Geza Lore	0134764fa6	Optimization of 8bit block error for high bitdepth If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f	2015-10-08 14:05:25 -07:00
Jacky Chen	66bf686975	Merge "VP9 denoiser: use skin map to improve denoising."	2015-10-08 21:02:46 +00:00
jackychen	bafe1a2d67	VP9 denoiser: use skin map to improve denoising. Only denoise at small motion if it's a skin block. Change-Id: I6235cad9dd7f76ab40e7d9cdfe6180e619c20c6e	2015-10-08 12:17:25 -07:00
jackychen	eaa101b502	vp9_skin_detection: fix some build warnings. Change-Id: Ib779c083e9775dc9922ed6e104f6275bc453bef9	2015-10-08 09:51:34 -07:00
James Zern	50b20b90aa	vp9/decode_tiles_mt: remove unnecessary local reuse the common loop index Change-Id: I9db45a93c219c2123917514cb8e9d4ea86454711	2015-10-07 17:46:13 -07:00
James Zern	a83e8ec008	Merge "vp9/tile_worker_hook: pass pbi directly"	2015-10-07 22:09:33 +00:00
James Zern	1f2acb7e40	Merge changes Iaee60826,I51cf1e39 * changes: vp9/tile_worker_hook: add multiple tile decoding invalid_file_test: loosen error check w/tile-threading	2015-10-07 22:09:21 +00:00
jackychen	b0a2ba2ffa	VP9_denoiser: pass address in copy_frame to make it faster. Change-Id: I65269ddb3ea5f911d5be38614b93c97be7e1ba76	2015-10-07 13:22:37 -07:00
Marco Paniconi	780ada18aa	Merge "VP9 denoiser bug-fix: artifact caused by false buffer swap."	2015-10-07 19:08:07 +00:00
Alex Converse	061103dc82	Merge "vp9: simplify extrabits encoding"	2015-10-07 18:45:02 +00:00
jackychen	7231c62c9f	VP9 denoiser bug-fix: artifact caused by false buffer swap. The artifact occurs periodically when the VP9 denoiser is on and refresh_golden_frame happens. When refresh_golden_frame happens, we should copy the frame buffer instead of swapping the pointers. Change-Id: Ib3204c4b04db28ecf439c6d9e61f3d146f04196d	2015-10-07 11:16:15 -07:00
James Zern	0bd82af834	vp9/tile_worker_hook: pass pbi directly reduces the size of TileWorkerData reusing the storage in the worker itself Change-Id: If8a62fcb35167037c3da5814ab84fb81893f9cab	2015-10-06 20:14:24 -07:00
James Zern	1f4a6c8a4e	vp9/tile_worker_hook: add multiple tile decoding this reduces the number of synchronizations in decode_tiles_mt() and improves overall performance when the number of threads is less than the number of tiles Change-Id: Iaee6082673dc187ffe0e3d91a701d1e470c62924	2015-10-06 20:13:54 -07:00
Marco	bc137ff67b	Move setting of refresh threshold outside loop. Small code cleanup. consec_zeromv refresh threshold does not need to be computed for every super-block. No change in behavior. Change-Id: I8c4b1b28072f42b01d917fff6d1f62722f1e1554	2015-10-06 17:51:30 -07:00
Alex Converse	2f7f482c77	vp9: simplify extrabits encoding Change-Id: I5a2abd35cb303d8f6354b3119ab95acf90405116	2015-10-06 16:26:08 -07:00
Marco	7266bedc04	Add first_spatial_layer_to_encode to SVC. Use the existing VP9_SET_SVC control to set the first spatial layer to encode. Since we loop over all spatial layers inside the encoder, the setting of spatial_layer_id via VP9_SET_SVC has no relevance. Use it instead to set the first_spatial_layer_to_encode, which allows an application to skip encoding lower layer(s). Change only affects the 1 pass CBR SVC. Change-Id: I5d63ab713c3e250fdf42c637f38d5ec8f60cd1fb	2015-10-06 08:56:15 -07:00
jackychen	de53e6de49	Add the check of resolution in VP9 dynamic resizing. The resolution check fixes an issue that reset resize_pending unnecessarily and made the output not bit-exact with the previous one-step version. Change-Id: I4e7660b3c8f34f59781e2e61ca30d61080c322de	2015-10-05 15:39:32 -07:00
Marco Paniconi	7777e7a8d5	Merge "Fix to denoiser with dynamic resize."	2015-10-05 14:14:35 +00:00
Marco Paniconi	3da6564f90	Merge "Stabilize the encoder buffer from going too negative."	2015-10-05 14:11:43 +00:00
JackyChen	87b2495f95	Turn on two-steps scaling in VP9 encoder dynamic resizing. First do a 3/4 scaling and then go down to 1/2 when necessary. Change-Id: I5689c5228ca7e1606baea7f960eb24d0dab04d4d	2015-10-02 15:27:37 -07:00
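
Commit f1342a7b07 in the log above lists three properties that the SAD cost lookup tables must satisfy before the AVX version of vp9_diamond_search_sad can be used. The sketch below shows one way such a precondition could be checked in C. The function name, its parameters, and the table layout (component tables addressed from their centre so that negative indices are valid) are assumptions made for illustration and are not taken from libvpx.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libvpx.
 * joint_cost    : 4 joint-type costs.
 * comp_cost[k]  : pointer to the CENTRE of the cost table for component k,
 *                 valid for indices -max_mv .. max_mv (assumed layout).
 * Returns true when all three properties from the commit message hold. */
static bool sad_cost_tables_are_vectorizable(const int joint_cost[4],
                                             const int *const comp_cost[2],
                                             int max_mv) {
  /* Joint cost: entries 1, 2 and 3 must be equal. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3])
    return false;

  for (int i = -max_mv; i <= max_mv; ++i) {
    /* Equal per-component cost: table 0 matches table 1. */
    if (comp_cost[0][i] != comp_cost[1][i]) return false;
    /* Even cost function: cost(i) == cost(-i). */
    if (comp_cost[0][i] != comp_cost[0][-i]) return false;
  }
  return true;
}
```

A caller could run a check of this kind once after building the tables and fall back to the C implementation when it returns false.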

8380 Commits