generic-library/vpx

Author	SHA1	Message	Date
Parag Salasakar	1c9af9833d	Merge "mips msa vp9 convolve8 horiz optimization"	2015-04-21 22:08:25 -07:00
Jim Bankoski	3b35e962e2	Merge "Adds a new temporal consistency metric to libvpx."	2015-04-21 16:11:11 -07:00
Johann	931c0a954f	Merge "Rename neon convolve avg file"	2015-04-21 15:45:29 -07:00
Johann	66b9933b8d	Rename neon convolve avg file Some build systems use just the basename for object files. Change-Id: I333e1107ee866f3906cc46476ef8d04c6200a8a0	2015-04-21 14:18:17 -07:00
Scott LaVarnway	8b17f7f4eb	Revert "Remove mi_grid_* structures." (see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d	2015-04-21 11:16:45 -07:00
Jim Bankoski	ee87e20d53	Adds a new temporal consistency metric to libvpx. Change-Id: Id61699ebf57ae4f8af96a468740c852b2f45f8e1	2015-04-21 10:05:37 -07:00
Yaowu Xu	924d06a075	Merge "Resolve configuration conflict"	2015-04-21 08:00:49 -07:00
paulwilkins	3606b78108	Modified test for auto key frame detection. The existing test was triggering a lot of false positives on some types of animated material with very plain backgrounds. These were triggering code designed to catch key frames in letter box format clips. This patch tightens up the criteria and imposes a minimum requirement on the % blocks coded intra in the first pass and the ratio between the % coded intra and the modified inter % after discounting neutral (flat) blocks that are coded equally well either way. On a particular problem animation clip this change eliminated a large number of false positives including some cases where the old code selected kf several times in a row. Marginal false negatives are less damaging typically to compression and in the problem clip there are now a couple of cases where "visual" scene cuts are ignored because of well correlated content across the scene cut. Replaced some magic numbers related to this with #defines and added explanatory comments. Change-Id: Ia3d304ac60eb7e4323e3817eaf83b4752cd63ecf	2015-04-21 12:50:11 +01:00
Parag Salasakar	ca90d4fd96	mips msa vp9 convolve8 horiz optimization average improvement ~6x-8x Change-Id: I7c91eec41aada3b0a5231dda7869b3b968f3ad18	2015-04-21 12:31:26 +05:30
Parag Salasakar	391ecffed9	Merge "mips msa vp9 convolve8 hv optimization"	2015-04-20 23:39:24 -07:00
Parag Salasakar	ef51c1ab5b	mips msa vp9 convolve8 hv optimization average improvement ~5x-8x Change-Id: I3214734cb3716e742907ce0d2d7a042d953df82b	2015-04-21 09:17:49 +05:30
Yaowu Xu	b423a6b212	Resolve configuration conflict Between --enable-internal-stats and --enable-vp9-highbitdepth Change-Id: I36b741554e835033e69883270b6b0e5374a1aafa	2015-04-20 16:44:12 -07:00
Yaowu Xu	305492c375	Move declaration before statement Change-Id: Ib64786fcc0d6dc11c4e66f5b7f3e93b2a4fcb664	2015-04-20 09:50:59 -07:00
Parag Salasakar	2e36149ccd	Merge "mips msa vp9 convolve8 vert optimization"	2015-04-18 23:39:25 -07:00
Parag Salasakar	27d083c1b9	mips msa vp9 convolve8 vert optimization average improvement ~6x-10x Change-Id: Ie3f3ab3a9005be84935919701e56b404e420affa	2015-04-18 08:13:04 +05:30
Jim Bankoski	03829f2fea	Merge "Adds a blockiness metric to internal stats."	2015-04-17 16:06:26 -07:00
Jim Bankoski	3d2f037a44	Merge "adds psnrhvs to internal stats."	2015-04-17 16:06:10 -07:00
Jim Bankoski	f2cbee9a04	Merge "Adds a fastssim metric to VPX internal stats."	2015-04-17 16:05:53 -07:00
Jim Bankoski	1777413a2a	Adds a blockiness metric to internal stats. Change-Id: Iedceeb020492050063acf3fd2326f96c29db9ae5	2015-04-17 11:13:18 -07:00
Jim Bankoski	9757c1aded	adds psnrhvs to internal stats. PSNR HVS is a human visual system weighted version of SNR that's gained some popularity from academia and apparently better matches MOS testing. This code is borrowed from the Daala Project but uses our FDCT code. Change-Id: Idd10fbc93129f7f4734946f6009f87d0f44cd2d7	2015-04-17 10:29:27 -07:00
Jim Bankoski	3f7f194304	Adds a fastssim metric to VPX internal stats. This code appeared in the Daala project first and was originally committed by Nathan Egge. Change-Id: Iadce416a091929c51b46637ebdec984cddcaf18c	2015-04-17 10:23:24 -07:00
Jingning Han	73bce9ec7e	Merge "Remove unnecessary backup token stream pointer"	2015-04-17 09:13:53 -07:00
Marco Paniconi	f76ccce5bc	Revert "Revert "Force_split on 16x16 blocks in variance partition."" This reverts commit `004b9d83e3` Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743	2015-04-16 17:52:13 -07:00
Jingning Han	645c70f852	Remove unnecessary backup token stream pointer When the tokenization is not taking effect, the tokenization pointer remains unchanged. No need to re-assign the backup pointer value. Change-Id: I58fe1f6285aa3b4a88ceb864c11d5de8ac6235dd	2015-04-16 16:44:44 -07:00
Minghai Shang	29b5cf6a9d	Merge "[svc] Fix syntax error when encoding multiple tiles."	2015-04-16 13:43:44 -07:00
Minghai Shang	4aa9255efa	[svc] Fix syntax error when encoding multiple tiles. Change-Id: Ia77b551415f3b3386e22a6c805f244f2d13fe3e3	2015-04-16 12:56:30 -07:00
paulwilkins	effd974b16	Limit arf interval for low fpf clips. This patch limits the maximum arf interval length to approximately half a second. In some low fps animations in particular the existing code was selecting an overly long interval which was hurting visual quality. For a sample problem test clip (360P animation , 15fps, ~200Kbit/s) this change also improved metrics by >0.5 db. There may be some clips where this hurts metrics a little, but the worst case impact visually is likely to be less than having an interval that is much too long. On more normal material at 24 fps or higher, the impact is likely to be nil/minimal. Change-Id: Id8b57413931a670c861213ea91d7cc596375a297	2015-04-16 11:50:37 +01:00
Yunqing Wang	14e7203e7b	Merge "Fix Tsan errors"	2015-04-15 15:34:03 -07:00
Yunqing Wang	63c5bf2b9c	Fix Tsan errors This patch fixed 2 reported Tsan errors while running VP9 real-time encoder. Change-Id: Ib0278fe802852862c3ce87c4a500e544d7089f67	2015-04-15 12:33:39 -07:00
Johann	14ef4aeafb	Reorganize *_rtcd() calling conventions Change-Id: Ib1e17d8aae9b713b87f560ab5e49952ee2bfdcc2	2015-04-15 11:12:05 -04:00
Yunqing Wang	004b9d83e3	Revert "Force_split on 16x16 blocks in variance partition." This reverts commit `eb8c667570`. The patch caused mismatch while using multi-threads. Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be	2015-04-14 15:19:31 -07:00
Marco	2baa3debd5	Merge "Force_split on 16x16 blocks in variance partition."	2015-04-14 09:44:58 -07:00
hkuang	3b2510374a	Merge "Remove unnecessary set postproc flags."	2015-04-13 14:33:43 -07:00
Marco	eb8c667570	Force_split on 16x16 blocks in variance partition. Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks. Also increase variance threshold for 32x32, and add exit condition in choose_partition (with very safe threshold) based on sad used to select reference frame. Some visual improvement near moving boundaries. Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%. Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip. Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577	2015-04-13 12:05:07 -07:00
Parag Salasakar	2f693be8f8	Merge "mips msa vp9 common headers added"	2015-04-09 21:50:15 -07:00
Jingning Han	2404332c1b	Merge "Remove get_nonrd_var_based_fixed_partition function"	2015-04-09 14:45:19 -07:00
Jingning Han	4565812032	Merge "Compute prediction filter type cost only when needed"	2015-04-09 14:45:11 -07:00
Jingning Han	93d9c50419	Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"	2015-04-09 11:16:11 -07:00
Jingning Han	208aa6158b	Remove get_nonrd_var_based_fixed_partition function This function has been replaced by other approaches and is not in use now. Change-Id: I387f45b5607d202539e482468ccc70e6c0f9341f	2015-04-09 09:49:55 -07:00
Parag Salasakar	481fb7640c	mips msa vp9 common headers added Change-Id: Ia31ada59172eb1818e1eb91009f83cbb1f581223	2015-04-09 15:35:12 +05:30
hkuang	7e8e507bfb	Remove unnecessary mv clamp with on demand border extension. Change-Id: Ia2956f06f409b9b0ca8320ca4c1ea5680e938402	2015-04-08 17:16:52 -07:00
Frank Galligan	5668dcc7b9	Refactor dec_build_inter_predictors Refactor the loops in dec_build_inter_predictors to try and decrease the number of instructions. Limited testing saw about 1% perf increase on x86 and about 0.67 % perf increase on Arm. Change-Id: I69cfe6335bb562fbaaebf43fb3f5c5a2a28882a2	2015-04-08 15:00:29 -07:00
Debargha Mukherjee	59681be0a0	Merge "Improve accuracy of rate control in CQ mode"	2015-04-08 10:48:17 -07:00
James Zern	2ed0cf06f9	Merge "vp9_full_search_sadx[38]: align sad arrays"	2015-04-07 20:57:21 -07:00
Yaowu Xu	c88ce84bb5	Merge "Optimize the checking for transform skipping"	2015-04-07 16:29:51 -07:00
Yaowu Xu	90517b5e85	Merge "move ref_frame_cost computations into a function"	2015-04-07 16:29:45 -07:00
Debargha Mukherjee	60bd744c88	Improve accuracy of rate control in CQ mode Modifies a special handling that improves rate control accuracy in the constrained quality mode, when the undershoot and overshoot limits are set tighter. Change-Id: If62103f0ef3ed1cac92807400678c93da50cf046	2015-04-07 16:29:21 -07:00
James Zern	e1ff83f4b0	vp9_full_search_sadx[38]: align sad arrays the sse4 code expects 16-byte aligned arrays; vp8 already had a similar change applied: `b2aa401` Align SAD output array to be 16-byte aligned Change-Id: I5e902035e5a87e23309e151113f3c0d4a8372226	2015-04-07 14:34:06 -07:00
Jingning Han	927693a991	Merge "Enable Hadamard transform based cost estimate for all block sizes"	2015-04-07 12:51:27 -07:00
Jingning Han	6de407b638	Merge "Account for eob cost in the RTC mode decision process"	2015-04-07 12:50:30 -07:00
Jingning Han	25206e7b7f	Compute prediction filter type cost only when needed Skip redundant prediction filter type cost in filter search loop, if the rate value will be reset in Hadamard transform based rate distortion estimate. Change-Id: Ie5221f4bc8da9461c449df367251aeeac52c6e5d	2015-04-07 12:41:46 -07:00
Yaowu Xu	0bb897211d	Optimize the checking for transform skipping If U is not skippable, then do not perform the check on V. Change-Id: Iba5e8362bd42390197f373c44388a426a4404549	2015-04-06 17:54:05 -07:00
Jingning Han	7f629dfca4	SSSE3 assembly implementation of 8x8 Hadamard transform It uses about 10% less CPU cycles than the SSE2 intrinsic implementation. Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499	2015-04-04 09:59:37 -07:00
Jingning Han	9922e4344a	Enable Hadamard transform based cost estimate for all block sizes This commit turns on the Hadamard transform based rate distortion estimate for all block sizes in RTC coding mode. It conditionally skips the rate distortion estimation if all zero block flag is set on. No significant encoding speed change is observed. The compression performance of speed -6 is improved by 1.7% over using it only for block sizes of 32x32 and below. Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb	2015-04-04 09:58:45 -07:00
Yunqing Wang	b2baaa215b	Merge "Fix the scaling factor in UV skipping test"	2015-04-03 17:09:59 -07:00
Yunqing Wang	1a1114d21c	Fix the scaling factor in UV skipping test The threshold scaling factor was calculated wrong using partition size "bsize". Thank Yaowu for pointing it out. It was fixed and no speed change was seen. Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27	2015-04-03 16:07:43 -07:00
James Zern	44e3640923	Merge "vp9: enable sse4 sad functions"	2015-04-03 14:57:52 -07:00
Jingning Han	30e9c091c0	Merge "Tune SSSE3 assembly implementation to improve quantization speed"	2015-04-03 11:24:28 -07:00
Jingning Han	60e01c6530	Account for eob cost in the RTC mode decision process This commit accounts for the transform block end of coefficient flag cost in the RTC mode decision process. This allows a more precise rate estimate. It also turns on the model to block sizes up to 32x32. The test sequences show about 3% - 5% speed penalty for speed -6. The average compression performance improvement for speed -6 is 1.58% in PSNR. The compression gains for hard clips like jimredvga, mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and 3.2%, respectively. Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb	2015-04-03 10:31:51 -07:00
hkuang	d72ed35374	Merge "Fix error of "Left shift of negative value -1"."	2015-04-02 21:35:12 -07:00
Yunqing Wang	12cb30d4bd	Merge "Set vbp thresholds for aq3 boosted blocks"	2015-04-02 18:22:08 -07:00
Yaowu Xu	718feb0f69	move ref_frame_cost computations into a function Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1	2015-04-02 18:10:55 -07:00
hkuang	73c8fe5deb	Fix error of "Left shift of negative value -1". Change-Id: Ia4f3feb20df0e89cc51b02def858e12e927312cc	2015-04-02 17:35:33 -07:00
Marco	f85f79f630	Merge "Code cleanup: put (8x8/4x4)fill_variance into separate function."	2015-04-02 17:33:01 -07:00
Yunqing Wang	cae03a7ef5	Set vbp thresholds for aq3 boosted blocks The vbp thresholds are set separately for boosted/non-boosted superblocks according to their segment_id. This way we don't have to force the boosted blocks to split to 32x32. Speed 6 RTC set borg test result showed some quality gains. Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%. No speed change was observed. Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a	2015-04-02 15:48:32 -07:00
Marco	77ea408983	Code cleanup: put (8x8/4x4)fill_variance into separate function. Code cleanup, no change in behavior. Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24	2015-04-02 13:37:35 -07:00
Marco	6eb05c9ed0	Small fix to segment check in pickmode. Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72	2015-04-02 09:55:13 -07:00
James Zern	b8a1de86fd	Merge "vp9/neon: skip some files in high-bitdepth build"	2015-04-01 23:36:56 -07:00
James Zern	b644384bb5	Merge "vp9: fix high-bitdepth NEON build"	2015-04-01 23:36:17 -07:00
Yaowu Xu	54210f706c	Merge "use MAX_MB_PLANE consistently"	2015-04-01 18:24:39 -07:00
hkuang	f3bea3de5b	Remove unnecessary set postproc flags. Change-Id: Iaf136969bc368a890f9671647576ee9d54eef03b	2015-04-01 17:11:35 -07:00
hkuang	4cf68be17a	Merge "Fix 10-bit video decode failure with --frame-parallel mode."	2015-04-01 17:07:58 -07:00
Jingning Han	2149f214d5	Merge "Reduce required xmm number by one in block_error_fp"	2015-04-01 15:46:22 -07:00
Jingning Han	657cabe0f7	Tune SSSE3 assembly implementation to improve quantization speed Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3	2015-04-01 15:28:01 -07:00
Yaowu Xu	f26b8c84f8	use MAX_MB_PLANE consistently Change-Id: Ic416a7f145001a88f5a7f70dde9b1edbc1b69381	2015-04-01 15:21:20 -07:00
Yaowu Xu	fff4654d36	Merge "Simplify bsize calculation"	2015-04-01 15:06:55 -07:00
Jingning Han	cf4447339e	Merge "Optimize quantization simd implementation"	2015-04-01 14:55:18 -07:00
Jingning Han	a4364e5146	Merge "Simplify effective src_diff address computation"	2015-04-01 14:55:03 -07:00
Jingning Han	7acb2a8795	Merge "Refactor block_yrd function for RTC coding mode"	2015-04-01 14:54:24 -07:00
Yaowu Xu	ba91b54d7c	Simplify bsize calculation Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168	2015-04-01 12:15:06 -07:00
Jingning Han	19da916716	Simplify effective src_diff address computation Remove redundant offset calculation for effective src_diff address. Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef	2015-04-01 12:07:47 -07:00
Jingning Han	f2cf3c06a0	Reduce required xmm number by one in block_error_fp Use 6 xmms instead of 8. Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7	2015-04-01 12:07:35 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
Jingning Han	eed1badedd	Optimize quantization simd implementation This commit allows the quantizer to compare the AC coefficients to the quantization step size to determine if further multiplication operations are needed. It makes the quantization process 20% faster without coding statistics change. Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a	2015-04-01 11:47:09 -07:00
Yunqing Wang	a0043c6d30	Enhance the transform skipping decision-making in non-rd mode For large partition blocks (block_size > 32x32), the variance calculation is modified so that every 8x8 block's variance is stored during the calculation, which is used in the following transform skipping test. Also, the variance for every tx block is calculated. The skipping test checks all tx blocks in the partition, and sets the skip flag only if all tx blocks are skippable. If the skip flag of Y plane is 1, a quick evaluation is done on UV planes. If the current partition block is skippable in YUV planes, the mode search checks fewer inter modes and doesn't check intra modes. The rtc set borg test (at speed 6) showed that: Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%. Average single-thread speedup on rtc set was 3.5%. For 720p clips, more speedups were seen. gipsrecmotion: 13%; gipsrestat: 12%; vidyo: 5 - 9%; dark: 15%; niklas: 6%. Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e	2015-04-01 09:43:40 -07:00
hkuang	1582ac851f	Fix 10-bit video decode failure with --frame-parallel mode. Also add unit test to avoid same error in the future. Issue:981 Change-Id: Iaf9889d8d5514cfdff1ea098e6ae133be56d501f	2015-04-01 09:19:35 -07:00
James Zern	14e24a1297	vp9: enable sse4 sad functions sse4 isn't set by configure or used in rtcd, correct the sad entries to use sse4_1 without changing the signatures for now. this was done in vp8 post-vp9 branch. Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271	2015-03-31 21:00:55 -07:00
James Zern	a98f6c0254	vp9/neon: skip some files in high-bitdepth build exclude files that only contain functions for non-high-bitdepth builds. this removes some warnings related to missing prototypes Change-Id: Ic6642998c46a7b808c6c53b2f9c34bcd4d037abe	2015-03-31 18:06:21 -07:00
James Zern	8845334097	vp9: fix high-bitdepth NEON build remove incorrect specializations in rtcd and update a configuration check in partial_idct_test.cc Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0	2015-03-31 17:45:25 -07:00
Yunqing Wang	fc98114761	Merge "Rename vbp thresholds"	2015-03-31 16:33:30 -07:00
Marco	c2b8218eba	Merge "Set postproc flags in decoder_get_frame."	2015-03-31 15:22:14 -07:00
Yunqing Wang	c28ff1a9de	Rename vbp thresholds Code refactoring Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79	2015-03-31 15:14:44 -07:00
Jingning Han	502ac72233	Merge "Tuning SATD rate calculation for speed"	2015-03-31 14:24:26 -07:00
Jingning Han	1c39c5b96f	Merge "Use aligned copy in 8x8 Hadamard transform SSE2"	2015-03-31 12:16:47 -07:00
Jingning Han	fa4289522e	Merge "Allow block skip coding option in RTC mode"	2015-03-31 12:16:36 -07:00
Jingning Han	1638d7dc96	Merge "Fix 8x8 Hadamard SSE2 implementation"	2015-03-31 12:16:27 -07:00
Alex Converse	9670d766ab	Merge "VP9E_GET_ACTIVE_MAP API function."	2015-03-31 11:52:56 -07:00
Jingning Han	531468a07a	Tuning SATD rate calculation for speed This commit allows the encoder to check the eob per transform block to decide how to compute the SATD rate cost. If the entire block is quantized to zero, there is no need to add anything; if only the DC coefficient is non-zero, add its absolute value; otherwise, sum over the block. This reduces the CPU cycles spent on vp9_satd_sse2 to one third. Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32	2015-03-31 11:02:20 -07:00
hui su	d4f2f1dd5b	Merge "Move vp9_coef_con_tree to common/"	2015-03-31 10:51:10 -07:00
Jingning Han	014fa45298	Use aligned copy in 8x8 Hadamard transform SSE2 This reduces the 8x8 Hadamard transform cycles by 20%. Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b	2015-03-31 10:21:52 -07:00
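
The eob-based shortcut described in commit 531468a07a ("Tuning SATD rate calculation for speed") can be sketched roughly as follows. This is only an illustration of the three cases the commit message lists, not the code from the commit: the function name satd_rate_sketch, its parameters, and the int16_t coefficient type are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not a libvpx function): estimate the SATD rate
 * cost of one transform block from its end-of-block position.
 *   coeff  - transform coefficients of the block, DC first (assumed layout)
 *   length - number of coefficients in the block
 *   eob    - end-of-block position; 0 means everything quantized to zero */
static int satd_rate_sketch(const int16_t *coeff, int length, int eob) {
  assert(coeff != NULL && eob >= 0 && eob <= length);

  /* Whole block quantized to zero: nothing to add. */
  if (eob == 0) return 0;

  /* Only the DC coefficient is non-zero: add its absolute value. */
  if (eob == 1) return abs(coeff[0]);

  /* Otherwise, sum absolute values over the whole block. */
  int sum = 0;
  for (int i = 0; i < length; ++i) sum += abs(coeff[i]);
  return sum;
}
```

The first two branches are what let the full summation be skipped for most blocks, which is presumably where the reported reduction in vp9_satd_sse2 cycles comes from.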

7684 Commits