generic-library/vpx

Author	SHA1	Message	Date
Geza Lore	73bc3119be	Factor out model_rd_from_sse Change-Id: Ia60ff0ecc8d083870fadbfe07d494d1e2c080489	2016-06-03 09:34:55 +01:00
Geza Lore	ab29978e9f	Pre-compute and use contiguous wedge masks. This is purely a refactoring patch and has no functional effect. Uses of these masks can be arranged such that all input blocks are contiguous in memory (stride == block width). In this case 1D versions of operations can be used. 1D vector operations have superior performance over 2D block equivalents as they are more processor cache friendly and they can do away with a second loop overhead. Change-Id: I2b76c9888aea2c857cc497e8a4b2841fd3dad54e	2016-06-03 00:16:22 -07:00
Alex Converse	380c4ee32d	Merge "segmentation: Don't use uninitialized probability data." into nextgenv2	2016-06-01 17:50:37 +00:00
Alex Converse	6bae20ca43	Merge "Replace some vpxbool calls with entropy coder agnostic calls." into nextgenv2	2016-05-31 23:58:00 +00:00
Alex Converse	7a6cb59dbb	segmentation: Don't use uninitialized probability data. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1224 Change-Id: I17b76fcf0d8c191850350d5aa50dcc007b8b0cdc	2016-05-31 16:42:29 -07:00
Alex Converse	aee0091161	Replace some vpxbool calls with entropy coder agnostic calls. Change-Id: Ifbcd0714fcf994c43b69255185456c7a255df66c	2016-05-31 15:42:19 -07:00
hui su	fa933553da	ext-intra: speed up keyframe encoding 130% speed increase for keyframe encoding, with 0.4% compression loss. When kf-max-dist=150, 1.5% speed increase with 0.03% compression loss. Change-Id: I4cf7314ab95b9eb6dd17f314aca8955522c82676	2016-05-31 10:34:44 -07:00
hui su	f523d7b540	Add a speed feature for inter tx type search Separate prediction mode and tx type search for inter modes. Enabled for speed >=1. baseline: speed increase 40% compression drop 0.30%/0.29% on lowres/midres ext-tx: speed increase 160% compression drop 1.08%/0.95% on lowres/midres Change-Id: Ieb34b1ee80df6980d16e26a5783e08cc0deae55b	2016-05-31 10:34:35 -07:00
hui su	38e6dd71bb	Add a speed feature for intra tx type search Add a speed feature to separate prediction mode and tx type search for intra modes: search for best intra prediction mode with fixed default tx type first, then choose the best tx type for the selected mode. Coding performance drop: baseline lowres 0.10% midres 0.08% hdres 0.14% with ext-tx lowres 0.14% midres 0.25% hdres 0.20% Speed improvement is 20% for baseline and 17% for ext-tx. It is turned on for speed >= 1. Change-Id: Ia5e8d39e8a4e2e42c521bfde938f8b6a98ab24f9	2016-05-31 10:33:56 -07:00
Zoe Liu	e89ca180c2	Make the bi-predictive frame group interval adjustable This is for the bidir-pred experiment. Previously the length of the bi-predictive frame group interval is fixed at 2, i.e. one bi-predictive frame may be inserted every other frame. This patch makes the length adjustable, i.e. any positive number may be specified, but the use of the backward ref will be turned off if the bi-predictive frame group interval is larger than the golden frame group. Further, an additional rate factor level has been added: INTER_LOW , which applies to LAST_BIPRED_UPDATE frames that are not used as references. Change-Id: I5514d34a64dd486bbb5756c2d0612946f598a789	2016-05-28 16:46:45 -07:00
Linfeng Zhang	af7fb17c09	Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10. Function level timing test shows about 27% time saving on a Xeon E5-2680 v2 desktop. Rename vp9_dct_sse2.c to vp9_dct_intrin_sse2.c for vp9 and rename dct_sse2.c to dct_intrin_sse2.c for vp10 to avoid duplicate basenames. Actually vp9_fwht4x4_mmx/sse2() and vp10_fwht4x4_mmx/sse2() are identical. TODO: They should be unified later if there is no intention to keep a duplicate. Change-Id: I3e537b7bbd9ba417c606cd7c68c4dbbfa583f77d	2016-05-27 09:51:16 -07:00
hui su	e5f47d4334	ext-intra: refactor mode info. writing and reading No performance changes. Change-Id: I001068330ea217a993aee9b79d7ffead0d23100e	2016-05-26 14:56:40 -07:00
Hui Su	88eaf5d6ce	Merge "Skip unnecessary calculations in ext-intra" into nextgenv2	2016-05-26 18:03:02 +00:00
Yi Luo	cb507ff29a	Merge "HBD inverse HT 8x8 and 16x16 sse4.1 optimization" into nextgenv2	2016-05-24 22:06:07 +00:00
Zoe Liu	cf5083d4cd	Added an experiment "bidir_pred" for backward prediction Major parts have been implemented as follows: (1) Added BRF_UPDATE, LASTNRF_UPDATE, and NRF_UPDATE in firstpass.c; (2) Added the handling for the scenario of "cpi->common.show_existing_frame == 1" at the encoder; (3) Added a new reference frame of BWDREF_FRAME; (4) Have bwd-ref work with upsampled references. Note that when the experiment of "ext_refs" turned on, this experiment will be turned off automatically currently. RD performance in Overall PSNR has been improved, compared against the VP10 baseline: lowres: Avg -3.312; BDRate -3.154 derflr: Avg -1.927; BDRate -1.176 midres: Avg -2.149; BDRate -2.001 hdres : Avg -0.567; BDRate -0.588 Change-Id: I4c06ff51cc20194bffbd4d2346e57ba3dcf6b62c	2016-05-24 13:55:57 -07:00
Yi Luo	28cdee448d	HBD inverse HT 8x8 and 16x16 sse4.1 optimization - Covers tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST. - Encoding speed improves ~27% on crowd_run_1080p_12. - Merge 4x4, 8x8, 16x16 unit tests in one test file. Change-Id: I058ef5254d068a9523a826480c78ebbdd231824c	2016-05-24 12:55:30 -07:00
hui su	4a741a5d5c	Skip unnecessary calculations in ext-intra Around 5% speedup. Change-Id: I1c552e4e58fbf5637c0b5a97dd2cc4f83a1ca201	2016-05-23 17:24:19 -07:00
Zoe Liu	a63147ae77	Fix --test-decode=warn to test mismatch This patch always compares the most recent show frames between the encoder and the decoder to test the mismatch. Change-Id: I68a91ad0996a598231450debfd616e24992419b5	2016-05-23 17:01:53 -07:00
Debargha Mukherjee	fa5022978d	Merge "Wedge refactoring to handle signs better" into nextgenv2	2016-05-20 23:19:39 +00:00
Debargha Mukherjee	e5de2ad632	Wedge refactoring to handle signs better Mostly refactoring. Handles signs better though results are more or less neutral. Change-Id: If499537c8f8da4f34d104ebfda072eb4c85fb12f	2016-05-20 14:12:52 -07:00
Yaowu Xu	a9fc1cc257	Fix a build issue When both obmc and dual_filter is enabled. Change-Id: I56b127573a6cca31469bb357cf7a6a9c3df64071	2016-05-19 14:24:41 -07:00
Yue Chen	a33e3d12cb	Merge "Fix obmc + ext-interp interference" into nextgenv2	2016-05-19 18:08:52 +00:00
Jingning Han	936ed0804d	Merge "Account sub8x8 block reference filter type for prob context" into nextgenv2	2016-05-19 15:04:31 +00:00
Geza Lore	009bd1153e	Fix obmc + ext-interp interference With ext-interp, a switchable interpolation filter is coded iff the motion vector uses fractional pixel movement (ie, true subpixel movement). With ext-interp and obmc enabled at the same time, the RD search proceeds as: 1. Do motion search 2. Do interpolation filter search iff subpixel motion, otherwise use EIGHTTAP_REGULAR 3. Evaluate obmc=0 4. Evaluate obmc=1 - This involves another motion search If the motion search in step 4 yields an integer motion vector, while the search in step 1 did not, then an interp_filter value other than EIGHTTAP_REGULAR is invalid, and will cause an assertion failure at output time, or a mismatch if not using --enable-debug. The fix sets the interp_filter to EIGHTTAP_REGULAR if obmc=1 is picked with an integer motion vector. Change-Id: I4685d1ad537f41d833dc9eb64845956b67886cca	2016-05-19 11:30:07 +01:00
Zoe Liu	011f020447	Refactor on getting upsampled reference frame Reused a function that has been used in getting the normal reference frames. Change-Id: Ic4f7dac5c396d689a72699ab79fd580747f8bd65	2016-05-18 16:00:23 -07:00
Jingning Han	9161464f6c	Account sub8x8 block reference filter type for prob context If a reference block is coded with sub8x8 block size, and if it has sub-pixel level motion vectors, its prediction filter type should be used as context information. The coding performance gains of dual filter type coding scheme are lowres 0.57% hdres 0.88% Change-Id: I68b98f2518d02f11c29d0256aeb45b2580fe5cac	2016-05-18 12:35:31 -07:00
Angie Chiang	6f28581b26	Turn on flip in inverse txfm2d Fix build failed Reduce txfm test time Change-Id: Ieaf6b27f3a272d06286f817f01230413fa8adcf6	2016-05-18 11:26:57 -07:00
Yi Luo	18ecb16c30	Merge "Integrate HBD row/column flip fwd txfm SSE4.1 optimization" into nextgenv2	2016-05-18 17:45:45 +00:00
Debargha Mukherjee	f1ddf6eb04	Merge "Reducing computation of interintra modes" into nextgenv2	2016-05-18 17:21:15 +00:00
Yi Luo	1d307368a9	Integrate HBD row/column flip fwd txfm SSE4.1 optimization - Integrate 5 flip transform types for each 4x4, 8x8, and 16x16 block, for experiment, EXT_TX. - Encoder speed improves about 12%-15%. - Update the unit tests for bit-exact result against C. Change-Id: Idf27c87f1e516ca5b66c7b70142477a115404ccb	2016-05-18 03:48:01 +00:00
Debargha Mukherjee	049dbe7786	Reducing computation of interintra modes Use model for interintra mode search. Speed-up about 5-10% with about 0.04 drop in efficiency. lowres: -2.60% Change-Id: I825bf0ba8a46eb7f19fc528c25b8df066fb8ea95	2016-05-17 07:28:06 -07:00
James Zern	a81a75184c	Merge "vp10/rdopt,rd_pick_intra4x4block: port tsan fix from vp9" into nextgenv2	2016-05-17 03:04:00 +00:00
James Zern	8eba4ac46e	vp10/rdopt,rd_pick_intra4x4block: port tsan fix from vp9 minus the non-existent nonrd portion. original change: commit d642294b1c57a5adacb1038ff45766c38bae8a6d Author: Jingning Han <jingning@google.com> Date: Thu Feb 11 12:36:49 2016 -0800 Fix tsan error in VP9 sub8x8 intra mode search This commit fixes issue 1141. The issue was triggered in multi-tile encoding. The change properly saves and restores the block context information in the real-time mode selection process. It removes several redundant memcpy operations in sub8x8 intra block mode search. Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010 Change-Id: Idfa38c54c9e645479f6870d46f71fb1e91c071da	2016-05-16 17:20:29 -07:00
Jingning Han	4677e1a718	Unify the per directional filter type system for compound modes For the current stage, we assume a single prediction filter type per direction in the settings of compound inter prediction modes. Change-Id: I12a1afdd364b93fcee870bd11ad01fc40ab48cff	2016-05-16 14:41:08 -07:00
Jingning Han	d567e14e81	Enable per motion component filter type selection Change-Id: I73823fc94f296d225dece7156de71b30bae3fcb7	2016-05-16 14:38:43 -07:00
Debargha Mukherjee	fb8ea1736b	Various wedge enhancements Increases number of wedges for smaller block and removes wedge coding mode for blocks larger than 32x32. Also adds various other enhancements for subsequent experimentation, including adding provision for multiple smoothing functions (though one is used currently), adds a speed feature that decides the sign for interinter wedges using a fast mechanism, and refactors wedge representations. lowres: -2.651% BDRATE Most of the gain is due to increase in codebook size for 8x8 - 16x16. Change-Id: I50669f558c8d0d45e5a6f70aca4385a185b58b5b	2016-05-16 12:41:47 -07:00
Angie Chiang	1e587ae616	Merge "Add flip option for vp10_fwd_txfm2d_#x#_c" into nextgenv2	2016-05-12 18:08:28 +00:00
Alex Converse	ccf4f47b99	Merge changes I412c24aa,I28a8bbf0 * changes: mcomp: Remove an obsolete undef. mcomp: Remove an obsolete comment.	2016-05-11 20:03:21 +00:00
Geza Lore	c1b739014f	Cost wedge sign/index properly in rdopt. Lowres improves by about 0.1% lowres: -2.164 BDRATE Change-Id: I393bbb92700bfbb8763ace424f4edc2d672a74b4	2016-05-11 11:59:10 -07:00
Yaowu Xu	a45596cff7	Merge "Added a measure of rc drift."	2016-05-11 18:02:00 +00:00
Yue Chen	372e12b959	Merge "Add single motion search for OBMC predictor" into nextgenv2	2016-05-11 17:20:32 +00:00
Paul Wilkins	5fd142e763	Merge "Fixed 8K two pass encoder crash."	2016-05-11 16:25:25 +00:00
paulwilkins	45df87ca57	Added a measure of rc drift. Added actual and absolute rate miss values to the opsnr.stt stats output line. Changes to the borg graphing may be needed before merge. Change-Id: I1e9d548ce445d29002f0c59ebfd3957a6f15e702	2016-05-11 15:15:07 +01:00
paulwilkins	65732c36a8	Fixed 8K two pass encoder crash. Bug found by Yunqing relating to the correction for size at 8K and above in get_twopass_worst_quality(). The basis for the correction was changed to the linear size relative to 1080P as a baseline and the adjustment has been clamped to prevent problems at extreme images sizes. For 1080P the results on our test sets were neutral but the low res and mid res sets saw a small gain (0.1%-0.2% average). I would also expect some gains on 4k and larger content where the previous correction was overly aggressive. Change-Id: I30b026b5f4535e9601e3178d738066459d19c8fb	2016-05-11 14:45:50 +01:00
Yue Chen	370f203a40	Add single motion search for OBMC predictor Weighted single motion search is implemented for obmc predictor. When NEWMV mode is used, to determine the MV for the current block, we run weighted motion search to compare the weighted prediction with (source - weighted prediction using neighbors' MVs), in which the distortion is the actual prediction error of obmc prediction. Coding gain: 0.404/0.425/0.366 for lowres/midres/hdres Speed impact: +14% encoding time (obmc w/o mv search 13%-> obmc w/ mv search 27%) Change-Id: Id7ad3fc6ba295b23d9c53c8a16a4ac1677ad835c	2016-05-10 18:27:45 -07:00
Angie Chiang	1954fa390f	Add flip option for vp10_fwd_txfm2d_#x#_c Will add unit test to test/vp10_fwd_txfm2d_test.cc later Change-Id: I626900c67fca4eee2ad0ae1828188527a04a5362	2016-05-10 18:14:57 -07:00
Alex Converse	6dd5ec7efb	mcomp: Remove an obsolete undef. The macro was removed in 6724676. Change-Id: I412c24aac49bd1ff60a331a30933e0d8ae3f2dd5	2016-05-10 18:04:24 -07:00
Alex Converse	7764f8af3e	mcomp: Remove an obsolete comment. This was copied over from VP8. VP9 doesn't seem to do this buffer copy. Change-Id: I28a8bbf0503a7f99b2cb60620ab3674adde863bb	2016-05-10 18:04:24 -07:00
Yaowu Xu	dc73c3332e	Merge "Move count buffers from stack to heap" into nextgenv2	2016-05-10 23:58:59 +00:00
Jingning Han	005564813d	Merge "Remove unused highbd_fdct32x32 function" into nextgenv2	2016-05-10 23:16:41 +00:00
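
Note on commit ab29978e9f (contiguous wedge masks): the message argues that when every input block is contiguous in memory (stride == block width), a 2D block operation can be replaced by a 1D one with a single loop. The following is a minimal sketch of that idea only, not code from the patch; the function names `masked_sum_2d` and `masked_sum_1d` and the choice of a masked sum as the operation are hypothetical, picked purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2D form (hypothetical helper): source and mask each carry their own
 * stride, so the block is walked row by row with two nested loops. */
static uint64_t masked_sum_2d(const uint8_t *src, int src_stride,
                              const uint8_t *mask, int mask_stride,
                              int width, int height) {
  uint64_t sum = 0;
  assert(src_stride >= width && mask_stride >= width);
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      sum += (uint64_t)src[r * src_stride + c] * mask[r * mask_stride + c];
    }
  }
  return sum;
}

/* 1D form (hypothetical helper): valid only when both inputs are
 * contiguous, i.e. stride == width, so the block is one flat run of
 * n = width * height samples and a single loop suffices. */
static uint64_t masked_sum_1d(const uint8_t *src, const uint8_t *mask,
                              size_t n) {
  uint64_t sum = 0;
  for (size_t i = 0; i < n; ++i) {
    sum += (uint64_t)src[i] * mask[i];
  }
  return sum;
}
```

For a 4x2 block stored with stride 4, `masked_sum_2d(src, 4, mask, 4, 4, 2)` and `masked_sum_1d(src, mask, 8)` visit the same samples in the same order, which is the equivalence the commit message relies on.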

1015 Commits