generic-library/vpx

Author	SHA1	Message	Date
Debargha Mukherjee	d43544137b	Rename NEAR_FORNEW to NEW2 Change-Id: I2928b0d28dcbf9c6b705d3ebf20550aeec9b99b3	2015-05-20 17:31:20 -07:00
Zoe Liu	6437c3cb6d	Combined two experiments of NEWMVREF and COMPOUND_MODES to NEW_INTER Runborgs results on derflr show consistent results between NEW_INTER and the previous combination of NEWMVREF and COMPOUND_MODES. Change-Id: Ieba239c4faa7f93bc5c05ad656a7a3b818b4fbfc	2015-05-19 14:04:22 -07:00
Debargha Mukherjee	fb093a337f	Global motion enhancements Adds warping functions. Also includes some refactoring. Change-Id: I909830650f29046edf108ddaddceb1a5e7c6c61c	2015-05-14 16:33:01 -07:00
Jingning Han	7a2f9bbda4	Add row tile coding support in bit-stream Fix the row tile boundary detection issues. This allows more resources to be used for parallel encoding/decoding when available. Change-Id: Ifda9f66d1d7c2567dd4e0a572a99a83f179b55f9	2015-05-11 12:30:03 -07:00
Zoe Liu	9e0466d0fd	Cleaned mv search code and added a few fixes on the experiments Besides code cleaning, this patch contains 4 fixes: (1) Fixed the COMPOUND_MODES for the NEW_NEWMV mode; (2) Fixed the joint search when the NEAR_FORNEWMV mode (in NEWMVREF) is being evaluated; (3) Fixed the WEDGE_PARTITION when the NEAR_FORNEWMV mode (in NEWMVREF) is being evaluated; (4) Adjusted the entropy probability value for NEAR_FORNEW mode. On derflr turning on all 14 experiments (except for global-motion), the average gain w.r.t. PSNR is +0.07%: Maximum on bridge_far_cif: +1.02% Minimum on hallmonitor_cif: -0.16% Change-Id: I4c9c6ee24a981af7e655a629580641d9f9745f91	2015-05-10 23:38:44 -07:00
hui su	bada9f0b87	Merge "Optimize entropy coding of non-transform tokens" into nextgen	2015-05-08 18:18:49 +00:00
hui su	00c793ee5f	Optimize entropy coding of non-transform tokens Use separate token probabilities and counters for non-transform blocks (pixel domain) . Initial probabilities are trained with screen_content clips. On screen_content, it improves coding performance by about 2% (from +16.4% to +18.45%). The initial probabilities are not optimized for natural videos. So it should not be used for natural videos. Set FOR_SCREEN_CONTENT as 0/1 to specify whether or not to enable this patch. Change-Id: Ifa361c94bb62aa4b783cbfa50de08c3fecae0984	2015-05-07 07:58:19 -07:00
Debargha Mukherjee	e6889b28e9	Merge "Fix a bug in copy_mode experiment" into nextgen	2015-05-07 05:10:01 +00:00
Debargha Mukherjee	5e7bc81128	Merge "Global motion continued" into nextgen	2015-05-07 05:09:34 +00:00
Yaowu Xu	d1f04fb5b2	Fix a bug in copy_mode experiment Change-Id: I1cf7d51ba99e5b6f5cf7e0d1a5d86ce4f19046e5	2015-05-06 17:03:32 -07:00
Peter de Rivaz	d6153aa447	Added highbitdepth sse2 acceleration for quantize and block error This is a partial cherry-pick of db7192e Change-Id: Idef18f90b111a0d0c9546543d3347e551908fd78	2015-05-06 15:14:01 -07:00
Debargha Mukherjee	caae13d54f	Global motion continued Implements a first version of global motion where the existing ZEROMV mode is converted to a translation only global motion mode. A lot of the code for supporting a rotation-zoom affine model is also incorporated. WIP. Change-Id: Ia1288a8dfe82f89484d4e291780288388e56d91b	2015-05-06 14:59:38 -07:00
Peter de Rivaz	16add99f0d	Corrected optimization of 8x8 DCT code The 8x8 DCT uses a fast version whenever possible. There was a mistake in the checking code which meant sometimes the fast version was used when it was not safe to do so. Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7	2015-05-06 10:10:19 -07:00
Peter de Rivaz	ecf677ede6	Fixed idct16x16_10 highbitdepth transform In the case when there are only non-zero coefficients in the first 4x4 block a special routine is called. The highbitdepth optimized version of this routine examined the wrong positions when deciding whether to call an assembler or C inverse transform. Change-Id: I62da663ca11775dadb66e402e42f4a1cb1927893	2015-05-06 10:10:18 -07:00
Deb Mukherjee	963393321c	Iadst transforms to use internal low precision Change-Id: I266777d40c300bc53b45b205144520b85b0d6e58	2015-05-06 10:10:18 -07:00
Peter de Rivaz	2dad1a7c8e	Added high bitdepth sse2 transform functions Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282	2015-05-06 10:10:18 -07:00
Peter de Rivaz	2189a51891	Added sse2 acceleration for highbitdepth variance This is a combination of: 4a19fa6 Added sse2 acceleration for highbitdepth variance c6f5d3b Fix high bit depth assembly function bugs Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f	2015-05-06 10:04:08 -07:00
Peter de Rivaz	41973e0e3e	Refactored idct routines and headers This change is made in preparation for a subsequent patch which adds acceleration for the highbitdepth transform functions. The highbitdepth transform functions attempt to use 16/32bit sse instructions where possible, but fallback to using the C implementations if potential overflow is detected. For this reason the dct routines are made global so they can be called from the acceleration functions in the subsequent patch. Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665	2015-05-06 09:59:20 -07:00
Peter de Rivaz	0e82cba628	Added highbitdepth sse2 SAD acceleration and tests Change-Id: I9f09e404e3136951e5cc15bf40b915c1fe10b620	2015-05-06 09:00:53 -07:00
Zoe Liu	9b083e8271	Changed nearmv for one of the sub8x8 partitions It is a minor change, but the essential idea is to use the mv of the top right block as the nearmv for the bottom left partition in the sub8x8 block. The change is under the experiment of NEWMVREF. When all 13 experiments are on (except for INTRABC), the gain is +0.05%: Worse on bowing_cif: -0.17% Best on foreman_cif: +0.42%; and bridge_far_cif: +0.40% The total 13 experiments achieved a gain of +6.97% against base. Change-Id: I3a51d9e28b34b0943fe16a984d62bfb38304ebca	2015-04-30 22:59:32 -07:00
Alex Converse	8d9c600d44	Merge "palette: Add missing consts" into nextgen	2015-04-28 19:51:16 +00:00
Alex Converse	7ae0b65f32	palette: Add missing consts Change-Id: I83a2e57dc5dbc328c7bfea421ffbaeb83b7ca3bd	2015-04-28 11:35:17 -07:00
hui su	1f7b49f7cd	Use uniform quantization settings for non-transform blocks Do not treat first element (dc) differently. on screen_content tx-skip only: +16.4% (was +15.45%) no significant impact on natural videos Change-Id: I79415a9e948ebbb4a69109311c10126d8a0b96ab	2015-04-28 07:54:16 -07:00
Alex Converse	98d4f09a7a	Replace vp9_get_bit_depth with vp9_ceil_log2. The current name is confusing with regard to high bit depth buffers. Change-Id: Ieacd55ec22c81bd2f013f2e3d73a095affc93689	2015-04-23 10:26:57 -07:00
Debargha Mukherjee	96213ac5e7	Merge "Some minor improvements in bilateral filter expt." into nextgen	2015-04-23 01:14:45 +00:00
Debargha Mukherjee	425a45a45c	Some minor improvements in bilateral filter expt. Changes include: * Uses double for RD cost computation to guard against overflow for large resolution frames. * Use previous frame's filter level to code the level better. * Change precision of the filter parameters. * Allow spatial variance for x and y to be different Change-Id: I1669f65eb0ab1e8519962954c92d59e04f1277b7 derflr: +0.556% (a little up from before)	2015-04-22 18:09:42 -07:00
hui su	9e0750c2d2	Modify scan order for non-transform coding blocks Use raster scan order for non-transform blocks +15.45% (+2.1%) on screen_content no significant change on natural videos Change-Id: I0e264cb69e8624540639302d131f7de9c31c3ba7	2015-04-21 14:23:52 -07:00
hui su	ebd3666940	Merge "Add high bit depth support for tx-skip expt" into nextgen	2015-04-17 11:37:39 -07:00
Debargha Mukherjee	fb001c2e2f	Merge "Simplify bilateral filter search for speed" into nextgen	2015-04-16 18:58:03 -07:00
Debargha Mukherjee	017baf9f4b	Simplify bilateral filter search for speed Adds an internal buffer in the encoder to store the deblocked result to help speed up the search for the best bilateral filter. Very small change in performance but a lot faster: derflr: +0.518% Change-Id: I5d37e016088e559c16317789cfb1c2f49334b2b9	2015-04-16 15:33:34 -07:00
hui su	8c00c7a9cd	Fix palette expt asan failure Account for 422 video format. Change-Id: Ic5af661720fc5fa7142210d907dd25e1e79ff653	2015-04-16 15:08:06 -07:00
hui su	b69152db79	Add high bit depth support for tx-skip expt +0.3% on 10-bit +0.3% on 12-bit With other high bit compatible experiments on 12-bit +12.44% (+0.17) over 8-bit baseline Change-Id: I40b4c382fa54ba4640d08d9d01950ea8c1200bc9	2015-04-16 14:54:39 -07:00
Debargha Mukherjee	343c092e2e	High bit-depth support for wedge partition expt Change-Id: Idbd27e66d4f4a7953f888137d5752856215a6760	2015-04-13 09:28:15 -07:00
Debargha Mukherjee	8fa0b12cf7	Merge "An experiment introducing a bilateral loop filter" into nextgen	2015-04-10 16:46:16 -07:00
Debargha Mukherjee	fe4b6ac652	An experiment introducing a bilateral loop filter Adds a framework to incorporate a parameterized loop postfilter in the coding loop after the application of the standard deblocking loop filter. The first version uses a straight bilateral filter where the parameters conveyed are just spatial and intensity gaussian variances. Results on derflr: +0.523% (only with this experiment) +6.714% (with all expts other than intrabc) Change-Id: I20d47285b4d25b8c6386ff8af2a75ff88ac2b69b	2015-04-10 16:05:00 -07:00
hui su	bfc27bb614	tx-skip experiment: improve entropy coding of coeff tokens This patch allows the prediction residues of tx-skipped blocks to use probs that are different from regular transform coefficients for token entropy coding. Prediction residues are assumed as in band 6. The initial value of probs is obtained with stats from limited tests. The statistic model for constrained token nodes has not been optimized. The probs for token extra bits have not been optimized. These can be future work. Certain coding improvement is observed: derflr with all experiments: +6.26% (+0.10%) screen_content with palette: +22.48% (+1.28%) Change-Id: I1c0d78178ee9f3655febb6f30cdaef8ee9f8e3cc	2015-04-10 11:33:42 -07:00
Alex Converse	16e5e713fa	Add an intra block copy mode (NEWDV). Change-Id: I82b261c54ac9db33706bb057613dcbe66fc71387	2015-04-03 11:59:57 -07:00
Zoe Liu	2ae3d4f266	Add a new PREDICTION mode using NEARMV as ref mv This experiment, referred as NEWMVREF, also merged with NEWMVREF_SUB8X8 and the latter one has been removed. Runborgs results show that: (1) Turning on this experiment only, compared against the base: derflf: Average PSNR 0.40%; Overall PSNR 0.40%; SSIM 0.35% (2) Turning on all the experiments including this feature, compared against that without this feature, on the highbitdepth case using 12-bit: derflf: Average PSNR 0.33%; Overall PSNR 0.32%; SSIM 0.30%. Now for highbitdepth using 12-bit, compared against base: derflf: Average PSNR 11.12%; Overall PSNR 11.07%; SSIM 20.27%. Change-Id: Ie61dbfd5a19b8652920d2c602201a25a018a87a6	2015-04-02 14:37:22 -07:00
hui su	9eada94a3e	palette experiment: remove run-length coding Change-Id: I1e52475d0179cf019841d09a53b3b7fc53c79336	2015-03-31 11:09:30 -07:00
hui su	6ad18db24f	Palette experiment: encode color indices based on context The basic idea is to use a pixel’s neighboring colors as context to predict its own color. Up to 4 neighbors are considered here: left, left-above, above, right-above. To reduce the number of contexts, any combination of 4 (or fewer) colors is mapped to one of a reduced number of patterns. For example, 1111, 2222, 3333, … , can be mapped to the same pattern: AAAA. Similarly, 1122, 1133, 2233, …, can be mapped to the pattern AABB. In this way, the total number of color contexts is reduced to 16. This almost doubles the gain of palette coding on screen content videos. on screen_content --enable-palette +14.2% --enable-palette --enable-tx-skip +21.2% on derflr --enable-palette +0.12% with all other experiments +6.16% Change-Id: I560306dae216f2ac11a9214968c2ad2319fa1718	2015-03-26 15:48:08 -07:00
hui su	e18b104462	Palette experiment: adaptively update probs Also make changes to transmit palette-enabled flag using neighbor blocks as context. on screen_content --enable-palette +7.35% on derflr with all other experiments +6.05% Change-Id: Id6c2f726d21913d54a3f86ecfea474a4044c27f6	2015-03-25 09:12:57 -07:00
hui su	070d635657	Add palette coding mode for inter frames on screen_content --enable-palette +6.74% on derflr with all other experiments +6.02% (--enable-supertx --enable-copy-mode --enable-ext-tx --enable-filterintra --enable-tx64x64 --enable-tx-skip --enable-interintra --enable-wedge-partition --enable-compound-modes --enable-new-quant --enable-palette) Change-Id: Ib85049b4c3fcf52bf95efbc9d6aecf53d53ca1a3	2015-03-23 08:41:51 -07:00
Deb Mukherjee	c082df2359	Make interintra experiment work with highbitdepth Also includes some adjustments to the algorithm. All stats look good. Change-Id: I824ef8ecf25b34f3feb358623d14fe375c3e4eb7	2015-03-21 07:35:40 -07:00
Deb Mukherjee	c8ed36432e	Non-uniform quantization experiment This framework allows lower quantization bins to be shrunk down or expanded to more closely match the source distribution (assuming a generalized gaussian-like central peaky model for the coefficients) in an entropy-constrained sense. Specifically, the width of the bins 0-4 are modified as a factor of the nominal quantization step size and from 5 onwards all bins become the same as the nominal quantization step size. Further, different bin width profiles as well as reconstruction values can be used based on the coefficient band as well as the quantization step size divided into 5 ranges. A small gain currently on derflr of about 0.16% is observed with the same parameters for all q values. Optimizing the parameters based on qstep value is left as a TODO for now. Results on derflr with all expts on is +6.08% (up from 5.88%). Experiments are in progress to tune the parameters for different coefficient bands and quantization step ranges. Change-Id: I88429d8cb0777021bfbb689ef69b764eafb3a1de	2015-03-17 21:42:55 -07:00
Alex Converse	9a92891ac4	interintra: wedge: Get the correct wedge params. Fixes an asan issue. Change-Id: I671ffc382c77c2b38673e0b148f54e7bce2ce9c2	2015-03-17 10:49:22 -07:00
Alex Converse	7ca745a2df	palette: Fix an illegal read Change-Id: I71649f0a85d98b96efd08c8a9e3ee7372fd7d327	2015-03-16 17:13:15 -07:00
Deb Mukherjee	961fe77e70	Merge "Misc changes to support high-bitdepth with supertx" into nextgen	2015-03-12 17:42:20 -07:00
Deb Mukherjee	35d38646ec	Misc changes to support high-bitdepth with supertx Change-Id: I0331646d1c55deb6e4631e64bd6b092fb892a43e	2015-03-12 16:52:25 -07:00
hui su	7621c779e5	Add palette coding mode for UV For 444 videos, a single palette of 3-d colors is generated for YUV. For 420 videos, there may be two palettes, one for Y, and the other for UV. Also fixed a bug when palette and tx-skip are both on. on derflr --enable-palette +0.00% with all experiments +5.87% (was +5.93%) on screen_content --enable-palette +6.00% --enable-palette --enable-tx_skip +15.3% on screen_content 444 version --enable-palette +6.76% --enable-palette --enable-tx_skip +19.5% Change-Id: I7287090aecc90eebcd4335d132a8c2c3895dfdd4	2015-03-10 13:38:19 -07:00
Deb Mukherjee	78bcc48756	Make filterintra experiment work with highbitdepth All stats look fine. derflr: +0.912 with respect to 10-bit internal baseline (Was +0.747% w.r.t. 8 bit) +5.545 with respect to 8-bit baseline Change-Id: I3c14fd17718a640ea2f6bd39534e0b5cbe04fb66	2015-03-10 07:59:59 -07:00
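
Note on commit 6ad18db24f (palette color contexts): the message describes mapping neighbor color combinations such as 1111 or 2222 to a shared pattern like AAAA. The snippet below is a minimal sketch of that idea only, not the code from the commit. The function name `color_pattern` and the first-occurrence relabeling rule are assumptions made for illustration. The commit reports 16 contexts, so its actual mapping evidently differs from this plain relabeling, which yields 15 distinct patterns for four neighbors.

```python
from itertools import product


def color_pattern(neighbors):
    """Relabel neighbor colors by order of first appearance.

    Hypothetical helper: (1, 1, 2, 2) and (2, 2, 3, 3) both become "AABB",
    so blocks with the same color layout share one context.
    """
    labels = {}  # color value -> pattern letter
    pattern = []
    for color in neighbors:
        if color not in labels:
            # Assign the next unused letter to a color seen for the first time.
            labels[color] = chr(ord("A") + len(labels))
        pattern.append(labels[color])
    return "".join(pattern)


def count_patterns(num_neighbors=4, num_colors=4):
    """Count distinct patterns over all neighbor color combinations."""
    combos = product(range(num_colors), repeat=num_neighbors)
    return len({color_pattern(c) for c in combos})
```

With four neighbors (left, left-above, above, right-above, as listed in the commit message), `color_pattern((3, 3, 3, 3))` gives the same pattern as `color_pattern((1, 1, 1, 1))`, which is the reduction in context count that the message describes.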


2722 Commits