generic-library/vpx

Author	SHA1	Message	Date
Ronald S. Bultje	d00b8e5f82	Inline vp9_get_coef_context() (and remove vp9_ prefix). Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896	2013-06-28 10:40:21 -07:00
Ronald S. Bultje	91d223bd5c	Some minor optimizations for cost_coeffs(). Cycle timings for first 3 frames of bus (speed 0) at 1500kbps: 4x4: 298 -> 234 cycles 8x8: 1227 -> 878 cycles 16x16: 4906 -> 3664 cycles 32x32: 23426 -> 18134 cycles Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster. Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95	2013-06-28 10:29:02 -07:00
Jingning Han	fc1cfd8e32	Merge "Make intra predictor reference buffer configurable"	2013-06-26 19:02:02 -07:00
Jingning Han	861cb06c67	Make intra predictor reference buffer configurable This commit enables configurable reference buffer pointer for intra predictor. This allows later removal of spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1	2013-06-26 17:17:21 -07:00
Ronald S. Bultje	b5468155b6	Remove unused macro RDTRUNC_8x8 from encodemb.c. Change-Id: I0c097567adab24215d807963ccb34810a2afe007	2013-06-26 15:34:01 -07:00
Dmitry Kovalev	87ee34aacb	Removing unused code. Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1	2013-06-25 10:17:19 -07:00
Ronald S. Bultje	25c588b1e4	Add subtract_block SSE2 version and unit test. 3% faster overall (3min35.0 to 3min28.5). Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e	2013-06-21 09:35:37 -07:00
Jingning Han	7088426976	Merge "Make fdct32 computation flow within 16bit range"	2013-06-18 11:40:14 -07:00
Jingning Han	a41a4860c0	Make fdct32 computation flow within 16bit range This commit makes use of dual fdct32x32 versions for rate-distortion optimization loop and encoding process, respectively. The one for rd loop requires only 16 bits precision for intermediate steps. The original fdct32x32 that allows higher intermediate precision (18 bits) was retained for the encoding process only. This allows speed-up for fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3	2013-06-18 09:46:24 -07:00
Dmitry Kovalev	686b99741c	Removing vp9_invtrans.{c, h} files. Moving single function from vp9_invtrans.c to vp9_encodemb.c. Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7	2013-06-17 16:09:03 -07:00
John Koleszar	717d744a01	Fix use of get_uv_tx_size in loopfilter Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98	2013-06-10 11:40:57 -07:00
Ronald S. Bultje	6ef805eb9d	Change ref frame coding. Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95	2013-06-06 17:28:09 -07:00
Jim Bankoski	5a88271b09	don't tokenize & encode tokens for blocks in UMV This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9	2013-06-06 06:10:25 -07:00
Dmitry Kovalev	317d832d38	Merge "Adding plane_block_width and plane_block_height functions." into experimental	2013-05-31 15:28:45 -07:00
Deb Mukherjee	0048ec2329	Costing fixes related to trellis optimization Migrates costing changes/fixes from the rebalance expt to the head without the expt on. Rebased. Change-Id: I51677d62f77ed08aca8d21a4c9a13103eb8de93f Results: derfraw300: +0.126%	2013-05-31 13:56:32 -07:00
Dmitry Kovalev	120a878199	Adding plane_block_width and plane_block_height functions. Change-Id: I02c17fb733c0f3c22dc3167c3d3182797415f1ae	2013-05-31 12:31:49 -07:00
Deb Mukherjee	b8b3f1a46d	Balancing coef-tree to reduce bool decodes This patch changes the coefficient tree to move the EOB to below the ZERO node in order to save number of bool decodes. The advantages of moving EOB one step down as opposed to two steps down in the other parallel patch are: 1. The coef modeling based on the One-node becomes independent of the tree structure above it, and 2. Fewer context/counter increases are needed. The drawback is that the potential savings in bool decodes will be less, but assuming that 0s are much more predominant than 1s the potential savings is still likely to be substantial. Results on derf300: -0.237% Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242	2013-05-29 16:25:52 -07:00
Jingning Han	6c97bba403	Merge "further clean-ups on intra4x4 coding" into experimental	2013-05-29 10:55:14 -07:00
Sami Pietila	88a4d4c510	Residual coding to cache energy class of tokens. Proposal for tuning the residual coding by changing how the context from previous tokens is calculated. Storing the energy class of previous tokens instead of the token itself eases the critical path of HW implementations. Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab	2013-05-29 15:21:01 +01:00
Jingning Han	4729a6f389	further clean-ups on intra4x4 coding Removed one 4x4 prediction step that was unnecessary in the rd loop. Removed an unused modecosts estimate from encoder side. Change-Id: I65221a52719d6876492996955ef04142d2752d86	2013-05-28 11:19:05 -07:00
Yaowu Xu	2b96ffe025	a few clean-ups 1. remove prediction mode conversion 2. unified bmode, same for key and non-key frame 3. set I4X4_PRED count for pdf to 0, as I4X4_PRED is no longer coded ever. It is determined by ref_frame and block partition Change-Id: If5b282957c24339b241acdb9f2afef85658fe47d	2013-05-27 13:53:56 -07:00
Paul Wilkins	33ecd6ad54	Merge Scatter Scan experiment. Removal from under configure flag. A bit of renaming. Change-Id: I2213229dfe852001dfec16b149f47c52ce88f3aa	2013-05-23 13:09:27 +01:00
Yaowu Xu	8ba92a0bed	changes intra coding to be based on txfm block This commit changed the encoding and decoding of intra blocks to be based on transform block. In each prediction block, the intra coding iterates through each transform block based on raster scan order. This commit also fixed a bug in D135 prediction code. TODO next: The RD mode/txfm_size selection should take this into account when computing RD values. Change-Id: I6d1be2faa4c4948a52e830b6a9a84a6b2b6850f6	2013-05-22 11:53:19 +01:00
Yaowu Xu	232d90d8fd	Generalized intra 4x4 encoding for all sizes Change-Id: I1b86744fa247233c8df031b3f4b87b212c8dd094	2013-05-22 11:44:12 +01:00
John Koleszar	ddf13be8ef	Merge "Initial version of alpha channel support" into experimental	2013-05-21 17:29:51 -07:00
Scott LaVarnway	ba48a11130	WIP: 4x4 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22	2013-05-20 13:03:17 -04:00
John Koleszar	679e4abdd5	Initial version of alpha channel support This is a mostly-working implementation of an extra channel in the bitstream. Configure with --enable-alpha to test. Notable TODOs: - Add extra channel to all mismatch tests, PSNR, SSIM, etc - Configurable subsampling - Variable number of planes (currently always uses all 4) - Loop filtering - Per-plane lossless quantizer - ARNR support This implementation just uses the same contents as the Y channel for the A channel, due to lack of content and general pain in playing back 4 channel content. A later patch will use the actual alpha channel passed in from outside the codec. Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592	2013-05-16 22:21:09 -07:00
Scott LaVarnway	794a7bedbd	WIP: 8x8 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45	2013-05-16 13:52:15 -04:00
Scott LaVarnway	a272ff25cd	WIP: 16x16 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1	2013-05-15 13:16:02 -04:00
Scott LaVarnway	2cf0d4be12	WIP: 32x32 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a	2013-05-14 15:54:17 -07:00
Paul Wilkins	e5f715201a	Change to band calculation. Change band calculation back to simpler model based on the order in which coefficients are coded in scan order not the absolute coefficient positions. With the scatter scan experiment enabled the results appear broadly neutral on derf (-0.028) but up a little on std-hd (+0.134). Without the scatterscan experiment on, the results were up on derf as well. Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791	2013-05-13 17:21:49 +01:00
John Koleszar	1fec23bef6	Use common get_uv_tx_size() Use a single method for calculating the transform size of non-luma planes. Change-Id: I16ebd10e7944d7b9075ab79d15e6a5b5f9bab775	2013-05-08 20:48:32 -07:00
Jingning Han	776c1482a3	Merge SB8X8 into the codebase Pull sb8x8 out of experimental list. verified via borg run tests. Fixed unit test failures. Change-Id: I12a4bbd17395930580c048ab68becad1ffe46e76	2013-05-07 09:08:25 -07:00
John Koleszar	4529c68b3b	Separate transform and quant from vp9_encode_sb This allows removing a large number of transform size specific functions, as well as supporting 444/alpha by routing all code through the subsampling-aware path. Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4	2013-05-03 12:14:50 -07:00
John Koleszar	3f4e80634b	Create common vp9_encode_sb{,y} Creates a common encode (subtract, transform, quantize, optimize, inverse transform, reconstruct) function for all sb sizes, including the old 16x16 path. Change-Id: I964dff1ea7a0a5c378046a069ad83495f54df007	2013-05-02 14:02:03 -07:00
John Koleszar	1f80a568d2	Make vp9_optimize_sb* common Unify the various vp9_optimize_sb functions into one that handles all transform sizes. Change-Id: I48b642fbfb3e72cc2e0bcf1d0317a80a80547882	2013-04-30 21:34:58 -07:00
Ronald S. Bultje	d068d869b9	sb8x8 integration in rd loop. Work-in-progress, not yet ready for review. TODO items: - bitstream writing (encoder) and reading (decoder) - decoder reconstruction Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081	2013-04-30 16:13:20 -07:00
Ronald S. Bultje	2dbaa4f4f4	Change above/left_context to use an 8x8 basis. Output changes slightly because of a minor bug in (at least) the sb32x16 block2above tx16x16 tables that previously existed in vp9_blockd.c. Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc	2013-04-29 10:37:25 -07:00
Ronald S. Bultje	1a46b30ebe	Grow MODE_INFO array to use an 8x8 basis. Change-Id: I087e08e7909a406b71715b8525c104208daa6889	2013-04-26 11:57:17 -07:00
Ronald S. Bultje	c849eaca59	Use b_width/height_log2 instead of mb_ where appropriate. Basic assumption: when talking about transform units, use b_; when talking about macroblock indices, use mb_. Change-Id: Ifd163f595d4924ff892de4eb0401ccd56dc81884	2013-04-25 14:20:59 -07:00
John Koleszar	15255eef82	Move dequant from BLOCKD to per-plane MACROBLOCKD This data can vary per-plane, but not per-block. Change-Id: I1971b0b2c2e697d2118e38b54ef446e52f63c65a	2013-04-25 11:57:20 -07:00
John Koleszar	4bd0f4f646	Remove BLOCK structure All members can be referenced from their per-plane counterparts, and removes assumptions about 24 blocks per macroblock. Change-Id: I593fb0715e74cd84b48facd1c9b18c3ae1185d4b	2013-04-25 11:33:17 -07:00
John Koleszar	aa6a36b062	Merge "Convert coeff to per-plane MACROBLOCK data" into experimental	2013-04-23 17:41:59 -07:00
John Koleszar	138ec38cab	Convert coeff to per-plane MACROBLOCK data This commit moves the coeff storage from the MACROBLOCK struct to its per-plane part. The next commit will remove the coeff member from the BLOCK structure so that it is consistently accessed per-plane. Also refactors vp9_sb_block_error_c and vp9_sb_uv_block_error_c to be variable subsampling aware. Change-Id: I18c30f87f27c3a012119b6c1970d5fa499804455	2013-04-23 16:28:17 -07:00
John Koleszar	4f35e3e1c1	Merge "Move src_diff to per-plane MACROBLOCK data" into experimental	2013-04-23 16:24:08 -07:00
John Koleszar	cbd1315ac4	Move src_diff to per-plane MACROBLOCK data First in a series of commits making certain MACROBLOCK members addressable per-plane. This commit also refactors the block subtraction functions vp9_subtract_b, vp9_subtract_sby_c, etc to be loops-over-planes and variable subsampling aware. Change-Id: I371d092b914ae0a495dfd852ea1a3d2467be6ec3	2013-04-23 12:18:51 -07:00
Dmitry Kovalev	5de7e16ca2	Adding get_scan_{4x4, 8x8, 16x16} functions. Change-Id: Id4306ef6d65d4a3984aed50b775bdf48d4f6c438	2013-04-22 14:08:41 -07:00
Deb Mukherjee	f12509f640	Merge "Removes the code_nonzerocount experiment" into experimental	2013-04-22 11:53:14 -07:00
Deb Mukherjee	0aa79be7d5	Removes the code_nonzerocount experiment This patch does not seem to give any benefits. Change-Id: I9d2b4091d6af3dfc0875f24db86c01e2de57f8db	2013-04-22 10:58:49 -07:00
Deb Mukherjee	6ce718eb18	Merge "End of orientation zero group experiment" into experimental	2013-04-22 10:33:12 -07:00

161 Commits
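
Several commit messages above quote wall-clock encode times together with a percentage speedup (d00b8e5f82, 91d223bd5c, 25c588b1e4). The log does not say how those percentages were derived. The sketch below assumes they are computed as old time divided by new time, minus one. The helper names `to_seconds` and `speedup_percent` are hypothetical and are not part of the vpx code base.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a '2min51.6'-style timing, as written in the log, into seconds."""
    m = re.fullmatch(r"(\d+)min(\d+(?:\.\d+)?)", stamp)
    if m is None:
        raise ValueError(f"unrecognized timing: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def speedup_percent(before: str, after: str) -> float:
    """Percent speedup, assumed here to mean (old time / new time - 1) * 100."""
    return (to_seconds(before) / to_seconds(after) - 1.0) * 100.0


# Timings copied from the commit messages in the log above.
for sha, before, after in [
    ("d00b8e5f82", "2min51.6", "2min43.9"),
    ("91d223bd5c", "3min0.7", "2min51.6"),
    ("25c588b1e4", "3min35.0", "3min28.5"),
]:
    print(f"{sha}: {speedup_percent(before, after):.1f}% faster")
```

Under that assumption the three timing pairs come out at roughly 4.7%, 5.3% and 3.1%, in line with the 4.7%, 5.3% and 3% figures stated in the messages.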