generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	ba48a11130	WIP: 4x4 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22	2013-05-20 13:03:17 -04:00
John Koleszar	4529c68b3b	Separate transform and quant from vp9_encode_sb This allows removing a large number of transform size specific functions, as well as supporting 444/alpha by routing all code through the subsampling-aware path. Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4	2013-05-03 12:14:50 -07:00
Ronald S. Bultje	c849eaca59	Use b_width/height_log2 instead of mb_ where appropriate. Basic assumption: when talking about transform units, use b_; when talking about macroblock indices, use mb_. Change-Id: Ifd163f595d4924ff892de4eb0401ccd56dc81884	2013-04-25 14:20:59 -07:00
John Koleszar	17313c408f	Move diff to MACROBLOCKD per-plane data. Change-Id: Ic27af09e38af8317ac4743241883d577a44f1490	2013-04-19 11:11:54 -07:00
Ronald S. Bultje	13e41ba440	Remove unused macroblock versions of reconstruction functions. More specifically, remove vp9_quantize_mb, vp9_optimize_mb, vp9_inverse_transform_mb* and vp9_transform_mb. Instead, use the generic _sb functions that take a size argument, and call them with BLOCK_SIZE_MB16X16. Change-Id: I33024afea95d3a23ffbc1df7da426e4645110f29	2013-04-11 12:27:15 -07:00
Ronald S. Bultje	a3874850dd	Make SB coding size-independent. Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code gives identical encoder results before and after. There are a few macros for rectangular block sizes under the sbsegment experiment; this experiment is not yet functional and should not yet be used. Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728	2013-04-09 21:28:27 -07:00
Ronald S. Bultje	f42bee7edf	Don't use BLOCKD in vp9_invtrans.c. Change-Id: I40524170334109e2864b06e3c73c8b34e5aa8b0f	2013-04-08 11:37:29 -07:00
John Koleszar	05a79f2fbf	Move EOB to per-plane data Continue migrating data from BLOCKD/MACROBLOCKD to the per-plane structures. Change-Id: Ibbfa68d6da438d32dcbe8df68245ee28b0a2fa2c	2013-04-04 21:30:23 -07:00
John Koleszar	4c05a051ab	Move qcoeff, dqcoeff from BLOCKD to per-plane data Start grouping data per-plane, as part of refactoring to support additional planes, and chroma planes with other-than 4:2:0 subsampling. Change-Id: Idb76a0e23ab239180c818025bae1f36f1608bb23	2013-04-04 16:30:57 -07:00
Ronald S. Bultje	d3724abe9f	Re-add support for ADST in superblocks. This also changes the RD search to take account of the correct block index when searching (this is required for ADST positioning to work correctly in combination with tx_select). Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6	2013-03-07 11:19:10 -08:00
Ronald S. Bultje	111ca42133	Make superblocks independent of macroblock code and data. Split macroblock and superblock tokenization and detokenization functions and coefficient-related data structs so that the bitstream layout and related code of superblock coefficients looks less like it's a hack to fit macroblocks in superblocks. In addition, unify chroma transform size selection from luma transform size (i.e. always use the same size, as long as it fits the predictor); in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma transform will now use the 16x16 (instead of the 8x8) chroma transform, and 64x64 superblocks using the 32x32 luma transform will now use the 32x32 (instead of the 16x16) chroma transform. Lastly, add a trellis optimize function for 32x32 transform blocks. HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There's a few negative points here and there that I might want to analyze a little closer. Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430	2013-03-04 16:34:36 -08:00
Ronald S. Bultje	e8c74e2b70	Move eob from BLOCKD to MACROBLOCKD. Consistent with VP8. Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07	2013-02-27 11:00:55 -08:00
Dmitry Kovalev	9bf3f75168	Changing pitch value meaning for fht and iht transforms. Pitch now means the number of elements, not the number of bytes. Change-Id: Idb9f2f012e39b09d596a3cc1802305a80b7c13af	2013-02-25 18:19:55 -08:00
Jingning Han	77a3becf92	clean up forward and inverse hybrid transform Rebased. Remove the old matrix multiplication transform computation. The 16x16 ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16 300/0 in vp9/common/vp9_blockd.h. Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f	2013-02-25 09:16:12 -08:00
Jingning Han	cd907b1601	16x16 butterfly inverse ADST/DCT hybrid transform rebased. This patch includes 16x16 butterfly inverse ADST/DCT hybrid transform. It uses the variant ADST of kernel sin((2k+1)*(2n+1)/4N), which allows a butterfly implementation. The coding gains as compared to DCT 16x16 are about 0.1% for both derf and std-hd. It is noteworthy that for std-hd sets many sequences gains about 0.5%, some 0.2%. There are also few points that provides -1% to -3% performance. Hence the average goes to about 0.1%. Change-Id: Ie80ac84cf403390f6e5d282caa58723739e5ec17	2013-02-19 09:07:00 -08:00
Ronald S. Bultje	46dff5d233	Remove some Y2-related code. Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78	2013-02-15 14:06:25 -08:00
Yaowu Xu	f01b08c96c	Merge "enable bitstream lossless support" into experimental	2013-02-13 10:26:58 -08:00
Yaowu Xu	d3de97794f	Merge "fix the lossless experiment" into experimental	2013-02-13 09:54:35 -08:00
Yaowu Xu	17db5d00be	enable bitstream lossless support 1. Added a bit in frame header to to indicate if a frame is encoded in lossless mode, so decoder does not make the decision based on Q0 2. Minor changes to make sure that lossy coding works same as when the lossless experiment is not enabled. 3. Renamed function pointers for transforms to be consistent, using prefix fwd_txm and inv_txm for forward and inverse respectively To encode in lossless mode, using "--lossless=1 --min-q=0 --max-q=0" with vpxenc. Change-Id: Ifae53b26d2ffbe378d707e29d96817b8a5e6c068	2013-02-13 09:24:39 -08:00
Yaowu Xu	16f25f9dc8	fix the lossless experiment Change-Id: I95acfc1417634b52d344586ab97f0abaa9a4b256	2013-02-13 09:20:26 -08:00
Jingning Han	57e995ff9c	butterfly inverse 4x4 ADST fixed format issues. Implement the inverse 4x4 ADST using 9 multiplications. For this particular dimension, the original ADST transform can be factorized into simpler operations, hence is retained. Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960	2013-02-11 10:42:39 -08:00
Jingning Han	d15e1da494	Butterfly ADST based hybrid transform Refactor the 8x8 inverse hybrid transform. It is now consistent with the new inverse DCT. Overall performance loss (due to the use of this variant ADST, and the rounding errors in the butterfly implementation) for std-hd is -0.02. Fixed BUILD warning. Devise a variant of the original ADST, which allows butterfly computation structure. This new transform has kernel of the form: sin((2k+1)*(2n+1) / (4N)). One of its butterfly structures using floating-point multiplications was reported in Z. Wang, "Fast algorithms for the discrete W transform and for the discrete Fourier transform", IEEE Trans. on ASSP, 1984. This patch includes the butterfly implementation of the inverse ADST/DCT hybrid transform of dimension 8x8. Change-Id: I3533cb715f749343a80b9087ce34b3e776d1581d	2013-02-07 10:07:46 -08:00
Ronald S. Bultje	aa2effa954	Merge tx32x32 experiment. Change-Id: I615651e4c7b09e576a341ad425cf80c393637833	2013-01-10 08:23:59 -08:00
Ronald S. Bultje	4455036cfc	Merge superblocks (32x32) experiment. Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587	2013-01-08 12:54:45 -08:00
John Koleszar	879cb7d962	Merge vp9-preview changes into experimental branch Incorportate vp9-preview changes by merging master branch into experimental. Conflicts: test/test.mk vp9/common/vp9_filter.c vp9/common/vp9_idctllm.c vp9/common/vp9_invtrans.h vp9/common/vp9_mbpitch.c vp9/common/vp9_rtcd_defs.sh vp9/common/vp9_systemdependent.h vp9/common/vp9_type_aliases.h vp9/common/x86/vp9_asm_stubs.c vp9/common/x86/vp9_subpixel_mmx.asm vp9/decoder/vp9_decodframe.c vp9/decoder/vp9_dequantize.c vp9/decoder/vp9_dequantize.h vp9/decoder/vp9_onyxd_int.h vp9/encoder/vp9_bitstream.c vp9/encoder/vp9_encodeframe.c vp9/encoder/vp9_rdopt.c Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5	2013-01-08 10:19:59 -08:00
Ronald S. Bultje	4cca47b538	Use standard integer types for pixel values and coefficients. For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da	2012-12-18 15:31:19 -08:00
Scott LaVarnway	b575394e21	Improved vp9_ihtllm_c As suggested by Yaowu, we can use eob to reduce the complexity of the vp9_ihtllm_c function. For the 1080p test clip used, the decoder performance improved by 17%. Change-Id: I32486f2f06f9b8f60467d2a574209aa3a3daa435	2012-12-12 15:49:39 -08:00
Ronald S. Bultje	c456b35fdf	32x32 transform for superblocks. This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds code all over the place to wrap that in the bitstream/encoder/decoder/RD. Some implementation notes (these probably need careful review): - token range is extended by 1 bit, since the value range out of this transform is [-16384,16383]. - the coefficients coming out of the FDCT are manually scaled back by 1 bit, or else they won't fit in int16_t (they are 17 bits). Because of this, the RD error scoring does not right-shift the MSE score by two (unlike for 4x4/8x8/16x16). - to compensate for this loss in precision, the quantizer is halved also. This is currently a little hacky. - FDCT and IDCT is double-only right now. Needs a fixed-point impl. - There are no default probabilities for the 32x32 transform yet; I'm simply using the 16x16 luma ones. A future commit will add newly generated probabilities for all transforms. - No ADST version. I don't think we'll add one for this level; if an ADST is desired, transform-size selection can scale back to 16x16 or lower, and use an ADST at that level. Additional notes specific to Debargha's DWT/DCT hybrid: - coefficient scale is different for the top/left 16x16 (DCT-over-DWT) block than for the rest (DWT pixel differences) of the block. Therefore, RD error scoring isn't easily scalable between coefficient and pixel domain. Thus, unfortunately, we need to compute the RD distortion in the pixel domain until we figure out how to scale these appropriately. Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b	2012-12-07 14:45:05 -08:00
Jim Bankoski	030e268a90	ihtllm moves to rtcd clears up some warnings Change-Id: I9899637497c6ad7519f098e055ab98580ae6d688	2012-11-29 07:19:38 -08:00
Deb Mukherjee	0742b1e4ae	Fixing 8x8/4x4 ADST for intra modes with tx select This patch allows use of 8x8 and 4x4 ADST correctly for Intra 16x16 modes and Intra 8x8 modes when the block size selected is smaller than the prediction mode. Also includes some cleanups and refactoring. Rebase. Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a	2012-11-28 16:21:12 -08:00
Jim Bankoski	c67873989f	fixed includes to be fully specified Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243	2012-11-28 10:53:17 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00

32 Commits