generic-library/vpx

Author	SHA1	Message	Date
Jingning Han	ac50b75e50	Use balanced model for intra prediction mode coding This commit replaces the previous table based intra mode model coding with a more balanced entropy coding system. It reduces the decoder lookup table size by 1K bytes. The key frame compression performance is about even on average. There are a few points where the compression performance is improved by over 5%. Most test points are fairly close to the lookup table approach. Change-Id: I47154276c0a6a22ae87de8845bc2d494681b95f6	2015-06-23 16:42:56 -07:00
Jingning Han	81c389e790	Make tx partition entropy coder account for block size This commit allows the entropy coder for transform block partition to account for its relative position with respect to the block size. Change-Id: I2b5019c378bfb58c11b926fa50c0db1933f35852	2015-06-18 21:56:30 +00:00
Jingning Han	0a42a1efd4	Add max_tx_size to MB_MODE_INFO Refactor the recursive transform block partition to reduce repeated computation of the maximum transform block size per block. Change-Id: Ib408c78dc6923fe7d337dc937e74f2701ac63859	2015-06-18 14:54:49 -07:00
Jingning Han	a4fd58a761	Refactor tx_block_rd_b() to compute per block rd cost This commit makes the tx_block_rd_b() compute the rate and distortion cost per transform block, instead of accumulating these costs. Change-Id: Iff5adc4c27cc54f8e6eb3abd95f8d88ba00f462c	2015-06-15 09:08:00 -07:00
Jingning Han	e272e5b8fb	Skip redundant flag reset If the skip flag is already on, there is no need to further check the all zero block case. This improves encoding speed at no coding statistics change. Change-Id: Icab997ca2977e650351a47ff1314def5ac4ecb1d	2015-06-12 11:44:01 -07:00
Jingning Han	5180368403	Allow encoder to force all zero coefficient block This commit allows the encoder to force all zero quantized coefficient block per transform block, if that provides better rate-distortion trade-off. Change-Id: I5b57b28cccd257ebfaf7c1749dda7be482abc834	2015-06-12 09:18:10 -07:00
Jingning Han	9ce132ac37	Refactor transform block partition entropy coding This commit refactors the transform block partition entropy coding process to improve the encoding speed. There is no change in the compression statistics. Change-Id: I237466fd95c1b888df432babfa36e01f74240eef	2015-06-11 09:41:20 -07:00
Jingning Han	87a0d5436b	Account for context information for partition rate estimate This commit allows the encoder to account for the boundary block information to estimate the transform block partition rate cost in the rate-distortion optimization scheme. Change-Id: Idb79cf936d96cdd15bcba27e47318295413a5f5d	2015-06-09 15:53:55 -07:00
Jingning Han	948c6d882e	Enable transform block partition entropy coding Select the probability model for transform block partition coding conditioned on the neighbor transform block sizes. Change-Id: Ib701296e59009bad97dbd21d8dcd58bc5e552f39	2015-06-09 12:30:52 -07:00
Jingning Han	79d6b8fc85	Properly handle boundary block rate distortion computation This commit makes the encoder properly compute the rate distortion cost for blocks that partially cover extended pixels. Change-Id: I44529af6f76925cdc0f6b24a5d190b51b0813983	2015-06-09 11:14:24 -07:00
Jingning Han	b54dd00f53	Align the intra and inter mode cost measurement This commit aligns the measurement method used to evaluate both intra and inter modes. Change-Id: I8071584ce87fa3c5401800363daa0e670de29af5	2015-06-05 11:37:21 -07:00
Jingning Han	3239e22a42	Conditionally use recursive transform block partition search If the frame header sets to use fixed transform block size, use the univariate transform block partition search flow. Change-Id: Ic422ecb6565642cd8ddb96dc67a37109ef3ce90f	2015-06-03 11:14:26 -07:00
Jingning Han	a96f2ca319	Rework the rate and distortion computation pipeline This allows the encoder to use more precise rate and distortion costs for mode decision. Change-Id: I7cfd676a88531a194b9a509375feea8365e5ef12	2015-06-02 23:15:09 -07:00
Jingning Han	0207dcde4a	Fix rate estimate issue in transform block partition coding This commit fixes the over count issue in the recursive transform block partition rate cost estimation. It improves the compression performance by about 0.45%. Change-Id: I01ccda954ed0e120263977472c1c759c3c67170c	2015-06-02 18:51:03 -07:00
Jingning Han	33f05e90fe	Enable rate-distortion optimization for transform partition This commit enables the rate-distortion optimization for recursive transform block partition for inter mode blocks based on luma component. The chroma component infers the transform block size decision from those of luma component. Change-Id: I907cc52af888a606b718e087e717b189fa505748	2015-06-01 16:50:36 -07:00
Jingning Han	0451c6b6dd	Refactor per block rate distortion estimate Move the rate-distortion estimate function outside the recursion as an individual operating module. Change-Id: I662199223c256664bcd312084b3aebffb8a8034b	2015-06-01 12:41:45 -07:00
Jingning Han	d4b8dd76c4	Make chroma component RD estimate support transform partition This commit makes the rate-distortion estimation of the chroma components support the recursive transform block partition inferred from the luma component mode decisions. Change-Id: I2e038bebf558da406e966015952ad1058bdf4766	2015-06-01 11:15:15 -07:00
Jingning Han	6fc13b5cc2	Inter block transform coding partition syntax elements Allocate memory buffer to store the transform coding partition information of inter prediction mode blocks. Change-Id: I428b1dd0b26e8eaf24030a833554ceb4479c5551	2015-05-22 10:57:36 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
Adrian Grange	ad18b2b641	Remove 8-bit array in HBD Creating both 8- and 16-bit arrays and then only using one of them is wasteful. Change-Id: Ic5b397c283efaff7bcfff2d2413838ba3e065561	2015-03-25 15:37:03 -07:00
Adrian Grange	65df3d138a	Replace heap with stack memory allocation Replaced the dynamic memory allocation of the second_pred buffer with an allocation on the stack. Change-Id: I2716c46b71e8587714ca5733a99eca2c68419b23	2015-03-25 15:36:43 -07:00
Adrian Grange	8d8d7bfde5	Fix use of scaling in joint motion search To enable us to use the scale-invariant motion estimation code during mode selection, each of the reference buffers is scaled to match the size of the frame being encoded. This fix ensures that a unit scaling factor is used in this case rather than the one calculated assuming that the reference frame is not scaled. Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf	2015-03-25 15:35:29 -07:00
paulwilkins	8ea7bafdaa	Merge "Revised rd adjustment for variance."	2015-03-24 03:12:56 -07:00
paulwilkins	c0b71cf82f	Merge "Experimental rd bias based on source vs recon variance."	2015-03-24 03:12:41 -07:00
paulwilkins	7e234b9228	Revised rd adjustment for variance. Revised adjustment for rd based on source complexity. Two cases: 1) Bias against low variance intra predictors when the actual source variance is higher. 2) When the source variance is very low to give a slight bias against predictors that might introduce false texture or features. The impact on metrics of this change across the test sets is small and mixed. derf -0.073%, -0.049%, -0.291% std hd -0.093%, -0.1%, -0.557% yt +0.186%, +0.04%, -0.074% ythd +0.625%, +0.563%, +0.584% Medium to strong psycho-visual improvements in some problem clips. This feature and intra weight on GF group length now turned on by default. Change-Id: Idefc8b633a7b7bc56c42dbe19f6b2f872d73851e	2015-03-20 11:59:39 +00:00
paulwilkins	9a1ce7be7d	Experimental rd bias based on source vs recon variance. This experiment biases the rd decision based on the impact a mode decision has on the relative spatial complexity of the reconstruction vs the source. The aim is to better retain a semblance of texture even if it is slightly misaligned / wrong, rather than use a simple rd measure that tends to favor use of a flat predictor if a perfect match can't be found. This improves the appearance of texture and visual quality on specific test clips but is hidden under a flag and currently off by default pending visual quality testing on a wider Yt set. Change-Id: Idf6e754a8949bf39ed9d314c6f2daaa20c888aad	2015-03-20 11:57:36 +00:00
Adrian Grange	12d946df89	Restore first ref frame pointer to the correct value The joint_motion_search function alternates prediction between two reference frames. In order to reuse existing code, a pointer to the appropriate reference frame is written into xd->plane[0].pre[0], that the motion estimation code assumes points to the reference frame. If this first reference frame was scaled then the pointer was incorrectly being reset to point to the unscaled reference frame rather than the scaled version. Change-Id: I76f73a8d8f4f15c1f3a5e7e08a35140cdb7886ab	2015-03-19 16:17:31 -07:00
Adrian Grange	53c9ebe609	Move joint_motion_search & delete function prototype Change-Id: I7fb3a78ed0e0bc940d8b4a57c470302f8369782f	2015-03-19 14:28:52 -07:00
Alex Converse	ad01d275e9	Merge "Don't inline cost_coeffs."	2015-03-05 13:54:44 -08:00
Adrian Grange	6e3be5c3b6	Merge "Fix valgrind memcpy memory overlaps warning"	2015-03-05 12:52:57 -08:00
Alex Converse	2eb113d00a	Don't inline cost_coeffs. It was tiny when it was originally marked INLINE. Forcing this function to be inlined prevents the compiler from inlining its much smaller callers. No measurable speed impact, 28320 byte smaller libvpx.a Change-Id: I6bf4c917157d15cbadb3cd3e20a9e82d35dc7d6f	2015-03-05 12:39:02 -08:00
Adrian Grange	3807dd82ab	Make encoder buffer allocation dynamic Frame buffers are now allocated dynamically on-demand. Entries in the reference frame map, cm->ref_frame_map, may now be set to -1 (INVALID_IDX) to indicate that there is not a valid reference buffer in that "slot". All slots in the reference frame map are now initialized to the empty state (-1) and each buffer is initialized to have a reference count of 0. Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582	2015-03-04 07:58:32 -08:00
Adrian Grange	852f62fde5	Fix valgrind memcpy memory overlaps warning Change-Id: Id0bb162b48b891c5c849f0411ef2ac0aa4bbe261	2015-03-03 15:06:34 -08:00
Jingning Han	5041aa0fbe	Fix ioc issue in block_rd_txfm Force 64-bit precision in the intermediate steps. Change-Id: I666113d9adcef8975da201d5aa1a13b783d09594	2015-02-12 12:51:39 -08:00
Adrian Grange	23ebacdb81	Auto-adaptive encoder frame resizing logic Note: This feature is still in development. Add an option for the encoder to decide the resolution at which to encode each frame. Each KF/GF/ARF group is tested to see if it would be better encoded at a lower resolution. At present, each KF/GF/ARF is coded first at full-size and if the coded size exceeds a threshold (twice target data rate) at the maximum active Q then the entire group is encoded at lower resolution. This feature is enabled in vpxenc by setting: --resize-allowed=1 In addition, if the vpxenc command line also specifies valid frame dimensions using: --resize-width=XXXX & --resize_height=YYYY then all frames will be encoded at this resolution. Change-Id: I13f341e0a82512f9e84e144e0f3b5aed8a65402b	2015-02-10 09:59:32 -08:00
hkuang	be6aeadaf4	Try again to merge branch 'frame-parallel' into master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with current serial decode mode, in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of loopfilter worker thread. There are still some known issues that need to be addressed. For example: decoding frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit a18da9760a74d9ce6fb9f875706dc639c95402f5. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02	2015-01-30 21:00:13 -08:00
Jingning Han	9bdc0ae2b2	Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8 Add parentheses to bit operations. Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749	2015-01-23 11:48:58 -08:00
Johann	a18da9760a	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch." This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7 Change-Id: I053dae04c761b04a36dc239558503905a14d2470	2015-01-23 08:42:02 -08:00
hkuang	bde04ce503	Merge branch 'frame-parallel' to enable frame parallel decode in master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with current serial decode mode, in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading, which will make devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64	2015-01-22 18:18:53 -08:00
Jingning Han	5c31fd5c6d	Merge "Enable sub8x8 inter block search for RTC coding mode"	2015-01-02 10:00:35 -08:00
Jingning Han	dad89d5ca1	Enable sub8x8 inter block search for RTC coding mode This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a	2014-12-24 17:40:31 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Paul Wilkins	2e39817f5e	Merge "Improve motion detection for low complexity regions."	2014-12-18 08:38:21 -08:00
Jingning Han	01613aa753	Set second ref frame to be NONE in key frame coding This commit explicitly sets the second reference frame type to be NONE in key frame coding mode. This fixes a subtle dependency of reference motion vector used by next inter frame on mode_info reset before key frame coding. Change-Id: I5ff0359753fdc9992b0bfe889490f7a32d7d5f6a	2014-12-16 15:49:58 -08:00
Paul Wilkins	b6c75c5a8d	Improve motion detection for low complexity regions. Where there is very subtle motion, especially when combined with low spatial complexity, the codec sometimes fails to quickly pick up the ambient motion field. Once it has been established though the field propagates well using Nearest and Near MV. This patch looks specifically at the case where the Nearest and Near have not been established as non zero vectors and in this case discounts the cost of searching for a new vector in the rd code. This will almost certainly have some implications in terms of encode speed but it should be possible to mitigate the impact in a subsequent patch using first pass stats and the local spatial complexity. Average results for test sets approximately neutral. Change-Id: I44a29e20f11f7ab10f8c93ffbdc50183d9801524	2014-12-16 17:22:54 +00:00
Jingning Han	eefe869291	Simplify rate-distortion modeling function Use left shift to replace one multiplication. The computation outcome remains identical. Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1	2014-12-15 11:51:16 -08:00
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
hkuang	382f86f945	Improve the performance by caching the left_mi and right_mi in macroblockd. This improves the decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3	2014-12-05 16:25:42 -08:00
Jingning Han	74ded4863e	Enable conditional skip path in rd_pick_intra_sby_mode These speed-up features for key frame coding are only turned on in the settings of hybrid non-RD and RD mode decision. It provides about 20% speed-up to the hybrid key frame coding at the expense of certain compression performance loss. For vidyo1, the key frame coding statistics are changed 9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us Overall rtc set compression performance is down by -0.257%. Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f	2014-12-05 09:36:09 -08:00

1266 Commits