generic-library/vpx

Author	SHA1	Message	Date
paulwilkins	8ea7bafdaa	Merge "Revised rd adjustment for variance."	2015-03-24 03:12:56 -07:00
paulwilkins	c0b71cf82f	Merge "Experimental rd bias based on source vs recon variance."	2015-03-24 03:12:41 -07:00
paulwilkins	7e234b9228	Revised rd adjustment for variance. Revised adjustment for rd based on source complexity. Two cases: 1) Bias against low variance intra predictors when the actual source variance is higher. 2) Give a slight bias against predictors that might introduce false texture or features when the source variance is very low. The impact on metrics of this change across the test sets is small and mixed. derf: -0.073%, -0.049%, -0.291%; std hd: -0.093%, -0.1%, -0.557%; yt: +0.186%, +0.04%, -0.074%; ythd: +0.625%, +0.563%, +0.584%. Medium to strong psycho-visual improvements in some problem clips. This feature and intra weight on GF group length now turned on by default. Change-Id: Idefc8b633a7b7bc56c42dbe19f6b2f872d73851e	2015-03-20 11:59:39 +00:00
paulwilkins	9a1ce7be7d	Experimental rd bias based on source vs recon variance. This experiment biases the rd decision based on the impact a mode decision has on the relative spatial complexity of the reconstruction vs the source. The aim is to better retain a semblance of texture even if it is slightly misaligned / wrong, rather than use a simple rd measure that tends to favor use of a flat predictor if a perfect match can't be found. This improves the appearance of texture and visual quality on specific test clips but is hidden under a flag and currently off by default pending visual quality testing on a wider Yt set. Change-Id: Idf6e754a8949bf39ed9d314c6f2daaa20c888aad	2015-03-20 11:57:36 +00:00
Adrian Grange	12d946df89	Restore first ref frame pointer to the correct value The joint_motion_search function alternates prediction between two reference frames. In order to reuse existing code, a pointer to the appropriate reference frame is written into xd->plane[0].pre[0], that the motion estimation code assumes points to the reference frame. If this first reference frame was scaled then the pointer was incorrectly being reset to point to the unscaled reference frame rather than the scaled version. Change-Id: I76f73a8d8f4f15c1f3a5e7e08a35140cdb7886ab	2015-03-19 16:17:31 -07:00
Adrian Grange	53c9ebe609	Move joint_motion_search & delete function prototype Change-Id: I7fb3a78ed0e0bc940d8b4a57c470302f8369782f	2015-03-19 14:28:52 -07:00
Alex Converse	ad01d275e9	Merge "Don't inline cost_coeffs."	2015-03-05 13:54:44 -08:00
Adrian Grange	6e3be5c3b6	Merge "Fix valgrind memcpy memory overlaps warning"	2015-03-05 12:52:57 -08:00
Alex Converse	2eb113d00a	Don't inline cost_coeffs. It was tiny when it was originally marked INLINE. Forcing this function to be inlined prevents the compiler from inlining its much smaller callers. No measurable speed impact; libvpx.a is 28320 bytes smaller. Change-Id: I6bf4c917157d15cbadb3cd3e20a9e82d35dc7d6f	2015-03-05 12:39:02 -08:00
Adrian Grange	3807dd82ab	Make encoder buffer allocation dynamic Frame buffers are now allocated dynamically on-demand. Entries in the reference frame map, cm->ref_frame_map, may now be set to -1 (INVALID_IDX) to indicate that there is not a valid reference buffer in that "slot". All slots in the reference frame map are now initialized to the empty state (-1) and each buffer is initialized to have a reference count of 0. Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582	2015-03-04 07:58:32 -08:00
Adrian Grange	852f62fde5	Fix valgrind memcpy memory overlaps warning Change-Id: Id0bb162b48b891c5c849f0411ef2ac0aa4bbe261	2015-03-03 15:06:34 -08:00
Jingning Han	5041aa0fbe	Fix ioc issue in block_rd_txfm Force 64-bit precision in the intermediate steps. Change-Id: I666113d9adcef8975da201d5aa1a13b783d09594	2015-02-12 12:51:39 -08:00
Adrian Grange	23ebacdb81	Auto-adaptive encoder frame resizing logic. Note: This feature is still in development. Add an option for the encoder to decide the resolution at which to encode each frame. Each KF/GF/ARF group is tested to see if it would be better encoded at a lower resolution. At present, each KF/GF/ARF is coded first at full size, and if the coded size exceeds a threshold (twice the target data rate) at the maximum active Q, then the entire group is encoded at lower resolution. This feature is enabled in vpxenc by setting: --resize-allowed=1 In addition, if the vpxenc command line also specifies valid frame dimensions using: --resize-width=XXXX & --resize-height=YYYY then all frames will be encoded at this resolution. Change-Id: I13f341e0a82512f9e84e144e0f3b5aed8a65402b	2015-02-10 09:59:32 -08:00
hkuang	be6aeadaf4	Try again to merge branch 'frame-parallel' into master branch. In frame parallel decode, the libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return a frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with the current serial decode mode, in which the libvpx decoder is idle between decode calls, the frame parallel decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of a loopfilter worker thread. There are still some known issues that need to be addressed. For example: decoding frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit a18da9760a74d9ce6fb9f875706dc639c95402f5. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02	2015-01-30 21:00:13 -08:00
Jingning Han	9bdc0ae2b2	Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8 Add parentheses to bit operations. Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749	2015-01-23 11:48:58 -08:00
Johann	a18da9760a	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch." This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7 Change-Id: I053dae04c761b04a36dc239558503905a14d2470	2015-01-23 08:42:02 -08:00
hkuang	bde04ce503	Merge branch 'frame-parallel' to enable frame parallel decode in master branch. In frame parallel decode, the libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return a frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compared with the current serial decode mode, in which the libvpx decoder is idle between decode calls, the frame parallel decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading, which will make devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Separate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64	2015-01-22 18:18:53 -08:00
Jingning Han	5c31fd5c6d	Merge "Enable sub8x8 inter block search for RTC coding mode"	2015-01-02 10:00:35 -08:00
Jingning Han	dad89d5ca1	Enable sub8x8 inter block search for RTC coding mode This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a	2014-12-24 17:40:31 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Paul Wilkins	2e39817f5e	Merge "Improve motion detection for low complexity regions."	2014-12-18 08:38:21 -08:00
Jingning Han	01613aa753	Set second ref frame to be NONE in key frame coding This commit explicitly set the second reference frame type to be NONE in key frame coding mode. This fixes a subtle dependency of reference motion vector used by next inter frame on mode_info reset before key frame coding. Change-Id: I5ff0359753fdc9992b0bfe889490f7a32d7d5f6a	2014-12-16 15:49:58 -08:00
Paul Wilkins	b6c75c5a8d	Improve motion detection for low complexity regions. Where there is very subtle motion, especially when combined with low spatial complexity, the codec sometimes fails to quickly pick up the ambient motion field. Once it has been established, though, the field propagates well using Nearest and Near MV. This patch looks specifically at the case where the Nearest and Near have not been established as non zero vectors and in this case discounts the cost of searching for a new vector in the rd code. This will almost certainly have some implications in terms of encode speed but it should be possible to mitigate the impact in a subsequent patch using first pass stats and the local spatial complexity. Average results for test sets approximately neutral. Change-Id: I44a29e20f11f7ab10f8c93ffbdc50183d9801524	2014-12-16 17:22:54 +00:00
Jingning Han	eefe869291	Simplify rate-distortion modeling function Use left shift to replace one multiplication. The computation outcome remains identical. Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1	2014-12-15 11:51:16 -08:00
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
hkuang	382f86f945	Improve the performance by caching the left_mi and right_mi in macroblockd. This improves decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3	2014-12-05 16:25:42 -08:00
Jingning Han	74ded4863e	Enable conditional skip path in rd_pick_intra_sby_mode These speed-up features for key frame coding are only turned on in the settings of hybrid non-RD and RD mode decision. It provides about 20% speed-up to the hybrid key frame coding at the expense of certain compression performance loss. For vidyo1, the key frame coding statistics are changed 9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us Overall rtc set compression performance is down by -0.257%. Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f	2014-12-05 09:36:09 -08:00
Yunqing Wang	edbd61e136	vp9_ethread: modify VP9_COMP structure This patch modified struct VP9_COMP. Created a struct ThreadData to include data that needs to be copied for each thread. In the multiple thread case, one thread processes one tile. All threads share one copy of VP9_COMP (refer to VP9_COMP cpi in the code), but each thread has its own copy of ThreadData (refer to ThreadData td in the code). Therefore, within the scope of encode_tiles(), both cpi and td need to be passed as function parameters. In the single thread case, the FRAME_COUNTS pointer in ThreadData points to "counts" in VP9_COMMON. Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e	2014-11-24 17:57:38 -08:00
Yunqing Wang	379334c2d8	vp9_ethread: move filter_cache out of RD_OPT struct Similar to mask_filter, the filter_cache in RD_OPT struct can be moved out, and declared as a local variable since it is only used in pick_inter_mode functions. Change-Id: I412b99cca82bade07ac912064ec03dd1de6b2c17	2014-11-20 13:44:16 -08:00
Yunqing Wang	b0efddd8e6	vp9_ethread: change mask_filter to a local variable The mask_filter in RD_OPT struct is used to record rd result in filter decision. It is only used in pick_inter_mode functions, and is removed from the struct and declared as a local variable. Change-Id: I3c95c8632ba7241591ce00ef2ef5677b5e297d7b	2014-11-20 09:41:49 -08:00
Yunqing Wang	70c9d2983b	Revert "vp9_ethread: include a pointer to mb in VP9_COMP" This reverts commit 6906d218ddd1af97228a797f4558e402231d94f1. Another way will be used to handle mb struct. Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4	2014-11-20 08:31:12 -08:00
Yunqing Wang	6906d218dd	vp9_ethread: include a pointer to mb in VP9_COMP Modified VP9_COMP struct to include MACROBLOCK *mb. This change makes it feasible in multi-thread case to allocate a mb for each thread. Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e	2014-11-14 12:31:06 -08:00
Jingning Han	61966b1d10	Merge "Refactor vp9_update_rd_thresh_fact"	2014-10-31 08:55:28 -07:00
Jingning Han	f7b46d8c5e	Refactor vp9_update_rd_thresh_fact Reduce the scope of function parameters. Change-Id: Ifef2cfb559908a97498ffdbd6ea53da1cd45a73c	2014-10-30 11:09:40 -07:00
Hui Su	66906da066	Merge "Combine vp9_encode_block_intra and encode_block_intra"	2014-10-30 11:02:31 -07:00
Jingning Han	9349a28e80	Enable mode search threshold update in non-RD coding mode Adaptively adjust the mode thresholds after each mode search round to skip checking less likely selected modes. Local tests indicate 5% - 10% speed-up in speed -5 and -6. Average coding performance loss is -1.055%. speed -5 vidyo1 720p 1000 kbps 16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms nik 720p 1000 kbps 33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms speed -6 vidyo1 720p 1000 kbps 16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms nik 720p 1000 kbps 33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b	2014-10-29 10:55:34 -07:00
Hui Su	0928da3b6e	Combine vp9_encode_block_intra and encode_block_intra Change-Id: I79091fb677b64892ecca2fb466fde14602d8cdfc	2014-10-28 18:57:01 -07:00
Jingning Han	d56b3eb0cf	Refactor encoder tile data structure Make the common tile info as one element in the encoder tile data struct. Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be	2014-10-27 19:37:13 -07:00
Jingning Han	eee201c221	Tile based adaptive mode search in RD loop Make the spatially adaptive mode search in rate-distortion optimization loop inter tile independent. Experiments suggest that this does not significantly change the coding statistics. Single tile, speed 3: pedestrian_area 1080p 1500 kbps 59192 b/f, 40.611 dB, 101689 ms blue_sky 1080p 1500 kbps 58505 b/f, 36.347 dB, 62458 ms mobile_cal 720p 1000 kbps 13335 b/f, 35.646 dB, 45655 ms as compared to 4 column tiles, speed 3: pedestrian_area 1080p 1500 kbps 59329 b/f, 40.597 dB, 101917 ms blue_sky 1080p 1500 kbps 58712 b/f, 36.320 dB, 62693 ms mobile_cal 720p 1000 kbps 13191 b/f, 35.485 dB, 45319 ms Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325	2014-10-24 10:00:27 -07:00
Yunqing Wang	330a6b2756	Merge "vp9_ethread: allocate frame contexts outside VP9_COMMON struct"	2014-10-22 17:10:39 -07:00
Yunqing Wang	7c7e4d4eb8	vp9_ethread: allocate frame contexts outside VP9_COMMON struct This patch allocated frame contexts outside VP9_COMMON. This allows multiple threads to share the same copy of frame contexts, and reduces the overhead. It also guarantees the correct update of these contexts during bitstream packing. This patch doesn't change encoding result. Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353	2014-10-22 15:03:12 -07:00
Hangyu Kuang	9ce3a7d76c	Implement frame parallel decode for VP9. Using 4 threads, frame parallel decode is ~3x faster than single thread decode and around 30% faster than tile parallel decode for frame parallel encoded video on both Android and desktop with 4 threads. Decode speed is scalable to threads too which means decode could be even faster with more threads. Change-Id: Ia0a549aaa3e83b5a17b31d8299aa496ea4f21e3e	2014-10-22 10:50:58 -07:00
Jingning Han	94ecfa323f	Reset rate cost value in rd mode search When early termination is triggered, properly reset the rate cost to invalid value to avoid potential ioc issue. Change-Id: I3444390be2e49a34bb02cf8a74c33d5dbd96d88d	2014-10-17 09:33:59 -07:00
Jingning Han	ed100c0b00	Fix an ioc issue in super_block_uvrd This commit fixes an ioc issue that will happen when the cumulative variables are not in effective use. The fix discards these redundant additions. Change-Id: Idbac5bfb989c0cedc5f8a323effce938519b2457	2014-10-16 11:07:39 -07:00
Jingning Han	f3a5de816d	Refactor super_block_uvrd function to remove goto statement Use return value 0/1 as indicator of the validity of the rate- distortion cost. Change-Id: I6244126fbf03472cebcba4f177a6cd329fae4743	2014-10-14 09:58:11 -07:00
Jingning Han	69a09a70e9	Use speed feature variable in vp9_rd_pick_inter/intra_mode Replace repeated fetch cpi->sf with a local sf pointer. Change-Id: I5a55bba3e1c41fbdbc6ad5f078d2fa49dd95ee67	2014-10-13 16:15:00 -07:00
Jingning Han	3bdb6bfcee	Fix vp9_rd_pick_inter/intra function types The returned value is not used anywhere, hence changing the function type into void. Change-Id: I0ece49ed61e7aab6df01140135503ad41d4ef4a4	2014-10-13 16:00:46 -07:00
Jingning Han	811cef97c9	Refactor rate distortion cost structure This commit makes a struct that contains rate value, distortion value, and the rate-distortion cost. The goal is to provide a better interface for rate-distortion related operation. It is first used in rd_pick_partition and saves a few RDCOST calculations. Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f	2014-10-13 14:27:16 -07:00
Jingning Han	a62acf3c0a	Fix ActiveMapTest valgrind warning This fixes a valgrind warning in the ActiveMapTest unit test reported in issue 870. Change-Id: Idf172ab0244ebefe630c3577e649bc9ba7c43d10	2014-10-11 22:36:58 -07:00
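
The "Refactor rate distortion cost structure" entry above (811cef97c9) describes a struct that bundles a rate value, a distortion value, and their combined rate-distortion cost. A minimal sketch of that idea follows. It is not the libvpx code: the names RdCostSketch, rd_cost_set, rd_cost_invalidate and rd_cost_is_better, the field types, and the fixed-point cost formula are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: keeps rate, distortion and the combined cost together
 * so that callers do not have to recompute the cost for each comparison. */
typedef struct {
  int rate;       /* estimated bits, in some fixed-point unit (assumed) */
  int64_t dist;   /* distortion, e.g. sum of squared error (assumed) */
  int64_t rdcost; /* cached combined cost */
} RdCostSketch;

/* Assumed cost model: rate scaled by a multiplier with 8 fractional bits,
 * rounded, plus distortion. The 64-bit cast avoids overflow in the product. */
static int64_t rd_cost_compute(int rdmult, int rate, int64_t dist) {
  return (((int64_t)rate * rdmult + 128) >> 8) + dist;
}

/* Fill all three fields at once so the cached cost never goes stale. */
static void rd_cost_set(RdCostSketch *rd, int rdmult, int rate, int64_t dist) {
  assert(rd != NULL);
  rd->rate = rate;
  rd->dist = dist;
  rd->rdcost = rd_cost_compute(rdmult, rate, dist);
}

/* Mark a result as invalid, for example after early termination. */
static void rd_cost_invalidate(RdCostSketch *rd) {
  assert(rd != NULL);
  rd->rate = INT_MAX;
  rd->dist = INT64_MAX;
  rd->rdcost = INT64_MAX;
}

/* Returns 1 when candidate a has a strictly lower cost than candidate b. */
static int rd_cost_is_better(const RdCostSketch *a, const RdCostSketch *b) {
  return a->rdcost < b->rdcost;
}
```

Caching the combined cost in the struct is what lets a partition search compare candidates without repeating the cost calculation, which matches the saving that entry mentions.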


1244 Commits