generic-library/vpx

Author	SHA1	Message	Date
Jingning Han	81452cf0b7	Refactor intra block prediction function This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138	2015-07-13 15:20:47 -07:00
paulwilkins	8dd466edc8	Changes to use of rectangular partitions. Changes to allow more use of rectangular partitions at speeds 1 and 2 for content classed by the first pass as animation and for blocks near the active image edge. This has quite a big impact in quality for the animated test sequence but also hurts encode speed for speed 2. For other content types the impact on both speed and quality is small. Added some plumbing for detection of internal vertical image edges. Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a	2015-07-08 18:14:12 +01:00
paulwilkins	a126b6ce7d	Change speed and rd features for formatting bars. Change speed features / behavior for split mode when there is an internal active edge (e.g. formatting bars). Remove some threshold constraints in rd code near the active edge of the image. Add some plumbing for left and right active edge detection. Patch set 5. Limit rd pass through for sub 8x8 to internal active edges. This takes away any speed penalty for most clips but keeps the enhanced edge coding for the more critical case of internal image edges Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5	2015-07-08 17:51:42 +01:00
Johann	6a82f0d7fb	Move sub pixel variance to vpx_dsp Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1	2015-07-07 15:51:04 -07:00
Jingning Han	fcb5a8692a	Merge "Move subtract functions from vp9 to vpx_dsp"	2015-07-06 22:39:26 +00:00
James Zern	017253b7a3	remove vp9_get_interp_kernel() expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20	2015-07-06 13:04:05 -07:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
Scott LaVarnway	c06d56cc7d	VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107	2015-06-29 12:46:47 -07:00
Scott LaVarnway	86f4a3d8af	Remove tile param and added to MACROBLOCKD. Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db	2015-06-22 06:09:38 -07:00
Scott LaVarnway	cca866f578	inline vp9_get_segdata() and change name. Change-Id: I706645cf9d9dc04f1b3b6ac80df80edb7f101854	2015-06-11 09:52:00 -07:00
Scott LaVarnway	42c0b1b1f1	inline vp9_segfeature_active() and changed name. Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3	2015-06-11 04:20:55 -07:00
Scott LaVarnway	baaaa57533	Reducing size of MODE_INFO struct Reduced size from 124 bytes to 104 bytes. For decode only builds, it is reduced to 68 bytes. Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3	2015-06-04 07:32:16 -07:00
Scott LaVarnway	b962646fc5	Re-worked header files Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files. Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32	2015-05-22 11:19:51 -07:00
Johann	1d7ccd5325	Relocate memory operations for common code With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca	2015-05-13 11:41:15 -07:00
James Zern	fd3658b0e4	replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79	2015-05-07 11:55:08 -07:00
James Zern	f58011ada5	vpx_mem: remove vpx_memset vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201	2015-04-28 20:00:59 -07:00
James Zern	f274c2199b	vpx_mem: remove vpx_memcpy vestigial. replace instances with memcpy() which they already were being defined to. Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c	2015-04-28 19:59:41 -07:00
James Zern	fbd3b89488	vpx_mem: remove vpx_memmove vestigial. replace instances with memmove() which they already were being defined to. Change-Id: If396d3f9e3cf79c0ee5d7429615ef3d6b2a34afa	2015-04-28 19:59:40 -07:00
Scott LaVarnway	8b17f7f4eb	Revert "Remove mi_grid_* structures." (see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d	2015-04-21 11:16:45 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
Adrian Grange	ad18b2b641	Remove 8-bit array in HBD Creating both 8- and 16-bit arrays and then only using one of them is wasteful. Change-Id: Ic5b397c283efaff7bcfff2d2413838ba3e065561	2015-03-25 15:37:03 -07:00
Adrian Grange	65df3d138a	Replace heap with stack memory allocation Replaced the dynamic memory allocation of the second_pred buffer with an allocation on the stack. Change-Id: I2716c46b71e8587714ca5733a99eca2c68419b23	2015-03-25 15:36:43 -07:00
Adrian Grange	8d8d7bfde5	Fix use of scaling in joint motion search To enable us to use the scale-invariant motion estimation code during mode selection, each of the reference buffers is scaled to match the size of the frame being encoded. This fix ensures that a unit scaling factor is used in this case rather than the one calculated assuming that the reference frame is not scaled. Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf	2015-03-25 15:35:29 -07:00
paulwilkins	8ea7bafdaa	Merge "Revised rd adjustment for variance."	2015-03-24 03:12:56 -07:00
paulwilkins	c0b71cf82f	Merge "Experimental rd bias based on source vs recon variance."	2015-03-24 03:12:41 -07:00
paulwilkins	7e234b9228	Revised rd adjustment for variance. Revised adjustment for rd based on source complexity. Two cases: 1) Bias against low variance intra predictors when the actual source variance is higher. 2) When the source variance is very low to give a slight bias against predictors that might introduce false texture or features. The impact on metrics of this change across the test sets is small and mixed. derf -0.073%, -0.049%, -0.291% std hd -0.093%, -0.1%, -0.557% yt +0.186%, +0.04%, -0.074% ythd +0.625%, +0.563%, +0.584% Medium to strong psycho-visual improvements in some problem clips. This feature and intra weight on GF group length now turned on by default. Change-Id: Idefc8b633a7b7bc56c42dbe19f6b2f872d73851e	2015-03-20 11:59:39 +00:00
paulwilkins	9a1ce7be7d	Experimental rd bias based on source vs recon variance. This experiment biases the rd decision based on the impact a mode decision has on the relative spatial complexity of the reconstruction vs the source. The aim is to better retain a semblance of texture even if it is slightly misaligned / wrong, rather than use a simple rd measure that tends to favor use of a flat predictor if a perfect match can't be found. This improves the appearance of texture and visual quality on specific test clips but is hidden under a flag and currently off by default pending visual quality testing on a wider Yt set. Change-Id: Idf6e754a8949bf39ed9d314c6f2daaa20c888aad	2015-03-20 11:57:36 +00:00
Adrian Grange	12d946df89	Restore first ref frame pointer to the correct value The joint_motion_search function alternates prediction between two reference frames. In order to reuse existing code, a pointer to the appropriate reference frame is written into xd->plane[0].pre[0], that the motion estimation code assumes points to the reference frame. If this first reference frame was scaled then the pointer was incorrectly being reset to point to the unscaled reference frame rather than the scaled version. Change-Id: I76f73a8d8f4f15c1f3a5e7e08a35140cdb7886ab	2015-03-19 16:17:31 -07:00
Adrian Grange	53c9ebe609	Move joint_motion_search & delete function prototype Change-Id: I7fb3a78ed0e0bc940d8b4a57c470302f8369782f	2015-03-19 14:28:52 -07:00
Alex Converse	ad01d275e9	Merge "Don't inline cost_coeffs."	2015-03-05 13:54:44 -08:00
Adrian Grange	6e3be5c3b6	Merge "Fix valgrind memcpy memory overlaps warning"	2015-03-05 12:52:57 -08:00
Alex Converse	2eb113d00a	Don't inline cost_coeffs. It was tiny when it was originally marked INLINE. Forcing this function to be inlined prevents the compiler from inlining its much smaller callers. No measurable speed impact, 28320 byte smaller libvpx.a Change-Id: I6bf4c917157d15cbadb3cd3e20a9e82d35dc7d6f	2015-03-05 12:39:02 -08:00
Adrian Grange	3807dd82ab	Make encoder buffer allocation dynamic Frame buffers are now allocated dynamically on-demand. Entries in the reference frame map, cm->ref_frame_map, may now be set to -1 (INVALID_IDX) to indicate that there is not a valid reference buffer in that "slot". All slots in the reference frame map are now initialized to the empty state (-1) and each buffer is initialized to have a reference count of 0. Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582	2015-03-04 07:58:32 -08:00
Adrian Grange	852f62fde5	Fix valgrind memcpy memory overlaps warning Change-Id: Id0bb162b48b891c5c849f0411ef2ac0aa4bbe261	2015-03-03 15:06:34 -08:00
Jingning Han	5041aa0fbe	Fix ioc issue in block_rd_txfm Force 64-bit precision in the intermediate steps. Change-Id: I666113d9adcef8975da201d5aa1a13b783d09594	2015-02-12 12:51:39 -08:00
Adrian Grange	23ebacdb81	Auto-adaptive encoder frame resizing logic Note: This feature is still in development. Add an option for the encoder to decide the resolution at which to encode each frame. Each KF/GF/ARF group is tested to see if it would be better encoded at a lower resolution. At present, each KF/GF/ARF is coded first at full-size and if the coded size exceeds a threshold (twice target data rate) at the maximum active Q then the entire group is encoded at lower resolution. This feature is enabled in vpxenc by setting: --resize-allowed=1 In addition, if the vpxenc command line also specifies valid frame dimensions using: --resize-width=XXXX & --resize-height=YYYY then all frames will be encoded at this resolution. Change-Id: I13f341e0a82512f9e84e144e0f3b5aed8a65402b	2015-02-10 09:59:32 -08:00
hkuang	be6aeadaf4	Try again to merge branch 'frame-parallel' into master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of loopfilter worker thread. There are still some known issues that need to be addressed. For example: decode frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit `a18da9760a`. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02	2015-01-30 21:00:13 -08:00
Jingning Han	9bdc0ae2b2	Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8 Add parentheses to bit operations. Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749	2015-01-23 11:48:58 -08:00
Johann	a18da9760a	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch." This reverts commit `bde04ce503` Change-Id: I053dae04c761b04a36dc239558503905a14d2470	2015-01-23 08:42:02 -08:00
hkuang	bde04ce503	Merge branch 'frame-parallel' to enable frame parallel decode in master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading which will make devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64	2015-01-22 18:18:53 -08:00
Jingning Han	5c31fd5c6d	Merge "Enable sub8x8 inter block search for RTC coding mode"	2015-01-02 10:00:35 -08:00
Jingning Han	dad89d5ca1	Enable sub8x8 inter block search for RTC coding mode This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a	2014-12-24 17:40:31 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Paul Wilkins	2e39817f5e	Merge "Improve motion detection for low complexity regions."	2014-12-18 08:38:21 -08:00
Jingning Han	01613aa753	Set second ref frame to be NONE in key frame coding This commit explicitly sets the second reference frame type to be NONE in key frame coding mode. This fixes a subtle dependency of reference motion vector used by next inter frame on mode_info reset before key frame coding. Change-Id: I5ff0359753fdc9992b0bfe889490f7a32d7d5f6a	2014-12-16 15:49:58 -08:00
Paul Wilkins	b6c75c5a8d	Improve motion detection for low complexity regions. Where there is very subtle motion, especially when combined with low spatial complexity, the codec sometimes fails to quickly pick up the ambient motion field. Once it has been established though the field propagates well using Nearest and Near MV. This patch looks specifically at the case where the Nearest and Near have not been established as non zero vectors and in this case discounts the cost of searching for a new vector in the rd code. This will almost certainly have some implications in terms of encode speed but it should be possible to mitigate the impact in a subsequent patch using first pass stats and the local spatial complexity. Average results for test sets approximately neutral. Change-Id: I44a29e20f11f7ab10f8c93ffbdc50183d9801524	2014-12-16 17:22:54 +00:00
Jingning Han	eefe869291	Simplify rate-distortion modeling function Use left shift to replace one multiplication. The computation outcome remains identical. Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1	2014-12-15 11:51:16 -08:00
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
hkuang	382f86f945	Improve the performance by caching the left_mi and right_mi in macroblockd. This improves the decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3	2014-12-05 16:25:42 -08:00
Jingning Han	74ded4863e	Enable conditional skip path in rd_pick_intra_sby_mode These speed-up features for key frame coding are only turned on in the settings of hybrid non-RD and RD mode decision. It provides about 20% speed-up to the hybrid key frame coding at the expense of certain compression performance loss. For vidyo1, the key frame coding statistics are changed 9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us Overall rtc set compression performance is down by -0.257%. Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f	2014-12-05 09:36:09 -08:00
Yunqing Wang	edbd61e136	vp9_ethread: modify VP9_COMP structure This patch modified struct VP9_COMP. Created a struct ThreadData to include data that need to be copied for each thread. In multiple thread case, one thread processes one tile. all threads share one copy of VP9_COMP, (refer to VP9_COMP cpi in the code) but each thread has its own copy of ThreadData, (refer to ThreadData td in the code). Therefore, within the scope of encode_tiles(), both cpi and td need to be passed as function parameters. In single thread case, the FRAME_COUNTS pointer in ThreadData points to "counts" in VP9_COMMON. Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e	2014-11-24 17:57:38 -08:00
Yunqing Wang	379334c2d8	vp9_ethread: move filter_cache out of RD_OPT struct Similar to mask_filter, the filter_cache in RD_OPT struct can be moved out, and declared as a local variable since it is only used in pick_inter_mode functions. Change-Id: I412b99cca82bade07ac912064ec03dd1de6b2c17	2014-11-20 13:44:16 -08:00
Yunqing Wang	b0efddd8e6	vp9_ethread: change mask_filter to a local variable The mask_filter in RD_OPT struct is used to record rd result in filter decision. It is only used in pick_inter_mode functions, and is removed from the struct and declared as a local variable. Change-Id: I3c95c8632ba7241591ce00ef2ef5677b5e297d7b	2014-11-20 09:41:49 -08:00
Yunqing Wang	70c9d2983b	Revert "vp9_ethread: include a pointer to mb in VP9_COMP" This reverts commit `6906d218dd`. Another way will be used to handle mb struct. Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4	2014-11-20 08:31:12 -08:00
Yunqing Wang	6906d218dd	vp9_ethread: include a pointer to mb in VP9_COMP Modified VP9_COMP struct to include MACROBLOCK *mb. This change makes it feasible in multi-thread case to allocate a mb for each thread. Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e	2014-11-14 12:31:06 -08:00
Jingning Han	61966b1d10	Merge "Refactor vp9_update_rd_thresh_fact"	2014-10-31 08:55:28 -07:00
Jingning Han	f7b46d8c5e	Refactor vp9_update_rd_thresh_fact Reduce the scope of function parameters. Change-Id: Ifef2cfb559908a97498ffdbd6ea53da1cd45a73c	2014-10-30 11:09:40 -07:00
Hui Su	66906da066	Merge "Combine vp9_encode_block_intra and encode_block_intra"	2014-10-30 11:02:31 -07:00
Jingning Han	9349a28e80	Enable mode search threshold update in non-RD coding mode Adaptively adjust the mode thresholds after each mode search round to skip checking less likely selected modes. Local tests indicate 5% - 10% speed-up in speed -5 and -6. Average coding performance loss is -1.055%. speed -5 vidyo1 720p 1000 kbps 16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms nik 720p 1000 kbps 33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms speed -6 vidyo1 720p 1000 kbps 16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms nik 720p 1000 kbps 33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b	2014-10-29 10:55:34 -07:00
Hui Su	0928da3b6e	Combine vp9_encode_block_intra and encode_block_intra Change-Id: I79091fb677b64892ecca2fb466fde14602d8cdfc	2014-10-28 18:57:01 -07:00
Jingning Han	d56b3eb0cf	Refactor encoder tile data structure Make the common tile info as one element in the encoder tile data struct. Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be	2014-10-27 19:37:13 -07:00
Jingning Han	eee201c221	Tile based adaptive mode search in RD loop Make the spatially adaptive mode search in rate-distortion optimization loop inter tile independent. Experiments suggest that this does not significantly change the coding statistics. Single tile, speed 3: pedestrian_area 1080p 1500 kbps 59192 b/f, 40.611 dB, 101689 ms blue_sky 1080p 1500 kbps 58505 b/f, 36.347 dB, 62458 ms mobile_cal 720p 1000 kbps 13335 b/f, 35.646 dB, 45655 ms as compared to 4 column tiles, speed 3: pedestrian_area 1080p 1500 kbps 59329 b/f, 40.597 dB, 101917 ms blue_sky 1080p 1500 kbps 58712 b/f, 36.320 dB, 62693 ms mobile_cal 720p 1000 kbps 13191 b/f, 35.485 dB, 45319 ms Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325	2014-10-24 10:00:27 -07:00
Yunqing Wang	330a6b2756	Merge "vp9_ethread: allocate frame contexts outside VP9_COMMON struct"	2014-10-22 17:10:39 -07:00
Yunqing Wang	7c7e4d4eb8	vp9_ethread: allocate frame contexts outside VP9_COMMON struct This patch allocated frame contexts outside VP9_COMMON. This allows multiple threads to share the same copy of frame contexts, and reduces the overhead. It also guarantees the correct update of these contexts during bitstream packing. This patch doesn't change encoding result. Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353	2014-10-22 15:03:12 -07:00
Hangyu Kuang	9ce3a7d76c	Implement frame parallel decode for VP9. Using 4 threads, frame parallel decode is ~3x faster than single thread decode and around 30% faster than tile parallel decode for frame parallel encoded video on both Android and desktop with 4 threads. Decode speed is scalable to threads too which means decode could be even faster with more threads. Change-Id: Ia0a549aaa3e83b5a17b31d8299aa496ea4f21e3e	2014-10-22 10:50:58 -07:00
Jingning Han	94ecfa323f	Reset rate cost value in rd mode search When early termination is triggered, properly reset the rate cost to invalid value to avoid potential ioc issue. Change-Id: I3444390be2e49a34bb02cf8a74c33d5dbd96d88d	2014-10-17 09:33:59 -07:00
Jingning Han	ed100c0b00	Fix an ioc issue in super_block_uvrd This commit fixes an ioc issue that will happen when the cumulative variables are not in effective use. The fix discards these redundant additions. Change-Id: Idbac5bfb989c0cedc5f8a323effce938519b2457	2014-10-16 11:07:39 -07:00
Jingning Han	f3a5de816d	Refactor super_block_uvrd function to remove goto statement Use return value 0/1 as indicator of the validity of the rate- distortion cost. Change-Id: I6244126fbf03472cebcba4f177a6cd329fae4743	2014-10-14 09:58:11 -07:00
Jingning Han	69a09a70e9	Use speed feature variable in vp9_rd_pick_inter/intra_mode Replace repeated fetch cpi->sf with a local sf pointer. Change-Id: I5a55bba3e1c41fbdbc6ad5f078d2fa49dd95ee67	2014-10-13 16:15:00 -07:00
Jingning Han	3bdb6bfcee	Fix vp9_rd_pick_inter/intra function types The returned value is not used anywhere, hence changing the function type into void. Change-Id: I0ece49ed61e7aab6df01140135503ad41d4ef4a4	2014-10-13 16:00:46 -07:00
Jingning Han	811cef97c9	Refactor rate distortion cost structure This commit makes a struct that contains rate value, distortion value, and the rate-distortion cost. The goal is to provide a better interface for rate-distortion related operation. It is first used in rd_pick_partition and saves a few RDCOST calculations. Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f	2014-10-13 14:27:16 -07:00
Jingning Han	a62acf3c0a	Fix ActiveMapTest valgrind warning This fixes a valgrind warning in the ActiveMapTest unit test reported in issue 870. Change-Id: Idf172ab0244ebefe630c3577e649bc9ba7c43d10	2014-10-11 22:36:58 -07:00
Deb Mukherjee	9a29fdbae7	Merge "Rename highbitdepth functions to use highbd prefix"	2014-10-09 15:39:56 -07:00
Deb Mukherjee	1929c9b391	Rename highbitdepth functions to use highbd prefix Uses highbd_ prefix convention consistently. Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e	2014-10-09 14:40:40 -07:00
Deb Mukherjee	3117830af3	Merge "Subpel search cleanups and enhancements"	2014-10-09 11:14:51 -07:00
Alex Converse	9ffbc31367	Merge "Move the high freq coeff check outside store_coding_context"	2014-10-09 11:12:02 -07:00
Deb Mukherjee	d78dbff09a	Subpel search cleanups and enhancements - Some fixes to surface fit. - Returns variance function as cost rather than sad in the pattern search and diamond search functions. Only vp9_pattern_search_sad function used in bigdia search uses sad as integer 1-away costs. - Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+. Results: derf [Speed 3]: About +0.036% in coding efficiency without any discernible speed loss. derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency. derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency. Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63	2014-10-08 23:59:43 -07:00
Yunqing Wang	189566db58	Merge "Allow mode search breakout at very low prediction errors"	2014-10-08 19:58:18 -07:00
Yunqing Wang	e18edd5eb6	Allow mode search breakout at very low prediction errors In model_rd_for_sb function, the spatial domain SSE and variance are checked to see if transform coefficients are quantized to 0. Besides that, this patch adds another set of thresholds that are much more strict. These thresholds are used to conduct a partition block level check to measure if all its TX blocks are skippable for YUV planes. If it is true, x->skip is set for this partition block, and thus its mode search is terminated. This speeds up the encoding at very low prediction error case, such as screen sharing application. This patch covers what rd_encode_breakout_test() does, so that function is removed. Borg test at speed 3 shows: For stdhd set, psnr: +0.008%, ssim: +0.014%; For derf set, psnr: +0.018%, ssim: +0.025%. No noticeable speed change. Change-Id: I4e5f15cf10016a282a68e35175ff854b28195944	2014-10-08 17:46:22 -07:00
Jingning Han	5fcbcf1b22	Move the high freq coeff check outside store_coding_context This fixes valgrind message issue 870. Change-Id: Ibbc2481923a2995029ab05de30c9e8a6e9f0f9a8	2014-10-08 16:10:32 -07:00
Jingning Han	41cea46154	Use local variable in vp9_rd_pick_inter_mode_sb Change-Id: Ie35a965a6b8de536ccaf61ff61498620d22db205	2014-10-08 16:09:47 -07:00
Jingning Han	3bbec7b422	Merge "Replace mi_width_log2() with mi_width_log2_lookup table"	2014-10-07 15:33:52 -07:00
Jingning Han	27c9577f8e	Merge "Take out repeated block width/height lookup functions"	2014-10-07 15:33:45 -07:00
Yunqing Wang	0cac69f594	Merge "Fix skip_txfm issue in rdopt code"	2014-10-07 15:13:44 -07:00
Yunqing Wang	a4aa14020a	Fix skip_txfm issue in rdopt code Fixed an encoder crash. Set skip_txfm to 0 for cases that skip_txfm isn't calculated. Put memcpy of skip_txfm at right place. Change-Id: Ib3b6afc1b251a85b2a853c8138fb3393f48cfef6	2014-10-07 12:47:43 -07:00
Jingning Han	7ee58985bd	Replace mi_width_log2() with mi_width_log2_lookup table Change-Id: If0ea98aa139d14d40cd924114e18396aff36b5a5	2014-10-07 12:45:25 -07:00
Jingning Han	b66f7016c1	Take out repeated block width/height lookup functions The functions b_width_log2 and b_height_log2 only do direct table fetch. This commit unifies such use cases by using the table directly and removes these functions. Change-Id: I3103fc6ba959c1182886a2799d21b8b77c8a7b6b	2014-10-07 12:33:07 -07:00
Deb Mukherjee	cfc337aae8	Merge "Resolves some static analysis / undefined warnings"	2014-10-07 12:15:26 -07:00
Deb Mukherjee	fced63ed30	Resolves some static analysis / undefined warnings Also fixes a case of distortion becoming negative and messing up the RDCOST computation. Change-Id: Id345af9e8dfff31ade622be5756e51f2cdface53	2014-10-07 11:20:56 -07:00
Jingning Han	a75551585b	Fix eobs buffer pointer mis-use This commit fixes a buffer pointer mis-use in store_coding_context. The compression performance for stdhd set of speed 3 is improved by 0.097%. It fixes issue 869. Change-Id: Idc59e22035eaf39f7133ca04174894374d647ff7	2014-10-06 15:57:13 -07:00
Jingning Han	1b8c57e915	Merge "Fix an IOC issue in vp9_rd_pick_inter_mode_sb"	2014-10-06 09:29:29 -07:00
Jingning Han	085b97aa5c	Fix an IOC issue in vp9_rd_pick_inter_mode_sb It is possible that the GOLDEN reference frame is not available, in which setting the predicted mv will be associated with a residual value of INT_MAX. This commit checks this condition before left shift and comparison with that of ALTREF frame, to avoid overflow issue. Change-Id: Ib98c3149dbdd016f2fe5beaafb13f67d469dd07c	2014-10-05 12:05:14 -07:00
Jingning Han	a1088e0b5f	Merge "Rework partition search skip scheme"	2014-10-03 15:23:54 -07:00
Jingning Han	bb260d9076	Rework partition search skip scheme This commit enables the encoder to skip split partition search if the bigger block size has all non-zero quantized coefficients in low frequency area and the total rate cost is below a certain threshold. It logarithmically scales the rate threshold according to the current block size. For speed 3, the compression performance loss: derf -0.093% stdhd -0.066% Local experiments show 4% - 20% encoding speed-up for speed 3. blue_sky_1080p, 1500 kbps 51051 b/f, 35.891 dB, 67236 ms -> 50554 b/f, 35.857 dB, 59270 ms (12% speed-up) old_town_cross_720p, 1500 kbps 14431 b/f, 36.249 dB, 57687 ms -> 14108 b/f, 36.172 dB, 46586 ms (19% speed-up) pedestrian_area_1080p, 1500 kbps 50812 b/f, 40.124 dB, 100439 ms -> 50755 b/f, 40.118 dB, 96549 ms (4% speed-up) mobile_calendar_720p, 1000 kbps 10352 b/f, 35.055 dB, 51837 ms -> 10172 b/f, 35.003 dB, 44076 ms (15% speed-up) Change-Id: I412e34db49060775b3b89ba1738522317c3239c8	2014-10-03 11:54:30 -07:00
Deb Mukherjee	431cdc33ee	Prevent negative cost for highbitdepth Adds proper scaling for highbitdepth in a rdopt cost. Change-Id: I066694799a7f491b830945ef1c66eb202071c355	2014-10-03 10:22:21 -07:00
Deb Mukherjee	30fbf23fda	Merge "High-bitdepth bugfixes"	2014-10-01 16:47:43 -07:00
Yunqing Wang	e350e3fe68	Merge "Modify block transform skipping check"	2014-10-01 16:19:56 -07:00
Deb Mukherjee	a160d72522	High-bitdepth bugfixes Miscellaneous bug-fixes for high bitdepth functionality. With this patch, high bit-depth profiles become mostly functional, except for an intermittent assert failure issue that is being tracked. Change-Id: I6a7fcbdcf1e5b09842e88535f8442d2e1230748c	2014-10-01 14:18:11 -07:00
Yunqing Wang	e4aac6bb61	Modify block transform skipping check Block transform skipping was implemented based on DCT's energy conservation property. Modified the thresholds using zero bin parameters. AC and DC coefficients were checked separately to allow better identifying of skippable blocks. Borg test at speed 3 showed: stdhd set: psnr gain: 0.153%, ssim gain: 0.051%; derf set: psnr gain: 0.023%, ssim gain: 0.036% For most test clips, the encoding speedup is 1% - 2%. parkrun(720p): 7.5% speedup, park_joy(1080p): 3.5% speedup. Change-Id: If28eb81113a077414f5ca7b021c14f9069b373bb	2014-10-01 12:58:09 -07:00
Jingning Han	891793a540	Conditionally skip reference frame check For regular inter frames, if the distance from GOLDEN_FRAME is larger than 2 and if the predicted motion vector of LAST_FRAME gives lower sse than that of GOLDEN_FRAME, skip the GOLDEN_FRAME mode checking in the rate-distortion optimization. It provides about 5% speed-up at expense of -0.137% and -0.230% performance down for speed 3. Local experiment results: pedestrian 1080p 2000 kbps 66712 b/f, 40.908 dB, 113688 ms -> 66768 b/f, 40.911 dB, 108752 ms blue_sky 1080p 2000 kbps 51054 b/f, 35.894 dB, 70406 ms -> 51051 b/f, 35.891 dB, 67236 ms old_town_cross 720p 1500 kbps 14412 b/f, 36.252 dB, 60690 ms -> 14431 b/f, 36.249 dB, 57346 ms Change-Id: Idfcafe7f63da7a4896602fc60bd7093f0f0d82ca	2014-10-01 08:32:15 -07:00
Jingning Han	8b4dd536a5	Merge "Skip certain ALTREF inter modes in ARF coding"	2014-09-29 10:43:45 -07:00
Jingning Han	ccdb518ff8	Skip certain ALTREF inter modes in ARF coding This commit enables the encoder to skip checking ALTREF inter modes in ARF coding, if the predicted motion vectors suggest that the GOLDEN_FRAME provides higher prediction accuracy than ALTREF_FRAME. It improves the speed 3 encoding speed by about 5%, at the expense of compression performance loss -0.041% and -0.225% for derf and stdhd, respectively. pedestrian_area 1080p 2000 kbps 66705 b/f, 40.909 dB, 118738 ms -> 66732 b/f, 40.908 dB, 113688 ms old_town_cross 720p 1500 kbps 14427 b/f, 36.256 dB, 62746 ms -> 14412 b/f, 36.252 dB, 60690 ms blue_sky 1080p 1500 kbps 51026 b/f, 35.897 dB, 73310 ms -> 50921 b/f, 35.893 dB, 70406 ms bus CIF 1000 kbps 21301 b/f, 34.841 dB, 7326 ms -> 21248 b/f, 34.837 dB, 7196 ms Change-Id: I76cf88b4d655e1ee3c0cb03c8a5745493040e8d2	2014-09-26 12:53:43 -07:00
Deb Mukherjee	993d10a217	Adds various high bit-depth encode functions Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f	2014-09-25 01:50:36 -07:00
Jingning Han	6989e81d61	Remove unused variable in handle_inter_mode Change-Id: Id757d2c940756ce1b0ead2ea24af9ac0a493de05	2014-09-24 18:27:44 -07:00
Yaowu Xu	4a101310e8	Adapt mode based rd_threshold for similar block size The rd_thresholds are adaptively changed based on best mode tested. It was only changed for the same block size, this commit makes the adaptation for similar block sizes too. The commit also made minor adjustment and code cleanups. The impact on encoding time for _ped: 118089 ms -> 111927 ms The impact on compression: derf: -0.339% stdhd: -0.303% Change-Id: I8817fed1102350497f2ec631849e43f753878e5d	2014-09-23 16:10:59 -07:00
Yaowu Xu	56032b471d	Fix an IOC Change-Id: I0ca6746696d81657c035b0f6523c9af370da3c95	2014-09-23 16:07:22 -07:00
Yaowu Xu	7feede9869	Merge "Remove code duplication"	2014-09-22 17:13:59 -07:00
Yaowu Xu	052bc8ea6a	Merge "Simplify rd_pick_intra_sby_mode()"	2014-09-22 17:13:55 -07:00
Yaowu Xu	c7ab18fe56	Remove code duplication Change-Id: I453b3e0d946951665d5919248445fc4f3222d2ad	2014-09-22 15:22:51 -07:00
Yaowu Xu	f46326c7a2	Simplify rd_pick_intra_sby_mode() Change-Id: Ifb0915c94c2db48827ddbd446314cb6e3155b99c	2014-09-22 14:58:51 -07:00
Jingning Han	f7023ea014	Remove unnecessary local variable declaration This commit removes a repetitive local variable declaration in vp9_rd_pick_inter_mode_sb. Change-Id: I1b0afa98ff1ecbfb46e17d3d1cee95d32c4309db	2014-09-22 09:29:28 -07:00
Jingning Han	eee904c9b9	Adaptive mode search scheduling This commit enables an adaptive mode search order scheduling scheme in the rate-distortion optimization. It changes the compression performance by -0.433% and -0.420% for derf and stdhd respectively. It provides speed improvement for speed 3: bus CIF 1000 kbps 24590 b/f, 35.513 dB, 7864 ms -> 24696 b/f, 35.491 dB, 7408 ms (6% speed-up) stockholm 720p 1000 kbps 8983 b/f, 35.078 dB, 65698 ms -> 8962 b/f, 35.054 dB, 60298 ms (8%) old_town_cross 720p 1000 kbps 11804 b/f, 35.666 dB, 62492 ms -> 11778 b/f, 35.609 dB, 56040 ms (10%) blue_sky 1080p 1500 kbps 57173 b/f, 36.179 dB, 77879 ms -> 57199 b/f, 36.131 dB, 69821 ms (10%) pedestrian_area 1080p 2000 kbps 74241 b/f, 41.105 dB, 144031 ms -> 74271 b/f, 41.091 dB, 133614 ms (8%) Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e	2014-09-22 09:28:16 -07:00
hkuang	c70cea97ac	Remove mi_grid_* structures. mi_grid_* are arrays of pointer to pointer. They save the pointers that point to the MIs in cm->mi. But they are unnecessary and complicated. The original goal was to remove MODE_INFO_t copy. But with an extra MODE_INFO_t pointer inside MODE_INFO_t, same goal could be achieved. This commit totally removes the mi_grid_* structures. But there are still many dummy MODE_INFO_t inside cm->mi which are a waste of memory. Next commit will do on-demand MODE_INFO_t allocation in order to save these memories. Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6	2014-09-19 21:27:11 -07:00
Deb Mukherjee	5cd0aab81a	Adds high bitdepth quantization functions Adds various high bitdepth quantization functions. Change-Id: I36fc0bf75a1bd15128ed271df8723de0ac134b0c	2014-09-16 14:55:37 -07:00
Jingning Han	ffaebfc7b4	Merge "Add ARF validation for compound inter mode check"	2014-09-15 21:26:37 -07:00
Jingning Han	c50256c157	Merge "Remove redundant reference frame check in sub8x8 RD search"	2014-09-15 21:26:11 -07:00
Jingning Han	fe96932c69	Merge "Replace best_ref_index table fetch with best_mbmode"	2014-09-15 21:25:48 -07:00
Yunqing Wang	57eb2a4e83	Merge "Simplify the skip flag cost code"	2014-09-15 18:50:30 -07:00
Yunqing Wang	c60ef810a1	Merge "Set the skip flag to 1 for skippable blocks"	2014-09-15 18:50:19 -07:00
Yunqing Wang	200ec69abb	Simplify the skip flag cost code Code refactoring. Change-Id: Idad53cb80497d13551a142a642f7529fc305b0bc	2014-09-15 17:11:16 -07:00
Yunqing Wang	46aed7b8d0	Set the skip flag to 1 for skippable blocks If the partition block is skippable, which means no coefficients for Y, U, and V planes, its skip flag is set to 1. No quality change (verified by borg tests), and no noticeable speed change. Change-Id: I9231f720f8dd6364384cf05aa148ca24d75450f1	2014-09-15 16:50:19 -07:00
Jingning Han	f897dd5f09	Merge "Fix format in vp9_rd_pick_inter_mode_sub8x8"	2014-09-15 15:34:22 -07:00
Jingning Han	f1581b3b2e	Add ARF validation for compound inter mode check This commit enforces ARF validation check for compound inter modes. It avoids potential access to ARF in the encoding process if it is not allowed. Change-Id: I055fec946b5d19d97937dc9001e1e564923e2439	2014-09-15 12:20:57 -07:00
Jingning Han	252822e81c	Remove redundant reference frame check in sub8x8 RD search The valid reference frame check in sub8x8 rate-distortion optimization search has been included in the ref_frame_skip_mask scheme. This commit removes the later further validation checks that are not in effect. Change-Id: I853b477c44037d3dc0afec6cbfce08a96c597a75	2014-09-15 12:20:04 -07:00
Jingning Han	cc00eea676	Replace best_ref_index table fetch with best_mbmode This commit replaces the best_ref_index table fetch with the use of best_mbmode in vp9_rd_pick_inter_mode_sub8x8. Change-Id: I882ee9ee6a8c0e61befcca1f4dba6d2ea8de8f13	2014-09-15 09:59:20 -07:00
Jingning Han	73805bfa70	Fix format in vp9_rd_pick_inter_mode_sub8x8 Change-Id: I9b6a74bdf003b39235f14f8b5b7f3b861f6bf131	2014-09-15 09:44:09 -07:00
Jingning Han	59dd83a3ea	Merge "Refactor reference frame control in sub8x8 block RD search"	2014-09-13 10:43:36 -07:00
Jingning Han	e6d927343e	Merge "Format fixes in vp9_rd_pick_inter_mode_sb"	2014-09-13 10:43:24 -07:00
Jingning Han	ad3c92b9b7	Merge "Remove unused best_inter_rd variable"	2014-09-13 10:43:14 -07:00
Jingning Han	f02e0b6cf6	Merge "Remove unused speed feature"	2014-09-13 10:43:03 -07:00
Jingning Han	adb20849b6	Refactor reference frame control in sub8x8 block RD search This commit unifies the reference frame control in the rate- distortion optimization search loop of sub8x8 block size to remove the control dependency on mode search order. Change-Id: I3a174099f71a7cc176ede9fd60e2374243ae9232	2014-09-12 11:03:03 -07:00
Jingning Han	7f77a1c3c9	Merge "Unify intra mode mask into mode_skip_mask scheme"	2014-09-12 09:06:35 -07:00
Deb Mukherjee	10783d4f3a	Adds high bitdepth transform functions and tests Adds various high bitdepth transform functions and tests. Much of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform coefficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map to int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8	2014-09-11 19:56:33 -07:00
Jingning Han	74ddde01c0	Format fixes in vp9_rd_pick_inter_mode_sb Change-Id: Ie45687405dcaa34ba465dce2aa14f76017d3a794	2014-09-11 17:15:15 -07:00
Jingning Han	8e3f7a52a1	Remove unused best_inter_rd variable The variable best_inter_rd is effectively not in use in the rate- distortion mode search loops of both regular block sizes and sub8x8 block sizes. Change-Id: I178f909f8c9629772e13adc6257908653b2adf31	2014-09-11 16:16:26 -07:00
Jingning Han	00fe92c22f	Remove unused speed feature The speed feature that skips compound inter prediction modes was subsumed by other speed features and effectively was not in use. This commit removes it. Change-Id: I22b0c71a8ddd15d93b25d86fa63a1dce2ba6a1a9	2014-09-11 15:54:53 -07:00
Jingning Han	bdd8eb6fcc	Unify intra mode mask into mode_skip_mask scheme Integrate intra mode mask speed feature with the mode_skip_mask scheme. Move it outside the mode search loop in the vp9_rd_pick_inter_mode_sb function. Change-Id: I7738fea749bfdc08ad05d7f2524feb8ff67568d9	2014-09-11 10:36:48 -07:00
Jingning Han	8cefed1568	Remove inter_mode_mask from rate-distortion search loop This speed feature is used in real-time setting only. Remove the related condition check in the rate-distortion optimization search loop. Change-Id: Iaacc1e268214634e6f95c5048c28a60cec6c42fc	2014-09-11 10:18:55 -07:00
Jingning Han	238b2ace86	Move intra block size skip outside mode search loop Unify this speed feature in the ref_frame_skip_mask scheme. Change-Id: I7ea5646da02d3ea643680c22d50dabd448d55a27	2014-09-11 09:54:19 -07:00
Jingning Han	8b06a24ce7	Fix format in vp9_rd_pick_inter_mode_sub8x8 Change-Id: I0da29c858c6c1eb5ef07cee8f599329f5a002da9	2014-09-11 09:28:47 -07:00
Jingning Han	8d42fad9c1	Move overlay frame speed feature setting out of mode search loop Refactor overlay frame speed-up related function. Make it unified with the ref_frame_skip_mask system and move it out of the mode search loop. Change-Id: I0dde9baf44354f6ba00b4679cba02fa6a30c7316	2014-09-10 19:44:58 -07:00
Jingning Han	f9f0879756	Refactor to remove speed feature dependency on mode search order This commit refactors the rate-distortion optimization search for regular block sizes to remove the speed feature dependency on mode search order. Change-Id: Ied033ee484c2957e17baa7b6450b720fe7dd0e7d	2014-09-10 17:09:14 -07:00
Jingning Han	68d79146ea	Fix a bug in vp9_rd_pick_inter_mode_sb This commit fixes a bug related to skipping intra mode checking, by using a separate variable to store the best prediction error from inter mode. It avoids unintentionally overwriting intra mode rate-distortion cost, and hence affecting other speed features. Change-Id: I99e12993339c84c8b4f597996b372012e5858fae	2014-09-09 15:39:54 -07:00
Jingning Han	9a9e2aef09	Remove redundant ref frame pointer assignment Assigning selected reference frame pointer is done in the encode_superblock function. No need to do this at the end of rate-distortion optimization search. Change-Id: I33fcede0fd304b4a4c4deef2d126d79546a9c070	2014-09-09 15:15:11 -07:00
Jingning Han	33593d1f03	Remove dependency of intra mode search skip check on mode order This commit refactors the vp9_rd_pick_inter_mode_sb function to remove the intra mode early termination dependency on the mode search order. Change-Id: If6ac49aa7c530c7b9a5bd31b0ab84db83e192bec	2014-09-09 12:30:47 -07:00
Jingning Han	d96228a07c	Replace best_mode_index table retrieve with fetching best_mbmode This commit allows the encoder to find current best prediction mode state using best_mbmode, instead of fetching from the static mode search table via best_mode_index. Change-Id: Ibefeab83aed33a49c2be03e83f09153856ca4271	2014-09-09 11:58:10 -07:00
Jingning Han	a61973bf29	Merge "Enable adaptive motion search for ARF coding"	2014-09-08 08:51:05 -07:00
Yunqing Wang	1dd9a63929	Correct the mode decisions in special cases The rate costs calculated for inter modes are not precise in some cases, which causes NEWMV to be chosen instead of NEARESTMV, NEARMV, and ZEROMV. This patch added checks for these cases, and corrected the mode decisions. Borg tests at speed 3 showed: 1. stdhd set: 0.102% PSNR gain and 0.088% SSIM gain. 2. derf set: 0.147% PSNR gain and 0.132% SSIM gain. No speed change. Change-Id: I35d17684b89ad4734fb610942d707899146426db	2014-09-05 12:01:07 -07:00
Jingning Han	d435148fe6	Enable adaptive motion search for ARF coding This commit turns on adaptive motion search for ARF coding, in addition to other normal inter frame coding. It improves the average compression efficiency: stdhd 0.1% derf 0.04% For the test sequences, the speed 3 runtime is reduced: pedestrian 1080p 2000 kbps, 149932 ms -> 144580 ms, (3.3% speed-up) bus CIF 1000 kbps, 8050 ms -> 7895 ms, (1.9%) highway CIF 100 kbps, 45033 ms -> 44078 ms, (2.2%) Change-Id: I5228565b609f99e8ae04f6140a2bf2b64a831d21	2014-09-04 16:26:40 -07:00


1367 Commits