generic-library/vpx

Author	SHA1	Message	Date
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
Jingning Han	3e0793b80b	Merge "Fix PICK_MODE_CONTEXT index in non-RD coding mode"	2014-12-12 09:16:01 -08:00
Jingning Han	e2c2a65695	Fix PICK_MODE_CONTEXT index in non-RD coding mode This commit fixes a bug in the PICK_MODE_CONTEXT index for horizontal partition case. The compression performance change is less than 0.01% level, since most blocks are selected to use square block size in RTC coding mode. Change-Id: I67effc18ae8795fccdd82a55f4efc609fa5cb3e1	2014-12-11 17:21:24 -08:00
Jingning Han	811c74cdfa	Merge "Replace division with bit shift in choose_partitioning"	2014-12-11 13:30:03 -08:00
Jingning Han	d9892e846f	Merge "Refactor choose_partitioning computing scheme"	2014-12-11 11:14:07 -08:00
Jingning Han	d5c396a902	Replace division with bit shift in choose_partitioning This commit explicitly uses the bit shift operation instead of division for computing block variance. Change-Id: Id19c0ff27dd1d1ae4aceee6657e1aad0d406bd74	2014-12-11 11:06:57 -08:00
Jingning Han	377d2f027a	Refactor choose_partitioning computing scheme This commit refactors the choose_partitioning function. It removes redundant memset calls and makes the encoder to calculate variance value per block only when it is needed. It reduces the average runtime cost of choose_partitioning by 60%. Overall it reduces speed -6 runtime by 2-5%. Change-Id: I951922c50d901d0fff77a3bafc45992179bacef9	2014-12-11 09:33:40 -08:00
Paul Wilkins	65cfb808d0	Merge "Substantial restructuring of AQ mode 2."	2014-12-10 10:44:27 -08:00
Jingning Han	e728678c50	Refactor update_state_rt Update the frame motion vector only if previous frame motion vector is needed for next frame reference motion vector. Change-Id: Ica50f9d7b46ad4f815bba0d9e30f5546df29546f	2014-12-09 15:35:49 -08:00
Jingning Han	225cdef665	Make RTC coding flow support sub8x8 in key frame coding This commit enables the use of sub8x8 blocks in RTC key frame encoding. It requires the block size to be preset and will decide the coding mode and encode the bit-stream. Change-Id: I35aaf8ee2d4d6085432410c7963f339f85a2c19b	2014-12-09 11:34:58 -08:00
Jingning Han	4bacaab46d	Cosmetic naming change Rename set_modeinfo_offsets as set_mode_info_offsets, to be more consistent with naming convention. Change-Id: I68ca1f36c4a78127d9439a50c1506a2afd07927d	2014-12-09 10:32:04 -08:00
Jingning Han	f051a7beab	Take out redundant setting of mode_info from set_block_size The later encoding process will take the top-left block's mode_info for pre-determined block size. Change-Id: I76a90f9ce7f3b2dbc2975b52442114e461c465b5	2014-12-09 10:27:18 -08:00
Paul Wilkins	e68c8dcfd2	Substantial restructuring of AQ mode 2. The restructure moves the decision into the rd pick modes loop and makes a decision based at the 16x16 block level instead of only the 64x64 level. This gives finer granularity and better visual results on the clips I have tested. Metrics results are worse than the old AQ2 especially for PSNR and this mode now falls between AQ0 and AQ1 in terms of visual impact and metrics results. Further tuning of this to follow. It should be noted that if there are multiple iterations of the recode loop the segment for a MB could change in each loop if the previous loop causes a change in the complexity / variance bin of the block. Also where a block gets a delta Q this will alter the rd multiplier for this block in subsequent recode iterations and frames where the segmentation is applied. Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7	2014-12-09 15:10:52 +00:00
Jingning Han	1395ded2a7	Remove unused rd cost calculation from nonrd_use_partition The per block rd cost calculation is not needed when partition size is preset. Change-Id: Ie5575248bbffb584e908aa13097f697ace6ec747	2014-12-08 18:45:19 -08:00
James Zern	616b3a810f	vp9 asserts: fix compile warning string literal to int within an assert Change-Id: I76a173f96b9add5bf27c3f5ad5d72c6f30e51629	2014-12-05 16:20:42 -08:00
hkuang	eaa6deee5b	Merge "Merge set_prev_mi function into encoder function."	2014-12-05 15:12:50 -08:00
Jingning Han	6ae829088f	Merge "Remove redundant vp9_zero in choose_partitioning"	2014-12-05 11:47:58 -08:00
Jingning Han	62c7356098	Merge "Use hybrid RD and non-RD coding flow for key frame coding"	2014-12-05 11:25:19 -08:00
Jingning Han	9d88b30854	Remove redundant vp9_zero in choose_partitioning It makes the overall speed -6 about 2% faster with no compression performance change. Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a	2014-12-05 10:39:39 -08:00
Jingning Han	07711e9b27	Use hybrid RD and non-RD coding flow for key frame coding When block size is below 16x16, the encoder swap from non-RD to RD mode for key frame coding. This largely brough back the key frame compression performance. For vidyo1 at 1000 kbps, the key frame coding statistics are changed 9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us As compared to the full RD case 7187F, 34.930 dB, 214470 us The overall rtc set coding performance (single key frame setting) is improved by 1.5%. Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878	2014-12-05 09:35:27 -08:00
Yunqing Wang	a3a4a34c60	Merge "vp9_ethread: the tile-based multi-threaded encoder"	2014-12-05 08:23:49 -08:00
hkuang	62de07c8c6	Merge set_prev_mi function into encoder function. Change-Id: Ifcf2efbb232ea4cabcdebbe77e0820d121e4a6da	2014-12-04 14:44:23 -08:00
Yunqing Wang	eba9c762a1	vp9_ethread: the tile-based multi-threaded encoder Currently, VP9 supports column-tile encoding, which allows a frame to be encoded in multiple column tiles independently. The number of column tiles are set by encoder option "--tile-columns". This provides a way to encode a frame in parallel. Based on previous set of patches, this patch implemented the tile- based multi-threaded encoder. Each thread processes one or more tiles. Usage: For HD clips: --tile-columns=2 --threads=1/2/3/4 While using 4 threads, tests showed that the encoder achieved 2.3X - 2.5X speedup at good-quality speed 3, and 2X speedup at realtime speed 5. Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4	2014-12-04 11:21:34 -08:00
Jingning Han	17176cd452	Fix indent in source_var_based_partition_search_method Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e	2014-12-03 12:37:36 -08:00
Marco	8fd3f9a2fb	Enable non-rd mode coding on key frame, for speed 6. For key frame at speed 6: enable the non-rd mode selection in speed setting and use the (non-rd) variance_based partition. Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames), mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16. Loss in key frame quality (~0.6-0.7dB) compared to rd coding, but speeds up key frame encoding by at least 6x. Average PNSR/SSIM metrics over RTC clips go down by ~1-2% for speed 6. Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405	2014-12-03 09:18:08 -08:00
Peter de Rivaz	7e40a55ef9	Added high bitdepth sse2 transform functions Also removes some spurious changes in common/vp9_blockd.h which was introduced by a rebase issue between nextgen and master branches. Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282 (cherry picked from commit `005d80cd05`) (cherry picked from commit `08d2f54800`) (cherry picked from commit `4230c2306c`)	2014-12-02 11:16:24 -08:00
Yunqing Wang	0993bef7e9	vp9_ethread: calculate and save the tok starting address for tiles Each tile's tok starting address is calculated before the encoding process. These addresses are stored so that the same calculation won't be done again in packing bit stream. Change-Id: I0a3be0301f002260c19a850303f2f73ebc47aa50	2014-11-25 17:19:35 -08:00
Yunqing Wang	edbd61e136	vp9_ethread: modify VP9_COMP structure This patch modified struct VP9_COMP. Created a struct ThreadData to include data that need to be copied for each thread. In multiple thread case, one thread processes one tile. all threads share one copy of VP9_COMP, (refer to VP9_COMP cpi in the code) but each thread has its own copy of ThreadData, (refer to ThreadData td in the code). Therefore, within the scope of encode_tiles(), both cpi and td need to be passed as function parameters. In single thread case, the FRAME_COUNTS pointer in ThreadData points to "counts" in VP9_COMMON. Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e	2014-11-24 17:57:38 -08:00
Jingning Han	2fbdfd2c66	Key frame non-RD mode decision process This commit makes a non-RD coding mode decision process for key frame coding. It can be optionally turned on in speed -6 and above. Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5	2014-11-24 09:04:28 -08:00
Paul Wilkins	f5209d7e01	Remove rate component adjustment for AQ1 In AQ1 a rate adjustment was applied for blocks coded with a deltaq. This tends to skew the partition selection and cause rate overshoot. For example, consider a 64x64 super block where some but not all sub blocks are in a low q segment and some are in a high q segment. The choice of Q when considering large partition and transform sizes is defined by the lowest sub block segment id (currently this implies the lowest Q). If some parts of the larger partition are very hard this will cause a high rate component. The correct behavior here is for the rd code to discard the large partition choice and break down to sub blocks where some have low and some have high Q. However the rate correction factor above mask the high cost of coding at a larger partition size. Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960	2014-11-21 08:51:58 -08:00
Paul Wilkins	d031237999	Add variance restriction to AQ2. Add an additional restriction to bit/complexity based segmentation based on spatial variance. Only lower Q when both the number of bits spent in the initial encoding pass and the spatial complexity are below a threshold. This will prevent the low Q segments being used just because there is a surfeit of bits. Small metrics gains especially opsnr. derf ~0.2% std-hd ~0.3% Change-Id: I6a8496d466d673f9b0e2b2ca6304ea7b6d8e1cce	2014-11-20 16:23:35 -08:00
Paul Wilkins	6a760d483d	Initial AQ1 restructuring. This is the first of a series of patches to restructure and improve AQ mode 1 (variance based AQ). Change-Id: Idcf693131a3ea2459dcfd957a54a65b971fa4a2a	2014-11-20 15:50:15 -08:00
Yunqing Wang	54ba65a63e	Merge "vp9_ethread: move max/min partition size to mb struct"	2014-11-20 14:00:37 -08:00
Yunqing Wang	ad7586a9e1	vp9_ethread: move max/min partition size to mb struct The max_partition_size and max_partition_size are set at the beginning while setting speed features, and then adjusted at SB level. Moving them to mb struct ensures there is a local copy for each thread. Change-Id: I7dd08dc918d9f772fcd718bbd6533e0787720ad4	2014-11-20 09:24:50 -08:00
Yunqing Wang	70c9d2983b	Revert "vp9_ethread: include a pointer to mb in VP9_COMP" This reverts commit `6906d218dd`. Another way will be used to handle mb struct. Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4	2014-11-20 08:31:12 -08:00
Yunqing Wang	d0b547c676	vp9_ethread: combine encoder counts in separate struct Several frame counters in encoder are updated at SB level. Combine those counters and put them in a separate struct, which allows us to allocate one copy for each thread. Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7	2014-11-14 16:09:22 -08:00
Yunqing Wang	6906d218dd	vp9_ethread: include a pointer to mb in VP9_COMP Modified VP9_COMP struct to include MACROBLOCK *mb. This change makes it feasible in multi-thread case to allocate a mb for each thread. Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e	2014-11-14 12:31:06 -08:00
Yunqing Wang	807885b5e0	Merge "vp9_ethread: modify the cyclic refresh struct"	2014-11-13 18:35:01 -08:00
Yunqing Wang	8ee605f188	vp9_ethread: modify the cyclic refresh struct Two members in struct CYCLIC_REFRESH int64_t projected_rate_sb; int64_t projected_dist_sb; are updated at the superblock level, which makes them shared data in the multi-thread situation, and requires extra work to handle them. However, those values are updated and used immediately, and therefore can be removed. This patch cleaned up the code and removed the two members. Change-Id: I2c6ee4552bf49fb63ce590cdb47f9723974fffb1	2014-11-13 15:05:46 -08:00
Jingning Han	6efafda738	Merge "Refactor nonrd_use_partition coding process"	2014-11-13 13:58:21 -08:00
Jingning Han	754b05a4de	Refactor nonrd_use_partition coding process This commit integrates the non-RD mode decision process and the encoding process into a single recursion scheme. Change-Id: I6a7e72a0b84d567554801ebbe01ec75d54c1f77d	2014-11-06 17:00:48 -08:00
Jingning Han	10da059b52	Remove unused is_background function Change-Id: Ia540eac5f066ae95280c2f898370eddf0110c279	2014-11-05 21:19:23 -08:00
Jingning Han	caaf63b2c4	Rework cut-off decisions in cyclic refresh aq mode This commit removes the cyclic aq mode dependency on in_static_area and reworks the corresponding cut-off thresholds. It improves the compression performance of speed -5 by 1.47% in PSNR and 2.07% in SSIM, and the compression performance of speed -6 by 3.10% in PSNR and 5.25% in SSIM. Speed wise, about 1% faster in both settings at high bit-rates. Change-Id: I1ffc775afdc047964448d9dff5751491ba4ff4a9	2014-11-05 21:17:09 -08:00
hkuang	55577431ae	Bind motion vectors with frame buffer structure. This will save a lot of memory for decoder due to removing of prev_mi, but prev_mi is still needed in encoder. So this will increase a little bit memory for encoder. Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed	2014-10-31 17:01:08 -07:00
Jingning Han	1cffea9fb7	Merge "Rework pred pixel buffer system in non-RD coding mode"	2014-10-31 08:55:24 -07:00
Jingning Han	7bea8c59f9	Rework pred pixel buffer system in non-RD coding mode This commit makes the inter prediction buffer system to support hybrid partition search. It reduces the runtime of speed -5 by about 3%. No compression performance change. vidyo1 720p 1000 kbps 11831 ms -> 11497 ms nik 720p 1000 kbps 10919 ms -> 10645 ms Change-Id: I5b2da747c6395c253cd074d3907f5402e1840c36	2014-10-30 11:08:35 -07:00
Jingning Han	07436abb86	Use zero motion vector in choose_partitioning The zero motion vector was effectively used in the subsampled pixel based variance calculation. This commit makes it directly use zero mv to generate prediction. Change-Id: Ica83dc843e9f8da2f89c3ef451e50f16214c0def	2014-10-27 19:38:43 -07:00
Jingning Han	d56b3eb0cf	Refactor encoder tile data structure Make the common tile info as one element in the encoder tile data struct. Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be	2014-10-27 19:37:13 -07:00
Jingning Han	192010d218	Refactor rtc coding mode to support tile encoding Use per tile threshold in the prediction mode search process. Change-Id: I6c74ee5a3b069bb4281002dfe51310911a0756c0	2014-10-27 09:53:46 -07:00
Jingning Han	eee201c221	Tile based adaptive mode search in RD loop Make the spatially adaptive mode search in rate-distortion optimization loop inter tile independent. Experiments suggest that this does not significantly change the coding staticstics. Single tile, speed 3: pedestrian_area 1080p 1500 kbps 59192 b/f, 40.611 dB, 101689 ms blue_sky 1080p 1500 kbps 58505 b/f, 36.347 dB, 62458 ms mobile_cal 720p 1000 kbps 13335 b/f, 35.646 dB, 45655 ms as compared to 4 column tiles, speed 3: pedestrian_area 1080p 1500 kbps 59329 b/f, 40.597 dB, 101917 ms blue_sky 1080p 1500 kbps 58712 b/f, 36.320 dB, 62693 ms mobile_cal 720p 1000 kbps 13191 b/f, 35.485 dB, 45319 ms Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325	2014-10-24 10:00:27 -07:00

1 2 3 4 5 ...

972 Commits