generic-library/vpx

Author	SHA1	Message	Date
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Dmitry Kovalev	0ac5e4dd58	Adding write_compressed_header function. Change-Id: Ic5257fa8278e9b6297de230e4fd26a1e23ad2bb7	2013-07-10 15:08:34 -07:00
Jim Bankoski	68ef7a6b8a	configure with internal stats not working Change-Id: I5dea4570cb05df27a522abf6e7b695998654284a	2013-07-10 15:07:53 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Jingning Han	114423538f	SSE2 16x16 ADST/DCT hybrid transform This commit enables 16x16 ADST/DCT forward hybrid transform using SSE2 operations. It reduces the runtime from 5433 cycles to 1621 cycles, at no compression performance loss. Change-Id: I75fd7f1984e9e28846af459f810ff0d6ae125230	2013-07-10 12:14:53 -07:00
Dmitry Kovalev	417df1d42e	Merge "Adding encode_tiles function to vp9_bitstream.c."	2013-07-10 11:43:50 -07:00
Yaowu Xu	e52eec490c	Merge "Add a feature to reduce chrome intra mode search"	2013-07-10 11:35:47 -07:00
Ronald S. Bultje	b1df674a99	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83	2013-07-10 09:27:56 -07:00
Yaowu Xu	bed27a960a	Add a feature to reduce chrome intra mode search Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8	2013-07-10 08:59:18 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
Yaowu Xu	059f2929e9	Merge "Revert "Remove memcpy() in handle_inter_mode() filter selection.""	2013-07-09 20:10:06 -07:00
Yaowu Xu	205efbc153	Revert "Remove memcpy() in handle_inter_mode() filter selection." This reverts commit fcf7998a47f7e1ec27fe93f99e488d345560a9be. Change-Id: Ic6532223faec9f1483b78adb2e37b79c7b1a0efb	2013-07-09 17:42:10 -07:00
Dmitry Kovalev	d82f459d1a	Adding encode_tiles function to vp9_bitstream.c. Change-Id: Ie44824ec25fd8fdb25d7c8124a9b28c26d802029	2013-07-09 15:59:19 -07:00
John Koleszar	f0d9f10d24	Remove all asm offset files from VP9 The files are empty and unused. Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a	2013-07-09 14:26:53 -07:00
Ronald S. Bultje	204d1b7058	Merge "Unbreak lossless."	2013-07-09 09:54:48 -07:00
Ronald S. Bultje	d8fa5d45cc	Merge "Make intra prediction pointers RTCD-based."	2013-07-09 09:54:43 -07:00
Ronald S. Bultje	059c0ba5d4	Unbreak lossless. Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1	2013-07-09 09:46:37 -07:00
Dmitry Kovalev	c6c279aff0	Merge "Using mi_cols instead of mb_cols."	2013-07-08 20:09:19 -07:00
Dmitry Kovalev	1c65c580d6	Merge "Refactoring setup_pre_planes function."	2013-07-08 20:08:05 -07:00
Dmitry Kovalev	6254c8d780	Merge "Calling set_partition_seg_context() instead of code duplication."	2013-07-08 20:07:06 -07:00
Ronald S. Bultje	8350e7fe38	Make intra prediction pointers RTCD-based. This probably has a mildly negative impact on performance, but will (in future commits - or possibly merged with this one) allow SIMD implementations of individual intra prediction functions. We may perhaps want to consider having separate functions per txfm-size also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for each intra prediction mode), but I haven't played much with that yet. Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269	2013-07-08 17:25:51 -07:00
Ronald S. Bultje	a5062cc635	Don't call encode_sb() for the final of 4-split subpartitions. The resulting reconstruction is never used, thus it just wastes CPU cycles. Reduces encode time of first 50 frames of bus (speed 0) @ 1500kbps from 2min2.0 to 2min1.2, i.e. a 0.65% overall speedup. Change-Id: I74755ca3aadc21e2be220f486259060bd4088c45	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	8fde07a3ae	Don't recalculate mv_ref costs for each block/partition. Changes cost_mv_ref() into doing a LUT into pre-calculated cost arrays instead. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall. Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	5a73254918	Remove unnecessary memset(best_index, 0) from trellis/optimize. First 50 frames of bus @ 1500kbps (speed 0) goes from 2min12.6 to 2min11.6, i.e. 0.75% overall speedup. Change-Id: I67054f8146e82a02b6457c51a1c8627a937e5e1e	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	fcf7998a47	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	ed995afba1	Make frame-wide filter-type decision fully RD-based. Overall, on all test sets, this gains about +0.2% on all metrics. City is a clip where this really hurts (-1.0% on all metrics), I'm not quite sure why yet. Maybe interesting to look into in the future. Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78	2013-07-08 16:22:37 -07:00
Dmitry Kovalev	b7559258a4	Using mi_cols instead of mb_cols. Eliminating usage of mb-units, switching to mi-units. Adding ALIGN_POWER_OF_TWO macro. Change-Id: I2491c969f713207c062011878b57e4e531818607	2013-07-08 14:54:04 -07:00
Deb Mukherjee	d9b62160a0	Implements several heuristics to prune mode search Skips mode searches for intra and compound inter modes depending on the best mode so far and the reference frames. The various heuristics to be used are selected by bits from a flag. The previous direction based intra mode search pruning is also absorbed in this framework. Specifically the flags and their impact are: 1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique directional modes and TM_PRED if the best so far is an inter mode) derfraw300: -0.15%, 10% speedup 2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153 mode search if the best so far is not one of the closest hor/vert/diagonal directions) derfraw300: -0.05%, about 9% speedup 3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode search if the best so far is an intra mode) derfraw300: -0.06%, about 7-8% speedup 4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search if the best single ref inter mode does not have the same ref as one of the two references being tested in the compound mode) derfraw300: -0.56%, about 10% speedup Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495	2013-07-08 12:17:12 -07:00
Jingning Han	a38cf2658a	Merge "Refactor SSE2 8x8 functional units"	2013-07-05 11:18:18 -07:00
Paul Wilkins	ef0ca2deaa	Merge "Fix to comp_inter_joint_search_thresh feature."	2013-07-04 03:27:00 -07:00
Dmitry Kovalev	f72e072555	Refactoring setup_pre_planes function. Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63	2013-07-03 17:42:01 -07:00
Dmitry Kovalev	2ce6b23473	Merge "Adding write_skip_coeff function."	2013-07-03 16:33:58 -07:00
Jingning Han	68172dbede	Merge "Enable early termination in rd search"	2013-07-03 14:20:41 -07:00
Dmitry Kovalev	430bd0c94a	Merge "Replacing 64 / MI_SIZE with MI_BLOCK_SIZE."	2013-07-03 14:16:02 -07:00
Dmitry Kovalev	dda1835dc6	Adding write_skip_coeff function. Change-Id: I221126f22ab9067348eb0efb8a73b15a8f49c3fd	2013-07-03 13:23:47 -07:00
Jingning Han	2bd6fe08f8	Enable early termination in rd search This commit allows the encoder to track the cumulative rate-distortion cost per transformed block inside a partition. If the cumulative rd cost is already above the best rd value, it terminates the remaining operations and continues to the next prediction mode test. It reduces the runtime of bus at target bit-rate 2000 from 308 seconds to 266 seconds, i.e., about 13% speed-up at no performance penalty. Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a	2013-07-03 12:54:18 -07:00
Dmitry Kovalev	2ad62c9312	Calling set_partition_seg_context() instead of code duplication. Change-Id: I65be6acc54c99688fd1f0c946cec3511514b8555	2013-07-03 11:15:58 -07:00
Dmitry Kovalev	5a21de8418	Replacing 64 / MI_SIZE with MI_BLOCK_SIZE. Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c	2013-07-03 10:54:50 -07:00
Dmitry Kovalev	60198a595d	Merge "Adding write_selected_txfm_size function."	2013-07-03 10:33:55 -07:00
Jingning Han	2cb75c9607	Refactor SSE2 8x8 functional units These serve as building blocks for SSE2 8x8 and 16x16 ADST/DCT hybrid transform coding. Change-Id: I4089a754c66e0c986f67d9b8ec4dfb9627ad430d	2013-07-03 10:11:59 -07:00
Ronald S. Bultje	61fe678f36	Merge "Use pmovmskb to skip quantize loops over empty coefficients."	2013-07-03 09:05:48 -07:00
Paul Wilkins	f58b44ad62	Fix to comp_inter_joint_search_thresh feature. When this is 0 (BLOCK_SIZE_AB4X4) we want to do the inter joint search for all sizes. Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88	2013-07-03 16:58:34 +01:00
Paul Wilkins	72c5778ec5	Added two new skip experiments. sf->unused_mode_skip_lvl. Tests modes as normal for all sizes at or below the given level. At larger sizes it skips all modes that were not chosen at any smaller size. Hence setting BLOCK_SIZE_SB64X64 is in effect off. Setting BLOCK_SIZE_AB4X4 will only consider modes that were chosen for one or more 4x4 blocks at larger sizes. sf->reference_masking. Do a test encode of the NONE partition at one size and create a reference frame mask based on the best rd choice. In the full search only allow this reference frame. Currently it is testing 64x64 and repeats this in the full search. This does not work well with Jim's Partition code just now and is disabled by default. Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd	2013-07-03 16:56:06 +01:00
Paul Wilkins	b0a2871c35	Merge "Adjust Speed 0 settings."	2013-07-03 02:47:18 -07:00
Dmitry Kovalev	1f6e95e76a	Merge "Removing redundant struct from union b_mode_info."	2013-07-02 18:09:31 -07:00
Dmitry Kovalev	be77f6bbbf	Removing redundant struct from union b_mode_info. Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83	2013-07-02 16:51:57 -07:00
Dmitry Kovalev	edb060a77c	Adding write_selected_txfm_size function. Change-Id: I143b430b7c24a964ccd0ebb75944cf317a072214	2013-07-02 16:41:22 -07:00
Yaowu Xu	0d7b7c09cb	Added a speed feature use_square_partition_only This commit adds a speed feature where only square partitions are evaluated in partition picking. Enabling this feature in cpu-used 2 reduces encoding time by ~30%. loss of compression: -0.9% on cif set -1.23% on stdhd Change-Id: Ia6fad11210f0b78365abb889f9245604513be5b9	2013-07-02 16:40:15 -07:00
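
Note on commit 2bd6fe08f8 ("Enable early termination in rd search"): the breakout it describes can be illustrated roughly as follows. This is only a minimal sketch of the general idea, not the code from the commit. The function name, its parameters, and the use of INT64_MAX as an "abandoned" marker are all hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the vpx sources.
 * Sums per-transform-block rd costs for one candidate mode and gives up
 * as soon as the running total exceeds the best rd cost found so far.
 * Returns the total cost, or INT64_MAX if the mode was abandoned early.
 * *blocks_visited receives the number of blocks actually examined. */
static int64_t accumulate_rd_with_breakout(const int64_t *block_rd,
                                           size_t num_blocks,
                                           int64_t best_rd,
                                           size_t *blocks_visited) {
  int64_t total = 0;
  size_t i;

  for (i = 0; i < num_blocks; ++i) {
    total += block_rd[i];
    if (total > best_rd) {
      /* This mode can no longer beat the current best: skip the rest. */
      *blocks_visited = i + 1;
      return INT64_MAX;
    }
  }

  *blocks_visited = num_blocks;
  return total;
}
```

The saving comes from the blocks that are never examined once the running total passes best_rd; a mode that would have lost anyway is dropped without paying for the remaining per-block work.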

1412 Commits