generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	b775081283	Merge "Removing experimental code from vp9_entropymv.c."	2013-07-17 14:43:45 -07:00
hkuang	bd6ce7128c	Remove unnecessary buffer copy in idct4x4. Change-Id: I386066b9bcfb4bffb582e6827af36ca0181f6a83	2013-07-17 14:20:56 -07:00
Dmitry Kovalev	8452c34551	Removing experimental code from vp9_entropymv.c. Change-Id: I340d06e3bc32c78358654496503cccd4196cbe2e	2013-07-17 10:25:09 -07:00
Johann	9ca66ec050	Merge "vp9_convolve8_neon placeholder"	2013-07-17 10:09:00 -07:00
Johann	59dc4e9cdd	vp9_convolve8_neon placeholder Call the individually optimized horizontal and vertical functions. This implementation abuses the temp buffer. This will be replaced with a custom optimized function. Over 2x speedup. Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd	2013-07-17 08:39:27 -07:00
Paul Wilkins	5f4722c75f	Merge "Minor cleanup in code to fine uv tx_size."	2013-07-17 02:50:09 -07:00
Dmitry Kovalev	6638b6f63f	Merge "Removing MV_GROUP_UPDATE define and corresponding code."	2013-07-16 21:09:00 -07:00
Dmitry Kovalev	41ae3d02d4	Removing two unused arguments from vp9_inc_mv signature. Change-Id: Ieffea49eb7a5e5092f21f8694c546aff69b07c6d	2013-07-16 17:01:08 -07:00
Dmitry Kovalev	5b65a71cdc	Changing signature of vp9_get_pred_probs_tx_size. Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c	2013-07-16 16:34:54 -07:00
Dmitry Kovalev	f53d007b9e	Merge "Loop filter code cleanup."	2013-07-16 15:55:17 -07:00
Dmitry Kovalev	3997da0d35	Removing MV_GROUP_UPDATE define and corresponding code. Change-Id: I4884cdc2557d25d50c7c4f7e19b1ad8bdb93cd63	2013-07-16 15:03:00 -07:00
Dmitry Kovalev	9482a0bf10	Cleaning up tile code. Removing tile_rows and tile_columns from VP9Common, removing redundant constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of vp9_get_tile_n_bits. Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267	2013-07-16 14:47:15 -07:00
Dmitry Kovalev	2de3c8d29b	Loop filter code cleanup. Cosmetic code changes, renaming 'flat' local var to 'mask', removing unused field 'blim' from loopfilter_info_n and loop_filter_info structs. Change-Id: I51e6ccf727fe361ad9a08e29e1201aa7abd4987f	2013-07-16 14:39:31 -07:00
James Zern	98e132bde0	Merge changes I40454d26,I892e76d5,I865ab3f9,I4a4bec17,I61c4351e,I37eb3559,I1031c556,I8c8f1f42 * changes: delete vp9_loopfilter_sse2.asm vp9_loopfilter_intrin_sse2: cosmetics: fix indent delete x86/vp9_loopfilter_x86.h vp9_loopfilter_intrin_sse2: make some funcs static vp9_loopfilter_intrin_sse2: remove unused uv funcs vp9_loopfilter: remove uv function typedef filter_block_plane: reuse some constants vp9_loopfilter.c: make some functions static	2013-07-16 14:25:32 -07:00
James Zern	39ce4b13d5	Merge "use consistent framerate naming"	2013-07-16 14:22:52 -07:00
James Zern	9581eb6e8a	use consistent framerate naming s/frame_rate/framerate/g Change-Id: I6fc3e088e419c5f46e3a9390dd8a2cad2677a2fc	2013-07-16 14:12:47 -07:00
Jingning Han	5e8e2bf48e	Merge "SSE2 16x16 inverse ADST/DCT hybrid transform"	2013-07-16 14:04:04 -07:00
Dmitry Kovalev	5de96b3ce6	Merge "Rewriting vp9_set_pred_flag_{seg_id, mbskip}."	2013-07-16 13:34:42 -07:00
Dmitry Kovalev	85a0d8e85c	Merge "Moving vp9_kf_default_bmode_probs to vp9_entropymode.c."	2013-07-16 13:26:53 -07:00
James Zern	50015f6eba	delete vp9_loopfilter_sse2.asm sse2 functions are provided by vp9_loopfilter_intrin_sse2.c Change-Id: I40454d26034e3ef915eeaf889937fe7d1b519b9b	2013-07-16 13:09:16 -07:00
James Zern	8f4787a383	vp9_loopfilter_intrin_sse2: cosmetics: fix indent Change-Id: I892e76d5ad1443b2ea0d1a7839fe26afe9c68ffb	2013-07-16 13:09:16 -07:00
James Zern	af58254267	delete x86/vp9_loopfilter_x86.h also remove prototype_loopfilter{,_block} defines from vp9_loopfilter.h Change-Id: I865ab3f9436c7b1ca166f76630328abf01389405	2013-07-16 13:09:05 -07:00
James Zern	5baa416b6c	Merge "vp9: remove frames_{since,till}.. from MACROBLOCKD"	2013-07-16 13:00:14 -07:00
Jingning Han	d05f66aa10	SSE2 16x16 inverse ADST/DCT hybrid transform This commit enables SSE2 implementation of 16x16 inverse ADST/DCT hybrid transform. The runtime goes from 5742 cycles -> 1821 cycles. This provides about 1% encoding speed-up at speed 0. Change-Id: I1678d0988bf30b9efd524877705bbb3645edb17b	2013-07-16 12:51:42 -07:00
James Zern	c0562d08f6	Merge "VP[89]_COMMON: remove unused near_boffset"	2013-07-16 12:17:04 -07:00
James Zern	63e914bde4	Merge "VP9_COMMON: remove unused framerate/bitrate"	2013-07-16 12:16:37 -07:00
James Zern	3a7c2665d0	Merge "yv12config: remove YUV_TYPE"	2013-07-16 12:16:04 -07:00
Ronald S. Bultje	58a2005367	Merge "Replace generated quant tables with static lookup tables."	2013-07-16 12:07:17 -07:00
Ronald S. Bultje	e965cccce5	Replace generated quant tables with static lookup tables. This prevents possible float rounding issues between architectures. Change-Id: I6ed260aebd49feb4cfb5596a5370c44be5f72167	2013-07-16 12:06:26 -07:00
John Koleszar	cc1aac1b3c	Merge "Fix above context pointers"	2013-07-16 11:23:38 -07:00
Jingning Han	5851904744	Merge "SSE2 8x8 inverse ADST/DCT transform"	2013-07-16 11:00:11 -07:00
Dmitry Kovalev	baf0c959c7	Moving vp9_kf_default_bmode_probs to vp9_entropymode.c. Removing vp9_modelcontext.c. Change-Id: If2316c58dead2708d9f95b52d9494ba4c1dd7427	2013-07-16 10:54:34 -07:00
Dmitry Kovalev	863138a2ad	Rewriting vp9_set_pred_flag_{seg_id, mbskip}. Making implementation of vp9_set_pred_flag_{seg_id, mbskip} consistent with vp9_get_segment_id without using confusing sub(a, b) macro. Passing mi_row and mi_col to functions explicitly instead of relying on mb_to_right_edge and mb_to_bottom_edge. Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435	2013-07-16 10:44:48 -07:00
Paul Wilkins	30d2ea45ce	Minor cleanup in code to fine uv tx_size. Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e	2013-07-16 18:27:33 +01:00
John Koleszar	5efd9609e3	Fix above context pointers In the prior code, the above context pointers used for entropy decoding were initialized on the first frame, and not updated when the frame size changed. The per-frame code which initializes the contexts assumes that the contexts are contiguous, leading to an incomplete initialization when the frame is smaller. This commit updates the pointers so that the context is contiguous whenever the frame size changes. Change-Id: I08b53e3a30c8289491212311682ff1b8028cff6c	2013-07-16 10:26:56 -07:00
Johann	90ebfe621f	Merge "vp9_convolve8_[horiz\|vert]_avg"	2013-07-16 09:42:52 -07:00
Dmitry Kovalev	e8e7620a1f	Merge "Removing and moving around constant definitions."	2013-07-16 00:52:53 -07:00
Yaowu Xu	c5b0cd8405	Merge "Change to extend full border only when needed"	2013-07-15 21:35:32 -07:00
Yaowu Xu	5b915ebd92	Change to extend full border only when needed This is a short term optimization till we work out a decoder implementation requiring no frame border extension. Change-Id: I02d15bfde4d926b50a4e58b393d8c4062d1be70f	2013-07-15 20:52:13 -07:00
Dmitry Kovalev	ca75f1255f	Removing and moving around constant definitions. Removing unused and duplicated constants, moving them from .h to .c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f	2013-07-15 19:26:30 -07:00
Dmitry Kovalev	65762849d1	Merge "Consistent naming for loop-filter filters."	2013-07-15 19:21:32 -07:00
Frank Galligan	ce1d69aed9	Merge "Neon: Update mbfilter if all vectors follow one branch."	2013-07-15 17:11:55 -07:00
Dmitry Kovalev	e973b4e2d9	Consistent naming for loop-filter filters. Renaming flatmask4 to flat_mask4, flatmask5 to flat_mask5, hevmask to hev_mask, filter to filter4, mbfilter to filter8, wide_mbfilter to filter16. Change-Id: Ic61c73e59c2eee505257584867aafac99833cea1	2013-07-15 16:01:31 -07:00
Frank Galligan	f4f60f6005	Neon: Update mbfilter if all vectors follow one branch. Change the mbfilter Neon code from executing both branches if all vectors follow only one branch. The code is about 5% faster when executing only one branch and about 1% slower when executing both branches. -PS5: Remove local stack space from mbfilter. Change-Id: I6a23f9b318a9f4568a2718b4c9348db988fe2182	2013-07-15 13:08:28 -07:00
Dmitry Kovalev	1f14bbb624	Merge "Fixing vp9_get_pred_context_comp_ref_p function."	2013-07-15 10:51:42 -07:00
James Zern	04606d7258	vp9_loopfilter_intrin_sse2: make some funcs static + drop 'vp9_' Change-Id: I4a4bec175316aab8f65c3a23bacc8362399a1357	2013-07-13 18:48:00 -07:00
James Zern	dc968d3d45	vp9_loopfilter_intrin_sse2: remove unused uv funcs vp9_mbloop_filter_horizontal_edge_sse2 / vp9_mbloop_filter_vertical_edge_uv_sse2 Change-Id: I61c4351ef0cce79fa4156a47ddace781f1566869	2013-07-13 18:44:32 -07:00
James Zern	bd6b79c44d	vp9_loopfilter: remove uv function typedef loop_filter_uvfunction is unused Change-Id: I37eb3559e9eb2808f1f29dfea429441c94c9df2a	2013-07-13 18:38:28 -07:00
James Zern	9a4e175a64	filter_block_plane: reuse some constants + light const application + limit scope of params to build_lfi Change-Id: I1031c556aec160a690921dc10e7aa8a707f43ecd	2013-07-13 18:21:05 -07:00
James Zern	b09d37af0c	vp9_loopfilter.c: make some functions static + drop 'vp9_' Change-Id: I8c8f1f421f7fc84d2efb80349cd725de3c9bf6bd	2013-07-13 18:14:03 -07:00
James Zern	dc1d2331f6	vp9: remove frames_{since,till}.. from MACROBLOCKD frames_since_golden / frames_till_alt_ref_frame are unused. Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b	2013-07-13 18:02:11 -07:00
James Zern	04092764f7	VP9_COMMON: remove unused framerate/bitrate + VP8_COMMON: place them under CONFIG_POSTPROC_VISUALIZER Change-Id: I2702d5a3e1134b9c5f7ddc14b4173955a400f2cf	2013-07-12 21:43:23 -07:00
Jingning Han	91365addf8	SSE2 8x8 inverse ADST/DCT transform This commit enables SSE2 implementation of 8x8 inverse ADST/DCT transform. The runtime goes from 1216 cycles -> 266 cycles. For bus_cif at 2000 kbps, the overall runtime reduces from 253707ms -> 248430ms, i.e., 2% speed-up at speed 0. Change-Id: Ib0372e17e9162d7b11a10d653b1c8be547c878fb	2013-07-12 21:03:16 -07:00
James Zern	ce0324d8dd	VP[89]_COMMON: remove unused near_boffset Change-Id: If9b9ca703b997312df85241a0758d414cfdc5228	2013-07-12 19:41:27 -07:00
Dmitry Kovalev	429070987a	Using vp9_copy and vp9_zero instead of custom code. Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b	2013-07-12 18:07:43 -07:00
Dmitry Kovalev	31a68bcdff	Fixing vp9_get_pred_context_comp_ref_p function. Adding missed parenthesis around boolean expressions. Bitstream is changed. Regenerating test vectors. Change-Id: I4cc00b761e9473f92f180a9fc3a0c607f0aaae56	2013-07-12 17:46:02 -07:00
Johann	a15bebfc0a	vp9_convolve8_[horiz\|vert]_avg Super basic conversion from the other implementations. Any changes to one should be trivial to copy over to keep them in sync. Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8	2013-07-12 16:21:33 -07:00
Dmitry Kovalev	aa518af8c7	Merge "Adding struct tx_probs and struct tx_counts to cleanup the code."	2013-07-12 16:02:09 -07:00
James Zern	c9a2a06c20	Merge "vp9_postproc: remove useless self-assign"	2013-07-12 15:41:41 -07:00
James Zern	4fc6c88e9c	yv12config: remove YUV_TYPE this was never fleshed out in the context of VP8, for which it was added. for VP9 it has no meaning. Change-Id: Iba2ecc026d9e947067b96690245d337e51e26eff	2013-07-12 15:25:48 -07:00
Dmitry Kovalev	cc662dd768	Adding struct tx_probs and struct tx_counts to cleanup the code. Also removing unused declarations from vp9_entropymode.h file. Change-Id: Ib9c5826db3584a32f6bb3297a76c522b99d83402	2013-07-12 15:22:38 -07:00
Dmitry Kovalev	60969da5cb	Merge "Code cleanup in vp9_pred_common.c"	2013-07-12 15:04:07 -07:00
James Zern	cca973a1ab	vp9_postproc: remove useless self-assign Change-Id: I0bc5d2d8c9fec8be18263b0dc2528886bb5b7b61	2013-07-12 14:17:15 -07:00
Dmitry Kovalev	3ab86adb1e	Code cleanup in vp9_pred_common.c No bitstream changes. Using MB_MODE_INFO temp variables instead of MODE_INFO variables. Removing redundant curly braces. Change-Id: Ib9d1bedfbd8af97ecc722ccf697ea8177bbe287c	2013-07-12 14:11:48 -07:00
James Zern	0195fb53cb	vp9: consistent 'log2' variable naming lg2 -> log2 Change-Id: I0602ddff49e42c9c40c29c084d04b7592b9f8edf	2013-07-12 11:37:43 -07:00
Deb Mukherjee	94c481f9f1	Some minor cleanups for efficiency Implements some of the helper functions more efficiently with lookups rather than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f	2013-07-12 10:22:56 -07:00
Dmitry Kovalev	dd150e8ea9	Removing redundant code mostly from vp9_pred_common.{h, c}. Removing redundant function arguments and curly braces. Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b	2013-07-11 18:39:10 -07:00
Jingning Han	dac5891a1a	Merge "SSE2 4x4 invserse ADST/DCT transform"	2013-07-11 14:17:23 -07:00
Dmitry Kovalev	b55ecafda8	Merge "Making vp9_default_nmv_context static."	2013-07-11 13:58:34 -07:00
Dmitry Kovalev	c4ad3273c7	Moving segmentation related vars into separate struct. Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03	2013-07-11 11:57:57 -07:00
Johann	158c80cbb0	convolve8 optimizations for neon Independent horizontal and vertical implementations. Requires that blocks be built from 4x4 and [xy]_step_q4 == 16 6-10% improvement. CIF improved the least. Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda	2013-07-11 11:08:19 -07:00
hkuang	c9b25dcae4	Add neon optimize vp9_dc_only_idct_add. Change-Id: Iae84ab945cc9662a0ddd839aa2b9ca59f2ae5423	2013-07-11 10:30:47 -07:00
Jim Bankoski	5000cdf0ff	Merge "Wide loopfilter 16 pix at a time"	2013-07-11 06:44:02 -07:00
Jingning Han	49b6302044	SSE2 4x4 invserse ADST/DCT transform Enable SSE2 4x4 inverse ADST/DCT transform. The runtime goes from 292 cycles down to 89 cycles. Running bus_cif at 2000 kbps, the overall runtime of speed 0 goes from 301s to 295s (2% speed-up). Change-Id: I24098136e7fee7ab2fbf1c11755bdf2ca37f3628	2013-07-10 20:16:02 -07:00
Ronald S. Bultje	decead7336	Replace copy_memNxM functions with a generic copy/avg function. Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa	2013-07-10 18:27:24 -07:00
Dmitry Kovalev	ac72ad071d	Making vp9_default_nmv_context static. Change-Id: Ia3d5bd45adf288de11ab59c4728266c93c17e275	2013-07-10 17:44:45 -07:00
Ronald S. Bultje	46997bde88	Merge "Remove unused iwalsh4x4 MMX/SSE2 functions."	2013-07-10 17:08:46 -07:00
Ronald S. Bultje	a7ef456453	Merge "Remove unused 16x3/3x16 sad SSE2 functions."	2013-07-10 17:08:43 -07:00
John Koleszar	64f7a4d8cb	Wide loopfilter 16 pix at a time Where possible, do the 16 pixel wide filter while doing the horizontal filtering pass. The same approach can be taken for the mbloop_filter when that's implemented. Doing so on the vertical pass is a little more involved, but possible. Change-Id: I010cb505e623464247ae8f67fa25a0cdac091320	2013-07-10 16:32:44 -07:00
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Ronald S. Bultje	3f210f10eb	Remove unused iwalsh4x4 MMX/SSE2 functions. Change-Id: I2d22577911a37ed7d8c7e08cac20764842267652	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	48c53233fd	Remove unused 16x3/3x16 sad SSE2 functions. Change-Id: I30a597c0cc366e34c9a3e2afe32d70e044f95ca4	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	e6f955251f	Merge "SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction."	2013-07-10 14:52:23 -07:00
Ronald S. Bultje	6a60249071	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 TM intra prediction."	2013-07-10 14:52:19 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Jingning Han	114423538f	SSE2 16x16 ADST/DCT hybrid transform This commit enables 16x16 ADST/DCT forward hybrid transform using SSE2 operations. It reduces the runtime from 5433 cycles to 1621 cycles, at no compression performance loss. Change-Id: I75fd7f1984e9e28846af459f810ff0d6ae125230	2013-07-10 12:14:53 -07:00
John Koleszar	d1f8dd518c	Merge "Fix intermediate height in convolve"	2013-07-10 11:04:40 -07:00
Ronald S. Bultje	44b29a769c	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 V intra prediction."	2013-07-10 10:24:16 -07:00
Ronald S. Bultje	89810bfd71	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction."	2013-07-10 10:13:16 -07:00
Dmitry Kovalev	20986c81b3	Merge "Removing vp9_maskingmv.c and corresponding assembly file."	2013-07-10 10:05:06 -07:00
Ronald S. Bultje	7fd643264a	SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction. Change-Id: Iad70966b986f65259329070e258f76ef0af816b4	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	8dade638a1	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 TM intra prediction. Change-Id: I3441c059214c2956e8261331bbf521525a617a86	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	75b33c68c7	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 V intra prediction. Change-Id: I55a6cfa2daba738cbc0c4a02f806893f7e556997	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	92c5d3665d	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction. Change-Id: Ibe1690afc5459f3b3beca401e7734fcd03da6dd0	2013-07-10 09:28:03 -07:00
Jim Bankoski	863204e64d	mi_width_log2 & mi_height_log2 converted to lookup to avoid unnecessary code Change-Id: I2ee6a01f06984cc2c4ba74b3fffd215318f749d2	2013-07-10 07:26:08 -07:00
Jim Bankoski	6c8170af52	b_width_log2 and b_height_log2 lookups Replace case statement with lookup. Small speed gain at low speed settings but at speed 2+ where the number of motion searches etc. falls the impact rises to ~3-4%. Change-Id: Idff639b7b302ee65e042b7bf836943ac0a06fad8 Change-Id: I5940719a4a161f8c26ac9a6753f1678494cec644	2013-07-10 07:19:09 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
James Zern	dac57fece6	Merge "Remove all asm offset files from VP9"	2013-07-09 19:13:37 -07:00
Dmitry Kovalev	2824048a56	Merge "Loop filter code cleanup."	2013-07-09 18:56:19 -07:00
Frank Galligan	53971d86ea	Merge "Add Neon horizontal and vertical vp9_mbloop_filter"	2013-07-09 15:38:44 -07:00
John Koleszar	f0d9f10d24	Remove all asm offset files from VP9 The files are empty and unused. Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a	2013-07-09 14:26:53 -07:00
Frank Galligan	198fa6d0a0	Add Neon horizontal and vertical vp9_mbloop_filter - The vp9 mbfilter C code will branch on flat and mask. This CL will perform both branches and combine the data. A later CL will perform a check to see if all patch will take one branch. - These functions are about 1.75 times faster than the C code on Nexus 7. PS #3 - Changed all functions to dub limit, blimit, and thresh from vld {dx[]}, freeing up r4-r6. - Changed code to use vbif to reduce one instruction and free up a d register. Change-Id: I028dae0e434dc9891c3677bdb182e201ffb04777	2013-07-09 12:40:05 -07:00
Dmitry Kovalev	ec68d25521	Merge "Adding update_tx_ct function, removing duplicated code."	2013-07-09 12:26:11 -07:00
Dmitry Kovalev	aeed28f143	Removing vp9_maskingmv.c and corresponding assembly file. Change-Id: I9842d02d61d78d17dc3449bae8ffbe60f4b3ecb3	2013-07-09 11:22:56 -07:00
Dmitry Kovalev	92a9eaef50	Loop filter code cleanup. Using MAX_LOOP_FILTER constant instead of number 63. Change-Id: If91e0c198331b3041e7cd0707a5948479e9209d8	2013-07-09 11:18:09 -07:00
Ronald S. Bultje	d8fa5d45cc	Merge "Make intra prediction pointers RTCD-based."	2013-07-09 09:54:43 -07:00
Yaowu Xu	df5731273f	Merge "Fix loopfilter bug"	2013-07-09 01:34:25 -07:00
Dmitry Kovalev	c6c279aff0	Merge "Using mi_cols instead of mb_cols."	2013-07-08 20:09:19 -07:00
Dmitry Kovalev	1c65c580d6	Merge "Refactoring setup_pre_planes function."	2013-07-08 20:08:05 -07:00
Dmitry Kovalev	6254c8d780	Merge "Calling set_partition_seg_context() instead of code duplication."	2013-07-08 20:07:06 -07:00
Ronald S. Bultje	8350e7fe38	Make intra prediction pointers RTCD-based. This probably has a mildly negative impact on performance, but will (in future commits - or possibly merged with this one) allow SIMD implementations of individual intra prediction functions. We may perhaps want to consider having separate functions per txfm-size also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for each intra prediction mode), but I haven't played much with that yet. Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269	2013-07-08 17:25:51 -07:00
John Koleszar	527fc5caf6	Fix loopfilter bug In the rare case where 4x4 interior filtering was called for but no 8x8 or larger filtering takes place, the previous code was skipping the filtering. This patch fixes the issue by including the interior mask in the overall mask for the filter application loops. Change-Id: I4a0b65056c64f97478827c2ff41e0914fc7779d0	2013-07-08 16:49:57 -07:00
Ronald S. Bultje	bd867f1619	Inline vp9_get_mv_joint(). Encode time for first 50 frames of bus (speed 0) @ 1500kbps goes from 2min10.9 to 2min10.5, i.e. 0.3% faster overall, basically because we prevent the call overhead. Change-Id: I1eab1a95dd3eae282f9b866f1f0b3dcadff073d5	2013-07-08 16:22:39 -07:00
Dmitry Kovalev	b7559258a4	Using mi_cols instead of mb_cols. Eliminating usage of mb-units, switching to mi-units. Adding ALIGN_POWER_OF_TWO macro. Change-Id: I2491c969f713207c062011878b57e4e531818607	2013-07-08 14:54:04 -07:00
Tero Rintaluoma	18303b1263	Fix intermediate height in convolve intermediate_height for horizontal filtering must be at least 8 pixels to be able to do vertical filtering correctly. Currently it can be less for small block and y_step_q4 sizes. Change-Id: I2ee28b0591b2041c2fa9844d0ae2ff8a1a59cc21	2013-07-05 14:58:25 +03:00
Dmitry Kovalev	bfcef95c45	Adding update_tx_ct function, removing duplicated code. Change-Id: I8882fe3cd247a5a8304ab8ab2ee9abdb92830133	2013-07-03 18:24:13 -07:00
Dmitry Kovalev	f72e072555	Refactoring setup_pre_planes function. Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63	2013-07-03 17:42:01 -07:00
Dmitry Kovalev	430bd0c94a	Merge "Replacing 64 / MI_SIZE with MI_BLOCK_SIZE."	2013-07-03 14:16:02 -07:00
Dmitry Kovalev	2ad62c9312	Calling set_partition_seg_context() instead of code duplication. Change-Id: I65be6acc54c99688fd1f0c946cec3511514b8555	2013-07-03 11:15:58 -07:00
Dmitry Kovalev	5a21de8418	Replacing 64 / MI_SIZE with MI_BLOCK_SIZE. Change-Id: I32276552b3ea6dc1dce8e298be114cfe1019b31c	2013-07-03 10:54:50 -07:00
Yaowu Xu	0f02dc2709	Inline a few intra predictors Change-Id: Ib41f0643fdcc088500e7420708f4e72f1f64c710	2013-07-03 10:20:41 -07:00
Ronald S. Bultje	98c493a1c0	Merge "Remove unused function vp9_build_inter4x4_predictors_mbuv()."	2013-07-03 09:05:20 -07:00
Dmitry Kovalev	be77f6bbbf	Removing redundant struct from union b_mode_info. Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83	2013-07-02 16:51:57 -07:00
Ronald S. Bultje	5b87240230	Remove unused function vp9_build_inter4x4_predictors_mbuv(). Change-Id: Ibfd2def2c088f4bc541a1de25990d73480b53d4b	2013-07-02 16:34:24 -07:00
Deb Mukherjee	8d3d2b76f3	Tx size selection enhancements (1) Refines the modeling function and uses that to add some speed features. Specifically, instead of using a flag use_largest_txfm as a speed feature, an enum tx_size_search_method is used, of which two of the types are USE_FULL_RD and USE_LARGESTALL. Two other new types are added: USE_LARGESTINTRA (use largest only for intra) USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for inter) (2) Another change is that the framework for deciding transform type is simplified to use a heuristic count based method rather than an rd based method using txfm_cache. In practice the new method is found to work just as well - with derf only -0.01 down. The new method is more compatible with the new framework where certain rd costs are based on full rd and certain others are based on modeled rd or are not computed. In this patch the existing rd based method is still kept for use in the USE_FULL_RD mode. In the other modes, the count based method is used. However the recommendation is to remove it eventually since the benefit is limited, and will remove a lot of complications in the code (3) Finally a bug is fixed with the existing use_largest_txfm speed feature that causes mismatches when the lossless mode and 4x4 WH transform is forced. Results on derf: USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a pretty good compromise) USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction (currently the benefit of modeling is limited for txfm size selection, but keeping this enum as a placeholder) . USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing use_largest_txfm speed feature). Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936	2013-07-02 13:54:00 -07:00
Dmitry Kovalev	904070ca64	Merge "Removing unused implicit segmentation code."	2013-07-02 11:58:48 -07:00
Ronald S. Bultje	3cc6eb7c00	Merge "Make get_coef_context() branchless."	2013-07-02 11:48:15 -07:00
Dmitry Kovalev	3140c443e4	Merge "Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h."	2013-07-02 11:31:35 -07:00
Dmitry Kovalev	a3d2e6c98b	Removing unused implicit segmentation code. Change-Id: I8a2983fb14274a6ac53681fa4cd5d4209cbd2905	2013-07-02 11:16:42 -07:00
Ronald S. Bultje	9df24b41ca	Merge "Update quantize SSSE3 SIMD to cover 32x32 transform case also."	2013-07-02 09:38:08 -07:00
Dmitry Kovalev	1ac0540296	Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h. Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c	2013-07-01 17:28:08 -07:00
Ronald S. Bultje	26b6318de8	Make get_coef_context() branchless. This should significantly speedup cost_coeffs(). Basically what the patch does is to make the neighbour arrays padded by one item to prevent an eob check in get_coef_context(), then it populates each col/row scan and left/top edge coefficient with two times the same neighbour - this prevents a single/double context branch in get_coef_context(). Lastly, it populates neighbour arrays in pixel order (rather than scan order), so we don't have to dereference the scantable to get the correct neighbours. Total encoding time of first 50 frames of bus (speed 0) at 1500kbps goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase. Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56	2013-07-01 16:34:10 -07:00
Yaowu Xu	ba3b2604f0	Merge "Quantize (64-bit only, for now) SSSE3 SIMD."	2013-07-01 15:58:57 -07:00
Ronald S. Bultje	c8defcfdee	Update quantize SSSE3 SIMD to cover 32x32 transform case also. Encode time of bus (speed 0) 50 frames @ 1500kbps goes from 2min14.4 to 2min10.1, i.e. a 2.3% overall speed increase. Change-Id: I3699580e74ec26c7d24e03681bc47ba25ee1ee87	2013-07-01 11:36:33 -07:00
Ronald S. Bultje	7353ceab9d	Quantize (64-bit only, for now) SSSE3 SIMD. Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only, it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904	2013-07-01 11:36:07 -07:00
Dmitry Kovalev	2ab3bc8871	Removing vp9_modecont.{h, c}. Moving vp9_default_inter_mode_probs array to vp9_entropymode.c. Change-Id: I88ebda86ccc07f2a43c6c01d4b37898214cfb6de	2013-07-01 10:17:15 -07:00
Jingning Han	993942ce0c	Merge "Enable SSE2 4x4 ADST/DCT transform"	2013-06-29 15:57:04 -07:00
Christian Duvivier	466e0cf303	SSE2 version of vp9_short_fdct32x32_rd. 43,000 -> 5,750 cycles, about 7.5x faster. Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0	2013-06-29 13:53:00 -07:00
Johann	6098e359f4	Merge "add Neon optimized add constant residual functions"	2013-06-28 19:50:38 -07:00
James Zern	84d08fa9c4	Merge "fix test compile error"	2013-06-28 19:48:05 -07:00
Ronald S. Bultje	a487af8d35	Merge "Inline vp9_get_coef_context() (and remove vp9_ prefix)."	2013-06-28 19:37:11 -07:00
chm	a83cfd4da1	add Neon optimized add constant residual functions - Add add_constant_residual_8x8 16x16 32x32 functions - Tested under RealView debugger environment Change-Id: I5c3a432f651b49bf375de6496353706a33e3e68e	2013-06-28 19:06:51 -07:00
James Zern	a63e31e81e	fix test compile error since: `92479d9` Make update_partition_context faster fixes: vp9/common/vp9_blockd.h:408:22: error: non-constant-expression cannot be narrowed from type 'int' to 'char' in initializer list [-Wc++11-narrowing] char pcvalue[2] = {~(0xe << boffset), ~(0xf <<boffset)}; ^~~~~~~~~~~~~~~~~ Change-Id: Id5b00b9a72d00a2b314081a23879bd1fa3ce983b	2013-06-28 18:07:37 -07:00
Jingning Han	1109b6b888	Enable SSE2 4x4 ADST/DCT transform This commit enables SSE2 4x4 forward hybrid transform. The runtime goes from 249 cycles down to 74 cycles. Overall around 2% speed-up at no compression performance change. Change-Id: Iad4d526346e05c7be896466c05500711bb763660	2013-06-28 17:24:43 -07:00
Dmitry Kovalev	228b8232d3	Cosmetic reordering of FRAME_CONTEXT members. Change-Id: Id641e5188adf55e53e606e5813ae45feaf7abbd2	2013-06-28 16:16:03 -07:00
Dmitry Kovalev	59070f6e3c	Merge "Removing CONFIG_DEBUG checks on assertions."	2013-06-28 14:03:28 -07:00
Ronald S. Bultje	ec5d09b950	Merge "Make coefficient skip condition an explicit RD choice."	2013-06-28 11:54:28 -07:00
Ronald S. Bultje	d00b8e5f82	Inline vp9_get_coef_context() (and remove vp9_ prefix). Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896	2013-06-28 10:40:21 -07:00
Dmitry Kovalev	0345fc3ad9	Merge "Decoder's code cleanup."	2013-06-28 10:38:54 -07:00
Dmitry Kovalev	8e6ce6bb9e	Removing CONFIG_DEBUG checks on assertions. Adding CHECK_MEM_ERROR macro to vp9_common.h and removing two duplicated ones from vp9_onyx_int.h and vp9_onyxd_int.h. Change-Id: I916afec61b3019f18193135dac7c35ed0f89b8b6	2013-06-28 10:36:20 -07:00
Ronald S. Bultje	af660715c0	Make coefficient skip condition an explicit RD choice. This commit replaces zrun_zbin_boost, a method of biasing non-zero coefficients following runs of zero-coefficients to be rounded towards zero, with an explicit skip-block choice in the RD loop. The logic is basically that if individual coefficients should be rounded towards zero (from a RD point of view), the trellis/optimize loop should take care of it. If whole blocks should be zero (from a RD point of view), a single RD check is much more efficient than a complete serialization of the quantization loop. Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim. SIMD for quantize will follow in a separate patch. Results for other test sets pending. Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4	2013-06-28 10:28:49 -07:00
Dmitry Kovalev	3231da0a9e	Decoder's code cleanup. Using vp9_set_pred_flag function instead of custom code, adding decode_tokens function which is now called from decode_atom, decode_sb_intra, and decode_sb. Change-Id: Ie163a7106c0241099da9c5fe03069bd71f9d9ff8	2013-06-27 16:15:43 -07:00
Frank Galligan	1d6dc1b702	Add Neon optimized loop filter functions. - Added vp9_loop_filter_horizontal_edge_neon and vp9_loop_filter_vertical_edge_neon. - The functions are based off the vp8 loopfilter functions. - Matches x86 md5 checksum. Change-Id: Id1c4dddb03584227e5ecd29f574a6ac27738fdd0	2013-06-27 16:14:45 -07:00
Dmitry Kovalev	a3664258c5	Merge "General cleanup in segmentation-related code."	2013-06-27 14:57:07 -07:00
Dmitry Kovalev	be83ef3104	Merge "Moving subexp encoding functions in separate vp9_dsubexp.c file."	2013-06-27 14:55:18 -07:00
Jingning Han	fc1cfd8e32	Merge "Make intra predictor reference buffer configurable"	2013-06-26 19:02:02 -07:00
Jingning Han	4c10515f89	Merge "Make update_partition_context faster"	2013-06-26 19:01:45 -07:00
Yaowu Xu	896dc47cac	Merge "Change to use LUT for mode-to-txfm conversion"	2013-06-26 17:19:47 -07:00
Jingning Han	861cb06c67	Make intra predictor reference buffer configurable This commit enables configurable reference buffer pointer for intra predictor. This allows later removal of spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1	2013-06-26 17:17:21 -07:00
Jingning Han	92479d9526	Make update_partition_context faster Use vpx_memset for updating the partition contexts. Thanks to Noah for pointing out the need of refactoring in this part. Change-Id: I67fb78429d632298f1cd8a0be346cc76f79392a6	2013-06-26 17:05:51 -07:00
Yaowu Xu	25fe05fd92	Change to use LUT for mode-to-txfm conversion Change-Id: Ieb989830f49e6708ee7728eddebf7a2144c37c6f	2013-06-26 14:10:43 -07:00
Dmitry Kovalev	be07485e9a	General cleanup in segmentation-related code. Using consistent function and variable names. Change-Id: I2deb3fded8797453a2081836c9ce2e79ade06eb7	2013-06-26 10:27:28 -07:00
John Koleszar	8137e24f3d	Merge "Move vp9_counts_to_nmv_context to encoder"	2013-06-25 22:44:21 -07:00
John Koleszar	7bbb0633cd	Merge "Move vp9_full_to_model_counts to encoder"	2013-06-25 22:44:16 -07:00
Jingning Han	3cc8c8c3a0	Merge "Refactor intra predictor block"	2013-06-25 19:46:55 -07:00
Jingning Han	d19ea3861d	Refactor intra predictor block Remove vp9_intra4x4_predict(). Use the common intra prediction function for all block sizes. Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560	2013-06-25 16:33:13 -07:00
Dmitry Kovalev	6fb10f2de4	Renaming "nmv" to "mv". Change-Id: I8299f55c3b930221e52c2237f2ddea65b94fd33b	2013-06-25 15:19:18 -07:00
Ronald S. Bultje	c24d922396	Add averaging-SAD functions for 8-point comp-inter motion search. Makes first 50 frames of bus @ 1500kbps encode from 3min22.7 to 3min18.2, i.e. 2.3% faster. In addition, use the sub_pixel_avg functions to calc the variance of the averaging predictor. This is slightly suboptimal because the function is subpixel-position-aware, but it will (at least for the SSE2 version) not actually use a bilinear filter for a full-pixel position, thus leading to approximately the same performance compared to if we implemented an actual average-aware full-pixel variance function. That gains another 0.3 seconds (i.e. encode time goes to 3min17.4), thus leading to a total gain of 2.7%. Change-Id: I3f059d2b04243921868cfed2568d4fa65d7b5acd	2013-06-25 12:57:28 -07:00
Dmitry Kovalev	9467571777	Moving subexp encoding functions in separate vp9_dsubexp.c file. Change-Id: Idbb2ea80f764fa830fe2ddcfc54ef7fe232f05a8	2013-06-25 11:53:17 -07:00
Dmitry Kovalev	5ae096778e	Merge "Removing unused code."	2013-06-25 11:50:55 -07:00
Yaowu Xu	c2e3ee13e7	Merge "Changed size of mb_mode_context to 8 bits"	2013-06-25 10:44:47 -07:00
Scott LaVarnway	855e23ce8c	Merge "Small mode_info_context cleanup in filter_block_plane"	2013-06-25 10:34:19 -07:00
Dmitry Kovalev	87ee34aacb	Removing unused code. Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1	2013-06-25 10:17:19 -07:00
Dmitry Kovalev	70e9622185	Merge "Removing find_seg_id and using vp9_get_pred_mi_segid instead."	2013-06-25 10:16:06 -07:00
Dmitry Kovalev	529679bd52	Merge "Transforming scale_mv_component_q4 into scale_mv_q4 function."	2013-06-25 10:15:33 -07:00
Scott LaVarnway	c787f40bc4	Small mode_info_context cleanup in filter_block_plane Unnecessary updates to xd->mode_info_context. Change-Id: I36d2d68ca48366f727548526726b1b5437f62968	2013-06-25 12:28:50 -04:00
Yaowu Xu	b9c934df8e	Merge "Enable sse2 implementation of 8x8 ADST/DCT"	2013-06-25 09:13:22 -07:00
Jingning Han	a32a086d23	Enable sse2 implementation of 8x8 ADST/DCT This commit makes use of the butterfly structure to enable the sse2 version implementation of 8x8 ADST/DCT hybrid transform coding. The runtime of hybrid transform module goes down from 1170 cycles to 245 cycles. Overall speed-up around 1.5%. Change-Id: Ic808ffd21ece8a9d0410d8c0243d7b6c28ac3b3f	2013-06-24 18:41:33 -07:00
John Koleszar	4ecd6dbead	Move vp9_counts_to_nmv_context to encoder This function is only used from within vp9_encodemv.c. Change-Id: Ib3fc7c30b1e2d27321397ac474cbc8976bc1f4b1	2013-06-24 15:58:18 -07:00
John Koleszar	08b1798ae7	Move vp9_full_to_model_counts to encoder This function is not called from the decoder, so it doesn't need to be in common/. Change-Id: I6977dd462a25b4ff39c9c7e1b0b5b16aa58ee733	2013-06-24 15:46:15 -07:00
John Koleszar	ece724ae16	Merge "Remove unused vp9_build_intra_predictors_sb{y,uv}_s"	2013-06-24 15:08:58 -07:00
John Koleszar	ee4a7e4e46	Merge "Remove unused vp9_model_to_full_probs_sb()"	2013-06-24 15:08:54 -07:00
Scott LaVarnway	dfa2ecc3f1	Changed size of mb_mode_context to 8 bits This reduced the size of the MODE_INFO array (mip and prev_mip) by 425,568 bytes each for 1080p resolutions. Change-Id: Ifa513ec2d0a49e8ec0867ec90620762fb7f1261d	2013-06-24 17:11:16 -04:00
John Koleszar	858475a03a	Fix loopfilter of leftmost 4x4 edges in SB For cases where there's no transform set in bit 0 (the left edge of the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the left edge needs filtering), it was incorrectly being skipped before. This situation only happens on the leftmost edge of the image, as the edge at column 0 is intentionally skipped since there aren't pixels to the left to read. Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3	2013-06-24 08:26:00 -07:00
John Koleszar	9e7019f7df	Remove unused vp9_build_intra_predictors_sb{y,uv}_s The functions are no longer referenced. Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf	2013-06-21 16:10:05 -07:00
John Koleszar	5c32215e27	Remove unused vp9_model_to_full_probs_sb() This function is never referenced. Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc	2013-06-21 15:38:55 -07:00
Dmitry Kovalev	f27f76dfb3	Transforming scale_mv_component_q4 into scale_mv_q4 function. Using MV instead of int_mv for function arguments. Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6	2013-06-21 15:34:29 -07:00
Dmitry Kovalev	40141681c0	Removing find_seg_id and using vp9_get_pred_mi_segid instead. Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827	2013-06-21 13:05:10 -07:00
Ronald S. Bultje	54b2a59623	Implement SSE2 block_error. Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68	2013-06-21 12:54:52 -07:00
Ronald S. Bultje	7756e9892b	Merge "Add subtract_block SSE2 version and unit test."	2013-06-21 12:49:50 -07:00
Ronald S. Bultje	9a480482cb	Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()."	2013-06-21 12:49:43 -07:00
Ronald S. Bultje	25c588b1e4	Add subtract_block SSE2 version and unit test. 3% faster overall (3min35.0 to 3min28.5). Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e	2013-06-21 09:35:37 -07:00
Yaowu Xu	e6cd5ed307	Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes."	2013-06-20 17:42:50 -07:00
Ronald S. Bultje	1e6a32f1af	SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9	2013-06-20 15:59:48 -07:00
Frank Galligan	c259af4f73	Fix win64 warning. - size_t vs int. Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134	2013-06-20 14:07:11 -07:00
Dmitry Kovalev	8283d893eb	Merge "Renaming 'nmv' to 'mv' for several functions."	2013-06-20 10:17:12 -07:00
Ronald S. Bultje	8fb6c58191	Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available): sse2 4x4: 99 -> 82 cycles sse2 4x8: 128 cycles sse2 8x4: 121 cycles sse2 8x8: 149 -> 129 cycles sse2 8x16: 235 -> 245 cycles (?) sse2 16x8: 269 -> 203 cycles sse2 16x16: 441 -> 349 cycles sse2 16x32: 641 cycles sse2 32x16: 643 cycles sse2 32x32: 1733 -> 1154 cycles sse2 32x64: 2247 cycles sse2 64x32: 2323 cycles sse2 64x64: 6984 -> 4442 cycles ssse3 4x4: 100 cycles (?) ssse3 4x8: 103 cycles ssse3 8x4: 71 cycles ssse3 8x8: 147 cycles ssse3 8x16: 158 cycles ssse3 16x8: 188 -> 162 cycles ssse3 16x16: 316 -> 273 cycles ssse3 16x32: 535 cycles ssse3 32x16: 564 cycles ssse3 32x32: 973 cycles ssse3 32x64: 1930 cycles ssse3 64x32: 1922 cycles ssse3 64x64: 3760 cycles Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d	2013-06-20 09:34:25 -07:00
Jim Bankoski	2c6bdbbc78	new debug modes code The new print out includes skips and has prefixed sections so you can grep to find things like transforms chosen on each frame. Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b	2013-06-20 09:33:11 -07:00
Yaowu Xu	12180c8329	Remove unnecessary copying of probs. Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c	2013-06-18 23:02:27 -07:00
Dmitry Kovalev	87e1fa7627	Renaming 'nmv' to 'mv' for several functions. Change-Id: I183a38997a9d01e4a1b869e92509f6915216fa09	2013-06-18 18:28:10 -07:00
Jingning Han	7088426976	Merge "Make fdct32 computation flow within 16bit range"	2013-06-18 11:40:14 -07:00
Dmitry Kovalev	dfc0385291	Merge "Removing vp9_invtrans.{c, h} files."	2013-06-18 10:16:25 -07:00
Jingning Han	a41a4860c0	Make fdct32 computation flow within 16bit range This commit makes use of dual fdct32x32 versions for rate-distortion optimization loop and encoding process, respectively. The one for rd loop requires only 16 bits precision for intermediate steps. The original fdct32x32 that allows higher intermediate precision (18 bits) was retained for the encoding process only. This allows speed-up for fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3	2013-06-18 09:46:24 -07:00
Ronald S. Bultje	d9fc451666	Move subpixel variance function from common/ to encoder/. This seems to only be used in the encoder. Also remove an empty wrapper file that contained forward declarations for this function, but didn't actually define any actual functions. Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b	2013-06-17 16:54:09 -07:00
Dmitry Kovalev	686b99741c	Removing vp9_invtrans.{c, h} files. Moving single function from vp9_invtrans.c to vp9_encodemb.c. Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7	2013-06-17 16:09:03 -07:00
John Koleszar	61ecc282b5	Merge "Remove unused need_to_clamp_mvs"	2013-06-17 10:31:58 -07:00
John Koleszar	141ab2d5d0	Merge "Fix type mismatch in array definition"	2013-06-14 17:07:22 -07:00
John Koleszar	c2da365484	Merge "Remove constant vp9_coef_update_prob table"	2013-06-14 17:07:19 -07:00
John Koleszar	a9415d2e4c	Fix type mismatch in array definition vp9_default_inter_mode_probs was being accessed with a different type than it was defined with. Ensure that its declaration is included prior to its definition. Change-Id: I2f963f513ab2f4e339f8a3c17e3d0f03749eba16	2013-06-14 16:38:42 -07:00
John Koleszar	0f7a66e962	Remove constant vp9_coef_update_prob table All elements of this table are equal to 252, so replace it with a single constant VP9_COEF_UPDATE_PROB. Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9	2013-06-14 15:12:31 -07:00
Jingning Han	0b7910b9ff	Merge "Enable sse2 version of sad8x4/4x8"	2013-06-14 13:15:49 -07:00
Jingning Han	c43af9a8a3	Enable sse2 version of sad8x4/4x8 The encoding time for bus at CIF goes from 661s to 625s. This commit also enabled unit test of sad8x4/4x8 in sad_test.cc. Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1	2013-06-14 09:19:28 -07:00
Jingning Han	15f50e7b42	Enable sse2 version of sad8x4/4x8 The encoding time for bus at CIF goes from 661s to 625s. This commit also enabled unit test of sad8x4/4x8 in sad_test.cc. Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1	2013-06-13 16:18:18 -07:00
John Koleszar	8e47093c9e	Remove unused need_to_clamp_mvs This flag is no longer needed. Change-Id: If13482015ddb92d225792ea5c0ee455d2285d1f6	2013-06-12 16:50:14 -07:00
Scott LaVarnway	a81bd12a2e	Quick modifications to mb loopfilter intrinsic functions Modified to work with 8x8 blocks of memory. Will revisit later for further optimizations. For the HD clip used, the decoder improved by almost 20%. Change-Id: Iaa4785be293a32a42e8db07141bd699f504b8c67	2013-06-12 19:23:03 -04:00
Yaowu Xu	d682243012	Merge "Quick modifications to wide loopfilter intrinsic functions"	2013-06-12 15:16:11 -07:00
Ronald S. Bultje	fa96eeb835	Implement SSE version for sad4x8x4d and SSE2 version for sad8x4x4d. Encoding time of crew (CIF, first 50 frames) @ 1500kbps goes from 4min56 to 4min42. Change-Id: I92c0c8b32980d2ae7c6dafc8b883a2c7fcd14a9f	2013-06-12 17:40:01 -04:00
Scott LaVarnway	26496c52bf	Quick modifications to wide loopfilter intrinsic functions Modified to work with 8x8 blocks of memory. Will revisit later for further optimizations. For the HD clip used, the decoder improved by 20%. Change-Id: Ia0057f55d66d1445882351ea6c43b595a5a980e5	2013-06-12 16:49:08 -04:00
John Koleszar	1fa04e1a03	Merge changes I86fe51b0,I4c9a9e0f * changes: Remove unused vp9_idct_add_{y,uv}_block Remove some unused loopfilter code	2013-06-12 13:43:30 -07:00
Johann	bbd5cb2bd4	Merge "Fix compile warnings on windows."	2013-06-12 13:36:50 -07:00
John Koleszar	495ff8e0c7	Merge "Enable mmx loop filter routines"	2013-06-12 12:52:04 -07:00
John Koleszar	ceee4563d6	Remove unused vp9_idct_add_{y,uv}_block These functions are not used, and appear to have been superseded. Change-Id: I86fe51b088264f6b1b8d4d232bba97b371b98120	2013-06-12 12:24:22 -07:00
Jingning Han	1a5bb3cc76	Fix the comments in boundary block partition check Change-Id: Ic6b2881d8d495269edbc514b33376ca963798b45	2013-06-12 12:05:06 -07:00
John Koleszar	8933a652fc	Remove some unused loopfilter code This code is unreachable, and not useful for later reference. Change-Id: I4c9a9e0fbf859c1081bbcfbcda9710afb4b4741f	2013-06-12 11:36:00 -07:00
Frank Galligan	4524548f80	Fix compile warnings on windows. Change-Id: If74bc6110016bc75ea3883ab136fbbac88f6a913	2013-06-12 11:34:15 -07:00
John Koleszar	0e1e16db90	Enable mmx loop filter routines The mmx routines work as expected for the loop filter, so enable them. Change-Id: I2bbd9b99a4445fcba17bb95002f1fb6e01fe8f85	2013-06-12 11:28:21 -07:00
Yaowu Xu	efe05b7437	fix a misuse of ref_frame Change-Id: I9aac140d775b7b4a8727494d15b185b75501a546	2013-06-12 10:32:38 -07:00
Frank Galligan	15f9077ee2	Fix duplicate const. Change-Id: I86be1f7421ed49d577cacf405f6e4b0daa85cfdc	2013-06-12 08:52:34 -07:00
John Koleszar	9831f20594	Disallow wide loopfilter on some chroma borders Don't do the 15 tap filter if there aren't 8 pixels below/right of the edge. Change-Id: I62f16437c1d9ba59b6901a5fe71ddb2f472da344	2013-06-11 11:28:38 -07:00
Jingning Han	551f37d63d	Fix partition coding of corner block This commit fixed the allowable partition types for bottom-right corner blocks. When a block has over half of its pixels as valid content in both vertical and horizontal directions, allow all the four partition types in the bit-stream. Otherwise, apply partition type constraints. Change-Id: I2252e2de7125a8bfb1c824bf34299a13c81102e3	2013-06-10 21:43:17 -07:00
Deb Mukherjee	51a7c7631d	Merge "New probs for filters/tx_size and a few others" into experimental	2013-06-10 16:39:43 -07:00
Deb Mukherjee	a43ff15399	New probs for filters/tx_size and a few others * New probs for subpel filters/tx_count * Makes a change to not reset to defaults for the tx_size probs if an intermediate frame reverts to using a fixed tx_size. * A few updates to the parameters for backward adaptation for mode/mv * some cosmetic cleanups derf300: +0.06% Change-Id: I22994d659bc31ca7a4fc8820fde24001e64a2920	2013-06-10 16:38:47 -07:00
John Koleszar	091e23c3e6	Merge "Remove remnants of VP8 profiles/versions" into experimental	2013-06-10 16:16:17 -07:00
John Koleszar	0fcb625e35	Remove remnants of VP8 profiles/versions Remove the bilinear filter mode, and the no-loopfilter mode, and the related vp9_setup_version() function. Change-Id: I32311367812faf37863131df3af37d63d03973d7	2013-06-10 15:55:03 -07:00
Jim Bankoski	ba2af976cb	print debugging info from mode info struct This commit has no impact but to help us debug issues. To use, call like this: vp9_print_modes_and_motion_vectors(cpi->common.mi, cpi->common.mi_rows, cpi->common.mi_cols, cpi->common.current_video_frame, "decode_mi.stt"); Change-Id: I89e27725dae351370eb7f311a20a145ed4f1d041	2013-06-10 14:03:17 -07:00
John Koleszar	44db42c114	Merge the new loopfilter experiment Change-Id: I524ba98841f2e1850e3276ac365c501cea31546d	2013-06-10 12:30:12 -07:00
John Koleszar	c37a1e5ef2	Merge "Loopfilter: Fix chroma edge selection" into experimental	2013-06-10 12:17:24 -07:00
John Koleszar	2f3cbfdde1	Merge "Fix use of get_uv_tx_size in loopfilter" into experimental	2013-06-10 12:17:11 -07:00
Adrian Grange	c4e5b77d74	Merge "Implement intra-coded frames" into experimental	2013-06-10 12:08:09 -07:00
Deb Mukherjee	995ce523eb	Cosmetic cleanups of filters No bitstream change. Removes unused filters and the code for the case of 2 switchable filters; also changes the 8tap-smooth filter coefficients for integer shifts to be interpolating to be consistent with the way it is implemented currently. Change-Id: I96c542fd8c06f4e0df507a645976f58e6de92aae	2013-06-10 12:06:36 -07:00
Adrian Grange	eac344ef10	Implement intra-coded frames Implements ability to signal and decode frames that are encoded using only intra coding modes. Only the decode side has been implemented here. Change-Id: I53ac6a8d90422cd08ba389e5236e15b45f9e93de	2013-06-10 11:43:16 -07:00
John Koleszar	48b7cbcac5	Loopfilter: Fix chroma edge selection A 32x32 transform should have no internal filtering (check c==4) Change-Id: I7414cf4748ed053208217692ef00cd8b20d49a91	2013-06-10 11:40:57 -07:00
John Koleszar	717d744a01	Fix use of get_uv_tx_size in loopfilter Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98	2013-06-10 11:40:57 -07:00
John Koleszar	ec38b6150d	Merge "Fixed point reference picture scaling" into experimental	2013-06-10 09:45:34 -07:00
Ronald S. Bultje	549258b1c2	Merge "border mvref issue" into experimental	2013-06-10 09:22:49 -07:00
Jim Bankoski	75459d65df	border mvref issue Fixes mvref issue. Change-Id: I07dc1b0682845bc18fe0efa6af5e4f4da3abfa3a	2013-06-10 09:21:11 -07:00
Yaowu Xu	7f99844e91	Merge "Loopfilter: bug fix in sb_type usage" into experimental	2013-06-10 08:56:38 -07:00
Tero Rintaluoma	86bb6df005	Fixed point reference picture scaling Fixed point scaling factors are calculated once for each reference frame by using integer division. Otherwise fixed point scaling routines are used in all scaling calculations. This makes it possible to calculate fixed point scaling factors on device driver software and pass them to hardware and thus avoid division on hardware. TODO: - Missing check for maximum frame dimensions (currently scaling uses 14 bits) - Missing check for maximum scaling ratio (upscaling 16:1, downscaling 2:1) Problems: - Straightforward fixed point implementation can cause error +-1 compared to integer division (i.e. in x_step_q4). Should only be an issue for frames larger than 16k. Change-Id: I3cf4dabd610a4dc18da3bdb31ae244ebaf5d579c	2013-06-10 08:07:55 -07:00
Janne Salonen	548f90d2ce	Loopfilter: bug fix in sb_type usage Was always using sb_type of first column in a row of 8x8 units when determining decoded block edges as a subcondition for loop filter skipping. Change-Id: Ib17554633a63a90b70cdaa7bed65db035a8ad9d8	2013-06-10 06:40:05 -07:00
Yaowu Xu	4852a8023d	Merge "Loopfilter: Always filter intra edges" into experimental	2013-06-09 21:18:00 -07:00
Yaowu Xu	9c44ce9f4b	Merge "Loopfilter: use the current block only for skip" into experimental	2013-06-09 21:17:54 -07:00
Yaowu Xu	2e1fd0a497	Merge "Modified loop filter edge skipping" into experimental	2013-06-09 21:17:47 -07:00
John Koleszar	140ac34e57	Loopfilter: Always filter intra edges Change-Id: Ifb1ce2bd52147981ca1aec9ec6cfea8738a23e45	2013-06-09 09:02:47 -07:00
Ronald S. Bultje	c3f9b070ca	Merge "New comp_inter defaults." into experimental	2013-06-09 06:40:02 -07:00
Ronald S. Bultje	3993d30922	Merge "Fix firstpass if framesize is not a multiple of 16." into experimental	2013-06-08 17:40:17 -07:00
Ronald S. Bultje	d30968c32a	Merge "New default tables" into experimental	2013-06-08 17:39:50 -07:00
Ronald S. Bultje	20760254f6	Merge "Align frame size to 8 instead of 16." into experimental	2013-06-08 17:39:41 -07:00
Ronald S. Bultje	99e10253b0	New comp_inter defaults. It seems like I inverted the meaning of the contexts by accident? Change-Id: Iafb2346d9933930949578342b84519b719dd5dd3	2013-06-08 15:13:57 -07:00
Ronald S. Bultje	073c7d5eec	Fix firstpass if framesize is not a multiple of 16. Change-Id: Iec41736c2b6140715f90f40de5ae6cf52497a9b8	2013-06-08 13:32:05 -07:00
Ronald S. Bultje	b64be43998	New default tables Change-Id: Ice8c73a2a843113877b8f8ed78737a1442c25ced	2013-06-08 13:29:14 -07:00
Deb Mukherjee	17da2cab78	TX_SIZE contexts simplification. Reduces TX_SIZE contexts to 2 for each kind. The code is cleaner and there is hardly any performance difference with more than two contexts. Results: almost neutral Change-Id: I17656bd6db76224ae2856adf882504560e7dbaa4	2013-06-08 12:32:26 -07:00
Deb Mukherjee	67cb1f093c	Minor fix in TX_SIZE contexts Change-Id: I9e81f84877e18ba7e55d66389ed60e64a5b7abcc	2013-06-08 07:14:58 -07:00
Yaowu Xu	b7da6d0c5a	Merge "Handle partition type coding of boundary blocks" into experimental	2013-06-07 18:16:16 -07:00
John Koleszar	f7e4b72df8	Loopfilter: use the current block only for skip Use the current block's skip flag to determine edge skipping. Change-Id: I4ba81f899286afbc3f6bb83eba2ef146a01b6fa4	2013-06-07 17:48:57 -07:00
Ronald S. Bultje	71701f3d40	Align frame size to 8 instead of 16. Change-Id: Ic606ef1b31e49963a779455a1e010a9ebb0f3f1f	2013-06-07 17:20:50 -07:00
Adrian Grange	07a5777bde	Frame header changes to support intra_only frames Made changes to the frame header to write the sync code in the frame header for a non-displayable, intra-only frame. Extended reset_frame_context to 2-bits. (Submitting on behalf of Dmitri) Change-Id: Ie836ae0df9ed572fb4f08aabe9351a555c4f3b96	2013-06-07 16:19:34 -07:00
Deb Mukherjee	21401942b0	Coding tx-size selection by use of spatial context Adds coding of transform size within a frame by use of context of transform sizes selected in left and above blocks. Also incorporates code for generating stats. TODO: generate and incorporate new default stats Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6	2013-06-07 16:07:58 -07:00
Deb Mukherjee	869a39ba60	Cleans up mbskip encoding Refactors mbskip coding to be compatible with coding of the rest of the symbols. Adds forward/backward adaptation and removes a lot of the legacy code. Results: fast50: +1.6% derfraw300: +0.317% Change-Id: I395a2976d15af044d3b8ded5acfa45f6f065f980	2013-06-07 16:00:26 -07:00
Jingning Han	78b8190cc7	Handle partition type coding of boundary blocks The partition types of blocks sitting on the frame boundary are constrained by the block size and the position of each sub-block relative to the frame. Hence we use truncated probability models to handle the coding of such information. 100 frames run: yt 0.138% Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d	2013-06-07 14:19:40 -07:00
Ronald S. Bultje	28164eb962	Fix segment feature data size. Change-Id: I4331cfd99a717938f4f970cad81c468cbf287b00	2013-06-07 13:57:28 -07:00
Ronald S. Bultje	fb1f6f1db4	Fix segment feature data type. It has a range of -255,255, so should be int16_t, not int8_t. Change-Id: I5ef4b6aefb6212b0f35f4754f3c4d73fddbc52a0	2013-06-07 13:57:27 -07:00
Ronald S. Bultje	363dc6ceda	Don't crash if motion vector ref points to out-of-bounds area. This can only happen if partition is partly out-of-frame, in which case the referenced mv is either out-of-frame also (and thus has the same value as an already-read one), or it is actually uninitialized, in which case we don't want to use it. Change-Id: Icf39fa4d987c7abcbebb9bbdcdd6311e8fb9d3c9	2013-06-07 13:57:27 -07:00
Paul Wilkins	340c7a48e6	Change to segment ref frame feature. Simplify feature to only support a single reference frame instead of a mask. Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb	2013-06-07 21:42:22 +01:00
Yaowu Xu	0bb6da3668	Merge "Remove two un-used entries in mode_lf_delta[]" into experimental	2013-06-07 10:10:45 -07:00
Yaowu Xu	254f46bc5b	Merge "Specify mv neighborhood for block larger than 8x8" into experimental	2013-06-07 10:09:35 -07:00
Yaowu Xu	b097a3ba82	Remove two un-used entries in mode_lf_delta[] With the removal of i4X4 and SPLIT_MV modes, the two entries for the modes are no longer used. This patch removes the coding of the deltas. Change-Id: Iea4eb500404ebe9706159380a03b8eca542fb4c3	2013-06-07 09:24:09 -07:00
Deb Mukherjee	78fbaf4d84	Merge "Coding updates for tx-size selection" into experimental	2013-06-07 09:19:36 -07:00
Ronald S. Bultje	def6bc765c	Merge "Revert "Align frame size to 8 instead of 16."" into experimental	2013-06-07 09:01:33 -07:00
Yaowu Xu	8b3ad75266	Specify mv neighborhood for block larger than 8x8 The new neighborhood adapts to the shape and size of the block type cif +.16% stdhd +.13% Change-Id: I978db58278e9ae3fbd6726ef831bdfc5f5f37d02	2013-06-07 08:59:48 -07:00
Ronald S. Bultje	e7d306aae6	Revert "Align frame size to 8 instead of 16." This reverts commit `c2574414d4` Change-Id: Ie9013cb0bb43e639e01b4588f630b1da59295d38	2013-06-07 08:59:27 -07:00
Deb Mukherjee	3ee1a21a42	Coding updates for tx-size selection Changes to the coding of transform sizes, along with forward and backward probability updates. Results: derf300: +0.241% Context based coding of transform sizes will be in a separate patch. Change-Id: I97241d60a926f014fee2de21fa4446ca56495756	2013-06-07 08:54:00 -07:00
Janne Salonen	5c5223860a	Modified loop filter edge skipping Added a condition not to skip filtering of transform block edges when the edge is also a decoding block edge. Change-Id: Iaccb6206c4202b78e5dca3b89379556e0f4aba0c	2013-06-07 06:36:22 -07:00
Paul Wilkins	576c2bb021	Fix bug in segment skip. Wrong max data size (skip has no data) and use of vp9_get_segdata() when it should be vp9_segfeature_active(). Change-Id: I1eb97d33df6e2a42cc589049f704266fe3639902	2013-06-07 13:27:08 +01:00
Yaowu Xu	4df9e7883c	Merge "Removed rectangular intra prediction code" into experimental	2013-06-06 22:58:07 -07:00
Yaowu Xu	472669befb	Fix a merge conflict ref_frame in MB_Mode_Info was changed in the ref frame coding patch to be an array to handle first and second reference frame; this patch fixes the loop filter code that uses the pointer directly as reference frame. Change-Id: I71afa5a49deb50c1bc38029fd07470b984c6dfe9	2013-06-06 22:10:07 -07:00
Yaowu Xu	9470c1a2a1	Removed rectangular intra prediction code As all intra predictions happen on squared transform block now. Change-Id: I7ec91e3f0ad01383a03d2bd3099bbf32e87e3466	2013-06-06 21:35:10 -07:00
Jim Bankoski	fa9db8da15	Merge "Fix FIXME." into experimental	2013-06-06 20:50:51 -07:00
Jim Bankoski	686f437264	Merge "Align frame size to 8 instead of 16." into experimental	2013-06-06 20:49:59 -07:00
John Koleszar	736c7b804a	Merge "Reimplementation of loop filter" into experimental	2013-06-06 17:34:26 -07:00
Ronald S. Bultje	c2574414d4	Align frame size to 8 instead of 16. Change-Id: Ic22f416a33de558519d5c30a929f6a954546ade9	2013-06-06 17:28:11 -07:00
Ronald S. Bultje	bc41af00cf	Fix FIXME. Change-Id: I47a9857d35da1bff6153f8090c6b98b689b31a61	2013-06-06 17:28:11 -07:00
Ronald S. Bultje	6ef805eb9d	Change ref frame coding. Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95	2013-06-06 17:28:09 -07:00
Ronald S. Bultje	ad34368786	New intra mode and partitioning probabilities. Split partition probabilities between keyframes and non-keyframes, since they are fairly different. Also have per-blocksize interframe y intramode probabilities, since these vary heavily between different blocksizes. Lastly, replace default probabilities for partitioning and intra modes with new ones generated from current codec. Replace counts with actual probabilities also. Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f	2013-06-06 10:45:30 -07:00
John Koleszar	043d348aae	Reimplementation of loop filter This version of the loop filter supports non-4:2:0 subsampling and a fourth plane, as well as changing the filtering order to be more friendly to hardware implementations. The filters are applied first to all vertical edges within the 64x64 SB, followed by the top horizontal edge and any internal horizontal edges. Since filtering is applied on each 4x4 edge serially, a dependency is created from filtering one block edge to the next. It would be possible to remove this dependency by building all filtering decisions from the unfiltered reconstruction data. Change-Id: I08f3e9683eb7bded8a76651cbc50fc0dfdd05fa7	2013-06-06 08:45:45 -07:00
Jim Bankoski	5a88271b09	don't tokenize & encode tokens for blocks in UMV This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9	2013-06-06 06:10:25 -07:00
Dmitry Kovalev	28d31aed7f	Merge "Moving bits from compressed header to uncompressed one." into experimental	2013-06-06 01:15:44 -07:00
Jingning Han	61e6586230	Merge "Fix UV intra coding rd loop" into experimental	2013-06-05 21:47:00 -07:00
Jingning Han	f04b15486a	Fix UV intra coding rd loop This commit makes the coding/reconstruction operations of intra coding rate-distortion loop for UV components consistent with those of the encoding process. key frame coding gains: derf: 0.11% stdhd: 0.42% Change-Id: I8d49f83924a320e3689ef2d60096c49d7f0c7a40	2013-06-05 21:18:02 -07:00
Dmitry Kovalev	12345cb391	Moving bits from compressed header to uncompressed one. Bits moved: refresh_frame_flags, active_ref_idx[], ref_frame_sign_bias[], allow_high_precision_mv, mcomp_filter_type, ref_pred_probs[]. Derf results: +0.040% Change-Id: I011f43c7eac0371d533b255fd99aee5ed75b85a5	2013-06-05 20:56:37 -07:00
Deb Mukherjee	30226a658f	Cosmetic renaming VP9_MVREFS to VP9_INTER_MODES NO bitstream change Change-Id: I79f6146dac5fdd157051b6f8dc611c0b7b5e5f7f	2013-06-05 11:24:01 -07:00
Deb Mukherjee	83885235a7	Clean-ups on switchable interpolation and mv_ref Adds backward adaptation and differential forward updates of switchable interpolation filter probabilities. Also adds some cosmetic cleanups and minor fixes on mv_ref probabilities. derfraw300: +0.353% (with most coming from switchable interp changes) Change-Id: Ie2718be73528c945fd0d80cfd63ca2d9cb3032de	2013-06-05 10:11:52 -07:00
Yaowu Xu	0449ee0fec	Fix an off-by-one bug in the calculation of maximum number of tiles in log2 scale. Change-Id: Id283d6e51a8b926015fd3fc631cdbfb4b8268d4a	2013-06-03 14:25:28 -07:00
Paul Wilkins	6dd3a6320e	Merge "Replace scatter scan 32x32 with HW friendly scan." into experimental	2013-06-03 02:42:37 -07:00
Paul Wilkins	3f380d5252	Merge "vp9_find_mv_refs_idx change for last frame." into experimental	2013-06-03 02:34:46 -07:00
Dmitry Kovalev	317d832d38	Merge "Adding plane_block_width and plane_block_height functions." into experimental	2013-05-31 15:28:45 -07:00
Dmitry Kovalev	d771bba27e	Renaming 'motion_vector' to 'mv' for consistency. Change-Id: Ie869ea4992e26867caec46cb878fc86a646aeb9f	2013-05-31 12:32:53 -07:00
Dmitry Kovalev	120a878199	Adding plane_block_width and plane_block_height functions. Change-Id: I02c17fb733c0f3c22dc3167c3d3182797415f1ae	2013-05-31 12:31:49 -07:00
Ronald S. Bultje	a288cb3b10	Merge "Merge all various transform size data trackers into single variables." into experimental	2013-05-31 09:59:24 -07:00
Scott LaVarnway	1e025dbfd1	Merge "Moved use_prev_in_find_mv_refs check to frame level" into experimental	2013-05-31 09:35:51 -07:00
Ronald S. Bultje	e9d68a5e36	Merge all various transform size data trackers into single variables. Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db	2013-05-31 09:18:59 -07:00
Paul Wilkins	cf61fae8ee	vp9_find_mv_refs_idx change for last frame. Restrict get_matching_candidate() to considering mvs at 8x8 and larger sizes for last frame case. This is to reduce the HW load of using vectors down to the 4x4 level from the previous frame. Change-Id: I6505e610fd63a4e22d67f136aec7905a01b893ba	2013-05-31 15:37:27 +01:00
Sami Pietila	0835a35347	Fix inter mode context adaptation. Change-Id: Ibaa47be878c1cd84d88d7518418d2d8d38224e70	2013-05-31 12:58:31 +03:00
Paul Wilkins	aaf61dfbca	Merge "Patch to remove implicit segmentation." into experimental	2013-05-31 02:56:20 -07:00
Yaowu Xu	7ca651a383	Merge "Changed to use a new variant of WHT" into experimental	2013-05-30 21:53:12 -07:00
Ronald S. Bultje	a4e7c6bd4d	Merge "Remove unused define." into experimental	2013-05-30 20:58:22 -07:00
Ronald S. Bultje	310bc1030a	Merge "Merge VP9_YMODES, VP9_UV_MODES, INTRA_MODE_COUNT and cousins." into experimental	2013-05-30 20:58:19 -07:00
Ronald S. Bultje	7d549870f7	Merge "Remove TX_SIZE_MAX_MB." into experimental	2013-05-30 20:58:16 -07:00
Ronald S. Bultje	6ea6f4d253	Merge "Remove one (unused) entry from mvref tables." into experimental	2013-05-30 20:58:13 -07:00
Jim Bankoski	21595f8e38	Merge "Creates a new speed 1:" into experimental	2013-05-30 20:36:05 -07:00
Jim Bankoski	ced21bd6a6	Creates a new speed 1: This speed 1 - uses variance threshold stolen from static-thresh to determine split. Any superblock with greater than the variance set by static thresh * quantizer index squared is split. In addition transform size is set to largest size less than or equal to partition size, sub pixel filter is set to normal, and only 12 modes are used at all. Change-Id: If7a2858ee70f96d1eb989c04fd87a332b147abef	2013-05-30 19:53:00 -07:00
Ronald S. Bultje	16482bddf7	Merge "Remove splitmv." into experimental	2013-05-30 19:07:12 -07:00
Ronald S. Bultje	d2205f92c3	Merge changes I98c18fe5,I80c37cff into experimental * changes: Remove i4x4_pred. Remove unused table.	2013-05-30 19:06:44 -07:00
Ronald S. Bultje	117282a690	Remove unused define. Change-Id: Ic6555128206d61f47a46c550cb3dcaf3b4ec6374	2013-05-30 17:21:06 -07:00
Ronald S. Bultje	a433abbcad	Merge VP9_YMODES, VP9_UV_MODES, INTRA_MODE_COUNT and cousins. These are now merged in a new define called VP9_INTRA_MODES. Change-Id: I0890f895756a7395d84c92f98f43e43f4cf9050d	2013-05-30 17:21:06 -07:00
Ronald S. Bultje	4d3d00b195	Remove TX_SIZE_MAX_MB. Change-Id: I715870513d1fef8471bfd0f5218a79360a1ef126	2013-05-30 17:21:06 -07:00
Ronald S. Bultje	580d29bdbb	Remove one (unused) entry from mvref tables. Change-Id: Ieb4669ae564bec9f3051485ecdf186cb4e00decb	2013-05-30 17:21:06 -07:00
Ronald S. Bultje	e6485581fe	Remove splitmv. We leave it in rdopt.c as a local define for now - this can be removed later. In all other places, we remove it, thereby slightly decreasing the size of some arrays in the bitstream. Change-Id: Ic2a9beb97a4eda0b086f62c039d994b192f99ca5	2013-05-30 17:21:01 -07:00
Ronald S. Bultje	1efa79d32f	Remove i4x4_pred. It remains as a local define in rdopt.c so we can distinguish between split and non-split modes in the RD loop, but disappears outside that scope in the codec. Change-Id: I98c18fe5ab7e4fbd1d6620ec5695e2ea20513ce9	2013-05-30 16:44:58 -07:00
Ronald S. Bultje	9175082c4e	Remove unused table. Change-Id: I80c37cffa176bac942ab3051abdfd585ed5555e1	2013-05-30 16:44:56 -07:00
Yaowu Xu	042e70e45e	Changed to use a new variant of WHT The commit changed to use a new variant of Walsh-Hadamard Transform by Tim Terriberry. This new variant has the best compression among a number of variants developed by Tim. Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9	2013-05-30 15:37:52 -07:00
Ronald S. Bultje	f5827699bf	Merge "Merge all intra mode coding trees into a single one." into experimental	2013-05-30 11:27:51 -07:00
Adrian Grange	6f361f5841	Merge "Add intra_only and reset_frame_context flags" into experimental	2013-05-30 10:56:25 -07:00
Ronald S. Bultje	98c192ae83	Merge all intra mode coding trees into a single one. Also merge all counters. This removes a few unused probability updates from the bitstream. Change-Id: I20f58853e9dac84d8c0d9703ae012c55917516eb	2013-05-30 09:58:53 -07:00
Deb Mukherjee	c98bfcfbbb	Merge "Balancing coef-tree to reduce bool decodes" into experimental	2013-05-30 08:10:47 -07:00
Sami Pietila	5700b4ea42	Replace scatter scan 32x32 with HW friendly scan. The first 240 coeff positions (15 top-left blocks) are scanned in the same order as in scatter scan, after that the coeffs are scanned in "block bands", each band at a time, all coeffs in one band before moving on to the next band. This brings down the amount of 4x4 coeff blocks that need to be buffered while scanning, from 15 blocks to 8 blocks. Change-Id: I478a991d63c48bd5e64d36e59fed7a00c9a651ba	2013-05-30 15:32:46 +03:00
Paul Wilkins	1b103f250f	Patch to remove implicit segmentation. This patch removes the implicit segmentation experiment from the code base as the benefits were still unproven as of the bitstream deadline. Change-Id: I273b99d8d621d1853eac4182f97982cb5957247e	2013-05-30 11:06:29 +01:00
Ronald S. Bultje	17544d1478	Merge "Remove some unused code related to macroblock/splitmv coding." into experimental	2013-05-29 17:35:05 -07:00
Ronald S. Bultje	7873de1481	Merge "Remove unused and outdated debug code." into experimental	2013-05-29 17:33:32 -07:00
Adrian Grange	9e5bb9598c	Add intra_only and reset_frame_context flags Added two flags to the frame header: intra_only: Signals that the frame is encoded using only INTRA coding modes. reset_frame_context: Indicates that the coding context specified in the frame header should be reset to default values before the frame is encoded/decoded. Change-Id: I182d46f1f84fb67a13c46ad767f246a38d7861a2	2013-05-29 17:16:00 -07:00
Deb Mukherjee	b8b3f1a46d	Balancing coef-tree to reduce bool decodes This patch changes the coefficient tree to move the EOB to below the ZERO node in order to save number of bool decodes. The advantages of moving EOB one step down as opposed to two steps down in the other parallel patch are: 1. The coef modeling based on the One-node becomes independent of the tree structure above it, and 2. Fewer context/counter increases are needed. The drawback is that the potential savings in bool decodes will be less, but assuming that 0s are much more predominant than 1's the potential savings is still likely to be substantial. Results on derf300: -0.237% Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242	2013-05-29 16:25:52 -07:00
Dmitry Kovalev	38cb616fbf	Merge "Compressed/uncompressed frame header changes." into experimental	2013-05-29 15:29:44 -07:00
Scott LaVarnway	353642bc53	Moved use_prev_in_find_mv_refs check to frame level This patch checks at the frame level to see if the previous mode info context can be used. This patch eliminates the flag check that was done for every mode and removes another check that was done prior to every vp9_find_mv_refs(). Change-Id: I9da5e18b7e7e28f8b1f90d527cad087073df2d73	2013-05-29 16:42:23 -04:00
Jingning Han	6c97bba403	Merge "further clean-ups on intra4x4 coding" into experimental	2013-05-29 10:55:14 -07:00
Sami Pietila	88a4d4c510	Residual coding to cache energy class of tokens. Proposal for tuning the residual coding by changing how the context from previous tokens is calculated. Storing the energy class of previous tokens instead of the token itself eases the critical path of HW implementations. Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab	2013-05-29 15:21:01 +01:00
Ronald S. Bultje	4487f5a690	Remove some unused code related to macroblock/splitmv coding. Change-Id: Ic40d56fb162f4e201547dfae33e62ccd9e865889	2013-05-29 06:29:56 -07:00
Ronald S. Bultje	2afc3422c6	Remove unused and outdated debug code. Change-Id: I0e789bdeaed60f920f7a470e56a8d4ea374233fc	2013-05-28 19:15:57 -07:00
Dmitry Kovalev	18c83b3714	Compressed/uncompressed frame header changes. Adding API to read/write uncompressed frame header bits (it is not final yet). Separate functions to read/write uncompressed header. Moving clr_type, error_resilient_mode, refresh_frame_context, frame_parallel_decoding_mode, frame_context_idx from compressed partition to uncompressed frame header. Change-Id: Id3ed8a387980c652ae147549412f4ec24a0a5bd0	2013-05-28 18:07:54 -07:00
Deb Mukherjee	3d4e032e16	Merge "Clean up related to coefficient modeling" into experimental	2013-05-28 16:55:02 -07:00

1433 Commits