generic-library/vpx

Author	SHA1	Message	Date
James Zern	dc1d2331f6	vp9: remove frames_{since,till}.. from MACROBLOCKD frames_since_golden / frames_till_alt_ref_frame are unused. Change-Id: I348e7689d4d75412cf4de7703d885be942e4a26b	2013-07-13 18:02:11 -07:00
Dmitry Kovalev	429070987a	Using vp9_copy and vp9_zero instead of custom code. Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b	2013-07-12 18:07:43 -07:00
Dmitry Kovalev	aa518af8c7	Merge "Adding struct tx_probs and struct tx_counts to cleanup the code."	2013-07-12 16:02:09 -07:00
James Zern	c9a2a06c20	Merge "vp9_postproc: remove useless self-assign"	2013-07-12 15:41:41 -07:00
Dmitry Kovalev	cc662dd768	Adding struct tx_probs and struct tx_counts to cleanup the code. Also removing unused declarations from vp9_entropymode.h file. Change-Id: Ib9c5826db3584a32f6bb3297a76c522b99d83402	2013-07-12 15:22:38 -07:00
Dmitry Kovalev	60969da5cb	Merge "Code cleanup in vp9_pred_common.c"	2013-07-12 15:04:07 -07:00
James Zern	cca973a1ab	vp9_postproc: remove useless self-assign Change-Id: I0bc5d2d8c9fec8be18263b0dc2528886bb5b7b61	2013-07-12 14:17:15 -07:00
Dmitry Kovalev	3ab86adb1e	Code cleanup in vp9_pred_common.c No bitstream changes. Using MB_MODE_INFO temp variables instead of MODE_INFO variables. Removing redundant curly braces. Change-Id: Ib9d1bedfbd8af97ecc722ccf697ea8177bbe287c	2013-07-12 14:11:48 -07:00
James Zern	0195fb53cb	vp9: consistent 'log2' variable naming lg2 -> log2 Change-Id: I0602ddff49e42c9c40c29c084d04b7592b9f8edf	2013-07-12 11:37:43 -07:00
Deb Mukherjee	94c481f9f1	Some minor cleanups for efficiency Implements some of the helper functions more efficiently with lookups rather than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f	2013-07-12 10:22:56 -07:00
Dmitry Kovalev	dd150e8ea9	Removing redundant code mostly from vp9_pred_common.{h, c}. Removing redundant function arguments and curly braces. Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b	2013-07-11 18:39:10 -07:00
Jingning Han	dac5891a1a	Merge "SSE2 4x4 inverse ADST/DCT transform"	2013-07-11 14:17:23 -07:00
Dmitry Kovalev	b55ecafda8	Merge "Making vp9_default_nmv_context static."	2013-07-11 13:58:34 -07:00
Dmitry Kovalev	c4ad3273c7	Moving segmentation related vars into separate struct. Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03	2013-07-11 11:57:57 -07:00
Johann	158c80cbb0	convolve8 optimizations for neon Independent horizontal and vertical implementations. Requires that blocks be built from 4x4 and [xy]_step_q4 == 16 6-10% improvement. CIF improved the least. Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda	2013-07-11 11:08:19 -07:00
hkuang	c9b25dcae4	Add neon optimize vp9_dc_only_idct_add. Change-Id: Iae84ab945cc9662a0ddd839aa2b9ca59f2ae5423	2013-07-11 10:30:47 -07:00
Jim Bankoski	5000cdf0ff	Merge "Wide loopfilter 16 pix at a time"	2013-07-11 06:44:02 -07:00
Jingning Han	49b6302044	SSE2 4x4 inverse ADST/DCT transform Enable SSE2 4x4 inverse ADST/DCT transform. The runtime goes from 292 cycles down to 89 cycles. Running bus_cif at 2000 kbps, the overall runtime of speed 0 goes from 301s to 295s (2% speed-up). Change-Id: I24098136e7fee7ab2fbf1c11755bdf2ca37f3628	2013-07-10 20:16:02 -07:00
Ronald S. Bultje	decead7336	Replace copy_memNxM functions with a generic copy/avg function. Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa	2013-07-10 18:27:24 -07:00
Dmitry Kovalev	ac72ad071d	Making vp9_default_nmv_context static. Change-Id: Ia3d5bd45adf288de11ab59c4728266c93c17e275	2013-07-10 17:44:45 -07:00
Ronald S. Bultje	46997bde88	Merge "Remove unused iwalsh4x4 MMX/SSE2 functions."	2013-07-10 17:08:46 -07:00
Ronald S. Bultje	a7ef456453	Merge "Remove unused 16x3/3x16 sad SSE2 functions."	2013-07-10 17:08:43 -07:00
John Koleszar	64f7a4d8cb	Wide loopfilter 16 pix at a time Where possible, do the 16 pixel wide filter while doing the horizontal filtering pass. The same approach can be taken for the mbloop_filter when that's implemented. Doing so on the vertical pass is a little more involved, but possible. Change-Id: I010cb505e623464247ae8f67fa25a0cdac091320	2013-07-10 16:32:44 -07:00
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Ronald S. Bultje	3f210f10eb	Remove unused iwalsh4x4 MMX/SSE2 functions. Change-Id: I2d22577911a37ed7d8c7e08cac20764842267652	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	48c53233fd	Remove unused 16x3/3x16 sad SSE2 functions. Change-Id: I30a597c0cc366e34c9a3e2afe32d70e044f95ca4	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	e6f955251f	Merge "SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction."	2013-07-10 14:52:23 -07:00
Ronald S. Bultje	6a60249071	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 TM intra prediction."	2013-07-10 14:52:19 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Jingning Han	114423538f	SSE2 16x16 ADST/DCT hybrid transform This commit enables 16x16 ADST/DCT forward hybrid transform using SSE2 operations. It reduces the runtime from 5433 cycles to 1621 cycles, at no compression performance loss. Change-Id: I75fd7f1984e9e28846af459f810ff0d6ae125230	2013-07-10 12:14:53 -07:00
John Koleszar	d1f8dd518c	Merge "Fix intermediate height in convolve"	2013-07-10 11:04:40 -07:00
Ronald S. Bultje	44b29a769c	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 V intra prediction."	2013-07-10 10:24:16 -07:00
Ronald S. Bultje	89810bfd71	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction."	2013-07-10 10:13:16 -07:00
Dmitry Kovalev	20986c81b3	Merge "Removing vp9_maskingmv.c and corresponding assembly file."	2013-07-10 10:05:06 -07:00
Ronald S. Bultje	7fd643264a	SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction. Change-Id: Iad70966b986f65259329070e258f76ef0af816b4	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	8dade638a1	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 TM intra prediction. Change-Id: I3441c059214c2956e8261331bbf521525a617a86	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	75b33c68c7	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 V intra prediction. Change-Id: I55a6cfa2daba738cbc0c4a02f806893f7e556997	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	92c5d3665d	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction. Change-Id: Ibe1690afc5459f3b3beca401e7734fcd03da6dd0	2013-07-10 09:28:03 -07:00
Jim Bankoski	863204e64d	mi_width_log2 & mi_height_log2 converted to lookup to avoid unnecessary code Change-Id: I2ee6a01f06984cc2c4ba74b3fffd215318f749d2	2013-07-10 07:26:08 -07:00
Jim Bankoski	6c8170af52	b_width_log2 and b_height_log2 lookups Replace case statement with lookup. Small speed gain at low speed settings but at speed 2+ where the number of motion searches etc. falls the impact rises to ~3-4%. Change-Id: Idff639b7b302ee65e042b7bf836943ac0a06fad8 Change-Id: I5940719a4a161f8c26ac9a6753f1678494cec644	2013-07-10 07:19:09 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
James Zern	dac57fece6	Merge "Remove all asm offset files from VP9"	2013-07-09 19:13:37 -07:00
Dmitry Kovalev	2824048a56	Merge "Loop filter code cleanup."	2013-07-09 18:56:19 -07:00
Frank Galligan	53971d86ea	Merge "Add Neon horizontal and vertical vp9_mbloop_filter"	2013-07-09 15:38:44 -07:00
John Koleszar	f0d9f10d24	Remove all asm offset files from VP9 The files are empty and unused. Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a	2013-07-09 14:26:53 -07:00
Frank Galligan	198fa6d0a0	Add Neon horizontal and vertical vp9_mbloop_filter - The vp9 mbfilter C code will branch on flat and mask. This CL will perform both branches and combine the data. A later CL will perform a check to see if all patch will take one branch. - These functions are about 1.75 times faster than the C code on Nexus 7. PS #3 - Changed all functions to dup limit, blimit, and thresh from vld {dx[]}, freeing up r4-r6. - Changed code to use vbif to reduce one instruction and free up a d register. Change-Id: I028dae0e434dc9891c3677bdb182e201ffb04777	2013-07-09 12:40:05 -07:00
Dmitry Kovalev	ec68d25521	Merge "Adding update_tx_ct function, removing duplicated code."	2013-07-09 12:26:11 -07:00
Dmitry Kovalev	aeed28f143	Removing vp9_maskingmv.c and corresponding assembly file. Change-Id: I9842d02d61d78d17dc3449bae8ffbe60f4b3ecb3	2013-07-09 11:22:56 -07:00

1077 Commits