generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	2c594a5275	Removing vp9_systemdependent.c. Change-Id: I7b9738a7113c0c4687e5d320581ff69d98a8b271	2014-02-26 18:07:23 -08:00
levytamar82	3068d7d944	SSSE3 convolution optimization Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 my optimizations include: -processing 2x8 elements in one 128 bit register instead of processing 8 elements in one 128 bit register. -removing unnecessary loads. This optimization gives between 2.4% user level gain for 480p input and 1.6% user level gain for 720p. This Optimization is done only for 64 bit Change-Id: Ic07fce2f9360329b4f2d956efda1480ae958766b	2014-02-14 15:08:42 -07:00
levytamar82	876c72a093	AVX2 Convolve Optimization Two convolve functions were optimized for AVX2: 1. vp9_filter_block1d16_h8 2. vp9_filter_block1d16_v8 vp9_filter_block1d16_v8 was optimized for AVX2 by reducing the number of loop strides by half, two strides were processed in parallel. vp9_filter_block1d16_v8 was also optimized in the same way also some of the loads were being done outside of the loop and by that preventing redundant loads. This Optimization gives 43% function level gain and 1.3% user level gain. Now can be compiled in Windows Change-Id: I2714124cfb0c14a77d7a0ce126a20db92ffbf92c	2014-02-12 20:45:31 -07:00
Frank Galligan	d51ca0db00	Merge "Add get release decoder frame buffer functions."	2014-02-11 08:19:37 -08:00
James Zern	66bfc69bfc	Merge "*.mk: s/\bUSE_X86INC/CONFIG_USE_X86INC/"	2014-02-10 15:39:28 -08:00
Frank Galligan	e8e152799b	Add get release decoder frame buffer functions. This CL changes libvpx to call a function when a frame buffer is needed for decode. Libvpx will call a release callback when no other frames reference the frame buffer. This CL adds a default implementation of the frame buffer callbacks. Currently only VP9 is supported. A future CL will add support for applications to supply their own frame buffer callbacks. Change-Id: I1405a320118f1cdd95f80c670d52b085a62cb10d	2014-02-10 14:08:11 -08:00
James Zern	7cf0c783c1	*.mk: s/\bUSE_X86INC/CONFIG_USE_X86INC/ CONFIG_USE_X86INC is available to every makefile, there's no need to duplicate its value with USE_X86INC Change-Id: Id12bd5f09cba78abba56ab5a8f56351562e5b8b6	2014-02-04 20:04:38 -08:00
Yunqing Wang	d1961e6fbf	Optimize bilinear sub-pixel filters in ssse3 This patch added ssse3 optimization of bilinear sub-pixel filters. The real time encoder was speeded up by ~1%. Change-Id: Ie82e98976f411183cb8c61ab8d2ba0276e55a338	2014-02-04 08:01:55 -08:00
Dmitry Kovalev	282f36adc4	Merge "Removing "_short" suffix from arm transform file names."	2014-02-03 14:28:47 -08:00
Yunqing Wang	2488cb34bc	Optimize bilinear sub-pixel filters in sse2 Using bilinear filters could speed up the codec in real-time mode. This patch added sse2 optimizations of bilinear filters that operate on different-sized blocks. Tests showed that the real-time encoder was speeded up by 3%. Change-Id: If99a7ee4385fcc225c3ee7445d962d5752e57c3f	2014-02-03 10:34:45 -08:00
Jim Bankoski	9dec7712ab	static function convert to inline or global vp9_blockd.h Change-Id: Ifdd951f24932839f06d1c700371662511dde6ebe	2014-01-31 19:50:40 -08:00
Dmitry Kovalev	c49b08c9a1	Removing "_short" suffix from arm transform file names. Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507	2014-01-31 17:19:02 -08:00
Yunqing Wang	d2bb0c51d3	Revert "Revert "Revert "SSSE3 convolution optimization""" This reverts commit `f9404f2406`. This patch caused some ASAN error. Change-Id: If15b7e581310e19061d111c69f2931809662ed19	2014-01-16 16:11:46 -08:00
Yunqing Wang	f9404f2406	Revert "Revert "SSSE3 convolution optimization"" This reverts commit `b645257121`. Change-Id: I60d1bf57ae8e9eb6127f42f2d5a780124ac51b45	2014-01-13 12:29:55 -08:00
Paul Wilkins	b645257121	Revert "SSSE3 convolution optimization" This reverts commit `511d218c60`. In current form intrinsics break borg build. Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9	2014-01-10 13:38:26 +00:00
Yunqing Wang	f3b9b97c0e	Merge "SSSE3 convolution optimization"	2014-01-09 12:39:47 -08:00
levytamar82	511d218c60	SSSE3 convolution optimization Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 my optimizations include: -processing 2x8 elements in one 128 bit register instead of processing 8 elements in one 128 bit register. -removing unnecessary loads. This optimization gives between 2.4% user level gain for 480p input and 1.6% user level gain for 720p. This Optimization is done only for 64bit. Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969	2014-01-09 12:27:51 -07:00
hkuang	5be0ed30dc	Merge "Add initial intra frame neon optimization. 1~2% gain."	2014-01-08 14:41:43 -08:00
hkuang	691111aacf	Add initial intra frame neon optimization. 1~2% gain. More intra optimizations will be added. Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8	2014-01-08 11:58:42 -08:00
Dmitry Kovalev	987810ad95	Removing vp9_findnearmv.{h, c} files. Moving all code from that files to vp9_mvref_common.{h, c}. Change-Id: Ibc4afcb8cea6847166ff411130e93611ebe63b20	2013-12-19 17:39:57 -08:00
Dmitry Kovalev	b5c9261832	Converting vp9_treecoder.h to vp9_prob.{h, c} Moving vp9_norm probability table from vp9_entropy.c to vp9_prob.c Change-Id: Ie757b73860c6f43130790c332b292e2a1a81b788	2013-12-16 12:53:09 -08:00
Dmitry Kovalev	4ac6a2552b	Moving vp9_tree_probs_from_distribution() to encoder. Writing custom coeff branch count calculation (which is much clearer) in adapt_coef_probs() function. Removing vp9_treecoder.c file. Change-Id: I8880fb7a39996c8bcf6cd0acf9898a8c712ba91f	2013-12-05 18:13:26 -08:00
Dmitry Kovalev	4afd141a05	Removing vp9_default_coef_probs.h file. Moving all probability tables from removed file to vp9_entropy.c. Change-Id: I12846f1da778c3016d96b82e53384d4634883430	2013-12-04 17:04:35 -08:00
Frank Galligan	b4874e2c82	Fix 16 wide neon horz loopfilter. Multiply by 3 was on 8bit vectors when it should have been on 16bit vectors. Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc	2013-11-26 10:02:40 -08:00
Frank Galligan	97d1258375	Revert "Add 16 wide neon horz loopfilter." The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0	2013-11-21 14:01:33 -08:00
Frank Galligan	2dd77580c0	Merge "Add 16 wide neon horz loopfilter."	2013-11-21 10:29:30 -08:00
Frank Galligan	98de15137e	Add 16 wide neon horz loopfilter. Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d	2013-11-21 09:39:36 -08:00
Yaowu Xu	30b03050a2	Move vp9_sadmxn.h from common to encoder Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b	2013-11-19 12:46:08 -08:00
Yaowu Xu	a42ab027fd	Merge "Move vp9_extend.{h,c} from common to encoder"	2013-11-18 15:43:32 -08:00
Yaowu Xu	1c61e1960d	Move vp9_extend.{h,c} from common to encoder Since they are used in the encoder only. This commit also reorders includes for the files that include vp9_extend.h Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459	2013-11-18 12:43:36 -08:00
Yunqing Wang	64f728caef	Do horizontal loopfiltering in parallel This patch followed "Rewrite filter_selectively_horiz for parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Using tulip clip, tests showed it gave a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35	2013-11-15 16:18:43 -08:00
Johann	4da2a8b718	Merge "mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)"	2013-11-13 09:00:09 -08:00
Parag Salasakar	1530a6b77f	mips dsp-ase r2 vp9 decoder intra module optimizations (rebase) Change-Id: Ib27fc4f3dbe01fe8adfa04a61aaba21b3480e75c	2013-11-13 11:17:14 +05:30
Parag Salasakar	248cf6f69f	mips dsp-ase r2 vp9 decoder loopfilter module optimizations (rebase) Change-Id: Ia7f640ca395e8deaac5986f19d11ab18d85eec2d	2013-11-13 10:53:16 +05:30
hkuang	a6462990e6	Merge "Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3."	2013-11-07 14:42:29 -08:00
hkuang	6b16f63332	Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3. Change-Id: I034848cf05031618818f7df2e7f9c35102686948	2013-11-05 14:57:32 -08:00
Tamar Levy	54f9205653	mb_lpf_horizontal_edge AVX2 optimization This CL contains two AVX2 optimized loop filter functions, mb_lpf_horizontal_edge_w_avx2_8 and mb_lpf_horizontal_edge_w_avx2_16. Change-Id: I604e4fe6e99752b7800c2ea98721d97f7e0b931b	2013-10-31 10:26:15 -06:00
Parag Salasakar	1699eb0bf6	mips dsp-ase r2 vp9 decoder idct module optimizations (rebase) Change-Id: Iedcdb8867084f328f4fce2fadb968e0984217308	2013-10-24 11:29:04 +05:30
Yunqing Wang	3a0b59e3fd	Merge "SSE2 8-tap sub-pixel filter optimization"	2013-10-11 08:44:56 -07:00
Yunqing Wang	3fb728c749	SSE2 8-tap sub-pixel filter optimization To ensure fast encoding/decoding on devices without ssse3 support, SSE2 optimization of sub-pixel filters was done. Test using 1080p clip showed the decoder speeds were ~70fps with ssse3 filters, ~60fps with sse2 filters, and ~15fps with c filters. Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c	2013-10-10 14:12:47 -07:00
Dmitry Kovalev	9a1250e3e0	Merge "Moving all scan/iscan code into separate vp9_scan.{h, c} files."	2013-10-10 10:45:07 -07:00
Parag Salasakar	eeb5b62dc1	mips dsp-ase r2 vp9 decoder bilinear convolve optimizations Change-Id: Ic31b4ef85e65070b4f8b9f26e068ccfaae00c4f0	2013-10-09 18:05:27 +05:30
Dmitry Kovalev	e3597c6af7	Moving all scan/iscan code into separate vp9_scan.{h, c} files. Now we have entropy code separate from scan/iscan code. The next step in future is to move iscan code from common part to the encoder. Change-Id: Id9732f7d80aec00af35c1d58d1137c4c96c91451	2013-10-07 13:55:56 -07:00
Parag Salasakar	40edab5e39	mips dsp-ase r2 vp9 decoder convolve module optimizations Change-Id: I401536778e3c68ba2b3ae3955c689d005e1f1d59	2013-10-02 16:58:37 -07:00
Dmitry Kovalev	efbacc9f89	Merge "Removing vp9_subpelvar.h from common."	2013-09-29 12:00:46 -07:00
Christian Duvivier	b1b4ba1bdd	Properly save neon registers. Replace current code which corrupts the stack by duplicate of vp8 code to save and restore neon registers. Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a	2013-09-27 14:25:33 -07:00
Christian Duvivier	5b1dc1515f	Fix a bunch of TODO from vp9_short_idct32x32_add_neon. - full ASM version, no more C gateway file. - integrate combine-add with last step of 2nd pass. - remove a few push/pop pairs. - some instruction reordering to hide latency. Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6	2013-09-25 21:15:19 -07:00
Dmitry Kovalev	64eff7f360	Removing vp9_subpelvar.h from common. Moving all code from that file to vp9_variance_c.c in the encoder. Change-Id: Ic803d5b4c78d5191e4d25541b3df97337878fc3e	2013-09-25 16:10:43 -07:00
James Zern	2d58761993	Revert "Improved 8t filters" This is incompatible with most toolchains other than gcc. Revert "Deleted #include <inttypes.h>" This reverts commit `4d018be950`. This reverts commit `d22a504d11`. Change-Id: I1751dc6831f4395ee064e6748281418e967e1dcf	2013-09-13 15:13:06 -07:00
hkuang	86fb12b600	Merge "Add neon optimize iht8x8 which is 282% faster than C."	2013-09-12 15:42:44 -07:00
hkuang	182366c736	Add neon optimize iht8x8 which is 282% faster than C. Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530	2013-09-12 11:49:05 -07:00
Christian Duvivier	6a501462f8	First draft of vp9_short_idct32x32_add_neon. Lots of TODO which will be taken care in upcoming changes. As is, about 6x faster than C version. Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0	2013-09-11 15:19:38 -07:00
Scott LaVarnway	d22a504d11	Improved 8t filters Reformatted version of a patch submitted by Erik/Tamar from Intel. For the test clips used, the decoder performance improved by ~2%. Change-Id: Ifbc37ac6311bca9ff1cfefe3f2e9b7f13a4a511b	2013-09-11 13:56:32 -04:00
hkuang	3c05bda058	Merge "Add neon optimize vp9_short_iht4x4_add."	2013-09-04 13:35:09 -07:00
hkuang	3b8614a8f6	Add neon optimize vp9_short_iht4x4_add. Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e	2013-09-04 12:37:58 -07:00
Jim Bankoski	79401542f7	make vp9 postproc a config option Vp9 postproc is disabled for now as it's not been shown to help and may be merged with vp8. Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057	2013-09-04 10:02:08 -07:00
hkuang	3a679e56b2	Add neon optimize vp9_short_idct16x16_1_add. Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5	2013-08-27 14:00:27 -07:00
hkuang	36e9b82080	Add neon optimize vp9_short_idct8x8_1_add. Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4	2013-08-26 16:28:57 -07:00
hkuang	69384f4fad	Add neon optimize vp9_short_idct4x4_1_add. Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5	2013-08-26 15:55:16 -07:00
Johann	a9aa7d07d0	Merge "vp9: neon: add vp9_convolve_avg_neon"	2013-08-15 14:55:15 -07:00
Johann	63e140eaa7	Merge "vp9: neon: add vp9_convolve_copy_neon"	2013-08-15 14:55:08 -07:00
hkuang	39f42c8713	Merge "Add neon optimize vp9_short_idct16x16_add."	2013-08-14 14:16:20 -07:00
hkuang	cf6beea661	Add neon optimize vp9_short_idct16x16_add. Change-Id: I27134b9a5cace2bdad53534562c91d829b48838d	2013-08-14 13:52:16 -07:00
Mans Rullgard	0f1deccf86	vp9: neon: add vp9_convolve_avg_neon Change-Id: I33cff9ac4f2234558f6f87729f9b2e88a33fbf58	2013-08-14 16:27:55 +01:00
Mans Rullgard	635ba269be	vp9: neon: add vp9_convolve_copy_neon Change-Id: I15adbbda15d1842e9f15f21878a5ffbb75c3c0c9	2013-08-14 16:27:55 +01:00
Dmitry Kovalev	8ffe85ad00	Moving scale_factors and related code to separate files. Change-Id: I531829e5aee2a4a7a112d528ecccbddf052d0e74	2013-08-09 14:07:09 -07:00
Christian Duvivier	78182538d6	Neon version of vp9_short_idct4x4_add. Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4	2013-08-06 18:41:27 -07:00
Jim Bankoski	6eb1254b88	sse3 intrapred x86inc protected Change-Id: I4a3c83119cdf8a205920034c8019d855d5504605	2013-08-06 14:17:13 -07:00
Jim Bankoski	25ec1375c9	intrapred x86inc guards Change-Id: If0399d8e11f4ebe75a5c91abb8d6a52a7709065b	2013-08-06 09:39:30 -07:00
Jim Bankoski	c3809f3de5	Begin to restrict x86inc.asm usage Chromium does not support 32bit builds for Mac which use x86inc.asm. Make the files which include it work if 64bit or not PIC enabled starting with vp9_copy_sse2.asm Consolidate these targets in vp9_rtcd_defs.sh Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248	2013-08-05 12:07:30 -07:00
Mans Rullgard	d85ae87183	vp9: neon: add vp9_mb_lpf_* functions Change-Id: I13e0880df234f15abc4cc7c57fe84488d5d46a75	2013-08-02 08:10:50 -07:00
hkuang	d757de744c	Add neon optimize vp9_short_idct8x8_add. Change-Id: Ic32acf3e2939c6d12d9c2bf192a5f5da59705fda	2013-07-18 16:40:41 -07:00
Johann	9ca66ec050	Merge "vp9_convolve8_neon placeholder"	2013-07-17 10:09:00 -07:00
Johann	59dc4e9cdd	vp9_convolve8_neon placeholder Call the individually optimized horizontal and vertical functions. This implementation abuses the temp buffer. This will be replaced with a custom optimized function. Over 2x speedup. Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd	2013-07-17 08:39:27 -07:00
James Zern	98e132bde0	Merge changes I40454d26,I892e76d5,I865ab3f9,I4a4bec17,I61c4351e,I37eb3559,I1031c556,I8c8f1f42 * changes: delete vp9_loopfilter_sse2.asm vp9_loopfilter_intrin_sse2: cosmetics: fix indent delete x86/vp9_loopfilter_x86.h vp9_loopfilter_intrin_sse2: make some funcs static vp9_loopfilter_intrin_sse2: remove unused uv funcs vp9_loopfilter: remove uv function typedef filter_block_plane: reuse some constants vp9_loopfilter.c: make some functions static	2013-07-16 14:25:32 -07:00
James Zern	50015f6eba	delete vp9_loopfilter_sse2.asm sse2 functions are provided by vp9_loopfilter_intrin_sse2.c Change-Id: I40454d26034e3ef915eeaf889937fe7d1b519b9b	2013-07-16 13:09:16 -07:00
James Zern	af58254267	delete x86/vp9_loopfilter_x86.h also remove prototype_loopfilter{,_block} defines from vp9_loopfilter.h Change-Id: I865ab3f9436c7b1ca166f76630328abf01389405	2013-07-16 13:09:05 -07:00
Dmitry Kovalev	baf0c959c7	Moving vp9_kf_default_bmode_probs to vp9_entropymode.c. Removing vp9_modelcontext.c. Change-Id: If2316c58dead2708d9f95b52d9494ba4c1dd7427	2013-07-16 10:54:34 -07:00
Johann	a15bebfc0a	vp9_convolve8_[horiz|vert]_avg Super basic conversion from the other implementations. Any changes to one should be trivial to copy over to keep in sync. Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8	2013-07-12 16:21:33 -07:00
Johann	158c80cbb0	convolve8 optimizations for neon Independent horizontal and vertical implementations. Requires that blocks be built from 4x4 and [xy]_step_q4 == 16 6-10% improvement. CIF improved the least. Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda	2013-07-11 11:08:19 -07:00
hkuang	c9b25dcae4	Add neon optimize vp9_dc_only_idct_add. Change-Id: Iae84ab945cc9662a0ddd839aa2b9ca59f2ae5423	2013-07-11 10:30:47 -07:00
Ronald S. Bultje	decead7336	Replace copy_memNxM functions with a generic copy/avg function. Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa	2013-07-10 18:27:24 -07:00
Ronald S. Bultje	3f210f10eb	Remove unused iwalsh4x4 MMX/SSE2 functions. Change-Id: I2d22577911a37ed7d8c7e08cac20764842267652	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	48c53233fd	Remove unused 16x3/3x16 sad SSE2 functions. Change-Id: I30a597c0cc366e34c9a3e2afe32d70e044f95ca4	2013-07-10 14:52:47 -07:00
Ronald S. Bultje	e6f955251f	Merge "SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction."	2013-07-10 14:52:23 -07:00
Ronald S. Bultje	89810bfd71	Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction."	2013-07-10 10:13:16 -07:00
Dmitry Kovalev	20986c81b3	Merge "Removing vp9_maskingmv.c and corresponding assembly file."	2013-07-10 10:05:06 -07:00
Ronald S. Bultje	7fd643264a	SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction. Change-Id: Iad70966b986f65259329070e258f76ef0af816b4	2013-07-10 09:28:03 -07:00
Ronald S. Bultje	92c5d3665d	SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction. Change-Id: Ibe1690afc5459f3b3beca401e7734fcd03da6dd0	2013-07-10 09:28:03 -07:00
Jim Bankoski	6c8170af52	b_width_log2 and b_height_log2 lookups Replace case statement with lookup. Small speed gain at low speed settings but at speed 2+ where the number of motion searches etc. falls the impact rises to ~3-4%. Change-Id: Idff639b7b302ee65e042b7bf836943ac0a06fad8 Change-Id: I5940719a4a161f8c26ac9a6753f1678494cec644	2013-07-10 07:19:09 -07:00
John Koleszar	f0d9f10d24	Remove all asm offset files from VP9 The files are empty and unused. Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a	2013-07-09 14:26:53 -07:00
Dmitry Kovalev	aeed28f143	Removing vp9_maskingmv.c and corresponding assembly file. Change-Id: I9842d02d61d78d17dc3449bae8ffbe60f4b3ecb3	2013-07-09 11:22:56 -07:00
Ronald S. Bultje	8350e7fe38	Make intra prediction pointers RTCD-based. This probably has a mildly negative impact on performance, but will (in future commits - or possibly merged with this one) allow SIMD implementations of individual intra prediction functions. We may perhaps want to consider having separate functions per txfm-size also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for each intra prediction mode), but I haven't played much with that yet. Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269	2013-07-08 17:25:51 -07:00
Dmitry Kovalev	904070ca64	Merge "Removing unused implicit segmentation code."	2013-07-02 11:58:48 -07:00
Dmitry Kovalev	a3d2e6c98b	Removing unused implicit segmentation code. Change-Id: I8a2983fb14274a6ac53681fa4cd5d4209cbd2905	2013-07-02 11:16:42 -07:00
Dmitry Kovalev	1ac0540296	Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h. Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c	2013-07-01 17:28:08 -07:00
Dmitry Kovalev	2ab3bc8871	Removing vp9_modecont.{h, c}. Moving vp9_default_inter_mode_probs array to vp9_entropymode.c. Change-Id: I88ebda86ccc07f2a43c6c01d4b37898214cfb6de	2013-07-01 10:17:15 -07:00
Frank Galligan	1d6dc1b702	Add Neon optimized loop filter functions. - Added vp9_loop_filter_horizontal_edge_neon and vp9_loop_filter_vertical_edge_neon. - The functions are based off the vp8 loopfilter functions. - Matches x86 md5 checksum. Change-Id: Id1c4dddb03584227e5ecd29f574a6ac27738fdd0	2013-06-27 16:14:45 -07:00
Dmitry Kovalev	dfc0385291	Merge "Removing vp9_invtrans.{c, h} files."	2013-06-18 10:16:25 -07:00
Ronald S. Bultje	d9fc451666	Move subpixel variance function from common/ to encoder/. This seems to only be used in the encoder. Also remove an empty wrapper file that contained forward declarations for this function, but didn't actually define any actual functions. Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b	2013-06-17 16:54:09 -07:00
Dmitry Kovalev	686b99741c	Removing vp9_invtrans.{c, h} files. Moving single function from vp9_invtrans.c to vp9_encodemb.c. Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7	2013-06-17 16:09:03 -07:00
John Koleszar	0f7a66e962	Remove constant vp9_coef_update_prob table All elements of this table are equal to 252, so replace it with a single constant VP9_COEF_UPDATE_PROB. Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9	2013-06-14 15:12:31 -07:00
John Koleszar	8933a652fc	Remove some unused loopfilter code This code is unreachable, and not useful for later reference. Change-Id: I4c9a9e0fbf859c1081bbcfbcda9710afb4b4741f	2013-06-12 11:36:00 -07:00
Jim Bankoski	ba2af976cb	print debugging info from mode info struct This commit has no impact but to help us debug issues. To Use call like this: vp9_print_modes_and_motion_vectors(cpi->common.mi, cpi->common.mi_rows, cpi->common.mi_cols, cpi->common.current_video_frame, "decode_mi.stt"); Change-Id: I89e27725dae351370eb7f311a20a145ed4f1d041	2013-06-10 14:03:17 -07:00
Ronald S. Bultje	7873de1481	Merge "Remove unused and outdated debug code." into experimental	2013-05-29 17:33:32 -07:00
Ronald S. Bultje	2afc3422c6	Remove unused and outdated debug code. Change-Id: I0e789bdeaed60f920f7a470e56a8d4ea374233fc	2013-05-28 19:15:57 -07:00
Dmitry Kovalev	18c83b3714	Compressed/uncompressed frame header changes. Adding API to read/write uncompressed frame header bits (it is not final yet). Separate functions to read/write uncompressed header. Moving clr_type, error_resilient_mode, refresh_frame_context, frame_parallel_decoding_mode, frame_context_idx from compressed partition to uncompressed frame header. Change-Id: Id3ed8a387980c652ae147549412f4ec24a0a5bd0	2013-05-28 18:07:54 -07:00
Dmitry Kovalev	1a24011469	Revert "Adding API to read/write uncompressed frame header bits." because of bitstream mismatches. This reverts commit `df037b615f` Change-Id: I1a529f2590df7bc912f5035d22311268933e3dd6	2013-05-28 02:24:52 -07:00
Dmitry Kovalev	0b2b81249b	Merge "Adding API to read/write uncompressed frame header bits." into experimental	2013-05-24 13:43:19 -07:00
Dmitry Kovalev	df037b615f	Adding API to read/write uncompressed frame header bits. The API is not final yet and can be changed. Actual layout of uncompressed frame part will be finalized later. Right now moving clr_type, error_resilient_mode, refresh_frame_context, frame_parallel_decoding_mode from first compressed partition to uncompressed frame part. Change-Id: I3afc5d4ea92c5a114f4c3d88f96858cccc15b76e	2013-05-21 15:31:32 -07:00
Scott LaVarnway	a143152600	Removed unused idct functions No longer used. Change-Id: Id28c9247cebba183c6fa786dff96824ae100132c	2013-05-21 17:59:54 -04:00
Scott LaVarnway	0c3f3bf1d5	Removed vp9_recon functions No longer used. Change-Id: Ica5166f7117f4693dffdf7633dcfc1b263103d0d	2013-05-21 13:57:50 -04:00
Dmitry Kovalev	b05247df95	Removing vp9_swap_yv12_buffer function and corresponding files. Adding static swap_yv12 function to vp9_firstpass.c. Change-Id: I7da9caab9720498db4a74c627901bf37816ed06c	2013-05-07 16:49:22 -07:00
Paul Wilkins	8c1b516d10	Deprecate the newbintramode experiment. Clean out code relating to newbintramode. Change-Id: Ie91f4f156cdf60ce0da8ca407c1c9cb00c7d0705	2013-05-07 16:00:59 -07:00
Scott LaVarnway	9c7d06e6f3	Merge "Removed vp9_setup_intra_recon()" into experimental	2013-05-07 08:24:26 -07:00
Scott LaVarnway	cb7955d83e	Removed vp9_setup_intra_recon() This setup is now handled by vp9_build_intra_predictors() when left_available and/or up_available is zero. Change-Id: I59cec0ab95f8be69ce885fd20727510e4deef8a0	2013-05-06 16:13:06 -04:00
Johann	a62fcbea30	Automatically flag intrinsic files Change-Id: Iee9894615265d42aa23c43a4183924953aedb0c6	2013-05-03 15:35:13 -07:00
Ronald S. Bultje	2dbaa4f4f4	Change above/left_context to use an 8x8 basis. Output changes slightly because of a minor bug in (at least) the sb32x16 block2above tx16x16 tables that previously existed in vp9_blockd.c. Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc	2013-04-29 10:37:25 -07:00
Johann	32a5c52856	Merge branch 'master' into experimental Conflicts: vp9/common/vp9_findnearmv.c vp9/common/vp9_rtcd_defs.sh vp9/decoder/vp9_decodframe.c vp9/decoder/x86/vp9_dequantize_sse2.c vp9/encoder/vp9_rdopt.c vp9/vp9_common.mk Resolve file name changes in favor of master. Resolve rdopt changes in favor of experimental, preserving the newer experiments. Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f	2013-04-26 12:57:10 -07:00
Johann	c5b127afea	Rename vp9_idct_x86.c Remove similarly named header file. It is obsolete. Move file to match naming style. Adjust make file to include the file correctly and remove extra unnecessary #if guard. Change-Id: Ifba07ba9938a5df08a9f4eda54a3ac4d6983f7bf	2013-04-25 11:13:02 -07:00
John Koleszar	7f7d1357a2	Merge branch 'experimental' into master VP9 preview bitstream 2, commit '868ecb55a1528ca3f19286e7d1551572bf89b642' Conflicts: vp9/vp9_common.mk Change-Id: I3f0f6e692c987ff24f98ceafbb86cb9cf64ad8d3	2013-04-16 06:49:46 -07:00
Ronald S. Bultje	a3874850dd	Make SB coding size-independent. Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code gives identical encoder results before and after. There are a few macros for rectangular block sizes under the sbsegment experiment; this experiment is not yet functional and should not yet be used. Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728	2013-04-09 21:28:27 -07:00
John Koleszar	42db454c7f	Merge branch 'master' into experimental Conflicts: vp9/vp9_common.mk Change-Id: I2cd5ab47dc31c4210cefc23a282102123d5e2221	2013-04-02 14:54:44 -07:00
Johann	3db60c8c6c	Demux vp9_loopfilter_x86.c Allow more careful targeting of compiler flags. Change-Id: I963ab4a6479dedb165419310dfca52a58a9877b8	2013-04-02 12:49:04 -07:00
Johann	6c147b9d93	vp9_sadmxn_x86 only contains SSE2 functions Rename the file and clean up includes. In the future we would like to pattern match the files which need additional compiler flags. Change-Id: I2c76256467f392a78dd4ccc71e6e0a580e158e56	2013-04-02 11:20:55 -07:00
John Koleszar	e5d7542447	Merge "Add VP9_GET_REFERENCE control" into experimental	2013-03-18 11:57:31 -07:00
John Koleszar	b3c350a1a9	Add VP9_GET_REFERENCE control This is like VP8_COPY_REFERENCE, but returns a pointer to the reference frame rather than a copy of it. This is useful when the application doesn't know what the size of the reference is, as is the case when scaling is in effect. Change-Id: I63667109f65510364d0e397ebe56217140772085	2013-03-13 19:08:06 -07:00
Yaowu Xu	005552639b	removed reference to "LLM" and "x8" The commit changed the names of files and functions to remove obsolete references to LLM and x8. Change-Id: I973b20fc1a55149ed68b5408b3874768e6f88516	2013-03-13 08:35:46 -07:00
Yunqing Wang	35bc02c6eb	Optimize vp9_dc_only_idct_add_c function Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to improve performance, clipped the absolute diff values to [0, 255]. This allowed us to keep the additions/subtractions in 8 bits. Test showed an over 2% decoder performance increase. Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d	2013-02-26 17:16:13 -08:00
Ronald S. Bultje	f496f601fb	Add tile column size limits (256 pixels min, 4096 pixels max). This is after discussion with the hardware team. Update the unit test to take these sizes into account. Split out some duplicate code into a separate file so it can be shared. Change-Id: I8311d11b0191d8bb37e8eb4ac962beb217e1bff5	2013-02-12 10:33:34 -08:00
John Koleszar	3de8ee6ba1	Merge changes Ife0d8147,I7d469716,Ic9a5615f into experimental * changes: Restore SSSE3 subpixel filters in new convolve framework Convert subpixel filters to use convolve framework Add 8-tap generic convolver	2013-02-08 13:19:47 -08:00
John Koleszar	29d47ac80e	Restore SSSE3 subpixel filters in new convolve framework This commit adds the 8 tap SSSE3 subpixel filters back into the code underneath the convolve API. The C code is still called for 4x4 blocks, as well as compound prediction modes. This restores the encode performance to be within about 8% of the baseline. Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c	2013-02-08 12:18:14 -08:00
Yaowu Xu	e6ad9ab02c	move dct/idct constants to a header file also removed some unused functions. Change-Id: Ie363bcc8d94441d054137d2ef7c4fe59f56027e5	2013-02-07 13:51:45 -08:00
Ronald S. Bultje	1407bdc243	[WIP] Add column-based tiling. This patch adds column-based tiling. The idea is to make each tile independently decodable (after reading the common frame header) and also independently encodable (minus within-frame cost adjustments in the RD loop) to speed up hardware & software en/decoders if they use multi-threading. Column-based tiling has the added advantage (over other tiling methods) that it minimizes realtime use-case latency, since all threads can start encoding data as soon as the first SB-row worth of data is available to the encoder. There is some test code that does random tile ordering in the decoder, to confirm that each tile is indeed independently decodable from other tiles in the same frame. At tile edges, all contexts assume default values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode), and motion vector search and ordering do not cross tiles in the same frame. Tile independence is not maintained between frames ATM, i.e. tile 0 of frame 1 is free to use motion vectors that point into any tile of frame 0. We support 1 (i.e. no tiling), 2 or 4 column-tiles. The loopfilter crosses tile boundaries. I discussed this briefly with Aki and he says that's OK. An in-loop loopfilter would need to do some sync between tile threads, but that shouldn't be a big issue. Results: with tiling disabled, we go up slightly because of improved edge use in the intra4x4 prediction. With 2 tiles, we lose about 1% on derf, ~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5% on derf, ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is concentrated in the low-bitrate end of clips, and most of it is because of the loss of edges at tile boundaries and the resulting loss of intra predictors. TODO: - more tiles (perhaps allow row-based tiling also, and max. 8 tiles)? - maybe optionally (for EC purposes), motion vectors themselves should not cross tile edges, or we should emulate such borders as if they were off-frame, to limit error propagation to within one tile only. This doesn't have to be the default behaviour but could be an optional bitstream flag. Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f	2013-02-05 15:43:03 -08:00
John Koleszar	7a07eea13f	Convert subpixel filters to use convolve framework Update the code to call the new convolution functions to do subpixel prediction rather than the existing functions. Remove the old C and assembly code, since it is unused. This causes a 50% performance reduction on the decoder, but that will be resolved when the asm for the new functions is available. There is no consensus for whether 6-tap or 2-tap predictors will be supported in the final codec, so these filters are implemented in terms of the 8-tap code, so that quality testing of these modes can continue. Implementing the lower complexity algorithms is a simple exercise, should it be necessary. This code produces slightly better results in the EIGHTTAP_SMOOTH case, since the filter is now applied in only one direction when the subpel motion is only in one direction. Like the previous code, the filtering is skipped entirely on full-pel MVs. This combination seems to give the best quality gains, but this may be indicative of a bug in the encoder's filter selection, since the encoder could achieve the result of skipping the filtering on full-pel by selecting one of the other filters. This should be revisited. Quality gains on derf positive on almost all clips. The only clip that seemed to be hurt at all datarates was football (-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR, 0.347% SSIM. Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff	2013-02-05 14:23:17 -08:00
John Koleszar	5ca6a3667f	Add 8-tap generic convolver This commit introduces a new convolution function which will be used to replace the existing subpixel interpolation functions. It is much the same as the existing functions, but allows for changing the filter kernel on a per-pixel basis, and doesn't bake in knowledge of the filter to be applied or the size of the resulting block into the function name. Replacing the existing subpel filters will come in a later commit. Change-Id: Ic9a5615f2f456cb77f96741856fc650d6d78bb91	2013-02-05 14:19:28 -08:00
Yaowu Xu	9bf73f46f9	Fix a number of issues that cause failures during the master Jenkins verification process Change-Id: I3722b8753eaf39f99b45979ce407a8ea0bea0b89	2013-01-14 18:32:32 -08:00
Yaowu Xu	741fbe9656	Merge experiment "subpelrefmv" Change-Id: Iac7f3d108863552b850c92c727e00c95571c9e96	2013-01-14 15:18:47 -08:00
Yunqing Wang	f1c56a8c8c	Merge "vp9_sub_pixel_variance16x2 SSE2 optimization" into experimental	2013-01-08 12:59:08 -08:00
Yunqing Wang	8d568312a2	vp9_sub_pixel_variance16x2 SSE2 optimization About 5% decoder speedup. Change-Id: Ib6687d337af758a536a0e7e289f400990f1f9794	2013-01-08 12:01:55 -08:00
John Koleszar	879cb7d962	Merge vp9-preview changes into experimental branch Incorporate vp9-preview changes by merging master branch into experimental. Conflicts: test/test.mk vp9/common/vp9_filter.c vp9/common/vp9_idctllm.c vp9/common/vp9_invtrans.h vp9/common/vp9_mbpitch.c vp9/common/vp9_rtcd_defs.sh vp9/common/vp9_systemdependent.h vp9/common/vp9_type_aliases.h vp9/common/x86/vp9_asm_stubs.c vp9/common/x86/vp9_subpixel_mmx.asm vp9/decoder/vp9_decodframe.c vp9/decoder/vp9_dequantize.c vp9/decoder/vp9_dequantize.h vp9/decoder/vp9_onyxd_int.h vp9/encoder/vp9_bitstream.c vp9/encoder/vp9_encodeframe.c vp9/encoder/vp9_rdopt.c Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5	2013-01-08 10:19:59 -08:00
John Koleszar	5ebe94f9f1	Build fixes to merge vp9-preview into master Various fixups to resolve issues when building vp9-preview under the more stringent checks placed on the experimental branch. Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07	2012-12-26 11:21:09 -08:00
John Koleszar	9a7023d2ad	Fix MSVS build for removed vp9/common/vp9_onyxd.h Change-Id: I75ad0b4ca5b53b5bf759cc26a484ec196d275279	2012-12-20 16:14:55 -08:00
Scott LaVarnway	08dabbcee1	Disabled x86inc style assembly functions Temporary fix for 32-bit mac build errors. Change-Id: I2038f033cac16ea796097d0edd0f1c3da03246d7	2012-12-19 11:53:43 -08:00
Ronald S. Bultje	4cca47b538	Use standard integer types for pixel values and coefficients. For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da	2012-12-18 15:31:19 -08:00
John Koleszar	1306ba7659	Remove vp9_type_aliases.h Prefer the standard fixed-size integer typedefs. Change-Id: Iad75582350669e49a8da3b7facb9c259e9514a5b	2012-12-17 11:32:37 -08:00
Johann	a905672906	Remove ARM optimizations from VP9 Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b	2012-12-05 08:59:25 -08:00
Johann	34591b54dd	Remove ARM optimizations from VP9 Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b	2012-12-03 12:50:15 -08:00
Jim Bankoski	e3bdae1fc7	intrinsic warnings begone Change-Id: I6a224c590b6a2c5b91f9084ffb8083d18223a206	2012-11-29 14:14:26 -08:00
Jim Bankoski	e69b5258fd	fix vp9_vp8 files renamed Change-Id: I20c426e91ee49666db42e20eb074095ab6b8ec5d	2012-11-29 06:53:08 -08:00

260 Commits