generic-library/vpx

Author	SHA1	Message	Date
hkuang	4cc7c5a17f	Totally remove prev_mi in VP9 decoder. This saves memory and improves decode speed by removing the unnecessary memset of the big prev_mi array for all key frames. Decoding an all-key-frame 1080p video shows a speed improvement of around 2%. Change-Id: I6284a445c1291056e3c15135c3c20d502f791c10	2014-11-05 16:14:30 -08:00
Yaowu Xu	2c4fee17bc	Fix visual studio 2013 compiler warnings For configured with --enable-vp9-highbitdepth Change-Id: I2b181519d7192f8d7a241ad5760c3578255f24e6	2014-11-05 13:47:28 -08:00
hkuang	23da920a8e	Fix the memory leak due to missing free frame_mvs. Change-Id: I2ceee7341d906259002c0ea31ea009ae32c04bfd	2014-11-04 13:28:31 -08:00
Yunqing Wang	6d90a9d289	Merge "WORKAROUND FIX FOR GCC4.9.1"	2014-11-03 16:56:38 -08:00
levytamar82	86175a5788	WORKAROUND FIX FOR GCC4.9.1 In the function mb_lpf_horizontal_edge_w_avx2_16 the usage of the intrinsic _mm256_cvtepu8_epi16 causes a compiler bug in gcc 4.9.1. Until it is fixed, I created a workaround that performs the up convert by using broadcast128+shuffle. The bug was reported here: https://code.google.com/p/webm/issues/detail?id=867 Change-Id: I73452e6806f42e0fadcde96b804ea3afa7eeb351	2014-11-01 11:27:28 -07:00
hkuang	55577431ae	Bind motion vectors with frame buffer structure. This will save a lot of memory for decoder due to removing of prev_mi, but prev_mi is still needed in encoder. So this will increase a little bit memory for encoder. Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed	2014-10-31 17:01:08 -07:00
Hui Su	d478d2df37	Merge "Move the definition of switchable filter numbers into enum INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and IF_DIFF_REF_FRAME_ADD_MV."	2014-10-30 11:05:04 -07:00
James Zern	01900edc40	Merge changes I8a9c9019,Ic7b2faa3,I44d42a50,I3f3a3924,I10747b32,I31b49c9e * changes: add vp9_loop_filter_data_reset move LFWorkerData allocation to VP9LfSync vp9_loop_filter_frame_mt: remove pbi dependency vp9_loop_filter_frame_mt: pass planes directly vp9_loop_filter_frame_mt: pass VP9LfSync directly vp9: store TileWorkerData allocations separately	2014-10-24 11:43:51 -07:00
James Zern	01483677e5	add vp9_loop_filter_data_reset Change-Id: I8a9c9019242ec10fa499a78db322221bf96a0275	2014-10-23 19:43:48 +02:00
Yunqing Wang	330a6b2756	Merge "vp9_ethread: allocate frame contexts outside VP9_COMMON struct"	2014-10-22 17:10:39 -07:00
Yunqing Wang	7c7e4d4eb8	vp9_ethread: allocate frame contexts outside VP9_COMMON struct This patch allocated frame contexts outside VP9_COMMON. This allows multiple threads to share the same copy of frame contexts, and reduces the overhead. It also guarantees the correct update of these contexts during bitstream packing. This patch doesn't change encoding result. Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353	2014-10-22 15:03:12 -07:00
Frank Galligan	95a568b3a8	Fix Neon convolve profiling When profiling, gprof can't distinguish between matching labels in different files. Change-Id: I56770df212ed314a0d8568071fa8157624ef1e8f	2014-10-22 10:51:53 -07:00
Hangyu Kuang	9ce3a7d76c	Implement frame parallel decode for VP9. Using 4 threads, frame parallel decode is ~3x faster than single thread decode and around 30% faster than tile parallel decode for frame parallel encoded video on both Android and desktop with 4 threads. Decode speed is scalable to threads too which means decode could be even faster with more threads. Change-Id: Ia0a549aaa3e83b5a17b31d8299aa496ea4f21e3e	2014-10-22 10:50:58 -07:00
Hui Su	8947b18fa3	Move the definition of switchable filter numbers into enum INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and IF_DIFF_REF_FRAME_ADD_MV. Change-Id: Ic36c9eb6ccb8ec324d991f7241e42b40b60b1dcb	2014-10-21 15:41:37 -07:00
Yunqing Wang	687c56e802	Merge "SAD32xh and SAD64xh for AVX2"	2014-10-20 12:37:55 -07:00
levytamar82	7045aec00a	SAD32xh and SAD64xh for AVX2 All sad functions that process more than 32 consecutive elements are optimized for AVX2: vp9_sad64x64 vp9_sad64x32 vp9_sad32x64 vp9_sad32x32 vp9_sad32x16 vp9_sad64x64_avg vp9_sad64x32_avg vp9_sad32x64_avg vp9_sad32x32_avg vp9_sad32x16_avg The functions that appeared as hotspots are vp9_sad32x32 and vp9_sad64x64. vp9_sad32x32 was optimized by 68% and vp9_sad64x64 was optimized by 90%; both of them gave an overall ~2.3% user level gain Change-Id: Iccf86b375a2b54c5fbbe685902ead0c9a561b9fd	2014-10-19 13:59:10 -07:00
Peter de Rivaz	73ae6e495c	Add highbitdepth function for vp9_avg_8x8 Cherry-picked from https://gerrit.chromium.org/gerrit/#/c/71914/ (`a92f987a6b`) on highbitdepth branch. Change-Id: I6903e4e4cb57d90590725c8a1c64c23da7ae65e8	2014-10-17 17:04:37 -07:00
James Zern	e9b8810b4d	move LFWorkerData allocation to VP9LfSync this removes an assumption that worker->data1 would be pointing to a TileWorkerData allocation. additionally, within the multi-threaded loopfilter pass VP9LfSync as a parameter to the worker hook, removing the need for a shadow pointer in LFWorkerData. Change-Id: Ic7b2faa34e3eb59dbcb8a7c67f333448fa047c88	2014-10-16 18:55:46 +02:00
Alex Converse	00a9671bbd	Merge "Add a 32-bit friendly sse2 quantizer."	2014-10-14 14:35:02 -07:00
Alex Converse	7497d2fb23	Add a 32-bit friendly sse2 quantizer. This is based on the 64-bit ssse3 quantizer. 1.1x speedup for screen content at speed 7. Change-Id: I57d15415ef97c49165954bbe3daaaf9318e37448	2014-10-14 11:37:41 -07:00
hkuang	c38a8edf16	Merge "Remove extra line."	2014-10-14 11:05:01 -07:00
Adrian Grange	f7c336aa19	Merge "Remove mi_grid_base_array from VP9_COMMON (unused)"	2014-10-14 07:50:17 -07:00
hkuang	c5fd035ce0	Use pre increment. Change-Id: I016b4e77d8268e189473f4c382603afe1ae1750f	2014-10-13 14:07:03 -07:00
Adrian Grange	83b63d573a	Remove mi_grid_base_array from VP9_COMMON (unused) Change-Id: I4b4764463f5a7cdc01ec004b882c6237466c74b0	2014-10-13 11:54:05 -07:00
hkuang	dbe91de6d4	Remove extra line. Change-Id: I5e79c276d8953ae17cd35b2846e6e40660c037c3	2014-10-10 14:59:04 -07:00
hkuang	effc1a6f56	Correct the code format. Change-Id: If2de420f8123a4e8bf635dd29205dd74ee174eee	2014-10-09 17:57:45 -07:00
Deb Mukherjee	9a29fdbae7	Merge "Rename highbitdepth functions to use highbd prefix"	2014-10-09 15:39:56 -07:00
Deb Mukherjee	1929c9b391	Rename highbitdepth functions to use highbd prefix Uses highbd_ prefix convention consistently. Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e	2014-10-09 14:40:40 -07:00
James Zern	caa0f81914	vp9_rtcd_defs: fix vp9_avg_8x8 declaration vp9_avg_8x8 does not depend on x86inc, fixes 32-bit OS X build Change-Id: I709b874ea84bf57c8cdb5ac7d43eecc6b8c1a2dd	2014-10-09 10:44:42 +02:00
Jingning Han	f6ff752c63	Merge "Clean up header files in vp9_blockd.h and related files"	2014-10-08 15:25:09 -07:00
Jingning Han	1c3398675f	Merge "Use #define statement for MAX_MB_PLANE"	2014-10-08 15:24:56 -07:00
Jim Bankoski	20254d1daa	Merge "experimental : partition using 1/8 x 1/8 image"	2014-10-08 09:04:26 -07:00
Jim Bankoski	0ce51d823f	experimental : partition using 1/8 x 1/8 image The concept: There's too much noise in source pixels for variance and at low bitrate the reconstructed looks nothing like the source so we have problems getting good partitionings with either. This skirts the issue by using a box blur scaled down version for variance calculations. To compare against source_var_ moved keyframe to be rd based like source_var. Change-Id: Ie3babdbfadae324b7b5a76bea192893af27f0624	2014-10-07 16:36:14 -07:00
Jingning Han	608c4acc1f	Merge "Remove vp9_blockd.h from vp9_common_data.c"	2014-10-07 15:34:07 -07:00
Jingning Han	3bbec7b422	Merge "Replace mi_width_log2() with mi_width_log2_lookup table"	2014-10-07 15:33:52 -07:00
Jingning Han	27c9577f8e	Merge "Take out repeated block width/height lookup functions"	2014-10-07 15:33:45 -07:00
Jingning Han	6ad272cb84	Clean up header files in vp9_blockd.h and related files This commit breaks the overly broad header files into more targeted and smaller ones, to help better structure the system layout. Change-Id: I7b24559d3ea6e582cf5d9bbe8f71459f9824d71b	2014-10-07 15:17:10 -07:00
Jingning Han	3c28fb768d	Use #define statement for MAX_MB_PLANE Change-Id: I3a7f83ab1dbfcedc8a82fe798c2fa30dd9c7d696	2014-10-07 15:00:22 -07:00
Jingning Han	d7febaf5c5	Remove extra empty line Change-Id: I6f2865bb8ba9295f5c45a4cad065aecbe1e63c32	2014-10-07 14:06:54 -07:00
Jingning Han	bd9706506f	Merge "Move inter filter defs to vp9_filter.h"	2014-10-07 13:42:26 -07:00
Jingning Han	ebd724852e	Remove vp9_blockd.h from vp9_common_data.c The basic data defs should be above block operation level. Change-Id: I7dd9836d01120ab75e0c472baac9f15495ed0db5	2014-10-07 13:02:54 -07:00
Jingning Han	7ee58985bd	Replace mi_width_log2() with mi_width_log2_lookup table Change-Id: If0ea98aa139d14d40cd924114e18396aff36b5a5	2014-10-07 12:45:25 -07:00
Jingning Han	b66f7016c1	Take out repeated block width/height lookup functions The functions b_width_log2 and b_height_log2 only do direct table fetch. This commit unifies such use cases by using the table directly and removes these functions. Change-Id: I3103fc6ba959c1182886a2799d21b8b77c8a7b6b	2014-10-07 12:33:07 -07:00
Jingning Han	5d9cdac087	Move inter filter defs to vp9_filter.h Add comments on the use case of these definitions. Further reduce the scope of header file in vp9_context_tree.h. Change-Id: Ic4a7638e838d0ac441b64abfc56e57354c059d75	2014-10-07 12:16:37 -07:00
Deb Mukherjee	cfc337aae8	Merge "Resolves some static analysis / undefined warnings"	2014-10-07 12:15:26 -07:00
Deb Mukherjee	fced63ed30	Resolves some static analysis / undefined warnings Also fixes a case of distortion becoming negative and messing up the RDCOST computation. Change-Id: Id345af9e8dfff31ade622be5756e51f2cdface53	2014-10-07 11:20:56 -07:00
JackyChen	a9f479682a	Merge "Add SSE2 code and unit test for VP9 denoiser."	2014-10-07 10:51:55 -07:00
JackyChen	80465dae88	Add SSE2 code and unit test for VP9 denoiser. This SSE2 is based on VP8 denoiser's SSE2 code. In VP8, there are only 16x16 blocks in denoiser, while in VP9, there are 13 different block sizes. By adding this SSE2 code, the improvement of encoder speed is around 20%(using C code vs using SSE2 code), vary for different clips. The unit test for VP9 denoiser is to confirm that the SSE2 code is bit-exact with the C code. The unit test covers all block size. Change-Id: Ic8d8ac26db4ea40a5f146b5678a065af07eaaa3d	2014-10-06 15:27:40 -07:00
Jingning Han	12344f2697	Add range check in inverse ADST 16x16 Bit-stream clarification related to Issue 868. Change-Id: I92a7bc5b7782c9ea5c3f6cceec761742183c9514	2014-10-06 11:07:58 -07:00
Deb Mukherjee	3bcc2af8cd	Some data type changes in vp9_idct.c Resolves a visual studio warning, and includes some cleanups. Change-Id: I6a7576ef323c475b7d1c659800cd82c6cb1fd18d	2014-10-04 16:03:04 -07:00
Deb Mukherjee	8a01074d04	Merge "Incorporate WRAPLOW macro into non-highbitdepth tx"	2014-10-03 12:45:39 -07:00
Deb Mukherjee	d50716face	Incorporate WRAPLOW macro into non-highbitdepth tx Incorporates the WRAPLOW macro into the non-highbitdepth transforms to aid hardware verification between a software C model and an intended hardware implementation through the use of the configure options: --enable-experimental --enable-emulate-hardware. Note that to avoid further discrepancies between the sse/sse2 implementations of the transforms and the C implementation, when the emulate hardware option is invoked, we also disable sse/sse2/etc. Also includes some minor cleanups/renaming etc. Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287	2014-10-03 11:38:05 -07:00
Yaowu Xu	f809475c73	Merge "Make iscan and scan neighbor arrays static const."	2014-10-02 15:15:58 -07:00
Yaowu Xu	9712bc691d	Make iscan and scan neighbor arrays static const. This commit changes the tables to be read only, which fixes issue #866 Change-Id: I85bbe03f9d344f50570f8c1c61699bdc5cee248f	2014-10-02 14:08:14 -07:00
Alexander Voronov	befc36d4a7	Fix invalid memory access in inter prediction (issue 853). Change-Id: I5a566d6ade720f212a60c0ad5d6f1ee1d1d37f2e	2014-10-02 18:57:47 +04:00
Jingning Han	c7d719325e	Merge "Remove redundant header file from vp9_idct.h"	2014-10-01 17:05:36 -07:00
Deb Mukherjee	30fbf23fda	Merge "High-bitdepth bugfixes"	2014-10-01 16:47:43 -07:00
Jingning Han	74c2997bc9	Remove redundant header file from vp9_idct.h Change-Id: Id92544762e7b96d3c729dfc8e04ecff91cbcc7f9	2014-10-01 14:58:27 -07:00
Deb Mukherjee	a160d72522	High-bitdepth bugfixes Miscellaneous bug-fixes for high bitdepth functionality. With this patch, high bit-depth profiles become mostly functional, except for an intermittent assert failure issue that is being tracked. Change-Id: I6a7fcbdcf1e5b09842e88535f8442d2e1230748c	2014-10-01 14:18:11 -07:00
Jingning Han	3d17f0d45f	Remove repeated vpx_integer.h from vp9_prob.h The file vpx_integer.h has been included and used in the parent file vp9_common.h. Change-Id: I9c65f08353576f9ef1e5ea17244fc5ca964ec002	2014-10-01 12:45:52 -07:00
Jingning Han	764c00ab50	Use precise header files in vp9_entropymv.h The commit cleans up the header files in vp9_entropymv.h. This file should only depend on vp9_mv.h and vp9_prob.h. Remove the giant vp9_blockd.h from header file list. Change-Id: I44cd26d2cfd10a16a9325778347dd53f888a874c	2014-10-01 12:41:08 -07:00
Deb Mukherjee	872b207b78	Moves transform type defines to vp9_common Moves transform type defines to vp9_common.h from vp9_idct.h so that they can be included in vp9_rtcd_defs.pl safely. Change-Id: Id5106227bee5934f7ce8b06f2eb9fa8a9a2e0ddb	2014-09-30 19:44:17 -07:00
James Zern	4a296e6baa	Revert "Fix compiling error in vp9_idct.h" This reverts commit `eafc8c9c40`. tran_low_t/tran_high_t don't belong in a public header, they're private. Similarly the public headers shouldn't rely on config defines, vpx_config.h isn't installed. Change-Id: I194ec273598da418df8dd727b6c0e78a556740ad	2014-09-30 16:08:55 -07:00
Jingning Han	0829d2be7f	Remove redundant header file declaration Some header file in vp9_idct.c has been included in vp9_idct.h. This commit removes these redundant declarations. Change-Id: I0238c27e4efff5c981eb437022c6bc6970c4e445	2014-09-30 09:13:00 -07:00
Jingning Han	eafc8c9c40	Fix compiling error in vp9_idct.h This commit fixes a compiling error in vp9_idct.h, where the codec checks that the intermediate steps of transformation fit within 16-bit length. The issue was due to broken file dependency. Change-Id: Ib22bba13a1e6df28489cb23d6774c561969f1fdc	2014-09-30 09:11:59 -07:00
Deb Mukherjee	9ed23de13f	Miscellaneous decoder changes for high bitdepth Also includes yv12 config changes. Change-Id: Iacf40d8bf486815b54c32a127ce3cd4516b7e44f	2014-09-29 11:27:45 -07:00
hkuang	c53a95ad1d	Avoid calling vp9_is_scaled two times in a function. Use a local variable to hold the result of vp9_is_scaled. Change-Id: I5e203909805923e20eefef596bc84424da47dbe2	2014-09-25 11:52:16 -07:00
Yaowu Xu	845d4f333d	Fix a couple of comments The first comment is obsolete given the way is now normative in VP9 bitstream. The second comment line was too long. Change-Id: I6546585babf60d466485ddcf2daa6d2fa79e999a	2014-09-25 08:24:16 -07:00
Yaowu Xu	d237d483a5	Correct the condition for border extension As reported in issue #850, the condition for border extension was not complete. This commit added the case when the scaling is enabled. This fixes issue #850. Change-Id: I67768b23f0dcc4ac9a9aa0a0825b0fe8cb85a72e	2014-09-24 11:26:40 -07:00
Yaowu Xu	148c57d231	Merge "Fix invalid memory access on 2x downscale."	2014-09-24 09:58:05 -07:00
Alexander Voronov	eafd842a3e	Fix incorrect subsampling used in VP9 non420 loopfilter. Change-Id: Ia959e24b4676242c80a8867d2c39a6fee90f71a5	2014-09-24 17:01:09 +04:00
Deb Mukherjee	e2a90c0b21	Merge "High bit-depth loop/arf/postproc filter functions"	2014-09-23 17:26:32 -07:00
Deb Mukherjee	931ed516ba	High bit-depth loop/arf/postproc filter functions Adds high-bitdepth loopfilter, temporal filter and postproc functions Change-Id: I81c8a9176890784686bc4f2af0d550d243b3b2d3	2014-09-23 16:20:43 -07:00
hkuang	c70cea97ac	Remove mi_grid_* structures. mi_grid_* are arrays of pointer to pointer. They save the pointers that point to the MIs in cm->mi. But they are unnecessary and complicated. The original goal was to remove MODE_INFO_t copy. But with an extra MODE_INFO_t pointer inside MODE_INFO_t, same goal could be achieved. This commit totally removes the mi_grid_* structures. But there are still many dummy MODE_INFO_t inside cm->mi which are a waste of memory. Next commit will do on-demand MODE_INFO_t allocation in order to save these memories. Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6	2014-09-19 21:27:11 -07:00
Deb Mukherjee	822b51609b	High bit-depth coefficient coding functions Tokenization and Detokenization enhancements for 10/12 bit Change-Id: I3c269ec30f8eb160ee024905638a193975237559	2014-09-19 15:21:24 -07:00
Frank Galligan	49dc7b05d0	Merge "FIX: vp9_loopfilter_intrin_sse2.c"	2014-09-18 15:10:16 -07:00
Scott LaVarnway	13284311eb	FIX: vp9_loopfilter_intrin_sse2.c Fixes Visual Studio build failures Change-Id: I233719cd63b3ad0db16e2834bf1d7ea1df805880	2014-09-18 13:09:13 -07:00
Deb Mukherjee	6d0ee9860e	Merge "Adds high bitdepth convolve, interpred & scaling"	2014-09-18 10:52:23 -07:00
Deb Mukherjee	0d3c3d3ce7	Adds high bitdepth convolve, interpred & scaling Change-Id: Ie51c352a6b250547207cbc1ebba833a01ed053e3	2014-09-18 07:26:17 -07:00
Frank Galligan	4e066299d9	Merge "Improved mb_lpf_horizontal_edge_w_sse2_16() #2 "	2014-09-17 18:52:30 -07:00
Scott LaVarnway	217e3cb1fb	Improved mb_lpf_horizontal_edge_w_sse2_16() #2 The decoder performance improved up to 1% for the test clips used. Change-Id: I4621112bdccfba01640322facfa4ba8da8290ea5	2014-09-17 17:25:20 -07:00
Deb Mukherjee	7d0e4f9ad1	Resolves a few gcc warnings clang is fine. Change-Id: Ia4e9ff17ea3b86bc87dca35828ee7ce45bea6994	2014-09-16 22:44:40 -07:00
Deb Mukherjee	f7cf05cfe0	Merge "Adding high-bitdepth intra prediction functions"	2014-09-16 17:10:24 -07:00
Frank Galligan	ecd7e3d2b7	Merge "Remove memset of every external frame buffer."	2014-09-16 15:17:26 -07:00
Deb Mukherjee	81a8138fc3	Adding high-bitdepth intra prediction functions Change-Id: I6f5cb101e2dc57c3d3f4d7e0ffb4ddbed027d111	2014-09-16 15:04:39 -07:00
Deb Mukherjee	5cd0aab81a	Adds high bitdepth quantization functions Adds various high bitdepth quantization functions. Change-Id: I36fc0bf75a1bd15128ed271df8723de0ac134b0c	2014-09-16 14:55:37 -07:00
Yaowu Xu	601f3a886e	Fix a performance regression This commit adds back sse2 or ssse3 optimized versions of a couple of functions, fixes a ~10% performance regression. Change-Id: I049786906e5a641224dced63c6492aec9d86d183	2014-09-16 11:18:46 -07:00
Frank Galligan	175d9dfe0a	Remove memset of every external frame buffer. Libvpx was memseting every external frame buffer before decode. This was to work around a valgrind issue in our C loop filter. Most of the time this was not needed and we have noticed some significant performance loss on some platforms. Now we require the application to zero out the buffers if it is using external frame buffers. Change-Id: I7330d00a315e65137ed30edd5f813e8929b76242	2014-09-15 15:37:36 -07:00
Alexander Voronov	29071a418e	Fix invalid memory access on 2x downscale. The issue was discovered on bitstream with 2x vertical downscale. For zero MVs, y_pad is set to 1 only when vertical convolution is required. The original code assumes that for y_step_q4 == 32 we don't perform vertical convolution. But vp9_setup_scale_factors_for_frame() sets convolve functions so that when x_step and y_step are both not equal to 16, convolve in both directions is performed. And convolve() unconditionally subtracts one stride from source pointer when calls convolve_horiz(). This leads to invalid memory access. Change-Id: I882dfa6081a58e172b5ffa55842bfcd6727f10bf	2014-09-15 17:50:20 +04:00
Jingning Han	82fad6f4b6	Merge "Add a note for enum values of MV_REFERENCE_FRAME"	2014-09-13 10:42:45 -07:00
Deb Mukherjee	10783d4f3a	Adds high bitdepth transform functions and tests Adds various high bitdepth transform functions and tests. Most of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform coefficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map to int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8	2014-09-11 19:56:33 -07:00
Deb Mukherjee	1e4136d35d	Adds high bit depth sad and variance functions Moves high bit depth sad/var functions from highbitdepth branch to master. Change-Id: If03845d8ef9c9c494e13350e7a587c289306b94d	2014-09-11 17:30:44 -07:00
Johann	ac2f2e7855	Merge "Allow specifying opt dependencies"	2014-09-11 16:02:41 -07:00
Johann	8645a53039	Allow specifying opt dependencies If optimizations use more than one cpu feature, allow specifying them so that '--disable-X' still works https://code.google.com/p/webm/issues/detail?id=854 Change-Id: I3108ea37b397371a2be84dd5f2380b304db23f18	2014-09-11 13:43:48 -07:00
Jingning Han	3ef9786b7e	Add a note for enum values of MV_REFERENCE_FRAME Change-Id: Ifaf6738f26e86ded6eb6ea1465bad7a229612999	2014-09-11 10:55:42 -07:00
Jim Bankoski	0e66848081	Merge "LoopFilterWorkerData: remove misleading 'const'"	2014-09-10 06:33:51 -07:00
James Zern	2215d2f135	Merge changes If8887e1d,I36bfc9c8,I3d1e6c42 * changes: vp9_dthread: simplify loop_filter_row_worker signature simplify vp9_loop_filter_worker signature vp9_decodeframe: simplify tile_work_hook signature	2014-09-09 16:50:28 -07:00
Dmitry Kovalev	8e205a2a09	Merge "Cleaning up and speeding up vp9_idct32x32_1024_add_sse2()."	2014-09-09 12:50:23 -07:00
James Zern	7b572c9806	LoopFilterWorkerData: remove misleading 'const' 'frame_buffer' is modified indirectly via 'planes'. + do the same for vp9_loop_filter_rows Change-Id: Ibb7daa2e261064e4a5317a2969e3490e59891b82	2014-09-08 20:06:48 -07:00
James Zern	48662747bd	simplify vp9_loop_filter_worker signature use the type names directly in the function declaration rather than (void arg1, void arg2) Change-Id: I36bfc9c886310ce370bf0ca7c679ebd6e95109cc	2014-09-08 19:53:46 -07:00
Dmitry Kovalev	980abf6078	Fixing Mac OS build. Change-Id: Ifae8906185a868a07685eb7a7da2484af95e70a7	2014-09-08 08:53:12 -07:00
Dmitry Kovalev	70092af5c0	Cleaning up and speeding up vp9_idct32x32_1024_add_sse2(). Change-Id: If91017b792572c9db6e257011ca307bef8428486	2014-09-05 18:12:30 -07:00
Dmitry Kovalev	89963bf586	Merge "Removing postproc mmx code."	2014-09-05 18:11:08 -07:00
Dmitry Kovalev	54bec0971f	Merge "Initializing intra modes without vpx_once()."	2014-09-05 12:03:36 -07:00
Dmitry Kovalev	1100e262c5	Removing postproc mmx code. Removed functions: * vp9_post_proc_down_and_across_mmx * vp9_mbpost_proc_down_mmx * vp9_plane_add_noise_mmx They all have sse2 equivalent. Change-Id: I59c1fac12b7c96ca4538d455e4400c2b7875feff	2014-09-05 11:52:50 -07:00
James Zern	a8083449e9	fix x86-darwin* build vp9_variance_sse2.c contains a mix of intrinsics and references to assembly which uses x86inc.asm; it's conditionally included as a result. Change-Id: I254451483a65881c0b8e18e27bf0c3ddef60c4ec	2014-09-04 23:32:13 -07:00
Dmitry Kovalev	490943552f	Removing unused function prototypes. Change-Id: Ia5e383e2cf18052f6f1eacf8b9495ab8e4d58878	2014-09-04 14:26:30 -07:00
Dmitry Kovalev	48197f0a70	Adding sse2 variant for vp9_mse{8x8, 8x16, 16x8}. Change-Id: I6786d25ce4f32b8d8912f2d239a45ca15b310c4b	2014-09-03 19:02:14 -07:00
Dmitry Kovalev	bf778e7d8e	Initializing intra modes without vpx_once(). Change-Id: I0a9d52432f2500f1bd8f43f229e70e38bb9a0343	2014-09-03 11:39:02 -07:00
Dmitry Kovalev	0ecc75c819	Merge "Removing MMX SAD calculation code."	2014-09-02 17:35:59 -07:00
Dmitry Kovalev	318fc0c34f	Removing MMX SAD calculation code. Removed functions: * vp9_sad_16x16_mmx * vp9_sad_8x16_mmx * vp9_sad_16x8_mmx * vp9_sad_8x8_mmx * vp9_sad_4x4_mmx Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3	2014-09-02 14:41:36 -07:00
Deb Mukherjee	5acfafb18e	Adds config opt for highbitdepth + misc. vpx Adds config parameter vp9_highbitdepth, to support highbitdepth profiles. Also includes most vpx level high bit-depth functions. However encode/decode in the highbitdepth profiles will not work until the rest of the code is in place. Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6	2014-09-02 14:37:10 -07:00
Dmitry Kovalev	12cd6f421d	Removing variance MMX code. Removed functions: * vp9_mse16x16_mmx * vp9_get_mb_ss_mmx * vp9_get4x4var_mmx * vp9_get8x8var_mmx * vp9_variance4x4_mmx * vp9_variance8x8_mmx * vp9_variance16x16_mmx * vp9_variance16x8_mmx * vp9_variance8x16_mmx They all have SSE2 equivalent. Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615	2014-08-29 10:26:42 -07:00
Dmitry Kovalev	eba83a0fdb	Merge "Replacing int_mv with MV inside the first pass code."	2014-08-25 13:56:14 -07:00
Dmitry Kovalev	a459e582cb	Replacing int_mv with MV inside the first pass code. Change-Id: Ia3be6b5a18e1ff6cc5c5f4d37e4a5d0972388308	2014-08-22 16:20:18 -07:00
Jim Bankoski	cebe2c8d88	vp9_postproc.c: unused parameter warning resolved Change-Id: I6d77a7c775c0482fd1f9bb03ea6f336dd2973fa0	2014-08-22 13:41:07 -07:00
Yaowu Xu	23c88870ec	Merge "Fix bug 804"	2014-08-21 08:56:32 -07:00
Adrian Grange	c5d8c1e785	Merge "get_ref_frame: fix test for valid buffer."	2014-08-15 10:41:28 -07:00
Adrian Grange	54f8cb78c6	Merge "Fix bug 837: realloc mode info buffers on resize"	2014-08-14 14:53:33 -07:00
Adrian Grange	89a213b4b0	get_ref_frame: fix test for valid buffer. In the current implementation of the encoder, frame buffers may come from the wider set of 12 such buffers, and is not restricted to the 8 allowed as reference frames. This is only an implementation detail and does not affect the constraint of having a total of 8 reference buffers overall. Change-Id: I075f777146c2df49c275d89232933f8127235175	2014-08-14 12:42:11 -07:00
Adrian Grange	4e30565a9f	Fix bug 837: realloc mode info buffers on resize The test to determine if the mode info buffers need to be resized when the frame size changes was incorrect, as per bug 837. By storing the size of the allocated data structure, a simple test determines whether to allocate more memory when the frame size changes. Change-Id: I1544698f2882cf958fc672485614f2f46e9719bd	2014-08-14 08:59:15 -07:00
James Zern	4b79563805	Merge "get_ref_frame: check ref_frame_map value"	2014-08-12 22:48:27 -07:00
James Zern	a6b7bd6a1c	Merge "fixes several -Wunused-function warnings"	2014-08-12 20:15:14 -07:00
James Zern	3caed4f8fd	get_ref_frame: check ref_frame_map value 'ref_frame_map' is initialized to -1. avoids using an invalid index if VP9_GET_REFERENCE/VP8_COPY_REFERENCE controls are issued after a decode error. Change-Id: I4599762c4d0b07a5943a72bf4a86ccb596cc062a	2014-08-12 17:47:04 -07:00
Jim Bankoski	f452961765	fixes several -Wunused-function warnings Change-Id: I4dc2cb255f4fe30998b6ee61184895dee9f5da8e	2014-08-12 16:51:07 -07:00
Adrian Grange	1ebf52df2c	Common encode/decode function to get reference frame Replaced encoder and decoder functions to get a pointer to a reference frame with a common function, vp9_get_ref_frame, and simplified it. Change-Id: Icb206fcce8caace3bfd1db3dbfa318dde79043ee	2014-08-08 11:37:11 -07:00
Adrian Grange	75b42a4977	Remove coding_use_prev_mi member from VP9_COMMON This was shadowing the use of error_resilient_mode, but with the opposite sense. Change-Id: Ie4d30263a304fe4b3e94f0c7741db6888cc6afd8	2014-08-08 09:40:38 -07:00
levytamar82	69a5f5ecf7	Fix bug 807 in the sub_pixel_variance function the dst is aligned to 16 bytes and not to 32 bytes - now load unaligned data Change-Id: I2e0b9745543697efc56fefa32857ea10117af135	2014-08-07 18:51:02 -07:00
levytamar82	839911fb6d	Fix bug 804 A bug in the Microsoft compiler was found in the function vp9_filter_block1d16_v8_avx2 and a workaround applied. The bug occurs when there are 4 consecutive maddubs + min + adds intrinsic instructions. Change-Id: I83499faeb70971e650e5663fd2490360ddb1a51b	2014-08-07 15:09:24 -07:00
levytamar82	af10457e02	Fix bug 806 in the function sad32x32x4d and sad64x64x4d the source is aligned to 16 bytes and not to 32 bytes - the load is now unaligned. Change-Id: I922fdba56d0936b5cf72e4503519f185645a168c	2014-08-07 14:13:30 -07:00
Dmitry Kovalev	65234504b9	Merge "Removing direct references to VP9_COMP."	2014-08-07 14:12:32 -07:00
Deb Mukherjee	a468170804	Merge "Changes hdr for profiles > 1 for intraonly frames"	2014-08-07 11:15:38 -07:00
Deb Mukherjee	09bf1d61ca	Changes hdr for profiles > 1 for intraonly frames Specifies the bit-depth, color sampling and colorspace for intra only frames for profiles > 0 Also adds checks to ensure that profile 1 and 3 are exclusively used for non 420 streams. Change-Id: Icfb15fa1acccbce8f757c78fa8a2f60591360745	2014-08-07 09:47:14 -07:00
Yaowu Xu	0a2b25dcb9	configure: add --enable-coefficient-range-checking This commit adds a configure time option used to enable strict error checking in decoder to make sure intermediate stage coefficients of inverse transforms are within valid range of signed 16 bit integer. For valid VP9 input streams, intermediate stage coefficients should always stay within the range of a signed 16 bit integer. Coefficients can go out of this range for invalid/corrupt VP9 streams. However, strictly checking this range for every intermediate coefficient can be a burden for decoder, therefore such validation is only enabled with configure option --enable-coefficient-range-checking. Change-Id: I47d47c8c4e48a922c3d223ca59064f51b3f0f5ed	2014-08-06 17:13:16 -07:00
Dmitry Kovalev	09b3d04aac	Removing direct references to VP9_COMP. Change-Id: Ic37624d807884e71f08b50fd04892f03f2708ba7	2014-08-06 12:59:02 -07:00
Johann	7516abc7dc	Remove vp9_postproc_x86.h This configuration has moved to vp9_rtcd_defs.pl Change-Id: I71a31dbb8d79df226b60dd834324a5af69956c51	2014-08-05 15:46:13 -07:00
Jim Bankoski	128827d947	cast enums to int to avoid gcc warning in pred_common Change-Id: Ie3e478ef4fa565225d9e19a14d2f40aad966c2b6	2014-08-04 12:07:37 -07:00
Jim Bankoski	7f63dabfe9	break at the end of clauses with assert(0) to avoid gcc warning Change-Id: I1b3c5337f018dde27dc819ab18bd081d169a91e8	2014-08-04 08:52:53 -07:00
Jim Bankoski	3cf5908e24	uint8_t segment and skip to avoid signed / unsigned warnings Change-Id: I2e2765b851fb0a1b15351c2aa0e079197cbee373	2014-08-04 08:52:40 -07:00
James Zern	ce896df057	Merge "vp9_entropy: inline comes first to avoid warning."	2014-08-01 19:15:34 -07:00
James Zern	3a924f6ed1	Merge "signed unsigned mismatch - warning error"	2014-08-01 16:28:38 -07:00
Jim Bankoski	9c74e6aac7	vp9_entropy: inline comes first to avoid warning. Change-Id: I5b050122e6ed183a5b33c1f38e4fbf63b6721062	2014-08-01 16:05:30 -07:00
James Zern	1b6ac28a2f	Merge "removed sign mismatch warning"	2014-08-01 14:45:12 -07:00
Frank Galligan	5f8fa13258	Merge "Added vp9_sad8x8_neon()"	2014-08-01 14:11:38 -07:00
Scott LaVarnway	98165ec074	Neon version of vp9_sub_pixel_variance8x8(), vp9_variance8x8(), and vp9_get8x8var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~1.2%. Change-Id: I8a66ac2a0f550b407caa27816833bdc563395102	2014-08-01 11:35:55 -07:00
Frank Galligan	5487b6067c	Merge "Neon version of vp9_sub_pixel_variance32x32(),"	2014-08-01 09:46:37 -07:00
Scott LaVarnway	545be78136	Added vp9_sad8x8_neon() Change-Id: I3be8911121ef9a5f39f6c1a2e28f9e00972e0624	2014-08-01 06:36:18 -07:00
Jim Bankoski	0f3689d32d	signed unsigned mismatch - warning error Change-Id: I991e36aa3cfa62aae6d27b253297dd9ca9e8bc12	2014-08-01 06:29:32 -07:00
Jim Bankoski	512f9b631f	removed sign mismatch warning Change-Id: Iaa40b472f6c1c48bb3bb47332b6fcf36d7f3c10e	2014-08-01 06:28:00 -07:00
Scott LaVarnway	6f4b8dcdc2	Neon version of vp9_subtract_block() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~3.2% Change-Id: I8862497264142171b7efc32df1a67714a23539f4	2014-07-31 09:28:06 -07:00
Scott LaVarnway	d39448e2d4	Neon version of vp9_sub_pixel_variance32x32(), vp9_variance32x32(), and vp9_get32x32var(). Change-Id: I8137e2540e50984744da59ae3a41e94f8af4a548	2014-07-31 08:00:36 -07:00
Scott LaVarnway	d4a37db5b8	Neon version of vp9_quantize_fp() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~12.4% Change-Id: Id29d215acf58bb108489e218a259adf74b4768d7	2014-07-30 09:33:46 -07:00
Scott LaVarnway	521cf7e879	Neon version of vp9_sub_pixel_variance16x16(), vp9_variance16x16(), and vp9_get16x16var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~16.7%. Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb	2014-07-30 08:17:32 -07:00
Scott LaVarnway	d19d222db6	Added vp9_fdct8x8_neon(), vp9_fdct8x8_1_neon() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~3.7%. Change-Id: I428c72c40df82c6d537955e320a8debf99343004	2014-07-29 08:56:05 -07:00
levytamar82	4ba92dc5ab	Fix bug 805 Remove all the redundant dct functions (dct4x4, dct8x8) in avx2 except dct32x32 those functions were copied originally from dct_sse2 Change-Id: I742576fbf5175f3ac09f2076976a9247b259323e	2014-07-28 15:46:01 -07:00
hkuang	44395a21da	Move vp9_dec_build_inter_predictors_* to decoder folder. Change-Id: Ibe9fa28440cc79ba9f3504d78c7dca7bb01a23e1	2014-07-28 11:09:11 -07:00
hkuang	7eca086707	Add segmentation map array for current and last frame segmentation. The original implementation only allocates one segmentation map and this works fine for serial decode. But for frame parallel decode, each thread needs to have its own segmentation map and the last frame segmentation map should be provided from the last frame decoding thread. After finishing decoding a frame, the thread needs to serve the old segmentation map that is associated with the previous decoded frame. The thread also needs to use another segmentation map for decoding the current frame. Change-Id: I442ddff36b5de9cb8a7eb59e225744c78f4492d8	2014-07-28 10:44:02 -07:00
Jingning Han	53844275e9	Fix potential ioc issue in vp9_get_prob for 4K above sizes This commit turns on the existing vp9_get_prob function using 64 bit in the intermediate step. It fixes the ioc issue for 4K above frame sizes (issue 828). Change-Id: I9f627f3beca2c522f73b38fd2a3e7eefdff01a7c	2014-07-24 15:35:51 -07:00
Alex Converse	5926e7c0e8	Remove unfinished VP9 alpha channel. Change-Id: Ic5d3a3a0dac10b49495771886a31e793bb78b5ca	2014-07-21 15:55:50 -07:00
Deb Mukherjee	727f384085	Merge "Separates profile 2 into 2 profiles 2 and 3"	2014-07-18 03:23:51 -07:00
Deb Mukherjee	c447a50aea	Separates profile 2 into 2 profiles 2 and 3 Separates HBD profile into two profiles (2 and 3) consistent with the highbitdepth branch. This patch is ported from the original highbitdepth branch patch: https://gerrit.chromium.org/gerrit/#/c/70460/ Two of the invalid file tests needed to be updated. Change-Id: I6a4acd2f7a60b1fb4cbcc8e0dad4eab4248431e3	2014-07-17 20:51:59 -07:00
Adrian Grange	8cb8aef7c7	Merge "Modified frame buffer handling"	2014-07-17 12:15:16 -07:00
Scott LaVarnway	ba0652e83a	Merge "Added vp9_sad64x64_neon(), vp9_sad32x32_neon()"	2014-07-17 11:42:16 -07:00
Adrian Grange	f68aaa38d6	Modified frame buffer handling This patch is the first step toward simplifying the frame buffer handling. The final goal is to have a common frame buffer handling framework for both encoder and decoder that incorporates the existing ability to use externally allocated memory. Change-Id: I2c378a4f54a39908915f46c4260e17a080db7ff1	2014-07-17 11:06:35 -07:00
Scott LaVarnway	696fa52eaa	Added vp9_sad64x64_neon(), vp9_sad32x32_neon() and vp9_sad16x16_neon() On a Nexus 7, vpxenc (in realtime mode, speed -6) reported a performance improvement of ~17%. Change-Id: I91e070cde2973451083d3f3d63b49b7886de9a85	2014-07-16 12:54:46 -07:00
Deb Mukherjee	1f6aaeddc5	Merge "Some extra bit probability cleanups"	2014-07-14 17:26:54 -07:00
hkuang	4c08120ca0	Merge "Include the right header for VP9 worker thread." into frame_parallel	2014-07-14 16:09:16 -07:00
hkuang	294b849796	Include the right header for VP9 worker thread. pthread.h is not supported in windows. vp9_thread.h includes the emulation layer for pthread in windows. Change-Id: I2b1c8ec299928472faca7ebeea998170c9f4d744	2014-07-14 16:03:38 -07:00
Jingning Han	6ce515b9ff	Merge "Fix chrome valgrind warning due to the use of mismatched bsize"	2014-07-13 11:07:44 -07:00
James Zern	0999a2a24e	Merge "vp9_loopfilter.c: cosmetics"	2014-07-11 16:02:21 -07:00
Jingning Han	3cddd81c6d	Fix chrome valgrind warning due to the use of mismatched bsize This commit fixes a mismatched use case of block size in non-RD intra prediction check. The residual SSE and variance should be calculated per transform block size, instead of operating block size, which caused chrome valgrind warning on conditional jump based on uninitialized value (webm issue 823). This commit resolves this issue. Change-Id: I595c06599c7e0fd0e4a08736519ba68fc14bc79a	2014-07-11 15:49:22 -07:00
hkuang	3cffa0c74e	Move vp9_thread.* to common. Prepare for frame parallel decoding, the reference count buffers need to be protected by mutex. Move vp9_thread.* to common folder so that those buffers could use cross-platform mutex from vp9_thread.*. (cherry picked from commit `337e8015c9`) Change-Id: I0587a08447925f4554d7788686a31483c2ae3f37	2014-07-11 15:24:31 -07:00
Yunqing Wang	7e340614c1	Merge "Remove unnecessary assertions"	2014-07-11 13:47:03 -07:00
Deb Mukherjee	6957e7a077	Some extra bit probability cleanups Refactoring to remove some duplication of probability tables between tokenization and detokenization. Change-Id: I2fc6a6497f9c0410021a9b41f828bc58a864e466	2014-07-11 11:39:18 -07:00
Yunqing Wang	978642a426	Remove unnecessary assertions Removed 2 unnecessary assertions. Change-Id: I0f8877d0494bf3ecdb0d7931ccbcaa8289e01d8b	2014-07-11 10:48:57 -07:00
Yaowu Xu	a75d55df1b	Remove an unused parameter Change-Id: I6ad6fd75dc3c9e6218d88148cf49e205398e2af5	2014-07-11 08:10:04 -07:00
James Zern	8a7cc1f47b	Merge "update vp9_thread.c"	2014-07-10 23:19:55 -07:00
James Zern	8701ed0270	update vp9_thread.c pull the latest from libwebp. Original source: http://git.chromium.org/webm/libwebp.git 100644 blob 264210ba2807e4da47eb5d18c04cf869d89b9784 src/utils/thread.c commit 46fd44c1042c9903b2f1ab87e9f200a13c7e702d Author: James Zern <jzern@google.com> Date: Tue Jul 8 19:53:28 2014 -0700 thread: remove harmless race on status_ in End() if a thread was still doing work when End() was called there'd be a race on worker->status_. in these cases, however, the specific value is meaningless as it would be >= OK and the thread would have been shut down properly, but we'll check 'impl_' instead to avoid any potential TSan/DRD reports. Change-Id: Ib93cbc226a099f07761f7bad765549dffb8054b1 Change-Id: Ib0ef25737b3c6d017fa74822e21ed58508230b91	2014-07-10 12:20:54 -07:00
Yunqing Wang	1226d133df	Merge "Refactor vp9_diamond_search_sad function"	2014-07-10 11:06:32 -07:00
Yunqing Wang	46441ec5c8	Merge "Refactor refining_search_sad code"	2014-07-10 10:43:00 -07:00
hkuang	51e9788e58	Fix a bug in boundary checking. Change-Id: Ifc741da9da6f61c8d3c1f675ec6b8a96570f877d	2014-07-10 09:43:04 -07:00
Yunqing Wang	75cd57503d	Refactor vp9_diamond_search_sad function Currently, vp9_diamond_search_sadx4() is only called when sse3 is enabled, which is improper since sse2 optimization of sdx4df functions are available. Changed to always use vp9_diamond_search_sadx4(). Change-Id: I4b95d6b7a3c6c645783c373f0ba8d645ece24717	2014-07-10 09:19:03 -07:00
James Zern	58609335b1	vp9_loopfilter.c: cosmetics - fix indent, spelling - drop some whitespace in some comments - add an assert in vp9_setup_mask, it shouldn't be called on decode error Change-Id: Ic312a815e977a6f9cb81ceb7b039eeada76c5aa0	2014-07-09 17:27:57 -07:00
Yunqing Wang	30117a576d	Refactor refining_search_sad code There are sse2 optimization of sdx4df functions. Instead of calling vp9_refining_search_sadx4 only when sse3 is enabled, call it always. Change-Id: I24f93818f7d4209d1425039e0eb099ff9ff08fe9	2014-07-09 16:50:11 -07:00
Jingning Han	f6bf614b2f	Merge "Re-design quantization process for 32x32 transform block"	2014-07-09 11:55:26 -07:00
hkuang	b84ee5a3d0	Merge "Move vp9_thread.* to common."	2014-07-09 10:16:13 -07:00
Jingning Han	9ad1b9fc67	Re-design quantization process for 32x32 transform block This commit enables a new quantization process for 32x32 2D-DCT transform coefficient blocks. It improves the compression performance of speed 5 by 1.4%. The overall compression gains of speed 5 due to the new quantization scheme is 4.7%. It also includes the SSSE3 implementation of the 32x32 quantization process. Change-Id: I0855b124fd6462418683f783f5bcb44255c9993b	2014-07-08 16:55:28 -07:00
Adrian Grange	7c43fb67ae	Fix decoder handling of intra-only frames This patch fixes bug 633: https://code.google.com/p/webm/issues/detail?id=633 The first decoded frame does not have to be a keyframe, it could be an inter-frame that is coded intra-only. This patch fixes the handling of intra-only frames. A test vector has also been added that encodes 3 intra-only frames at the start of the clip. The test vector was generated using the code in the following patch: https://gerrit.chromium.org/gerrit/#/c/70680/ Change-Id: Ib40b1dbf91aae2bc047e23c626eaef09d1860147	2014-07-08 16:24:03 -07:00
hkuang	337e8015c9	Move vp9_thread.* to common. Prepare for frame parallel decoding, the reference count buffers need to be protected by mutex. Move vp9_thread.* to common folder so that those buffers could use cross-platform mutex from vp9_thread.*. Change-Id: I541277cf15eefed6641555944f67f4a0bcdc8154	2014-07-07 14:52:19 -07:00
hkuang	28a794f680	Separate the frame buffers from VP9 encoder/decoder structure. Prepare for frame parallel decoding, the frame buffers must be separated from the encoder and decoder structure, while the encoder and decoder will hold the pointer of the BufferPool. Change-Id: I172c78f876e41fb5aea11be5f632adadf2a6f466	2014-07-02 15:34:20 -07:00
Yaowu Xu	82fd084b35	Merge "Re-design quantization process"	2014-07-01 19:04:01 -07:00
Jingning Han	9ac2f66320	Re-design quantization process This commit re-designs the quantization process for transform coefficient blocks of size 4x4 to 16x16. It improves compression performance for speed 7 by 3.85%. The SSSE3 version for the new quantization process is included. The average runtime of the 8x8 block quantization is reduced from 285 cycles -> 255 cycles, i.e., over 10% faster. Change-Id: I61278aa02efc70599b962d3314671db5b0446a50	2014-07-01 17:00:07 -07:00
Alex Converse	6c54dbcb69	Merge "BITSTREAM: Handle transform size and motion vectors more logically for non-420."	2014-06-30 17:44:01 -07:00
James Zern	44472cde55	vp9: disable postproc buffer alloc when unnecessary the buffer is only used in encoding and only when CONFIG_INTERNAL_STATS or CONFIG_VP9_POSTPROC is enabled. a future change should decouple this from the frame buffer allocation and make it conditional based on runtime flags when the above config options are enabled. reduces decode heap usage by at least 12% Change-Id: Id0b97620d4936afefa538d3aadf32106743d9caf	2014-06-27 20:59:56 -07:00
Jim Bankoski	52b63c238e	Merge "Better validation of invalid files"	2014-06-27 11:05:21 -07:00
Jim Bankoski	9f37d149c1	Better validation of invalid files This patch checks that a decoder never tries to reference a frame that's outside the range of 2x to 1/16th the size of this frame. Any attempt to do so causes a failure. Change-Id: I5c98fa7bb95ac4f29146f29dd92b62fe96164e4c	2014-06-27 10:03:15 -07:00
Jingning Han	46ea9ec719	Enable real-time version reference motion vector search This commit enables a fast reference motion vector search scheme. It checks the nearest top and left neighboring blocks to decide the most probable predicted motion vector. If it finds the two have the same motion vectors, it then skips finding the exterior range for the second most probable motion vector, and correspondingly skips the check for NEARMV. The runtime of speed -5 goes down: pedestrian at 1080p, 29377 ms -> 27783 ms; vidyo at 720p, 11830 ms -> 10990 ms, i.e., 6%-8% speed-up. For rtc set, the compression performance goes down by about -1.3% for both speed -5 and -6. Change-Id: I2a7794fa99734f739f8b30519ad4dfd511ab91a5	2014-06-26 09:49:13 -07:00
Adrian Grange	8357292a5a	Fix test on maximum downscaling limits There is a normative scaling range of (x1/2, x16) for VP9. This patch fixes the maximum downscaling tests that are applied in the convolve function. The code used a maximum downscaling limit of x1/5 for historic reasons related to the scalable coding work. Since the downsampling in this application is non-normative it will revert to using a separate non-normative scaler. Change-Id: Ide80ed712cee82fe5cb3c55076ac428295a6019f	2014-06-24 10:26:09 -07:00
Adrian Grange	8c1f071f1e	Allocate buffers based on correct chroma format The encoder currently allocates frame buffers before it establishes what the chroma sub-sampling factor is, always allocating based on the 4:4:4 format. This patch detects the chroma format as early as possible allowing the encoder to allocate buffers of the correct size. Future patches will change the encoder to allocate frame buffers on demand to further reduce the memory profile of the encoder and rationalize the buffer management in the encoder and decoder. Change-Id: Ifd41dd96e67d0011719ba40fada0bae74f3a0d57	2014-06-23 11:45:13 -07:00
Jingning Han	961bafc366	Merge "Remove unused vp9_init_quant_tables function"	2014-06-23 09:37:30 -07:00
Johann	1fc2b0fd00	Merge "Include type defines"	2014-06-20 11:29:19 -07:00
Johann	d658216276	Don't return value for void functions Clears "warning: 'return' with a value, in function returning void" Change-Id: I93972610d67e243ec772a1021d2fdfcfc689c8c2	2014-06-20 11:26:44 -07:00
Johann	baef0b89da	Include type defines Clears error: unknown type name 'uint8_t' Change-Id: I9b6eff66a5c69bc24aeaeb5ade29255a164ef0e2	2014-06-20 11:26:13 -07:00
Alex Converse	7557a65d16	BITSTREAM: Handle transform size and motion vectors more logically for non-420. This breaks the profile 1 bitstream. Don't force non420 uv transform size to 1/4 y size. In the 4:2:0 case the chroma corresponding to a luma block is 1/4 its size. In the 4:4:4 case chroma and luma planes are the same size. Disallowing larger transforms can result in a loss of compression efficiency and is inconsistent. For sub-8x8 blocks only average corresponding motion vectors. 4:2:0 and profile 0 behavior remains unchanged. Change-Id: I560ae07183012c6734dd1860ea54ed6f62f3cae8	2014-06-18 13:07:51 -07:00
Jingning Han	3b9c19aaa7	Remove unused vp9_init_quant_tables function This function is not effectively used, hence removed. Change-Id: I2e8e48fa07c7518931690f3b04bae920cb360e49	2014-06-18 11:51:41 -07:00
James Zern	88df435d6b	Merge "vp9_rtcd: correct avx2 references"	2014-06-16 17:39:13 -07:00
Johann	79afb5eb41	Use lrand48 on Android When building x86 assembly use lrand48 instead of the undocumented inlined _rand function. Android now supports rand() https://android-review.googlesource.com/97731 but only for new versions. Original workaround: https://gerrit.chromium.org/gerrit/15744 Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a	2014-06-12 19:57:25 -07:00
Jingning Han	d5ae43318e	Merge "Fast computation path for forward transform and quantization"	2014-06-12 11:59:52 -07:00
Jingning Han	ccba289f8d	Fast computation path for forward transform and quantization This commit enables a fast path computational flow for forward transformation. It checks the sse and variance of prediction residuals and decides if the quantized coefficients are all zero, dc only, or more. It then selects the corresponding coding path in the forward transformation and quantization stage. It is currently enabled in rtc coding mode. Will do it for rd coding mode next. In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up. Overall coding performance for rtc set is changed by -0.18%. Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1	2014-06-12 11:10:54 -07:00
James Zern	9f3a0dbb5e	vp9_rtcd: correct avx2 references s/"\$avx2_x86inc"/"avx2"/ avx2 code is all intrinsics and as a result doesn't rely on x86inc.asm Change-Id: I76ad39474d8a00658f3e43131830ef0f4f34772a	2014-06-10 16:26:36 -07:00
James Zern	cbce09ce62	Merge changes I6abc0657,I8224fba2,I04f64a45,I5d49d119,I76b4d171,I88c11ac3 * changes: vp9_sub_pixel_variance: disable avx2 variants vp9_sad*x4d: disable avx2 variants vp9_f(dct|ht): disable avx2 variants convolve: disable avx2 variants fdct8x8_test: add missing avx2 functions dct4x4_test: add missing avx2 functions	2014-06-10 16:14:45 -07:00
James Zern	520cb3f39f	vp9_sub_pixel_variance: disable avx2 variants tests failing under Win32/Win64 + variance_test: add missing avx2 functions (partially disabled) Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d	2014-06-10 16:11:15 -07:00
James Zern	d3ff009d84	vp9_sad*x4d: disable avx2 variants tests failing under Win32/Win64 + sad_test: add missing avx2 functions (disabled) Change-Id: I8224fba2b270f6039ab1877d71e1e512f0081856	2014-06-10 16:10:12 -07:00
hkuang	cdffeaaae0	Add mode info arrays and mode info index. In non frame-parallel decoding, this works the same way as current decoding scheme. Every time after decoder finish decoding a frame, it will swap the current mode info pointer and previous mode info pointer if the decoded frame needs to be shown. Both mode info pointer and previous mode info pointer are from mode info arrays. In frame-parallel decoding, this will become more complicated as current frame's mode info pointer will be shared with next frame as previous mode info pointer. But when one decoder thread finishes decoding one frame and starts to work on next available frame, it needs to retain the decoded frame's mode info pointers until next frame finishes decoding. The mode info index will serve this purpose. The decoder will use different buffer in the mode info arrays and use the other buffer to save previous decoded frame’s mode info. Change-Id: If11d57d8eb0ee38c8876158e5482177fcb229428	2014-06-10 13:43:36 -07:00
James Zern	dd9f502933	vp9_f(dct|ht): disable avx2 variants tests failing under Win32/Win64 + dct16x16_test: add missing avx2 functions (partially disabled) exercises the forward transforms no idct/iht implementations, so the c-code is used Change-Id: I04f64a457fa0828a00f32b5c9fe4f55294f21f61	2014-06-09 18:48:11 -07:00
James Zern	5704578f5f	convolve: disable avx2 variants tests failing under Win32/Win64 Change-Id: I5d49d11911bcda3a832b14efe5500d22597bedcf	2014-06-09 18:42:03 -07:00
Jingning Han	0c4a4225ec	Merge "Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs"	2014-06-03 16:51:39 -07:00
Dmitry Kovalev	19c492a749	Merge "Reusing existing vp9_get{8x8, 16x16}var() instead of new ones."	2014-06-03 10:04:27 -07:00
Deb Mukherjee	fc88292ef2	Remove Wextra warnings from vp9_sad.c As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Fixes a bug in original patch: (https://gerrit.chromium.org/gerrit/#/c/70163/8) that was reverted due to a nightly test failure. Change-Id: Ia2a4e9e278fd3c89d6c3c82fcc6381320ec2a8a6	2014-06-02 13:50:20 -07:00
Frank Galligan	c40a968e13	Merge "Revert "Remove Wextra warnings from vp9_sad.c""	2014-06-01 16:58:11 -07:00
Frank Galligan	0b44988952	Revert "Remove Wextra warnings from vp9_sad.c" This reverts commit `916550428d` Change-Id: I500822b03f09c64ff6ec5396c68edee9ca3b75cb	2014-06-01 16:20:26 -07:00
Jingning Han	ba6bed372b	Merge "Fix a potential overflow issue in inverse 16x16 full 2D-DCT"	2014-05-30 15:52:53 -07:00
Jingning Han	2c1cdf69b6	Fix a potential overflow issue in inverse 16x16 full 2D-DCT An overflow issue could potentially happen in the second round 1-D transform of the SSSE3 full inverse 16x16 2D-DCT. This commit fixes this issue. Change-Id: Ia19e4888fda1cc929a28a5f89a5beec612d628dc	2014-05-29 11:46:32 -07:00
Dmitry Kovalev	e14f900ae3	Merge "Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK."	2014-05-29 11:16:39 -07:00
Dmitry Kovalev	f7ff24cdd0	Reusing existing vp9_get{8x8, 16x16}var() instead of new ones. Change-Id: I87b7c657d8813d7fb383ab519d150c0ffb1dd377	2014-05-29 11:14:06 -07:00
Jingning Han	6d21cbd20b	Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs This commit enables SSSE3 implementation of the inverse 2D-DCT with only first 10 coefficients non-zero. It reduces the runtime of SSE2 version from 745 cycles to 538 cycles, i.e., 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe	2014-05-28 10:53:33 -07:00
Jingning Han	d5bcef5242	Merge "Fix compiling error in MSVS"	2014-05-27 16:58:00 -07:00
Jingning Han	239e68ddbf	Fix compiling error in MSVS Need to include math.h before tmmintrin.h in some versions of MSVS. Change-Id: Ia6b83ae599316887ecf30c4e4b9e4355fb8a4219	2014-05-27 15:58:47 -07:00
Yunqing Wang	1f2200080b	Revert "Making vp9_get_sse_sum_{8x8, 16x16} static." This reverts commit `e8bbb3d9db`. Change-Id: Ie368d36fd249d323d859d208609c711f04537bbc	2014-05-27 13:37:08 -07:00
Deb Mukherjee	444f93945b	Merge "Remove Wextra warnings from vp9_sad.c"	2014-05-27 11:54:05 -07:00
Yunqing Wang	a591ac9e5a	Merge "Fix decoder mismatch in sub-pixel AVX2 intrinsic filters"	2014-05-27 10:52:16 -07:00
levytamar82	773596050f	Fix decoder mismatch in sub-pixel AVX2 intrinsic filters The subpixel SSSE3 was fixed in this patch: https://gerrit.chromium.org/gerrit/#/c/70283/ So the equivalent AVX2 is fixed accordingly. Change-Id: Ieebbc1949c99d34b12b8b47692df71aca5001f3a	2014-05-23 16:48:40 -07:00
Jingning Han	59c3f446fe	Merge "Inverse 16x16 2D-DCT SSSE3 implementation"	2014-05-23 16:01:22 -07:00
Jingning Han	48b0891370	Inverse 16x16 2D-DCT SSSE3 implementation This commit enables the SSSE3 implementation of full inverse 16x16 2D-DCT. The unit runtime goes down from 1642 cycles to 1519 cycles, about 7% speed-up. Change-Id: I14d2fdf9da1fb4ed1e5db7ce24f77a1bfc8ea90d	2014-05-23 15:09:35 -07:00
Yunqing Wang	67ca5b586a	Merge "Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters"	2014-05-23 14:24:48 -07:00
Dmitry Kovalev	d7d7cedaaa	Merge "Removing vp9_pragmas.h."	2014-05-23 12:58:00 -07:00
Yunqing Wang	c5443fc881	Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters In 8-tap filtering, to guarantee the intermediate results fit in 16 bits, the order of accumulating the products needs to be done correctly, and the largest product should be added last. This patch fixed the problem using the method in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation". Change-Id: I79d0ad60c057b15011ece84cda9648eee0809423	2014-05-23 11:52:20 -07:00
Yaowu Xu	9410330893	Merge "change to use assembly version of ssse3 filter code"	2014-05-23 08:02:28 -07:00
Deb Mukherjee	916550428d	Remove Wextra warnings from vp9_sad.c As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Change-Id: I068cc2391eed51e9b140ea6aba78338c5fec8d71	2014-05-22 22:21:16 -07:00
Yaowu Xu	7a0c9b82f2	change to use assembly version of ssse3 filter code Mismatches were found between the intrinsic version and the C-only version. This commit temporarily reverts to the matching assembly version to allow further investigation. Change-Id: I08436c47d4888b562c0eac8e8856d90a831442df	2014-05-22 17:11:57 -07:00
Yunqing Wang	aaf204e550	Merge "Fix a decoding mismatch in sub-pixel filters"	2014-05-22 17:09:14 -07:00
Yunqing Wang	efcdf946ed	Fix a decoding mismatch in sub-pixel filters This did the same correction as the one in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" to avoid saturation during filtering. Change-Id: Ife9aa3f62daf9114eb24fe38f7baa3c3f361b2d6	2014-05-22 15:42:13 -07:00
Dmitry Kovalev	72ab966d5e	Removing vp9_pragmas.h. Change-Id: I9120a87e27e73e496932d11716937e2fad246521	2014-05-22 13:46:31 -07:00
Deb Mukherjee	e272273443	Renames x86_64 specific asm files Renames all x86_64 specific assembly files to consistently end in _x86_64.asm. This will be useful for build systems to handle these files differently. All new 64-bit specific assembly files should use the new naming convention. Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536	2014-05-21 13:55:56 -07:00
Dmitry Kovalev	35a83677a5	Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK. The final goal is eventually to get rid of both itxm_add and fwd_txm4x4. This patch does it in the decoder. Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5	2014-05-21 11:09:44 -07:00
Deb Mukherjee	ef750d8472	Merge "Extends temporal filtering to work for 422 data"	2014-05-20 16:31:28 -07:00
Deb Mukherjee	a185bc3350	Extends temporal filtering to work for 422 data This is needed for profiles 1 and 2. Change-Id: I5dd7644c2932d055ab89e050d4be7d4117cd1028	2014-05-20 15:19:40 -07:00
hkuang	20c1edf612	Refactor decode_tiles and loopfilter code. The current decode_tiles decodes the frame one tile by one tile and then loopfilter the whole frame or use another worker thread to do loopfiltering. |------|------|------|------| |Tile1-|Tile2-|Tile3-|Tile4-| |------|------|------|------| For example, if a tile video has one row and four cols, decode_tiles will decode the Tile1, then Tile2, then Tile3, then Tile4. And during decode each tile, decode_tile will decode row by row in each tile. For frame parallel decoding, decode_tiles will decode video in row order across the tiles. So the order will be: "Decode 1st row of Tile1" -> "Decode 1st row of Tile2" -> "Decode 1st row of Tile3" -> "Decode 1st row of Tile4" -> "Decode 2nd row of Tile1" -> "Decode 2nd row of Tile2" -> "Decode 2nd row of Tile3" -> "Decode 2nd row of Tile4"-> "loopfilter 1st row" Change-Id: I2211f9adc6d142fbf411d491031203cb8a6dbf6b	2014-05-20 14:47:45 -07:00
Dmitry Kovalev	c23c613fdf	Merge "Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file."	2014-05-19 10:27:16 -07:00
Dmitry Kovalev	79ba41903f	Removing MACROBLOCKD dependency from loop filter. Change-Id: I9ef40f3d95ab8f94f69e92ea25678a40956bc1ce	2014-05-16 09:48:26 -07:00
Adrian Grange	9dc9f17814	Merge "Fix post-processor macros & remove vizualization"	2014-05-16 09:01:41 -07:00
Dmitry Kovalev	619e6b539a	Merge "Removing redundant "8x8" suffix from MODE_INFO vars."	2014-05-15 17:53:31 -07:00
Jim Bankoski	ec82d2dfec	Merge "Revert "Remove Wextra warnings from vp9_sad.c""	2014-05-15 11:54:23 -07:00
Yunqing Wang	c661cf0dad	Merge "AVX2 To VP9 Block Error Optimization"	2014-05-15 11:29:29 -07:00
Dmitry Kovalev	ed784a0bc4	Removing redundant "8x8" suffix from MODE_INFO vars. Change-Id: I7ed7fecc959c6598ff98895f1a5cf7e11ac1615f	2014-05-15 11:14:42 -07:00
Adrian Grange	384bc5163c	Fix post-processor macros & remove vizualization Make all post-processor code conditionally compilable based on the CONFIG_VP9_POSTPROC macro. Also, remove the vizualization code from VP9 since it is out of date and will not compile. Change-Id: I1e9e13a09ecd43e9a3f3704c175ae8cd258ababd	2014-05-15 08:35:36 -07:00
Jim Bankoski	a16794dd31	Revert "Remove Wextra warnings from vp9_sad.c" This reverts commit `7ab9a9587b` Nightly test http://build.webmproject.org/jenkins/view/libvpx-nightly-tests/job/libvpx%20unit%20tests%20(valgrind-2)/arch=x86_64-linux-gcc,filter=-VP8:Large./276/console Failed This patch did not address all the assembly issues some of the vp8 assembly counts on 5 arguments being passed in to this function: one example : vp8_sad8x16_wmt Please address or split this into vp9 and vp8 patches. Change-Id: I78afcc171649894f887bb8ee3c66de24aaddc7ca	2014-05-15 08:31:20 -07:00
Yaowu Xu	71854f3a6e	Merge "vp9_decodeframe.c: cleanup -wextra warnings"	2014-05-15 06:50:51 -07:00
Dmitry Kovalev	021eaabdb8	Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file. Change-Id: Id401da740b0a0141caaef9e1bcccd981e5cef4a4	2014-05-14 16:21:41 -07:00
levytamar82	1fbab853c8	AVX2 To VP9 Block Error Optimization vp9_block_error_sse2 can only handle 16 bytes at a time, but the function is required to handle a sequence of 32 bytes at a time, so each 16 bytes is handled in a different register. With AVX2 optimization the 32 bytes can be handled in one register instead of two as in SSE2. The vp9_block_error was optimized by 85%. The user level was optimized by 1.2%. Change-Id: Ia8fffe60e61eff7432a5fbd538757894f6c319fd	2014-05-14 11:51:07 -07:00
Deb Mukherjee	9687c057f8	Merge "Remove Wextra warnings from vp9_sad.c"	2014-05-14 10:01:50 -07:00
Yaowu Xu	ed09580777	vp9_decodeframe.c: cleanup -wextra warnings Change-Id: I0315cea6a5e58182bc2556e9825ec2ef0b1480c3	2014-05-14 09:46:11 -07:00
Jingning Han	e5bbb4cfd8	Merge "Silience -wextra warnings in vp9_reconintra.c"	2014-05-14 09:25:08 -07:00
Deb Mukherjee	7ab9a9587b	Remove Wextra warnings from vp9_sad.c As a side-effect, the max_sad check is removed from the C-implementation of VP8, for consistency with VP9, and to ensure that the SAD tests common to VP8/VP9 pass. That will make the VP8 C implementation of sad a little slower but given that is rarely used in practice, the impact will be minimal. Change-Id: I7f43089fdea047fbf1862e40c21e4715c30f07ca	2014-05-14 03:17:31 -07:00
Dmitry Kovalev	eecc750b33	Merge "Moving loopfilter call to vp9_decode_frame()."	2014-05-13 17:20:26 -07:00
Jingning Han	806fa6aaca	Silience -wextra warnings in vp9_reconintra.c The warning messages complained that there are unused arguments in a few prediction modes. This structure was designed on purpose, such that a wrapper function can cover all prediction mode cases and make them readily accessible as a pointer array. This commit silences such warnings. Change-Id: I7036b6bdb70747e5327d8f6fceb154f100abc4c0	2014-05-13 12:54:23 -07:00
Adrian Grange	fd6bf31b8a	vp9_convolve.c: cleanup -wextra warnings Change-Id: I04930aca2293ebbaeb96dfedd2f9c5a55762fd2e	2014-05-13 09:57:24 -07:00
Dmitry Kovalev	ae7d3ef39f	Moving loopfilter call to vp9_decode_frame(). Inline loopfilter has been already handled in vp9_decode_frame(). Collecting all similar code in one place now. Change-Id: I358a0280fc7c2b27cca520bc1e8c16c4eb6491dd	2014-05-12 16:19:19 -07:00
Johann	ce23931a3f	Only build neon assembly for armv7 targets Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477	2014-05-12 08:52:02 -07:00
Alex Converse	ec8a3272fa	Merge "Add an x86inc MMX fwht4x4."	2014-05-09 13:48:49 -07:00
Jingning Han	9412785b02	Merge changes I3edd4b95,I4514f974,Ie7fa4386 * changes: Turn on unit tests for SSSE3 8x8 forward and inverse 2D-DCT Change eob threshold for partial inverse 8x8 2D-DCT to 12 SSSE3 8x8 inverse 2D-DCT with first 10 coeffs non-zero	2014-05-09 09:58:39 -07:00
Alex Converse	b5422fab46	Add an x86inc MMX fwht4x4. Change-Id: Ib0a73d4863478f9b8a00976379d25d2f6ebbb197	2014-05-08 12:01:27 -07:00
Jingning Han	41a350a83d	Change eob threshold for partial inverse 8x8 2D-DCT to 12 The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT allows to handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2	2014-05-08 09:48:58 -07:00
Jingning Han	9e7b09bc5d	SSSE3 8x8 inverse 2D-DCT with first 10 coeffs non-zero This commit enables ssse3 assembly implementation of the 8x8 inverse 2D-DCT with only first 10 coefficients non-zero. The average runtime for this unit goes down from 198 cycles to 129 cycles (34.8% faster). Change-Id: Ie7fa4386f6d3a2fe0d47a2eb26fc2a6bbc592ac7	2014-05-07 17:40:02 -07:00
Dmitry Kovalev	68a600d82a	Merge "Moving pair_set_epi32 macro into vp9_dct32x32_sse2.c."	2014-05-07 13:34:05 -07:00
Paul Wilkins	33b1c457ed	Revert "Add an MMX fwht4x4" Includes changes that are not compatible with VS windows builds. Amongst other things stdint.h is not supported in VS. This reverts commit `89fbf3de50`. Change-Id: Ifa86d7df250578d1ada9b539c9ff12ed0c523cdd	2014-05-07 12:53:27 +01:00
Alex Converse	75d05d5ed4	Merge "Add an MMX fwht4x4"	2014-05-06 11:12:27 -07:00
Jingning Han	d289deb04c	Merge "SSSE3 implementation of full inverse 8x8 2D-DCT"	2014-05-06 09:17:22 -07:00
Dmitry Kovalev	e8bbb3d9db	Making vp9_get_sse_sum_{8x8, 16x16} static. Change-Id: Ifb7937c977308c682986f0ce9645a0807d2aa46a	2014-05-05 19:12:38 -07:00
Alex Converse	89fbf3de50	Add an MMX fwht4x4 7% faster encoding a desktop lossless at RT speed 4. Change-Id: I41627f5b737752616b6512bb91a36ec45995bf64	2014-05-05 15:10:48 -07:00
Jingning Han	52ae97b6aa	SSSE3 implementation of full inverse 8x8 2D-DCT This commit enables SSSE3 version full inverse 8x8 2D-DCT and reconstruction. It makes the runtime of vp9_idct8x8_64_add down from 256 cycles (SSE2) to 246 cycles. Change-Id: I0600feac894d6a443a3c9d18daf34156d4e225c3	2014-05-05 10:49:27 -07:00
Dmitry Kovalev	25a666ef39	Moving pair_set_epi32 macro into vp9_dct32x32_sse2.c. Change-Id: I642a7d343677bf934e9a54cf4ad78e908620e39a	2014-05-01 16:45:49 -07:00
Jingning Han	39761eb5d6	Merge "Enable SSSE3 implementation of 8x8 forward 2D-DCT"	2014-04-30 13:41:36 -07:00
Dmitry Kovalev	d2bc8816a1	Merge "Adding search_site_config struct."	2014-04-29 16:59:47 -07:00
Jingning Han	1eaa3a76dc	Enable SSSE3 implementation of 8x8 forward 2D-DCT Assembly implementation of ssse3 8x8 forward 2D-DCT. The current version is turned on only for x86_64. The average unit runtime goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster. This translates into about 1.5% speed-up for pedestrian_area 1080p at speed 2. Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4	2014-04-29 15:49:18 -07:00
Dmitry Kovalev	9b042dc04c	Merge "Removing unused vp9_variance_halfpixvar*() functions."	2014-04-29 14:52:58 -07:00
Dmitry Kovalev	aa464eca5e	Adding search_site_config struct. Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf	2014-04-29 10:34:53 -07:00
Dmitry Kovalev	7b59014b74	Removing old unused vp9_tapify.py. Change-Id: I7d66987fd04a3f98c140fc5f99ed0e9bc01f61d0	2014-04-25 15:19:31 -07:00
Dmitry Kovalev	6e01079cc0	Removing unused vp9_variance_halfpixvar*() functions. Change-Id: I99695564a3aa9bc8c79ac0a551d257e2ff3ad3c3	2014-04-25 11:50:07 -07:00
Dmitry Kovalev	03e7deae4f	Removing unused vp9_sub_pixel_mse* functions. Change-Id: I8d906da3bd6de0d3042676846f61a8b2a3444508	2014-04-24 11:49:12 -07:00
Dmitry Kovalev	e608418899	Renaming MB_PREDICTION_MODE to PREDICTION_MODE. Actually, it would be great to have two separate enums INTRA_MODES and INTER_MODES in future. Change-Id: I6c4147cf0002853da9c1e03fe9514eab876f01c8	2014-04-22 17:48:31 -07:00
Dmitry Kovalev	55977e4a4f	Merge "Moving frame_frags field from VP9Common to VP9_COMP."	2014-04-15 10:39:31 -07:00
Dmitry Kovalev	63fa722179	Removing unused cost arguments from mcomp functions. Change-Id: Id81a76d18be6b2de69f81bb563d74c3bb356d434	2014-04-11 10:24:36 -07:00
Yunqing Wang	23ccf71924	Merge "Fix encoder uninitialized read errors reported by drmemory"	2014-04-10 09:45:08 -07:00
Dmitry Kovalev	1d5ed021fb	Moving frame_frags field from VP9Common to VP9_COMP. Change-Id: I0f4a5c50561a2653d22c366c214a937272ecfa2c	2014-04-09 20:56:06 -07:00
Dmitry Kovalev	65e650e0c0	Merge "Revert "Converting set_prev_mi() to get_prev_mi().""	2014-04-09 20:44:30 -07:00
Dmitry Kovalev	60def47f21	Revert "Converting set_prev_mi() to get_prev_mi()." This reverts commit `22a3e30790` Change-Id: I460d905edf5fb2006da58c18fbe02c04d0c631bb	2014-04-09 15:23:16 -07:00
Tom Finegan	4fffefe189	Merge "Fix avx builds on macosx with clang 5.0."	2014-04-09 13:03:26 -07:00
Dmitry Kovalev	5ed83c3220	Merge "Converting set_prev_mi() to get_prev_mi()."	2014-04-09 10:27:05 -07:00
Yunqing Wang	2e7d327789	Merge "Use source frame difference to make partition decision"	2014-04-09 10:26:42 -07:00
Yunqing Wang	3a6670fcf8	Fix encoder uninitialized read errors reported by drmemory This patch fixed the uninitialized read errors in Issue 748: "dr memory VP9 encode errors". In vp9_convolve_avg_sse2, when width is 4, pavgb reads 8 bytes from dst buffer that is out of range. An error is reported although the data is not actually used later. This issue was resolved by preventing uninitialized reads. Change-Id: I109a54910aa47139cb13119de86f2062cff207df	2014-04-09 09:59:15 -07:00
Tom Finegan	f600b50a6e	Fix avx builds on macosx with clang 5.0. The macosx release of clang v5.0 identifies itself as: Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) This version of clang uses the older _mm_broadcastsi128_si256, like v3.3, as given away in the LLVM svn version above. Change-Id: I4d6d59d5454efd57d2ae9e75f5eb7486af7cbd0c	2014-04-08 18:56:03 -07:00
Yunqing Wang	4e66293fcb	Use source frame difference to make partition decision Calculate the difference variance between last source frame and current source frame. The variance is calculated at 16x16 block level. The variances are compared to several thresholds to decide final partition sizes. An adaptive strategy is implemented to decide using SOURCE_VAR_BASED_PARTITION or FIXED_PARTITION based on motions in the video. The switching test is done once every search_type_check_frequency frames. The selection of source_var_thresh needs to be investigated further later. RTC set Borg test showed 0.424% overall psnr gain, and 0.357% ssim gain. For clips with large enough static area, the encoding speedup is around 2% to 15%. Change-Id: Id7d268f1d8cbca7fb8026aa4a53b3c77459dc156	2014-04-08 17:03:02 -07:00
Deb Mukherjee	d35df2d8ea	High-level hooks for Profile 2 (10/12 bit) Adds some high-level hooks for profile 2 before further progress on the implementation. According to the definition in this patch: 1. Profile 2 only supports 10 or 12 bit color but not 8 2. Profile 2 supports all color sampling modes: 444, 422 and 420, and alpha plane. 3. Profile 3 is currently undefined. Please consider the definition carefully and suggest modifications to the definition as needed. Change-Id: I5b284fc679e54ac5aee171af72fa7994cfd28995	2014-04-08 16:18:34 -07:00
Dmitry Kovalev	22a3e30790	Converting set_prev_mi() to get_prev_mi(). Change-Id: Iad4002d7aecaae0e25d88e286bacde7e6cd7264f	2014-04-07 16:01:34 -07:00
Dmitry Kovalev	b5e12dda52	Cleaning up vp9_{cx, dx}_iface.c files. Change-Id: Ib4e31ba74c4b882bd93942ef743f4a189892738d	2014-04-07 10:38:51 -07:00
Dmitry Kovalev	a9f324fa7f	Removing interp_kernel from MACROBLOCKD. Now interp_kernel is obtained when it is really required (based on mbmi->interp_filter value). Change-Id: I4c7a93c179d1045eba16e7526c293d02c9b8b47e	2014-04-03 15:28:42 -07:00
Dmitry Kovalev	8b8606a737	Merge "Cleaning up vp9_mvref_common.c."	2014-04-02 11:03:36 -07:00
Dmitry Kovalev	68027a0b8a	Merge "Grouping members in MB_MODE_INFO struct."	2014-04-02 11:00:58 -07:00
Dmitry Kovalev	86f44a91f4	Renaming two members in MACROBLOCKD struct. Renames: mi_8x8 -> mi mode_info_stride -> mi_stride Change-Id: I66f3e5fd1e7b7f46f108af5bb711c5fd9493c1be	2014-04-01 17:46:40 -07:00
Dmitry Kovalev	d42976c515	Common configuration for MACROBLOCKD struct. Change-Id: Ie2ea9dd8bd338cc9fe12ca9033df64f7644c68b3	2014-04-01 10:57:59 -07:00
Dmitry Kovalev	20d868f05d	Grouping members in MB_MODE_INFO struct. Change-Id: Ia6d7e7a08810e0c3401da4d10266828d560e6851	2014-03-28 17:44:13 -07:00
Yaowu Xu	4f857bacd2	[BITSTREAM]Fix the scaling calculation For very large size video image, the scaling calculation may need use value beyond the range of int. This commit upgrade the value to 64bit to make sure the calculation do not wrap around INT_MAX. The change corrected the decoder behavior. The bug affects only very large resolution video because the scaling calculation was sufficient for image size smaller than 2^13. This resolves issue: https://code.google.com/p/webm/issues/detail?id=750 Change-Id: I2d2ed303ca6482f31f819f3c07d6d3e98ef3adc5	2014-03-28 16:40:29 -07:00
Dmitry Kovalev	03349d2ba2	Moving dqcoeff array to MACROBLOCKD in decoder. Change-Id: I3e20c0cdb9d2437bddf21afb255855f2dead8e02	2014-03-28 10:36:16 -07:00
Dmitry Kovalev	38053687bc	Cleaning up vp9_mvref_common.c. Change-Id: I4eb815156ecaab02c9182e6e1abbea0e4d86c441	2014-03-27 17:50:02 -07:00
Dmitry Kovalev	0437575848	Merge "Removing prev_mi_8x8 from MACROBLOCKD."	2014-03-26 15:45:11 -07:00
Dmitry Kovalev	38c2d37b9d	Merge "Cleaning up vp9_entropymv.c."	2014-03-26 14:28:45 -07:00
Dmitry Kovalev	63f86c149a	Removing prev_mi_8x8 from MACROBLOCKD. Change-Id: I32beb5f18c10b5771146c55933b5555487f53633	2014-03-26 10:50:34 -07:00
Dmitry Kovalev	ed39c40a2e	Moving above_context to VP9_COMMON. Change-Id: I713af99d1e17e05a20eab20df51d74ebfd1a68d2	2014-03-25 10:40:08 -07:00
Yaowu Xu	34a3628a45	Merge "Fixed a build issue"	2014-03-25 10:22:18 -07:00
Yaowu Xu	59872069d2	Merge "Change back the scaling calculation."	2014-03-25 09:48:21 -07:00
Yaowu Xu	8051563972	Fixed a build issue Adding the missed include file. Change-Id: I7e48df6b0633afbebaf1ccb3062ae404e7203dc9	2014-03-25 09:45:54 -07:00
Dmitry Kovalev	5b8c834c1a	Initialization code cleanup. Change-Id: I47a8b4bf9a6cc0063d1a6785eaaad641d0659e24	2014-03-24 12:21:22 -07:00
Dmitry Kovalev	49bb6df0e2	Cleaning up vp9_entropymv.c. Change-Id: I01b3530779da89acb84c71bac5ccac456f00c5ac	2014-03-24 11:02:27 -07:00
Yunqing Wang	b458bb7c20	Merge "AVX2 SAD Optimization:"	2014-03-24 10:52:32 -07:00
Dmitry Kovalev	ac5bdc0ed8	Merge "Cleaning up vp9_loopfilter.c."	2014-03-24 09:02:06 -07:00
hkuang	22232ec602	Change back the scaling calculation. Let the calculation to be compatible with Google's HW implementation. Change-Id: I22e179888cdb0419e230351c0a47661b37051fef	2014-03-24 08:32:56 -07:00
Dmitry Kovalev	9895c9d4dd	Merge "Removing redundant {above, left}_seg_context manipulation code."	2014-03-22 22:31:48 -07:00
Dmitry Kovalev	2786938a3c	Merge "Renaming and making vp9_update_mode_info_border() static."	2014-03-21 21:19:18 -07:00
Dmitry Kovalev	58cc06f9b3	Cleaning up vp9_loopfilter.c. Change-Id: I7c7cf7d3c7b00d1c74ffa8aa8fb8d78a0e48326f	2014-03-21 16:31:15 -07:00
Frank Galligan	8345e76d61	Merge "Fix libvpx VP9 decoder dr memory errors"	2014-03-21 15:24:39 -07:00
Dmitry Kovalev	e141f10bfc	Renaming and making vp9_update_mode_info_border() static. Change-Id: Ibb72a29cae9ca9443aae56fc4c5458d190eae279	2014-03-21 14:02:25 -07:00
levytamar82	0fa8b668c1	AVX2 SAD Optimization: 2 functions were optimized for avx2 by using full 256 bit register In order to handle 32 elements in parallel instead of only 16 in parallel: 1. vp9_sad32x32x4d 2. vp9_sad64x64x4d The function level gain is 66% and the user level gain is ~1%. Change-Id: I4efbb3bc7d8bc03b64b6c98f5cd5c4a9dd3212cb	2014-03-21 13:53:32 -07:00
Yunqing Wang	9b5df3fabe	Fix libvpx VP9 decoder dr memory errors Fixed dr memory errors reported in Issue 736: https://code.google.com/p/webm/issues/detail?id=736 All elements in left_col buffer need to be initialized to ensure the correctness of SIMD operations in x86 optimized code. Change-Id: I8e7f26ab45cca8099c1f9342bcf852f828bda7e4	2014-03-21 12:23:47 -07:00
Dmitry Kovalev	4cb37bff96	Removing redundant {above, left}_seg_context manipulation code. Change-Id: Ib3c1746e61220c629cbd971b2458aa686b5c9e36	2014-03-21 12:12:55 -07:00
Dmitry Kovalev	a57de9da03	Merge "Reusing {above, left}_seg_context vars in both encoder and decoder."	2014-03-21 12:02:42 -07:00
Yaowu Xu	46c71e5eba	Merge "Remove duplicate declaration"	2014-03-21 08:44:04 -07:00
Dmitry Kovalev	7ad40117f1	Reusing {above, left}_seg_context vars in both encoder and decoder. Change-Id: Id1fa36c92cb007b73a450cc8552e810cedad38b9	2014-03-20 16:15:57 -07:00
Dmitry Kovalev	03781ff22d	Merge "Removing mi_stream."	2014-03-20 13:43:13 -07:00
Dmitry Kovalev	4b37dc8d87	Adding alloc_mi() function. Change-Id: I3b944884c048f589c86e0169aeb3c3855bc8b729	2014-03-19 13:31:47 -07:00
Yaowu Xu	7ef16efca1	Remove duplicate declaration Change-Id: Ic8e52a89e0df816c38cd8ff1b7c53862b9a6dff2	2014-03-19 12:23:32 -07:00
Yaowu Xu	8cb59992e8	Merge "Fix the md5 mismatch for some scale cases."	2014-03-19 11:13:28 -07:00
Dmitry Kovalev	8ccfcb765f	Removing mi_stream. Change-Id: If674140e30c223c88894b983fd22a583efb99dcf	2014-03-19 10:47:32 -07:00
Dmitry Kovalev	b8bc2d337a	Fixing warnings/errors from c++ compiler. Change-Id: Ia561dda53f2dd10e3a10a2df2adb8027ab19397a	2014-03-18 10:47:51 -07:00
hkuang	1f7e4856f8	Fix the md5 mismatch for some scale cases. Fixes issue #731 Change-Id: Id313e84b8fb4ff20f6a4e1ed11cb601927888318	2014-03-17 11:21:43 -07:00
Dmitry Kovalev	7c6337ba9e	Merge "Adding vp9_swap_mi_and_prev_mi() function."	2014-03-13 17:47:27 -07:00
Dmitry Kovalev	d8e5564129	Using MB_PREDICTION_MODE enum instead of int. Change-Id: I652d17f7bff84f75d015f4f39652472e14eb3134	2014-03-13 15:03:00 -07:00
Dmitry Kovalev	e65c564c78	Adding vp9_swap_mi_and_prev_mi() function. Change-Id: I18b3939f0b51085cdd25c9182c3a9c7536ca7e3e	2014-03-13 13:55:33 -07:00
Dmitry Kovalev	3dca8ca7af	Merge "Renaming mode2txfm_map to intra_mode_to_tx_type_lookup."	2014-03-12 23:29:29 -07:00
Yaowu Xu	17256ad763	Revert "With on demand border extension, clamping the MV" This reverts commit `b0fec6ab4a`. Change-Id: I9acd8ee0423f22d92138f11579611ff959331013	2014-03-12 19:40:15 -07:00
Yaowu Xu	acf2eb73e7	Revert "Remove dec_build_inter_predictors() parameters" This reverts commit `9650b9d72a`. Change-Id: I841c4a4734170fda63469e32adc10703aa4bf0fa	2014-03-12 19:39:59 -07:00
Dmitry Kovalev	95aed4a3fa	Renaming mode2txfm_map to intra_mode_to_tx_type_lookup. Change-Id: I9a19eb96907f674e3ce1e573f5dd49f0fbf2ae4f	2014-03-12 17:23:26 -07:00
Dmitry Kovalev	c909b43e3c	Merge "Moving mi_streams from VP9Decompressor to VP9Common."	2014-03-12 12:20:18 -07:00
Dmitry Kovalev	fec0d4bc7d	Merge "Removing last_mi from MACROBLOCKD struct."	2014-03-12 12:19:43 -07:00
Dmitry Kovalev	dff81e6c7a	Moving mi_streams from VP9Decompressor to VP9Common. Change-Id: I7ad79c061ad4efbc4914ac49723b48183fdbdd47	2014-03-10 16:12:45 -07:00
Dmitry Kovalev	ff935ff781	Removing last_mi from MACROBLOCKD struct. Change-Id: Ied12b39c55667b26fd3bf90eb331e601c53a10f6	2014-03-10 16:02:03 -07:00
Dmitry Kovalev	6281a9abbb	Adding type casts to remove C++ compiler errors. Change-Id: I224e49955ad6c833d204feb8efc4056e37d206be	2014-03-10 14:53:30 -07:00
Dmitry Kovalev	f8f8c6d44c	Adding reusable get_y_mode_prob() function. Change-Id: Iebd182d7aeebc0f8964b6fd35057449bb25b00c1	2014-03-10 10:50:16 -07:00
Jim Bankoski	622f06eb59	Merge "vp9_reconinter.h static functions in header converted to global"	2014-03-10 07:36:05 -07:00
Jim Bankoski	ffda0cde7b	Merge "vp9_onyxc_int.h static -> static inline in header"	2014-03-10 07:35:54 -07:00
Dmitry Kovalev	0ac2139d02	Merge "Removing vp9_onyx.h and moving its content to the encoder."	2014-03-06 11:49:41 -08:00
James Zern	e7fe1543f6	Merge "vp9_systemdependent: reorder includes avoid proto mismatch"	2014-03-06 11:42:50 -08:00
James Zern	fe49c05214	Merge "vp9_subpixel_8t_intrin_avx2: fix build w/clang 3.4+"	2014-03-06 11:41:44 -08:00
James Zern	caecedc92f	vp9_subpixel_8t_intrin_avx2: fix build w/clang 3.4+ clang reports gcc-4.2.1 in e.g., 3.3, 3.4; add a specific clang version check for _mm256_broadcastsi128_si256 fixes issue #720 Change-Id: I5c8e3c27fdea05d8a5b050e8cb74894b595f4709	2014-03-06 10:55:44 -08:00
Dmitry Kovalev	3f1ab25812	Removing vp9_onyx.h and moving its content to the encoder. Change-Id: I03451c88536bc498edddbe0cd9773ff79da085c2	2014-03-05 23:33:22 -08:00
James Zern	e9680bef22	vp9_systemdependent: reorder includes avoid proto mismatch fixes a warning in vs9/x64 related to ceil() Change-Id: Ic4bde9d0b7e961546dbe304de74aa37fc02fcf94	2014-03-05 22:02:29 -08:00
Dmitry Kovalev	08a7d7e405	Merge "Renaming NMV_UPDATE_PROB to MV_UPDATE_PROB."	2014-03-05 21:39:09 -08:00
Dmitry Kovalev	bb9b6a9568	Merge "Cleaning up vp9_mvref_common.c."	2014-03-05 10:57:37 -08:00
Dmitry Kovalev	791751015f	Merge "Removing VP9_PTR."	2014-03-05 10:57:10 -08:00
Dmitry Kovalev	d31fc628a7	Renaming NMV_UPDATE_PROB to MV_UPDATE_PROB. Change-Id: I7f3bcca103f0b1f6b3c064b61472543de9a8288a	2014-03-05 10:37:52 -08:00
Dmitry Kovalev	fe7b1d0a8d	Removing VP9_PTR. Change-Id: Ib49d8dbc67c590f22a1a70251ff607c9f38febd7	2014-03-03 16:50:16 -08:00
Jim Bankoski	e5e9b05d68	vp9_reconinter.h static functions in header converted to global Change-Id: I916944950deb22f4c2301d83a803b732bf3ecd77	2014-03-03 14:58:43 -08:00
Jim Bankoski	3d12e65483	vp9_onyxc_int.h static -> static inline in header Change-Id: Ib65fb0679156960305b10fbf590254ff6bf1bfe1	2014-03-03 14:50:07 -08:00
James Zern	805078a1bf	build: convert rtcd.sh to perl significantly speeds up file generation. the goal of this change is to convert rtcd.sh to perl as directly as possible to allow for simple comparison. future changes can make it more perl-like. --- Linux [CREATE] vpx_scale_rtcd.h real 0m0.485s -> 0m0.022s [CREATE] vp8_rtcd.h real 0m4.619s -> 0m0.060s [CREATE] vp9_rtcd.h real 0m10.102s -> 0m0.087s Windows [CREATE] vpx_scale_rtcd.h real 0m8.360s -> 0m0.080s [CREATE] vp8_rtcd.h real 1m8.083s -> 0m0.160s [CREATE] vp9_rtcd.h real 2m6.489s -> 0m0.233s Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee	2014-03-03 14:47:11 -08:00
Dmitry Kovalev	be647f7b83	Merge "Adding get_tx_type() instead of get_tx_type_{8x8, 16x16}."	2014-03-03 14:24:28 -08:00
Dmitry Kovalev	594677a76b	Merge "Moving FRAME_CONTEXT & FRAME_COUNTS to vp9_entropymode.h."	2014-03-03 14:24:04 -08:00
Dmitry Kovalev	46af01d719	Adding get_tx_type() instead of get_tx_type_{8x8, 16x16}. Change-Id: I4a54b12e5229705222c5a101258b9d1f81e2948d	2014-03-03 12:20:51 -08:00
Dmitry Kovalev	c288367678	Adding consts and cleaning up vp9_rdopt. Change-Id: I9423b543e1be414e5c9e10480b813f06e6b88f8a	2014-03-03 12:19:51 -08:00
Yunqing Wang	d4648d93f4	Merge "AVX2 SubPixel AVG Variance Optimization"	2014-03-03 09:01:36 -08:00
Yaowu Xu	9650b9d72a	Remove dec_build_inter_predictors() parameters There were two parameters not in use, this commit removed them. Change-Id: Ia03a73b9a2521400bed539df45574e34214ed93a	2014-03-01 11:14:00 -08:00
Yaowu Xu	2f4eb5f096	Remove vp9_create_common() The function has evolved over time, now only calls vp9_rtcd(), so this commit removes the function and changes to call vp9_rtcd() directly. Change-Id: I8cfa6190daa4b28f6f3d1e11bb3a07f9c95322bf	2014-03-01 10:59:24 -08:00
levytamar82	ea14909687	AVX2 SubPixel AVG Variance Optimization Optimizing 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_avg_variance64x64 2. vp9_sub_pixel_avg_variance32x32 both of those function were calling vp9_sub_pixel_avg_variance16xh_ssse3 instead of calling that function, it calls vp9_sub_pixel_avg_variance32xh_avx2 that is written in avx2 and process 32 elements in parallel. This Optimization gave 80% function level gain and 2% user level gain Change-Id: Iea694654e1b7612dc6ed11e2626208c2179502c8	2014-02-28 22:51:04 -07:00
Dmitry Kovalev	d689f2ad33	Cleaning up vp9_mvref_common.c. different_ref_found is always equal to one (if calculated) because ref_frame[0] != ref_frame[1] for each mi-block. Change-Id: Ibd7625b7b29dec2fd3c40edbc3de1169abb78585	2014-02-28 15:12:33 -08:00
Dmitry Kovalev	e68cc30bb5	Moving FRAME_CONTEXT & FRAME_COUNTS to vp9_entropymode.h. Change-Id: I1fe71e35b1e44da693b43d26607abb33efd56820	2014-02-28 13:56:43 -08:00
Dmitry Kovalev	e4159100bc	Merge "Adding get_y_mode() function."	2014-02-28 11:12:22 -08:00
Dmitry Kovalev	28bd1dd15e	Merge "Adding consts to arguments of vp9_block_error()."	2014-02-28 10:51:43 -08:00
Dmitry Kovalev	3a83d08a08	Merge "Moving get_tx_eob() from common to encoder."	2014-02-28 10:49:47 -08:00
hkuang	edcbbf2ee3	Merge "Fix a bug in neon that has not save and restore q4-q7 registers."	2014-02-28 09:48:26 -08:00
Dmitry Kovalev	3b2cd9137a	Moving get_tx_eob() from common to encoder. Change-Id: I7d11c6ae259aff6560710d16fea3032c661e5b02	2014-02-27 18:26:44 -08:00
Dmitry Kovalev	791e9bdac9	Adding consts to arguments of vp9_block_error(). Change-Id: Id145da99259866109cfee8b47a1d8f309944b937	2014-02-27 18:17:08 -08:00
Dmitry Kovalev	1ae91f7784	Adding get_y_mode() function. Change-Id: Iaac57b24f79cd205a8c62bc1177412d22f5787a8	2014-02-27 16:05:50 -08:00
hkuang	f3d8e315ac	Fix a bug in neon that has not save and restore q4-q7 registers. Change-Id: Ie21b5ae89100389b80f919710839084f935a8545	2014-02-27 14:06:52 -08:00
Minghai Shang	3a8deeb8b6	Merge "[svc] Add target bitrate settings for each layers."	2014-02-27 10:51:26 -08:00
Dmitry Kovalev	2c594a5275	Removing vp9_systemdependent.c. Change-Id: I7b9738a7113c0c4687e5d320581ff69d98a8b271	2014-02-26 18:07:23 -08:00
Minghai Shang	8c196b27b3	[svc] Add target bitrate settings for each layers. Change-Id: Ia7677fb436667bc4f76db71f65e4784f433f7826	2014-02-26 13:30:50 -08:00
hkuang	08f250f565	Merge "Fix a bug in intra prediction due to change in 25e55526301eba7d6e5c68e25402e9b2102976d8."	2014-02-26 11:56:45 -08:00
hkuang	1c4e449133	Fix a bug in intra prediction due to change in `25e5552630`. Change-Id: I17ac67c3ced91ad4f057b296f7e8dc86a3389f26	2014-02-25 17:54:33 -08:00
Dmitry Kovalev	7bca32a6a3	Merge "Changing vp9_full_search_sad{, x3, x8} signatures."	2014-02-25 10:51:17 -08:00
Yaowu Xu	05e850cb9e	added clamp of segment loop filter level for ABSDATA mode, so segment loop filter level always fall in valid range for both Absolute and delta modes. Change-Id: If90df3411479533dbdab63f8ae088d2f5dd174a9	2014-02-24 09:56:48 -08:00
Yaowu Xu	bfaf415ea7	Merge "Added clamp of qindex to valid range"	2014-02-24 08:28:07 -08:00
Dmitry Kovalev	2aacc66b66	Merge "Cleaning up vp9_mvref_common.{h, c}."	2014-02-23 08:25:40 -08:00
Yaowu Xu	e22b12e304	Added clamp of qindex to valid range The qindex for a segment was not clamped in ABSDATA mode, which may cause invalid memory access if an ill-formed stream has a negative value in ABSDATA mode. This commit added clamp to make sure qindex for a segment always fall into valid range. Change-Id: I0a74d00f4ef40aec7edaeca1d03c8645e23ab08c	2014-02-22 12:30:18 -08:00
Yaowu Xu	f1633e5844	Merge "Remove an unused variable"	2014-02-21 22:44:05 -08:00
Alex Converse	6e3cf6ec1d	Stop gating non420 features with a configure flag. Change-Id: I8cc38fdef6a2a0968af8dfe15e7c2b3c46c531ea	2014-02-21 12:05:29 -08:00
James Zern	e2f614be53	Merge "vp9_subpixel_8t_intrin_ssse3.c: make some tables static"	2014-02-20 16:02:16 -08:00
James Zern	3240db7407	Merge "vp9_subpixel_8t_intrin_avx2.c: make some tables static"	2014-02-20 16:01:50 -08:00
Yaowu Xu	c58e1c7be9	Remove an unused variable Change-Id: I8eeec70a7d4403243762f14d0b560792801645e8	2014-02-20 14:49:44 -08:00
James Zern	10f2db2b1f	Merge "vp9: normalize DECLARE_ALIGNED use on global tables"	2014-02-19 11:38:47 -08:00
Dmitry Kovalev	d43c5cc5ea	Cleaning up vp9_mvref_common.{h, c}. Hiding vp9_find_mv_refs_idx() inside vp9_mvref_common.c, moving definition of vp9_find_mv_refs() to vp9_mvref_common.c. Change-Id: I0c9f34b03648785a7d18edf6d4fddd34e55dfcc5	2014-02-19 14:23:51 +01:00
Dmitry Kovalev	35bd886864	Merge "Cleaning up pack_inter_mode_mvs() function."	2014-02-19 01:04:36 -08:00
James Zern	b78c219c80	vp9: normalize DECLARE_ALIGNED use on global tables - place extern within the macro - use in the header only Change-Id: I4274b345d8af9ef329c0eb9553a3ddaad70d1d26	2014-02-18 22:57:43 -08:00
James Zern	d73d621e5d	vp9_subpixel_8t_intrin_ssse3.c: make some tables static + fix formatting Change-Id: I344d4de089d03e403f0c7b3e64aeb7086cce86ac	2014-02-18 20:42:00 -08:00
James Zern	a96af49bab	vp9_subpixel_8t_intrin_avx2.c: make some tables static + fix formatting Change-Id: Ia62610bff3d63855104366d7860749b6a3cf4577	2014-02-18 20:40:40 -08:00
James Zern	26c8e720ca	Merge "vp9_filter: move table alignment decl's to header"	2014-02-18 20:15:33 -08:00
Yunqing Wang	0cc71c9c9f	Merge "SSSE3 convolution optimization"	2014-02-18 12:55:34 -08:00
Yunqing Wang	ad8d4454f0	Merge "AVX2 SubPixel Variance Optimization"	2014-02-18 12:18:13 -08:00
Dmitry Kovalev	36420009ea	Changing vp9_full_search_sad{, x3, x8} signatures. Passing block MV pointer instead of block index into vp9_full_search_sad{, x3, x8} functions. Change-Id: Ica07356633471c2c8f81b583a7aeba85a436bafb	2014-02-17 14:24:57 +01:00
James Zern	8092080216	vp9_filter: move table alignment decl's to header avoids mismatched alignment warnings in visual studio builds Change-Id: I2cedb8042fd47e708bde3f7168a6fb4bd9aaa569	2014-02-15 10:18:24 -08:00
James Yu	e486488ce8	Replace vqshrun by vqmovun if shift #0 bit Change-Id: Ifabb8c7ec0c327fea9d6739cab10addb060ff435 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-14 21:03:40 -08:00
Johann	4378503665	Merge "Remove redundant arm neon instructions."	2014-02-14 20:02:51 -08:00
levytamar82	52dac5d1cb	AVX2 SubPixel Variance Optimization Optimizing 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_variance64x64 2. vp9_sub_pixel_variance32x32 both of those function were calling vp9_sub_pixel_variance16xh_ssse3 instead of calling that function, it calls vp9_sub_pixel_variance32xh_avx2 that is written in avx2 and process 32 elements in parallel. This Optimization gave 70% function level gain and 2% user level gain Change-Id: I4f5cb386b346ff6c878a094e1c3b37e418e50bde	2014-02-14 16:59:11 -07:00
Adrian Grange	b7be30eb36	Cleanup some comments. Change-Id: I568861ba1d43620865ad9a98a97eef37a51fd856	2014-02-14 15:05:30 -08:00
Yaowu Xu	ecf392a155	Merge "minor spelling cleanup in comments"	2014-02-14 14:29:35 -08:00
levytamar82	3068d7d944	SSSE3 convolution optimization Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 my optimization include: -processing 2x8 elements in one 128 bit register instead of processing 8 elements in one 128 bit register. -removing unnecessary loads. This optimization gives between 2.4% user level gain for 480p input and 1.6% user level gain for 720p. This Optimization is done only for 64 bit Change-Id: Ic07fce2f9360329b4f2d956efda1480ae958766b	2014-02-14 15:08:42 -07:00
Dmitry Kovalev	19a8eee1f0	Cleaning up pack_inter_mode_mvs() function. Change-Id: I48ad06e3e1ae9720a0683022621f4504e3bebce6	2014-02-13 19:21:10 -08:00
Yaowu Xu	8d646becb6	Merge "Removed the reset of mode_info from previous frame"	2014-02-13 17:03:50 -08:00
Frank Galligan	fb8c246b70	Merge "Add VP9 decoder support for external frame buffers"	2014-02-13 15:29:52 -08:00
Frank Galligan	a4f30a5023	Add VP9 decoder support for external frame buffers Added support for external frame buffers to libvpx's VP9 decoder. If the external frame buffer functions are set then libvpx will call the get function whenever it needs a new frame buffer to decode a frame into. And it will call the release function whenever there are no more references to that buffer. Change-Id: Id2934d005f606af6e052fb6db0d5b7c02f567522	2014-02-13 13:14:19 -08:00
Yaowu Xu	896d79a57e	Removed the reset of mode_info from previous frame Prior to this commit, both encoder and decoder reset mode/mv info from previous frame in error resilient mode to ensure bitstreams are able to decode when there is loss of frame in decoder side. However, this is not necessary. This commit changed to remove the reset, so encoder can continue to use mode/mv/partition information from previously encoded frame without affecting decodability under loss of frame. Change-Id: I0279f862900dc647fb471ae3389770bb1b9f454f	2014-02-13 12:48:08 -08:00
Dmitry Kovalev	df6c523fed	Merge "Renaming skip_coeff to skip for consistency."	2014-02-13 11:04:34 -08:00
Frank Galligan	e5a1b214f7	Merge "Fix neon wide loopfilter for filter8 only branch"	2014-02-13 09:52:48 -08:00
Yunqing Wang	92824a9cbc	Merge "AVX2 Convolve Optimization"	2014-02-13 09:43:55 -08:00
levytamar82	876c72a093	AVX2 Convolve Optimization Two convolve functions were optimized for AVX2: 1. vp9_filter_block1d16_h8 2. vp9_filter_block1d16_v8 vp9_filter_block1d16_v8 was optimized for AVX2 by reducing the number of loop strides by half, two strides were processed in parallel. vp9_filter_block1d16_v8 was also optimized in the same way also some of the loads were being done outside of the loop and by that preventing redundant loads. This Optimization gives 43% function level gain and 1.3% user level gain. Now can be compiled in Windows Change-Id: I2714124cfb0c14a77d7a0ce126a20db92ffbf92c	2014-02-12 20:45:31 -07:00
Frank Galligan	b41acbf9bb	Fix neon wide loopfilter for filter8 only branch The current code removed the check to only perform the filter8. Change-Id: Ie54e19a77745042a5660eab986d9ef1c42e82410	2014-02-12 18:36:17 -08:00
Dmitry Kovalev	004c8c636e	Renaming skip_coeff to skip for consistency. Change-Id: I036e815ca63d00cba71202ae09ba0f6ef745dcb8	2014-02-12 17:44:12 -08:00
Andrew Russell	549c31f8ae	minor spelling cleanup in comments Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06	2014-02-12 16:32:51 -08:00
Dmitry Kovalev	50712fcaa9	Adding consts to mv search function arguments. Change-Id: Ie79114bba4f0cea55d9f701e20d2be2017630f3b	2014-02-12 14:28:23 -08:00
Dmitry Kovalev	0109d757ee	Merge "Removing vp9_foreach_transformed_block_uv() function."	2014-02-12 12:11:14 -08:00
Jingning Han	e8b7610e8f	Use INTER_OFFSET in vp9_pick_inter_mode Cosmetic change to use pre-defined macros. Change-Id: I93e9fa90113d0242599048940b39694660385a6f	2014-02-12 09:14:29 -08:00
James Yu	619f29cdb0	Remove redundant arm neon instructions. Change-Id: I1fabad59747eb5f68c64275a36c3a1d94daf32a3 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-11 21:19:12 -08:00
Dmitry Kovalev	79dd1f8441	Removing vp9_foreach_transformed_block_uv() function. Change-Id: I35ec77b71e6fd686865cead9281e4dd9e9bc9e86	2014-02-11 18:06:00 -08:00
Tom Finegan	c49c75fde0	Merge "vp9/common/x86: Silence MSVC warnings in vp9_asm_stubs.c."	2014-02-11 14:39:27 -08:00
Frank Galligan	d51ca0db00	Merge "Add get release decoder frame buffer functions."	2014-02-11 08:19:37 -08:00
Dmitry Kovalev	803a5c67dd	Merge "Encoder quantization cleanup."	2014-02-10 21:32:04 -08:00
Tom Finegan	60e91a92c3	vp9/common/x86: Silence MSVC warnings in vp9_asm_stubs.c. Update filter_1dfunction definition to match usage. Change-Id: Ie3cae13dc1ec3f5838c5f29d1c76a1a98a9217fa	2014-02-10 15:08:42 -08:00
Frank Galligan	e8e152799b	Add get release decoder frame buffer functions. This CL changes libvpx to call a function when a frame buffer is needed for decode. Libvpx will call a release callback when no other frames reference the frame buffer. This CL adds a default implementation of the frame buffer callbacks. Currently only VP9 is supported. A future CL will add support for applications to supply their own frame buffer callbacks. Change-Id: I1405a320118f1cdd95f80c670d52b085a62cb10d	2014-02-10 14:08:11 -08:00
Jim Bankoski	3c790ec0f8	Convert small static header functions to inline Change-Id: I467b28346a0d8d4d8b96d6c05fc39c34eec26e5c	2014-02-10 07:56:45 -08:00
Jim Bankoski	b5f59ea280	Convert small static functions in header to inline.. Change-Id: Ic4fc01be7738fbabf8c7860dbe3476ab4caf5fc2	2014-02-10 07:56:38 -08:00
Jim Bankoski	7341725e13	Convert small header functions to inline Change-Id: I4e5575f0d7ccfe2361b8cbf78e7dc079272c9f5f	2014-02-10 07:56:29 -08:00
Jim Bankoski	69f58b40e0	Convert header static functions to inline or make them global. Change-Id: Ib26fbfef3505299f754e5af6c437a85d7746fc28	2014-02-10 07:39:12 -08:00
Jim Bankoski	6a9e58cb1d	Converted functions in header to INLINE... Change-Id: I00512c6cef3a4af8df57c7263ceb853fb2db8140	2014-02-09 20:12:04 -08:00
Jim Bankoski	18c8deabbf	Convert functions to inline that are small . Change-Id: I3b160e93d9319c8e1abda2a60f49f89c409d534b	2014-02-09 20:08:58 -08:00
Jim Bankoski	9768d0b184	Convert functions to inline that are in headers static. Change-Id: If1ec3b64be327e8c48ec7efbacde208d2129fdb0	2014-02-09 20:06:35 -08:00
Jim Bankoski	99e4c508b2	Converted function to inline Change-Id: Iaa4880c8a207cfea509608e1ef4593794b6b31f2	2014-02-09 20:04:54 -08:00
Jim Bankoski	3a3aa3f4e3	Converted short static functions to inline. Change-Id: I859719d41ced2e35d2765b636e627bb7edc3651e	2014-02-09 19:58:54 -08:00
Tom Finegan	bf79a4da77	vp9/common: Silence MSVC warning in vp9_convolve.c. Added cast to int to silence MSVC warning. Change-Id: I9ef4709d2e4cf0db070d9e52385c1b3f138b00a5	2014-02-07 10:13:57 -08:00
Dmitry Kovalev	005fc6970b	Finally removing "short" from transform names. Change-Id: I5259b68dc1bcceb153e3ffe638a79a59a3019e9d	2014-02-06 11:54:15 -08:00
Marco Paniconi	4864ab21b0	Layer based rate control for CBR mode. This patch adds a buffer-based rate control for temporal layers, under CBR mode. Added vpx_temporal_scalable_patterns.c encoder for testing temporal layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns). Updated datarate unittest with tests for temporal layer rate-targeting. Change-Id: I8900a854288b9354d9c697cfeb0243a9fd6790b1	2014-02-06 09:24:45 -08:00
Dmitry Kovalev	f32fa45cba	Merge "Cleaning up vp9_get_pred_context_single_ref_p1()."	2014-02-05 18:38:38 -08:00
Dmitry Kovalev	4a1a7919da	Merge "Removing "_1d" suffix from mips transform code."	2014-02-05 18:37:49 -08:00
Yunqing Wang	7ad56bf3c9	Merge "Optimize bilinear sub-pixel filters in ssse3"	2014-02-05 17:20:52 -08:00
Dmitry Kovalev	724fefb4cf	Cleaning up vp9_get_pred_context_single_ref_p1(). Change-Id: I279343b474d7ff41afcf8f1493b6fbf716b51823	2014-02-05 11:48:01 -08:00
Dmitry Kovalev	a536237228	Merge "Cleaning up vp9_get_pred_context_single_ref_p2()."	2014-02-05 11:37:17 -08:00
Martin Storsjo	03bc491721	arm: Consistently use braces around doubleword arguments to vld This isn't strictly necessary, but makes the file more consistent with the other arm assembly source files. Change-Id: I245c9677d89e0ab3f31991e473764858af35b180	2014-02-05 13:24:25 +02:00
Martin Storsjo	c2bb1aa544	arm: Use {} around quadword arguments to vld This fixes building for iOS. Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb	2014-02-05 13:24:17 +02:00
James Zern	d89f861f4b	vp9_systemdependent.h: relocate system includes avoid wrapping msvc includes with extern "C"; this breaks some visual studio builds of the (c++) tests. Change-Id: Ie8062d55d4f4c049f6cd360a36da6a67607df132	2014-02-04 18:28:45 -08:00
Dmitry Kovalev	c31cf0d647	Merge "Moving x1 & y1 calculation under if condition."	2014-02-04 14:50:25 -08:00
hkuang	b0fec6ab4a	With on demand border extension, clamping the MV is no longer needed. Change-Id: I40c37ef18c67ab27fc336694dfca3c43a87c47ca	2014-02-04 13:57:40 -08:00
Yunqing Wang	d1961e6fbf	Optimize bilinear sub-pixel filters in ssse3 This patch added ssse3 optimization of bilinear sub-pixel filters. The real time encoder was speeded up by ~1%. Change-Id: Ie82e98976f411183cb8c61ab8d2ba0276e55a338	2014-02-04 08:01:55 -08:00
James Zern	2b7338aca4	Merge "vp9_filter.h: rename interp_kernel type"	2014-02-03 23:12:28 -08:00
Dmitry Kovalev	5daaff527e	Moving x1 & y1 calculation under if condition. Change-Id: Iae787d491f7cfe24855ef8f2d04e2c6c19350378	2014-02-03 18:03:17 -08:00
Dmitry Kovalev	64cca45c1d	Cleaning up vp9_get_pred_context_single_ref_p2(). Change-Id: I294075acd3073c41e153079ff4462816898b3778	2014-02-03 17:46:34 -08:00
James Zern	cca4276dac	vp9_filter.h: rename interp_kernel type -> InterpKernel avoids conflicts in variable names, fixing the build with various toolchains. broken since: `8691565` Removing subpix_fn_table struct. Change-Id: Ib5f6fdbcb494a97b62c75b99d4d826ff25d4c981	2014-02-03 16:48:38 -08:00
Alex Converse	be1b41673f	Merge "INLINE and reimplement get_unsigned_bits()."	2014-02-03 16:26:33 -08:00
Dmitry Kovalev	220b8f8644	Encoder quantization cleanup. Change-Id: I633205c95f0e81ce0589580501d0be4425a3cb8e	2014-02-03 14:57:28 -08:00
Dmitry Kovalev	282f36adc4	Merge "Removing "_short" suffix from arm transform file names."	2014-02-03 14:28:47 -08:00
Alex Converse	ffd3d4834b	INLINE and reimplement get_unsigned_bits(). The new implementation disagrees when the argument is equal to 2**n but that is never called in practice and based on how it is used the new implementation is correct in that case. Change-Id: Ifbac4ad87d459fe6bd2fd0f400c0340f96617342	2014-02-03 12:16:22 -08:00
Yunqing Wang	2488cb34bc	Optimize bilinear sub-pixel filters in sse2 Using bilinear filters could speed up the codec in real-time mode. This patch added sse2 optimizations of bilinear filters that operate on different-sized blocks. Tests showed that the real-time encoder was speeded up by 3%. Change-Id: If99a7ee4385fcc225c3ee7445d962d5752e57c3f	2014-02-03 10:34:45 -08:00
Marco Paniconi	6be2b750b8	Layer based rate control for CBR mode. This patch adds a buffer-based rate control for temporal layers, under CBR mode. Added vpx_temporal_scalable_patterns.c encoder for testing temporal layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns). Updated datarate unittest with tests for temporal layer rate-targeting. Change-Id: I9cb6cce2494390ae6096ee17774af7fb9308bde7	2014-02-02 14:30:43 -08:00
Jim Bankoski	9dec7712ab	static function convert to inline or global vp9_blockd.h Change-Id: Ifdd951f24932839f06d1c700371662511dde6ebe	2014-01-31 19:50:40 -08:00
Yunqing Wang	7c6a49bada	Merge "Rename a loopfilter parameter"	2014-01-31 18:33:33 -08:00
Dmitry Kovalev	c2ca97caaf	Merge "Cleaning up motion compensation code."	2014-01-31 17:33:40 -08:00
Dmitry Kovalev	c49b08c9a1	Removing "_short" suffix from arm transform file names. Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507	2014-01-31 17:19:02 -08:00
Dmitry Kovalev	6e4a03e844	Removing "_1d" suffix from mips transform code. Unifying transform function names across libvpx, 1d is a redundant suffix. Change-Id: I077c19f3bc7d4842ed7ca5814d77b3dce1728e13	2014-01-31 17:05:03 -08:00
Yunqing Wang	11a9366e3b	Rename a loopfilter parameter As pointed out by Dmitry and James, "partial" is a Microsoft-specific C++ keyword, and it is renamed. Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8	2014-01-31 16:30:04 -08:00
Dmitry Kovalev	88340b173b	Merge "Combining fb_idx_ref_cnt[] and yv12_fb[] arrays."	2014-01-31 15:55:04 -08:00
Dmitry Kovalev	a8a2f22958	Merge "Renaming "mbskip" to "skip"."	2014-01-31 15:52:35 -08:00
Yunqing Wang	903801f1ef	vp9 decoder: row-based multi-threaded loopfilter Implemented parallel loopfiltering, which uses existing tile-decoding threads. Each thread works on one row, and when that row is loopfiltered, it moves to next unattended row. To ensure the correct filtering order, threads are synchronized and one superblock is filtered only if the superblocks it depends on are filtered already. To reduce synchronization overhead and speed up the decoder, we use nsync > 1 for high resolution. Performance tests: 1. on desktop: 8-tile 4k video using 8 threads, speedup: 70% - 80%; 4-tile HD video using 4 threads, speedup: ~35% 2. on mobile device (Nexus 7): 4-tile 1080p video using 4 threads, speedup: 18% - 25%; 4-tile 1080p video using 2 threads, speedup: 10% - 15% Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a	2014-01-31 14:44:53 -08:00
Yaowu Xu	96dc80da61	Merge "create super fast rtc mode"	2014-01-29 16:36:20 -08:00
Dmitry Kovalev	b107f2c470	Renaming "mbskip" to "skip". Change-Id: I27a30b43eae026a77f92958e2238d02d9cdf7832	2014-01-29 14:48:42 -08:00
Dmitry Kovalev	5670f1e2a8	Merge "Finally removing vp9_setup_interp_filters() function."	2014-01-29 12:54:21 -08:00
Dmitry Kovalev	6332063475	Combining fb_idx_ref_cnt[] and yv12_fb[] arrays. Adding new RefCntBuffer struct which contains reference counter and image buffer. Change-Id: I71c1f532faa13442c32c43fc03ec45b6f88fb844	2014-01-29 12:48:01 -08:00
Dmitry Kovalev	b00eb5c464	Finally removing vp9_setup_interp_filters() function. Change-Id: If446225afbb49f6033c2a4516a37c377de6f70f7	2014-01-29 11:29:34 -08:00
Jim Bankoski	ea8aaf15b5	create super fast rtc mode This patch only works if the video's width and height are both multiples of 32. It sets every partition to 16x16, and does INTRADC only on the first frame and ZEROMV on every other frame. It always does the largest possible transform, and loop filter level is set to 4. Was ~20% faster than speed -5 of vp8. Now 20% slower but adds motion search (every block), nearest, near and zeromv. The SVC test was changed because, while this realtime mode produces bad quality albeit quickly, it isn't obeying all the rules it should about which frames are available. Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7	2014-01-29 08:39:39 -08:00
Yunqing Wang	3c29cbffbf	Add macros for convolve functions Added macros to reduce the code duplication. Change-Id: I1916aa5a386ea07d961d4ec439ab09bb8c45487d	2014-01-28 18:40:23 -08:00
Dmitry Kovalev	b098c04290	Merge "Decoupling set_ref_ptrs() and vp9_setup_interp_filters()."	2014-01-28 10:37:58 -08:00
Dmitry Kovalev	4ce35d8f2d	Merge "Removing _1d suffix from transform names."	2014-01-28 10:37:26 -08:00
hkuang	af87148a22	Merge "Add vp9_tm_predictor_32x32 neon implementation which is 7.8 times faster than C."	2014-01-28 09:57:08 -08:00
Dmitry Kovalev	ff41764920	Removing _1d suffix from transform names. It is enough to specify (e.g.) idct16, it is obviously different from idct16x16. Change-Id: I6b408a37a945de3162429380b59a775b03b95db0	2014-01-27 16:15:36 -08:00
hkuang	770454f3a8	Add vp9_tm_predictor_32x32 neon implementation which is 7.8 times faster than C. Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321	2014-01-27 16:01:07 -08:00
Dmitry Kovalev	e5b31a1d8c	Decoupling set_ref_ptrs() and vp9_setup_interp_filters(). Change-Id: I8d17867a4772554cbba2bd113cc5b4c99d50146d	2014-01-27 16:00:20 -08:00
Dmitry Kovalev	b2f0ae65c7	Merge "Removing subpix_fn_table struct."	2014-01-27 10:42:42 -08:00
hkuang	05d2081d38	Fix the vp9_tm_predictor_8x8_neon. Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9	2014-01-27 10:17:20 -08:00
Dmitry Kovalev	8691565441	Removing subpix_fn_table struct. We don't use different filter kernels for x and y, it is always one kernel for both directions. Change-Id: Iefcbb02ec74bf46ea20d9dca672a3efd5d631517	2014-01-24 17:06:26 -08:00
Dmitry Kovalev	f9f936b82f	Merge "Renaming INTERPOLATION_TYPE to INTERP_FILTER."	2014-01-24 16:52:10 -08:00
Frank Galligan	183361dadb	Merge "Optimize vp9_tm_predictor_8x8_neon function"	2014-01-24 16:21:56 -08:00
Dmitry Kovalev	4264c93844	Renaming INTERPOLATION_TYPE to INTERP_FILTER. Corresponding renames: subpel_kernel => interp_kernel vp9_get_filter_kernel() => vp9_get_interp_kernel() pred_filter_type => pred_interp_filter adaptive_pred_filter_type => adaptive_pred_interp_filter mcomp_filter_type => interp_filter read_interp_filter_type() => read_interp_filter() write_interp_filter_type() => write_interp_filter() fix_mcomp_filter_type() => fix_interp_filter() Change-Id: I1fa61fa1dc81ebbf043457c3ee2d8d4515bee6d3	2014-01-24 15:57:28 -08:00
Dmitry Kovalev	03eb63c114	Merge "Removing MODE_STATS."	2014-01-24 15:53:12 -08:00
Frank Galligan	c6d537155c	Merge "Revert external frame buffer code."	2014-01-24 11:31:23 -08:00
Frank Galligan	56a8a0b54b	Optimize vp9_tm_predictor_8x8_neon function Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5	2014-01-24 11:07:14 -08:00
hkuang	92ab96a7ae	Merge "Add vp9_tm_predictor_16x16 neon implementation which is 3.5 times faster than C."	2014-01-24 10:48:44 -08:00
James Zern	26c88ec14e	Merge changes I826655a7,I5164df72,Iba9b198c,Ide9a6846,I4f51ce85,I0e6aa00f,Ic334da9a,I252f5f8a,I7865db2d,I13b434b1 * changes: test/: remove unnecessary extern "C"s top-level: add extern "C" to headers vpx_ports: add extern "C" to headers vpx: add extern "C" to headers vp9/encoder: add extern "C" to headers vp9/decoder: add extern "C" to headers vp9/common: add extern "C" to headers vp8/encoder: add extern "C" to headers vp8/decoder: add extern "C" to headers vp8/common: add extern "C" to headers	2014-01-24 10:47:00 -08:00
hkuang	3633ffcbf7	Add vp9_tm_predictor_16x16 neon implementation which is 3.5 times faster than C. Change-Id: I24439ba7a2971829c11620f34848facf2c916678	2014-01-24 10:22:58 -08:00
Frank Galligan	b1c72b633e	Revert external frame buffer code. A future CL will add external frame buffers differently. Squash commit of four revert commits: Revert "Increase required number of external frame buffers" This reverts commit `9e41d569d7`. Revert "Add external constants." This reverts commit `bbf53047b0`. Revert "Add frame buffer lru cache." This reverts commit `fbada948fa`. Conflicts: vpxdec.c Change-Id: I76fe42419923a6ea6c75d9997cbbf941d73d3005 Revert "Add support to pass in external frame buffers." This reverts commit `10f891696b`. Conflicts: test/external_frame_buffer_test.cc vp9/common/vp9_alloccommon.c vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/encoder/vp9_onyx_if.c vp9/vp9_dx_iface.c vpx/vpx_decoder.h vpx/vpx_external_frame_buffer.h vpx_scale/generic/yv12config.c vpxdec.c Change-Id: I7434cf590f1c852b38569980e4247fad0d939c2e	2014-01-24 10:10:20 -08:00
Adrian Grange	8b0537f631	Merge changes I24ad1f0f,I33be1366 * changes: Reorder functions to avoid forward declaration Rename set_scale_factors as set_ref_ptrs	2014-01-24 08:38:52 -08:00
Dmitry Kovalev	6c98df29e4	Cleaning up motion compensation code. Change-Id: I74cf028e8c732cd0dbc070326152d3085b824a80	2014-01-23 17:15:30 -08:00
James Zern	0940c9cfde	vp9/common: add extern "C" to headers Change-Id: Ic334da9aee968e33762c2b25d9fbad24c844b411	2014-01-23 16:21:24 -08:00
Dmitry Kovalev	5f75fda9e9	Merge "Cleaning up vp9_refining_search_sad() function."	2014-01-22 17:15:22 -08:00
hkuang	97826df96b	Add tm_predictor_8x8 neon implementation. Change-Id: I76c2720546b737cb63018a8ab6a3ff62a291786d	2014-01-22 13:43:20 -08:00
Adrian Grange	e37eb0ade7	Rename set_scale_factors as set_ref_ptrs New name better describes what the function does. Change-Id: I33be1366a81f058a9854b804bcde211061187dc7	2014-01-22 13:04:30 -08:00
Johann	4e9dc6d45d	Merge "Match vp9_coefband_trans_* declarations"	2014-01-22 11:10:51 -08:00
Johann	6c492fc2f9	Match vp9_coefband_trans_* declarations VS2013 Chromium builds failed with: warning C4742: 'vp9_coefband_trans_8x8plus' has different alignment in https://code.google.com/p/chromium/issues/detail?id=336620 Change-Id: I865f72bc23ae958531eeb5f497002c12e9a36fcd	2014-01-21 17:07:23 -08:00
hkuang	437004c710	Separate the border size for encoder and decoder. Encoder's border is still 160, while decoder's border will be 32. With on-demand border extension and a separate border buffer, the decoder's border does not need to be 160 anymore. Change-Id: I93d5aaff15a33a2213e9761eaa37c5f2870747db	2014-01-21 15:28:41 -08:00
Dmitry Kovalev	a001016996	Removing MODE_STATS. Change-Id: I7520e1cc82b749187c9445356dd7b54f3f3826cc	2014-01-17 17:30:22 -08:00
Jingning Han	b461c0884e	Deprecate best_mv from encoder This commit deprecates the use of best_mv from encoding and bit-stream writing stages. It hence removes the definition from MACROBLOCKD. Change-Id: I8e5302775a2aa4a18900726df407bff881f2dfb1	2014-01-17 17:15:34 -08:00
hkuang	671df8486d	Merge "Use a temp buffer for reconstruction when reference buffer is out of border."	2014-01-17 16:17:36 -08:00
hkuang	7459fee8c6	Use a temp buffer for reconstruction when reference buffer is out of border. Change-Id: Ic7ad136e54a4d68abe0fd4345146a86b0ba824e1	2014-01-17 16:15:54 -08:00
Dmitry Kovalev	d8bfe9e24c	Cleaning up vp9_refining_search_sad() function. Change-Id: I660b53da8ebf3049832ce8a10721051c4e0ebb00	2014-01-17 15:20:28 -08:00
Dmitry Kovalev	ac40c87f68	Removing unused vp9_yv12_copy_partial_frame() function. Change-Id: I3149e562fe9500914f67b6f908283edcdc381ac6	2014-01-16 18:16:34 -08:00
Yunqing Wang	d2bb0c51d3	Revert "Revert "Revert "SSSE3 convolution optimization""" This reverts commit `f9404f2406`. This patch caused some ASAN error. Change-Id: If15b7e581310e19061d111c69f2931809662ed19	2014-01-16 16:11:46 -08:00
hkuang	2a2d8c140f	Merge "Add vp9_tm_predictor_4x4 neon implementation"	2014-01-16 10:18:12 -08:00
Dmitry Kovalev	67e4ca2a1a	Merge "Cleaning up postproc code."	2014-01-15 16:23:54 -08:00
Yaowu Xu	056db03d17	Merge "Revert "Revert "SSSE3 convolution optimization"""	2014-01-15 15:03:25 -08:00
Deb Mukherjee	8ce5f68fe4	Merge "Rearranges the END_USAGE typedef"	2014-01-15 14:01:30 -08:00
hkuang	f2ef389256	Add vp9_tm_predictor_4x4 neon implementation Change-Id: I10c423bde7ea5a3bac9f14f35c73b6bc31c8f3e3	2014-01-15 11:51:36 -08:00
Deb Mukherjee	f32106951a	Rearranges the END_USAGE typedef Rearranges the END_USAGE typedef to make it compatible with the vpx user input. Change-Id: Ic9fa9e9edbee7c0ad01e12e685b219582fcecd16	2014-01-15 10:10:23 -08:00
Adrian Grange	c3011e6f90	Delete outdated comment & tidy-up others Change-Id: I83031180723ee59270ec8fb66b2f73c0796bee25	2014-01-15 09:53:03 -08:00
Dmitry Kovalev	a540f8a0b0	Cleaning up postproc code. Change-Id: I7e53f6345a4cf89309262f50850c9ad08ed3c527	2014-01-14 15:49:19 -08:00
Yunqing Wang	f9404f2406	Revert "Revert "SSSE3 convolution optimization"" This reverts commit `b645257121`. Change-Id: I60d1bf57ae8e9eb6127f42f2d5a780124ac51b45	2014-01-13 12:29:55 -08:00
James Zern	f83c12b540	Merge "cosmetics: vp9_reconinter.h: make some variables const"	2014-01-11 12:39:32 -08:00
Dmitry Kovalev	96be0a50ab	Removing mi_height_log2_lookup table. Change-Id: I1f0ae2edc3a96b33c0494d165ae756a8feba6184	2014-01-10 13:29:47 -08:00
Paul Wilkins	b645257121	Revert "SSSE3 convolution optimization" This reverts commit `511d218c60`. In current form intrinsics break borg build. Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9	2014-01-10 13:38:26 +00:00
Jingning Han	a4c94a94cc	Merge "Optimize inv 16x16 DCT with 10 non-zero coeffs - P2"	2014-01-09 18:17:25 -08:00
Jingning Han	faa2ba86cc	Merge "Optimize inv 16x16 DCT with 10 non-zero coeffs - P1"	2014-01-09 18:17:12 -08:00
Dmitry Kovalev	c8e8d3a461	Merge "Renaming 'Sharpness' to 'sharpness'."	2014-01-09 13:42:55 -08:00
Jingning Han	af31b27aae	Optimize inv 16x16 DCT with 10 non-zero coeffs - P2 This commit further optimizes SSE2 operations in the second 1-D inverse 16x16 DCT, with (<10) non-zero coefficients. The average runtime of this module goes down from 779 cycles -> 725 cycles. Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f	2014-01-09 12:46:09 -08:00
Yunqing Wang	f3b9b97c0e	Merge "SSSE3 convolution optimization"	2014-01-09 12:39:47 -08:00
levytamar82	511d218c60	SSSE3 convolution optimization Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 The optimizations include: - processing 2x8 elements in one 128 bit register instead of processing 8 elements in one 128 bit register. - removing unnecessary loads. This optimization gives a 2.4% user level gain for 480p input and a 1.6% user level gain for 720p. This optimization is done only for 64-bit. Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969	2014-01-09 12:27:51 -07:00
Dmitry Kovalev	4fbe54d201	Merge "Renaming 'Mode' to 'mode'."	2014-01-08 16:29:29 -08:00
Jingning Han	ba6ab46cdc	Optimize inv 16x16 DCT with 10 non-zero coeffs - P1 This commit is the first patch optimizing SSE2 implementation of inverse 16x16 DCT with <10 non-zero coefficients. It focused on the first 1-D (row) transformation. It exploits the fact that only top-left 4x4 block contains non-zero coefficients, in a 2-D inverse 16x16 DCT with <10 coefficients. The average runtime of idct16x16_10 unit is reduced from 883 cycles -> 779 cycles (12% faster). For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes down from 310651 ms -> 305910 ms. The decoding speed goes up from 80.37 fps -> 80.87 fps. Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645	2014-01-08 15:36:45 -08:00
Alex Converse	8fcb74e6bb	Merge "Add a C fallback for get_msb() and change inline to INLINE."	2014-01-08 14:43:46 -08:00
hkuang	5be0ed30dc	Merge "Add initial intra frame neon optimization. 1~2% gain."	2014-01-08 14:41:43 -08:00
Dmitry Kovalev	962c8b241e	Renaming 'Mode' to 'mode'. Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362	2014-01-08 14:33:59 -08:00
Dmitry Kovalev	57be81369a	Renaming 'Sharpness' to 'sharpness'. Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4	2014-01-08 14:19:14 -08:00
Alex Converse	ce7ff3b63d	Add a C fallback for get_msb() and change inline to INLINE. For systems without __builtin_clz() or _BitScanReverse(), taken from libwebp. Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e	2014-01-08 12:25:47 -08:00
hkuang	691111aacf	Add initial intra frame neon optimization. 1~2% gain. More intra optimizations will be added. Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8	2014-01-08 11:58:42 -08:00
Yunqing Wang	a84029ad9c	Merge "AVX2 Variance Optimization"	2014-01-08 11:33:42 -08:00
levytamar82	357b65369f	AVX2 Variance Optimization Optimizing the variance functions vp9_variance16x16, vp9_variance32x32, vp9_variance64x64, vp9_variance32x16, vp9_variance64x32 and vp9_mse16x16 by migrating to AVX2. Some of the functions were optimized by processing 32 elements instead of 16. Some of the functions were optimized by processing 2 loop strides of 16 elements in a single 256 bit register. This optimization gives between 2.4% - 2.7% user level performance gain and 42% function level gain. Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d	2014-01-08 12:05:53 -07:00
Alex Converse	f2ca665f1c	Replace RD modeling with a fixed point approximation. Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec	2014-01-08 10:37:24 -08:00
Dmitry Kovalev	bbb25e6a39	Merge "Adding RefBuffer struct."	2014-01-06 14:19:44 -08:00
Jingning Han	b49e9fb433	Merge "Tune IDCT8_1D macro function interface"	2014-01-06 09:38:19 -08:00
Dmitry Kovalev	0c5575fe57	Merge "Moving hev mask calculation into filter4() function."	2014-01-03 15:56:16 -08:00
Jingning Han	3e0c62b53f	Tune IDCT8_1D macro function interface This commit adds input/output ports for IDCT8_1D macro function to provide more flexibility in variable use. It allows skipping several buffer swap operations. Change-Id: I21f3450509537322293043b3281bfd3949868677	2014-01-03 15:23:47 -08:00
Dmitry Kovalev	ba41e9d459	Adding RefBuffer struct. Adding RefBuffer to simplify reference buffer management. The struct has a pointer to image data and scale factors relative to the current frame. Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e	2014-01-03 15:21:55 -08:00
Jingning Han	0b1a27135a	Reduce num of buffer swap calls in idct8_1d_sse2 This commit merges the initial buffer swap operations in idct8_1d_sse2 into the array transpose step, hence reducing number of instructions therein. Change-Id: I219f6f50813390d2ec3ee37eecf2a4a2b44ae479	2014-01-03 12:12:03 -08:00
Jingning Han	1bb11781e2	Rework idct8x8_10 SSE2 implementation This commit optimizes the SSE2 implementation of idct8x8_10. It exploits the fact that only top-left 4x4 block contains non-zero coefficients, and hence reduces the instructions needed. The runtime of idct8x8_10_sse2 goes down from 216 to 198 CPU cycles, estimated by averaging over 100000 runs. For pedestrian_area_1080p 300 frames coded at 4000kbps, the average decoding speed goes up from 79.3 fps to 79.7 fps. Change-Id: I6d277bbaa3ec9e1562667906975bae06904cb180	2014-01-03 12:04:09 -08:00
Yaowu Xu	8458c8c450	Merge "Fix show existing frame"	2014-01-02 09:27:28 -08:00
Dmitry Kovalev	f3beca079c	Merge "Calculating has_second_ref only once for single_ref context."	2013-12-26 13:41:02 -08:00
Dmitry Kovalev	1e8b5bf4ac	Merge "Removing vp9_findnearmv.{h, c} files."	2013-12-26 13:38:38 -08:00
James Zern	44963dfd37	cosmetics: vp9_reconinter.h: make some variables const Change-Id: If5cd0a1487e97c8e9d13dc2e078c6dceaf79de4f	2013-12-26 14:02:46 -05:00
Dmitry Kovalev	87440aeb82	Moving MAX_PROB constant to vp9_prob.h. Change-Id: I07470ad1b7a0344d088911428ffab8ba9a0d8708	2013-12-20 15:56:59 -08:00
Dmitry Kovalev	b3b9f4a4d0	Merge "Using single struct to represent scale factors."	2013-12-20 11:22:02 -08:00
Yunqing Wang	b6a0ac11f0	Merge "Code clean up"	2013-12-20 08:46:11 -08:00
Dmitry Kovalev	987810ad95	Removing vp9_findnearmv.{h, c} files. Moving all code from those files to vp9_mvref_common.{h, c}. Change-Id: Ibc4afcb8cea6847166ff411130e93611ebe63b20	2013-12-19 17:39:57 -08:00
Dmitry Kovalev	a3fbcc88bb	Using single struct to represent scale factors. Moving back to scale_factors struct. We no longer need x_offset_q4 and y_offset_q4 because both values are calculated locally inside vp9_scale_mv function. Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b	2013-12-19 16:06:33 -08:00
Dmitry Kovalev	c872d2be65	Call set_scaled_offsets() just before scale_mv() call. Before mv scaling it is required to calculate x_offset_q4/y_offset_q4 by calling set_scaled_offsets(). Now offset configuration can not be missed because it happens just before scale_mv(). Change-Id: I7dd1a85b85811a6cc67c46c9b01e6ccbbb06ce3a	2013-12-19 14:55:13 -08:00
Yunqing Wang	09faf55916	Code clean up Removed unused filter coefficients. Change-Id: Ib395a51305e23ff41ab69c1808d56946d25961cd	2013-12-19 11:09:23 -08:00
Dmitry Kovalev	c67ee5ea24	Merge "Converting vp9_treecoder.h to vp9_prob.{h, c}"	2013-12-19 11:03:30 -08:00
Marco Paniconi	02d5ebcfdc	Merge "Updates for 1-pass CBR rate control."	2013-12-18 10:28:33 -08:00
Marco Paniconi	1b8b8b0d0d	Updates for 1-pass CBR rate control. Adjustments based on buffer level, frame dropper. Change-Id: Iaa85b570493526a60c4b9fb7ded4c0226b1b3a33	2013-12-18 09:24:24 -08:00
Jim Bankoski	9d754dcca8	Merge "rename loop filter functions"	2013-12-17 18:56:09 -08:00
Jim Bankoski	b720ba165f	rename loop filter functions This renames all the loop filter functions so that they no longer refer to mb Change-Id: I8a58a8c7fd253d835cb619bde13913e896ece90b	2013-12-17 17:34:34 -08:00
Dmitry Kovalev	118c8fb3fb	Calculating has_second_ref only once for single_ref context. Change-Id: Ib1253e0606426850f53060a4c5303af86bf1c093	2013-12-17 17:02:24 -08:00
Dmitry Kovalev	c6a1ff223b	Merge "Calling is_inter_block() only if mbmi is available."	2013-12-17 16:10:56 -08:00
Dmitry Kovalev	4821084b3f	Moving hev mask calculation into filter4() function. Change-Id: Ieccf2070b2b01b4135f4c5f9857667eb7825c761	2013-12-17 15:23:23 -08:00
Dmitry Kovalev	eb0c73b6e0	Merge "Converting mode_lf_lut struct member into static lookup table."	2013-12-17 15:20:05 -08:00
James Zern	bd9a388a06	vp9: normalize include guards Change-Id: If4ddbdcfb3ab387cbca6910b42cf4df8111e6879	2013-12-16 19:40:49 -08:00
Yaowu Xu	3cce464342	Define POSITION to differentiate from MV MV struct was used to indicate the position of a MI_BLOCK with row and col components. The expression was confusing; this commit adds a new structure "POSITION" with row and col components to better describe the position of a mi_block. Change-Id: I59fdd4b45010fe7d85a8db22a55503265c4f5b2b	2013-12-16 17:28:00 -08:00
Yaowu Xu	50ec6311e6	Move two functions to encoder As they are used by encoder only. Change-Id: I7b1e6955b218aba66fe156523521a8121c9a84a4	2013-12-16 17:27:48 -08:00
Dmitry Kovalev	bb7b4bad6d	Merge "Getting rid of b_{width, height}_log2 calls in non-420 loop filter."	2013-12-16 15:10:25 -08:00
Dmitry Kovalev	865d5b83f2	Calling is_inter_block() only if mbmi is available. Modifying vp9_get_intra_inter_context(), vp9_get_reference_mode_context(), vp9_get_pred_context_single_ref_p1(), vp9_get_pred_context_single_ref_p2() functions. Change-Id: Ifaa2c3eb0c76a544ae8bd1fe3155aada266eae78	2013-12-16 15:09:33 -08:00
hkuang	fb53409d2a	Merge "Remove border extension in intra frame prediction."	2013-12-16 14:48:54 -08:00
Dmitry Kovalev	b1d821704b	Merge "Yet another vp9_pred_common.c cleanup."	2013-12-16 14:10:52 -08:00
hkuang	25e5552630	Remove border extension in intra frame prediction. Change-Id: Id677df4d3dbbed6fdf7319ca6464f19cf32c8176	2013-12-16 14:05:58 -08:00
Dmitry Kovalev	b5c9261832	Converting vp9_treecoder.h to vp9_prob.{h, c} Moving vp9_norm probability table from vp9_entropy.c to vp9_prob.c Change-Id: Ie757b73860c6f43130790c332b292e2a1a81b788	2013-12-16 12:53:09 -08:00
Frank Galligan	fbada948fa	Add frame buffer lru cache. Add an option for libvpx to return the least recently used frame buffer. Change-Id: I886a96ffb94984f1c42de53086e0131922df3260	2013-12-15 19:57:42 -08:00
Frank Galligan	d0ee1fd797	Merge "Add support to pass in external frame buffers."	2013-12-15 19:18:25 -08:00
Frank Galligan	10f891696b	Add support to pass in external frame buffers. VP9 decoder can now use frame buffers passed in by the application. Change-Id: I599527ec85c577f3f5552831d79a693884fafb73	2013-12-15 18:45:46 -08:00
Dmitry Kovalev	4d2d1591a3	Converting mode_lf_lut struct member into static lookup table. Change-Id: I6e6c7cb5ff5b60fbe6a7c314daec5ccdc2cafcc3	2013-12-14 17:42:12 -08:00
Dmitry Kovalev	2aadc06e0d	Yet another vp9_pred_common.c cleanup. Change-Id: I617d6c610d181076773c5c3d6f3dbc6717b02580	2013-12-14 17:39:24 -08:00
Dmitry Kovalev	64cf398713	Merge "Using MV struct instead of int_mv union in encoder."	2013-12-13 16:42:54 -08:00
Dmitry Kovalev	33df4f0483	Merge "vp9_convolve.c cleanup."	2013-12-13 15:40:00 -08:00
Dmitry Kovalev	f54b515797	Merge "Cleaning up vp9_append_sub8x8_mvs_for_idx()."	2013-12-13 15:38:53 -08:00
Dmitry Kovalev	25da21b14e	Using MV struct instead of int_mv union in encoder. Change-Id: I8b81a3e4b4fa530a654c28d9c136afa0c1d379fd	2013-12-13 15:24:48 -08:00
Dmitry Kovalev	466cc94e7a	Getting rid of b_{width, height}_log2 calls in non-420 loop filter. Using num_{4x4, 8x8}_blocks_{wide, high}_lookup instead. Change-Id: I66a7ab807fa57395253b2d0e636c2479fa8c4adf	2013-12-13 12:53:41 -08:00
James Zern	178db94cd6	vp9 asserts: fix compile warning string literal to int within an assert Change-Id: I0c889256b67a078e6e2a79577f0b7ae084243258	2013-12-12 19:49:19 -08:00
Dmitry Kovalev	629fb85f17	vp9_convolve.c cleanup. Making overall logic more clear, moving "hacked" calculation of base filter array pointer to get_filter_base() function. Change-Id: Ibbd38a9f937e48d35bbbfef3ad933ab36664cccb	2013-12-12 11:14:06 -08:00
Deb Mukherjee	7edd5170b5	Merge "Changes interfaces to vp9_get_compressed_data fn"	2013-12-11 15:50:40 -08:00
Dmitry Kovalev	e79103166f	Merge "Renames for consistency in vp9_pred_common.{c, h} files."	2013-12-11 14:30:44 -08:00
Deb Mukherjee	e33855cc47	Changes interfaces to vp9_get_compressed_data fn Silences some lint warnings in previous patches Change-Id: I04bf47ebe7e63a95fd322719a3154e589c115d78	2013-12-11 14:22:51 -08:00
hkuang	9460226acd	Merge "Fix valgrind error."	2013-12-11 13:22:32 -08:00
hkuang	1339f3842c	Fix valgrind error. Temporarily change memcpy to memmove. Change-Id: I700a197bc1ce496be1ddad7118429c5da465b0ca	2013-12-11 13:21:28 -08:00
Dmitry Kovalev	3274fc30ee	Renames for consistency in vp9_pred_common.{c, h} files. Change-Id: Icba06e84ca55c419abbacedf5825eeb394a1b140	2013-12-10 18:31:46 -08:00
Dmitry Kovalev	098d13ba10	Cleaning up vp9_append_sub8x8_mvs_for_idx(). Replacing if-else with switch statement, reordering function arguments. Change-Id: I4825d2ef311ba8999b6d4ceb0eef003587a13434	2013-12-10 17:56:53 -08:00
Dmitry Kovalev	2dd20e468a	Cleaning up skip context calculation. Renames: vp9_get_pred_context_mbskip => vp9_get_skip_context vp9_get_pred_prob_mbskip => vp9_get_skip_prob Change-Id: I2af499848ef73f3f5cd8cdb27852d0bcdfe31d09	2013-12-10 14:11:26 -08:00
Dmitry Kovalev	35b7b0b549	Merge "Removing unused vp9_get_pred_flag_mbskip() function."	2013-12-10 13:58:35 -08:00
hkuang	19bbe41c71	Merge "Refactor inter_predictor function."	2013-12-10 13:34:24 -08:00
Dmitry Kovalev	48088f210d	Removing unused vp9_get_pred_flag_mbskip() function. Change-Id: Ib46a97d8ff9f2915b9fa2abba3cd18b6711fcb0c	2013-12-10 12:53:17 -08:00
Dmitry Kovalev	e18eb7721e	Merge "Renaming comp_pred_mode to reference_mode."	2013-12-10 10:52:34 -08:00
hkuang	6c9dcae532	Refactor inter_predictor function. Change-Id: Ic429b2f16462e926f30efb3af4da3080026359d8	2013-12-10 10:36:44 -08:00
Dmitry Kovalev	d2dad31e79	Merge "Cleaning up vp9_get_pred_context_switchable_interp() function."	2013-12-09 17:34:30 -08:00
hkuang	d70a8c09c6	Merge "Implement on demand border extension. In place extend the border now. Next commit will totally remove the border."	2013-12-09 17:16:31 -08:00
Dmitry Kovalev	9edd4d4db7	Cleaning up vp9_get_pred_context_switchable_interp() function. Change-Id: I67a45a41312ca0efd8fe00ccd8bdc0f97675d09f	2013-12-09 17:02:38 -08:00
hkuang	ff2c96be1f	Implement on demand border extension. In place extend the border now. Next commit will totally remove the border. Change-Id: Ic1e1ca9cc34f81c688715b3948689b47df63a151	2013-12-09 16:44:08 -08:00
Jingning Han	f92b5842bf	Merge "Full range motion search for regular block sizes"	2013-12-09 16:12:35 -08:00
Dmitry Kovalev	08c48ddc01	Renaming comp_pred_mode to reference_mode. Change-Id: I83ffed2b1878a35ac35f07f9ee74309adc9c7b11	2013-12-09 15:13:34 -08:00
Dmitry Kovalev	347df4ce55	Merge "Renaming vp9_get_pred_context_tx_size() function."	2013-12-09 15:10:49 -08:00
Dmitry Kovalev	2c3120274a	Removing max_uv_txsize_lookup lookup table. Adding get_uv_tx_size_impl() with tx size selection logic, rewriting get_uv_tx_size(). Change-Id: I3ecb108059a41be227a8c89a0710bd174f508951	2013-12-09 14:03:23 -08:00
Dmitry Kovalev	a19d694f09	Merge "Removing BLOCK_TYPES and adding PLANE_TYPES constant instead."	2013-12-07 02:20:41 -08:00
Dmitry Kovalev	cb92f4f042	Renaming vp9_get_pred_context_tx_size() function. Change-Id: Ia6d6f4dfb1fd1ec0f8ba53796b59a802e9d7881d	2013-12-06 15:31:06 -08:00
Dmitry Kovalev	b6e5bb27c9	Merge "Renaming reference mode context calculation function."	2013-12-06 14:22:47 -08:00
Jingning Han	b295092b8f	Full range motion search for regular block sizes Add a full range motion search for regular block sizes. This runs exhaustive search within the given reference area. This commit further optimizes the search process by combining 4 points test into one pipeline, which gives 30% speed-up as compared to run each individual point at a time. This full range search serves as a best possible motion search reference. When replacing the diamond search with full range search, the speed 0 runtime of bus CIF at 2000 kbps goes from 153872ms to 623051ms. The compression performance compared to speed 0 setting gains 0.585% for derf set. Change-Id: Ieef1225216b0b86b4ac4872fa7fb9e18bf2eabb3	2013-12-06 12:24:53 -08:00
Dmitry Kovalev	2da30a96d4	Merge "Removing duplicated C code from vp9_loopfilter_filters.c file."	2013-12-06 12:13:24 -08:00
Dmitry Kovalev	63963f51ef	Renaming reference mode context calculation function. Renames: vp9_get_pred_context_comp_inter_inter => vp9_get_reference_mode_context vp9_get_pred_prob_comp_inter_inter => vp9_get_reference_mode_prob Change-Id: I3bbb69481e6b0c848028667c9269f567f293d3bd	2013-12-06 11:23:01 -08:00
Dmitry Kovalev	d6b159d4a6	Removing BLOCK_TYPES and adding PLANE_TYPES constant instead. Change-Id: Ic3bb862e93aedf6a489a33ea6f7e5097d96855ee	2013-12-06 10:54:00 -08:00
Dmitry Kovalev	cf4dfdc8e7	Merge "Moving vp9_tree_probs_from_distribution() to encoder."	2013-12-06 10:18:30 -08:00
Dmitry Kovalev	8eac2ca840	Merge "Renaming constants."	2013-12-06 09:55:02 -08:00
Dmitry Kovalev	5be34ba80f	Merge "vp9_get_pred_context_intra_inter() clean up."	2013-12-06 09:14:36 -08:00
Adrian Grange	de2046275d	Merge "Remove redundant calls to vp9_update_mode_info_border"	2013-12-06 08:59:47 -08:00
Dmitry Kovalev	4ac6a2552b	Moving vp9_tree_probs_from_distribution() to encoder. Writing custom coeff branch count calculation (which is much clearer) in adapt_coef_probs() function. Removing vp9_treecoder.c file. Change-Id: I8880fb7a39996c8bcf6cd0acf9898a8c712ba91f	2013-12-05 18:13:26 -08:00
Dmitry Kovalev	377fa8aff8	Renaming PREV_COEF_CONTEXTS to COEFF_CONTEXTS. Also adding BAND_COEFF_CONTEXTS macro to simplify for loop logic. Change-Id: I12a78a49cf1addf81e6b3fe2a3736ec2b79bd79e	2013-12-05 17:08:06 -08:00
Dmitry Kovalev	6fd71e1b09	vp9_get_pred_context_intra_inter() clean up. Renaming: vp9_get_pred_context_intra_inter => vp9_get_intra_inter_context vp9_get_pred_prob_intra_inter => vp9_get_intra_inter_prob Change-Id: I2c1affea2e84f4e616137c6df82adb11c7845781	2013-12-05 17:01:03 -08:00
Dmitry Kovalev	f7396f3394	Merge "Removing vp9_default_coef_probs.h file."	2013-12-05 16:44:26 -08:00
Dmitry Kovalev	0d4b8d7e43	Renaming constants. NUM_YV12_BUFFERS => FRAME_BUFFERS ALLOWED_REFS_PER_FRAME => REFS_PER_FRAME NUM_REF_FRAMES_LOG2 => REF_FRAMES_LOG2 NUM_REF_FRAMES => REF_FRAMES NUM_FRAME_CONTEXTS_LOG2 => FRAME_CONTEXTS_LOG2 NUM_FRAME_CONTEXTS => FRAME_CONTEXTS Change-Id: I4e1ada08f25d8fa30fdf03aebe1b1c9df0f87e63	2013-12-05 16:23:09 -08:00
Dmitry Kovalev	2b95a05bf6	Removing duplicated C code from vp9_loopfilter_filters.c file. Change-Id: I299b621fca1c8ff5d296afde9698cdcccfecaf3f	2013-12-05 15:49:57 -08:00
Adrian Grange	93d8a3fd29	Remove redundant calls to vp9_update_mode_info_border Removed calls to vp9_update_mode_info_border since they immediately followed code that initialized the entire buffer to 0. Change-Id: Ife06794daa20439a0b607a83a87f88df59afac40	2013-12-05 15:02:32 -08:00
Dmitry Kovalev	6df9ec52a0	Merge "Cleaning up vp9_get_pred_context_tx_size() function."	2013-12-05 09:59:00 -08:00
Tero Rintaluoma	047b0b01bb	Fix show existing frame - Disable mode info update in case where current frame is coded as "show existing frame". - Should fix issue 676. Change-Id: Ibee681850eb307f982da6528d3e31cb94f881c08	2013-12-05 12:10:10 +02:00
Frank Galligan	7ecf3bc91c	Fix ref count decrement code. Buffer 0 would never be decremented, so it could only be used once. Change-Id: I605d99fa2a513eadae6a0e230161729880653282	2013-12-04 22:21:00 -08:00
Dmitry Kovalev	5eeffc9fc5	Cleaning up vp9_get_pred_context_tx_size() function. Change-Id: Ia6ef876e3d1e66b2182a9c0bce3fd758691cd381	2013-12-04 21:35:30 -08:00
Dmitry Kovalev	a1123538a5	Moving vp9_token from common to encoder. Change-Id: I40a070c353663e82c59e174d7c92eb84f72ed808	2013-12-04 19:36:58 -08:00
Frank Galligan	8363349b84	Merge "Fix the initial references to frame buffers."	2013-12-04 19:26:40 -08:00
Dmitry Kovalev	4afd141a05	Removing vp9_default_coef_probs.h file. Moving all probability tables from removed file to vp9_entropy.c. Change-Id: I12846f1da778c3016d96b82e53384d4634883430	2013-12-04 17:04:35 -08:00
Dmitry Kovalev	cf8e3d2c5c	Merge "Cleaning up vp9_dec_build_inter_predictors_sb function."	2013-12-04 16:57:54 -08:00
Frank Galligan	9ed616a56c	Fix the initial references to frame buffers. The old code would start in a mixed state, where all the reference frames were pointing to frame buffer 0, but the reference counts were 0. This is why we needed special code for the first frame. Change-Id: I734961012917654ff8c0c8b317aac00ab75ded1a	2013-12-04 16:53:18 -08:00
Dmitry Kovalev	3712b58c2f	Merge "Cleaning up vp9_entropy.h file."	2013-12-04 16:46:41 -08:00
Dmitry Kovalev	c6ca5c5ad9	Compact formatting default_coef_probs_{4x4, 8x8, 16x16, 32x32}. Change-Id: If40b930431766d5179b9769509b5e4ca1628e9cc	2013-12-04 15:45:28 -08:00
Dmitry Kovalev	da2da79012	Merge "Formatting vp9_pareto8_full array."	2013-12-04 12:22:50 -08:00
Dmitry Kovalev	beb35aba19	Cleaning up vp9_dec_build_inter_predictors_sb function. Using get_plane_block_size() instead of manipulation with subsampling values, calculating all required values only once without redundant calls to b_width_log2(). Change-Id: I00303f2a0926f9c4cb17f34591adda60615f8919	2013-12-04 12:11:01 -08:00
Yunqing Wang	f6582d6928	Revert "Simplify mask checking in loop filters" Jingning saw bitstream change with this patch. It could be true that (mask_16x16_0 & 1) is 1, but (mask_16x16_1 & 1) is 0 in some edge cases. This reverts commit `8f05e70340`. Change-Id: I0a529435ce816a1e14653eb510d5090de276070a	2013-12-04 11:31:19 -08:00
Dmitry Kovalev	1470789927	Merge "Moving eob array to the encoder."	2013-12-04 10:58:02 -08:00
Yunqing Wang	920a074e89	Merge "Improve idct16x16: _256_add_sse2(x1.107)&_10_add_sse2(x1.012)"	2013-12-04 08:50:51 -08:00
Dmitry Kovalev	ff6d6a9f07	Formatting vp9_pareto8_full array. Change-Id: Ic7f47a8d233daf5e61e82092865837ea4eda4095	2013-12-03 18:49:19 -08:00
Dmitry Kovalev	f00d157c12	Moving eob array to the encoder. In the decoder we don't need to save eobs, we can pass eob as an argument. That's why removing eob arrays from VP9Decompressor and TileWorkerData, and moving eob pointer from macroblockd_plane to macroblock_plane. Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a	2013-12-03 17:59:32 -08:00
Dmitry Kovalev	8e89e2f2e0	Cleaning up vp9_entropy.h file. Renaming constants for consistency: DCT_VAL_CATEGORY1 => CATEGORY1_TOKEN DCT_VAL_CATEGORY2 => CATEGORY2_TOKEN DCT_VAL_CATEGORY3 => CATEGORY3_TOKEN DCT_VAL_CATEGORY4 => CATEGORY4_TOKEN DCT_VAL_CATEGORY5 => CATEGORY5_TOKEN DCT_VAL_CATEGORY6 => CATEGORY6_TOKEN DCT_EOB_TOKEN => EOB_TOKEN DCT_EOB_MODEL_TOKEN => EOB_MODEL_TOKEN MAX_ENTROPY_TOKENS => ENTROPY_TOKENS Moving constants: INTER_MODE_CONTEXTS from vp9_entropy.h to vp9_blockd.h. EOSB_TOKEN from vp9_entropy.h to vp9_tokenize.h Change-Id: I5fcbf081318e1d365792b6d290a930c6cb0f3fc2	2013-12-03 17:23:03 -08:00
Dmitry Kovalev	09577b8c8d	Merge "Removing dummy assignments."	2013-12-03 10:59:34 -08:00
Abo Talib Mahfoodh	e4419ab691	Improve idct16x16: _256_add_sse2(x1.107)&_10_add_sse2(x1.012) The performance gain of idct16x16_10_add_sse2 function is not noticeable. However since both functions use the IDCT16_1D, idct16x16_10_add_sse2 should be modified as well. Tested with: park_joy_420_720p50.y4m Change-Id: I02b957e36fcf997c677d15baf496533895271bff	2013-12-02 21:08:56 -05:00
Yunqing Wang	8f182a1cac	Merge "improve vp9_idct32x32_34(x1.472)&1024(x1.032)_add_sse2"	2013-12-02 15:10:05 -08:00
Yunqing Wang	37e68aba55	Merge "Simplify mask checking in loop filters"	2013-12-02 12:06:26 -08:00
Dmitry Kovalev	862c22cf7d	Merge "Moving token-encoding related stuff from common to encoder."	2013-12-02 10:32:04 -08:00
Yunqing Wang	8f05e70340	Simplify mask checking in loop filters Considering a horizontal edge, if mask_16x16 is 1 for an even-indexed 8x8 block, then mask_16x16 is 1 for next 8x8 block in same row. Similarly, for a vertical edge, if mask_16x16 is 1 for an even-rowed 8x8 block, then mask_16x16 is 1 for the 8x8 block right below it in next row. Based on that, the mask_16x16 checking can be simplified to save cycles. The corresponding 8-pixel vp9_mb_lpf_horizontal_edge code can also be removed. Change-Id: Ic3fe7a5674322239208cbe2731dc3216ce2084f3	2013-11-27 14:10:57 -08:00
Dmitry Kovalev	d83d61d942	Moving raster_block_offset{,_int16} from vp9_blockd.h to vp9_rdopt.h. Change-Id: I5a5888d4639cc6b7eb266be47581dd15ba08c91e	2013-11-27 12:57:21 -08:00
Dmitry Kovalev	f9da823216	Moving token-encoding related stuff from common to encoder. Change-Id: I0e59d320407b3bed0ba3622a7b29975f6fad7ebf	2013-11-27 11:27:57 -08:00
Dmitry Kovalev	e2f1d02eb3	Merge "Moving mode encodings from common to encoder + cleanup."	2013-11-27 11:00:54 -08:00
Yaowu Xu	e9c19617bf	Merge "vp9_short_fdct32x32_rd vp9_short_fdct32x32 optimized for AVX2"	2013-11-27 10:27:32 -08:00
Dmitry Kovalev	d3a2e55af4	Removing qcoeff buffers from the decoder. We only need qcoeff buffers in the encoder. Reducing TileWorkerData struct and VP9Decompressor struct sizes by 24K. Change-Id: Id148868461f7ffa3d3dd634b371503ae9c57e207	2013-11-26 18:52:10 -08:00
Dmitry Kovalev	fc3c3303f1	Removing dummy assignments. Change-Id: I10d1a4bcac751a982d9dd135f019e3a4d92f8522	2013-11-26 15:35:11 -08:00
Dmitry Kovalev	f4bf712fbb	Moving mode encodings from common to encoder + cleanup. Change-Id: I248ccb1532e2cd95314d0b95108f2c2e71cf084f	2013-11-26 14:53:17 -08:00
Yaowu Xu	b60293e1ce	Merge "Amended some comments for clarity"	2013-11-26 14:32:02 -08:00
Frank Galligan	b4874e2c82	Fix 16 wide neon horz loopfilter. Multiply by 3 was on 8bit vectors when it should have been on 16bit vectors. Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc	2013-11-26 10:02:40 -08:00
Yunqing Wang	7a5fd6a1bf	Merge "Do vertical loopfiltering in parallel"	2013-11-26 09:35:14 -08:00
Abo Talib Mahfoodh	f97d91ab67	improve vp9_idct32x32_34(x1.472)&1024(x1.032)_add_sse2 vp9_idct32x32_34_add_sse2: speedup: 1.472 IDCT32_1D_34 and MULTIPLICATION_AND_ADD_2 are optimized based on the fact that Only upper-left 8x8 has non-zero values. vp9_idct32x32_1024_add_sse2: speedup: 1.032 Tested with: park_joy_420_720p50.y4m Change-Id: I8670ce547552b48695049de298e2fc46ce28dfbc	2013-11-26 12:28:26 -05:00
Dmitry Kovalev	5488da280d	Merge "Moving mv entropy encodings calculation to the encoder side."	2013-11-25 19:15:21 -08:00
Dmitry Kovalev	56d048c412	Moving mv entropy encodings calculation to the encoder side. Moved arrays: vp9_mv_joint_encodings vp9_mv_class_encodings vp9_mv_class0_encodings vp9_mv_fp_encodings Change-Id: Iaf5008c579fcbd6d77fdd81d1aef8c71b5f308b7	2013-11-25 16:36:28 -08:00
Dmitry Kovalev	7ba7a5f817	Merge "Removing redundant call of vp9_init_mbmode_probs()."	2013-11-25 16:08:42 -08:00
Dmitry Kovalev	cfc1f91c9f	Merge "Moving {left, right}_block_mode to vp9_blockd.h."	2013-11-25 10:59:24 -08:00
Dmitry Kovalev	e8af3db88a	Merge "Renaming COMPPREDMODE_TYPE enum and its members."	2013-11-25 10:59:08 -08:00
Yaowu Xu	dd69337e6e	Amended some comments for clarity Change-Id: I31c3908ba394095deb5d3a5d7b7c9b2b5328c3e8	2013-11-25 10:55:01 -08:00
Yaowu Xu	cc1e05ca5f	Merge "In frame Q adjustment experiment."	2013-11-25 10:52:22 -08:00
Jingning Han	f547fb8e07	Merge "Use separate inter predictors for enc/dec"	2013-11-25 10:29:07 -08:00
Paul Wilkins	644bd87e8e	In frame Q adjustment experiment. The idea here is to allow "in frame" adjustment of the final Q value used to encode each SB64, using segmentation. There is also adjustment of the rd mult in regions of overspend. Activated using aq_mode=2 Change-Id: I2f140cd898c9f877c32cd6d2e667f5e11ada4b1c	2013-11-25 10:22:55 -08:00
Yaowu Xu	3183135dd3	Merge "Fix a build issue with visual c."	2013-11-25 10:20:53 -08:00
Jingning Han	ba8b5e8d6d	Use separate inter predictors for enc/dec The decoder will construct inter predictor using lazy border extension, while the encoder, going with multiple runs of motion search in the rate- distortion optimization loop for each block, does border extension at frame level. This commit makes separate the inter predictors for encoder and decoder, respectively. Change-Id: Ieca2fecba3a7201a6d64ef9f219e5d91e50559c3	2013-11-25 09:43:34 -08:00
Jingning Han	12e5ec6aa8	Merge "Separate setup_scale_factor/extend_frame_borders"	2013-11-25 09:14:46 -08:00
Yaowu Xu	86368faca9	Fix a build issue with visual c. Change-Id: Ic8fc16ee1734cfde0d12a2e3abb3e9299382f3b1	2013-11-25 08:11:35 -08:00
Dmitry Kovalev	9fe88870c5	Merge "Cleaning up vp9_append_sub8x8_mvs_for_idx."	2013-11-24 16:08:20 -08:00
Dmitry Kovalev	52b43a2876	Inlining and removing vp9_set_pred_flag_seg_id() function. Change-Id: I0fd76937e847f78378a7ab3fa0af00a7c2c52b42	2013-11-22 17:32:11 -08:00
Dmitry Kovalev	fb9c19c62d	Renaming COMPPREDMODE_TYPE enum and its members. List of renames: COMPPREDMODE_TYPE => REFERENCE_MODE SINGLE_PREDICTION_ONLY => SINGLE_REFERENCE COMP_PREDICTION_ONLY => COMPOUND_REFERENCE HYBRID_PREDICTION => REFERENCE_MODE_SELECT (like TX_MODE_SELECT) NB_PREDICTION_TYPES => REFERENCE_MODES Change-Id: If723dabe9435325d0165dcd028142a2c78b417b4	2013-11-22 16:35:37 -08:00
Dmitry Kovalev	350731e8f9	Organizing all scan tables into lookup table. Change-Id: Ie829ee58a55157e6972c63cebe69a5d0a3221349	2013-11-22 16:20:45 -08:00
Dmitry Kovalev	52fa10a9a3	Cleaning up vp9_append_sub8x8_mvs_for_idx. Change-Id: Ic92f15d82ff5cfa3df655d08e460335c2ef8a325	2013-11-22 15:28:32 -08:00
Jingning Han	86d2a9b978	Separate setup_scale_factor/extend_frame_borders This commit takes out vp9_extend_frame_borders from vp9_setup_scale_factors. The refactoring is for the preparation of the use of lazy border extension at decoder. This makes it necessary to handle border extension separately at encoder/decoder. The use of vp9_extend_frame_borders will be removed, when lazy border extension is ready. Change-Id: Ia3baba3d179d5f11eee1634f19b3b319d2a59186	2013-11-22 12:02:08 -08:00
Dmitry Kovalev	e0ec61187e	Merge "Removing txfrm_block_to_raster_xy() call from extend_for_intra()."	2013-11-22 10:51:38 -08:00
Yunqing Wang	ed36720b66	Do vertical loopfiltering in parallel This patch followed "Add filter_selectively_vert_row2 to enable parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. For other optimizations (neon and dspr2), current 16-pixel functions were done by calling 8-pixel functions twice, and real 16-pixel functions could be added later. Decoder speedup: tulip clip: 2% speed gain; old_town_cross: 1.2% speed gain; bus: 2% speed gain. Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7	2013-11-22 10:04:51 -08:00
Dmitry Kovalev	7c8cac3c21	Removing txfrm_block_to_raster_xy() call from extend_for_intra(). Change-Id: I6a48d1f35ed5fe7a2c7499675b339994c9c3bdf2	2013-11-21 19:30:58 -08:00
Dmitry Kovalev	ad3333e2cd	Merge "Removing plane_block_{width, height} functions."	2013-11-21 16:37:27 -08:00
levytamar82	8def766de2	vp9_short_fdct32x32_rd vp9_short_fdct32x32 optimized for AVX2 Change-Id: I6366e84490883b72362f762369d7e5bccb64f02f	2013-11-21 14:19:49 -08:00
Frank Galligan	97d1258375	Revert "Add 16 wide neon horz loopfilter." The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0	2013-11-21 14:01:33 -08:00
Dmitry Kovalev	4896d5c7ef	Moving {left, right}_block_mode to vp9_blockd.h. Both functions have no relation to motion vectors, so moving them from vp9_findnearmv.h to vp9_blockd.h. Change-Id: I74f524267886ab0fff4a2da793a10c906ed0f43a	2013-11-21 11:43:53 -08:00
Yunqing Wang	e002bb99a8	Merge "Add filter_selectively_vert_row2 to enable parallel loopfiltering"	2013-11-21 11:25:55 -08:00
hkuang	370bf116a2	Merge "Remove unnecessary eob checking."	2013-11-21 11:24:02 -08:00
Frank Galligan	2dd77580c0	Merge "Add 16 wide neon horz loopfilter."	2013-11-21 10:29:30 -08:00
Yunqing Wang	b5e6d6cccf	Add filter_selectively_vert_row2 to enable parallel loopfiltering Added filter_selectively_vert_row2 to be ready for parallel loopfiltering in vertical direction. This change did 2-row filtering at a time. If 2 vertically adjacent 8x8 blocks do same type of filtering, we can do 16-pixel filtering in parallel. Next, we need to provide 16-pixel loopfiltering functions in c and optimized versions for codec speedup. Change-Id: Idf97bbdd70566e55bd30e1fd25cb8544e33291be	2013-11-21 09:53:15 -08:00
Yunqing Wang	6c4964602a	Merge "Correct ssse3 8/16-pixel wide sub-pixel filter calculation"	2013-11-21 09:40:02 -08:00
Frank Galligan	98de15137e	Add 16 wide neon horz loopfilter. Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d	2013-11-21 09:39:36 -08:00
Dmitry Kovalev	c90b6bb101	Removing redundant call of vp9_init_mbmode_probs(). This function is called from vp9_setup_past_independence() which is called before the modified piece of code. Moving reset of inter_mode_probs into vp9_init_mbmode_probs() for consistency. Change-Id: Ib188e8798e1fbe15407fd501406761b746fdda95	2013-11-20 21:56:38 -08:00
Dmitry Kovalev	a218a96784	Merge "Adding MV_FP_SIZE constant."	2013-11-20 14:39:58 -08:00
Yunqing Wang	256cf7ee7d	Correct ssse3 8/16-pixel wide sub-pixel filter calculation Although no mismatch was indicated for 8/16 wide sub-pixel filters in issue 661, they had similar problems that could cause mismatch potentially. This patch fixed calculations in HORIZx8/16 and VERTx8/16. Change-Id: I169961c9d40a20340995b7d22aafc89ccf30bfca	2013-11-20 12:52:56 -08:00
Dmitry Kovalev	79b5a2b142	Removing plane_block_{width, height} functions. Change-Id: I29c0dfcf41a1253d5e2a0d2ff740c0c38ebaa5a2	2013-11-20 12:39:29 -08:00
Jim Bankoski	302c33e49f	Merge "Clean up removal of vp9_pareto8 table."	2013-11-20 12:30:03 -08:00
Dmitry Kovalev	4956fcd31b	Adding MV_FP_SIZE constant. Change-Id: I98d750ee92ff51fb714980418ea28be3b1d0f3c6	2013-11-20 12:07:57 -08:00
hkuang	6debc446e0	Remove unnecessary eob checking. Change-Id: Ia568f70bddc1a2b62141a0197459119ca74c22b5	2013-11-20 11:58:11 -08:00
Jim Bankoski	25aae73a30	Merge "remove the model and copy in pack_mb_tokens"	2013-11-20 11:34:30 -08:00
Jim Bankoski	5bbb0c6295	Clean up removal of vp9_pareto8 table. Change-Id: I5556e8d1fc150be8a3e93af21900829b59a500dc	2013-11-20 11:17:26 -08:00
Jingning Han	81b9fd4310	Merge "Take out assertion from inverse transforms"	2013-11-20 10:55:27 -08:00
Jim Bankoski	03276bf6e6	remove the model and copy in pack_mb_tokens Change-Id: I00a5203c8ed76c184d936fccf93d76e7c06773d3	2013-11-20 10:06:04 -08:00
Yunqing Wang	0ef63f596d	Fix stack pointer in sub-pixel filters In commit "3d50da5397d20abc932d81453b26cde758293a40", the stack pointer was modified while aligning the stack, and it needed to be popped at the end. Change-Id: I062971e195f1f2ab9d0ab5fb84dcf215a0fcaa67	2013-11-20 09:42:44 -08:00
Guillaume Martres	b00057c88a	Merge "vpxenc: add --aq-mode flag to control adaptive quantization"	2013-11-20 08:13:28 -08:00
Jim Bankoski	7a8a68e2bd	Merge "scan order table lookup same for encoder and decoder"	2013-11-19 16:22:48 -08:00
Yunqing Wang	e8f8e77642	Merge "Fix decoder mismatch with ssse3 enabled"	2013-11-19 16:19:32 -08:00
Yaowu Xu	dd04ff506b	Merge "Move vp9_setup_interp_filter() to encoder"	2013-11-19 16:01:19 -08:00
Jim Bankoski	d6667dd54f	scan order table lookup same for encoder and decoder Change-Id: I473947b5ca70b7a81151926284bff86f8555492a	2013-11-19 15:31:43 -08:00
Yunqing Wang	3d50da5397	Fix decoder mismatch with ssse3 enabled This patch fixed issue 661: "Decoder produces mismatched outputs with ssse3 enabled and disabled." In sub-pixel filters, a pixel value was multiplied by a filter coefficient, and the results were added up. The order of adding up these multiplications had to be arranged carefully to prevent incorrect overflowing. Change-Id: Id08af4200fea9e1b896fc40157b8651c2c7e80f2	2013-11-19 15:10:04 -08:00
Dmitry Kovalev	65cee2f01a	Merge "Simplifying partition context calculation."	2013-11-19 15:09:01 -08:00
Jim Bankoski	60aba6558f	Merge "entropy code speedup"	2013-11-19 14:58:44 -08:00
Yaowu Xu	df78fea166	Move vp9_setup_interp_filter() to encoder As it is used in encoder only. Change-Id: I5f2a8abbe72bb18cbf6ce36a3dc7e132aeae8ec2	2013-11-19 14:57:58 -08:00
Yaowu Xu	f92cfa1ca6	Merge "Move vp9_sadmxn.h from common to encoder"	2013-11-19 14:41:33 -08:00
Jim Bankoski	8cf352abac	entropy code speedup Change-Id: Ic316d3374ff9a2b43897272260947d56765a0fdd	2013-11-19 14:31:38 -08:00
Jim Bankoski	ff4f1c4b76	scan order / neighbors converted to lookup Change-Id: I64b189dfeee1cf3e90134a1a93497072f3361e5e	2013-11-19 12:55:44 -08:00
Yaowu Xu	30b03050a2	Move vp9_sadmxn.h from common to encoder Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b	2013-11-19 12:46:08 -08:00
Dmitry Kovalev	f6ec323906	Simplifying partition context calculation. Reversing bit order of partition_context_lookup, and modifying accordingly update_partition_context() and partition_plane_context(). Change-Id: I64a11f1a94962a3bf217de2f50698cb781db71a5	2013-11-19 11:17:30 -08:00
Yunqing Wang	f16fb829e6	Merge "Improve vp9_iht4x4_16_add_sse2 (x1.341)"	2013-11-19 11:11:47 -08:00
Dmitry Kovalev	953b1e9683	Removing raster_block_offset_uint8() function. There is no need to use that function, it is much clearer to pass offset directly to the buffer. Change-Id: I9026cb0c5094c46f97df5d7f7daeb952f2843b24	2013-11-18 19:00:49 -08:00
Dmitry Kovalev	9e1e7bee48	Merge "Finally removing txfrm_block_to_raster_block() function."	2013-11-18 18:43:16 -08:00
Dmitry Kovalev	220af9ac2c	Merge "Cleaning up vp9_entropy.c file."	2013-11-18 18:04:56 -08:00
Abo Talib Mahfoodh	613e2d2e90	Improve vp9_iht4x4_16_add_sse2 (x1.341) This rebase is a better implementation of the previous ones. Modifications are done to reduce the total clock cycle. Speedup: 1.341 Compiled with -O3 Tested with: park_joy_420_720p50.y4m Change-Id: I940eaf283f60597ca0d9d2e13d518878d55ff02d	2013-11-18 20:53:13 -05:00
Dmitry Kovalev	d8c06d23da	Cleaning up vp9_entropy.c file. Change-Id: I568f5e2d4ef2f2affe013ba1691ffb546f1fe8c6	2013-11-18 17:18:14 -08:00
Yaowu Xu	a42ab027fd	Merge "Move vp9_extend.{h,c} from common to encoder"	2013-11-18 15:43:32 -08:00
Yaowu Xu	1c61e1960d	Move vp9_extend.{h,c} from common to encoder Since they are used in encoder only. This commit also re-orders includes for the files that include vp9_extend.h Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459	2013-11-18 12:43:36 -08:00
Yunqing Wang	e3168b0c54	Merge "Do horizontal loopfiltering in parallel"	2013-11-18 10:03:41 -08:00
Jim Bankoski	83eb1975df	partition context update speedup This removes a lot of operations in setting partition context... Change-Id: I365e6f5607ece85190cb21443988816dfa510ce3	2013-11-17 06:58:08 -08:00


3255 Commits