generic-library/vpx

Author	SHA1	Message	Date
Frank Galligan	175d9dfe0a	Remove memset of every external frame buffer. Libvpx was memseting every external frame buffer before decode. This was to work around a valgrind issue in our C loop filter. Most of the time this was not needed and we have noticed some significant performance loss on some platforms. Now we require the application to zero out the buffers if it is using external frame buffers. Change-Id: I7330d00a315e65137ed30edd5f813e8929b76242	2014-09-15 15:37:36 -07:00
Alexander Voronov	29071a418e	Fix invalid memory access on 2x downscale. The issue was discovered on bitstream with 2x vertical downscale. For zero MVs, y_pad is set to 1 only when vertical convolution is required. The original code assumes that for y_step_q4 == 32 we don't perform vertical convolution. But vp9_setup_scale_factors_for_frame() sets convolve functions so that when x_step and y_step are both not equal to 16, convolve in both directions is performed. And convolve() unconditionally subtracts one stride from source pointer when calls convolve_horiz(). This leads to invalid memory access. Change-Id: I882dfa6081a58e172b5ffa55842bfcd6727f10bf	2014-09-15 17:50:20 +04:00
Jingning Han	82fad6f4b6	Merge "Add a note for enum values of MV_REFERENCE_FRAME"	2014-09-13 10:42:45 -07:00
Deb Mukherjee	10783d4f3a	Adds high bitdepth transform functions and tests Adds various high bitdepth transform functions and tests. Much of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform cofficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map tp int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8	2014-09-11 19:56:33 -07:00
Deb Mukherjee	1e4136d35d	Adds high bit depth sad and variance functions Moves high bit depth sad/var functions from highbitdepth branch to master. Change-Id: If03845d8ef9c9c494e13350e7a587c289306b94d	2014-09-11 17:30:44 -07:00
Johann	ac2f2e7855	Merge "Allow specifying opt dependencies"	2014-09-11 16:02:41 -07:00
Johann	8645a53039	Allow specifying opt dependencies If optimizations use more than one cpu feature, allow specifying them so that '--disable-X' still works https://code.google.com/p/webm/issues/detail?id=854 Change-Id: I3108ea37b397371a2be84dd5f2380b304db23f18	2014-09-11 13:43:48 -07:00
Jingning Han	3ef9786b7e	Add a note for enum values of MV_REFERENCE_FRAME Change-Id: Ifaf6738f26e86ded6eb6ea1465bad7a229612999	2014-09-11 10:55:42 -07:00
Jim Bankoski	0e66848081	Merge "LoopFilterWorkerData: remove misleading 'const'"	2014-09-10 06:33:51 -07:00
James Zern	2215d2f135	Merge changes If8887e1d,I36bfc9c8,I3d1e6c42 * changes: vp9_dthread: simplify loop_filter_row_worker signature simplify vp9_loop_filter_worker signature vp9_decodeframe: simplify tile_work_hook signature	2014-09-09 16:50:28 -07:00
Dmitry Kovalev	8e205a2a09	Merge "Cleaning up and speeding up vp9_idct32x32_1024_add_sse2()."	2014-09-09 12:50:23 -07:00
James Zern	7b572c9806	LoopFilterWorkerData: remove misleading 'const' 'frame_buffer' is modified indirectly via 'planes'. + do the same for vp9_loop_filter_rows Change-Id: Ibb7daa2e261064e4a5317a2969e3490e59891b82	2014-09-08 20:06:48 -07:00
James Zern	48662747bd	simplify vp9_loop_filter_worker signature use the type names directly in the function declaration rather than (void arg1, void arg2) Change-Id: I36bfc9c886310ce370bf0ca7c679ebd6e95109cc	2014-09-08 19:53:46 -07:00
Dmitry Kovalev	980abf6078	Fixing Mac OS build. Change-Id: Ifae8906185a868a07685eb7a7da2484af95e70a7	2014-09-08 08:53:12 -07:00
Dmitry Kovalev	70092af5c0	Cleaning up and speeding up vp9_idct32x32_1024_add_sse2(). Change-Id: If91017b792572c9db6e257011ca307bef8428486	2014-09-05 18:12:30 -07:00
Dmitry Kovalev	89963bf586	Merge "Removing postproc mmx code."	2014-09-05 18:11:08 -07:00
Dmitry Kovalev	54bec0971f	Merge "Initializing intra modes without vpx_once()."	2014-09-05 12:03:36 -07:00
Dmitry Kovalev	1100e262c5	Removing postproc mmx code. Removed functions: * vp9_post_proc_down_and_across_mmx * vp9_mbpost_proc_down_mmx * vp9_plane_add_noise_mmx They all have sse2 equivalent. Change-Id: I59c1fac12b7c96ca4538d455e4400c2b7875feff	2014-09-05 11:52:50 -07:00
James Zern	a8083449e9	fix x86-darwin* build vp9_variance_sse2.c contains a mix of intrinsics and references to assembly which uses x86inc.asm; it's conditionally included as a result. Change-Id: I254451483a65881c0b8e18e27bf0c3ddef60c4ec	2014-09-04 23:32:13 -07:00
Dmitry Kovalev	490943552f	Removing unused function prototypes. Change-Id: Ia5e383e2cf18052f6f1eacf8b9495ab8e4d58878	2014-09-04 14:26:30 -07:00
Dmitry Kovalev	48197f0a70	Adding sse2 variant for vp9_mse{8x8, 8x16, 16x8}. Change-Id: I6786d25ce4f32b8d8912f2d239a45ca15b310c4b	2014-09-03 19:02:14 -07:00
Dmitry Kovalev	bf778e7d8e	Initializing intra modes without vpx_once(). Change-Id: I0a9d52432f2500f1bd8f43f229e70e38bb9a0343	2014-09-03 11:39:02 -07:00
Dmitry Kovalev	0ecc75c819	Merge "Removing MMX SAD calculation code."	2014-09-02 17:35:59 -07:00
Dmitry Kovalev	318fc0c34f	Removing MMX SAD calculation code. Removed functions: * vp9_sad_16x16_mmx * vp9_sad_8x16_mmx * vp9_sad_16x8_mmx * vp9_sad_8x8_mmx * vp9_sad_4x4_mmx Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3	2014-09-02 14:41:36 -07:00
Deb Mukherjee	5acfafb18e	Adds config opt for highbitdepth + misc. vpx Adds config parameter vp9_highbitdepth, to support highbitdepth profiles. Also includes most vpx level high bit-depth functions. However encode/decode in the highbitdepth profiles will not work until the rest of the code is in place. Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6	2014-09-02 14:37:10 -07:00
Dmitry Kovalev	12cd6f421d	Removing variance MMX code. Removed functions: * vp9_mse16x16_mmx * vp9_get_mb_ss_mmx * vp9_get4x4var_mmx * vp9_get8x8var_mmx * vp9_variance4x4_mmx * vp9_variance8x8_mmx * vp9_variance16x16_mmx * vp9_variance16x8_mmx * vp9_variance8x16_mmx They all have SSE2 equivalent. Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615	2014-08-29 10:26:42 -07:00
Dmitry Kovalev	eba83a0fdb	Merge "Replacing int_mv with MV inside the first pass code."	2014-08-25 13:56:14 -07:00
Dmitry Kovalev	a459e582cb	Replacing int_mv with MV inside the first pass code. Change-Id: Ia3be6b5a18e1ff6cc5c5f4d37e4a5d0972388308	2014-08-22 16:20:18 -07:00
Jim Bankoski	cebe2c8d88	vp9_postproc.c: unused parameter warning resolved Change-Id: I6d77a7c775c0482fd1f9bb03ea6f336dd2973fa0	2014-08-22 13:41:07 -07:00
Yaowu Xu	23c88870ec	Merge "Fix bug 804"	2014-08-21 08:56:32 -07:00
Adrian Grange	c5d8c1e785	Merge "get_ref_frame: fix test for valid buffer."	2014-08-15 10:41:28 -07:00
Adrian Grange	54f8cb78c6	Merge "Fix bug 837: realloc mode info buffers on resize"	2014-08-14 14:53:33 -07:00
Adrian Grange	89a213b4b0	get_ref_frame: fix test for valid buffer. In the current implementation of the encoder, frame buffers may come from the wider set of 12 such buffers, and is not restricted to the 8 allowed as reference frames. This is only an implementation detail and does not affect the constraint of having a total of 8 reference buffers overall. Change-Id: I075f777146c2df49c275d89232933f8127235175	2014-08-14 12:42:11 -07:00
Adrian Grange	4e30565a9f	Fix bug 837: realloc mode info buffers on resize The test to determine if the mode info buffers need to be resized when the frame size changes was incorrect, as per bug 837. By storing the size of the allocated data structure, a simple test determines whether to allocate more memory when the frame size changes. Change-Id: I1544698f2882cf958fc672485614f2f46e9719bd	2014-08-14 08:59:15 -07:00
James Zern	4b79563805	Merge "get_ref_frame: check ref_frame_map value"	2014-08-12 22:48:27 -07:00
James Zern	a6b7bd6a1c	Merge "fixes several -Wunused-function warnings"	2014-08-12 20:15:14 -07:00
James Zern	3caed4f8fd	get_ref_frame: check ref_frame_map value 'ref_frame_map' is initialized to -1. avoids using an invalid index if VP9_GET_REFERENCE/VP8_COPY_REFERENCE controls are issued after a decode error. Change-Id: I4599762c4d0b07a5943a72bf4a86ccb596cc062a	2014-08-12 17:47:04 -07:00
Jim Bankoski	f452961765	fixes several -Wunused-function warnings Change-Id: I4dc2cb255f4fe30998b6ee61184895dee9f5da8e	2014-08-12 16:51:07 -07:00
Adrian Grange	1ebf52df2c	Common encode/decode function to get reference frame Replaced encoder and decoder functions to get a pointer to a reference frame with a common function, vp9_get_ref_frame, and simplified it. Change-Id: Icb206fcce8caace3bfd1db3dbfa318dde79043ee	2014-08-08 11:37:11 -07:00
Adrian Grange	75b42a4977	Remove coding_use_prev_mi member from VP9_COMMON This was shadowing the use of error_resilient_mode, but with the opposite sense. Change-Id: Ie4d30263a304fe4b3e94f0c7741db6888cc6afd8	2014-08-08 09:40:38 -07:00
levytamar82	69a5f5ecf7	Fix bug 807 in the sub_pixel_variance function the dst is aligned to 16 bytes and not to 32 bytes - now load unaligned data Change-Id: I2e0b9745543697efc56fefa32857ea10117af135	2014-08-07 18:51:02 -07:00
levytamar82	839911fb6d	Fix bug 804 A bug in Microsoft compiler was found in the function vp9_filter_block1d16_v8_avx2 and a workaround applied. the bug occur when there was 4 consecutive maddubs + min + adds intrinsic instructions. Change-Id: I83499faeb70971e650e5663fd2490360ddb1a51b	2014-08-07 15:09:24 -07:00
levytamar82	af10457e02	Fix bug 806 in the function sad32x32x4d and sad64x64x4d the source is aligned to 16 bytes and not to 32 bytes - the load is now unaligned. Change-Id: I922fdba56d0936b5cf72e4503519f185645a168c	2014-08-07 14:13:30 -07:00
Dmitry Kovalev	65234504b9	Merge "Removing direct references to VP9_COMP."	2014-08-07 14:12:32 -07:00
Deb Mukherjee	a468170804	Merge "Changes hdr for profiles > 1 for intraonly frames"	2014-08-07 11:15:38 -07:00
Deb Mukherjee	09bf1d61ca	Changes hdr for profiles > 1 for intraonly frames Specifies the bit-depth, color sampling and colorspace for intra only frames for profiles > 0 Also adds checks to ensure that profile 1 and 3 are exclusively used for non 420 streams. Change-Id: Icfb15fa1acccbce8f757c78fa8a2f60591360745	2014-08-07 09:47:14 -07:00
Yaowu Xu	0a2b25dcb9	configure: add --enable-coefficient-range-checking This commit adds a configure time option used to enable strict error checking in decoder to make sure intermediate stage cofficients of inverse transforms are within valid range of signed 16 bit integer. For valid VP9 input streams, intermediate stage coefficients should always stay within the range of a signed 16 bit integer. Coefficients can go out of this range for invalid/corrupt VP9 streams. However, strictly checking this range for every intermediate coefficient can be a burden for decoder, therefore such validation is only enabled with configure option --enable-coefficient-range-checking. Change-Id: I47d47c8c4e48a922c3d223ca59064f51b3f0f5ed	2014-08-06 17:13:16 -07:00
Dmitry Kovalev	09b3d04aac	Removing direct references to VP9_COMP. Change-Id: Ic37624d807884e71f08b50fd04892f03f2708ba7	2014-08-06 12:59:02 -07:00
Johann	7516abc7dc	Remove vp9_postproc_x86.h This configuration has moved to vp9_rtcd_defs.pl Change-Id: I71a31dbb8d79df226b60dd834324a5af69956c51	2014-08-05 15:46:13 -07:00
Jim Bankoski	128827d947	cast enums to int to avoid gcc warning in pred_common Change-Id: Ie3e478ef4fa565225d9e19a14d2f40aad966c2b6	2014-08-04 12:07:37 -07:00
Jim Bankoski	7f63dabfe9	break at the end of clauses with assert(0) to avoid gcc warning Change-Id: I1b3c5337f018dde27dc819ab18bd081d169a91e8	2014-08-04 08:52:53 -07:00
Jim Bankoski	3cf5908e24	uint8_t segment and skip to avoid signed / unsigned warnings Change-Id: I2e2765b851fb0a1b15351c2aa0e079197cbee373	2014-08-04 08:52:40 -07:00
James Zern	ce896df057	Merge "vp9_entropy: inline comes first to avoid warning."	2014-08-01 19:15:34 -07:00
James Zern	3a924f6ed1	Merge "signed unsigned mismatch - warning error"	2014-08-01 16:28:38 -07:00
Jim Bankoski	9c74e6aac7	vp9_entropy: inline comes first to avoid warning. Change-Id: I5b050122e6ed183a5b33c1f38e4fbf63b6721062	2014-08-01 16:05:30 -07:00
James Zern	1b6ac28a2f	Merge "removed sign mismatch warning"	2014-08-01 14:45:12 -07:00
Frank Galligan	5f8fa13258	Merge "Added vp9_sad8x8_neon()"	2014-08-01 14:11:38 -07:00
Scott LaVarnway	98165ec074	Neon version of vp9_sub_pixel_variance8x8(), vp9_variance8x8(), and vp9_get8x8var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~1.2%. Change-Id: I8a66ac2a0f550b407caa27816833bdc563395102	2014-08-01 11:35:55 -07:00
Frank Galligan	5487b6067c	Merge "Neon version of vp9_sub_pixel_variance32x32(),"	2014-08-01 09:46:37 -07:00
Scott LaVarnway	545be78136	Added vp9_sad8x8_neon() Change-Id: I3be8911121ef9a5f39f6c1a2e28f9e00972e0624	2014-08-01 06:36:18 -07:00
Jim Bankoski	0f3689d32d	signed unsigned mismatch - warning error Change-Id: I991e36aa3cfa62aae6d27b253297dd9ca9e8bc12	2014-08-01 06:29:32 -07:00
Jim Bankoski	512f9b631f	removed sign mismatch warning Change-Id: Iaa40b472f6c1c48bb3bb47332b6fcf36d7f3c10e	2014-08-01 06:28:00 -07:00
Scott LaVarnway	6f4b8dcdc2	Neon version of vp9_subtract_block() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~3.2% Change-Id: I8862497264142171b7efc32df1a67714a23539f4	2014-07-31 09:28:06 -07:00
Scott LaVarnway	d39448e2d4	Neon version of vp9_sub_pixel_variance32x32(), vp9_variance32x32(), and vp9_get32x32var(). Change-Id: I8137e2540e50984744da59ae3a41e94f8af4a548	2014-07-31 08:00:36 -07:00
Scott LaVarnway	d4a37db5b8	Neon version of vp9_quantize_fp() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~12.4% Change-Id: Id29d215acf58bb108489e218a259adf74b4768d7	2014-07-30 09:33:46 -07:00
Scott LaVarnway	521cf7e879	Neon version of vp9_sub_pixel_variance16x16(), vp9_variance16x16(), and vp9_get16x16var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~16.7%. Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb	2014-07-30 08:17:32 -07:00
Scott LaVarnway	d19d222db6	Added vp9_fdct8x8_neon(), vp9_fdct8x8_1_neon() On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~3.7%. Change-Id: I428c72c40df82c6d537955e320a8debf99343004	2014-07-29 08:56:05 -07:00
levytamar82	4ba92dc5ab	Fix bug 805 Remove all the redundant dct functions (dct4x4, dct8x8) in avx2 except dct32x32 those functions were copied originally from dct_sse2 Change-Id: I742576fbf5175f3ac09f2076976a9247b259323e	2014-07-28 15:46:01 -07:00
Jingning Han	53844275e9	Fix potential ioc issue in vp9_get_prob for 4K above sizes This commit turns on the existing vp9_get_prob function using 64 bit in the intermediate step. It fixes the ioc issue for 4K above frame sizes (issue 828). Change-Id: I9f627f3beca2c522f73b38fd2a3e7eefdff01a7c	2014-07-24 15:35:51 -07:00
Alex Converse	5926e7c0e8	Remove unfinished VP9 alpha channel. Change-Id: Ic5d3a3a0dac10b49495771886a31e793bb78b5ca	2014-07-21 15:55:50 -07:00
Deb Mukherjee	727f384085	Merge "Separates profile 2 into 2 profiles 2 and 3"	2014-07-18 03:23:51 -07:00
Deb Mukherjee	c447a50aea	Separates profile 2 into 2 profiles 2 and 3 Separates HBD profile int two profiles (2 and 3) consistent with the highbitdepth branch. This patch is ported from the original highbitdepth branch patch: https://gerrit.chromium.org/gerrit/#/c/70460/ Two of the invalid file tests needed to be updated. Change-Id: I6a4acd2f7a60b1fb4cbcc8e0dad4eab4248431e3	2014-07-17 20:51:59 -07:00
Adrian Grange	8cb8aef7c7	Merge "Modified frame buffer handling"	2014-07-17 12:15:16 -07:00
Scott LaVarnway	ba0652e83a	Merge "Added vp9_sad64x64_neon(), vp9_sad32x32_neon()"	2014-07-17 11:42:16 -07:00
Adrian Grange	f68aaa38d6	Modified frame buffer handling This patch is the first step toward simplifying the frame buffer handling. The final goal is to have a common frame buffer handling framework for both encoder and decoder that incorporates the existing ability to use externally allocated memory. Change-Id: I2c378a4f54a39908915f46c4260e17a080db7ff1	2014-07-17 11:06:35 -07:00
Scott LaVarnway	696fa52eaa	Added vp9_sad64x64_neon(), vp9_sad32x32_neon() and vp9_sad16x16_neon() On a Nexus 7, vpxenc (in realtime mode, speed -6) reported a performance improvement of ~17%. Change-Id: I91e070cde2973451083d3f3d63b49b7886de9a85	2014-07-16 12:54:46 -07:00
Deb Mukherjee	1f6aaeddc5	Merge "Some extra bit probability cleanups"	2014-07-14 17:26:54 -07:00
Jingning Han	6ce515b9ff	Merge "Fix chrome valgrind warning due to the use of mismatched bsize"	2014-07-13 11:07:44 -07:00
James Zern	0999a2a24e	Merge "vp9_loopfilter.c: cosmetics"	2014-07-11 16:02:21 -07:00
Jingning Han	3cddd81c6d	Fix chrome valgrind warning due to the use of mismatched bsize This commit fixes a mismatched use case of block size in non-RD intra prediction check. The residual SSE and variance should be calculated per transform block size, instead of operating block size, which caused chrome valgrind warning on conditional jump based on uninitialized value (webm issue 823). This commit resolves this issue. Change-Id: I595c06599c7e0fd0e4a08736519ba68fc14bc79a	2014-07-11 15:49:22 -07:00
Yunqing Wang	7e340614c1	Merge "Remove unnecessary assertions"	2014-07-11 13:47:03 -07:00
Deb Mukherjee	6957e7a077	Some extra bit probability cleanups Refactoring to remove some duplication of probability tables between tokenization and detokenization. Change-Id: I2fc6a6497f9c0410021a9b41f828bc58a864e466	2014-07-11 11:39:18 -07:00
Yunqing Wang	978642a426	Remove unnecessary assertions Removed 2 unnecessary assertions. Change-Id: I0f8877d0494bf3ecdb0d7931ccbcaa8289e01d8b	2014-07-11 10:48:57 -07:00
Yaowu Xu	a75d55df1b	Remove an unused parameter Change-Id: I6ad6fd75dc3c9e6218d88148cf49e205398e2af5	2014-07-11 08:10:04 -07:00
James Zern	8a7cc1f47b	Merge "update vp9_thread.c"	2014-07-10 23:19:55 -07:00
James Zern	8701ed0270	update vp9_thread.c pull the latest from libwebp. Original source: http://git.chromium.org/webm/libwebp.git 100644 blob 264210ba2807e4da47eb5d18c04cf869d89b9784 src/utils/thread.c commit 46fd44c1042c9903b2f1ab87e9f200a13c7e702d Author: James Zern <jzern@google.com> Date: Tue Jul 8 19:53:28 2014 -0700 thread: remove harmless race on status_ in End() if a thread was still doing work when End() was called there'd be a race on worker->status_. in these cases, however, the specific value is meaningless as it would be >= OK and the thread would have been shut down properly, but we'll check 'impl_' instead to avoid any potential TSan/DRD reports. Change-Id: Ib93cbc226a099f07761f7bad765549dffb8054b1 Change-Id: Ib0ef25737b3c6d017fa74822e21ed58508230b91	2014-07-10 12:20:54 -07:00
Yunqing Wang	1226d133df	Merge "Refactor vp9_diamond_search_sad function"	2014-07-10 11:06:32 -07:00
Yunqing Wang	46441ec5c8	Merge "Refactor refining_search_sad code"	2014-07-10 10:43:00 -07:00
hkuang	51e9788e58	Fix a bug in boundary checking. Change-Id: Ifc741da9da6f61c8d3c1f675ec6b8a96570f877d	2014-07-10 09:43:04 -07:00
Yunqing Wang	75cd57503d	Refactor vp9_diamond_search_sad function Currently, vp9_diamond_search_sadx4() is only called when sse3 is enabled, which is improper since sse2 optimization of sdx4df functions are available. Changed to always use vp9_diamond_search_sadx4(). Change-Id: I4b95d6b7a3c6c645783c373f0ba8d645ece24717	2014-07-10 09:19:03 -07:00
James Zern	58609335b1	vp9_loopfilter.c: cosmetics - fix indent, spelling - drop some whitespace in some comments - add an assert in vp9_setup_mask, it shouldn't be called on decode error Change-Id: Ic312a815e977a6f9cb81ceb7b039eeada76c5aa0	2014-07-09 17:27:57 -07:00
Yunqing Wang	30117a576d	Refactor refining_search_sad code There are sse2 optimization of sdx4df functions. Instead of calling vp9_refining_search_sadx4 only when sse3 is enabled, call it always. Change-Id: I24f93818f7d4209d1425039e0eb099ff9ff08fe9	2014-07-09 16:50:11 -07:00
Jingning Han	f6bf614b2f	Merge "Re-design quantization process for 32x32 transform block"	2014-07-09 11:55:26 -07:00
hkuang	b84ee5a3d0	Merge "Move vp9_thread.* to common."	2014-07-09 10:16:13 -07:00
Jingning Han	9ad1b9fc67	Re-design quantization process for 32x32 transform block This commit enables a new quantization process for 32x32 2D-DCT transform coefficient blocks. It improves the compression performance of speed 5 by 1.4%. The overall compression gains of speed 5 due to the new quantization scheme is 4.7%. It also includes the SSSE3 implementation of the 32x32 quantization process. Change-Id: I0855b124fd6462418683f783f5bcb44255c9993b	2014-07-08 16:55:28 -07:00
Adrian Grange	7c43fb67ae	Fix decoder handling of intra-only frames This patch fixes bug 633: https://code.google.com/p/webm/issues/detail?id=633 The first decoded frame does not have to be a keyframe, it could be an inter-frame that is coded intra-only. This patch fixes the handling of intra-only frames. A test vector has also been added that encodes 3 intra-only frames at the start of the clip. The test vector was generated using the code in the following patch: https://gerrit.chromium.org/gerrit/#/c/70680/ Change-Id: Ib40b1dbf91aae2bc047e23c626eaef09d1860147	2014-07-08 16:24:03 -07:00
hkuang	337e8015c9	Move vp9_thread.* to common. Prepare for frame parallel decoding, the reference count buffers need to be protected by mutex. Move vp9_thread.* to common folder so that those buffers could use cross-platform mutex from vp9_thread.*. Change-Id: I541277cf15eefed6641555944f67f4a0bcdc8154	2014-07-07 14:52:19 -07:00
Yaowu Xu	82fd084b35	Merge "Re-design quantization process"	2014-07-01 19:04:01 -07:00
Jingning Han	9ac2f66320	Re-design quantization process This commit re-designs the quantization process for transform coefficient blocks of size 4x4 to 16x16. It improves compression performance for speed 7 by 3.85%. The SSSE3 version for the new quantization process is included. The average runtime of the 8x8 block quantization is reduced from 285 cycles -> 255 cycles, i.e., over 10% faster. Change-Id: I61278aa02efc70599b962d3314671db5b0446a50	2014-07-01 17:00:07 -07:00
Alex Converse	6c54dbcb69	Merge "BITSTREAM: Handle transform size and motion vectors more logically for non-420."	2014-06-30 17:44:01 -07:00
James Zern	44472cde55	vp9: disable postproc buffer alloc when unnecessary the buffer is only used in encoding and only when CONFIG_INTERNAL_STATS or CONFIG_VP9_POSTPROC is enabled. a future change should decouple this from the frame buffer allocation and make it conditional based on runtime flags when the above config options are enabled. reduces decode heap usage by at least 12% Change-Id: Id0b97620d4936afefa538d3aadf32106743d9caf	2014-06-27 20:59:56 -07:00
Jim Bankoski	52b63c238e	Merge "Better validation of invalid files"	2014-06-27 11:05:21 -07:00
Jim Bankoski	9f37d149c1	Better validation of invalid files This patch checks that a decoder never tries to reference frame that's outside the range of 2x to 1/16th the size of this frame. Any attempt to do so causes a failure. Change-Id: I5c98fa7bb95ac4f29146f29dd92b62fe96164e4c	2014-06-27 10:03:15 -07:00
Jingning Han	46ea9ec719	Enable real-time version reference motion vector search This commit enables a fast reference motion vector search scheme. It checks the nearest top and left neighboring blocks to decide the most probable predicted motion vector. If it finds the two have the same motion vectors, it then skip finding exterior range for the second most probable motion vector, and correspondingly skips the check for NEARMV. The runtime of speed -5 goes down pedestrian at 1080p 29377 ms -> 27783 ms vidyo at 720p 11830 ms -> 10990 ms i.e., 6%-8% speed-up. For rtc set, the compression performance goes down by about -1.3% for both speed -5 and -6. Change-Id: I2a7794fa99734f739f8b30519ad4dfd511ab91a5	2014-06-26 09:49:13 -07:00
Adrian Grange	8357292a5a	Fix test on maximum downscaling limits There is a normative scaling range of (x1/2, x16) for VP9. This patch fixes the maximum downscaling tests that are applied in the convolve function. The code used a maximum downscaling limit of x1/5 for historic reasons related to the scalable coding work. Since the downsampling in this application is non-normative it will revert to using a separate non-normative scaler. Change-Id: Ide80ed712cee82fe5cb3c55076ac428295a6019f	2014-06-24 10:26:09 -07:00
Adrian Grange	8c1f071f1e	Allocate buffers based on correct chroma format The encoder currently allocates frame buffers before it establishes what the chroma sub-sampling factor is, always allocating based on the 4:4:4 format. This patch detects the chroma format as early as possible allowing the encoder to allocate buffers of the correct size. Future patches will change the encoder to allocate frame buffers on demand to further reduce the memory profile of the encoder and rationalize the buffer management in the encoder and decoder. Change-Id: Ifd41dd96e67d0011719ba40fada0bae74f3a0d57	2014-06-23 11:45:13 -07:00
Jingning Han	961bafc366	Merge "Remove unused vp9_init_quant_tables function"	2014-06-23 09:37:30 -07:00
Johann	1fc2b0fd00	Merge "Include type defines"	2014-06-20 11:29:19 -07:00
Johann	d658216276	Don't return value for void functions Clears "warning: 'return' with a value, in function returning void" Change-Id: I93972610d67e243ec772a1021d2fdfcfc689c8c2	2014-06-20 11:26:44 -07:00
Johann	baef0b89da	Include type defines Clears error: unknown type name 'uint8_t' Change-Id: I9b6eff66a5c69bc24aeaeb5ade29255a164ef0e2	2014-06-20 11:26:13 -07:00
Alex Converse	7557a65d16	BITSTREAM: Handle transform size and motion vectors more logically for non-420. This breaks the profile 1 bitstream. Don't force non420 uv transform size to 1/4 y size. In the 4:2:0 case the chroma corresponding to a luma block is 1/4 its size. In the 4:4:4 case chroma and luma planes are the same size. Disallowing larger transforms can result in a loss of compression efficiency and is inconsistent. For sub-8x8 blocks only average corresponding motion vectors. 4:2:0 and profile 0 behavior remains unchanged. Change-Id: I560ae07183012c6734dd1860ea54ed6f62f3cae8	2014-06-18 13:07:51 -07:00
Jingning Han	3b9c19aaa7	Remove unused vp9_init_quant_tables function This function is not effectively used, hence removed. Change-Id: I2e8e48fa07c7518931690f3b04bae920cb360e49	2014-06-18 11:51:41 -07:00
James Zern	88df435d6b	Merge "vp9_rtcd: correct avx2 references"	2014-06-16 17:39:13 -07:00
Johann	79afb5eb41	Use lrand48 on Android When building x86 assembly use lrand48 instead of the undocumented inlined _rand function. Android now supports rand() https://android-review.googlesource.com/97731 but only for new versions. Original workaround: https://gerrit.chromium.org/gerrit/15744 Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a	2014-06-12 19:57:25 -07:00
Jingning Han	d5ae43318e	Merge "Fast computation path for forward transform and quantization"	2014-06-12 11:59:52 -07:00
Jingning Han	ccba289f8d	Fast computation path for forward transform and quantization This commit enables a fast path computational flow for forward transformation. It checks the sse and variance of prediction residuals and decides if the quantized coefficients are all zero, dc only, or more. It then selects the corresponding coding path in the forward transformation and quantization stage. It is currently enabled in rtc coding mode. Will do it for rd coding mode next. In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up. Overall coding performance for rtc set is changed by -0.18%. Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1	2014-06-12 11:10:54 -07:00
James Zern	9f3a0dbb5e	vp9_rtcd: correct avx2 references s/"\$avx2_x86inc"/"avx2"/ avx2 code is all intrinsics and as a result doesn't rely on x86inc.asm Change-Id: I76ad39474d8a00658f3e43131830ef0f4f34772a	2014-06-10 16:26:36 -07:00
James Zern	cbce09ce62	Merge changes I6abc0657,I8224fba2,I04f64a45,I5d49d119,I76b4d171,I88c11ac3 * changes: vp9_sub_pixel_variance: disable avx2 variants vp9_sad*x4d: disable avx2 variants vp9_f(dct\|ht): disable avx2 variants convolve: disable avx2 variants fdct8x8_test: add missing avx2 functions dct4x4_test: add missing avx2 functions	2014-06-10 16:14:45 -07:00
James Zern	520cb3f39f	vp9_sub_pixel_variance: disable avx2 variants tests failing under Win32/Win64 + variance_test: add missing avx2 functions (partially disabled) Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d	2014-06-10 16:11:15 -07:00
James Zern	d3ff009d84	vp9_sad*x4d: disable avx2 variants tests failing under Win32/Win64 + sad_test: add missing avx2 functions (disabled) Change-Id: I8224fba2b270f6039ab1877d71e1e512f0081856	2014-06-10 16:10:12 -07:00
hkuang	cdffeaaae0	Add mode info arrays and mode info index. In non frame-parallel decoding, this works the same way as current decoding scheme. Every time after decoder finish decoding a frame, it will swap the current mode info pointer and previous mode info pointer if the decoded frame needs to be shown. Both mode info pointer and previous mode info pointer are from mode info arrays. In frame-parallel decoding, this will become more complicated as current frame's mode info pointer will be shared with next frame as previous mode info pointer. But when one decoder thread finishes decoding one frame and starts to work on next available frame, it needs to retain the decoded frame's mode info pointers until next frame finishes decoding. The mode info index will serve this purpose. The decoder will use different buffer in the mode info arrays and use the other buffer to save previous decoded frame’s mode info. Change-Id: If11d57d8eb0ee38c8876158e5482177fcb229428	2014-06-10 13:43:36 -07:00
James Zern	dd9f502933	vp9_f(dct\|ht): disable avx2 variants tests failing under Win32/Win64 + dct16x16_test: add missing avx2 functions (partially disabled) exercises the forward transforms no idct/iht implementations, so the c-code is used Change-Id: I04f64a457fa0828a00f32b5c9fe4f55294f21f61	2014-06-09 18:48:11 -07:00
James Zern	5704578f5f	convolve: disable avx2 variants tests failing under Win32/Win64 Change-Id: I5d49d11911bcda3a832b14efe5500d22597bedcf	2014-06-09 18:42:03 -07:00
Jingning Han	0c4a4225ec	Merge "Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs"	2014-06-03 16:51:39 -07:00
Dmitry Kovalev	19c492a749	Merge "Reusing existing vp9_get{8x8, 16x16}var() instead of new ones."	2014-06-03 10:04:27 -07:00
Deb Mukherjee	fc88292ef2	Remove Wextra warnings from vp9_sad.c As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Fixes a bug in original patch: (https://gerrit.chromium.org/gerrit/#/c/70163/8) that was reverted due to a nightly test failure. Change-Id: Ia2a4e9e278fd3c89d6c3c82fcc6381320ec2a8a6	2014-06-02 13:50:20 -07:00
Frank Galligan	c40a968e13	Merge "Revert "Remove Wextra warnings from vp9_sad.c""	2014-06-01 16:58:11 -07:00
Frank Galligan	0b44988952	Revert "Remove Wextra warnings from vp9_sad.c" This reverts commit `916550428d` Change-Id: I500822b03f09c64ff6ec5396c68edee9ca3b75cb	2014-06-01 16:20:26 -07:00
Jingning Han	ba6bed372b	Merge "Fix a potential overflow issue in inverse 16x16 full 2D-DCT"	2014-05-30 15:52:53 -07:00
Jingning Han	2c1cdf69b6	Fix a potential overflow issue in inverse 16x16 full 2D-DCT An overflow issue could potentially happen in the second round 1-D transform of the SSSE3 full inverse 16x16 2D-DCT. This commit fixes this issue. Change-Id: Ia19e4888fda1cc929a28a5f89a5beec612d628dc	2014-05-29 11:46:32 -07:00
Dmitry Kovalev	e14f900ae3	Merge "Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK."	2014-05-29 11:16:39 -07:00
Dmitry Kovalev	f7ff24cdd0	Reusing existing vp9_get{8x8, 16x16}var() instead of new ones. Change-Id: I87b7c657d8813d7fb383ab519d150c0ffb1dd377	2014-05-29 11:14:06 -07:00
Jingning Han	6d21cbd20b	Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs This commit enables SSSE3 implementation of the inverse 2D-DCT with only first 10 coefficients non-zero. It reduces the runtime of SSE2 version from 745 cycles to 538 cycles, i.e., 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe	2014-05-28 10:53:33 -07:00
Jingning Han	d5bcef5242	Merge "Fix compiling error in MSVS"	2014-05-27 16:58:00 -07:00
Jingning Han	239e68ddbf	Fix compiling error in MSVS Need to include math.h before tmmintrin.h in some versions of MSVS. Change-Id: Ia6b83ae599316887ecf30c4e4b9e4355fb8a4219	2014-05-27 15:58:47 -07:00
Yunqing Wang	1f2200080b	Revert "Making vp9_get_sse_sum_{8x8, 16x16} static." This reverts commit `e8bbb3d9db`. Change-Id: Ie368d36fd249d323d859d208609c711f04537bbc	2014-05-27 13:37:08 -07:00
Deb Mukherjee	444f93945b	Merge "Remove Wextra warnings from vp9_sad.c"	2014-05-27 11:54:05 -07:00
Yunqing Wang	a591ac9e5a	Merge "Fix decoder mismatch in sub-pixel AVX2 intrinsic filters"	2014-05-27 10:52:16 -07:00
levytamar82	773596050f	Fix decoder mismatch in sub-pixel AVX2 intrinsic filters The subpixel SSSE3 was fixed in this patch: https://gerrit.chromium.org/gerrit/#/c/70283/ So the equivalent AVX2 is fixed accordingly. Change-Id: Ieebbc1949c99d34b12b8b47692df71aca5001f3a	2014-05-23 16:48:40 -07:00
Jingning Han	59c3f446fe	Merge "Inverse 16x16 2D-DCT SSSE3 implementation"	2014-05-23 16:01:22 -07:00
Jingning Han	48b0891370	Inverse 16x16 2D-DCT SSSE3 implementation This commit enables the SSSE3 implementation of full inverse 16x16 2D-DCT. The unit runtime goes down from 1642 cycles to 1519 cycles, about 7% speed-up. Change-Id: I14d2fdf9da1fb4ed1e5db7ce24f77a1bfc8ea90d	2014-05-23 15:09:35 -07:00
Yunqing Wang	67ca5b586a	Merge "Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters"	2014-05-23 14:24:48 -07:00
Dmitry Kovalev	d7d7cedaaa	Merge "Removing vp9_pragmas.h."	2014-05-23 12:58:00 -07:00
Yunqing Wang	c5443fc881	Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters In 8-tap filtering, to guarantee the intermediate results fit in 16 bits, the order of accumulating the products needs to be done correctly, and the largest product should be added last. This patch fixed the problem using the method in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation". Change-Id: I79d0ad60c057b15011ece84cda9648eee0809423	2014-05-23 11:52:20 -07:00
Yaowu Xu	9410330893	Merge "change to use assembly version of ssse3 filter code"	2014-05-23 08:02:28 -07:00
Deb Mukherjee	916550428d	Remove Wextra warnings from vp9_sad.c As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Change-Id: I068cc2391eed51e9b140ea6aba78338c5fec8d71	2014-05-22 22:21:16 -07:00
Yaowu Xu	7a0c9b82f2	change to use assembly version of ssse3 filter code As mismatchs were found between the intrinsic version and c only. The commit temporarily revert to use the matching assembly version to allow further investigation. Change-Id: I08436c47d4888b562c0eac8e8856d90a831442df	2014-05-22 17:11:57 -07:00
Yunqing Wang	aaf204e550	Merge "Fix a decoding mismatch in sub-pixel filters"	2014-05-22 17:09:14 -07:00
Yunqing Wang	efcdf946ed	Fix a decoding mismatch in sub-pixel filters This did the same correction as the one in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" to avoid saturation during filtering. Change-Id: Ife9aa3f62daf9114eb24fe38f7baa3c3f361b2d6	2014-05-22 15:42:13 -07:00
Dmitry Kovalev	72ab966d5e	Removing vp9_pragmas.h. Change-Id: I9120a87e27e73e496932d11716937e2fad246521	2014-05-22 13:46:31 -07:00

1 2 3 4 5 ...

2562 Commits