generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	9410330893	Merge "change to use assembly version of ssse3 filter code"	2014-05-23 08:02:28 -07:00
Deb Mukherjee	916550428d	Remove Wextra warnings from vp9_sad.c As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Change-Id: I068cc2391eed51e9b140ea6aba78338c5fec8d71	2014-05-22 22:21:16 -07:00
Yaowu Xu	7a0c9b82f2	change to use assembly version of ssse3 filter code As mismatchs were found between the intrinsic version and c only. The commit temporarily revert to use the matching assembly version to allow further investigation. Change-Id: I08436c47d4888b562c0eac8e8856d90a831442df	2014-05-22 17:11:57 -07:00
Yunqing Wang	aaf204e550	Merge "Fix a decoding mismatch in sub-pixel filters"	2014-05-22 17:09:14 -07:00
Yunqing Wang	efcdf946ed	Fix a decoding mismatch in sub-pixel filters This did the same correction as the one in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" to avoid saturation during filtering. Change-Id: Ife9aa3f62daf9114eb24fe38f7baa3c3f361b2d6	2014-05-22 15:42:13 -07:00
Dmitry Kovalev	72ab966d5e	Removing vp9_pragmas.h. Change-Id: I9120a87e27e73e496932d11716937e2fad246521	2014-05-22 13:46:31 -07:00
Deb Mukherjee	e272273443	Renames x86_64 specific asm files Renames all x86_64 specific assembly files to consistently end in _x86_64.asm. This will be useful for build systems to handle these files differently. All new 64-bit specific assembly files should use the new naming convention. Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536	2014-05-21 13:55:56 -07:00
Dmitry Kovalev	35a83677a5	Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK. The final goal is eventually to get rid of both itxm_add and fwd_txm4x4. This patch does it in the decoder. Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5	2014-05-21 11:09:44 -07:00
Deb Mukherjee	ef750d8472	Merge "Extends temporal filtering to work for 422 data"	2014-05-20 16:31:28 -07:00
Deb Mukherjee	a185bc3350	Extends temporal filtering to work for 422 data This is needed for profiles 1 and 2. Change-Id: I5dd7644c2932d055ab89e050d4be7d4117cd1028	2014-05-20 15:19:40 -07:00
hkuang	20c1edf612	Refactor decode_tiles and loopfilter code. The current decode_tiles decodes the frame one tile by one tile and then loopfilter the whole frame or use another worker thread to do loopfiltering. \|------\|------\|------\|------\| \|Tile1-\|Tile2-\|Tile3-\|Tile4-\| \|------\|------\|------\|------\| For example, if a tile video has one row and four cols, decode_tiles will decode the Tile1, then Tile2, then Tile3, then Tile4. And during decode each tile, decode_tile will decode row by row in each tile. For frame parallel decoding, decode_tiles will decode video in row order across the tiles. So the order will be: "Decode 1st row of Tile1" -> "Decode 1st row of Tile2" -> "Decode 1st row of Tile3" -> "Decode 1st row of Tile4" -> "Decode 2nd row of Tile1" -> "Decode 2nd row of Tile2" -> "Decode 2nd row of Tile3" -> "Decode 2nd row of Tile4"-> "loopfilter 1st row" Change-Id: I2211f9adc6d142fbf411d491031203cb8a6dbf6b	2014-05-20 14:47:45 -07:00
Dmitry Kovalev	c23c613fdf	Merge "Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file."	2014-05-19 10:27:16 -07:00
Dmitry Kovalev	79ba41903f	Removing MACROBLOCKD dependency from loop filter. Change-Id: I9ef40f3d95ab8f94f69e92ea25678a40956bc1ce	2014-05-16 09:48:26 -07:00
Adrian Grange	9dc9f17814	Merge "Fix post-processor macros & remove vizualization"	2014-05-16 09:01:41 -07:00
Dmitry Kovalev	619e6b539a	Merge "Removing redundant "8x8" suffix from MODE_INFO vars."	2014-05-15 17:53:31 -07:00
Jim Bankoski	ec82d2dfec	Merge "Revert "Remove Wextra warnings from vp9_sad.c""	2014-05-15 11:54:23 -07:00
Yunqing Wang	c661cf0dad	Merge "AVX2 To VP9 Block Error Optimization"	2014-05-15 11:29:29 -07:00
Dmitry Kovalev	ed784a0bc4	Removing redundant "8x8" suffix from MODE_INFO vars. Change-Id: I7ed7fecc959c6598ff98895f1a5cf7e11ac1615f	2014-05-15 11:14:42 -07:00
Adrian Grange	384bc5163c	Fix post-processor macros & remove vizualization Make all post-processor code conditionally compilable based on the CONFIG_VP9_POSTPROC macro. Also, remove the vizualization code from VP9 since it is out of date and will not compile. Change-Id: I1e9e13a09ecd43e9a3f3704c175ae8cd258ababd	2014-05-15 08:35:36 -07:00
Jim Bankoski	a16794dd31	Revert "Remove Wextra warnings from vp9_sad.c" This reverts commit 7ab9a9587b96db4edce6be916c1f02297a9555ff Nightly test http://build.webmproject.org/jenkins/view/libvpx-nightly-tests/job/libvpx%20unit%20tests%20(valgrind-2)/arch=x86_64-linux-gcc,filter=-VP8:Large./276/console Failed This patch did not address all the assembly issues some of the vp8 assembly counts on 5 arguments being passed in to this function: one example : vp8_sad8x16_wmt Please address or split this into vp9 and vp8 patches. Change-Id: I78afcc171649894f887bb8ee3c66de24aaddc7ca	2014-05-15 08:31:20 -07:00
Yaowu Xu	71854f3a6e	Merge "vp9_decodeframe.c: cleanup -wextra warnings"	2014-05-15 06:50:51 -07:00
Dmitry Kovalev	021eaabdb8	Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file. Change-Id: Id401da740b0a0141caaef9e1bcccd981e5cef4a4	2014-05-14 16:21:41 -07:00
levytamar82	1fbab853c8	AVX2 To VP9 Block Error Optimization vp9_block_error_sse2 can only handle 16 bytes at a time but the function requires to handle a sequence of 32 bytes at a time so each 16 bytes is handled in a different register. With AVX2 optimization the 32 bytes can be handled in one register instead of two in the SSE2 The vp9_block_error was optimized by 85%. The user level was optimized by 1.2% Change-Id: Ia8fffe60e61eff7432a5fbd538757894f6c319fd	2014-05-14 11:51:07 -07:00
Deb Mukherjee	9687c057f8	Merge "Remove Wextra warnings from vp9_sad.c"	2014-05-14 10:01:50 -07:00
Yaowu Xu	ed09580777	vp9_decodeframe.c: cleanup -wextra warnings Change-Id: I0315cea6a5e58182bc2556e9825ec2ef0b1480c3	2014-05-14 09:46:11 -07:00
Jingning Han	e5bbb4cfd8	Merge "Silience -wextra warnings in vp9_reconintra.c"	2014-05-14 09:25:08 -07:00
Deb Mukherjee	7ab9a9587b	Remove Wextra warnings from vp9_sad.c As a side-effect, the max_sad check is removed from the C-implementation of VP8, for consistency with VP9, and to ensure that the SAD tests common to VP8/VP9 pass. That will make the VP8 C implementation of sad a little slower but given that is rarely used in practice, the impact will be minimal. Change-Id: I7f43089fdea047fbf1862e40c21e4715c30f07ca	2014-05-14 03:17:31 -07:00
Dmitry Kovalev	eecc750b33	Merge "Moving loopfilter call to vp9_decode_frame()."	2014-05-13 17:20:26 -07:00
Jingning Han	806fa6aaca	Silience -wextra warnings in vp9_reconintra.c The warning messages complained that there are unused arguments in a few prediction modes. This structure was designed on purpose, such that a wrapper function can cover all prediction mode cases and make them readily accessible as an pointer array. This commit silences such warnings. Change-Id: I7036b6bdb70747e5327d8f6fceb154f100abc4c0	2014-05-13 12:54:23 -07:00
Adrian Grange	fd6bf31b8a	vp9_convolve.c: cleanup -wextra warnings Change-Id: I04930aca2293ebbaeb96dfedd2f9c5a55762fd2e	2014-05-13 09:57:24 -07:00
Dmitry Kovalev	ae7d3ef39f	Moving loopfilter call to vp9_decode_frame(). Inline loopfilter has been already handled in vp9_decode_frame(). Collecting all similar code in one place now. Change-Id: I358a0280fc7c2b27cca520bc1e8c16c4eb6491dd	2014-05-12 16:19:19 -07:00
Johann	ce23931a3f	Only build neon assembly for armv7 targets Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477	2014-05-12 08:52:02 -07:00
Alex Converse	ec8a3272fa	Merge "Add an x86inc MMX fwht4x4."	2014-05-09 13:48:49 -07:00
Jingning Han	9412785b02	Merge changes I3edd4b95,I4514f974,Ie7fa4386 * changes: Turn on unit tests for SSSE3 8x8 forward and inverse 2D-DCT Change eob threshold for partial inverse 8x8 2D-DCT to 12 SSSE3 8x8 inverse 2D-DCT with first 10 coeffs non-zero	2014-05-09 09:58:39 -07:00
Alex Converse	b5422fab46	Add an x86inc MMX fwht4x4. Change-Id: Ib0a73d4863478f9b8a00976379d25d2f6ebbb197	2014-05-08 12:01:27 -07:00
Jingning Han	41a350a83d	Change eob threshold for partial inverse 8x8 2D-DCT to 12 The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT allows to handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2	2014-05-08 09:48:58 -07:00
Jingning Han	9e7b09bc5d	SSSE3 8x8 inverse 2D-DCT with first 10 coeffs non-zero This commit enables ssse3 assembly implementation of the 8x8 inverse 2D-DCT with only first 10 coefficients non-zero. The average runtime for this unit goes down from 198 cycles to 129 cycles (34.8% faster). Change-Id: Ie7fa4386f6d3a2fe0d47a2eb26fc2a6bbc592ac7	2014-05-07 17:40:02 -07:00
Dmitry Kovalev	68a600d82a	Merge "Moving pair_set_epi32 macro into vp9_dct32x32_sse2.c."	2014-05-07 13:34:05 -07:00
Paul Wilkins	33b1c457ed	Revert "Add an MMX fwht4x4" Includes changes that are not compatible with VS windows builds. Amongst other things stdint.h is not supported in VS. This reverts commit 89fbf3de501b5d7fd90047192521eae3198705cd. Change-Id: Ifa86d7df250578d1ada9b539c9ff12ed0c523cdd	2014-05-07 12:53:27 +01:00
Alex Converse	75d05d5ed4	Merge "Add an MMX fwht4x4"	2014-05-06 11:12:27 -07:00
Jingning Han	d289deb04c	Merge "SSSE3 implementation of full inverse 8x8 2D-DCT"	2014-05-06 09:17:22 -07:00
Dmitry Kovalev	e8bbb3d9db	Making vp9_get_sse_sum_{8x8, 16x16} static. Change-Id: Ifb7937c977308c682986f0ce9645a0807d2aa46a	2014-05-05 19:12:38 -07:00
Alex Converse	89fbf3de50	Add an MMX fwht4x4 7% faster encoding a desktop lossless at RT speed 4. Change-Id: I41627f5b737752616b6512bb91a36ec45995bf64	2014-05-05 15:10:48 -07:00
Jingning Han	52ae97b6aa	SSSE3 implementation of full inverse 8x8 2D-DCT This commit enables SSSE3 version full inverse 8x8 2D-DCT and reconstruction. It makes the runtime of vp9_idct8x8_64_add down from 256 cycles (SSE2) to 246 cycles. Change-Id: I0600feac894d6a443a3c9d18daf34156d4e225c3	2014-05-05 10:49:27 -07:00
Dmitry Kovalev	25a666ef39	Moving pair_set_epi32 macro into vp9_dct32x32_sse2.c. Change-Id: I642a7d343677bf934e9a54cf4ad78e908620e39a	2014-05-01 16:45:49 -07:00
Jingning Han	39761eb5d6	Merge "Enable SSSE3 implementation of 8x8 forward 2D-DCT"	2014-04-30 13:41:36 -07:00
Dmitry Kovalev	d2bc8816a1	Merge "Adding search_site_config struct."	2014-04-29 16:59:47 -07:00
Jingning Han	1eaa3a76dc	Enable SSSE3 implementation of 8x8 forward 2D-DCT Assembly implementation of ssse3 8x8 forward 2D-DCT. The current version is turned on only for x86_64. The average unit runtime goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster. This translates into about 1.5% speed-up for pedestrian_area 1080p at speed 2. Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4	2014-04-29 15:49:18 -07:00
Dmitry Kovalev	9b042dc04c	Merge "Removing unused vp9_variance_halfpixvar*() functions."	2014-04-29 14:52:58 -07:00
Dmitry Kovalev	aa464eca5e	Adding search_site_config struct. Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf	2014-04-29 10:34:53 -07:00

1 2 3 4 5 ...

2368 Commits