generic-library/vpx

Author	SHA1	Message	Date
Yi Luo	0f80b1f754	Optimized HBD block subtraction for all block sizes - Interface function takes a local MxN function to call based on the block size. - Repetition call (w/o cache line miss) shows improvement: ~63% - ~340%. - Overall encoder speed improvement: ~0.9%. Change-Id: Ieff8f3d192415c61d6d58d8b99bb2a722004823f	2016-04-12 12:04:43 -07:00
Geza Lore	61af8981b0	Extend variance based partitioning to 128x128 superblocks Change-Id: I41edf266d5540a9b070a5e65bc397dd3da210507	2016-04-12 09:40:11 +01:00
Yi Luo	e5f4e8eab9	Some cosmetic improvements since HBD variance 4x4 optimization Change-Id: I414c1fabd2e3a9b1d9daa8a90f85a0bace8bd3cd	2016-04-08 10:32:13 -07:00
Geza Lore	454989ff32	Make superblock size variable at the frame level. The uncompressed frame header contains a bit to signal whether the frame is encoded using 64x64 or 128x128 superblocks. This can vary between any 2 frames. vpxenc gained the --sb-size={64,128,dynamic} option, which allows the configuration of the superblock size used (default is dynamic). 64/128 will force the encoder to always use the specified superblock size. Dynamic would enable the encoder to choose the sb size for each frame, but this is not implemented yet (dynamic does the same as 128 for now). Constraints on tile sizes depend on the superblock size; the following is a summary of the current bitstream syntax and semantics: If both --enable-ext-tile and --enable-ext-partition are OFF: The tile coding in this case is the same as VP9. In particular, tiles have a minimum width of 256 pixels and a maximum width of 4096 pixels. The tile width must be a multiple of 64 pixels (except for the rightmost tile column). There can be a maximum of 64 tile columns and 4 tile rows. If --enable-ext-tile is OFF and --enable-ext-partition is ON: Same constraints as above, except that the tile width must be a multiple of 128 pixels (except for the rightmost tile column). There is no change in the bitstream syntax used for coding the tile configuration if --enable-ext-tile is OFF. If --enable-ext-tile is ON and --enable-ext-partition is OFF: This is the new large scale tile coding configuration. The minimum/maximum tile width and height are 64/4096 pixels. Tile width and height must be multiples of 64 pixels. The uncompressed header contains two 6 bit fields that hold the tile width/height in units of 64 pixels. The maximum number of tile rows/columns is only limited by the maximum frame size of 65536x65536 pixels that can be coded in the bitstream. This yields a maximum of 1024x1024 tile rows and columns (of 64x64 tiles in a 65536x65536 frame). If both --enable-ext-tile and --enable-ext-partition are ON: Same applies as above, except that in the bitstream the 2 fields containing the tile width/height are in units of the superblock size, and the superblock size itself is also coded in the bitstream. If the uncompressed header signals the use of 64x64 superblocks, then the tile width/height fields are 6 bits wide and are in units of 64 pixels. If the uncompressed header signals the use of 128x128 superblocks, then the tile width/height fields are 5 bits wide and are in units of 128 pixels. The above is a summary of the bitstream. The user interface to vpxenc (and the equivalent encoder API) behaves as follows: If --enable-ext-tile is OFF: No change in the user interface. --tile-columns and --tile-rows specify the base 2 logarithm of the desired number of tile columns and tile rows. The actual number of tile rows and tile columns, and the particular tile width and tile height are computed by the codec ensuring all of the above constraints are respected. If --enable-ext-tile is ON, but --enable-ext-partition is OFF: No change in the user interface. --tile-columns and --tile-rows specify the WIDTH and HEIGHT of the tiles in units of 64 pixels. The valid values are in the range [1, 64] (which corresponds to [64, 4096] pixels in increments of 64). If both --enable-ext-tile and --enable-ext-partition are ON: If --sb-size=64 (default): The user interface is the same as in the previous point. --tile-columns and --tile-rows specify tile WIDTH and HEIGHT, in units of 64 pixels, in the range [1, 64] (which corresponds to [64, 4096] pixels in increments of 64). If --sb-size=128 or --sb-size=dynamic: --tile-columns and --tile-rows specify tile WIDTH and HEIGHT, in units of 128 pixels in the range [1, 32] (which corresponds to [128, 4096] pixels in increments of 128). Change-Id: Idc9beee1ad12ff1634e83671985d14c680f9179a	2016-04-07 10:34:25 +01:00
James Zern	5ab46e0ecd	Merge changes I7a1c0cba,Ie02b5caf,I2cbd85d7,I644f35b0 * changes: vpx_fdct16x16_1_sse2: improve load pattern vpx_fdct16x16_1_c/msa: fix accumulator overflow vpx_fdctNxN_1_sse2: reduce store size dct32x32_test: add PartialTrans32x32Test, Random	2016-04-06 02:51:53 +00:00
James Zern	38bc1d0f4b	vpx_fdct16x16_1_sse2: improve load pattern load the full row rather than doing 2 8-wide columns Change-Id: I7a1c0cba06b0dc1ae86046410922b1efccb95c95	2016-04-04 16:03:42 -07:00
James Zern	eb64ea3e89	vpx_fdct16x16_1_c/msa: fix accumulator overflow tran_low_t is only signed 16-bits in non-high-bitdepth mode Change-Id: Ie02b5caf2658e8e71f995c17dd5ce666a4d64918	2016-04-04 16:03:41 -07:00
James Zern	3735def667	vpx_fdctNxN_1_sse2: reduce store size only output[0] needs to be set, store_output is more involved than a movdqa in the high bitdepth case Change-Id: I2cbd85d7cf74688bdf47eb767934fe42e02bff67	2016-04-04 16:02:06 -07:00
Yi Luo	250935cab3	Optimized HBD 4x4 variance calculation vpx_highbd_8/10/12_variance4x4_sse4_1 improves performance ~7%-11%. Change-Id: Ida22bb2a2f7a58037cfd73e186d4f6267a960c02	2016-04-04 11:28:59 -07:00
James Zern	c21d437052	vpx_fdct32x32_1_msa: fix accumulator overflow Change-Id: I33a5432eda3416382e1cea06b45082c0c65faa75	2016-04-02 11:04:38 -07:00
James Zern	f4cae05cd4	vpx_fdctNxN_1_c: remove unnecessary store only output[0] needs to be set, the other values will be ignored in this case. Change-Id: I8e9692fc0d6d85700ba46f70c2e899a956023910	2016-04-01 12:21:59 -07:00
James Zern	0269df41c1	vpx_fdct32x32_1_c: fix accumulator overflow tran_low_t is only 16-bits in non-high-bitdepth mode Change-Id: Ifc06110c95e86e6d790c44250d52a538b2e9713b	2016-03-30 15:20:20 -07:00
Geza Lore	552d5cd715	Extend superblock size to 128x128 pixels. If --enable-ext-partition is used at build time, the superblock size (sometimes also referred to as coding unit (CU) size) is extended to 128x128 pixels. Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a	2016-03-30 18:23:06 +01:00
Yaowu Xu	c810740c36	Merge branch 'masterbase' into nextgenv2 Conflicts: vp9/encoder/vp9_encoder.c vpx_dsp/x86/convolve.h Change-Id: I60c3532936bedd796a75dfe78245a95ec21e2e55	2016-03-28 17:44:28 -07:00
Yunqing Wang	5f5552d846	Optimize HBD up-sampled prediction functions Optimized 2 up-sampled reference prediction functions in high-bit depth case. This reduced the HBD encoding time by 3%. Change-Id: I8663ffb5234f5e70168c0fc9ca676309fe8e98f2	2016-03-14 19:04:33 -07:00
Yunqing Wang	e6e2d886d3	Add high-precision sub-pixel search as a speed feature Using the up-sampled reference frames in sub-pixel motion search is enabled as a speed feature for good-quality mode speed 0 and speed 1. Change-Id: Ieb454bf8c646ddb99e87bd64c8e74dbd78d84a50	2016-03-11 16:32:11 -08:00
Debargha Mukherjee	f34deab243	Adds compound wedge prediction modes Incorporates wedge compound prediction modes. Change-Id: Ie73b54b629105b9dcc5f3763be87f35b09ad2ec7	2016-03-10 07:19:54 -08:00
Scott LaVarnway	67c4c8244a	VPX: loopfilter_mmx.asm using x86inc 2 This reverts commit 9aa083d164e0d39086aa0c83f0d1a0d0f0d1ba61. Fixes a decoder mismatch with 32bit PIC builds. Change-Id: I94717df662834810302fe3594b38c53084a4e284	2016-03-08 04:24:47 -08:00
Geza Lore	938b8dfc73	Extend convolution functions to 128x128 for ext-partition. Change-Id: I7f7e26cd1d58eb38417200550c6fbf4108c9f942	2016-03-07 11:39:27 +00:00
James Zern	9aa083d164	Revert "VPX: loopfilter_mmx.asm using x86inc" This reverts commit 15ecdc3970462c15fdf7185d373cb52664f40c0f. breaks 32-bit pic builds Change-Id: I8bb1b9471a293f05ac7423aaba0339d408931b7a	2016-03-04 18:23:45 -08:00
Geza Lore	697bf5beff	Add 128 pixel variance and SAD functions Change-Id: I8fde245b32c9e586683a28aa6925da0b83850b39	2016-03-03 10:24:29 +00:00
Debargha Mukherjee	1d69ceee5c	Adds masked variance and sad functions for wedge Adds masked variance and sad functions needed for wedge prediction modes to come. Change-Id: I25b231bbc345e6a494316abb0a7d5cd5586a3a54	2016-03-01 17:28:56 -08:00
Yunqing Wang	342a368fd4	Do sub-pixel motion search in up-sampled reference frames Up-sampled the reference frames to 8 times in each dimension using the 8-tap interpolation filter. In sub-pixel motion search, use the up-sampled reference frames to find the best matching blocks. This largely improved the motion search precision, and thus, improved the compression quality. There was no change in decoder side. Borg test and speed test results: 1. On derflr set, Overall PSNR gain: 1.306%, and SSIM gain: 1.512%. Average speed loss on derf set was 6.0%. 2. On stdhd set, Overall PSNR gain: 0.754%, and SSIM gain: 0.814%. On hevchd set, Overall PSNR gain: 0.465%, and SSIM gain: 0.527%. Speed loss on HD clips was 3.5%. Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f	2016-02-29 12:14:47 -08:00
Scott LaVarnway	dd6729f826	VPX: Remove pmin/pmax from subpixel functions. These instructions are unnecessary if the adds are done in the correct order. Change-Id: I4e533b8267c32e610a4b94203ad052dc9fdabd71	2016-02-27 05:47:56 -08:00
Scott LaVarnway	51beb29f52	Merge "VPX: vpx_filter_block1d16_(v8, v8_avg)"	2016-02-27 13:31:18 +00:00
hui su	4aeabf1b0d	Fix compiler warnings Change-Id: Id7240260cec471a3f8d0986b9c8df06efda925f9	2016-02-26 13:52:49 -08:00
Yaowu Xu	a570cefcf8	Merge "Extend vpxssim to handle more HBD combinations" into nextgenv2	2016-02-26 15:57:40 +00:00
James Zern	654d2163c9	x86/convolve.h: remove redundant check in FUN_CONV_2D the filter will be the same in this case Change-Id: I95159bcb05bbfb71b57da741393e80cc7ffc5cff	2016-02-25 23:31:50 -08:00
James Zern	6d8c8c6201	x86/convolve.h: replace while w/if for w < 16 in non-hbd configurations; any high-bitdepth changes will be done in a follow-up Change-Id: Ia74e30971b744c1faab68c92fdeda1a053988c77	2016-02-25 21:44:06 -08:00
Scott LaVarnway	1f736e400f	VPX: vpx_filter_block1d16_(v8, v8_avg) Store result with one 16 byte store instead of two 8 byte stores. Change-Id: I43acbc5edfd6d6055a926f9b9605d47127400f09	2016-02-25 06:15:24 -08:00
James Zern	b3ceb629ba	x86/convolve.h: change filter[] || chains to | Change-Id: I661f64390f232826857b259e7a67e77f5a3a91ad	2016-02-24 19:47:43 -08:00
hui su	8537826eb4	Fix some compiler warnings. "taking the absolute value of unsigned type 'unsigned int' has no effect" Change-Id: Iea1f67c2a3171a98ca89d5dc7192a5508d086c16	2016-02-24 11:17:33 -08:00
Yaowu Xu	aa6c754635	Merge remote-tracking branch 'webm/master' into nextgenv2	2016-02-24 10:53:17 -08:00
Scott LaVarnway	06d0e2fe6c	BUG FIX: vpx_filter_block1d(8,4)_(v8, v8_avg) Change-Id: Ic7ea79988ed0864e7ddbfeb312516bcf77eaaac1	2016-02-23 12:23:41 -08:00
Yaowu Xu	eeaf8e6b6c	Extend vpxssim to handle more HBD combinations Change-Id: I38426d946b74c9090a265d34b89e2db6693927c2	2016-02-22 16:09:08 -08:00
Yaowu Xu	38cfc45e07	Cleanup psnr.h Change-Id: Id026e72ee655ee5bd645a89e378da0d462be367d	2016-02-22 15:37:40 -08:00
Yaowu Xu	d1c5cd4a30	Add shift stage in FASTSSIM computation This commit adds a shift stage for FASTSSIM computation when the source bit depth differs from the working bit depth, to make sure metric results are calculated at a bit_depth consistent with the source. Change-Id: I997799634076ef7b00fd051710544681ed536185	2016-02-22 14:58:10 -08:00
Yaowu Xu	195bf52bca	Add shift stage for PSNRHVS computation This commit adds the ability to shift down the working buffer when the source bit_depth differs from the working bit_depth, so that the result is consistent with the source bit_depth. Change-Id: Idfdbfc614d73fe445d62e35e642cc7d75e9dc4ff	2016-02-22 10:22:42 -08:00
Yaowu Xu	6e695da2d9	Move psnrhvs function declaration to psnr.h From "ssim.h" Change-Id: Ie53378794149ef8a844b4eb47ad4f08579de4b60	2016-02-22 08:38:49 -08:00
Scott LaVarnway	15ecdc3970	VPX: loopfilter_mmx.asm using x86inc Change-Id: Idcf29281d617b275e3ca50f77e6d00c60992a36d	2016-02-18 15:34:58 -08:00
Yaowu Xu	acc4addb60	Merge "Add tests for Highbitdepth PSNR metric computations" into nextgenv2	2016-02-18 01:01:00 +00:00
Yaowu Xu	7823fbb45c	Merge "Move PSNR related functions into vpx_dsp/psnr.c" into nextgenv2	2016-02-18 01:00:54 +00:00
Yaowu Xu	9fb593d0fc	Add tests for Highbitdepth PSNR metric computations Change-Id: I07324155f73bbdbe25bb7a7ccd587ebf9010ac7a	2016-02-17 21:28:22 +00:00
Yaowu Xu	7538501ad1	Move PSNR related functions into vpx_dsp/psnr.c This puts all metric computation in one place and also gets rid of duplicate code between vp9 and vp10. Change-Id: I24a2707d183a2419cd18a8343010adae185ffcd4	2016-02-17 13:05:34 -08:00
Debargha Mukherjee	35d9eadf08	Merge "Extends ext-tx to support 32x32 masked transforms" into nextgenv2	2016-02-17 18:33:10 +00:00
Debargha Mukherjee	7485498773	Extends ext-tx to support 32x32 masked transforms Adds new 32x32 masked 1-d transforms that combine 1-D length-16 DCT with length-16 identity transforms. To be continued in subsequent patches. Change-Id: I0b4f66492d44c079b3c3b531ba48a97201de1484	2016-02-17 09:31:34 -08:00
Yaowu Xu	6ed7f7a516	Merge branch 'master' into nextgenv2	2016-02-17 07:23:58 -08:00
James Zern	9b44d9d00f	split vpx_highbd_lpf_horizontal_16 in two replace with vpx_highbd_lpf_horizontal_edge_16 and vpx_highbd_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I551f8cec0fce57032cb2652584bb802e2248644d	2016-02-16 23:13:58 -08:00
James Zern	1b519fb666	split vpx_lpf_horizontal_16 in two replace with vpx_lpf_horizontal_edge_16 and vpx_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I848c95c02a3c6ebaa6c2bdf0983dce05cd645271	2016-02-16 22:57:45 -08:00
James Zern	e7a23d703b	vpx_highbd_lpf_horizontal_4: remove unused count param Change-Id: I655a771e1b1a8753be5669ef9348a312ba6cfdbc	2016-02-16 22:57:45 -08:00

408 Commits
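
The tile-size rules described in commit 454989ff32 for builds with --enable-ext-tile ON can be sketched as a small C helper. This is only an illustration of the ranges quoted in that commit message, not code from the repository: the function names `tile_units_to_pixels` and `tile_field_bits` are hypothetical, and the sketch assumes the caller has already resolved --sb-size=dynamic to 128, which the message says is the current behavior.

```c
/* Hypothetical helpers illustrating commit 454989ff32; these names do not
 * exist in libvpx. They only restate the ranges given in the commit message
 * for the --enable-ext-tile ON case. */

/* Convert a --tile-columns / --tile-rows value (given in superblock units)
 * to a tile dimension in pixels. Returns -1 if the superblock size is not
 * 64 or 128, or if the value is outside [1, 4096 / sb_size]. */
static int tile_units_to_pixels(int units, int sb_size) {
  const int max_tile_pixels = 4096; /* maximum tile width/height */
  if (sb_size != 64 && sb_size != 128) return -1;
  if (units < 1 || units > max_tile_pixels / sb_size) return -1;
  return units * sb_size;
}

/* Width in bits of each tile width/height field in the uncompressed header:
 * 6 bits for 64x64 superblocks, 5 bits for 128x128 superblocks.
 * Returns -1 for any other superblock size. */
static int tile_field_bits(int sb_size) {
  if (sb_size == 64) return 6;
  if (sb_size == 128) return 5;
  return -1;
}
```

With 64x64 superblocks the accepted values are 1 to 64, and with 128x128 superblocks 1 to 32. In both cases the largest value maps to 4096 pixels, so `tile_units_to_pixels(33, 128)` is rejected while `tile_units_to_pixels(32, 128)` yields 4096.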