generic-library/vpx

Author	SHA1	Message	Date
Johann	bdbecea1ba	explicitly label .text sections nasm should infer .text but does not for windows: https://bugzilla.nasm.us/show_bug.cgi?id=3392451 Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb	2017-12-01 14:33:04 -08:00
Vlad Tsyrklevich	bc29863b96	[CFI] Remove function pointer casts Control Flow Integrity [1] indirect call checking verifies that function pointers only call valid functions with a matching type signature. This change eliminates function pointer casts to make libvpx CFI-safe. [1] https://www.chromium.org/developers/testing/control-flow-integrity Change-Id: I7e08522d195a43c88cda06fa20414426c8c4372c	2017-11-20 16:36:29 -08:00
Jerome Jiang	ea14a1a965	Merge "vp9: Fix mem rel for non-ref for external buffer."	2017-11-17 00:31:16 +00:00
Scott LaVarnway	8c7213bc00	Merge "vpx: [x86] add vp9_block_error_fp_avx2()"	2017-11-10 00:45:47 +00:00
Jerome Jiang	6246d8aa76	vp9: Fix mem rel for non-ref for external buffer. Release frame buffers for non-ref when the decoder is destroyed. Enable the non ref test. BUG=b/68819248 Change-Id: Id87ef3b0a62318f9812e927cd957c05c859047fa	2017-11-09 15:47:21 -08:00
Scott LaVarnway	62ab5e99c1	vpx: [x86] add vp9_block_error_fp_avx2() SSE2 asm vs AVX2 intrinsics speed gains: blocksize 16: ~1.00 blocksize 64: ~1.17 blocksize 256: ~1.67 blocksize 1024: ~1.81 Change-Id: I2a86db239cf57e3ff617890ccb2d236aba83ad5e	2017-11-09 05:02:31 -08:00
Kyle Siefring	b383a17fa4	Support building AVX-512 and implement sadx4 for AVX-512 The added AVX-512 support requires the subset of AVX-512 added in Skylake-X. Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7	2017-11-03 13:37:23 -04:00
Linfeng Zhang	a80bdfd081	Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t Add "typedef int16_t tran_coef_t;" BUG=webm:1450 Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154	2017-09-18 09:26:03 -07:00
Linfeng Zhang	535dee0fb6	cosmetics: vp9_rtcd_defs.pl Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086	2017-09-13 12:13:55 -07:00
Linfeng Zhang	71b38a144e	Add 2 to 1 scaling NEON optimization BUG=webm:1419 Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b	2017-09-07 12:33:50 -07:00
Linfeng Zhang	d331e7a1c0	Remove get_filter_base() and get_filter_offset() in convolve so that the convolve functions are independent of table alignment. Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee	2017-09-05 15:22:36 -07:00
clang-format	7587a97551	apply clang-format Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c	2017-09-01 01:24:03 -07:00
Johann	e83d99d7b8	quantize fp: neon implementation About 4x faster when values are below the dequant threshold and 10x faster if everything needs to be calculated. Both numbers would improve if the division for dqcoeff could be simplified. BUG=webm:1426 Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2	2017-08-23 08:01:30 -07:00
Linfeng Zhang	f95686895b	Merge changes I08b562b6,Ia275940a,I51106e90 * changes: Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} Update highbd idct x86 optimizations. Update 32x32 idct sse2 and ssse3 optimizations.	2017-08-16 16:36:37 +00:00
Linfeng Zhang	3f05a70c41	Update 32x32 idct sse2 and ssse3 optimizations. Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70	2017-08-14 16:59:31 -07:00
Scott LaVarnway	fa85cf131c	vp9: strip temporal filter code when CONFIG_REALTIME_ONLY is enabled. BUG=webm:1446 Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52	2017-08-14 14:27:53 -07:00
Johann	109faffe9b	remove vp9_full_sad_search This code is unused in vp9. Only vp8 still contains references to vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4. Remove the remaining sizes and all the highbitdepth versions. BUG=webm:1425 Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8	2017-07-10 11:20:35 -07:00
James Zern	5227b8200b	vp9: remove FrameWorkerData & vp9_dthread.h the file was empty after the struct removal. the only remaining use was within vp9_dx_iface, but the wrapper became unnecessary after the removal of frame_parallel_decode. BUG=webm:1395 Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd	2017-07-05 22:32:00 -07:00
James Zern	48c4a038eb	vp9: remove (un)lock_buffer_pool there is no threaded access to this pool after the removal of frame_parallel_decode BUG=webm:1395 Change-Id: I710769b87102edc898c59eb9a2e7a91d8c49107f	2017-07-05 21:07:00 -07:00
James Zern	303cb3106b	vp9_onyxc_int,RefCntBuffer: rm unused members the last frame_worker_owner, row and col references were removed in: 131bd06e6 remove vp9_dthread.c BUG=webm:1395 Change-Id: Ia7fb2e8782b12a58d2a2263849d20a8abf06aef6	2017-06-30 12:03:07 -07:00
James Zern	fb134e759a	vp9: reduce FRAME_BUFFERS by 3 the additional buffers are unneeded with the removal of frame_parallel_decode BUG=webm:1395 Change-Id: Id9ec4cb6462af5d07a0d3cf939bd216db27d9d9e	2017-06-30 12:03:07 -07:00
James Zern	bc837b223b	VP9_COMMON: rm frame_parallel_decode this has been 0 since the removal of frame_parallel_decode in vp9_dx_iface. BUG=webm:1395 Change-Id: I3a562b2c6b82050064d2b2ccb18a3e77c700b2da	2017-06-30 12:03:07 -07:00
Linfeng Zhang	a76b6b232c	Update load_input_data() in x86 Split to load_input_data4() and load_input_data8(). Use pack with signed saturation instruction for high bitdepth. Change-Id: Icda3e0129a6fdb4a51d1cafbdc652ae3a65f4e06	2017-06-26 13:38:33 -07:00
Linfeng Zhang	cbb991b6b8	Convert 8x8 idct x86 macros to inline functions Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce	2017-06-13 16:50:43 -07:00
Jerome Jiang	0afa2dad76	Fix vp8 race when build --enable-vp9-highbitdepth. Split vp8/vp9 implementations on yv12_copy_frame_c. Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c. Clean up vp8 codes usage in vp9. BUG=webm:1435 Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d	2017-05-26 09:45:01 -07:00
Marco	d3aebeee4e	vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl Change-Id: I259d152c62864b365490368051f3c3b7d7f2f1c5	2017-05-10 12:06:44 -07:00
Marco	4e23998fb4	vp9: SVC: Add option to set downsampling filter type. Add option in SVC to set the filter type and phase for the frame level downsampling filters. For 3 spatial layers: set downsampling filter type to bilinear and set phase to 8, for lowest spatial layer. Change-Id: Id81f4b1ba93db19c1cd37b6a46d1281a2c61bc43	2017-05-09 17:22:44 -07:00
Linfeng Zhang	ecd1eb2162	Update 4x4 idct sse2 functions It's a bit faster to call idct4_sse2() in vpx_idct4x4_16_add_sse2() Change-Id: I1513be7a895cd2fc190f4a8297c240b17de0f876	2017-05-08 16:16:52 -07:00
Linfeng Zhang	2c3a2ad6f1	Merge changes I0cfe4117,I3581d80d,Ida62c941 * changes: Split dsp/x86/inv_txfm_sse2.c Update highbd idct functions arguments to use uint16_t dst Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct	2017-05-08 16:15:57 +00:00
Jerome Jiang	3453c8d6c4	Merge "vp9: Neon optimization for denoiser. Add unit tests."	2017-05-06 01:28:32 +00:00
Jerome Jiang	069eedb3a0	vp9: Neon optimization for denoiser. Add unit tests. Denoiser on Neon is 5x faster than C code. BUG=webm:1420 Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb	2017-05-05 16:40:52 -07:00
Linfeng Zhang	d5de63d2be	Update highbd idct functions arguments to use uint16_t dst BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5	2017-05-03 13:59:16 -07:00
Linfeng Zhang	081b39f2b7	Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct BUG=webm:1388 Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112	2017-05-03 13:58:31 -07:00
Linfeng Zhang	e8655d49f5	Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()"	2017-05-01 19:54:40 +00:00
Johann	657f3e9f14	Use uint32_t for accumulator Be specific about the data type size. Use convenience macro vp9_zero_array. Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd	2017-04-28 06:36:59 -07:00
Johann Koenig	94ebdba71d	Merge "vp9 temporal filter: sse4 implementation"	2017-04-28 13:22:41 +00:00
Johann	6dfeea6592	vp9 temporal filter: sse4 implementation Approximates division using multiply and shift. Speeds up both sizes (8x8 and 16x16) by 30 times. Fix the call sites to use the RTCD function. Delete sse2 and mips implementation. They were based on a previous implementation of the filter. It was changed in Dec 2015: `ece4fd5d22` BUG=webm:1378 Change-Id: I0818e767a802966520b5c6e7999584ad13159276	2017-04-26 22:03:05 -07:00
Linfeng Zhang	4758d20227	Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor() BUG=webm:1388 Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d	2017-04-25 14:32:20 -07:00
Linfeng Zhang	51dc998f3a	Update highbd convolve functions arguments to use uint16_t src/dst BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42	2017-04-25 14:22:19 -07:00
Linfeng Zhang	fbbdba3b04	Merge changes I9e18a73b,Ie47c8cd4 * changes: Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Create CAST_TO_BYTEPTR/SHORTPTR	2017-04-19 23:55:58 +00:00
Linfeng Zhang	bf8a49abbd	Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve Replace by CAST_TO_BYTEPTR/SHORTPTR. The rule is: if a short ptr is casted to a byte ptr, any offset operation on the byte ptr must be doubled. We do this by casting to short ptr first, adding offset, then casting back to byte ptr. BUG=webm:1388 Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248	2017-04-19 12:13:49 -07:00
Marco	348bdc0195	vp9: Add phase to get averaging filter for 1:2 downsampling. The scaling filter with zero shift will give sub-sampling for 2x downsampling. Allow for a phase shift to get an averaging filter. Usage is for source scaling in 1 pass SVC mode for 1:2 downscale. Reduces aliasing in downsampled image. Keep the phase to 0/off for now. Change-Id: Ic547ea0748d151b675f877527e656407fcf4d51e	2017-04-18 16:56:15 -07:00
Yunqing Wang	1aa46abbdf	VP9 motion vector unit test To prevent the motion vector out of range bug, added a motion vector unit test in VP9. In the 4k video encoding, always forced to use extreme motion vectors and also encouraged to use INTER modes. In the decoding, checked if the motion vector was valid, and also checked the encoder/decoder mismatch. The tests showed that this unit test could reveal the issue we saw before. Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4	2017-04-06 00:50:56 +00:00
Johann	36d732c22b	vp9 temporal filter: add const to function prototype The input frames are not modified. Change-Id: Ideb810e3c5afeb4dbdc4c7d54024c43a8129ad39	2017-03-22 18:14:21 +00:00
Linfeng Zhang	48f5886605	Add vpx_highbd_idct32x32_135_add_c() When eob is less than or equal to 135 for high-bitdepth 32x32 idct, call this function. BUG=webm:1301 Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6	2017-03-08 10:46:33 -08:00
Jerome Jiang	e96ab22462	Merge "Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8."	2017-02-24 16:56:33 +00:00
Johann	904b957ae9	consolidate block_error functions vp9_highbd_block_error_8bit_c was a very simple wrapper around vp9_block_error_c. The SSE2 implementation was practically identical to the non-HBD one. It was missing some minor improvements which only went into the original version. In quick speed tests, the AVX implementation showed minimal improvement over SSE2 when it does not detect overflow. However, when overflow is detected the function is run a second time. The OperationCheck test seems to trigger this case and reverses any speed benefits by running ~60% slower. AVX2 on the other hand is always 30-40% faster. Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1	2017-02-24 05:25:26 +00:00
Johann Koenig	aa911e8b41	Merge "block error sse2: use tran_low_t"	2017-02-24 05:24:34 +00:00
Jerome Jiang	0998a146d4	Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8. Only works for bitdepth = 8 when compiled with high bitdepth flag. 4x speed ups for handling 1:2 down/upsampling. Validated manually for: 1) Dynamic resize for a single layer encoding 2) SVC encoding with 3 spatial layers Results are bitexact with the patch and the speed gain (~4x) in the scaling was verified. BUG=webm:1371 Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712	2017-02-23 20:40:28 -08:00
Johann	3c16bbb73b	block error sse2: use tran_low_t Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb	2017-02-24 01:33:35 +00:00

3250 Commits