With a skip block, or when coeff < zbin, it is about twice as fast as C.
If most coeff values are > zbin it is about 10-15x as fast as C.
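A minimal scalar sketch of the zbin fast path (names are illustrative, not
the exact source):

  #include <stdint.h>

  /* Coefficients below the zero-bin threshold are zeroed without any
   * multiplies, which is why the skip / coeff < zbin case is so cheap. */
  static void quantize_below_zbin_sketch(const int16_t *coeff_ptr, int rc,
                                         int zbin, int16_t *qcoeff_ptr,
                                         int16_t *dqcoeff_ptr) {
    const int coeff = coeff_ptr[rc];
    const int coeff_sign = coeff >> 31;
    const int abs_coeff = (coeff ^ coeff_sign) - coeff_sign;
    if (abs_coeff < zbin) {
      qcoeff_ptr[rc] = 0;
      dqcoeff_ptr[rc] = 0;
    }
  }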
BUG=webm:1426
Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7
Allow the right shift to operate on 64 bits; this matches the rest of
the implementations.
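A sketch of the resulting expression for a 16x16 block (so the product is
divided by 256; names are illustrative):

  #include <stdint.h>

  static uint32_t variance_16x16_sketch(uint32_t sse, int sum) {
    /* Promote sum before the multiply so both the product and the right
     * shift are evaluated in 64 bits, then narrow the small result. */
    return sse - (uint32_t)(((int64_t)sum * sum) >> 8);
  }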
previously:
b0f1ae147 vpx_get16x16var_avx2: correct cast order
Change-Id: I632ee5e418f3f9b30e79ecd05588eb172b0783aa
Allow the right shift to operate on 64 bits; this matches the rest of
the implementations.
missed in:
6acd061aa variance_avx2: sync variance functions with c-code
Change-Id: Icae436b881251ccb9f9ed64fcbf8d358c58a4617
For 8-bit the subtrahend is small enough to fit into uint32_t.
For 10/12-bit apply:
63a37d16f Prevent negative variance
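A rough bound behind the 8-bit claim (assuming 64x64 is the largest block):

  /* |sum|      <= 64 * 64 * 255 = 1,044,480
   * subtrahend =  (sum * sum) >> 12 <= ~2.7e8, which is < 2^32.
   * At 10/12 bits the sum can be 4x/16x larger, so the square needs the
   * 64-bit handling referenced above. */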
previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.
Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
Avoid unsigned overflow warning:
unsigned integer overflow: 19974 - 32703 cannot be represented in type
'unsigned int'
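One common way to address this kind of report (a sketch of the general
technique, not necessarily the exact change made here):

  #include <stdint.h>

  /* With a = 19974 and b = 32703 the unsigned expression a - b wraps.
   * Doing the subtraction in a signed type avoids the sanitizer report;
   * converting the result back to unsigned is well defined. */
  static uint32_t wrapping_diff_sketch(uint32_t a, uint32_t b) {
    const int32_t diff = (int32_t)a - (int32_t)b;  /* -12729 for the above */
    return (uint32_t)diff;
  }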
Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12
The backend-specific optimization for PPC VSX reads 16 bytes, whereas the
arm neon / sse2 versions read <= 8 bytes. Although the extra bytes read are
never actually used, reading past the end of the buffer is still not
acceptable. Fixed by allocating more when building for VSX. This was
reported by asan.
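A sketch of the kind of change (DECLARE_ALIGNED and HAVE_VSX are the
existing tree macros; the sizes here are illustrative):

  #if HAVE_VSX
  /* vec_vsx_ld always pulls in 16 bytes, so give the buffer a tail that
   * the other backends do not need. */
  DECLARE_ALIGNED(16, uint8_t, dst[8 * 8 + 8]);
  #else
  DECLARE_ALIGNED(16, uint8_t, dst[8 * 8]);
  #endif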
Also note - PPC does have an instruction that loads 64 bits from memory:
lxsdx loads one 64-bit doubleword (whereas lxvd2x loads two 64-bit
doublewords). However, only the "vec_vsx_ld" builtin, which maps to lxvd2x,
is available; there is no builtin for lxsdx. The only way to access lxsdx
is through inline assembly, which does not fit well with the original
paradigm.
Refer:
vsx:
vpx_tm_predictor_4x4_vsx @ third_party/libvpx/git_root/vpx_dsp/ppc/intrapred_vsx.c
neon:
vpx_tm_predictor_4x4_neon @ third_party/libvpx/git_root/vpx_dsp/arm/intrapred_neon_asm.asm
sse2:
tm_predictor_4x4 @ third_party/libvpx/git_root/vpx_dsp/x86/intrapred_sse2.asm
BUG=b/63112600
Tested:
asan tests passed.
Change-Id: I5f74b56e35c05b67851de8b5530aece213f2ce9d
Keep optimized code out of the reference implementation. This matches
the style of the other sub calls.
Change-Id: I3da6acd4f2c647b029c420e22ac9410a18259689
0.007% regression on rtc and 0.004% gain on rtc_derf.
With 1 thread, QVGA, VGA, and HD show a ~0.2% speed regression, while
2 threads show a ~0.2% speed gain on Google Pixel.
Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc
Officially the quant structures are 8 elements, with one dc element and
7 repeated ac elements. The low bit depth optimizations take advantage
of this to fill the xmm registers. The high bit depth version manually
duplicates the values.
If all the optimizations were unified, the structure sizes could be
greatly reduced.
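A sketch of how the layout is exploited by the low bit depth code
(illustrative, not the exact source):

  #include <emmintrin.h>
  #include <stdint.h>

  /* One aligned 128-bit load picks up all 8 int16_t entries at once: the
   * dc value in lane 0 and the repeated ac value in lanes 1-7. */
  static __m128i load_quant_row_sketch(const int16_t *zbin_ptr) {
    return _mm_load_si128((const __m128i *)zbin_ptr);
  }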
Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae
Use only valid values for quantize inputs. These were determined by
looping over vp9_init_quantizer and looking for max and min values.
This allows extending the test to the low bit depth functions, which were
designed to handle only valid inputs rather than all possible inputs.
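A sketch of the idea (the bounds here are placeholders, not the values that
were actually found):

  #include <stdint.h>
  #include <stdlib.h>

  /* Constrain random test inputs to [min_quant, max_quant], the range
   * observed while sweeping vp9_init_quantizer(), instead of the full
   * int16_t range. */
  static void fill_quant_sketch(int16_t *quant_ptr, int min_quant,
                                int max_quant) {
    int i;
    for (i = 0; i < 8; ++i)
      quant_ptr[i] =
          (int16_t)(min_quant + rand() % (max_quant - min_quant + 1));
  }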
Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e
To fix valgrind issues with the SVC tests.
SVC encoding uses prune_evenmore, which is causing an uninitialized value.
Will re-enable later when the issue is resolved.
Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44
When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.
Small/negligible change in RTC metrics and speed.
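A sketch of the guard (sad_is_very_high stands in for the existing SAD
check; the enum names come from the description above):

  /* Only promote to kVeryHighSad when the superblock was not already
   * classified as kLowVarHighSumdiff. */
  if (sad_is_very_high && content_state != kLowVarHighSumdiff)
    content_state = kVeryHighSad;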
Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
When int_pro_motion_estimation is done for the superblock in
choose_partitioning, use its result to avoid the full_pixel_search
for NEWMV mode if bsize is >= 32X32.
For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.
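A sketch of the intended flow (field and variable names are hypothetical):

  /* At speed > 7 and bsize >= 32x32, start NEWMV from the superblock
   * motion vector that int_pro_motion_estimation produced in
   * choose_partitioning and skip the full-pel search; otherwise fall back
   * to the normal vp9_full_pixel_search() path. */
  if (cpi->oxcf.speed > 7 && bsize >= BLOCK_32X32 && sb_mv_is_usable) {
    best_mv = sb_motion_vector;
  } else {
    /* vp9_full_pixel_search(...) as before */
  }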
Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
Although the low bitdepth functions are identical (except for the need
for larger intermediate values), they do not pass these tests. This
improves the error output to aid debugging.
Simplify buffer usage with Buffer and remove unnecessarily aligned
variables.
eob is a single element and is never written using aligned instructions.
BUG=webm:1426
Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.
Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501
Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d