995 Commits

Author SHA1 Message Date
Jerome Jiang
546a210259 Silence warning when built with --enable-internal-stats.
Change-Id: I3a600a9baf2b8e46c109f4ec2b5bd6bafda4bf58
2018-04-12 15:07:03 -07:00
Linfeng Zhang
47f22fc5fb rm CONVERT_TO_SHORTPTR in vpx_highbd_comp_avg_pred
BUG=webm:1388

Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
2018-04-03 15:36:41 -07:00
Linfeng Zhang
c3ba5c521e Merge changes I5704bd66,I4d548e97
* changes:
  Shrink size of mode_map in struct TileDataEnc
  Update sad4d x86 functions
2018-04-02 16:05:05 +00:00
Linfeng Zhang
39de45d3cc Update sad4d x86 functions
Speed change is marginal.

Change-Id: I4d548e9763ce43bd546f19132202f7a8509a32bf
2018-03-28 12:49:12 -07:00
gxw
25d9adb74b vp9: [loongson] optimize vpx_convolve8 with mmi.
1. vpx_convolve8_vert_mmi
2. vpx_convolve8_horiz_mmi
3. vpx_convolve8_mmi
4. vpx_convolve8_avg_mmi
5. vpx_convolve8_avg_vert_mmi

Change-Id: I41a6b3b4f327d6b67d282e0163cfa0aee8648abe
2018-03-28 18:11:16 +00:00
Linfeng Zhang
9351f96069 Add vp9_highbd_iht16x16_256_add_neon()
BUG=webm:1403

Change-Id: I2293c11666786be276909d48ee78dacb40a89e25
2018-03-13 17:39:23 -07:00
Linfeng Zhang
88c2386447 Add vp9_iht16x16_256_add_neon()
BUG=webm:1403

Change-Id: I1413cc3dfcb62143ba04fe9b0f8d8b010fdf69b6
2018-02-27 10:13:20 -08:00
Linfeng Zhang
3c6dc743aa Fix a bug in create_s16x4_neon()
The bug shows up when the 2nd argument is negative: the upper 32 bits
of the result are then all 1s.

Change-Id: I189ee8cd3753fde00a34847e7a37cde2caa4ba72
2018-02-26 17:49:24 -08:00
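A minimal sketch of the kind of sign-extension problem the create_s16x4_neon() fix above describes, written in portable C without NEON. The names pack4_buggy, pack4_fixed and lane are hypothetical and are not taken from the libvpx source; the sketch only assumes that four int16_t values are packed into one 64-bit word, lowest lane first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packer with the flaw: the two low lanes are combined in a
 * signed 32-bit value. If c1 is negative, that value is negative, and
 * widening it to 64 bits sign-extends, so the upper 32 bits become 1s. */
static uint64_t pack4_buggy(int16_t c0, int16_t c1, int16_t c2, int16_t c3) {
  const int32_t low = (int32_t)(uint16_t)c0 | ((int32_t)c1 * 65536);
  return (uint64_t)(int64_t)low | ((uint64_t)(uint16_t)c2 << 32) |
         ((uint64_t)(uint16_t)c3 << 48);
}

/* Hypothetical fix: reduce every lane to its 16 unsigned bits before
 * widening, so no lane can spill into its neighbours. */
static uint64_t pack4_fixed(int16_t c0, int16_t c1, int16_t c2, int16_t c3) {
  return (uint64_t)(uint16_t)c0 | ((uint64_t)(uint16_t)c1 << 16) |
         ((uint64_t)(uint16_t)c2 << 32) | ((uint64_t)(uint16_t)c3 << 48);
}

/* Return the raw 16 bits of lane i (0..3) of a packed word. */
static uint16_t lane(uint64_t packed, int i) {
  assert(i >= 0 && i < 4);
  return (uint16_t)(packed >> (16 * i));
}
```

With arguments (1, -1, 2, 3) the buggy variant loses lanes 2 and 3 (both read back as 0xFFFF), while the fixed variant keeps them. With a non-negative 2nd argument the two variants agree, which is why a bug of this shape can go unnoticed.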
Linfeng Zhang
167594414f Merge "Add vp9_highbd_iht8x8_16_add_neon()"
2018-02-23 01:42:59 +00:00
Kyle Siefring
dccb8b45bb Merge "Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT"
2018-02-21 23:12:07 +00:00
Linfeng Zhang
29b6a30cd9 Add vp9_highbd_iht8x8_16_add_neon()
BUG=webm:1403

Change-Id: I11efb652f1aee371c71eee2d29e33793e4736832
2018-02-20 17:21:31 -08:00
Johann
c1435e321c remove deprecated 'register' keyword
Will be removed in C++17:
http://en.cppreference.com/w/cpp/language/storage_duration

Change-Id: Iadce5e2b974c707799fa939f3ff1c420fb79a871
2018-02-20 14:49:02 -08:00
Kyle Siefring
811b2e412e Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT
Changes in the function size in bytes (in lieu of performance metrics)
                   Before    After    Diff
vpx_fdct32x32_avx2  29564 -> 28334   -1230
vpx_fdct32x32_sse2  38053 -> 36309   -1744

Change-Id: Ie0b3e6ed7c3f2e9ea45f9d6a1ce1e27d068cee6b
2018-02-10 14:25:24 -05:00
Linfeng Zhang
0f3edc6625 Update iadst NEON functions
Use scalar multiply. No impact on clang, but improves gcc compiling.

BUG=webm:1403

Change-Id: I4922e7e033d9e93282c754754100850e232e1529
2018-02-08 07:23:55 +00:00
Linfeng Zhang
3636330490 Add vp9_highbd_iht4x4_16_add_neon()
BUG=webm:1403

Change-Id: Id9833e985fb70958cf4bde38f8e6303ed83c12f9
2018-02-05 13:42:16 -08:00
James Zern
73d1236384 inv_txfm_vsx.c: make code c90 compatible
move for loop declarations to function scope

Change-Id: I84d92a1a6ca6c5ac30aacb0f55d87ca3aef4c98f
2018-02-01 19:40:28 -08:00
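A small illustration of the C90 rule behind the inv_txfm_vsx.c change above: C99 allows a declaration inside the for statement, C90 does not, so the counter moves to the top of the function. The function sum_c90 is a made-up example and not code from inv_txfm_vsx.c.

```c
#include <assert.h>

/* C99 and later accept: for (int i = 0; i < n; ++i) ...
 * C90 rejects that form, and also requires all declarations in a block
 * to come before the first statement. */
static int sum_c90(const int *values, int n) {
  int i;         /* loop counter declared at function scope */
  int total = 0; /* declarations first, statements afterwards */
  assert(n >= 0);
  for (i = 0; i < n; ++i) total += values[i];
  return total;
}
```

Building with gcc or clang and -std=c90 -pedantic is one way to catch the C99 form, assuming those flags are available in the toolchain in use.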
Linfeng Zhang
b14b616d96 Update vp9_iht8x8_64_add_neon()
Change-Id: Ie70ed8b9273df5e1fd06bc93cb469e80630941d2
2018-01-29 15:17:08 -08:00
Linfeng Zhang
884d1681f8 Clean dct_const_round_shift() related neon code
Change-Id: I8f4e0fc6ecb77b623519f2dd3cd2886f89218ddd
2018-01-29 10:23:24 -08:00
Linfeng Zhang
2654afc16c Merge "cosmetic: clean idct neon functions"
2018-01-29 17:34:11 +00:00
Scott LaVarnway
15b261d854 Merge "BUG FIX: sse2 subpel variance is not PIC compliant"
2018-01-24 22:54:42 +00:00
Linfeng Zhang
6248f0c91f cosmetic: clean idct neon functions
Change-Id: I9c7c52567850aded0437b13ba1260e94441bc49d
2018-01-24 13:55:15 -08:00
Scott LaVarnway
cb9f4dc105 BUG FIX: sse2 subpel variance is not PIC compliant
BUG=webm:1464

Change-Id: Ibc15bac54aaf509365bed5892a26a29972ad3540
2018-01-24 05:58:54 -08:00
Scott LaVarnway
b9e44842fc Merge "vp9_quantize_fp_avx2()"
2018-01-24 13:58:08 +00:00
Linfeng Zhang
231012fdab Add vp9_highbd_iht16x16_256_add_sse4_1()
BUG=webm:1413

Change-Id: I8d7eeae1bd219eb848c1a86071046a477f7a91af
2018-01-23 11:24:42 -08:00
Linfeng Zhang
8f50e06012 Add "vpx_" prefix to 2 idct x86 functions
Change-Id: I4f3052d8748e16b06e9155f8daf22f867dfaa7a3
2018-01-23 09:17:38 -08:00
Linfeng Zhang
6fea41abee Merge "Add vp9_highbd_iht8x8_64_add_sse4_1()"
2018-01-23 17:04:20 +00:00
Linfeng Zhang
9874ec07bd Add vp9_highbd_iht8x8_64_add_sse4_1()
BUG=webm:1413

Change-Id: Id9038226902b2d793fc6c17ac81bb104c1a18988
2018-01-18 15:49:44 -08:00
Scott LaVarnway
c7449b482c vp9_quantize_fp_avx2()
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.

Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2018-01-18 13:33:30 -08:00
Johann
97acbbb701 clang-format v5.0.0 vpx_dsp/
Remove comments above #define statements because they get
indented unnecessarily.
https://bugs.llvm.org/show_bug.cgi?id=35930

Add blank lines to prevent comments from being treated as
blocks.

Change-Id: I04dce21b2a10e13b8dc07411a0019c098f6dd705
2018-01-18 12:37:50 -08:00
Johann
f5b2dd2a66 adopt some clang 5.0.0 formatting
At least the changes that don't conflict with 4.0.1

Change-Id: I9b6a7c14dadc0738cd0f628a10ece90fc7ee89fd
2018-01-11 12:35:24 -08:00
Linfeng Zhang
e20ca4fead Add vp9_highbd_iht4x4_16_add_sse4_1()
BUG=webm:1413

Change-Id: I14930d0af24370a44ab359de5bba5512eef4e29f
2018-01-08 10:14:20 -08:00
Linfeng Zhang
7a41610581 Update dct_test.cc
Make 8-bit functions testing available in high bitdepth.

Change-Id: Ic030c75aa4c6b649c52426abb4bb2122882de0fe
2018-01-08 10:07:38 -08:00
Linfeng Zhang
867b593caa Update iadst4_sse2()
Change-Id: I21ff81df0d6898170a3b80b3b5220f9f3ac7f4e8
2017-12-28 16:47:57 -08:00
Johann
e4b3f03c64 add copyright to rtcd files
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-14 22:50:08 +00:00
Shiyou Yin
90ce21e519 Merge "vpx_dsp: [loongson] optimize variance v2."
2017-12-04 01:30:06 +00:00
Johann
bdbecea1ba explicitly label .text sections
nasm should infer .text but does not for Windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451

Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 14:33:04 -08:00
Shiyou Yin
298f5ca47d vpx_dsp: [loongson] optimize variance v2.
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.

Change-Id: I58890c6a2ed1543379acb48e03e620c144f6515f
2017-12-01 13:44:48 +08:00
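For context on the variance entry above, a scalar sketch of what a WxH variance routine typically computes: the sum of squared differences (SSE) minus the squared sum of differences divided by the block area. This is a plain-C illustration, not the Loongson MMI code from the commit; the name variance_wxh and its signature are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference: accumulate the sum and the sum of squares of the
 * per-pixel differences, then return sse - sum^2 / (w * h). */
static uint32_t variance_wxh(const uint8_t *src, int src_stride,
                             const uint8_t *ref, int ref_stride,
                             int w, int h, uint32_t *sse) {
  int64_t sum = 0;
  uint64_t sq = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq += (uint64_t)(diff * diff);
    }
  }
  *sse = (uint32_t)sq; /* assumes the block is small enough to fit 32 bits */
  return (uint32_t)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

For a 2x2 block with source pixels 10, 12, 14, 16 and reference pixels all 8, the differences are 2, 4, 6, 8, so SSE is 120, the sum is 20, and the variance is 120 - 400 / 4 = 20.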
Kaustubh Raste
8099220e6c Merge "mips msa optimize vpx_scaled_2d function"
2017-12-01 01:24:25 +00:00
Shiyou Yin
8d70aef05f Merge "vpx: [loongson] fix bug in var_filter_block2d_bil_16x"
2017-11-30 00:53:37 +00:00
Kyle Siefring
3ae909b0f9 Merge "Remove unnecessary includes of emmintrin_compat.h"
2017-11-29 19:14:45 +00:00
Kyle Siefring
a60da3a2eb Remove unnecessary includes of emmintrin_compat.h
Change-Id: Ie60381a0c6ee01f828cd364a43f01517f4cb03e9
2017-11-29 11:48:24 -05:00
Kaustubh Raste
339f4dcaee mips msa optimize vpx_scaled_2d function
Change-Id: I638507b360c71489ab0e87bd558d2719ad995333
2017-11-29 13:27:04 +05:30
Shiyou Yin
a0ca2a4079 vpx: [loongson] fix bug in var_filter_block2d_bil_16x
The bug caused these test cases to fail:
1. MMI/VpxSubpelVarianceTest.Ref/6
2. MMI/VpxSubpelVarianceTest.Ref/7
3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
4. MMI/VpxSubpelVarianceTest.ExtremeRef/7

Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
2017-11-29 10:26:43 +08:00
Johann
bd990cad72 quantize x86: dedup some parts
Change-Id: I9f95f47bc7ecbb7980f21cbc3a91f699624141af
2017-11-27 13:09:21 -08:00
Kyle Siefring
dd4cc5b596 Merge "Optimize AVX2 get16x16var and get32x16var functions"
2017-11-20 22:37:57 +00:00
Kyle Siefring
07a0bf038f Optimize AVX2 get16x16var and get32x16var functions
Change-Id: If8b91aaa883c01107f0ea3468139fa24cfb301d2
2017-11-17 13:55:49 -05:00
Johann
3e3a568616 fwd txfm ssse3: use GLOBAL() for loading constants
Fixes a build issue when relocation is not allowed:
relocation R_X86_64_32 against '.rodata' can not be used when making a shared object

Change-Id: Ica3e90c926847bc384e818d7854f0030f4d69aa0
2017-11-15 13:01:44 -08:00
Scott LaVarnway
8e6022844f vpx: [x86] add vpx_satd_avx2()
SSE2 intrinsic vs AVX2 intrinsic speed gains:
blocksize   16: ~1.33
blocksize   64: ~1.51
blocksize  256: ~3.03
blocksize 1024: ~3.71

Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
2017-11-10 12:24:12 -08:00
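As a point of reference for the vpx_satd_avx2() entry above, a scalar sketch of the operation being vectorized, assuming SATD here means the sum of absolute values over a block of already-transformed coefficients. The name satd_scalar and the int16_t coefficient type are assumptions for illustration; the blocksize values in the commit message would correspond to length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: add up the absolute value of every coefficient.
 * A SIMD version processes several coefficients per instruction, which
 * fits with the larger gains reported for larger block sizes. */
static int satd_scalar(const int16_t *coeff, int length) {
  int i;
  int satd = 0;
  assert(length >= 0);
  for (i = 0; i < length; ++i) satd += abs(coeff[i]);
  return satd;
}
```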
Scott LaVarnway
2387024f41 runtime error fix: bitdepth_conversion_avx2.h
Change-Id: I7364a157de39eb7137b599808474b8d46d19d376
2017-11-09 12:26:43 -08:00
Kyle Siefring
b383a17fa4 Support building AVX-512 and implement sadx4 for AVX-512
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.

Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00