Linfeng Zhang
47f22fc5fb
rm CONVERT_TO_SHORTPTR in vpx_highbd_comp_avg_pred
...
BUG=webm:1388
Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
2018-04-03 15:36:41 -07:00
Linfeng Zhang
c3ba5c521e
Merge changes I5704bd66,I4d548e97
...
* changes:
Shrink size of mode_map in struct TileDataEnc
Update sad4d x86 functions
2018-04-02 16:05:05 +00:00
Linfeng Zhang
39de45d3cc
Update sad4d x86 functions
...
Speed change is marginal.
Change-Id: I4d548e9763ce43bd546f19132202f7a8509a32bf
2018-03-28 12:49:12 -07:00
gxw
25d9adb74b
vp9: [loongson] optimize vpx_convolve8 with mmi.
...
1. vpx_convolve8_vert_mmi
2. vpx_convolve8_horiz_mmi
3. vpx_convolve8_mmi
4. vpx_convolve8_avg_mmi
5. vpx_convolve8_avg_vert_mmi
Change-Id: I41a6b3b4f327d6b67d282e0163cfa0aee8648abe
2018-03-28 18:11:16 +00:00
Linfeng Zhang
9351f96069
Add vp9_highbd_iht16x16_256_add_neon()
...
BUG=webm:1403
Change-Id: I2293c11666786be276909d48ee78dacb40a89e25
2018-03-13 17:39:23 -07:00
Linfeng Zhang
88c2386447
Add vp9_iht16x16_256_add_neon()
...
BUG=webm:1403
Change-Id: I1413cc3dfcb62143ba04fe9b0f8d8b010fdf69b6
2018-02-27 10:13:20 -08:00
Linfeng Zhang
3c6dc743aa
Fix a bug in create_s16x4_neon()
...
This bug is exposed when the 2nd argument is negative: the higher 32 bits
would then be all 1s.
Change-Id: I189ee8cd3753fde00a34847e7a37cde2caa4ba72
2018-02-26 17:49:24 -08:00
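The create_s16x4_neon() fix above describes a sign-extension problem when four 16-bit values are packed into one 64-bit word. The following is a minimal portable sketch of that class of bug, not the actual libvpx code: the names pack_s16x4_buggy and pack_s16x4_fixed are hypothetical, and plain integers are used instead of NEON intrinsics.

```c
#include <stdint.h>

/* Hypothetical illustration only; not the libvpx implementation.
 * Lane 1 is kept as a signed 32-bit value, so a negative c1 is
 * sign-extended when widened to 64 bits and floods bits 32..63 with 1s. */
static uint64_t pack_s16x4_buggy(int16_t c0, int16_t c1, int16_t c2,
                                 int16_t c3) {
  /* Multiplication stands in for "<< 16" to avoid shifting a negative value. */
  const int64_t lane1 = (int32_t)c1 * 65536;
  return (uint64_t)(uint16_t)c0 | (uint64_t)lane1 |
         ((uint64_t)(uint16_t)c2 << 32) | ((uint64_t)(uint16_t)c3 << 48);
}

/* Each lane is first truncated to an unsigned 16-bit value, so no sign
 * bits can leak into the neighbouring lanes. */
static uint64_t pack_s16x4_fixed(int16_t c0, int16_t c1, int16_t c2,
                                 int16_t c3) {
  return (uint64_t)(uint16_t)c0 | ((uint64_t)(uint16_t)c1 << 16) |
         ((uint64_t)(uint16_t)c2 << 32) | ((uint64_t)(uint16_t)c3 << 48);
}
```

In this sketch the two functions agree whenever the 2nd argument is non-negative; they differ only when it is negative, which matches the condition named in the commit message.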
Linfeng Zhang
167594414f
Merge "Add vp9_highbd_iht8x8_16_add_neon()"
2018-02-23 01:42:59 +00:00
Kyle Siefring
dccb8b45bb
Merge "Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT"
2018-02-21 23:12:07 +00:00
Linfeng Zhang
29b6a30cd9
Add vp9_highbd_iht8x8_16_add_neon()
...
BUG=webm:1403
Change-Id: I11efb652f1aee371c71eee2d29e33793e4736832
2018-02-20 17:21:31 -08:00
Johann
c1435e321c
remove deprecated 'register' keyword
...
Will be removed in C++17:
http://en.cppreference.com/w/cpp/language/storage_duration
Change-Id: Iadce5e2b974c707799fa939f3ff1c420fb79a871
2018-02-20 14:49:02 -08:00
Kyle Siefring
811b2e412e
Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT
...
Changes in the function size in bytes (in lieu of performance metrics):
                     Before    After   Diff
vpx_fdct32x32_avx2    29564 -> 28334  -1230
vpx_fdct32x32_sse2    38053 -> 36309  -1744
Change-Id: Ie0b3e6ed7c3f2e9ea45f9d6a1ce1e27d068cee6b
2018-02-10 14:25:24 -05:00
Linfeng Zhang
0f3edc6625
Update iadst NEON functions
...
Use scalar multiply. No impact on clang, but improves gcc compiling.
BUG=webm:1403
Change-Id: I4922e7e033d9e93282c754754100850e232e1529
2018-02-08 07:23:55 +00:00
Linfeng Zhang
3636330490
Add vp9_highbd_iht4x4_16_add_neon()
...
BUG=webm:1403
Change-Id: Id9833e985fb70958cf4bde38f8e6303ed83c12f9
2018-02-05 13:42:16 -08:00
James Zern
73d1236384
inv_txfm_vsx.c: make code c90 compatible
...
move for loop declarations to function scope
Change-Id: I84d92a1a6ca6c5ac30aacb0f55d87ca3aef4c98f
2018-02-01 19:40:28 -08:00
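As a rough illustration of the C90 change described in the inv_txfm_vsx.c commit above: C90 does not allow a declaration inside the for statement, so the loop counter is declared at the top of the function instead. The function sum_coeffs below is a hypothetical example, not code from inv_txfm_vsx.c.

```c
#include <stdint.h>

/* Hypothetical example. The C99 form would be:
 *   for (int i = 0; i < n; ++i) sum += coeffs[i];
 * which a strict C90 compiler rejects. */
static int sum_coeffs(const int16_t *coeffs, int n) {
  int i; /* loop counter declared at function scope, as C90 requires */
  int sum = 0;
  for (i = 0; i < n; ++i) sum += coeffs[i];
  return sum;
}
```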
Linfeng Zhang
b14b616d96
Update vp9_iht8x8_64_add_neon()
...
Change-Id: Ie70ed8b9273df5e1fd06bc93cb469e80630941d2
2018-01-29 15:17:08 -08:00
Linfeng Zhang
884d1681f8
Clean dct_const_round_shift() related neon code
...
Change-Id: I8f4e0fc6ecb77b623519f2dd3cd2886f89218ddd
2018-01-29 10:23:24 -08:00
Linfeng Zhang
2654afc16c
Merge "cosmetic: clean idct neon functions"
2018-01-29 17:34:11 +00:00
Scott LaVarnway
15b261d854
Merge "BUG FIX: sse2 subpel variance is not PIC compliant"
2018-01-24 22:54:42 +00:00
Linfeng Zhang
6248f0c91f
cosmetic: clean idct neon functions
...
Change-Id: I9c7c52567850aded0437b13ba1260e94441bc49d
2018-01-24 13:55:15 -08:00
Scott LaVarnway
cb9f4dc105
BUG FIX: sse2 subpel variance is not PIC compliant
...
BUG=webm:1464
Change-Id: Ibc15bac54aaf509365bed5892a26a29972ad3540
2018-01-24 05:58:54 -08:00
Scott LaVarnway
b9e44842fc
Merge "vp9_quantize_fp_avx2()"
2018-01-24 13:58:08 +00:00
Linfeng Zhang
231012fdab
Add vp9_highbd_iht16x16_256_add_sse4_1()
...
BUG=webm:1413
Change-Id: I8d7eeae1bd219eb848c1a86071046a477f7a91af
2018-01-23 11:24:42 -08:00
Linfeng Zhang
8f50e06012
Add "vpx_" prefix to 2 idct x86 functions
...
Change-Id: I4f3052d8748e16b06e9155f8daf22f867dfaa7a3
2018-01-23 09:17:38 -08:00
Linfeng Zhang
6fea41abee
Merge "Add vp9_highbd_iht8x8_64_add_sse4_1()"
2018-01-23 17:04:20 +00:00
Linfeng Zhang
9874ec07bd
Add vp9_highbd_iht8x8_64_add_sse4_1()
...
BUG=webm:1413
Change-Id: Id9038226902b2d793fc6c17ac81bb104c1a18988
2018-01-18 15:49:44 -08:00
Scott LaVarnway
c7449b482c
vp9_quantize_fp_avx2()
...
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.
Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2018-01-18 13:33:30 -08:00
Johann
97acbbb701
clang-format v5.0.0 vpx_dsp/
...
Remove comments above #define statements because they get
indented unnecessarily.
https://bugs.llvm.org/show_bug.cgi?id=35930
Add blank lines to prevent comments from being treated as
blocks.
Change-Id: I04dce21b2a10e13b8dc07411a0019c098f6dd705
2018-01-18 12:37:50 -08:00
Johann
f5b2dd2a66
adopt some clang 5.0.0 formatting
...
At least the changes that don't conflict with 4.0.1
Change-Id: I9b6a7c14dadc0738cd0f628a10ece90fc7ee89fd
2018-01-11 12:35:24 -08:00
Linfeng Zhang
e20ca4fead
Add vp9_highbd_iht4x4_16_add_sse4_1()
...
BUG=webm:1413
Change-Id: I14930d0af24370a44ab359de5bba5512eef4e29f
2018-01-08 10:14:20 -08:00
Linfeng Zhang
7a41610581
Update dct_test.cc
...
Make 8-bit functions testing available in high bitdepth.
Change-Id: Ic030c75aa4c6b649c52426abb4bb2122882de0fe
2018-01-08 10:07:38 -08:00
Linfeng Zhang
867b593caa
Update iadst4_sse2()
...
Change-Id: I21ff81df0d6898170a3b80b3b5220f9f3ac7f4e8
2017-12-28 16:47:57 -08:00
Johann
e4b3f03c64
add copyright to rtcd files
...
Allows them to pass the license check in chromium.
BUG=chromium:98319
Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-14 22:50:08 +00:00
Shiyou Yin
90ce21e519
Merge "vpx_dsp: [loongson] optimize variance v2."
2017-12-04 01:30:06 +00:00
Johann
bdbecea1ba
explicitly label .text sections
...
nasm should infer .text but does not for windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451
Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 14:33:04 -08:00
Shiyou Yin
298f5ca47d
vpx_dsp: [loongson] optimize variance v2.
...
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.
Change-Id: I58890c6a2ed1543379acb48e03e620c144f6515f
2017-12-01 13:44:48 +08:00
Kaustubh Raste
8099220e6c
Merge "mips msa optimize vpx_scaled_2d function"
2017-12-01 01:24:25 +00:00
Shiyou Yin
8d70aef05f
Merge "vpx: [loongson] fix bug in var_filter_block2d_bil_16x"
2017-11-30 00:53:37 +00:00
Kyle Siefring
3ae909b0f9
Merge "Remove unnecessary includes of emmintrin_compat.h"
2017-11-29 19:14:45 +00:00
Kyle Siefring
a60da3a2eb
Remove unnecessary includes of emmintrin_compat.h
...
Change-Id: Ie60381a0c6ee01f828cd364a43f01517f4cb03e9
2017-11-29 11:48:24 -05:00
Kaustubh Raste
339f4dcaee
mips msa optimize vpx_scaled_2d function
...
Change-Id: I638507b360c71489ab0e87bd558d2719ad995333
2017-11-29 13:27:04 +05:30
Shiyou Yin
a0ca2a4079
vpx: [loongson] fix bug in var_filter_block2d_bil_16x
...
The bug caused the following test cases to fail:
1. MMI/VpxSubpelVarianceTest.Ref/6
2. MMI/VpxSubpelVarianceTest.Ref/7
3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
4. MMI/VpxSubpelVarianceTest.ExtremeRef/7
Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
2017-11-29 10:26:43 +08:00
Johann
bd990cad72
quantize x86: dedup some parts
...
Change-Id: I9f95f47bc7ecbb7980f21cbc3a91f699624141af
2017-11-27 13:09:21 -08:00
Kyle Siefring
dd4cc5b596
Merge "Optimize AVX2 get16x16var and get32x16var functions"
2017-11-20 22:37:57 +00:00
Kyle Siefring
07a0bf038f
Optimize AVX2 get16x16var and get32x16var functions
...
Change-Id: If8b91aaa883c01107f0ea3468139fa24cfb301d2
2017-11-17 13:55:49 -05:00
Johann
3e3a568616
fwd txfm ssse3: use GLOBAL() for loading constants
...
Fixes a build issue when relocation is not allowed:
relocation R_X86_64_32 against '.rodata' can not be used when making a shared object
Change-Id: Ica3e90c926847bc384e818d7854f0030f4d69aa0
2017-11-15 13:01:44 -08:00
Scott LaVarnway
8e6022844f
vpx: [x86] add vpx_satd_avx2()
...
SSE2 intrinsic vs AVX2 intrinsic speed gains:
blocksize 16: ~1.33
blocksize 64: ~1.51
blocksize 256: ~3.03
blocksize 1024: ~3.71
Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
2017-11-10 12:24:12 -08:00
Scott LaVarnway
2387024f41
runtime error fix: bitdepth_conversion_avx2.h
...
Change-Id: I7364a157de39eb7137b599808474b8d46d19d376
2017-11-09 12:26:43 -08:00
Kyle Siefring
b383a17fa4
Support building AVX-512 and implement sadx4 for AVX-512
...
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.
Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00
Scott LaVarnway
3bf02ad74a
vpx: hadamard: use ptrdiff_t instead of int for stride
...
Eliminates the following instruction for the x86 (64 bit)
intrinsic code:
movslq %esi,%rax
Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae
2017-10-26 11:41:48 -07:00
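The hadamard stride change above can be sketched as follows. On a 64-bit target, an int stride has to be widened to pointer width before it is used in address arithmetic (the movslq named in the commit message), while a ptrdiff_t stride already has that width. Both functions below are hypothetical illustrations, not the libvpx hadamard code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: 32-bit stride, which a 64-bit compiler may need to
 * sign-extend before adding it to the pointer. */
static int column_sum_int_stride(const int16_t *src, int stride, int rows) {
  int r, sum = 0;
  for (r = 0; r < rows; ++r) {
    sum += *src;
    src += stride;
  }
  return sum;
}

/* Hypothetical: pointer-width stride, so no widening step is needed. */
static int column_sum_ptrdiff_stride(const int16_t *src, ptrdiff_t stride,
                                     int rows) {
  int r, sum = 0;
  for (r = 0; r < rows; ++r) {
    sum += *src;
    src += stride;
  }
  return sum;
}
```

The two versions compute the same result; only the parameter type differs. Whether the extra instruction actually appears depends on the compiler and optimization level.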