Linfeng Zhang
8edd5051aa
Add speed test in SADx4Test
...
Change-Id: I42dd3df8c13c0a6d08ce28e27e8917b5d831fc1a
2018-03-28 11:13:28 -07:00
Kyle Siefring
b383a17fa4
Support building AVX-512 and implement sadx4 for AVX-512
...
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.
Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00
Shiyou Yin
f4150163a2
vpxdsp: [loongson] optimize sad functions with mmi
...
1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c
Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a
2017-09-02 15:11:32 +00:00
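For context on the function families named in the commit above, here is a minimal sketch of what a plain-C sum of absolute differences (SAD) and its "x4d" form compute. It is an illustration only, not the libvpx code: the names `sad_ref` and `sad_x4d_ref` and their parameter lists are hypothetical, and the sketch assumes 8-bit samples.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference helper (not a libvpx function): sum of absolute
 * differences between a width x height source block and a reference block. */
static unsigned int sad_ref(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride,
                            int width, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      /* uint8_t operands promote to int, so the difference cannot wrap. */
      sad += (unsigned int)abs(src[x] - ref[x]);
    }
    src += src_stride; /* advance one row in each buffer */
    ref += ref_stride;
  }
  return sad;
}

/* Hypothetical "x4d" form: one source block compared against four
 * candidate reference blocks, writing four SAD values. */
static void sad_x4d_ref(const uint8_t *src, int src_stride,
                        const uint8_t *const ref[4], int ref_stride,
                        int width, int height, unsigned int out[4]) {
  for (int i = 0; i < 4; ++i) {
    out[i] = sad_ref(src, src_stride, ref[i], ref_stride, width, height);
  }
}
```

A SIMD version would typically be checked against a scalar routine of this kind, which is the general role a reference C function plays in a test such as sad_test.cc.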
Johann
e381753926
sad4d neon: 64x[32,64]
...
Rewrite 64x64.
BUG=webm:1425
Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf
2017-07-12 13:26:39 +00:00
Johann
e1bde306c8
sad4d neon: 32x[16,32,64]
...
Rewrite 32x32. Use half the accumulator registers.
BUG=webm:1425
Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13
2017-07-12 13:25:18 +00:00
Johann
807ce8fb1e
sad4d neon: 16x[8,16,32]
...
Rewrite 16x16. Use half the accumulator registers.
BUG=webm:1425
Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2
2017-07-12 13:25:11 +00:00
Johann
8152b0904d
sad4d neon: 8x[4,8,16]
...
BUG=webm:1425
Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4
2017-07-12 13:25:03 +00:00
Johann
dd4347e9ec
sad4d neon: 4x4, 4x8
...
BUG=webm:1425
Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468
2017-07-12 03:38:03 +00:00
Johann
e4e08556db
sad neon: avg for 64x[32,64]
...
BUG=webm:1425
Change-Id: Id84d97807a6a0fbcc889c4dfe11929d54f85493d
2017-07-07 07:04:04 -07:00
Johann
67cffc1ef6
sad neon: avg for 32x[16,32,64]
...
BUG=webm:1425
Change-Id: I3362e0dded3b46ca032caa7f44db42f324bc596d
2017-07-07 07:04:04 -07:00
Johann
527e0c9b1c
sad neon: avg for 16x[8,16,32]
...
BUG=webm:1425
Change-Id: Ia42e4f36547c5fe12114fb58379e34bce82eb2f2
2017-07-07 07:04:04 -07:00
Johann
63bdc574e5
sad neon: avg for 8x[4,8,16]
...
BUG=webm:1425
Change-Id: If2ab51e3050e078b0011b174efe41fcb65a15f44
2017-07-06 07:43:09 -07:00
Johann
6bac3f80ee
sad neon: avg for 4x4 and 4x8
...
BUG=webm:1425
Change-Id: Ifc685a96cb34f7fd9243b4c674027480564b84fb
2017-07-06 07:12:47 -07:00
Johann
ad011aaab8
sad neon: rewrite 64x64 and add 64x32
...
BUG=webm:1425
Change-Id: Ib454762d1c61b05a98324fe81ad58c9e09784717
2017-06-28 12:21:34 -07:00
Johann
469643757f
sad neon: rewrite 16x8, 16x16, add 16x32
...
BUG=webm:1425
Change-Id: Ie126553e5fffcdfaf3d82a85b368ac10ce9ab082
2017-06-28 12:16:00 -07:00
Johann
e40e78be24
sad neon: rewrite 8x8 and 8x16
...
BUG=webm:1425
Change-Id: I068f06c67b841f09ea07c04ada0c2f1706102138
2017-06-28 12:15:57 -07:00
Johann
46d8660ce3
sad neon: rewrite 4x4 and add 4x8
...
The previous implementation loaded 8 values (discarding half)
BUG=webm:1425
Change-Id: Icb72a94e2557a4ee2db7091266ab58fd92f72158
2017-06-28 11:14:59 -07:00
Alexandra Hájková
8bf6eaf433
ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32, 64}
...
Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123
2017-05-24 13:27:09 +00:00
Alexandra Hájková
bcbc3929ae
ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx
...
Change-Id: Ic9639b1331d8c5cbc207c2a036891ff0137fc56f
2017-05-13 13:13:15 +00:00
Alexandra Hájková
f48532e271
ppc: Add vpx_sad64x32/64_vsx
...
Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2
2017-05-12 16:10:16 +02:00
Alexandra Hájková
0b15bf1e54
ppc: Add vpx_sad32x16/32/64_vsx
...
Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd
2017-05-12 16:10:11 +02:00
Alexandra Hájková
cc7f0c0f3e
ppc: Add vpx_sad16x8/16/32_vsx
...
Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9
2017-05-10 19:57:30 +00:00
clang-format
9c9d92ae3a
test: apply clang-tidy google-readability-braces-around-statements
...
applied against a x86_64 configure with and without
--enable-vp9-highbitdepth
clang-tidy-3.7.1 \
-checks='-*,google-readability-braces-around-statements' \
-header-filter='.*' -fix
+ clang-format afterward
Change-Id: Ia2993ec64cf1eb3505d3bfb39068d9e44cfbce8d
2016-08-05 20:02:28 -07:00
Johann
d55724fae9
Remove armv6 target
...
Change-Id: I1fa81cc9cabf362a185fc3a53f1e58de533a41e5
2016-08-04 12:55:06 -07:00
clang-format
33e40cb5db
test: apply clang-format
...
Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f
2016-07-27 01:58:52 +00:00
Pascal Massimino
5319e83843
sad_test: add some const to methods
...
Change-Id: I6f2481509b0aa94338ed6185f80c4a6b65532280
2016-07-15 21:07:00 -07:00
skal
3fc29ae3ee
remove tuple from 'sad_test.cc'
...
+ general clean-up
Change-Id: Ib9dca3d1a3b7f0c1bedef2a26c9ff5ae1c289e8a
2016-07-15 19:52:56 -07:00
Johann
0266e70c52
test: remove x86inc.asm distinction
...
BUG=b:29583530
Change-Id: I296a0b81755e3086bc0a40cb126d0200ff03c095
2016-06-30 11:14:10 -07:00
Linfeng Zhang
d0e687bf8c
remove mmx sad functions
...
There are SSE2 equivalents, and SSE2 is a reasonable modern baseline.
Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba
2016-05-11 10:50:04 -07:00
Jian Zhou
789dbb3131
Code clean of sad4xNx4D_sse
...
Replace MMX with SSE2.
Change-Id: I948ca1be6ed9b8e67f16555e226f1203726b7da6
2015-12-17 17:43:46 -08:00
Jian Zhou
b158d9a649
Code clean of sad4xN(_avg)_sse
...
Replace MMX with SSE2, reduce psadbw ops which may help Silvermont.
Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d
2015-12-17 11:10:42 -08:00
James Zern
91606bbbe6
sad_test: create fn pointers w/'&' ref
...
this helps some toolchains (vs9) resolve the type of the parameter
Change-Id: I4acc8a844d1e55b766f66482bd6d32998174d70f
2015-11-05 23:53:24 -08:00
Jingning Han
097d59c28c
Cosmetics - Fix header file order in unit tests
...
Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798
2015-07-29 20:48:25 -07:00
Parag Salasakar
bc3ec8ef07
mips msa vpx_dsp sad sad4d avgsad optimization
...
average improvement ~3x-5x
Change-Id: Ie30748cfbedebbd544b7ef4f286055ccb7f60306
2015-07-01 11:39:43 +05:30
Johann
1d7ccd5325
Relocate memory operations for common code
...
With the sad functions, and hopefully the variance functions soon,
moving to the vpx_dsp location, place the defines used in the
reference C code in a common location.
Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-05-13 11:41:15 -07:00
Johann
d5d9289800
Move shared SAD code to vpx_dsp
...
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-05-06 16:58:20 -07:00
Frank Galligan
e3167f7fbf
Add vp9_sad32x32x4d_neon Neon intrinsic function.
...
On Nexus 7 speed -6 saw ~18% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
2015-01-27 08:54:00 -08:00
Frank Galligan
9f574d0316
Add vp9_sad16x16x4d_neon Neon intrinsic function.
...
On Nexus 7 speed -6 saw ~15% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
2015-01-27 08:42:17 -08:00
Frank Galligan
54fa956715
Add vp9_sad64x64x4d_neon Neon intrinsic function.
...
On Nexus 7 speed -6 saw ~30% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
2015-01-27 08:33:40 -08:00
James Zern
65d7fa7169
sad_test: initialize bit_depth_ in all cases
...
previously 'bit_depth_', which is later used to calculate 'mask_', would
be left uninitialized in non-high-bitdepth builds
Change-Id: Ia72035f4645baf3bb0f191504f491b934cdf1e0e
2014-11-22 12:12:59 -08:00
James Zern
16d2696978
sad_test: fix vp8-only build
...
ROUND_POWER_OF_TWO() is defined in vp9 headers currently, avoid it in
non-high-bitdepth code
Change-Id: Ic28b8f95ef7964800475ee8b35be5f9cea9afab6
2014-11-19 19:18:25 -08:00
Deb Mukherjee
00c385f17a
Visual studio build fix using explicit cast
...
Change-Id: If74510370723e497f4f33d988b8b398124edf69b
2014-11-14 15:12:01 -08:00
Peter de Rivaz
7eee487c00
Added highbitdepth sse2 SAD acceleration and tests
...
Change-Id: I1a74a1b032b198793ef9cc526327987f7799125f
(cherry picked from commit b1a6f6b9cb)
2014-11-12 14:25:45 -08:00
levytamar82
7045aec00a
SAD32xh and SAD64xh for AVX2
...
All SAD functions that process 32 or more consecutive elements are optimized
for AVX2:
vp9_sad64x64
vp9_sad64x32
vp9_sad32x64
vp9_sad32x32
vp9_sad32x16
vp9_sad64x64_avg
vp9_sad64x32_avg
vp9_sad32x64_avg
vp9_sad32x32_avg
vp9_sad32x16_avg
The functions that appeared as hotspots are vp9_sad32x32 and vp9_sad64x64.
vp9_sad32x32 was optimized by 68% and vp9_sad64x64 was optimized by 90%;
together they gave an overall ~2.3% user-level gain.
Change-Id: Iccf86b375a2b54c5fbbe685902ead0c9a561b9fd
2014-10-19 13:59:10 -07:00
Dmitry Kovalev
318fc0c34f
Removing MMX SAD calculation code.
...
Removed functions:
* vp9_sad_16x16_mmx
* vp9_sad_8x16_mmx
* vp9_sad_16x8_mmx
* vp9_sad_8x8_mmx
* vp9_sad_4x4_mmx
Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3
2014-09-02 14:41:36 -07:00
levytamar82
af10457e02
Fix bug 806
...
In the functions sad32x32x4d and sad64x64x4d, the source is aligned to 16 bytes,
not 32 bytes, so the load is now unaligned.
Change-Id: I922fdba56d0936b5cf72e4503519f185645a168c
2014-08-07 14:13:30 -07:00
Scott LaVarnway
545be78136
Added vp9_sad8x8_neon()
...
Change-Id: I3be8911121ef9a5f39f6c1a2e28f9e00972e0624
2014-08-01 06:36:18 -07:00
James Zern
18e733bd1d
sad_test: drop '_t' from local typenames
...
_t is reserved by posix
+ switch to camelcase
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Type_Names
Change-Id: I60746bc93ba2446c1458a1f09fd1e49cc2e68534
2014-07-18 20:39:06 -07:00
Scott LaVarnway
ba0652e83a
Merge "Added vp9_sad64x64_neon(), vp9_sad32x32_neon()"
2014-07-17 11:42:16 -07:00
Scott LaVarnway
696fa52eaa
Added vp9_sad64x64_neon(), vp9_sad32x32_neon()
...
and vp9_sad16x16_neon()
On a Nexus 7, vpxenc (in realtime mode, speed -6)
reported a performance improvement of ~17%.
Change-Id: I91e070cde2973451083d3f3d63b49b7886de9a85
2014-07-16 12:54:46 -07:00