Luca Barbato
cc868da526
ppc: d45 predictor 32x32
...
About 12x faster.
Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed
2017-04-19 19:57:51 -07:00
Luca Barbato
7a7dc9e624
ppc: d45 predictor 16x16
...
About 16x faster.
Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9
2017-04-19 19:57:51 -07:00
Luca Barbato
c08baa2900
ppc: dc predictor 32x32
...
10x and 5x faster.
Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf
2017-04-19 19:57:47 -07:00
Luca Barbato
22ca468c7c
ppc: dc top and left predictor 32x32
...
6x faster.
Change-Id: I717995b4056e5579c68191d11b495372971fe1ae
2017-04-19 19:49:31 -07:00
Luca Barbato
ad9dea1f6d
ppc: dc top and left predictor 16x16
...
13x faster.
Change-Id: I1771ac39fda599153f933cb3f0506c9f97a6cbe6
2017-04-19 19:49:31 -07:00
Luca Barbato
d68d37872c
ppc: dc_128 predictor 32x32
...
6x faster.
Change-Id: I1da8f51b4262871cb98f0aa03ccda41b0ac2b08b
2017-04-19 19:49:31 -07:00
Luca Barbato
f9d20e6df2
ppc: dc_128 predictor 16x16
...
20x faster.
Change-Id: I05f0deb2d38ae7966eae6b71fbc0aa51880e5709
2017-04-19 19:49:31 -07:00
Luca Barbato
0d9417de4a
ppc: tm predictor 32x32
...
About 8x faster.
Change-Id: I9bad827ccbdf47ec95406e961c74ac2ff45f80cf
2017-04-19 19:49:26 -07:00
Luca Barbato
479443a570
ppc: tm predictor 16x16
...
About 10x faster.
Change-Id: I1f5a3752d346459df3b45f92963208bf3e520f06
2017-04-19 01:48:10 +02:00
Luca Barbato
c8f5a55df4
ppc: tm predictor 8x8
...
About 5x faster.
Change-Id: I951230517f49c0dca9ac9eac2efa8916a303b85a
2017-04-19 01:48:09 +02:00
Luca Barbato
7b0e12934e
ppc: horizontal predictor 32x32
...
About 5x faster.
Change-Id: I3bb724e07baffd901aa2d0f65060ba48882cc9b8
2017-04-19 01:48:09 +02:00
Luca Barbato
a7a2d1653b
ppc: horizontal predictor 16x16
...
About 10x faster.
Change-Id: Ie81077fa32ad214cdb46bdcb0be4e9e2c7df47c2
2017-04-19 01:48:09 +02:00
Luca Barbato
7ad1faa6f8
ppc: vertical intrapred 16x16 and 32x32
...
Change-Id: Ic80f3c050cfbe7697e81a311b4edaaa597b85cab
2017-04-19 01:48:09 +02:00
Linfeng Zhang
05e2b5a59f
Merge "Add 32x32 d45 and 8x8, 16x16, 32x32 d135 NEON intra prediction"
2016-11-22 23:20:53 +00:00
Linfeng Zhang
85c1ee434d
Add high bitdepth intra prediction NEON optimization (mode tm)
...
BUG=webm:1316
Change-Id: Ib014de06836ac12726f4a2c9f0833ec4eb4d233b
2016-11-15 14:19:46 -08:00
Linfeng Zhang
a3128ad33a
Add high bitdepth intra prediction NEON optimization (h and v)
...
BUG=webm:1316
Change-Id: I47eeac698a98a31d1af5f72441052302e9fa4f46
2016-11-12 12:00:19 -08:00
Linfeng Zhang
40ab0424d4
Add high bitdepth intra prediction NEON optimization (mode d45 and d135)
...
BUG=webm:1316
Change-Id: I6a330874348df04df24a6d9efdc06f567e04bf8e
2016-11-09 12:04:04 -08:00
Linfeng Zhang
1868582e7d
Add 32x32 d45 and 8x8, 16x16, 32x32 d135 NEON intra prediction
...
Change-Id: I852616794244490123eb615ac750da50265f0fa5
2016-11-02 11:40:37 -07:00
Linfeng Zhang
3b74066b10
Add high bitdepth intra prediction NEON optimization (mode dc)
...
BUG=webm:1316
Change-Id: I984d6004ea2445e86f213fb6fa4d794a9955af8f
2016-11-01 17:07:36 -07:00
Linfeng Zhang
05ee241493
Add high bitdepth intra prediction optimization speed test
...
BUG=webm:1316
Change-Id: I99feec867d5b8ea06b43cdd3fcd7c90238f5efdb
2016-11-01 13:57:01 -07:00
clang-format
33e40cb5db
test: apply clang-format
...
Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f
2016-07-27 01:58:52 +00:00
Johann
0266e70c52
test: remove x86inc.asm distinction
...
BUG=b:29583530
Change-Id: I296a0b81755e3086bc0a40cb126d0200ff03c095
2016-06-30 11:14:10 -07:00
Linfeng Zhang
ad0646cb84
Slow pshufb removal in 3 intra prediction functions.
...
Replaced vpx_d45_predictor_4x4_ssse3(), vpx_d45_predictor_8x8_ssse3()
and vpx_d207_predictor_4x4_ssse3() with
created vpx_d45_predictor_4x4_sse2(), vpx_d45_predictor_8x8_sse2()
and vpx_d207_predictor_4x4_sse2() respectively.
It's mostly neutral or slightly worse than ssse3 in good cases and
better than ssse3 in the bad cases (but still worse than using the mmx
regs).
Change-Id: Ib0237ceb71d2c57b8a93fd3170330cfed9d56bdd
2016-06-02 10:55:58 -07:00
Jian Zhou
88120481a4
Code clean of tm_predictor_32x32
...
Reallocate the xmm register usage so that no ARCH_X86_64 required.
Reduce memory access to the left neighbor by half.
Speed up by single digit on big core machine.
Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990
2015-12-11 10:32:08 -08:00
Jian Zhou
c90a8a1a43
SSE2 based h_predictor_32x32
...
Relocate the function from SSSE3 to SSE2, Unroll loop from 16 to 8,
and reduce mem access to left.
Speed up by single digit in ./test_intra_pred_speed on big core
machines.
Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960
2015-12-10 10:09:58 -08:00
Jian Zhou
aa5b517a39
Re-enable SSE2 based intra 4x4 prediction
...
4x4 Intra predictor implemented with MMX is replaced with SSE2.
Segfault in change 315561 when decoding vp8 is taken care of.
Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76
2015-12-07 18:50:37 -08:00
James Zern
79a9add666
Revert "MMX in intra 4x4 prediction replaced with SSE2"
...
This reverts commit 89a1efa4c4
.
This causes a segfault when decoding vp8, in both 32 and 64-bit
Change-Id: Idbb9bb28ab897e1d055340497c47b49a12231367
2015-12-05 10:20:39 -08:00
Jian Zhou
e86c7c863e
Speed up h_predictor_16x16
...
Relocate the function from SSSE3 to SSE2, Unroll loop from 8 to 4,
and reduce mem access to left.
Speed up by >20% in ./test_intra_pred_speed.
Change-Id: Ie48229c2e32404706b722442942c84983bda74cc
2015-12-04 12:12:55 -08:00
Jian Zhou
da3f08fac3
Speed up h_predictor_8x8
...
Relocate the function from SSSE3 to SSE2, Unroll loop from 4 to 2,
and reduce mem access to left.
Speed up by >20% in ./test_intra_pred_speed.
Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e
2015-12-04 11:36:44 -08:00
Jian Zhou
aa2764abdd
MMX in intra 8x8 prediction replaced with SSE2
...
8x8 Intra predictor implemented with MMX is replaced with SSE2.
Change-Id: I0c90e7c1e1e6942489ac2bfe58903b728aac7a52
2015-12-03 18:11:06 -08:00
Jian Zhou
89a1efa4c4
MMX in intra 4x4 prediction replaced with SSE2
...
4x4 Intra predictor implemented with MMX is replaced with SSE2.
Change-Id: Id57da2a7c38832d0356bc998790fc1989d39eafc
2015-12-03 16:40:23 -08:00
Jian Zhou
9d29d76280
SSE2 speed up of h_predictor_4x4
...
Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers.
Speed up by ~25% in ./test_intra_pred_speed.
Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d
2015-11-30 10:08:05 -08:00
Jian Zhou
79b68626ae
Speed up tm_predictor_4x4
...
tm_predictor_4x4 is implemented with SSE2 using XMM registers.
Speed up by ~25% in ./test_intra_pred_speed.
Change-Id: I25074b78d476a2cb17f81cf654bdfd80df2070e0
2015-11-18 16:44:25 -08:00
hui su
4013645353
Replace prefix vp9_ with vpx_ for intra prediction functions
...
Change-Id: I8ae6fb586f8d5d018ace228df11714f82b085076
2015-07-27 13:42:06 -07:00
hui su
7971846a5e
Move intra prediction functions from vp9/common/ to vpx_dsp/
...
Change-Id: I64edc26cf4aab050c83f2d393df6250628ad43b8
2015-07-27 13:38:16 -07:00
Johann
ff8505a54d
Fix --disable-use-x86inc
...
Change-Id: I374fcd8fb45a6893dcdeac6896671be142a99f06
2015-07-01 13:15:51 -07:00
James Zern
9db1f24c47
vp9_reconintra_neon: add d45 16x16
...
~90% faster over 20M pixels
Change-Id: I92d80f66e91e0a870a672cfb5dd29bf1a17cb11a
2015-06-22 21:00:07 -07:00
James Zern
12c6688e31
vp9_reconintra_neon: add d45 8x8
...
based on ssse3 implementation
~91% faster over 20M pixels
Change-Id: I6d743a53352c2d6de0efe7899d7996e8b0f7fa29
2015-06-19 19:19:22 -07:00
James Zern
ce88d74d34
vp9_reconintra_neon: add d45 4x4
...
based on webp's LD4()
~59% faster over 20M pixels
Change-Id: I371eaed9ce8f470451046997e130b0ba1a2f7a9c
2015-06-18 15:25:07 -07:00
James Zern
337b221e00
vp9_reconintra_neon: add d135 4x4
...
based on webp's RD4()
~50% faster over 20M pixels
Change-Id: Ifcb7bf7f7fc8eabf79d9e3b219ce1be67abc524a
2015-06-18 15:25:06 -07:00
James Zern
6e44bf20f7
vp9_reconintra_neon: add DC 4x4 predictors
...
~85-89% faster over 20M pixels
Change-Id: I3812e8adfffe5255034da88dfe6546e12f4d10ee
2015-06-18 15:22:43 -07:00
James Zern
79fb3a013e
vp9_reconintra_neon: add DC 32x32 predictors
...
~84-85% faster over 20M pixels
Change-Id: Ia67a7f4a342bf7b0a9280e05c25d81a774d90469
2015-06-15 20:57:28 -07:00
James Zern
98f0178611
enable vp9_d153_predictor_32x32_ssse3
...
unused since its initial commit
~91% faster over 20M pixels
Change-Id: Ic8b5b3246bc97c8406be8bc4496601370403b70a
2015-06-12 19:48:22 -07:00
James Zern
07799ef28a
Merge "test_intra_pred_speed: add ClearSystemState() call"
2015-06-12 06:27:45 +00:00
Parag Salasakar
c7489f4815
Merge "mips msa vp9 intra-pred optimization"
2015-06-11 03:31:49 +00:00
James Zern
1898d1336d
test_intra_pred_speed: add ClearSystemState() call
...
fixes instability; noticed on mingw
Change-Id: Idef4349339444ec84916e5fcd908ee9633d28aaa
2015-06-10 12:44:07 -07:00
James Zern
6a422e4452
test_intra_pred_speed: remove #if w/in another macro
...
fixes the compile under visual studio
Change-Id: Ifa3926e198af97d73250540c6d0ef692f5e354ff
2015-06-09 19:30:04 -07:00
Parag Salasakar
a2288d274c
mips msa vp9 intra-pred optimization
...
intra pred - average improvement ~2x-3x
Change-Id: Ie3f7d6eded5ecb7ed7ee506ba8e4d98f93803b09
2015-06-06 22:29:32 +05:30
James Zern
a2a13cbe5f
vp9_reconintra_neon: add DC 16x16 predictors
...
85-89% faster over 20M pixels
Change-Id: I9b320ed6b9e67f27df738b84c8b43b65a93c50c2
2015-05-29 15:41:44 -07:00
James Zern
e97b849219
vp9_reconintra_neon: add DC 8x8 predictors
...
~90% faster over 20M pixels
Change-Id: Iab791510cc57c8332c2f9a5da0ed50702e5f5763
2015-05-29 15:39:08 -07:00