James Zern
2d58761993
Revert "Improved 8t filters"
...
This is incompatible with most toolchains other than gcc.
Revert "Deleted #include <inttypes.h>"
This reverts commit 4d018be950.
This reverts commit d22a504d11.
Change-Id: I1751dc6831f4395ee064e6748281418e967e1dcf
2013-09-13 15:13:06 -07:00
hkuang
86fb12b600
Merge "Add neon optimize iht8x8 which is 282% faster than C."
2013-09-12 15:42:44 -07:00
hkuang
182366c736
Add neon optimize iht8x8 which is 282% faster than C.
...
Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530
2013-09-12 11:49:05 -07:00
Christian Duvivier
6a501462f8
First draft of vp9_short_idct32x32_add_neon.
...
Lots of TODOs, which will be taken care of in upcoming changes. As is,
about 6x faster than the C version.
Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0
2013-09-11 15:19:38 -07:00
Scott LaVarnway
d22a504d11
Improved 8t filters
...
Reformatted version of a patch submitted by Erik/Tamar
from Intel. For the test clips used, the decoder
performance improved by ~2%.
Change-Id: Ifbc37ac6311bca9ff1cfefe3f2e9b7f13a4a511b
2013-09-11 13:56:32 -04:00
hkuang
3c05bda058
Merge "Add neon optimize vp9_short_iht4x4_add."
2013-09-04 13:35:09 -07:00
hkuang
3b8614a8f6
Add neon optimize vp9_short_iht4x4_add.
...
Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e
2013-09-04 12:37:58 -07:00
Jim Bankoski
79401542f7
make vp9 postproc a config option
...
VP9 postproc is disabled for now as it's not been shown to help and
may be merged with vp8.
Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-09-04 10:02:08 -07:00
hkuang
3a679e56b2
Add neon optimize vp9_short_idct16x16_1_add.
...
Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5
2013-08-27 14:00:27 -07:00
hkuang
36e9b82080
Add neon optimize vp9_short_idct8x8_1_add.
...
Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4
2013-08-26 16:28:57 -07:00
hkuang
69384f4fad
Add neon optimize vp9_short_idct4x4_1_add.
...
Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5
2013-08-26 15:55:16 -07:00
Johann
a9aa7d07d0
Merge "vp9: neon: add vp9_convolve_avg_neon"
2013-08-15 14:55:15 -07:00
Johann
63e140eaa7
Merge "vp9: neon: add vp9_convolve_copy_neon"
2013-08-15 14:55:08 -07:00
hkuang
39f42c8713
Merge "Add neon optimize vp9_short_idct16x16_add."
2013-08-14 14:16:20 -07:00
hkuang
cf6beea661
Add neon optimize vp9_short_idct16x16_add.
...
Change-Id: I27134b9a5cace2bdad53534562c91d829b48838d
2013-08-14 13:52:16 -07:00
Mans Rullgard
0f1deccf86
vp9: neon: add vp9_convolve_avg_neon
...
Change-Id: I33cff9ac4f2234558f6f87729f9b2e88a33fbf58
2013-08-14 16:27:55 +01:00
Mans Rullgard
635ba269be
vp9: neon: add vp9_convolve_copy_neon
...
Change-Id: I15adbbda15d1842e9f15f21878a5ffbb75c3c0c9
2013-08-14 16:27:55 +01:00
Dmitry Kovalev
8ffe85ad00
Moving scale_factors and related code to separate files.
...
Change-Id: I531829e5aee2a4a7a112d528ecccbddf052d0e74
2013-08-09 14:07:09 -07:00
Christian Duvivier
78182538d6
Neon version of vp9_short_idct4x4_add.
...
Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4
2013-08-06 18:41:27 -07:00
Jim Bankoski
6eb1254b88
sse3 intrapred x86inc protected
...
Change-Id: I4a3c83119cdf8a205920034c8019d855d5504605
2013-08-06 14:17:13 -07:00
Jim Bankoski
25ec1375c9
intrapred x86inc guards
...
Change-Id: If0399d8e11f4ebe75a5c91abb8d6a52a7709065b
2013-08-06 09:39:30 -07:00
Jim Bankoski
c3809f3de5
Begin to restrict x86inc.asm usage
...
Chromium does not support 32-bit builds for Mac which use x86inc.asm.
Make the files which include it work if 64-bit or if PIC is not enabled,
starting with vp9_copy_sse2.asm.
Consolidate these targets in vp9_rtcd_defs.sh.
Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248
2013-08-05 12:07:30 -07:00
Mans Rullgard
d85ae87183
vp9: neon: add vp9_mb_lpf_* functions
...
Change-Id: I13e0880df234f15abc4cc7c57fe84488d5d46a75
2013-08-02 08:10:50 -07:00
hkuang
d757de744c
Add neon optimize vp9_short_idct8x8_add.
...
Change-Id: Ic32acf3e2939c6d12d9c2bf192a5f5da59705fda
2013-07-18 16:40:41 -07:00
Johann
9ca66ec050
Merge "vp9_convolve8_neon placeholder"
2013-07-17 10:09:00 -07:00
Johann
59dc4e9cdd
vp9_convolve8_neon placeholder
...
Call the individually optimized horizontal and vertical functions. This
implementation abuses the temp buffer.
This will be replaced with a custom optimized function.
Over 2x speedup.
Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
2013-07-17 08:39:27 -07:00
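The two-pass structure described in the placeholder commit above (horizontal pass into a temp buffer, then a vertical pass) can be sketched roughly as follows. This is a plain C illustration, not the NEON code from the commit; the function names, the 64x64 block limit, the 7-bit tap precision and the tap alignment are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp a filtered value to the 8-bit pixel range. */
static uint8_t clip_u8(int v) { return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v); }

/* Horizontal 8-tap filter. Assumes taps sum to 128 (7-bit precision)
 * and that 3 pixels left / 4 pixels right of each output are readable. */
static void sketch_convolve_horiz(const uint8_t *src, int src_stride,
                                  uint8_t *dst, int dst_stride,
                                  const int16_t taps[8], int w, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      int sum = 0;
      for (int k = 0; k < 8; ++k) sum += src[x + k - 3] * taps[k];
      dst[x] = clip_u8((sum + 64) / 128);
    }
    src += src_stride;
    dst += dst_stride;
  }
}

/* Vertical 8-tap filter with the same assumptions, applied down columns. */
static void sketch_convolve_vert(const uint8_t *src, int src_stride,
                                 uint8_t *dst, int dst_stride,
                                 const int16_t taps[8], int w, int h) {
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      int sum = 0;
      for (int k = 0; k < 8; ++k) sum += src[(k - 3) * src_stride + x] * taps[k];
      dst[x] = clip_u8((sum + 64) / 128);
    }
    src += src_stride;
    dst += dst_stride;
  }
}

/* 2-D filter built from the two 1-D passes. The horizontal pass writes
 * h + 7 rows into temp so the vertical pass has 3 rows above and 4 below. */
static void sketch_convolve8(const uint8_t *src, int src_stride,
                             uint8_t *dst, int dst_stride,
                             const int16_t taps_x[8], const int16_t taps_y[8],
                             int w, int h) {
  uint8_t temp[64 * (64 + 7)];
  assert(w <= 64 && h <= 64);
  sketch_convolve_horiz(src - 3 * src_stride, src_stride, temp, 64,
                        taps_x, w, h + 7);
  sketch_convolve_vert(temp + 3 * 64, 64, dst, dst_stride, taps_y, w, h);
}
```

The intermediate rows are rounded to 8 bits between passes, which is the usual cost of reusing separate 1-D functions instead of a fused 2-D kernel.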
James Zern
98e132bde0
Merge changes I40454d26,I892e76d5,I865ab3f9,I4a4bec17,I61c4351e,I37eb3559,I1031c556,I8c8f1f42
...
* changes:
delete vp9_loopfilter_sse2.asm
vp9_loopfilter_intrin_sse2: cosmetics: fix indent
delete x86/vp9_loopfilter_x86.h
vp9_loopfilter_intrin_sse2: make some funcs static
vp9_loopfilter_intrin_sse2: remove unused uv funcs
vp9_loopfilter: remove uv function typedef
filter_block_plane: reuse some constants
vp9_loopfilter.c: make some functions static
2013-07-16 14:25:32 -07:00
James Zern
50015f6eba
delete vp9_loopfilter_sse2.asm
...
sse2 functions are provided by vp9_loopfilter_intrin_sse2.c
Change-Id: I40454d26034e3ef915eeaf889937fe7d1b519b9b
2013-07-16 13:09:16 -07:00
James Zern
af58254267
delete x86/vp9_loopfilter_x86.h
...
also remove prototype_loopfilter{,_block} defines from vp9_loopfilter.h
Change-Id: I865ab3f9436c7b1ca166f76630328abf01389405
2013-07-16 13:09:05 -07:00
Dmitry Kovalev
baf0c959c7
Moving vp9_kf_default_bmode_probs to vp9_entropymode.c.
...
Removing vp9_modelcontext.c.
Change-Id: If2316c58dead2708d9f95b52d9494ba4c1dd7427
2013-07-16 10:54:34 -07:00
Johann
a15bebfc0a
vp9_convolve8_[horiz|vert]_avg
...
Super basic conversion from the other implementations. Any changes to
one should be trivial to copy over to keep them in sync.
Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8
2013-07-12 16:21:33 -07:00
Johann
158c80cbb0
convolve8 optimizations for neon
...
Independent horizontal and vertical implementations.
Requires that blocks be built from 4x4 and [xy]_step_q4 == 16.
6-10% improvement. CIF improved the least.
Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda
2013-07-11 11:08:19 -07:00
hkuang
c9b25dcae4
Add neon optimize vp9_dc_only_idct_add.
...
Change-Id: Iae84ab945cc9662a0ddd839aa2b9ca59f2ae5423
2013-07-11 10:30:47 -07:00
Ronald S. Bultje
decead7336
Replace copy_memNxM functions with a generic copy/avg function.
...
Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa
2013-07-10 18:27:24 -07:00
Ronald S. Bultje
3f210f10eb
Remove unused iwalsh4x4 MMX/SSE2 functions.
...
Change-Id: I2d22577911a37ed7d8c7e08cac20764842267652
2013-07-10 14:52:47 -07:00
Ronald S. Bultje
48c53233fd
Remove unused 16x3/3x16 sad SSE2 functions.
...
Change-Id: I30a597c0cc366e34c9a3e2afe32d70e044f95ca4
2013-07-10 14:52:47 -07:00
Ronald S. Bultje
e6f955251f
Merge "SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction."
2013-07-10 14:52:23 -07:00
Ronald S. Bultje
89810bfd71
Merge "SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction."
2013-07-10 10:13:16 -07:00
Dmitry Kovalev
20986c81b3
Merge "Removing vp9_maskingmv.c and corresponding assembly file."
2013-07-10 10:05:06 -07:00
Ronald S. Bultje
7fd643264a
SSSE3 assembly for 4x4/8x8/16x16/32x32 H intra prediction.
...
Change-Id: Iad70966b986f65259329070e258f76ef0af816b4
2013-07-10 09:28:03 -07:00
Ronald S. Bultje
92c5d3665d
SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 DC intra prediction.
...
Change-Id: Ibe1690afc5459f3b3beca401e7734fcd03da6dd0
2013-07-10 09:28:03 -07:00
Jim Bankoski
6c8170af52
b_width_log2 and b_height_log2 lookups
...
Replace case statement with lookup.
Small speed gain at low speed settings but at speed 2+ where the
number of motion searches etc. falls the impact rises to ~3-4%.
Change-Id: Idff639b7b302ee65e042b7bf836943ac0a06fad8
Change-Id: I5940719a4a161f8c26ac9a6753f1678494cec644
2013-07-10 07:19:09 -07:00
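As a rough illustration of the "replace case statement with lookup" change above, the sketch below contrasts a switch with a constant table indexed by block size. The enum, its values and the table contents are hypothetical and are not the actual libvpx definitions.

```c
#include <assert.h>

/* Hypothetical block sizes for illustration only; not the real libvpx enum. */
typedef enum { BLK_4X4, BLK_8X8, BLK_16X16, BLK_32X32, BLK_COUNT } blk_size;

/* Before: a switch evaluated on every call. */
static int width_log2_switch(blk_size s) {
  switch (s) {
    case BLK_4X4:   return 0;
    case BLK_8X8:   return 1;
    case BLK_16X16: return 2;
    case BLK_32X32: return 3;
    default:        assert(0); return -1;
  }
}

/* After: a single indexed load from a constant table
 * (log2 of the width, in units of 4 pixels). */
static const int width_log2_table[BLK_COUNT] = { 0, 1, 2, 3 };

static int width_log2_lookup(blk_size s) {
  return width_log2_table[s];
}
```

Both forms return the same values; the table version avoids the branch, which matters most in code paths that run once per block.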
John Koleszar
f0d9f10d24
Remove all asm offset files from VP9
...
The files are empty and unused.
Change-Id: Ieb4242d14273efdf24149bda33f9591540bba06a
2013-07-09 14:26:53 -07:00
Dmitry Kovalev
aeed28f143
Removing vp9_maskingmv.c and corresponding assembly file.
...
Change-Id: I9842d02d61d78d17dc3449bae8ffbe60f4b3ecb3
2013-07-09 11:22:56 -07:00
Ronald S. Bultje
8350e7fe38
Make intra prediction pointers RTCD-based.
...
This probably has a mildly negative impact on performance, but will
(in future commits - or possibly merged with this one) allow SIMD
implementations of individual intra prediction functions. We may
perhaps want to consider having separate functions per txfm-size
also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for
each intra prediction mode), but I haven't played much with that
yet.
Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269
2013-07-08 17:25:51 -07:00
Dmitry Kovalev
904070ca64
Merge "Removing unused implicit segmentation code."
2013-07-02 11:58:48 -07:00
Dmitry Kovalev
a3d2e6c98b
Removing unused implicit segmentation code.
...
Change-Id: I8a2983fb14274a6ac53681fa4cd5d4209cbd2905
2013-07-02 11:16:42 -07:00
Dmitry Kovalev
1ac0540296
Removing vp9_mbpitch.c, moving vp9_setup_block_dptrs to vp9_block.h.
...
Change-Id: Ia547a5dd7650b771fd00edd673ab9f920270731c
2013-07-01 17:28:08 -07:00
Dmitry Kovalev
2ab3bc8871
Removing vp9_modecont.{h, c}.
...
Moving vp9_default_inter_mode_probs array to vp9_entropymode.c.
Change-Id: I88ebda86ccc07f2a43c6c01d4b37898214cfb6de
2013-07-01 10:17:15 -07:00
Frank Galligan
1d6dc1b702
Add Neon optimized loop filter functions.
...
- Added vp9_loop_filter_horizontal_edge_neon and
vp9_loop_filter_vertical_edge_neon.
- The functions are based on the vp8 loopfilter
functions.
- Matches x86 md5 checksum.
Change-Id: Id1c4dddb03584227e5ecd29f574a6ac27738fdd0
2013-06-27 16:14:45 -07:00