637 Commits

Author SHA1 Message Date
Johann Koenig
ac29aa135c Merge "Rename vp8 loopfilter_mmx.asm" 2015-08-04 15:55:48 +00:00
Johann
749c393c8d Rename vp8 loopfilter_mmx.asm
Chromium puts all the yasm output in the same directory. Looking at ways
to improve this but in the meantime get rid of collisions.

Change-Id: I923c5231d14e895ab96521eb89807ede868a0753
2015-08-03 14:27:03 -07:00
Parag Salasakar
d35f992599 mips msa vp8 denoising filter optimization
average improvement ~2x-3x

Change-Id: I6c17012c731fa4d56e0343f8de0df47b2dde289b
2015-08-01 08:05:25 +05:30
Parag Salasakar
8fbc641540 mips msa vp8 temporal filter optimization
average improvement ~2x-3x

Change-Id: I05593bed583234dc7809aaec6cab82773a29505d
2015-07-31 12:03:19 +05:30
Parag Salasakar
0e3f494b21 mips msa vp8 block subtract optimization
average improvement ~2x-3x

Change-Id: I30abf4c92cddcc9e87b7a40d4106076e1ec701c2
2015-07-31 09:29:10 +05:30
Parag Salasakar
56aa0da405 mips msa vp8 quantize optimization
average improvement ~2x-3x

Change-Id: I6fc37191bf9cb5a67e1af9787d0d27659c17bdba
2015-07-30 12:56:57 -07:00
Parag Salasakar
0c2a14f9e2 mips msa vp8 fdct optimization
average improvement ~2x-4x

Change-Id: Id0bc600440f7ef53348f585ebadb1ac6869e9a00
2015-07-30 08:14:42 +05:30
Parag Salasakar
a5d9416fd7 mips msa vp8 post proc optimization
average improvement ~2x-4x

Change-Id: I93abc15389649c169bb8b69127c0b95407d34692
2015-07-29 09:40:26 +05:30
Parag Salasakar
5deb983744 mips msa vp8 filter by weight optimization
average improvement ~3x-5x

Change-Id: Ia808ae56b118e0e1b293901447aa5a0f597b405b
2015-07-28 08:16:34 +05:30
Parag Salasakar
af6733aec6 mips msa vp8 recon intra optimization
average improvement ~3x-5x

Change-Id: I73306863e9bf172d5adc06b8dd54e43985d1e063
2015-07-25 12:32:26 +05:30
Parag Salasakar
fb73ceae85 mips msa vp8 bilinear filter optimization
average improvement ~3x-4x

Change-Id: I8c0b3d5c86c9eb4f802b87c971864d2cfceeb7cc
2015-07-24 09:21:35 +05:30
Parag Salasakar
509fb0bc9d mips msa vp8 copy mem optimization
average improvement ~2x-4x

Change-Id: I3af3ecced96c5b8e0cb811256e5089e28fe013a2
2015-07-23 10:29:40 +05:30
Parag Salasakar
55c0df5ef1 mips msa vp8 sixtap filter optimization
average improvement ~3x-5x

Change-Id: I5fd88cb088814be443d04be384b9fca99b22adef
2015-07-13 09:23:52 +05:30
Parag Salasakar
0ea2684c2c mips msa vp8 loop filter optimization
average improvement ~2x-4x

Change-Id: I20c4f900ef95d99b18f9cf4db592cd352c2212eb
2015-07-08 12:41:00 +05:30
Johann
6a82f0d7fb Move sub pixel variance to vpx_dsp
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-07 15:51:04 -07:00
Jingning Han
9d251f9510 Merge "Unify subtract function used in VP8/9" 2015-07-07 20:42:19 +00:00
Jingning Han
0ede9f52b7 Unify subtract function used in VP8/9
This commit replaces the vp8_ prefixed subtract function with the
common vpx_subtract_block function. It removes redundant SIMD
optimization codes and unit tests.

Change-Id: I42e086c32c93c6125e452dcaa6ed04337fe028d9
2015-07-07 09:57:44 -07:00
Parag Salasakar
3d938d71b0 mips msa vp8 idct optimization
average improvement ~2x-5x

Change-Id: I19e82f78772993bcd67fcf975fe180232172f86d
2015-07-07 12:41:54 +05:30
James Zern
dcf5b7cfdd loopfiltersimpleverticaledge_neon: quiet uninit var warnings
take 2. localize the function parameter to actually remove the warning

Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
2015-06-30 23:23:59 -07:00
James Zern
69c153c4e6 loopfiltersimpleverticaledge_neon: quiet uninit var warnings
the vector used in vld*_lane_* should be initialized before use

Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
2015-06-25 20:39:21 -07:00
James Zern
f4d746a3c1 idct_dequant_0_2x_neon: quiet uninit var warnings
the vector used in vld*_lane_* should be initialized before use

Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
2015-06-25 20:29:35 -07:00
James Zern
4bd87a9b9e vp8_subpixelvariance_neon: right size coeff table
only uint8 is required; each use only loads one value as a uint8
quiets a few type conversion warnings

Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
2015-06-23 23:48:12 -07:00
Johann
907b33cdc4 Move vp8 variance files
There is a naming conflict in the chromium build system.

The rest of the variance functions will move to vpx_dsp soon.

Change-Id: Iff78da2aafb0d7380eda73e38d7dac72110a1e47
2015-06-18 16:42:28 -07:00
James Zern
47fe535422 disable vp8_sub_pixel_variance8x8_neon
fails unit tests:
[  FAILED  ] NEON/VP8SubpelVarianceTest.ExtremeRef/0, where GetParam() = (3, 3, 0x14e36d, 0)
[  FAILED  ] NEON/VP8SubpelVarianceTest.Ref/0, where GetParam() = (3, 3, 0x14e36d, 0)

the tests were recently enabled in:
eb88b17 Make vp9 subpixel match vp8

the functions likely haven't changed since being converted from assembly

Change-Id: I6141717b111b8f735f436c160d74270af53ef722
2015-06-05 20:18:51 -07:00
Johann
516c087c51 Remove unused sub pixel mse
Change-Id: I7a5e4e2632c3fa69d2a85a68fa9b418631caf09c
2015-06-03 08:00:51 -07:00
Johann
86d0cb8325 Disable neon bilinear 4x4
Clang adds alignment hints when casting up the loads/stores. Although
this should be safe for most paths, it's causing some crashes. Either
the source of the misalignment needs to be determined and adjusted or
the intrinsics need to be rewritten to avoid using the cast to load the
data.

BUG=817,892

Change-Id: Ia3aa824d6a4cd97e14325ff49dc730b6f85ec7e8
2015-06-02 00:02:55 +00:00
Johann
c3bdffb0a5 Move variance functions to vpx_dsp
subpel functions will be moved in another patch.

Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-26 12:01:52 -07:00
James Zern
62ad8baa40 vp8: add some missing includes
silences missing prototype warnings

Change-Id: Ib62e4743532b871e63bc99732875fff20501b8ac
2015-05-14 22:41:25 -07:00
James Zern
632177fa7f vp8: make some functions static
silences missing prototype warnings

Change-Id: I9f24a3214c832c982ca0dc5a032316eba48472ff
2015-05-14 22:41:25 -07:00
James Zern
f80bbc0efb vp8/common/variance*: add vp8_rtcd include
silences missing prototype warnings

Change-Id: I5ca198b56a5ff0cf5b93c89957526f243c04e9c8
2015-05-14 22:41:25 -07:00
James Zern
6eb1016301 vp8_copy32xn: sync function signature
+ include vp8_rtcd.h in copy_c.c
silences missing prototype warnings

Change-Id: Iecc279c695b08a26b231dedb41e3b84c551703f3
2015-05-14 22:41:13 -07:00
Johann
11a4a3c065 Merge "Remove only remaining uses of 'fast_unaligned'" 2015-05-07 23:32:18 +00:00
Johann
802e1d84cc Remove only remaining uses of 'fast_unaligned'
Use memcpy instead of casting.

Change-Id: Ieca725cc628883985bde23c7d742af8781c5dbb5
2015-05-07 14:39:37 -07:00
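The "memcpy instead of casting" idea in 802e1d84cc can be illustrated with a minimal sketch. The helper names `load_u32` and `store_u32` are hypothetical and are not taken from the libvpx sources; the point is only that copying through `memcpy` avoids dereferencing a cast pointer at an address that may not be suitably aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from libvpx): read a 32-bit value from an
 * address that may be unaligned, without casting the pointer. */
static inline uint32_t load_u32(const void *src) {
  uint32_t value;
  assert(src != NULL);
  /* Copies sizeof(value) bytes; no alignment requirement on src. */
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Hypothetical counterpart: write a 32-bit value to an address that
 * may be unaligned. */
static inline void store_u32(void *dst, uint32_t value) {
  assert(dst != NULL);
  memcpy(dst, &value, sizeof(value));
}
```

Optimizing compilers commonly reduce a fixed-size `memcpy` like this to a plain load or store on targets that permit unaligned access, so the cost relative to a cast is usually small.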
James Zern
fd3658b0e4 replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.

Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-05-07 11:55:08 -07:00
Johann
d5d9289800 Move shared SAD code to vpx_dsp
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.

This reduces the size of vpxenc/dec by 36k on x86_64 builds.

Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-05-06 16:58:20 -07:00
James Zern
f58011ada5 vpx_mem: remove vpx_memset
vestigial. replace instances with memset() which they already were being
defined to.

Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-04-28 20:00:59 -07:00
James Zern
f274c2199b vpx_mem: remove vpx_memcpy
vestigial. replace instances with memcpy() which they already were being
defined to.

Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c
2015-04-28 19:59:41 -07:00
Johann
14ef4aeafb Reorganize *_rtcd() calling conventions
Change-Id: Ib1e17d8aae9b713b87f560ab5e49952ee2bfdcc2
2015-04-15 11:12:05 -04:00
James Zern
970acffa8f multiframe_quality_enhance_block: remove dead stores
Change-Id: I33ca9cddfdd54c3d8a23c1cb978986a537a20bf2
2015-04-03 16:15:51 -07:00
James Zern
7b4f727959 vp8_print_modes_and_motion_vectors: remove dead stores
Change-Id: I438cbf4970fa2220fb73b0b41a29e654836d4e3b
2015-04-03 16:08:37 -07:00
Johann
bc98e93b53 Remove PPC build support
There are no functional optimizations for AltiVec/PPC

Change-Id: I6877a7a9739017fe36fc769be22679c65ea99976
2015-04-02 09:13:59 -07:00
Johann
eabb793f3b Use correct buffer size in vp8 subpixel variance
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized
to kHeight8 * kWidth8. However, in the case that xoffset != 0 and
yoffset == 0, var_filter_block2d_bil_w8 is called with output_width
kHeight8PlusOne.

Thanks to cmugurel for diagnosing and yulius for the patch.

Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0
https://code.google.com/p/webrtc/issues/detail?id=4190
2015-02-03 09:11:05 -08:00
Jim Bankoski
f4eab151c5 Revert "remove vp8 unused uvstride parm in simple loop filter"
This reverts commit 392a2c43c7

Failing nexus build tests:
http://build.chromium.org/p/client.libvpx/builders/Nexus%207%20Builder/builds/224

Change-Id: I95ae2c894b70cef9c757334fcab7fdeca9003e9c
2014-12-21 21:35:07 -08:00
Jim Bankoski
2c5dc477bf Merge "remove vp8 unused uvstride parm in simple loop filter" 2014-12-21 16:49:45 -08:00
Johann
80b344dec5 Silence -Werror=unused-parameter
Cast away remaining issues so that new ones don't get lost in the noise.

Change-Id: Iacd6999b0686ce80f9835730d68db6382690fa92
2014-12-16 12:47:08 -08:00
Marco
af898b56bb Various updates to vp8.
Change-Id: Icc7a816491897107764e4c936288e9000e6319b8
2014-12-03 16:01:28 -08:00
Johann
6eec73a747 Remove asm offset dependencies
The obj_int_extract code is no longer worth maintaining. It creates
significant issues when adapting for different build systems and no
longer offers as significant a performance benefit due to
improvements in intrinsics.

Source files will remain until the various third-party builds are updated.

The neon fast quantizer has been moved to intrinsics. The armv6 version
has been removed because so few remaining targets require it.

Compilers and processors have improved significantly since the
pack_tokens code was written. The assembly is no longer faster than the
C code.

pack_tokens were the only optimizations for the armv5te targets so the targets
will be removed after the test infrastructure has been updated.

BUG=710

Change-Id: Ic785b167cd9f95eeff31c7c76b7b736c07fb30eb
2014-11-06 16:00:01 -08:00
Johann
2134eb2f05 Remove pair quantization
The intrinsics version of the pair quant is slower than running it
individually.

Change-Id: I7b4ea8599d4aab04be0a5a0c59b8b29a7fc283f4
2014-10-31 13:42:55 -07:00
Johann
7ae75c3d52 vp8 quantization -> intrinsics
Use intrinsics for neon quantization. Slight loss (<5%) of performance
compared to the assembly. Roughly 10x faster on arm64 because that was
running C code before.

Change-Id: I7cf5242d8f29b7eab5bca6a1c20c89c9fc9ca66d
2014-10-31 13:42:13 -07:00
Johann
f6be2f3c87 Clarify GCC version check
The version check was incorrectly matching some versions of clang
which reported as gcc 4.2

Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
2014-09-25 11:53:45 -07:00
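The problem described in f6be2f3c87 arises because clang also defines `__GNUC__` and presents itself as GCC 4.2. A minimal sketch of a check that excludes clang follows; the macro and function names (`IS_REAL_GCC`, `GCC_AT_LEAST`, `built_with_real_gcc`) are hypothetical, and this is not the check used in libvpx.

```c
#include <assert.h>

/* Hypothetical macros (not from libvpx). clang defines __GNUC__ too,
 * so it is excluded explicitly via __clang__. */
#if defined(__GNUC__) && !defined(__clang__)
#define IS_REAL_GCC 1
#define GCC_AT_LEAST(major, minor) \
  (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
#define IS_REAL_GCC 0
#define GCC_AT_LEAST(major, minor) 0
#endif

/* Returns 1 when compiled by genuine GCC, 0 otherwise. */
static inline int built_with_real_gcc(void) {
  /* A successful version match must imply genuine GCC. */
  assert(!GCC_AT_LEAST(4, 2) || IS_REAL_GCC);
  return IS_REAL_GCC;
}
```

Other compilers that define `__GNUC__` for compatibility would need their own exclusions; the sketch only handles clang.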