Johann
9b54e812f7
neon hadamard 8x8
...
Runs about 30% faster than the C implementation
BUG=webm:1021
Change-Id: I6809d6d84c3077ab619c53298296950e976bdaba
2016-05-16 11:58:02 -07:00
Yaowu Xu
c1e4f5a80d
Merge "Change to use correct check for halfpel"
2016-05-13 01:27:47 +00:00
Linfeng Zhang
2f55beb355
Merge "remove mmx variance functions"
2016-05-11 22:21:23 +00:00
Yaowu Xu
17fae3ad0a
Change to use correct check for halfpel
...
In the motion estimation stage for subpel motion, subpel variance is
computed using bilinear interpolation. The motion vector precision
used is 1/8 pel, and three bits are used to represent the x and y
subpel offsets. Based on this, the half-pel check should be against
4, not 8.
Change-Id: I1f56fa1fa3f2f5e19a20d27983efe628557f170e
2016-05-11 13:52:59 -07:00
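The reasoning in 17fae3ad0a can be sketched in a few lines of C. This is a minimal illustration, not the libvpx code: the macro and function names below are hypothetical. With three bits per offset the only possible values are 0 through 7, so a comparison against 8 can never match, while 4/8 pel is exactly the half-pel position.

```c
#include <assert.h>

/* Hypothetical names for illustration only; these are not libvpx identifiers. */
#define SKETCH_SUBPEL_BITS 3                          /* three bits per subpel offset */
#define SKETCH_SUBPEL_STEPS (1 << SKETCH_SUBPEL_BITS) /* 8 steps of 1/8 pel per full pel */
#define SKETCH_HALF_PEL (SKETCH_SUBPEL_STEPS / 2)     /* 4/8 pel is 1/2 pel */

/* Extracts the 3-bit subpel offset from a motion vector component
 * expressed in 1/8-pel units. The result is always in 0..7. */
static int sketch_subpel_offset(int mv_component) {
  return mv_component & (SKETCH_SUBPEL_STEPS - 1);
}

/* Returns nonzero when a 1/8-pel offset sits exactly on the half-pel position. */
static int sketch_is_half_pel(int offset) {
  assert(offset >= 0 && offset < SKETCH_SUBPEL_STEPS);
  return offset == SKETCH_HALF_PEL;
}
```

For example, a component of 12 (one full pel plus 4/8) yields an offset of 4 and is treated as half pel, whereas a component of 8 yields an offset of 0.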
Linfeng Zhang
d0ffae825d
remove mmx variance functions
...
there are sse2 equivalents, and sse2 is a reasonable modern baseline
Removed mmx variance functions:
vpx_get_mb_ss_mmx()
vpx_get8x8var_mmx()
vpx_get4x4var_mmx()
vpx_variance4x4_mmx()
vpx_variance8x8_mmx()
vpx_mse16x16_mmx()
vpx_variance16x16_mmx()
vpx_variance16x8_mmx()
vpx_variance8x16_mmx()
Change-Id: Iffaf85344c6676a3dd337c0645a2dd5deb2f86a1
2016-05-11 12:39:42 -07:00
Linfeng Zhang
d0e687bf8c
remove mmx sad functions
...
there are sse2 equivalents, and sse2 is a reasonable modern baseline
Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba
2016-05-11 10:50:04 -07:00
Jim Bankoski
da33728f48
vpx_dsp: Rename postproc.c add_noise.
...
Change-Id: I4906d1b79a2951e659995202b9fa97e2ea5cfba0
2016-05-10 06:52:58 -07:00
Scott LaVarnway
c2c5297595
Merge "VPX: refactor vpx_idct16x16_1_add_sse2()"
2016-05-09 22:15:17 +00:00
James Bankoski
7cced7b3ea
Merge "libvpx: vpx_add_plane_noise make c match assembly"
2016-05-09 20:17:38 +00:00
Johann Koenig
9e5811f485
Merge changes Id13b97f4,I1d342725
...
* changes:
The subfunctions are only defined for sse2
Unlike non-hbd variance, opt2 is never used
2016-05-09 18:38:59 +00:00
Scott LaVarnway
1490342be5
VPX: refactor vpx_idct16x16_1_add_sse2()
...
Change-Id: I431ea0d9abe764d110a1ba32a8cb15e2fdac8805
2016-05-09 09:50:00 -07:00
Jim Bankoski
7a91d21d69
libvpx: vpx_add_plane_noise make c match assembly
...
This change makes the C code match the assembly and removes the TODOs
associated with getting this to work.
Change-Id: Ie32e9ebb584a9d60399662d8bcb71b74fbd19d1e
2016-05-07 12:47:49 -07:00
Johann
7e4c306981
Use canonical avg_pred functions
...
Change-Id: Ibe0cc388226622561d2b4a00e5bdc1016a3c4a94
2016-05-06 19:06:03 -07:00
Johann
b23bd2360f
The subfunctions are only defined for sse2
...
See highbd_subpel_variance_impl_sse2.asm
Change-Id: Id13b97f4f6d189ed71cdc6d52b3c4ea63dc1da05
2016-05-06 18:58:49 -07:00
Johann
a761197fbd
Unlike non-hbd variance, opt2 is never used
...
Change-Id: I1d342725df332c4efc6006d9e3dcb7372c41f448
2016-05-06 18:38:04 -07:00
James Zern
5e679848e8
Merge changes from topic 'missing-proto'
...
* changes:
vp9_frame_scale_ssse3.c: make 2 functions static
vp9_pickmode.c: make function static
vp9_noise_estimate.c: make function static
vp9_aq_360.c: add missing include
vp9_idct_intrin_sse2: add missing vp9_rtcd.h include
vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
2016-05-06 02:25:29 +00:00
James Zern
2184692c07
vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
...
Change-Id: I103be7eee36492f8619144ce8325bc916d4975c7
2016-05-04 15:06:44 -07:00
James Zern
4f69f741d8
vpx_dsp_common.h: remove circular include
...
Change-Id: I05b3028a38bbc062c388eeb95e99a3fee583ae6b
2016-05-04 14:54:53 -07:00
James Zern
aa68a8301e
vpx_dsp_common.h: fix include guard
...
Change-Id: I1ad41c096ec86870f9aecab6fdbc3af03e972afc
2016-05-04 14:54:32 -07:00
James Bankoski
89f905e5e5
Merge "libvpx: add a unit test for plane_add_noise."
2016-05-04 13:09:05 +00:00
Jim Bankoski
34d5aff747
libvpx: add a unit test for plane_add_noise.
...
Doing so fixes a couple of bugs:
vpx_plane_add_noise.c needed to subtract a clamp instead of adding it,
and the assembly (mmx, sse) assumed that the parameters were
contiguous in memory, which was not true.
Change-Id: I76f2c43cf54bfc838eb2edf8a443eaaa7565d7b5
2016-05-03 16:23:06 -07:00
James Bankoski
e755a283dd
Merge "Move vpx_add_plane from codec to vpx_dsp and dedup."
2016-05-03 14:11:57 +00:00
Jim Bankoski
fce3cee8dd
Move vpx_add_plane from codec to vpx_dsp and dedup.
...
Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7
2016-05-02 12:17:39 -07:00
Alex Converse
f6d13e7be5
Merge "bitreader: remove an unsigned overflow."
2016-04-28 16:26:37 +00:00
Alex Converse
a68b24fdee
Tweak casts on vpx_sub_pixel_variance to avoid implicit overflow.
...
Change-Id: I481eb271b082fa3497b0283f37d9b4d1f6de270c
2016-04-27 16:37:18 -07:00
Alex Converse
36a0c7ffe3
bitreader: remove an unsigned overflow.
...
bits_left is in the range [0, 64 (= BD_VALUE_SIZE)], so the narrowing
conversion should be safe.
Change-Id: I943fcd359eaad76249ee1e1fb03a2ac16945d2fd
2016-04-27 15:31:35 -07:00
Alex Converse
6c4007be1c
Be explicit about overflow in vpx_variance16x16_sse2.
...
The product always fits in uint32_t, but the operands don't.
An optimizing compiler should generate the wraparound code.
(Verified with clang).
Change-Id: I25eb64df99152992bc898b8ccbb01d55c8d16e3c
2016-04-27 15:22:17 -07:00
Alex Converse
ccb894ce73
Remove casts on < 16x16 variance.
...
These blocks will never overflow since max sum is +/-255*w*h.
Change-Id: Ia2c630339fd9cfb411b56b6040ff402095f12a2e
2016-04-27 15:21:58 -07:00
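The bounds quoted in 6c4007be1c and ccb894ce73 can be checked with a little arithmetic. The sketch below assumes 8-bit pixels and uses hypothetical helper names; the final function only illustrates the kind of explicit widening the 16x16 case calls for and is not claimed to be the libvpx source. For the largest blocks below 16x16 (16x8 and 8x16), |sum| is at most 255 * 128 = 32640 and sum * sum = 1065369600, which fits in a signed 32-bit int. For 16x16, |sum| can reach 65280 and sum * sum = 4261478400, which exceeds INT_MAX but still fits in uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Largest possible |sum| of pixel differences for a w x h block of 8-bit pixels. */
static int64_t sketch_max_abs_sum(int w, int h) { return (int64_t)255 * w * h; }

/* Largest possible sum * sum, computed in 64 bits so the check itself cannot overflow. */
static int64_t sketch_max_sum_squared(int w, int h) {
  const int64_t s = sketch_max_abs_sum(w, h);
  return s * s;
}

/* Hypothetical 16x16 variance tail: widen before multiplying, then narrow.
 * The product fits in uint32_t even though it does not fit in int. */
static uint32_t sketch_variance16x16(uint32_t sse, int sum) {
  assert(sum >= -255 * 256 && sum <= 255 * 256);
  const uint32_t sum_sq = (uint32_t)((int64_t)sum * sum);
  return sse - (sum_sq >> 8); /* divide by 256 pixels */
}
```

A block whose differences are all 255 has sse = 16646400 and sum = 65280, so the sketch returns a variance of 0, as expected for a constant difference.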
Johann
2f5840de3e
vpx_minmax_8x8_neon and test
...
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156
Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6
2016-04-21 21:40:25 -07:00
Johann
8c02a36953
hadamard 8x8 test
...
The order of the output structure is not currently important.
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1021
Change-Id: Ibc0006d569675db6c5060c4529f5d9e73f2e96a6
2016-04-21 22:28:21 +00:00
Johann Koenig
c59c5cbeff
Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm"
2016-04-15 16:00:51 +00:00
Martin Storsjo
d8b3e29ee7
Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm
...
Unlike the neon_asm version, this was never hooked up for the 32x32_34
case in 3f7c12da, when the intrinsics version was added.
Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320
2016-04-15 10:25:47 +03:00
Johann
26faa3ec7a
Apply 'const' to data not pointer
...
Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298
2016-04-14 14:47:16 -07:00
James Zern
5ab46e0ecd
Merge changes I7a1c0cba,Ie02b5caf,I2cbd85d7,I644f35b0
...
* changes:
vpx_fdct16x16_1_sse2: improve load pattern
vpx_fdct16x16_1_c/msa: fix accumulator overflow
vpx_fdctNxN_1_sse2: reduce store size
dct32x32_test: add PartialTrans32x32Test, Random
2016-04-06 02:51:53 +00:00
James Zern
38bc1d0f4b
vpx_fdct16x16_1_sse2: improve load pattern
...
load the full row rather than doing 2 8-wide columns
Change-Id: I7a1c0cba06b0dc1ae86046410922b1efccb95c95
2016-04-04 16:03:42 -07:00
James Zern
eb64ea3e89
vpx_fdct16x16_1_c/msa: fix accumulator overflow
...
tran_low_t is only signed 16-bits in non-high-bitdepth mode
Change-Id: Ie02b5caf2658e8e71f995c17dd5ce666a4d64918
2016-04-04 16:03:41 -07:00
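The overflow described in eb64ea3e89 follows from simple arithmetic: summing 256 inputs of magnitude up to 255 can reach 65280, which does not fit in a signed 16-bit value. A minimal sketch of the usual remedy, accumulating in int and narrowing once at the end, is shown below. The type and function names are hypothetical, and the final right shift by one is an assumed scaling chosen so the result fits in 16 bits; it is not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for tran_low_t in a non-high-bitdepth build. */
typedef int16_t sketch_tran_low_t;

/* DC-only 16x16 forward transform sketch: only output[0] is written. */
static void sketch_fdct16x16_dc(const int16_t *input, sketch_tran_low_t *output,
                                int stride) {
  int sum = 0; /* wide accumulator: 256 * 255 = 65280 exceeds INT16_MAX */
  int r, c;
  for (r = 0; r < 16; ++r) {
    for (c = 0; c < 16; ++c) sum += input[r * stride + c];
  }
  /* Assumed scaling: halving brings the worst case (65280) down to 32640. */
  output[0] = (sketch_tran_low_t)(sum >> 1);
}
```

Had the accumulator been declared as the 16-bit output type, the running sum would have wrapped long before the final scaling was applied.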
James Zern
3735def667
vpx_fdctNxN_1_sse2: reduce store size
...
only output[0] needs to be set; store_output is more involved than a
movdqa in the high bitdepth case
Change-Id: I2cbd85d7cf74688bdf47eb767934fe42e02bff67
2016-04-04 16:02:06 -07:00
James Zern
c21d437052
vpx_fdct32x32_1_msa: fix accumulator overflow
...
Change-Id: I33a5432eda3416382e1cea06b45082c0c65faa75
2016-04-02 11:04:38 -07:00
James Zern
f4cae05cd4
vpx_fdctNxN_1_c: remove unnecessary store
...
only output[0] needs to be set; the other values will be ignored in this
case.
Change-Id: I8e9692fc0d6d85700ba46f70c2e899a956023910
2016-04-01 12:21:59 -07:00
James Zern
0269df41c1
vpx_fdct32x32_1_c: fix accumulator overflow
...
tran_low_t is only 16-bits in non-high-bitdepth mode
Change-Id: Ifc06110c95e86e6d790c44250d52a538b2e9713b
2016-03-30 15:20:20 -07:00
Scott LaVarnway
67c4c8244a
VPX: loopfilter_mmx.asm using x86inc 2
...
This reverts commit 9aa083d164e0d39086aa0c83f0d1a0d0f0d1ba61.
Fixes a decoder mismatch with 32bit PIC builds.
Change-Id: I94717df662834810302fe3594b38c53084a4e284
2016-03-08 04:24:47 -08:00
James Zern
9aa083d164
Revert "VPX: loopfilter_mmx.asm using x86inc"
...
This reverts commit 15ecdc3970462c15fdf7185d373cb52664f40c0f.
breaks 32-bit pic builds
Change-Id: I8bb1b9471a293f05ac7423aaba0339d408931b7a
2016-03-04 18:23:45 -08:00
Scott LaVarnway
dd6729f826
VPX: Remove pmin/pmax from subpixel functions.
...
These instructions are unnecessary if the adds
are done in the correct order.
Change-Id: I4e533b8267c32e610a4b94203ad052dc9fdabd71
2016-02-27 05:47:56 -08:00
Scott LaVarnway
51beb29f52
Merge "VPX: vpx_filter_block1d16_(v8, v8_avg)"
2016-02-27 13:31:18 +00:00
James Zern
654d2163c9
x86/convolve.h: remove redundant check in FUN_CONV_2D
...
the filter will be the same in this case
Change-Id: I95159bcb05bbfb71b57da741393e80cc7ffc5cff
2016-02-25 23:31:50 -08:00
James Zern
6d8c8c6201
x86/convolve.h: replace while w/if for w < 16
...
in non-hbd configurations; any high-bitdepth changes will be done in a
follow-up
Change-Id: Ia74e30971b744c1faab68c92fdeda1a053988c77
2016-02-25 21:44:06 -08:00
Scott LaVarnway
1f736e400f
VPX: vpx_filter_block1d16_(v8, v8_avg)
...
Store result with one 16 byte store instead of
two 8 byte stores.
Change-Id: I43acbc5edfd6d6055a926f9b9605d47127400f09
2016-02-25 06:15:24 -08:00
James Zern
b3ceb629ba
x86/convolve.h: change filter[] || chains to |
...
Change-Id: I661f64390f232826857b259e7a67e77f5a3a91ad
2016-02-24 19:47:43 -08:00
Scott LaVarnway
06d0e2fe6c
BUG FIX: vpx_filter_block1d(8,4)_(v8, v8_avg)
...
Change-Id: Ic7ea79988ed0864e7ddbfeb312516bcf77eaaac1
2016-02-23 12:23:41 -08:00
Scott LaVarnway
15ecdc3970
VPX: loopfilter_mmx.asm using x86inc
...
Change-Id: Idcf29281d617b275e3ca50f77e6d00c60992a36d
2016-02-18 15:34:58 -08:00