408 Commits

Author SHA1 Message Date
Yaowu Xu
d1f0f4cc63 Merge "Clarify integer value ranges" 2016-05-18 23:55:05 +00:00
James Zern
146ccd304f Merge "Code clean of sub_pixel_variance4xh" 2016-05-18 23:18:35 +00:00
Johann Koenig
36b610d8c1 Merge "neon hadamard 8x8" 2016-05-18 20:11:16 +00:00
Yaowu Xu
a564b18d7f Clarify integer value ranges
This commit clarifies integer value range for vairables used in
several variance functions, also change to use proper type
conversion to reflect the value ranges.

Change-Id: Ic3234b83a912ce1ad12d1b254f3378763e15cc5c
2016-05-18 10:25:12 -07:00
Scott LaVarnway
2468163e07 Code clean of sub_pixel_variance4xh
Replace MMX with SSE2.

Change-Id: Ia8fcba755952804e347d7d7736f57d1f90c988a0
2016-05-18 04:24:41 -07:00
James Zern
297c752106 vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
Change-Id: I103be7eee36492f8619144ce8325bc916d4975c7
(cherry picked from commit 2184692c07c15290b424d538ae942d5f60eb7df8)
2016-05-18 05:39:49 +00:00
Johann
9b54e812f7 neon hadamard 8x8
Runs about 30% faster than the C

BUG=webm:1021

Change-Id: I6809d6d84c3077ab619c53298296950e976bdaba
2016-05-16 11:58:02 -07:00
Yaowu Xu
c1e4f5a80d Merge "Change to use correct check for halfpel" 2016-05-13 01:27:47 +00:00
Jingning Han
aad8c94fb7 Merge "Fix highbd masked variance function declaration" into nextgenv2 2016-05-12 01:29:38 +00:00
Jingning Han
5538c2cb00 Fix highbd masked variance function declaration
Fix the variable type mismatch between highbd_calc_masked_var_t and
the actual function definiton. This clears the related compiler
warnings in highbd with ext-inter experiment.

Change-Id: I0423318b16c867ed84700084ad21ca6e42edb321
2016-05-11 15:52:58 -07:00
Linfeng Zhang
2f55beb355 Merge "remove mmx variance functions" 2016-05-11 22:21:23 +00:00
Yaowu Xu
17fae3ad0a Change to use correct check for halfpel
In motion estimation stage for subpel motion, subpel variance is
computed use bilinear interpolation. The motion vector precision
used is at 1/8 pel and three bits are used to represent the x and y
subpel offsets. Based on this, the half pel check should be against
4, not 8.

Change-Id: I1f56fa1fa3f2f5e19a20d27983efe628557f170e
2016-05-11 13:52:59 -07:00
Linfeng Zhang
d0ffae825d remove mmx variance functions
there are sse2 equivalents which is a reasonable modern baseline
Removed mmx variance functions:
vpx_get_mb_ss_mmx()
vpx_get8x8var_mmx()
vpx_get4x4var_mmx()
vpx_variance4x4_mmx()
vpx_variance8x8_mmx()
vpx_mse16x16_mmx()
vpx_variance16x16_mmx()
vpx_variance16x8_mmx()
vpx_variance8x16_mmx()

Change-Id: Iffaf85344c6676a3dd337c0645a2dd5deb2f86a1
2016-05-11 12:39:42 -07:00
Linfeng Zhang
d0e687bf8c remove mmx sad functions
there are sse2 equivalents which is a reasonable modern baseline

Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba
2016-05-11 10:50:04 -07:00
Yue Chen
372e12b959 Merge "Add single motion search for OBMC predictor" into nextgenv2 2016-05-11 17:20:32 +00:00
Yue Chen
370f203a40 Add single motion search for OBMC predictor
Weighted single motion search is implemented for obmc predictor.
When NEWMV mode is used, to determine the MV for the current block,
we run weighted motion search to compare the weighted prediction
with (source - weighted prediction using neighbors' MVs), in which
the distortion is the actual prediction error of obmc prediction.

Coding gain: 0.404/0.425/0.366 for lowres/midres/hdres
Speed impact: +14% encoding time
              (obmc w/o mv search 13%-> obmc w/ mv search 27%)

Change-Id: Id7ad3fc6ba295b23d9c53c8a16a4ac1677ad835c
2016-05-10 18:27:45 -07:00
Yaowu Xu
9e41323f13 Merge "Make type conversions explicit" into nextgenv2 2016-05-10 14:33:05 +00:00
Jim Bankoski
da33728f48 vpx_dsp: Rename postproc.c add_noise.
Change-Id: I4906d1b79a2951e659995202b9fa97e2ea5cfba0
2016-05-10 06:52:58 -07:00
Scott LaVarnway
c2c5297595 Merge "VPX: refactor vpx_idct16x16_1_add_sse2()" 2016-05-09 22:15:17 +00:00
James Bankoski
7cced7b3ea Merge "libvpx: vpx_add_plane_noise make c match assembly" 2016-05-09 20:17:38 +00:00
Johann Koenig
9e5811f485 Merge changes Id13b97f4,I1d342725
* changes:
  The subfunctions are only defined for sse2
  Unlike non-hbd variance, opt2 is never used
2016-05-09 18:38:59 +00:00
Scott LaVarnway
1490342be5 VPX: refactor vpx_idct16x16_1_add_sse2()
Change-Id: I431ea0d9abe764d110a1ba32a8cb15e2fdac8805
2016-05-09 09:50:00 -07:00
Yaowu Xu
98c59c98ba Make type conversions explicit
This eliminates MSVC compiler warnings.

Change-Id: Id6ace2586ed7c6248366905b133448fe8ecbd53d
2016-05-07 20:33:40 +00:00
Jim Bankoski
7a91d21d69 libvpx: vpx_add_plane_noise make c match assembly
This change makes the c match the assembly and removes the todo's
associated with getting this to work.

Change-Id: Ie32e9ebb584a9d60399662d8bcb71b74fbd19d1e
2016-05-07 12:47:49 -07:00
Johann
7e4c306981 Use canonical avg_pred functions
Change-Id: Ibe0cc388226622561d2b4a00e5bdc1016a3c4a94
2016-05-06 19:06:03 -07:00
Johann
b23bd2360f The subfunctions are only defined for sse2
See highbd_subpel_variance_impl_sse2.asm

Change-Id: Id13b97f4f6d189ed71cdc6d52b3c4ea63dc1da05
2016-05-06 18:58:49 -07:00
Johann
a761197fbd Unlike non-hbd variance, opt2 is never used
Change-Id: I1d342725df332c4efc6006d9e3dcb7372c41f448
2016-05-06 18:38:04 -07:00
Geza Lore
a0e1c23277 Add SSE2 versions of 128x128 vpx_sad*
Encoder speedup with all experiments enabled approx 15%.

Change-Id: Ib3c771d8da00989ddc9112b71b48ce7c5594e91a
2016-05-06 14:18:00 +01:00
James Zern
5e679848e8 Merge changes from topic 'missing-proto'
* changes:
  vp9_frame_scale_ssse3.c: make 2 functions static
  vp9_pickmode.c: make function static
  vp9_noise_estimate.c: make function static
  vp9_aq_360.c: add missing include
  vp9_idct_intrin_sse2: add missing vp9_rtcd.h include
  vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
2016-05-06 02:25:29 +00:00
James Zern
2184692c07 vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
Change-Id: I103be7eee36492f8619144ce8325bc916d4975c7
2016-05-04 15:06:44 -07:00
James Zern
4f69f741d8 vpx_dsp_common.h: remove circular include
Change-Id: I05b3028a38bbc062c388eeb95e99a3fee583ae6b
2016-05-04 14:54:53 -07:00
James Zern
aa68a8301e vpx_dsp_common.h: fix include guard
Change-Id: I1ad41c096ec86870f9aecab6fdbc3af03e972afc
2016-05-04 14:54:32 -07:00
James Bankoski
89f905e5e5 Merge "libvpx: add a unit test for plane_add_noise." 2016-05-04 13:09:05 +00:00
Jim Bankoski
34d5aff747 libvpx: add a unit test for plane_add_noise.
In so doing this fixes a couple of bugs:

vpx_plane_add_noise.c needed to subtract a clamp instead of add.
And the assembly (mmx sse) had assumptions that parameters were
continuous in memory which was not true.

Change-Id: I76f2c43cf54bfc838eb2edf8a443eaaa7565d7b5
2016-05-03 16:23:06 -07:00
James Bankoski
e755a283dd Merge "Move vpx_add_plane from codec to vpx_dsp and dedup." 2016-05-03 14:11:57 +00:00
Jim Bankoski
fce3cee8dd Move vpx_add_plane from codec to vpx_dsp and dedup.
Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7
2016-05-02 12:17:39 -07:00
Yi Luo
299c5fc202 HBD hybrid transform 8x8 SSE4.1 optimization
- Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
- Update bit-exact unit test against current C version.
- HBD encoder speed improves ~3.8%.

Change-Id: Ie13925ba11214eef2b5326814940638507bf68ec
2016-04-29 17:04:52 -07:00
Alex Converse
f6d13e7be5 Merge "bitreader: remove an unsigned overflow." 2016-04-28 16:26:37 +00:00
Alex Converse
a68b24fdee Tweak casts on vpx_sub_pixel_variance to avoid implicit overflow.
Change-Id: I481eb271b082fa3497b0283f37d9b4d1f6de270c
2016-04-27 16:37:18 -07:00
Alex Converse
36a0c7ffe3 bitreader: remove an unsigned overflow.
bits_left is in the range [0, 64 (= BD_VALUE_SIZE)] , so the narrowing
conversion should be safe.

Change-Id: I943fcd359eaad76249ee1e1fb03a2ac16945d2fd
2016-04-27 15:31:35 -07:00
Alex Converse
6c4007be1c Be explicit about overflow in vpx_variance16x16_sse2.
The product always fits in uint32_t, but the operands don't.

An optimizing compiler should generate the wraparound code.
(Verified with clang).

Change-Id: I25eb64df99152992bc898b8ccbb01d55c8d16e3c
2016-04-27 15:22:17 -07:00
Alex Converse
ccb894ce73 Remove casts on < 16x16 variance.
These blocks will never overflow since max sum is +/-255*w*h.

Change-Id: Ia2c630339fd9cfb411b56b6040ff402095f12a2e
2016-04-27 15:21:58 -07:00
Johann
2f5840de3e vpx_minmax_8x8_neon and test
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156

Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6
2016-04-21 21:40:25 -07:00
Johann
8c02a36953 hadamard 8x8 test
The order of the output structure is not currently important.

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1021

Change-Id: Ibc0006d569675db6c5060c4529f5d9e73f2e96a6
2016-04-21 22:28:21 +00:00
Yaowu Xu
ed04e82a04 Merge branch 'master' into nextgenv2
Conflicts:
	vp10/common/scan.c
	vp9/common/vp9_pred_common.c
	vp9/decoder/vp9_decoder.c

Change-Id: Id559d98ea676da15d60ed464ddb6c48d3eed1111
2016-04-18 15:15:05 -07:00
Johann Koenig
c59c5cbeff Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm" 2016-04-15 16:00:51 +00:00
Martin Storsjo
d8b3e29ee7 Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm
This was never hooked up for the 32x32_34 case as the neon_asm version
in 3f7c12da, when the intrinsics version was added.

Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320
2016-04-15 10:25:47 +03:00
Johann
26faa3ec7a Apply 'const' to data not pointer
Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298
2016-04-14 14:47:16 -07:00
Yi Luo
6db95602e4 Merge "Optimized HBD block subtraction for all block sizes" into nextgenv2 2016-04-12 21:22:32 +00:00
Debargha Mukherjee
ec1365a0c9 Merge "Extend variance based partitioning to 128x128 superblocks" into nextgenv2 2016-04-12 19:42:35 +00:00