17183 Commits

Author SHA1 Message Date
Pascal Massimino
b90dbc3da9 Merge "Fix highbd obmc_variance unit test" into nextgenv2 2016-07-14 18:59:38 +00:00
Sarah Parker
a6aed6e4b3 Add new_quant quantization in rdopt for 4x4 blocks and intra
Originally the uniform quantization function was not being
replaced with the new_quant version in rdopt when new_quant
is turned on. This fixes the bug.

Change-Id: I593793bb909e1e1a6f89544eeca6783fe0576f25
2016-07-14 11:25:13 -07:00
Jingning Han
a387b19619 Fix highbd obmc_variance unit test
Fix the compiling errors in highbd obmc_variance unit test.

Change-Id: Id1bdfd50aeaff996e54067d5e9b369a5fd2d87a8
2016-07-14 10:12:03 -07:00
Hui Su
0c68db43ea Merge "Refactor codes about motion search" into nextgenv2 2016-07-14 00:13:47 +00:00
Jingning Han
75b3224a42 Merge "Fix highbd inter prediction filter sse4 overwriting issue" into nextgenv2 2016-07-13 21:35:29 +00:00
Jingning Han
edbbce8e61 Fix highbd inter prediction filter sse4 overwriting issue
Properly handle the case where the height is an integer multiple
of 4.

Change-Id: I11ac188c13f78db20902e2e333c60ce76ce837c5
2016-07-13 12:51:02 -07:00
Yue Chen
f2b34c3ad8 Merge "Optimize and cleanup obmc predictor and rd search." into nextgenv2 2016-07-13 18:40:49 +00:00
hui su
581636d767 Refactor codes about motion search
1. Add "best_mv" in MACROBLOCK to store the best motion vector
during motion search, so that we don't need to pass its pointer
to various motion search functions.

2. Declare some functions as static when possible.

3. Fix some indents.

Change-Id: I0778146c0866cbc55e245988c59222577ea8260e
2016-07-13 10:12:37 -07:00
Geza Lore
4c4f04ac11 Optimize and cleanup obmc predictor and rd search.
Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up
computing the obmc predictor. Clean up calc_target_weighted_pred.

Encoder speedup: 1.3%
Decoder speedup: 6.5%

Change-Id: I0c774fe53d22399e92a10d1daf3af0010d88d2c5
2016-07-13 16:54:20 +00:00
Geza Lore
ebc2d34cd9 Add SSE4.1 vpx_obmc_variance* implementations and cosmetics
Speedup for these functions: 4x
Also include some cosmetic changes to SAD functions

Change-Id: I344c32c795492507ae08742f52d035a13f583799
2016-07-12 21:04:46 -07:00
Pascal Massimino
6de0e97d97 Merge "Clean up FunctionEquivalenceTest." into nextgenv2 2016-07-13 03:09:52 +00:00
Geza Lore
a3f7ddc347 Clean up FunctionEquivalenceTest.
Remove use of tuple in favor of struct.

Change-Id: If3b1aa5c2fc3cfe1446fff7a8fd270f2ca85fedf
2016-07-12 17:01:19 -07:00
Aamir Anis
15aaa601bd Merge "Fix for loop filter selection procedure" into nextgenv2 2016-07-12 23:56:37 +00:00
Aamir Anis
8575709f97 Fix for loop filter selection procedure
Fixed the best error reported by loop filter selection. This value is
used during loop restoration to pick the best mode. The baseline remains
unchanged. Change in BDRate for the loop restoration experiment:
-0.628 -> -0.625 for lowres,
-1.262 -> -1.283 for highres.

Change-Id: I69ef1608bc232b250ac46f59e31fdbed1a999dcd
2016-07-12 15:01:07 -07:00
Yi Luo
fde48c980a Merge "HBD convolution filtering (10/12 taps) SSE4.1 optimization" into nextgenv2 2016-07-12 19:28:48 +00:00
Yi Luo
8cacca73bf HBD convolution filtering (10/12 taps) SSE4.1 optimization
- For experiment EXT_INTERP under high bit depth.
- Add unit test to verify bit-exactness.
- Speed performance improvement:
  On Xeon E5-2680, park_joy_1080p_12.y4m, 50 frames, encoding time
  drops from 6682503 ms to 5390270 ms.

Change-Id: Iea4debf5414f3accf1eb5672abeab56a0539ac77
2016-07-12 10:13:30 -07:00
Geza Lore
c804e0df05 Cleanup obmc_sad function prototypes.
Name 'wsrc', 'mask' and 'pre' explicitly, rather than
using 'b', 'm' and 'a'.

Change-Id: Iaee6d1ac1211b0b05b47cf98b50570089b12d600
2016-07-12 13:23:33 +01:00
James Zern
b8a28fbb3a Merge changes from topic 'missing-proto' into nextgenv2
* changes:
  vp10/encoder/rdopt.c: make a function static
  vp10/encoder/rd.c: make a function static
  vp10_convolve_ssse3.c: make some functions static
  vp10/encoder/bitstream.[hc]: correct a prototype
  vp10/common/idct.h: add some missing prototypes
  highbd_quantize_intrin_sse2.c: add missing rtcd include
  vp10: add some missing includes
2016-07-12 02:39:24 +00:00
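The 'missing-proto' changes listed above rely on two standard remedies for -Wmissing-prototypes: give a file-local function internal linkage with static, or make sure a prototype is visible before an external function is defined. A minimal sketch of both follows; the function names are hypothetical and do not come from the vp10 sources.

```c
/* In a real project this prototype would live in a header that the .c file
 * includes; it is inlined here to keep the sketch self-contained. */
int saturating_add_u8(int a, int b);

/* File-local helper (hypothetical): static gives it internal linkage, so no
 * prototype is required and -Wmissing-prototypes stays quiet. */
static int clamp_to_byte(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

/* External function (hypothetical): its definition is preceded by the
 * prototype above, which also satisfies -Wmissing-prototypes. */
int saturating_add_u8(int a, int b) {
  return clamp_to_byte(a + b);
}
```

Dropping a project prefix such as vp10_ from a function that has become static is a naming convention only; it does not affect the warning.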
Yue Chen
4ff6d13771 Merge "Cosmetics for vp10/common/vp10_rtcd_defs.pl" into nextgenv2 2016-07-12 01:21:33 +00:00
James Zern
849e990779 vp10/encoder/rdopt.c: make a function static
+ remove vp10_ prefix

quiets a -Wmissing-prototypes warning

BUG=b/29584271

Change-Id: I8821c38009b90296280f9b14233e73c92076e81f
2016-07-11 16:52:11 -07:00
James Zern
0baa08336a vp10/encoder/rd.c: make a function static
+ remove vp10_ prefix

quiets a -Wmissing-prototypes warning

BUG=b/29584271

Change-Id: I6b5d71f8120a6d1fee4c782beb4c6d6eef980f65
2016-07-11 16:52:10 -07:00
James Zern
08bd57ef0d vp10_convolve_ssse3.c: make some functions static
quiets -Wmissing-prototypes warnings

BUG=b/29584271

Change-Id: I4d2eb7f4b45d7b829421976641b3212bcf29e7dd
2016-07-11 16:52:10 -07:00
James Zern
3c127f2e36 vp10/encoder/bitstream.[hc]: correct a prototype
quiets a -Wmissing-prototypes warning

BUG=b/29584271

Change-Id: I91aba2a75dccd6752bdf91837564c2aa45817c09
2016-07-11 16:52:09 -07:00
James Zern
9bf5a1ab46 vp10/common/idct.h: add some missing prototypes
quiets the warning of the same name

BUG=b/29584271

Change-Id: I220cd58e1060f77e3910472fed1b167add3a08f8
2016-07-11 16:52:08 -07:00
James Zern
e046f5efef highbd_quantize_intrin_sse2.c: add missing rtcd include
quiets -Wmissing-prototypes warnings

BUG=b/29584271

Change-Id: Iff5214df0d1781810afbfc20bfaf664f109e2f29
2016-07-11 16:52:08 -07:00
James Zern
bc4341fd94 vp10: add some missing includes
quiets some -Wmissing-prototypes warnings

BUG=b/29584271

Change-Id: I9174728459fcabb6d9ac0028ae58029e52c0da92
2016-07-11 16:52:07 -07:00
Yue Chen
68e19472c1 Cosmetics for vp10/common/vp10_rtcd_defs.pl
Change-Id: Iaf8c6f0b1e340f0406df2871a3dc2ded19b7009a
2016-07-11 23:41:30 +00:00
Debargha Mukherjee
5041ff4921 Merge "Add a few branch hints to vp10_optimize_b." into nextgenv2 2016-07-11 22:30:33 +00:00
Debargha Mukherjee
6770c7361e Merge "Optimize and cleanup supertx predictor." into nextgenv2 2016-07-11 22:30:16 +00:00
Debargha Mukherjee
6bbadfb303 Merge "Improve vpx_blend_* functions." into nextgenv2 2016-07-11 19:30:04 +00:00
Geza Lore
cd489264e1 Optimize and cleanup supertx predictor.
Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up
computing the supertx predictor.

Decoder speedup of up to 4% has been observed.

Change-Id: I255a5ba4cc24f78dc905d25b6e2f7fbafac13253
2016-07-11 18:14:21 +00:00
Geza Lore
bfa59b4a5f Improve vpx_blend_* functions.
- Made source buffers pointers to const.
- Renamed vpx_blend_mask6b to vpx_blend_a64_mask. This is more
  indicative that the function does alpha blending. The 6, or 6b
  suffix was misleading, as the max mask value (64) does not fit into
  6 bits.
- Added VPX_BLEND_* macros to use when needing to blend scalars.
- Use VPX_BLEND_A256 in combine_interintra to be more explicit about
  the operation being done.
- Added versions of vpx_blend_a64_* which take 1D horizontal/vertical
  masks directly and apply them to all rows/columns
  (vpx_blend_a64_hmask and vpx_blend_a64_vmask). The SSE4.1 optimized
  horizontal version now falls back on the 2D version. This can be
  improved upon if it shows up high enough in a profile.
- All vpx_blend_a64_* functions now support block sizes down to 1x1
  (i.e. a single pixel). This is for usage convenience. The SSE4.1
  optimized versions fall back on the C implementation if
  w <= 2 or h <= 2. This can again be improved if it becomes hot code.

Change-Id: I13ab3835146ffafe3e1d74d8e9cf64a5abe4144d
2016-07-11 19:05:17 +01:00
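The blending described in the commit above can be illustrated with a scalar sketch. It assumes mask values in the range 0..64 (which is why 6 bits are not enough) and round-to-nearest before the shift. The names blend_a64, blend_a64_hmask, BLEND_A64_MAX_ALPHA and BLEND_A64_ROUND_BITS are hypothetical stand-ins, not the actual vpx_blend_a64_* signatures or VPX_BLEND_* macros.

```c
#include <assert.h>
#include <stdint.h>

#define BLEND_A64_MAX_ALPHA 64  /* hypothetical name: mask values span 0..64 */
#define BLEND_A64_ROUND_BITS 6  /* log2(64) */

/* Blend two 8-bit pixels: alpha = 64 selects v0 entirely, alpha = 0 selects
 * v1 entirely. */
static uint8_t blend_a64(int alpha, uint8_t v0, uint8_t v1) {
  assert(alpha >= 0 && alpha <= BLEND_A64_MAX_ALPHA);
  const int sum = alpha * v0 + (BLEND_A64_MAX_ALPHA - alpha) * v1;
  /* Add half of the divisor so the shift rounds to nearest. */
  return (uint8_t)((sum + (1 << (BLEND_A64_ROUND_BITS - 1))) >>
                   BLEND_A64_ROUND_BITS);
}

/* Apply a 1D horizontal mask (one alpha per column) to every row.
 * Works for any w >= 1 and h >= 1, including a single pixel. */
static void blend_a64_hmask(uint8_t *dst, int dst_stride,
                            const uint8_t *src0, int src0_stride,
                            const uint8_t *src1, int src1_stride,
                            const uint8_t *mask, int h, int w) {
  for (int i = 0; i < h; ++i) {
    for (int j = 0; j < w; ++j) {
      dst[i * dst_stride + j] = blend_a64(mask[j], src0[i * src0_stride + j],
                                          src1[i * src1_stride + j]);
    }
  }
}
```

A vertical-mask variant would index the mask by row (mask[i]) instead of by column.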
Pascal Massimino
e5fb2d4e93 remove ROUNDZ_* macros in favor of just ROUND_* ones
Change-Id: I263088be8d71018deb9cc6a9d2c66307770b824d
2016-07-11 06:27:41 -07:00
Geza Lore
1178f71d99 Merge "Fix unused warning without ext-interp" into nextgenv2 2016-07-11 11:29:17 +00:00
Debargha Mukherjee
5d28183fcf Merge "Refactor and clean up on blend_mask6" into nextgenv2 2016-07-09 06:50:32 +00:00
Yue Chen
5b25323c25 Merge "Fix assertion failures in mips+msa setting" into nextgenv2 2016-07-09 01:07:27 +00:00
Yue Chen
4ab19eac62 Fix assertion failures in mips+msa setting
Directly call the C functions. When EXT_TX is enabled, hybrid
transforms other than combinations of DCT/ADST have not been implemented
for MSA, which causes assertion failures in the switch statements in
vp10_fhtnxn_msa() and vp10_ihtnxn_nxn_add_msa().

BUG=webm:1239

Change-Id: I2379a07e5406f9489edcd2f3205682f679c9b091
2016-07-08 17:13:52 -07:00
Jingning Han
9c4b041a80 Merge "Properly reset rate and distortion value for zero pred residual case" into nextgenv2 2016-07-08 22:21:27 +00:00
Debargha Mukherjee
72ef6d7704 Refactor and clean up on blend_mask6
Change-Id: Ie9188471e7dc07ab9c95b22f258b1662e895c533
2016-07-08 15:02:57 -07:00
Jingning Han
985dd03ff7 Merge "Integrate ext-interp into dual filter framework" into nextgenv2 2016-07-08 18:25:14 +00:00
Geza Lore
0b9b3d8643 Add a few branch hints to vp10_optimize_b.
vp10_optimize_b now takes between 40% and 60% of the TOTAL runtime
of the encoder, depending on bit-rate. It also contains 2/3 to 3/4
of the mispredicted branch instructions in the whole program.

Adding a few branch hints makes vp10_optimize_b around 2-5% faster
(depending on bit-rate) when compiled with gcc/clang.

Change-Id: I1572733e18b4166bc10591b958c5018a9561fa2b
2016-07-08 19:20:35 +01:00
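Branch hints of the kind mentioned in the commit above are usually written with the gcc/clang builtin __builtin_expect. The sketch below shows the general pattern only; the LIKELY/UNLIKELY macro names and the count_nonzero helper are hypothetical and are not taken from vp10_optimize_b.

```c
#include <assert.h>
#include <stddef.h>

/* __builtin_expect is a gcc/clang builtin; other compilers get a no-op. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x) (x)
#define UNLIKELY(x) (x)
#endif

/* Hypothetical helper: counts nonzero coefficients, telling the compiler
 * that a nonzero value is the rare case so the zero path is laid out as the
 * fall-through. The hint changes code layout only, never the result. */
static int count_nonzero(const int *coeffs, size_t n) {
  assert(coeffs != NULL || n == 0);
  int count = 0;
  for (size_t i = 0; i < n; ++i) {
    if (UNLIKELY(coeffs[i] != 0)) ++count;
  }
  return count;
}
```

Whether a hint helps depends on how skewed the branch really is, so it is worth confirming with a profile before and after.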
Sarah Parker
6c56def33e Merge "Make new_quant bin widths to be uniform" into nextgenv2 2016-07-08 17:40:55 +00:00
Jingning Han
e3a2aeb05d Integrate ext-interp into dual filter framework
The combination of the two experiments improves the compression
performance gains:

lowres 2.5%
midres 2.1%

Change-Id: Id26c0a9474ce08893aa1d946365c7ff850fab57a
2016-07-08 16:38:59 +00:00
Jingning Han
1bf039ccd5 Properly reset rate and distortion value for zero pred residual case
When the prediction residuals are all zero, reset the coeff rate
cost and the distortion value to be zero. This change doesn't affect
lowres set significantly, but improves several clips in the midres
set, like sintel_480p and mobisode2_480p, by a few percent. The
average performance for midres set is improved by 0.2%.

Change-Id: Idd5ebf2652e556a1b1c569fe3c48dacef3f11c32
2016-07-08 09:09:18 -07:00
Geza Lore
bb5059ff9b Fix unused warning without ext-interp
Change-Id: Ibb63c492eb8278d115262b8fc3cbc761c406b107
2016-07-08 15:48:02 +01:00
Jingning Han
7c393d097f Merge "Fix ioc in trellis optimization with hbd" into nextgenv2 2016-07-08 01:11:17 +00:00
Sarah Parker
88faa2b348 Make new_quant bin widths to be uniform
Change-Id: Iceeca8ecbc43919b43189352a307479d666d1dad
2016-07-07 16:22:32 -07:00
Debargha Mukherjee
c6f9b7f4ee Merge "RD costing fix in loop-restoration expt" into nextgenv2 2016-07-07 22:47:58 +00:00
Debargha Mukherjee
51957b4162 Merge "Remove redundant code in new_quant" into nextgenv2 2016-07-07 21:55:38 +00:00
Debargha Mukherjee
fc3ce72674 Merge "Clean up build_wedge_inter_predictor_from_buf" into nextgenv2 2016-07-07 20:05:12 +00:00