17183 Commits

Author SHA1 Message Date
Marco
bedf1c3af6 vp9: Skip computation of best_sad for newmv, unless needed.
For non-rd pickmode:
best_pred_sad, computed for NEWMV-last, is only used for
skipping golden non-zero modes. Add condition to avoid this
computation if not used (i.e., if golden nonzero modes are not used).

And remove code for computing best_pred_sad for NEWMV-golden,
since that sad is not used.

No change in behavior; small speed gain (~1%) for svc encodes.

Change-Id: Ic2cbdef6c4e9a233a57c0db0eeac8ad5fcead366
2016-05-31 10:29:00 -07:00
Zoe Liu
e89ca180c2 Make the bi-predictive frame group interval adjustable
This is for the bidir-pred experiment. Previously the length of the
bi-predictive frame group interval was fixed at 2, i.e. one
bi-predictive frame may be inserted every other frame. This patch
makes the length adjustable, i.e. any positive number may be
specified, but the use of the backward ref will be turned off if the
bi-predictive frame group interval is larger than the golden frame
group.

Further, an additional rate factor level has been added: INTER_LOW,
which applies to LAST_BIPRED_UPDATE frames that are not used as
references.

Change-Id: I5514d34a64dd486bbb5756c2d0612946f598a789
2016-05-28 16:46:45 -07:00
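The interval rule in the commit message above can be sketched in C as follows. This is only an illustration under stated assumptions: the function name `use_backward_ref` and both parameter names are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper (not the patch's code): decide whether the backward
 * reference may be used for a given bi-predictive frame group interval. */
int use_backward_ref(int bipred_group_interval, int golden_group_interval) {
  /* The message says any positive interval may be specified. */
  assert(bipred_group_interval > 0);
  /* The backward ref is turned off when the bi-predictive interval is
   * larger than the golden frame group. */
  return bipred_group_interval <= golden_group_interval;
}
```

With a golden frame group of 16, an interval of 2 or 16 keeps the backward ref enabled in this sketch, while 17 disables it.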
Hui Su
6fd7f7dd3e Merge "ext-intra: refactor mode info. writing and reading" into nextgenv2 2016-05-28 04:34:59 +00:00
Tom Finegan
f80d8011a0 Merge "vpx_ports/mem_ops.h: cast the lhs of bitwise shifts of 24." 2016-05-27 18:52:05 +00:00
James Zern
f6ac6cf5bd Merge "acm_random,Rand9Signed: correct cast" 2016-05-27 18:32:06 +00:00
Linfeng Zhang
2ab7b9a6c9 Merge "Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10." 2016-05-27 17:51:35 +00:00
James Zern
13d48c4267 acm_random,Rand9Signed: correct cast
convert the random value to int16 before subtracting 256 from it; quiets
a ubsan (sanitize=integer) warning

BUG=webm:1225

Change-Id: Ibc2c5a21f30e112bd6c180f7d6a033327c38d0df
2016-05-27 10:33:56 -07:00
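A minimal C sketch of the cast order described above, assuming the generator yields a 9-bit value in [0, 511]. The function name `rand9_signed_from` and the parameter `raw` are stand-ins for illustration; the real helper lives in the C++ test utility and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: map a 9-bit value in [0, 511] to a signed value in [-256, 255].
 * `raw` stands in for the generator output (an assumption). */
int16_t rand9_signed_from(uint32_t raw) {
  assert(raw < 512);
  /* Convert to int16_t first, so the subtraction is done in signed
   * arithmetic rather than wrapping around in unsigned arithmetic. */
  return (int16_t)((int16_t)raw - 256);
}
```

Subtracting 256 from the unsigned value directly would wrap for inputs below 256, which is presumably what the integer sanitizer reported.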
Linfeng Zhang
af7fb17c09 Upgrade fwht4x4_mmx() to fwht4x4_sse2() for vp9 and vp10.
Function level timing test shows about 27% time saving on
a Xeon E5-2680 v2 desktop.

Rename vp9_dct_sse2.c to vp9_dct_intrin_sse2.c for vp9 and
rename dct_sse2.c to dct_intrin_sse2.c for vp10 to avoid
duplicate basenames.

Actually vp9_fwht4x4_mmx/sse2() and vp10_fwht4x4_mmx/sse2()
are identical. TODO: They should be unified later if there is
no intention to keep a duplicate.

Change-Id: I3e537b7bbd9ba417c606cd7c68c4dbbfa583f77d
2016-05-27 09:51:16 -07:00
Tom Finegan
f1de622617 vpx_ports/mem_ops.h: cast the lhs of bitwise shifts of 24.
C does not allow for shifting into the sign bit of a signed
integer, and the two instances here become signed ints via
promotion. Explicitly cast them to unsigned MEM_VALUE_T to
avoid the problem.

BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=614648

Change-Id: I51165361a8c6cbb5c378cf7e4e0f4b80b3ad9a6e
2016-05-27 09:23:11 -07:00
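The promotion issue described above can be illustrated with a short C sketch. It assumes a big-endian 32-bit read and uses `uint32_t` in place of the `unsigned MEM_VALUE_T` named in the message; the function name `read_be32` is hypothetical and not taken from mem_ops.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: assemble a big-endian 32-bit value from four bytes. */
uint32_t read_be32(const unsigned char *mem) {
  uint32_t val;
  assert(mem != NULL);
  /* Without the cast, mem[0] is promoted to (signed) int, and shifting a
   * byte value >= 0x80 left by 24 would shift into the sign bit. */
  val = (uint32_t)mem[0] << 24;
  val |= (uint32_t)mem[1] << 16;
  val |= (uint32_t)mem[2] << 8;
  val |= (uint32_t)mem[3];
  return val;
}
```

Casting the left-hand operand before the shift keeps the whole expression in unsigned arithmetic, where shifting into the top bit is well defined.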
Linfeng Zhang
0ba9b299e9 Merge "Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2" 2016-05-27 15:47:28 +00:00
James Zern
5d237f0986 vp10_inv_txfm2d_test: fix memory leak
input_, ref_input_ and output_ were being allocated with new[] followed
by vpx_memalign, remove the former

Change-Id: Ia16d0f9b9317042a24445095ad3c284f4e7bb481
2016-05-26 20:04:59 -07:00
Hui Su
e717ece4ab Merge "Add a quick path in build_intra_predictors" into nextgenv2 2016-05-26 22:12:53 +00:00
hui su
e5f47d4334 ext-intra: refactor mode info. writing and reading
No performance changes.

Change-Id: I001068330ea217a993aee9b79d7ffead0d23100e
2016-05-26 14:56:40 -07:00
Linfeng Zhang
4b5e462d08 Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2
Followed the code style of other lpf functions.
These 2 functions put 2 rows of data in a single xmm register,
so they have similar but not identical filter operations,
and cannot share the same macros.

Change-Id: I3bab55a5d1a1232926ac8fd1f03251acc38302bc
2016-05-26 14:55:18 -07:00
Yaowu Xu
ff6accf936 Merge "Convert to unsigned int before left shift" 2016-05-26 21:29:50 +00:00
Hui Su
88eaf5d6ce Merge "Skip unnecessary calculations in ext-intra" into nextgenv2 2016-05-26 18:03:02 +00:00
Yaowu Xu
301e345273 Convert to unsigned int before left shift
This is to fix overflow when 128 is left shifted by 24.

Change-Id: Ibb5f6813536d985afa003a9848c0c3dd358955a7
2016-05-26 08:46:01 -07:00
Scott LaVarnway
9d24fe60f1 Merge "Code clean of sub_pixel_variance4xh -- 2" 2016-05-26 13:20:24 +00:00
hui su
bad6e169bf Add a quick path in build_intra_predictors
For the cases where no reference data is available.

Change-Id: Ibf1ac9b7073acc2c7fc44da893f3d608dc74bc1e
2016-05-25 15:21:57 -07:00
Yi Luo
469d002f4e Merge "Integrate HBD inverse HT flip types sse4.1 optimization" into nextgenv2 2016-05-25 21:35:14 +00:00
Marco
75d551783d vp9: Add datarate test for 1 pass VBR mode.
Existing tests are only for CBR mode.

Change-Id: Ie3b2cd46236457748e2650901d1a347a730f38af
2016-05-25 14:20:30 -07:00
Alex Converse
19e0b406c9 Refactor probability savings search.
- Avoid excessive copying

- Don't bother searching if no update can possibly offer savings

- Simplify the interface

- Remove the confusing vp9_cost_upd256 macro

Change-Id: Id9d9676a361fd1203b27e930cd29c23b2813ce59
2016-05-25 13:00:09 -07:00
Yaowu Xu
e5b7f14ea7 Merge "Fix comments in build_intra_predictors_high()" 2016-05-25 19:58:18 +00:00
Yi Luo
bfe4c0ae07 Integrate HBD inverse HT flip types sse4.1 optimization
- tx_size: 4x4, 8x8, 16x16.
- tx_type: FLIPADST_DCT, DCT_FLIPADST, FLIPADST_FLIPADST,
  ADST_FLIPADST, FLIPADST_ADST.
- Encoder speed improvement:
  park_joy_1080p_12: ~11%, crowd_run_1080p_12: ~7%.
- Add unit test cases for bit-exact against C.

Change-Id: Ia69d069031fa76c4625e845bfbfe7e6f6ed6e841
2016-05-25 12:32:10 -07:00
Yaowu Xu
ba8651d474 Fix comments in build_intra_predictors_high()
1. Removed TODOs, no longer applicable to finalized vp9 profiles.
2. Added explanation on assumed values for highbitdepth profiles.

Change-Id: I59e0bebaaab900cc611ed284daa5fa0bdedb8097
2016-05-25 12:18:35 -07:00
James Zern
008f27e70a Merge "add vp10 ActiveMap/ActiveMapRefreshTest" into nextgenv2 2016-05-25 19:05:02 +00:00
Yaowu Xu
75b6cfe1c5 Prevent read to invalid RefBuffer
This commit adds a check to validate RefBuffer before reading into the
data structure, to prevent invalid reads.

BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=614701

Change-Id: Ie111e95bd18e88fa19d8b25e097cdf52b7139cb6
2016-05-25 09:28:36 -07:00
James Zern
7acd0a59ca Merge "remove vp9_diamond_search_sad_avx.c" 2016-05-25 00:08:38 +00:00
Yi Luo
cb507ff29a Merge "HBD inverse HT 8x8 and 16x16 sse4.1 optimization" into nextgenv2 2016-05-24 22:06:07 +00:00
Zoe Liu
cf5083d4cd Added an experiment "bidir_pred" for backward prediction
Major parts have been implemented as follows:
(1) Added BRF_UPDATE, LASTNRF_UPDATE, and NRF_UPDATE in firstpass.c;
(2) Added the handling for the scenario of
"cpi->common.show_existing_frame == 1" at the encoder;
(3) Added a new reference frame of BWDREF_FRAME;
(4) Have bwd-ref work with upsampled references.

Note that when the "ext_refs" experiment is turned on, this experiment
is currently turned off automatically.

RD performance in Overall PSNR has been improved, compared against the
VP10 baseline:

lowres: Avg -3.312; BDRate -3.154
derflr: Avg -1.927; BDRate -1.176
midres: Avg -2.149; BDRate -2.001
hdres : Avg -0.567; BDRate -0.588

Change-Id: I4c06ff51cc20194bffbd4d2346e57ba3dcf6b62c
2016-05-24 13:55:57 -07:00
Brion Vibber
35d7e17b03 Move git version extras out of iOS shared framework bundle version
Apple's version format specification is strictly checked on app
store submission, even for embedded frameworks:

http://apple.co/1WgelY1

    The build version number should be a string comprised of
    three non-negative, period-separated integers with the
    first integer being greater than zero. The string should
    only contain numeric (0-9) and period (.) characters.

So that's room for "1.5.0" but not for "1.5.0-906-g656f9c4".

The full version returned from 'version.sh --bare' is now
embedded under a 'VPXFullVersion' custom key in the Info.plist,
so it can still be extracted from the resulting framework.

Change-Id: If34a58d02e407379d1f1859fda533ef7f983170b
2016-05-24 13:08:25 -07:00
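The version rule quoted in the commit message above can be checked with a small C sketch. The function name `is_valid_bundle_version` is hypothetical; this only illustrates the quoted rule and is not code from the patch or from Apple's tooling.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical checker for the quoted rule: three period-separated
 * non-negative integers, the first greater than zero, and only the
 * characters 0-9 and '.'. */
int is_valid_bundle_version(const char *s) {
  int fields = 0;
  assert(s != NULL);
  for (;;) {
    int digits = 0;
    int nonzero = 0;
    while (isdigit((unsigned char)*s)) {
      if (*s != '0') nonzero = 1;
      ++digits;
      ++s;
    }
    if (digits == 0) return 0;             /* empty field */
    if (fields == 0 && !nonzero) return 0; /* first integer must be > 0 */
    ++fields;
    if (*s != '.') break;
    ++s;
  }
  /* Exactly three fields and no trailing characters. */
  return fields == 3 && *s == '\0';
}
```

Under this sketch "1.5.0" passes, while "1.5.0-906-g656f9c4" fails because of the trailing git suffix, which matches the motivation given in the message.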
Yi Luo
28cdee448d HBD inverse HT 8x8 and 16x16 sse4.1 optimization
- Covers tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
- Encoding speed improves ~27% on crowd_run_1080p_12.
- Merge 4x4, 8x8, 16x16 unit tests in one test file.

Change-Id: I058ef5254d068a9523a826480c78ebbdd231824c
2016-05-24 12:55:30 -07:00
James Zern
be12fefa4b remove vp9_diamond_search_sad_avx.c
vp9_diamond_search_sad_avx was disabled in:
057c1c4 disable vp9_diamond_search_sad_avx

this removes a missing prototype warning as the prototype is no longer
included in vp9_rtcd.h. the file can be restored if someone gets around
to fixing the issue.

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1168

Change-Id: Ia9fda4b81c53dc5fba7c31d780d761f886940b52
2016-05-24 12:02:22 -07:00
Debargha Mukherjee
89f5b6a0b6 Merge "Remove redundant memcpy from wedge predictor." into nextgenv2 2016-05-24 17:33:54 +00:00
Debargha Mukherjee
416da08102 Merge "Pick up bit-depth from the right place" into nextgenv2 2016-05-24 17:33:22 +00:00
Scott LaVarnway
a4f3751be5 Code clean of sub_pixel_variance4xh -- 2
Replace MMX with SSE2.

Change-Id: Id8482d2589131f9427e7f36bc64413f058caf31f
2016-05-24 04:44:05 -07:00
Geza Lore
2935b4db0e Remove redundant memcpy from wedge predictor.
Removing redundant calls to memcpy from
build_wedge_inter_predictor_from_buf yields a net 4% encoder speedup
with ext-inter only. The output is identical.

Change-Id: If97d4e323a5c8aca90c84a25a72085e006b05446
2016-05-24 11:31:18 +01:00
Geza Lore
62b6331753 Pick up bit-depth from the right place
Change-Id: Icbdb036d7927b77b84bd78e8348ec8b5be88df08
2016-05-24 11:08:23 +01:00
hui su
4a741a5d5c Skip unnecessary calculations in ext-intra
Around 5% speedup.

Change-Id: I1c552e4e58fbf5637c0b5a97dd2cc4f83a1ca201
2016-05-23 17:24:19 -07:00
Zoe Liu
a63147ae77 Fix --test-decode=warn to test mismatch
This patch always compares the most recent show frames between
the encoder and the decoder to test the mismatch.

Change-Id: I68a91ad0996a598231450debfd616e24992419b5
2016-05-23 17:01:53 -07:00
Debargha Mukherjee
fb65f9b54b Merge "Add optimized vpx_blend_mask6" into nextgenv2 2016-05-23 23:43:52 +00:00
Geza Lore
a661bc87c4 Add optimized vpx_blend_mask6
This is to replace vp10/common/reconinter.c:build_masked_compound.
Functionality is equivalent, but the interface is slightly more
generic.

Total encoder speedup with ext-inter: ~7.5%

Change-Id: Iee18b83ae324ffc9c7f7dc16d4b2b06adb4d4305
2016-05-23 16:28:58 +01:00
KO Myung-Hun
72e332f767 configure: Add -mstackrealign flags to CFLAGS on OS/2
Much of the code requires the -mstackrealign flag. Although
-mstackrealign has already been added to the CFLAGS of some modules,
SIGSEGV still occurs in other modules.

The best way may be to find the causes and fix them. However, we
cannot know those causes until a SIGSEGV actually occurs. In addition,
if SIGSEGV occurs in other programs, it will be fatal.

So adding the -mstackrealign flag to CFLAGS unconditionally is
reasonable.

Change-Id: I999ef597a6afe97f5e7cc7bffaa866537c3eedd2
2016-05-22 18:11:59 +09:00
KO Myung-Hun
14e8adea3c vpx: Add OS/2-specific threading codes
Also corrects the type of the thread function for the new threading
code.

Change-Id: Ic6dc9f530698800d1cfe2da327848e8f8b62e31f
2016-05-22 18:11:50 +09:00
Debargha Mukherjee
fa5022978d Merge "Wedge refactoring to handle signs better" into nextgenv2 2016-05-20 23:19:39 +00:00
Jingning Han
8c9f6c5531 Merge "Clear redundant condition check from vp10_ext_tile_test.cc" into nextgenv2 2016-05-20 22:10:41 +00:00
jackychen
6f397b8a5b vp9: Remove a redundant condition in sub-pixel filter choosing.
Change-Id: I5cbb0f452ec9622437482b3a9496ead1253acfe0
2016-05-20 14:38:45 -07:00
Debargha Mukherjee
e5de2ad632 Wedge refactoring to handle signs better
Mostly refactoring. Handles signs better though results are
more or less neutral.

Change-Id: If499537c8f8da4f34d104ebfda072eb4c85fb12f
2016-05-20 14:12:52 -07:00
Yaowu Xu
93921097a6 Merge "Properly handle the filter extension in highbd setting" into nextgenv2 2016-05-20 20:00:51 +00:00
Yaowu Xu
17611f2f73 Merge "Fix build when vp8 is disabled" into nextgenv2 2016-05-20 19:59:48 +00:00