6746 Commits

Author SHA1 Message Date
Marco
f82280820a vp9. Use same source_sad threshold for all speeds.
Only affects real-time mode.

Change-Id: Iba836f110c4da936f5173cc0f54424d5b6121bff
2017-02-15 11:28:26 -08:00
Marco
716c1d5ff5 Vp9: Speed 8 aq-mode=3: Reduce computation in estimating bits per mb.
vp9_compute_qdelta_by_rate has almost 2% overhead in profiling on Nexus 6.
Reduce the calling of that function in speed 8 by estimating the delta-q.
Both rtc and rtc_derf show little/no change in avg psnr/ssim.
Encoding speed is 2~3% faster on Nexus 6.

Change-Id: If25933715783f31104a18a5092ea347b1221b5f5
2017-02-15 09:28:16 -08:00
paulwilkins
cfc79a357a Disconnect ARF breakout from frame boost.
This small change replaces the frame boost check in the arf group
length break out clause with a test against a prediction decay value.

The boost value is in fact partly dependent on the decay value but
this change means that the per frame boost calculation can be adjusted
without influencing the group length calculation.

The value chosen gives a close match on all the test sets with the previous
code (on average), but it was noted that a lower threshold was slightly better
for 1080P and up and a slightly higher value was better for small image sizes.

Change-Id: I4d5b9f67d5b17b0d99ea3f796d3d6202fd61ee0c
2017-02-15 10:46:14 +00:00
paulwilkins
b89ba05ab4 Remove unnecessary factor.
Removed unnecessary scaling factor to simplify.

Change-Id: I3fc9c5975a2597e72f1324e09dd586dea1facfa7
2017-02-15 10:45:43 +00:00
paulwilkins
76550dfdc0 Bug in scale_sse_threshold()
The function scale_sse_threshold() returns a threshold scaled
if necessary for use with 10 and 12 bit from an 8 bit baseline.

SSE error values would be expected to rise for the 10 and 12
bit cases where there are more bits of precision.

Hence the threshold used for the test should also be scaled up.

Change-Id: I4009c98b6eecd1bf64c3c38aaa56598e0136b03d
2017-02-15 10:45:03 +00:00
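
A minimal sketch of the scaling idea described in the commit above, not the
libvpx code itself. It assumes the threshold is compared against a sum of
squared errors, so each extra bit of sample precision multiplies the error by
roughly 4. The function signature and the exact factors are illustrative
assumptions and are not taken from the commit.

```python
def scale_sse_threshold(thresh_8bit: int, bit_depth: int) -> int:
    """Scale an 8-bit SSE threshold up for 10- or 12-bit content (illustrative)."""
    if bit_depth not in (8, 10, 12):
        raise ValueError(f"unsupported bit depth: {bit_depth}")
    extra_bits = bit_depth - 8
    # Assumption: each extra bit doubles the sample range, so a squared
    # error grows by a factor of 4 per extra bit.
    return thresh_8bit << (2 * extra_bits)
```

Under this assumption an 8-bit threshold is left unchanged, while the 10-bit
and 12-bit thresholds are larger, which matches the direction of the fix
described in the commit message.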
paulwilkins
945ccfee59 Additional first pass stats.
Added counts that split the intra coded blocks into low and high variance.

Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14
2017-02-15 10:44:37 +00:00
Paul Wilkins
7635ee0f37 Merge "Aggressive VBR method." 2017-02-15 10:37:02 +00:00
Johann Koenig
61927ba4ac Merge "vp9 fdct higbd neon: connect existing highbd calls" 2017-02-15 01:33:00 +00:00
Yunqing Wang
f2c1aea118 Merge "Row based multi-threading of encoding stage" 2017-02-15 00:54:10 +00:00
Ranjit Kumar Tulabandu
71061e9332 Row based multi-threading of encoding stage
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.

Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
average speedup of the encoding pass (second pass in the 2-pass
encoding) is 7% while using 2 threads, 16% while using 4 threads,
85% while using 8 threads, and 116% while using 16 threads.

Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-15 00:49:34 +00:00
Johann
3e7aa8fda9 vp9 fdct higbd neon: connect existing highbd calls
Change-Id: Ia8f822bd6e70b3911bc433a5a750bfb6f9a3a75c
2017-02-14 22:11:49 +00:00
Johann Koenig
9c2bb7f342 Merge "quantize_fp highbd neon: use tran_low_t for coeff" 2017-02-14 21:28:23 +00:00
clang-format
4b402746ca apply clang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14 12:45:52 -08:00
Johann
2b24aa87d9 quantize_fp highbd neon: use tran_low_t for coeff
Change-Id: I90fd815f15884490ad138f35df575a00d31e8c95
2017-02-14 10:26:10 -08:00
Yunqing Wang
318ca07657 The bitstream bit match test in multi-threaded encoder
When the new-mt mode is enabled (namely, allowing row-based
multi-threading in the encoder), several speed features that adaptively
adjust encoding parameters during encoding would cause a mismatch
between the single-thread encoded bitstream and the multi-thread encoded
bitstream. This patch provides a set_control API to disable these
features, so that a bit-matching bitstream is obtained in the unit
test.

Change-Id: Ie9868bafdfe196296d1dd29e0dca517f6a9a4d60
2017-02-13 13:02:26 -08:00
James Zern
3c4ea94210 cosmetics,vp9_ratectrl: apply clang-format
broken since:
c3f095c8b Merge "Fix to avoid abrupt relaxation of max qindex in recode path"
5f21aba4b Fix to avoid abrupt relaxation of max qindex in recode path

the original change pre-dated the addition of .clang-format

Change-Id: If5e399d9a805bcad9147360b13b36fbc8c560a7c
2017-02-13 11:29:39 -08:00
paulwilkins
ce7b38459a Aggressive VBR method.
VBR method that allows a wider Q range for the first normal frame
in each ARF group and then centers the min - max range for the rest of
the arf group on the chosen Q value for that first frame.

This allows for quite rapid adjustment of the active Q range even if the
initial estimate is poor.

In some cases where the ARF frames themselves are tending to
undershoot but the normal frames are overshooting this can still give
net undershoot. This can be corrected by allowing a larger Q delta for
arf frames but it is usually a sign that the allocation to the arfs was too
high.

Change-Id: Icec87758925d8f7aeb2dca29aac0ff9496237469
2017-02-13 15:42:11 +00:00
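
The centering step described in the commit above can be sketched roughly as
follows. This is an illustration only: the function name, the half_width
parameter and the 0..255 index limits are hypothetical, and the real rate
control logic is considerably more involved.

```python
def center_q_range(chosen_q: int, half_width: int,
                   q_min: int = 0, q_max: int = 255) -> tuple[int, int]:
    """Center a min/max Q range on the Q chosen for the first normal frame.

    All names and the default limits are assumptions made for illustration.
    """
    # Clamp both ends so the range never leaves the legal Q interval.
    lo = max(q_min, chosen_q - half_width)
    hi = min(q_max, chosen_q + half_width)
    return lo, hi
```

In this sketch the first normal frame in an ARF group would be given a wide
range, and the remaining frames in the group would then use the narrower
range returned here.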
Marco
22dcfa80aa vp9: Non-rd mode: use simple block_yrd for 8 bit high bitdepth builds
Temporary fix until optimization work for block_yrd is completed.
This essentially reverts back to the state before the change:
https://chromium-review.googlesource.com/c/433821/

Compression loss is about 5-6% on RTC set.
Speed-up (from using this simple/model-based block_yrd) over the low
bitdepth builds (which uses more complex block_yrd) is ~5% on 720p.

Change-Id: Ie0af9eb0d111e5595f587870c44f08317403b8d8
2017-02-10 10:15:35 -08:00
Paul Wilkins
c3f095c8b3 Merge "Fix to avoid abrupt relaxation of max qindex in recode path" 2017-02-09 17:17:55 +00:00
Paul Wilkins
82b88a7fd0 Merge "Fix for max qindex calculation of a gf interval" 2017-02-09 17:17:44 +00:00
Johann Koenig
b73f99745b Merge "block_error_fp highbd sse2: use tran_low_t for coeff" 2017-02-07 23:26:10 +00:00
Marco Paniconi
71f5314993 Merge "vp9: Denoiser speed-up: increase partition and ac skip thresholds." 2017-02-07 22:25:00 +00:00
Yunqing Wang
b106abe570 Merge "Row based multi-threading of ARNR filtering stage" 2017-02-07 19:55:41 +00:00
Marco Paniconi
259e835b1b Merge "vp9: Adjust rate_err threshold for setting active_worst factor." 2017-02-07 19:25:47 +00:00
Marco
1a5482d4d8 vp9: Denoiser speed-up: increase partition and ac skip thresholds.
Add factor to increase variance partition and ac skip thresholds,
under certain conditions (noise level and sum_diff), to increase
denoiser speed.

Change-Id: I7671140ef3598bf5f114a72623d68792bcd7b77b
2017-02-07 10:33:13 -08:00
Marco
3c2f076ad0 vp9: Adjust rate_err threshold for setting active_worst factor.
Only affects 1 pass vbr.
Small improvement on ytlive set.

Change-Id: I09a7456fe658fbea82ece1035cf683bd8bd8bd14
2017-02-07 09:38:16 -08:00
Johann
537949a9df block_error_fp highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Id2ed3ebaaaa6a4b68628c23e08b64ea5f1341761
2017-02-07 15:03:28 +00:00
Ranjit Kumar Tulabandu
91f01a2060 Row based multi-threading of ARNR filtering stage
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
2017-02-07 14:03:19 +05:30
Johann Koenig
85f3a82355 Merge "highbd x86: consolidate tran_low_t conversions" 2017-02-07 02:49:58 +00:00
Jerome Jiang
aa327a1ed4 vp9: speed 8: Tune threshold of ac skip and partitioning.
Threshold for partitioning only affects VGA and lower res.
0.07% quality regression is observed in borg tests on rtc_derf
and 0.2% regression on rtc.
5.6% speed up for low res and 6.8% for VGA on Nexus 6.

Change-Id: If85a2919b48c991de66059c90f32ed06980452be
2017-02-06 16:27:53 -08:00
Johann
641fda79bb highbd x86: consolidate tran_low_t conversions
Create new helper files specifically for converting tran_low_t types.

Change-Id: I7c4c458ef910f3b3d10a3cfbf9df4de7682fd905
2017-02-06 10:43:26 -08:00
Yunqing Wang
dbc5090b5e Merge "Changes to facilitate multi-threading of encoding stage" 2017-02-04 01:02:29 +00:00
Ranjit Kumar Tulabandu
12ec948490 Changes to facilitate multi-threading of encoding stage
Modified the encoding stage to have row level entry points with relevant
initializations and to access the token information at row level

Change-Id: Ife10e55a7c1a420ee906d711caf75002688d9e39
2017-02-02 14:47:13 +05:30
Yunqing Wang
770c6663d6 Merge "Changes to facilitate row based multi-threading of ARNR filtering" 2017-02-01 22:04:15 +00:00
Ranjit Kumar Tulabandu
359a6796da Changes to facilitate row based multi-threading of ARNR filtering
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
2017-02-01 13:03:52 -08:00
Johann
bfd62cdaff vp9_rdopt: declare 'c' closer to use
Clears up static clang analysis warning regarding a dead store. Only
declare 'c' when it will be used.

Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3
2017-02-01 19:58:24 +00:00
Jingning Han
969957f9f2 Fix real-time compression regression in hbd mode
This commit resolves the compression performance regression in
real-time encoding setting when high bit-depth mode is enabled.

The current solution temporarily disables the SIMD implementations
of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.

The commit makes the coding results bit-wise identical between
regular coding pipeline and high bit-depth at profile 0.

BUG=webm:1365

Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2017-01-31 23:17:09 -08:00
Marco
d47f257484 vp9: Modify bsize condition for using model_rd_large for speed 7.
In non-rd pickmode: Allow speed 7 to also use larger block size in
model_rd. Small change in behavior for speed 7.

Change-Id: I8c5523e424308e8f0bc71b3f6324dec42a464cc8
2017-01-30 11:16:51 -08:00
Yunqing Wang
106c620a23 Merge "Disable multi-threading in first pass for SVC encoding" 2017-01-28 19:29:01 +00:00
Marco
d94d0ed12f vp9: Fix to pick_filter_level for highbitdepth build.
Change-Id: I53b3fa8bfc0a0717eb1b730c29f2b70060b1b1b7
2017-01-27 10:44:07 -08:00
Ranjit Kumar Tulabandu
6985a0f516 Disable multi-threading in first pass for SVC encoding
BUG=webm:1366

Change-Id: I204ef8496884ba7c4debe64f23f50d298b4090c3
2017-01-27 09:41:53 -08:00
Marco Paniconi
ad1aad69fb Merge "vp9: Modify bsize condition for using model_rd_large." 2017-01-27 15:15:36 +00:00
Marco
b16c77cdc4 vp9: Modify bsize condition for using model_rd_large.
In non-rd pickmode: small change in behavior for speed 6 and 7.
Remove condition on HIGHBITDEPTH flag.

Change-Id: I360a13fcc313d72612fe9b918162ef4bb278cdea
2017-01-26 22:45:27 -08:00
Marco
db99840bf6 vp9: Fixes for usage of skin_map for high bit depth.
Also avoid noise_estimation and source_sad if use_highbitdepth is set.

Change-Id: I5fea396b8f8380ea377045d99ba22a52b92daa46
2017-01-26 19:57:59 -08:00
Jerome Jiang
eacc3b4ccf Merge "vp9: Refactor copy partitioning to reduce duplication." 2017-01-26 17:46:11 +00:00
Jerome Jiang
fe4791b0d5 vp9: Refactor copy partitioning to reduce duplication.
Change-Id: Ia1b3c118adec5eccbd2900c8e4b9ea6b1e3e9b7c
2017-01-25 17:33:04 -08:00
Yunqing Wang
4d50dc5ab5 Merge "Remove marco MVC in mcomp.c" 2017-01-26 00:32:55 +00:00
Hui Su
37cd112b0f Merge "Fix an overflow warning in optimize_b()" 2017-01-25 22:49:30 +00:00
Marco
3b2d08a93b vp9-denoiser: Modify skip denoising condition for small blocks.
Skip denoising for blocks < 16x16, and for block = 16x16
skip denoising for low noise levels and width > 480 for now.
Allow for some speed-up in denoiser.

Change-Id: Ib46cefe4741962d145fa08775defea3a9c928567
2017-01-25 11:48:09 -08:00
hui su
519b2e48a8 Fix an overflow warning in optimize_b()
BUG=webm:1361

Change-Id: Ib840bf3b39f7b3c8c017d3488a83434e9a0f45f5
2017-01-25 10:54:39 -08:00