Commit Graph

17464 Commits

Author SHA1 Message Date
Johann
e67660cf37 fdct32x32 neon implementation
Almost 3x faster in constrained loop testing. Over 10x faster in HBD
builds.

BUG=webm:1424

Change-Id: I2b7f8453e1d4ada63cde729d8115d684c4a71ff9
2017-06-22 06:40:17 -07:00
paulwilkins
efe1982e63 Fix int overflow in rate control for high bit rates.
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.

In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k clip the encode time for speeds 0 and 1 was
reduced by >25%.

Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
2017-06-22 10:34:21 +01:00
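A minimal sketch of the kind of misplaced cast that efe1982e63 describes, not the actual rate control code. The function names, units, and operand values are hypothetical; unsigned 32-bit operands are used so that the wrap-around is well defined in C.

```c
#include <stdint.h>

/* Hypothetical example: scale a bitrate by a time window.
 * The cast is applied to the result, so the multiplication is done in
 * 32 bits and wraps before it is widened. */
static int64_t scaled_bits_misplaced_cast(uint32_t kbps, uint32_t window_ms) {
  return (int64_t)(kbps * window_ms);
}

/* The cast is applied to an operand, so the multiplication itself is
 * carried out in 64 bits and cannot wrap for these inputs. */
static int64_t scaled_bits_correct_cast(uint32_t kbps, uint32_t window_ms) {
  return (int64_t)kbps * window_ms;
}
```

With a low rate both functions agree; the results only diverge once the product exceeds 32 bits, which fits the commit's observation that high data rates were the ones affected.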
Marco
d7515b1187 vp9: Add high source sad to content state.
Use it to limit NEWMV early exit in nonrd pickmode

Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
2017-06-21 20:57:17 -07:00
Marco Paniconi
33a9394eb1 Merge "vp9: Adjustments for aq-mode and pickmode for speed >= 8." 2017-06-22 03:27:47 +00:00
James Zern
dd88bd87db datarate_test: rename thread -> Thread in test name
this is consistent with other threaded tests and ensures gtest_filters
meant to operate on these pick them up

Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b
2017-06-21 20:05:31 -07:00
James Zern
828a1fa6de Merge "vp8_dx_iface: clear -Wclobbered warnings" 2017-06-22 02:01:11 +00:00
James Zern
7c0788b07f onyxd.h: add vp8dx_references_buffer prototype
quiets -Wmissing-prototypes

Change-Id: I6bee535f3fb67e54a390266d787a5a92127aeadc
2017-06-21 19:00:15 -07:00
James Zern
44418c659f vp[89],vpx_dsp: add missing includes
quiets -Wmissing-prototypes

Change-Id: I841cfc019d592f2bc6b3fec5818051a31f4c53b5
2017-06-21 19:00:15 -07:00
James Zern
b24ed95f44 vp8,encodeframe.h: correct prototypes
+ add missing include
quiets -Wmissing-prototypes

Change-Id: I64af0368ba3d7f1d4de22a5887b631bb2cf15b8a
2017-06-21 19:00:15 -07:00
James Zern
b093d998fc vp8: add temporal_filter.h
quiets -Wmissing-prototypes

Change-Id: Iffa77467720affe030de5335e9335232b9e70af1
2017-06-21 19:00:15 -07:00
James Zern
eb8226b903 add picklpf.h
quiets -Wmissing-prototypes

Change-Id: Ic24164aa1f86fe99a493a633d64606e6f44ecdc1
2017-06-21 19:00:14 -07:00
James Zern
864bc77e7a add ethreading.h
quiets -Wmissing-prototypes in encodeframe.c

Change-Id: Ic216d0bdd6130eac44f2183639a715b2f1088ebe
2017-06-21 19:00:14 -07:00
James Zern
d5d6a609d0 vp8,bitstream.h: add missing prototypes
quiets -Wmissing-prototypes

Change-Id: I835a80eddca2b16280780e18558c321df3272c43
2017-06-21 19:00:14 -07:00
James Zern
1d86383512 vp8: remove vp8_fast_quantize_b_mmx
and vp8_fast_quantize_b_impl_mmx; this was never enabled in rtcd
an sse2 version exists so there isn't much reason to keep a mmx
implementation around.

Change-Id: I8b3ee7f46ba194ffa0d0a6225a0f299f2a4dea90
2017-06-21 19:00:14 -07:00
James Zern
18335f193d vp8,loopfilter_filters: make some functions static
quiets -Wmissing-prototypes

Change-Id: Ie5b00537f64a05e68a38dc558463691523988994
2017-06-21 19:00:14 -07:00
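The -Wmissing-prototypes fixes in this series follow two patterns: making file-local functions static, or making a prototype visible before the definition. A minimal sketch of both, assuming hypothetical function names and a hypothetical [0, 63] range that are not taken from the libvpx sources:

```c
/* Pattern 1: a helper used only in this file. Declaring it static gives
 * it internal linkage, so no separate prototype is expected. */
static int clamp_filter_level(int level) {
  if (level < 0) return 0;
  if (level > 63) return 63;
  return level;
}

/* Pattern 2: a function shared across files. The prototype would normally
 * live in a header that this file includes; it is written inline here so
 * the sketch stays self-contained. */
int shared_filter_level(int level);

int shared_filter_level(int level) { return clamp_filter_level(level); }
```

Without the static keyword in the first case, or the prior declaration in the second, gcc and clang report a missing-prototype warning when -Wmissing-prototypes is enabled.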
James Zern
07f847873b vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
quiets -Wmissing-prototypes

Change-Id: I72d899c2d8de1ddc52d90ac081f2629374b3a6e9
2017-06-21 19:00:14 -07:00
James Zern
9a329b5285 vp9_encodeframe: make scale_part_thresh_sumdiff static
quiets -Wmissing-prototypes

Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
2017-06-21 19:00:14 -07:00
James Zern
3f296533f6 vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
quiets -Wmissing-prototypes

Change-Id: Ib2d4f294f1982739bb2ac98155e789e040d309a1
2017-06-21 19:00:04 -07:00
James Zern
9e1d2de67c highbd_quantize_fp_32x32: normalize abs_qcoeff type
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c

Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
2017-06-21 18:56:10 -07:00
Marco
21afafa31a vp9: Put skin detection usage around cpi flag.
Skin detection usage in choose_partitioning should be
around the cpi->use_skin_detection.

Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
2017-06-21 17:32:56 -07:00
Marco
8cf6f78fce vp9: Adjustments for aq-mode and pickmode for speed >= 8.
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.

No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
2017-06-21 16:01:24 -07:00
James Zern
fbba31e241 vp8_dx_iface: clear -Wclobbered warnings
with gcc 6.x

Change-Id: Ib2070421603a6777892d4ea01f4b0921696f38b3
2017-06-21 15:09:58 -07:00
Johann Koenig
355432b0d2 Merge "dct tests: align InvAccuracyCheck buffers" 2017-06-21 21:16:23 +00:00
Linfeng Zhang
466b667ff3 Clean vpx_idct16x16_256_add_sse2()
Remove macro IDCT16 which is redundant with idct16_8col().

Change-Id: I783c5f4fda038a22d5ee5c2b22e8c2cdfb38432c
2017-06-21 13:47:15 -07:00
Linfeng Zhang
42522ce0b7 Update vpx_idct{8x8,16x16,32x32}_1_add_sse2()
Change-Id: I365f8e53d9ccd028cef0f561d4de9e5916278609
2017-06-21 13:47:05 -07:00
Linfeng Zhang
2b43a1ee18 Clean 32x32 full idct sse2 and ssse3 code
vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster
than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are
code relocations, no new code.

Change-Id: I5dac0e98cc411a4ce05660406921118986638d19
2017-06-21 13:46:49 -07:00
Hui Su
96ec8a425b Merge "VP9 level targeting: properly handle max_gf_interval" 2017-06-21 20:38:45 +00:00
Johann
1c48915233 dct tests: align InvAccuracyCheck buffers
'in' is used for the reference fdct. 'coeff' is input to the idct being
tested and 'dst[16]' is output

Fixes a segfault on unaligned memory access on x86.

Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962
2017-06-21 11:47:00 -07:00
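A minimal sketch of the alignment requirement behind 1c48915233, assuming a 16-byte requirement as for aligned SSE2 loads and stores. C11 _Alignas is a stand-in here; the buffer name and size are hypothetical and the test code itself may declare its buffers differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p is a multiple of alignment (in bytes). */
static int is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical coefficient buffer forced onto a 16-byte boundary so that
 * aligned SIMD loads from its start do not fault. */
static _Alignas(16) int16_t coeff_buf[16 * 16];

static const int16_t *aligned_coeffs(void) { return coeff_buf; }
```

A plain int16_t array is only guaranteed 2-byte alignment, which is why a buffer handed to an aligned-load code path needs an explicit alignment request.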
James Zern
0aa3677d9d fix build, rm ref to vpx_idct8x8_64_add_ssse3
this was deleted in:
98967645a Remove vpx_idct8x8_64_add_ssse3()

but this was merged in:
9e03eedf6 Merge changes Ib26dd515,Ie60dabc3

after:
a92991133 Merge "dct tests: run all possible sizes in one test"

which added a new reference

Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25
2017-06-20 19:46:45 -07:00
Linfeng Zhang
9e03eedf62 Merge changes Ib26dd515,Ie60dabc3
* changes:
  Clean 8x8 idct x86 optimization
  Remove vpx_idct8x8_64_add_ssse3()
2017-06-21 00:38:25 +00:00
hui su
d96ed96c0f VP9 level targeting: properly handle max_gf_interval
Don't override max_gf_interval if it's not specified. It will
be assigned with a default value in vp9_rc_set_gf_interval_range().

BUG=b/62803416

Change-Id: Ide46ce00279ed076865fc54ce98c55a994f0c798
2017-06-20 16:29:04 -07:00
Marco
492d52b9cc vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1000 and
buf-initial to 600. Parameters affect target size of key frame.

Change-Id: I2be6bc2927f5fa74e19e1efa3fb574d23a503300
2017-06-20 12:22:03 -07:00
Marco
ae3a173352 vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1500 and
buf-initial to 700. Parameters affect target size of key frame.

Change-Id: I01e238378b63eeef28dfc2178baadffcd3cc7561
2017-06-20 09:08:13 -07:00
Johann Koenig
a929911339 Merge "dct tests: run all possible sizes in one test" 2017-06-20 15:04:25 +00:00
Marco Paniconi
737aa5c9e4 Merge "vp9: SVC: Rework the usage of base_mv for SVC." 2017-06-20 03:08:32 +00:00
Marco
b55240057f vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Adjust some parameters in sample encoder: vpx_temporal_svc_encoder.
Parameters adjusted to set lower QP for initial key frame,
and allow for larger target size on subsequent key frames.

Change-Id: I092ad968e5b51b9f495dadb6ee96e810663c910e
2017-06-19 18:29:39 -07:00
Marco Paniconi
782aacc3d3 Merge "vp9: Speed >= 8: Adjust resolution threshold for subpel." 2017-06-19 23:45:35 +00:00
Johann
4ebb9a36f1 dct tests: run all possible sizes in one test
Modify fdct4x4_test.cc to support all size combinations. This does not
add any new tests and in fact fails a few. There were minimal changes
made to the tests so it's not entirely surprising that some of the
larger 12 bit transforms are failing since it was initially only used
for 4x4.

In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and
dct32x32_test.cc will be evaluated and moved to dct_test.cc.

BUG=webm:1424

Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f
2017-06-19 15:39:35 -07:00
James Zern
ed56ddfef8 Merge "libs.mk: retry partial testdata download" 2017-06-19 22:15:06 +00:00
James Zern
ada640a508 libs.mk: retry partial testdata download
attempt retry on transient failures uncaught by --retry

Change-Id: I7cd8846ff88daf0f521af9ee182e30bfd79f51f3
2017-06-19 14:40:39 -07:00
Marco
ff7fb4b280 vp9: Speed >= 8: Adjust resolution threshold for subpel.
Get some quality gain on RTC metrics (~7%), with
~5-8% speed slowdown.

Change-Id: I0d02942a77074424ee0326b6e110ddff09f2df5e
2017-06-19 13:58:08 -07:00
Jerome Jiang
bf41a982b4 Merge "Enable 8x8 skin detection for vp8." 2017-06-19 16:42:14 +00:00
Marco
112cd95507 vp9: SVC: Rework the usage of base_mv for SVC.
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.

Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.

Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
2017-06-18 22:35:46 -07:00
James Zern
31d6ba9a54 tiny_ssim: make some functions static
quiets -Wmissing-prototypes

Change-Id: If2e77c921b2fba456ed8d94119773e360d90b878
2017-06-16 15:36:32 -07:00
James Zern
038522e4a0 Merge "configure: test for -Wparentheses-equality" 2017-06-16 20:07:52 +00:00
Jerome Jiang
a36017e007 Enable 8x8 skin detection for vp8.
If 2 or more 8x8 blocks are identified as skin, the macroblock will be
labeled as skin.

Change-Id: I596542c81a2df9e96270cab39d920bbfeb02bc6e
2017-06-15 20:53:03 -07:00
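The labeling rule in a36017e007 can be sketched as follows. This assumes a 16x16 macroblock split into four 8x8 blocks, each with a precomputed skin flag; the function and macro names are hypothetical and the per-block detection itself is not shown.

```c
#include <stddef.h>

#define BLOCKS_8X8_PER_MB 4 /* a 16x16 macroblock holds four 8x8 blocks */

/* Returns 1 when at least two of the four 8x8 blocks are flagged as skin,
 * 0 otherwise. Any nonzero flag counts as skin. */
static int macroblock_is_skin(const int skin_8x8[BLOCKS_8X8_PER_MB]) {
  int count = 0;
  for (size_t i = 0; i < BLOCKS_8X8_PER_MB; ++i) {
    count += (skin_8x8[i] != 0);
  }
  return count >= 2;
}
```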
James Zern
27c2185954 configure: test for -Wparentheses-equality
Change-Id: I36de79c58461907deaea70d6131da9119bc0bc69
2017-06-15 16:05:20 -07:00
Linfeng Zhang
c7e4917e97 Clean 8x8 idct x86 optimization
Create load_buffer_8x8() and write_buffer_8x8().

Change-Id: Ib26dd515d734a5402971c91de336ab481b213fdf
2017-06-15 14:30:00 -07:00
Linfeng Zhang
98967645a1 Remove vpx_idct8x8_64_add_ssse3()
It's almost identical to vpx_idct8x8_64_add_sse2(), apart from a small
difference in instruction order.

Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f
2017-06-15 14:09:33 -07:00
Urvang Joshi
a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00