241 Commits

Author SHA1 Message Date
Jingning Han
e665c8f2c9 Add mode cost to sub8x8 block mode decision in rtc coding
This commit allows the encoder to properly account for the mode
cost in sub8x8 non-RD mode decision.

Change-Id: I2951960d20e37ed08e372ee0c7044935b2b9b899
2015-02-11 14:43:02 -08:00
Jingning Han
c9725813db Merge "Account for inter prediction filter rate cost in rtc mode selection" 2015-02-11 14:42:44 -08:00
Jingning Han
41b7f76db1 Account for inter prediction filter rate cost in rtc mode selection
Add the rate cost on inter prediction filter type to the overall
rate-distortion cost in vp9_pick_mode_inter.

Change-Id: I72c34017adf5220cadb3962694ee5404469fc673
2015-02-11 12:17:29 -08:00
Jingning Han
4ce70e8847 Add ref frame rate cost to non-RD mode decision
This commit adds a heuristic rate cost of reference frame to the
non-RD mode decision. It improves the compression performance of
speed -6 by 0.31% and speed -5 by 0.69%.

Change-Id: If7f3b45519d49b2cb640bcb7316a254efc8be446
2015-02-11 11:08:10 -08:00
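The rate term added in 4ce70e8847 can be pictured with a small sketch. Everything here is an assumption made for illustration: the names rd_cost and mode_cost_with_ref, the fixed-point scaling, and the sample numbers are hypothetical and are not taken from vp9_pickmode.c.

```c
#include <stdint.h>

/* Hypothetical fixed-point RD cost: the rate is scaled by a multiplier
 * (with 8 fractional bits, rounded) and added to the distortion. */
static int64_t rd_cost(int rate_mult, int rate, int64_t dist) {
  return (((int64_t)rate * rate_mult + 128) >> 8) + dist;
}

/* Same cost, but the rate now includes an assumed per-reference-frame
 * term, so a mode that uses a costly reference frame is penalized. */
static int64_t mode_cost_with_ref(int rate_mult, int mode_rate,
                                  int ref_frame_rate, int64_t dist) {
  return rd_cost(rate_mult, mode_rate + ref_frame_rate, dist);
}
```

With a multiplier of 256, a mode rate of 10 and a distortion of 100, adding a reference frame rate of 6 raises the cost from 110 to 116, so two modes with equal distortion are now separated by the reference frame they use.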
Jingning Han
b2762a8853 Re-arrange inter mode search order in RTC coding flow
This commit makes the ZEROMV mode first in the search order to
ensure that the zero mv is always checked in the RTC coding mode.
It improves the average speed -6 compression performance by 0.3%
in both PSNR and SSIM at no visible speed change.

Change-Id: I465a7e59f4e20cd84fee3f02ced6f98036945949
2015-02-06 08:52:52 -08:00
Jingning Han
4ccfc7d517 Save an extra call for setup_pred_plane function
Reuse the yv12_mb array to fetch the buffer pointers/strides
corresponding to the current reference frame.

Change-Id: I5276b7494158b2cccef15213be2dc189e9036851
2015-02-04 09:47:14 -08:00
Jingning Han
0c6d3a03e1 Account for chroma component costs in RTC mode decision
This commit allows the encoder to account for additional chroma
plane costs in the mode decision process, if the current block
potentially contains significant color change. It improves the
visual quality at very low bit-rates.

The compression performance of dark720p is improved by 12.39% in
speed 6. For jimred at 150 kbps, the PSNR of V component (red)
increased by 0.2 dB, at the expense of about 5% increase in
encoding time. Note that for sequences where the chroma components
are fairly consistent, the encoding time increase is negligible.

On average the rtc set compression performance is improved by
1.172% in PSNR and 1.920% in SSIM.

Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
2015-02-04 09:45:14 -08:00
hkuang
be6aeadaf4 Try again to merge branch 'frame-parallel' into master branch.
In frame parallel decode, the libvpx decoder decodes several frames on all
cpus in parallel. Unless flushed, it only returns a frame when all the cpus
are busy. When flushed, it returns all the frames in the decoder. In the
current serial decode mode the libvpx decoder is idle between decode calls,
whereas in frame parallel mode it stays busy between decode calls.

Current frame parallel decode will only speed up the decoding for frame
parallel encoded videos. For non frame parallel encoded videos, frame
parallel decode is slower than serial decode due to lack of loopfilter
worker thread.

There are still some known issues that need to be addressed. For example,
decoding frame parallel videos with segmentation enabled is sometimes incorrect.

* frame-parallel:
  Add error handling for frame parallel decode and unit test for that.
  Fix a bug in frame parallel decode and add a unit test for that.
  Add two test vectors to test frame parallel decode.
  Add key frame seeking to webmdec and webm_video_source.
  Implement frame parallel decode for VP9.
  Increase the thread test range to cover 5, 6, 7, 8 threads.
  Fix a bug in adding frame parallel unit test.
  Add VP9 frame-parallel unit test.
  Manually pick "Make the api behavior conform to api spec." from master branch.
  Move vp9_dec_build_inter_predictors_* to decoder folder.
  Add segmentation map array for current and last frame segmentation.
  Include the right header for VP9 worker thread.
  Move vp9_thread.* to common.
  ctrl_get_reference does not need user_priv.
  Seperate the frame buffers from VP9 encoder/decoder structure.
  Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
 Conflicts:
       test/codec_factory.h
       test/decode_test_driver.cc
       test/decode_test_driver.h
       test/invalid_file_test.cc
       test/test-data.sha1
       test/test.mk
       test/test_vectors.cc
       vp8/vp8_dx_iface.c
       vp9/common/vp9_alloccommon.c
       vp9/common/vp9_entropymode.c
       vp9/common/vp9_loopfilter_thread.c
       vp9/common/vp9_loopfilter_thread.h
       vp9/common/vp9_mvref_common.c
       vp9/common/vp9_onyxc_int.h
       vp9/common/vp9_reconinter.c
       vp9/decoder/vp9_decodeframe.c
       vp9/decoder/vp9_decodeframe.h
       vp9/decoder/vp9_decodemv.c
       vp9/decoder/vp9_decoder.c
       vp9/decoder/vp9_decoder.h
       vp9/encoder/vp9_encoder.c
       vp9/encoder/vp9_pickmode.c
       vp9/encoder/vp9_rdopt.c
       vp9/vp9_cx_iface.c
       vp9/vp9_dx_iface.c

This reverts commit a18da9760a.

Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
2015-01-30 21:00:13 -08:00
Johann
a18da9760a Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch."
This reverts commit bde04ce503

Change-Id: I053dae04c761b04a36dc239558503905a14d2470
2015-01-23 08:42:02 -08:00
hkuang
bde04ce503 Merge branch 'frame-parallel' to enable frame parallel decode in master branch.
In frame parallel decode, the libvpx decoder decodes several frames on all
cpus in parallel. Unless flushed, it only returns a frame when all the cpus
are busy. When flushed, it returns all the frames in the decoder. In the
current serial decode mode the libvpx decoder is idle between decode calls,
whereas in frame parallel mode it stays busy between decode calls. VP9 frame
parallel decode is >30% faster than serial decode with tile parallel
threading, which makes it easier for devices to play 1080P VP9 videos.

* frame-parallel:
  Add error handling for frame parallel decode and unit test for that.
  Fix a bug in frame parallel decode and add a unit test for that.
  Add two test vectors to test frame parallel decode.
  Add key frame seeking to webmdec and webm_video_source.
  Implement frame parallel decode for VP9.
  Increase the thread test range to cover 5, 6, 7, 8 threads.
  Fix a bug in adding frame parallel unit test.
  Add VP9 frame-parallel unit test.
  Manually pick "Make the api behavior conform to api spec." from master branch.
  Move vp9_dec_build_inter_predictors_* to decoder folder.
  Add segmentation map array for current and last frame segmentation.
  Include the right header for VP9 worker thread.
  Move vp9_thread.* to common.
  ctrl_get_reference does not need user_priv.
  Seperate the frame buffers from VP9 encoder/decoder structure.
  Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""

 Conflicts:
       test/codec_factory.h
       test/decode_test_driver.cc
       test/decode_test_driver.h
       test/invalid_file_test.cc
       test/test-data.sha1
       test/test.mk
       test/test_vectors.cc
       vp8/vp8_dx_iface.c
       vp9/common/vp9_alloccommon.c
       vp9/common/vp9_entropymode.c
       vp9/common/vp9_loopfilter_thread.c
       vp9/common/vp9_loopfilter_thread.h
       vp9/common/vp9_mvref_common.c
       vp9/common/vp9_onyxc_int.h
       vp9/common/vp9_reconinter.c
       vp9/decoder/vp9_decodeframe.c
       vp9/decoder/vp9_decodeframe.h
       vp9/decoder/vp9_decodemv.c
       vp9/decoder/vp9_decoder.c
       vp9/decoder/vp9_decoder.h
       vp9/encoder/vp9_encoder.c
       vp9/encoder/vp9_pickmode.c
       vp9/encoder/vp9_rdopt.c
       vp9/vp9_cx_iface.c
       vp9/vp9_dx_iface.c

Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
2015-01-22 18:18:53 -08:00
Jingning Han
97dc782635 Merge "Initalize zeromv_sse and newmv_sse in vp9_pick_inter_mode" 2015-01-08 10:55:03 -08:00
Jingning Han
e42b3ee765 Initalize zeromv_sse and newmv_sse in vp9_pick_inter_mode
These two parameters are used to control the denoiser cut-off
thresholds. They should be properly initialized when starting
mode search of a given block.

Change-Id: Iba8a25487026a0dbe0d350c347d7e4e4e237b637
2015-01-07 15:32:41 -08:00
Jingning Han
802b798f67 Fix best ref frame rd cost update in sub8x8 non-RD mode search
This fixes the issue that sub8x8 inter blocks always end up
with GOLDEN_FRAME.

Change-Id: Id0c25cbb9c2003f43b4dff8fb1572512c246e077
2015-01-07 12:00:02 -08:00
Jingning Han
c3fd9bbdaf Format fix in vp9_pick_inter_mode_sub8x8
Replace ref_frame++ with ++ref_frame.

Change-Id: Ic39793081156c314bf1b85d5ab76def97f3bff52
2015-01-07 11:50:36 -08:00
Jingning Han
2fe1bfa5ad Merge "Remove redundant local variable for segment_id" 2015-01-02 14:48:27 -08:00
Jingning Han
5516fdd8d0 Remove redundant local variable for segment_id
Use mbmi->segment_id directly in vp9_pick_inter_mode. The value is
set outside this function, hence no need to assign it again.

Change-Id: I3d63cdd2e4fadf62ccdefada638b00d979eb3741
2015-01-02 12:25:14 -08:00
Jingning Han
59cfaa538e Merge "Use less tmp motion vectors in vp9_pick_inter_mode_sub8x8" 2015-01-02 10:00:45 -08:00
Jingning Han
5c31fd5c6d Merge "Enable sub8x8 inter block search for RTC coding mode" 2015-01-02 10:00:35 -08:00
Jingning Han
2baccb18a0 Use less tmp motion vectors in vp9_pick_inter_mode_sub8x8
This commit simplifies the reference motion vector part for sub8x8
block coding in RTC mode and reduces the required local variables.

Change-Id: I470d1482092563b68af22404dc1f497e7457b0a8
2014-12-30 13:16:12 -08:00
Jingning Han
dad89d5ca1 Enable sub8x8 inter block search for RTC coding mode
This commit enables sub8x8 inter block coding for RTC mode. The
use of sub8x8 blocks can be turned on by allowing
choose_partitioning function to select 4x4/4x8/8x4 block sizes.

Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a
2014-12-24 17:40:31 -08:00
Jingning Han
eb1795f643 Set ref frame scaling factor in RTC inter mode decision
Properly set the corresponding scaling factor of the reference
frame in the non-RD mode decision process. This allows the mode
search process to account for the scaled reference frame when
selecting coding mode.

Change-Id: I9d41bff6931c98e5a82b413e37ac5e6e14b93b23
2014-12-23 09:33:58 -08:00
Jingning Han
6ec0ef6691 Add a guard on intra mode skip control for RTC mode
This commit adds a guard condition to the intra mode test skip
control in RTC coding mode. If all inter modes are skipped, force
the encoder to check intra mode. It avoids situations where the
encoder proceeds without properly assigning required mode
information.

Change-Id: Ibb349fee997d6584ce901d08b06e8df3ca9c01b1
2014-12-18 12:00:27 -08:00
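The guard described in 6ec0ef6691 amounts to a small decision rule. The sketch below is only an illustration under assumed names: should_check_intra, skip_intra_hint and inter_modes_tested are hypothetical and do not appear in the commit.

```c
#include <stdbool.h>

/* Returns true when the intra mode check should run.
 * skip_intra_hint:    the fast path would like to skip intra testing
 *                     (assumed input).
 * inter_modes_tested: number of inter modes actually evaluated for the
 *                     current block (assumed input). */
static bool should_check_intra(bool skip_intra_hint, int inter_modes_tested) {
  /* Guard: if every inter mode was skipped, no mode information has been
   * assigned yet, so intra must be checked regardless of the hint. */
  if (inter_modes_tested == 0) return true;
  return !skip_intra_hint;
}
```

The hint is only honored once at least one inter mode has been evaluated, which is what keeps the block from leaving the search without a valid mode.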
Jingning Han
dd0602e01c Remove ARF mode entries from THR_MODES array in non-RD mode
The alternate reference frame is disabled in non-RD mode. No need
to keep the related entries in the THR_MODES array.

Change-Id: I53386f4bb1c6284f582801f27246c5edf55bc24b
2014-12-17 17:13:15 -08:00
Jingning Han
455514a683 Rework mode search threshold update for RTC coding mode
In RTC coding mode, the alternate reference frame modes and compound
inter prediction modes are disabled. This commit reworks the
related mode search threshold update process to skip interacting
with these coding modes. It provides about 1.5% speed-up for speed
-6 on average.

vidyo1
16551 b/f, 40.451 dB, 6261 ms -> 16550 b/f, 40.459 dB, 6190 ms

nik720p
33316 b/f, 38.795 dB, 6335 ms -> 33310 b/f, 38.798 dB, 6237 ms

mmmoving
33265 b/f, 41.055 dB, 7176 ms -> 33267 b/f, 41.064 dB, 7084 ms

dark720p
33329 b/f, 39.729 dB, 11235 ms -> 33331 b/f, 39.733 dB, 10731 ms

Change-Id: If2a4090a371cd28f579be219c013b972d7d9b97f
2014-12-17 15:56:01 -08:00
Jingning Han
56a8bc54a6 Properly store the tx_size of selected intra mode
Use a temporary variable to store the transform size associated
with the best intra mode and restore the mode_info if the overall
best mode is intra mode.

Change-Id: I2606e0061ad32f91b095462902b1eb734b128eea
2014-12-17 09:25:14 -08:00
Jingning Han
df3e3ab6ff Fix intra mode update process in vp9_pick_inter_mode
When multiple intra modes are tested, the previous mode info
update process may overwrite the selected best intra mode and make
the final selection use an inter mode. This commit fixes this
issue by moving the mode_info reset outside the intra mode search
loop.

Change-Id: I15ed4288a6b3cb0832104a5e6d5d9a25cd1a5b2b
2014-12-15 17:52:09 -08:00
Jingning Han
c2c7596fc7 Initialize best_tx_size with invalid value
If vp9_pick_inter_mode works properly, it should at least check
one coding mode and hence get best_tx_size assigned a valid value.
There is no need to initialize best_tx_size with a legitimate
value before starting the mode search.

Change-Id: Ic0496cd89672ea9c2c512a9bd1da952190af9cba
2014-12-15 12:58:34 -08:00
Jingning Han
83e2c62aba Use right shift to replace division in vp9_pick_inter_mode
Make the variable reduction_fac log2 based and explicitly use
right shift when computing intra_cost_penalty.

Change-Id: I208f1fb879a02debb3b3fc64f9fd06260dcf1c86
2014-12-15 12:48:07 -08:00
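The change in 83e2c62aba relies on the fact that dividing a non-negative integer by a power of two equals a right shift by its log2. A minimal sketch, assuming hypothetical function names and a non-negative penalty (for negative values the two forms can round differently):

```c
/* Division form: the reduction factor is stored as a plain divisor,
 * assumed here to be a power of two. */
static int penalty_div(int base_penalty, int reduction_fac) {
  return base_penalty / reduction_fac;
}

/* Shift form: the reduction factor is stored as log2 of the divisor.
 * Equivalent to penalty_div for non-negative base_penalty. */
static int penalty_shift(int base_penalty, int reduction_fac_log2) {
  return base_penalty >> reduction_fac_log2;
}
```

For example, a penalty of 20000 with a divisor of 4 and with a log2 factor of 2 both give 5000.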
Jingning Han
eefe869291 Simplify rate-distortion modeling function
Use left shift to replace one multiplication. The computation
outcome remains identical.

Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1
2014-12-15 11:51:16 -08:00
Jingning Han
0cac834b5a Use use_prev_frame_mvs flag for ref mv search branch
Replace error_resilient flag with use_prev_frame_mvs in
vp9_pick_inter_mode reference motion vector search selection.
This effectively turns off the simplified ref mv search in the
settings of frame resizing, even if error-resilient mode is off.

Change-Id: I7fed814ee7bc0cb419a03b846e0fc2de46ba7686
2014-12-09 18:18:40 -08:00
Jingning Han
17bedc54f5 Remove redundant rdcost reset
The initial reset of this_rdc in vp9_pick_inter_mode is not needed,
since it will be re-assigned when used.

Change-Id: Ic0e12d741cbab292fc214c1eabb48b129af7839b
2014-12-05 16:06:17 -08:00
Jingning Han
eadffb2d6e Fix a motion search skip condition in vp9_pick_inter_mode
Compare the current best mode rate-distortion cost with the skip
threshold to decide whether to perform motion search.

Change-Id: Ia071824f8dd3b7db485f424692a485a2da6a1a9f
2014-12-05 15:58:36 -08:00
Jingning Han
732d57c2b5 Remove redundant MB_MODE_INFO reset from vp9_pick_mode_inter
Change-Id: I0222f7abc61202f4a83b117bbfb042ada6304562
2014-12-05 15:51:11 -08:00
Deb Mukherjee
70d9dbd818 Fixes a missing highbitdepth convolve call bug
Bug was introduced in https://gerrit.chromium.org/gerrit/#/c/72122/

Change-Id: Idb500ea619a30e7bc50e22fb8ee03be5282f41db
2014-12-03 17:48:50 -08:00
Jingning Han
a04ed98482 Cosmetic change in vp9_pick_inter_mode
Change-Id: Ic072585ebffdb36982ed7b8b9f875ca6c1c656c4
2014-11-25 09:42:57 -08:00
Jingning Han
92a7cfc8bf Adaptively adjust mode test kick-off thresholds in RTC coding
This commit allows the encoder to increase the mode test kick-off
thresholds if the previous best mode renders all zero quantized
coefficients, thereby saving motion search runs when possible.
The compression performance of speed -5 and -6 is down by 0.446%
and 0.591%, respectively. The runtime of speed -6 is improved by
10% for many test clips.

vidyo1, 1000 kbps
16578 b/f, 40.316 dB, 7873 ms -> 16575 b/f, 40.262 dB, 7126 ms

nik720p, 1000 kbps
33311 b/f, 38.651 dB, 7263 ms -> 33304 b/f, 38.629 dB, 6865 ms

dark720p, 1000 kbps
33331 b/f, 39.718 dB, 13596 ms -> 33324 b/f, 39.651 dB, 12000 ms

mmmoving, 1000 kbps
33263 b/f, 40.983 dB, 7566 ms -> 33259 b/f, 40.978 dB, 7531 ms

Change-Id: I7591617ff113e91125ec32c9b853e257fbc41d90
2014-11-25 09:42:08 -08:00
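One way to picture the adaptive threshold in 92a7cfc8bf is sketched below. The struct, the field names, the shift-based boost and the rule that a mode is tested only when the current best cost is at or above its threshold are all assumptions made for this example, not the encoder's actual code.

```c
#include <stdint.h>

/* Hypothetical per-mode threshold state. */
typedef struct {
  int64_t base_thresh; /* normal kick-off threshold */
  int boost_shift;     /* how far to raise it (as a left shift) */
} ModeThresh;

/* Raise the threshold when the previous best mode produced all-zero
 * quantized coefficients. */
static int64_t kickoff_threshold(const ModeThresh *mt, int prev_all_zero) {
  return prev_all_zero ? mt->base_thresh << mt->boost_shift
                       : mt->base_thresh;
}

/* Assumed rule: test the mode (and run its motion search) only if the
 * best cost so far has not already dropped below the threshold. */
static int should_test_mode(const ModeThresh *mt, int64_t best_cost,
                            int prev_all_zero) {
  return best_cost >= kickoff_threshold(mt, prev_all_zero);
}
```

With a base threshold of 1000 and a boost shift of 1, a best cost of 1500 still triggers the mode test in the normal case, but the test is skipped once the previous best mode had all-zero coefficients and the threshold rises to 2000.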
Jingning Han
30104207fd Merge "Rework forward txfm/quantization skip system in RTC coding mode" 2014-11-25 09:33:57 -08:00
Jingning Han
6912c44135 Merge "Remove redundant intra mode penalty from vp9_pick_inter_mode" 2014-11-24 22:13:44 -08:00
Yunqing Wang
edbd61e136 vp9_ethread: modify VP9_COMP structure
This patch modifies struct VP9_COMP. It creates a struct ThreadData
to hold the data that needs to be copied for each thread. In the
multi-thread case, one thread processes one tile. All threads
share one copy of VP9_COMP
(refer to VP9_COMP *cpi in the code),
but each thread has its own copy of ThreadData
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.

In single thread case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.

Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
2014-11-24 17:57:38 -08:00
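The split between shared and per-thread state in edbd61e136 can be sketched with stand-in types. The structs below are heavily simplified assumptions: Encoder stands in for VP9_COMP, Common for VP9_COMMON and FrameCounts for FRAME_COUNTS, and their fields, like the helper functions, are invented for the example.

```c
/* Stand-in for FRAME_COUNTS. */
typedef struct { int counts[4]; } FrameCounts;

/* Stand-in for VP9_COMMON, which owns the frame-level counts. */
typedef struct { FrameCounts counts; } Common;

/* Stand-in for VP9_COMP: one copy shared by all threads. */
typedef struct { Common common; int num_workers; } Encoder;

/* Per-thread state: each thread has its own copy. */
typedef struct { FrameCounts *counts; } ThreadData;

/* Single-thread case: the per-thread counts pointer refers to the
 * counts stored in the common structure. */
static void init_single_thread(Encoder *cpi, ThreadData *td) {
  td->counts = &cpi->common.counts;
}

/* Tile-level work takes both the shared encoder and the thread data. */
static void encode_tile(const Encoder *cpi, ThreadData *td, int tile_idx) {
  (void)cpi; /* read-only shared state, unused in this sketch */
  td->counts->counts[tile_idx & 3] += 1;
}
```

In a multi-thread setup each ThreadData would instead point at its own FrameCounts, so tiles processed in parallel do not write to the same counters.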
Jingning Han
25be81e2dd Remove redundant intra mode penalty from vp9_pick_inter_mode
The intra mode penalty is covered by intra_cost_penalty. This
commit removes the other intra cost threshold, provided that the
constant 50 is negligible in normal rate-distortion cost.

Change-Id: I9b8b7483c43b9a41741622e7057def1f7d51bb72
2014-11-24 14:55:59 -08:00
Jingning Han
2fbdfd2c66 Key frame non-RD mode decision process
This commit makes a non-RD coding mode decision process for key
frame coding. It can be optionally turned on in speed -6 and above.

Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5
2014-11-24 09:04:28 -08:00
Jingning Han
7428cebe4f Rework forward txfm/quantization skip system in RTC coding mode
This commit allows more aggressive decision to skip forward
transform and quantization for luma component in RTC coding mode.
The chroma components still go through the normal coding
routine, since they are not included in the non-RD mode search
process.

It reduces the runtime cost by 2% - 10%. In speed -6,
vidyo1 1000 kbps
16576 b/f, 40.281 dB, 8402 ms -> 16576 b/f, 40.323 dB, 7764 ms

nik720p 1000 kbps
33337 b/f, 38.622 dB, 7473 ms -> 33299 b/f, 38.660 dB, 7314 ms

dark720p 1000 kbps
33330 b/f, 39.785 dB, 13505 ms -> 33325 b/f, 39.714 dB, 13105 ms

The compression performance of speed -6 is improved by 0.44% in
PSNR and 1.31% in SSIM.

Change-Id: Iae9e3738de6255babea734e5897f29118bebc6d7
2014-11-21 12:46:40 -08:00
Alex Converse
bc1b3d8412 Allow DC/H/V/TM on screen content.
6.3% better compression
less than 1% compression time increase

Change-Id: Ie83c059436e54c09de9e7c87e06e0a6d40dc38fe
2014-11-20 18:04:57 -08:00
Jingning Han
a62c87fb04 Add empty pointer check to pred buffering in rtc coding mode
This commit adds a check condition to the prediction buffering
operation used in the rtc coding mode. This resolves a unit test
warning in example/vpx_tsvc_encoder_vp9_mode_7.

Change-Id: I9fd50d5956948b73b53bd8fc5a16ee66aff61995
2014-11-17 11:24:07 -08:00
Jingning Han
aeff1f7ec2 Merge "Use reconstructed pixels for intra prediction" 2014-11-13 13:59:02 -08:00
Jingning Han
e717d22b63 Use reconstructed pixels for intra prediction
This commit makes the speed -6 and above use the reconstructed
boundary pixels for precise intra prediction. This allows more
intra prediction modes to be tested in the non-RD coding process.

Enabling horizontal and vertical intra prediction modes can
improve the speed -6 compression performance for rtc set
by 0.331%.

Change-Id: I3a99f9d12c6af54de2bdbf28c76eab8e0905f744
2014-11-11 10:04:43 -08:00
Alex Converse
ce9ba97a9d Fix LAST SKIP when considering GOLDEN
Change-Id: I39d9f13fa34984ee9dad0c4f303ef672635f420e
2014-11-07 13:44:17 -08:00
Jingning Han
1434f7695b Skip ref frame mode search conditioned on predicted mv residuals
This commit makes the RTC coding mode conditionally skip the
reference frame mode search when the predicted motion vector of
the current reference frame gives more than twice the sum of
absolute differences of the other reference frames.

It reduces the runtime by 1% - 4% for speed -5 and -6. The average
compression performance is improved by about 0.1% in both settings.

It is of particular benefit to light change scenarios. The
compression performance of test clip mmmovingvga.y4m is improved by
6.39% and 15.69% at high bit rates for speed -5 and -6, respectively.

Speed -5
vidyo1 16555 b/f, 40.818 dB, 12422 ms ->
       16552 b/f, 40.804 dB, 12100 ms

nik    33211 b/f, 39.138 dB, 11341 ms ->
       33228 b/f, 39.139 dB, 11023 ms

mmmoving 33263 b/f, 40.935 dB, 13508 ms ->
         33256 b/f, 41.068 dB, 12861 ms

Speed -6
vidyo1 16541 b/f, 40.227 dB, 8437 ms ->
       16540 b/f, 40.220 dB, 8216 ms

nik    33272 b/f, 38.399 dB, 7610 ms ->
       33267 b/f, 38.414 dB, 7490 ms

mmmoving 33255 b/f, 40.555 dB, 7523 ms ->
         33257 b/f, 40.975 dB, 7493 ms

Change-Id: Id2aef76ef74a3cba5e9a82a83b792144948c6a91
2014-11-04 09:10:19 -08:00
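The skip rule in 1434f7695b can be sketched as a comparison of per-reference SAD values. The function name, the array layout and the choice to compare against the smallest SAD among the other reference frames are assumptions made for this example.

```c
#include <limits.h>

/* Returns 1 when the mode search for reference frame `ref` may be skipped:
 * the SAD of its predicted motion vector is more than twice the smallest
 * SAD among the other reference frames. */
static int skip_ref_frame(const unsigned int *pred_mv_sad, int num_refs,
                          int ref) {
  unsigned int best_other = UINT_MAX;
  int found_other = 0;
  int i;
  for (i = 0; i < num_refs; ++i) {
    if (i == ref) continue;
    found_other = 1;
    if (pred_mv_sad[i] < best_other) best_other = pred_mv_sad[i];
  }
  /* With no other reference frame to compare against, never skip. */
  if (!found_other) return 0;
  /* Widen before doubling to avoid unsigned overflow. */
  return (unsigned long long)pred_mv_sad[ref] > 2ULL * best_other;
}
```

With SAD values of 100 and 250 for two reference frames, the second one is skipped because 250 exceeds twice 100, while the first one is always searched.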
Jingning Han
7e119e2946 Fix the THR_MODES array used in vp9_pick_inter_mode
Fix the alignment of entries for intra prediction modes.

Change-Id: Ie32ad87cf90694efd591a4b1cc29c916c4cd56f7
2014-11-02 12:25:57 -08:00
Jingning Han
1c84e73ebd Merge "Fix mode index use case in vp9_pick_inter_mode" 2014-10-31 08:55:40 -07:00