Commit Graph

228 Commits

Author SHA1 Message Date
Marco
c83bcb3474 vp9-svc: Allow for 2 stage downscaling for spatial layers.
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of souce.

Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.

Change-Id: I9d73f92a4a96927d0f1d6bf75315c1e60513226a
2016-03-01 15:48:42 -08:00
James Zern
8062e10162 Revert "vp9-svc: Fix speed issue with source downscaling for spatial layers."
This reverts commit f51f0998e1.

This causes datarate tests to fail. Some are due to the new default
keyframe distance, another causes an assert even forcing 9999:

[ RUN      ] VP9/DatarateOnePassCbrSvc.OnePassCbrSvc3SpatialLayers/0
test_libvpx:
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c:853: scaledconvolve2d:
Assertion `y_step_q4 <= 32' failed.

Change-Id: I4ee4fea97f47e4f1a23b82a62e6afc6280961e38
2016-02-26 16:53:26 -08:00
Marco
f51f0998e1 vp9-svc: Fix speed issue with source downscaling for spatial layers.
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of souce.

Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.

Change-Id: Ifcf526ec2aaf3e5fa7924588d9dd8660bf02fb46
2016-02-26 08:11:37 -08:00
Marco
34d12d1160 vp9-resize: Force reference masking off for external dynamic-resizing.
An issue exists with reference_masking in non-rd pickmode for spatial
scaling. It was kept off for internal dynamic resizing and svc, this
change is to keep it off also for external dynamic resizing.

Update to external resize test, and update TODO to re-enable this
at frame level when references have same scale as source.

Change-Id: If880a643572127def703ee5b2d16fd41bdbf256c
2016-02-11 08:35:57 -08:00
Marco
734dc36173 vp9: Add flag to control usage of skin detection.
Set off as default; on for 1 pass cbr mode, speed >=5, non-screen-content.

Change-Id: I03f2497e4028b354fd83b8a7d0e072c2a6bec878
2016-02-01 11:57:56 -08:00
Debargha Mukherjee
02345be986 Adding an aq mode for 360 videos
Different quality levels are used for different regions in
the frame depending on how far they are vertically from the
center. Specifically, three segments are used based on the
mi_row index with respect number to the number of mi_rows in
the frame.

Change-Id: Ifc8b777bc58ea8521dffc4640360c67d99f8d381
2016-01-13 16:17:37 -08:00
paulwilkins
0149fb3d6b Changes to exhaustive motion search.
This change alters the nature and use of exhaustive motion search.

Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.

Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.

For example:
  stage 1: Range +/- 64 interval 4
  stage 2: Range +/- 32 interval 2
  stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained  a bug (the two searches used different distortion
metrics).

For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.

Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most clips though the quality gain and speed impact are small.

Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
2015-11-13 10:16:31 +00:00
hui su
6ab6ac450b Use accurate bit cost for uv_mode in UV intra mode RD selection
On derflr, +0.1% for VP10; however, -0.03% on VP9.

Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-11-06 14:45:43 -08:00
Marco
c7da053d4b Move noise level estimate outside denoiser.
Source noise level estimate is also useful for
setting variance encoder parameters (variance thresholds,
qp-delta, mode selection, etc), so allow it to be used also
if denoising is not on.

Change-Id: I4fe23d47607b4e17a35287057f489c29114beed1
2015-11-02 12:15:26 -08:00
Yaowu Xu
568429512e Add a new enum type vpx_color_range_t
to make meaning of color_range obvious.

Change-Id: I303582e448b82b3203b497e27b22601cc718dfff
2015-10-16 16:27:18 -07:00
Ronald S. Bultje
812945a8f1 vp9/10: improve support for render_width/height.
In the decoder, map this to the output variable vpx_image_t.r_w/h.
This is intended as an improved version of VP9D_GET_DISPLAY_SIZE,
which doesn't work with parallel frame decoding. In the encoder,
map this to a codec control func (VP9E_SET_RENDER_SIZE) that takes
a w/h pair argument in a int[2] (identical to VP9D_GET_DISPLAY_SIZE).

Also add render_size to the encoder_param_get_to_decoder unit test.

See issue 1030.

Change-Id: I12124c13602d832bf4c44090db08c1009c94c7e8
2015-09-25 22:18:22 -04:00
Ronald S. Bultje
eeb5ef0a24 Add support for color-range.
In decoder, export (eventually) into vpx_image_t.range field. In
encoder, use oxcf->color_range to set it (same way as for
color_space).

See issue 1059.

Change-Id: Ieabbb2a785fa58cc4044bd54eee66f328f3906ce
2015-09-16 06:41:46 -04:00
Marco
4d1424faf9 For 1 pass: always use the normative filter in vp9_scale_if_required()
The normative (convolve8) filter is optimized/faster than
the nonnormative one. Pass usage of scaler (normative/nonomorative)
to vp9_scale_if_required(), and always use normative one for 1 pass.

Change-Id: I2b71d9ff18b3c7499b058d1325a9554de993dd52
2015-09-14 13:13:32 -07:00
James Zern
5e35c3c9a0 vp9_encoder: make vp9_alloc_compressor_data private
Change-Id: I38b4de692f4f7e880766316783981cbd1134bed9
2015-08-28 18:53:57 -07:00
Marco
93ffe9d6dc Update to dynamic resize for 1 pass CBR: source scaling.
Switch to use the normative (convolve8) filter for source scaling,
only for 1/2x1/2 scaling for now. This is faster and has better
quality than either the vpx_scale_frame or the nonnormative scaler.

Remove the vp9_scale_if_required_fast, which is now not used.

Change-Id: I2f7d73950589d19baafb1fa650eac987d531bcc8
2015-08-20 16:34:01 -07:00
Alex Converse
c7b7011b9b Move VP9 SSIM metrics to vpx_dsp.
Change-Id: I20c7b42631b579fade6cf7ebf6d4c69b2fcb5e5e
2015-08-06 18:25:25 -07:00
Yunqing Wang
b2446fb6be Remove tx_select_threshes
Removed unused tx_select_threshes and tx_select_diff.

Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
2015-07-27 12:02:05 -07:00
Yaowu Xu
bf82514b54 vpx_dsp/bitreader.h: vp9_->vpx_
Replace vp9_ in names to vpx_ as they are not codec specific.

Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb
2015-07-20 18:06:31 -07:00
Marco
4bbd95512a Dynamic resize for real-time: source scaling
Use faster scaling on source.

Change-Id: I968df97239a86834c96126b86832d3d6d0875a53
2015-07-10 11:04:18 -07:00
Johann
6a82f0d7fb Move sub pixel variance to vpx_dsp
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-07 15:51:04 -07:00
Debargha Mukherjee
9852643373 Expose params min-gf-interval/max-gf-interval
Adds two new vp9 parameters --min-gf-interval and --max-gf-interval
to enable testing based on frequency of alt-ref frames.

Also adds a unit-test to test enforcement of min-gf-interval.

For both these parameters the default value is 0, which indicates
they are picked by the encoder, based on resolution and framerate
considerations. If they are greater than zero, the specified
parameter is honored.

(Additional note by paulwilkins)
Note that there is a slight oddity in that key frames are also GFs and
considered part of  GF only group. However they are treated as not
being part of an arf group because for arf groups the previous GF is
assumed to be the terminal or overlay frame for the previous group.

(end note)

Change-Id: Ibf0c30b72074b3f71918ab278ccccc02a95a70a0
2015-07-06 12:24:59 -07:00
Jingning Han
d1b30ceaa3 Rename vpx_thread to vpx_util
Change the dir name to include more util tools.

Change-Id: Id5b16062803ce5eed872fe2edb36d7e56b32eed8
2015-07-02 10:02:37 -07:00
Jingning Han
8565a1c99a Merge "Use vpx prefix for codec independent threading functions" 2015-07-02 04:24:54 +00:00
Jingning Han
66cf8098e6 Merge "Move multi-threading module functions into vpx_thread folder" 2015-07-02 04:24:37 +00:00
Jingning Han
04d2e57425 Use vpx prefix for codec independent threading functions
Replace vp9_ prefix with vpx_ for common multi-threading functions.

Change-Id: I941a5ead9bfe8213fdad345511d2061b07797b55
2015-07-02 00:47:54 +00:00
Jingning Han
3a3b0be09a Move multi-threading module functions into vpx_thread folder
This commit moves the primitive multi-threading files from vp9
folder to vpx_thread, which will be accessible by all vpx codec.

Change-Id: Ib51e66e9c69801c10631fab56d35a0c0aaed5883
2015-07-01 17:45:49 -07:00
Scott LaVarnway
c06d56cc7d VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO
to MB_MODE_INFO_EXT.  This saves 36 bytes per 8x8 area for
both the decoder and encoder. (encoder has two MODE_INFO
buffers)

Change-Id: If006abb2224acaf326df3c2be09e77e967662107
2015-06-29 12:46:47 -07:00
Marco
d77f51ba9e Add dynamic resize logic for 1 pass CBR.
Decision to scale down/up is based on buffer state and average QP
over previous time window. Limit the total amount of down-scaling
to be at most one scale down for now.

Reset certain quantities after resize (buffer level, cyclic refresh,
rate correction factor).

Feature is enable via the setting rc_resize_allowed = 1.

Change-Id: I9b1a53024e1e1e953fb8a1e1f75d21d160280dc7
2015-06-18 17:13:37 -07:00
Yunqing Wang
2c838ede68 Allocate tile data adaptively to accommodate the frame size increase
If the frame size increases, the tile data buffer needs to be
re-allocated according to the number of tiles existing in current
frame. This patch makes the multi-tile encoding work in spatial
SVC usage case, and partially solved WebM issue 1018.

Change-Id: I1ad6f33058cf5ce6f60ed5024455a709ca80c5ad
2015-06-11 11:30:18 -07:00
Marco
c139b81a13 Vidyo patch: Rate control for SVC, 1 pass CBR mode.
-Make Rate control work for SVC 1 pass CBR mode.
-Added temporal layering mode.
-Fixed bug in non-rd variance partition.
-Modified/updated the sample encoders (vp9_spatial_svc_encoder, vpx_temporal_svc_encoder).
-Added datarate unittest(s) for 1 pass CBR SVC.

Change-Id: Ie94b1b68a56ea1267b5087c625e5df04def2ee48
2015-06-02 07:54:13 -07:00
Jim Bankoski
a6e9ae9066 Adds worst frame metrics for a bunch of metrics.
Change-Id: Ieaccc36ed1bee024bb644a9cfaafdaaa65d31772
2015-04-22 06:45:56 -07:00
Jim Bankoski
ee87e20d53 Adds a new temporal consistency metric to libvpx.
Change-Id: Id61699ebf57ae4f8af96a468740c852b2f45f8e1
2015-04-21 10:05:37 -07:00
Jim Bankoski
03829f2fea Merge "Adds a blockiness metric to internal stats." 2015-04-17 16:06:26 -07:00
Jim Bankoski
3d2f037a44 Merge "adds psnrhvs to internal stats." 2015-04-17 16:06:10 -07:00
Jim Bankoski
f2cbee9a04 Merge "Adds a fastssim metric to VPX internal stats." 2015-04-17 16:05:53 -07:00
Jim Bankoski
1777413a2a Adds a blockiness metric to internal stats.
Change-Id: Iedceeb020492050063acf3fd2326f96c29db9ae5
2015-04-17 11:13:18 -07:00
Jim Bankoski
9757c1aded adds psnrhvs to internal stats.
PSNR HVS is a human visual system weighted version of SNR that's
gained some popularity from academia and apparently better matches
MOS testing.

This code is borrowed from the Daala Project but uses our FDCT code.

Change-Id: Idd10fbc93129f7f4734946f6009f87d0f44cd2d7
2015-04-17 10:29:27 -07:00
Jim Bankoski
3f7f194304 Adds a fastssim metric to VPX internal stats.
This code appeared in the Daala project first and was originally
committed by Nathan Egge.

Change-Id: Iadce416a091929c51b46637ebdec984cddcaf18c
2015-04-17 10:23:24 -07:00
Marco Paniconi
f76ccce5bc Revert "Revert "Force_split on 16x16 blocks in variance partition.""
This reverts commit 004b9d83e3

Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743
2015-04-16 17:52:13 -07:00
Yunqing Wang
004b9d83e3 Revert "Force_split on 16x16 blocks in variance partition."
This reverts commit eb8c667570.
The patch caused mismatch while using multi-threads.

Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be
2015-04-14 15:19:31 -07:00
Marco
eb8c667570 Force_split on 16x16 blocks in variance partition.
Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.

Also increase variance threshold for 32x32, and add exit condiiton in choose_partition
(with very safe threshold) based on sad used to select reference frame.

Some visual improvement near moving boundaries.
Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%.
Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip.

Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
2015-04-13 12:05:07 -07:00
Yunqing Wang
cae03a7ef5 Set vbp thresholds for aq3 boosted blocks
The vbp thresholds are set seperately for boosted/non-boosted
superblocks according to their segment_id. This way we don't
have to force the boosted blocks to split to 32x32.

Speed 6 RTC set borg test result showed some quality gains.
Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%.
No speed change was observed.

Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a
2015-04-02 15:48:32 -07:00
Yunqing Wang
fc98114761 Merge "Rename vbp thresholds" 2015-03-31 16:33:30 -07:00
Yunqing Wang
c28ff1a9de Rename vbp thresholds
Code refactoring

Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79
2015-03-31 15:14:44 -07:00
Alex Converse
4dcb839607 VP9E_GET_ACTIVE_MAP API function.
This is useful when aq mode 3 (cyclic refresh) reactivates segments for refresh.

Change-Id: I3ad1d9410b899ede393d82bb8db14e2da4d84eca
2015-03-24 11:19:47 -07:00
Alex Converse
1bfacd3529 Reconcile active_map and cyclic refresh
Change-Id: Id7f8654aeeb20caa402bc822521b1d72c658f4f9
2015-03-12 16:19:49 -07:00
Adrian Grange
3807dd82ab Make encoder buffer allocation dynamic
Frame buffers are now allocated dynamically on-demand.

Entries in the reference frame map, cm->ref_frame_map,
may now be set to -1 (INVALID_IDX) to indicate that
there is not a valid reference buffer in that "slot".

All slots in the reference frame map are now initialized
to the empty state (-1) and each buffer is initialized
to have a reference count of 0.

Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582
2015-03-04 07:58:32 -08:00
Hangyu Kuang
8724d31d12 Move dequant table from VP9_COMMON to VP9_COMP as decoder
does not need it any more.

This reduces VP9_COMMON size from 25776 bytes to 17584 bytes(~31%).

Change-Id: Ic5daea732ccefb6d512b048af7983f0efe08589b
2015-02-20 11:12:42 -08:00
Yaowu Xu
ee5d79995e Move computation up to frame level
This is to avoid redo the same calculation repeatly, and also allow
easier adjustments for further experiments.

This commit shall have no effect on quality/compression.

Change-Id: I4460acf5c808ff5518da18d21e002c5da58af857
2015-02-10 15:41:52 -08:00
Adrian Grange
23ebacdb81 Auto-adaptive encoder frame resizing logic
Note: This feature is still in development.

Add an option for the encoder to decide the resolution
at which to encode each frame.

Each KF/GF/ARF goup is tested to see if it would be
better encoded at a lower resolution. At present, each
KF/GF/ARF is coded first at full-size and if the coded
size exceeds a threshold (twice target data rate) at
the maximum active Q then the entire group is encoded
at lower resolution.

This feature is enabled in vpxenc by setting:
  --resize-allowed=1

In addition, if the vpxenc command line also specifies
valid frame dimensions using:
  --resize-width=XXXX & --resize_height=YYYY
then *all* frames will be encoded at this resolution.

Change-Id: I13f341e0a82512f9e84e144e0f3b5aed8a65402b
2015-02-10 09:59:32 -08:00
Yunqing Wang
41063137c3 Rename loopfilter_thread files to thread_common files
Renames the files to allow more common thread code to
be moved to vp9/common.

Change-Id: I7386e64e221086e3cdc087e79812f993c423413b
2015-02-06 10:03:31 -08:00
hkuang
be6aeadaf4 Try again to merge branch 'frame-parallel' into master branch.
In frame parallel decode, libvpx decoder decodes several frames on all
cpus in parallel fashion. If not being flushed, it will only return frame
when all the cpus are busy. If getting flushed, it will return all the
frames in the decoder. Compare with current serial decode mode in which
libvpx decoder is idle between decode calls, libvpx decoder is busy
between decode calls.

Current frame parallel decode will only speed up the decoding for frame
parallel encoded videos. For non frame parallel encoded videos, frame
parallel decode is slower than serial decode due to lack of loopfilter
worker thread.

There are still some known issues that need to be addressed. For example:
decode frame parallel videos with segmentation enabled is not right sometimes.

* frame-parallel:
  Add error handling for frame parallel decode and unit test for that.
  Fix a bug in frame parallel decode and add a unit test for that.
  Add two test vectors to test frame parallel decode.
  Add key frame seeking to webmdec and webm_video_source.
  Implement frame parallel decode for VP9.
  Increase the thread test range to cover 5, 6, 7, 8 threads.
  Fix a bug in adding frame parallel unit test.
  Add VP9 frame-parallel unit test.
  Manually pick "Make the api behavior conform to api spec." from master branch.
  Move vp9_dec_build_inter_predictors_* to decoder folder.
  Add segmentation map array for current and last frame segmentation.
  Include the right header for VP9 worker thread.
  Move vp9_thread.* to common.
  ctrl_get_reference does not need user_priv.
  Seperate the frame buffers from VP9 encoder/decoder structure.
  Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
 Conflicts:
       test/codec_factory.h
       test/decode_test_driver.cc
       test/decode_test_driver.h
       test/invalid_file_test.cc
       test/test-data.sha1
       test/test.mk
       test/test_vectors.cc
       vp8/vp8_dx_iface.c
       vp9/common/vp9_alloccommon.c
       vp9/common/vp9_entropymode.c
       vp9/common/vp9_loopfilter_thread.c
       vp9/common/vp9_loopfilter_thread.h
       vp9/common/vp9_mvref_common.c
       vp9/common/vp9_onyxc_int.h
       vp9/common/vp9_reconinter.c
       vp9/decoder/vp9_decodeframe.c
       vp9/decoder/vp9_decodeframe.h
       vp9/decoder/vp9_decodemv.c
       vp9/decoder/vp9_decoder.c
       vp9/decoder/vp9_decoder.h
       vp9/encoder/vp9_encoder.c
       vp9/encoder/vp9_pickmode.c
       vp9/encoder/vp9_rdopt.c
       vp9/vp9_cx_iface.c
       vp9/vp9_dx_iface.c

This reverts commit a18da9760a.

Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
2015-01-30 21:00:13 -08:00
Johann
a18da9760a Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch."
This reverts commit bde04ce503

Change-Id: I053dae04c761b04a36dc239558503905a14d2470
2015-01-23 08:42:02 -08:00
hkuang
bde04ce503 Merge branch 'frame-parallel' to enable frame parallel decode in master branch.
In frame parallel decode, libvpx decoder decodes several frames on all
cpus in parallel fashion. If not being flushed, it will only return frame
when all the cpus are busy. If getting flushed, it will return all the
frames in the decoder. Compare with current serial decode mode in which
libvpx decoder is idle between decode calls, libvpx decoder is busy
between decode calls. VP9 frame parallel decode is >30% faster than serial
decode with tile parallel threading which will makes devices play 1080P
VP9 videos more easily.

* frame-parallel:
  Add error handling for frame parallel decode and unit test for that.
  Fix a bug in frame parallel decode and add a unit test for that.
  Add two test vectors to test frame parallel decode.
  Add key frame seeking to webmdec and webm_video_source.
  Implement frame parallel decode for VP9.
  Increase the thread test range to cover 5, 6, 7, 8 threads.
  Fix a bug in adding frame parallel unit test.
  Add VP9 frame-parallel unit test.
  Manually pick "Make the api behavior conform to api spec." from master branch.
  Move vp9_dec_build_inter_predictors_* to decoder folder.
  Add segmentation map array for current and last frame segmentation.
  Include the right header for VP9 worker thread.
  Move vp9_thread.* to common.
  ctrl_get_reference does not need user_priv.
  Seperate the frame buffers from VP9 encoder/decoder structure.
  Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""

 Conflicts:
       test/codec_factory.h
       test/decode_test_driver.cc
       test/decode_test_driver.h
       test/invalid_file_test.cc
       test/test-data.sha1
       test/test.mk
       test/test_vectors.cc
       vp8/vp8_dx_iface.c
       vp9/common/vp9_alloccommon.c
       vp9/common/vp9_entropymode.c
       vp9/common/vp9_loopfilter_thread.c
       vp9/common/vp9_loopfilter_thread.h
       vp9/common/vp9_mvref_common.c
       vp9/common/vp9_onyxc_int.h
       vp9/common/vp9_reconinter.c
       vp9/decoder/vp9_decodeframe.c
       vp9/decoder/vp9_decodeframe.h
       vp9/decoder/vp9_decodemv.c
       vp9/decoder/vp9_decoder.c
       vp9/decoder/vp9_decoder.h
       vp9/encoder/vp9_encoder.c
       vp9/encoder/vp9_pickmode.c
       vp9/encoder/vp9_rdopt.c
       vp9/vp9_cx_iface.c
       vp9/vp9_dx_iface.c

Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
2015-01-22 18:18:53 -08:00
Yunqing Wang
e76eaf05b1 vp9_ethread: add parallel loopfilter
1. Added row-based loopfilter in encoder;
2. Moved common multi-threaded loopfilter functions from decoder
   to common;
3. Merged multi-threaded loopfilter code, and made encoder/
   decoder call same function to reduce code duplication.

Encoder tests showed that 1% - 2% speedup was seen for good-quality
2-pass mode(at speed 3); 1% - 3% speedup using 2 threads and 4% - 6%
speedup using 4 threads were seen for real-time mode(at speed 7).

Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
2015-01-16 17:19:27 -08:00
Yaowu Xu
e94b415c34 Add encoder control for setting color space
This commit adds encoder side control for vp9 to set color space info
in the output compressed bitstream.

It also amends the "vp9_encoder_params_get_to_decoder" test to verify
the correct color space information is passed from the encoder end to
decoder end.

Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650
2015-01-14 10:17:14 -08:00
Yaowu Xu
6f6fbf9175 Merge "Added plumbing for setting color space" 2015-01-13 09:20:13 -08:00
Yaowu Xu
ce52b0f8d3 Added plumbing for setting color space
Change-Id: If64052cc6e404abc8a64a889f42930d14fad21d3
2015-01-09 10:54:25 -08:00
Paul Wilkins
a3c1a9b419 Use 64 bit to accumulate frame sse.
When testing frame sse to choose a loop filter value and
when checking ambient error in kf Q selection, use 64 bit
values for accumulating the sse, to avoid risk of overflow
for large image formats.

Change-Id: I03765d16c843d0ade61a45b0cd46312472697e57
2015-01-07 14:13:16 +00:00
Jim Bankoski
dd4275e498 resolve visual studio warnings around initializers
Change-Id: Id2ad4fb24242f7ca8fa7a152f0889fded4113613
2014-12-19 12:38:25 -08:00
Paul Wilkins
60e9b731cf Remove mode dependent zbin boost.
Initial patch to remove get_zbin_mode_boost() and
cpi->zbin_mode_boost.

For now sets a dummy value of 0 for zbin extra pending
a further clean up patch.

Change-Id: I64a1e1eca2d39baa8ffb0871b515a0be05c9a6af
2014-12-18 16:45:52 +00:00
James Zern
72ece1308b vp9: move encoder-only member from common
allow_comp_inter_inter VP9_COMMON -> VP9_COMP

Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6
2014-12-12 11:17:44 -08:00
Paul Wilkins
e68c8dcfd2 Substantial restructuring of AQ mode 2.
The restructure moves the decision into the rd pick
modes loop and makes a decision based at the 16x16
block level instead of only the 64x64 level.

This gives finer granularity and better visual results
on the clips I have tested. Metrics results are worse
than the old AQ2 especially for PSNR and this mode
now falls between AQ0 and AQ1 in terms of visual
impact and metrics results.

Further tuning of this to follow.

It should be noted that if there are multiple iterations
of the recode loop the segment for a MB could change
in each loop if the previous loop causes a change in the
complexity / variance bin of the block. Also where a block
gets a delta Q this will alter the rd multiplier for this block
in subsequent recode iterations and frames where the
segmentation is applied.

Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7
2014-12-09 15:10:52 +00:00
Yunqing Wang
eba9c762a1 vp9_ethread: the tile-based multi-threaded encoder
Currently, VP9 supports column-tile encoding, which allows a frame
to be encoded in multiple column tiles independently. The number of
column tiles are set by encoder option "--tile-columns". This
provides a way to encode a frame in parallel.

Based on previous set of patches, this patch implemented the tile-
based multi-threaded encoder. Each thread processes one or more
tiles.

Usage:
For HD clips:
--tile-columns=2 --threads=1/2/3/4

While using 4 threads, tests showed that the encoder achieved
2.3X - 2.5X speedup at good-quality speed 3, and 2X speedup at
realtime speed 5.

Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4
2014-12-04 11:21:34 -08:00
Yunqing Wang
0993bef7e9 vp9_ethread: calculate and save the tok starting address for tiles
Each tile's tok starting address is calculated before the encoding
process. These addresses are stored so that the same calculation
won't be done again in packing bit stream.

Change-Id: I0a3be0301f002260c19a850303f2f73ebc47aa50
2014-11-25 17:19:35 -08:00
Yunqing Wang
edbd61e136 vp9_ethread: modify VP9_COMP structure
This patch modified struct VP9_COMP. Created a struct ThreadData
to include data that need to be copied for each thread. In
multiple thread case, one thread processes one tile. all threads
share one copy of VP9_COMP,
(refer to VP9_COMP *cpi in the code)
but each thread has its own copy of ThreadData,
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.

In single thread case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.

Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
2014-11-24 17:57:38 -08:00
Yunqing Wang
70c9d2983b Revert "vp9_ethread: include a pointer to mb in VP9_COMP"
This reverts commit 6906d218dd.

Another way will be used to handle mb struct.

Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4
2014-11-20 08:31:12 -08:00
Yaowu Xu
1687c47bfd change to call vp9_refining_search_sad() directly
The function pointer in compressor instance does not change, so this
commit changes to call the function directly.

Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5
2014-11-17 11:30:17 -08:00
Yunqing Wang
d0b547c676 vp9_ethread: combine encoder counts in separate struct
Several frame counters in encoder are updated at SB level. Combine
those counters and put them in a separate struct, which allows us
to allocate one copy for each thread.

Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7
2014-11-14 16:09:22 -08:00
Yunqing Wang
6906d218dd vp9_ethread: include a pointer to mb in VP9_COMP
Modified VP9_COMP struct to include MACROBLOCK *mb. This change
makes it feasible in multi-thread case to allocate a mb for each
thread.

Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e
2014-11-14 12:31:06 -08:00
Adrian Grange
0d085ebc0a Prepare for dynamic frame resizing in the recode loop
Prepare for the introduction of frame-size change
logic into the recode loop.

Separated the speed dependent features into
separate static and dynamic parts, the latter being
those features that are dependent on the frame size.

Change-Id: Ia693e28c5cf069a1a7bf12e49ecf83e440e1d313
2014-11-13 11:41:20 -08:00
Minghai Shang
86c36a504d [spatial svc] Make spatial svc working for one pass rate control
Change-Id: Ibd9114485c3d747f9d148f64f706bf873ea473ac
2014-11-04 11:46:48 -08:00
Yaowu Xu
2a506e33b4 Merge "Add a new control of golden frame boost in CBR mode" 2014-10-28 09:32:58 -07:00
Jingning Han
d56b3eb0cf Refactor encoder tile data structure
Make the common tile info as one element in the encoder tile data
struct.

Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be
2014-10-27 19:37:13 -07:00
Yaowu Xu
03a60b78db Add a new control of golden frame boost in CBR mode
0 means that golden boost is off, and uses average frame target rate,
a non-zero number means the percentage of boost over average frame
bitrate is given initially to golden frames in CBR mode.

Change-Id: If4334fe2cc424b65ae0cce27f71b5561bf1e577d
2014-10-27 13:55:18 -07:00
Yaowu Xu
aa2af3ff6e Merge "Add a new control of max bitrate for inter frame" 2014-10-27 08:11:54 -07:00
Jingning Han
ac53c41e64 Merge "Tile based adaptive mode search in RD loop" 2014-10-24 18:44:52 -07:00
Yaowu Xu
636099f7b6 Add a new control of max bitrate for inter frame
Change-Id: I205de3611622cff7f751ea8baf9f82784581730a
2014-10-24 10:19:28 -07:00
Jingning Han
eee201c221 Tile based adaptive mode search in RD loop
Make the spatially adaptive mode search in rate-distortion
optimization loop inter tile independent. Experiments suggest that
this does not significantly change the coding staticstics.

Single tile, speed 3:
pedestrian_area 1080p 1500 kbps
59192 b/f, 40.611 dB, 101689 ms

blue_sky 1080p 1500 kbps
58505 b/f, 36.347 dB, 62458 ms

mobile_cal 720p 1000 kbps
13335 b/f, 35.646 dB, 45655 ms

as compared to 4 column tiles, speed 3:
pedestrian_area 1080p 1500 kbps
59329 b/f, 40.597 dB, 101917 ms

blue_sky 1080p 1500 kbps
58712 b/f, 36.320 dB, 62693 ms

mobile_cal 720p 1000 kbps
13191 b/f, 35.485 dB, 45319 ms

Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325
2014-10-24 10:00:27 -07:00
Adrian Grange
65753eeb8a Move frame re-sizing into the recode loop
The point at which frames are scaled to their
coded dimensions is moved into the re-code loop.

This is in preparation for a further patch that
will add logic into the re-code loop to reduce
the coded frame size if the encoder is struggling
to hit the target data rate at the native frame
size.

Change-Id: Ie4131f5ec6fb93148879f6ce96123296442bf2d1
2014-10-23 16:20:57 -07:00
Paul Wilkins
4163da33c2 Merge "Extend --auto-alt-ref so it can enable multi-alt ref." 2014-10-21 07:02:24 -07:00
Paul Wilkins
6f0ae3a2d1 Extend --auto-alt-ref so it can enable multi-alt ref.
Extend --auto-alt-ref from parameter so we can use it to
turn multi-arf on and off from the command line.

For now the range is 0-off, 1-on, 2-multi-arf on.

Rename play_alternate to enable_auto_arf

Change-Id: Id7b64407cfbe76ba0090a83b588a03e22a240386
2014-10-20 16:09:37 +01:00
Yunqing Wang
7c4992c466 Remove the dependency in token storing locations
Currently, the tokens for a tile are stored immediately after its
preceding tile, which causes a dependency. This is unnecessary
since we always allocate enough memory for tokens. Removing
the dependency allows token writing done in parallel. This patch
doesn't change encoding result.

Change-Id: I7365a6e5e2c2833eb14377c37e1503c9d0f26543
2014-10-17 14:25:33 -07:00
Paul Wilkins
d5130af568 Revert "Move input frame scaling into the recode loop"
This reverts commit 452dc21500.

This change has introduced a significant quality regression on content
with forced key frames. (e.g. the YT and yt-hd set). It is most
noticeable in static content where the kf bits dominate. Here, despite
key frames being apparently coded at the same Q, there is a drop in all
metrics of ~20% (e.g clXR and BFa0).

Change-Id: Iba14cc61778c0846fa0a59c33c55a9fc49512cb4
2014-10-16 15:54:40 +01:00
Adrian Grange
452dc21500 Move input frame scaling into the recode loop
Move the point at which input frames are scaled
into the recode loop. This will allow us to change
the coded frame size dynamically in response
to previous attempts to encode the frame at a
higher resolution.

A following patch will implement a scheme for
resizing the frame in the recode loop.

Change-Id: I6a59c02d6ac1626512edad6de8b60063b79433e6
2014-10-14 09:27:55 -07:00
Deb Mukherjee
3117830af3 Merge "Subpel search cleanups and enhancements" 2014-10-09 11:14:51 -07:00
Deb Mukherjee
d78dbff09a Subpel search cleanups and enhancements
- Some fixes to surface fit.
- Returns variance function as cost rather than sad in the
  pattern search and diamond search functions. Only
  vp9_pattern_search_sad function used in bigdia search
  uses sad as integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.

Results:
derf [Speed 3]: About +0.036% in coding efficiency without any
discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.

Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
2014-10-08 23:59:43 -07:00
Jingning Han
3f93f23120 Remove redundant header file from vp9_encoder.h
Change-Id: Ia212390cf8d36db5436bb0f0e1b696f70066341a
2014-10-07 10:49:58 -07:00
Yunqing Wang
b1b6fd85db Merge "Skip the partition search for still frames" 2014-09-30 11:59:05 -07:00
Yunqing Wang
1fcbf6ed56 Skip the partition search for still frames
This patch re-enabled the feature in Pengchong's patch
(commit 1286126073). Originally, it
was turned on while use_lastframe_partitioning > 0(not used anymore).
Now it was added as a feature, and turned on while speed >= 2.
As described in the original patch, this feature helps speed up the
slideshows in YouTube.

Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471
2014-09-26 09:03:52 -07:00
Deb Mukherjee
4109372af3 Adds high bit-depth psnr/sse functions
Also adds some miscellaneous high bit-depth setup functions.

Change-Id: I66488b08a5a2a8cb9518ca10497cf1c1501ceded
2014-09-23 17:28:05 -07:00
Deb Mukherjee
10783d4f3a Adds high bitdepth transform functions and tests
Adds various high bitdepth transform functions and tests.
Much of the changes are related to using typedefs tran_low_t
and tran_high_t for the final transform cofficients and intermediate
stages of the transform computation respectively rather than fixed
types int16_t/int. When vp9_highbitdepth configure flag is off,
these map tp int16_t/int32_t, but when the flag is on, they map
to int32_t/int64_t to make space for needed extra precision.

Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
2014-09-11 19:56:33 -07:00
Minghai Shang
759afe525c Merge "[svc] Temporal svc with two pass rate control" 2014-09-03 10:51:19 -07:00
Deb Mukherjee
a4ef1a0819 Merge "Adds config opt for highbitdepth + misc. vpx" 2014-09-02 15:41:27 -07:00
Deb Mukherjee
5acfafb18e Adds config opt for highbitdepth + misc. vpx
Adds config parameter vp9_highbitdepth, to support highbitdepth profiles.
Also includes most vpx level high bit-depth functions. However
encode/decode in the highbitdepth profiles will not work until
the rest of the code is in place.

Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6
2014-09-02 14:37:10 -07:00
Minghai Shang
be3b08da3e [svc] Temporal svc with two pass rate control
It's built based on current spatial svc code.
We only support one spatial two temporal layers at this time.
Change-Id: I1fdc8584354b910331e626bfae60473b3b701ba1
2014-09-02 12:05:14 -07:00
Dmitry Kovalev
0a4403992a Merge "Removing 'frames' field from VP9_COMP." 2014-09-02 10:01:20 -07:00
Dmitry Kovalev
4ab2241f5b Removing dummy_packing member from VP9_COMP.
Change-Id: I571ce84c97087f8a1a36a10058393bfdcefbf72a
2014-08-29 17:33:20 -07:00
Yunqing Wang
96c43e8aa9 Minor fix in vp9_encoder.h
Added the missing "int".

Change-Id: I7c8af3dee700837b40f010d53e1431a59370ae3a
2014-08-29 11:27:24 -07:00
Dmitry Kovalev
e9d106bd45 Merge "Removing unused arnr_type from VP9EncoderConfig and vp9_extracfg." 2014-08-28 13:50:05 -07:00