Commit Graph

1231 Commits

Author SHA1 Message Date
Debargha Mukherjee
01a2b40e95 Merge "Spatial SVC crash fix" 2015-12-01 21:24:46 +00:00
Marco
f78b7daec4 Condition use of minmax in variance partition on speed setting.
For non-rd variance partition: only allow minmax computation
(which currently has no arm-neon optimization) for speeds < 8.

Performance loss is small: On RTC set with speed 8, few clips lose ~2/3%,
average loss is < 1%.

Change-Id: Ia9414f4d0b77dc83c3e73ca8de5d903f64b425ce
2015-11-30 17:23:32 -08:00
Marco
5b0ddb931d vp9 denoiser: Re-evaluate ZEROMV after denoiser filtering.
For denoising, and for noise level above threshold, re-evaluate
ZEROMV for mode selection after denoising.
Current change only does this check if selected best mode (before denoising)
was intra.

Change-Id: I4b1435b68d26c78f7597b995ee7bff0ddd5f9511
2015-11-24 17:30:32 -08:00
Debargha Mukherjee
e807517a93 Spatial SVC crash fix
Fixes a spatial_svc breakage introduced in
https://chromium-review.googlesource.com/#/c/305228/3.

Change-Id: I7f2cecbdca980addb85d5e58b58b5454f4730ada
2015-11-24 16:40:27 -08:00
paulwilkins
0149fb3d6b Changes to exhaustive motion search.
This change alters the nature and use of exhaustive motion search.

Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.

Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.

For example:
  stage 1: Range +/- 64 interval 4
  stage 2: Range +/- 32 interval 2
  stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained  a bug (the two searches used different distortion
metrics).

For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.

Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most clips though the quality gain and speed impact are small.

Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
2015-11-13 10:16:31 +00:00
Marco
419da5c734 Adjust variance threshold for 16x16 split at low resolutions.
Change-Id: I635e37f81237e9703d7d9a11ed76a043f4ec6eb0
2015-11-12 17:58:31 -08:00
Marco Paniconi
1b63238b67 Merge "Non-rd partition: reduce variance threshold low resolutions." 2015-11-12 06:08:38 +00:00
Marco
1827764450 Adjust varianace threshold for high noise condition.
Change-Id: I91c722e480328ff95b8c57614d8176ccaceb2539
2015-11-11 18:06:21 -08:00
Marco
064a9eca49 Non-rd partition: reduce variance threshold low resolutions.
Change-Id: I06306905d187948a92f839357df5d21413823808
2015-11-10 15:42:58 -08:00
Marco
c7da053d4b Move noise level estimate outside denoiser.
Source noise level estimate is also useful for
setting variance encoder parameters (variance thresholds,
qp-delta, mode selection, etc), so allow it to be used also
if denoising is not on.

Change-Id: I4fe23d47607b4e17a35287057f489c29114beed1
2015-11-02 12:15:26 -08:00
Marco
8a2fc54508 Adjustments to vp9-denoising.
Adjust variance threshold, delta-qp, and intra penalty cost,
based on estimated noise level in source.

Replace denoising_on with a level value=L/M/H.

Change-Id: I0c017dae75a5d897367d2c42dec26f2f37e447c1
2015-10-27 10:44:19 -07:00
Yaowu Xu
1832ba7509 Restore partial changes from previous commit
This portion was tested to have no effect on asan test failures.

Change-Id: I3de1dab7479148bdffc24c4568cb2e7e9963f099
2015-10-16 00:28:37 +00:00
Yaowu Xu
4727fa2a75 Fix two asan failures
Change-Id: I57865e9604ac162ef0d97deb16e81ca436a98428
2015-10-14 18:03:31 -07:00
Yaowu Xu
c2b8b5bfe2 Merge "Changes to partition breakout rules." 2015-10-13 22:31:56 +00:00
paulwilkins
cdc359989a Changes to partition breakout rules.
Changes to the breakout behavior for partition selection.
The biggest impact is on speed 0 where encode speed in
some cases more than doubles with typically less than 1%
impact on quality.

Speed 0 encode speed impact examples
Animation test clip: +128%
Park Joy:  +59%
Old town Cross: + 109%

Change-Id: I222720657e56cede1b2a5539096f788ffb2df3a1
2015-10-13 14:19:06 -07:00
Marco
86ede50943 Fix to denoiser with dynamic resize.
Temporary fix to denoiser when dynamic resizing is on.
 -Reallocate denoiser buffers on resized frame.
 -Force golden update on resized frame.
 -Don't denoise resized frame, and copy source into denoised buffers.

Change-Id: Ife7638173b76a1c49eac7da4f2a30c9c1f4e2000
2015-10-02 11:50:57 -07:00
Marco
3f7656cc23 Limit cyclic refresh on steady background blocks.
Use the existing QP condition on limiting cyclic refresh, and add
addiitonal condition that block has been encoded with zero/small motion
x frames in row (where x is at least several times the refresh period).
Additional condition only affect non-screen content mode.

This helps to improve visual stability for noisy input, where on steady
background areas the application of delta_qp may lead to encoding the noise.

Also added a change to use the true skip (after encoding) to update the
last QP.

Change-Id: I234a1128d017d284cf767fdb58ef6c59d809f679
2015-09-25 10:40:35 -07:00
Johann
c5f11912ae Include vpx_dsp_common.h when using VPXMIN/MAX
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
2015-08-31 14:36:35 -07:00
James Zern
5e16d397bd vpx_dsp_common: add VPX prefix to MIN/MAX
prevents redeclaration warnings;
vp8 has its own define which will be resolved in a future commit

Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
2015-08-26 20:11:32 -07:00
Marco
e18800443c Fix to non-rd variance partition selection.
Only test for using golden as reference for variance partition
selection if it is used as a reference for that frame.

For temporal layers, golden may not be a reference on a given frame,
even though it was for some previous frame. If it is not a reference
for current frame, don't check/use it for partition selection.

Change-Id: I6b0f2bd36aebbb5903077c9a0a66d80f1de9a7b1
2015-08-17 13:32:40 -07:00
hui su
cb79ea1c16 Call set_ref_ptrs only for inter blocks
In encode_superblock, call set_ref_ptrs only for inter blocks.

Change-Id: I27545c0e3e679e1838b78d7c9d01fe5a4d3cc0fb
2015-08-12 11:25:43 -07:00
hui su
088b05fd99 Use sizeof(variable) instead of sizeof(type)
Change-Id: Ia069da11eebb271063e9eb837bdb3e7175ecce13
2015-08-12 11:25:38 -07:00
Alex Converse
a8a08ce57e Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h
Use system_state.h in vpx_dsp and remove unneeded includes of
vp9_systemdependent.h.

Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4
2015-08-10 15:37:14 -07:00
Marco
f6255dbb53 Bugfix for svc.
Condition usage of rc.frames_since_golden to non-svc mode.

rc.frames_since_golden, which is used in non-svc mode to add second reference,
was causing, under certain condiiton, the turning off of golden reference
for svc case.

Change-Id: Icec644d235d0471e56d8ff73d6c37278bd6ecd3b
2015-08-05 13:59:52 -07:00
Yunqing Wang
3b2e73b9a4 Remove tx cache and speed up tx size selection
1. The RD scores obtained during the tx size selection were stored in the
tx cache, and used to help make the tx decision for the following frames.
This wasn't used anymore in VP9 encoder. Recovered the related decision
making code from 1.5+ years ago, and borg tests didn't show any quality
gain. This patch removed it to lower the complexity.

2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all posible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.

Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
2015-07-30 18:53:40 -07:00
Jingning Han
4b5109cd73 Replace vp9_ prefix in 2D-DCT functions with vpx_
Clean up the forward 2D-DCT function names in vpx_dsp.

Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-28 16:06:44 -07:00
Yunqing Wang
b2446fb6be Remove tx_select_threshes
Removed unused tx_select_threshes and tx_select_diff.

Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
2015-07-27 12:02:05 -07:00
paulwilkins
4b44e46de0 Merge "Changes to use of rectangular partitions." 2015-07-09 18:34:41 +00:00
paulwilkins
2d637ca36d Merge "Change speed and rd features for formatting bars." 2015-07-09 16:38:38 +00:00
Jingning Han
535cc6d87f Format fixes in vp9_encodeframe.c and vp9_encodemb.c
Change-Id: Ib1303dac9043ab1b1f8fce54611cf4ea8a208038
2015-07-09 00:04:28 +00:00
paulwilkins
8dd466edc8 Changes to use of rectangular partitions.
Changes to allow more use of rectangular partitions at
speeds 1 and 2 for content classed by the first pass as
animation and for blocks near the active image edge.

This has quite a big impact in quality for the animated
test sequence but also hurts encode speed for speed 2.

For other content types the impact on both speed and
quality is small.

Added some plumbing for detection of internal vertical
image edges.

Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a
2015-07-08 18:14:12 +01:00
paulwilkins
a126b6ce7d Change speed and rd features for formatting bars.
Change speed features / behavior for split mode when there
is an internal active edge (e.g. formatting bars).

Remove some threshold constraints in rd code near the active
edge of the image.

Add some plumbing for left and right active edge detection.

Patch set 5. Limit rd pass through for sub 8x8 to internal active edges.
This takes away any speed penalty for most clips but keeps the enhanced
edge coding for the more critical case of internal image edges

Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5
2015-07-08 17:51:42 +01:00
Marco
478fbc8f23 Update to speed 5 non-rd mode partition search.
If the pre-selected partition size (from variance partition) is
32x32, also apply nonrd partition search for 32x32 and 16x16 size.

Overall small positive gain in metrics, average ~1%.
Some visual improvement, for lower resolutions.

Change-Id: I69cb425bda94f7d13d34c451ab30e9276335a30e
2015-07-07 11:52:01 -07:00
paulwilkins
99f8bd72cb Alter partition search at image edge.
Added code to reduce the minimum partition size searched
for super blocks at or straddling the edge of the image.

If the first pass has detected formatting bars the "active" edge
may not be the real edge.

Change-Id: I9c4bdd1477e60f162a75fac95ba6be7c3521e05c
2015-07-02 16:25:25 +01:00
James Zern
e757808429 Merge "vp9_pred_common: inline vp9_get_tx_size_context" 2015-07-02 01:52:40 +00:00
James Zern
0ea304620c Merge "vp9_pred_common: inline vp9_get_segment_id" 2015-07-02 01:52:21 +00:00
Scott LaVarnway
c06d56cc7d VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO
to MB_MODE_INFO_EXT.  This saves 36 bytes per 8x8 area for
both the decoder and encoder. (encoder has two MODE_INFO
buffers)

Change-Id: If006abb2224acaf326df3c2be09e77e967662107
2015-06-29 12:46:47 -07:00
Scott LaVarnway
437d033dbb Merge "Remove tile param" 2015-06-29 18:04:56 +00:00
Marco
fb2a89b1fb Fixes for key frame coding at speed 5.
Keep the same transform cutoff and partition selection
for speed 5 as in speeds >=6 (non-rd speed settings).

Existing setting for key frame at speed 5 allowed transform size
up to 32x32 on key frames, and did not allow for 4x4 block partition size.
This created more visual artifacts on first few frames.

avgPSNR/overallPSNR/SSIM gains of 0.2/0.7/0.8 for rtc_derf(low-res) set,
and 0/0.7/1.1 gains for rtc set.

Change-Id: I8c139ec6c9bb74e14b4ffbad5f12e94f18a59c0b
2015-06-22 16:57:35 -07:00
Scott LaVarnway
86f4a3d8af Remove tile param
and added to MACROBLOCKD.

Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db
2015-06-22 06:09:38 -07:00
Marco
debe4e920f Reduce max_partition_size for low resolutions at speed 5.
For speed 5 real-time mode, the selection of the partition size for
superblocks on the segment (aq-mode=3) uses the non-rd recursive
pick partition search, and can sometimes select 64x64.

For low resolutions, visually better to limit this to 32x32.

Change-Id: I69657a7ed8899f8b3cf8c9c318a2509c5c72c565
2015-06-19 16:48:16 -07:00
James Zern
c5d779d266 Merge changes I2552d810,I51952c0a,Ib82e4247,I9c8d16cb
* changes:
  vp9_mcomp: make search_step_table static
  vp9_encodeframe: delete auto_partition_range()
  vp9_mcomp: don't mark setup_center_error() inline
  vp9_encoder: hide adjust_image_stat()
2015-06-19 03:31:38 +00:00
James Zern
3edd293dae vp9_pred_common: inline vp9_get_tx_size_context
+ drop 'vp9_' prefix

Change-Id: If3f3ec32d03026af78b8fcd82749e587a3f43059
2015-06-15 18:41:22 -07:00
James Zern
e6add6499f vp9_pred_common: inline vp9_get_segment_id
+ drop 'vp9_' prefix

Change-Id: Id5a3c8d416dbdf93d9f4f1bde662f7b2c2290168
2015-06-15 18:41:14 -07:00
James Zern
5214bd52c8 vp9_encodeframe: make coord_lookup[] static
Change-Id: I19588f9e674c8635b6e58e4633120be736d256a6
2015-06-12 19:47:46 -07:00
James Zern
31509af247 vp9_encodeframe: delete auto_partition_range()
unused since:
1f00a9b Fix choose_partitioning threshold setup for speed -5

Change-Id: I51952c0a1be3e6e0aa36ff2ffcfbbea60a505960
2015-06-12 17:57:37 -07:00
Yunqing Wang
254a4c033c Merge "Allocate tile data adaptively to accommodate the frame size increase" 2015-06-12 15:49:40 +00:00
Yunqing Wang
2c838ede68 Allocate tile data adaptively to accommodate the frame size increase
If the frame size increases, the tile data buffer needs to be
re-allocated according to the number of tiles existing in current
frame. This patch makes the multi-tile encoding work in spatial
SVC usage case, and partially solved WebM issue 1018.

Change-Id: I1ad6f33058cf5ce6f60ed5024455a709ca80c5ad
2015-06-11 11:30:18 -07:00
Scott LaVarnway
42c0b1b1f1 inline vp9_segfeature_active()
and changed name.

Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3
2015-06-11 04:20:55 -07:00
Scott LaVarnway
baaaa57533 Reducing size of MODE_INFO struct
Reduced size from 124 bytes to 104 bytes.  For decode only builds,
it is reduced to 68 bytes.

Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3
2015-06-04 07:32:16 -07:00
Marco
a8c5ab2ca6 Remove ABI check for 1 pass CBR SVC.
Remove the ABI check for the controls needed for SVC 1 pass CBR mode.
Bump up the ABI version.

Change-Id: I35b79ee010e14af83c6d1e801d574deaaa2fc7eb
2015-06-03 17:43:22 -07:00
Marco
c139b81a13 Vidyo patch: Rate control for SVC, 1 pass CBR mode.
-Make Rate control work for SVC 1 pass CBR mode.
-Added temporal layering mode.
-Fixed bug in non-rd variance partition.
-Modified/updated the sample encoders (vp9_spatial_svc_encoder, vpx_temporal_svc_encoder).
-Added datarate unittest(s) for 1 pass CBR SVC.

Change-Id: Ie94b1b68a56ea1267b5087c625e5df04def2ee48
2015-06-02 07:54:13 -07:00
Marco
a49fff632c Non-rd variance partition: Adjust thresholds for 1080p.
Increase the 32x32 split threshold, to allow for more 32x32
at expense of 16x16. Visually looks somewhat better.

Change-Id: Ia1439c3a0dc2d7933468b88bd59266fcd9f03505
2015-05-27 12:30:35 -07:00
Marco
f76d42a98a Refactor set_vbp_thresholds.
Break out the setting of the block variance split thresholds,
since they are locally modified, e.g., based on local/segment qp.

No change in performance.

Change-Id: I0a3238e6dab05140657539fc4bd27ac5ff7a554e
2015-05-27 09:25:18 -07:00
Johann
dee70d355f Merge "Move variance functions to vpx_dsp" 2015-05-26 23:02:11 +00:00
Johann
c3bdffb0a5 Move variance functions to vpx_dsp
subpel functions will be moved in another patch.

Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-26 12:01:52 -07:00
Jingning Han
96dba4902c Fix integral projection motion search for frame resize
This commit fixes the integral projection motion search crash when
frame resize is used. It fixes issue 994.

Change-Id: Ieeb52619121d7444f7d6b3d0cf09415f990d1506
2015-05-22 15:40:45 -07:00
James Zern
700b7fd0a9 vp9/encoder: make some functions static
silences missing prototype warnings

Change-Id: I3338fcaa67b5dcdf6bf237e8b374db3befd18753
2015-05-15 10:43:47 -07:00
Johann
1d7ccd5325 Relocate memory operations for common code
With the sad functions, and hopefully the variance functions soon,
moving to the vpx_dsp location, place the defines used in the
reference C code in a common location.

Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-05-13 11:41:15 -07:00
James Zern
fd3658b0e4 replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.

Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-05-07 11:55:08 -07:00
paulwilkins
aecb1770d5 Merge "Image size restriction to rd auto partition search." 2015-05-07 14:12:14 +00:00
paulwilkins
af76953448 Merge "Remove CONSTRAIN_NEIGHBORING_MIN_MAX." 2015-05-05 09:32:11 +00:00
paulwilkins
4a7dcf8eb2 Image size restriction to rd auto partition search.
Impose a limit on the rd auto partition search based on
the image format. Smaller formats require that the search
includes includes a smaller minimum block size.

This change is intended to mitigate the visual impact of
ringing in some problem clips, for smaller image formats.

Change-Id: Ie039e5f599ee079bbef5d272f3e40e2e27d8f97b
2015-05-01 16:16:02 +01:00
paulwilkins
287b0c6da9 Remove CONSTRAIN_NEIGHBORING_MIN_MAX.
Remove one of the auto partition size cases.
This case can behaves badly in some types of animated content
and was only used for the rd encode path. A subsequent patch
will add additional checks to help further improve visual quality.

Change-Id: I0ebd8da3d45ab8501afa45d7959ced8c2d60ee4e
2015-05-01 15:15:16 +01:00
Yunqing Wang
a257e469e1 Adjust the vbp early termination threshold slightly
Calculated cpi->vbp_threshold_sad from this frame's dequant value.
The encoding quality and speed didn't change much. Borg test
result: PSNR: -0.002%, SSIM: -0.003%.

Change-Id: I97c9826986f39582f29910d637d08a69c90afdee
2015-04-30 08:51:02 -07:00
James Zern
f58011ada5 vpx_mem: remove vpx_memset
vestigial. replace instances with memset() which they already were being
defined to.

Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-04-28 20:00:59 -07:00
James Zern
f274c2199b vpx_mem: remove vpx_memcpy
vestigial. replace instances with memcpy() which they already were being
defined to.

Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c
2015-04-28 19:59:41 -07:00
Yaowu Xu
b3e411e481 Add validation of UV partition size
For color sampling format other than 420, valid partion size in Y may
not work for UV plane. This commit adds validation of UV partition
size before select the partition choice.

This fixes a crash for real time encoding of 422 input.

Change-Id: I1fe3282accfd58625e8b5e6a4c8d2c84199751b6
2015-04-24 12:34:18 -07:00
Scott LaVarnway
8b17f7f4eb Revert "Remove mi_grid_* structures."
(see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6)

For the test clip used, the decoder performance improved by ~2%.
This is also an intermediate step towards adding back the
mode_info streams.

Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d
2015-04-21 11:16:45 -07:00
Marco Paniconi
f76ccce5bc Revert "Revert "Force_split on 16x16 blocks in variance partition.""
This reverts commit 004b9d83e3

Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743
2015-04-16 17:52:13 -07:00
Yunqing Wang
63c5bf2b9c Fix Tsan errors
This patch fixed 2 reported Tsan errors while running VP9 real-time
encoder.

Change-Id: Ib0278fe802852862c3ce87c4a500e544d7089f67
2015-04-15 12:33:39 -07:00
Yunqing Wang
004b9d83e3 Revert "Force_split on 16x16 blocks in variance partition."
This reverts commit eb8c667570.
The patch caused mismatch while using multi-threads.

Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be
2015-04-14 15:19:31 -07:00
Marco
eb8c667570 Force_split on 16x16 blocks in variance partition.
Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.

Also increase variance threshold for 32x32, and add exit condiiton in choose_partition
(with very safe threshold) based on sad used to select reference frame.

Some visual improvement near moving boundaries.
Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%.
Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip.

Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
2015-04-13 12:05:07 -07:00
Jingning Han
2404332c1b Merge "Remove get_nonrd_var_based_fixed_partition function" 2015-04-09 14:45:19 -07:00
Jingning Han
208aa6158b Remove get_nonrd_var_based_fixed_partition function
This function has been replaced by other approaches and is not
in use now.

Change-Id: I387f45b5607d202539e482468ccc70e6c0f9341f
2015-04-09 09:49:55 -07:00
Yunqing Wang
12cb30d4bd Merge "Set vbp thresholds for aq3 boosted blocks" 2015-04-02 18:22:08 -07:00
Yunqing Wang
cae03a7ef5 Set vbp thresholds for aq3 boosted blocks
The vbp thresholds are set seperately for boosted/non-boosted
superblocks according to their segment_id. This way we don't
have to force the boosted blocks to split to 32x32.

Speed 6 RTC set borg test result showed some quality gains.
Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%.
No speed change was observed.

Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a
2015-04-02 15:48:32 -07:00
Marco
77ea408983 Code cleanup: put (8x8/4x4)fill_variance into separate function.
Code cleanup, no change in behavior.

Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24
2015-04-02 13:37:35 -07:00
Yaowu Xu
ba91b54d7c Simplify bsize calculation
Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168
2015-04-01 12:15:06 -07:00
Yunqing Wang
fc98114761 Merge "Rename vbp thresholds" 2015-03-31 16:33:30 -07:00
Yunqing Wang
c28ff1a9de Rename vbp thresholds
Code refactoring

Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79
2015-03-31 15:14:44 -07:00
Alex Converse
02697e35dc Merge "A tiny cyclic refresh / active map fix." 2015-03-24 09:43:24 -07:00
paulwilkins
c0b71cf82f Merge "Experimental rd bias based on source vs recon variance." 2015-03-24 03:12:41 -07:00
Alex Converse
31f1563a92 A tiny cyclic refresh / active map fix.
Change-Id: I198727461455c8c198a0c892d02ed3cb1673aa50
2015-03-23 18:51:00 -07:00
paulwilkins
9a1ce7be7d Experimental rd bias based on source vs recon variance.
This experiment biases the rd decision based on the impact
a mode decision has on the relative spatial complexity of the
reconstruction vs the source.

The aim is to better retain a semblance of texture even if it
is slightly misaligned / wrong, rather than use a simple rd
measure that tends to favor use of a flat predictor if a perfect
match can't be found.

This improves the appearance of texture and visual quality
on specific test clips but is hidden under a flag and currently
off by default pending visual quality testing on a wider Yt set.

Change-Id: Idf6e754a8949bf39ed9d314c6f2daaa20c888aad
2015-03-20 11:57:36 +00:00
Marco
71e6ed7bd1 Adjustments to aq-mode=3.
Factor in segment#2 and skip blocks into the postencode estimated bits,
and increase somewhat the aggressiveness of the refresh.

PSNR/SSIM Metrics on RTC set go up by ~0.8/0.5%.

Change-Id: I5d4e7cb00a3aefb25d18c88b6b24118b72dc5d51
2015-03-18 12:06:16 -07:00
Marco
e52109158a Update to variance partition.
Use force_split to constrain the partition selection.
This is used because in the top-down approach to variance partition,
a block size may be selected even though one of its subblocks may have
high variance.

In this patch the selection of the 64x64 block size will only
be allowed if the variance of all the 32x32 subblocks are also below the threshold.

Stil testing, but some visual improvement for areas near slow moving boundary
can be seen. Metrics for RTC set increase by about ~0.5%.

Change-Id: Iab3e7b19bf70f534236f7a43fd873895a2bb261d
2015-03-17 17:02:47 -07:00
Yunqing Wang
45e8e4a01f Merge "Refactor set vbp thresholds function" 2015-03-17 16:05:53 -07:00
Yunqing Wang
c0423abf00 Refactor set vbp thresholds function
Code refactoring.

Change-Id: I73b6fcc0444155ee46c1efa5253c1d608c6439cb
2015-03-17 12:23:32 -07:00
Adrian Grange
ed6824e449 Remove unused ZBIN_BOOST macros
Change-Id: I5169155b20ea3676a6ce58ec77d6aeba07db29d9
2015-03-17 11:53:58 -07:00
Jingning Han
7cf383d17f Fix indent in choose_partitioning
Change-Id: I4039f8ac75a9cfcc4d07abd0619d1379bb10fe51
2015-03-16 11:01:00 -07:00
Jingning Han
1f00a9b9d5 Fix choose_partitioning threshold setup for speed -5
The compression performance of speed -5 is on average 12.6% better
than speed -6. At lower bit-rates, the gains are typically 20% or
more. For 2-thread encoding, the speed -5 takes about 1.6x time of
speed -6.

Change-Id: If7a73464a24d33e8f49b9533b51ec51c8da7fc80
2015-03-13 17:01:56 -07:00
Marco
e38066a74d Fix crash with vp9 denoiser on.
Crash occured on very first key frame, because denoiser
temporal function was beng entered.

Updated denoiser unittest to set cpu_used from first frame,
and verified fix fixes the crash.

Change-Id: I3be1124b52846fbbe7248d2c3d6136e086c80bc1
2015-03-13 11:10:02 -07:00
Alex Converse
1bfacd3529 Reconcile active_map and cyclic refresh
Change-Id: Id7f8654aeeb20caa402bc822521b1d72c658f4f9
2015-03-12 16:19:49 -07:00
Jingning Han
238b6be24b Prevent integer overflow in choose_partitioning
Re-arrange the multiplication and right shift operations to avoid
integer overflow in choose_partitioning.

Change-Id: Ib4005cafb410a67c1960486471d75b6ebe38c4e0
2015-03-11 16:31:42 -07:00
Jingning Han
1ca4d51b2e Refactor to remove GLOBAL_MOTION
Make the vp9_int_pro_motion_estimation() function return zero
motion vector if high bit depth is turned on, instead of removing
it from compiled codes.

Change-Id: Ia48f010eb590b2d517d5678c394110b326a1a95e
2015-03-11 15:53:15 -07:00
Yaowu Xu
059a473b35 Enable using Golden reference in choose_partition()
Choose_partition uses only the last frame as reference frame in making
partition decision, this commit adds the check on how well Golden
frame with (0,0) predicts the current block, and uses GF(0,0) as
basis for partition decision if it produces better prediction.

The commit improves rtc speed 6 and 7 encoding by 0.14% and 0.19%
respectively.

Change-Id: I156acf925bd6e0b586d48155d1940d27270a3915
2015-03-10 08:57:28 -07:00
Alex Converse
06b59299c8 Don't waste time partitioning skip superblocks.
Force 64x64 partitioning when a whole superblock is SEGMENT_LVL_SKIP. This
drops encode times of screens mostly at rest by 20%.

Change-Id: Ieba554b0b8a0c1679aae784a8bd11f038ab942c3
2015-03-09 11:02:05 -07:00
Yunqing Wang
969dd8f128 Merge "vp9_ethread: fix me consts initialization to support aq_mode=3 encoding" 2015-03-09 09:42:12 -07:00
Jingning Han
d2b6a4cc80 Merge "Move pred_mv assign outside integral projection motion search" 2015-03-09 09:34:26 -07:00
Yunqing Wang
6e0ec0b2d9 vp9_ethread: fix me consts initialization to support aq_mode=3 encoding
While turning on "--aq_mode=3", the quantizers are updated by each
thread. Fixed the me consts initialization function to make sure
that the correct thread data are updated.

Change-Id: Ied27bb7bae76fc3fa2cda4f8c35ac0b46271bef4
2015-03-06 16:31:46 -08:00
Yaowu Xu
0f37601fd7 Merge changes I1b972c94,I9c897d32
* changes:
  Prevent invalid memory access
  Use correct bsize for uv
2015-03-06 07:27:59 -08:00
Yaowu Xu
8cbeb7cf36 Prevent invalid memory access
Change-Id: I1b972c945274254d896d772d859840b2f8211b4f
2015-03-05 14:57:11 -08:00
Jingning Han
fda0410822 Move pred_mv assign outside integral projection motion search
Change-Id: I040b066fdce08e2f05115a22ea808715aa147779
2015-03-05 11:44:10 -08:00
Jingning Han
87bf5203af Merge "Move integral projection motion search to vp9_mcomp.c" 2015-03-05 09:25:16 -08:00
Yaowu Xu
b573fef76d Use correct bsize for uv
Change-Id: I9c897d32af6c3a956bb6f424a74c12737727038a
2015-03-05 08:20:35 -08:00
Adrian Grange
a34a042615 Merge "Make encoder buffer allocation dynamic" 2015-03-04 10:54:10 -08:00
Jingning Han
50c06052e9 Merge "Use SAD value to set chroma cost flag" 2015-03-04 10:47:56 -08:00
Jingning Han
2deecdd5cb Move integral projection motion search to vp9_mcomp.c
Make it a general purpose fast motion estimation function, to be
used in the mode search process.

Change-Id: Ib354cb0e664dc61c30c0b2314297835ee75b157a
2015-03-04 10:30:15 -08:00
Jingning Han
7d8061a44a Use SAD value to set chroma cost flag
This saves an extra 64x64 variance calculation and replaces two
32x32 variance functions with sad functions. The compression
performance change is unnoticeable.

Change-Id: I6d33868695664ec73b56c42945162ae61c484856
2015-03-04 09:46:39 -08:00
Jingning Han
0fe8304d0b Merge "Properly handle the boundary blocks for integral projection search" 2015-03-04 09:01:33 -08:00
Adrian Grange
3807dd82ab Make encoder buffer allocation dynamic
Frame buffers are now allocated dynamically on-demand.

Entries in the reference frame map, cm->ref_frame_map,
may now be set to -1 (INVALID_IDX) to indicate that
there is not a valid reference buffer in that "slot".

All slots in the reference frame map are now initialized
to the empty state (-1) and each buffer is initialized
to have a reference count of 0.

Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582
2015-03-04 07:58:32 -08:00
Deb Mukherjee
87d1a488ed Merge "dc quantizer fix for 32x32 transforms" 2015-03-03 23:23:44 -08:00
Jingning Han
e5fe165840 Properly handle the boundary blocks for integral projection search
Use rectangular block size for integral projection motion estimation
if the the 64x64 block has over half block outside the frame. This
avoids the issue that the motion information of these blocks is
dominated by the extended pixels, instead of the pixels of interest.

Change-Id: I22f4d2bb7f6a20db9b3f5e2e5463a7f4b9d1b737
2015-03-03 16:15:12 -08:00
Deb Mukherjee
6910e92d04 dc quantizer fix for 32x32 transforms
The rounding factor needs to be scaled down by a factor of 2.
Also, the quantized and dequantized coefficients are memset to 0
when dc quantizer is used.

Change-Id: Ifa68bab02addbf1b83d249c5b4cbd5cda796b1cf
2015-03-03 15:58:27 -08:00
Yaowu Xu
47ac3ea0bb Adapt color sensitiviy threshold to luma signal energy
Instead using only a fixed threshold, this commit adapts the threshold
for color sensitivity decision to luma signal energy: chroma channel's
sse is at least 1/6 of that in luma for color sensitivity flag to be
set to active.

This recoups a large portion of the speed loss due to accounting for
chroma component costs in RTC mode decision.

Change-Id: Ie01f747f6037dba6a1d1ed3e10b71a0ef1abc42c
2015-03-03 11:15:13 -08:00
Jingning Han
1790d45252 Use variance metric for integral projection vector match
This commit replaces the SAD with variance as metric for the
integral projection vector match. It improves the search accuracy
in the presence of slight light change. The average speed -6
compression performance for rtc set is improved by 1.7%. No speed
changes are observed for the test clips.

Change-Id: I71c1d27e42de2aa429fb3564e6549bba1c7d6d4d
2015-03-01 10:42:56 -08:00
Jingning Han
73a00d3219 Refactor integral projection based motion estimation
Support variable block size integral projection based motion
estimation.

Change-Id: Iee6d65e44df4480aa13fb7b84b9c91914b89caa1
2015-02-26 14:48:59 -08:00
Jingning Han
3e1d14a6ce Merge "Motion compensated reference refinement" 2015-02-25 12:33:09 -08:00
Jingning Han
4c5a4efc38 Merge "Re-distribute hierarchical vector match pattern" 2015-02-25 10:33:25 -08:00
Jingning Han
b7050c0be3 Motion compensated reference refinement
This commit applies one-step refinement search to the resulting
motion vector of the integral projectiion based motion estimation,
per 64x64 block. It improves the coding performance of speed -6.

pedestrian 1080p 500 kbps
51735 b/f, 36.794 dB, 16044 ms ->
51382 b/f, 36.793 dB, 16282 ms

cloud 1080p 500 kbps
24081 b/f, 37.988 dB, 14016 ms ->
23597 b/f, 38.076 dB, 12774 ms

vidyo1 720p 1000 kbps
16552 b/f, 40.514 dB, 8279 ms ->
16553 b/f, 40.543 dB, 8510 ms

The rtc set compression performance is improved by 0.5%.

Change-Id: I3d09bea2caf58b2a4f3b38aa26fffafcbe9a2c17
2015-02-25 10:32:09 -08:00
Jingning Han
f87e315e1e Re-distribute hierarchical vector match pattern
This commit modifies the hierarchical vector match patter. It
avoids repeated SAD computation at same points. The function
vp9_vector_sad_sse2 is called 12 times per 64x64 block, instead
of 15 times as before. The effective coverage remains the same.

Change-Id: I91ad9d27d40db8963c907d02af84e10702136994
2015-02-24 11:48:38 -08:00
Marco
0187f4b411 Adjustments to cyclic refresh (aq-mode=3).
Target higher delta-qp for big blocks with zero motion,
and for segment#1: avoid 64x64 partition size and force 8x8 tx size.

Metrics on RTC set mostly positive: SSIM up by ~4%, PSRN by ~1.5%.
Doesn't seem to be any change in speed.

Change-Id: I1f68fa3c4f62dab3b90cc58041f05ebb048ae5ac
2015-02-20 08:47:59 -08:00
Jingning Han
216b171d63 Merge "Integral projection based motion estimation" 2015-02-19 15:08:11 -08:00
Jingning Han
ed2dc59c1b Integral projection based motion estimation
This commit introduces a new block match motion estimation
using integral projection measurement. The 2-D block and the nearby
region is projected onto the horizontal and vertical 1-D vectors,
respectively. It then runs vector match, instead of block match,
over the two separate 1-D vectors to locate the motion compensated
reference block.

This process is run per 64x64 block to align the reference before
choosing partitioning in speed 6. The overall CPU cycle cost due
to this additional 64x64 block match (SSE2 version) takes around 2%
at low bit-rate rtc speed 6. When strong motion activities exist in
the video sequence, it substantially improves the partition
selection accuracy, thereby achieving better compression performance
and lower CPU cycles.

The experiments were tested in RTC speed -6 setting:
cloud 1080p 500 kbps
17006 b/f, 37.086 dB, 5386 ms ->
16669 b/f, 37.970 dB, 5085 ms (>0.9dB gain and 6% faster)

pedestrian_area 1080p 500 kbps
53537 b/f, 36.771 dB, 18706 ms ->
51897 b/f, 36.792 dB, 18585 ms (4% bit-rate savings)

blue_sky 1080p 500 kbps
70214 b/f, 33.600 dB, 13979 ms ->
53885 b/f, 33.645 dB, 10878 ms (30% bit-rate savings, 25% faster)

jimred 400 kbps
13380 b/f, 36.014 dB, 5723 ms ->
13377 b/f, 36.087 dB, 5831 ms  (2% bit-rate savings, 2% slower)

Change-Id: Iffdb6ea5b16b77016bfa3dd3904d284168ae649c
2015-02-19 13:47:19 -08:00
Jingning Han
83559e7357 Fix a check condition in nonrd_pick_partition
Change-Id: Ic92fb4b16948f745c218351b24fdafecf9abce3a
2015-02-19 09:54:55 -08:00
Yaowu Xu
ee5d79995e Move computation up to frame level
This is to avoid redo the same calculation repeatly, and also allow
easier adjustments for further experiments.

This commit shall have no effect on quality/compression.

Change-Id: I4460acf5c808ff5518da18d21e002c5da58af857
2015-02-10 15:41:52 -08:00
Jingning Han
ebb4c9e8e7 Fix block partition size in fill_mode_info_sb
This commit fixes the sub block partition size used in
fill_mode_info_sb. Previous implementation effectively disabled
the rectangular block sizes. This commit resolved this issue.

Change-Id: Ic1c383ab0a9a2e7d59e85b388093f1f1f94d1e7f
2015-02-10 08:39:32 -08:00
Yaowu Xu
8b5e665098 Merge "Replace repeated check with single variable" 2015-02-06 09:17:59 -08:00
Yaowu Xu
c905c42ad8 Remove unnecessary initialization
loop_filter_level is always reset in loop_filter_frame() later in
encoder.

Change-Id: I608e03d905a6b23e7d5025ca747e4784c665007e
2015-02-04 13:56:16 -08:00
Yaowu Xu
581aee001e Move tx_mode decision logic into select_tx_mode()
Change-Id: I7f8f78c33eb3f33344b029a27bda320f4d68c577
2015-02-04 13:54:49 -08:00
Yaowu Xu
19451e6d67 Replace repeated check with single variable
Change-Id: I2f6a669bf7c6d9796388ad3f3fa3fc942635c215
2015-02-04 12:59:14 -08:00
Yaowu Xu
a844a778c7 Merge "Adjust partitioning threshold based rtc speed" 2015-02-04 12:52:03 -08:00
Yaowu Xu
3bc0c6576f Merge "Move calls to avoid unnecessary operations" 2015-02-04 12:51:16 -08:00
Yaowu Xu
bdfb5f986e Adjust partitioning threshold based rtc speed
On rtc set:
speed 7 quality improves about 0.5%
speed 8 quality improves about 1.0%

Encoding time for speed 7 changes from 67804ms to 65889ms
Encoding time for speed 8 changes from 58659ms to 56808ms

Change-Id: Iabcfb53012fc1b9f3326cdbc167e5758b8c7ad30
2015-02-04 11:28:39 -08:00
Jingning Han
1b9082ec6b Unify luma and chroma inter predictors in choose_partitioning
Change-Id: I8bfc80f4fffb0892e93d3326394a52d1ee3c0f37
2015-02-04 10:02:57 -08:00
Jingning Han
0c6d3a03e1 Account for chroma component costs in RTC mode decision
This commit allows the encoder to account for additional chroma
plane costs in the mode decision process, if the current block
potentially contains significant color change. It improves the
visual quality at very low bit-rates.

The compression performance of dark720p is improved by 12.39% in
speed 6. For jimred at 150 kbps, the PSNR of V component (red)
increased by 0.2 dB, at the expense of about 5% increase in
encoding time. Note that for sequences where the chroma components
are fairly consistent, the encoding time increase is negligible.

On average the rtc set compression performance is improved by
1.172% in PSNR and 1.920% in SSIM.

Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
2015-02-04 09:45:14 -08:00
Yaowu Xu
02537ebbe4 Move calls to avoid unnecessary operations
Change-Id: I236f7f75ab9a4511d1b52a6a67299b0e844a103e
2015-02-03 17:01:37 -08:00
Jingning Han
ca9c352fc3 Assign 2nd ref frame in choose_partitioning
Avoid the use of uninitialized second reference frame for fetching
reference block.

Change-Id: I9983a0daea829700b3270dc8bf2bcc6d6ea36652
2015-02-03 11:17:51 -08:00
Yaowu Xu
45971abd1d Optimize coef update
1. move the check of search method of USE_TX_8X8 up one level to
avoid operations of build_tree_distributions()
2. count tx used and avoid computaton for coef udpate when one size
is not used at all.

Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c
2015-01-30 10:16:40 -08:00
Yaowu Xu
fe2439703d Merge "move clear_system_state() call before using double" 2015-01-27 12:42:13 -08:00
Marco
1c4a84c6e9 Merge "aq-mode=3: Update to allow for refresh on modes other than zero-mv." 2015-01-26 19:47:13 -08:00
Yaowu Xu
645b7cdf03 move clear_system_state() call before using double
Floating point is used in vp9_convert_qindex_to_q(), so sometime unit
test ActiveMapTest would cause run time error without properly call
to clear_system_state to reset register status.

Change-Id: I181e9395148c44a6ca8b97d6e109bd4a152143c6
2015-01-26 18:41:50 -08:00
Marco
3f1af6e85e aq-mode=3: Update to allow for refresh on modes other than zero-mv.
Add distortion threshold condition to refresh state of a coding block,
and allow for qp adjustment also for some intra modes and non-zero motion modes.

Also some code cleanup (remove unused variables/code).

Change-Id: I735fa2b28bc64f60e0323976b82510577b074203
2015-01-26 16:44:25 -08:00
Yaowu Xu
6d16f6c14c Fix MSVC warnings on conversion from int64 to int
Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025
2015-01-26 10:54:06 -08:00
Marco
0dccb6277c Modify variance partition selection for low resolutions.
For low spatial resolutions: bias partittion selection to smaller block sizes,
and base the variance computation on 4x4 down-sampling.

Also move the threshold computations into the choose_partitioning,
so they are computed once for each sb block.

On low-res clips (RTC_derf) PSNR/SSIMetrics increase by about 4-5%.
No change for resolutions above CIF.

Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
2015-01-22 15:16:55 -08:00
Deb Mukherjee
e7570493b8 Moves inter mode count updates to update_stats
This makes the inter_mode counts update consistent with other symbols.
Also, forward updates should work corerctly now.

Change-Id: Id98be26fd08875162e644bb8f1de6f0918f85396
2015-01-06 16:40:45 -08:00
Paul Wilkins
a88e4e64b1 Merge "Deleted unused #define" 2015-01-06 04:18:20 -08:00
Jingning Han
5486db185c Add bsize check condition in nonrd_use_partition
Check if block size is below 8x8 for rectangular block coding. It
is added to support 4x8 and 8x4 block coding for RTC mode.

Change-Id: I760b328f45b98ae48adc45ed5a39fb643cd8aebd
2015-01-02 10:12:37 -08:00
Jingning Han
dad89d5ca1 Enable sub8x8 inter block search for RTC coding mode
This commit enables sub8x8 inter block coding for RTC mode. The
use of sub8x8 blocks can be turned on by allowing
choose_partitioning function to select 4x4/4x8/8x4 block sizes.

Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a
2014-12-24 17:40:31 -08:00