Commit Graph

6834 Commits

Author SHA1 Message Date
Hui Su
66906da066 Merge "Combine vp9_encode_block_intra and encode_block_intra" 2014-10-30 11:02:31 -07:00
Yunqing Wang
aed48c786a Remove unused speed feature
Partition_check was unused and removed.

Change-Id: I15ec9162d86dc61f04c09229c498629878ed7155
2014-10-29 17:05:04 -07:00
Jingning Han
afa31ab9b8 Merge "Enable mode search threshold update in non-RD coding mode" 2014-10-29 12:42:22 -07:00
Jingning Han
9349a28e80 Enable mode search threshold update in non-RD coding mode
Adaptively adjust the mode thresholds after each mode search round
to skip checking less likely selected modes. Local tests indicate
5% - 10% speed-up in speed -5 and -6. Average coding performance
loss is -1.055%.

speed -5
vidyo1 720p 1000 kbps
16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms

nik 720p 1000 kbps
33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms

speed -6
vidyo1 720p 1000 kbps
16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms

nik 720p 1000 kbps
33271 b/f, 38.433 dB,  7886 ms -> 33279 b/f, 38.416 dB, 7843 ms

Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b
2014-10-29 10:55:34 -07:00
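The timings quoted in the commit above can be converted into per-clip speed-ups with a little arithmetic. This is a minimal sketch that uses only the numbers reported in that message; the helper name `speedup_pct` is hypothetical and not part of libvpx.

```python
def speedup_pct(before_ms, after_ms):
    """Percentage reduction in encode time relative to the baseline run."""
    return 100.0 * (before_ms - after_ms) / before_ms


# (label, time before in ms, time after in ms), copied from the commit message.
timings = [
    ("speed -5 vidyo1", 12607, 11831),
    ("speed -5 nik", 11468, 10919),
    ("speed -6 vidyo1", 10138, 8456),
    ("speed -6 nik", 7886, 7843),
]

for label, before, after in timings:
    print(f"{label}: {speedup_pct(before, after):.1f}% less encode time")
```

Looking at the clips individually shows how much the gain depends on content, which the single "5% - 10%" summary hides.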
Adrian Grange
4074099ed8 Simplify vp9_set_rd_speed_thresholds_sub8x8
Change-Id: I4bf0f9a38697f5aea564a47afd7f02bb8b2888b6
2014-10-29 09:09:46 -07:00
Hui Su
0928da3b6e Combine vp9_encode_block_intra and encode_block_intra
Change-Id: I79091fb677b64892ecca2fb466fde14602d8cdfc
2014-10-28 18:57:01 -07:00
Jingning Han
982dab6050 Merge "Use zero motion vector in choose_partitioning" 2014-10-28 12:00:13 -07:00
JackyChen
50e5c30536 Merge "vp9_denoiser_sse2: refactor the code." 2014-10-28 11:06:05 -07:00
Yaowu Xu
7d7b43b9af Merge "Allow update of golden reference buffer in CBR mode" 2014-10-28 10:48:02 -07:00
JackyChen
99a8dac4de vp9_denoiser_sse2: refactor the code.
Combined vp9_denoiser_8xM_sse2 and vp9_denoiser_4xM_sse2 into one
function vp9_denoiser_NxM_sse2_small and passed the bitexact testing.
Changed the name of the function vp9_denoiser_64_32_16xM_sse2 to
vp9_denoiser_NxM_sse2_big.

Change-Id: Ib22478df585994dd347ebae04202c0b701e7f451
2014-10-28 09:36:58 -07:00
Yaowu Xu
2a506e33b4 Merge "Add a new control of golden frame boost in CBR mode" 2014-10-28 09:32:58 -07:00
Yaowu Xu
e5cd51880e Allow update of golden reference buffer in CBR mode
This commit allows the golden reference frame to be used in VP9 CBR
mode to improve quality. Because VP9 supports up to 8 reference
buffers, it has reference buffers available for this purpose. This
was not possible in VP8, where the golden and alt-ref buffers were
used for temporal scalability in CBR mode in WebRTC.

For frames that update the golden frame, there can be a quality boost.
The amount of allowed bitrate boost can be controlled via the parameter
rc_max_inter_bitrate_pct. The initial value of the boost ratio is
currently based on over_shoot_pct. Further experiments will work
out the adaptation of this boost value.

Change-Id: I0c5f010c8fd8b7b598f69779c1b30e5b2ac30a4d
2014-10-28 09:31:10 -07:00
Paul Wilkins
422d7bc918 Relax maximum Q for extreme overshoot.
Added code to relax the active maximum Q in response
to extreme local overshoot to reduce bandwidth peaks.

The impact is small in metrics terms, but this helps reduce
bandwidth spikes and overall overshoot in a number of
clips in our test sets (especially the YT test set).

In particular this should help prevent very big spikes where a clip
is mainly easy but has a short hard section. In such a case a choice
of maximum Q for the clip as a whole may allow us to hit the overall
target rate but give some extreme spikes. The chunked encoding in YT
mitigates this problem but it can show up where a longer clip is
coded as a single chunk.

Change-Id: I213d09950ccb8489d10adf00fda1e53235b39203
2014-10-28 13:03:06 +00:00
Jingning Han
07436abb86 Use zero motion vector in choose_partitioning
The zero motion vector was effectively used in the subsampled pixel
based variance calculation. This commit makes it directly use zero
mv to generate prediction.

Change-Id: Ica83dc843e9f8da2f89c3ef451e50f16214c0def
2014-10-27 19:38:43 -07:00
Jingning Han
d56b3eb0cf Refactor encoder tile data structure
Make the common tile info as one element in the encoder tile data
struct.

Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be
2014-10-27 19:37:13 -07:00
Yaowu Xu
03a60b78db Add a new control of golden frame boost in CBR mode
0 means that golden boost is off, and uses average frame target rate,
a non-zero number means the percentage of boost over average frame
bitrate is given initially to golden frames in CBR mode.

Change-Id: If4334fe2cc424b65ae0cce27f71b5561bf1e577d
2014-10-27 13:55:18 -07:00
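The boost control described in the commit above is a percentage on top of the average per-frame target. A minimal sketch of that arithmetic follows; the function and parameter names are hypothetical, and the integer rounding is an assumption rather than the encoder's actual behaviour.

```python
def golden_frame_target(avg_frame_bits, gf_boost_pct):
    """Initial bit target for a golden frame in CBR mode.

    gf_boost_pct == 0 turns the boost off, so the golden frame gets the
    average frame target. A non-zero value is the percentage added on top
    of the average frame target.
    """
    if gf_boost_pct < 0:
        raise ValueError("boost percentage must be non-negative")
    # Assumption: integer arithmetic, rounding down.
    return avg_frame_bits * (100 + gf_boost_pct) // 100


print(golden_frame_target(16000, 0))   # boost off
print(golden_frame_target(16000, 50))  # 50% boost over the average
```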
Jingning Han
192010d218 Refactor rtc coding mode to support tile encoding
Use per tile threshold in the prediction mode search process.

Change-Id: I6c74ee5a3b069bb4281002dfe51310911a0756c0
2014-10-27 09:53:46 -07:00
Yaowu Xu
aa2af3ff6e Merge "Add a new control of max bitrate for inter frame" 2014-10-27 08:11:54 -07:00
Jingning Han
ac53c41e64 Merge "Tile based adaptive mode search in RD loop" 2014-10-24 18:44:52 -07:00
James Zern
01900edc40 Merge changes I8a9c9019,Ic7b2faa3,I44d42a50,I3f3a3924,I10747b32,I31b49c9e
* changes:
  add vp9_loop_filter_data_reset
  move LFWorkerData allocation to VP9LfSync
  vp9_loop_filter_frame_mt: remove pbi dependency
  vp9_loop_filter_frame_mt: pass planes directly
  vp9_loop_filter_frame_mt: pass VP9LfSync directly
  vp9: store TileWorkerData allocations separately
2014-10-24 11:43:51 -07:00
Yaowu Xu
636099f7b6 Add a new control of max bitrate for inter frame
Change-Id: I205de3611622cff7f751ea8baf9f82784581730a
2014-10-24 10:19:28 -07:00
Jingning Han
eee201c221 Tile based adaptive mode search in RD loop
Make the spatially adaptive mode search in the rate-distortion
optimization loop independent across tiles. Experiments suggest that
this does not significantly change the coding statistics.

Single tile, speed 3:
pedestrian_area 1080p 1500 kbps
59192 b/f, 40.611 dB, 101689 ms

blue_sky 1080p 1500 kbps
58505 b/f, 36.347 dB, 62458 ms

mobile_cal 720p 1000 kbps
13335 b/f, 35.646 dB, 45655 ms

as compared to 4 column tiles, speed 3:
pedestrian_area 1080p 1500 kbps
59329 b/f, 40.597 dB, 101917 ms

blue_sky 1080p 1500 kbps
58712 b/f, 36.320 dB, 62693 ms

mobile_cal 720p 1000 kbps
13191 b/f, 35.485 dB, 45319 ms

Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325
2014-10-24 10:00:27 -07:00
Paul Wilkins
60d192db04 Merge "Enable dual arf with constant q." 2014-10-24 05:51:25 -07:00
Paul Wilkins
3758650c98 Merge "Move frame re-sizing into the recode loop" 2014-10-24 05:50:39 -07:00
Adrian Grange
65753eeb8a Move frame re-sizing into the recode loop
The point at which frames are scaled to their
coded dimensions is moved into the re-code loop.

This is in preparation for a further patch that
will add logic into the re-code loop to reduce
the coded frame size if the encoder is struggling
to hit the target data rate at the native frame
size.

Change-Id: Ie4131f5ec6fb93148879f6ce96123296442bf2d1
2014-10-23 16:20:57 -07:00
Yaowu Xu
86777f2e1e Merge "Move filter_ref initialization" 2014-10-23 11:20:22 -07:00
James Zern
01483677e5 add vp9_loop_filter_data_reset
Change-Id: I8a9c9019242ec10fa499a78db322221bf96a0275
2014-10-23 19:43:48 +02:00
Yaowu Xu
065809d286 Move filter_ref initialization
To outside the loop to avoid repeating the operations.

Change-Id: I66c1986e98ce0d7594caad3d3b45de655b299bff
2014-10-23 08:27:25 -07:00
Paul Wilkins
8fc3ab774f Enable dual arf with constant q.
Add second level arf Q adjustment when using dual arfs
in constant Q mode.

Previously in constant Q mode enabling dual arf hurt by ~5%
but with this change the average benefit is ~1-1.5% with some
mid range data points up ~10%.

Note however that it still hurts on some clips including
some very low motion show content.

Change-Id: I5b7789a2f42a6127d9e801cc010c20a7113bdd9b
2014-10-23 13:19:31 +01:00
Paul Wilkins
9363425daa Merge "Initialization bug for multi arf." 2014-10-23 02:02:48 -07:00
Jingning Han
41a17f4457 Merge "Allow checking zeromv mode in vp9_pick_inter_mode" 2014-10-22 18:46:20 -07:00
Yunqing Wang
330a6b2756 Merge "vp9_ethread: allocate frame contexts outside VP9_COMMON struct" 2014-10-22 17:10:39 -07:00
Frank Galligan
f271bed671 Merge "Fix Neon convolve profiling" 2014-10-22 15:50:36 -07:00
Yunqing Wang
7c7e4d4eb8 vp9_ethread: allocate frame contexts outside VP9_COMMON struct
This patch allocates frame contexts outside VP9_COMMON. This allows
multiple threads to share the same copy of frame contexts, and
reduces the overhead. It also guarantees the correct update of
these contexts during bitstream packing. This patch doesn't change
the encoding result.

Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353
2014-10-22 15:03:12 -07:00
Yaowu Xu
7c48a295ae Merge "Fix a subtle issue in re-use inter_pred" 2014-10-22 14:53:06 -07:00
Jingning Han
08cdd006e1 Allow checking zeromv mode in vp9_pick_inter_mode
This improves the compression performance of speed -5 by 0.6%. The
speed impact is less than 1%.

Change-Id: Ie77daa561976dfc8b479061e1221bdf428eb0c3b
2014-10-22 14:47:15 -07:00
JackyChen
897500b9ba Merge "vp9_denoiser_sse2.c: improve code style." 2014-10-22 13:52:03 -07:00
Yaowu Xu
3f79359e0a Fix a subtle issue in re-use inter_pred
The initialization of this_mode_pred does not work when the ref_frame
loop ever goes beyond LAST_FRAME. This commit fixes the subtle issue
and allows potentially expanding the loop to test GOLDEN_FRAME.

Change-Id: Ibbd427a22160d1d9eacb8ed0c87f88d6cef9c0f3
2014-10-22 12:06:27 -07:00
JackyChen
5cba6516aa vp9_denoiser_sse2.c: improve code style.
denoiser_sse2.c: fix typos in comment.

Change-Id: Ic0fb102331b0e533c058da3cab1fbc30de9a0070
2014-10-22 10:55:54 -07:00
Frank Galligan
95a568b3a8 Fix Neon convolve profiling
When profiling, gprof can't distinguish between matching labels in
different files.

Change-Id: I56770df212ed314a0d8568071fa8157624ef1e8f
2014-10-22 10:51:53 -07:00
Paul Wilkins
7cd6330ef3 Initialization bug for multi arf.
Moved erroneous reset of cpi->multi_arf_last_grp_enabled.

Change-Id: Ibb0b96f6ed1d5eeb575a3b1c798e0fe2ee651d06
2014-10-22 18:51:07 +01:00
Jingning Han
0e64aa5073 Merge "Refactor rate distortion cost structure in non-RD coding mode" 2014-10-22 08:41:36 -07:00
Yaowu Xu
87665f16f4 Merge "Change speed features for good quality(cpu-used=5)" 2014-10-22 08:40:15 -07:00
Jingning Han
be212d4db3 Refactor rate distortion cost structure in non-RD coding mode
This commit refactors the rate distortion structure used in the
non-RD coding mode and saves a few RDCOST calculations.

Change-Id: I62c3416c300d2c5372f21b96d93a6b633a34ab3a
2014-10-21 17:17:11 -07:00
Hui Su
8947b18fa3 Move the definition of switchable filter numbers into enum
INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and
IF_DIFF_REF_FRAME_ADD_MV.

Change-Id: Ic36c9eb6ccb8ec324d991f7241e42b40b60b1dcb
2014-10-21 15:41:37 -07:00
Yaowu Xu
c30f7e6cc5 Change speed features for good quality(cpu-used=5)
The existing speed features produce horrible encoding results, almost
30% worse than cpu-used=4. This commit adjusts the speed features to
produce relatively reasonable results, within 3%-5% of cpu-used=4.

Change-Id: I0ca6ebafb33024d4a0cbcf04c78a4a00b8dd1ecf
2014-10-21 11:59:12 -07:00
Jingning Han
1ed1dde06d Remove unused copy_partitioning
Change-Id: I75a2a3772ed17e73180eb4f263cc838cae4927b0
2014-10-21 09:47:58 -07:00
Jingning Han
f2c21cfa1e Merge "Remove deprecated constrain_copy_partitioning function" 2014-10-21 09:44:11 -07:00
Jingning Han
55ec7ebca7 Merge "Remove unused sb_has_motion function in vp9_encodeframe.c" 2014-10-21 09:43:55 -07:00
Jingning Han
072d844aff Merge "Remove deprecated use_lastframe_partitioning feature" 2014-10-21 09:43:45 -07:00
Jingning Han
61ff08ef61 Merge "Hybrid partition search for rtc coding mode" 2014-10-21 09:43:35 -07:00
Paul Wilkins
4163da33c2 Merge "Extend --auto-alt-ref so it can enable multi-alt ref." 2014-10-21 07:02:24 -07:00
Paul Wilkins
889c53a507 Merge "Resolve compiler warning." 2014-10-21 07:02:13 -07:00
Jingning Han
abb2fbb10e Remove deprecated constrain_copy_partitioning function
Its functionality has been replaced with choose_partitioning and
threshold based control on split mode check.

Change-Id: Ic9bb321df06b524f5c38ea5874dc6f6a8f93c5e3
2014-10-20 17:08:21 -07:00
Jingning Han
ef53898c48 Remove unused sb_has_motion function in vp9_encodeframe.c
Change-Id: I035fb6aa5c10741b065e27befb097d8087e3c62f
2014-10-20 17:08:11 -07:00
Jingning Han
e62ce79e1a Remove deprecated use_lastframe_partitioning feature
This speed feature has been deprecated in both yt and rtc coding
modes. This commit removes the related operations.

Change-Id: I079c79c6adafe45581af2ebf8b98faebcface1ce
2014-10-20 17:03:38 -07:00
Jingning Han
9f128b3ed9 Hybrid partition search for rtc coding mode
This commit re-designs the recursive partition search scheme in
rtc speed -5. It first checks if the current block is under cyclic
refresh mode. If so, apply recursive partition search. Otherwise,
perform sub-sampled pixel based partition selection. When the
pre-selection finds the partition size should be 32x32 or above,
use the partition size directly. Otherwise, apply partition search
at nearby levels around the preset partition size.

It is enabled in speed -5. The compression performance of rtc
speed -5 is improved by 9.4%. Speed-wise, the run time increases
by 1% to 10%.

nik_720p, 1000 kbps
33220 b/f, 38.977 dB, 10109 ms -> 33200 b/f, 39.119 dB, 10210 ms

vidyo1_720p, 1000 kbps
16536 b/f, 40.495 dB, 10119 ms -> 16536 b/f, 40.827 dB, 11287 ms

Change-Id: I65adba352e3adc03bae50854ddaea1b421653c6c
2014-10-20 13:02:12 -07:00
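The decision flow described in the commit above can be sketched as a small function. This only mirrors the prose of the commit message; the names `Strategy` and `choose_strategy` are hypothetical, and treating the pre-selected size as a square block edge is an assumption.

```python
from enum import Enum


class Strategy(Enum):
    RECURSIVE_SEARCH = "recursive partition search"
    USE_PRESELECTED = "use the pre-selected partition size directly"
    SEARCH_NEARBY = "search levels near the pre-selected size"


def choose_strategy(in_cyclic_refresh, preselected_size):
    """Pick the partition search strategy for one block.

    preselected_size is the block edge (for example 8, 16, 32 or 64)
    suggested by the sub-sampled pixel based pre-selection. It is ignored
    for blocks that are under cyclic refresh.
    """
    if in_cyclic_refresh:
        return Strategy.RECURSIVE_SEARCH
    if preselected_size >= 32:
        return Strategy.USE_PRESELECTED
    return Strategy.SEARCH_NEARBY


print(choose_strategy(False, 16).value)
```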
Yunqing Wang
687c56e802 Merge "SAD32xh and SAD64xh for AVX2" 2014-10-20 12:37:55 -07:00
Yunqing Wang
67c866750c Merge "Remove the dependency in token storing locations" 2014-10-20 08:26:46 -07:00
Paul Wilkins
6f0ae3a2d1 Extend --auto-alt-ref so it can enable multi-alt ref.
Extend the --auto-alt-ref parameter so we can use it to
turn multi-arf on and off from the command line.

For now the range is 0-off, 1-on, 2-multi-arf on.

Rename play_alternate to enable_auto_arf

Change-Id: Id7b64407cfbe76ba0090a83b588a03e22a240386
2014-10-20 16:09:37 +01:00
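The three-value range described in the commit above maps naturally onto two flags. A minimal sketch, assuming that value 2 keeps automatic alt-ref enabled in addition to multi-arf; the function name is hypothetical.

```python
def parse_auto_alt_ref(value):
    """Map the --auto-alt-ref value to (auto_arf_enabled, multi_arf_enabled).

    0 = off, 1 = on, 2 = multi-arf on.
    """
    if value not in (0, 1, 2):
        raise ValueError("--auto-alt-ref expects 0, 1 or 2")
    # Assumption: multi-arf implies that automatic alt-ref is also on.
    return value >= 1, value == 2


for setting in (0, 1, 2):
    print(setting, parse_auto_alt_ref(setting))
```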
Paul Wilkins
9626a0cb62 Resolve compiler warning.
conversion from 'const int64_t' to 'int', possible loss of data.

Change-Id: I471a73bba5d448d9be0ef9cbf1590fa73aa74be1
2014-10-20 12:08:33 +01:00
Paul Wilkins
9c98fb2bab Merge "Alter adjustment of two pass GF/ARF boost with Q." 2014-10-20 03:12:06 -07:00
levytamar82
7045aec00a SAD32xh and SAD64xh for AVX2
All SAD functions that process 32 or more consecutive elements are
optimized for AVX2:
vp9_sad64x64
vp9_sad64x32
vp9_sad32x64
vp9_sad32x32
vp9_sad32x16
vp9_sad64x64_avg
vp9_sad64x32_avg
vp9_sad32x64_avg
vp9_sad32x32_avg
vp9_sad32x16_avg
The functions that appeared as hotspots are vp9_sad32x32 and vp9_sad64x64.
vp9_sad32x32 was optimized by 68% and vp9_sad64x64 by 90%;
together they gave an overall ~2.3% user-level gain.

Change-Id: Iccf86b375a2b54c5fbbe685902ead0c9a561b9fd
2014-10-19 13:59:10 -07:00
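For readers unfamiliar with the metric, SAD is the sum of absolute differences between a source block and a reference block. The scalar sketch below shows what the functions listed in the commit above compute for a width x height block; it is a plain reference loop, not the AVX2 code from the commit, and the function name `sad` is hypothetical.

```python
def sad(src, ref, width, height):
    """Sum of absolute differences over a width x height block.

    src and ref are indexable as block[row][column] and hold pixel values.
    """
    total = 0
    for y in range(height):
        for x in range(width):
            total += abs(src[y][x] - ref[y][x])
    return total


source = [[1, 2], [3, 4]]
reference = [[1, 0], [5, 4]]
print(sad(source, reference, 2, 2))
```

The wide block sizes benefit most from SIMD because each row offers many independent pixel differences that can be computed in one instruction.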
Debargha Mukherjee
6202c75f84 Merge "Add highbitdepth function for vp9_avg_8x8" 2014-10-18 14:37:10 -07:00
Yaowu Xu
06e65269c7 Merge "Remove unused VAR_BASED_FIXED_PARTITION flag" 2014-10-18 13:31:47 -07:00
Yaowu Xu
7bf475926b Merge "Use rate/distortion thresholds to control non-RD partition search" 2014-10-18 13:31:41 -07:00
Peter de Rivaz
73ae6e495c Add highbitdepth function for vp9_avg_8x8
Cherry-picked from https://gerrit.chromium.org/gerrit/#/c/71914/
(a92f987a6b) on highbitdepth branch.

Change-Id: I6903e4e4cb57d90590725c8a1c64c23da7ae65e8
2014-10-17 17:04:37 -07:00
Yunqing Wang
7c4992c466 Remove the dependency in token storing locations
Currently, the tokens for a tile are stored immediately after its
preceding tile, which causes a dependency. This is unnecessary
since we always allocate enough memory for tokens. Removing
the dependency allows token writing to be done in parallel. This
patch doesn't change the encoding result.

Change-Id: I7365a6e5e2c2833eb14377c37e1503c9d0f26543
2014-10-17 14:25:33 -07:00
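The dependency removed by the commit above comes from where each tile's tokens start. A minimal sketch of the two layouts, assuming one shared token buffer and a known worst-case token capacity per tile; the function name and the example numbers are hypothetical.

```python
def start_offsets(sizes):
    """Return the starting offset of each region when regions are laid out back to back."""
    offsets, position = [], 0
    for size in sizes:
        offsets.append(position)
        position += size
    return offsets


# Packed layout: a tile's start depends on how many tokens every earlier
# tile actually produced, so tiles have to be written in order.
actual_token_counts = [300, 120, 450]
print(start_offsets(actual_token_counts))

# Independent layout: starts depend only on the per-tile capacity, which is
# known before encoding, so every tile can be written in parallel.
tile_capacities = [512, 512, 512]
print(start_offsets(tile_capacities))
```

The independent layout costs nothing extra here because, as the commit notes, enough memory for the worst case is already allocated.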
hkuang
b671491ad4 Merge "Correct the logic of ready_for_new_data." 2014-10-17 14:13:44 -07:00
JackyChen
0615a196e2 Merge "vp9_denoiser_sse2.c: solve windows build error." 2014-10-17 11:16:22 -07:00
Jingning Han
3bc94cd2eb Merge "Add init and reset functions for RD_COST struct" 2014-10-17 11:15:19 -07:00
Jingning Han
4a6c99e5fd Merge "Reset rate cost value in rd mode search" 2014-10-17 11:15:03 -07:00
hkuang
e3bf55dedb Correct the logic of ready_for_new_data.
This should be set right after the decoder actually starts to decode a
frame, instead of being set at the end.

Even when the decoder does not have a displayable frame to show and
returns NULL to the application, this should be set too.

Change-Id: If0313a834bc64e3b0f05a84f4459d444d9eab0d8
2014-10-17 10:09:56 -07:00
Jingning Han
94ecfa323f Reset rate cost value in rd mode search
When early termination is triggered, properly reset the rate cost
to an invalid value to avoid a potential ioc issue.

Change-Id: I3444390be2e49a34bb02cf8a74c33d5dbd96d88d
2014-10-17 09:33:59 -07:00
JackyChen
6356d21a47 vp9_denoiser_sse2.c: solve windows build error.
Change-Id: Ib5df91c8580d5dbeb0b3554edc9c2ca906ba4c4d
2014-10-17 09:28:22 -07:00
Jingning Han
e1111fba7e Remove unused VAR_BASED_FIXED_PARTITION flag
Change-Id: I4ce19b7cb1c45fed86e81ee785e787630020fb4f
2014-10-17 09:02:25 -07:00
Paul Wilkins
f0c3da930a Alter adjustment of two pass GF/ARF boost with Q.
Delete gfboost_qadjust() and move Q based adjustment
into calc_frame_boost(). Also remove clamping. Making
the adjustment here means that it influences not just the
boost level but also the selection of the GF/ARF interval.

This change gives a small average gain in PSNR but
larger gains in SSIM, especially for harder std-hd set (1.5%)

Change-Id: I3aa81b8feccaeff93d915e19fb9cf5cd64c86327
2014-10-17 16:37:17 +01:00
James Zern
00f1cf40ed Merge "vp9_denoiser_sse2.c: eliminate gcc warnings" 2014-10-17 03:26:06 -07:00
JackyChen
8514d03402 vp9_denoiser_sse2.c: eliminate gcc warnings
Change-Id: I5f63f48e11e31ea9951223c5b18f42a2471e4560
2014-10-17 11:00:57 +02:00
Jingning Han
4b5f01e12e Merge "Fix an ioc issue in super_block_uvrd" 2014-10-16 12:35:11 -07:00
Jingning Han
ed100c0b00 Fix an ioc issue in super_block_uvrd
This commit fixes an ioc issue that will happen when the cumulative
variables are not in effective use. The fix discards these
redundant additions.

Change-Id: Idbac5bfb989c0cedc5f8a323effce938519b2457
2014-10-16 11:07:39 -07:00
James Zern
e9b8810b4d move LFWorkerData allocation to VP9LfSync
this removes an assumption that worker->data1 would be pointing to a
TileWorkerData allocation.
additionally, within the multi-threaded loopfilter pass VP9LfSync as a
parameter to the worker hook, removing the need for a shadow pointer in
LFWorkerData.

Change-Id: Ic7b2faa34e3eb59dbcb8a7c67f333448fa047c88
2014-10-16 18:55:46 +02:00
James Zern
175c870efa vp9_loop_filter_frame_mt: remove pbi dependency
Change-Id: I44d42a5098305a2d050ce8ff3c76baf7798c48af
2014-10-16 18:55:44 +02:00
James Zern
69f11d2b9d vp9_loop_filter_frame_mt: pass planes directly
one less dependency on pbi

Change-Id: I3f3a392416d3523f4aea6682c3965885baf85197
2014-10-16 18:54:10 +02:00
James Zern
eb3fdfba09 vp9_loop_filter_frame_mt: pass VP9LfSync directly
a step towards removing the pbi dependency

Change-Id: I10747b325e81c172f5e67031ea5159159fc26e91
2014-10-16 17:27:57 +02:00
James Zern
ff3ae42d8c vp9: store TileWorkerData allocations separately
move them from VP9Worker::data[12] to allow the structure to be reused a
bit more naturally by the multi-threaded loopfilter.

Change-Id: I31b49c9e93ca744fd7f6d6ed8696671188fb2c1d
2014-10-16 17:27:57 +02:00
Paul Wilkins
716ae78ce4 Change initialization of static_scene_max_gf_interval.
This removes an unnecessary restriction that causes
a problem (noticed by AWG) when the forced key frame
interval is set to a very small value, such as 10. In this case
we were being forced to code minimal length GF groups.

Change-Id: I76ef5861a09638ff51f61fea02359554184ada53
2014-10-16 16:18:29 +01:00
Minghai Shang
68b550f551 [spatial svc]Another workaround to avoid using prev_mi
We encode an empty invisible frame in front of the base layer frame to
avoid using prev_mi. Since there's a restriction for reference frame
scaling factor, we have to make it smaller and smaller gradually until
its size is 16x16.

Change remerged.

Change-Id: I9efab38bba7da86e056fbe8f663e711c5df38449
2014-10-16 16:09:40 +01:00
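The gradual shrinking described in the commit above can be sketched as a schedule of frame sizes. The commit message does not state the scaling limit, so the factor of 2 per step used here is an assumption, as are the function name and the rounding.

```python
def shrink_schedule(width, height, min_size=16, max_ratio=2):
    """List the successive frame sizes on the way down to min_size x min_size.

    Each step divides both dimensions by at most max_ratio (rounding up),
    so that consecutive frames stay within the assumed scaling limit.
    """
    if max_ratio < 2:
        raise ValueError("max_ratio must be at least 2")
    steps = []
    while width > min_size or height > min_size:
        width = max(min_size, -(-width // max_ratio))    # ceiling division
        height = max(min_size, -(-height // max_ratio))
        steps.append((width, height))
    return steps


print(shrink_schedule(64, 64))
```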
Paul Wilkins
d5130af568 Revert "Move input frame scaling into the recode loop"
This reverts commit 452dc21500.

This change has introduced a significant quality regression on content
with forced key frames (e.g. the YT and yt-hd sets). It is most
noticeable in static content where the kf bits dominate. Here, despite
key frames being apparently coded at the same Q, there is a drop in all
metrics of ~20% (e.g. clXR and BFa0).

Change-Id: Iba14cc61778c0846fa0a59c33c55a9fc49512cb4
2014-10-16 15:54:40 +01:00
Paul Wilkins
468032961d Revert "[spatial svc]Another workaround to avoid using prev_mi"
This reverts commit c113457af9.

Temporary revert to allow clean revert of another commit.

Change-Id: Ia9b7b755e6c48e1b6e383329f121fef175a24b27
2014-10-16 15:52:08 +01:00
Marco
48ea5b7190 Merge "Some updates for Speed 6/VAR_BASED_PARTITION." 2014-10-15 15:57:21 -07:00
Jingning Han
e2612fbd70 Add init and reset functions for RD_COST struct
Change-Id: I2902de7051a883fd22e27a655209233733969cfd
2014-10-15 15:02:06 -07:00
Jingning Han
801b77d48c Merge "Replace copy_partitioning use case with choose_partitioning" 2014-10-15 14:54:52 -07:00
Jingning Han
5e766ccee0 Use rate/distortion thresholds to control non-RD partition search
Compare the estimated rate and distortion to the thresholds scaled
according to the operating block size and determine if further
split partition search will be run. The compression performance of
speed -5 is changed by -0.074%. The encoding speed is 10% - 15%
faster.

vidyo1 720p
16545 b/f, 40.492 dB, 11475 ms -> 16535 b/f, 40.486 dB, 10100 ms

nik 720p
16624 b/f, 36.310 dB, 10071 ms -> 16617 b/f, 36.313 dB, 8346 ms

Change-Id: Ic9197ab5761279ae55d2fb7813b2af0e0db497b8
2014-10-15 13:40:33 -07:00
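The threshold test described in the commit above can be sketched as follows. The commit message does not say how the thresholds are scaled or how the two comparisons are combined, so scaling by pixel count and skipping the split only when both estimates are below their thresholds are assumptions; all names are hypothetical.

```python
def should_search_split(est_rate, est_dist, block_w, block_h,
                        rate_thresh_per_pixel, dist_thresh_per_pixel):
    """Decide whether to continue with the split partition search.

    Thresholds are scaled by the number of pixels in the block (assumption).
    """
    pixels = block_w * block_h
    rate_thresh = rate_thresh_per_pixel * pixels
    dist_thresh = dist_thresh_per_pixel * pixels
    # Assumption: a block that is already cheap and accurate is not split.
    return est_rate > rate_thresh or est_dist > dist_thresh


print(should_search_split(100, 100, 8, 8, 2, 2))
print(should_search_split(200, 100, 8, 8, 2, 2))
```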
Marco
09ea74f194 Some updates for Speed 6/VAR_BASED_PARTITION.
Reduce the intra_cost_penalty for non-rd mode,
and some updates to VAR_BASED_PARTITION.

Visual tests show some improvement at Speed 6, for RTC clips.

Change-Id: If9090daf7aed14906a32d931a538ab544bbca606
2014-10-15 12:06:48 -07:00
Jingning Han
89b8c7a513 Replace copy_partitioning use case with choose_partitioning
This commit replaces the use of copy_partitioning with
choose_partitioning based on the sse of subsampled pixels, which
provides significantly better coding performance and runs at
similar speed, as compared to copy_partitioning. It improves rtc
speed 5 coding performance by 3%.

Change-Id: I52d3682a12dce0147f5e52383a594fc242ca3228
2014-10-15 11:37:20 -07:00
Minghai Shang
c113457af9 [spatial svc]Another workaround to avoid using prev_mi
We encode an empty invisible frame in front of the base layer frame to
avoid using prev_mi. Since there's a restriction for reference frame
scaling factor, we have to make it smaller and smaller gradually until
its size is 16x16.
Change-Id: I60b680314e33a60b4093cafc296465ee18169c19
2014-10-14 16:26:39 -07:00
Adrian Grange
2040bb58fb Merge "Move input frame scaling into the recode loop" 2014-10-14 15:30:42 -07:00
Alex Converse
00a9671bbd Merge "Add a 32-bit friendly sse2 quantizer." 2014-10-14 14:35:02 -07:00
Yunqing Wang
a78fd6a47e Merge "Remove an unneeded function call" 2014-10-14 14:06:23 -07:00
hkuang
cf608110fc Merge "Correct the format." 2014-10-14 13:45:11 -07:00
Yunqing Wang
a614f2288c Remove an unneeded function call
set_tile_limits() is called in vp9_change_config() already.

Change-Id: I91c3a0df2c1c7fd7e71546d8f51fd5b65838a7da
2014-10-14 11:41:37 -07:00
Alex Converse
7497d2fb23 Add a 32-bit friendly sse2 quantizer.
This is based on the 64-bit ssse3 quantizer.

1.1x speedup for screen content at speed 7.

Change-Id: I57d15415ef97c49165954bbe3daaaf9318e37448
2014-10-14 11:37:41 -07:00
hkuang
c1b0d0da0b Correct the format.
Change-Id: I59a53b419adda3a609d50b2a82f5a4a54849752e
2014-10-14 11:35:26 -07:00
Jingning Han
f67e75a6f4 Merge "Refactor super_block_uvrd function to remove goto statement" 2014-10-14 11:33:00 -07:00
hkuang
8fff2db51e Merge "Remove unnecessary local variable." 2014-10-14 11:05:51 -07:00
hkuang
c38a8edf16 Merge "Remove extra line." 2014-10-14 11:05:01 -07:00
Jingning Han
f3a5de816d Refactor super_block_uvrd function to remove goto statement
Use return value 0/1 as indicator of the validity of the rate-
distortion cost.

Change-Id: I6244126fbf03472cebcba4f177a6cd329fae4743
2014-10-14 09:58:11 -07:00
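The pattern described in the commit above, returning 0 or 1 to signal whether the accumulated cost is valid instead of jumping to a shared exit label, can be sketched generically. The function name and the use of `None` to mark an unusable per-plane result are hypothetical.

```python
def sum_plane_costs(plane_costs):
    """Accumulate (rate, dist) pairs across planes.

    Returns (is_valid, rate, dist). is_valid is 0 as soon as one plane has
    no usable cost (marked here with None), and 1 otherwise.
    """
    rate = dist = 0
    for cost in plane_costs:
        if cost is None:
            # Early exit: the caller checks the flag and ignores rate/dist.
            return 0, 0, 0
        rate += cost[0]
        dist += cost[1]
    return 1, rate, dist


print(sum_plane_costs([(10, 100), (5, 40)]))
print(sum_plane_costs([(10, 100), None]))
```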
Adrian Grange
452dc21500 Move input frame scaling into the recode loop
Move the point at which input frames are scaled
into the recode loop. This will allow us to change
the coded frame size dynamically in response
to previous attempts to encode the frame at a
higher resolution.

A following patch will implement a scheme for
resizing the frame in the recode loop.

Change-Id: I6a59c02d6ac1626512edad6de8b60063b79433e6
2014-10-14 09:27:55 -07:00
Jingning Han
d0369d6fd4 Merge "Use speed feature variable in vp9_rd_pick_inter/intra_mode" 2014-10-14 09:10:24 -07:00
Jingning Han
fdf2205558 Merge "Fix vp9_rd_pick_inter/intra function types" 2014-10-14 09:10:11 -07:00
Jingning Han
790a96c94f Merge "Refactor rate distortion cost structure" 2014-10-14 08:58:55 -07:00
Adrian Grange
f7c336aa19 Merge "Remove mi_grid_base_array from VP9_COMMON (unused)" 2014-10-14 07:50:17 -07:00
Paul Wilkins
bd8a6a93aa Merge "Clamp rate error estimate." 2014-10-14 02:40:12 -07:00
Jingning Han
69a09a70e9 Use speed feature variable in vp9_rd_pick_inter/intra_mode
Replace repeated fetch cpi->sf with a local sf pointer.

Change-Id: I5a55bba3e1c41fbdbc6ad5f078d2fa49dd95ee67
2014-10-13 16:15:00 -07:00
Jingning Han
3bdb6bfcee Fix vp9_rd_pick_inter/intra function types
The returned value is not used anywhere, hence changing the function
type into void.

Change-Id: I0ece49ed61e7aab6df01140135503ad41d4ef4a4
2014-10-13 16:00:46 -07:00
hkuang
dd080e89a8 Merge "Use pre increment." 2014-10-13 15:24:57 -07:00
Jingning Han
811cef97c9 Refactor rate distortion cost structure
This commit makes a struct that contains rate value, distortion
value, and the rate-distortion cost. The goal is to provide a
better interface for rate-distortion related operation. It is
first used in rd_pick_partition and saves a few RDCOST calculations.

Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f
2014-10-13 14:27:16 -07:00
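The idea in the commit above, keeping rate, distortion and the combined cost together so the cost is computed once and then reused, can be sketched with a small record type. The names below are hypothetical, and the cost formula is a generic Lagrangian (distortion plus lambda times rate) used for illustration, not the encoder's RDCOST macro.

```python
from dataclasses import dataclass

INVALID_COST = 2**63 - 1  # hypothetical marker for "no valid cost"


@dataclass
class RdCost:
    rate: int = 0
    dist: int = 0
    rdcost: int = 0


def make_rd_cost(lam, rate, dist):
    """Build a record, computing the combined cost exactly once."""
    return RdCost(rate=rate, dist=dist, rdcost=dist + lam * rate)


def invalid_rd_cost():
    """Record that loses every comparison, for example after early termination."""
    return RdCost(rate=INVALID_COST, dist=INVALID_COST, rdcost=INVALID_COST)


def better(a, b):
    """Compare stored costs without recomputing them."""
    return a if a.rdcost <= b.rdcost else b


best = invalid_rd_cost()
for rate, dist in [(10, 100), (20, 90), (5, 130)]:
    best = better(best, make_rd_cost(2, rate, dist))
print(best)
```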
hkuang
c5fd035ce0 Use pre increment.
Change-Id: I016b4e77d8268e189473f4c382603afe1ae1750f
2014-10-13 14:07:03 -07:00
hkuang
c7f9c717de Remove unnecessary local variable.
Change-Id: I7b19b6061cec369825a0a0b7a485ca490dbc12ee
2014-10-13 14:05:42 -07:00
Adrian Grange
83b63d573a Remove mi_grid_base_array from VP9_COMMON (unused)
Change-Id: I4b4764463f5a7cdc01ec004b882c6237466c74b0
2014-10-13 11:54:05 -07:00
Paul Wilkins
6dbb9e4d44 Clamp rate error estimate.
Add back clamp which ensures that the Q adaptation
is turned off when the over_shoot_pct and under_shoot_pct
parameters are set to 100.

Change-Id: Id0161b114d39a3029cd3eb28020caab0c3914922
2014-10-13 18:07:58 +01:00
Paul Wilkins
f7f0eaa581 Add adaptation option for VBR.
Allow min and maxQ to creep when the undershoot
or overshoot exceeds thresholds controlled by the
command line under_shoot_pct and over_shoot_pct
values.

Default is 100%,100% which ~disables adaptation.

Derf results for example undershoot% / overshoot%:-

Head:- Mean abs (%rate error) = 14.4%

This check in:-
25%/25% - Mean abs (%rate error) = 6.7%
                  PSNR hit -1% SSIM -0.1%

5% / 5%  - Mean abs (%rate error) = 2.2%
                 PSNR hit -3.3% SSIM -1.1%

Most of the remaining error and most of the quality hit is
at extreme data rates. The adaptation code still has an
exception for material that is in effect static so that we
don't over adjust and over spend on YT slide show type
content.

(Rebase of If25a2449a415449c150acff23df713e9598d64c9
to resolve an auto-merge error)

Change-Id: Iec4e1613ef0d067454751d8220edb7058dfbd816
2014-10-13 10:16:44 +01:00
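The "creep" of the Q bounds described in the commit above can be sketched as a single adjustment step. This follows only the prose of the commit message: the sign convention for the rate error, the step size and the Q range are all assumptions, and the function name is hypothetical.

```python
def adapt_q_bounds(min_q, max_q, rate_error_pct, under_shoot_pct, over_shoot_pct,
                   step=1, q_floor=0, q_ceiling=255):
    """Let the Q bounds creep when the rate error exceeds a threshold.

    rate_error_pct > 0 means more bits were spent than targeted (overshoot);
    rate_error_pct < 0 means fewer bits were spent (undershoot).
    """
    if rate_error_pct > over_shoot_pct:
        # Overshooting: allow a higher maximum Q so later frames cost less.
        max_q = min(max_q + step, q_ceiling)
    elif -rate_error_pct > under_shoot_pct:
        # Undershooting: allow a lower minimum Q so later frames can spend more.
        min_q = max(min_q - step, q_floor)
    return min_q, max_q


print(adapt_q_bounds(10, 40, 30, 25, 25))    # overshoot beyond 25%
print(adapt_q_bounds(10, 40, 30, 100, 100))  # default thresholds
```

With both thresholds at 100, a rate error has to exceed 100% before either bound moves, which matches the message's note that the default roughly disables adaptation.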
Jingning Han
a62acf3c0a Fix ActiveMapTest valgrind warning
This fixes a valgrind warning in the ActiveMapTest unit test
reported in issue 870.

Change-Id: Idf172ab0244ebefe630c3577e649bc9ba7c43d10
2014-10-11 22:36:58 -07:00
hkuang
dbe91de6d4 Remove extra line.
Change-Id: I5e79c276d8953ae17cd35b2846e6e40660c037c3
2014-10-10 14:59:04 -07:00
Alex Converse
a90255c366 Revert "Add adaptation option for VBR."
This reverts commit 869d4ca519.

This breaks the build via conflict with
e18edd5eb6.

Change-Id: If544b99e367a449452834eb8cce600f58c34ec0d
2014-10-10 11:34:00 -07:00
hkuang
ab4c6efa48 Merge "Optimize the code to set the reference frame right after reading the header." 2014-10-10 10:40:21 -07:00
hkuang
0d94f725e6 Merge "Correct the code format." 2014-10-10 10:01:05 -07:00
Paul Wilkins
169949dd74 Merge "Add adaptation option for VBR." 2014-10-10 09:22:58 -07:00
Yaowu Xu
bdea0055b2 Merge "vp9/choose_partitioning: add missing clear_system_state" 2014-10-10 09:16:19 -07:00
James Zern
a3e1a9291a vp9/choose_partitioning: add missing clear_system_state
set_vt_partitioning does double math

Change-Id: I8e9d73d5c89b937a5326abf04164d24d9d88c5ef
2014-10-10 08:14:46 -07:00
Paul Wilkins
869d4ca519 Add adaptation option for VBR.
Allow min and maxQ to creep when the undershoot
or overshoot exceeds thresholds controlled by the
command line under_shoot_pct and over_shoot_pct
values.

Default is 100%,100% which ~disables adaptation.

Derf results for example undershoot% / overshoot%:-

Head:- Mean abs (%rate error) = 14.4%

This check in:-
25%/25% - Mean abs (%rate error) = 6.7%
                  PSNR hit -1% SSIM -0.1%

5% / 5%  - Mean abs (%rate error) = 2.2%
                 PSNR hit -3.3% SSIM -1.1%

Most of the remaining error and most of the quality hit is
at extreme data rates. The adaptation code still has an
exception for material that is in effect static so that we
don't over adjust and over spend on YT slide show type
content.

Change-Id: If25a2449a415449c150acff23df713e9598d64c9
2014-10-10 12:54:16 +01:00
James Zern
7c6fec672f vp9_avg_intrin_sse2: correct intrinsics include
immintrin.h -> emmintrin.h
fixes build where newer intrinsics are unavailable

Change-Id: I79311b39bfa782fc2abeb45884ecb417050cb9f8
2014-10-10 10:05:47 +02:00
hkuang
effc1a6f56 Correct the code format.
Change-Id: If2de420f8123a4e8bf635dd29205dd74ee174eee
2014-10-09 17:57:45 -07:00
hkuang
3304d4e6ca Optimize the code to set the reference frame right after reading the header.
Change-Id: I495cf4a366e06e3220ed132500b1ba1c8448f708
2014-10-09 16:32:36 -07:00
hkuang
ca27459c1a Merge "Remove unnecessary code." 2014-10-09 15:44:08 -07:00
hkuang
336e255236 Merge "Remove unnecessary scale check in set_ref." 2014-10-09 15:43:31 -07:00
Deb Mukherjee
9a29fdbae7 Merge "Rename highbitdepth functions to use highbd prefix" 2014-10-09 15:39:56 -07:00
hkuang
0e06c8ff36 Remove unnecessary code.
Function will jump to error handler when ref buffer is corrupted.
So "xd->corrupted |= ref_buffer->buf->corrupted;" is useless.

Change-Id: I35353a0637ad0dbb682454e040ef69fa68280bfa
2014-10-09 15:12:12 -07:00
Deb Mukherjee
1929c9b391 Rename highbitdepth functions to use highbd prefix
Uses highbd_ prefix convention consistently.

Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-09 14:40:40 -07:00
hkuang
15a3e5f742 Remove unnecessary scale check in set_ref.
Scale check has been done in read_inter_block_mode_info.

Change-Id: I6c86f93bd579109ed30ff13a04a30e35f5ae6fc5
2014-10-09 12:19:55 -07:00
Jingning Han
112789d4f2 Merge "Remove sub8x8 block index from rd_pick_partition argument" 2014-10-09 11:16:11 -07:00
Deb Mukherjee
3117830af3 Merge "Subpel search cleanups and enhancements" 2014-10-09 11:14:51 -07:00
Alex Converse
9ffbc31367 Merge "Move the high freq coeff check outside store_coding_context" 2014-10-09 11:12:02 -07:00
Jingning Han
6a0d291fce Remove sub8x8 block index from rd_pick_partition argument
This parameter is deprecated. Its function is replaced with
other explicit condition check.

Change-Id: I61337e350ba8ca9eb50382db8b4d4acbf45cb7eb
2014-10-09 09:20:16 -07:00
Yaowu Xu
3af39b9997 Merge "Fix src frame buffer copy and extend" 2014-10-09 07:08:48 -07:00
James Zern
924af1edd8 Merge "set_vt_partitioning: fix type conversion warning" 2014-10-09 03:53:01 -07:00
James Zern
cec763bd97 set_vt_partitioning: fix type conversion warning
double -> int64
+ make threshold_multiplier an int

Change-Id: I6d3607fdf13d670f57c9d9b04a80acb2be1346a0
2014-10-09 11:41:36 +02:00
James Zern
caa0f81914 vp9_rtcd_defs: fix vp9_avg_8x8 declaration
vp9_avg_8x8 does not depend on x86inc, fixes 32-bit OS X build

Change-Id: I709b874ea84bf57c8cdb5ac7d43eecc6b8c1a2dd
2014-10-09 10:44:42 +02:00
Deb Mukherjee
d78dbff09a Subpel search cleanups and enhancements
- Some fixes to surface fit.
- Returns variance function as cost rather than sad in the
  pattern search and diamond search functions. Only
  vp9_pattern_search_sad function used in bigdia search
  uses sad as integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.

Results:
derf [Speed 3]: About +0.036% in coding efficiency without any
discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.

Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
2014-10-08 23:59:43 -07:00