Fix decoder resolution change with tiles

The decoder had a bug: if it was started with more threads than the
first frame had tile columns, and a later frame had more tile columns
than the first frame, the decoder would hang. E.g. run vpxdec
--threads=4 on a stream whose first frame has two tile columns and
whose next key frame has 4 tile columns, and the decoder would hang.
If you started with 4 tiles and switched to 2 tiles, the decoder would
be fine. The issue is that the worker used by the thread loop is
stale.

I added a test vector "vp90-2-14-resize-848x480-1280x720.webm" that
exhibited the bug.

Change-Id: I7bdd47241a52ac0fe1c693a609bc779257e94229
Frank Galligan
2014-04-06 20:07:14 -07:00
parent 9848d67bb3
commit 6ae58931d6
6 changed files with 56 additions and 11 deletions

@@ -139,6 +139,8 @@ void vp9_loop_filter_frame_mt(VP9D_COMP *pbi,
                               int y_only, int partial_frame) {
   // Number of superblock rows and cols
   const int sb_rows = mi_cols_aligned_to_sb(cm->mi_rows) >> MI_BLOCK_SIZE_LOG2;
+  const int tile_cols = 1 << cm->log2_tile_cols;
+  const int num_workers = MIN(pbi->oxcf.max_threads & ~1, tile_cols);
   int i;
   // Allocate memory used in thread synchronization.
@@ -168,7 +170,16 @@ void vp9_loop_filter_frame_mt(VP9D_COMP *pbi,
              sizeof(*pbi->lf_row_sync.cur_sb_col) * sb_rows);
   // Set up loopfilter thread data.
-  for (i = 0; i < pbi->num_tile_workers; ++i) {
+  // The decoder is using num_workers instead of pbi->num_tile_workers
+  // because it has been observed that using more threads on the
+  // loopfilter, than there are tile columns in the frame will hurt
+  // performance on Android. This is because the system will only
+  // schedule the tile decode workers on cores equal to the number
+  // of tile columns. Then if the decoder tries to use more threads for the
+  // loopfilter, it will hurt performance because of contention. If the
+  // multithreading code changes in the future then the number of workers
+  // used by the loopfilter should be revisited.
+  for (i = 0; i < num_workers; ++i) {
     VP9Worker *const worker = &pbi->tile_workers[i];
     TileWorkerData *const tile_data = (TileWorkerData*)worker->data1;
     LFWorkerData *const lf_data = &tile_data->lfdata;
@@ -184,10 +195,10 @@ void vp9_loop_filter_frame_mt(VP9D_COMP *pbi,
     lf_data->y_only = y_only;   // always do all planes in decoder
     lf_data->lf_sync = &pbi->lf_row_sync;
-    lf_data->num_lf_workers = pbi->num_tile_workers;
+    lf_data->num_lf_workers = num_workers;
     // Start loopfiltering
-    if (i == pbi->num_tile_workers - 1) {
+    if (i == num_workers - 1) {
       vp9_worker_execute(worker);
     } else {
       vp9_worker_launch(worker);
@@ -195,7 +206,7 @@ void vp9_loop_filter_frame_mt(VP9D_COMP *pbi,
   }
   // Wait till all rows are finished
-  for (i = 0; i < pbi->num_tile_workers; ++i) {
+  for (i = 0; i < num_workers; ++i) {
     vp9_worker_sync(&pbi->tile_workers[i]);
   }
 }
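
For illustration only (not part of this commit): the worker-count expression introduced above can be looked at in isolation. The sketch below assumes a hypothetical helper name, lf_num_workers, and a locally defined MIN macro. It mirrors the arithmetic in the diff and nothing else from the decoder.

```c
#include <stdio.h>

/* Local stand-in for the MIN macro used in the diff (assumption). */
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical helper mirroring the expression in the diff:
 * max_threads is rounded down to an even number, and the result is
 * capped by the number of tile columns (1 << log2_tile_cols). */
static int lf_num_workers(int max_threads, int log2_tile_cols) {
  const int tile_cols = 1 << log2_tile_cols;
  return MIN(max_threads & ~1, tile_cols);
}
```

With 4 threads, a frame with two tile columns (log2_tile_cols = 1) gives 2 loopfilter workers and a frame with four tile columns (log2_tile_cols = 2) gives 4. An odd thread count such as 3 is rounded down to 2 before the cap is applied.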