Commit Graph

545 Commits

Author SHA1 Message Date
Deb Mukherjee
5ade423774 Removes conditional statements from band getting
Implements scan order to band map with arrays in both the encoder
and decoder to remove conditional statements.

Encoding seems to be about 1% faster at speed 0, tested on football.
Decoding seems to be about 0.5-1% faster on a set of 25 videos.

Change-Id: Idb233ca0b9e0efd790e30880642e8717e1c5c8dd
2013-11-12 10:13:27 -08:00
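
Note: the idea of replacing a conditional band lookup with an array can be sketched as below. This is only an illustration; the band boundaries, the 4x4 table, and the function names are hypothetical and are not taken from the libvpx source.

```c
#include <stdint.h>

/* Hypothetical band boundaries for a block of 16 coefficients.
 * The real libvpx tables may differ; this only illustrates the idea. */
static int band_from_position_branchy(int pos) {
  if (pos == 0) return 0;
  if (pos <= 2) return 1;
  if (pos <= 5) return 2;
  if (pos <= 9) return 3;
  if (pos <= 12) return 4;
  return 5;
}

/* The same mapping precomputed as a table indexed by scan position,
 * so the hot loop does one load instead of a chain of comparisons. */
static const uint8_t band_table_4x4[16] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

static int band_from_position_table(int pos) {
  return band_table_4x4[pos];
}
```

Both functions return the same band for every position from 0 to 15; the table version trades a few bytes of read-only data for branch-free lookups.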
Dmitry Kovalev
42b1f62085 Removing redundant assignment.
xd->mi_8x8 is assigned inside set_offsets() for each prediction block.

Change-Id: I20e5974a9eaf105e5a04fc7f99b7a93bd50e3d0a
2013-11-11 17:39:43 -08:00
Jingning Han
d8b4c79270 Decouple macroblockd_plane buffer usage
Make the macroblockd_plane contain dynamic buffer pointers instead
of static pointers to the memory space allocated therein. The decoder
uses the buffer allocated in pbi, while the encoder will use a dual
buffer approach for rate-distortion optimization search.

Change-Id: Ie6f24be2dcda35df7c15b4014e5ccf236fb3f76c
2013-11-11 15:26:10 -08:00
Dmitry Kovalev
ec8128e27f Merge "Replacing (raster_block >> tx_size) with (block >> (tx_size << 1))." 2013-11-11 12:27:42 -08:00
Dmitry Kovalev
3aa4b42a35 Merge "Cleaning up read_mv_probs() function." 2013-11-11 11:01:35 -08:00
Yaowu Xu
cae7e0741a [BITSTREAM]Fix row tile mode_info pointer setup
This commit fixes the assignment of the mode_info pointer per tile. It
recognizes tiles in both row and column formats and properly
arranges the use of mode_info.

The bug was first introduced in
I6226456dd11f275fa991e4a7a930549da6675915
https://gerrit.chromium.org/gerrit/#/c/67492/

Change-Id: Ie12cd209f53241513728c461ee3d7b9599ddb860
2013-11-08 22:06:53 -08:00
Yaowu Xu
a596975eb6 Correct a couple of typos
Change-Id: Ic470c6c9ce27b615c9645b9cb0d67526417bc374
2013-11-08 12:43:51 -08:00
Dmitry Kovalev
614effc0f6 Merge "Unifying tile decoding for both direct and inverse tile order." 2013-11-08 10:59:02 -08:00
Dmitry Kovalev
d28f30ef4e Replacing (raster_block >> tx_size) with (block >> (tx_size << 1)).
The new expression is much more logical than the previous one. Surprisingly,
both expressions give exactly the same set of dependent values
-- have_top, have_left, have_right -- in vp9_predict_intra_block.

Change-Id: I63eb1b592b8c37883b3a0dbb1f3daa271e446109
2013-11-07 15:26:57 -08:00
Dmitry Kovalev
672ba3ddf5 Unifying tile decoding for both direct and inverse tile order.
Now tile decoding consists of two stages:
1. Find tile buffer start and its size, put this info into tile_buffers.
2. Decode each tile based on information from tile_buffers.

It seems that stage 1 can also be reused by multithreaded tile decoder.

Change-Id: If0cdaefdd6d10bb41c63561346c9ae4cfac081dd
2013-11-06 18:15:33 -08:00
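
Note: a minimal sketch of the two-stage structure described above, under stated assumptions. The TileBuffer type, the function names, and the layout in which every tile is prefixed by a 4-byte big-endian size are all hypothetical; the actual VP9 bitstream layout and the libvpx code may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where one tile's data starts and how long it is. */
typedef struct {
  const uint8_t *data;
  size_t size;
} TileBuffer;

/* Stage 1: locate every tile. Assumes (for illustration only) that each
 * tile is prefixed by a 4-byte big-endian size. Returns the number of
 * tiles found, or -1 if the data is truncated. */
static int find_tile_buffers(const uint8_t *data, size_t len,
                             TileBuffer *bufs, int max_tiles) {
  int n = 0;
  size_t pos = 0;
  while (pos < len && n < max_tiles) {
    size_t size;
    if (len - pos < 4) return -1;
    size = ((size_t)data[pos] << 24) | ((size_t)data[pos + 1] << 16) |
           ((size_t)data[pos + 2] << 8) | (size_t)data[pos + 3];
    pos += 4;
    if (size > len - pos) return -1;
    bufs[n].data = data + pos;
    bufs[n].size = size;
    pos += size;
    ++n;
  }
  return n;
}

/* Stage 2 stand-in: "decodes" a tile by summing its bytes. Because it only
 * needs a TileBuffer, tiles can be processed in direct order, inverse
 * order, or handed to separate worker threads. */
static unsigned decode_tile(const TileBuffer *buf) {
  unsigned sum = 0;
  size_t i;
  for (i = 0; i < buf->size; ++i) sum += buf->data[i];
  return sum;
}
```

Once stage 1 has filled the array, the order in which stage 2 visits the entries no longer depends on how the sizes were laid out in the stream.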
Dmitry Kovalev
a1dc97beb1 Using pd->dqcoeff instead of pd->qcoeff in the decoder.
It is more logical to store *dequantized* transform coefficients in the
dqcoeff buffer (inside the inverse_transform_block and
decode_coefs functions). Dequantization happens inside the WRITE_COEF_CONTINUE
macro.

The qcoeff buffer should only be used in the encoder for *quantized*
transform coefficients.

Change-Id: Ifd54bef272bbf5311ced6669c4f1079f998af5d7
2013-11-06 16:14:45 -08:00
Dmitry Kovalev
d172201403 Cleaning up read_mv_probs() function.
Change-Id: I7a1e88b5977076e22736c0d6af1d4dcc19d39a6a
2013-11-05 14:43:35 -08:00
Deb Mukherjee
3a833ea38f token_cache changes in decoder
Removes stack allocation of token_cache in the decode_coefs function.

Seems to achieve about 1% decode speed improvement as tested on
25 480p videos.

Change-Id: I8e7eb3361fa09d9654dfad0677a6d606701fdc6e
2013-11-05 09:32:58 -08:00
Dmitry Kovalev
dde8069e57 Splitting partition_probs array into two arrays.
We only update partition_probs for inter frames but they are constant
for key frames. It is not necessary to have constants inside frame
context and copy them every time. This change reduces FRAME_CONTEXT size
by at least 48 bytes.


Change-Id: If70a53be51043f37fe7d113853217937710932a7
2013-11-04 14:26:16 -08:00
Yaowu Xu
a272530bf0 Two optimizations:
1. Reduced the size of the memset based on eob for the 32x32 transform. The reset
of non-zero coefficients should probably move to where they are read in
the inverse transform functions. (TODO)
2. Removed a redundant level of indirection.
vp9_iht4x4_add() checks the transform type and calls vp9_iht4x4_16_add()
for transforms other than DCT_DCT. In this case, the DCT_DCT case
has already been handled here.

Change-Id: Iacbc77da761f0b308df5acea0f20c9add9f33d20
2013-11-01 07:24:07 -07:00
Yaowu Xu
a49e77af50 simplify read_coef_prob()
Change-Id: I529c634db4f81ba5386092c126f53312b1e51b2b
2013-10-31 16:39:08 -07:00
Dmitry Kovalev
47b6030dda Reducing the number of foreach_transformed_block() calls.
The change doesn't affect the bitstream. It changes the order of function
calls and affects how we reconstruct intra- and inter-blocks. The speedup is
about 1-1.5%.

For intra-blocks:
  Before:
    for each transform block read tokens
    for each transform block do prediction
    for each transform block do inverse transform
  Now:
    for each transform block
      read tokens
      do prediction
      do inverse transform

For inter-blocks:
  Before:
    for each transform block read tokens
    for each transform block do inverse transform
  Now:
    for each transform block
      read tokens
      do inverse transform

Change-Id: I12a79bf1aa5a18c351b8010369bd3ff1deae1570
2013-10-31 13:52:08 -07:00
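
Note: the before/after loop structure for intra-blocks can be sketched in C as follows. The Block type and the three stand-in stages are hypothetical placeholders, not the libvpx functions, and the stages here are independent per block so both orderings produce the same result.

```c
#define NUM_BLOCKS 4

/* Hypothetical per-transform-block state. */
typedef struct {
  int tokens;
  int pred;
  int recon;
} Block;

/* Stand-ins for the real stages; each touches only its own block. */
static void read_tokens(Block *b, int index) { b->tokens = index + 1; }
static void predict(Block *b) { b->pred = 10; }
static void inverse_transform(Block *b) { b->recon = b->pred + b->tokens; }

/* Before: three separate passes over all transform blocks. */
static void reconstruct_three_passes(Block *blocks) {
  int i;
  for (i = 0; i < NUM_BLOCKS; ++i) read_tokens(&blocks[i], i);
  for (i = 0; i < NUM_BLOCKS; ++i) predict(&blocks[i]);
  for (i = 0; i < NUM_BLOCKS; ++i) inverse_transform(&blocks[i]);
}

/* Now: one pass that runs all stages for a block before moving on. */
static void reconstruct_single_pass(Block *blocks) {
  int i;
  for (i = 0; i < NUM_BLOCKS; ++i) {
    read_tokens(&blocks[i], i);
    predict(&blocks[i]);
    inverse_transform(&blocks[i]);
  }
}
```

The single-pass form walks the block list once and keeps each block's data hot while all of its stages run.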
Dmitry Kovalev
ca39a00822 Merge "Reducing the number of recursive calls." 2013-10-30 15:14:18 -07:00
Dmitry Kovalev
6761872e49 Replacing (SWITCHABLE_FILTERS + 1) with SWITCHABLE_FILTER_CONTEXTS.
Change-Id: I9781a62bc1a4cd9176554d1271d87dbcafda9cb0
2013-10-30 14:40:34 -07:00
Dmitry Kovalev
2901bf2d00 Reducing the number of recursive calls.
Both decode_modes_sb and decode_modes_b had conditions to return immediately
at the beginning. Eliminating these conditions and calling
these functions only when there is real work to do. Also unrolling the loop for
PARTITION_SPLIT.

Change-Id: I2fc41cb74ac491f045a2f04fe68d30ff4aaa555d
2013-10-30 12:17:05 -07:00
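
Note: a rough sketch of moving an early-return check from the callee to the caller in a recursive quad-tree walk. The Walk type and both functions are hypothetical and only count calls; they are not the libvpx decode_modes_sb or decode_modes_b.

```c
/* Hypothetical traversal state: frame size in units plus counters. */
typedef struct {
  int rows;
  int cols;
  int calls;
  int visited;
} Walk;

/* Before: every call starts by checking whether it is out of bounds. */
static void visit_checked(Walk *w, int row, int col, int size) {
  int h;
  w->calls++;
  if (row >= w->rows || col >= w->cols) return;
  if (size == 1) {
    w->visited++;
    return;
  }
  h = size / 2;
  visit_checked(w, row, col, h);
  visit_checked(w, row, col + h, h);
  visit_checked(w, row + h, col, h);
  visit_checked(w, row + h, col + h, h);
}

/* After: the caller guarantees (row, col) is inside the frame, and the
 * four-way split is unrolled so out-of-frame quadrants are never called. */
static void visit_unchecked(Walk *w, int row, int col, int size) {
  int h, has_rows, has_cols;
  w->calls++;
  if (size == 1) {
    w->visited++;
    return;
  }
  h = size / 2;
  has_rows = row + h < w->rows;
  has_cols = col + h < w->cols;
  visit_unchecked(w, row, col, h);
  if (has_cols) visit_unchecked(w, row, col + h, h);
  if (has_rows) visit_unchecked(w, row + h, col, h);
  if (has_rows && has_cols) visit_unchecked(w, row + h, col + h, h);
}
```

Both versions visit the same units, but the second makes fewer calls whenever the frame does not fill the whole square.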
James Zern
54c2854fe2 vp9/decode: align tile worker data allocation
fixes a crash in assembly on 32-bit linux/windows

Change-Id: I0c27e6c0ece9732b5eb2ee5b59ff42c3c8016c50
2013-10-30 08:33:09 +01:00
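
Note: one common way to obtain aligned storage in portable C is to over-allocate and round the pointer up, as sketched below. The helper names are hypothetical and this is not the allocator libvpx uses; it only illustrates why alignment matters for data that SIMD assembly will touch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns a block of at least `size` bytes whose address is a multiple of
 * `align`. `align` must be a power of two and at least sizeof(void *).
 * The original malloc pointer is stashed just before the returned block. */
static void *aligned_malloc(size_t size, size_t align) {
  uintptr_t addr;
  void *raw = malloc(size + align - 1 + sizeof(void *));
  if (raw == NULL) return NULL;
  addr = (uintptr_t)raw + sizeof(void *);
  addr = (addr + align - 1) & ~(uintptr_t)(align - 1);
  ((void **)addr)[-1] = raw;
  return (void *)addr;
}

/* Releases a block obtained from aligned_malloc(). */
static void aligned_free(void *ptr) {
  if (ptr != NULL) free(((void **)ptr)[-1]);
}
```

A plain malloc() only guarantees alignment suitable for standard types, which on 32-bit targets can be smaller than what 16-byte aligned vector loads require.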
Johann
2a67a34f4a Merge "vp9_decodframe.c: use vpx_memset instead of cast" 2013-10-29 18:40:18 -07:00
James Zern
ce053e7006 Merge "vp9: add multi-threaded tile decoder" 2013-10-29 17:44:22 -07:00
James Zern
3b47e05908 Merge "vp9/decode: add get_tile()" 2013-10-29 17:34:56 -07:00
James Zern
fb484524bd vp9: add multi-threaded tile decoder
tiles are decoded in parallel within a single frame

Change-Id: I7aca87cb1c239b74eceef72bdc9f672faebac373
2013-10-30 01:00:20 +01:00
James Zern
6b00202f1b vp9/decode: add get_tile()
factors out the code in decode_tiles(). reading the offsets backwards
wasn't doing anything to prove tile independence

Change-Id: I0395d3c77205852ebdc55efedc68291e93cef85c
2013-10-30 01:00:07 +01:00
Johann
dc799a875b vp9_decodframe.c: use vpx_memset instead of cast
Fix warning with -Wstrict-aliasing=1

Change-Id: Idfac09be1ab328923883e63436577f1018c895b8
2013-10-29 13:52:48 -07:00
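
Note: a small sketch of the general pattern, assuming the warning came from clearing bytes through a pointer cast to a wider type. The Context type and function name are hypothetical, and standard memset() stands in for vpx_memset.

```c
#include <string.h>

/* Hypothetical context holding a small byte array. */
typedef struct {
  unsigned char above[16];
} Context;

static void clear_context(Context *ctx) {
  /* A cast such as `*(uint64_t *)ctx->above = 0;` would access the byte
   * array through an incompatible type, which -Wstrict-aliasing reports.
   * memset() clears the same bytes without any type punning. */
  memset(ctx->above, 0, sizeof(ctx->above));
}
```

Compilers typically turn a small fixed-size memset into the same few stores the cast was meant to produce.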
Dmitry Kovalev
156de9c3ef Correct handling of show_bit in uncompressed header.
The "keyframe" variable in the current code actually means that the previous
frame is a keyframe because cm->frame_type has not been initialized
in read_uncompressed_header.

Change-Id: I5645b0816c70abdef5dfc70113018d06276dac77
2013-10-29 11:24:08 -07:00
James Zern
7795c1911e Merge "vp9_decode_frame: group assignments/setup calls" 2013-10-29 03:34:10 -07:00
James Zern
d39f279daa vp9_decode_frame: group assignments/setup calls
group error checking at the top followed by allocations, setup then
decode.

Change-Id: I877d21326bb767885520511ecea70e5fd1e28054
2013-10-29 11:03:50 +01:00
Dmitry Kovalev
19cf72eddc Adding {read, write}_partition() instead of check_bsize_coverage().
Making partition read/write logic more clear.

Change-Id: I1981e90327257d37095567c62d72a103cda1da33
2013-10-28 15:14:45 -07:00
James Zern
58a0f6dbdd vp9: add TileInfo
replaces use of cur_tile_mi_(row|col)_(start|end) by VP9_COMMON, making
it less stateful and more reusable for parallel tile decoding

Change-Id: I1df09382b4567a0e5f4434825d47c79afe2399be
2013-10-28 20:54:43 +01:00
James Zern
f0eabfd432 vp9_decodframe: limit scope of private function params (2)
replace VP9D_COMP usage with the (slightly) more targeted
VP9_COMMON/MACROBLOCKD structures.

Change-Id: Ifdd9034f44d69eb94e232dd03c922de763b96a30
2013-10-28 20:48:59 +01:00
James Zern
3ffa41aae3 Merge changes If9b16f7d,I75aab21c,I9cbb768c,If5cea3d3,I96940657,I025595d8,Ie0bc3935,I3ebb172d
* changes:
  vp9: remove partition+entropy contexts from common
  vp9: add above/left_context to MACROBLOCKD
  vp9: add above/left_seg_context to MACROBLOCKD
  vp9: add above/left_context to encoder
  vp9: add above/left_seg_context to encoder
  vp9: pass entropy context directly to set_skip_context
  vp9: pass context directly to partition functions
  vp9/decode: add alloc_tile_storage()
2013-10-28 12:45:11 -07:00
James Zern
e571d3badc vp9: add above/left_context to MACROBLOCKD
Change-Id: I75aab21c1692cbad717564cbb436578fddbc348d
2013-10-28 11:34:18 +01:00
James Zern
d9a317c8b2 vp9: add above/left_seg_context to MACROBLOCKD
Change-Id: I9cbb768c5f857a096cf6c29d6755d0e5e6728435
2013-10-28 11:32:16 +01:00
James Zern
8f177bb0b6 vp9 decode: defer loop filter allocation
wait until do_loopfilter_inline is true before committing the resources

Change-Id: I01661bd40599b47362bb3fb534668471f2a9d8d7
2013-10-26 11:57:44 +02:00
Dmitry Kovalev
07502f1963 Merge "Adding get_frame_new_buffer() function to replace duplicated code." 2013-10-25 15:25:13 -07:00
James Zern
d2bf696ee0 vp9: pass entropy context directly to set_skip_context
this will allow for separate storage to be used in tile decoding

Change-Id: I025595d83118bdc82a545dae69bc6602e8d2a6e3
2013-10-25 22:01:13 +02:00
James Zern
88d79eabdc vp9: pass context directly to partition functions
update_partition_context / partition_plane_context: this will allow for
separate storage to be used in tile decoding

Change-Id: Ie0bc393531ab7e9d2ce35c95111849b294aad4ed
2013-10-25 22:01:13 +02:00
James Zern
71097d9cf2 vp9/decode: add alloc_tile_storage()
Change-Id: I3ebb172d4f2ae7db73b72fb42eb93833a295fb55
2013-10-25 22:01:13 +02:00
Dmitry Kovalev
d5ac877f7f Adding COLOR_SPACE enum.
Change-Id: If5711eb166609cce0a88b3cb5b56b3afeebc4fb0
2013-10-25 12:35:20 -07:00
Dmitry Kovalev
237ce8724a Adding get_frame_new_buffer() function to replace duplicated code.
Change-Id: I6e0e19231a48364c1de7dfab730b121ab227f111
2013-10-24 12:20:35 -07:00
Dmitry Kovalev
dfc7945d1e Adding get_frame_ref_buffer() function + cleanup.
Change-Id: Ib9ead216fc54b2df6f6f1fe82d2ea137197beebd
2013-10-24 11:05:35 -07:00
Dmitry Kovalev
4a59def9b4 Merge "Eliminating usage of allow_comp_inter_inter in the decoder." 2013-10-24 10:09:37 -07:00
Dmitry Kovalev
710ca1fe36 Merge changes I1868fb75,I9ff504c6
* changes:
  Renaming INTERPOLATIONFILTERTYPE to INTERPOLATION_TYPE.
  Adding VP9_FRAME_MARKER constant.
2013-10-24 10:08:19 -07:00
Yunqing Wang
93ec31dff6 Merge "Improve scale_factors struct" 2013-10-24 09:13:41 -07:00
Dmitry Kovalev
ad867fe237 Renaming INTERPOLATIONFILTERTYPE to INTERPOLATION_TYPE.
Change-Id: I1868fb75ed88bfa65c1c2ca24677d65f2894d713
2013-10-23 17:45:52 -07:00
Dmitry Kovalev
a53075f7c5 Adding VP9_FRAME_MARKER constant.
Also renaming SYNC_CODE_* to VP9_SYNC_CODE_*.

Change-Id: I9ff504c6ebce6cd6673d7df2085d597b818f5960
2013-10-23 17:24:17 -07:00
Dmitry Kovalev
4d88b3837b Eliminating usage of allow_comp_inter_inter in the decoder.
Splitting setup_inter_inter function into is_compound_prediction_allowed
and setup_compound_prediction. Moving setup_compound_prediction call
into read_comp_pred from read_uncompressed_header.

We should do the same in the encoder as well.

Change-Id: I40d75fdc4a221b2f7705df00d23a4b3fe79987c3
2013-10-23 14:18:09 -07:00