allocations within vp9_alloc_context_buffers() rely on mi_rows/mi_cols
individually, use those to determine whether to realloc rather than
stride and stride * rows. this fixes a crash with some fuzzed files for
invalid accesses into last_frame_seg_map and above_context.
Change-Id: I7b9f40dcf170d443890f3bd2acd285507943c7d4
proceeding using a corrupt (incompletely decoded) frame reference may
lead to incorrect assumptions about allocation sizes leading to a crash.
Change-Id: I76e74f2e1be127c2e2c7e1174bb3307497dfd23d
Adds config parameter vp9_highbitdepth, to support highbitdepth profiles.
Also includes most vpx level high bit-depth functions. However
encode/decode in the highbitdepth profiles will not work until
the rest of the code is in place.
Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6
store the number of allocated rows in VP9LfSync, the calculated values
can not be relied on when dealing with corrupt material.
Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
this change is proactive: the loop filter expects valid input and may
produce undefined results / crash in other cases.
Change-Id: I6cc1e966062a91cbc6db981c87cd03d9129fc8fe
attempting to decode a frame after the previous frame failed has the
potential of interrupting an earlier loop filter task
Change-Id: I6f2b1ddcdf5b89c3e2ee8caf5289dada2a087d66
if the first frame was corrupt and loop filter not called, the next call
would assume the necessary allocations had been done and segfault when
accessing a NULL pointer
Change-Id: Ib6ef505e5c594e6f0fe65ab0700172bcf06b92a6
The case where frame width increases but the overall memory
size required to hold the mi arrays does not was not
handled.
Change-Id: I72e70b912a7d1766687ad682979f1c9ee124449b
1. Clean the code for encode frame tests
2. Add encode w/ and w/o alt reference frame test
3. Add encode SNR layers test
4. Add encode multiple layers but decode partial layers test
Change-Id: Ibd2c9bc02525db584a6f931a98405f2d851b3cd6
The test to determine if the mode info buffers need
to be resized when the frame size changes was
incorrect, as per bug 837.
By storing the size of the allocated data structure,
a simple test determines whether to allocate more
memory when the frame size changes.
Change-Id: I1544698f2882cf958fc672485614f2f46e9719bd
Replaced encoder and decoder functions to get a pointer
to a reference frame with a common function, vp9_get_ref_frame,
and simplified it.
Change-Id: Icb206fcce8caace3bfd1db3dbfa318dde79043ee
Specifies the bit-depth, color sampling and colorspace
for intra only frames for profiles > 0
Also adds checks to ensure that profile 1 and 3 are
exclusively used for non 420 streams.
Change-Id: Icfb15fa1acccbce8f757c78fa8a2f60591360745
The original implementation only allocates one segmentation map and this
works fine for serial decode. But for frame parallel decode, each thread
need to have its own segmentation map and the last frame segmentation map
should be provided from last frame decoding thread.
After finishing decoding a frame, thread need to serve the old segmentation
map that associate with the previous decoded frame. The thread also need to
use another segmentation map for decoding the current frame.
Change-Id: I442ddff36b5de9cb8a7eb59e225744c78f4492d8
The issue was introduced by commit g9f37d14 with adding explicit
restrictions on reference-frame scale factors. The restriction
is checked against aligned-by-8 frame dimensions, not against
original ones. So, for example, frame of 35×35 actually can refer
to frame of 70×70, but the new check won't allow this. It will
compare 35 vs 72 (not 70), so 2x downscale limit will be exceeded.
Change-Id: Ic663693034440f64ac8312cbff9e1e773a921060
A previous change, https://gerrit.chromium.org/gerrit/#/c/70632,
introduced a size validation for reference frames to insuare the
input stream is a valid VP9 stream. However, the logic requiring
all reference frames have valid size turned out to be too strict.
In this commit, we modify the validation to require one of the
reference frame has valid dimension. In addition, the decoder
reports error whenever it detects the use of reference frame
with invalid scalig ratio.
Change-Id: If8efc312244087556cfe00f1fcbdff811268ebad
The patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
changed the test that determined whether the context
frame buffers needed to be reallocated or not.
The code checked for a change in total frame area
to signal the need to reallocate context buffers.
However, the above_context buffer needs to be
resized i:xf only the width of the frame has increased.
Change-Id: Ib89d75651af252908144cf662578d84f16cf30e6
For gcc, when libvpx config option debug is disabled, added the
flag -DNDEBUG to disable the assertions in libvpx for some speedup.
Change-Id: Ifcb7b9e8ef5cbe5d07a24407b53b9a2923f596ee
This patch adds back in code that checks that the frame
size lies within defined bounds was inadvertantly removed
by a previous patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
Change-Id: If526570ba559260c4b7e98098bc75f7700ae7f97
Separates HBD profile int two profiles (2 and 3) consistent with the
highbitdepth branch. This patch is ported from the original highbitdepth
branch patch: https://gerrit.chromium.org/gerrit/#/c/70460/
Two of the invalid file tests needed to be updated.
Change-Id: I6a4acd2f7a60b1fb4cbcc8e0dad4eab4248431e3
This patch is the first step toward simplifying the
frame buffer handling.
The final goal is to have a common frame buffer handling
framework for both encoder and decoder that incorporates
the existing ability to use externally allocated memory.
Change-Id: I2c378a4f54a39908915f46c4260e17a080db7ff1
This is a practical concern to allow us to fail in a decoder instance
if the size of a file is bigger than we can reasonably handle.
Change-Id: I0446b5502b1f8a48408107648ff2a8d187dca393
The issue was introduced by commit g7c43fb6. If current frame
is repeated from existing-ref pool, frame buffer ref counter
is not decreased, so buffer isn't released. Decoder fails being
unable to allocate new frame buffer at some point.
Added a test vector to verify that the condition will not
recur later. Test vector was generated by the code in this patch:
https://gerrit.chromium.org/gerrit/#/c/70862/
Change-Id: I8af96eb5b9670176e01a281d2e18bd458712cf78
Prepare for frame parallel decoding, the reference count buffers
need to be protected by mutex. Move vp9_thread.* to common
folder so that those buffers could use cross-platform mutex
from vp9_thread.*.
(cherry picked from commit 337e8015c9)
Change-Id: I0587a08447925f4554d7788686a31483c2ae3f37
Also fix bugs related with corrupted frame handling.
Return VPX_CODEC_CORRUPT_FRAME when getting corrupted
block.
Change-Id: I7207ccc7c68c4df2b40b561315d16e49ccf7ff41
Refactoring to remove some duplication of probability
tables between tokenization and detokenization.
Change-Id: I2fc6a6497f9c0410021a9b41f828bc58a864e466
This patch fixes bug 633:
https://code.google.com/p/webm/issues/detail?id=633
The first decoded frame does not have to be a keyframe,
it could be an inter-frame that is coded intra-only.
This patch fixes the handling of intra-only frames.
A test vector has also been added that encodes 3
intra-only frames at the start of the clip. The
test vector was generated using the code in the
following patch:
https://gerrit.chromium.org/gerrit/#/c/70680/
Change-Id: Ib40b1dbf91aae2bc047e23c626eaef09d1860147
Prepare for frame parallel decoding, the reference count buffers
need to be protected by mutex. Move vp9_thread.* to common
folder so that those buffers could use cross-platform mutex
from vp9_thread.*.
Change-Id: I541277cf15eefed6641555944f67f4a0bcdc8154
Prepare for frame parallel decoding, the frame buffers must be
separated from the encoder and decoder structure, while the encoder
and decoder will hold the pointer of the BufferPool.
Change-Id: I172c78f876e41fb5aea11be5f632adadf2a6f466
pull the latest from WebP, which adds a worker interface abstraction
allowing an application to override init/reset/sync/launch/execute/end
this has the side effect of removing a harmless, but annoying, TSan
warning.
Original source:
http://git.chromium.org/webm/libwebp.git
100644 blob 08ad4e1fecba302bf1247645e84a7d2779956bc3 src/utils/thread.c
100644 blob 7bd451b124ae3b81596abfbcc823e3cb129d3a38 src/utils/thread.h
Local modifications:
- s/WebP/VP9/g
- camelcase functions -> lower with _'s
- associate '*' with the variable, not the type
Change-Id: I875ac5a74ed873cbcb19a3a100b5e0ca6fcd9aed
The caller should reset the state instead of letting worker
to reset.
This reverts commit 34b2ce15f9.
Change-Id: Idb546ea6386cffc44e98dee772900d21ab79710f
This fixes the hang in VP9/InvalidFileTest.ReturnCode/3
due to worker->had_error has not been reset after getting
error.
Change-Id: Ia3608225094758a2bd88f6ae4dd9dfd93bbaad27
This reverts commit b336356198.
This causes a hang in:
VP9/InvalidFileTest.ReturnCode/3
the change to test/user_priv_test.cc remains with a minor update
Change-Id: I4a8a272ca37ea329b0f413f0b1cd827a238bd9fd
This patch checks that a decoder never tries to reference frame that's
outside the range of 2x to 1/16th the size of this frame. Any attempt
to do so causes a failure.
Change-Id: I5c98fa7bb95ac4f29146f29dd92b62fe96164e4c
This patch reverts the previous revert from Jim and also add a
variable user_priv in the FrameWorker to save the user_priv
passed from the application. In the decoder_get_frame function,
the user_priv will be binded with the img. This change is needed
or it will fail the unit test added here:
https://gerrit.chromium.org/gerrit/#/c/70610/
This reverts commit 9be46e4565.
Change-Id: I376d9a12ee196faffdf3c792b59e6137c56132c1
the max is 6. there are assumptions throughout the decode regarding
this; fixes a crash with a fuzzed bitstream
$ zzuf -s 5861 -r 0.01:0.05 -b 6- \
< vp90-2-00-quantizer-00.webm.ivf \
| dd of=invalid-vp90-2-00-quantizer-00.webm.ivf.s5861_r01-05_b6-.ivf \
bs=1 count=81883
Change-Id: I6af41bb34252e88bc156a4c27c80d505d45f5642
Avoids failures:
MSE_ClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0
MSE_ClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0
MSE_ExternalClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0
MSE_ExternalClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0
MSE_ExternalClearKeyDecryptOnly/EncryptedMediaTest.Playback_VP9Video_WebM/0
MSE_ExternalClearKeyDecryptOnly_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0
SRC_ExternalClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0
SRC_ExternalClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0
SRC_ClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0
Patches are
This reverts commit 9bc040859b
This reverts commit 6f5aba069a
This reverts commit 9bc040859b
I1f250441 Revert "Refactor the vp9_get_frame code for frame parallel."
Ibfdddce5 Revert "Delay decreasing reference count in frame-parallel decoding."
I00ce6771 Revert "Introduce FrameWorker for decoding."
Need better testing in libvpx for these commits
Change-Id: Ifa1f279b0cabf4b47c051ec26018f9301c1e130e
When decoding in serial mode, there will be only
one FrameWorker doing decoding. When decoding in
parallel mode, there will be several FrameWorkers
doing decoding in parallel.
Change-Id: If53fc5c49c7a0bf5e773f1ce7008b8a62fdae257
See: https://code.google.com/p/chromium/issues/detail?id=362697
The code properly catches an invalid stream but seg faults instead of
returning an error due to a buffer not having been initialized. This
code fixes that.
Change-Id: I695595e742cb08807e1dfb2f00bc097b3eae3a9b
The current decoding scheme will decrease the reference count
of the output frame when finish decoding. Then the application
could copy the frame from the decoder buffer to application buffer.
In frame-parallel decoding, a decoded frame will not be outputted
until several frames later which depends on thread numbers. So
the decoded frame's reference count should be decreased only
after application finish copying the frame out. But due to the
limitation of vpx_codec_get_frame, decoder could not know when
application finish decoding. So use a index last_show_frame to
release the last output frame's reference count.
Change-Id: I403ee0d01148ac1182e5a2d87cf7dcc302b51e63
The final goal is eventually to get rid of both itxm_add and fwd_txm4x4.
This patch does it in the decoder.
Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5
The current decode_tiles decodes the frame one tile by one tile
and then loopfilter the whole frame or use another worker thread to
do loopfiltering.
|------|------|------|------|
|Tile1-|Tile2-|Tile3-|Tile4-|
|------|------|------|------|
For example, if a tile video has one row and four cols, decode_tiles
will decode the Tile1, then Tile2, then Tile3, then Tile4.
And during decode each tile, decode_tile will decode row by row in
each tile.
For frame parallel decoding, decode_tiles will decode video in row order
across the tiles. So the order will be:
"Decode 1st row of Tile1" -> "Decode 1st row of Tile2"
-> "Decode 1st row of Tile3" -> "Decode 1st row of Tile4"
-> "Decode 2nd row of Tile1" -> "Decode 2nd row of Tile2"
-> "Decode 2nd row of Tile3" -> "Decode 2nd row of Tile4"-> "loopfilter 1st row"
Change-Id: I2211f9adc6d142fbf411d491031203cb8a6dbf6b
Inline loopfilter has been already handled in vp9_decode_frame().
Collecting all similar code in one place now.
Change-Id: I358a0280fc7c2b27cca520bc1e8c16c4eb6491dd
Fixes the idecoder in the case where:
cm->error_resilient_mode == 0, and
cm->frame_parallel_decoding_mode == 0, but
new_fb->corrupted == 1.
The assert in debug_check_frame_counts fails to
take into account the case of a corrupt frame.
Change-Id: Idf318a68458cc88d65d6f3f408a10d8ffe87e43f
We only used two members from that struct: max_threads and inv_tile_order.
Moving them directly to VP9Decoder struct.
Change-Id: If696a4e5b5b41868a55f3cc971e1d7c1dd9d5f69
Adds some high-level hooks for profile 2 before further
progress on the implementation.
According to the definitiion in this patch:
1. Profile 2 only supports 10 or 12 bit color but not 8
2. Profile 2 supports all color sampling modes: 444, 422 and 420,
and alpha plane.
3. Profile 3 is currently undefined.
Please consider the definition carefully and suggest modifications
to the definition as needed.
Change-Id: I5b284fc679e54ac5aee171af72fa7994cfd28995
There was a bug with the decoder that if you started the decoder
with more threads than the first frame had tile columns. Afterwards
tried to decode a frame with more tile columns than the first frame,
the decoder would hang. E.g. run vpxdec --threads=4. The first frame
had two tile columns, then the next key frame had 4 tile columns, the
decoder would hang. If you started with 4 tiles and switched to 2
tiles the decoder would be fine. The issue is that the worker the thread
loop is using is stale.
I added a test vector "vp90-2-14-resize-848x480-1280x720.webm" that
exhibited the bug.
Change-Id: I7bdd47241a52ac0fe1c693a609bc779257e94229
Reimplementing sub8x8-reading of intra block modes in
read_intra_frame_mode_info() and read_intra_block_mode_info(). Code looks
more readable as well.
Change-Id: Ia42fc7d0dad708bc0c7a8bff1f8b37809b843f40
The function has evolved over time, now only calls vp9_rtcd(), so this
commit removes the function and changes to call vp9_rtcd() directly.
Change-Id: I8cfa6190daa4b28f6f3d1e11bb3a07f9c95322bf
If show_existing_frame indicates that the decoder should
display an existing (previously decoded) frame, add a
check to make sure that the signaled buffer does contain
a valid decoded frame.
Change-Id: Iac8c686b321827414d69a3f2d0467566911bcba2
Prior to this commit, both encoder and decoder reset mode/mv info from
previous frame in error resilient mode to ensure bitstreams are able to
decode when there is loss of frame in decoder side. However, this is
not necessary. This commit changed to remove the reset, so encoder can
continue to use mode/mv/partition information from previously encoded
frame without affecting decodeablilty under loss of frame.
Change-Id: I0279f862900dc647fb471ae3389770bb1b9f454f
This CL changes libvpx to call a function when a frame buffer
is needed for decode. Libvpx will call a release callback when
no other frames reference the frame buffer. This CL adds a
default implementation of the frame buffer callbacks. Currently
only VP9 is supported. A future CL will add support for
applications to supply their own frame buffer callbacks.
Change-Id: I1405a320118f1cdd95f80c670d52b085a62cb10d
this ensures both are properly initialized when calling _dealloc().
+ check the arrays before access
Change-Id: I789af39b41c271b5cb3c029526581b4d9903b895
This avoids calls to get_unsigned_bits() with constants and
replaces hard to trace loops with simpler structures.
Change-Id: Ic1afc5a17d7df5bcfc85b76efda316b0bf118467
As pointed out by Dmitry and James, "partial" is a Microsoft-
specific c++ keyword, and it is renamed.
Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
Implemented parallel loopfiltering, which uses existing tile-
decoding threads. Each thread works on one row, and when that row
is loopfiltered, it moves to next unattended row. To ensure the
correct filtering order, threads are synchronized and one
superblock is filtered only if the superblocks it depends on are
filtered already.
To reduce synchronization overhead and speed up the decoder, we use
nsync > 1 for high resolution.
Performance tests:
1. on desktop:
8-tile 4k video using 8 threads, speedup: 70% - 80%
4-tile HD video using 4 threads, speedup: ~35%
2. on mobile device(Nexus 7):
4-tile 1080p video using 4 threads, speedup: 18% - 25%
4-tile 1080p video using 2 threads, speedup: 10% - 15%
Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
When showing a previously decoded frame, i.e. when
show_existing_frame=1, the update of the
last_show_frame flag must be disabled.
This is to ensure that the last_show_frame flag
reflects the state of the flag for the immediately
previously decoded frame rather then the value that
was forced to ensure that a previously decoded frame
would be displayed.
This patch also adds a test vector to verify that the
display_existing_frame flag works as expected. Code
for generating the test vector can be found in this
patch:
https://gerrit.chromium.org/gerrit/#/c/68581/
(Bug originally reported by Alexander Voronov
<ru.xalba@gmail.com>).
Change-Id: I731d288fba02088959f7fcc87707137fffc6acf5
Encoder's boarder is still 160, while decoder's boarder will be 32.
With on demand and separate boarder buffer for boarder extension.
The decoder's boarder does not need to to 160 anymore.
Change-Id: I93d5aaff15a33a2213e9761eaa37c5f2870747db
When showing a previously decoded frame, we need to
explicitly set the show_frame flag.
For the current frame being decoded this flag is
explicitly set in the frame header.
This should fix WebM Issue 696:
http://code.google.com/p/webm/issues/detail?id=696
Change-Id: I5751a809813f88d2ca6f62c47c3878475ff9ba8d
This commit removes the use of best_mv in the decoding process. This
variable can be replaced with nearest_mv. It saves a few cycles on
assigning the values for best_mv.
Change-Id: Ic183f9c1fb615c54efd7e6ccfedcf09d493435e4
Adding RefBuffer to simplify reference buffer management. The struct has a
pointer to image data and scale factors relative to the current frame.
Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e
Moving back to scale_factors struct. We don't need anymore x_offset_q4 and
y_offset_q4 because both values are calculated locally inside vp9_scale_mv
function.
Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b
Guard against incorrect size values moving *data past data_end.
Check read length against the difference of the buffers.
Change-Id: Ie0b54e2db517fd41a0f3ceb23402ee44839a4739
reorder the tiles based on size and their presumed complexity. this
minimizes the cases where the main thread is waiting on a worker to
complete.
Change-Id: Ie80642c6a1d64ece884f41683d23a3708ab38e0c
- Disable mode info update in case where current frame is coded
as "show existing frame".
- Should fix issue 676.
Change-Id: Ibee681850eb307f982da6528d3e31cb94f881c08
The old code would start in a mixed state, where all the reference
frames were pointing to frame buffer 0, but the reference counts
were 0. This is why we needed special code for the first frame.
Change-Id: I734961012917654ff8c0c8b317aac00ab75ded1a
In the decoder we don't need to save eobs, we can pass eob as an argument.
That's why removing eob arrays from VP9Decompressor and TileWorkerData,
and moving eob pointer from macroblockd_plane to macroblock_plane.
Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
The difference with the old code is that originally the whole token_cache
was initialized with zeros at the beginning of decode_coefs() function.
Now we set several zero values explicitly with "token_cache[scan[c]] = 0".
Change-Id: I88cc5031f01d13012d1a4491739c36cb44f9401e
Removing goto and using while loop instead, renaming seg_eob to max_eob,
moving eob token counter increment.
Change-Id: Idcc4b3a45e4f313596a71776aef56691a6647e5f
We only need qcoeff buffers in the encoder. Reducing TileWorkerData struct
and VP9Decompressor struct sizes by 24K.
Change-Id: Id148868461f7ffa3d3dd634b371503ae9c57e207
Renaming treed_read() to consistent vp9_read_tree() and moving it from
deleted vp9_treereader.h to vp9_dboolhuff.h file.
Change-Id: Iedd8655acbe25e4fcf62b79e5a13bdea69b6b004
The decoder will construct inter predictor using lazy border extension,
while the encoder, going with multiple runs of motion search in the rate-
distortion optimization loop for each block, does border extension at
frame level. This commit makes separate the inter predictors for encoder
and decoder, respectively.
Change-Id: Ieca2fecba3a7201a6d64ef9f219e5d91e50559c3
This commit takes out vp9_extend_frame_borders from
vp9_setup_scale_factors.
The refactoring is for the preparation of the use of lazy border
extension at decoder. This makes it necessary to handle border
extension separately at encoder/decoder. The use of
vp9_extend_frame_borders will be removed, when lazy border extension
is ready.
Change-Id: Ia3baba3d179d5f11eee1634f19b3b319d2a59186
Since they used in encoder only. This commit also re-order includes
for the files that include vp9_extend.h
Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
Simplifies the code by implementing band mapping with static arrays.
A lot of the code complexity introduced in a previous patch
disappears.
Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
Implements scan order to band map with arrays in both the encoder
and decoder to remove conditional statements.
Encoding seems to be about 1% faster at speed 0, tested on football.
Decoding seems to be about 0.5-1% faster on a set of 25 videos.
Change-Id: Idb233ca0b9e0efd790e30880642e8717e1c5c8dd
Make the macroblockd_plane contain dynamic buffer pointers instead
static pointers to the memory space allocated therein. The decoder
uses the buffer allocated in pbi, while encoder will use a dual
buffer approach for rate-distortion optimization search.
Change-Id: Ie6f24be2dcda35df7c15b4014e5ccf236fb3f76c
This commit fixes the assignment of mode_info pointer per tile. It
makes recognition of tiles in both row and column formats and properly
arrange the use of mode_info.
The bug was first introduced in
I6226456dd11f275fa991e4a7a930549da6675915
https://gerrit.chromium.org/gerrit/#/c/67492/
Change-Id: Ie12cd209f53241513728c461ee3d7b9599ddb860
The new expression is much more logical than previous one. Surprisingly
both expressions give exactly the same set of dependent values
-- have_top, have_left, have_right -- in vp9_predict_intra_block.
Change-Id: I63eb1b592b8c37883b3a0dbb1f3daa271e446109
Now tile decoding consists of two stages:
1. Find tile buffer start and its size, put this info into tile_buffers.
2. Decode each tile based on information from tile_buffers.
It seems that stage 1 can also be reused by multithreaded tile decoder.
Change-Id: If0cdaefdd6d10bb41c63561346c9ae4cfac081dd
It is more logical to use dqcoeff buffer to put there *dequantized*
transform coefficients (inside inverse_transform_block and
decode_coefs functions). Dequantization happens inside WRITE_COEF_CONTINUE
macro.
qcoeff buffer should be only used in the encoder for *quantized*
transform coefficients.
Change-Id: Ifd54bef272bbf5311ced6669c4f1079f998af5d7
Removing special case handling from vp9_tree_probs_from_distribution(),
tree_merge_probs(), and vp9_tokens_from_tree_offset() functions. Replacing
inter_mode_offset() function with macro INTER_OFFSET which is used now for
vp9_inter_mode_tree definition.
Change-Id: Iff75a1499d460beb949ece543389c8754deaf178
Removes stack-alocation of token_cache in decode_coefs function
Seems to achieve about 1% decode speed improvement as tested on
25 480p videos.
Change-Id: I8e7eb3361fa09d9654dfad0677a6d606701fdc6e
We only update partition_probs for inter frames but they are constant
for key frames. It is not necessary to have constants inside frame
context and copy them every time. This change reduces FRAME_CONTEXT size
by at least 48 bytes.
Change-Id: If70a53be51043f37fe7d113853217937710932a7
1. Reduced the size memset based on eob for 32x32 transform. The reset
of non-zero coefficient should probably go into where they are read in
inverse transform functions. (TODO)
2. Removed a redundant level of indirection.
vp9_iht4x4_add() checks transform type and call vp9_iht4x4_16_add()
for tranforms other than DCT_DCT. In this case, the DCT_DCT case
has been already handled here.
Change-Id: Iacbc77da761f0b308df5acea0f20c9add9f33d20
The change doesn't affect the bitstream. It changes the order or function
calls and affects how we reconstruct intra- and inter-blocks. Speed up is
about 1...1.5%.
For intra-blocks:
Before:
for each transform block read tokens
for each transform block do prediction
for each transform block do inverse transform
Now:
for each transform block
read tokens
do prediction
do inverse transform
For inter-blocks:
Before:
for each transform block read tokens
for each transform block do inverse transform
Now:
for each transform block
read tokens
do inverse transform
Change-Id: I12a79bf1aa5a18c351b8010369bd3ff1deae1570
Both decode_modes_sb and decode_modes_b had conditions to immediately
return at the beginning. Eliminating these conditions here and calling
these functions only to do a real work. Also unrolling loop for
PARTITION_SPLIT.
Change-Id: I2fc41cb74ac491f045a2f04fe68d30ff4aaa555d
factorizes the code in decode_tiles(). reading the offsets backwards
wasn't doing anything to prove tile independence
Change-Id: I0395d3c77205852ebdc55efedc68291e93cef85c
"keyframe" variable in the current code actually means that previous
frame is a keyframe because cm->frame_type has not been initialized
in read_uncompressed_header.
Change-Id: I5645b0816c70abdef5dfc70113018d06276dac77
replaces use of cur_tile_mi_(row|col)_(start|end) by VP9_COMMON, making
it less stateful and more reusable for parallel tile decoding
Change-Id: I1df09382b4567a0e5f4434825d47c79afe2399be