This patch allocated frame contexts outside VP9_COMMON. This allows
multiple threads to share the same copy of frame contexts, and
reduces the overhead. It also guarantees the correct update of
these contexts during bitstream packing. This patch doesn't change
encoding result.
Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353
Refactoring to remove some duplication of probability
tables between tokenization and detokenization.
Change-Id: I2fc6a6497f9c0410021a9b41f828bc58a864e466
Simplifies the code by implementing band mapping with static arrays.
A lot of the code complexity introduced in a previous patch
disappears.
Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
Removing special case handling from vp9_tree_probs_from_distribution(),
tree_merge_probs(), and vp9_tokens_from_tree_offset() functions. Replacing
inter_mode_offset() function with macro INTER_OFFSET which is used now for
vp9_inter_mode_tree definition.
Change-Id: Iff75a1499d460beb949ece543389c8754deaf178
We don't have to calculate 'new' probability in convert_distribution()
because it is enough to calculate only 'new' counters which could be used
to calculate probability if necessary. That's why removing a lot of unused
temporary probability arrays and reducing number of get_binary_prob()
calls.
Change-Id: I4e14eb7203d1ace61bbddefd6b9b6326be83ba63
Updated the encoder to handle frames that are coded
intra-only. Intra-only frames must be non-showable,
that is, the "show frame" flag must be set to 0 in
the frame header.
Tested by forcing the ARF frames to be coded intra-
only.
Note: The rate control code will need to be modified
to account for intra-only frames better than they
are currently handled.
Change-Id: I6a9dd5337deddcecc599d3a44a7431909ed21079
Now we have entropy code separate from scan/iscan code. The next step
in future is to move iscan code from common part to the encoder.
Change-Id: Id9732f7d80aec00af35c1d58d1137c4c96c91451
Fixes a warning on MSVS 2012 where the alignment of vp9_default_iscan_8x8
didn't match between its declaration and definition.
Change-Id: I1466a15635f4b22594d705d570b7e399bfb6cf21
Removing unused constants, macros, and function declarations. Using
ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving
#include from *.h to *.c. Merging for loops for motion vectors.
Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13
Adding common merge_probs and merge_probs2 functions. Changing ints to
usigned ints in some places.
Change-Id: Icf088ffdea7cf5b95284a128916409bdd53506b0
Counts are separate from frame context. We have several frame contexts but
need only one copy of all counts.
Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d
This should significantly speedup cost_coeffs(). Basically what the
patch does is to make the neighbour arrays padded by one item to
prevent an eob check in get_coef_context(), then it populates each
col/row scan and left/top edge coefficient with two times the same
neighbour - this prevents a single/double context branch in
get_coef_context(). Lastly, it populates neighbour arrays in pixel
order (rather than scan order), so we don't have to dereference the
scantable to get the correct neighbours.
Total encoding time of first 50 frames of bus (speed 0) at 1500kbps
goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase.
Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56
Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
goes 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is
x86-64 only, it needs some minor modifications to be 32bit compatible,
because it uses 15 xmm registers, whereas 32bit only has 8.
Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
Makes cost_coeffs() a lot faster:
4x4: 236 -> 181 cycles
8x8: 888 -> 588 cycles
16x16: 3550 -> 2483 cycles
32x32: 17392 -> 12010 cycles
Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes
from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup.
Change-Id: I16b8d595946393c8dc661599550b3f37f5718896