Rework the inter mode transform block decoding loop. Replace the
block index with the row and col index as the input argument. It
saves function call to compute the row and col index according to
the block index and overall block size, and many if statements
associated with the transform block position relative to the coding
block. For the test bit-stream pedestrian_area 1080p at 5 Mbps,
the decoding speed goes up from 81.13 fps to 81.92 fps.
Note that the intra coded block decoding needs more refactoring
work than the inter ones. So keep it using foreach_transforme_block
as for now.
Change-Id: I5622bdae7be28ed5af96693274057f55ba9b4fb4