Removed an adaptive rate correction factor that was having
a negative impact on quality in many clips. This factor
was influencing the Q range available to each frame
independently of the bits allocated to each.
Average results with DISABLE_RC_LONG_TERM_MEM:
derf +0.199, -0.059
yt +3.957, +3.798
std hd +1.577, +2.140
yt hd +4.127, +4.513
Average results without DISABLE_RC_LONG_TERM_MEM:
derf -0.628, -0.665
yt +3.432, +3.015
std hd -0.105, +0.153
yt hd +3.432, +3.015
Change-Id: I45bab6b606f49a442e7b27a6d631f3ffd843bbce
Includes various cleanups.
Streamlines the interfaces so that all rate control state
updates happen in the vp9_rc_postencode_update() function.
This will hopefully make it easier to support multiple
rate control schemes.
Removes some unnecessary code, which in rare cases can cause
a difference in the constrained quality mode output, but
other than that there is no bitstream change yet.
Change-Id: I3198cc37249932feea1e3691c0b2650e7b0c22fc
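As an illustration of the "all state updates in one post-encode function"
idea from the commit above, here is a minimal sketch. RcState and
rc_postencode_update() are hypothetical names with made-up fields; they are
not libvpx's actual rate control struct or vp9_rc_postencode_update()
signature.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified rate control state.
 * Not the real libvpx structure. */
typedef struct {
  int64_t total_target_bits;
  int64_t total_actual_bits;
  int64_t buffer_level;   /* grows when frames come in under target */
  int last_q;
  int frames_encoded;
} RcState;

/* The only place where rate control state changes after a frame is
 * encoded. Callers never touch the fields directly, so a different
 * rate control scheme could swap in its own update function. */
void rc_postencode_update(RcState *rc, int target_bits, int actual_bits,
                          int q) {
  rc->total_target_bits += target_bits;
  rc->total_actual_bits += actual_bits;
  rc->buffer_level += (int64_t)target_bits - (int64_t)actual_bits;
  rc->last_q = q;
  rc->frames_encoded++;
}
```

With a single entry point, a reviewer can audit every state transition in
one function instead of following updates scattered across the encoder.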
Removed calls to vp9_update_mode_info_border since
they immediately followed code that initialized the
entire buffer to 0.
Change-Id: Ife06794daa20439a0b607a83a87f88df59afac40
Both single-frame and compound inter motion search run on the luma
component only, so the block size mapping there is removed.
Change-Id: I217488e702432ae9fa0e95bf6f516ebb36b5c79b
The old code would start in a mixed state, where all the reference
frames were pointing to frame buffer 0, but the reference counts
were 0. This is why we needed special code for the first frame.
Change-Id: I734961012917654ff8c0c8b317aac00ab75ded1a
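To make the "mixed state" described above concrete, the sketch below shows
one way to start in a consistent state, where every reference slot that
points at buffer 0 is also counted in buffer 0's reference count. BufPool
and its helpers are hypothetical; the actual fix in the codec may be
structured differently.

```c
#include <string.h>

#define NUM_FRAME_BUFS 4  /* assumed pool size, for illustration only */
#define NUM_REFS 3        /* assumed number of reference slots */

typedef struct {
  int ref_count[NUM_FRAME_BUFS];  /* how many slots point at each buffer */
  int ref_map[NUM_REFS];          /* which buffer each slot points at */
} BufPool;

/* Start consistent: all slots point at buffer 0 AND buffer 0's count
 * reflects that, so the first frame needs no special handling. */
void pool_init(BufPool *p) {
  memset(p->ref_count, 0, sizeof(p->ref_count));
  for (int i = 0; i < NUM_REFS; ++i) {
    p->ref_map[i] = 0;
    p->ref_count[0]++;
  }
}

/* Generic slot update: release the old buffer, then acquire the new one.
 * This is valid from the very first frame because counts were never 0
 * for a buffer that is still referenced. */
void pool_assign(BufPool *p, int slot, int new_buf) {
  p->ref_count[p->ref_map[slot]]--;
  p->ref_map[slot] = new_buf;
  p->ref_count[new_buf]++;
}
```

If the counts started at 0 while the slots pointed at buffer 0, the first
pool_assign() call would drive a count negative, which is the kind of case
that needs first-frame special code.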
Using get_plane_block_size() instead of manipulating subsampling
values directly, and calculating all required values only once, without
redundant calls to b_width_log2().
Change-Id: I00303f2a0926f9c4cb17f34591adda60615f8919
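The commit above replaces scattered subsampling arithmetic with a single
lookup. A rough sketch of the idea follows, using a hypothetical helper
that works in log2 pixel units; the real get_plane_block_size() and
b_width_log2() have different signatures and units.

```c
/* Hypothetical result type: plane block dimensions as log2 of pixels. */
typedef struct {
  int w_log2;
  int h_log2;
} PlaneBlockLog2;

/* Compute the block size for one plane once, from the luma block size
 * and that plane's subsampling shifts (ss_x, ss_y are 0 or 1). Callers
 * keep the result instead of repeating the subtraction at each use. */
PlaneBlockLog2 plane_block_log2(int luma_w_log2, int luma_h_log2,
                                int ss_x, int ss_y) {
  PlaneBlockLog2 d;
  d.w_log2 = luma_w_log2 - ss_x;
  d.h_log2 = luma_h_log2 - ss_y;
  return d;
}
```

For example, a 64x64 luma block (log2 sizes 6 and 6) with 4:2:0 chroma
(ss_x = 1, ss_y = 1) gives a 32x32 chroma block.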
Jingning saw bitstream change with this patch. It could be true
that (mask_16x16_0 & 1) is 1, but (mask_16x16_1 & 1) is 0 in some
edge cases.
This reverts commit 8f05e70340.
Change-Id: I0a529435ce816a1e14653eb510d5090de276070a
In the decoder we don't need to save eobs; we can pass eob as an argument.
So remove the eob arrays from VP9Decompressor and TileWorkerData,
and move the eob pointer from macroblockd_plane to macroblock_plane.
Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
This commit makes the coefficient tree initialized prior to token
initialization, where the coefficient costs are filled out according
to the probabilities associated with coefficient value categories.
Change-Id: If4e89c3923058376f8382c683fe4a225a4a38af3
This commit fixes the intra prediction reference source selection
in the settings of skip_encode. Use original boundary pixels as
prediction reference, when the inverse transform and reconstruction
are skipped in the per block size rate-distortion optimization loop.
Change-Id: I36081aa30aa46e203e0e6f4e8a420fd08269469a
Its last remaining caller can be passed its results directly without any
additional work. Also, it is not safe for formats other than 4:2:0.
Change-Id: Ia5089ba5f7f66c7617270483c619c9271aefd868
The performance gain of idct16x16_10_add_sse2 function is not
noticeable. However since both functions use the IDCT16_1D,
idct16x16_10_add_sse2 should be modified as well.
Tested with: park_joy_420_720p50.y4m
Change-Id: I02b957e36fcf997c677d15baf496533895271bff
This commit fixes the use of uv_intra_estimate by properly restoring
the mode_info struct required by rd_pick_intra_sbuv_mode.
Change-Id: I6a156d79533c4e2e60dfd3b8c5bb0a42a8eca280
The difference with the old code is that originally the whole token_cache
was initialized with zeros at the beginning of decode_coefs() function.
Now we set several zero values explicitly with "token_cache[scan[c]] = 0".
Change-Id: I88cc5031f01d13012d1a4491739c36cb44f9401e
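The token_cache change above can be sketched as follows. This is a
simplified stand-in, not the real decode_coefs(): the tokens[] array
replaces actual bitstream reads, and fill_token_cache() is a hypothetical
name. The point it illustrates is that every cache entry the loop visits
is written explicitly, so no up-front zeroing of the whole cache is needed.

```c
#include <stdint.h>

/* tokens[] stands in for values decoded from the bitstream; scan[] maps
 * coefficient index c to a position in the block. Returns the number of
 * positions visited. */
int fill_token_cache(const uint8_t *tokens, int eob, const int16_t *scan,
                     uint8_t *token_cache) {
  int c = 0;
  while (c < eob) {
    if (tokens[c] == 0) {
      /* Explicit zero store replaces the old whole-cache initialization. */
      token_cache[scan[c]] = 0;
    } else {
      token_cache[scan[c]] = tokens[c];
    }
    ++c;
  }
  return c;
}
```

This is only safe if later context reads touch positions that have already
been visited, which this sketch assumes rather than checks.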
Removing goto and using while loop instead, renaming seg_eob to max_eob,
moving eob token counter increment.
Change-Id: Idcc4b3a45e4f313596a71776aef56691a6647e5f
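For the goto removal above, a minimal sketch of the resulting loop shape is
shown below. It is not the real coefficient decoder: tokens[] replaces
bitstream reads, TOK_ZERO and TOK_EOB are made-up values, and the sketch
assumes an end-of-block token never directly follows a zero token.

```c
enum { TOK_ZERO = 0, TOK_EOB = 255 };  /* hypothetical token values */

/* Returns the end-of-block position, at most max_eob. */
int find_eob(const unsigned char *tokens, int max_eob) {
  int c = 0;
  while (c < max_eob) {
    /* The end-of-block check happens once, at the top of each pass. */
    if (tokens[c] == TOK_EOB) break;

    /* A run of zeros is consumed by an inner loop instead of a backward
     * goto; no end-of-block check is made inside the run. */
    while (tokens[c] == TOK_ZERO) {
      ++c;
      if (c >= max_eob) return c;
    }

    ++c;  /* consume the nonzero coefficient that ended the run */
  }
  return c;
}
```

The bound check inside the zero run comes before the next tokens[c] read,
so the function never reads past max_eob entries.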
E.g. disable vertical partitioning for 4:2:2. Until we come up with something
better to do with the chroma block size, this prevents an assert error.
Change-Id: I9394fb3f14ec1343abc3ad4769de208e6278f285
Considering a horizontal edge, if mask_16x16 is 1 for an even-
indexed 8x8 block, then mask_16x16 is 1 for the next 8x8 block in
the same row. Similarly, for a vertical edge, if mask_16x16 is 1 for
an even-rowed 8x8 block, then mask_16x16 is 1 for the 8x8 block
right below it in the next row. Based on that, the mask_16x16 checking
can be simplified to save cycles. The corresponding 8-pixel
vp9_mb_lpf_horizontal_edge code can also be removed.
Change-Id: Ic3fe7a5674322239208cbe2731dc3216ce2084f3
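The pairing argument in the commit above can be sketched for a single row
of eight 8x8 blocks, with one mask bit per block. Both functions are
hypothetical helpers, not loop filter code from the codec; the second one
tests the stated assumption, which is the property a later revert in this
log reports can fail in some edge cases.

```c
#include <stdint.h>

/* Simplified check: look only at even-indexed bits and let each hit
 * stand for one 16-pixel-wide filter call covering blocks i and i+1. */
int count_wide_calls(uint8_t mask_16x16) {
  int calls = 0;
  for (int i = 0; i < 8; i += 2) {
    if (mask_16x16 & (1u << i)) ++calls;
  }
  return calls;
}

/* Returns 1 if every set even-indexed bit is followed by a set
 * odd-indexed bit, i.e. the assumption behind the simplification holds
 * for this row. */
int pairing_holds(uint8_t mask_16x16) {
  uint8_t even = mask_16x16 & 0x55;         /* bits 0, 2, 4, 6 */
  uint8_t odd = (mask_16x16 >> 1) & 0x55;   /* bits 1, 3, 5, 7, shifted */
  return (even & ~odd) == 0;
}
```

A check like pairing_holds() could be used in a debug build to flag masks
for which the paired filtering would differ from per-block filtering.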
We only need qcoeff buffers in the encoder. Reducing TileWorkerData struct
and VP9Decompressor struct sizes by 24K.
Change-Id: Id148868461f7ffa3d3dd634b371503ae9c57e207
Renaming treed_read() to vp9_read_tree() for consistency and moving it from
the deleted vp9_treereader.h to the vp9_dboolhuff.h file.
Change-Id: Iedd8655acbe25e4fcf62b79e5a13bdea69b6b004
vp9_idct32x32_34_add_sse2:
speedup: 1.472
IDCT32_1D_34 and MULTIPLICATION_AND_ADD_2 are optimized
based on the fact that only the upper-left 8x8 has
non-zero values.
vp9_idct32x32_1024_add_sse2:
speedup: 1.032
Tested with: park_joy_420_720p50.y4m
Change-Id: I8670ce547552b48695049de298e2fc46ce28dfbc
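The _34 and _1024 suffixes above suggest a dispatch on the end-of-block
position. A minimal sketch of such a dispatch follows; the enum and
pick_idct32() are hypothetical, and the threshold of 34 is inferred from
the function name together with the commit's statement that only the
upper-left 8x8 is non-zero in that case.

```c
/* Hypothetical labels for the two 32x32 inverse transform variants
 * named in the commit message. */
typedef enum { IDCT32_34, IDCT32_1024 } Idct32Variant;

/* Choose the cheaper variant when the last non-zero coefficient is
 * early enough in scan order that, by assumption, all non-zero values
 * sit in the upper-left 8x8 of the 32x32 block. */
Idct32Variant pick_idct32(int eob) {
  return eob <= 34 ? IDCT32_34 : IDCT32_1024;
}
```

The reduced variant can skip work on rows and columns known to be zero,
which is consistent with the larger speedup reported for it.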