This patch allows the encoder to skip the search for partition
SPLIT, HORZ, VERT after the search for partition NONE is done
in RD optimization. It uses the first pass block-wise statistics
to make the decision. If all 16x16 blocks in the current partition
have zero motions and small residues from the frist pass statistics,
and it has small difference variance, further partition search is
skipped.
For speed 2 setting, experiments on general youtube clips show that
the speedup varies from 1% - 10%, 5% on average. On the performance
side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.
For hard stdhd clips:
park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)
This feature is under the compilation flag CONFIG_FP_MB_STATS and
it is off in current setting.
Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae
Remove the variable that indicates the relative block index. This
is explicitly covered by the use of pc_tree.
Change-Id: Ib13142582fff926c85e375bde656aa050add8350
This commit enables a chessboard pattern constrained partition
search for 720p and above resolutions. The scheme applies stricter
partition search to alternative blocks based on its above/left
neighboring blocks' partition range, as well as that of the
collocated blocks in the previous frame. It is currently turned
on at 16x16 block size level. The chessboard pattern is flipped
per coding frame.
The speed 3 runtime is reduced:
park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
The compression performance is changed:
hd -0.223%
stdhd -0.295%
Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
vp9_variance16x16(), and vp9_get16x16var().
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~16.7%.
Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb
This commit replace the repetitive retrieve of max and min allowed
partition from speed_feature with local variables max_size and
min_size.
Change-Id: Ib06f11f16615e4876e4dd5fb6a968c6bf5f7b216
The get_chessboard_index() used to call the entire VP9_COMMON
struct pointer to retrieve the chessboard pattern index. This cl
makes it call the frame index directly.
Change-Id: I3cad9d209ea2e77a358085a04fe1ff0ddec5ba03
Remove the redundant index computation when store the first
pass block-wise statistics. Currently, a single byte is
allocated for a 16x16 blocks, and all the frame statistics
saved during the first pass will be kept in memory for use
in the second pass. For a 1920x1080 300-frame clip, it will
take about 2.3 MB memory. This feature is off in current
setting.
Change-Id: I135a95b348ec093d54c6a07e1e8237626909e3bd
Remove all the redundant dct functions (dct4x4, dct8x8)
in avx2 except dct32x32 those functions were copied originally from dct_sse2
Change-Id: I742576fbf5175f3ac09f2076976a9247b259323e
The partition search for 4x4 blocks takes unnecessary steps to
reconstruct pixels and an extra partition type update. This commit
removes such operations. No visible compression/speed difference.
Thanks to Yue (yuec@) for finding this issue.
Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a
The assignment of the variable mode_excluded in
vp9_rd_pick_inter_mode_sub8x8 takes redundant conditional jump.
This commit removes it.
Change-Id: Ie195fbe6e54ec2ade7093d562c456a2e93143704
1. Remove last reference flag for first frame upper layers in one pass mode.
2. Disable refresh golden frame flag for key frames.
Change-Id: I44ac1bd2c795169e4fbfdd078ea79a1d33a204d6
The value of mode_excluded has been properly set in
vp9_rd_pick_inter_mode_sb(). It is redundant to send it in
handle_inter_mode() and re-set the value again.
Change-Id: I408d4731f2f42e0bcf3ae62e85757717bb410471
This commit extends the chessboard pattern prediction filter search.
If the above and left blocks have the same prediction filter type,
the encoder will skip the prediction filter type search and use the
reference one.
The overall chessboard pattern prediction filter type search reduces
speed 3 runtime for hard clips. Experiments on park joy at 1080p
and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
i.e., about 10% speed-up. Compression performance wise, it affects
the coding quality by
Change-Id: I880975497c7ad166532e9eea9bf46684d77ff327
derf: -0.326%
yt: -0.257%
hd: -0.241%
stdhd: -0.417%
This commit enables a chessboard pattern prediction filter type
search scheme for rate-distortion optimization speed-up. For the
inferred motion vector modes, the encoder can re-use its above/left
neighbor blocks' prediction filter type and skip a full test on
all possible filter types. Such operation is turned on/off
alternatively in a chessboard manner.
It is turned on in speed 3. For test clip pedestrian 1080p, the
runtime is reduced from 231500 ms -> 221700 ms. The compression
performance is changed:
derf: -0.147%
yt: -0.134%
hd: -0.079%
stdhd: -0.220%
Change-Id: I1912f278e7576c2dc632688e3ad7a257410c605a
When OUTPUT_YUV_DENOISED is enabled the encoder outputs the uncompressed,
denoised video to a separate file. Moved the point at which the file is
written to in order to avoid an extra blank frame at the beginning of the video.
Change-Id: I805f6a912b18b3d9cae59b13c5b8108279439ce3
This should be a local variable. Move the definition from
vp9_rd_pick_inter_mode_sb to handle_inter_mode.
Change-Id: I14f4168bb1c896ed04e8f6d4cd89fbf4c9839944