6191 Commits

Author SHA1 Message Date
Jim Bankoski
8a774e14ff Changes interface to avoid uninitialized warnings in vp9_cx_iface.c.
Change-Id: I1092239e21c1cde188ee2dcb765f4c6fc8c5cdec
2014-07-31 06:27:57 -07:00
Jingning Han
a3b062c56f Merge "Chessboard pattern partition search" 2014-07-30 14:34:42 -07:00
Pengchong Jin
7f29d22e51 Merge "Early termination after partition NONE is done in RD." 2014-07-30 13:35:02 -07:00
James Zern
d1403eafb5 Merge "vp9_cx_iface: defer compressed data buffer alloc" 2014-07-30 12:03:45 -07:00
Pengchong Jin
49866baae6 Early termination after partition NONE is done in RD.
This patch allows the encoder to skip the search for partition
SPLIT, HORZ, VERT after the search for partition NONE is done
in RD optimization. It uses the first pass block-wise statistics
to make the decision. If all 16x16 blocks in the current partition
have zero motion and small residues in the first pass statistics,
and it has small difference variance, further partition search is
skipped.

For speed 2 setting, experiments on general youtube clips show that
the speedup varies from 1% - 10%, 5% on average. On the performance
side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.

For hard stdhd clips:
park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)

This feature is under the compilation flag CONFIG_FP_MB_STATS and
it is off in current setting.

Change-Id: I554537e9242178263b65ebe14a04f9c221b58bae
2014-07-30 11:54:49 -07:00
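The early-termination rule described in commit 49866baae6 can be sketched roughly as follows. This is a minimal illustration in Python, not the encoder's code: the `BlockStats` fields, the function name, and the variance threshold are hypothetical stand-ins for the first-pass statistics guarded by CONFIG_FP_MB_STATS.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BlockStats:
    """Hypothetical first-pass summary for one 16x16 block."""
    zero_motion: bool    # first pass found no motion for this block
    small_residue: bool  # first pass residue was below some threshold


def skip_further_partition_search(blocks: Iterable[BlockStats],
                                  diff_variance: float,
                                  variance_threshold: float = 100.0) -> bool:
    """Return True if SPLIT/HORZ/VERT could be skipped after NONE is searched.

    The threshold value is an assumption chosen only for illustration.
    """
    blocks = list(blocks)
    if not blocks:
        return False  # no statistics available: keep searching
    all_static = all(b.zero_motion and b.small_residue for b in blocks)
    return all_static and diff_variance < variance_threshold
```

A partition whose 16x16 blocks are all static and whose difference variance is small returns `True`; a single moving block or a large variance keeps the full search.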
Frank Galligan
81c2db591f Merge "Neon version of vp9_quantize_fp()" 2014-07-30 11:12:20 -07:00
Jingning Han
d82ff94284 Refactor rd_pick_partition interface
Remove the variable that indicates the relative block index. This
is explicitly covered by the use of pc_tree.

Change-Id: Ib13142582fff926c85e375bde656aa050add8350
2014-07-30 10:53:57 -07:00
Jingning Han
ca2dcb7fed Chessboard pattern partition search
This commit enables a chessboard pattern constrained partition
search for 720p and above resolutions. The scheme applies stricter
partition search to alternate blocks based on their above/left
neighboring blocks' partition range, as well as that of the
collocated blocks in the previous frame. It is currently turned
on at 16x16 block size level. The chessboard pattern is flipped
per coding frame.

The speed 3 runtime is reduced:
park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)

The compression performance is changed:
hd     -0.223%
stdhd  -0.295%

Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
2014-07-30 10:32:41 -07:00
Jingning Han
e9935a4ca0 Merge "Clean up max/min allowed block size in rd_pick_partition" 2014-07-30 10:05:26 -07:00
Jingning Han
22cf82a14c Merge "Use frame index directly in get_chessboard_index" 2014-07-30 10:05:03 -07:00
Jim Bankoski
e71adcd834 Merge "clear up cfg unused warning in vp9_pick_inter_mode" 2014-07-30 09:56:33 -07:00
Scott LaVarnway
d4a37db5b8 Neon version of vp9_quantize_fp()
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~12.4%

Change-Id: Id29d215acf58bb108489e218a259adf74b4768d7
2014-07-30 09:33:46 -07:00
Jim Bankoski
6647ca795d clear up cfg unused warning in vp9_pick_inter_mode
Change-Id: Iefcf0a25aaf5e44e8e791839aa82d876555025e0
2014-07-30 08:55:22 -07:00
Scott LaVarnway
521cf7e879 Neon version of vp9_sub_pixel_variance16x16(),
vp9_variance16x16(), and vp9_get16x16var().

On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~16.7%.

Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb
2014-07-30 08:17:32 -07:00
James Zern
c2c02510bd vp9_cx_iface: defer compressed data buffer alloc
Currently the only way to know if multiple alt-refs are enabled is to
inspect the encoder instance.
This reduces the size of the allocation by 75% when not using multiple
alt-refs.

Change-Id: Ie4baa240c2897e64b766c6ad229674884b5a65b6
2014-07-29 15:01:36 -07:00
Pengchong Jin
838b53b9fb Merge "Remove the redundant index computation in the first pass" 2014-07-29 14:27:24 -07:00
Jingning Han
6646ea73e2 Clean up max/min allowed block size in rd_pick_partition
This commit replaces the repeated retrieval of the max and min allowed
partition sizes from speed_feature with local variables max_size and
min_size.

Change-Id: Ib06f11f16615e4876e4dd5fb6a968c6bf5f7b216
2014-07-29 11:03:52 -07:00
Jingning Han
c36f78b054 Use frame index directly in get_chessboard_index
get_chessboard_index() used to take a pointer to the entire VP9_COMMON
struct to retrieve the chessboard pattern index. This CL makes it
take the frame index directly.

Change-Id: I3cad9d209ea2e77a358085a04fe1ff0ddec5ba03
2014-07-29 10:55:56 -07:00
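As a rough illustration of the chessboard scheme this change touches, the sketch below derives a per-frame pattern index from the frame index alone and combines it with a block's position. The Python function names, the block-unit coordinates, and the parity test are assumptions for illustration; only the idea that the pattern flips on every coded frame comes from the commit messages in this log.

```python
def chessboard_index(frame_index: int) -> int:
    """Pattern phase for a frame: alternates 0, 1, 0, 1, ... per coded frame."""
    return frame_index & 1


def is_constrained_block(block_row: int, block_col: int,
                         frame_index: int) -> bool:
    """True if the block lands on the squares that get the stricter search.

    block_row and block_col are assumed to be in units of 16x16 blocks.
    """
    return ((block_row + block_col) & 1) == chessboard_index(frame_index)
```

Because only the frame index is needed, a caller under these assumptions passes a plain integer rather than a pointer to a large struct, which is the simplification the commit describes.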
Scott LaVarnway
d19d222db6 Added vp9_fdct8x8_neon(), vp9_fdct8x8_1_neon()
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~3.7%.

Change-Id: I428c72c40df82c6d537955e320a8debf99343004
2014-07-29 08:56:05 -07:00
Pengchong Jin
6491065775 Remove the redundant index computation in the first pass
Remove the redundant index computation when storing the first
pass block-wise statistics. Currently, a single byte is
allocated for each 16x16 block, and all the frame statistics
saved during the first pass are kept in memory for use
in the second pass. For a 1920x1080 300-frame clip, this
takes about 2.3 MB of memory. This feature is off in the current
setting.

Change-Id: I135a95b348ec093d54c6a07e1e8237626909e3bd
2014-07-28 18:31:36 -07:00
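The memory figure quoted in commit 6491065775 can be checked with a short calculation. The sketch below assumes one byte per 16x16 block and that partial blocks at the frame edge are rounded up to whole blocks; the function name is hypothetical.

```python
def first_pass_stats_bytes(width: int, height: int, frames: int,
                           block_size: int = 16,
                           bytes_per_block: int = 1) -> int:
    """Bytes needed to keep one record per block for every frame."""
    cols = (width + block_size - 1) // block_size   # round up to whole blocks
    rows = (height + block_size - 1) // block_size
    return cols * rows * bytes_per_block * frames


total = first_pass_stats_bytes(1920, 1080, 300)
print(total, "bytes =", round(total / 2**20, 2), "MiB")
```

For 1920x1080 this gives 120 x 68 = 8160 blocks per frame, so 300 frames need 2,448,000 bytes, which matches the "about 2.3 MB" in the message.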
levytamar82
4ba92dc5ab Fix bug 805
Remove all the redundant dct functions (dct4x4, dct8x8)
in avx2 except dct32x32. Those functions were originally copied from dct_sse2.

Change-Id: I742576fbf5175f3ac09f2076976a9247b259323e
2014-07-28 15:46:01 -07:00
Pengchong Jin
c580428928 Merge "Store block-wise statistics obtained in the first pass" 2014-07-28 14:49:05 -07:00
Pengchong Jin
bae652245d Store block-wise statistics obtained in the first pass
Change-Id: I9956db2ba2f7d28f484daaf5022d8d1ef5db473c
2014-07-28 09:12:40 -07:00
Jim Bankoski
899585ebe9 Fix reference frame size restrictions.
The issue was introduced by commit g9f37d14, which added explicit
restrictions on reference-frame scale factors. The restriction
is checked against frame dimensions aligned to 8, not against the
original ones. So, for example, a 35×35 frame can actually refer
to a 70×70 frame, but the new check won't allow this. It
compares 35 against 72 (not 70), so the 2x downscale limit is exceeded.

Change-Id: Ic663693034440f64ac8312cbff9e1e773a921060
2014-07-28 08:37:25 -07:00
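The 35 versus 72 example in commit 899585ebe9 can be reproduced with a small sketch. The helper names are hypothetical and the check is reduced to the single 2x downscale condition mentioned in the message; the real validation may include further conditions.

```python
def align_to_8(value: int) -> int:
    """Round a dimension up to the next multiple of 8."""
    return (value + 7) & ~7


def within_2x_downscale(frame_size: int, ref_size: int) -> bool:
    """A reference may be at most twice the size of the current frame."""
    return 2 * frame_size >= ref_size


frame, ref = 35, 70
print(within_2x_downscale(frame, ref))              # original dimensions
print(within_2x_downscale(frame, align_to_8(ref)))  # aligned reference
```

With the original dimensions the reference is exactly twice the frame size and passes; once 70 is aligned up to 72 the same frame pair is rejected, which is the bug the commit describes.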
Jingning Han
ac1f06188d Merge "Fix rd_pick_partition search loop for 4x4 blocks" 2014-07-25 15:57:35 -07:00
Jingning Han
0c103eb211 Merge "Fix potential ioc issue in vp9_get_prob for 4K above sizes" 2014-07-25 15:56:53 -07:00
Jingning Han
3d5f17311c Merge "Remove unnecessary conditional assignment" 2014-07-25 15:56:31 -07:00
Minghai Shang
8433c8f92d Merge "[spatial svc]Fix reference issues" 2014-07-25 13:24:27 -07:00
Alex Converse
f5827aee38 Merge "Refactor inter/intra_suberblock_yrd." 2014-07-25 10:51:48 -07:00
Yaowu Xu
b43b4fe3a2 Merge "Fix allocation of context buffers on frame resize" 2014-07-25 08:49:39 -07:00
Yaowu Xu
99813843ef Merge "Changed validation of reference frame size" 2014-07-25 08:48:48 -07:00
Jingning Han
84af0486f9 Fix rd_pick_partition search loop for 4x4 blocks
The partition search for 4x4 blocks takes unnecessary steps to
reconstruct pixels and an extra partition type update. This commit
removes such operations. No visible compression/speed difference.
Thanks to Yue (yuec@) for finding this issue.

Change-Id: I3f83824aa3fd3717d63be0b280fa57258939a70a
2014-07-25 07:17:58 -07:00
Jingning Han
53844275e9 Fix potential ioc issue in vp9_get_prob for 4K above sizes
This commit turns on the existing vp9_get_prob function that uses
64 bits in the intermediate step. It fixes the ioc issue for frame
sizes above 4K (issue 828).

Change-Id: I9f627f3beca2c522f73b38fd2a3e7eefdff01a7c
2014-07-24 15:35:51 -07:00
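To show why a wider intermediate matters for large frames, the sketch below compares a probability computation whose scaled numerator is truncated to 32 bits with one that keeps full precision. The formula (count scaled by 256, rounded, then clipped to 1..255) and both function names are assumptions for illustration. Python integers do not overflow, so the 32-bit case is modelled with a mask; in C the behaviour of a signed overflow would be undefined rather than a clean wrap.

```python
def clip_prob(p: int) -> int:
    """Clamp a probability to the range 1..255."""
    return max(1, min(255, p))


def get_prob_wrapped32(num: int, den: int) -> int:
    """Model of a 32-bit intermediate: the scaled count wraps at 2**32."""
    if den == 0:
        return 128
    scaled = (num * 256 + (den >> 1)) & 0xFFFFFFFF
    return clip_prob(scaled // den)


def get_prob_wide(num: int, den: int) -> int:
    """Same computation with an intermediate wide enough not to wrap."""
    if den == 0:
        return 128
    return clip_prob((num * 256 + (den >> 1)) // den)


num, den = 1 << 24, 1 << 25  # counts this large are only plausible on big frames
print(get_prob_wrapped32(num, den), get_prob_wide(num, den))
```

Once the count reaches 2**24, multiplying by 256 no longer fits in 32 bits, and the two versions disagree even though the true ratio is one half.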
Jingning Han
7112d70f24 Remove unnecessary conditional assignment
The assignment of the variable mode_excluded in
vp9_rd_pick_inter_mode_sub8x8 takes a redundant conditional jump.
This commit removes it.

Change-Id: Ie195fbe6e54ec2ade7093d562c456a2e93143704
2014-07-24 15:34:11 -07:00
Yaowu Xu
9261e1aa6e Changed validation of reference frame size
A previous change, https://gerrit.chromium.org/gerrit/#/c/70632,
introduced a size validation for reference frames to ensure the
input stream is a valid VP9 stream. However, the logic requiring
all reference frames to have a valid size turned out to be too strict.

In this commit, we modify the validation to require that at least one
of the reference frames has valid dimensions. In addition, the decoder
reports an error whenever it detects the use of a reference frame
with an invalid scaling ratio.

Change-Id: If8efc312244087556cfe00f1fcbdff811268ebad
2014-07-24 14:58:01 -07:00
Adrian Grange
423e8a9727 Fix allocation of context buffers on frame resize
The patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
changed the test that determined whether the context
frame buffers needed to be reallocated or not.

The code checked for a change in total frame area
to signal the need to reallocate context buffers.
However, the above_context buffer needs to be
resized if only the width of the frame has increased.

Change-Id: Ib89d75651af252908144cf662578d84f16cf30e6
2014-07-24 14:07:45 -07:00
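The gap that commit 423e8a9727 closes can be illustrated with two small predicates. Both function names and the example dimensions are hypothetical; the sketch only contrasts a test on total frame area with a test on width, which is what the message says the above_context buffer depends on.

```python
def area_grew(old_w: int, old_h: int, new_w: int, new_h: int) -> bool:
    """Reallocation test based on total frame area."""
    return new_w * new_h > old_w * old_h


def width_grew(old_w: int, new_w: int) -> bool:
    """Reallocation test for a buffer sized by frame width only."""
    return new_w > old_w


# A resize that makes the frame wider but smaller in area.
old_w, old_h, new_w, new_h = 640, 480, 800, 300
print(area_grew(old_w, old_h, new_w, new_h), width_grew(old_w, new_w))
```

For this resize the area test reports no growth while the width test does, so a width-sized buffer would be left too small if only the area were checked.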
Tim Kopp
9d337d34f2 s/CONFIG_DENOISING/CONFIG_VP9_TEMPORAL_DENOISING
This should prevent confusion with the VP8 CONFIG_TEMPORAL_DENOISING and other
flags.

Change-Id: I1fe4e2977895b7966841d861ab74317ad875b6c8
2014-07-24 13:43:52 -07:00
Alex Converse
6eae35c07f Refactor inter/intra_suberblock_yrd.
Move txfm_rd_in_plane into choose_tx_size_from_rd and cleanup callers.

Change-Id: I1df2d7dc984802bd5e204cbe881ada0d75fbb3f7
2014-07-24 11:21:51 -07:00
Minghai Shang
929001bf22 [spatial svc]Fix reference issues
1. Remove last reference flag for first frame upper layers in one pass mode.
2. Disable refresh golden frame flag for key frames.

Change-Id: I44ac1bd2c795169e4fbfdd078ea79a1d33a204d6
2014-07-23 16:54:14 -07:00
Jingning Han
374c885919 Merge "Remove redundant argument entry in handle_inter_mode" 2014-07-23 15:07:01 -07:00
Jingning Han
787e8240d5 Merge "Use the chessboard pattern pred search in newmv mode" 2014-07-23 15:06:52 -07:00
Yaowu Xu
5dcb2e3237 Merge "Moved call to vp9_clear_system_state() to a proper location" 2014-07-23 12:46:11 -07:00
Jingning Han
e945c56d4a Remove redundant argument entry in handle_inter_mode
The value of mode_excluded has been properly set in
vp9_rd_pick_inter_mode_sb(). It is redundant to send it in
handle_inter_mode() and re-set the value again.

Change-Id: I408d4731f2f42e0bcf3ae62e85757717bb410471
2014-07-23 12:04:45 -07:00
Jingning Han
4f2f86725b Use the chessboard pattern pred search in newmv mode
This commit extends the chessboard pattern prediction filter search.
If the above and left blocks have the same prediction filter type,
the encoder will skip the prediction filter type search and use the
reference one.

The overall chessboard pattern prediction filter type search reduces
speed 3 runtime for hard clips. Experiments on park joy at 1080p
and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
i.e., about 10% speed-up. Compression performance wise, it affects
the coding quality by
derf:    -0.326%
yt:      -0.257%
hd:      -0.241%
stdhd:   -0.417%

Change-Id: I880975497c7ad166532e9eea9bf46684d77ff327
2014-07-23 11:59:52 -07:00
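The neighbor-based shortcut described in commit 4f2f86725b can be sketched as a single decision function. The function name, the use of strings for filter types, and the `on_pattern` flag are assumptions for illustration; the commit message only states that the filter search is skipped when the above and left blocks agree.

```python
from typing import Optional


def inferred_filter_type(above: Optional[str], left: Optional[str],
                         on_pattern: bool) -> Optional[str]:
    """Return the filter type to reuse, or None if a full search is needed.

    above/left are the neighbors' filter types (None if unavailable);
    on_pattern says whether this block is on the chessboard squares
    where the shortcut is allowed.
    """
    if on_pattern and above is not None and above == left:
        return above
    return None
```

A block with matching neighbors on an enabled square inherits their filter type; any other block falls back to testing all filter types.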
Jingning Han
66d5757695 Merge "Remove redundant num_refs definition" 2014-07-23 10:34:54 -07:00
Jingning Han
353819103e Remove redundant num_refs definition
Use is_comp_pred to replace the use case of num_refs.

Change-Id: I4d0c1e14d5f728428a2ae3d293cd2b4a8b2f31d8
2014-07-23 09:29:51 -07:00
Jingning Han
0e5edf4eae Merge "Enable chessboard inter prediction filter type search" 2014-07-23 09:12:56 -07:00
Jingning Han
54ad09586c Enable chessboard inter prediction filter type search
This commit enables a chessboard pattern prediction filter type
search scheme for rate-distortion optimization speed-up. For the
inferred motion vector modes, the encoder can re-use its above/left
neighbor blocks' prediction filter type and skip a full test on
all possible filter types. This operation is turned on and off
alternately in a chessboard manner.

It is turned on in speed 3. For test clip pedestrian 1080p, the
runtime is reduced from 231500 ms to 221700 ms. The compression
performance is changed:
derf:  -0.147%
yt:    -0.134%
hd:    -0.079%
stdhd: -0.220%

Change-Id: I1912f278e7576c2dc632688e3ad7a257410c605a
2014-07-22 16:49:03 -07:00
Adrian Grange
1f3c43e602 Merge "Fix get_frame_type function" 2014-07-22 15:17:27 -07:00
Tim Kopp
75441e1e08 Merge "VP9 denoiser bugfix in debugging code." 2014-07-22 14:48:42 -07:00