440 Commits

Author SHA1 Message Date
Jingning Han
25206e7b7f Compute prediction filter type cost only when needed
Skip redundant prediction filter type cost in filter search loop,
if the rate value will be reset in Hadamard transform based rate
distortion estimate.

Change-Id: Ie5221f4bc8da9461c449df367251aeeac52c6e5d
2015-04-07 12:41:46 -07:00
Yaowu Xu
0bb897211d Optimize the checking for transform skipping
If U is not skippable, then do not perform the check on V.

Change-Id: Iba5e8362bd42390197f373c44388a426a4404549
2015-04-06 17:54:05 -07:00
Jingning Han
9922e4344a Enable Hadamard transform based cost estimate for all block sizes
This commit turns on the Hadamard transform based rate distortion
estimate for all block sizes in RTC coding mode. It conditionally
skips the rate distortion estimation if the all-zero block flag is
set. No significant encoding speed change is observed. The
compression performance of speed -6 is improved by 1.7% over using
it only for block sizes of 32x32 and below.

Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb
2015-04-04 09:58:45 -07:00
Yunqing Wang
1a1114d21c Fix the scaling factor in UV skipping test
The threshold scaling factor was calculated incorrectly using the
partition size "bsize". Thanks to Yaowu for pointing it out. It is
now fixed, and no speed change was seen.

Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27
2015-04-03 16:07:43 -07:00
Jingning Han
60e01c6530 Account for eob cost in the RTC mode decision process
This commit accounts for the transform block end of coefficient flag
cost in the RTC mode decision process. This allows a more precise
rate estimate. It also turns on the model for block sizes up to 32x32.
The test sequences show about a 3% - 5% speed penalty for speed -6.
The average compression performance improvement for speed -6 is
1.58% in PSNR. The compression gains for hard clips like jimredvga,
mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and
3.2%, respectively.

Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb
2015-04-03 10:31:51 -07:00
Yaowu Xu
718feb0f69 move ref_frame_cost computations into a function
Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1
2015-04-02 18:10:55 -07:00
Marco
6eb05c9ed0 Small fix to segment check in pickmode.
Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72
2015-04-02 09:55:13 -07:00
Jingning Han
a4364e5146 Merge "Simplify effective src_diff address computation" 2015-04-01 14:55:03 -07:00
Jingning Han
7acb2a8795 Merge "Refactor block_yrd function for RTC coding mode" 2015-04-01 14:54:24 -07:00
Jingning Han
19da916716 Simplify effective src_diff address computation
Remove redundant offset calculation for effective src_diff address.

Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef
2015-04-01 12:07:47 -07:00
Jingning Han
1470529f62 Refactor block_yrd function for RTC coding mode
This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.

Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
2015-04-01 12:00:43 -07:00
Yunqing Wang
a0043c6d30 Enhance the transform skipping decision-making in non-rd mode
For large partition blocks (block_size > 32x32), the variance
calculation is modified so that every 8x8 block's variance
is stored during the calculation, which is used in the
following transform skipping test. Also, the variance for
every tx block is calculated. The skipping test checks all tx
blocks in the partition, and sets the skip flag only if all tx
blocks are skippable. If the skip flag of Y plane is 1, a
quick evaluation is done on UV planes. If the current partition
block is skippable in YUV planes, the mode search checks fewer
inter modes and doesn't check intra modes.
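
The skip test described above can be sketched roughly as follows. This
is a minimal illustration, not the libvpx code: the function names, the
per-block variance arrays, and the single shared threshold are all
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 only if every transform block's variance
 * is below the threshold, i.e. all tx blocks in the plane are skippable. */
static int all_tx_blocks_skippable(const unsigned int *tx_var, size_t n,
                                   unsigned int thresh) {
  size_t i;
  assert(tx_var != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (tx_var[i] >= thresh) return 0; /* one busy block blocks the skip */
  }
  return 1;
}

/* Hypothetical helper: Y is tested first; the UV planes are evaluated
 * only when Y is skippable, and V only when U is skippable. */
static int partition_skippable(const unsigned int *y_var, size_t y_n,
                               const unsigned int *u_var, size_t u_n,
                               const unsigned int *v_var, size_t v_n,
                               unsigned int thresh) {
  if (!all_tx_blocks_skippable(y_var, y_n, thresh)) return 0;
  if (!all_tx_blocks_skippable(u_var, u_n, thresh)) return 0;
  return all_tx_blocks_skippable(v_var, v_n, thresh);
}
```

A caller would use the result to prune the mode search, for example by
checking fewer inter modes and no intra modes when it returns 1.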

The rtc set borg test (at speed 6) showed that:
Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%.
Average single-thread speedup on rtc set was 3.5%.
For 720p clips, more speedups were seen.
gipsrecmotion: 13%
gipsrestat: 12%
vidyo: 5 - 9%
dark: 15%
niklas: 6%

Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e
2015-04-01 09:43:40 -07:00
Jingning Han
531468a07a Tuning SATD rate calculation for speed
This commit allows the encoder to check the eob per transform
block to decide how to compute the SATD rate cost. If the entire
block is quantized to zero, there is no need to add anything; if
only the DC coefficient is non-zero, add its absolute value;
otherwise, sum over the block. This reduces the CPU cycles spent
on vp9_satd_sse2 to one third.
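
The three-way eob check above can be sketched as follows. This is a
simplified illustration under stated assumptions: satd_cost_by_eob is a
hypothetical name, the coefficients are assumed to be stored with DC at
index 0, and the real encoder calls an optimized SATD kernel rather
than this plain loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: SATD-style rate proxy for one transform block.
 * coeff holds n coefficients (DC first); eob is the end-of-block
 * position, i.e. the number of coefficients up to the last non-zero. */
static int satd_cost_by_eob(const int16_t *coeff, int n, int eob) {
  int i, sum = 0;
  assert(coeff != NULL && eob >= 0 && eob <= n);
  if (eob == 0) return 0;              /* whole block quantized to zero */
  if (eob == 1) return abs(coeff[0]);  /* only the DC term is non-zero */
  for (i = 0; i < n; ++i) sum += abs(coeff[i]); /* general case */
  return sum;
}
```

The saving comes from the first two branches, which avoid the full sum
for the blocks that need it least.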

Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32
2015-03-31 11:02:20 -07:00
Jingning Han
ebe1be9186 Allow block skip coding option in RTC mode
When the estimated rate-distortion cost of skip coding mode is
lower than that of sending quantized coefficients, allow the
encoder to drop these coefficients. This improves the compression
performance of speed -6 by 0.268% and makes the encoding speed
slightly faster.
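
The decision above amounts to comparing two rate-distortion costs. The
sketch below assumes the textbook form cost = lambda * rate +
distortion with plain integers; the helper names are hypothetical and
libvpx's own fixed-point cost macro is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified RD cost (assumption): lambda * rate + distortion. */
static int64_t rd_cost(int64_t lambda, int64_t rate, int64_t dist) {
  assert(lambda >= 0 && rate >= 0 && dist >= 0);
  return lambda * rate + dist;
}

/* Hypothetical helper: returns 1 when skipping the block (coding no
 * coefficients) is estimated to be cheaper than sending them. */
static int choose_skip(int64_t lambda,
                       int64_t rate_coeffs, int64_t dist_coeffs,
                       int64_t rate_skip, int64_t dist_skip) {
  return rd_cost(lambda, rate_skip, dist_skip) <
         rd_cost(lambda, rate_coeffs, dist_coeffs);
}
```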

Change-Id: Idff2d7ba59f27ead33dd5a0e9f68746ed3c2ab68
2015-03-31 09:32:53 -07:00
Jingning Han
26d3d3af6a Enable 16x16 Hadamard transform in SATD based mode decision
This commit replaces the 16x16 2D-DCT transform with Hadamard
transform for RTC coding mode. It reduces the CPU cycles spent on
the 16x16 transform by 5X. Overall it makes the speed -6 encoding
1.5% faster without compromising compression performance.

Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b
2015-03-30 15:43:31 -07:00
Jingning Han
b4b5af6acd Use SATD based mode decision for block sizes below 16x16
This commit makes the encoder select between SATD and variance as
the metric for mode decision. It also accounts for chroma component
costs in the mode decision. The overall encoding
time increase as compared to variance based mode selection is about
15% for speed -6. The compression performance is on average 2.2%
better than variance based approach, with about 5% compression
performance gains for hard clips (e.g., jimredvga, niklas720p, and
mmmoving) at lower bit-rate range.

Change-Id: I4d04a31d36f4fcb3f5f491dacd6e7fe44cb9d815
2015-03-30 15:20:07 -07:00
Jingning Han
8a927a1b7a Reuse inter prediction pixel block for Hadamard transform
It saves one unnecessary motion compensated prediction constructed
by using 8-tap filter.

Change-Id: I101215131e6f38621d5935885f94cc74de6a5377
2015-03-30 15:04:33 -07:00
Jingning Han
8c411f74e0 Hadamard transform based coding mode decision process
This commit uses Hadamard transform based rate-distortion cost
estimate for rtc coding mode decision. It improves the compression
performance of speed -6 for many hard clips at lower bit-rates.
For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for
niklas720p. This will introduce extra encoding cycle costs at
this point.
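
To illustrate what a Hadamard transform based cost looks like, here is
a toy 4x4 SATD (sum of absolute transformed differences). It is a
sketch only: the function names are hypothetical, the transform is
unnormalized, and the encoder's actual block sizes and kernels differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* In-place 4-point Hadamard butterfly (additions and subtractions only). */
static void hadamard4_1d(int v[4]) {
  const int a = v[0] + v[1];
  const int b = v[0] - v[1];
  const int c = v[2] + v[3];
  const int d = v[2] - v[3];
  v[0] = a + c;
  v[1] = b + d;
  v[2] = a - c;
  v[3] = b - d;
}

/* Hypothetical helper: SATD of a 4x4 residual stored row-major in diff.
 * Rows are transformed first, then columns; the absolute values of the
 * resulting coefficients are summed. */
static int satd4x4(const int16_t *diff) {
  int m[4][4];
  int col[4];
  int r, c, satd = 0;
  assert(diff != NULL);
  for (r = 0; r < 4; ++r) {
    for (c = 0; c < 4; ++c) m[r][c] = diff[r * 4 + c];
    hadamard4_1d(m[r]);
  }
  for (c = 0; c < 4; ++c) {
    for (r = 0; r < 4; ++r) col[r] = m[r][c];
    hadamard4_1d(col);
    for (r = 0; r < 4; ++r) satd += abs(col[r]);
  }
  return satd;
}
```

Because the butterfly needs no multiplications, it is much cheaper than
a DCT while still concentrating smooth residuals into few coefficients.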

Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375
2015-03-30 14:46:05 -07:00
Yaowu Xu
9fd8abc541 vp9_pred_mv(): misc fixes and optimizations
1. skip near if it is same as nearest
2. correct rounding for converting mv to fullpel position
3. update pred_mv_sad after new mv search.

Overall .1%~.25% compression gains on rtc set for speed 5, 6, 7, 8.

Change-Id: Ic300ca53f7da18073771f1bb993c58cde9deee89
2015-03-20 17:17:04 -07:00
Jingning Han
83cbe22623 Speed up non-rd mode decision search
This commit makes the encoder explicitly calculate the SAD
associated with the LAST_FRAME motion vector and compare it to
that of the GOLDEN_FRAME given by integral projection motion
estimation. It skips the expensive sub-pixel motion search over
GOLDEN_FRAME when the LAST_FRAME can provide fairly good motion
compensated prediction quality.
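
A rough sketch of this gating idea follows. The commit does not state
the exact comparison rule, so the one used here (skip the GOLDEN_FRAME
sub-pixel search when the LAST_FRAME SAD is no worse) is an assumption,
as are the function names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a source block and a prediction. */
static unsigned int block_sad(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int width, int height) {
  unsigned int sad = 0;
  int r, c;
  assert(src != NULL && ref != NULL);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      sad += (unsigned int)abs(src[r * src_stride + c] -
                               ref[r * ref_stride + c]);
    }
  }
  return sad;
}

/* Assumed rule (not taken from the commit): skip the expensive
 * sub-pixel search on GOLDEN when LAST already predicts as well. */
static int skip_golden_subpel_search(unsigned int sad_last,
                                     unsigned int sad_golden) {
  return sad_last <= sad_golden;
}
```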

For dark720p speed -6 single thread goes from
33304 b/f, 40.070 dB, 18156 ms ->
33319 b/f, 40.061 dB, 17611 ms

Change-Id: I01bc94b9b598075567a392111046b97a9bc30efe
2015-03-18 12:04:58 -07:00
Jingning Han
ee41141466 Fix an ioc warning in vp9_pick_inter_mode
Shut off all the metric checks for golden reference frame, if we
decide that it is unlikely to be selected for reference.

Change-Id: Ie457cc1fd43935584403b4982659aed80fb9909c
2015-03-17 10:13:44 -07:00
Yaowu Xu
de3097aa23 Merge "Remove duplicate clamping" 2015-03-16 16:56:10 -07:00
Jingning Han
adaffcc010 Merge "Remove ineffective newmv skip checking from vp9_pick_inter_mode" 2015-03-16 16:43:43 -07:00
Jingning Han
4e8daaf960 Merge "Simplify prediction filter search in rtc coding mode" 2015-03-16 16:43:26 -07:00
Yaowu Xu
4611f24797 Remove duplicate clamping
The mvs are clamped in the vp9_find_best_ref_mvs() already.

Change-Id: I9bea5e35aef6007466fe7fca4bc2dc5c17e74222
2015-03-16 15:19:37 -07:00
Jingning Han
c852200f51 Remove ineffective newmv skip checking from vp9_pick_inter_mode
Change-Id: I41ee684cf113a7b5edf280183e51cb08b2e93cc4
2015-03-16 15:06:27 -07:00
Jingning Han
981bb84882 Simplify prediction filter search in rtc coding mode
Reduce unnecessary fetch from MB_MODE_INFO.

Change-Id: Iff89b76d5e2774c00a564e902913a633fa2e1ea9
2015-03-16 14:54:00 -07:00
Yaowu Xu
f2d682fc10 change the order of inter modes evaluated
Change-Id: I10c1ad23b110cf92cb026e895039c215c47abfd0
2015-03-16 12:49:30 -07:00
Yaowu Xu
51d529a578 vp9_pick_inter_mode(): minor optimizations
1. remove duplicate initialization to mbmi->interp_filter.
2. move mv clamping into ref_frame loop instead of mode checking loop.
3. move the check if last frame is same as golden frame earlier to
avoid initialization of Golden reference related variables.

Change-Id: Idf2d05e19e94a24f69cc289687869fc71d2ff289
2015-03-16 10:08:02 -07:00
Alex Converse
f8df916931 Merge "Reconcile active_map and cyclic refresh" 2015-03-13 10:20:15 -07:00
Yaowu Xu
1aa75c65cc Merge "vp9_pick_inter_mode(): Use single loop to evaluate inter modes" 2015-03-12 18:43:23 -07:00
Yunqing Wang
769e6567e9 Merge "Minorly modify model_rd_for_sb_y function" 2015-03-12 17:16:48 -07:00
Alex Converse
1bfacd3529 Reconcile active_map and cyclic refresh
Change-Id: Id7f8654aeeb20caa402bc822521b1d72c658f4f9
2015-03-12 16:19:49 -07:00
Yaowu Xu
2b368097c8 vp9_pick_inter_mode(): Use single loop to evaluate inter modes
This commit uses a single loop to evaluate all inter modes.
There is no impact on compression quality or speed, but it allows
future experiments with the order of modes evaluated.

Change-Id: I71696ce1014cbe127e25e98710d835987f5ecc09
2015-03-12 16:14:29 -07:00
Yunqing Wang
5d677c97eb Minorly modify model_rd_for_sb_y function
Added a skip_dc check. If skip_dc = 1, the call to
vp9_model_rd_from_var_lapndz() can be skipped. This gave a slight
PSNR and SSIM gain (<0.1%) and no speed change.

Change-Id: If5ca733366148c86b98e196a00cc890f50e9a3e5
2015-03-12 14:04:14 -07:00
Jingning Han
313c28f8b8 Remove unnecessary speed feature checking
This commit removes the pred_mv_sad comparison from rtc motion
search, given that a stronger comparison has been done at the
mode search level to eliminate unlikely selected reference frames.

Change-Id: I49b8d24b2174303066fd8eff2102c0648f2869df
2015-03-11 16:11:40 -07:00
Jingning Han
54eda13f8d Apply fast motion search to golden reference frame
This commit enables the rtc coding mode to run integral projection
based motion search for golden reference frame. It improves the
speed -6 compression performance by 1.1% on average, 3.46% for
jimred_vga, 6.46% for tacomascmvvga, and 0.5% for vidyo clips. The
speed -6 is about 6% slower.

Change-Id: I0fe402ad2edf0149d0349ad304ab9b2abdf0c804
2015-03-11 16:03:49 -07:00
Yaowu Xu
d549aa3b17 Separate rd_thresh adaption by ref_frame
Only update the rd_thresh factors for modes sharing the same
reference frame. This improves overall compression at speeds 6 and 7
by .13% and .19%, respectively, without any noticeable speed difference.

Change-Id: Idb3a3879512c5d7d0880034516079949290690c5
2015-03-10 19:06:52 -07:00
Jingning Han
9708f9d66a Merge "Skip golden ref frame check when it is same as last ref frame" 2015-03-09 12:27:19 -07:00
Jingning Han
6245a91e0b Skip golden ref frame check when it is same as last ref frame
When golden reference frame is refreshed, the next frame has both
its last and golden reference frames point to the same reference
frame in real-time coding mode. Experiments suggest that using
two separate reference frames for frames right after golden refresh
frame does not provide further compression performance advantage.
This commit hence retains the current encoder implementation and
shuts off the mode search over golden reference frame in this case.

It makes the encoder run slightly faster with no change in coding
performance.

Change-Id: I1561f7799253a10e675d05c63c1749fe9e85b472
2015-03-09 11:14:55 -07:00
Yunqing Wang
268f260d64 Modify the setting of transform skip flags in non-rd mode
While searching for the best mode in non-rd case, SSE of
a partition block is calculated and the transform size is set.
This patch rewrites the skip checking conditions based on
transform size instead of partition size to be more precise.

Small gains were seen in rtc set borg test (speed 6).
AVG PSNR: 0.087%, overall PSNR: 0.073%, SSIM: 0.146%.
No noticeable speed change.

Change-Id: I5603ca5339c784dfa02263f4005988ccd8c32f6e
2015-03-06 09:22:00 -08:00
Adrian Grange
3807dd82ab Make encoder buffer allocation dynamic
Frame buffers are now allocated dynamically on-demand.

Entries in the reference frame map, cm->ref_frame_map,
may now be set to -1 (INVALID_IDX) to indicate that
there is not a valid reference buffer in that "slot".

All slots in the reference frame map are now initialized
to the empty state (-1) and each buffer is initialized
to have a reference count of 0.
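
The initial state described above might look like the following sketch.
Only cm->ref_frame_map and INVALID_IDX come from the commit message;
the struct layout, the array sizes, and the helper names are
assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define INVALID_IDX (-1)
#define NUM_REF_SLOTS 8   /* assumed number of reference map slots */
#define NUM_FRAME_BUFS 12 /* assumed size of the frame buffer pool */

typedef struct {
  int ref_count; /* 0 means the buffer is free */
} FrameBufSketch;

typedef struct {
  int ref_frame_map[NUM_REF_SLOTS];
  FrameBufSketch frame_bufs[NUM_FRAME_BUFS];
} CommonSketch;

/* Mark every slot empty and every buffer unreferenced. */
static void init_ref_state(CommonSketch *cm) {
  int i;
  assert(cm != NULL);
  for (i = 0; i < NUM_REF_SLOTS; ++i) cm->ref_frame_map[i] = INVALID_IDX;
  for (i = 0; i < NUM_FRAME_BUFS; ++i) cm->frame_bufs[i].ref_count = 0;
}

/* Hypothetical helper: claim the first free buffer on demand, or
 * return INVALID_IDX when the pool is exhausted. */
static int get_free_buf(CommonSketch *cm) {
  int i;
  assert(cm != NULL);
  for (i = 0; i < NUM_FRAME_BUFS; ++i) {
    if (cm->frame_bufs[i].ref_count == 0) {
      cm->frame_bufs[i].ref_count = 1;
      return i;
    }
  }
  return INVALID_IDX;
}
```

Code that reads the map then has to check for INVALID_IDX before
indexing into the buffer pool.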

Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582
2015-03-04 07:58:32 -08:00
Jingning Han
1790d45252 Use variance metric for integral projection vector match
This commit replaces the SAD with variance as metric for the
integral projection vector match. It improves the search accuracy
in the presence of slight light change. The average speed -6
compression performance for rtc set is improved by 1.7%. No speed
changes are observed for the test clips.
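
The following sketch shows why variance tolerates a light change
better than SAD when matching two projection vectors. The function
names are hypothetical and the variance is computed in a simplified,
unnormalized form (sum of squared differences minus the squared sum
divided by the length), which may differ from the encoder's version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* SAD between two projection vectors of length n. */
static int vector_sad(const int16_t *a, const int16_t *b, int n) {
  int i, sad = 0;
  assert(a != NULL && b != NULL && n > 0);
  for (i = 0; i < n; ++i) sad += abs(a[i] - b[i]);
  return sad;
}

/* Unnormalized variance of the element-wise difference. A constant
 * offset between a and b (a uniform brightness shift) cancels out. */
static int vector_var(const int16_t *a, const int16_t *b, int n) {
  int i, sum = 0, sse = 0;
  assert(a != NULL && b != NULL && n > 0);
  for (i = 0; i < n; ++i) {
    const int d = a[i] - b[i];
    sum += d;
    sse += d * d;
  }
  return sse - (sum * sum) / n;
}
```

With a = {10, 20, 30, 40} and b = {15, 25, 35, 45}, the SAD is 20 even
though the shapes match exactly, while the variance metric is 0.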

Change-Id: I71c1d27e42de2aa429fb3564e6549bba1c7d6d4d
2015-03-01 10:42:56 -08:00
Marco
a1b402e71c Merge "Adjustments to cyclic refresh (aq-mode=3)." 2015-02-20 09:55:05 -08:00
Jingning Han
6728655422 Merge "Add high bit depth support to rtc sub8x8 block coding" 2015-02-20 09:35:18 -08:00
Marco
0187f4b411 Adjustments to cyclic refresh (aq-mode=3).
Target higher delta-qp for big blocks with zero motion,
and for segment#1: avoid 64x64 partition size and force 8x8 tx size.

Metrics on RTC set mostly positive: SSIM up by ~4%, PSNR by ~1.5%.
There doesn't seem to be any change in speed.

Change-Id: I1f68fa3c4f62dab3b90cc58041f05ebb048ae5ac
2015-02-20 08:47:59 -08:00
Jingning Han
6f4245894a Add high bit depth support to rtc sub8x8 block coding
This commit adds proper buffer handling to support high bit depth
in rtc sub8x8 block coding.

Change-Id: Ibaf8a2160194121aec9ca68b8094817fed9ccaea
2015-02-20 08:36:33 -08:00
Yunqing Wang
81fc5bf81c Improve skip_txfm thresholds in the non-rd mode selection
Modified the thresholds for deciding whether or not to skip
the transforms in model_rd_for_sb_y(). Used zbin[] instead
of dequant[] to be more precise. Also, modified the checking
conditions.

Rtc set borg test results (at speed 6) showed:
average PSNR gain: 0.138%, overall PSNR gain: 0.158%,
and SSIM gain: 0.177%.

The data rate test was modified slightly as suggested by
Marco.

Change-Id: Ieaf633ab77f4838cb3c45cf69065b29d55f8ae6c
2015-02-19 14:30:46 -08:00
Marco
b1940bf5fe Replace some operations with shift in encoder_breakout.
Replaced a divide by 9 with a divide by 8 (a shift), so there is a
very small difference, but otherwise no change in behavior.

Change-Id: I1079ae3c41e0789ff0bc6fa9940a238b6bca0f5b
2015-02-13 10:45:19 -08:00
Jingning Han
e665c8f2c9 Add mode cost to sub8x8 block mode decision in rtc coding
This commit allows the encoder to properly account for the mode
cost in sub8x8 non-RD mode decision.

Change-Id: I2951960d20e37ed08e372ee0c7044935b2b9b899
2015-02-11 14:43:02 -08:00