Commit Graph

5144 Commits

Author SHA1 Message Date
Jingning Han
657cabe0f7 Tune SSSE3 assembly implementation to improve quantization speed
Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3
2015-04-01 15:28:01 -07:00
Yaowu Xu
fff4654d36 Merge "Simplify bsize calculation" 2015-04-01 15:06:55 -07:00
Jingning Han
cf4447339e Merge "Optimize quantization simd implementation" 2015-04-01 14:55:18 -07:00
Jingning Han
a4364e5146 Merge "Simplify effective src_diff address computation" 2015-04-01 14:55:03 -07:00
Jingning Han
7acb2a8795 Merge "Refactor block_yrd function for RTC coding mode" 2015-04-01 14:54:24 -07:00
Yaowu Xu
ba91b54d7c Simplify bsize calculation
Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168
2015-04-01 12:15:06 -07:00
Jingning Han
19da916716 Simplify effective src_diff address computation
Remove redundant offset calculation for effective src_diff address.

Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef
2015-04-01 12:07:47 -07:00
Jingning Han
1470529f62 Refactor block_yrd function for RTC coding mode
This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.

Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
2015-04-01 12:00:43 -07:00
Jingning Han
eed1badedd Optimize quantization simd implementation
This commit allows the quantizer to compare the AC coefficients to
the quantization step size to determine if further multiplication
operations are needed. It makes the quantization process 20% faster
without coding statistics change.

Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
2015-04-01 11:47:09 -07:00
Yunqing Wang
a0043c6d30 Enhance the transform skipping decision-making in non-rd mode
For large partition blocks(block_size > 32x32), the variance
calculation is modified so that every 8x8 block's variance
is stored during the calculation, which is used in the
following transform skipping test. Also, the variance for
every tx block is calculated. The skipping test checks all tx
blocks in the partition, and sets the skip flag only if all tx
blocks are skippable. If the skip flag of Y plane is 1, a
quick evaluation is done on UV planes. If the current partition
block is skippable in YUV planes, the mode search checks fewer
inter modes and doesn't check intra modes.

The rtc set borg test(at speed 6) showed that:
Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%.
Average single-thread speedup on rtc set was 3.5%.
For 720p clips, more speedups were seen.
gipsrecmotion: 13%
gipsrestat: 12%
vidyo: 5 - 9%
dark: 15%
niklas: 6%

Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e
2015-04-01 09:43:40 -07:00
Yunqing Wang
fc98114761 Merge "Rename vbp thresholds" 2015-03-31 16:33:30 -07:00
Yunqing Wang
c28ff1a9de Rename vbp thresholds
Code refactoring

Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79
2015-03-31 15:14:44 -07:00
Jingning Han
502ac72233 Merge "Tuning SATD rate calculation for speed" 2015-03-31 14:24:26 -07:00
Jingning Han
1c39c5b96f Merge "Use aligned copy in 8x8 Hadamard transform SSE2" 2015-03-31 12:16:47 -07:00
Jingning Han
fa4289522e Merge "Allow block skip coding option in RTC mode" 2015-03-31 12:16:36 -07:00
Jingning Han
1638d7dc96 Merge "Fix 8x8 Hadamard SSE2 implementation" 2015-03-31 12:16:27 -07:00
Alex Converse
9670d766ab Merge "VP9E_GET_ACTIVE_MAP API function." 2015-03-31 11:52:56 -07:00
Jingning Han
531468a07a Tuning SATD rate calculation for speed
This commit allows the encoder to check the eob per transform
block to decide how to compute the SATD rate cost. If the entire
block is quantized to zero, there is no need to add anything; if
only the DC coefficient is non-zero, add its absolute value;
otherwise, sum over the block. This reduces the CPU cycles spent
on vp9_satd_sse2 to one third.

Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32
2015-03-31 11:02:20 -07:00
hui su
d4f2f1dd5b Merge "Move vp9_coef_con_tree to common/" 2015-03-31 10:51:10 -07:00
Jingning Han
014fa45298 Use aligned copy in 8x8 Hadamard transform SSE2
This reduces the 8x8 Hadamard transform cycles by 20%.

Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b
2015-03-31 10:21:52 -07:00
Jingning Han
db5ec37edc Merge "Enable 16x16 Hadamard transform in SATD based mode decision" 2015-03-31 09:55:41 -07:00
Jingning Han
8c5670bb6f Merge "Use SATD based mode decision for block sizes below 16x16" 2015-03-31 09:47:47 -07:00
Jingning Han
ebe1be9186 Allow block skip coding option in RTC mode
When the estimated rate-distortion cost of skip coding mode is
lower than that of sending quantized coefficients, allow the
encoder to drop these coefficients. This improves the compression
performance of speed -6 by 0.268% and makes the encoding speed
slightly faster.

Change-Id: Idff2d7ba59f27ead33dd5a0e9f68746ed3c2ab68
2015-03-31 09:32:53 -07:00
hui su
302e24cb3e Move vp9_coef_con_tree to common/
This tree should be defined in common/, as it is needed for
both encoder and decoder.

Change-Id: I4f5cbc80025cf2ced14182c98f7c82dc7d0f87db
2015-03-31 09:20:46 -07:00
Jingning Han
9b99eb2e12 Merge "Reuse inter prediction pixel block for Hadamard transform" 2015-03-30 16:09:38 -07:00
Jingning Han
34a996ac1e Fix 8x8 Hadamard SSE2 implementation
This commit fixes the SSE2 version 8x8 Hadamard transform
alignment and makes it consistent with the C version.

Change-Id: I1304e5f97e0e5ef2d798fe38081609c39f5bfe74
2015-03-30 15:54:08 -07:00
Jingning Han
26d3d3af6a Enable 16x16 Hadamard transform in SATD based mode decision
This commit replaces the 16x16 2D-DCT transform with Hadamard
transform for RTC coding mode. It reduces the CPU cycles cost
on 16x16 transform by 5X. Overall it makes the speed -6 encoding
speed 1.5% faster without compromise on compression performance.

Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b
2015-03-30 15:43:31 -07:00
Jingning Han
f0ac5aaa08 Merge "Hadamard transform based coding mode decision process" 2015-03-30 15:43:15 -07:00
Jingning Han
b4b5af6acd Use SATD based mode decision for block sizes below 16x16
This commit makes the encoder to select between SATD/variance as
metric for mode decision. It also allows to account chroma
component costs for mode decision as well. The overall encoding
time increase as compared to variance based mode selection is about
15% for speed -6. The compression performance is on average 2.2%
better than variance based approach, with about 5% compression
performance gains for hard clips (e.g., jimredvga, nikas720p, and
mmmoving) at lower bit-rate range.

Change-Id: I4d04a31d36f4fcb3f5f491dacd6e7fe44cb9d815
2015-03-30 15:20:07 -07:00
Jingning Han
8a927a1b7a Reuse inter prediction pixel block for Hadamard transform
It saves one unnecessary motion compensated prediction constructed
by using 8-tap filter.

Change-Id: I101215131e6f38621d5935885f94cc74de6a5377
2015-03-30 15:04:33 -07:00
Jingning Han
8c411f74e0 Hadamard transform based coding mode decision process
This commit uses Hadamard transform based rate-distortion cost
estimate for rtc coding mode decision. It improves the compression
performance of speed -6 for many hard clips at lower bit-rates.
For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for
niklas720p. This will introduce extra encoding cycle costs at
this point.

Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375
2015-03-30 14:46:05 -07:00
Alex Converse
bf7def9a43 Merge "Simplify skip check." 2015-03-30 11:31:45 -07:00
Marco
fa20a60f0d Speed 5: use non-rd mode for key frame coding.
Metrics on RTC set go down by ~1.5% on average.
Key frame encoding time goes down by factor of ~5.

Change-Id: Ia83acc55848613870e5ac6efe7f3d904d877febb
2015-03-27 16:19:26 -07:00
Adrian Grange
ad18b2b641 Remove 8-bit array in HBD
Creating both 8- and 16-bit arrays and then only using one
of them is wasteful.

Change-Id: Ic5b397c283efaff7bcfff2d2413838ba3e065561
2015-03-25 15:37:03 -07:00
Adrian Grange
65df3d138a Replace heap with stack memory allocation
Replaced the dynamic memory allocation of the
second_pred buffer with an allocation on the stack.

Change-Id: I2716c46b71e8587714ca5733a99eca2c68419b23
2015-03-25 15:36:43 -07:00
Adrian Grange
8d8d7bfde5 Fix use of scaling in joint motion search
To enable us to the scale-invariant motion estimation
code during mode selection, each of the reference
buffers is scaled to match the size of the frame
being encoded.

This fix ensures that a unit scaling factor is used in
this case rather than the one calculated assuming that
the reference frame is not scaled.

Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf
2015-03-25 15:35:29 -07:00
paulwilkins
ab788c5380 Merge "Enable group adaptive max q by default." 2015-03-24 15:00:12 -07:00
Alex Converse
4dcb839607 VP9E_GET_ACTIVE_MAP API function.
This is useful when aq mode 3 (cyclic refresh) reactivates segments for refresh.

Change-Id: I3ad1d9410b899ede393d82bb8db14e2da4d84eca
2015-03-24 11:19:47 -07:00
Yaowu Xu
c77d4dcb35 Merge "vp9_pred_mv(): misc fixes and optimizations" 2015-03-24 10:36:51 -07:00
Alex Converse
02697e35dc Merge "A tiny cyclic refresh / active map fix." 2015-03-24 09:43:24 -07:00
paulwilkins
8ea7bafdaa Merge "Revised rd adjustment for variance." 2015-03-24 03:12:56 -07:00
paulwilkins
c0b71cf82f Merge "Experimental rd bias based on source vs recon variance." 2015-03-24 03:12:41 -07:00
Alex Converse
31f1563a92 A tiny cyclic refresh / active map fix.
Change-Id: I198727461455c8c198a0c892d02ed3cb1673aa50
2015-03-23 18:51:00 -07:00
hkuang
cd1d40ff5d Merge "Safely free all the frame buffers after all the workers finish the work." 2015-03-23 16:50:15 -07:00
Alex Converse
b7605a9d70 Simplify skip check.
SEG_LVL_SKIP implies skip. This is enforced by skip = write_skip().

Change-Id: I61c79581c9c53deae36685c2bcf388cb4d8827d3
2015-03-23 10:53:31 -07:00
paulwilkins
691ec45b4e Enable group adaptive max q by default.
Set the GF group adaptive max Q compile flag to 1 by default.

This change has a quite big visual impact in some clips and also
contributes to tighter rate control.

For short test clips that have consistent content the impact is
quite small on metrics but for more varied long form clips there is
a drop in overal psnr but a sharp rise in average psnr caused by
greater expenditure on some easier sections and tighter rate clipping
in hard sections.

In chunck'ed encodes some of the effect will already be present due
to the independent rate control in each chunk but this change takes
the control down to a smaller scale.

yt hd +10.67%, - 3.77%, -1.56%
yt +9.654%, - 3.6%, - 1.82%
std hd +0.25%, -0.85%, -0.42%
derf +0.25%, - 1.1%. - 0.87%

Change-Id: Ibbc39b800d99d053939f4c6712d715124082843e
2015-03-23 15:57:09 +00:00
Yaowu Xu
9fd8abc541 vp9_pred_mv(): misc fixes and optimizations
1. skip near if it is same as nearest
2. correct rounding for converting mv to fullpel position
3. update pred_mv_sad after new mv search.

Overall .1%~.25% compression gains on rtc set for speed 5, 6, 7, 8.

Change-Id: Ic300ca53f7da18073771f1bb993c58cde9deee89
2015-03-20 17:17:04 -07:00
Alex Converse
6d6ef8eb3c Don't apply active map on key frames.
This allows applciations to be KF oblivious.

Change-Id: Ic02712eae6ad8d6b3eaec26548299d24ca0d5cc0
2015-03-20 14:57:24 -07:00
Alex Converse
e032fc7b9e Set loop filter level to zero on inactive segment.
Change-Id: I6022a79351882a72a219aee13563bf21bcd70383
2015-03-20 14:43:06 -07:00
paulwilkins
7e234b9228 Revised rd adjustment for variance.
Revised adjustment for rd based on source complexity.
Two cases:

1) Bias against low variance intra predictors
when the actual source variance is higher.

2) When the source variance is very low to give a slight
bias against predictors that might introduce false texture
or features.

The impact on metrics of this change across the test sets is
small and mixed.

derf -0.073%, -0.049%, -0.291%
std hd -0.093%, -0.1%, -0.557%
yt  +0.186%, +0.04%, - 0.074%
ythd +0.625%, + 0.563%, +0.584%

Medium to strong psycho-visual improvements in some
problem clips.

This feature and intra weight on GF group length now
turned on by default.

Change-Id: Idefc8b633a7b7bc56c42dbe19f6b2f872d73851e
2015-03-20 11:59:39 +00:00