Commit Graph

13394 Commits

Author SHA1 Message Date
Jingning Han
9922e4344a Enable Hadamard transform based cost estimate for all block sizes
This commit turns on the Hadamard transform based rate distortion
estimate for all block sizes in RTC coding mode. It conditionally
skips the rate distortion estimation if the all-zero block flag is
set. No significant encoding speed change is observed. The
compression performance of speed -6 is improved by 1.7% over using
it only for block sizes of 32x32 and below.

Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb
2015-04-04 09:58:45 -07:00
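
Note on 9922e4344a: a Hadamard transform based cost estimate is commonly built on a sum of absolute transformed differences (SATD). The following is a minimal sketch on a 4x4 residual block, not the libvpx code. The names `hadamard4_1d` and `satd_4x4` are hypothetical, and the output is left unnormalized.

```c
#include <stdint.h>
#include <stdlib.h>

/* In-place 4-point Hadamard butterfly over elements spaced `stride` apart.
 * Hypothetical helper, for illustration only. */
static void hadamard4_1d(int32_t *v, int stride) {
  const int32_t a0 = v[0 * stride] + v[1 * stride];
  const int32_t a1 = v[0 * stride] - v[1 * stride];
  const int32_t a2 = v[2 * stride] + v[3 * stride];
  const int32_t a3 = v[2 * stride] - v[3 * stride];
  v[0 * stride] = a0 + a2;
  v[1 * stride] = a1 + a3;
  v[2 * stride] = a0 - a2;
  v[3 * stride] = a1 - a3;
}

/* SATD of a 4x4 residual block stored row-major: apply the butterfly to
 * every row, then every column, and sum the absolute coefficients. */
int32_t satd_4x4(const int16_t *residual) {
  int32_t coeff[16];
  int32_t sum = 0;
  int i;
  for (i = 0; i < 16; ++i) coeff[i] = residual[i];
  for (i = 0; i < 4; ++i) hadamard4_1d(coeff + 4 * i, 1); /* rows */
  for (i = 0; i < 4; ++i) hadamard4_1d(coeff + i, 4);     /* columns */
  for (i = 0; i < 16; ++i) sum += abs(coeff[i]);
  return sum;
}
```

An all-zero residual gives a SATD of zero, which is the case the commit message describes as skipping the estimate.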
Yunqing Wang
b2baaa215b Merge "Fix the scaling factor in UV skipping test" 2015-04-03 17:09:59 -07:00
James Zern
5afa7d1f87 vp8_regular_quantize_b_sse2: remove dead init
Change-Id: Ide5eefadbb3cab38743a69f744a003abb37a6506
2015-04-03 16:44:16 -07:00
James Zern
30205e14b7 vp8cx_pick_filter_level*: remove dead inits
Change-Id: I28026b86d03264b9f4e2fc8ac1d3c74aa3954208
2015-04-03 16:44:15 -07:00
James Zern
acb219be25 vp8_decode_frame: remove dead increment
Change-Id: Ie9a6fac02796d24e6f4a15416d0b4c19010547df
2015-04-03 16:44:15 -07:00
James Zern
0c5a140a02 rdopt: remove dead stores
Change-Id: Ia8a20c6751cc6d63c60bb00b99c78faca1e61051
2015-04-03 16:44:14 -07:00
James Zern
04c53d2393 find_next_key_frame: remove dead init & store
Change-Id: I8c7f5b9718ef14e4397a263aa9f52a9edcf7d1cd
2015-04-03 16:43:48 -07:00
James Zern
970acffa8f multiframe_quality_enhance_block: remove dead stores
Change-Id: I33ca9cddfdd54c3d8a23c1cb978986a537a20bf2
2015-04-03 16:15:51 -07:00
James Zern
7b4f727959 vp8_print_modes_and_motion_vectors: remove dead stores
Change-Id: I438cbf4970fa2220fb73b0b41a29e654836d4e3b
2015-04-03 16:08:37 -07:00
Yunqing Wang
1a1114d21c Fix the scaling factor in UV skipping test
The threshold scaling factor was calculated incorrectly using the
partition size "bsize". Thanks to Yaowu for pointing it out. It is
now fixed and no speed change was seen.

Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27
2015-04-03 16:07:43 -07:00
Ed Baker
4e73e4bf93 Test loopfilters with count=2
The following functions use the count parameter to either loop or select
dedicated paths:
vp9_lpf_horizontal_16_c
vp9_lpf_horizontal_16_sse2
vp9_lpf_horizontal_16_avx2
vp9_lpf_horizontal_16_neon
vp9_highbd_lpf_horizontal_16_c
vp9_highbd_lpf_horizontal_16_sse2

Change-Id: I7abfd2cb30baa292b4ebe11c847968481103c037
2015-04-03 15:36:52 -07:00
James Zern
44e3640923 Merge "vp9: enable sse4 sad functions" 2015-04-03 14:57:52 -07:00
Johann
0080aca235 Merge "Merge branch 'indianrunnerduck'" 2015-04-03 13:43:20 -07:00
Johann
c5f7842234 Merge "Remove AltiVec flag" 2015-04-03 13:42:49 -07:00
Johann
79bd071373 Merge branch 'indianrunnerduck'
* indianrunnerduck:
  Update CHANGELOG for v1.4.0 (Indian Runner Duck) release
  vp9: fix high-bitdepth NEON build
  Fix use of scaling in joint motion search
  Prepare Release Candidate for libvpx v1.4.0
  vp8cx.h: vpx/vpx_encoder.h -> ./vpx_encoder.h

Change-Id: Ib2eee50f02e12623aae478871cb9150604bb2ac2
2015-04-03 12:53:45 -07:00
Johann
c74bf6d889 Update CHANGELOG for v1.4.0 (Indian Runner Duck) release
Change-Id: Id31b4da40c484aefc1236f5cc568171a9fd12af2
2015-04-03 11:49:19 -07:00
Jingning Han
30e9c091c0 Merge "Tune SSSE3 assembly implementation to improve quantization speed" 2015-04-03 11:24:28 -07:00
Johann
73fe337647 Remove AltiVec flag
Change-Id: I560b1a954a5089a8af69952b8084408c6a420b96
2015-04-03 10:33:20 -07:00
Jingning Han
60e01c6530 Account for eob cost in the RTC mode decision process
This commit accounts for the transform block end of coefficient flag
cost in the RTC mode decision process. This allows a more precise
rate estimate. It also turns on the model for block sizes up to 32x32.
The test sequences show about a 3% - 5% speed penalty for speed -6.
The average compression performance improvement for speed -6 is
1.58% in PSNR. The compression gains for hard clips like jimredvga,
mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and
3.2%, respectively.

Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb
2015-04-03 10:31:51 -07:00
hkuang
d72ed35374 Merge "Fix error of "Left shift of negative value -1"." 2015-04-02 21:35:12 -07:00
Yunqing Wang
12cb30d4bd Merge "Set vbp thresholds for aq3 boosted blocks" 2015-04-02 18:22:08 -07:00
Yaowu Xu
718feb0f69 move ref_frame_cost computations into a function
Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1
2015-04-02 18:10:55 -07:00
hkuang
73c8fe5deb Fix error of "Left shift of negative value -1".
Change-Id: Ia4f3feb20df0e89cc51b02def858e12e927312cc
2015-04-02 17:35:33 -07:00
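
Note on 73c8fe5deb: in C, left-shifting a negative value such as `-1 << n` is undefined behavior, which is what the quoted diagnostic reports. The actual change is not shown in this log. The sketch below only illustrates two generic ways to avoid the construct, assuming a 32-bit `int`; both function names are hypothetical.

```c
#include <stdint.h>

/* Undefined behavior in C, because the left operand is negative:
 *   int bad = -1 << n;
 */

/* Option 1: shift a positive value, then negate.
 * Valid for 0 <= n <= 30 with a 32-bit int. */
int neg_pow2(int n) { return -(1 << n); }

/* Option 2: build the mask in unsigned arithmetic, where left shift is
 * well defined. Valid for 0 <= n <= 31. */
uint32_t high_bits_mask(unsigned n) {
  return (uint32_t)(UINT32_MAX << n);
}
```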
Marco
f85f79f630 Merge "Code cleanup: put (8x8/4x4)fill_variance into separate function." 2015-04-02 17:33:01 -07:00
Johann
327b138b2c Merge "Remove PPC build support" 2015-04-02 16:26:48 -07:00
Yunqing Wang
cae03a7ef5 Set vbp thresholds for aq3 boosted blocks
The vbp thresholds are set separately for boosted/non-boosted
superblocks according to their segment_id. This way we don't
have to force the boosted blocks to split to 32x32.

Speed 6 RTC set borg test result showed some quality gains.
Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%.
No speed change was observed.

Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a
2015-04-02 15:48:32 -07:00
James Zern
d181a627f0 vp9: fix high-bitdepth NEON build
remove incorrect specializations in rtcd and update a configuration
check in partial_idct_test.cc

(cherry picked from commit 8845334097)

Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0
2015-04-02 15:19:46 -07:00
Adrian Grange
5ef2d1ddae Fix use of scaling in joint motion search
To enable us to use the scale-invariant motion estimation
code during mode selection, each of the reference
buffers is scaled to match the size of the frame
being encoded.

This fix ensures that a unit scaling factor is used in
this case rather than the one calculated assuming that
the reference frame is not scaled.

(cherry picked from commit 8d8d7bfde5)

Change-Id: Id9a5c85dad402f3a7cc7ea9f30f204edad080ebf
2015-04-02 15:19:23 -07:00
Marco
77ea408983 Code cleanup: put (8x8/4x4)fill_variance into separate function.
Code cleanup, no change in behavior.

Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24
2015-04-02 13:37:35 -07:00
Marco
6eb05c9ed0 Small fix to segment check in pickmode.
Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72
2015-04-02 09:55:13 -07:00
Johann
bc98e93b53 Remove PPC build support
There are no functional optimizations for AltiVec/PPC

Change-Id: I6877a7a9739017fe36fc769be22679c65ea99976
2015-04-02 09:13:59 -07:00
James Zern
b8a1de86fd Merge "vp9/neon: skip some files in high-bitdepth build" 2015-04-01 23:36:56 -07:00
James Zern
b644384bb5 Merge "vp9: fix high-bitdepth NEON build" 2015-04-01 23:36:17 -07:00
Yaowu Xu
54210f706c Merge "use MAX_MB_PLANE consistently" 2015-04-01 18:24:39 -07:00
hkuang
f3bea3de5b Remove unnecessary set postproc flags.
Change-Id: Iaf136969bc368a890f9671647576ee9d54eef03b
2015-04-01 17:11:35 -07:00
hkuang
4cf68be17a Merge "Fix 10-bit video decode failure with --frame-parallel mode." 2015-04-01 17:07:58 -07:00
Jingning Han
2149f214d5 Merge "Reduce required xmm number by one in block_error_fp" 2015-04-01 15:46:22 -07:00
Jingning Han
657cabe0f7 Tune SSSE3 assembly implementation to improve quantization speed
Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3
2015-04-01 15:28:01 -07:00
Yaowu Xu
f26b8c84f8 use MAX_MB_PLANE consistently
Change-Id: Ic416a7f145001a88f5a7f70dde9b1edbc1b69381
2015-04-01 15:21:20 -07:00
Yaowu Xu
fff4654d36 Merge "Simplify bsize calculation" 2015-04-01 15:06:55 -07:00
Jingning Han
cf4447339e Merge "Optimize quantization simd implementation" 2015-04-01 14:55:18 -07:00
Jingning Han
a4364e5146 Merge "Simplify effective src_diff address computation" 2015-04-01 14:55:03 -07:00
Jingning Han
7acb2a8795 Merge "Refactor block_yrd function for RTC coding mode" 2015-04-01 14:54:24 -07:00
Yaowu Xu
ba91b54d7c Simplify bsize calculation
Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168
2015-04-01 12:15:06 -07:00
Jingning Han
19da916716 Simplify effective src_diff address computation
Remove redundant offset calculation for effective src_diff address.

Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef
2015-04-01 12:07:47 -07:00
Jingning Han
f2cf3c06a0 Reduce required xmm number by one in block_error_fp
Use 6 xmms instead of 8.

Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7
2015-04-01 12:07:35 -07:00
Jingning Han
1470529f62 Refactor block_yrd function for RTC coding mode
This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.

Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
2015-04-01 12:00:43 -07:00
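
Note on 1470529f62: the separation described above can be sketched as two stages, where the rate pass is skipped once every transform block is known to have quantized to zero. This is a simplified illustration under assumed data layouts, not the libvpx `block_yrd`. The struct `rd_estimate`, the functions `block_error` and `estimate_rd`, and the use of a plain sum of absolute quantized coefficients as the rate proxy are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  int64_t rate_proxy; /* sum of |quantized coefficient| over coded blocks */
  int64_t dist;       /* sum of squared (coeff - dqcoeff) */
  int skippable;      /* 1 when every transform block has eob == 0 */
} rd_estimate;

/* Block error only: no separate sum of squared source coefficients. */
static int64_t block_error(const int16_t *coeff, const int16_t *dqcoeff,
                           int n) {
  int64_t err = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int d = coeff[i] - dqcoeff[i];
    err += (int64_t)d * d;
  }
  return err;
}

/* Assumes transform and quantization already ran and filled the arrays,
 * with `block_size` coefficients per transform block and one eob each. */
rd_estimate estimate_rd(const int16_t *coeff, const int16_t *qcoeff,
                        const int16_t *dqcoeff, const uint16_t *eobs,
                        int num_blocks, int block_size) {
  rd_estimate est = { 0, 0, 1 };
  int b, i;
  for (b = 0; b < num_blocks; ++b)
    if (eobs[b] != 0) est.skippable = 0;
  est.dist = block_error(coeff, dqcoeff, num_blocks * block_size);
  if (est.skippable) return est; /* all zero: skip the rate pass */
  for (b = 0; b < num_blocks; ++b) {
    if (eobs[b] == 0) continue;
    for (i = 0; i < block_size; ++i)
      est.rate_proxy += abs(qcoeff[b * block_size + i]);
  }
  return est;
}
```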
Jingning Han
eed1badedd Optimize quantization simd implementation
This commit allows the quantizer to compare the AC coefficients to
the quantization step size to determine if further multiplication
operations are needed. It makes the quantization process 20% faster
without coding statistics change.

Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
2015-04-01 11:47:09 -07:00
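
Note on eed1badedd: the idea of comparing AC coefficients to the step size before doing any further arithmetic can be shown in scalar form. This is a sketch under simplifying assumptions, not the SIMD code from the commit: it uses a truncating quantizer without rounding, integer division stands in for the multiplications, and the name `quantize_with_ac_check` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Quantizes n coefficients (index 0 is DC, the rest are AC) and returns
 * the end-of-block position: one past the last nonzero output. */
int quantize_with_ac_check(const int16_t *coeff, int n, int dc_step,
                           int ac_step, int16_t *qcoeff) {
  int i, eob = 0;
  assert(n > 0 && dc_step > 0 && ac_step > 0);
  for (i = 0; i < n; ++i) {
    const int step = (i == 0) ? dc_step : ac_step;
    const int abs_c = abs(coeff[i]);
    int q;
    if (i > 0 && abs_c < ac_step) {
      /* Cheap compare: the result is known to be zero, so the more
       * expensive arithmetic below is skipped. */
      qcoeff[i] = 0;
      continue;
    }
    q = abs_c / step;
    qcoeff[i] = (int16_t)(coeff[i] < 0 ? -q : q);
    if (q != 0) eob = i + 1;
  }
  return eob;
}
```

The output is identical with or without the early check, which matches the commit's statement that coding statistics do not change.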
Yunqing Wang
a0043c6d30 Enhance the transform skipping decision-making in non-rd mode
For large partition blocks (block_size > 32x32), the variance
calculation is modified so that every 8x8 block's variance
is stored during the calculation, which is used in the
following transform skipping test. Also, the variance for
every tx block is calculated. The skipping test checks all tx
blocks in the partition, and sets the skip flag only if all tx
blocks are skippable. If the skip flag of Y plane is 1, a
quick evaluation is done on UV planes. If the current partition
block is skippable in YUV planes, the mode search checks fewer
inter modes and doesn't check intra modes.

The rtc set borg test (at speed 6) showed that:
Overall psnr: -0.527%; Avg psnr: -0.510%; ssim: -0.573%.
Average single-thread speedup on rtc set was 3.5%.
For 720p clips, more speedups were seen.
gipsrecmotion: 13%
gipsrestat: 12%
vidyo: 5 - 9%
dark: 15%
niklas: 6%

Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e
2015-04-01 09:43:40 -07:00
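
Note on a0043c6d30: the rule that the skip flag is set only if all tx blocks in the partition are skippable can be sketched as below. The per-block criterion (variance below an AC threshold, and `sse - var` below a DC threshold) and both function names are assumptions for illustration. The commit message does not spell out the exact test or how the thresholds are derived.

```c
#include <stdint.h>

/* Hypothetical per-block test: low variance (AC energy) and a small
 * remaining DC term, both compared against caller-supplied thresholds. */
int tx_block_is_skippable(uint32_t sse, uint32_t var, uint32_t ac_thr,
                          uint32_t dc_thr) {
  const uint32_t dc_energy = sse >= var ? sse - var : 0;
  return var < ac_thr && dc_energy < dc_thr;
}

/* The partition is skippable only if every tx block passes the test. */
int partition_is_skippable(const uint32_t *sse, const uint32_t *var,
                           int num_tx, uint32_t ac_thr, uint32_t dc_thr) {
  int i;
  for (i = 0; i < num_tx; ++i) {
    if (!tx_block_is_skippable(sse[i], var[i], ac_thr, dc_thr)) return 0;
  }
  return 1;
}
```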
hkuang
1582ac851f Fix 10-bit video decode failure with --frame-parallel mode.
Also add a unit test to avoid the same error in the future.

Issue:981

Change-Id: Iaf9889d8d5514cfdff1ea098e6ae133be56d501f
2015-04-01 09:19:35 -07:00