1005 Commits

Author SHA1 Message Date
Jingning Han
62c7356098 Merge "Use hybrid RD and non-RD coding flow for key frame coding" 2014-12-05 11:25:19 -08:00
Jingning Han
9d88b30854 Remove redundant vp9_zero in choose_partitioning
This makes speed -6 about 2% faster overall, with no change in
compression performance.

Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a
2014-12-05 10:39:39 -08:00
Jingning Han
07711e9b27 Use hybrid RD and non-RD coding flow for key frame coding
When the block size is below 16x16, the encoder switches from non-RD to
RD mode for key frame coding. This largely brings back the key
frame compression performance. For vidyo1 at 1000 kbps, the key
frame coding statistics change as follows:

9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us

As compared to the full RD case
7187F, 34.930 dB, 214470 us

The overall rtc set coding performance (single key frame setting)
is improved by 1.5%.

Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878
2014-12-05 09:35:27 -08:00
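The hybrid flow described in commit 07711e9b27 can be sketched as a single block-size test. This is a minimal illustration only: the BlockSizeSketch enum and key_frame_block_uses_rd() are hypothetical names, not libvpx identifiers, and the real encoder's decision may depend on more than block size.

```c
#include <stdbool.h>

/* Hypothetical block-size enum, ordered from smallest to largest. */
typedef enum {
  SKETCH_BLOCK_4X4,
  SKETCH_BLOCK_8X8,
  SKETCH_BLOCK_16X16,
  SKETCH_BLOCK_32X32,
  SKETCH_BLOCK_64X64
} BlockSizeSketch;

/* Returns true when a key-frame block should take the RD path.
 * Per the commit message, blocks below 16x16 use RD mode and
 * larger blocks stay on the cheaper non-RD path. */
static bool key_frame_block_uses_rd(BlockSizeSketch bsize) {
  return bsize < SKETCH_BLOCK_16X16;
}
```

With this test an 8x8 block takes the RD path, while 16x16 and larger blocks keep the non-RD path.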
Yunqing Wang
a3a4a34c60 Merge "vp9_ethread: the tile-based multi-threaded encoder" 2014-12-05 08:23:49 -08:00
hkuang
62de07c8c6 Merge set_prev_mi function into encoder function.
Change-Id: Ifcf2efbb232ea4cabcdebbe77e0820d121e4a6da
2014-12-04 14:44:23 -08:00
Yunqing Wang
eba9c762a1 vp9_ethread: the tile-based multi-threaded encoder
Currently, VP9 supports column-tile encoding, which allows a frame
to be encoded in multiple column tiles independently. The number of
column tiles is set by the encoder option "--tile-columns". This
provides a way to encode a frame in parallel.

Based on the previous set of patches, this patch implements the
tile-based multi-threaded encoder. Each thread processes one or more
tiles.

Usage:
For HD clips:
--tile-columns=2 --threads=1/2/3/4

While using 4 threads, tests showed that the encoder achieved
2.3X - 2.5X speedup at good-quality speed 3, and 2X speedup at
realtime speed 5.

Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4
2014-12-04 11:21:34 -08:00
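Commit eba9c762a1 says each thread processes one or more tiles. One way to picture that is a round-robin split, sketched below. The round-robin policy and the tiles_for_thread() helper are assumptions made for illustration; the commit message does not say how tiles are actually distributed among threads.

```c
/* Number of tiles handled by worker `thread_idx` when tiles are dealt
 * out round-robin (tile i goes to worker i % num_threads).
 * Hypothetical helper, not a libvpx function. */
static int tiles_for_thread(int num_tiles, int num_threads, int thread_idx) {
  if (num_threads <= 0 || thread_idx < 0 || thread_idx >= num_threads) {
    return 0; /* invalid worker index */
  }
  if (thread_idx >= num_tiles) {
    return 0; /* more workers than tiles: this worker stays idle */
  }
  /* Ceiling of (num_tiles - thread_idx) / num_threads. */
  return (num_tiles - thread_idx + num_threads - 1) / num_threads;
}
```

Under this assumed policy, four column tiles spread over three threads give the first thread two tiles and the other two threads one tile each.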
Jingning Han
17176cd452 Fix indent in source_var_based_partition_search_method
Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e
2014-12-03 12:37:36 -08:00
Marco
8fd3f9a2fb Enable non-rd mode coding on key frame, for speed 6.
For key frame at speed 6: enable the non-rd mode selection in speed setting
and use the (non-rd) variance_based partition.

Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames),
mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16.

Loss in key frame quality (~0.6-0.7dB) compared to rd coding,
but speeds up key frame encoding by at least 6x.
Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6.

Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
2014-12-03 09:18:08 -08:00
Peter de Rivaz
7e40a55ef9 Added high bitdepth sse2 transform functions
Also removes some spurious changes in common/vp9_blockd.h which
was introduced by a rebase issue between nextgen and master branches.

Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
(cherry picked from commit 005d80cd05)
(cherry picked from commit 08d2f54800)
(cherry picked from commit 4230c2306c)
2014-12-02 11:16:24 -08:00
Yunqing Wang
0993bef7e9 vp9_ethread: calculate and save the tok starting address for tiles
Each tile's tok starting address is calculated before the encoding
process. These addresses are stored so that the same calculation
won't be done again in packing bit stream.

Change-Id: I0a3be0301f002260c19a850303f2f73ebc47aa50
2014-11-25 17:19:35 -08:00
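The precomputation described in commit 0993bef7e9 can be sketched as a prefix sum over per-tile token budgets. This is an illustration under assumptions: compute_tile_tok_starts() and the max_tokens[] array of worst-case token counts are hypothetical, and the real encoder derives each tile's budget from its own data structures.

```c
#include <stddef.h>

/* Fills tok_start[i] with the offset (in tokens) at which tile i's
 * token area begins, given an assumed worst-case token count per tile.
 * Returns the total number of token slots required. */
static size_t compute_tile_tok_starts(const size_t *max_tokens, int num_tiles,
                                      size_t *tok_start) {
  size_t offset = 0;
  for (int i = 0; i < num_tiles; ++i) {
    tok_start[i] = offset;   /* tile i starts where tile i-1's budget ends */
    offset += max_tokens[i];
  }
  return offset;
}
```

Because every start offset is known before encoding begins, the same table can be reused when packing the bitstream instead of being recomputed.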
Yunqing Wang
edbd61e136 vp9_ethread: modify VP9_COMP structure
This patch modified struct VP9_COMP. Created a struct ThreadData
to hold the data that needs to be copied for each thread. In the
multi-thread case, one thread processes one tile. All threads
share one copy of VP9_COMP,
(refer to VP9_COMP *cpi in the code)
but each thread has its own copy of ThreadData,
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.

In the single-thread case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.

Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
2014-11-24 17:57:38 -08:00
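The ownership split described in commit edbd61e136 can be sketched with stand-in types. Everything below is hypothetical and heavily simplified: the *Sketch structs only mimic the roles of FRAME_COUNTS, VP9_COMMON, VP9_COMP and ThreadData, and the field layout is an assumption.

```c
/* Simplified stand-ins for the real structures. */
typedef struct { unsigned int counts[4]; } FrameCountsSketch; /* FRAME_COUNTS */
typedef struct { FrameCountsSketch counts; } CommonSketch;    /* VP9_COMMON */
typedef struct { CommonSketch common; } CompSketch;           /* VP9_COMP, shared */
typedef struct {
  FrameCountsSketch *counts; /* where this thread accumulates counts */
  FrameCountsSketch local;   /* private storage for the multi-thread case */
} ThreadDataSketch;          /* ThreadData, one per thread */

/* Single-thread: point straight at the shared counts in the common struct.
 * Multi-thread: point at the thread's private copy. */
static void init_thread_data(CompSketch *cpi, ThreadDataSketch *td,
                             int multi_thread) {
  td->counts = multi_thread ? &td->local : &cpi->common.counts;
}

/* Both cpi (shared) and td (per-thread) are passed down, as the commit
 * message describes for encode_tiles(). */
static void encode_tile_sketch(CompSketch *cpi, ThreadDataSketch *td) {
  (void)cpi;               /* shared, read-only state would be used here */
  td->counts->counts[0]++; /* per-thread accumulation */
}
```

In the single-thread case the increment lands directly in the shared counts, so no merge step is needed afterwards.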
Jingning Han
2fbdfd2c66 Key frame non-RD mode decision process
This commit makes a non-RD coding mode decision process for key
frame coding. It can be optionally turned on in speed -6 and above.

Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5
2014-11-24 09:04:28 -08:00
Paul Wilkins
f5209d7e01 Remove rate component adjustment for AQ1
In AQ1 a rate adjustment was applied for blocks coded with a
deltaq. This tends to skew the partition selection and cause
rate overshoot.

For example, consider a 64x64 super block where some but not all
sub blocks are in a low q segment and some are in a high q segment.
The choice of Q when considering large partition and transform sizes
is defined by the lowest sub block segment id (currently this implies the
lowest Q). If some parts of the larger partition are very hard this will
cause a high rate component.

The correct behavior here is for the rd code to discard the large partition
choice and break down to sub blocks where some have low and some
have high Q. However, the rate correction factor above masks the high
cost of coding at a larger partition size.

Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960
2014-11-21 08:51:58 -08:00
Paul Wilkins
d031237999 Add variance restriction to AQ2.
Add an additional restriction to bit/complexity based
segmentation based on spatial variance.

Only lower Q when both the number of bits spent
in the initial encoding pass and the spatial complexity are
below a threshold. This will prevent the low Q segments
being used just because there is a surfeit of bits.

Small metrics gains especially opsnr.
derf ~0.2% std-hd ~0.3%

Change-Id: I6a8496d466d673f9b0e2b2ca6304ea7b6d8e1cce
2014-11-20 16:23:35 -08:00
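The restriction added in commit d031237999 amounts to requiring two conditions at once. A minimal sketch follows, assuming hypothetical parameter names and thresholds; the units and actual threshold values used by the encoder are not given in the commit message.

```c
#include <stdbool.h>

/* Returns true only when BOTH the first-pass bit spend and the spatial
 * variance of a block are below their thresholds. Hypothetical helper;
 * names and threshold semantics are assumptions. */
static bool aq2_allow_lower_q(int bits_spent, int bits_thresh,
                              double spatial_variance, double variance_thresh) {
  return bits_spent < bits_thresh && spatial_variance < variance_thresh;
}
```

A block that was cheap in the first pass but has high spatial variance is therefore left at its normal Q.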
Paul Wilkins
6a760d483d Initial AQ1 restructuring.
This is the first of a series of patches to restructure and
improve AQ mode 1 (variance based AQ).

Change-Id: Idcf693131a3ea2459dcfd957a54a65b971fa4a2a
2014-11-20 15:50:15 -08:00
Yunqing Wang
54ba65a63e Merge "vp9_ethread: move max/min partition size to mb struct" 2014-11-20 14:00:37 -08:00
Yunqing Wang
ad7586a9e1 vp9_ethread: move max/min partition size to mb struct
The max_partition_size and min_partition_size are set at the
beginning while setting speed features, and then adjusted at
SB level. Moving them to mb struct ensures there is a local
copy for each thread.

Change-Id: I7dd08dc918d9f772fcd718bbd6533e0787720ad4
2014-11-20 09:24:50 -08:00
Yunqing Wang
70c9d2983b Revert "vp9_ethread: include a pointer to mb in VP9_COMP"
This reverts commit 6906d218dd.

Another way will be used to handle mb struct.

Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4
2014-11-20 08:31:12 -08:00
Yunqing Wang
d0b547c676 vp9_ethread: combine encoder counts in separate struct
Several frame counters in encoder are updated at SB level. Combine
those counters and put them in a separate struct, which allows us
to allocate one copy for each thread.

Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7
2014-11-14 16:09:22 -08:00
Yunqing Wang
6906d218dd vp9_ethread: include a pointer to mb in VP9_COMP
Modified VP9_COMP struct to include MACROBLOCK *mb. This change
makes it feasible in multi-thread case to allocate a mb for each
thread.

Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e
2014-11-14 12:31:06 -08:00
Yunqing Wang
807885b5e0 Merge "vp9_ethread: modify the cyclic refresh struct" 2014-11-13 18:35:01 -08:00
Yunqing Wang
8ee605f188 vp9_ethread: modify the cyclic refresh struct
Two members in struct CYCLIC_REFRESH
  int64_t projected_rate_sb;
  int64_t projected_dist_sb;
are updated at the superblock level, which makes them shared data
in the multi-thread situation, and requires extra work to handle
them. However, those values are updated and used immediately, and
therefore can be removed. This patch cleaned up the code and
removed the two members.

Change-Id: I2c6ee4552bf49fb63ce590cdb47f9723974fffb1
2014-11-13 15:05:46 -08:00
Jingning Han
6efafda738 Merge "Refactor nonrd_use_partition coding process" 2014-11-13 13:58:21 -08:00
Jingning Han
754b05a4de Refactor nonrd_use_partition coding process
This commit integrates the non-RD mode decision process and the
encoding process into a single recursion scheme.

Change-Id: I6a7e72a0b84d567554801ebbe01ec75d54c1f77d
2014-11-06 17:00:48 -08:00
Jingning Han
10da059b52 Remove unused is_background function
Change-Id: Ia540eac5f066ae95280c2f898370eddf0110c279
2014-11-05 21:19:23 -08:00
Jingning Han
caaf63b2c4 Rework cut-off decisions in cyclic refresh aq mode
This commit removes the cyclic aq mode dependency on
in_static_area and reworks the corresponding cut-off thresholds.
It improves the compression performance of speed -5 by 1.47% in
PSNR and 2.07% in SSIM, and the compression performance of speed
-6 by 3.10% in PSNR and 5.25% in SSIM. Speed wise, about 1% faster
in both settings at high bit-rates.

Change-Id: I1ffc775afdc047964448d9dff5751491ba4ff4a9
2014-11-05 21:17:09 -08:00
hkuang
55577431ae Bind motion vectors with frame buffer structure.
This saves a lot of memory in the decoder by removing prev_mi,
but prev_mi is still needed in the encoder, so memory use in the
encoder increases slightly.

Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed
2014-10-31 17:01:08 -07:00
Jingning Han
1cffea9fb7 Merge "Rework pred pixel buffer system in non-RD coding mode" 2014-10-31 08:55:24 -07:00
Jingning Han
7bea8c59f9 Rework pred pixel buffer system in non-RD coding mode
This commit makes the inter prediction buffer system support
hybrid partition search. It reduces the runtime of speed -5 by
about 3%. No compression performance change.

vidyo1 720p 1000 kbps
11831 ms -> 11497 ms

nik 720p 1000 kbps
10919 ms -> 10645 ms

Change-Id: I5b2da747c6395c253cd074d3907f5402e1840c36
2014-10-30 11:08:35 -07:00
Jingning Han
07436abb86 Use zero motion vector in choose_partitioning
The zero motion vector was effectively used in the subsampled pixel
based variance calculation. This commit makes it directly use zero
mv to generate prediction.

Change-Id: Ica83dc843e9f8da2f89c3ef451e50f16214c0def
2014-10-27 19:38:43 -07:00
Jingning Han
d56b3eb0cf Refactor encoder tile data structure
Make the common tile info as one element in the encoder tile data
struct.

Change-Id: I8c474b4ba67ee3e2c86ab164f353ff71ea9992be
2014-10-27 19:37:13 -07:00
Jingning Han
192010d218 Refactor rtc coding mode to support tile encoding
Use per tile threshold in the prediction mode search process.

Change-Id: I6c74ee5a3b069bb4281002dfe51310911a0756c0
2014-10-27 09:53:46 -07:00
Jingning Han
eee201c221 Tile based adaptive mode search in RD loop
Make the spatially adaptive mode search in rate-distortion
optimization loop inter tile independent. Experiments suggest that
this does not significantly change the coding statistics.

Single tile, speed 3:
pedestrian_area 1080p 1500 kbps
59192 b/f, 40.611 dB, 101689 ms

blue_sky 1080p 1500 kbps
58505 b/f, 36.347 dB, 62458 ms

mobile_cal 720p 1000 kbps
13335 b/f, 35.646 dB, 45655 ms

as compared to 4 column tiles, speed 3:
pedestrian_area 1080p 1500 kbps
59329 b/f, 40.597 dB, 101917 ms

blue_sky 1080p 1500 kbps
58712 b/f, 36.320 dB, 62693 ms

mobile_cal 720p 1000 kbps
13191 b/f, 35.485 dB, 45319 ms

Change-Id: I35c6e1e0a859fece8f4145dec28623cbc6a12325
2014-10-24 10:00:27 -07:00
Jingning Han
be212d4db3 Refactor rate distortion cost structure in non-RD coding mode
This commit refactors the rate distortion structure used in the
non-RD coding mode and saves a few RDCOST calculations.

Change-Id: I62c3416c300d2c5372f21b96d93a6b633a34ab3a
2014-10-21 17:17:11 -07:00
Jingning Han
1ed1dde06d Remove unused copy_partitioning
Change-Id: I75a2a3772ed17e73180eb4f263cc838cae4927b0
2014-10-21 09:47:58 -07:00
Jingning Han
f2c21cfa1e Merge "Remove deprecated constrain_copy_partitioning function" 2014-10-21 09:44:11 -07:00
Jingning Han
55ec7ebca7 Merge "Remove unused sb_has_motion function in vp9_encodeframe.c" 2014-10-21 09:43:55 -07:00
Jingning Han
072d844aff Merge "Remove deprecated use_lastframe_partitioning feature" 2014-10-21 09:43:45 -07:00
Jingning Han
61ff08ef61 Merge "Hybrid partition search for rtc coding mode" 2014-10-21 09:43:35 -07:00
Paul Wilkins
889c53a507 Merge "Resolve compiler warning." 2014-10-21 07:02:13 -07:00
Jingning Han
abb2fbb10e Remove deprecated constrain_copy_partitioning function
Its functionality has been replaced with choose_partitioning and
threshold based control on split mode check.

Change-Id: Ic9bb321df06b524f5c38ea5874dc6f6a8f93c5e3
2014-10-20 17:08:21 -07:00
Jingning Han
ef53898c48 Remove unused sb_has_motion function in vp9_encodeframe.c
Change-Id: I035fb6aa5c10741b065e27befb097d8087e3c62f
2014-10-20 17:08:11 -07:00
Jingning Han
e62ce79e1a Remove deprecated use_lastframe_partitioning feature
This speed feature has been deprecated in both yt and rtc coding
modes. This commit removes the related operations.

Change-Id: I079c79c6adafe45581af2ebf8b98faebcface1ce
2014-10-20 17:03:38 -07:00
Jingning Han
9f128b3ed9 Hybrid partition search for rtc coding mode
This commit re-designs the recursive partition search scheme in
rtc speed -5. It first checks if the current block is under cyclic
refresh mode. If so, apply recursive partition search. Otherwise,
perform sub-sampled pixel based partition selection. When the
pre-selection finds the partition size should be 32x32 or above,
use the partition size directly. Otherwise, apply partition search
at nearby levels around the preset partition size.

It is enabled in speed -5. The compression performance of rtc
speed -5 is improved by 9.4%. Speed wise, the run-time increases
by 1% to 10%.

nik_720p, 1000 kbps
33220 b/f, 38.977 dB, 10109 ms -> 33200 b/f, 39.119 dB, 10210 ms

vidyo1_720p, 1000 kbps
16536 b/f, 40.495 dB, 10119 ms -> 16536 b/f, 40.827 dB, 11287 ms

Change-Id: I65adba352e3adc03bae50854ddaea1b421653c6c
2014-10-20 13:02:12 -07:00
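The decision order described in commit 9f128b3ed9 can be sketched as a three-way choice. The enum, the function name and the use of a block edge length in pixels are all hypothetical; only the order of the checks follows the commit message.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes described above. */
typedef enum {
  STRATEGY_RECURSIVE_SEARCH, /* full recursive partition search */
  STRATEGY_USE_PRESET,       /* take the pre-selected size directly */
  STRATEGY_SEARCH_NEARBY     /* search levels around the preset size */
} PartitionStrategySketch;

/* preset_size is the edge length, in pixels, chosen by the sub-sampled
 * pixel based pre-selection (e.g. 16 for 16x16). */
static PartitionStrategySketch pick_partition_strategy(bool in_cyclic_refresh,
                                                       int preset_size) {
  if (in_cyclic_refresh) return STRATEGY_RECURSIVE_SEARCH;
  if (preset_size >= 32) return STRATEGY_USE_PRESET;
  return STRATEGY_SEARCH_NEARBY;
}
```

Only blocks outside cyclic refresh with a small pre-selected size pay for the extra nearby-level search.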
Yunqing Wang
67c866750c Merge "Remove the dependency in token storing locations" 2014-10-20 08:26:46 -07:00
Paul Wilkins
9626a0cb62 Resolve compiler warning.
conversion from 'const int64_t' to 'int', possible loss of data.

Change-Id: I471a73bba5d448d9be0ef9cbf1590fa73aa74be1
2014-10-20 12:08:33 +01:00
Debargha Mukherjee
6202c75f84 Merge "Add highbitdepth function for vp9_avg_8x8" 2014-10-18 14:37:10 -07:00
Yaowu Xu
06e65269c7 Merge "Remove unused VAR_BASED_FIXED_PARTITION flag" 2014-10-18 13:31:47 -07:00
Yaowu Xu
7bf475926b Merge "Use rate/distortion thresholds to control non-RD partition search" 2014-10-18 13:31:41 -07:00
Peter de Rivaz
73ae6e495c Add highbitdepth function for vp9_avg_8x8
Cherry-picked from https://gerrit.chromium.org/gerrit/#/c/71914/
(a92f987a6b) on highbitdepth branch.

Change-Id: I6903e4e4cb57d90590725c8a1c64c23da7ae65e8
2014-10-17 17:04:37 -07:00
Yunqing Wang
7c4992c466 Remove the dependency in token storing locations
Currently, the tokens for a tile are stored immediately after its
preceding tile, which causes a dependency. This is unnecessary
since we always allocate enough memory for tokens. Removing
the dependency allows token writing to be done in parallel. This patch
doesn't change the encoding result.

Change-Id: I7365a6e5e2c2833eb14377c37e1503c9d0f26543
2014-10-17 14:25:33 -07:00
Jingning Han
3bc94cd2eb Merge "Add init and reset functions for RD_COST struct" 2014-10-17 11:15:19 -07:00
Jingning Han
e1111fba7e Remove unused VAR_BASED_FIXED_PARTITION flag
Change-Id: I4ce19b7cb1c45fed86e81ee785e787630020fb4f
2014-10-17 09:02:25 -07:00
Marco
48ea5b7190 Merge "Some updates for Speed 6/VAR_BASED_PARTITION." 2014-10-15 15:57:21 -07:00
Jingning Han
e2612fbd70 Add init and reset functions for RD_COST struct
Change-Id: I2902de7051a883fd22e27a655209233733969cfd
2014-10-15 15:02:06 -07:00
Jingning Han
5e766ccee0 Use rate/distortion thresholds to control non-RD partition search
Compare the estimated rate and distortion to the thresholds scaled
according to the operating block size and determine if further
split partition search will be run. The compression performance of
speed -5 is changed by -0.074%. The encoding speed is 10% - 15%
faster.

vidyo1 720p
16545 b/f, 40.492 dB, 11475 ms -> 16535 b/f, 40.486 dB, 10100 ms

nik720p
16624 b/f, 36.310 dB, 10071 ms -> 16617 b/f, 36.313 dB, 8346 ms

Change-Id: Ic9197ab5761279ae55d2fb7813b2af0e0db497b8
2014-10-15 13:40:33 -07:00
Marco
09ea74f194 Some updates for Speed 6/VAR_BASED_PARTITION.
Reduce the intra_cost_penalty for non-rd mode,
and some updates to VAR_BASED_PARTITION.

Visual tests show some improvement at Speed 6, for RTC clips.

Change-Id: If9090daf7aed14906a32d931a538ab544bbca606
2014-10-15 12:06:48 -07:00
Jingning Han
89b8c7a513 Replace copy_partitioning use case with choose_partitioning
This commit replaces the use of copy_partitioning with
choose_partitioning based on the sse of subsampled pixels, which
provides significantly better coding performance and runs at
similar speed, as compared to copy_partitioning. It improves rtc
speed 5 coding performance by 3%.

Change-Id: I52d3682a12dce0147f5e52383a594fc242ca3228
2014-10-15 11:37:20 -07:00
Jingning Han
811cef97c9 Refactor rate distortion cost structure
This commit makes a struct that contains rate value, distortion
value, and the rate-distortion cost. The goal is to provide a
better interface for rate-distortion related operation. It is
first used in rd_pick_partition and saves a few RDCOST calculations.

Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f
2014-10-13 14:27:16 -07:00
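A struct of the kind described in commit 811cef97c9 might look like the sketch below. The type and function names are hypothetical, and the cost formula (lambda * rate + dist) is a placeholder assumption, not the encoder's RDCOST computation.

```c
#include <limits.h>
#include <stdint.h>

/* Bundles rate, distortion and the combined cost so callers pass one
 * object instead of three loose values. Hypothetical layout. */
typedef struct {
  int rate;
  int64_t dist;
  int64_t rdcost;
} RdCostSketch;

/* Mark the cost as "worst possible", e.g. before a search starts. */
static void rd_cost_reset_sketch(RdCostSketch *c) {
  c->rate = INT_MAX;
  c->dist = INT64_MAX;
  c->rdcost = INT64_MAX;
}

/* Zero the cost, e.g. before accumulating over sub-blocks. */
static void rd_cost_init_sketch(RdCostSketch *c) {
  c->rate = 0;
  c->dist = 0;
  c->rdcost = 0;
}

/* Store rate and distortion and compute the combined cost once, so later
 * comparisons read c->rdcost instead of recomputing it.
 * Placeholder formula: lambda * rate + dist. */
static void rd_cost_set_sketch(RdCostSketch *c, int rate, int64_t dist,
                               int lambda) {
  c->rate = rate;
  c->dist = dist;
  c->rdcost = (int64_t)lambda * rate + dist;
}
```

Caching the combined cost in the struct is what lets a partition search avoid repeated cost calculations when comparing candidates.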
Yaowu Xu
bdea0055b2 Merge "vp9/choose_partitioning: add missing clear_system_state" 2014-10-10 09:16:19 -07:00
James Zern
a3e1a9291a vp9/choose_partitioning: add missing clear_system_state
set_vt_partitioning does double math

Change-Id: I8e9d73d5c89b937a5326abf04164d24d9d88c5ef
2014-10-10 08:14:46 -07:00
Deb Mukherjee
9a29fdbae7 Merge "Rename highbitdepth functions to use highbd prefix" 2014-10-09 15:39:56 -07:00
Deb Mukherjee
1929c9b391 Rename highbitdepth functions to use highbd prefix
Uses highbd_ prefix convention consistently.

Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-09 14:40:40 -07:00
Jingning Han
112789d4f2 Merge "Remove sub8x8 block index from rd_pick_partition argument" 2014-10-09 11:16:11 -07:00
Jingning Han
6a0d291fce Remove sub8x8 block index from rd_pick_partition argument
This parameter is deprecated. Its function is replaced by
other explicit condition checks.

Change-Id: I61337e350ba8ca9eb50382db8b4d4acbf45cb7eb
2014-10-09 09:20:16 -07:00
James Zern
cec763bd97 set_vt_partitioning: fix type conversion warning
double -> int64
+ make threshold_multiplier an int

Change-Id: I6d3607fdf13d670f57c9d9b04a80acb2be1346a0
2014-10-09 11:41:36 +02:00
Jim Bankoski
20254d1daa Merge "experimental : partition using 1/8 x 1/8 image" 2014-10-08 09:04:26 -07:00
Jim Bankoski
0ce51d823f experimental : partition using 1/8 x 1/8 image
The concept:

There's too much noise in the source pixels for variance, and at low bitrates
the reconstruction looks nothing like the source, so we have problems
getting good partitionings with either. This skirts the issue by using
a box-blurred, scaled-down version for variance calculations. To compare
against source_var_, keyframes were moved to be RD-based like source_var.

Change-Id: Ie3babdbfadae324b7b5a76bea192893af27f0624
2014-10-07 16:36:14 -07:00
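The 1/8 x 1/8 image mentioned in commit 0ce51d823f can be produced by averaging each 8x8 block of source pixels into one output pixel. The sketch below shows that box-filter downscale in isolation; box_downscale_8x() is a hypothetical helper, and the rounding and edge handling are assumptions rather than the commit's actual code.

```c
#include <stdint.h>

/* Writes a (w/8) x (h/8) image to dst, where each output pixel is the
 * rounded mean of one 8x8 block of src. Partial blocks at the right and
 * bottom edges are ignored in this sketch. */
static void box_downscale_8x(const uint8_t *src, int stride, int w, int h,
                             uint8_t *dst) {
  const int dw = w / 8;
  const int dh = h / 8;
  for (int y = 0; y < dh; ++y) {
    for (int x = 0; x < dw; ++x) {
      int sum = 0;
      for (int j = 0; j < 8; ++j) {
        for (int i = 0; i < 8; ++i) {
          sum += src[(y * 8 + j) * stride + (x * 8 + i)];
        }
      }
      dst[y * dw + x] = (uint8_t)((sum + 32) >> 6); /* divide by 64, rounded */
    }
  }
}
```

Variance computed on the downscaled image is less sensitive to per-pixel noise, which is the motivation given in the commit message.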
Jingning Han
7ee58985bd Replace mi_width_log2() with mi_width_log2_lookup table
Change-Id: If0ea98aa139d14d40cd924114e18396aff36b5a5
2014-10-07 12:45:25 -07:00
Jingning Han
b66f7016c1 Take out repeated block width/height lookup functions
The functions b_width_log2 and b_height_log2 only do direct
table fetch. This commit unifies such use cases by using the
table directly and removes these functions.

Change-Id: I3103fc6ba959c1182886a2799d21b8b77c8a7b6b
2014-10-07 12:33:07 -07:00
Yaowu Xu
5966acc1be Merge "Properly initialize segmentID in nonrd coding path" 2014-10-06 07:57:36 -07:00
Yaowu Xu
0065b73481 Properly initialize segmentID in nonrd coding path
This commit adds proper initialization of segment id for variance AQ
mode in non-rd coding path. It fixes the enc/dec mismatch issue of
rt=7 with --aq-mode=1, as reported in issue #816

Change-Id: I02fa41b96345bf2e66077d5ea553f85ba800f7bb
2014-10-03 15:01:53 -07:00
Jingning Han
ef62233396 Fix indent in encode_rd_sb_row
Change-Id: Icbcfe7b56d88474f4398b4c5b52f6719d551ab4a
2014-10-03 11:57:36 -07:00
Jingning Han
bb260d9076 Rework partition search skip scheme
This commit enables the encoder to skip split partition search if
the bigger block size has all non-zero quantized coefficients in low
frequency area and the total rate cost is below a certain threshold.
It logarithmically scales the rate threshold according to the
current block size. For speed 3, the compression performance loss:
derf  -0.093%
stdhd -0.066%

Local experiments show 4% - 20% encoding speed-up for speed 3.
blue_sky_1080p, 1500 kbps
51051 b/f, 35.891 dB, 67236 ms ->
50554 b/f, 35.857 dB, 59270 ms (12% speed-up)

old_town_cross_720p, 1500 kbps
14431 b/f, 36.249 dB, 57687 ms ->
14108 b/f, 36.172 dB, 46586 ms (19% speed-up)

pedestrian_area_1080p, 1500 kbps
50812 b/f, 40.124 dB, 100439 ms ->
50755 b/f, 40.118 dB,  96549 ms (4% speed-up)

mobile_calendar_720p, 1000 kbps
10352 b/f, 35.055 dB, 51837 ms ->
10172 b/f, 35.003 dB, 44076 ms (15% speed-up)

Change-Id: I412e34db49060775b3b89ba1738522317c3239c8
2014-10-03 11:54:30 -07:00
Yunqing Wang
b1b6fd85db Merge "Skip the partition search for still frames" 2014-09-30 11:59:05 -07:00
Yunqing Wang
c8d01b1eaf Merge "Refactor encode_rd_sb_row function" 2014-09-30 11:58:39 -07:00
Yunqing Wang
1fcbf6ed56 Skip the partition search for still frames
This patch re-enabled the feature in Pengchong's patch
(commit 1286126073). Originally, it
was turned on while use_lastframe_partitioning > 0 (no longer used).
Now it was added as a feature, and turned on while speed >= 2.
As described in the original patch, this feature helps speed up the
slideshows in YouTube.

Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471
2014-09-26 09:03:52 -07:00
Deb Mukherjee
993d10a217 Adds various high bit-depth encode functions
Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f
2014-09-25 01:50:36 -07:00
Yunqing Wang
14ee2805a3 Refactor encode_rd_sb_row function
Simplified the code and removed some code that was not used anymore.
This patch didn't change encoding result.

Change-Id: I7e54a74c8f35a6726dfc8a1c55b337448b7ea124
2014-09-24 10:24:18 -07:00
hkuang
c70cea97ac Remove mi_grid_* structures.
mi_grid_* are arrays of pointer to pointer. They save the pointers that point
to the MIs in cm->mi. But they are unnecessary and complicated. The original
goal was to remove MODE_INFO_t copy. But with an extra MODE_INFO_t pointer
inside MODE_INFO_t, same goal could be achieved.

This commit removes the mi_grid_* structures entirely. But there are still
many dummy MODE_INFO_t inside cm->mi, which waste memory. The next commit
will do on-demand MODE_INFO_t allocation to save that memory.

Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6
2014-09-19 21:27:11 -07:00
Yunqing Wang
1bf0beb5fc Refactor encode_superblock function
The code covers both x->skip=0 & x->skip=1 cases.

Change-Id: I09745c10e5994dc700ae4c01b4b62979cdaf3306
2014-09-12 15:58:17 -07:00
Yunqing Wang
f10d7eeda2 Remove the use of use_lastframe_partitioning at speed 4
The use of use_lastframe_partitioning is totally removed in good-
quality encoding. Its usage in real-time encoding needs to be
evaluated to see if it can be removed too.

The Borg tests at speed 4 showed:
stdhd set: 0.220% psnr gain, 0.166% ssim gain;
derf set:  0.329% psnr gain, 0.476% ssim gain.

Speed test on selected clips showed 1.54% speedup. (Worst case:
pedestrian_area_1080p25.y4m, speed loss: 1.5%)

Change-Id: I1c844d329b0b5678558439b887297c1be7ddab00
2014-09-09 10:54:07 -07:00
Yaowu Xu
c1058e5bbe select_tx_mode(): remove special case for key frame
This commit removes the special case for key frame, as transform size
decision is controlled by the appropriate speed feature for all lossy
coding modes: tx_size_search_method.

Change-Id: I9677171e3f2432ec23705f7c5ea8170dd4562fae
2014-09-03 09:34:10 -07:00
Yunqing Wang
4d2c376923 Early termination in encoding partition search
In the partition search, the encoder checks all possible
partitionings in the superblock's partition search tree.
This patch proposed a set of criteria for partition search
early termination, which effectively decided whether or
not to terminate the search in current branch based on the
"skippable" result of the quantized transform coefficients.
The "skippable" information was gathered during the
partition mode search, and no overhead calculations were
introduced.

This patch gives significant encoding speed gains without
sacrificing the quality.

Borg test results:
1. At speed 1,
   stdhd set: psnr: +0.074%, ssim: +0.093%;
   derf set:  psnr: -0.024%, ssim: +0.011%;
2. At speed 2,
   stdhd set: psnr: +0.033%, ssim: +0.100%;
   derf set:  psnr: -0.062%, ssim: +0.003%;
3. At speed 3,
   stdhd set: psnr: +0.060%, ssim: +0.190%;
   derf set:  psnr: -0.064%, ssim: -0.002%;
4. At speed 4,
   stdhd set: psnr: +0.070%, ssim: +0.143%;
   derf set:  psnr: -0.104%, ssim: +0.039%;

The speedup ranges from several percent to 60+%.
                 speed1    speed2    speed3    speed4
(1080p, 100f):
old_town_cross:  48.2%     23.9%     20.8%     16.5%
park_joy:        11.4%     17.8%     29.4%     18.2%
pedestrian_area: 10.7%      4.0%      4.2%      2.4%
(720p, 200f):
mobcal:          68.1%     36.3%     34.4%     17.7%
parkrun:         15.8%     24.2%     37.1%     16.8%
shields:         45.1%     32.8%     30.1%      9.6%
(cif, 300f)
bus:              3.7%     10.4%     14.0%      7.9%
deadline:        13.6%     14.8%     12.6%     10.9%
mobile:           5.3%     11.5%     14.7%     10.7%

Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
2014-08-28 11:27:28 -07:00
Dmitry Kovalev
4478553efc Removing tx_stepdown_count from VP9_COMP.
The variable is never read.

Change-Id: I94141c1667fa5d10604cd6f83c5f64df107dee94
2014-08-25 14:42:05 -07:00
Dmitry Kovalev
e576c42f1b Cleaning up is_background().
Change-Id: I2b9609dd22bacbf26e669f70bf155613b0316eb3
2014-08-25 11:55:30 -07:00
Pengchong Jin
997db6fc3f Merge "Add a speed feature to give the tighter search range" 2014-08-15 19:51:04 -07:00
Pengchong Jin
eca93642e2 Add a speed feature to give the tighter search range
Add a speed feature to give the tighter partition search
range. Before partition search, calculate the histogram
of the partition sizes of the left, above and previous
co-located blocks of the current block. If the variance of
observed partition sizes is small enough, adjust the search
range around the mean partition size, which will be tighter.

The feature is currently turned on at speed 2. Experiments on
sample youtube clips show on average the runtime is reduced
by 3-7%.

For hard stdhd clips:
park_joy_1080p @ 15000kbps:       509251 ms -> 491953 ms (3.3%)
pedestrian_area_1080p @ 2000kbps: 223941 ms -> 214226 ms (4.3%)

The PSNR performance is changed:
derf: -0.112%
yt:   -0.099%
hd:   -0.090%
stdhd:-0.102%

Change-Id: Ie205ec5325bf92ec5676c243e30ba9d0adca10f2
2014-08-15 16:14:20 -07:00
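The range tightening described in commit eca93642e2 can be sketched as a mean-and-variance test over neighboring partition sizes. Everything here is an assumption made for illustration: sizes are represented as log2 edge lengths, the window is mean plus or minus one level, and tighten_partition_range() is a hypothetical helper.

```c
/* sizes[] holds log2 block edge lengths observed in the left, above and
 * co-located neighbors (e.g. 3 = 8x8 ... 6 = 64x64). If their variance is
 * at most var_thresh, clamp [*min_sz, *max_sz] to mean +/- 1 level.
 * The range is only ever narrowed, never widened. */
static void tighten_partition_range(const int *sizes, int n, double var_thresh,
                                    int *min_sz, int *max_sz) {
  if (n <= 0) return;

  double mean = 0.0;
  for (int i = 0; i < n; ++i) mean += sizes[i];
  mean /= n;

  double var = 0.0;
  for (int i = 0; i < n; ++i) {
    const double d = sizes[i] - mean;
    var += d * d;
  }
  var /= n;

  if (var > var_thresh) return; /* neighbors disagree: keep the full range */

  const int center = (int)(mean + 0.5); /* round to the nearest level */
  if (center - 1 > *min_sz) *min_sz = center - 1;
  if (center + 1 < *max_sz) *max_sz = center + 1;
}
```

When the neighbors agree on a size, fewer partition levels are searched, which is where the reported runtime reduction would come from.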
Yunqing Wang
28b1437d77 Remove an unused speed feature
Removed disable_split_var_thresh, which is not used anymore.

Change-Id: I50119b150442e1571157433b5effc6aae0dbe0fd
2014-08-15 14:10:27 -07:00
Jingning Han
80e5550723 Merge "Remove redundant vp9_init_plane_quantizers call" 2014-08-14 18:50:16 -07:00
Jingning Han
d67b608c5d Remove redundant vp9_init_plane_quantizers call
When aq mode is on, the quantizer will be reset later in the same
function (line 571).

Change-Id: I20635db31261d136d04d5deeb881ad3957078bf1
2014-08-14 14:21:08 -07:00
Yaowu Xu
741a23cd97 Replace current_video_frame with better alternatives
In the encoder, current_video_frame is used in a couple of places to
decide encoding strategy. This commit replaces it with more appropriate
variables.

Change-Id: I3d3d8d8e2ea02c489e4639b9d4c446a63e357d29
2014-08-13 17:19:34 -07:00
Yaowu Xu
b6a41802c4 Simplify select_tx_mode()
The function is called only once, right after all stats counters are
reset to 0. Therefore all the computations have zero effect on return
values. This commit removes that code, which had no effect.

Change-Id: I50d27c0802547921fa36c60aa4bd92d76247f595
2014-08-13 11:48:29 -07:00
Jingning Han
5b63c2797a Merge "Integrate fast txfm and quant path into skip_recode system" 2014-08-11 08:53:34 -07:00
Jingning Han
9da4cd94f5 Merge "Extend skip_txfm flag into array to cover YUV planes" 2014-08-11 08:53:25 -07:00
Dmitry Kovalev
91c2f1e45a Moving pass from VP9_COMP to VP9EncoderConfig.
We had a very complicated way to initialize cpi->pass from
cfg->g_pass:
switch (cfg->g_pass) {
  case VPX_RC_ONE_PASS:
    oxcf->mode = ONE_PASS_GOOD;
    break;
  case VPX_RC_FIRST_PASS:
    oxcf->mode = TWO_PASS_FIRST;
    break;
  case VPX_RC_LAST_PASS:
    oxcf->mode = TWO_PASS_SECOND_BEST;
    break;
}

cpi->pass = get_pass(oxcf->mode).

Now pass is moved to VP9EncoderConfig and initialization is simple:
switch (cfg->g_pass) {
  case VPX_RC_ONE_PASS:
    oxcf->pass = 0;
    break;
  case VPX_RC_FIRST_PASS:
    oxcf->pass = 1;
    break;
  case VPX_RC_LAST_PASS:
    oxcf->pass = 2;
    break;
}

Change-Id: I8f582203a4575f5e39b071598484a8ad2b72e0d9
2014-08-08 14:27:54 -07:00
Dmitry Kovalev
2fe6fa72fc Merge "Cleaning up vp9_encodeframe.c." 2014-08-08 13:55:34 -07:00
Alex Converse
2a5c46d8f5 Fix active_map speed 6.
Fix the interaction between active map and reuse_inter_pred_sby. The
reuse_inter_pred_sby feature expects inter predictors to already be
built, but blocks with active map on skip this step.

Change-Id: Ibb2bf0d228f678935d82a0ede9cb0919ab7c8878
2014-08-07 15:57:58 -07:00
Alex Converse
e874aea74c Cleanup SEG_LVL_SKIP handling in encode_superblock.
Change-Id: Ib7497ba08696765cbc1b2cc4218d37f4298f278c
2014-08-07 15:57:58 -07:00
Dmitry Kovalev
b539705916 Cleaning up vp9_encodeframe.c.
Change-Id: Ia3001ae5c44faee3978fc3eb7a027cd9712a0373
2014-08-07 14:55:54 -07:00