992 Commits

Author SHA1 Message Date
Yaowu Xu
45971abd1d Optimize coef update
1. Move the check of the USE_TX_8X8 search method up one level to
avoid the operations in build_tree_distributions().
2. Count the tx sizes used and avoid computation for the coef update
when a size is not used at all.

Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c
2015-01-30 10:16:40 -08:00
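The second point of 45971abd1d can be illustrated with a minimal sketch: tally how often each transform size was chosen, then skip the update pass for any size whose tally is zero. The names `TX_SIZES_SKETCH` and `update_coef_probs_sketch` are hypothetical and are not taken from the libvpx source, and the real update computation is replaced here by a flag.

```c
#include <assert.h>

/* Hypothetical number of transform sizes (4x4, 8x8, 16x16, 32x32). */
enum { TX_SIZES_SKETCH = 4 };

/*
 * tx_used[t] is how many blocks picked transform size t in this frame.
 * did_update[t] is set to 1 when the (stand-in) update work runs for
 * size t, and to 0 when it is skipped. Returns the number of sizes updated.
 */
static int update_coef_probs_sketch(const unsigned int tx_used[TX_SIZES_SKETCH],
                                    int did_update[TX_SIZES_SKETCH]) {
  int updated = 0;
  assert(tx_used != 0 && did_update != 0);
  for (int t = 0; t < TX_SIZES_SKETCH; ++t) {
    did_update[t] = 0;
    if (tx_used[t] == 0) continue; /* size never used: nothing to update */
    /* A real encoder would build distributions and search for savings here. */
    did_update[t] = 1;
    ++updated;
  }
  return updated;
}
```

With usage counts of {10, 0, 3, 0}, only the first and third sizes would go through the update pass in this sketch.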
Yaowu Xu
fe2439703d Merge "move clear_system_state() call before using double" 2015-01-27 12:42:13 -08:00
Marco
1c4a84c6e9 Merge "aq-mode=3: Update to allow for refresh on modes other than zero-mv." 2015-01-26 19:47:13 -08:00
Yaowu Xu
645b7cdf03 move clear_system_state() call before using double
Floating point is used in vp9_convert_qindex_to_q(), so the unit test
ActiveMapTest would sometimes cause a run-time error without a proper
call to clear_system_state() to reset the register status.

Change-Id: I181e9395148c44a6ca8b97d6e109bd4a152143c6
2015-01-26 18:41:50 -08:00
Marco
3f1af6e85e aq-mode=3: Update to allow for refresh on modes other than zero-mv.
Add distortion threshold condition to refresh state of a coding block,
and allow for qp adjustment also for some intra modes and non-zero motion modes.

Also some code cleanup (remove unused variables/code).

Change-Id: I735fa2b28bc64f60e0323976b82510577b074203
2015-01-26 16:44:25 -08:00
Yaowu Xu
6d16f6c14c Fix MSVC warnings on conversion from int64 to int
Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025
2015-01-26 10:54:06 -08:00
Marco
0dccb6277c Modify variance partition selection for low resolutions.
For low spatial resolutions: bias partition selection to smaller block sizes,
and base the variance computation on 4x4 down-sampling.

Also move the threshold computations into choose_partitioning,
so they are computed once for each superblock.

On low-res clips (RTC_derf) PSNR/SSIM metrics increase by about 4-5%.
No change for resolutions above CIF.

Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
2015-01-22 15:16:55 -08:00
Deb Mukherjee
e7570493b8 Moves inter mode count updates to update_stats
This makes the inter_mode counts update consistent with other symbols.
Also, forward updates should work correctly now.

Change-Id: Id98be26fd08875162e644bb8f1de6f0918f85396
2015-01-06 16:40:45 -08:00
Paul Wilkins
a88e4e64b1 Merge "Deleted unused #define" 2015-01-06 04:18:20 -08:00
Jingning Han
5486db185c Add bsize check condition in nonrd_use_partition
Check if block size is below 8x8 for rectangular block coding. It
is added to support 4x8 and 8x4 block coding for RTC mode.

Change-Id: I760b328f45b98ae48adc45ed5a39fb643cd8aebd
2015-01-02 10:12:37 -08:00
Jingning Han
dad89d5ca1 Enable sub8x8 inter block search for RTC coding mode
This commit enables sub8x8 inter block coding for RTC mode. The
use of sub8x8 blocks can be turned on by allowing the
choose_partitioning function to select 4x4/4x8/8x4 block sizes.

Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a
2014-12-24 17:40:31 -08:00
Jingning Han
d0f2377027 Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value.""
This reverts commit 9946ee23e0.

Fix the ssse3 asm function.

Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
2014-12-22 10:09:25 -08:00
Paul Wilkins
9946ee23e0 Revert "Removal of legacy zbin_extra / zbin_oq_value."
This reverts commit e9b586e21b.

Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4
2014-12-19 15:02:58 +00:00
Paul Wilkins
e9b586e21b Removal of legacy zbin_extra / zbin_oq_value.
zbin extra / zbin_oq_value was widely passed around,
hence removal touches a lot of code.

Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5
2014-12-18 16:49:11 +00:00
Paul Wilkins
60e9b731cf Remove mode dependent zbin boost.
Initial patch to remove get_zbin_mode_boost() and
cpi->zbin_mode_boost.

For now sets a dummy value of 0 for zbin extra pending
a further clean up patch.

Change-Id: I64a1e1eca2d39baa8ffb0871b515a0be05c9a6af
2014-12-18 16:45:52 +00:00
Paul Wilkins
b76312124d Deleted unused #define
FAST_MOTION_MV_THRESH no longer referenced.

Change-Id: Idee6ee5a59ba330904c42b20c9ec35b6fc16f7a2
2014-12-17 14:59:22 +00:00
Peter de Rivaz
e3d19bfc63 Fix for crash in highbitdepth rt mode
Change 72141 introduced a new use of vp9_avg_4x4.
This call needs to switch to using vp9_highbd_avg_4x4
when performing high bitdepth encodes.

Change-Id: I6a8ba4b62f8a75d0a917b365a55245e2f0438ea1
2014-12-16 10:55:49 +00:00
James Zern
4d40a046da Merge "vp9: move encoder-only member from common" 2014-12-12 14:28:55 -08:00
Marco
7f59cff53d Merge "Allow for 4x4 prediction blocks for key frame, speed 6." 2014-12-12 14:27:31 -08:00
James Zern
72ece1308b vp9: move encoder-only member from common
allow_comp_inter_inter VP9_COMMON -> VP9_COMP

Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6
2014-12-12 11:17:44 -08:00
Jingning Han
3e0793b80b Merge "Fix PICK_MODE_CONTEXT index in non-RD coding mode" 2014-12-12 09:16:01 -08:00
Jingning Han
e2c2a65695 Fix PICK_MODE_CONTEXT index in non-RD coding mode
This commit fixes a bug in the PICK_MODE_CONTEXT index for the
horizontal partition case. The compression performance change
is below the 0.01% level, since most blocks are selected to
use square block sizes in RTC coding mode.

Change-Id: I67effc18ae8795fccdd82a55f4efc609fa5cb3e1
2014-12-11 17:21:24 -08:00
Marco
7e99cd2a9b Allow for 4x4 prediction blocks for key frame, speed 6.
For key frame under variance source partition: 4x4 prediction blocks
may be selected when variance of 8x8 block is very high (threshold is set fairly high for now).

Testing on some RTC clips shows this helps to reduce some ringing artifacts on key frame.
Encoded key frame size increases by about 10%. Key frame PSNR increases by about 0.1-0.2 dB.

Change-Id: I56e203fac32ea6ef69897fb3ea269c59cb50d174
2014-12-11 15:36:16 -08:00
Jingning Han
811c74cdfa Merge "Replace division with bit shift in choose_partitioning" 2014-12-11 13:30:03 -08:00
Jingning Han
d9892e846f Merge "Refactor choose_partitioning computing scheme" 2014-12-11 11:14:07 -08:00
Jingning Han
d5c396a902 Replace division with bit shift in choose_partitioning
This commit explicitly uses the bit shift operation instead of
division for computing block variance.

Change-Id: Id19c0ff27dd1d1ae4aceee6657e1aad0d406bd74
2014-12-11 11:06:57 -08:00
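The change in d5c396a902 relies on the sample count of a block being a power of two, so that dividing by the count equals shifting right by its log2. The sketch below compares the two forms; the function names and the variance formula (sum of squared errors minus squared sum over count, then averaged) are assumptions made for illustration, not the exact libvpx code.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of a block with (1 << log2_count) samples, using division. */
static int64_t block_variance_div(int64_t sse, int64_t sum, int log2_count) {
  assert(log2_count >= 0 && log2_count < 31);
  const int64_t count = (int64_t)1 << log2_count;
  return (sse - (sum * sum) / count) / count;
}

/*
 * Same quantity using shifts. sum * sum is non-negative, and
 * sse >= (sum * sum) / count holds for real sample data, so both
 * shifted values are non-negative and the shift matches the division.
 */
static int64_t block_variance_shift(int64_t sse, int64_t sum, int log2_count) {
  assert(log2_count >= 0 && log2_count < 31);
  return (sse - ((sum * sum) >> log2_count)) >> log2_count;
}
```

For four samples with errors 1, 2, 3, 4 (sum 10, sum of squares 30, log2 count 2), both forms give the same integer result.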
Jingning Han
377d2f027a Refactor choose_partitioning computing scheme
This commit refactors the choose_partitioning function. It removes
redundant memset calls and makes the encoder calculate the
variance value per block only when it is needed. It reduces the
average runtime cost of choose_partitioning by 60%. Overall it
reduces speed -6 runtime by 2-5%.

Change-Id: I951922c50d901d0fff77a3bafc45992179bacef9
2014-12-11 09:33:40 -08:00
Paul Wilkins
65cfb808d0 Merge "Substantial restructuring of AQ mode 2." 2014-12-10 10:44:27 -08:00
Jingning Han
e728678c50 Refactor update_state_rt
Update the frame motion vector only if the previous frame's motion vector
is needed as the next frame's reference motion vector.

Change-Id: Ica50f9d7b46ad4f815bba0d9e30f5546df29546f
2014-12-09 15:35:49 -08:00
Jingning Han
225cdef665 Make RTC coding flow support sub8x8 in key frame coding
This commit enables the use of sub8x8 blocks in RTC key frame
encoding. It requires the block size to be preset and will decide
the coding mode and encode the bit-stream.

Change-Id: I35aaf8ee2d4d6085432410c7963f339f85a2c19b
2014-12-09 11:34:58 -08:00
Jingning Han
4bacaab46d Cosmetic naming change
Rename set_modeinfo_offsets as set_mode_info_offsets, to be more
consistent with naming convention.

Change-Id: I68ca1f36c4a78127d9439a50c1506a2afd07927d
2014-12-09 10:32:04 -08:00
Jingning Han
f051a7beab Take out redundant setting of mode_info from set_block_size
The later encoding process will take the top-left block's
mode_info for pre-determined block size.

Change-Id: I76a90f9ce7f3b2dbc2975b52442114e461c465b5
2014-12-09 10:27:18 -08:00
Paul Wilkins
e68c8dcfd2 Substantial restructuring of AQ mode 2.
The restructure moves the decision into the rd pick
modes loop and makes a decision based at the 16x16
block level instead of only the 64x64 level.

This gives finer granularity and better visual results
on the clips I have tested. Metrics results are worse
than the old AQ2 especially for PSNR and this mode
now falls between AQ0 and AQ1 in terms of visual
impact and metrics results.

Further tuning of this to follow.

It should be noted that if there are multiple iterations
of the recode loop the segment for a MB could change
in each loop if the previous loop causes a change in the
complexity / variance bin of the block. Also where a block
gets a delta Q this will alter the rd multiplier for this block
in subsequent recode iterations and frames where the
segmentation is applied.

Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7
2014-12-09 15:10:52 +00:00
Jingning Han
1395ded2a7 Remove unused rd cost calculation from nonrd_use_partition
The per block rd cost calculation is not needed when partition
size is preset.

Change-Id: Ie5575248bbffb584e908aa13097f697ace6ec747
2014-12-08 18:45:19 -08:00
James Zern
616b3a810f vp9 asserts: fix compile warning
string literal to int within an assert

Change-Id: I76a173f96b9add5bf27c3f5ad5d72c6f30e51629
2014-12-05 16:20:42 -08:00
hkuang
eaa6deee5b Merge "Merge set_prev_mi function into encoder function." 2014-12-05 15:12:50 -08:00
Jingning Han
6ae829088f Merge "Remove redundant vp9_zero in choose_partitioning" 2014-12-05 11:47:58 -08:00
Jingning Han
62c7356098 Merge "Use hybrid RD and non-RD coding flow for key frame coding" 2014-12-05 11:25:19 -08:00
Jingning Han
9d88b30854 Remove redundant vp9_zero in choose_partitioning
It makes the overall speed -6 about 2% faster with no compression
performance change.

Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a
2014-12-05 10:39:39 -08:00
Jingning Han
07711e9b27 Use hybrid RD and non-RD coding flow for key frame coding
When the block size is below 16x16, the encoder switches from non-RD to
RD mode for key frame coding. This largely brought back the key
frame compression performance. For vidyo1 at 1000 kbps, the key
frame coding statistics change as follows:

9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us

As compared to the full RD case
7187F, 34.930 dB, 214470 us

The overall rtc set coding performance (single key frame setting)
is improved by 1.5%.

Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878
2014-12-05 09:35:27 -08:00
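The hybrid flow in 07711e9b27 amounts to a per-block switch on block size. A minimal sketch follows, assuming that "below 16x16" means the smaller block dimension is under 16 pixels; that reading, the enum, and the function name are all hypothetical and not taken from the libvpx source.

```c
#include <assert.h>

/* Hypothetical labels for the two mode-search flows. */
typedef enum { MODE_SEARCH_NONRD, MODE_SEARCH_RD } ModeSearchSketch;

/*
 * Pick the search flow for one key frame block.
 * Assumption: a block counts as "below 16x16" when its smaller
 * dimension is less than 16 (so 8x8, 8x16 and 16x8 all qualify).
 */
static ModeSearchSketch pick_key_frame_search(int width, int height) {
  const int smaller = width < height ? width : height;
  assert(width > 0 && height > 0);
  return smaller < 16 ? MODE_SEARCH_RD : MODE_SEARCH_NONRD;
}
```

Under this assumption the costly RD search is confined to small blocks, which is consistent with the reported key frame encode time sitting between the non-RD and full RD figures.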
Yunqing Wang
a3a4a34c60 Merge "vp9_ethread: the tile-based multi-threaded encoder" 2014-12-05 08:23:49 -08:00
hkuang
62de07c8c6 Merge set_prev_mi function into encoder function.
Change-Id: Ifcf2efbb232ea4cabcdebbe77e0820d121e4a6da
2014-12-04 14:44:23 -08:00
Yunqing Wang
eba9c762a1 vp9_ethread: the tile-based multi-threaded encoder
Currently, VP9 supports column-tile encoding, which allows a frame
to be encoded in multiple column tiles independently. The number of
column tiles is set by the encoder option "--tile-columns". This
provides a way to encode a frame in parallel.

Building on the previous set of patches, this patch implements the
tile-based multi-threaded encoder. Each thread processes one or more
tiles.

Usage:
For HD clips:
--tile-columns=2 --threads=1/2/3/4

While using 4 threads, tests showed that the encoder achieved
2.3X - 2.5X speedup at good-quality speed 3, and 2X speedup at
realtime speed 5.

Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4
2014-12-04 11:21:34 -08:00
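"Each thread processes one or more tiles" in eba9c762a1 can be pictured as a mapping from tile columns to workers. The sketch below assumes the option value is a log2 count (so 2 would mean four tile columns) and assumes a round-robin assignment; the commit message does not state the actual scheduling, and all names here are hypothetical.

```c
#include <assert.h>

/* Hypothetical upper bound on tile columns for this sketch. */
enum { MAX_TILE_COLS_SKETCH = 64 };

/*
 * Fills owner[i] with the worker index that encodes tile column i and
 * returns the number of workers that actually receive work.
 * Assumption: tiles are handed out round-robin.
 */
static int assign_tiles_sketch(int log2_tile_cols, int num_threads,
                               int owner[MAX_TILE_COLS_SKETCH]) {
  assert(log2_tile_cols >= 0 && log2_tile_cols <= 6);
  assert(num_threads >= 1);
  const int tile_cols = 1 << log2_tile_cols;
  /* More threads than tiles gives no extra parallelism. */
  const int workers = num_threads < tile_cols ? num_threads : tile_cols;
  for (int i = 0; i < tile_cols; ++i) owner[i] = i % workers;
  return workers;
}
```

With four tile columns and three threads, one worker ends up with two tiles in this sketch, which is one way a speedup below the thread count can arise.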
Jingning Han
17176cd452 Fix indent in source_var_based_partition_search_method
Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e
2014-12-03 12:37:36 -08:00
Marco
8fd3f9a2fb Enable non-rd mode coding on key frame, for speed 6.
For key frame at speed 6: enable the non-rd mode selection in speed setting
and use the (non-rd) variance_based partition.

Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames),
mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16.

Loss in key frame quality (~0.6-0.7dB) compared to rd coding,
but speeds up key frame encoding by at least 6x.
Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6.

Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
2014-12-03 09:18:08 -08:00
Peter de Rivaz
7e40a55ef9 Added high bitdepth sse2 transform functions
Also removes some spurious changes in common/vp9_blockd.h which
were introduced by a rebase issue between nextgen and master branches.

Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
(cherry picked from commit 005d80cd05)
(cherry picked from commit 08d2f54800)
(cherry picked from commit 4230c2306c)
2014-12-02 11:16:24 -08:00
Yunqing Wang
0993bef7e9 vp9_ethread: calculate and save the tok starting address for tiles
Each tile's tok starting address is calculated before the encoding
process. These addresses are stored so that the same calculation
won't be done again when packing the bitstream.

Change-Id: I0a3be0301f002260c19a850303f2f73ebc47aa50
2014-11-25 17:19:35 -08:00
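The idea in 0993bef7e9 can be sketched as a running offset over per-tile token capacities: each tile's start pointer is computed once and stored, so later stages read it back instead of recomputing it. The token type, the capacity array, and the function name are hypothetical stand-ins, not the libvpx definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an encoder token record. */
typedef struct { int token; } TokenSketch;

/*
 * base          : start of one shared token buffer
 * tile_capacity : maximum number of tokens each tile may emit
 * tile_start    : output, start address inside the buffer for each tile
 */
static void compute_tile_tok_starts(TokenSketch *base,
                                    const size_t *tile_capacity,
                                    int num_tiles,
                                    TokenSketch **tile_start) {
  size_t offset = 0;
  assert(base != 0 && tile_capacity != 0 && tile_start != 0);
  for (int i = 0; i < num_tiles; ++i) {
    tile_start[i] = base + offset; /* saved for encoding and for packing */
    offset += tile_capacity[i];
  }
}
```

Because every tile writes into its own precomputed region, tiles can also be encoded in any order without disturbing one another's tokens.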
Yunqing Wang
edbd61e136 vp9_ethread: modify VP9_COMP structure
This patch modifies struct VP9_COMP. It creates a struct ThreadData
to hold the data that needs to be copied for each thread. In the
multi-thread case, one thread processes one tile. All threads
share one copy of VP9_COMP
(refer to VP9_COMP *cpi in the code),
but each thread has its own copy of ThreadData
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.

In the single-thread case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.

Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
2014-11-24 17:57:38 -08:00
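The split described in edbd61e136 (one shared encoder object, one private data block per thread, and a counts pointer that aliases the common counts when there is only one thread) can be sketched as follows. Every type and function name below is a simplified, hypothetical stand-in for VP9_COMP, VP9_COMMON, ThreadData and FRAME_COUNTS; using a thread-local counts copy in the multi-thread case is an assumption.

```c
#include <assert.h>
#include <string.h>

typedef struct { unsigned int tiles_coded; } FrameCountsSketch;
typedef struct { FrameCountsSketch counts; } CommonSketch;
typedef struct { CommonSketch common; } EncoderSketch; /* shared by all threads */

typedef struct {
  FrameCountsSketch *counts;      /* where this thread accumulates counts */
  FrameCountsSketch local_counts; /* private storage (assumed) for multi-thread */
} ThreadDataSketch;

static void init_thread_data(EncoderSketch *cpi, ThreadDataSketch *td,
                             int single_thread) {
  memset(&td->local_counts, 0, sizeof(td->local_counts));
  /* Single thread: write straight into the common counts. */
  td->counts = single_thread ? &cpi->common.counts : &td->local_counts;
}

/* Both the shared encoder and the per-thread data are passed in. */
static void encode_tile_sketch(const EncoderSketch *cpi, ThreadDataSketch *td) {
  (void)cpi; /* read-only shared state would be consulted here */
  td->counts->tiles_coded += 1;
}
```

Passing both pointers keeps shared state read-only inside the tile loop, while anything a thread mutates lives behind `td`.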
Jingning Han
2fbdfd2c66 Key frame non-RD mode decision process
This commit makes a non-RD coding mode decision process for key
frame coding. It can be optionally turned on in speed -6 and above.

Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5
2014-11-24 09:04:28 -08:00
Paul Wilkins
f5209d7e01 Remove rate component adjustment for AQ1
In AQ1 a rate adjustment was applied for blocks coded with a
deltaq. This tends to skew the partition selection and cause
rate overshoot.

For example, consider a 64x64 super block where some but not all
sub blocks are in a low q segment and some are in a high q segment.
The choice of Q when considering large partition and transform sizes
is defined by the lowest sub block segment id (currently this implies the
lowest Q). If some parts of the larger partition are very hard this will
cause a high rate component.

The correct behavior here is for the rd code to discard the large partition
choice and break down to sub blocks where some have low and some
have high Q. However, the rate correction factor above masks the high
cost of coding at a larger partition size.

Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960
2014-11-21 08:51:58 -08:00