164 Commits

Author SHA1 Message Date
Brandon Young
50619bacfd Fix error with cumbins to allow multiple profiles
Change-Id: I23aadc8f7551771197b55465a3264250b40838ff
2016-04-08 13:16:34 -07:00
Julia Robson
ef01ea152d Changes so other expts work with 128x128 coding unit size expt
Changes ensure wedge_partition, interintra and palette expts all
work alongside the ext_coding_unit_size experiment.

Change-Id: I18f17acb29071f6fc6784e815661c73cc21144d6
2015-11-26 17:43:00 +00:00
Julia Robson
d90a3265f0 Changes to use defined constants rather than hard-coded numbers
Also fixes a valgrind error when optimizations are disabled.
Done in preparation for the work on the extended coding unit size
experiment.

Change-Id: Ib074c5a02c94ebed7dd61ff0465d26fa89834545
2015-11-12 15:42:32 -08:00
Shunyao Li
2de18d1fd2 Super resolution mode (+CONFIG_SR_MODE)
CONFIG_SR_MODE=1, enable SR mode
USE_POST_F=1, enable SR post filter
SR_USE_MULTI_F=1, enable SR post filter family
Not compatible with other experiments yet

Change-Id: I116f1d898cc2ff7dd114d7379664304907afe0ec
2015-08-31 15:29:39 -07:00
hui su
00c793ee5f Optimize entropy coding of non-transform tokens
Use separate token probabilities and counters for non-transform
blocks (pixel domain) . Initial probabilities are trained with screen_content
clips. On screen_content, it improves coding performance by about
2% (from +16.4% to +18.45%).

The initial probabilities are not optimized for natural videos. So it should
not be used for natural videos. Set FOR_SCREEN_CONTENT as 0/1 to specify
whether or not to enable this patch.

Change-Id: Ifa361c94bb62aa4b783cbfa50de08c3fecae0984
2015-05-07 07:58:19 -07:00
hui su
1f7b49f7cd Use uniform quantization settings for non-transform blocks
Do not treat first element (dc) differently.

on screen_content
tx-skip only: +16.4% (was +15.45%)

no significant impact on natrual videos

Change-Id: I79415a9e948ebbb4a69109311c10126d8a0b96ab
2015-04-28 07:54:16 -07:00
hui su
33207fb170 Remove unused variable in new-quant expt
remove dequant_val_nuq in macroblock_plane

Change-Id: I4b4070ae2d01c2403c781433030204d6e95c3750
2015-04-20 11:14:51 -07:00
Alex Converse
16e5e713fa Add an intra block copy mode (NEWDV).
Change-Id: I82b261c54ac9db33706bb057613dcbe66fc71387
2015-04-03 11:59:57 -07:00
hui su
070d635657 Add palette coding mode for inter frames
on screen_content
--enable-palette                                    +6.74%

on derflr
with all other experiments                          +6.02%
(--enable-supertx --enable-copy-mode
 --enable-ext-tx --enable-filterintra
 --enable-tx64x64 --enable-tx-skip
 --enable-interintra --enable-wedge-partition
 --enable-compound-modes --enable-new-quant
 --enable-palette)

Change-Id: Ib85049b4c3fcf52bf95efbc9d6aecf53d53ca1a3
2015-03-23 08:41:51 -07:00
Deb Mukherjee
c8ed36432e Non-uniform quantization experiment
This framework allows lower quantization bins to be shrunk down or
expanded to match closer the source distribution (assuming a generalized
gaussian-like central peaky model for the coefficients) in an
entropy-constrained sense. Specifically, the width of the bins 0-4 are
modified as a factor of the nominal quantization step size and from 5
onwards all bins become the same as the nominal quantization step size.
Further, different bin width profiles as well as reconstruction values
can be used based on the coefficient band as well as the quantization step
size divided into 5 ranges.

A small gain currently on derflr of about 0.16% is observed with the
same paraemters for all q values.
Optimizing the parameters based on qstep value is left as a TODO for now.

Results on derflr with all expts on is +6.08% (up from 5.88%).

Experiments are in progress to tune the parameters for different
coefficient bands and quantization step ranges.

Change-Id: I88429d8cb0777021bfbb689ef69b764eafb3a1de
2015-03-17 21:42:55 -07:00
Jingning Han
50cab76f12 Removal of legacy zbin_extra / zbin_oq_value.
Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
(cherry picked from commit d0f237702745c4bfc0297d24f9465f960fb988ed)
2015-01-21 20:37:19 -08:00
Deb Mukherjee
1929c9b391 Rename highbitdepth functions to use highbd prefix
Uses highbd_ prefix convention consistently.

Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-09 14:40:40 -07:00
Jingning Han
0a9f5fa146 Remove repeated header files from vp9_block.h
This commit removes unused header file vp9_onyxc_int.h and repeatedly
included file vpx_ports/mem.h from vp9_block.h

Change-Id: I400b210bd1da48f1880bd50a8f4a6e2c690e15a1
2014-10-01 13:01:43 -07:00
Deb Mukherjee
10783d4f3a Adds high bitdepth transform functions and tests
Adds various high bitdepth transform functions and tests.
Much of the changes are related to using typedefs tran_low_t
and tran_high_t for the final transform cofficients and intermediate
stages of the transform computation respectively rather than fixed
types int16_t/int. When vp9_highbitdepth configure flag is off,
these map tp int16_t/int32_t, but when the flag is on, they map
to int32_t/int64_t to make space for needed extra precision.

Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
2014-09-11 19:56:33 -07:00
Jingning Han
d62d804e64 Speed up compound inter prediction mode check
This commit allows the encoder to store outcomes of single reference
frame modes and compares them to decide if the inter prediction
filter, forward transform, and quantization can be skipped.

The compression performance of speed 3 is down
derf  -0.364%
stdhd -0.198%

For test sequences, the speed 3 runtime is reduced
highway CIF 100 kbps, 51976 ms -> 45033 ms, 13% speed-up
stockholm 720p 1000 kbps, 71826 ms -> 67838 ms, 5.5% speed-up
pedestrian 1080p 2000 kbps, 154924 ms -> 150702 ms, 2.6% speed-up

Change-Id: I5aa26f918d2b4b5197a2c0afa2779319f1c88e44
2014-09-03 15:28:01 -07:00
Jingning Han
02e6ecdc4c Extend block level sse to support multiple txfm blocks
This commit extends the sse and forward transform computation flag
to support the case 64x64 blocks where there are 4 32x32 2D-DCT
blocks.

Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
2014-08-29 08:29:34 -07:00
Jingning Han
2b1c6eacb9 Move mv cost table to VP9_COMP
The mv cost table set is maintained at frame level, hence moved to
VP9_COMP.

Change-Id: Icb3d0185d47443590bd11357de729aa4ba5c5e5e
2014-08-22 09:38:07 -07:00
Jingning Han
245e57c78e Merge "Enable fast forward txfm and quant for rate-distortion search" 2014-08-11 17:56:48 -07:00
Jingning Han
9da4cd94f5 Merge "Extend skip_txfm flag into array to cover YUV planes" 2014-08-11 08:53:25 -07:00
Jingning Han
b4b09c9796 Enable fast forward txfm and quant for rate-distortion search
This commit enables encoder to select fast forward transform and
quantization path according to the prediction residual sse/variance,
in the rate-distortion optimization scheme.

Change-Id: Ief9fc3844fd4107166d401970e800c6e5ce2b5fe
2014-08-08 16:16:51 -07:00
Jingning Han
1a8d45f309 Extend skip_txfm flag into array to cover YUV planes
Change-Id: Ieae182d72d625d0d3fd4ed7c7d24cb521a0f21b0
2014-08-05 15:42:12 -07:00
Jim Bankoski
63c1c3ee64 energy -> int to avoid unsigned / signed mismatch
Change-Id: Idd1327852f0df0eab0ea3b33959f2b8292b77301
2014-08-04 12:07:26 -07:00
Jingning Han
9ac2f66320 Re-design quantization process
This commit re-designs the quantization process for transform
coefficient blocks of size 4x4 to 16x16. It improves compression
performance for speed 7 by 3.85%. The SSSE3 version for the
new quantization process is included.

The average runtime of the 8x8 block quantization is reduced
from 285 cycles -> 255 cycles, i.e., over 10% faster.

Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
2014-07-01 17:00:07 -07:00
Yunqing Wang
9d41313e4b Decide the partitioning threshold from the variance histogram
Before encoding a frame, calculate and store each 16x16 block's
variance of source difference between last and current frame.
Find partitioning threshold T for the frame from its variance
histogram, and then use T to make partition decisions.

Comparing with fixed 16x16 partitioning, rtc set test showed an
overall psnr gain of 3.242%, and ssim gain of 3.751%. The best
psnr gain is 8.653%.

The overall encoding speed didn't change much. It got faster for
some clips(for example, 12% speedup for vidyo1), and a little
slower for others.

Also, a minor modification was made in datarate unit test.

Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c
2014-06-30 09:36:23 -07:00
Alex Converse
aeacaac574 Switch active map implementation to segment based.
Change-Id: Ibb841a1fa4d08d164cf5461246ec290f582b1f80
2014-06-20 13:13:23 -07:00
Dmitry Kovalev
f80a346e0e Merge "Replacing txfm_size with tx_size." 2014-06-12 13:07:11 -07:00
Dmitry Kovalev
4345d12d28 Replacing txfm_size with tx_size.
Change-Id: Ifa6374e9db5919322733b656e0865f5f19ee6f2c
2014-06-12 11:57:26 -07:00
Jingning Han
ccba289f8d Fast computation path for forward transform and quantization
This commit enables a fast path computational flow for forward
transformation. It checks the sse and variance of prediction
residuals and decides if the quantized coefficients are all
zero, dc only, or more. It then selects the corresponding coding
path in the forward transformation and quantization stage.

It is currently enabled in rtc coding mode. Will do it for rd
coding mode next.

In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps
goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up.
Overall coding performance for rtc set is changed by -0.18%.

Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
2014-06-12 11:10:54 -07:00
Dmitry Kovalev
35a83677a5 Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK.
The final goal is eventually to get rid of both itxm_add and fwd_txm4x4.
This patch does it in the decoder.

Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5
2014-05-21 11:09:44 -07:00
Dmitry Kovalev
81e03394d6 Replacing int_mv with MV.
Change-Id: Icd7eea20e944e3e28e5eb20cdc088866a54d53b4
2014-05-19 11:43:07 -07:00
Dmitry Kovalev
51545f5753 Moving PC_TREE from MACROBLOCK to VP9_COMP.
Because PC_TREE is encoder-level data, not MACROBLOCK-level data.

Change-Id: I4f620c0781acd3a2744860610117e74948e0b2b5
2014-05-16 10:17:13 -07:00
Dmitry Kovalev
66307bf2c8 Moving costs from MACROBLOCK to VP9_COMP.
Change-Id: I61471dd0f77d1547abec13cbf9670e1c4eb9131a
2014-05-01 16:12:23 -07:00
Dmitry Kovalev
aa464eca5e Adding search_site_config struct.
Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf
2014-04-29 10:34:53 -07:00
Jingning Han
80a4f55989 Enable background detection for adaptive quantizer control
This commit enables a background detection approach for adaptive
quantizer control. It combines the cyclic refresh pattern and the
background information to determine the segment id for adaptive
quantizer selection, prior to the non-RD mode decision process.
It hence allows proper quantization information update for a more
precise rate-distortion modeling in the non-RD mode decision.

The compression performance of speed -5 for rtc set is improved
by 2.5%, at no speed change.

Change-Id: Ic3713e8ed9185b403b5b1679d19dabd57506d452
2014-04-21 08:57:53 -07:00
Jim Bankoski
e890c2579b add a context tree structure to encoder
This patch sets up a quad_tree structure (pc_tree) for holding all of
pick_mode_context data we use at any square block size during encoding
or picking modes.  That includes contexts for 2 horizontal and 2 vertical
splits, one none, and pointers to 4 sub pc_tree nodes corresponding
to split.  It also includes a pointer to the current chosen partitioning.

This replaces code that held an index for every level in the pick
modes array including:  sb_index, mb_index,
b_index, ab_index.

These were used as stateful indexes that pointed to the current pick mode
contexts you had at each level stored in the following arrays

array ab4x4_context[][][],
sb8x4_context[][][], sb4x8_context[][][], sb8x8_context[][][],
sb8x16_context[][][], sb16x8_context[][][], mb_context[][], sb32x16[][],
sb16x32[],  sb32_context[], sb32x64_context[], sb64x32_context[],
sb64_context

and the partitioning that had been stored in the following:
b_partitioning, mb_partitioning, sb_partitioning, and sb64_partitioning.

Prior to this patch before doing an encode you had to set the appropriate
index for your block size ( switch statement),  update it ( up to 3
lookups for the index array value) and then make your call into a recursive
function at which point you'd have to call get_context which then
had to do a switch statement based on the blocksize,  and then up to 3
lookups based upon the block size to find the context to use.

With the new code the context for the block size is passed around directly
avoiding the extraneous switch statements and multi dimensional array
look ups that were listed above.   At any level in the search all of the
contexts are local to the pc_tree you are working on (in?).

In addition in most places code that used to call sub functions and
then check if the block size was 4x4 and index was > 0 and return
now don't preferring instead to call the right none function on the inside.



Change-Id: I06e39318269d9af2ce37961b3f95e181b57f5ed9
2014-04-17 07:30:55 -07:00
Paul Wilkins
e434d08fd9 Remove old activity masking code.
Delete code relating to the old VP8_TUNE_SSIM flag
as this code does not currently work and is largely made
redundant in VP9 by the various AQ modes.

Change-Id: I71f28e1f680573d296422254489000678552b17b
2014-04-16 12:22:41 -07:00
Dmitry Kovalev
ad4ce92d92 Merge "Using local variable for token_cache." 2014-03-24 15:54:38 -07:00
Jim Bankoski
35af423e8e vp9_block.h static reconverted to inline
Change-Id: I0e7d2815839d8a64250116a5486570d03659a4c0
2014-03-24 06:30:39 -07:00
Dmitry Kovalev
6b32e5f04a Using local variable for token_cache.
We use local variable for token_cache in the decoder.

Change-Id: I032763fa7894313cffe73e3f14863ae1d0527665
2014-03-21 12:14:05 -07:00
Yunqing Wang
cf07d3e332 Remove unused mode_sad
Removed mode_sad.

Change-Id: I230b42ac9b617ae2c375e297057aa0756bd355fe
2014-03-20 09:28:16 -07:00
Alex Converse
29a487c77f Add a conservative RD based active map in vp9.
Change-Id: I47b3c38aadfd8f3ea08515a18a5948aa1375c650
2014-03-10 15:48:43 -07:00
Dmitry Kovalev
3f1ab25812 Removing vp9_onyx.h and moving its content to the encoder.
Change-Id: I03451c88536bc498edddbe0cd9773ff79da085c2
2014-03-05 23:33:22 -08:00
Dmitry Kovalev
627720fa81 Cleaning up mode cost manipulations.
Change-Id: If175d97990454b171b6abeddb76d142497484487
2014-03-05 12:29:44 -08:00
Alex Converse
463ba70581 vp9_rd_pick_inter_mode_sb() reorganization
* Reduce the number of short cirtcuit checks by pre-computing and combining like checks.
* Postpone non-trivial initializations until after the shortcircuits are evaluated.
* Add some consts and const pointers.

No change to the actual results of the call or output of the encoder.

Change-Id: Ie44c4702aec6e08cfe0b8b0ba3cd6b57206478d1
2014-02-20 18:06:25 -08:00
Jingning Han
0eecccc51e Remove inactive control parameters
Change-Id: Ic5692af975fe6bd2d8ec82bbae103c6f7c2fc13e
2014-02-12 12:48:15 -08:00
Alex Converse
e78c174e54 Cleanup block_rd_txfm.
* Avoid unnecessary type erasure
* Prune unused/duplicate fields from struct rdcost_block_args
* Make struct rdcost_block_args a local

Change-Id: I4f1fd4837ccd028bbfe727191ee8d69f0463b7e5
2014-01-31 12:13:18 -08:00
Dmitry Kovalev
4264c93844 Renaming INTERPOLATION_TYPE to INTERP_FILTER.
Corresponding renames:
  subpel_kernel              => interp_kernel
  vp9_get_filter_kernel()    => vp9_get_interp_kernel()
  pred_filter_type           => pred_interp_filter
  adaptive_pred_filter_type  => adaptive_pred_interp_filter
  mcomp_filter_type          => interp_filter
  read_interp_filter_type()  => read_interp_filter()
  write_interp_filter_type() => write_interp_filter()
  fix_mcomp_filter_type()    => fix_interp_filter()

Change-Id: I1fa61fa1dc81ebbf043457c3ee2d8d4515bee6d3
2014-01-24 15:57:28 -08:00
James Zern
b453941caf vp9/encoder: add extern "C" to headers
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
2014-01-23 16:21:24 -08:00
Jingning Han
2f52decd22 Inter-frame non-RD mode decision
This commit setups a test framework for real-time coding. It enables
a light motion search for non-RD mode decision purpose.

Change-Id: I8bec656331539e963c2b685a70e43e0ae32a6e9d
2014-01-16 12:35:04 -08:00
Jingning Han
393a8ccef9 Remove avoid_frame_with_high_error from RD loop
The feature undergoes prior assumption that the recursive partition
size search from 4x4 to 64x64, hence utilizing information from small
blocks to determine early termination in large block rate-distortion
optimization search. The current codebase is now going from top down.
The previous function might go with not properly initialized values,
hence removed.

Tested on pedestrian_area_1080p at 4000 kbps running under speed 2.
No visible difference in runtime observed.

Change-Id: I553df415c6191413762db7ae34e8790c71d8118e
2014-01-06 13:34:07 -08:00