Commit Graph

439 Commits

Author SHA1 Message Date
James Zern
3ffa41aae3 Merge changes If9b16f7d,I75aab21c,I9cbb768c,If5cea3d3,I96940657,I025595d8,Ie0bc3935,I3ebb172d
* changes:
  vp9: remove partition+entropy contexts from common
  vp9: add above/left_context to MACROBLOCKD
  vp9: add above/left_seg_context to MACROBLOCKD
  vp9: add above/left_context to encoder
  vp9: add above/left_seg_context to encoder
  vp9: pass entropy context directly to set_skip_context
  vp9: pass context directly to partition functions
  vp9/decode: add alloc_tile_storage()
2013-10-28 12:45:11 -07:00
James Zern
ce2c337261 vp9: add above/left_context to encoder
Change-Id: If5cea3d389bb1135ee490d273e57cc2c43325d01
2013-10-25 22:01:14 +02:00
James Zern
d72dfab296 vp9: add above/left_seg_context to encoder
Change-Id: I969406574c6658936e9f6db5752f1b295025aab5
2013-10-25 22:01:14 +02:00
Dmitry Kovalev
237ce8724a Adding get_frame_new_buffer() function to replace duplicated code.
Change-Id: I6e0e19231a48364c1de7dfab730b121ab227f111
2013-10-24 12:20:35 -07:00
Dmitry Kovalev
fd724f13b0 Renaming vp9_short_fdct4x4 and vp9_short_walsh4x4.
For consistency with idct function names. Renames:
  vp9_short_fdct4x4  -> vp9_fdct4x4
  vp9_short_walsh4x4 -> vp9_fwht4x4

Change-Id: Id15497cc1270acca626447d846f0ce9199770f58
2013-10-23 14:28:39 -07:00
Dmitry Kovalev
ec414372e8 Removing quantize_b_4x4 function pointer.
The pointer was asigned only once with vp9_regular_quantize_b_4x4, calling
this function directly now. Also removing unused declarations:
  prototype_quantize_block
  prototype_quantize_block_pair
  prototype_quantize_mb
  vp9_regular_quantize_b_4x4_pair
  vp9_regular_quantize_b_8x8

Change-Id: I14325bc2f082336820671eafbc06126651b79f73
2013-10-22 13:09:36 -07:00
Dmitry Kovalev
190c2b4591 Using stride (# of elements) instead of pitch (bytes) in fdct4x4.
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.

Change-Id: I0ba3c52513a5fdd194f1e7e2901092671398985b
2013-10-21 15:27:35 -07:00
Jingning Han
deb10ac6f9 Merge "Make memory alloc in pick_mode_context bsize aware" 2013-10-21 11:45:59 -07:00
Dmitry Kovalev
33a29f3c35 Merge "Moving allow_high_precision_mv from MACROBLOCKD to VP9_COMMON." 2013-10-21 10:55:02 -07:00
Dmitry Kovalev
d1b65c6bda Moving allow_high_precision_mv from MACROBLOCKD to VP9_COMMON.
This value is a global frame-level flag, not a macroblock-level.

Change-Id: Ie8c5790a931150741c2167c00c3e3dd2cf26744d
2013-10-21 10:12:14 -07:00
Paul Wilkins
eec3def7c5 Modified no memory rate control.
This 2-pass rate control setting allocates bits based
on first pass stats to each kf group, gf group and individual
frame but does not correct the bits left and allocation after
each frame.

In other words it recommends a bit allocation for each frame
but does not try and correct any over or under spend on a
frame over the remainder of the clip. This reduces the accuracy
of rate control in terms of hitting an average bitrate but prevents
problems that may arise because early frames either use to many
or too few bits. This mode is currently more inclined to undershoot
than overshoot (particularly at higher data rates).

Also minor changes to rate of adaption when recode loop is not
enabled.

This mode is currently enabled by default for VBR.
It gives the following % performance gains.

derf +0.467, +1.072
yt 2.962, 2.645
stdhd 1.682, 1.595,
yt-hd 2.3, 2.174

Change-Id: I3c84a9bf8884e5b345698ff0e19187f792c2f3a0
2013-10-19 12:40:43 +01:00
Paul Wilkins
a2769bb73d Reduced delta for kf/gf/arf when at maxq.
Delta reduced because of concern about popping on some
very hard clips.

Also allow some frame recode at speed 2 for kf/gf/arf.

Change-Id: Ib47dff42da41aa6eec83b7285fcaaca24abb851e
2013-10-19 12:24:59 +01:00
Jingning Han
72033fcff8 Make memory alloc in pick_mode_context bsize aware
This commit makes the buffer allocation of zcoeff_blk array in
pick_mode_context block size aware. It calculates the number of
4x4 blocks in the partition and assigns the memory space accordingly.
This process (and the uninitialization) is done once for each encoding
pass. It allows memory copy of smaller buffer when possible.

For football at 600kbps, the runtimes improve by about 1%:
speed 1, 45961ms -> 45472ms
speed 2, 23863ms -> 23598ms

Change-Id: Id2ca24906fa89f46fa5fe742ec4b8efc2a61f877
2013-10-18 12:42:44 -07:00
Dmitry Kovalev
01993f7d4a Removing last_kf_gf_q member from VP9Common structure.
It looks like we don't actually use this value.

Change-Id: If21d52b597337e7755f7ea817824fc2b1e477a14
2013-10-16 18:01:48 -07:00
Guillaume Martres
acf0d56f0b Get rid of "this_mi", use "mi_8x8[0]" everywhere instead
The only case where they were intentionally pointing to different
structures was in mbgraph, and this didn't have the expected behavior
because both of these pointers are used interchangeably through the code

Change-Id: I979251782f90885fe962305bcc845bc05907f80c
2013-10-16 16:24:03 -07:00
Guillaume Martres
9a03154f46 Make the static_segmentation feature work again
Change-Id: I766c4b74db526efa4ff6dd2d95ef3e0beb45b6e5
2013-10-16 16:15:27 -07:00
Guillaume Martres
42bcb4a7ad Merge "Prevent accidental changes to the previous frame mode_infos" 2013-10-16 16:07:05 -07:00
Dmitry Kovalev
ab829274b1 Inlining and removing fwd_txm16x16 and fwd_txm8x8 pointers.
Change-Id: I3528ba1c3fee761918509f9d9dc2d842c69f5a44
2013-10-16 15:00:48 -07:00
Marco Paniconi
e078c3d854 Initial 1-pass.
Change-Id: I58c5436f5c95f6012fb2891cd2a02f76e4870b6a
2013-10-16 12:04:29 -07:00
Guillaume Martres
e55f60240a Implement variance-based adaptive quantization
This should be similar to what x264 does with --aq-mode 1.
It works well with clips like parkjoy and touhou
(http://x264.nl/developers/Dark_Shikari/LosslessTouhou.mkv).
At low bitrates, the segmentation signaling overhead may negate the
benefits of this feature.

(PGW) Default changed to feature OFF to allow provisional merge.
Change-Id: I938abf9bb487e1d4ad3b0264ea03d9826275c70b
2013-10-16 11:55:13 +01:00
Adrian Grange
12b2c712ca Merge "Updated encoder to handle intra-only frames" 2013-10-15 17:19:28 -07:00
Jingning Han
9b05f23e05 Merge "Make vp9_zero use cases of consistent format" 2013-10-15 16:49:05 -07:00
Alexander Voronov
d6a59fb12c Updated encoder to handle intra-only frames
Updated the encoder to handle frames that are coded
intra-only. Intra-only frames must be non-showable,
that is, the "show frame" flag must be set to 0 in
the frame header.

Tested by forcing the ARF frames to be coded intra-
only.

Note: The rate control code will need to be modified
to account for intra-only frames better than they
are currently handled.

Change-Id: I6a9dd5337deddcecc599d3a44a7431909ed21079
2013-10-15 16:44:02 -07:00
Jingning Han
c8e48f4b02 Make vp9_zero use cases of consistent format
Remove the semicolon in the definition of vp9_zero macro. Make all
the use cases of vp9_zero of consistent format.

Change-Id: Ibaf9751e8595872b12766381a93d185a4d90df8f
2013-10-15 16:12:21 -07:00
Dmitry Kovalev
a4585285ed Removing unused 8x4 transform from the encoder.
Change-Id: Icbcf68b5b685a56f255ebc3859c9692accdadf9e
2013-10-15 11:27:28 -07:00
Yaowu Xu
8b175679be Masking intra mode choice adaptively
The commit changes to mask available intra prediction modes for test
based on prediction block size.

With this patch, encoding time of CpuUsed 2 reduces from 10% to 20% for
HD clips with a compression drop of 0.2%

Change-Id: I65f320f1237c0f5ae3a355bf7caf447f55625455
2013-10-11 10:29:53 -07:00
Paul Wilkins
704028d435 Experimental rate control change.
When the codec in VBR (or cq) mode hits its max q limits and is
struggling to hit a target bandwidth, the bit target per frame collapses.

In the first instance normal frames cap out at the maximum allowed
Q and then the ARF and GFs do the same. This latter behavior is not
generally desirable as GFs and ARFs are only effective from a quality
and data rate perspective if they have at lease some level of -Q delta
compared to the surrounding frames.

In this patch I define a separate max Q for GFs and ARFs that is
derived from but somewhat lower than that defined for normal frames.
In effect there is a minimum Q delta that will always be available for
GFs and ARFs regardless of the target rate and MAXQ setting.

This may of course mean that the absolute lowest rate obtainable for
a given clip is somewhat higher.

Change-Id: I268868b28401900d0cd87e51e609cd3b784ab54a
2013-10-11 13:40:54 +01:00
Paul Wilkins
8b989f5b23 Disable recode loop.
For VBR coding disable the recode loop for speeds > 0.

Results pending.

Change-Id: I2cd9a87c3fcbe39c05b954798d0671a4ca62c37f
2013-10-11 13:38:52 +01:00
Guillaume Martres
b364176c08 Prevent accidental changes to the previous frame mode_infos
This is needed to fix mbgraph but shouldn't affect anything else

Change-Id: I2f515052f62e348cd3794b7ff0c139802225ea95
2013-10-10 12:18:12 -07:00
Dmitry Kovalev
1e8fc24af8 Merge "Removing inv_txm4x4_1_add and inv_txm4x4_add function pointers." 2013-10-10 10:49:27 -07:00
Jingning Han
4793324c16 Merge "Allow sub8x8 intra modes test for alt frame coding" 2013-10-10 09:00:08 -07:00
Jingning Han
03fe08ca30 Deprecate the use of PARTITION_INFO from encoder
Use b_mode_info to store the inter prediction mode of sub8x8 block,
in replacement of the use of partition_info. Remove redundant buffer
update for partition_info. For bus_cif at 2000 kbps, this seem to make
speed 0 about 1% faster.

Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6
2013-10-09 09:23:52 -07:00
Dmitry Kovalev
c983c966cb Removing inv_txm4x4_1_add and inv_txm4x4_add function pointers.
We already have itxm_add member in MACROBLOCKD structure. Both
inv_txm4x4_1_add and inv_txm4x4_add are just its special cases for
different eob values. But eob logic is already implemented in
vp9_iwht4x4_add and vp9_idct4x4_add (that's why also removing
inverse_transform_b_4x4_add).

Change-Id: I80bec9b6f7d40c5e5033c613faca5c819c3e6326
2013-10-08 11:27:56 -07:00
Yaowu Xu
e29137df05 Change to allow less rectangular partion check
For CpuUsed 1 & 2, this commit allow to skip retangular partition check
when NONE is better than SPLIT. It also changed to allow such logic
on alt ref frame coding rather than use square partition all them. The
change has gain compressio about .3% on yt and ythd for both 1&2, It
helped .6% compression on cif and stdhd for both CpuUsed 1&2.

Change-Id: I814b653baf89f59acd20e042629a12938a1bd4e5
2013-10-08 08:12:56 -07:00
Jim Bankoski
2b491c19b8 Merge "cpplint errors in vp9_onyx_if.h" 2013-10-07 14:47:21 -07:00
Jim Bankoski
7eb7dd2fed cpplint errors in vp9_onyx_if.h
Slightly bigger change -> broke up encode_frame_to_datarate,  lots
of line length fixes.

Change-Id: I7c53325e954de130f3fe1a6656626efc6705be82
2013-10-07 13:57:20 -07:00
Dmitry Kovalev
9dba044be2 Merge "Giving consistent names to IDCT/IWHT functions." 2013-10-05 23:44:05 -07:00
Jingning Han
0d0ed6a29b Allow sub8x8 intra modes test for alt frame coding
This commit allows sub8x8 intra modes test in the rate-distortion
loop for hd sequences in speed 1 and 2.

For sequence y90n of hd set at 8000 kbps, speed 2 runtime goes
from 207s to 210s. For ped_1080p at 3000 kbps, speed 2 runtim goes
from 336s to 337s. Both are running with 300 frames.

This improves compression performance by 0.24% for stdhd and 0.32%
for hd.

Change-Id: I173ca38a6411565ae6cfadd184c42b2070c5de1f
2013-10-04 19:13:00 -07:00
Dmitry Kovalev
3a0602578e Giving consistent names to IDCT/IWHT functions.
The idea is to have the following names for each transform size:

vp9_idct4x4_add
  vp9_idct4x4_1_add
  vp9_idct4x4_10_add
  vp9_idct4x4_16_add

vp9_idct8x8_add
  vp9_idct8x8_1_add
  vp9_idct8x8_10_add
  vp9_idct8x8_64_add

etc for 16x16, 32x32

The actual list of renames in this patch:

vp9_idct_add_lossless     -> vp9_iwht4x4_add
vp9_short_iwalsh4x4_add   -> vp9_iwht4x4_16_add
vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add

vp9_idct_add            -> vp9_idct4x4_add
vp9_short_idct4x4_add   -> vp9_idct4x4_16_add
vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add

Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
2013-10-04 14:17:06 -07:00
Paul Wilkins
44e039b4f5 Further clean up of speed 4
Speed 4 still does not give a big gain over speed 3.
This just cleans it up a little from the last patch and comments
out features that do not seem to be giving much benefit.

Change-Id: I5f366e6160e1dbe5dc45cf5eb90cc02712baa1b6
2013-10-04 16:57:24 +01:00
Paul Wilkins
de6ecc5ac3 Selective masking of split modes.
Allow selective masking of individual split modes rather than
just a single on / off flag.

For speed 2 recovers the large speed loss seen for some derf
clips  in change Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
and a small quality gain.

For speed 1 10 % speed increase observed locally on some derf clips
for minimal quality change.

Change-Id: If86191087b93cbc05351c26c60c7933e2149e485
2013-10-04 14:20:58 +01:00
Paul Wilkins
03dd2818e4 Missing threshold case for disable split.
In relation to change:
Refactor inter mode rate-distortion search
 Ie6bdfa0a370148dd60bd800961077f7e97e67dd4

sf->thresh_mult_sub8x8[THR_INTRA] = INT_MAX missing;

Change-Id: Ia86b68a5073368a3e2ca124a27b632243b525c8b
2013-10-04 11:54:24 +01:00
Jingning Han
11abab356e Refactor inter mode rate-distortion search
This commit separates the rate-distortion optimization loop of
superblocks from that of sub8x8 blocks. This allows better design
rate-distortion optimization search loop for each setting. It also
removes the use of SPLITMV and I4X4_PRED therein.

No performance change in speed 0 settings. For bus@CIF at 2000kbps,
the speed 1 runtime goes from 48009ms to 43894ms (about 10% faster).
The overall compression performance on derf changed by -0.021%.

Speed 2 runtime goes from 27114ms to 28700ms (6% slower), while the
overall coding efficiency goes up by 1.629% for derf, 1.236% for yt.

Change-Id: Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
2013-10-03 11:36:49 -07:00
Paul Wilkins
6253cc9279 Speed setting review.
Substantial reworking of the speed vs quality trade offs for
speed 1 and 2.

In this patch I am attempting to freeze the "quality" meaning of
speeds 1 and 2 relative to speed 0 so that in future we can
better evaluate progress.

I am targeting :
Speed 1 quality ~-5% vs speed 0.
Speed 2 quality ~-10% vs speed 0

It is inevitable that quality will still fluctuate a little as we adjust
settings and add new features, but we will attempt to keep as
close as possible to these values. Above speed 2 things will remain
a bit more fluid for now.

In this patch speed 1 is approximately 4-5x as fast as speed 0. This
is similar to before but the quality hit is a lot less. Likewise speed 2
is approximately 2x as fast as speed 1 but is similar in quality to the
previous speed 1 configuration.

Also slight change to behavior of FLAG_EARLY_TERMINATE to insure
all reference frames get at least one rd test. Important for very low
variance regions.

WIP :- Added a new speed level with old speed 4 becoming speed 5.
Speed 3 and 4 tradeoffs still WIP

Change-Id: Ic7a38dd7b5b63ab1501f9352411972f480ac6264
2013-10-03 10:23:28 +01:00
Paul Wilkins
ece99b3da0 Merge "Improved auto_partition_range." 2013-10-03 02:06:13 -07:00
Jingning Han
54bc73151b Deprecate unused mode count variables
Remove mode_check_freq and mode_test_hit_counts from VP9_COMP.

Change-Id: Iabfd9f841444cd9bf19ac761a9795f140082ce0b
2013-10-02 11:07:14 -07:00
Paul Wilkins
d12a502ef9 Merge "Alter Speed 3." 2013-09-30 09:12:28 -07:00
Paul Wilkins
65b93c7e52 Improved auto_partition_range.
The code now takes into account temporal and spatial
information to determine the partition size range, but the
frequency counts have been removed.

The net effect is similar in quality but about 10% faster.

Change-Id: I39a513fb79cec9177b73b2a7218f0da70963ae95
2013-09-30 11:32:57 +01:00
Paul Wilkins
a76caa7ff4 Alter Speed 3.
This patch deletes the variance based speed three partitioning.
Speed 3 now uses the same partitioning method as speed 2
but with some stricter conditions.

The speed and quality are now somewhere between speeds 2 and 4
whereas before it was worse in both than speed 4.

Change-Id: Ia142e7007299d79db3ceee6ca8670540db6f7a41
2013-09-30 11:26:46 +01:00
Deb Mukherjee
80d582239e Some minor changes/cleanups in rate control
Some small changes to the quantizer mapping functions.
Also includes some cleanups.

Change-Id: I9dea29b24015f6e6697012a0e4d8983049d8e5c7
Results:
derfraw300: +0.106%
stdhdraw250: +0.139%
2013-09-27 13:57:42 -07:00