Assembly implementation of ssse3 8x8 forward 2D-DCT. The current
version is turned on only for x86_64. The average unit runtime
goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster.
This translates into about 1.5% speed-up for pedestrian_area 1080p
at speed 2.
Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4
This reverts commit 59e733ca81.
Hold off removing arnr_type to give users the opportunity
to change their script files to handle its deprecation. A
follow-up patch will mark the control for setting arnr_type
as deprecated and it will be removed completely in a later
revision of the code.
Change-Id: I8b817c744e144d3714234a4cd4309816d0c7e3e8
This member of VP9_COMP seemed unnecessary since it
only shadowed VP9EncoderConfig.key_freq that is
accessible through VP9_COMP.
Change-Id: Ib751bb1cf1b0b3c50a2a527d7c34f6829dd6fee3
The encoder was not handling requests to place keyframes at
fixed intervals, i.e. kf_min_dist == kf_max_dist, correctly.
In this case when looking to place the next keyframe it was
accumulating stats all the way up to the end of the firstpass
file. This patch corrects this behavior.
Change-Id: I948ad9f1d7faa0c05861df588136cce3bb61d7e7
This commit introduces a chessboard pattern search for the prediction
filter type search. It runs extensive search in alternate blocks and
allows the rest blocks to refer coding decisions of their nearby
neighbors.
For pedestrian 1080p at 4000 kbps, the runtime of speed -5 goes down
from 43990 ms to 42200 ms. The overall compression performance for
RTC set is changed by -1.37%.
Change-Id: Icfe220c49451cda796f0ca91d935c9ed01e56c9d
ARNR filtering is now forced to be centered on the ARF
frame and the other two options have been removed.
The other modes of constructing the ARNR frame were
not used and there does not seem to be any good
reason to maintain them.
This is purely an encoder-side change.
Change-Id: Ic772636d23f280752973852b9740083532a49de2
For speed 3 and above, such search is only allowed at speed 3.
The change helped cif and stdhd set by 1.2% and .7% in compression,
but increased the encoding time by around 5%.
Change-Id: Ifa4832327f1c1bef3decb032ceb769cbf50e059f
Adds test code to verify that supplemental superframe information
that precedes the normal superframe information will not break
decoding.
Change-Id: Ia252b887d7ee138f51dc9a778376ff739402c455
The end_useage parameter is confusingly named since it
now actually defines the rate control method used.
Change-Id: I98912caabfe556b7af0b939a645d1336409e4d71
This commit enables a background detection approach for adaptive
quantizer control. It combines the cyclic refresh pattern and the
background information to determine the segment id for adaptive
quantizer selection, prior to the non-RD mode decision process.
It hence allows proper quantization information update for a more
precise rate-distortion modeling in the non-RD mode decision.
The compression performance of speed -5 for rtc set is improved
by 2.5%, at no speed change.
Change-Id: Ic3713e8ed9185b403b5b1679d19dabd57506d452
1. We didn't scale source image in lower layers so that
the stats are incorrect.
2. We didn't extend borders for re-constructed image.
Change-Id: Ia8d7bafbdb695ffa7f504e171f9449812e7bb0a3
To make direct side by side testing this patch combines two
VBR corrections schemes to allow more direct side by side testing.
(The other patch was by Debargha chg id I0cd1f7...)
Change-Id: I271c45e5c4ccf8de8305589000218b80d9dc3a25
The background detection only tracks luma component. This commits
removes the frame buffer pointer retrieval for chroma components.
Change-Id: I098bd2950f5e5829ed5dc2b48568167248da7fad
This patch sets up a quad_tree structure (pc_tree) for holding all of
pick_mode_context data we use at any square block size during encoding
or picking modes. That includes contexts for 2 horizontal and 2 vertical
splits, one none, and pointers to 4 sub pc_tree nodes corresponding
to split. It also includes a pointer to the current chosen partitioning.
This replaces code that held an index for every level in the pick
modes array including: sb_index, mb_index,
b_index, ab_index.
These were used as stateful indexes that pointed to the current pick mode
contexts you had at each level stored in the following arrays
array ab4x4_context[][][],
sb8x4_context[][][], sb4x8_context[][][], sb8x8_context[][][],
sb8x16_context[][][], sb16x8_context[][][], mb_context[][], sb32x16[][],
sb16x32[], sb32_context[], sb32x64_context[], sb64x32_context[],
sb64_context
and the partitioning that had been stored in the following:
b_partitioning, mb_partitioning, sb_partitioning, and sb64_partitioning.
Prior to this patch before doing an encode you had to set the appropriate
index for your block size ( switch statement), update it ( up to 3
lookups for the index array value) and then make your call into a recursive
function at which point you'd have to call get_context which then
had to do a switch statement based on the blocksize, and then up to 3
lookups based upon the block size to find the context to use.
With the new code the context for the block size is passed around directly
avoiding the extraneous switch statements and multi dimensional array
look ups that were listed above. At any level in the search all of the
contexts are local to the pc_tree you are working on (in?).
In addition in most places code that used to call sub functions and
then check if the block size was 4x4 and index was > 0 and return
now don't preferring instead to call the right none function on the inside.
Change-Id: I06e39318269d9af2ce37961b3f95e181b57f5ed9
There is no need to initialize source/dst frame buffers at frame
level. These will be done at block coding stage. This commit hence
removes the redundant operations.
Change-Id: I11d9f2556058c6205c8e58ed53e31f78622c41b7
Add code to monitor over and under spend and
apply limited correction to the data rate of subsequent
frames. To prevent the problem of starvation or overspend
on individual frames (especially near the end of a clip) the
maximum adjustment on a single frame is limited to a %
of its un-modified allocation.
Change-Id: I6e1ca035ab8afb0c98eac4392115d0752d9cbd7f
This commit compares the current original frame to the previous
original frame at 64x64 block level and decides if the entire
block belongs to background area. If it is in the background area,
skip non-RD partition search and copy the partition types of the
collocated block in the previous frame.
For vidyo1 in the rtc set, this makes the speed -5 coding speed
about 8% faster. The overall compression performance is down by
1.37% for rtc set.
Change-Id: Iccf920562fcc88f21d377fb6a44c547c8689b7ea
Delete code relating to the old VP8_TUNE_SSIM flag
as this code does not currently work and is largely made
redundant in VP9 by the various AQ modes.
Change-Id: I71f28e1f680573d296422254489000678552b17b
Remove duplicate rd_thresh code introduced when vp9_rd_pick_inter_mode_sub8x8()
was forked from vp9_rd_pick_inter_mode_sb().
Change-Id: I3c9b7143d182e1f28b29c16518eaca81dc2ecfed
Fix rate control bug whereby the rate factor heuristics
were being updated on arf overlays causing a rate surge
for a few frames followed by a corrective drop.
This fix eliminates many of the overshoot problems that
we were seeing on hard clips (even without applying
stricter vbr rate control) and also helps quality on
almost all clips with some hard clips improving by >5%.
Overall quality results measured at speed 2.
Derf +1.78% opsnr , +2.44% SSIM
Stdhd +2.41% opsnr, +2.85% SSIM
Change-Id: I2369df6295c2705963fa6307877f6acb304bcc39
We don't use declarations from this file. The real declarations
(differently named) are in vp9_rtcd_defs.pl, e.g. vp9_full_search_sad.
Change-Id: I73cbf064305710ba20747233cfdbe67366f069a0
Added command line flags "resize-width" & "resize-height"
to allow the user to specify the frame size to encode at.
These two flags are ignored if the "resize-allowed" switch
is not set to 1.
All frames in the clip are then encoded at this size, which
must be smaller than the raw frame size.
Change-Id: I3d64bd9303d5c0bd678461a866a1ea621700d744
A previous path improved speed 2 quality a little but
more extensive testing showed that it slowed encode
by a few %.
The change will have a similar effect for speed 3 but
should not impact speeds 4+;
This experiment should reverse that and give a speed
up at the cost of a small quality loss.
Borg results pending.
Change-Id: I4493fc1541aaf44587f1a41ff219f7088da9252c
Both values are already checked as command line arguments:
RANGE_CHECK_HI(cfg, g_lag_in_frames, MAX_LAG_BUFFERS);
RANGE_CHECK_HI(extra_cfg, sharpness, 7);
Change-Id: I584798d587152d88dfd517c210054b466f4e5f8a