This commit added code to keep track of separate entropy contexts for
normal frames and alt ref frames. The underly assumption was that the
two type of frames have different entropy characteristics given they
typically have quite different quantization levels. By keeping entropy
contexts separate, it helps the entropy context distribution to be more
closely adapted to each frame type.
Tests on derf set showed a good and very consistent gain on all clips
on all metrics, avg psnr: 0.89%, overall psnr: 0.84% and ssim 0.93%.
http://www.corp.google.com/~yaowu/no_crawl/mulcontext.html
Change-Id: I15bc9697f6ff7829042911fe0c62930585d7e65d
This commit enabled the usage of 8x8 intra prediction modes on inter
frames. There are a few TODO items related to this: 1)baseline entropy
need be calibrated; 2)cost of UV need to be done more properly rather
than using decision only relying on Y; 3)Threshold for allowing picking
8x8 intra prediction should be lowered to lower than the B_PRED.
Even with all the TODOs, tests showed consistent gain on derf set ~0.1%
(PSNR:0.08% and SSIM:0.14%). It is assumed that 8x8 intra prediction
will help more on large resolution clips, especially with above TODOs
addressed.
Change-Id: I398ada49dfc32575cfab962a569c2885111ae3ba
Fixed some further QIndex related issues and replaced some tables
(eg zbin and rounding)
Also Added function (currently disabled by default) to populate the
main AC and DC quantizer tables. Using the original AC range the
resulting computed DC values give behavior broadly comparable
on the DERF set. That is not to say that the equations will hold good
over a more extended range. The purpose of this code is to make it
easier to experiment with further alterations to the Q range and distribution
of Q values plus the relative weights given to AC and DC.
The function find_fp_qindex() ensures that changes to the Q tables
are reflected in the value passed in to the first pass code.
Slight experimental adjustment to static segment Q offset.
Change-Id: I36186267d55dfc2a3d565d0cff7218ef300d1cd5
this commit is to add an variable in the macroblock level mode
info structure to track the transform size used in each MB, so
the information can be used later in the loop filter to change
how loop filter works on MBs with different transform sizes.
Change-Id: Id0eeaba6cc854c6d1be00ed8d237b3d9e250e447
Slight tweaks to the new minq equations to bring results more into line with
original lookup tables.
Change-Id: I969fc87d95912df549b6775e83ee2345e84d4da0
Fixed bug in firspass.c call to vp8_initialize_rd_consts()
This was passing in vp8_dc_quant(cm->base_qindex, cm->y1dc_delta_q)
instead of (cm->base_qindex + cm->y1dc_delta_q).
It just so happens that for the value 26 used for cm->base_qindex in the
unextended Q case, the two give similar results. However, when using
the extended Q range the two are very different.
Also added more stats output and partly disabled another broken feature.
Change-Id: Iddf6cf5ea8467c44b7c133f38e629f6ba6f2581e
This value needs to be copied to each thread's data structure.
This fixed artifact problem in multi-thread encoder.
Change-Id: Iab6d9745a1d44846aa503184705376f63a505597
This is an experiment to include a mv contribution from last frame to
nearest and near mv definition. Initial test showed some small though
consistent gain.
latest patch slightly better result ~.13%-~.18%.
TODO: the entropy used to encode the mode choice, i.e. the mv counts
based conditional distribution of modes should be re-collected to
reflect this change, it is expected that there is some further gain
from that.
Change-Id: Ief1e284a36d8aa56b49ae5b360c91419ec494fa4
This comitt brings accross changes from the public branch
commit number Icf74d13af77437c08602571dc7a97e747cce5066.
The main puurpose of this comit relates to CQ mode but it
also includes some refactoring of the two pass code which
I hope will make tuning the experimental branch for the new
quantizer range a little less painfull.
Change-Id: I278e989436a928fc1fe7761068960048f9d7a376
This commit resolves further QIndex look up tables to facilitate
experimentation with the quantizer range.
In some cases rather than remove the look up tables completely
I have created functions that are called once to populate them
using a formulaic approach base on the actual quantizer.
The use of these functions based on best fit of data from the original
tables does affect the results on some clips but across the derf test
set the effect was broadly neutral.
Change-Id: I8baa61c97ce87dc09a6340d56fdeb681b9345793
API was not returning correct partition sizes on arm targets.
The armv5 token packing functions were not storing the information to the
partition size table.
As a fix, have one boolcoder instance allocated for each partition so
that partition sizes are internally available after all partitions
were encoded. This will also allow more flexibility in producing
several partitions in parallel.
Use buffer validation (overflow check) in all ARM bitpacking
functions.
Change-Id: I31c8a11d8a7613676f0ff50928cb2a2ab14fd169
One of the problems arising when tweaking or adjusting the quantizer
tables is that there are a lot of look up tables that depend on the QINDEX.
Any adjustment to the link between QINDEX and real quantizer therefore tends
to break aspects of for example the rate control.
In this check in I have replaced several of the look up tables with functions that
approximate the same results as the old Q luts but use a formulaic approach
based on real Q values rather than QIndex. This should hopefully make it easier
to experiment with changes to the Q tables without always having to go through
and hand optimize a set of look up tables. Once things stabilize we may choose
to re-instate luts for the sake of performance.
Patch 2:
Addressed Ronald's comments.
vp8_init_me_luts() Added so luts only initialized once.
Change-Id: Ic80db2212d2fd01e08e8cb5c7dca1fda1102be57
Corrected dc lookup table to maintain ac/dc balance
close to what it was previously.
Firstpass not being passed the adjusted Q index for
the extended range.
Change-Id: Ic0200dabda445fea03bf81067999cb2670e99b77
Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2
file is bad
Moving towards allowing --disable-mmx
Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742
There was an implicit reference frame test order (typically LAST,
GOLD, ARF) in the mode selection logic, but this doesn't provide the
expected results when some reference frames are disabled. For
instance, in real-time mode, the speed selection logic often disables
the ARF modes. So if the user disables the LAST and GOLD frames, the
encoder was always choosing INTRA, when in reality searching the ARF
in this case has the same speed penalty as searching LAST would have
had.
Instead, introduce the notion of a reference frame search order. This
patch preserves the former priorities, so if a frame is disabled, the
other frames bump up a slot to take its place. This patch lays the
groundwork for doing something smarter in the frame test order, for
example considering temporal distance or looking at the frames used by
nearby blocks.
Change-Id: I1199149f8662a408537c653d2c021c7f1d29a700
Extend buffer write validation (overflow check) to single token
partition packing, both mb and row based functions.
Change-Id: I36e19b7d37fc43712d05c70e3ad223d3eb5b973d
The buffer level was able to increase indefinitely rather than
being clipped to the maximum buffer size specified by the user.
This change checks the buffrer level and prevents it from
going beyond the upper limit of the buffer.
Change-Id: Ifff55f79d3c018e4d3d77e554b11ada543cc1654
This commit has a few minor fixes to the 8x8 trellis quant, so to
make it work regardless if extend_qrange is enabled or not. It also
borrowed adaptive RDMULT constants from 4x4 trellis that was missed
in the 8x8 trellis quant.
Change-Id: I60d7769071f102c699b5084597e62bca87a1f759
Explicit inclusion of limits.h to satisfy unix build for definition of INT_MAX.
Some commented out code removed.
Change-Id: I5b5980dfaa9b4d2d12bfd729cfd35bd982106908
Patch set 2: 64 bit build fix
Patch set 3: 64 bit crash fix
[Tero]
Patch set 4: Updated ARMv6 and NEON assembly.
Added also minor NEON optimizations to subtract
functions.
Patch set 5: x86 stride bug fix
Change-Id: I1fcca93e90c89b89ddc204e1c18f208682675c15
Removal of CONFIGURE_SEGMENTATION ifdefs.
Removal of legacy support code fo the old coding mechanism.
Use local reference "xd" for MACROBLOCKD structure in
encode_frame_to_data_rate()
Moved call to choose_segmap_coding_method() out of encode
loop as the cost of segmentation is not properly accounted
in the loop anyway. If this is desirable in the future it
can be moved back. The use of this function to do all the
analysis and set the probabilities also removes the need
to track segment useage in threading code.
Change-Id: I85bc8fd63440e7176c73d26cb742698f9b70cade
Changed name and sense of segment_flag to "seg_id_predicted"
Added some additional comments and retested.
I also did some experimentation with a spatial prediction option
using a similar strategy to the temporal mode implemented.
This helps in some cases where temporal prediction is bad but
I suspect there is more overlap here with work on a larger scale
block structure and spatial correlation will likely be better
handled through that mechanism.
Next check in will remove #ifdefs and legacy mode code.
Change-Id: I3b382b65ed2a57bd7775ac0f3a01a9508a209cbc
This check in includes quite a lot of clean up and refactoring.
Most of the analysis and set up for the different coding options for the
segment map (currently simple distribution based coding or temporaly
predicted coding), has been moved to one location (the function
choose_segmap_coding_method() in segmenation.c). This code was previously
scattered around in various locations making integration with other
experiments and modification / debug more difficult.
Currently the functionality is as it was with the exception that the
prediction probabilities are now only transmitted when the temporal
prediction mode is selected.
There is still quite a bit more clean up work that will be possible
when the #ifdef is removed. Also at that time I may rename and alter
the sense of macroblock based variable "segment_flag" which indicates
(1 that the segmnet id is not predicted vs 0 that it is predicted).
I also intend to experiment with a spatial prediction mode that can be
used when coding a key frame segment map or in cases where temporal
prediction does not work well but there is spatial correlation.
In a later check in when the ifdefs have gone I may also move the call
to choose_segmap_coding_method() to just before where the bitsream is
packed (currently it is in vp8_encode_frame()) to further reduce the
possibility of clashes with other experiments and prevent it being called
on each itteration of the recode loop.
Change-Id: I3d4aba2a2826ec21f367678d5b07c1d1c36db168
The calculated frame_rate is a state variable in the codec, and
shouldn't be maintained in the configuration struct. Move it to the
main part of cpi so that it isn't clobbered when the configuration
struct is updated. The initial framerate estimate is moved from the
vp8_cx_iface.c wrapper into the body of init_config() in onyx_if.c, so
that it is only called once and not reset on every call to
vp8_change_config().
Change-Id: I8d9a3d1283330d1ee297d07e9d78d1f2875f2465
Added last_segmentation_map[] structure
to keep track of what we had before when
doing temporal prediction. With this change
the existing code does once again appear to
be giving a decodable bitstream for both
temporal and standard prediction modes.
However, it is still somewhat messy and
confused and there is no option to take
advantage of spatial prediction so it could
do with further work.
Some housekeeping / clean out.
Change-Id: I368258243f82127b81d8dffa7ada615208513b47
Some initial cleanup to aid testing and debug.
Pull code to choose temporal or spatial encoding
out of encodeframe.c into a dedicated function
in segmentation.c.
For now disable broken temporal mode.
Move the coding of "temporal_update" flag and
only transmit if segment map update is indicated.
Rename the functions read_mb_features() and
write_mb_features() to read_mb_segid() and
read_mb_segid() as they only read and write
the macroblock segment id not any of the
features.
Change-Id: Ib75118520b1144c24d35fdfc6ce46106803cabcf
This commit added scaling factors to 8x8 transform, quant, dequant and
inverse transform pipeline to make 8x8 transform to work when configed
with enable-extend_qrange. This commit also disabled the trellis-quant
when extend_qrange is configured.
Change-Id: Icfb3192e4746f70a4bb35ad18b7b47705b657e52
extend_qrange introduces a different scaling factor, this commit takes
the scaling difference into account for reset 2nd order coefficients.
Change-Id: Ie58bca9f52698fa759e3f88da2aa4d82630fa91a
For ease of testing and merging experiments I have
removed in line code in encode_frame() that assigns
MBs to be t8x8 or t4x4 coded segments and have
moved the decision point and segment setup to
the init_seg_features0 test function.
Keeping everything in one place helps make sure
for now that experiments using segmentation are
not fighting each other.
Also made sure mode selection code can't choose 4x4
modes if t8x8 is selected.
Patch2: In init_seg_features() add checks
for SEG_LVL_TRANSFORM active.
Change-Id: Ia1767edd99b78510011d4251539f9bc325842e3a
Call the idct/add after the tokenize. This is WIP with
the goal of creating a common idct/add for the encoder and
decoder. This move is necessary because the decoder's version
of the idct clobbers qcoeff, which is used by the tokenize.
Change-Id: I6b08d8e8397cd873647fa4fb9469884e3c876756
Removed code in #if CONFIG_SEGMENTATION that
enables segmentation and creates a test segmentation
map, to avoid conflicts with the other segmentation test
code,
Change-Id: I7a21a44ed188b814cd80b30dd628c62474eba730
Bug fix to logic in vp8_pick_inter_mode() and
vp8_rd_pick_inter_mode().
The block on the use of segment features
for the cm->refresh_alt_ref_frame case
was just for testing and is not correct.
The special case code for alt ref can
be re-enabled as an else clause.
Change-Id: Ic9b57cdb5f04ea7737032b8fb953d84d7717b3ce
Added ARM optimized intra 4x4 prediction
- 2x faster on Profiler compared to C-code compiled with -O3
- Function interface changed a little to improve BLOCKD structure
access
Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54
The 8x8 forward transform makes use of floating operations, therefore
requires emms call to reset mmx registers to correct state. Without
the resets, the 8x8 forward transform results are indefinite on win32
platform.
Change-Id: Ib5b71c3213e10b8a04fe776adf885f3714e7deb1
vp8cx_mb_init_quantizer() needs to be called at least once to get
all values calculated. This change added one check to decide if
we could skip initialization or not.
Change-Id: I3f65eb548be57580a61444328336bc18c25c085b