Thread synchronization was not correct when frame width was 1 MB.
Number of allocated encoding threads is limited by the sync_range.
There is no point having more because each thread lags sync_range MBs
behind the thread processing the row above.
http://code.google.com/p/webm/issues/detail?id=302
Change-Id: Icaf67a883beecc5ebf2f11e9be47b6997fdf6f26
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s
functions had global function pointers rather than using the RTCD
framework. This can show up as a potential data race with tools such as
helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935
for an example.
Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
The encoder was not correctly catching transitions in the quantizer
deltas. If a delta_q was set, then the quantizer would be reinitialized
on every frame, but if they transitioned to 0, the quantizer would
not be reinitialized, leading to a encode-decode mismatch.
This bug was triggered by commit 999e155, which sets a Y2 delta Q
for very low base Q levels.
Change-Id: Ia6733464a55ee4ff2edbb82c0873980d345446f5
Reduce the number of sync points by letting each thread
continue imediatly with a new MB row.
Better multicore scaling, improves performance by 5-20% on ARM multicore.
Change-Id: Ic97e4d1c4886a842c85dd3539a93cb217188ed1b
from vp8cx_encode_intra_macro_block. prediction_error is used when
deciding if a frame should be a keyframe. After reviewing this with
Yaowu, it was pointed out that vp8cx_encode_intra_macro_block
is only called for keyframes, so the accumulation is unnecessary.
Change-Id: Id79dc81b80d4f5d124f3a0dba1b923887e2e1ec8
vp8_pick_intra4x4mby_modes uses the passed in distortion
for an early breakout. The best distortion was never saved
and the distortion for TM_PRED was always used.
Change-Id: Idbaf73027408a4bba26601713725191a5d7b325e
The condition for using RD when selecting the intra coding mode
for a MB is that the RD flag is set AND we're not in real-time
mode.
Previously the code used RD if either the RD flag was set OR
we were not using real-time mode.
Change-Id: Ic711151298468a3f99babad39ba8375f66d55a08
vp8cx_mb_init_quantizer was being called for every mode checked
in vp8_rd_pick_inter_mode. zbin_extra is the only value that
really needs to be recalculated. This calculation is disabled
when using the fast quantizer for mode selection.
This gave a small performance boost (~.5% to 1%).
Note: This needs to be verified with segmentation_enabled.
Change-Id: I62716a870b3c82b4a998bdf95130ff0b02106f1e
Prior to this change, VP8 min quantizer is 4, which caps the
highest quality around 51DB. This experimental change extends
the min quantizer to 1, removes the cap and allows the highest
quality to be around ~73DB, consistent with the fdct/idct round trip
error. To test this change, at configure time use options:
--enable-experimental --enable-extend_qrange
The following is a brief log of changes in each of the patch sets
patch set 1:
In this commit, the quantization/dequantization constants are kept
unchanged, instead scaling factor 4 is rolled into fdct/idct.
Fixed Q0 encoding tests on mobile:
Before: 9560.567kbps Overall PSNR:50.255DB VPXSSIM:98.288
Now: 18035.774kbps Overall PSNR:73.022DB VPXSSIM:99.991
patch set 2:
regenerated dc/ac quantizer lookup tables based on the scaling
factor rolled in the fdct/idct. Also slightly extended the range
towards the high quantizer end.
patch set 3:
slightly tweaked the quantizer tables and generated bits_per_mb
table based on Paul's suggestions.
patch set 4:
fix a typo in idct, re-calculated tables relating active max Q
to active min Q
patch set 5:
added rdmult lookup table based on Q
patch set 6:
fix rdmult scale: dct coefficient has scaled up by 4
patch set 7:
make transform coefficients to be within 16bits
patch set 8:
normalize 2nd order quantizers
patch set 9:
fix mis-spellings
patch set 10:
change the configure script and macros to allow experimental code
to be enabled at configure time with --enable-extend_qrange
patch set 11:
rebase for merge
Change-Id: Ib50641ddd44aba2a52ed890222c309faa31cc59c
This code fixes a bug in the calculation of
the minimum Q for alt ref frames.
It also allows an extended gf/arf interval for sections
of clips that completely static (or nearly so).
Change-Id: I1a21aaa16d4f0578e5f99b13bebd78d59403c73b
cpi->target_bits_per_mb is currently not being used,
so delete it. Also removed other unused code in rdopt.c.
Change-Id: I98449f9030bcd2f15451d9b7a3b9b93dd1409923
Use the fast quantizer for inter mode selection and the
regular quantizer for the rest of the encode for good quality,
speed 1. Both performance and quality were improved. The
quality gains will make up for the quality loss mentioned in
I9dc089007ca08129fb6c11fe7692777ebb8647b0.
Change-Id: Ia90bc9cf326a7c65d60d31fa32f6465ab6984d21
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.
Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.
Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257
The fast quantizer assembly code has not been updated to match the new
exact quantizer, which was made the default in commit 6adbe09.
Specifically, they are not aware of the potential for the coefficient
to be scaled, which results in the quantized result exceeding the range
of the DCT. This patch restores the previous behavior of using the
non-shifted coefficients when in the fast quantizer code path, but
unfortunately requires rebuilding the tables when switching between the
two.
Change-Id: I0a33f5b3850335011a06906f49fafed54dda9546
- Used three probability approach for temporal context as follows:
P0 - probability of no change if both above and left not changed
P1 - probability of no change if one of above and left has changed
P2 - probability of no change if both above and left have changed
In addition, a 1 bit/frame has been used to decide whether to use temporal context or to encode directly. The cost of using both the schemes is calculated ahead and the temporal_update flag is set if the cost of using temporal context is lower than encoding the segment ids directly.
This approach has given around 20% reduction in cost of bits needed to encode segmentation ids.
Change-Id: I44a5509599eded215ae5be9554314280d3d35405
Small changes to the default zero bin and rounding tables.
Though the tables are currently the same for the Y1 and Y2 cases
I have left them as separate tables in case we want to tune this later.
There is now some adjustment of the zbin based on the prediction mode.
Previously this was restricted to an adjustment for gf/arf 0,0 MV.
The exact quantizer now marginal outperforms and is the default.
The overall average gain is about 0.5%
Change-Id: I5e4353f3d5326dde4e86823684b236a1e9ea7f47
Most of the code that actually uses these matrices indexes them as
if they were a single contiguous array, and coverity produces
reports about the resulting accesses that overflow the static
bounds of the first row.
This is perfectly legal in C, but converting them to actual [16]
arrays should eliminate the report, and removes a good deal of
extraneous indexing and address operators from the code.
Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23
This patch moves the scattered updates to the mb skip state
(mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent
changes to the quantizer exposed a bug where if a macroblock
could be coded as a skip but isn't, the encoder would run the
loopfilter but the decoder wouldn't, causing a reference buffer
mismatch.
The loopfilter is controlled by a flag called dc_diff. The decoder
looks at the number of decoded coefficients when setting this flag.
The encoder sets this flag based on the skip state, since any
skippable macroblock should be transmitted as a skip. The coefficient
optimization pass (vp8_optimize_b()) could change the coefficients
such that a block that was not a skip becomes one. The encoder was
not updating the skip state in this situation for intra coded blocks.
The underlying issue predates it, but this bug was recently triggered
by enabling trellis quantization on the Y2 block in commit dcd29e3,
and by changing the quantizer range control in commit 305be4e.
Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802
This uses MB variance to change the RDO weight for mode decision
and quantization.
Activity is normalized against the average for the frame, which is
currently tracked using feed-forward statistics.
This could also be used to adjust the quantizer for the entire
frame, but that requires more extensive rate control changes.
This does not yet attempt to adapt the quantizer within the frame,
but the signaling cost means that will likely only be useful at
very high rates.
Change-Id: I26cd7c755cac3ff33cfe0688b1da50b2b87b9c93