x86-64 passes arguments in registers. There is no need to push
them to the stack before using them.
This fixes 15acc84f10 where ebx
was not getting preserved on x86.
Change-Id: I1214b5f818a0201f75ab6ad7d5c6f448e09b16c2
The use of incorrect mv costing tables in the ARNR sub-pel
filtering code led to corruption of the altref buffer in some cases,
particularly at low data rates.
The average gain from this fix is about 0.3% but there are a few
extreme cases where nasty and visible artifacts manifested and
for these few data points the improvement is > 10%.
PGW and AWG
Change-Id: I95cc02b196a433e71d0d2bd2b933fe68ed31e796
This intends to correct the tendency that VP8 aggressively favors rate
on intra coded frames. Experiments tested different numbers in [0, 1]
and found 9/16 overall provided about 2-4% gains for all-intra coded
clips based on vpx-ssim metric. The impact on regular encoded clips
is much smaller but positive overall. Overall impact on psnr is also
positive even though very small.
Change-Id: If808553aaaa87fdd44691f9787820ac9856d9f8a
The fast quantizer assembly code has not been updated to match the new
exact quantizer, which was made the default in commit 6adbe09.
Specifically, they are not aware of the potential for the coefficient
to be scaled, which results in the quantized result exceeding the range
of the DCT. This patch restores the previous behavior of using the
non-shifted coefficients when in the fast quantizer code path, but
unfortunately requires rebuilding the tables when switching between the
two.
Change-Id: I0a33f5b3850335011a06906f49fafed54dda9546
Debugging in postproc needs more flags to allow for specific
block types to be turned on or off in the visualizations.
Must be enabled with --enable-postproc-visualizer during
configuration time.
Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d
VBR rate control can become very noisy for the last few frames.
If there are a few bits to spare or a small overshoot then the
target rate and hence quantizer may start to fluctuate wildly.
This patch prevents further adjustment of the active Q limits for
the last few frames.
Patch also removes some redundant variables and makes one small bug fix.
Change-Id: Ic167831bec79acc9f0d7e4698bcc4bb188840c45
Small changes to the default zero bin and rounding tables.
Though the tables are currently the same for the Y1 and Y2 cases
I have left them as separate tables in case we want to tune this later.
There is now some adjustment of the zbin based on the prediction mode.
Previously this was restricted to an adjustment for gf/arf 0,0 MV.
The exact quantizer now marginal outperforms and is the default.
The overall average gain is about 0.5%
Change-Id: I5e4353f3d5326dde4e86823684b236a1e9ea7f47
The check '(user_data_end - partition < partition_size)' must be
evaluated as a signed comparison, but because partition_size was
unsigned, the LHS was promoted to unsigned, causing an incorrect
result on 32-bit. Instead, check the upper and lower bounds of
the segment separately.
Change-Id: I6266aba7fd7de084268712a3d2a81424ead7aa06
Change Ice204e86 identified a problem with bitrate undershoot due to
low precision in the timestamps passed to the library. This patch
takes a different approach by calculating the duration of this frame
and passing it to the library, rather than using a fixed duration
and letting the library average it out with higher precision
timestamps. This part of the fix only applies to vpxenc.
This patch also attempts to fix the problem for generic applications
that may have made the same mistake vpxenc did. Instead of
calculating this frame's duration by the difference of this frame's
and the last frame's start time, we use the end times instead. This
allows the framerate calculation to scavenge "unclaimed" time from
the last frame. For instance:
start | end | calculated duration
======+=======+====================
0ms 33ms 33ms
33ms 66ms 33ms
66ms 99ms 33ms
100ms 133ms 34ms
Change-Id: I92be4b3518e0bd530e97f90e69e75330a4c413fc
Change color reference frame to blend the macro block edge.
This helps with layering of visualizations.
Add block coloring for intra prediction modes.
Change-Id: Icefe0e189e26719cd6937cebd6727efac0b4d278
Split motion vectors were all being treated as 4x4
blocks. Now correctly handle 16x8, 8x16, 8x8, 4x4
blocks.
Change-Id: Icf345c5e69b5e374e12456877ed7c41213ad88cc
Now draw 16 vectors for SPLITMV mode.
Fixed constrain line to block divide by zero issues.
Blend block was not centering the shaded area correctly.
Change-Id: I1edabd8b4e553aac8d980f7b45c80159e9202434
(test clip: tulip)
For good quality mode with speed=1, this gave the encoder
a small (2 - 3%) performance boost.
Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35
Unlike GCC, Visual Studio compiler doesn't allocate SAD output
array 16-byte aligned, which causes crash in visual studio.
Change-Id: Ia755cf5a807f12929bda8db94032bb3c9d0c2362
This eliminates a large set of warnings exposed by the Mozilla build
system (Use of C++ comments in ISO C90 source, commas at the end of
enum lists, a couple incomplete initializers, and signed/unsigned
comparisons).
It also eliminates many (but not all) of the warnings expose by newer
GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
without checking the return values).
There are a few spurious warnings left on my system:
../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
change between the two if blocks that test it here.
../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
standard does not mandate that enums be unsigned, so the checks can't be
removed.
Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395
Turned down the blending level to make colored blocks obscure
the video less.
Not blending the entire block to give distinction to macro
block edges.
Added configuration so that macro block blending function can
be optimized.
Change to constrain line as to when dx and dy are computed.
Now draw two lines to form an arrow.
Change-Id: Id3ef0fdeeab2949a6664b2c63e2a3e1a89503f6c
Use mpsadbw, and calculate 8 sad at once. Function list:
vp8_sad16x16x8_sse4
vp8_sad16x8x8_sse4
vp8_sad8x16x8_sse4
vp8_sad8x8x8_sse4
vp8_sad4x4x8_sse4
(test clip: tulip)
For best quality mode, this gave encoder a 5% performance boost.
For good quality mode with speed=1, this gave encoder a 3%
performance boost.
Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134
This patch fixes the system dependent entries for the half-pixel
variance functions in both the RTCD and non-RTCD cases:
- The generic C versions of these functions are now correct.
Before all three cases called the hv code.
- Wire up the ARM functions in RTCD mode
- Created stubs for x86 to call the optimized subpixel functions
with the correct parameters, rather than falling back to C
code.
Change-Id: I1d937d074d929e0eb93aacb1232cc5e0ad1c6184
This allows multiple post processor debug levels to be overlayed.
i.e. can show colored reference blocks and visual motion vectors.
Change-Id: Ic4a1df438445b9f5780fe73adb3126e803472e53
ARM used to explicitly remove this file from the build. With the RTCD
changes, that's no longer possible. These errors also exist for x86 w/o
RTCD, but that's not the default configuration
Change-Id: I3e10e5553ddf3278e8d3c9365ca6fb84f52f5066
NEON has optimized 16x16 half-pixel variance functions, but they
were not part of the RTCD framework. Add these functions to RTCD,
so that other platforms can make use of this optimization in the
future and special-case ARM code can be removed.
A number of functions were taking two variance functions as
parameters. These functions were changed to take a single
parameter, a pointer to a struct containing all the variance
functions for that block size. This provides additional flexibility
for calling additional variance functions (the half-pixel special
case, for example) and by initializing the table for all block sizes,
we don't have to construct this function pointer table for each
macroblock.
Change-Id: I78289ff36b2715f9a7aa04d5f6fbe3d23acdc29c
Post process option to color the block for either the mode
of the macro block, or the frame that the macro block references.
Change-Id: Ie498175497f2d20e3319924d352dc4ddc16f4134
ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.
Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
The ARM version of vp8_hex_search() is a faster implementation
of the same algorithm. Since it doesn't use any ARM specific
code, it can be made the default implementation. This removes
a linking error.
Change-Id: I77d10f2c16b2515bff4522c350004e03b7659934