The merge includes hooks to for CQ mode and other code
changes merged from the test branch.
CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guidline maximum bitrate.
The existing target data rate parameter is used to specify the
guideline maximum bitrate.
A new parameter allows the user to specify a target CQ level.
For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with guideline maximum bitrate.
Change-Id: I2221f9eecae8cc3c431d36caf83503941b25e4c1
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.
Change-Id: I9203532fd67361588d4024628d9ddb8e391ad912
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.
Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.
Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257
Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes
original: b7b1e6fb
reverted: 41f4458a
Also drop unused field from vp8_extra_bit_struct
Update ARM ASM to deal with this change. In particular, Extra is signed
and needs to be sign-extended when loaded.
Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41
This code is unused, as the current preproc implementation uses the
same spatial filter that postproc uses.
Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
Changed the end of block computation to use pmaxw. Removed
additional pushing and popping of registers that was not needed.
Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98
In the process of developing new intra prediction modes, tests have
shown removal of the low pass filtering from B_HE_PRED and B_VE_PRED
has an overall minor positive impact in both PSNR and SSIM metric.
Overall difference is about 0.1%. The change shall also have a small
positive impact on speed. Intuitively, this change should also reduce
some of the tendency of "flattening"
Change-Id: I3c43b0daca833c6eff77d00f19c811f9ef9368a3
Extending the value range of tokens allows further experiments on
extending quantizer range. Encoder and decoder were verified to
produce matching reconstructed buffers by tests with forced
quantized value of 1.
Change-Id: I12faf92832867870b6f71ddeafbf643f1040086d
Debugging in postproc needs more flags to allow for specific
block types to be turned on or off in the visualizations.
Must be enabled with --enable-postproc-visualizer during
configuration time.
Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d
Change color reference frame to blend the macro block edge.
This helps with layering of visualizations.
Add block coloring for intra prediction modes.
Change-Id: Icefe0e189e26719cd6937cebd6727efac0b4d278
Split motion vectors were all being treated as 4x4
blocks. Now correctly handle 16x8, 8x16, 8x8, 4x4
blocks.
Change-Id: Icf345c5e69b5e374e12456877ed7c41213ad88cc
Now draw 16 vectors for SPLITMV mode.
Fixed constrain line to block divide by zero issues.
Blend block was not centering the shaded area correctly.
Change-Id: I1edabd8b4e553aac8d980f7b45c80159e9202434
This eliminates a large set of warnings exposed by the Mozilla build
system (Use of C++ comments in ISO C90 source, commas at the end of
enum lists, a couple incomplete initializers, and signed/unsigned
comparisons).
It also eliminates many (but not all) of the warnings expose by newer
GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
without checking the return values).
There are a few spurious warnings left on my system:
../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
change between the two if blocks that test it here.
../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
standard does not mandate that enums be unsigned, so the checks can't be
removed.
Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395
Turned down the blending level to make colored blocks obscure
the video less.
Not blending the entire block to give distinction to macro
block edges.
Added configuration so that macro block blending function can
be optimized.
Change to constrain line as to when dx and dy are computed.
Now draw two lines to form an arrow.
Change-Id: Id3ef0fdeeab2949a6664b2c63e2a3e1a89503f6c
This allows multiple post processor debug levels to be overlayed.
i.e. can show colored reference blocks and visual motion vectors.
Change-Id: Ic4a1df438445b9f5780fe73adb3126e803472e53
Post process option to color the block for either the mode
of the macro block, or the frame that the macro block references.
Change-Id: Ie498175497f2d20e3319924d352dc4ddc16f4134
ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.
Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
Some of the ARM functions differed from their generic counterparts
only by unrolling their loops. Since this change may be useful
on other platforms, or might even supercede the looped version
in the generic case, move it back to the generic file.
This code is left under #if ARCH_ARM for now, but it may be worth
considering a different (possibly new) conditional for these. If
it turns out that this should be runtime selectable, these
functions will have to move to the RTCD infrastructure. Don't want
to take that step at this time without more profile data.
Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f
These functions were true duplicates of functions present in the
generic code. This fixes some of the link errors when building
with --enable-shared --enable-pic.
Change-Id: Idff26599d510d954e439207883607ad6b74df20c
Postproc level that uses Bresenham's line algorithm
to draw motion vectors onto the postproc buffer.
Change-Id: I34c7daa324f2bdfee71e84fcb1c50b90fa06f6fb
there were four versions for the regular and
macroblock loopfilters:
horizontal [y|uv]
vertical [y|uv]
this moves all the common code into 2 functions:
vp8_loop_filter_neon
vp8_mbloop_filter_neon
this provides no gain in performance. there's a bit
of jitter, but it trends down ~0.25-0.5%. however,
this is a huge gain maintenance. also, there is the
potential to drop some stack usage in the macroblock
loopfilter.
Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92
The primary goal is to allow a binary to be built which supports
NEON, but can fall back to non-NEON routines, since some Android
devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
Tegra).
The configure-generated flags HAVE_ARMV7, etc., are used to decide
which versions of each function to build, and when
CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
at run time.
In order for this to work, the CFLAGS must be set to something
appropriate (e.g., without -mfpu=neon for ARMv7, and with
appropriate -march and -mcpu for even earlier configurations), or
the native C code will not be able to run.
The ASFLAGS must remain set for the most advanced instruction set
required at build time, since the ARM assembler will refuse to emit
them otherwise.
I have not attempted to make any changes to configure to do this
automatically.
Doing so will probably require the addition of new configure options.
Many of the hooks for RTCD on ARM were already there, but a lot of
the code had bit-rotted, and a good deal of the ARM-specific code
is not integrated into the RTCD structs at all.
I did not try to resolve the latter, merely to add the minimal amount
of protection around them to allow RTCD to work.
Those functions that were called based on an ifdef at the calling
site were expanded to check the RTCD flags at that site, but they
should be added to an RTCD struct somewhere in the future.
The functions invoked with global function pointers still are, but
these should be moved into an RTCD struct for thread safety (I
believe every platform currently supported has atomic pointer
stores, but this is not guaranteed).
The encoder's boolhuff functions did not even have _c and armv7
suffixes, and the correct version was resolved at link time.
The token packing functions did have appropriate suffixes, but the
version was selected with a define, with no associated RTCD struct.
However, for both of these, the only armv7 instruction they actually
used was rbit, and this was completely superfluous, so I reworked
them to avoid it.
The only non-ARMv4 instruction remaining in them is clz, which is
ARMv5 (not even ARMv5TE is required).
Considering that there are no ARM-specific configs which are not at
least ARMv5TE, I did not try to detect these at runtime, and simply
enable them for ARMv5 and above.
Finally, the NEON register saving code was completely non-reentrant,
since it saved the registers to a global, static variable.
I moved the storage for this onto the stack.
A single binary built with this code was tested on an ARM11 (ARMv6)
and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder,
and produced identical output, while using the correct accelerated
functions on each.
I did not test on any earlier processors.
Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa
Most of the code that actually uses these matrices indexes them as
if they were a single contiguous array, and coverity produces
reports about the resulting accesses that overflow the static
bounds of the first row.
This is perfectly legal in C, but converting them to actual [16]
arrays should eliminate the report, and removes a good deal of
extraneous indexing and address operators from the code.
Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23
../libvpx/vp8/encoder/bitstream.c: In function ‘pack_inter_mode_mvs’:
../libvpx/vp8/encoder/bitstream.c:1026: warning: array subscript has type ‘char’
Change-Id: Ic77491e0a172fa1821e5b3e914d0dc41fe87c00f
Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208
nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes
Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.
Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc
nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.
Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
nasm requires the instruction length (movd/movq) to match to its
parameters. I find it more clear to really use 64bit instructions when
we use 64bit registers in the assembly.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.
Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91
Raised by Lei Yang, the Y plane stride was used for UV blocks.
This is clearly a typo. But as the comments in the code suggested
that this port of code has not been used yet, so the typo should
not have created any damage yet.
Change-Id: Iea895edc17469a51c803a8cc6d0fce65a1a7fc2f
Loopfilter deltas are initialized to zero on keyframes in the decoder.
The values then persist from the previous frame unless an update bit
is set in the bitstream. This data is not included in the entropy
data saved by the 'refresh entropy' bit in the bitstream, so it is
effectively an additional contextual element beyond the 3 ref-frames
and the entropy data.
The encoder was treating this delta update bit as update-if-nonzero,
meaning that the value would be refreshed even if it hadn't changed,
and more significantly, if the correct value for the delta changed
to zero, the update wouldn't be sent, and the decoder would preserve
the last (presumably non-zero) value.
This patch updates the encoder to send an update only if the value
has changed from the previously transmitted value. It also forces the
value to be transmitted in error resilient mode, to account for lost
context in the event of lost frames.
Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868