224 Commits

Author SHA1 Message Date
John Koleszar
02321de0f2 Fix relative include paths
Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.

Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
2011-02-10 15:09:44 -05:00
Johann
7d8199f0c3 Merge "Adds armv6 optimized variance calculation" 2011-02-10 06:06:46 -08:00
John Koleszar
a39b5af10b Merge "Put more code under #if CONFIG_MULTITHREAD." 2011-02-09 08:31:36 -08:00
Gaute Strokkenes
315e3c2518 Put more code under #if CONFIG_MULTITHREAD.
Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b
2011-02-09 11:21:18 -05:00
Tero Rintaluoma
cb14764fab Adds armv6 optimized variance calculation
Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
and adds new assembly file for variance16x16 calculation.
 - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
 - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
 - vp8_variance16x16_armv6 (new)
 - bilinearfilter_arm.h (new)
Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
2011-02-09 10:23:43 -05:00
Johann
e5aaac24bb clean up bilinear filter
make reference version of bilinear_filters short.
use reference versions of bilinear_filters and sub_pel_filters when
possible.

recognize that Width was being passed into
filter_block2d_bil_first_pass multiple times. ARM version had already
fixed this. propegate to C.

change references to src_pixels_per_line to src_pitch and standardize on
src/dst (instead of input/output).

recognize that first_pass is only run in the verticle and second_pass
only horizontal. ARM version had already fixed this. propegate to C

Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2
2011-02-08 17:42:54 -05:00
Johann
40dcae9c2e clarify *_offsets.asm differences
it's difficult to mux the *_offsets.c files because of header conflicts.
make three instead, name them consistently and partititon the contents
to allow building them as required.

Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796
2011-02-08 16:35:43 -05:00
Johann
3273c7b679 move one of the offset files
common/arm/vpx_asm_offsets moves up a level. prepare for muxing with
encoder/arm/vpx_vp8_enc_asm_offsets

Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24
2011-02-07 11:35:30 -05:00
Johann
bb9c95ea53 remove unused dboolhuff code
we were holding on to this "just in case." purge it instead

Change-Id: I77a367b36d0821d731019f2566ecfffdae1d4b8a
2011-02-04 16:00:00 -05:00
Yunqing Wang
350ffe8dae Merge "Improve MV prediction in vp8_pick_inter_mode() for speed>3" 2011-02-04 10:10:15 -08:00
Gaute Strokkenes
ffc6aeef14 Remove duplicate loopfilter parameters.
Change-Id: I0d41415e3961c2c9492d342290c1999f9d02e6d8
2011-02-04 14:55:02 +00:00
Gaute Strokkenes
bf5f585b0d Make vp8_adjust_mb_lf_value return the updated value rather than
manipulating it in situ via a pointer.

Change-Id: If4a87a4eccd84f39577c0e91e171245f4954c5cf
2011-02-03 19:24:16 +00:00
Johann
f3cb9ae459 Merge "Adds "armvX-none-rvct" targets" 2011-01-28 09:03:58 -08:00
Yunqing Wang
7cbe684ef5 Improve MV prediction in vp8_pick_inter_mode() for speed>3
Applied same method used in vp8_rd_pick_inter_mode() to improve
the accuracy of MV prediction.

Change-Id: Ia50ae26208b18482695601f32febd99fe89fbc17
2011-01-28 10:00:20 -05:00
Tero Rintaluoma
11a222f5d9 Adds "armvX-none-rvct" targets
Adds following targets to configure script to support RVCT compilation
without operating system support (for Profiler or bare metal images).
 - armv5te-none-rvct
 - armv6-none-rvct
 - armv7-none-rvct

To strip OS specific parts from the code "os_support"-config was added
to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS
specific parts such as OS specific includes and function calls for
timers and threads etc. This was done to enable RVCT compilation for
profiling purposes or running the image on bare metal target with
Lauterbach.

Removed separate AREA directives for READONLY data in armv6 and neon
assembly files to fix the RVCT compilation. Otherwise
"ldr <reg>, =label" syntax would have been needed to prevent linker
errors. This syntax is not supported by older gnu assemblers.

Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43
2011-01-28 12:47:39 +02:00
Yunqing Wang
cac54404b9 Remove copies of same functions
Reduce the code size.

Change-Id: I2e1998557a3c8776e262c442fd758c25e17aff7a
2011-01-26 15:37:00 -05:00
John Koleszar
2f0331c90c Merge "Implement error tracking in the decoder" 2011-01-19 05:51:00 -08:00
Henrik Lundin
67fb3a5155 Implement error tracking in the decoder
A new vpx_codec_control called VP8D_GET_FRAME_CORRUPTED. The output
from the function is non-zero if the last decoded frame contains
corruption due to packet losses.

The decoder is also modified to accept encoded frames of zero length.
A zero length frame indicates to the decoder that one or more frames
have been completely lost. This will mark the last decoded reference
buffer as corrupted. The data pointer can be NULL if the length is
zero.

Change-Id: Ic5902c785a281c6e05329deea958554b7a6c75ce
2011-01-19 09:53:21 +01:00
Henrik Lundin
48c28fc42c Remove unused local variables
Removing unused local variables causing compiler warnings in
Visual Studio.

Change-Id: I0e2096303be1fdbc01428a6e57cca9796bb32c8a
2011-01-11 15:22:19 +01:00
Paul Wilkins
e0846c9c8c CQ Mode
The merge includes hooks to for CQ mode and other code
changes merged from the test branch.

CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guidline maximum bitrate.

The existing target data rate parameter is used to specify the
guideline maximum bitrate.

A new parameter allows the user to specify a target CQ level.

For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with guideline maximum bitrate.

Change-Id: I2221f9eecae8cc3c431d36caf83503941b25e4c1
2011-01-07 18:46:29 +00:00
Yunqing Wang
a864678cdb Always update last_frame_type
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.

Change-Id: I9203532fd67361588d4024628d9ddb8e391ad912
2010-12-29 10:28:35 -05:00
John Koleszar
b0da9b399d Add psnr/ssim tuning option
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.

Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.

Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257
2010-12-17 10:01:05 -05:00
Johann
825adc464f shrink TOKENEXTRA and vp8_extra_bit_struct
Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes
original: b7b1e6fb
reverted: 41f4458a

Also drop unused field from vp8_extra_bit_struct

Update ARM ASM to deal with this change. In particular, Extra is signed
and needs to be sign-extended when loaded.

Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41
2010-12-14 10:32:50 -05:00
John Koleszar
b1aa54ab26 remove unused temporal preproc code
This code is unused, as the current preproc implementation uses the
same spatial filter that postproc uses.

Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
2010-12-13 16:47:59 -05:00
Fritz Koenig
e0cf330cde vp8 fast quantizer sse2 optimizations for eob.
Changed the end of block computation to use pmaxw.  Removed
additional pushing and popping of registers that was not needed.

Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98
2010-12-09 15:00:30 -08:00
Yaowu Xu
d49da085c0 correct errors in token alphabet descriptions
There were a few errors in the comment section that describe VP8 token
alphabet table.

Change-Id: Ie6728a0e08bc3798893221b60408d5b201064bdc
2010-11-16 10:51:43 -08:00
Fritz Koenig
647df00f30 postproc : Re-work posproc calling to allow more flags.
Debugging in postproc needs more flags to allow for specific
block types to be turned on or off in the visualizations.

Must be enabled with --enable-postproc-visualizer during
configuration time.

Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d
2010-11-10 14:14:46 -08:00
Fritz Koenig
0e7b60617f postproc : Update visualizations.
Change color reference frame to blend the macro block edge.
This helps with layering of visualizations.

Add block coloring for intra prediction modes.

Change-Id: Icefe0e189e26719cd6937cebd6727efac0b4d278
2010-11-04 10:35:02 -07:00
Fritz Koenig
0a29bd9793 postproc : Fix display of motion vectors.
Split motion vectors were all being treated as 4x4
blocks.  Now correctly handle 16x8, 8x16, 8x8, 4x4
blocks.

Change-Id: Icf345c5e69b5e374e12456877ed7c41213ad88cc
2010-11-02 13:29:13 -07:00
Fritz Koenig
9f61a83bf9 postproc : Added SPLITMV visualization, fix line constrain.
Now draw 16 vectors for SPLITMV mode.

Fixed constrain line to block divide by zero issues.

Blend block was not centering the shaded area correctly.

Change-Id: I1edabd8b4e553aac8d980f7b45c80159e9202434
2010-11-01 13:27:13 -07:00
Timothy B. Terriberry
c4d7e5e67e Eliminate more warnings.
This eliminates a large set of warnings exposed by the Mozilla build
 system (Use of C++ comments in ISO C90 source, commas at the end of
 enum lists, a couple incomplete initializers, and signed/unsigned
 comparisons).
It also eliminates many (but not all) of the warnings expose by newer
 GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
 without checking the return values).
There are a few spurious warnings left on my system:

../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
 uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
 change between the two if blocks that test it here.

../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
 expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
 expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
 standard does not mandate that enums be unsigned, so the checks can't be
 removed.

Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395
2010-10-27 18:08:04 -07:00
Fritz Koenig
a097e18964 postproc: Tweaks to line drawing and blending.
Turned down the blending level to make colored blocks obscure
the video less.
Not blending the entire block to give distinction to macro
block edges.
Added configuration so that macro block blending function can
be optimized.
Change to constrain line as to when dx and dy are computed.
Now draw two lines to form an arrow.

Change-Id: Id3ef0fdeeab2949a6664b2c63e2a3e1a89503f6c
2010-10-27 13:20:03 -07:00
Johann
787733d855 Merge "RTCD build is bringing old errors to light" 2010-10-27 09:59:01 -07:00
Fritz Koenig
cf127474d8 vpxdec : Change --pp-debug-info to be a bit field.
This allows multiple post processor debug levels to be overlayed.
i.e. can show colored reference blocks and visual motion vectors.

Change-Id: Ic4a1df438445b9f5780fe73adb3126e803472e53
2010-10-27 09:53:37 -07:00
Fritz Koenig
36ff6a6743 Merge "postproc: Add mode and refrence frame visualizers." 2010-10-27 09:04:39 -07:00
Johann
abcf36c758 RTCD build is bringing old errors to light
needs to be _recon_ not _recon_recon_

Change-Id: I7a8b9ddcb4fb72c2b723c563932c9ea52ff15982
2010-10-27 10:47:48 -04:00
Fritz Koenig
a0ccc97d8a postproc: Add mode and refrence frame visualizers.
Post process option to color the block for either the mode
of the macro block, or the frame that the macro block references.

Change-Id: Ie498175497f2d20e3319924d352dc4ddc16f4134
2010-10-26 16:00:14 -07:00
John Koleszar
d6c67f02c9 make vp8_recon16x16mb{,y} RTCD functions
ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.

Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
2010-10-26 13:23:36 -04:00
John Koleszar
19638c2309 arm: move unrolled loops back to generic code
Some of the ARM functions differed from their generic counterparts
only by unrolling their loops. Since this change may be useful
on other platforms, or might even supercede the looped version
in the generic case, move it back to the generic file.

This code is left under #if ARCH_ARM for now, but it may be worth
considering a different (possibly new) conditional for these. If
it turns out that this should be runtime selectable, these
functions will have to move to the RTCD infrastructure. Don't want
to take that step at this time without more profile data.

Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f
2010-10-26 09:51:35 -04:00
John Koleszar
d330a5876b arm: remove duplicate functions
These functions were true duplicates of functions present in the
generic code. This fixes some of the link errors when building
with --enable-shared --enable-pic.

Change-Id: Idff26599d510d954e439207883607ad6b74df20c
2010-10-26 09:37:44 -04:00
Fritz Koenig
1d70aaf08b Merge "Debug option for drawing motion vectors." 2010-10-25 15:40:22 -07:00
Fritz Koenig
d1a4cce809 Debug option for drawing motion vectors.
Postproc level that uses Bresenham's line algorithm
to draw motion vectors onto the postproc buffer.

Change-Id: I34c7daa324f2bdfee71e84fcb1c50b90fa06f6fb
2010-10-25 15:39:04 -07:00
Johann
1376f061da reuse common loopfilter code
there were four versions for the regular and
macroblock loopfilters:
horizontal [y|uv]
vertical [y|uv]

this moves all the common code into 2 functions:
vp8_loop_filter_neon
vp8_mbloop_filter_neon

this provides no gain in performance. there's a bit
of jitter, but it trends down ~0.25-0.5%. however,
this is a huge gain maintenance. also, there is the
potential to drop some stack usage in the macroblock
loopfilter.

Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92
2010-10-25 09:48:50 -04:00
Timothy B. Terriberry
b71962fdc9 Add runtime CPU detection support for ARM.
The primary goal is to allow a binary to be built which supports
 NEON, but can fall back to non-NEON routines, since some Android
 devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
 Tegra).
The configure-generated flags HAVE_ARMV7, etc., are used to decide
 which versions of each function to build, and when
 CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
 at run time.
In order for this to work, the CFLAGS must be set to something
 appropriate (e.g., without -mfpu=neon for ARMv7, and with
 appropriate -march and -mcpu for even earlier configurations), or
 the native C code will not be able to run.
The ASFLAGS must remain set for the most advanced instruction set
 required at build time, since the ARM assembler will refuse to emit
 them otherwise.
I have not attempted to make any changes to configure to do this
 automatically.
Doing so will probably require the addition of new configure options.

Many of the hooks for RTCD on ARM were already there, but a lot of
 the code had bit-rotted, and a good deal of the ARM-specific code
 is not integrated into the RTCD structs at all.
I did not try to resolve the latter, merely to add the minimal amount
 of protection around them to allow RTCD to work.
Those functions that were called based on an ifdef at the calling
 site were expanded to check the RTCD flags at that site, but they
 should be added to an RTCD struct somewhere in the future.
The functions invoked with global function pointers still are, but
 these should be moved into an RTCD struct for thread safety (I
 believe every platform currently supported has atomic pointer
 stores, but this is not guaranteed).

The encoder's boolhuff functions did not even have _c and armv7
 suffixes, and the correct version was resolved at link time.
The token packing functions did have appropriate suffixes, but the
 version was selected with a define, with no associated RTCD struct.
However, for both of these, the only armv7 instruction they actually
 used was rbit, and this was completely superfluous, so I reworked
 them to avoid it.
The only non-ARMv4 instruction remaining in them is clz, which is
 ARMv5 (not even ARMv5TE is required).
Considering that there are no ARM-specific configs which are not at
 least ARMv5TE, I did not try to detect these at runtime, and simply
 enable them for ARMv5 and above.

Finally, the NEON register saving code was completely non-reentrant,
 since it saved the registers to a global, static variable.
I moved the storage for this onto the stack.
A single binary built with this code was tested on an ARM11 (ARMv6)
 and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder,
 and produced identical output, while using the correct accelerated
 functions on each.
I did not test on any earlier processors.

Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa
2010-10-25 09:23:29 -04:00
Timothy B. Terriberry
8f75ea6b5c Convert [4][4] matrices to [16] arrays.
Most of the code that actually uses these matrices indexes them as
 if they were a single contiguous array, and coverity produces
 reports about the resulting accesses that overflow the static
 bounds of the first row.
This is perfectly legal in C, but converting them to actual [16]
 arrays should eliminate the report, and removes a good deal of
 extraneous indexing and address operators from the code.

Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23
2010-10-21 17:04:30 -07:00
Yaowu Xu
fc2f8dafaf Merge "fixed a typo that mis-used Y plane stride for UV blocks." 2010-10-19 16:23:31 -07:00
Yunqing Wang
7804befb55 Fix one gcc compiler warning
../libvpx/vp8/encoder/bitstream.c: In function ‘pack_inter_mode_mvs’:
../libvpx/vp8/encoder/bitstream.c:1026: warning: array subscript has type ‘char’

Change-Id: Ic77491e0a172fa1821e5b3e914d0dc41fe87c00f
2010-10-14 15:15:35 -04:00
Jan Kratochvil
1fc294116a nasm: movhps compatibility QWORD->MMWORD
Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208

nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes

Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.

Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc
2010-10-04 20:47:19 -04:00
Jan Kratochvil
5cdc3a4c29 nasm: address labels 'rel label' vice 'wrt rip'
nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
2010-10-04 19:47:54 -04:00
Jan Kratochvil
e114f699f6 nasm: match instruction length (movd/movq) to parameters
nasm requires the instruction length (movd/movq) to match to its
parameters. I find it more clear to really use 64bit instructions when
we use 64bit registers in the assembly.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91
2010-10-04 23:36:29 +02:00