Remove all the special-case 2-tap, 4-tap, 16x16, 8x8, etc. filters,
and instead just use one 2D 6-tap 4x4 filter.
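A single 2D 6-tap 4x4 filter can be sketched roughly as below. This is a minimal illustration, not the code from this patch: the function names are hypothetical, the coefficient table is the VP8 six-tap table indexed by the 1/8-pel fraction, and the caller is assumed to provide 2 samples of margin above and to the left of the block and 3 below and to the right.

```c
#include <stdint.h>

/* VP8 six-tap coefficients, indexed by the fractional position.
 * Every row sums to 128. */
static const int sixtap_taps[8][6] = {
    { 0,   0, 128,   0,   0, 0 },
    { 0,  -6, 123,  12,  -1, 0 },
    { 2, -11, 108,  36,  -8, 1 },
    { 0,  -9,  93,  50,  -6, 0 },
    { 3, -16,  77,  77, -16, 3 },
    { 0,  -6,  50,  93,  -9, 0 },
    { 1,  -8,  36, 108, -11, 2 },
    { 0,  -1,  12, 123,  -6, 0 }
};

/* Filter one sample. p points at the center sample (tap index 2);
 * step is 1 for a horizontal pass or the row pitch for a vertical one. */
static uint8_t filter6(const uint8_t *p, int step, const int *taps)
{
    int sum = 64; /* rounding term for the >> 7 below */
    int i;

    for (i = 0; i < 6; i++)
        sum += taps[i] * p[(i - 2) * step];
    if (sum < 0)
        return 0;
    sum >>= 7;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* Hypothetical entry point: predict one 4x4 block at fractional
 * offset (mx, my), each in 0..7. */
void sixtap_2d_4x4(const uint8_t *src, int src_stride,
                   uint8_t *dst, int dst_stride, int mx, int my)
{
    uint8_t tmp[9 * 4]; /* rows -2..+6 after the horizontal pass */
    int r, c;

    for (r = 0; r < 9; r++)
        for (c = 0; c < 4; c++)
            tmp[r * 4 + c] = filter6(src + (r - 2) * src_stride + c,
                                     1, sixtap_taps[mx]);

    for (r = 0; r < 4; r++)
        for (c = 0; c < 4; c++)
            dst[r * dst_stride + c] = filter6(tmp + (r + 2) * 4 + c,
                                              4, sixtap_taps[my]);
}
```

Larger blocks would then be handled by calling the same routine once per 4x4 sub-block, trading some speed for much less code.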
Change-Id: I9ec560fb5609d1a3160e9a3d8b396911073517a0
Use the simplified bool decoder from the bitstream guide, slightly
modified to prevent reading past the end of the buffer. Modified
the token decoder to use the normal bool decoder rather than
inlining its own.
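For reference, a bool decoder in the style of the bitstream guide, with a length check so it never reads past the end of the buffer, might look like the sketch below. The struct and function names are hypothetical, and treating missing bytes as zero bits is an assumption of this sketch.

```c
#include <stddef.h>

typedef struct {
    const unsigned char *input;     /* next unread byte */
    size_t               input_len; /* bytes remaining */
    unsigned int         value;     /* two-byte decoding window */
    unsigned int         range;     /* always in 128..255 between calls */
    int                  bit_count; /* shifts since the last byte load */
} bool_decoder;

void bool_init(bool_decoder *d, const unsigned char *data, size_t size)
{
    if (size >= 2) {
        d->value = ((unsigned int)data[0] << 8) | data[1];
        d->input = data + 2;
        d->input_len = size - 2;
    } else {
        /* Too short to prime the window: decode as all zeros. */
        d->value = 0;
        d->input = NULL;
        d->input_len = 0;
    }
    d->range = 255;
    d->bit_count = 0;
}

/* Decode one bool whose probability of being 0 is prob/256. */
int bool_get(bool_decoder *d, int prob)
{
    unsigned int split = 1 + (((d->range - 1) * (unsigned int)prob) >> 8);
    unsigned int big_split = split << 8;
    int bit;

    if (d->value >= big_split) {
        bit = 1;
        d->range -= split;
        d->value -= big_split;
    } else {
        bit = 0;
        d->range = split;
    }

    /* Renormalize one bit at a time, loading a byte every 8 shifts. */
    while (d->range < 128) {
        d->value <<= 1;
        d->range <<= 1;
        if (++d->bit_count == 8) {
            d->bit_count = 0;
            if (d->input_len) { /* the bounds check */
                d->value |= *d->input++;
                d->input_len--;
            }
        }
    }
    return bit;
}
```

A token decoder built on bool_get() stays simple because it keeps no decoder state of its own.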
Change-Id: Ic525e773e9f8331ba548a6505cc6d9e5372a5af0
Initial support for simple loopfilter and bilinear subpixel motion
interpolation. This adds support for VP8 profiles 1-3 to dixie.
Change-Id: I76d45cf9843f6f7473783b7932af94f033eb6e82
When reallocating the framebuffer storage, clear references to the
freed memory to prevent it from being accessed.
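The idea can be sketched as follows. The pool layout, names, and sizes here are hypothetical and only illustrate the ordering: drop every reference first, then free and reallocate.

```c
#include <stdlib.h>

#define NUM_BUFFERS 4
#define NUM_REFS    3 /* e.g. last, golden, altref */

typedef struct {
    unsigned char *storage[NUM_BUFFERS]; /* owned allocations */
    unsigned char *ref[NUM_REFS];        /* borrowed pointers into storage[] */
    size_t         size;                 /* bytes per buffer */
} frame_pool;

/* Returns 0 on success, -1 if an allocation fails. */
int frame_pool_resize(frame_pool *p, size_t new_size)
{
    int i;

    /* Clear the references before freeing so that no stale pointer
     * to the old storage can be dereferenced afterwards. */
    for (i = 0; i < NUM_REFS; i++)
        p->ref[i] = NULL;

    for (i = 0; i < NUM_BUFFERS; i++) {
        free(p->storage[i]);
        p->storage[i] = NULL;
    }
    p->size = 0;

    for (i = 0; i < NUM_BUFFERS; i++) {
        p->storage[i] = calloc(1, new_size);
        if (!p->storage[i])
            return -1;
    }
    p->size = new_size;
    return 0;
}
```

With the references cleared, any later use of a reference frame has to go through a NULL check instead of silently reading freed memory.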
Change-Id: Ib496b06be469328e8e269f905dc4c9cb6d453a27
The copy_{gf,arf} flags should copy the LAST buffer to the specified
buffer. The refresh_{gf,arf} flags should copy the CURRENT buffer to
the specified buffer.
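A minimal sketch of the intended semantics, using hypothetical types in which each reference is just a buffer index. It models only the two cases named above, and applying the copies before LAST itself is replaced is an assumption of this sketch.

```c
typedef struct {
    int last, golden, altref; /* buffer indices */
} ref_set;

typedef struct {
    int copy_gf, copy_arf;       /* copy LAST into golden / altref */
    int refresh_gf, refresh_arf; /* store CURRENT into golden / altref */
    int refresh_last;            /* store CURRENT into LAST */
} update_flags;

void update_references(ref_set *refs, int current, const update_flags *f)
{
    /* copy_* use the LAST buffer as it was before this frame. */
    if (f->copy_gf)
        refs->golden = refs->last;
    if (f->copy_arf)
        refs->altref = refs->last;

    /* refresh_* use the frame that was just decoded. */
    if (f->refresh_gf)
        refs->golden = current;
    if (f->refresh_arf)
        refs->altref = current;
    if (f->refresh_last)
        refs->last = current;
}
```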
Change-Id: I1fdf014c439b1ce584cda3d56841243fbfbb1f0a
This patch adds basic reconstruction for inter-predicted frames. It does
not properly handle clamping at the MV border.
Change-Id: Ib7e4395519aab0661a38f4e0f66972b5f08805cb
This is the more naive implementation as described in the bitstream guide,
rather than the masking version implemented in the reference code. However,
the core function prototypes were left as-is to make it easy to plug in the
reference assembly code.
Verified loopfiltered output matches reference decoder for 500 frames.
Change-Id: Ib4f197e864f07dbb918b6d5e742c6110d57c1f40
The "dixie" project will be a rewrite of much of the VP8 decoder core.
Some of the goals are:
* Increase speed by paying more attention to data locality and
cache layout, and by eliminating redundant work in general.
* A different approach to multithreading, to treat all threads as
equal and working on larger work units than a single MB.
* Expose more of the bitstream to the application, essentially
creating a vp8 parser utility. This could be useful for analyzing
the complexity of a stream, to help set conformance points.
* If the above goals are met successfully, replace the reference
decoder.
For those interested in the etymology of the term "dixie":
decoder2 -> dx2 -> dxii -> dixie
Change-Id: I4ef0832b62ea96e9cfa1906c4a77f4b51e0c62d6
1. Unavailability of each reference frame type should be tested
independently.
2. Also, only the VP8_GOLD_FLAG needs to be tested before setting
golden frame specific thresholds, and only VP8_ALT_FLAG needs
testing before setting thresholds relevant to the AltRef frame.
(Raised by gbvalor, in response to Issue 47)
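The two points above amount to testing each flag on its own, as in this sketch. The flag names, the threshold struct, and the use of INT_MAX to mean "never try this reference" are all hypothetical stand-ins, not the encoder's actual code.

```c
#include <limits.h>

enum {
    REF_LAST_FLAG = 1,
    REF_GOLD_FLAG = 2,
    REF_ALT_FLAG  = 4
};

typedef struct {
    int last, golden, altref;
} mode_thresholds;

void set_thresholds(int ref_frame_flags, int base, mode_thresholds *t)
{
    /* Each reference type is checked independently, and only against
     * its own flag. */
    t->last   = (ref_frame_flags & REF_LAST_FLAG) ? base : INT_MAX;
    t->golden = (ref_frame_flags & REF_GOLD_FLAG) ? base : INT_MAX;
    t->altref = (ref_frame_flags & REF_ALT_FLAG)  ? base : INT_MAX;
}
```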
Change-Id: I6a06fc2a6592841d85422bc1661e33349bb6c3b8
The intent is to reset the appropriate bit in ref_frame_flags, not
to test a logical condition. The prior code would always have set
ref_frame_flags to 0.
(Issue reported by dgohman, issue 47)
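The message does not show the code itself, but a result that is always 0 is what one would see if logical NOT (`!`) were applied to a nonzero flag where bitwise NOT (`~`) was intended. Assuming that is the class of bug, the difference looks like this (helper names are hypothetical):

```c
/* Clears only the bits of `flag`, leaving the other bits alone. */
int clear_flag_bitwise(int flags, int flag)
{
    return flags & ~flag;
}

/* Looks similar, but !flag is 0 for any nonzero flag, so the
 * result is always 0. */
int clear_flag_logical(int flags, int flag)
{
    return flags & !flag;
}
```

For example, with flags 7 and flag 2 the bitwise form leaves 5, while the logical form leaves 0.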
Change-Id: I2c12502ed74c73cf38e98c9680e0249c29e16433
The DOUBLE_DIVIDE_CHECK macro guards against division by 0,
so it must be applied to the denominator to work as intended.
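To illustrate, the sketch below uses a stand-in macro that nudges its argument away from zero. The macro body and function names are assumptions made for this example, not a quotation of the real definition.

```c
/* Stand-in for DOUBLE_DIVIDE_CHECK: move x slightly away from zero. */
#define DIVIDE_CHECK_SKETCH(x) ((x) < 0 ? (x) - 0.000001 : (x) + 0.000001)

/* Intended use: the guard wraps the denominator. */
double ratio_guarded(double num, double den)
{
    return num / DIVIDE_CHECK_SKETCH(den);
}

/* Misplaced use: guarding the numerator does nothing for den == 0. */
double ratio_misplaced(double num, double den)
{
    return DIVIDE_CHECK_SKETCH(num) / den;
}
```

With a zero denominator, ratio_guarded() returns a large finite value, whereas ratio_misplaced() still divides by zero.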
Change-Id: Ie109242d52dbb9a2c4bc1e11890fa51b5f87ffc7
If the version script produced by the libvpx build system is not
used when linking a shared library on x86-64 Linux, the constant
data in the subpel filters produces R_X86_64_32 relocation errors
due to the use of wrt rip addressing instead of
wrt rip wrt ..gotpcrel.
Instead of adding a new macro for this addressing mode, this patch
sets the ELF visibility of these symbols to "hidden", which
allows wrt rip addressing to work without a text relocation.
This allows building a shared library without using the provided
build system or a separate version script.
Fixes http://code.google.com/p/webm/issues/detail?id=46
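The patch itself changes assembler symbols, but the same visibility idea can be shown at the C level. The sketch below is an analog, not the code from this patch: it assumes a GCC-compatible compiler targeting ELF, and the macro, table, and function names are hypothetical.

```c
#if defined(__GNUC__)
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))
#else
#define SKETCH_HIDDEN /* no-op on other compilers */
#endif

/* Hidden symbols are not exported from the shared object, so the
 * linker can resolve references to them within the library itself
 * and PC-relative addressing works without going through the GOT. */
SKETCH_HIDDEN const short sketch_table[4] = { 1, 2, 3, 4 };

int sketch_table_sum(void)
{
    int i, sum = 0;

    for (i = 0; i < 4; i++)
        sum += sketch_table[i];
    return sum;
}
```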
Change-Id: Ie108f9d9a4352e5af46938bf4750d2302c1b2dc2
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.
Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
Change bitreading functions to use a larger window which is refilled less
often.
This makes it cheap enough to do bounds checking each time the window is
refilled, which avoids the need to copy the input into a large circular
buffer.
This uses less memory and speeds up the total decode time by 1.6% on an ARM11,
2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64.
Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving
the refill loop to the front of vp8dx_decode_bool().
However, having the refill loop between computation of the split values and
the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to
memory latency and code size: refilling after normalization duplicates the
code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases).
Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the
beginning of each decode step in vp8_decode_mb_tokens() means the latter
requires an extra refill at the end.
Platform-specific versions could avoid the problem, but would require most of
detokenize.c to be duplicated.
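The windowed refill with a bounds check can be sketched as below. To isolate the refill logic, this sketch reads plain bits rather than arithmetic-coded bools; the names, the 64-bit window size, and the zero padding past the end of the input are all assumptions of the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;   /* next unread input byte */
    const uint8_t *end;   /* one past the last input byte */
    uint64_t       value; /* unread bits, MSB-aligned */
    int            count; /* number of valid bits in value */
} bit_window;

void window_init(bit_window *w, const uint8_t *data, size_t size)
{
    w->buf = data;
    w->end = data + size;
    w->value = 0;
    w->count = 0;
}

/* Top up the window a byte at a time. The bounds check runs only
 * here, once per refill, rather than once per decoded bit. */
void window_refill(bit_window *w)
{
    while (w->count <= 56) {
        uint64_t byte = (w->buf < w->end) ? *w->buf++ : 0;

        w->value |= byte << (56 - w->count);
        w->count += 8;
    }
}

/* Read n bits, 1 <= n <= 32. */
uint32_t window_read(bit_window *w, int n)
{
    uint32_t v;

    if (w->count < n)
        window_refill(w);
    v = (uint32_t)(w->value >> (64 - n));
    w->value <<= n;
    w->count -= n;
    return v;
}
```

Because the input is read in place and checked at refill time, no copy into a separate padded buffer is needed.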
Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7
Added an SSE2 version of vp8_regular_quantize_b, which improved encode
performance (for the clip used) by ~10% for 32-bit builds and ~3% for
64-bit builds.
Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments.
Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af