adds a compile time option: --enable-arm-asm-detok which pulls in
vp8/decoder/arm/detokenize.asm
currently about break even speed wise, but changes are pending to
the fill code (branch and load 3 bytes versus conditionally always
load one) and the error handling. Currently it doesn't handle zero
runs or overrunning the buffer.
this is really just so i don't have to rebase my changes all the
time to run benchmarks - now just need to replace one file!
Change-Id: I56d0e2354dc0ca3811bffd0e88fe1f952fa6c797
This is the first modification of VP8 multi-thread decoder, which uses
same threads to decode macroblocks and then do loopfiltering for each
frame.
Inspired by Rob Clark, synchronization was done on every 8 macroblocks
instead of every macroblock to reduce lock contention.
Comparing with the original code, this implementation gave about 15%-
20% performance gain while decoding my test clips on a Core2 Quad
platform (Linux).
The work is not done yet.
Test on other platforms are needed.
Change-Id: Ice9ddb0b511af1359b9f71e65066143c04fef3b5
At the end of the decode, frame buffers were being copied.
The frames are not updated after the copy, they are just
for reference on later frames. This change allows multiple
references to the same frame buffer instead of copying it.
Changes needed to be made to the encoder to handle this. The
encoder is still doing frame buffer copies in similar places
where pointer reference could be done.
Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66
Change I9fd1a5a4 updated the multithreaded loopfilter to avoid
reinitializing several parameteres if they haven't changed from the
last frame, but the code to update the last frame's parameters wasn't
invoked in the multithreaded case.
Change-Id: Ia23d937af625c01dd739608e02d110f742b7e1f2
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.
Change-Id: I236c05fade06589e417179c0444cb39b09e4200d