Used horizontal add instructions instead of adding
byte lanes. The encoder performance improved by
~4% for the test clip used.
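The idea of a horizontal (pairwise) add can be sketched in scalar C as
below. This is only an illustrative model, not the NEON code from this
patch: the helper names are hypothetical, and the commit does not state
which instructions were used. Each step sums adjacent lanes into lanes
twice as wide, in the spirit of pairwise-add instructions such as NEON
VPADDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pairwise widening add: 16 x u8 -> 8 x u16. Each output lane is the sum
 * of two adjacent input lanes, so the 8-bit range cannot overflow. */
static void pairwise_add_u8(const uint8_t in[16], uint16_t out[8]) {
  for (size_t i = 0; i < 8; ++i) {
    out[i] = (uint16_t)(in[2 * i] + in[2 * i + 1]);
  }
}

/* Pairwise widening add: 8 x u16 -> 4 x u32. */
static void pairwise_add_u16(const uint16_t in[8], uint32_t out[4]) {
  for (size_t i = 0; i < 4; ++i) {
    out[i] = (uint32_t)in[2 * i] + (uint32_t)in[2 * i + 1];
  }
}

/* Hypothetical helper: total of 16 byte lanes using two pairwise steps
 * followed by a final add of the four remaining lanes. */
static uint32_t horizontal_sum_u8x16(const uint8_t in[16]) {
  uint16_t s16[8];
  uint32_t s32[4];
  assert(in != NULL);
  pairwise_add_u8(in, s16);
  pairwise_add_u16(s16, s32);
  return s32[0] + s32[1] + s32[2] + s32[3];
}
```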
Change-Id: Iaddd10403fcffb5b3f53b1f591ab2fe0ff002c08
This patch is a cleanup following the commit "Save NEON registers
in VP8 NEON functions". The pushing/popping of callee-saved NEON
registers was moved into the individual NEON functions. Therefore,
we no longer need to save those registers at the beginning of the
codec. The related code was removed.
Change-Id: I5648166514fc9beffb780aa138495597731f49ea
Assembly implementation of ssse3 8x8 forward 2D-DCT. The current
version is turned on only for x86_64. The average unit runtime
goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster.
This translates into about 1.5% speed-up for pedestrian_area 1080p
at speed 2.
Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4
This reverts commit 59e733ca81.
Hold off removing arnr_type to give users the opportunity
to change their script files to handle its deprecation. A
follow-up patch will mark the control for setting arnr_type
as deprecated and it will be removed completely in a later
revision of the code.
Change-Id: I8b817c744e144d3714234a4cd4309816d0c7e3e8
Changes in this patch are only enabled if configured with
--enable-experimental --enable-vp9_high
Using an encoder command line argument of --input-shift=0 tells the
encoder to work with 16-bit frame buffers.
The output should be identical to before. Some features (such as input
image resizing) are not yet supported in 16-bit mode.
Specifically, the behavior of the input-shift parameter is as follows:
* No argument: Behavior as before, using 8-bit frame buffers.
* --experimental-bitstream --profile=2 --input-shift=0: Uses
  16-bit frame buffers to store 8-bit data; should give identical
  output to before.
* --experimental-bitstream --profile=2 --input-shift=2 --bit-depth=1: Uses
  16-bit frame buffers to store 10-bit data; encodes a version 2 stream
  with bit depth 10.
* --experimental-bitstream --profile=2 --input-shift=4 --bit-depth=2: Uses
  16-bit frame buffers to store 12-bit data; encodes a version 2 stream
  with bit depth 12.
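A minimal sketch of what the shift amounts to, assuming the input is
8-bit and that --input-shift left-shifts each sample while copying it
into a 16-bit buffer (the exact operation is not spelled out in this
message, and upshift_to_16bit is a hypothetical name, not a function
from this patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n 8-bit samples into a 16-bit buffer, left-shifting each by
 * `shift` bits. Under the assumption above, shift=0 keeps 8-bit values,
 * shift=2 spans a 10-bit range and shift=4 spans a 12-bit range. */
static void upshift_to_16bit(const uint8_t *src, uint16_t *dst, size_t n,
                             int shift) {
  assert(src != NULL && dst != NULL);
  assert(shift >= 0 && shift <= 8); /* result must still fit in 16 bits */
  for (size_t i = 0; i < n; ++i) {
    dst[i] = (uint16_t)((unsigned)src[i] << shift);
  }
}
```

With shift=0 the stored values equal the 8-bit input, which is consistent
with the expectation that the output is identical to before.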
The decoder has an --output-shift argument which should be used when
decoding profile 2 streams.
So far support for the following has been added:
Intra filtering
Deblocking
Motion compensation
Variance calculation
SAD calculation
Transform
Change-Id: If345c88234aafdd40caea0d88935b1f07aaebe22
Recent compilers can generate optimized code that uses NEON registers
for various operations besides floating-point operations. Therefore,
saving only the callee-saved registers d8 - d15 at the beginning of the
encoder/decoder is no longer enough. This patch adds register-saving
code to the VP8 NEON functions that use those registers.
Change-Id: Ie9e44f5188cf410990c8aaaac68faceee9dffd31
dist is currently broken in msvs due to a dependency on libs.mk, which in
turn depends on the rest of the source tree, not just the examples.
Change-Id: I3e313ceeae81eb29ef4bfb099d89756b43583eaa