Enable use_x86inc as a commandline option. Fix Bug with sse2 when
x86inc is disabled. Adds Sad asm protection to x86inc protection
Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6
Adds a few pattern searches to achieve various tradeoffs
between motion estimation complexity and performance.
The search framework is unified across these searches so that a
common pattern search function is used for all. Besides it will
be easier to experiment with various patterns or combinations
thereof at different scales in the future.
The new pattern search is multi-scale and is capable of using
different patterns at different scales.
The new hex search uses 8 points at the smallest scale
and 6 points at other scales.
Two other pattern searches - big-diamond and square are
also added. Big diamond uses 4 points at the smallest scale and
8 points in diamond shape at the larger scales.
Square is very similar conceptually to the default n-step search
but is somewhat faster since it keeps only one survivor across
all scales.
Psnr/speed-up results on derf300:
hex: -1.6% psnr%, 6-8% speed-up
big-diamond: -0.96% psnr, 4-5% speedup
square: -0.93% psnr, 4-5% speedup
Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2
There was no benefit having this function. For example, inside
read_switchable_filter_type switchable filter context was calculated twice.
Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1
This is in preparation for the SSE2 version of the high-precision
32x32 forward DCT which will share a lot of code with the existing
low precision version used for rate-distortion search.
Change-Id: I7084b6bdfb480b1fabb8493fb14e3f7fcc7888c0
Support enabling it or disabling it. Moved read out to configure.sh
so that its done once instead of in make and in config.
Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b
Adds a speed feature to skip all intra modes other than
DC_PRED if the source variance is small. This feature is
made part of speed 1 and up.
Results on derf300: psnr -0.07%, speedup about 1-2%
Also uses the source variance to fine-tune the early
termination criteria when FLAG_EARLY_TERMINATE is on.
This feature is made part of speed 2 and up.
Results on derf300: psnr -0.52%, speedup about 5-7%
Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232
Currently the only threaded option for vp9 decode. Enabled when the
decoder config thread count is > 1.
Change-Id: I082959abac9e31aa4a38ed9fd68b94680e57f4df
This changeset allows to remove vp9_switchable_interp and
vp9_switchable_interp_map arrays and make code much clear. Actually we
still have to use these mapping but only inside read_interp_filter_type and
write_interp_filter_type functions.
Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50
Chromium does not support 32bit builds for Mac which use x86inc.asm.
Make the files which include it work if 64bit or not PIC enabled
starting with vp9_copy_sse2.asm
Consolidate these targets in vp9_rtcd_defs.sh
Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248
This is an attempt at rewriting vp9_find_mv_refs_idx. I believe that it gains
about 1-2% decode speed
Change-Id: Ia5359c94ce9bb43b32652890e605e9a385485c1b