some build systems have trouble with duplicate basenames.
vpx_dsp/skin_detection.[hc] were added in:
658e85425 Merge skin detection code in vp8/9.
BUG=webm:1438
Change-Id: Ieaa70b40bda409ec23e6d179b47a930ac6243b05
Advise the compiler that the store is eventually going to a uint8_t
buffer. This helps avoid getting alignment hints which would cause the
memory access to fail.
Originally added as a workaround for clang:
https://bugs.llvm.org//show_bug.cgi?id=24421
Change-Id: Ie9854b777cfb2f4baaee66764f0e51dcb094d51e
This reverts commit 0d88e15454.
Reason for revert: chromium builds are failing to locate vpx_rv during dlopen()
dlopen failed: cannot locate symbol "vpx_rv" referenced by "libstandalonelibwebviewchromium.so"
Original change's description:
> Add visibility="protected" attribute for global variables referenced in asm files.
>
> During aosp builds with binutils-2.27, we're seeing linker error
> messages of this form:
> libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
> symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
> object
>
> subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
> Other messages refer to symbol references from deblock_sse2.o and
> subpixel_sse2.o, also assembled from asm files.
>
> This change marks such symbols as having "protected" visibility. This
> satisfies the linker as the symbols are not preemptible from outside
> the shared library now, which I think is the original intent anyway.
>
> Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
>
TBR=jzern@google.com,johannkoenig@google.com,rahulchaudhry@chromium.org,builds@webmproject.org
Change-Id: I0c2ea375aa7ef5fda15b9d9e23e654bb315c941b
During aosp builds with binutils-2.27, we're seeing linker error
messages of this form:
libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
object
subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
Other messages refer to symbol references from deblock_sse2.o and
subpixel_sse2.o, also assembled from asm files.
This change marks such symbols as having "protected" visibility. This
satisfies the linker as the symbols are not preemptible from outside
the shared library now, which I think is the original intent anyway.
Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
Add ppc, ppc64 and ppc64le on all_platforms and ARCH_LIST
Add VSX flags and check for -mvsx
Define empty setup_rtcd_internal
Add Altivec detection based on:
http://freevec.org/function/altivec_runtime_detection_linux
Detect VSX at runtime when enabled
Change-Id: I304f4d8c5fee0ff19b6483cd2e9cc50d6ddec472
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
To avoid decode performance hit of 2% when running on hyperthreaded
cores.
This patch only uses the mutex's when we are running tsan.
This is safe because 32 bit operations like read and store are atomic
on all the platforms we care about. Tsan warns about race situations,
but in this case either situation ( read occurs before write or write
before read) the worst case is that we go around one extra time in the
loop. So the ordering doesn't really matter.
That said a few other things have been tried :
for instance as per here:
webrtc/base/atomicops.h#52
In this patch they use:
__atomic_load_n(i, __ATOMIC_ACQUIRE);
__atomic_store_n(i, value, __ATOMIC_RELEASE);
This code works on gcc, clang ( replacing protected write and read), and
avoids tsan errors. Incurring no penalty in performance. In C11 its
replaced by straight atomic operands.
However there is no equivalent in the visual studio's we support as
int32 on all windows platforms is already atomic. To avoid tsan like
warnings on windows we'd need to use interlocked exchange and the
end result doesn't gain us any thing.
Change-Id: I2066e3c7f42641ebb23d53feb1f16f23f85bcf59
Reapply this patch:
ff0107f Amend and improve VP8 multithreading implementation
Amended the patch to add a unit test, and fix an asan error.
BUG=webm:851
Change-Id: I6572c03256169c64e80248bf5a5e99f59a2fc93c
usage of the vp8 versions was removed in:
3f72509 vp8: remove VP8_SET_DBG* control support
vp9 had the usage stripped even earlier.
Change-Id: I978142eb6492552cd29c9c6feb1e89acfc5f7b84
This uses the same sdx4df pointers as vp8_diamond_search_sadx4 and
should therefore target the same optimizations.
See e4ddf9db6a
Change-Id: Ic298e9b25c34bbe6b7a0799509355b0addb56675
Control already exists for vp9, adding it to vp8.
Usage is only when error_resilient is off.
Added a datarate unittest for non-zero boost.
Change-Id: I4296055ebe2f4f048e8210f344531f6486ac9e35
vp8_short_inv_walsh4x4_msa - Optimized to process in short vector type
Updated below functions to store exact number of bytes in output rather than complete vector
idct4x4_addblk_msa
idct4x4_addconst_msa
dequant_idct4x4_addblk_msa
dequant_idct4x4_addblk_2x_msa
dequant_idct_addconst_2x_msa
Change-Id: Ic1b3752e2421dc7d70a082dcdaab9d140d7e5d9c
The original commit never set any 'specialize' line:
61311e6103
It appears the sadx4 version of function uses sdx4df calls to speed up
the search. There are no sse3 versions of the sdx4df functions, but
there are sse2 and msa versions.
There is a neon version of vpx_sad16x16x4d but not any of the smaller
versions. Perhaps if they existed this function could be expanded to use
them.
Change-Id: I936d7d6b1a3ff6dcd5a4d2322272708c47cdec13
The value 35468 changes sign when stored in int16_t:
implicit conversion from 'int' to 'int16_t' (aka 'short')
changes value from 35468 to -30068
This negation requires adding back the original value to compensate.
Shifting the value keeps the value positive and saves a post-vqdmulh
shift.
This technique is used in webp and idct_dequant_full_2x_neon
BUG=b/28027557
Change-Id: I0c5ce09bea170fe08061856c2af6f841a557e0c3
This restores d9dce2f48e
Switched to using signed shift-and-narrow. Instead of saturating
negative results to 0, it was saturating them to 255.
BUG=webm:817
BUG=webm:1273
Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49
When filtering it needs 6 pixels: 2 prior to the source, the source, and
3 after the source.
When filtering 16 wide, that means 21. To accomplish this the SSE2 reads
[-2] to [5], [6] to [13], and [14] to [21], a total of 24 bytes (reading
in groups of 8 is easy)
The filter then shifts this last set to the top half of the register and
uses 'or' to combine it with the previous set.
Valgrind detected an issue reading pixels [19], [20] and [21]:
Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd
Note: we only need pixels [16], [17], and [18] as context for [15].
To fix this, it now reads 8 bytes starting at [11], which re-loads [11]
through [13], but stops at [18] and does not over-read any values.
This is shifted by 5 and 'or'd with xmm1. Although the lower bits are
not cleared, they overlap directly with [11] through [13], so 'or'
produces the correct results.
Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
the --enable-postproc-visualizer configure option remains as a no-op as
do the control names and values for compatibility
+ remove the corresponding debug flags from vpxdec: --pp-*
Change-Id: I4a001cd9962b59560d7d6bda6272d4ff32b8d37c
similar to changes that were done in vp9 for encoded frame size
reporting. has the side-effect of quieting a -Wshorten-64-to-32 warning.
Change-Id: I89f74cb617fc29334ee351dc8dfaa3b8cfd4e5af
The code only has issues when xoffset == 0 and yoffset == 0 which
represents a simple copy. Presumably this case does not need to be
handled because the issue has existed since 2010.
BUG=webm:1287
Change-Id: Ic47e2653f3b729e99b40e53d8d2d8d1501edaaa9