This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions in new file hevc_idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Add functions needed for the implementation of the fixed-point AAC decoder.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions.
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions (qpel as well as epel) in new file hevc_mc_uniw_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions (qpel as well as epel) in new file hevc_mc_bi_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Adds HEVC specific macros (needed for this patch) in libavcodec/mips/hevc_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7d07ee5a9bd170a06d26fd967cf8de5d3b1ce331':
ppc: cpu: Add support for VSX and POWER8 extensions
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Names of functions vector_fmul_window_fixed_c and
vector_fmul_window_fixed_scaled_c are changed by removing "_fixed"
from the name since it is redundant.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was probably broken some time ago. The breakage is now part of the
ABI. For example, we have:
AV_PIX_FMT_XYZ12BE
AV_PIX_FMT_NV16
AV_PIX_FMT_NV20LE
AV_PIX_FMT_NV20LE is wrong. It has the value 113, but as little-endian
format it should be even. This must have been quite obvious when these
formats were added (because of the AV_PIX_FMT_XYZ12BE entry), but
nobody cared or knew about this.
The future libavutil major bump will break this further,
because disabling FF_API_VDPAU will remove an odd number of entries from
the middle of the enum.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
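The parity problem described above can be reproduced with a toy enum. This is only a sketch: the SKETCH_* names are hypothetical and are not the real AVPixelFormat entries. It shows how a single entry without an endian variant, placed between LE/BE pairs, moves a later little-endian format onto an odd value.

```c
/* Hypothetical formats for illustration, not the real AVPixelFormat list. */
enum SketchPixFmt {
    SKETCH_FMT_A_LE,   /* 0 */
    SKETCH_FMT_A_BE,   /* 1 */
    SKETCH_FMT_B,      /* 2: no endian variant, shifts every later entry by one */
    SKETCH_FMT_C_LE,   /* 3: little-endian format that lands on an odd value */
    SKETCH_FMT_C_BE    /* 4 */
};

/* Returns 1 if a little-endian entry sits on an even value, which is what
 * the parity convention described in the commit message expects. */
static int sketch_le_parity_ok(enum SketchPixFmt le_fmt)
{
    return ((int)le_fmt % 2) == 0;
}
```

Removing an odd number of entries from the middle of such an enum flips the parity of everything after the removal point in the same way.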
This patch moves HEVC code of uni mc cases to new file hevc_mc_uni_msa.c.
(There are 5 sub-modules of HEVC mc functions in total. Adding all of them to one single file would make it huge (~750k) and difficult to maintain, so the code is split across multiple files.)
This patch also adds new HEVC header file libavcodec/mips/hevc_macros_msa.h
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Silences warning(s) like:
libavcodec/x86/fft.asm:93: warning: section flags ignored on
section redeclaration
The cause of this warning is that `struc` and `endstruc`
attempt to revert to the previous section state [1].
The section state is stored in the macro __SECT__, defined by
x86inc.asm to be `.note.GNU-stack ...`, through the `SECTION`
directive [2].
Thus, the `.note.GNU-stack` section is defined twice
(once in x86inc.asm, once during `endstruc`), causing the warning.
That is the first part of the commit: using the primitive `[section]` format
for .note.GNU-stack etc., which does not update `__SECT__` [2].
That fixes only half of the problem. Even without any `SECTION` directives,
`__SECT__` is predefined as `.text`, which conflicts with the later
`SECTION_TEXT` (which expands to `.text align=16`).
[1]: http://www.nasm.us/doc/nasmdoc6.html#section-6.4
[2]: http://www.nasm.us/doc/nasmdoc6.html#section-6.3
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This patch restructures existing macros and adds more generic macros.
This change was necessary to avoid repeated review comments on the remaining patches which we were about to submit.
This patch also reduces the number of code lines through maximum use of generic macros, and allows better code alignment and readability.
These modifications to the commonly used libavutil/mips/generic_macros_msa.h impact the already accepted code, hence it is re-submitted in 2/4, 3/4 & 4/4.
Overall, this patch set just upgrades the code with styling changes and brings it in sync with the latest MIPS-SIMD optimized codebase at our end.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This commit silences warning(s) like:
libavcodec/x86/fft.asm:93: warning: section flags ignored on section
redeclaration
The cause of this warning is that `struc` and `endstruc` attempt to
revert to the previous section state [1]. The section state is stored in the
macro __SECT__, defined by x86inc.asm to be `.note.GNU-stack ...`, through the
`SECTION` directive [2]. Thus, the `.note.GNU-stack` section is defined twice
(once in x86inc.asm, once during `endstruc`), causing the warning.
That is the first part of the commit: using the primitive `[section]` format
for .note.GNU-stack etc., which does not update `__SECT__` [2].
That fixes only half of the problem. Even without any `SECTION` directives,
`__SECT__` is predefined as `.text`, which conflicts with the later
`SECTION_TEXT` (which expands to `.text align=16`).
[1]: http://www.nasm.us/doc/nasmdoc6.html#section-6.4
[2]: http://www.nasm.us/doc/nasmdoc6.html#section-6.3
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This function allows writing AVRationals as IEEE floats without the need
for platform-dependent float operations.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
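As an illustration of the idea, a rational can be turned into an IEEE-754 single-precision bit pattern using integer arithmetic only. The helper below is a minimal sketch under stated assumptions: sketch_q_to_float_bits() is a hypothetical name, it takes the numerator and denominator as plain integers, and it truncates the mantissa instead of rounding. It is not the libavutil implementation.

```c
#include <stdint.h>

/* Hypothetical helper: returns the IEEE-754 binary32 bit pattern of
 * num/den without using any floating-point operation.
 * The mantissa is truncated, not rounded. */
static uint32_t sketch_q_to_float_bits(int32_t num, int32_t den)
{
    uint32_t sign = 0;
    int64_t n = num, d = den;
    uint64_t un, ud;
    int exp = 0;

    if (d < 0) { d = -d; n = -n; }
    if (n < 0) { n = -n; sign = 0x80000000u; }

    if (n == 0 && d == 0) return 0x7FC00000u;        /* 0/0 -> NaN      */
    if (n == 0)           return sign;               /* signed zero     */
    if (d == 0)           return sign | 0x7F800000u; /* x/0 -> infinity */

    un = (uint64_t)n;
    ud = (uint64_t)d;

    /* Scale until the integer quotient lies in [2^23, 2^24). */
    while (un / ud >= (1u << 24)) { ud <<= 1; exp++; }
    while (un / ud <  (1u << 23)) { un <<= 1; exp--; }

    /* value ~= mant * 2^exp with mant in [2^23, 2^24), so the biased
     * exponent is exp + 23 + 127 and the leading mantissa bit is implicit. */
    return sign | ((uint32_t)(exp + 150) << 23)
                | ((uint32_t)(un / ud) & 0x7FFFFFu);
}
```

With 32-bit inputs the resulting exponent always stays in the normal range, so the sketch does not need to handle denormals or overflow.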
Normalization (av_normalize_sf) of the output was recently added to av_add_sf.
This normalization gives better precision for small values. The purpose of
this (quite simple) test case is to test the difference between double and
softfloat.
The values used are tailored to maximally highlight the problem with precision
when normalization is not used.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
av_image_fill_pointers always aligns the palette, but the padding
bytes don't (and can't) get initialized in av_image_copy.
Thus initialize them in av_image_alloc.
This fixes 'Syscall param write(buf) points to uninitialised byte(s)'
valgrind warnings.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
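The fix can be pictured with a small stand-alone allocator. This is a sketch only: sketch_alloc_with_palette() and its parameters are hypothetical, the palette size of 1024 bytes is an assumption, and the real av_image_alloc works on plane pointers and linesizes rather than a single byte count.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PALETTE_BYTES 1024 /* assumed palette size for the sketch */

/* Hypothetical helper: allocates pixel data followed by an aligned palette
 * and zeroes the padding bytes between the two, so that writing the whole
 * buffer out never touches uninitialised memory. */
static uint8_t *sketch_alloc_with_palette(size_t pixel_bytes, size_t align,
                                          size_t *palette_offset)
{
    size_t offset;
    uint8_t *buf;

    /* The rounding below only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);

    offset = (pixel_bytes + align - 1) & ~(align - 1);
    buf    = malloc(offset + SKETCH_PALETTE_BYTES);
    if (!buf)
        return NULL;

    /* Nothing copies into the gap later, so initialise it here. */
    memset(buf + pixel_bytes, 0, offset - pixel_bytes);

    *palette_offset = offset;
    return buf;
}
```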
This will normalize sums for which mantissa is smaller than the lower boundary
(needed for implementation of fixed point aac decoder).
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
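A rough sketch of what normalizing a small mantissa means is given below. The type and function names are hypothetical, the value is assumed to be mant * 2^exp, and the lower boundary of 2^29 is an assumption made for the sketch; libavutil's SoftFloat may use a different exponent convention.

```c
#include <stdint.h>

#define SKETCH_ONE_BITS 29 /* assumed lower boundary: |mant| >= 2^29 */

/* Hypothetical soft float: value = mant * 2^exp. */
typedef struct SketchSoftFloat {
    int32_t mant;
    int32_t exp;
} SketchSoftFloat;

/* Scales the mantissa up until it reaches the lower boundary and lowers
 * the exponent by the same amount, so the represented value is unchanged
 * while the mantissa keeps as many significant bits as possible. */
static SketchSoftFloat sketch_normalize(SketchSoftFloat a)
{
    if (a.mant == 0) {
        a.exp = 0;
        return a;
    }
    while (a.mant <  (1 << SKETCH_ONE_BITS) &&
           a.mant > -(1 << SKETCH_ONE_BITS)) {
        a.mant *= 2; /* cannot overflow: |mant| < 2^29 before doubling */
        a.exp--;
    }
    return a;
}
```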
Adjust exponent usage and calculation in softfloat to the format used in
the implementation of the fixed-point AAC decoder.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
AVOpenCLDeviceNode and AVOpenCLPlatformNode used fixed-size static buffers for holding the device and platform names.
This patch modifies these structures to use pointers instead. The memory required to hold the names is
now dynamically allocated, with the size determined by querying the appropriate OpenCL runtime APIs.
Signed-off-by: Maneesh Gupta <maneesh.gupta@amd.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'eaa2d123f0a643664721593d248ece6bcd85f1e6':
log: Print a full backtrace along with error messages under Valgrind
Conflicts:
libavutil/log.c
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Useful to understand where and in what execution state a certain message
is generated. It is enabled only when optimizations are disabled, since
function names are not printed otherwise.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Softfloat will be used in the implementation of the AAC fixed-point decoder.
This change is needed in order to more easily integrate FFmpeg's softfloat into
the already developed algorithm for AAC.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '108f2f381acb93827fb4add0517eeae859afa3bf':
parseutils: Extend small_strptime to be used in avformat
Conflicts:
libavutil/parseutils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The new information is printed at verbose log level and can thus be switched on and off
through the log level.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '592a04054e6423be5050efd2bceece48b10b9c1d':
pixdesc: Replace a few leftover instances of non AV-prefixed flags
Conflicts:
libavutil/pixdesc.c
See: c7c71f95f8
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e4fe535d12f4f30df2dd672e30304af112a5a827':
mov: Write the display matrix in order
Conflicts:
libavformat/mov.c
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This allows copying the matrix as is, and it is just cleaner to keep
the matrix in the same order specified by the mov standard (which is
also explicitly described in the documentation).
In order to preserve compatibility, flip the angle sign in the display API
av_display_rotation_set() and av_display_rotation_get(), and improve the
documentation mentioning the rotation direction.
Commit dfa9208074 ("mips/float_dsp: fix a bug in vector_fmul_window_mips")
fixed vector_fmul_window_mips by unrolling the loop only 4 times, but also
removed the outer C loop and replaced it with assembly branches and pointer
arithmetic. When submitting my 64-bit porting patch I missed this new
assembly which also needed porting.
This patch fixes a bus error in the fate-float-dsp test when run on 64-bit
mips.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Unfortunately, Android < API 21 (Lollipop) doesn't have the sgidefs.h header.
The easiest way around this is to just use the preprocessor definitions from
gcc / clang.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73':
arm: Suppress tags about used cpu arch and extensions
Merged-by: Michael Niedermayer <michaelni@gmx.at>
When all the codepaths using manually set .arch/.fpu code are
behind runtime detection, the elf attributes should be suppressed.
This allows tools to know that the final built binary doesn't
strictly require these extensions.
Signed-off-by: Martin Storsjö <martin@martin.st>
When OpenCL kernels are compiled, the is_compiled flag is set for each
kernel. But in opencl uninit, this flag is not cleared.
This causes an error when an OpenCL kernel is tried on different OpenCL
devices on the same platform.
This patch fixes that.
Reviewed-by: Wei Gao <highgod0401@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This mainly consists of replacing all the pointer arithmetic 'addiu'
instructions with PTR_ADDIU, which handles the differences in pointer
sizes when compiled on 64-bit mips systems.
The header asmdefs.h contains the PTR_ macros, which expand to the correct mips
instructions to manipulate registers containing pointers.
Signed-off-by: James Cowgill <james410@cowgill.org.uk>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
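The mechanism can be sketched as a macro that expands to an instruction mnemonic and is concatenated into the inline assembly template. The SKETCH_ names below are hypothetical, and the selection on pointer width via UINTPTR_MAX is a portable stand-in chosen for this sketch; it is not claimed to be how asmdefs.h makes the decision.

```c
#include <stdint.h>

/* Pick the add-immediate mnemonic that matches the pointer width.
 * Assumption for this sketch: pointer width is detected via UINTPTR_MAX. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#  define SKETCH_PTR_ADDIU "daddiu "
#else
#  define SKETCH_PTR_ADDIU "addiu  "
#endif

/* String literal concatenation builds the final instruction text, so the
 * same source line advances a pointer register correctly on both widths. */
static const char sketch_advance_src[] =
    SKETCH_PTR_ADDIU "%[src], %[src], 16 \n\t";
```

The string is only built here, not executed, so the sketch compiles on non-mips hosts as well.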
The loop was unrolled eight times, although the header only assumes
that len is a multiple of 4.
This is fixed, and the assembly code is rewritten to be more optimal and
to simplify the clobber list.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
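In plain C, unrolling that matches a "len is a multiple of 4" contract looks roughly like the sketch below. The function name and the choice of an element-wise multiply are assumptions for illustration; the commit itself rewrites mips assembly and does not name the operation here.

```c
#include <assert.h>

/* Hypothetical element-wise multiply, unrolled four times so that it only
 * relies on len being a multiple of 4. Unrolling eight times would read
 * and write past the end of the arrays for len = 4, 12, 20, ... */
static void sketch_vector_fmul(float *dst, const float *src0,
                               const float *src1, int len)
{
    int i;

    assert(len >= 0 && len % 4 == 0);

    for (i = 0; i < len; i += 4) {
        dst[i + 0] = src0[i + 0] * src1[i + 0];
        dst[i + 1] = src0[i + 1] * src1[i + 1];
        dst[i + 2] = src0[i + 2] * src1[i + 2];
        dst[i + 3] = src0[i + 3] * src1[i + 3];
    }
}
```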
These could trigger assert failures previously.
Found-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Add ARM code implementing av_clip_intp2 using the ssat instruction.
On Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and
the generic av_clip() (about -19%).
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
There already is a function, av_clip_uintp2(), that clips a signed integer
to an unsigned power-of-two range, i.e. [0, 2^p-1].
This patch adds a function av_clip_intp2() that clips a signed integer
to a signed power-of-two range, i.e. [-(2^p), 2^p-1].
The new function can be used as a special case of av_clip(), e.g.
av_clip(x, -8192, 8191) can be rewritten as av_clip_intp2(x, 13).
There are ARM instructions, usat and ssat respectively, which map nicely to
these functions (see next patch).
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
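The clipping range described above can be written out as a short reference sketch. sketch_clip_intp2() is a hypothetical name and uses plain comparisons for clarity; the function added to libavutil may be implemented differently, for example with bit tricks.

```c
#include <assert.h>

/* Hypothetical reference version: clips a to the signed power-of-two
 * range [-(2^p), 2^p - 1]. */
static int sketch_clip_intp2(int a, int p)
{
    int lo, hi;

    assert(p >= 0 && p <= 30); /* keeps 1 << p inside the int range */

    lo = -(1 << p);
    hi =  (1 << p) - 1;

    if (a < lo)
        return lo;
    if (a > hi)
        return hi;
    return a;
}
```

With p = 13 the range is [-8192, 8191], which matches the av_clip(x, -8192, 8191) example in the commit message.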
* commit '5b1d9ceec715846a58fe029bc3889ed6fa62436a':
pixfmt: add a pixel format for QSV hwaccel
Conflicts:
doc/APIchanges
libavutil/pixfmt.h
libavutil/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2
instructions did not issue warnings when used in SSE functions. Handle
it by also checking the register type when such instructions are used.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>