Loading pb_1 rather than pw_8192 was benchmarked to be more efficient.
Loading of the 2 yields no advantage. Loading of one saves ~11 cycles.
decicycles count:
put8: 3223(mmx) -> 2387
avg8: 2863(mmxext) -> 2125
put16: 4356(sse2) -> 3553
avg16: 4481(sse2) -> 3513
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This is a port of the inline assembly of the mmx version to use the
pavg(us|)b instruction.
8 16
mmx 1498 4355
mmx2 1242 3509
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Currently, only the mmx version is bitexact, the others (mmxext and
3dnow) are not, in spite of their naming.
Therefore, make their name more obvious. Also restore a comment that
was removed in 71155d7b.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '82dd1026cfc1d72b04019185bea4c1c9621ace3f':
x86: dsputil: Move hpeldsp-related declarations to a separate header
Conflicts:
libavcodec/x86/dsputil_x86.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '6655c933a887a2d20707fff657b614aa1d86a25b':
x86: dsputil: Move fpel declarations to a separate header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '831a1180785a786272cdcefb71566a770bfb879e':
Update dsputil- and SIMD-related comments to match reality more closely
Conflicts:
libavcodec/x86/hpeldsp.asm
libavutil/arm/float_dsp_init_arm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The mmxext optimizations should be at least equally fast if available and
amd3dnow optimizations are being deprecated. Thus the former should
override the latter, not the other way around.
* commit '6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0':
x86: avcodec: Use convenience macros to check for CPU flags
Conflicts:
libavcodec/x86/dsputil_init.c
libavcodec/x86/hpeldsp_init.c
libavcodec/x86/motion_est.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
Consistently use "cpu_flags" as variable/parameter name for CPU flags
Conflicts:
libavcodec/x86/dsputil_init.c
libavcodec/x86/h264dsp_init.c
libavcodec/x86/hpeldsp_init.c
libavcodec/x86/motion_est.c
libavcodec/x86/mpegvideo.c
libavcodec/x86/proresdsp_init.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9e5e76ef9ea803432ef2782a3f528c3f5bab621e':
x86: More specific ifdefs for dsputil/hpeldsp init functions
Conflicts:
libavcodec/x86/dsputil_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '932806232108872655556100011fe369125805d3':
x86: dsputil: Move avg_pixels16_mmx() out of rnd_template.c
x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9b3a04d30691e85b77e63f75f5f26a93c3a000cd':
x86: Move duplicated put_pixels{8|16}_mmx functions into their own file
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7f75f2f2bd692857c1c1ca7f414eb30ece3de93d':
ppc: Drop unnecessary ff_ name prefixes from static functions
x86: Drop unnecessary ff_ name prefixes from static functions
arm: Drop unnecessary ff_ name prefixes from static functions
Merged-by: Michael Niedermayer <michaelni@gmx.at>