Ganesh Ajjanagadde 971d12b7f9 avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm

This uses Stein's binary GCD algorithm:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm
to get a roughly 4x speedup over Euclidean GCD on standard architectures
with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise.
At the moment, the compiler intrinsic is used on GCC and Clang due to
its easy availability.

Quick note regarding overflow: yes, subtractions on int64_t can overflow, but
the llabs takes care of that. The llabs is also guaranteed to be safe, with
no annoying INT64_MIN business, since INT64_MIN, being a power of 2, is
shifted down before being sent to llabs.
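
For illustration, the approach described above can be sketched as follows. This is a minimal sketch, not the committed code: the names binary_gcd and ctz64 are hypothetical, the fallback loop stands in for the De Bruijn version, and right shifts of negative values are assumed to be arithmetic (which C leaves implementation-defined).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: count trailing zero bits of a nonzero value. */
static int ctz64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(x);          /* GCC/Clang built-in */
#else
    int n = 0;
    while (!(x & 1)) {                  /* portable fallback for the sketch */
        x >>= 1;
        n++;
    }
    return n;
#endif
}

/* Sketch of Stein's binary GCD on int64_t (hypothetical name). */
static int64_t binary_gcd(int64_t a, int64_t b)
{
    int za, zb, k;
    int64_t u, v;

    if (a == 0)
        return b;
    if (b == 0)
        return a;

    za = ctz64((uint64_t)a);
    zb = ctz64((uint64_t)b);
    k  = za < zb ? za : zb;             /* shared power of two */

    /* Shift out the factors of two first, so that INT64_MIN (a power
     * of two) becomes -1 before llabs sees it. Assumes arithmetic
     * right shift for negative operands. */
    u = llabs(a >> za);
    v = llabs(b >> zb);

    /* Both values are odd and positive here. */
    while (u != v) {
        if (u > v) {
            int64_t t = u;
            u = v;
            v = t;
        }
        v -= u;                         /* even and positive */
        v >>= ctz64((uint64_t)v);       /* make it odd again */
    }
    return (int64_t)((uint64_t)u << k);
}
```

For example, binary_gcd(12, 18) strips the factors of two to get 3 and 9, reduces them to 3, and restores the shared factor of two to return 6.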

The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On
GCC, this is provided by a built-in. On Microsoft compilers, there is a
_BitScanForward64 analog of _BitScanForward that should work, but I can't confirm.
Apparently it is not available on 32-bit builds, so this may or may not
work correctly. On Intel, per the documentation there is only an
intrinsic for _bit_scan_forward. People have posted on forums
regarding _bit_scan_forward64, but often their documentation is
woeful. Again, I don't have it, so I can't test.

As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest
use a compiled version based on the De-Bruijn method of Leiserson et al:
http://supertech.csail.mit.edu/papers/debruijn.pdf.
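
As a rough illustration of the De Bruijn technique, the sketch below counts trailing zeros of a 64-bit value with one multiplication and a table lookup. The function name, the constant and the table are assumptions chosen for this example (a commonly published pair); they are not necessarily the ones used in the commit.

```c
#include <stdint.h>

/* Lookup table matching the De Bruijn constant used below: entry j holds
 * the bit index i for which the top six bits of (constant << i) equal j. */
static const uint8_t debruijn_ctz64_tab[64] = {
     0,  1, 48,  2, 57, 49, 28,  3,
    61, 58, 50, 42, 38, 29, 17,  4,
    62, 55, 59, 36, 53, 51, 43, 22,
    45, 39, 33, 30, 24, 18, 12,  5,
    63, 47, 56, 27, 60, 41, 37, 16,
    54, 35, 52, 21, 44, 32, 23, 11,
    46, 26, 40, 15, 34, 20, 31, 10,
    25, 14, 19,  9, 13,  8,  7,  6
};

/* Hypothetical name. The input must be nonzero; zero yields 0 here. */
static int ctz64_debruijn(uint64_t v)
{
    /* v & -v isolates the lowest set bit, i.e. 2^i. Multiplying the
     * De Bruijn constant by 2^i shifts it left by i, so the top six
     * bits form a window that is unique for each i. */
    uint64_t lowest = v & (0 - v);
    return debruijn_ctz64_tab[(lowest * UINT64_C(0x03F79D71B4CB0A89)) >> 58];
}
```

The method needs no branches and no loop, which fits the reported result of it landing between the built-in and the previous Euclidean implementation.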

Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell)
with a START_TIMER and STOP_TIMER in libavutil/rational.c, followed by a
make fate.

aac-am00_88.err:
builtin:
714 decicycles in av_gcd,    4095 runs,      1 skips

de-bruijn:
1440 decicycles in av_gcd,    4096 runs,      0 skips

previous:
2889 decicycles in av_gcd,    4096 runs,      0 skips

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

2015-10-11 04:08:41 +02:00

compat

Merge commit '2830bce47e2eb29c76202f19017031ddc1f95dd3'

2015-10-10 09:45:44 +02:00

doc

doc/resampler, swresample/options: use proper capitalization

2015-10-10 20:49:54 +02:00

libavcodec

avcodec/pngdec: Check blend_op.

2015-10-11 03:46:44 +02:00

libavdevice

avdevice/libdc1394: add const to suppress "assignment discards const qualifier from pointer target type" warnings

2015-09-20 22:12:37 +02:00

libavfilter

avfilter/x86/vf_w3fdif: add colons after labels

2015-10-10 17:55:06 +02:00

libavformat

fate/async: test error code from underlying protocol

2015-10-10 17:58:45 +02:00

libavresample

avresample: Remove an unused variable

2015-09-29 14:33:01 +02:00

libavutil

avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm

2015-10-11 04:08:41 +02:00

libpostproc

Merge commit 'e88103a7f92cf27a2868b50acc8a9912f6088249'

2015-09-05 21:35:46 +02:00

libswresample

doc/resampler, swresample/options: use proper capitalization

2015-10-10 20:49:54 +02:00

libswscale

doc/scaler, swscale/options: use proper capitalization

2015-10-10 19:32:56 +02:00

presets

presets: remove moldering iPod presets

2014-06-17 16:15:04 -08:00

tests

fate/async: test error code from underlying protocol

2015-10-10 17:58:45 +02:00

tools

Merge commit '87e5d8d78cf08b54b4a9e7cbaeff89f8c1d91b78'

2015-09-07 12:27:21 +02:00

.gitattributes

Treat all '*.pnm' files as non-text file

2014-11-28 17:52:43 -05:00

.gitignore

gitignore: ignore object file temporaries

2015-10-10 20:30:41 +02:00

.travis.yml

Merge commit '1e0b8bf0b3b3b4247fb21e9839af342ae879607c'

2015-09-12 13:31:22 +02:00

arch.mak

use mmi instead of loongson3 as simd-optimization flag

2015-07-07 03:46:57 +02:00

Changelog

avfilter: add displace video filter

2015-10-04 21:44:57 +02:00

cmdutils_common_opts.h

opts: add list device sources/sinks options

2014-10-25 20:20:31 +02:00

cmdutils_opencl.c

OpenCL: Avoid potential buffer overflow in cmdutils_opencl.c

2015-04-28 12:18:23 +02:00

cmdutils.c

cmdutils: silence unused warnings under --disable-swscale, --disable-swresample

2015-10-03 19:26:09 +02:00

cmdutils.h

cmdutils: remove sws_opts usage, simplify code

2015-08-08 16:51:25 +02:00

common.mak

Merge commit '3ae0e721c7b6e0483801b9039b3d140e3b68b7f5'

2015-07-22 16:30:37 +02:00

configure

Merge commit 'd7a5a178c252b625537adc046392624ad543dea7'

2015-10-10 09:37:51 +02:00

COPYING.GPLv2

…

COPYING.GPLv3

…

COPYING.LGPLv2.1

cosmetics: Delete empty lines at end of file.

2012-02-09 12:26:45 +01:00

COPYING.LGPLv3

…

CREDITS

CREDITS: redirect to Git log, remove current outdated content

2013-01-31 18:02:52 +01:00

ffmpeg_dxva2.c

ffmpeg_dxva2: call GetDesktopWindow() in place of GetShellWindow()

2015-06-03 16:25:08 +02:00

ffmpeg_filter.c

Merge commit '8b830ee9a26d47b138f12a82085cdb372f407f1e'

2015-10-10 09:43:47 +02:00

ffmpeg_opt.c

Merge commit 'b5e4f393b6757629281f58c3f3f6d55ca522ab60'

2015-09-29 15:30:48 +02:00

ffmpeg_vdpau.c

ffmpeg_vdpau: Ignore decoder's max supported level

2015-08-16 15:02:33 -07:00

ffmpeg_videotoolbox.c

build: restore videotoolbox compilation on iOS

2015-10-09 10:59:31 +02:00

ffmpeg.c

ffmpeg: avoid possible undefined behavior

2015-10-09 21:19:39 +02:00

ffmpeg.h

ffmpeg: switch swscale option handling to AVDictionary similar to what the other subsystems use

2015-08-08 14:44:15 +02:00

ffplay.c

ffplay: use correct context for av_log

2015-10-05 22:28:17 +02:00

ffprobe.c

ffprobe: use AV_OPT_TYPE_BOOL for writers options

2015-09-12 17:50:22 +02:00

ffserver_config.c

ffserver: fix line wrapping on function decls

2015-10-04 15:58:35 -07:00

ffserver_config.h

ffserver: Use singlejpeg muxer for jpeg

2015-06-08 03:36:22 +02:00

ffserver.c

ffserver: avoid leaking pathname at exit

2015-10-04 22:31:22 -07:00

INSTALL.md

INSTALL: add markdown syntax

2014-05-28 22:38:38 +02:00

library.mak

build: add LDLIBFLAGS

2015-07-08 14:35:02 +02:00

LICENSE.md

avfilter: add rubberband wrapper

2015-09-20 19:54:57 +02:00

MAINTAINERS

avfilter/vf_chromakey: Add chromakey video filter

2015-09-23 18:10:14 +02:00

Makefile

avcodec: add new Videotoolbox hwaccel.

2015-08-03 10:12:10 +02:00

README.md

README: replace http with https

2015-10-06 13:27:29 +02:00

RELEASE

RELEASE: update to 2.8.git

2015-09-09 23:53:15 -03:00

version.sh

version.sh: Print versions based on the last git tag for release branches

2014-07-28 15:44:59 +02:00

README.md

FFmpeg README

FFmpeg is a collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.

Libraries

libavcodec provides implementations of a wide range of codecs.
libavformat implements streaming protocols, container formats and basic I/O access.
libavutil includes hashers, decompressors and miscellaneous utility functions.
libavfilter provides a means to alter decoded audio and video through a chain of filters.
libavdevice provides an abstraction to access capture and playback devices.
libswresample implements audio mixing and resampling routines.
libswscale implements color conversion and scaling routines.

Tools

ffmpeg is a command line toolbox to manipulate, convert and stream multimedia content.
ffplay is a minimalistic multimedia player.
ffprobe is a simple analysis tool to inspect multimedia content.
ffserver is a multimedia streaming server for live broadcasts.
Additional small tools such as aviocat, ismindex and qt-faststart.

Documentation

The offline documentation is available in the doc/ directory.

The online documentation is available on the main website and in the wiki.

Examples

Coding examples are available in the doc/examples directory.

License

The FFmpeg codebase is mainly LGPL-licensed, with optional components licensed under the GPL. Please refer to the LICENSE file for detailed information.

Contributing

Patches should be submitted to the ffmpeg-devel mailing list using git format-patch or git send-email. GitHub pull requests should be avoided because they are not part of our review process. Few developers follow pull requests, so they will likely be ignored.

Languages

C 92.1%

Assembly 6%

Makefile 1.2%

C++ 0.3%

Objective-C 0.2%

Other 0.1%