Commit Graph

133 Commits

Author SHA1 Message Date
Ronald S. Bultje
48098788c2 vp8: Replace x*155/100 by x*101581>>16.
Idea stolen from webp (by Pascal Massimino) - because it's Cool.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2012-07-25 14:37:03 -04:00
Martin Storsjö
25f056e6d4 vp8: Enclose pthread function calls in ifdefs
This fixes building with threads disabled.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-07-15 13:55:18 +03:00
Martin Storsjö
a794600c00 vp8: Include the thread headers before using the pthread types
This was unnoticed on linux, since stdlib.h apparently includes
files declaring the pthread_mutex_t and pthread_cond_t types.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-07-14 19:19:33 -07:00
Daniel Kang
951455c1c1 vp8: implement sliced threading
Testing gives 25-30% gain on HD clips with two threads and
up to 50% gain with eight threads.

Sliced threading uses more memory than single or frame threading.

Frame threading and single threading keep the previous memory
layout.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-07-14 20:18:54 +02:00
Daniel Kang
17343e3952 vp8: move data from VP8Context->VP8Macroblock
In preparation for sliced threading.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-07-14 20:18:54 +02:00
Daniel Kang
337ade52de vp8: refactor decoding a single mb_row
This is in preperation for sliced threading.

Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2012-07-14 20:18:54 +02:00
Ronald S. Bultje
6163d880c0 vp8: move block coeff arithcoder on stack.
This prevents gcc from assuming that contents of it may have changed
between calls to vp56_range_get_prob(), thus preventing countless (and
unnecessary) movs. Decoding of sintel trailer goes from (avg+SG) 9.796
+/- 0.003 to 9.635 +/- 0.010.
2012-05-30 09:08:29 -07:00
Ronald S. Bultje
82a0497cf3 vp8: update frame size changes on thread context switches.
This properly synchronizes frame size changes between threads if
subsequent threads abort decoding before frame size is initialized, i.e.
it prevents the thread after that from ping-ponging back to the original
value.

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
2012-05-02 09:57:12 -07:00
Martin Storsjö
00c3b67b8a cosmetics: Align codec declarations
Also break some long lines, remove codec function placeholder comments
and add spaces in sample/pixel format lists.

Signed-off-by: Martin Storsjö <martin@martin.st>
2012-04-06 22:37:38 +03:00
Janne Salonen
14ba7472dc vp8: fix update_lf_deltas in libavcodec/vp8.c
lf_delta.ref[i] and lf_delta.mode[i] were incorrectly reset to 0 if
specific delta value was not updated. Fixed to keep the previous value
if flag indicates that element in question is not updated.

Signed-off-by: Janne Salonen <jsalonen@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-24 08:22:05 -07:00
Aaron Colwell
30011bf201 vp8: avoid race condition on segment map.
This change avoids accessing the segment map of the previous frame if
segmentation is not enabled for the current frame. The caller of
decode_mb_mode() only calls ff_thread_await_progress() on the reference
segmentation index array if segmentation is enabled, so Chromium's TSAN
will report a race when accessing this data while segmentation is not
enabled.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-19 13:49:34 -07:00
Martin Storsjö
9cf0841ef3 dsputil: Add ff_ prefix to the dsputil*_init* functions
Signed-off-by: Martin Storsjö <martin@martin.st>
2012-02-15 22:06:34 +02:00
Ronald S. Bultje
fb90785e98 vp8: always update next_framep[] before returning from decode_frame().
Also slightly move around code not allocate a new frame if we won't
decode it. This prevents us from putting undecoded frames in frame
pointers, which (in mt decoding) other threads will use and wait on
as references, causing a deadlock (if we skipped decoding) or a crash
(if we didn't initialized next_framep[] at all).

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
2012-02-07 11:29:02 -08:00
Diego Biurrun
32f3c541bc doxygen: Do not include license boilerplates in Doxygen comment blocks. 2012-02-06 19:39:24 +01:00
Aaron Colwell
e02dec25ab vp8: flush buffers on size changes. 2011-12-02 07:21:08 -08:00
Justin Ruggles
f3a29b750a avcodec: move some AVCodecContext fields to an internal struct.
A new field, AVCodecContext.internal is used to hold a new struct
AVCodecInternal, which has private fields that are not codec-specific and are
used by general libavcodec functions.

Moved internal_buffer, internal_buffer_count, and is_copy.
2011-11-19 10:01:05 -05:00
Ronald S. Bultje
bfa0f96586 vp8: fix overflow in segmentation map caching. 2011-10-28 23:48:43 -07:00
Baptiste Coudurier
76741b0e56 h264: 4:2:2 intra decoding support
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-21 01:00:41 -07:00
Ronald S. Bultje
ce42a04884 vp8: fix up handling of segmentation_maps in reference frames.
Associate segmentation_map[] with reference frame, rather than
decoding instance. This fixes cases where the map would be free()'ed
on e.g. a size change in one thread, whereas the other thread was
still accessing it. Also, it fixes cases where threads overwrite data
that is still being referenced by the previous thread, who thinks that
it's part of the frame previously decoded by the next thread.
2011-10-21 00:17:58 -07:00
Ronald S. Bultje
0f0b5d6434 vp8: prevent read from uninitialized memory in decode_mvs
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-15 00:13:21 +02:00
Ronald S. Bultje
5653579381 vp8: force reallocation in update_thread_context after frame size change
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-15 00:13:21 +02:00
Ronald S. Bultje
f05c2fb6eb vp8: fix return value if update_dimensions fails
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2011-10-15 00:13:21 +02:00
Mans Rullgard
bb59156606 vp8: fix signed overflows
In addition to avoiding undefined behaviour, an unsigned type
makes more sense for packing multiple 8-bit values.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-10-08 20:03:55 +01:00
Anton Khirnov
ec6402b7c5 lavc: use designated initialisers for all codecs.
It's more readable and less prone to breakage.
2011-07-29 08:42:34 +02:00
Diego Biurrun
3c432e1186 doxygen: Fix documentation for some VP8 functions. 2011-07-04 12:54:26 +02:00
Diego Biurrun
24c9babaaf doxygen: Fix parameter names to match the function prototypes. 2011-07-03 18:30:02 +02:00
Ronald S. Bultje
9ebcf7699b vp8: fix segmentation race during frame-threading.
Fixes occasional failure of make fate-vp8-test-vector-010 with
frame-multithreading enabled.
2011-05-31 07:13:34 -07:00
Mans Rullgard
4276112277 vp8: use av_clip_uintp2() where possible
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-29 02:10:05 +01:00
Mans Rullgard
1550f45a89 Add av_clip_uintp2() function
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-13 16:45:24 -04:00
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Ronald S. Bultje
4773d90421 vp8: frame-multithreading.
Tested on a Mac Pro, 2 CPUs, 2 cores each, OSX 10.6.6:

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/sintel_trailer_1080p_vp8_vorbis.webm \
  -f null -vcodec rawvideo -an -
1: 0m14.630s (89.9 fps)
2: 0m8.056s (163.2 fps)
3: 0m5.882s (223.6 fps)
4: 0m4.952s (265.6 fps)

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/Elephants_Dream-720p-Stereo.webm \
  -f null -vcodec rawvideo -an -
1: 1m12.962s (215.1 fps)
2: 0m44.682s (351.2 fps)
3: 0m31.183s (503.2 fps)
4: 0m25.284s (620.6 fps)

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2011-05-02 17:03:31 +02:00
Stefano Sabatini
975a1447f7 Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-02 12:18:44 +02:00
Alexander Strange
66f608a6aa vp8.c: rename EDGE_* to VP8_EDGE_*. 2011-03-24 21:48:18 -04:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Jason Garrett-Glaser
81a131312d VP8: fix other function declaration
Was missed in 3efbe137.
2011-03-12 15:36:15 -08:00
Jason Garrett-Glaser
1eeca88691 VP8: optimize VP8Context struct ordering
Shaves at least 3KB off code size on x86, should improve cache utilization.
This would probably be useful to do for other decoders/encoders as well.
2011-03-12 03:43:42 -08:00
Jason Garrett-Glaser
3efbe13739 VP8: fix function declaration 2011-03-12 03:41:39 -08:00
Jason Garrett-Glaser
628b48db85 VP8: use a goto to break out of two loops
A break statement was supposed to break out of two loops, but only broke out of one.
Didn't affect output, just could have been marginally slower.
2011-03-12 03:41:33 -08:00
Jason Garrett-Glaser
891b1f15a7 VP8: init one less near_mv
This one didn't actually need to be initialized.
2011-02-17 15:25:28 -08:00
Jason Garrett-Glaser
bcf4568f18 VP8: split out declarations to new header 2011-02-17 15:25:16 -08:00
Jason Garrett-Glaser
7634771e70 VP8: faster MV clipping 2011-02-17 15:23:53 -08:00
Reinhard Tartler
737eb5976f Merge libavcore into libavutil
It is pretty hopeless that other considerable projects will adopt
libavutil alone in other projects. Projects that need small footprint
are better off with more specialized libraries such as gnulib or rather
just copy the necessary parts that they need. With this in mind, nobody
is helped by having libavutil and libavcore split. In order to ease
maintenance inside and around FFmpeg and to reduce confusion where to
put common code, avcore's functionality is merged (back) to avutil.

Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-02-15 16:18:21 +01:00
Mans Rullgard
a7878c9f73 VP8: ARM optimised decode_block_coeffs_internal
Approximately 5% faster on Cortex-A8.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-11 15:48:11 +00:00
Jason Garrett-Glaser
f3d09d44b7 VP8: optimized mv prediction and decoding
Merge find_near_mvs and mv bitstream decoding: don't do prediction steps
until absolutely necessary.
2011-02-10 16:18:16 -08:00
Jason Garrett-Glaser
62457f9052 VP8: idct_mb optimizations
Currently uses AV_RL32 instead of AV_RL32A, as the latter doesn't exist yet.
2011-02-08 15:59:24 -08:00
Jason Garrett-Glaser
8a2c99b486 VP8: slightly faster loopfilter sharpness logic 2011-02-04 04:51:22 -08:00
Jason Garrett-Glaser
79dec1541b VP8: faster deblock strength calculation
Convert hev_thresh logic to a LUT, simplify mbedge_lim calculation.
2011-02-04 04:51:18 -08:00
Jason Garrett-Glaser
a1b227bb53 VP8: faster filter_level clip 2011-02-03 19:55:06 -08:00
Jason Garrett-Glaser
dd18c9a050 VP8: simplify lf_delta mb mode logic 2011-02-03 19:55:02 -08:00
Jason Garrett-Glaser
64233e702a VP8: merge chroma MC calls
Adds some duplicated code, but avoids duplicate edge checks and similar.
~0.5% faster overall on Parkjoy test sample.
2011-01-31 20:46:54 -08:00