155 Commits

Author SHA1 Message Date
Mans Rullgard
1550f45a89 Add av_clip_uintp2() function
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-13 16:45:24 -04:00
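As a rough illustration of what a function like av_clip_uintp2() is for, the sketch below clips a signed value into the unsigned range [0, 2^p - 1]. The name clip_uintp2_sketch and the plain comparison-based body are assumptions made for illustration only; the code actually added by this commit is not shown in the log and may be implemented differently.

```c
#include <assert.h>

/* Hypothetical helper (not the committed code): clip a signed value
 * into the unsigned range [0, 2^p - 1] using plain comparisons. */
static inline unsigned clip_uintp2_sketch(int a, int p)
{
    assert(p > 0 && p < 31);      /* keep 1 << p well defined for int */
    const int max = (1 << p) - 1; /* largest value that fits in p bits */

    if (a < 0)
        return 0;
    if (a > max)
        return (unsigned)max;
    return (unsigned)a;
}
```

For a 10-bit pixel, clip_uintp2_sketch(value, 10) would keep the result within 0..1023.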
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Ronald S. Bultje
4773d90421 vp8: frame-multithreading.
Tested on a Mac Pro, 2 CPUs, 2 cores each, OSX 10.6.6:

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/sintel_trailer_1080p_vp8_vorbis.webm \
  -f null -vcodec rawvideo -an -
1: 0m14.630s (89.9 fps)
2: 0m8.056s (163.2 fps)
3: 0m5.882s (223.6 fps)
4: 0m4.952s (265.6 fps)

time ./ffmpeg -v 0 -vsync 0 -threads [1234] -i \
  ~/Downloads/Elephants_Dream-720p-Stereo.webm \
  -f null -vcodec rawvideo -an -
1: 1m12.962s (215.1 fps)
2: 0m44.682s (351.2 fps)
3: 0m31.183s (503.2 fps)
4: 0m25.284s (620.6 fps)

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2011-05-02 17:03:31 +02:00
Stefano Sabatini
975a1447f7 Replace deprecated FF_*_TYPE symbols with AV_PICTURE_TYPE_*.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-02 12:18:44 +02:00
Alexander Strange
66f608a6aa vp8.c: rename EDGE_* to VP8_EDGE_*.
2011-03-24 21:48:18 -04:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Jason Garrett-Glaser
81a131312d VP8: fix other function declaration
Was missed in 3efbe137.
2011-03-12 15:36:15 -08:00
Jason Garrett-Glaser
1eeca88691 VP8: optimize VP8Context struct ordering
Shaves at least 3KB off code size on x86, should improve cache utilization.
This would probably be useful to do for other decoders/encoders as well.
2011-03-12 03:43:42 -08:00
Jason Garrett-Glaser
3efbe13739 VP8: fix function declaration
2011-03-12 03:41:39 -08:00
Jason Garrett-Glaser
628b48db85 VP8: use a goto to break out of two loops
A break statement was supposed to break out of two loops, but only broke out of one.
Didn't affect output, just could have been marginally slower.
2011-03-12 03:41:33 -08:00
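The pitfall described in this commit message is a general one in C: `break` only leaves the innermost loop. The sketch below shows the goto-based exit on a made-up search over a 2D grid; find_first and its parameters are hypothetical and unrelated to the actual vp8.c loops.

```c
#include <stddef.h>

/* Hypothetical example: find the first cell equal to `needle` in a
 * rows x cols grid stored row-major. Returns 1 and writes the position
 * if found, 0 otherwise. */
static int find_first(const int *grid, size_t rows, size_t cols,
                      int needle, size_t *out_row, size_t *out_col)
{
    int found = 0;
    size_t r, c;

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == needle) {
                *out_row = r;
                *out_col = c;
                found = 1;
                goto done; /* a plain `break` would only exit the inner loop */
            }
        }
    }
done:
    return found;
}
```

With a plain `break`, the outer loop would keep scanning the remaining rows, which costs time and, in this sketch, could overwrite the reported position with a later match.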
Jason Garrett-Glaser
891b1f15a7 VP8: init one less near_mv
This one didn't actually need to be initialized.
2011-02-17 15:25:28 -08:00
Jason Garrett-Glaser
bcf4568f18 VP8: split out declarations to new header
2011-02-17 15:25:16 -08:00
Jason Garrett-Glaser
7634771e70 VP8: faster MV clipping
2011-02-17 15:23:53 -08:00
Reinhard Tartler
737eb5976f Merge libavcore into libavutil
It is unlikely that other sizable projects will adopt libavutil on its
own. Projects that need a small footprint are better off with more
specialized libraries such as gnulib, or with simply copying the parts
they need. With this in mind, nobody is helped by keeping libavutil and
libavcore split. To ease maintenance inside and around FFmpeg and to
reduce confusion about where to put common code, avcore's functionality
is merged (back) into avutil.

Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-02-15 16:18:21 +01:00
Mans Rullgard
a7878c9f73 VP8: ARM optimised decode_block_coeffs_internal
Approximately 5% faster on Cortex-A8.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-11 15:48:11 +00:00
Jason Garrett-Glaser
f3d09d44b7 VP8: optimized mv prediction and decoding
Merge find_near_mvs and mv bitstream decoding: don't do prediction steps
until absolutely necessary.
2011-02-10 16:18:16 -08:00
Jason Garrett-Glaser
62457f9052 VP8: idct_mb optimizations
Currently uses AV_RL32 instead of AV_RL32A, as the latter doesn't exist yet.
2011-02-08 15:59:24 -08:00
Jason Garrett-Glaser
8a2c99b486 VP8: slightly faster loopfilter sharpness logic
2011-02-04 04:51:22 -08:00
Jason Garrett-Glaser
79dec1541b VP8: faster deblock strength calculation
Convert hev_thresh logic to a LUT, simplify mbedge_lim calculation.
2011-02-04 04:51:18 -08:00
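Converting a threshold cascade to a LUT can be sketched as follows. The thresholds in hev_thresh_branchy are an assumption based on the VP8 bitstream specification's loop filter description, and every name here (hev_thresh_lut, init_hev_thresh_lut, hev_thresh_lookup) is hypothetical; the table layout used by the actual commit is not shown in the log.

```c
#include <stdint.h>

/* Hypothetical table indexed as [keyframe][filter_level], levels 0..63. */
static uint8_t hev_thresh_lut[2][64];

/* Branchy reference version. ASSUMPTION: thresholds follow the VP8
 * specification's high-edge-variance rules; verify before relying on them. */
static int hev_thresh_branchy(int filter_level, int keyframe)
{
    if (keyframe) {
        if (filter_level >= 40) return 2;
        if (filter_level >= 15) return 1;
        return 0;
    }
    if (filter_level >= 40) return 3;
    if (filter_level >= 20) return 2;
    if (filter_level >= 15) return 1;
    return 0;
}

/* Fill the table once so the per-macroblock path is a single load. */
static void init_hev_thresh_lut(void)
{
    int key, level;
    for (key = 0; key < 2; key++)
        for (level = 0; level < 64; level++)
            hev_thresh_lut[key][level] =
                (uint8_t)hev_thresh_branchy(level, key);
}

/* Table lookup replacing the chain of comparisons. */
static inline int hev_thresh_lookup(int filter_level, int keyframe)
{
    return hev_thresh_lut[keyframe][filter_level];
}
```

The table costs 128 bytes and trades up to three data-dependent branches for one indexed load.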
Jason Garrett-Glaser
a1b227bb53 VP8: faster filter_level clip
2011-02-03 19:55:06 -08:00
Jason Garrett-Glaser
dd18c9a050 VP8: simplify lf_delta mb mode logic
2011-02-03 19:55:02 -08:00
Jason Garrett-Glaser
64233e702a VP8: merge chroma MC calls
Adds some duplicated code, but avoids duplicate edge checks and similar.
~0.5% faster overall on Parkjoy test sample.
2011-01-31 20:46:54 -08:00
Jason Garrett-Glaser
73be29b0c4 Slightly simplify VP8 inter_predict
Merge an if and a switch.
2011-01-30 12:12:02 -08:00
Ronald S. Bultje
2e27959879 Move ff_emulated_edge_mc() into DSPContext.
2011-01-28 22:13:26 -05:00
Ronald S. Bultje
9d4bdcb714 Fix VP8 aliasing problems.
Replace * (uint32_t *) buf accesses with AV_WN32A/AV_COPY32.
2011-01-28 10:20:00 -05:00
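Accessing a byte buffer through a cast such as `*(uint32_t *)buf` can violate C's strict aliasing rules. The sketch below shows a portable alternative using memcpy, which compilers typically turn into a single 32-bit load or store. It is not the definition of AV_WN32A or AV_COPY32; write_u32, copy_u32 and fill4 are hypothetical names chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value into a byte buffer in
 * native byte order without type-punning through a pointer cast. */
static inline void write_u32(uint8_t *buf, uint32_t v)
{
    memcpy(buf, &v, sizeof v);
}

/* Hypothetical helper: copy four bytes between byte buffers. */
static inline void copy_u32(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 4);
}

/* Example use: set four adjacent mode bytes to the same value at once.
 * Multiplying by 0x01010101 replicates the byte into all four lanes. */
static void fill4(uint8_t *modes, uint8_t mode)
{
    write_u32(modes, mode * 0x01010101u);
}
```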
Diego Elio Pettenò
d36beb3f69 Add ff_ prefix to data symbols of encoders, decoders, hwaccel, parsers, bsf.
None of these symbols should be accessed directly, so declare them as
hidden.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-26 16:08:45 +00:00
Ronald S. Bultje
44002d8323 Don't do edge emulation unless the edge pixels will be used in MC.
Do not emulate larger edges than we will actually use for this round of
MC. Decoding goes from avg+SE 29.972+/-0.023sec to 29.856+/-0.023, i.e.
0.12sec or ~0.4% faster.
2011-01-25 13:50:16 -05:00
Ronald S. Bultje
7148da489e Fix valgrind invalid read on top MB rows with CODEC_FLAG_EMU_EDGE set.
Originally committed as revision 26168 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-30 14:33:21 +00:00
Ronald S. Bultje
ee555de7dd Support CODEC_FLAG_EMU_EDGE in VP8 decoder.
Originally committed as revision 26117 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-28 17:37:19 +00:00
Stefano Sabatini
e16f217ceb Use new imgutils.h API names, fix deprecation warnings.
Originally committed as revision 25058 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-07 19:15:29 +00:00
Jason Garrett-Glaser
2b476e02e1 Remove some stray +s in VP8
Originally committed as revision 24791 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-13 02:02:07 +00:00
Pascal Massimino
aa93c52c21 remove b4_stride/mb_stride.
correct mb_xy to use mb_width.
tighten allocations.
reduce the amount of zeroing.

Originally committed as revision 24760 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 08:27:38 +00:00
Pascal Massimino
ccf13f9e20 fix over-allocation. confused b4_stride with mb_width.
Originally committed as revision 24758 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 05:24:19 +00:00
Stefano Sabatini
6ce9b4310c Remove use of the deprecated function avcodec_check_dimensions(), use
av_check_image_size() instead.

Originally committed as revision 24711 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-06 09:37:04 +00:00
Jason Garrett-Glaser
7e13022a4d VP8: fix bug in prefetch
Motion vectors in VP8 are qpel, not fullpel.

Originally committed as revision 24707 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 20:03:54 +00:00
Jason Garrett-Glaser
905ef0d064 VP5/6/8: eliminate CABAC dependency
Create a custom table for VP5/6/8's renorm to avoid depending on H.264's.
Saves one instruction in the arithmetic decoder as well.

Originally committed as revision 24701 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 23:04:05 +00:00
Jason Garrett-Glaser
1e73967950 VP8: partially inline decode_block_coeffs
Avoids a function call in the case of empty DCT blocks (most of the time).

Originally committed as revision 24691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 02:23:25 +00:00
Jason Garrett-Glaser
ffbf0794f9 Fix 100L in r24689
Accidentally committed some timing code.

Originally committed as revision 24690 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:40:58 +00:00
Jason Garrett-Glaser
afb54a85c3 VP8: simplify decode_block_coeffs to avoid having to track nonzero coeffs
Slightly faster.

Originally committed as revision 24689 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:38:08 +00:00
Jason Garrett-Glaser
b0d5879513 VP8: slightly faster DCT coefficient probability update
Originally committed as revision 24687 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 23:21:47 +00:00
Jason Garrett-Glaser
476be414a4 VP8: make another RAC call branchy
1-2 clocks faster.

Originally committed as revision 24683 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:34:24 +00:00
Jason Garrett-Glaser
0908f1b945 VP8: unroll partition type decoding tree
~34% faster partition type decoding.

Originally committed as revision 24681 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:10:58 +00:00
Jason Garrett-Glaser
c5dec7f137 VP8: unroll splitmv decoding tree
Much faster splitmv mode decoding.

Originally committed as revision 24680 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:37:14 +00:00
Jason Garrett-Glaser
23117d69c1 VP8: unroll MB mode decoding tree
~50% faster MB mode decoding, plus eliminate a costly switch.

Originally committed as revision 24679 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:24:28 +00:00
Jason Garrett-Glaser
370b622a45 VP8: eliminate a dereference in coefficient decoding
Originally committed as revision 24671 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 22:48:38 +00:00
Jason Garrett-Glaser
f311208cf1 VP8: much faster DC transform handling
A lot of the time the DC block is empty: don't do the WHT in this case.
A lot of the rest of the time, there's only one coefficient: make a special
DC-only transform for that case.
When the block is empty, don't incorrectly mark luma DCT blocks as having DC
coefficients.

Originally committed as revision 24670 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:57:03 +00:00
Jason Garrett-Glaser
827d43bb9d VP8: move zeroing of luma DC block into the WHT
Lets us do the zeroing in asm instead of C.
Also makes it consistent with the way the regular iDCT code does it.

Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:18:09 +00:00
Pascal Massimino
d2840fa49c only store intra prediction modes on the boundary for keyframes, not as a plane.
inter-frame behaviour unchanged.

Originally committed as revision 24664 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 09:44:53 +00:00
Jason Garrett-Glaser
10bf2eebbe VP8: simplify token_prob handling
~1.5% faster decode_block_coeffs

Originally committed as revision 24659 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 05:20:38 +00:00
Pascal Massimino
c22b4468a6 prevent access to vp8_coeff_band[16]
Originally committed as revision 24656 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-01 23:20:06 +00:00
Pascal Massimino
a8ab0cccf7 b0rk3d FATE + black helicopters hissing -> rolling back to r24556 and sleeping
Originally committed as revision 24559 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 23:09:13 +00:00
Pascal Massimino
62d1f7864e perform the clipping on luma_dc_qmul[1] and chroma_qmul[0] earlier
Originally committed as revision 24558 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 22:23:50 +00:00
Pascal Massimino
e7e81959d6 save some copies by moving some fields out of proba[2]
Originally committed as revision 24557 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 22:21:49 +00:00
Jason Garrett-Glaser
fca05ea8a0 VP8: add missing free
Fixes a tiny memory leak.

Originally committed as revision 24504 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 07:10:30 +00:00
Carl Eugen Hoyos
28e241de5d Fix r24445: Instead of needlessly initialising a variable, silence the warning.
Originally committed as revision 24498 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-25 14:49:45 +00:00
David Conrad
ca18a478e3 VP8: Inline traversing vp8_small_mvtree
Much faster read_mv_component, slightly faster overall

Originally committed as revision 24470 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:25 +00:00
David Conrad
7697cdcf95 VP8: Use vp56_rac_get_prob_branchy when the bit is only used by an if()
Originally committed as revision 24469 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:20 +00:00
David Conrad
fe1b5d974a Decode DCT tokens by branching to a different code path for each branch
on the huffman tree, instead of traversing the tree in a while loop.

Based on the similar optimization in libvpx's detokenize.c

10% faster at normal bitrates, and 30% faster for high-bitrate intra-only

Originally committed as revision 24468 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:17 +00:00
Jason Garrett-Glaser
13a1304bb3 Add myself to VP8 copyright and maintainers.
Also add Ronald to maintainers.

Originally committed as revision 24464 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:42:35 +00:00
Jason Garrett-Glaser
414ac27d8f VP8: always_inline some things to force gcc to do the right thing
Mostly seems to help in the MC code, which gets a hundred cycles faster.

Originally committed as revision 24463 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:36:21 +00:00
Jason Garrett-Glaser
06d50ca804 VP8: use AV_RL24 instead of defining a new RL24.
Originally committed as revision 24462 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:17:18 +00:00
Jason Garrett-Glaser
9fddd14a8e VP8: Slightly faster MV selection
Don't clamp best mv unless it's actually used.

Originally committed as revision 24461 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 19:06:22 +00:00
Jason Garrett-Glaser
14767f35ed VP8: use AV_ZERO32 instead of AV_WN32A where relevant
Originally committed as revision 24460 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:42:19 +00:00
Jason Garrett-Glaser
09959ec46e VP8: eliminate redundant code in r24458
Originally committed as revision 24459 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:34:21 +00:00
Jason Garrett-Glaser
a71abb714e VP8: shave a few clocks off check_intra_pred_mode
Originally committed as revision 24458 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:24:38 +00:00
Jason Garrett-Glaser
0087aa47d0 VP8: fix broken sign bias code in MV pred
Apparently the official conformance test vectors don't test this feature,
even though libvpx uses it.

Originally committed as revision 24456 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 06:41:35 +00:00
Jason Garrett-Glaser
3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma.
Add MMX idct_dc_add4uv function for this case.
~40% faster chroma idct.

Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 06:02:52 +00:00
Jason Garrett-Glaser
3df56f4118 VP8: Clean up some variable shadowing.
Originally committed as revision 24454 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 03:44:37 +00:00
Jason Garrett-Glaser
8a467b2d44 VP8: 30% faster idct_mb
Take shortcuts based on statistically common situations.
Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT
blocks are common.
TODO: tie this more directly into the MB mode, since the DC-level transform is
only used for non-splitmv blocks?

Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 02:58:27 +00:00
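The idea of a DC-only shortcut applied to four neighbouring blocks can be sketched in plain C as below. The function names, the block layout (`int16_t blocks[4][16]`) and the rounding `(dc + 4) >> 3` are assumptions for illustration; the commit itself adds mmx and sse2 versions, which are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 pixel range. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* DC-only inverse transform for one 4x4 block: every output sample is
 * the same rounded DC value, so just add it to the 16 pixels.
 * ASSUMPTION: rounding is (dc + 4) >> 3; a negative dc relies on the
 * compiler using an arithmetic right shift. */
static void idct_dc_add_sketch(uint8_t *dst, int16_t block[16],
                               ptrdiff_t stride)
{
    int x, y;
    int dc = (block[0] + 4) >> 3;

    block[0] = 0; /* clear the coefficient for the next macroblock */
    for (y = 0; y < 4; y++) {
        for (x = 0; x < 4; x++)
            dst[x] = clip_u8(dst[x] + dc);
        dst += stride;
    }
}

/* Four horizontally adjacent DC-only blocks, i.e. one 16x4 luma strip. */
static void idct_dc_add4_sketch(uint8_t *dst, int16_t blocks[4][16],
                                ptrdiff_t stride)
{
    int i;
    for (i = 0; i < 4; i++)
        idct_dc_add_sketch(dst + 4 * i, blocks[i], stride);
}
```

A SIMD version can process the whole 16-pixel row in one register, which is the reason to group four blocks into a single call.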
Jason Garrett-Glaser
ef38842f0b VP8: smarter prefetching
Don't prefetch reference frames that were used less than 1/32 of the time so
far in the frame.
This gives a speedup of up to ~2% on videos that, in many frames, make near-zero
(but not entirely zero) use of golden and/or alt-refs.
This is a very common property of videos encoded by libvpx.

Originally committed as revision 24451 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 01:59:56 +00:00
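The 1/32 heuristic described above can be sketched as a single comparison. The function and parameter names are hypothetical, and the exact counters used in vp8.c are not shown in this log.

```c
/* Hypothetical helper: decide whether a reference frame is worth
 * prefetching. `ref_use_count` is how many macroblocks in the current
 * frame have used this reference so far; `mbs_decoded` is how many
 * macroblocks have been decoded so far. Prefetch only when the use
 * count exceeds 1/32 of the macroblocks decoded (integer-truncated). */
static inline int should_prefetch_ref(unsigned ref_use_count,
                                      unsigned mbs_decoded)
{
    return ref_use_count > (mbs_decoded >> 5);
}
```

Using a shift keeps the test cheap enough to run per macroblock, and a reference that has never been used is never prefetched.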
Jason Garrett-Glaser
c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks.
~0.3% faster overall.

Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:07:16 +00:00
Jason Garrett-Glaser
b74f70d646 VP8: avoid a memset for non-i4x4 blocks with no coefficients
Originally committed as revision 24447 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:05:44 +00:00
Jason Garrett-Glaser
145d31865d Get rid of more unnecessary dereferences in VP8 deblocking
Originally committed as revision 24446 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:11:40 +00:00
Jason Garrett-Glaser
867215336d Shut up an uninitialized variable GCC warning in VP8.
Originally committed as revision 24445 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:04:51 +00:00
Jason Garrett-Glaser
c4211046d2 Smarter VP8 prefetching
Prefetch all refs (including altref), but only if they've been used so far this
frame.
~2.5% faster overall.

TODO: Do something even smarter, like using how often each ref has been used
so far, so that a couple blocks of a rarely-used ref don't force us to prefetch
it.

Originally committed as revision 24444 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:03:08 +00:00
Jason Garrett-Glaser
8cfae560ad Fix stupid bug in VP8 prefetching code
Originally committed as revision 24443 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 22:15:43 +00:00
Jason Garrett-Glaser
2a38c2e99a Eliminate a LUT in escape decoding in VP8 decode_block_coeffs
Originally committed as revision 24441 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 22:08:09 +00:00
Jason Garrett-Glaser
d292c3455e Eliminate some repeated dereferences in VP8 inter_predict
Originally committed as revision 24438 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 21:05:30 +00:00
Jason Garrett-Glaser
b946111fde Eliminate a pointless memset for intra blocks in P-frames in VP8
Originally committed as revision 24429 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 12:15:29 +00:00
Jason Garrett-Glaser
b9a7186bf4 VP8: Don't store segment in macroblock struct anymore.
Not necessary with the previous patch.

Originally committed as revision 24427 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 11:55:55 +00:00
Jason Garrett-Glaser
c55e0d34ba Convert VP8 macroblock structures to a ring buffer.
Uses a slightly nonintuitive ring buffer size of (width+height*2) to simplify
addressing logic.
Also split out the segmentation map to a separate structure, necessary to
implement the ring buffer.

Originally committed as revision 24426 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 11:45:18 +00:00
Jason Garrett-Glaser
968570d65f Calculate deblock strength per-MB instead of per-row
Gives better cache locality, since the VP8Macroblock structs are still in cache.
Inspired by the way x264 does it.

Originally committed as revision 24417 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 07:24:22 +00:00
Jason Garrett-Glaser
d1c58fce20 Avoid tracking i4x4 modes in P-frames in VP8
As in the previous commit, they aren't used for context selection, so it saves
memory this way.

Originally committed as revision 24416 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 07:04:45 +00:00
Jason Garrett-Glaser
158e062c95 Avoid useless fill_rectangle in P-frames in VP8
In VP8, i4x4 only uses contexts based on neighbors in I-frames.

Originally committed as revision 24415 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 06:39:54 +00:00
Jason Garrett-Glaser
7bf254c41d Optimize partition mv decoding in VP8
Originally committed as revision 24414 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 06:29:26 +00:00
Jason Garrett-Glaser
c0498b3031 Take shortcuts for mv0 case in VP8 MC
Avoid edge emulation -- it isn't needed if there isn't any subpel.

Originally committed as revision 24413 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 05:49:09 +00:00
Jason Garrett-Glaser
702e8d3376 Much faster VP8 mv and mode prediction
Originally committed as revision 24412 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 04:26:41 +00:00
Jason Garrett-Glaser
d864dee8ab Add prefetching to VP8 decoder
~5% faster overall, probably depends on CPU and resolution.

Originally committed as revision 24410 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 03:09:10 +00:00
Måns Rullgård
096971e892 vp8: indent
Originally committed as revision 24368 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-20 17:54:28 +00:00
Måns Rullgård
070ce7efad vp8: add do { } while(0) around XCHG() macro to avoid confusing if/else
This is the correct solution to the warning "fixed" in the previous
commit.

Originally committed as revision 24367 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-20 17:54:25 +00:00
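The do { } while (0) wrapper is a common C idiom for multi-statement macros. The sketch below uses a made-up SWAP_INT macro, not the actual XCHG() from vp8.c, to show why the wrapper matters next to if/else.

```c
/* Hypothetical swap macro, wrapped so that it expands to exactly one
 * statement and needs the trailing semicolon like a function call. */
#define SWAP_INT(a, b) \
    do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Order two values so that *lo <= *hi. Returns 1 if a swap happened.
 * If the macro were a bare { ... } block, the `;` after it would end
 * the `if` statement and the following `else` would not compile. */
static int order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        return 0; /* already ordered */
    return 1;
}
```

Adding braces at each call site, as the previous commit did, silences the warning in one place; fixing the macro makes every use safe.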
Diego Biurrun
153da88dfb Add some braces to silence the warning:
libavcodec/vp8.c:892: warning: suggest explicit braces to avoid ambiguous `else'

Originally committed as revision 24366 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-20 17:45:54 +00:00
Ronald S. Bultje
3facfc99da Change function prototypes for width=8 inner and mbedge loopfilter functions
so that they do both U and V planes at the same time. This will have speed
advantages when using SSE2 (or higher) optimizations, since we can do both
the U and V rows together in a single xmm register.

This also renames filter16 to filter16y and filter8 to filter8uv so that it's
more obvious what each function is used for.

Originally committed as revision 24337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 21:18:04 +00:00
David Conrad
9ac831c2c0 vp8: Save mb border needed for intra prediction so that loop filter can run
immediately after a mb row is decoded

Originally committed as revision 24252 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 07:20:35 +00:00
David Conrad
b6c420ce8f vp8: Check for malloc failure
Originally committed as revision 24251 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 07:20:31 +00:00
Ronald S. Bultje
e394953e62 Add missing doxy for function arguments.
Originally committed as revision 24110 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-08 15:01:59 +00:00
David Conrad
5245c04da3 VP8: Move calculation of outer filter limit out of dsp functions for normal
filter to match the simple loop filter

Originally committed as revision 24010 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-02 21:04:45 +00:00
Diego Biurrun
3fa7626863 Avoid square brackets in Doxygen comments; Doxygen chokes on them.
Originally committed as revision 23979 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-02 11:44:58 +00:00
Ronald S. Bultje
7ed06b2be8 Simplify MV parsing: stop laying out 2 or 4 (16x8/8x8/8x16) MVs over all
16 subblocks (since we no longer need that), which should also lead to a
minor speedup.

Originally committed as revision 23854 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 16:04:14 +00:00
Ronald S. Bultje
7c4dcf8165 Optimize split MC, so we don't always do 4x4 blocks of 4x4 pixels each, but
we apply them as 16x8/8x16/8x8 subblocks where possible. Since this allows
us to use width=8/16 instead of width=4 MC functions, we can now take more
advantage of SSE2/SSSE3 optimizations, leading to a total speedup for splitMV
filter of about 10%.

Originally committed as revision 23853 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 13:50:55 +00:00
David Conrad
0ef1dbedcb VP8 bilinear filter
Originally committed as revision 23813 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 01:46:29 +00:00