12140 Commits

Author SHA1 Message Date
Alex Converse
fe461767e6 aacenc: TLS: Try to preserve some energy in each non-zero band.
Reduce scalefactors in non-zero bands that underflow by twice as much as those
in bands that just fail to hit psy targets.

Originally committed as revision 24482 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 16:36:01 +00:00
Reimar Döffinger
edac49daf5 Use "const" qualifier for pointers that point to input data of
audio encoders.
This is purely for clarity/documentation purposes.

Originally committed as revision 24481 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 13:59:49 +00:00
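To illustrate the const-qualified input pointers described in the commit above, here is a minimal sketch. The function `sum_abs_samples` and its parameters are hypothetical and are not part of FFmpeg; the point is only that the qualifier documents read-only input and lets the compiler reject accidental writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an FFmpeg function: sums the absolute values of
 * the input samples. The const qualifier documents that the function only
 * reads from `samples`, and the compiler rejects accidental writes. */
static int64_t sum_abs_samples(const int16_t *samples, size_t count)
{
    int64_t total = 0;

    assert(samples != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int v = samples[i];          /* widen first so -v cannot overflow */
        total += v < 0 ? -v : v;
        /* samples[i] = 0;  would not compile: the pointee is const */
    }
    return total;
}
```

A caller can pass either a const or a non-const buffer; the generated code is unchanged, which matches the commit's note that the change is for clarity only.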
Alex Converse
c226fc5bfb aacenc: Prevent premature termination of the two loop search.
Originally committed as revision 24476 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 04:23:26 +00:00
Alex Converse
81824fe059 aacdec: Only load and write each predictor variable once.
This is slightly faster and opens the door for further optimization.

Originally committed as revision 24475 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 02:57:08 +00:00
Alex Converse
70c99adb48 aacdec: 4% faster main profile decoding.
Originally committed as revision 24474 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 02:41:47 +00:00
Alex Converse
51ffd3a62f aacenc: Favor log2f() and sqrtf() over log2() and sqrt().
Originally committed as revision 24473 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 02:10:59 +00:00
Alex Converse
04d72abf17 aacenc: Factorize some scalefactor utilities.
Originally committed as revision 24472 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 22:37:42 +00:00
Eli Friedman
3611e7a309 Inline asm for VP56 arith coder
This is a much more reliable way to get cmov than trying to trick gcc into
generating it, which is useful since it's 2% faster overall.

Patch by Eli Friedman <eli.friedman at gmail>

Originally committed as revision 24471 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:30 +00:00
David Conrad
ca18a478e3 VP8: Inline traversing vp8_small_mvtree
Much faster read_mv_component, slightly faster overall

Originally committed as revision 24470 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:25 +00:00
David Conrad
7697cdcf95 VP8: Use vp56_rac_get_prob_branchy when the bit is only used by an if()
Originally committed as revision 24469 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:20 +00:00
David Conrad
fe1b5d974a Decode DCT tokens by branching to a different code path for each branch
on the huffman tree, instead of traversing the tree in a while loop.

Based on the similar optimization in libvpx's detokenize.c

10% faster at normal bitrates, and 30% faster for high-bitrate intra-only

Originally committed as revision 24468 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:17 +00:00
David Conrad
5474ec2ac8 Move renormalization of the VP56 arith decoder to before decoding a bit
No difference at the moment, but allows a future branchy variant
of vp56_rac_get_prob to be significantly faster

Originally committed as revision 24467 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:14 +00:00
David Conrad
b3d755ec8b Split renorm of vp56 arith decoder to its own function
Originally committed as revision 24466 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:08 +00:00
David Conrad
24675b8093 vp56's arith decoder's code_word is only 16 bits, no need for unsigned long
Originally committed as revision 24465 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:01 +00:00
Jason Garrett-Glaser
13a1304bb3 Add myself to VP8 copyright and maintainers.
Also add Ronald to maintainers.

Originally committed as revision 24464 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:42:35 +00:00
Jason Garrett-Glaser
414ac27d8f VP8: always_inline some things to force gcc to do the right thing
Mostly seems to help in the MC code, which gets a hundred cycles faster.

Originally committed as revision 24463 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:36:21 +00:00
Jason Garrett-Glaser
06d50ca804 VP8: use AV_RL24 instead of defining a new RL24.
Originally committed as revision 24462 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:17:18 +00:00
Jason Garrett-Glaser
9fddd14a8e VP8: Slightly faster MV selection
Don't clamp best mv unless it's actually used.

Originally committed as revision 24461 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 19:06:22 +00:00
Jason Garrett-Glaser
14767f35ed VP8: use AV_ZERO32 instead of AV_WN32A where relevant
Originally committed as revision 24460 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:42:19 +00:00
Jason Garrett-Glaser
09959ec46e VP8: eliminate redundant code in r24458
Originally committed as revision 24459 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:34:21 +00:00
Jason Garrett-Glaser
a71abb714e VP8: shave a few clocks off check_intra_pred_mode
Originally committed as revision 24458 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 10:24:38 +00:00
Jason Garrett-Glaser
0087aa47d0 VP8: fix broken sign bias code in MV pred
Apparently the official conformance test vectors don't test this feature,
even though libvpx uses it.

Originally committed as revision 24456 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 06:41:35 +00:00
Jason Garrett-Glaser
3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma.
Add MMX idct_dc_add4uv function for this case.
~40% faster chroma idct.

Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 06:02:52 +00:00
Jason Garrett-Glaser
3df56f4118 VP8: Clean up some variable shadowing.
Originally committed as revision 24454 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 03:44:37 +00:00
Jason Garrett-Glaser
51c9156438 VP8 asm: cosmetics (spacing)
Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 03:02:56 +00:00
Jason Garrett-Glaser
8a467b2d44 VP8: 30% faster idct_mb
Take shortcuts based on statistically common situations.
Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT
blocks are common.
TODO: tie this more directly into the MB mode, since the DC-level transform is
only used for non-splitmv blocks?

Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 02:58:27 +00:00
Jason Garrett-Glaser
ef38842f0b VP8: smarter prefetching
Don't prefetch reference frames that were used less than 1/32nd of the time so
far in the frame.
This speeds up decoding by up to ~2% on videos that, in many frames, make near-zero
(but not entirely zero) use of golden and/or alt-refs.
This is a very common property of videos encoded by libvpx.

Originally committed as revision 24451 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 01:59:56 +00:00
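The 1/32 usage threshold from the commit above can be sketched as follows. The names `NUM_REFS`, `uses` and `should_prefetch` are hypothetical, and the exact comparison used in the decoder is an assumption here; the sketch only shows one way to test "used at least 1/32 of the time so far" without a division.

```c
#include <assert.h>

enum { NUM_REFS = 3 };  /* hypothetical: previous, golden and alt-ref frames */

/* Returns 1 if reference `ref` accounts for at least 1/32 of all reference
 * uses counted so far in the frame, and 0 otherwise. */
static int should_prefetch(const int uses[NUM_REFS], int ref)
{
    int total = 0;

    assert(ref >= 0 && ref < NUM_REFS);
    for (int i = 0; i < NUM_REFS; i++)
        total += uses[i];

    /* uses[ref] / total >= 1/32, rewritten as a multiplication.
     * The extra check keeps a never-used reference from being prefetched
     * when nothing has been counted yet (total == 0). */
    return uses[ref] > 0 && uses[ref] * 32 >= total;
}
```

With counts of 100, 2 and 4, the second reference falls below the threshold (2 * 32 = 64 is less than 106) and is skipped, while the other two are still prefetched.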
Baptiste Coudurier
9479415e4e In the h264 parser, return immediately if buf_size is 0 to avoid
printing an erroneous message for the last frame.

Originally committed as revision 24450 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:34:09 +00:00
Jason Garrett-Glaser
c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks.
~0.3% faster overall.

Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:07:16 +00:00
Jason Garrett-Glaser
b74f70d646 VP8: avoid a memset for non-i4x4 blocks with no coefficients
Originally committed as revision 24447 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:05:44 +00:00
Jason Garrett-Glaser
145d31865d Get rid of more unnecessary dereferences in VP8 deblocking
Originally committed as revision 24446 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:11:40 +00:00
Jason Garrett-Glaser
867215336d Shut up an uninitialized variable GCC warning in VP8.
Originally committed as revision 24445 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:04:51 +00:00
Jason Garrett-Glaser
c4211046d2 Smarter VP8 prefetching
Prefetch all refs (including altref), but only if they've been used so far this
frame.
~2.5% faster overall.

TODO: Do something even smarter, like using how often each ref has been used
so far, so that a couple blocks of a rarely-used ref don't force us to prefetch
it.

Originally committed as revision 24444 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 23:03:08 +00:00
Jason Garrett-Glaser
8cfae560ad Fix stupid bug in VP8 prefetching code
Originally committed as revision 24443 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 22:15:43 +00:00
Jason Garrett-Glaser
2a38c2e99a Eliminate a LUT in escape decoding in VP8 decode_block_coeffs
Originally committed as revision 24441 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 22:08:09 +00:00
Jason Garrett-Glaser
d292c3455e Eliminate some repeated dereferences in VP8 inter_predict
Originally committed as revision 24438 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 21:05:30 +00:00
Ronald S. Bultje
dc5eec8085 Use pextrw for SSE4 mbedge filter result writing, a speedup of
5-10 cycles on CPUs supporting it.

Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 19:59:34 +00:00
James Zern
7eb185e0a3 Map settings for 2-pass libvpx encoding.
Patch by James Zern, jzern at google

Originally committed as revision 24430 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 12:35:32 +00:00
Jason Garrett-Glaser
b946111fde Eliminate a pointless memset for intra blocks in P-frames in VP8
Originally committed as revision 24429 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 12:15:29 +00:00
Jason Garrett-Glaser
b9a7186bf4 VP8: Don't store segment in macroblock struct anymore.
Not necessary with the previous patch.

Originally committed as revision 24427 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 11:55:55 +00:00
Jason Garrett-Glaser
c55e0d34ba Convert VP8 macroblock structures to a ring buffer.
Uses a slightly nonintuitive ring buffer size of (width+height*2) to simplify
addressing logic.
Also split out the segmentation map to a separate structure, necessary to
implement the ring buffer.

Originally committed as revision 24426 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 11:45:18 +00:00
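The commit message does not spell out the addressing scheme, so the following is only one layout that is consistent with a buffer of (width + height*2) entries; it should be read as an assumption, not as the decoder's actual code. The function names are hypothetical. Each macroblock row starts two entries before the previous row, so the neighbors of a macroblock sit at fixed offsets and no modulo arithmetic is needed.

```c
#include <assert.h>

/* Hypothetical: number of entries in the ring, in macroblocks. */
static int mb_ring_size(int mb_width, int mb_height)
{
    return mb_width + mb_height * 2;
}

/* Hypothetical: index of macroblock (mb_x, mb_y) in the ring.
 * Row 0 starts at the highest offset; each later row starts 2 entries
 * earlier. For a macroblock at index i this gives:
 *   left neighbor      (mb_x - 1, mb_y)     at i - 1
 *   top neighbor       (mb_x,     mb_y - 1) at i + 2
 *   top-left neighbor  (mb_x - 1, mb_y - 1) at i + 1 */
static int mb_ring_index(int mb_height, int mb_x, int mb_y)
{
    assert(mb_x >= 0 && mb_y >= 0 && mb_y < mb_height);
    return (mb_height - 1 - mb_y) * 2 + mb_x;
}
```

Under this layout, writing macroblock x of a row overwrites the entry that held macroblock x - 2 of the row above, which is no longer needed once the top-left neighbor has been read.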
Jason Garrett-Glaser
968570d65f Calculate deblock strength per-MB instead of per-row
Gives better cache locality, since the VP8Macroblock structs are still in cache.
Inspired by the way x264 does it.

Originally committed as revision 24417 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 07:24:22 +00:00
Jason Garrett-Glaser
d1c58fce20 Avoid tracking i4x4 modes in P-frames in VP8
As in the previous commit, they aren't used for context selection, so it saves
memory this way.

Originally committed as revision 24416 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 07:04:45 +00:00
Jason Garrett-Glaser
158e062c95 Avoid useless fill_rectangle in P-frames in VP8
In VP8, i4x4 only uses contexts based on neighbors in I-frames.

Originally committed as revision 24415 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 06:39:54 +00:00
Jason Garrett-Glaser
7bf254c41d Optimize partition mv decoding in VP8
Originally committed as revision 24414 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 06:29:26 +00:00
Jason Garrett-Glaser
c0498b3031 Take shortcuts for mv0 case in VP8 MC
Avoid edge emulation -- it isn't needed if there isn't any subpel.

Originally committed as revision 24413 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 05:49:09 +00:00
Jason Garrett-Glaser
702e8d3376 Much faster VP8 mv and mode prediction
Originally committed as revision 24412 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 04:26:41 +00:00
Jason Garrett-Glaser
d229ae2b62 Convert vp56_mv to 16-bit.
Saves nothing except a bit of memory/cache now, but will allow future
optimizations.

Originally committed as revision 24411 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 03:33:29 +00:00
Jason Garrett-Glaser
d864dee8ab Add prefetching to VP8 decoder
~5% faster overall, probably depends on CPU and resolution.

Originally committed as revision 24410 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 03:09:10 +00:00
Ronald S. Bultje
003243c3c2 Fix and enable horizontal >=SSE2 mbedge loopfilter.
Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 01:35:26 +00:00