Commit Graph

1551 Commits

Author SHA1 Message Date
James Berry
bc7151131d Fix: check cx_data buffer prior to write
check to make sure that cx_data buffer has enough room before
writting to it, prior behavior did not which could result in a crash.

Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28
2011-10-20 15:55:00 -04:00
Johann
7cdc986cdf Don't copy borders for loop_filter_pick
During the _pick only the Y plane is examined. In addition, data beyond
the borders of the frame is not read.

Change-Id: Ic549adfca70fc6e0b55f8aab0efe81f0afac89f9
2011-10-19 18:54:14 -07:00
Johann
c6fb97a376 Merge "Fix: NEON copy/extend frame for small sizes" 2011-10-19 14:37:48 -07:00
Johann
f382173225 Merge "enc: save entropy probs only when needed for refresh" 2011-10-19 14:36:29 -07:00
Scott LaVarnway
5e54085703 Improved token decoder
Tests showed over 2% improvement on various HD clips.

Change-Id: I94a30d209c92cbd5fef285122f9fc570688635fe
2011-10-19 13:38:35 -04:00
Scott LaVarnway
63a77cbed9 Merge "Remove usage of predict buffer for decode" 2011-10-19 10:24:48 -07:00
James Zern
efa17efced vpxenc: fix rollover in status output
sizeof(unsigned long)=4 in 32-bit builds

Change-Id: I81c9d698c80ffaa332214e5b43e98b4e30cf9e88
2011-10-18 17:34:10 -07:00
Scott LaVarnway
ed9c66f584 Remove usage of predict buffer for decode
Instead of using the predict buffer, the decoder now writes
the predictor into the recon buffer.  For blocks with eob=0,
unnecessary idcts can be eliminated.  This gave a performance
boost of ~1.8% for the HD clips used.

Tero: Added needed changes to ARM side and scheduled some
      assembly code to prevent interlocks.

Patch Set 6:  Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3)
into this commit because of similarities in the idct
functions.
Patch Set 7: EC bug fix.

Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
2011-10-18 12:06:50 -04:00
Johann
8d00562fba Merge "Fix: vp8cx_pack_tokens_into_partitions_armv5 crash" 2011-10-17 09:21:31 -07:00
Attila Nagy
664d9921b7 Fix: NEON copy/extend frame for small sizes
NEON version of copyframeyonly, extendframeborders, copy_frame_func were
not working for plane stride < 128 and/or y_width < 128.

Change-Id: Id6c2e6c795274da0c90134b15c0d5f62d1b17a6c
2011-10-17 14:42:37 +03:00
Johann
4341e5af66 add 32bit darwin10 (10.6) target
Change-Id: Id1c189350d54919be37f864dae91dee37584945a
2011-10-14 12:06:24 -07:00
Johann
2f9e51b8c9 allow building for older platforms
Change-Id: Ibbd05e981debee12c16ebcd274150cd75a94a69d
2011-10-14 12:03:32 -07:00
John Koleszar
2d5c7f6740 Merge "vpxdec updated to use !feof() instead of *buf_sz in readframe()" 2011-10-14 07:56:12 -07:00
Attila Nagy
a5cd42feb9 Fix: vp8cx_pack_tokens_into_partitions_armv5 crash
It was crashing when number of partitions was bigger than the number
of MB rows (ex. 128x96 with 8 partitions).
Start point was not checked against mb_rows, plus extra
"empty" partitions were not written out.

Change-Id: I9c2f013b9ec022354b658fab4ef799ff8b1de93d
2011-10-14 10:53:04 +03:00
John Koleszar
6505adf271 Bump ABI version number for temporal scalability
Commit 217591f modified the encoder ABI without incrementing the version number.

Change-Id: I74de01597dadcdcd96f6b817e4ec69d9ab535e4c
2011-10-11 12:57:17 -07:00
Adrian Grange
04182a121a Merge "Added rate-targeted temporal scalability" 2011-10-11 12:54:52 -07:00
Adrian Grange
217591fde5 Added rate-targeted temporal scalability
Added the ability to create rate-targeted, temporally
scalable, VP8 compatible bitstreams.

The application vp8_scalable_patterns.c demonstrates how
to use this capability. Users can create output bitstreams
containing upto 5 temporally separable streams encoded
as a single VP8 bitstream.
(previously abandoned as:
I92d1483e887adb274d07ce9e567e4d0314881b0a)

Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a
2011-10-11 12:49:12 -07:00
John Koleszar
07ba411914 Reset FPU state after calc_plane_error()
Fixes a MMX/SSE2 mismatch when building with --enable-internal-stats.

Change-Id: I0c50a1f246f6916b7a5fc6f36864ceb362f25520
2011-10-11 08:43:30 -07:00
James Berry
05bde9d4a4 bug fix - starting/optimal/max and buffer_level changed from int to int64_t
buffer_level in VP8_COMP and starting_buffer_level, optimal_buffer_level
and maximum_buffer_size in VP8_CONFIG changed from int to int64_t
to avoid potential crash issues for larger target bit rates.

Change-Id: I0d5ab6c8a44c2fef51f30cd8df4bb4b739c5df26
2011-10-10 12:16:55 -04:00
Attila Nagy
c0de35b413 enc: save entropy probs only when needed for refresh
Previous entropy probs need to be saved (and restored) only when
current updates are not propagated.

Change-Id: Ie6ee0543066e30874e56258be0a6b7d2dd2fdb2b
2011-10-10 13:44:54 +03:00
Scott LaVarnway
af12c23e8e Merge "Improved tokenize" 2011-10-04 09:57:42 -07:00
John Koleszar
8f8b526b54 Merge "Fix uninitialized new_mv_count in first pass file" 2011-10-04 07:40:49 -07:00
Yunqing Wang
538865dfa5 Merge "Multithreaded encoder, late sync loopfilter" 2011-10-04 07:04:30 -07:00
John Koleszar
86712c50f2 Fix uninitialized new_mv_count in first pass file
Uninitialized data could be written to the first pass file when no
motion vectors are present in the frame.

Also fix a number of compiler warnings.

Change-Id: Icc9f53b6d33da9de4563d86d9fd591910473ea90
2011-10-04 09:50:52 -04:00
John Koleszar
016a38be93 Merge "makefile: fix target 'all'" 2011-10-03 08:31:13 -07:00
Johann
2aa408524c Merge "Reduce computational complexity of generic C loop filter." 2011-09-30 16:17:56 -07:00
Johann
48b1917112 Merge "combine loopfilter data access" 2011-09-30 15:47:56 -07:00
Scott LaVarnway
ab00d209bc Improved tokenize
For a realtime HD encodings, up to 1.6% gains seen.



Change-Id: If45028e23db95124da63f9d38ffe06e05596cc6e
2011-09-30 12:49:46 -04:00
Johann
3556deaca3 combine loopfilter data access
The data processed by the loopfilter overlaps. At the block level, this
results in some redundant transforms. Grouping the filtering allows for
a single 16x16 transpose (and inversion) instead of three 16x8 transposes
(and three more inversions).

This implementation is x86_64 only. We retain the previous
implementation for x86.

Improvements are obviously material dependant, but it seems to be ~%1 in
tests here.

Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007
2011-09-30 07:38:35 -07:00
Alpha Lam
7bce513afe Call vp8_find_near_mvs lazily
vp8_find_near_mvs() is being called on all possible reference frames
but the data computed may be used if the loop exits early, which can
be due to x->skip beign set to 1.

Optimize this by call vp8_find_near_mvs() laziy only if it is going
to be used and not computed yet.

Change-Id: Iccdbd4c962a670c9f2c99b8aca8096042ca5dc98
2011-09-30 14:48:18 +01:00
Paul Wilkins
a572ac8327 Merge "CQ and two pass rate control." 2011-09-30 02:57:54 -07:00
Paul Wilkins
b6e27d5f0b CQ and two pass rate control.
Changes to the selection of Q limits for two pass
and two pass CQ mode.

Allowance made for Mode and motion vector costs.
Some refactoring of common code.

For Derf and YT sets CQ mode average improvement
circa 1% (SSIM and Global PSNR).

Some increased tendency to undershoot even when
user CQ not reached.

Patch2: Removed some test code accidentally merged.

Change-Id: Icf74d13af77437c08602571dc7a97e747cce5066
2011-09-30 10:55:52 +01:00
Aaron Watry
69aa303d96 Reduce computational complexity of generic C loop filter.
Change-Id: I1e7f9ed3cd907844a495b9e0073bc140b87e5c06
2011-09-29 17:25:48 -05:00
John Koleszar
22ea8592c1 makefile: fix target 'all'
'all' is the conventional target for building everything in the
makefile, but the child make was expecting all-$(target), for debugging
reasons that I don't recall exactly. Restore the expected behavior.

Change-Id: Ifbb03610b55be679ce7c5e210b7a69a156bb76b9
2011-09-29 09:14:44 -04:00
Attila Nagy
380d64ecb1 Multithreaded encoder, late sync loopfilter
Sync with loopfilter thread just at the beginning of next frame encoding.
This returns control to application faster and allows a better multicore scaling.
When PSNR packets are generated the final filtered frame is needed imediatly
so we cannot delay the sync.

Change-Id: I288d97b5e331d41d6f5bb49d97986fa12ac6f066
2011-09-29 10:06:24 +03:00
James Berry
9b85079342 vpxdec updated to use !feof() instead of *buf_sz in readframe()
For partial droped frames using *buf_sz could incorrectly terminate
a decode.

Change-Id: Id4a1166fa9ae6c0aa7e9f214bfa4c0be0ea82c1c
2011-09-22 15:03:28 -04:00
John Koleszar
6f9457ec12 Merge "clamp_mvs() using the wrong motion vector information" 2011-09-22 11:54:15 -07:00
John Koleszar
3c85c532bb Merge changes Ie650e9b8,I2427e494
* changes:
  vpxenc: get version string programatically
  Install missing default_coef_probs.h
2011-09-22 11:18:00 -07:00
John Koleszar
fa3e530bca vpxenc: get version string programatically
To avoid a dependency on vpx_version.h, call the vpx_codec_version_str()
function and build up the string manually.

Change-Id: Ie650e9b8f2aaaffaa31da5e9ef3b566b972321b4
2011-09-22 13:25:34 -04:00
Johann
9f41a8b0aa Merge "Replace vpx_ports/config.h with vpx_config.h" 2011-09-22 09:30:18 -07:00
John Koleszar
4a6ac727fe Install missing default_coef_probs.h
Make sure that this header is listed as one of the sources, so that it
will be installed if necessary.

Change-Id: I2427e494488126b179151dc21043c1e2c8ba5991
2011-09-22 11:08:24 -04:00
Attila Nagy
1a7d25a484 Replace vpx_ports/config.h with vpx_config.h
Just a clean-up.

Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a
2011-09-22 13:33:54 +03:00
Fritz Koenig
17c754fc00 Reduce grep match when generating offset files.
Search for the word EQU so that extraneous
symbols are not matched.

Change-Id: Ice6c9ca886211e2ca8a2f5174bdd4103db5c4989
2011-09-20 15:36:44 -07:00
Fritz Koenig
bd0c3409a8 Move neon only arm functions under arm/neon.
These files don't contain generic arm code, so should
only be compiled by neon.

Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564
2011-09-20 10:51:06 -07:00
Johann
6829e62718 Merge "NEON FDCT updated to match current C code" 2011-09-20 09:51:05 -07:00
Johann
86e07525d5 Merge "NEON walsh transform updated to match C" 2011-09-20 09:50:42 -07:00
Johann
3a16276cf7 Merge "Updated ARMv6 forward transforms to match C" 2011-09-20 09:50:36 -07:00
Johann
fdd51829b1 Merge "Fixed armv5te multiplications" 2011-09-20 09:50:19 -07:00
Tero Rintaluoma
0c2529a812 NEON FDCT updated to match current C code
- Removed fast_fdct4x4_neon and fast_fdct8x4_neon
- Uses now short_fdct4x4 and short_fdct8x4
- Gives ~1-2% speed-up on Cortex-A8/A9

Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec
2011-09-20 10:20:55 +03:00
Tero Rintaluoma
3c19bc3fb3 Fixed armv5te multiplications
Rd and Rm registers should be different in 'mul'. This register
combination results in unpredictable behaviour. GCC will give
a warning and RVCT an error in this case.

Restriction applies only to armv5 targets and not for armv6 and above.

Change-Id: I378d17c51e1f16a6820814fbed43e115aaabb03e
2011-09-20 09:59:27 +03:00