Compare commits

..

194 Commits

Author SHA1 Message Date
John Koleszar
297dc90255 Update CHANGELOG for v1.1.0 (Eider) release
Change-Id: Ic429556e76bcc4f96a34e18835a153b07fe410a2
2012-05-08 16:14:00 -07:00
John Koleszar
499510f2be Update AUTHORS
Change-Id: Id9dd1802ae597a39eb46d2cbe531d5b04bd0a1c5
2012-05-08 15:01:35 -07:00
John Koleszar
2d6cb342c2 Update .mailmap
Change-Id: If3a9958f6e2a466c631972316c3a49c253cbd9c2
2012-05-08 15:01:35 -07:00
John Koleszar
c8f4c187b3 Use consistent range for VP8E_SET_NOISE_SENSITIVITY
Accept the same range of inputs for the VP8E_SET_NOISE_SENSITIVITY
control, regardless of whether temporal denoising is enabled or not.
This is important for maintaining compatibility with existing
applications.

Change-Id: I94cd4bb09bf7c803516701a394cf1a63bfec0097
2012-05-08 15:01:24 -07:00
John Koleszar
14d827f44e fix vp8_ namespace issues
Make functions only referenced from one translation unit static. Other
symbols with extern linkage get a vp8/vpx prefix.

Change-Id: I928c7e0d0d36e89ac78cb54ff8bb28748727834f
2012-05-04 12:24:04 -07:00
John Koleszar
22f56b93e5 Formalize encodeframe.c forward delclarations
Change If4321cc5 fixed a bug caused by forward declarations not being
kept in sync across C files, resulting in a function call with the
wrong arguments. The commit moves the affected function declarations
into a header file, along with the other symbols from encodeframe.c
that were being sloppily shared.

Change-Id: I76a7b4c66d4fe175f9cbef7e52148655e4bb9ba1
2012-05-04 10:44:47 -07:00
Attila Nagy
3e32105d63 Fix multi-resolution threaded encoding
mb_row and mb_col was not passed to vp8cx_encode_inter_macroblock in
threaded encoding.

Change-Id: If4321cc59bf91e991aa31e772f882ed5f2bbb201
2012-05-04 10:44:46 -07:00
John Koleszar
2bf8fb5889 remove deprecated pre-v0.9.0 API
Remove a bunch of compatibility code dating back to before the initial
libvpx release.

Change-Id: Ie50b81e7d665955bec3d692cd6521c9583e85ca3
2012-05-04 10:44:46 -07:00
Attila Nagy
f039a85fd8 Make global data const
Removes all runtime initialization of global data. This commit is a
squashed version of the following series cherry-picked from master.
This is necessary because of a change that was merged to the tester
that depends on the scaler being moved to the RTCD framework, which
is a worthwhile thing to include in Eider anyway.

  - a91b42f02 Makes all global data in entropy.c const
  - b35a0db0e Makes all global data in tokenize.c const
  - 441cac8ea Makes all mode token tables const
  - 5948a0210 Ports vpx_xcaler to new RTCD method
  - 317d4244c Makes all mode token tables const part 2

Change-Id: Ifeaea24df2b731e7c509fa6c6ef6891a374afc26
2012-05-04 10:42:21 -07:00
John Koleszar
9f9cc8fe71 Merge "Add VPX_TS_ prefix to MAX_LAYERS, MAX_PERIODICITY" into eider 2012-05-03 09:40:50 -07:00
John Koleszar
d8216b19b6 Merge "Fix compiler warnings" into eider 2012-05-02 16:22:34 -07:00
John Koleszar
d46ddd0839 Add VPX_TS_ prefix to MAX_LAYERS, MAX_PERIODICITY
Preserved the prior names for compatibility, will remove in the future.

Change-Id: I8773f959ebce72f60168a2972f7a8ffe6642b9b2
2012-05-02 16:21:52 -07:00
Timothy B. Terriberry
8b1a14d12f Add support for native Solaris compiler on x86.
Original patch by Ginn Chen <ginn.chen@oracle.com> against libvpx
 v0.9.0.
I've forward-ported it to the current version (which mostly
 involved removing hunks that were no longer relevant), since I've
 given up on getting Ginn to submit this upstream himself.

Change-Id: I403c757c831c78d820ebcfe417e717b470a1d022
2012-05-02 10:36:17 -07:00
Timothy B. Terriberry
e50c842755 Fix TEXTRELs in the ARM asm.
Besides imposing a performance penalty at startup in most
 configurations, these relocations break the dynamic linker for
 native Fennec, since it does not support them at all.

Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de
2012-05-02 10:36:01 -07:00
Timothy B. Terriberry
22ae1403e9 Fix trailing commas in enums.
These are warnings in most builds, but show up as compile errors on
 some platforms when these headers are included from C++ code.

Change-Id: I6c523b4dbbc699075fe73830442b51922e5a61d5
2012-05-02 10:35:28 -07:00
Attila Nagy
14c9fce8e4 Fix compiler warnings
Fix code for following warnings:
-Wimplicit-function-declaration
-Wuninitialized
-Wunused-but-set-variable
-Wunused-variable

Change-Id: I2be434f22fdecb903198e8b0711255b4c1a2947a
2012-05-02 10:57:57 +03:00
Johann
f2a6799cc9 Merge "Update paths for iOS 5.1" into eider 2012-05-01 10:13:03 -07:00
Johann
e918ed98d4 Update paths for iOS 5.1
These values can be overridden with some poorly documented and
overloaded options: --libc and --sdk-path

../libvpx/configure --target=armv7-darwin-gcc --sdk-path=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer --libc=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.1.sdk/

So for someone who still wants to build with the iOS 5 SDK, the last
part of the path should be iPhoneOS5.0.sdk

Change-Id: Ibe93d96ae828c619700dc3222983aa4c30456b88
2012-04-30 15:04:41 -07:00
Johann
e5cef5d5a6 Merge "Add target for OS X 10.8 Mountain Lion" into eider 2012-04-30 10:23:59 -07:00
Adrian Grange
faed00d844 Reset output frames counter for second pass
The frame counter was not being reset at the start of
the first pass.

Change-Id: I2ef7c6edf027e43f83f470c52cbcf95bf152e430
2012-04-27 15:14:50 -07:00
Johann
101c2bd1fa Add target for OS X 10.8 Mountain Lion
Also clarify universal build rules

Change-Id: I3b7352f81d5d5b3472420e89872038377c5c2697
2012-04-27 13:35:27 -07:00
John Koleszar
60b36abf85 Merge "Fix loopfilter race condition in multithreaded encoder" into eider 2012-04-26 16:07:20 -07:00
Ralph Giles
061a16d96e Have vpxenc use a default kf_max_rate of 5 seconds.
Rather than using the static default maximum keyframe spacing
provided by vpx_codec_enc_config_default() set the default
value to 5 times the frame rate. Five seconds is too long for
live streaming applications, but is a compromise between seek
efficiency and giving the encoder freedom to choose keyframe
locations.

The five second value is from James Zern's suggestion in
http://article.gmane.org/gmane.comp.multimedia.webm.user/2945

Change-Id: Ib7274dc248589c433c06e68ca07232e97f7ce17f
2012-04-25 17:11:05 -07:00
John Koleszar
1b27e93cd1 vpxenc: validate rational arguments
Trap negative values and zero denominators at the point where they're
parsed.

Change-Id: I1ec9da5d4e95d3ef539860883041330ecec2f345
2012-04-25 17:11:05 -07:00
Attila Nagy
3939e85b26 Fix loopfilter race condition in multithreaded encoder
Race was introduced by https://gerrit.chromium.org/gerrit/15563.

If loopfilter related config params were changed between frames, or
after a KEY frame, there could be a mismatch between encoder's and
decoder's recontructed image. In worst case, when frame buffers are
reallocated because of a size change, the loopfilter could
do an invalid data access (segmentation fault).

Fixes:
Sync with the loopfilter before applying any encoder changes in
vp8_change_config().
Moved the loopfilter synching to the top of
encode_frame_to_data_rate() so that it's done before any alteration of
the encoder.

Change-Id: Ide5245d2a2aeed78752de750c0110bc4b46f5b7b
2012-04-25 14:26:02 +03:00
John Koleszar
dba053898a Merge "Use LIBSUBDIR for vpx.pc." into eider 2012-04-23 12:11:08 -07:00
John Koleszar
504601bb14 Merge "multi-res: restore v1.0.0 API" into eider 2012-04-23 12:07:18 -07:00
Takanori MATSUURA
a7eea3e267 Use LIBSUBDIR for vpx.pc.
Change-Id: Ibba635696e8c23086e5d3c0e8452ab97c460d7d7
2012-04-20 15:11:46 -07:00
John Koleszar
d72c536ede multi-res: restore v1.0.0 API
Move the notion of 0 bitrate implying skip deeper into the codec,
rather than doing it at the multi-encoder API level. This preserves
v1.0.0 ABI compatibility, rather than forcing a bump to v2.0.0 over a
minor change. Also, this allows the case where the application can
selectively enable and disable the larger resolution(s) without having
to reinitialize the codec instace (for instance, if no target is
receiving the full resolution stream).

It's not clear how deep to push this check. It may be valuable to
allow the framerate adaptation code to run, for example. Currently put
the check as early as possible for simplicity, should reevaluate this
as this feature gains real use.

Change-Id: I371709b8c6b52185a1c71a166a131ecc244582f0
2012-04-20 11:39:42 -07:00
John Koleszar
8e858f90f3 vp8_change_config: don't force kf with spatial resampling
Look for changes in the codec's configured w/h instead of its active
w/h when forcing keyframes. Otherwise calls to vp8_change_config()
will force a keyframe when spatial resampling is active.

Change-Id: Ie0d20e70507004e714ad40b640aa5b434251eb32
2012-04-20 11:09:12 -07:00
John Koleszar
c311b3b3a9 rtcd: serialize function pointer initialization
Ensure that RTCD function pointers are set at most once, to silence
some data race warnings. Implementation provided for POSIX threads and
Win32, with the prior unsynchronized behavior left in place for other
platforms.

Change-Id: I65c5856df43ef67043b3d5f26ddafddd8fcb2f7e
2012-04-19 14:15:23 -07:00
John Koleszar
21173e1999 correct alt-ref contribution to frame rate
When producing an invisible ARF, the time stamp counters aren't
updated since the last time stamp is seen by the codec twice. The
prior code was trapping this case with refresh_alt_ref, but this isn't
correct for other uses of the ARF. Instead, use the show_frame flag.

Change-Id: If67fff7c6c66a3606698e34e2fb5731f56b4a223
2012-04-16 12:23:33 -07:00
Johann
b5b61c179d FTFY: Check for astyle and version
Change-Id: I377387681332cfc975254cd825e4ad2998271690
2012-04-12 16:36:22 -07:00
Johann
0c261715b0 Merge "Use OFFSET_PATTERN from libs.mk" 2012-04-12 12:15:04 -07:00
Scott LaVarnway
3c5ed6f52e Merge "MB_MODE_INFO size reduction" 2012-04-12 12:03:45 -07:00
Scott LaVarnway
6dc21bce63 Merge "loopfilter improvements" 2012-04-12 12:02:24 -07:00
John Koleszar
87b12ac875 Merge "FTFY: fix syntax error" 2012-04-12 11:48:57 -07:00
Johann
72b7db36f3 Use OFFSET_PATTERN from libs.mk
Forestall possible issues with -ggdb3
https://gerrit.chromium.org/gerrit/16160
https://trac.macports.org/ticket/33285

Change-Id: Ied274f70004709800576a803afa91e1b0f6eb02b
2012-04-12 11:33:09 -07:00
Scott LaVarnway
e0a80519c7 loopfilter improvements
Local variable offsets are now consistent for the functions,
removed unused parameters, reworked the assembly to eliminate
stalls/instructions.

Change-Id: Iaa37668f8a9bb8754df435f6a51c3a08d547f879
2012-04-12 14:22:47 -04:00
John Koleszar
46da1cae05 FTFY: fix syntax error
Change-Id: I1952608479954c07f3556f96ea3de9118216bf27
2012-04-12 11:19:00 -07:00
Attila Nagy
e4dc2b9248 Fix vpx_rtcd.h dependency in Android.mk
Failed to build on Linux (as described in Android.mk) with NDK r7b.
Set vpx_rtcd.h dependency after libvpx sources are added to
LOCAL_SRC_FILES so that vpx_rtcd.h is generated before any libvpx file
is touched.

Change-Id: Ibe19d485ca9f679dc084044df0e3fb14587c4d3e
2012-04-12 10:15:53 +03:00
Deb Mukherjee
6b33ca395f Fixes to disable MFQE when there is motion.
This patch includes:
1. fixes to disable block based termporal mixing when motion
is detected (because this version of mfqe only handles zero motion).
2. The criterion used for determining whether to mix or
not are changed to use squared differences rather than
absolute differences.
3. Additional checks on color mismatch and excessive block
flatness added. If the block as decoded has very low activity
it is unlikely to yield benefits for mixing.

Change-Id: I07331e5ab5ba64844c56e84b1a4b7de823eac6cb
2012-04-10 14:27:28 -07:00
Debargha Mukherjee
9aa58f3fcb Merge "MFQE: apply threshold to subblocks and chroma." 2012-04-10 13:29:27 -07:00
John Koleszar
d9ca52452b FTFY: only apply on modified files
Ignore renamed, copied, and deleted files when applying the style
rules.

Change-Id: I6102e34f833e5c2ef7a88d6d57bbfdca51b25d94
2012-04-03 17:42:52 -07:00
John Koleszar
8106df8f5a MFQE: apply threshold to subblocks and chroma.
In cases where you have a flat background occluded by a moving object
of similar luminosity in the foreground, it was likely that the
foreground blocks would persist for a few frames after the background
is uncovered. This is particularly noticable when the object has a
different color than the background, so add the chroma planes in as an
additional check.

In addition, for block sizes of 8 and 16, the luma threshold is
applied on four subblocks independently, which helps when only part of
the background in the block has been uncovered.

This fixes issue #392, which includes a test clip to reproduce the
issue.

BUG=392

Change-Id: I2bd7b2b0e25e912dcac342e5ad6e8914f5afd302
2012-04-03 12:05:01 -07:00
Johann
c459d37c26 Allow disabling disabled codecs
When using 'make dist' after --disable-vp8[encoder|decoder] it would
fail to recognize the option. This would only occur when also specifying
--enable-install-docs and --enable-install-srcs but not
--enable-codec-srcs

Including vpx/ fixes builds with --enable-codec-srcs

vpx_timer.h is also required for vpxenc.c

Change-Id: Ie3e28b2f7ec7ee6d5961d3843f9eab869f79c35b
2012-04-02 15:57:28 -07:00
Johann
811d0ff209 Merge "Move variance and SAD RTCD definitions" 2012-04-02 11:31:06 -07:00
Johann
21ac3c8f26 Move variance and SAD RTCD definitions
When the functions were moved from encoder/ to common/ the RTCD file was
not updated.

Change-Id: I1c98715ed51adf1a95aa2492949d8552aec88d1f
2012-04-02 11:06:35 -07:00
Jim Bankoski
4ed05a8bb1 Merge "vp8 - compatibility warning added to changelog" 2012-04-02 11:05:38 -07:00
James Zern
00794a93ec tools/wrap-commit-msg.py: fix file truncation
truncate() operates from the current file pointer position. On at least
Linux specifying 0 without resetting the pointer will pad the file with
zeros to the current offset.

Change-Id: Ide704a1097f46c0c530f27212bb12e923f93e2d6
2012-03-29 18:13:27 -07:00
John Koleszar
8b15cf6929 Merge "remove unused BOOL_CODER::value" 2012-03-29 14:03:07 -07:00
John Koleszar
3df4436e1a Merge "FTFY: support wordwrapping commit messages" 2012-03-29 14:02:48 -07:00
John Koleszar
a46ec16569 FTFY: support wordwrapping commit messages
It's common for commit messages to be wrapped at odd places. git-gui
is often to blame. Adds support for automatically fixing up these
messages if running ftfy --amend, and adds a new option --msg-only for
fixing only the commit message.

Change-Id: Ia7ea529f8cb7395d34d9b39f1192598e9a1e315b
2012-03-29 14:01:51 -07:00
John Koleszar
3f8349467a remove unused BOOL_CODER::value
Change-Id: Ic7782707afed38c3ec7e996a4a11dc2d55226691
2012-03-29 13:56:48 -07:00
Scott LaVarnway
31322c5faa MB_MODE_INFO size reduction
Reduced the size of the struct by 8 bytes, which would be
a memory savings of 64800 bytes for 1080 resolutions.  Had
an extra byte, so created an is_4x4 for B_PRED or SPLITMV
modes.  This simplified the mode checks in
vp8_reset_mb_tokens_context and vp8_decode_mb_tokens.

Change-Id: Ibec27784139abdc34d4d01f73c09f43e9e10e0f5
2012-03-29 16:30:14 -04:00
Scott LaVarnway
a337725625 Updated vp8_build_intra_predictors_mby_s(sse2/ssse3)
to work with the latest code.

Patch Set 2: aligned the above_row buffers to fix crash

Change-Id: I7a6992a20ed079ccd302f8c26215cf3057f8b70c
2012-03-29 14:24:53 -04:00
Scott LaVarnway
b3151c80fc Merge "Updated vp8_build_intra_predictors_mbuv_s(sse2/ssse3)" 2012-03-29 11:22:22 -07:00
Scott LaVarnway
0799cccce3 Merge "Removed duplicate vp8_build_intra_predictors_mb y/uv" 2012-03-29 11:22:04 -07:00
John Koleszar
c88fc5b2f9 Merge "FTFY: an automated style corrector" 2012-03-28 17:47:24 -07:00
John Koleszar
cb265a497d FTFY: an automated style corrector
This is a utility for applying a limited amount of style correction on
a change-by-change basis. Rather than a big-bang reformatting, this
tool attempts to only correct the style in diff hunks that you touch.
This should make the cosmetic changes small enough that we can mix them
with functional changes without destroying the diffs, and there's an
escape hatch for separating the reformatting to a second commit for
purists and cases where it hurts readability.

At this time, the script requires a clean working tree, so run it after
you've commited your changes. Run without arguments, the style
corrections will be applied and left unstaged in your working copy. It
also supports the --amend option, which will automatically amend your
HEAD with the corrected style, and --commit, which will create a new
change dependent on your HEAD that contains only the whitespace changes.

There are a number of ways this could be applied in an automated manner
if this proves to be useful, either on a project-wide or per-user
basis. This doesn't buy anything in terms of real code quality, the
intent here would be to keep formatting nits out of review comments in
favor of more meaningful ones and help people whose habitual style
doesn't match the baseline.

Requires astyle[1] 1.24 or newer.
  [1]: http://astyle.sourceforge.net/

Change-Id: I2fb3434de8479655e9811f094029bb90e5d757e1
2012-03-28 14:05:56 -07:00
Scott LaVarnway
ccea000c4b Updated vp8_build_intra_predictors_mbuv_s(sse2/ssse3)
to work with the latest code.

Change-Id: Ie382bb55d00ea5929bdadba859eea15f696d4cd9
2012-03-26 13:40:14 -04:00
Scott LaVarnway
403966ae00 Removed duplicate vp8_build_intra_predictors_mb y/uv
Added y/uv stride as a parameter and remove the
duplicate code.

Change-Id: I019117a9dd9659a09d3d4e845d4814d3f33341b5
2012-03-26 13:40:14 -04:00
John Koleszar
bf2c903000 Merge "bug fix: fix mem leak error in vpxenc" 2012-03-26 10:15:56 -07:00
James Berry
85ef83a173 bug fix: fix mem leak error in vpxenc
fixes memory leak bug in vpxenc.

Change-Id: I3933026d16177947576c61ebf58f8c58147e4ba0
2012-03-26 12:55:00 -04:00
Jim Bankoski
1991123054 vp8 - compatibility warning added to changelog
Change-Id: Iac0daecfc7c8393cb4c798ca43b7fe300f56e55f
2012-03-23 13:59:59 -07:00
Scott LaVarnway
9e9f5f3d70 New vp8_decode_mb_tokens()
This new vp8_decode_mb_tokens() uses a modified version of
WebP's GetCoeffs function.  For now, the dequant does not
occur in GetCoeffs.
Tests showed performance improvements up to 2.5% depending
on material.

Change-Id: Ia24d78627e16ffee5eb4d777ee8379a9270f07c5
2012-03-23 15:50:08 -04:00
Deb Mukherjee
06dc2f6166 Initialize postproc buffer to resolve valgrind warnings
Change-Id: I9a7d40b0eac7200796dbe62e75776b2eb77dfdf6
2012-03-22 17:42:41 -07:00
Deb Mukherjee
66ba79f5fb Miscellaneous changes in mfqe and postproc modules
Adds logic to disable mfqe for the first frame after a configuration
change such as change in resolution. Also adds some missing
if CONFIG_POSTPROC macro checks.

Change-Id: If29053dad50b676bd29189ab7f9fe250eb5d30b3
2012-03-22 09:55:07 -07:00
James Berry
fd9df44a05 Merge "remove __inline for compiler compatibility" 2012-03-21 12:37:17 -07:00
James Berry
3c021e1d79 Merge "bug fix: remove inline from mfqe.c" 2012-03-21 12:36:40 -07:00
James Berry
451ab0c01e remove __inline for compiler compatibility
__inline removed for broader compiler compatibility

Change-Id: I6f2b218dfc808b73212bbb90c69e2b6cc1fa90ce
2012-03-21 14:11:10 -04:00
Yunqing Wang
bcee56bed5 Minor fix: add back a vpx_free call
Added back a vpx_free call that was mistakenly removed.

Change-Id: Ib662933a8697a4efb8534b5b9b762ee6c2777459
2012-03-21 14:09:33 -04:00
James Berry
921ffdd2c9 bug fix: remove inline from mfqe.c
remove inline from mfqe.c for vs
compatibility

Change-Id: I853f16503d285fcd41a1a12181d8745159156b5c
2012-03-21 13:42:58 -04:00
Yunqing Wang
9ed1b2f09e Merge "Add motion search skipping in first pass" 2012-03-16 14:02:51 -07:00
Yunqing Wang
6a819ce4fe Add motion search skipping in first pass
This change added a motion search skipping mechanism similar
to what we did in second pass. For a macroblock that is very
similar to the macroblock at same location on last frame,
we can set its mv to be zero, and skip motion search. This
improves first-pass performance for slide shows and video
conferencing clips with a slight PSNR loss.

Change-Id: Ic73f9ef5604270ddd6d433170091d20361dfe229
2012-03-16 15:59:00 -04:00
Johann
56e8485c84 darwin universal builds need BUILD_PFX
Universal builds create subdirectories for each target. Without
BUILD_PFX we only generated one vpx_rtcd.h instead of one for each.

Change-Id: I1caed4e018c8865ffc8da15e434cae2b96154fb4
2012-03-16 14:54:07 -04:00
John Koleszar
a05bf133ae Update XCode SDK search paths
Newer XCodes have moved the SDK path from /Developer/SDKs

Use a suggestion from jorgenisaksson@gmail.com to locate it

osx_sdk_dir is not required to be set. Apple now offers a set
command line tools which do not require this. isysroot is also
not required in newer versions of XCode so only set it when we
are confident in the location.

There remain issues with the iOS configure steps which will be
addressed later

Change-Id: I4f5d7e35175d0dea84faaa6bfb52a0153c72f84b
2012-03-16 14:38:51 -04:00
Johann
e68953b7c8 Merge "RFC: Reorganize MFQE loops" 2012-03-16 11:35:18 -07:00
Scott LaVarnway
9aa2bd8a03 Merge "x_motion_minq table reduction" 2012-03-16 09:03:47 -07:00
James Zern
6b7cf3077d doxy: fix conditional usage, ref warnings
doxygen < 1.7.? seems to have been more tolerant of single line
\if/\endif

This change fixes warnings such as:
mainpage.dox:13: warning: unable to resolve reference to `vp8_encoder-'
for \ref command
vpx_decoder.h:193: warning: explicit link request to 'n' could not be
resolved

Change-Id: If3d04af5ede1b0d1e2c63021d0e4ac8f98db20b2
2012-03-15 16:51:51 -07:00
John Koleszar
eb0c5a6ffc Merge "Fix build under Estonian locale" 2012-03-14 12:00:44 -07:00
John Koleszar
422f97d7ab Merge "fix potential use of uninitialized rate_y" 2012-03-14 12:00:27 -07:00
Priit Laes
7af4eb014b Fix build under Estonian locale
Change-Id: Ifb536403ef302b597864eae1d05aa9e2bb15d4c7
2012-03-14 11:19:09 -07:00
John Koleszar
20cd3e6b8f fix potential use of uninitialized rate_y
This issue likely doesn't appear in the unmodified encoder, but
sufficient hacking on the mode selection loop can expose it.

Change-Id: I8a35831e8f08b549806d0c2c6900d42af883f78f
2012-03-14 10:10:30 -07:00
Jim Bankoski
6b66c01c88 Merge "Adds a motion compensated temporal denoiser to the encoder." 2012-03-13 16:18:57 -07:00
Stefan Holmer
9c41143d66 Adds a motion compensated temporal denoiser to the encoder.
Some refactoring in rdopt.c and pickinter.c.

Change-Id: I4f50020eb3313c37f4d441d708fedcaf219d3038
2012-03-13 15:33:50 -07:00
Jim Bankoski
e9cacfd66d Merge "Update for key frame target size setting." 2012-03-13 10:13:03 -07:00
Marco Paniconi
301409107f Update for key frame target size setting.
Set an iniital/minimun boost level for the frame rate
factor of key frame target size setting.

Change-Id: If2586f4ac76a1fa89378aa652a58607356a1f426
2012-03-12 16:23:08 -07:00
Johann
ddf94f6184 Merge "Move SAD and variance functions to common" 2012-03-12 15:08:48 -07:00
John Koleszar
c21f53a501 Merge "vpx_timer: increase resolution" 2012-03-09 14:48:06 -08:00
Scott LaVarnway
7a1590713e Merge changes I9c26870a,Ifabb0f67
* changes:
  threading.c refactoring
  Decoder loops refactoring
2012-03-09 10:48:11 -08:00
Scott LaVarnway
9ed874713f threading.c refactoring
Added recon above/left to MACROBLOCKD
Reworked decode_macroblock

Change-Id: I9c26870af75797134f410acbd02942065b3495c1
2012-03-08 15:27:41 -05:00
Yaowu Xu
676610d2a8 Merge "vp8e - RDLambda fix" 2012-03-07 11:53:04 -08:00
Johann
fd903902ef RFC: Reorganize MFQE loops
Break MFQE code into it's own file.

It is currently only valid for 16x16 and 8x8 Y blocks. It also filters
4x4 U/V blocks.

Refactor filtering and add associated assembly. Limited test cases show
--mfqe introduces a penalty of ~20% with HD content. The assembly
reduces the penalty to ~15%

Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf
2012-03-06 15:20:03 -08:00
Jim Bankoski
154b4b4196 vp8e - RDLambda fix
Last commit went the wrong way.

Change-Id: I5e47ee6c25b0893dfa84318229b93c57dfeec24e
2012-03-06 08:47:12 -08:00
Johann
953c6a011e Merge "include CHANGELOG in CODEC_SRCS" 2012-03-05 17:20:48 -08:00
Johann
e50f96a4a3 Move SAD and variance functions to common
The MFQE function of the postprocessor depends on these

Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a
2012-03-05 16:50:33 -08:00
Johann
5d88a82aa0 include CHANGELOG in CODEC_SRCS
build/make/version.sh requires CHANGELOG to generate vpx_version.h
The file is already included when building the documentation. However,
documentation is not build if doxygen/php are not present.

This is necessary when using '--enable-install-srcs --enable-codec-srcs'
and 'make dist'

Change-Id: Icada883a056a4713d24934ea44e0f6969b68f9c2
2012-03-05 16:36:23 -08:00
Jim Bankoski
cbcfbe1b29 Merge "vp8e - fix coefficient costing" 2012-03-05 14:14:58 -08:00
Jim Bankoski
888699091d vp8e - fix coefficient costing
Coefficient costing failed to take account of the first branch
being skipped ( 0 vs eob) if the previous token is 0.

Fixed rd to account for slightly increased token cost & cleaned up
warning message

Change-Id: I56140635d9f48a28dded5a816964e973a53975ef
2012-03-05 08:20:42 -08:00
Johann
87c40b35eb Fix encoder debug setting
Propagate debug setting to the EBML struct. When writing the application
name, this allows us to strip the version code and keep the output
metadata static.

Change-Id: I8e06c6abd743bedbff5af6242bbdae5d55754538
2012-03-01 16:12:53 -08:00
John Koleszar
a6f538cefa vpx_timer: increase resolution
There's no useful reason to limit this timer to 1 second.

Change-Id: Idd1960268624e8bdfe958d99833ae6482fdb423e
2012-03-01 13:57:17 -08:00
Jim Bankoski
a60461a340 Merge "vp8e - attempt to lessen blockiness" 2012-03-01 11:47:09 -08:00
Jim Bankoski
f0f609c2e2 Merge "vp8e - static key boost" 2012-03-01 11:23:10 -08:00
Jim Bankoski
91b5c98b3d Merge "vp8e - force at least some change in over and under shoots" 2012-03-01 11:22:52 -08:00
Paul Wilkins
8d07a97acc vp8e - static key boost
This seeks to boost the size of the keyframe if the entire section
is a single frame clip

Change-Id: I3c00268dc155b047dc4b90e514cf403d55a4f8ef
2012-03-01 10:39:41 -08:00
Paul Wilkins
6d84322762 vp8e - force at least some change in over and under shoots
Change-Id: Ie1796f272dc33bf5a1c8ac990da625961d272aa9
2012-03-01 10:35:22 -08:00
Scott LaVarnway
c34d91a84e Merge "Packing bitstream on-the-fly with delayed context updates" 2012-03-01 06:20:02 -08:00
Yunqing Wang
aabae97e57 vpxenc: fix time and fps calculation in 2-pass encoding
When we do 2-pass encoding, elapsed time is accumulated through
whole 2-pass process, which gives incorrect time and fps results
for second pass. This change fixed that by resetting the time
accumulator for second pass.

Change-Id: Ie6cbf0d0e66e6874e7071305e253c6267529cf20
2012-02-29 15:44:56 -05:00
Attila Nagy
52cf4dcaea Packing bitstream on-the-fly with delayed context updates
Produce the token partitions on-the-fly, while processing each MB.
Context is updated at the beginning of each frame based on the
previoud frame's counters. Optimally encoder outputs partitions in
separate buffers. For frame based output, partitions are concatenated
internally.

Limitations:
    - enabled just in combination with realtime-only mode
    - number of encoding threads has to be equal or less than the
    number of token partitions. For this reason, by default the encoder
    will do 8 token partitions.
    - vpxenc supports partition output (-P) just in combination with
    IVF output format (--ivf)

Performance:
    - Realtime encoder can be up to 13% faster (ARM) depending on the number
    of threads and bitrate settings. Constant gain over the 5-16 speed
    range.
    - Token buffer reduced from one frame to 8 MBs

Quality:
    - quality is affected by the delayed context updates. This again
    dependents on input material, speed and bitrate settings. For VC
    style input the loss seen is up to 0.2dB. If error-resilient=2
    mode is used than the effect of this change is negligible.

Example:
./configure --enable-realtime-only --enable-onthefly-bitpacking
./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
--target-bitrate=1000 --token-parts=3 --static-thresh=2000
--ivf -P -t 4 -o strm.ivf tanya_640x480.yuv

Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76
2012-02-29 12:13:37 -05:00
Jim Bankoski
b8fa2839a2 vp8e - attempt to lessen blockiness
applies a penalty to intra blocks in order to cut down on blockiness in
easy sections.

Change-Id: Ia9e5df16328b0bf01bf0f2e6e61abcb687316c12
2012-02-29 09:03:13 -08:00
Scott LaVarnway
2578b767c3 Decoder loops refactoring
Eliminated some mb branches along with other code cleanups.
This is part of an ongoing effort to remove cut/paste
code in the decoder.

Change-Id: Ifabb0f67cafa6922b5a0e89a0d03a9b34e9e5752
2012-02-29 10:38:14 -05:00
Scott LaVarnway
ce328b855f Merge changes Ifb450710,I61c4a132
* changes:
  Eliminated reconintra_mt.c
  Eliminated vp8mt_build_intra_predictors_mbuv_s
2012-02-28 11:42:45 -08:00
Scott LaVarnway
aab70f4d7a Merge "Removed duplicate code in threading.c" 2012-02-28 11:25:43 -08:00
Scott LaVarnway
bcba86e2e9 Eliminated reconintra_mt.c
Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead.  vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict.  Future commits will fix this
code duplication.

Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5
2012-02-28 14:12:30 -05:00
Scott LaVarnway
9a4052a4ec Removed duplicate code in threading.c
Change-Id: Id7e44950ceda67b280e410e541510106ef02f1da
2012-02-28 14:00:32 -05:00
Yunqing Wang
b1bfd0ba87 Merge "Only do uv intra-mode evaluation when intra mode is checked" 2012-02-28 10:11:24 -08:00
Yunqing Wang
019384f2d3 Only do uv intra-mode evaluation when intra mode is checked
When we encode slide-show clips, for the majority of the time,
only ZEROMV mode is checked, and all other modes are skipped.
This change delayed uv intra-mode evaluation until intra mode is
actually checked. This gave big performance gain for slide-show
video encoding (2nd pass gain: 18% to 28%). But, this change
doesn't help other types of videos.

Also, zbin_mode_boost is adjusted in mode-checking loop, which
causes bitstream mismatch before/after this change when --best
or --good with --cpu-used=0 are used.

Change-Id: I582b3e69fd384039994360e870e6e059c36a64cc
2012-02-28 13:08:17 -05:00
James Berry
e2c6b05f9a bugfix: use oxcf width/height for reinit check
use oxcf instead of common in check to Reinit the
lookahead buffer if the frame size changes
prior behavior would cause assertion fail/crash

first observed in:
support changing resolution with vpx_codec_enc_config_set

Change-Id: Ib669916ca9b4f206d4cc3caab5107e49d39a36aa
2012-02-27 16:10:45 -05:00
Yunqing Wang
61c5e31ca1 Merge "Fix skippable evaluation in mode decision" 2012-02-27 11:06:13 -08:00
John Koleszar
ad1216151d Merge "vpxenc: initial implementation of multistream support" 2012-02-27 09:59:14 -08:00
John Koleszar
02a31e6b3c Merge "decoder: reset segmentation map on keyframes" 2012-02-27 09:58:29 -08:00
Yunqing Wang
84be08b07f Fix skippable evaluation in mode decision
Yaowu fixed the skippable evaluation by correcting 2nd order
block's eob.

Change-Id: Id47930cbc74a90a046c0c0e324efb03477639ee0
2012-02-27 12:45:12 -05:00
James Berry
313bfbb6a2 Merge "Add unit tests for idctllm_test and idctllm_mmx" 2012-02-23 08:50:36 -08:00
Jim Bankoski
2089f26b08 Merge "Remove the frame rate factor for key frame size." 2012-02-23 08:38:44 -08:00
Marco Paniconi
507ee87e3e Remove the frame rate factor for key frame size.
When temporal layers is used (i.e., number_of_layers > 1),
we don't use the frame rate boost for setting the key
frame target size. The factor was forcing the target size to be
always at its minimum (2* per_frame_bandwidth) for low frame rates
(i.e., base layer frame rate).

Generally we should modify or remove this frame rate factor;
for now we turn if off for number_of_layers > 1.

Change-Id: Ia5acf406c9b2f634d30ac2473adc7b9bf2e7e6c6
2012-02-22 15:25:32 -08:00
Scott LaVarnway
f2bd11faa4 Eliminated vp8mt_build_intra_predictors_mbuv_s
Reworked the code to use vp8_build_intra_predictors_mbuv_s
instead.  This is WIP with the goal of eliminating all
functions in reconintra_mt.h

Change-Id: I61c4a132684544b24a38c4a90044597c6ec0dd52
2012-02-21 14:59:05 -05:00
James Berry
0c1cec2205 Add unit tests for idctllm_test and idctllm_mmx
add unit tests for vp8_short_idct4x4llm_c

Change-Id: I472b7c0baa365ba25dc99a3f6efccc816d27c941
2012-02-21 14:52:36 -05:00
John Koleszar
dadc9189ed Merge changes I0341554f,I64e110c8
* changes:
  Consolidate C version of token packing functions
  Multithreaded encoder, late sync loopfilter
2012-02-21 10:09:23 -08:00
Scott LaVarnway
f05feab7b9 Merge "Remove redundant init of segment_counts in vp8_encode_frame" 2012-02-21 09:51:02 -08:00
John Koleszar
02360dd2c2 Merge "Update encoder mb_skip_coeff and prob_skip_false calculation" 2012-02-21 09:48:26 -08:00
Johann
b0a12a2880 Refine offset pattern
When compiling with -ggdb3 the output includes an extraneous EQU from
vpx_ports/asm_offsets.h

https://trac.macports.org/ticket/33285

Change-Id: Iba93ddafec414c152b87001a7542e7a894781231
2012-02-17 12:28:13 -08:00
John Koleszar
b5ce9456db Merge changes Idf1a05f3,If227b29b,Iac784d39
* changes:
  vpxenc: factor out input open/close
  vpxenc: add warning()/fatal() helpers
  vpxenc: factor out global config options
2012-02-17 11:14:17 -08:00
Johann
e6047a17a9 Merge "OS X shell is incompatible with echo -n" 2012-02-17 10:53:19 -08:00
Yunqing Wang
f93b1e7be1 Merge "Fix incorrect use of uv eobs in intra modes" 2012-02-17 10:43:05 -08:00
Yunqing Wang
04b9e0d787 Fix incorrect use of uv eobs in intra modes
In vp8_rd_pick_inter_mode(), if total of eobs is zero, rate needs
to be adjusted since there are no non-zero coefficients for
transmission. The uv intra eobs calculated in
rd_pick_intra_mbuv_mode() need to be saved before they are
overwritten by inter-mode eobs.

Change-Id: I41dd04fba912e8122ef95793d4d98a251bc60e58
2012-02-17 09:15:08 -05:00
Attila Nagy
ce42e79abc Update encoder mb_skip_coeff and prob_skip_false calculation
mode_info_context->mbmi.mb_skip_coeff has to always reflect the
existence or not of coeffs for a certain MB. The loopfilter needs this
info.
mb_skip_coeff is either set by the vp8_tokenize_mb or has to be set to
1 when the MB is skipped by mode selection. This has to be done
regardless of the mb_no_coeff_skip value.

prob_skip_false is needed just when mb_no_coeff_skip is 1. No need to
keep count of both skip_false and skip_true as they are complementary
(skip_true+skip_false = total_mbs)

Change-Id: I3c74c9a0ee37bec10de7bb796e408f3e77006813
2012-02-17 14:27:40 +02:00
Attila Nagy
565d0e6feb Remove redundant init of segment_counts in vp8_encode_frame
segment_counts was zero init twice in the beginning of vp8_encode_frame.

Change-Id: Ibc29f6896dabd9aab1d0993f3941cf6876022e70
2012-02-17 09:51:24 +02:00
Johann
6b151d436d Clarify 'max_sad' usage
Depending on implementation the optimized SAD functions may return early
when the calculated SAD exceeds max_sad.

Change-Id: I05ce5b2d34e6d45fb3ec2a450aa99c4f3343bf3a
2012-02-16 15:17:44 -08:00
Johann
5f0b303c28 OS X shell is incompatible with echo -n
Built in echo in 'sh' on OS X does not support -n (exclude trailing
newline). It's not necessary so just leave it off. Fixes issue 390.

Build include guard using 'symbol' so that it is more likely to be
unique.

Change-Id: I4bc6aa1fc5e02228f71c200214b5ee4a16d56b83
2012-02-16 14:20:44 -08:00
Fritz Koenig
3653fb473a Include path fix for building against Android NDK.
cpu-features.h is not in the common paths, add
to the cflags for Android.

Change-Id: Icbafc7600d72f6b59ffb030f6ab80ee6860332bb
2012-02-16 12:38:17 -08:00
John Koleszar
9e50ed7f27 vpxenc: initial implementation of multistream support
Add the ability to specify multiple output streams on the command line.
Streams are delimited by --, and most parameters inherit from previous
streams.

In this implementation, resizing streams is still not supported. It
does not make use of the new multistream support in the encoder either.
Two pass support runs all streams independently, though it's
theoretically possible that we could combine firstpass runs in the
future. The logic required for this is too tricky to do as part of this
initial implementation. This is mostly an effort to get the parameter
passing and independent streams working from the application's
perspective, and a later commit will add the rescaling and
multiresolution support.

Change-Id: Ibf18c2355f54189fc91952c734c899e5c072b3e0
2012-02-16 12:30:01 -08:00
John Koleszar
732cb9a643 vpxenc: factor out input open/close
Simplify some of the file I/O for later commits which will add multistream
support

Change-Id: Idf1a05f3a29c95331d0c4a6ea5960904e4897fd4
2012-02-16 12:30:00 -08:00
John Koleszar
c535025c12 vpxenc: add warning()/fatal() helpers
Cosmetic. Allows exiting with an error message without opening a new
scope.

Change-Id: If227b29b825f0241acea79dd38f19e524552ee18
2012-02-16 12:26:58 -08:00
John Koleszar
e8223bd250 decoder: reset segmentation map on keyframes
Refactoring some of the mode decoding logic introduced a bug where
the segmentation maps would not be properly reset on keyframes.

http://code.google.com/p/webm/issues/detail?id=378

The text of the bug is somewhat misleading as I initially read it to
imply the bug was present in v0.9.7-p1 (Cayuga), but note the text
"master", which indicates this was something subsequent. This issue
bisects back to v0.9.7-p1-84-ga99c20c, so unfortunately it was broken
during the Duclair release.

Thanks to Alexei Leonenko for investigating the root cause.

Change-Id: I9713c9f070eb37b31b3b029d9ef96be9b6ea2def
2012-02-16 12:22:18 -08:00
Makoto Kato
7989bb7fe7 Support Android x86 NDK build
On Android NDK, rand() is inlined function.  But, on our SSE optimization,
we need symbol for rand()

Change-Id: I42ab00e3255208ba95d7f9b9a8a3605ff58da8e1
2012-02-16 12:03:30 -08:00
Scott LaVarnway
6776bd62b5 Simplify mb_to_x_edge calculation during mode decoding
Change-Id: Ibcb35c32bf24c1d241090e24c5e2320e4d3ba901
2012-02-16 13:36:46 -05:00
Scott LaVarnway
a5879f7c81 Merge "decodemv cleanup/improvements" 2012-02-16 09:33:59 -08:00
Scott LaVarnway
12ee845ee7 decodemv cleanup/improvements
Removed unnecessary variables, unrolled functions, eliminated
unnecessary mv bounds checks and branches.

Change-Id: I02d034c70cd97b65025d59dd67c695e1db529f0b
2012-02-16 11:38:33 -05:00
Attila Nagy
d02e74a073 Consolidate C version of token packing functions
Replace inner loops of pack_mb_row_tokens_c and
pack_tokens_into_partitions_c with a call to pack_tokens_c.

Change-Id: I0341554fb154a14a5dadb63f8fc78010724c2c33
2012-02-16 14:11:28 +02:00
Attila Nagy
78071b3b97 Multithreaded encoder, late sync loopfilter
Second shot at this...

Sync with loopfilter thread as late as possible, usually just at the
beginning of next frame encoding. This returns control to application
faster and allows a better multicore scaling.

When PSNR packets are generated the final filtered frame is needed
imediatly so we cannot delay the sync. Same has to be done when
internal frame is previewed.

Change-Id: I64e110c8b224dd967faefffd9c93dd8dbad4a5b5
2012-02-16 12:26:39 +02:00
John Koleszar
efd54f8f41 vpxenc: factor out global config options
This is a first step towards specifying multiple output streams
with one command line.

Change-Id: Iac784d3911bf553694d024bbd0c3d547261e914b
2012-02-15 16:11:35 -08:00
Fritz Koenig
8144132866 Fix rtcd build process for Android.mk
Add a dependency so ndk-build will
generate the needed vpx_rtcd.h file.

Change-Id: I92c82e0996943dd0403c9956e1ba60e92e2837a9
2012-02-15 15:23:04 -08:00
John Koleszar
e6df50031e Merge "support changing resolution with vpx_codec_enc_config_set" 2012-02-10 16:18:00 -08:00
Johann
169823428f Missed some variance casts
Change-Id: I9fb510f9421fb3c317a8e32e3058cee977ddf9fa
2012-02-10 11:07:33 -08:00
Johann
12d45f62f6 Merge "max_sad check is not always implemented" 2012-02-10 10:28:00 -08:00
Johann
8c50a70a95 max_sad check is not always implemented
As an optimization some architectures use the max_sad argument to break
out early from the SAD. Pass in INT_MAX instead of 0 to prevent this.

Change-Id: I653c476834b97771578d63f231233d445388629d
2012-02-09 16:19:10 -08:00
Scott LaVarnway
768ae275dc x_motion_minq table reduction
Reduced by 4080 bytes.

Change-Id: I037b55bc9684bf4a54bce238be00e8c4db3f643e
2012-02-09 16:49:34 -05:00
Johann
fea3556e20 Fix variance overflow
In the variance calculations the difference is summed and later squared.
When the sum exceeds sqrt(2^31) the value is treated as a negative when
it is shifted which gives incorrect results.

To fix this we cast the result of the multiplication as unsigned.

The alternative fix is to shift sum down by 4 before multiplying.
However that will reduce precision.

For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
change).

PPC change is untested.

Change-Id: I1bad27ea0720067def6d71a6da5f789508cec265
2012-02-09 12:38:31 -08:00
John Koleszar
2e0d55314c Merge "Add OS/2 supports" 2012-02-08 11:00:55 -08:00
KO Myung-Hun
2dad8d65d9 Add OS/2 supports
Change-Id: I792d5236451905eb20a8ebe444ef5b2274e4f7a4
2012-02-08 09:44:42 -08:00
John Koleszar
51acb01167 support changing resolution with vpx_codec_enc_config_set
Allow the application to change the frame size during encoding. This
is only supported when not using lagged compress.

Change-Id: I89b585d703d5fd728a9e3dedf997f1b595d0db0f
2012-02-07 17:09:40 -08:00
John Koleszar
417b852967 Align internal mfqe framebuffer dimensions
MFQE postproc crashed with stream dimensions not a multiple of 16.
The buffer was memset unconditionally, so if the buffer allocation
fails we end up trying to write to NULL.

This patch traps an allocation failure with vpx_internal_error(),
and aligns the buffer dimensions to what vp8_yv12_alloc_frame_buffer()
expects.

Change-Id: I3915d597cd66886a24f4ef39752751ebe6425066
2012-02-07 10:40:26 -08:00
Adrian Grange
45f4b87e8e Fixed bug in 5-layer multi-layer encode
The 5-layer encode must have a keyframe every 16 frames.

The KF flag was being reset after the encode of the first
frame, which it should not do for the 5-layer case
(mode=6).

Change-Id: I207d6e689d347fe3fd1075b97a817e82f7ad53b9
2012-02-06 15:02:33 -08:00
Adrian Grange
9df0d29823 Merge "Added 2 temporal patterns with new parameters" 2012-02-06 14:45:33 -08:00
Yunqing Wang
a040eb37e4 Merge "Allow to skip highest-resolution encoding in multi-resolution encoder" 2012-02-06 13:58:11 -08:00
Yunqing Wang
fa1a9290e6 Allow to skip highest-resolution encoding in multi-resolution encoder
Sometimes, a user doesn't have enough bandwidth to send high-resolution
(i.e. HD) video even though the camera catches HD video. This change
allowed users to skip highest-resolution encoding by setting that level's
target bit rate to 0.

To test it, modify the following line in vp8_multi_resolution_encoder.c.
    unsigned int  target_bitrate[NUM_ENCODERS]={1400, 500, 100};
To skip the highest-resolution level, change it to
    unsigned int  target_bitrate[NUM_ENCODERS]={0, 500, 100};
To skip the first and second highest resolution levels, change it to
    unsigned int  target_bitrate[NUM_ENCODERS]={0, 0, 100};

This change also fixed a small problem in mapping, which slightly helped
quality and performance.

Change-Id: I977bae9a9fbfba85c8be4bd5af01539f2b84bc81
2012-02-03 13:39:05 -05:00
Scott LaVarnway
d8ebdcd89d Moved ref_frame_cost from MACROBLOCKD to MACROBLOCK
Change-Id: I05788522e9cde4322cfb12032483bdbf184bdf0b
2012-02-02 13:40:08 -05:00
Scott LaVarnway
11c706488b Removed frames_till_alt_ref_frame from MACROBLOCKD
Change-Id: Ieb05270ac332a4cc38ec4b7b995fc0150e0fffdf
2012-02-02 13:34:13 -05:00
Scott LaVarnway
e2000cc5ca Removed frames_since_golden from MACROBLOCKD
Change-Id: I10efa441d663fceb6bc97a3bfad518cd3d9a5128
2012-02-02 13:28:41 -05:00
Scott LaVarnway
07c6eb18ad Merge "Improved uv mv calculations in build inter predictor" 2012-01-31 10:43:49 -08:00
Scott LaVarnway
749bc98618 BLOCKD structure cleanup
Removed redundancies.  All of the information can be
found in the MACROBLOCKD structure.

Change-Id: I7556392c6f67b43bef2a5e9932180a737466ef93
2012-01-31 11:02:39 -05:00
John Koleszar
57d459ba82 RTCD: remove unimplemented vp8_short_walsh4x4_mmx
This function does not exist.

Change-Id: I84b72fb17d572d5cccee92220467b84c15842d4d
2012-01-30 12:55:45 -08:00
John Koleszar
8aae246089 RTCD: finalize removal of old RTCD system
This is the final commit in the series converting to the new RTCD
system. It removes the encoder csystemdependent files and the remaining
global function pointers that didn't conform to the old RTCD system.

Change-Id: I9649706f1bb89f0cbf431ab0e3e7552d37be4d8e
2012-01-30 12:10:48 -08:00
John Koleszar
109b69a706 RTCD: add arnr functions
This commit continues the process of converting to the new RTCD
system. It removes the last of the VP8_ENCODER_RTCD struct references.

Change-Id: I2a44f52d7cccf5177e1ca98a028ead570d045395
2012-01-30 12:10:48 -08:00
John Koleszar
0b0bc8d098 RTCD: add motion search functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Ia5828b7ecc80db55b21916704aa3d54cbb98f625
2012-01-30 12:10:47 -08:00
John Koleszar
be8af188d0 RTCD: add block subtraction functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Id8a287fdd4bd050ea4452e1582ad85520f3081be
2012-01-30 12:10:47 -08:00
John Koleszar
61311e6103 RTCD: add quantizer functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Iba9df4c03a508e51c37201c621be43523fae87d9
2012-01-30 12:10:46 -08:00
John Koleszar
510e0ab467 RTCD: add FDCT functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I3f9c07db65eb206f6363d21bdb80e871570da767
2012-01-30 12:10:42 -08:00
John Koleszar
83a91e789c RTCD: add variance functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Ie5c1aa480637e98dc3918fb562ff45c37a66c538
2012-01-30 12:08:30 -08:00
John Koleszar
f103dcefaf RTCD: add subpixel functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe
2012-01-30 12:08:29 -08:00
John Koleszar
2a8f57f50d RTCD: add postproc functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: If54eb5cb5d1b0cac6c4c0633a9e99c93ca860ba2
2012-01-30 12:08:29 -08:00
John Koleszar
fdb61a4531 RTCD: add recon functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
2012-01-30 12:08:28 -08:00
John Koleszar
ab77b4e898 RTCD: add remaining IDCT functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80
2012-01-30 12:08:22 -08:00
John Koleszar
55f74c59c7 RTCD: add loopfilter functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Ic8a4047d72ff3a54ec98977dd90e70c13213db71
2012-01-30 12:06:31 -08:00
John Koleszar
a910049aea New RTCD implementation
This is a proof of concept RTCD implementation to replace the current
system of nested includes, prototypes, INVOKE macros, etc. Currently
only the decoder specific functions are implemented in the new system.
Additional functions will be added in subsequent commits.

Overview:
  RTCD "functions" are implemented as either a global function pointer
  or a macro (when only one eligible specialization available).
  Functions which have RTCD specializations are listed using a simple
  DSL identifying the function's base name, its prototype, and the
  architecture extensions that specializations are available for.

Advantages over the old system:
  - No INVOKE macros. A call to an RTCD function looks like an ordinary
    function call.
  - No need to pass vtables around.
  - If there is only one eligible function to call, the function is
    called directly, rather than indirecting through a function pointer.
  - Supports the notion of "required" extensions, so in combination with
    the above, on x86_64 if the best function available is sse2 or lower
    it will be called directly, since all x86_64 platforms implement
    sse2.
  - Elides all references to functions which will never be called, which
    could reduce binary size. For example if sse2 is required and there
    are both mmx and sse2 implementations of a certain function, the
    code will have no link time references to the mmx code.
  - Significantly easier to add a new function, just one file to edit.

Disadvantages:
  - Requires global writable data (though this is not a new requirement)
  - 1 new generated source file.

Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55
2012-01-30 12:06:27 -08:00
John Koleszar
57cc35dd60 Merge Duclair release into master branch
Change-Id: Ibf577972e8cd10488d44385ff74f136a07466c0c
2012-01-27 14:08:40 -08:00
John Koleszar
9951f46133 Merge "Hook up VP8D_GET_LAST_REF_USED" 2012-01-27 11:31:35 -08:00
Adrian Grange
8978358358 Added 2 temporal patterns with new parameters
Added new 2 and 3 layer prediction frame patterns to
vp8_scalable_patterns.c and modified the coding
parameters.

Change-Id: I18798fd7326a79d2ad1e1d5b6c26f5516b6d247f
2012-01-26 16:50:15 -08:00
John Koleszar
319f7c4d56 Merge changes I17e1a348,Iad710941
* changes:
  Correct clamping in use of vp8_find_near_mvs()
  Revert "Multithreaded encoder, late sync loopfilter"
2012-01-26 14:33:28 -08:00
Attila Nagy
294aa37745 Rename save_neon_reg.asm as save_reg_neon.asm
Easier to filter out all NEON asm.

Change-Id: I0022dae8321a9608e864b09d4181414c5fff4610
2012-01-26 09:44:00 +02:00
Fritz Koenig
c14754be1d Merge "Disconnect ARM tgt_isa from dsp extensions" 2012-01-23 11:13:33 -08:00
Scott LaVarnway
8a6af9f98f Improved uv mv calculations in build inter predictor
Changed calculations to use shifts instead of if-then-else.
Eliminates branches.

Change-Id: I11b75e8bb305301ffd9cb577fb7df059a3cf9ea4
2012-01-23 11:34:43 -05:00
Fritz Koenig
892102842a Disconnect ARM tgt_isa from dsp extensions
A processor with ARMv7 instructions does not
necessarily have NEON dsp extensions.  This CL
has the added side effect of allowing the ability
to enable/disable the dsp extensions cleanly.

Change-Id: Ie1e879b8fe131885bc3d4138a0acc9ffe73a36df
2012-01-20 10:38:15 -08:00
223 changed files with 9840 additions and 12208 deletions

View File

@@ -5,3 +5,4 @@ Tom Finegan <tomfinegan@google.com>
Ralph Giles <giles@xiph.org> <giles@entropywave.com>
Ralph Giles <giles@xiph.org> <giles@mozilla.com>
Alpha Lam <hclam@google.com> <hclam@chromium.org>
Deb Mukherjee <debargha@google.com>

View File

@@ -31,9 +31,11 @@ John Koleszar <jkoleszar@google.com>
Joshua Bleecher Snyder <josh@treelinelabs.com>
Justin Clift <justin@salasaga.org>
Justin Lebar <justin.lebar@gmail.com>
KO Myung-Hun <komh@chollian.net>
Lou Quillio <louquillio@google.com>
Luca Barbato <lu_zero@gentoo.org>
Makoto Kato <makoto.kt@gmail.com>
Marco Paniconi <marpan@google.com>
Martin Ettl <ettl.martin78@googlemail.com>
Michael Kohler <michaelkohler@live.com>
Mike Hommey <mhommey@mozilla.com>
@@ -43,6 +45,7 @@ Patrik Westin <patrik.westin@gmail.com>
Paul Wilkins <paulwilkins@google.com>
Pavol Rusnak <stick@gk2.sk>
Philip Jägenstedt <philipj@opera.com>
Priit Laes <plaes@plaes.org>
Rafael Ávila de Espíndola <rafael.espindola@gmail.com>
Rafaël Carré <funman@videolan.org>
Ralph Giles <giles@xiph.org>
@@ -50,6 +53,7 @@ Ronald S. Bultje <rbultje@google.com>
Scott LaVarnway <slavarnway@google.com>
Stefan Holmer <holmer@google.com>
Taekhyun Kim <takim@nvidia.com>
Takanori MATSUURA <t.matsuu@gmail.com>
Tero Rintaluoma <teror@google.com>
Thijs Vermeir <thijsvermeir@gmail.com>
Timothy B. Terriberry <tterribe@xiph.org>

View File

@@ -1,3 +1,95 @@
2012-05-09 v1.1.0 "Eider"
This introduces a number of enhancements, mostly focused on real-time
encoding. In addition, it fixes a decoder bug (first introduced in
Duclair) so all users of that release are encouraged to upgrade.
- Upgrading:
This release is ABI and API compatible with Duclair (v1.0.0). Users
of older releases should refer to the Upgrading notes in this
document for that release.
This release introduces a new temporal denoiser, controlled by the
VP8E_SET_NOISE_SENSITIVITY control. The temporal denoiser does not
currently take a strength parameter, so the control is effectively
a boolean - zero (off) or non-zero (on). For compatibility with
existing applications, the values accepted are the same as those
for the spatial denoiser (0-6). The temporal denoiser is enabled
by default, and the older spatial denoiser may be restored by
configuring with --disable-temporal-denoising. The temporal denoiser
is more computationally intensive than the spatial one.
This release removes support for a legacy, decode only API that was
supported, but deprecated, at the initial release of libvpx
(v0.9.0). This is not expected to have any impact. If you are
impacted, you can apply a reversion to commit 2bf8fb58 locally.
Please update to the latest libvpx API if you are affected.
- Enhancements:
Adds a motion compensated temporal denoiser to the encoder, which
gives higher quality than the older spatial denoiser. (See above
for notes on upgrading).
In addition, support for new compilers and platforms were added,
including:
improved support for XCode
Android x86 NDK build
OS/2 support
SunCC support
Changing resolution with vpx_codec_enc_config_set() is now
supported. Previously, reinitializing the codec was required to
change the input resolution.
The vpxenc application has initial support for producing multiple
encodes from the same input in one call. Resizing is not yet
supported, but varying other codec parameters is. Use -- to
delineate output streams. Options persist from one stream to the
next.
Also, the vpxenc application will now use a keyframe interval of
5 seconds by default. Use the --kf-max-dist option to override.
- Speed:
Decoder performance improved 2.5% versus Duclair. Encoder speed is
consistent with Duclair for most material. Two pass encoding of
slideshow-like material will see significant improvements.
Large realtime encoding speed gains at a small quality expense are
possible by configuring the on-the-fly bitpacking experiment with
--enable-onthefly-bitpacking. Realtime encoder can be up to 13%
faster (ARM) depending on the number of threads and bitrate
settings. This technique sees constant gain over the 5-16 speed
range. For VC style input the loss seen is up to 0.2dB. See commit
52cf4dca for further details.
- Quality:
On the whole, quality is consistent with the Duclair release. Some
tweaks:
Reduced blockiness in easy sections by applying a penalty to
intra modes.
Improved quality of static sections (like slideshows) with
two pass encoding.
Improved keyframe sizing with multiple temporal layers
- Bug Fixes:
Corrected alt-ref contribution to frame rate for visible updates
to the alt-ref buffer. This affected applications making manual
usage of the frame reference flags, or temporal layers.
Additional constraints were added to disable multi-frame quality
enhancement (MFQE) in sections of the frame where there is motion.
(#392)
Fixed corruption issues when vpx_codec_enc_config_set() was called
with spatial resampling enabled.
Fixed a decoder error introduced in Duclair where the segmentation
map was not being reinitialized on keyframes (#378)
2012-01-27 v1.0.0 "Duclair"
Our fourth named release, focused on performance and features related to
real-time encoding. It also fixes a decoder crash bug introduced in

View File

@@ -99,7 +99,7 @@ $$(eval $$(call ev-build-file))
$(1) : $$(_OBJ) $(2)
@mkdir -p $$(dir $$@)
@grep -w EQU $$< | tr -d '\#' | $(CONFIG_DIR)/$(ASM_CONVERSION) > $$@
@grep $(OFFSET_PATTERN) $$< | tr -d '\#' | $(CONFIG_DIR)/$(ASM_CONVERSION) > $$@
endef
# Use ads2gas script to convert from RVCT format to GAS format. This passes
@@ -118,6 +118,9 @@ $(ASM_CNV_PATH)/libvpx/%.asm.s: $(LIBVPX_PATH)/%.asm $(ASM_CNV_OFFSETS_DEPEND)
@mkdir -p $(dir $@)
@$(CONFIG_DIR)/$(ASM_CONVERSION) <$< > $@
# For building vpx_rtcd.h, which has a rule in libs.mk
TGT_ISA:=$(word 1, $(subst -, ,$(TOOLCHAIN)))
target := libs
LOCAL_SRC_FILES += vpx_config.c
@@ -165,12 +168,15 @@ LOCAL_LDLIBS := -llog
LOCAL_STATIC_LIBRARIES := cpufeatures
$(foreach file, $(LOCAL_SRC_FILES), $(LOCAL_PATH)/$(file)): vpx_rtcd.h
.PHONY: clean
clean:
@echo "Clean: ads2gas files [$(TARGET_ARCH_ABI)]"
@$(RM) $(CODEC_SRCS_ASM_ADS2GAS) $(CODEC_SRCS_ASM_NEON_ADS2GAS)
@$(RM) $(patsubst %.asm, %.*, $(ASM_CNV_OFFSETS_DEPEND))
@$(RM) -r $(ASM_CNV_PATH)
@$(RM) $(CLEAN-OBJS)
include $(BUILD_SHARED_LIBRARY)

View File

@@ -391,6 +391,8 @@ LDFLAGS = ${LDFLAGS}
ASFLAGS = ${ASFLAGS}
extralibs = ${extralibs}
AS_SFX = ${AS_SFX:-.asm}
EXE_SFX = ${EXE_SFX}
RTCD_OPTIONS = ${RTCD_OPTIONS}
EOF
if enabled rvct; then cat >> $1 << EOF
@@ -454,9 +456,25 @@ process_common_cmdline() {
;;
--enable-?*|--disable-?*)
eval `echo "$opt" | sed 's/--/action=/;s/-/ option=/;s/-/_/g'`
echo "${CMDLINE_SELECT} ${ARCH_EXT_LIST}" | grep "^ *$option\$" >/dev/null || die_unknown $opt
if echo "${ARCH_EXT_LIST}" | grep "^ *$option\$" >/dev/null; then
[ $action = "disable" ] && RTCD_OPTIONS="${RTCD_OPTIONS}${opt} "
elif [ $action = "disable" ] && ! disabled $option ; then
echo "${CMDLINE_SELECT}" | grep "^ *$option\$" >/dev/null ||
die_unknown $opt
elif [ $action = "enable" ] && ! enabled $option ; then
echo "${CMDLINE_SELECT}" | grep "^ *$option\$" >/dev/null ||
die_unknown $opt
fi
$action $option
;;
--require-?*)
eval `echo "$opt" | sed 's/--/action=/;s/-/ option=/;s/-/_/g'`
if echo "${ARCH_EXT_LIST}" none | grep "^ *$option\$" >/dev/null; then
RTCD_OPTIONS="${RTCD_OPTIONS}${opt} "
else
die_unknown $opt
fi
;;
--force-enable-?*|--force-disable-?*)
eval `echo "$opt" | sed 's/--force-/action=/;s/-/ option=/;s/-/_/g'`
$action $option
@@ -526,6 +544,7 @@ setup_gnu_toolchain() {
STRIP=${STRIP:-${CROSS}strip}
NM=${NM:-${CROSS}nm}
AS_SFX=.s
EXE_SFX=
}
process_common_toolchain() {
@@ -569,6 +588,10 @@ process_common_toolchain() {
tgt_isa=x86_64
tgt_os=darwin11
;;
*darwin12*)
tgt_isa=x86_64
tgt_os=darwin12
;;
*mingw32*|*cygwin*)
[ -z "$tgt_isa" ] && tgt_isa=x86
tgt_os=win32
@@ -579,6 +602,9 @@ process_common_toolchain() {
*solaris2.10)
tgt_os=solaris
;;
*os2*)
tgt_os=os2
;;
esac
if [ -n "$tgt_isa" ] && [ -n "$tgt_os" ]; then
@@ -616,44 +642,51 @@ process_common_toolchain() {
# Handle darwin variants. Newer SDKs allow targeting older
# platforms, so find the newest SDK available.
if [ -d "/Developer/SDKs/MacOSX10.4u.sdk" ]; then
osx_sdk_dir="/Developer/SDKs/MacOSX10.4u.sdk"
fi
if [ -d "/Developer/SDKs/MacOSX10.5.sdk" ]; then
osx_sdk_dir="/Developer/SDKs/MacOSX10.5.sdk"
fi
if [ -d "/Developer/SDKs/MacOSX10.6.sdk" ]; then
osx_sdk_dir="/Developer/SDKs/MacOSX10.6.sdk"
fi
if [ -d "/Developer/SDKs/MacOSX10.7.sdk" ]; then
osx_sdk_dir="/Developer/SDKs/MacOSX10.7.sdk"
case ${toolchain} in
*-darwin*)
if [ -z "${DEVELOPER_DIR}" ]; then
DEVELOPER_DIR=`xcode-select -print-path 2> /dev/null`
[ $? -ne 0 ] && OSX_SKIP_DIR_CHECK=1
fi
if [ -z "${OSX_SKIP_DIR_CHECK}" ]; then
OSX_SDK_ROOTS="${DEVELOPER_DIR}/SDKs"
OSX_SDK_VERSIONS="MacOSX10.4u.sdk MacOSX10.5.sdk MacOSX10.6.sdk"
OSX_SDK_VERSIONS="${OSX_SDK_VERSIONS} MacOSX10.7.sdk"
for v in ${OSX_SDK_VERSIONS}; do
if [ -d "${OSX_SDK_ROOTS}/${v}" ]; then
osx_sdk_dir="${OSX_SDK_ROOTS}/${v}"
fi
done
fi
;;
esac
if [ -d "${osx_sdk_dir}" ]; then
add_cflags "-isysroot ${osx_sdk_dir}"
add_ldflags "-isysroot ${osx_sdk_dir}"
fi
case ${toolchain} in
*-darwin8-*)
add_cflags "-isysroot ${osx_sdk_dir}"
add_cflags "-mmacosx-version-min=10.4"
add_ldflags "-isysroot ${osx_sdk_dir}"
add_ldflags "-mmacosx-version-min=10.4"
;;
*-darwin9-*)
add_cflags "-isysroot ${osx_sdk_dir}"
add_cflags "-mmacosx-version-min=10.5"
add_ldflags "-isysroot ${osx_sdk_dir}"
add_ldflags "-mmacosx-version-min=10.5"
;;
*-darwin10-*)
add_cflags "-isysroot ${osx_sdk_dir}"
add_cflags "-mmacosx-version-min=10.6"
add_ldflags "-isysroot ${osx_sdk_dir}"
add_ldflags "-mmacosx-version-min=10.6"
;;
*-darwin11-*)
add_cflags "-isysroot ${osx_sdk_dir}"
add_cflags "-mmacosx-version-min=10.7"
add_ldflags "-isysroot ${osx_sdk_dir}"
add_ldflags "-mmacosx-version-min=10.7"
;;
*-darwin12-*)
add_cflags "-mmacosx-version-min=10.8"
add_ldflags "-mmacosx-version-min=10.8"
;;
esac
# Handle Solaris variants. Solaris 10 needs -lposix4
@@ -671,10 +704,22 @@ process_common_toolchain() {
case ${toolchain} in
arm*)
# on arm, isa versions are supersets
enabled armv7a && soft_enable armv7 ### DEBUG
enabled armv7 && soft_enable armv6
enabled armv7 || enabled armv6 && soft_enable armv5te
enabled armv7 || enabled armv6 && soft_enable fast_unaligned
case ${tgt_isa} in
armv7)
soft_enable neon
soft_enable media
soft_enable edsp
soft_enable fast_unaligned
;;
armv6)
soft_enable media
soft_enable edsp
soft_enable fast_unaligned
;;
armv5te)
soft_enable edsp
;;
esac
asm_conversion_cmd="cat"
@@ -687,10 +732,14 @@ process_common_toolchain() {
arch_int=${arch_int%%te}
check_add_asflags --defsym ARCHITECTURE=${arch_int}
tune_cflags="-mtune="
if enabled armv7
then
check_add_cflags -march=armv7-a -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp #-ftree-vectorize
check_add_asflags -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp #-march=armv7-a
if [ ${tgt_isa} == "armv7" ]; then
if enabled neon
then
check_add_cflags -mfpu=neon #-ftree-vectorize
check_add_asflags -mfpu=neon
fi
check_add_cflags -march=armv7-a -mcpu=cortex-a8 -mfloat-abi=softfp
check_add_asflags -mcpu=cortex-a8 -mfloat-abi=softfp #-march=armv7-a
else
check_add_cflags -march=${tgt_isa}
check_add_asflags -march=${tgt_isa}
@@ -708,10 +757,14 @@ process_common_toolchain() {
tune_cflags="--cpu="
tune_asflags="--cpu="
if [ -z "${tune_cpu}" ]; then
if enabled armv7
then
check_add_cflags --cpu=Cortex-A8 --fpu=softvfp+vfpv3
check_add_asflags --cpu=Cortex-A8 --fpu=softvfp+vfpv3
if [ ${tgt_isa} == "armv7" ]; then
if enabled neon
then
check_add_cflags --fpu=softvfp+vfpv3
check_add_asflags --fpu=softvfp+vfpv3
fi
check_add_cflags --cpu=Cortex-A8
check_add_asflags --cpu=Cortex-A8
else
check_add_cflags --cpu=${tgt_isa##armv}
check_add_asflags --cpu=${tgt_isa##armv}
@@ -757,17 +810,19 @@ process_common_toolchain() {
add_cflags "--sysroot=${alt_libc}"
add_ldflags "--sysroot=${alt_libc}"
add_cflags "-I${SDK_PATH}/sources/android/cpufeatures/"
enable pic
soft_enable realtime_only
if enabled armv7
then
if [ ${tgt_isa} == "armv7" ]; then
enable runtime_cpu_detect
fi
;;
darwin*)
if [ -z "${sdk_path}" ]; then
SDK_PATH=/Developer/Platforms/iPhoneOS.platform/Developer
SDK_PATH=`xcode-select -print-path 2> /dev/null`
SDK_PATH=${SDK_PATH}/Platforms/iPhoneOS.platform/Developer
else
SDK_PATH=${sdk_path}
fi
@@ -789,7 +844,7 @@ process_common_toolchain() {
add_ldflags -arch_only ${tgt_isa}
if [ -z "${alt_libc}" ]; then
alt_libc=${SDK_PATH}/SDKs/iPhoneOS5.0.sdk
alt_libc=${SDK_PATH}/SDKs/iPhoneOS5.1.sdk
fi
add_cflags "-isysroot ${alt_libc}"
@@ -886,6 +941,9 @@ process_common_toolchain() {
LD=${LD:-${CROSS}gcc}
CROSS=${CROSS:-g}
;;
os2)
AS=${AS:-nasm}
;;
esac
AS="${alt_as:-${AS:-auto}}"
@@ -956,6 +1014,11 @@ process_common_toolchain() {
# enabled icc && ! enabled pic && add_cflags -fno-pic -mdynamic-no-pic
enabled icc && ! enabled pic && add_cflags -fno-pic
;;
os2)
add_asflags -f aout
enabled debug && add_asflags -g
EXE_SFX=.exe
;;
*) log "Warning: Unknown os $tgt_os while setting up $AS flags"
;;
esac

View File

@@ -42,7 +42,7 @@ done
[ -n "$srcfile" ] || show_help
sfx=${sfx:-asm}
includes=$(egrep -i "include +\"?+[a-z0-9_/]+\.${sfx}" $srcfile |
includes=$(LC_ALL=C egrep -i "include +\"?+[a-z0-9_/]+\.${sfx}" $srcfile |
perl -p -e "s;.*?([a-z0-9_/]+.${sfx}).*;\1;")
#" restore editor state
for inc in ${includes}; do

330
build/make/rtcd.sh Executable file
View File

@@ -0,0 +1,330 @@
#!/bin/sh
self=$0
usage() {
cat <<EOF >&2
Usage: $self [options] FILE
Reads the Run Time CPU Detections definitions from FILE and generates a
C header file on stdout.
Options:
--arch=ARCH Architecture to generate defs for (required)
--disable-EXT Disable support for EXT extensions
--require-EXT Require support for EXT extensions
--sym=SYMBOL Unique symbol to use for RTCD initialization function
--config=FILE File with CONFIG_FOO=yes lines to parse
EOF
exit 1
}
die() {
echo "$@" >&2
exit 1
}
die_argument_required() {
die "Option $opt requires argument"
}
for opt; do
optval="${opt#*=}"
case "$opt" in
--arch) die_argument_required;;
--arch=*) arch=${optval};;
--disable-*) eval "disable_${opt#--disable-}=true";;
--require-*) REQUIRES="${REQUIRES}${opt#--require-} ";;
--sym) die_argument_required;;
--sym=*) symbol=${optval};;
--config=*) config_file=${optval};;
-h|--help)
usage
;;
-*)
die "Unrecognized option: ${opt%%=*}"
;;
*)
defs_file="$defs_file $opt"
;;
esac
shift
done
for f in $defs_file; do [ -f "$f" ] || usage; done
[ -n "$arch" ] || usage
# Import the configuration
[ -f "$config_file" ] && eval $(grep CONFIG_ "$config_file")
#
# Routines for the RTCD DSL to call
#
prototype() {
local rtyp
case "$1" in
unsigned) rtyp="$1 "; shift;;
esac
rtyp="${rtyp}$1"
local fn="$2"
local args="$3"
eval "${2}_rtyp='$rtyp'"
eval "${2}_args='$3'"
ALL_FUNCS="$ALL_FUNCS $fn"
specialize $fn c
}
specialize() {
local fn="$1"
shift
for opt in "$@"; do
eval "${fn}_${opt}=${fn}_${opt}"
done
}
require() {
for fn in $ALL_FUNCS; do
for opt in "$@"; do
local ofn=$(eval "echo \$${fn}_${opt}")
[ -z "$ofn" ] && continue
# if we already have a default, then we can disable it, as we know
# we can do better.
local best=$(eval "echo \$${fn}_default")
local best_ofn=$(eval "echo \$${best}")
[ -n "$best" ] && [ "$best_ofn" != "$ofn" ] && eval "${best}_link=false"
eval "${fn}_default=${fn}_${opt}"
eval "${fn}_${opt}_link=true"
done
done
}
forward_decls() {
ALL_FORWARD_DECLS="$ALL_FORWARD_DECLS $1"
}
#
# Include the user's directives
#
for f in $defs_file; do
. $f
done
#
# Process the directives according to the command line
#
process_forward_decls() {
for fn in $ALL_FORWARD_DECLS; do
eval $fn
done
}
determine_indirection() {
[ "$CONFIG_RUNTIME_CPU_DETECT" = "yes" ] || require $ALL_ARCHS
for fn in $ALL_FUNCS; do
local n=""
local rtyp="$(eval "echo \$${fn}_rtyp")"
local args="$(eval "echo \"\$${fn}_args\"")"
local dfn="$(eval "echo \$${fn}_default")"
dfn=$(eval "echo \$${dfn}")
for opt in "$@"; do
local ofn=$(eval "echo \$${fn}_${opt}")
[ -z "$ofn" ] && continue
local link=$(eval "echo \$${fn}_${opt}_link")
[ "$link" = "false" ] && continue
n="${n}x"
done
if [ "$n" = "x" ]; then
eval "${fn}_indirect=false"
else
eval "${fn}_indirect=true"
fi
done
}
declare_function_pointers() {
for fn in $ALL_FUNCS; do
local rtyp="$(eval "echo \$${fn}_rtyp")"
local args="$(eval "echo \"\$${fn}_args\"")"
local dfn="$(eval "echo \$${fn}_default")"
dfn=$(eval "echo \$${dfn}")
for opt in "$@"; do
local ofn=$(eval "echo \$${fn}_${opt}")
[ -z "$ofn" ] && continue
echo "$rtyp ${ofn}($args);"
done
if [ "$(eval "echo \$${fn}_indirect")" = "false" ]; then
echo "#define ${fn} ${dfn}"
else
echo "RTCD_EXTERN $rtyp (*${fn})($args);"
fi
echo
done
}
set_function_pointers() {
for fn in $ALL_FUNCS; do
local n=""
local rtyp="$(eval "echo \$${fn}_rtyp")"
local args="$(eval "echo \"\$${fn}_args\"")"
local dfn="$(eval "echo \$${fn}_default")"
dfn=$(eval "echo \$${dfn}")
if $(eval "echo \$${fn}_indirect"); then
echo " $fn = $dfn;"
for opt in "$@"; do
local ofn=$(eval "echo \$${fn}_${opt}")
[ -z "$ofn" ] && continue
[ "$ofn" = "$dfn" ] && continue;
local link=$(eval "echo \$${fn}_${opt}_link")
[ "$link" = "false" ] && continue
local cond="$(eval "echo \$have_${opt}")"
echo " if (${cond}) $fn = $ofn;"
done
fi
echo
done
}
filter() {
local filtered
for opt in "$@"; do
[ -z $(eval "echo \$disable_${opt}") ] && filtered="$filtered $opt"
done
echo $filtered
}
#
# Helper functions for generating the arch specific RTCD files
#
common_top() {
local outfile_basename=$(basename ${symbol:-rtcd.h})
local include_guard=$(echo $outfile_basename | tr '[a-z]' '[A-Z]' | tr -c '[A-Z]' _)
cat <<EOF
#ifndef ${include_guard}
#define ${include_guard}
#ifdef RTCD_C
#define RTCD_EXTERN
#else
#define RTCD_EXTERN extern
#endif
$(process_forward_decls)
$(declare_function_pointers c $ALL_ARCHS)
EOF
}
common_bottom() {
cat <<EOF
#endif
EOF
}
x86() {
determine_indirection c $ALL_ARCHS
# Assign the helper variable for each enabled extension
for opt in $ALL_ARCHS; do
local uc=$(echo $opt | tr '[a-z]' '[A-Z]')
eval "have_${opt}=\"flags & HAS_${uc}\""
done
cat <<EOF
$(common_top)
void ${symbol:-rtcd}(void);
#ifdef RTCD_C
#include "vpx_ports/x86.h"
void ${symbol:-rtcd}(void)
{
int flags = x86_simd_caps();
(void)flags;
$(set_function_pointers c $ALL_ARCHS)
}
#endif
$(common_bottom)
EOF
}
arm() {
determine_indirection c $ALL_ARCHS
# Assign the helper variable for each enabled extension
for opt in $ALL_ARCHS; do
local uc=$(echo $opt | tr '[a-z]' '[A-Z]')
eval "have_${opt}=\"flags & HAS_${uc}\""
done
cat <<EOF
$(common_top)
#include "vpx_config.h"
void ${symbol:-rtcd}(void);
#ifdef RTCD_C
#include "vpx_ports/arm.h"
void ${symbol:-rtcd}(void)
{
int flags = arm_cpu_caps();
(void)flags;
$(set_function_pointers c $ALL_ARCHS)
}
#endif
$(common_bottom)
EOF
}
unoptimized() {
determine_indirection c
cat <<EOF
$(common_top)
#include "vpx_config.h"
void ${symbol:-rtcd}(void);
#ifdef RTCD_C
void ${symbol:-rtcd}(void)
{
$(set_function_pointers c)
}
#endif
$(common_bottom)
EOF
}
#
# Main Driver
#
require c
case $arch in
x86)
ALL_ARCHS=$(filter mmx sse sse2 sse3 ssse3 sse4_1)
x86
;;
x86_64)
ALL_ARCHS=$(filter mmx sse sse2 sse3 ssse3 sse4_1)
REQUIRES=${REQUIRES:-mmx sse sse2}
require $(filter $REQUIRES)
x86
;;
armv5te)
ALL_ARCHS=$(filter edsp)
arm
;;
armv6)
ALL_ARCHS=$(filter edsp media)
arm
;;
armv7)
ALL_ARCHS=$(filter edsp media neon)
arm
;;
*)
unoptimized
;;
esac

43
configure vendored
View File

@@ -39,6 +39,7 @@ Advanced options:
${toggle_multithread} multithreaded encoding and decoding
${toggle_spatial_resampling} spatial sampling (scaling) support
${toggle_realtime_only} enable this option while building for real-time encoding
${toggle_onthefly_bitpacking} enable on-the-fly bitpacking in real-time encoding
${toggle_error_concealment} enable this option to get a decoder which is able to conceal losses
${toggle_runtime_cpu_detect} runtime cpu detection
${toggle_shared} shared library support
@@ -46,6 +47,7 @@ Advanced options:
${toggle_small} favor smaller size over speed
${toggle_postproc_visualizer} macro block / block level visualizers
${toggle_multi_res_encoding} enable multiple-resolution encoding
${toggle_temporal_denoising} enable temporal denoising and disable the spatial denoiser
Codecs:
Codecs can be selectively enabled or disabled individually, or by family:
@@ -107,8 +109,11 @@ all_platforms="${all_platforms} x86-darwin8-icc"
all_platforms="${all_platforms} x86-darwin9-gcc"
all_platforms="${all_platforms} x86-darwin9-icc"
all_platforms="${all_platforms} x86-darwin10-gcc"
all_platforms="${all_platforms} x86-darwin11-gcc"
all_platforms="${all_platforms} x86-darwin12-gcc"
all_platforms="${all_platforms} x86-linux-gcc"
all_platforms="${all_platforms} x86-linux-icc"
all_platforms="${all_platforms} x86-os2-gcc"
all_platforms="${all_platforms} x86-solaris-gcc"
all_platforms="${all_platforms} x86-win32-gcc"
all_platforms="${all_platforms} x86-win32-vs7"
@@ -117,6 +122,7 @@ all_platforms="${all_platforms} x86-win32-vs9"
all_platforms="${all_platforms} x86_64-darwin9-gcc"
all_platforms="${all_platforms} x86_64-darwin10-gcc"
all_platforms="${all_platforms} x86_64-darwin11-gcc"
all_platforms="${all_platforms} x86_64-darwin12-gcc"
all_platforms="${all_platforms} x86_64-linux-gcc"
all_platforms="${all_platforms} x86_64-linux-icc"
all_platforms="${all_platforms} x86_64-solaris-gcc"
@@ -125,6 +131,9 @@ all_platforms="${all_platforms} x86_64-win64-vs8"
all_platforms="${all_platforms} x86_64-win64-vs9"
all_platforms="${all_platforms} universal-darwin8-gcc"
all_platforms="${all_platforms} universal-darwin9-gcc"
all_platforms="${all_platforms} universal-darwin10-gcc"
all_platforms="${all_platforms} universal-darwin11-gcc"
all_platforms="${all_platforms} universal-darwin12-gcc"
all_platforms="${all_platforms} generic-gnu"
# all_targets is a list of all targets that can be configured
@@ -163,6 +172,7 @@ enable md5
enable spatial_resampling
enable multithread
enable os_support
enable temporal_denoising
[ -d ${source_path}/../include ] && enable alt_tree_layout
for d in vp8; do
@@ -176,6 +186,8 @@ else
# customer environment
[ -f ${source_path}/../include/vpx/vp8cx.h ] && CODECS="${CODECS} vp8_encoder"
[ -f ${source_path}/../include/vpx/vp8dx.h ] && CODECS="${CODECS} vp8_decoder"
[ -f ${source_path}/../include/vpx/vp8cx.h ] || disable vp8_encoder
[ -f ${source_path}/../include/vpx/vp8dx.h ] || disable vp8_decoder
[ -f ${source_path}/../lib/*/*mt.lib ] && soft_enable static_msvcrt
fi
@@ -192,9 +204,9 @@ ARCH_LIST="
ppc64
"
ARCH_EXT_LIST="
armv5te
armv6
armv7
edsp
media
neon
mips32
@@ -252,6 +264,7 @@ CONFIG_LIST="
static_msvcrt
spatial_resampling
realtime_only
onthefly_bitpacking
error_concealment
shared
static
@@ -260,6 +273,7 @@ CONFIG_LIST="
os_support
unit_tests
multi_res_encoding
temporal_denoising
"
CMDLINE_SELECT="
extra_warnings
@@ -296,6 +310,7 @@ CMDLINE_SELECT="
mem_tracker
spatial_resampling
realtime_only
onthefly_bitpacking
error_concealment
shared
static
@@ -303,6 +318,7 @@ CMDLINE_SELECT="
postproc_visualizer
unit_tests
multi_res_encoding
temporal_denoising
"
process_cmdline() {
@@ -483,11 +499,20 @@ process_toolchain() {
case $toolchain in
universal-darwin*)
local darwin_ver=${tgt_os##darwin}
fat_bin_archs="$fat_bin_archs ppc32-${tgt_os}-gcc"
# Intel
fat_bin_archs="$fat_bin_archs x86-${tgt_os}-${tgt_cc}"
if [ $darwin_ver -gt 8 ]; then
# Snow Leopard (10.6/darwin10) dropped support for PPC
# Include PPC support for all prior versions
if [ $darwin_ver -lt 10 ]; then
fat_bin_archs="$fat_bin_archs ppc32-${tgt_os}-gcc"
fi
# Tiger (10.4/darwin8) brought support for x86
if [ $darwin_ver -ge 8 ]; then
fat_bin_archs="$fat_bin_archs x86-${tgt_os}-${tgt_cc}"
fi
# Leopard (10.5/darwin9) brought 64 bit support
if [ $darwin_ver -ge 9 ]; then
fat_bin_archs="$fat_bin_archs x86_64-${tgt_os}-${tgt_cc}"
fi
;;
@@ -503,6 +528,10 @@ process_toolchain() {
check_add_cflags -Wpointer-arith
check_add_cflags -Wtype-limits
check_add_cflags -Wcast-qual
check_add_cflags -Wimplicit-function-declaration
check_add_cflags -Wuninitialized
check_add_cflags -Wunused-variable
check_add_cflags -Wunused-but-set-variable
enabled extra_warnings || check_add_cflags -Wno-unused-function
fi

View File

@@ -21,9 +21,6 @@ CODEC_DOX := mainpage.dox \
usage_dx.dox \
# Other doxy files sourced in Markdown
TXT_DOX-$(CONFIG_VP8) += vp8_api1_migration.dox
vp8_api1_migration.dox.DESC = VP8 API 1.x Migration
TXT_DOX = $(call enabled,TXT_DOX)
%.dox: %.txt

View File

@@ -32,6 +32,7 @@ vpxenc.SRCS += args.c args.h y4minput.c y4minput.h
vpxenc.SRCS += tools_common.c tools_common.h
vpxenc.SRCS += vpx_ports/mem_ops.h
vpxenc.SRCS += vpx_ports/mem_ops_aligned.h
vpxenc.SRCS += vpx_ports/vpx_timer.h
vpxenc.SRCS += libmkv/EbmlIDs.h
vpxenc.SRCS += libmkv/EbmlWriter.c
vpxenc.SRCS += libmkv/EbmlWriter.h
@@ -168,12 +169,12 @@ $(eval $(if $(filter universal%,$(TOOLCHAIN)),LIPO_OBJS,BUILD_OBJS):=yes)
# Create build/install dependencies for all examples. The common case
# is handled here. The MSVS case is handled below.
NOT_MSVS = $(if $(CONFIG_MSVS),,yes)
DIST-BINS-$(NOT_MSVS) += $(addprefix bin/,$(ALL_EXAMPLES:.c=))
INSTALL-BINS-$(NOT_MSVS) += $(addprefix bin/,$(UTILS:.c=))
DIST-BINS-$(NOT_MSVS) += $(addprefix bin/,$(ALL_EXAMPLES:.c=$(EXE_SFX)))
INSTALL-BINS-$(NOT_MSVS) += $(addprefix bin/,$(UTILS:.c=$(EXE_SFX)))
DIST-SRCS-yes += $(ALL_SRCS)
INSTALL-SRCS-yes += $(UTIL_SRCS)
OBJS-$(NOT_MSVS) += $(if $(BUILD_OBJS),$(call objs,$(ALL_SRCS)))
BINS-$(NOT_MSVS) += $(addprefix $(BUILD_PFX),$(ALL_EXAMPLES:.c=))
BINS-$(NOT_MSVS) += $(addprefix $(BUILD_PFX),$(ALL_EXAMPLES:.c=$(EXE_SFX)))
# Instantiate linker template for all examples.
@@ -183,7 +184,7 @@ $(foreach bin,$(BINS-yes),\
$(if $(BUILD_OBJS),$(eval $(bin):\
$(LIB_PATH)/lib$(CODEC_LIB)$(CODEC_LIB_SUF)))\
$(if $(BUILD_OBJS),$(eval $(call linker_template,$(bin),\
$(call objs,$($(notdir $(bin)).SRCS)) \
$(call objs,$($(notdir $(bin:$(EXE_SFX)=)).SRCS)) \
-l$(CODEC_LIB) $(addprefix -l,$(CODEC_EXTRA_LIBS))\
)))\
$(if $(LIPO_OBJS),$(eval $(call lipo_bin_template,$(bin))))\

31
libs.mk
View File

@@ -17,6 +17,7 @@ else
ASM:=.asm
endif
CODEC_SRCS-yes += CHANGELOG
CODEC_SRCS-yes += libs.mk
include $(SRC_PATH_BARE)/vpx/vpx_codec.mk
@@ -34,9 +35,9 @@ ifeq ($(CONFIG_VP8_ENCODER),yes)
include $(SRC_PATH_BARE)/$(VP8_PREFIX)vp8cx.mk
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_CX_SRCS))
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_CX_EXPORTS))
CODEC_SRCS-yes += $(VP8_PREFIX)vp8cx.mk vpx/vp8.h vpx/vp8cx.h vpx/vp8e.h
CODEC_SRCS-yes += $(VP8_PREFIX)vp8cx.mk vpx/vp8.h vpx/vp8cx.h
CODEC_SRCS-$(ARCH_ARM) += $(VP8_PREFIX)vp8cx_arm.mk
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8e.h include/vpx/vp8cx.h
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8cx.h
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8cx.h
CODEC_DOC_SECTIONS += vp8 vp8_encoder
@@ -48,7 +49,6 @@ ifeq ($(CONFIG_VP8_DECODER),yes)
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_DX_SRCS))
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_DX_EXPORTS))
CODEC_SRCS-yes += $(VP8_PREFIX)vp8dx.mk vpx/vp8.h vpx/vp8dx.h
CODEC_SRCS-$(ARCH_ARM) += $(VP8_PREFIX)vp8dx_arm.mk
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8dx.h
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8dx.h
@@ -90,6 +90,7 @@ endif
$(eval $(if $(filter universal%,$(TOOLCHAIN)),LIPO_LIBVPX,BUILD_LIBVPX):=yes)
CODEC_SRCS-$(BUILD_LIBVPX) += build/make/version.sh
CODEC_SRCS-$(BUILD_LIBVPX) += build/make/rtcd.sh
CODEC_SRCS-$(BUILD_LIBVPX) += vpx/vpx_integer.h
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/asm_offsets.h
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/vpx_timer.h
@@ -114,7 +115,6 @@ INSTALL-LIBS-yes += include/vpx/vpx_integer.h
INSTALL-LIBS-yes += include/vpx/vpx_codec_impl_top.h
INSTALL-LIBS-yes += include/vpx/vpx_codec_impl_bottom.h
INSTALL-LIBS-$(CONFIG_DECODERS) += include/vpx/vpx_decoder.h
INSTALL-LIBS-$(CONFIG_DECODERS) += include/vpx/vpx_decoder_compat.h
INSTALL-LIBS-$(CONFIG_ENCODERS) += include/vpx/vpx_encoder.h
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
ifeq ($(CONFIG_MSVS),yes)
@@ -183,6 +183,7 @@ vpx.vcproj: $(CODEC_SRCS) vpx.def
PROJECTS-$(BUILD_LIBVPX) += vpx.vcproj
vpx.vcproj: vpx_config.asm
vpx.vcproj: vpx_rtcd.h
endif
else
@@ -232,7 +233,7 @@ vpx.pc: config.mk libs.mk
$(qexec)echo '# pkg-config file from libvpx $(VERSION_STRING)' > $@
$(qexec)echo 'prefix=$(PREFIX)' >> $@
$(qexec)echo 'exec_prefix=$${prefix}' >> $@
$(qexec)echo 'libdir=$${prefix}/lib' >> $@
$(qexec)echo 'libdir=$${prefix}/$(LIBSUBDIR)' >> $@
$(qexec)echo 'includedir=$${prefix}/include' >> $@
$(qexec)echo '' >> $@
$(qexec)echo 'Name: vpx' >> $@
@@ -279,19 +280,21 @@ $(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)vpx_config.asm
# Calculate platform- and compiler-specific offsets for hand coded assembly
#
OFFSET_PATTERN:='^[a-zA-Z0-9_]* EQU'
ifeq ($(filter icc gcc,$(TGT_CC)), $(TGT_CC))
$(BUILD_PFX)asm_com_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S
grep -w EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
LC_ALL=C grep $(OFFSET_PATTERN) $< | tr -d '$$\#' $(ADS2GAS) > $@
$(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S: $(VP8_PREFIX)common/asm_com_offsets.c
CLEAN-OBJS += $(BUILD_PFX)asm_com_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S
$(BUILD_PFX)asm_enc_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S
grep -w EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
LC_ALL=C grep $(OFFSET_PATTERN) $< | tr -d '$$\#' $(ADS2GAS) > $@
$(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S: $(VP8_PREFIX)encoder/asm_enc_offsets.c
CLEAN-OBJS += $(BUILD_PFX)asm_enc_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S
$(BUILD_PFX)asm_dec_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S
grep -w EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
LC_ALL=C grep $(OFFSET_PATTERN) $< | tr -d '$$\#' $(ADS2GAS) > $@
$(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S: $(VP8_PREFIX)decoder/asm_dec_offsets.c
CLEAN-OBJS += $(BUILD_PFX)asm_dec_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S
else
@@ -322,6 +325,18 @@ endif
$(shell $(SRC_PATH_BARE)/build/make/version.sh "$(SRC_PATH_BARE)" $(BUILD_PFX)vpx_version.h)
CLEAN-OBJS += $(BUILD_PFX)vpx_version.h
#
# Rule to generate runtime cpu detection files
#
$(OBJS-yes:.o=.d): $(BUILD_PFX)vpx_rtcd.h
$(BUILD_PFX)vpx_rtcd.h: $(SRC_PATH_BARE)/$(sort $(filter %rtcd_defs.sh,$(CODEC_SRCS)))
@echo " [CREATE] $@"
$(qexec)$(SRC_PATH_BARE)/build/make/rtcd.sh --arch=$(TGT_ISA) \
--sym=vpx_rtcd \
--config=$(target)$(if $(FAT_ARCHS),,-$(TOOLCHAIN)).mk \
$(RTCD_OPTIONS) $^ > $@
CLEAN-OBJS += $(BUILD_PFX)vpx_rtcd.h
CODEC_DOC_SRCS += vpx/vpx_codec.h \
vpx/vpx_decoder.h \
vpx/vpx_encoder.h \

View File

@@ -12,8 +12,12 @@
This distribution of the WebM VP8 Codec SDK includes the following support:
\if vp8_encoder - \ref vp8_encoder \endif
\if vp8_decoder - \ref vp8_decoder \endif
\if vp8_encoder
- \ref vp8_encoder
\endif
\if vp8_decoder
- \ref vp8_decoder
\endif
\section main_startpoints Starting Points
@@ -24,8 +28,12 @@
- Read the \ref samples "sample code" for examples of how to interact with the
codec.
- \ref codec reference
\if encoder - \ref encoder reference \endif
\if decoder - \ref decoder reference \endif
\if encoder
- \ref encoder reference
\endif
\if decoder
- \ref decoder reference
\endif
\section main_support Support Options & FAQ
The WebM project is an open source project supported by its community. For

160
tools/ftfy.sh Executable file
View File

@@ -0,0 +1,160 @@
#!/bin/sh
self="$0"
dirname_self=$(dirname "$self")
usage() {
cat <<EOF >&2
Usage: $self [option]
This script applies a whitespace transformation to the commit at HEAD. If no
options are given, then the modified files are left in the working tree.
Options:
-h, --help Shows this message
-n, --dry-run Shows a diff of the changes to be made.
--amend Squashes the changes into the commit at HEAD
This option will also reformat the commit message.
--commit Creates a new commit containing only the whitespace changes
--msg-only Reformat the commit message only, ignore the patch itself.
EOF
rm -f ${CLEAN_FILES}
exit 1
}
log() {
echo "${self##*/}: $@" >&2
}
vpx_style() {
astyle --style=bsd --min-conditional-indent=0 --break-blocks \
--pad-oper --pad-header --unpad-paren \
--align-pointer=name \
--indent-preprocessor --convert-tabs --indent-labels \
--suffix=none --quiet "$@"
sed -i 's/[[:space:]]\{1,\},/,/g' "$@"
}
apply() {
[ $INTERSECT_RESULT -ne 0 ] && patch -p1 < "$1"
}
commit() {
LAST_CHANGEID=$(git show | awk '/Change-Id:/{print $2}')
if [ -z "$LAST_CHANGEID" ]; then
log "HEAD doesn't have a Change-Id, unable to generate a new commit"
exit 1
fi
# Build a deterministic Change-Id from the parent's
NEW_CHANGEID=${LAST_CHANGEID}-styled
NEW_CHANGEID=I$(echo $NEW_CHANGEID | git hash-object --stdin)
# Commit, preserving authorship from the parent commit.
git commit -a -C HEAD > /dev/null
git commit --amend -F- << EOF
Cosmetic: Fix whitespace in change ${LAST_CHANGEID:0:9}
Change-Id: ${NEW_CHANGEID}
EOF
}
show_commit_msg_diff() {
if [ $DIFF_MSG_RESULT -ne 0 ]; then
log "Modified commit message:"
diff -u "$ORIG_COMMIT_MSG" "$NEW_COMMIT_MSG" | tail -n +3
fi
}
amend() {
show_commit_msg_diff
if [ $DIFF_MSG_RESULT -ne 0 ] || [ $INTERSECT_RESULT -ne 0 ]; then
git commit -a --amend -F "$NEW_COMMIT_MSG"
fi
}
diff_msg() {
git log -1 --format=%B > "$ORIG_COMMIT_MSG"
"${dirname_self}"/wrap-commit-msg.py \
< "$ORIG_COMMIT_MSG" > "$NEW_COMMIT_MSG"
cmp -s "$ORIG_COMMIT_MSG" "$NEW_COMMIT_MSG"
DIFF_MSG_RESULT=$?
}
# Temporary files
ORIG_DIFF=orig.diff.$$
MODIFIED_DIFF=modified.diff.$$
FINAL_DIFF=final.diff.$$
ORIG_COMMIT_MSG=orig.commit-msg.$$
NEW_COMMIT_MSG=new.commit-msg.$$
CLEAN_FILES="${ORIG_DIFF} ${MODIFIED_DIFF} ${FINAL_DIFF}"
CLEAN_FILES="${CLEAN_FILES} ${ORIG_COMMIT_MSG} ${NEW_COMMIT_MSG}"
# Preconditions
[ $# -lt 2 ] || usage
# Check that astyle supports pad-header and align-pointer=name
if ! astyle --pad-header --align-pointer=name < /dev/null; then
log "Install astyle v1.24 or newer"
exit 1
fi
if ! git diff --quiet HEAD; then
log "Working tree is dirty, commit your changes first"
exit 1
fi
# Need to be in the root
cd "$(git rev-parse --show-toplevel)"
# Collect the original diff
git show > "${ORIG_DIFF}"
# Apply the style guide on new and modified files and collect its diff
for f in $(git diff HEAD^ --name-only -M90 --diff-filter=AM \
| grep '\.[ch]$'); do
case "$f" in
third_party/*) continue;;
nestegg/*) continue;;
esac
vpx_style "$f"
done
git diff --no-color --no-ext-diff > "${MODIFIED_DIFF}"
# Intersect the two diffs
"${dirname_self}"/intersect-diffs.py \
"${ORIG_DIFF}" "${MODIFIED_DIFF}" > "${FINAL_DIFF}"
INTERSECT_RESULT=$?
git reset --hard >/dev/null
# Fixup the commit message
diff_msg
# Handle options
if [ -n "$1" ]; then
case "$1" in
-h|--help) usage;;
-n|--dry-run) cat "${FINAL_DIFF}"; show_commit_msg_diff;;
--commit) apply "${FINAL_DIFF}"; commit;;
--amend) apply "${FINAL_DIFF}"; amend;;
--msg-only) amend;;
*) usage;;
esac
else
apply "${FINAL_DIFF}"
if ! git diff --quiet; then
log "Formatting changes applied, verify and commit."
log "See also: http://www.webmproject.org/code/contribute/conventions/"
git diff --stat
fi
fi
rm -f ${CLEAN_FILES}

188
tools/intersect-diffs.py Executable file
View File

@@ -0,0 +1,188 @@
#!/usr/bin/env python
## Copyright (c) 2012 The WebM project authors. All Rights Reserved.
##
## Use of this source code is governed by a BSD-style license
## that can be found in the LICENSE file in the root of the source
## tree. An additional intellectual property rights grant can be found
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
"""Calculates the "intersection" of two unified diffs.
Given two diffs, A and B, it finds all hunks in B that had non-context lines
in A and prints them to stdout. This is useful to determine the hunks in B that
are relevant to A. The resulting file can be applied with patch(1) on top of A.
"""
__author__ = "jkoleszar@google.com"
import re
import sys
class DiffLines(object):
"""A container for one half of a diff."""
def __init__(self, filename, offset, length):
self.filename = filename
self.offset = offset
self.length = length
self.lines = []
self.delta_line_nums = []
def Append(self, line):
l = len(self.lines)
if line[0] != " ":
self.delta_line_nums.append(self.offset + l)
self.lines.append(line[1:])
assert l+1 <= self.length
def Complete(self):
return len(self.lines) == self.length
def __contains__(self, item):
return item >= self.offset and item <= self.offset + self.length - 1
class DiffHunk(object):
"""A container for one diff hunk, consisting of two DiffLines."""
def __init__(self, header, file_a, file_b, start_a, len_a, start_b, len_b):
self.header = header
self.left = DiffLines(file_a, start_a, len_a)
self.right = DiffLines(file_b, start_b, len_b)
self.lines = []
def Append(self, line):
"""Adds a line to the DiffHunk and its DiffLines children."""
if line[0] == "-":
self.left.Append(line)
elif line[0] == "+":
self.right.Append(line)
elif line[0] == " ":
self.left.Append(line)
self.right.Append(line)
else:
assert False, ("Unrecognized character at start of diff line "
"%r" % line[0])
self.lines.append(line)
def Complete(self):
return self.left.Complete() and self.right.Complete()
def __repr__(self):
return "DiffHunk(%s, %s, len %d)" % (
self.left.filename, self.right.filename,
max(self.left.length, self.right.length))
def ParseDiffHunks(stream):
"""Walk a file-like object, yielding DiffHunks as they're parsed."""
file_regex = re.compile(r"(\+\+\+|---) (\S+)")
range_regex = re.compile(r"@@ -(\d+)(,(\d+))? \+(\d+)(,(\d+))?")
hunk = None
while True:
line = stream.readline()
if not line:
break
if hunk is None:
# Parse file names
diff_file = file_regex.match(line)
if diff_file:
if line.startswith("---"):
a_line = line
a = diff_file.group(2)
continue
if line.startswith("+++"):
b_line = line
b = diff_file.group(2)
continue
# Parse offset/lengths
diffrange = range_regex.match(line)
if diffrange:
if diffrange.group(2):
start_a = int(diffrange.group(1))
len_a = int(diffrange.group(3))
else:
start_a = 1
len_a = int(diffrange.group(1))
if diffrange.group(5):
start_b = int(diffrange.group(4))
len_b = int(diffrange.group(6))
else:
start_b = 1
len_b = int(diffrange.group(4))
header = [a_line, b_line, line]
hunk = DiffHunk(header, a, b, start_a, len_a, start_b, len_b)
else:
# Add the current line to the hunk
hunk.Append(line)
# See if the whole hunk has been parsed. If so, yield it and prepare
# for the next hunk.
if hunk.Complete():
yield hunk
hunk = None
# Partial hunks are a parse error
assert hunk is None
def FormatDiffHunks(hunks):
"""Re-serialize a list of DiffHunks."""
r = []
last_header = None
for hunk in hunks:
this_header = hunk.header[0:2]
if last_header != this_header:
r.extend(hunk.header)
last_header = this_header
else:
r.extend(hunk.header[2])
r.extend(hunk.lines)
r.append("\n")
return "".join(r)
def ZipHunks(rhs_hunks, lhs_hunks):
"""Join two hunk lists on filename."""
for rhs_hunk in rhs_hunks:
rhs_file = rhs_hunk.right.filename.split("/")[1:]
for lhs_hunk in lhs_hunks:
lhs_file = lhs_hunk.left.filename.split("/")[1:]
if lhs_file != rhs_file:
continue
yield (rhs_hunk, lhs_hunk)
def main():
old_hunks = [x for x in ParseDiffHunks(open(sys.argv[1], "r"))]
new_hunks = [x for x in ParseDiffHunks(open(sys.argv[2], "r"))]
out_hunks = []
# Join the right hand side of the older diff with the left hand side of the
# newer diff.
for old_hunk, new_hunk in ZipHunks(old_hunks, new_hunks):
if new_hunk in out_hunks:
continue
old_lines = old_hunk.right
new_lines = new_hunk.left
# Determine if this hunk overlaps any non-context line from the other
for i in old_lines.delta_line_nums:
if i in new_lines:
out_hunks.append(new_hunk)
break
if out_hunks:
print FormatDiffHunks(out_hunks)
sys.exit(1)
if __name__ == "__main__":
main()

70
tools/wrap-commit-msg.py Executable file
View File

@@ -0,0 +1,70 @@
#!/usr/bin/env python
## Copyright (c) 2012 The WebM project authors. All Rights Reserved.
##
## Use of this source code is governed by a BSD-style license
## that can be found in the LICENSE file in the root of the source
## tree. An additional intellectual property rights grant can be found
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
"""Wraps paragraphs of text, preserving manual formatting
This is like fold(1), but has the special convention of not modifying lines
that start with whitespace. This allows you to intersperse blocks with
special formatting, like code blocks, with written prose. The prose will
be wordwrapped, and the manual formatting will be preserved.
* This won't handle the case of a bulleted (or ordered) list specially, so
manual wrapping must be done.
Occasionally it's useful to put something with explicit formatting that
doesn't look at all like a block of text inline.
indicator = has_leading_whitespace(line);
if (indicator)
preserve_formatting(line);
The intent is that this docstring would make it through the transform
and still be legible and presented as it is in the source. If additional
cases are handled, update this doc to describe the effect.
"""
__author__ = "jkoleszar@google.com"
import textwrap
import sys
def wrap(text):
if text:
return textwrap.fill(text, break_long_words=False) + '\n'
return ""
def main(fileobj):
text = ""
output = ""
while True:
line = fileobj.readline()
if not line:
break
if line.lstrip() == line:
text += line
else:
output += wrap(text)
text=""
output += line
output += wrap(text)
# Replace the file or write to stdout.
if fileobj == sys.stdin:
fileobj = sys.stdout
else:
fileobj.seek(0)
fileobj.truncate(0)
fileobj.write(output)
if __name__ == "__main__":
if len(sys.argv) > 1:
main(open(sys.argv[1], "r+"))
else:
main(sys.stdin)

View File

@@ -9,15 +9,21 @@
*/
#include <stdio.h>
#include "tools_common.h"
#ifdef _WIN32
#if defined(_WIN32) || defined(__OS2__)
#include <io.h>
#include <fcntl.h>
#ifdef __OS2__
#define _setmode setmode
#define _fileno fileno
#define _O_BINARY O_BINARY
#endif
#endif
FILE* set_binary_mode(FILE *stream)
{
(void)stream;
#ifdef _WIN32
#if defined(_WIN32) || defined(__OS2__)
_setmode(_fileno(stream), _O_BINARY);
#endif
return stream;

View File

@@ -1,6 +1,6 @@
/*!\page usage Usage
The vpx Multi-Format codec SDK provides a unified interface amongst its
The vpx multi-format codec SDK provides a unified interface amongst its
supported codecs. This abstraction allows applications using this SDK to
easily support multiple video formats with minimal code duplication or
"special casing." This section describes the interface common to all codecs.
@@ -14,8 +14,12 @@
Fore more information on decoder and encoder specific usage, see the
following pages:
\if decoder - \subpage usage_decode \endif
\if decoder - \subpage usage_encode \endif
\if decoder
- \subpage usage_decode
\endif
\if decoder
- \subpage usage_encode
\endif
\section usage_types Important Data Types
There are two important data structures to consider in this interface.

View File

@@ -37,14 +37,15 @@ static void update_mode_info_border(MODE_INFO *mi, int rows, int cols)
void vp8_de_alloc_frame_buffers(VP8_COMMON *oci)
{
int i;
for (i = 0; i < NUM_YV12_BUFFERS; i++)
vp8_yv12_de_alloc_frame_buffer(&oci->yv12_fb[i]);
vp8_yv12_de_alloc_frame_buffer(&oci->temp_scale_frame);
#if CONFIG_POSTPROC
vp8_yv12_de_alloc_frame_buffer(&oci->post_proc_buffer);
if (oci->post_proc_buffer_int_used)
vp8_yv12_de_alloc_frame_buffer(&oci->post_proc_buffer_int);
#endif
vpx_free(oci->above_context);
vpx_free(oci->mip);
@@ -97,6 +98,7 @@ int vp8_alloc_frame_buffers(VP8_COMMON *oci, int width, int height)
return 1;
}
#if CONFIG_POSTPROC
if (vp8_yv12_alloc_frame_buffer(&oci->post_proc_buffer, width, height, VP8BORDERINPIXELS) < 0)
{
vp8_de_alloc_frame_buffers(oci);
@@ -104,6 +106,9 @@ int vp8_alloc_frame_buffers(VP8_COMMON *oci, int width, int height)
}
oci->post_proc_buffer_int_used = 0;
vpx_memset(&oci->postproc_state, 0, sizeof(oci->postproc_state));
vpx_memset((&oci->post_proc_buffer)->buffer_alloc,128,(&oci->post_proc_buffer)->frame_size);
#endif
oci->mb_rows = height >> 4;
oci->mb_cols = width >> 4;
@@ -203,7 +208,7 @@ void vp8_create_common(VP8_COMMON *oci)
oci->clr_type = REG_YUV;
oci->clamp_type = RECON_CLAMP_REQUIRED;
/* Initialise reference frame sign bias structure to defaults */
/* Initialize reference frame sign bias structure to defaults */
vpx_memset(oci->ref_frame_sign_bias, 0, sizeof(oci->ref_frame_sign_bias));
/* Default disable buffer to buffer copying */
@@ -215,13 +220,3 @@ void vp8_remove_common(VP8_COMMON *oci)
{
vp8_de_alloc_frame_buffers(oci);
}
void vp8_initialize_common()
{
vp8_coef_tree_initialize();
vp8_entropy_mode_init();
vp8_init_scan_order_mask();
}

View File

@@ -1,115 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "vpx_config.h"
#include "vpx_ports/arm.h"
#include "vp8/common/pragmas.h"
#include "vp8/common/subpixel.h"
#include "vp8/common/loopfilter.h"
#include "vp8/common/recon.h"
#include "vp8/common/idct.h"
#include "vp8/common/onyxc_int.h"
void vp8_arch_arm_common_init(VP8_COMMON *ctx)
{
#if CONFIG_RUNTIME_CPU_DETECT
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
int flags = arm_cpu_caps();
rtcd->flags = flags;
/* Override default functions with fastest ones for this CPU. */
#if HAVE_ARMV5TE
if (flags & HAS_EDSP)
{
}
#endif
#if HAVE_ARMV6
if (flags & HAS_MEDIA)
{
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_armv6;
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_armv6;
rtcd->subpix.sixtap8x4 = vp8_sixtap_predict8x4_armv6;
rtcd->subpix.sixtap4x4 = vp8_sixtap_predict_armv6;
rtcd->subpix.bilinear16x16 = vp8_bilinear_predict16x16_armv6;
rtcd->subpix.bilinear8x8 = vp8_bilinear_predict8x8_armv6;
rtcd->subpix.bilinear8x4 = vp8_bilinear_predict8x4_armv6;
rtcd->subpix.bilinear4x4 = vp8_bilinear_predict4x4_armv6;
rtcd->idct.idct16 = vp8_short_idct4x4llm_v6_dual;
rtcd->idct.iwalsh16 = vp8_short_inv_walsh4x4_v6;
rtcd->loopfilter.normal_mb_v = vp8_loop_filter_mbv_armv6;
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_armv6;
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_armv6;
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_armv6;
rtcd->loopfilter.simple_mb_v =
vp8_loop_filter_simple_vertical_edge_armv6;
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_armv6;
rtcd->loopfilter.simple_mb_h =
vp8_loop_filter_simple_horizontal_edge_armv6;
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_armv6;
rtcd->recon.copy16x16 = vp8_copy_mem16x16_v6;
rtcd->recon.copy8x8 = vp8_copy_mem8x8_v6;
rtcd->recon.copy8x4 = vp8_copy_mem8x4_v6;
rtcd->recon.intra4x4_predict = vp8_intra4x4_predict_armv6;
rtcd->dequant.block = vp8_dequantize_b_v6;
rtcd->dequant.idct_add = vp8_dequant_idct_add_v6;
rtcd->dequant.idct_add_y_block = vp8_dequant_idct_add_y_block_v6;
rtcd->dequant.idct_add_uv_block = vp8_dequant_idct_add_uv_block_v6;
}
#endif
#if HAVE_ARMV7
if (flags & HAS_NEON)
{
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_neon;
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_neon;
rtcd->subpix.sixtap8x4 = vp8_sixtap_predict8x4_neon;
rtcd->subpix.sixtap4x4 = vp8_sixtap_predict_neon;
rtcd->subpix.bilinear16x16 = vp8_bilinear_predict16x16_neon;
rtcd->subpix.bilinear8x8 = vp8_bilinear_predict8x8_neon;
rtcd->subpix.bilinear8x4 = vp8_bilinear_predict8x4_neon;
rtcd->subpix.bilinear4x4 = vp8_bilinear_predict4x4_neon;
rtcd->idct.idct16 = vp8_short_idct4x4llm_neon;
rtcd->idct.iwalsh16 = vp8_short_inv_walsh4x4_neon;
rtcd->loopfilter.normal_mb_v = vp8_loop_filter_mbv_neon;
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_neon;
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_neon;
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_neon;
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_mbvs_neon;
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_neon;
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_mbhs_neon;
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_neon;
rtcd->recon.copy16x16 = vp8_copy_mem16x16_neon;
rtcd->recon.copy8x8 = vp8_copy_mem8x8_neon;
rtcd->recon.copy8x4 = vp8_copy_mem8x4_neon;
rtcd->recon.build_intra_predictors_mby =
vp8_build_intra_predictors_mby_neon;
rtcd->recon.build_intra_predictors_mby_s =
vp8_build_intra_predictors_mby_s_neon;
rtcd->dequant.block = vp8_dequantize_b_neon;
rtcd->dequant.idct_add = vp8_dequant_idct_add_neon;
rtcd->dequant.idct_add_y_block = vp8_dequant_idct_add_y_block_neon;
rtcd->dequant.idct_add_uv_block = vp8_dequant_idct_add_uv_block_neon;
}
#endif
#endif
}

View File

@@ -9,8 +9,7 @@
*/
#include "vpx_config.h"
#include "vp8/common/idct.h"
#include "vp8/common/dequantize.h"
#include "vpx_rtcd.h"
void vp8_dequant_idct_add_y_block_v6(short *q, short *dq,

View File

@@ -144,7 +144,7 @@ loop
ldr r6, [sp, #40] ; get address of sse
mul r0, r8, r8 ; sum * sum
str r11, [r6] ; store sse
sub r0, r11, r0, asr #8 ; return (sse - ((sum * sum) >> 8))
sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8))
ldmfd sp!, {r4-r12, pc}

View File

@@ -169,7 +169,7 @@ loop
ldr r6, [sp, #40] ; get address of sse
mul r0, r8, r8 ; sum * sum
str r11, [r6] ; store sse
sub r0, r11, r0, asr #8 ; return (sse - ((sum * sum) >> 8))
sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8))
ldmfd sp!, {r4-r12, pc}

View File

@@ -210,7 +210,7 @@ loop
ldr r6, [sp, #40] ; get address of sse
mul r0, r8, r8 ; sum * sum
str r11, [r6] ; store sse
sub r0, r11, r0, asr #8 ; return (sse - ((sum * sum) >> 8))
sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8))
ldmfd sp!, {r4-r12, pc}

View File

@@ -171,7 +171,7 @@ loop
ldr r6, [sp, #40] ; get address of sse
mul r0, r8, r8 ; sum * sum
str r11, [r6] ; store sse
sub r0, r11, r0, asr #8 ; return (sse - ((sum * sum) >> 8))
sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8))
ldmfd sp!, {r4-r12, pc}

View File

@@ -8,10 +8,10 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include <math.h>
#include "vp8/common/filter.h"
#include "vp8/common/subpixel.h"
#include "bilinearfilter_arm.h"
void vp8_filter_block2d_bil_armv6

View File

@@ -10,18 +10,17 @@
#include "vpx_config.h"
#include "vp8/common/dequantize.h"
#include "vp8/common/idct.h"
#include "vp8/common/blockd.h"
#if HAVE_ARMV7
#if HAVE_NEON
extern void vp8_dequantize_b_loop_neon(short *Q, short *DQC, short *DQ);
#endif
#if HAVE_ARMV6
#if HAVE_MEDIA
extern void vp8_dequantize_b_loop_v6(short *Q, short *DQC, short *DQ);
#endif
#if HAVE_ARMV7
#if HAVE_NEON
void vp8_dequantize_b_neon(BLOCKD *d, short *DQC)
{
@@ -32,7 +31,7 @@ void vp8_dequantize_b_neon(BLOCKD *d, short *DQC)
}
#endif
#if HAVE_ARMV6
#if HAVE_MEDIA
void vp8_dequantize_b_v6(BLOCKD *d, short *DQC)
{
short *DQ = d->dqcoeff;

View File

@@ -1,59 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef DEQUANTIZE_ARM_H
#define DEQUANTIZE_ARM_H
#if HAVE_ARMV6
extern prototype_dequant_block(vp8_dequantize_b_v6);
extern prototype_dequant_idct_add(vp8_dequant_idct_add_v6);
extern prototype_dequant_idct_add_y_block(vp8_dequant_idct_add_y_block_v6);
extern prototype_dequant_idct_add_uv_block(vp8_dequant_idct_add_uv_block_v6);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_dequant_block
#define vp8_dequant_block vp8_dequantize_b_v6
#undef vp8_dequant_idct_add
#define vp8_dequant_idct_add vp8_dequant_idct_add_v6
#undef vp8_dequant_idct_add_y_block
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_v6
#undef vp8_dequant_idct_add_uv_block
#define vp8_dequant_idct_add_uv_block vp8_dequant_idct_add_uv_block_v6
#endif
#endif
#if HAVE_ARMV7
extern prototype_dequant_block(vp8_dequantize_b_neon);
extern prototype_dequant_idct_add(vp8_dequant_idct_add_neon);
extern prototype_dequant_idct_add_y_block(vp8_dequant_idct_add_y_block_neon);
extern prototype_dequant_idct_add_uv_block(vp8_dequant_idct_add_uv_block_neon);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_dequant_block
#define vp8_dequant_block vp8_dequantize_b_neon
#undef vp8_dequant_idct_add
#define vp8_dequant_idct_add vp8_dequant_idct_add_neon
#undef vp8_dequant_idct_add_y_block
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_neon
#undef vp8_dequant_idct_add_uv_block
#define vp8_dequant_idct_add_uv_block vp8_dequant_idct_add_uv_block_neon
#endif
#endif
#endif

View File

@@ -10,9 +10,9 @@
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include <math.h>
#include "vp8/common/filter.h"
#include "vp8/common/subpixel.h"
#include "vpx_ports/mem.h"
extern void vp8_filter_block2d_first_pass_armv6
@@ -86,8 +86,8 @@ extern void vp8_filter_block2d_second_pass_only_armv6
const short *vp8_filter
);
#if HAVE_ARMV6
void vp8_sixtap_predict_armv6
#if HAVE_MEDIA
void vp8_sixtap_predict4x4_armv6
(
unsigned char *src_ptr,
int src_pixels_per_line,

View File

@@ -1,51 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef IDCT_ARM_H
#define IDCT_ARM_H
#if HAVE_ARMV6
extern prototype_idct(vp8_short_idct4x4llm_v6_dual);
extern prototype_idct_scalar_add(vp8_dc_only_idct_add_v6);
extern prototype_second_order(vp8_short_inv_walsh4x4_1_v6);
extern prototype_second_order(vp8_short_inv_walsh4x4_v6);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_idct_idct16
#define vp8_idct_idct16 vp8_short_idct4x4llm_v6_dual
#undef vp8_idct_idct1_scalar_add
#define vp8_idct_idct1_scalar_add vp8_dc_only_idct_add_v6
#undef vp8_idct_iwalsh16
#define vp8_idct_iwalsh16 vp8_short_inv_walsh4x4_v6
#endif
#endif
#if HAVE_ARMV7
extern prototype_idct(vp8_short_idct4x4llm_neon);
extern prototype_idct_scalar_add(vp8_dc_only_idct_add_neon);
extern prototype_second_order(vp8_short_inv_walsh4x4_1_neon);
extern prototype_second_order(vp8_short_inv_walsh4x4_neon);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_idct_idct16
#define vp8_idct_idct16 vp8_short_idct4x4llm_neon
#undef vp8_idct_idct1_scalar_add
#define vp8_idct_idct1_scalar_add vp8_dc_only_idct_add_neon
#undef vp8_idct_iwalsh16
#define vp8_idct_iwalsh16 vp8_short_inv_walsh4x4_neon
#endif
#endif
#endif

View File

@@ -10,17 +10,22 @@
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "vp8/common/loopfilter.h"
#include "vp8/common/onyxc_int.h"
#if HAVE_ARMV6
#define prototype_loopfilter(sym) \
void sym(unsigned char *src, int pitch, const unsigned char *blimit,\
const unsigned char *limit, const unsigned char *thresh, int count)
#if HAVE_MEDIA
extern prototype_loopfilter(vp8_loop_filter_horizontal_edge_armv6);
extern prototype_loopfilter(vp8_loop_filter_vertical_edge_armv6);
extern prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_armv6);
extern prototype_loopfilter(vp8_mbloop_filter_vertical_edge_armv6);
#endif
#if HAVE_ARMV7
#if HAVE_NEON
typedef void loopfilter_y_neon(unsigned char *src, int pitch,
unsigned char blimit, unsigned char limit, unsigned char thresh);
typedef void loopfilter_uv_neon(unsigned char *u, int pitch,
@@ -38,8 +43,8 @@ extern loopfilter_uv_neon vp8_mbloop_filter_horizontal_edge_uv_neon;
extern loopfilter_uv_neon vp8_mbloop_filter_vertical_edge_uv_neon;
#endif
#if HAVE_ARMV6
/*ARMV6 loopfilter functions*/
#if HAVE_MEDIA
/* ARMV6/MEDIA loopfilter functions*/
/* Horizontal MB filtering */
void vp8_loop_filter_mbh_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
int y_stride, int uv_stride, loop_filter_info *lfi)
@@ -113,7 +118,7 @@ void vp8_loop_filter_bvs_armv6(unsigned char *y_ptr, int y_stride,
}
#endif
#if HAVE_ARMV7
#if HAVE_NEON
/* NEON loopfilter functions */
/* Horizontal MB filtering */
void vp8_loop_filter_mbh_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,

View File

@@ -1,93 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef LOOPFILTER_ARM_H
#define LOOPFILTER_ARM_H
#include "vpx_config.h"
#if HAVE_ARMV6
extern prototype_loopfilter_block(vp8_loop_filter_mbv_armv6);
extern prototype_loopfilter_block(vp8_loop_filter_bv_armv6);
extern prototype_loopfilter_block(vp8_loop_filter_mbh_armv6);
extern prototype_loopfilter_block(vp8_loop_filter_bh_armv6);
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_armv6);
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_armv6);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_armv6);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_armv6);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_lf_normal_mb_v
#define vp8_lf_normal_mb_v vp8_loop_filter_mbv_armv6
#undef vp8_lf_normal_b_v
#define vp8_lf_normal_b_v vp8_loop_filter_bv_armv6
#undef vp8_lf_normal_mb_h
#define vp8_lf_normal_mb_h vp8_loop_filter_mbh_armv6
#undef vp8_lf_normal_b_h
#define vp8_lf_normal_b_h vp8_loop_filter_bh_armv6
#undef vp8_lf_simple_mb_v
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_armv6
#undef vp8_lf_simple_b_v
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_armv6
#undef vp8_lf_simple_mb_h
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_armv6
#undef vp8_lf_simple_b_h
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_armv6
#endif /* !CONFIG_RUNTIME_CPU_DETECT */
#endif /* HAVE_ARMV6 */
#if HAVE_ARMV7
extern prototype_loopfilter_block(vp8_loop_filter_mbv_neon);
extern prototype_loopfilter_block(vp8_loop_filter_bv_neon);
extern prototype_loopfilter_block(vp8_loop_filter_mbh_neon);
extern prototype_loopfilter_block(vp8_loop_filter_bh_neon);
extern prototype_simple_loopfilter(vp8_loop_filter_mbvs_neon);
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_neon);
extern prototype_simple_loopfilter(vp8_loop_filter_mbhs_neon);
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_neon);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_lf_normal_mb_v
#define vp8_lf_normal_mb_v vp8_loop_filter_mbv_neon
#undef vp8_lf_normal_b_v
#define vp8_lf_normal_b_v vp8_loop_filter_bv_neon
#undef vp8_lf_normal_mb_h
#define vp8_lf_normal_mb_h vp8_loop_filter_mbh_neon
#undef vp8_lf_normal_b_h
#define vp8_lf_normal_b_h vp8_loop_filter_bh_neon
#undef vp8_lf_simple_mb_v
#define vp8_lf_simple_mb_v vp8_loop_filter_mbvs_neon
#undef vp8_lf_simple_b_v
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_neon
#undef vp8_lf_simple_mb_h
#define vp8_lf_simple_mb_h vp8_loop_filter_mbhs_neon
#undef vp8_lf_simple_b_h
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_neon
#endif /* !CONFIG_RUNTIME_CPU_DETECT */
#endif /* HAVE_ARMV7 */
#endif /* LOOPFILTER_ARM_H */

View File

@@ -9,8 +9,7 @@
*/
#include "vpx_config.h"
#include "vp8/common/idct.h"
#include "vp8/common/dequantize.h"
#include "vpx_rtcd.h"
/* place these declarations here because we don't want to maintain them
* outside of this scope

View File

@@ -9,7 +9,7 @@
;
EXPORT |vp8_sixtap_predict_neon|
EXPORT |vp8_sixtap_predict4x4_neon|
ARM
REQUIRE8
PRESERVE8
@@ -33,7 +33,7 @@ filter4_coeff
; stack(r4) unsigned char *dst_ptr,
; stack(lr) int dst_pitch
|vp8_sixtap_predict_neon| PROC
|vp8_sixtap_predict4x4_neon| PROC
push {r4, lr}
adr r12, filter4_coeff

View File

@@ -77,14 +77,14 @@ variance16x16_neon_loop
;vmov.32 r1, d1[0]
;mul r0, r0, r0
;str r1, [r12]
;sub r0, r1, r0, asr #8
;sub r0, r1, r0, lsr #8
;sum is in [-255x256, 255x256]. sumxsum is 32-bit. Shift to right should
;have sign-bit exension, which is vshr.s. Have to use s32 to make it right.
; while sum is signed, sum * sum is always positive and must be treated as
; unsigned to avoid propagating the sign bit.
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [r12] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
bx lr
@@ -145,8 +145,8 @@ variance16x8_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [r12] ;store sse
vshr.s32 d10, d10, #7
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #7
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
bx lr
@@ -200,8 +200,8 @@ variance8x16_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [r12] ;store sse
vshr.s32 d10, d10, #7
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #7
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
bx lr
@@ -265,8 +265,8 @@ variance8x8_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [r12] ;store sse
vshr.s32 d10, d10, #6
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #6
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
bx lr

View File

@@ -9,6 +9,11 @@
;
bilinear_taps_coeff
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
;-----------------
EXPORT |vp8_sub_pixel_variance16x16_neon_func|
ARM
REQUIRE8
@@ -27,7 +32,7 @@
|vp8_sub_pixel_variance16x16_neon_func| PROC
push {r4-r6, lr}
ldr r12, _BilinearTaps_coeff_
adr r12, bilinear_taps_coeff
ldr r4, [sp, #16] ;load *dst_ptr from stack
ldr r5, [sp, #20] ;load dst_pixels_per_line from stack
ldr r6, [sp, #24] ;load *sse from stack
@@ -405,8 +410,8 @@ sub_pixel_variance16x16_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [r6] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
add sp, sp, #528
vmov.32 r0, d0[0] ;return
@@ -415,11 +420,4 @@ sub_pixel_variance16x16_neon_loop
ENDP
;-----------------
_BilinearTaps_coeff_
DCD bilinear_taps_coeff
bilinear_taps_coeff
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
END

View File

@@ -112,8 +112,8 @@ vp8_filt_fpo16x16s_4_0_loop_neon
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [lr] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
pop {pc}
@@ -208,8 +208,8 @@ vp8_filt_spo16x16s_0_4_loop_neon
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [lr] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
pop {pc}
@@ -327,8 +327,8 @@ vp8_filt16x16s_4_4_loop_neon
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [lr] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
pop {pc}
@@ -560,8 +560,8 @@ sub_pixel_variance16x16s_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [lr] ;store sse
vshr.s32 d10, d10, #8
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #8
vsub.u32 d0, d1, d10
add sp, sp, #256
vmov.32 r0, d0[0] ;return

View File

@@ -27,7 +27,7 @@
|vp8_sub_pixel_variance8x8_neon| PROC
push {r4-r5, lr}
ldr r12, _BilinearTaps_coeff_
adr r12, bilinear_taps_coeff
ldr r4, [sp, #12] ;load *dst_ptr from stack
ldr r5, [sp, #16] ;load dst_pixels_per_line from stack
ldr lr, [sp, #20] ;load *sse from stack
@@ -206,8 +206,8 @@ sub_pixel_variance8x8_neon_loop
vmull.s32 q5, d0, d0
vst1.32 {d1[0]}, [lr] ;store sse
vshr.s32 d10, d10, #6
vsub.s32 d0, d1, d10
vshr.u32 d10, d10, #6
vsub.u32 d0, d1, d10
vmov.32 r0, d0[0] ;return
pop {r4-r5, pc}
@@ -216,8 +216,6 @@ sub_pixel_variance8x8_neon_loop
;-----------------
_BilinearTaps_coeff_
DCD bilinear_taps_coeff
bilinear_taps_coeff
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112

View File

@@ -1,65 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef RECON_ARM_H
#define RECON_ARM_H
#if HAVE_ARMV6
extern prototype_copy_block(vp8_copy_mem8x8_v6);
extern prototype_copy_block(vp8_copy_mem8x4_v6);
extern prototype_copy_block(vp8_copy_mem16x16_v6);
extern prototype_intra4x4_predict(vp8_intra4x4_predict_armv6);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_recon_copy8x8
#define vp8_recon_copy8x8 vp8_copy_mem8x8_v6
#undef vp8_recon_copy8x4
#define vp8_recon_copy8x4 vp8_copy_mem8x4_v6
#undef vp8_recon_copy16x16
#define vp8_recon_copy16x16 vp8_copy_mem16x16_v6
#undef vp8_recon_intra4x4_predict
#define vp8_recon_intra4x4_predict vp8_intra4x4_predict_armv6
#endif
#endif
#if HAVE_ARMV7
extern prototype_copy_block(vp8_copy_mem8x8_neon);
extern prototype_copy_block(vp8_copy_mem8x4_neon);
extern prototype_copy_block(vp8_copy_mem16x16_neon);
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mby_neon);
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mby_s_neon);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_recon_copy8x8
#define vp8_recon_copy8x8 vp8_copy_mem8x8_neon
#undef vp8_recon_copy8x4
#define vp8_recon_copy8x4 vp8_copy_mem8x4_neon
#undef vp8_recon_copy16x16
#define vp8_recon_copy16x16 vp8_copy_mem16x16_neon
#undef vp8_recon_build_intra_predictors_mby
#define vp8_recon_build_intra_predictors_mby vp8_build_intra_predictors_mby_neon
#undef vp8_recon_build_intra_predictors_mby_s
#define vp8_recon_build_intra_predictors_mby_s vp8_build_intra_predictors_mby_s_neon
#endif
#endif
#endif

View File

@@ -10,12 +10,11 @@
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "vp8/common/blockd.h"
#include "vp8/common/reconintra.h"
#include "vpx_mem/vpx_mem.h"
#include "vp8/common/recon.h"
#if HAVE_ARMV7
#if HAVE_NEON
extern void vp8_build_intra_predictors_mby_neon_func(
unsigned char *y_buffer,
unsigned char *ypred_ptr,
@@ -35,10 +34,7 @@ void vp8_build_intra_predictors_mby_neon(MACROBLOCKD *x)
vp8_build_intra_predictors_mby_neon_func(y_buffer, ypred_ptr, y_stride, mode, Up, Left);
}
#endif
#if HAVE_ARMV7
extern void vp8_build_intra_predictors_mby_s_neon_func(
unsigned char *y_buffer,
unsigned char *ypred_ptr,

View File

@@ -1,89 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef SUBPIXEL_ARM_H
#define SUBPIXEL_ARM_H
#if HAVE_ARMV6
extern prototype_subpixel_predict(vp8_sixtap_predict16x16_armv6);
extern prototype_subpixel_predict(vp8_sixtap_predict8x8_armv6);
extern prototype_subpixel_predict(vp8_sixtap_predict8x4_armv6);
extern prototype_subpixel_predict(vp8_sixtap_predict_armv6);
extern prototype_subpixel_predict(vp8_bilinear_predict16x16_armv6);
extern prototype_subpixel_predict(vp8_bilinear_predict8x8_armv6);
extern prototype_subpixel_predict(vp8_bilinear_predict8x4_armv6);
extern prototype_subpixel_predict(vp8_bilinear_predict4x4_armv6);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_subpix_sixtap16x16
#define vp8_subpix_sixtap16x16 vp8_sixtap_predict16x16_armv6
#undef vp8_subpix_sixtap8x8
#define vp8_subpix_sixtap8x8 vp8_sixtap_predict8x8_armv6
#undef vp8_subpix_sixtap8x4
#define vp8_subpix_sixtap8x4 vp8_sixtap_predict8x4_armv6
#undef vp8_subpix_sixtap4x4
#define vp8_subpix_sixtap4x4 vp8_sixtap_predict_armv6
#undef vp8_subpix_bilinear16x16
#define vp8_subpix_bilinear16x16 vp8_bilinear_predict16x16_armv6
#undef vp8_subpix_bilinear8x8
#define vp8_subpix_bilinear8x8 vp8_bilinear_predict8x8_armv6
#undef vp8_subpix_bilinear8x4
#define vp8_subpix_bilinear8x4 vp8_bilinear_predict8x4_armv6
#undef vp8_subpix_bilinear4x4
#define vp8_subpix_bilinear4x4 vp8_bilinear_predict4x4_armv6
#endif
#endif
#if HAVE_ARMV7
extern prototype_subpixel_predict(vp8_sixtap_predict16x16_neon);
extern prototype_subpixel_predict(vp8_sixtap_predict8x8_neon);
extern prototype_subpixel_predict(vp8_sixtap_predict8x4_neon);
extern prototype_subpixel_predict(vp8_sixtap_predict_neon);
extern prototype_subpixel_predict(vp8_bilinear_predict16x16_neon);
extern prototype_subpixel_predict(vp8_bilinear_predict8x8_neon);
extern prototype_subpixel_predict(vp8_bilinear_predict8x4_neon);
extern prototype_subpixel_predict(vp8_bilinear_predict4x4_neon);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_subpix_sixtap16x16
#define vp8_subpix_sixtap16x16 vp8_sixtap_predict16x16_neon
#undef vp8_subpix_sixtap8x8
#define vp8_subpix_sixtap8x8 vp8_sixtap_predict8x8_neon
#undef vp8_subpix_sixtap8x4
#define vp8_subpix_sixtap8x4 vp8_sixtap_predict8x4_neon
#undef vp8_subpix_sixtap4x4
#define vp8_subpix_sixtap4x4 vp8_sixtap_predict_neon
#undef vp8_subpix_bilinear16x16
#define vp8_subpix_bilinear16x16 vp8_bilinear_predict16x16_neon
#undef vp8_subpix_bilinear8x8
#define vp8_subpix_bilinear8x8 vp8_bilinear_predict8x8_neon
#undef vp8_subpix_bilinear8x4
#define vp8_subpix_bilinear8x4 vp8_bilinear_predict8x4_neon
#undef vp8_subpix_bilinear4x4
#define vp8_subpix_bilinear4x4 vp8_bilinear_predict4x4_neon
#endif
#endif
#endif

View File

@@ -9,10 +9,11 @@
*/
#include "vpx_config.h"
#include "vp8/encoder/variance.h"
#include "vpx_rtcd.h"
#include "vp8/common/variance.h"
#include "vp8/common/filter.h"
#if HAVE_ARMV6
#if HAVE_MEDIA
#include "vp8/common/arm/bilinearfilter_arm.h"
unsigned int vp8_sub_pixel_variance8x8_armv6
@@ -91,10 +92,21 @@ unsigned int vp8_sub_pixel_variance16x16_armv6
return var;
}
#endif /* HAVE_ARMV6 */
#endif /* HAVE_MEDIA */
#if HAVE_ARMV7
#if HAVE_NEON
extern unsigned int vp8_sub_pixel_variance16x16_neon_func
(
const unsigned char *src_ptr,
int src_pixels_per_line,
int xoffset,
int yoffset,
const unsigned char *dst_ptr,
int dst_pixels_per_line,
unsigned int *sse
);
unsigned int vp8_sub_pixel_variance16x16_neon
(

View File

@@ -15,6 +15,10 @@
#include "vpx_scale/yv12config.h"
#include "vp8/common/blockd.h"
#if CONFIG_POSTPROC
#include "postproc.h"
#endif /* CONFIG_POSTPROC */
BEGIN
/* vpx_scale */
@@ -30,12 +34,17 @@ DEFINE(yv12_buffer_config_v_buffer, offsetof(YV12_BUFFER_CONFIG, v_b
DEFINE(yv12_buffer_config_border, offsetof(YV12_BUFFER_CONFIG, border));
DEFINE(VP8BORDERINPIXELS_VAL, VP8BORDERINPIXELS);
#if CONFIG_POSTPROC
/* mfqe.c / filter_by_weight */
DEFINE(MFQE_PRECISION_VAL, MFQE_PRECISION);
#endif /* CONFIG_POSTPROC */
END
/* add asserts for any offset that is not supported by assembly code */
/* add asserts for any size that is not supported by assembly code */
#if HAVE_ARMV6
#if HAVE_MEDIA
/* switch case in vp8_intra4x4_predict_armv6 is based on these enumerated values */
ct_assert(B_DC_PRED, B_DC_PRED == 0);
ct_assert(B_TM_PRED, B_TM_PRED == 1);
@@ -49,7 +58,14 @@ ct_assert(B_HD_PRED, B_HD_PRED == 8);
ct_assert(B_HU_PRED, B_HU_PRED == 9);
#endif
#if HAVE_ARMV7
#if HAVE_NEON
/* vp8_yv12_extend_frame_borders_neon makes several assumptions based on this */
ct_assert(VP8BORDERINPIXELS_VAL, VP8BORDERINPIXELS == 32)
#endif
#if HAVE_SSE2
#if CONFIG_POSTPROC
/* vp8_filter_by_weight16x16 and 8x8 */
ct_assert(MFQE_PRECISION_VAL, MFQE_PRECISION == 4)
#endif /* CONFIG_POSTPROC */
#endif /* HAVE_SSE2 */

View File

@@ -18,7 +18,6 @@ void vpx_log(const char *format, ...);
#include "vpx_scale/yv12config.h"
#include "mv.h"
#include "treecoder.h"
#include "subpixel.h"
#include "vpx_ports/mem.h"
/*#define DCPRED 1*/
@@ -151,14 +150,15 @@ typedef enum
typedef struct
{
MB_PREDICTION_MODE mode, uv_mode;
MV_REFERENCE_FRAME ref_frame;
uint8_t mode, uv_mode;
uint8_t ref_frame;
uint8_t is_4x4;
int_mv mv;
unsigned char partitioning;
unsigned char mb_skip_coeff; /* does this mb has coefficients at all, 1=no coefficients, 0=need decode tokens */
unsigned char need_to_clamp_mvs;
unsigned char segment_id; /* Which set of segmentation parameters should be used for this MB */
uint8_t partitioning;
uint8_t mb_skip_coeff; /* does this mb has coefficients at all, 1=no coefficients, 0=need decode tokens */
uint8_t need_to_clamp_mvs;
uint8_t segment_id; /* Which set of segmentation parameters should be used for this MB */
} MB_MODE_INFO;
typedef struct
@@ -179,28 +179,22 @@ typedef struct
} LOWER_RES_INFO;
#endif
typedef struct
typedef struct blockd
{
short *qcoeff;
short *dqcoeff;
unsigned char *predictor;
short *dequant;
/* 16 Y blocks, 4 U blocks, 4 V blocks each with 16 entries */
unsigned char **base_pre;
int pre;
int pre_stride;
unsigned char **base_dst;
int dst;
int dst_stride;
int offset;
char *eob;
union b_mode_info bmi;
} BLOCKD;
typedef struct MacroBlockD
typedef void (*vp8_subpix_fn_t)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
typedef struct macroblockd
{
DECLARE_ALIGNED(16, unsigned char, predictor[384]);
DECLARE_ALIGNED(16, short, qcoeff[400]);
@@ -222,11 +216,21 @@ typedef struct MacroBlockD
MODE_INFO *mode_info_context;
int mode_info_stride;
#if CONFIG_TEMPORAL_DENOISING
MB_PREDICTION_MODE best_sse_inter_mode;
int_mv best_sse_mv;
unsigned char need_to_clamp_best_mvs;
#endif
FRAME_TYPE frame_type;
int up_available;
int left_available;
unsigned char *recon_above[3];
unsigned char *recon_left[3];
int recon_left_stride[2];
/* Y,U,V,Y2 */
ENTROPY_CONTEXT_PLANES *above_context;
ENTROPY_CONTEXT_PLANES *left_context;
@@ -265,11 +269,8 @@ typedef struct MacroBlockD
int mb_to_top_edge;
int mb_to_bottom_edge;
int ref_frame_cost[MAX_REF_FRAMES];
unsigned int frames_since_golden;
unsigned int frames_till_alt_ref_frame;
vp8_subpix_fn_t subpixel_predict;
vp8_subpix_fn_t subpixel_predict8x4;
vp8_subpix_fn_t subpixel_predict8x8;
@@ -286,10 +287,6 @@ typedef struct MacroBlockD
*/
DECLARE_ALIGNED(32, unsigned char, y_buf[22*32]);
#endif
#if CONFIG_RUNTIME_CPU_DETECT
struct VP8_COMMON_RTCD *rtcd;
#endif
} MACROBLOCKD;

View File

@@ -10,8 +10,8 @@
#include "vpx_config.h"
#include "dequantize.h"
#include "vp8/common/idct.h"
#include "vpx_rtcd.h"
#include "vp8/common/blockd.h"
#include "vpx_mem/vpx_mem.h"
void vp8_dequantize_b_c(BLOCKD *d, short *DQC)

View File

@@ -1,85 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef DEQUANTIZE_H
#define DEQUANTIZE_H
#include "vp8/common/blockd.h"
#define prototype_dequant_block(sym) \
void sym(BLOCKD *x, short *DQC)
#define prototype_dequant_idct_add(sym) \
void sym(short *input, short *dq, \
unsigned char *output, \
int stride)
#define prototype_dequant_idct_add_y_block(sym) \
void sym(short *q, short *dq, \
unsigned char *dst, \
int stride, char *eobs)
#define prototype_dequant_idct_add_uv_block(sym) \
void sym(short *q, short *dq, \
unsigned char *dst_u, \
unsigned char *dst_v, int stride, char *eobs)
#if ARCH_X86 || ARCH_X86_64
#include "x86/dequantize_x86.h"
#endif
#if ARCH_ARM
#include "arm/dequantize_arm.h"
#endif
#ifndef vp8_dequant_block
#define vp8_dequant_block vp8_dequantize_b_c
#endif
extern prototype_dequant_block(vp8_dequant_block);
#ifndef vp8_dequant_idct_add
#define vp8_dequant_idct_add vp8_dequant_idct_add_c
#endif
extern prototype_dequant_idct_add(vp8_dequant_idct_add);
#ifndef vp8_dequant_idct_add_y_block
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_c
#endif
extern prototype_dequant_idct_add_y_block(vp8_dequant_idct_add_y_block);
#ifndef vp8_dequant_idct_add_uv_block
#define vp8_dequant_idct_add_uv_block vp8_dequant_idct_add_uv_block_c
#endif
extern prototype_dequant_idct_add_uv_block(vp8_dequant_idct_add_uv_block);
typedef prototype_dequant_block((*vp8_dequant_block_fn_t));
typedef prototype_dequant_idct_add((*vp8_dequant_idct_add_fn_t));
typedef prototype_dequant_idct_add_y_block((*vp8_dequant_idct_add_y_block_fn_t));
typedef prototype_dequant_idct_add_uv_block((*vp8_dequant_idct_add_uv_block_fn_t));
typedef struct
{
vp8_dequant_block_fn_t block;
vp8_dequant_idct_add_fn_t idct_add;
vp8_dequant_idct_add_y_block_fn_t idct_add_y_block;
vp8_dequant_idct_add_uv_block_fn_t idct_add_uv_block;
} vp8_dequant_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define DEQUANT_INVOKE(ctx,fn) (ctx)->fn
#else
#define DEQUANT_INVOKE(ctx,fn) vp8_dequant_##fn
#endif
#endif

View File

@@ -8,23 +8,11 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include "entropy.h"
#include "string.h"
#include "blockd.h"
#include "onyxc_int.h"
#include "vpx_mem/vpx_mem.h"
#define uchar unsigned char /* typedefs can clash */
#define uint unsigned int
typedef const uchar cuchar;
typedef const uint cuint;
typedef vp8_prob Prob;
#include "coefupdateprobs.h"
DECLARE_ALIGNED(16, const unsigned char, vp8_norm[256]) =
@@ -47,10 +35,11 @@ DECLARE_ALIGNED(16, const unsigned char, vp8_norm[256]) =
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};
DECLARE_ALIGNED(16, cuchar, vp8_coef_bands[16]) =
DECLARE_ALIGNED(16, const unsigned char, vp8_coef_bands[16]) =
{ 0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7};
DECLARE_ALIGNED(16, cuchar, vp8_prev_token_class[MAX_ENTROPY_TOKENS]) =
DECLARE_ALIGNED(16, const unsigned char,
vp8_prev_token_class[MAX_ENTROPY_TOKENS]) =
{ 0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0};
DECLARE_ALIGNED(16, const int, vp8_default_zig_zag1d[16]) =
@@ -69,7 +58,26 @@ DECLARE_ALIGNED(16, const short, vp8_default_inv_zig_zag[16]) =
10, 11, 15, 16
};
DECLARE_ALIGNED(16, short, vp8_default_zig_zag_mask[16]);
/* vp8_default_zig_zag_mask generated with:
void vp8_init_scan_order_mask()
{
int i;
for (i = 0; i < 16; i++)
{
vp8_default_zig_zag_mask[vp8_default_zig_zag1d[i]] = 1 << i;
}
}
*/
DECLARE_ALIGNED(16, const short, vp8_default_zig_zag_mask[16]) =
{
1, 2, 32, 64,
4, 16, 128, 4096,
8, 256, 2048, 8192,
512, 1024, 16384, -32768
};
const int vp8_mb_feature_data_bits[MB_LVL_MAX] = {7, 6};
@@ -90,56 +98,72 @@ const vp8_tree_index vp8_coef_tree[ 22] = /* corresponding _CONTEXT_NODEs */
-DCT_VAL_CATEGORY5, -DCT_VAL_CATEGORY6 /* 10 = CAT_FIVE */
};
struct vp8_token_struct vp8_coef_encodings[MAX_ENTROPY_TOKENS];
/* vp8_coef_encodings generated with:
vp8_tokens_from_tree(vp8_coef_encodings, vp8_coef_tree);
*/
const vp8_token vp8_coef_encodings[MAX_ENTROPY_TOKENS] =
{
{2, 2},
{6, 3},
{28, 5},
{58, 6},
{59, 6},
{60, 6},
{61, 6},
{124, 7},
{125, 7},
{126, 7},
{127, 7},
{0, 1}
};
/* Trees for extra bits. Probabilities are constant and
do not depend on previously encoded bits */
static const Prob Pcat1[] = { 159};
static const Prob Pcat2[] = { 165, 145};
static const Prob Pcat3[] = { 173, 148, 140};
static const Prob Pcat4[] = { 176, 155, 140, 135};
static const Prob Pcat5[] = { 180, 157, 141, 134, 130};
static const Prob Pcat6[] =
static const vp8_prob Pcat1[] = { 159};
static const vp8_prob Pcat2[] = { 165, 145};
static const vp8_prob Pcat3[] = { 173, 148, 140};
static const vp8_prob Pcat4[] = { 176, 155, 140, 135};
static const vp8_prob Pcat5[] = { 180, 157, 141, 134, 130};
static const vp8_prob Pcat6[] =
{ 254, 254, 243, 230, 196, 177, 153, 140, 133, 130, 129};
static vp8_tree_index cat1[2], cat2[4], cat3[6], cat4[8], cat5[10], cat6[22];
void vp8_init_scan_order_mask()
{
int i;
/* tree index tables generated with:
for (i = 0; i < 16; i++)
void init_bit_tree(vp8_tree_index *p, int n)
{
vp8_default_zig_zag_mask[vp8_default_zig_zag1d[i]] = 1 << i;
int i = 0;
while (++i < n)
{
p[0] = p[1] = i << 1;
p += 2;
}
p[0] = p[1] = 0;
}
}
static void init_bit_tree(vp8_tree_index *p, int n)
{
int i = 0;
while (++i < n)
void init_bit_trees()
{
p[0] = p[1] = i << 1;
p += 2;
init_bit_tree(cat1, 1);
init_bit_tree(cat2, 2);
init_bit_tree(cat3, 3);
init_bit_tree(cat4, 4);
init_bit_tree(cat5, 5);
init_bit_tree(cat6, 11);
}
*/
p[0] = p[1] = 0;
}
static const vp8_tree_index cat1[2] = { 0, 0 };
static const vp8_tree_index cat2[4] = { 2, 2, 0, 0 };
static const vp8_tree_index cat3[6] = { 2, 2, 4, 4, 0, 0 };
static const vp8_tree_index cat4[8] = { 2, 2, 4, 4, 6, 6, 0, 0 };
static const vp8_tree_index cat5[10] = { 2, 2, 4, 4, 6, 6, 8, 8, 0, 0 };
static const vp8_tree_index cat6[22] = { 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12,
14, 14, 16, 16, 18, 18, 20, 20, 0, 0 };
static void init_bit_trees()
{
init_bit_tree(cat1, 1);
init_bit_tree(cat2, 2);
init_bit_tree(cat3, 3);
init_bit_tree(cat4, 4);
init_bit_tree(cat5, 5);
init_bit_tree(cat6, 11);
}
vp8_extra_bit_struct vp8_extra_bits[12] =
const vp8_extra_bit_struct vp8_extra_bits[12] =
{
{ 0, 0, 0, 0},
{ 0, 0, 0, 1},
@@ -163,8 +187,3 @@ void vp8_default_coef_probs(VP8_COMMON *pc)
sizeof(default_coef_probs));
}
void vp8_coef_tree_initialize()
{
init_bit_trees();
vp8_tokens_from_tree(vp8_coef_encodings, vp8_coef_tree);
}

View File

@@ -35,7 +35,7 @@
extern const vp8_tree_index vp8_coef_tree[];
extern struct vp8_token_struct vp8_coef_encodings[MAX_ENTROPY_TOKENS];
extern const struct vp8_token_struct vp8_coef_encodings[MAX_ENTROPY_TOKENS];
typedef struct
{
@@ -45,7 +45,7 @@ typedef struct
int base_val;
} vp8_extra_bit_struct;
extern vp8_extra_bit_struct vp8_extra_bits[12]; /* indexed by token value */
extern const vp8_extra_bit_struct vp8_extra_bits[12]; /* indexed by token value */
#define PROB_UPDATE_BASELINE_COST 7
@@ -94,7 +94,7 @@ void vp8_default_coef_probs(struct VP8Common *);
extern DECLARE_ALIGNED(16, const int, vp8_default_zig_zag1d[16]);
extern DECLARE_ALIGNED(16, const short, vp8_default_inv_zig_zag[16]);
extern short vp8_default_zig_zag_mask[16];
extern DECLARE_ALIGNED(16, const short, vp8_default_zig_zag_mask[16]);
extern const int vp8_mb_feature_data_bits[MB_LVL_MAX];
void vp8_coef_tree_initialize(void);

View File

@@ -8,22 +8,13 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#define USE_PREBUILT_TABLES
#include "entropymode.h"
#include "entropy.h"
#include "vpx_mem/vpx_mem.h"
static const unsigned int kf_y_mode_cts[VP8_YMODES] = { 1607, 915, 812, 811, 5455};
static const unsigned int y_mode_cts [VP8_YMODES] = { 8080, 1908, 1582, 1007, 5874};
static const unsigned int uv_mode_cts [VP8_UV_MODES] = { 59483, 13605, 16492, 4230};
static const unsigned int kf_uv_mode_cts[VP8_UV_MODES] = { 5319, 1904, 1703, 674};
static const unsigned int bmode_cts[VP8_BINTRAMODES] =
{
43891, 17694, 10036, 3920, 3363, 2546, 5119, 3221, 2471, 1723
};
#include "vp8_entropymodedata.h"
int vp8_mv_cont(const int_mv *l, const int_mv *a)
{
@@ -59,7 +50,7 @@ const vp8_prob vp8_sub_mv_ref_prob2 [SUBMVREF_COUNT][VP8_SUBMVREFS-1] =
vp8_mbsplit vp8_mbsplits [VP8_NUMMBSPLITS] =
const vp8_mbsplit vp8_mbsplits [VP8_NUMMBSPLITS] =
{
{
0, 0, 0, 0,
@@ -84,7 +75,7 @@ vp8_mbsplit vp8_mbsplits [VP8_NUMMBSPLITS] =
4, 5, 6, 7,
8, 9, 10, 11,
12, 13, 14, 15,
},
}
};
const int vp8_mbsplit_count [VP8_NUMMBSPLITS] = { 2, 2, 4, 16};
@@ -155,17 +146,6 @@ const vp8_tree_index vp8_sub_mv_ref_tree[6] =
-ZERO4X4, -NEW4X4
};
struct vp8_token_struct vp8_bmode_encodings [VP8_BINTRAMODES];
struct vp8_token_struct vp8_ymode_encodings [VP8_YMODES];
struct vp8_token_struct vp8_kf_ymode_encodings [VP8_YMODES];
struct vp8_token_struct vp8_uv_mode_encodings [VP8_UV_MODES];
struct vp8_token_struct vp8_mbsplit_encodings [VP8_NUMMBSPLITS];
struct vp8_token_struct vp8_mv_ref_encoding_array [VP8_MVREFS];
struct vp8_token_struct vp8_sub_mv_ref_encoding_array [VP8_SUBMVREFS];
const vp8_tree_index vp8_small_mvtree [14] =
{
2, 8,
@@ -177,89 +157,21 @@ const vp8_tree_index vp8_small_mvtree [14] =
-6, -7
};
struct vp8_token_struct vp8_small_mvencodings [8];
void vp8_init_mbmode_probs(VP8_COMMON *x)
{
unsigned int bct [VP8_YMODES] [2]; /* num Ymodes > num UV modes */
vp8_tree_probs_from_distribution(
VP8_YMODES, vp8_ymode_encodings, vp8_ymode_tree,
x->fc.ymode_prob, bct, y_mode_cts,
256, 1
);
vp8_tree_probs_from_distribution(
VP8_YMODES, vp8_kf_ymode_encodings, vp8_kf_ymode_tree,
x->kf_ymode_prob, bct, kf_y_mode_cts,
256, 1
);
vp8_tree_probs_from_distribution(
VP8_UV_MODES, vp8_uv_mode_encodings, vp8_uv_mode_tree,
x->fc.uv_mode_prob, bct, uv_mode_cts,
256, 1
);
vp8_tree_probs_from_distribution(
VP8_UV_MODES, vp8_uv_mode_encodings, vp8_uv_mode_tree,
x->kf_uv_mode_prob, bct, kf_uv_mode_cts,
256, 1
);
vpx_memcpy(x->fc.ymode_prob, vp8_ymode_prob, sizeof(vp8_ymode_prob));
vpx_memcpy(x->kf_ymode_prob, vp8_kf_ymode_prob, sizeof(vp8_kf_ymode_prob));
vpx_memcpy(x->fc.uv_mode_prob, vp8_uv_mode_prob, sizeof(vp8_uv_mode_prob));
vpx_memcpy(x->kf_uv_mode_prob, vp8_kf_uv_mode_prob, sizeof(vp8_kf_uv_mode_prob));
vpx_memcpy(x->fc.sub_mv_ref_prob, sub_mv_ref_prob, sizeof(sub_mv_ref_prob));
}
static void intra_bmode_probs_from_distribution(
vp8_prob p [VP8_BINTRAMODES-1],
unsigned int branch_ct [VP8_BINTRAMODES-1] [2],
const unsigned int events [VP8_BINTRAMODES]
)
{
vp8_tree_probs_from_distribution(
VP8_BINTRAMODES, vp8_bmode_encodings, vp8_bmode_tree,
p, branch_ct, events,
256, 1
);
}
void vp8_default_bmode_probs(vp8_prob p [VP8_BINTRAMODES-1])
{
unsigned int branch_ct [VP8_BINTRAMODES-1] [2];
intra_bmode_probs_from_distribution(p, branch_ct, bmode_cts);
vpx_memcpy(p, vp8_bmode_prob, sizeof(vp8_bmode_prob));
}
void vp8_kf_default_bmode_probs(vp8_prob p [VP8_BINTRAMODES] [VP8_BINTRAMODES] [VP8_BINTRAMODES-1])
{
unsigned int branch_ct [VP8_BINTRAMODES-1] [2];
int i = 0;
do
{
int j = 0;
do
{
intra_bmode_probs_from_distribution(
p[i][j], branch_ct, vp8_kf_default_bmode_counts[i][j]);
}
while (++j < VP8_BINTRAMODES);
}
while (++i < VP8_BINTRAMODES);
}
void vp8_entropy_mode_init()
{
vp8_tokens_from_tree(vp8_bmode_encodings, vp8_bmode_tree);
vp8_tokens_from_tree(vp8_ymode_encodings, vp8_ymode_tree);
vp8_tokens_from_tree(vp8_kf_ymode_encodings, vp8_kf_ymode_tree);
vp8_tokens_from_tree(vp8_uv_mode_encodings, vp8_uv_mode_tree);
vp8_tokens_from_tree(vp8_mbsplit_encodings, vp8_mbsplit_tree);
vp8_tokens_from_tree_offset(vp8_mv_ref_encoding_array,
vp8_mv_ref_tree, NEARESTMV);
vp8_tokens_from_tree_offset(vp8_sub_mv_ref_encoding_array,
vp8_sub_mv_ref_tree, LEFT4X4);
vp8_tokens_from_tree(vp8_small_mvencodings, vp8_small_mvtree);
vpx_memcpy(p, vp8_kf_bmode_prob, sizeof(vp8_kf_bmode_prob));
}

View File

@@ -52,22 +52,20 @@ extern const vp8_tree_index vp8_mbsplit_tree[];
extern const vp8_tree_index vp8_mv_ref_tree[];
extern const vp8_tree_index vp8_sub_mv_ref_tree[];
extern struct vp8_token_struct vp8_bmode_encodings [VP8_BINTRAMODES];
extern struct vp8_token_struct vp8_ymode_encodings [VP8_YMODES];
extern struct vp8_token_struct vp8_kf_ymode_encodings [VP8_YMODES];
extern struct vp8_token_struct vp8_uv_mode_encodings [VP8_UV_MODES];
extern struct vp8_token_struct vp8_mbsplit_encodings [VP8_NUMMBSPLITS];
extern const struct vp8_token_struct vp8_bmode_encodings[VP8_BINTRAMODES];
extern const struct vp8_token_struct vp8_ymode_encodings[VP8_YMODES];
extern const struct vp8_token_struct vp8_kf_ymode_encodings[VP8_YMODES];
extern const struct vp8_token_struct vp8_uv_mode_encodings[VP8_UV_MODES];
extern const struct vp8_token_struct vp8_mbsplit_encodings[VP8_NUMMBSPLITS];
/* Inter mode values do not start at zero */
extern struct vp8_token_struct vp8_mv_ref_encoding_array [VP8_MVREFS];
extern struct vp8_token_struct vp8_sub_mv_ref_encoding_array [VP8_SUBMVREFS];
extern const struct vp8_token_struct vp8_mv_ref_encoding_array[VP8_MVREFS];
extern const struct vp8_token_struct vp8_sub_mv_ref_encoding_array[VP8_SUBMVREFS];
extern const vp8_tree_index vp8_small_mvtree[];
extern struct vp8_token_struct vp8_small_mvencodings [8];
void vp8_entropy_mode_init(void);
extern const struct vp8_token_struct vp8_small_mvencodings[8];
void vp8_init_mbmode_probs(VP8_COMMON *x);

View File

@@ -149,7 +149,7 @@ static void filter_block2d
}
void vp8_sixtap_predict_c
void vp8_sixtap_predict4x4_c
(
unsigned char *src_ptr,
int src_pixels_per_line,

View File

@@ -10,30 +10,33 @@
#include "vpx_config.h"
#include "vp8/common/subpixel.h"
#include "vp8/common/loopfilter.h"
#include "vp8/common/recon.h"
#include "vp8/common/idct.h"
#include "vpx_rtcd.h"
#if ARCH_ARM
#include "vpx_ports/arm.h"
#elif ARCH_X86 || ARCH_X86_64
#include "vpx_ports/x86.h"
#endif
#include "vp8/common/onyxc_int.h"
#if CONFIG_MULTITHREAD
#if HAVE_UNISTD_H
#if HAVE_UNISTD_H && !defined(__OS2__)
#include <unistd.h>
#elif defined(_WIN32)
#include <windows.h>
typedef void (WINAPI *PGNSI)(LPSYSTEM_INFO);
#elif defined(__OS2__)
#define INCL_DOS
#define INCL_DOSSPINLOCK
#include <os2.h>
#endif
#endif
extern void vp8_arch_x86_common_init(VP8_COMMON *ctx);
extern void vp8_arch_arm_common_init(VP8_COMMON *ctx);
#if CONFIG_MULTITHREAD
static int get_cpu_count()
{
int core_count = 16;
#if HAVE_UNISTD_H
#if HAVE_UNISTD_H && !defined(__OS2__)
#if defined(_SC_NPROCESSORS_ONLN)
core_count = sysconf(_SC_NPROCESSORS_ONLN);
#elif defined(_SC_NPROC_ONLN)
@@ -56,6 +59,21 @@ static int get_cpu_count()
core_count = sysinfo.dwNumberOfProcessors;
}
#elif defined(__OS2__)
{
ULONG proc_id;
ULONG status;
core_count = 0;
for (proc_id = 1; ; proc_id++)
{
if (DosGetProcessorStatus(proc_id, &status))
break;
if (status == PROC_ONLINE)
core_count++;
}
}
#else
/* other platforms */
#endif
@@ -64,78 +82,69 @@ static int get_cpu_count()
}
#endif
#if HAVE_PTHREAD_H
#include <pthread.h>
static void once(void (*func)(void))
{
static pthread_once_t lock = PTHREAD_ONCE_INIT;
pthread_once(&lock, func);
}
#elif defined(_WIN32)
static void once(void (*func)(void))
{
/* Using a static initializer here rather than InitializeCriticalSection()
* since there's no race-free context in which to execute it. Protecting
* it with an atomic op like InterlockedCompareExchangePointer introduces
* an x86 dependency, and InitOnceExecuteOnce requires Vista.
*/
static CRITICAL_SECTION lock = {(void *)-1, -1, 0, 0, 0, 0};
static int done;
EnterCriticalSection(&lock);
if (!done)
{
func();
done = 1;
}
LeaveCriticalSection(&lock);
}
#else
/* No-op version that performs no synchronization. vpx_rtcd() is idempotent,
* so as long as your platform provides atomic loads/stores of pointers
* no synchronization is strictly necessary.
*/
static void once(void (*func)(void))
{
static int done;
if(!done)
{
func();
done = 1;
}
}
#endif
void vp8_machine_specific_config(VP8_COMMON *ctx)
{
#if CONFIG_RUNTIME_CPU_DETECT
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
rtcd->dequant.block = vp8_dequantize_b_c;
rtcd->dequant.idct_add = vp8_dequant_idct_add_c;
rtcd->dequant.idct_add_y_block = vp8_dequant_idct_add_y_block_c;
rtcd->dequant.idct_add_uv_block =
vp8_dequant_idct_add_uv_block_c;
rtcd->idct.idct16 = vp8_short_idct4x4llm_c;
rtcd->idct.idct1_scalar_add = vp8_dc_only_idct_add_c;
rtcd->idct.iwalsh1 = vp8_short_inv_walsh4x4_1_c;
rtcd->idct.iwalsh16 = vp8_short_inv_walsh4x4_c;
rtcd->recon.copy16x16 = vp8_copy_mem16x16_c;
rtcd->recon.copy8x8 = vp8_copy_mem8x8_c;
rtcd->recon.copy8x4 = vp8_copy_mem8x4_c;
rtcd->recon.build_intra_predictors_mby =
vp8_build_intra_predictors_mby;
rtcd->recon.build_intra_predictors_mby_s =
vp8_build_intra_predictors_mby_s;
rtcd->recon.build_intra_predictors_mbuv =
vp8_build_intra_predictors_mbuv;
rtcd->recon.build_intra_predictors_mbuv_s =
vp8_build_intra_predictors_mbuv_s;
rtcd->recon.intra4x4_predict =
vp8_intra4x4_predict_c;
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_c;
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_c;
rtcd->subpix.sixtap8x4 = vp8_sixtap_predict8x4_c;
rtcd->subpix.sixtap4x4 = vp8_sixtap_predict_c;
rtcd->subpix.bilinear16x16 = vp8_bilinear_predict16x16_c;
rtcd->subpix.bilinear8x8 = vp8_bilinear_predict8x8_c;
rtcd->subpix.bilinear8x4 = vp8_bilinear_predict8x4_c;
rtcd->subpix.bilinear4x4 = vp8_bilinear_predict4x4_c;
rtcd->loopfilter.normal_mb_v = vp8_loop_filter_mbv_c;
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_c;
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_c;
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_c;
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_simple_vertical_edge_c;
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_c;
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_simple_horizontal_edge_c;
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_c;
#if CONFIG_POSTPROC || (CONFIG_VP8_ENCODER && CONFIG_INTERNAL_STATS)
rtcd->postproc.down = vp8_mbpost_proc_down_c;
rtcd->postproc.across = vp8_mbpost_proc_across_ip_c;
rtcd->postproc.downacross = vp8_post_proc_down_and_across_c;
rtcd->postproc.addnoise = vp8_plane_add_noise_c;
rtcd->postproc.blend_mb_inner = vp8_blend_mb_inner_c;
rtcd->postproc.blend_mb_outer = vp8_blend_mb_outer_c;
rtcd->postproc.blend_b = vp8_blend_b_c;
#endif
#endif
#if ARCH_X86 || ARCH_X86_64
vp8_arch_x86_common_init(ctx);
#endif
#if ARCH_ARM
vp8_arch_arm_common_init(ctx);
#endif
#if CONFIG_MULTITHREAD
ctx->processor_core_count = get_cpu_count();
#endif /* CONFIG_MULTITHREAD */
#if ARCH_ARM
ctx->cpu_caps = arm_cpu_caps();
#elif ARCH_X86 || ARCH_X86_64
ctx->cpu_caps = x86_simd_caps();
#endif
once(vpx_rtcd);
}

View File

@@ -1,80 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef __INC_IDCT_H
#define __INC_IDCT_H
#define prototype_second_order(sym) \
void sym(short *input, short *output)
#define prototype_idct(sym) \
void sym(short *input, unsigned char *pred, int pitch, unsigned char *dst, \
int dst_stride)
#define prototype_idct_scalar_add(sym) \
void sym(short input, \
unsigned char *pred, int pred_stride, \
unsigned char *dst, \
int dst_stride)
#if ARCH_X86 || ARCH_X86_64
#include "x86/idct_x86.h"
#endif
#if ARCH_ARM
#include "arm/idct_arm.h"
#endif
#ifndef vp8_idct_idct16
#define vp8_idct_idct16 vp8_short_idct4x4llm_c
#endif
extern prototype_idct(vp8_idct_idct16);
/* add this prototype to prevent compiler warning about implicit
* declaration of vp8_short_idct4x4llm_c function in dequantize.c
* when building, for example, neon optimized version */
extern prototype_idct(vp8_short_idct4x4llm_c);
#ifndef vp8_idct_idct1_scalar_add
#define vp8_idct_idct1_scalar_add vp8_dc_only_idct_add_c
#endif
extern prototype_idct_scalar_add(vp8_idct_idct1_scalar_add);
#ifndef vp8_idct_iwalsh1
#define vp8_idct_iwalsh1 vp8_short_inv_walsh4x4_1_c
#endif
extern prototype_second_order(vp8_idct_iwalsh1);
#ifndef vp8_idct_iwalsh16
#define vp8_idct_iwalsh16 vp8_short_inv_walsh4x4_c
#endif
extern prototype_second_order(vp8_idct_iwalsh16);
typedef prototype_idct((*vp8_idct_fn_t));
typedef prototype_idct_scalar_add((*vp8_idct_scalar_add_fn_t));
typedef prototype_second_order((*vp8_second_order_fn_t));
typedef struct
{
vp8_idct_fn_t idct16;
vp8_idct_scalar_add_fn_t idct1_scalar_add;
vp8_second_order_fn_t iwalsh1;
vp8_second_order_fn_t iwalsh16;
} vp8_idct_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define IDCT_INVOKE(ctx,fn) (ctx)->fn
#else
#define IDCT_INVOKE(ctx,fn) vp8_idct_##fn
#endif
#endif

View File

@@ -9,8 +9,7 @@
*/
#include "vpx_config.h"
#include "vp8/common/idct.h"
#include "dequantize.h"
#include "vpx_rtcd.h"
void vp8_dequant_idct_add_c(short *input, short *dq,
unsigned char *dest, int stride);

31
vp8/common/idctllm_test.cc Executable file
View File

@@ -0,0 +1,31 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
extern "C" {
void vp8_short_idct4x4llm_c(short *input, unsigned char *pred_ptr,
int pred_stride, unsigned char *dst_ptr,
int dst_stride);
}
#include "vpx_config.h"
#include "idctllm_test.h"
namespace
{
INSTANTIATE_TEST_CASE_P(C, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_c));
} // namespace
int main(int argc, char **argv) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}

113
vp8/common/idctllm_test.h Executable file
View File

@@ -0,0 +1,113 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
typedef void (*idct_fn_t)(short *input, unsigned char *pred_ptr,
int pred_stride, unsigned char *dst_ptr,
int dst_stride);
namespace {
class IDCTTest : public ::testing::TestWithParam<idct_fn_t>
{
protected:
virtual void SetUp()
{
int i;
UUT = GetParam();
memset(input, 0, sizeof(input));
/* Set up guard blocks */
for(i=0; i<256; i++)
output[i] = ((i&0xF)<4&&(i<64))?0:-1;
}
idct_fn_t UUT;
short input[16];
unsigned char output[256];
unsigned char predict[256];
};
TEST_P(IDCTTest, TestGuardBlocks)
{
int i;
for(i=0; i<256; i++)
if((i&0xF) < 4 && i<64)
EXPECT_EQ(0, output[i]) << i;
else
EXPECT_EQ(255, output[i]);
}
TEST_P(IDCTTest, TestAllZeros)
{
int i;
UUT(input, output, 16, output, 16);
for(i=0; i<256; i++)
if((i&0xF) < 4 && i<64)
EXPECT_EQ(0, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
}
TEST_P(IDCTTest, TestAllOnes)
{
int i;
input[0] = 4;
UUT(input, output, 16, output, 16);
for(i=0; i<256; i++)
if((i&0xF) < 4 && i<64)
EXPECT_EQ(1, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
}
TEST_P(IDCTTest, TestAddOne)
{
int i;
for(i=0; i<256; i++)
predict[i] = i;
input[0] = 4;
UUT(input, predict, 16, output, 16);
for(i=0; i<256; i++)
if((i&0xF) < 4 && i<64)
EXPECT_EQ(i+1, output[i]) << "i==" << i;
else
EXPECT_EQ(255, output[i]) << "i==" << i;
}
TEST_P(IDCTTest, TestWithData)
{
int i;
for(i=0; i<16; i++)
input[i] = i;
UUT(input, output, 16, output, 16);
for(i=0; i<256; i++)
if((i&0xF) > 3 || i>63)
EXPECT_EQ(255, output[i]) << "i==" << i;
else if(i == 0)
EXPECT_EQ(11, output[i]) << "i==" << i;
else if(i == 34)
EXPECT_EQ(1, output[i]) << "i==" << i;
else if(i == 2 || i == 17 || i == 32)
EXPECT_EQ(3, output[i]) << "i==" << i;
else
EXPECT_EQ(0, output[i]) << "i==" << i;
}
}

View File

@@ -13,7 +13,7 @@
#define __INC_INVTRANS_H
#include "vpx_config.h"
#include "idct.h"
#include "vpx_rtcd.h"
#include "blockd.h"
#include "onyxc_int.h"
@@ -33,8 +33,7 @@ static void eob_adjust(char *eobs, short *diff)
}
}
static void vp8_inverse_transform_mby(MACROBLOCKD *xd,
const VP8_COMMON_RTCD *rtcd)
static void vp8_inverse_transform_mby(MACROBLOCKD *xd)
{
short *DQC = xd->dequant_y1;
@@ -43,19 +42,19 @@ static void vp8_inverse_transform_mby(MACROBLOCKD *xd,
/* do 2nd order transform on the dc block */
if (xd->eobs[24] > 1)
{
IDCT_INVOKE(&rtcd->idct, iwalsh16)
vp8_short_inv_walsh4x4
(&xd->block[24].dqcoeff[0], xd->qcoeff);
}
else
{
IDCT_INVOKE(&rtcd->idct, iwalsh1)
vp8_short_inv_walsh4x4_1
(&xd->block[24].dqcoeff[0], xd->qcoeff);
}
eob_adjust(xd->eobs, xd->qcoeff);
DQC = xd->dequant_y1_dc;
}
DEQUANT_INVOKE (&rtcd->dequant, idct_add_y_block)
vp8_dequant_idct_add_y_block
(xd->qcoeff, DQC,
xd->dst.y_buffer,
xd->dst.y_stride, xd->eobs);

View File

@@ -10,96 +10,13 @@
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "loopfilter.h"
#include "onyxc_int.h"
#include "vpx_mem/vpx_mem.h"
typedef unsigned char uc;
prototype_loopfilter(vp8_loop_filter_horizontal_edge_c);
prototype_loopfilter(vp8_loop_filter_vertical_edge_c);
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_c);
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_c);
prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_c);
prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_c);
/* Horizontal MB filtering */
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_mbloop_filter_horizontal_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_mbloop_filter_horizontal_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_mbloop_filter_horizontal_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
}
/* Vertical MB Filtering */
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_mbloop_filter_vertical_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_mbloop_filter_vertical_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_mbloop_filter_vertical_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
}
/* Horizontal B Filtering */
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_loop_filter_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_loop_filter_horizontal_edge_c(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_loop_filter_horizontal_edge_c(v_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
}
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, int y_stride,
const unsigned char *blimit)
{
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, blimit);
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, blimit);
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, blimit);
}
/* Vertical B Filtering */
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_loop_filter_vertical_edge_c(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_c(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_c(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_loop_filter_vertical_edge_c(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_loop_filter_vertical_edge_c(v_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
}
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, int y_stride,
const unsigned char *blimit)
{
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 4, y_stride, blimit);
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 8, y_stride, blimit);
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 12, y_stride, blimit);
}
static void lf_init_lut(loop_filter_info_n *lfi)
{
int filt_lvl;
@@ -293,6 +210,8 @@ void vp8_loop_filter_frame
int mb_row;
int mb_col;
int mb_rows = cm->mb_rows;
int mb_cols = cm->mb_cols;
int filter_level;
@@ -300,6 +219,8 @@ void vp8_loop_filter_frame
/* Point at base of Mb MODE_INFO list */
const MODE_INFO *mode_info_context = cm->mi;
int post_y_stride = post->y_stride;
int post_uv_stride = post->uv_stride;
/* Initialize the loop filter for this frame. */
vp8_loop_filter_frame_init(cm, mbd, cm->filter_level);
@@ -310,23 +231,23 @@ void vp8_loop_filter_frame
v_ptr = post->v_buffer;
/* vp8_filter each macro block */
for (mb_row = 0; mb_row < cm->mb_rows; mb_row++)
if (cm->filter_type == NORMAL_LOOPFILTER)
{
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
for (mb_row = 0; mb_row < mb_rows; mb_row++)
{
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
mode_info_context->mbmi.mode != SPLITMV &&
mode_info_context->mbmi.mb_skip_coeff);
const int mode_index = lfi_n->mode_lf_lut[mode_info_context->mbmi.mode];
const int seg = mode_info_context->mbmi.segment_id;
const int ref_frame = mode_info_context->mbmi.ref_frame;
filter_level = lfi_n->lvl[seg][ref_frame][mode_index];
if (filter_level)
for (mb_col = 0; mb_col < mb_cols; mb_col++)
{
if (cm->filter_type == NORMAL_LOOPFILTER)
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
mode_info_context->mbmi.mode != SPLITMV &&
mode_info_context->mbmi.mb_skip_coeff);
const int mode_index = lfi_n->mode_lf_lut[mode_info_context->mbmi.mode];
const int seg = mode_info_context->mbmi.segment_id;
const int ref_frame = mode_info_context->mbmi.ref_frame;
filter_level = lfi_n->lvl[seg][ref_frame][mode_index];
if (filter_level)
{
const int hev_index = lfi_n->hev_thr_lut[frame_type][filter_level];
lfi.mblim = lfi_n->mblim[filter_level];
@@ -335,55 +256,88 @@ void vp8_loop_filter_frame
lfi.hev_thr = lfi_n->hev_thr[hev_index];
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
vp8_loop_filter_mbv
(y_ptr, u_ptr, v_ptr, post_y_stride, post_uv_stride, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
vp8_loop_filter_bv
(y_ptr, u_ptr, v_ptr, post_y_stride, post_uv_stride, &lfi);
/* don't apply across umv border */
if (mb_row > 0)
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
vp8_loop_filter_mbh
(y_ptr, u_ptr, v_ptr, post_y_stride, post_uv_stride, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
vp8_loop_filter_bh
(y_ptr, u_ptr, v_ptr, post_y_stride, post_uv_stride, &lfi);
}
else
{
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
y_ptr += 16;
u_ptr += 8;
v_ptr += 8;
/* don't apply across umv border */
if (mb_row > 0)
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
}
mode_info_context++; /* step to next MB */
}
y_ptr += post_y_stride * 16 - post->y_width;
u_ptr += post_uv_stride * 8 - post->uv_width;
v_ptr += post_uv_stride * 8 - post->uv_width;
y_ptr += 16;
u_ptr += 8;
v_ptr += 8;
mode_info_context++; /* Skip border mb */
mode_info_context++; /* step to next MB */
}
}
else /* SIMPLE_LOOPFILTER */
{
for (mb_row = 0; mb_row < mb_rows; mb_row++)
{
for (mb_col = 0; mb_col < mb_cols; mb_col++)
{
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
mode_info_context->mbmi.mode != SPLITMV &&
mode_info_context->mbmi.mb_skip_coeff);
y_ptr += post->y_stride * 16 - post->y_width;
u_ptr += post->uv_stride * 8 - post->uv_width;
v_ptr += post->uv_stride * 8 - post->uv_width;
const int mode_index = lfi_n->mode_lf_lut[mode_info_context->mbmi.mode];
const int seg = mode_info_context->mbmi.segment_id;
const int ref_frame = mode_info_context->mbmi.ref_frame;
mode_info_context++; /* Skip border mb */
filter_level = lfi_n->lvl[seg][ref_frame][mode_index];
if (filter_level)
{
const unsigned char * mblim = lfi_n->mblim[filter_level];
const unsigned char * blim = lfi_n->blim[filter_level];
if (mb_col > 0)
vp8_loop_filter_simple_mbv
(y_ptr, post_y_stride, mblim);
if (!skip_lf)
vp8_loop_filter_simple_bv
(y_ptr, post_y_stride, blim);
/* don't apply across umv border */
if (mb_row > 0)
vp8_loop_filter_simple_mbh
(y_ptr, post_y_stride, mblim);
if (!skip_lf)
vp8_loop_filter_simple_bh
(y_ptr, post_y_stride, blim);
}
y_ptr += 16;
u_ptr += 8;
v_ptr += 8;
mode_info_context++; /* step to next MB */
}
y_ptr += post_y_stride * 16 - post->y_width;
u_ptr += post_uv_stride * 8 - post->uv_width;
v_ptr += post_uv_stride * 8 - post->uv_width;
mode_info_context++; /* Skip border mb */
}
}
}
@@ -446,39 +400,39 @@ void vp8_loop_filter_frame_yonly
lfi.hev_thr = lfi_n->hev_thr[hev_index];
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
vp8_loop_filter_mbv
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
vp8_loop_filter_bv
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
/* don't apply across umv border */
if (mb_row > 0)
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
vp8_loop_filter_mbh
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
vp8_loop_filter_bh
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
}
else
{
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
vp8_loop_filter_simple_mbv
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
vp8_loop_filter_simple_bv
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
/* don't apply across umv border */
if (mb_row > 0)
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
vp8_loop_filter_simple_mbh
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
vp8_loop_filter_simple_bh
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
}
}
@@ -578,35 +532,35 @@ void vp8_loop_filter_partial_frame
lfi.hev_thr = lfi_n->hev_thr[hev_index];
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
vp8_loop_filter_mbv
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
vp8_loop_filter_bv
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
vp8_loop_filter_mbh
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
vp8_loop_filter_bh
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
}
else
{
if (mb_col > 0)
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
vp8_loop_filter_simple_mbv
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
vp8_loop_filter_simple_bv
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
vp8_loop_filter_simple_mbh
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
if (!skip_lf)
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
vp8_loop_filter_simple_bh
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
}
}

View File

@@ -14,6 +14,7 @@
#include "vpx_ports/mem.h"
#include "vpx_config.h"
#include "vpx_rtcd.h"
#define MAX_LOOP_FILTER 63
/* fraction of total macroblock rows to be used in fast filter level picking */
@@ -46,7 +47,7 @@ typedef struct
unsigned char mode_lf_lut[10];
} loop_filter_info_n;
typedef struct
typedef struct loop_filter_info
{
const unsigned char * mblim;
const unsigned char * blim;
@@ -55,86 +56,6 @@ typedef struct
} loop_filter_info;
#define prototype_loopfilter(sym) \
void sym(unsigned char *src, int pitch, const unsigned char *blimit,\
const unsigned char *limit, const unsigned char *thresh, int count)
#define prototype_loopfilter_block(sym) \
void sym(unsigned char *y, unsigned char *u, unsigned char *v, \
int ystride, int uv_stride, loop_filter_info *lfi)
#define prototype_simple_loopfilter(sym) \
void sym(unsigned char *y, int ystride, const unsigned char *blimit)
#if ARCH_X86 || ARCH_X86_64
#include "x86/loopfilter_x86.h"
#endif
#if ARCH_ARM
#include "arm/loopfilter_arm.h"
#endif
#ifndef vp8_lf_normal_mb_v
#define vp8_lf_normal_mb_v vp8_loop_filter_mbv_c
#endif
extern prototype_loopfilter_block(vp8_lf_normal_mb_v);
#ifndef vp8_lf_normal_b_v
#define vp8_lf_normal_b_v vp8_loop_filter_bv_c
#endif
extern prototype_loopfilter_block(vp8_lf_normal_b_v);
#ifndef vp8_lf_normal_mb_h
#define vp8_lf_normal_mb_h vp8_loop_filter_mbh_c
#endif
extern prototype_loopfilter_block(vp8_lf_normal_mb_h);
#ifndef vp8_lf_normal_b_h
#define vp8_lf_normal_b_h vp8_loop_filter_bh_c
#endif
extern prototype_loopfilter_block(vp8_lf_normal_b_h);
#ifndef vp8_lf_simple_mb_v
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_c
#endif
extern prototype_simple_loopfilter(vp8_lf_simple_mb_v);
#ifndef vp8_lf_simple_b_v
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_c
#endif
extern prototype_simple_loopfilter(vp8_lf_simple_b_v);
#ifndef vp8_lf_simple_mb_h
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_c
#endif
extern prototype_simple_loopfilter(vp8_lf_simple_mb_h);
#ifndef vp8_lf_simple_b_h
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_c
#endif
extern prototype_simple_loopfilter(vp8_lf_simple_b_h);
typedef prototype_loopfilter_block((*vp8_lf_block_fn_t));
typedef prototype_simple_loopfilter((*vp8_slf_block_fn_t));
typedef struct
{
vp8_lf_block_fn_t normal_mb_v;
vp8_lf_block_fn_t normal_b_v;
vp8_lf_block_fn_t normal_mb_h;
vp8_lf_block_fn_t normal_b_h;
vp8_slf_block_fn_t simple_mb_v;
vp8_slf_block_fn_t simple_b_v;
vp8_slf_block_fn_t simple_mb_h;
vp8_slf_block_fn_t simple_b_h;
} vp8_loopfilter_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define LF_INVOKE(ctx,fn) (ctx)->fn
#else
#define LF_INVOKE(ctx,fn) vp8_lf_##fn
#endif
typedef void loop_filter_uvfunction
(
unsigned char *u, /* source pointer */
@@ -147,22 +68,22 @@ typedef void loop_filter_uvfunction
/* assorted loopfilter functions which get used elsewhere */
struct VP8Common;
struct MacroBlockD;
struct macroblockd;
void vp8_loop_filter_init(struct VP8Common *cm);
void vp8_loop_filter_frame_init(struct VP8Common *cm,
struct MacroBlockD *mbd,
struct macroblockd *mbd,
int default_filt_lvl);
void vp8_loop_filter_frame(struct VP8Common *cm, struct MacroBlockD *mbd);
void vp8_loop_filter_frame(struct VP8Common *cm, struct macroblockd *mbd);
void vp8_loop_filter_partial_frame(struct VP8Common *cm,
struct MacroBlockD *mbd,
struct macroblockd *mbd,
int default_filt_lvl);
void vp8_loop_filter_frame_yonly(struct VP8Common *cm,
struct MacroBlockD *mbd,
struct macroblockd *mbd,
int default_filt_lvl);
void vp8_loop_filter_update_sharpness(loop_filter_info_n *lfi,

View File

@@ -15,7 +15,7 @@
typedef unsigned char uc;
static __inline signed char vp8_signed_char_clamp(int t)
static signed char vp8_signed_char_clamp(int t)
{
t = (t < -128 ? -128 : t);
t = (t > 127 ? 127 : t);
@@ -24,9 +24,9 @@ static __inline signed char vp8_signed_char_clamp(int t)
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
static __inline signed char vp8_filter_mask(uc limit, uc blimit,
uc p3, uc p2, uc p1, uc p0,
uc q0, uc q1, uc q2, uc q3)
static signed char vp8_filter_mask(uc limit, uc blimit,
uc p3, uc p2, uc p1, uc p0,
uc q0, uc q1, uc q2, uc q3)
{
signed char mask = 0;
mask |= (abs(p3 - p2) > limit);
@@ -40,7 +40,7 @@ static __inline signed char vp8_filter_mask(uc limit, uc blimit,
}
/* is there high variance internal edge ( 11111111 yes, 00000000 no) */
static __inline signed char vp8_hevmask(uc thresh, uc p1, uc p0, uc q0, uc q1)
static signed char vp8_hevmask(uc thresh, uc p1, uc p0, uc q0, uc q1)
{
signed char hev = 0;
hev |= (abs(p1 - p0) > thresh) * -1;
@@ -48,7 +48,7 @@ static __inline signed char vp8_hevmask(uc thresh, uc p1, uc p0, uc q0, uc q1)
return hev;
}
static __inline void vp8_filter(signed char mask, uc hev, uc *op1,
static void vp8_filter(signed char mask, uc hev, uc *op1,
uc *op0, uc *oq0, uc *oq1)
{
@@ -158,7 +158,7 @@ void vp8_loop_filter_vertical_edge_c
while (++i < count * 8);
}
static __inline void vp8_mbfilter(signed char mask, uc hev,
static void vp8_mbfilter(signed char mask, uc hev,
uc *op2, uc *op1, uc *op0, uc *oq0, uc *oq1, uc *oq2)
{
signed char s, u;
@@ -279,7 +279,7 @@ void vp8_mbloop_filter_vertical_edge_c
}
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
static __inline signed char vp8_simple_filter_mask(uc blimit, uc p1, uc p0, uc q0, uc q1)
static signed char vp8_simple_filter_mask(uc blimit, uc p1, uc p0, uc q0, uc q1)
{
/* Why does this cause problems for win32?
* error C2143: syntax error : missing ';' before 'type'
@@ -289,7 +289,7 @@ static __inline signed char vp8_simple_filter_mask(uc blimit, uc p1, uc p0, uc q
return mask;
}
static __inline void vp8_simple_filter(signed char mask, uc *op1, uc *op0, uc *oq0, uc *oq1)
static void vp8_simple_filter(signed char mask, uc *op1, uc *op0, uc *oq0, uc *oq1)
{
signed char vp8_filter, Filter1, Filter2;
signed char p1 = (signed char) * op1 ^ 0x80;
@@ -352,3 +352,79 @@ void vp8_loop_filter_simple_vertical_edge_c
while (++i < 16);
}
/* Horizontal MB filtering */
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_mbloop_filter_horizontal_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_mbloop_filter_horizontal_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_mbloop_filter_horizontal_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
}
/* Vertical MB Filtering */
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_mbloop_filter_vertical_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_mbloop_filter_vertical_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_mbloop_filter_vertical_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
}
/* Horizontal B Filtering */
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_loop_filter_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_loop_filter_horizontal_edge_c(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_loop_filter_horizontal_edge_c(v_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
}
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, int y_stride,
const unsigned char *blimit)
{
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, blimit);
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, blimit);
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, blimit);
}
/* Vertical B Filtering */
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr,
unsigned char *v_ptr, int y_stride, int uv_stride,
loop_filter_info *lfi)
{
vp8_loop_filter_vertical_edge_c(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_c(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_c(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
if (u_ptr)
vp8_loop_filter_vertical_edge_c(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
if (v_ptr)
vp8_loop_filter_vertical_edge_c(v_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
}
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, int y_stride,
const unsigned char *blimit)
{
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 4, y_stride, blimit);
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 8, y_stride, blimit);
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 12, y_stride, blimit);
}

View File

@@ -11,74 +11,6 @@
#include "blockd.h"
typedef enum
{
PRED = 0,
DEST = 1
} BLOCKSET;
static void setup_block
(
BLOCKD *b,
int mv_stride,
unsigned char **base,
int Stride,
int offset,
BLOCKSET bs
)
{
if (bs == DEST)
{
b->dst_stride = Stride;
b->dst = offset;
b->base_dst = base;
}
else
{
b->pre_stride = Stride;
b->pre = offset;
b->base_pre = base;
}
}
static void setup_macroblock(MACROBLOCKD *x, BLOCKSET bs)
{
int block;
unsigned char **y, **u, **v;
if (bs == DEST)
{
y = &x->dst.y_buffer;
u = &x->dst.u_buffer;
v = &x->dst.v_buffer;
}
else
{
y = &x->pre.y_buffer;
u = &x->pre.u_buffer;
v = &x->pre.v_buffer;
}
for (block = 0; block < 16; block++) /* y blocks */
{
setup_block(&x->block[block], x->dst.y_stride, y, x->dst.y_stride,
(block >> 2) * 4 * x->dst.y_stride + (block & 3) * 4, bs);
}
for (block = 16; block < 20; block++) /* U and V blocks */
{
setup_block(&x->block[block], x->dst.uv_stride, u, x->dst.uv_stride,
((block - 16) >> 1) * 4 * x->dst.uv_stride + (block & 1) * 4, bs);
setup_block(&x->block[block+4], x->dst.uv_stride, v, x->dst.uv_stride,
((block - 16) >> 1) * 4 * x->dst.uv_stride + (block & 1) * 4, bs);
}
}
void vp8_setup_block_dptrs(MACROBLOCKD *x)
{
int r, c;
@@ -119,8 +51,18 @@ void vp8_setup_block_dptrs(MACROBLOCKD *x)
void vp8_build_block_doffsets(MACROBLOCKD *x)
{
int block;
/* handle the destination pitch features */
setup_macroblock(x, DEST);
setup_macroblock(x, PRED);
for (block = 0; block < 16; block++) /* y blocks */
{
x->block[block].offset =
(block >> 2) * 4 * x->dst.y_stride + (block & 3) * 4;
}
for (block = 16; block < 20; block++) /* U and V blocks */
{
x->block[block+4].offset =
x->block[block].offset =
((block - 16) >> 1) * 4 * x->dst.uv_stride + (block & 1) * 4;
}
}

385
vp8/common/mfqe.c Normal file
View File

@@ -0,0 +1,385 @@
/*
* Copyright (c) 2012 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
/* MFQE: Multiframe Quality Enhancement
* In rate limited situations keyframes may cause significant visual artifacts
* commonly referred to as "popping." This file implements a postproccesing
* algorithm which blends data from the preceeding frame when there is no
* motion and the q from the previous frame is lower which indicates that it is
* higher quality.
*/
#include "postproc.h"
#include "variance.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_rtcd.h"
#include "vpx_scale/yv12config.h"
#include <limits.h>
#include <stdlib.h>
static void filter_by_weight(unsigned char *src, int src_stride,
unsigned char *dst, int dst_stride,
int block_size, int src_weight)
{
int dst_weight = (1 << MFQE_PRECISION) - src_weight;
int rounding_bit = 1 << (MFQE_PRECISION - 1);
int r, c;
for (r = 0; r < block_size; r++)
{
for (c = 0; c < block_size; c++)
{
dst[c] = (src[c] * src_weight +
dst[c] * dst_weight +
rounding_bit) >> MFQE_PRECISION;
}
src += src_stride;
dst += dst_stride;
}
}
void vp8_filter_by_weight16x16_c(unsigned char *src, int src_stride,
unsigned char *dst, int dst_stride,
int src_weight)
{
filter_by_weight(src, src_stride, dst, dst_stride, 16, src_weight);
}
void vp8_filter_by_weight8x8_c(unsigned char *src, int src_stride,
unsigned char *dst, int dst_stride,
int src_weight)
{
filter_by_weight(src, src_stride, dst, dst_stride, 8, src_weight);
}
void vp8_filter_by_weight4x4_c(unsigned char *src, int src_stride,
unsigned char *dst, int dst_stride,
int src_weight)
{
filter_by_weight(src, src_stride, dst, dst_stride, 4, src_weight);
}
static void apply_ifactor(unsigned char *y_src,
int y_src_stride,
unsigned char *y_dst,
int y_dst_stride,
unsigned char *u_src,
unsigned char *v_src,
int uv_src_stride,
unsigned char *u_dst,
unsigned char *v_dst,
int uv_dst_stride,
int block_size,
int src_weight)
{
if (block_size == 16)
{
vp8_filter_by_weight16x16(y_src, y_src_stride, y_dst, y_dst_stride, src_weight);
vp8_filter_by_weight8x8(u_src, uv_src_stride, u_dst, uv_dst_stride, src_weight);
vp8_filter_by_weight8x8(v_src, uv_src_stride, v_dst, uv_dst_stride, src_weight);
}
else /* if (block_size == 8) */
{
vp8_filter_by_weight8x8(y_src, y_src_stride, y_dst, y_dst_stride, src_weight);
vp8_filter_by_weight4x4(u_src, uv_src_stride, u_dst, uv_dst_stride, src_weight);
vp8_filter_by_weight4x4(v_src, uv_src_stride, v_dst, uv_dst_stride, src_weight);
}
}
static unsigned int int_sqrt(unsigned int x)
{
unsigned int y = x;
unsigned int guess;
int p = 1;
while (y>>=1) p++;
p>>=1;
guess=0;
while (p>=0)
{
guess |= (1<<p);
if (x<guess*guess)
guess -= (1<<p);
p--;
}
/* choose between guess or guess+1 */
return guess+(guess*guess+guess+1<=x);
}
#define USE_SSD
static void multiframe_quality_enhance_block
(
int blksize, /* Currently only values supported are 16, 8 */
int qcurr,
int qprev,
unsigned char *y,
unsigned char *u,
unsigned char *v,
int y_stride,
int uv_stride,
unsigned char *yd,
unsigned char *ud,
unsigned char *vd,
int yd_stride,
int uvd_stride
)
{
static const unsigned char VP8_ZEROS[16]=
{
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
};
int uvblksize = blksize >> 1;
int qdiff = qcurr - qprev;
int i;
unsigned char *up;
unsigned char *udp;
unsigned char *vp;
unsigned char *vdp;
unsigned int act, actd, sad, usad, vsad, sse, thr, thrsq, actrisk;
if (blksize == 16)
{
actd = (vp8_variance16x16(yd, yd_stride, VP8_ZEROS, 0, &sse)+128)>>8;
act = (vp8_variance16x16(y, y_stride, VP8_ZEROS, 0, &sse)+128)>>8;
#ifdef USE_SSD
sad = (vp8_variance16x16(y, y_stride, yd, yd_stride, &sse));
sad = (sse + 128)>>8;
usad = (vp8_variance8x8(u, uv_stride, ud, uvd_stride, &sse));
usad = (sse + 32)>>6;
vsad = (vp8_variance8x8(v, uv_stride, vd, uvd_stride, &sse));
vsad = (sse + 32)>>6;
#else
sad = (vp8_sad16x16(y, y_stride, yd, yd_stride, INT_MAX)+128)>>8;
usad = (vp8_sad8x8(u, uv_stride, ud, uvd_stride, INT_MAX)+32)>>6;
vsad = (vp8_sad8x8(v, uv_stride, vd, uvd_stride, INT_MAX)+32)>>6;
#endif
}
else /* if (blksize == 8) */
{
actd = (vp8_variance8x8(yd, yd_stride, VP8_ZEROS, 0, &sse)+32)>>6;
act = (vp8_variance8x8(y, y_stride, VP8_ZEROS, 0, &sse)+32)>>6;
#ifdef USE_SSD
sad = (vp8_variance8x8(y, y_stride, yd, yd_stride, &sse));
sad = (sse + 32)>>6;
usad = (vp8_variance4x4(u, uv_stride, ud, uvd_stride, &sse));
usad = (sse + 8)>>4;
vsad = (vp8_variance4x4(v, uv_stride, vd, uvd_stride, &sse));
vsad = (sse + 8)>>4;
#else
sad = (vp8_sad8x8(y, y_stride, yd, yd_stride, INT_MAX)+32)>>6;
usad = (vp8_sad4x4(u, uv_stride, ud, uvd_stride, INT_MAX)+8)>>4;
vsad = (vp8_sad4x4(v, uv_stride, vd, uvd_stride, INT_MAX)+8)>>4;
#endif
}
actrisk = (actd > act * 5);
/* thr = qdiff/8 + log2(act) + log4(qprev) */
thr = (qdiff >> 3);
while (actd >>= 1) thr++;
while (qprev >>= 2) thr++;
#ifdef USE_SSD
thrsq = thr * thr;
if (sad < thrsq &&
/* additional checks for color mismatch and excessive addition of
* high-frequencies */
4 * usad < thrsq && 4 * vsad < thrsq && !actrisk)
#else
if (sad < thr &&
/* additional checks for color mismatch and excessive addition of
* high-frequencies */
2 * usad < thr && 2 * vsad < thr && !actrisk)
#endif
{
int ifactor;
#ifdef USE_SSD
/* TODO: optimize this later to not need sqr root */
sad = int_sqrt(sad);
#endif
ifactor = (sad << MFQE_PRECISION) / thr;
ifactor >>= (qdiff >> 5);
if (ifactor)
{
apply_ifactor(y, y_stride, yd, yd_stride,
u, v, uv_stride,
ud, vd, uvd_stride,
blksize, ifactor);
}
}
else /* else implicitly copy from previous frame */
{
if (blksize == 16)
{
vp8_copy_mem16x16(y, y_stride, yd, yd_stride);
vp8_copy_mem8x8(u, uv_stride, ud, uvd_stride);
vp8_copy_mem8x8(v, uv_stride, vd, uvd_stride);
}
else /* if (blksize == 8) */
{
vp8_copy_mem8x8(y, y_stride, yd, yd_stride);
for (up = u, udp = ud, i = 0; i < uvblksize; ++i, up += uv_stride, udp += uvd_stride)
vpx_memcpy(udp, up, uvblksize);
for (vp = v, vdp = vd, i = 0; i < uvblksize; ++i, vp += uv_stride, vdp += uvd_stride)
vpx_memcpy(vdp, vp, uvblksize);
}
}
}
static int qualify_inter_mb(const MODE_INFO *mode_info_context, int *map)
{
if (mode_info_context->mbmi.mb_skip_coeff)
map[0] = map[1] = map[2] = map[3] = 1;
else if (mode_info_context->mbmi.mode==SPLITMV)
{
static int ndx[4][4] =
{
{0, 1, 4, 5},
{2, 3, 6, 7},
{8, 9, 12, 13},
{10, 11, 14, 15}
};
int i, j;
for (i=0; i<4; ++i)
{
map[i] = 1;
for (j=0; j<4 && map[j]; ++j)
map[i] &= (mode_info_context->bmi[ndx[i][j]].mv.as_mv.row <= 2 &&
mode_info_context->bmi[ndx[i][j]].mv.as_mv.col <= 2);
}
}
else
{
map[0] = map[1] = map[2] = map[3] =
(mode_info_context->mbmi.mode > B_PRED &&
abs(mode_info_context->mbmi.mv.as_mv.row) <= 2 &&
abs(mode_info_context->mbmi.mv.as_mv.col) <= 2);
}
return (map[0]+map[1]+map[2]+map[3]);
}
void vp8_multiframe_quality_enhance
(
VP8_COMMON *cm
)
{
YV12_BUFFER_CONFIG *show = cm->frame_to_show;
YV12_BUFFER_CONFIG *dest = &cm->post_proc_buffer;
FRAME_TYPE frame_type = cm->frame_type;
/* Point at base of Mb MODE_INFO list has motion vectors etc */
const MODE_INFO *mode_info_context = cm->mi;
int mb_row;
int mb_col;
int totmap, map[4];
int qcurr = cm->base_qindex;
int qprev = cm->postproc_state.last_base_qindex;
unsigned char *y_ptr, *u_ptr, *v_ptr;
unsigned char *yd_ptr, *ud_ptr, *vd_ptr;
/* Set up the buffer pointers */
y_ptr = show->y_buffer;
u_ptr = show->u_buffer;
v_ptr = show->v_buffer;
yd_ptr = dest->y_buffer;
ud_ptr = dest->u_buffer;
vd_ptr = dest->v_buffer;
/* postprocess each macro block */
for (mb_row = 0; mb_row < cm->mb_rows; mb_row++)
{
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
{
/* if motion is high there will likely be no benefit */
if (frame_type == INTER_FRAME) totmap = qualify_inter_mb(mode_info_context, map);
else totmap = (frame_type == KEY_FRAME ? 4 : 0);
if (totmap)
{
if (totmap < 4)
{
int i, j;
for (i=0; i<2; ++i)
for (j=0; j<2; ++j)
{
if (map[i*2+j])
{
multiframe_quality_enhance_block(8, qcurr, qprev,
y_ptr + 8*(i*show->y_stride+j),
u_ptr + 4*(i*show->uv_stride+j),
v_ptr + 4*(i*show->uv_stride+j),
show->y_stride,
show->uv_stride,
yd_ptr + 8*(i*dest->y_stride+j),
ud_ptr + 4*(i*dest->uv_stride+j),
vd_ptr + 4*(i*dest->uv_stride+j),
dest->y_stride,
dest->uv_stride);
}
else
{
/* copy a 8x8 block */
int k;
unsigned char *up = u_ptr + 4*(i*show->uv_stride+j);
unsigned char *udp = ud_ptr + 4*(i*dest->uv_stride+j);
unsigned char *vp = v_ptr + 4*(i*show->uv_stride+j);
unsigned char *vdp = vd_ptr + 4*(i*dest->uv_stride+j);
vp8_copy_mem8x8(y_ptr + 8*(i*show->y_stride+j), show->y_stride,
yd_ptr + 8*(i*dest->y_stride+j), dest->y_stride);
for (k = 0; k < 4; ++k, up += show->uv_stride, udp += dest->uv_stride,
vp += show->uv_stride, vdp += dest->uv_stride)
{
vpx_memcpy(udp, up, 4);
vpx_memcpy(vdp, vp, 4);
}
}
}
}
else /* totmap = 4 */
{
multiframe_quality_enhance_block(16, qcurr, qprev, y_ptr,
u_ptr, v_ptr,
show->y_stride,
show->uv_stride,
yd_ptr, ud_ptr, vd_ptr,
dest->y_stride,
dest->uv_stride);
}
}
else
{
vp8_copy_mem16x16(y_ptr, show->y_stride, yd_ptr, dest->y_stride);
vp8_copy_mem8x8(u_ptr, show->uv_stride, ud_ptr, dest->uv_stride);
vp8_copy_mem8x8(v_ptr, show->uv_stride, vd_ptr, dest->uv_stride);
}
y_ptr += 16;
u_ptr += 8;
v_ptr += 8;
yd_ptr += 16;
ud_ptr += 8;
vd_ptr += 8;
mode_info_context++; /* step to next MB */
}
y_ptr += show->y_stride * 16 - 16 * cm->mb_cols;
u_ptr += show->uv_stride * 8 - 8 * cm->mb_cols;
v_ptr += show->uv_stride * 8 - 8 * cm->mb_cols;
yd_ptr += dest->y_stride * 16 - 16 * cm->mb_cols;
ud_ptr += dest->uv_stride * 8 - 8 * cm->mb_cols;
vd_ptr += dest->uv_stride * 8 - 8 * cm->mb_cols;
mode_info_context++; /* Skip border mb */
}
}

View File

@@ -1,146 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "entropymode.h"
const unsigned int vp8_kf_default_bmode_counts [VP8_BINTRAMODES] [VP8_BINTRAMODES] [VP8_BINTRAMODES] =
{
{
/*Above Mode : 0*/
{ 43438, 2195, 470, 316, 615, 171, 217, 412, 124, 160, }, /* left_mode 0 */
{ 5722, 2751, 296, 291, 81, 68, 80, 101, 100, 170, }, /* left_mode 1 */
{ 1629, 201, 307, 25, 47, 16, 34, 72, 19, 28, }, /* left_mode 2 */
{ 332, 266, 36, 500, 20, 65, 23, 14, 154, 106, }, /* left_mode 3 */
{ 450, 97, 10, 24, 117, 10, 2, 12, 8, 71, }, /* left_mode 4 */
{ 384, 49, 29, 44, 12, 162, 51, 5, 87, 42, }, /* left_mode 5 */
{ 495, 53, 157, 27, 14, 57, 180, 17, 17, 34, }, /* left_mode 6 */
{ 695, 64, 62, 9, 27, 5, 3, 147, 10, 26, }, /* left_mode 7 */
{ 230, 54, 20, 124, 16, 125, 29, 12, 283, 37, }, /* left_mode 8 */
{ 260, 87, 21, 120, 32, 16, 33, 16, 33, 203, }, /* left_mode 9 */
},
{
/*Above Mode : 1*/
{ 3934, 2573, 355, 137, 128, 87, 133, 117, 37, 27, }, /* left_mode 0 */
{ 1036, 1929, 278, 135, 27, 37, 48, 55, 41, 91, }, /* left_mode 1 */
{ 223, 256, 253, 15, 13, 9, 28, 64, 3, 3, }, /* left_mode 2 */
{ 120, 129, 17, 316, 15, 11, 9, 4, 53, 74, }, /* left_mode 3 */
{ 129, 58, 6, 11, 38, 2, 0, 5, 2, 67, }, /* left_mode 4 */
{ 53, 22, 11, 16, 8, 26, 14, 3, 19, 12, }, /* left_mode 5 */
{ 59, 26, 61, 11, 4, 9, 35, 13, 8, 8, }, /* left_mode 6 */
{ 101, 52, 40, 8, 5, 2, 8, 59, 2, 20, }, /* left_mode 7 */
{ 48, 34, 10, 52, 8, 15, 6, 6, 63, 20, }, /* left_mode 8 */
{ 96, 48, 22, 63, 11, 14, 5, 8, 9, 96, }, /* left_mode 9 */
},
{
/*Above Mode : 2*/
{ 709, 461, 506, 36, 27, 33, 151, 98, 24, 6, }, /* left_mode 0 */
{ 201, 375, 442, 27, 13, 8, 46, 58, 6, 19, }, /* left_mode 1 */
{ 122, 140, 417, 4, 13, 3, 33, 59, 4, 2, }, /* left_mode 2 */
{ 36, 17, 22, 16, 6, 8, 12, 17, 9, 21, }, /* left_mode 3 */
{ 51, 15, 7, 1, 14, 0, 4, 5, 3, 22, }, /* left_mode 4 */
{ 18, 11, 30, 9, 7, 20, 11, 5, 2, 6, }, /* left_mode 5 */
{ 38, 21, 103, 9, 4, 12, 79, 13, 2, 5, }, /* left_mode 6 */
{ 64, 17, 66, 2, 12, 4, 2, 65, 4, 5, }, /* left_mode 7 */
{ 14, 7, 7, 16, 3, 11, 4, 13, 15, 16, }, /* left_mode 8 */
{ 36, 8, 32, 9, 9, 4, 14, 7, 6, 24, }, /* left_mode 9 */
},
{
/*Above Mode : 3*/
{ 1340, 173, 36, 119, 30, 10, 13, 10, 20, 26, }, /* left_mode 0 */
{ 156, 293, 26, 108, 5, 16, 2, 4, 23, 30, }, /* left_mode 1 */
{ 60, 34, 13, 7, 3, 3, 0, 8, 4, 5, }, /* left_mode 2 */
{ 72, 64, 1, 235, 3, 9, 2, 7, 28, 38, }, /* left_mode 3 */
{ 29, 14, 1, 3, 5, 0, 2, 2, 5, 13, }, /* left_mode 4 */
{ 22, 7, 4, 11, 2, 5, 1, 2, 6, 4, }, /* left_mode 5 */
{ 18, 14, 5, 6, 4, 3, 14, 0, 9, 2, }, /* left_mode 6 */
{ 41, 10, 7, 1, 2, 0, 0, 10, 2, 1, }, /* left_mode 7 */
{ 23, 19, 2, 33, 1, 5, 2, 0, 51, 8, }, /* left_mode 8 */
{ 33, 26, 7, 53, 3, 9, 3, 3, 9, 19, }, /* left_mode 9 */
},
{
/*Above Mode : 4*/
{ 410, 165, 43, 31, 66, 15, 30, 54, 8, 17, }, /* left_mode 0 */
{ 115, 64, 27, 18, 30, 7, 11, 15, 4, 19, }, /* left_mode 1 */
{ 31, 23, 25, 1, 7, 2, 2, 10, 0, 5, }, /* left_mode 2 */
{ 17, 4, 1, 6, 8, 2, 7, 5, 5, 21, }, /* left_mode 3 */
{ 120, 12, 1, 2, 83, 3, 0, 4, 1, 40, }, /* left_mode 4 */
{ 4, 3, 1, 2, 1, 2, 5, 0, 3, 6, }, /* left_mode 5 */
{ 10, 2, 13, 6, 6, 6, 8, 2, 4, 5, }, /* left_mode 6 */
{ 58, 10, 5, 1, 28, 1, 1, 33, 1, 9, }, /* left_mode 7 */
{ 8, 2, 1, 4, 2, 5, 1, 1, 2, 10, }, /* left_mode 8 */
{ 76, 7, 5, 7, 18, 2, 2, 0, 5, 45, }, /* left_mode 9 */
},
{
/*Above Mode : 5*/
{ 444, 46, 47, 20, 14, 110, 60, 14, 60, 7, }, /* left_mode 0 */
{ 59, 57, 25, 18, 3, 17, 21, 6, 14, 6, }, /* left_mode 1 */
{ 24, 17, 20, 6, 4, 13, 7, 2, 3, 2, }, /* left_mode 2 */
{ 13, 11, 5, 14, 4, 9, 2, 4, 15, 7, }, /* left_mode 3 */
{ 8, 5, 2, 1, 4, 0, 1, 1, 2, 12, }, /* left_mode 4 */
{ 19, 5, 5, 7, 4, 40, 6, 3, 10, 4, }, /* left_mode 5 */
{ 16, 5, 9, 1, 1, 16, 26, 2, 10, 4, }, /* left_mode 6 */
{ 11, 4, 8, 1, 1, 4, 4, 5, 4, 1, }, /* left_mode 7 */
{ 15, 1, 3, 7, 3, 21, 7, 1, 34, 5, }, /* left_mode 8 */
{ 18, 5, 1, 3, 4, 3, 7, 1, 2, 9, }, /* left_mode 9 */
},
{
/*Above Mode : 6*/
{ 476, 149, 94, 13, 14, 77, 291, 27, 23, 3, }, /* left_mode 0 */
{ 79, 83, 42, 14, 2, 12, 63, 2, 4, 14, }, /* left_mode 1 */
{ 43, 36, 55, 1, 3, 8, 42, 11, 5, 1, }, /* left_mode 2 */
{ 9, 9, 6, 16, 1, 5, 6, 3, 11, 10, }, /* left_mode 3 */
{ 10, 3, 1, 3, 10, 1, 0, 1, 1, 4, }, /* left_mode 4 */
{ 14, 6, 15, 5, 1, 20, 25, 2, 5, 0, }, /* left_mode 5 */
{ 28, 7, 51, 1, 0, 8, 127, 6, 2, 5, }, /* left_mode 6 */
{ 13, 3, 3, 2, 3, 1, 2, 8, 1, 2, }, /* left_mode 7 */
{ 10, 3, 3, 3, 3, 8, 2, 2, 9, 3, }, /* left_mode 8 */
{ 13, 7, 11, 4, 0, 4, 6, 2, 5, 8, }, /* left_mode 9 */
},
{
/*Above Mode : 7*/
{ 376, 135, 119, 6, 32, 8, 31, 224, 9, 3, }, /* left_mode 0 */
{ 93, 60, 54, 6, 13, 7, 8, 92, 2, 12, }, /* left_mode 1 */
{ 74, 36, 84, 0, 3, 2, 9, 67, 2, 1, }, /* left_mode 2 */
{ 19, 4, 4, 8, 8, 2, 4, 7, 6, 16, }, /* left_mode 3 */
{ 51, 7, 4, 1, 77, 3, 0, 14, 1, 15, }, /* left_mode 4 */
{ 7, 7, 5, 7, 4, 7, 4, 5, 0, 3, }, /* left_mode 5 */
{ 18, 2, 19, 2, 2, 4, 12, 11, 1, 2, }, /* left_mode 6 */
{ 129, 6, 27, 1, 21, 3, 0, 189, 0, 6, }, /* left_mode 7 */
{ 9, 1, 2, 8, 3, 7, 0, 5, 3, 3, }, /* left_mode 8 */
{ 20, 4, 5, 10, 4, 2, 7, 17, 3, 16, }, /* left_mode 9 */
},
{
/*Above Mode : 8*/
{ 617, 68, 34, 79, 11, 27, 25, 14, 75, 13, }, /* left_mode 0 */
{ 51, 82, 21, 26, 6, 12, 13, 1, 26, 16, }, /* left_mode 1 */
{ 29, 9, 12, 11, 3, 7, 1, 10, 2, 2, }, /* left_mode 2 */
{ 17, 19, 11, 74, 4, 3, 2, 0, 58, 13, }, /* left_mode 3 */
{ 10, 1, 1, 3, 4, 1, 0, 2, 1, 8, }, /* left_mode 4 */
{ 14, 4, 5, 5, 1, 13, 2, 0, 27, 8, }, /* left_mode 5 */
{ 10, 3, 5, 4, 1, 7, 6, 4, 5, 1, }, /* left_mode 6 */
{ 10, 2, 6, 2, 1, 1, 1, 4, 2, 1, }, /* left_mode 7 */
{ 14, 8, 5, 23, 2, 12, 6, 2, 117, 5, }, /* left_mode 8 */
{ 9, 6, 2, 19, 1, 6, 3, 2, 9, 9, }, /* left_mode 9 */
},
{
/*Above Mode : 9*/
{ 680, 73, 22, 38, 42, 5, 11, 9, 6, 28, }, /* left_mode 0 */
{ 113, 112, 21, 22, 10, 2, 8, 4, 6, 42, }, /* left_mode 1 */
{ 44, 20, 24, 6, 5, 4, 3, 3, 1, 2, }, /* left_mode 2 */
{ 40, 23, 7, 71, 5, 2, 4, 1, 7, 22, }, /* left_mode 3 */
{ 85, 9, 4, 4, 17, 2, 0, 3, 2, 23, }, /* left_mode 4 */
{ 13, 4, 2, 6, 1, 7, 0, 1, 7, 6, }, /* left_mode 5 */
{ 26, 6, 8, 3, 2, 3, 8, 1, 5, 4, }, /* left_mode 6 */
{ 54, 8, 9, 6, 7, 0, 1, 11, 1, 3, }, /* left_mode 7 */
{ 9, 10, 4, 13, 2, 5, 4, 2, 14, 8, }, /* left_mode 8 */
{ 92, 9, 5, 19, 15, 3, 3, 1, 6, 58, }, /* left_mode 9 */
},
};

View File

@@ -19,7 +19,7 @@ typedef struct
short col;
} MV;
typedef union
typedef union int_mv
{
uint32_t as_int;
MV as_mv;

View File

@@ -60,19 +60,19 @@ extern "C"
MODE_BESTQUALITY = 0x2,
MODE_FIRSTPASS = 0x3,
MODE_SECONDPASS = 0x4,
MODE_SECONDPASS_BEST = 0x5,
MODE_SECONDPASS_BEST = 0x5
} MODE;
typedef enum
{
FRAMEFLAGS_KEY = 1,
FRAMEFLAGS_GOLDEN = 2,
FRAMEFLAGS_ALTREF = 4,
FRAMEFLAGS_ALTREF = 4
} FRAMETYPE_FLAGS;
#include <assert.h>
static __inline void Scale2Ratio(int mode, int *hr, int *hs)
static void Scale2Ratio(int mode, int *hr, int *hs)
{
switch (mode)
{
@@ -207,10 +207,10 @@ extern "C"
// Temporal scaling parameters
unsigned int number_of_layers;
unsigned int target_bitrate[MAX_PERIODICITY];
unsigned int rate_decimator[MAX_PERIODICITY];
unsigned int target_bitrate[VPX_TS_MAX_PERIODICITY];
unsigned int rate_decimator[VPX_TS_MAX_PERIODICITY];
unsigned int periodicity;
unsigned int layer_id[MAX_PERIODICITY];
unsigned int layer_id[VPX_TS_MAX_PERIODICITY];
#if CONFIG_MULTI_RES_ENCODING
/* Number of total resolutions encoded */

View File

@@ -13,25 +13,19 @@
#define __INC_VP8C_INT_H
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "vpx/internal/vpx_codec_internal.h"
#include "loopfilter.h"
#include "entropymv.h"
#include "entropy.h"
#include "idct.h"
#include "recon.h"
#if CONFIG_POSTPROC
#include "postproc.h"
#endif
#include "dequantize.h"
/*#ifdef PACKET_TESTING*/
#include "header.h"
/*#endif*/
/* Create/destroy static data structures. */
void vp8_initialize_common(void);
#define MINQ 0
#define MAXQ 127
#define QINDEX_RANGE (MAXQ + 1)
@@ -71,23 +65,6 @@ typedef enum
BILINEAR = 1
} INTERPOLATIONFILTERTYPE;
typedef struct VP8_COMMON_RTCD
{
#if CONFIG_RUNTIME_CPU_DETECT
vp8_dequant_rtcd_vtable_t dequant;
vp8_idct_rtcd_vtable_t idct;
vp8_recon_rtcd_vtable_t recon;
vp8_subpix_rtcd_vtable_t subpix;
vp8_loopfilter_rtcd_vtable_t loopfilter;
#if CONFIG_POSTPROC
vp8_postproc_rtcd_vtable_t postproc;
#endif
int flags;
#else
int unused;
#endif
} VP8_COMMON_RTCD;
typedef struct VP8Common
{
@@ -111,11 +88,13 @@ typedef struct VP8Common
int fb_idx_ref_cnt[NUM_YV12_BUFFERS];
int new_fb_idx, lst_fb_idx, gld_fb_idx, alt_fb_idx;
YV12_BUFFER_CONFIG post_proc_buffer;
YV12_BUFFER_CONFIG temp_scale_frame;
#if CONFIG_POSTPROC
YV12_BUFFER_CONFIG post_proc_buffer;
YV12_BUFFER_CONFIG post_proc_buffer_int;
int post_proc_buffer_int_used;
#endif
FRAME_TYPE last_frame_type; /* Save last frame's frame type for motion search. */
FRAME_TYPE frame_type;
@@ -203,15 +182,13 @@ typedef struct VP8Common
double bitrate;
double framerate;
#if CONFIG_RUNTIME_CPU_DETECT
VP8_COMMON_RTCD rtcd;
#endif
#if CONFIG_MULTITHREAD
int processor_core_count;
#endif
#if CONFIG_POSTPROC
struct postproc_state postproc_state;
#endif
int cpu_caps;
} VP8_COMMON;
#endif

View File

@@ -10,15 +10,14 @@
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "vpx_scale/yv12config.h"
#include "postproc.h"
#include "common.h"
#include "recon.h"
#include "vpx_scale/yv12extend.h"
#include "vpx_scale/vpxscale.h"
#include "systemdependent.h"
#include "../encoder/variance.h"
#include <limits.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
@@ -29,7 +28,6 @@
( (0.439*(float)(t>>16)) - (0.368*(float)(t>>8&0xff)) - (0.071*(float)(t&0xff)) + 128)
/* global constants */
#define MFQE_PRECISION 4
#if CONFIG_POSTPROC_VISUALIZER
static const unsigned char MB_PREDICTION_MODE_colors[MB_MODE_COUNT][3] =
{
@@ -329,21 +327,19 @@ static void vp8_deblock_and_de_macro_block(YV12_BUFFER_CONFIG *source,
YV12_BUFFER_CONFIG *post,
int q,
int low_var_thresh,
int flag,
vp8_postproc_rtcd_vtable_t *rtcd)
int flag)
{
double level = 6.0e-05 * q * q * q - .0067 * q * q + .306 * q + .0065;
int ppl = (int)(level + .5);
(void) low_var_thresh;
(void) flag;
POSTPROC_INVOKE(rtcd, downacross)(source->y_buffer, post->y_buffer, source->y_stride, post->y_stride, source->y_height, source->y_width, ppl);
POSTPROC_INVOKE(rtcd, across)(post->y_buffer, post->y_stride, post->y_height, post->y_width, q2mbl(q));
POSTPROC_INVOKE(rtcd, down)(post->y_buffer, post->y_stride, post->y_height, post->y_width, q2mbl(q));
vp8_post_proc_down_and_across(source->y_buffer, post->y_buffer, source->y_stride, post->y_stride, source->y_height, source->y_width, ppl);
vp8_mbpost_proc_across_ip(post->y_buffer, post->y_stride, post->y_height, post->y_width, q2mbl(q));
vp8_mbpost_proc_down(post->y_buffer, post->y_stride, post->y_height, post->y_width, q2mbl(q));
POSTPROC_INVOKE(rtcd, downacross)(source->u_buffer, post->u_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
POSTPROC_INVOKE(rtcd, downacross)(source->v_buffer, post->v_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
vp8_post_proc_down_and_across(source->u_buffer, post->u_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
vp8_post_proc_down_and_across(source->v_buffer, post->v_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
}
@@ -351,25 +347,24 @@ void vp8_deblock(YV12_BUFFER_CONFIG *source,
YV12_BUFFER_CONFIG *post,
int q,
int low_var_thresh,
int flag,
vp8_postproc_rtcd_vtable_t *rtcd)
int flag)
{
double level = 6.0e-05 * q * q * q - .0067 * q * q + .306 * q + .0065;
int ppl = (int)(level + .5);
(void) low_var_thresh;
(void) flag;
POSTPROC_INVOKE(rtcd, downacross)(source->y_buffer, post->y_buffer, source->y_stride, post->y_stride, source->y_height, source->y_width, ppl);
POSTPROC_INVOKE(rtcd, downacross)(source->u_buffer, post->u_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
POSTPROC_INVOKE(rtcd, downacross)(source->v_buffer, post->v_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
vp8_post_proc_down_and_across(source->y_buffer, post->y_buffer, source->y_stride, post->y_stride, source->y_height, source->y_width, ppl);
vp8_post_proc_down_and_across(source->u_buffer, post->u_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
vp8_post_proc_down_and_across(source->v_buffer, post->v_buffer, source->uv_stride, post->uv_stride, source->uv_height, source->uv_width, ppl);
}
#if !(CONFIG_TEMPORAL_DENOISING)
void vp8_de_noise(YV12_BUFFER_CONFIG *source,
YV12_BUFFER_CONFIG *post,
int q,
int low_var_thresh,
int flag,
vp8_postproc_rtcd_vtable_t *rtcd)
int flag)
{
double level = 6.0e-05 * q * q * q - .0067 * q * q + .306 * q + .0065;
int ppl = (int)(level + .5);
@@ -377,7 +372,7 @@ void vp8_de_noise(YV12_BUFFER_CONFIG *source,
(void) low_var_thresh;
(void) flag;
POSTPROC_INVOKE(rtcd, downacross)(
vp8_post_proc_down_and_across(
source->y_buffer + 2 * source->y_stride + 2,
source->y_buffer + 2 * source->y_stride + 2,
source->y_stride,
@@ -385,14 +380,14 @@ void vp8_de_noise(YV12_BUFFER_CONFIG *source,
source->y_height - 4,
source->y_width - 4,
ppl);
POSTPROC_INVOKE(rtcd, downacross)(
vp8_post_proc_down_and_across(
source->u_buffer + 2 * source->uv_stride + 2,
source->u_buffer + 2 * source->uv_stride + 2,
source->uv_stride,
source->uv_stride,
source->uv_height - 4,
source->uv_width - 4, ppl);
POSTPROC_INVOKE(rtcd, downacross)(
vp8_post_proc_down_and_across(
source->v_buffer + 2 * source->uv_stride + 2,
source->v_buffer + 2 * source->uv_stride + 2,
source->uv_stride,
@@ -401,6 +396,7 @@ void vp8_de_noise(YV12_BUFFER_CONFIG *source,
source->uv_width - 4, ppl);
}
#endif
double vp8_gaussian(double sigma, double mu, double x)
{
@@ -408,9 +404,6 @@ double vp8_gaussian(double sigma, double mu, double x)
(exp(-(x - mu) * (x - mu) / (2 * sigma * sigma)));
}
extern void (*vp8_clear_system_state)(void);
static void fillrd(struct postproc_state *state, int q, int a)
{
char char_dist[300];
@@ -696,220 +689,7 @@ static void constrain_line (int x0, int *x1, int y0, int *y1, int width, int hei
}
}
static void multiframe_quality_enhance_block
(
int blksize, /* Currently only values supported are 16, 8, 4 */
int qcurr,
int qprev,
unsigned char *y,
unsigned char *u,
unsigned char *v,
int y_stride,
int uv_stride,
unsigned char *yd,
unsigned char *ud,
unsigned char *vd,
int yd_stride,
int uvd_stride
)
{
static const unsigned char VP8_ZEROS[16]=
{
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
};
int blksizeby2 = blksize >> 1;
int qdiff = qcurr - qprev;
int i, j;
unsigned char *yp;
unsigned char *ydp;
unsigned char *up;
unsigned char *udp;
unsigned char *vp;
unsigned char *vdp;
unsigned int act, sse, sad, thr;
if (blksize == 16)
{
act = (vp8_variance_var16x16(yd, yd_stride, VP8_ZEROS, 0, &sse)+128)>>8;
sad = (vp8_variance_sad16x16(y, y_stride, yd, yd_stride, 0)+128)>>8;
}
else if (blksize == 8)
{
act = (vp8_variance_var8x8(yd, yd_stride, VP8_ZEROS, 0, &sse)+32)>>6;
sad = (vp8_variance_sad8x8(y, y_stride, yd, yd_stride, 0)+32)>>6;
}
else
{
act = (vp8_variance_var4x4(yd, yd_stride, VP8_ZEROS, 0, &sse)+8)>>4;
sad = (vp8_variance_sad4x4(y, y_stride, yd, yd_stride, 0)+8)>>4;
}
/* thr = qdiff/8 + log2(act) + log4(qprev) */
thr = (qdiff>>3);
while (act>>=1) thr++;
while (qprev>>=2) thr++;
if (sad < thr)
{
static const int roundoff = (1 << (MFQE_PRECISION - 1));
int ifactor = (sad << MFQE_PRECISION) / thr;
ifactor >>= (qdiff >> 5);
// TODO: SIMD optimize this section
if (ifactor)
{
int icfactor = (1 << MFQE_PRECISION) - ifactor;
for (yp = y, ydp = yd, i = 0; i < blksize; ++i, yp += y_stride, ydp += yd_stride)
{
for (j = 0; j < blksize; ++j)
ydp[j] = (int)((yp[j] * ifactor + ydp[j] * icfactor + roundoff) >> MFQE_PRECISION);
}
for (up = u, udp = ud, i = 0; i < blksizeby2; ++i, up += uv_stride, udp += uvd_stride)
{
for (j = 0; j < blksizeby2; ++j)
udp[j] = (int)((up[j] * ifactor + udp[j] * icfactor + roundoff) >> MFQE_PRECISION);
}
for (vp = v, vdp = vd, i = 0; i < blksizeby2; ++i, vp += uv_stride, vdp += uvd_stride)
{
for (j = 0; j < blksizeby2; ++j)
vdp[j] = (int)((vp[j] * ifactor + vdp[j] * icfactor + roundoff) >> MFQE_PRECISION);
}
}
}
else
{
if (blksize == 16)
{
vp8_recon_copy16x16(y, y_stride, yd, yd_stride);
vp8_recon_copy8x8(u, uv_stride, ud, uvd_stride);
vp8_recon_copy8x8(v, uv_stride, vd, uvd_stride);
}
else if (blksize == 8)
{
vp8_recon_copy8x8(y, y_stride, yd, yd_stride);
for (up = u, udp = ud, i = 0; i < blksizeby2; ++i, up += uv_stride, udp += uvd_stride)
vpx_memcpy(udp, up, blksizeby2);
for (vp = v, vdp = vd, i = 0; i < blksizeby2; ++i, vp += uv_stride, vdp += uvd_stride)
vpx_memcpy(vdp, vp, blksizeby2);
}
else
{
for (yp = y, ydp = yd, i = 0; i < blksize; ++i, yp += y_stride, ydp += yd_stride)
vpx_memcpy(ydp, yp, blksize);
for (up = u, udp = ud, i = 0; i < blksizeby2; ++i, up += uv_stride, udp += uvd_stride)
vpx_memcpy(udp, up, blksizeby2);
for (vp = v, vdp = vd, i = 0; i < blksizeby2; ++i, vp += uv_stride, vdp += uvd_stride)
vpx_memcpy(vdp, vp, blksizeby2);
}
}
}
#if CONFIG_RUNTIME_CPU_DETECT
#define RTCD_VTABLE(oci) (&(oci)->rtcd.postproc)
#else
#define RTCD_VTABLE(oci) NULL
#endif
void vp8_multiframe_quality_enhance
(
VP8_COMMON *cm
)
{
YV12_BUFFER_CONFIG *show = cm->frame_to_show;
YV12_BUFFER_CONFIG *dest = &cm->post_proc_buffer;
FRAME_TYPE frame_type = cm->frame_type;
/* Point at base of Mb MODE_INFO list has motion vectors etc */
const MODE_INFO *mode_info_context = cm->mi;
int mb_row;
int mb_col;
int qcurr = cm->base_qindex;
int qprev = cm->postproc_state.last_base_qindex;
unsigned char *y_ptr, *u_ptr, *v_ptr;
unsigned char *yd_ptr, *ud_ptr, *vd_ptr;
/* Set up the buffer pointers */
y_ptr = show->y_buffer;
u_ptr = show->u_buffer;
v_ptr = show->v_buffer;
yd_ptr = dest->y_buffer;
ud_ptr = dest->u_buffer;
vd_ptr = dest->v_buffer;
/* postprocess each macro block */
for (mb_row = 0; mb_row < cm->mb_rows; mb_row++)
{
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
{
/* if motion is high there will likely be no benefit */
if (((frame_type == INTER_FRAME &&
abs(mode_info_context->mbmi.mv.as_mv.row) <= 10 &&
abs(mode_info_context->mbmi.mv.as_mv.col) <= 10) ||
(frame_type == KEY_FRAME)))
{
if (mode_info_context->mbmi.mode == B_PRED || mode_info_context->mbmi.mode == SPLITMV)
{
int i, j;
for (i=0; i<2; ++i)
for (j=0; j<2; ++j)
multiframe_quality_enhance_block(8,
qcurr,
qprev,
y_ptr + 8*(i*show->y_stride+j),
u_ptr + 4*(i*show->uv_stride+j),
v_ptr + 4*(i*show->uv_stride+j),
show->y_stride,
show->uv_stride,
yd_ptr + 8*(i*dest->y_stride+j),
ud_ptr + 4*(i*dest->uv_stride+j),
vd_ptr + 4*(i*dest->uv_stride+j),
dest->y_stride,
dest->uv_stride);
}
else
{
multiframe_quality_enhance_block(16,
qcurr,
qprev,
y_ptr,
u_ptr,
v_ptr,
show->y_stride,
show->uv_stride,
yd_ptr,
ud_ptr,
vd_ptr,
dest->y_stride,
dest->uv_stride);
}
}
else
{
vp8_recon_copy16x16(y_ptr, show->y_stride, yd_ptr, dest->y_stride);
vp8_recon_copy8x8(u_ptr, show->uv_stride, ud_ptr, dest->uv_stride);
vp8_recon_copy8x8(v_ptr, show->uv_stride, vd_ptr, dest->uv_stride);
}
y_ptr += 16;
u_ptr += 8;
v_ptr += 8;
yd_ptr += 16;
ud_ptr += 8;
vd_ptr += 8;
mode_info_context++; /* step to next MB */
}
y_ptr += show->y_stride * 16 - 16 * cm->mb_cols;
u_ptr += show->uv_stride * 8 - 8 * cm->mb_cols;
v_ptr += show->uv_stride * 8 - 8 * cm->mb_cols;
yd_ptr += dest->y_stride * 16 - 16 * cm->mb_cols;
ud_ptr += dest->uv_stride * 8 - 8 * cm->mb_cols;
vd_ptr += dest->uv_stride * 8 - 8 * cm->mb_cols;
mode_info_context++; /* Skip border mb */
}
}
#if CONFIG_POSTPROC
int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t *ppflags)
{
int q = oci->filter_level * 10 / 6;
@@ -932,6 +712,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
dest->y_height = oci->Height;
dest->uv_height = dest->y_height / 2;
oci->postproc_state.last_base_qindex = oci->base_qindex;
oci->postproc_state.last_frame_valid = 1;
return 0;
}
@@ -940,13 +721,19 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
{
if ((flags & VP8D_DEBLOCK) || (flags & VP8D_DEMACROBLOCK))
{
if (vp8_yv12_alloc_frame_buffer(&oci->post_proc_buffer_int, oci->Width, oci->Height, VP8BORDERINPIXELS) >= 0)
{
oci->post_proc_buffer_int_used = 1;
}
int width = (oci->Width + 15) & ~15;
int height = (oci->Height + 15) & ~15;
if (vp8_yv12_alloc_frame_buffer(&oci->post_proc_buffer_int,
width, height, VP8BORDERINPIXELS))
vpx_internal_error(&oci->error, VPX_CODEC_MEM_ERROR,
"Failed to allocate MFQE framebuffer");
oci->post_proc_buffer_int_used = 1;
// insure that postproc is set to all 0's so that post proc
// doesn't pull random data in from edge
vpx_memset((&oci->post_proc_buffer_int)->buffer_alloc,126,(&oci->post_proc_buffer)->frame_size);
vpx_memset((&oci->post_proc_buffer_int)->buffer_alloc,128,(&oci->post_proc_buffer)->frame_size);
}
}
@@ -956,6 +743,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
#endif
if ((flags & VP8D_MFQE) &&
oci->postproc_state.last_frame_valid &&
oci->current_video_frame >= 2 &&
oci->base_qindex - oci->postproc_state.last_base_qindex >= 10)
{
@@ -963,16 +751,16 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
if (((flags & VP8D_DEBLOCK) || (flags & VP8D_DEMACROBLOCK)) &&
oci->post_proc_buffer_int_used)
{
vp8_yv12_copy_frame_ptr(&oci->post_proc_buffer, &oci->post_proc_buffer_int);
vp8_yv12_copy_frame(&oci->post_proc_buffer, &oci->post_proc_buffer_int);
if (flags & VP8D_DEMACROBLOCK)
{
vp8_deblock_and_de_macro_block(&oci->post_proc_buffer_int, &oci->post_proc_buffer,
q + (deblock_level - 5) * 10, 1, 0, RTCD_VTABLE(oci));
q + (deblock_level - 5) * 10, 1, 0);
}
else if (flags & VP8D_DEBLOCK)
{
vp8_deblock(&oci->post_proc_buffer_int, &oci->post_proc_buffer,
q, 1, 0, RTCD_VTABLE(oci));
q, 1, 0);
}
}
/* Move partially towards the base q of the previous frame */
@@ -981,20 +769,21 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
else if (flags & VP8D_DEMACROBLOCK)
{
vp8_deblock_and_de_macro_block(oci->frame_to_show, &oci->post_proc_buffer,
q + (deblock_level - 5) * 10, 1, 0, RTCD_VTABLE(oci));
q + (deblock_level - 5) * 10, 1, 0);
oci->postproc_state.last_base_qindex = oci->base_qindex;
}
else if (flags & VP8D_DEBLOCK)
{
vp8_deblock(oci->frame_to_show, &oci->post_proc_buffer,
q, 1, 0, RTCD_VTABLE(oci));
q, 1, 0);
oci->postproc_state.last_base_qindex = oci->base_qindex;
}
else
{
vp8_yv12_copy_frame_ptr(oci->frame_to_show, &oci->post_proc_buffer);
vp8_yv12_copy_frame(oci->frame_to_show, &oci->post_proc_buffer);
oci->postproc_state.last_base_qindex = oci->base_qindex;
}
oci->postproc_state.last_frame_valid = 1;
if (flags & VP8D_ADDNOISE)
{
@@ -1004,7 +793,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
fillrd(&oci->postproc_state, 63 - q, noise_level);
}
POSTPROC_INVOKE(RTCD_VTABLE(oci), addnoise)
vp8_plane_add_noise
(oci->post_proc_buffer.y_buffer,
oci->postproc_state.noise,
oci->postproc_state.blackclamp,
@@ -1302,7 +1091,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
U = B_PREDICTION_MODE_colors[bmi->as_mode][1];
V = B_PREDICTION_MODE_colors[bmi->as_mode][2];
POSTPROC_INVOKE(RTCD_VTABLE(oci), blend_b)
vp8_blend_b
(yl+bx, ul+(bx>>1), vl+(bx>>1), Y, U, V, 0xc000, y_stride);
}
bmi++;
@@ -1319,7 +1108,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
U = MB_PREDICTION_MODE_colors[mi->mbmi.mode][1];
V = MB_PREDICTION_MODE_colors[mi->mbmi.mode][2];
POSTPROC_INVOKE(RTCD_VTABLE(oci), blend_mb_inner)
vp8_blend_mb_inner
(y_ptr+x, u_ptr+(x>>1), v_ptr+(x>>1), Y, U, V, 0xc000, y_stride);
}
@@ -1358,7 +1147,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
U = MV_REFERENCE_FRAME_colors[mi->mbmi.ref_frame][1];
V = MV_REFERENCE_FRAME_colors[mi->mbmi.ref_frame][2];
POSTPROC_INVOKE(RTCD_VTABLE(oci), blend_mb_outer)
vp8_blend_mb_outer
(y_ptr+x, u_ptr+(x>>1), v_ptr+(x>>1), Y, U, V, 0xc000, y_stride);
}
@@ -1381,3 +1170,4 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
dest->uv_height = dest->y_height / 2;
return 0;
}
#endif

View File

@@ -12,92 +12,6 @@
#ifndef POSTPROC_H
#define POSTPROC_H
#define prototype_postproc_inplace(sym)\
void sym (unsigned char *dst, int pitch, int rows, int cols,int flimit)
#define prototype_postproc(sym)\
void sym (unsigned char *src, unsigned char *dst, int src_pitch,\
int dst_pitch, int rows, int cols, int flimit)
#define prototype_postproc_addnoise(sym) \
void sym (unsigned char *s, char *noise, char blackclamp[16],\
char whiteclamp[16], char bothclamp[16],\
unsigned int w, unsigned int h, int pitch)
#define prototype_postproc_blend_mb_inner(sym)\
void sym (unsigned char *y, unsigned char *u, unsigned char *v,\
int y1, int u1, int v1, int alpha, int stride)
#define prototype_postproc_blend_mb_outer(sym)\
void sym (unsigned char *y, unsigned char *u, unsigned char *v,\
int y1, int u1, int v1, int alpha, int stride)
#define prototype_postproc_blend_b(sym)\
void sym (unsigned char *y, unsigned char *u, unsigned char *v,\
int y1, int u1, int v1, int alpha, int stride)
#if ARCH_X86 || ARCH_X86_64
#include "x86/postproc_x86.h"
#endif
#ifndef vp8_postproc_down
#define vp8_postproc_down vp8_mbpost_proc_down_c
#endif
extern prototype_postproc_inplace(vp8_postproc_down);
#ifndef vp8_postproc_across
#define vp8_postproc_across vp8_mbpost_proc_across_ip_c
#endif
extern prototype_postproc_inplace(vp8_postproc_across);
#ifndef vp8_postproc_downacross
#define vp8_postproc_downacross vp8_post_proc_down_and_across_c
#endif
extern prototype_postproc(vp8_postproc_downacross);
#ifndef vp8_postproc_addnoise
#define vp8_postproc_addnoise vp8_plane_add_noise_c
#endif
extern prototype_postproc_addnoise(vp8_postproc_addnoise);
#ifndef vp8_postproc_blend_mb_inner
#define vp8_postproc_blend_mb_inner vp8_blend_mb_inner_c
#endif
extern prototype_postproc_blend_mb_inner(vp8_postproc_blend_mb_inner);
#ifndef vp8_postproc_blend_mb_outer
#define vp8_postproc_blend_mb_outer vp8_blend_mb_outer_c
#endif
extern prototype_postproc_blend_mb_outer(vp8_postproc_blend_mb_outer);
#ifndef vp8_postproc_blend_b
#define vp8_postproc_blend_b vp8_blend_b_c
#endif
extern prototype_postproc_blend_b(vp8_postproc_blend_b);
typedef prototype_postproc((*vp8_postproc_fn_t));
typedef prototype_postproc_inplace((*vp8_postproc_inplace_fn_t));
typedef prototype_postproc_addnoise((*vp8_postproc_addnoise_fn_t));
typedef prototype_postproc_blend_mb_inner((*vp8_postproc_blend_mb_inner_fn_t));
typedef prototype_postproc_blend_mb_outer((*vp8_postproc_blend_mb_outer_fn_t));
typedef prototype_postproc_blend_b((*vp8_postproc_blend_b_fn_t));
typedef struct
{
vp8_postproc_inplace_fn_t down;
vp8_postproc_inplace_fn_t across;
vp8_postproc_fn_t downacross;
vp8_postproc_addnoise_fn_t addnoise;
vp8_postproc_blend_mb_inner_fn_t blend_mb_inner;
vp8_postproc_blend_mb_outer_fn_t blend_mb_outer;
vp8_postproc_blend_b_fn_t blend_b;
} vp8_postproc_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define POSTPROC_INVOKE(ctx,fn) (ctx)->fn
#else
#define POSTPROC_INVOKE(ctx,fn) vp8_postproc_##fn
#endif
#include "vpx_ports/mem.h"
struct postproc_state
{
@@ -105,6 +19,7 @@ struct postproc_state
int last_noise;
char noise[3072];
int last_base_qindex;
int last_frame_valid;
DECLARE_ALIGNED(16, char, blackclamp[16]);
DECLARE_ALIGNED(16, char, whiteclamp[16]);
DECLARE_ALIGNED(16, char, bothclamp[16]);
@@ -119,13 +34,15 @@ void vp8_de_noise(YV12_BUFFER_CONFIG *source,
YV12_BUFFER_CONFIG *post,
int q,
int low_var_thresh,
int flag,
vp8_postproc_rtcd_vtable_t *rtcd);
int flag);
void vp8_deblock(YV12_BUFFER_CONFIG *source,
YV12_BUFFER_CONFIG *post,
int q,
int low_var_thresh,
int flag,
vp8_postproc_rtcd_vtable_t *rtcd);
int flag);
#define MFQE_PRECISION 4
void vp8_multiframe_quality_enhance(struct VP8Common *cm);
#endif

View File

@@ -98,7 +98,7 @@
stw r4, 0(r7) ;# sse
mullw r3, r3, r3 ;# sum*sum
srawi r3, r3, \DS ;# (sum*sum) >> DS
srlwi r3, r3, \DS ;# (sum*sum) >> DS
subf r3, r3, r4 ;# sse - ((sum*sum) >> DS)
.endm
@@ -142,7 +142,7 @@
stw r4, 0(r7) ;# sse
mullw r3, r3, r3 ;# sum*sum
srawi r3, r3, \DS ;# (sum*sum) >> 8
srlwi r3, r3, \DS ;# (sum*sum) >> 8
subf r3, r3, r4 ;# sse - ((sum*sum) >> 8)
.endm
@@ -367,7 +367,7 @@ vp8_variance4x4_ppc:
stw r4, 0(r7) ;# sse
mullw r3, r3, r3 ;# sum*sum
srawi r3, r3, 4 ;# (sum*sum) >> 4
srlwi r3, r3, 4 ;# (sum*sum) >> 4
subf r3, r3, r4 ;# sse - ((sum*sum) >> 4)
epilogue

View File

@@ -157,7 +157,7 @@
stw r4, 0(r9) ;# sse
mullw r3, r3, r3 ;# sum*sum
srawi r3, r3, \DS ;# (sum*sum) >> 8
srlwi r3, r3, \DS ;# (sum*sum) >> 8
subf r3, r3, r4 ;# sse - ((sum*sum) >> 8)
.endm

View File

@@ -1,111 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef __INC_RECON_H
#define __INC_RECON_H
#include "blockd.h"
#define prototype_copy_block(sym) \
void sym(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch)
#define prototype_recon_block(sym) \
void sym(unsigned char *pred, short *diff, int diff_stride, unsigned char *dst, int pitch)
#define prototype_recon_macroblock(sym) \
void sym(const struct vp8_recon_rtcd_vtable *rtcd, MACROBLOCKD *x)
#define prototype_build_intra_predictors(sym) \
void sym(MACROBLOCKD *x)
#define prototype_intra4x4_predict(sym) \
void sym(unsigned char *src, int src_stride, int b_mode, \
unsigned char *dst, int dst_stride)
struct vp8_recon_rtcd_vtable;
#if ARCH_X86 || ARCH_X86_64
#include "x86/recon_x86.h"
#endif
#if ARCH_ARM
#include "arm/recon_arm.h"
#endif
#ifndef vp8_recon_copy16x16
#define vp8_recon_copy16x16 vp8_copy_mem16x16_c
#endif
extern prototype_copy_block(vp8_recon_copy16x16);
#ifndef vp8_recon_copy8x8
#define vp8_recon_copy8x8 vp8_copy_mem8x8_c
#endif
extern prototype_copy_block(vp8_recon_copy8x8);
#ifndef vp8_recon_copy8x4
#define vp8_recon_copy8x4 vp8_copy_mem8x4_c
#endif
extern prototype_copy_block(vp8_recon_copy8x4);
#ifndef vp8_recon_build_intra_predictors_mby
#define vp8_recon_build_intra_predictors_mby vp8_build_intra_predictors_mby
#endif
extern prototype_build_intra_predictors\
(vp8_recon_build_intra_predictors_mby);
#ifndef vp8_recon_build_intra_predictors_mby_s
#define vp8_recon_build_intra_predictors_mby_s vp8_build_intra_predictors_mby_s
#endif
extern prototype_build_intra_predictors\
(vp8_recon_build_intra_predictors_mby_s);
#ifndef vp8_recon_build_intra_predictors_mbuv
#define vp8_recon_build_intra_predictors_mbuv vp8_build_intra_predictors_mbuv
#endif
extern prototype_build_intra_predictors\
(vp8_recon_build_intra_predictors_mbuv);
#ifndef vp8_recon_build_intra_predictors_mbuv_s
#define vp8_recon_build_intra_predictors_mbuv_s vp8_build_intra_predictors_mbuv_s
#endif
extern prototype_build_intra_predictors\
(vp8_recon_build_intra_predictors_mbuv_s);
#ifndef vp8_recon_intra4x4_predict
#define vp8_recon_intra4x4_predict vp8_intra4x4_predict_c
#endif
extern prototype_intra4x4_predict\
(vp8_recon_intra4x4_predict);
typedef prototype_copy_block((*vp8_copy_block_fn_t));
typedef prototype_build_intra_predictors((*vp8_build_intra_pred_fn_t));
typedef prototype_intra4x4_predict((*vp8_intra4x4_pred_fn_t));
typedef struct vp8_recon_rtcd_vtable
{
vp8_copy_block_fn_t copy16x16;
vp8_copy_block_fn_t copy8x8;
vp8_copy_block_fn_t copy8x4;
vp8_build_intra_pred_fn_t build_intra_predictors_mby_s;
vp8_build_intra_pred_fn_t build_intra_predictors_mby;
vp8_build_intra_pred_fn_t build_intra_predictors_mbuv_s;
vp8_build_intra_pred_fn_t build_intra_predictors_mbuv;
vp8_intra4x4_pred_fn_t intra4x4_predict;
} vp8_recon_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define RECON_INVOKE(ctx,fn) (ctx)->fn
#else
#define RECON_INVOKE(ctx,fn) vp8_recon_##fn
#endif
#endif

View File

@@ -9,10 +9,10 @@
*/
#include <limits.h>
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "vpx/vpx_integer.h"
#include "recon.h"
#include "subpixel.h"
#include "blockd.h"
#include "reconinter.h"
#if CONFIG_RUNTIME_CPU_DETECT
@@ -123,25 +123,19 @@ void vp8_copy_mem8x4_c(
}
void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, vp8_subpix_fn_t sppf)
void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, unsigned char *base_pre, int pre_stride, vp8_subpix_fn_t sppf)
{
int r;
unsigned char *ptr_base;
unsigned char *ptr;
unsigned char *pred_ptr = d->predictor;
ptr_base = *(d->base_pre);
unsigned char *ptr;
ptr = base_pre + d->offset + (d->bmi.mv.as_mv.row >> 3) * pre_stride + (d->bmi.mv.as_mv.col >> 3);
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
{
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
sppf(ptr, d->pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, pred_ptr, pitch);
sppf(ptr, pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, pred_ptr, pitch);
}
else
{
ptr_base += d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
ptr = ptr_base;
for (r = 0; r < 4; r++)
{
#if !(CONFIG_FAST_UNALIGNED)
@@ -153,65 +147,53 @@ void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, vp8_subpix_fn_t sppf)
*(uint32_t *)pred_ptr = *(uint32_t *)ptr ;
#endif
pred_ptr += pitch;
ptr += d->pre_stride;
ptr += pre_stride;
}
}
}
static void build_inter_predictors4b(MACROBLOCKD *x, BLOCKD *d, unsigned char *dst, int dst_stride)
static void build_inter_predictors4b(MACROBLOCKD *x, BLOCKD *d, unsigned char *dst, int dst_stride, unsigned char *base_pre, int pre_stride)
{
unsigned char *ptr_base;
unsigned char *ptr;
ptr_base = *(d->base_pre);
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
ptr = base_pre + d->offset + (d->bmi.mv.as_mv.row >> 3) * pre_stride + (d->bmi.mv.as_mv.col >> 3);
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
{
x->subpixel_predict8x8(ptr, d->pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
x->subpixel_predict8x8(ptr, pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy8x8)(ptr, d->pre_stride, dst, dst_stride);
vp8_copy_mem8x8(ptr, pre_stride, dst, dst_stride);
}
}
static void build_inter_predictors2b(MACROBLOCKD *x, BLOCKD *d, unsigned char *dst, int dst_stride)
static void build_inter_predictors2b(MACROBLOCKD *x, BLOCKD *d, unsigned char *dst, int dst_stride, unsigned char *base_pre, int pre_stride)
{
unsigned char *ptr_base;
unsigned char *ptr;
ptr_base = *(d->base_pre);
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
ptr = base_pre + d->offset + (d->bmi.mv.as_mv.row >> 3) * pre_stride + (d->bmi.mv.as_mv.col >> 3);
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
{
x->subpixel_predict8x4(ptr, d->pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
x->subpixel_predict8x4(ptr, pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy8x4)(ptr, d->pre_stride, dst, dst_stride);
vp8_copy_mem8x4(ptr, pre_stride, dst, dst_stride);
}
}
static void build_inter_predictors_b(BLOCKD *d, unsigned char *dst, int dst_stride, vp8_subpix_fn_t sppf)
static void build_inter_predictors_b(BLOCKD *d, unsigned char *dst, int dst_stride, unsigned char *base_pre, int pre_stride, vp8_subpix_fn_t sppf)
{
int r;
unsigned char *ptr_base;
unsigned char *ptr;
ptr_base = *(d->base_pre);
ptr = base_pre + d->offset + (d->bmi.mv.as_mv.row >> 3) * pre_stride + (d->bmi.mv.as_mv.col >> 3);
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
{
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
sppf(ptr, d->pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
sppf(ptr, pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst, dst_stride);
}
else
{
ptr_base += d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
ptr = ptr_base;
for (r = 0; r < 4; r++)
{
#if !(CONFIG_FAST_UNALIGNED)
@@ -223,7 +205,7 @@ static void build_inter_predictors_b(BLOCKD *d, unsigned char *dst, int dst_stri
*(uint32_t *)dst = *(uint32_t *)ptr ;
#endif
dst += dst_stride;
ptr += d->pre_stride;
ptr += pre_stride;
}
}
}
@@ -239,22 +221,13 @@ void vp8_build_inter16x16_predictors_mbuv(MACROBLOCKD *x)
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
int offset;
int pre_stride = x->block[16].pre_stride;
int pre_stride = x->pre.uv_stride;
/* calc uv motion vectors */
if (mv_row < 0)
mv_row -= 1;
else
mv_row += 1;
if (mv_col < 0)
mv_col -= 1;
else
mv_col += 1;
mv_row += 1 | (mv_row >> (sizeof(int) * CHAR_BIT - 1));
mv_col += 1 | (mv_col >> (sizeof(int) * CHAR_BIT - 1));
mv_row /= 2;
mv_col /= 2;
mv_row &= x->fullpixel_mask;
mv_col &= x->fullpixel_mask;
@@ -269,8 +242,8 @@ void vp8_build_inter16x16_predictors_mbuv(MACROBLOCKD *x)
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy8x8)(uptr, pre_stride, upred_ptr, 8);
RECON_INVOKE(&x->rtcd->recon, copy8x8)(vptr, pre_stride, vpred_ptr, 8);
vp8_copy_mem8x8(uptr, pre_stride, upred_ptr, 8);
vp8_copy_mem8x8(vptr, pre_stride, vpred_ptr, 8);
}
}
@@ -278,6 +251,8 @@ void vp8_build_inter16x16_predictors_mbuv(MACROBLOCKD *x)
void vp8_build_inter4x4_predictors_mbuv(MACROBLOCKD *x)
{
int i, j;
int pre_stride = x->pre.uv_stride;
unsigned char *base_pre;
/* build uv mvs */
for (i = 0; i < 2; i++)
@@ -295,8 +270,7 @@ void vp8_build_inter4x4_predictors_mbuv(MACROBLOCKD *x)
+ x->block[yoffset+4].bmi.mv.as_mv.row
+ x->block[yoffset+5].bmi.mv.as_mv.row;
if (temp < 0) temp -= 4;
else temp += 4;
temp += 4 + ((temp >> (sizeof(int) * CHAR_BIT - 1)) << 3);
x->block[uoffset].bmi.mv.as_mv.row = (temp / 8) & x->fullpixel_mask;
@@ -305,29 +279,41 @@ void vp8_build_inter4x4_predictors_mbuv(MACROBLOCKD *x)
+ x->block[yoffset+4].bmi.mv.as_mv.col
+ x->block[yoffset+5].bmi.mv.as_mv.col;
if (temp < 0) temp -= 4;
else temp += 4;
temp += 4 + ((temp >> (sizeof(int) * CHAR_BIT - 1)) << 3);
x->block[uoffset].bmi.mv.as_mv.col = (temp / 8) & x->fullpixel_mask;
x->block[voffset].bmi.mv.as_mv.row =
x->block[uoffset].bmi.mv.as_mv.row ;
x->block[voffset].bmi.mv.as_mv.col =
x->block[uoffset].bmi.mv.as_mv.col ;
x->block[voffset].bmi.mv.as_int = x->block[uoffset].bmi.mv.as_int;
}
}
for (i = 16; i < 24; i += 2)
base_pre = x->pre.u_buffer;
for (i = 16; i < 20; i += 2)
{
BLOCKD *d0 = &x->block[i];
BLOCKD *d1 = &x->block[i+1];
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
build_inter_predictors2b(x, d0, d0->predictor, 8);
build_inter_predictors2b(x, d0, d0->predictor, 8, base_pre, pre_stride);
else
{
vp8_build_inter_predictors_b(d0, 8, x->subpixel_predict);
vp8_build_inter_predictors_b(d1, 8, x->subpixel_predict);
vp8_build_inter_predictors_b(d0, 8, base_pre, pre_stride, x->subpixel_predict);
vp8_build_inter_predictors_b(d1, 8, base_pre, pre_stride, x->subpixel_predict);
}
}
base_pre = x->pre.v_buffer;
for (i = 20; i < 24; i += 2)
{
BLOCKD *d0 = &x->block[i];
BLOCKD *d1 = &x->block[i+1];
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
build_inter_predictors2b(x, d0, d0->predictor, 8, base_pre, pre_stride);
else
{
vp8_build_inter_predictors_b(d0, 8, base_pre, pre_stride, x->subpixel_predict);
vp8_build_inter_predictors_b(d1, 8, base_pre, pre_stride, x->subpixel_predict);
}
}
}
@@ -342,7 +328,7 @@ void vp8_build_inter16x16_predictors_mby(MACROBLOCKD *x,
unsigned char *ptr;
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
int pre_stride = x->block[0].pre_stride;
int pre_stride = x->pre.y_stride;
ptr_base = x->pre.y_buffer;
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
@@ -354,7 +340,7 @@ void vp8_build_inter16x16_predictors_mby(MACROBLOCKD *x,
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, dst_y,
vp8_copy_mem16x16(ptr, pre_stride, dst_y,
dst_ystride);
}
}
@@ -409,7 +395,7 @@ void vp8_build_inter16x16_predictors_mb(MACROBLOCKD *x,
int_mv _16x16mv;
unsigned char *ptr_base = x->pre.y_buffer;
int pre_stride = x->block[0].pre_stride;
int pre_stride = x->pre.y_stride;
_16x16mv.as_int = x->mode_info_context->mbmi.mv.as_int;
@@ -426,23 +412,14 @@ void vp8_build_inter16x16_predictors_mb(MACROBLOCKD *x,
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, dst_y, dst_ystride);
vp8_copy_mem16x16(ptr, pre_stride, dst_y, dst_ystride);
}
/* calc uv motion vectors */
if ( _16x16mv.as_mv.row < 0)
_16x16mv.as_mv.row -= 1;
else
_16x16mv.as_mv.row += 1;
if (_16x16mv.as_mv.col < 0)
_16x16mv.as_mv.col -= 1;
else
_16x16mv.as_mv.col += 1;
_16x16mv.as_mv.row += 1 | (_16x16mv.as_mv.row >> (sizeof(int) * CHAR_BIT - 1));
_16x16mv.as_mv.col += 1 | (_16x16mv.as_mv.col >> (sizeof(int) * CHAR_BIT - 1));
_16x16mv.as_mv.row /= 2;
_16x16mv.as_mv.col /= 2;
_16x16mv.as_mv.row &= x->fullpixel_mask;
_16x16mv.as_mv.col &= x->fullpixel_mask;
@@ -458,19 +435,21 @@ void vp8_build_inter16x16_predictors_mb(MACROBLOCKD *x,
}
else
{
RECON_INVOKE(&x->rtcd->recon, copy8x8)(uptr, pre_stride, dst_u, dst_uvstride);
RECON_INVOKE(&x->rtcd->recon, copy8x8)(vptr, pre_stride, dst_v, dst_uvstride);
vp8_copy_mem8x8(uptr, pre_stride, dst_u, dst_uvstride);
vp8_copy_mem8x8(vptr, pre_stride, dst_v, dst_uvstride);
}
}
static void build_inter4x4_predictors_mb(MACROBLOCKD *x)
{
int i;
unsigned char *base_dst = x->dst.y_buffer;
unsigned char *base_pre = x->pre.y_buffer;
if (x->mode_info_context->mbmi.partitioning < 3)
{
BLOCKD *b;
int dst_stride = x->block[ 0].dst_stride;
int dst_stride = x->dst.y_stride;
x->block[ 0].bmi = x->mode_info_context->bmi[ 0];
x->block[ 2].bmi = x->mode_info_context->bmi[ 2];
@@ -485,13 +464,13 @@ static void build_inter4x4_predictors_mb(MACROBLOCKD *x)
}
b = &x->block[ 0];
build_inter_predictors4b(x, b, *(b->base_dst) + b->dst, dst_stride);
build_inter_predictors4b(x, b, base_dst + b->offset, dst_stride, base_pre, dst_stride);
b = &x->block[ 2];
build_inter_predictors4b(x, b, *(b->base_dst) + b->dst, dst_stride);
build_inter_predictors4b(x, b, base_dst + b->offset, dst_stride, base_pre, dst_stride);
b = &x->block[ 8];
build_inter_predictors4b(x, b, *(b->base_dst) + b->dst, dst_stride);
build_inter_predictors4b(x, b, base_dst + b->offset, dst_stride, base_pre, dst_stride);
b = &x->block[10];
build_inter_predictors4b(x, b, *(b->base_dst) + b->dst, dst_stride);
build_inter_predictors4b(x, b, base_dst + b->offset, dst_stride, base_pre, dst_stride);
}
else
{
@@ -499,7 +478,7 @@ static void build_inter4x4_predictors_mb(MACROBLOCKD *x)
{
BLOCKD *d0 = &x->block[i];
BLOCKD *d1 = &x->block[i+1];
int dst_stride = x->block[ 0].dst_stride;
int dst_stride = x->dst.y_stride;
x->block[i+0].bmi = x->mode_info_context->bmi[i+0];
x->block[i+1].bmi = x->mode_info_context->bmi[i+1];
@@ -510,31 +489,51 @@ static void build_inter4x4_predictors_mb(MACROBLOCKD *x)
}
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
build_inter_predictors2b(x, d0, *(d0->base_dst) + d0->dst, dst_stride);
build_inter_predictors2b(x, d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride);
else
{
build_inter_predictors_b(d0, *(d0->base_dst) + d0->dst, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d1, *(d1->base_dst) + d1->dst, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d1, base_dst + d1->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
}
}
}
for (i = 16; i < 24; i += 2)
base_dst = x->dst.u_buffer;
base_pre = x->pre.u_buffer;
for (i = 16; i < 20; i += 2)
{
BLOCKD *d0 = &x->block[i];
BLOCKD *d1 = &x->block[i+1];
int dst_stride = x->block[ 16].dst_stride;
int dst_stride = x->dst.uv_stride;
/* Note: uv mvs already clamped in build_4x4uvmvs() */
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
build_inter_predictors2b(x, d0, *(d0->base_dst) + d0->dst, dst_stride);
build_inter_predictors2b(x, d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride);
else
{
build_inter_predictors_b(d0, *(d0->base_dst) + d0->dst, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d1, *(d1->base_dst) + d1->dst, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d1, base_dst + d1->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
}
}
base_dst = x->dst.v_buffer;
base_pre = x->pre.v_buffer;
for (i = 20; i < 24; i += 2)
{
BLOCKD *d0 = &x->block[i];
BLOCKD *d1 = &x->block[i+1];
int dst_stride = x->dst.uv_stride;
/* Note: uv mvs already clamped in build_4x4uvmvs() */
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
build_inter_predictors2b(x, d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride);
else
{
build_inter_predictors_b(d0, base_dst + d0->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
build_inter_predictors_b(d1, base_dst + d1->offset, dst_stride, base_pre, dst_stride, x->subpixel_predict);
}
}
}
@@ -559,8 +558,7 @@ void build_4x4uvmvs(MACROBLOCKD *x)
+ x->mode_info_context->bmi[yoffset + 4].mv.as_mv.row
+ x->mode_info_context->bmi[yoffset + 5].mv.as_mv.row;
if (temp < 0) temp -= 4;
else temp += 4;
temp += 4 + ((temp >> (sizeof(int) * CHAR_BIT - 1)) << 3);
x->block[uoffset].bmi.mv.as_mv.row = (temp / 8) & x->fullpixel_mask;
@@ -569,18 +567,14 @@ void build_4x4uvmvs(MACROBLOCKD *x)
+ x->mode_info_context->bmi[yoffset + 4].mv.as_mv.col
+ x->mode_info_context->bmi[yoffset + 5].mv.as_mv.col;
if (temp < 0) temp -= 4;
else temp += 4;
temp += 4 + ((temp >> (sizeof(int) * CHAR_BIT - 1)) << 3);
x->block[uoffset].bmi.mv.as_mv.col = (temp / 8) & x->fullpixel_mask;
if (x->mode_info_context->mbmi.need_to_clamp_mvs)
clamp_uvmv_to_umv_border(&x->block[uoffset].bmi.mv.as_mv, x);
x->block[voffset].bmi.mv.as_mv.row =
x->block[uoffset].bmi.mv.as_mv.row ;
x->block[voffset].bmi.mv.as_mv.col =
x->block[uoffset].bmi.mv.as_mv.col ;
x->block[voffset].bmi.mv.as_int = x->block[uoffset].bmi.mv.as_int;
}
}
}

View File

@@ -25,6 +25,8 @@ extern void vp8_build_inter16x16_predictors_mby(MACROBLOCKD *x,
unsigned char *dst_y,
int dst_ystride);
extern void vp8_build_inter_predictors_b(BLOCKD *d, int pitch,
unsigned char *base_pre,
int pre_stride,
vp8_subpix_fn_t sppf);
extern void vp8_build_inter16x16_predictors_mbuv(MACROBLOCKD *x);

View File

@@ -10,147 +10,24 @@
#include "vpx_config.h"
#include "recon.h"
#include "reconintra.h"
#include "vpx_rtcd.h"
#include "vpx_mem/vpx_mem.h"
#include "blockd.h"
/* For skip_recon_mb(), add vp8_build_intra_predictors_mby_s(MACROBLOCKD *x) and
* vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x).
*/
void vp8_build_intra_predictors_mby(MACROBLOCKD *x)
void vp8_build_intra_predictors_mby_s_c(MACROBLOCKD *x,
unsigned char * yabove_row,
unsigned char * yleft,
int left_stride,
unsigned char * ypred_ptr,
int y_stride)
{
unsigned char *yabove_row = x->dst.y_buffer - x->dst.y_stride;
unsigned char yleft_col[16];
unsigned char ytop_left = yabove_row[-1];
unsigned char *ypred_ptr = x->predictor;
int r, c, i;
for (i = 0; i < 16; i++)
{
yleft_col[i] = x->dst.y_buffer [i* x->dst.y_stride -1];
}
/* for Y */
switch (x->mode_info_context->mbmi.mode)
{
case DC_PRED:
{
int expected_dc;
int i;
int shift;
int average = 0;
if (x->up_available || x->left_available)
{
if (x->up_available)
{
for (i = 0; i < 16; i++)
{
average += yabove_row[i];
}
}
if (x->left_available)
{
for (i = 0; i < 16; i++)
{
average += yleft_col[i];
}
}
shift = 3 + x->up_available + x->left_available;
expected_dc = (average + (1 << (shift - 1))) >> shift;
}
else
{
expected_dc = 128;
}
vpx_memset(ypred_ptr, expected_dc, 256);
}
break;
case V_PRED:
{
for (r = 0; r < 16; r++)
{
((int *)ypred_ptr)[0] = ((int *)yabove_row)[0];
((int *)ypred_ptr)[1] = ((int *)yabove_row)[1];
((int *)ypred_ptr)[2] = ((int *)yabove_row)[2];
((int *)ypred_ptr)[3] = ((int *)yabove_row)[3];
ypred_ptr += 16;
}
}
break;
case H_PRED:
{
for (r = 0; r < 16; r++)
{
vpx_memset(ypred_ptr, yleft_col[r], 16);
ypred_ptr += 16;
}
}
break;
case TM_PRED:
{
for (r = 0; r < 16; r++)
{
for (c = 0; c < 16; c++)
{
int pred = yleft_col[r] + yabove_row[ c] - ytop_left;
if (pred < 0)
pred = 0;
if (pred > 255)
pred = 255;
ypred_ptr[c] = pred;
}
ypred_ptr += 16;
}
}
break;
case B_PRED:
case NEARESTMV:
case NEARMV:
case ZEROMV:
case NEWMV:
case SPLITMV:
case MB_MODE_COUNT:
break;
}
}
void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
{
unsigned char *yabove_row = x->dst.y_buffer - x->dst.y_stride;
unsigned char yleft_col[16];
unsigned char ytop_left = yabove_row[-1];
unsigned char *ypred_ptr = x->predictor;
int r, c, i;
int y_stride = x->dst.y_stride;
ypred_ptr = x->dst.y_buffer; /*x->predictor;*/
for (i = 0; i < 16; i++)
{
yleft_col[i] = x->dst.y_buffer [i* x->dst.y_stride -1];
yleft_col[i] = yleft[i* left_stride];
}
/* for Y */
@@ -198,7 +75,7 @@ void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
for (r = 0; r < 16; r++)
{
vpx_memset(ypred_ptr, expected_dc, 16);
ypred_ptr += y_stride; /*16;*/
ypred_ptr += y_stride;
}
}
break;
@@ -212,7 +89,7 @@ void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
((int *)ypred_ptr)[1] = ((int *)yabove_row)[1];
((int *)ypred_ptr)[2] = ((int *)yabove_row)[2];
((int *)ypred_ptr)[3] = ((int *)yabove_row)[3];
ypred_ptr += y_stride; /*16;*/
ypred_ptr += y_stride;
}
}
break;
@@ -223,7 +100,7 @@ void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
{
vpx_memset(ypred_ptr, yleft_col[r], 16);
ypred_ptr += y_stride; /*16;*/
ypred_ptr += y_stride;
}
}
@@ -246,7 +123,7 @@ void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
ypred_ptr[c] = pred;
}
ypred_ptr += y_stride; /*16;*/
ypred_ptr += y_stride;
}
}
@@ -262,162 +139,27 @@ void vp8_build_intra_predictors_mby_s(MACROBLOCKD *x)
}
}
void vp8_build_intra_predictors_mbuv(MACROBLOCKD *x)
void vp8_build_intra_predictors_mbuv_s_c(MACROBLOCKD *x,
unsigned char * uabove_row,
unsigned char * vabove_row,
unsigned char * uleft,
unsigned char * vleft,
int left_stride,
unsigned char * upred_ptr,
unsigned char * vpred_ptr,
int pred_stride)
{
unsigned char *uabove_row = x->dst.u_buffer - x->dst.uv_stride;
unsigned char uleft_col[16];
unsigned char uleft_col[8];
unsigned char utop_left = uabove_row[-1];
unsigned char *vabove_row = x->dst.v_buffer - x->dst.uv_stride;
unsigned char vleft_col[20];
unsigned char vleft_col[8];
unsigned char vtop_left = vabove_row[-1];
unsigned char *upred_ptr = &x->predictor[256];
unsigned char *vpred_ptr = &x->predictor[320];
int i, j;
for (i = 0; i < 8; i++)
{
uleft_col[i] = x->dst.u_buffer [i* x->dst.uv_stride -1];
vleft_col[i] = x->dst.v_buffer [i* x->dst.uv_stride -1];
}
switch (x->mode_info_context->mbmi.uv_mode)
{
case DC_PRED:
{
int expected_udc;
int expected_vdc;
int i;
int shift;
int Uaverage = 0;
int Vaverage = 0;
if (x->up_available)
{
for (i = 0; i < 8; i++)
{
Uaverage += uabove_row[i];
Vaverage += vabove_row[i];
}
}
if (x->left_available)
{
for (i = 0; i < 8; i++)
{
Uaverage += uleft_col[i];
Vaverage += vleft_col[i];
}
}
if (!x->up_available && !x->left_available)
{
expected_udc = 128;
expected_vdc = 128;
}
else
{
shift = 2 + x->up_available + x->left_available;
expected_udc = (Uaverage + (1 << (shift - 1))) >> shift;
expected_vdc = (Vaverage + (1 << (shift - 1))) >> shift;
}
vpx_memset(upred_ptr, expected_udc, 64);
vpx_memset(vpred_ptr, expected_vdc, 64);
}
break;
case V_PRED:
{
int i;
for (i = 0; i < 8; i++)
{
vpx_memcpy(upred_ptr, uabove_row, 8);
vpx_memcpy(vpred_ptr, vabove_row, 8);
upred_ptr += 8;
vpred_ptr += 8;
}
}
break;
case H_PRED:
{
int i;
for (i = 0; i < 8; i++)
{
vpx_memset(upred_ptr, uleft_col[i], 8);
vpx_memset(vpred_ptr, vleft_col[i], 8);
upred_ptr += 8;
vpred_ptr += 8;
}
}
break;
case TM_PRED:
{
int i;
for (i = 0; i < 8; i++)
{
for (j = 0; j < 8; j++)
{
int predu = uleft_col[i] + uabove_row[j] - utop_left;
int predv = vleft_col[i] + vabove_row[j] - vtop_left;
if (predu < 0)
predu = 0;
if (predu > 255)
predu = 255;
if (predv < 0)
predv = 0;
if (predv > 255)
predv = 255;
upred_ptr[j] = predu;
vpred_ptr[j] = predv;
}
upred_ptr += 8;
vpred_ptr += 8;
}
}
break;
case B_PRED:
case NEARESTMV:
case NEARMV:
case ZEROMV:
case NEWMV:
case SPLITMV:
case MB_MODE_COUNT:
break;
}
}
void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x)
{
unsigned char *uabove_row = x->dst.u_buffer - x->dst.uv_stride;
unsigned char uleft_col[16];
unsigned char utop_left = uabove_row[-1];
unsigned char *vabove_row = x->dst.v_buffer - x->dst.uv_stride;
unsigned char vleft_col[20];
unsigned char vtop_left = vabove_row[-1];
unsigned char *upred_ptr = x->dst.u_buffer; /*&x->predictor[256];*/
unsigned char *vpred_ptr = x->dst.v_buffer; /*&x->predictor[320];*/
int uv_stride = x->dst.uv_stride;
int i, j;
for (i = 0; i < 8; i++)
{
uleft_col[i] = x->dst.u_buffer [i* x->dst.uv_stride -1];
vleft_col[i] = x->dst.v_buffer [i* x->dst.uv_stride -1];
uleft_col[i] = uleft [i* left_stride];
vleft_col[i] = vleft [i* left_stride];
}
switch (x->mode_info_context->mbmi.uv_mode)
@@ -468,8 +210,8 @@ void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x)
{
vpx_memset(upred_ptr, expected_udc, 8);
vpx_memset(vpred_ptr, expected_vdc, 8);
upred_ptr += uv_stride; /*8;*/
vpred_ptr += uv_stride; /*8;*/
upred_ptr += pred_stride;
vpred_ptr += pred_stride;
}
}
break;
@@ -481,8 +223,8 @@ void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x)
{
vpx_memcpy(upred_ptr, uabove_row, 8);
vpx_memcpy(vpred_ptr, vabove_row, 8);
upred_ptr += uv_stride; /*8;*/
vpred_ptr += uv_stride; /*8;*/
upred_ptr += pred_stride;
vpred_ptr += pred_stride;
}
}
@@ -495,8 +237,8 @@ void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x)
{
vpx_memset(upred_ptr, uleft_col[i], 8);
vpx_memset(vpred_ptr, vleft_col[i], 8);
upred_ptr += uv_stride; /*8;*/
vpred_ptr += uv_stride; /*8;*/
upred_ptr += pred_stride;
vpred_ptr += pred_stride;
}
}
@@ -528,8 +270,8 @@ void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x)
vpred_ptr[j] = predv;
}
upred_ptr += uv_stride; /*8;*/
vpred_ptr += uv_stride; /*8;*/
upred_ptr += pred_stride;
vpred_ptr += pred_stride;
}
}

View File

@@ -9,22 +9,23 @@
*/
#include "recon.h"
#include "vpx_config.h"
#include "vpx_rtcd.h"
#include "blockd.h"
void vp8_intra4x4_predict_c(unsigned char *src, int src_stride,
int b_mode,
unsigned char *dst, int dst_stride)
void vp8_intra4x4_predict_d_c(unsigned char *Above,
unsigned char *yleft, int left_stride,
int b_mode,
unsigned char *dst, int dst_stride,
unsigned char top_left)
{
int i, r, c;
unsigned char *Above = src - src_stride;
unsigned char Left[4];
unsigned char top_left = Above[-1];
Left[0] = src[-1];
Left[1] = src[-1 + src_stride];
Left[2] = src[-1 + 2 * src_stride];
Left[3] = src[-1 + 3 * src_stride];
Left[0] = yleft[0];
Left[1] = yleft[left_stride];
Left[2] = yleft[2 * left_stride];
Left[3] = yleft[3 * left_stride];
switch (b_mode)
{
@@ -293,23 +294,15 @@ void vp8_intra4x4_predict_c(unsigned char *src, int src_stride,
}
}
/* copy 4 bytes from the above right down so that the 4x4 prediction modes using pixels above and
* to the right prediction have filled in pixels to use.
*/
void vp8_intra_prediction_down_copy(MACROBLOCKD *x)
void vp8_intra4x4_predict_c(unsigned char *src, int src_stride,
int b_mode,
unsigned char *dst, int dst_stride)
{
unsigned char *above_right = *(x->block[0].base_dst) + x->block[0].dst - x->block[0].dst_stride + 16;
unsigned char *Above = src - src_stride;
unsigned int *src_ptr = (unsigned int *)above_right;
unsigned int *dst_ptr0 = (unsigned int *)(above_right + 4 * x->block[0].dst_stride);
unsigned int *dst_ptr1 = (unsigned int *)(above_right + 8 * x->block[0].dst_stride);
unsigned int *dst_ptr2 = (unsigned int *)(above_right + 12 * x->block[0].dst_stride);
*dst_ptr0 = *src_ptr;
*dst_ptr1 = *src_ptr;
*dst_ptr2 = *src_ptr;
vp8_intra4x4_predict_d_c(Above,
src - 1, src_stride,
b_mode,
dst, dst_stride,
Above[-1]);
}

View File

@@ -11,7 +11,22 @@
#ifndef __INC_RECONINTRA4x4_H
#define __INC_RECONINTRA4x4_H
#include "vp8/common/blockd.h"
extern void vp8_intra_prediction_down_copy(MACROBLOCKD *x);
static void intra_prediction_down_copy(MACROBLOCKD *xd,
unsigned char *above_right_src)
{
int dst_stride = xd->dst.y_stride;
unsigned char *above_right_dst = xd->dst.y_buffer - dst_stride + 16;
unsigned int *src_ptr = (unsigned int *)above_right_src;
unsigned int *dst_ptr0 = (unsigned int *)(above_right_dst + 4 * dst_stride);
unsigned int *dst_ptr1 = (unsigned int *)(above_right_dst + 8 * dst_stride);
unsigned int *dst_ptr2 = (unsigned int *)(above_right_dst + 12 * dst_stride);
*dst_ptr0 = *src_ptr;
*dst_ptr1 = *src_ptr;
*dst_ptr2 = *src_ptr;
}
#endif

View File

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
* Copyright (c) 2011 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
@@ -7,13 +7,6 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "vpx_config.h"
#include "vpx_ports/x86.h"
#include "vp8/decoder/onyxd_int.h"
void vp8_arch_x86_decode_init(VP8D_COMP *pbi)
{
}
#define RTCD_C
#include "vpx_rtcd.h"

541
vp8/common/rtcd_defs.sh Normal file
View File

@@ -0,0 +1,541 @@
common_forward_decls() {
cat <<EOF
struct blockd;
struct macroblockd;
struct loop_filter_info;
/* Encoder forward decls */
struct block;
struct macroblock;
struct variance_vtable;
union int_mv;
struct yv12_buffer_config;
EOF
}
forward_decls common_forward_decls
#
# Dequant
#
prototype void vp8_dequantize_b "struct blockd*, short *dqc"
specialize vp8_dequantize_b mmx media neon
vp8_dequantize_b_media=vp8_dequantize_b_v6
prototype void vp8_dequant_idct_add "short *input, short *dq, unsigned char *output, int stride"
specialize vp8_dequant_idct_add mmx media neon
vp8_dequant_idct_add_media=vp8_dequant_idct_add_v6
prototype void vp8_dequant_idct_add_y_block "short *q, short *dq, unsigned char *dst, int stride, char *eobs"
specialize vp8_dequant_idct_add_y_block mmx sse2 media neon
vp8_dequant_idct_add_y_block_media=vp8_dequant_idct_add_y_block_v6
prototype void vp8_dequant_idct_add_uv_block "short *q, short *dq, unsigned char *dst_u, unsigned char *dst_v, int stride, char *eobs"
specialize vp8_dequant_idct_add_uv_block mmx sse2 media neon
vp8_dequant_idct_add_uv_block_media=vp8_dequant_idct_add_uv_block_v6
#
# Loopfilter
#
prototype void vp8_loop_filter_mbv "unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi"
specialize vp8_loop_filter_mbv mmx sse2 media neon
vp8_loop_filter_mbv_media=vp8_loop_filter_mbv_armv6
prototype void vp8_loop_filter_bv "unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi"
specialize vp8_loop_filter_bv mmx sse2 media neon
vp8_loop_filter_bv_media=vp8_loop_filter_bv_armv6
prototype void vp8_loop_filter_mbh "unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi"
specialize vp8_loop_filter_mbh mmx sse2 media neon
vp8_loop_filter_mbh_media=vp8_loop_filter_mbh_armv6
prototype void vp8_loop_filter_bh "unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi"
specialize vp8_loop_filter_bh mmx sse2 media neon
vp8_loop_filter_bh_media=vp8_loop_filter_bh_armv6
prototype void vp8_loop_filter_simple_mbv "unsigned char *y, int ystride, const unsigned char *blimit"
specialize vp8_loop_filter_simple_mbv mmx sse2 media neon
vp8_loop_filter_simple_mbv_c=vp8_loop_filter_simple_vertical_edge_c
vp8_loop_filter_simple_mbv_mmx=vp8_loop_filter_simple_vertical_edge_mmx
vp8_loop_filter_simple_mbv_sse2=vp8_loop_filter_simple_vertical_edge_sse2
vp8_loop_filter_simple_mbv_media=vp8_loop_filter_simple_vertical_edge_armv6
vp8_loop_filter_simple_mbv_neon=vp8_loop_filter_mbvs_neon
prototype void vp8_loop_filter_simple_mbh "unsigned char *y, int ystride, const unsigned char *blimit"
specialize vp8_loop_filter_simple_mbh mmx sse2 media neon
vp8_loop_filter_simple_mbh_c=vp8_loop_filter_simple_horizontal_edge_c
vp8_loop_filter_simple_mbh_mmx=vp8_loop_filter_simple_horizontal_edge_mmx
vp8_loop_filter_simple_mbh_sse2=vp8_loop_filter_simple_horizontal_edge_sse2
vp8_loop_filter_simple_mbh_media=vp8_loop_filter_simple_horizontal_edge_armv6
vp8_loop_filter_simple_mbh_neon=vp8_loop_filter_mbhs_neon
prototype void vp8_loop_filter_simple_bv "unsigned char *y, int ystride, const unsigned char *blimit"
specialize vp8_loop_filter_simple_bv mmx sse2 media neon
vp8_loop_filter_simple_bv_c=vp8_loop_filter_bvs_c
vp8_loop_filter_simple_bv_mmx=vp8_loop_filter_bvs_mmx
vp8_loop_filter_simple_bv_sse2=vp8_loop_filter_bvs_sse2
vp8_loop_filter_simple_bv_media=vp8_loop_filter_bvs_armv6
vp8_loop_filter_simple_bv_neon=vp8_loop_filter_bvs_neon
prototype void vp8_loop_filter_simple_bh "unsigned char *y, int ystride, const unsigned char *blimit"
specialize vp8_loop_filter_simple_bh mmx sse2 media neon
vp8_loop_filter_simple_bh_c=vp8_loop_filter_bhs_c
vp8_loop_filter_simple_bh_mmx=vp8_loop_filter_bhs_mmx
vp8_loop_filter_simple_bh_sse2=vp8_loop_filter_bhs_sse2
vp8_loop_filter_simple_bh_media=vp8_loop_filter_bhs_armv6
vp8_loop_filter_simple_bh_neon=vp8_loop_filter_bhs_neon
#
# IDCT
#
#idct16
prototype void vp8_short_idct4x4llm "short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride"
specialize vp8_short_idct4x4llm mmx media neon
vp8_short_idct4x4llm_media=vp8_short_idct4x4llm_v6_dual
#iwalsh1
prototype void vp8_short_inv_walsh4x4_1 "short *input, short *output"
# no asm yet
#iwalsh16
prototype void vp8_short_inv_walsh4x4 "short *input, short *output"
specialize vp8_short_inv_walsh4x4 mmx sse2 media neon
vp8_short_inv_walsh4x4_media=vp8_short_inv_walsh4x4_v6
#idct1_scalar_add
prototype void vp8_dc_only_idct_add "short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride"
specialize vp8_dc_only_idct_add mmx media neon
vp8_dc_only_idct_add_media=vp8_dc_only_idct_add_v6
#
# RECON
#
prototype void vp8_copy_mem16x16 "unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch"
specialize vp8_copy_mem16x16 mmx sse2 media neon
vp8_copy_mem16x16_media=vp8_copy_mem16x16_v6
prototype void vp8_copy_mem8x8 "unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch"
specialize vp8_copy_mem8x8 mmx media neon
vp8_copy_mem8x8_media=vp8_copy_mem8x8_v6
prototype void vp8_copy_mem8x4 "unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch"
specialize vp8_copy_mem8x4 mmx media neon
vp8_copy_mem8x4_media=vp8_copy_mem8x4_v6
prototype void vp8_build_intra_predictors_mby_s "struct macroblockd *x, unsigned char * yabove_row, unsigned char * yleft, int left_stride, unsigned char * ypred_ptr, int y_stride"
specialize vp8_build_intra_predictors_mby_s sse2 ssse3
#TODO: fix assembly for neon
prototype void vp8_build_intra_predictors_mbuv_s "struct macroblockd *x, unsigned char * uabove_row, unsigned char * vabove_row, unsigned char *uleft, unsigned char *vleft, int left_stride, unsigned char * upred_ptr, unsigned char * vpred_ptr, int pred_stride"
specialize vp8_build_intra_predictors_mbuv_s sse2 ssse3
prototype void vp8_intra4x4_predict_d "unsigned char *above, unsigned char *left, int left_stride, int b_mode, unsigned char *dst, int dst_stride, unsigned char top_left"
prototype void vp8_intra4x4_predict "unsigned char *src, int src_stride, int b_mode, unsigned char *dst, int dst_stride"
specialize vp8_intra4x4_predict media
vp8_intra4x4_predict_media=vp8_intra4x4_predict_armv6
#
# Postproc
#
if [ "$CONFIG_POSTPROC" = "yes" ]; then
prototype void vp8_mbpost_proc_down "unsigned char *dst, int pitch, int rows, int cols,int flimit"
specialize vp8_mbpost_proc_down mmx sse2
vp8_mbpost_proc_down_sse2=vp8_mbpost_proc_down_xmm
prototype void vp8_mbpost_proc_across_ip "unsigned char *dst, int pitch, int rows, int cols,int flimit"
specialize vp8_mbpost_proc_across_ip sse2
vp8_mbpost_proc_across_ip_sse2=vp8_mbpost_proc_across_ip_xmm
prototype void vp8_post_proc_down_and_across "unsigned char *src, unsigned char *dst, int src_pitch, int dst_pitch, int rows, int cols, int flimit"
specialize vp8_post_proc_down_and_across mmx sse2
vp8_post_proc_down_and_across_sse2=vp8_post_proc_down_and_across_xmm
prototype void vp8_plane_add_noise "unsigned char *s, char *noise, char blackclamp[16], char whiteclamp[16], char bothclamp[16], unsigned int w, unsigned int h, int pitch"
specialize vp8_plane_add_noise mmx sse2
vp8_plane_add_noise_sse2=vp8_plane_add_noise_wmt
prototype void vp8_blend_mb_inner "unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride"
# no asm yet
prototype void vp8_blend_mb_outer "unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride"
# no asm yet
prototype void vp8_blend_b "unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride"
# no asm yet
prototype void vp8_filter_by_weight16x16 "unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight"
specialize vp8_filter_by_weight16x16 sse2
prototype void vp8_filter_by_weight8x8 "unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight"
specialize vp8_filter_by_weight8x8 sse2
prototype void vp8_filter_by_weight4x4 "unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight"
# no asm yet
fi
#
# Subpixel
#
prototype void vp8_sixtap_predict16x16 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_sixtap_predict16x16 mmx sse2 ssse3 media neon
vp8_sixtap_predict16x16_media=vp8_sixtap_predict16x16_armv6
prototype void vp8_sixtap_predict8x8 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_sixtap_predict8x8 mmx sse2 ssse3 media neon
vp8_sixtap_predict8x8_media=vp8_sixtap_predict8x8_armv6
prototype void vp8_sixtap_predict8x4 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_sixtap_predict8x4 mmx sse2 ssse3 media neon
vp8_sixtap_predict8x4_media=vp8_sixtap_predict8x4_armv6
prototype void vp8_sixtap_predict4x4 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_sixtap_predict4x4 mmx ssse3 media neon
vp8_sixtap_predict4x4_media=vp8_sixtap_predict4x4_armv6
prototype void vp8_bilinear_predict16x16 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_bilinear_predict16x16 mmx sse2 ssse3 media neon
vp8_bilinear_predict16x16_media=vp8_bilinear_predict16x16_armv6
prototype void vp8_bilinear_predict8x8 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_bilinear_predict8x8 mmx sse2 ssse3 media neon
vp8_bilinear_predict8x8_media=vp8_bilinear_predict8x8_armv6
prototype void vp8_bilinear_predict8x4 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_bilinear_predict8x4 mmx media neon
vp8_bilinear_predict8x4_media=vp8_bilinear_predict8x4_armv6
prototype void vp8_bilinear_predict4x4 "unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch"
specialize vp8_bilinear_predict4x4 mmx media neon
vp8_bilinear_predict4x4_media=vp8_bilinear_predict4x4_armv6
#
# Whole-pixel Variance
#
prototype unsigned int vp8_variance4x4 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance4x4 mmx sse2
vp8_variance4x4_sse2=vp8_variance4x4_wmt
prototype unsigned int vp8_variance8x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance8x8 mmx sse2 media neon
vp8_variance8x8_sse2=vp8_variance8x8_wmt
vp8_variance8x8_media=vp8_variance8x8_armv6
prototype unsigned int vp8_variance8x16 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance8x16 mmx sse2 neon
vp8_variance8x16_sse2=vp8_variance8x16_wmt
prototype unsigned int vp8_variance16x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance16x8 mmx sse2 neon
vp8_variance16x8_sse2=vp8_variance16x8_wmt
prototype unsigned int vp8_variance16x16 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance16x16 mmx sse2 media neon
vp8_variance16x16_sse2=vp8_variance16x16_wmt
vp8_variance16x16_media=vp8_variance16x16_armv6
#
# Sub-pixel Variance
#
prototype unsigned int vp8_sub_pixel_variance4x4 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_variance4x4 mmx sse2
vp8_sub_pixel_variance4x4_sse2=vp8_sub_pixel_variance4x4_wmt
prototype unsigned int vp8_sub_pixel_variance8x8 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_variance8x8 mmx sse2 media neon
vp8_sub_pixel_variance8x8_sse2=vp8_sub_pixel_variance8x8_wmt
vp8_sub_pixel_variance8x8_media=vp8_sub_pixel_variance8x8_armv6
prototype unsigned int vp8_sub_pixel_variance8x16 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_variance8x16 mmx sse2
vp8_sub_pixel_variance8x16_sse2=vp8_sub_pixel_variance8x16_wmt
prototype unsigned int vp8_sub_pixel_variance16x8 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_variance16x8 mmx sse2 ssse3
vp8_sub_pixel_variance16x8_sse2=vp8_sub_pixel_variance16x8_wmt
prototype unsigned int vp8_sub_pixel_variance16x16 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_variance16x16 mmx sse2 ssse3 media neon
vp8_sub_pixel_variance16x16_sse2=vp8_sub_pixel_variance16x16_wmt
vp8_sub_pixel_variance16x16_media=vp8_sub_pixel_variance16x16_armv6
prototype unsigned int vp8_variance_halfpixvar16x16_h "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance_halfpixvar16x16_h mmx sse2 media neon
vp8_variance_halfpixvar16x16_h_sse2=vp8_variance_halfpixvar16x16_h_wmt
vp8_variance_halfpixvar16x16_h_media=vp8_variance_halfpixvar16x16_h_armv6
prototype unsigned int vp8_variance_halfpixvar16x16_v "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance_halfpixvar16x16_v mmx sse2 media neon
vp8_variance_halfpixvar16x16_v_sse2=vp8_variance_halfpixvar16x16_v_wmt
vp8_variance_halfpixvar16x16_v_media=vp8_variance_halfpixvar16x16_v_armv6
prototype unsigned int vp8_variance_halfpixvar16x16_hv "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_variance_halfpixvar16x16_hv mmx sse2 media neon
vp8_variance_halfpixvar16x16_hv_sse2=vp8_variance_halfpixvar16x16_hv_wmt
vp8_variance_halfpixvar16x16_hv_media=vp8_variance_halfpixvar16x16_hv_armv6
#
# Single block SAD
#
prototype unsigned int vp8_sad4x4 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int max_sad"
specialize vp8_sad4x4 mmx sse2 neon
vp8_sad4x4_sse2=vp8_sad4x4_wmt
prototype unsigned int vp8_sad8x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int max_sad"
specialize vp8_sad8x8 mmx sse2 neon
vp8_sad8x8_sse2=vp8_sad8x8_wmt
prototype unsigned int vp8_sad8x16 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int max_sad"
specialize vp8_sad8x16 mmx sse2 neon
vp8_sad8x16_sse2=vp8_sad8x16_wmt
prototype unsigned int vp8_sad16x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int max_sad"
specialize vp8_sad16x8 mmx sse2 neon
vp8_sad16x8_sse2=vp8_sad16x8_wmt
prototype unsigned int vp8_sad16x16 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int max_sad"
specialize vp8_sad16x16 mmx sse2 sse3 media neon
vp8_sad16x16_sse2=vp8_sad16x16_wmt
vp8_sad16x16_media=vp8_sad16x16_armv6
#
# Multi-block SAD, comparing a reference to N blocks 1 pixel apart horizontally
#
prototype void vp8_sad4x4x3 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sad_array"
specialize vp8_sad4x4x3 sse3
prototype void vp8_sad8x8x3 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sad_array"
specialize vp8_sad8x8x3 sse3
prototype void vp8_sad8x16x3 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sad_array"
specialize vp8_sad8x16x3 sse3
prototype void vp8_sad16x8x3 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sad_array"
specialize vp8_sad16x8x3 sse3 ssse3
prototype void vp8_sad16x16x3 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sad_array"
specialize vp8_sad16x16x3 sse3 ssse3
# Note the only difference in the following prototypes is that they return into
# an array of short
prototype void vp8_sad4x4x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned short *sad_array"
specialize vp8_sad4x4x8 sse4_1
vp8_sad4x4x8_sse4_1=vp8_sad4x4x8_sse4
prototype void vp8_sad8x8x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned short *sad_array"
specialize vp8_sad8x8x8 sse4_1
vp8_sad8x8x8_sse4_1=vp8_sad8x8x8_sse4
prototype void vp8_sad8x16x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned short *sad_array"
specialize vp8_sad8x16x8 sse4_1
vp8_sad8x16x8_sse4_1=vp8_sad8x16x8_sse4
prototype void vp8_sad16x8x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned short *sad_array"
specialize vp8_sad16x8x8 sse4_1
vp8_sad16x8x8_sse4_1=vp8_sad16x8x8_sse4
prototype void vp8_sad16x16x8 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned short *sad_array"
specialize vp8_sad16x16x8 sse4_1
vp8_sad16x16x8_sse4_1=vp8_sad16x16x8_sse4
#
# Multi-block SAD, comparing a reference to N independent blocks
#
prototype void vp8_sad4x4x4d "const unsigned char *src_ptr, int source_stride, unsigned char *ref_ptr[4], int ref_stride, unsigned int *sad_array"
specialize vp8_sad4x4x4d sse3
prototype void vp8_sad8x8x4d "const unsigned char *src_ptr, int source_stride, unsigned char *ref_ptr[4], int ref_stride, unsigned int *sad_array"
specialize vp8_sad8x8x4d sse3
prototype void vp8_sad8x16x4d "const unsigned char *src_ptr, int source_stride, unsigned char *ref_ptr[4], int ref_stride, unsigned int *sad_array"
specialize vp8_sad8x16x4d sse3
prototype void vp8_sad16x8x4d "const unsigned char *src_ptr, int source_stride, unsigned char *ref_ptr[4], int ref_stride, unsigned int *sad_array"
specialize vp8_sad16x8x4d sse3
prototype void vp8_sad16x16x4d "const unsigned char *src_ptr, int source_stride, unsigned char *ref_ptr[4], int ref_stride, unsigned int *sad_array"
specialize vp8_sad16x16x4d sse3
#
# Encoder functions below this point.
#
if [ "$CONFIG_VP8_ENCODER" = "yes" ]; then
#
# Sum of squares (vector)
#
prototype unsigned int vp8_get_mb_ss "const short *"
specialize vp8_get_mb_ss mmx sse2
#
# SSE (Sum Squared Error)
#
prototype unsigned int vp8_sub_pixel_mse16x16 "const unsigned char *src_ptr, int source_stride, int xoffset, int yoffset, const unsigned char *ref_ptr, int Refstride, unsigned int *sse"
specialize vp8_sub_pixel_mse16x16 mmx sse2
vp8_sub_pixel_mse16x16_sse2=vp8_sub_pixel_mse16x16_wmt
prototype unsigned int vp8_mse16x16 "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, unsigned int *sse"
specialize vp8_mse16x16 mmx sse2 media neon
vp8_mse16x16_sse2=vp8_mse16x16_wmt
vp8_mse16x16_media=vp8_mse16x16_armv6
prototype unsigned int vp8_get4x4sse_cs "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride"
specialize vp8_get4x4sse_cs mmx neon
#
# Block copy
#
case $arch in
x86*)
prototype void vp8_copy32xn "const unsigned char *src_ptr, int source_stride, const unsigned char *ref_ptr, int ref_stride, int n"
specialize vp8_copy32xn sse2 sse3
;;
esac
#
# Structured Similarity (SSIM)
#
if [ "$CONFIG_INTERNAL_STATS" = "yes" ]; then
[ $arch = "x86_64" ] && sse2_on_x86_64=sse2
prototype void vp8_ssim_parms_8x8 "unsigned char *s, int sp, unsigned char *r, int rp, unsigned long *sum_s, unsigned long *sum_r, unsigned long *sum_sq_s, unsigned long *sum_sq_r, unsigned long *sum_sxr"
specialize vp8_ssim_parms_8x8 $sse2_on_x86_64
prototype void vp8_ssim_parms_16x16 "unsigned char *s, int sp, unsigned char *r, int rp, unsigned long *sum_s, unsigned long *sum_r, unsigned long *sum_sq_s, unsigned long *sum_sq_r, unsigned long *sum_sxr"
specialize vp8_ssim_parms_16x16 $sse2_on_x86_64
fi
#
# Forward DCT
#
prototype void vp8_short_fdct4x4 "short *input, short *output, int pitch"
specialize vp8_short_fdct4x4 mmx sse2 media neon
vp8_short_fdct4x4_media=vp8_short_fdct4x4_armv6
prototype void vp8_short_fdct8x4 "short *input, short *output, int pitch"
specialize vp8_short_fdct8x4 mmx sse2 media neon
vp8_short_fdct8x4_media=vp8_short_fdct8x4_armv6
prototype void vp8_short_walsh4x4 "short *input, short *output, int pitch"
specialize vp8_short_walsh4x4 sse2 media neon
vp8_short_walsh4x4_media=vp8_short_walsh4x4_armv6
#
# Quantizer
#
prototype void vp8_regular_quantize_b "struct block *, struct blockd *"
specialize vp8_regular_quantize_b sse2 sse4_1
vp8_regular_quantize_b_sse4_1=vp8_regular_quantize_b_sse4
prototype void vp8_fast_quantize_b "struct block *, struct blockd *"
specialize vp8_fast_quantize_b sse2 ssse3 media neon
vp8_fast_quantize_b_media=vp8_fast_quantize_b_armv6
prototype void vp8_regular_quantize_b_pair "struct block *b1, struct block *b2, struct blockd *d1, struct blockd *d2"
# no asm yet
prototype void vp8_fast_quantize_b_pair "struct block *b1, struct block *b2, struct blockd *d1, struct blockd *d2"
specialize vp8_fast_quantize_b_pair neon
prototype void vp8_quantize_mb "struct macroblock *"
specialize vp8_quantize_mb neon
prototype void vp8_quantize_mby "struct macroblock *"
specialize vp8_quantize_mby neon
prototype void vp8_quantize_mbuv "struct macroblock *"
specialize vp8_quantize_mbuv neon
#
# Block subtraction
#
prototype int vp8_block_error "short *coeff, short *dqcoeff"
specialize vp8_block_error mmx sse2
vp8_block_error_sse2=vp8_block_error_xmm
prototype int vp8_mbblock_error "struct macroblock *mb, int dc"
specialize vp8_mbblock_error mmx sse2
vp8_mbblock_error_sse2=vp8_mbblock_error_xmm
prototype int vp8_mbuverror "struct macroblock *mb"
specialize vp8_mbuverror mmx sse2
vp8_mbuverror_sse2=vp8_mbuverror_xmm
prototype void vp8_subtract_b "struct block *be, struct blockd *bd, int pitch"
specialize vp8_subtract_b mmx sse2 media neon
vp8_subtract_b_media=vp8_subtract_b_armv6
prototype void vp8_subtract_mby "short *diff, unsigned char *src, int src_stride, unsigned char *pred, int pred_stride"
specialize vp8_subtract_mby mmx sse2 media neon
vp8_subtract_mby_media=vp8_subtract_mby_armv6
prototype void vp8_subtract_mbuv "short *diff, unsigned char *usrc, unsigned char *vsrc, int src_stride, unsigned char *upred, unsigned char *vpred, int pred_stride"
specialize vp8_subtract_mbuv mmx sse2 media neon
vp8_subtract_mbuv_media=vp8_subtract_mbuv_armv6
#
# Motion search
#
prototype int vp8_full_search_sad "struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv"
specialize vp8_full_search_sad sse3 sse4_1
vp8_full_search_sad_sse3=vp8_full_search_sadx3
vp8_full_search_sad_sse4_1=vp8_full_search_sadx8
prototype int vp8_refining_search_sad "struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv"
specialize vp8_refining_search_sad sse3
vp8_refining_search_sad_sse3=vp8_refining_search_sadx4
prototype int vp8_diamond_search_sad "struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, union int_mv *best_mv, int search_param, int sad_per_bit, int *num00, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv"
vp8_diamond_search_sad_sse3=vp8_diamond_search_sadx4
#
# Alt-ref Noise Reduction (ARNR)
#
if [ "$CONFIG_REALTIME_ONLY" != "yes" ]; then
prototype void vp8_temporal_filter_apply "unsigned char *frame1, unsigned int stride, unsigned char *frame2, unsigned int block_size, int strength, int filter_weight, unsigned int *accumulator, unsigned short *count"
specialize vp8_temporal_filter_apply sse2
fi
#
# Pick Loopfilter
#
prototype void vp8_yv12_copy_partial_frame "struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc"
specialize vp8_yv12_copy_partial_frame neon
# End of encoder only functions
fi
# Scaler functions
if [ "CONFIG_SPATIAL_RESAMPLING" != "yes" ]; then
prototype void vp8_horizontal_line_4_5_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_4_5_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_last_vertical_band_4_5_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_2_3_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_2_3_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_last_vertical_band_2_3_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_3_5_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_3_5_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_last_vertical_band_3_5_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_3_4_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_3_4_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_last_vertical_band_3_4_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_1_2_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_1_2_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_last_vertical_band_1_2_scale "unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_5_4_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_5_4_scale "unsigned char *source, unsigned int src_pitch, unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_5_3_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_5_3_scale "unsigned char *source, unsigned int src_pitch, unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_horizontal_line_2_1_scale "const unsigned char *source, unsigned int source_width, unsigned char *dest, unsigned int dest_width"
prototype void vp8_vertical_band_2_1_scale "unsigned char *source, unsigned int src_pitch, unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
prototype void vp8_vertical_band_2_1_scale_i "unsigned char *source, unsigned int src_pitch, unsigned char *dest, unsigned int dest_pitch, unsigned int dest_width"
fi
prototype void vp8_yv12_extend_frame_borders "struct yv12_buffer_config *ybf"
specialize vp8_yv12_extend_frame_borders neon
prototype void vp8_yv12_copy_frame "struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc"
specialize vp8_yv12_copy_frame neon
prototype void vp8_yv12_copy_y "struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc"
specialize vp8_yv12_copy_y neon

View File

@@ -13,40 +13,15 @@
#include "vpx_config.h"
#include "vpx/vpx_integer.h"
unsigned int vp8_sad16x16_c(
const unsigned char *src_ptr,
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
{
int r, c;
unsigned int sad = 0;
for (r = 0; r < 16; r++)
{
for (c = 0; c < 16; c++)
{
sad += abs(src_ptr[c] - ref_ptr[c]);
}
src_ptr += src_stride;
ref_ptr += ref_stride;
}
return sad;
}
static __inline
static
unsigned int sad_mx_n_c(
const unsigned char *src_ptr,
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int m,
int n)
int max_sad,
int m,
int n)
{
int r, c;
@@ -59,6 +34,9 @@ unsigned int sad_mx_n_c(
sad += abs(src_ptr[c] - ref_ptr[c]);
}
if (sad > max_sad)
break;
src_ptr += src_stride;
ref_ptr += ref_stride;
}
@@ -66,16 +44,31 @@ unsigned int sad_mx_n_c(
return sad;
}
/* max_sad is provided as an optional optimization point. Alternative
* implementations of these functions are not required to check it.
*/
unsigned int vp8_sad16x16_c(
const unsigned char *src_ptr,
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
{
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, max_sad, 16, 16);
}
unsigned int vp8_sad8x8_c(
const unsigned char *src_ptr,
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
int max_sad)
{
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, 8, 8);
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, max_sad, 8, 8);
}
@@ -84,10 +77,10 @@ unsigned int vp8_sad16x8_c(
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
int max_sad)
{
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, 16, 8);
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, max_sad, 16, 8);
}
@@ -97,10 +90,10 @@ unsigned int vp8_sad8x16_c(
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
int max_sad)
{
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, 8, 16);
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, max_sad, 8, 16);
}
@@ -109,10 +102,10 @@ unsigned int vp8_sad4x4_c(
int src_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad)
int max_sad)
{
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, 4, 4);
return sad_mx_n_c(src_ptr, src_stride, ref_ptr, ref_stride, max_sad, 4, 4);
}
void vp8_sad16x16x3_c(

View File

@@ -1,86 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef SUBPIXEL_H
#define SUBPIXEL_H
#define prototype_subpixel_predict(sym) \
void sym(unsigned char *src, int src_pitch, int xofst, int yofst, \
unsigned char *dst, int dst_pitch)
#if ARCH_X86 || ARCH_X86_64
#include "x86/subpixel_x86.h"
#endif
#if ARCH_ARM
#include "arm/subpixel_arm.h"
#endif
#ifndef vp8_subpix_sixtap16x16
#define vp8_subpix_sixtap16x16 vp8_sixtap_predict16x16_c
#endif
extern prototype_subpixel_predict(vp8_subpix_sixtap16x16);
#ifndef vp8_subpix_sixtap8x8
#define vp8_subpix_sixtap8x8 vp8_sixtap_predict8x8_c
#endif
extern prototype_subpixel_predict(vp8_subpix_sixtap8x8);
#ifndef vp8_subpix_sixtap8x4
#define vp8_subpix_sixtap8x4 vp8_sixtap_predict8x4_c
#endif
extern prototype_subpixel_predict(vp8_subpix_sixtap8x4);
#ifndef vp8_subpix_sixtap4x4
#define vp8_subpix_sixtap4x4 vp8_sixtap_predict_c
#endif
extern prototype_subpixel_predict(vp8_subpix_sixtap4x4);
#ifndef vp8_subpix_bilinear16x16
#define vp8_subpix_bilinear16x16 vp8_bilinear_predict16x16_c
#endif
extern prototype_subpixel_predict(vp8_subpix_bilinear16x16);
#ifndef vp8_subpix_bilinear8x8
#define vp8_subpix_bilinear8x8 vp8_bilinear_predict8x8_c
#endif
extern prototype_subpixel_predict(vp8_subpix_bilinear8x8);
#ifndef vp8_subpix_bilinear8x4
#define vp8_subpix_bilinear8x4 vp8_bilinear_predict8x4_c
#endif
extern prototype_subpixel_predict(vp8_subpix_bilinear8x4);
#ifndef vp8_subpix_bilinear4x4
#define vp8_subpix_bilinear4x4 vp8_bilinear_predict4x4_c
#endif
extern prototype_subpixel_predict(vp8_subpix_bilinear4x4);
typedef prototype_subpixel_predict((*vp8_subpix_fn_t));
typedef struct
{
vp8_subpix_fn_t sixtap16x16;
vp8_subpix_fn_t sixtap8x8;
vp8_subpix_fn_t sixtap8x4;
vp8_subpix_fn_t sixtap4x4;
vp8_subpix_fn_t bilinear16x16;
vp8_subpix_fn_t bilinear8x8;
vp8_subpix_fn_t bilinear8x4;
vp8_subpix_fn_t bilinear4x4;
} vp8_subpix_rtcd_vtable_t;
#if CONFIG_RUNTIME_CPU_DETECT
#define SUBPIX_INVOKE(ctx,fn) (ctx)->fn
#else
#define SUBPIX_INVOKE(ctx,fn) vp8_subpix_##fn
#endif
#endif

View File

@@ -33,6 +33,29 @@
#define pthread_getspecific(ts_key) TlsGetValue(ts_key)
#define pthread_setspecific(ts_key, value) TlsSetValue(ts_key, (void *)value)
#define pthread_self() GetCurrentThreadId()
#elif defined(__OS2__)
/* OS/2 */
#define INCL_DOS
#include <os2.h>
#include <stdlib.h>
#define THREAD_FUNCTION void
#define THREAD_FUNCTION_RETURN void
#define THREAD_SPECIFIC_INDEX PULONG
#define pthread_t TID
#define pthread_attr_t ULONG
#define pthread_create(thhandle,attr,thfunc,tharg) \
((int)((*(thhandle)=_beginthread(thfunc,NULL,1024*1024,tharg))==-1))
#define pthread_join(thread, result) ((int)DosWaitThread(&(thread),0))
#define pthread_detach(thread) 0
#define thread_sleep(nms) DosSleep(nms)
#define pthread_cancel(thread) DosKillThread(thread)
#define ts_key_create(ts_key, destructor) \
DosAllocThreadLocalMemory(1, &(ts_key));
#define pthread_getspecific(ts_key) ((void *)(*(ts_key)))
#define pthread_setspecific(ts_key, value) (*(ts_key)=(ULONG)(value))
#define pthread_self() _gettid()
#else
#ifdef __APPLE__
#include <mach/mach_init.h>
@@ -64,6 +87,76 @@
#define sem_destroy(sem) if(*sem)((int)(CloseHandle(*sem))==TRUE)
#define thread_sleep(nms) Sleep(nms)
#elif defined(__OS2__)
typedef struct
{
HEV event;
HMTX wait_mutex;
HMTX count_mutex;
int count;
} sem_t;
static inline int sem_init(sem_t *sem, int pshared, unsigned int value)
{
DosCreateEventSem(NULL, &sem->event, pshared ? DC_SEM_SHARED : 0,
value > 0 ? TRUE : FALSE);
DosCreateMutexSem(NULL, &sem->wait_mutex, 0, FALSE);
DosCreateMutexSem(NULL, &sem->count_mutex, 0, FALSE);
sem->count = value;
return 0;
}
static inline int sem_wait(sem_t * sem)
{
DosRequestMutexSem(sem->wait_mutex, -1);
DosWaitEventSem(sem->event, -1);
DosRequestMutexSem(sem->count_mutex, -1);
sem->count--;
if (sem->count == 0)
{
ULONG post_count;
DosResetEventSem(sem->event, &post_count);
}
DosReleaseMutexSem(sem->count_mutex);
DosReleaseMutexSem(sem->wait_mutex);
return 0;
}
static inline int sem_post(sem_t * sem)
{
DosRequestMutexSem(sem->count_mutex, -1);
if (sem->count < 32768)
{
sem->count++;
DosPostEventSem(sem->event);
}
DosReleaseMutexSem(sem->count_mutex);
return 0;
}
static inline int sem_destroy(sem_t * sem)
{
DosCloseEventSem(sem->event);
DosCloseMutexSem(sem->wait_mutex);
DosCloseMutexSem(sem->count_mutex);
return 0;
}
#define thread_sleep(nms) DosSleep(nms)
#else
#ifdef __APPLE__

115
vp8/common/variance.h Normal file
View File

@@ -0,0 +1,115 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef VARIANCE_H
#define VARIANCE_H
typedef unsigned int(*vp8_sad_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride,
int max_sad
);
typedef void (*vp8_copy32xn_fn_t)(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride,
int n);
typedef void (*vp8_sad_multi_fn_t)(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride,
unsigned int *sad_array);
typedef void (*vp8_sad_multi1_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride,
unsigned short *sad_array
);
typedef void (*vp8_sad_multi_d_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
unsigned char *ref_ptr[4],
int ref_stride,
unsigned int *sad_array
);
typedef unsigned int (*vp8_variance_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride,
unsigned int *sse
);
typedef unsigned int (*vp8_subpixvariance_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
int xoffset,
int yoffset,
const unsigned char *ref_ptr,
int Refstride,
unsigned int *sse
);
typedef void (*vp8_ssimpf_fn_t)
(
unsigned char *s,
int sp,
unsigned char *r,
int rp,
unsigned long *sum_s,
unsigned long *sum_r,
unsigned long *sum_sq_s,
unsigned long *sum_sq_r,
unsigned long *sum_sxr
);
typedef unsigned int (*vp8_getmbss_fn_t)(const short *);
typedef unsigned int (*vp8_get16x16prederror_fn_t)
(
const unsigned char *src_ptr,
int source_stride,
const unsigned char *ref_ptr,
int ref_stride
);
typedef struct variance_vtable
{
vp8_sad_fn_t sdf;
vp8_variance_fn_t vf;
vp8_subpixvariance_fn_t svf;
vp8_variance_fn_t svf_halfpix_h;
vp8_variance_fn_t svf_halfpix_v;
vp8_variance_fn_t svf_halfpix_hv;
vp8_sad_multi_fn_t sdx3f;
vp8_sad_multi1_fn_t sdx8f;
vp8_sad_multi_d_fn_t sdx4df;
#if ARCH_X86 || ARCH_X86_64
vp8_copy32xn_fn_t copymem;
#endif
} vp8_variance_fn_ptr_t;
#endif

View File

@@ -10,7 +10,7 @@
#include "variance.h"
#include "vp8/common/filter.h"
#include "filter.h"
unsigned int vp8_get_mb_ss_c
@@ -75,7 +75,7 @@ unsigned int vp8_variance16x16_c(
variance(src_ptr, source_stride, ref_ptr, recon_stride, 16, 16, &var, &avg);
*sse = var;
return (var - ((avg * avg) >> 8));
return (var - ((unsigned int)(avg * avg) >> 8));
}
unsigned int vp8_variance8x16_c(
@@ -91,7 +91,7 @@ unsigned int vp8_variance8x16_c(
variance(src_ptr, source_stride, ref_ptr, recon_stride, 8, 16, &var, &avg);
*sse = var;
return (var - ((avg * avg) >> 7));
return (var - ((unsigned int)(avg * avg) >> 7));
}
unsigned int vp8_variance16x8_c(
@@ -107,7 +107,7 @@ unsigned int vp8_variance16x8_c(
variance(src_ptr, source_stride, ref_ptr, recon_stride, 16, 8, &var, &avg);
*sse = var;
return (var - ((avg * avg) >> 7));
return (var - ((unsigned int)(avg * avg) >> 7));
}
@@ -124,7 +124,7 @@ unsigned int vp8_variance8x8_c(
variance(src_ptr, source_stride, ref_ptr, recon_stride, 8, 8, &var, &avg);
*sse = var;
return (var - ((avg * avg) >> 6));
return (var - ((unsigned int)(avg * avg) >> 6));
}
unsigned int vp8_variance4x4_c(
@@ -140,7 +140,7 @@ unsigned int vp8_variance4x4_c(
variance(src_ptr, source_stride, ref_ptr, recon_stride, 4, 4, &var, &avg);
*sse = var;
return (var - ((avg * avg) >> 4));
return (var - ((unsigned int)(avg * avg) >> 4));
}

242
vp8/common/vp8_entropymodedata.h Executable file
View File

@@ -0,0 +1,242 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
/*Generated file, included by entropymode.c*/
const struct vp8_token_struct vp8_bmode_encodings[VP8_BINTRAMODES] =
{
{ 0, 1 },
{ 2, 2 },
{ 6, 3 },
{ 28, 5 },
{ 30, 5 },
{ 58, 6 },
{ 59, 6 },
{ 62, 6 },
{ 126, 7 },
{ 127, 7 }
};
const struct vp8_token_struct vp8_ymode_encodings[VP8_YMODES] =
{
{ 0, 1 },
{ 4, 3 },
{ 5, 3 },
{ 6, 3 },
{ 7, 3 }
};
const struct vp8_token_struct vp8_kf_ymode_encodings[VP8_YMODES] =
{
{ 4, 3 },
{ 5, 3 },
{ 6, 3 },
{ 7, 3 },
{ 0, 1 }
};
const struct vp8_token_struct vp8_uv_mode_encodings[VP8_UV_MODES] =
{
{ 0, 1 },
{ 2, 2 },
{ 6, 3 },
{ 7, 3 }
};
const struct vp8_token_struct vp8_mbsplit_encodings[VP8_NUMMBSPLITS] =
{
{ 6, 3 },
{ 7, 3 },
{ 2, 2 },
{ 0, 1 }
};
const struct vp8_token_struct vp8_mv_ref_encoding_array[VP8_MVREFS] =
{
{ 2, 2 },
{ 6, 3 },
{ 0, 1 },
{ 14, 4 },
{ 15, 4 }
};
const struct vp8_token_struct vp8_sub_mv_ref_encoding_array[VP8_SUBMVREFS] =
{
{ 0, 1 },
{ 2, 2 },
{ 6, 3 },
{ 7, 3 }
};
const struct vp8_token_struct vp8_small_mvencodings[8] =
{
{ 0, 3 },
{ 1, 3 },
{ 2, 3 },
{ 3, 3 },
{ 4, 3 },
{ 5, 3 },
{ 6, 3 },
{ 7, 3 }
};
const vp8_prob vp8_ymode_prob[VP8_YMODES-1] =
{
112, 86, 140, 37
};
const vp8_prob vp8_kf_ymode_prob[VP8_YMODES-1] =
{
145, 156, 163, 128
};
const vp8_prob vp8_uv_mode_prob[VP8_UV_MODES-1] =
{
162, 101, 204
};
const vp8_prob vp8_kf_uv_mode_prob[VP8_UV_MODES-1] =
{
142, 114, 183
};
const vp8_prob vp8_bmode_prob[VP8_BINTRAMODES-1] =
{
120, 90, 79, 133, 87, 85, 80, 111, 151
};
const vp8_prob vp8_kf_bmode_prob
[VP8_BINTRAMODES] [VP8_BINTRAMODES] [VP8_BINTRAMODES-1] =
{
{
{ 231, 120, 48, 89, 115, 113, 120, 152, 112 },
{ 152, 179, 64, 126, 170, 118, 46, 70, 95 },
{ 175, 69, 143, 80, 85, 82, 72, 155, 103 },
{ 56, 58, 10, 171, 218, 189, 17, 13, 152 },
{ 144, 71, 10, 38, 171, 213, 144, 34, 26 },
{ 114, 26, 17, 163, 44, 195, 21, 10, 173 },
{ 121, 24, 80, 195, 26, 62, 44, 64, 85 },
{ 170, 46, 55, 19, 136, 160, 33, 206, 71 },
{ 63, 20, 8, 114, 114, 208, 12, 9, 226 },
{ 81, 40, 11, 96, 182, 84, 29, 16, 36 }
},
{
{ 134, 183, 89, 137, 98, 101, 106, 165, 148 },
{ 72, 187, 100, 130, 157, 111, 32, 75, 80 },
{ 66, 102, 167, 99, 74, 62, 40, 234, 128 },
{ 41, 53, 9, 178, 241, 141, 26, 8, 107 },
{ 104, 79, 12, 27, 217, 255, 87, 17, 7 },
{ 74, 43, 26, 146, 73, 166, 49, 23, 157 },
{ 65, 38, 105, 160, 51, 52, 31, 115, 128 },
{ 87, 68, 71, 44, 114, 51, 15, 186, 23 },
{ 47, 41, 14, 110, 182, 183, 21, 17, 194 },
{ 66, 45, 25, 102, 197, 189, 23, 18, 22 }
},
{
{ 88, 88, 147, 150, 42, 46, 45, 196, 205 },
{ 43, 97, 183, 117, 85, 38, 35, 179, 61 },
{ 39, 53, 200, 87, 26, 21, 43, 232, 171 },
{ 56, 34, 51, 104, 114, 102, 29, 93, 77 },
{ 107, 54, 32, 26, 51, 1, 81, 43, 31 },
{ 39, 28, 85, 171, 58, 165, 90, 98, 64 },
{ 34, 22, 116, 206, 23, 34, 43, 166, 73 },
{ 68, 25, 106, 22, 64, 171, 36, 225, 114 },
{ 34, 19, 21, 102, 132, 188, 16, 76, 124 },
{ 62, 18, 78, 95, 85, 57, 50, 48, 51 }
},
{
{ 193, 101, 35, 159, 215, 111, 89, 46, 111 },
{ 60, 148, 31, 172, 219, 228, 21, 18, 111 },
{ 112, 113, 77, 85, 179, 255, 38, 120, 114 },
{ 40, 42, 1, 196, 245, 209, 10, 25, 109 },
{ 100, 80, 8, 43, 154, 1, 51, 26, 71 },
{ 88, 43, 29, 140, 166, 213, 37, 43, 154 },
{ 61, 63, 30, 155, 67, 45, 68, 1, 209 },
{ 142, 78, 78, 16, 255, 128, 34, 197, 171 },
{ 41, 40, 5, 102, 211, 183, 4, 1, 221 },
{ 51, 50, 17, 168, 209, 192, 23, 25, 82 }
},
{
{ 125, 98, 42, 88, 104, 85, 117, 175, 82 },
{ 95, 84, 53, 89, 128, 100, 113, 101, 45 },
{ 75, 79, 123, 47, 51, 128, 81, 171, 1 },
{ 57, 17, 5, 71, 102, 57, 53, 41, 49 },
{ 115, 21, 2, 10, 102, 255, 166, 23, 6 },
{ 38, 33, 13, 121, 57, 73, 26, 1, 85 },
{ 41, 10, 67, 138, 77, 110, 90, 47, 114 },
{ 101, 29, 16, 10, 85, 128, 101, 196, 26 },
{ 57, 18, 10, 102, 102, 213, 34, 20, 43 },
{ 117, 20, 15, 36, 163, 128, 68, 1, 26 }
},
{
{ 138, 31, 36, 171, 27, 166, 38, 44, 229 },
{ 67, 87, 58, 169, 82, 115, 26, 59, 179 },
{ 63, 59, 90, 180, 59, 166, 93, 73, 154 },
{ 40, 40, 21, 116, 143, 209, 34, 39, 175 },
{ 57, 46, 22, 24, 128, 1, 54, 17, 37 },
{ 47, 15, 16, 183, 34, 223, 49, 45, 183 },
{ 46, 17, 33, 183, 6, 98, 15, 32, 183 },
{ 65, 32, 73, 115, 28, 128, 23, 128, 205 },
{ 40, 3, 9, 115, 51, 192, 18, 6, 223 },
{ 87, 37, 9, 115, 59, 77, 64, 21, 47 }
},
{
{ 104, 55, 44, 218, 9, 54, 53, 130, 226 },
{ 64, 90, 70, 205, 40, 41, 23, 26, 57 },
{ 54, 57, 112, 184, 5, 41, 38, 166, 213 },
{ 30, 34, 26, 133, 152, 116, 10, 32, 134 },
{ 75, 32, 12, 51, 192, 255, 160, 43, 51 },
{ 39, 19, 53, 221, 26, 114, 32, 73, 255 },
{ 31, 9, 65, 234, 2, 15, 1, 118, 73 },
{ 88, 31, 35, 67, 102, 85, 55, 186, 85 },
{ 56, 21, 23, 111, 59, 205, 45, 37, 192 },
{ 55, 38, 70, 124, 73, 102, 1, 34, 98 }
},
{
{ 102, 61, 71, 37, 34, 53, 31, 243, 192 },
{ 69, 60, 71, 38, 73, 119, 28, 222, 37 },
{ 68, 45, 128, 34, 1, 47, 11, 245, 171 },
{ 62, 17, 19, 70, 146, 85, 55, 62, 70 },
{ 75, 15, 9, 9, 64, 255, 184, 119, 16 },
{ 37, 43, 37, 154, 100, 163, 85, 160, 1 },
{ 63, 9, 92, 136, 28, 64, 32, 201, 85 },
{ 86, 6, 28, 5, 64, 255, 25, 248, 1 },
{ 56, 8, 17, 132, 137, 255, 55, 116, 128 },
{ 58, 15, 20, 82, 135, 57, 26, 121, 40 }
},
{
{ 164, 50, 31, 137, 154, 133, 25, 35, 218 },
{ 51, 103, 44, 131, 131, 123, 31, 6, 158 },
{ 86, 40, 64, 135, 148, 224, 45, 183, 128 },
{ 22, 26, 17, 131, 240, 154, 14, 1, 209 },
{ 83, 12, 13, 54, 192, 255, 68, 47, 28 },
{ 45, 16, 21, 91, 64, 222, 7, 1, 197 },
{ 56, 21, 39, 155, 60, 138, 23, 102, 213 },
{ 85, 26, 85, 85, 128, 128, 32, 146, 171 },
{ 18, 11, 7, 63, 144, 171, 4, 4, 246 },
{ 35, 27, 10, 146, 174, 171, 12, 26, 128 }
},
{
{ 190, 80, 35, 99, 180, 80, 126, 54, 45 },
{ 85, 126, 47, 87, 176, 51, 41, 20, 32 },
{ 101, 75, 128, 139, 118, 146, 116, 128, 85 },
{ 56, 41, 15, 176, 236, 85, 37, 9, 62 },
{ 146, 36, 19, 30, 171, 255, 97, 27, 20 },
{ 71, 30, 17, 119, 118, 255, 17, 18, 138 },
{ 101, 38, 60, 138, 55, 70, 43, 26, 142 },
{ 138, 45, 61, 62, 219, 1, 81, 188, 64 },
{ 32, 41, 20, 117, 151, 142, 20, 21, 163 },
{ 112, 19, 12, 61, 195, 128, 48, 4, 24 }
}
};

View File

@@ -1,58 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef DEQUANTIZE_X86_H
#define DEQUANTIZE_X86_H
/* Note:
*
* This platform is commonly built for runtime CPU detection. If you modify
* any of the function mappings present in this file, be sure to also update
* them in the function pointer initialization code
*/
#if HAVE_MMX
extern prototype_dequant_block(vp8_dequantize_b_mmx);
extern prototype_dequant_idct_add(vp8_dequant_idct_add_mmx);
extern prototype_dequant_idct_add_y_block(vp8_dequant_idct_add_y_block_mmx);
extern prototype_dequant_idct_add_uv_block(vp8_dequant_idct_add_uv_block_mmx);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_dequant_block
#define vp8_dequant_block vp8_dequantize_b_mmx
#undef vp8_dequant_idct_add
#define vp8_dequant_idct_add vp8_dequant_idct_add_mmx
#undef vp8_dequant_idct_add_y_block
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_mmx
#undef vp8_dequant_idct_add_uv_block
#define vp8_dequant_idct_add_uv_block vp8_dequant_idct_add_uv_block_mmx
#endif
#endif
#if HAVE_SSE2
extern prototype_dequant_idct_add_y_block(vp8_dequant_idct_add_y_block_sse2);
extern prototype_dequant_idct_add_uv_block(vp8_dequant_idct_add_uv_block_sse2);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_dequant_idct_add_y_block
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_sse2
#undef vp8_dequant_idct_add_uv_block
#define vp8_dequant_idct_add_uv_block vp8_dequant_idct_add_uv_block_sse2
#endif
#endif
#endif

View File

@@ -9,8 +9,8 @@
*/
#include "vpx_config.h"
#include "vp8/common/idct.h"
#include "vp8/common/dequantize.h"
#include "vpx_rtcd.h"
#include "vp8/common/blockd.h"
extern void vp8_dequantize_b_impl_mmx(short *sq, short *dq, short *q);

View File

@@ -9,8 +9,7 @@
*/
#include "vpx_config.h"
#include "vp8/common/idct.h"
#include "vp8/common/dequantize.h"
#include "vpx_rtcd.h"
void vp8_idct_dequant_0_2x_sse2
(short *q, short *dq ,

View File

@@ -1,56 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef IDCT_X86_H
#define IDCT_X86_H
/* Note:
*
* This platform is commonly built for runtime CPU detection. If you modify
* any of the function mappings present in this file, be sure to also update
* them in the function pointer initialization code
*/
#if HAVE_MMX
extern prototype_idct(vp8_short_idct4x4llm_mmx);
extern prototype_idct_scalar_add(vp8_dc_only_idct_add_mmx);
extern prototype_second_order(vp8_short_inv_walsh4x4_mmx);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_idct_idct16
#define vp8_idct_idct16 vp8_short_idct4x4llm_mmx
#undef vp8_idct_idct1_scalar_add
#define vp8_idct_idct1_scalar_add vp8_dc_only_idct_add_mmx
#undef vp8_idct_iwalsh16
#define vp8_idct_iwalsh16 vp8_short_inv_walsh4x4_mmx
#endif
#endif
#if HAVE_SSE2
extern prototype_second_order(vp8_short_inv_walsh4x4_sse2);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_idct_iwalsh16
#define vp8_idct_iwalsh16 vp8_short_inv_walsh4x4_sse2
#endif
#endif
#endif

View File

@@ -0,0 +1,31 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
extern "C" {
void vp8_short_idct4x4llm_mmx(short *input, unsigned char *pred_ptr,
int pred_stride, unsigned char *dst_ptr,
int dst_stride);
}
#include "vp8/common/idctllm_test.h"
namespace
{
INSTANTIATE_TEST_CASE_P(MMX, IDCTTest,
::testing::Values(vp8_short_idct4x4llm_mmx));
} // namespace
int main(int argc, char **argv) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}

File diff suppressed because it is too large Load Diff

View File

@@ -12,20 +12,33 @@
#include "vpx_config.h"
#include "vp8/common/loopfilter.h"
#define prototype_loopfilter(sym) \
void sym(unsigned char *src, int pitch, const unsigned char *blimit,\
const unsigned char *limit, const unsigned char *thresh, int count)
#define prototype_loopfilter_nc(sym) \
void sym(unsigned char *src, int pitch, const unsigned char *blimit,\
const unsigned char *limit, const unsigned char *thresh)
#define prototype_simple_loopfilter(sym) \
void sym(unsigned char *y, int ystride, const unsigned char *blimit)
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_mmx);
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_mmx);
prototype_loopfilter(vp8_loop_filter_vertical_edge_mmx);
prototype_loopfilter(vp8_loop_filter_horizontal_edge_mmx);
prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_mmx);
prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_mmx);
#if HAVE_SSE2 && ARCH_X86_64
prototype_loopfilter(vp8_loop_filter_bv_y_sse2);
prototype_loopfilter(vp8_loop_filter_bh_y_sse2);
#else
prototype_loopfilter(vp8_loop_filter_vertical_edge_sse2);
prototype_loopfilter(vp8_loop_filter_horizontal_edge_sse2);
prototype_loopfilter_nc(vp8_loop_filter_vertical_edge_sse2);
prototype_loopfilter_nc(vp8_loop_filter_horizontal_edge_sse2);
#endif
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_sse2);
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_sse2);
prototype_loopfilter_nc(vp8_mbloop_filter_vertical_edge_sse2);
prototype_loopfilter_nc(vp8_mbloop_filter_horizontal_edge_sse2);
extern loop_filter_uvfunction vp8_loop_filter_horizontal_edge_uv_sse2;
extern loop_filter_uvfunction vp8_loop_filter_vertical_edge_uv_sse2;
@@ -115,7 +128,7 @@ void vp8_loop_filter_bvs_mmx(unsigned char *y_ptr, int y_stride, const unsigned
void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
int y_stride, int uv_stride, loop_filter_info *lfi)
{
vp8_mbloop_filter_horizontal_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
vp8_mbloop_filter_horizontal_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr);
if (u_ptr)
vp8_mbloop_filter_horizontal_edge_uv_sse2(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, v_ptr);
@@ -126,7 +139,7 @@ void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsign
void vp8_loop_filter_mbv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
int y_stride, int uv_stride, loop_filter_info *lfi)
{
vp8_mbloop_filter_vertical_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
vp8_mbloop_filter_vertical_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr);
if (u_ptr)
vp8_mbloop_filter_vertical_edge_uv_sse2(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, v_ptr);
@@ -140,9 +153,9 @@ void vp8_loop_filter_bh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigne
#if ARCH_X86_64
vp8_loop_filter_bh_y_sse2(y_ptr, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
#else
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
#endif
if (u_ptr)
@@ -165,9 +178,9 @@ void vp8_loop_filter_bv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigne
#if ARCH_X86_64
vp8_loop_filter_bv_y_sse2(y_ptr, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
#else
vp8_loop_filter_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
vp8_loop_filter_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
vp8_loop_filter_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
vp8_loop_filter_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr);
#endif
if (u_ptr)

View File

@@ -1,100 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef LOOPFILTER_X86_H
#define LOOPFILTER_X86_H
/* Note:
*
* This platform is commonly built for runtime CPU detection. If you modify
* any of the function mappings present in this file, be sure to also update
* them in the function pointer initialization code
*/
#if HAVE_MMX
extern prototype_loopfilter_block(vp8_loop_filter_mbv_mmx);
extern prototype_loopfilter_block(vp8_loop_filter_bv_mmx);
extern prototype_loopfilter_block(vp8_loop_filter_mbh_mmx);
extern prototype_loopfilter_block(vp8_loop_filter_bh_mmx);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_mmx);
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_mmx);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_mmx);
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_mmx);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_lf_normal_mb_v
#define vp8_lf_normal_mb_v vp8_loop_filter_mbv_mmx
#undef vp8_lf_normal_b_v
#define vp8_lf_normal_b_v vp8_loop_filter_bv_mmx
#undef vp8_lf_normal_mb_h
#define vp8_lf_normal_mb_h vp8_loop_filter_mbh_mmx
#undef vp8_lf_normal_b_h
#define vp8_lf_normal_b_h vp8_loop_filter_bh_mmx
#undef vp8_lf_simple_mb_v
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_mmx
#undef vp8_lf_simple_b_v
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_mmx
#undef vp8_lf_simple_mb_h
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_mmx
#undef vp8_lf_simple_b_h
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_mmx
#endif
#endif
#if HAVE_SSE2
extern prototype_loopfilter_block(vp8_loop_filter_mbv_sse2);
extern prototype_loopfilter_block(vp8_loop_filter_bv_sse2);
extern prototype_loopfilter_block(vp8_loop_filter_mbh_sse2);
extern prototype_loopfilter_block(vp8_loop_filter_bh_sse2);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_sse2);
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_sse2);
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_sse2);
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_sse2);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_lf_normal_mb_v
#define vp8_lf_normal_mb_v vp8_loop_filter_mbv_sse2
#undef vp8_lf_normal_b_v
#define vp8_lf_normal_b_v vp8_loop_filter_bv_sse2
#undef vp8_lf_normal_mb_h
#define vp8_lf_normal_mb_h vp8_loop_filter_mbh_sse2
#undef vp8_lf_normal_b_h
#define vp8_lf_normal_b_h vp8_loop_filter_bh_sse2
#undef vp8_lf_simple_mb_v
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_sse2
#undef vp8_lf_simple_b_v
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_sse2
#undef vp8_lf_simple_mb_h
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_sse2
#undef vp8_lf_simple_b_h
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_sse2
#endif
#endif
#endif

View File

@@ -0,0 +1,281 @@
;
; Copyright (c) 2012 The WebM project authors. All Rights Reserved.
;
; Use of this source code is governed by a BSD-style license
; that can be found in the LICENSE file in the root of the source
; tree. An additional intellectual property rights grant can be found
; in the file PATENTS. All contributing project authors may
; be found in the AUTHORS file in the root of the source tree.
;
%include "vpx_ports/x86_abi_support.asm"
;void vp8_filter_by_weight16x16_sse2
;(
; unsigned char *src,
; int src_stride,
; unsigned char *dst,
; int dst_stride,
; int src_weight
;)
global sym(vp8_filter_by_weight16x16_sse2)
sym(vp8_filter_by_weight16x16_sse2):
push rbp
mov rbp, rsp
SHADOW_ARGS_TO_STACK 5
SAVE_XMM 6
GET_GOT rbx
push rsi
push rdi
; end prolog
movd xmm0, arg(4) ; src_weight
pshuflw xmm0, xmm0, 0x0 ; replicate to all low words
punpcklqdq xmm0, xmm0 ; replicate to all hi words
movdqa xmm1, [GLOBAL(tMFQE)]
psubw xmm1, xmm0 ; dst_weight
mov rax, arg(0) ; src
mov rsi, arg(1) ; src_stride
mov rdx, arg(2) ; dst
mov rdi, arg(3) ; dst_stride
mov rcx, 16 ; loop count
pxor xmm6, xmm6
.combine
movdqa xmm2, [rax]
movdqa xmm4, [rdx]
add rax, rsi
; src * src_weight
movdqa xmm3, xmm2
punpcklbw xmm2, xmm6
punpckhbw xmm3, xmm6
pmullw xmm2, xmm0
pmullw xmm3, xmm0
; dst * dst_weight
movdqa xmm5, xmm4
punpcklbw xmm4, xmm6
punpckhbw xmm5, xmm6
pmullw xmm4, xmm1
pmullw xmm5, xmm1
; sum, round and shift
paddw xmm2, xmm4
paddw xmm3, xmm5
paddw xmm2, [GLOBAL(tMFQE_round)]
paddw xmm3, [GLOBAL(tMFQE_round)]
psrlw xmm2, 4
psrlw xmm3, 4
packuswb xmm2, xmm3
movdqa [rdx], xmm2
add rdx, rdi
dec rcx
jnz .combine
; begin epilog
pop rdi
pop rsi
RESTORE_GOT
RESTORE_XMM
UNSHADOW_ARGS
pop rbp
ret
;void vp8_filter_by_weight8x8_sse2
;(
; unsigned char *src,
; int src_stride,
; unsigned char *dst,
; int dst_stride,
; int src_weight
;)
global sym(vp8_filter_by_weight8x8_sse2)
sym(vp8_filter_by_weight8x8_sse2):
push rbp
mov rbp, rsp
SHADOW_ARGS_TO_STACK 5
GET_GOT rbx
push rsi
push rdi
; end prolog
movd xmm0, arg(4) ; src_weight
pshuflw xmm0, xmm0, 0x0 ; replicate to all low words
punpcklqdq xmm0, xmm0 ; replicate to all hi words
movdqa xmm1, [GLOBAL(tMFQE)]
psubw xmm1, xmm0 ; dst_weight
mov rax, arg(0) ; src
mov rsi, arg(1) ; src_stride
mov rdx, arg(2) ; dst
mov rdi, arg(3) ; dst_stride
mov rcx, 8 ; loop count
pxor xmm4, xmm4
.combine
movq xmm2, [rax]
movq xmm3, [rdx]
add rax, rsi
; src * src_weight
punpcklbw xmm2, xmm4
pmullw xmm2, xmm0
; dst * dst_weight
punpcklbw xmm3, xmm4
pmullw xmm3, xmm1
; sum, round and shift
paddw xmm2, xmm3
paddw xmm2, [GLOBAL(tMFQE_round)]
psrlw xmm2, 4
packuswb xmm2, xmm4
movq [rdx], xmm2
add rdx, rdi
dec rcx
jnz .combine
; begin epilog
pop rdi
pop rsi
RESTORE_GOT
UNSHADOW_ARGS
pop rbp
ret
;void vp8_variance_and_sad_16x16_sse2 | arg
;(
; unsigned char *src1, 0
; int stride1, 1
; unsigned char *src2, 2
; int stride2, 3
; unsigned int *variance, 4
; unsigned int *sad, 5
;)
global sym(vp8_variance_and_sad_16x16_sse2)
sym(vp8_variance_and_sad_16x16_sse2):
push rbp
mov rbp, rsp
SHADOW_ARGS_TO_STACK 6
GET_GOT rbx
push rsi
push rdi
; end prolog
mov rax, arg(0) ; src1
mov rcx, arg(1) ; stride1
mov rdx, arg(2) ; src2
mov rdi, arg(3) ; stride2
mov rsi, 16 ; block height
; Prep accumulator registers
pxor xmm3, xmm3 ; SAD
pxor xmm4, xmm4 ; sum of src2
pxor xmm5, xmm5 ; sum of src2^2
; Because we're working with the actual output frames
; we can't depend on any kind of data alignment.
.accumulate
movdqa xmm0, [rax] ; src1
movdqa xmm1, [rdx] ; src2
add rax, rcx ; src1 + stride1
add rdx, rdi ; src2 + stride2
; SAD(src1, src2)
psadbw xmm0, xmm1
paddusw xmm3, xmm0
; SUM(src2)
pxor xmm2, xmm2
psadbw xmm2, xmm1 ; sum src2 by misusing SAD against 0
paddusw xmm4, xmm2
; pmaddubsw would be ideal if it took two unsigned values. instead,
; it expects a signed and an unsigned value. so instead we zero extend
; and operate on words.
pxor xmm2, xmm2
movdqa xmm0, xmm1
punpcklbw xmm0, xmm2
punpckhbw xmm1, xmm2
pmaddwd xmm0, xmm0
pmaddwd xmm1, xmm1
paddd xmm5, xmm0
paddd xmm5, xmm1
sub rsi, 1
jnz .accumulate
; phaddd only operates on adjacent double words.
; Finalize SAD and store
movdqa xmm0, xmm3
psrldq xmm0, 8
paddusw xmm0, xmm3
paddd xmm0, [GLOBAL(t128)]
psrld xmm0, 8
mov rax, arg(5)
movd [rax], xmm0
; Accumulate sum of src2
movdqa xmm0, xmm4
psrldq xmm0, 8
paddusw xmm0, xmm4
; Square src2. Ignore high value
pmuludq xmm0, xmm0
psrld xmm0, 8
; phaddw could be used to sum adjacent values but we want
; all the values summed. promote to doubles, accumulate,
; shift and sum
pxor xmm2, xmm2
movdqa xmm1, xmm5
punpckldq xmm1, xmm2
punpckhdq xmm5, xmm2
paddd xmm1, xmm5
movdqa xmm2, xmm1
psrldq xmm1, 8
paddd xmm1, xmm2
psubd xmm1, xmm0
; (variance + 128) >> 8
paddd xmm1, [GLOBAL(t128)]
psrld xmm1, 8
mov rax, arg(4)
movd [rax], xmm1
; begin epilog
pop rdi
pop rsi
RESTORE_GOT
UNSHADOW_ARGS
pop rbp
ret
SECTION_RODATA
align 16
t128:
ddq 128
align 16
tMFQE: ; 1 << MFQE_PRECISION
times 8 dw 0x10
align 16
tMFQE_round: ; 1 << (MFQE_PRECISION - 1)
times 8 dw 0x08

View File

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
* Copyright (c) 2012 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
@@ -8,10 +8,14 @@
* be found in the AUTHORS file in the root of the source tree.
*/
/* On Android NDK, rand is inlined function, but postproc needs rand symbol */
#if defined(__ANDROID__)
#define rand __rand
#include <stdlib.h>
#undef rand
#ifndef __INC_RECONINTRA_H
#define __INC_RECONINTRA_H
extern void init_intra_left_above_pixels(MACROBLOCKD *x);
extern int rand(void)
{
return __rand();
}
#endif

View File

@@ -1,64 +0,0 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef POSTPROC_X86_H
#define POSTPROC_X86_H
/* Note:
*
* This platform is commonly built for runtime CPU detection. If you modify
* any of the function mappings present in this file, be sure to also update
* them in the function pointer initialization code
*/
#if HAVE_MMX
extern prototype_postproc_inplace(vp8_mbpost_proc_down_mmx);
extern prototype_postproc(vp8_post_proc_down_and_across_mmx);
extern prototype_postproc_addnoise(vp8_plane_add_noise_mmx);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_postproc_down
#define vp8_postproc_down vp8_mbpost_proc_down_mmx
#undef vp8_postproc_downacross
#define vp8_postproc_downacross vp8_post_proc_down_and_across_mmx
#undef vp8_postproc_addnoise
#define vp8_postproc_addnoise vp8_plane_add_noise_mmx
#endif
#endif
#if HAVE_SSE2
extern prototype_postproc_inplace(vp8_mbpost_proc_down_xmm);
extern prototype_postproc_inplace(vp8_mbpost_proc_across_ip_xmm);
extern prototype_postproc(vp8_post_proc_down_and_across_xmm);
extern prototype_postproc_addnoise(vp8_plane_add_noise_wmt);
#if !CONFIG_RUNTIME_CPU_DETECT
#undef vp8_postproc_down
#define vp8_postproc_down vp8_mbpost_proc_down_xmm
#undef vp8_postproc_across
#define vp8_postproc_across vp8_mbpost_proc_across_ip_xmm
#undef vp8_postproc_downacross
#define vp8_postproc_downacross vp8_post_proc_down_and_across_xmm
#undef vp8_postproc_addnoise
#define vp8_postproc_addnoise vp8_plane_add_noise_wmt
#endif
#endif
#endif

Some files were not shown because too many files have changed in this diff Show More