Commit Graph

404 Commits

Author SHA1 Message Date
John Koleszar
8398449cbf Merge remote branch 'internal/upstream' into HEAD 2011-04-30 00:05:05 -04:00
Ronald S. Bultje
5a23352c03 Make hor UV predict ~2x faster (73 vs 132 cycles) using SSSE3.
Change-Id: I658a1df7d825f820573cb2d11ad402f9d2791035
2011-04-29 11:52:09 -07:00
Yaowu Xu
57ad189129 changed configure option name to reduce confusion
Renamed configure option "enable-psnr" to "enable-internal-stats" to
better reflect the purpose of the option and eliminate the confusion
reported in http://code.google.com/p/webm/issues/detail?id=35

Change-Id: If72df6fdb9f1e33dab1329240ba4d8911d2f1f7a
2011-04-29 09:39:05 -07:00
Scott LaVarnway
1b2abc5f49 Merge "Consolidated build inter predictors" 2011-04-29 07:13:49 -07:00
John Koleszar
89c3269636 Merge remote branch 'origin/master' into experimental
Change-Id: I993021d0b2d7fbe44d6371464f2686eed3ccfaae
2011-04-29 00:05:07 -04:00
John Koleszar
57afffbcbb Merge remote branch 'internal/upstream' into HEAD 2011-04-29 00:05:07 -04:00
James Berry
f10732554b bug fix removed inline from recon_wrapper_sse2.c
removed inline from recon_wrapper_sse2.c to build
for visual stuido

Change-Id: I74a3482950448e2cdb30e9cd7087145b440d8a22
2011-04-28 15:12:00 -04:00
Scott LaVarnway
219ba87a93 Merge "Use psadbw to get the sum of bytes in a line." 2011-04-28 07:58:20 -07:00
Scott LaVarnway
ccd6f7ed77 Consolidated build inter predictors
Code cleanup.

Change-Id: Ic8b0167851116c64ddf08e8a3d302fb09ab61146
2011-04-28 10:53:59 -04:00
John Koleszar
c26bb0fe8f Merge remote branch 'origin/master' into experimental
Change-Id: I7d91efbc3662c86d6efa2d7495eb4689ccdb0ced
2011-04-28 00:05:07 -04:00
John Koleszar
e1b90ce862 Merge remote branch 'internal/upstream' into HEAD 2011-04-28 00:05:07 -04:00
Ronald S. Bultje
1e7ded69cf Use psadbw to get the sum of bytes in a line.
Thanks Jason for pointing that out on #vp8. ;-).

Change-Id: I5330a753e752a8704b78a409597472628e0b26a5
2011-04-27 13:49:21 -07:00
Scott LaVarnway
2e102855f4 Removed unused code in reconinter
The skip flag is never set by the encoder for SPLITMV.

Change-Id: I5ae6457edb3a1193cb5b05a6d61772c13b1dc506
2011-04-27 15:25:32 -04:00
Ronald S. Bultje
1083fe4999 SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}().
decoding

before
10.425
10.432
10.423
=10.426

after:
10.405
10.416
10.398
=10.406, 0.2% faster

encoding

before
14.252
14.331
14.250
14.223
14.241
14.220
14.221
=14.248

after
14.095
14.090
14.085
14.095
14.064
14.081
14.089
=14.086, 1.1% faster

Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
2011-04-27 11:31:27 -07:00
John Koleszar
0a77e59847 Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/common/alloccommon.c
	vp8/encoder/rdopt.c

Change-Id: I142167d31d1b9cffe143774f6915bca463df67f0
2011-04-26 08:28:51 -04:00
John Koleszar
bbc24a65c4 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/common/alloccommon.c
	vp8/encoder/rdopt.c

Change-Id: Ic34b33577423031e277235ffa6bcaff7b252e5cb
2011-04-26 08:27:39 -04:00
Johann
d5c46bdfc0 Merge "remove simpler_lpf" 2011-04-25 14:51:07 -07:00
Johann
01527e743f remove simpler_lpf
the decision to run the regular or simple loopfilter is made outside the
function and managed with pointers

stop tracking the option in two places. use filter_type exclusively

Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15
2011-04-25 17:37:41 -04:00
John Koleszar
cfbfd39de8 Merge "Change rc undershoot/overshoot semantics" 2011-04-25 10:49:32 -07:00
John Koleszar
aa926fbd27 Add rc_max_intra_bitrate_pct control
Adds a control to limit the maximum size of a keyframe, as a function of
the per-frame bitrate. See this thread[1] for more detailed discussion:

[1]: http://groups.google.com/a/webmproject.org/group/codec-devel/browse_thread/thread/271b944a5e47ca38

Change-Id: I7337707642eb8041d1e593efc2edfdf66db02a94
2011-04-25 13:47:14 -04:00
John Koleszar
308e31a3ef Merge remote branch 'internal/upstream-experimental' into HEAD
Conflicts:
	vp8/decoder/onyxd_int.h

Change-Id: Icf445b589c2bc61d93d8c977379bbd84387d0488
2011-04-25 09:13:41 -04:00
John Koleszar
5dfd6f51cb Merge remote branch 'origin/master' into experimental
Change-Id: I6f77e7c10a54c54b26126b8acd5edd0a03358a41
2011-04-22 00:05:08 -04:00
Scott LaVarnway
3698c1f620 Removed dc_diff from MB_MODE_INFO
The dc_diff flag is used to skip loopfiltering.  Instead
of setting this flag in the decoder/encoder, we now check
for this condition in the loopfilter.

Change-Id: Ie2b9cdf9e0f4e8b932bbd36e0878c05bffd28931
2011-04-21 14:38:36 -04:00
John Koleszar
b59bd22cc0 Merge remote branch 'origin/master' into experimental
Change-Id: I78a30fb4438ddd0730262691d7c120d67cbcaaa9
2011-04-21 00:05:08 -04:00
Scott LaVarnway
7a49accd0b Removed force_no_skip
force_no_skip is always set to zero.

Change-Id: I89b61c5e0bee34627a9c07c05f3517e1db76af77
2011-04-20 15:45:12 -04:00
Scott LaVarnway
09c933ea80 Removed redundant checks of the mode_info_context flags
Code cleanup.  The build inter predictor functions are
redundantly checking the mode_info_context for either
INTRA_FRAME or SPLITMV.

Change-Id: I4d58c3a5192a4c2cec5c24ab1caf608bf13aebfb
2011-04-20 14:06:40 -04:00
Attila Nagy
43464e94ed Do not copy data between encoder reference buffers.
Golden and ALT reference buffers were refreshed by copying from
the new buffer. Replaced this by index manipulation.
Also moved all the reference frame updates to one function for
easier tracking.

Change-Id: Icd3e534e7e2c8c5567168d222e6a64a96aae24a1
2011-04-20 15:26:55 +03:00
John Koleszar
65b44c2911 Merge remote branch 'origin/master' into experimental
Change-Id: I9e9ece0424b2f4b6861e9c7c0986f6eccc9159d6
2011-04-20 00:05:12 -04:00
Johann
4a2b684ef4 modify SAVE_XMM for potential 64bit use
the win64 abi requires saving and restoring xmm6:xmm15. currently
SAVE_XMM and RESTORE XMM only allow for saving xmm6:xmm7. allow
specifying the highest register used and if the stack is unaligned.

Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
2011-04-19 10:42:45 -04:00
Johann
a9b465c5c9 Merge "Add save/restore xmm registers in x86 assembly code" 2011-04-19 06:32:10 -07:00
John Koleszar
a5d3febc13 Merge remote branch 'origin/master' into experimental
Change-Id: I920c3ed6af244ef9032b744675d9f664e5878d0e
2011-04-19 00:05:09 -04:00
Johann
c7cfde42a9 Add save/restore xmm registers in x86 assembly code
Went through the code and fixed it. Verified on Windows.

Where possible, remove dependencies on xmm[67]

Current code relies on pushing rbp to the stack to get 16 byte
alignment. This broke when rbp wasn't pushed
(vp8/encoder/x86/sad_sse3.asm). Work around this by using unaligned
memory accesses. Revisit this and the offsets in
vp8/encoder/x86/sad_sse3.asm in another change to SAVE_XMM.

Change-Id: I5f940994d3ebfd977c3d68446cef20fd78b07877
2011-04-18 16:30:38 -04:00
Yunqing Wang
d5069b5af0 Merge "Handle long delay between video frames in multi-thread decoder(issue 312)" 2011-04-18 10:11:41 -07:00
John Koleszar
0ba3fffc3a Merge remote branch 'origin/master' into experimental
Change-Id: I6ee7c49138576326887b32316cffe8d3e48aa044
2011-04-16 00:05:08 -04:00
John Koleszar
9d75a502c4 Merge remote branch 'internal/upstream' into HEAD 2011-04-16 00:05:07 -04:00
Yunqing Wang
8ba58951e9 Handle long delay between video frames in multi-thread decoder(issue 312)
This is reported by m...@hesotech.de (see issue 312):
"The decoder causes an access violation
when you decode the first frame, then make a pause of about
60 seconds and then decode further frames. But only if
vpx_codec_dec_cfg_t.threads> 1.

This is caused by a timeout of WaitForSingleObject.
When I change the definition of VPXINFINITE to INFINITE(0xFFFFFFFF),
the problem is solved."

Reproduced the crash and verified the changes on Windows platform.
This brings the behavior inline with the other platforms using sem_wait().

Change-Id: I27b32f90bce05846ef2684b50f7a88f292299da1
2011-04-15 17:27:26 -04:00
Johann
487c0299c9 remove dead code, add missing RESTORE_XMM
vp8_filter_block1d16_h4_ssse3 was never called

because UNSHADOW_ARGS moves the stack by 'mov rsp, rbp', the issue was
masked. however, if/when win64 used those registers for persistant data,
issues could/will arise.

Change-Id: I56d6effca0aeba1f86082689771cb10145d39651
2011-04-15 10:11:53 -04:00
John Koleszar
a3399291ad Fix off-by-one in copy_and_extend_plane
Should only copy h lines, not h+1.

Change-Id: I802a85686635900459c6dc79596189033e5298d8
2011-04-15 08:44:39 -04:00
John Koleszar
b4bb910b57 Merge remote branch 'origin/master' into experimental
Change-Id: Iacd40d38693f433cd25b071fc8420f563b242696
2011-04-15 00:05:09 -04:00
John Koleszar
b709794929 Merge remote branch 'internal/upstream' into HEAD 2011-04-15 00:05:08 -04:00
John Koleszar
88841f1059 Refactor lookahead ring buffer
This patch cleans up the source buffer storage and copy mechanism to
allow access through a standard push/pop/peek interface. This approach
also avoids an extra copy in the case where the source is not a
multiple of 16, fixing issue #102.

Change-Id: I05808c39f5743625cb4c7af54cc841b9b10fdbd9
2011-04-13 14:26:45 -04:00
John Koleszar
c99f9d7abf Change rc undershoot/overshoot semantics
This patch changes the rc_undershoot_pct and rc_overshoot_pct controls
to set the "aggressiveness" of rate adaptation, by limiting the
amount of difference between the target buffer level and the actual
buffer level which is applied to the target frame rate for this frame.

This patch was initially provided by arosenberg at logitech.com as
an attachment to issue #270. It was modified to separate these controls
from the other unrelated modifications in that patch, as well as to
use the pre-existing variables rather than introducing new ones.

Change-Id: Id542e3f5667dd92d857d5eabf29878f2fd730a62
2011-04-12 20:49:33 -04:00
John Koleszar
7ff5084f33 Merge remote branch 'origin/master' into experimental
Change-Id: Ib42656b05f2b099f17fd6c2033bbc3445421150c
2011-04-12 00:05:09 -04:00
John Koleszar
f809f4f93c Merge remote branch 'internal/upstream' into HEAD 2011-04-12 00:05:08 -04:00
John Koleszar
a9ce3e3834 Remove unused files
Change-Id: I36ca3f2f4620358033da34daf764f0b388dacd08
2011-04-11 10:34:40 -04:00
John Koleszar
a6be45c9ca Merge remote branch 'origin/master' into experimental
Change-Id: I53be500dad1a98e21d0a28f9e07761d8d03fdcf6
2011-04-05 00:05:10 -04:00
John Koleszar
89bdcc211e Merge remote branch 'internal/upstream' into HEAD 2011-04-05 00:05:07 -04:00
Yunqing Wang
3d6815817c Use full-pixel MV in mvsadcost calculation
MV sad cost error is only used in full-pixel motion search,
which only need full-pixel resolution instead of quarter-pixel
resolution. This change reduced mvsadcost table size, and
removed unneccessary pamameter passing since this table is
constant once it is generated.

Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0
2011-04-01 16:41:58 -04:00
Attila Nagy
297b27655e Runtime detection of available processor cores.
Detect the number of available cores and limit the thread allocation
accordingly. On decoder side limit the number of threads to the max
number of token partition.

Core detetction works on Windows and
Posix platforms, which define _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN.

Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078
2011-03-31 10:23:01 +03:00
John Koleszar
51bcf621c1 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/decoder/decodemv.c
	vp8/decoder/onyxd_if.c
	vp8/encoder/ratectrl.c
	vp8/encoder/rdopt.c

Change-Id: Ia1c1c5e589f4200822d12378c7749ba62bd17ae2
2011-03-23 00:27:52 -04:00
John Koleszar
5f6db3591c Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/encoder/ratectrl.c
	vp8/encoder/rdopt.c

Change-Id: I4cc58acb432662d2c47aceda1680e52982adbc06
2011-03-23 00:24:25 -04:00
John Koleszar
769c74c0ac Merge "Increase static linkage, remove unused functions" 2011-03-21 04:51:51 -07:00
John Koleszar
429dc676b1 Increase static linkage, remove unused functions
A large number of functions were defined with external linkage, even
though they were only used from within one file. This patch changes
their linkage to static and removes the vp8_ prefix from their names,
which should make it more obvious to the reader that the function is
contained within the current translation unit. Functions that were
not referenced were removed.

These symbols were identified by:

  $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
    | sort | grep '^ *1 '

Change-Id: I59609f58ab65312012c047036ae1e0634f795779
2011-03-17 20:53:47 -04:00
John Koleszar
53d11fa6ad Merge remote branch 'origin/master' into experimental
Change-Id: I3f6c1e297fc0d33dc239bb4dd41d5afbcd91de19
2011-03-17 00:05:08 -04:00
John Koleszar
42f9104d5c Merge remote branch 'internal/upstream' into HEAD 2011-03-17 00:05:07 -04:00
John Koleszar
de50520a8c apple: include proper mach primatives
Fixes implicit declaration warning for 'mach_task_self'. This change
is an update to Change I9991dedd1ccfddc092eca86705ecbc3b764b799d,
which fixed this issue for the decoder but not the encoder.

Change-Id: I9df033e81f9520c4f975b7a7cf6c643d12e87c96
2011-03-16 13:59:32 -04:00
John Koleszar
ba83622a00 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/encoder/onyx_if.c

Change-Id: Ieef9a58a2effdc68cf52bc5f14d90c31a1dbc13a
2011-03-14 08:53:02 -04:00
John Koleszar
eeb8c8004e Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/encoder/onyx_if.c

Change-Id: I230b63cef209cd1ac98357729a91ec07597756bd
2011-03-14 08:48:44 -04:00
John Koleszar
27972d2c1d Move build_intra_predictors_mby to RTCD framework
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s
functions had global function pointers rather than using the RTCD
framework. This can show up as a potential data race with tools such as
helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935
for an example.

Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
2011-03-11 13:04:50 -05:00
John Koleszar
738a791917 Merge remote branch 'internal/upstream-experimental' into HEAD
Conflicts:
	vp8/common/blockd.h

Change-Id: Ica2bd1c3da614eab5ce23acfb597e777d16b3983
2011-03-03 08:58:57 -05:00
John Koleszar
1a7ce50a6c Merge remote branch 'origin/master' into experimental
Change-Id: I52f21ff6f9a1dca7099a8459657f6f288c5bfe40
2011-02-25 00:05:08 -05:00
Scott LaVarnway
861175ef00 Removed vp8_block2type
and used defines instead.

Change-Id: Idb56e0295d004793f406dfd2d8d8c546aad62e03
2011-02-24 14:35:18 -05:00
John Koleszar
b21fe3b278 Merge remote branch 'internal/upstream' into HEAD 2011-02-19 00:05:44 -05:00
John Koleszar
bbfca323fb Merge remote branch 'origin/master' into experimental
Change-Id: Ia3197f432b424213a34b20071e5171a413ba1aaf
2011-02-19 00:05:11 -05:00
John Koleszar
c764c2a20f Merge "clean up unused files" 2011-02-18 06:33:05 -08:00
John Koleszar
3ed8fe8778 remove unused vp8_predict_dc function
Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42
2011-02-18 09:12:20 -05:00
John Koleszar
cbf923b12c clean up unused files
Removed a number of files that were unused or little-used.

Change-Id: If9ae5e5b11390077581a9a879e8a0defe709f5da
2011-02-18 09:09:49 -05:00
John Koleszar
f13212b728 Merge remote branch 'internal/upstream' into HEAD 2011-02-18 00:05:13 -05:00
John Koleszar
4fafc4d985 Merge remote branch 'origin/master' into experimental
Change-Id: I8999a33db82d38eb85482f3c423db238d6ee3ed9
2011-02-18 00:05:11 -05:00
John Koleszar
ac10665ad8 Merge "Removed unused vp8_recon_intra4x4mb function" 2011-02-17 11:30:13 -08:00
Scott LaVarnway
07f7b66fae Removed unused vp8_recon_intra4x4mb function
Change-Id: I4a328ce152d9dbe6b0d1606d1b523e8e7bfb468e
2011-02-17 13:34:38 -05:00
John Koleszar
64aebb6c7a Merge remote branch 'internal/upstream' into HEAD 2011-02-11 00:05:19 -05:00
John Koleszar
809dae2458 Merge remote branch 'origin/master' into experimental
Change-Id: Icf1a7c61a3b07da2ccfd94bca9e8810c01e46b2c
2011-02-11 00:05:14 -05:00
John Koleszar
02321de0f2 Fix relative include paths
Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.

Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
2011-02-10 15:09:44 -05:00
John Koleszar
ec3b8f1f32 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/decoder/onyxd_int.h

Change-Id: Id9aa577f03e37b4f406ba3b593c3c4330812a49e
2011-02-10 14:26:40 -05:00
Johann
7d8199f0c3 Merge "Adds armv6 optimized variance calculation" 2011-02-10 06:06:46 -08:00
John Koleszar
96ddc5c26e Merge remote branch 'origin/master' into experimental
Change-Id: Ie85d40c44bb23d56a519010356b2856c02fb4c05
2011-02-10 00:05:10 -05:00
John Koleszar
a39b5af10b Merge "Put more code under #if CONFIG_MULTITHREAD." 2011-02-09 08:31:36 -08:00
Gaute Strokkenes
315e3c2518 Put more code under #if CONFIG_MULTITHREAD.
Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b
2011-02-09 11:21:18 -05:00
Tero Rintaluoma
cb14764fab Adds armv6 optimized variance calculation
Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
and adds new assembly file for variance16x16 calculation.
 - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
 - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
 - vp8_variance16x16_armv6 (new)
 - bilinearfilter_arm.h (new)
Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
2011-02-09 10:23:43 -05:00
John Koleszar
b2ad177942 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/vp8_common.mk

Change-Id: I2094ddf20834c0b7dfe912feac6a79500bb8cce2
2011-02-09 08:34:48 -05:00
John Koleszar
6e6b46d972 Merge remote branch 'origin/master' into experimental
Change-Id: Ibc762883a5e117f5db64dc01a46a9c78438e6c33
2011-02-09 00:05:12 -05:00
Johann
e5aaac24bb clean up bilinear filter
make reference version of bilinear_filters short.
use reference versions of bilinear_filters and sub_pel_filters when
possible.

recognize that Width was being passed into
filter_block2d_bil_first_pass multiple times. ARM version had already
fixed this. propegate to C.

change references to src_pixels_per_line to src_pitch and standardize on
src/dst (instead of input/output).

recognize that first_pass is only run in the verticle and second_pass
only horizontal. ARM version had already fixed this. propegate to C

Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2
2011-02-08 17:42:54 -05:00
Johann
40dcae9c2e clarify *_offsets.asm differences
it's difficult to mux the *_offsets.c files because of header conflicts.
make three instead, name them consistently and partititon the contents
to allow building them as required.

Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796
2011-02-08 16:35:43 -05:00
John Koleszar
9683198e7b Merge remote branch 'origin/master' into experimental
Change-Id: I7897261eb2956f778f9f9885ce2005b1e134b28f
2011-02-08 00:05:11 -05:00
John Koleszar
c540bbc367 Merge remote branch 'internal/upstream' into HEAD 2011-02-07 14:16:24 -05:00
John Koleszar
2bb322380d Merge remote branch 'internal/upstream-experimental' into HEAD
Conflicts:
	vp8/encoder/encodeframe.c
	vp8/encoder/ethreading.c
	vp8/encoder/onyx_int.h

Change-Id: I1c562d2fe6e42c0d1d86f68c77c0e899066e02bd
2011-02-07 14:16:09 -05:00
Johann
3273c7b679 move one of the offset files
common/arm/vpx_asm_offsets moves up a level. prepare for muxing with
encoder/arm/vpx_vp8_enc_asm_offsets

Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24
2011-02-07 11:35:30 -05:00
John Koleszar
318a14c637 Merge remote branch 'origin/master' into experimental
Change-Id: Ib487cbd7b214a6e3f13180bc0e5dcb792d8a406e
2011-02-05 00:05:11 -05:00
Johann
bb9c95ea53 remove unused dboolhuff code
we were holding on to this "just in case." purge it instead

Change-Id: I77a367b36d0821d731019f2566ecfffdae1d4b8a
2011-02-04 16:00:00 -05:00
Yunqing Wang
350ffe8dae Merge "Improve MV prediction in vp8_pick_inter_mode() for speed>3" 2011-02-04 10:10:15 -08:00
Gaute Strokkenes
ffc6aeef14 Remove duplicate loopfilter parameters.
Change-Id: I0d41415e3961c2c9492d342290c1999f9d02e6d8
2011-02-04 14:55:02 +00:00
John Koleszar
16bbf27fa9 Merge remote branch 'origin/master' into experimental
Change-Id: I242ca4854cb21f3d63efb979bd6ecc9f06f67f33
2011-02-04 00:05:13 -05:00
Gaute Strokkenes
bf5f585b0d Make vp8_adjust_mb_lf_value return the updated value rather than
manipulating it in situ via a pointer.

Change-Id: If4a87a4eccd84f39577c0e91e171245f4954c5cf
2011-02-03 19:24:16 +00:00
John Koleszar
de4b3352b8 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	configure

Change-Id: I74063d859de31a62285c8908bcb1821e050b9f3c
2011-01-31 09:11:52 -05:00
John Koleszar
933dfe0a94 Merge remote branch 'origin/master' into experimental
Conflicts:
	configure

Change-Id: I18c2292256d2387ff09da209aa9cf6891e1864a0
2011-01-31 09:10:35 -05:00
Johann
f3cb9ae459 Merge "Adds "armvX-none-rvct" targets" 2011-01-28 09:03:58 -08:00
Yunqing Wang
7cbe684ef5 Improve MV prediction in vp8_pick_inter_mode() for speed>3
Applied same method used in vp8_rd_pick_inter_mode() to improve
the accuracy of MV prediction.

Change-Id: Ia50ae26208b18482695601f32febd99fe89fbc17
2011-01-28 10:00:20 -05:00
John Koleszar
f1db3e8358 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/encoder/rdopt.c

Change-Id: I68d04397a12f565b9f1bd35d4e50f1cc9afb76ff
2011-01-28 08:37:44 -05:00
John Koleszar
a4c887da63 Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/encoder/rdopt.c

Change-Id: Ic17907df70fff45c9e766b5d0cbab0c5f1a1095f
2011-01-28 08:33:52 -05:00
Tero Rintaluoma
11a222f5d9 Adds "armvX-none-rvct" targets
Adds following targets to configure script to support RVCT compilation
without operating system support (for Profiler or bare metal images).
 - armv5te-none-rvct
 - armv6-none-rvct
 - armv7-none-rvct

To strip OS specific parts from the code "os_support"-config was added
to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS
specific parts such as OS specific includes and function calls for
timers and threads etc. This was done to enable RVCT compilation for
profiling purposes or running the image on bare metal target with
Lauterbach.

Removed separate AREA directives for READONLY data in armv6 and neon
assembly files to fix the RVCT compilation. Otherwise
"ldr <reg>, =label" syntax would have been needed to prevent linker
errors. This syntax is not supported by older gnu assemblers.

Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43
2011-01-28 12:47:39 +02:00
Yunqing Wang
cac54404b9 Remove copies of same functions
Reduce the code size.

Change-Id: I2e1998557a3c8776e262c442fd758c25e17aff7a
2011-01-26 15:37:00 -05:00
John Koleszar
46d9ff1b97 Merge remote branch 'internal/upstream-experimental' into HEAD
Conflicts:
	configure

Change-Id: I2ce6b0a0507f9aa4e3fed8ea1cb69779db5f4566
2011-01-21 10:13:48 -05:00
John Koleszar
3ac80a74f8 Merge remote branch 'origin/master' into experimental
Change-Id: Ia0840303fe1dc8c12f3389e7a1fe20b6d3c6b9c5
2011-01-20 00:05:28 -05:00
Yaowu Xu
5b42ae09ae experiment extending the quantizer range
Prior to this change, VP8 min quantizer is 4, which caps the
highest quality around 51DB. This experimental change extends
the min quantizer to 1, removes the cap and allows the highest
quality to be around ~73DB, consistent with the fdct/idct round trip
error. To test this change, at configure time use options:

--enable-experimental --enable-extend_qrange

The following is a brief log of changes in each of the patch sets

patch set 1:
In this commit, the quantization/dequantization constants are kept
unchanged, instead scaling factor 4 is rolled into fdct/idct.
Fixed Q0 encoding tests on mobile:
  Before:    9560.567kbps Overall PSNR:50.255DB VPXSSIM:98.288
  Now:   18035.774kbps Overall PSNR:73.022DB VPXSSIM:99.991

patch set 2:
regenerated dc/ac quantizer lookup tables based on the scaling
factor rolled in the fdct/idct. Also slightly extended the range
towards the high quantizer end.

patch set 3:
slightly tweaked the quantizer tables and generated bits_per_mb
table based on Paul's suggestions.

patch set 4:
fix a typo in idct, re-calculated tables relating active max Q
to active min Q

patch set 5:
added rdmult lookup table based on Q

patch set 6:
fix rdmult scale: dct coefficient has scaled up by 4

patch set 7:
make transform coefficients to be within 16bits

patch set 8:
normalize 2nd order quantizers

patch set 9:
fix mis-spellings

patch set 10:
change the configure script and macros to allow experimental code
to be enabled at configure time with --enable-extend_qrange

patch set 11:
rebase for merge

Change-Id: Ib50641ddd44aba2a52ed890222c309faa31cc59c
2011-01-19 13:22:35 -08:00
John Koleszar
2f0331c90c Merge "Implement error tracking in the decoder" 2011-01-19 05:51:00 -08:00
Henrik Lundin
67fb3a5155 Implement error tracking in the decoder
A new vpx_codec_control called VP8D_GET_FRAME_CORRUPTED. The output
from the function is non-zero if the last decoded frame contains
corruption due to packet losses.

The decoder is also modified to accept encoded frames of zero length.
A zero length frame indicates to the decoder that one or more frames
have been completely lost. This will mark the last decoded reference
buffer as corrupted. The data pointer can be NULL if the length is
zero.

Change-Id: Ic5902c785a281c6e05329deea958554b7a6c75ce
2011-01-19 09:53:21 +01:00
John Koleszar
05be098748 Merge remote branch 'internal/upstream-experimental' into HEAD 2011-01-13 11:35:00 -05:00
John Koleszar
20d7cfd8d5 Merge remote branch 'origin/master' into experimental
Change-Id: Id5da32e6d58a58e04a4dff9ca1df23ebb6c436b8
2011-01-12 00:05:17 -05:00
Henrik Lundin
48c28fc42c Remove unused local variables
Removing unused local variables causing compiler warnings in
Visual Studio.

Change-Id: I0e2096303be1fdbc01428a6e57cca9796bb32c8a
2011-01-11 15:22:19 +01:00
John Koleszar
f3aa1515f3 Merge remote branch 'internal/upstream-experimental' into HEAD 2011-01-10 08:29:26 -05:00
John Koleszar
3cb26b4864 Merge remote branch 'origin/master' into experimental
Change-Id: Ib34bc09a295141eb65c8c0478bde6136f178909b
2011-01-08 00:05:09 -05:00
Paul Wilkins
e0846c9c8c CQ Mode
The merge includes hooks to for CQ mode and other code
changes merged from the test branch.

CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guidline maximum bitrate.

The existing target data rate parameter is used to specify the
guideline maximum bitrate.

A new parameter allows the user to specify a target CQ level.

For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with guideline maximum bitrate.

Change-Id: I2221f9eecae8cc3c431d36caf83503941b25e4c1
2011-01-07 18:46:29 +00:00
John Koleszar
4d98741c61 Merge remote branch 'internal/upstream' into HEAD 2010-12-30 00:05:14 -05:00
John Koleszar
1e2ab6ace0 Merge remote branch 'origin/master' into experimental
Change-Id: Iedf38035a53aa772b947ae39e44e1da473d916ac
2010-12-30 00:05:09 -05:00
Yunqing Wang
a864678cdb Always update last_frame_type
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.

Change-Id: I9203532fd67361588d4024628d9ddb8e391ad912
2010-12-29 10:28:35 -05:00
John Koleszar
c99c0e1798 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/vp8_cx_iface.c

Change-Id: Id670128595d04d6a61ec811b2ad896b138acded8
2010-12-21 07:56:46 -05:00
John Koleszar
0b710c8d1a Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/vp8_cx_iface.c

Change-Id: I76f302448f11b28772efd4b5643f86a7cc69a8c2
2010-12-21 07:54:10 -05:00
John Koleszar
b0da9b399d Add psnr/ssim tuning option
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.

Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.

Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257
2010-12-17 10:01:05 -05:00
John Koleszar
f7224e14c8 Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/common/entropy.c

Change-Id: Ic95302e327f80afd0261ac5bd9881f38296def47
2010-12-15 08:11:07 -05:00
John Koleszar
4fa8d36f76 Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/common/entropy.c

Change-Id: I35fd49cf92a50d09082fe199d3bf21bfca68a94f
2010-12-15 08:08:18 -05:00
Johann
825adc464f shrink TOKENEXTRA and vp8_extra_bit_struct
Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes
original: b7b1e6fb
reverted: 41f4458a

Also drop unused field from vp8_extra_bit_struct

Update ARM ASM to deal with this change. In particular, Extra is signed
and needs to be sign-extended when loaded.

Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41
2010-12-14 10:32:50 -05:00
John Koleszar
7211ac407b Merge remote branch 'internal/upstream' into HEAD 2010-12-14 00:05:07 -05:00
John Koleszar
6a80032280 Merge remote branch 'origin/master' into experimental
Change-Id: Ic88e9b2fcf1dcb2852a7205bcda3f181103f5612
2010-12-14 00:05:05 -05:00
John Koleszar
b1aa54ab26 remove unused temporal preproc code
This code is unused, as the current preproc implementation uses the
same spatial filter that postproc uses.

Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
2010-12-13 16:47:59 -05:00
John Koleszar
b6905e36d9 Merge remote branch 'origin/master' into experimental
Change-Id: Ibbe41ff2356aa8583c728e9ab1b0814958a51752
2010-12-11 00:05:08 -05:00
John Koleszar
eb1c033731 Merge remote branch 'internal/upstream' into HEAD 2010-12-11 00:05:08 -05:00
Fritz Koenig
e0cf330cde vp8 fast quantizer sse2 optimizations for eob.
Changed the end of block computation to use pmaxw.  Removed
additional pushing and popping of registers that was not needed.

Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98
2010-12-09 15:00:30 -08:00
John Koleszar
1a1a8ea4df Merge remote branch 'internal/upstream-experimental' into HEAD 2010-11-19 00:05:03 -05:00
Yaowu Xu
0867b81678 remove low pass filtering from two 4x4 intra prediction
In the process of developing new intra prediction modes, tests have
shown removal of the low pass filtering from B_HE_PRED and B_VE_PRED
has an overall minor positive impact in both PSNR and SSIM metric.
Overall difference is about 0.1%. The change shall also have a small
positive impact on speed. Intuitively, this change should also reduce
some of the tendency of "flattening"

Change-Id: I3c43b0daca833c6eff77d00f19c811f9ef9368a3
2010-11-18 10:42:08 -08:00
Yaowu Xu
06c70d304f extends the range of tokens
Extending the value range of tokens allows further experiments on
extending quantizer range. Encoder and decoder were verified to
produce matching reconstructed buffers by tests with forced
quantized value of 1.

Change-Id: I12faf92832867870b6f71ddeafbf643f1040086d
2010-11-18 09:07:16 -08:00
John Koleszar
3a778de77a Merge remote branch 'origin/master' into experimental 2010-11-17 00:05:05 -05:00
John Koleszar
8232a3a0e4 Merge remote branch 'internal/upstream' into HEAD 2010-11-17 00:05:04 -05:00
Suman Sunkara
388546bc93 Merge branch 'experimental' of ssh://on2-git.corp.google.com:29418/libvpx into test
Conflicts:
	configure

Change-Id: Id874dc46b13e8b5da4179fc3b48e354ec313a2cd
2010-11-16 16:31:59 -05:00
Suman Sunkara
4b3f72001d Merge branch 'experimental' of ssh://on2-git.corp.google.com:29418/libvpx into test
Conflicts:
	vp8/common/blockd.h
	vp8/decoder/decodemv.c
	vp8/decoder/decodframe.c
	vp8/decoder/demode.c
	vp8/decoder/onyxd_if.c
	vp8/decoder/onyxd_int.h
	vp8/encoder/encodeframe.c

Change-Id: Ic379f4dffaded9796dc19d56be304d3f8527c61f
2010-11-16 16:30:59 -05:00
Jim Bankoski
b4a3602f66 changes to start experimenting with color segmentation prediction modes. 2010-11-16 14:38:40 -05:00
Yaowu Xu
d49da085c0 correct errors in token alphabet descriptions
There were a few errors in the comment section that describe VP8 token
alphabet table.

Change-Id: Ie6728a0e08bc3798893221b60408d5b201064bdc
2010-11-16 10:51:43 -08:00
Suman Sunkara
b9a18344cf Use of temporal context for encoding delta updates.
- Used three probability approach for temporal context as follows:
P0 - probability of no change if both above and left not changed
P1 - probability of no change if one of above and left has changed
P2 - probability of no change if both above and left have changed

In addition, a 1 bit/frame has been used to decide whether to use temporal context or to encode directly.  The cost of using both the schemes is calculated ahead and the temporal_update flag is set if the cost of using temporal context is lower than encoding the segment ids directly.

This approach has given around 20% reduction in cost of bits needed to encode segmentation ids.

Change-Id: I44a5509599eded215ae5be9554314280d3d35405
2010-11-11 11:31:36 -05:00
John Koleszar
f225211256 Merge remote branch 'origin/master' into experimental
Conflicts:
	configure

Change-Id: Ifa63e4610657f75cb953aa7ca08f997267612cc0
2010-11-11 09:25:10 -05:00
John Koleszar
1ea4c2924c Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	configure

Change-Id: I1c7bae5241f999387cae3f2abf2dfc84fe3f6651
2010-11-11 09:22:46 -05:00
Fritz Koenig
647df00f30 postproc : Re-work posproc calling to allow more flags.
Debugging in postproc needs more flags to allow for specific
block types to be turned on or off in the visualizations.

Must be enabled with --enable-postproc-visualizer during
configuration time.

Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d
2010-11-10 14:14:46 -08:00
John Koleszar
7a590c902b Merge remote branch 'origin/master' into experimental
Conflicts:
	configure
	ivfenc.c
	vp8/common/alloccommon.c
	vp8/common/onyxc_int.h
	vp8/vp8_cx_iface.c
2010-11-05 12:30:33 -04:00
John Koleszar
362f763cfe Merge remote branch 'internal/upstream' into HEAD
Conflicts:
	vp8/common/alloccommon.c
	vp8/common/onyxc_int.h
	vp8/vp8_cx_iface.c
	vpxenc.c
2010-11-04 21:50:37 -04:00
Fritz Koenig
0e7b60617f postproc : Update visualizations.
Change color reference frame to blend the macro block edge.
This helps with layering of visualizations.

Add block coloring for intra prediction modes.

Change-Id: Icefe0e189e26719cd6937cebd6727efac0b4d278
2010-11-04 10:35:02 -07:00
Fritz Koenig
0a29bd9793 postproc : Fix display of motion vectors.
Split motion vectors were all being treated as 4x4
blocks.  Now correctly handle 16x8, 8x16, 8x8, 4x4
blocks.

Change-Id: Icf345c5e69b5e374e12456877ed7c41213ad88cc
2010-11-02 13:29:13 -07:00
Fritz Koenig
9f61a83bf9 postproc : Added SPLITMV visualization, fix line constrain.
Now draw 16 vectors for SPLITMV mode.

Fixed constrain line to block divide by zero issues.

Blend block was not centering the shaded area correctly.

Change-Id: I1edabd8b4e553aac8d980f7b45c80159e9202434
2010-11-01 13:27:13 -07:00
Timothy B. Terriberry
c4d7e5e67e Eliminate more warnings.
This eliminates a large set of warnings exposed by the Mozilla build
 system (Use of C++ comments in ISO C90 source, commas at the end of
 enum lists, a couple incomplete initializers, and signed/unsigned
 comparisons).
It also eliminates many (but not all) of the warnings expose by newer
 GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
 without checking the return values).
There are a few spurious warnings left on my system:

../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
 uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
 change between the two if blocks that test it here.

../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
 expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
 expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
 standard does not mandate that enums be unsigned, so the checks can't be
 removed.

Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395
2010-10-27 18:08:04 -07:00
Fritz Koenig
a097e18964 postproc: Tweaks to line drawing and blending.
Turned down the blending level to make colored blocks obscure
the video less.
Not blending the entire block to give distinction to macro
block edges.
Added configuration so that macro block blending function can
be optimized.
Change to constrain line as to when dx and dy are computed.
Now draw two lines to form an arrow.

Change-Id: Id3ef0fdeeab2949a6664b2c63e2a3e1a89503f6c
2010-10-27 13:20:03 -07:00
Johann
787733d855 Merge "RTCD build is bringing old errors to light" 2010-10-27 09:59:01 -07:00
Fritz Koenig
cf127474d8 vpxdec : Change --pp-debug-info to be a bit field.
This allows multiple post processor debug levels to be overlayed.
i.e. can show colored reference blocks and visual motion vectors.

Change-Id: Ic4a1df438445b9f5780fe73adb3126e803472e53
2010-10-27 09:53:37 -07:00
Fritz Koenig
36ff6a6743 Merge "postproc: Add mode and refrence frame visualizers." 2010-10-27 09:04:39 -07:00
Johann
abcf36c758 RTCD build is bringing old errors to light
needs to be _recon_ not _recon_recon_

Change-Id: I7a8b9ddcb4fb72c2b723c563932c9ea52ff15982
2010-10-27 10:47:48 -04:00
Fritz Koenig
a0ccc97d8a postproc: Add mode and refrence frame visualizers.
Post process option to color the block for either the mode
of the macro block, or the frame that the macro block references.

Change-Id: Ie498175497f2d20e3319924d352dc4ddc16f4134
2010-10-26 16:00:14 -07:00
John Koleszar
d6c67f02c9 make vp8_recon16x16mb{,y} RTCD functions
ARM NEON has a platform specific version of vp8_recon16x16mb, though
it's just a stub to extract the various parameters from the
MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
that function's prototype directly will be a better long term solution,
but it's quite an invasive change.

Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
2010-10-26 13:23:36 -04:00
John Koleszar
19638c2309 arm: move unrolled loops back to generic code
Some of the ARM functions differed from their generic counterparts
only by unrolling their loops. Since this change may be useful
on other platforms, or might even supercede the looped version
in the generic case, move it back to the generic file.

This code is left under #if ARCH_ARM for now, but it may be worth
considering a different (possibly new) conditional for these. If
it turns out that this should be runtime selectable, these
functions will have to move to the RTCD infrastructure. Don't want
to take that step at this time without more profile data.

Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f
2010-10-26 09:51:35 -04:00
John Koleszar
d330a5876b arm: remove duplicate functions
These functions were true duplicates of functions present in the
generic code. This fixes some of the link errors when building
with --enable-shared --enable-pic.

Change-Id: Idff26599d510d954e439207883607ad6b74df20c
2010-10-26 09:37:44 -04:00
Fritz Koenig
1d70aaf08b Merge "Debug option for drawing motion vectors." 2010-10-25 15:40:22 -07:00
Fritz Koenig
d1a4cce809 Debug option for drawing motion vectors.
Postproc level that uses Bresenham's line algorithm
to draw motion vectors onto the postproc buffer.

Change-Id: I34c7daa324f2bdfee71e84fcb1c50b90fa06f6fb
2010-10-25 15:39:04 -07:00
Johann
1376f061da reuse common loopfilter code
there were four versions for the regular and
macroblock loopfilters:
horizontal [y|uv]
vertical [y|uv]

this moves all the common code into 2 functions:
vp8_loop_filter_neon
vp8_mbloop_filter_neon

this provides no gain in performance. there's a bit
of jitter, but it trends down ~0.25-0.5%. however,
this is a huge gain maintenance. also, there is the
potential to drop some stack usage in the macroblock
loopfilter.

Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92
2010-10-25 09:48:50 -04:00
Timothy B. Terriberry
b71962fdc9 Add runtime CPU detection support for ARM.
The primary goal is to allow a binary to be built which supports
 NEON, but can fall back to non-NEON routines, since some Android
 devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
 Tegra).
The configure-generated flags HAVE_ARMV7, etc., are used to decide
 which versions of each function to build, and when
 CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
 at run time.
In order for this to work, the CFLAGS must be set to something
 appropriate (e.g., without -mfpu=neon for ARMv7, and with
 appropriate -march and -mcpu for even earlier configurations), or
 the native C code will not be able to run.
The ASFLAGS must remain set for the most advanced instruction set
 required at build time, since the ARM assembler will refuse to emit
 them otherwise.
I have not attempted to make any changes to configure to do this
 automatically.
Doing so will probably require the addition of new configure options.

Many of the hooks for RTCD on ARM were already there, but a lot of
 the code had bit-rotted, and a good deal of the ARM-specific code
 is not integrated into the RTCD structs at all.
I did not try to resolve the latter, merely to add the minimal amount
 of protection around them to allow RTCD to work.
Those functions that were called based on an ifdef at the calling
 site were expanded to check the RTCD flags at that site, but they
 should be added to an RTCD struct somewhere in the future.
The functions invoked with global function pointers still are, but
 these should be moved into an RTCD struct for thread safety (I
 believe every platform currently supported has atomic pointer
 stores, but this is not guaranteed).

The encoder's boolhuff functions did not even have _c and armv7
 suffixes, and the correct version was resolved at link time.
The token packing functions did have appropriate suffixes, but the
 version was selected with a define, with no associated RTCD struct.
However, for both of these, the only armv7 instruction they actually
 used was rbit, and this was completely superfluous, so I reworked
 them to avoid it.
The only non-ARMv4 instruction remaining in them is clz, which is
 ARMv5 (not even ARMv5TE is required).
Considering that there are no ARM-specific configs which are not at
 least ARMv5TE, I did not try to detect these at runtime, and simply
 enable them for ARMv5 and above.

Finally, the NEON register saving code was completely non-reentrant,
 since it saved the registers to a global, static variable.
I moved the storage for this onto the stack.
A single binary built with this code was tested on an ARM11 (ARMv6)
 and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder,
 and produced identical output, while using the correct accelerated
 functions on each.
I did not test on any earlier processors.

Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa
2010-10-25 09:23:29 -04:00
Timothy B. Terriberry
8f75ea6b5c Convert [4][4] matrices to [16] arrays.
Most of the code that actually uses these matrices indexes them as
 if they were a single contiguous array, and coverity produces
 reports about the resulting accesses that overflow the static
 bounds of the first row.
This is perfectly legal in C, but converting them to actual [16]
 arrays should eliminate the report, and removes a good deal of
 extraneous indexing and address operators from the code.

Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23
2010-10-21 17:04:30 -07:00
Yaowu Xu
fc2f8dafaf Merge "fixed a typo that mis-used Y plane stride for UV blocks." 2010-10-19 16:23:31 -07:00
Yunqing Wang
7804befb55 Fix one gcc compiler warning
../libvpx/vp8/encoder/bitstream.c: In function ‘pack_inter_mode_mvs’:
../libvpx/vp8/encoder/bitstream.c:1026: warning: array subscript has type ‘char’

Change-Id: Ic77491e0a172fa1821e5b3e914d0dc41fe87c00f
2010-10-14 15:15:35 -04:00
Jan Kratochvil
1fc294116a nasm: movhps compatibility QWORD->MMWORD
Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208

nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes

Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.

Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc
2010-10-04 20:47:19 -04:00
Jan Kratochvil
5cdc3a4c29 nasm: address labels 'rel label' vice 'wrt rip'
nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
2010-10-04 19:47:54 -04:00
Jan Kratochvil
e114f699f6 nasm: match instruction length (movd/movq) to parameters
nasm requires the instruction length (movd/movq) to match to its
parameters. I find it more clear to really use 64bit instructions when
we use 64bit registers in the assembly.

Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.

Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91
2010-10-04 23:36:29 +02:00
Yaowu Xu
49fdb7c41e fixed a typo that mis-used Y plane stride for UV blocks.
Raised by Lei Yang, the Y plane stride was used for UV blocks.
This is clearly a typo. But as the comments in the code suggested
that this port of code has not been used yet, so the typo should
not have created any damage yet.

Change-Id: Iea895edc17469a51c803a8cc6d0fce65a1a7fc2f
2010-10-04 11:31:14 -07:00
Johann
f143a81191 Merge "Fix valgrind errors in the NEON loop filters." 2010-10-01 06:18:53 -07:00
Timothy B. Terriberry
a465076e02 Fix valgrind errors in the NEON loop filters.
Like the ARMv6 code, these functions were accessing values below
 the stack pointer, which can be corrupted by signal delivery at
 any time.
2010-09-30 20:40:45 -07:00
John Koleszar
a047fee606 Merge "Fix loopfilter delta zero transitions" 2010-09-30 10:26:10 -07:00
Fritz Koenig
439b2ecd74 Merge "Optimizations on the loopfilters." 2010-09-29 10:47:01 -07:00
John Koleszar
b9be7a464f Fix loopfilter delta zero transitions
Loopfilter deltas are initialized to zero on keyframes in the decoder.
The values then persist from the previous frame unless an update bit
is set in the bitstream. This data is not included in the entropy
data saved by the 'refresh entropy' bit in the bitstream, so it is
effectively an additional contextual element beyond the 3 ref-frames
and the entropy data.

The encoder was treating this delta update bit as update-if-nonzero,
meaning that the value would be refreshed even if it hadn't changed,
and more significantly, if the correct value for the delta changed
to zero, the update wouldn't be sent, and the decoder would preserve
the last (presumably non-zero) value.

This patch updates the encoder to send an update only if the value
has changed from the previously transmitted value. It also forces the
value to be transmitted in error resilient mode, to account for lost
context in the event of lost frames.

Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868
2010-09-29 13:04:04 -04:00
Fritz Koenig
0964ef0e71 Optimizations on the loopfilters.
- Scheduling for Atom processors
- Combining of macros to allow for better interleaving
- Change from multiplies to adds for main filter
- Use of movhps/movlps to fill xmm registers without
  shifting and orring

Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b
2010-09-28 12:01:34 -07:00
John Koleszar
6804199073 Merge remote branch 'internal/upstream' into HEAD 2010-09-28 00:05:05 -04:00
Timothy B. Terriberry
18dc92fd66 Add 4-tap version of 2nd-pass ARMv6 MC filter.
The existing code applied a 6-tap filter with 0's on either end.
We're already paying the branch penalty to avoid computing the two
 extra columns needed as input to this filter.
We might as well save time computing the filter as well.
This reduces the inner loop from 21 instructions to 16, the number
 of loads per iteration from 4 to 1, and the number of multiplies
 from 7 to 4.
The gain in overall decoding performance, however, is small (less
 than 1%).

This change also means we now valgrind clean on ARMv6, which is
 its real purpose.
The errors reported here were valgrind's fault (it does not detect
 that 0 times an uninitialized value is initialized), but Julian
 Seward says it would slow down valgrind considerably to make such
 checks.
Speeding up libvpx rather, even by a small amount, seems a much
 better idea if only to enable proper valgrind checking of the
 rest of the codec.

Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16
2010-09-27 18:25:45 -07:00
John Koleszar
2b521ab551 move reconintra_mt to decoder (fixup)
Missed the .h file in the move.

Change-Id: Ib408183fbb4d019fd46394b362f89ca6ea9d10bc
2010-09-27 12:48:31 -04:00
Johann
063be9b82a Merge "combine max values and compare once" 2010-09-27 06:39:20 -07:00
John Koleszar
1d0db7134e Merge remote branch 'internal/upstream' into HEAD 2010-09-25 00:05:06 -04:00
Timothy B. Terriberry
e2795e9978 Fix valgrind errors in vp8_sixtap_predict8x4_armv6().
This function was accessing values below the stack pointer, which
 can be corrupted by signal delivery at any time.

Change-Id: I92945b30817562eb0340f289e74c108da72aeaca
2010-09-24 14:34:18 -07:00
Johann
f30e8dd7bd combine max values and compare once
previous implementation compared each set of values to limit and then
&'d them together, requiring a compare and & for each value.

this does the accumulation first, requiring only one compare

Change-Id: Ia5e3a1a50e47699c88470b8c41964f92a0dc1323
2010-09-24 15:42:50 -04:00
John Koleszar
48e76ff4fd move reconintra_mt to decoder (for now)
reconintra_mt.c is only required for building the decoder right now.
It could definitely be used for the encoder in the future, but it
currently depends on decoder only data structures. (onyxd_int.h,
VP8D_COMP, etc). Move it from common/ to decoder/ until the
necessary changes to the common multithread code are complete.

This patch is needed to build with --disable-vp8-decoder.

Change-Id: I568c52221a2b309234d269675cba97131ce35c86
2010-09-24 11:23:06 -04:00
John Koleszar
fbd3db91bb Merge remote branch 'internal/upstream' into HEAD 2010-09-23 00:05:10 -04:00
Johann
7fed3832e7 Remove dead code
The new loopfilter was originally introduced as an experimental change.
It's permanent now.

Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546
2010-09-22 11:07:34 -04:00
John Koleszar
72302f8609 Merge remote branch 'internal/upstream' into HEAD 2010-09-22 00:05:04 -04:00
Yunqing Wang
a23ccf8f8c Merge "Restructure multi-threaded decoder" 2010-09-21 05:00:30 -07:00
John Koleszar
99c611fea6 Merge remote branch 'internal/upstream' into HEAD 2010-09-21 00:05:03 -04:00
Fritz Koenig
b7dc9398f2 Use movq instead of movdqu.
Movdqu is more expensive (throughput, uops) than movq.  Minimal
impact for newer big cores, but ~2.25% gain on Atom.

Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f
2010-09-20 11:34:26 -07:00
Fritz Koenig
8eae7fe7e8 Better choice of instruction filter mask comparision.
Use pmaxub instead of a combination of psubusb/por to
determine if any comparisons go over the limit.

Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82
2010-09-20 10:20:38 -07:00
Yunqing Wang
f857a85088 Restructure multi-threaded decoder
On each MB, loopfiltering is done right after MB decoding. This
combines two loops in multi-threaded code into one, which reduces
number of synchronizations to half.

The above-row/left-col data are saved in temp buffers for
next-row/next MB decoding.

Tests on 4-core gLucid machine showed 10% decoder performance
gain with threads=4 (tulip clip). Testing on other platforms
isn't done yet.

Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
2010-09-17 09:56:05 -04:00
John Koleszar
b1879d9754 Merge remote branch 'internal/upstream' into HEAD 2010-09-15 00:05:04 -04:00
John Koleszar
fe46476e98 Merge remote branch 'internal/upstream' into HEAD 2010-09-14 00:05:04 -04:00
Fritz Koenig
769f2424cc Removed unnecessary pxor.
There is no need to make sure that the lower byte of the
register is 0 because the downshift by 11 overwrites that byte.

Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
2010-09-13 18:34:34 -07:00
Suman Sunkara
be7e4e854c Delta updates to segmentation map using left and above contexts.
-Updates by making use of spatial correlation.
-Checks if the segment_id is same as above or left context and encodes only the update to the map instead of updating individual segment_ids.

Change-Id: Ib861df97e8aa2b37516219eeddcdbaf552b6a249
2010-09-13 10:01:21 -04:00
Fritz Koenig
a65cd3def0 Make block access to frame buffer sequential
Sequentially accessing memory from a low address to a high
address should make it easier for the processor to predict
the cache.

Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
2010-09-10 16:27:28 -07:00
John Koleszar
42cb1d84f2 Merge remote branch 'internal/upstream' into HEAD 2010-09-10 00:05:04 -04:00
Fritz Koenig
6d90f867e4 Merge branch 'master' of git://review.webmproject.org/libvpx 2010-09-09 08:54:21 -07:00
John Koleszar
c2140b8af1 Use WebM in copyright notice for consistency
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.

Fixes issue #97.

Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-09-09 10:01:21 -04:00
Fritz Koenig
3fb37162a8 Bilinear subpixel optimizations for ssse3.
Used pmaddubsw for multiply and add of two filter taps
at once for 16x16 and 8x8 blocks.

Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea
2010-09-07 17:19:40 -07:00
John Koleszar
b6a41d66df Merge remote branch 'internal/upstream' into HEAD 2010-09-04 00:05:04 -04:00
Scott LaVarnway
0de458f6b9 Reduced the size of MB_MODE_INFO
Moved partition_bmi and partition_count out of MB_MODE_INFO and
placed into MACROBLOCK.  Also reduced the size of other members
of the MB_MODE_INFO struct.  For 1080p, the memory was reduced
by 1,209,516 bytes.  The decoder performance appeared to improve
by 3% for the clip used.
Note:  The main goal for this change is to improve the decoder
performance.  The encoder will be revisited at a later date for
further structure cleanup.

Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613
2010-09-03 16:43:23 -04:00