304 Commits

Author SHA1 Message Date
John Koleszar
1158bcd813 Merge remote branch 'origin/master' into experimental
Change-Id: I20df6781013786cbf56ded31e1c726de6c34bc42
2011-09-16 08:53:06 -04:00
Scott LaVarnway
222c72e50f Merge "Removed bmi copy to/from BLOCKD" 2011-08-31 06:57:20 -07:00
John Koleszar
1765040a26 Merge remote branch 'origin/master' into experimental
Change-Id: Ic9131382306cc18a915f8854ddba33025123968d
2011-08-25 00:05:11 -04:00
Scott LaVarnway
b870947d42 Removed bmi copy to/from BLOCKD
for SPLITMV and B_PRED modes.  Modified code to use the bmi
found in mode_info_context instead of BLOCKD.  On the decode
side, the uvmvs are calculated only when required, instead of
every macroblock.  This is WIP. (bmi should eventually be
removed from BLOCKD)
Small performance gains noticed for RT encodes and decodes.(VGA)

Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7
2011-08-24 14:42:26 -04:00
Fritz Koenig
112bd4e2b4 Fix naming of sse2 idct functions.
Prepend idct function names with vp8_
so that under profiling they show up
associated with libvpx.

Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4
2011-08-24 10:25:32 -07:00
Scott LaVarnway
1de5da80c9 Merge "Faster vp8_default_coef_probs" 2011-08-24 07:52:10 -07:00
John Koleszar
d2a2d5a6d5 Merge remote branch 'origin/master' into experimental
Change-Id: If53ec5c1219b31e5ef9ae552d9cc79432ebda267
2011-08-24 00:05:11 -04:00
Johann
85358d04cd Fix data accesses for simple loopfilters
The data that the simple horizontal loopfilter reads is aligned, treat
it accordingly.

For the vertical, we only use the bottom 4 bytes, so don't read in 16
(and incur the penalty for unaligned access).

This shows a small improvement on older processors which have a
significant penalty for unaligned reads.

postproc_mmx.c is unused

Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
2011-08-23 20:42:45 -04:00
Fritz Koenig
c5f890af2c Use local labels for jumps/loops in x86 assembly.
Prepend . to local labels in assembly code.  This
allows non unique labels within a file.  Also
makes profiling information more informative
by keeping the function name with the loop name.

Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
2011-08-23 09:05:29 -07:00
John Koleszar
c08e65a671 Merge remote branch 'origin/master' into experimental
Change-Id: Iefa9c3e87ff25d92093eb949e23d5a85f1b7de09
2011-08-20 00:05:14 -04:00
John Koleszar
edec5eb5e7 Merge "Copy less when active map is in use" 2011-08-19 07:31:00 -07:00
Alpha Lam
4e8d35a461 Copy less when active map is in use
When active map is specified and the current frame is not a key frame,
golden frame nor a altref frame then copy only those active regions.

This significantly reduces encoding time by as much as 19% on the test
system where realtime encoding is used. This is particularly useful
when the frame size is large (e.g. 2560x1600) and there's only a few
action macroblocks.

Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4
2011-08-19 10:29:41 -04:00
Scott LaVarnway
19987dcbfa Faster vp8_default_coef_probs
Copies from a generated table instead of building the
default coeff probabilities during runtime.

Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
2011-08-16 16:21:21 -04:00
John Koleszar
79d8ffd213 Merge remote branch 'origin/master' into experimental
Change-Id: I215466afda88def40f4a5d81f5b58ec383471346
2011-08-16 00:05:11 -04:00
John Koleszar
e96131705a Revert "Improved 1-pass CBR rate control"
This reverts commit b5ea2fbc2c1554769848774c836aad262af95072. Further
testing showed noticable keyframe popping in some cases, reverting this
for now to give time for a proper fix.

Conflicts:

	vp8/encoder/onyx_if.c
	vp8/encoder/ratectrl.c

Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af
2011-08-12 14:51:36 -04:00
John Koleszar
712762b508 Merge remote branch 'origin/master' into experimental
Change-Id: Ic698ea5f5b31a5faf467eb0da4b762f9586df938
2011-08-05 00:05:05 -04:00
Johann
30e5deae5d update extend frame borders
the neon code made several assumptions which were broken by a recent
change: https://review.webmproject.org/2676

update the code with new assumptions and guard them with a compile time
assert

Change-Id: I32a8378030759966068f34618d7b4b1b02e101a0
2011-08-02 19:26:46 -04:00
John Koleszar
06c3d5bb9a Fix building with --disable-postproc
Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9
2011-08-01 17:50:23 -04:00
John Koleszar
9fbb1d4350 Merge remote branch 'origin/master' into experimental
Change-Id: I1ae82458536ba2f0969e1bea78f41cd16fe96b79
2011-07-27 00:05:06 -04:00
James Zern
b45065d38b cosmetics: consistently use [u]int64_t
Removes mixed usage of (unsigned) long long and INT64.
Fixes Issue #208.

Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a
2011-07-26 11:34:36 -07:00
John Koleszar
3c4a39e71c Merge remote branch 'origin/master' into experimental
Conflicts:
	vp8/decoder/detokenize.c
	vp8/decoder/onyxd_int.h

Change-Id: Idc301ae630dc1aedeb85674ecfdcf1eb28420f81
2011-07-26 10:04:36 -04:00
Yunqing Wang
65dfcf4696 Use CONFIG_FAST_UNALIGNED consistently in codec
CONFIG_FAST_UNALIGNED is enabled by default. Disable it if it is
not supported by hardware.

Change-Id: I7d6905ed79fed918bca074bd62820b0c929d81ab
2011-07-25 10:11:24 -04:00
John Koleszar
e14ad46efa Merge remote branch 'origin/master' into experimental
Change-Id: I0a24d6762598e5fee30f264de1dcd10331c01eac
2011-07-23 00:05:13 -04:00
Johann
773bcc300d Merge "fix sharpness bug and clean up" 2011-07-22 09:34:55 -07:00
Johann
a04ed0e8f3 fix sharpness bug and clean up
sharpness was not recalculated in vp8cx_pick_filter_level_fast

remove last_filter_type. all values are calculated, don't need to update
the lfi data when it changes.

always use cm->sharpness_level. the extra indirection was annoying.

don't track last frame_type or sharpness_level manually. frame type
only matters for motion search and sharpness_level is taken care of in
frame_init

move function declarations to their proper header

Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db
2011-07-22 12:33:57 -04:00
Yunqing Wang
829179e888 Merge "Preload reference area to an intermediate buffer in sub-pixel motion search" 2011-07-22 06:56:15 -07:00
Yunqing Wang
20bd1446c0 Preload reference area to an intermediate buffer in sub-pixel motion search
In sub-pixel motion search, the search range is small(+/- 3 pixels).
Preload whole search area from reference buffer into a 32-byte
aligned buffer. Then in search, load reference data from this buffer
instead. This keeps data in cache, and reduces the crossing cache-
line penalty. For tulip clip, tests on Intel Core2 Quad machine(linux)
showed encoder speed improvement:
  3.4%   at --rt --cpu-used =-4
  2.8%   at --rt --cpu-used =-3
  2.3%   at --rt --cpu-used =-2
  2.2%   at --rt --cpu-used =-1

Test on Atom notebook showed only 1.1% speed improvement(speed=-4).
Test on Xeon machine also showed less improvement, since unaligned
data access latency is greatly reduced in newer cores.

Next, I will apply similar idea to other 2 sub-pixel search functions
for encoding speed > 4.

Make this change exclusively for x86 platforms.

Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
2011-07-22 09:28:06 -04:00
John Koleszar
6907117175 Merge remote branch 'origin/master' into experimental
Change-Id: I956822324c046c254806dd712a2d3be4dcf8564b
2011-07-20 00:05:17 -04:00
Scott LaVarnway
a25f6a9c88 Moved vp8_encode_bool into boolhuff.h
allowing the compiler to inline this function.  For real-time
encodes, this gave a boost of 1% to 2.5%, depending on the
speed setting.

Change-Id: I3929d176cca086b4261267b848419d5bcff21c02
2011-07-19 09:17:25 -04:00
John Koleszar
2614b77fcb Merge remote branch 'origin/master' into experimental
Change-Id: Ida9204624fe3fb99fed1b149d1f88159480fdd83
2011-07-19 00:05:11 -04:00
John Koleszar
b5ea2fbc2c Improved 1-pass CBR rate control
This patch attempts to improve the handling of CBR streams with
respect to the short term buffering requirements. The "buffer level"
is changed to be an average over the rc buffer, rather than a long
running average. Overshoot is also tracked over the same interval
and the golden frame targets suppressed accordingly to correct for
overly aggressive boosting.

Testing shows that this is fairly consistently positive in one
metric or another -- some clips that show significant decreases
in quality have better buffering characteristics, others show
improvenents in both.

Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920
2011-07-18 11:48:05 -04:00
John Koleszar
f1fcd74e3e Merge remote branch 'origin/master' into experimental
Change-Id: Icbeb14d64ed3d9337606b591dde4e0669540a10d
2011-07-15 00:05:06 -04:00
James Berry
6b6f367c3d bug fix vpx_copy_and_extend_frame size issue
vpx_copy_and_extend_frame could incorrectly
resize uv frames which could result in a crash.

Change-Id: Ie96f7078b1e328b3907a06eebeee44ca39a2e898
2011-07-14 15:58:15 -04:00
John Koleszar
86edcb0cc7 Merge remote branch 'origin/master' into experimental
Change-Id: I3f64e220b78738e5261a9fda3c270d51613f4faa
2011-07-14 00:05:12 -04:00
Johann
211694f67e Merge "update x86 asm for loopfilter" 2011-07-13 04:10:03 -07:00
Johann
8f910594bd Merge "Update armv6 loopfilter to new interface" 2011-07-13 04:09:55 -07:00
Johann
1a219c22b1 Merge "Update armv7 loopfilter to new interface" 2011-07-13 04:09:42 -07:00
Johann
d9b825cff2 Merge "New loop filter interface" 2011-07-13 04:09:26 -07:00
Attila Nagy
c231b0175d Update armv6 loopfilter to new interface
Change-Id: I5fe581d797571a7a9432fbd17fc557591d0c1afa
2011-07-12 12:14:51 +03:00
Attila Nagy
283b0e25ac Update armv7 loopfilter to new interface
Change-Id: I65105a9c63832669237e6a6a7fcb4ea3ea683346
2011-07-12 12:12:25 +03:00
John Koleszar
6058c9ce0c Merge remote branch 'origin/master' into experimental
Change-Id: Ica63d16cb39e2d65a3414f0b9f86c8a64112dfa3
2011-07-09 00:05:09 -04:00
Johann
01433c5043 update x86 asm for loopfilter
Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce
2011-07-08 09:23:38 -04:00
Johann
6ae12c415e Merge "clean up warnings when building arm with rtcd" 2011-07-08 05:16:09 -07:00
Attila Nagy
622958449b New loop filter interface
Separate simple filter with reduced no. of parameters.
MB filter level picking based on precalculated table. Level table updated for
each frame. Inside and edge limits precalculated and updated just when
sharpness changes. HEV threshhold is constant.
ARM targets use scalars and others vectors.

Change works only with --target=generic-gnu
All other targets have to be updated!

Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c
2011-07-08 09:31:41 +03:00
John Koleszar
61d9d595d5 Merge remote branch 'origin/master' into experimental
Change-Id: I009c7e3043ad1eb1ce95c69132a4727073b86757
2011-07-02 00:05:12 -04:00
John Koleszar
b4f70084cc Merge "Properly use GET_GOT/RESTORE_GOT when using GLOBAL()." 2011-07-01 07:14:34 -07:00
Ronald S. Bultje
c8a23ad3f4 Properly use GET_GOT/RESTORE_GOT when using GLOBAL().
This should fix binaries using PIC on x86-32. Also should
fix issue 343.

Change-Id: I591de3ad68c8a8bb16054bd8f987a75b4e2bad02
2011-06-30 14:04:27 -07:00
John Koleszar
6251e9e5ce Merge remote branch 'origin/master' into experimental
Change-Id: I35c9ca116aecd0d03e762942d9cf1289edb4f23d
2011-06-30 00:05:10 -04:00
Johann
6611f66978 clean up warnings when building arm with rtcd
Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df
2011-06-29 10:51:41 -04:00
John Koleszar
f3a13cb236 Merge "Use MAX_ENTROPY_TOKENS and ENTROPY_NODES more consistently" 2011-06-29 07:29:59 -07:00