Commit Graph

12 Commits

Author SHA1 Message Date
Yunqing Wang
4db2076594 Add SSE2 subtract functions
Instead of doing 8-bit data unpack and 16-bit subtraction, use
psubb to do 16 8-bit subtractions and pcmpgtb to preserve the
sign information. This does not bring noticable gain since
these functions are not called frequently.

Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e
2010-10-18 14:15:15 -04:00
Fritz Koenig
1dc0ca1340 Fix compiler warning about vp8_fast_quantize_b_impl_ssse2.
Typo had function defined as _ssse2 and prototyped as _sse2.

Change-Id: If9f19da1a83cff40774a90cf936d601c0bf1b7fe
2010-10-13 17:08:13 -07:00
Scott LaVarnway
d860f685b8 Added vp8_fast_quantize_b_sse2
Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into
quantize_sse2.asm and renamed.  Updated the assembly code to
match the C version.

Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200
2010-10-07 11:43:19 -04:00
John Koleszar
c2140b8af1 Use WebM in copyright notice for consistency
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.

Fixes issue #97.

Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-09-09 10:01:21 -04:00
Timothy B. Terriberry
e04e293522 Make the quantizer exact.
This replaces the approximate division-by-multiplication in the
 quantizer with an exact one that costs just one add and one
 shift extra.
The asm versions have not been updated in this patch, and thus
 have been disabled, since the new method requires different
 multipliers which are not compatible with the old method.

Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206
2010-07-23 08:48:01 -07:00
Yaowu Xu
b62d093efa Improve the accuracy of forward walsh-hadamard transform
Besides the slight improvement in round trip error. This
also fixes a sign bias in the forward transform, so the
round trip errors are evenly distributed between +1s and
-1s. The old bias seemed to work well with the dc sign bias
in old fdct,  which no longer exist in the improved fdct.

Change-Id: I8635e7be16c69e69a8669eca5438550d23089cef
2010-06-28 22:10:48 -07:00
Scott LaVarnway
f1a3b1e0d9 Added first-pass sse2 version of Yaowu's new fdct.
Change-Id: Ib479210067510162879c368428b92690591120b2
2010-06-24 16:40:56 -04:00
Yaowu Xu
d0dd01b8ce Redo the forward 4x4 dct
The new fdct lowers the round trip sum squared error for a
4x4 block ~0.12. or ~0.008/pixel. For reference, the old
matrix multiply version has average round trip error 1.46
for a 4x4 block.

Thanks to "derf" for his suggestions and references.

Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79
2010-06-24 13:17:58 -07:00
John Koleszar
94c52e4da8 cosmetics: trim trailing whitespace
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.

Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
2010-06-18 13:06:11 -04:00
Scott LaVarnway
48c84d138f sse2 version of vp8_regular_quantize_b
Added sse2 version of vp8_regular_quantize_b which improved encode
performance(for the clip used) by ~10% for 32 bit builds and ~3% for
64 bit builds.

Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments.

Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af
2010-06-14 14:07:56 -04:00
John Koleszar
09202d8071 LICENSE: update with latest text
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
2010-06-04 16:19:40 -04:00
John Koleszar
0ea50ce9cb Initial WebM release 2010-05-18 11:58:33 -04:00