Commit Graph

8 Commits

Author SHA1 Message Date
Johann
022323bf85 reorder data to use wider instructions
the previous commit laid the groundwork by doing two sets of idcts
together. this moved that further by grouping the interesting data
(q[0], q+16[0]) together to allow using wider instructions. also
managed to drop a few instructions by recognizing that the constant
for sinpi8sqrt2 could be downshifted all the time which avoided a
dowshift as well as workarounds for a function which only accepted
signed data

looks like a modest gain for performance: at qcif, went from ~180
fps to ~183
Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf
2010-09-17 16:47:39 -04:00
Johann
14ba764219 Update NEON wide idcts
Expand 93c32a55 which used SSE2 instructions to do two
idct/dequant/recons at a time to NEON. Initial working
commit. More work needs to be put into rearranging and
interlacing the data to take advantage of quadword
operations, which is when we'll hopefully see a much
better boost

Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1
2010-09-09 14:08:12 -04:00
John Koleszar
c2140b8af1 Use WebM in copyright notice for consistency
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.

Fixes issue #97.

Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-09-09 10:01:21 -04:00
Fritz Koenig
93c32a55c2 Rework idct calling structure.
Moving the eob structure allows for a non-struct based
function to handle decoding an entire mb of
idct/dequant/recon data.  This allows for SIMD functions
to idct/dequant/recon multiple blocks at once.

SSE2 implementation gives 3% gain on Atom.

Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2
2010-08-23 08:58:54 -07:00
Johann
56f5a9a060 update arm idct functions
Jeff Muizelaar posted some changes to the idct/reconstruction c code.
This is the equivalent update for the arm assembly.

This shows a good boost on v6, and a minor boost on neon.
Here are some numbers for highway in qcif, 2641 frames:
HEAD neon: ~161 fps
new neon:  ~162 fps
HEAD v6:   ~102 fps
new v6:    ~106 fps

The following functions have been updated for armv6 and neon:
vp8_dc_only_idct_add
vp8_dequant_idct_add
vp8_dequant_dc_idct_add

Conflicts:

	vp8/decoder/arm/armv6/dequantdcidct_v6.asm
	vp8/decoder/arm/armv6/dequantidct_v6.asm

Resolved by removing these files. When I rewrote the functions, I also
moved the files to dequant_dc_idct_v6.asm/dequant_idct_v6.asm

Change-Id: Ie3300df824d52474eca1a5134cf22d8b7809a5d4
2010-07-26 08:55:19 -04:00
John Koleszar
94c52e4da8 cosmetics: trim trailing whitespace
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.

Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
2010-06-18 13:06:11 -04:00
John Koleszar
09202d8071 LICENSE: update with latest text
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
2010-06-04 16:19:40 -04:00
John Koleszar
0ea50ce9cb Initial WebM release 2010-05-18 11:58:33 -04:00