John Koleszar
ece724ae16
Merge "Remove unused vp9_build_intra_predictors_sb{y,uv}_s"
2013-06-24 15:08:58 -07:00
John Koleszar
ee4a7e4e46
Merge "Remove unused vp9_model_to_full_probs_sb()"
2013-06-24 15:08:54 -07:00
Scott LaVarnway
dfa2ecc3f1
Changed size of mb_mode_context to 8 bits
...
This reduced the size of the MODE_INFO array (mip and prev_mip)
by 425,568 bytes each for 1080p resolutions.
Change-Id: Ifa513ec2d0a49e8ec0867ec90620762fb7f1261d
2013-06-24 17:11:16 -04:00
Ronald S. Bultje
3c4abbe454
Add SAD unit tests for all rectangular sizes.
...
Change-Id: I47e81b51f072abdb276bdec85423febba34b5f81
2013-06-24 14:05:13 -07:00
Ronald S. Bultje
4dc70fa7f9
Don't re-allocate comp_pred buffers for each call to comp motion search.
...
Instead, just allocate a few bytes on the stack, this is 4k, which isn't
all that much.
Change-Id: I82af6ee89e6ed01faaa23ff891ee7ced76df8c16
2013-06-24 14:05:13 -07:00
Yaowu Xu
93f88ab55a
Merge "Fix loopfilter of leftmost 4x4 edges in SB"
2013-06-24 09:55:21 -07:00
John Koleszar
858475a03a
Fix loopfilter of leftmost 4x4 edges in SB
...
For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.
Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
2013-06-24 08:26:00 -07:00
Ronald S. Bultje
4eb8c56587
Merge "Allocate memory using appropriate expected alignment in unit tests."
2013-06-21 21:22:55 -07:00
James Zern
c2fa8390f6
I420VideoSource: normalize framerate types
...
ctor inputs are ints as are vpx_rational_t members
Change-Id: I62a39bf3df123727a872e40b74e3ee9e55ef2ede
2013-06-21 19:34:51 -07:00
James Zern
f6d293adf6
intrapred_test: add virtual dtor to IntraPredBase
...
classes with virtual functions should have virtual destructors
Change-Id: If54e2f8384f0bfcbf812cc727eb9d0a586173674
2013-06-21 19:33:50 -07:00
Ronald S. Bultje
ac6ea2ab91
Allocate memory using appropriate expected alignment in unit tests.
...
Fixes crashes of test_libvpx on 32-bit Linux.
Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
2013-06-21 17:03:57 -07:00
John Koleszar
0c8e13d2f8
Merge "Add some unaligned test vectors"
2013-06-21 16:31:18 -07:00
John Koleszar
9e7019f7df
Remove unused vp9_build_intra_predictors_sb{y,uv}_s
...
The functions no longer referenced.
Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf
2013-06-21 16:10:05 -07:00
Ronald S. Bultje
98188e0e82
Merge "Remove emms - that shouldn't be there."
2013-06-21 15:53:25 -07:00
John Koleszar
5c32215e27
Remove unused vp9_model_to_full_probs_sb()
...
This function never referenced.
Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc
2013-06-21 15:38:55 -07:00
Dmitry Kovalev
f27f76dfb3
Transforming scale_mv_component_q4 into scale_mv_q4 function.
...
Using MV instead of int_mv for function arguments.
Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6
2013-06-21 15:34:29 -07:00
Ronald S. Bultje
fc033b38ee
Remove emms - that shouldn't be there.
...
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
2013-06-21 14:45:04 -07:00
James Zern
cc774c8bb0
variance_test: use REGISTER_STATE_CHECK
...
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
2013-06-21 14:30:08 -07:00
Dmitry Kovalev
40141681c0
Removing find_seg_id and using vp9_get_pred_mi_segid instead.
...
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
2013-06-21 13:05:10 -07:00
Ronald S. Bultje
ba42c02654
Add missing SECTION .text marker in assembly file.
...
Fixes a crash on Windows when building with MSVC.
Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
2013-06-21 12:55:46 -07:00
Ronald S. Bultje
54b2a59623
Implement SSE2 block_error.
...
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.
Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.
Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
2013-06-21 12:54:52 -07:00
Ronald S. Bultje
7756e9892b
Merge "Add subtract_block SSE2 version and unit test."
2013-06-21 12:49:50 -07:00
Ronald S. Bultje
9a480482cb
Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()."
2013-06-21 12:49:43 -07:00
Ronald S. Bultje
25c588b1e4
Add subtract_block SSE2 version and unit test.
...
3% faster overall (3min35.0 to 3min28.5).
Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
2013-06-21 09:35:37 -07:00
Yaowu Xu
869d770610
Merge "Get some speed back for cpuused 1"
2013-06-20 22:37:01 -07:00
Yaowu Xu
45e25a7814
Get some speed back for cpuused 1
...
and remove unused code.
Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
2013-06-20 19:05:18 -07:00
Yaowu Xu
61721181ec
Merge "rename variables to avoid build error in MSVC"
2013-06-20 19:04:30 -07:00
Yaowu Xu
ee07a261a0
rename variables to avoid build error in MSVC
...
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
2013-06-20 18:31:48 -07:00
Yaowu Xu
e6cd5ed307
Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes."
2013-06-20 17:42:50 -07:00
Ronald S. Bultje
1e6a32f1af
SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance().
...
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.
Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
2013-06-20 15:59:48 -07:00
Jim Bankoski
84490a1f3d
Merge "clean out libvpx-srcs.txt if built"
2013-06-20 15:10:16 -07:00
Jim Bankoski
975df8c729
clean out libvpx-srcs.txt if built
...
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
2013-06-20 15:05:42 -07:00
James Zern
43d04ef93b
Merge "Revert "test_libvpx: disable pthreads in gtest""
2013-06-20 15:02:27 -07:00
Frank Galligan
c259af4f73
Fix win64 warning.
...
- size_t vs int.
Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
2013-06-20 14:07:11 -07:00
James Zern
f2dc38256d
Revert "test_libvpx: disable pthreads in gtest"
...
This reverts commit 90a9900abb79fabfd44189a959d14ca677c2777a
Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6
Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
2013-06-20 12:49:15 -07:00
Jingning Han
4f4713b417
Merge "Add unit tests for 4x4 ADST"
2013-06-20 10:22:40 -07:00
Johann
0373e517f7
Merge "Cast value to avoid size_t/int warning on win64"
2013-06-20 10:19:39 -07:00
Dmitry Kovalev
8283d893eb
Merge "Renaming 'nmv' to 'mv' for several functions."
2013-06-20 10:17:12 -07:00
Dmitry Kovalev
77186ee61a
Merge "Function decomposition inside vp9_decodemv.c file."
2013-06-20 10:17:05 -07:00
Deb Mukherjee
7947a33d72
Improving model rd with variance and quant step
...
Improves the rd modeling function and implements them using interpolation
from a table which is a little faster. Also uses sse as input to the
modeling function rather than var - since there is no dc prediction
used and as a result the sse works a little better.
derfraw300: +0.05%
Speedup: ~1%
Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
2013-06-20 10:06:28 -07:00
Johann
d94aee6854
Cast value to avoid size_t/int warning on win64
...
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from
'size_t' to 'int'
Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
2013-06-20 09:52:08 -07:00
Jim Bankoski
9f2a1ae23e
adds force partitioning greater than or less than block size
...
adds a new speed feature to force partitioning to be greater than
or less than a certain size
Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
2013-06-20 09:51:42 -07:00
Jim Bankoski
18bdf708e7
adds a set partitioning to speed features
...
this feature lets you set a partitioning size to be used by the entire
frame.
Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
2013-06-20 09:50:44 -07:00
Jim Bankoski
476d73d294
partition by variance using var from last frame
...
This uses variance to split partition. Variance is calculated using
nearest mv, always from last ref frame.
Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
2013-06-20 09:48:22 -07:00
Jim Bankoski
1f94b97694
convert all speed things to speed features
...
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
2013-06-20 09:42:44 -07:00
Jim Bankoski
727fa7b1e4
new partition via variance
...
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
2013-06-20 09:42:05 -07:00
Jim Bankoski
0fad6a9d99
fix to set up new speed feature
...
This uses the speed feature functionality for code.
Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
2013-06-20 09:35:02 -07:00
Jim Bankoski
df2314cfdd
don't copy partitions for key frames or altrefs
...
force us to go through slow partitioning for keyframes, altref and
overlays.
Change-Id: I1a286361bf74083e71973575a7296be46eb98742
2013-06-20 09:34:32 -07:00
Ronald S. Bultje
8fb6c58191
Implement sse2 and ssse3 versions for all sub_pixel_variance sizes.
...
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
2013-06-20 09:34:25 -07:00
Jim Bankoski
f954490bbf
disable speed > 1 speed corrections in firstpass
...
need to rework these
Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
2013-06-20 09:34:03 -07:00