2064 Commits

Author SHA1 Message Date
John Koleszar
8137e24f3d Merge "Move vp9_counts_to_nmv_context to encoder" 2013-06-25 22:44:21 -07:00
John Koleszar
7bbb0633cd Merge "Move vp9_full_to_model_counts to encoder" 2013-06-25 22:44:16 -07:00
Jingning Han
3cc8c8c3a0 Merge "Refactor intra predictor block" 2013-06-25 19:46:55 -07:00
Jingning Han
d19ea3861d Refactor intra predictor block
Remove vp9_intra4x4_predict(). Use the common intra prediction
function for all block sizes.

Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560
2013-06-25 16:33:13 -07:00
Dmitry Kovalev
6fb10f2de4 Renaming "nmv" to "mv".
Change-Id: I8299f55c3b930221e52c2237f2ddea65b94fd33b
2013-06-25 15:19:18 -07:00
Dmitry Kovalev
dc0f457c94 Using get_plane_block_{width, height} instead of custom code.
Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9
2013-06-25 14:11:18 -07:00
Ronald S. Bultje
0441e0a2fc Merge "Only do metrics on cropped (visible) area of picture." 2013-06-25 13:51:18 -07:00
Ronald S. Bultje
1d0ae2e63c Merge "Don't skip right/bottom border pixels in SSIM calculations." 2013-06-25 13:51:04 -07:00
Ronald S. Bultje
c5be54eef3 Merge "Add averaging-SAD functions for 8-point comp-inter motion search." 2013-06-25 13:50:53 -07:00
Jingning Han
d52c359d43 Merge "Tune the rounding operations in 8x8 ADST/DCT sse2" 2013-06-25 13:17:05 -07:00
Ronald S. Bultje
450c7b57a8 Only do metrics on cropped (visible) area of picture.
The part where we align it by 8 or 16 is an implementation detail that
shouldn't matter to the outside world.

Change-Id: I9edd6f08b51b31c839c0ea91f767640bccb08d53
2013-06-25 12:57:28 -07:00
Ronald S. Bultje
44f349df62 Don't skip right/bottom border pixels in SSIM calculations.
Change-Id: I75acb55ade54bef6ad7703ed5e691581fa2f8fe1
2013-06-25 12:57:28 -07:00
Ronald S. Bultje
c24d922396 Add averaging-SAD functions for 8-point comp-inter motion search.
Encoding the first 50 frames of bus @ 1500kbps goes from 3min22.7 to
3min18.2, i.e. 2.3% faster. In addition, use the sub_pixel_avg functions
to calculate the variance of the averaging predictor. This is slightly
suboptimal because the function is subpixel-position-aware, but (at least
in the SSE2 version) it does not actually apply a bilinear filter at a
full-pixel position, so performance is approximately the same as if we
had implemented an actual average-aware full-pixel variance function.
That gains another 0.3 seconds (i.e. encode time goes to 3min17.4), thus
leading to a total gain of 2.7%.

Change-Id: I3f059d2b04243921868cfed2568d4fa65d7b5acd
2013-06-25 12:57:28 -07:00
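The idea behind an averaging SAD, as described in the commit above, can be illustrated with a minimal scalar sketch. The function name `sad_avg`, its parameter layout, the rounded average `(a + b + 1) >> 1`, and the assumption that the second predictor is stored packed (stride equal to the block width) are all illustrative choices here, not taken from the libvpx sources.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences between the
 * source block and the rounded average of two predictors. The second
 * predictor is assumed to be packed, i.e. its stride equals `width`. */
static unsigned int sad_avg(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride,
                            const uint8_t *second_pred,
                            int width, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      /* Rounded average of the two predictors (assumed rounding rule). */
      const int avg = (ref[x] + second_pred[x] + 1) >> 1;
      sad += (unsigned int)abs(src[x] - avg);
    }
    src += src_stride;
    ref += ref_stride;
    second_pred += width;
  }
  return sad;
}
```

Folding the averaging into the SAD loop avoids building the averaged predictor in a separate pass before each comparison, which is the kind of saving the commit message describes.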
Jingning Han
0084e61d5f Tune the rounding operations in 8x8 ADST/DCT sse2
Improve the round-trip precision to meet the unit test settings.

Change-Id: I303febae56b4b990ea3798b8ebed94c0510ecf79
2013-06-25 12:02:26 -07:00
Ronald S. Bultje
5ebe47747d Merge "Don't re-allocate comp_pred buffers for each call to comp motion search." 2013-06-25 12:00:36 -07:00
Dmitry Kovalev
9467571777 Moving subexp encoding functions into a separate vp9_dsubexp.c file.
Change-Id: Idbb2ea80f764fa830fe2ddcfc54ef7fe232f05a8
2013-06-25 11:53:17 -07:00
Dmitry Kovalev
5ae096778e Merge "Removing unused code." 2013-06-25 11:50:55 -07:00
Jingning Han
cd6932db77 Merge "Add 8x8 dct/adst unit tests" 2013-06-25 11:21:17 -07:00
Yaowu Xu
c2e3ee13e7 Merge "Changed size of mb_mode_context to 8 bits" 2013-06-25 10:44:47 -07:00
Scott LaVarnway
855e23ce8c Merge "Small mode_info_context cleanup in filter_block_plane" 2013-06-25 10:34:19 -07:00
Dmitry Kovalev
87ee34aacb Removing unused code.
Removing block index (ib) parameter from get_tx_type_{8x8, 16x16}
functions.

Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1
2013-06-25 10:17:19 -07:00
Dmitry Kovalev
70e9622185 Merge "Removing find_seg_id and using vp9_get_pred_mi_segid instead." 2013-06-25 10:16:06 -07:00
Dmitry Kovalev
529679bd52 Merge "Transforming scale_mv_component_q4 into scale_mv_q4 function." 2013-06-25 10:15:33 -07:00
Jingning Han
ab362621fe Add 8x8 dct/adst unit tests
This commit enables 8x8 DCT and hybrid transform unit tests. It
also tunes the forward hybrid transform rounding operations for
more precise round-trip performance.

Change-Id: If05c1ce59d75d641b9c6c91527d02d3a6ef498c3
2013-06-25 09:57:01 -07:00
Jingning Han
67365520e7 Merge "Use aligned buffer operations in 8x8/16x16 2D-DCT" 2013-06-25 09:49:03 -07:00
Scott LaVarnway
c787f40bc4 Small mode_info_context cleanup in filter_block_plane
Unnecessary updates to xd->mode_info_context.

Change-Id: I36d2d68ca48366f727548526726b1b5437f62968
2013-06-25 12:28:50 -04:00
Yaowu Xu
b9c934df8e Merge "Enable sse2 implementation of 8x8 ADST/DCT" 2013-06-25 09:13:22 -07:00
Yaowu Xu
ca976db44d Merge "change to enable use_largest_txform feature" 2013-06-25 09:07:01 -07:00
Jingning Han
82d504b50f Use aligned buffer operations in 8x8/16x16 2D-DCT
This reduces 16x16 2D-DCT runtime from 865 cycles to 837 cycles.

Change-Id: I137758b81cd127b936175284310e81378db64552
2013-06-24 19:56:23 -07:00
Jingning Han
a32a086d23 Enable sse2 implementation of 8x8 ADST/DCT
This commit makes use of the butterfly structure to enable the sse2
version implementation of 8x8 ADST/DCT hybrid transform coding.

The runtime of the hybrid transform module goes down from 1170 cycles
to 245 cycles. The overall speed-up is around 1.5%.

Change-Id: Ic808ffd21ece8a9d0410d8c0243d7b6c28ac3b3f
2013-06-24 18:41:33 -07:00
Yaowu Xu
e371cd73a3 change to enable use_largest_txform feature
for all regular inter frames at speed 1

Change-Id: I0a8b301273ecf2b8730ab1f6b7a05f89f4d498e0
2013-06-24 16:43:26 -07:00
John Koleszar
4ecd6dbead Move vp9_counts_to_nmv_context to encoder
This function is only used from within vp9_encodemv.c.

Change-Id: Ib3fc7c30b1e2d27321397ac474cbc8976bc1f4b1
2013-06-24 15:58:18 -07:00
John Koleszar
08b1798ae7 Move vp9_full_to_model_counts to encoder
This function is not called from the decoder, so it doesn't need to be
in common/.

Change-Id: I6977dd462a25b4ff39c9c7e1b0b5b16aa58ee733
2013-06-24 15:46:15 -07:00
John Koleszar
ece724ae16 Merge "Remove unused vp9_build_intra_predictors_sb{y,uv}_s" 2013-06-24 15:08:58 -07:00
John Koleszar
ee4a7e4e46 Merge "Remove unused vp9_model_to_full_probs_sb()" 2013-06-24 15:08:54 -07:00
Scott LaVarnway
dfa2ecc3f1 Changed size of mb_mode_context to 8 bits
This reduced the size of the MODE_INFO array (mip and prev_mip)
by 425,568 bytes each for 1080p resolutions.

Change-Id: Ifa513ec2d0a49e8ec0867ec90620762fb7f1261d
2013-06-24 17:11:16 -04:00
Ronald S. Bultje
4dc70fa7f9 Don't re-allocate comp_pred buffers for each call to comp motion search.
Instead, just allocate the buffer on the stack. This is 4k, which isn't
all that much.

Change-Id: I82af6ee89e6ed01faaa23ff891ee7ced76df8c16
2013-06-24 14:05:13 -07:00
John Koleszar
858475a03a Fix loopfilter of leftmost 4x4 edges in SB
For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.

Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
2013-06-24 08:26:00 -07:00
John Koleszar
9e7019f7df Remove unused vp9_build_intra_predictors_sb{y,uv}_s
These functions are no longer referenced.

Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf
2013-06-21 16:10:05 -07:00
John Koleszar
5c32215e27 Remove unused vp9_model_to_full_probs_sb()
This function is never referenced.

Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc
2013-06-21 15:38:55 -07:00
Dmitry Kovalev
f27f76dfb3 Transforming scale_mv_component_q4 into scale_mv_q4 function.
Using MV instead of int_mv for function arguments.

Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6
2013-06-21 15:34:29 -07:00
Ronald S. Bultje
fc033b38ee Remove emms - that shouldn't be there.
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
2013-06-21 14:45:04 -07:00
Dmitry Kovalev
40141681c0 Removing find_seg_id and using vp9_get_pred_mi_segid instead.
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
2013-06-21 13:05:10 -07:00
Ronald S. Bultje
ba42c02654 Add missing SECTION .text marker in assembly file.
Fixes a crash on Windows when building with MSVC.

Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
2013-06-21 12:55:46 -07:00
Ronald S. Bultje
54b2a59623 Implement SSE2 block_error.
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.

Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.

Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
2013-06-21 12:54:52 -07:00
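The overflow concern that motivates the 64-bit return value in the commit above can be shown with a minimal scalar sketch. The name `block_error_64` and the 16-bit coefficient type are assumptions made for illustration; this is not the vp9_block_error() signature.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: sum of squared differences between the
 * original and dequantized coefficients, accumulated in 64 bits.
 * Coefficients are assumed to be 16-bit here. */
static int64_t block_error_64(const int16_t *coeff, const int16_t *dqcoeff,
                              size_t n) {
  int64_t error = 0;
  for (size_t i = 0; i < n; i++) {
    const int32_t diff = (int32_t)coeff[i] - (int32_t)dqcoeff[i];
    /* Widen before multiplying: diff * diff can exceed INT32_MAX. */
    error += (int64_t)diff * diff;
  }
  return error;
}
```

With 16-bit inputs a single squared difference can already reach 65535 * 65535, which does not fit in a signed 32-bit integer, so both the product and the running sum are kept in 64 bits.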
Ronald S. Bultje
7756e9892b Merge "Add subtract_block SSE2 version and unit test." 2013-06-21 12:49:50 -07:00
Ronald S. Bultje
9a480482cb Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()." 2013-06-21 12:49:43 -07:00
Ronald S. Bultje
25c588b1e4 Add subtract_block SSE2 version and unit test.
3% faster overall (3min35.0 to 3min28.5).

Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
2013-06-21 09:35:37 -07:00
Yaowu Xu
869d770610 Merge "Get some speed back for cpuused 1" 2013-06-20 22:37:01 -07:00
Yaowu Xu
45e25a7814 Get some speed back for cpuused 1
and remove unused code.

Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
2013-06-20 19:05:18 -07:00