Commit Graph

2408 Commits

Author SHA1 Message Date
Tom Finegan
4fffefe189 Merge "Fix avx builds on macosx with clang 5.0." 2014-04-09 13:03:26 -07:00
Dmitry Kovalev
5ed83c3220 Merge "Converting set_prev_mi() to get_prev_mi()." 2014-04-09 10:27:05 -07:00
Yunqing Wang
2e7d327789 Merge "Use source frame difference to make partition decision" 2014-04-09 10:26:42 -07:00
Yunqing Wang
3a6670fcf8 Fix encoder uninitialized read errors reported by drmemory
This patch fixed the uninitialized read errors in Issue 748:
"dr memory VP9 encode errors". In vp9_convolve_avg_sse2,
when width is 4, pavgb reads 8 bytes from dst buffer that is
out of range. An error is reported although the data is not
actually used later. This issue was resolved by preventing
uninitialized reads.

Change-Id: I109a54910aa47139cb13119de86f2062cff207df
2014-04-09 09:59:15 -07:00
Tom Finegan
f600b50a6e Fix avx builds on macosx with clang 5.0.
The macosx release of clang v5.0 identifies itself as:
Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)

This version of clang uses the older _mm_broadcastsi128_si256, like
v3.3, as given away in the LLVM svn version above.

Change-Id: I4d6d59d5454efd57d2ae9e75f5eb7486af7cbd0c
2014-04-08 18:56:03 -07:00
Yunqing Wang
4e66293fcb Use source frame difference to make partition decision
Calculate the difference variance between last source frame and
current source frame. The variance is calculated at 16x16 block
level. The variances are compared to several thresholds to decide
final partition sizes.

An adaptive strategy is implemented to decide using
SOURCE_VAR_BASED_PARTITION or FIXED_PARTITION based on motions
in the video. The switching test is done once every
search_type_check_frequency frames.

The selection of source_var_thresh needs to be investigated
further later.

RTC set Borg test showed 0.424% overall psnr gain, and 0.357%
ssim gain. For clips with large enough static area, the
encoding speedup is around 2% to 15%.

Change-Id: Id7d268f1d8cbca7fb8026aa4a53b3c77459dc156
2014-04-08 17:03:02 -07:00
Deb Mukherjee
d35df2d8ea High-level hooks for Profile 2 (10/12 bit)
Adds some high-level hooks for profile 2 before further
progress on the implementation.

According to the definitiion in this patch:
1. Profile 2 only supports 10 or 12 bit color but not 8
2. Profile 2 supports all color sampling modes: 444, 422 and 420,
and alpha plane.
3. Profile 3 is currently undefined.

Please consider the definition carefully and suggest modifications
to the definition as needed.

Change-Id: I5b284fc679e54ac5aee171af72fa7994cfd28995
2014-04-08 16:18:34 -07:00
Dmitry Kovalev
22a3e30790 Converting set_prev_mi() to get_prev_mi().
Change-Id: Iad4002d7aecaae0e25d88e286bacde7e6cd7264f
2014-04-07 16:01:34 -07:00
Dmitry Kovalev
b5e12dda52 Cleaning up vp9_{cx, dx}_iface.c files.
Change-Id: Ib4e31ba74c4b882bd93942ef743f4a189892738d
2014-04-07 10:38:51 -07:00
Dmitry Kovalev
a9f324fa7f Removing interp_kernel from MACROBLOCKD.
Now interp_kernel is obtained when it is really required (based on
mbmi->interp_filter value).

Change-Id: I4c7a93c179d1045eba16e7526c293d02c9b8b47e
2014-04-03 15:28:42 -07:00
Dmitry Kovalev
8b8606a737 Merge "Cleaning up vp9_mvref_common.c." 2014-04-02 11:03:36 -07:00
Dmitry Kovalev
68027a0b8a Merge "Grouping members in MB_MODE_INFO struct." 2014-04-02 11:00:58 -07:00
Dmitry Kovalev
86f44a91f4 Renaming two members in MACROBLOCKD struct.
Renames:
  mi_8x8 -> mi
  mode_info_stride -> mi_stride

Change-Id: I66f3e5fd1e7b7f46f108af5bb711c5fd9493c1be
2014-04-01 17:46:40 -07:00
Dmitry Kovalev
d42976c515 Common configuration for MACROBLOCKD struct.
Change-Id: Ie2ea9dd8bd338cc9fe12ca9033df64f7644c68b3
2014-04-01 10:57:59 -07:00
Dmitry Kovalev
20d868f05d Grouping members in MB_MODE_INFO struct.
Change-Id: Ia6d7e7a08810e0c3401da4d10266828d560e6851
2014-03-28 17:44:13 -07:00
Yaowu Xu
4f857bacd2 [BITSTREAM]Fix the scaling calculation
For very large size video image, the scaling calculation may need use
value beyond the range of int. This commit upgrade the value to 64bit
to make sure the calculation do not wrap around INT_MAX.

The change corrected the decoder behavior.

The bug affects only very large resolution video because the scaling
calculation was sufficient for image size smaller than 2^13.

This resolves issue:
https://code.google.com/p/webm/issues/detail?id=750

Change-Id: I2d2ed303ca6482f31f819f3c07d6d3e98ef3adc5
2014-03-28 16:40:29 -07:00
Dmitry Kovalev
03349d2ba2 Moving dqcoeff array to MACROBLOCKD in decoder.
Change-Id: I3e20c0cdb9d2437bddf21afb255855f2dead8e02
2014-03-28 10:36:16 -07:00
Dmitry Kovalev
38053687bc Cleaning up vp9_mvref_common.c.
Change-Id: I4eb815156ecaab02c9182e6e1abbea0e4d86c441
2014-03-27 17:50:02 -07:00
Dmitry Kovalev
0437575848 Merge "Removing prev_mi_8x8 from MACROBLOCKD." 2014-03-26 15:45:11 -07:00
Dmitry Kovalev
38c2d37b9d Merge "Cleaning up vp9_entropymv.c." 2014-03-26 14:28:45 -07:00
Dmitry Kovalev
63f86c149a Removing prev_mi_8x8 from MACROBLOCKD.
Change-Id: I32beb5f18c10b5771146c55933b5555487f53633
2014-03-26 10:50:34 -07:00
Dmitry Kovalev
ed39c40a2e Moving above_context to VP9_COMMON.
Change-Id: I713af99d1e17e05a20eab20df51d74ebfd1a68d2
2014-03-25 10:40:08 -07:00
Yaowu Xu
34a3628a45 Merge "Fixed a build issue" 2014-03-25 10:22:18 -07:00
Yaowu Xu
59872069d2 Merge "Change back the scaling calculation." 2014-03-25 09:48:21 -07:00
Yaowu Xu
8051563972 Fixed a build issue
Adding the missed include file.

Change-Id: I7e48df6b0633afbebaf1ccb3062ae404e7203dc9
2014-03-25 09:45:54 -07:00
Dmitry Kovalev
5b8c834c1a Initialization code cleanup.
Change-Id: I47a8b4bf9a6cc0063d1a6785eaaad641d0659e24
2014-03-24 12:21:22 -07:00
Dmitry Kovalev
49bb6df0e2 Cleaning up vp9_entropymv.c.
Change-Id: I01b3530779da89acb84c71bac5ccac456f00c5ac
2014-03-24 11:02:27 -07:00
Yunqing Wang
b458bb7c20 Merge "AVX2 SAD Optimization:" 2014-03-24 10:52:32 -07:00
Dmitry Kovalev
ac5bdc0ed8 Merge "Cleaning up vp9_loopfilter.c." 2014-03-24 09:02:06 -07:00
hkuang
22232ec602 Change back the scaling calculation.
Let the calculation to be compatible with Google's HW implementation.

Change-Id: I22e179888cdb0419e230351c0a47661b37051fef
2014-03-24 08:32:56 -07:00
Dmitry Kovalev
9895c9d4dd Merge "Removing redundant {above, left}_seg_context manipulation code." 2014-03-22 22:31:48 -07:00
Dmitry Kovalev
2786938a3c Merge "Renaming and making vp9_update_mode_info_border() static." 2014-03-21 21:19:18 -07:00
Dmitry Kovalev
58cc06f9b3 Cleaning up vp9_loopfilter.c.
Change-Id: I7c7cf7d3c7b00d1c74ffa8aa8fb8d78a0e48326f
2014-03-21 16:31:15 -07:00
Frank Galligan
8345e76d61 Merge "Fix libvpx VP9 decoder dr memory errors" 2014-03-21 15:24:39 -07:00
Dmitry Kovalev
e141f10bfc Renaming and making vp9_update_mode_info_border() static.
Change-Id: Ibb72a29cae9ca9443aae56fc4c5458d190eae279
2014-03-21 14:02:25 -07:00
levytamar82
0fa8b668c1 AVX2 SAD Optimization:
2 functions were optimized for avx2 by using full 256 bit register
In order to handle 32 elements in parallel instead of only 16 in parallel:
1. vp9_sad32x32x4d
2. vp9_sad64x64x4d

The function level gain is 66% and the user level gain is ~1%.

Change-Id: I4efbb3bc7d8bc03b64b6c98f5cd5c4a9dd3212cb
2014-03-21 13:53:32 -07:00
Yunqing Wang
9b5df3fabe Fix libvpx VP9 decoder dr memory errors
Fixed dr memory errors reported in Issue 736:
https://code.google.com/p/webm/issues/detail?id=736

All elements in left_col buffer need to be initialized to ensure
the correctness of SIMD operations in x86 optimized code.

Change-Id: I8e7f26ab45cca8099c1f9342bcf852f828bda7e4
2014-03-21 12:23:47 -07:00
Dmitry Kovalev
4cb37bff96 Removing redundant {above, left}_seg_context manipulation code.
Change-Id: Ib3c1746e61220c629cbd971b2458aa686b5c9e36
2014-03-21 12:12:55 -07:00
Dmitry Kovalev
a57de9da03 Merge "Reusing {above, left}_seg_context vars in both encoder and decoder." 2014-03-21 12:02:42 -07:00
Yaowu Xu
46c71e5eba Merge "Remove duplicate declaration" 2014-03-21 08:44:04 -07:00
Dmitry Kovalev
7ad40117f1 Reusing {above, left}_seg_context vars in both encoder and decoder.
Change-Id: Id1fa36c92cb007b73a450cc8552e810cedad38b9
2014-03-20 16:15:57 -07:00
Dmitry Kovalev
03781ff22d Merge "Removing mi_stream." 2014-03-20 13:43:13 -07:00
Dmitry Kovalev
4b37dc8d87 Adding alloc_mi() function.
Change-Id: I3b944884c048f589c86e0169aeb3c3855bc8b729
2014-03-19 13:31:47 -07:00
Yaowu Xu
7ef16efca1 Remove duplicate declaration
Change-Id: Ic8e52a89e0df816c38cd8ff1b7c53862b9a6dff2
2014-03-19 12:23:32 -07:00
Yaowu Xu
8cb59992e8 Merge "Fix the md5 mismatch for some scale cases." 2014-03-19 11:13:28 -07:00
Dmitry Kovalev
8ccfcb765f Removing mi_stream.
Change-Id: If674140e30c223c88894b983fd22a583efb99dcf
2014-03-19 10:47:32 -07:00
Dmitry Kovalev
b8bc2d337a Fixing warnings/errors from c++ compiler.
Change-Id: Ia561dda53f2dd10e3a10a2df2adb8027ab19397a
2014-03-18 10:47:51 -07:00
hkuang
1f7e4856f8 Fix the md5 mismatch for some scale cases.
Fixes issue #731
Change-Id: Id313e84b8fb4ff20f6a4e1ed11cb601927888318
2014-03-17 11:21:43 -07:00
Dmitry Kovalev
7c6337ba9e Merge "Adding vp9_swap_mi_and_prev_mi() function." 2014-03-13 17:47:27 -07:00
Dmitry Kovalev
d8e5564129 Using MB_PREDICTION_MODE enum instead of int.
Change-Id: I652d17f7bff84f75d015f4f39652472e14eb3134
2014-03-13 15:03:00 -07:00
Dmitry Kovalev
e65c564c78 Adding vp9_swap_mi_and_prev_mi() function.
Change-Id: I18b3939f0b51085cdd25c9182c3a9c7536ca7e3e
2014-03-13 13:55:33 -07:00
Dmitry Kovalev
3dca8ca7af Merge "Renaming mode2txfm_map to intra_mode_to_tx_type_lookup." 2014-03-12 23:29:29 -07:00
Yaowu Xu
17256ad763 Revert "With on demand border extension, clamping the MV"
This reverts commit b0fec6ab4a.

Change-Id: I9acd8ee0423f22d92138f11579611ff959331013
2014-03-12 19:40:15 -07:00
Yaowu Xu
acf2eb73e7 Revert "Remove dec_build_inter_predictors() parameters"
This reverts commit 9650b9d72a.

Change-Id: I841c4a4734170fda63469e32adc10703aa4bf0fa
2014-03-12 19:39:59 -07:00
Dmitry Kovalev
95aed4a3fa Renaming mode2txfm_map to intra_mode_to_tx_type_lookup.
Change-Id: I9a19eb96907f674e3ce1e573f5dd49f0fbf2ae4f
2014-03-12 17:23:26 -07:00
Dmitry Kovalev
c909b43e3c Merge "Moving mi_streams from VP9Decompressor to VP9Common." 2014-03-12 12:20:18 -07:00
Dmitry Kovalev
fec0d4bc7d Merge "Removing last_mi from MACROBLOCKD struct." 2014-03-12 12:19:43 -07:00
Dmitry Kovalev
dff81e6c7a Moving mi_streams from VP9Decompressor to VP9Common.
Change-Id: I7ad79c061ad4efbc4914ac49723b48183fdbdd47
2014-03-10 16:12:45 -07:00
Dmitry Kovalev
ff935ff781 Removing last_mi from MACROBLOCKD struct.
Change-Id: Ied12b39c55667b26fd3bf90eb331e601c53a10f6
2014-03-10 16:02:03 -07:00
Dmitry Kovalev
6281a9abbb Adding type casts to remove C++ compiler errors.
Change-Id: I224e49955ad6c833d204feb8efc4056e37d206be
2014-03-10 14:53:30 -07:00
Dmitry Kovalev
f8f8c6d44c Adding reusable get_y_mode_prob() function.
Change-Id: Iebd182d7aeebc0f8964b6fd35057449bb25b00c1
2014-03-10 10:50:16 -07:00
Jim Bankoski
622f06eb59 Merge "vp9_reconinter.h static functions in header converted to global" 2014-03-10 07:36:05 -07:00
Jim Bankoski
ffda0cde7b Merge "vp9_onyxc_int.h static -> static inline in header" 2014-03-10 07:35:54 -07:00
Dmitry Kovalev
0ac2139d02 Merge "Removing vp9_onyx.h and moving its content to the encoder." 2014-03-06 11:49:41 -08:00
James Zern
e7fe1543f6 Merge "vp9_systemdependent: reorder includes avoid proto mismatch" 2014-03-06 11:42:50 -08:00
James Zern
fe49c05214 Merge "vp9_subpixel_8t_intrin_avx2: fix build w/clang 3.4+" 2014-03-06 11:41:44 -08:00
James Zern
caecedc92f vp9_subpixel_8t_intrin_avx2: fix build w/clang 3.4+
clang reports gcc-4.2.1 in e.g., 3.3, 3.4; add a specific clang version
check for _mm256_broadcastsi128_si256

fixes issue #720

Change-Id: I5c8e3c27fdea05d8a5b050e8cb74894b595f4709
2014-03-06 10:55:44 -08:00
Dmitry Kovalev
3f1ab25812 Removing vp9_onyx.h and moving its content to the encoder.
Change-Id: I03451c88536bc498edddbe0cd9773ff79da085c2
2014-03-05 23:33:22 -08:00
James Zern
e9680bef22 vp9_systemdependent: reorder includes avoid proto mismatch
fixes a warning in vs9/x64 related to ceil()

Change-Id: Ic4bde9d0b7e961546dbe304de74aa37fc02fcf94
2014-03-05 22:02:29 -08:00
Dmitry Kovalev
08a7d7e405 Merge "Renaming NMV_UPDATE_PROB to MV_UPDATE_PROB." 2014-03-05 21:39:09 -08:00
Dmitry Kovalev
bb9b6a9568 Merge "Cleaning up vp9_mvref_common.c." 2014-03-05 10:57:37 -08:00
Dmitry Kovalev
791751015f Merge "Removing VP9_PTR." 2014-03-05 10:57:10 -08:00
Dmitry Kovalev
d31fc628a7 Renaming NMV_UPDATE_PROB to MV_UPDATE_PROB.
Change-Id: I7f3bcca103f0b1f6b3c064b61472543de9a8288a
2014-03-05 10:37:52 -08:00
Dmitry Kovalev
fe7b1d0a8d Removing VP9_PTR.
Change-Id: Ib49d8dbc67c590f22a1a70251ff607c9f38febd7
2014-03-03 16:50:16 -08:00
Jim Bankoski
e5e9b05d68 vp9_reconinter.h static functions in header converted to global
Change-Id: I916944950deb22f4c2301d83a803b732bf3ecd77
2014-03-03 14:58:43 -08:00
Jim Bankoski
3d12e65483 vp9_onyxc_int.h static -> static inline in header
Change-Id: Ib65fb0679156960305b10fbf590254ff6bf1bfe1
2014-03-03 14:50:07 -08:00
James Zern
805078a1bf build: convert rtcd.sh to perl
significantly speeds up file generation.

the goal of this change is to convert rtcd.sh to perl as directly as
possible to allow for simple comparison. future changes can make it more
perl-like.

---
Linux
    [CREATE] vpx_scale_rtcd.h
real    0m0.485s ->    0m0.022s
    [CREATE] vp8_rtcd.h
real    0m4.619s ->    0m0.060s
    [CREATE] vp9_rtcd.h
real    0m10.102s ->    0m0.087s

Windows
    [CREATE] vpx_scale_rtcd.h
real    0m8.360s ->    0m0.080s
    [CREATE] vp8_rtcd.h
real    1m8.083s ->    0m0.160s
    [CREATE] vp9_rtcd.h
real    2m6.489s ->    0m0.233s

Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee
2014-03-03 14:47:11 -08:00
Dmitry Kovalev
be647f7b83 Merge "Adding get_tx_type() instead of get_tx_type_{8x8, 16x16}." 2014-03-03 14:24:28 -08:00
Dmitry Kovalev
594677a76b Merge "Moving FRAME_CONTEXT & FRAME_COUNTS to vp9_entropymode.h." 2014-03-03 14:24:04 -08:00
Dmitry Kovalev
46af01d719 Adding get_tx_type() instead of get_tx_type_{8x8, 16x16}.
Change-Id: I4a54b12e5229705222c5a101258b9d1f81e2948d
2014-03-03 12:20:51 -08:00
Dmitry Kovalev
c288367678 Adding consts and cleaning up vp9_rdopt.
Change-Id: I9423b543e1be414e5c9e10480b813f06e6b88f8a
2014-03-03 12:19:51 -08:00
Yunqing Wang
d4648d93f4 Merge "AVX2 SubPixel AVG Variance Optimization" 2014-03-03 09:01:36 -08:00
Yaowu Xu
9650b9d72a Remove dec_build_inter_predictors() parameters
There were two parameters not in use, this commit removed them.

Change-Id: Ia03a73b9a2521400bed539df45574e34214ed93a
2014-03-01 11:14:00 -08:00
Yaowu Xu
2f4eb5f096 Remove vp9_create_common()
The function has evolved over time, now only calls vp9_rtcd(), so this
commit removes the function and changes to call vp9_rtcd() directly.

Change-Id: I8cfa6190daa4b28f6f3d1e11bb3a07f9c95322bf
2014-03-01 10:59:24 -08:00
levytamar82
ea14909687 AVX2 SubPixel AVG Variance Optimization
Optimizing 2 functions to process 32 elements in parallel instead of 16:
1. vp9_sub_pixel_avg_variance64x64
2. vp9_sub_pixel_avg_variance32x32
both of those function were calling vp9_sub_pixel_avg_variance16xh_ssse3
instead of calling that function, it calls vp9_sub_pixel_avg_variance32xh_avx2
that is written in avx2 and process 32 elements in parallel.
This Optimization gave 80% function level gain and 2% user level gain

Change-Id: Iea694654e1b7612dc6ed11e2626208c2179502c8
2014-02-28 22:51:04 -07:00
Dmitry Kovalev
d689f2ad33 Cleaning up vp9_mvref_common.c.
different_ref_found is always equal to one (if calculated) because
ref_frame[0] != ref_frame[1] for each mi-block.

Change-Id: Ibd7625b7b29dec2fd3c40edbc3de1169abb78585
2014-02-28 15:12:33 -08:00
Dmitry Kovalev
e68cc30bb5 Moving FRAME_CONTEXT & FRAME_COUNTS to vp9_entropymode.h.
Change-Id: I1fe71e35b1e44da693b43d26607abb33efd56820
2014-02-28 13:56:43 -08:00
Dmitry Kovalev
e4159100bc Merge "Adding get_y_mode() function." 2014-02-28 11:12:22 -08:00
Dmitry Kovalev
28bd1dd15e Merge "Adding consts to arguments of vp9_block_error()." 2014-02-28 10:51:43 -08:00
Dmitry Kovalev
3a83d08a08 Merge "Moving get_tx_eob() from common to encoder." 2014-02-28 10:49:47 -08:00
hkuang
edcbbf2ee3 Merge "Fix a bug in neon that has not save and restore q4-q7 registers." 2014-02-28 09:48:26 -08:00
Dmitry Kovalev
3b2cd9137a Moving get_tx_eob() from common to encoder.
Change-Id: I7d11c6ae259aff6560710d16fea3032c661e5b02
2014-02-27 18:26:44 -08:00
Dmitry Kovalev
791e9bdac9 Adding consts to arguments of vp9_block_error().
Change-Id: Id145da99259866109cfee8b47a1d8f309944b937
2014-02-27 18:17:08 -08:00
Dmitry Kovalev
1ae91f7784 Adding get_y_mode() function.
Change-Id: Iaac57b24f79cd205a8c62bc1177412d22f5787a8
2014-02-27 16:05:50 -08:00
hkuang
f3d8e315ac Fix a bug in neon that has not save and restore q4-q7 registers.
Change-Id: Ie21b5ae89100389b80f919710839084f935a8545
2014-02-27 14:06:52 -08:00
Minghai Shang
3a8deeb8b6 Merge "[svc] Add target bitrate settings for each layers." 2014-02-27 10:51:26 -08:00
Dmitry Kovalev
2c594a5275 Removing vp9_systemdependent.c.
Change-Id: I7b9738a7113c0c4687e5d320581ff69d98a8b271
2014-02-26 18:07:23 -08:00
Minghai Shang
8c196b27b3 [svc] Add target bitrate settings for each layers.
Change-Id: Ia7677fb436667bc4f76db71f65e4784f433f7826
2014-02-26 13:30:50 -08:00
hkuang
08f250f565 Merge "Fix a bug in intra prediction due to change in 25e55526301eba7d6e5c68e25402e9b2102976d8." 2014-02-26 11:56:45 -08:00
hkuang
1c4e449133 Fix a bug in intra prediction due to change in
25e5552630.

Change-Id: I17ac67c3ced91ad4f057b296f7e8dc86a3389f26
2014-02-25 17:54:33 -08:00
Dmitry Kovalev
7bca32a6a3 Merge "Changing vp9_full_search_sad{, x3, x8} signatures." 2014-02-25 10:51:17 -08:00
Yaowu Xu
05e850cb9e added clamp of segment loop filter level
for ABSDATA mode, so segment loop filter level always fall in valid
range for both Absolute and delta modes.

Change-Id: If90df3411479533dbdab63f8ae088d2f5dd174a9
2014-02-24 09:56:48 -08:00
Yaowu Xu
bfaf415ea7 Merge "Added clamp of qindex to valid range" 2014-02-24 08:28:07 -08:00
Dmitry Kovalev
2aacc66b66 Merge "Cleaning up vp9_mvref_common.{h, c}." 2014-02-23 08:25:40 -08:00
Yaowu Xu
e22b12e304 Added clamp of qindex to valid range
The qindex for a segment was not clamped in ABSDATA mode, which may
cause invalid memory access if an ill-formed stream has a negative
value in ABSDATA mode. This commit added clamp to make sure qindex
for a segment always fall into valid range.

Change-Id: I0a74d00f4ef40aec7edaeca1d03c8645e23ab08c
2014-02-22 12:30:18 -08:00
Yaowu Xu
f1633e5844 Merge "Remove an unused variable" 2014-02-21 22:44:05 -08:00
Alex Converse
6e3cf6ec1d Stop gating non420 features with a configure flag.
Change-Id: I8cc38fdef6a2a0968af8dfe15e7c2b3c46c531ea
2014-02-21 12:05:29 -08:00
James Zern
e2f614be53 Merge "vp9_subpixel_8t_intrin_ssse3.c: make some tables static" 2014-02-20 16:02:16 -08:00
James Zern
3240db7407 Merge "vp9_subpixel_8t_intrin_avx2.c: make some tables static" 2014-02-20 16:01:50 -08:00
Yaowu Xu
c58e1c7be9 Remove an unused variable
Change-Id: I8eeec70a7d4403243762f14d0b560792801645e8
2014-02-20 14:49:44 -08:00
James Zern
10f2db2b1f Merge "vp9: normalize DECLARE_ALIGNED use on global tables" 2014-02-19 11:38:47 -08:00
Dmitry Kovalev
d43c5cc5ea Cleaning up vp9_mvref_common.{h, c}.
Hiding vp9_find_mv_refs_idx() inside vp9_mvref_common.c, moving definition
of vp9_find_mv_refs() to vp9_mvref_common.c.

Change-Id: I0c9f34b03648785a7d18edf6d4fddd34e55dfcc5
2014-02-19 14:23:51 +01:00
Dmitry Kovalev
35bd886864 Merge "Cleaning up pack_inter_mode_mvs() function." 2014-02-19 01:04:36 -08:00
James Zern
b78c219c80 vp9: normalize DECLARE_ALIGNED use on global tables
- place extern within the macro
- use in the header only

Change-Id: I4274b345d8af9ef329c0eb9553a3ddaad70d1d26
2014-02-18 22:57:43 -08:00
James Zern
d73d621e5d vp9_subpixel_8t_intrin_ssse3.c: make some tables static
+ fix formatting

Change-Id: I344d4de089d03e403f0c7b3e64aeb7086cce86ac
2014-02-18 20:42:00 -08:00
James Zern
a96af49bab vp9_subpixel_8t_intrin_avx2.c: make some tables static
+ fix formatting

Change-Id: Ia62610bff3d63855104366d7860749b6a3cf4577
2014-02-18 20:40:40 -08:00
James Zern
26c8e720ca Merge "vp9_filter: move table alignment decl's to header" 2014-02-18 20:15:33 -08:00
Yunqing Wang
0cc71c9c9f Merge "SSSE3 convolution optimization" 2014-02-18 12:55:34 -08:00
Yunqing Wang
ad8d4454f0 Merge "AVX2 SubPixel Variance Optimization" 2014-02-18 12:18:13 -08:00
Dmitry Kovalev
36420009ea Changing vp9_full_search_sad{, x3, x8} signatures.
Passing block MV pointer instead of block index into
vp9_full_search_sad{, x3, x8} functions.

Change-Id: Ica07356633471c2c8f81b583a7aeba85a436bafb
2014-02-17 14:24:57 +01:00
James Zern
8092080216 vp9_filter: move table alignment decl's to header
avoids mismatched alignment warnings in visual studio builds

Change-Id: I2cedb8042fd47e708bde3f7168a6fb4bd9aaa569
2014-02-15 10:18:24 -08:00
James Yu
e486488ce8 Replace vqshrun by vqmovun if shift #0 bit
Change-Id: Ifabb8c7ec0c327fea9d6739cab10addb060ff435
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-14 21:03:40 -08:00
Johann
4378503665 Merge "Remove redundant arm neon instructions." 2014-02-14 20:02:51 -08:00
levytamar82
52dac5d1cb AVX2 SubPixel Variance Optimization
Optimizing 2 functions to process 32 elements in parallel instead of 16:
1. vp9_sub_pixel_variance64x64
2. vp9_sub_pixel_variance32x32
both of those function were calling vp9_sub_pixel_variance16xh_ssse3
instead of calling that function, it calls vp9_sub_pixel_variance32xh_avx2
that is written in avx2 and process 32 elements in parallel.
This Optimization gave 70% function level gain and 2% user level gain

Change-Id: I4f5cb386b346ff6c878a094e1c3b37e418e50bde
2014-02-14 16:59:11 -07:00
Adrian Grange
b7be30eb36 Cleanup some comments.
Change-Id: I568861ba1d43620865ad9a98a97eef37a51fd856
2014-02-14 15:05:30 -08:00
Yaowu Xu
ecf392a155 Merge "minor spelling cleanup in comments" 2014-02-14 14:29:35 -08:00
levytamar82
3068d7d944 SSSE3 convolution optimization
Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2
my optimization include:
-processing 2x8 elements in one 128 bit register instead of processing
8 elements in one 128 bit register.
-removing unecessary loads.
This optimization gives between 2.4% user level gain for 480p input
and 1.6% user level gain for 720p.
This Optimization is done only for 64 bit

Change-Id: Ic07fce2f9360329b4f2d956efda1480ae958766b
2014-02-14 15:08:42 -07:00
Dmitry Kovalev
19a8eee1f0 Cleaning up pack_inter_mode_mvs() function.
Change-Id: I48ad06e3e1ae9720a0683022621f4504e3bebce6
2014-02-13 19:21:10 -08:00
Yaowu Xu
8d646becb6 Merge "Removed the reset of mode_info from previous frame" 2014-02-13 17:03:50 -08:00
Frank Galligan
fb8c246b70 Merge "Add VP9 decoder support for external frame buffers" 2014-02-13 15:29:52 -08:00
Frank Galligan
a4f30a5023 Add VP9 decoder support for external frame buffers
Added support for external frame buffers to libvpx's VP9 decoder.
If the external frame buffer functions are set then libvpx will
call the get function whenever it needs a new frame buffer to
decode a frame into. And it will call the release function
whenever there are no more references to that buffer.

Change-Id: Id2934d005f606af6e052fb6db0d5b7c02f567522
2014-02-13 13:14:19 -08:00
Yaowu Xu
896d79a57e Removed the reset of mode_info from previous frame
Prior to this commit, both encoder and decoder reset mode/mv info from
previous frame in error resilient mode to ensure bitstreams are able to
decode when there is loss of frame in decoder side. However, this is
not necessary. This commit changed to remove the reset, so encoder can
continue to use mode/mv/partition information from previously encoded
frame without affecting decodeablilty under loss of frame.

Change-Id: I0279f862900dc647fb471ae3389770bb1b9f454f
2014-02-13 12:48:08 -08:00
Dmitry Kovalev
df6c523fed Merge "Renaming skip_coeff to skip for consistency." 2014-02-13 11:04:34 -08:00
Frank Galligan
e5a1b214f7 Merge "Fix neon wide loopfilter for filter8 only branch" 2014-02-13 09:52:48 -08:00
Yunqing Wang
92824a9cbc Merge "AVX2 Convolve Optimization" 2014-02-13 09:43:55 -08:00
levytamar82
876c72a093 AVX2 Convolve Optimization
Two convolve functions were optimized for AVX2:
1. vp9_filter_block1d16_h8
2. vp9_filter_block1d16_v8
vp9_filter_block1d16_v8 was optimized for AVX2 by reducing the number of
loop strides by half, two strides were processed in parallel.
vp9_filter_block1d16_v8 was also optimized in the same way also some of the
loads were being done outside of the loop and by that preventing redundant
loads.
This Optimization gives 43% function level gain and 1.3% user level gain.
Now can be compiled in Windows

Change-Id: I2714124cfb0c14a77d7a0ce126a20db92ffbf92c
2014-02-12 20:45:31 -07:00
Frank Galligan
b41acbf9bb Fix neon wide loopfilter for filter8 only branch
The current code removed the check to only perform the filter8.

Change-Id: Ie54e19a77745042a5660eab986d9ef1c42e82410
2014-02-12 18:36:17 -08:00
Dmitry Kovalev
004c8c636e Renaming skip_coeff to skip for consistency.
Change-Id: I036e815ca63d00cba71202ae09ba0f6ef745dcb8
2014-02-12 17:44:12 -08:00
Andrew Russell
549c31f8ae minor spelling cleanup in comments
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
2014-02-12 16:32:51 -08:00
Dmitry Kovalev
50712fcaa9 Adding consts to mv search function arguments.
Change-Id: Ie79114bba4f0cea55d9f701e20d2be2017630f3b
2014-02-12 14:28:23 -08:00
Dmitry Kovalev
0109d757ee Merge "Removing vp9_foreach_transformed_block_uv() function." 2014-02-12 12:11:14 -08:00
Jingning Han
e8b7610e8f Use INTER_OFFSET in vp9_pick_inter_mode
Cosmetic change to use pre-defined macros.

Change-Id: I93e9fa90113d0242599048940b39694660385a6f
2014-02-12 09:14:29 -08:00
James Yu
619f29cdb0 Remove redundant arm neon instructions.
Change-Id: I1fabad59747eb5f68c64275a36c3a1d94daf32a3
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-11 21:19:12 -08:00
Dmitry Kovalev
79dd1f8441 Removing vp9_foreach_transformed_block_uv() function.
Change-Id: I35ec77b71e6fd686865cead9281e4dd9e9bc9e86
2014-02-11 18:06:00 -08:00
Tom Finegan
c49c75fde0 Merge "vp9/common/x86: Silence MSVC warnings in vp9_asm_stubs.c." 2014-02-11 14:39:27 -08:00
Frank Galligan
d51ca0db00 Merge "Add get release decoder frame buffer functions." 2014-02-11 08:19:37 -08:00
Dmitry Kovalev
803a5c67dd Merge "Encoder quantization cleanup." 2014-02-10 21:32:04 -08:00
Tom Finegan
60e91a92c3 vp9/common/x86: Silence MSVC warnings in vp9_asm_stubs.c.
Update filter_1dfunction definition to match usage.

Change-Id: Ie3cae13dc1ec3f5838c5f29d1c76a1a98a9217fa
2014-02-10 15:08:42 -08:00
Frank Galligan
e8e152799b Add get release decoder frame buffer functions.
This CL changes libvpx to call a function when a frame buffer
is needed for decode. Libvpx will call a release callback when
no other frames reference the frame buffer. This CL adds a
default implementation of the frame buffer callbacks. Currently
only VP9 is supported. A future CL will add support for
applications to supply their own frame buffer callbacks.

Change-Id: I1405a320118f1cdd95f80c670d52b085a62cb10d
2014-02-10 14:08:11 -08:00
Jim Bankoski
3c790ec0f8 Convert small static header functions to inline
Change-Id: I467b28346a0d8d4d8b96d6c05fc39c34eec26e5c
2014-02-10 07:56:45 -08:00
Jim Bankoski
b5f59ea280 Convert small static functions in header to inline..
Change-Id: Ic4fc01be7738fbabf8c7860dbe3476ab4caf5fc2
2014-02-10 07:56:38 -08:00
Jim Bankoski
7341725e13 Convert small header functions to inline
Change-Id: I4e5575f0d7ccfe2361b8cbf78e7dc079272c9f5f
2014-02-10 07:56:29 -08:00
Jim Bankoski
69f58b40e0 Convert header static functions to inline or make them global.
Change-Id: Ib26fbfef3505299f754e5af6c437a85d7746fc28
2014-02-10 07:39:12 -08:00
Jim Bankoski
6a9e58cb1d Converted functions in header to INLINE...
Change-Id: I00512c6cef3a4af8df57c7263ceb853fb2db8140
2014-02-09 20:12:04 -08:00
Jim Bankoski
18c8deabbf Convert functions to inline that are small .
Change-Id: I3b160e93d9319c8e1abda2a60f49f89c409d534b
2014-02-09 20:08:58 -08:00
Jim Bankoski
9768d0b184 Convert functions to inline that are in headers static.
Change-Id: If1ec3b64be327e8c48ec7efbacde208d2129fdb0
2014-02-09 20:06:35 -08:00
Jim Bankoski
99e4c508b2 Converted function to inline
Change-Id: Iaa4880c8a207cfea509608e1ef4593794b6b31f2
2014-02-09 20:04:54 -08:00
Jim Bankoski
3a3aa3f4e3 Converted short static functions to inline.
Change-Id: I859719d41ced2e35d2765b636e627bb7edc3651e
2014-02-09 19:58:54 -08:00
Tom Finegan
bf79a4da77 vp9/common: Silence MSVC warning in vp9_convolve.c.
Added cast to int to silence MSVC warning.

Change-Id: I9ef4709d2e4cf0db070d9e52385c1b3f138b00a5
2014-02-07 10:13:57 -08:00
Dmitry Kovalev
005fc6970b Finally removing "short" from transform names.
Change-Id: I5259b68dc1bcceb153e3ffe638a79a59a3019e9d
2014-02-06 11:54:15 -08:00
Marco Paniconi
4864ab21b0 Layer based rate control for CBR mode.
This patch adds a buffer-based rate control for temporal layers,
under CBR mode.

Added vpx_temporal_scalable_patters.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).

Updated datarate unittest with tests for temporal layer rate-targeting.

Change-Id: I8900a854288b9354d9c697cfeb0243a9fd6790b1
2014-02-06 09:24:45 -08:00
Dmitry Kovalev
f32fa45cba Merge "Cleaning up vp9_get_pred_context_single_ref_p1()." 2014-02-05 18:38:38 -08:00
Dmitry Kovalev
4a1a7919da Merge "Removing "_1d" suffix from mips transform code." 2014-02-05 18:37:49 -08:00
Yunqing Wang
7ad56bf3c9 Merge "Optimize bilinear sub-pixel filters in ssse3" 2014-02-05 17:20:52 -08:00
Dmitry Kovalev
724fefb4cf Cleaning up vp9_get_pred_context_single_ref_p1().
Change-Id: I279343b474d7ff41afcf8f1493b6fbf716b51823
2014-02-05 11:48:01 -08:00
Dmitry Kovalev
a536237228 Merge "Cleaning up vp9_get_pred_context_single_ref_p2()." 2014-02-05 11:37:17 -08:00
Martin Storsjo
03bc491721 arm: Consistently use braces around doubleword arguments to vld
This isn't strictly necessary, but makes the file more consistent
with the other arm assembly source files.

Change-Id: I245c9677d89e0ab3f31991e473764858af35b180
2014-02-05 13:24:25 +02:00
Martin Storsjo
c2bb1aa544 arm: Use {} around quadword arguments to vld
This fixes building for iOS.

Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb
2014-02-05 13:24:17 +02:00
James Zern
d89f861f4b vp9_systemdependent.h: relocate system includes
avoid wrapping msvc includes with extern "C"; this breaks some visual
studio builds of the (c++) tests.

Change-Id: Ie8062d55d4f4c049f6cd360a36da6a67607df132
2014-02-04 18:28:45 -08:00
Dmitry Kovalev
c31cf0d647 Merge "Moving x1 & y1 calculation under if condition." 2014-02-04 14:50:25 -08:00
hkuang
b0fec6ab4a With on demand border extension, clamping the MV
is not longer needed.

Change-Id: I40c37ef18c67ab27fc336694dfca3c43a87c47ca
2014-02-04 13:57:40 -08:00
Yunqing Wang
d1961e6fbf Optimize bilinear sub-pixel filters in ssse3
This patch added ssse3 optimization of bilinear sub-pixel filters.
The real time encoder was speeded up by ~1%.

Change-Id: Ie82e98976f411183cb8c61ab8d2ba0276e55a338
2014-02-04 08:01:55 -08:00
James Zern
2b7338aca4 Merge "vp9_filter.h: rename interp_kernel type" 2014-02-03 23:12:28 -08:00
Dmitry Kovalev
5daaff527e Moving x1 & y1 calculation under if condition.
Change-Id: Iae787d491f7cfe24855ef8f2d04e2c6c19350378
2014-02-03 18:03:17 -08:00
Dmitry Kovalev
64cca45c1d Cleaning up vp9_get_pred_context_single_ref_p2().
Change-Id: I294075acd3073c41e153079ff4462816898b3778
2014-02-03 17:46:34 -08:00
James Zern
cca4276dac vp9_filter.h: rename interp_kernel type
-> InterpKernel
avoids conflicts in variable names, fixing the build with various
toolchains.

broken since:
8691565 Removing subpix_fn_table struct.

Change-Id: Ib5f6fdbcb494a97b62c75b99d4d826ff25d4c981
2014-02-03 16:48:38 -08:00
Alex Converse
be1b41673f Merge "INLINE and reimplement get_unsigned_bits()." 2014-02-03 16:26:33 -08:00
Dmitry Kovalev
220b8f8644 Encoder quantization cleanup.
Change-Id: I633205c95f0e81ce0589580501d0be4425a3cb8e
2014-02-03 14:57:28 -08:00
Dmitry Kovalev
282f36adc4 Merge "Removing "_short" suffix from arm transform file names." 2014-02-03 14:28:47 -08:00
Alex Converse
ffd3d4834b INLINE and reimplement get_unsigned_bits().
The new implementation disagrees when the argument is equal to 2**n but
that is never called in practice and based on how it is used the new
implementation is correct in that case.

Change-Id: Ifbac4ad87d459fe6bd2fd0f400c0340f96617342
2014-02-03 12:16:22 -08:00
Yunqing Wang
2488cb34bc Optimize bilinear sub-pixel filters in sse2
Using bilinear filters could speed up the codec in real-time mode.
This patch added sse2 optimizations of bilinear filters that
operate on different-sized blocks.

Tests showed that the real-time encoder was speeded up by 3%.

Change-Id: If99a7ee4385fcc225c3ee7445d962d5752e57c3f
2014-02-03 10:34:45 -08:00
Marco Paniconi
6be2b750b8 Layer based rate control for CBR mode.
This patch adds a buffer-based rate control for temporal layers,
under CBR mode.

Added vpx_temporal_scalable_patters.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).

Updated datarate unittest with tests for temporal layer rate-targeting.

Change-Id: I9cb6cce2494390ae6096ee17774af7fb9308bde7
2014-02-02 14:30:43 -08:00
Jim Bankoski
9dec7712ab static function convert to inline or global vp9_blockd.h
Change-Id: Ifdd951f24932839f06d1c700371662511dde6ebe
2014-01-31 19:50:40 -08:00
Yunqing Wang
7c6a49bada Merge "Rename a loopfilter parameter" 2014-01-31 18:33:33 -08:00
Dmitry Kovalev
c2ca97caaf Merge "Cleaning up motion compensation code." 2014-01-31 17:33:40 -08:00
Dmitry Kovalev
c49b08c9a1 Removing "_short" suffix from arm transform file names.
Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507
2014-01-31 17:19:02 -08:00
Dmitry Kovalev
6e4a03e844 Removing "_1d" suffix from mips transform code.
Unifying transform function names across libvpx, 1d is a redundant suffix.

Change-Id: I077c19f3bc7d4842ed7ca5814d77b3dce1728e13
2014-01-31 17:05:03 -08:00
Yunqing Wang
11a9366e3b Rename a loopfilter parameter
As pointed out by Dmitry and James, "partial" is a Microsoft-
specific c++ keyword, and it is renamed.

Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
2014-01-31 16:30:04 -08:00
Dmitry Kovalev
88340b173b Merge "Combining fb_idx_ref_cnt[] and yv12_fb[] arrays." 2014-01-31 15:55:04 -08:00
Dmitry Kovalev
a8a2f22958 Merge "Renaming "mbskip" to "skip"." 2014-01-31 15:52:35 -08:00
Yunqing Wang
903801f1ef vp9 decoder: row-based multi-threaded loopfilter
Implemented parallel loopfiltering, which uses existing tile-
decoding threads. Each thread works on one row, and when that row
is loopfiltered, it moves to next unattended row. To ensure the
correct filtering order, threads are synchronized and one
superblock is filtered only if the superblocks it depends on are
filtered already.

To reduce synchronization overhead and speed up the decoder, we use
nsync > 1 for high resolution.

Performance tests:
1. on desktop:
8-tile 4k video using 8 threads, speedup: 70% - 80%
4-tile HD video using 4 threads, speedup: ~35%
2. on mobile device(Nexus 7):
4-tile 1080p video using 4 threads, speedup: 18% - 25%
4-tile 1080p video using 2 threads, speedup: 10% - 15%

Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
2014-01-31 14:44:53 -08:00
Yaowu Xu
96dc80da61 Merge "create super fast rtc mode" 2014-01-29 16:36:20 -08:00
Dmitry Kovalev
b107f2c470 Renaming "mbskip" to "skip".
Change-Id: I27a30b43eae026a77f92958e2238d02d9cdf7832
2014-01-29 14:48:42 -08:00
Dmitry Kovalev
5670f1e2a8 Merge "Finally removing vp9_setup_interp_filters() function." 2014-01-29 12:54:21 -08:00
Dmitry Kovalev
6332063475 Combining fb_idx_ref_cnt[] and yv12_fb[] arrays.
Adding new RefCntBuffer struct which contains reference counter and image
buffer.

Change-Id: I71c1f532faa13442c32c43fc03ec45b6f88fb844
2014-01-29 12:48:01 -08:00
Dmitry Kovalev
b00eb5c464 Finally removing vp9_setup_interp_filters() function.
Change-Id: If446225afbb49f6033c2a4516a37c377de6f70f7
2014-01-29 11:29:34 -08:00
Jim Bankoski
ea8aaf15b5 create super fast rtc mode
This patch only works if the video is a width and height that are both
a multiple of 32..   It sets every partition to 16x16, and does INTRADC
only on the first frame and ZEROMV on every other frame.   It always does
does the largest possible transform, and loop filter level is set to 4.

Was ~20% faster than speed -5 of vp8

Now 20% slower but adds motion search ( every block ), nearest, near
and zeromv

The SVC test was changed because - while this realtime mode produces
bad quality albeit quickly, it isn't obeying all the rules it should
about which frames are available.

Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7
2014-01-29 08:39:39 -08:00
Yunqing Wang
3c29cbffbf Add macros for convolve functions
Added macros to reduce the code duplication.

Change-Id: I1916aa5a386ea07d961d4ec439ab09bb8c45487d
2014-01-28 18:40:23 -08:00
Dmitry Kovalev
b098c04290 Merge "Decoupling set_ref_ptrs() and vp9_setup_interp_filters()." 2014-01-28 10:37:58 -08:00
Dmitry Kovalev
4ce35d8f2d Merge "Removing _1d suffix from transform names." 2014-01-28 10:37:26 -08:00