generic-library/vpx

Author	SHA1	Message	Date
Jingning Han	1221641914	Merge "Unify luma and chroma inter predictors in choose_partitioning"	2015-02-04 12:09:21 -08:00
Jingning Han	fb2bac4001	Merge "Save an extra call for setup_pred_plane function"	2015-02-04 12:09:14 -08:00
Jingning Han	ce819d74dc	Merge "Account for chroma component costs in RTC mode decision"	2015-02-04 12:09:01 -08:00
Yaowu Xu	bdfb5f986e	Adjust partitioning threshold based rtc speed On rtc set: speed 7 quality improves about 0.5% speed 8 quality improves about 1.0% Encoding time for speed 7 changes from 67804ms to 65889ms Encoding time for speed 8 changes from 58659ms to 56808ms Change-Id: Iabcfb53012fc1b9f3326cdbc167e5758b8c7ad30	2015-02-04 11:28:39 -08:00
hkuang	b104b84058	Fix a thread lost bug in frame parallel decode. After syncing the frame worker thread, available thread count should increase by 1 even if the worker thread does not have a displayable frame to output. Change-Id: I9eeb87720fed82dfe38555286833ff88e8a8e746	2015-02-04 11:07:02 -08:00
Jingning Han	1b9082ec6b	Unify luma and chroma inter predictors in choose_partitioning Change-Id: I8bfc80f4fffb0892e93d3326394a52d1ee3c0f37	2015-02-04 10:02:57 -08:00
Jingning Han	4ccfc7d517	Save an extra call for setup_pred_plane function Reuse the yv12_mb array to fetch the buffer pointers/strides corresponding to the current reference frame. Change-Id: I5276b7494158b2cccef15213be2dc189e9036851	2015-02-04 09:47:14 -08:00
Jingning Han	0c6d3a03e1	Account for chroma component costs in RTC mode decision This commit allows the encoder to account for additional chroma plane costs in the mode decision process, if the current block potentially contains significant color change. It improves the visual quality at very low bit-rates. The compression performance of dark720p is improved by 12.39% in speed 6. For jimred at 150 kbps, the PSNR of V component (red) increased by 0.2 dB, at the expense of about 5% increase in encoding time. Note that for sequences where the chroma components are fairly consistent, the encoding time increase is negligible. On average the rtc set compression performance is improved by 1.172% in PSNR and 1.920% in SSIM. Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c	2015-02-04 09:45:14 -08:00
Yunqing Wang	b3b7645a2f	vp9_dthread: remove frame_parallel_decoding_mode requirement This patch continues the work to remove frame_parallel_decoding_mode requirement in VP9 multi-threaded tile decoder. In order to do that, the frame counts associated to each thread need to be accumulated together after the frame is decoded. Change-Id: Idba1a756cedfed3c154aef52ed82c8da3bbf9e0c	2015-02-04 09:16:41 -08:00
Johann	3a5d40608e	Merge "Remove unnecessary pointer check"	2015-02-03 17:12:56 -08:00
Yaowu Xu	02537ebbe4	Move calls to avoid unnecessary operations Change-Id: I236f7f75ab9a4511d1b52a6a67299b0e844a103e	2015-02-03 17:01:37 -08:00
Yaowu Xu	cb411108a3	Merge "adjust rtc setting and threshold"	2015-02-03 15:13:52 -08:00
hkuang	70554a21f1	Merge "Remove duplicate code."	2015-02-03 13:37:48 -08:00
Jim Bankoski	d7783cae95	Merge "make low bitrates a lot less blocky"	2015-02-03 13:25:06 -08:00
Johann	ba18609502	Remove unnecessary pointer check The original implementation had the following comment: // Ignore mv costing if mvsadcost is NULL However the current implementation does not allow for this. If x exists then nmvsadcost must not be null. This removes the only warning from -Wpointer-bool-conversion https://code.google.com/p/webm/issues/detail?id=894 Change-Id: I1a2cee340d7972d41e1bbbe1ec8dfbe917667085	2015-02-03 13:03:46 -08:00
Jingning Han	894f0fbd3b	Merge "Assign 2nd ref frame in choose_partitioning"	2015-02-03 12:25:18 -08:00
Jingning Han	ca9c352fc3	Assign 2nd ref frame in choose_partitioning Avoid the use of uninitialized second reference frame for fetching reference block. Change-Id: I9983a0daea829700b3270dc8bf2bcc6d6ea36652	2015-02-03 11:17:51 -08:00
Yunqing Wang	f5b3631621	Merge "vp9_dthread: pass frame counts to decoder functions"	2015-02-03 10:52:02 -08:00
Yaowu Xu	a6b3e01a27	Add mutex initialization in encoder This resolves the encoder crashes on windows. Change-Id: I159d79014cf9279751e403936ce1f84482ae82da	2015-02-03 09:53:08 -08:00
Yunqing Wang	85a9bc04d4	vp9_dthread: pass frame counts to decoder functions The current multi-threaded tile decoder requires that the videos are encoded with frame_parallel_decoding_mode = 1. This requirement is unnecessary and is better removed. This patch includes the first part of the work. Change-Id: Ic7695fb3cfe13f9022582c9f0edd2aa6e2e36d28	2015-02-03 09:39:15 -08:00
Jim Bankoski	9f1cf2c8cf	make low bitrates a lot less blocky Remove loop filter skip at speed 7+ because of bad visual artifacts and up the postprocessing. Change-Id: Ibdd0bac71aaee232d2bb2e14462733c51517768d	2015-02-03 06:45:56 -08:00
Yaowu Xu	65a1a3e85d	adjust rtc setting and threshold 1. Adjusted the threshold for coef update computation based on counts of tx used, avoid coef update computation when count is low (<20) 2. Move sf->lpf_pick = LPF_PICK_MINIMAL_LPF to speed 8. Change-Id: I02b44309e40fcdbf135c7934ae067a3f42502d30	2015-02-02 17:43:46 -08:00
hkuang	4ed539f22e	Merge "Fix a bug from merging frame parallel branch into master."	2015-02-02 17:08:42 -08:00
hkuang	94a459522e	Fix a bug from merging frame parallel branch into master. The merge did not merge the fix for issue #850. Change-Id: I0dc1377dbfcb9497fb01a13d4f78ac65bff5eb33	2015-02-02 16:01:17 -08:00
Alex Converse	a79db92c07	Merge "Allow larger encoder configurations."	2015-02-02 12:05:56 -08:00
Yaowu Xu	80e729f601	Merge "Optimize coef update"	2015-02-01 20:08:29 -08:00
hkuang	be6aeadaf4	Try again to merge branch 'frame-parallel' into master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of loopfilter worker thread. There are still some known issues that need to be addressed. For example: decode frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit `a18da9760a`. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02	2015-01-30 21:00:13 -08:00
James Zern	f6c2a6c5d6	vp9: rename 'near' parameters + nearest for consistency near is a reserved word in windows builds so using it as a parameter name may cause build failures with some configurations Change-Id: Iddf1d4ecdb39843f14e95dbfd9dca55f07f81403	2015-01-30 15:52:24 -08:00
Jingning Han	f1ab5c1021	Merge "Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8"	2015-01-30 15:49:14 -08:00
Yaowu Xu	45971abd1d	Optimize coef update 1. move the check of search method of USE_TX_8X8 up one level to avoid operations of build_tree_distributions() 2. count tx used and avoid computation for coef update when one size is not used at all. Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c	2015-01-30 10:16:40 -08:00
Yunqing Wang	3b3e299650	Merge "Fix issues in 32bit PIC enabled build"	2015-01-29 16:41:25 -08:00
Alex Converse	797a2556eb	Allow larger encoder configurations. Allow changing colorspace in the encoder and increasing frame size. Change-Id: I8e7c3b891af29ce420a15beb4f6f9c250245b2bb	2015-01-29 15:07:40 -08:00
Paul Wilkins	68340a3470	Merge "Change to update of rate control factors."	2015-01-29 13:50:52 -08:00
Marco	a80dd52b6e	Merge "Fix to vp9 denoiser."	2015-01-29 09:10:30 -08:00
Paul Wilkins	f752da8ce2	Change to update of rate control factors. Remove damping parameter and use the damping formula introduced by Yaowu Xu in all cases. Change-Id: I18db7e0d0f262d5140102f259ab07821d374d285	2015-01-28 15:44:53 -08:00
Yaowu Xu	ff99a3c750	Simplify update_coef_probs() 1. reduce the size of temporary arrays on stack 2. avoid build_tree_distribution for tx size that is not used at all. Change-Id: I0f8d7124e16a3789d3c15ad24cf02c1c12789e2c	2015-01-28 15:12:42 -08:00
Marco	c0923d4d3a	Fix to vp9 denoiser. Prevent from using wrong mv for denoiser motion compensation. Change-Id: Ifa0f9daabdbdab0900d3c17304059fe0d15de914	2015-01-28 12:07:27 -08:00
hkuang	e8c42fb0bd	Remove duplicate code. (issue #934). Change-Id: Ic8adaaff87aae0b33d9b508f160b48e0ccdaaf4c	2015-01-28 12:00:34 -08:00
Frank Galligan	d1e6b8231a	Merge "Add vp9_sad32x32x4d_neon Neon intrinsic function."	2015-01-28 10:35:50 -08:00
Frank Galligan	eb12d880ab	Merge "Add vp9_sad16x16x4d_neon Neon intrinsic function."	2015-01-27 23:01:44 -08:00
Frank Galligan	80a3a07929	Merge "Add vp9_sad64x64x4d_neon Neon intrinsic function."	2015-01-27 23:01:15 -08:00
Yunqing Wang	10d5e09c87	Fix issues in 32bit PIC enabled build This patch was to fix issue 924: https://code.google.com/p/webm/issues/detail?id=924 The SECTION_RODATA macro was modified to support macho32 format. The sub-pixel functions were modified to pass in 2 more parameters to handle the global offsets for PIC build. Change-Id: I3bfcd336bcae945edf300bca4ab40376a2628cd4	2015-01-27 22:20:21 -08:00
Yaowu Xu	fe2439703d	Merge "move clear_system_state() call before using double"	2015-01-27 12:42:13 -08:00
Frank Galligan	e3167f7fbf	Add vp9_sad32x32x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~18% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c	2015-01-27 08:54:00 -08:00
Frank Galligan	9f574d0316	Add vp9_sad16x16x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~15% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9	2015-01-27 08:42:17 -08:00
Frank Galligan	54fa956715	Add vp9_sad64x64x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: Id12af7d1883243c23e6692e898aea82299633d58	2015-01-27 08:33:40 -08:00
Marco	1c4a84c6e9	Merge "aq-mode=3: Update to allow for refresh on modes other than zero-mv."	2015-01-26 19:47:13 -08:00
Yaowu Xu	645b7cdf03	move clear_system_state() call before using double Floating point is used in vp9_convert_qindex_to_q(), so the unit test ActiveMapTest would sometimes cause a run-time error without a proper call to clear_system_state to reset register status. Change-Id: I181e9395148c44a6ca8b97d6e109bd4a152143c6	2015-01-26 18:41:50 -08:00
Paul Wilkins	d231ce4fde	Merge "Adjust active maxq for GF groups."	2015-01-26 18:19:09 -08:00
Yaowu Xu	d987dc4fdb	Merge "Fix MSVC warnings on conversion from int64 to int"	2015-01-26 16:52:30 -08:00
Marco	3f1af6e85e	aq-mode=3: Update to allow for refresh on modes other than zero-mv. Add distortion threshold condition to refresh state of a coding block, and allow for qp adjustment also for some intra modes and non-zero motion modes. Also some code cleanup (remove unused variables/code). Change-Id: I735fa2b28bc64f60e0323976b82510577b074203	2015-01-26 16:44:25 -08:00
Paul Wilkins	fd070220ff	Adjust active maxq for GF groups. Currently disabled by default: enabled using #define GROUP_ADAPTIVE_MAXQ In this patch the active max Q is adjusted for each GF group based on the vbr bit allocation and raw first pass group error. This will tend to give a lower q for easy sections and a higher value for very hard sections. As such it is expected to improve quality in some of the easier sections where quality issues have been reported. This change tends to hurt overall psnr but help average psnr. SSIM also shows a small gain. Average results for derf, yt, std-hd and yt-hd test sets were as follows (%change for average psnr, overall psnr and ssim):- derf +0.291, -0.252, -0.021 yt +6.466, -1.436, +0.552 std-hd +0.490, +0.014, +0.380 yt-hd +5.565, -1.573, +0.099 Change-Id: Icc015499cebbf2a45054a05e8e31f3dfb43f944a	2015-01-26 14:55:36 -08:00
Yaowu Xu	6d16f6c14c	Fix MSVC warnings on conversion from int64 to int Change-Id: I7e96509ffa36899fcd2935749927a1e8aac8d025	2015-01-26 10:54:06 -08:00
Frank Galligan	9f6eba419a	Add Neon intrinsic vp9_fdct8x8_quant_neon On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51	2015-01-24 22:49:50 -08:00
Yaowu Xu	643c75d90b	Merge "Replace divide with look-up"	2015-01-23 21:12:18 -08:00
Jingning Han	9bdc0ae2b2	Format fixes in vp9_rd_pick_inter_mode_sb/sub8x8 Add parentheses to bit operations. Change-Id: I095d601f0631d055adc4b3a8fde70c9cbae9e749	2015-01-23 11:48:58 -08:00
JackyChen	65f60f8e8c	Merge "SSE2 code for the filter in MFQE."	2015-01-23 11:08:16 -08:00
Adrian Grange	0e2e2c2652	Merge "Remove elevate_newmv_thresh from SPEED_FEATURES (unused)"	2015-01-23 09:57:03 -08:00
Yaowu Xu	eda179764f	Replace divide with look-up This commit replaces an integer divide with a table-lookup. It is to improve decoding speed, and at the same time, to reduce possible complications with a bug in AMD Family 12h processors: "665 Integer Divide Instruction May Cause Unpredictable Behavior" Change-Id: I678b707a538798a923850bac467e66e847e6def7	2015-01-23 09:02:07 -08:00
Johann	a18da9760a	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch." This reverts commit `bde04ce503` Change-Id: I053dae04c761b04a36dc239558503905a14d2470	2015-01-23 08:42:02 -08:00
hkuang	bde04ce503	Merge branch 'frame-parallel' to enable frame parallel decode in master branch. In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading which will makes devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64	2015-01-22 18:18:53 -08:00
Adrian Grange	527e073163	Remove elevate_newmv_thresh from SPEED_FEATURES (unused) Change-Id: I78ef7f89586a329787f6bc4c58ec83af210989a3	2015-01-22 16:12:50 -08:00
Marco	0dccb6277c	Modify variance partition selection for low resolutions. For low spatial resolutions: bias partition selection to smaller block sizes, and base the variance computation on 4x4 down-sampling. Also move the threshold computations into the choose_partitioning, so they are computed once for each sb block. On low-res clips (RTC_derf) PSNR/SSIM metrics increase by about 4-5%. No change for resolutions above CIF. Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0	2015-01-22 15:16:55 -08:00
Paul Wilkins	cf3202132f	Merge "Bug when last group before forced key frame is short."	2015-01-22 08:28:19 -08:00
Paul Wilkins	0bff1efc2b	Bug when last group before forced key frame is short. Just before a forced key frame we often get a foreshortened arf/gf group. In such a case, we do not want to update rc->last_boosted_qindex, which is used to define the Q range for the forced key frame itself. This gives a small average metrics gain for the YT and YT-HD sets (eg. YT SSIM +0.141%). Change-Id: Ie06698bc4f249e87183b8f8fb27ff8f3fde216d9	2015-01-21 15:25:57 -08:00
JackyChen	cd0830f452	Merge "Fix compile error in Chromium building."	2015-01-21 14:52:32 -08:00
JackyChen	25a19b48ff	Fix compile error in Chromium building. The comparison of address in the condition is not necessary, since they will constantly be non-null. Change-Id: Id0b0075283f5af65215d5761a8160a4cb2a15c9b	2015-01-21 12:59:25 -08:00
Alex Converse	910ca857df	Allow external resize via vpx_codec_enc_config_set Change-Id: I3d324e2baa4de2d266c5f7ca7b635b62372e90a7	2015-01-21 11:33:06 -08:00
Yaowu Xu	c97d435243	Merge "Replace "colorspace" with "color_space""	2015-01-21 08:58:09 -08:00
Frank Galligan	469ff48d7b	Merge "Add Neon intrinsics for vp9_avg_8x8_neon"	2015-01-20 14:38:39 -08:00
Yunqing Wang	6d7b7abf52	Add non420 code in multi-threaded loopfilter Added non420 part back to make it consistent with single thread code in vp9_loopfilter.c. Change-Id: I8ca255d73bffebae294d2627d6655eafe535cb90	2015-01-20 09:31:47 -08:00
Yunqing Wang	7b232717af	Merge "vp9_ethread: add parallel loopfilter"	2015-01-20 09:27:08 -08:00
JackyChen	09673deba9	SSE2 code for the filter in MFQE. The SSE2 code is from VP8 MFQE, reuse it in VP9. No change on VP8 side. In our testing, we achieve 2X speed by adopting this change. Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716	2015-01-18 16:07:59 -08:00
Frank Galligan	cc2da09d42	Fix variance Neon intrinsics > 32x32 The 16 bit sum vector was overflowing. Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f	2015-01-17 10:31:48 -08:00
Yunqing Wang	e76eaf05b1	vp9_ethread: add parallel loopfilter 1. Added row-based loopfilter in encoder; 2. Moved common multi-threaded loopfilter functions from decoder to common; 3. Merged multi-threaded loopfilter code, and made encoder/ decoder call same function to reduce code duplication. Encoder tests showed that 1% - 2% speedup was seen for good-quality 2-pass mode(at speed 3); 1% - 3% speedup using 2 threads and 4% - 6% speedup using 4 threads were seen for real-time mode(at speed 7). Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4	2015-01-16 17:19:27 -08:00
Jingning Han	0220255fa0	Merge "Fix frame buffer swap in denoiser"	2015-01-16 16:58:37 -08:00
Jingning Han	dfda5cebc7	Fix frame buffer swap in denoiser This commit fixes a bug in denoiser reference frame buffer swap, which disables frame buffer update. Change-Id: I39a9427180fd18f9692602064ad821f7af4714c0	2015-01-16 12:29:58 -08:00
Yaowu Xu	bc5d3fae5c	Replace "colorspace" with "color_space" This is to make the usage of the variable name consistent across the code base. Change-Id: I698739e55841c59358d1c6e5cc97c96088772943	2015-01-15 17:58:47 -08:00
Minghai Shang	220bc3a013	[two pass temporal svc]Fix crash issue in transcoder app caused by last fix. Change-Id: I78ecc8ec3fa3ba5f69bb23813e68a5255d0534e1	2015-01-15 16:59:54 -08:00
Frank Galligan	6e7e1cf32f	Add Neon intrinsics for vp9_avg_8x8_neon On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee	2015-01-15 15:32:40 -08:00
Yunqing Wang	99b99831e4	Align thread data in vp9_ethread On some platforms, such as 32bit Windows and 32bit Mac, the allocated memory isn't aligned automatically. The thread data is aligned to ensure the correct access in SIMD code. Change-Id: I1108c145fe982ddbd3d9324952758297120e4806	2015-01-14 15:51:56 -08:00
Yaowu Xu	829a01dbb7	Merge "Add encoder control for setting color space"	2015-01-14 14:14:34 -08:00
Frank Galligan	c7d6c0c5a8	Merge "Switch remaining Neon variance functions to shifts"	2015-01-14 12:17:42 -08:00
Frank Galligan	68224a6e87	Merge "Add 64x64 sub_pel_variance Neon function"	2015-01-14 12:17:20 -08:00
Yaowu Xu	e94b415c34	Add encoder control for setting color space This commit adds encoder side control for vp9 to set color space info in the output compressed bitstream. It also amends the "vp9_encoder_params_get_to_decoder" test to verify the correct color space information is passed from the encoder end to decoder end. Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650	2015-01-14 10:17:14 -08:00
Yaowu Xu	afae733eed	Merge "Enable decoder to pass through color space info"	2015-01-14 10:04:15 -08:00
Frank Galligan	ec1d8387e1	Add 64x64 sub_pel_variance Neon function On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334	2015-01-14 08:36:24 -08:00
Frank Galligan	588f74f8a6	Switch remaining Neon variance functions to shifts Saves 5 instructions on 8x8 and 16x16 and 8 instructions on 32x32, when compiled with 4.9. Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c	2015-01-14 07:22:49 -08:00
Frank Galligan	bd3dbc588c	Merge "Add 64x variance Neon functions"	2015-01-13 22:38:58 -08:00
Minghai Shang	a14415d171	[twopass temporal svc] Fix decoding error on seek. Don't put small empty frame in front of a key frame. We will put key frame flag in webm container if there's a visible key frame. But there will be decoding error when we seek to here if we put the small empty frame, which will be inter frame, in front of it. Change-Id: Id50c2c1fd31da0405ff6faa7375cc2f49c55402d	2015-01-13 15:44:22 -08:00
Yaowu Xu	6b223fcb58	Enable decoder to pass through color space info This commit added a field to vpx_image_t for indicating color space, the field is also added to YUV_BUFFER_CONFIG. This allows the color space information pass through the decoder from input stream to the output buffer. The commit also updated compare_img() function with added verification of matching color space to ensure the color space information to be correctly passed from encode to decoder in compressed vp9 streams. Change-Id: I412776ec83defd8a09d76759aeb057b8fa690371	2015-01-13 15:13:19 -08:00
Frank Galligan	74d40cd507	Add 64x variance Neon functions Add optimized Neon functions of: vp9_variance32x64 vp9_variance64x32 vp9_variance64x64 On Nexus 7 speed -5 and -6 saw about a 4% increase in perf. Speeds -7 and -8 saw about a 6% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa	2015-01-13 15:08:13 -08:00
Yaowu Xu	6f6fbf9175	Merge "Added plumbing for setting color space"	2015-01-13 09:20:13 -08:00
Yaowu Xu	fe3f21099f	Merge "Fix comments and color format"	2015-01-11 14:01:36 -08:00
Yaowu Xu	ce52b0f8d3	Added plumbing for setting color space Change-Id: If64052cc6e404abc8a64a889f42930d14fad21d3	2015-01-09 10:54:25 -08:00
Yaowu Xu	ecbca31a1d	Fix comments and color format Replaced "color space" with "color format" in comments where color sampling format is concerned, so to differentiate from the concept defined in COLOR_SPACE. Change-Id: I8c935034c166b24307a99352dab1686531276bb8	2015-01-09 10:36:43 -08:00
Paul Wilkins	ccffe318ff	Merge "Use 64 bit to accumulate frame sse."	2015-01-09 06:05:11 -08:00
Jingning Han	ae537c151b	Merge "Refactor mc reference block fetch in denoiser"	2015-01-08 17:56:53 -08:00
James Zern	4d6838627d	Merge "vp9: add per-tile longjmp error handling"	2015-01-08 15:53:37 -08:00
James Zern	44b55dada8	Merge "vp9: fix -Wclobbered (longjmp + local variables)"	2015-01-08 15:53:02 -08:00
Jingning Han	a1daf009be	Merge "Use lookup table to find pixel numbers in block"	2015-01-08 13:58:35 -08:00
Johann	00bbe342c2	Merge "Disable vp9 _8_ loopfilters"	2015-01-08 12:47:52 -08:00
Jingning Han	a0be730eae	Refactor mc reference block fetch in denoiser This commit refactors the motion compensated reference block fetch process in denoiser. It skips the stage that generates motion compensated reference block if denoiser decides to use copy block mode. For high motion clips, this could speed up the denoising process by about 10%. Change-Id: I8ef4fa5fe766a8c4529119b9ec01faefb3d4ef53	2015-01-08 12:43:08 -08:00
Jingning Han	e3f0b19f3f	Use lookup table to find pixel numbers in block This could save one multiplication in each threshold function called by the denoiser per block. Change-Id: I35f437e09999f0a087180878ef7805f0d86e5819	2015-01-08 12:32:28 -08:00
Jingning Han	e535ad5067	Merge "Refactor denoiser frame buffer update"	2015-01-08 11:16:14 -08:00
Jingning Han	97dc782635	Merge "Initalize zeromv_sse and newmv_sse in vp9_pick_inter_mode"	2015-01-08 10:55:03 -08:00
Jingning Han	f1866a5792	Merge "Use vp9_convolve_copy in denoiser output"	2015-01-08 09:59:10 -08:00
hkuang	9130e7ad2e	Merge "Remove unnecessary init_macroblockd."	2015-01-08 09:15:32 -08:00
Jingning Han	ea061a885d	Refactor denoiser frame buffer update Use frame buffer pointer swap instead of memcpy when possible. These two CLs make the denoiser when running on vidyo1 720p at speed -6 over 10% faster. Change-Id: I64fe8a2422cafca6787a50c7f4dfb961191c0a9d	2015-01-07 18:33:13 -08:00
Jingning Han	29a5deb40c	Use vp9_convolve_copy in denoiser output Replace copy_block with vp9_convolve_copy for speed performance improvement. Change-Id: I3a08c4d01dff2253b6ee573efd02f65ccdc1b5a5	2015-01-07 18:23:17 -08:00
Zoe Liu	4cf636a60e	Removed redundant local variables in the forward hybrid transforms. Change-Id: I60f7ccbbc8dc624134e325bdce6042bc183075b6	2015-01-07 16:38:29 -08:00
Yaowu Xu	01eec75858	Merge "Refactor calculation of tile_cols"	2015-01-07 16:24:57 -08:00
Jingning Han	08055b639a	Merge "Always check and free denoiser buffer memory space"	2015-01-07 15:54:06 -08:00
Jingning Han	e42b3ee765	Initalize zeromv_sse and newmv_sse in vp9_pick_inter_mode These two parameters are used to control the denoiser cut-off thresholds. They should be properly initialized when starting mode search of a given block. Change-Id: Iba8a25487026a0dbe0d350c347d7e4e4e237b637	2015-01-07 15:32:41 -08:00
JackyChen	1883c940b9	Merge "Use qdiff to adjust the threshold of sad and variance in MFQE."	2015-01-07 14:57:46 -08:00
Yaowu Xu	e9cf9b7dfe	Refactor calculation of tile_cols Change-Id: I2c38ea2bcf6d221a0b6b2fb9be4cebbee21006a3	2015-01-07 14:28:59 -08:00
Jingning Han	b208439b5a	Merge "Fix best ref frame rd cost update in sub8x8 non-RD mode search"	2015-01-07 14:06:55 -08:00
Jingning Han	3e41563f33	Merge "Format fix in vp9_pick_inter_mode_sub8x8"	2015-01-07 14:06:06 -08:00
Jingning Han	802b798f67	Fix best ref frame rd cost update in sub8x8 non-RD mode search This fixes the issue that sub8x8 inter blocks always end up with GOLDEN_FRAME. Change-Id: Id0c25cbb9c2003f43b4dff8fb1572512c246e077	2015-01-07 12:00:02 -08:00
Jingning Han	c3fd9bbdaf	Format fix in vp9_pick_inter_mode_sub8x8 Replace ref_frame++ with ++ref_frame. Change-Id: Ic39793081156c314bf1b85d5ab76def97f3bff52	2015-01-07 11:50:36 -08:00
Jingning Han	59f29f5e3f	Merge "Fix denoiser chroma component initialization"	2015-01-07 11:30:15 -08:00
Jingning Han	9a0e694182	Merge "Skip duplicate denoiser frame buffer allocation"	2015-01-07 11:30:07 -08:00
Johann	d12f1f907d	Merge "Rearrange loopfilter functions"	2015-01-07 11:07:54 -08:00
Deb Mukherjee	ba93f0a201	Merge "Moves inter mode count updates to update_stats"	2015-01-07 09:38:32 -08:00
JackyChen	60cf5cf7b2	Use qdiff to adjust the threshold of sad and variance in MFQE. When qdiff is larger, the sad/variance threshold should also be higher which indicates a more aggressive action on MFQE. Change-Id: I44c5c93572805458d4f87fdc7619cc9d8a522185	2015-01-07 09:07:10 -08:00
Jingning Han	ce08006951	Always check and free denoiser buffer memory space The vp9_denoiser_free() function will internally check if the buffer pointers are NULL. This commit makes the encoder always call vp9_denoiser_free() after finishing encoding. It protects the case where noise_sensitivity_level is changed during the encoding process and happens to be turned off towards the end of the sequence, which could result in memory space allocated to the denoiser not being released. Change-Id: Ie20dc2f2e6e5fb6333fbab3356bc153978a6a0f8	2015-01-07 08:50:13 -08:00
Jingning Han	2fb9b635bb	Fix denoiser chroma component initialization Use the correct frame size and stride value for chroma components when setting the initial values. These control parameters are assigned when the denoiser buffer was allocated and initialized. Change-Id: Ia6318194c7738aff540bcbd34f77f0dac46221a1	2015-01-07 08:49:59 -08:00
Jingning Han	27582e573b	Skip duplicate denoiser frame buffer allocation Allocate the frame buffer for denoiser once during the encoder initialization. This avoids allocating frame buffer multiple times and overwriting the buffer pointer without properly releasing it. Change-Id: I9b3baa6283449d86fd164534d344c036bb035700	2015-01-07 08:49:04 -08:00
Paul Wilkins	a3c1a9b419	Use 64 bit to accumulate frame sse. When testing frame sse to choose a loop filter value and when checking ambient error in kf Q selection, use 64 bit values for accumulating the sse, to avoid risk of overflow for large image formats. Change-Id: I03765d16c843d0ade61a45b0cd46312472697e57	2015-01-07 14:13:16 +00:00
Johann	377b6682f9	Disable vp9 _8_ loopfilters Investigating https://code.google.com/p/chromium/issues/detail?id=443839 Change-Id: Ibb7485d835c5aa5e1d40f31715596ba8d208eedb	2015-01-06 19:26:11 -08:00
Johann	b1ba4cc394	Rearrange loopfilter functions Separate functions and rename files. This will make it easier to disable some functions later to help work around a compiler issue in chromium. Change-Id: I7f30e109f77c4cd22e2eda7bd006672f090c1dc5	2015-01-06 19:26:11 -08:00
Yaowu Xu	7ba6a676f5	Merge "Use -1 consistently as invalid buffer idx"	2015-01-06 17:31:13 -08:00
Deb Mukherjee	e7570493b8	Moves inter mode count updates to update_stats This makes the inter_mode counts update consistent with other symbols. Also, forward updates should work correctly now. Change-Id: Id98be26fd08875162e644bb8f1de6f0918f85396	2015-01-06 16:40:45 -08:00
Yaowu Xu	61c5e94e22	Use -1 consistently as invalid buffer idx Instead of mixed use of both -1 and INT_MAX. This also fixes a vp9 fuzzing test failure. Change-Id: I950ea94b44ec7cdb5232773bee30b104e342f52a	2015-01-06 15:59:03 -08:00
Deb Mukherjee	0c2ee67ad6	Merge "Enable coefficient range checking for 10-/12-bit"	2015-01-06 14:59:08 -08:00
Yaowu Xu	0979dbb37b	Merge "Fix compiler warnings for msvc2013"	2015-01-06 08:01:47 -08:00
Yaowu Xu	262f66e6ed	Merge "Properly validate data size"	2015-01-06 08:01:30 -08:00
Paul Wilkins	a88e4e64b1	Merge "Deleted unused #define"	2015-01-06 04:18:20 -08:00
Deb Mukherjee	0ce2a27e9b	Enable coefficient range checking for 10-/12-bit Also fixes a broken build with --enable-coefficient-range-checking configuration option. Change-Id: Icc536f53088e8cec59dfb8f635668555fdb9125e	2015-01-06 02:40:51 -08:00
Yaowu Xu	9c061ef506	Properly validate data size With "show_existing_frame" frames: Minimum data size for profile 0 and 1 is 1 byte (8bits) Minimum data size for profile 2 and 3 is 2 bytes (9bits) Otherwise: Minimum data size is 8 bytes. This resolves the VP9 failure in fuzzing test build #56. Change-Id: I146d9d37688f535dd68d24aacc76d464ccffdf04	2015-01-05 17:34:31 -08:00
Yaowu Xu	364b92dc88	Fix compiler warnings for msvc2013 Change-Id: I1e32bf8f6872a6fb7e9cabe86483e94805e2f790	2015-01-05 17:31:19 -08:00
Jingning Han	1da0402eff	Merge "Fix denoised video output function"	2015-01-05 15:28:29 -08:00
JackyChen	fe23539d58	Adopt weighted averaging in MFQE. By using weighted averaging in the calculation of the frames to be displayed, we get an average gain of more than 1 db for key frames whose base qp are 20 higher than non-key frames. Change-Id: I7bcb2e7b9c6420ea3f73f33204d18b072dffd17c	2015-01-05 11:38:42 -08:00
Jingning Han	21c0306187	Fix denoised video output function This commit fixes the buffer alignment control in denoised video output function. The encoder is now able to properly store the denoised input video into provided file when enabled. Change-Id: I258e272c8d4a9b52592e16d6d09976c6f5c21728	2015-01-03 21:39:32 -08:00
Jingning Han	2fe1bfa5ad	Merge "Remove redundant local variable for segment_id"	2015-01-02 14:48:27 -08:00
Jingning Han	5516fdd8d0	Remove redundant local variable for segment_id Use mbmi->segment_id directly in vp9_pick_inter_mode. The value is set outside this function, hence no need to assign it again. Change-Id: I3d63cdd2e4fadf62ccdefada638b00d979eb3741	2015-01-02 12:25:14 -08:00
Jingning Han	0d2d3321af	Merge "Add bsize check condition in nonrd_use_partition"	2015-01-02 11:50:57 -08:00
Jingning Han	5486db185c	Add bsize check condition in nonrd_use_partition Check if block size is below 8x8 for rectangular block coding. It is added to support 4x8 and 8x4 block coding for RTC mode. Change-Id: I760b328f45b98ae48adc45ed5a39fb643cd8aebd	2015-01-02 10:12:37 -08:00
Jingning Han	59cfaa538e	Merge "Use less tmp motion vectors in vp9_pick_inter_mode_sub8x8"	2015-01-02 10:00:45 -08:00
Jingning Han	5c31fd5c6d	Merge "Enable sub8x8 inter block search for RTC coding mode"	2015-01-02 10:00:35 -08:00
hkuang	aa5563cd41	Remove unnecessary init_macroblockd. macroblockd are init again inside decode_tiles and decode_tiles_mt. Change-Id: I1f42837864f095c319cdb24cec7d6aa6a3a4da50	2014-12-30 15:23:52 -08:00
Jingning Han	2baccb18a0	Use less tmp motion vectors in vp9_pick_inter_mode_sub8x8 This commit simplifies the reference motion vector part for sub8x8 block coding in RTC mode and reduces the required local variables. Change-Id: I470d1482092563b68af22404dc1f497e7457b0a8	2014-12-30 13:16:12 -08:00
Jingning Han	f5d574c566	Merge "Set ref frame scaling factor in RTC inter mode decision"	2014-12-29 14:20:22 -08:00
Jingning Han	dad89d5ca1	Enable sub8x8 inter block search for RTC coding mode This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a	2014-12-24 17:40:31 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jingning Han	eb1795f643	Set ref frame scaling factor in RTC inter mode decision Properly set the corresponding scaling factor of the reference frame in the non-RD mode decision process. This allows the mode search process to account for the scaled reference frame when selecting coding mode. Change-Id: I9d41bff6931c98e5a82b413e37ac5e6e14b93b23	2014-12-23 09:33:58 -08:00
James Zern	59d63e610a	vp9: fix -Wclobbered (longjmp + local variables) Local variables used at the setjmp() site need to be marked volatile. Relevant excerpt from the 'man longjmp': =============== The values of automatic variables are unspecified after a call to longjmp() if they meet all the following criteria: · they are local to the function that made the corresponding setjmp(3) call; · their values are changed between the calls to setjmp(3) and longjmp(); and · they are not declared as volatile. =============== Change-Id: I093e6eeeedbf5f781d202248ca701ba2c29d3064	2014-12-23 11:44:11 -05:00
Jim Bankoski	4e04fa6dea	Merge "make vp9_coef_encodings const"	2014-12-22 15:05:25 -08:00
Jim Bankoski	fc954c7c03	Merge "remove static initializers for partition tree"	2014-12-22 13:49:57 -08:00
Jim Bankoski	d6d431c476	Merge "Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."""	2014-12-22 13:43:56 -08:00
Jim Bankoski	fba0ead543	Merge "Tokenization without huge tables."	2014-12-22 13:36:38 -08:00
Jim Bankoski	a5f7d78a06	make vp9_coef_encodings const Change-Id: I28a3d342a4a4b23e02a0f47bb8037c4403f71d61	2014-12-22 13:35:56 -08:00
Jingning Han	d0f2377027	Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."" This reverts commit `9946ee23e0`. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07	2014-12-22 10:09:25 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Jim Bankoski	17ee87b46c	convert extra bit cat structure to const statics Change-Id: Idb257e78dab2339ab1f41c3c82e537bc23e90b65	2014-12-22 06:57:50 -08:00
Jim Bankoski	3d94b9bf24	Merge "resolve visual studio warnings around initializers"	2014-12-19 15:18:04 -08:00
Jim Bankoski	dd4275e498	resolve visual studio warnings around initializers Change-Id: Id2ad4fb24242f7ca8fa7a152f0889fded4113613	2014-12-19 12:38:25 -08:00
James Zern	953dd1894d	vp9: add per-tile longjmp error handling this avoids longjmp'ing from another thread on error which will cause undesired behavior Change-Id: Ic9074ed8cc4243944bf2539d6e482f213f4e8c86	2014-12-19 11:50:04 -08:00
Jingning Han	1b5d612b5d	Merge "Add a guard on intra mode skip control for RTC mode"	2014-12-19 11:03:00 -08:00
Jingning Han	9c93307c10	Merge "Remove ARF mode entries from THR_MODES array in non-RD mode"	2014-12-19 11:02:51 -08:00
Jingning Han	cb01baa0fa	Merge "Rework mode search threshold update for RTC coding mode"	2014-12-19 11:02:40 -08:00
Jingning Han	a8e6d4d041	Merge "Properly store the tx_size of selected intra mode"	2014-12-19 11:02:37 -08:00
Paul Wilkins	9946ee23e0	Revert "Removal of legacy zbin_extra / zbin_oq_value." This reverts commit `e9b586e21b`. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4	2014-12-19 15:02:58 +00:00
Paul Wilkins	8ac3f9adaa	Merge "Removal of legacy zbin_extra / zbin_oq_value."	2014-12-19 03:37:02 -08:00
James Zern	b32ba09d35	Merge "make vp9 encoder static initializers thread safe"	2014-12-18 18:48:30 -08:00
Jim Bankoski	cd60930814	make vp9 encoder static initializers thread safe Change-Id: If2d0888d13ebe52bc7c3b16f16319408a86ab6de	2014-12-18 15:50:46 -08:00
Jingning Han	6ec0ef6691	Add a guard on intra mode skip control for RTC mode This commit adds a guard condition to the intra mode test skip control in RTC coding mode. If all inter modes are skipped, force the encoder to check intra mode. It avoids situations where the encoder processes without properly assigning required mode information. Change-Id: Ibb349fee997d6584ce901d08b06e8df3ca9c01b1	2014-12-18 12:00:27 -08:00
Paul Wilkins	e9b586e21b	Removal of legacy zbin_extra / zbin_oq_value. zbin extra / zbin_oq_value was widely passed around, hence removal touches a lot of code. Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5	2014-12-18 16:49:11 +00:00
Paul Wilkins	60e9b731cf	Remove mode dependent zbin boost. Initial patch to remove get_zbin_mode_boost() and cpi->zbin_mode_boost. For now sets a dummy value of 0 for zbin extra pending a further clean up patch. Change-Id: I64a1e1eca2d39baa8ffb0871b515a0be05c9a6af	2014-12-18 16:45:52 +00:00
Paul Wilkins	2e39817f5e	Merge "Improve motion detection for low complexity regions."	2014-12-18 08:38:21 -08:00
hkuang	b7166143d0	Let YUV plane share the same dqcoeff buffer. Remove unnecessary dqcoeff from macroblockd, which reduces macroblockd size by 16384 bytes. Change-Id: Ia379a703b4fee81c8fd4698b52488a85a90c9bc2	2014-12-17 18:29:07 -08:00
Jingning Han	dd0602e01c	Remove ARF mode entries from THR_MODES array in non-RD mode The alternate reference frame is disabled in non-RD mode. No need to keep the related entries in the THR_MODES array. Change-Id: I53386f4bb1c6284f582801f27246c5edf55bc24b	2014-12-17 17:13:15 -08:00
Yaowu Xu	09b9a59fb5	Merge "Corrected value range of --cpu-used for vp9"	2014-12-17 17:12:14 -08:00
Jingning Han	455514a683	Rework mode search threshold update for RTC coding mode In RTC coding mode, the alternate reference frame modes and compound inter prediction modes are disabled. This commit reworks the related mode search threshold update process to skip interacting with these coding modes. It provides about 1.5% speed-up for speed -6 on average. vidyo1 16551 b/f, 40.451 dB, 6261 ms -> 16550 b/f, 40.459 dB, 6190 ms nik720p 33316 b/f, 38.795 dB, 6335 ms -> 33310 b/f, 38.798 dB, 6237 ms mmmoving 33265 b/f, 41.055 dB, 7176 ms -> 33267 b/f, 41.064 dB, 7084 ms dark720 33329 b/f, 39.729 dB, 11235 ms -> 33331 b/f, 39.733 dB, 10731 ms Change-Id: If2a4090a371cd28f579be219c013b972d7d9b97f	2014-12-17 15:56:01 -08:00
Yaowu Xu	a16f075375	Corrected value range of --cpu-used for vp9 This commit removes undefined value options of cpu-used for VP9 and changed vpxenc prompt to reflect the usable range of [-8,8] Change-Id: Ib80fef3dbb6ec9aabac45ed13e8ab6fbaf94f55e	2014-12-17 15:18:01 -08:00
JackyChen	9bc7974552	Merge "Add rectangle block support for MFQE."	2014-12-17 15:10:02 -08:00
Jim Bankoski	fd96deb06c	remove static initializers for partition tree Could have problem with 2 encoders. Change-Id: I92d326933c00fee688f77b54acf467ca5a8516bc see: https://code.google.com/p/webm/issues/detail?id=900&thanks=900&ts=1418843841	2014-12-17 11:41:06 -08:00
JackyChen	021e244a51	Merge "Use bit_depth in VP9Common as the flag of highbit."	2014-12-17 09:30:32 -08:00
Jingning Han	56a8bc54a6	Properly store the tx_size of selected intra mode Use a temporary variable to store the transform size associated with the best intra mode and restore the mode_info if the overall best mode is intra mode. Change-Id: I2606e0061ad32f91b095462902b1eb734b128eea	2014-12-17 09:25:14 -08:00
Jingning Han	00d2211929	Merge "Remove reset mode_info array per frame"	2014-12-17 09:24:44 -08:00
Jingning Han	cc8a11d8a1	Merge "Set second ref frame to be NONE in key frame coding"	2014-12-17 09:24:39 -08:00
Paul Wilkins	b76312124d	Deleted unused #define FAST_MOTION_MV_THRESH no longer referenced. Change-Id: Idee6ee5a59ba330904c42b20c9ec35b6fc16f7a2	2014-12-17 14:59:22 +00:00
JackyChen	b363cedcd1	Use bit_depth in VP9Common as the flag of highbit. Change-Id: I881aefbe68f9c10bb4629a2a5ee1e42a225d5ab7	2014-12-16 21:45:01 -08:00
James Yu	aeeaa67987	VP9 common for ARMv8 by using NEON intrinsics 15 Re-write - vp9_lpf_horizontal_4_dual_neon in vp9_loopfilter_16_neon.c Change-Id: Ie14f63d352f9564ad01db3939a61d91cf6d21a31 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-16 20:00:26 -08:00
Johann	ebc1951c7c	Merge "Use defines for inline and __builtin_prefetch"	2014-12-16 18:04:04 -08:00
Jingning Han	200d93545e	Merge "Fix intra mode update process in vp9_pick_inter_mode"	2014-12-16 17:04:04 -08:00
JackyChen	9931070094	Add rectangle block support for MFQE. Only for the rectangle blocks larger than 16X16, SAD and Variance are still based on the internal square blocks. Change-Id: I3754da1b0254147313f86a0140dbf4f980f06a5a	2014-12-16 16:35:54 -08:00
Johann	4f7060a431	Merge "VP9 common for ARMv8 by using NEON intrinsics 16"	2014-12-16 16:15:48 -08:00
Jingning Han	ccdc448b70	Remove reset mode_info array per frame The mode_info array was unnecessarily reset to zero every frame when error resilient mode turned on, given that the mode info values per block will be assigned during mode search stage. This commit removes this reset operation. It reduces the runtime cost on memset operation to 1/3. The overall speed -6 runtime is reduced by 2%. Change-Id: I32ecb73338d8995cc0c5147de09357364f13d45b	2014-12-16 15:54:24 -08:00
Jingning Han	01613aa753	Set second ref frame to be NONE in key frame coding This commit explicitly sets the second reference frame type to be NONE in key frame coding mode. This fixes a subtle dependency of reference motion vector used by next inter frame on mode_info reset before key frame coding. Change-Id: I5ff0359753fdc9992b0bfe889490f7a32d7d5f6a	2014-12-16 15:49:58 -08:00
Johann	2fdbf70d40	Use defines for inline and __builtin_prefetch These were established for compatibility. Make sure to use them. Most frequently they manifest as issues on Visual Studio builds. Change-Id: I39d764d2eb341b999d7a6132cb44b2acfc511160	2014-12-16 15:21:19 -08:00
Frank Galligan	5fdd0f1fe0	Merge "Revert "Revert "Add support for setting byte alignment."""	2014-12-16 15:14:17 -08:00
James Yu	aa8dd897c1	VP9 common for ARMv8 by using NEON intrinsics 16 Add vp9_reconintra_neon.c - vp9_v_predictor_4x4_neon - vp9_v_predictor_8x8_neon - vp9_v_predictor_16x16_neon - vp9_v_predictor_32x32_neon - vp9_h_predictor_4x4_neon - vp9_h_predictor_8x8_neon - vp9_h_predictor_16x16_neon - vp9_h_predictor_32x32_neon - vp9_tm_predictor_4x4_neon - vp9_tm_predictor_8x8_neon - vp9_tm_predictor_16x16_neon - vp9_tm_predictor_32x32_neon Change-Id: Ib5d54a4766a1b5127169045659974f33aa98376d Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-16 12:57:52 -08:00
James Yu	ba05a4c640	VP9 common for ARMv8 by using NEON intrinsics 19 Delete vp9_dc_only_idct_add_neon.c The function was merged with vp9_short_idct4x4_1_add (later vp9_idct4x4_1_add) in `d2de1ca` and should have been deleted then. Change-Id: Ie58ba3dd9dc7330a8f1238dd7dd71c9ed4639b94 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-16 11:14:12 -08:00
JackyChen	603cdcfce5	Merge "Fixed MFQE crash issue for highbit depth."	2014-12-16 11:12:03 -08:00
JackyChen	e7bad92689	Fixed MFQE crash issue for highbit depth. Check the flags, no MFQE for highbit now. Will add highbit support later. Change-Id: I548c27593e0f47ab7f4c92b45f14fb037dc86591	2014-12-16 10:07:38 -08:00
Jingning Han	581c8dbd33	Merge "Initialize best_tx_size with invalid value"	2014-12-16 10:01:03 -08:00
Yaowu Xu	b60ae45f36	Merge "Prevent decoder from using uninitialized entropy context."	2014-12-16 09:30:24 -08:00
Jingning Han	b47f9c5802	Merge "Use right shift to replace division in vp9_pick_inter_mode"	2014-12-16 09:26:51 -08:00
Paul Wilkins	b6c75c5a8d	Improve motion detection for low complexity regions. Where there is very subtle motion, especially when combined with low spatial complexity, the codec sometimes fails to quickly pick up the ambient motion field. Once it has been established though the field propagates well using Nearest and Near MV. This patch looks specifically at the case where the Nearest and Near have not been established as non zero vectors and in this case discounts the cost of searching for a new vector in the rd code. This will almost certainly have some implications in terms of encode speed but it should be possible to mitigate the impact in a subsequent patch using first pass stats and the local spatial complexity. Average results for test sets approximately neutral. Change-Id: I44a29e20f11f7ab10f8c93ffbdc50183d9801524	2014-12-16 17:22:54 +00:00
Debargha Mukherjee	4ebdb4a1b9	Merge "Fix for crash in highbitdepth rt mode"	2014-12-16 06:41:54 -08:00
Jim Bankoski	abc5a66770	Merge "Fix the comments."	2014-12-16 06:25:01 -08:00
Peter de Rivaz	e3d19bfc63	Fix for crash in highbitdepth rt mode Change 72141 introduced a new use of vp9_avg_4x4. This call needs to switch to using vp9_highbd_avg_4x4 when performing high bitdepth encodes. Change-Id: I6a8ba4b62f8a75d0a917b365a55245e2f0438ea1	2014-12-16 10:55:49 +00:00
Jingning Han	df3e3ab6ff	Fix intra mode update process in vp9_pick_inter_mode When multiple intra modes are tested, the previous mode info update process may overwrite the selected best intra mode and make the final selection use an inter mode. This commit fixes this issue by moving the mode_info reset outside the intra mode search loop. Change-Id: I15ed4288a6b3cb0832104a5e6d5d9a25cd1a5b2b	2014-12-15 17:52:09 -08:00
Johann	1d059fa23e	Merge "VP9 common for ARMv8 by using NEON intrinsics 06"	2014-12-15 14:49:33 -08:00
Johann	37ea1e1218	Merge "VP9 common for ARMv8 by using NEON intrinsics 05"	2014-12-15 14:48:53 -08:00
Jingning Han	5c93dca3d3	Merge "Simplify rate-distortion modeling function"	2014-12-15 14:37:19 -08:00
Jingning Han	c2c7596fc7	Initialize best_tx_size with invalid value If vp9_pick_inter_mode works properly, it should at least check one coding mode and hence get best_tx_size assigned a valid value. There is no need to initialize best_tx_size with a legitimate value before starting the mode search. Change-Id: Ic0496cd89672ea9c2c512a9bd1da952190af9cba	2014-12-15 12:58:34 -08:00
Jingning Han	83e2c62aba	Use right shift to replace division in vp9_pick_inter_mode Make the variable reduction_fac log2 based and explicitly use right shift when computing intra_cost_penalty. Change-Id: I208f1fb879a02debb3b3fc64f9fd06260dcf1c86	2014-12-15 12:48:07 -08:00
Frank Galligan	c4f7079ad4	Revert "Revert "Add support for setting byte alignment."" This reverts commit `91471d6aad`. Fixes the compile issues if post_proc is enabled. Change-Id: Ib40a15ce2c194f9b5adfa65a17ab01ddf60f5a59	2014-12-15 12:20:37 -08:00
James Yu	4f856cd7fa	VP9 common for ARMv8 by using NEON intrinsics 06 Add vp9_iht8x8_add_neon.c - vp9_iht8x8_64_add_neon The assembly did not previously implement tx_type 0 BUG=716 Change-Id: Icfc99dd24f3d59047f9184a7d0c761ba7e3de934 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-15 12:18:06 -08:00
James Yu	6b71013277	VP9 common for ARMv8 by using NEON intrinsics 05 Add vp9_iht4x4_add_neon.c - vp9_iht4x4_16_add_neon The assembly did not previously implement tx_type 0 BUG=715 Change-Id: I60034d1568de034edba45c5cdd13f3d87dbc73b6 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-15 12:16:19 -08:00
James Zern	8d558f2ca5	Merge "vp9/MACROBLOCKD: reorder struct members"	2014-12-15 11:54:51 -08:00
Jingning Han	eefe869291	Simplify rate-distortion modeling function Use left shift to replace one multiplication. The computation outcome remains identical. Change-Id: I1e1737af0a245de0d2a2bde10f0c171477199fc1	2014-12-15 11:51:16 -08:00
Paul Wilkins	91471d6aad	Revert "Add support for setting byte alignment." Fails to compile. Bad calls to vp9_alloc_frame_buffer and vp9_realloc_frame_buffer in postproc.c This reverts commit `399823b6f5`. Change-Id: I29f0e173f8e185d3a303cfdb17813e1eccb51e3a	2014-12-15 11:54:13 +00:00
James Zern	c58c579ec4	vp9/MACROBLOCKD: reorder struct members improves locality of reference Change-Id: I0639b98bf38879f918173b3a1b25dd93090e88b4	2014-12-12 18:01:24 -08:00
James Zern	089086bc25	Merge "Optimize bit_read_buffer."	2014-12-12 16:29:42 -08:00
Frank Galligan	9c2601eb68	Merge "Add support for setting byte alignment."	2014-12-12 15:47:11 -08:00
hkuang	3cecce916b	Optimize bit_read_buffer. Change-Id: Iee43c34909deec9787b29c1c33672213b9f049df	2014-12-12 14:38:12 -08:00
James Zern	89ee8923a8	Merge "Remove redundant loads on 1d16_v8 filter."	2014-12-12 14:32:52 -08:00
James Zern	f82d7fd854	Merge "Remove redundant loads on 1d8_v8 filter."	2014-12-12 14:32:26 -08:00
James Zern	4d40a046da	Merge "vp9: move encoder-only member from common"	2014-12-12 14:28:55 -08:00
James Zern	2bf4b4852f	Merge changes Id6421838,I37499329 * changes: vp9: make postproc members depend on CONFIG_VP9_POSTPROC vp9_postproc: remove redundant CONFIG_* checks	2014-12-12 14:27:56 -08:00
Marco	7f59cff53d	Merge "Allow for 4x4 prediction blocks for key frame, speed 6."	2014-12-12 14:27:31 -08:00
James Zern	5ccff43292	Merge "vp9_loopfilter_mmx: remove some unused tables"	2014-12-12 14:25:53 -08:00
Frank Galligan	399823b6f5	Add support for setting byte alignment. Add support for setting byte alignment on the Y, U, and V plane of the reference buffers. The byte alignment must be a power of 2, from 32 to 1024. A value of 0 sets legacy alignment. Change-Id: I7c1399622f7aa68e123646369216b32047dda73d	2014-12-12 13:34:36 -08:00
James Zern	6d1a63a02a	Merge "Remove unnecessary dqcoeff memset."	2014-12-12 12:16:32 -08:00
Frank Galligan	6a24dbd71f	Remove redundant loads on 1d16_v8 filter. This CL showed about a 3% gain in performance on some systems. Change-Id: Id27e7e0b8e69068aa364e67859436da852669250	2014-12-12 11:48:47 -08:00
Frank Galligan	44ee777905	Remove redundant loads on 1d8_v8 filter. This CL showed a modest gain in performance on some systems. Change-Id: Iad636a89a1a9804ab7a0dea302bf2c6a4d1653a4	2014-12-12 11:34:24 -08:00
James Zern	72ece1308b	vp9: move encoder-only member from common allow_comp_inter_inter VP9_COMMON -> VP9_COMP Change-Id: I6d9dc25d1cdd7e2ab62f5be69cd9fa883d21dbb6	2014-12-12 11:17:44 -08:00
James Zern	ef06de33fe	vp9: make postproc members depend on CONFIG_VP9_POSTPROC Change-Id: Id64218386968cee3132269e4a0572650f20fd980	2014-12-12 11:17:17 -08:00
James Zern	890f7bedf3	vp9_postproc: remove redundant CONFIG_* checks the entire module is wrapped in CONFIG_VP9_POSTPROC which is forcibly enabled with CONFIG_INTERNAL_STATS + a similar change in vp9_alloccommon.c Change-Id: I374993297a9fba5bef2f0b71f984eba42f0995a3	2014-12-12 11:17:16 -08:00
James Zern	d456ccbc9d	vp9_loopfilter_mmx: remove some unused tables Change-Id: I964d25cc91c8e4864d73b142d9c7a1b39cb6cfbb	2014-12-12 11:16:24 -08:00
Jim Bankoski	d916b0f22f	Merge "vp9_dx_iface.c uses CONFIG_VP9_POSTPROC but config.h not included"	2014-12-12 11:10:17 -08:00
Jingning Han	3e0793b80b	Merge "Fix PICK_MODE_CONTEXT index in non-RD coding mode"	2014-12-12 09:16:01 -08:00
Jim Bankoski	c67859f737	vp9_dx_iface.c uses CONFIG_VP9_POSTPROC but config.h not included Change-Id: Id316b3786214bf1028992968955da917e3f2d4a3	2014-12-12 08:42:36 -08:00
Jingning Han	e2c2a65695	Fix PICK_MODE_CONTEXT index in non-RD coding mode This commit fixes a bug in the PICK_MODE_CONTEXT index for horizontal partition case. The compression performance change is less than 0.01% level, since most blocks are selected to use square block size in RTC coding mode. Change-Id: I67effc18ae8795fccdd82a55f4efc609fa5cb3e1	2014-12-11 17:21:24 -08:00
JackyChen	3425d6c83e	Merge "Multiframe Quality Enhancement(MFQE) in VP9."	2014-12-11 16:24:08 -08:00
Marco	7e99cd2a9b	Allow for 4x4 prediction blocks for key frame, speed 6. For key frame under variance source partition: 4x4 prediction blocks may be selected when variance of 8x8 block is very high (threshold is set fairly high for now). Testing on some RTC clips shows this helps to reduce some ringing artifacts on key frame. Encoded key frame size increases by about 10%. Key frame PSNR increases by about 0.1-0.2dB. Change-Id: I56e203fac32ea6ef69897fb3ea269c59cb50d174	2014-12-11 15:36:16 -08:00
Jingning Han	811c74cdfa	Merge "Replace division with bit shift in choose_partitioning"	2014-12-11 13:30:03 -08:00
Debargha Mukherjee	dd33c656da	Merge "Corrected optimization of 8x8 DCT code"	2014-12-11 12:28:45 -08:00
hkuang	3c7a06c3cc	Remove unnecessary dqcoeff memset. dqcoeff is set to 0 on initialization, and is set back to 0 every time after being used. Change-Id: I32b8e149bba40a8d707849f737a8e49a691f319c	2014-12-11 12:27:25 -08:00
Jingning Han	d9892e846f	Merge "Refactor choose_partitioning computing scheme"	2014-12-11 11:14:07 -08:00
Jingning Han	d5c396a902	Replace division with bit shift in choose_partitioning This commit explicitly uses the bit shift operation instead of division for computing block variance. Change-Id: Id19c0ff27dd1d1ae4aceee6657e1aad0d406bd74	2014-12-11 11:06:57 -08:00
Alexander Voronov	6c6a97814f	Prevent decoder from using uninitialized entropy context. If decoding starts with intra-only frame, there is a possibility of using uninitialized entropy context, which leads to undefined behavior. Change-Id: Icbb64b5b1bd1e5de2a4bfa2884e56bc0a20840af	2014-12-11 20:44:19 +03:00
Peter de Rivaz	5c22224e9e	Corrected optimization of 8x8 DCT code The 8x8 DCT uses a fast version whenever possible. There was a mistake in the checking code which meant sometimes the fast version was used when it was not safe to do so. Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7 (cherry picked from commit `fd05fb0c21`)	2014-12-11 09:42:57 -08:00
Jingning Han	377d2f027a	Refactor choose_partitioning computing scheme This commit refactors the choose_partitioning function. It removes redundant memset calls and makes the encoder calculate the variance value per block only when it is needed. It reduces the average runtime cost of choose_partitioning by 60%. Overall it reduces speed -6 runtime by 2-5%. Change-Id: I951922c50d901d0fff77a3bafc45992179bacef9	2014-12-11 09:33:40 -08:00
JackyChen	7ac3e3c1d6	Multiframe Quality Enhancement(MFQE) in VP9. It is the first version of MFQE in VP9. There are a few TODOs included in this version. Usage: Add flag --enable-vp9-postproc to config the project. In decoder, use flag --mfqe in the command line to enable MFQE in postproc. Note: Need to have key frame with low quality to see the effect of this new patch. In my experiment, I fixed the qindex to 200 in key frame. Change-Id: I021f9ce4616ed3574c81e48d968662994b56a396	2014-12-11 09:19:39 -08:00
James Yu	3f7c12dab9	VP9 common for ARMv8 by using NEON intrinsics 18 Add vp9_idct32x32_add_neon.c - vp9_idct32x32_1024_add_neon Change-Id: Ic598b772c28bd3487a8ead7a4598a66b25f9b00f Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 18:20:04 -08:00
James Yu	3cfed4bf76	VP9 common for ARMv8 by using NEON intrinsics 14 Add vp9_idct16x16_add_neon.c - vp9_idct16x16_256_add_neon_pass1 - vp9_idct16x16_256_add_neon_pass2 - vp9_idct16x16_10_add_neon_pass1 - vp9_idct16x16_10_add_neon_pass2 Change-Id: I54d25b54a36f4371760f54e4036693aaea40a5de Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 18:19:54 -08:00
James Yu	ce76aeb00d	VP9 common for ARMv8 by using NEON intrinsics 13 Add vp9_idct8x8_add_neon.c - vp9_idct8x8_64_add_neon - vp9_idct8x8_10_add_neon Change-Id: I6ee7b4496765aa36ed52990f2ef73e9f24459610 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 14:56:54 -08:00
James Yu	8c25f4af6a	VP9 common for ARMv8 by using NEON intrinsics 12 Add vp9_idct4x4_add_neon.c - vp9_idct4x4_16_add_neon Change-Id: I011a96b10f1992dbd52246019ce05bae7ca8ea4f Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 14:49:59 -08:00
James Yu	420f58f2d2	VP9 common for ARMv8 by using NEON intrinsics 11 Add vp9_idct16x16_1_add_neon.c - vp9_idct16x16_1_add_neon Change-Id: I7c6524024ad4cb4e66aa38f1c887e733503c39df Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 13:06:58 -08:00
James Yu	030ca4d0e5	VP9 common for ARMv8 by using NEON intrinsics 10 Add vp9_idct32x32_1_add_neon.c - vp9_idct32x32_1_add_neon Change-Id: If9ffe9a857228f5c67f61dc2b428b40965816eda Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 13:04:29 -08:00
James Yu	2772b45ac0	VP9 common for ARMv8 by using NEON intrinsics 09 Add vp9_idct8x8_1_add_neon.c - vp9_idct8x8_1_add_neon Change-Id: I9d23e01fa96013febbf64db6c76c6c955f14e3ff Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 12:52:33 -08:00
James Yu	9114f0afdb	VP9 common for ARMv8 by using NEON intrinsics 08 Add vp9_idct4x4_1_add_neon.c - vp9_idct4x4_1_add_neon Change-Id: Ieab9af107dbd07a4f9503bc945890c90faccb8ac Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 12:49:28 -08:00
Johann	2d8f581330	Merge "VP9 common for ARMv8 by using NEON intrinsics 07"	2014-12-10 11:40:46 -08:00
Johann	913d0adbaf	Merge "VP9 common for ARMv8 by using NEON intrinsics 04"	2014-12-10 11:40:29 -08:00
Paul Wilkins	65cfb808d0	Merge "Substantial restructuring of AQ mode 2."	2014-12-10 10:44:27 -08:00
Jingning Han	ad19724f1a	Merge "Use use_prev_frame_mvs flag for ref mv search branch"	2014-12-10 09:25:12 -08:00
Jingning Han	6fc289b9c0	Merge "Refactor update_state_rt"	2014-12-10 09:25:05 -08:00
Jingning Han	8bd88a3c83	Merge "Make RTC coding flow support sub8x8 in key frame coding"	2014-12-10 09:24:56 -08:00
Jingning Han	4cda7a1a9a	Merge "Cosmetic naming change"	2014-12-10 09:05:34 -08:00
Jingning Han	fb3cc0ed57	Merge "Take out redundant setting of mode_info from set_block_size"	2014-12-10 09:05:26 -08:00
Jingning Han	161f636809	Merge "Remove unused rd cost calculation from nonrd_use_partition"	2014-12-10 09:05:18 -08:00
James Yu	01fc6f51e0	VP9 common for ARMv8 by using NEON intrinsics 07 Add vp9_convolve8_neon.c - vp9_convolve8_horiz_neon - vp9_convolve8_vert_neon Change-Id: I0bdd99ff72d275223fe211ac7243c25a5a60cf87 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:03:07 -08:00
James Yu	893534a996	VP9 common for ARMv8 by using NEON intrinsics 04 Add vp9_convolve8_avg_neon.c - vp9_convolve8_avg_horiz_neon - vp9_convolve8_avg_vert_neon Change-Id: I617971e37b02186fec5aca181f4f9622050ea2df Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:03:07 -08:00
James Yu	d12757f5c6	VP9 common for ARMv8 by using NEON intrinsics 03 Add vp9_copy_neon.c - vp9_convolve_copy_neon Change-Id: I291fc5423d06240876411bbceab03eae5ef585be Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:02:46 -08:00
Scott LaVarnway	617382a2e3	VP9 common for ARMv8 by using NEON intrinsics 02 Add vp9_avg_neon.c - vp9_convolve_avg_neon Change-Id: Id2c9d5bcfa37cff1a16417aba1656ff07bdf10fd Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 19:00:21 -08:00
Jingning Han	0cac834b5a	Use use_prev_frame_mvs flag for ref mv search branch Replace error_resilient flag with use_prev_frame_mvs in vp9_pick_inter_mode reference motion vector search selection. This effectively turns off the simplified ref mv search in the settings of frame resizing, even if error-resilient mode is off. Change-Id: I7fed814ee7bc0cb419a03b846e0fc2de46ba7686	2014-12-09 18:18:40 -08:00
Jingning Han	e728678c50	Refactor update_state_rt Update the frame motion vector only if previous frame motion vector is needed for next frame reference motion vector. Change-Id: Ica50f9d7b46ad4f815bba0d9e30f5546df29546f	2014-12-09 15:35:49 -08:00
hkuang	4eee74d6ed	Fix clang ioc warning due to NULL src_mi pointer. The warning only happens in VP9 encoder's first pass because src_mi is not set up yet. But it will not fail the encoder as left_mi and above_mi are not used in the first_pass and they will be set up again in the second pass. Change-Id: I12dffcd5fb1002b2b2dabb083c8726650e4b5f08	2014-12-09 14:32:48 -08:00
Johann	5810f1b4cd	Merge "VP9 common for ARMv8 by using NEON intrinsics 01"	2014-12-09 13:41:49 -08:00
James Yu	5b098b1825	VP9 common for ARMv8 by using NEON intrinsics 01 Add vp9_loopfilter_neon.c - vp9_lpf_horizontal_4_neon - vp9_lpf_vertical_4_neon - vp9_lpf_horizontal_8_neon - vp9_lpf_vertical_8_neon Change-Id: I97a0d7b399a431c21ee77396be3d5f5a1f7ebccb Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 12:26:56 -08:00
Jingning Han	225cdef665	Make RTC coding flow support sub8x8 in key frame coding This commit enables the use of sub8x8 blocks in RTC key frame encoding. It requires the block size to be preset and will decide the coding mode and encode the bit-stream. Change-Id: I35aaf8ee2d4d6085432410c7963f339f85a2c19b	2014-12-09 11:34:58 -08:00
Jingning Han	4bacaab46d	Cosmetic naming change Rename set_modeinfo_offsets as set_mode_info_offsets, to be more consistent with naming convention. Change-Id: I68ca1f36c4a78127d9439a50c1506a2afd07927d	2014-12-09 10:32:04 -08:00
Jingning Han	f051a7beab	Take out redundant setting of mode_info from set_block_size The later encoding process will take the top-left block's mode_info for pre-determined block size. Change-Id: I76a90f9ce7f3b2dbc2975b52442114e461c465b5	2014-12-09 10:27:18 -08:00
hkuang	3dfdfd5c86	Merge "Clean up the logic of handling corrupted frame."	2014-12-09 10:23:18 -08:00
Paul Wilkins	e68c8dcfd2	Substantial restructuring of AQ mode 2. The restructure moves the decision into the rd pick modes loop and makes a decision based at the 16x16 block level instead of only the 64x64 level. This gives finer granularity and better visual results on the clips I have tested. Metrics results are worse than the old AQ2 especially for PSNR and this mode now falls between AQ0 and AQ1 in terms of visual impact and metrics results. Further tuning of this to follow. It should be noted that if there are multiple iterations of the recode loop the segment for a MB could change in each loop if the previous loop causes a change in the complexity / variance bin of the block. Also where a block gets a delta Q this will alter the rd multiplier for this block in subsequent recode iterations and frames where the segmentation is applied. Change-Id: I20256c125daa14734c16f7cc9aefab656ab808f7	2014-12-09 15:10:52 +00:00
Jingning Han	1395ded2a7	Remove unused rd cost calculation from nonrd_use_partition The per block rd cost calculation is not needed when partition size is preset. Change-Id: Ie5575248bbffb584e908aa13097f697ace6ec747	2014-12-08 18:45:19 -08:00
Yunqing Wang	cddbdeabd0	Merge "SSSE3 Optimization for Atom processors using new instruction selection and ordering"	2014-12-08 13:34:54 -08:00
James Zern	c38d0490b3	Merge "Changes to assembler for NASM on mac."	2014-12-08 12:55:06 -08:00
hkuang	81e5cb86d3	Fix the comments. Change-Id: I9789476865a1b24dad54115d8f7edb4fed780b90	2014-12-08 12:44:09 -08:00
hkuang	d05cf10fe7	Add error handling for frame parallel decode and unit test for that. Change-Id: I6e309e11f1641618d2424b7a2c0fe744b8974dec	2014-12-08 12:30:19 -08:00
levytamar82	8f9d94ec17	SSSE3 Optimization for Atom processors using new instruction selection and ordering The function vp9_filter_block1d16_h8_ssse3 uses the PSHUFB instruction which has a 3 cycle latency and slows execution when done in blocks of 5 or more on Atom processors. By replacing the PSHUFB instructions with other more efficient single cycle instructions (PUNPCKLBW + PUNPCKHBW + PALIGNR) performance can be improved. In the original code, the PSHUFB uses every byte and is consecutively copied. This is done more efficiently by PUNPCKLBW and PUNPCKHBW, using PALIGNR to concatenate the intermediate result and then shift right the next consecutive 16 bytes for the final result. For example: filter = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8 Reg = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 REG1 = PUNPCKLBW Reg, Reg = 0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7 REG2 = PUNPCKHBW Reg, Reg = 8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15 PALIGNR REG2, REG1, 1 = 0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8 This optimization improved the function performance by 23% and produced a 3% user level gain on 1080p content on Atom processors. There was no observed performance impact on Core processors (expected). Change-Id: I3cec701158993d95ed23ff04516942b5a4a461c0	2014-12-08 13:11:01 -07:00
hkuang	f925e5ce0f	Merge "Improve the performance by caching the left_mi and right_mi in macroblockd."	2014-12-08 10:24:17 -08:00
Paul Wilkins	127f65531b	Merge "Use average mb energy from first pass in AQ2 test."	2014-12-08 09:01:39 -08:00
Frank Galligan	0f8e8330eb	Merge "Fix potential integer overflow."	2014-12-07 21:37:39 -08:00
James Zern	da464c483f	Merge "vp9 asserts: fix compile warning"	2014-12-05 21:09:42 -08:00
James Zern	3db785facc	Merge "vp9: fix frame-parallel encoding"	2014-12-05 19:00:48 -08:00
Deb Mukherjee	0d367474d0	Merge "Some internal-stats, vp9-highbitdepth bug fixes"	2014-12-05 17:49:52 -08:00
James Zern	6db81fd629	vp9: fix frame-parallel encoding the flag in the header wasn't being set based on the encoder configuration in non-intra only mode broken since: `fbc2fbf` Adding oxcf temp variable. Change-Id: Ib4cff9901889824bc4e68d7f0f6deb1e41df2f53	2014-12-05 17:44:46 -08:00
Jingning Han	bd6bfb93b0	Merge "Remove redundant rdcost reset"	2014-12-05 17:35:07 -08:00
Jingning Han	296afb9440	Merge "Fix a motion search skip condition in vp9_pick_inter_mode"	2014-12-05 17:35:04 -08:00
Jingning Han	3d8d1e374e	Merge "Remove redundant MB_MODE_INFO reset from vp9_pick_mode_inter"	2014-12-05 16:59:50 -08:00
hkuang	382f86f945	Improve the performance by caching the left_mi and right_mi in macroblockd. This improves the decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3	2014-12-05 16:25:42 -08:00
James Zern	616b3a810f	vp9 asserts: fix compile warning string literal to int within an assert Change-Id: I76a173f96b9add5bf27c3f5ad5d72c6f30e51629	2014-12-05 16:20:42 -08:00
Jingning Han	17bedc54f5	Remove redundant rdcost reset The initial reset of this_rdc in vp9_pick_inter_mode is not needed, since it will be re-assigned when used. Change-Id: Ic0e12d741cbab292fc214c1eabb48b129af7839b	2014-12-05 16:06:17 -08:00
Jingning Han	eadffb2d6e	Fix a motion search skip condition in vp9_pick_inter_mode Compare the current best mode rate-distortion cost with the skip threshold to decide whether to perform motion search. Change-Id: Ia071824f8dd3b7db485f424692a485a2da6a1a9f	2014-12-05 15:58:36 -08:00
Jingning Han	732d57c2b5	Remove redundant MB_MODE_INFO reset from vp9_pick_mode_inter Change-Id: I0222f7abc61202f4a83b117bbfb042ada6304562	2014-12-05 15:51:11 -08:00
hkuang	eaa6deee5b	Merge "Merge set_prev_mi function into encoder function."	2014-12-05 15:12:50 -08:00
Deb Mukherjee	37448d3e1f	Some internal-stats, vp9-highbitdepth bug fixes Change-Id: I0363d98f6f6558a43276aec48f27dca37c93f5ad	2014-12-05 13:40:50 -08:00
Jingning Han	6ae829088f	Merge "Remove redundant vp9_zero in choose_partitioning"	2014-12-05 11:47:58 -08:00
Jingning Han	69a9dc5cd3	Merge "Enable conditional skip path in rd_pick_intra_sby_mode"	2014-12-05 11:25:30 -08:00
Jingning Han	62c7356098	Merge "Use hybrid RD and non-RD coding flow for key frame coding"	2014-12-05 11:25:19 -08:00
Jingning Han	9d88b30854	Remove redundant vp9_zero in choose_partitioning It makes the overall speed -6 about 2% faster with no compression performance change. Change-Id: I680a967b421caa2c5a5cdb821311c4726a2df45a	2014-12-05 10:39:39 -08:00
Jingning Han	74ded4863e	Enable conditional skip path in rd_pick_intra_sby_mode These speed-up features for key frame coding are only turned on in the settings of hybrid non-RD and RD mode decision. It provides about 20% speed-up to the hybrid key frame coding at the expense of certain compression performance loss. For vidyo1, the key frame coding statistics are changed 9838F, 35.020 dB, 61677 us -> 9920F, 34.834 dB, 47556 us Overall rtc set compression performance is down by -0.257%. Change-Id: I0025447fda26bb7855e982955642b5f55d71b51f	2014-12-05 09:36:09 -08:00
Jingning Han	07711e9b27	Use hybrid RD and non-RD coding flow for key frame coding When block size is below 16x16, the encoder switches from non-RD to RD mode for key frame coding. This largely brought back the key frame compression performance. For vidyo1 at 1000 kbps, the key frame coding statistics are changed 9978F, 34.183 dB, 36807 us -> 9838F, 35.020 dB, 61677 us As compared to the full RD case 7187F, 34.930 dB, 214470 us The overall rtc set coding performance (single key frame setting) is improved by 1.5%. Change-Id: I78a4ecf025d7b24ec911e85be94e01da05e77878	2014-12-05 09:35:27 -08:00
Yunqing Wang	a3a4a34c60	Merge "vp9_ethread: the tile-based multi-threaded encoder"	2014-12-05 08:23:49 -08:00
Frank Galligan	4c4d7261e4	Fix potential integer overflow. ioc found a potential integer overflow in the rate control. This is related to https://code.google.com/p/webm/issues/detail?id=821 Change-Id: Ib6c4acd6e964972f932fce7490592eb134f2b7ea	2014-12-05 08:02:12 -08:00
Paul Wilkins	bb6e47c1c9	Merge "Increase strength of AQ1."	2014-12-05 04:11:43 -08:00
Debargha Mukherjee	15cf55b3ca	Merge "Use the RTC optimizations when in high bitdepth mode."	2014-12-04 19:22:27 -08:00
James Zern	b43c27ab6e	Merge "vp9_reader: reorder struct members"	2014-12-04 16:08:08 -08:00
Debargha Mukherjee	4bfde1071e	Merge "Corrected the renaming of CONFIG_VP9_HIGH to CONFIG_VP9_HIGHBITDEPTH."	2014-12-04 15:52:35 -08:00
Peter de Rivaz	a306bd8274	Use the RTC optimizations when in high bitdepth mode. Change 72193 made the encoder behave differently when configured with and without high bitdepth. This change means the same algorithm is used for both. Change-Id: I707a44a94afca773a9e0c2f7ebeeea83030257c5	2014-12-04 15:48:42 -08:00
hkuang	dde819599b	Clean up the logic of handling corrupted frame. No more checking of corrupted reference frame as we skip decoding any non-intra frame in case of frame corrupted. Change-Id: I77d41bbb02fc5f61972740e2d411441eb6a17073	2014-12-04 15:07:59 -08:00
hkuang	62de07c8c6	Merge set_prev_mi function into encoder function. Change-Id: Ifcf2efbb232ea4cabcdebbe77e0820d121e4a6da	2014-12-04 14:44:23 -08:00
Yunqing Wang	eba9c762a1	vp9_ethread: the tile-based multi-threaded encoder Currently, VP9 supports column-tile encoding, which allows a frame to be encoded in multiple column tiles independently. The number of column tiles is set by the encoder option "--tile-columns". This provides a way to encode a frame in parallel. Based on the previous set of patches, this patch implements the tile-based multi-threaded encoder. Each thread processes one or more tiles. Usage: For HD clips: --tile-columns=2 --threads=1/2/3/4 While using 4 threads, tests showed that the encoder achieved 2.3X - 2.5X speedup at good-quality speed 3, and 2X speedup at realtime speed 5. Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4	2014-12-04 11:21:34 -08:00
Deb Mukherjee	4f860dba78	Merge "Fixes a missing highbitdepth convolve call bug"	2014-12-04 11:19:59 -08:00
Adrian Grange	9065da983f	Merge "Free motion vector array before re-allocating"	2014-12-04 07:08:37 -08:00
Peter de Rivaz	f610f88be4	Corrected the renaming of CONFIG_VP9_HIGH to CONFIG_VP9_HIGHBITDEPTH. Change 71789 renamed CONFIG_VP9_HIGH to CONFIG_VP9_HIGHBITDEPTH. However, one use of CONFIG_VP9_HIGH was missed. Change-Id: I0ebb9c71380c6d810a25708d15471abf9533e695	2014-12-04 11:01:46 +00:00
Tom Finegan	7339681ee9	Merge "sse2 visual studio build fix"	2014-12-03 18:05:03 -08:00
Deb Mukherjee	70d9dbd818	Fixes a missing highbitdepth convolve call bug Bug was introduced in https://gerrit.chromium.org/gerrit/#/c/72122/ Change-Id: Idb500ea619a30e7bc50e22fb8ee03be5282f41db	2014-12-03 17:48:50 -08:00
Adrian Grange	b56451f488	Merge "Use memset for initialization to 0"	2014-12-03 16:50:39 -08:00
Deb Mukherjee	6615706af2	sse2 visual studio build fix Change-Id: Id8c8c3be882bcd92afea3ccec6ebdf3f208d28ef	2014-12-03 16:35:26 -08:00
Adrian Grange	979ee6e4c9	Free motion vector array before re-allocating Change-Id: I0c39136d67e1e83020d61f86b062a04182ec9b00	2014-12-03 16:07:32 -08:00
Marco	fb20a07c36	Merge "Increase delta-qp for aq=3 mode, after key frame."	2014-12-03 16:03:06 -08:00
Jingning Han	3665f194fa	Merge "Fix indent in source_var_based_partition_search_method"	2014-12-03 15:43:40 -08:00
Adrian Grange	73caef0500	Use memset for initialization to 0 Change-Id: I714ca22b5d51016bf8b035cf457616c707257641	2014-12-03 15:22:02 -08:00
James Zern	d5937cd268	Merge "vp9: sync threads after a longjmp"	2014-12-03 14:30:55 -08:00
Marco	a047e7cdf8	Increase delta-qp for aq=3 mode, after key frame. For a few refresh periods after key frame, use large qp-delta to increase quality ramp-up. Change-Id: Ib5a150fb2dfa6bafd0d4e6b5d28dfd0724b61319	2014-12-03 13:04:45 -08:00
Jingning Han	17176cd452	Fix indent in source_var_based_partition_search_method Change-Id: I6e5e0571d6967b9b992966336715e35bb97f187e	2014-12-03 12:37:36 -08:00
Jingning Han	8f3db5f22e	Merge "Remove unused ONE_LOOP entry from speed feature"	2014-12-03 11:34:42 -08:00
Jingning Han	228ec17ff2	Merge "Rework coeff probability model update for rtc coding"	2014-12-03 11:34:35 -08:00
Marco	8fd3f9a2fb	Enable non-rd mode coding on key frame, for speed 6. For key frame at speed 6: enable the non-rd mode selection in speed setting and use the (non-rd) variance_based partition. Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames), mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16. Loss in key frame quality (~0.6-0.7dB) compared to rd coding, but speeds up key frame encoding by at least 6x. Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6. Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405	2014-12-03 09:18:08 -08:00
Jingning Han	a8d8c0f633	Remove unused ONE_LOOP entry from speed feature Change-Id: I56ead0ebc2491144c4e79e5859b05e126176702c	2014-12-03 09:17:08 -08:00
Jingning Han	8fe50191c6	Rework coeff probability model update for rtc coding This commit reworks the ONE_LOOP_REDUCED coefficient probability model update process. It allows model update for every coefficient across the spectrum at a coarser resolution, instead of performing precise update only for a certain subset of probability models. The overall runtime remains nearly the same (<1% change) for speed -6. The compression performance is improved by 7.5% in PSNR for speed -5 and 4.57% for speed -6, respectively. Change-Id: Ifb17136382ee7e39a9f34ff4a4f09a753125c8d1	2014-12-03 09:15:25 -08:00
James Zern	6f7ab01451	vp9: sync threads after a longjmp Synchronize all threads immediately as a subsequent decode call may cause a resize invalidating some allocations. fixes one aspect of crbug.com/437655 Change-Id: Ie993b62c2756478543206ddbe43ec6268d90a470	2014-12-02 16:51:27 -08:00
Debargha Mukherjee	99874f55fb	Merge "Reinsert macro to fix issue 884."	2014-12-02 15:32:24 -08:00
Deb Mukherjee	1fbe0c7615	Merge "Fix a warning related to VPX_EFLAG_FORCE_KF check"	2014-12-02 14:03:55 -08:00
Peter de Rivaz	2c886953d1	Reinsert macro to fix issue 884. Change 72056 unfolded some macro definitions, but lost some alternative behaviour required for high bitdepth encodes. This causes the encoder to crash, see issue 884. Change-Id: I8ce4d73c9fe0a3c10ccb86fba210fabc8b2f0ccc	2014-12-02 13:45:26 -08:00
Deb Mukherjee	02941b0df2	Fix a warning related to VPX_EFLAG_FORCE_KF check Fixes a warning in chrome build. Change-Id: I8fa0fd3e7ba1aecf89e5f79ce94cd64ed6a9567c	2014-12-02 11:35:52 -08:00
Peter de Rivaz	7e40a55ef9	Added high bitdepth sse2 transform functions Also removes some spurious changes in common/vp9_blockd.h which were introduced by a rebase issue between nextgen and master branches. Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282 (cherry picked from commit `005d80cd05`) (cherry picked from commit `08d2f54800`) (cherry picked from commit `4230c2306c`)	2014-12-02 11:16:24 -08:00
Paul Wilkins	00e3626e13	Use average mb energy from first pass in AQ2 test. AQ2 modified to use mb_av_energy in defining variance thresholds used alongside complexity when defining the segment to be used for an SB64. Slight improvements in metrics (ssim and PSNR). Change-Id: Idb9cb73f7d9c4f7118cd7e84ac77b0f25cacbf81	2014-12-02 16:07:30 +00:00
Marco Paniconi	83fd18977f	Cyclic refresh: factor segment delta-q into rate control. Incorporate segment delta-q into estimated bits. This generally improves the rate control under cyclic refresh (aq=3) mode. Change-Id: I1dc60fb230e7d08357fae18909d8ed27bf58e037	2014-12-01 16:56:43 -08:00
Jingning Han	f59cb45e90	Merge "Remove repeated search_type_check_frequency assign"	2014-12-01 14:02:10 -08:00
Yunqing Wang	7af927e324	Merge "vp9_ethread: calculate and save the tok starting address for tiles"	2014-12-01 12:49:03 -08:00
Paul Wilkins	0d3d6e0e31	Increase strength of AQ1. This patch greatly increases the strength of AQ1. Visual tests show strong gains on many clips but there is a big hit on PSNR. SSIM is more mixed with some winners and losers. Change-Id: Idaa5d3b41d8576096bfa000b62bc531c3d8bf6a1	2014-11-27 10:53:37 +00:00
Jingning Han	a6df0cbcca	Remove repeated search_type_check_frequency assign This parameter is initialized as 50. No need to re-assign the same value in speed -6. Change-Id: I8735a5593412df2fdcee53ae45c8ebd1c3d792e7	2014-11-25 18:36:41 -08:00
Yunqing Wang	0993bef7e9	vp9_ethread: calculate and save the tok starting address for tiles Each tile's tok starting address is calculated before the encoding process. These addresses are stored so that the same calculation won't be done again in packing bit stream. Change-Id: I0a3be0301f002260c19a850303f2f73ebc47aa50	2014-11-25 17:19:35 -08:00
Yaowu Xu	e4234b3f8b	Separate rate_correction_factor for boosted GFs When the golden frame is boosted, the rate correction factor is not correlated well with other inter frames even in CBR mode. This commit changes to use GF specific rate_correction_factor when gf_cbr_boost is greater than 20%. Change-Id: I6312c1564387bcacc11f4c5e8a9cfdc781b5c3ab	2014-11-25 14:32:07 -08:00
Jingning Han	a04ed98482	Cosmetic change in vp9_pick_inter_mode Change-Id: Ic072585ebffdb36982ed7b8b9f875ca6c1c656c4	2014-11-25 09:42:57 -08:00
Jingning Han	92a7cfc8bf	Adaptively adjust mode test kick-off thresholds in RTC coding This commit allows the encoder to increase the mode test kick-off thresholds if the previous best mode renders all zero quantized coefficients, thereby saving motion search runs when possible. The compression performance of speed -5 and -6 is down by -0.446% and 0.591%, respectively. The runtime of speed -6 is improved by 10% for many test clips. vidyo1, 1000 kbps 16578 b/f, 40.316 dB, 7873 ms -> 16575 b/f, 40.262 dB, 7126 ms nik720p, 1000 kbps 33311 b/f, 38.651 dB, 7263 ms -> 33304 b/f, 38.629 dB, 6865 ms dark720p, 1000 kbps 33331 b/f, 39.718 dB, 13596 ms -> 33324 b/f, 39.651 dB, 12000 ms mmoving, 1000 kbps 33263 b/f, 40.983 dB, 7566 ms -> 33259 b/f, 40.978 dB, 7531 ms Change-Id: I7591617ff113e91125ec32c9b853e257fbc41d90	2014-11-25 09:42:08 -08:00
Jingning Han	30104207fd	Merge "Rework forward txfm/quantization skip system in RTC coding mode"	2014-11-25 09:33:57 -08:00
Jingning Han	6912c44135	Merge "Remove redundant intra mode penalty from vp9_pick_inter_mode"	2014-11-24 22:13:44 -08:00
James Zern	e1f55e0441	vp9_reader: reorder struct members improves locality of reference Change-Id: Ia4d55bb8c98e479528d88303fa35e8c74fbf939d	2014-11-24 22:10:39 -08:00
Yunqing Wang	edbd61e136	vp9_ethread: modify VP9_COMP structure This patch modified struct VP9_COMP. Created a struct ThreadData to include data that need to be copied for each thread. In the multiple thread case, one thread processes one tile. All threads share one copy of VP9_COMP, (refer to VP9_COMP cpi in the code) but each thread has its own copy of ThreadData, (refer to ThreadData td in the code). Therefore, within the scope of encode_tiles(), both cpi and td need to be passed as function parameters. In the single thread case, the FRAME_COUNTS pointer in ThreadData points to "counts" in VP9_COMMON. Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e	2014-11-24 17:57:38 -08:00
Alex Converse	60ef6c0735	Merge "Fix a tautological assert."	2014-11-24 16:36:53 -08:00
Alex Converse	0496d11486	Fix a tautological assert. Change-Id: I90ad08823e1d038384536fa9f458caadc2c87f38	2014-11-24 15:01:01 -08:00
Jingning Han	25be81e2dd	Remove redundant intra mode penalty from vp9_pick_inter_mode The intra mode penalty is covered by intra_cost_penalty. This commit removes the other intra cost threshold, provided that the constant 50 is negligible in normal rate-distortion cost. Change-Id: I9b8b7483c43b9a41741622e7057def1f7d51bb72	2014-11-24 14:55:59 -08:00
Jingning Han	e6fb9c0b0b	Merge "Key frame non-RD mode decision process"	2014-11-24 13:21:56 -08:00
Debargha Mukherjee	e9d9f1adab	Merge "Refactored idct routines and headers"	2014-11-24 12:47:03 -08:00
John Stark	71379b87df	Changes to assembler for NASM on mac. fixes non-Apple nasm part of issue #755 Change-Id: I11955d270c4ee55e3c00e99f568de01b95e7ea9a	2014-11-24 12:00:50 -08:00
Peter de Rivaz	3a8c43a479	Refactored idct routines and headers This change is made in preparation for a subsequent patch which adds acceleration for the highbitdepth transform functions. The highbitdepth transform functions attempt to use 16/32bit sse instructions where possible, but fallback to using the C implementations if potential overflow is detected. For this reason the dct routines are made global so they can be called from the acceleration functions in the subsequent patch. Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665 (cherry picked from commit `454342d4e7`)	2014-11-24 09:57:40 -08:00
Jingning Han	2fbdfd2c66	Key frame non-RD mode decision process This commit makes a non-RD coding mode decision process for key frame coding. It can be optionally turned on in speed -6 and above. Change-Id: I0847258b392877a0210b4768bef88ebc9ad009b5	2014-11-24 09:04:28 -08:00
Marco	681d5e9024	Merge "Only allow for cyclic refresh (aq=3 mode) for base layer."	2014-11-24 07:46:36 -08:00
Paul Wilkins	2232c3e34b	Merge "Fix some minor nits."	2014-11-21 17:39:43 -08:00
Debargha Mukherjee	02355a4abf	Merge "Added highbitdepth sse2 acceleration for quantize"	2014-11-21 16:08:47 -08:00
Paul Wilkins	d28e9ed452	Merge changes Ie077edd0,Id31a74fc * changes: Remove rate component adjustment for AQ1 Switch AQ1 segment basis from q ratio to rate ratio.	2014-11-21 15:38:32 -08:00
Paul Wilkins	771259fe10	Merge "Add adaptive midpoint for AQ1."	2014-11-21 15:26:18 -08:00
Paul Wilkins	6dbf83d082	Merge "Add variance restriction to AQ2."	2014-11-21 15:25:43 -08:00
Marco	53c3f2ca4d	Only allow for cyclic refresh (aq=3 mode) for base layer. Condition existed for temporal case, added it for spatial as well. Issue: https://code.google.com/p/webm/issues/detail?id=878. Change-Id: I38339207f9a94924f5568a081eabe64f867a686d	2014-11-21 14:47:32 -08:00
Paul Wilkins	ea494c0e76	Fix some minor nits. Change-Id: Ib8810d431fa20a2c78e0caaa28eb2c99903e60fb	2014-11-21 14:13:59 -08:00
Paul Wilkins	a867bb538b	Merge "Further AQ1 clean up."	2014-11-21 12:58:03 -08:00
Jingning Han	7428cebe4f	Rework forward txfm/quantization skip system in RTC coding mode This commit allows a more aggressive decision to skip forward transform and quantization for the luma component in RTC coding mode. The chroma components still go through the normal coding routine, since they are not included in the non-RD mode search process. It reduces the runtime cost by 2% - 10%. In speed -6, vidyo1 1000 kbps 16576 b/f, 40.281 dB, 8402 ms -> 16576 b/f, 40.323 dB, 7764 ms nik720p 1000 kbps 33337 b/f, 38.622 dB, 7473 ms -> 33299 b/f, 38.660 dB, 7314 ms dark720p 1000 kbps 33330 b/f, 39.785 dB, 13505 ms -> 33325 b/f, 39.714 dB, 13105 ms The compression performance of speed -6 is improved by 0.44% in PSNR and 1.31% in SSIM. Change-Id: Iae9e3738de6255babea734e5897f29118bebc6d7	2014-11-21 12:46:40 -08:00
Paul Wilkins	b87c51ce55	Merge "Initial AQ1 restructuring."	2014-11-21 12:10:03 -08:00
Paul Wilkins	f5209d7e01	Remove rate component adjustment for AQ1 In AQ1 a rate adjustment was applied for blocks coded with a deltaq. This tends to skew the partition selection and cause rate overshoot. For example, consider a 64x64 super block where some but not all sub blocks are in a low q segment and some are in a high q segment. The choice of Q when considering large partition and transform sizes is defined by the lowest sub block segment id (currently this implies the lowest Q). If some parts of the larger partition are very hard this will cause a high rate component. The correct behavior here is for the rd code to discard the large partition choice and break down to sub blocks where some have low and some have high Q. However, the rate correction factor above masks the high cost of coding at a larger partition size. Change-Id: Ie077edd0b1b43c094898f481df772ea280b35960	2014-11-21 08:51:58 -08:00
Paul Wilkins	1663eff7f8	Switch AQ1 segment basis from q ratio to rate ratio. In defining the Q deltas for segments in AQ1 use a rate ratio rather than a q ratio. Change-Id: Id31a74fcf2b7e55437e42a51c21b3cbcb57028d4	2014-11-21 08:50:57 -08:00
Paul Wilkins	fc47c5d653	Add adaptive midpoint for AQ1. Make the midpoint variance used in AQ mode 1 segmentation depend on the overall complexity of the frame in two pass. Change-Id: I452814ec57f7a32352e41bb250e78066abe952dd	2014-11-20 18:37:34 -08:00
Alex Converse	bc1b3d8412	Allow DC/H/V/TM on screen content. 6.3% better compression less than 1% compression time increase Change-Id: Ie83c059436e54c09de9e7c87e06e0a6d40dc38fe	2014-11-20 18:04:57 -08:00
Alex Converse	722e9d611b	Drop special inter mode selection for screen content. Better mode selection was implemented for all content. Change-Id: I479778ed21d3968892f4dce396c83733583f4f23	2014-11-20 18:04:57 -08:00
Yunqing Wang	72522dbc86	Merge "vp9_ethread: move filter_cache out of RD_OPT struct"	2014-11-20 16:51:31 -08:00
Paul Wilkins	d031237999	Add variance restriction to AQ2. Add an additional restriction to bit/complexity based segmentation based on spatial variance. Only lower Q when both the number of bits spent in the initial encoding pass and the spatial complexity are below a threshold. This will prevent the low Q segments being used just because there is a surfeit of bits. Small metrics gains especially opsnr. derf ~0.2% std-hd ~0.3% Change-Id: I6a8496d466d673f9b0e2b2ca6304ea7b6d8e1cce	2014-11-20 16:23:35 -08:00
Paul Wilkins	3d1e8c9a85	Further AQ1 clean up. Further patch to restructure AQ mode 1. Change-Id: I566452a033d047a49a40441a7be24690ea69412d	2014-11-20 16:00:51 -08:00
Paul Wilkins	6a760d483d	Initial AQ1 restructuring. This is the first of a series of patches to restructure and improve AQ mode 1 (variance based AQ). Change-Id: Idcf693131a3ea2459dcfd957a54a65b971fa4a2a	2014-11-20 15:50:15 -08:00
Paul Wilkins	b74eeb8675	Merge "Fix bug in calculating number of mbs with scaling."	2014-11-20 15:45:41 -08:00
Yunqing Wang	54ba65a63e	Merge "vp9_ethread: move max/min partition size to mb struct"	2014-11-20 14:00:37 -08:00
Yunqing Wang	379334c2d8	vp9_ethread: move filter_cache out of RD_OPT struct Similar to mask_filter, the filter_cache in RD_OPT struct can be moved out, and declared as a local variable since it is only used in pick_inter_mode functions. Change-Id: I412b99cca82bade07ac912064ec03dd1de6b2c17	2014-11-20 13:44:16 -08:00
Yunqing Wang	0b71fdbf80	Merge "vp9_ethread: change mask_filter to a local variable"	2014-11-20 13:02:55 -08:00
Yunqing Wang	bdaa3eaf43	Merge "Revert "vp9_ethread: include a pointer to mb in VP9_COMP""	2014-11-20 12:27:34 -08:00
Paul Wilkins	5e5da2e963	Fix bug in calculating number of mbs with scaling. Correct calculation of number of mbs in two pass code when frame resizing is enabled. Always use initial number of mbs if scaling is enabled, as this is what was used in the first pass. Change-Id: I49a4280ab5a8b1000efcc157a449a081cbb6d410	2014-11-20 12:24:43 -08:00
Yunqing Wang	b0efddd8e6	vp9_ethread: change mask_filter to a local variable The mask_filter in RD_OPT struct is used to record rd result in filter decision. It is only used in pick_inter_mode functions, and is removed from the struct and declared as a local variable. Change-Id: I3c95c8632ba7241591ce00ef2ef5677b5e297d7b	2014-11-20 09:41:49 -08:00
Yunqing Wang	ad7586a9e1	vp9_ethread: move max/min partition size to mb struct The max_partition_size and min_partition_size are set at the beginning while setting speed features, and then adjusted at SB level. Moving them to mb struct ensures there is a local copy for each thread. Change-Id: I7dd08dc918d9f772fcd718bbd6533e0787720ad4	2014-11-20 09:24:50 -08:00
Yunqing Wang	70c9d2983b	Revert "vp9_ethread: include a pointer to mb in VP9_COMP" This reverts commit `6906d218dd`. Another way will be used to handle mb struct. Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4	2014-11-20 08:31:12 -08:00
Peter de Rivaz	a7b2d09f36	Added highbitdepth sse2 acceleration for quantize Also includes block error. (This patch is mostly cherry picked from commit `db7192e0b0`) Change-Id: Idef18f90b111a0d0c9546543d3347e551908fd78	2014-11-19 23:55:19 -08:00
Jingning Han	c42715b721	Enable ssse3 version of vp9_fdct8x8_quant It improves the speed performance of vp9_fdct8x8_quant_sse2 by about 5%. Change-Id: I74b093ba4d81df64caf71ac7693f3d917f673097	2014-11-19 22:14:19 -08:00
Yaowu Xu	21db24efcb	Add a reset to rc tracking for dropped frames VP9/DatarateTestVP9Large.ChangingDropFrameThresh/[34] fails post the merge of commit#ffa06b37. This commit adds reset of rc tracking info when frame is dropped, and fixes the causes of the bad interaction between the tests and the previous commit. Change-Id: I848acfd9fcb336359662274325190f94aac76eae	2014-11-19 15:32:11 -08:00
Jingning Han	bf63652d34	Merge "Combine fdct8x8 and quantization process"	2014-11-19 11:17:44 -08:00
Jingning Han	ce77a7bcb0	Merge "Add sse2 version for vp9_quantize_fp"	2014-11-19 11:17:36 -08:00
Jingning Han	c6908fd5f7	Combine fdct8x8 and quantization process This commit reworks the forward transform and quantization process for 8x8 block coding. It combines the two operations in a single function to save a store/load stage of the original transform coefficients. Overall the speed -6 is slightly faster (around 1% range). The compression performance of speed -6 is improved by 3.4%. Change-Id: Id6628daef123f3e4649248735ec2ad7423629387	2014-11-18 18:10:56 -08:00
Yaowu Xu	587f0cd39d	Merge "Prevent severe rate control errors in CBR mode"	2014-11-18 16:56:06 -08:00
Marco	3c715863da	Merge "Modify active_worst_quality setting for one pass CBR."	2014-11-18 11:22:18 -08:00
Yaowu Xu	550707b9e1	Merge "change to call vp9_refining_search_sad() directly"	2014-11-18 09:14:02 -08:00
Yaowu Xu	ffa06b3708	Prevent severe rate control errors in CBR mode In rare cases, the interaction between rate correction factor and Q choices may cause severe oscillating frame sizes that are way off target bandwidth. This commit adds tracking of rate control results for the last two frames, and uses the information to prevent oscillating Q choices. Change-Id: I9a6d125a15652b9bcac0e1fec6d7a1aedc4ed97e	2014-11-18 09:05:57 -08:00
Jingning Han	2d3cc8ea2b	Add sse2 version for vp9_quantize_fp vp9_quantize_fp is the quantization process used by rtc coding mode. This commit adds an sse2 implementation of it. The implementation is modified based on vp9_quantize_b_sse2. No speed difference from the ssse3 version. Change-Id: I24949c5b27df160b4f35117d28858d269454e64a	2014-11-18 09:01:41 -08:00
Jingning Han	13a999de8e	Merge "Add empty pointer check to pred buffering in rtc coding mode"	2014-11-17 17:40:54 -08:00
Marco	b660f723b4	Modify active_worst_quality setting for one pass CBR. Current setting had active_worst_quality set too high (close to worst_quality) for first frame(s) following first key frame. This changes that to be somewhat more aggressive in allowing active_worst_quality to be lower following key frame. Also remove the 4/5 reduction in active_worst for key frame as this should be set by the user qp_max setting. Change-Id: I0530b3ddcc85c00e3eb7568de1b14a31206c4a4c	2014-11-17 11:46:49 -08:00
Yaowu Xu	1687c47bfd	change to call vp9_refining_search_sad() directly The function pointer in compressor instance does not change, so this commit changes to call the function directly. Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5	2014-11-17 11:30:17 -08:00
Jingning Han	a62c87fb04	Add empty pointer check to pred buffering in rtc coding mode This commit adds a check condition to the prediction buffering operation used in the rtc coding mode. This resolves a unit test warning in example/vpx_tsvc_encoder_vp9_mode_7. Change-Id: I9fd50d5956948b73b53bd8fc5a16ee66aff61995	2014-11-17 11:24:07 -08:00
Yunqing Wang	4539c496bc	Merge "Code cleanup: remove unused members in RD_OPT"	2014-11-17 09:10:28 -08:00
Yunqing Wang	3f4a93baf2	Merge "vp9_ethread: combine encoder counts in separate struct"	2014-11-17 08:57:38 -08:00
Debargha Mukherjee	c3a9056df4	Merge "Added sse2 acceleration for highbitdepth variance"	2014-11-14 21:11:27 -08:00
Yunqing Wang	87ae6d73d4	Code cleanup: remove unused members in RD_OPT These 2 members in RD_OPT were moved to TileDataEnc struct already, and therefore were removed here. Change-Id: I22fee3b67f96e473a58e194a7edc76dbd48bfa04	2014-11-14 16:33:25 -08:00
Yunqing Wang	d0b547c676	vp9_ethread: combine encoder counts in separate struct Several frame counters in encoder are updated at SB level. Combine those counters and put them in a separate struct, which allows us to allocate one copy for each thread. Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7	2014-11-14 16:09:22 -08:00
Peter de Rivaz	48032bfcdb	Added sse2 acceleration for highbitdepth variance Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f (cherry picked from commit `d7422b2b1e`) (cherry picked from commit `6d741e4d76`)	2014-11-14 15:18:53 -08:00
Yunqing Wang	6906d218dd	vp9_ethread: include a pointer to mb in VP9_COMP Modified VP9_COMP struct to include MACROBLOCK *mb. This change makes it feasible in multi-thread case to allocate a mb for each thread. Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e	2014-11-14 12:31:06 -08:00
hkuang	a9a20a1040	Fix a bug in frame parallel decode and add a unit test for that. A flush bug was discovered while putting the frame parallel decoder into Android. This test will expose that bug. Change-Id: Ia047f27972f4da0471649f79f1f91e7695297473	2014-11-14 10:16:34 -08:00
Yunqing Wang	807885b5e0	Merge "vp9_ethread: modify the cyclic refresh struct"	2014-11-13 18:35:01 -08:00
Yaowu Xu	e4e85ad4a8	Merge "adapt the adjustment limit for rate correction factor in RTC mode"	2014-11-13 15:50:30 -08:00
Yunqing Wang	8ee605f188	vp9_ethread: modify the cyclic refresh struct Two members in struct CYCLIC_REFRESH int64_t projected_rate_sb; int64_t projected_dist_sb; are updated at the superblock level, which makes them shared data in the multi-thread situation, and requires extra work to handle them. However, those values are updated and used immediately, and therefore can be removed. This patch cleaned up the code and removed the two members. Change-Id: I2c6ee4552bf49fb63ce590cdb47f9723974fffb1	2014-11-13 15:05:46 -08:00
Adrian Grange	35de9db312	Merge "Prepare for dynamic frame resizing in the recode loop"	2014-11-13 15:01:49 -08:00
Paul Wilkins	20517e0627	Merge "Fix 32 bit build emms problem."	2014-11-13 15:00:41 -08:00
Jingning Han	aeff1f7ec2	Merge "Use reconstructed pixels for intra prediction"	2014-11-13 13:59:02 -08:00
Jingning Han	6efafda738	Merge "Refactor nonrd_use_partition coding process"	2014-11-13 13:58:21 -08:00
Adrian Grange	0d085ebc0a	Prepare for dynamic frame resizing in the recode loop Prepare for the introduction of frame-size change logic into the recode loop. Separated the speed dependent features into separate static and dynamic parts, the latter being those features that are dependent on the frame size. Change-Id: Ia693e28c5cf069a1a7bf12e49ecf83e440e1d313	2014-11-13 11:41:20 -08:00
Paul Wilkins	b9c4f9a7db	Fix 32 bit build emms problem. Add extra vp9_clear_system_state() calls to fix double / mmx issue introduced into first pass code for 32 bit builds. Change-Id: I84cd2986b80d83650a091ab25c43755efeb82e03	2014-11-13 11:33:55 -08:00
Yaowu Xu	9f79259e54	adapt the adjustment limit for rate correction factor in RTC mode Rate correction factor is used to correct the estimated rate for any given quantizer, and feeds into rate control for quantizer selection. We make use of the actual bits used to calculate this rate correction factor with an adjustment limit to prevent over-adjustment. This commit adapts the adjustment limit to the difference between the estimated bits and the actual bits, allows the adjustment limit to vary between 0.125 (when estimate is close to actual) and 0.625 (when there is >10X factor off between estimated and actual bits). By doing this, the commit appears to have largely corrected two observed issues: 1. Adjustment is too slow when the actual bits used is way off from estimate due to the small adjustment limit. 2. Extreme oscillating quantizer choices due to the feedback loop. Change-Id: I4ee148d2c9d26d173b6c48011313ddb07ce2d7d6	2014-11-13 11:26:52 -08:00
Deb Mukherjee	7621a19aa5	Merge "Vidyo: Turn off keyframes in higher spatial layers"	2014-11-13 03:27:11 -08:00
Debargha Mukherjee	002172efd6	Merge "Added highbitdepth sse2 SAD acceleration and tests"	2014-11-12 21:20:34 -08:00
Peter de Rivaz	7eee487c00	Added highbitdepth sse2 SAD acceleration and tests Change-Id: I1a74a1b032b198793ef9cc526327987f7799125f (cherry picked from commit `b1a6f6b9cb`)	2014-11-12 14:25:45 -08:00
Yaowu Xu	8e112d9586	Merge "Use normal rate_correction_factor for gf in CBR mode"	2014-11-12 08:00:26 -08:00
Deb Mukherjee	48a7627316	Vidyo: Turn off keyframes in higher spatial layers Change-Id: Icdd5e71cd6a2b59bc4b3b972af9e4d4a36821792	2014-11-11 16:09:07 -08:00
Deb Mukherjee	c7a905ca3d	Merge "Vidyo: Support for one-pass rc-enabled SVC encoder"	2014-11-11 16:03:11 -08:00
Jingning Han	e717d22b63	Use reconstructed pixels for intra prediction This commit makes the speed -6 and above use the reconstructed boundary pixels for precise intra prediction. This allows more intra prediction modes to be tested in the non-RD coding process. Enabling horizontal and vertical intra prediction modes can improve the speed -6 compression performance for rtc set by 0.331%. Change-Id: I3a99f9d12c6af54de2bdbf28c76eab8e0905f744	2014-11-11 10:04:43 -08:00
Paul Wilkins	4999505472	Merge "AQ1 - remove first pass weights."	2014-11-11 09:17:33 -08:00
Yaowu Xu	f2b978e895	Use normal rate_correction_factor for gf in CBR mode I0c5f010 changed to allow updating the golden reference buffer in CBR mode; this commit changes the use of rate_correction_factor for those frames to align with the new usage. This commit attempts to solve two issues: a. Initialization of rate correction factor for Golden Frame Prior to this patch, even though regular inter frames had been updating the rate correction factor based on content and encoding results, the first golden frame would still use the uninitialized value, which can be way off. b. Allowing rate correction factor update to be slightly faster Prior to this patch, when the rate correction factor was off, the update to the factor was too slow, and the factor could not get close to a semi-correct value even after many frames. The commit helps all clips in psnr/ssim metrics, but especially a few clips in the RTC set where rate correction was way off. For example thaloundeskmtgvga gained about .5dB for both overall/average psnr. Change-Id: I0be5c41691be57891d824505348b64be87fa3545	2014-11-10 16:55:13 -08:00
Deb Mukherjee	0ba1542f12	Vidyo: Support for one-pass rc-enabled SVC encoder Adds support for one-pass rc-enabled SVC encoder with callbacks for getting per-layer packets. - the callback function registration is implemented as an encoder control function. - if the callback function is not registered, the old way of aggregating packets with superframe will take effect. - one more control function “VP9E_GET_SVC_LAYER_ID” has been implemented to get the temporal/spatial id from the encoder within the callback. This can be used to get the ids to put on RTP packet. Change-Id: I1a90e00135dde65da128b758e6c00b57299a111a	2014-11-10 16:08:58 -08:00
Deb Mukherjee	130c6d7455	Merge "Iadst transforms to use internal low precision"	2014-11-10 15:39:46 -08:00
Deb Mukherjee	cc57c5e4af	Iadst transforms to use internal low precision Change-Id: I266777d40c300bc53b45b205144520b85b0d6e58 (cherry picked from commit `a1b726117f`)	2014-11-07 14:19:45 -08:00
Alex Converse	ce9ba97a9d	Fix LAST SKIP when considering GOLDEN Change-Id: I39d9f13fa34984ee9dad0c4f303ef672635f420e	2014-11-07 13:44:17 -08:00
Paul Wilkins	08d86bc904	Merge "Add intra complexity and brightness weight to first pass."	2014-11-07 09:22:12 -08:00
Yaowu Xu	98492c1091	Merge "Change the use of a reserved color space entry"	2014-11-07 06:24:59 -08:00
Paul Wilkins	31b6d7c1eb	AQ1 - remove first pass weights. Removed redundant weighting function tied to AQ1 from first pass code. Improvement in baseline AQ1 results:- Derf opsnr +0.142% SSIM +0.258% YT opsnr +0.173% SSIM +0.3% Change-Id: I16ef91caf2d7f302cd5940cc5e2626d48ebcb212	2014-11-07 14:11:29 +00:00
Yaowu Xu	af3519a385	Change the use of a reserved color space entry This commit renames a reserved color space entry to BT_2020. It is intended to let a VP9 bitstream pass along the color space type defined in BT.2020 (Rec.2020). Please note this entry does not have any effect on encoding/decoding behavior, but allows applications to pass the information along from the encoding end to the decoding end. Change-Id: I4678520e89141ea5e8900f7bd1c0e95b710b7091	2014-11-06 19:14:21 -08:00
Jingning Han	754b05a4de	Refactor nonrd_use_partition coding process This commit integrates the non-RD mode decision process and the encoding process into a single recursion scheme. Change-Id: I6a7e72a0b84d567554801ebbe01ec75d54c1f77d	2014-11-06 17:00:48 -08:00
Yunqing Wang	bf44117d5f	Merge "Modify the frame context memory deallocation"	2014-11-06 13:08:57 -08:00
Jingning Han	417e754f56	Merge "Remove unused is_background function"	2014-11-06 12:03:15 -08:00
Jingning Han	e97f404e52	Merge "Rework cut-off decisions in cyclic refresh aq mode"	2014-11-06 12:03:07 -08:00
Yunqing Wang	1228433430	Modify the frame context memory deallocation This patch was to fix the vpxdec fuzzing3 test failure. When an error occurs, setjmp() is invoked, which calls the decoder removing routine. In multiple thread situation, other threads could try to access the frame context memory that is already deallocated, thus causing a segfault. An invalid unit test was added for this issue. Change-Id: Ida7442154f3d89759483f0f4fe0324041fffb952	2014-11-06 11:34:19 -08:00
Paul Wilkins	5e935126a6	Add intra complexity and brightness weight to first pass. The aim of this patch is to apply a positive weighting to frames that have a significant number of blocks that are of low spatial complexity and are dark. The rationale behind this is that artifacts tend to be more visible in such frames. In this patch the weight is only applied in regard to the distribution of bits between frames. Hence if all the frames share similar characteristics (as is the case for most of our short test clips) there will be little or no net effect. However, the effect can be seen on some longer form test content. For example Tears of steel baseline test: 2323.09 Kbit/s opsnr 39.915 ssim 74.729 With this patch:- 2213.34 Kbit/s opsnr 39.963 ssim 74.808 (Slightly better metrics and about 5% smaller) The weighting may well need some further tuning alongside changes to the aq modes. Change-Id: Ieced379bca03938166ab87b2b97f55d94948904c	2014-11-06 10:45:00 +00:00
Jingning Han	10da059b52	Remove unused is_background function Change-Id: Ia540eac5f066ae95280c2f898370eddf0110c279	2014-11-05 21:19:23 -08:00
Jingning Han	caaf63b2c4	Rework cut-off decisions in cyclic refresh aq mode This commit removes the cyclic aq mode dependency on in_static_area and reworks the corresponding cut-off thresholds. It improves the compression performance of speed -5 by 1.47% in PSNR and 2.07% in SSIM, and the compression performance of speed -6 by 3.10% in PSNR and 5.25% in SSIM. Speed wise, about 1% faster in both settings at high bit-rates. Change-Id: I1ffc775afdc047964448d9dff5751491ba4ff4a9	2014-11-05 21:17:09 -08:00
hkuang	e8860693ea	Merge "Totally remove prev_mi in VP9 decoder."	2014-11-05 17:48:47 -08:00
hkuang	4cc7c5a17f	Totally remove prev_mi in VP9 decoder. This will save memory and improve the decode speed by removing the unnecessary memset of the big prev_mi array for all the key frames. Decoding an all-key-frame 1080p video shows a speed improvement of around 2%. Change-Id: I6284a445c1291056e3c15135c3c20d502f791c10	2014-11-05 16:14:30 -08:00
Yaowu Xu	2c4fee17bc	Fix visual studio 2013 compiler warnings For configured with --enable-vp9-highbitdepth Change-Id: I2b181519d7192f8d7a241ad5760c3578255f24e6	2014-11-05 13:47:28 -08:00
Hui Su	2c95a3f374	Merge "Simplify interface of write_selected_tx_size and read_tx_size"	2014-11-05 13:33:09 -08:00
Jingning Han	a7889cac9a	Merge "Skip ref frame mode search conditioned on predicted mv residuals"	2014-11-05 12:04:10 -08:00
Hui Su	709c634b84	Simplify interface of write_selected_tx_size and read_tx_size Change-Id: Ia2b2a895deefaaf7b34bf26df86add56dbab082c	2014-11-04 16:11:50 -08:00
Minghai Shang	9f9e30d7bf	Merge "[spatial svc] Make spatial svc working for one pass rate control"	2014-11-04 15:57:16 -08:00
hkuang	23da920a8e	Fix the memory leak due to missing free frame_mvs. Change-Id: I2ceee7341d906259002c0ea31ea009ae32c04bfd	2014-11-04 13:28:31 -08:00
Minghai Shang	86c36a504d	[spatial svc] Make spatial svc working for one pass rate control Change-Id: Ibd9114485c3d747f9d148f64f706bf873ea473ac	2014-11-04 11:46:48 -08:00
Jingning Han	1e753387c8	Merge "Refactor sub-pixel motion search unit"	2014-11-04 09:11:15 -08:00
Jingning Han	1434f7695b	Skip ref frame mode search conditioned on predicted mv residuals This commit makes the RTC coding mode conditionally skip the reference frame mode search, when the predicted motion vector of the current reference frame gives more than two times sum of absolute difference compared to that of other reference frames. It reduces the runtime by 1% - 4% for speed -5 and -6. The average compression performance is improved by about 0.1% in both settings. It is of particular benefit to light change scenarios. The compression performance of test clip mmmovingvga.y4m is improved by 6.39% and 15.69% at high bit rates for speed -5 and -6, respectively. Speed -5 vidyo1 16555 b/f, 40.818 dB, 12422 ms -> 16552 b/f, 40.804 dB, 12100 ms nik 33211 b/f, 39.138 dB, 11341 ms -> 33228 b/f, 39.139 dB, 11023 ms mmmoving 33263 b/f, 40.935 dB, 13508 ms -> 33256 b/f, 41.068 dB, 12861 ms Speed -6 vidyo1 16541 b/f, 40.227 dB, 8437 ms -> 16540 b/f, 40.220 dB, 8216 ms nik 33272 b/f, 38.399 dB, 7610 ms -> 33267 b/f, 38.414 dB, 7490 ms mmmoving 33255 b/f, 40.555 dB, 7523 ms -> 33257 b/f, 40.975 dB, 7493 ms Change-Id: Id2aef76ef74a3cba5e9a82a83b792144948c6a91	2014-11-04 09:10:19 -08:00
Yunqing Wang	6d90a9d289	Merge "WORKAROUND FIX FOR GCC4.9.1"	2014-11-03 16:56:38 -08:00
Marco	343acaa8f2	Merge "Allow disable of refresh golden for more than 1 layer encoding."	2014-11-03 14:38:05 -08:00
Jingning Han	e083f6bd08	Refactor sub-pixel motion search unit This commit unfolds the legacy macro definitions used in the sub-pixel motion search and refactors the operational flow for later optimizations. Change-Id: I3e3f770cad961d03d1a6eb0b2a0186cc77eaf2b8	2014-11-03 09:02:57 -08:00
Jingning Han	0ca5908ff6	Merge "Fix the THR_MODES array used in vp9_pick_inter_mode"	2014-11-03 08:46:42 -08:00
Yaowu Xu	2fe893c94f	Merge "Fix speed 7 and speed 12 for rt"	2014-11-03 08:02:58 -08:00
Marco	d6b688375f	Allow disable of refresh golden for more than 1 layer encoding. The current logic was allowing for disabling golden refresh only for two pass svc encoding. This change disables it as long as more than 1 layer encoding is used (for example temporal layers under 1pass CBR). Change-Id: I4dc5204a7ad365c821ec7963e93b59da82e1826b	2014-11-02 22:24:00 -08:00
Jingning Han	7e119e2946	Fix the THR_MODES array used in vp9_pick_inter_mode Fix the alignment of entries for intra prediction modes. Change-Id: Ie32ad87cf90694efd591a4b1cc29c916c4cd56f7	2014-11-02 12:25:57 -08:00
levytamar82	86175a5788	WORKAROUND FIX FOR GCC4.9.1 In the function mb_lpf_horizontal_edge_w_avx2_16, the use of the intrinsic _mm256_cvtepu8_epi16 triggers a compiler bug in gcc 4.9.1. Until it is fixed, this workaround performs the up-conversion using broadcast128+shuffle. The bug was reported here: https://code.google.com/p/webm/issues/detail?id=867 Change-Id: I73452e6806f42e0fadcde96b804ea3afa7eeb351	2014-11-01 11:27:28 -07:00
Yaowu Xu	0271ff7775	Fix speed 7 and speed 12 for rt A recent change has introduced big quality drops for speed 7 and 12 for --rt mode. The change reverted the big drop and improved quality by 9.5% for speed 7 and 13.4% for speed 12. Change-Id: I07b82e3bb6002a73af486a083458c88877bdad01	2014-10-31 17:29:02 -07:00
hkuang	55577431ae	Bind motion vectors with frame buffer structure. This will save a lot of memory for decoder due to removing of prev_mi, but prev_mi is still needed in encoder. So this will increase a little bit memory for encoder. Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed	2014-10-31 17:01:08 -07:00
Jingning Han	1c84e73ebd	Merge "Fix mode index use case in vp9_pick_inter_mode"	2014-10-31 08:55:40 -07:00
Jingning Han	61966b1d10	Merge "Refactor vp9_update_rd_thresh_fact"	2014-10-31 08:55:28 -07:00
Jingning Han	1cffea9fb7	Merge "Rework pred pixel buffer system in non-RD coding mode"	2014-10-31 08:55:24 -07:00
Jingning Han	64348d9f8d	Fix mode index use case in vp9_pick_inter_mode This improves coding performance of speed -5 and -6 by 0.6%, respectively. Change-Id: Ic5a7746a88c73285f0b14333d35dc16b02152c25	2014-10-30 11:10:06 -07:00
Jingning Han	f7b46d8c5e	Refactor vp9_update_rd_thresh_fact Reduce the scope of function parameters. Change-Id: Ifef2cfb559908a97498ffdbd6ea53da1cd45a73c	2014-10-30 11:09:40 -07:00
Jingning Han	7bea8c59f9	Rework pred pixel buffer system in non-RD coding mode This commit makes the inter prediction buffer system to support hybrid partition search. It reduces the runtime of speed -5 by about 3%. No compression performance change. vidyo1 720p 1000 kbps 11831 ms -> 11497 ms nik 720p 1000 kbps 10919 ms -> 10645 ms Change-Id: I5b2da747c6395c253cd074d3907f5402e1840c36	2014-10-30 11:08:35 -07:00
Hui Su	d478d2df37	Merge "Move the definition of switchable filter numbers into enum INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and IF_DIFF_REF_FRAME_ADD_MV."	2014-10-30 11:05:04 -07:00
Hui Su	66906da066	Merge "Combine vp9_encode_block_intra and encode_block_intra"	2014-10-30 11:02:31 -07:00
Yunqing Wang	aed48c786a	Remove unused speed feature Partition_check was unused and removed. Change-Id: I15ec9162d86dc61f04c09229c498629878ed7155	2014-10-29 17:05:04 -07:00
Jingning Han	afa31ab9b8	Merge "Enable mode search threshold update in non-RD coding mode"	2014-10-29 12:42:22 -07:00
Jingning Han	9349a28e80	Enable mode search threshold update in non-RD coding mode Adaptively adjust the mode thresholds after each mode search round to skip checking less likely selected modes. Local tests indicate 5% - 10% speed-up in speed -5 and -6. Average coding performance loss is -1.055%. speed -5 vidyo1 720p 1000 kbps 16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms nik 720p 1000 kbps 33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms speed -6 vidyo1 720p 1000 kbps 16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms nik 720p 1000 kbps 33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b	2014-10-29 10:55:34 -07:00
Adrian Grange	4074099ed8	Simplify vp9_set_rd_speed_thresholds_sub8x8 Change-Id: I4bf0f9a38697f5aea564a47afd7f02bb8b2888b6	2014-10-29 09:09:46 -07:00
Hui Su	0928da3b6e	Combine vp9_encode_block_intra and encode_block_intra Change-Id: I79091fb677b64892ecca2fb466fde14602d8cdfc	2014-10-28 18:57:01 -07:00
Jingning Han	982dab6050	Merge "Use zero motion vector in choose_partitioning"	2014-10-28 12:00:13 -07:00
JackyChen	50e5c30536	Merge "vp9_denoiser_sse2: refactor the code."	2014-10-28 11:06:05 -07:00
Yaowu Xu	7d7b43b9af	Merge "Allow update of golden reference buffer in CBR mode"	2014-10-28 10:48:02 -07:00
JackyChen	99a8dac4de	vp9_denoiser_sse2: refactor the code. Combined vp9_denoiser_8xM_sse2 and vp9_denoiser_4xM_sse2 into one function vp9_denoiser_NxM_sse2_small and passed the bitexact testing. Changed the name of the function vp9_denoiser_64_32_16xM_sse2 to vp9_denoiser_NxM_sse2_big. Change-Id: Ib22478df585994dd347ebae04202c0b701e7f451	2014-10-28 09:36:58 -07:00

7684 Commits