generic-library/vpx

Author	SHA1	Message	Date
Alpha Lam	7bce513afe	Call vp8_find_near_mvs lazily vp8_find_near_mvs() is being called on all possible reference frames but the data computed may not be used if the loop exits early, which can be due to x->skip being set to 1. Optimize this by calling vp8_find_near_mvs() lazily, only if it is going to be used and has not been computed yet. Change-Id: Iccdbd4c962a670c9f2c99b8aca8096042ca5dc98	2011-09-30 14:48:18 +01:00
Paul Wilkins	a572ac8327	Merge "CQ and two pass rate control."	2011-09-30 02:57:54 -07:00
Paul Wilkins	b6e27d5f0b	CQ and two pass rate control. Changes to the selection of Q limits for two pass and two pass CQ mode. Allowance made for Mode and motion vector costs. Some refactoring of common code. For Derf and YT sets CQ mode average improvement circa 1% (SSIM and Global PSNR). Some increased tendency to undershoot even when user CQ not reached. Patch2: Removed some test code accidentally merged. Change-Id: Icf74d13af77437c08602571dc7a97e747cce5066	2011-09-30 10:55:52 +01:00
Attila Nagy	380d64ecb1	Multithreaded encoder, late sync loopfilter Sync with loopfilter thread just at the beginning of next frame encoding. This returns control to application faster and allows a better multicore scaling. When PSNR packets are generated the final filtered frame is needed immediately so we cannot delay the sync. Change-Id: I288d97b5e331d41d6f5bb49d97986fa12ac6f066	2011-09-29 10:06:24 +03:00
Johann	9f41a8b0aa	Merge "Replace vpx_ports/config.h with vpx_config.h"	2011-09-22 09:30:18 -07:00
Attila Nagy	1a7d25a484	Replace vpx_ports/config.h with vpx_config.h Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a	2011-09-22 13:33:54 +03:00
Fritz Koenig	bd0c3409a8	Move neon only arm functions under arm/neon. These files don't contain generic arm code, so should only be compiled by neon. Change-Id: Ie712823aa04d4235e7cfe7a3b725e73ee4c3e564	2011-09-20 10:51:06 -07:00
Johann	6829e62718	Merge "NEON FDCT updated to match current C code"	2011-09-20 09:51:05 -07:00
Johann	86e07525d5	Merge "NEON walsh transform updated to match C"	2011-09-20 09:50:42 -07:00
Johann	3a16276cf7	Merge "Updated ARMv6 forward transforms to match C"	2011-09-20 09:50:36 -07:00
Tero Rintaluoma	0c2529a812	NEON FDCT updated to match current C code - Removed fast_fdct4x4_neon and fast_fdct8x4_neon - Uses now short_fdct4x4 and short_fdct8x4 - Gives ~1-2% speed-up on Cortex-A8/A9 Change-Id: Ib62f2cb2080ae719f8fa1d518a3a5e71278a41ec	2011-09-20 10:20:55 +03:00
Tero Rintaluoma	3c19bc3fb3	Fixed armv5te multiplications Rd and Rm registers should be different in 'mul'. This register combination results in unpredictable behaviour. GCC will give a warning and RVCT an error in this case. Restriction applies only to armv5 targets and not for armv6 and above. Change-Id: I378d17c51e1f16a6820814fbed43e115aaabb03e	2011-09-20 09:59:27 +03:00
Tero Rintaluoma	4c3ad66b7f	Updated ARMv6 forward transforms to match C - Updated walsh transform to match C (based on Change Id24f3392) - Changed fast_fdct4x4 and 8x4 to short_fdct4x4 and 8x4 correspondingly Change-Id: I704e862f40e315b0a79997633c7bd9c347166a8e	2011-09-19 10:26:59 +03:00
Tero Rintaluoma	2a4b2a000c	NEON walsh transform updated to match C Modified original patch If2f07220885c4c3a0cae0dace34ea0e36124f001 according to comments. Scheduled code a little bit to prevent some interlocks. Change-Id: I338f02b881098782f82af63d97f042b85e63e902	2011-09-19 10:15:33 +03:00
Scott LaVarnway	5bc7b3a68e	Fixed encoder crash caused by the "Removed bmi copy to/from BLOCKD" commit. Change-Id: I9fae71bdc34c8ecc07bb81cd3ccf498b91ce3ec7	2011-09-13 11:46:33 -04:00
Scott LaVarnway	c4b9089bb9	Merge "Skip computation of distortion in vp8_pick_inter_mode if active_map is used"	2011-08-31 07:18:52 -07:00
Scott LaVarnway	222c72e50f	Merge "Removed bmi copy to/from BLOCKD"	2011-08-31 06:57:20 -07:00
Alpha Lam	0e05f2c6c9	Skip computation of distortion in vp8_pick_inter_mode if active_map is used If a block is marked to be inactive then set distortion to 0. Change-Id: Ib415f19642a2ff7b5cf5cfaedd60ebbd79732272	2011-08-31 14:06:55 +01:00
John Koleszar	800b70a3bf	Merge "Recalculate zbin_extra only if regular quantizer is being used"	2011-08-30 12:49:24 -07:00
Alpha Lam	bc9293b815	Recalculate zbin_extra only if regular quantizer is being used vp8_update_zbin_extra() is called all the time even though the fast quantizer doesn't use it. Skip this call if fast quantizer is used. Change-Id: Ia711c38431930cc2486cf59b8466060ef0e9d9db	2011-08-30 19:23:34 +01:00
Yunqing Wang	1f20202e2c	Minor modification on key frame decision This change makes sure that no key frame recoding in real-time mode even if CONFIG_REALTIME_ONLY is not configured. Change-Id: Ifc34141f3217a6bb63cc087d78b111fadb35eec2	2011-08-25 16:54:45 -04:00
Fritz Koenig	4797a97215	Quiet warning by removing unused variable. fwd_boost_score was not being computed or referenced, so remove declaration. Change-Id: Iece36cde1ec113e3c6afaff1407d24cdf12bd0a8	2011-08-24 15:47:09 -07:00
Scott LaVarnway	b870947d42	Removed bmi copy to/from BLOCKD for SPLITMV and B_PRED modes. Modified code to use the bmi found in mode_info_context instead of BLOCKD. On the decode side, the uvmvs are calculated only when required, instead of every macroblock. This is WIP. (bmi should eventually be removed from BLOCKD) Small performance gains noticed for RT encodes and decodes.(VGA) Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7	2011-08-24 14:42:26 -04:00
Scott LaVarnway	1de5da80c9	Merge "Faster vp8_default_coef_probs"	2011-08-24 07:52:10 -07:00
Fritz Koenig	c5f890af2c	Use local labels for jumps/loops in x86 assembly. Prepend . to local labels in assembly code. This allows non unique labels within a file. Also makes profiling information more informative by keeping the function name with the loop name. Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f	2011-08-23 09:05:29 -07:00
Fritz Koenig	694d4e7777	Reclassify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I48ad0218af0cc51c5078070a08511dee43ecfe09	2011-08-22 12:36:28 -07:00
Fritz Koenig	b7a6f1d20e	Merge "Revert "Reclasify optimized ssim calculations as SSE2.""	2011-08-22 12:32:12 -07:00
Fritz Koenig	734b1b2041	Revert "Reclasify optimized ssim calculations as SSE2." This reverts commit 01376858cd184d820ff4c2d8390361a8679c0e87	2011-08-22 11:31:12 -07:00
Fritz Koenig	f8e3d23b99	Merge "Reclasify optimized ssim calculations as SSE2."	2011-08-22 09:20:33 -07:00
Fritz Koenig	01376858cd	Reclasify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97	2011-08-19 08:51:27 -07:00
John Koleszar	edec5eb5e7	Merge "Copy less when active map is in use"	2011-08-19 07:31:00 -07:00
Alpha Lam	4e8d35a461	Copy less when active map is in use When active map is specified and the current frame is not a key frame, golden frame, or altref frame then copy only those active regions. This significantly reduces encoding time by as much as 19% on the test system where realtime encoding is used. This is particularly useful when the frame size is large (e.g. 2560x1600) and there are only a few active macroblocks. Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4	2011-08-19 10:29:41 -04:00
Paul Wilkins	744f482350	Small boost to every other frame. Instead of a single mid GF boost apply a few extra bits to every other frame. This gives a very small average metrics improvement on both derf and YT sets. Also use min GF interval as min KF interval. Change-Id: Iee238b8cae0ffaed850a5a944ac825cee18da485	2011-08-17 14:14:23 +01:00
Scott LaVarnway	19987dcbfa	Faster vp8_default_coef_probs Copies from a generated table instead of building the default coeff probabilities during runtime. Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50	2011-08-16 16:21:21 -04:00
John Koleszar	9cc1611588	Merge v0.9.7-p1 release int 'origin/master' Change-Id: I93388d2f8846615ad1e26b975308c5e96b9b1918	2011-08-15 17:10:01 -04:00
John Koleszar	e96131705a	Revert "Improved 1-pass CBR rate control" This reverts commit b5ea2fbc2c1554769848774c836aad262af95072. Further testing showed noticeable keyframe popping in some cases, reverting this for now to give time for a proper fix. Conflicts: vp8/encoder/onyx_if.c vp8/encoder/ratectrl.c Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af	2011-08-12 14:51:36 -04:00
Yunqing Wang	b84e8f20c3	Merge "Adjust half-pixel only search"	2011-08-05 12:15:32 -07:00
John Koleszar	238dae8604	Fix source buffer selection This patch fixes a bug in the interaction between the recode loop and spatial resampling. If the codec was in a spatial resampling state, and a subsequent iteration of the recode loop disables resampling, then the source buffer must be reset to the unscaled source. Change-Id: I4e4cd47b943f6cd26a47449dc7f4255b38e27c77	2011-08-03 16:13:15 -04:00
Yunqing Wang	b9f19f8917	Adjust half-pixel only search Changed motion search in vp8_find_best_half_pixel_step() to be the same as in vp8_find_best_sub_pixel_step(), which checks 5 points instead of 8 points. This only affects real-time mode with cpu-used >=9. Tests showed it gives 2% encoding speedup with a quality loss(psnr) of up to 0.5%. Change-Id: I16049cad1535002346d46cfdfad345bfc3dc5146	2011-08-03 11:51:07 -04:00
John Koleszar	06c3d5bb9a	Fix building with --disable-postproc Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9	2011-08-01 17:50:23 -04:00
John Koleszar	1f71d2e2c8	Correctly track sharpness in vp8cx_pick_filter_level_fast Make sure to update last_sharpness_level from the current sharpness_level whenever it changes. Change-Id: I0258d2f5b11a407abf6176a8d4c4994d925943f0	2011-07-29 12:27:03 -04:00
Yunqing Wang	2f2302f8d5	Preload reference area in sub-pixel motion search (real-time mode) This change implemented same idea in change "Preload reference area to an intermediate buffer in sub-pixel motion search." The changes were made to vp8_find_best_sub_pixel_step() and vp8_find_best_half_pixel_step() functions which are called when speed >= 5. Test result (using tulip clip): 1. On Core2 Quad machine(Linux) rt mode, speed (-5 ~ -8), encoding speed gain: 2% ~ 3% rt mode, speed (-9 ~ -11), encoding speed gain: 1% ~ 2% rt mode, speed (-12 ~ -14), no noticeable encoding speed gain 2. On Xeon machine(Linux) Test on speed (-5 ~ -14) didn't show noticeable speed change. Change-Id: I21bec2d6e7fbe541fcc0f4c0366bbdf3e2076aa2	2011-07-27 14:19:10 -04:00
Yunqing Wang	f11613b620	Merge "Fix range checks in motion search"	2011-07-27 09:34:13 -07:00
Yunqing Wang	bde2afbe23	Fix range checks in motion search There were some situations that the start motion vectors were out of range. This fix adjusted range checks to make sure they are checked and clamped. Change-Id: Ife83b7fed0882bba6d1fa559b6e63c054fd5065d	2011-07-27 10:37:33 -04:00
James Zern	b45065d38b	cosmetics: consistently use [u]int64_t Removes mixed usage of (unsigned) long long and INT64. Fixes Issue #208. Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a	2011-07-26 11:34:36 -07:00
Yunqing Wang	fe270dd527	Specify size for argument pushed to stack The change fixes building error on Win64. Change-Id: I63d25b26220c4da8a98ca2e36530cbb802468e6b	2011-07-25 11:30:45 -04:00
Johann	773bcc300d	Merge "fix sharpness bug and clean up"	2011-07-22 09:34:55 -07:00
Johann	a04ed0e8f3	fix sharpness bug and clean up sharpness was not recalculated in vp8cx_pick_filter_level_fast remove last_filter_type. all values are calculated, don't need to update the lfi data when it changes. always use cm->sharpness_level. the extra indirection was annoying. don't track last frame_type or sharpness_level manually. frame type only matters for motion search and sharpness_level is taken care of in frame_init move function declarations to their proper header Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db	2011-07-22 12:33:57 -04:00
Yunqing Wang	829179e888	Merge "Preload reference area to an intermediate buffer in sub-pixel motion search"	2011-07-22 06:56:15 -07:00
Yunqing Wang	20bd1446c0	Preload reference area to an intermediate buffer in sub-pixel motion search In sub-pixel motion search, the search range is small(+/- 3 pixels). Preload whole search area from reference buffer into a 32-byte aligned buffer. Then in search, load reference data from this buffer instead. This keeps data in cache, and reduces the cache-line crossing penalty. For tulip clip, tests on Intel Core2 Quad machine(linux) showed encoder speed improvement: 3.4% at --rt --cpu-used=-4 2.8% at --rt --cpu-used=-3 2.3% at --rt --cpu-used=-2 2.2% at --rt --cpu-used=-1 Test on Atom notebook showed only 1.1% speed improvement(speed=-4). Test on Xeon machine also showed less improvement, since unaligned data access latency is greatly reduced in newer cores. Next, I will apply similar idea to other 2 sub-pixel search functions for encoding speed > 4. Make this change exclusively for x86 platforms. Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f	2011-07-22 09:28:06 -04:00
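
Commit 7bce513afe in the log above describes computing near motion vectors lazily, so that reference frames never reached before an early exit cost nothing. The following is a minimal sketch of that pattern, not the libvpx code: near_mv, near_mv_cache, expensive_find_near_mvs(), get_near_mv() and pick_mode() are all hypothetical names, and the "expensive" routine is a stand-in that only fills in dummy values.

```c
#include <assert.h>

#define NUM_REF_FRAMES 4 /* hypothetical count, for illustration only */

typedef struct { int row, col; } near_mv;

/* Per-macroblock cache: one slot per reference frame, filled on demand. */
typedef struct {
    near_mv nearest[NUM_REF_FRAMES];
    int computed[NUM_REF_FRAMES]; /* 1 once the slot has been filled */
    int calls;                    /* how often the expensive routine ran */
} near_mv_cache;

/* Stand-in for the costly neighbour scan; writes dummy values. */
static void expensive_find_near_mvs(int ref, near_mv *out)
{
    out->row = ref * 2;
    out->col = -ref;
}

/* Compute the result for `ref` only on first use, then reuse it. */
static const near_mv *get_near_mv(near_mv_cache *c, int ref)
{
    assert(ref >= 0 && ref < NUM_REF_FRAMES);
    if (!c->computed[ref]) {
        expensive_find_near_mvs(ref, &c->nearest[ref]);
        c->computed[ref] = 1;
        c->calls++;
    }
    return &c->nearest[ref];
}

/* Mode loop with an early exit after index `skip_after`.
 * Returns the number of modes actually evaluated. */
static int pick_mode(near_mv_cache *c, const int *mode_ref, int n_modes,
                     int skip_after)
{
    int i;
    for (i = 0; i < n_modes; i++) {
        const near_mv *mv = get_near_mv(c, mode_ref[i]);
        (void)mv; /* a real encoder would evaluate the mode here */
        if (i == skip_after)
            return i + 1; /* early exit: later reference frames stay untouched */
    }
    return n_modes;
}
```

With a zero-initialized cache and modes that reference frames 1, 1, 2, 3, exiting after the second mode runs the expensive routine once and leaves the slots for frames 2 and 3 unfilled.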

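
Commit bde2afbe23 in the log above mentions start motion vectors that were out of range and a fix that checks and clamps them. A rough sketch of such a clamp follows, under the assumption that the limits are inclusive integer bounds per axis; search_mv, search_limits, clamp_int() and clamp_start_mv() are hypothetical names, not libvpx identifiers.

```c
#include <assert.h>

typedef struct { int row, col; } search_mv;

/* Inclusive bounds for a motion search window (hypothetical layout). */
typedef struct { int row_min, row_max, col_min, col_max; } search_limits;

static int clamp_int(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp a start vector into the window; returns 1 if it was modified. */
static int clamp_start_mv(search_mv *mv, const search_limits *lim)
{
    const search_mv before = *mv;
    mv->row = clamp_int(mv->row, lim->row_min, lim->row_max);
    mv->col = clamp_int(mv->col, lim->col_min, lim->col_max);
    return before.row != mv->row || before.col != mv->col;
}
```

Clamping before the search starts means every candidate derived from the start vector begins inside the window, rather than relying on each later step to reject out-of-range positions.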
953 Commits