generic-library/vpx

Author	SHA1	Message	Date
Fritz Koenig	694d4e7777	Reclassify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I48ad0218af0cc51c5078070a08511dee43ecfe09	2011-08-22 12:36:28 -07:00
Fritz Koenig	b7a6f1d20e	Merge "Revert "Reclasify optimized ssim calculations as SSE2.""	2011-08-22 12:32:12 -07:00
Fritz Koenig	734b1b2041	Revert "Reclasify optimized ssim calculations as SSE2." This reverts commit 01376858cd184d820ff4c2d8390361a8679c0e87	2011-08-22 11:31:12 -07:00
Fritz Koenig	f8e3d23b99	Merge "Reclasify optimized ssim calculations as SSE2."	2011-08-22 09:20:33 -07:00
Fritz Koenig	01376858cd	Reclasify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97	2011-08-19 08:51:27 -07:00
John Koleszar	edec5eb5e7	Merge "Copy less when active map is in use"	2011-08-19 07:31:00 -07:00
Alpha Lam	4e8d35a461	Copy less when active map is in use When active map is specified and the current frame is not a key frame, golden frame nor a altref frame then copy only those active regions. This significantly reduces encoding time by as much as 19% on the test system where realtime encoding is used. This is particularly useful when the frame size is large (e.g. 2560x1600) and there's only a few action macroblocks. Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4	2011-08-19 10:29:41 -04:00
Paul Wilkins	744f482350	Small boost to every other frame. Instead of a single mid GF boost apply a few extra bits to every other frame. This gives a very small average metrics improvement on both derf and YT sets. Also use min GF interval as min KF interval. Change-Id: Iee238b8cae0ffaed850a5a944ac825cee18da485	2011-08-17 14:14:23 +01:00
Scott LaVarnway	19987dcbfa	Faster vp8_default_coef_probs Copies from a generated table instead of building the default coeff probabilities during runtime. Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50	2011-08-16 16:21:21 -04:00
John Koleszar	9cc1611588	Merge v0.9.7-p1 release int 'origin/master' Change-Id: I93388d2f8846615ad1e26b975308c5e96b9b1918	2011-08-15 17:10:01 -04:00
John Koleszar	e96131705a	Revert "Improved 1-pass CBR rate control" This reverts commit b5ea2fbc2c1554769848774c836aad262af95072. Further testing showed noticable keyframe popping in some cases, reverting this for now to give time for a proper fix. Conflicts: vp8/encoder/onyx_if.c vp8/encoder/ratectrl.c Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af	2011-08-12 14:51:36 -04:00
Yunqing Wang	b84e8f20c3	Merge "Adjust half-pixel only search"	2011-08-05 12:15:32 -07:00
John Koleszar	238dae8604	Fix source buffer selection This patch fixes a bug in the interaction between the recode loop and spatial resampling. If the codec was in a spatial resampling state, and a subsequent iteration of the recode loop disables resampling, then the source buffer must be reset to the unscaled source. Change-Id: I4e4cd47b943f6cd26a47449dc7f4255b38e27c77	2011-08-03 16:13:15 -04:00
Yunqing Wang	b9f19f8917	Adjust half-pixel only search Changed motion search in vp8_find_best_half_pixel_step() to be the same as in vp8_find_best_sub_pixel_step(), which checks 5 points instead of 8 points. This only affects real-time mode with cpu-used >=9. Tests showed it gives 2% encoding speedup with a quality loss(psnr) of up to 0.5%. Change-Id: I16049cad1535002346d46cfdfad345bfc3dc5146	2011-08-03 11:51:07 -04:00
John Koleszar	06c3d5bb9a	Fix building with --disable-postproc Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9	2011-08-01 17:50:23 -04:00
John Koleszar	1f71d2e2c8	Correctly track sharpness in vp8cx_pick_filter_level_fast Make sure to update last_sharpness_level from the current sharpness_level whenever it changes. Change-Id: I0258d2f5b11a407abf6176a8d4c4994d925943f0	2011-07-29 12:27:03 -04:00
Yunqing Wang	2f2302f8d5	Preload reference area in sub-pixel motion search (real-time mode) This change implemented same idea in change "Preload reference area to an intermediate buffer in sub-pixel motion search." The changes were made to vp8_find_best_sub_pixel_step() and vp8_find_best_half _pixel_step() functions which are called when speed >= 5. Test result (using tulip clip): 1. On Core2 Quad machine(Linux) rt mode, speed (-5 ~ -8), encoding speed gain: 2% ~ 3% rt mode, speed (-9 ~ -11), encoding speed gain: 1% ~ 2% rt mode, speed (-12 ~ -14), no noticeable encoding speed gain 2. On Xeon machine(Linux) Test on speed (-5 ~ -14) didn't show noticeable speed change. Change-Id: I21bec2d6e7fbe541fcc0f4c0366bbdf3e2076aa2	2011-07-27 14:19:10 -04:00
Yunqing Wang	f11613b620	Merge "Fix range checks in motion search"	2011-07-27 09:34:13 -07:00
Yunqing Wang	bde2afbe23	Fix range checks in motion search There were some situations that the start motion vectors were out of range. This fix adjusted range checks to make sure they are checked and clamped. Change-Id: Ife83b7fed0882bba6d1fa559b6e63c054fd5065d	2011-07-27 10:37:33 -04:00
James Zern	b45065d38b	cosmetics: consistently use [u]int64_t Removes mixed usage of (unsigned) long long and INT64. Fixes Issue #208. Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a	2011-07-26 11:34:36 -07:00
Yunqing Wang	fe270dd527	Specify size for argument pushed to stack The change fixes building error on Win64. Change-Id: I63d25b26220c4da8a98ca2e36530cbb802468e6b	2011-07-25 11:30:45 -04:00
Johann	773bcc300d	Merge "fix sharpness bug and clean up"	2011-07-22 09:34:55 -07:00
Johann	a04ed0e8f3	fix sharpness bug and clean up sharpness was not recalculated in vp8cx_pick_filter_level_fast remove last_filter_type. all values are calculated, don't need to update the lfi data when it changes. always use cm->sharpness_level. the extra indirection was annoying. don't track last frame_type or sharpness_level manually. frame type only matters for motion search and sharpness_level is taken care of in frame_init move function declarations to their proper header Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db	2011-07-22 12:33:57 -04:00
Yunqing Wang	829179e888	Merge "Preload reference area to an intermediate buffer in sub-pixel motion search"	2011-07-22 06:56:15 -07:00
Yunqing Wang	20bd1446c0	Preload reference area to an intermediate buffer in sub-pixel motion search In sub-pixel motion search, the search range is small(+/- 3 pixels). Preload whole search area from reference buffer into a 32-byte aligned buffer. Then in search, load reference data from this buffer instead. This keeps data in cache, and reduces the crossing cache- line penalty. For tulip clip, tests on Intel Core2 Quad machine(linux) showed encoder speed improvement: 3.4% at --rt --cpu-used =-4 2.8% at --rt --cpu-used =-3 2.3% at --rt --cpu-used =-2 2.2% at --rt --cpu-used =-1 Test on Atom notebook showed only 1.1% speed improvement(speed=-4). Test on Xeon machine also showed less improvement, since unaligned data access latency is greatly reduced in newer cores. Next, I will apply similar idea to other 2 sub-pixel search functions for encoding speed > 4. Make this change exclusively for x86 platforms. Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f	2011-07-22 09:28:06 -04:00
John Koleszar	2bdda84e37	Merge "Increase chrow row alignment to 16 bytes."	2011-07-21 07:32:39 -07:00
Yunqing Wang	c5fe641179	Merge "Add improvements made in good-quality mode to real-time mode"	2011-07-21 07:27:09 -07:00
Timothy B. Terriberry	7d1b37cdac	Increase chrow row alignment to 16 bytes. This is done by expanding luma row to 32-byte alignment, since there is currently a bunch of code that assumes that uv_stride == y_stride/2 (see, for example, vp8/common/postproc.c, common/reconinter.c, common/arm/neon/recon16x16mb_neon.asm, encoder/temporal_filter.c, and possibly others; I haven't done a full audit). It also uses replaces the hardcoded border of 16 in a number of encoder buffers with VP8BORDERINPIXELS (currently 32), as the chroma rows start at an offset of border/2. Together, these two changes have the nice advantage that simply dumping the frame memory as a contiguous blob produces a valid, if padded, image. Change-Id: Iaf5ea722ae5c82d5daa50f6e2dade9de753f1003	2011-07-20 10:20:31 -07:00
Scott LaVarnway	a25f6a9c88	Moved vp8_encode_bool into boolhuff.h allowing the compiler to inline this function. For real-time encodes, this gave a boost of 1% to 2.5%, depending on the speed setting. Change-Id: I3929d176cca086b4261267b848419d5bcff21c02	2011-07-19 09:17:25 -04:00
John Koleszar	b5ea2fbc2c	Improved 1-pass CBR rate control This patch attempts to improve the handling of CBR streams with respect to the short term buffering requirements. The "buffer level" is changed to be an average over the rc buffer, rather than a long running average. Overshoot is also tracked over the same interval and the golden frame targets suppressed accordingly to correct for overly aggressive boosting. Testing shows that this is fairly consistently positive in one metric or another -- some clips that show significant decreases in quality have better buffering characteristics, others show improvenents in both. Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920	2011-07-18 11:48:05 -04:00
Scott LaVarnway	e68894fa03	Merge "Tokenize MB optimized"	2011-07-15 07:54:14 -07:00
Tero Rintaluoma	4e82f01547	Tokenize MB optimized Optimized C-code of the following functions: - vp8_tokenize_mb - tokenize1st_order_b - tokenize2nd_order_b Gives ~1-5% speed-up for RT encoding on Cortex-A8/A9 depending on encoding parameters. Change-Id: I6be86104a589a06dcbc9ed3318e8bf264ef4176c	2011-07-15 11:26:54 +03:00
John Koleszar	04dce631a2	Remove unused speed features min_fs_radius, max_fs_radius, full_freq were set but never read. Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74	2011-07-14 14:20:25 -04:00
Yunqing Wang	0e9a6ed72a	Add improvements made in good-quality mode to real-time mode Several improvements we made in good-quality mode can be added into real-time mode to speed up encoding in speed 1, 2, and 3 with small quality loss. Tests using tulip clip showed: --rt --cpu-used=-1 (before change) PSNR: 38.028 time: 1m33.195s (after change) PSNR: 38.014 time: 1m20.851s --rt --cpu-used=-2 (before change) PSNR: 37.773 time: 0m57.650s (after change) PSNR: 37.759 time: 0m54.594s --rt --cpu-used=-3 (before change) PSNR: 37.392 time: 0m42.865s (after change) PSNR: 37.375 time: 0m41.949s Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9	2011-07-13 14:51:02 -04:00
Fritz Koenig	84c3cd79d1	Merge "Reduce motion vector search on alt-ref frame."	2011-07-13 10:07:30 -07:00
Johann	d9b825cff2	Merge "New loop filter interface"	2011-07-13 04:09:26 -07:00
Fritz Koenig	ede0b15c9d	Reduce motion vector search on alt-ref frame. Clamp mv search to accomodate subpixel filtering of UV mv. Change-Id: Iab3ed405993ef6bf779ad7cf60863153068fb7d1	2011-07-11 09:05:43 -07:00
Yunqing Wang	587ca06da9	Minor change in pick_inter_mode() Scott suggested to move vp8_mv_pred() under "case NEWMV" to save extra checks. Change-Id: I09e69892f34a08dd425a4d81cfcc83674e344a20	2011-07-08 14:08:45 -04:00
Yunqing Wang	e83d36c053	Merge "Adjust full-pixel clamping and motion vector limit calculation"	2011-07-08 08:39:32 -07:00
Yunqing Wang	40991faeae	Adjust full-pixel clamping and motion vector limit calculation Do mvp clamping in full-pixel precision instead of 1/8-pixel precision to avoid error caused by right shifting operation. Also, further fixed the motion vector limit calculation in change: b7480454706a6b15bf091e659cd6227ab373c1a6 Change-Id: Ied88a4f7ddfb0476eb9f7afc6ceeddbf209fffd7	2011-07-08 11:34:28 -04:00
Johann	6ae12c415e	Merge "clean up warnings when building arm with rtcd"	2011-07-08 05:16:09 -07:00
Attila Nagy	622958449b	New loop filter interface Separate simple filter with reduced no. of parameters. MB filter level picking based on precalculated table. Level table updated for each frame. Inside and edge limits precalculated and updated just when sharpness changes. HEV threshhold is constant. ARM targets use scalars and others vectors. Change works only with --target=generic-gnu All other targets have to be updated! Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c	2011-07-08 09:31:41 +03:00
John Koleszar	973a9c075d	Merge "Set VPX_FRAME_IS_DROPPABLE"	2011-07-07 08:11:05 -07:00
John Koleszar	37de0b8bdf	Set VPX_FRAME_IS_DROPPABLE Allow the encoder to inform the application that the encoded frame will not be used as a reference. Change-Id: I90e41962325ef73d44da03327deb340d6f7f4860	2011-07-07 10:38:45 -04:00
Yunqing Wang	ae8aa836d5	Merge "Copy macroblock data to a buffer before encoding it"	2011-06-30 11:14:24 -07:00
Yunqing Wang	80c3bbf657	Merge "Bug fix in motion vector limit calculation"	2011-06-30 09:52:03 -07:00
Yunqing Wang	b748045470	Bug fix in motion vector limit calculation Motion vector limits are calculated using right shifts, which could give wrong results for negative numbers. James Berry's test on one clip showed encoder produced some artifacts. This change fixed that. Change-Id: I035fc02280b10455b7f6eb388f7c2e33b796b018	2011-06-30 11:20:13 -04:00
Johann	3e4a80cc35	Merge "remove incorrect initialization"	2011-06-30 07:59:08 -07:00
Paul Wilkins	eacaabc592	Merge "Change to arf boost calculation."	2011-06-29 10:03:57 -07:00
Paul Wilkins	11694aab66	Change to arf boost calculation. In this commit I have added an experimental function that tests prediction quality either side of a central position to calculate a suggested boost number for an ARF frame. The function is passed an offset from the current position and a number of frames to search forwards and backwards. It returns a forward, backward and compound boost number. The new code can be deactivated using #define NEW_BOOST 0 In its current default state the code searches forwards and backwards from the proposed position of the next alt ref. The the old code used a boost number calculated by scanning forward from the previous GF up to the proposed alt ref frame position. I have also added some code to try and prevent placement of a gf/arf where there is a brief flash. Change-Id: I98af789a5181148659f10dd5dd2ff2d4250cd51c	2011-06-29 18:01:25 +01:00

... 2 3 4 5 6 ...

828 Commits