Commit Graph

380 Commits

Author SHA1 Message Date
James Berry
f3e9e2a0f8 Fix incorrect macroblock counts in twopass rate control
The previous calculation of macroblock count (w*h)/256
is not correct when the width/height are not multiples of
16. Use the precalculated macroblock count from
cpi->common instead. This manifested itself as a divide
by zero when the number of pixels was less than 256.
num_mbs updated in estimate_max_q, estimate_q,
 estimate_kf_group_q, and estimate_cq

Change-Id: I92ff98587864c801b1ee5485cfead964673a9973
2011-03-10 13:33:06 -05:00
Yunqing Wang
7b8e7f0f3a Add vp8_sub_pixel_variance16x8_ssse3 function
Added SSSE3 function

Change-Id: I8c304c92458618d93fda3a2f62bd09ccb63e75ad
2011-03-09 12:33:21 -05:00
Yunqing Wang
4561109a69 Remove unused functions
Removed some unused functions

Change-Id: Ifdfc27453e53cfc75997b38492901d193a16b245
2011-03-09 10:45:03 -05:00
Yunqing Wang
7966dd5287 Merge "Improve SSE2 half-pixel filter funtions" 2011-03-09 07:23:06 -08:00
John Koleszar
fa836faede Merge "Configuration updates:Making a clear distinction between Init and Change" 2011-03-09 05:07:11 -08:00
Yunqing Wang
419f638910 Improve SSE2 half-pixel filter funtions
Rewrote these functions to process 16 pixels once instead of 8.

Change-Id: Ic67e80124467a446a3df4cfecfb76a4248602adb
2011-03-08 16:25:06 -05:00
Yunqing Wang
859abd6b5d Merge "Add zero offset checking in SSE2 sub-pixel filter function" 2011-03-08 12:26:58 -08:00
Yunqing Wang
8432a1729f Add zero offset checking in SSE2 sub-pixel filter function
Skip filter at zero offset.

Change-Id: I95fc7e211869bc0ab5bcfb7ab2e3259d1c0ccf38
2011-03-08 15:22:07 -05:00
Yunqing Wang
e8f7b0f7f5 Merge "Write SSSE3 sub-pixel filter function" 2011-03-08 10:58:30 -08:00
Yunqing Wang
244e2e1451 Write SSSE3 sub-pixel filter function
1. Process 16 pixels at one time instead of 8.
2. Add check for both xoffset =0 and yoffset=0, which happens
   during motion search.
This change gave encoder 1%~3% performance gain.

Change-Id: Idaa39506b48f4f8b2fbbeb45aae8226fa32afb3e
2011-03-08 13:29:01 -05:00
Ralph Giles
e6948bf0f9 Fix a multi-line format-string warning.
GCC 4.5 and 4.6 both issue a warning about the multi-line format
string introduced in bc9c30a0, which also changed the whitespace
in the associated stt file by line-wrapping the long format string.

Instead, use multiple string constants, which the compiler will
concatenate. This maintains the original formatting, but remains
legible within the standard line length.

Change-Id: I27c9f92d46be82d408105a3a4091f145f677e00e
2011-03-08 07:14:12 -08:00
Paul Wilkins
de87c420ef Corrected minor typos.
Change-Id: Icc9f12bd1e1bdaf51256dc8a90d08aa9be89ef34
2011-03-08 14:46:22 +00:00
Paul Wilkins
0eccee4378 Merge changes I00c3e823,If8bca004
* changes:
  Improved key frame detection.
  Improved KF insertion after fades to still.
2011-03-08 06:40:11 -08:00
John Koleszar
5d1d9911cb correct zbin boost for splitmv mode
Disable zbin boost in SPLITMV mode as intended. Was incorrectly looking
at vp8_ref_frame_order instead of vp8_mode_order when comparing against
SPLITMV. This condition should have always been false, as SPLITMV is
not in the range of valid reference frames.

Change-Id: I0408cc7595eff68f00efef6d008e79f5b60d14bf
2011-03-07 20:58:37 -05:00
Paul Wilkins
bc9c30a003 Improved key frame detection.
In some cases where clips have been encoded with
borders (eg. some wide-screen content where there is a
border top and bottom and slide shows containing portrait
format photographs (border left and right)) key frames were
not being correctly detected.

The new code looks to measure cases where a portion of
the image can be coded equally easily using intra or inter
modes and where the resulting error score is also very low.
These "neutral" areas are then discounted in the key frame
detection code.

Change-Id: I00c3e8230772b8213cdc08020e1990cf83b780d8
2011-03-07 15:58:07 +00:00
Paul Wilkins
9fc8cb39aa Improved KF insertion after fades to still.
This code extends what was previously done for GFs, to pick
cases where insertion of a key frame after a fade (or other
transition or complex motion)  followed by a still section, will
be beneficial and will reduce the number of forced key frames.

Change-Id: If8bca00457f0d5f83dc3318a587f61c17d90f135
2011-03-07 15:11:09 +00:00
John Koleszar
0bc31f1887 Merge "Fixing divide by zero" 2011-03-04 05:40:33 -08:00
Mikhal Shemer
84f7f20985 Configuration updates:Making a clear distinction between Init and Change
Change-Id: I7b2fb326e1aabc08b032177a7b914a5b8bb7376f
2011-03-03 10:35:09 -08:00
Mikhal Shemer
1de99a2a81 Fixing divide by zero
Change-Id: I9d8a98a2f7ed1e3116d0bae35164618c41998bac
2011-03-03 10:33:36 -08:00
John Koleszar
36be4f7f06 Fix drastic undershoot in long form content
When the modified_error_left accumulator exceeds INT_MAX, an incorrect
cast to int resulted in a negative value, causing the rate control to
allocate no bits to that keyframe group, leading to severe undershoot
and subsequent poor quality.

This error was exposed by the recent change to the rolling target and
actual spend accumulators in commit 305be4e4 which fixed them to
actually calculate the average value rather than be re-initialized
on every frame to the average per-frame bitrate. When this bug was
triggered, the target bitrate could be 0, so the rolling target
becomes small, which causes the undershoot. The code prior to 305be4e4
did not exhibit this behavior because the rolling target was always
set to a reasonable value and was independent of the actual target
bitrate. With this patch, the actual target bitrate is calculated
correctly, and the rate control tracks as expected.

This cast was likely added to silence a compiler warning on a comparison
between a double (modified_error_left) and an int (0). Instead, this
patch removes the cast and changes the comparison to be against 0.0,
which should prevent the warning from reoccuring.

This fixes issue #289. Special thanks to gnafu for his efforts in
reporting and debugging this fix.

Change-Id: Ie5cc1a7b516c578a76c3a50c892a6f04a11621fe
2011-03-02 22:52:27 -05:00
Johann
6f5189c044 Merge "ARMv6 optimized half pixel variance calculations" 2011-03-02 05:48:46 -08:00
Yunqing Wang
cfaee9f7c6 Merge "Add prefetch before variance calculation" 2011-02-28 11:42:28 -08:00
Scott LaVarnway
3e6d476ac3 Merge "Avoid double copying of key frames into alt and golden buffer" 2011-02-28 10:16:33 -08:00
Yunqing Wang
d96ba65a23 Add prefetch before variance calculation
This improved encoding performance by 0.5% (good, speed 1) to
1.5% (good, speed 5).

Change-Id: I843d72a0d68a90b5f694adf770943e4a4618f50e
2011-02-28 11:25:55 -05:00
Johann
31dab574cc Merge "Remove a second check for invalid ptr in vp8_get_compressed_data" 2011-02-25 11:44:18 -08:00
Johann
e4fa638653 Merge "Remove temporal alt ref from realtime only build" 2011-02-25 06:55:17 -08:00
Attila Nagy
d8fc974ac0 Avoid double copying of key frames into alt and golden buffer
Change-Id: I726976a297a593a35ed6cba3c660e372562f7b27
2011-02-25 09:03:16 +02:00
Attila Nagy
6da2018789 Remove a second check for invalid ptr in vp8_get_compressed_data
Check is done first when function si entered.

Change-Id: Ief0d0cbd4860aaf492b78728f8d22f24029b1174
2011-02-25 08:41:13 +02:00
Scott LaVarnway
861175ef00 Removed vp8_block2type
and used defines instead.

Change-Id: Idb56e0295d004793f406dfd2d8d8c546aad62e03
2011-02-24 14:35:18 -05:00
Scott LaVarnway
d53492bba4 Merge "Revisited rd_pick_intra4x4block" 2011-02-24 11:25:21 -08:00
Scott LaVarnway
658454a04c Revisited rd_pick_intra4x4block
Removed unnecessary copies.  No noticeable speed gains.


Change-Id: I996c50c23fedd06d54ee7a3e762cbf559cc4a9d1
2011-02-24 13:31:47 -05:00
Paul Wilkins
b862c108dd Overflow of frame error accumulators.
This fixes an overflow problem in the frame error accumulators.

The overflow condition is extreme but did trigger when Frank B.
coded some high motion interlaced HD content.

The observed effect was a catastrophic  breakdown of the rate
control leading to massive undershoot and poor bit allocation.

All the error values should really be unsigned but I will look at this
separately.

Change-Id: I9745f5c5ca2783620426b66b568b2088b579151f
2011-02-24 15:49:41 +00:00
Tero Rintaluoma
8ae92aef66 ARMv6 optimized half pixel variance calculations
Adds following ARMv6 optimized functions to the encoder:
 - vp8_variance_halfpixvar16x16_h_armv6
 - vp8_variance_halfpixvar16x16_v_armv6
 - vp8_variance_halfpixvar16x16_hv_armv6

Change-Id: I1e9c2af7acd2a51b72b3845beecd990db4bebd29
2011-02-23 13:27:27 +02:00
Attila Nagy
7af0d906e3 Remove temporal alt ref from realtime only build
It is not used in realtime mode. Reduces memory footprint.

Change-Id: I7f163225762368df5457cfd413050161d3704a3f
2011-02-22 12:53:32 +02:00
Johann
945dad277d Revert "use unaligned load"
This reverts commit f50f2fd2a7.

Change Ib7506e3e aligns the buffer

Change-Id: Ie0f8bd3e57cfdfef81d39638a1451458ebbae2e0
2011-02-18 10:23:02 -05:00
John Koleszar
c764c2a20f Merge "clean up unused files" 2011-02-18 06:33:05 -08:00
John Koleszar
3ed8fe8778 remove unused vp8_predict_dc function
Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42
2011-02-18 09:12:20 -05:00
John Koleszar
cbf923b12c clean up unused files
Removed a number of files that were unused or little-used.

Change-Id: If9ae5e5b11390077581a9a879e8a0defe709f5da
2011-02-18 09:09:49 -05:00
John Koleszar
d371ca93e5 cosmetic: remove unnecessary scope
Clean up some unnecessary scoping around pick_filter_level.

Change-Id: Ic57fa33e3fcae37fe6beae977e5743783399d5af
2011-02-18 08:46:07 -05:00
John Koleszar
597d02b508 Merge "Dont pick encoder filter level when loopfilter is disabled." 2011-02-18 05:26:23 -08:00
Attila Nagy
fb5a692d27 Reinitialize quantizer only when any delta is changing
No need to reinitialize for base Q changes.

Change-Id: Ie76ec21dd3c5582d5183dbed75ed73a1eed3e291
2011-02-18 14:23:37 +02:00
Attila Nagy
c6ef75690f Dont pick encoder filter level when loopfilter is disabled.
Change-Id: I58154faf4f3ece24f9927a5c3ab7e830e0887fb6
2011-02-18 08:53:00 +02:00
John Koleszar
562f1470ce Use endian-neutral bitstream packing/unpacking
Eliminate unnecessary checks on target endianness and associated
macros.

Change-Id: I1d4e6a9dcee9bfc8940c8196838d31ed31b0e4aa
2011-02-17 15:20:53 -05:00
John Koleszar
c351aa7f1b Merge "Fix relative include paths" 2011-02-17 04:13:44 -08:00
Yunqing Wang
da9402fbf6 Merge "Allocate source buffers to be multiples of 16" 2011-02-16 11:35:06 -08:00
Yunqing Wang
da227b901d Allocate source buffers to be multiples of 16
Currently, when the video frame width is not multiples of 16, the
source buffer has a stride of non-multiples of 16, which forces
an unaligned load in SAD function and hurts the performance. To
avoid that, this change allocates source buffers to be multiples
of 16.

Change-Id: Ib7506e3eb2cea06657d56be5a899f38dfe3eeb39
2011-02-16 12:57:17 -05:00
Johann
0c2cfff9b0 Merge "ARMv6 optimized sad16x16" 2011-02-16 05:22:38 -08:00
James Zern
0030303b69 Remove redundant ptr checks in calls to vpx_free
vpx_free if used contains this check. If replaced, well behaved free
will behave similarly.

Change-Id: I25483aaa8b39255b9a8cf388d6e5eaa20a908ae1
2011-02-15 12:43:35 -08:00
Yunqing Wang
7725a7eb56 Merge "Improve vp8_sad16x16_sse3 function" 2011-02-14 14:09:25 -08:00
Yaowu Xu
27dad21548 Merge "Improved vp8_rd_pick_intra_mbuv_mode" 2011-02-14 13:58:12 -08:00