Commit Graph

427 Commits

Author SHA1 Message Date
James Berry
d3dfcde0f7 mem leak fix for cpi->tplist
checks added to make sure that cpi->tplist
is freed correctly in vp8_dealloc_compressor_data
and vp8_alloc_compressor_data.

Change-Id: I66149dbbd25c958800ad94f4379d723191d9680d
2011-02-14 14:02:52 -05:00
Scott LaVarnway
d419b93e3e Improved rd_pick_intra4x4block
Eliminated unnecessary calculations.  Improved performance
by 10% on keyframes and 1.6% overall for the test clip used.

Change-Id: I87671b26af5e2cc439e81d0fee3b15c7cd2a3309
2011-02-14 13:32:58 -05:00
Yunqing Wang
353246bd60 Merge "Add improved_mv_pred flag in real-time mode" 2011-02-11 07:20:17 -08:00
Yunqing Wang
9d0b2cbbce Add improved_mv_pred flag in real-time mode
As mentioned in check-in "Improve motion search in real-time mode",
MV prediction calculation causes speed loss for speed 7 and above.
This change added a flag to turn off this calculation for speed>6
in real-time mode.

Change-Id: I9f4ae5a8bf449222d1784b54e7d315fc8347b2d1
2011-02-11 09:59:41 -05:00
Tero Rintaluoma
1ef86980b9 ARMv6 optimized sad16x16
Adds a new ARMv6 optimized function vp8_sad16x16_armv6 to encoder.

Change-Id: Ibbd7edb8b25cb7a5b522d391b1e9a690fe150e57
2011-02-11 11:14:07 +02:00
Yaowu Xu
4f8a166058 Merge "Redefining good quality speed settings" 2011-02-10 21:38:19 -08:00
Yunqing Wang
6f53e59641 Merge "Improve motion search in real-time mode" 2011-02-10 12:42:44 -08:00
John Koleszar
02321de0f2 Fix relative include paths
Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.

Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
2011-02-10 15:09:44 -05:00
Yunqing Wang
41e6eceb28 Improve motion search in real-time mode
Applied better MV prediction in real-time mode, which improves
the encoding quality.

Used quarter-pixel search instead of iterative sub-pixel search
for speed >=5 to improve encoding performance.

Tests on the test set showed:
1. For speed=-5, quality improvement: 1.7% on AvgPSNR and 2.1%
on SSIM; performance improvement: 3.6%. (This includes the
performance loss caused by the MV prediction calculation in "Improve
MV prediction in vp8_pick_inter_mode() for speed>3".)
2. For speed=-8, quality improvement: 2.1% on AvgPSNR and 2.5%
on SSIM, but a 6.9% performance decrease because of the MV prediction
calculation. This should be improved later.

Change-Id: I349a96c452bd691081d8c8e3e54419e7f477bebd
2011-02-10 13:40:24 -05:00
Johann
7d8199f0c3 Merge "Adds armv6 optimized variance calculation" 2011-02-10 06:06:46 -08:00
Scott LaVarnway
19054ab6da Redefining good quality speed settings
Created a new speed 1 which is in the middle of the old
speed 0 and speed 1. (for both quality and performance)

Change-Id: I4802133cdb43f359ca787646c090899679dd5d84
2011-02-09 17:18:28 -05:00
James Berry
fffa2a61d7 fixed stride in vp8_temporal_filter_predictors_mb_c
stride would not be calculated correctly for material
with odd sized frame widths.

Change-Id: I1710f6aef9ebb93d36249c9239c68c5baa9791f8
2011-02-09 16:55:39 -05:00
John Koleszar
c2b43164bd Merge "correct cost for implicit bit in mvs" 2011-02-09 11:20:12 -08:00
John Koleszar
9954d05ca6 correct cost for implicit bit in mvs
Use 0xFFF0 vice 240 (0xF0) for determining whether the sometimes
implicit bit 3 will be transmitted. This is consistent with the decoder
and encode_mvcomponent().

Change-Id: Ic1304d0ab56844bed8236edd1c5243a6767fc6b1
2011-02-09 12:50:17 -05:00
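Illustrative sketch for 9954d05ca6 (not the committed code): the two masks named in the message disagree for any magnitude whose bits 4..7 are all zero while a higher bit is set. The helper names below are hypothetical; only the values 240 (0xF0) and 0xFFF0 come from the commit message.

```c
#include <assert.h>

/* Hypothetical helpers: only the mask values 240 and 0xFFF0 are from the commit. */

/* Old test: inspects only bits 4..7 of the MV component magnitude. */
int bit3_sent_old(unsigned int mag)
{
    return (mag & 240u) != 0;
}

/* New test: inspects every bit from bit 4 up to bit 15. */
int bit3_sent_new(unsigned int mag)
{
    return (mag & 0xFFF0u) != 0;
}

/* Returns 1 when the old and new tests give different answers. */
int masks_disagree(unsigned int mag)
{
    assert(mag <= 0xFFFFu); /* sketch assumes a 16-bit magnitude */
    return bit3_sent_old(mag) != bit3_sent_new(mag);
}
```

For example, a magnitude of 0x100 has bits 4..7 clear, so the old mask reports that bit 3 is not transmitted while the new mask reports that it is; for 0x20 both masks agree.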
John Koleszar
a39b5af10b Merge "Put more code under #if CONFIG_MULTITHREAD." 2011-02-09 08:31:36 -08:00
Gaute Strokkenes
315e3c2518 Put more code under #if CONFIG_MULTITHREAD.
Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b
2011-02-09 11:21:18 -05:00
Scott LaVarnway
85e79ce288 Merge "Added early breakout for vp8_rd_pick_intra4x4mby_modes" 2011-02-09 07:55:04 -08:00
Tero Rintaluoma
cb14764fab Adds armv6 optimized variance calculation
Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
and adds new assembly file for variance16x16 calculation.
 - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
 - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
 - vp8_variance16x16_armv6 (new)
 - bilinearfilter_arm.h (new)
Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
2011-02-09 10:23:43 -05:00
Scott LaVarnway
13db80c282 Added early breakout for vp8_rd_pick_intra4x4mby_modes
Improved performance of good quality, speed 0 (3% average)
with no average quality loss.

Change-Id: Ica34473f99bd74260eaebde6b132185e09e3c09d
2011-02-08 16:50:43 -05:00
Johann
40dcae9c2e clarify *_offsets.asm differences
it's difficult to mux the *_offsets.c files because of header conflicts.
make three instead, name them consistently and partition the contents
to allow building them as required.

Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796
2011-02-08 16:35:43 -05:00
Yunqing Wang
58d2e70fc5 Fix link error in real-time mode
make vp8_mv_pred() and vp8_cal_sad() available in real-time mode.

Change-Id: I71dbae241b486ba943458dcbae552ec4a51689d3
2011-02-07 08:21:14 -05:00
Yunqing Wang
350ffe8dae Merge "Improve MV prediction in vp8_pick_inter_mode() for speed>3" 2011-02-04 10:10:15 -08:00
John Koleszar
63fc44dfa5 correct quantizer initialization
The encoder was not correctly catching transitions in the quantizer
deltas. If a delta_q was set, then the quantizer would be reinitialized
on every frame, but if they transitioned to 0, the quantizer would
not be reinitialized, leading to an encode-decode mismatch.

This bug was triggered by commit 999e155, which sets a Y2 delta Q
for very low base Q levels.

Change-Id: Ia6733464a55ee4ff2edbb82c0873980d345446f5
2011-02-04 11:37:47 -05:00
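Illustrative sketch for 63fc44dfa5 (not the committed code): one way to catch every transition of a quantizer delta, including a transition back to 0, is to compare the requested value with the value the tables were last built for. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical state: the delta the quantizer tables were last built with. */
typedef struct {
    int active_delta_q;
} QuantState;

/* Behaviour described as buggy: rebuild only while the delta is non-zero,
 * so a transition back to 0 leaves stale tables in place. */
int needs_reinit_old(const QuantState *s, int new_delta_q)
{
    (void)s;
    return new_delta_q != 0;
}

/* Transition-aware check: rebuild whenever the delta changes at all,
 * then remember the new value. */
int needs_reinit_fixed(QuantState *s, int new_delta_q)
{
    int changed;
    assert(s != 0);
    changed = (s->active_delta_q != new_delta_q);
    s->active_delta_q = new_delta_q;
    return changed;
}
```

With a delta sequence of 4, 4, 0, the transition-aware check asks for a rebuild on the first and third frames only, whereas the old check rebuilds on the first two frames and misses the change to 0.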
John Koleszar
c0a9cbebe1 Merge "Delay auto key frame insertion in realtime configuration" 2011-02-04 05:16:15 -08:00
Scott LaVarnway
4aa12b6c5f Merge "Zero out block mv when an intra mode is selected" 2011-02-03 07:16:52 -08:00
Yunqing Wang
a870315629 Merge "Improved encoder threading" 2011-02-03 05:44:57 -08:00
Attila Nagy
e5904f2d5e Delay auto key frame insertion in realtime configuration
When auto keyframe insertion is enabled and conditions are right (scene change)
the encoder can decide to insert a key frame and does a re-encoding. This can
introduce extra latency. In RT mode we do not do the re-encoding of the current
frame but force the next frame to key frame.

Change-Id: I15c175fa845ac4c1a1f18bea3676e154669522a7
2011-02-02 13:54:40 +02:00
Scott LaVarnway
07a7c08aef Zero out block mv when an intra mode is selected
instead of each time mode is tested.

Change-Id: Ief0f5586dafde54cc14d348dcecdacb182e7c1d5
2011-02-01 12:55:51 -05:00
Scott LaVarnway
a5ecaca6a7 Removed unnecessary B_MODE_INFO memset.
Change-Id: I2bcef6a8e47f88542861fd1356631ca934e2a0e7
2011-02-01 11:35:08 -05:00
Scott LaVarnway
b18df82e1d Moved rd calculation into vp8_pick_intra4x4mby_modes
Then removed unnecessary code.

Change-Id: I142658815d843c9396b07881dbdd8d387c43c90e
2011-02-01 11:26:04 -05:00
Scott LaVarnway
4e7e79f770 Removed intra_modes from vp8cx_encode_intra_macro_block
Restructured function in order to eliminate the prediction
modes save/restore.  Code cleanup also.

Change-Id: I816e3b910de64d0f0f0ddc2398805c63263191e8
2011-02-01 10:05:35 -05:00
Attila Nagy
385c2a76d1 Improved encoder threading
Reduce the number of sync points by letting each thread
continue immediately with a new MB row.
Better multicore scaling, improves performance by 5-20% on ARM multicore.

Change-Id: Ic97e4d1c4886a842c85dd3539a93cb217188ed1b
2011-02-01 12:17:58 +02:00
Scott LaVarnway
9e7fec216e Removed prediction_error accumulation
from vp8cx_encode_intra_macro_block.  prediction_error is used when
deciding if a frame should be a keyframe.  After reviewing this with
Yaowu, it was pointed out that vp8cx_encode_intra_macro_block
is only called for keyframes, so the accumulation is unnecessary.

Change-Id: Id79dc81b80d4f5d124f3a0dba1b923887e2e1ec8
2011-01-31 19:53:02 -05:00
Scott LaVarnway
317f0da91e Removed last_auto_filter_prediction_error
last_auto_filter_prediction_error is not really used.

Change-Id: Ic6e56c4076bbd250ef783ee1be46964c85f62864
2011-01-31 19:41:09 -05:00
Scott LaVarnway
4a15e55793 Possible bug in vp8cx_encode_intra_macro_block
vp8_pick_intra4x4mby_modes uses the passed in distortion
for an early breakout.  The best distortion was never saved
and the distortion for TM_PRED was always used.

Change-Id: Idbaf73027408a4bba26601713725191a5d7b325e
2011-01-31 17:43:18 -05:00
Scott LaVarnway
60fde4d342 Merge "Performance improvement of first pass" 2011-01-31 13:02:23 -08:00
Yaowu Xu
6d19d40718 Merge "change the threshold of DC check for encode breakout" 2011-01-31 11:00:46 -08:00
Adrian Grange
408a8adc15 Merge "Changed condition for using RD in Intra Mode" 2011-01-31 02:18:40 -08:00
Yaowu Xu
8f279596cb change the threshold of DC check for encode breakout
Previously, the DC check is to make sure there is no code-able
DC shift for quantizer Q0, which has been verified rather
conservative. This commit changes the criteria to have two
components, DC and AC, to address the conservativeness. First,
it checks if all AC energy is enough to contribute a single
non-zero quantized AC coefficient. Second, for DC, the decision
to skip further considers two possible scenarios: 1. There is
no code-able 2nd order DC coefficient at all; 2. The residue is
relatively flat, but the uniform DC change is very small, i.e.
less than 1/2 gray level per pixel.

Compared to the previous criteria, the new criteria are about 10%
to 15% faster in encoding time with a very small quality loss.
(threshold ~1000 and quality range 33db-45db)

It should be noted that this commit enables "automatic" static
threshold for encodebreakout if a non-zero small value is passed
in to encoder.

Change-Id: I0f77719a1ac2c2dfddbd950d84920df374515ce3
2011-01-28 09:43:23 -08:00
Johann
f3cb9ae459 Merge "Adds "armvX-none-rvct" targets" 2011-01-28 09:03:58 -08:00
Yunqing Wang
7cbe684ef5 Improve MV prediction in vp8_pick_inter_mode() for speed>3
Applied same method used in vp8_rd_pick_inter_mode() to improve
the accuracy of MV prediction.

Change-Id: Ia50ae26208b18482695601f32febd99fe89fbc17
2011-01-28 10:00:20 -05:00
Adrian Grange
e9f513d74a Changed condition for using RD in Intra Mode
The condition for using RD when selecting the intra coding mode
for a MB is that the RD flag is set AND we're not in real-time
mode.

Previously the code used RD if either the RD flag was set OR
we were not using real-time mode.

Change-Id: Ic711151298468a3f99babad39ba8375f66d55a08
2011-01-28 14:47:36 +00:00
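Illustrative sketch for e9f513d74a (not the committed code): the change amounts to replacing an OR with an AND in the predicate that selects RD-based intra mode decision. The function and parameter names are hypothetical.

```c
#include <assert.h>

/* Previous condition as described: RD is used if the RD flag is set
 * OR the encoder is not in real-time mode. */
int use_rd_old(int rd_flag, int realtime)
{
    assert(rd_flag == 0 || rd_flag == 1);
    assert(realtime == 0 || realtime == 1);
    return rd_flag || !realtime;
}

/* New condition as described: RD is used only if the RD flag is set
 * AND the encoder is not in real-time mode. */
int use_rd_new(int rd_flag, int realtime)
{
    assert(rd_flag == 0 || rd_flag == 1);
    assert(realtime == 0 || realtime == 1);
    return rd_flag && !realtime;
}
```

The two predicates differ in exactly two cases: RD flag set in real-time mode, and RD flag clear outside real-time mode. In both, the old condition selects RD and the new one does not.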
Paul Wilkins
dcb23e2aaa Inconsistent distortion metric in vp8_rd_pick_intra_mbuv_mode
This function was using a variance metric compared to an SSE metric in
other places (eg. vp8_rd_inter_uv)

Change-Id: I9109fcc5a13bca9db1d7ead500fe14999ab233eb
2011-01-28 13:13:30 +00:00
Tero Rintaluoma
11a222f5d9 Adds "armvX-none-rvct" targets
Adds following targets to configure script to support RVCT compilation
without operating system support (for Profiler or bare metal images).
 - armv5te-none-rvct
 - armv6-none-rvct
 - armv7-none-rvct

To strip OS specific parts from the code "os_support"-config was added
to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS
specific parts such as OS specific includes and function calls for
timers and threads etc. This was done to enable RVCT compilation for
profiling purposes or running the image on bare metal target with
Lauterbach.

Removed separate AREA directives for READONLY data in armv6 and neon
assembly files to fix the RVCT compilation. Otherwise
"ldr <reg>, =label" syntax would have been needed to prevent linker
errors. This syntax is not supported by older gnu assemblers.

Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43
2011-01-28 12:47:39 +02:00
Johann
73207a1d8b warning: pointer targets differ in signedness
vp8/encoder/rdopt.c:728: warning: pointer targets in passing argument 3
of 'macro_block_yrd' differ in signedness
vp8/encoder/rdopt.c:541: note: expected 'int *' but argument is of type
'unsigned int *'

distortion is signed when calling macro_block_yrd in both other cases,
as well as for RDCOST

Change-Id: I5e22358b7da76a116f498793253aac8099cb3461
2011-01-27 11:53:26 -05:00
Johann
27000ed6d9 clean up implicit declaration warnings for neon
Change-Id: I6ca2d89f355839c4c770773c09fc69dcea7c1406
warning: implicit declaration of function
  'vp8_variance_halfpixvar16x16_[h|v|hv]_neon'
  'vp8_sub_pixel_variance16x16_neon_func'
2011-01-27 11:31:59 -05:00
Scott LaVarnway
8a5c255b3d Merge "Removed unused members from VP8_COMP" 2011-01-27 08:12:22 -08:00
Yunqing Wang
bb30ffc4dc Merge "Remove copies of same functions" 2011-01-27 08:11:26 -08:00
Yunqing Wang
3ee4e1e79f Merge "Refine motion vector prediction for NEWMV mode" 2011-01-27 08:10:53 -08:00
Scott LaVarnway
3c18a2bb2e Performance improvement of first pass
Improved the performance of the first pass only
(~6% on 720p test clip) by making use of LUT instead of the
float calculations.  Might try a SIMD version later.
Also started to make use of int_mv instead of
MV.

Change-Id: If2a217c7d6b59cd2c25c5553e0ca7e0502403af8
2011-01-26 16:42:56 -05:00
Yunqing Wang
cac54404b9 Remove copies of same functions
Reduce the code size.

Change-Id: I2e1998557a3c8776e262c442fd758c25e17aff7a
2011-01-26 15:37:00 -05:00
Scott LaVarnway
c4887da39c Removed unused members from VP8_COMP
Change-Id: I8f3f2642b02975fbdb14982984a29821f80d30d3
2011-01-26 15:07:17 -05:00
Paul Wilkins
35bb74a6bd Rationalize vp8_rd_pick_intra16x16mby_mode()
Use the function macro_block_yrd() to calculate error and distortion
in keeping with what is done for inter frames.

The old code was using a variance metric for one case and an
SSE function for measuring distortion in the other case.

The function vp8_encode_intra16x16mbyrd() is no longer used.

Change-Id: Ic228cb00a78ff637f4365b43f58fbe5a9273d36f
2011-01-26 18:46:34 +00:00
Paul Wilkins
e8e09d33df Merge "Correction to buffer update for non-viewable frames." 2011-01-26 09:33:48 -08:00
Yaowu Xu
82266a1ac9 Merge "cap the best quantizer for 2nd order DC" 2011-01-26 09:27:11 -08:00
Paul Wilkins
a3f71ccff6 Correction to buffer update for non-viewable frames.
The code previously tested cpi->common.refresh_alt_ref_frame
but there are situations where this flag may be set for viewable frames.

The correct test should be !cm->show_frame.

Change-Id: Ia1a600622992a4a68fe1d38ac23bf6b34b133688
2011-01-26 12:52:31 +00:00
Paul Wilkins
2caa36aa4f Merge "Fix for incorrect variable declaration." 2011-01-26 01:53:53 -08:00
Yaowu Xu
999e155f55 cap the best quantizer for 2nd order DC
This commit also removes artificial RDMULT cap for low quantizers.
The intention is to address some abnormal behavior of mode selections
at the low quantizer end, where many macroblocks were coded with
SPLITMV with all partitions using same motion vector including (0,0).
This change improves the compression quality substantially for high
quality encodings in both PSNR and SSIM terms. Overall effect on
mid/low rate range is also positive for all metrics, but smaller
in magnitude.

Change-Id: I864b29c4bd9ff610d2545fa94a19cc7e80c02667
2011-01-25 22:26:18 -08:00
Fritz Koenig
53d8e9dc97 Fix for incorrect variable declaration.
Commit 336aa0b7da incorrectly
declared current_pos as an int, when it should have been
a FIRSTPASS_STATS pointer.

Change-Id: I0a51c7a86ebba8546c95dd5d9d1c1143d4613e40
2011-01-25 15:41:41 -08:00
Johann
907e98fbb5 Merge "update sse2 regular quantizer" 2011-01-25 13:40:28 -08:00
Johann
58f19cc697 Merge "move new neon subpixel function" 2011-01-25 13:09:05 -08:00
Yunqing Wang
dcaaadd8ed Refine motion vector prediction for NEWMV mode
Adjust checking points in motion vector prediction to better cover
possible movements, and get a better prediction. Tests on test
clips showed a 0.1% improvement in SSIM, and no change in PSNR
and performance.

Change-Id: Ifdab05d35e10faea1445c61bb73debf888c9d2f8
2011-01-25 15:54:34 -05:00
Johann
af7d23c9b4 Merge "Fix issue 262, vp8cx_pack_tokens_into_partitions_armv5" 2011-01-25 12:49:52 -08:00
Johann
2168a94495 move new neon subpixel function
previously wasn't guarded with ifdef ARMV7, causing a link error with
ARMV6

Change-Id: I0526858be0b5f49b2bf11e9090180b2a6c48926d
2011-01-25 15:48:37 -05:00
Yunqing Wang
4e149bb447 Merge "Modify calling of NEON code in sub-pixel search" 2011-01-25 09:54:23 -08:00
Attila Nagy
3bf235a4c9 Fix issue 262, vp8cx_pack_tokens_into_partitions_armv5
http://code.google.com/p/webm/issues/detail?id=262
Function was assuming that partitions have equal amount of mb_rows,
which is not always true.

Change-Id: I59ed40117fd408392a85c633beeb5340ed2f4b25
2011-01-25 15:55:02 +02:00
Paul Wilkins
a69c18980f Merge "Incorrect bit allocation in forced KF groups." 2011-01-25 05:32:26 -08:00
Paul Wilkins
336aa0b7da Incorrect bit allocation in forced KF groups.
The old 2 pass code estimated error distribution when coding a
forced (by interval) key frame. The result of this was that in some
cases, when allocating bits at the GF group level within a KF
group there was either a glut of bits or starvation of bits at the end
of the KF group.

Added code to rescan and get the correct data once the position of
a forced key frame has been determined.

Change-Id: I0c811675ef3f9e4109d14bd049d7641682ffcf11
2011-01-25 12:29:06 +00:00
Scott LaVarnway
0ee525d6de Added vp8_update_zbin_extra
vp8cx_mb_init_quantizer was being called for every mode checked
in vp8_rd_pick_inter_mode.  zbin_extra is the only value that
really needs to be recalculated.  This calculation is disabled
when using the fast quantizer for mode selection.
This gave a small performance boost (~.5% to 1%).
Note: This needs to be verified with segmentation_enabled.

Change-Id: I62716a870b3c82b4a998bdf95130ff0b02106f1e
2011-01-24 11:00:56 -05:00
Yunqing Wang
d3e9409bb0 Merge "Modify sub-pixel filters to eliminate unnecessary calculations" 2011-01-21 11:07:17 -08:00
Yunqing Wang
0822a62f40 Modify sub-pixel filters to eliminate unnecessary calculations
In sub-pixel calculation, xoffset and yoffset mostly take some
specific values. Modified sub-pixel filter functions according to
these possible values to improve performance.

Change-Id: I83083570af8b00ff65093467914fbb97a4e9ea21
2011-01-21 13:59:27 -05:00
Paul Wilkins
0cdfef1e22 Modified static scene check.
Added code to scan ahead a few frames when we see what
we think is a static scene in the two pass GF loop to see if the
conditions persist.

Moved calculation of decay rate out into a function.

Change-Id: I6e9c67e01ec9f555144deafc8ae67ef25bffb449
2011-01-21 17:52:00 +00:00
Paul Wilkins
8064583d26 Further work to reduce pulsing.
These changes are specifically targeted at fade transitions to
static scenes. Here we want to place a GF/ARF immediately
after the fade and prevent an ARF just  before the fade.

Also some code lines and comment lines shortened to 80 chars
while I was there.

Change-Id: Iefdc09a4fa7b265048fc017246b73e138693950f
2011-01-20 18:01:20 +00:00
Adrian Grange
815e1e9fe4 Fixed use of motion percentage in KF/GF group calc
In both vp8_find_next_key_frame and define_gf_group,
motion_pct was initialised at the top of the loop before
next_frame stats had been read in.

This fix sets motion_pct after next_frame stats have
been read.

Change-Id: I8c0bebf372ef8aa97b97fd35b42973d1d831ee73
2011-01-20 13:13:33 +00:00
Paul Wilkins
e867516843 First pass loop bug.
Incorrect value loop_decay_rate used in GF loop.

The intent was to test the  cumulative value decay_accumulator.

Change-Id: I62928c63eb09f4f6936a45ebd1c23784d1c9681b
2011-01-19 15:50:22 +00:00
Yunqing Wang
ce6c954d2e Modify calling of NEON code in sub-pixel search
In vp8_find_best_sub_pixel_step_iteratively(), many times xoffset
and yoffset are specific values - (4,0) (0,4) and (4,4). Modified
code to call simplified NEON version at these specific offsets to
help with the performance.

Change-Id: Iaf896a0f7aae4697bd36a49e182525dd1ef1ab4d
2011-01-18 14:19:52 -05:00
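Illustrative sketch for ce6c954d2e (not the committed code): dispatching on the three offset pairs named in the message could look like the selector below. The enum, the function name, and the assumption that offsets lie in 0..7 are hypothetical; the offset pairs (4,0), (0,4) and (4,4) come from the commit message.

```c
#include <assert.h>

/* Hypothetical labels for the kernel that would be called. */
typedef enum {
    KERNEL_GENERIC,     /* general sub-pixel variance path */
    KERNEL_HALFPIX_H,   /* simplified path for (xoffset, yoffset) = (4, 0) */
    KERNEL_HALFPIX_V,   /* simplified path for (0, 4) */
    KERNEL_HALFPIX_HV   /* simplified path for (4, 4) */
} SubpixKernel;

/* Pick the simplified kernel when the offsets match one of the three
 * special cases; otherwise fall back to the generic kernel. */
SubpixKernel pick_subpix_kernel(int xoffset, int yoffset)
{
    /* Assumption for this sketch: offsets are in the range 0..7. */
    assert(xoffset >= 0 && xoffset <= 7);
    assert(yoffset >= 0 && yoffset <= 7);

    if (xoffset == 4 && yoffset == 0) return KERNEL_HALFPIX_H;
    if (xoffset == 0 && yoffset == 4) return KERNEL_HALFPIX_V;
    if (xoffset == 4 && yoffset == 4) return KERNEL_HALFPIX_HV;
    return KERNEL_GENERIC;
}
```

Any other offset pair, such as (2,6), falls through to the generic path, so the special cases only add a few comparisons in front of the existing call.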
Jim Bankoski
edcf74c6ad vp8e -removed undefined max call
Change-Id: I42a86b0488f44115f09551fc5ad6d711fd470f0d
2011-01-18 11:21:32 -05:00
Paul Wilkins
d6d5d43708 Merge "Further CQ, Key frame and ARF changes" 2011-01-18 08:04:46 -08:00
Paul Wilkins
57136a268a Further CQ, Key frame and ARF changes
This code fixes a bug in the calculation of
the minimum Q for alt ref frames.

It also allows an extended gf/arf interval for sections
of clips that are completely static (or nearly so).

Change-Id: I1a21aaa16d4f0578e5f99b13bebd78d59403c73b
2011-01-18 15:19:05 +00:00
Attila Nagy
cb791aaa2f Fix encoder real-time only configuration.
Remove allocation/deallocation of stats storage.
Remove full search functions in machine specific encoder inits.
Remove last pass validation in  validate_config.

Change-Id: I7f29be69273981a4fef6e80ecdb6217c68cbad4e
2011-01-18 08:19:21 -05:00
Paul Wilkins
339c512762 Fix CQ range and experimental KF sizing changes.
The CQ level was not using the q_trans[] array to convert
to a 0-127 range as per min and maxq

Experimental change to try and match the reconstruction
error for forced key frames approximately to that of the
previous frame by means of the recode loop. Though this
may cause extra recodes and the recode behavior has not
been optimized, it can only happen on forced key frames.

Change-Id: I1f7e42d526f1b1cb556dd461eff1a692bd1b5b2f
2011-01-17 17:24:45 +00:00
Johann
15f9bea73b update sse2 regular quantizer
about ~5% gain on 32bit. disabled for 64bit

unset executable bit on ssse3 version (cosmetic)

Change-Id: I1a5860839eb294ce4261f819caea2dcfa78e57ca
2011-01-14 14:26:10 -05:00
Paul Wilkins
a1a4d23797 Merge "KF/GF Pulsing" 2011-01-14 09:20:37 -08:00
Paul Wilkins
3aafb47729 Merge "Testing of modes with Alt Ref frame" 2011-01-14 07:26:37 -08:00
Paul Wilkins
8f711db4e8 Merge "Experimental change to help with ARNR problem." 2011-01-14 07:26:01 -08:00
Paul Wilkins
415371c9d9 Testing of modes with Alt Ref frame
Previously when a frame was being overlaid on a previously coded
alt ref frame we only checked the alt ref 0,0 mode. Where there is
a possibility that the alt ref buffer is a filtered frame we should allow
the other prediction modes as normal or at the least allow use of
the last frame buffer.

Change-Id: I4d6227223d125c96b4f3066ec6ec9484fee7768c
2011-01-14 15:20:45 +00:00
Adrian Grange
2c1b06e672 ARNR filter pointer update bug fix
In cases where the frame width is not a multiple of 16 the
ARNR filter would go wrong.

In vp8_temporal_filter_iterate_c when updating pointers
at the end of a row of MBs,  the image size was
incorrectly used rather than using Num_MBs_In_Row
times 16 (Y) or 8 (U,V).

This worked when width is multiple of 16 but failed
otherwise.

Change-Id: I008919062715bd3d17c7aa2562ab58d1cb37053a
2011-01-14 15:04:39 +00:00
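Illustrative sketch for 2c1b06e672 (not the committed code): after walking one row of macroblocks, a luma pointer has advanced 16 pixels per MB, so the step to the next MB row has to subtract 16 times the number of MBs in the row, not the image width. The function names are hypothetical, and the MB count is assumed to be the width rounded up to a multiple of 16.

```c
#include <assert.h>

/* Assumption: number of 16x16 macroblock columns covering the width. */
int mb_cols_for_width(int width)
{
    assert(width > 0);
    return (width + 15) >> 4;
}

/* Offset that moves a luma pointer from the end of one MB row to the
 * start of the next, given it advanced 16 pixels per macroblock. */
int y_row_advance(int stride, int mb_cols)
{
    return 16 * stride - 16 * mb_cols;
}

/* Variant described as buggy: subtracts the image width instead. */
int y_row_advance_buggy(int stride, int width)
{
    return 16 * stride - width;
}
```

For a width of 176 (a multiple of 16) both versions give the same offset. For a width of 180 there are 12 MB columns covering 192 pixels, so the width-based version lands 12 pixels too far to the right on every row. The chroma planes follow the same pattern with 8 in place of 16.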
Paul Wilkins
72e22b0bb8 Experimental change to help with ARNR problem.
Allow use of other reference frames for the ARF overlay frame
when ARNR filtering is enabled

Change-Id: Icd6a9fb38977a88fbe7cc9b9c18198eb454c0273
2011-01-14 12:07:12 +00:00
Paul Wilkins
c8338ebf7a KF/GF Pulsing
This change is designed to try and reduce pulsing effects when moving
with a complex transition like a fade, into an easy or static section in
an otherwise difficult clip in CQ mode.

The active CQ level is relaxed down to the user entered level for frames that
are generating less than the passed in minimum bandwidth.

Change-Id: Id6d8b551daad4f489c087bd742bc95418a95f3f0
2011-01-14 11:37:26 +00:00
Scott LaVarnway
b082790c7d Merge "Moved ref frame calculations" 2011-01-13 06:59:28 -08:00
Paul Wilkins
eda7d538bf One pass rate control correction.
Fixed discrepancy cpi->ni_frames vs cm->current_video_frame > 150.

Make one pass path explicit.

There is still scope for some odd behaviour around the transition
point at cpi->ni_frames > 150.

Change-Id: Icdee130fe6e2a832206d30e45bf65963edd7a74d
2011-01-13 12:51:41 +00:00
Paul Wilkins
55acda98f7 Limit key frame quantizer for forced key frames.
Where a key frame occurs because of a minimum interval
selected by the user, then these forced key frames ideally need
to be more closely matched in quality to the surrounding frame.

Change-Id: Ia55b1f047e77dc7fbd78379c45869554f25b3df7
2011-01-12 17:43:59 +00:00
Scott LaVarnway
96fd758ea9 Moved ref frame calculations
Moved ref frame calculations to outside of the
mode_index loop.

Change-Id: I06103fc7e8af88b54b84443acf6691d29b1272ac
2011-01-11 15:00:00 -05:00
Yunqing Wang
6ff2b0883a Merge "Add no_skip_block4x4_search flag in SPLITMV mode" 2011-01-11 08:34:24 -08:00
Johann
e88d7ab245 Merge "use unaligned load" 2011-01-11 08:25:22 -08:00
Johann
f50f2fd2a7 use unaligned load
source buffer is not guaranteed to be aligned for odd size buffers

Change-Id: Id0b1fd40ba3bd6c994bcfada788feccd2b53c5a9
2011-01-11 11:22:29 -05:00
Yunqing Wang
1546e6a8c9 Add no_skip_block4x4_search flag in SPLITMV mode
Add a flag to always enable block4x4 search for speed=0 (good
quality) to guarantee no quality loss for speed0.

Change-Id: Ie04bbc25f7e6a33a7bfa30e05775d33148731c81
2011-01-11 09:50:13 -05:00
Henrik Lundin
48c28fc42c Remove unused local variables
Removing unused local variables causing compiler warnings in
Visual Studio.

Change-Id: I0e2096303be1fdbc01428a6e57cca9796bb32c8a
2011-01-11 15:22:19 +01:00
Yunqing Wang
3675b2291c Fix bug in motion search
The maximum possible MV in 1/8 pel units is (1<<11), which could
take mvcost out of its range, which is 1023. Changing the maximum
possible MV in 1/8 pel units to (1<<11)-8 fixes this problem.

Change-Id: I5788ed1de773f66658c14f225fb4ab5b1679b74b
2011-01-10 16:16:59 -05:00
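Illustrative sketch for 3675b2291c (not the committed code): the arithmetic can be checked under the assumption, not stated in the message, that the cost table is indexed by the MV component shifted right by one. The limit 1023 and the two candidate maxima come from the commit message; all names are hypothetical.

```c
#include <assert.h>

/* Range limit quoted in the commit message. */
enum { MVCOST_MAX_INDEX = 1023 };

/* Assumption for this sketch: the table index is the 1/8-pel value >> 1. */
int mvcost_index(int mv_eighth_pel)
{
    assert(mv_eighth_pel >= 0); /* sketch covers non-negative values only */
    return mv_eighth_pel >> 1;
}

/* Returns 1 when the index stays inside the table. */
int mvcost_index_in_range(int mv_eighth_pel)
{
    return mvcost_index(mv_eighth_pel) <= MVCOST_MAX_INDEX;
}
```

Under that assumption, (1<<11) maps to index 1024, one past the limit, while (1<<11)-8 maps to 1020 and stays in range.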
Paul Wilkins
cf7c4732e5 Two Pass VBR change
Further experiment with restriction of the Q range.

This uses the average non KF/GF/ARF quantizer,  instead
of just relying on the initial value. It is not such a strong constraint
but there may be a reduced risk of rate misses.

Change-Id: I424fe782a37a2f4e18c70805e240db55bfaa25ec
2011-01-10 16:41:53 +00:00
Paul Wilkins
405499d835 Revert BASE_ERRPERMB
Constant value reverted pending more tests
on different video formats.

Change-Id: I07d11a0e0185e60724698c835416caf2e0774e61
2011-01-10 16:02:51 +00:00
Paul Wilkins
c28b10adeb Merge "CQ Mode" 2011-01-07 11:05:56 -08:00
Paul Wilkins
e0846c9c8c CQ Mode
The merge includes hooks for CQ mode and other code
changes merged from the test branch.

CQ mode attempts to maintain a more stable quantizer within a clip
whilst also trying to adhere to a guideline maximum bitrate.

The existing target data rate parameter is used to specify the
guideline maximum bitrate.

A new parameter allows the user to specify a target CQ level.

For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the
user specified value (0-63). However, in some cases the encoder may
choose to impose a target CQ that is above that specified by the user,
if it estimates that consistent use of the target value is not compatible
with guideline maximum bitrate.

Change-Id: I2221f9eecae8cc3c431d36caf83503941b25e4c1
2011-01-07 18:46:29 +00:00
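Illustrative sketch for e0846c9c8c (not the committed code): the floor behaviour described for normal frames can be read as taking the larger of the rate-control quantizer and the user's CQ level. The function name is hypothetical, and both values are assumed to be on the same 0-63 user scale.

```c
#include <assert.h>

/* For normal (non kf/gf/arf) frames: never go below the user's CQ level,
 * but allow rate control to pick a higher quantizer when needed. */
int active_q_for_normal_frame(int rate_control_q, int cq_level)
{
    assert(cq_level >= 0 && cq_level <= 63);
    assert(rate_control_q >= 0 && rate_control_q <= 63);
    return rate_control_q < cq_level ? cq_level : rate_control_q;
}
```

With a CQ level of 30, a rate-control choice of 20 is raised to 30, while a choice of 40 is kept because the guideline maximum bitrate takes priority.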
Paul Wilkins
ba976eaa9b Merge "Limit Q variability in two pass." 2011-01-07 09:32:29 -08:00
Paul Wilkins
3af3593c8e Limit Q variability in two pass.
In two pass encoding each frame is given an active
Q range to work with. This change limits how much this
Q range can be altered over time from the initial estimate
made for the clip as a whole.

There is some danger this could lead to overshoot or undershoot
in some corner cases but it helps considerably in regard to
clips where either there is a glut or famine of bits in some sections,
particularly near the end of a clip.

Change-Id: I34fcd1af31d2ee3d5444f93e334645254043026e
2011-01-07 17:23:50 +00:00
Paul Wilkins
f7e2f1fedf Merge "Disable some features for first pass." 2011-01-07 08:34:27 -08:00
Scott LaVarnway
dd314351e6 Merge "Removed cpi->target_bits_per_mb" 2011-01-07 06:46:45 -08:00
Scott LaVarnway
6dbdfe3422 Removed cpi->target_bits_per_mb
cpi->target_bits_per_mb is currently not being used,
so delete it.  Also removed other unused code in rdopt.c.

Change-Id: I98449f9030bcd2f15451d9b7a3b9b93dd1409923
2011-01-07 09:41:13 -05:00
Johann
8b0cf5f79d x86 sse2 temporal_filter_apply
count can be reduced to short because the max number of filtered frames
is set to 15. the max value for any frame is 32 (modifier = 16,
filter_weight = 2). 15*32 = 480 which requires 9 bits

this function goes from about 7000 us / 1000 iterations for the C code
to < 275 us / 1000 iterations for sse2 for block_size = 16 and from
about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8

Change-Id: I64a32607f58a2d33c39286f468b04ccd457d9e6e
2011-01-06 14:00:30 -05:00
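Illustrative check for 8b0cf5f79d (not the committed code): the bound quoted in the message can be reproduced directly. The function names are hypothetical; the inputs 15, 16 and 2 come from the commit message.

```c
#include <assert.h>
#include <limits.h>

/* Largest value the per-pixel count can reach across all filtered frames. */
int max_filter_count(int max_frames, int max_modifier, int filter_weight)
{
    assert(max_frames > 0 && max_modifier > 0 && filter_weight > 0);
    return max_frames * max_modifier * filter_weight;
}

/* Number of bits needed to represent an unsigned value. */
int bits_needed(unsigned int value)
{
    int bits = 0;
    while (value != 0) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* Returns 1 when the value fits in a signed short. */
int fits_in_short(int value)
{
    return value >= SHRT_MIN && value <= SHRT_MAX;
}
```

With 15 frames, a modifier of 16 and a filter weight of 2, the maximum is 480, which needs 9 bits and fits comfortably in a short.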
Paul Wilkins
431dac08d1 Disable some features for first pass.
The following features don't make sense for the first
pass in its current form and have a significant impact on its
speed (up to 50%).

Slow quantizer, slow dct and trellis optimization.

Change-Id: Id9943f6765ffbd71fc0084ec7dfbc9d376fd6fcd
2011-01-06 17:10:07 +00:00
Paul Wilkins
b095d9df3c Adjustment to boost calculation in two pass.
Calculate a minimum intra value to be used in determining the
IIratio scores used in two pass, second pass.

This is to make sure sections that are "low complexity" in the
intra domain are still boosted appropriately for KF/GF/ARF.

For now I have commented out the Q based adjustment of
KF boost.

Change-Id: I15deb09c5bd9b53180a2ddd3e5f575b2aba244b3
2011-01-04 18:11:28 +00:00
Scott LaVarnway
de4e8185e9 Fixed encoder crash when multi-threading is enabled.
Happens in real-time mode.  Will happen in good quality, speed 1.

Change-Id: I3e5b68827b1a5798d0431b088a709256d1ce2c95
2010-12-29 16:41:22 -05:00
Yunqing Wang
a864678cdb Always update last_frame_type
Scott pointed out that last_frame_type only gets updated while
loopfilter exists. Since last_frame_type is also needed in
motion search now, it needs to be updated every frame.

Change-Id: I9203532fd67361588d4024628d9ddb8e391ad912
2010-12-29 10:28:35 -05:00
Scott LaVarnway
3fb4abf3d1 Merge "Use the fast quantizer for inter mode selection" 2010-12-28 11:56:11 -08:00
Scott LaVarnway
516ea8460b Use the fast quantizer for inter mode selection
Use the fast quantizer for inter mode selection and the
regular quantizer for the rest of the encode for good quality,
speed 1.  Both performance and quality were improved.  The
quality gains will make up for the quality loss mentioned in
I9dc089007ca08129fb6c11fe7692777ebb8647b0.

Change-Id: Ia90bc9cf326a7c65d60d31fa32f6465ab6984d21
2010-12-28 14:51:46 -05:00
Yunqing Wang
bf53ec492d Adjust MV borders for SPLITMV mode
Add limits to avoid MV going out of range.

Change-Id: I8a5deb40bf393488d29f694b5a56804d578e68b5
2010-12-28 13:23:07 -05:00
Yunqing Wang
e463b95b4e Merge "Modify motion estimation for SPLITMV mode" 2010-12-28 08:12:26 -08:00
Yunqing Wang
a5a8d92976 Modify motion estimation for SPLITMV mode
1. Search for block8x16/block16x8 uses block8x8's search results.
2. Check block4x4 only if block8x8 is chosen. (This hurts quality,
   which will be improved in another check-in.)
3. In block4x4 search, the previous block's result is used as
   MV predictor for next block.

This change improves performance.

Change-Id: I9dc089007ca08129fb6c11fe7692777ebb8647b0
2010-12-28 10:34:42 -05:00
Yaowu Xu
0f5264b584 adjusted sad_per_bit to correlate with quantizer
Re-calibrated sad_per_bit16 and sad_per_bit4 tables to correlate
linearly with quantizer values. These two variables are used in
motion search for costing motion vectors. This change has a small
positive effect on compression.

Change-Id: Ic9b5ea6fb8d5078ef663ba4899db019cc51f4166
2010-12-23 22:59:38 -08:00
Johann
20b855c33e improve integer version of filter
the lookup table is based on floating point calculations (see source)

by moving the *3 before the downshift and adding the rounding bit, the
delta (LUT - integer) goes from:
______________________________________
__ 1__ 1______________________________
__ 1__ 1______________________________
____ 1______ 1________________________
____ 1 2__ 2 1________________________
______ 1 1 2__ 2__ 2__ 2 1 1__________
________ 1 1 2 2__ 1 2 3 1 2__ 2__ 2__
to:
__-1__-1______________________________
______________________________________
____-1______-1________________________
______________________________________
________-1______________-1____________
______________________________________

it's important to be able to use the integer version because the LUT
more or less precludes SIMD optimizations

Change-Id: I45a81127dc7b72a06fba951649135d9d918386c0
2010-12-22 11:33:59 -05:00
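
The reordering described in the commit above (multiply by 3 before the downshift, then add the rounding bit) can be illustrated with a minimal sketch. The function names and the `shift` parameter are hypothetical and chosen for illustration; this is not the encoder's actual filter code.

```c
/* Hypothetical illustration only; names are not taken from the source tree. */

/* Downshift first, then scale: the low bits of x are discarded before
 * the multiply, so the result is biased low. */
unsigned int demo_shift_then_scale(unsigned int x, unsigned int shift)
{
    return (x >> shift) * 3;
}

/* Scale first, add half of the divisor as a rounding bit, then downshift.
 * Requires shift >= 1 so that the rounding bit is well defined. */
unsigned int demo_scale_round_shift(unsigned int x, unsigned int shift)
{
    return (x * 3 + (1u << (shift - 1))) >> shift;
}
```

For x = 5 and shift = 2 the exact value is 3.75; the first form yields 3 and the second yields 4.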
Johann
4b6219cb33 temporal filter naming changes
be more consistent with the naming pattern, especially wrt rtcd

Change-Id: I3df50686a09f1dab0a9620b5adbb8a1577b40f2f
2010-12-22 11:32:15 -05:00
Johann
092b5bef37 abstract apply_temporal_filter
allow for optimized versions of apply_temporal_filter
(now vp8_apply_temporal_filter_c)

the function was previously declared as static and appears to have been
inlined. with this change, that's no longer possible. performance takes
a small hit.

the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because
of a circular dependency. for rtcd, temporal_filter.h holds the
definition for the rtcd table, so it needs to be included by onyx_int.h.
however, onyx_int.h holds the definition for VP8_COMP which is needed
for the function prototype. blah.

Change-Id: I499c055fdc652ac4659c21c5a55fe10ceb7e95e3
2010-12-22 11:31:54 -05:00
John Koleszar
b0da9b399d Add psnr/ssim tuning option
Add a new encoder control, VP8E_SET_TUNING, to allow the application
to inform the encoder that the material will benefit from certain
tuning. Expose this control as the --tune option to vpxenc. The args
helper is expanded to support enumerated arguments by name or value.

Two tunings are provided by this patch, PSNR (default) and SSIM.
Activity masking is made dependent on setting --tune=ssim, as the
current implementation hurts speed (10%) and PSNR (2.7% avg,
10% peak) too much for it to be a default yet.

Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257
2010-12-17 10:01:05 -05:00
Scott LaVarnway
64baa8df2e Changed segmentation check order
In SPLITMV, the 8x8 segment will be checked first.  If the 8x8 rd
is better than the best, we check the other segments.  Otherwise
bail.  Adjustments to the thresh_mult were necessary to make
up for the initial quality loss.
The performance improved by 20% (average) for good quality,
speed 0 and speed 1, while the overall quality remained the same.

Change-Id: I717aef401323c8a254fba3e9777d2a316c774cc3
2010-12-16 17:01:27 -05:00
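
The check order described in the commit above might look roughly like the following sketch. The enum, the `rd` cost array, and the function name are assumptions made for illustration; the real encoder computes these costs during the search rather than reading them from an array.

```c
/* Hypothetical segmentation labels; not the encoder's own identifiers. */
enum { DEMO_SEG_16x8, DEMO_SEG_8x16, DEMO_SEG_8x8, DEMO_SEG_4x4, DEMO_NUM_SEG };

/* Returns the index of the chosen segmentation, or -1 when the 8x8 cost
 * does not beat best_rd and the remaining segmentations are skipped. */
int demo_pick_segmentation(const int rd[DEMO_NUM_SEG], int best_rd)
{
    int best = DEMO_SEG_8x8;
    int i;

    /* Check 8x8 first; bail out early if it is not better than the best. */
    if (rd[DEMO_SEG_8x8] >= best_rd)
        return -1;
    best_rd = rd[DEMO_SEG_8x8];

    /* Only now evaluate the other segmentations. */
    for (i = 0; i < DEMO_NUM_SEG; i++) {
        if (i == DEMO_SEG_8x8)
            continue;
        if (rd[i] < best_rd) {
            best_rd = rd[i];
            best = i;
        }
    }
    return best;
}
```

The early exit can skip a segmentation that would have won, which is consistent with the commit's note that thresholds had to be adjusted to make up for an initial quality loss.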
Scott LaVarnway
81cdeb7117 Adjusted breakout RD for SPLITMV
vp8_rd_pick_best_mbsegmentation looks at y only.  The new
breakout does not include the frame cost, the prob_skip_false
cost, or the uv rate.  Performance improved by a few percent
and the quality remained the same.

Change-Id: I94ff013998ac51e8ecce7130870f7b6600758e15
2010-12-16 09:38:02 -05:00
Yunqing Wang
4fbd0227f5 Merge "Fix a bug in motion search code(2)" 2010-12-15 08:10:34 -08:00
Yunqing Wang
08706a3ea7 Fix a bug in motion search code(2)
This fix added MV range checks for NEWMV mode as suggested by Jim.
To reduce unnecessary MV range checks, I tried Yaowu's suggestion.
Update UMV borders in NEWMV mode to also cover MV range check.
Also, in this way, every MV that is valid gets checked in the diamond
search function.

Change-Id: I95a89ce0daf6f178c454448f13d4249f19b30f3a
2010-12-14 17:39:25 -05:00
Yaowu Xu
3ac73173a4 Merge "fix a bug that "optimize" flag is not set for sub-threads" 2010-12-14 13:32:04 -08:00
Yunqing Wang
23aa13d92c Merge "Fix a bug in motion search code" 2010-12-14 13:25:34 -08:00
Yunqing Wang
7fb0f86863 Fix a bug in motion search code
The MV's range is 256. Since the new motion search uses a different
starting MV than the center ref MV, an MV range check needs to
be done to avoid corruption.

Change-Id: I8ae0721d1bd203639e13891e2e54a2e87276f306
2010-12-14 13:59:38 -05:00
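
A range check of the kind described in the commit above could be sketched as follows. The struct, the function name, and the reading that the limit applies to each component's distance from the center reference MV are assumptions; the limit is passed in as a parameter rather than hard-coded.

```c
#include <stdlib.h>

/* Hypothetical motion vector type for illustration. */
typedef struct {
    int row;
    int col;
} demo_mv;

/* Returns 1 if both components of mv lie within max_range of center,
 * 0 otherwise. A search that starts away from center would call this
 * before accepting a candidate. */
int demo_mv_in_range(demo_mv mv, demo_mv center, int max_range)
{
    return abs(mv.row - center.row) <= max_range &&
           abs(mv.col - center.col) <= max_range;
}
```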
Yaowu Xu
64f3d91579 fix a bug that "optimize" flag is not set for sub-threads
The flag for quantization optimization was not properly propagated to
mb row encoding threads.

Change-Id: Ic561599c35acd94cd5698c9b314bccd596ac2deb
2010-12-14 10:12:21 -08:00
Johann
825adc464f shrink TOKENEXTRA and vp8_extra_bit_struct
Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes
original: b7b1e6fb
reverted: 41f4458a

Also drop unused field from vp8_extra_bit_struct

Update ARM ASM to deal with this change. In particular, Extra is signed
and needs to be sign-extended when loaded.

Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41
2010-12-14 10:32:50 -05:00
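
The sign-extension point in the commit above can be shown with a short C sketch: once a signed field is narrowed to 16 bits, loading it as unsigned yields a different value for negative inputs. The function names are hypothetical and the 16-bit width is an assumption for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Load a 16-bit field as signed: negative values are sign-extended
 * when widened to 32 bits. */
int32_t demo_load_signed16(const unsigned char *p)
{
    int16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Load the same bytes as unsigned: the value is zero-extended, so a
 * stored negative number comes back as a large positive one. */
int32_t demo_load_unsigned16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Storing -3 and reading it back gives -3 through the signed load and 65533 through the unsigned one.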
John Koleszar
41f4458a03 Revert "Reduce size of TOKENEXTRA struct"
This reverts commit b7b1e6fb55. Previous
fix is incomplete, breaks ARM. Itchy submit finger.

Change-Id: I939dc0d3bf4173cf951c1d152338ab6ea2184bb9
2010-12-13 17:12:51 -05:00
John Koleszar
3809d7bbd9 Merge "remove unused temporal preproc code" 2010-12-13 13:57:59 -08:00
John Koleszar
398aa81849 Merge "Reduce size of TOKENEXTRA struct" 2010-12-13 13:57:55 -08:00
John Koleszar
b1aa54ab26 remove unused temporal preproc code
This code is unused, as the current preproc implementation uses the
same spatial filter that postproc uses.

Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
2010-12-13 16:47:59 -05:00
John Koleszar
b7b1e6fb55 Reduce size of TOKENEXTRA struct
Change the size of structure elements to reduce memory utilization.
Removed the 'section' member entirely, as it is set but never read.

Change-Id: Iad043830392fb4168cb3cd6075fb0eb70c7f691c
2010-12-13 16:37:37 -05:00
Yaowu Xu
97a86c5b13 fix a bug in multithreaded encoding with active_map enabled
Added the initialization of the pointer to active map. Also added the
same logic for cyclic refresh in mbrow encoding threads.

Change-Id: Ic48d0849dc706b27fba72d07dcc498075725663d
2010-12-10 10:48:30 -08:00
Fritz Koenig
0ced701487 Merge "vp8 fast quantizer sse2 optimizations for eob." 2010-12-10 09:25:04 -08:00
Fritz Koenig
e0cf330cde vp8 fast quantizer sse2 optimizations for eob.
Changed the end of block computation to use pmaxw.  Removed
additional pushing and popping of registers that was not needed.

Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98
2010-12-09 15:00:30 -08:00
John Koleszar
cb9698951c fix uninitialized read in encode breakout
Change I3430820 performed an uninitialized read when
encode_breakout == 0, since the sum and sse wouldn't be set:

   if(x->encode_breakout)
       VARIANCE_INVOKE(..., get16x16var)(..., &sum, &sse);
   if (cpi->active_map_enabled && x->active_ptr[0] == 0) {
       ...
   } else if (sse < x->encode_breakout)

Change-Id: I915eb76d1227b4b6d1137a0dedf2c143860098a2
2010-12-09 16:05:26 -05:00
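
One way to avoid the uninitialized read described in the commit above is to give `sse` and `sum` defined starting values, so the later comparison is well defined even when the variance call is skipped. This is a sketch under that assumption, not necessarily the fix that was committed; `demo_get_var` is a stand-in, not the encoder's variance function.

```c
#include <limits.h>

/* Hypothetical stand-in for the variance call; it returns fixed values. */
static void demo_get_var(unsigned int *sse, int *sum)
{
    *sse = 40;
    *sum = 8;
}

/* Returns 1 when the breakout comparison passes, 0 otherwise. */
int demo_breakout_passes(unsigned int encode_breakout)
{
    /* Defined defaults: UINT_MAX can never be below the threshold. */
    unsigned int sse = UINT_MAX;
    int sum = 0;

    if (encode_breakout)
        demo_get_var(&sse, &sum);

    (void)sum; /* unused in this sketch */
    return sse < encode_breakout;
}
```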
Paul Wilkins
c63fc881e1 Correct q_low and q_high limits for the recode loop
Corrected the initial Q range limits for the recode loop
to reflect the current allowed range for the frame.

In experimental work on constrained quality this bug was
causing unnecessary recodes.

Change-Id: I7e256fbfa681293b0223fe21ec329933d76c229f
2010-12-09 15:02:04 +00:00
Yaowu Xu
160f3c7e9e Merge "vp8e - static threshold play" 2010-12-08 13:08:04 -08:00
Yaowu Xu
d88da98614 Merge "vp8e - remove unnecessary variance calc" 2010-12-08 09:19:22 -08:00
Jim Bankoski
718c19711a vp8e - static threshold play
Realized there is no need for new assembly code; the sum is already
calculated.

Change-Id: Ie2d94feb4b7c1f77c5359bca29b66228e41638c9
2010-12-07 16:07:23 -05:00
Scott LaVarnway
f661fa1f24 Merge "vp8_rd_pick_best_mbsegmentation code restructure" 2010-12-07 07:53:12 -08:00
Yaowu Xu
062980cc48 Merge "adjust RDMULT for UV plane in quantization RDO" 2010-12-06 22:04:45 -08:00
Yaowu Xu
7c03a1c308 adjust RDMULT for UV plane in quantization RDO
This patch adds a weighting factor on RDMULT for UV blocks. The change
has an overall gain of about 0.5% based on SSIM, and between 0.1% and
0.2% by PSNR numbers.

Change-Id: I97781b077ce3bb7e34241b03268491917e8d1d72
2010-12-06 20:53:59 -08:00
Yunqing Wang
9520f4b3cc Fix a memory leak problem in encoder
Deallocating the buffers before re-allocating them.

The fix passed James Berry's test program for memory
leak check.

Change-Id: I18c3cf665412c0e313a523e3d435106c03ca438d
2010-12-06 17:21:37 -05:00
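
The pattern named in the commit above (deallocate before re-allocating) can be sketched as follows. The context struct and function names are hypothetical and stand in for the encoder's own buffers.

```c
#include <stdlib.h>

/* Hypothetical context that owns a single buffer. */
typedef struct {
    unsigned char *buf;
    size_t size;
} demo_ctx;

/* Free the buffer and reset the fields so that a repeated call is safe. */
void demo_dealloc(demo_ctx *c)
{
    free(c->buf);
    c->buf = NULL;
    c->size = 0;
}

/* Release any previous buffer before allocating a new one, so calling
 * this twice does not leak the first allocation. Returns 0 on success. */
int demo_alloc(demo_ctx *c, size_t size)
{
    demo_dealloc(c);
    c->buf = calloc(1, size);
    if (!c->buf)
        return -1;
    c->size = size;
    return 0;
}
```

The context must start zero-initialized (for example `demo_ctx c = {NULL, 0};`) so that the first `demo_dealloc` call frees a null pointer, which is a no-op.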
Scott LaVarnway
2fa5d5a26d vp8_rd_pick_best_mbsegmentation code restructure
Moved the code from the segmentation loop into a function
which is now called for each segment. This will allow us
to change the segment order checking more easily.

Change-Id: I9510d26f0acae5a73043fcca8f1984b121d3e052
2010-12-06 16:42:52 -05:00