Commit Graph

948 Commits

Author SHA1 Message Date
Scott LaVarnway
ce328b855f Merge changes Ifb450710,I61c4a132
* changes:
  Eliminated reconintra_mt.c
  Eliminated vp8mt_build_intra_predictors_mbuv_s
2012-02-28 11:42:45 -08:00
Scott LaVarnway
bcba86e2e9 Eliminated reconintra_mt.c
Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead.  vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict.  Future commits will fix this
code duplication.

Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5
2012-02-28 14:12:30 -05:00
Yunqing Wang
b1bfd0ba87 Merge "Only do uv intra-mode evaluation when intra mode is checked" 2012-02-28 10:11:24 -08:00
Yunqing Wang
019384f2d3 Only do uv intra-mode evaluation when intra mode is checked
When we encode slide-show clips, for the majority of the time,
only ZEROMV mode is checked, and all other modes are skipped.
This change delayed uv intra-mode evaluation until intra mode is
actually checked. This gave big performance gain for slide-show
video encoding (2nd pass gain: 18% to 28%). But, this change
doesn't help other types of videos.

Also, zbin_mode_boost is adjusted in mode-checking loop, which
causes bitstream mismatch before/after this change when --best
or --good with --cpu-used=0 are used.

Change-Id: I582b3e69fd384039994360e870e6e059c36a64cc
2012-02-28 13:08:17 -05:00
James Berry
e2c6b05f9a bugfix: use oxcf width/height for reinit check
use oxcf instead of common in check to Reinit the
lookahead buffer if the frame size changes
prior behavior would cause assertion fail/crash

first observed in:
support changing resolution with vpx_codec_enc_config_set

Change-Id: Ib669916ca9b4f206d4cc3caab5107e49d39a36aa
2012-02-27 16:10:45 -05:00
Yunqing Wang
84be08b07f Fix skippable evaluation in mode decision
Yaowu fixed the skippable evaluation by correcting 2nd order
block's eob.

Change-Id: Id47930cbc74a90a046c0c0e324efb03477639ee0
2012-02-27 12:45:12 -05:00
Jim Bankoski
2089f26b08 Merge "Remove the frame rate factor for key frame size." 2012-02-23 08:38:44 -08:00
Marco Paniconi
507ee87e3e Remove the frame rate factor for key frame size.
When temporal layers is used (i.e., number_of_layers > 1),
we don't use the frame rate boost for setting the key
frame target size. The factor was forcing the target size to be
always at its minimum (2* per_frame_bandwidth) for low frame rates
(i.e., base layer frame rate).

Generally we should modify or remove this frame rate factor;
for now we turn if off for number_of_layers > 1.

Change-Id: Ia5acf406c9b2f634d30ac2473adc7b9bf2e7e6c6
2012-02-22 15:25:32 -08:00
Scott LaVarnway
f2bd11faa4 Eliminated vp8mt_build_intra_predictors_mbuv_s
Reworked the code to use vp8_build_intra_predictors_mbuv_s
instead.  This is WIP with the goal of eliminating all
functions in reconintra_mt.h

Change-Id: I61c4a132684544b24a38c4a90044597c6ec0dd52
2012-02-21 14:59:05 -05:00
John Koleszar
dadc9189ed Merge changes I0341554f,I64e110c8
* changes:
  Consolidate C version of token packing functions
  Multithreaded encoder, late sync loopfilter
2012-02-21 10:09:23 -08:00
Scott LaVarnway
f05feab7b9 Merge "Remove redundant init of segment_counts in vp8_encode_frame" 2012-02-21 09:51:02 -08:00
John Koleszar
02360dd2c2 Merge "Update encoder mb_skip_coeff and prob_skip_false calculation" 2012-02-21 09:48:26 -08:00
Yunqing Wang
f93b1e7be1 Merge "Fix incorrect use of uv eobs in intra modes" 2012-02-17 10:43:05 -08:00
Yunqing Wang
04b9e0d787 Fix incorrect use of uv eobs in intra modes
In vp8_rd_pick_inter_mode(), if total of eobs is zero, rate needs
to be adjusted since there are no non-zero coefficients for
transmission. The uv intra eobs calculated in
rd_pick_intra_mbuv_mode() need to be saved before they are
overwritten by inter-mode eobs.

Change-Id: I41dd04fba912e8122ef95793d4d98a251bc60e58
2012-02-17 09:15:08 -05:00
Attila Nagy
ce42e79abc Update encoder mb_skip_coeff and prob_skip_false calculation
mode_info_context->mbmi.mb_skip_coeff has to always reflect the
existence or not of coeffs for a certain MB. The loopfilter needs this
info.
mb_skip_coeff is either set by the vp8_tokenize_mb or has to be set to
1 when the MB is skipped by mode selection. This has to be done
regardless of the mb_no_coeff_skip value.

prob_skip_false is needed just when mb_no_coeff_skip is 1. No need to
keep count of both skip_false and skip_true as they are complementary
(skip_true+skip_false = total_mbs)

Change-Id: I3c74c9a0ee37bec10de7bb796e408f3e77006813
2012-02-17 14:27:40 +02:00
Attila Nagy
565d0e6feb Remove redundant init of segment_counts in vp8_encode_frame
segment_counts was zero init twice in the beginning of vp8_encode_frame.

Change-Id: Ibc29f6896dabd9aab1d0993f3941cf6876022e70
2012-02-17 09:51:24 +02:00
Johann
6b151d436d Clarify 'max_sad' usage
Depending on implementation the optimized SAD functions may return early
when the calculated SAD exceeds max_sad.

Change-Id: I05ce5b2d34e6d45fb3ec2a450aa99c4f3343bf3a
2012-02-16 15:17:44 -08:00
Attila Nagy
d02e74a073 Consolidate C version of token packing functions
Replace inner loops of pack_mb_row_tokens_c and
pack_tokens_into_partitions_c with a call to pack_tokens_c.

Change-Id: I0341554fb154a14a5dadb63f8fc78010724c2c33
2012-02-16 14:11:28 +02:00
Attila Nagy
78071b3b97 Multithreaded encoder, late sync loopfilter
Second shot at this...

Sync with loopfilter thread as late as possible, usually just at the
beginning of next frame encoding. This returns control to application
faster and allows a better multicore scaling.

When PSNR packets are generated the final filtered frame is needed
imediatly so we cannot delay the sync. Same has to be done when
internal frame is previewed.

Change-Id: I64e110c8b224dd967faefffd9c93dd8dbad4a5b5
2012-02-16 12:26:39 +02:00
John Koleszar
e6df50031e Merge "support changing resolution with vpx_codec_enc_config_set" 2012-02-10 16:18:00 -08:00
Johann
169823428f Missed some variance casts
Change-Id: I9fb510f9421fb3c317a8e32e3058cee977ddf9fa
2012-02-10 11:07:33 -08:00
Scott LaVarnway
768ae275dc x_motion_minq table reduction
Reduced by 4080 bytes.

Change-Id: I037b55bc9684bf4a54bce238be00e8c4db3f643e
2012-02-09 16:49:34 -05:00
Johann
fea3556e20 Fix variance overflow
In the variance calculations the difference is summed and later squared.
When the sum exceeds sqrt(2^31) the value is treated as a negative when
it is shifted which gives incorrect results.

To fix this we cast the result of the multiplication as unsigned.

The alternative fix is to shift sum down by 4 before multiplying.
However that will reduce precision.

For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
change).

PPC change is untested.

Change-Id: I1bad27ea0720067def6d71a6da5f789508cec265
2012-02-09 12:38:31 -08:00
John Koleszar
51acb01167 support changing resolution with vpx_codec_enc_config_set
Allow the application to change the frame size during encoding. This
is only supported when not using lagged compress.

Change-Id: I89b585d703d5fd728a9e3dedf997f1b595d0db0f
2012-02-07 17:09:40 -08:00
Yunqing Wang
a040eb37e4 Merge "Allow to skip highest-resolution encoding in multi-resolution encoder" 2012-02-06 13:58:11 -08:00
Yunqing Wang
fa1a9290e6 Allow to skip highest-resolution encoding in multi-resolution encoder
Sometimes, a user doesn't have enough bandwidth to send high-resolution
(i.e. HD) video even though the camera catches HD video. This change
allowed users to skip highest-resolution encoding by setting that level's
target bit rate to 0.

To test it, modify the following line in vp8_multi_resolution_encoder.c.
    unsigned int  target_bitrate[NUM_ENCODERS]={1400, 500, 100};
To skip the highest-resolution level, change it to
    unsigned int  target_bitrate[NUM_ENCODERS]={0, 500, 100};
To skip the first and second highest resolution levels, change it to
    unsigned int  target_bitrate[NUM_ENCODERS]={0, 0, 100};

This change also fixed a small problem in mapping, which slightly helped
quality and performance.

Change-Id: I977bae9a9fbfba85c8be4bd5af01539f2b84bc81
2012-02-03 13:39:05 -05:00
Scott LaVarnway
d8ebdcd89d Moved ref_frame_cost from MACROBLOCKD to MACROBLOCK
Change-Id: I05788522e9cde4322cfb12032483bdbf184bdf0b
2012-02-02 13:40:08 -05:00
Scott LaVarnway
11c706488b Removed frames_till_alt_ref_frame from MACROBLOCKD
Change-Id: Ieb05270ac332a4cc38ec4b7b995fc0150e0fffdf
2012-02-02 13:34:13 -05:00
Scott LaVarnway
e2000cc5ca Removed frames_since_golden from MACROBLOCKD
Change-Id: I10efa441d663fceb6bc97a3bfad518cd3d9a5128
2012-02-02 13:28:41 -05:00
Scott LaVarnway
749bc98618 BLOCKD structure cleanup
Removed redundancies.  All of the information can be
found in the MACROBLOCKD structure.

Change-Id: I7556392c6f67b43bef2a5e9932180a737466ef93
2012-01-31 11:02:39 -05:00
John Koleszar
8aae246089 RTCD: finalize removal of old RTCD system
This is the final commit in the series converting to the new RTCD
system. It removes the encoder csystemdependent files and the remaining
global function pointers that didn't conform to the old RTCD system.

Change-Id: I9649706f1bb89f0cbf431ab0e3e7552d37be4d8e
2012-01-30 12:10:48 -08:00
John Koleszar
109b69a706 RTCD: add arnr functions
This commit continues the process of converting to the new RTCD
system. It removes the last of the VP8_ENCODER_RTCD struct references.

Change-Id: I2a44f52d7cccf5177e1ca98a028ead570d045395
2012-01-30 12:10:48 -08:00
John Koleszar
0b0bc8d098 RTCD: add motion search functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Ia5828b7ecc80db55b21916704aa3d54cbb98f625
2012-01-30 12:10:47 -08:00
John Koleszar
be8af188d0 RTCD: add block subtraction functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Id8a287fdd4bd050ea4452e1582ad85520f3081be
2012-01-30 12:10:47 -08:00
John Koleszar
61311e6103 RTCD: add quantizer functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Iba9df4c03a508e51c37201c621be43523fae87d9
2012-01-30 12:10:46 -08:00
John Koleszar
510e0ab467 RTCD: add FDCT functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I3f9c07db65eb206f6363d21bdb80e871570da767
2012-01-30 12:10:42 -08:00
John Koleszar
83a91e789c RTCD: add variance functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: Ie5c1aa480637e98dc3918fb562ff45c37a66c538
2012-01-30 12:08:30 -08:00
John Koleszar
f103dcefaf RTCD: add subpixel functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe
2012-01-30 12:08:29 -08:00
John Koleszar
2a8f57f50d RTCD: add postproc functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: If54eb5cb5d1b0cac6c4c0633a9e99c93ca860ba2
2012-01-30 12:08:29 -08:00
John Koleszar
fdb61a4531 RTCD: add recon functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
2012-01-30 12:08:28 -08:00
John Koleszar
ab77b4e898 RTCD: add remaining IDCT functions
This commit continues the process of converting to the new RTCD
system.

Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80
2012-01-30 12:08:22 -08:00
John Koleszar
a910049aea New RTCD implementation
This is a proof of concept RTCD implementation to replace the current
system of nested includes, prototypes, INVOKE macros, etc. Currently
only the decoder specific functions are implemented in the new system.
Additional functions will be added in subsequent commits.

Overview:
  RTCD "functions" are implemented as either a global function pointer
  or a macro (when only one eligible specialization available).
  Functions which have RTCD specializations are listed using a simple
  DSL identifying the function's base name, its prototype, and the
  architecture extensions that specializations are available for.

Advantages over the old system:
  - No INVOKE macros. A call to an RTCD function looks like an ordinary
    function call.
  - No need to pass vtables around.
  - If there is only one eligible function to call, the function is
    called directly, rather than indirecting through a function pointer.
  - Supports the notion of "required" extensions, so in combination with
    the above, on x86_64 if the best function available is sse2 or lower
    it will be called directly, since all x86_64 platforms implement
    sse2.
  - Elides all references to functions which will never be called, which
    could reduce binary size. For example if sse2 is required and there
    are both mmx and sse2 implementations of a certain function, the
    code will have no link time references to the mmx code.
  - Significantly easier to add a new function, just one file to edit.

Disadvantages:
  - Requires global writable data (though this is not a new requirement)
  - 1 new generated source file.

Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55
2012-01-30 12:06:27 -08:00
John Koleszar
319f7c4d56 Merge changes I17e1a348,Iad710941
* changes:
  Correct clamping in use of vp8_find_near_mvs()
  Revert "Multithreaded encoder, late sync loopfilter"
2012-01-26 14:33:28 -08:00
John Koleszar
83cef816fd Correct clamping in use of vp8_find_near_mvs()
Commit e06c242ba introduced a change to call vp8_find_near_mvs() only
once instead of once per reference frame by observing that the only
effect that the frame had was on the bias applied to the motion
vector. By keeping track of the sign_bias value, the mv to use could
be flip-flopped by multiplying its components by -1.

This behavior was subtley wrong in the case when clamping was applied
to the motion vectors found by vp8_find_near_mvs(). A motion vector
could be in-bounds with one sign bias, but out of bounds after
inverting the sign, or vice versa. The clamping must match that done
by the decoder.

This change modifies vp8_find_near_mvs() to remove the clamping from
that function. The vp8_pick_inter_mode() and vp8_rd_pick_inter_mode()
functions instead track the correctly clamped values for both bias
values, switching between them by simple assignment. The common
clamping and inversion code is in vp8_find_near_mvs_bias()

Change-Id: I17e1a348d1643497eca0be232e2fbe2acf8478e1
2012-01-26 09:37:27 -08:00
John Koleszar
630d3b95e2 Revert "Multithreaded encoder, late sync loopfilter"
This commit is incomplete, as it does not synchronize the loop filter
before returning a handle to the reconstructed frame in
vpx_codec_get_preview_frame(), which can cause (false?) failures
when running the test_reconstruct_buffer test.

This may be related to a bug that does cause visible artifacts, which
is also under investigation.

This reverts commit 380d64ecb1.

Change-Id: Iad710941e7731d44fc2bde63bc63d6763cc4629e
2012-01-24 15:41:59 -08:00
Fritz Koenig
892102842a Disconnect ARM tgt_isa from dsp extensions
A processor with ARMv7 instructions does not
necessarily have NEON dsp extensions.  This CL
has the added side effect of allowing the ability
to enable/disable the dsp extensions cleanly.

Change-Id: Ie1e879b8fe131885bc3d4138a0acc9ffe73a36df
2012-01-20 10:38:15 -08:00
Jeff Faust
ac97b089d1 Merge "Simplify an assignment statement" 2012-01-18 21:14:51 -08:00
John Koleszar
6a4ff6f325 Merge "get_plane_pointers: use u/v planes consistently" 2012-01-18 14:22:55 -08:00
John Koleszar
4753ee4166 get_plane_pointers: use u/v planes consistently
The prior commit accidentally used the u plane where it should have
used the v plane.

Change-Id: Ib6c8443b99061536389f05ac25b8e0a307ace637
2012-01-18 12:50:06 -08:00
Jeff Faust
15c29afeca Simplify an assignment statement
Separated a double assignment that looked suspiciously like an
assignment and equality typo.

Change-Id: I7813979e9d7ea2539afb3c8ae6074f9df5ebdf52
2012-01-18 12:49:43 -08:00
John Koleszar
0e06bc817a Merge changes I1ebe76aa,Ia079b52b
* changes:
  rdopt/pickinter: factor out some common setup
  rdopt: remove unused frame_lf_or_gf
2012-01-18 09:30:46 -08:00
Adrian Grange
e479379abb Fixed bugs in multi-layer code related to changing params
When running multi-layer (ML) encodes and dynamically
changing coding parameters on the fly (e.g. frame
duration/rate, bandwidths allocated to each layer)
the encoder would not produce sensible output.

In certain cases the rate targeting would be
hideously inaccurate.

These fixes make it possible to change these coding
parameters correctly and to maintain accurate control
of the rate targeting.

I also added the specification of the input timebase
into the test program, vp8_scalable_patterns.c.

Patch 2: Moved declaration to appease MS compiler)

Change-Id: Ic8bb5a16daa924bb64974e740696e040d07ae363
2012-01-13 16:52:25 -08:00
John Koleszar
4ade079633 rdopt/pickinter: factor out some common setup
Add new get_predictor_pointers() and get_reference_search_order()
functions for code shared between the two implementations.

Change-Id: I1ebe76aa8f168b1f5cfabc00d05d8f19a0d4d207
2012-01-11 14:43:52 -08:00
John Koleszar
bd5bfd94b8 rdopt: remove unused frame_lf_or_gf
This flag was set but unused.

Change-Id: Ia079b52b88ffbe3b16fdbde4b84e2b87304eaa13
2012-01-11 13:02:19 -08:00
John Koleszar
66da859e5e Merge "Reduced the size of Y1Dequant and friends to [128][2]" 2012-01-06 11:59:06 -08:00
Scott LaVarnway
5f25d4c175 Reduced the size of Y1Dequant and friends to [128][2]
This patch removes the local copies of the dequantize
constants and implements John's idea as described
in "Make a local copy of the dequantized data" commit.

Change-Id: Ic6b7d681f00bf63263f71ff1e39ab2f80729e8b2
2012-01-06 11:12:00 -08:00
Johann
0780f258da Merge "Improve SSSE3 fast quantizer function" 2012-01-05 10:09:39 -08:00
Yunqing Wang
9f1083e9a0 Merge "Improve vp8cx_init_quantizer()" 2012-01-04 06:22:15 -08:00
Scott LaVarnway
33d9ea5471 Merge "Remove useless g_common.h" 2012-01-03 09:48:35 -08:00
Yunqing Wang
2b2c0c9bda Improve SSSE3 fast quantizer function
Simplified the EOB calculation in the function.

Change-Id: I7422f18be40ae270358f5cb0811d66e64436b56f
2011-12-29 12:05:50 -05:00
John Koleszar
3cb92b85b9 Remove unused MACROBLOCK member vector_range
Change-Id: Ie2dc0d72363ff38e0f71b59f6e2d1a2d70c5266b
2011-12-28 14:58:38 -08:00
John Koleszar
31e86192ba Remove unused BLOCK member force_empty
Change-Id: I72ed49ce14ca0124dd0d31bfcf4c7630a4681587
2011-12-28 13:57:51 -08:00
Yunqing Wang
b510863f8f Improve vp8cx_init_quantizer()
Except zrun_zbin_boost, 15 AC values are the same for all other
parameters. Removed unneccessary calculation.

Change-Id: I6101c0fe8080bd2b4387c3b04d7ddedbf6010409
2011-12-28 13:55:55 -05:00
John Koleszar
03fadc4b20 Merge "Remove unnecessary ternary constructs" 2011-12-22 13:01:05 -08:00
John Koleszar
d48ea5a2ab Merge "Remove legacy integer types" 2011-12-22 13:00:23 -08:00
John Koleszar
adb10c47a8 Merge "Use lookup tables for mode_check_freq" 2011-12-22 12:59:47 -08:00
John Koleszar
64c4be2669 Merge "Use lookup tables for thresh_mult" 2011-12-22 10:31:21 -08:00
John Koleszar
0c2b2c79ae Remove unnecessary ternary constructs
The code had a number of constructs like (condition)?1:0,
which is redundant with C's semantics. In the cases where a boolean
operator was used in the condition, simply remove the ternary part.
Otherwise adjust the surrounding expression to remove the condition
(eg, for rounding up. See pickinter.c and rdopt.c)

Change-Id: Icb2372defa3783cf31857d90b2630d06b2c7e1be
2011-12-22 10:09:46 -08:00
John Koleszar
f56918ba9c Remove legacy integer types
Remove BOOL, INTn, UINTn, etc, in favor of C99-style fixed width
types.

Change-Id: I396636212fb5edd6b347d43cc940186d8cd1e7b5
2011-12-22 09:58:40 -08:00
John Koleszar
aa8650dd7f Use lookup tables for mode_check_freq
Mostly cosmetic. Trying for a more compact representation of speed
selection thresholds.

Change-Id: I339e7840049b91ad569aabbdc9c702a496110d3b
2011-12-22 09:43:44 -08:00
John Koleszar
efb4783d36 Use lookup tables for thresh_mult
Mostly cosmetic. Trying for a more compact representation of speed
selection thresholds.

Change-Id: Icaebea632c7bb71ca8e07b4def04a046d4515e27
2011-12-22 09:43:40 -08:00
John Koleszar
0c2f8e77cc Remove useless g_common.h
This file declared a bunch of nonexistent, unreferenced global
function pointers.

Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728
2011-12-21 15:02:23 -08:00
James Zern
b651875e24 squash some signed/unsigned comparison warnings
Change-Id: Ifc64cf990ae04d77934da3324d0afb3993f061e7
2011-12-21 13:49:19 -08:00
John Koleszar
16a8948c45 Merge "Remove opaque pointer VP8_PTR" 2011-12-21 09:59:22 -08:00
John Koleszar
63d9c4da5e Merge "tokenizer: use correct block type context in stuff1st_order_b" 2011-12-21 09:20:21 -08:00
John Koleszar
b0056c3b5e Remove opaque pointer VP8_PTR
Use an opaque struct rather than typecasting through VP8_PTR, an int*.

Change-Id: I5ed4d9238ba2e8d51bfa07a8da87a2eb4c8fa43a
2011-12-21 09:13:51 -08:00
John Koleszar
056bcc8771 remove armv6 files from armv5 build
Make bilinearfilter_arm.c compiled only when HAVE_ARMV6, as its definitions
are v6 only. This is normally not a problem for static builds as the file
is elided at link time, but this was not being done properly for the
--enable-shared --enable-pic build.

Change-Id: Ic800a7cde751f74f22555c5b247f99f9df5e550d
2011-12-19 13:51:11 -08:00
Johann
080919b3c2 Merge "Avoid heap allocation of firstpass stats" 2011-12-19 10:11:23 -08:00
John Koleszar
c75f0ec379 Merge "fix: make sure ss_err is large enough" 2011-12-19 09:50:12 -08:00
Yunqing Wang
c647ec4462 Merge mr_pick_inter_mode and pick_inter_mode
Merged multi-resolution motion estimation with regular motion
estimation function in order to remove duplicated part. This
caused slight changes in multi-resulotion encoder quality &
performance.

Change-Id: Ib4ecc7acfebfe5eea959b5b91febae6db7b95fd1
2011-12-16 18:02:29 -05:00
James Berry
24196dd987 fix: make sure ss_err is large enough
increase size of ss_err by one to make
sure there is room for 64 elements.

Change-Id: I355cb8c499aa7da3b9675f2326a8d25a74bb88d2
2011-12-16 17:43:55 -05:00
John Koleszar
26c6a44c66 Avoid heap allocation of firstpass stats
The total_stats, this_frame_stats, and total_left_stats structures
were previously create by a heap allocation, despite being of fixed
size. These structures were allocated and deallocated during
{de,}allocate_compressor_data, which is reinvoked whenever the frame
size changes. Unfortunately, this clobbers the total_stats and
total_left_stats data.

Historically, these were variable size at one time, due to the first
pass motion map, which necessitated their being created by a unique
heap allocation. However, this bug with the total_stats being
clobbered has probably been present since that initial implementation.

These structures are instead moved to be stored within the struct
twopass_rc directly, rather than being heap allocated separately.

Change-Id: I7f9e519e25c58b92969071f0e99fa80307e0682b
2011-12-16 11:40:23 -08:00
Scott LaVarnway
a53d5a4c44 Moved dequant idct into common
These functions are now used by the encoder.
This is WIP with the goal of creating a common idct/add for
the encoder and decoder.  A boost of 1.8% was seen for
the HD rt test clip used.

[Tero] Added needed changes to ARM side.

Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf
2011-12-15 14:23:41 -05:00
Yunqing Wang
c8df1656bd Merge "Only call vp8_find_near_mvs() once for each macroblock" 2011-12-15 09:53:25 -08:00
Yunqing Wang
e06c242baa Only call vp8_find_near_mvs() once for each macroblock
While doing motion search on a macroblock, we usually call
vp8_find_near_mvs once per reference frame. Actually, for
different reference frames, the only difference in calculating
these near_mvs is they may have different sign_bias, which
causes a sign change in resulting near_mvs. In this change, we
only do find_near_mvs for the first reference frame. For other
reference frames, only need to adjust the near_mvs according to
that reference frame's sign_bias value.

Change-Id: I661394b49c6ad79fed7d0f2eb2be239b9c56f149
2011-12-15 11:19:18 -05:00
Yunqing Wang
d7e09b6ada Merge "Force realtime version 1 streams to only use simple loopfilter" 2011-12-15 05:38:57 -08:00
John Koleszar
72f459c77f Merge "Avoid multiple test for same lvl in auto filter lvl pick" 2011-12-14 16:28:13 -08:00
John Koleszar
e542627b0c Merge "fix: active_worst_quality could be set above 127" 2011-12-14 11:22:00 -08:00
Attila Nagy
51c4f9e6b1 Avoid multiple test for same lvl in auto filter lvl pick
Sometimes same level is tested 2-3 times; store and reuse the
calculated error value.

Change-Id: Ia1c04a2568232edf9a5a62c4e2d8e8a50d85e00e
2011-12-14 15:56:29 +02:00
Attila Nagy
55fbdd58ac Force realtime version 1 streams to only use simple loopfilter
...regardless of the speed settings.

Change-Id: I4b91ac7a7208efd690dfc69e175f8eb769b6ce03
2011-12-14 12:57:49 +02:00
James Berry
f8b431c334 fix: active_worst_quality could be set above 127
add check to set active_worst_quality to 127 if it
is set above 127

Change-Id: I7db353d5c1b1c8516a116542b6ed21c0110bb512
2011-12-13 14:58:59 -05:00
John Koleszar
d6020f9d52 tokenizer: use correct block type context in stuff1st_order_b
The fast-path for skipped MBs was not correctly respecting the
block type during update of the coefficient counts. Extracted
this from part of change I365cfb6ac636f19c545f682e3aeac185253abaef

Change-Id: I53d8cf0a00a98034b97b0ed3414b703bae74a739
2011-12-13 11:58:56 -08:00
Jim Bankoski
6b2792b0e0 Merge "vp8e - entropy stats per frame type" 2011-12-12 09:08:34 -08:00
Jim Bankoski
6de67cd6e8 vp8e - entropy stats per frame type
Change-Id: I4168eb6ea22ae541471738a7a3453e7d52059275
2011-12-09 16:56:18 -08:00
Johann
a69810b893 Merge "Reduce mem copies in encoder loopfilter level picking" 2011-12-07 10:41:00 -08:00
Attila Nagy
e570b0406d Reduce mem copies in encoder loopfilter level picking
Do the test filtering in the existing backup frame buffer instead of
the original. Copy the original data into extra buffer before doing
the  filtering. This way there is no need to restore the original
unfiltered  frame at the end of level picking process.

This came up in some discussions with Johann. Thanks!

Change-Id: I495f4301d983854673276c34ec0ddf9a9d622122
2011-12-07 09:59:50 +02:00
Yunqing Wang
aa7335e610 Multiple-resolution encoder
The example encoder down-samples the input video frames a number of
times with a down-sampling factor, and then encodes and outputs
bitstreams with different resolutions.

Support arbitrary down-sampling factor, and down-sampling factor
can be different for each encoding level.

For example, the encoder can be tested as follows.
1. Configure with multi-resolution encoding enabled:
../libvpx/configure --target=x86-linux-gcc --disable-codecs
--enable-vp8 --enable-runtime_cpu_detect --enable-debug
--disable-install-docs --enable-error-concealment
--enable-multi-res-encoding
2. Run make
3. Encode:
If input video is 1280x720, run:
./vp8_multi_resolution_encoder 1280 720 input.yuv 1.ivf 2.ivf 3.ivf 1
(output: 1.ivf(1280x720); 2.ivf(640x360); 3.ivf(320x180).
The last parameter is set to 1/0 to show/not show PSNR.)
4. Decode:
./simple_decoder 1.ivf 1.yuv
./simple_decoder 2.ivf 2.yuv
./simple_decoder 3.ivf 3.yuv
5. View video:
mplayer 1.yuv -demuxer rawvideo -rawvideo w=1280:h=720 -loop 0 -fps 30
mplayer 2.yuv -demuxer rawvideo -rawvideo w=640:h=360 -loop 0 -fps 30
mplayer 3.yuv -demuxer rawvideo -rawvideo w=320:h=180 -loop 0 -fps 30

The encoding parameters can be modified in vp8_multi_resolution_encoder.c,
for example, target bitrate, frame rate...

Modified API. John helped a lot with that. Thanks!

Change-Id: I03be9a51167eddf94399f92d269599fb3f3d54f5
2011-12-05 17:59:42 -05:00
John Koleszar
6127af60c1 Merge "Speed selection support for disabled reference frames" 2011-12-05 14:36:54 -08:00
Yunqing Wang
06fc0f83b6 Populate q_index in multi-thread encoding
This value needs to be copied to each thread's data structure.
This fixed artifact problem in multi-thread encoder.

Change-Id: Iab6d9745a1d44846aa503184705376f63a505597
2011-11-28 15:58:28 -05:00
Johann
e2bacd581a Merge "Move shared data to shared location" 2011-11-23 11:20:54 -08:00
Attila Nagy
97259b460c Fix encoder partitioned output on ARM
API was not returning correct partition sizes on arm targets.
The armv5 token packing functions were not storing the information to the
partition size table.
As a fix, have one boolcoder instance allocated for each partition so
that partition sizes are internally available after all partitions
were encoded. This will also allow more flexibility in producing
several partitions in parallel.

Use buffer validation (overflow check) in all ARM bitpacking
functions.

Change-Id: I31c8a11d8a7613676f0ff50928cb2a2ab14fd169
2011-11-23 12:29:43 +02:00
Johann
f2cd4ded22 Move shared data to shared location
Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2
file is bad

Moving towards allowing --disable-mmx

Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742
2011-11-18 16:23:14 -08:00
John Koleszar
e55974bf86 Speed selection support for disabled reference frames
There was an implicit reference frame test order (typically LAST,
GOLD, ARF) in the mode selection logic, but this doesn't provide the
expected results when some reference frames are disabled. For
instance, in real-time mode, the speed selection logic often disables
the ARF modes. So if the user disables the LAST and GOLD frames, the
encoder was always choosing INTRA, when in reality searching the ARF
in this case has the same speed penalty as searching LAST would have
had.

Instead, introduce the notion of a reference frame search order. This
patch preserves the former priorities, so if a frame is disabled, the
other frames bump up a slot to take its place. This patch lays the
groundwork for doing something smarter in the frame test order, for
example considering temporal distance or looking at the frames used by
nearby blocks.

Change-Id: I1199149f8662a408537c653d2c021c7f1d29a700
2011-11-18 13:53:21 -08:00
Attila Nagy
c84d42f864 Validate encoder buffer writes for single token partition
Extend buffer write validation (overflow check) to single token
partition packing, both mb and row based functions.

Change-Id: I36e19b7d37fc43712d05c70e3ad223d3eb5b973d
2011-11-18 12:49:27 +02:00
Scott LaVarnway
3c755577b8 Merge "Added predictor stride argument(s) to subtract functions" 2011-11-17 10:17:53 -08:00
Scott LaVarnway
edd98b7310 Added predictor stride argument(s) to subtract functions
Patch set 2: 64 bit build fix
Patch set 3: 64 bit crash fix

[Tero]
Patch set 4: Updated ARMv6 and NEON assembly.
             Added also minor NEON optimizations to subtract
             functions.

Patch set 5: x86 stride bug fix

Change-Id: I1fcca93e90c89b89ddc204e1c18f208682675c15
2011-11-15 12:53:01 -05:00
John Koleszar
bdd35c13cc avoid resetting framerate during vpx_codec_enc_config_set()
The calculated frame_rate is a state variable in the codec, and
shouldn't be maintained in the configuration struct. Move it to the
main part of cpi so that it isn't clobbered when the configuration
struct is updated. The initial framerate estimate is moved from the
vp8_cx_iface.c wrapper into the body of init_config() in onyx_if.c, so
that it is only called once and not reset on every call to
vp8_change_config().

Change-Id: I8d9a3d1283330d1ee297d07e9d78d1f2875f2465
2011-11-11 14:45:58 -08:00
Scott LaVarnway
9532bda0fb Merge "Relocated idct/add calls for encoder" 2011-11-09 10:17:43 -08:00
Johann
ea2229bab6 Merge "ARMv6 optimized Intra4x4 prediction" 2011-11-09 09:36:33 -08:00
John Koleszar
2999ca3094 Merge "Reset FPU state after calc_plane_error()" 2011-11-09 09:35:08 -08:00
John Koleszar
3fcf0e3668 Merge "Compiler warning fix for const array." 2011-11-09 09:34:50 -08:00
Scott LaVarnway
861ed6a5c1 Relocated idct/add calls for encoder
Call the idct/add after the tokenize.  This is WIP with
the goal of creating a common idct/add for the encoder and
decoder. This move is necessary because the decoder's version
of the idct clobbers qcoeff, which is used by the tokenize.

Change-Id: I6b08d8e8397cd873647fa4fb9469884e3c876756
2011-11-09 10:41:05 -05:00
Tero Rintaluoma
5a2fd63a2a ARMv6 optimized Intra4x4 prediction
Added ARM optimized intra 4x4 prediction
 - 2x faster on Profiler compared to C-code compiled with -O3
 - Function interface changed a little to improve BLOCKD structure
   access

Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54
2011-11-09 09:13:51 +02:00
Yunqing Wang
4c14efd234 Fix checks in MB quantizer initialization
vp8cx_mb_init_quantizer() needs to be called at least once to get
all values calculated. This change added one check to decide if
we could skip initialization or not.

Change-Id: I3f65eb548be57580a61444328336bc18c25c085b
2011-11-08 12:11:48 -05:00
Adrian Grange
b615a6d47f Third set of checks of buffer level against maximum buffer size
Additional check of buffer level to ensure it doesn't exceed the
maximum buffer size.

Change-Id: I1ba4f8b09bbec89646885040ff47470196af521e
2011-11-07 17:15:28 -08:00
Adrian Grange
fa25a31ed4 Additional clipping of buffer level to maximum buffer size
Added additional check of buffer level against maximum
buffer size.

Change-Id: Iaf1fbaf008601161e402b43ce82c3dbc129bf740
2011-11-07 16:54:40 -08:00
Adrian Grange
9dc95b0a12 Added check to make sure maximum buffer size not exceeded
Added code to clip the buffer level to the maximum buffer
size. Without this the buffer level would increase
unchecked.

This bug was found when encoding an essentially static
scene at 2Mb/s. The encoder is unable to generate frames
consistent with the high data-rate because Q bottoms out
at Qmin.

As frames generated are consistently undersized the buffer
level increases and does not get checked against the
maximum size specified by the user (or default).

Change-Id: Id8a3c6323d3246da50f7cb53ddbf78b5528032c6
2011-11-07 16:28:13 -08:00
Fritz Koenig
f0c01413fb Compiler warning fix for const array.
Fix compiler warning for passing a non const array
to a function expecting a const array by using an
intermediary pointer and casting.

Change-Id: I9bdd358ebdc926223993fb8fb2098ffedd2f3fc7
2011-11-04 18:19:26 -07:00
Yunqing Wang
e1a55b504a Merge "Add checks in MB quantizer initialization" 2011-11-04 11:52:27 -07:00
Tero Rintaluoma
d497ec688d Fix issue 374: eob read incorrectly
Updated eob changes to check_reset_2nd_coeffs function.

Change-Id: Id1b21c91c7f0fd286640b487ffe47867009b717d
2011-11-04 09:36:49 +02:00
Scott LaVarnway
46639567a0 Merge "Change use of eob in the encoder" 2011-11-03 08:06:06 -07:00
Tero Rintaluoma
e4f2ec7a52 Change use of eob in the encoder
Changed 'int eob' to 'char *eob' in BLOCKD so that both encoder and
decoder will use eobs[25] array from MACROBLOCKD structure. In future,
this will enable use of the decoder side IDCT in the encoder.

Change-Id: I6e1c011628cb8864fd4a0b80f0279ce16a5ca978
2011-11-03 16:08:09 +02:00
Yaowu Xu
8002c31804 Merge "added code to clear 2nd order block when appropriate" 2011-11-02 08:22:58 -07:00
Yunqing Wang
e44720af84 Add checks in MB quantizer initialization
In some situations (f.g. error-resilient is turned on), vp8cx_mb
_init_quantizer() was called once per macroblock. Added checks
to avoid calculations when there is no change.

Change-Id: Ie4f0a5ade2202041254990a4e9d5b03bd1ac5aea
2011-11-01 17:41:22 -04:00
Yaowu Xu
88e24f07ae added code to clear 2nd order block when appropriate
It is discovered that in rare situations the 2nd order block may
produce a few small magnitude coefficients that has no effect on
reconstruction. The situations are a combination of low quantizer
values (high quality) and low energy in residual signals (content
dependent). This commit added code to detect such cases and reset
the 2nd order block to all 0.

Patch 1 to 4 used code to do all-zero-check on idct result buffer,
and tests on derf set showed a consistent gain of .12%-.14% on all
metrics.But due to a recent change Ie31d90b, the idct result buffer
is not longer populated. So patch 5&6 use an alternative method to
detect the situations. Tests on derf set now shows a consistent
quality gain of .16%-.20%.

As suggested by Jim, Patch 7&8 removed the condition of all first
order block not having any coefficient, instead we reset 2nd order
coefficients to all 0 if sum of absolute value of the coefficients
is small. So it does slightly more than just detecting the oddity
as discussed above, but tests on derf set now show a consistent
gain of .20%-.23% on all metrics.

It is worth noting here that this change does not have any effect
on mid/high quantizer range, it only affects the quantizer value
18 or blow. Within this range, the change helps compression by up
to 2.5% on clips in the derf set.

Change-Id: I718e19cf59a4fc2462cb7070832759beb9f7e7dd
2011-10-28 12:07:21 -07:00
Attila Nagy
9452dce181 Fix ARM build problem introduced by CL I3fab6f2b
Update ARM asm implementation of vp8_start_encode to new definition.

Change-Id: Ic44791c969e351082331ba6146c3384c01a0dfad
2011-10-27 09:06:45 +03:00
Attila Nagy
de82809444 Reduce partial frame copy in encoder's pick_filter_level_fast
The partial frame copy function used to copy an extra 8 lines above
and  below. The partial frame filtering can only modify 3 pixel rows
above the partial frame. Reduce copy to bare minimum needed, which is
4 lines, so that partial filtering on copied frame is possible.

Define the "magic" fraction number for partial filtering in
loopfilter.h .

Change-Id: I4791ffc541b6884b12759a0d0714a8faf16147ec
2011-10-26 15:25:07 +03:00
Johann
a82cc0205d remove unused variable warning
Change-Id: I4fcd6e4656d9823aead941616cd63501aecbd6e2
2011-10-24 16:33:45 -07:00
John Koleszar
2c0b4a24b9 Merge "Fix: check cx_data buffer prior to write" 2011-10-20 17:36:40 -07:00
James Berry
bc7151131d Fix: check cx_data buffer prior to write
check to make sure that cx_data buffer has enough room before
writting to it, prior behavior did not which could result in a crash.

Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28
2011-10-20 15:55:00 -04:00
Johann
7cdc986cdf Don't copy borders for loop_filter_pick
During the _pick only the Y plane is examined. In addition, data beyond
the borders of the frame is not read.

Change-Id: Ic549adfca70fc6e0b55f8aab0efe81f0afac89f9
2011-10-19 18:54:14 -07:00
Johann
f382173225 Merge "enc: save entropy probs only when needed for refresh" 2011-10-19 14:36:29 -07:00
Scott LaVarnway
63a77cbed9 Merge "Remove usage of predict buffer for decode" 2011-10-19 10:24:48 -07:00
Scott LaVarnway
ed9c66f584 Remove usage of predict buffer for decode
Instead of using the predict buffer, the decoder now writes
the predictor into the recon buffer.  For blocks with eob=0,
unnecessary idcts can be eliminated.  This gave a performance
boost of ~1.8% for the HD clips used.

Tero: Added needed changes to ARM side and scheduled some
      assembly code to prevent interlocks.

Patch Set 6:  Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3)
into this commit because of similarities in the idct
functions.
Patch Set 7: EC bug fix.

Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
2011-10-18 12:06:50 -04:00
Attila Nagy
a5cd42feb9 Fix: vp8cx_pack_tokens_into_partitions_armv5 crash
It was crashing when number of partitions was bigger than the number
of MB rows (ex. 128x96 with 8 partitions).
Start point was not checked against mb_rows, plus extra
"empty" partitions were not written out.

Change-Id: I9c2f013b9ec022354b658fab4ef799ff8b1de93d
2011-10-14 10:53:04 +03:00
Adrian Grange
04182a121a Merge "Added rate-targeted temporal scalability" 2011-10-11 12:54:52 -07:00
Adrian Grange
217591fde5 Added rate-targeted temporal scalability
Added the ability to create rate-targeted, temporally
scalable, VP8 compatible bitstreams.

The application vp8_scalable_patterns.c demonstrates how
to use this capability. Users can create output bitstreams
containing upto 5 temporally separable streams encoded
as a single VP8 bitstream.
(previously abandoned as:
I92d1483e887adb274d07ce9e567e4d0314881b0a)

Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a
2011-10-11 12:49:12 -07:00
John Koleszar
07ba411914 Reset FPU state after calc_plane_error()
Fixes a MMX/SSE2 mismatch when building with --enable-internal-stats.

Change-Id: I0c50a1f246f6916b7a5fc6f36864ceb362f25520
2011-10-11 08:43:30 -07:00
James Berry
05bde9d4a4 bug fix - starting/optimal/max and buffer_level changed from int to int64_t
buffer_level in VP8_COMP and starting_buffer_level, optimal_buffer_level
and maximum_buffer_size in VP8_CONFIG changed from int to int64_t
to avoid potential crash issues for larger target bit rates.

Change-Id: I0d5ab6c8a44c2fef51f30cd8df4bb4b739c5df26
2011-10-10 12:16:55 -04:00
Attila Nagy
c0de35b413 enc: save entropy probs only when needed for refresh
Previous entropy probs need to be saved (and restored) only when
current updates are not propagated.

Change-Id: Ie6ee0543066e30874e56258be0a6b7d2dd2fdb2b
2011-10-10 13:44:54 +03:00
Scott LaVarnway
af12c23e8e Merge "Improved tokenize" 2011-10-04 09:57:42 -07:00
John Koleszar
8f8b526b54 Merge "Fix uninitialized new_mv_count in first pass file" 2011-10-04 07:40:49 -07:00
Yunqing Wang
538865dfa5 Merge "Multithreaded encoder, late sync loopfilter" 2011-10-04 07:04:30 -07:00
John Koleszar
86712c50f2 Fix uninitialized new_mv_count in first pass file
Uninitialized data could be written to the first pass file when no
motion vectors are present in the frame.

Also fix a number of compiler warnings.

Change-Id: Icc9f53b6d33da9de4563d86d9fd591910473ea90
2011-10-04 09:50:52 -04:00
Scott LaVarnway
ab00d209bc Improved tokenize
For a realtime HD encodings, up to 1.6% gains seen.



Change-Id: If45028e23db95124da63f9d38ffe06e05596cc6e
2011-09-30 12:49:46 -04:00
Alpha Lam
7bce513afe Call vp8_find_near_mvs lazily
vp8_find_near_mvs() is being called on all possible reference frames
but the data computed may be used if the loop exits early, which can
be due to x->skip beign set to 1.

Optimize this by call vp8_find_near_mvs() laziy only if it is going
to be used and not computed yet.

Change-Id: Iccdbd4c962a670c9f2c99b8aca8096042ca5dc98
2011-09-30 14:48:18 +01:00
Paul Wilkins
a572ac8327 Merge "CQ and two pass rate control." 2011-09-30 02:57:54 -07:00
Paul Wilkins
b6e27d5f0b CQ and two pass rate control.
Changes to the selection of Q limits for two pass
and two pass CQ mode.

Allowance made for Mode and motion vector costs.
Some refactoring of common code.

For Derf and YT sets CQ mode average improvement
circa 1% (SSIM and Global PSNR).

Some increased tendency to undershoot even when
user CQ not reached.

Patch2: Removed some test code accidentally merged.

Change-Id: Icf74d13af77437c08602571dc7a97e747cce5066
2011-09-30 10:55:52 +01:00
Attila Nagy
380d64ecb1 Multithreaded encoder, late sync loopfilter
Sync with loopfilter thread just at the beginning of next frame encoding.
This returns control to application faster and allows a better multicore scaling.
When PSNR packets are generated the final filtered frame is needed imediatly
so we cannot delay the sync.

Change-Id: I288d97b5e331d41d6f5bb49d97986fa12ac6f066
2011-09-29 10:06:24 +03:00
Johann
9f41a8b0aa Merge "Replace vpx_ports/config.h with vpx_config.h" 2011-09-22 09:30:18 -07:00