Commit Graph

2003 Commits

Author SHA1 Message Date
Jingning Han
c8f481fa3d Restore mode skip feature in sub8x8 rd loop
This commit restores the mode skip feature in the sub8x8 rd loop.

Change-Id: I5496ee32053f572b8961b549e9ecd4f1360824de
2013-10-07 14:20:34 -07:00
Dmitry Kovalev
2ae93a776b Merge "Giving consistent names to IDCT 8x8 functions." 2013-10-07 14:19:50 -07:00
Dmitry Kovalev
23cc1cd8e6 Removing redundant vp9_pt_energy_class declarations.
Declaring vp9_pt_energy_class in vp9_entropy.h instead of many external
places.

Change-Id: I66e8a3fc119a43f88d130d0dae4133c825a047a3
2013-10-07 14:11:01 -07:00
Jim Bankoski
7eb7dd2fed cpplint errors in vp9_onyx_if.h
Slightly bigger change -> broke up encode_frame_to_datarate,  lots
of line length fixes.

Change-Id: I7c53325e954de130f3fe1a6656626efc6705be82
2013-10-07 13:57:20 -07:00
Dmitry Kovalev
272adbbec4 Using inter_mode_offset_function instead of duplicated code.
Change-Id: I8de865cd1deca07b5c92c225782f0867367e9a11
2013-10-07 13:18:46 -07:00
Adrian Grange
18a2617126 Merge "cpplint issues resolved vp9_ratectrl.c" 2013-10-07 10:54:17 -07:00
Jim Bankoski
31b7a912d1 cpplint issues resolved vp9_ratectrl.c
Change-Id: Iae7674b0c946a5ac01617840b3f62965c654d920
2013-10-07 09:21:29 -07:00
Jim Bankoski
92519a005a Merge "cpplint problems resolved with vp9_firstpass.c" 2013-10-07 09:16:46 -07:00
Jim Bankoski
ccc5a483f4 Merge "cpplint issues resolved in vp9_mcomp.c" 2013-10-07 09:14:35 -07:00
Paul Wilkins
65f0cc7f4b Disable MODE_TEST_HIT_STATS
This flag is for stats generation and testing and should not
be checked in as enabled by default.

Change-Id: I4ea57dbcf49790f14777f598ddd3dc37dcc7a6bb
2013-10-07 02:54:19 -07:00
Dmitry Kovalev
c6ad70d5f1 Giving consistent names to IDCT 8x8 functions.
Renames:
  vp9_short_idct8x8_add    -> vp9_idct8x8_64_add
  vp9_short_idct8x8_1_add  -> vp9_idct8x8_1_add
  vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add
  vp9_idct_add_8x8         -> vp9_idct8x8_add

Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
2013-10-06 00:24:09 -07:00
Dmitry Kovalev
9dba044be2 Merge "Giving consistent names to IDCT/IWHT functions." 2013-10-05 23:44:05 -07:00
Jim Bankoski
bf21ce63ee encodemb cpplint issues revisited.
Change-Id: Id5f25b74e2207bf44b6f6c8ffe548fa30fd78b4d
2013-10-05 17:24:51 -07:00
Jim Bankoski
30dee8adfc cpplint problems resolved with vp9_firstpass.c
Change-Id: Ic7b7014a0d857585bfd4baaea1d5c27ffe355642
2013-10-05 17:10:54 -07:00
Jim Bankoski
c9f3f9ed70 Merge "unused typedef in vp9_variance.h" 2013-10-05 16:49:13 -07:00
Jim Bankoski
7fd13472ae Merge "cpplint issues with vp9_boolhuff.c resolved" 2013-10-05 16:48:28 -07:00
Jim Bankoski
f59cb3eacc Merge "added nolint to function that doesn't seem easy to breakup" 2013-10-05 16:47:23 -07:00
Jim Bankoski
4410bbbf88 Merge "cpplint issues in vp9_lookahead.c" 2013-10-05 16:46:11 -07:00
Jim Bankoski
b79b7c354d cpplint issues resolved in vp9_mcomp.c
Change-Id: I2c2f83f4dfa2782fc6b0aa6db3ba2c4e6e423ffa
2013-10-05 16:44:40 -07:00
Jim Bankoski
6a7b1fb754 Merge changes Idbfabe42,I788f1a30
* changes:
  cpplint issues resolved in vp9_variance_mmx.c
  cpplint issues in vp9_ssim.c
2013-10-05 16:32:50 -07:00
Jim Bankoski
2dba2eb46a Merge "cpplint issues in vp9_picklpf.c" 2013-10-05 16:32:00 -07:00
Jingning Han
0d0ed6a29b Allow sub8x8 intra modes test for alt frame coding
This commit allows sub8x8 intra modes test in the rate-distortion
loop for hd sequences in speed 1 and 2.

For sequence y90n of hd set at 8000 kbps, speed 2 runtime goes
from 207s to 210s. For ped_1080p at 3000 kbps, speed 2 runtim goes
from 336s to 337s. Both are running with 300 frames.

This improves compression performance by 0.24% for stdhd and 0.32%
for hd.

Change-Id: I173ca38a6411565ae6cfadd184c42b2070c5de1f
2013-10-04 19:13:00 -07:00
Jim Bankoski
0500cf429f cpplint issues with vp9_boolhuff.c resolved
Change-Id: I6990c9ab838323d8770dd1f49a25bf3acc4c05c7
2013-10-04 17:20:58 -07:00
Jim Bankoski
a36045fb3b Merge "cpplint issues with vp9_temporal_filter.c" 2013-10-04 17:17:02 -07:00
Jim Bankoski
cac3e1588e cpplint issues in vp9_picklpf.c
Change-Id: I62e631ca95fefbb1a993479a5e3926dc81359fe7
2013-10-04 17:08:41 -07:00
Jim Bankoski
eead4bb89e Merge "lint issue in vp9_psnr.c" 2013-10-04 16:42:30 -07:00
Jim Bankoski
e2d73897d0 Merge "vp9_encodeframe.c cpplint issues resolved" 2013-10-04 16:42:06 -07:00
Jim Bankoski
6e161a26e3 Merge "cpp lint issues resolved in vp9_encodeintra.c" 2013-10-04 16:41:58 -07:00
Jim Bankoski
5f80d2ad33 Merge "cpplint vp9_dct.c issues resolved" 2013-10-04 16:41:46 -07:00
Jim Bankoski
38f6a3cdc7 Merge "cpplint issues vp9_tokenize.c resolved" 2013-10-04 16:41:23 -07:00
Jim Bankoski
d07545b7b8 cpplint issues with vp9_temporal_filter.c
Change-Id: I695a990689c79d160227975116125b140875aed1
2013-10-04 15:49:30 -07:00
Yaowu Xu
d129eea9fa Merge "Further clean up of speed 4" 2013-10-04 14:45:21 -07:00
Jim Bankoski
de5cb8b140 vp9_encodeframe.c cpplint issues resolved
Change-Id: Id9d837e062d9c4a94def4b4ed1f49a67c75d3618
2013-10-04 14:37:31 -07:00
Jim Bankoski
02f28bac29 cpp lint issues resolved in vp9_encodeintra.c
Change-Id: Ib6a8360d24f44eeaec12c5055568382a105dc235
2013-10-04 14:35:01 -07:00
Jim Bankoski
9c2b3744c9 cpplint issues in vp9_lookahead.c
Change-Id: I2a98995f0df77d99dc47bda5e41886f014d8843f
2013-10-04 14:24:19 -07:00
Jim Bankoski
5b4f836148 cpplint issues resolved in vp9_variance_mmx.c
Change-Id: Idbfabe427fbeab44210f13fec8b6f63f7a4eb0dd
2013-10-04 14:22:08 -07:00
Jim Bankoski
eb5b7ac27b added nolint to function that doesn't seem easy to breakup
Change-Id: I5489b116aea7c510ea5ebbed3c1445f321b05f3e
2013-10-04 14:17:47 -07:00
Dmitry Kovalev
3a0602578e Giving consistent names to IDCT/IWHT functions.
The idea is to have the following names for each transform size:

vp9_idct4x4_add
  vp9_idct4x4_1_add
  vp9_idct4x4_10_add
  vp9_idct4x4_16_add

vp9_idct8x8_add
  vp9_idct8x8_1_add
  vp9_idct8x8_10_add
  vp9_idct8x8_64_add

etc for 16x16, 32x32

The actual list of renames in this patch:

vp9_idct_add_lossless     -> vp9_iwht4x4_add
vp9_short_iwalsh4x4_add   -> vp9_iwht4x4_16_add
vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add

vp9_idct_add            -> vp9_idct4x4_add
vp9_short_idct4x4_add   -> vp9_idct4x4_16_add
vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add

Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
2013-10-04 14:17:06 -07:00
Jim Bankoski
25ecb1f0b3 cpplint vp9_variance_sse2.c
Change-Id: Ifce8f5b57a1ea8952e8a67c5b92a127a061899fa
2013-10-04 14:15:06 -07:00
Jim Bankoski
f3e6a35cdb cpplint issues in vp9_ssim.c
Change-Id: I788f1a3004643347ca08d08fc3cb2bb8f0b134d9
2013-10-04 14:08:37 -07:00
Jim Bankoski
424c74e736 cpplint vp9_dct.c issues resolved
Change-Id: Ia21653a447040f1b472d21ebd19103b0558c4b16
2013-10-04 13:47:59 -07:00
Jim Bankoski
c6960b6086 cpplint issues vp9_tokenize.c resolved
Change-Id: Id4ec0084641d2ad4def95fb05239455fbc25f9b9
2013-10-04 13:42:58 -07:00
Jim Bankoski
660dcfe6a2 Merge "cpplint issues vp9_encodemv.c" 2013-10-04 12:55:46 -07:00
Jim Bankoski
19641c40f9 Merge "cpplint issues vp9_mbgraph" 2013-10-04 12:55:26 -07:00
Guillaume Martres
014a2c17df Fix first pass for non-square blocks
Change-Id: Ic049f0a6ce190f33859118e7b8cfcfe305979102
2013-10-04 12:04:15 -07:00
Dmitry Kovalev
042c475a8f Merge "Moving all idct/iht functions in one place." 2013-10-04 12:01:42 -07:00
Jim Bankoski
d9215a6616 cpplint issues vp9_mbgraph
Change-Id: Iedf9ac460edb31d7c072e2bebd26f2afe8e6089b
2013-10-04 11:22:22 -07:00
Jim Bankoski
19e227561a cpplint issues vp9_encodemv.c
Change-Id: Icda1d2d7cbfb176884fa6c7d9366a2d60e2994e9
2013-10-04 11:19:06 -07:00
Jim Bankoski
916f803175 lint issue in vp9_psnr.c
Change-Id: Ifc7ffc02cfedb47230571298622602609a4e8a70
2013-10-04 11:01:49 -07:00
Jingning Han
1ab60f7bfb Merge "Remove redundant second_ref_frame check in sub8x8" 2013-10-04 09:04:11 -07:00
Paul Wilkins
44e039b4f5 Further clean up of speed 4
Speed 4 still does not give a big gain over speed 3.
This just cleans it up a little from the last patch and comments
out features that do not seem to be giving much benefit.

Change-Id: I5f366e6160e1dbe5dc45cf5eb90cc02712baa1b6
2013-10-04 16:57:24 +01:00
Paul Wilkins
8abd92f12f Remove mode_skip_start and mask code for sub 8x8
This code serves no purpose in the re-factored sub 8x8 code.

Change-Id: I5364986224d1a28b71bcb046ec8557a3d14aaa47
2013-10-04 14:26:17 +01:00
Paul Wilkins
de6ecc5ac3 Selective masking of split modes.
Allow selective masking of individual split modes rather than
just a single on / off flag.

For speed 2 recovers the large speed loss seen for some derf
clips  in change Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
and a small quality gain.

For speed 1 10 % speed increase observed locally on some derf clips
for minimal quality change.

Change-Id: If86191087b93cbc05351c26c60c7933e2149e485
2013-10-04 14:20:58 +01:00
Paul Wilkins
03dd2818e4 Missing threshold case for disable split.
In relation to change:
Refactor inter mode rate-distortion search
 Ie6bdfa0a370148dd60bd800961077f7e97e67dd4

sf->thresh_mult_sub8x8[THR_INTRA] = INT_MAX missing;

Change-Id: Ia86b68a5073368a3e2ca124a27b632243b525c8b
2013-10-04 11:54:24 +01:00
Dmitry Kovalev
d975804e9a Merge "Replacing duplicated code with get_scan_and_band call." 2013-10-03 18:58:40 -07:00
Dmitry Kovalev
8b34437522 Replacing duplicated code with get_scan_and_band call.
Change-Id: I2cc3684f416a63dc99b9303109f9850f34a470d5
2013-10-03 17:46:28 -07:00
Jingning Han
2952b7d1fb Remove redundant second_ref_frame check in sub8x8
This commit removes the redundant second reference frame check in
the rate-distortion optimization loop for sub8x8 blocks.

Change-Id: I13a57a6f624c4a9bcef02ff2a867fa30d8b44a93
2013-10-03 14:02:12 -07:00
Jingning Han
b9daef91d8 Use vp9_zero in sub8x8 RD optimiazion loop
Change-Id: Ic23a705e48cadaa7151f2bd8536d56636cb973e3
2013-10-03 12:34:25 -07:00
Jingning Han
4093192ec9 Change b_mode_info definition from union to struct
This commit defines b_mode_info as a struct type. This will allow
us to further remove the use of PARTITION_INFO in the encoding process.

Change-Id: I975b0f7d557b5e0f66545a61b472def76b671cce
2013-10-03 12:34:11 -07:00
Jingning Han
793c2d8429 Remove unused variables in inter_mode rd loops
Remove redundant variable definition/use in rate-distortion search
loop for regular and sub8x8 blocks, respectively.

Change-Id: Ic0eb3660bb6851ba2eb8d702ba9fd11595000d01
2013-10-03 12:34:11 -07:00
Jingning Han
a55625873f Merge "Refactor inter mode rate-distortion search" 2013-10-03 12:19:53 -07:00
Jingning Han
11abab356e Refactor inter mode rate-distortion search
This commit separates the rate-distortion optimization loop of
superblocks from that of sub8x8 blocks. This allows better design
rate-distortion optimization search loop for each setting. It also
removes the use of SPLITMV and I4X4_PRED therein.

No performance change in speed 0 settings. For bus@CIF at 2000kbps,
the speed 1 runtime goes from 48009ms to 43894ms (about 10% faster).
The overall compression performance on derf changed by -0.021%.

Speed 2 runtime goes from 27114ms to 28700ms (6% slower), while the
overall coding efficiency goes up by 1.629% for derf, 1.236% for yt.

Change-Id: Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
2013-10-03 11:36:49 -07:00
Dmitry Kovalev
9250d1529c Using vp9_zero instead of vpx_memset.
Change-Id: I9a0d0e9c3459954aa7b9c68f92cc5d56385ebd18
2013-10-03 10:59:36 -07:00
Paul Wilkins
b03d3da9c1 Merge "Speed setting review." 2013-10-03 09:49:00 -07:00
Paul Wilkins
fa71882e63 Merge "make use last partition consider motion" 2013-10-03 09:48:49 -07:00
Dmitry Kovalev
6cb6987d4d Merge "BITSTREAM - RESTORING BILINEAR INTERPOLATION FILTER SUPPORT" 2013-10-03 09:34:26 -07:00
Paul Wilkins
6253cc9279 Speed setting review.
Substantial reworking of the speed vs quality trade offs for
speed 1 and 2.

In this patch I am attempting to freeze the "quality" meaning of
speeds 1 and 2 relative to speed 0 so that in future we can
better evaluate progress.

I am targeting :
Speed 1 quality ~-5% vs speed 0.
Speed 2 quality ~-10% vs speed 0

It is inevitable that quality will still fluctuate a little as we adjust
settings and add new features, but we will attempt to keep as
close as possible to these values. Above speed 2 things will remain
a bit more fluid for now.

In this patch speed 1 is approximately 4-5x as fast as speed 0. This
is similar to before but the quality hit is a lot less. Likewise speed 2
is approximately 2x as fast as speed 1 but is similar in quality to the
previous speed 1 configuration.

Also slight change to behavior of FLAG_EARLY_TERMINATE to insure
all reference frames get at least one rd test. Important for very low
variance regions.

WIP :- Added a new speed level with old speed 4 becoming speed 5.
Speed 3 and 4 tradeoffs still WIP

Change-Id: Ic7a38dd7b5b63ab1501f9352411972f480ac6264
2013-10-03 10:23:28 +01:00
Jim Bankoski
f1d3e5e4d6 make use last partition consider motion
This commit causes use last partition to consider whether a 64x64 has
motion that might make a new partitioning worth while.

Change-Id: I3a57bedef4f3cd961fadbfa96651c206fa36da4a
2013-10-03 10:22:39 +01:00
Paul Wilkins
ece99b3da0 Merge "Improved auto_partition_range." 2013-10-03 02:06:13 -07:00
Dmitry Kovalev
68a3e4a888 BITSTREAM - RESTORING BILINEAR INTERPOLATION FILTER SUPPORT
Adding appropriate test vector vp90-2-06-bilinear.webm.

Change-Id: Ia3bbf57318e0cc61a1b724fe751e3f9c7e11b337
2013-10-02 18:04:12 -07:00
A.Mahfoodh
5215b83aea Simplifying and inlining k_cvtlo_epi16 and k_cvthi_epi16
Simplify the k_cvtlo_epi16 and k_cvthi_epi16 to only two
instructions. Then inlined them.

quoting from intel MMX_App_Compute_16bit_Vector.pdf‎
"The PMADDWD instruction multiplies four
pairs of 16-bit numbers and produces partial sums of the results
and can do so once per clock (with a three-clock latency)."
so I am assuming that there will be three clock overhead after the
last _mm_madd_pi16 command.
Even with the overhead the number of clocks in general should be
smaller. I am not sure though becasue I could not find information
about number of clocks required for instructions in k_cvtlo_epi16
and k_cvthi_epi16. I will run a test and compare the execution time.

Change-Id: Ieda4aa338f69ad3dd196ac6e7892da3cf1b47ea7
2013-10-02 20:02:03 -04:00
Dmitry Kovalev
a88a0e88a4 Merge "Moving get_token_alloc function from common to the encoder." 2013-10-02 16:26:00 -07:00
Jim Bankoski
f5bcc372c9 unused typedef in vp9_variance.h
Change-Id: I15f79c9de34c723c1dd419b8da96c3ff948c5e03
2013-10-02 15:59:31 -07:00
Dmitry Kovalev
be7eec79be Moving all idct/iht functions in one place.
Moving functions from vp9_idct_blk to vp9_idct because these functions are
used from both encoder and decoder. Removing duplicated code from
vp9_encodemb.c and reusing existing functions.

Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
2013-10-02 14:13:33 -07:00
Jingning Han
54bc73151b Deprecate unused mode count variables
Remove mode_check_freq and mode_test_hit_counts from VP9_COMP.

Change-Id: Iabfd9f841444cd9bf19ac761a9795f140082ce0b
2013-10-02 11:07:14 -07:00
Jim Bankoski
825b7c301d Merge "vp9_block.h cpplint issues resolved" 2013-10-01 16:14:58 -07:00
Jim Bankoski
691177842c Merge "cpplint issue in vp9_rdopt.h" 2013-10-01 15:45:35 -07:00
Jim Bankoski
5491a1f33e vp9_block.h cpplint issues resolved
Change-Id: Icc6a76a5be77f3e19918155bab3998e0aa32ccf5
2013-10-01 15:17:39 -07:00
Jim Bankoski
c4627a9ff1 cpplint issues in vp9_onyx_int.h
Change-Id: I6c4058aebe834e1a12b7a3fb10484b9ebe60b349
2013-10-01 15:14:39 -07:00
Jim Bankoski
b6e2f9b752 cpplint issue in vp9_rdopt.h
Change-Id: I84209d382ca5dfc537ee533cd792d8caa0e25cee
2013-10-01 15:09:32 -07:00
Dmitry Kovalev
0a5e9ee054 Moving get_token_alloc function from common to the encoder.
Also renaming mb_row -> mi_row, mb_col -> mi_col arguments and calculate
mb_rows/mb_cols values from mi_rows/mi_cols.

Change-Id: I6919a279f560648e23bc9a12f507d17c21ffd5d7
2013-10-01 11:54:10 -07:00
Jingning Han
195061feda Fix rectangular partition check in speed 1
Make encoder skip rectangular partition check in speed 1 and above,
when early termination was triggered in partition split.
Thanks Guillaume (gmartres@) for catching this issue.

This change makes bus_cif at 2000kbps speed 1 runtime goes down from
25612ms to 23438ms (about 9% speed-up), at the expense of -0.235%
performance down.

Change-Id: I98613fad081a261d30d5fa206f934ca70601c180
2013-09-30 12:14:36 -07:00
Paul Wilkins
d12a502ef9 Merge "Alter Speed 3." 2013-09-30 09:12:28 -07:00
Deb Mukherjee
fad3d07df3 Merge "Some minor changes/cleanups in rate control" 2013-09-30 06:50:56 -07:00
Paul Wilkins
65b93c7e52 Improved auto_partition_range.
The code now takes into account temporal and spatial
information to determine the partition size range, but the
frequency counts have been removed.

The net effect is similar in quality but about 10% faster.

Change-Id: I39a513fb79cec9177b73b2a7218f0da70963ae95
2013-09-30 11:32:57 +01:00
Paul Wilkins
a76caa7ff4 Alter Speed 3.
This patch deletes the variance based speed three partitioning.
Speed 3 now uses the same partitioning method as speed 2
but with some stricter conditions.

The speed and quality are now somewhere between speeds 2 and 4
whereas before it was worse in both than speed 4.

Change-Id: Ia142e7007299d79db3ceee6ca8670540db6f7a41
2013-09-30 11:26:46 +01:00
Dmitry Kovalev
b927620231 Merge "Using is_inter_block and has_second_ref functions." 2013-09-29 12:14:41 -07:00
Dmitry Kovalev
29815ca729 Merge "Moving from int_mv* to MV* (3)." 2013-09-29 12:13:16 -07:00
Dmitry Kovalev
4ab01fb5f7 Merge "Reusing FRAME_CONTEXT struct to simplify the code." 2013-09-29 12:02:26 -07:00
Dmitry Kovalev
b3d3578ee4 Merge "Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add." 2013-09-29 12:01:50 -07:00
Dmitry Kovalev
7343681675 Merge "Removing vp9_get_coef_neighbors_handle function." 2013-09-29 12:01:36 -07:00
Dmitry Kovalev
efbacc9f89 Merge "Removing vp9_subpelvar.h from common." 2013-09-29 12:00:46 -07:00
Dmitry Kovalev
bd9c057433 Reusing FRAME_CONTEXT struct to simplify the code.
Change-Id: Ia455c1900d84a3221e3681e31e15ca86bd03f89d
2013-09-27 16:41:20 -07:00
Guillaume Martres
ceaa3c37a9 Merge "Simplify RDMULT and RDDIV derivation" 2013-09-27 16:32:54 -07:00
Dmitry Kovalev
3fab2125ff Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add.
Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1.

Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5
2013-09-27 15:26:27 -07:00
Dmitry Kovalev
209c6cbf8f Removing vp9_get_coef_neighbors_handle function.
Change-Id: I6be72c8b048d1ccc7ef43764cf84c32360098970
2013-09-27 14:11:13 -07:00
Deb Mukherjee
80d582239e Some minor changes/cleanups in rate control
Some small changes to the quantizer mapping functions.
Also includes some cleanups.

Change-Id: I9dea29b24015f6e6697012a0e4d8983049d8e5c7
Results:
derfraw300: +0.106%
stdhdraw250: +0.139%
2013-09-27 13:57:42 -07:00
Dmitry Kovalev
15a36a0a0d Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10.
Making function name consistent with vp9_short_idct16x16 and
vp9_short_idct16x16_1.

Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517
2013-09-26 14:01:25 -07:00
Guillaume Martres
2b426969c3 Simplify RDMULT and RDDIV derivation
Don't divide RDMULT and RDDIV by 100 when RDMULT > 1000. This was
probably done to avoid overflow when the rd cost was stored in a 32 bits
integer but this is not the case anymore. This change will make it easier
to support multiple quantizers per frame.

derf compression gain at speed 0: 0.037%

Change-Id: Ibeeb9b7cfa1a132a7af41bc90fc07a3bba0857f6
2013-09-26 13:55:16 -07:00
Dmitry Kovalev
eda4e24c0d Using is_inter_block and has_second_ref functions.
Change-Id: I60dee58a4fd24d3c4f3c101a49d30e217309f43a
2013-09-25 19:03:04 -07:00
Guillaume Martres
7755b9dada Merge "Correctly set the segment_id prediction flag and context" 2013-09-25 18:04:21 -07:00
Yaowu Xu
0c02bfcc2a Merge "Limit mv search range for first pass and mbgraph" 2013-09-25 17:21:13 -07:00
Dmitry Kovalev
8266da1cd1 Moving from int_mv* to MV* (3).
Change-Id: I9795d0937bc07793c13d067281995e0750f694d9
2013-09-25 16:44:19 -07:00
Dmitry Kovalev
f9e2140cab Merge "Moving from int_mv* to MV* (2)." 2013-09-25 16:12:13 -07:00
Dmitry Kovalev
64eff7f360 Removing vp9_subpelvar.h from common.
Moving all code from that file to vp9_variace_c.c in the encoder.

Change-Id: Ic803d5b4c78d5191e4d25541b3df97337878fc3e
2013-09-25 16:10:43 -07:00
Dmitry Kovalev
2b5670238b Merge "Replacing txfm with tx." 2013-09-25 15:57:56 -07:00
Dmitry Kovalev
87a214c277 Merge "Adding vp9_get_entropy_contexts function." 2013-09-25 15:43:55 -07:00
Dmitry Kovalev
9cd14ea6ed Merge "Removing redundant 'extern' keyword." 2013-09-25 15:42:48 -07:00
Dmitry Kovalev
d445945a84 Adding vp9_get_entropy_contexts function.
Change-Id: Ife0dd29fb4ad65c7e12ac5f1db8cea4ed81de488
2013-09-24 17:26:05 -07:00
Dmitry Kovalev
d0365c4a2c Replacing txfm with tx.
Renaming txfm_stepdown_count to tx_stepdown_count and max_txfm_size to
max_tx_size.

Change-Id: Ifc173e22c78240e561a57c4c741b64b1b8fc6fef
2013-09-24 17:24:35 -07:00
Dmitry Kovalev
450cbfe53a Cleaning up vp9_update_nmv_count function.
Using best_mv[2] array instead of two separate variables.

Change-Id: Iefa0a41f5c42c42f2c66cef26750da68405f0f25
2013-09-24 15:55:49 -07:00
Dmitry Kovalev
12d57a9409 Removing redundant 'extern' keyword.
Change-Id: Ie51306689c0dc527a8aa12d3984389dd8f360dea
2013-09-24 15:13:09 -07:00
Guillaume Martres
57272e41dd Correctly set the segment_id prediction flag and context
This fix a bug introduced by ac6093d179

Change-Id: I0700a4daf7a6a2471074f81a4596352287fb2ac9
2013-09-24 14:18:27 -07:00
Yaowu Xu
35c5d79e6b Limit mv search range for first pass and mbgraph
Both first pass and mbgraph search use block size 16x16 for motion
estimation. This commit put a limit of motion vector range. The
effective range allows the entire 16x16 with required subpel
interpolation input to be completely outside image border, but
not any further away from image border.

Change-Id: Id70a5ed08be49e70959f064859d72adc7d775d08
2013-09-24 13:47:29 -07:00
Dmitry Kovalev
b87696ac37 Moving from int_mv* to MV* (2).
Updating fractional_mv_step_fp and fractional_mv_step_comp_fp function
types.

Change-Id: I601c4378bc39ac3ffd4e295d9cbd8e1f74829d46
2013-09-24 12:48:12 -07:00
Dmitry Kovalev
30888742f4 Merge "Moving from int_mv to MV." 2013-09-24 12:25:56 -07:00
Yaowu Xu
71cfaaa689 Merge "Replace memcpy with vpx_memcpy" 2013-09-24 11:35:03 -07:00
Yaowu Xu
9be0bb19df Replace memcpy with vpx_memcpy
Also removed obselete comment

Change-Id: Iae1664777d76383639c637ee786e0d50fc45819a
2013-09-24 10:56:06 -07:00
Yaowu Xu
6037f17942 Rename defined constants
The change is to better reflect the nature of the constants.

Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
2013-09-24 10:53:01 -07:00
Yaowu Xu
ff1ae7f713 Prevent using uninitialized value in RD decision
INT64_MAX may be assigned as RDCOST when RDCSOST computation is skipped
for speed, this commit to prevent INT64_MAX from being used as real
RDCOST in transform size decision.

Change-Id: I89a945134191bbdea1f1431ade70424ac079eaac
2013-09-24 10:53:01 -07:00
Yaowu Xu
fe533c9741 Merge "Change to prevent invalid memory access" 2013-09-24 10:37:17 -07:00
Dmitry Kovalev
f24b9b4f87 Merge "Adding best_mv[2] array instead of two variables." 2013-09-24 10:17:53 -07:00
Deb Mukherjee
f1a627e8a2 Merge "Small tweak in the constant quality parameter" 2013-09-24 09:51:08 -07:00
Jingning Han
9bcd750565 Merge "Enable per transformed block zero coeffs forcing" 2013-09-24 09:18:17 -07:00
Jingning Han
24ad692572 Merge "Calculate rd cost per transformed block" 2013-09-24 09:18:03 -07:00
Deb Mukherjee
b7a93578e5 Small tweak in the constant quality parameter
Improves results a little.

Change-Id: I7bcac02dbb65b43a993445cf557c520197114e5c
2013-09-24 09:09:35 -07:00
Yunqing Wang
bacb5925ff Merge "Number of instructions in fdct4_1d_sse2 reduced by two." 2013-09-24 08:40:56 -07:00
Yaowu Xu
92a29c157f Change to prevent invalid memory access
After change of MI context storage , mi_8x8[]  pointer may be null for
a block outside of image border. The commit changes to access the data
only after validation of mi_row and mi_col.

Change-Id: I039c4eb486a228ea9d8e5f35ab9ae6717d718bf3
2013-09-24 08:36:59 -07:00
A.Mahfoodh
13c7715a75 Number of instructions in fdct4_1d_sse2 reduced by two.
Mathematically the results are the same.

Change-Id: I1c5126cd3ca64e8515ca6331e0989c6f7dd651a0
2013-09-23 17:23:27 -07:00
Yaowu Xu
838eae3961 Correct 3 step search site initialziation
39c7b01d accidently reverted the row/col initialization, which broke
mv clamps, which is dependent on the sites for valid motion vector
range. This commit fixed the issue.

Change-Id: Ibcce0226e0360b1ef483fe760b2e33f1af4bf494
2013-09-23 16:11:49 -07:00
Jingning Han
a517343ca3 Enable per transformed block zero coeffs forcing
This commit enables forcing all coefficients zero per transformed
block, when its rate-distortion cost is lower than regular coeff
quantization.

The overall performance improvement (including its parent patch on
calculating rd cost per transformed block) at speed 1:
derf:  0.298%
yt:    0.452%
hd:    0.741%
stdhd: 0.006%

Change-Id: I66005fe0fd7af192c3eba32e02fd6d77952accb5
2013-09-23 10:39:35 -07:00
Jingning Han
54c87058bf Merge "Remove redundant mv_pred use for sub8x8 blocks" 2013-09-23 08:47:21 -07:00
Deb Mukherjee
d11221f433 Improves constant qual, constrained qual turned on
Adds modeled functions to decide the qp for altref frames in constant q
mode similar to other functions in use in bitrate mode.

Also turns on the constrained quality mode (end-usage=2) option which
was turned off before. Basic testing shows the mode works in principle,
to cap bitrate to the target-bitrate specified, while allowing lower
bitrate depending on the cq-level specified. The mode will need to be
improved over time.

Results for constant quality vs bitrate control mode:
derfraw300/fullderfraw: +3.0% at constant quality over bitrate control.
fullstdhdraw: +4.341%
stdhdraw250: +5.361%

Change-Id: If5027c9ec66c8e88d33e47062c6cb84a07b1cda9
2013-09-22 23:04:50 -07:00
Jingning Han
78fbb10642 Calculate rd cost per transformed block
This commit makes the rate-distortion optimization loop evaluate
the rd costs of regular quantization and all zero coeffs, per
transformed block. It improves speed 1 compression performance:

derf: 0.245%
yt:   0.515%

For a large partition that consists multiple transformed blocks,
this allows more flexibility to selectively force a portion of
them coded as all zero coeffs, as well be continued in the next
patches.

Change-Id: I211518be4179747b57375696f017d1160cc91851
2013-09-20 12:40:17 -07:00
Dmitry Kovalev
bb5e2bf86a Adding best_mv[2] array instead of two variables.
Change-Id: I584fe50f73879f6a72fada45714ef80893b6d549
2013-09-20 17:08:53 +04:00
Dmitry Kovalev
e51e7a0e8d Moving from int_mv to MV.
Converting vp9_mv_bit_cost, mv_err_cost, and mvsad_err_cost
functions for now.

Change-Id: I60e3cc20daef773c2adf9a18e30bc85b1c2eb211
2013-09-20 13:52:43 +04:00
Dmitry Kovalev
39c7b01d3c Cleanup in vp9_init3smotion_compensation.
Change-Id: Ie47f53e76bc9530475c8c6d24e9b7a5a0189de56
2013-09-20 12:54:14 +04:00
Dmitry Kovalev
24df77e951 Merge "Adding get_scan_and_band function." 2013-09-20 00:15:06 -07:00
Jingning Han
44b708b4c4 Remove redundant mv_pred use for sub8x8 blocks
The sub8x8 blocks has its own motion vector reference scheme. The
mv_pred is only used blocks of sizes 8x8 and above, to find the
starting point for motion search.

This change does not change any coding behavior. It makes the
encoding process slightly faster. (0.5% speed-up for local test on
speed 1.)

Change-Id: I746ee6ef0eac19aa3621be014afa12be8d82cbb9
2013-09-19 10:32:44 -07:00
Yaowu Xu
79af591368 change to avoid invalid memory read.
The fake token EOSB may cause invaild memory read in pack token, this
commit reworked the loop to avoid such invalid read.

Change-Id: I37fdfce869b44a7f90003f82a02f84c45472a457
2013-09-19 08:22:10 -07:00
Yaowu Xu
014acfa2af fix integer overflow errors
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
2013-09-19 08:14:26 -07:00
Dmitry Kovalev
a23c2a9e7b Adding get_scan_and_band function.
Extracting get_scan_and_band function from get_entropy_context to
remove duplicated code.

Change-Id: I5da1f5a60263017e887da68bc834317b5f084cb2
2013-09-19 16:53:48 +04:00
Dmitry Kovalev
1600707d35 Merge "Removing redundant code from vp9_mcomp.c." 2013-09-19 00:30:18 -07:00
Dmitry Kovalev
cda802ac86 Merge "Removing redundant coef calculation + cleanup." 2013-09-19 00:28:31 -07:00
Dmitry Kovalev
98cf0145b1 Removing redundant coef calculation + cleanup.
Adding temp variable for &x->plane[0], inlining src_diff values.

Change-Id: I24c08a5425a6da6fd66f5b0278f2fce74f9989b2
2013-09-18 16:20:10 +04:00
Dmitry Kovalev
72fd127f8c Removing redundant code from vp9_mcomp.c.
Replacing ((1 << MV_MAX_BITS) - 1) with MV_MAX, adding const
qualifiers, reusing computed values.

Change-Id: I7b46d47f6c644b079d9c3478116a9de465a9baec
2013-09-18 13:11:38 +04:00
Dmitry Kovalev
245ca04bab Fixing typo in the encoder.
Change-Id: I168efdc366eecf638694f357ccad2f4eba7e2fdb
2013-09-18 12:02:22 +04:00
Yaowu Xu
85fd8bdb01 Merge "Silence a bunch of MSVC warnings" 2013-09-17 17:10:58 -07:00
Jingning Han
c437bbcde0 Clean up second ref check in sub8x8 rd loop
This commit cleans up the second reference check in the
rate-distortion optimization loop of sub8x8 blocks.

Change-Id: Ife68feaa4cddbfad2878c9b44d3012788d634f97
2013-09-17 15:59:49 -07:00
Yaowu Xu
a783da80e7 Silence a bunch of MSVC warnings
Change-Id: I16633269582a640809dca27572bbe99efa6369fc
2013-09-17 12:08:51 -07:00
Paul Wilkins
84758960db Merge "Minor clean up." 2013-09-17 03:39:24 -07:00
Paul Wilkins
90a52694f3 Merge "Adjustment to mode_skip_start." 2013-09-17 03:39:15 -07:00
Yaowu Xu
eeae6f946d fix a problem where an invalid mv used in search
The commit added reset of pred_mv at the beginning of each SB64x64
partition mv search, also limited the usage of pred_mv only when
search on the largest partition is already done. This is to fix
a crash at speed 1/2 encoder where an invalid mv is used in mv
search.

Change-Id: I39010177da76d054e3c90b7899a44feb2e3a5b1b
2013-09-16 12:49:27 -07:00
Paul Wilkins
cb50dc7f33 Minor clean up.
Removed some unused code and minor cleanup
/ reordering.

Change-Id: I4083ae56aeb8edfe9b85aa2f42a16aa28d19da94
2013-09-16 13:45:20 +01:00
Paul Wilkins
3b01778450 Adjustment to mode_skip_start.
Corrected values relating to modified mode order.

Change-Id: I24fccba3af4bc16721d5e7e51888a66305bfa7fe
2013-09-16 13:44:48 +01:00
Jingning Han
e8a967d960 Merge "Adaptive motion search control" 2013-09-13 14:43:23 -07:00
Jingning Han
c4826c5941 Adaptive motion search control
This commit enables adaptive constraint on motion search range for
smaller partitions, given the motion vectors of collocated larger
partition as a candidate initial search point.

It makes speed 0 runtime of bus at CIF and 2000 kbps goes from
167s down to 162s (3% speed-up), at 0.01dB performance gains. In
the settings of speed 1, this makes the runtime goes from 33687 ms
to 32142 ms (4.5% speed-up), at 0.03dB performance gains.

Compression performance wise, it gains at speed 1:
derf  0.118%
yt    0.237%
hd    0.203%
stdhd 0.438%

Change-Id: Ic8b34c67810d9504a9579bef2825d3fa54b69454
2013-09-13 13:58:10 -07:00
Deb Mukherjee
0c3038234d Merge "Clean up of the search best filter speed feature" 2013-09-13 11:03:59 -07:00
Paul Wilkins
5d8642354e Merge "Fix VP9_mode_order[]" 2013-09-13 09:19:31 -07:00
Scott LaVarnway
8fc95a1b11 Merge "New mode_info_context storage -- undo revert" 2013-09-13 08:56:20 -07:00
Paul Wilkins
1407cf8588 Fix VP9_mode_order[]
Mis-merge of the following change managed to break mode order
and delete two mode options (new alt ref and near alt ref)
It also created a situation where we could test two undefined
modes off the end of the VP9_mode_order[] data structure.
  "clang warnings : remove split and i4x4_pred fake modes"
  "Change Id: I8ef3c*"

Initial testing on Akiyo at speed 2.
101.35	 44.567	 44.447 improves to
96.82	 44.915	 44.815

Approx 0.3-0.4db gain and 2.5% size reduction

Change-Id: Icff813e7c0778d140ad4f0eea18cf1ed203c4e34
2013-09-13 13:33:26 +01:00
Jim Bankoski
324ebb704a Merge "fix clang warning in rdopt" 2013-09-12 16:39:05 -07:00
Jim Bankoski
9ee9918dad fix clang warning in rdopt
either missed this or it crept back in

Change-Id: I6cc1519d09e558be7250254c25bde2ae720555ea
2013-09-12 06:39:42 -07:00
Jim Bankoski
cddde51ec5 Merge "clang warnings : remove split and i4x4_pred fake modes" 2013-09-12 06:20:45 -07:00
Paul Wilkins
66755abff4 Merge "Changes in speed 2 settings" 2013-09-12 02:22:45 -07:00
Jim Bankoski
7fb42d909e clang warnings : remove split and i4x4_pred fake modes
Change-Id: I8ef3c7c0f08f0f1f4ccb8ea4deca4cd8143526ee
2013-09-11 16:34:55 -07:00
Deb Mukherjee
b964646756 Clean up of the search best filter speed feature
Removes this speed feature since it is very slow and unlikely
to be used in practice. This cleanup removes a bunch of unnecessary
complications in the outer encode loop.

Change-Id: I3c66ef1ca924fbfad7dadff297c9e7f652d308a1
2013-09-11 15:16:36 -07:00
Jim Bankoski
d09abfa9f7 Merge "resolve clang issue : implicit convert tx_mode -> tx_size" 2013-09-11 13:40:11 -07:00
Deb Mukherjee
69fe840ec4 Changes in speed 2 settings
Propose some changes to the speed 2 settings to improve quality.
In particular, turns off the adjust_thresholds_by_speed feature
which improves results by 6%. Also removes the code for
adjust_thresholds_by_speed since it conflicts with the adaptive
rd thresh feature.

Overall, with this change speed 2 is -15.2% from speed 0 settings,
on derf, which is significantly better than -21.6% down before.

Change-Id: I6e90a563470979eb0c258ec32d6183ed7ce9a505
2013-09-11 10:54:07 -07:00
Scott LaVarnway
ac6093d179 New mode_info_context storage -- undo revert
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now constists of pointers to MODE_INFO structs.  The
MODE_INFO structs are now stored as a stream (decoder only),
eliminating unnecessary copies and is a little more cache
friendly.

Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
2013-09-11 13:45:44 -04:00
Jingning Han
65fe7d7605 Merge "Remove redundant condition check in 32x32 quant" 2013-09-10 16:39:18 -07:00
Jingning Han
cb24406da5 Merge "Remove the use of uninitialized_safe in encode_sb_" 2013-09-10 12:05:22 -07:00
Jingning Han
5d93feb6ad Remove redundant condition check in 32x32 quant
The c code implementation of 32x32 quantization does the zbin check
of all coefficients prior to the quant/dequant loop, hence removing
the redundant zbin check inside the loop. This only affects the
c code version. SSSE3 version does not separate the zbin check out.

Change-Id: Ic197a7d61d0b25fcac3cc092987651378cb56e4e
2013-09-10 12:04:33 -07:00
Deb Mukherjee
3d22d3ae0c Merge "Small tweaks on the constant quality mode" 2013-09-10 11:16:47 -07:00
Deb Mukherjee
09830aa0ea Small tweaks on the constant quality mode
Improves results a little.
derf is now +1.078% over bitrate control.

Change-Id: I4812136f3e67be21d14ec089419976a32a841785
2013-09-10 10:16:19 -07:00
Yunqing Wang
0607abc3dd Stop partition checking when distortion is small
If the current obtained distortion is very small, which happens
for static image case, we pick the current partition type without
further split checking.

This won't affect regular videos. For static videos, we got 10%~12%
encoding speed gain. PSNR was better for some clips, and worse for
others. Overall it was even.

Change-Id: If787a57bedf46fc595ca4f5ded2b0c0a69e9fdef
2013-09-10 10:13:24 -07:00
Yunqing Wang
939791a129 Modify encode breakout for static frames
Thank Paul for the suggestions. While turning on static-thresh
for static-image videos, a big jump on bitrate was seen. In this
patch, we detected static frames in the video using first-pass
stats. For different cases, disable encode breakout or reduce
encode breakout threshold to limit the skipping.

More modification need be done to break incorrect partition
picking pattern for static frames while skipping happens.

Change-Id: Ia25f47041af0f04e229c70a0185e12b0ffa6047f
2013-09-10 09:06:03 -07:00
Paul Wilkins
4f660cc018 Modified mode skip functionality.
A previous speed feature skipped modes not used in earlier
partitions but this not longer worked as intended following
changes to the partition coding order and in conjunction
with some other speed features (Especially speed 2 and above).

This modified mode skip feature sets a mask after the first X
modes have been tested in each partition depending on the
reference frame of the current best case.

This patch also makes some changes to the order modes are
tested to fit better with this skip functionality.

Initial testing suggests speed and rd hit count improvements
of up to 20% at speed 1. Quality results. (derf -1.9%, std hd  +0.23%).

Change-Id: Idd8efa656cbc0c28f06d09690984c1f18b1115e1
2013-09-10 13:30:10 +01:00
Paul Wilkins
901c495482 Added extra check to rd_auto_partition_range()
Added check that the returned max and minimum are
valid in bottom and right border cases.

Change-Id: I2d6cdc9b5f04c7d0ff512ddcf3228331e028bf9b
2013-09-10 13:29:23 +01:00
Ivan Maltz
20abe595ec Merge "API extensions and sample app for spacial scalable encoder" 2013-09-09 16:57:01 -07:00
Ivan Maltz
01b35c3c16 API extensions and sample app for spacial scalable encoder
Sample app: vp9_spatial_scalable_encoder
vpx_codec_control extensions:
  VP9E_SET_SVC
  VP9E_SET_WIDTH, VP9E_SET_HEIGHT, VP9E_SET_LAYER
  VP9E_SET_MIN_Q, VP9E_SET_MAX_Q
expanded buffer size for vp9_convolve

modified setting of initial width in vp9_onyx_if.c so that layer size
can be set prior to initial encode

Default number of layers set to 3 (VPX_SS_DEFAULT_LAYERS)
Number of layers set explicitly in vpx_codec_enc_cfg.ss_number_layers

Change-Id: I2c7a6fe6d665113671337032f7ad032430ac4197
2013-09-09 15:57:56 -07:00
Jingning Han
18c780a0ff Remove the use of uninitialized_safe in encode_sb_
Initialize the probability model context with default value in
encode_sb.

Change-Id: Id826114024dfc21c7ef41aea9f4a0316d4a5cb95
2013-09-09 15:41:16 -07:00
James Zern
c1913c9cf4 Merge "Revert "New mode_info_context storage"" 2013-09-09 14:38:01 -07:00
James Zern
54a03e20dd Revert "New mode_info_context storage"
This reverts commit dae17734ec

Encode crashes, leaks and increases integer overflow errors.

Change-Id: I595aa2649bb8d0b6552ff91652837a74c103fda2
2013-09-09 13:37:01 -07:00
Paul Wilkins
740acd6891 Merge "Enable kf restrictions at speed 4" 2013-09-09 05:39:13 -07:00
Jim Bankoski
9faa7e8186 resolve clang issue : implicit convert tx_mode -> tx_size
Change-Id: Ifc9da470358f58e800e3d0d70a565b61e5f7834a
2013-09-08 07:17:12 -07:00
Jim Bankoski
e378566060 Merge "New mode_info_context storage" 2013-09-08 07:16:25 -07:00
Jingning Han
09bc942b47 Fix overflow issue in 16x16 quantization SSSE3
The 16x16 transform unit test suggested that the peak coefficient
value can reach 32639. This could cause potential overflow issue
in the SSSE3 implmentation of 16x16 block quantization. This commit
fixes this issue by replacing addition with saturated addition.

Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e
2013-09-06 21:06:10 -07:00
Paul Wilkins
f15cdc7451 Enable kf restrictions at speed 4
Change-Id: I453409d3be3f5fe118b15affde45cb52184aef20
2013-09-06 11:16:04 -07:00
Deb Mukherjee
e378a89bd6 Support a constant quality mode in VP9
Adds a new end-usage option for constant quality encoding in vpx. This
first version implemented for VP9, encodes all regular inter frames
using the quality specified in the --cq-level= option, while encoding
all key frames and golden/altref frames at a quality better than that.

The current performance on derfraw300 is +0.910% up from bitrate control,
but achieved without multiple recode loops per frame.

The decision for qp for each altref/golden/key frame will be improved
in subsequent patches based on better use of stats from the first pass.
Further, the qp for regular inter frames may also be varied around the
provided cq-level.

Change-Id: I6c4a2a68563679d60e0616ebcb11698578615fb3
2013-09-06 10:30:53 -07:00
Scott LaVarnway
dae17734ec New mode_info_context storage
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now constists of a pointer to a MODE_INFO struct and
a "in the image" flag.  The MODE_INFO structs are now stored
as a stream, eliminating unnecessary copies and is a little
more cache friendly.

For the test clips used, the decoder performance improved
by ~4.3% (1080p) and ~9.7% (720p).

Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
and 5.9% (720p).

Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
2013-09-06 12:33:34 -04:00
Jingning Han
1c263d6918 Merge "Use saturated addition in SSSE3 of 32x32 quant" 2013-09-05 14:09:40 -07:00
Jingning Han
458c2833c0 Use saturated addition in SSSE3 of 32x32 quant
The 32x32 forward transform can potentially reach peak coefficient
value close to 32700, while the rounding factor can go upto 610.
This could cause overflow issue in the SSSE3 implementation of 32x32
quantization process.

This commit resolves this issue by replacing the addition operations
with saturated addition operations in 32x32 block quantization.

Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70
2013-09-05 12:49:12 -07:00
Jim Bankoski
9fc3d32a50 Merge "faster accounting of inc_mv" 2013-09-05 12:38:56 -07:00
Paul Wilkins
e5deed06c0 Merge "Attempt to fix speed 4" 2013-09-04 17:19:22 -07:00
Jim Bankoski
bb2313db28 Merge "make vp9 postproc a config option" 2013-09-04 10:35:26 -07:00
Yunqing Wang
9fd2767200 Merge "Use correct bit cost while static-thresh is on" 2013-09-04 10:26:37 -07:00
Jim Bankoski
79401542f7 make vp9 postproc a config option
Vp9 postproc is disabled for now as its not been shown to help and
may be merged with vp8.

Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-09-04 10:02:08 -07:00
Jim Bankoski
532179e845 faster accounting of inc_mv
Moves counting of mv branches to where we have a new mv, instead of after
the whole frame is summed.

Change-Id: I945d9f6d9199ba2443fe816c92d5849340d17bbd
2013-09-04 09:47:57 -07:00
Paul Wilkins
49317cddad Attempt to fix speed 4
Speed 4 fixed partition size. Use fixed size unless it does not
fit inside image, in which case use the largest size that does.

Change-Id: I250f7a80506750dd82ab355721624a1344247223
2013-09-03 17:46:25 +01:00
Jingning Han
010c0ad0eb Merge "Fix 32x32 forward transform SSE2 version" 2013-09-03 08:58:03 -07:00
Jingning Han
3cf46fa591 Fix 32x32 forward transform SSE2 version
This commit fixed the potential overflow issue in the SSE2
implementation of 32x32 forward DCT. It resolved the corrupted
coded frames in the border of scenes.

Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
2013-08-31 18:47:08 -07:00
Yunqing Wang
0ca7855f67 Use correct bit cost while static-thresh is on
While static-thresh is on, we only need to transmit skip
flag if skip = 1. The cost of skip bit is added to the
total rate cost.

Change-Id: I64e73e482bc297eba22907026298a15fa8cc3920
2013-08-30 15:25:13 -07:00
Paul Wilkins
2b9baca4f0 Merge "Added per pixel inter rd hit count stats" 2013-08-30 08:56:01 -07:00
Jingning Han
c86c5443eb Merge "Fix overflow issue in SSSE3 32x32 quantization" 2013-08-29 16:49:04 -07:00
Paul Wilkins
1f4bf79d65 Added per pixel inter rd hit count stats
Added some code to output normalized rd hit count stats.
In effect this approximates to the average number of rd
operations/tests per pixel for the sequence.

The results are not quite accurate and I have not bothered
to account for partial SB64s at frame edges and for key frames
However they do give some idea of the number of modes /
prediction methods being tested for each pixel across the
different partition sizes. This indicates how much scope their
is for further gains either by reducing the number of partitions
examined or the modes per partition through heuristics.

Patch 3 moved place where count incremented so partial rd
tests that are aborted with INT_MAX return are also counted.

Example numbers for first 50 frames of Akiyo.
Speed 0 ~84.4 rd operations / pixel
Speed 1 ~28.8
Speed 2 ~11.9

Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
2013-08-30 00:13:51 +01:00
Deb Mukherjee
b6dbf11ed5 Merge "Adds a speed feature for fast 1-loop forw updates" 2013-08-29 15:54:04 -07:00
James Zern
e83e8f0426 Merge changes Ib1e853f9,Ifd75c809,If3e83404
* changes:
  consistently name VP9_COMMON variables #3
  consistently name VP9_COMMON variables #2
  consistently name VP9_COMMON variables #1
2013-08-29 15:50:56 -07:00
Yaowu Xu
ee961599e1 Merge "Fixed potential overflows" 2013-08-29 15:43:26 -07:00
James Zern
d765df2796 consistently name VP9_COMMON variables #3
stragglers

Change-Id: Ib1e853f9a331b7b66639dc34d79568d84d1930f1
2013-08-29 13:27:41 -07:00
James Zern
924d74516a consistently name VP9_COMMON variables #1
pc -> cm

Change-Id: If3e83404f574316fdd3b9aace2487b64efdb66f3
2013-08-29 13:25:57 -07:00
Dmitry Kovalev
e80bf802a9 Merge "Renaming txfm_size to tx_size." 2013-08-29 12:30:18 -07:00
Jingning Han
abff678866 Fix overflow issue in SSSE3 32x32 quantization
The 32x32 quantization process can potentially have the intermediate
stacks over 16-bit range, thereby causing enc/dec mismatch. This commit
fixes this overflow issue in the SSSE3 implementation, as well as the
prototype, of 32x32 quantization.

This fixes issue 607 from webm@googlecode.

Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
2013-08-29 11:00:54 -07:00
Yaowu Xu
aaa7b44460 Fixed potential overflows
The two arrays are typically initialized to INT64_MAX, if they are not
filled with valid values before the addition, the values can overflow
and lead to wrong results.

Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
2013-08-29 10:26:52 -07:00
Dmitry Kovalev
b62ddd5f8b General code cleanup.
Switching from mi_{width, height}_log2 and b_{width, height}_log2 to
num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing
redundant code, adding const.

Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
2013-08-28 12:22:37 -07:00
Deb Mukherjee
e02dc84c1a Adds a speed feature for fast 1-loop forw updates
Incorporates a speed feature for fast forward updates of
coefficients. This feature takes 3 values:
0 - use standard 2-loop version
1 - use a 1-loop version
2 - use a 1-loop version with reduced updates

Results: derfraw300 +0.007% (on speed 0) at feature value = 1
                    -0.160% (on speed 0) at feature value = 2

There is substantial speed up at speeds 2 and above for low
resolution sequences where the entropy updates are a big part
of the overall computations.

Change-Id: Ie96fc50777088a5bd441288bca6111e43d03bcae
2013-08-28 10:56:52 -07:00
Dmitry Kovalev
851a2fd72c Renaming txfm_size to tx_size.
Change-Id: I752e374867d459960995b24d197301d65ad535e3
2013-08-27 19:47:53 -07:00
Jingning Han
eb7acb5524 Merge "Fix buf alignment in sub8x8 comp inter-inter pred" 2013-08-27 19:03:12 -07:00
Dmitry Kovalev
a93992e725 Adding get_entropy_context function.
Moving common code from encoder and decoder to this function.

Change-Id: I60fa643fb1ddf7ebbff5e83b6c4710137b0195ef
2013-08-27 14:17:53 -07:00
Dmitry Kovalev
7b95f9bf39 Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder.
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
2013-08-27 11:05:08 -07:00
Dmitry Kovalev
ba10aed86d Merge "Using num_8x8_* lookup tables instead of mi_*_log2." 2013-08-27 10:49:36 -07:00
Dmitry Kovalev
f389ca2acc Merge "Cleaning up model_rd_for_sb_y_tx." 2013-08-27 10:17:10 -07:00
Dmitry Kovalev
78e670fcf8 Merge "Renaming D27 to D207." 2013-08-27 10:03:57 -07:00
Jingning Han
2d6aadd7e2 Fix buf alignment in sub8x8 comp inter-inter pred
This commit resolved a mis-alignment issue in compound inter-inter
prediction of sub8x8. This patch follows solution from dkovalev@.

Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932
2013-08-27 09:28:05 -07:00
Yaowu Xu
9482c07953 fixed the reading too many bytes
In subpel_avg_variance functions, code similar to the following

punpkldq m2, [addr]

actually reads 8 bytes. For functions that are supposed to work on
buffers only have less 8 bytes a line, this caused valgrind error
of reading uninitialized memory.

Change-Id: I2a4c079dbdbc747829bd9e2ed85f0018ad2a3a34
2013-08-27 08:39:20 -07:00
Dmitry Kovalev
657ee2d719 Cleaning up model_rd_for_sb_y_tx.
Removing references to plane_block_width and plane_block_height (we are
going to delete the latter ones).

Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287
2013-08-26 16:18:28 -07:00
Dmitry Kovalev
b25589c6bb Using num_8x8_* lookup tables instead of mi_*_log2.
Change-Id: I8a246b3d056c98be614d05a90bc261e2441ffc10
2013-08-26 14:22:54 -07:00
Yaowu Xu
4505e8accb Merge "Fix the reading of too many input pixels" 2013-08-26 14:01:50 -07:00
Paul Wilkins
aa823f8667 Merge "Changes to adaptive inter rd thresholds." 2013-08-26 12:48:11 -07:00
Yaowu Xu
6c5433c836 Fix the reading of too many input pixels
in VP9_get4x4var_mmx

Change-Id: I4b4a8f45f25ebdfad281f169cc87aba5e2d6f227
2013-08-26 12:35:27 -07:00
Paul Wilkins
642696b678 Merge "Limit Key frame Intra modes checks." 2013-08-26 12:34:56 -07:00
James Zern
c8ba8c513c cosmetics: strip 'VP9_' from defines in vp9 only code
Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c
2013-08-23 19:16:49 -07:00
Dmitry Kovalev
50ee61db4c Renaming D27 to D207.
I've already renamed d27_predictor to d207_predictor but forgot about the
corresponding constant.

Change-Id: Id312aa80fc5b5a1ab8a709a33418a029552a6857
2013-08-23 17:33:48 -07:00
Dmitry Kovalev
e6c435b506 Merge "Cleanup in mvref_common.{h, c}." 2013-08-23 17:09:49 -07:00
Yaowu Xu
13930cf569 Limit mv range to be based on partition size
Previous change c4048dbd limits the mv search range assuming max block
size of 64x64, this commit change the search range using actual block
size instead.

Change-Id: Ibe07ab02b62bf64bd9f8675d2b997af20a2c7e11
2013-08-23 15:43:57 -07:00
Yaowu Xu
8e04257bc5 Merge "Added border extension" 2013-08-23 14:43:58 -07:00
Dmitry Kovalev
21d8e8590b Cleanup in mvref_common.{h, c}.
Making code more compact, adding consts, removing redundant arguments,
adding do/while(0) for macros.

Change-Id: Ic9ec0bc58cee0910a5450b7fb8cfbf35fa9d0d16
2013-08-23 12:00:30 -07:00
Yaowu Xu
656632b776 Added border extension
To the source buffer to be encoded as an alt ref frame. This is to fix
the problem of using uninitialized memory in encoder.

See https://code.google.com/p/webm/issues/detail?id=605

Change-Id: I97618a2fc207e08abcf5301b734aa9e3ad695e2c
2013-08-23 11:31:28 -07:00
Dmitry Kovalev
1c159c470a Merge "Checking scale factors on access." 2013-08-23 11:05:17 -07:00
Paul Wilkins
aa5b67add0 Changes to adaptive inter rd thresholds.
Values now carried over frame to frame.
Change to algorithm for decreasing threshold after
a hit and to max threshold (now based on speed)

Removed some old commented out code relating to
VP8 adaptive thresholds.

The impact of these changes tested on Akiyo (50 frames)
and measured in terms of unit rd hits is as follows:

Speed 0 84.36 -> 84.67
Speed 1 29.48 -> 22.22
Speed 2 11.76 -> 8.21
Speed 3 12.32 -> 7.21

Encode speed impact is broadly in line with these.

Change-Id: I5b886efee3077a11553fa950d796fd6d00c8cb19
2013-08-23 16:18:45 +01:00
Paul Wilkins
f76f52df61 Limit Key frame Intra modes checks.
Most of the focus so far has been on inter frames.

At high speed settings the key frame is now taking a high %
of the cycles.

This patch puts in some masking to reduce the number
of INTRA modes searched during key frame coding (as already
happens for inter frames) at higher speed settings

TODO: Develop this further with either adaptive rd thresholds
when choosing which intra modes to consider or some other
heuristic.

Impact.
At high speed settings on some clips the key frame was starting
to dominate. In a coding of the first 50 frames of AKIYO at speed
2 limiting the key frame intra modes to DC or TM_PRED resulted in
~30% overall speedup. For Bus the number was lower at ~4-5%.

Change-Id: I7bde68aee04995f9d9beb13a1902143112e341e2
2013-08-23 16:10:30 +01:00
Jingning Han
9655c2c7a6 Merge "Fix rectangular partition check flag" 2013-08-22 18:59:18 -07:00
Dmitry Kovalev
33104cdd42 Merge "vp9_encodeframe.c cleanup." 2013-08-22 18:07:35 -07:00
James Zern
711aff9d9d Merge "vp9/encoder: fix last_frame_seg_map mem leak" 2013-08-22 18:04:03 -07:00
James Zern
d843ac5132 Merge "rename LOG2_* defines to *_LOG2" 2013-08-22 18:02:42 -07:00
Jingning Han
84f3b76e1c Fix rectangular partition check flag
Put rectangular partition check flag change according to the rd
costs of NONE and SPLIT partition types under the speed feature.

Change-Id: If681e1e078a8d43d86961ea4b748da5cd1b6c331
2013-08-22 17:15:01 -07:00
Dmitry Kovalev
604022d40b vp9_encodeframe.c cleanup.
Removing unused get_sbuv_perpixel_variance function, using has_second_ref/
is_inter_block functions, organizing includes.

Change-Id: I016de4af12fbbb8b4ece26a70759b2392651b095
2013-08-22 15:50:51 -07:00
Dmitry Kovalev
335b1d360b check_bsize_coverage cleanup.
Change-Id: Ib7803857b35c00e317c9deb8630e777e25eb278f
2013-08-22 15:45:56 -07:00
Dmitry Kovalev
3c42657207 Checking scale factors on access.
It is possible to have invalid scale factors and not access them
during decoding. Error is reported if we really try to use invalid scale
factors.

Change-Id: Ie532d3ea7325ee0c7a6ada08269f804350c80fdf
2013-08-22 15:19:05 -07:00
James Zern
40ae02c247 rename LOG2_* defines to *_LOG2
gets rid of a mix of styles

Change-Id: I3591d312157bc6f53a25438bf047765c671fd8a8
2013-08-22 14:45:24 -07:00