Commit Graph

3397 Commits

Author SHA1 Message Date
Dmitry Kovalev
6d3db91d3b Merge "Cleaning up foreach_predicted_block_in_plane() function." 2013-10-07 11:30:45 -07:00
Adrian Grange
18a2617126 Merge "cpplint issues resolved vp9_ratectrl.c" 2013-10-07 10:54:17 -07:00
Jim Bankoski
31b7a912d1 cpplint issues resolved vp9_ratectrl.c
Change-Id: Iae7674b0c946a5ac01617840b3f62965c654d920
2013-10-07 09:21:29 -07:00
Jim Bankoski
92519a005a Merge "cpplint problems resolved with vp9_firstpass.c" 2013-10-07 09:16:46 -07:00
Jim Bankoski
ccc5a483f4 Merge "cpplint issues resolved in vp9_mcomp.c" 2013-10-07 09:14:35 -07:00
Scott LaVarnway
a2a3b4a479 d153 intra prediction (32x32) ssse3 using bytes
Change-Id: Ie2c0d84ff9f6294084d65f4380e1f30c09e681c9
2013-10-07 11:21:10 -04:00
Paul Wilkins
65f0cc7f4b Disable MODE_TEST_HIT_STATS
This flag is for stats generation and testing and should not
be checked in as enabled by default.

Change-Id: I4ea57dbcf49790f14777f598ddd3dc37dcc7a6bb
2013-10-07 02:54:19 -07:00
James Zern
879e21ddfd vp9_blockd.h: update get_tx_eob() signature
as the name implies, the segmentation pointer can be const

Change-Id: I945f01a077c112ec86c00e35a1e9395bc230c2d9
2013-10-07 11:45:16 +02:00
Paul Wilkins
950058765d Fix MSVC warning.
A new set of MSVC warnings were introduced by change
I3f36d3f7cd8d15195a6e2fafd1777cdaf9ecb847

In particular, MSVC does not like:

typedef const int16_t subpel_kernel[SUBPEL_TAPS];

struct subpix_fn_table {
  const subpel_kernel *filter_x;
  const subpel_kernel *filter_y;
};

which causes a new warning in MSVC:
warning C4114: same type qualifier used more than once

Change-Id: Iae596fd13aadf36169faf00c68eabe9a32a9b156
2013-10-07 02:26:44 -07:00
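One way to avoid C4114 in a declaration like the one quoted above is to leave const off the typedef, so that each use site applies the qualifier exactly once. The sketch below assumes this approach; the actual fix in this change may differ. The value of SUBPEL_TAPS, the table example_kernels, and the helper kernel_sum are illustrative and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SUBPEL_TAPS 8 /* assumed value, for illustration only */

/* No const here: the qualifier is added once, where the type is used. */
typedef int16_t subpel_kernel[SUBPEL_TAPS];

struct subpix_fn_table {
  const subpel_kernel *filter_x; /* const appears a single time */
  const subpel_kernel *filter_y;
};

/* Hypothetical two-entry kernel table; the tap values are illustrative. */
static const subpel_kernel example_kernels[2] = {
  { 0, 0, 0, 128, 0, 0, 0, 0 },
  { 0, 1, -5, 126, 8, -3, 1, 0 }
};

/* Hypothetical helper: sums the taps of kernel idx from filter_x. */
static int kernel_sum(const struct subpix_fn_table *t, int idx) {
  int i, sum = 0;
  assert(t != 0 && idx >= 0);
  for (i = 0; i < SUBPEL_TAPS; ++i) sum += t->filter_x[idx][i];
  return sum;
}
```

Because const on an array type qualifies the elements, a table declared as const subpel_kernel[] can still be assigned to filter_x and filter_y without a cast.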
Jim Bankoski
bf893e84bd Merge changes I8a106dd6,Iec442603
* changes:
  d153 intra prediction (16x16) ssse3 using bytes
  d153 intra prediction ssse3 using bytes
2013-10-06 20:11:24 -07:00
Dmitry Kovalev
c6ad70d5f1 Giving consistent names to IDCT 8x8 functions.
Renames:
  vp9_short_idct8x8_add    -> vp9_idct8x8_64_add
  vp9_short_idct8x8_1_add  -> vp9_idct8x8_1_add
  vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add
  vp9_idct_add_8x8         -> vp9_idct8x8_add

Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
2013-10-06 00:24:09 -07:00
Dmitry Kovalev
5c0b108639 Merge "Adding assign_mv() function to reduce code duplication." 2013-10-05 23:44:59 -07:00
Dmitry Kovalev
9dba044be2 Merge "Giving consistent names to IDCT/IWHT functions." 2013-10-05 23:44:05 -07:00
Jim Bankoski
a5db3967ea Merge "encodemb cpplint issues revisited." 2013-10-05 18:16:01 -07:00
Jim Bankoski
7edc5ac42f NOLINT issue with headers that's hard to avoid due to config.h issue
Change-Id: Ibd0b3414cdea05bc2fd6d0aa35808e44b3db8d96
2013-10-05 17:32:43 -07:00
Jim Bankoski
44228663f1 remaining cpplint issue in vp9_decode_frame
Change-Id: Ia3030882c5276dc1f8e6b6c82b9eb301f00b6bbc
2013-10-05 17:30:34 -07:00
Jim Bankoski
bf21ce63ee encodemb cpplint issues revisited.
Change-Id: Id5f25b74e2207bf44b6f6c8ffe548fa30fd78b4d
2013-10-05 17:24:51 -07:00
Jim Bankoski
30dee8adfc cpplint problems resolved with vp9_firstpass.c
Change-Id: Ic7b7014a0d857585bfd4baaea1d5c27ffe355642
2013-10-05 17:10:54 -07:00
Jim Bankoski
c9f3f9ed70 Merge "unused typedef in vp9_variance.h" 2013-10-05 16:49:13 -07:00
Jim Bankoski
7fd13472ae Merge "cpplint issues with vp9_boolhuff.c resolved" 2013-10-05 16:48:28 -07:00
Jim Bankoski
f59cb3eacc Merge "added nolint to function that doesn't seem easy to breakup" 2013-10-05 16:47:23 -07:00
Jim Bankoski
4410bbbf88 Merge "cpplint issues in vp9_lookahead.c" 2013-10-05 16:46:11 -07:00
Jim Bankoski
b79b7c354d cpplint issues resolved in vp9_mcomp.c
Change-Id: I2c2f83f4dfa2782fc6b0aa6db3ba2c4e6e423ffa
2013-10-05 16:44:40 -07:00
Jim Bankoski
6a7b1fb754 Merge changes Idbfabe42,I788f1a30
* changes:
  cpplint issues resolved in vp9_variance_mmx.c
  cpplint issues in vp9_ssim.c
2013-10-05 16:32:50 -07:00
Jim Bankoski
2dba2eb46a Merge "cpplint issues in vp9_picklpf.c" 2013-10-05 16:32:00 -07:00
Jim Bankoski
c4697a6690 Merge "cpplint issues resolved vp9/vp9_cx_iface.c" 2013-10-05 16:31:50 -07:00
James Zern
557862d152 vp9_receive_compressed_data: remove unnecessary indent
+ useless comment

Change-Id: Ied29a4cc8c506b216968ce67af630bae542aca12
2013-10-05 12:10:38 +02:00
Jingning Han
0d0ed6a29b Allow sub8x8 intra modes test for alt frame coding
This commit allows sub8x8 intra modes test in the rate-distortion
loop for hd sequences in speed 1 and 2.

For sequence y90n of hd set at 8000 kbps, speed 2 runtime goes
from 207s to 210s. For ped_1080p at 3000 kbps, speed 2 runtime goes
from 336s to 337s. Both are running with 300 frames.

This improves compression performance by 0.24% for stdhd and 0.32%
for hd.

Change-Id: I173ca38a6411565ae6cfadd184c42b2070c5de1f
2013-10-04 19:13:00 -07:00
Jim Bankoski
0500cf429f cpplint issues with vp9_boolhuff.c resolved
Change-Id: I6990c9ab838323d8770dd1f49a25bf3acc4c05c7
2013-10-04 17:20:58 -07:00
Jim Bankoski
a36045fb3b Merge "cpplint issues with vp9_temporal_filter.c" 2013-10-04 17:17:02 -07:00
Jim Bankoski
fa7dbab3fe cpplint issues resolved vp9/vp9_cx_iface.c
Change-Id: I4f66d6f1aebe7d47ad01cda9b03c600725240680
2013-10-04 17:16:20 -07:00
Jim Bankoski
cac3e1588e cpplint issues in vp9_picklpf.c
Change-Id: I62e631ca95fefbb1a993479a5e3926dc81359fe7
2013-10-04 17:08:41 -07:00
Jim Bankoski
eead4bb89e Merge "lint issue in vp9_psnr.c" 2013-10-04 16:42:30 -07:00
Jim Bankoski
e2d73897d0 Merge "vp9_encodeframe.c cpplint issues resolved" 2013-10-04 16:42:06 -07:00
Jim Bankoski
6e161a26e3 Merge "cpp lint issues resolved in vp9_encodeintra.c" 2013-10-04 16:41:58 -07:00
Jim Bankoski
5f80d2ad33 Merge "cpplint vp9_dct.c issues resolved" 2013-10-04 16:41:46 -07:00
Jim Bankoski
38f6a3cdc7 Merge "cpplint issues vp9_tokenize.c resolved" 2013-10-04 16:41:23 -07:00
Dmitry Kovalev
ee74054e81 Cleaning up foreach_predicted_block_in_plane() function.
Change-Id: Ibb3d9667eba56621667412f62097aa7a392659c2
2013-10-04 15:53:32 -07:00
Jim Bankoski
d07545b7b8 cpplint issues with vp9_temporal_filter.c
Change-Id: I695a990689c79d160227975116125b140875aed1
2013-10-04 15:49:30 -07:00
Dmitry Kovalev
56acf7e528 Merge "Adding vp9_get_filter_kernel() function." 2013-10-04 15:21:39 -07:00
Yaowu Xu
d129eea9fa Merge "Further clean up of speed 4" 2013-10-04 14:45:21 -07:00
Jim Bankoski
de5cb8b140 vp9_encodeframe.c cpplint issues resolved
Change-Id: Id9d837e062d9c4a94def4b4ed1f49a67c75d3618
2013-10-04 14:37:31 -07:00
Jim Bankoski
02f28bac29 cpp lint issues resolved in vp9_encodeintra.c
Change-Id: Ib6a8360d24f44eeaec12c5055568382a105dc235
2013-10-04 14:35:01 -07:00
Jim Bankoski
9c2b3744c9 cpplint issues in vp9_lookahead.c
Change-Id: I2a98995f0df77d99dc47bda5e41886f014d8843f
2013-10-04 14:24:19 -07:00
Jim Bankoski
5b4f836148 cpplint issues resolved in vp9_variance_mmx.c
Change-Id: Idbfabe427fbeab44210f13fec8b6f63f7a4eb0dd
2013-10-04 14:22:08 -07:00
Jim Bankoski
eb5b7ac27b added nolint to function that doesn't seem easy to breakup
Change-Id: I5489b116aea7c510ea5ebbed3c1445f321b05f3e
2013-10-04 14:17:47 -07:00
Dmitry Kovalev
3a0602578e Giving consistent names to IDCT/IWHT functions.
The idea is to have the following names for each transform size:

vp9_idct4x4_add
  vp9_idct4x4_1_add
  vp9_idct4x4_10_add
  vp9_idct4x4_16_add

vp9_idct8x8_add
  vp9_idct8x8_1_add
  vp9_idct8x8_10_add
  vp9_idct8x8_64_add

etc for 16x16, 32x32

The actual list of renames in this patch:

vp9_idct_add_lossless     -> vp9_iwht4x4_add
vp9_short_iwalsh4x4_add   -> vp9_iwht4x4_16_add
vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add

vp9_idct_add            -> vp9_idct4x4_add
vp9_short_idct4x4_add   -> vp9_idct4x4_16_add
vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add

Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
2013-10-04 14:17:06 -07:00
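The naming scheme above can be read as one entry point per transform size plus variants named by a number. A minimal sketch of how such an entry point might choose a variant follows. It assumes the suffix refers to how many leading coefficients may be non-zero, and that the choice is driven by an end-of-block count; neither is stated in the message. The enum and pick_idct8x8_variant are hypothetical names.

```c
#include <assert.h>

/* Hypothetical tags for the three 8x8 variants named in the message. */
enum idct8x8_variant {
  IDCT8X8_1 = 1,   /* would map to vp9_idct8x8_1_add  */
  IDCT8X8_10 = 10, /* would map to vp9_idct8x8_10_add */
  IDCT8X8_64 = 64  /* would map to vp9_idct8x8_64_add */
};

/* Assumed selection rule: use the cheapest variant that covers eob. */
static enum idct8x8_variant pick_idct8x8_variant(int eob) {
  assert(eob >= 1 && eob <= 64);
  if (eob == 1) return IDCT8X8_1;
  if (eob <= 10) return IDCT8X8_10;
  return IDCT8X8_64;
}
```

Under these assumptions, a wrapper such as vp9_idct8x8_add would call the function matching the returned tag.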
Jim Bankoski
25ecb1f0b3 cpplint vp9_variance_sse2.c
Change-Id: Ifce8f5b57a1ea8952e8a67c5b92a127a061899fa
2013-10-04 14:15:06 -07:00
Jim Bankoski
f3e6a35cdb cpplint issues in vp9_ssim.c
Change-Id: I788f1a3004643347ca08d08fc3cb2bb8f0b134d9
2013-10-04 14:08:37 -07:00
Jim Bankoski
424c74e736 cpplint vp9_dct.c issues resolved
Change-Id: Ia21653a447040f1b472d21ebd19103b0558c4b16
2013-10-04 13:47:59 -07:00
Jim Bankoski
c6960b6086 cpplint issues vp9_tokenize.c resolved
Change-Id: Id4ec0084641d2ad4def95fb05239455fbc25f9b9
2013-10-04 13:42:58 -07:00
Jim Bankoski
660dcfe6a2 Merge "cpplint issues vp9_encodemv.c" 2013-10-04 12:55:46 -07:00
Jim Bankoski
19641c40f9 Merge "cpplint issues vp9_mbgraph" 2013-10-04 12:55:26 -07:00
Guillaume Martres
014a2c17df Fix first pass for non-square blocks
Change-Id: Ic049f0a6ce190f33859118e7b8cfcfe305979102
2013-10-04 12:04:15 -07:00
Dmitry Kovalev
042c475a8f Merge "Moving all idct/iht functions in one place." 2013-10-04 12:01:42 -07:00
Jim Bankoski
d9215a6616 cpplint issues vp9_mbgraph
Change-Id: Iedf9ac460edb31d7c072e2bebd26f2afe8e6089b
2013-10-04 11:22:22 -07:00
Jim Bankoski
19e227561a cpplint issues vp9_encodemv.c
Change-Id: Icda1d2d7cbfb176884fa6c7d9366a2d60e2994e9
2013-10-04 11:19:06 -07:00
Jim Bankoski
916f803175 lint issue in vp9_psnr.c
Change-Id: Ifc7ffc02cfedb47230571298622602609a4e8a70
2013-10-04 11:01:49 -07:00
Jingning Han
1ab60f7bfb Merge "Remove redundant second_ref_frame check in sub8x8" 2013-10-04 09:04:11 -07:00
Paul Wilkins
44e039b4f5 Further clean up of speed 4
Speed 4 still does not give a big gain over speed 3.
This just cleans it up a little from the last patch and comments
out features that do not seem to be giving much benefit.

Change-Id: I5f366e6160e1dbe5dc45cf5eb90cc02712baa1b6
2013-10-04 16:57:24 +01:00
Paul Wilkins
8abd92f12f Remove mode_skip_start and mask code for sub 8x8
This code serves no purpose in the re-factored sub 8x8 code.

Change-Id: I5364986224d1a28b71bcb046ec8557a3d14aaa47
2013-10-04 14:26:17 +01:00
Paul Wilkins
de6ecc5ac3 Selective masking of split modes.
Allow selective masking of individual split modes rather than
just a single on / off flag.

For speed 2, this recovers the large speed loss seen for some derf
clips in change Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
and a small quality gain.

For speed 1, a 10% speed increase was observed locally on some derf clips
for minimal quality change.

Change-Id: If86191087b93cbc05351c26c60c7933e2149e485
2013-10-04 14:20:58 +01:00
Paul Wilkins
03dd2818e4 Missing threshold case for disable split.
In relation to change:
Refactor inter mode rate-distortion search
 Ie6bdfa0a370148dd60bd800961077f7e97e67dd4

sf->thresh_mult_sub8x8[THR_INTRA] = INT_MAX missing;

Change-Id: Ia86b68a5073368a3e2ca124a27b632243b525c8b
2013-10-04 11:54:24 +01:00
Dmitry Kovalev
bde3ae0c60 Adding assign_mv() function to reduce code duplication.
Change-Id: I2b4e5b842c19f64749b18946ad215c0caa57e7b7
2013-10-03 20:06:32 -07:00
Dmitry Kovalev
d975804e9a Merge "Replacing duplicated code with get_scan_and_band call." 2013-10-03 18:58:40 -07:00
Dmitry Kovalev
9ec09700d6 Adding vp9_get_filter_kernel() function.
Moving INTERPOLATIONFILTERTYPE enum and subpix_fn_table struct to
vp9_filter.h. Adding convenient typedef for subpel kernels.

Function vp9_setup_interp_filters() besides setting xd->subpix.filter_x &
xd->subpix.filter_y has a side effect of also setting scale factors. This
is not required inside decode_modes_b() because scale factors have been
already set by set_ref() calls. That's why replacing
vp9_setup_interp_filters() call with newly created vp9_get_filter_kernel()
call. The behavior of vp9_setup_interp_filters() is unchanged (it
is used from the encoder).

Change-Id: I3f36d3f7cd8d15195a6e2fafd1777cdaf9ecb847
2013-10-03 18:55:21 -07:00
Dmitry Kovalev
934c4e6621 Merge "Reading diff update flag inside vp9_diff_update_prob." 2013-10-03 17:47:10 -07:00
Dmitry Kovalev
8b34437522 Replacing duplicated code with get_scan_and_band call.
Change-Id: I2cc3684f416a63dc99b9303109f9850f34a470d5
2013-10-03 17:46:28 -07:00
Jingning Han
63a92eb665 Merge "Use vp9_zero in sub8x8 RD optimization loop" 2013-10-03 17:04:16 -07:00
Dmitry Kovalev
3b7794f9eb Merge "BITSTREAM - "update_map" SEMANTICS BROKEN IN 398ddafb629b7f49cf255bf09d3e38b4abd0bb95" 2013-10-03 15:09:49 -07:00
Dmitry Kovalev
0e23048303 BITSTREAM - "update_map" SEMANTICS BROKEN IN 398ddafb62
This patch reverts old commit 398ddafb62
"New way of updating last frame segmentation map.".

Change-Id: Iba730f433c30ed7f5e5449d6768049cbf9a2b2c5
2013-10-03 14:41:36 -07:00
Jingning Han
2952b7d1fb Remove redundant second_ref_frame check in sub8x8
This commit removes the redundant second reference frame check in
the rate-distortion optimization loop for sub8x8 blocks.

Change-Id: I13a57a6f624c4a9bcef02ff2a867fa30d8b44a93
2013-10-03 14:02:12 -07:00
Jingning Han
b9daef91d8 Use vp9_zero in sub8x8 RD optimization loop
Change-Id: Ic23a705e48cadaa7151f2bd8536d56636cb973e3
2013-10-03 12:34:25 -07:00
Jingning Han
4093192ec9 Change b_mode_info definition from union to struct
This commit defines b_mode_info as a struct type. This will allow
us to further remove the use of PARTITION_INFO in the encoding process.

Change-Id: I975b0f7d557b5e0f66545a61b472def76b671cce
2013-10-03 12:34:11 -07:00
Jingning Han
793c2d8429 Remove unused variables in inter_mode rd loops
Remove redundant variable definition/use in rate-distortion search
loop for regular and sub8x8 blocks, respectively.

Change-Id: Ic0eb3660bb6851ba2eb8d702ba9fd11595000d01
2013-10-03 12:34:11 -07:00
Jingning Han
a55625873f Merge "Refactor inter mode rate-distortion search" 2013-10-03 12:19:53 -07:00
Yunqing Wang
134dfea878 Merge "Rewrite HORIZx4 and HORIZx8 in subpixel filter functions" 2013-10-03 12:17:47 -07:00
Dmitry Kovalev
3aed95dbdb Merge "Using vp9_zero instead of vpx_memset." 2013-10-03 11:41:11 -07:00
Jingning Han
11abab356e Refactor inter mode rate-distortion search
This commit separates the rate-distortion optimization loop of
superblocks from that of sub8x8 blocks. This allows a better-designed
rate-distortion optimization search loop for each setting. It also
removes the use of SPLITMV and I4X4_PRED therein.

No performance change in speed 0 settings. For bus@CIF at 2000kbps,
the speed 1 runtime goes from 48009ms to 43894ms (about 10% faster).
The overall compression performance on derf changed by -0.021%.

Speed 2 runtime goes from 27114ms to 28700ms (6% slower), while the
overall coding efficiency goes up by 1.629% for derf, 1.236% for yt.

Change-Id: Ie6bdfa0a370148dd60bd800961077f7e97e67dd4
2013-10-03 11:36:49 -07:00
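The runtime figures in this message can be turned into percentages in two common ways, and the message does not say which one was used. A small sketch with hypothetical helper names:

```c
#include <assert.h>

/* Percentage by which the runtime shrinks: (old - new) / old * 100.
   A negative result means the runtime grew. */
static double runtime_reduction_pct(double old_ms, double new_ms) {
  assert(old_ms > 0.0 && new_ms > 0.0);
  return (old_ms - new_ms) / old_ms * 100.0;
}

/* Percentage by which throughput grows: (old / new - 1) * 100. */
static double throughput_gain_pct(double old_ms, double new_ms) {
  assert(old_ms > 0.0 && new_ms > 0.0);
  return (old_ms / new_ms - 1.0) * 100.0;
}
```

For the speed 1 numbers (48009ms to 43894ms) the first formula gives roughly 8.6% and the second roughly 9.4%, so "about 10% faster" matches the throughput reading more closely. For speed 2 (27114ms to 28700ms) the runtime grows by roughly 5.8%.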
Dmitry Kovalev
8394f1a015 Merge "Making decode_modes_b function more straightforward." 2013-10-03 11:06:29 -07:00
Dmitry Kovalev
9250d1529c Using vp9_zero instead of vpx_memset.
Change-Id: I9a0d0e9c3459954aa7b9c68f92cc5d56385ebd18
2013-10-03 10:59:36 -07:00
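A helper in the style of vp9_zero presumably wraps a memset over a whole object, so the caller no longer repeats the size. The sketch below shows that pattern with standard memset; zero_object, example_counts, and reset_counts are hypothetical stand-ins, not the library's definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a vp9_zero-style helper: the size is taken
   from the object itself, so it cannot drift from the declaration. */
#define zero_object(dest) memset(&(dest), 0, sizeof(dest))

struct example_counts {
  int hits[4];
  unsigned char flags[3];
};

/* Clears every field of *c in one call. */
static void reset_counts(struct example_counts *c) {
  assert(c != NULL);
  zero_object(*c);
}
```

The macro must be given the object itself, not a pointer to it; passing a pointer would clear only the pointer variable.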
Dmitry Kovalev
6f1bb2246c Reading diff update flag inside vp9_diff_update_prob.
Change-Id: I5ae659c1bfb132428a7272d094b5287d144ec7c8
2013-10-03 10:55:36 -07:00
Paul Wilkins
b03d3da9c1 Merge "Speed setting review." 2013-10-03 09:49:00 -07:00
Paul Wilkins
fa71882e63 Merge "make use last partition consider motion" 2013-10-03 09:48:49 -07:00
Johann
fd6c4c71d6 Merge "mips dsp-ase r2 vp9 decoder convolve module optimizations" 2013-10-03 09:41:16 -07:00
Dmitry Kovalev
6cb6987d4d Merge "BITSTREAM - RESTORING BILINEAR INTERPOLATION FILTER SUPPORT" 2013-10-03 09:34:26 -07:00
Yunqing Wang
ed22179a82 Rewrite HORIZx4 and HORIZx8 in subpixel filter functions
In subpixel filters, prefetched source data, unrolled loops,
and interleaved instructions.

In HORIZx4, integrated the idea in Scott's CL (commit: d22a504d11),
which was suggested by Erik/Tamar from Intel. Further tweaking was
done to combine row 0,
2, and row 1, 3 in registers to do more 2-row-in-1 operations until
the last add.

Test showed a ~2% decoder speedup.

Change-Id: Ib53d04ede8166c38c3dc744da8c6f737ce26a0e3
2013-10-03 09:04:02 -07:00
Paul Wilkins
6253cc9279 Speed setting review.
Substantial reworking of the speed vs quality trade offs for
speed 1 and 2.

In this patch I am attempting to freeze the "quality" meaning of
speeds 1 and 2 relative to speed 0 so that in future we can
better evaluate progress.

I am targeting :
Speed 1 quality ~-5% vs speed 0.
Speed 2 quality ~-10% vs speed 0

It is inevitable that quality will still fluctuate a little as we adjust
settings and add new features, but we will attempt to keep as
close as possible to these values. Above speed 2 things will remain
a bit more fluid for now.

In this patch speed 1 is approximately 4-5x as fast as speed 0. This
is similar to before but the quality hit is a lot less. Likewise speed 2
is approximately 2x as fast as speed 1 but is similar in quality to the
previous speed 1 configuration.

Also slight change to behavior of FLAG_EARLY_TERMINATE to ensure
all reference frames get at least one rd test. Important for very low
variance regions.

WIP :- Added a new speed level with old speed 4 becoming speed 5.
Speed 3 and 4 tradeoffs still WIP

Change-Id: Ic7a38dd7b5b63ab1501f9352411972f480ac6264
2013-10-03 10:23:28 +01:00
Jim Bankoski
f1d3e5e4d6 make use last partition consider motion
This commit causes use last partition to consider whether a 64x64 has
motion that might make a new partitioning worthwhile.

Change-Id: I3a57bedef4f3cd961fadbfa96651c206fa36da4a
2013-10-03 10:22:39 +01:00
Paul Wilkins
ece99b3da0 Merge "Improved auto_partition_range." 2013-10-03 02:06:13 -07:00
Dmitry Kovalev
68a3e4a888 BITSTREAM - RESTORING BILINEAR INTERPOLATION FILTER SUPPORT
Adding appropriate test vector vp90-2-06-bilinear.webm.

Change-Id: Ia3bbf57318e0cc61a1b724fe751e3f9c7e11b337
2013-10-02 18:04:12 -07:00
A.Mahfoodh
5215b83aea Simplifying and inlining k_cvtlo_epi16 and k_cvthi_epi16
Simplify the k_cvtlo_epi16 and k_cvthi_epi16 to only two
instructions. Then inlined them.

Quoting from Intel's MMX_App_Compute_16bit_Vector.pdf:
"The PMADDWD instruction multiplies four
pairs of 16-bit numbers and produces partial sums of the results
and can do so once per clock (with a three-clock latency)."
so I am assuming that there will be three clock overhead after the
last _mm_madd_pi16 command.
Even with the overhead the number of clocks in general should be
smaller. I am not sure though because I could not find information
about number of clocks required for instructions in k_cvtlo_epi16
and k_cvthi_epi16. I will run a test and compare the execution time.

Change-Id: Ieda4aa338f69ad3dd196ac6e7892da3cf1b47ea7
2013-10-02 20:02:03 -04:00
Parag Salasakar
40edab5e39 mips dsp-ase r2 vp9 decoder convolve module optimizations
Change-Id: I401536778e3c68ba2b3ae3955c689d005e1f1d59
2013-10-02 16:58:37 -07:00
Dmitry Kovalev
43e979db3b Merge "Adding const to function arguments." 2013-10-02 16:26:20 -07:00
Dmitry Kovalev
7fa14f42c1 Merge "Removing unused vp9_coeff_stats_model typedef." 2013-10-02 16:26:09 -07:00
Dmitry Kovalev
a88a0e88a4 Merge "Moving get_token_alloc function from common to the encoder." 2013-10-02 16:26:00 -07:00
Jim Bankoski
f5bcc372c9 unused typedef in vp9_variance.h
Change-Id: I15f79c9de34c723c1dd419b8da96c3ff948c5e03
2013-10-02 15:59:31 -07:00
Dmitry Kovalev
be7eec79be Moving all idct/iht functions in one place.
Moving functions from vp9_idct_blk to vp9_idct because these functions are
used from both encoder and decoder. Removing duplicated code from
vp9_encodemb.c and reusing existing functions.

Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
2013-10-02 14:13:33 -07:00
Scott LaVarnway
20a09d928a d153 intra prediction (16x16) ssse3 using bytes
Change-Id: I8a106dd61b0a2520fae792d87d6348e662649b2d
2013-10-02 16:34:05 -04:00
Dmitry Kovalev
d958c0486a Merge "Removing memset calls inside idct/iht functions." 2013-10-02 12:45:27 -07:00
Dmitry Kovalev
c4d1ab573a Removing memset calls inside idct/iht functions.
Making appropriate memset inside decode_block now.

Change-Id: I8e944194668c830de08271c8fb6e413251c201d8
2013-10-02 11:48:08 -07:00
Jingning Han
54bc73151b Deprecate unused mode count variables
Remove mode_check_freq and mode_test_hit_counts from VP9_COMP.

Change-Id: Iabfd9f841444cd9bf19ac761a9795f140082ce0b
2013-10-02 11:07:14 -07:00
Jingning Han
6d3bd96607 BITSTREAM - CLARIFICATION OF MV SIZE RANGE
The codec should effectively run with motion vectors in the range (-2048, 2047)
in full pixels, for sequences of 1080p and below. Add assertions to clarify
this behavior.

Change-Id: Ia0cac28249f587d8f8882205228fa480263ab313
2013-10-02 10:29:45 -07:00
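The kind of assertion described in this message might look like the sketch below. It assumes both bounds are inclusive and that the check is made on full-pixel values; the message does not settle either point. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Bounds quoted in the commit message, in full pixels (assumed inclusive). */
#define MV_FULLPEL_LOW (-2048)
#define MV_FULLPEL_HIGH 2047

/* Hypothetical full-pixel motion vector. */
struct fullpel_mv {
  int row;
  int col;
};

/* Returns 1 when both components lie inside the documented range. */
static int mv_in_range(const struct fullpel_mv *mv) {
  assert(mv != 0);
  return mv->row >= MV_FULLPEL_LOW && mv->row <= MV_FULLPEL_HIGH &&
         mv->col >= MV_FULLPEL_LOW && mv->col <= MV_FULLPEL_HIGH;
}
```

A caller would then write assert(mv_in_range(&mv)) at the point where a vector is produced or consumed.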
Dmitry Kovalev
6c2082db71 Merge "Adding read_intra_mode_{y, uv} functions for clarity." 2013-10-02 09:17:10 -07:00
Dmitry Kovalev
3c4e9e341f Adding SSE2 optimized vp9_short_idct32x32_1_add function.
Change-Id: I4b1c6bb9ff615f5872b96ed07dbf0f5e18e63643
2013-10-01 18:34:36 -07:00
Dmitry Kovalev
771f3ef5ad Adding read_intra_mode_{y, uv} functions for clarity.
Change-Id: I92fd32476c472e54f52b8d7602a98262b25e6eaf
2013-10-01 17:55:48 -07:00
Jim Bankoski
e83ebc8992 Merge "vp9_thread nolintify lint issue I can't fix easily" 2013-10-01 16:15:03 -07:00
Jim Bankoski
825b7c301d Merge "vp9_block.h cpplint issues resolved" 2013-10-01 16:14:58 -07:00
Jim Bankoski
691177842c Merge "cpplint issue in vp9_rdopt.h" 2013-10-01 15:45:35 -07:00
Jim Bankoski
d0308b7daa Merge "cpplint issues in vp9_onyx_int.h" 2013-10-01 15:45:02 -07:00
Dmitry Kovalev
aeb603f2af Making decode_modes_b function more straightforward.
Moving out decode_tokens function calls and adding decode_blocks boolean
variable. We only have to decode if eobtotal > 0, i.e. we have at least one
non-zero coefficient. Also inlining and removing the vp9_set_pred_flag_mbskip
function.

Change-Id: I7be38b12ee8206faf0beea2bbf4d52be42575b03
2013-10-01 15:41:30 -07:00
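The condition described above (decode only when eobtotal > 0) can be sketched as follows. It assumes eobtotal is the sum of the per-block end-of-block positions; sum_eobs and should_decode_blocks are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Adds up the end-of-block position of every transform block. */
static int sum_eobs(const int *eobs, size_t count) {
  int total = 0;
  size_t i;
  assert(eobs != NULL || count == 0);
  for (i = 0; i < count; ++i) total += eobs[i];
  return total;
}

/* A total of zero means no block carries a non-zero coefficient,
   so residual decoding can be skipped entirely. */
static int should_decode_blocks(const int *eobs, size_t count) {
  return sum_eobs(eobs, count) > 0;
}
```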
Jim Bankoski
c52d85442c vp9_thread nolintify lint issue I can't fix easily
Change-Id: Ib19dabe697656e4d7e8403d91bedca7cd31d36bf
2013-10-01 15:19:39 -07:00
Jim Bankoski
5491a1f33e vp9_block.h cpplint issues resolved
Change-Id: Icc6a76a5be77f3e19918155bab3998e0aa32ccf5
2013-10-01 15:17:39 -07:00
Jim Bankoski
c4627a9ff1 cpplint issues in vp9_onyx_int.h
Change-Id: I6c4058aebe834e1a12b7a3fb10484b9ebe60b349
2013-10-01 15:14:39 -07:00
Jim Bankoski
b6e2f9b752 cpplint issue in vp9_rdopt.h
Change-Id: I84209d382ca5dfc537ee533cd792d8caa0e25cee
2013-10-01 15:09:32 -07:00
Yunqing Wang
03698aa6d8 Merge "Modify HORIZx16 macro in subpixel filter functions" 2013-10-01 14:18:10 -07:00
Yunqing Wang
df8e156432 Modify HORIZx16 macro in subpixel filter functions
Interleaved the instructions, reduced register dependency, and
prefetched the source data. This improved the decoder speed
by 0.6% - 2%.

Change-Id: I568067aa0c629b2e58219326899c82aedf7eccca
2013-10-01 12:49:25 -07:00
Dmitry Kovalev
0a5e9ee054 Moving get_token_alloc function from common to the encoder.
Also renaming mb_row -> mi_row, mb_col -> mi_col arguments and calculate
mb_rows/mb_cols values from mi_rows/mi_cols.

Change-Id: I6919a279f560648e23bc9a12f507d17c21ffd5d7
2013-10-01 11:54:10 -07:00
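Deriving macroblock counts from mode-info counts can be sketched as below. The sketch assumes one mode-info unit covers 8x8 pixels and one macroblock covers 16x16, so two units make one macroblock and an odd count rounds up; the message does not give the formula. The helper name is hypothetical.

```c
#include <assert.h>

/* Assumed conversion: two 8x8 mode-info units per 16x16 macroblock,
   rounding up so a trailing partial macroblock is still counted. */
static int mi_to_mb_count(int mi_count) {
  assert(mi_count >= 0);
  return (mi_count + 1) >> 1;
}
```

The same helper would serve for both rows and columns, for example mb_rows = mi_to_mb_count(mi_rows).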
Yaowu Xu
5c66f6f5eb fix build with MSVC
near is a keyword, changed to use nearmv instead.

Change-Id: Ib54438c431b2b2521a62fc7b61a9c127dd7bc01e
2013-10-01 09:51:59 -07:00
Scott LaVarnway
27b390e1a1 d153 intra prediction ssse3 using bytes
byte version of Ronald's d153 ssse3 optimizations for
4x4 and 8x8
(commit: fc91a2a112238a1aee568f3b840585de4e928fca)

Change-Id: Iec4426032311483f615fd9e0dceba3ee85ddebd7
2013-10-01 09:05:20 -04:00
Dmitry Kovalev
c982a73b9f Removing unused vp9_coeff_stats_model typedef.
Change-Id: I6973e7121b6393379b5759f288632e8eab763d3e
2013-09-30 15:10:00 -07:00
Dmitry Kovalev
c64e23832f Adding const to function arguments.
Function list:
  tx_counts_to_branch_counts_32x32
  tx_counts_to_branch_counts_8x8
  tx_counts_to_branch_counts_8x8
  update_ct
  update_ct2
  update_mode_probs

Change-Id: I120d8945a34378cf285d6bd415e23de1d522cf2f
2013-09-30 14:50:15 -07:00
Dmitry Kovalev
40047bef5d Merge "Using array of motion vectors instead of separate variables." 2013-09-30 13:16:45 -07:00
Dmitry Kovalev
cd945c7bd9 Merge "Removing vp9_add_constant_residual_{8x8, 16x16, 32x32} functions." 2013-09-30 13:16:34 -07:00
Jingning Han
195061feda Fix rectangular partition check in speed 1
Make encoder skip rectangular partition check in speed 1 and above,
when early termination was triggered in partition split.
Thanks Guillaume (gmartres@) for catching this issue.

This change reduces the bus_cif speed 1 runtime at 2000kbps from
25612ms to 23438ms (about 9% speed-up), at the expense of a 0.235%
drop in performance.

Change-Id: I98613fad081a261d30d5fa206f934ca70601c180
2013-09-30 12:14:36 -07:00
Dmitry Kovalev
c151bdd412 Using array of motion vectors instead of separate variables.
Change-Id: I7380a089105f658257bbb3e30a525da168e76952
2013-09-30 12:11:46 -07:00
Dmitry Kovalev
1a9d4fedf3 Merge "Using size_t for memory buffer size." 2013-09-30 11:10:08 -07:00
Dmitry Kovalev
548671dd20 Removing vp9_add_constant_residual_{8x8, 16x16, 32x32} functions.
We don't need these functions anymore. The only one which was actually
used is vp9_add_constant_residual_32x32. Addition of
vp9_short_idct32x32_1_add eliminates this single usage. SSE2 optimized
version of vp9_short_idct32x32_1_add will be added in the next patch set;
right now there is only a C implementation. Now we have all idct functions
implemented in a consistent manner.

Change-Id: I63df79a13cf62aa2c9360a7a26933c100f9ebda3
2013-09-30 10:56:37 -07:00
Jim Bankoski
4906fe45e2 Merge "systemdependent lint issue resolved" 2013-09-30 10:55:07 -07:00
Jim Bankoski
fd09be0984 Merge changes I2b2af1dd,Id2cc5c82
* changes:
  fixed cpp lint issue in vp9_postproc_x86
  nolintify intrinsic idct file
2013-09-30 10:53:30 -07:00
Jim Bankoski
e3c1f0880f Merge "cpplint issues in vp9_loopfilter.h" 2013-09-30 10:53:13 -07:00
Jim Bankoski
509ba98938 Merge "treecoder lint issues resolved" 2013-09-30 10:43:22 -07:00
Jim Bankoski
7ddd9f7f27 Merge "cpplint issue with entropymv.h" 2013-09-30 10:43:16 -07:00
Jim Bankoski
c424c5e808 Merge "cpplint issue with vp9_loopfilter_filters.c" 2013-09-30 10:43:05 -07:00
Jim Bankoski
282704145d Merge "cpplint issue in blockd.h" 2013-09-30 10:42:45 -07:00
Jim Bankoski
58a09c32c2 Merge "common_data.h lint issues resolved" 2013-09-30 10:42:35 -07:00
Jim Bankoski
9e056fa094 Merge "vp9_loopfilter.c cpplint issues resolved." 2013-09-30 10:42:27 -07:00
Jim Bankoski
d2a4ddf982 Merge "cpplint issue resolved in vp9_pred_common.h" 2013-09-30 10:42:19 -07:00
Jim Bankoski
cbdcc215b3 Merge "resolved lint issues in default_coef_probs" 2013-09-30 10:42:12 -07:00
Jim Bankoski
d35e9a0c53 Merge "lint issues in mvref_common.c" 2013-09-30 10:41:50 -07:00
Jim Bankoski
14916b0ca6 Merge "vp9 convolve lint issues" 2013-09-30 10:41:43 -07:00
Jim Bankoski
4e5d99ca72 Merge "vp9_rtcd.c lint issues" 2013-09-30 10:41:32 -07:00
Jim Bankoski
bc1b089372 Merge changes Id58e2176,I7efc74ef
* changes:
  cpplint issues in vp9_filter.h
  cpplint issues with onyxc_int.h
2013-09-30 10:41:23 -07:00
Jim Bankoski
0f8805e086 Merge "vp9_entropy.c lint issues" 2013-09-30 10:34:11 -07:00
Paul Wilkins
d12a502ef9 Merge "Alter Speed 3." 2013-09-30 09:12:28 -07:00
Jim Bankoski
7f13b33a78 Merge "cpplint issues resolved in vp9_postproc.c" 2013-09-30 08:26:00 -07:00
Jim Bankoski
1a2f4fd2f5 Merge "fix lint issues in quant common" 2013-09-30 08:26:00 -07:00
Jim Bankoski
88251c86dc Merge "fix cpplint issue in reconintra" 2013-09-30 08:26:00 -07:00
Jim Bankoski
68b8d1ea0a Merge changes Ia7969baa,Ic5807152,I1c3943cd,I0b5af849,I01cbd1b0
* changes:
  fixed cpplint issue with vp9_scale.h
  vp9_entropymv.c cpplint issues resolved
  cpplint fixes to debug modes
  cpplint issues in vp9_onyx.h
  cpplint issues resolved in vp9_dx_iface.c
2013-09-30 08:26:00 -07:00
Jim Bankoski
821b987486 Merge "cpplint issue with treedreader" 2013-09-30 08:24:59 -07:00
Deb Mukherjee
fad3d07df3 Merge "Some minor changes/cleanups in rate control" 2013-09-30 06:50:56 -07:00
Paul Wilkins
65b93c7e52 Improved auto_partition_range.
The code now takes into account temporal and spatial
information to determine the partition size range, but the
frequency counts have been removed.

The net effect is similar in quality but about 10% faster.

Change-Id: I39a513fb79cec9177b73b2a7218f0da70963ae95
2013-09-30 11:32:57 +01:00
Paul Wilkins
a76caa7ff4 Alter Speed 3.
This patch deletes the variance based speed three partitioning.
Speed 3 now uses the same partitioning method as speed 2
but with some stricter conditions.

The speed and quality are now somewhere between speeds 2 and 4
whereas before it was worse in both than speed 4.

Change-Id: Ia142e7007299d79db3ceee6ca8670540db6f7a41
2013-09-30 11:26:46 +01:00
Jim Bankoski
777460329b vp9_entropy.c lint issues
Change-Id: I4e163cc4ce9ec2f3a5a8b9da478049c71b08d71f
2013-09-29 20:29:43 -07:00
Jim Bankoski
7019e34c34 vp9 convolve lint issues
Change-Id: I8b496191c6a60a60a52c929adca305db47058a84
2013-09-29 19:44:05 -07:00
Jim Bankoski
f6d7e3679c resolved lint issues in default_coef_probs
Change-Id: I97bf241c0d981721cc74a50be47c9db8a00f6be3
2013-09-29 19:41:31 -07:00
Jim Bankoski
c66bfc70d1 treecoder lint issues resolved
Change-Id: I442609f689aa9381e1e208012305cf62a6b31eee
2013-09-29 19:37:11 -07:00
Jim Bankoski
a57912f893 systemdependent lint issue resolved
Change-Id: I07fbb32d5cee0003d04b2369cfafcb03c371cd4f
2013-09-29 19:34:44 -07:00
Jim Bankoski
8f229caf87 lint issues in mvref_common.c
Change-Id: If6a7a8c48fefc69349c792d8ed52a6e1d374e46e
2013-09-29 19:32:53 -07:00
Jim Bankoski
623e163f84 vp9_rtcd.c lint issues
Change-Id: I58209ae96d21c56cbb8ef796940b6ca3b3ebfa72
2013-09-29 19:29:58 -07:00
Jim Bankoski
c288b94ab9 common_data.h lint issues resolved
Change-Id: I1fd79093a5b9cb40c9e877b6b71c25a07a69b3ae
2013-09-29 19:28:32 -07:00
Jim Bankoski
03df17070b vp9_loopfilter.c cpplint issues resolved.
Change-Id: Idfa17d120ec4edf542e424fa0deb769951afbf4a
2013-09-29 19:04:21 -07:00
Jim Bankoski
6249a5b17e cpplint issue with vp9_loopfilter_filters.c
Change-Id: I13aa43df6bff340b5768d69125b473a52d1d59bd
2013-09-29 19:03:00 -07:00
Jim Bankoski
855d078f95 cpplint issue with entropymv.h
Change-Id: I3556738d27def6a5bd71577728050a1e2bb1de63
2013-09-29 19:01:46 -07:00
Jim Bankoski
2b5bf7b8d8 cpplint issue in blockd.h
Change-Id: Ia41e1966431652b839134a1c27feccb25c762539
2013-09-29 19:00:40 -07:00
Jim Bankoski
716d37f8bf fixed cpplint issue with vp9_scale.h
Change-Id: Ia7969baac7ffc6d7a0e8e8e83e9252d077a3c5b3
2013-09-29 18:58:58 -07:00
Jim Bankoski
2ecd0dae1e vp9_entropymv.c cpplint issues resolved
Change-Id: Ic5807152cc78127b3f84b5abb4c5f3ef6d06ce65
2013-09-29 18:57:35 -07:00
Jim Bankoski
7a59efe7f8 cpplint issues resolved in vp9_postproc.c
Change-Id: If61380115163a02ecfe74b82e116001ac54e20e2
2013-09-29 18:52:29 -07:00
Jim Bankoski
152fd59964 fixed cpp lint issue in vp9_postproc_x86
Change-Id: I2b2af1dd9f5c29c05e28a4fd51fa58ccc4071477
2013-09-29 18:44:58 -07:00
Jim Bankoski
ec421b7810 nolintify intrinsic idct file
Change-Id: Id2cc5c829399a2afdf7a8a82615a4e272c814986
2013-09-29 18:42:24 -07:00
Jim Bankoski
31ceb6b13c cpplint issues in vp9_loopfilter.h
Change-Id: Ib142f9c5130aa5f0e1fc76e1c4f51cd66c73dcc7
2013-09-29 18:36:42 -07:00
Jim Bankoski
11cf0c39c9 cpplint issues in vp9_filter.h
Change-Id: Id58e21760c7948a2b020c9623c38cf007874d43e
2013-09-29 18:34:41 -07:00
Jim Bankoski
01d43aaa24 cpplint issue resolved in vp9_pred_common.h
Change-Id: Ibacac91c2192fcfbd9e411ae141dd00445566efe
2013-09-29 18:17:06 -07:00
Jim Bankoski
ab03c00504 cpplint issues with onyxc_int.h
Change-Id: I7efc74ef53139bbaa6ec4f01482d9d9b362be27b
2013-09-29 18:10:03 -07:00
Jim Bankoski
eb506a6590 cpplint fixes to debug modes
Change-Id: I1c3943cd5db6cd8fc759116a3717dba3c030fa0d
2013-09-29 18:04:48 -07:00
Jim Bankoski
fb6e6cd24d fix cpplint issue in reconintra
Change-Id: I934f9cfb96ce4f5f266b025064237875dcd92b3a
2013-09-29 18:02:42 -07:00
Jim Bankoski
d052117319 fix lint issues in quant common
Change-Id: I135ee6e8df91262f813c474b24f14381a4064e02
2013-09-29 17:59:43 -07:00
Jim Bankoski
efc8638890 cpplint issues in vp9_onyx.h
Change-Id: I0b5af849833ac077bd4de71a24af8f8bd7ec06d6
2013-09-29 17:50:18 -07:00
Jim Bankoski
4ecdf26d9c cpplint issues resolved in vp9_dx_iface.c
Change-Id: I01cbd1b00d8d8e02541b2c29b9e88e690edfcaba
2013-09-29 17:33:30 -07:00
Jim Bankoski
0f9efe9e7a cpplint issue with treedreader
Change-Id: I4036add96dd5e42896c57a80a6ef2b6f27b8224a
2013-09-29 17:20:33 -07:00
Jim Bankoski
8e45778eaf Merge changes I29b5bbb9,Iaa6b8ac9,Ibf996de7,Ie1b544e4,I9dea60e3,If71923f4,I6498d2ee
* changes:
  cpplint issue extra line in decodemv.c
  cpplint issue - vp9_idct_blk.c
  cpplint issue in vp9_detokenize.c
  fixed cpplint issue vp9_onyxd_int.h
  cpplint issue in vp9_read_bit_buffer resolved
  cpplint issue vp9_decodeframe.c
  fix cpplint issue in vp9_onyxd.h
2013-09-29 17:10:17 -07:00
Jim Bankoski
8486741e15 Merge "cpplint issues vp9_thread.h" 2013-09-29 17:07:55 -07:00
Jim Bankoski
8d0b712af6 Merge "cpplint style issue" 2013-09-29 17:07:27 -07:00
Jim Bankoski
8d50d766d4 Merge "fixed cpplint issues in vp9_onyxd_if.c" 2013-09-29 17:07:17 -07:00
Dmitry Kovalev
b927620231 Merge "Using is_inter_block and has_second_ref functions." 2013-09-29 12:14:41 -07:00
Dmitry Kovalev
29815ca729 Merge "Moving from int_mv* to MV* (3)." 2013-09-29 12:13:16 -07:00
Dmitry Kovalev
4ab01fb5f7 Merge "Reusing FRAME_CONTEXT struct to simplify the code." 2013-09-29 12:02:26 -07:00
Dmitry Kovalev
b3d3578ee4 Merge "Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add." 2013-09-29 12:01:50 -07:00
Dmitry Kovalev
7343681675 Merge "Removing vp9_get_coef_neighbors_handle function." 2013-09-29 12:01:36 -07:00
Dmitry Kovalev
efbacc9f89 Merge "Removing vp9_subpelvar.h from common." 2013-09-29 12:00:46 -07:00
Dmitry Kovalev
5df8b1d05b Merge "Fixing warning generated by gcc." 2013-09-29 12:00:27 -07:00
Dmitry Kovalev
3bb773d03e Merge "Removing unnecessary function calls." 2013-09-29 11:59:44 -07:00
Jim Bankoski
cf688474ea cpplint issue extra line in decodemv.c
Change-Id: I29b5bbb9bed7296d0bf7d58ae1e78187ccdc5b34
2013-09-29 11:53:14 -07:00
Jim Bankoski
33c7ed4478 cpplint issue - vp9_idct_blk.c
Change-Id: Iaa6b8ac967c0000d4632b64ff9709304072d6ef2
2013-09-29 11:53:10 -07:00
Jim Bankoski
11fe8ecf57 cpplint issue in vp9_detokenize.c
Change-Id: Ibf996de79e9c9bbe03b2202d4af11aebc58f9bcc
2013-09-29 11:53:06 -07:00
Jim Bankoski
67a0a89272 fixed cpplint issue vp9_onyxd_int.h
Change-Id: Ie1b544e488a5e346a62174bfdeb9b54c34a19083
2013-09-29 11:53:02 -07:00
Jim Bankoski
ef6d82358d cpplint issue in vp9_read_bit_buffer resolved
Change-Id: I9dea60e39bc4a51684cfba49c82c3570a2f7b61e
2013-09-29 11:52:58 -07:00
Jim Bankoski
fff4caeac1 cpplint issue vp9_decodeframe.c
Change-Id: If71923f4821a7bf3372a1ead83baa91fc576977c
2013-09-29 11:52:52 -07:00
Jim Bankoski
2ce70a15d2 fix cpplint issue in vp9_onyxd.h
Change-Id: I6498d2eee0b3f3bbb94787eb0ba72ccfcf8f5f02
2013-09-29 11:52:47 -07:00
Jim Bankoski
da17ffa937 cpplint issues vp9_thread.h
Apparently we are going to have trouble completely removing lint issues in this file.
It needs a bit more work. We need to include vpx_config.h to know whether
we need multithreading, and that means vpx_config.h has to come
before the system headers (a violation).

Change-Id: I023feeab1bf5643b79dccc3b80a4a9ad42689e7b
Signed-off-by: Jim Bankoski <jimbankoski@google.com>
2013-09-29 11:49:52 -07:00
Jim Bankoski
681fb22820 cpplint style issue
Change-Id: I550e27b2d40f0e608032e74e1472ceec53c97dc7
2013-09-29 11:19:26 -07:00
Jim Bankoski
cfbc246d57 fixed cpplint issues in vp9_onyxd_if.c
Change-Id: Ia67e9ed2d5ea79f3dbf1d58f9a187cb18ecd0995
2013-09-29 11:03:53 -07:00
Dmitry Kovalev
b10e6b2943 Removing unnecessary function calls.
Both vp9_init_mbmode_probs() and vp9_zero(cm->ref_frame_sign_bias) are
called inside vp9_setup_past_independence(), which is called in any case for
the encoder/decoder after VP9_COMMON struct creation.

Change-Id: I3724d1a4fb8060101ff0290dd6a158f0b5c57bb4
2013-09-27 17:42:05 -07:00
Dmitry Kovalev
bd9c057433 Reusing FRAME_CONTEXT struct to simplify the code.
Change-Id: Ia455c1900d84a3221e3681e31e15ca86bd03f89d
2013-09-27 16:41:20 -07:00
Guillaume Martres
ceaa3c37a9 Merge "Simplify RDMULT and RDDIV derivation" 2013-09-27 16:32:54 -07:00
Dmitry Kovalev
3fab2125ff Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add.
Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1.

Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5
2013-09-27 15:26:27 -07:00
Christian Duvivier
b1b4ba1bdd Properly save neon registers.
Replace the current code, which corrupts the stack, with a
duplicate of the vp8 code to save and restore neon
registers.

Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a
2013-09-27 14:25:33 -07:00
Dmitry Kovalev
209c6cbf8f Removing vp9_get_coef_neighbors_handle function.
Change-Id: I6be72c8b048d1ccc7ef43764cf84c32360098970
2013-09-27 14:11:13 -07:00
Deb Mukherjee
80d582239e Some minor changes/cleanups in rate control
Some small changes to the quantizer mapping functions.
Also includes some cleanups.

Change-Id: I9dea29b24015f6e6697012a0e4d8983049d8e5c7
Results:
derfraw300: +0.106%
stdhdraw250: +0.139%
2013-09-27 13:57:42 -07:00
Dmitry Kovalev
db60c02c9e Merge "Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10." 2013-09-27 13:08:52 -07:00
Dmitry Kovalev
36d2794369 Merge "New way of updating last frame segmentation map." 2013-09-27 13:08:44 -07:00
Scott LaVarnway
35830879db Merge "d63 intra prediction ssse3 using bytes" 2013-09-27 07:21:08 -07:00
Dmitry Kovalev
398ddafb62 New way of updating last frame segmentation map.
Implementing more natural (and faster) way of updating last frame
segmentation map.

Change-Id: I9fefa8f78e77bd7948133b04173da45edc15a17e
2013-09-26 18:44:48 -07:00
Christian Duvivier
3c465af2ab Merge "Fix a bunch of TODO from vp9_short_idct32x32_add_neon." 2013-09-26 14:15:18 -07:00
Dmitry Kovalev
15a36a0a0d Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10.
Making function name consistent with vp9_short_idct16x16 and
vp9_short_idct16x16_1.

Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517
2013-09-26 14:01:25 -07:00
Guillaume Martres
2b426969c3 Simplify RDMULT and RDDIV derivation
Don't divide RDMULT and RDDIV by 100 when RDMULT > 1000. This was
probably done to avoid overflow when the rd cost was stored in a 32-bit
integer, but this is no longer the case. This change will make it easier
to support multiple quantizers per frame.

derf compression gain at speed 0: 0.037%

Change-Id: Ibeeb9b7cfa1a132a7af41bc90fc07a3bba0857f6
2013-09-26 13:55:16 -07:00
Dmitry Kovalev
794a7ccd78 Fixing warning generated by gcc.
vp9/vp9_cx_iface.c:92: warning: type qualifiers ignored on function
return type

Change-Id: I6f130e280e2db261506a4af8ce11fc788ad13198
2013-09-26 10:33:21 -07:00
Christian Duvivier
5b1dc1515f Fix a bunch of TODO from vp9_short_idct32x32_add_neon.
- full ASM version, no more C gateway file.
- integrate combine-add with last step of 2nd pass.
- remove a few push/pop pairs.
- some instruction reordering to hide latency.

Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6
2013-09-25 21:15:19 -07:00
Dmitry Kovalev
eda4e24c0d Using is_inter_block and has_second_ref functions.
Change-Id: I60dee58a4fd24d3c4f3c101a49d30e217309f43a
2013-09-25 19:03:04 -07:00
Guillaume Martres
7755b9dada Merge "Correctly set the segment_id prediction flag and context" 2013-09-25 18:04:21 -07:00
Yaowu Xu
0c02bfcc2a Merge "Limit mv search range for first pass and mbgraph" 2013-09-25 17:21:13 -07:00
Dmitry Kovalev
8266da1cd1 Moving from int_mv* to MV* (3).
Change-Id: I9795d0937bc07793c13d067281995e0750f694d9
2013-09-25 16:44:19 -07:00
Dmitry Kovalev
f9e2140cab Merge "Moving from int_mv* to MV* (2)." 2013-09-25 16:12:13 -07:00
Dmitry Kovalev
64eff7f360 Removing vp9_subpelvar.h from common.
Moving all code from that file to vp9_variance_c.c in the encoder.

Change-Id: Ic803d5b4c78d5191e4d25541b3df97337878fc3e
2013-09-25 16:10:43 -07:00
Dmitry Kovalev
2b5670238b Merge "Replacing txfm with tx." 2013-09-25 15:57:56 -07:00
Dmitry Kovalev
e2c92d1510 Merge "Removing unused SUBMVREF_COUNT constant." 2013-09-25 15:57:49 -07:00
Dmitry Kovalev
87a214c277 Merge "Adding vp9_get_entropy_contexts function." 2013-09-25 15:43:55 -07:00
Dmitry Kovalev
9cd14ea6ed Merge "Removing redundant 'extern' keyword." 2013-09-25 15:42:48 -07:00
Dmitry Kovalev
49f5efa8d8 Removing unused SUBMVREF_COUNT constant.
Change-Id: I302ab4603553352a84b57bc89bc9e3d037978d29
2013-09-25 15:33:05 -07:00
Scott LaVarnway
208658490c d63 intra prediction ssse3 using bytes
byte version of ronalds d63 ssse3 optimizations
(commit: c5a1c8cf3541cf3665fee981b36d22c9fbd4191e)

Change-Id: Ifd3e6d454a2246085f23eabb38518a930321e807
2013-09-25 16:16:44 -04:00
Dmitry Kovalev
d445945a84 Adding vp9_get_entropy_contexts function.
Change-Id: Ife0dd29fb4ad65c7e12ac5f1db8cea4ed81de488
2013-09-24 17:26:05 -07:00
Dmitry Kovalev
d0365c4a2c Replacing txfm with tx.
Renaming txfm_stepdown_count to tx_stepdown_count and max_txfm_size to
max_tx_size.

Change-Id: Ifc173e22c78240e561a57c4c741b64b1b8fc6fef
2013-09-24 17:24:35 -07:00
Dmitry Kovalev
c7b7b1da86 Using size_t for memory buffer size.
Change-Id: Ibf1642525731c66c99fa25f95c7b5834ae88c688
2013-09-24 16:38:30 -07:00
Dmitry Kovalev
682c27239f Merge "Cleaning up vp9_update_nmv_count function." 2013-09-24 16:27:18 -07:00
Dmitry Kovalev
450cbfe53a Cleaning up vp9_update_nmv_count function.
Using best_mv[2] array instead of two separate variables.

Change-Id: Iefa0a41f5c42c42f2c66cef26750da68405f0f25
2013-09-24 15:55:49 -07:00
Dmitry Kovalev
12d57a9409 Removing redundant 'extern' keyword.
Change-Id: Ie51306689c0dc527a8aa12d3984389dd8f360dea
2013-09-24 15:13:09 -07:00
Dmitry Kovalev
d571e4e785 Replacing unsigned char* with uint8_t*.
Change-Id: I99a1880aee015ae16311ba05a31aa307df89bef2
2013-09-24 14:57:42 -07:00
Guillaume Martres
57272e41dd Correctly set the segment_id prediction flag and context
This fixes a bug introduced by ac6093d179

Change-Id: I0700a4daf7a6a2471074f81a4596352287fb2ac9
2013-09-24 14:18:27 -07:00
Yaowu Xu
35c5d79e6b Limit mv search range for first pass and mbgraph
Both first pass and mbgraph search use block size 16x16 for motion
estimation. This commit puts a limit on the motion vector range. The
effective range allows the entire 16x16 block with its required subpel
interpolation input to be completely outside the image border, but
not any further away from the image border.

Change-Id: Id70a5ed08be49e70959f064859d72adc7d775d08
2013-09-24 13:47:29 -07:00
Dmitry Kovalev
b87696ac37 Moving from int_mv* to MV* (2).
Updating fractional_mv_step_fp and fractional_mv_step_comp_fp function
types.

Change-Id: I601c4378bc39ac3ffd4e295d9cbd8e1f74829d46
2013-09-24 12:48:12 -07:00
Jingning Han
b1c58f57a7 Merge "Remove redundant mode update in sub8x8 decoding" 2013-09-24 12:35:58 -07:00
Dmitry Kovalev
30888742f4 Merge "Moving from int_mv to MV." 2013-09-24 12:25:56 -07:00
Yaowu Xu
71cfaaa689 Merge "Replace memcpy with vpx_memcpy" 2013-09-24 11:35:03 -07:00
Yaowu Xu
9be0bb19df Replace memcpy with vpx_memcpy
Also removed obsolete comment

Change-Id: Iae1664777d76383639c637ee786e0d50fc45819a
2013-09-24 10:56:06 -07:00
Yaowu Xu
6037f17942 Rename defined constants
The change is to better reflect the nature of the constants.

Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
2013-09-24 10:53:01 -07:00
Yaowu Xu
ff1ae7f713 Prevent using uninitialized value in RD decision
INT64_MAX may be assigned as RDCOST when the RDCOST computation is skipped
for speed. This commit prevents INT64_MAX from being used as a real
RDCOST in the transform size decision.

Change-Id: I89a945134191bbdea1f1431ade70424ac079eaac
2013-09-24 10:53:01 -07:00
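The guard described in this commit can be sketched roughly as follows. This is only an illustration, not the actual libvpx code: the helper name pick_best_tx_size and its array-based interface are hypothetical, and the only assumption taken from the message is that INT64_MAX marks a cost that was never computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from libvpx): return the index of the smallest
 * rd cost, treating INT64_MAX as "cost was skipped, do not compare".
 * Returns -1 when no candidate has a real cost. */
static int pick_best_tx_size(const int64_t *rd, int n) {
  int best = -1;
  assert(rd != NULL && n >= 0);
  for (int i = 0; i < n; ++i) {
    if (rd[i] == INT64_MAX) continue; /* skipped computation, not a real cost */
    if (best < 0 || rd[i] < rd[best]) best = i;
  }
  return best;
}
```

A caller would fall back to some default transform size when the helper returns -1, rather than comparing placeholder values against each other.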
Yaowu Xu
fe533c9741 Merge "Change to prevent invalid memory access" 2013-09-24 10:37:17 -07:00
Dmitry Kovalev
f24b9b4f87 Merge "Adding best_mv[2] array instead of two variables." 2013-09-24 10:17:53 -07:00
Deb Mukherjee
f1a627e8a2 Merge "Small tweak in the constant quality parameter" 2013-09-24 09:51:08 -07:00
Jingning Han
9bcd750565 Merge "Enable per transformed block zero coeffs forcing" 2013-09-24 09:18:17 -07:00
Jingning Han
24ad692572 Merge "Calculate rd cost per transformed block" 2013-09-24 09:18:03 -07:00
Deb Mukherjee
b7a93578e5 Small tweak in the constant quality parameter
Improves results a little.

Change-Id: I7bcac02dbb65b43a993445cf557c520197114e5c
2013-09-24 09:09:35 -07:00
Yunqing Wang
bacb5925ff Merge "Number of instructions in fdct4_1d_sse2 reduced by two." 2013-09-24 08:40:56 -07:00
Yaowu Xu
92a29c157f Change to prevent invalid memory access
After the change of MI context storage, the mi_8x8[] pointer may be null for
a block outside of the image border. The commit changes to access the data
only after validation of mi_row and mi_col.

Change-Id: I039c4eb486a228ea9d8e5f35ab9ae6717d718bf3
2013-09-24 08:36:59 -07:00
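A minimal sketch of the "validate first, then access" pattern this commit describes. The ModeInfo type, the get_mi helper and its parameters are hypothetical stand-ins for illustration, not the real MODE_INFO layout or any libvpx function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MODE_INFO; the real struct has many more fields. */
typedef struct {
  int mode;
} ModeInfo;

/* Hypothetical accessor: check mi_row/mi_col against the frame size before
 * touching the pointer grid. Returns NULL for positions outside the image. */
static const ModeInfo *get_mi(ModeInfo *const *grid, int stride, int mi_rows,
                              int mi_cols, int mi_row, int mi_col) {
  assert(grid != NULL && stride >= mi_cols);
  if (mi_row < 0 || mi_row >= mi_rows || mi_col < 0 || mi_col >= mi_cols)
    return NULL; /* outside the image border: never read the grid */
  return grid[mi_row * stride + mi_col];
}
```

The point of the ordering is that the index computation and the read only happen for coordinates already known to be inside the frame.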
A.Mahfoodh
13c7715a75 Number of instructions in fdct4_1d_sse2 reduced by two.
Mathematically the results are the same.

Change-Id: I1c5126cd3ca64e8515ca6331e0989c6f7dd651a0
2013-09-23 17:23:27 -07:00
Jingning Han
e85eaf6acd Remove redundant mode update in sub8x8 decoding
The probability model used to code prediction mode is conditioned
on the immediate above and left 8x8 blocks' prediction modes. When
the above/left block is coded in sub8x8 mode, we use the prediction
mode of the bottom-right sub8x8 block as the reference to generate
the context.

This commit moves the update of mbmi.mode out of the sub8x8 decoding
loop, hence removing redundant update steps and keeping the bottom-
right block's mode for the decoding process of next blocks.

Change-Id: I1e8d749684d201c1a1151697621efa5d569218b6
2013-09-23 17:21:40 -07:00
Yaowu Xu
838eae3961 Correct 3 step search site initialization
39c7b01d accidentally reverted the row/col initialization, which broke
the mv clamps, which depend on the sites for the valid motion vector
range. This commit fixes the issue.

Change-Id: Ibcce0226e0360b1ef483fe760b2e33f1af4bf494
2013-09-23 16:11:49 -07:00
Jingning Han
a517343ca3 Enable per transformed block zero coeffs forcing
This commit enables forcing all coefficients zero per transformed
block, when its rate-distortion cost is lower than regular coeff
quantization.

The overall performance improvement (including its parent patch on
calculating rd cost per transformed block) at speed 1:
derf:  0.298%
yt:    0.452%
hd:    0.741%
stdhd: 0.006%

Change-Id: I66005fe0fd7af192c3eba32e02fd6d77952accb5
2013-09-23 10:39:35 -07:00
Jingning Han
54c87058bf Merge "Remove redundant mv_pred use for sub8x8 blocks" 2013-09-23 08:47:21 -07:00
Deb Mukherjee
d11221f433 Improves constant qual, constrained qual turned on
Adds modeled functions to decide the qp for altref frames in constant q
mode similar to other functions in use in bitrate mode.

Also turns on the constrained quality mode (end-usage=2) option which
was turned off before. Basic testing shows the mode works in principle,
to cap bitrate to the target-bitrate specified, while allowing lower
bitrate depending on the cq-level specified. The mode will need to be
improved over time.

Results for constant quality vs bitrate control mode:
derfraw300/fullderfraw: +3.0% at constant quality over bitrate control.
fullstdhdraw: +4.341%
stdhdraw250: +5.361%

Change-Id: If5027c9ec66c8e88d33e47062c6cb84a07b1cda9
2013-09-22 23:04:50 -07:00
Dmitry Kovalev
14330abdc6 Merge "Cleanup in vp9_init3smotion_compensation." 2013-09-21 02:57:47 -07:00
Johann
a6a00fc6a3 Use lowercase instruction in assembly
The iOS compiler does not recognize BLE:
bad instruction `BLE idct32_transpose_pair_loop'

Change-Id: I7426694c66bc31caf939a2d5000968da1222c15b
2013-09-20 16:11:05 -07:00
Jingning Han
78fbb10642 Calculate rd cost per transformed block
This commit makes the rate-distortion optimization loop evaluate
the rd costs of regular quantization and all zero coeffs, per
transformed block. It improves speed 1 compression performance:

derf: 0.245%
yt:   0.515%

For a large partition that consists of multiple transformed blocks,
this allows more flexibility to selectively force a portion of
them to be coded as all zero coeffs, as will be continued in the next
patches.

Change-Id: I211518be4179747b57375696f017d1160cc91851
2013-09-20 12:40:17 -07:00
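The per-block decision described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the BlockCost struct and both function names are hypothetical, and the cost formula (rate scaled by a multiplier and shifted by 8, plus distortion scaled by a divisor) is only a plausible form of an rd cost, not a quote of the encoder's macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one candidate coding of a transform block. */
typedef struct {
  int rate;     /* bits needed, in the encoder's rate units */
  int64_t dist; /* distortion of the reconstruction */
} BlockCost;

/* Assumed form of a rate-distortion cost, kept in 64 bits. */
static int64_t rd_cost(int rdmult, int rddiv, int rate, int64_t dist) {
  assert(rdmult >= 0 && rddiv >= 0);
  return ((128 + (int64_t)rate * rdmult) >> 8) + (int64_t)rddiv * dist;
}

/* Returns 1 when coding the block as all zero coefficients is cheaper in
 * rd terms than keeping the regularly quantized coefficients. */
static int should_force_zero(int rdmult, int rddiv, BlockCost quantized,
                             BlockCost all_zero) {
  return rd_cost(rdmult, rddiv, all_zero.rate, all_zero.dist) <
         rd_cost(rdmult, rddiv, quantized.rate, quantized.dist);
}
```

Running this test once per transformed block, instead of once per partition, is what lets only some of the blocks in a large partition be zeroed.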
Dmitry Kovalev
bb5e2bf86a Adding best_mv[2] array instead of two variables.
Change-Id: I584fe50f73879f6a72fada45714ef80893b6d549
2013-09-20 17:08:53 +04:00
Dmitry Kovalev
e51e7a0e8d Moving from int_mv to MV.
Converting vp9_mv_bit_cost, mv_err_cost, and mvsad_err_cost
functions for now.

Change-Id: I60e3cc20daef773c2adf9a18e30bc85b1c2eb211
2013-09-20 13:52:43 +04:00
Dmitry Kovalev
39c7b01d3c Cleanup in vp9_init3smotion_compensation.
Change-Id: Ie47f53e76bc9530475c8c6d24e9b7a5a0189de56
2013-09-20 12:54:14 +04:00
Dmitry Kovalev
24df77e951 Merge "Adding get_scan_and_band function." 2013-09-20 00:15:06 -07:00
Jingning Han
44b708b4c4 Remove redundant mv_pred use for sub8x8 blocks
The sub8x8 blocks have their own motion vector reference scheme. The
mv_pred is only used for blocks of sizes 8x8 and above, to find the
starting point for motion search.

This change does not change any coding behavior. It makes the
encoding process slightly faster. (0.5% speed-up for local test on
speed 1.)

Change-Id: I746ee6ef0eac19aa3621be014afa12be8d82cbb9
2013-09-19 10:32:44 -07:00
Yaowu Xu
79af591368 change to avoid invalid memory read.
The fake token EOSB may cause an invalid memory read in pack token; this
commit reworks the loop to avoid such an invalid read.

Change-Id: I37fdfce869b44a7f90003f82a02f84c45472a457
2013-09-19 08:22:10 -07:00
Yaowu Xu
014acfa2af fix integer overflow errors
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
2013-09-19 08:14:26 -07:00
Dmitry Kovalev
a23c2a9e7b Adding get_scan_and_band function.
Extracting get_scan_and_band function from get_entropy_context to
remove duplicated code.

Change-Id: I5da1f5a60263017e887da68bc834317b5f084cb2
2013-09-19 16:53:48 +04:00
Dmitry Kovalev
1600707d35 Merge "Removing redundant code from vp9_mcomp.c." 2013-09-19 00:30:18 -07:00
Dmitry Kovalev
cda802ac86 Merge "Removing redundant coef calculation + cleanup." 2013-09-19 00:28:31 -07:00
Dmitry Kovalev
0fcb0e17bc Merge "Fixing typo in the encoder." 2013-09-19 00:26:52 -07:00
Yunqing Wang
a7b7f94ae8 Merge "Fix x86inc.asm to build PIC code correctly" 2013-09-18 14:51:31 -07:00
Yunqing Wang
9d901217c6 Fix x86inc.asm to build PIC code correctly
The current x86inc.asm didn't handle 32-bit PIC builds properly.
TEXTRELs were seen in the built library. The PIC macros from
libvpx's x86_abi_support.asm were used to fix this problem.
The assembly code was modified to use the macros.

Notes: We need this fix in for decoder building. Functions in
encoder will be fixed later.

Change-Id: Ifa548d37b1d0bc7d0528db75009cc18cd5eb1838
2013-09-18 13:45:46 -07:00
Dmitry Kovalev
98cf0145b1 Removing redundant coef calculation + cleanup.
Adding temp variable for &x->plane[0], inlining src_diff values.

Change-Id: I24c08a5425a6da6fd66f5b0278f2fce74f9989b2
2013-09-18 16:20:10 +04:00
Dmitry Kovalev
72fd127f8c Removing redundant code from vp9_mcomp.c.
Replacing ((1 << MV_MAX_BITS) - 1) with MV_MAX, adding const
qualifiers, reusing computed values.

Change-Id: I7b46d47f6c644b079d9c3478116a9de465a9baec
2013-09-18 13:11:38 +04:00
Dmitry Kovalev
245ca04bab Fixing typo in the encoder.
Change-Id: I168efdc366eecf638694f357ccad2f4eba7e2fdb
2013-09-18 12:02:22 +04:00
Yaowu Xu
85fd8bdb01 Merge "Silence a bunch of MSVC warnings" 2013-09-17 17:10:58 -07:00
Jingning Han
c437bbcde0 Clean up second ref check in sub8x8 rd loop
This commit cleans up the second reference check in the
rate-distortion optimization loop of sub8x8 blocks.

Change-Id: Ife68feaa4cddbfad2878c9b44d3012788d634f97
2013-09-17 15:59:49 -07:00
Yaowu Xu
a783da80e7 Silence a bunch of MSVC warnings
Change-Id: I16633269582a640809dca27572bbe99efa6369fc
2013-09-17 12:08:51 -07:00
Jingning Han
2b3bfaa9ce Remove redundant argument in get_sub_block_mv
The sub8x8 check can be directly inferred from block_idx, hence
removed from the arguments of get_sub_block_mv.

Change-Id: Ib766d57e81248fb92df0f6d9b163e6c77b933ccd
2013-09-17 12:08:45 -07:00
Paul Wilkins
84758960db Merge "Minor clean up." 2013-09-17 03:39:24 -07:00
Paul Wilkins
90a52694f3 Merge "Adjustment to mode_skip_start." 2013-09-17 03:39:15 -07:00
hkuang
cbf394574d Merge "Speed up iht8x8 by rearranging instructions. Speed improves from 282% to 302% faster based on assembly-perf." 2013-09-16 14:39:45 -07:00
hkuang
23e1a29fc7 Speed up iht8x8 by rearranging instructions.
Speed improves from 282% to 302% faster based on assembly-perf.

Change-Id: I08c5c1a542d43361611198f750b725e4303d19e2
2013-09-16 14:23:26 -07:00
Yaowu Xu
eeae6f946d fix a problem where an invalid mv used in search
The commit added reset of pred_mv at the beginning of each SB64x64
partition mv search, also limited the usage of pred_mv only when
search on the largest partition is already done. This is to fix
a crash at speed 1/2 encoder where an invalid mv is used in mv
search.

Change-Id: I39010177da76d054e3c90b7899a44feb2e3a5b1b
2013-09-16 12:49:27 -07:00
Paul Wilkins
cb50dc7f33 Minor clean up.
Removed some unused code and minor cleanup
/ reordering.

Change-Id: I4083ae56aeb8edfe9b85aa2f42a16aa28d19da94
2013-09-16 13:45:20 +01:00
Paul Wilkins
3b01778450 Adjustment to mode_skip_start.
Corrected values relating to modified mode order.

Change-Id: I24fccba3af4bc16721d5e7e51888a66305bfa7fe
2013-09-16 13:44:48 +01:00
James Zern
2d58761993 Revert "Improved 8t filters"
This is incompatible with most toolchains other than gcc.

Revert "Deleted #include <inttypes.h>"

This reverts commit 4d018be950.

This reverts commit d22a504d11.

Change-Id: I1751dc6831f4395ee064e6748281418e967e1dcf
2013-09-13 15:13:06 -07:00
Jingning Han
e8a967d960 Merge "Adaptive motion search control" 2013-09-13 14:43:23 -07:00
Jingning Han
c4826c5941 Adaptive motion search control
This commit enables adaptive constraint on motion search range for
smaller partitions, given the motion vectors of collocated larger
partition as a candidate initial search point.

It makes the speed 0 runtime of bus at CIF and 2000 kbps go from
167s down to 162s (3% speed-up), at a 0.01dB performance gain. In
the settings of speed 1, this makes the runtime go from 33687 ms
to 32142 ms (4.5% speed-up), at a 0.03dB performance gain.

Compression performance wise, it gains at speed 1:
derf  0.118%
yt    0.237%
hd    0.203%
stdhd 0.438%

Change-Id: Ic8b34c67810d9504a9579bef2825d3fa54b69454
2013-09-13 13:58:10 -07:00
Deb Mukherjee
0c3038234d Merge "Clean up of the search best filter speed feature" 2013-09-13 11:03:59 -07:00
Paul Wilkins
5d8642354e Merge "Fix VP9_mode_order[]" 2013-09-13 09:19:31 -07:00
Scott LaVarnway
8fc95a1b11 Merge "New mode_info_context storage -- undo revert" 2013-09-13 08:56:20 -07:00
Paul Wilkins
1407cf8588 Fix VP9_mode_order[]
Mis-merge of the following change managed to break mode order
and delete two mode options (new alt ref and near alt ref)
It also created a situation where we could test two undefined
modes off the end of the VP9_mode_order[] data structure.
  "clang warnings : remove split and i4x4_pred fake modes"
  "Change Id: I8ef3c*"

Initial testing on Akiyo at speed 2.
101.35	 44.567	 44.447 improves to
96.82	 44.915	 44.815

Approx 0.3-0.4db gain and 2.5% size reduction

Change-Id: Icff813e7c0778d140ad4f0eea18cf1ed203c4e34
2013-09-13 13:33:26 +01:00
Paul Wilkins
9c9a3b2775 Merge "Deleted #include <inttypes.h>" 2013-09-13 01:05:31 -07:00
Jim Bankoski
324ebb704a Merge "fix clang warning in rdopt" 2013-09-12 16:39:05 -07:00
hkuang
86fb12b600 Merge "Add neon optimize iht8x8 which is 282% faster than C." 2013-09-12 15:42:44 -07:00
Christian Duvivier
25655e5794 Merge "First draft of vp9_short_idct32x32_add_neon." 2013-09-12 14:23:00 -07:00
hkuang
182366c736 Add neon optimize iht8x8 which is 282% faster than C.
Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530
2013-09-12 11:49:05 -07:00
Jim Bankoski
9ee9918dad fix clang warning in rdopt
either missed this or it crept back in

Change-Id: I6cc1519d09e558be7250254c25bde2ae720555ea
2013-09-12 06:39:42 -07:00
Jim Bankoski
e7f2aa0fb8 clang warnings : ref frame enum mismatch
Convert from refframe_type_t to VP9_REFFRAME

Change-Id: Iff4043c3fdb3e1c9c2b412bdffd5da8ed913ec13
2013-09-12 06:29:07 -07:00
Jim Bankoski
cddde51ec5 Merge "clang warnings : remove split and i4x4_pred fake modes" 2013-09-12 06:20:45 -07:00
Paul Wilkins
4d018be950 Deleted #include <inttypes.h>
This seems not to be needed and is not supported
in the Windows build.

Change-Id: Iaca3bbf8cca283aee6bc336cb31ba9dd4610322b
2013-09-12 13:43:07 +01:00
Paul Wilkins
66755abff4 Merge "Changes in speed 2 settings" 2013-09-12 02:22:45 -07:00
Jim Bankoski
7fb42d909e clang warnings : remove split and i4x4_pred fake modes
Change-Id: I8ef3c7c0f08f0f1f4ccb8ea4deca4cd8143526ee
2013-09-11 16:34:55 -07:00
Christian Duvivier
6a501462f8 First draft of vp9_short_idct32x32_add_neon.
Lots of TODOs which will be taken care of in upcoming changes. As is,
about 6x faster than C version.

Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0
2013-09-11 15:19:38 -07:00
Deb Mukherjee
b964646756 Clean up of the search best filter speed feature
Removes this speed feature since it is very slow and unlikely
to be used in practice. This cleanup removes a bunch of unnecessary
complications in the outer encode loop.

Change-Id: I3c66ef1ca924fbfad7dadff297c9e7f652d308a1
2013-09-11 15:16:36 -07:00
Scott LaVarnway
23845947c4 Merge "Improved 8t filters" 2013-09-11 14:34:54 -07:00
Jim Bankoski
d09abfa9f7 Merge "resolve clang issue : implicit convert tx_mode -> tx_size" 2013-09-11 13:40:11 -07:00
Scott LaVarnway
d22a504d11 Improved 8t filters
Reformatted version of a patch submitted by Erik/Tamar
from Intel.  For the test clips used, the decoder
performance improved by ~2%.

Change-Id: Ifbc37ac6311bca9ff1cfefe3f2e9b7f13a4a511b
2013-09-11 13:56:32 -04:00
Deb Mukherjee
69fe840ec4 Changes in speed 2 settings
Propose some changes to the speed 2 settings to improve quality.
In particular, turns off the adjust_thresholds_by_speed feature
which improves results by 6%. Also removes the code for
adjust_thresholds_by_speed since it conflicts with the adaptive
rd thresh feature.

Overall, with this change speed 2 is -15.2% from speed 0 settings,
on derf, which is significantly better than -21.6% down before.

Change-Id: I6e90a563470979eb0c258ec32d6183ed7ce9a505
2013-09-11 10:54:07 -07:00
Scott LaVarnway
ac6093d179 New mode_info_context storage -- undo revert
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now consists of pointers to MODE_INFO structs. The
MODE_INFO structs are now stored as a stream (decoder only),
which eliminates unnecessary copies and is a little more cache
friendly.

Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
2013-09-11 13:45:44 -04:00
Yunqing Wang
079183c1a8 code cleanup
Removed unused function.

Change-Id: Icb12a09e4d303968be6aec9fae1ef05935913a4f
2013-09-11 09:32:00 -07:00
Jingning Han
65fe7d7605 Merge "Remove redundant condition check in 32x32 quant" 2013-09-10 16:39:18 -07:00
Jingning Han
cb24406da5 Merge "Remove the use of uninitialized_safe in encode_sb_" 2013-09-10 12:05:22 -07:00
Jingning Han
5d93feb6ad Remove redundant condition check in 32x32 quant
The c code implementation of 32x32 quantization does the zbin check
of all coefficients prior to the quant/dequant loop, hence removing
the redundant zbin check inside the loop. This only affects the
c code version. SSSE3 version does not separate the zbin check out.

Change-Id: Ic197a7d61d0b25fcac3cc092987651378cb56e4e
2013-09-10 12:04:33 -07:00
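The restructuring described in this commit, a zbin pre-pass followed by a quant loop with no zbin test, might look roughly like the sketch below. It is heavily simplified and hypothetical: the function name, the fixed MAX_COEFFS bound, and the plain division by a step size are illustrative assumptions, not the actual 32x32 quantizer arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { MAX_COEFFS = 1024 }; /* 32x32 block; hypothetical constant name */

/* Hypothetical two-pass quantizer. Pass 1 records which coefficients clear
 * the zero-bin threshold; pass 2 quantizes only those, so the zbin test is
 * not repeated inside the quant loop. Returns the number of survivors. */
static int quantize_two_pass(const int16_t *coeff, int n, int zbin, int step,
                             int16_t *qcoeff) {
  int idx[MAX_COEFFS];
  int count = 0;
  assert(n >= 0 && n <= MAX_COEFFS && step > 0);
  for (int i = 0; i < n; ++i) {
    qcoeff[i] = 0;
    if (abs(coeff[i]) >= zbin) idx[count++] = i; /* the only zbin check */
  }
  for (int k = 0; k < count; ++k) {
    const int i = idx[k];
    const int mag = abs(coeff[i]) / step; /* simplified quantization */
    qcoeff[i] = (int16_t)(coeff[i] < 0 ? -mag : mag);
  }
  return count;
}
```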
Deb Mukherjee
3d22d3ae0c Merge "Small tweaks on the constant quality mode" 2013-09-10 11:16:47 -07:00
Deb Mukherjee
09830aa0ea Small tweaks on the constant quality mode
Improves results a little.
derf is now +1.078% over bitrate control.

Change-Id: I4812136f3e67be21d14ec089419976a32a841785
2013-09-10 10:16:19 -07:00
Yunqing Wang
0607abc3dd Stop partition checking when distortion is small
If the current obtained distortion is very small, which happens
for static image case, we pick the current partition type without
further split checking.

This won't affect regular videos. For static videos, we got 10%~12%
encoding speed gain. PSNR was better for some clips, and worse for
others. Overall it was even.

Change-Id: If787a57bedf46fc595ca4f5ded2b0c0a69e9fdef
2013-09-10 10:13:24 -07:00
Yunqing Wang
939791a129 Modify encode breakout for static frames
Thanks to Paul for the suggestions. While turning on static-thresh
for static-image videos, a big jump in bitrate was seen. In this
patch, we detect static frames in the video using first-pass
stats. For different cases, we disable encode breakout or reduce
the encode breakout threshold to limit the skipping.

More modifications need to be done to break the incorrect partition
picking pattern for static frames while skipping happens.

Change-Id: Ia25f47041af0f04e229c70a0185e12b0ffa6047f
2013-09-10 09:06:03 -07:00
hkuang
f4a6f936b5 Merge "Speed up idct16x16 by rearrange instructions." 2013-09-10 08:23:57 -07:00
Paul Wilkins
4f660cc018 Modified mode skip functionality.
A previous speed feature skipped modes not used in earlier
partitions but this no longer worked as intended following
changes to the partition coding order and in conjunction
with some other speed features (Especially speed 2 and above).

This modified mode skip feature sets a mask after the first X
modes have been tested in each partition depending on the
reference frame of the current best case.

This patch also makes some changes to the order modes are
tested to fit better with this skip functionality.

Initial testing suggests speed and rd hit count improvements
of up to 20% at speed 1. Quality results. (derf -1.9%, std hd  +0.23%).

Change-Id: Idd8efa656cbc0c28f06d09690984c1f18b1115e1
2013-09-10 13:30:10 +01:00
Paul Wilkins
901c495482 Added extra check to rd_auto_partition_range()
Added check that the returned max and minimum are
valid in bottom and right border cases.

Change-Id: I2d6cdc9b5f04c7d0ff512ddcf3228331e028bf9b
2013-09-10 13:29:23 +01:00
hkuang
fc5ec206a7 Speed up idct16x16 by rearrange instructions.
Speed improves from 376% to 400% faster based on assembly-perf.

Change-Id: If0b2eccc39d5793dc101ce9feb7fcadf88396ea2
2013-09-09 18:00:13 -07:00
Ivan Maltz
20abe595ec Merge "API extensions and sample app for spacial scalable encoder" 2013-09-09 16:57:01 -07:00
Ivan Maltz
01b35c3c16 API extensions and sample app for spacial scalable encoder
Sample app: vp9_spatial_scalable_encoder
vpx_codec_control extensions:
  VP9E_SET_SVC
  VP9E_SET_WIDTH, VP9E_SET_HEIGHT, VP9E_SET_LAYER
  VP9E_SET_MIN_Q, VP9E_SET_MAX_Q
expanded buffer size for vp9_convolve

modified setting of initial width in vp9_onyx_if.c so that layer size
can be set prior to initial encode

Default number of layers set to 3 (VPX_SS_DEFAULT_LAYERS)
Number of layers set explicitly in vpx_codec_enc_cfg.ss_number_layers

Change-Id: I2c7a6fe6d665113671337032f7ad032430ac4197
2013-09-09 15:57:56 -07:00
Jingning Han
18c780a0ff Remove the use of uninitialized_safe in encode_sb_
Initialize the probability model context with default value in
encode_sb.

Change-Id: Id826114024dfc21c7ef41aea9f4a0316d4a5cb95
2013-09-09 15:41:16 -07:00
James Zern
c1913c9cf4 Merge "Revert "New mode_info_context storage"" 2013-09-09 14:38:01 -07:00
James Zern
54a03e20dd Revert "New mode_info_context storage"
This reverts commit dae17734ec

Encode crashes, leaks and increases integer overflow errors.

Change-Id: I595aa2649bb8d0b6552ff91652837a74c103fda2
2013-09-09 13:37:01 -07:00
Yaowu Xu
b19126b291 Merge "Reduce the amount of extension in src frames" 2013-09-09 08:09:56 -07:00
Paul Wilkins
740acd6891 Merge "Enable kf restrictions at speed 4" 2013-09-09 05:39:13 -07:00
Yaowu Xu
65c2444e15 Reduce the amount of extension in src frames
The commit changes the border pixel extension from 160 pixels on each side
to what is necessary in the arnr filter or motion estimation portion, i.e.
16 pixels on the top and left sides. For the right or bottom side, the
extension is changed to either round up the image size to a multiple of 64
or at least 16 pixels.

Change-Id: Ic05e19b94368c1ab4df568723aae5734e6c3d2c5
2013-09-08 15:51:54 -07:00
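One possible reading of the right/bottom rule in this commit is "pad up to the next multiple of 64, but never by fewer than 16 pixels". The sketch below implements that reading only; the constant names and the helper are hypothetical, and the actual encoder may combine the two conditions differently.

```c
#include <assert.h>

/* Hypothetical names for the two values mentioned in the commit message. */
enum { SRC_BORDER_MIN = 16, SRC_ALIGN = 64 };

/* Extension in pixels for the right or bottom side of a source frame whose
 * width or height is `size`, under the reading described above. */
static int right_or_bottom_extension(int size) {
  assert(size > 0);
  const int aligned = (size + SRC_ALIGN - 1) & ~(SRC_ALIGN - 1);
  const int pad = aligned - size;
  return pad > SRC_BORDER_MIN ? pad : SRC_BORDER_MIN;
}
```

Under this reading a 352-pixel CIF width gets 32 pixels of extension, while a 1920-pixel width, already a multiple of 64, gets the 16-pixel minimum.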
Jim Bankoski
9faa7e8186 resolve clang issue : implicit convert tx_mode -> tx_size
Change-Id: Ifc9da470358f58e800e3d0d70a565b61e5f7834a
2013-09-08 07:17:12 -07:00
Jim Bankoski
e378566060 Merge "New mode_info_context storage" 2013-09-08 07:16:25 -07:00
Jingning Han
09bc942b47 Fix overflow issue in 16x16 quantization SSSE3
The 16x16 transform unit test suggested that the peak coefficient
value can reach 32639. This could cause a potential overflow issue
in the SSSE3 implementation of 16x16 block quantization. This commit
fixes this issue by replacing addition with saturated addition.

Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e
2013-09-06 21:06:10 -07:00
Paul Wilkins
f15cdc7451 Enable kf restrictions at speed 4
Change-Id: I453409d3be3f5fe118b15affde45cb52184aef20
2013-09-06 11:16:04 -07:00
Deb Mukherjee
e378a89bd6 Support a constant quality mode in VP9
Adds a new end-usage option for constant quality encoding in vpx. This
first version implemented for VP9, encodes all regular inter frames
using the quality specified in the --cq-level= option, while encoding
all key frames and golden/altref frames at a quality better than that.

The current performance on derfraw300 is +0.910% up from bitrate control,
but achieved without multiple recode loops per frame.

The decision for qp for each altref/golden/key frame will be improved
in subsequent patches based on better use of stats from the first pass.
Further, the qp for regular inter frames may also be varied around the
provided cq-level.

Change-Id: I6c4a2a68563679d60e0616ebcb11698578615fb3
2013-09-06 10:30:53 -07:00
Scott LaVarnway
dae17734ec New mode_info_context storage
mode_info_context was stored as a grid of MODE_INFO structs.
The grid now consists of a pointer to a MODE_INFO struct and
an "in the image" flag.  The MODE_INFO structs are now stored
as a stream, which eliminates unnecessary copies and is a little
more cache friendly.

For the test clips used, the decoder performance improved
by ~4.3% (1080p) and ~9.7% (720p).

Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
and 5.9% (720p).

Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
2013-09-06 12:33:34 -04:00
Jim Bankoski
e4e864586c Merge "fix loop filter setup_mask could reach out of bounds issue" 2013-09-06 06:21:28 -07:00
hkuang
3476404912 Merge "Speed up idct8x8 by rearrange instructions. Speed improve from 264% ~ 270% to 280% ~ 300% base on assembly-perf." 2013-09-05 17:37:13 -07:00
Jim Bankoski
736114f44b fix loop filter setup_mask could reach out of bounds issue
Change-Id: Ic8446c4f26b6782a6dc482c19ea73c77646df418
2013-09-05 15:53:31 -07:00
Jingning Han
1c263d6918 Merge "Use saturated addition in SSSE3 of 32x32 quant" 2013-09-05 14:09:40 -07:00
Jim Bankoski
2156ccaa4a Merge "resolve clang warnings : uninitialized vars in vp9_entropy.h" 2013-09-05 12:55:32 -07:00
Jingning Han
458c2833c0 Use saturated addition in SSSE3 of 32x32 quant
The 32x32 forward transform can potentially reach a peak coefficient
value close to 32700, while the rounding factor can go up to 610.
This could cause an overflow issue in the SSSE3 implementation of the
32x32 quantization process.

This commit resolves this issue by replacing the addition operations
with saturated addition operations in 32x32 block quantization.

Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70
2013-09-05 12:49:12 -07:00
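A scalar C sketch of the difference between wrapping and saturated 16-bit addition, using the figures from the message above (32700 + 610). The function names are hypothetical. The commit itself changes SIMD assembly, where the corresponding SSE2 operation is `paddsw` (intrinsic `_mm_adds_epi16`); whether that exact instruction is the one used is an assumption here.

```c
#include <stdint.h>

/* Plain 16-bit addition: the sum is truncated to 16 bits, so it wraps.
 * The final conversion is implementation-defined in C, but yields the
 * two's-complement result on mainstream compilers. */
static int16_t add16_wrap(int16_t a, int16_t b) {
  return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Saturated addition: the sum is clamped to the int16_t range. */
static int16_t add16_sat(int16_t a, int16_t b) {
  const int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}
```

With a coefficient of 32700 and a rounding factor of 610, the wrapping version produces a large negative value, while the saturated version stays at 32767.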
Jim Bankoski
9fc3d32a50 Merge "faster accounting of inc_mv" 2013-09-05 12:38:56 -07:00
Yaowu Xu
9158b8956f Merge "make bsize requirement for SEG_LVL_SKIP explicit" 2013-09-05 08:15:03 -07:00
Jim Bankoski
2e4ca9d1a5 resolve clang warnings : uninitialized vars in vp9_entropy.h
This helps clear out some of the warnings

Change-Id: Ie7ccaca8fd92542386a7f1b257398e1bdf2f55dc
2013-09-04 18:38:41 -07:00
Jim Bankoski
e8feb2932f Merge "wrap non420 loop filter code in macro" 2013-09-04 17:20:53 -07:00
Paul Wilkins
e5deed06c0 Merge "Attempt to fix speed 4" 2013-09-04 17:19:22 -07:00
Yaowu Xu
1ee66933c1 make bsize requirement for SEG_LVL_SKIP explicit
The segment feature SEG_LVL_SKIP requires the prediction unit size
to be at least BLOCK_8X8. This commit makes the requirement
explicit. This is to prevent future encoder implementations from
making wrong choices.

Change-Id: I0127f0bd4c66e130b81f0cb0a8d3dbfe3b2da5c2
2013-09-04 16:32:26 -07:00
hkuang
01c4e04424 Speed up idct8x8 by rearrange instructions.
Speed improves from 264% ~ 270% to 280% ~ 300% based on assembly-perf.

Change-Id: I3e2cc818ec14b432204ff43732f39b6438db685d
2013-09-04 15:57:22 -07:00
Yaowu Xu
72872d3d8c Merge "Fixing problem with invalid delta_q reading." 2013-09-04 14:21:30 -07:00
hkuang
3c05bda058 Merge "Add neon optimize vp9_short_iht4x4_add." 2013-09-04 13:35:09 -07:00
hkuang
3b8614a8f6 Add neon optimize vp9_short_iht4x4_add.
Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e
2013-09-04 12:37:58 -07:00
Dmitry Kovalev
890eee3b47 Fixing problem with invalid delta_q reading.
This is a bitstream change but no currently produced videos should
be affected. https://code.google.com/p/webm/issues/detail?id=610

Change-Id: Ic85a6477df6c201cdf7f70f6bd84607b71f4593c
2013-09-04 11:25:43 -07:00
Yaowu Xu
76a437a31b Merge "Replacing init_dequantizer() with setup_plane_dequants()." 2013-09-04 10:42:12 -07:00
Jim Bankoski
872c6d85c0 Merge "speed up inc_mv_component" 2013-09-04 10:35:51 -07:00
Jim Bankoski
bb2313db28 Merge "make vp9 postproc a config option" 2013-09-04 10:35:26 -07:00
Yunqing Wang
9fd2767200 Merge "Use correct bit cost while static-thresh is on" 2013-09-04 10:26:37 -07:00
Jim Bankoski
c3c21e3c14 wrap non420 loop filter code in macro
Change-Id: I62bca0e7a4bffc1a78b750dbb9df9d2378e92423
2013-09-04 10:24:42 -07:00
Jim Bankoski
79401542f7 make vp9 postproc a config option
VP9 postproc is disabled for now as it has not been shown to help and
may be merged with VP8.

Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-09-04 10:02:08 -07:00
Jim Bankoski
532179e845 faster accounting of inc_mv
Moves counting of mv branches to where we have a new mv, instead of after
the whole frame is summed.

Change-Id: I945d9f6d9199ba2443fe816c92d5849340d17bbd
2013-09-04 09:47:57 -07:00
Dmitry Kovalev
d6606d1ea7 Replacing init_dequantizer() with setup_plane_dequants().
Change-Id: Ib67e996b4a6dcb6f481889f5a0d84811a9e3c5d1
2013-09-04 09:22:59 -07:00
Jim Bankoski
5dda1d2394 speed up inc_mv_component
Convert mv_class if statements to a lookup. Reorder to avoid ifs.

Change-Id: I76966a21bf517bb1f9a7957c08c476c7bb3e9a63
2013-09-04 07:11:30 -07:00
James Zern
1cf2272347 Merge "Fix intermediate height in convolve_c" 2013-09-03 15:50:33 -07:00
Paul Wilkins
49317cddad Attempt to fix speed 4
Speed 4 fixed partition size. Use fixed size unless it does not
fit inside image, in which case use the largest size that does.

Change-Id: I250f7a80506750dd82ab355721624a1344247223
2013-09-03 17:46:25 +01:00
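The fallback rule described above ("use fixed size unless it does not fit inside image, in which case use the largest size that does") might look roughly like this in C. This is a simplified sketch in pixel units with square power-of-two sizes and a floor of 8; the helper name and parameters are hypothetical, and the encoder itself works with its own block-size enum rather than raw pixel counts.

```c
#include <assert.h>

/* Hypothetical helper: returns the largest square block size (in pixels)
 * that is no bigger than `fixed` and fits in the remaining image area.
 * Sizes are powers of two and never drop below 8. */
static int fit_block_size(int fixed, int cols_left, int rows_left) {
  assert(fixed >= 8 && (fixed & (fixed - 1)) == 0);
  int size = fixed;
  while (size > 8 && (size > cols_left || size > rows_left)) {
    size >>= 1; /* try the next smaller square size */
  }
  return size;
}
```

For example, a fixed size of 32 with only 20 columns left before the image edge falls back to 16.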
Jingning Han
010c0ad0eb Merge "Fix 32x32 forward transform SSE2 version" 2013-09-03 08:58:03 -07:00
Scott LaVarnway
948aaab4ca Merge "Improved mb_lpf_horizontal_edge_w_sse2_8" 2013-09-03 05:44:01 -07:00
Jingning Han
3cf46fa591 Fix 32x32 forward transform SSE2 version
This commit fixed the potential overflow issue in the SSE2
implementation of 32x32 forward DCT. It resolved the corrupted
coded frames in the border of scenes.

Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
2013-08-31 18:47:08 -07:00
Yunqing Wang
0ca7855f67 Use correct bit cost while static-thresh is on
While static-thresh is on, we only need to transmit skip
flag if skip = 1. The cost of skip bit is added to the
total rate cost.

Change-Id: I64e73e482bc297eba22907026298a15fa8cc3920
2013-08-30 15:25:13 -07:00
Paul Wilkins
2b9baca4f0 Merge "Added per pixel inter rd hit count stats" 2013-08-30 08:56:01 -07:00
Tero Rintaluoma
e326cecf18 Fix intermediate height in convolve_c
- Intermediate height was not correct i.e. when block size is 4 and
  y_step_q4 is 6. In this case intermediate height was
  (4*6) >> 4 = 1 and vertical interpolation needs two source pixels
  plus 7 extra pixels for taps.
- Also if the current output block is 16x16 and we are using 4x upscaling
  we need only 12 rows after horizontal filtering instead of 16.

  Patch Set 2: Intermediate_height updated after CL 66723
               "Fix bug in convolution functions (filter selection)"

Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
2013-08-30 10:31:21 +03:00
Jim Bankoski
1d44fc0c49 Merge "rework filter_block_plane" 2013-08-29 20:11:09 -07:00
Jim Bankoski
bc50961a74 rework filter_block_plane
Change-Id: I55c3b60c4c0f4910d3dfb70e3edaae00cfa8dc4d
2013-08-29 17:00:05 -07:00
Jingning Han
c86c5443eb Merge "Fix overflow issue in SSSE3 32x32 quantization" 2013-08-29 16:49:04 -07:00
Paul Wilkins
1f4bf79d65 Added per pixel inter rd hit count stats
Added some code to output normalized rd hit count stats.
In effect this approximates to the average number of rd
operations/tests per pixel for the sequence.

The results are not quite accurate and I have not bothered
to account for partial SB64s at frame edges and for key frames.
However they do give some idea of the number of modes /
prediction methods being tested for each pixel across the
different partition sizes. This indicates how much scope there
is for further gains either by reducing the number of partitions
examined or the modes per partition through heuristics.

Patch 3 moved place where count incremented so partial rd
tests that are aborted with INT_MAX return are also counted.

Example numbers for first 50 frames of Akiyo.
Speed 0 ~84.4 rd operations / pixel
Speed 1 ~28.8
Speed 2 ~11.9

Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
2013-08-30 00:13:51 +01:00
Deb Mukherjee
b6dbf11ed5 Merge "Adds a speed feature for fast 1-loop forw updates" 2013-08-29 15:54:04 -07:00
James Zern
e83e8f0426 Merge changes Ib1e853f9,Ifd75c809,If3e83404
* changes:
  consistently name VP9_COMMON variables #3
  consistently name VP9_COMMON variables #2
  consistently name VP9_COMMON variables #1
2013-08-29 15:50:56 -07:00
Yaowu Xu
ee961599e1 Merge "Fixed potential overflows" 2013-08-29 15:43:26 -07:00
James Zern
d765df2796 consistently name VP9_COMMON variables #3
stragglers

Change-Id: Ib1e853f9a331b7b66639dc34d79568d84d1930f1
2013-08-29 13:27:41 -07:00
James Zern
aa05321262 consistently name VP9_COMMON variables #2
oci -> cm

Change-Id: Ifd75c809d9cc99034d3c2fccc4653a78b3aec21f
2013-08-29 13:25:58 -07:00
James Zern
924d74516a consistently name VP9_COMMON variables #1
pc -> cm

Change-Id: If3e83404f574316fdd3b9aace2487b64efdb66f3
2013-08-29 13:25:57 -07:00
Dmitry Kovalev
e80bf802a9 Merge "Renaming txfm_size to tx_size." 2013-08-29 12:30:18 -07:00
Jingning Han
abff678866 Fix overflow issue in SSSE3 32x32 quantization
The 32x32 quantization process can potentially have the intermediate
stacks over 16-bit range, thereby causing enc/dec mismatch. This commit
fixes this overflow issue in the SSSE3 implementation, as well as the
prototype, of 32x32 quantization.

This fixes issue 607 from webm@googlecode.

Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806
2013-08-29 11:00:54 -07:00
Yaowu Xu
aaa7b44460 Fixed potential overflows
The two arrays are typically initialized to INT64_MAX. If they are not
filled with valid values before the addition, the values can overflow
and lead to wrong results.

Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
2013-08-29 10:26:52 -07:00
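A minimal C sketch of the hazard described above: adding to a value that is still at its INT64_MAX sentinel overflows, which is undefined behavior for signed integers in C. The helper `add_cost` is hypothetical and shows one possible guard, keeping the result pinned at INT64_MAX. It assumes non-negative costs, and the commit may well fix the problem differently.

```c
#include <stdint.h>

/* Hypothetical guarded addition for non-negative costs. INT64_MAX is
 * treated as "not filled in": once either operand is the sentinel, or the
 * sum would exceed it, the result stays at INT64_MAX. */
static int64_t add_cost(int64_t total, int64_t value) {
  if (total == INT64_MAX || value == INT64_MAX) return INT64_MAX;
  if (value > 0 && total > INT64_MAX - value) return INT64_MAX;
  return total + value;
}
```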
Scott LaVarnway
22dc946a7e Improved mb_lpf_horizontal_edge_w_sse2_8
This patch is a reformatted version of optimizations done by
engineers at Intel (Erik/Tamar) who have been providing
performance feedback for VP9.  For the test clips used (720p, 1080p),
up to 1.2% performance improvement was seen.

Change-Id: Ic1a7149098740079d5453b564da6fbfdd0b2f3d2
2013-08-29 08:30:17 -04:00
Dmitry Kovalev
b71807082c Merge "General code cleanup." 2013-08-28 12:57:49 -07:00
Dmitry Kovalev
db20806710 Merge "Removing unnecessary call to vp9_setup_interp_filters." 2013-08-28 12:31:08 -07:00
Dmitry Kovalev
b62ddd5f8b General code cleanup.
Switching from mi_{width, height}_log2 and b_{width, height}_log2 to
num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing
redundant code, adding const.

Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
2013-08-28 12:22:37 -07:00
Deb Mukherjee
e02dc84c1a Adds a speed feature for fast 1-loop forw updates
Incorporates a speed feature for fast forward updates of
coefficients. This feature takes 3 values:
0 - use standard 2-loop version
1 - use a 1-loop version
2 - use a 1-loop version with reduced updates

Results: derfraw300 +0.007% (on speed 0) at feature value = 1
                    -0.160% (on speed 0) at feature value = 2

There is substantial speed up at speeds 2 and above for low
resolution sequences where the entropy updates are a big part
of the overall computations.

Change-Id: Ie96fc50777088a5bd441288bca6111e43d03bcae
2013-08-28 10:56:52 -07:00
Dmitry Kovalev
851a2fd72c Renaming txfm_size to tx_size.
Change-Id: I752e374867d459960995b24d197301d65ad535e3
2013-08-27 19:47:53 -07:00
Jingning Han
eb7acb5524 Merge "Fix buf alignment in sub8x8 comp inter-inter pred" 2013-08-27 19:03:12 -07:00
Dmitry Kovalev
1d3f94efe2 Merge "Adding get_entropy_context function." 2013-08-27 17:02:36 -07:00
Frank Galligan
7d058ef86c Merge "Fix winodws warning." 2013-08-27 15:39:58 -07:00
Frank Galligan
f1560ce035 Fix winodws warning.
Const is not needed on the function parameter.

Change-Id: I38c2a7317cb6f42f70bbddfde9a2cd18d65ceb1c
2013-08-27 15:19:55 -07:00
Dmitry Kovalev
a93992e725 Adding get_entropy_context function.
Moving common code from encoder and decoder to this function.

Change-Id: I60fa643fb1ddf7ebbff5e83b6c4710137b0195ef
2013-08-27 14:17:53 -07:00
hkuang
3a679e56b2 Add neon optimize vp9_short_idct16x16_1_add.
Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5
2013-08-27 14:00:27 -07:00
hkuang
ce04b1aa62 Merge "Add neon optimize vp9_short_idct8x8_1_add." 2013-08-27 12:10:07 -07:00
Dmitry Kovalev
7b95f9bf39 Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder.
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
2013-08-27 11:05:08 -07:00
Dmitry Kovalev
ba10aed86d Merge "Using num_8x8_* lookup tables instead of mi_*_log2." 2013-08-27 10:49:36 -07:00
Dmitry Kovalev
12e5931a9a Merge "Using existing functions instead of raw expressions." 2013-08-27 10:33:34 -07:00
Dmitry Kovalev
f77c6973a1 Merge "Cleaning up decode_block_intra function." 2013-08-27 10:17:56 -07:00
Dmitry Kovalev
f389ca2acc Merge "Cleaning up model_rd_for_sb_y_tx." 2013-08-27 10:17:10 -07:00
Dmitry Kovalev
bfebe7e927 Merge "Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the common/decoder." 2013-08-27 10:15:21 -07:00
Dmitry Kovalev
78e670fcf8 Merge "Renaming D27 to D207." 2013-08-27 10:03:57 -07:00
Jingning Han
2d6aadd7e2 Fix buf alignment in sub8x8 comp inter-inter pred
This commit resolved a mis-alignment issue in compound inter-inter
prediction of sub8x8. This patch follows solution from dkovalev@.

Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932
2013-08-27 09:28:05 -07:00
Yaowu Xu
45125ee573 Merge "fixed the reading too many bytes" 2013-08-27 09:09:18 -07:00
Yaowu Xu
9482c07953 fixed the reading too many bytes
In subpel_avg_variance functions, code similar to the following

punpckldq m2, [addr]

actually reads 8 bytes. For functions that are supposed to work on
buffers that have fewer than 8 bytes per line, this caused a valgrind
error of reading uninitialized memory.

Change-Id: I2a4c079dbdbc747829bd9e2ed85f0018ad2a3a34
2013-08-27 08:39:20 -07:00
Dmitry Kovalev
44b7854c84 Removing unnecessary call to vp9_setup_interp_filters.
vp9_setup_interp_filters is called before each inter block decoding, so
it is not necessary to call it just before the whole frame decoding.

Change-Id: Id1b0ee62f987474e27eafba0013a4896b492c400
2013-08-26 17:25:49 -07:00
hkuang
36e9b82080 Add neon optimize vp9_short_idct8x8_1_add.
Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4
2013-08-26 16:28:57 -07:00
hkuang
ba8fc71979 Merge "Add neon optimize vp9_short_idct4x4_1_add." 2013-08-26 16:26:38 -07:00
Dmitry Kovalev
657ee2d719 Cleaning up model_rd_for_sb_y_tx.
Removing references to plane_block_width and plane_block_height (we are
going to delete the latter ones).

Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287
2013-08-26 16:18:28 -07:00
hkuang
69384f4fad Add neon optimize vp9_short_idct4x4_1_add.
Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5
2013-08-26 15:55:16 -07:00
Dmitry Kovalev
242460cb66 Cleaning up decode_block_intra function.
Change-Id: Ia41ea5d526d15fcbc9b56d74079593cf8b2fdf66
2013-08-26 15:24:12 -07:00
Dmitry Kovalev
b25589c6bb Using num_8x8_* lookup tables instead of mi_*_log2.
Change-Id: I8a246b3d056c98be614d05a90bc261e2441ffc10
2013-08-26 14:22:54 -07:00
Yaowu Xu
4505e8accb Merge "Fix the reading of too many input pixels" 2013-08-26 14:01:50 -07:00
Paul Wilkins
aa823f8667 Merge "Changes to adaptive inter rd thresholds." 2013-08-26 12:48:11 -07:00
Yaowu Xu
6c5433c836 Fix the reading of too many input pixels
in VP9_get4x4var_mmx

Change-Id: I4b4a8f45f25ebdfad281f169cc87aba5e2d6f227
2013-08-26 12:35:27 -07:00
Paul Wilkins
642696b678 Merge "Limit Key frame Intra modes checks." 2013-08-26 12:34:56 -07:00
Dmitry Kovalev
45870619f3 Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the common/decoder.
Adding temporary "typedef BLOCK_SIZE BLOCK_SIZE_TYPE" which will go away
after encoder's patch.

Change-Id: I06ec6a6f079401439843ec981d1496234fd7775c
2013-08-26 11:33:16 -07:00
Jingning Han
4681197a58 Merge "Temporarily disable SSSE3 quant_32x32" 2013-08-26 11:19:53 -07:00
Dmitry Kovalev
5eed6e2224 Merge "Removing redundant calls to clamp_mv2." 2013-08-26 10:48:37 -07:00
Jingning Han
166dc85bed Temporarily disable SSSE3 quant_32x32
Make the current head work properly, while working on fixing an
issue in the SSSE3 implementation of 32x32 quantization.

Change-Id: Ic029da3fd7f1f5e58bc641341cbd226ec49a16bc
2013-08-26 10:45:59 -07:00
James Zern
c8ba8c513c cosmetics: strip 'VP9_' from defines in vp9 only code
Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c
2013-08-23 19:16:49 -07:00
James Zern
2c6ba737f8 Merge "vp9: remove unnecessary wait w/threaded loopfilter" 2013-08-23 18:52:10 -07:00
Dmitry Kovalev
50ee61db4c Renaming D27 to D207.
I've already renamed d27_predictor to d207_predictor but forgot about the
corresponding constant.

Change-Id: Id312aa80fc5b5a1ab8a709a33418a029552a6857
2013-08-23 17:33:48 -07:00
Dmitry Kovalev
480dd8ffbe Using existing functions instead of raw expressions.
Change-Id: Ifa50b04bac1a6ff2abef989073cbf1f37a89eb50
2013-08-23 17:26:53 -07:00
Dmitry Kovalev
e6c435b506 Merge "Cleanup in mvref_common.{h, c}." 2013-08-23 17:09:49 -07:00
Dmitry Kovalev
7194da2167 Merge "Fixing display size setting problem." 2013-08-23 17:08:51 -07:00
Yaowu Xu
13930cf569 Limit mv range to be based on partition size
Previous change c4048dbd limits the mv search range assuming a max block
size of 64x64. This commit changes the search range to use the actual
block size instead.

Change-Id: Ibe07ab02b62bf64bd9f8675d2b997af20a2c7e11
2013-08-23 15:43:57 -07:00
Dmitry Kovalev
cd2cc27af1 Removing redundant calls to clamp_mv2.
We could avoid calling clamp_mv2 because it has been already called
inside vp9_find_best_ref_mvs function.

Change-Id: I08edeaf3e11e98c19e67b9711b2523ca5fb1416e
2013-08-23 15:18:35 -07:00
Yaowu Xu
8e04257bc5 Merge "Added border extension" 2013-08-23 14:43:58 -07:00
Adrian Grange
78debf246b Merge "Fix bug in convolution functions (filter selection)" 2013-08-23 13:41:47 -07:00
Dmitry Kovalev
fb481913f0 Merge "Removing useless calls to setup_{pre, dst}_planes." 2013-08-23 13:37:32 -07:00
Dmitry Kovalev
11e3ac62a5 Fixing display size setting problem.
Fix of https://code.google.com/p/webm/issues/detail?id=608. We could have
used invalid display size equal to the previous frame size (not to the
current frame size).

Change-Id: I91b576be5032e47084214052a1990dc51213e2f0
2013-08-23 13:12:46 -07:00
Dmitry Kovalev
21d8e8590b Cleanup in mvref_common.{h, c}.
Making code more compact, adding consts, removing redundant arguments,
adding do/while(0) for macros.

Change-Id: Ic9ec0bc58cee0910a5450b7fb8cfbf35fa9d0d16
2013-08-23 12:00:30 -07:00
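For readers unfamiliar with the do/while(0) idiom mentioned above, here is a generic C illustration. The macro `SWAP_INT` and the function `sort_two` are hypothetical and unrelated to the actual macros in mvref_common. The wrapper makes a multi-statement macro behave as a single statement, so it composes with an unbraced if/else.

```c
/* Hypothetical multi-statement macro. Wrapping the body in do/while(0)
 * makes the expansion a single statement that requires a trailing ';'. */
#define SWAP_INT(a, b) \
  do {                 \
    int tmp_ = (a);    \
    (a) = (b);         \
    (b) = tmp_;        \
  } while (0)

/* Returns 1 if the pair was swapped into ascending order, 0 otherwise.
 * With a bare { ... } macro body, the ';' after the macro call would end
 * the if statement and the else below would not compile. */
static int sort_two(int *lo, int *hi) {
  if (*lo > *hi)
    SWAP_INT(*lo, *hi);
  else
    return 0; /* already ordered */
  return 1;
}
```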
Yaowu Xu
656632b776 Added border extension
To the source buffer to be encoded as an alt ref frame. This is to fix
the problem of using uninitialized memory in encoder.

See https://code.google.com/p/webm/issues/detail?id=605

Change-Id: I97618a2fc207e08abcf5301b734aa9e3ad695e2c
2013-08-23 11:31:28 -07:00
Adrian Grange
3f10831308 Fix bug in convolution functions (filter selection)
(In response to Issue 604:
 https://code.google.com/p/webm/issues/detail?id=604)

There were bugs in the convolution code for two cases:

1. Where the filter table was assumed to be aligned to a
   256 byte boundary. The offset of the pixel in the
   source buffer was computed incorrectly.

2. Where no such alignment assumption was made. An
   incorrect address for the filter table base was used.

To fix both problems, I now assume that the filter table is
256-byte aligned and modify the pixel offset calculation to
match.

A later patch should remove the restriction that the filter
table is aligned to a 256-byte boundary.

There was also a bug in the ConvolveTest unit test
(convolve_test.cc).

(Bug & initial fix suggestion submitted by Tero Rintaluoma
and Sami Pietilä).

Change-Id: I71985551e62846e55e40de9e7e3959d4805baa82
2013-08-23 11:16:08 -07:00
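The 256-byte alignment assumption described above makes it possible to recover the start of a filter table from a pointer to any kernel inside it, by clearing the low 8 address bits. The C sketch below illustrates that idea only. The table, its dimensions (16 kernels of 8 taps, which occupy exactly 256 bytes as int16_t) and the helper names are assumptions for illustration, not the code from this commit. It requires C11 for `_Alignas`.

```c
#include <stdint.h>

enum { KERNEL_TAPS = 8, NUM_KERNELS = 16 }; /* 16 * 8 * 2 bytes = 256 */

/* Hypothetical table; only its 256-byte alignment matters here. */
static _Alignas(256) const int16_t filter_table[NUM_KERNELS][KERNEL_TAPS] = {
  { 0 }
};

/* Recover the table base by clearing the low 8 bits of the address. */
static const int16_t *table_base(const int16_t *kernel) {
  return (const int16_t *)((uintptr_t)kernel & ~(uintptr_t)0xff);
}

/* Index of `kernel` within its table, i.e. the subpixel phase. */
static int kernel_index(const int16_t *kernel) {
  return (int)((kernel - table_base(kernel)) / KERNEL_TAPS);
}
```

If the table were not aligned, the masked address would point somewhere before the table and the derived index would be wrong, which is the kind of failure the second case in the message describes.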
Dmitry Kovalev
1c159c470a Merge "Checking scale factors on access." 2013-08-23 11:05:17 -07:00
hkuang
b85367a608 Merge "Optimise idct4x4: rearrange the instructions a bit to improve instruction scheduling." 2013-08-23 10:08:43 -07:00
Paul Wilkins
aa5b67add0 Changes to adaptive inter rd thresholds.
Values now carried over frame to frame.
Change to algorithm for decreasing threshold after
a hit and to max threshold (now based on speed)

Removed some old commented out code relating to
VP8 adaptive thresholds.

The impact of these changes tested on Akiyo (50 frames)
and measured in terms of unit rd hits is as follows:

Speed 0 84.36 -> 84.67
Speed 1 29.48 -> 22.22
Speed 2 11.76 -> 8.21
Speed 3 12.32 -> 7.21

Encode speed impact is broadly in line with these.

Change-Id: I5b886efee3077a11553fa950d796fd6d00c8cb19
2013-08-23 16:18:45 +01:00
Paul Wilkins
f76f52df61 Limit Key frame Intra modes checks.
Most of the focus so far has been on inter frames.

At high speed settings the key frame is now taking a high %
of the cycles.

This patch puts in some masking to reduce the number
of INTRA modes searched during key frame coding (as already
happens for inter frames) at higher speed settings

TODO: Develop this further with either adaptive rd thresholds
when choosing which intra modes to consider or some other
heuristic.

Impact.
At high speed settings on some clips the key frame was starting
to dominate. In a coding of the first 50 frames of AKIYO at speed
2 limiting the key frame intra modes to DC or TM_PRED resulted in
~30% overall speedup. For Bus the number was lower at ~4-5%.

Change-Id: I7bde68aee04995f9d9beb13a1902143112e341e2
2013-08-23 16:10:30 +01:00
Jingning Han
9655c2c7a6 Merge "Fix rectangular partition check flag" 2013-08-22 18:59:18 -07:00
Dmitry Kovalev
33104cdd42 Merge "vp9_encodeframe.c cleanup." 2013-08-22 18:07:35 -07:00
James Zern
711aff9d9d Merge "vp9/encoder: fix last_frame_seg_map mem leak" 2013-08-22 18:04:03 -07:00
James Zern
d843ac5132 Merge "rename LOG2_* defines to *_LOG2" 2013-08-22 18:02:42 -07:00
Jingning Han
84f3b76e1c Fix rectangular partition check flag
Put rectangular partition check flag change according to the rd
costs of NONE and SPLIT partition types under the speed feature.

Change-Id: If681e1e078a8d43d86961ea4b748da5cd1b6c331
2013-08-22 17:15:01 -07:00
Dmitry Kovalev
53f6f8ac93 Merge "check_bsize_coverage cleanup." 2013-08-22 16:18:24 -07:00
hkuang
4205d79273 Merge "Add neon optimize vp9_short_idct10_16x16_add." 2013-08-22 15:57:28 -07:00
hkuang
4082bf9d7c Add neon optimize vp9_short_idct10_16x16_add.
vp9_short_idct10_16x16_add is used to handle blocks that only have valid data
in the top left 4x4 block. All the other data is 0. So we can cut many
unnecessary calculations in order to save instructions.

Change-Id: I6e30a3fee1ece5af7f258532416d0bfddd1143f0
2013-08-22 15:53:22 -07:00
Dmitry Kovalev
604022d40b vp9_encodeframe.c cleanup.
Removing unused get_sbuv_perpixel_variance function, using has_second_ref/
is_inter_block functions, organizing includes.

Change-Id: I016de4af12fbbb8b4ece26a70759b2392651b095
2013-08-22 15:50:51 -07:00
Dmitry Kovalev
335b1d360b check_bsize_coverage cleanup.
Change-Id: Ib7803857b35c00e317c9deb8630e777e25eb278f
2013-08-22 15:45:56 -07:00
Dmitry Kovalev
3c42657207 Checking scale factors on access.
It is possible to have invalid scale factors and not access them
during decoding. Error is reported if we really try to use invalid scale
factors.

Change-Id: Ie532d3ea7325ee0c7a6ada08269f804350c80fdf
2013-08-22 15:19:05 -07:00
James Zern
40ae02c247 rename LOG2_* defines to *_LOG2
gets rid of a mix of styles

Change-Id: I3591d312157bc6f53a25438bf047765c671fd8a8
2013-08-22 14:45:24 -07:00
Dmitry Kovalev
13eed79c77 Merge "Adding vp9_is_scaled function." 2013-08-22 14:39:55 -07:00
Dmitry Kovalev
09858c239b Removing useless calls to setup_{pre, dst}_planes.
Comment is wrong, we don't initialize any xd pointers. We only initialize
xd->planes[i]->dst and xd->planes[i]->pre[], which are actually initialized
for every block during the decoding.

Change-Id: If152ea872ebef1f83ca70712fa6f8df1b6855f56
2013-08-22 14:39:05 -07:00
James Zern
a5726ac453 vp9/encoder: fix last_frame_seg_map mem leak
remove duplicate allocation from vp9_create_compressor, it was added to
vp9_alloc_frame_buffers in:

d5bec52 Added resizing & initialization of last frame segment map

Change-Id: I996723226a16a62aff8f9a52ac74e0b73cc98fdf
2013-08-22 14:13:04 -07:00
Dmitry Kovalev
640dea4d9d Adding vp9_is_scaled function.
Change-Id: Ieb7077ca3586b9491912027eed450a4f6fd38d30
2013-08-22 14:04:59 -07:00
Jingning Han
8adc20ce35 Merge "Refactor rd_pick_partition for parameter control" 2013-08-22 13:54:48 -07:00
James Zern
da9a6ac9e7 Merge "vp9_peek_si: add bitstream v1 support" 2013-08-22 13:28:00 -07:00
Jingning Han
01a37177d1 Refactor rd_pick_partition for parameter control
This commit changes the partition search order of superblocks from
{SPLIT, NONE, HORZ, VERT} to {NONE, SPLIT, HORZ, VERT} for
consistency with that of sub8x8 partition search. It enables the use
of early termination in partition search for all block sizes.

For ped_area_1080p 50 frames coded at 4000 kbps, the runtime
goes down from 844305ms -> 818003ms (3% speed-up) at speed 0.

This will further move towards making the in-search partition types
configurable, hence unifying various speed-up approaches.

Some speed 1 and 2 features are turned off during the refactoring
process, including:
disable_split_var_thresh
using_small_partition_info

Stricter constraints are applied to use_square_partition_only for
right/bottom boundary blocks. Will bring back/refine these features
subsequently. At this point, it makes derf set at speed 1 about
0.45% higher in compression performance, and 9% down in run-time.

Change-Id: I3db9f9d1d1a0d6cbe2e50e49bd9eda1cf705f37c
2013-08-22 12:36:02 -07:00
hkuang
610642c130 Optimise idct4x4: rearrange the instructions a bit
to improve instruction scheduling.

Change-Id: I5ea881a6e419f9e8ed4b3b619406403b4de24134
2013-08-22 11:02:22 -07:00
Deb Mukherjee
8b810c7a78 Fixes on feature disabling split based on variance
Adds a couple of minor fixes, which may be absorbed in Jingning's
patch. Thanks to Guillaume for pointing these out.
Also adjusts the thresholds for speed 1 and 2 to 16 and 32
respectively, to keep quality drops small.

Results:
--------
derfraw300:  threshold = 16, psnr -0.082%, speedup 2-3%
             threshold = 32, psnr -0.218%, speedup 5-6%
stdhdraw250: threshold = 16, psnr -0.031%, speedup 2-3%
             threshold = 32, psnr -0.273%, speedup 5-6%

Change-Id: I4b11ae8296cca6c2a9f644be7e40de7c423b8330
2013-08-22 07:05:44 -07:00
Scott LaVarnway
f39bf458e5 Merge "Initialize mb_skip_coeff before picking modes" 2013-08-22 06:26:04 -07:00
Scott LaVarnway
94bfbaa84e Initialize mb_skip_coeff before picking modes
It appears that the above/left mb_skip_coeff used during
the pick modes is left over from the previously
encoded frame.  This patch initializes the flag to the default
value of zero.


Change-Id: Ida4684cc99611d6e3e82628db35ed717e28ce550
2013-08-22 08:51:04 -04:00
Dmitry Kovalev
96a1a59d21 Merge "Using has_second_ref function to simplify the code." 2013-08-22 01:39:14 -07:00
Dmitry Kovalev
a33f178491 Merge "Cleaning up foreach_transformed_block_in_plane." 2013-08-22 01:37:21 -07:00
Dmitry Kovalev
359b571448 Merge "Cleaning up reset_skip_context function." 2013-08-22 01:36:25 -07:00
Dmitry Kovalev
596c51087b Merge "Removing unused foreach_predicted_block function." 2013-08-22 01:35:41 -07:00
Dmitry Kovalev
cb05a451c6 Merge "Cleaning up optimize_init_b function." 2013-08-22 01:35:27 -07:00
Dmitry Kovalev
64c0f5c592 Merge "Cleaning up sum_intra_stats function." 2013-08-22 01:34:39 -07:00
Jingning Han
fcb890d751 Merge "Enable zero coeff check in sub8x8 UV rd loop" 2013-08-21 22:07:00 -07:00
James Zern
85640f1c9d vp9: remove unnecessary wait w/threaded loopfilter
the final macroblock rows are scheduled in the main thread. prior to
this change one additional macroblock row would be scheduled in the
worker forcing the main thread to wait before finishing.

Change-Id: I05f3168e5c629b898fcebb0d77eb6d6a90d6105e
2013-08-21 17:43:44 -07:00
Dmitry Kovalev
4172d7c584 Cleaning up foreach_transformed_block_in_plane.
Change-Id: I9f45af3894c57f35cb266c255e2b904295d39c34
2013-08-21 17:16:02 -07:00
James Zern
6167355309 vp9_peek_si: add bitstream v1 support
currently protected by CONFIG_NON420 as v1 is still not entirely stable

Change-Id: Id1c5081b04a2c47a842822048b8804be67d23a6d
2013-08-21 17:04:10 -07:00
Dmitry Kovalev
be60924f29 Cleaning up optimize_init_b function.
Change-Id: Ib2c975e1d96deefb7ac4d6b600c8c5388035d111
2013-08-21 16:40:16 -07:00
Dmitry Kovalev
c43da352ab Cleaning up reset_skip_context function.
Change-Id: Ib3e72671eb8da6f2e9767a6de292ec7c7cde6bc7
2013-08-21 16:31:51 -07:00
Dmitry Kovalev
048ccb2849 Cleaning up sum_intra_stats function.
Using size_group_lookup table and better variable names.

Change-Id: I6e67f2ce091845db43ace7d21b7ae31c6f165aec
2013-08-21 16:25:02 -07:00
Dmitry Kovalev
3286abd82e Merge "Adding scale factor check." 2013-08-21 14:11:13 -07:00
Dmitry Kovalev
687891238c Merge "Removing PLANE_TYPE argument from cost_coeffs function." 2013-08-21 14:10:05 -07:00
Deb Mukherjee
a2f7619860 Merge "Make "good" quality 2-pass vpxenc encoding default" 2013-08-21 13:58:49 -07:00
James Zern
ac12f3926b Merge "vp9 rtcd: remove non-existent sad functions" 2013-08-21 13:55:59 -07:00
Dmitry Kovalev
2f1a0a0e2c Removing PLANE_TYPE argument from cost_coeffs function.
We can determine plane_type from other function arguments.

Change-Id: I85331877aedb357632ae916a37b5b15f22c0bb1f
2013-08-21 13:02:28 -07:00
Deb Mukherjee
0d8723f8d5 Make "good" quality 2-pass vpxenc encoding default
Currently, the best quality mode in VP9 is not very well developed,
and unnecessarily makes the encode too slow. Hence the command line
default is changed to "good" quality. Also, the number of passes
default is changed to 2 passes as well, since 1-pass encoding is
not very efficient in VP9.

Besides, a number of VP9 defaults are set to the currently
recommended settings. With these changes, vpxenc
run with --codec=vp9 --kf-max-dist=9999 --cpu-used=0 should
work about the same as our borg results.
Note when the --cpu-used=0 option is dropped there will be a slight
difference in the output, because of a difference in the cpu-used
value for the first pass. Specifically, the default when unspecified
is to use cpu_used=1 for the first pass and cpu_used=0 for the
second pass. But when specified, both passes will use the cpu-used
value specified.

Note that this also changes the default for VP8 as being "good"
but other options stay unchanged.

Change-Id: Ib23c1a05ae2f36ee076c0e34403efbda518c5066
2013-08-21 12:41:26 -07:00
Dmitry Kovalev
27a984fbd3 Removing a lot of duplicated code.
Adding set_contexts function and calling it instead of
set_contexts_on_border. Calling txfrm_block_to_raster_xy to get aoff and
loff.

Change-Id: I41897e344afd2cae1f923f4fdbe63daccf6fe80e
2013-08-21 11:55:12 -07:00
Dmitry Kovalev
a3ae4c87fd Adding scale factor check.
We support only [1/16, 2] scale factors, enforcing this now.

Change-Id: I0822eb7cea51720df6814e42d3f35ff340963061
2013-08-21 11:24:47 -07:00
Adrian Grange
ce28d0ca89 Fix typos and minor stylistic cleanup
Change-Id: I32e43474e8651ef2eb181d24860a8f118cfea7bf
2013-08-21 08:45:42 -07:00
Adrian Grange
5b63963573 Merge "Further correct bug in loopfilter initialization" 2013-08-21 07:17:43 -07:00
James Zern
ae455fabd8 vp9 rtcd: remove non-existent sad functions
vp9_sad32x3, vp9_sad3x32

+ remove unnecessary sad include from vp9_findnearmv.c

Change-Id: Idef2a89cadc3fec64eff82ba9be60ffff50b3468
2013-08-20 18:07:53 -07:00
Dmitry Kovalev
90027be251 Removing unused foreach_predicted_block function.
Moving foreach_predicted_block_in_plane function to vp9_reconinter.c
because there is only one usage.

Change-Id: I9852feae43fc3cf809b817fc541d043bc5496209
2013-08-20 17:20:47 -07:00
Dmitry Kovalev
7f814c6bf8 Merge "Passing plane_bsize to foreach_transformed_block_visitor." 2013-08-20 14:25:01 -07:00
Dmitry Kovalev
27de4fe922 Using has_second_ref function to simplify the code.
Updating implementation of vp9_get_pred_context_single_ref_p2 using
has_second_ref function to make code easier to read.

Change-Id: I5ba642712f59861a48aab974e73aa01640d086fe
2013-08-20 14:09:56 -07:00
hkuang
62a2cd9ed2 Merge "Add neon optimize vp9_short_idct10_8x8_add." 2013-08-20 14:06:57 -07:00
Dmitry Kovalev
381d3b8b7d Merge "vp9_filter.{h, c} cleanup + adding SUBPEL_TAPS constant." 2013-08-20 13:46:53 -07:00
Dmitry Kovalev
d19ac4b66d vp9_filter.{h, c} cleanup + adding SUBPEL_TAPS constant.
Change-Id: Ib394ea23f464591dad50b5c65c316701378d06d7
2013-08-20 12:29:57 -07:00
hkuang
37cda6dc4c Add neon optimize vp9_short_idct10_8x8_add.
vp9_short_idct10_8x8_add is used to handle blocks that only have valid data
in the top-left 4x4 block. All the other data is 0, so we can cut several
unnecessary calculations in order to save instructions.

Change-Id: I34fda95e29082b789aded97c2df193991c2d9195
2013-08-20 11:51:07 -07:00
Jingning Han
1bf1428654 Enable zero coeff check in sub8x8 UV rd loop
Check the minimum rate-distortion cost of regular quantization and
all zero coeffs cases in the sub8x8 inter prediction rd loop for
luma components. Use this as the cumulative rdcost sent to UV rd
estimation.

Change-Id: Ia4bc7700437d5e13d7cdad4cf9ae57ab036d3e97
2013-08-20 10:33:42 -07:00