Commit Graph

14566 Commits

Author SHA1 Message Date
Debargha Mukherjee
65dd056e41 Merge "Optimize vpx_quantize_{b,b_32x32} assembler." 2015-10-26 18:04:49 +00:00
Debargha Mukherjee
35cae7f1b3 Merge "Optimize vp9_highbd_block_error_8bit assembly." 2015-10-26 18:03:46 +00:00
Alex Converse
e34c7e3f59 Merge "palette: Replace rand() call with custom LCG." 2015-10-26 17:05:00 +00:00
Jingning Han
e1a056e163 Merge "Use explicit block position in foreach_transformed_block" 2015-10-26 16:25:56 +00:00
Alex Converse
171fd8999f palette: Replace rand() call with custom LCG.
The custom LCG is based on the POSIX recommend constants for a 16-bit
rand(). This implementation uses less computation than typical standard
library procedures which have been extended for 32-bit support, is
guaranteed to be reentrant, and identical everywhere.

Change-Id: I3140bbd566f44ab820d131c584a5d4ec6134c5a0
Ref: http://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html
2015-10-24 13:38:23 -07:00
Marco
d162934bdc VP9: Estimate noise level for denoiser.
Periodically estiamte noise level in source, and only denoise
if estimated noise level is above threshold.

Change-Id: I54f967b3003b0c14d0b1d3dc83cb82ce8cc2d381
2015-10-23 11:03:30 -07:00
Jingning Han
caeb10bf06 Use explicit block position in foreach_transformed_block
Add the row and column index to the argument list of unit functions
called by foreach_transformed_block wrapper. This avoids the
repeated internal parsing according to the block index.

Change-Id: Ie7508acdac0b498487564639bc5cc6378a8a0df7
2015-10-23 09:19:17 -07:00
Ronald S. Bultje
f4af1a9af4 Merge "vp10: merge ext_ipred_bltr experiment into misc_fixes." 2015-10-22 21:14:20 +00:00
Ronald S. Bultje
806ae29d80 Merge "vp10: merge universal_hp experiment into misc_fixes." 2015-10-22 21:14:13 +00:00
Ronald S. Bultje
d6fc63ac31 Merge "Adjust superframe-is-optional unit test for vp10 superframe syntax." 2015-10-22 21:14:06 +00:00
Ronald S. Bultje
dbefcc0609 Merge "vp10: don't allow comp_inter_inter on keyframes." 2015-10-22 21:14:00 +00:00
Ronald S. Bultje
a857728267 Merge "vp10: fix tile size in remuxing step." 2015-10-22 21:12:44 +00:00
Ronald S. Bultje
40347d0c07 Merge "vp10: use correct constant for bw adaptation of seg pred probs." 2015-10-22 21:12:35 +00:00
Ronald S. Bultje
de4e2662d7 Merge "vp10: don't make right edge available across tile boundaries." 2015-10-22 21:12:25 +00:00
Ronald S. Bultje
69df584416 Merge "vp10: clip MVs before adding to find_ref_mvs() list." 2015-10-22 21:12:09 +00:00
Ronald S. Bultje
53dc9fd0a0 vp10: merge ext_ipred_bltr experiment into misc_fixes.
Change-Id: I2f2deb700748408b8278b7f5c29ee1f2e39785ec
2015-10-21 22:27:34 -04:00
Ronald S. Bultje
194c0a5cfb vp10: merge universal_hp experiment into misc_fixes.
Change-Id: I79fc3c0594535adc0056339c929cff69b8188760
2015-10-21 22:27:34 -04:00
Ronald S. Bultje
aa11256555 Adjust superframe-is-optional unit test for vp10 superframe syntax.
Change-Id: Ic64b6928af7ae8ecc987f845b0bf0faecdacb072
2015-10-21 22:27:28 -04:00
Ronald S. Bultje
6a032503ca vp10: don't allow comp_inter_inter on keyframes.
Change-Id: Ibd0e13721a2bb71c532d20b36c42f4cccf5c5de2
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
558d93f3a5 vp10: fix tile size in remuxing step.
Change-Id: Id48fb193bbdb3afed1d0db26c4ddded65a293b1b
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
59058775fc vp10: use correct constant for bw adaptation of seg pred probs.
Change-Id: Idb869a77a126982814b8e7e288f952a65340e6be
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
3d90819149 vp10: don't make right edge available across tile boundaries.
Change-Id: Ia81cf3858ef6c8d1fd4b1fb2dd9627906081129d
2015-10-21 15:19:11 -04:00
Geza Lore
aa8f85223b Optimize vp9_highbd_block_error_8bit assembly.
A new version of vp9_highbd_error_8bit is now available which is
optimized with AVX assembly. AVX itself does not buy us too much, but
the non-destructive 3 operand format encoding of the 128bit SSEn integer
instructions helps to eliminate move instructions. The Sandy Bridge
micro-architecture cannot eliminate move instructions in the processor
front end, so AVX will help on these machines.

Further 2 optimizations are applied:

1. The common case of computing block error on 4x4 blocks is optimized
as a special case.
2. All arithmetic is speculatively done on 32 bits only. At the end of
the loop, the code detects if overflow might have happened and if so,
the whole computation is re-executed using higher precision arithmetic.
This case however is extremely rare in real use, so we can achieve a
large net gain here.

The optimizations rely on the fact that the coefficients are in the
range [-(2^15-1), 2^15-1], and that the quantized coefficients always
have the same sign as the input coefficients (in the worst case they are
0). These are the same assumptions that the old SSE2 assembly code for
the non high bitdepth configuration relied on. The unit tests have been
updated to take this constraint into consideration when generating test
input data.

Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
2015-10-21 12:30:40 +01:00
Ronald S. Bultje
56cfbeefb4 Merge "vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe." 2015-10-20 19:58:00 +00:00
Ronald S. Bultje
293e20df91 vp10: clip MVs before adding to find_ref_mvs() list.
This causes the output of find_ref_mvs() to always be unique or zero.
A nice side-effect of this is that it also causes the output of
find_ref_mvs_sub8x8() to be unique-or-zero, and it will not ignore
available candidate MVs under certain conditions.

See issue 1012.

Change-Id: If4792789cb7885dbc9db420001d95f9b91b63bfa
2015-10-20 14:48:35 -04:00
Ronald S. Bultje
dec4405cfa vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe.
See issue 1088.

Change-Id: Icb15d33b4e316add848f210b50cbccd7c7847207
2015-10-20 14:48:31 -04:00
Marco
be3f2713ad Setting change in sample encoder: vpx_temporal_svc_encoder.c
Change-Id: Ifb384fa571eb08b516ed08fe05b8bca0c94b1edf
2015-10-20 10:40:20 -07:00
Hui Su
96b69deca5 Merge "VP10: some changes to palette mode" 2015-10-20 16:37:31 +00:00
Ronald S. Bultje
9897e1c27c Merge "vp10: write colorspace info for profile 0 intraonly frames." 2015-10-20 15:57:21 +00:00
Ronald S. Bultje
bafadaafbb Merge "vp10: per-segment lossless coding." 2015-10-20 15:57:12 +00:00
Ronald S. Bultje
92c4d8149a Merge "vp10: add extended-intra prediction edges experiment." 2015-10-20 15:57:05 +00:00
Ronald S. Bultje
1a64595780 Merge "vp10: allow MV refs to point outside visible image." 2015-10-20 15:56:56 +00:00
Ronald S. Bultje
4a7f012b95 Merge "vp10: allow forward updates for keyframe y intra mode probabilities." 2015-10-20 15:56:49 +00:00
Ronald S. Bultje
f441a652b7 Merge "vp10: merge keyframe/interframe uvintramode/partition probabilities." 2015-10-20 15:56:42 +00:00
Ronald S. Bultje
24517b9635 Merge "vp10: make segmentation probs use generic probability model." 2015-10-20 15:56:34 +00:00
Geza Lore
9cfba09ac0 Optimize vpx_quantize_{b,b_32x32} assembler.
Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.

This patch maskes the encoder faster in both the high bitdepth and 8bit
configurations. In the high bitdepth configuration, it effects profile 0
only.

Based on my profiling using 1080p input the net gain is between 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.

Change-Id: I86714a6b7364da20cd468cd784247009663a5140
2015-10-20 10:11:19 +01:00
James Zern
849e54cedd Merge "vp8cx: remove deprecated reference/entropy controls" 2015-10-20 02:46:36 +00:00
Ronald S. Bultje
2a388b53f2 vp10: write colorspace info for profile 0 intraonly frames.
See issue 1087.

Change-Id: I231f6f12f870d0a56391daf1673536048418b207
2015-10-19 12:18:57 -04:00
James Zern
a046f56491 vp8cx: remove deprecated reference/entropy controls
VP8E_UPD_ENTROPY, VP8E_UPD_REFERENCE and VP8E_USE_REFERENCE have been
deprecated since the initial public release

Change-Id: Ied16b441eec13434d85f1ab115d49ccaf5f2f7b0
2015-10-16 17:02:36 -07:00
Ronald S. Bultje
60c58b5284 vp10: per-segment lossless coding.
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.

See issue 1035.

Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
c7dc1d78bf vp10: add extended-intra prediction edges experiment.
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.

See issue 1043.

Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
dea998997f vp10: allow MV refs to point outside visible image.
In VP9, the ref MV had to point to a block that itself fully resided
within the visible image, i.e. all borders of the image had to be
within the visible borders of the coded frame. This is somewhat
illogical, and had obscure side effects, e.g. clamping of fairly
reasonable motion vectors such as 0,0 were clipped to negative values
if the block was overhanging on frame edges (such as the last rows
on 1080p content), which makes no sense whatsoever.

Instead, relax clamping constraints such that the ref MVs are allowed
to point to blocks exactly outside the visible edges in both Y as well
as UV planes, including the 8tap filter edges (that's why the offset is
8 pixels + block size).

See issue 1037.

Change-Id: I2683eb2a18b24955e4dcce36c2940aa2ba3a1061
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
1eb51a2010 vp10: allow forward updates for keyframe y intra mode probabilities.
See issue 1040 point 5.

Change-Id: I51a70b9eade39efba392a1457bd70a3c515525cb
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
d8f3bb1837 vp10: merge keyframe/interframe uvintramode/partition probabilities.
This has various benefits:
- simplify implementations because we don't have to switch between
  multiple probability tables depending on frametype
- allows fw subexp and bw adaptivity for partitions/uvmode in keyframes

See issue 1040 point 5.

Change-Id: Ia566aa2863252d130cee9deedcf123bb2a0d3765
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
6e5a1165be vp10: make segmentation probs use generic probability model.
Locate them (code-wise) in frame_context, and have them be updated
as any other probability using the subexp forward and adaptive bw
updates.

See issue 1040 point 1.

TODOs:
- real-world default probabilities
- why is counts sometimes NULL in the decoder? Does that mean bw
  adaptivity updates only work on some frames? (I haven't looked
  very closely yet, maybe this is a red herring.)

Change-Id: I23b57b4e5e7574b75f16eb64823b29c22fbab42e
2015-10-16 19:30:38 -04:00
Yaowu Xu
568429512e Add a new enum type vpx_color_range_t
to make meaning of color_range obvious.

Change-Id: I303582e448b82b3203b497e27b22601cc718dfff
2015-10-16 16:27:18 -07:00
James Zern
7dd7a7da20 vpx/*.h: add VPX_CTRL_* preproc defines
allows controls to be tested for at compile-time

Change-Id: I1cd01287dc144392956c82e6dbac003f37703039
2015-10-16 18:47:20 +00:00
James Zern
9ade6e1001 Merge "vpx/*.h, cosmetics: fix some typos" 2015-10-16 18:47:08 +00:00
hui su
17c817adfc VP10: some changes to palette mode
Account for rounding in distortion calculation in k-means;
carry out rounding before duplicates removal of base colors;
replace numbers with macros;
use prefix increment.

Slight coding gain (<0.1%) on screen_content testset.

Change-Id: Ie8bd241266da6b82c7b2874befc3a0c72b4fcd8c
2015-10-16 11:41:26 -07:00
Marco
b44c5cf639 Adjustment on limiting cyclic refresh on steady blocks.
Adjust the qp threshold and consec_zeromv threshold for
limiting cyclic refresh. Also increase the refresh period
when the limit amount is significant, and some code-cleanup.

Small gain in PSNR/SSIM metrics: ~0.25/0.3 gain on RTC set, speed 7.

Change only affects non-screen content.

Change-Id: I1ced87a89a132684c071e722616e445b2d18236a
2015-10-16 10:16:44 -07:00