Commit Graph

134 Commits

Author SHA1 Message Date
paulwilkins
9ce611a764 1 pass VBR mode bug fix.
(copied from VP9)

The one pass VBR mode selects a Q range based on a
moving average of recent Q values. This calculation
should have been excluding arf overlay frames as these
are usually coded at the highest allowed value. Their
inclusion skews the average and can cause it to drift
upwards even when the clip as a whole is undershooting.

As such it can undermine correct adaptation of the allowed
Q range especially for easy content.

Change-Id: I9e12da84e12917e836b6e53ca4dfe4f150b9efb1
2015-12-15 15:02:40 +00:00
paulwilkins
4e692bbee2 Changes to exhaustive motion search.
This change has been imported from VP9 and
alters the nature and use of exhaustive motion search.

Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.

Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.

For example:
  stage 1: Range +/- 64 interval 4
  stage 2: Range +/- 32 interval 2
  stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained  a bug (the two searches used different distortion
metrics).

For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.

Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most natural video clips, however, where the step search
is performing well, the quality gain and speed impact are small.

Change-Id: Iac24152ae239f42a246f39ee5f00fe62d193cb98
2015-12-08 16:54:42 +00:00
paulwilkins
9d85ce8e0c Fix bug when overlaying middle arfs in multi-arf groups.
Fix copied over from VP9 master to VP10 master.
Do not reset the alt ref active flag when overlaying the middle
arf(s) of a multi arf group.

Change-Id: I1b7392107e7c675640d5ee1624012f39cc374c58
2015-12-07 15:23:46 +00:00
Angie Chiang
06bdcea606 Merge "comment out range_check of fdct in dct.c" 2015-12-04 23:38:35 +00:00
Angie Chiang
08b157da8e comment out range_check of fdct in dct.c
The range_check is not used because the bit range
in fdct# is not correct. Since we are going to merge in a new version
of fdct# from nextgenv2, we won't fix the incorrect bit range now.

Change-Id: I54f27a6507f27bf475af302b4dbedc71c5385118
2015-12-04 10:54:31 -08:00
hui su
5d3327e891 Remove palette from VP10
Store it in nextgenv2 for now.

Change-Id: Iab0af0e15246758e3b6e8bde4a74b13c410576fc
2015-12-03 12:30:47 -08:00
Alex Converse
b1fcd1751e Fix unsigned overflow in rd_variance_adjustment.
Found with clang -fsanitize=integer

Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1
2015-11-19 15:00:59 -08:00
hui su
6ab6ac450b Use accurate bit cost for uv_mode in UV intra mode RD selection
On derflr, +0.1% for VP10; however, -0.03% on VP9.

Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-11-06 14:45:43 -08:00
Alex Converse
255bcf8697 Merge "misc fixes: Remove a wasted value." 2015-11-03 17:52:34 +00:00
Alex Converse
6f229b3e62 Merge "Shrink probability remap tables." 2015-10-29 19:58:24 +00:00
Alex Converse
0f059d6d65 misc fixes: Remove a wasted value.
Remove delta index 254 from probability remapping and subexp coding.
Saves 1-bit when the delta index is 129.

Change-Id: I88aba565fc766b1769165be458d2efd3ce45817e
2015-10-27 12:10:25 -07:00
Alex Converse
a736bf6bfb Shrink probability remap tables.
Saves 2288 bytes in vp8+vp9 libvpx.a.

Change-Id: Iaa5712e59a9693ed58cea63de63781a96827e44e
2015-10-27 12:08:23 -07:00
Yaowu Xu
c1b2d416d7 Merge "Reorder code to be consistent accross branches" 2015-10-27 17:07:50 +00:00
Yaowu Xu
9d8bde85cb Reorder code to be consistent accross branches
This is to make future merge a bit easier.

Change-Id: I1039de381d8fe7b9988b57c23d15d0cb5f2fcd32
2015-10-27 09:04:40 -07:00
Alex Converse
811be0df3a Fix VS build.
Add a cast on a double to unsigned assignment.

Change-Id: I4abce7cfa13e145ed0c71469844ac9b274aa1411
2015-10-26 23:13:03 -07:00
Debargha Mukherjee
65dd056e41 Merge "Optimize vpx_quantize_{b,b_32x32} assembler." 2015-10-26 18:04:49 +00:00
Alex Converse
e34c7e3f59 Merge "palette: Replace rand() call with custom LCG." 2015-10-26 17:05:00 +00:00
Alex Converse
171fd8999f palette: Replace rand() call with custom LCG.
The custom LCG is based on the POSIX recommend constants for a 16-bit
rand(). This implementation uses less computation than typical standard
library procedures which have been extended for 32-bit support, is
guaranteed to be reentrant, and identical everywhere.

Change-Id: I3140bbd566f44ab820d131c584a5d4ec6134c5a0
Ref: http://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html
2015-10-24 13:38:23 -07:00
Jingning Han
caeb10bf06 Use explicit block position in foreach_transformed_block
Add the row and column index to the argument list of unit functions
called by foreach_transformed_block wrapper. This avoids the
repeated internal parsing according to the block index.

Change-Id: Ie7508acdac0b498487564639bc5cc6378a8a0df7
2015-10-23 09:19:17 -07:00
Ronald S. Bultje
dbefcc0609 Merge "vp10: don't allow comp_inter_inter on keyframes." 2015-10-22 21:14:00 +00:00
Ronald S. Bultje
a857728267 Merge "vp10: fix tile size in remuxing step." 2015-10-22 21:12:44 +00:00
Ronald S. Bultje
6a032503ca vp10: don't allow comp_inter_inter on keyframes.
Change-Id: Ibd0e13721a2bb71c532d20b36c42f4cccf5c5de2
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
558d93f3a5 vp10: fix tile size in remuxing step.
Change-Id: Id48fb193bbdb3afed1d0db26c4ddded65a293b1b
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
56cfbeefb4 Merge "vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe." 2015-10-20 19:58:00 +00:00
Ronald S. Bultje
dec4405cfa vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe.
See issue 1088.

Change-Id: Icb15d33b4e316add848f210b50cbccd7c7847207
2015-10-20 14:48:31 -04:00
Hui Su
96b69deca5 Merge "VP10: some changes to palette mode" 2015-10-20 16:37:31 +00:00
Geza Lore
9cfba09ac0 Optimize vpx_quantize_{b,b_32x32} assembler.
Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.

This patch maskes the encoder faster in both the high bitdepth and 8bit
configurations. In the high bitdepth configuration, it effects profile 0
only.

Based on my profiling using 1080p input the net gain is between 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.

Change-Id: I86714a6b7364da20cd468cd784247009663a5140
2015-10-20 10:11:19 +01:00
Ronald S. Bultje
2a388b53f2 vp10: write colorspace info for profile 0 intraonly frames.
See issue 1087.

Change-Id: I231f6f12f870d0a56391daf1673536048418b207
2015-10-19 12:18:57 -04:00
Ronald S. Bultje
60c58b5284 vp10: per-segment lossless coding.
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.

See issue 1035.

Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
c7dc1d78bf vp10: add extended-intra prediction edges experiment.
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.

See issue 1043.

Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
1eb51a2010 vp10: allow forward updates for keyframe y intra mode probabilities.
See issue 1040 point 5.

Change-Id: I51a70b9eade39efba392a1457bd70a3c515525cb
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
d8f3bb1837 vp10: merge keyframe/interframe uvintramode/partition probabilities.
This has various benefits:
- simplify implementations because we don't have to switch between
  multiple probability tables depending on frametype
- allows fw subexp and bw adaptivity for partitions/uvmode in keyframes

See issue 1040 point 5.

Change-Id: Ia566aa2863252d130cee9deedcf123bb2a0d3765
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
6e5a1165be vp10: make segmentation probs use generic probability model.
Locate them (code-wise) in frame_context, and have them be updated
as any other probability using the subexp forward and adaptive bw
updates.

See issue 1040 point 1.

TODOs:
- real-world default probabilities
- why is counts sometimes NULL in the decoder? Does that mean bw
  adaptivity updates only work on some frames? (I haven't looked
  very closely yet, maybe this is a red herring.)

Change-Id: I23b57b4e5e7574b75f16eb64823b29c22fbab42e
2015-10-16 19:30:38 -04:00
hui su
17c817adfc VP10: some changes to palette mode
Account for rounding in distortion calculation in k-means;
carry out rounding before duplicates removal of base colors;
replace numbers with macros;
use prefix increment.

Slight coding gain (<0.1%) on screen_content testset.

Change-Id: Ie8bd241266da6b82c7b2874befc3a0c72b4fcd8c
2015-10-16 11:41:26 -07:00
hui su
aaf6f6215f Fix palette mode in multi-thread encoding setting
Fix a couple of memory related errors. Also fix thread test failures.

Change-Id: I0103995f832cecf1dd2380000321ac7204f0cfc0
2015-10-15 15:00:57 -07:00
Yaowu Xu
15cc8bc72f Merge "fix a msvc compiler warning" 2015-10-15 14:39:01 +00:00
Yaowu Xu
8ced62f250 fix a msvc compiler warning
Change-Id: Ifd6581c1bdb8d8f4b2ecf676c1a3d385dc129abf
2015-10-15 01:05:13 +00:00
Yaowu Xu
4727fa2a75 Fix two asan failures
Change-Id: I57865e9604ac162ef0d97deb16e81ca436a98428
2015-10-14 18:03:31 -07:00
Hui Su
fe0396cadc Merge "Fix compiler warnings" 2015-10-13 19:30:33 +00:00
Hui Su
b9e31b5163 Merge "VP10: Add palette mode part 1" 2015-10-13 17:34:27 +00:00
hui su
6f31722950 Fix compiler warnings
Change-Id: I761256a8100d83abf1b937f3739580237e3fad2a
2015-10-13 10:33:17 -07:00
Ronald S. Bultje
00170953b1 vp10: allow forward updates for uv_mode probabilities.
See issue 1040 point 4.

Change-Id: I79e06bd71a27f45770c760c47dc71bc3767a77a0
2015-10-12 17:51:01 -04:00
Ronald S. Bultje
5f589826f3 vp10: allow bw adaptivity for skip/tx probabilities in keyframes.
See issue 1040 point 3.

Change-Id: Ieef6d326b7fb50ceca5936525b7c688225a11fd1
2015-10-12 17:51:01 -04:00
Ronald S. Bultje
fee146e60b vp10: don't write tile size marker bit if CONFIG_MISC_FIXES=0.
Change-Id: I41b13b8767e30da391c2c4da9a729ca7292b16b9
2015-10-12 17:50:57 -04:00
Ronald S. Bultje
5b4805d6e9 vp10: remove clamp_mv2() call from vp10_find_best_ref_mvs().
This actually has no effect whatsoever, since the input MVs themselves
are clamped by clamp_mv_ref() already, which is significantly more
restrictive in its bounds.

Change-Id: I4a3a7b2b121ee422c56428c2a12d930c3813c06e
2015-10-12 14:45:18 -04:00
Ronald S. Bultje
2e45ce1493 vp10: update assertion/allocation for tokens.
We only write EOSB tokens if we write tokens (i.e. not for skip blocks),
and we write EOSB tokens per-plane instead of per block.

Change-Id: I8d7ee99f8ec50eb7ae809f9f9282c1c91dbf6537
2015-10-12 14:45:18 -04:00
hui su
5d011cb278 VP10: Add palette mode part 1
Add palette mode for keyframe luma channel. Palette mode is enabled
when using "--tune-content=screen" in encoding config parameters.

on screen_content testset:  +6.89%
on derlr                 :  +0.00%

Design doc (WIP):
https://goo.gl/lD4yJw

Change-Id: Ib368b216bfd3ea21c6c27436934ad87afdaa6f88
2015-10-12 10:02:17 -07:00
Ronald S. Bultje
177e7b53e7 vp10: use subexp probability updates for MV probs.
See issue 1040 point 2.

Change-Id: I0b37fe74be764610696620f1fe296dc74e4806d7
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
3461e8ce64 vp10: skip unreachable cat6 token extrabits.
We have historically added new bits to cat6 whenever we added a new
transform size (or bitdepth, for that matter). However, we have
always coded these new bits regardless of the actual transform size,
which means that for smaller transforms, we code bits that cannot
possibly be set. The coding (quality) impact of this is negligible,
but the bigger issue is that this allows creating bitstreams with
coefficient values that are nonsensible and can cause int overflows,
which then de facto become part of the bitstream spec. By not coding
these bits, we remove this possibility.

See issue 1065.

Change-Id: Ib3186eca2df6a7a15ddc60c8b55af182aadd964d
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
7460798ba5 vp10: use superframe marker index/size mechanism for tile size.
See issue 1042. Should provide slight bitstream savings in most cases
where tiles are being used.

Change-Id: Ie2808cf8ef30b3efe50804396900c4d63a3fa026
2015-10-05 20:58:32 -04:00