Commit Graph

133 Commits

Author SHA1 Message Date
John Koleszar
5055a1610d remove rotation experiment
This is being reimplemented more generically in terms of affine
transforms.

Change-Id: I9300bfde5f8b93c708c64f59427087720f8ed782
2012-08-21 10:09:56 -07:00
Christian Duvivier
a1168155a7 Fix warnings.
Change-Id: I4b911e4173da30c164bde7ea50bc80a70fbbb745
2012-08-14 16:23:32 -07:00
Daniel Kang
2f963917a6 Fix typo, which adds skip testing for 16x16
Also add warnings for undefined macros in the C pre-processor

Change-Id: I1ec30e57c5a49fb72151a4cf140d7eeb0fb1d779
2012-08-13 16:28:11 -07:00
Deb Mukherjee
7d0656537b Merging in the sixteenth subpel uv experiment
Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.

Also removes an experimental filter.

Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd
2012-08-08 16:57:43 -07:00
Yaowu Xu
8b2f57d0b8 a new way of determining reference motion vector
Using surrounding reconstructed pixels from left and above to select
best matching mv to use as reference motion vector for mv encoding.
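
The selection described above can be sketched as template matching. This is a minimal illustration, not the commit's code: it assumes full-pel candidate vectors given as (dx, dy), an L-shaped template of `border` reconstructed pixels above and to the left of the block, a SAD cost, and accesses that stay inside the frame. The names `template_sad` and `pick_reference_mv` are hypothetical.

```python
def template_sad(recon, ref, x, y, mv, size=16, border=2):
    """SAD between the L-shaped template around block (x, y) in the current
    reconstruction and the same template displaced by mv in the reference."""
    dx, dy = mv
    sad = 0
    # Reconstructed rows directly above the block (including the corner).
    for r in range(y - border, y):
        for c in range(x - border, x + size):
            sad += abs(recon[r][c] - ref[r + dy][c + dx])
    # Reconstructed columns directly to the left of the block.
    for r in range(y, y + size):
        for c in range(x - border, x):
            sad += abs(recon[r][c] - ref[r + dy][c + dx])
    return sad


def pick_reference_mv(recon, ref, x, y, candidates, size=16, border=2):
    """Return the candidate mv whose displaced template matches best."""
    return min(candidates,
               key=lambda mv: template_sad(recon, ref, x, y, mv, size, border))
```

Because only already-reconstructed pixels are used, a decoder could repeat the same selection without any extra signaling.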

Test results:
       AVGPSNR  GLBPSNR VPXSSIM
Derf:  1.107%   1.062%  0.992%
Std-hd:1.209%   1.176%  1.029%

Change-Id: I8f10e09ee6538c05df2fb9f069abcaf1edb3fca6
2012-08-07 11:25:57 -07:00
Christian Duvivier
82edabce75 Add x86_64-darwin11-gcc target.
This allows building on MountainLion as the 10.6 SDK has been
removed from the latest Xcode version (4.4 4F250). Also fix
all warnings for that build.

Change-Id: Ib70bca4a25295f13595f0d10ea9f0229631de5a4
2012-08-06 15:26:58 -07:00
Deb Mukherjee
2af5473a90 Merging in high_precision_mv experiment
Merged in the high_precision_mv experiment to make it easier
to work on new mv encoding strategies. Also removed
coef_update_probs3().

Change-Id: I82d3b0bb642419fe05dba82528bc9ba010e90924
2012-08-03 13:38:49 -07:00
Jingning Han
fcbff9ee04 Replacing the 8x8 DCT with 8x8 ADST/DCT for I8x8
Fixed the code review comments.

Under the htrans8x8 experiment the 8X8 DCT in the
I8X8 mode is replaced with a combination of 8X8 ADST and
DCT.

Overall coding gains with the htrans8x8 experiment are:
derf:   0.486
std-hd: 1.040
hd:     1.063
yt:     0.506

Note that part of the gain comes from bigger transforms
(8x8 instead of 4x4) and part comes from replacing the DCT
with the ADST.

Change-Id: I92ca6bbfce11b4165d612b81d9adfad4d010c775
2012-08-03 12:02:07 -07:00
Daniel Kang
e6de9c2e5d Merge "16x16 DCT blocks." into experimental 2012-08-03 10:22:04 -07:00
Deb Mukherjee
4aabfaa5d0 Merge "Merging and bug-fix in enhanced_interp experiment" into experimental 2012-08-03 10:18:25 -07:00
Daniel Kang
fed8a1837f 16x16 DCT blocks.
Set on all 16x16 intra/inter modes

Features:
- Butterfly fDCT/iDCT
- Loop filter does not filter internal edges with 16x16
- Optimize coefficient function
- Update coefficient probability function
- RD
- Entropy stats
- 16x16 is a config option

Have not tested with experiments.

hd:     2.60%
std-hd: 2.43%
yt:     1.32%
derf:   0.60%

Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c
2012-08-02 17:33:10 -07:00
Jingning Han
c7846ebc34 Use 8x8 DCT transform for I8X8 prediction mode
Apply 2D-DCT transform of dimension 8x8 to encode prediction
residuals of I8X8 mode.
Brought back block type 3 probability context model for 8x8 tokens,
which is used for the coefficients of Y blocks in I8x8 modes. The
coefficient costs estimate of I8X8 mode in rate-distortion is also
changed appropriately.
Performance results:
derf:   0.246
yt:     0.114
std-hd: 0.730
hd:     0.670

Change-Id: If1d970eeb4e1827c9f0d2c5b27d33089b347ea27
2012-08-02 09:09:17 -07:00
Deb Mukherjee
0ebf548c75 Merging and bug-fix in enhanced_interp experiment
Merged the enhanced_interp experiment.
Found and fixed a bug in the include files framework, whereby
certain encoder files were still using the old INTERP_EXTEND
value of 3 instead of 4. The thresholds for mv range in mcomp.c
need a small adjustment to prevent crashes.

The results are more or less unchanged.

Change-Id: Iac5008390f1efc97ce1102fbb5f8989c847fb579
2012-07-31 11:45:31 -07:00
Deb Mukherjee
5259744145 Adds support for switchable interpolation filters.
Allows for switching/setting interpolation filters at the MB
level. A frame level flag indicates whether to use a specific
filter for the entire frame or to signal the interpolation
filter for each MB. When switchable filters are used, the
encoder chooses between 8-tap and 8-tap sharp filters. The
code currently has options to explore other variations as well,
which will be cleaned up subsequently.

One issue with the framework is that encoding is slow. I
tried to do some tricks to speed things up but it is still slow.
Decoding speed should not be affected since the number of
filter taps remains unchanged.

With the current version, we are up 0.5% on derf on average, but
some videos (city and mobile) improve by close to 4% and 2%, respectively.
If we did a full-search by turning the SEARCH_BEST_FILTER flag
on, the results are somewhat better.

The framework can be combined with filtered prediction, and I
seek feedback regarding that.

Rebased.

Change-Id: I8f632cb2c111e76284140a2bd480945d6d42b77a
2012-07-30 11:33:43 -07:00
Deb Mukherjee
96f9473866 Merge "Merges several experiments" into experimental 2012-07-27 12:22:55 -07:00
Deb Mukherjee
9984a155d6 Merges several experiments
The following five experiments are merged:

newentropy
newupdate
adaptive_entropy (also includes a couple of parameter changes
                  that improve results a little
                  in common/entropymode.c and encoder/modecosts.c
                  that were not merged from the internal branch)
newintramodes
expanded_coef_context

Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
2012-07-27 12:12:39 -07:00
jimbankoski
45e551b28f shared object on mac osx
Change-Id: Ibf357eb492e7d5883fbdf1ddf455e28767c1d65d
2012-07-25 19:39:33 -07:00
Jim Bankoski
1b16e74813 Dll build of libvpx
Change-Id: I74e50b4dfbe73eb98e1dce1695a9973f637220c0
2012-07-23 14:51:21 -07:00
Jingning Han
9824230fe3 Adds hybrid transform
Adds ADST/DCT hybrid transform coding for Intra4x4 mode.
The ADST is applied to directions in which the boundary
pixels are used for prediction, while DCT applied to
directions without corresponding boundary prediction.
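
The per-direction choice above can be sketched as follows. This is a floating-point illustration under stated assumptions, not the integer transform from the commit: the sine transform is written in the DST-VII form, and the names `dct_1d`, `adst_1d` and `choose_tx_type` are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a short vector."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out


def adst_1d(x):
    """Orthonormal sine transform (DST-VII form); its basis functions start
    near zero at the predicted boundary and grow away from it."""
    n = len(x)
    scale = 2.0 / math.sqrt(2 * n + 1)
    return [scale * sum(
        x[i] * math.sin(math.pi * (2 * k + 1) * (i + 1) / (2 * n + 1))
        for i in range(n)) for k in range(n)]


def choose_tx_type(uses_above, uses_left):
    """Pick (vertical, horizontal) 1-D transforms for an intra 4x4 block.
    ADST goes where boundary pixels drive the prediction, DCT elsewhere."""
    vertical = "ADST" if uses_above else "DCT"
    horizontal = "ADST" if uses_left else "DCT"
    return vertical, horizontal
```

For example, a purely vertical predictor uses only the row above, so this sketch would pick ADST vertically and DCT horizontally.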

Adds enum TX_TYPE in b_mode_info to indicate the transform
type used.

Make coding style consistent with google style.
Fixed the commented issues.

Experimental results in terms of bit-rate reduction:
derf:   0.731%
yt:     0.982%
std-hd: 0.459%
hd:     0.725%

Will be looking at 8x8 transforms next.

Change-Id: I46dbd7b80dbb3e8856e9c34fbc58cb3764a12fcf
2012-07-19 13:02:57 -07:00
Yaowu Xu
d632bf8cf5 removed floating point version 8x8 fdct
the integer version has very good precision, the float version is no
longer useful. this commit also removes the experiment option from
configure script.

Change-Id: Ibb92e63c9f5083357cdf89c559d584a7deb3353f
2012-07-17 22:50:47 -07:00
Yaowu Xu
11e23e673d cleanup experiments in configure
this commit removes a number of experiment options from configure
script. the associated experiments are already fully merged, the
options in configure script have no effect at all.

Change-Id: I8054ccaee0a04610162ed76ac9e59c4538217113
2012-07-17 22:34:07 -07:00
Hui Su
e44ee38aef Add lossless compression mode.
This commit adds lossless compression capability to the experimental
branch. The lossless experiment can be enabled using --enable-lossless
in configure. When the experiment is enabled, the encoder will use
lossless compression mode by command line option --lossless, and the
decoder automatically recognizes a losslessly encoded clip and decodes
accordingly.

To achieve the lossless coding, this commit has changed the following:
    1. To encode at lossless mode, encoder forces the use of unit
quantizer, i.e., Q 0, where effective quantization is 1. Encoder also
disables the usage of 8x8 transform and allows only 4x4 transform;
    2. At Q 0, the first order 4x4 DCT/IDCT have been switched over
to a pair of forward and inverse Walsh-Hadamard Transform
(http://goo.gl/EIsfy), with proper scaling applied to match the range
of the original 4x4 DCT/IDCT pair;
    3. At Q 0, the second order remains to use the previous
walsh-hadamard transform pair. However, to maintain the reversibility
in second order transform at Q 0, scaling down is applied to first
order DC coefficients prior to forward transform, and scaling up is
applied to the second order output prior to quantization. Symmetric
upscaling and downscaling are added around inverse second order
transform;
    4. In lossless mode, the encoder also disables a number of minor
features to ensure no loss is introduced. These features include:
        a. Trellis quantization optimization
        b. Loop filtering
        c. Aggressive zero-binning, rounding and zero-bin boosting
        d. Mode based zero-bin boosting
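
Why a Walsh-Hadamard pair can be exactly reversible at unit quantizer can be sketched as below. This is an unnormalized 4x4 transform chosen for illustration only; it does not reproduce the range-matching scaling that the commit applies, and the function names are hypothetical.

```python
def wht4(v):
    """Unnormalized 4-point Walsh-Hadamard transform (integer adds only)."""
    a, b, c, d = v
    return [a + b + c + d, a + b - c - d, a - b - c + d, a - b + c - d]


def wht4x4(block):
    """Apply wht4 to every row, then to every column."""
    rows = [wht4(r) for r in block]
    cols = [wht4(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def iwht4x4(coeffs):
    """The 4-point matrix is symmetric and satisfies H*H = 4*I, so applying
    the 2-D transform twice yields 16 times the input; the division is exact."""
    out = wht4x4(coeffs)
    return [[v // 16 for v in row] for row in out]
```

With no quantization between `wht4x4` and `iwht4x4`, the residual is recovered bit for bit, which is the property a lossless mode needs.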

Lossless coding test was performed on all clips within the derf set,
to verify that the commit has achieved lossless compression for all
clips. The average compression ratio is around 2.57 to 1.
(http://goo.gl/dEShs)

Change-Id: Ia3aba7dd09df40dd590f93b9aba134defbc64e34
2012-06-28 17:09:47 -07:00
Adrian Grange
bbc926dca2 Added Prediction Filter to Mode Selection
Added the ability to optionally filter the prediction data
when inter modes are selected (excludes SPLITMV, for now).

The mode selection loop considers both the filtered and
non-filtered prediction data when choosing mode. The filter
can be turned on/off at the frame-level, or signaled for
each MB.

Change-Id: I1b783c71d95a361ab36c761b07e8a6b06bc36822
2012-06-27 14:51:41 -07:00
Ronald S. Bultje
8a1c01d3e6 Reset executable flags for configure.
This was accidentally disabled in 1fe85a35e0.

Change-Id: I09dbfecfe45b28dec75b27a627e3065f9c7dc8b2
2012-06-18 14:21:30 -07:00
Deb Mukherjee
1fe85a35e0 Adaptive entropy coding of coefficients, modes, mv.
This patch incorporates adaptive entropy coding of coefficient tokens,
and mode/mv information based on distributions encountered in a frame.
Specifically, there is an initial forward update to the probabilities
in the bitstream as before for coding the symbols in the frame, however
at the end of decoding each frame, the forward update to the
probabilities is reverted and instead the probabilities are updated
towards the actual distributions encountered within the frame.
The amount of update is weighted by the number of hits within each
context.
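
The end-of-frame update can be sketched as a count-weighted blend between the probability in force before the frame and the distribution actually observed. The saturation count, the maximum update factor and the 8-bit probability scale below are assumptions for illustration, not values taken from the patch; `adapt_prob` is a hypothetical name.

```python
def adapt_prob(prev_prob, count0, count1, count_sat=24, max_update=128):
    """Move an 8-bit probability of symbol 0 toward its observed frequency.

    prev_prob      probability (1..255, out of 256) before the frame
    count0/count1  how often each binary symbol occurred in this context
    count_sat      hit count at which the update weight stops growing (assumed)
    max_update     largest blend weight, out of 256 (assumed)
    """
    total = count0 + count1
    if total == 0:
        return prev_prob  # context never hit: nothing to learn
    empirical = (256 * count0 + total // 2) // total
    empirical = min(255, max(1, empirical))
    # More hits in the context give the observed distribution more weight.
    factor = max_update * min(total, count_sat) // count_sat
    return (prev_prob * (256 - factor) + empirical * factor + 128) >> 8
```

Since both encoder and decoder see the same counts after a frame is coded, this adaptation costs no bits.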

Results on derf/hd/std-hd are all up by 1.6%.

On derf, most of the gains come from coefficients; however, for the
hd and std-hd sets, most of the gains come from the mode/mv
information updates.

Change-Id: I708c0e11fdacafee04940fe7ae159ba6844005fd
2012-06-15 10:35:23 -07:00
Deb Mukherjee
c5ddb7f016 Adds new Directional Intra prediction modes.
Adds 6 directional intra prediction modes for 16x16 and 8x8 blocks.

Change-Id: I25eccc0836f28d8d74922e4e9231568a648b47d1
2012-05-15 08:54:50 -07:00
Yaowu Xu
b22cc559b6 Changed to use integer 8x8 dct
The commit added an integer version of 8x8 forward DCT, based on the
original forward DCT from VP6. The constants, roundings, and shifts
were adjusted to improve the accuracy. The latest patch has a very
similar accuracy in terms of round trip error against the floating
point version.

It should be noted here that the purpose of the patch is to help
encoding speed and facilitate all other experiments. There will be
further review in combination with inverse DCT before finalization.

configure with "--enable-int_8x8fdct" to use the integer version

Change-Id: I5a4f80507429f0e07cf02a13768ec81cbfddc5bc
2012-05-15 07:28:26 -07:00
Jim Bankoski
a5d11f298f Fix configure issue with unit test add.
Change-Id: I960c6eb81f8d76c958e8af989700447f581a8812
2012-05-11 08:08:12 -07:00
James Berry
a0769f70f5 add unit test support via google test
adds unit testing via google test

Change-Id: I144b50a976d79251fc5135186a4e0a5051ed0e8c
2012-05-11 06:19:52 -07:00
Deb Mukherjee
813c6c3925 Expanding the coefficient encoding contexts
This patch expands the set of prev contexts used for video coding
from 3 to 4.

There is a small improvement of the order of 0.08% for derf and
0.15% on the HD set. The tests were rerun after the various merges
last week. There are two columns in each test - the first are the
results with the mbskip change, and the second with expanded contexts
added on top of that.

Derf:
http://www.corp.google.com/~debargha/vp8_results/explibvpx_newentropy_expcontext.html

HD:
http://www.corp.google.com/~debargha/vp8_results/explibvpx_hd_newentropy_expcontext.html

Rebased.

Broke up 80 char lines.

Change-Id: I82d2e72d054e530cbf5ce9aa0e6d85c582965675
2012-05-04 07:11:38 -07:00
Deb Mukherjee
c6f1bf4321 Differential encoding of probability updates
Adds differential encoding of prob updates using a subexponential
code centered around the previous probability value.
Also searches for the most cost-effective update, and breaks
up the coefficient updates into smaller groups.
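
A subexponential code gives short codewords to small values and grows only logarithmically for large ones, which suits updates that usually stay close to the previous probability. The sketch below is a textbook subexponential code with parameter k plus a simple zigzag mapping of the signed difference. The mapping and the function names are assumptions for illustration and need not match the patch.

```python
def subexp_encode(n, k):
    """Subexponential codeword for a non-negative integer, as a bit string."""
    if n < (1 << k):
        b, u = k, 0
    else:
        b = n.bit_length() - 1
        u = b - k + 1
    bits = "1" * u + "0"              # unary prefix selects the range
    if b:
        bits += format(n & ((1 << b) - 1), "0{}b".format(b))
    return bits


def subexp_decode(bits, k):
    """Return (value, number of bits consumed)."""
    u = 0
    while bits[u] == "1":
        u += 1
    pos = u + 1
    b = k if u == 0 else u + k - 1
    low = int(bits[pos:pos + b], 2) if b else 0
    n = low if u == 0 else (1 << b) | low
    return n, pos + b


def encode_prob_update(prev_prob, new_prob, k=2):
    """Code new_prob as a signed difference centered on prev_prob."""
    delta = new_prob - prev_prob
    zigzag = 2 * delta if delta >= 0 else -2 * delta - 1
    return subexp_encode(zigzag, k)
```

An encoder searching for the most cost-effective update could compare the length of this codeword against the bits saved by coding the frame with the new probability.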

Small gain on Derf: 0.2%

Change-Id: Ie0071e3dc113e3d0d7ab95b6442bb07a89970030
2012-04-23 23:02:52 -07:00
Adrian Grange
9daf3154db Superblock encoding order
This is the first patch to add superblock (32x32) coding
order capabilities. It does not yet do any mode selection
at the SB level, that will follow in a further patch.

This patch encodes rows of SBs rather than
MBs, each SB contains 2x2 MBs.

Two intra prediction modes have been disabled since they
require reconstructed data for the above-right MB which
may not have been encoded yet (e.g. for the bottom right
MB in each SB).
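
The availability problem can be checked with a small sketch that compares coding-order positions. It assumes 2x2 MBs per SB, SBs coded in raster order, and MBs inside an SB coded NW, NE, SW, SE; the helper names are hypothetical.

```python
def coding_index(mb_row, mb_col, mb_cols):
    """Monotonic position of an MB in superblock coding order."""
    sb_cols = (mb_cols + 1) // 2
    sb_index = (mb_row // 2) * sb_cols + (mb_col // 2)
    return sb_index * 4 + (mb_row % 2) * 2 + (mb_col % 2)


def above_right_available(mb_row, mb_col, mb_cols):
    """True if the above-right MB exists and was coded before this MB."""
    if mb_row == 0 or mb_col + 1 >= mb_cols:
        return False  # outside the frame
    return (coding_index(mb_row - 1, mb_col + 1, mb_cols)
            < coding_index(mb_row, mb_col, mb_cols))
```

In this model only the bottom-right MB of each SB loses its above-right neighbor, because that neighbor belongs to the next SB in the row.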

Results on the one test clip I have tried (720p GIPS clip)
suggest that it is somewhere around 0.2dB worse than the
baseline version, so there may be bugs.

It has been tested with no experiments enabled and with
the following 3 experiments enabled:
  --enable-enhanced_interp
  --enable-high_precision_mv
  --enable-sixteenth_subpel_uv
in each case the decode buffer matches the recon buffer
(using "cmp" to compare the dumped/decoded frames).
Note: Testing these experiments individually created
errors.

Some problems were found with other experiments but it
is unclear what state these experiments are in:
  --enable-comp_intra_pred
  --enable-newentropy
  --enable-uvintra

This code has not been extensively tested yet, so there
is every likelihood that further bugs remain. I also
intend to do some code cleanup & refactoring in tandem
with the next patch that adds the 32x32 modes.

Change-Id: I1eba7f740a70b3510df58db53464535ef881b4d9
2012-04-11 10:40:57 +01:00
Deb Mukherjee
57d953479b Adding contextual coding of mb_skip_coeff flag.
Using contextual coding of the mb_skip_coeff flag using the
values of this flag from the left and above. There is a small
improvement of about 0.15% on Derf:
http://www.corp.google.com/~debargha/vp8_results/mbskipcontext.html
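
A context built from the left and above flags can be sketched as below. The context is taken as the number of skipped neighbors, missing neighbors at the frame edge are assumed to count as not skipped, and the probability table is invented for illustration; none of these details are confirmed by the patch.

```python
import math

# Hypothetical P(flag == 0), out of 256, for contexts 0, 1 and 2.
SKIP_PROBS = (192, 128, 64)


def skip_context(above_skip, left_skip):
    """Context = how many of the two neighbors had the skip flag set."""
    return int(bool(above_skip)) + int(bool(left_skip))


def skip_flag_cost_bits(flag, above_skip, left_skip, probs=SKIP_PROBS):
    """Ideal cost in bits of coding the flag under its context."""
    p0 = probs[skip_context(above_skip, left_skip)] / 256.0
    p = (1.0 - p0) if flag else p0
    return -math.log2(p)
```

When both neighbors are skipped, a set flag becomes cheap to code, which is where the gain comes from on content with large static areas.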

Refactored to use pred_common.c by adding a new context type.

Results on HD set (about 0.66% improvement):
http://www.corp.google.com/~debargha/vp8_results/mbskipcontext_hd.html

Including missing refactoring to use the pred_common utilities.

Change-Id: I95373382d429b5a59610d77f69a0fea2be628278
2012-03-21 03:55:44 -07:00
Paul Wilkins
68033ca472 Snapshot candidate
Pulled out super block code for the snapshot as this
is not quite ready and will need an extensive re-merge.

Change-Id: I436369b511257447a7b0ea064016cb63f5011849
2012-03-07 11:24:33 +00:00
Ronald S. Bultje
d476165107 Compound intra prediction (b_pred/4x4 only, for now),
Also remove duplicate build_intra_predictors_mby/uv().

Change-Id: I78607e7304952a9b962a5b25af9bb9c48692187b
2012-02-28 17:41:03 -08:00
Paul Wilkins
19b9d28f70 Merge new loop filter.
Merge of the NEWLPF configuration experiment so it is always on.

Change-Id: I7054772b6eab28bad1ff807bfa54d98f83de9308
2012-02-28 20:58:52 +00:00
Deb Mukherjee
18e90d744e Supporting high precision 1/8-pel motion vectors
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.

Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpolation)

Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V

Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).

Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.
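
The relationship between 1/8 pel luma vectors and 1/16 pel chroma positions can be sketched as follows, assuming 4:2:0 subsampling so that a chroma plane has half the luma resolution. The helper names are hypothetical and the real code may derive chroma vectors differently.

```python
def split_mv(mv, frac_bits):
    """Split a fixed-point mv component into (integer pel, fractional index).
    The arithmetic shift floors, so negative vectors keep a non-negative
    fractional part."""
    return mv >> frac_bits, mv & ((1 << frac_bits) - 1)


def luma_position(mv_eighth):
    """Luma: the vector is stored in 1/8 pel units."""
    return split_mv(mv_eighth, 3)


def chroma_position_420(mv_eighth):
    """Chroma at half resolution: the same number read in 1/16 pel units."""
    return split_mv(mv_eighth, 4)
```

A luma vector of 13 (1 5/8 pel) lands at 13/16 of a chroma pixel, so the chroma filter bank needs sixteen phases to represent it exactly.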

With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over the
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html

Results on high precision mv and on the hd set are to follow.

Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c

Patch 7: Cleaning up various debug messages.

Patch 8: Merge conflict

Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
2012-02-23 09:25:21 -08:00
Paul Wilkins
46f9ad2ca5 Experimental code base simplification.
Remove error concealment code.

Change-Id: I882705174fbfea212e96f7f684e47a671dbe5c67
2012-02-15 16:08:47 +00:00
Yaowu Xu
d327dcf3aa moved segment based LPF level selection under CONFIG_FEATUREUPDATES
This commit moved segment based loop filter level selection into
the experiment of CONFIG_FEATUREUPDATES. As previous commit noted,
the segment based loop filter selection helps the compression by
~0.1% on cif set, the ongoing experiment CONFIG_FEATUREUPDATES
made encoding updates of the segment based LPF level more efficient,
hence, another .04% gain on cif set. The commit also fixed an issue
previously where encoder/decoder may use different loop filter level
for one of the segments.

Change-Id: Ia978b14aae95bb107d561ba53a7a2bb6ff01faf3
2012-02-15 07:18:05 -08:00
Paul Wilkins
9a8204d6ee Simplification of experimental code base.
Removed ~CONFIG_REALTIME_ONLY code.

Change-Id: I5fafff29a08acd8928699f9ddce8744787024d8c
2012-02-14 09:03:56 +00:00
Jim Bankoski
af8f1928d1 vp8 - config_featureupdates
Added a bit to signify that the feature changed since
the last time we sent it, or not so that we don't need
to send all the databits for every feature change.
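
The idea can be sketched as one "changed" bit per feature, followed by the data bits only when the value differs from the last one sent. The feature names, bit widths and ordering below are invented for illustration and do not describe the actual bitstream syntax.

```python
def write_features(prev, curr, bits_per_feature):
    """Serialize feature values as a bit string.

    prev, curr        dicts mapping feature name -> non-negative value
    bits_per_feature  dict mapping feature name -> data width in bits
    """
    out = []
    for name in sorted(curr):
        if prev.get(name) == curr[name]:
            out.append("0")  # unchanged: a single bit, no data
        else:
            width = bits_per_feature[name]
            out.append("1" + format(curr[name], "0{}b".format(width)))
    return "".join(out)
```

In this sketch, two features of 6 and 7 bits cost 2 bits when nothing changes instead of 13.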

added config

Change-Id: I8d3064ce90d4500bf0d5c6b87c664e46138dfcac
2012-02-13 12:31:12 -08:00
Paul Wilkins
2615ca5d41 Removal of threading code.
For the experimental branch we are trying to slim the codebase
down removing features such as threading for now which complicate
the process of development and testing.

Change-Id: I657c0246aef4d1fa8c8ffc6a1adfeee45bce8e24
2012-02-10 16:23:59 +00:00
Ronald S. Bultje
29e4d7e861 Merge dualpred (compound prediction) experiment.
Change-Id: Ieaaa07c50eae41118596197f6a4d848135946e41
2012-02-09 16:29:18 -08:00
Adrian Grange
5d0b5a17d9 Added encoding in Superblock Order
As a precursor to encoding 32x32 blocks this cl adds the
ability to encode the frame one superblock (= 32x32 block) at
a time. Within a SB the 4 individual MBs are encoded in
raster-order (NW,NE,SW,SE).
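
The traversal can be sketched as a generator over macroblock coordinates. It assumes SBs themselves are visited in raster order and that MBs falling outside the frame in a partial SB are simply skipped; `superblock_order` is a hypothetical name.

```python
def superblock_order(mb_rows, mb_cols):
    """Yield (mb_row, mb_col) in superblock order: SBs in raster order,
    and the MBs inside each 2x2 SB as NW, NE, SW, SE."""
    for sb_row in range(0, mb_rows, 2):
        for sb_col in range(0, mb_cols, 2):
            for d_row, d_col in ((0, 0), (0, 1), (1, 0), (1, 1)):
                row, col = sb_row + d_row, sb_col + d_col
                if row < mb_rows and col < mb_cols:
                    yield row, col
```

For a frame 4 MBs wide and 2 MBs high this visits (0,0), (0,1), (1,0), (1,1) before moving on to (0,2).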

This functionality is added as an experiment which can be
enabled by specifying --enable-superblocks in the
command line specified to configure (CONFIG_SUPERBLOCKS
macro in the code).

To make this work I had to disable the two intra
prediction modes that use data from the top-right of the
MB.

On the tests that I have run the results produce
almost exactly the same PSNRs & SSIMs with a very
slightly higher average data rate (and slightly higher
data rate than just disabling the two intra modes in
the original code).

NOTE: This will also break the multi-threaded code.

This replaces the abandoned change:
Iebebe0d1a50ce8c15c79862c537b765a2f67e162

Change-Id: I1bc1a00f236abc1a373c7210d756e25f970fcad8
2012-02-02 10:30:57 -08:00
Paul Wilkins
b2f64dff7d Added common prediction modules.
This commit adds the common prediction modules, some data structures
and a config option but does not use them.

It also corrects a bug in clearing down the MODE_INFO border and introduces
a new element that indicates if an entry corresponds to an "in image" macro block
or is part of the border.

Change-Id: Ib69eec0876173ebe9d1de9df9537d0b2447702e0
2012-01-31 12:53:36 +00:00
Deb Mukherjee
6fa47a5f16 Adds support for enhanced interpolation for subpel motion
using an 8-tap filter.

The results with 3 different 8-tap filters on the derf set are in:
http://www.corp.google.com/~debargha/vp8_results/enhinterp.html
The one that gives the most gain achieves an overall gain of about
0.6%. The results for a set of 12 hd (720p) videos are in:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd.html
with max gain of 0.55% with the same filter. The best filter apparently
achieves the best trade-off between pass band ripple and stop band
attenuation.

Change-Id: I919e28ae245c0493147fa0864f8c9d048a9dd530
2012-01-26 10:24:47 -08:00
Yaowu Xu
5e7d7d3d95 new loop filter functions for macroblock boundaries
The commit adds a new set of loop filter for macroblock edge filtering.
The new loop filter has a mask to detect so-called "flat" regions. The
detection checks 5 pixels on each side of an edge. If all pixels
have values within +/-1 of the edge pixel on the same side, the region
is treated as a "flat" region. For such case, a 7 tap filter is used
to change 3 pixel values on each side. The 7 taps are:
               [1, 1, 1, 2, 1, 1, 1]/8
The furthest away pixels used as input are +/-5 away from edge. For
non-flat region, we fall back to old filtering. It should be noted
here that the thresholds and filter taps may require more optimization
for best possible results.
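
The flat test and the 7-tap smoothing can be sketched on one line of pixels crossing the edge. Two details are assumptions of this sketch rather than statements from the commit: taps that would reach beyond the fifth pixel reuse that pixel, and the result is rounded by adding 4 before the shift. The non-flat fallback to the old filter is not shown.

```python
TAPS = (1, 1, 1, 2, 1, 1, 1)  # sums to 8


def is_flat(side):
    """side[0] is the pixel at the edge, side[1:5] move away from it."""
    return all(abs(p - side[0]) <= 1 for p in side[1:5])


def filter_edge(p, q):
    """p and q hold 5 pixels each, nearest-to-edge first. Returns the
    filtered (p, q); non-flat input is returned unchanged here."""
    if not (is_flat(p) and is_flat(q)):
        return list(p), list(q)
    line = list(reversed(p)) + list(q)  # p4 p3 p2 p1 p0 | q0 q1 q2 q3 q4

    def tap(center):
        acc = 0
        for weight, offset in zip(TAPS, range(-3, 4)):
            idx = min(max(center + offset, 0), len(line) - 1)
            acc += weight * line[idx]
        return (acc + 4) >> 3

    out = line[:]
    for center in range(2, 8):  # p2, p1, p0, q0, q1, q2
        out[center] = tap(center)  # always reads the unfiltered line
    return out[4::-1], out[5:]
```

On a step from 100 to 108 this turns the hard edge into a ramp across the six modified pixels while leaving the two outermost pixels on each side untouched.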

Tests on a set of hd clips showed consistent gains:
http://www.corp.google.com/~yaowu/no_crawl/mblpf_hd.html
(avg psnr: .83% glb psnr: .77% ssim: .82%)

Tests on derf set also showed consistent gains:
http://www.corp.google.com/~yaowu/no_crawl/mblpf_derf.html
(avg psnr: .24% glb psnr: .22% ssim: .48%)

Change-Id: I0855b1ff48e79e1175c20b81967137e18b2af352
2012-01-18 09:51:29 -08:00
Yaowu Xu
b70f23caec Removed #if CONFIG_MULCONTEXT
This commit removed the macro CONFIG_MULCONTEXT, which was used to
indicate the experiment code for using separate context for altref
and normal frames. This commit made the change fully merged in.

Change-Id: I525f927f68e2365d37b340ef23b836a136a4f70b
2011-12-07 14:01:07 -08:00
Yaowu Xu
d37cd97682 Removed #if CONFIG_I8X8
This commit removed the macro CONFIG_I8X8, which was used to indicate
the 8x8 intra prediction experiment, made the change fully merged in.

Change-Id: Iafa4443781ce6e83f5591c12ba615a0e92ce0ea0
2011-12-07 13:48:53 -08:00
Ronald S. Bultje
60cb39da86 Dual 16x16 inter prediction.
This patch introduces the concept of dual inter16x16 prediction. A
16x16 inter-predicted macroblock can use 2 references instead of 1,
where both references use the same mvmode (new, near/est, zero). In the
case of newmv, this means that two MVs are coded instead of one. The
frame can be encoded in 3 ways: all MBs single-prediction, all MBs dual
prediction, or per-MB single/dual prediction selection ("hybrid"), in
which case a single bit is coded per-MB to indicate whether the MB uses
single or dual inter prediction.
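
The two pieces described above, combining the predictors and deciding per MB whether to do so, can be sketched as below. The rounded average and the string mode names are assumptions of this sketch; the patch's own averaging functions and syntax elements may differ.

```python
def average_predictors(pred_a, pred_b):
    """Combine two motion-compensated predictions by a rounded average."""
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_a, pred_b)]


def mb_uses_dual(frame_mode, mb_flag=None):
    """frame_mode is 'single', 'dual' or 'hybrid'; only 'hybrid' reads the
    per-MB bit."""
    if frame_mode == "single":
        return False
    if frame_mode == "dual":
        return True
    if frame_mode == "hybrid":
        return bool(mb_flag)
    raise ValueError("unknown frame mode: {!r}".format(frame_mode))
```

Averaging two references tends to cancel uncorrelated noise in the predictors, which is one plausible source of the gains at medium and high bitrates.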

In the future, we can (maybe?) get further gains by mixing this with
Adrian's 32x32 work, per-segment dual prediction settings, or adding
support for dual splitmv/8x8mv inter prediction.

Gain (on derf-set, CQ mode) is ~2.8% (SSIM) or ~3.6% (glb PSNR). Most
gain is at medium/high bitrates, but there are minor gains at low bitrates
also. Output was confirmed to match between encoder and decoder.

Note for optimization people: this patch introduces a 2nd version of
16x16/8x8 sixtap/bilin functions, which does an avg instead of a
store. They may want to look and make sure this is implemented to
their satisfaction so we can optimize it best in the future.

Change-Id: I59dc84b07cbb3ccf073ac0f756d03d294cb19281
2011-12-06 11:53:02 -08:00