Implements wedge mask generation by pre-computing arrays.
Improves encode speed by 15-20%.
Also consolidates the mask generation code for inter-inter
and inter-intra.
Change-Id: If1d4ae2babb2c05bd50cc30affce22785fd59f95
Changes ensure wedge_partition, interintra and palette expts all
work alongside the ext_coding_unit_size experiment.
Change-Id: I18f17acb29071f6fc6784e815661c73cc21144d6
These changes have been made in preparation for the work on the
extended coding unit size experiment.
Change-Id: I83f289812426bb9aba6c4a5fedd2b0e0a4fe17cb
Changes to allow high bit depth and palette to be enabled at the
same time by using a 16 bit (instead of 8bit) palette when high
bit depth is enabled and modifying related functions accordingly.
Change-Id: I97d30b4d9338d3a51db02c94bc568eba60a8905d
For 444 videos, a single palette of 3-d colors is
generated for YUV. For 420 videos, there may be two
palettes, one for Y, and the other for UV.
Also fixed a bug when palette and tx-skip are both on.
on derflr
--enable-palette +0.00%
with all experiments +5.87% (was +5.93%)
on screen_content
--enable-palette +6.00%
--enable-palette --enable-tx_skip +15.3%
on screen_content 444 version
--enable-palette +6.76%
--enable-palette --enable-tx_skip +19.5%
Change-Id: I7287090aecc90eebcd4335d132a8c2c3895dfdd4
All stats look fine.
derflr: +0.912 with respect to 10-bit internal baseline
(Was +0.747% w.r.t. 8 bit)
+5.545 with respect to 8-bit baseline
Change-Id: I3c14fd17718a640ea2f6bd39534e0b5cbe04fb66
Added palette coding option for Y channel only, key frame only.
Palette colors are obtained with k-means clustering algorithm.
Run-length coding is used to compress the color indices.
On screen_content:
--enable-palette + 5.75%
--enable-palette --enable-tx_skip +15.04% (was 13.3%)
On derflr:
--enable-palette - 0.03%
with all the other expriments + 5.95% (was 5.98%)
Change-Id: I6d1cf45c889be764d14083170fdf14a424bd31b5
Also fixes some build issues with certain experimental combinations.
Results on derflr with all experiments on: +5.516%
Change-Id: I9b492f3d3556bd1f057005571dc9bee63167dd95
A smooth weighting scheme is used to put more weight
on the intra predictor samples near the left/top boundaries
and decaying it to favor the inter predictor samples more as
we move away from these boundaries in the direction of
prediction.
Results:
derflr: +0.609% with only this experiment
derflr: +3.901% with all experiments
Change-Id: Ic9dbe599ad6162fb05900059cbd6fc88b203a09c
Preliminary 64x64 transform implementation.
Includes all code changes.
All mismatches resolved.
Coding results for derf and stdhd are within noise. stdhd is slightly
higher, derf is slightly lower.
To be further refined.
Change-Id: I091c183f62b156d23ed6f648202eb96c82e69b4b
Incorporates the WRAPLOW macro into the non-highbitdepth transforms
to aid hardware verification between a software C model and an
intended hardware implementation though the use of the configure
options: --enable-experimental --enable-emulate-hardware.
Note that to avoid further discrepancies between the sse/sse2
implementations of the transforms and the C implementation, when the
emulate hardware option is invoked, we also disable sse/sse2/etc.
Also incudes some minor cleanups/renaming etc.
Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287
The warning messages complained that there are unused arguments
in a few prediction modes. This structure was designed on purpose,
such that a wrapper function can cover all prediction mode cases
and make them readily accessible as an pointer array.
This commit silences such warnings.
Change-Id: I7036b6bdb70747e5327d8f6fceb154f100abc4c0
Fixed dr memory errors reported in Issue 736:
https://code.google.com/p/webm/issues/detail?id=736
All elements in left_col buffer need to be initialized to ensure
the correctness of SIMD operations in x86 optimized code.
Change-Id: I8e7f26ab45cca8099c1f9342bcf852f828bda7e4
This probably has a mildly negative impact on performance, but will
(in future commits - or possibly merged with this one) allow SIMD
implementations of individual intra prediction functions. We may
perhaps want to consider having separate functions per txfm-size
also (i.e. 4x4, 8x8, 16x16 and 32x32 intra prediction functions for
each intra prediction mode), but I haven't played much with that
yet.
Change-Id: Ie739985eee0a3fcbb7aed29ee6910fdb653ea269
This commit enables configurable reference buffer pointer for intra
predictor. This allows later removal of spatial dependency between
blocks inside a 64x64 superblock in the rate-distortion optimization
loop.
Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1
Removed one 4x4 prediction step that was unnessary in the rd loop.
Removed a unused modecosts estimate from encoder side.
Change-Id: I65221a52719d6876492996955ef04142d2752d86