Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.
This patch maskes the encoder faster in both the high bitdepth and 8bit
configurations. In the high bitdepth configuration, it effects profile 0
only.
Based on my profiling using 1080p input the net gain is between 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.
Change-Id: I86714a6b7364da20cd468cd784247009663a5140
Add palette mode for keyframe luma channel. Palette mode is enabled
when using "--tune-content=screen" in encoding config parameters.
on screen_content testset: +6.89%
on derlr : +0.00%
Design doc (WIP):
https://goo.gl/lD4yJw
Change-Id: Ib368b216bfd3ea21c6c27436934ad87afdaa6f88