This commit allows the encoder to process tile coding per 64x64
block. The supported upper limit of tile resolution is the minimum
of frame size and 4096 in each dimension. To turn it on, set
--experiment --row-tile
and compile.
This overrides the old --tile-columns and --tile-rows configurations.
These two parameters now tell the encoder the width and height of
a tile in units of 64x64 blocks. For example,
--tile-columns=1 --tile-rows=1
will make each tile contain a single 64x64 block.
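As an illustration of the new meaning of these parameters, the sketch
below derives how many tiles cover one frame dimension when the tile
size is given in 64x64 blocks. The names SB_SIZE, pixels_to_sb and
tile_count are hypothetical and not taken from the codec; rounding up
at the frame edge is an assumption.

```c
#include <assert.h>

#define SB_SIZE 64 /* tile sizes are expressed in 64x64 blocks */

/* Number of 64x64 blocks needed to cover `pixels`, rounded up. */
static int pixels_to_sb(int pixels) {
  return (pixels + SB_SIZE - 1) / SB_SIZE;
}

/* Number of tiles along one dimension when each tile spans
 * `tile_size_sb` 64x64 blocks. The last tile may be smaller.
 * (Assumed behaviour, for illustration only.) */
static int tile_count(int frame_pixels, int tile_size_sb) {
  assert(frame_pixels > 0 && tile_size_sb > 0);
  const int sb = pixels_to_sb(frame_pixels);
  return (sb + tile_size_sb - 1) / tile_size_sb;
}
```

Under these assumptions, a 1920-pixel-wide frame with
--tile-columns=1 would be covered by 30 tile columns.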
Change-Id: Id515749a05cfeb9e9d008291b76bdfb720de0948
This commit allows the internal codec to handle arbitrary tile sizes
in units of 64x64 pixel blocks.
Change-Id: I3ad24de392064645bebab887c94e1db957794916
Make the 2D tile info arrays global variables. This resolves
the stack overflow caused by excessively large local tile info
variables, and allows the internal operation to support
up to 1024 row and column tiles.
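A minimal sketch of the idea, assuming a small per-tile record: a
1024x1024 table of even 8-byte entries is 8 MiB, which is too large
for a typical thread stack but fine in static storage. The struct
layout and all names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TILE_ROWS 1024 /* limits taken from the text above */
#define MAX_TILE_COLS 1024

/* Hypothetical per-tile record; the real structure is larger. */
typedef struct {
  int sb_row_start;
  int sb_col_start;
} TileInfoSketch;

/* Static storage duration: the table is not placed on any
 * function's stack frame. */
static TileInfoSketch g_tile_info[MAX_TILE_ROWS][MAX_TILE_COLS];

static void set_tile(int row, int col, int sb_row_start, int sb_col_start) {
  assert(row >= 0 && row < MAX_TILE_ROWS);
  assert(col >= 0 && col < MAX_TILE_COLS);
  g_tile_info[row][col].sb_row_start = sb_row_start;
  g_tile_info[row][col].sb_col_start = sb_col_start;
}

/* Total size of the table in bytes. */
static size_t tile_info_bytes(void) { return sizeof(g_tile_info); }
```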
Change-Id: I6644cc929e5d3a778a5c03a712ebfc0b8729f576
This commit allows the codec to use up to row tiles (optionally
in combination with up to 64 column tiles per row tile). The
minimum tile size is set to a 256x256 pixel block.
Change-Id: I811ca93f0c5eba41e190f6c7c0f064d1083f530f
The max and min tile number reference should be used to support
both row and column tiles. This commit renames the previous col
prefix to avoid confusion.
Change-Id: I487bea43701af946b79023597a9a9a0516480380
Runborgs results on derflr are consistent between NEW_INTER
and the previous combination of NEWMVREF and COMPOUND_MODES.
Change-Id: Ieba239c4faa7f93bc5c05ad656a7a3b818b4fbfc
Test VP9/EndToEndTestLarge.EndtoEndPSNRTest/1 (422 stream) failed when
supertx was enabled. This was because 4x8 and 8x4 blocks were not being
split into 4x4s during tokenization in the encoder. This patch
uses vp9_foreach_transformed_block() to fix this.
Change-Id: I1f1cb27474eb9e04347067f5c4aff1942bbea8d9
Fix the row tile boundary detection issues. This allows more
resources to be used for parallel encoding/decoding when available.
Change-Id: Ifda9f66d1d7c2567dd4e0a572a99a83f179b55f9
Besides code cleanup, this patch contains 3 fixes and 1 adjustment:
(1) Fixed the COMPOUND_MODES for the NEW_NEWMV mode;
(2) Fixed the joint search when the NEAR_FORNEWMV mode (in NEWMVREF)
is being evaluated;
(3) Fixed the WEDGE_PARTITION when the NEAR_FORNEWMV mode (in NEWMVREF)
is being evaluated;
(4) Adjusted the entropy probability value for the NEAR_FORNEWMV mode.
On derflr turning on all 14 experiments (except for global-motion), the
average gain w.r.t. PSNR is +0.07%:
Maximum on bridge_far_cif: +1.02%
Minimum on hallmonitor_cif: -0.16%
Change-Id: I4c9c6ee24a981af7e655a629580641d9f9745f91
Use separate token probabilities and counters for non-transform
blocks (pixel domain). Initial probabilities are trained with screen_content
clips. On screen_content, it improves coding performance by about
2% (from +16.4% to +18.45%).
The initial probabilities are not optimized for natural videos, so this
patch should not be used for them. Set FOR_SCREEN_CONTENT to 0 or 1 to
disable or enable this patch.
Change-Id: Ifa361c94bb62aa4b783cbfa50de08c3fecae0984
Implements a first version of global motion where the
existing ZEROMV mode is converted to a translation-only
global motion mode.
A lot of the code for supporting a rotation-zoom affine
model is also incorporated.
WIP.
Change-Id: Ia1288a8dfe82f89484d4e291780288388e56d91b
The 8x8 DCT uses a fast version whenever possible.
There was a mistake in the checking code which
meant sometimes the fast version was used when it
was not safe to do so.
Change-Id: I154c84c9e2d836764768a11082947ca30f4b5ab7
When the only non-zero coefficients are in the first
4x4 block, a special routine is called.
The highbitdepth optimized version of this routine
examined the wrong positions when deciding whether
to call an assembler or C inverse transform.
Change-Id: I62da663ca11775dadb66e402e42f4a1cb1927893
This is a combination of:
4a19fa6 Added sse2 acceleration for highbitdepth variance
c6f5d3b Fix high bit depth assembly function bugs
Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f
This change is made in preparation for a
subsequent patch which adds acceleration
for the highbitdepth transform functions.
The highbitdepth transform functions attempt
to use 16/32-bit SSE instructions where possible,
but fall back to the C implementations if
potential overflow is detected. For this reason
the dct routines are made global so they can be
called from the acceleration functions in the
subsequent patch.
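The fallback pattern described above can be sketched as follows. This
is not the actual transform code: butterfly_16bit is a plain-C stand-in
for an SSE path with 16-bit lanes, a single butterfly stage stands in
for the DCT, and all function names and the range check are
assumptions. `n` is assumed to be even.

```c
#include <stdint.h>

/* Reference path: one butterfly stage in 32-bit arithmetic. */
static void butterfly_c(const int32_t *in, int32_t *out, int n) {
  for (int i = 0; i < n / 2; ++i) {
    out[i] = in[i] + in[n - 1 - i];
    out[n - 1 - i] = in[i] - in[n - 1 - i];
  }
}

/* Stand-in for the accelerated path: same stage, 16-bit lanes. */
static void butterfly_16bit(const int32_t *in, int32_t *out, int n) {
  for (int i = 0; i < n / 2; ++i) {
    const int16_t a = (int16_t)in[i];
    const int16_t b = (int16_t)in[n - 1 - i];
    out[i] = (int16_t)(a + b);
    out[n - 1 - i] = (int16_t)(a - b);
  }
}

/* Returns 1 if every input is in [-16384, 16383], so that any
 * sum or difference of two inputs still fits in 16 bits. */
static int safe_for_16bit(const int32_t *in, int n) {
  for (int i = 0; i < n; ++i) {
    if (in[i] > 16383 || in[i] < -16384) return 0;
  }
  return 1;
}

/* Uses the narrow path when safe, otherwise the C reference.
 * Returns 1 if the narrow path was taken, 0 otherwise. */
static int butterfly_dispatch(const int32_t *in, int32_t *out, int n) {
  if (safe_for_16bit(in, n)) {
    butterfly_16bit(in, out, n);
    return 1;
  }
  butterfly_c(in, out, n);
  return 0;
}
```

The dispatcher can only call butterfly_c if that function is visible
outside its original file, which is why the C routines need to be
global.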
Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665
It is a minor change, but the essential idea is to use the mv of the
top right block as the nearmv for the bottom left partition in the
sub8x8 block. The change is part of the NEWMVREF experiment.
When all 13 experiments are on (except for INTRABC), the gain is +0.05%:
Worst on bowing_cif: -0.17%
Best on foreman_cif: +0.42%; and bridge_far_cif: +0.40%
The total 13 experiments achieved a gain of +6.97% against base.
Change-Id: I3a51d9e28b34b0943fe16a984d62bfb38304ebca
vp9_quantize_rect did illegal shifts but didn't use the results.
The shift |a << b| is unfortunately undefined if |a < 0|, but the
more verbose |a * (1 << b)| generates the same machine code.
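For illustration, a minimal helper using the multiplication form.
The name scale_pow2 is hypothetical and is not the code in
vp9_quantize_rect. Only the shift of a negative value is avoided
here; signed overflow of the product is still undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplies `a` by 2^b without left-shifting a negative value.
 * (1 << b) shifts a positive constant, which is well defined
 * for 0 <= b < 31 with a 32-bit int. */
static int32_t scale_pow2(int32_t a, int b) {
  assert(b >= 0 && b < 31);
  return a * (1 << b);
}
```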
Change-Id: I7ceac66fa20a700630cf8ed008949146b161dab4