Cleans up the experiment. Actually uses reduced counts for backward
updates, and reduced number of probabilities in the context.
No change in bitstream when the experiment is on.
Between expt on and off:
derfraw300 is down only -0.062% (which is better than when expts
were run previously).
Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
The new code is 0x49, 0x83, 0x42
There is nothing particularly special about this code bitstream wise.
Its derivation is the word "sync" coded using 4x6bit alphabetic indices.
Change-Id: Ie2430a854af32ddc5a5c25a6c1c90cf6497ba647
The recursive partition type search is enabled down to 4x4, 4x8 and
8x4, followed by the corresponding rate-distortion optimization for
the per-partition encoding mode decisions.
The bit-stream writing/reading synchronized in supporting the
rectangular partition of 8x8 block.
This provides above 1% coding performance gains on derf.
To do next:
1. re-design the rate-distortion loop for inter prediction below 8x8.
2. re-design the rate-distortion loop for intra prediction below 4x4.
3. make the loop-filter aware of rectangular partition of 8x8 block.
4. clean the unused probability models.
5. update default probability values.
Change-Id: Idd41a315b16879db08f045a322241f46f1d53f20
Correct the stride parameter of 4x8 in vp9_sub_pixel_variance4x8_
and vp9_sub_pixel_avg_variance4x8.
Change-Id: I2ca74d4043817503b21737563994270e3b0619ff
Replace vp9_kf_default_bmode_counts structure with
direct default probabilities. The probability structure is
smaller and it removes the need to specify in the bitstream
how to convert the counts to probabilities.
Note that I have concerns still about the size and value of
the large intra mode context. This may cause problems for
HW but it also means we rely heavily on reverse update as
forwards update of a structure this size is problematic. I
intend to review this more generally in the next few days to
see if we can come up with a competitive solution that does
not rely on such a large context.
Change-Id: I0a36071079d5d26a57ab0e9fbf91af4199aa7984
This is a mostly-working implementation of an extra channel in the
bitstream. Configure with --enable-alpha to test. Notable TODOs:
- Add extra channel to all mismatch tests, PSNR, SSIM, etc
- Configurable subsampling
- Variable number of planes (currently always uses all 4)
- Loop filtering
- Per-plane lossless quantizer
- ARNR support
This implementation just uses the same contents as the Y channel
for the A channel, due to lack of content and general pain in
playing back 4 channel content. A later patch will use the actual
alpha channel passed in from outside the codec.
Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592
Deprecate set_block_index. Replace it with get_sb_index_ for
consistency with partition search and bit-stream writing/reading.
Use b_width/height_log2 instead of mi_width/height_log2, to support
4x4 resolution partition types.
Change-Id: Ic1e71981e163c669f7ea6b3c12b831c284c4a494
Replace mi_width/height_log2 with b_width/height_log2 in partition
type parsing at bit-stream writing stage. This allows parsing
resolution at 4x4 block level and makes the 4x4/4x8/8x4 partition
coding consistent with other superblock types.
Change-Id: I7db3617ea042e0db2dc898999b0c323bff91a22f
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45
These building blocks enable rate-distortion optimization search
over block sizes of 8x4 and 4x8. Need to convert them into mmx/sse
forms.
Change-Id: I570ea2d22d14ceec3fe3575128d7dfa172a577de
This patch creates a new inter mode contest that avoids
a dependence on the reconstructed motion vectors from
neighboring blocks. This was a change requested by
a hardware vendor to improve decode performance.
As part of this change I have also made some modifications
to stats output code (under a flag) to allow accumulation of
inter mode context flags over multiple clips
Some further changes will be required to accommodate the
deprecation of the split mv mode over the next few days.
Performance as stands is around -0.25% on derf and
std-hd but up on the YT and YT-HD sets. With further tuning
or some adjustment to the context criteria it should be
possible to make this change broadly neutral.
Change-Id: Ia15cb4470969b9e87332a59c546ae0bd40676f6c
Adds a subsampling aware border extension function. This may be reworked
soon to support more than 3 planes.
Change-Id: I76b81901ad10bb1e678dd4f0d22740ca6c76c43b
This commit allows proper transform type (DCT/ADST) selection in
the settings of partition 4x4 level.
Change-Id: Iec6f922a46480d777e7ca9142a99e8c131f0077b
Always initialize the mode_info with sb_type of BLOCK_SIZE_MB16X16
for the first-pass encoding test.
Change-Id: Ic86393eeef981bdd523a5b44cfac3f0b24c068b7
Combining encode_nmv_component with encode_nmv_component_fp
and read_nmv_component with read_nmv_component_fp. Bitstream is slightly
changed (only the order of bits), here are the results on test sets:
stdhd: +0.047, yt: -0.038, derf: +0.001, hd: -0.011.
Change-Id: I1be312e976796df78ca63368702d0ee19f2b8c50
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1
Trial use of a combination of reference frame,
prediction block size and mv to define segmentation.
Change-Id: Ie8946a0446dbad777fdcf7626f89e5af0994db50
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a
This commit allows the rate-distortion optimization recursion
at encoder to go down to 4x4 block size. It deprecates the use
of I4X4_PRED and SPLITMV syntax elements from bit-stream
writing/reading. Will remove the unused probability models in
the next patch.
The partition type search and bit-stream are now capable of
supporting the rectangular partition of 8x8 block, i.e., 8x4
and 4x8. Need to revise the rate-distortion parts to get these
two partition tested in the rd loop.
Change-Id: I0dfe3b90a1507ad6138db10cc58e6e237a06a9d6
Allow motion search multiple times iteratively, and break out
the loop if this search couldn't find better motion vectors.
Limit the maximum number of search to 2.
Tests results:
1. stdhd set: 0.311%(overall psnr); 0.346%(ssim).
positive gain on 10 out of 16 clips(best: 2.746% on sunflower;
worst: -0.434% on old_town_cross).
2. derf set: 0.016%(overall psnr); 0.062%(ssim).
positive gain on half of the clips(best: 0.499% on bowing;
worst: -0.387 on city).
Change-Id: Ibf0a51776d4caf7707be0586346db08128117559
Change band calculation back to simpler model based
on the order in which coefficients are coded in scan order
not the absolute coefficient positions.
With the scatter scan experiment enabled the results were
appear broadly neutral on derf (-0.028) but up a little on std-hd +0.134).
Without the scatterscan experiment on the results were up derf as well.
Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791
Move set_partition_seg_context_ to common file. Use consistent
context setup conditions for partition probability model update at
encoder and decoder.
Change-Id: I24b7ed3b1c48e3d2568191a46b70136b99b67b1a
Use 4x4 block coding for UV components arbitrarily in I4X4_PRED and
SPLITMV coding modes. This is a temporary solution to enable
bit-stream support for recursive partition down to 4x4 block size.
Will separate the functionalities of 4x4 block coding rate-distortion
out from those of superblocks.
Change-Id: I03dc15d5897014f175f3f2c91e9b266091d56797
In current code, motion vectors got from single prediction mode are used
in compound prediction mode directly. These motion vectors may not give
accurate prediction since they are searched independently. In this patch,
we took Pascal's suggestion, and did joint motion search in compound
prediction mode to find better motion vectors in this situation.
Test results:
Overall PSNR: 0.570%(derf), 0.918%(stdhd);
SSIM: 0.572%(derf), 1.009%(stdhd);
The encoder is a little slower. This can be improved since some c
code is used in motion search.
Change-Id: Ib30c9240f6c56c9b070867b4ca89412a76d9f3c6
This commit enables the search for the optimal superblock
partition types in the recursion form. The intention is to
make the optimization process more concise and ready to
support partition down to 4x4 block size next.
Change-Id: Iae279a67df3a7cc372553c84c775bc4d2f3e4336
Make framebuffer allocations according to the chroma subsamping
factors in use. A bit is placed in the raw part of the frame header for
each of the two subsampling factors. This will be moved in a future
commit to make them part of the TBD feature set bits, probably only set
on keyframes, etc.
Change-Id: I59ed38d3a3c0d4af3c7c277617de28d04a001853
The chroma planes are not used during the first pass encode,
but the vp9_encode_sb() function was operating on them anyway.
This was causing the use of uninitialized memory.
Change-Id: I5ebafcd3d5e34ed91a8336dad159b573995a939f
Update and buffer left/above partition information context per 8x8
block. This allows to further enable recursive partition down to
4x4 block size, and hence deprecating I4X4_PRED and SPLITMV.
This commit also fixes a context buffer swap/restore issue in 32x32
partition type search. This gives 0.1% performance gain for derf/yt.
Will refactor the superblock partition type search into recursion
form.
Change-Id: Ib61975aca5f12b78d8018481d7fa1393d085689b
Makes the temporary storage of the filtered data agnostic to
the number of planes and how they're subsampled.
Change-Id: I12f352cd69a47ebe1ac622af30db29b49becb7f4
Skip Q values between the q.0 mode and a real q of
2.0 as these are not valuable from an RD perspective.
Change-Id: I110c4858c57f97315953f4d88a2596d4764360df
There is only one instance of these structures, no need for them
to be allocated separately on the heap.
Change-Id: I1333cc92d06bbe21be643c2b2f0e3936f0264cac
This setup is now handled by vp9_build_intra_predictors()
when left_available and/or up_available is zero.
Change-Id: I59cec0ab95f8be69ce885fd20727510e4deef8a0
Iterating over all planes in the loop instead of custom y,uv code inside
handle_inter_mode function.
Change-Id: I301f9276d6d544c2fd7203d84f1318ac80ea625d
Disable the use of scaled reference frame for motion search in
SPLITMV mode. This fixes the unit test failure issue triggered
when merging sb8x8 from experimental list.
Change-Id: I02ac25fd8db8d5762f8fee29513b947189875fa0
This allows removing a large number of transform size specific functions,
as well as supporting 444/alpha by routing all code through the
subsampling-aware path.
Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4
The number of reference buffers is extended to 8 and
a reference sign-bias added for the LAST_FRAME.
Whilst the number of reference buffers used by an
individual frame remains unchanged at 3, these may
now be selected from 8 possible buffers.
Change-Id: I2d247b9c1c2b3a339d6c9fac125e81ba373f75a7
With this, encoder/decoder appear to match with sb8x8 experiment.
Needs some larger-scale testing.
Change-Id: I44d3cac37b3c98264985ed0a0fc763c30089aa64
Creates a common encode (subtract, transform, quantize, optimize,
inverse transform, reconstruct) function for all sb sizes, including
the old 16x16 path.
Change-Id: I964dff1ea7a0a5c378046a069ad83495f54df007
Don't allow i4x4 except for sb8x8 recursion step. Read only 4 (not 16)
i4x4 submodes if we are i4x4.
Change-Id: Iaaaced1a134006b2c96eed66f014300eae41e0ed
This doesn't affect the output, because in previous cases where the
values were uninitialized, this was because the mb_row/col is outside
the codable area, and thus encode_sb will test them for the next
decomposition-level, but return right after that on size-check. All
this does is prevent a warning in valgrind.
Change-Id: I90d8a29e6f8ebb2b0143684e08fe77ae3a0816b1
Work-in-progress, not yet ready for review. TODO items:
- bitstream writing (encoder) and reading (decoder)
- decoder reconstruction
Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081
Changing the order of probabilities inside mb_segment_tree_probs in order
to use treed_read/treed_write function instead of custom code.
Change-Id: I843487d5057913b9358db73da270893eefecc6c8
Moving common code from encoder and decoder to vp9_get_qindex function.
Also moving quant-related constants from vp9_onyxc_int.h to
vp9_quant_common.h.
Change-Id: I70c5bfbaa1c8bf00fde0bfc459d077f88b6d46c8
Separate the decoding process of 4x4 block based coding (both intra
and inter) from decode_mb and move it into decode_atom_. This allows
to further move the rest per 16x16 block decoding of decode_mb into
decode_sb, and hence eventually deprecating decode_mb when SB8X8 is
enabled.
Change-Id: I678cb8007d8a57b792d7a23020edb0c74fbf4237
Separate the functionality of I4X4_PRED from decode_mb. Use
decode_atom_intra instead, to enable recursive partition of superblock
down to 8x8.
Change-Id: Ifc89a3be82225398954169d0a839abdbbfd8ca3b
Updates the tokenizer to use the common block walker used by the
detokenizer, to support non-4:2:0 and more than 3 planes.
Change-Id: If1854117a9c7c1427349209fa2b3051ce6459dcb
This doesn't change output, because the argument isn't actually used
ATM. However, we should fix it for consistency.
Change-Id: I7b7326a8e92c0d411c999ec2c781204b516ed53d
Unify the tokenize_ function and enable configurable block size for
superblock 8x8. We are immigrating the functionalities of
macroblock handles into superblock ones, and eventually will remove
encode_mb and decode_mb. To be continued on detokenize_ module.
Change-Id: I9f81e8c2291082535cf5e0c4b662eb24fb7c8a7f
Output changes slightly because of a minor bug in (at least) the sb32x16
block2above tx16x16 tables that previously existed in vp9_blockd.c.
Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc
Conflicts:
vp9/common/vp9_findnearmv.c
vp9/common/vp9_rtcd_defs.sh
vp9/decoder/vp9_decodframe.c
vp9/decoder/x86/vp9_dequantize_sse2.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_common.mk
Resolve file name changes in favor of master. Resolve rdopt changes in
favor of experimental, preserving the newer experiments.
Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f