Add inv_txfm and highbd_inv_txfm as facades for the inverse transforms
so that the code flow in encodemb.c is simpler.
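A minimal sketch of the facade idea, assuming a single entry point that
dispatches on transform size. All names here (sk_inv_txfm_add, SkTxSize,
sk_add_residual) are hypothetical and are not the signatures used in the
tree; the "inverse transform" is a stand-in that only adds a residual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the libvpx API. */
typedef enum { SK_TX_4X4, SK_TX_8X8, SK_TX_16X16, SK_TX_32X32 } SkTxSize;

/* Stand-in for a real inverse transform: adds an n x n residual to dst,
 * clamping each result to the 8-bit pixel range. */
static void sk_add_residual(const int32_t *res, uint8_t *dst, int stride,
                            int n) {
  for (int r = 0; r < n; ++r) {
    for (int c = 0; c < n; ++c) {
      const int v = dst[r * stride + c] + res[r * n + c];
      dst[r * stride + c] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
  }
}

/* Facade: the caller passes the transform size once and does not need
 * its own switch over the size-specific kernels. */
static void sk_inv_txfm_add(const int32_t *coeff, uint8_t *dst, int stride,
                            SkTxSize tx_size) {
  switch (tx_size) {
    case SK_TX_4X4: sk_add_residual(coeff, dst, stride, 4); break;
    case SK_TX_8X8: sk_add_residual(coeff, dst, stride, 8); break;
    case SK_TX_16X16: sk_add_residual(coeff, dst, stride, 16); break;
    case SK_TX_32X32: sk_add_residual(coeff, dst, stride, 32); break;
    default: assert(0 && "invalid transform size"); break;
  }
}
```

With a wrapper of this shape, a call site reduces to one call such as
sk_inv_txfm_add(coeff, dst, stride, tx_size) in place of a per-size switch.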
Change-Id: Iea45fd22dd8b173f8eb3919ca6502636f7bcfcf7
This patch eliminates the copying of data when using FLIPADST forward
transforms, by incorporating the necessary data flipping into the
load_buffer_* functions of the SSE2 optimized forward transforms. The
load_buffer_* functions are normally inlined, so the overhead of copying
the data is removed and the overhead of flipping is minimized.
Left-to-right flipping is still not free, as the columns need to be
shuffled in registers.
To preserve identity between the C and SSE2 implementations, the
appropriate C implementations now also do the data flipping as part of
the transform, rather than relying on the caller for flipping the input.
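A scalar sketch of folding the flip into the load step, assuming an
n x n block of residuals and two flags. The function name sk_load_buffer
and its parameter order are hypothetical; the SSE2 code does the same
kind of index reversal with register loads and shuffles rather than a
per-element loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: copies an n x n block from input (row pitch
 * `stride`) into the contiguous buffer out, reversing the row order when
 * flipud is set and the column order when fliplr is set. */
static void sk_load_buffer(const int16_t *input, int stride, int n,
                           int flipud, int fliplr, int16_t *out) {
  assert(n > 0 && stride >= n);
  for (int r = 0; r < n; ++r) {
    /* Up-down flip: read the rows from bottom to top. */
    const int src_r = flipud ? n - 1 - r : r;
    for (int c = 0; c < n; ++c) {
      /* Left-right flip: read the columns from right to left. */
      const int src_c = fliplr ? n - 1 - c : c;
      out[r * n + c] = input[src_r * stride + src_c];
    }
  }
}
```

The up-down case only changes which row address is loaded, while the
left-right case touches every element within a row, which matches the
note above that left-to-right flipping is not free.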
Overall encode speedup is about 1.5-2% in my tests. Note that this
covers only the forward transforms; the inverse transforms will come in
a later patch.
There are also a few code hygiene changes:
- Fixed some indents of switch statements.
- DCT_DCT transforms now always use vp10_fht* functions, which dispatch
to vpx_fdct* for DCT_DCT (previously some callers used vpx_fdct*
directly and others used vp10_fht*).
Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
Allows inter and intra tx_types to have different sets of
transforms for different tx_size/sb_type combinations.
Change-Id: Ic0ac1daef7a9fb15c4210271e4d04cd36e5cec8e
This commit fixes the merge conflicts between master and nextgenv2 and
disables early termination in choose_tx_size() to avoid test failures.
The test failures are pre-existing. Some of the issues were already fixed
in masterbase, so another merge will follow to bring in those fixes.
Change-Id: Ib71889661955e73aedbb4db49d8be70425281dcb
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.
See issue 1035.
Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.
See issue 1043.
Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
This commit makes the rate-distortion optimization for the chroma
components support the recursive transform block coding scheme.
Change-Id: I1bfed6d05b0ebb3905cb625222401e2ccbae10f3
This commit makes the transform, quantization, tokenization and
their corresponding inverse operations support the recursive transform
block coding process.
Change-Id: I71f2ef3a7c2d3db7cfc63c1fd3f1337e8e0360b5
Add the row and column indices to the argument list of the unit
functions called by the foreach_transformed_block wrapper. This avoids
repeatedly re-deriving them from the block index inside each function.
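A sketch of the calling pattern, assuming the wrapper already walks the
plane in row/column order and so can hand both indices to the callback.
The names sk_foreach_block, SkVisitFn, SkTrace and sk_record are
hypothetical, and the block index advancing by step * step per visit is
an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical callback type: receives the block index plus the row and
 * column the wrapper already knows, so it never has to recompute them. */
typedef void (*SkVisitFn)(int block, int blk_row, int blk_col, void *arg);

/* Visits a rows x cols grid of units in raster order, in square steps. */
static void sk_foreach_block(int rows, int cols, int step, SkVisitFn visit,
                             void *arg) {
  assert(step > 0);
  int block = 0;
  for (int r = 0; r < rows; r += step) {
    for (int c = 0; c < cols; c += step) {
      visit(block, r, c, arg);
      block += step * step; /* assumed index layout for this sketch */
    }
  }
}

/* Example callback that records what it was given. */
typedef struct {
  int calls;
  int last_block;
  int last_row;
  int last_col;
} SkTrace;

static void sk_record(int block, int blk_row, int blk_col, void *arg) {
  SkTrace *t = (SkTrace *)arg;
  t->calls++;
  t->last_block = block;
  t->last_row = blk_row;
  t->last_col = blk_col;
}
```

Usage would look like sk_foreach_block(4, 4, 2, sk_record, &trace), after
which the callback has seen each (row, column) pair directly.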
Change-Id: I42b3578eac258ebaba7a7c74f684de9abab521a6
Resolved Conflicts in the following files:
configure
vp10/common/idct.c
vp10/encoder/dct.c
vp10/encoder/encodemb.c
vp10/encoder/rdopt.c
Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316
Creates new hybrid transforms combining symmetric DST with
ADST and DCT. Thus a total of 16 transforms are supported.
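The count of 16 can be illustrated as every pairing of a column kernel
with a row kernel. This sketch assumes four 1D kernels per direction
(DCT, ADST, flipped ADST, DST); the message above does not list the set,
and the enum and function names are hypothetical.

```c
#include <assert.h>

/* Assumed set of 1D kernels; labels are illustrative only. */
typedef enum {
  SK_DCT_1D,
  SK_ADST_1D,
  SK_FLIPADST_1D,
  SK_DST_1D,
  SK_TX_1D_COUNT
} SkTx1D;

/* A 2D hybrid transform is a (column kernel, row kernel) pair. */
typedef struct {
  SkTx1D col;
  SkTx1D row;
} SkTx2D;

/* Fills out with every column/row pairing and returns how many there are. */
static int sk_enumerate_tx_types(SkTx2D *out, int capacity) {
  int n = 0;
  for (int col = 0; col < SK_TX_1D_COUNT; ++col) {
    for (int row = 0; row < SK_TX_1D_COUNT; ++row) {
      assert(n < capacity);
      out[n].col = (SkTx1D)col;
      out[n].row = (SkTx1D)row;
      ++n;
    }
  }
  return n;
}
```

Under that assumption, four choices per direction give 4 * 4 = 16
separable 2D transforms.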
derfl: +1.659% (up about 0.2%)
Change-Id: Idde1cecdb59527890bf05da740099c3f6a5b9764
Does not include DST1 yet.
derflr: +1.437 (8-bit internal), +7.243 (12-bit internal)
with --enable-ext-tx
Change-Id: I91f1759fd2de794755eb6384cda52e80e979cb7d
derflr +0.202%
hevclf +0.207%
hevcmr +0.095%
hevchr +0.077%
Tested locally on several derf sequences; speed (encoder + decoder)
is slower by less than 1%.
This is part of the EXT_TX experiment, which will continue to explore
different transform variants.
Change-Id: I05d44994a62106538a9a241ed8d89bd7c5d14761