81 Commits

Author SHA1 Message Date
Angie Chiang
a245d9f88c Add facade to inverse txfm
Add inv_txfm and highbd_inv_txfm as facades of inverse transform
such that the code flow in encodemb.c can be simpler

Change-Id: Iea45fd22dd8b173f8eb3919ca6502636f7bcfcf7
2015-11-25 13:50:40 -08:00
Angie Chiang
96baa73ed9 Create hybrid_fwd_txfm.c
Move txfm functions from encodemb to hybrid_twd_txfm.c
to make encodemb's code flow clear

Change-Id: If174d8ddb490d149c103e5127d30ef19adfbed13
2015-11-25 12:51:25 -08:00
Angie Chiang
30e325a94b merge txfm_#x#_1 into txfm_#x#
Change-Id: I9f539491fe676898246976c91d5ac4804a155803
2015-11-24 18:21:27 -08:00
Geza Lore
01bb4a318d Eliminate copying for FLIPADST in fwd transforms.
This patch eliminates the copying of data when using FLIPADST forward
transforms, by incorporating the necessary data flipping into the
load_buffer_* functions of the SSE2 optimized forward transforms. The
load_buffer_* functions are normally inlined, so the overhead of copying
the data is removed and the overhead of flipping is minimized. Left to
right flipping is still not free, as the columns need to be shuffled in
registers.

To preserve identity between the C and SSE2 implementations, the
appropriate C implementations now also do the data flipping as part of
the transform, rather than relying on the caller for flipping the input.

Overall speedup is about 1.5-2% in encode on my tests. Note that these
are only the forward transforms. Inverse transforms to come in a later
patch.

There are also a few code hygiene changes:
- Fixed some indents of switch statements.
- DCT_DCT transform now always use vp10_fht* functions, which dispatch
  to vpx_fdct* for DCT_DCT (some of them used to call vpx_fdct*
  directly, some of them used to call vp10_fht*).

Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
2015-11-03 17:10:55 +00:00
Jingning Han
bfeac5e19c Support per transform block skip coding
Allow the encoder to drop individual transform block coding.

Change-Id: I2c2b2985254cb92baf891f03daa33f067279373b
2015-10-30 08:55:17 -07:00
Debargha Mukherjee
8a4292441f Refactoring tx-types to add more flexibility
Allows inter and intra tx_types to have different sets of
transforms for different tx_size/sb_type combinations.

Change-Id: Ic0ac1daef7a9fb15c4210271e4d04cd36e5cec8e
2015-10-28 23:31:32 -07:00
Debargha Mukherjee
f1c4b79d72 Build fix for ext-tx
Change-Id: Ifab43f85f6ae1be6b9f95521f79ba49055353b5f
2015-10-23 21:38:50 +00:00
Yaowu Xu
5a27b3bb85 Fix merge defects
This commit fixes the merge conflicts between master and nextgenv2 and
disable early termination in choose_tx_size() to avoid failure in test.

The test failures are pre-existing, some of the issue were fixed in
masterbase already, so will have another merge to introduce the fixes.

Change-Id: Ib71889661955e73aedbb4db49d8be70425281dcb
2015-10-22 18:25:41 -07:00
Yaowu Xu
4ac2ae3a4d Merge branch 'masterbase' into nextgenv2
Conflicts:
	configure
	test/vp9_encoder_parms_get_to_decoder.cc
	vp10/common/blockd.h
	vp10/common/entropymode.c
	vp10/common/entropymode.h
	vp10/common/idct.c
	vp10/decoder/decodeframe.c
	vp10/decoder/decodemv.c
	vp10/encoder/bitstream.c
	vp10/encoder/encodeframe.c
	vp10/encoder/encodemb.c
	vp10/encoder/encoder.c
	vp10/encoder/encoder.h
	vp10/encoder/rd.c
	vp10/encoder/rdopt.c
	vp10/encoder/tokenize.c
	vp10/encoder/tokenize.h
	vp9/decoder/vp9_decodeframe.c
	vp9/decoder/vp9_decoder.h
	vp9/encoder/vp9_aq_cyclicrefresh.c
	vp9/encoder/vp9_encoder.h
	vp9/vp9_cx_iface.c
	vpx/vp8cx.h
	vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c
	vpx_scale/yv12config.h

Change-Id: I604a329d38badec7a11e8ede16ca1404476e9b93
2015-10-22 11:40:44 -07:00
Ronald S. Bultje
60c58b5284 vp10: per-segment lossless coding.
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.

See issue 1035.

Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
c7dc1d78bf vp10: add extended-intra prediction edges experiment.
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.

See issue 1043.

Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
2015-10-16 19:30:39 -04:00
Jingning Han
2cdc12742d Rate-distortion optimization for recursive transform block coding
This commit enables the rate-distortion optimization for recursive
transform block coding scheme.

Change-Id: Id6a8336ca847bb3af1e94cbfb51db1f4da12d38f
2015-10-13 12:49:03 -07:00
Jingning Han
a8dad55c82 Make chroma component RD estimate support transform partition
This commit makes the rate-distortion optimization for chroma
component support the recursive transform block coding scheme.

Change-Id: I1bfed6d05b0ebb3905cb625222401e2ccbae10f3
2015-10-08 18:04:03 -07:00
Jingning Han
704985e65a Add decoder support to recursive transform block partition
This commit allows the decoder to recursively parse and rebuild
the pixel blocks.

Change-Id: I510f3a30ae7cdad5b70725c66882b00a0594e96f
2015-10-08 16:16:41 -07:00
Jingning Han
52bb9dd45c Make tokenization process support recursive transform block coding
This commit makes the transform, quantization, tokenization and
their corresponding inverse operations support recursive transform
block coding process.

Change-Id: I71f2ef3a7c2d3db7cfc63c1fd3f1337e8e0360b5
2015-10-08 08:46:02 -07:00
Jingning Han
ebc48efe37 Use explicit block position in foreach_transformed_block
Add the row and column index to the argument list of unit functions
called by foreach_transformed_block wrapper. This avoids the
repeated internal parsing according to the block index.

Change-Id: I42b3578eac258ebaba7a7c74f684de9abab521a6
2015-10-07 16:32:19 -07:00
Jingning Han
00ca5c1c98 Simplify vp10_xform_quant index parsing
Change-Id: Id7f7a9b2e53fc0074b55d58143f296afad6b844e
2015-10-06 17:19:23 -07:00
hui su
2afe7320c8 Add identity transform to ext-tx experiment
ext-tx on derflr: +1.756% (was +1.648)

Change-Id: I8a87970fa589e8f5f96db7aa68ec9b6c98e20188
2015-09-30 18:47:46 -07:00
Yaowu Xu
7c514e2dfd Merged branch 'master' into nextgenv2
Resolved Conflicts in the following files:
        configure
        vp10/common/idct.c
        vp10/encoder/dct.c
        vp10/encoder/encodemb.c
        vp10/encoder/rdopt.c

Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316
2015-09-29 16:17:32 -07:00
Ronald S. Bultje
bab8d38f7f vp10: remove MACROBLOCK.{highbd_,}itxfm_add function pointer.
This is preparatory work for allowing per-segment lossless coding.

See issue 1035.

Change-Id: I9487d02717ee3e766aee61a487780056bb35d2d3
2015-09-25 19:30:46 -04:00
Ronald S. Bultje
c74b33a413 vp10: remove MACROBLOCK.fwd_txm4x4 function pointer.
This is preparatory work for allowing per-segment lossless coding.

See issue 1035.

Change-Id: Idd72e2a42d90fa7319c10122032d1a7c7a54dc05
2015-09-25 19:30:46 -04:00
Debargha Mukherjee
09ff5f2792 Merge remote-tracking branch 'origin/master' into nextgenv2
Periodic merge to get master changes into nextgenv2.

Change-Id: I6f0e4b470f193da03f1a8cb8e6a93ae39395699a
2015-09-17 16:33:18 -07:00
Jingning Han
481b834842 Fix vp10 high bit-depth build
Change-Id: Ie3daed0b282b43ef81d2f8797ac1f6e8bde7d65e
2015-09-11 08:56:29 -07:00
Jingning Han
f137697c32 Take out skip_encode speed feature in vp10
Change-Id: Ic39d4523e78863c816b0fc85f56ea5ae5e0b3310
2015-09-10 12:45:39 -07:00
Debargha Mukherjee
4ce81d666e Comprehensive support for symmetric DST
Creates new hybrid transforms combining symmetric DST with
ADST and DCT. Thus a total of 16 transforms are supported.

derfl: +1.659% (up about 0.2%)

Change-Id: Idde1cecdb59527890bf05da740099c3f6a5b9764
2015-09-10 11:13:59 -07:00
Debargha Mukherjee
9fc691efbe Backport EXT_TX experiment from nextgen
Does not include DST1 yet.

derflr: +1.437 (8-bit internal), +7.243 (12-bit internal)
with --enable-ext-tx

Change-Id: I91f1759fd2de794755eb6384cda52e80e979cb7d
2015-09-09 09:42:51 -07:00
hui su
b3cc3a07b0 Enable ADST for UV channel
derflr +0.202%
hevclf +0.207%
hevcmr +0.095%
hevchr +0.077%

Tested locally on several derf sequences, speed (encoder + decoder)
is slower by less than 1%.

It is part of the EXT_TX experiment, which is to be continued to
explore different transform variants.

Change-Id: I05d44994a62106538a9a241ed8d89bd7c5d14761
2015-08-26 13:25:30 -07:00
hui su
d76e5b3652 Refactoring on transform types
Prepare for adding more transform varieties (EXT_TX and TX_SKIP in nextgen).

Change-Id: I2dfe024f6be7a92078775917092ed62abc2e7d1e
2015-08-24 10:47:25 -07:00
hui su
5eed74e1d3 Refactor get_tx_type and get_scan
This makes it easier to add new transform types and scan orders
to VP10 in the future.

Change-Id: I94874ddc9b19928d7820d57e94e2af04adf51efe
2015-08-21 09:53:37 -07:00
Jingning Han
3acfe46e8d Sync vp10 with vpx_ports/system_state.h
Change-Id: Ic5004f8bdc1c2b025b598e80374ee1f286ea95ee
2015-08-12 09:21:25 -07:00
Jingning Han
54d66ef165 Remove vp9_ prefix from vp10 files
Remove the vp9_ prefix from vp10 file names.

Change-Id: I513a211b286a57d6126fc1b0fbfd6405120014f1
2015-08-11 21:24:08 -07:00