26 Commits

Author SHA1 Message Date
Yaowu Xu
7e89c102c4 vp9-highbitdepth -> vpx-highbitdepth
Change-Id: I1e90cf7ab4bb02c0ef119b0bd1596771edefedff
2016-08-05 15:41:33 -07:00
Debargha Mukherjee
e5848dea5a Rectangular transforms 4x8 & 8x4
Added a new expt rect-tx to be used in conjunction with ext-tx.
[rect-tx is a temporary config flag and will eventually be
merged into ext-tx once it works correctly with all other
experiments].

Added 4x8 and 8x4 transforms for use initially with rectangular
sub8x8 y blocks as part of this experiment.

There is about a -0.2% BDRATE improvement on lowres, others pending.

When var-tx is on, rectangular transforms are currently not used.
That will be enabled in a subsequent patch.

Change-Id: Iaf3f88ede2740ffe6a0ffb1ef5fc01a16cd0283a
2016-07-21 10:46:41 -07:00
Debargha Mukherjee
1b17559327 Adds 1D transforms for ADST/FlipADST to make 16
Makes a set of 16 transforms total, adding all 1D
combinations of ADST and FlipADST, and removing all DST
transforms.

lowres, midres both improve by about 0.1% and hdres by
-0.378% in BDRATE but with fewer transforms that are also
simpler.

Further experiments to continue later.

Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e
2016-03-21 11:19:36 -07:00
Debargha Mukherjee
9b88762b17 Refactor 1D transforms
In preparation for adding more 1D variants with ADST/FlipADST/etc.

BDRATE actually improves by 0.21% on lowres.

Change-Id: I2fa4720c69fe001fa666119a284dfc6b17fffab2
2016-03-14 22:30:09 -07:00
Jingning Han
a8dc9694a4 Hybrid 1-D/2-D transform coding
This commit enables a hybrid 1-D/2-D transform coding scheme and
the accompanying entropy coding system. It currently uses hybrid
1-D/2-D DCT transform coding. It provides coding performance gains:

lowres_all  0.55%
hdres_all   0.43%

Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78
2016-03-07 09:27:46 -08:00
Debargha Mukherjee
7485498773 Extends ext-tx to support 32x32 masked transforms
Adds new 32x32 masked 1-d transforms that combine 1-D length-16
DCT with length-16 identity transforms.

To be continued in subsequent patches.

Change-Id: I0b4f66492d44c079b3c3b531ba48a97201de1484
2016-02-17 09:31:34 -08:00
Debargha Mukherjee
1badceada8 Code cleanup: remove redundant DST1 code
Removes the USE_DST2 flag that was on by default. DST2 performs
slightly better than DST1 and is faster to compute.

Change-Id: Ifb788f3f0a0e1995d7625230cec144b876f01206
2016-02-16 10:36:02 -08:00
Debargha Mukherjee
49d9730f60 Replace DST1 in ext_tx experiment with DST2
The DST2 is implemented by input alternate sign-flip, followed
by DCT, followed by output reversal.
Results are roughly the same, but it should be easier to optimize
the DST2.
[Interestingly, a matrix multiply implementation is about 0.1%
better].

Change-Id: If9ae5fdba87767fb0e6c163a62b77ee66a8d3afc
2015-12-15 11:30:48 -08:00
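The sign-flip, DCT, output-reversal construction described in the commit above can be checked numerically. The sketch below is not the libvpx code: it assumes unnormalized floating-point DCT-II and DST-II definitions, and the names `dct2`, `dst2_direct`, and `dst2_via_dct` are hypothetical, chosen only for illustration.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * k)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst2_direct(x):
    """Unnormalized DST-II: Y[k] = sum_n x[n] * sin(pi/N * (n + 1/2) * (k + 1))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.sin(math.pi / n_pts * (n + 0.5) * (k + 1)) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst2_via_dct(x):
    """DST-II built from a DCT-II (assumed construction, per the commit text)."""
    # Step 1: flip the sign of every odd-indexed input sample.
    flipped = [-v if n % 2 else v for n, v in enumerate(x)]
    # Step 2: run the DCT. Step 3: reverse the order of the outputs.
    return dct2(flipped)[::-1]


sample = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
# Largest difference between the two paths; expected to be rounding noise only.
max_err = max(abs(a - b) for a, b in zip(dst2_direct(sample), dst2_via_dct(sample)))
print(max_err)
```

The two paths agree because cos(pi*(n + 1/2) - B) = (-1)^n * sin(B), so the alternating input signs cancel and output index N-1-k of the DCT lands on index k of the DST. Any fast DCT can therefore be reused for the DST, which fits the commit's remark that DST2 should be easier to optimize.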
Yaowu Xu
69f4930041 Merge branch 'master' into nextgenv2
Conflicts:
	vp10/common/blockd.h
	vp10/common/entropymode.h
	vp10/common/reconintra.c
	vp10/decoder/decodemv.c
	vp10/encoder/bitstream.c
	vp10/encoder/encoder.h
	vp10/encoder/rd.c
	vp10/encoder/rdopt.c
	vp10/encoder/tokenize.h

Change-Id: Ic4891839b6f0474026d6d69821e38edec9632df1
2015-12-07 11:37:14 -08:00
Angie Chiang
08b157da8e comment out range_check of fdct in dct.c
The range_check is not used because the bit range
in fdct# is not correct. Since we are going to merge in a new version
of fdct# from nextgenv2, we won't fix the incorrect bit range now.

Change-Id: I54f27a6507f27bf475af302b4dbedc71c5385118
2015-12-04 10:54:31 -08:00
Geza Lore
01bb4a318d Eliminate copying for FLIPADST in fwd transforms.
This patch eliminates the copying of data when using FLIPADST forward
transforms, by incorporating the necessary data flipping into the
load_buffer_* functions of the SSE2 optimized forward transforms. The
load_buffer_* functions are normally inlined, so the overhead of copying
the data is removed and the overhead of flipping is minimized. Left to
right flipping is still not free, as the columns need to be shuffled in
registers.

To preserve identity between the C and SSE2 implementations, the
appropriate C implementations now also do the data flipping as part of
the transform, rather than relying on the caller for flipping the input.

Overall speedup is about 1.5-2% in encode on my tests. Note that these
are only the forward transforms. Inverse transforms to come in a later
patch.

There are also a few code hygiene changes:
- Fixed some indents of switch statements.
- DCT_DCT transforms now always use vp10_fht* functions, which dispatch
  to vpx_fdct* for DCT_DCT (some of them used to call vpx_fdct*
  directly, some of them used to call vp10_fht*).

Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
2015-11-03 17:10:55 +00:00
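The idea of folding the flip into the load step, instead of first copying the block into a flipped temporary, can be illustrated with plain index arithmetic. This is a Python sketch and not the SSE2 code: `load_buffer`, `flip_then_load`, and the `flipud`/`fliplr` parameter names are assumptions made for illustration, and the block is modeled as a flat list with a row stride.

```python
def load_buffer(src, stride, size, flipud=False, fliplr=False):
    """Read a size x size block from a flat buffer, flipping while loading."""
    rows = []
    for r in range(size):
        # Up/down flip: read the source rows in reverse order.
        src_row = size - 1 - r if flipud else r
        start = src_row * stride
        row = src[start:start + size]
        # Left/right flip: reverse the columns of the row just loaded.
        rows.append(row[::-1] if fliplr else row)
    return rows


def flip_then_load(src, stride, size, flipud=False, fliplr=False):
    """Reference path: build a flipped copy first, then load it unflipped."""
    block = [src[r * stride:r * stride + size] for r in range(size)]
    if flipud:
        block = block[::-1]
    if fliplr:
        block = [row[::-1] for row in block]
    flat_copy = [v for row in block for v in row]
    return load_buffer(flat_copy, size, size)


# 4 rows with stride 6; only the first 4 columns of each row belong to the block.
pixels = list(range(4 * 6))
same = all(
    load_buffer(pixels, 6, 4, ud, lr) == flip_then_load(pixels, 6, 4, ud, lr)
    for ud in (False, True)
    for lr in (False, True)
)
print(same)
```

In this model the up/down flip costs nothing beyond a different row index, while the left/right flip still reorders elements within each row. That mirrors the commit's note that left-to-right flipping is not free because the columns have to be shuffled in registers.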
Geza Lore
2b39bcec29 Fix transform tables in C implementations.
These tables were out of sync with the indexing enum since the
refactoring in commit 4f16f119 (change 303389), due to the removal
of the ext_tx_to_txtype lookup table. This patch just puts them
back in order.

Change-Id: Ieb7d57654f61b99b511d54c9ba09abbd5e8d0d14
2015-11-03 17:10:51 +00:00
Jingning Han
6a9ed8d2b6 Fix forward transform bit range limits
Change-Id: I13c0ecff8c58a0571d9de4bc5fbbebe72533ccdb
2015-10-15 09:09:44 -07:00
Debargha Mukherjee
3e8cceb3fc Speed up of DST and the search in ext_tx
Adds an early termination to the ext_tx search, and also
implements the DST transforms more efficiently.

About 4 times faster with the ext-tx experiment.

There is a 0.09% drop in performance on derflr from 1.735% to
1.648%, but worth it with the speedup achieved.

Change-Id: I2ede9d69c557f25e0a76cd5d701cc0e36e825c7c
2015-09-29 19:11:43 -07:00
Yaowu Xu
7c514e2dfd Merged branch 'master' into nextgenv2
Resolved Conflicts in the following files:
        configure
        vp10/common/idct.c
        vp10/encoder/dct.c
        vp10/encoder/encodemb.c
        vp10/encoder/rdopt.c

Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316
2015-09-29 16:17:32 -07:00
Angie Chiang
6a382101dd comment out fdct32
comment out fdct32
remove fdct32 test

Change-Id: I31c47fb435377465cd3265e39621ca50d3aae656
2015-09-25 18:18:27 -07:00
James Zern
e7c8b71a86 Revert "remove static from fdct4/8/16/32"
This reverts commit 8903b9fa8345726efbe9b92a759c98cc21c4c14b.

there is no reason for these to be global

Change-Id: I66a31c06f8426aeca348ef12d9b9ab59d6d5e55d
2015-09-23 17:45:57 -07:00
Angie Chiang
8903b9fa83 remove static from fdct4/8/16/32
remove static from fdct4/8/16/32 in vp10/encoder/dct.c
add prefix vp10_ to fdct4/8/16/32
add vp10/encoder/dct.h

Change-Id: I644827a191c1a7761850ec0b1da705638b618c66
2015-09-21 11:49:10 -07:00
Debargha Mukherjee
09ff5f2792 Merge remote-tracking branch 'origin/master' into nextgenv2
Periodic merge to get master changes into nextgenv2.

Change-Id: I6f0e4b470f193da03f1a8cb8e6a93ae39395699a
2015-09-17 16:33:18 -07:00
Debargha Mukherjee
b8bc026c72 Misc. ext_tx fixes/enhancements
derflr: +1.732% (8-bit)

Change-Id: I9c04c8249646ff96eacacfa1dcb0bd118c04e84a
2015-09-15 10:00:54 -07:00
Angie Chiang
fe776ce61f add range_check for fdct in vp10
Unify the style of fdct4() fdct8() fdct16()
Add fdct32()
Add range_check() at each stage
Add unit test at ../../test/vp10_dct_test.cc

Change-Id: I13f76d9046c3ea473c82024b09a5bc8662e2c28e
2015-09-12 03:26:09 +00:00
Debargha Mukherjee
4ce81d666e Comprehensive support for symmetric DST
Creates new hybrid transforms combining symmetric DST with
ADST and DCT. Thus a total of 16 transforms are supported.

derfl: +1.659% (up about 0.2%)

Change-Id: Idde1cecdb59527890bf05da740099c3f6a5b9764
2015-09-10 11:13:59 -07:00
James Zern
43a4900ea3 Revert "add range_check for fdct in vp10"
Tests fail to build.

This reverts commit f78d6aa77245cea5cd7f630a20f4e6d576679e7f.

Change-Id: Ia220270517ded273c65a7ab965d82edb696663c9
2015-09-03 00:23:16 +00:00
Angie Chiang
f78d6aa772 add range_check for fdct in vp10
Unify the style of fdct4() fdct8() fdct16()
Add fdct32()
Add range_check() at each stage
Add unit test at ../../test/vp10_dct_test.cc

Change-Id: I9e912b2c5683862e65c5a21abc3e1c260cca4576
2015-09-02 13:50:17 -07:00
Jingning Han
3acfe46e8d Sync vp10 with vpx_ports/system_state.h
Change-Id: Ic5004f8bdc1c2b025b598e80374ee1f286ea95ee
2015-08-12 09:21:25 -07:00
Jingning Han
54d66ef165 Remove vp9_ prefix from vp10 files
Remove the vp9_ prefix from vp10 file names.

Change-Id: I513a211b286a57d6126fc1b0fbfd6405120014f1
2015-08-11 21:24:08 -07:00