12269 Commits

Author SHA1 Message Date
Julia Robson
3d9133b2a5 SSE2 optim of vp9_subtract_block for 128x128 units
Extending the SSE2 implementation of vp9_subtract_block to work
with the 128x128 coding unit experiment

Change-Id: Ib3cc16bf5801ef2c7eecc19d3cc07a8c50631580
2015-11-13 11:12:56 -08:00
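For context, a subtract-block routine computes the residual between source and prediction pixels. Below is a minimal scalar sketch of that operation. The function name and parameter order are illustrative assumptions, not copied from the library. An SSE2 version would compute the same result several pixels at a time, and a 128x128 coding unit only changes the largest rows/cols values the routine has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference sketch: diff = src - pred for a rows x cols block.
 * Name and signature are hypothetical, chosen for illustration only. */
void subtract_block_ref(int rows, int cols,
                        int16_t *diff, ptrdiff_t diff_stride,
                        const uint8_t *src, ptrdiff_t src_stride,
                        const uint8_t *pred, ptrdiff_t pred_stride) {
  assert(rows > 0 && cols > 0);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      /* 8-bit pixels promote to int, so the difference can be negative. */
      diff[c] = (int16_t)(src[c] - pred[c]);
    }
    /* Advance each pointer by its own stride to reach the next row. */
    diff += diff_stride;
    src += src_stride;
    pred += pred_stride;
  }
}
```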
Debargha Mukherjee
3436acb347 Merge "Adding encoder support for 128x128 coding units" into nextgen 2015-11-13 18:52:09 +00:00
Debargha Mukherjee
a542190830 Merge "Eliminate copying for FLIPADST in fwd transforms." into nextgen 2015-11-13 18:23:49 +00:00
Geza Lore
177ad11981 Eliminate copying for FLIPADST in fwd transforms.
This is a port of 01bb4a318dc0f9069264b7fd5641bc3014f47f32

This commit also fixes a bug where FLIPADST transforms, when combined
with a DST (that is, FLIPADST_DST and DST_FLIPADST), did not actually do
a flipped transform but a straight ADST instead. This was because the C
implementation they fell back on did not implement flipping. This is
now fixed as well, and FLIPADST_DST and DST_FLIPADST do what they are
supposed to do.

Change-Id: I89c67ca1d5e06808a1567c51e7d6bec4998182bd
2015-11-13 09:34:26 -08:00
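One general way to avoid copying the input into a flipped buffer is to read it in reverse order through a negative stride. The sketch below illustrates that idea only and is not necessarily how this commit does it. `toy_tx4` is a made-up asymmetric 1-D kernel standing in for a real ADST, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy asymmetric 4-point kernel (NOT a real ADST):
 * out[k] = sum over n of (n + 1) * (k + 1) * in[n * stride]. */
void toy_tx4(const int *in, ptrdiff_t stride, int *out) {
  assert(in != NULL && out != NULL);
  for (int k = 0; k < 4; ++k) {
    int acc = 0;
    for (int n = 0; n < 4; ++n) {
      acc += (n + 1) * (k + 1) * in[n * stride];
    }
    out[k] = acc;
  }
}

/* Flipped variant: start at the last sample and walk backwards,
 * so no reversed copy of the input is needed. */
void toy_fliptx4(const int *in, int *out) {
  toy_tx4(in + 3, -1, out);
}
```

Calling `toy_fliptx4(in, out)` gives the same result as reversing `in` into a temporary array and calling `toy_tx4(tmp, 1, out)`.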
Debargha Mukherjee
59de0c0bc7 Adding encoder support for 128x128 coding units
Changes to allow the encoder to make use of 128x128 coding units.

Change-Id: I340bd38f9d9750cb6346d83885efb00443852910
2015-11-13 09:21:22 -08:00
Debargha Mukherjee
9d9962aec9 Some fixes on context size for the 128x128 expt
Change-Id: I56f050502e3a750ce74b196d033b780218df2c1f
2015-11-13 07:06:19 -08:00
Johann Koenig
a0c13e6757 Merge "Cherry pick the rest of 661802, the important part" into nextgen 2015-11-13 00:47:05 +00:00
Julia Robson
d90a3265f0 Changes to use defined constants rather than hard-coded numbers
Also fixes a valgrind error when optimizations are disabled.
Done in preparation for the work on the extended coding unit size
experiment.

Change-Id: Ib074c5a02c94ebed7dd61ff0465d26fa89834545
2015-11-12 15:42:32 -08:00
Johann
a877c6e9a6 Cherry pick the rest of 661802, the important part
Change-Id: I85f1d2c07b89c874ea6c30df32dda9ecaa8d2c3f
2015-11-12 15:41:24 -08:00
Debargha Mukherjee
bd7a34d5a3 Merge "Fixing issue with calculation of block_idx" into nextgen 2015-11-12 23:39:35 +00:00
Johann
26272dd366 Cherry pick 661802 to fix 64bit arm build
Remove default cortex-a8 tuning.

Probably not even the dominant platform the library is being built for.
Add --cpu= option description to help. The option already exists.

Don't allow passing just --cpu as a no-op.

BUG=826

Change-Id: Iaa3f4f693ec78b18927b159b480daafeba0549c0
2015-11-12 14:51:53 -08:00
Julia Robson
598a11d04a Fixing issue with calculation of block_idx
For tall rectangular blocks, the block_idx of the lower transform
block was being mis-calculated.

Does not affect results the way this function is being used now.

Change-Id: I470464d19be0bf0f42003d0cc29793bc42db8f52
2015-11-12 08:21:53 -08:00
Julia Robson
84a5403bab Added decoder support for 128x128 coding units
Change-Id: Icf3c6b64caaf8a86cd27231aa27baf5fd99c0fde
2015-11-02 16:03:30 +00:00
Julia Robson
2a1f8c74aa Changes to use defined constants rather than hard-coded numbers
These changes have been made in preparation for the work on the
extended coding unit size experiment.

Change-Id: I83f289812426bb9aba6c4a5fedd2b0e0a4fe17cb
2015-11-02 16:02:55 +00:00
Sarah Parker
42abac9374 Fixed final style issues
WIP.

Change-Id: Iafcbcfdc2139e77eb2c6849a52a9dc94ea498d66
2015-10-27 18:39:54 -07:00
Debargha Mukherjee
9ef4023569 Remove some unused variables
Change-Id: I3ab263b4c42cc3bfd598a1fc280fbaffba2d4461
2015-10-15 12:50:15 -07:00
Julia Robson
dff4e683fd Added extended coding unit size experiment
Change-Id: I45e2efe22c8e2f23e3305d00906bc08229a85c17
2015-10-07 16:54:10 +01:00
Debargha Mukherjee
9fc51184b7 Merge "Speed up of wedge search" into nextgen 2015-10-06 00:33:50 +00:00
Debargha Mukherjee
597204298a Speed up of wedge search
Speeds up wedge search by pre-calculating single predictions
before computing the wedge combination.

About 20% speed up achieved.

Change-Id: I72f76568559d1899c8ac0afaa133d433ba388e6d
2015-10-04 23:43:25 -07:00
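The idea of pre-calculating the single predictions can be sketched as follows: the two predictors are built once, outside the loop, and each candidate wedge then costs only a blend plus an error measurement. This is a minimal sketch under stated assumptions. The function name, the 0..64 mask weight range, and the use of sum of squared errors as the cost are all hypothetical and not taken from the commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns the index of the mask whose blend of pred0/pred1 is closest to src.
 * pred0 and pred1 are assumed to be computed once by the caller.
 * Mask weights are assumed to lie in [0, 64] (hypothetical convention). */
int best_wedge(const uint8_t *src, const uint8_t *pred0, const uint8_t *pred1,
               const uint8_t *const *masks, int num_masks, int n) {
  assert(num_masks > 0 && n > 0);
  int best = 0;
  long best_sse = LONG_MAX;
  for (int w = 0; w < num_masks; ++w) {
    long sse = 0;
    for (int i = 0; i < n; ++i) {
      const int m = masks[w][i];
      /* Weighted blend of the two precomputed predictions, with rounding. */
      const int p = (m * pred0[i] + (64 - m) * pred1[i] + 32) >> 6;
      const int d = src[i] - p;
      sse += (long)d * d;
    }
    if (sse < best_sse) {
      best_sse = sse;
      best = w;
    }
  }
  return best;
}
```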
Debargha Mukherjee
ff6a66a0b2 Merge "tx64 prob tweaks." into nextgen 2015-10-03 17:59:54 +00:00
Debargha Mukherjee
b80a04bd63 tx64 prob tweaks.
A little improvement in results on hevchd

Change-Id: Ib71a57c4bec34bf688e1d53dbf73eb4525e7805b
2015-10-02 11:44:02 -07:00
Debargha Mukherjee
ff9aa146cb Reimplementation of dst1 for speed
Encoder with --enable-ext-tx --enable-dst1 is now 4 times faster.

Change-Id: Ia750ad3516698ce94da4ceb566b1c51539537a95
2015-10-02 11:06:55 -07:00
Debargha Mukherjee
4f371ac4f2 Merge "64x64 idct fix." into nextgen 2015-09-25 23:36:42 +00:00
Zoe Liu
90dede3f5b Merge "Merged LAST2 and LAST3 to one experiment MULTI_REF" into nextgen 2015-09-25 16:47:28 +00:00
Debargha Mukherjee
5a25d567d7 64x64 idct fix.
Change-Id: If0e0cd7cbe71e9586657f5e8ffa87dcdebc686ba
2015-09-25 05:54:23 -07:00
Debargha Mukherjee
1e2cbf515e Merge "tx64x64 experiment fix for high-bitdepth" into nextgen 2015-09-25 03:22:54 +00:00
Zoe Liu
e8e58402f0 Merged LAST2 and LAST3 to one experiment MULTI_REF
Change-Id: I220be17af317520dccb62fa6b19da5c7ce10652d
2015-09-24 17:08:15 -07:00
Zoe Liu
a9900eb2b1 Merge "Added another reference frame LAST4_FRAME" into nextgen 2015-09-24 15:58:22 +00:00
Debargha Mukherjee
cdcebfba29 tx64x64 experiment fix for high-bitdepth
Change-Id: Ia8d769b43bad0f9ad0684ecf6925e580339c7397
2015-09-24 05:45:03 -07:00
Zoe Liu
829dbf7a79 Added another reference frame LAST4_FRAME
Under the experiment of CONFIG_LAST4_REF. On derflr testset, using
highbitdepth (HBD), in average PSNR,

(1) LAST2+LAST3+LAST4 obtained +0.361% against LAST2+LAST3;
(2) LAST2+LAST3+LAST4 obtained +1.567% against baseline.

Change-Id: Ic8b14272de6a569df2b54418fa72b505e1ed3aad
2015-09-23 17:10:44 -07:00
hui su
71d0af90f8 Adjust rd calculation in choose_tx_size_from_rd
Change-Id: I3649f28196a87663b116b9fe6446b1fbe6eeab4a
2015-09-23 14:37:45 -07:00
Zoe Liu
411c490bc3 Improved LAST3's single ref prob context design a little
On derflr testset, using 12-bit HighBitDepth mode, this CL obtained a
small gain of +0.031% by turning on LAST2+LAST3.

Change-Id: Ib6c9d595e56269634bf29d684eabcd806fc08cc9
2015-09-21 15:03:43 -07:00
Zoe Liu
9144967ca8 Fixed a couple of bugs for LAST3
Change-Id: I63126a844c255df4a447aac7f630ba54cc7d7d7a
2015-09-21 11:50:35 -07:00
Zoe Liu
c0889b9a8c Added a 3rd reference frame LAST3_FRAME
Under experiment CONFIG_LAST3_REF, which can only be turned on when
the experiment of CONFIG_MULTI_REF is on, i.e. LAST3_FRAME can only
be used when LAST2_FRAME is used. CONFIG_LAST3_REF would most likely
be combined with CONFIG_MULTI_REF once the performance improvement
is further confirmed.

On the testset of derflr, using Average PSNR metrics, with HighBitDepth
(HBD) on:

(1) LAST2 HBD obtained +0.579% against base HBD;
(2) LAST2 + LAST3 HBD obtained +0.591% against LAST2 HBD;
(3) LAST2 + LAST3 HBD obtained +1.173% against base HBD.

Change-Id: I1aa2b2e2d2c9834e5f8e61bf2d8818c7b1516669
2015-09-18 15:25:46 -07:00
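The dependency described above (LAST3_FRAME is only usable when LAST2_FRAME is) could be enforced at compile time along these lines. This is a sketch only: it assumes the experiment flags are 0/1 macros, the default values below are placeholders, and `num_last_refs` is a hypothetical helper. The real check may live elsewhere, for example in the configure script.

```c
#include <assert.h>

/* Placeholder defaults; a real build would take these from its config header. */
#ifndef CONFIG_MULTI_REF
#define CONFIG_MULTI_REF 1
#endif
#ifndef CONFIG_LAST3_REF
#define CONFIG_LAST3_REF 1
#endif

/* Refuse to build if LAST3 is requested without LAST2. */
#if CONFIG_LAST3_REF && !CONFIG_MULTI_REF
#error "CONFIG_LAST3_REF requires CONFIG_MULTI_REF"
#endif

/* Hypothetical helper: number of LAST-type reference frames available
 * (LAST, plus LAST2 and LAST3 when their experiments are enabled). */
int num_last_refs(void) {
  const int n = 1 + CONFIG_MULTI_REF + CONFIG_LAST3_REF;
  assert(n >= 1 && n <= 3);
  return n;
}
```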
Debargha Mukherjee
b6d5b0838b ext-tx extension to intra blocks
derflr: improves to 1.692%

Change-Id: Idf583216b40ed02526b9b39031eaf2fb72fed11d
2015-09-17 14:21:24 -07:00
Debargha Mukherjee
c104998b61 Merge "Redo DST1 in the ext-tx experiment" into nextgen 2015-09-16 18:38:03 +00:00
Debargha Mukherjee
4dbaf9a5ab Redo DST1 in the ext-tx experiment
Moved from nextgenv2 branch to test with other experiments.

derflr: +1.629%

Change-Id: Ie7c720053ed8b628177679c4351bb31b54716a71
2015-09-16 09:46:13 -07:00
Zoe Liu
9b0635fc75 Fixed a bug on the number of MAX_MODES in baseline
All the MAX_MODES values had been changed assuming
CONFIG_MULTI_REF. Correct values are now used both with and
without the MULTI_REF experiment enabled.

Change-Id: I70ffe2f1a89fa572d612dd3d311d3af19fe3a632
2015-09-15 14:12:12 -07:00
Zoe Liu
f48a159430 Added more LAST2 modes for INTERINTRA
Turning on all the other experiments, compared the RD performance
between with and without the use of LAST2_FRAME, on derflr testset,
on Average PSNR:

8-bit: +0.653% (All positive except one,
max: mobile_cif: 2.019%; min: paris_cif: -0.081%)
12-bit HBD: +0.735% (All positive,
max: bridge_far_cif: 2.416%; min: bowing_cif: 0.132%)

Change-Id: Ia0a375667e228c8ba3d2e223abff608206f2f545
2015-09-15 11:57:18 -07:00
Zoe Liu
ec8864a8bf Added MACRO for reference frame encoding
This CL introduces a few macros plus code cleaning on the encoding of
the reference frames. Coding performance remains unchanged.

For the encoding of either the compound reference or the single reference
case, since each bit has different contexts, the tree structure may not
be applied to treat the combined bits as one symbol. It is possible we may
explore the sharing of the same context for all the bits to introduce
the use of tree structure for the next step.

Change-Id: I6916ae53c66be1a0b23e6273811c0139515484df
2015-09-11 14:57:31 -07:00
Zoe Liu
897192be43 Added one more reference frame LAST2_FRAME
Under the experiment CONFIG_MULTI_REF. Current version shows
LAST2 vs base in nextgen on the testset of derflr:

(1) 8-bit: Average PSNR +0.53%
(worst: students_cif: -0.247%; best: mobile_cif: 1.902%)
(2) 12-bit HBD: Average PSNR +0.63%
(worst: pamphlet_cif: -0.213%, best: mobile_cif: 2.101%)

More tuning of the reference frame context design and default
probs is being conducted. This version is not guaranteed to
work with other experiments in nextgen. A separate CL will address
compatibility with all other experiments.

Change-Id: I7f40d2522517afc26ca389c995bad56989587f65
2015-09-09 14:27:05 -07:00
Shunyao Li
2de18d1fd2 Super resolution mode (+CONFIG_SR_MODE)
CONFIG_SR_MODE=1, enable SR mode
USE_POST_F=1, enable SR post filter
SR_USE_MULTI_F=1, enable SR post filter family
Not compatible with other experiments yet

Change-Id: I116f1d898cc2ff7dd114d7379664304907afe0ec
2015-08-31 15:29:39 -07:00
Shunyao Li
c7d886d96a Add transform size rate for intra skip mode in rdopt
Change-Id: I81fedd99cd39c12b66b93b786cb43234c867b84b
2015-08-31 11:28:27 -07:00
Debargha Mukherjee
9c685602d0 Make tests work with new configurations
Disables some test vector tests when Vp8/Vp9 decoders are disabled
in configuration. Also moves some macros to the vpx level in
line with recent refactoring on the master branch.

Change-Id: Iaac8008992110398ae096c36b9726f723164c207
2015-08-27 14:05:59 -07:00
Debargha Mukherjee
4525961d80 Some tweaks to probabilities for ext-tx with dst1
derflr: up to 1.429% from a little less than 1.3% when
--enable-dst1 is also enabled with --enable-ext-tx.

Change-Id: I301229f2239b18acb96accc4fc44b64fa6927ace
2015-08-12 14:31:31 -07:00
hui su
d5eaca7fee code cleanup in encode_block_intra
Change-Id: I376b7e9b243178d79141a96e0aeafcbc15758e97
2015-07-30 15:28:51 -07:00
Cheng Chen
cc4d523d9f Resolve bug of DST1 in ext_tx experiment.
Change-Id: I828569e3596f9b9e8487aec7c4056e66cf1fc1f2
2015-07-28 10:58:14 -07:00
Debargha Mukherjee
23690fc5d1 Adds support for DST1 transforms for inter blocks
Adds an additional transform in the ext_tx experiment that
is a 2d DST1-DST1 combination.

To enable use --enable-ext-tx --enable-dst1.

This needs to be later extended to combine DST1 with DCT
or ADST.

Change-Id: I6d29f1b778ef8294bcfb6a512a78fc5eda20723b
2015-07-24 16:23:09 -07:00
Shunyao Li
188087202c Speed up of supertx
Limited the prediction extension to 8 pixels at each edge
Fixed a bug in the combination of wedge prediction and supertx

~10% speed up in decoder
derflr:     -0.004
derflr+hbd: +0.002
hevcmr:     +0.015

Change-Id: I777518896894a612c9704d3de0e7902bf498b0ea
2015-07-24 11:19:19 -07:00
Debargha Mukherjee
4b57a8b356 Add extended transforms for 32x32 and 64x64
Framework for alternate transforms for inter 32x32 and larger based
on dwt-dct hybrid is implemented.
Further experiments are to be conducted with different
variations of hybrid dct/dwt or plain dwt, as well as super-resolution
mode.

Change-Id: I9a2bf49ba317e7668002cf1499211d7da6fa14ad
2015-07-23 18:01:22 -07:00