8 Commits

Author SHA1 Message Date
Yaowu Xu
7e89c102c4 vp9-highbitdepth -> vpx-highbitdepth
Change-Id: I1e90cf7ab4bb02c0ef119b0bd1596771edefedff
2016-08-05 15:41:33 -07:00
Angie Chiang
6f28581b26 Turn on flip in inverse txfm2d
Fix build failure
Reduce txfm test time

Change-Id: Ieaf6b27f3a272d06286f817f01230413fa8adcf6
2016-05-18 11:26:57 -07:00
Yi Luo
1d307368a9 Integrate HBD row/column flip fwd txfm SSE4.1 optimization
- Integrate 5 flip transform types for each 4x4, 8x8, and 16x16
  block, for experiment, EXT_TX.
- Encoder speed improves about 12%-15%.
- Update the unit tests for bit-exact result against C.

Change-Id: Idf27c87f1e516ca5b66c7b70142477a115404ccb
2016-05-18 03:48:01 +00:00
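
Note on the row/column flip in the commit above: one way to picture a flip transform type is as an ordinary transform applied to a mirrored copy of the input block. The sketch below only illustrates that mirroring step for a square block. The helper names flip_lr and flip_ud are hypothetical and are not taken from the libvpx source; the SSE4.1 code referenced in the commit may organize the flip differently.

```c
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): mirror an n x n block
 * left-to-right, so that column c of dst holds column n-1-c of src. */
static void flip_lr(const int16_t *src, int16_t *dst, int n) {
  for (int r = 0; r < n; ++r) {
    for (int c = 0; c < n; ++c) {
      dst[r * n + c] = src[r * n + (n - 1 - c)];
    }
  }
}

/* Hypothetical helper (not a libvpx function): mirror an n x n block
 * top-to-bottom, so that row r of dst holds row n-1-r of src. */
static void flip_ud(const int16_t *src, int16_t *dst, int n) {
  for (int r = 0; r < n; ++r) {
    for (int c = 0; c < n; ++c) {
      dst[r * n + c] = src[(n - 1 - r) * n + c];
    }
  }
}
```

Under this assumption, a flipped transform type would call one or both helpers on the residual block and then run the unflipped forward transform on the result.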
Yi Luo
412ad22f46 HBD hybrid transform 16x16 SSE4.1 optimization
- Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
- Update vp10_fht16x16_test.cc to do bit-exact test against
  latest C version.
- HBD encoder speed improves ~1.8%.

Change-Id: Icfc799a212e5289bcf6cedcae3722032133a2bc6
2016-05-09 11:07:01 -07:00
Alex Converse
2e520f2768 transform tests: Avoid #if inside INSTANTIATE_TEST_CASE_P
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1200

Change-Id: Ia2dd6bb1ca2dff4422753af4a00156a12e488ed0
2016-04-27 14:39:38 -07:00
Yi Luo
770bf71503 8x8/16x16 HT types V_DCT to H_FLIPADST SSE2 optimization
- Wrote function: fidtx8_sse2() and fidtx16_sse2().
- Turned on vp10_fht8x8_sse2()/vp10_fht16x16_sse2() for new types.
- Updated 8x8/16x16 unit tests for accuracy/speed.
- Over 20K runs with random input, cycling through tx types
  from V_DCT to H_FLIPADST, SSE2 speed improvement:
  8x8: ~131%
  16x16: ~66%

Change-Id: Ibbb707e932a08fec3b1f423a7dab280a1d696c9a
2016-03-25 16:48:19 -07:00
Debargha Mukherjee
1b17559327 Adds 1D transforms for ADST/FlipADST to make 16
Makes a set of 16 transforms total, adding all 1D
combinations of ADST and FlipADST, and removing all DST
transforms.

lowres, midres both improve by about 0.1% and hdres by
-0.378% in BDRATE but with fewer transforms that are also
simpler.

Further experiments to continue later.

Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e
2016-03-21 11:19:36 -07:00
Yi Luo
50a164a1f6 Implemented DST 16x16 SSE2 intrinsics optimization
- Implemented fdst16_sse2(), fdst16_8col() against C version: fdst16().
- Turned on 7 DST related hybrid txfm types in vp10_fht16x16_sse2().
- Replaced vp10_fht16x16_c() with vp10_fht16x16_sse2() in
  fwd_txfm_16x16().
- Added vp10_fht16x16_sse2() unit test against C version:
  vp10_fht16x16_c() (--gtest_filter=*VP10Trans16x16*).
- Unit test passed.
- Speed improvement: 2.4%, 3.2%, 3.2%, for city_cif.y4m, garden_sif.y4m,
  and mobile_cif.y4m.

Change-Id: Ib30a67ce5d5964bef143d588d0f8fa438be8901f
2016-03-08 14:56:38 -08:00