14019 Commits

Author SHA1 Message Date
James Zern
d8642d831f Merge "VP9_COPY_CONVOLVE_SSE2 optimization" 2015-07-31 23:22:34 +00:00
Jingning Han
e8b133c79c Factor inverse transform functions into vpx_dsp
This commit moves the inverse transform functions from the vp9
folder to the vpx_dsp folder. The hybrid transform wrapper functions
stay in the vp9 folder, since they involve codec-specific data structures.

Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
2015-07-31 16:21:00 -07:00
Alex Converse
af6d2c7d42 Turn off simple_model_rd_from_var at speed 4.
This got erroneously changed during the refactor. This fixes
SvcTest.TwoPassEncode2TemporalLayersWithMultipleFrameContextsAndTiles.

Change-Id: Ifa5ab0e098396c5e2d10478db87df256eadfa4c7
2015-07-31 15:50:17 -07:00
James Zern
e184b613b9 Merge changes Iecdbbc34,I8b4db93f
* changes:
  Android.mk: fix *_rtcd.h deps for armeabi-v7a
  Android.mk: add a dep on vpx_config.asm for x86_64
2015-07-31 22:22:48 +00:00
Scott LaVarnway
a5e97d874b VP9_COPY_CONVOLVE_SSE2 optimization
This function suffers from several problems on small cores (tablets):
- The load of the next iteration is blocked by the store of the previous iteration.
- 4k aliasing (between a future store and older loads).
- Current small-core machines are in-order, so the store spins in the rehabQ until the load is finished.
Fixed by:
- prefetching 2 lines ahead
- unrolling the copy of 2 rows of the block
- pre-loading all xmm registers before the loop, with the final stores after the loop
Improvement by block size:
copy_convolve_sse2 64x64 - 16%
copy_convolve_sse2 32x32 - 52%
copy_convolve_sse2 16x16 - 6%
copy_convolve_sse2 8x8 - 2.5%
copy_convolve_sse2 4x4 - 2.7%
Credit goes to Tom Craver (tom.r.craver@intel.com) and Ilya Albrekht (ilya.albrekht@intel.com).

Change-Id: I63d3428799c50b2bf7b5677c8268bacb9fc29671
2015-07-31 14:51:51 -07:00
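Two of the fixes listed in a5e97d874b (prefetching two lines ahead and copying two rows per iteration, with both loads issued before either store) can be sketched in C as below. This is an illustration only, not the libvpx assembly: the function name copy_16xh_sketch, the fixed 16-pixel width, and the plain-C fallback are assumptions made for the example, and the "pre-load all registers before the loop" step is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copies a 16-pixel-wide block of h rows,
 * two rows per loop iteration. h must be positive and even. */
static void copy_16xh_sketch(const uint8_t *src, ptrdiff_t src_stride,
                             uint8_t *dst, ptrdiff_t dst_stride, int h) {
  assert(h > 0 && (h & 1) == 0);
  for (int y = 0; y < h; y += 2) {
#if defined(__SSE2__)
    /* Prefetch the two rows used by the next iteration, if they exist. */
    if (y + 4 <= h) {
      _mm_prefetch((const char *)(src + 2 * src_stride), _MM_HINT_T0);
      _mm_prefetch((const char *)(src + 3 * src_stride), _MM_HINT_T0);
    }
    /* Issue both loads before either store, so a store never sits
     * between the loads of the same iteration. */
    const __m128i r0 = _mm_loadu_si128((const __m128i *)src);
    const __m128i r1 = _mm_loadu_si128((const __m128i *)(src + src_stride));
    _mm_storeu_si128((__m128i *)dst, r0);
    _mm_storeu_si128((__m128i *)(dst + dst_stride), r1);
#else
    /* Portable fallback with the same load-then-store ordering. */
    uint8_t r0[16], r1[16];
    memcpy(r0, src, 16);
    memcpy(r1, src + src_stride, 16);
    memcpy(dst, r0, 16);
    memcpy(dst + dst_stride, r1, 16);
#endif
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
}
```

Whether this ordering helps depends on the microarchitecture; the percentages in the commit message were reported for the original assembly, not for this sketch.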
Jingning Han
6025c6d65b Merge "Fix compiler warning in mips/dspr2" 2015-07-31 21:29:50 +00:00
Aℓex Converse
dd4b416412 Merge "Compute skippable inside the block_rd_txfm loop." 2015-07-31 21:19:11 +00:00
Jingning Han
135b43ccf3 Fix compiler warning in mips/dspr2
This commit fixes the mixed declaration and definition warning when
mips/dspr2 is turned on.

Change-Id: I633d6fe42368b9ac35b106786ebac6969ad53552
2015-07-31 12:34:34 -07:00
Aℓex Converse
90e563d91f Merge changes Ic1ce346a,Ic0b4e92c
* changes:
  Simplify model_rd_for_sb HBD ifdefs
  Simplify dist_block HBD ifdefs
2015-07-31 19:05:54 +00:00
Alex Converse
ab20c98e84 Compute skippable inside the block_rd_txfm loop.
Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
2015-07-31 11:45:59 -07:00
Zoe Liu
7f8dd35329 Merge "Refactor mips/dspr2 on convolution." 2015-07-31 18:23:19 +00:00
Zoe Liu
873a158f14 Merge "Code refactor on InterpKernel" 2015-07-31 18:20:14 +00:00
Alex Converse
c62228f273 Simplify model_rd_for_sb HBD ifdefs
Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
2015-07-31 11:16:59 -07:00
Alex Converse
da9c73c293 Simplify dist_block HBD ifdefs
Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
2015-07-31 11:04:01 -07:00
Aℓex Converse
8abd0c2a12 Merge "Short circuit rate_block in block_rd_txfm." 2015-07-31 17:59:22 +00:00
Zoe Liu
7cfdc00337 Refactor mips/dspr2 on convolution.
Change-Id: If59a39d5a92c261537342726f94bb7f7f26dfff3
2015-07-31 10:27:42 -07:00
Zoe Liu
7186a2dd86 Code refactor on InterpKernel
In essence, this refactors the code for both the interpolation
filtering and the convolution. The change moves all the files and
renames the code from the vp9_ prefix to the vpx_ prefix for the
following architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.

Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
2015-07-31 10:27:33 -07:00
Alex Converse
4ac5058afc Give skip_txfm constants names.
This uses a #define instead of an enum to keep byte packing.

Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
2015-07-31 10:08:08 -07:00
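The byte-packing point in 4ac5058afc can be illustrated with a small C sketch. All names here (SKIP_NONE, SkipTxfmEnum, PackedFlags, EnumFlags, NUM_ENTRIES) are hypothetical and are not taken from the libvpx source. The size of an enum type is implementation-defined, and on common ABIs it matches int, so an array of enum values usually takes more space than an array of uint8_t holding the same named constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical named constants, stored in one byte each. */
#define SKIP_NONE 0
#define SKIP_AC_DC 1
#define SKIP_AC_ONLY 2

/* The same values as an enum, for comparison only. */
typedef enum { E_SKIP_NONE, E_SKIP_AC_DC, E_SKIP_AC_ONLY } SkipTxfmEnum;

#define NUM_ENTRIES 4 /* hypothetical array length */

/* One byte per flag, regardless of how the compiler sizes enums. */
typedef struct {
  uint8_t skip_txfm[NUM_ENTRIES];
} PackedFlags;

/* Each element has the compiler's enum size, typically sizeof(int). */
typedef struct {
  SkipTxfmEnum skip_txfm[NUM_ENTRIES];
} EnumFlags;

static_assert(sizeof(((PackedFlags *)0)->skip_txfm) == NUM_ENTRIES,
              "one byte per flag");
```

With the defines, code still reads flags by name (for example, comparing an entry against SKIP_AC_ONLY) while the storage stays at one byte per entry.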
Alex Converse
73422d3b2d Short circuit rate_block in block_rd_txfm.
Don't run rate_block (cost_coeffs) if distortion alone is enough to
surpass best_rd.

This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is
zero effect on output if tx_cache is removed.

Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
2015-07-31 10:05:51 -07:00
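The short circuit described in 73422d3b2d can be sketched roughly as follows. This is a simplified illustration rather than the libvpx code: rd_cost, block_rd_short_circuit, rate_fn, counting_rate, and the plain "multiplier times rate plus distortion" cost are assumptions made for the example. The idea is that, with a non-negative rate, the cost at rate zero is a lower bound, so the rate computation can be skipped when that bound already exceeds best_rd.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RD cost: rate weighted by a multiplier, plus distortion.
 * The real libvpx RD cost uses different scaling. */
static int64_t rd_cost(int64_t rdmult, int64_t rate, int64_t dist) {
  return rdmult * rate + dist;
}

/* Hypothetical callback standing in for the expensive rate computation. */
typedef int64_t (*rate_fn)(void *ctx);

/* Returns 1 and fills *rd_out when the block is fully evaluated.
 * Returns 0 without calling rate() when distortion alone exceeds best_rd. */
static int block_rd_short_circuit(int64_t rdmult, int64_t dist,
                                  int64_t best_rd, rate_fn rate, void *ctx,
                                  int64_t *rd_out) {
  assert(rdmult >= 0 && dist >= 0);
  /* Lower bound on the cost: rate is never negative. */
  if (rd_cost(rdmult, 0, dist) > best_rd) return 0;
  *rd_out = rd_cost(rdmult, rate(ctx), dist);
  return 1;
}

/* Demo rate function: counts its calls through ctx and returns 10. */
static int64_t counting_rate(void *ctx) {
  ++*(int *)ctx;
  return 10;
}
```

Calling block_rd_short_circuit with a distortion above best_rd leaves the call counter untouched, which is where the runtime saving would come from.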
Parag Salasakar
8fbc641540 mips msa vp8 temporal filter optimization
average improvement ~2x-3x

Change-Id: I05593bed583234dc7809aaec6cab82773a29505d
2015-07-31 12:03:19 +05:30
Parag Salasakar
0e3f494b21 mips msa vp8 block subtract optimization
average improvement ~2x-3x

Change-Id: I30abf4c92cddcc9e87b7a40d4106076e1ec701c2
2015-07-31 09:29:10 +05:30
Parag Salasakar
e3ee8c292b Merge "mips msa vp8 quantize optimization" 2015-07-31 03:44:03 +00:00
Yunqing Wang
3b2e73b9a4 Remove tx cache and speed up tx size selection
1. The RD scores obtained during tx size selection were stored in the
tx cache and used to help make the tx decision for the following frames.
This is no longer used in the VP9 encoder. The related decision-making
code from 1.5+ years ago was recovered, and borg tests didn't show any
quality gain. This patch removes it to lower the complexity.

2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.

Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
2015-07-30 18:53:40 -07:00
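Item 2 of 3b2e73b9a4 can be sketched as below. Only TX_MODE_SELECT comes from the commit message; the TxSize and TxMode enums are simplified, and choose_tx_size, tx_rd_fn, FIXED_TX, and demo_rd are hypothetical names introduced for the illustration. Outside TX_MODE_SELECT the search range collapses to the single chosen size, so the RD callback runs once instead of once per size.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the encoder's transform size and mode types. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_SIZES } TxSize;
typedef enum { FIXED_TX, TX_MODE_SELECT } TxMode;

/* Hypothetical callback returning the RD cost of coding with one size. */
typedef int64_t (*tx_rd_fn)(TxSize tx, void *ctx);

static TxSize choose_tx_size(TxMode mode, TxSize chosen, tx_rd_fn rd,
                             void *ctx, int64_t *best_rd) {
  /* Test every size only in TX_MODE_SELECT; otherwise test just `chosen`. */
  const int start = (mode == TX_MODE_SELECT) ? (int)TX_4X4 : (int)chosen;
  const int end = (mode == TX_MODE_SELECT) ? (int)TX_32X32 : (int)chosen;
  TxSize best = (TxSize)start;
  *best_rd = INT64_MAX;
  for (int tx = start; tx <= end; ++tx) {
    const int64_t this_rd = rd((TxSize)tx, ctx);
    if (this_rd < *best_rd) {
      *best_rd = this_rd;
      best = (TxSize)tx;
    }
  }
  return best;
}

/* Demo cost table: counts evaluations through ctx, cheapest at TX_16X16. */
static int64_t demo_rd(TxSize tx, void *ctx) {
  static const int64_t cost[TX_SIZES] = { 40, 30, 10, 20 };
  ++*(int *)ctx;
  return cost[tx];
}
```

In this sketch the fixed-mode call returns the chosen size even if another size would have been cheaper, which mirrors the commit's point that the other sizes need not be tested at all.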
Aℓex Converse
eb6b443bd2 Merge "Convert simple_model_rd_from_var from a speed check to a speed feature." 2015-07-30 23:04:28 +00:00
Hui Su
a71c5c0ee9 Merge "Exclude vpx intra prediction functions in vp8-only build" 2015-07-30 22:29:35 +00:00
Alex Converse
c827c59eaf Convert simple_model_rd_from_var from a speed check to a speed feature.
Change-Id: I8877025e172fff29bc4e270790211463b676b4d7
2015-07-30 13:53:26 -07:00
hui su
5fddefbced Exclude vpx intra prediction functions in vp8-only build
Currently vp8 is not using the intra prediction functions in vpx_dsp.

Change-Id: I1522b5f5cb12a81999fb126cf7c62c70259e7a52
2015-07-30 13:49:47 -07:00
James Zern
21da45e570 Android.mk: fix *_rtcd.h deps for armeabi-v7a
strip '.neon' so *_rtcd.h depends on the correct file

Change-Id: Iecdbbc34c9ce5c6d0a4b466332d52f4e6a0cb128
2015-07-30 13:27:30 -07:00
Parag Salasakar
56aa0da405 mips msa vp8 quantize optimization
average improvement ~2x-3x

Change-Id: I6fc37191bf9cb5a67e1af9787d0d27659c17bdba
2015-07-30 12:56:57 -07:00
Alex Converse
b7f441a0bc Cleanup rdcost_block_args
Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba
2015-07-30 12:55:51 -07:00
Aℓex Converse
c0f0245e8a Merge "Clean up some casts." 2015-07-30 19:37:28 +00:00
Jingning Han
91feec1452 Merge "Cosmetics - Fix header file order in unit tests" 2015-07-30 05:37:53 +00:00
Jingning Han
097d59c28c Cosmetics - Fix header file order in unit tests
Change-Id: I9582a8d74990125b71e8fe620f7f3f2585a30798
2015-07-29 20:48:25 -07:00
Parag Salasakar
0c2a14f9e2 mips msa vp8 fdct optimization
average improvement ~2x-4x

Change-Id: Id0bc600440f7ef53348f585ebadb1ac6869e9a00
2015-07-30 08:14:42 +05:30
Parag Salasakar
7c6ae373ac Merge "mips msa vp8 post proc optimization" 2015-07-30 02:34:06 +00:00
Aℓex Converse
583c205270 Merge "Comment zcoeff_blk." 2015-07-30 01:06:08 +00:00
Alex Converse
dfe7fdae7d Comment zcoeff_blk.
Change-Id: Iefc2eb78e71472ecf51802ec59ff32caef4bd0f4
2015-07-29 16:53:33 -07:00
Yaowu Xu
f23241087a Add const to a variable declaration
Change-Id: Idf572c22a87098665f5179dc3212a06d9a85a342
2015-07-29 16:27:34 -07:00
Yaowu Xu
47c55acdad Fix a typo
Change-Id: Ief8eea8fe6bef139d1e94f8d6dfac5a44efe785d
2015-07-29 16:23:14 -07:00
James Zern
e3365c894a Android.mk: add a dep on vpx_config.asm for x86_64
Change-Id: I8b4db93f754607aab64351745bd102ab238d9501
2015-07-29 15:38:43 -07:00
Alex Converse
49e0673659 Clean up some casts.
Change-Id: I264ca534cd7d4755906e20aea47e7a2523bca611
2015-07-29 11:26:51 -07:00
Parag Salasakar
a5d9416fd7 mips msa vp8 post proc optimization
average improvement ~2x-4x

Change-Id: I93abc15389649c169bb8b69127c0b95407d34692
2015-07-29 09:40:26 +05:30
Parag Salasakar
ce4c4b96e4 Merge "mips msa vp8 filter by weight optimization" 2015-07-29 04:00:41 +00:00
James Zern
f42012e526 Merge "add vp9_block_error_fp_neon" 2015-07-29 00:47:09 +00:00
Hui Su
4cbf36b105 Merge "Replace prefix vp9_ with vpx_ for intra prediction functions" 2015-07-29 00:38:48 +00:00
Jingning Han
d12a4a825c Merge "Replace vp9_ prefix in 2D-DCT functions with vpx_" 2015-07-29 00:07:31 +00:00
Jingning Han
39e3937c24 Merge "Remove vp9_dct.h file" 2015-07-29 00:06:56 +00:00
Jingning Han
fc18cf7a11 Merge "Move DC only forward 2D-DCT functions to vpx_dsp" 2015-07-29 00:06:37 +00:00
Jingning Han
4b5109cd73 Replace vp9_ prefix in 2D-DCT functions with vpx_
Clean up the forward 2D-DCT function names in vpx_dsp.

Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-28 16:06:44 -07:00
Jingning Han
a7e9178d80 Remove vp9_dct.h file
The forward 32x32 2D-DCT functions are aligned in the vpx_dsp folder.
The vp9_dct.h file is effectively unused now.

Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281
2015-07-28 15:27:36 -07:00