Commit Graph

3261 Commits

Author SHA1 Message Date
Linfeng Zhang
b14b616d96 Update vp9_iht8x8_64_add_neon()
Change-Id: Ie70ed8b9273df5e1fd06bc93cb469e80630941d2
2018-01-29 15:17:08 -08:00
Linfeng Zhang
903bc150da Update vp9_iht4x4_16_add_neon()
Change-Id: Ica8dbe5f8167e5d370d89d233c598b70bba123b7
2018-01-29 10:25:24 -08:00
Scott LaVarnway
b9e44842fc Merge "vp9_quantize_fp_avx2()" 2018-01-24 13:58:08 +00:00
Linfeng Zhang
231012fdab Add vp9_highbd_iht16x16_256_add_sse4_1()
BUG=webm:1413

Change-Id: I8d7eeae1bd219eb848c1a86071046a477f7a91af
2018-01-23 11:24:42 -08:00
Linfeng Zhang
8f50e06012 Add "vpx_" prefix to 2 idct x86 functions
Change-Id: I4f3052d8748e16b06e9155f8daf22f867dfaa7a3
2018-01-23 09:17:38 -08:00
Linfeng Zhang
6fea41abee Merge "Add vp9_highbd_iht8x8_64_add_sse4_1()" 2018-01-23 17:04:20 +00:00
Linfeng Zhang
9874ec07bd Add vp9_highbd_iht8x8_64_add_sse4_1()
BUG=webm:1413

Change-Id: Id9038226902b2d793fc6c17ac81bb104c1a18988
2018-01-18 15:49:44 -08:00
Scott LaVarnway
c7449b482c vp9_quantize_fp_avx2()
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.

Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2018-01-18 13:33:30 -08:00
Johann
281f68a81f clang-format v5.0.0 vp9/
Remove trailing commas to keep multiple elements on one line.

Add blank lines to prevent comments from being treated as blocks.

clang-format guards for struct with a comment in the middle.

Change-Id: I3bcb8313ae8aaf69179249a13b4087b1272cdbc0
2018-01-18 12:37:58 -08:00
Linfeng Zhang
e20ca4fead Add vp9_highbd_iht4x4_16_add_sse4_1()
BUG=webm:1413

Change-Id: I14930d0af24370a44ab359de5bba5512eef4e29f
2018-01-08 10:14:20 -08:00
Johann
e4b3f03c64 add copyright to rtcd files
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-14 22:50:08 +00:00
Johann
bdbecea1ba explicitly label .text sections
nasm should infer .text but does not for windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451

Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 14:33:04 -08:00
Vlad Tsyrklevich
bc29863b96 [CFI] Remove function pointer casts
Control Flow Integrity [1] indirect call checking verifies that function
pointers only call valid functions with a matching type signature. This
change eliminates function pointer casts to make libvpx CFI-safe.

[1] https://www.chromium.org/developers/testing/control-flow-integrity

Change-Id: I7e08522d195a43c88cda06fa20414426c8c4372c
2017-11-20 16:36:29 -08:00
Jerome Jiang
ea14a1a965 Merge "vp9: Fix mem rel for non-ref for external buffer." 2017-11-17 00:31:16 +00:00
Scott LaVarnway
8c7213bc00 Merge "vpx: [x86] add vp9_block_error_fp_avx2()" 2017-11-10 00:45:47 +00:00
Jerome Jiang
6246d8aa76 vp9: Fix mem rel for non-ref for external buffer.
Release frame buffers for non-ref when the decoder is destroyed.

Enable the non ref test.

BUG=b/68819248

Change-Id: Id87ef3b0a62318f9812e927cd957c05c859047fa
2017-11-09 15:47:21 -08:00
Scott LaVarnway
62ab5e99c1 vpx: [x86] add vp9_block_error_fp_avx2()
SSE2 asm vs AVX2 intrinsics speed gains:
blocksize   16: ~1.00
blocksize   64: ~1.17
blocksize  256: ~1.67
blocksize 1024: ~1.81

Change-Id: I2a86db239cf57e3ff617890ccb2d236aba83ad5e
2017-11-09 05:02:31 -08:00
Kyle Siefring
b383a17fa4 Support building AVX-512 and implement sadx4 for AVX-512
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.

Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00
Linfeng Zhang
a80bdfd081 Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154
2017-09-18 09:26:03 -07:00
Linfeng Zhang
535dee0fb6 cosmetics: vp9_rtcd_defs.pl
Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086
2017-09-13 12:13:55 -07:00
Linfeng Zhang
71b38a144e Add 2 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b
2017-09-07 12:33:50 -07:00
Linfeng Zhang
d331e7a1c0 Remove get_filter_base() and get_filter_offset() in convolve
so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-09-05 15:22:36 -07:00
clang-format
7587a97551 apply clang-format
Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c
2017-09-01 01:24:03 -07:00
Johann
e83d99d7b8 quantize fp: neon implementation
About 4x faster when values are below the dequant threshold and 10x
faster if everything needs to be calculated.

Both numbers would improve if the division for dqcoeff could be
simplified.

BUG=webm:1426

Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-23 08:01:30 -07:00
Linfeng Zhang
f95686895b Merge changes I08b562b6,Ia275940a,I51106e90
* changes:
  Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
  Update highbd idct x86 optimizations.
  Update 32x32 idct sse2 and ssse3 optimizations.
2017-08-16 16:36:37 +00:00
Linfeng Zhang
3f05a70c41 Update 32x32 idct sse2 and ssse3 optimizations.
Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70
2017-08-14 16:59:31 -07:00
Scott LaVarnway
fa85cf131c vp9: strip temporal filter code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52
2017-08-14 14:27:53 -07:00
Johann
109faffe9b remove vp9_full_sad_search
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.

Remove the remaining sizes and all the highbitdepth versions.

BUG=webm:1425

Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-07-10 11:20:35 -07:00
James Zern
5227b8200b vp9: remove FrameWorkerData & vp9_dthread.h
the file was empty after the struct removal. the only remaining use was
within vp9_dx_iface, but the wrapper became unnecessary after the
removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd
2017-07-05 22:32:00 -07:00
James Zern
48c4a038eb vp9: remove (un)lock_buffer_pool
there is no threaded access to this pool after the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: I710769b87102edc898c59eb9a2e7a91d8c49107f
2017-07-05 21:07:00 -07:00
James Zern
303cb3106b vp9_onyxc_int,RefCntBuffer: rm unused members
the last frame_worker_owner, row and col references were removed in:
131bd06e6 remove vp9_dthread.c

BUG=webm:1395

Change-Id: Ia7fb2e8782b12a58d2a2263849d20a8abf06aef6
2017-06-30 12:03:07 -07:00
James Zern
fb134e759a vp9: reduce FRAME_BUFFERS by 3
the additional buffers are unneeded with the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: Id9ec4cb6462af5d07a0d3cf939bd216db27d9d9e
2017-06-30 12:03:07 -07:00
James Zern
bc837b223b VP9_COMMON: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3a562b2c6b82050064d2b2ccb18a3e77c700b2da
2017-06-30 12:03:07 -07:00
Linfeng Zhang
a76b6b232c Update load_input_data() in x86
Split to load_input_data4() and load_input_data8().
Use pack with signed saturation instruction for high bitdepth.

Change-Id: Icda3e0129a6fdb4a51d1cafbdc652ae3a65f4e06
2017-06-26 13:38:33 -07:00
Linfeng Zhang
cbb991b6b8 Convert 8x8 idct x86 macros to inline functions
Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce
2017-06-13 16:50:43 -07:00
Jerome Jiang
0afa2dad76 Fix vp8 race when build --enable-vp9-highbitdepth.
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
2017-05-26 09:45:01 -07:00
Marco
d3aebeee4e vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl
Change-Id: I259d152c62864b365490368051f3c3b7d7f2f1c5
2017-05-10 12:06:44 -07:00
Marco
4e23998fb4 vp9: SVC: Add option to set downsampling filter type.
Add option in SVC to set the filter type and phase for
the frame level downsampling filters.

For 3 spatial layers: set downsampling filter type to bilinear
and set phase to 8, for lowest spatial layer.

Change-Id: Id81f4b1ba93db19c1cd37b6a46d1281a2c61bc43
2017-05-09 17:22:44 -07:00
Linfeng Zhang
ecd1eb2162 Update 4x4 idct sse2 functions
It's a bit faster to call idct4_sse2() in vpx_idct4x4_16_add_sse2()

Change-Id: I1513be7a895cd2fc190f4a8297c240b17de0f876
2017-05-08 16:16:52 -07:00
Linfeng Zhang
2c3a2ad6f1 Merge changes I0cfe4117,I3581d80d,Ida62c941
* changes:
  Split dsp/x86/inv_txfm_sse2.c
  Update highbd idct functions arguments to use uint16_t dst
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
2017-05-08 16:15:57 +00:00
Jerome Jiang
3453c8d6c4 Merge "vp9: Neon optimization for denoiser. Add unit tests." 2017-05-06 01:28:32 +00:00
Jerome Jiang
069eedb3a0 vp9: Neon optimization for denoiser. Add unit tests.
Denoiser on Neon is 5x faster than C code.

BUG=webm:1420

Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb
2017-05-05 16:40:52 -07:00
Linfeng Zhang
d5de63d2be Update highbd idct functions arguments to use uint16_t dst
BUG=webm:1388

Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03 13:59:16 -07:00
Linfeng Zhang
081b39f2b7 Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
BUG=webm:1388

Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-05-03 13:58:31 -07:00
Linfeng Zhang
e8655d49f5 Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()" 2017-05-01 19:54:40 +00:00
Johann
657f3e9f14 Use uint32_t for accumulator
Be specific about the data type size.

Use convenience macro vp9_zero_array.

Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd
2017-04-28 06:36:59 -07:00
Johann Koenig
94ebdba71d Merge "vp9 temporal filter: sse4 implementation" 2017-04-28 13:22:41 +00:00
Johann
6dfeea6592 vp9 temporal filter: sse4 implementation
Approximates division using multiply and shift.

Speeds up both sizes (8x8 and 16x16) by 30 times.

Fix the call sites to use the RTCD function.

Delete sse2 and mips implementation. They were based on a previous
implementation of the filter. It was changed in Dec 2015:
ece4fd5d22

BUG=webm:1378

Change-Id: I0818e767a802966520b5c6e7999584ad13159276
2017-04-26 22:03:05 -07:00
Linfeng Zhang
4758d20227 Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()
BUG=webm:1388

Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d
2017-04-25 14:32:20 -07:00
Linfeng Zhang
51dc998f3a Update highbd convolve functions arguments to use uint16_t src/dst
BUG=webm:1388

Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-25 14:22:19 -07:00