Scott LaVarnway
3bf02ad74a
vpx: hadamard: use ptrdiff_t instead of int for stride
...
Eliminates the following instruction for the x86 (64 bit)
intrinsic code:
movslq %esi,%rax
Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae
2017-10-26 11:41:48 -07:00
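On x86-64, an `int` stride must be sign-extended to 64 bits before it can be added to a pointer, which is where an instruction such as `movslq` comes from. The hadamard commit above avoids this by taking the stride as `ptrdiff_t`. The helpers below are hypothetical and are not libvpx code; they only contrast the two parameter types. Whether a given compiler actually emits the extra instruction depends on the compiler, flags, and inlining.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libvpx code): sums one column of a block.
 * With an int stride, a 64-bit target has to widen the stride before the
 * pointer addition, typically with a single sign-extension instruction. */
static int sum_column_int(const int16_t *src, int stride, int rows) {
  int sum = 0;
  for (int r = 0; r < rows; ++r) {
    sum += *src;
    src += stride; /* stride is widened from 32 to 64 bits here */
  }
  return sum;
}

/* Same loop with a ptrdiff_t stride: the value already has pointer width,
 * so no widening is needed inside the function. */
static int sum_column_ptrdiff(const int16_t *src, ptrdiff_t stride, int rows) {
  int sum = 0;
  for (int r = 0; r < rows; ++r) {
    sum += *src;
    src += stride;
  }
  return sum;
}
```

Both functions return the same result; only the generated address arithmetic can differ.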
Kyle Siefring
037e596f04
Merge "Optimize convolve8 SSSE3 and AVX2 intrinsics"
2017-10-24 19:22:36 +00:00
Kyle Siefring
ae35425ae6
Optimize convolve8 SSSE3 and AVX2 intrinsics
...
Changed the intrinsics to perform summation similar to the way the assembly does.
The new code diverges from the assembly by preferring unsaturated additions.
Results for Haswell
SSSE3
Horiz/Vert  Size  Speedup
Horiz       x4    ~32%
Horiz       x8    ~6%
Vert        x8    ~4%
AVX2
Horiz/Vert  Size  Speedup
Horiz       x16   ~16%
Vert        x16   ~14%
BUG=webm:1471
Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668
2017-10-24 10:39:48 -04:00
Scott LaVarnway
512bf4e029
vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix
...
Use an intermediate buffer before storing to coeffs when
highbitdepth is enabled.
Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b
2017-10-23 08:49:32 -07:00
Scott LaVarnway
4906cea027
vpx: [x86] vpx_hadamard_16x16_avx2() improvements
...
~10% performance gain. Fixed the cosmetics noted in the
previous commit.
Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13
2017-10-20 08:55:06 -07:00
Scott LaVarnway
b58259ab55
Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()"
2017-10-19 23:32:10 +00:00
Scott LaVarnway
55c126a5d7
vpx: [x86] add vpx_hadamard_16x16_avx2()
...
This version is ~1.91x faster than the sse2 version. When
highbitdepth is enabled, it is ~1.74x faster.
Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd
2017-10-18 18:00:00 -07:00
Kyle Siefring
b3a36f7946
Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c"
2017-10-18 16:19:52 +00:00
Linfeng Zhang
9336e01621
Merge changes I17fff122,Ic149e3cb
...
* changes:
Add 4 to 3 scaling SSSE3 optimization
Test extreme inputs in frame scale functions
2017-10-17 16:03:29 +00:00
Kyle Siefring
55805e2786
Refactor x86/vpx_subpixel_8t_intrin_avx2.c
...
Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305
2017-10-17 11:57:40 -04:00
Linfeng Zhang
580d32240f
Add 4 to 3 scaling SSSE3 optimization
...
Note that this change triggers a different C version on SSSE3 and
generates different scaled output.
It is 2x as fast as the version calling vpx_scaled_2d_ssse3().
Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
2017-10-16 15:42:42 -07:00
Kyle Siefring
caa116c9be
Merge changes I38783d97,If5160c0c
...
* changes:
Extend 16 wide AVX2 convolve8 code to support averaging.
Add AVX2 version of vpx_convolve8_avg.
2017-10-12 16:12:38 +00:00
Linfeng Zhang
16166bfdaa
Add 4 to 1 scaling x86 optimization
...
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 16:24:06 -07:00
Linfeng Zhang
963cc22cef
Merge changes I9d4c1af5,I882da3a0
...
* changes:
Rename some inline functions in NEON scaling
Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 17:29:50 +00:00
Kyle Siefring
1b2f92ee8e
Extend 16 wide AVX2 convolve8 code to support averaging.
...
Also adds vpx_convolve8_avg_horiz_avx2.
Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf
2017-10-09 19:10:03 -04:00
Kyle Siefring
9ca06bcdd2
Add AVX2 version of vpx_convolve8_avg.
...
vpx_convolve8_avg works by first running a normal horizontal filter, then a
vertical filter that averages at the end.
The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
horizontal step.
vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code.
Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
2017-10-07 23:37:48 -04:00
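The two-pass structure described in the vpx_convolve8_avg commit above can be sketched in scalar C. This is a simplified illustration under stated assumptions, not the libvpx implementation: the function name `convolve8_avg_sketch`, the fixed 8-wide temporary buffer, the 8x8 size limit, and the single fixed kernel per direction are all hypothetical. It assumes 8-tap kernels whose taps sum to 128 (7 fractional bits) and an arithmetic right shift for negative intermediate sums.

```c
#include <stddef.h>
#include <stdint.h>

#define TAPS 8
#define FILTER_BITS 7 /* assumption: taps sum to 1 << 7 */

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Applies an 8-tap filter to 8 samples spaced `step` elements apart. */
static uint8_t filter8(const uint8_t *p, ptrdiff_t step, const int16_t *k) {
  int sum = 0;
  for (int t = 0; t < TAPS; ++t) sum += p[t * step] * k[t];
  /* Round to nearest, then drop the fractional bits. */
  return clip_pixel((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
}

/* Hypothetical sketch: src points at the top-left output pixel and must have
 * 3 pixels of border before and 4 after it in both directions.
 * Requires w <= 8 and h <= 8. */
static void convolve8_avg_sketch(const uint8_t *src, ptrdiff_t src_stride,
                                 uint8_t *dst, ptrdiff_t dst_stride,
                                 const int16_t *kx, const int16_t *ky,
                                 int w, int h) {
  uint8_t temp[(8 + TAPS - 1) * 8];
  const int temp_rows = h + TAPS - 1;

  /* Pass 1: plain horizontal filter over h + 7 rows, starting 3 rows up. */
  for (int r = 0; r < temp_rows; ++r) {
    const uint8_t *row = src + (r - 3) * src_stride - 3;
    for (int c = 0; c < w; ++c) temp[r * 8 + c] = filter8(row + c, 1, kx);
  }

  /* Pass 2: vertical filter, averaged with the pixel already in dst. */
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int v = filter8(temp + r * 8 + c, 8, ky);
      uint8_t *out = dst + r * dst_stride + c;
      *out = (uint8_t)((*out + v + 1) >> 1);
    }
  }
}
```

With the identity kernel {0, 0, 0, 128, 0, 0, 0, 0} in both directions, the filters pass the source through unchanged, so each output is simply the rounded average of the source and destination pixels.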
James Zern
807248ec81
Merge "ppc: Add vpx_idct32x32_1024_add_vsx"
2017-10-07 19:08:26 +00:00
Linfeng Zhang
127864deb3
Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
...
Change-Id: I882da3a04884d5fabd4cd591c28682cbb2d76aa5
2017-10-04 12:35:39 -07:00
Linfeng Zhang
9a71811d98
Merge changes Id6a8c549,Ib1e0650b,Ic369dd86
...
* changes:
Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
Add vpx_dsp/x86/mem_sse2.h
Add transpose_8bit_{4x4,8x8}() x86 optimization
2017-10-04 16:15:14 +00:00
James Zern
66b6b87471
Merge "vpx: fix nasm build errors"
2017-10-03 21:47:49 +00:00
Scott LaVarnway
bc4bc9b622
vpx: fix nasm build errors
...
BUG=webm:1462,766721
Change-Id: Icfa536a8e38623636b96c396e3c94889bfde7a98
2017-10-03 20:02:21 +00:00
Linfeng Zhang
6543213e87
Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
...
Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac
2017-10-03 13:02:05 -07:00
Linfeng Zhang
0f756a307d
Add vpx_dsp/x86/mem_sse2.h
...
Add some load and store sse2 inline functions.
Change-Id: Ib1e0650b5a3d8e2b3736ab7c7642d6e384354222
2017-10-03 12:59:05 -07:00
Linfeng Zhang
67c38c92e7
Add transpose_8bit_{4x4,8x8}() x86 optimization
...
Change-Id: Ic369dd86b3b81686f68fbc13ad34ab8ea8846878
2017-10-03 10:00:30 -07:00
Alexandra Hájková
fb7fc1dbda
ppc: Add vpx_idct32x32_1024_add_vsx
...
Change-Id: I55cd0a1569ccc47a53d0ecf751aac259d510e10d
2017-09-30 19:31:20 +00:00
Scott LaVarnway
3bbd62ed27
vpxdsp: [x86] add highbd_d135_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~1.81x
C vs SSSE3 speed gains:
_8x8 : ~1.96x
_16x16 : ~1.88x
_32x32 : ~2.02x
BUG=webm:1411
Change-Id: Iefaf8b39afbbfe34c1ad1d21e3a003b20f1f61e0
2017-09-29 08:56:38 -07:00
Scott LaVarnway
4cae64c32c
vpxdsp: [x86] add highbd_d117_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~2.04x
C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x
BUG=webm:1411
Change-Id: I31d949695991c067dac89d91e0bed3e666c94993
2017-09-28 14:45:28 -07:00
Scott LaVarnway
80992a746c
Merge "vpxdsp: [x86] add highbd_d153_predictor functions"
2017-09-27 20:40:21 +00:00
James Zern
690fa6bb6e
Merge "fix signed integer overflow of idct"
2017-09-27 19:39:11 +00:00
Linfeng Zhang
dbbbd44304
fix signed integer overflow of idct
...
Exposed by fuzz test in high bitdepth.
The bug is introduced in commit 64653fa.
BUG=webm:1466
Change-Id: Idd77d5c6a60efb9241471611ce1aba0646cb6ff5
2017-09-27 11:17:54 -07:00
Scott LaVarnway
19c45ccd43
vpxdsp: [x86] add highbd_d153_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~1.95x
C vs SSSE3 speed gains:
_8x8 : ~3.30x
_16x16 : ~5.67x
_32x32 : ~3.87x
BUG=webm:1411
Change-Id: Ib483989b25614aa89b635e8c087d0879a5d71904
2017-09-27 11:01:11 -07:00
Linfeng Zhang
9d0d13e939
Add vpx_scaled_2d_neon()
...
BUG=webm:1419
Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96
2017-09-26 09:22:39 -07:00
Linfeng Zhang
28762341ac
Merge changes Ib9105462,Idfac00ed,If8d8a0e2
...
* changes:
cosmetics: NEON scaling code
Refactor convolve NEON code
Refactor convolve code
2017-09-26 16:10:46 +00:00
Scott LaVarnway
a059dc0986
Merge "vpxdsp: [x86] add highbd_d45_predictor functions"
2017-09-25 11:34:14 +00:00
Scott LaVarnway
cf82f7276e
vpxdsp: [x86] add highbd_d45_predictor functions
...
C vs SSSE3 speed gains:
_4x4 : ~2.45x
_8x8 : ~10.61x
_16x16 : ~11.34x
_32x32 : ~6.36x
BUG=webm:1411
Change-Id: Ic91389a4f1a8ad093f498afe53765b897fb9be09
2017-09-22 05:20:12 -07:00
Linfeng Zhang
d586cdb4d4
Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
...
BUG=webm:1450
Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8
2017-09-20 14:13:26 -07:00
Linfeng Zhang
76a3d3fcc5
Remove the unnecessary upcasts of (int)cospi_{1...31}_64
...
BUG=webm:1450
Change-Id: Ib046fe28caec5b9ebdc9d0152df7c54ff4266858
2017-09-20 14:13:26 -07:00
Linfeng Zhang
64653fa133
Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
...
The unnecessary upcast to (int) will be cleaned later.
BUG=webm:1450
Change-Id: Ia234575206d5a74540526924b06ed3939322d063
2017-09-20 14:13:26 -07:00
Scott LaVarnway
b85e391ac8
Merge "vpxdsp: [x86] add highbd_d63_predictor functions"
2017-09-20 11:39:28 +00:00
Linfeng Zhang
7c0529728a
cosmetics: NEON scaling code
...
Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618
2017-09-19 16:39:17 -07:00
Linfeng Zhang
f357335c38
Refactor convolve NEON code
...
Rename a couple of hbd static functions.
Move the position of NEON function convolve8_4().
Change-Id: Idfac00edf2e99cdd8e0a73b9f895402f60be6349
2017-09-19 16:28:36 -07:00
Linfeng Zhang
bf8bdae913
Refactor convolve code
...
Extract a couple of static functions into their caller functions.
Change-Id: If8d8a0e217fba6b402d2a79ede13b5b444ff08a0
2017-09-19 16:28:31 -07:00
Scott LaVarnway
bc86e2c6a2
vpxdsp: [x86] add highbd_d63_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~2.94x
C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x
BUG=webm:1411
Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4
2017-09-19 15:47:22 -07:00
Linfeng Zhang
a80bdfd081
Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
...
Add "typedef int16_t tran_coef_t;"
BUG=webm:1450
Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154
2017-09-18 09:26:03 -07:00
Linfeng Zhang
9d278465b5
Merge "cosmetics: vp9_rtcd_defs.pl"
2017-09-18 16:23:33 +00:00
Kaustubh Raste
4ca8f8f5e2
mips msa clean-up msa macros
...
Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store
Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448
2017-09-14 12:29:19 +05:30
Linfeng Zhang
535dee0fb6
cosmetics: vp9_rtcd_defs.pl
...
Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086
2017-09-13 12:13:55 -07:00
Johann Koenig
ed3a80cb5e
Merge "Revert "Revert "quantize avx: copy 32x32 implementation"""
2017-09-13 14:44:53 +00:00
Johann
eb4238ac70
Revert "Revert "quantize avx: copy 32x32 implementation""
...
This reverts commit 8c42237bb2.
Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
2017-09-12 14:25:38 -07:00
Kaustubh Raste
30f1ff94e0
Optimize mips msa vp9 average mc functions
...
Use specific destination loads instead of a vector load.
Change-Id: I65ca13ae8f608fad07121fef848e2a18f54171fe
2017-09-12 16:12:11 +05:30
Scott LaVarnway
c39cd9235e
Merge "vpxdsp: [x86] add highbd_d207_predictor functions"
2017-09-11 22:32:23 +00:00
Linfeng Zhang
a9bbe53dbb
Add 4 to 1 scaling NEON optimization
...
BUG=webm:1419
Change-Id: If82a93935d2453e61b7647aae70983db1740bec7
2017-09-11 10:17:28 -07:00
Scott LaVarnway
d6c9bbc2b6
vpxdsp: [x86] add highbd_d207_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~2.31x
C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x
BUG=webm:1411
Change-Id: I0bac29db261079181ddabc6814bd62c463109caf
2017-09-11 07:36:24 -07:00
James Zern
fb40b5d7a7
intrapred: sync highbd_d63_predictor w/d63_
...
8/16/32: ~6%/~18%/~33% faster
previously:
7012ba639 vp9_reconintra: simplify d63_predictor
BUG=webm:1411
Change-Id: Ie775f3a4f7fd74df44754e65686d826a51c2cdc2
2017-09-08 19:28:01 -07:00
James Zern
5c95fd921e
intrapred: sync highbd_d45_predictor w/d45_
...
8/16/32: ~19%/~54%/~75.5% faster
previously:
acc481eaa vp9_reconintra: simplify d45_predictor
BUG=webm:1411
Change-Id: Ie8340b0c5070ae640f124733f025e4e749b660d8
2017-09-08 19:09:07 -07:00
James Zern
9a2dd7e67e
Merge changes I9ec438aa,I99c954ff
...
* changes:
Update convolve functions' assertions
Add 2 to 1 scaling NEON optimization
2017-09-08 19:23:40 +00:00
Shiyou Yin
2c7b7424c5
Merge "vpxdsp: [loongson] optimize sad functions with mmi"
2017-09-08 00:55:14 +00:00
Linfeng Zhang
ef41c6286d
Update convolve functions' assertions
...
So that 4 to 1 frame scaling can call them.
Change-Id: I9ec438aa63b923ba164ad3c59d7ecfa12789eab5
2017-09-07 12:33:58 -07:00
Linfeng Zhang
3ec20445b2
Refactor convolve8 NEON functions
...
Change-Id: I4ac576875c91fee7cb150d298fae4a2c156d374c
2017-09-06 15:55:17 -07:00
Linfeng Zhang
7219f31904
Merge "Remove get_filter_base() and get_filter_offset() in convolve"
2017-09-06 22:39:15 +00:00
Linfeng Zhang
d331e7a1c0
Remove get_filter_base() and get_filter_offset() in convolve
...
so that the convolve functions are independent of table alignment.
Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-09-05 15:22:36 -07:00
Scott LaVarnway
bc4bcca3fd
vpxdsp: [x86] add highbd_dc_128_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~7.64x
_8x8 : ~16.60x
_16x16 : ~8.15x
_32x32 : ~5.05x
BUG=webm:1411
Change-Id: If165d419711cfda901bd428a05ca1560a009e62e
2017-09-05 07:57:42 -07:00
Shiyou Yin
f4150163a2
vpxdsp: [loongson] optimize sad functions with mmi
...
1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c
Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a
2017-09-02 15:11:32 +00:00
James Zern
334e9abb0b
Merge "inv_txfm_vsx: fix loads in high-bitdepth"
2017-09-01 03:09:49 +00:00
James Zern
f8f64c309b
inv_txfm_vsx: fix loads in high-bitdepth
...
vec_vsx_ld -> load_tran_low
Change-Id: Id3144cdd528d2d406a515e5812e2ea9e4db64bf1
2017-08-30 23:47:56 -07:00
Scott LaVarnway
c39a05ff61
vpxdsp: [x86] add highbd_dc_left_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~6.49x
_8x8 : ~10.82x
_16x16 : ~7.61x
_32x32 : ~5.29x
BUG=webm:1411
Change-Id: Ibc30c50cb7139049bf05298010803499e6ef949b
2017-08-30 09:29:06 -07:00
Scott LaVarnway
f783e3a75d
vpxdsp: [x86] add highbd_dc_top_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~7.39x
_8x8 : ~11.36x
_16x16 : ~8.68x
_32x32 : ~4.33x
BUG=webm:1411
Change-Id: I7f1487cd1531d4e7f0fbb4596fed3bfb72a59d58
2017-08-29 12:53:30 -07:00
Scott LaVarnway
30d9a1916c
vpxdsp: [x86] add highbd_h_predictor functions
...
C vs SSE2 speed gains:
_4x4 : ~8.12x
_8x8 : ~9.71x
_16x16 : ~8.21x
_32x32 : ~5.0x
BUG=webm:1422
Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993
2017-08-28 17:31:18 -07:00
Marco Paniconi
3e069846b9
Merge "Revert "quantize avx: copy 32x32 implementation""
2017-08-25 18:20:31 +00:00
Marco Paniconi
8c42237bb2
Revert "quantize avx: copy 32x32 implementation"
...
This reverts commit f60d1dcd3d.
Reason for revert: Failures in AVX/VP9QuantizeTest in nightly tests.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org
Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-08-25 16:56:08 +00:00
Shiyou Yin
ece1989fa2
Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi."
2017-08-25 06:44:02 +00:00
Shiyou Yin
9e4647c7ab
vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi.
...
Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174
2017-08-25 01:58:49 +08:00
Johann
f60d1dcd3d
quantize avx: copy 32x32 implementation
...
Ensure avx and ssse3 stay in sync by testing them against each other.
Change-Id: I699f3b48785c83260825402d7826231f475f697c
2017-08-24 10:42:34 -07:00
Johann
1787e7dbe0
quantize ssse3: copy implementation to intrinsics
...
Still does not pass tests. Does match the previous assembly, although
saving the sign before multiplying is dubious.
Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a
2017-08-24 07:47:51 -07:00
Shiyou Yin
d080c92524
Merge "vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi."
2017-08-24 00:55:11 +00:00
Johann Koenig
f53b656207
Merge "quantize avx: copy implementation to intrinsics"
2017-08-23 21:14:13 +00:00
Scott LaVarnway
1aad50c092
Merge "vpx_dsp: get32x32var_avx2() cleanup"
2017-08-23 19:59:25 +00:00
Johann Koenig
dfafd10ef5
Merge "quantize neon: round dqcoeff towards zero"
2017-08-23 19:20:53 +00:00
Johann
7c27872164
quantize avx: copy implementation to intrinsics
...
Adds an early exit based on ptest. Slightly slower than ssse3 in the
full case because of the extra check, but potentially faster if lots of
rows can be skipped.
Very close in speed to the assembly.
Can run in 32 bit, unlike the assembly. Allows reworking the function
prototype to use structs.
Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
2017-08-23 09:19:16 -07:00
Johann
2a5aa98a35
quantize neon: round dqcoeff towards zero
...
Add 1 if negative to get dqcoeff to round towards zero.
10-15% faster than converting to positive before shifting.
Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d
2017-08-23 08:05:50 -07:00
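The rounding trick from the quantize neon commit above can be shown in scalar C. This is a minimal sketch, not the NEON code: the function names are hypothetical, and the shift amount of 1 (a divide by 2) is an assumption, since the commit message does not state it. An arithmetic right shift rounds toward negative infinity, so adding 1 to negative inputs first makes it match truncating division. The sketch assumes `>>` on a negative signed value is an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Reference behaviour: C integer division truncates toward zero. */
static int32_t half_by_division(int32_t x) { return x / 2; }

/* Add 1 if negative, then shift. Without the correction, -3 >> 1 would give
 * -2 (round toward negative infinity) instead of -1.
 * Assumes an arithmetic right shift for negative values. */
static int32_t half_by_shift(int32_t x) { return (x + (x < 0)) >> 1; }
```

For example, an input of -3 becomes -2 after the correction and -1 after the shift, which matches -3 / 2 in C.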
Shiyou Yin
59e065b6ed
vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi.
...
Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2
2017-08-23 15:14:15 +08:00
Johann
b9c1dcc5fa
quantize ssse3: copy style from sse2
...
Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598
2017-08-22 14:25:27 -07:00
Johann
75752ab7c0
quantize sse2: copy opts from ssse3
...
Simplify eob calculations based on ssse3 implementation.
General clean up and re-scoping.
Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee
2017-08-22 13:01:44 -07:00
Johann Koenig
ab27b68693
Merge changes Icfb70687,I9a963e99,Ie8ac00ef,I1272917c
...
* changes:
quantize: ignore skip_block in arm
quantize: ignore skip_block in x86
quantize fp: ignore skip_block in arm
quantize fp: ignore skip_block in x86
2017-08-22 19:19:14 +00:00
James Zern
419ce36294
Merge "ppc: Add vpx_idct16x16_256_add_vsx"
2017-08-22 00:48:39 +00:00
Shiyou Yin
bff5aa9827
Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi."
2017-08-22 00:37:23 +00:00
Johann
2c56bb97f2
quantize: ignore skip_block in arm
...
Change-Id: Icfb70687476b2edb25d255793ba325b261d40584
2017-08-21 14:37:50 -07:00
Johann
c02fdd0258
quantize: ignore skip_block in x86
...
Change-Id: I9a963e99f08761f0c8d6a305619270b2f1c4edf8
2017-08-21 14:37:03 -07:00
Johann
13eed991f9
Remove skip_block from quantize
...
This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.
Add assert() and comments regarding the usage of skip_block.
Removing the parameter is a fairly involved process so leave it be for
the moment.
Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-08-21 09:49:04 -07:00
Scott LaVarnway
eab3f5e0cc
vpx_dsp: get32x32var_avx2() cleanup
...
renamed to get32x16var_avx2()
BUG=webm:1404
Change-Id: Icb8f3986c9c9c646e13a69430db7235fc7e1a036
2017-08-18 13:44:09 -07:00
Scott LaVarnway
2c5478e383
Merge "vpx_dsp: vpx_get16x16var_avx2() cleanup"
2017-08-18 20:30:59 +00:00
Scott LaVarnway
2f7497f341
vpx_dsp: vpx_get16x16var_avx2() cleanup
...
BUG=webm:1404
Change-Id: I88aceb07f4db4870a06eee21d87296974ce3221a
2017-08-18 12:23:49 -07:00
Johann Koenig
1426f04e91
Merge "quantize: normalize intermediate types"
2017-08-18 16:00:28 +00:00
Shiyou Yin
7d82e57f5b
vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi.
...
Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e
2017-08-18 09:06:49 +08:00
James Zern
bb15fd51be
highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo
...
135 -> 34
fixes unused function warnings for highbd_idct32_34_4x32_quarter_[12]
Change-Id: I4f50ff6ea514200af93dd59ff94c7f9717409682
2017-08-17 15:37:38 -07:00
Johann
7f602d6114
quantize: normalize intermediate types
...
Despite abs_coeff being a positive value, all the other implementations
treat it as signed which simplifies restoring the sign.
HBD builds cast qcoeff to avoid a visual studio warning. Match
vp9_quantize.c style of casting the entire expression.
Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5
2017-08-17 12:34:28 -07:00
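One common way a signed intermediate simplifies restoring the sign, as the commit above mentions, is the XOR-and-subtract idiom. The sketch below uses hypothetical helper names and is not the code touched by the commit. It assumes two's complement integers and inputs other than INT_MIN.

```c
#include <stdint.h>

/* Returns 0 for non-negative inputs and -1 (all bits set) for negative ones. */
static int sign_mask(int coeff) { return -(coeff < 0); }

/* |coeff| kept in a signed int: (x ^ -1) - (-1) == ~x + 1 == -x. */
static int magnitude(int coeff) {
  const int s = sign_mask(coeff);
  return (coeff ^ s) - s;
}

/* The same expression puts the sign back after working on the magnitude. */
static int restore_sign(int abs_value, int sign) {
  return (abs_value ^ sign) - sign;
}
```

Because the magnitude and the mask share one signed type, the sign can be restored without a cast or a branch.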
James Zern
e038d1610e
inv_txfm_sse2.h: correct idct*/iadst* prototypes
...
fixes mismatch between prototypes and definitions
Change-Id: Ib5e7dfcce244dbb8401815be2cdd183d96792652
2017-08-16 23:06:09 -07:00
Linfeng Zhang
f95686895b
Merge changes I08b562b6,Ia275940a,I51106e90
...
* changes:
Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
Update highbd idct x86 optimizations.
Update 32x32 idct sse2 and ssse3 optimizations.
2017-08-16 16:36:37 +00:00
Jerome Jiang
6b9c691daf
Merge "Clean up writing YUV files for debug purpose."
2017-08-15 18:28:54 +00:00
Jerome Jiang
a153080b55
Clean up writing YUV files for debug purpose.
...
Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.
To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure --enable-vp9-temporal-denoising
For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'
Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528
2017-08-15 10:44:03 -07:00