Yi Luo
6036a0d24f
Following SSSE3 intrinsics functions also work for HBD
...
- vpx_idct8x8_12_add_ssse3
vpx_idct8x8_64_add_ssse3
vpx_idct32x32_34_add_ssse3
vpx_idct32x32_135_add_ssse3
vpx_idct32x32_1024_add_ssse3
- turn on unit tests.
Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
2017-02-21 12:37:53 -08:00
Johann Koenig
1e224dcb83
Merge "Drop zbin_ptr and quant_shift_ptr"
2017-02-21 18:16:38 +00:00
Yi Luo
62a332160f
Merge "Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests"
2017-02-21 16:36:06 +00:00
Paul Wilkins
4d4231352c
Merge "Change to prediction decay calculation."
2017-02-21 09:42:38 +00:00
Marco Paniconi
f091752a9c
Merge "vp9: Fix for non-rd pickmode for high-bitdepth build."
2017-02-21 05:37:23 +00:00
Marco
4e1ba35458
vp9: Fix for non-rd pickmode for high-bitdepth build.
...
Use the simple block_yrd under certain conditions.
The optimization code is completed but the speed is still slower
(~6% on 720p) than the low-bitdepth build.
For now, use the more complex block_yrd under certain conditions
(always use it for speed <= 5, otherwise use it on key frames and for
bsize >= 32x32).
This gives about 2-3% gain in quality for speed 7 on RTC set
(over high bitdepth build), with about the same encoder fps as the
low bitdepth build.
Change-Id: Ibe92a1945d0bd635f880befb4c815727df62d754
2017-02-20 20:25:36 -08:00
James Zern
bf6fcebfed
vp8_fdct4x4_test: align input and output buffers
...
fixes segfault in 32-bit builds
Change-Id: I5b3cc5a335cb236a6ec4cb11fa8feb54ae0182c7
2017-02-18 13:30:28 -08:00
James Zern
52b3e1a633
datarate_test: disable OnePassCbrSvc2SpatialLayersDenoiserOn
...
segfaults
BUG=webm:1374
Change-Id: I3790c6cb8a539d13dee6a8225ef09b1575dea26c
2017-02-17 16:23:22 -08:00
Johann Koenig
9cb470eba7
Merge "vp8_short_fdct4x4: verify optimized functions"
2017-02-17 22:11:08 +00:00
Yi Luo
1f8e8e5bf1
Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests
...
- In the SSSE3 optimization, 16-bit addition and subtraction would
overflow when the input coefficients are 16-bit signed extreme values.
- Function-level speed becomes slower (unit ms):
idct8x8_64: 284 -> 294
idct8x8_12: 145 -> 158.
BUG=webm:1332
Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b
2017-02-17 14:05:05 -08:00
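The overflow described in this commit can be illustrated with a scalar sketch. This is not the SSSE3 code from the commit: `add_wrap16` and `add_wide32` are hypothetical helpers that only model what happens to one 16-bit lane when two extreme coefficients are added, and what a widened 32-bit sum would hold instead. Whether the actual fix widens, saturates, or reorders the arithmetic is not stated in the message.

```c
#include <stdint.h>

/* Hypothetical model of one lane of a wrapping 16-bit add.
 * The exact sum is formed in 32 bits and then folded back into
 * the int16_t range, which is what modulo-2^16 wraparound does. */
static int16_t add_wrap16(int16_t a, int16_t b) {
  int32_t s = (int32_t)a + (int32_t)b; /* exact, range [-65536, 65534] */
  if (s > INT16_MAX) s -= 65536;
  if (s < INT16_MIN) s += 65536;
  return (int16_t)s;
}

/* Hypothetical widened add: the result keeps the exact value
 * because a 32-bit lane has room for the 17-bit sum. */
static int32_t add_wide32(int16_t a, int16_t b) {
  return (int32_t)a + (int32_t)b;
}
```

With both inputs at the 16-bit signed maximum, the wrapped lane no longer holds the true sum, while the widened one does. Extra widening or rounding steps of this kind would be consistent with the small slowdown reported above, though the message does not say which approach was taken.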
James Zern
3e7025022e
Merge "Add vpx_highbd_idct16x16_10_add_neon()"
2017-02-17 20:29:37 +00:00
paulwilkins
a63adac604
Change to prediction decay calculation.
...
This change subtracts out low complexity intra regions that are also low
error in the inter domain, in the calculation of the frame prediction decay.
The rationale here is that low complexity regions (such as sky) do not imply
high prediction decay in the same way as high error intra or neutral blocks.
The effect of this is small in most clips but in a few clips it can be > 10%.
(E.g. In to tree)
Change-Id: If67ac23d17fca14285cad2defa464c61c9ea861c
2017-02-17 09:29:24 +00:00
Johann
bf05cd3c99
vp8_short_fdct4x4: verify optimized functions
...
Change-Id: I7c7f5dfabde65c09f111fb0ced0e3ad231ee716e
2017-02-16 19:34:50 -08:00
Johann
c7342f35c8
tiny_ssim: clean up on failure
...
Clears up clang static analysis warnings about memory leaks.
Change-Id: I60d4d0f3794735a8b81d9da4a30d19e7a9cba9cf
2017-02-17 03:28:34 +00:00
Yi Luo
f62dcc9c33
Replace idct32x32_1024_add_ssse3 assembly with intrinsics
...
- Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on
i7-6700, no obvious user-level speed regression.
- Passed unit tests.
Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
2017-02-16 16:10:40 -08:00
James Zern
b5bc9ee02d
Merge "cosmetics: Fix spelling mistake in compile flag name."
2017-02-17 00:04:42 +00:00
Johann Koenig
a9b81da575
Merge "block error avx2: use tran_low_t"
2017-02-16 23:51:14 +00:00
Linfeng Zhang
0620081731
Add vpx_highbd_idct16x16_10_add_neon()
...
BUG=webm:1301
Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab
2017-02-16 15:13:50 -08:00
James Zern
0f014c97e5
Merge "Fix mips vpx_post_proc_down_and_across_mb_row_msa function"
2017-02-16 23:02:10 +00:00
James Zern
e9d07c0c2a
Merge "disable VP9MultiThreadedFrameParallel tests"
2017-02-16 22:56:02 +00:00
paulwilkins
d218b0914e
cosmetics: Fix spelling mistake in compile flag name.
...
agressive -> aggressive
after:
ce7b38459 Aggressive VBR method.
Change-Id: Ie0f30b1bbc77ed9f32bec047b4a9b3d0cf4853f5
2017-02-16 14:51:31 -08:00
Johann Koenig
06a82af0de
Merge "correct bitdepth_conversion_sse2.h header guard"
2017-02-16 21:41:28 +00:00
Johann
ca4e27f5da
Drop zbin_ptr and quant_shift_ptr
...
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use
of these parameters.
scan is used for C code and iscan is used for SIMD implementations.
Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
2017-02-16 13:20:32 -08:00
James Zern
6ab0870d45
disable VP9MultiThreadedFrameParallel tests
...
these are flaky and cause TSan warnings with clang-3.9.1
BUG=webm:1372
Change-Id: I8a7047552ba2ccd2d8c45f8795818c74562e5990
2017-02-16 12:56:04 -08:00
Johann
6c2d732bf4
correct bitdepth_conversion_sse2.h header guard
...
Change-Id: Ic4ffd861608e67fe59bcb3a86010ce3ef11a5519
2017-02-16 12:43:33 -08:00
Yi Luo
1cb44945fb
Merge "Add idct32x32_135_add SSSE3 intrinsics"
2017-02-16 20:43:29 +00:00
Johann
2104454607
block error avx2: use tran_low_t
...
Change-Id: Ic5f3a1f569d6f82afeaf4fcd7235374bb460db3c
2017-02-16 12:39:02 -08:00
Johann Koenig
cc43012674
Merge changes I267050a5,Iebade0ef,Id96a8df3
...
* changes:
quantize_fp_32x32 highbd ssse3: enable existing function
quantize_fp highbd ssse3: use tran_low_t for coeff
quantize_fp highbd sse2: use tran_low_t for coeff
2017-02-16 20:34:48 +00:00
Yi Luo
72a43e2378
Add idct32x32_135_add SSSE3 intrinsics
...
- Replace the corresponding assembly code.
- No user-level speed degradation.
- Unit tests passed.
Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
2017-02-16 11:29:34 -08:00
Yunqing Wang
0bf6b51572
Merge "Structured the mode ordering code to avoid redundant memcpy"
2017-02-16 16:22:54 +00:00
Johann
ff37a911ce
quantize_fp_32x32 highbd ssse3: enable existing function
...
This was created as part of the quantize_fp_ssse3 change. Both
functions use the same source file with different macro parameters.
Change-Id: I267050a559426a85955d215aa0aaca270439c5ab
2017-02-16 07:40:56 -08:00
Johann
4682130b60
quantize_fp highbd ssse3: use tran_low_t for coeff
...
Change-Id: Iebade0efc0efbb0a80a0f3adbef4962e3a2f25e8
2017-02-16 07:40:56 -08:00
Johann
ac3996a6d1
quantize_fp highbd sse2: use tran_low_t for coeff
...
Change-Id: Id96a8df33354a7987ce890a3d6798c7375ffa4aa
2017-02-16 07:40:55 -08:00
Johann
44600442dc
bitdepth conversion: really use num elements
...
The previous implementation confused bit/bytes/elements. It was using
'32' as the multiplier but that was mistakenly adopted because a 32x32
transform embedded the stride.
Change-Id: Ieeb867a332416b9a40580b5e7c9b20088e9e691a
2017-02-16 15:02:48 +00:00
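The bit/byte/element distinction behind this fix can be sketched in plain C. The sketch assumes `tran_low_t` is a 32-bit type, as in a high-bitdepth build; `coeff_at`, `byte_offset`, and `kDemo` are hypothetical names for illustration and do not come from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: high-bitdepth build, where coefficients are 32-bit. */
typedef int32_t tran_low_t;

/* Hypothetical demo buffer. */
static const tran_low_t kDemo[4] = { 10, 20, 30, 40 };

/* Offset given as a number of ELEMENTS: pointer arithmetic on a
 * tran_low_t pointer already scales by sizeof(tran_low_t). */
static const tran_low_t *coeff_at(const tran_low_t *base, size_t n) {
  return base + n;
}

/* The same position expressed in BYTES, for comparison. */
static size_t byte_offset(size_t n) {
  return n * sizeof(tran_low_t);
}
```

Under these assumptions, stepping by 2 elements moves 8 bytes (64 bits). A fixed multiplier such as 32 is only right if it happens to match the unit the helper expects, which is the kind of confusion the message describes.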
Ranjit Kumar Tulabandu
5127e58dab
Structured the mode ordering code to avoid redundant memcpy
...
Change-Id: I4f5d6b54018bd1928cd9e5e42619e6f55b334803
2017-02-16 14:12:33 +00:00
Paul Wilkins
60a10116d1
Merge "Disconnect ARF breakout from frame boost."
2017-02-16 10:02:09 +00:00
Paul Wilkins
543ebc900f
Merge "Remove unnecessary factor."
2017-02-16 10:01:58 +00:00
Paul Wilkins
9216ba58d8
Merge "Bug in scale_sse_threshold()"
2017-02-16 10:01:46 +00:00
Paul Wilkins
e6c1993f1b
Merge "Additional first pass stats."
2017-02-16 09:39:29 +00:00
Kaustubh Raste
fddf66b741
Fix mips vpx_post_proc_down_and_across_mb_row_msa function
...
Added fix to handle non-multiple of 16 cols case for size 16
Change-Id: If3a6d772d112077c5e0a9be9e612e1148f04338c
2017-02-16 13:17:00 +05:30
Johann Koenig
b63e88e506
Merge "Use 'packssdw' for loading tran_low_t values"
2017-02-16 02:41:00 +00:00
Johann Koenig
61d05c1e67
Merge "vp8_dx_iface: remove unused 'else' condition"
2017-02-16 01:00:45 +00:00
James Zern
cc04ae1565
Merge "vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism"
2017-02-16 00:21:19 +00:00
Marco Paniconi
e6cf741ae6
Merge "vp9: Some code cleanup for aq-mode = 3."
2017-02-15 23:03:27 +00:00
Marco
158b300952
vp9: Some code cleanup for aq-mode = 3.
...
The weight segment needs to only be computed once per frame,
so remove it from the function vp9_cyclic_refresh_rc_bits_per_mb(),
which is called within a loop inside vp9_rc_regulate_q.
Change-Id: Ia0e18b89abb97e42c466d4dbc47700d7f76555db
2017-02-15 14:07:04 -08:00
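The cleanup above is a hoist of a per-frame value out of a per-iteration call. A minimal sketch of that pattern follows. Every name in it (`FrameCache`, `update_weight_segment`, `blended_cost`, `demo_weight_updates`) and the arithmetic itself are hypothetical, chosen only to show the shape of the change, and do not reflect the libvpx implementation.

```c
/* All names and formulas here are hypothetical; not the libvpx code. */
typedef struct {
  double weight_segment; /* cached per-frame value */
  int weight_updates;    /* counts how often the weight was recomputed */
} FrameCache;

/* Intended to be called once per frame. */
static void update_weight_segment(FrameCache *fc, int refreshed_blocks,
                                  int total_blocks) {
  fc->weight_segment = (double)refreshed_blocks / (double)total_blocks;
  fc->weight_updates++;
}

/* Called inside the search loop; only reads the cached weight. */
static double blended_cost(const FrameCache *fc, double base_cost,
                           double boosted_cost) {
  return (1.0 - fc->weight_segment) * base_cost +
         fc->weight_segment * boosted_cost;
}

/* Driver: one weight update, many loop iterations that reuse it. */
static int demo_weight_updates(void) {
  FrameCache fc = { 0.0, 0 };
  double total = 0.0;
  int q;
  update_weight_segment(&fc, 25, 100);
  for (q = 0; q < 64; ++q) {
    total += blended_cost(&fc, 100.0 - q, 80.0 - q);
  }
  (void)total;
  return fc.weight_updates;
}
```

In this sketch the weight is computed once however many times the loop body runs, which is the property the commit message asks for.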
Jerome Jiang
2865de86ec
vpx_temporal_svc_encoder: Expose error resilient control to cmd line.
...
Change-Id: Ic74a8690b136ffbc370080f70b2d5a6b1572bf63
2017-02-15 21:45:52 +00:00
Linfeng Zhang
d12f25f216
Merge "cosmetics,dsp/inv_txfm.c: reorder functions"
2017-02-15 20:18:23 +00:00
Marco Paniconi
725606a678
Merge "vp9. Use same source_sad threshold for all speeds."
2017-02-15 20:07:19 +00:00
Linfeng Zhang
106c342659
cosmetics,dsp/inv_txfm.c: reorder functions
...
Change-Id: Ie0f7689ebe230c68eadb22a32b14838c1a7543a6
2017-02-15 11:40:35 -08:00
Linfeng Zhang
d5edf56bb5
Merge "Add vpx_highbd_idct16x16_38_add_neon()"
2017-02-15 19:34:18 +00:00