16549 Commits

Author SHA1 Message Date
Johann Koenig
9b63cb057a Merge "post proc test: add padding for sse2 tests" 2016-12-17 01:12:34 +00:00
Marco Paniconi
d1eca240fb Merge "vp9: Change condition to enable recheck_zeromv_after_denoising." 2016-12-16 23:53:33 +00:00
Marco
4260a7f2b3 vp9: Change condition to enable recheck_zeromv_after_denoising.
When denoising is enabled, change the condition so that
recheck_zeromv_after_denoising is enabled only at very high noise levels.
The recheck is causing an issue, so restricting it to very high noise
effectively shuts it off.

Change-Id: Ic40d6025f3f398338cedd270d17c0ccd9a3daa84
2016-12-16 15:00:21 -08:00
Johann
5993b808f0 post proc test: add padding for sse2 tests
Avoid valgrind warnings for reading out of bounds when the width is not
divisible by 16.
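
As a rough sketch of the idea, a buffer row can be rounded up to the next multiple of 16 so that 16-byte SSE2 loads stepping across the row never run past the allocation. The helper name `padded_width` is hypothetical and is not taken from the test itself.

```c
#include <stddef.h>

/* Hypothetical helper (not from the libvpx test): round a row width up to
 * the next multiple of 16, so 16-byte loads issued at 16-byte column steps
 * stay inside the allocated row. */
static size_t padded_width(size_t width) {
  return (width + 15) & ~(size_t)15;
}
```

A test would then allocate `padded_width(width)` bytes per row while still comparing only the first `width` columns.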

Change-Id: I5670d7cfbbce00874b98cfb7472f99c7936c2c47
2016-12-16 14:06:06 -08:00
Johann
4781a67737 postproc test: disable new down and across test
The new test is causing valgrind failures:
[ RUN      ] SSE2/VpxPostProcDownAndAcrossMbRowTest.CheckCvsAssembly/0
==28923== Invalid read of size 16
==28923==    at 0x724016: ??? (deblock_sse2.asm:146)

Disable during investigation. The test is new but the code is not.

Change-Id: I5521e5fd48a595e3798b833bf7e3cc97b81c1975
2016-12-16 12:19:00 -08:00
Jim Bankoski
318a1ff5ec vp8: use threading mutexes for tsan only.
This avoids a decode performance hit of 2% when running on hyperthreaded
cores.

This patch uses the mutexes only when we are running tsan.

This is safe because 32-bit operations like read and store are atomic
on all the platforms we care about. Tsan warns about race situations,
but in either situation here (the read occurs before the write, or the
write before the read) the worst case is that we go around the loop one
extra time, so the ordering doesn't really matter.

That said, a few other things have been tried.

For instance, as per:
webrtc/base/atomicops.h#52

That code uses:
__atomic_load_n(i, __ATOMIC_ACQUIRE);
__atomic_store_n(i, value, __ATOMIC_RELEASE);

This works on gcc and clang (replacing the protected write and read) and
avoids tsan errors while incurring no performance penalty. In C11 it is
replaced by straight atomic operations.

However, there is no equivalent in the versions of Visual Studio we
support, as int32 is already atomic on all Windows platforms. To avoid
tsan-like warnings on Windows we would need to use interlocked exchange,
and the end result doesn't gain us anything.
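
For illustration only, the acquire/release builtins quoted above could be wrapped as below on gcc or clang. This sketches the alternative that was tried, not what this patch does; the wrapper names `sync_read` and `sync_write` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical wrappers; the names are illustrative, not libvpx's.
 * Assumes GCC or Clang, which provide the __atomic builtins. */
static inline int32_t sync_read(int32_t *p) {
  /* Acquire load: later reads in this thread cannot move before it. */
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

static inline void sync_write(int32_t *p, int32_t value) {
  /* Release store: earlier writes in this thread are visible to a
   * thread that observes this value with an acquire load. */
  __atomic_store_n(p, value, __ATOMIC_RELEASE);
}
```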

Change-Id: I2066e3c7f42641ebb23d53feb1f16f23f85bcf59
2016-12-16 08:50:55 -08:00
Marco Paniconi
2b1ec65b5d Merge "vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS" 2016-12-15 19:48:16 +00:00
Johann
41b0888a84 postproc: neon down and across macroblock filter
Implement vpx_post_proc_down_and_across_mb_row in NEON.
Runs about 6-7x faster than C.

BUG=webm:1320

Change-Id: Ic5c7d3552a88cfcf999ec5bf2bd46fee460642c2
2016-12-14 15:11:28 -08:00
Marco
5de798f2b2 vp9: Fix to usage of flag USE_ALTREF_FOR_ONE_PASS
The flag USE_ALTREF_FOR_ONE_PASS allows for alt-ref lookahead
in 1 pass vbr (from https://chromium-review.googlesource.com/#/c/365498).
This change is to make sure this macro flag only has effect if
the config flag cpi->oxcf.enable_auto_altef is also on.

No change in ytlive encoding, as USE_ALTREF_FOR_ONE_PASS is not
yet enabled.

Change-Id: I1a69681e4a15c5244581a3dab4587fca08f02e0f
2016-12-14 15:07:38 -08:00
Yaowu Xu
27e1bacdb3 Change order of operation to avoid ubsan warnings
This commit changes the order of operations to avoid left shifts of
negative numbers.
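
A minimal sketch of the general problem, not the expression changed in this commit: left-shifting a negative signed value is undefined behavior in C, which is what ubsan reports. Multiplying by a power of two is one common way to get the same scaling without shifting a negative operand. The function name `scale_up` is hypothetical.

```c
/* Illustrative only; not the code touched by the commit.
 * Scales value by 2^bits without left-shifting a negative operand.
 * Assumes 0 <= bits < 31 and that the product fits in an int. */
static int scale_up(int value, int bits) {
  return value * (1 << bits);
}
```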

Change-Id: I607c7eb91658c7a5ef397fc1504721d1b10e3dd6
2016-12-14 09:37:14 -08:00
Linfeng Zhang
3dd20456ab Merge "Update idct test code to test 8-bit & high bitdepth simultaneously" 2016-12-14 17:05:34 +00:00
Linfeng Zhang
201dcefafe Update idct test code to test 8-bit & high bitdepth simultaneously
Change-Id: Icc0eb9c0ddf2a13ec832877a089450972134e8ec
2016-12-13 17:25:04 -08:00
James Bankoski
3486abd54a Merge "Reapply 'Amend and improve VP8 multithreading implementation'" 2016-12-14 01:21:50 +00:00
James Zern
86e340c76e enable vpx_idct32x32_1024_add_neon in hbd builds
BUG=webm:1294

Change-Id: Ibdda54e6d1303b0f73bc7bc71417e4041d7618de
2016-12-12 19:28:35 -08:00
Jim Bankoski
85a541a421 Reapply 'Amend and improve VP8 multithreading implementation'
Reapply this patch:
ff0107f Amend and improve VP8 multithreading implementation

Amended the patch to add a unit test, and fix an asan error.

BUG=webm:851

Change-Id: I6572c03256169c64e80248bf5a5e99f59a2fc93c
2016-12-13 02:11:34 +00:00
Linfeng Zhang
5d4aa325a6 Cosmetics by unifying dest_stride to stride in idct
Change-Id: Ie9336a808a3c3592bb4fd5d4ad3839028bfcafba
2016-12-12 15:13:22 -08:00
James Bankoski
282f3b3d78 Merge "vp8: adds multithread testing." 2016-12-10 00:01:32 +00:00
Marco Paniconi
817488be47 Merge "vp9: Fix to crash in svc code." 2016-12-09 23:47:02 +00:00
Jim Bankoski
121e161115 vp8: adds multithread testing.
The test is disabled because of TSAN errors until we resolve
BUG=webm:851

Change-Id: I0b21c8d815bc1ea365da024b1e2ee5e1fc5715c2
2016-12-09 15:05:59 -08:00
Johann
2c24f7178d Move load_and_transpose to transpose_neon.h
Allows for use outside the idcts without pulling in idct_neon.h

Change-Id: I4a94c1af3dac3e1b5bc8296ec9eab0ddcc8cfecf
2016-12-09 12:54:55 -08:00
Marco
076d4bd91a vp9: Fix to crash in svc code.
use_base_mv assumes 2x2 scaling, so the fix is to shut off
this feature unless the spatial scale factors are 2.
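
The guard can be pictured roughly as follows, assuming the spatial scale is expressed as a numerator/denominator pair in which a factor of 2 means the denominator is twice the numerator. The function and parameter names are hypothetical and are not taken from the svc code.

```c
/* Hypothetical check; names are illustrative, not from the vp9 svc code.
 * Returns 1 only when the scale ratio num/den is exactly 1/2, i.e. the
 * layer is scaled by a factor of 2. */
static int base_mv_allowed(int scale_num, int scale_den) {
  return scale_num > 0 && scale_den == 2 * scale_num;
}
```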

Added svc unittest for 2 spatial layers with 5x5 scaling,
which generates the issue without this fix.

Also fix some settings in svc unittest:
let the speed setting vary (from 5 to 8), and enable static threshold.

BUG=webm:1344

Change-Id: Idfd0a6c633c21b49a0479601506302cfe974e30e
2016-12-09 08:57:09 -08:00
James Zern
7ba9d31e3f Merge "idct16x16_add_neon: fix arm visual studio builds" 2016-12-09 03:19:16 +00:00
Marco
cd6f742980 vp8 multi_res_encoder: Adjust some settings in sample encoder.
Set #threads to default 1 for all streams, change bit allocation
for 3 temporal layers, and enable denoiser on middle resolution layer.

Change-Id: I4a57adbfdb2c319002b8f3cf359613842dc00d75
2016-12-08 15:27:16 -08:00
James Zern
6defef4ab2 idct16x16_add_neon: fix arm visual studio builds
after:
2d3d95f enable vpx_idct16x16_256_add_neon in hbd builds

reorder INCLUDEs and fix indent of IF/ENDIFs

remove vpx_config.asm to avoid multiple symbol definitions in windows
builds and shift idct_neon.asm.S to the top to allow use of
CONFIG_VP9_HIGHBITDEPTH in the export list.

Change-Id: I0dacfbae62a6ec8fe4a26940c1a52da2dfad2029
2016-12-08 15:17:57 -08:00
Yunqing Wang
880adc3355 Merge "Remove an unused first pass statistic" 2016-12-08 22:46:44 +00:00
Yunqing Wang
394020383d Remove an unused first pass statistic
One of the first pass stats "new_mv_count" is no longer used in VP9,
and is removed. This also makes it easy to implement a multi-threaded
first pass. This change doesn't affect the coding performance, which
has been verified by borg tests.

Change-Id: I4c7c7bf9465fda838eb230814ef0c631c068c903
2016-12-07 15:32:25 -08:00
Marco Paniconi
e4c6f8fde7 Merge "vp9: Fix some TODOs in svc code." 2016-12-07 22:06:01 +00:00
Linfeng Zhang
385599b553 Merge "Update TEST_P(PartialIDctTest, RunQuantCheck)" 2016-12-07 21:05:05 +00:00
Linfeng Zhang
174528de1e Merge "Update idct NEON optimization to not use narrowing saturating shift" 2016-12-07 21:03:21 +00:00
Marco
5778a7c9cb vp9: Fix some TODOs in svc code.
Change-Id: Ie9f441245987ade9dab38af69adf4dd1fb38ca3f
2016-12-07 13:02:48 -08:00
James Zern
f16a0a1aa4 Merge "enable vpx_idct16x16_256_add_neon in hbd builds" 2016-12-07 20:26:44 +00:00
Linfeng Zhang
834feffe08 Update TEST_P(PartialIDctTest, RunQuantCheck)
1. Use correct projections when copying real dct/quant outputs.
2. Remove local random number generator and combine loops.
3. Quantization with minimum allowed step sizes instead of maximum.
   This may generate larger inputs.

Change-Id: I154afc26230c894d564671cff4b8fd5485b69598
2016-12-07 11:34:00 -08:00
Marco Paniconi
17c403d0ab Merge "vp9: Adjust the weight factor for segment rate cost for aq-mode=3." 2016-12-07 19:31:13 +00:00
Linfeng Zhang
018a2adcb1 Update idct NEON optimization to not use narrowing saturating shift
Change-Id: Iae517017217dbacd638d40fcfeeb0f4bba7b8b8b
2016-12-07 10:25:09 -08:00
James Zern
2d3d95f7ac enable vpx_idct16x16_256_add_neon in hbd builds
BUG=webm:1294

Change-Id: Ib421c150b0d29dee0a81390a612bf01a4a28cff1
2016-12-06 18:32:21 -08:00
James Zern
228c9940ea Merge changes Ibad079f2,I7858a0a1
* changes:
  enable vpx_idct16x16_10_add_neon in hbd builds
  idct16x16,NEON: rm output_stride from pass1 fns
2016-12-07 01:40:28 +00:00
James Zern
8befcd0089 enable vpx_idct16x16_10_add_neon in hbd builds
BUG=webm:1294

Change-Id: Ibad079f25e673d4f5181961896a8a8333a51e825
2016-12-06 16:09:19 -08:00
James Zern
af9d7aa9fb idct16x16,NEON: rm output_stride from pass1 fns
vpx_idct16x16_256_add_neon_pass1, vpx_idct16x16_10_add_neon:
this was a constant 8 in all cases, meaning the results are stored
contiguously; this allows the number of stores to be reduced.

Change-Id: I7858a0a15a284883ef45c13dfd97c308df9ea09e
2016-12-06 15:13:33 -08:00
Linfeng Zhang
cb339d628f Refine 8-bit 8x8 idct NEON intrinsics
Change-Id: I4ec4ad1928ec2ed87f596f52f097bc52065278dd
2016-12-05 17:50:14 -08:00
Marco
360ac89885 vp9: Adjust the weight factor for segment rate cost for aq-mode=3.
Use the segment weight factor based on the target (cr->percent_refresh)
if it is less than the current estimate (average of past usage and target).
Small improvement at low bitrates.
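
The selection rule described above can be sketched as below, assuming the "average" is an equal-weight mean of past usage and target. The function and parameter names are hypothetical; the actual encoder code may weight or scale these values differently.

```c
/* Illustrative sketch; names and the equal-weight mean are assumptions.
 * Picks the target-based weight when it is below the running estimate,
 * otherwise keeps the estimate. */
static double segment_weight(double target, double past_usage) {
  const double estimate = 0.5 * (past_usage + target);
  return target < estimate ? target : estimate;
}
```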

Change-Id: Iba8fd909e203f94458901366d3a991f7ea854d49
2016-12-05 12:42:56 -08:00
Linfeng Zhang
a8eee97b43 Check in vpx_lpf_vertical_4_dual_neon() assembly
This replaces its C version.

Change-Id: Ie39e9324305fdc0fff610ced608a037e44a85a1a
2016-12-02 15:54:30 -08:00
James Zern
a7fa1314da Merge changes I4afc130e,Iaa64d23f
* changes:
  Add high bitdepth 4x4 idct NEON intrinsics
  Update idct x86 intrinsics to not use saturated add and sub
2016-12-02 04:01:28 +00:00
Linfeng Zhang
17a8cf5cc3 Add high bitdepth 4x4 idct NEON intrinsics
Change-Id: I4afc130effa05b8be2e9f982967216b1beb2ce4b
2016-11-30 13:07:13 -08:00
Linfeng Zhang
264f6e70ec Update idct x86 intrinsics to not use saturated add and sub
Change-Id: Iaa64d23fdb45ca1f235b0ea57e614516e548eca4
2016-11-29 17:06:08 -08:00
James Zern
c6641782c3 idct16x16,NEON,cosmetics: normalize fn signatures
+ remove unused parameters from vpx_idct16x16_10_add_neon_pass2

Change-Id: Ie5912a4abdd308fab589380bca054a2e7234a2c4
2016-11-28 16:46:01 -08:00
James Zern
12566c3d0f Merge changes Ide6d3994,I164cfcbe
* changes:
  enable vpx_idct32x32_135_add_neon in hbd builds
  idct_neon: rename load_tran_low_to_s16 -> ...s16q
2016-11-29 00:12:45 +00:00
James Zern
33ddc645ce Merge "build/make/Android.mk: correct rtcd template var refs" 2016-11-28 23:39:37 +00:00
James Bankoski
68991d7f87 Merge "svc_test: fix two warnings" 2016-11-28 22:27:26 +00:00
Jim Bankoski
27b5cc31e6 svc_test: fix two warnings
Use of possibly uninitialized variable and missing test initializer.

Change-Id: I2192c81c39ef4239cc11a309850c0ee8781ef17e
2016-11-28 12:53:39 -08:00
Jerome Jiang
f68cf8ba19 Cosmetic changes to variable names in deblocker tests.
Change kExpectedOutput to expected_output in function parameters in
the deblocker test.

Change-Id: I5baf8d1285ac47922950887406c7aa519ddc512a
2016-11-28 10:08:12 -08:00