17517 Commits

Author SHA1 Message Date
Linfeng Zhang
c7e4917e97 Clean 8x8 idct x86 optimization
Create load_buffer_8x8() and write_buffer_8x8().

Change-Id: Ib26dd515d734a5402971c91de336ab481b213fdf
2017-06-15 14:30:00 -07:00
Linfeng Zhang
98967645a1 Remove vpx_idct8x8_64_add_ssse3()
It's almost identical to vpx_idct8x8_64_add_sse2(), except for a small
difference in instruction order.

Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f
2017-06-15 14:09:33 -07:00
Urvang Joshi
a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00
Linfeng Zhang
8d391a111a Merge changes Ibf9d120b,I341399ec,Iaa5dd63b,Id59865fd
* changes:
  Update high bitdepth load_input_data() in x86
  Clean array_transpose_{4X8,16x16,16x16_2) in x86
  Remove array_transpose_8x8() in x86
  Convert 8x8 idct x86 macros to inline functions
2017-06-15 17:57:50 +00:00
Marco Paniconi
8b48f68c0d Merge "vp8: Adjust the pred_err threhsold for drop on overshoot." 2017-06-14 15:59:55 +00:00
Linfeng Zhang
6da6a23291 Update high bitdepth load_input_data() in x86
BUG=webm:1412

Change-Id: Ibf9d120b80c7d3a7637e79e123cf2f0aae6dd78c
2017-06-13 16:53:53 -07:00
Linfeng Zhang
d6eeef9ee6 Clean array_transpose_{4X8,16x16,16x16_2) in x86
Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0
2017-06-13 16:50:44 -07:00
Linfeng Zhang
9c72e85e4c Remove array_transpose_8x8() in x86
Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f
2017-06-13 16:50:44 -07:00
Linfeng Zhang
cbb991b6b8 Convert 8x8 idct x86 macros to inline functions
Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce
2017-06-13 16:50:43 -07:00
James Zern
4f9d852759 vp8_skin_detection: add 'vp8_' prefix to public fns
BUG=webm:1438

Change-Id: I5feb31c254d02e116e624cfe702e73ba5a1f7aca
2017-06-12 20:13:28 -07:00
James Zern
98666368ee rename vp8/common/skin_detection.[hc] -> vp8_*
some build systems have trouble with duplicate basenames.
vpx_dsp/skin_detection.[hc] were added in:
658e85425 Merge skin detection code in vp8/9.

BUG=webm:1438

Change-Id: Ieaa70b40bda409ec23e6d179b47a930ac6243b05
2017-06-12 20:13:23 -07:00
Marco
b6e1bdfc76 vp8: Adjust the pred_err threhsold for drop on overshoot.
Change-Id: Ica2a09ac87160936b6f7bd01f167f464ea3ac41c
2017-06-12 09:54:16 -07:00
Hui Su
21e1661b54 Merge "vp9 level targeting: more strict constraint on min_gf_interval" 2017-06-12 16:38:02 +00:00
Jerome Jiang
a46bc0268b Merge "Remove duplication on vp8/9_write_yuv_frame." 2017-06-10 04:50:19 +00:00
Marco
e540ca7155 vp9: SVC: Use prune_evenemore only for non_reference.
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only affects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
2017-06-09 17:52:20 -07:00
Jerome Jiang
ff2d220d21 Remove duplication on vp8/9_write_yuv_frame.
Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8
2017-06-09 17:08:26 -07:00
Johann Koenig
6dcd9b37ea Merge "idct_test: don't use std::nothrow anymore" 2017-06-09 20:42:39 +00:00
Johann Koenig
8aa4ee1f10 Merge "buffer.h: allow declaring an alignment" 2017-06-09 20:42:21 +00:00
Johann Koenig
65f4299d65 Merge "Remove some dead code. Coverity CID 1310058" 2017-06-09 20:41:57 +00:00
Johann
92373a5bb2 idct_test: don't use std::nothrow anymore
But still check for NULL before calling Init()

Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76
2017-06-09 11:09:06 -07:00
Johann
5aee8ea752 buffer.h: allow declaring an alignment
x86 simd register operations generally prefer and may require 16 byte
alignment.

Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba
2017-06-09 11:03:15 -07:00
Sylvestre Ledru
c12d1d9b98 Remove some dead code. Coverity CID 1310058
Change-Id: I1186cf1dd8cde42f5970928f43edfc852298289d
2017-06-09 17:56:38 +00:00
James Zern
b3a262dff3 Merge "vp8_decode_frame: fix oob read on truncated key frame" 2017-06-08 23:17:50 +00:00
James Zern
45daecb4f7 vp8_decode_frame: fix oob read on truncated key frame
the check for error correction being disabled was overriding the data
length checks. this avoids returning incorrect information (width /
height) for the decoded frame which could result in inconsistent sizes
returned to an application, causing it to read beyond the bounds of
the frame allocation.

BUG=webm:1443
BUG=b/62458770

Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342
2017-06-08 23:16:04 +00:00
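
The ordering problem described in the message above can be sketched as follows. This is a minimal illustration, not the actual vp8_decode_frame code: parse_key_frame_size() and FrameSize are hypothetical names, and the sketch assumes `data` points just past the 3-byte frame tag. The byte layout (start code 0x9d 0x01 0x2a followed by two little-endian 14-bit dimensions) follows the VP8 key frame header. The point is that the length check runs unconditionally, before any field is read, and is not gated on an error-resilience setting.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parsed dimensions. */
typedef struct {
  int width;
  int height;
} FrameSize;

/* Hypothetical helper. Returns 0 on success, -1 if the buffer is truncated
 * or the start code is wrong. `out` is only written on success, so a caller
 * never sees dimensions derived from bytes that were not present. */
static int parse_key_frame_size(const uint8_t *data, size_t size,
                                FrameSize *out) {
  /* 3 bytes of start code + 2 bytes width + 2 bytes height. */
  if (size < 7) return -1;
  if (data[0] != 0x9d || data[1] != 0x01 || data[2] != 0x2a) return -1;
  /* Lower 14 bits hold the dimension; the upper 2 bits hold a scale factor. */
  out->width = (data[3] | (data[4] << 8)) & 0x3fff;
  out->height = (data[5] | (data[6] << 8)) & 0x3fff;
  return 0;
}
```

A caller would treat a -1 return as a corrupt frame and keep its previously known frame size rather than reallocating or reporting new dimensions.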
Johann
e50ea014c3 Revert "buffer.h: use size_t"
This reverts commit f08581c1d0.

type conversion warnings abound.

Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00
2017-06-08 10:20:21 -07:00
Jerome Jiang
943f9ee25c Merge "Merge skin detection code in vp8/9." 2017-06-08 16:36:00 +00:00
Johann Koenig
903375a48a Merge "fdct16x16 neon optimization" 2017-06-08 15:19:36 +00:00
Jerome Jiang
658e854252 Merge skin detection code in vp8/9.
BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14
2017-06-07 21:20:34 -07:00
hui su
21d2273efa vp9 level targeting: more strict constraint on min_gf_interval
min_gf_interval should be no less than min_altref_distance + 1,
as the encoder may produce a bitstream with an alt-ref distance of
min_gf_interval - 1.

BUG=b/38450599

Change-Id: Ifb733daa643ebc668d1b23e1ce92db94b66dabe8
2017-06-07 17:40:25 -07:00
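
The constraint in the message above follows from one inequality: if the encoder can emit an alt-ref distance of min_gf_interval - 1, then keeping that distance at or above min_altref_distance requires min_gf_interval >= min_altref_distance + 1. A minimal sketch of such a clamp is shown below; constrain_min_gf_interval() is a hypothetical helper written for illustration, not the function changed in this commit.

```c
/* Hypothetical helper: raise min_gf_interval to the lower bound
 * min_altref_distance + 1 when it falls below it; otherwise leave it. */
static int constrain_min_gf_interval(int min_gf_interval,
                                     int min_altref_distance) {
  const int lower_bound = min_altref_distance + 1;
  return min_gf_interval < lower_bound ? lower_bound : min_gf_interval;
}
```

For example, with min_altref_distance = 6, a requested min_gf_interval of 4 would be raised to 7, while a requested value of 10 would pass through unchanged.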
Johann
eae7cf2368 fdct16x16 neon optimization
Roughly 2x speedup. Since the only change for HBD is to store(), the
improvement appears to hold there as well.

BUG=webm:1424

Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19
2017-06-07 14:59:55 -07:00
Marco Paniconi
9cea3a3c4e Merge "vp9: SVC: Enable simple_block_yrd for temporal layers." 2017-06-07 21:12:14 +00:00
Johann Koenig
0c4f74d129 Merge changes Iade45f69,I18d90658,Ieca3f1ef
* changes:
  buffer.h: add num_elements_
  buffer.h: zero-init all values
  buffer.h: use size_t
2017-06-07 19:20:16 +00:00
Marco
14d4718043 vp9: SVC: Enable simple_block_yrd for temporal layers.
Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condition for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed
2017-06-07 11:41:50 -07:00
Johann
902d63759e buffer.h: add num_elements_
raw_size_ was being incorrectly computed and used

Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d
2017-06-07 11:31:20 -07:00
Johann
4a37e3e2a0 buffer.h: zero-init all values
Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8
2017-06-07 11:27:26 -07:00
Johann
f08581c1d0 buffer.h: use size_t
Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f
2017-06-07 11:24:27 -07:00
Marco
13b02a8efe vp9: SVC: Enable row-mt in sample encoder.
Change-Id: I4b51043cb3f5955efe947fe4685aed4a21adb8bd
2017-06-07 10:32:44 -07:00
James Zern
ff42e04f9c Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}" 2017-06-06 23:52:39 +00:00
Marco Paniconi
27b34a109d Merge "vp9: SVC: Adjust some speed settings for SVC speed >= 7." 2017-06-06 23:07:45 +00:00
Marco
7d2f5f8e9d vp9: SVC: Adjust some speed settings for SVC speed >= 7.
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
2017-06-06 15:30:24 -07:00
Johann
de4cb716ee buffer.h: split out init
Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287
2017-06-06 15:02:50 -07:00
Johann
8659764a07 buffer.h: Use T for values
Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9
2017-06-06 12:05:14 -07:00
Jerome Jiang
cf07d85809 Initialize cost_list all to INT_MAX.
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
2017-06-06 10:42:37 -07:00
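
The { INT_MAX, 0, ... } state mentioned above is what C produces for a partial initializer: elements without an explicit value are zero-initialized. The sketch below contrasts that with filling every element. COST_LIST_LEN and init_cost_list() are hypothetical, since the commit message gives neither the array length nor the code that was changed.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical length; the commit message does not state the array size. */
#define COST_LIST_LEN 5

/* Set every element to INT_MAX. By contrast, a declaration such as
 *   int cost_list[COST_LIST_LEN] = { INT_MAX };
 * sets only element 0 to INT_MAX and zero-initializes the rest. */
static void init_cost_list(int *cost_list, size_t len) {
  size_t i;
  for (i = 0; i < len; ++i) cost_list[i] = INT_MAX;
}
```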
James Zern
6df142e2ab vp9_mcomp,get_cost_surf_min: quiet conversion warning
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-05 22:52:58 -07:00
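
The warning described above typically arises from a pattern like `int64_t x = 1 << n;`, where the shift is evaluated in 32-bit int and only then widened. The sketch below shows the two usual ways to make the intent explicit. Both function names are hypothetical, and the actual change to vp9_mcomp.c is not reproduced here; per the message, that change took the first route because int storage was sufficient.

```c
#include <stdint.h>

/* Shift computed and stored as int: nothing is widened, so there is no
 * implicit 32-bit to 64-bit conversion to warn about. Valid for 0 <= n < 31. */
static int scale_int(int n) { return 1 << n; }

/* When a 64-bit result is really needed, widen the operand before shifting
 * so the shift itself is performed in 64 bits. Valid for 0 <= n < 63. */
static int64_t scale_wide(int n) { return (int64_t)1 << n; }
```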
Jerome Jiang
968a5d6bc2 Merge "Fix valgrind failure on uninitialized variables." 2017-06-06 03:47:31 +00:00
James Zern
4753c23983 Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx" 2017-06-06 02:19:41 +00:00
Jerome Jiang
ffe0f9b7fb Fix valgrind failure on uninitialized variables.
BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-05 13:09:29 -07:00
Jerome Jiang
f3a9ae5baa Fix ubsan failure in vp9_mcomp.c.
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-06-02 21:37:13 -07:00
Marco
e30781ff80 vp9: SVC: Force subpel search off under certain conditions.
For SVC 1 pass non-rd mode:
Force subpel search off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quality loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
2017-06-01 20:48:52 -07:00
Marco Paniconi
ff637d1903 Merge "vp9: Speed >8: Set subpel_search_method for low motion." 2017-06-01 23:57:19 +00:00