17918 Commits

Author SHA1 Message Date
Shiyou Yin
f70de09f2a vp8: [loongson] optimize dct with mmi
1. vp8_short_fdct4x4_mmi
2. vp8_short_fdct8x4_mmi
3. vp8_short_walsh4x4_mmi

Change-Id: I89a7df25cfd09fae309fac257ad8b6a3dc1c8acb
2017-10-12 08:50:04 +08:00
Shiyou Yin
bc4098a8e9 Merge "vp8: [loongson] optimize quantize with mmi" 2017-10-12 00:33:17 +00:00
Marco
72c69e14ad Adjust threshold in datarate tests for 1 pass VBR
Small increase in threshold for the 1 pass VBR datarate tests.
Needed due to commit:
<017257a Adjustment to scene detection and key frame>

Change-Id: I28b3bd7db2192a8cc2bccc3cb0e3b8dbb910ca16
2017-10-11 11:48:36 -07:00
Shiyou Yin
e8ed2bb762 vp8: [loongson] optimize quantize with mmi
1. vp8_fast_quantize_b_mmi
2. vp8_regular_quantize_b_mmi

Change-Id: Ic6e21593075f92c1004acd67184602d2aa5d5646
2017-10-11 16:45:58 +08:00
Linfeng Zhang
16166bfdaa Add 4 to 1 scaling x86 optimization
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 16:24:06 -07:00
Jerome Jiang
dcfae2cc64 Merge "Fix alignment in vpx_image without external allocation." 2017-10-10 23:02:05 +00:00
Jerome Jiang
33c598990b Fix alignment in vpx_image without external allocation.
This restores the behavior prior to
<5a40c8fde1 Fix image width alignment. Enable ImageSizeSetting test.>.

BUG=b/64710201

Change-Id: I559557afe80d5ff5ea6ac24021561715068e7786
2017-10-10 14:26:17 -07:00
Marco
017257a317 Adjustment to scene detection and key frame.
For 1 pass vbr: use higher threshold on avg_sad
and force key frame under scene cut detection if
above the threshold. Allow it for speed >= 6 for now,
since it does not use the full nonrd_pickmode partition
(as in speed 5).

Improves quality somewhat on scene cut frames.
Neutral on overall metrics and fps for speed 6 on
ytlive set.

Change-Id: I12626f7627419ca14f9d0d249df86c7104438162
2017-10-10 11:20:05 -07:00
Linfeng Zhang
963cc22cef Merge changes I9d4c1af5,I882da3a0
* changes:
  Rename some inline functions in NEON scaling
  Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 17:29:50 +00:00
Linfeng Zhang
27d21a3d13 Rename some inline functions in NEON scaling
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-09 11:23:00 -07:00
Linfeng Zhang
e1ae3772da Merge "Update vp9_scale_and_extend_frame_ssse3()" 2017-10-09 16:20:00 +00:00
James Zern
807248ec81 Merge "ppc: Add vpx_idct32x32_1024_add_vsx" 2017-10-07 19:08:26 +00:00
Marco Paniconi
5bc4c37a89 Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."" 2017-10-06 22:41:34 +00:00
Marco Paniconi
bcbc6ed82d Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."
This reverts commit 9311ef18b4b4eff0da3adf9d702a34f489a270ff.

Reason for revert:
Noticed a small regression in some clips.
Will revisit in another change.

Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
> 
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
> 
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-10-06 22:14:56 +00:00
Marco
e405eb06b1 Adjust threshold in scene detection
For 1 pass vbr: increase min_thresh slightly, and also add
condition on golden/arf update for using full nonrd_pick_partition.

Reduces possible false detection for scene cut detection.

Neutral/small change in metrics or speed for speed 5.

Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76
2017-10-06 11:08:56 -07:00
Marco Paniconi
7af6c6c9ca Merge "Speed >=5 real-time: add TM intra mode for high_source_sad." 2017-10-06 06:29:46 +00:00
Marco
9311ef18b4 Speed >=5 real-time: add TM intra mode for high_source_sad.
Small/neutral change in metrics or speed for ytlive.
Some improvement in quality on frames with big content change.

Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d
2017-10-05 23:07:03 -07:00
James Zern
d2fb834ebd Merge "vpx_codec.h: namespace local defines" 2017-10-06 05:30:16 +00:00
James Zern
e8ed030da3 vpx_codec.h: namespace local defines
Add a VPX_ prefix to UNUSED/*DEPRECATED to avoid conflicts with other headers.

Change-Id: Ie16bdac3575bc1af57a05d37e65b994370585377
2017-10-05 15:12:20 -07:00
James Zern
107eb6a9d4 vp9_ethread_test: abort early/add more detailed output
If compare_fp_stats fails, report the two values and their index.

Change-Id: I927a832b7a1e24c392961093b7caee1134223def
2017-10-05 15:02:51 -07:00
Marco Paniconi
e095bcce44 Merge "Adjust threshold for adapt_partition for speed 6." 2017-10-05 03:28:06 +00:00
Marco
18262a8576 Adjust threshold for adapt_partition for speed 6.
Lower SAD threshold to select non_rd pickmode partition
at superblock level more often.
Small gain in metrics, small/negligible decrease in speed.

Change-Id: I0f728236b91a604e4ca7e02039adc54d5985c4dc
2017-10-04 18:04:09 -07:00
Marco Paniconi
014976c251 Merge "Avoid nonrd_pick_partition for speed >= 6." 2017-10-04 23:36:27 +00:00
Marco
4bc1fc58b6 Avoid nonrd_pick_partition for speed >= 6.
For 1 pass vbr speed >= 6: when REFERENCE_PARTITION is selected,
avoid doing the full nonrd_pickmode based partition.
No change in overall metrics or speed.
Reduces encode times on scene cuts by 10-20%.

Change-Id: I0310b1610cc1c83793a509e0a9059840e8f18308
2017-10-04 15:31:54 -07:00
Marco Paniconi
6a42bdd25f Merge "Modify early exit for alt_ref in nonrd_pickmode." 2017-10-04 19:38:49 +00:00
Linfeng Zhang
127864deb3 Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
Change-Id: I882da3a04884d5fabd4cd591c28682cbb2d76aa5
2017-10-04 12:35:39 -07:00
Linfeng Zhang
b809442521 Update vp9_scale_and_extend_frame_ssse3()
Change-Id: I22622faebfcc36f7a4d1f37e3800ae8ab87c8cd4
2017-10-04 12:32:30 -07:00
Marco
77e51e2035 Modify early exit for alt_ref in nonrd_pickmode.
For 1 pass vbr mode:
On no-show_frame/ARF: instead of skipping alt_ref_frame
completely in mode testing, allow for checking (0, 0) on alt_ref.

Small gain in metrics, ~0.18%, no change in speed.

Change-Id: I32a3c24faca64ab70dd5091071a0dc301db7dd1e
2017-10-04 11:53:39 -07:00
Linfeng Zhang
9a71811d98 Merge changes Id6a8c549,Ib1e0650b,Ic369dd86
* changes:
  Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
  Add vpx_dsp/x86/mem_sse2.h
  Add transpose_8bit_{4x4,8x8}() x86 optimization
2017-10-04 16:15:14 +00:00
Jerome Jiang
ffa3a3c441 Merge "Fix image width alignment. Enable ImageSizeSetting test." 2017-10-04 14:48:03 +00:00
Marco
98dbf31c87 Enable arf usage for speed >= 6, 1 pass vbr.
For speed 6 on ytlive set:
On average, speed slowdown ~5%, quality gain ~2%.

Change-Id: Ia18237cc1d52c54d7e2cb3c71f571cf37ef61b44
2017-10-03 17:18:33 -07:00
Marco
ab2bd340ac vp9: 1 pass vbr: Limit qpdelta on high_source_sad.
For 1 pass vbr: when significant content/scene change is detected
(high_source_sad = 1) reduce/turn off the additional qdelta on the
active_worst_quality. This helps somewhat to reduce the occurrence
of large frame sizes and large encode times.
Allow it only when use_altref_onepass is enabled.

Neutral/no change on metrics.

Change-Id: I1dd97dd2ab892d65f707b841b27a5de300b714ea
2017-10-03 16:27:17 -07:00
James Zern
66b6b87471 Merge "vpx: fix nasm build errors" 2017-10-03 21:47:49 +00:00
Scott LaVarnway
bc4bc9b622 vpx: fix nasm build errors
BUG=webm:1462,766721

Change-Id: Icfa536a8e38623636b96c396e3c94889bfde7a98
2017-10-03 20:02:21 +00:00
Linfeng Zhang
6543213e87 Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac
2017-10-03 13:02:05 -07:00
Linfeng Zhang
0f756a307d Add vpx_dsp/x86/mem_sse2.h
Add some load and store sse2 inline functions.

Change-Id: Ib1e0650b5a3d8e2b3736ab7c7642d6e384354222
2017-10-03 12:59:05 -07:00
Marco
c8678fb7f3 Use adapt_partition for ARF in 1 pass.
For speed 6 real-time mode: use adapt_partition
on ARF frame instead of REFERENCE_PARTITION (which is slower).
This requires enabling compute_source_sad_onepass for no-show_frames.

Speedup of ~3-5% on some clips that heavily use ARF,
small loss (~0.2%) in quality on ytlive set.

Change-Id: Ib50acc97df06458244a6ac55d2bd882c30012536
2017-10-03 11:49:55 -07:00
Linfeng Zhang
67c38c92e7 Add transpose_8bit_{4x4,8x8}() x86 optimization
Change-Id: Ic369dd86b3b81686f68fbc13ad34ab8ea8846878
2017-10-03 10:00:30 -07:00
Marco Paniconi
fe7b869104 Merge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode." 2017-10-03 03:01:14 +00:00
Marco
33e10dfa7e ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.
Speedup of ~2-3% on 1080p clips speed 6.
Neutral/negligible loss in metrics on ytlive.

Change-Id: I7ac47a4d8b58c566920bae29a94a0e8d59c36dee
2017-10-02 19:04:03 -07:00
Linfeng Zhang
0e55b0b0a7 Add 4 to 3 scaling NEON optimization
Speed compared with the version calling vpx_scaled_2d_neon():
  ~1.7x in general
  ~2.8x for BILINEAR filter

BUG=webm:1419

Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6
2017-10-02 15:04:09 -07:00
Linfeng Zhang
2c560c3c22 Specialize 4 to 3 frame scaling in C
Scale 3x3 block instead of 16x16 block in each loop. Disabled by
default.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3.
   Optimization code will be smaller and faster.
2. Maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: I59a1f7496d89a1b090498c935d30cfcf1d0c282b
2017-10-02 11:56:15 -07:00
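
The phase figures quoted in 2c560c3c22 above (16 versus 3 distinct phases, 5/16 versus 1/24 maximum drift) can be checked with a little arithmetic. The sketch below is illustrative only and is not libvpx code: it assumes 1/16-pel phases, an integer step truncated from 64/3 to 21 sixteenths per output pixel, and a phase that restarts at 0 at the start of every block. The names phase_stats, SUBPEL and step_q4 are made up for this example.

```python
from fractions import Fraction

SUBPEL = 16  # assumed sub-pixel precision: positions tracked in 1/16 pel


def phase_stats(block, src=4, dst=3):
    """Return (distinct phases, max drift in pixels) within one block.

    block is the number of output pixels processed before the position
    is re-synchronized (16 for a 16x16 block, 3 for a 3x3 block).
    """
    step_q4 = SUBPEL * src // dst        # truncated integer step: 21
    ideal = Fraction(SUBPEL * src, dst)  # exact step: 64/3 sixteenths
    phases = set()
    max_drift = Fraction(0)
    for i in range(block):
        pos_q4 = i * step_q4             # position actually used
        phases.add(pos_q4 % SUBPEL)      # filter phase for this pixel
        drift = (i * ideal - pos_q4) / SUBPEL  # error in whole pixels
        max_drift = max(max_drift, drift)
    return len(phases), max_drift


if __name__ == "__main__":
    print(phase_stats(16))  # 16-pixel blocks
    print(phase_stats(3))   # 3-pixel blocks
```

Under these assumptions each step loses (64/3 - 21)/16 = 1/48 of a pixel, which is the 1/(3*16) per-step drift the commit message mentions. Fifteen steps give 15/48 = 5/16 and two steps give 2/48 = 1/24.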
Scott LaVarnway
3c700052be Merge "vpxdsp: [x86] add highbd_d135_predictor functions" 2017-10-02 15:00:19 +00:00
Alexandra Hájková
fb7fc1dbda ppc: Add vpx_idct32x32_1024_add_vsx
Change-Id: I55cd0a1569ccc47a53d0ecf751aac259d510e10d
2017-09-30 19:31:20 +00:00
Marco
c8f6e7b99e Fix partition selection in speed features for arf overlay frame.
For real-time mode: move the switch to fixed partition
for is_src_frame_alt_ref so all speeds may use it
if use_altref_onepass is set.

Improves metrics by ~2% for ytlive set at speed 4
(where use_altref_onepass is currently used).

Change-Id: I033240386598c9dbd0364da89ccbcca64bc663ee
2017-09-29 15:02:28 -07:00
Marco
f2c3d0a7a3 Enable use_altref_onepass for speed 4 real-time mode.
Used for VBR mode with lag-in-frames > 0.
On ytlive set at speed 4: ~3% average gain.

Change-Id: I45dad1700bf8be9d8f177815dc062774f6f2f0de
2017-09-29 10:56:14 -07:00
Scott LaVarnway
3bbd62ed27 vpxdsp: [x86] add highbd_d135_predictor functions
C vs SSE2 speed gains:
_4x4 : ~1.81x

C vs SSSE3 speed gains:
_8x8 : ~1.96x
_16x16 : ~1.88x
_32x32 : ~2.02x

BUG=webm:1411

Change-Id: Iefaf8b39afbbfe34c1ad1d21e3a003b20f1f61e0
2017-09-29 08:56:38 -07:00
Scott LaVarnway
4cae64c32c vpxdsp: [x86] add highbd_d117_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.04x

C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x

BUG=webm:1411

Change-Id: I31d949695991c067dac89d91e0bed3e666c94993
2017-09-28 14:45:28 -07:00
Jerome Jiang
5a40c8fde1 Fix image width alignment. Enable ImageSizeSetting test.
BUG=b/64710201

Change-Id: I5465f6c6481d3c9a5e00fcab024cf4ae562b6b01
2017-09-28 11:25:24 -07:00
Marco
a2ef180dd0 Set rc->high_source_sad = 0 before scene detection.
Only has effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frame
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f
2017-09-28 10:49:45 -07:00