Commit Graph

6400 Commits

Author SHA1 Message Date
Yaowu Xu
b73c9df1a4 Merge "No longer use use_lastframe_partitioning speed feature" 2014-09-08 18:10:20 -07:00
Paul Wilkins
f24054574d Fix VS build issue.
Compile fails when CONFIG_INTERNAL_STATS
flag is set.

Change-Id: Iba7701c058169ca3fc0b9008619ac55a1fe1a8b6
2014-09-08 15:29:33 -07:00
Johann
c731d6a4f1 Merge "Fixing Mac OS build." 2014-09-08 11:36:03 -07:00
Alex Converse
b932c6c5dd BITSTREAM CLARIFICATION: Forbid referencing across color spaces.
Check image format of reference frames.

Change-Id: I7d8d7f097ba547839ff9cec3880bd15a4948ee06
2014-09-08 11:12:09 -07:00
Dmitry Kovalev
980abf6078 Fixing Mac OS build.
Change-Id: Ifae8906185a868a07685eb7a7da2484af95e70a7
2014-09-08 08:53:12 -07:00
Jingning Han
a61973bf29 Merge "Enable adaptive motion search for ARF coding" 2014-09-08 08:51:05 -07:00
Dmitry Kovalev
1f19ebbab6 Replacing vp9_get_mb_ss_sse2 asm implementation with intrinsics.
Change-Id: Ib4f5dd733eb2939b108070a01e83da5d9990bac0
2014-09-06 00:10:25 -07:00
James Zern
49bb8fbaca vp9_pick_inter_mode: normalize some types
Change-Id: I4c74dcab6358817f03d3bc4d526006d241f0c10e
2014-09-05 19:22:54 -07:00
James Zern
7fe86bba2e vp9_pick_inter_mode: cosmetics: localize var. defs
Change-Id: Ifbfc142291697a1847ef85ced0b0eb4d6dab161e
2014-09-05 19:22:54 -07:00
James Zern
6f094e2a71 vp9_pick_inter_mode: cosmetics: add const
Change-Id: I2450b4856e48dbc4d5b938b2edcea0704f756c8e
2014-09-05 19:22:53 -07:00
James Zern
0adfacad75 vp9_pick_inter_mode: cosmetics: fix indent
+ delete a dead comment

Change-Id: Ibdb07f6dbdb30fc7888f6115ddc326fcec1157a7
2014-09-05 19:22:53 -07:00
James Zern
5ed806a608 vp9_pickmode: move PRED_BUFFER definition to .c
Change-Id: I3737772fe53f9885c82e2ac4c1af478ab951c16c
2014-09-05 19:22:53 -07:00
James Zern
94968c6d14 vp9_pickmode: make vp9_pick_inter_mode() void
the previous return value was constant and unused.

Change-Id: Ic3be55edb4a884448c7bb07977a80dfb58b7b940
2014-09-05 19:22:53 -07:00
Dmitry Kovalev
70092af5c0 Cleaning up and speeding up vp9_idct32x32_1024_add_sse2().
Change-Id: If91017b792572c9db6e257011ca307bef8428486
2014-09-05 18:12:30 -07:00
Dmitry Kovalev
89963bf586 Merge "Removing postproc mmx code." 2014-09-05 18:11:08 -07:00
Yunqing Wang
1092140379 No longer use use_lastframe_partitioning speed feature
The speedup in the rd_pick_partition() function makes it possible
to drop the use_lastframe_partitioning feature. By doing that, we
achieve a good PSNR gain with a small speed loss. This also makes
the encoding loop less complicated. A code cleanup patch will
follow.

Borg tests showed:
1. At speed 2,
   stdhd set: 0.201% PSNR gain, 0.133% SSIM gain;
   derf set:  0.262% PSNR gain, 0.276% SSIM gain.
2. At speed 3,
   stdhd set: 0.139% PSNR gain, 0.109% SSIM gain;
   derf set:  0.447% PSNR gain, 0.442% SSIM gain.

The average speed loss over selected test clips is within 1%
with the worst case of 4%.

Change-Id: Icfd2ded7869372b585a6972855d933b3d0280d90
2014-09-05 16:24:41 -07:00
Yunqing Wang
ebac8f3487 Merge "Correct the mode decisions in special cases" 2014-09-05 13:45:41 -07:00
Dmitry Kovalev
54bec0971f Merge "Initializing intra modes without vpx_once()." 2014-09-05 12:03:36 -07:00
James Zern
46faaeeffb Merge changes I7b9f40dc,I76e74f2e
* changes:
  vp9: correct context buffer resize check
  vp9: fail decode if block/frame refs are corrupt
2014-09-05 12:01:59 -07:00
Yunqing Wang
1dd9a63929 Correct the mode decisions in special cases
The rate costs calculated for inter modes are not precise in some
cases, which causes NEWMV to be chosen instead of NEARESTMV, NEARMV,
or ZEROMV. This patch adds checks for these cases and corrects
the mode decisions.

Borg tests at speed 3 showed:
1. stdhd set: 0.102% PSNR gain and 0.088% SSIM gain.
2. derf set:  0.147% PSNR gain and 0.132% SSIM gain.
No speed change.

Change-Id: I35d17684b89ad4734fb610942d707899146426db
2014-09-05 12:01:07 -07:00
James Zern
6f980c6a1e Merge "fix x86-darwin* build" 2014-09-05 11:58:55 -07:00
James Zern
2886b91790 Merge "vp9: skip loopfilter when the frame is corrupt" 2014-09-05 11:58:09 -07:00
Dmitry Kovalev
1100e262c5 Removing postproc mmx code.
Removed functions:
* vp9_post_proc_down_and_across_mmx
* vp9_mbpost_proc_down_mmx
* vp9_plane_add_noise_mmx

They all have SSE2 equivalents.

Change-Id: I59c1fac12b7c96ca4538d455e4400c2b7875feff
2014-09-05 11:52:50 -07:00
Dmitry Kovalev
02a0c51e50 Merge "Adding temp cpi var." 2014-09-05 10:31:41 -07:00
James Zern
a8083449e9 fix x86-darwin* build
vp9_variance_sse2.c contains a mix of intrinsics and references to
assembly which uses x86inc.asm; it's conditionally included as a result.

Change-Id: I254451483a65881c0b8e18e27bf0c3ddef60c4ec
2014-09-04 23:32:13 -07:00
James Zern
bb4950dfdf vp9: correct context buffer resize check
allocations within vp9_alloc_context_buffers() rely on mi_rows/mi_cols
individually, use those to determine whether to realloc rather than
stride and stride * rows. this fixes a crash with some fuzzed files for
invalid accesses into last_frame_seg_map and above_context.

Change-Id: I7b9f40dcf170d443890f3bd2acd285507943c7d4
2014-09-04 19:14:21 -07:00
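
The resize check described in the commit above can be sketched roughly as follows. This is a minimal illustration rather than the libvpx code: the `ContextDims` struct and the `needs_realloc()` helper are hypothetical names, and only `mi_rows` and `mi_cols` come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical record of the dimensions the buffers were last allocated for. */
typedef struct {
  int alloc_mi_rows;
  int alloc_mi_cols;
} ContextDims;

/* Returns true when either dimension exceeds what was allocated.
 * Comparing only a product such as stride * rows can miss the case where
 * one dimension grows while the other shrinks, so each one is checked
 * on its own. */
static bool needs_realloc(const ContextDims *dims, int mi_rows, int mi_cols) {
  return mi_rows > dims->alloc_mi_rows || mi_cols > dims->alloc_mi_cols;
}
```

With buffers allocated for 10 rows by 20 columns, a request for 5 rows by 40 columns has the same product but still needs a reallocation, which is the kind of case a combined check would let through.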
James Zern
440f5097c7 vp9: fail decode if block/frame refs are corrupt
proceeding using a corrupt (incompletely decoded) frame reference may
lead to incorrect assumptions about allocation sizes leading to a crash.

Change-Id: I76e74f2e1be127c2e2c7e1174bb3307497dfd23d
2014-09-04 19:14:00 -07:00
Dmitry Kovalev
c7b925c3fe Merge "Removing sz member from vpx_codec_priv. " 2014-09-04 17:28:22 -07:00
Dmitry Kovalev
ce1c9228d4 Merge "Removing unused function prototypes." 2014-09-04 17:28:16 -07:00
JackyChen
f8e5105b47 Merge "Map motion magnitude in VP9 denoiser." 2014-09-04 16:59:53 -07:00
Jingning Han
d435148fe6 Enable adaptive motion search for ARF coding
This commit turns on adaptive motion search for ARF coding, in
addition to other normal inter frame coding. It improves the
average compression efficiency:

stdhd 0.1%
derf  0.04%

For the test sequences, the speed 3 runtime is reduced:

pedestrian 1080p 2000 kbps, 149932 ms -> 144580 ms, (3.3% speed-up)
bus CIF 1000 kbps, 8050 ms -> 7895 ms, (1.9%)
highway CIF 100 kbps, 45033 ms -> 44078 ms, (2.2%)

Change-Id: I5228565b609f99e8ae04f6140a2bf2b64a831d21
2014-09-04 16:26:40 -07:00
Jingning Han
3de038f396 Merge "Speed up compound inter prediction mode check" 2014-09-04 16:09:07 -07:00
JackyChen
b869b970c1 Merge "Update the condition when COPY_BLOCK is chosen." 2014-09-04 15:48:22 -07:00
Dmitry Kovalev
46b83391e2 Merge "Removing local set_speed_features() function." 2014-09-04 15:36:52 -07:00
JackyChen
b1153f34d4 Map motion magnitude in VP9 denoiser.
This keeps the behavior the same as the VP8 denoiser.
If the motion magnitude is small,
make the denoiser more aggressive.

Change-Id: I942a6e2f2ed9aec6f0c4c1f9e5fa47066cadcc0c
2014-09-04 14:53:33 -07:00
Dmitry Kovalev
7897059e8b Adding temp cpi var.
Change-Id: Ifa3c1cc2317c1bc21d1042b9662b35056d1e9ed0
2014-09-04 14:51:29 -07:00
Dmitry Kovalev
91998e638e Removing sz member from vpx_codec_priv.
Change-Id: I811526a9ee9f237604f72abe7fc677e39e0f457f
2014-09-04 14:47:42 -07:00
JackyChen
d75266f141 Update the condition when COPY_BLOCK is chosen.
The change just keeps the condition the same as in VP8.

Change-Id: I9662b40996126605945dd853c0cbe8916c1ce578
2014-09-04 14:28:12 -07:00
Dmitry Kovalev
490943552f Removing unused function prototypes.
Change-Id: Ia5e383e2cf18052f6f1eacf8b9495ab8e4d58878
2014-09-04 14:26:30 -07:00
JackyChen
7ba600dc89 Merge "Fix a bug in VP9 denoiser." 2014-09-04 14:16:26 -07:00
Dmitry Kovalev
27db51c602 Merge "Adding sse2 variant for vp9_mse{8x8, 8x16, 16x8}." 2014-09-04 13:59:37 -07:00
JackyChen
e30f7698f5 Fix a bug in VP9 denoiser.
When the first try of denoising turns out to be too much,
we will use a softer filter by adopting an adjustment to
make the result closer to original pixel (as in VP8 denoiser).
The old code made the adjustment in the wrong direction.

Change-Id: I84e28fa9e01eef47c5a37d5a2e6d3d378a06786b
2014-09-04 11:46:36 -07:00
Dmitry Kovalev
3820f568da Merge "Consistent allocation of vpx_codec_alg_priv_t." 2014-09-03 19:41:28 -07:00
Dmitry Kovalev
48197f0a70 Adding sse2 variant for vp9_mse{8x8, 8x16, 16x8}.
Change-Id: I6786d25ce4f32b8d8912f2d239a45ca15b310c4b
2014-09-03 19:02:14 -07:00
Dmitry Kovalev
ab73dba65f Merge "Replacing asm 16x16 variance calculation with intrinsics." 2014-09-03 18:57:33 -07:00
Dmitry Kovalev
406404af63 Merge "Small cleanup: reusing existing code." 2014-09-03 18:57:25 -07:00
Jingning Han
d62d804e64 Speed up compound inter prediction mode check
This commit allows the encoder to store outcomes of single reference
frame modes and compares them to decide if the inter prediction
filter, forward transform, and quantization can be skipped.

The compression performance of speed 3 is down
derf  -0.364%
stdhd -0.198%

For test sequences, the speed 3 runtime is reduced
highway CIF 100 kbps, 51976 ms -> 45033 ms, 13% speed-up
stockholm 720p 1000 kbps, 71826 ms -> 67838 ms, 5.5% speed-up
pedestrian 1080p 2000 kbps, 154924 ms -> 150702 ms, 2.6% speed-up

Change-Id: I5aa26f918d2b4b5197a2c0afa2779319f1c88e44
2014-09-03 15:28:01 -07:00
Yaowu Xu
7ab5de04fd Merge "Change last_partition_redo_frequency for speed 3" 2014-09-03 14:57:02 -07:00
Yaowu Xu
44879ceea7 Merge "Remove redundant code" 2014-09-03 14:55:28 -07:00
Dmitry Kovalev
7f4c3b8d93 Merge "Cleaning up vp9_variance_avx2.c." 2014-09-03 13:21:38 -07:00
Yaowu Xu
ad3616a1fb Merge "Merge two similar functions into one" 2014-09-03 13:00:02 -07:00
Dmitry Kovalev
a7ccc12973 Small cleanup: reusing existing code.
Change-Id: Iac4775ad98e988f2b9cf5bd0dc91ab994d0262ce
2014-09-03 12:20:29 -07:00
Dmitry Kovalev
4eab7c28b8 Merge "Removing duplicated code." 2014-09-03 12:11:37 -07:00
Yaowu Xu
9a15835812 Merge "select_tx_mode(): remove special case for key frame" 2014-09-03 11:54:44 -07:00
Dmitry Kovalev
bf778e7d8e Initializing intra modes without vpx_once().
Change-Id: I0a9d52432f2500f1bd8f43f229e70e38bb9a0343
2014-09-03 11:39:02 -07:00
Yaowu Xu
e759d95743 Merge two similar functions into one
intra_super_block_yrd() and inter_super_block_yrd() are largely the same;
this commit merges them into one to reduce code duplication.

Change-Id: I64d7042a5b099345627cf55663010c185b25ec37
2014-09-03 11:21:06 -07:00
Dmitry Kovalev
095d48a419 Merge "Removing clear_system_state() call from update_coef_probs()." 2014-09-03 11:05:45 -07:00
Dmitry Kovalev
b08fab8808 Consistent allocation of vpx_codec_alg_priv_t.
Change-Id: I5a03496de035fbcf31e4527cd25fcae4627a57a0
2014-09-03 11:01:21 -07:00
Minghai Shang
759afe525c Merge "[svc] Temporal svc with two pass rate control" 2014-09-03 10:51:19 -07:00
Yaowu Xu
7a33712475 Change last_partition_redo_frequency for speed 3
Change it from 3 to 2, which appears to be slightly positive for
compression on all test sets and also reduces encoding time by 2%-5%,
depending on the test clip.

Change-Id: If045417bd27311700c919b4a335eff0dc1130ae0
2014-09-03 09:34:10 -07:00
Yaowu Xu
cdda17ed77 Remove redundant code
Change-Id: I453b167f03811a3cd3592089593b3f2823f62ab3
2014-09-03 09:34:10 -07:00
Yaowu Xu
c1058e5bbe select_tx_mode(): remove special case for key frame
This commit removes the special case for key frame, as transform size
decision is controlled by the appropriate speed feature for all lossy
coding modes: tx_size_search_method.

Change-Id: I9677171e3f2432ec23705f7c5ea8170dd4562fae
2014-09-03 09:34:10 -07:00
Paul Wilkins
819e231b93 Merge "Skip comp inter mode test in RD loop with same frame bias signs" 2014-09-03 02:26:47 -07:00
Jingning Han
801fef26ec Skip comp inter mode test in RD loop with same frame bias signs
This commit allows the encoder to skip check on compound inter
modes in the rate-distortion optimization loop, if the reference
frame bias signs are the same.

Change-Id: Ib753e6bb11cbdd338aee69dbe2b649671f75a6b0
2014-09-02 18:17:33 -07:00
Dmitry Kovalev
070210e20b Removing duplicated code.
Change-Id: I7b5c776d5e6f5ca428b87fa9411ae4012a9538ba
2014-09-02 17:57:35 -07:00
Dmitry Kovalev
0ecc75c819 Merge "Removing MMX SAD calculation code." 2014-09-02 17:35:59 -07:00
Deb Mukherjee
a4ef1a0819 Merge "Adds config opt for highbitdepth + misc. vpx" 2014-09-02 15:41:27 -07:00
Dmitry Kovalev
318fc0c34f Removing MMX SAD calculation code.
Removed functions:
* vp9_sad_16x16_mmx
* vp9_sad_8x16_mmx
* vp9_sad_16x8_mmx
* vp9_sad_8x8_mmx
* vp9_sad_4x4_mmx

Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3
2014-09-02 14:41:36 -07:00
Deb Mukherjee
5acfafb18e Adds config opt for highbitdepth + misc. vpx
Adds config parameter vp9_highbitdepth, to support highbitdepth profiles.
Also includes most vpx level high bit-depth functions. However
encode/decode in the highbitdepth profiles will not work until
the rest of the code is in place.

Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6
2014-09-02 14:37:10 -07:00
Dmitry Kovalev
6f6bd282c9 Replacing asm 16x16 variance calculation with intrinsics.
New code is 20% faster for 64-bit and 15% faster for 32-bit. Compiled
using clang.

Change-Id: Icfea461238411001fd093561293dbfedfbf8d0bb
2014-09-02 13:54:34 -07:00
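
For readers unfamiliar with what a 16x16 variance routine computes, here is a scalar reference sketch in plain C. It is not the intrinsics code from the commit above: the function name `variance16x16_ref` and its parameter layout are assumptions made for illustration, and it uses the standard identity variance = SSE - sum^2 / N with N = 256.

```c
#include <stdint.h>

/* Scalar reference for the variance of a 16x16 block of pixel differences.
 * Hypothetical helper for illustration; not the libvpx function. */
static unsigned int variance16x16_ref(const uint8_t *src, int src_stride,
                                      const uint8_t *ref, int ref_stride,
                                      unsigned int *sse) {
  int sum = 0;          /* sum of differences, at most 256 * 255 in magnitude */
  unsigned int sq = 0;  /* sum of squared differences */
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 16; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq += (unsigned int)(diff * diff);
    }
  }
  *sse = sq;
  /* sum * sum can exceed INT_MAX, so square it in 64 bits before dividing
   * by the 256 pixels in the block. */
  return sq - (unsigned int)(((int64_t)sum * sum) / 256);
}
```

A SIMD version would produce the same two accumulators (sum and sum of squares) several pixels at a time; the final subtraction is unchanged.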
Minghai Shang
be3b08da3e [svc] Temporal svc with two pass rate control
It is built on the current spatial svc code.
Only one spatial layer with two temporal layers is supported at this time.
Change-Id: I1fdc8584354b910331e626bfae60473b3b701ba1
2014-09-02 12:05:14 -07:00
Jingning Han
33176fef87 Skip comp inter mode tests for arf coding
This commit skips the compound inter mode prediction check in the
rate-distortion optimization loop for ARF coding. It reduces the
runtime for certain test clips at speed 3, at no compression
performance change:

bus CIF 1000 kbps, 8260 ms -> 8090 ms, 1.8% speed-up
stockholm 720p 1000 kbps, 74453 ms -> 71826 ms, 2.9% speed-up

No visible speed-up for pedestrian area 1080p at 2000 kbps.

Change-Id: Ic68aa56837159b726563b784e2e3729e846465ad
2014-09-02 11:23:47 -07:00
Dmitry Kovalev
5c937db029 Cleaning up vp9_variance_avx2.c.
Change-Id: I75eb47dd21f87015efd673dbd2aa71f4386afdf5
2014-09-02 11:01:29 -07:00
Dmitry Kovalev
0a4403992a Merge "Removing 'frames' field from VP9_COMP." 2014-09-02 10:01:20 -07:00
Dmitry Kovalev
4c7a783e8c Merge "Adding get_frame_pkt_flags() function." 2014-09-02 10:00:51 -07:00
Dmitry Kovalev
7c24d21f2e Merge "Removing lookup_next_frame_stats()." 2014-09-02 09:25:16 -07:00
Jingning Han
bac0268716 Merge "Skip intra mode tests depending on inter residuals" 2014-09-02 08:32:52 -07:00
Dmitry Kovalev
dbe2170595 Merge "Replacing asm 8x8 variance calculation with intrinsics." 2014-08-31 18:39:46 -07:00
Dmitry Kovalev
4ab2241f5b Removing dummy_packing member from VP9_COMP.
Change-Id: I571ce84c97087f8a1a36a10058393bfdcefbf72a
2014-08-29 17:33:20 -07:00
Dmitry Kovalev
0b721db543 Replacing asm 8x8 variance calculation with intrinsics.
New code is 10% faster for 64-bit and 25% faster for 32-bit. Compiled
using clang.

Change-Id: I8ba1544c30dd6f3ca479db806384317549650dfc
2014-08-29 17:28:31 -07:00
Jingning Han
deb8882cca Merge "Fix int64_t to unsigned int conversion warnings" 2014-08-29 17:15:46 -07:00
Jingning Han
dc3327c9dc Merge "Extend block level sse to support multiple txfm blocks" 2014-08-29 17:15:30 -07:00
Jingning Han
6ddf1e152a Fix int64_t to unsigned int conversion warnings
Use unsigned int type to store the sse in the pixel domain. The
precision is sufficient to handle sse of block size up to 64x64.
The transform domain version however needs int64_t, since there is
a transfer gain applied in the forward transformation that might
cause unsigned int overflow.

Change-Id: Ifef97c38597e426262290f35341fbb093cf0a079
2014-08-29 14:29:31 -07:00
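
The precision claim in the commit above can be checked with a short worst-case calculation. The sketch below assumes 8-bit samples, so the largest per-pixel difference is 255; the helper names `max_pixel_sse()` and `fits_in_u32()` are hypothetical. It covers only the pixel domain and does not model the transform gain mentioned in the message.

```c
#include <stdint.h>

/* Worst-case pixel-domain SSE for a square block, assuming 8-bit samples. */
static uint64_t max_pixel_sse(int block_size) {
  const uint64_t max_diff = 255;
  return max_diff * max_diff * (uint64_t)block_size * (uint64_t)block_size;
}

/* Returns 1 if the value can be stored in a 32-bit unsigned int. */
static int fits_in_u32(uint64_t value) {
  return value <= UINT32_MAX;
}
```

For a 64x64 block the bound is 255 * 255 * 4096 = 266,342,400, well below the 32-bit limit of 4,294,967,295.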
Dmitry Kovalev
72037944df Merge "Removing variance MMX code." 2014-08-29 14:08:02 -07:00
James Zern
0e361fb895 Merge "vp9: sync workers at the start of decode_tiles_mt()" 2014-08-29 14:07:37 -07:00
James Zern
8700c61610 Merge "vp9: fix m/t loop filter invalid free" 2014-08-29 14:07:02 -07:00
Yunqing Wang
a4a1ca109c Merge "Minor fix in vp9_encoder.h" 2014-08-29 13:44:10 -07:00
Yunqing Wang
96c43e8aa9 Minor fix in vp9_encoder.h
Added the missing "int".

Change-Id: I7c8af3dee700837b40f010d53e1431a59370ae3a
2014-08-29 11:27:24 -07:00
James Zern
fec40f9269 vp9: fix m/t loop filter invalid free
store the number of allocated rows in VP9LfSync, the calculated values
can not be relied on when dealing with corrupt material.

Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
2014-08-29 11:04:45 -07:00
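
The idea in the commit above, remembering how many rows were allocated instead of recomputing the count at free time, can be sketched as follows. `RowSync`, `row_sync_alloc()`, and `row_sync_free()` are hypothetical stand-ins and do not reflect the actual layout of VP9LfSync.

```c
#include <stdlib.h>

/* Hypothetical sync object that owns one small allocation per row. */
typedef struct {
  int **rows;
  int num_rows; /* number of rows actually allocated */
} RowSync;

/* Allocates num_rows entries (num_rows must be > 0) and records the count. */
static int row_sync_alloc(RowSync *sync, int num_rows) {
  sync->rows = calloc((size_t)num_rows, sizeof(*sync->rows));
  if (sync->rows == NULL) {
    sync->num_rows = 0;
    return -1;
  }
  sync->num_rows = num_rows;
  for (int i = 0; i < num_rows; ++i) {
    /* A failed entry stays NULL, which free() accepts. */
    sync->rows[i] = calloc(1, sizeof(**sync->rows));
  }
  return 0;
}

/* Frees using the stored count, never a value derived from frame headers. */
static void row_sync_free(RowSync *sync) {
  for (int i = 0; i < sync->num_rows; ++i) free(sync->rows[i]);
  free(sync->rows);
  sync->rows = NULL;
  sync->num_rows = 0;
}
```

Because the free path reads only `num_rows`, a corrupt frame that reports different dimensions cannot make it walk past the end of the array.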
Dmitry Kovalev
12cd6f421d Removing variance MMX code.
Removed functions:
* vp9_mse16x16_mmx
* vp9_get_mb_ss_mmx
* vp9_get4x4var_mmx
* vp9_get8x8var_mmx
* vp9_variance4x4_mmx
* vp9_variance8x8_mmx
* vp9_variance16x16_mmx
* vp9_variance16x8_mmx
* vp9_variance8x16_mmx

They all have SSE2 equivalents.

Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615
2014-08-29 10:26:42 -07:00
Jingning Han
4282955ee1 Skip intra mode tests depending on inter residuals
This commit allows the encoder to skip the intra coding mode test when
the known inter residual is less than the source variance. It
reduces the runtime of speed 3 for test clips:
bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up
pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up.

The compression performance is down by
derf   -0.36%
stdhd  -0.25%

Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
2014-08-29 08:37:35 -07:00
Jingning Han
02e6ecdc4c Extend block level sse to support multiple txfm blocks
This commit extends the sse and forward transform computation flag
to support the case of 64x64 blocks, which contain four 32x32 2D-DCT
blocks.

Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
2014-08-29 08:29:34 -07:00
James Zern
c29cc89c78 Merge "vp9: fix crash in inline loopfilter w/corrupt file" 2014-08-28 18:37:30 -07:00
James Zern
458d0114f9 Merge "vp9: fix crash in mt loopfilter w/corrupt file" 2014-08-28 18:36:31 -07:00
James Zern
dbdff12b81 vp9: sync workers at the start of decode_tiles_mt()
prevents any problems resuming decode after decoding a corrupt frame

Change-Id: Ib7eb1b5c062aebe71074fef1ece32a32822c16be
2014-08-28 17:50:38 -07:00
Dmitry Kovalev
8e78a0d365 Merge "Implementing 4x4 variance calculation with SSE2." 2014-08-28 17:25:46 -07:00
Dmitry Kovalev
dcac083cf3 Implementing 4x4 variance calculation with SSE2.
New SSE2 function is three times faster than MMX one.

Change-Id: I4f387ce9f75b88379176ec7bdc62d86eb5f70fbe
2014-08-28 15:01:16 -07:00
Dmitry Kovalev
73edeb03ea Removing alg_priv from vpx_codec_priv struct.
In order to understand memory layout consider the declaration of the
following structs. The first one is a part of our API:

struct vpx_codec_ctx {
  // ...
  struct vpx_codec_priv *priv;
};

The second one is defined in vpx_codec_internal.h:

struct vpx_codec_priv {
  // ...
};

The following struct is defined 4 times for encoder/decoder VP8/VP9:

struct vpx_codec_alg_priv {
  struct vpx_codec_priv base;
  // ...
};

Private data allocation for the given ctx:

struct vpx_codec_ctx *ctx = <get>
struct vpx_codec_alg_priv *alg_priv = <allocate>
ctx->priv = (struct vpx_codec_priv *)alg_priv;

The cast works because vpx_codec_alg_priv has a
vpx_codec_priv instance as a first member 'base'.

Change-Id: I10d1afc8c9a7dfda50baade8c7b0296678bdb0d0
2014-08-28 13:51:37 -07:00
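
The layout described in the commit above can be turned into a small compilable sketch. The struct names follow the commit message, but the members `err` and `frame_count` and the helpers `ctx_init()`, `ctx_alg_priv()`, and `ctx_destroy()` are hypothetical placeholders, not the real libvpx definitions.

```c
#include <stdlib.h>

/* Simplified stand-ins for the structs in the commit message; the members
 * shown here are hypothetical. */
struct vpx_codec_priv {
  int err;
};

struct vpx_codec_ctx {
  struct vpx_codec_priv *priv;
};

struct vpx_codec_alg_priv {
  struct vpx_codec_priv base; /* must remain the first member */
  int frame_count;
};

/* Allocates algorithm-private data and stores it through the base pointer. */
static int ctx_init(struct vpx_codec_ctx *ctx) {
  struct vpx_codec_alg_priv *alg_priv = calloc(1, sizeof(*alg_priv));
  if (alg_priv == NULL) return -1;
  ctx->priv = (struct vpx_codec_priv *)alg_priv;
  return 0;
}

/* Recovers the full struct from the base pointer stored in ctx. */
static struct vpx_codec_alg_priv *ctx_alg_priv(struct vpx_codec_ctx *ctx) {
  return (struct vpx_codec_alg_priv *)ctx->priv;
}

/* Releases the allocation made by ctx_init(). */
static void ctx_destroy(struct vpx_codec_ctx *ctx) {
  free(ctx->priv);
  ctx->priv = NULL;
}
```

C guarantees that a pointer to a struct, suitably converted, points to its first member, which is why both casts are valid as long as `base` stays first.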
Dmitry Kovalev
e9d106bd45 Merge "Removing unused arnr_type from VP9EncoderConfig and vp9_extracfg." 2014-08-28 13:50:05 -07:00
Yunqing Wang
5ac75188cb Merge "Early termination in encoding partition search" 2014-08-28 13:49:39 -07:00