14767 Commits

Author SHA1 Message Date
Geza Lore
5eefd3ebfd Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (equal per component cost)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
2015-11-11 14:03:47 +00:00
James Zern
ec45003a8f libs.mk, testdata: rm redundant test of LIBVPX_TEST_DATA
the return value of enabled, which may be empty, is handled by the for
loop. this avoids making an unnecessarily long command line which may
fail in certain cases.

Change-Id: Ib88ecbbe2c0f6d7debb600b4caed4884497263b1
2015-11-10 17:54:51 -08:00
Marco
064a9eca49 Non-rd partition: reduce variance threshold low resolutions.
Change-Id: I06306905d187948a92f839357df5d21413823808
2015-11-10 15:42:58 -08:00
Marco Paniconi
79a194692f Merge "Add bias to zero/small motion for noisy source." 2015-11-10 23:10:31 +00:00
James Zern
e3efed7f4c Merge "convolve_copy_sse2: replace SSE w/SSE2 code" 2015-11-10 22:35:12 +00:00
Scott LaVarnway
f48321974b Merge "VPX: x86 asm version of vpx_idct32x32_34_add()" 2015-11-10 21:40:11 +00:00
Scott LaVarnway
9aeaa2016e VPX: x86 asm version of vpx_idct32x32_34_add()
Change-Id: I8a933c63b7fbf3c65e2c06dbdca9646cadd0b7cb
2015-11-10 11:54:56 -08:00
Marco
bd6bf25969 Add bias to zero/small motion for noisy source.
Change is only for real-time mode, speed >= 5, and non-screen content mode.
Add bias to zero/low motion for big blocks, if noise estimation
is enabled and noise level is above threshold.

Change-Id: I3a0a4608ede6aa535bda6eca528d20f8aba738e7
2015-11-10 11:23:40 -08:00
James Zern
40dab58941 convolve_copy_sse2: replace SSE w/SSE2 code
this should be neutral or slightly faster on modern (P4+) architectures

Change-Id: Iec4c080275941eb8c9e05a66a2daf0405d86a69b
2015-11-09 23:45:16 -08:00
JackyChen
19272d866b VP9 noise estimate: no noise estimate if frame size change.
Change-Id: I521f7b53c143d562a88fe7de330aa3f0ef09f414
2015-11-09 19:18:29 -08:00
Jacky Chen
394d6c122a Merge "VP9: add unit test for realtime external resize." 2015-11-10 03:05:30 +00:00
Johann
f937114402 Merge branch 'javanwhistlingduck'
Change-Id: Ib63fde31ae7b3f71e608830f7433113733b2a275
2015-11-09 17:00:37 -08:00
jackychen
55c8843791 VP9: add unit test for realtime external resize.
Change-Id: I9bfa80de73847d9be88b6ce9865d7bb5fafaaa57
2015-11-09 16:48:18 -08:00
Jacky Chen
7155f7ab78 Merge "VP9 dynamic resize: enable resize unit test(DownUp)." 2015-11-09 22:54:53 +00:00
James Zern
e1fbc886e1 Merge "VP9: Only zero counts when !frame_parallel_decoding_mode" 2015-11-09 22:23:34 +00:00
Johann
cbecf57f3e Release v1.5.0
Javan Whistling Duck release.

Change-Id: If44c9ca16a8188b68759325fbacc771365cb4af8
v1.5.0
2015-11-09 14:12:38 -08:00
jackychen
0465aa45ea VP9 dynamic resize: enable resize unit test(DownUp).
The unit test requires a longer clip which is already in the repo.

Change-Id: Ic42e8d83e636fafd20d485a7f5f8422835319245
2015-11-09 14:04:58 -08:00
Marco Paniconi
cdec99b243 Merge "VP9 dynamic resize: increase waiting time after key frame." 2015-11-09 21:11:51 +00:00
jackychen
3c9a424e6e VP9 dynamic resize: increase waiting time after key frame.
For 1 pass CBR mode: increase waiting time after key frame
before we start sampling rate control behavior for determining
resize. This change need to disable one internal resize(DownUp)
temporally since it requires a longer clip to do so.

Change-Id: If21beda1be23f169ee541ab4dd642f718347887a
2015-11-09 12:04:00 -08:00
Marco Paniconi
498fd551fd Merge "Use same bias (against non-zero mv for big blocks) for speed 5." 2015-11-09 19:29:35 +00:00
Alex Converse
d1a7c10325 Merge "Expand unconstrained nodes in pack_mb_tokens and loop on zeros." 2015-11-09 18:27:40 +00:00
Scott LaVarnway
380a5519cc VP9: Only zero counts when !frame_parallel_decoding_mode
The counts are never used when frame_parallel_decoding_mode
is set.

Change-Id: Ic7a566a048297f7373c9ffbb48929ea09eff674f
2015-11-09 10:14:13 -08:00
Marco
718654848a Use same bias (against non-zero mv for big blocks) for speed 5.
Use same setting for speed 5 (as it is for speed > 5).
Change is only for real-time (non-rd) mode.

Change-Id: I830250eac654328373cb318baa89d4f0e63942e1
2015-11-09 10:09:51 -08:00
James Zern
420e8d6d03 Merge changes I8c83b86d,Ic53b2ed5,I4acc8a84
* changes:
  variance_test: create fn pointers w/'&' ref
  sixtap_predict_test: create fn pointers w/'&' ref
  sad_test: create fn pointers w/'&' ref
2015-11-07 00:57:06 +00:00
Hui Su
908fbabe4e Merge "Use accurate bit cost for uv_mode in UV intra mode RD selection" 2015-11-07 00:22:50 +00:00
Alex Converse
70eb870cfe Expand unconstrained nodes in pack_mb_tokens and loop on zeros.
Reduces Linux perf estimated cycle count for pack_mb_tokens on a
lossless encode on my desktop from 61858501855 to 48154040219 or from
26% of the overall profile to 21%.

Change-Id: I9ca3426d7e3272bc7f7030abda4f0d0cec87fb4a
2015-11-06 16:00:10 -08:00
hui su
6ab6ac450b Use accurate bit cost for uv_mode in UV intra mode RD selection
On derflr, +0.1% for VP10; however, -0.03% on VP9.

Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-11-06 14:45:43 -08:00
James Zern
eba14ddbe7 Merge "Revert "Add AVX vectorized vp9_diamond_search_sad"" 2015-11-06 22:37:20 +00:00
James Zern
30466f26b4 Revert "Add AVX vectorized vp9_diamond_search_sad"
This reverts commit f1342a7b070ef61b9fbdf03e899ac2107cfcb6bd.

This breaks 32-bit builds:
 runtime error: load of misaligned address 0xf72fdd48 for type 'const
__m128i' (vector of 2 'long long' values), which requires 16 byte
alignment

+ _mm_set1_epi64x is incompatible with some versions of visual studio

Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
2015-11-06 13:15:01 -08:00
James Zern
837cea40fc variance_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: I8c83b86da53b1783cd18c0f765b67ba33da91d72
2015-11-06 11:04:11 -08:00
James Zern
ab5ce2e5ae sixtap_predict_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: Ic53b2ed5fbce05c5b5e633b4a4ef9ea75c55360a
2015-11-06 11:04:10 -08:00
Marco
5f041c01ed vp9: Disable noise estimate on resize trigger frame.
Change-Id: I35767a6320943582ee11d737b5f240cea2d01b25
2015-11-06 08:42:09 -08:00
James Zern
91606bbbe6 sad_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: I4acc8a844d1e55b766f66482bd6d32998174d70f
2015-11-05 23:53:24 -08:00
Marco Paniconi
d7bbe1a210 Merge "vp9: Updates to noise estimation." 2015-11-06 06:51:11 +00:00
Marco
1c724d01aa vp9: Updates to noise estimation.
Add threshold/condition on spatial_variance and brightness level.
Modification to normalization of block variance.
Change resolution limit below which we disable noise estimation.

Change-Id: If5be08a26ceda351242d8a58d2f0bc88c0a918f0
2015-11-05 18:19:01 -08:00
James Zern
892130f75b vp9_spatial_svc_encoder.sh: fix command line param
-l -> -sl, renamed in:
be3b08d [svc] Temporal svc with two pass rate control

Change-Id: I5a7b179b33d94e20e54825090659156dece928c0
2015-11-05 15:22:39 -08:00
Yunqing Wang
57cae22c1e Merge "Add AVX vectorized vp9_diamond_search_sad" 2015-11-05 20:17:13 +00:00
Geza Lore
f1342a7b07 Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (equal per component cost)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
2015-11-05 10:02:17 +00:00
Marco Paniconi
c6641709a7 Merge "Bias against non-zero mv for large blocks." 2015-11-04 00:01:23 +00:00
Alex Converse
246e0eaa71 Deduplicate some high bit depth tables
Change-Id: I6977f7d155cc1e81ae2393933893caac6770821f
2015-11-03 15:40:44 -08:00
Marco
04a99cb36b Bias against non-zero mv for large blocks.
Change is only for real-time mode, speed > 5, and non-screen content mode.
Bias is based on block size and motion vector level (motion above some threshold).

Helps to improves stability in background from lightning changes.
PSNR/SSIM metrics on RTC set almost no change/neutral (within +/- 0.1).

Change-Id: I7eac13c1ae10be4ab1f40acc7f9f1df5653ece9d
2015-11-03 14:51:56 -08:00
Marco Paniconi
17534d2918 Merge "Update to encoder_breakout_test, for non-rd mode." 2015-11-03 22:40:53 +00:00
Yaowu Xu
5ff1008ed9 Merge "Fix a msvc warning" 2015-11-03 21:56:25 +00:00
Hui Su
3cbe767972 Merge "Generate intra prediction reference values only when necessary" 2015-11-03 20:55:14 +00:00
Marco Paniconi
73372cc09a Merge "Adjust threshold for datarate frame drop test." 2015-11-03 19:54:52 +00:00
Marco
9a7785b9d6 Update to encoder_breakout_test, for non-rd mode.
Only use non-zero threshold(s) for breakout if
the motion level of the current tested mode is low.

Change-Id: I22aae961cc42371b49d3f648560181cc54708502
2015-11-03 11:49:44 -08:00
Yaowu Xu
87e08f4d9f Fix a msvc warning
Change-Id: Id5b8f597fb275395232559fea7bfeb56912b88a1
2015-11-03 11:22:58 -08:00
Alex Converse
255bcf8697 Merge "misc fixes: Remove a wasted value." 2015-11-03 17:52:34 +00:00
Alex Converse
1796d1cc77 Merge "Add target for Mac OS X 10.11 'El Capitan'" 2015-11-03 17:50:34 +00:00
Marco
cb7b2a4f4b Adjust threshold for datarate frame drop test.
Current threshold is little too strict.

Change-Id: I99ec1409d095e0c2fd3b7ab398742cabcc05700b
2015-11-03 08:17:21 -08:00