Compare commits

...

634 Commits

Author SHA1 Message Date
Marco
16ec39cf96 vp9-denoiser bugfix: Disable postproc-denoiser under temporal denoising.
The postproc vp9_denoise() is a spatial denoise/blur function.
It was not intended to be used if temporal denoising is enabled.

Change-Id: I97d2dcb941e7cc49bbafce99d9286beb2693249d
2016-02-24 17:06:33 -08:00
Marco
b520882f0e vp9-svc: Fix to avoid msan uninitialized value.
Move the logic for forcing zero_mode after the
(ref_frame & flag_list) check.
This was causing a memory leak under msan:
https://bugs.chromium.org/p/webrtc/issues/detail?id=5402

Change-Id: Ie9d243369f8ed7c332f46178275945331da4fd85
2016-01-06 11:34:57 -08:00
Yaowu Xu
2bd4f44409 Assert no mv clamping for scaled references
Under --enable-better-hw-compatibility, this commit adds the asserts
that no mv clamping is applied for scaled references, so when built
with this configure option, the decoder will assert if an input bitstream
triggers mv clamping for scaled reference frames.

Change-Id: I786e86a2bbbfb5bc2d2b706a31b0ffa8fe2eb0cb
2016-01-05 14:55:05 -08:00
Yaowu Xu
ce6d3f1de4 Merge "Assert no 8x4/4x8 partition for scaled references" 2016-01-05 20:35:46 +00:00
Marco Paniconi
e9e726f744 Merge "vp9-skin detection: Refactoring." 2016-01-05 16:56:54 +00:00
Yaowu Xu
03a021a6fc Assert no 8x4/4x8 partition for scaled references
This commit adds a new configure option:

--enable-better-hw-compatibility

The purpose of the configure option is to provide information on known
hardware decoder implementation bugs, so encoder implementers may
choose to implement their encoders in a way to avoid triggering these
decoder bugs.

The WebM team was made aware that a number of hardware decoders
have trouble handling the combination of scaled reference frames
and 8x4 or 4x8 partitions. This commit adds asserts to the vp9
decoder, so when built with the above configure option, the decoder can
assert if an input bitstream triggers such a decoder bug.

Change-Id: I386204cfa80ed16b50ebde57f886121ed76200bf
2016-01-04 18:33:37 -08:00
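
A minimal, self-contained sketch of the kind of guard this commit describes. The function, parameter, and macro names below (check_hw_compat, ref_is_scaled, CONFIG_BETTER_HW_COMPATIBILITY as the macro the configure flag would define) are assumptions for illustration, not the actual libvpx decoder code:

  #include <assert.h>

  /* Hypothetical stand-in for the real scale-factor check: a reference is
   * "scaled" when its dimensions differ from the current frame's. */
  static int ref_is_scaled(int ref_w, int ref_h, int cur_w, int cur_h) {
    return ref_w != cur_w || ref_h != cur_h;
  }

  static void check_hw_compat(int ref_w, int ref_h, int cur_w, int cur_h,
                              int bw, int bh) {
    const int scaled = ref_is_scaled(ref_w, ref_h, cur_w, cur_h);
    const int bad_partition = (bw == 8 && bh == 4) || (bw == 4 && bh == 8);
  #if CONFIG_BETTER_HW_COMPATIBILITY
    /* A scaled reference combined with an 8x4 or 4x8 partition is known to
     * trip some hardware decoders, so assert on that combination. */
    assert(!(scaled && bad_partition));
  #else
    (void)scaled;
    (void)bad_partition;
  #endif
  }
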
Yaowu Xu
ef77ce4407 Merge "vp10: only assume ONLY_4X4 if segmentation is disabled." 2016-01-05 02:29:05 +00:00
Yaowu Xu
0b769b2929 Merge "vp10: skip coding of txsz for lossless-segment blocks." 2016-01-05 02:28:58 +00:00
Marco
a8b7c6aad3 vp9-skin detection: Refactoring.
Add a function to compute the skin map for a given block, as it is
used in several places (cyclic refresh, noise estimation, and denoising).

Change-Id: Ied622908df43b6927f7fafc6c019d1867f2a24eb
2016-01-04 16:58:06 -08:00
Marco
e5dfca02a9 vp9-svc: Set initial values for ext_buffer/flag indices.
Set initial values for these parameters in vp9_init_layer_context().

This also fixes an issue in the svc-bypass mode when frame flags are
passed via vpx_codec_encode().

Change-Id: I0968f04672f8d3d2fe2cea6b8a23f79f80d7a8b1
2016-01-04 12:28:46 -08:00
Ronald S. Bultje
53a11656cd vp10: only assume ONLY_4X4 if segmentation is disabled.
Otherwise, per-segment lossless might mean that some segments are not
lossless and they could still want to use another mode. The per-block
tx points remain uncoded on blocks where (per the segment id) the Q
value implies lossless.

Change-Id: If210206ab1fe3dd11976797370c77f961f13dfa0
2016-01-04 15:21:02 -05:00
Ronald S. Bultje
d9439fdc36 vp10: skip coding of txsz for lossless-segment blocks.
Change-Id: Ic23c10b6d2a9fed3abe69c6bf10e910832444f2c
2016-01-04 15:21:02 -05:00
Jian Zhou
b8c2a4eb0c Merge "Code clean of highbd_tm_predictor_32x32" 2015-12-28 18:17:03 +00:00
Jian Zhou
dbe2d8c33c Merge changes I0139f8e9,I7d2545fc
* changes:
  Code clean of highbd_tm_predictor_16x16
  Code clean of highbd_dc_predictor_32x32
2015-12-28 18:16:13 +00:00
Jingning Han
c84d3abeb8 Merge "Fix sub8x8 motion search on scaled reference frame" 2015-12-23 02:34:18 +00:00
Jian Zhou
26a6ce4c6d Code clean of highbd_tm_predictor_32x32
Remove the ARCH_X86_64 constraint. No performance hit on either
big or small cores.

Change-Id: I39860b62b7a0ae4acaafdca7d68f3e5820133a81
2015-12-22 16:51:57 -08:00
Jian Zhou
355bfa2193 Code clean of highbd_tm_predictor_16x16
Remove the ARCH_X86_64 constraint.

Change-Id: I0139f8e998cc5525df55161c2054008d21ac24d4
2015-12-22 16:34:40 -08:00
Jian Zhou
a4c265f1b7 Code clean of highbd_dc_predictor_32x32
Remove the ARCH_X86_64 constraint.

Change-Id: I7d2545fc4f24eb352cf3e03082fc4d48d46fbb09
2015-12-22 16:06:54 -08:00
Marco Paniconi
a9dd8a7308 Merge "aq-mode=3: Don't reset segment if block is determined to be skin." 2015-12-22 20:18:24 +00:00
Marco
b121a3e7b8 aq-mode=3: Don't reset segment if block is determined to be skin.
For coding block sizes <=16X16, if the block is determined to be skin,
then always allow that block to be a candidate for refresh. So if that
block happens to be on the boost segment(s), the segment won't get reset to 0
and delta-q will be applied.

PSNR/SSIM metrics neutral (little/no change) on RTC clips.
Speed increase small/negligible (< 1%).
Some visual improvement on faces in a few RTC clips.

Change-Id: I6bf0fce6f39d820b491ce05d7c017ad168fce7d6
2015-12-22 10:23:44 -08:00
James Zern
cedb1db594 Merge "Code clean of highbd_tm_predictor_4x4" 2015-12-22 16:45:01 +00:00
James Zern
a097963f80 Merge "Code clean of highbd_dc_predictor_4x4" 2015-12-22 16:30:37 +00:00
Jian Zhou
52e7f4153b Merge "Code clean of highbd_v_predictor_4x4" 2015-12-21 18:07:48 +00:00
Yunqing Wang
b597e3e188 Merge "Fix for issue 1114 compile error" 2015-12-19 04:29:39 +00:00
James Zern
8b2ddbc728 sad_sse2: fix sad4xN(_avg) on windows
reduce the register count by 1 to avoid xmm6 and unnecessarily
penalizing the other users of the base macro

Change-Id: I59605c9a41a31c1b74f67ec06a40d1a7f92c4699
2015-12-18 19:19:32 -08:00
Jian Zhou
db11307502 Code clean of highbd_tm_predictor_4x4
Replace MMX with SSE2, reduce memory access to the left neighbor,
and unroll the loop.

Change-Id: I941be915af809025f121ecc6c6443f73c9903e70
2015-12-18 18:43:41 -08:00
Jian Zhou
c91dd55eda Code clean of highbd_v_predictor_4x4
MMX replaced with SSE2, same performance.

Change-Id: I2ab8f30a71e5fadbbc172fb385093dec1e11a696
2015-12-18 15:25:27 -08:00
Jian Zhou
8366b414dd Code clean of highbd_dc_predictor_4x4
MMX replaced with SSE2, same performance.

Change-Id: Ic57855254e26757191933c948fac6aa047fadafc
2015-12-18 12:45:23 -08:00
Marco Paniconi
f075fdc474 Merge "Non-rd speed >=5: Include H/V intra for bsize=16x16." 2015-12-18 17:45:49 +00:00
Peter de Rivaz
7361ef732b Fix for issue 1114 compile error
In a 32-bit build with --enable-shared, there is a lot of
register pressure and the register src_strideq is reused.
The code needs to use the stack-based version of src_stride,
but this doesn't compile when used in an lea instruction.

This patch also fixes a related segmentation fault caused by the
implementation using src_strideq even though it has been
reused.

This patch also fixes the HBD subpel variance tests that fail
when compiled without disable-optimizations.
These failures were caused by local variables in the assembler
routines colliding with the caller's stack frame.

Change-Id: Ice9d4dafdcbdc6038ad5ee7c1c09a8f06deca362
2015-12-18 09:43:22 +00:00
Jian Zhou
8f8a3b6a78 Merge "Code clean of sad4xN(_avg)_sse" 2015-12-18 01:39:20 +00:00
Marco
c8a2c31ec1 Non-rd speed >=5: Include H/V intra for bsize=16x16.
H/V intra modes were only enabled for bsize < 16x16;
enable them also for bsize=16x16.

Metrics are neutral with this change:
Overall very small gain (0.1%), small visual gain on some RTC clips.

Change-Id: Ib2d7a44382433bfc11cf324aa3cc5c382ea9e088
2015-12-17 17:18:44 -08:00
Jian Zhou
b158d9a649 Code clean of sad4xN(_avg)_sse
Replace MMX with SSE2 and reduce psadbw ops, which may help Silvermont.

Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d
2015-12-17 11:10:42 -08:00
Marco Paniconi
685a6b602b Merge "vp9-svc: Fix to allow for 4x4 variance for low resolutions." 2015-12-16 23:04:26 +00:00
James Zern
a71dcd6f99 Merge "vpxenc: don't warn about libwebm availability if writing IVF." 2015-12-16 22:53:01 +00:00
Marco
f0961498a0 vp9-svc: Fix to allow for 4x4 variance for low resolutions.
Change-Id: I3ec08e10d9ebf6d8b8a03004a320523f926e5cc4
2015-12-16 13:38:41 -08:00
Yaowu Xu
e650129683 Move bit_depth init out of setup_quantization
This also fixes a compiling error under --enable-vp9_highbitdepth.

Change-Id: I9d1dcb95d3336d797eb3c23a4702c30b04355357
2015-12-16 11:43:11 -08:00
Ronald S. Bultje
3977507339 vpxenc: don't warn about libwebm availability if writing IVF.
Change-Id: I1a9635a9948458e6c83f5b58764b7e720d98e2ea
2015-12-16 13:35:59 -05:00
Marco Paniconi
f73a511d37 Merge "Non-rd variance partition: Lower the 64->32 force split threshold." 2015-12-16 16:48:07 +00:00
Marco
26fda00840 Non-rd variance partition: Lower the 64->32 force split threshold.
Change-Id: I837551bdf87197bee8a193353bb31f4cff794787
2015-12-15 17:29:01 -08:00
Yaowu Xu
eace551c87 Merge changes Icf9b57c3,I9e12da84,Idf5ee179
* changes:
  Fixed interval, fixed Q 1 pass test patch.
  1 pass VBR mode bug fix.
  Fixed interval, fixed Q 1 pass test patch.
2015-12-15 17:51:33 +00:00
Marco Paniconi
12084f6d57 Merge "Revert "Add "unknown" status for noise estimation."" 2015-12-15 16:46:06 +00:00
Marco Paniconi
f3e7539c67 Revert "Add "unknown" status for noise estimation."
This reverts commit e15fedb925.

Change-Id: Ibf2bce008c727a9754f88814b7630095fa7b8253
2015-12-15 16:44:40 +00:00
Marco Paniconi
93c0b879d4 Merge "SVC 1 pass mode: Constrain inter mode search within superframe." 2015-12-15 16:25:20 +00:00
Yaowu Xu
9232f69b26 Merge "Fix a enc/dec mismatch under CONFIG_MISC_FIXES" 2015-12-15 16:02:39 +00:00
Paul Wilkins
a5af49331d Merge "1 pass VBR mode bug fix." 2015-12-15 15:50:05 +00:00
paulwilkins
99309004bf Fixed interval, fixed Q 1 pass test patch.
For testing, implemented a fixed pattern and delta, 1 pass,
fixed Q, low delay mode.

This has not in any way been tuned or optimized.

Change-Id: Icf9b57c3bb16cc5c0726d5229009212af36eb6d9
2015-12-15 15:33:25 +00:00
paulwilkins
9ce611a764 1 pass VBR mode bug fix.
(copied from VP9)

The one pass VBR mode selects a Q range based on a
moving average of recent Q values. This calculation
should have been excluding arf overlay frames as these
are usually coded at the highest allowed value. Their
inclusion skews the average and can cause it to drift
upwards even when the clip as a whole is undershooting.

As such it can undermine correct adaptation of the allowed
Q range especially for easy content.

Change-Id: I9e12da84e12917e836b6e53ca4dfe4f150b9efb1
2015-12-15 15:02:40 +00:00
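
A tiny sketch of the idea behind this fix, with hypothetical names (update_avg_q, is_arf_overlay_frame) and an assumed smoothing weight; it is not the actual rate-control code:

  /* Update the running Q average used to pick the 1-pass VBR Q range, but
   * skip arf overlay frames: they are usually coded at the highest allowed
   * value and would drag the average (and the allowed range) upward. */
  static double update_avg_q(double avg_q, double frame_q,
                             int is_arf_overlay_frame) {
    if (is_arf_overlay_frame) return avg_q;  /* leave the average untouched */
    return 0.75 * avg_q + 0.25 * frame_q;    /* assumed smoothing weights */
  }
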
paulwilkins
fc50d95b2e Fixed interval, fixed Q 1 pass test patch.
For testing, implemented a fixed pattern and delta, 1 pass,
fixed Q, low delay mode.

This has not in any way been tuned or optimized.

Change-Id: Idf5ee179b277fa15d07a97f14f2ce5bbaae80a04
2015-12-15 15:00:38 +00:00
paulwilkins
cea5e1c1e3 1 pass VBR mode bug fix.
The one pass VBR mode selects a Q range based on a
moving average of recent Q values. This calculation
should have been excluding arf overlay frames as these
are usually coded at the highest allowed value. Their
inclusion skews the average and can cause it to drift
upwards even when the clip as a whole is undershooting.

As such it can undermine correct adaptation of the allowed
Q range especially for easy content.

Change-Id: I7d10fe4227262376aa2dc2a7aec0f1fd82bf11f9
2015-12-15 10:27:51 +00:00
Yaowu Xu
c7101830a6 Fix a enc/dec mismatch under CONFIG_MISC_FIXES
The culprit is that, on the decode side, the xd->lossless[i] setup was in the wrong
location, where segment features had not yet been decoded.

Also, on the encoder side, the transform mode was not set consistently
between when tx_mode is selected and how tx_mode is enforced in
tx size selection.

Change-Id: I4c4c32188fda7530cadab9b46d4201f33f7ceca3
2015-12-14 20:56:37 -08:00
James Zern
b81f04a0cc Merge "move vp9_avg to vpx_dsp" 2015-12-15 03:41:22 +00:00
Jacky Chen
b7654afb6b Merge "Add "unknown" status for noise estimation." 2015-12-15 00:41:23 +00:00
jackychen
e15fedb925 Add "unknown" status for noise estimation.
Change-Id: I0fe95332ccfa2e1ad2a01a8e7ddd631289e0f8eb
2015-12-14 15:38:20 -08:00
Marco
c760c33b99 SVC 1 pass mode: Constrain inter mode search within superframe.
Keep track of frame indexes for the references, and
constrain inter mode search for reference with same
temporal alignment.

Improves speed by about ~15%, no noticeable loss in
compression performance.

Change-Id: I5c407a8acca921234060c4fcef4afd7d734201c8
2015-12-14 15:19:29 -08:00
Marco Paniconi
c0c0edd9d7 Merge "Non-rd variance partition: Adjust logic for 32->16 force split." 2015-12-14 22:46:15 +00:00
James Zern
d36659cec7 move vp9_avg to vpx_dsp
Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f
2015-12-14 14:42:12 -08:00
Marco
6f17954f85 Non-rd variance partition: Adjust logic for 32->16 force split.
Lower the threshold for splitting 32x32->16x16 based on average variance,
and add a lower-bound condition for this split to occur. This prevents
unnecessary splitting for areas with very low variance.

Change-Id: Ibeb33b3d993632c2019f296eb87ef3b7e3568189
2015-12-14 12:54:10 -08:00
Jian Zhou
2404e3290e Merge "Code clean of tm_predictor_32x32" 2015-12-14 17:56:01 +00:00
Marco Paniconi
e19b7df8d3 Merge "Non-rd variance partition: Adjustments to reduce dragging artifact." 2015-12-12 02:59:33 +00:00
Marco
d4440614ae Non-rd variance partition: Adjustments to reduce dragging artifact.
For non-rd variance partition, speed >= 5:
Adjustments to reduce the dragging artifact of background areas near
slow-moving boundaries.

-Decrease base threshold under low source noise conditions.
-Add condition to split 64x64/32x32 based on average variances
of lower level blocks.

PSNR/SSIM metrics go down ~0.7/0.9% on average on RTC set.
Visually helps to reduce dragging artifact on some rtc clips.

Change-Id: If1f0a1aef1ddacd67464520ca070e167abf82fac
2015-12-11 16:16:02 -08:00
Jian Zhou
6e87880e7f Merge "Speed up tm_predictor_16x16" 2015-12-11 18:55:46 +00:00
Jian Zhou
88120481a4 Code clean of tm_predictor_32x32
Reallocate the xmm register usage so that ARCH_X86_64 is no longer required.
Reduce memory access to the left neighbor by half.
Speed up by single digits on big-core machines.

Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990
2015-12-11 10:32:08 -08:00
Jingning Han
27bbfd652d Fix sub8x8 motion search on scaled reference frame
This commit makes the sub8x8 block rate-distortion optimization
scheme use precise motion compensated prediction to compute the rd
cost. It fixes a potential buffer overflow issue related to sub8x8
motion search on scaled reference frame.

Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de
2015-12-11 10:08:51 -08:00
Jian Zhou
62f986265f Merge "SSE2 based h_predictor_32x32" 2015-12-11 18:02:34 +00:00
James Zern
ecb8dff768 Merge "dc_left_pred[48]: fix pic builds" 2015-12-11 02:48:11 +00:00
Jian Zhou
5604924945 Merge "Code clean of dc_left/top_predictor_16x16" 2015-12-11 01:53:44 +00:00
Yaowu Xu
f0bef772be Merge "Proper fix of a msvc complier warning" 2015-12-11 00:53:28 +00:00
Yunqing Wang
be0501c875 Merge "Minor cleanup" 2015-12-11 00:52:03 +00:00
Yaowu Xu
4d2cfeab36 Proper fix of a msvc complier warning
Change-Id: I701ab4993be7cfb15b61a1adbbaf5565bd14ae27
2015-12-10 16:29:01 -08:00
James Zern
40ee78bc19 dc_left_pred[48]: fix pic builds
GET_GOT modifies the stack pointer, so the offset for left's address will
be wrong if loaded afterward.

Change-Id: Iff9433aec45f5f6fe1a59ed8080c589bad429536
2015-12-10 15:44:31 -08:00
Yaowu Xu
5a81c5c4be Merge changes Iece22223,Iefad9d8d
* changes:
  Fix two msvc build issues
  Fix enc/dec mismatches for aq-mode 1 and 2
2015-12-10 23:32:32 +00:00
Yunqing Wang
cd08120d62 Minor cleanup
Removed unused GET_GOT_SAVE_ARG.

Change-Id: I0ae41c2d0dcd6d7d1c8dda05062fcdb737fd917d
2015-12-10 15:28:07 -08:00
Yunqing Wang
feeb116c92 Merge "Fix the win32 crash when GET_GOT is not defined" 2015-12-10 23:25:05 +00:00
Jingning Han
72760976a0 Merge "Sync high bit-depth temporal filter" 2015-12-10 22:54:59 +00:00
Yunqing Wang
322ea7ff5b Fix the win32 crash when GET_GOT is not defined
This patch continues to fix the win32 crash issue:
https://bugs.chromium.org/p/webm/issues/detail?id=1105

Johann's patch is here:
https://chromium-review.googlesource.com/#/c/316446/2

Change-Id: I7fe191c717e40df8602e229371321efb0d689375
2015-12-10 14:25:01 -08:00
Yaowu Xu
6786280807 Fix two msvc build issues
Change-Id: Iece22223773dd6d0f87f8f59827705acd2ebe2a4
2015-12-10 12:41:27 -08:00
Jian Zhou
4ec5953080 Code clean of dc_left/top_predictor_16x16
Remove some redundant code.

Change-Id: Ida2e8c0ce28770f7a9545ca014fe792b04295260
2015-12-10 11:59:58 -08:00
Yaowu Xu
221ed5e47b Fix enc/dec mismatches for aq-mode 1 and 2
Change-Id: Iefad9d8d96a08dcc788a5efdca2df6a815d1205f
2015-12-10 11:45:26 -08:00
Jian Zhou
c90a8a1a43 SSE2 based h_predictor_32x32
Relocate the function from SSSE3 to SSE2, unroll the loop from 16 to 8,
and reduce memory access to the left neighbor.
Speed up by single digits in ./test_intra_pred_speed on big-core
machines.

Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960
2015-12-10 10:09:58 -08:00
Tom Finegan
7f79a83f17 Merge "iosbuild.sh: Support macosx targets in Xcode 7." 2015-12-10 16:45:01 +00:00
Paul Wilkins
449e46958c Merge "Backport temporal filter approach to VP9" 2015-12-10 09:47:25 +00:00
Jingning Han
d3c972403a Sync high bit-depth temporal filter
Change-Id: Ifdcfb91416be8189569f703bee9be253d7b3d9b6
2015-12-09 15:06:36 -08:00
Tom Finegan
acf580d2bb iosbuild.sh: Support macosx targets in Xcode 7.
Xcode 7 refuses to link to x86 and x86_64 code that's built for
the iPhone simulator, so add an extra command-line flag that forces iosbuild
to use darwin15 targets.

Change-Id: I2228d458f5cccf4d26866040380a974f88d9d360
2015-12-09 13:52:06 -08:00
Jingning Han
ece4fd5d22 Backport temporal filter approach to VP9
This commit enables the new temporal filter system for VP9. For
speed 1, it improves the compression performance:
derf  0.54%
stdhd 1.62%

Change-Id: I041760044def943e464345223790d4efad70b91e
2015-12-09 13:39:06 -08:00
Johann Koenig
420b9f5bd3 Merge "fix null pointer crash in Win32 because esp register is broken" 2015-12-09 19:31:12 +00:00
Yaowu Xu
74c67e3da3 Merge "Changes to exhaustive motion search." 2015-12-09 15:57:10 +00:00
Jacky Chen
d9bba21306 Merge "Add vp9_avg_4x4_neon and the unit test." 2015-12-09 06:09:33 +00:00
James Zern
3dc19feb29 Merge changes Id3c6cf5c,I7970575e,If3253a87
* changes:
  test.mk: simplify vp8/9 checks
  test.mk: regroup white box tests
  test.mk: enable test_intra_pred_speed unconditionally
2015-12-09 01:39:45 +00:00
James Zern
44fe73ec37 Merge "vp8: fix loop filter level clamping" 2015-12-09 01:38:09 +00:00
James Zern
e040c6c404 Merge "vp8: fix quantizer clamping" 2015-12-09 01:37:58 +00:00
jackychen
303f144eef Add vp9_avg_4x4_neon and the unit test.
Change-Id: I3ef9a9648841374ed3cc865a02053c14ad821a20
2015-12-08 17:23:36 -08:00
Marco Paniconi
835f16ea36 Merge "vp9 denoiser: Re-evaluate mode selection for golden reference." 2015-12-09 00:34:09 +00:00
paulwilkins
4e692bbee2 Changes to exhaustive motion search.
This change has been imported from VP9 and
alters the nature and use of exhaustive motion search.

Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.

Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.

For example:
  stage 1: Range +/- 64 interval 4
  stage 2: Range +/- 32 interval 2
  stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub-8x8 blocks,
which also contained a bug (the two searches used different distortion
metrics).

For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.

Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most natural video clips, however, where the step search
is performing well, the quality gain and speed impact are small.

Change-Id: Iac24152ae239f42a246f39ee5f00fe62d193cb98
2015-12-08 16:54:42 +00:00
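
A self-contained sketch of the multi-stage mesh pattern described above (a range and interval per stage, each stage centered on the previous best). The types, names, and the sad callback are illustrative, not the libvpx motion search code:

  #include <stddef.h>

  typedef struct { int row, col; } mv_t;                  /* illustrative MV type */
  typedef unsigned (*sad_fn)(mv_t candidate, void *ctx);  /* cost of one candidate */

  typedef struct { int range; int interval; } mesh_stage;
  static const mesh_stage kStages[] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };

  static mv_t mesh_search(mv_t start, sad_fn sad, void *ctx) {
    mv_t best = start;
    unsigned best_sad = sad(best, ctx);
    for (size_t s = 0; s < sizeof(kStages) / sizeof(kStages[0]); ++s) {
      const int range = kStages[s].range, step = kStages[s].interval;
      const mv_t center = best;  /* each stage searches around the previous best */
      for (int r = -range; r <= range; r += step) {
        for (int c = -range; c <= range; c += step) {
          const mv_t cand = { center.row + r, center.col + c };
          const unsigned cost = sad(cand, ctx);
          if (cost < best_sad) { best_sad = cost; best = cand; }
        }
      }
    }
    return best;
  }
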
Jian Zhou
aa5b517a39 Re-enable SSE2 based intra 4x4 prediction
The 4x4 intra predictor implemented with MMX is replaced with SSE2.
The segfault from change 315561 when decoding vp8 is taken care of.

Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76
2015-12-07 18:50:37 -08:00
Scott LaVarnway
c7e557b82c Merge "VP9: Add ssse3 version of vpx_idct32x32_135_add()" 2015-12-07 21:13:35 +00:00
Sergey Kolomenkin
5fc9688792 fix null pointer crash in Win32 because esp register is broken
https://bugs.chromium.org/p/webm/issues/detail?id=1105

Change-Id: I304ea85ea1f6474e26f074dc39dc0748b90d4d3d
2015-12-07 12:57:06 -08:00
Johann Koenig
14ea8848fb Merge "Strip redundant entries from .mailmap" 2015-12-07 18:14:05 +00:00
Johann
9fde1f2ee3 Strip redundant entries from .mailmap
Also prevent them from being reintroduced.

Change-Id: I4e16293c8185462b48e641f066d78449685e2854
2015-12-07 09:03:00 -08:00
paulwilkins
9d85ce8e0c Fix bug when overlaying middle arfs in multi-arf groups.
Fix copied over from VP9 master to VP10 master.
Do not reset the alt ref active flag when overlaying the middle
arf(s) of a multi arf group.

Change-Id: I1b7392107e7c675640d5ee1624012f39cc374c58
2015-12-07 15:23:46 +00:00
James Zern
79a9add666 Revert "MMX in intra 4x4 prediction replaced with SSE2"
This reverts commit 89a1efa4c4.

This causes a segfault when decoding vp8, in both 32- and 64-bit builds.

Change-Id: Idbb9bb28ab897e1d055340497c47b49a12231367
2015-12-05 10:20:39 -08:00
James Zern
a046ba21d8 test.mk: simplify vp8/9 checks
use CONFIG_VP[89] to protect white-box tests and drop redundant
uses of CONFIG_VP9 in variable assignments within that block

Change-Id: Id3c6cf5c7822aa161b19768b295f58829a1c6447
2015-12-04 18:44:45 -08:00
James Zern
2c9c2e0b8b test.mk: regroup white box tests
vp8/9/10/multi-config/unconditional

Change-Id: I7970575e997da0b68c6c54741a221fbba5ad0b08
2015-12-04 18:44:34 -08:00
Marco Paniconi
16a4fab9e2 Merge "Adjust variance threshold based on source noise level." 2015-12-05 00:06:14 +00:00
Angie Chiang
06bdcea606 Merge "comment out range_check of fdct in dct.c" 2015-12-04 23:38:35 +00:00
Marco
d5b3f29f3c Adjust variance threshold based on source noise level.
For non-rd variance partition: Adjust variance threhsold based
on noise level estimate. This change allows the adjustment to be
updated more frequently.

Change-Id: Ie2abf63bf3f1ee54d0bc4ff497298801fdb92b0d
2015-12-04 14:43:39 -08:00
Jian Zhou
589f3c7bc8 Merge changes Ie48229c2,Ib9f18468,I0c90e7c1
* changes:
  Speed up h_predictor_16x16
  Speed up h_predictor_8x8
  MMX in intra 8x8 prediction replaced with SSE2
2015-12-04 21:43:10 +00:00
Jian Zhou
e86c7c863e Speed up h_predictor_16x16
Relocate the function from SSSE3 to SSE2, unroll the loop from 8 to 4,
and reduce memory access to the left neighbor.
Speed up by >20% in ./test_intra_pred_speed.

Change-Id: Ie48229c2e32404706b722442942c84983bda74cc
2015-12-04 12:12:55 -08:00
Jian Zhou
da3f08fac3 Speed up h_predictor_8x8
Relocate the function from SSSE3 to SSE2, unroll the loop from 4 to 2,
and reduce memory access to the left neighbor.
Speed up by >20% in ./test_intra_pred_speed.

Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e
2015-12-04 11:36:44 -08:00
Marco Paniconi
64e46a033f Merge "Non-rd partition: Use force split on 16x16 for low resolutions." 2015-12-04 19:21:26 +00:00
Angie Chiang
08b157da8e comment out range_check of fdct in dct.c
The range_check is not used because the bit range
in fdct# is not correct. Since we are going to merge in a new version
of fdct# from nextgenv2, we won't fix the incorrect bit range now.

Change-Id: I54f27a6507f27bf475af302b4dbedc71c5385118
2015-12-04 10:54:31 -08:00
Jian Zhou
9f23a9c2e1 Merge "MMX in intra 4x4 prediction replaced with SSE2" 2015-12-04 18:50:58 +00:00
Marco
6490fc71a7 Non-rd partition: Use force split on 16x16 for low resolutions.
For low resolutions, when 4x4 downsampling is used for variance,
use the same force split (that is used for 8x8 downsampling) for 16x16 blocks.

No change in metrics. Small improvement visually.

Change-Id: I915b9895902d0b9a41e75d37fee1bf3714d2366d
2015-12-04 09:24:28 -08:00
Paul Wilkins
2b5baea8fd Merge "Fix bug when overlaying middle arfs in multi-arf groups." 2015-12-04 10:33:55 +00:00
Jian Zhou
aa2764abdd MMX in intra 8x8 prediction replaced with SSE2
8x8 Intra predictor implemented with MMX is replaced with SSE2.

Change-Id: I0c90e7c1e1e6942489ac2bfe58903b728aac7a52
2015-12-03 18:11:06 -08:00
Jian Zhou
89a1efa4c4 MMX in intra 4x4 prediction replaced with SSE2
4x4 Intra predictor implemented with MMX is replaced with SSE2.

Change-Id: Id57da2a7c38832d0356bc998790fc1989d39eafc
2015-12-03 16:40:23 -08:00
Marco Paniconi
6202ce5ada Merge "vp9-noise estimate: Move level setting to a function." 2015-12-04 00:24:49 +00:00
James Zern
2e693eb80e vp8: fix loop filter level clamping
the loop filter level is transmitted as 6 bits + sign, so it needs to be clamped in
the delta + absolute case.

BUG=https://bugzilla.mozilla.org/show_bug.cgi?id=1224363

Change-Id: Icbdca4fdbf043466429bd5c9d59dbe913bf153bc
2015-12-03 16:18:48 -08:00
James Zern
ff3674a15e vp8: fix quantizer clamping
the quantizer is transmitted as 7 bits + sign, so it needs to be clamped in
the delta + absolute case.

BUG=https://bugzilla.mozilla.org/show_bug.cgi?id=1224361

Change-Id: I9115f5d1d5cf7e0a1d149d79486d9d17de9b9639
2015-12-03 16:16:28 -08:00
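
A short sketch of the clamping these two fixes describe, with illustrative function names; the bit widths (6 bits + sign for the loop filter level, 7 bits + sign for the quantizer) come from the commit messages:

  static int clamp_int(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
  }

  /* Applying a signed delta to an absolute value can leave the legal range,
   * so clamp to what each bitstream field can actually represent. */
  static int apply_filter_delta(int base_level, int delta) {
    return clamp_int(base_level + delta, 0, 63);    /* 6-bit loop filter level */
  }

  static int apply_q_delta(int base_q_index, int delta) {
    return clamp_int(base_q_index + delta, 0, 127); /* 7-bit quantizer index */
  }
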
Marco Paniconi
b38a7cd169 Merge "vp9-denoiser: Increase threshold for mode re-evaluation." 2015-12-03 23:52:46 +00:00
Marco
dd998adc7a vp9-denoiser: Increase threshold for mode re-evaluation.
Change-Id: I57a15aec1cb2d6638f5211d30c2c9f15fb62494f
2015-12-03 13:48:35 -08:00
Marco
b12e353424 vp9-noise estimate: Move level setting to a function.
This is so we may update the level at any time (e.g., to be used
for setting thresholds in variance-based partitioning).

Change-Id: I32caad2271b8e03017a531f9ea456a6dbb9d49c7
2015-12-03 13:11:49 -08:00
hui su
5d3327e891 Remove palette from VP10
Store it in nextgenv2 for now.

Change-Id: Iab0af0e15246758e3b6e8bde4a74b13c410576fc
2015-12-03 12:30:47 -08:00
paulwilkins
4a79503b3e Fix bug when overlaying middle arfs in multi-arf groups.
Do not reset the alt ref active flag when overlaying the middle
arf(s) of a multi arf group.

Change-Id: Ia55a55a376973f3fd17161429fd2afb07b4df31f
2015-12-03 15:19:02 +00:00
Jian Zhou
623e988add Merge "SSE2 speed up of h_predictor_4x4" 2015-12-02 18:49:00 +00:00
Scott LaVarnway
f0b0b1fe62 VP9: Add ssse3 version of vpx_idct32x32_135_add()
Change-Id: I9a780131efaad28cf1ad233ae64c5c319a329727
2015-12-02 04:50:46 -08:00
Debargha Mukherjee
f70095076b Fix a spatial svc test crash
Fixes a crash in the 2-pass spatial svc test that was introduced in:
https://chromium-review.googlesource.com/#/c/313571/6

Change-Id: Iab3e8225a8d159cd33f5849dffe6802e25038047
2015-12-01 17:17:51 -08:00
Debargha Mukherjee
7ceba7c26b Fix a spatial svc assert failure
Fixes a spatial svc rc assert failure introduced in:
https://chromium-review.googlesource.com/#/c/312959/1

Change-Id: I6096bfbc484859d71a5fb55e6a3248a31885af61
2015-12-01 14:24:50 -08:00
Debargha Mukherjee
01a2b40e95 Merge "Spatial SVC crash fix" 2015-12-01 21:24:46 +00:00
Debargha Mukherjee
d3409bad9a Fix a spatial svc bug related to scaling
Fixes a bug introduced in
https://chromium-review.googlesource.com/#/c/299482/5

Change-Id: If542c1a917380465dd9bc4ce5e32b0adbb20e340
2015-12-01 10:40:59 -08:00
Marco
1abf575f32 vp9 denoiser: Re-evaluate mode selection for golden reference.
Under certain denoising conditions, check for re-evaluation of
the zero_last mode if the best mode was the golden reference.

Change-Id: Ic6cdfd175eef2f7d68606300c7173ab6654b3f6e
2015-12-01 09:39:01 -08:00
Jian Zhou
c7fae5d893 Speed up tm_predictor_16x16
Reduce memory access to the left neighbor. Speed up by 10% in ./test_intra_pred_speed
with the same instruction size.

Change-Id: Ia33689d62476972cc82ebb06b50415aeccc95d15
2015-11-30 17:46:40 -08:00
Marco
f78b7daec4 Condition use of minmax in variance partition on speed setting.
For non-rd variance partition: only allow minmax computation
(which currently has no arm-neon optimization) for speeds < 8.

Performance loss is small: on the RTC set with speed 8, a few clips lose ~2/3%,
average loss is < 1%.

Change-Id: Ia9414f4d0b77dc83c3e73ca8de5d903f64b425ce
2015-11-30 17:23:32 -08:00
Scott LaVarnway
2669e05949 Merge "VPX: x86 asm version of vpx_idct32x32_1024_add()" 2015-11-30 23:28:27 +00:00
Marco Paniconi
23831545a0 Merge "vp9 denoiser: Fix to re-evaluate mode selection." 2015-11-30 19:00:39 +00:00
Jian Zhou
9d29d76280 SSE2 speed up of h_predictor_4x4
Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers.
Speed up by ~25% in ./test_intra_pred_speed.

Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d
2015-11-30 10:08:05 -08:00
Marco
f1f74a4e6c vp9: Update to noise estimation for denoising.
Change the initial state of the noise level, and only update the
denoiser with the noise level when the estimate is done.

Change-Id: If44090d29949d3e4927e855d88241634cdb395dc
2015-11-30 10:03:20 -08:00
Marco
ad7e765319 vp9 denoiser: Fix to re-evaluate mode selection.
This fix allows reuse_inter_pred to be enabled.

Change-Id: I53f2bf1163bb0036ffb6df92117a86debdca11d1
2015-11-30 08:59:10 -08:00
Scott LaVarnway
0148e20c3c VPX: x86 asm version of vpx_idct32x32_1024_add()
Change-Id: I3ba4ede553e068bf116dce59d1317347988b3542
2015-11-25 10:11:29 -08:00
James Zern
1138b986c9 test.mk: enable test_intra_pred_speed unconditionally
vpx_dsp is currently included in all configurations

Change-Id: If3253a87d27f3e1abc94fbfe76f978c1172f3762
2015-11-24 22:29:12 -08:00
Marco Paniconi
610b413d7b Merge "vp9 denoiser: Re-evaluate ZEROMV after denoiser filtering." 2015-11-25 04:24:00 +00:00
Jian Zhou
901d20369a Merge "Speed up tm_predictor_8x8" 2015-11-25 02:34:07 +00:00
James Zern
adb033b57b Merge "configure: simplify x86 asm dependencies" 2015-11-25 02:19:47 +00:00
James Zern
fd51d90159 Merge changes Iaf8cbe95,I6748183d,I2a49811d
* changes:
  add vp9_satd_neon
  fix vp9_satd_sse2
  vp9_satd: return an int
2015-11-25 01:48:53 +00:00
Marco
5b0ddb931d vp9 denoiser: Re-evaluate ZEROMV after denoiser filtering.
For denoising, and for noise level above threshold, re-evaluate
ZEROMV for mode selection after denoising.
The current change only does this check if the selected best mode (before denoising)
was intra.

Change-Id: I4b1435b68d26c78f7597b995ee7bff0ddd5f9511
2015-11-24 17:30:32 -08:00
Debargha Mukherjee
e807517a93 Spatial SVC crash fix
Fixes a spatial_svc breakage introduced in
https://chromium-review.googlesource.com/#/c/305228/3.

Change-Id: I7f2cecbdca980addb85d5e58b58b5454f4730ada
2015-11-24 16:40:27 -08:00
James Zern
eb1d0f8d60 add vp9_satd_neon
~60-65% faster at the function level across block sizes

Change-Id: Iaf8cbe95731c43fdcbf68256e44284ba51a93893
2015-11-24 16:09:10 -08:00
Jian Zhou
f4621c5c8d Speed up tm_predictor_8x8
Left neighbor read from memory only once.
Speed up by ~20% in ./test_intra_pred_speed.

Change-Id: Ia1388630df6fed0dce9a6eeded6cb855bbc43505
2015-11-24 16:07:06 -08:00
Marco
fbd245c598 vp9-denoiser: Fix to reset frame_stats.
zeromv_lastref_sse was not reset.

Change-Id: I23c12e804d63dc7dc18514f6efe71de1d1acbd6a
2015-11-24 15:58:28 -08:00
Marco Paniconi
e99e4a64e0 Merge "vp9 non-rd pickmode: Fix logic in reference masking." 2015-11-24 19:14:35 +00:00
Alex Converse
b84fa548fb Merge "bitreader/writer: Change shift to signed" 2015-11-24 18:33:45 +00:00
Alex Converse
4b038ad2ef Merge "Deduplicate some high bit depth tables" 2015-11-24 18:24:32 +00:00
Marco
eb43c8ebfc vp9 non-rd pickmode: Fix logic in reference masking.
This change makes sure last reference with zero mv
is always checked for mode selection.

No change in metrics.

Change-Id: Iaf01877bf34272b966c78bfe18daad882a0a419e
2015-11-24 10:10:03 -08:00
Scott LaVarnway
b16a164c97 Merge "VPX: Removed unnecessary pmulhrsw in IDCT32X32_34" 2015-11-23 23:37:13 +00:00
Scott LaVarnway
26eb806342 Merge "VP9: Only zero counts when !frame_parallel_decoding_mode (2)" 2015-11-23 23:36:46 +00:00
Scott LaVarnway
2c3b737af6 VP9: Only zero counts when !frame_parallel_decoding_mode (2)
The counts are never used when frame_parallel_decoding_mode
is set.

Change-Id: I293aa68abadcdd30973adacb9f5f5a3aecf8daa2
2015-11-23 14:42:15 -08:00
Marco
b0027b96ae vp9-svc: Fix to allow setting qp-max/min per spatial and temporal layer.
Change-Id: Ic0ec32c1d7f7c08c9f956592dccbfd9060b1f624
2015-11-23 10:46:34 -08:00
Scott LaVarnway
97e6cc6198 VPX: Removed unnecessary pmulhrsw in IDCT32X32_34
and fixed macro name.

Change-Id: I306b98a2b4ec80b130ae80290b4cd9c7a5363311
2015-11-23 10:24:09 -08:00
James Zern
16eba81f69 Revert "Speed up h_predictor_4x4"
This reverts commit d76032ae87.

breaks 32-bit builds

Change-Id: If6266ec2a405b5a21d615112f0f37e8a71193858
2015-11-20 22:25:29 -08:00
James Zern
073dc71cd0 Merge "Use Interlocked calls in win32 once() implementation." 2015-11-21 01:40:11 +00:00
James Zern
1b10753ad7 Merge "Speed up h_predictor_4x4" 2015-11-21 01:12:42 +00:00
Marco
131c1600a9 vp9 denoiser: Bias to last reference for temporal filter.
Change-Id: I6a360a12e8da8cdcb8a779647512591612d64f31
2015-11-20 15:38:32 -08:00
James Zern
60760f710f fix vp9_satd_sse2
accumulate satd in 32-bits
+ add unit test

Change-Id: I6748183df3662ddb9d635f9641f9586f2fd38ad5
2015-11-20 14:35:46 -08:00
James Zern
3e0138edb7 vp9_satd: return an int
the final sum may use up to 26 bits

+ add a unit test
+ disable the sse2 as the result will rollover; this will be fixed in a
future commit

Change-Id: I2a49811dfaa06abfd9fa1e1e65ed7cd68e4c97ce
2015-11-20 14:35:38 -08:00
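
A minimal illustration of the point of the two vp9_satd changes above: accumulate in 32 bits and return an int, since the final sum can need up to 26 bits. The function name and prototype are illustrative, not the libvpx ones:

  #include <stdint.h>
  #include <stdlib.h>

  /* Sum of absolute coefficient values. With 16-bit inputs and up to
   * 32x32 = 1024 of them, the total can exceed 16 bits, so keep the
   * accumulator and the return type at 32 bits. */
  static int satd_accumulate(const int16_t *coeff, int length) {
    int total = 0;
    for (int i = 0; i < length; ++i) total += abs(coeff[i]);
    return total;
  }
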
Marco Paniconi
64a60ce3ba Merge "vp9-svc: Fix the setting of is_key_frame." 2015-11-20 18:29:15 +00:00
Alex Converse
612e3c8a0e Merge "Fix a signed shift overflow in vpx_rb_read_inv_signed_literal." 2015-11-20 17:42:05 +00:00
Alex Converse
d37c78819a Merge "Fix unsigned overflow in rd_variance_adjustment." 2015-11-20 17:41:58 +00:00
Marco
80a3e2615a vp9-svc: Fix the setting of is_key_frame.
Change only affects 1 pass CBR.
On a key frame, the temporal layer_id is reset to 0 for 1 pass CBR,
but since "layer" is reset, the svc.layer_context[layer].is_key_frame
was not correspondingly set properly.

Change-Id: I08f6da0a55ac7429ccfbaddfb7be14479e43543b
2015-11-20 08:51:13 -08:00
Scott LaVarnway
e7fc39fdf5 Merge "VPX: x86 asm version of vpx_idct32x32_34_add()" 2015-11-20 15:11:00 +00:00
Alex Converse
6aa2163b69 bitreader/writer: Change shift to signed
Silences several legal but suspicious unsigned overflows found with
clang -fsanitize=integer.

Change-Id: I69399751492a183167932b0a10751c433c32ca7b
2015-11-19 15:13:39 -08:00
Alex Converse
42b7c44b2f Fix a signed shift overflow in vpx_rb_read_inv_signed_literal.
Found with clang -fsanitize=integer

Change-Id: I17cb2166c06ff463abfaf9b0e6bc749d0d6fdf94
2015-11-19 15:04:20 -08:00
Alex Converse
b1fcd1751e Fix unsigned overflow in rd_variance_adjustment.
Found with clang -fsanitize=integer

Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1
2015-11-19 15:00:59 -08:00
Jian Zhou
d76032ae87 Speed up h_predictor_4x4
Modify h_predictor_4x4 with XMM registers.
Speed up by ~25% in ./test_intra_pred_speed.

Change-Id: Id01c34c48e75b9d56dfc2e93af12cf0c0326a279
2015-11-19 11:34:22 -08:00
Paul Wilkins
f3f6b6fe3e Merge "Changes to best quality settings." 2015-11-19 16:13:43 +00:00
Jian Zhou
4993158ee5 Merge "Speed up tm_predictor_4x4" 2015-11-19 02:32:48 +00:00
Jian Zhou
79b68626ae Speed up tm_predictor_4x4
tm_predictor_4x4 is implemented with SSE2 using XMM registers.
Speed up by ~25% in ./test_intra_pred_speed.

Change-Id: I25074b78d476a2cb17f81cf654bdfd80df2070e0
2015-11-18 16:44:25 -08:00
Marco
eed5494fc6 vp9-svc: Fix to key frame counter for spatial layers.
Existing condition only applied to temporal layers.

Change-Id: Icef20a59d0afc61d4e14dea01aff4786fa9e41ae
2015-11-18 14:31:37 -08:00
Paul Wilkins
85aea16f17 Merge "Changes to exhaustive motion search." 2015-11-18 11:10:13 +00:00
Scott LaVarnway
ed833048c2 VPX: x86 asm version of vpx_idct32x32_34_add()
Change-Id: Ic81f38998fb1b8d33f5a5d7424c2c41002786cef
2015-11-17 17:42:24 -08:00
James Zern
6e6dbbc67d configure: simplify x86 asm dependencies
--disable-XXX has the effect of disabling all extensions above it, e.g.,
--disable-ssse3 disables ssse3-avx2.

Change-Id: If02b44ca71ee12e4acb12010db8593a7989f2a9d
2015-11-17 16:15:57 -08:00
Zoe Liu
8a782c7eac Fixed a few sanity checks.
Change-Id: Ieec4a7be5945dc6de192e2d8292ab978baf47f53
(cherry picked from commit 2096296421)
2015-11-17 22:54:03 +00:00
paulwilkins
8ba98516fd Changes to best quality settings.
Small changes to the best quality default speed trade-off.
Some speedup settings are worthwhile even for best quality, as they
have only a very small impact on quality but a significant impact on
encode time.

These changes give as much as a further 50-60% increase in encode
speed for my test animation clip with minimal impact on quality.

For this sequence these changes improve the best quality encode speed
to about the same level as good quality speed 0 in Q3 2015, whilst
retaining the large quality gain of over 1 dB.

For many natural videos, though, the quality difference from good 0
to best is much smaller.

Change-Id: I28b3840009d77e129817a78a7c41e29cb03e1132
2015-11-17 16:20:20 +00:00
jackychen
204cde580a Enable the resize test (down & up) by changing the bitrate.
Change-Id: I5a4f1f7b9de20fbfc28cb743dcd29c0eeca736f8
2015-11-13 16:46:00 -08:00
Ralph Giles
2635573a7f Use Interlocked calls in win32 once() implementation.
This is simpler than the previous scheme, which tried to allocate
the CRITICAL_SECTION struct in a thread-safe manner before it
could use it to run the wrapped function in a thread-safe manner.

Change-Id: I172e5544e5f16403a3a0e5e2b9104b1292a0d786
2015-11-13 13:04:36 -08:00
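
A simplified sketch of a win32 once() built directly on Interlocked primitives, as this commit describes (no CRITICAL_SECTION to bootstrap). The state encoding and spin-wait are illustrative, not the exact libvpx implementation:

  #include <windows.h>

  static void once(void (*func)(void)) {
    /* 0 = not started, 1 = in progress, 2 = done */
    static volatile LONG state = 0;
    if (InterlockedCompareExchange(&state, 1, 0) == 0) {
      func();                          /* we won the race: run the function */
      InterlockedExchange(&state, 2);  /* publish completion */
    } else {
      while (state != 2) Sleep(0);     /* wait until the winner finishes */
    }
  }
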
Marco
988fd77c1f Reduce sampling time for noise estimate.
Change-Id: I46abd85e2187b8f4c2846416a23fab26d9b9f67d
2015-11-13 08:11:30 -08:00
Marco
006fd19246 Fix resize internal test.
Temporary fix to make sure it always passes.

Change-Id: I56a0529986ad7049b6090f871c14e9e06d573d5f
2015-11-13 06:22:27 -08:00
Marco Paniconi
5f5d185d01 Merge "VP9 noise estimation: add frame level motion metrics and adjust thresholds." 2015-11-13 14:09:19 +00:00
paulwilkins
0149fb3d6b Changes to exhaustive motion search.
This change alters the nature and use of exhaustive motion search.

Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.

Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.

For example:
  stage 1: Range +/- 64 interval 4
  stage 2: Range +/- 32 interval 2
  stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub-8x8 blocks,
which also contained a bug (the two searches used different distortion
metrics).

For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.

Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most clips though the quality gain and speed impact are small.

Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
2015-11-13 10:16:31 +00:00
JackyChen
6fb3d6db99 VP9 noise estimation: add frame level motion metrics and adjust thresholds.
Change-Id: Ia1aba00603b32cee6835951d3d8f740937cf20f4
2015-11-12 23:41:42 -08:00
James Zern
7501728327 Merge "libs.mk, testdata: rm redundant test of LIBVPX_TEST_DATA" 2015-11-13 06:49:00 +00:00
James Zern
34159b72d9 Merge "Add AVX vectorized vp9_diamond_search_sad" 2015-11-13 06:29:20 +00:00
Marco
419da5c734 Adjust variance threshold for 16x16 split at low resolutions.
Change-Id: I635e37f81237e9703d7d9a11ed76a043f4ec6eb0
2015-11-12 17:58:31 -08:00
Marco Paniconi
866c9357c2 Revert "Update to noise estimation."
This reverts commit 6b79a1e3e0.

Change-Id: I5a4923ca8a6de842855ce0725e92567ccbed6fb7
2015-11-13 00:13:32 +00:00
Marco
6b79a1e3e0 Update to noise estimation.
Add frame level global check and adjust some parameters.

Change-Id: I42103394f2d329781195d94ce6cbb5b3383eea17
2015-11-12 09:18:35 -08:00
Marco Paniconi
1b63238b67 Merge "Non-rd partition: reduce variance threshold for low resolutions." 2015-11-12 06:08:38 +00:00
Marco Paniconi
0941ff72a0 Merge "Adjust variance threshold for high noise condition." 2015-11-12 06:06:51 +00:00
Marco
384fc5e381 Adjust motion threshold to limit cyclic refresh.
Change-Id: Icfca27a567eb8929c312c6315856ee130d982a04
2015-11-11 18:22:21 -08:00
Marco
1827764450 Adjust variance threshold for high noise condition.
Change-Id: I91c722e480328ff95b8c57614d8176ccaceb2539
2015-11-11 18:06:21 -08:00
Marco Paniconi
4d38dbdfb5 Merge "vp9 denoiser: Add another noise level to denoising." 2015-11-11 20:40:29 +00:00
James Zern
9ecb99abf0 Merge "Revert "VPX: x86 asm version of vpx_idct32x32_34_add()"" 2015-11-11 20:39:12 +00:00
Marco
ff32369804 vp9 denoiser: Add another noise level to denoising.
Change-Id: Idc755ab54e4f78bb7d75bc97634c451804edad99
2015-11-11 11:21:26 -08:00
James Zern
0ccad4d649 Revert "VPX: x86 asm version of vpx_idct32x32_34_add()"
This reverts commit 9aeaa2016e.

This causes some test vectors to fail.

Change-Id: I3659a2068404ec5a0591fba5c88b1bec0c9059a4
2015-11-11 11:12:38 -08:00
James Zern
8f7bc45b5b Revert "VP9: Only zero counts when !frame_parallel_decoding_mode"
This reverts commit 380a5519cc.

This causes an assertion failure in debug_check_frame_counts(), which
probably isn't valid with this change; leaving the investigation for
later for now.

Change-Id: Ieda5ca811ed2fa50a0cc6935919a8d10dca996e0
2015-11-11 11:11:00 -08:00
Geza Lore
5eefd3ebfd Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (equal per component cost)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
2015-11-11 14:03:47 +00:00
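
The three table properties listed above, written out as a self-contained check. The pointers are assumed to point at the zero-motion element of tables that extend in both directions (as the mvsadcost[0][-i] indexing implies); the names follow the commit text, but the function itself is illustrative:

  #include <assert.h>

  static void check_sad_cost_tables(const int *mvjointsadcost,
                                    const int *const mvsadcost[2],
                                    int max_mv) {
    int i;
    /* joint cost: indices 1, 2 and 3 must share one value */
    assert(mvjointsadcost[1] == mvjointsadcost[2]);
    assert(mvjointsadcost[2] == mvjointsadcost[3]);
    for (i = 0; i <= max_mv; ++i) {
      assert(mvsadcost[0][i] == mvsadcost[1][i]);   /* equal per-component cost */
      assert(mvsadcost[0][i] == mvsadcost[0][-i]);  /* cost function is even */
    }
  }
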
James Zern
ec45003a8f libs.mk, testdata: rm redundant test of LIBVPX_TEST_DATA
the return value of enabled, which may be empty, is handled by the for
loop. this avoids making an unnecessarily long command line which may
fail in certain cases.

Change-Id: Ib88ecbbe2c0f6d7debb600b4caed4884497263b1
2015-11-10 17:54:51 -08:00
Marco
064a9eca49 Non-rd partition: reduce variance threshold for low resolutions.
Change-Id: I06306905d187948a92f839357df5d21413823808
2015-11-10 15:42:58 -08:00
Marco Paniconi
79a194692f Merge "Add bias to zero/small motion for noisy source." 2015-11-10 23:10:31 +00:00
James Zern
e3efed7f4c Merge "convolve_copy_sse2: replace SSE w/SSE2 code" 2015-11-10 22:35:12 +00:00
Scott LaVarnway
f48321974b Merge "VPX: x86 asm version of vpx_idct32x32_34_add()" 2015-11-10 21:40:11 +00:00
Scott LaVarnway
9aeaa2016e VPX: x86 asm version of vpx_idct32x32_34_add()
Change-Id: I8a933c63b7fbf3c65e2c06dbdca9646cadd0b7cb
2015-11-10 11:54:56 -08:00
Marco
bd6bf25969 Add bias to zero/small motion for noisy source.
Change is only for real-time mode, speed >= 5, and non-screen content mode.
Add bias to zero/low motion for big blocks, if noise estimation
is enabled and noise level is above threshold.

Change-Id: I3a0a4608ede6aa535bda6eca528d20f8aba738e7
2015-11-10 11:23:40 -08:00
James Zern
40dab58941 convolve_copy_sse2: replace SSE w/SSE2 code
this should be neutral or slightly faster on modern (P4+) architectures

Change-Id: Iec4c080275941eb8c9e05a66a2daf0405d86a69b
2015-11-09 23:45:16 -08:00
JackyChen
19272d866b VP9 noise estimate: no noise estimate if the frame size changes.
Change-Id: I521f7b53c143d562a88fe7de330aa3f0ef09f414
2015-11-09 19:18:29 -08:00
Jacky Chen
394d6c122a Merge "VP9: add unit test for realtime external resize." 2015-11-10 03:05:30 +00:00
Johann
f937114402 Merge branch 'javanwhistlingduck'
Change-Id: Ib63fde31ae7b3f71e608830f7433113733b2a275
2015-11-09 17:00:37 -08:00
jackychen
55c8843791 VP9: add unit test for realtime external resize.
Change-Id: I9bfa80de73847d9be88b6ce9865d7bb5fafaaa57
2015-11-09 16:48:18 -08:00
Jacky Chen
7155f7ab78 Merge "VP9 dynamic resize: enable resize unit test(DownUp)." 2015-11-09 22:54:53 +00:00
James Zern
e1fbc886e1 Merge "VP9: Only zero counts when !frame_parallel_decoding_mode" 2015-11-09 22:23:34 +00:00
Johann
cbecf57f3e Release v1.5.0
Javan Whistling Duck release.

Change-Id: If44c9ca16a8188b68759325fbacc771365cb4af8
2015-11-09 14:12:38 -08:00
jackychen
0465aa45ea VP9 dynamic resize: enable resize unit test(DownUp).
The unit test requires a longer clip which is already in the repo.

Change-Id: Ic42e8d83e636fafd20d485a7f5f8422835319245
2015-11-09 14:04:58 -08:00
Marco Paniconi
cdec99b243 Merge "VP9 dynamic resize: increase waiting time after key frame." 2015-11-09 21:11:51 +00:00
jackychen
3c9a424e6e VP9 dynamic resize: increase waiting time after key frame.
For 1 pass CBR mode: increase the waiting time after a key frame
before we start sampling rate control behavior for determining
resize. This change needs to disable one internal resize (DownUp)
test temporarily since it requires a longer clip to do so.

Change-Id: If21beda1be23f169ee541ab4dd642f718347887a
2015-11-09 12:04:00 -08:00
Marco Paniconi
498fd551fd Merge "Use same bias (against non-zero mv for big blocks) for speed 5." 2015-11-09 19:29:35 +00:00
Alex Converse
d1a7c10325 Merge "Expand unconstrained nodes in pack_mb_tokens and loop on zeros." 2015-11-09 18:27:40 +00:00
Scott LaVarnway
380a5519cc VP9: Only zero counts when !frame_parallel_decoding_mode
The counts are never used when frame_parallel_decoding_mode
is set.

Change-Id: Ic7a566a048297f7373c9ffbb48929ea09eff674f
2015-11-09 10:14:13 -08:00
Marco
718654848a Use same bias (against non-zero mv for big blocks) for speed 5.
Use same setting for speed 5 (as it is for speed > 5).
Change is only for real-time (non-rd) mode.

Change-Id: I830250eac654328373cb318baa89d4f0e63942e1
2015-11-09 10:09:51 -08:00
James Zern
420e8d6d03 Merge changes I8c83b86d,Ic53b2ed5,I4acc8a84
* changes:
  variance_test: create fn pointers w/'&' ref
  sixtap_predict_test: create fn pointers w/'&' ref
  sad_test: create fn pointers w/'&' ref
2015-11-07 00:57:06 +00:00
Hui Su
908fbabe4e Merge "Use accurate bit cost for uv_mode in UV intra mode RD selection" 2015-11-07 00:22:50 +00:00
Alex Converse
70eb870cfe Expand unconstrained nodes in pack_mb_tokens and loop on zeros.
Reduces Linux perf estimated cycle count for pack_mb_tokens on a
lossless encode on my desktop from 61858501855 to 48154040219 or from
26% of the overall profile to 21%.

Change-Id: I9ca3426d7e3272bc7f7030abda4f0d0cec87fb4a
2015-11-06 16:00:10 -08:00
hui su
6ab6ac450b Use accurate bit cost for uv_mode in UV intra mode RD selection
On derflr, +0.1% for VP10; however, -0.03% on VP9.

Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-11-06 14:45:43 -08:00
James Zern
eba14ddbe7 Merge "Revert "Add AVX vectorized vp9_diamond_search_sad"" 2015-11-06 22:37:20 +00:00
James Zern
30466f26b4 Revert "Add AVX vectorized vp9_diamond_search_sad"
This reverts commit f1342a7b07.

This breaks 32-bit builds:
 runtime error: load of misaligned address 0xf72fdd48 for type 'const
__m128i' (vector of 2 'long long' values), which requires 16 byte
alignment

+ _mm_set1_epi64x is incompatible with some versions of visual studio

Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
2015-11-06 13:15:01 -08:00
James Zern
837cea40fc variance_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: I8c83b86da53b1783cd18c0f765b67ba33da91d72
2015-11-06 11:04:11 -08:00
James Zern
ab5ce2e5ae sixtap_predict_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: Ic53b2ed5fbce05c5b5e633b4a4ef9ea75c55360a
2015-11-06 11:04:10 -08:00
Marco
5f041c01ed vp9: Disable noise estimate on resize trigger frame.
Change-Id: I35767a6320943582ee11d737b5f240cea2d01b25
2015-11-06 08:42:09 -08:00
James Zern
91606bbbe6 sad_test: create fn pointers w/'&' ref
this helps some toolchains (vs9) resolve the type of the parameter

Change-Id: I4acc8a844d1e55b766f66482bd6d32998174d70f
2015-11-05 23:53:24 -08:00
Marco Paniconi
d7bbe1a210 Merge "vp9: Updates to noise estimation." 2015-11-06 06:51:11 +00:00
Marco
1c724d01aa vp9: Updates to noise estimation.
Add threshold/condition on spatial_variance and brightness level.
Modification to normalization of block variance.
Change resolution limit below which we disable noise estimation.

Change-Id: If5be08a26ceda351242d8a58d2f0bc88c0a918f0
2015-11-05 18:19:01 -08:00
James Zern
892130f75b vp9_spatial_svc_encoder.sh: fix command line param
-l -> -sl, renamed in:
be3b08d [svc] Temporal svc with two pass rate control

Change-Id: I5a7b179b33d94e20e54825090659156dece928c0
2015-11-05 15:22:39 -08:00
Yunqing Wang
57cae22c1e Merge "Add AVX vectorized vp9_diamond_search_sad" 2015-11-05 20:17:13 +00:00
Geza Lore
f1342a7b07 Add AVX vectorized vp9_diamond_search_sad
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
  - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
  - For all i: mvsadcost[0][i] == mvsadcost[1][i]
        (equal per component cost)
  - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
        (Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
2015-11-05 10:02:17 +00:00
Marco Paniconi
c6641709a7 Merge "Bias against non-zero mv for large blocks." 2015-11-04 00:01:23 +00:00
Alex Converse
246e0eaa71 Deduplicate some high bit depth tables
Change-Id: I6977f7d155cc1e81ae2393933893caac6770821f
2015-11-03 15:40:44 -08:00
Marco
04a99cb36b Bias against non-zero mv for large blocks.
Change is only for real-time mode, speed > 5, and non-screen content mode.
Bias is based on block size and motion vector level (motion above some threshold).

Helps to improves stability in background from lightning changes.
PSNR/SSIM metrics on RTC set almost no change/neutral (within +/- 0.1).

Change-Id: I7eac13c1ae10be4ab1f40acc7f9f1df5653ece9d
2015-11-03 14:51:56 -08:00
Marco Paniconi
17534d2918 Merge "Update to encoder_breakout_test, for non-rd mode." 2015-11-03 22:40:53 +00:00
Yaowu Xu
5ff1008ed9 Merge "Fix a msvc warning" 2015-11-03 21:56:25 +00:00
Hui Su
3cbe767972 Merge "Generate intra prediction reference values only when necessary" 2015-11-03 20:55:14 +00:00
Marco Paniconi
73372cc09a Merge "Adjust threshold for datarate frame drop test." 2015-11-03 19:54:52 +00:00
Marco
9a7785b9d6 Update to encoder_breakout_test, for non-rd mode.
Only use non-zero threshold(s) for breakout if
the motion level of the current tested mode is low.

Change-Id: I22aae961cc42371b49d3f648560181cc54708502
2015-11-03 11:49:44 -08:00
Yaowu Xu
87e08f4d9f Fix a msvc warning
Change-Id: Id5b8f597fb275395232559fea7bfeb56912b88a1
2015-11-03 11:22:58 -08:00
Alex Converse
255bcf8697 Merge "misc fixes: Remove a wasted value." 2015-11-03 17:52:34 +00:00
Alex Converse
1796d1cc77 Merge "Add target for Mac OS X 10.11 'El Capitan'" 2015-11-03 17:50:34 +00:00
Marco
cb7b2a4f4b Adjust threshold for datarate frame drop test.
The current threshold is a little too strict.

Change-Id: I99ec1409d095e0c2fd3b7ab398742cabcc05700b
2015-11-03 08:17:21 -08:00
Jacky Chen
d73e6cef75 Merge "vpx_scale: fix the issue in msan test." 2015-11-02 23:37:23 +00:00
Alex Converse
080ad919df Add target for Mac OS X 10.11 'El Capitan'
Change-Id: I174f5b41be384894e41b8e2926cbf8fd0f8e21b2
2015-11-02 14:35:57 -08:00
Marco Paniconi
61f240c288 Merge "Move noise level estimate outside denoiser." 2015-11-02 22:08:01 +00:00
jackychen
fcb464671c vpx_scale: fix the issue in msan test.
Do memset to fix msan issue due to the access of uninitialized
memory.

BUG=https://code.google.com/p/chromium/issues/detail?id=549155

Change-Id: I02f995ede79e3574e72587cc078df1a0d11af002
2015-11-02 12:36:10 -08:00
Marco
c7da053d4b Move noise level estimate outside denoiser.
The source noise level estimate is also useful for
setting variance encoder parameters (variance thresholds,
qp-delta, mode selection, etc.), so allow it to be used even
if denoising is not on.

Change-Id: I4fe23d47607b4e17a35287057f489c29114beed1
2015-11-02 12:15:26 -08:00
hui su
16bf821dfc Move palette-based intra prediction out of misc-fixes
Change-Id: Ia59724413c4a4831390119a33d40a7d713b4b69f
2015-11-02 11:11:25 -08:00
hui su
e085fb643f Generate intra prediction reference values only when necessary
This can help increase encoding speed substantially.

Change-Id: Id0c009146e6e74d9365add71c7b10b9a57a84676
2015-11-02 10:26:50 -08:00
Marco
c2f6a7df8d vp9 denoiser: Don't estimate noise on resized trigger frame.
Change-Id: I60461f011d1aba0b1eb6584c6940f745221915f4
2015-11-02 09:11:35 -08:00
James Zern
bc98bf65e8 vp9_dx_iface: move struct defs to separate header
this avoids redefining vpx_codec_vp9_dx, vpx_codec_vp9_dx_algo in
vp9_encoder_parms_get_to_decoder.cc

Change-Id: I3b89e7a62497227ee32419f1a7d30e4c10a13c05
(cherry picked from commit ca163b85bb)
2015-10-31 12:23:53 -07:00
James Zern
8f9c9ab5c9 vp9_decodeframe.h: add missing include
Change-Id: I8ef772a016a79cab88bee8e9739530aa030baaa9
(cherry picked from commit 68ecfc1e62)
2015-10-31 12:23:53 -07:00
Debargha Mukherjee
9cafc46d9e Merge "Convert motion search config from AoS to SoA" 2015-10-30 20:57:10 +00:00
James Zern
082434b274 Merge changes I3b89e7a6,I8ef772a0
* changes:
  vp9_dx_iface: move struct defs to separate header
  vp9_decodeframe.h: add missing include
2015-10-30 05:50:58 +00:00
James Zern
ca163b85bb vp9_dx_iface: move struct defs to separate header
this avoids redefining vpx_codec_vp9_dx, vpx_codec_vp9_dx_algo in
vp9_encoder_parms_get_to_decoder.cc

Change-Id: I3b89e7a62497227ee32419f1a7d30e4c10a13c05
2015-10-29 17:55:35 -07:00
Alex Converse
d2967221d2 Merge "Make the zero handling in extend_to_full_distribution more explicit." 2015-10-30 00:37:33 +00:00
James Zern
68ecfc1e62 vp9_decodeframe.h: add missing include
Change-Id: I8ef772a016a79cab88bee8e9739530aa030baaa9
2015-10-29 16:41:25 -07:00
hui su
ede323a119 Specify feasible parameter values for lossless mode
Change-Id: I53d9719dcb81fa83fe3c920a552db5a0f1cacefa
2015-10-29 16:07:55 -07:00
Alex Converse
989193c797 Make the zero handling in extend_to_full_distribution more explicit.
The old workaround "p = 0 ? 0 : p -1" is misleading.

?: happens before =;
assigning back to p truncates to one byte.

Therefore it is equivalent to (p - 1) & 0xFF, but the check just exists
to work around a first-pass bug, so let's make the workaround more
clear.

https://bugs.chromium.org/p/webm/issues/detail?id=1089

Change-Id: I587c44dd61c1f3767543c0126376f881889935af
2015-10-29 14:46:55 -07:00
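
A one-function illustration of the equivalence described above, with a hypothetical name; because the probability is a single byte, the old ternary behaves like (p - 1) & 0xFF, and the zero check only papers over the first-pass bug mentioned in the message:

  #include <stdint.h>

  static uint8_t remap_model_prob(uint8_t p) {
    if (p == 0) return 0;     /* work around a first pass that can emit p == 0 */
    return (uint8_t)(p - 1);  /* the normal case: the byte-wide (p - 1) & 0xFF */
  }
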
Jacky Chen
039f241fc2 Merge "VP9_resizing: add limitation to the downscaling resolution." 2015-10-29 21:00:36 +00:00
Alex Converse
6f229b3e62 Merge "Shrink probability remap tables." 2015-10-29 19:58:24 +00:00
jackychen
dba2d5b3f3 VP9_resizing: add limitation to the downscaling resolution.
The width and height of the downscaling resolution should not be lower
than min_width and min_height, which can be set as needed; both
are 180 for now.

Change-Id: I34d06704ea51affbdd814246e22ee8d41d991f00
2015-10-29 09:42:44 -07:00
Jacky Chen
487023e94e Merge "VP9 decoder: Add more test vectors for resizing." 2015-10-29 16:00:15 +00:00
Marco
9cb73659d5 Update to vp9_spatial_svc_encoder.
Some fixes for rate control stats and bypass mode.

Change-Id: I28bed5467a681b8867cca55852d5d3a25d850f39
2015-10-29 08:21:10 -07:00
jackychen
d464e8a462 VP9 decoder: Add more test vectors for resizing.
Refer to doc "vp9-test-vectors".

BUG=https://code.google.com/p/webm/issues/detail?id=1086

Change-Id: I523d1f39141a3a86f113604cbdb9cd41cc2d6470
2015-10-28 21:26:00 -07:00
Marco Paniconi
9645cd4826 Merge "VP9-SVC: Allow frame dropping due to overshoot for spatial layers." 2015-10-28 21:59:17 +00:00
Alex Converse
e765969971 Merge "Revert "Replace the zero handling in extend_to_full_distribution."" 2015-10-28 20:48:52 +00:00
Johann Koenig
bb0bc06fa5 Merge "Skip AS detection when using --enable-external-build" 2015-10-28 19:18:49 +00:00
Johann Koenig
6f498956e5 Merge "Only set sysroot when alt_libc finds a directory" 2015-10-28 19:16:42 +00:00
Alex Converse
663960e757 Revert "Replace the zero handling in extend_to_full_distribution."
This reverts commit 7f56cb2978.

It causes uninitialized reads in the first pass setting up later cost tables.

Change-Id: I2df498df3f5c03eff359f79edf045aed0c618dc9
2015-10-28 11:51:40 -07:00
Hangyu Kuang
bd45af8bbb Add more resize test videos with larger resolution change intervals.
These videos change resolution every 10 frames versus every 3 frames in current
test sets.

Change-Id: Ic33f449fc9b6d2f480825d4715b8f63e70801232
2015-10-28 10:57:30 -07:00
Geza Lore
965a8dea0b Convert motion search config from AoS to SoA
This is a prerequisite for vectorizing vp9_diamond_search_sad_c.

Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410
2015-10-28 15:30:43 +00:00
Hangyu Kuang
f5f19a1fbd Merge "Add several new test vectors with small resolution." 2015-10-28 15:04:25 +00:00
Hangyu Kuang
0771a30e9e Add several new test vectors with small resolution.
Change-Id: I70b1b8162a0c9b8501358ba7d32fecd1dc020ab5
2015-10-27 17:46:48 -07:00
Marco
823a47ee3b Update to vp9-denoising.
Set increase_denoising parameter for temporal filter.

Change-Id: Id98bf160db98dfa9aedf76e20b43e6f7c783fb1c
2015-10-27 15:52:56 -07:00
Johann
a6f70b42b6 Only set sysroot when alt_libc finds a directory
Change-Id: Idc0a9adb4fb371272d6c8c98737f66c6cf209e37
2015-10-27 15:38:47 -07:00
Marco
4fb2ba2861 VP9-SVC: Allow frame dropping due to overshoot for spatial layers.
For 1 pass CBR mode.

Change-Id: I8bceb489a850ec26f05382eecb5c0c32a1bb8883
2015-10-27 14:51:47 -07:00
Alex Converse
0f059d6d65 misc fixes: Remove a wasted value.
Remove delta index 254 from probability remapping and subexp coding.
Saves 1-bit when the delta index is 129.

Change-Id: I88aba565fc766b1769165be458d2efd3ce45817e
2015-10-27 12:10:25 -07:00
Marco Paniconi
2de14eb942 Merge "Adjustments to vp9-denoising." 2015-10-27 19:10:01 +00:00
Alex Converse
a736bf6bfb Shrink probability remap tables.
Saves 2288 bytes in vp8+vp9 libvpx.a.

Change-Id: Iaa5712e59a9693ed58cea63de63781a96827e44e
2015-10-27 12:08:23 -07:00
Marco
8a2fc54508 Adjustments to vp9-denoising.
Adjust variance threshold, delta-qp, and intra penalty cost,
based on estimated noise level in source.

Replace denoising_on with a level value=L/M/H.

Change-Id: I0c017dae75a5d897367d2c42dec26f2f37e447c1
2015-10-27 10:44:19 -07:00
Yaowu Xu
c1b2d416d7 Merge "Reorder code to be consistent accross branches" 2015-10-27 17:07:50 +00:00
Alex Converse
89d10d8f3f Merge "Replace the zero handling in extend_to_full_distribution." 2015-10-27 16:54:49 +00:00
Yaowu Xu
9d8bde85cb Reorder code to be consistent accross branches
This is to make future merge a bit easier.

Change-Id: I1039de381d8fe7b9988b57c23d15d0cb5f2fcd32
2015-10-27 09:04:40 -07:00
Alex Converse
811be0df3a Fix VS build.
Add a cast on a double to unsigned assignment.

Change-Id: I4abce7cfa13e145ed0c71469844ac9b274aa1411
2015-10-26 23:13:03 -07:00
Johann
12f26bf0bc Skip AS detection when using --enable-external-build
The option exists specifically to allow for configurations
where the build environment is different from the configure
environment.

Change-Id: I95196fa3c49700251d10ff5d256dc7380e39d0c4
2015-10-26 16:43:59 -07:00
Marco Paniconi
dc9d36c0a6 Merge "Code cleanup for vp9-denoiser." 2015-10-26 20:52:16 +00:00
Paul Wilkins
cce3982c48 Merge "Incorrect frame used in KF boost loop." 2015-10-26 19:12:34 +00:00
Paul Wilkins
26abc15e04 Merge "Bug in clamping of base_frame_target." 2015-10-26 19:12:08 +00:00
Marco
f2845ed83c Code cleanup for vp9-denoiser.
Change-Id: Ibb573f50c4bf2cfb382b589803f3363db0ac1285
2015-10-26 12:04:54 -07:00
Alex Converse
7f56cb2978 Replace the zero handling in extend_to_full_distribution.
The old workaround "p = 0 ? 0 : p -1" is misleading.

The ?: operator is evaluated before the assignment, and assigning the
result back to p truncates it to one byte.

Therefore it is equivalent to (p - 1) & 0xFF. The check only exists to
work around a first-pass bug, so let's make the workaround clearer.

https://code.google.com/p/webm/issues/detail?id=1089

Change-Id: Ia6dcc8922e1acbac0eeca23a4d564a355c489572
2015-10-26 11:29:46 -07:00
Debargha Mukherjee
65dd056e41 Merge "Optimize vpx_quantize_{b,b_32x32} assembler." 2015-10-26 18:04:49 +00:00
Debargha Mukherjee
35cae7f1b3 Merge "Optimize vp9_highbd_block_error_8bit assembly." 2015-10-26 18:03:46 +00:00
Alex Converse
e34c7e3f59 Merge "palette: Replace rand() call with custom LCG." 2015-10-26 17:05:00 +00:00
Jingning Han
e1a056e163 Merge "Use explicit block position in foreach_transformed_block" 2015-10-26 16:25:56 +00:00
Alex Converse
171fd8999f palette: Replace rand() call with custom LCG.
The custom LCG is based on the POSIX recommend constants for a 16-bit
rand(). This implementation uses less computation than typical standard
library procedures which have been extended for 32-bit support, is
guaranteed to be reentrant, and identical everywhere.

Change-Id: I3140bbd566f44ab820d131c584a5d4ec6134c5a0
Ref: http://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html
2015-10-24 13:38:23 -07:00
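
For illustration, a self-contained C sketch of a reentrant 16-bit LCG built
from the constants in the POSIX rand() sample implementation (the function
name and exact form used in the palette code are assumptions here):

    #include <stdio.h>

    /* 16-bit LCG sketch using the POSIX sample rand() constants; the state is
     * caller-owned, so the generator is reentrant and identical everywhere. */
    static unsigned int lcg_rand16(unsigned int *state) {
      *state = *state * 1103515245u + 12345u;
      return (*state / 65536u) % 32768u;  /* 15-bit output, like rand() */
    }

    int main(void) {
      unsigned int seed = 1;
      int i;
      for (i = 0; i < 4; ++i) printf("%u\n", lcg_rand16(&seed));
      return 0;
    }
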
Paul Wilkins
762c0f2264 Bug in clamping of base_frame_target.
Bug relating to issue:- http://b/25090786

base_frame_target is supposed to track the idealized bit
allocation based on error score and not the actual bits
allocated to each frame.

The clamping of this value based on the VBR min and max pct values
was causing a bug where in some cases the loop that adjusts the
active max quantizer for each GF group was running out of bits at
the end of a KF group. This caused a spike in Q and some ugly artifacts.

A second change makes sure that the calculation of the active
Q range for a group DOES, however, take account of clamping.

Change-Id: I31035e97d18853530b0874b433c1da7703f607d1
2015-10-23 14:45:48 -07:00
Marco
d162934bdc VP9: Estimate noise level for denoiser.
Periodically estimate the noise level in the source, and only denoise
if the estimated noise level is above a threshold.

Change-Id: I54f967b3003b0c14d0b1d3dc83cb82ce8cc2d381
2015-10-23 11:03:30 -07:00
Jingning Han
caeb10bf06 Use explicit block position in foreach_transformed_block
Add the row and column index to the argument list of the unit functions
called by the foreach_transformed_block wrapper. This avoids repeatedly
re-deriving the position from the block index internally.

Change-Id: Ie7508acdac0b498487564639bc5cc6378a8a0df7
2015-10-23 09:19:17 -07:00
Ronald S. Bultje
f4af1a9af4 Merge "vp10: merge ext_ipred_bltr experiment into misc_fixes." 2015-10-22 21:14:20 +00:00
Ronald S. Bultje
806ae29d80 Merge "vp10: merge universal_hp experiment into misc_fixes." 2015-10-22 21:14:13 +00:00
Ronald S. Bultje
d6fc63ac31 Merge "Adjust superframe-is-optional unit test for vp10 superframe syntax." 2015-10-22 21:14:06 +00:00
Ronald S. Bultje
dbefcc0609 Merge "vp10: don't allow comp_inter_inter on keyframes." 2015-10-22 21:14:00 +00:00
Ronald S. Bultje
a857728267 Merge "vp10: fix tile size in remuxing step." 2015-10-22 21:12:44 +00:00
Ronald S. Bultje
40347d0c07 Merge "vp10: use correct constant for bw adaptation of seg pred probs." 2015-10-22 21:12:35 +00:00
Ronald S. Bultje
de4e2662d7 Merge "vp10: don't make right edge available across tile boundaries." 2015-10-22 21:12:25 +00:00
Ronald S. Bultje
69df584416 Merge "vp10: clip MVs before adding to find_ref_mvs() list." 2015-10-22 21:12:09 +00:00
Ronald S. Bultje
53dc9fd0a0 vp10: merge ext_ipred_bltr experiment into misc_fixes.
Change-Id: I2f2deb700748408b8278b7f5c29ee1f2e39785ec
2015-10-21 22:27:34 -04:00
Ronald S. Bultje
194c0a5cfb vp10: merge universal_hp experiment into misc_fixes.
Change-Id: I79fc3c0594535adc0056339c929cff69b8188760
2015-10-21 22:27:34 -04:00
Ronald S. Bultje
aa11256555 Adjust superframe-is-optional unit test for vp10 superframe syntax.
Change-Id: Ic64b6928af7ae8ecc987f845b0bf0faecdacb072
2015-10-21 22:27:28 -04:00
Paul Wilkins
4e887f032d Incorrect frame used in KF boost loop.
Fixes a bug in the calculation of the boost for key frames.

Change-Id: I75e9c96a9e86379239fbbbecb56ccd529783dc7c
2015-10-21 22:17:53 +01:00
Ronald S. Bultje
6a032503ca vp10: don't allow comp_inter_inter on keyframes.
Change-Id: Ibd0e13721a2bb71c532d20b36c42f4cccf5c5de2
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
558d93f3a5 vp10: fix tile size in remuxing step.
Change-Id: Id48fb193bbdb3afed1d0db26c4ddded65a293b1b
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
59058775fc vp10: use correct constant for bw adaptation of seg pred probs.
Change-Id: Idb869a77a126982814b8e7e288f952a65340e6be
2015-10-21 15:19:11 -04:00
Ronald S. Bultje
3d90819149 vp10: don't make right edge available across tile boundaries.
Change-Id: Ia81cf3858ef6c8d1fd4b1fb2dd9627906081129d
2015-10-21 15:19:11 -04:00
Geza Lore
aa8f85223b Optimize vp9_highbd_block_error_8bit assembly.
A new version of vp9_highbd_error_8bit is now available which is
optimized with AVX assembly. AVX itself does not buy us too much, but
the non-destructive 3 operand format encoding of the 128bit SSEn integer
instructions helps to eliminate move instructions. The Sandy Bridge
micro-architecture cannot eliminate move instructions in the processor
front end, so AVX will help on these machines.

Two further optimizations are applied:

1. The common case of computing block error on 4x4 blocks is optimized
as a special case.
2. All arithmetic is speculatively done on 32 bits only. At the end of
the loop, the code detects if overflow might have happened and if so,
the whole computation is re-executed using higher precision arithmetic.
This case however is extremely rare in real use, so we can achieve a
large net gain here.

The optimizations rely on the fact that the coefficients are in the
range [-(2^15-1), 2^15-1], and that the quantized coefficients always
have the same sign as the input coefficients (in the worst case they are
0). These are the same assumptions that the old SSE2 assembly code for
the non high bitdepth configuration relied on. The unit tests have been
updated to take this constraint into consideration when generating test
input data.

Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
2015-10-21 12:30:40 +01:00
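
A scalar C sketch of the speculative-precision idea (illustrative names; not
the AVX code itself): accumulate the squared error in 32 bits, flag a
potential overflow, and redo the sum in 64 bits only in that rare case.

    #include <stdint.h>

    /* Coefficients are assumed to lie in [-(2^15 - 1), 2^15 - 1], so each
     * squared difference fits in 32 unsigned bits; only the running sum can
     * overflow. */
    static int64_t block_error_sketch(const int16_t *coeff,
                                      const int16_t *dqcoeff, int n) {
      uint32_t sum32 = 0;
      int may_overflow = 0;
      int i;
      for (i = 0; i < n; ++i) {
        const int diff = coeff[i] - dqcoeff[i];
        const uint32_t ad = (uint32_t)(diff < 0 ? -diff : diff);
        const uint32_t sq = ad * ad;
        if (sum32 > UINT32_MAX - sq) may_overflow = 1;  /* sum would wrap */
        sum32 += sq;
      }
      if (!may_overflow) return (int64_t)sum32;
      /* Rare path: re-execute the whole computation in 64-bit arithmetic. */
      {
        int64_t sum64 = 0;
        for (i = 0; i < n; ++i) {
          const int64_t diff = coeff[i] - dqcoeff[i];
          sum64 += diff * diff;
        }
        return sum64;
      }
    }
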
Ronald S. Bultje
56cfbeefb4 Merge "vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe." 2015-10-20 19:58:00 +00:00
Ronald S. Bultje
293e20df91 vp10: clip MVs before adding to find_ref_mvs() list.
This causes the output of find_ref_mvs() to always be unique or zero.
A nice side-effect of this is that it also causes the output of
find_ref_mvs_sub8x8() to be unique-or-zero, and it will not ignore
available candidate MVs under certain conditions.

See issue 1012.

Change-Id: If4792789cb7885dbc9db420001d95f9b91b63bfa
2015-10-20 14:48:35 -04:00
Ronald S. Bultje
dec4405cfa vp10: disallow coding zero-sized tiles-in-frame/frames-in-superframe.
See issue 1088.

Change-Id: Icb15d33b4e316add848f210b50cbccd7c7847207
2015-10-20 14:48:31 -04:00
Marco
be3f2713ad Setting change in sample encoder: vpx_temporal_svc_encoder.c
Change-Id: Ifb384fa571eb08b516ed08fe05b8bca0c94b1edf
2015-10-20 10:40:20 -07:00
Hui Su
96b69deca5 Merge "VP10: some changes to palette mode" 2015-10-20 16:37:31 +00:00
Ronald S. Bultje
9897e1c27c Merge "vp10: write colorspace info for profile 0 intraonly frames." 2015-10-20 15:57:21 +00:00
Ronald S. Bultje
bafadaafbb Merge "vp10: per-segment lossless coding." 2015-10-20 15:57:12 +00:00
Ronald S. Bultje
92c4d8149a Merge "vp10: add extended-intra prediction edges experiment." 2015-10-20 15:57:05 +00:00
Ronald S. Bultje
1a64595780 Merge "vp10: allow MV refs to point outside visible image." 2015-10-20 15:56:56 +00:00
Ronald S. Bultje
4a7f012b95 Merge "vp10: allow forward updates for keyframe y intra mode probabilities." 2015-10-20 15:56:49 +00:00
Ronald S. Bultje
f441a652b7 Merge "vp10: merge keyframe/interframe uvintramode/partition probabilities." 2015-10-20 15:56:42 +00:00
Ronald S. Bultje
24517b9635 Merge "vp10: make segmentation probs use generic probability model." 2015-10-20 15:56:34 +00:00
Geza Lore
9cfba09ac0 Optimize vpx_quantize_{b,b_32x32} assembler.
Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.

This patch makes the encoder faster in both the high bitdepth and 8bit
configurations. In the high bitdepth configuration, it affects profile 0
only.

Based on my profiling using 1080p input the net gain is between 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.

Change-Id: I86714a6b7364da20cd468cd784247009663a5140
2015-10-20 10:11:19 +01:00
James Zern
849e54cedd Merge "vp8cx: remove deprecated reference/entropy controls" 2015-10-20 02:46:36 +00:00
Ronald S. Bultje
2a388b53f2 vp10: write colorspace info for profile 0 intraonly frames.
See issue 1087.

Change-Id: I231f6f12f870d0a56391daf1673536048418b207
2015-10-19 12:18:57 -04:00
James Zern
a046f56491 vp8cx: remove deprecated reference/entropy controls
VP8E_UPD_ENTROPY, VP8E_UPD_REFERENCE and VP8E_USE_REFERENCE have been
deprecated since the initial public release

Change-Id: Ied16b441eec13434d85f1ab115d49ccaf5f2f7b0
2015-10-16 17:02:36 -07:00
Ronald S. Bultje
60c58b5284 vp10: per-segment lossless coding.
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.

See issue 1035.

Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
c7dc1d78bf vp10: add extended-intra prediction edges experiment.
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.

See issue 1043.

Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
2015-10-16 19:30:39 -04:00
Ronald S. Bultje
dea998997f vp10: allow MV refs to point outside visible image.
In VP9, the ref MV had to point to a block that itself fully resided
within the visible image, i.e. all borders of the image had to be
within the visible borders of the coded frame. This is somewhat
illogical, and had obscure side effects, e.g. fairly reasonable motion
vectors such as 0,0 were clipped to negative values if the block
overhung the frame edge (such as the last rows on 1080p content),
which makes no sense whatsoever.

Instead, relax clamping constraints such that the ref MVs are allowed
to point to blocks exactly outside the visible edges in both Y as well
as UV planes, including the 8tap filter edges (that's why the offset is
8 pixels + block size).

See issue 1037.

Change-Id: I2683eb2a18b24955e4dcce36c2940aa2ba3a1061
2015-10-16 19:30:38 -04:00
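
A rough C sketch of the relaxed clamp described above (pixel units and
invented field names; the real vp10 code works in sub-pel units and uses its
own bounds): the referenced block may now lie just outside the visible frame
by the block size plus the 8-pixel filter edge.

    typedef struct { int row, col; } MV;

    static int clamp_int(int v, int lo, int hi) {
      return v < lo ? lo : (v > hi ? hi : v);
    }

    /* Allow the referenced block, including the 8-tap filter border, to sit
     * exactly outside the visible frame edges. */
    static void clamp_ref_mv_sketch(MV *mv, int bw, int bh, int x, int y,
                                    int frame_w, int frame_h) {
      const int border = 8;  /* interpolation filter edge */
      mv->col = clamp_int(mv->col, -(x + bw + border), frame_w - x + border);
      mv->row = clamp_int(mv->row, -(y + bh + border), frame_h - y + border);
    }
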
Ronald S. Bultje
1eb51a2010 vp10: allow forward updates for keyframe y intra mode probabilities.
See issue 1040 point 5.

Change-Id: I51a70b9eade39efba392a1457bd70a3c515525cb
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
d8f3bb1837 vp10: merge keyframe/interframe uvintramode/partition probabilities.
This has various benefits:
- simplify implementations because we don't have to switch between
  multiple probability tables depending on frametype
- allows fw subexp and bw adaptivity for partitions/uvmode in keyframes

See issue 1040 point 5.

Change-Id: Ia566aa2863252d130cee9deedcf123bb2a0d3765
2015-10-16 19:30:38 -04:00
Ronald S. Bultje
6e5a1165be vp10: make segmentation probs use generic probability model.
Locate them (code-wise) in frame_context, and have them be updated
as any other probability using the subexp forward and adaptive bw
updates.

See issue 1040 point 1.

TODOs:
- real-world default probabilities
- why is counts sometimes NULL in the decoder? Does that mean bw
  adaptivity updates only work on some frames? (I haven't looked
  very closely yet, maybe this is a red herring.)

Change-Id: I23b57b4e5e7574b75f16eb64823b29c22fbab42e
2015-10-16 19:30:38 -04:00
Yaowu Xu
568429512e Add a new enum type vpx_color_range_t
to make the meaning of color_range obvious.

Change-Id: I303582e448b82b3203b497e27b22601cc718dfff
2015-10-16 16:27:18 -07:00
James Zern
7dd7a7da20 vpx/*.h: add VPX_CTRL_* preproc defines
allows controls to be tested for at compile-time

Change-Id: I1cd01287dc144392956c82e6dbac003f37703039
2015-10-16 18:47:20 +00:00
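
A usage sketch of these compile-time guards (VP8E_SET_CPUUSED is only an
example control here; any control with a VPX_CTRL_* guard can be probed the
same way):

    #include "vpx/vpx_encoder.h"
    #include "vpx/vp8cx.h"

    /* The VPX_CTRL_* macro lets applications test for a control at compile
     * time instead of relying on library version checks. */
    static void maybe_set_speed(vpx_codec_ctx_t *codec, int speed) {
    #ifdef VPX_CTRL_VP8E_SET_CPUUSED
      vpx_codec_control(codec, VP8E_SET_CPUUSED, speed);
    #else
      (void)codec;
      (void)speed;
    #endif
    }
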
James Zern
9ade6e1001 Merge "vpx/*.h, cosmetics: fix some typos" 2015-10-16 18:47:08 +00:00
hui su
17c817adfc VP10: some changes to palette mode
Account for rounding in distortion calculation in k-means;
carry out rounding before duplicate removal of base colors;
replace numbers with macros;
use prefix increment.

Slight coding gain (<0.1%) on screen_content testset.

Change-Id: Ie8bd241266da6b82c7b2874befc3a0c72b4fcd8c
2015-10-16 11:41:26 -07:00
Marco
b44c5cf639 Adjustment on limiting cyclic refresh on steady blocks.
Adjust the qp threshold and consec_zeromv threshold for
limiting cyclic refresh. Also increase the refresh period
when the limit amount is significant, along with some code cleanup.

Small gain in PSNR/SSIM metrics: ~0.25/0.3 gain on RTC set, speed 7.

Change only affects non-screen content.

Change-Id: I1ced87a89a132684c071e722616e445b2d18236a
2015-10-16 10:16:44 -07:00
Yaowu Xu
1832ba7509 Restore partial changes from previous commit
This portion was tested to have no effect on asan test failures.

Change-Id: I3de1dab7479148bdffc24c4568cb2e7e9963f099
2015-10-16 00:28:37 +00:00
hui su
aaf6f6215f Fix palette mode in multi-thread encoding setting
Fix a couple of memory-related errors. Also fix thread test failures.

Change-Id: I0103995f832cecf1dd2380000321ac7204f0cfc0
2015-10-15 15:00:57 -07:00
Jacky Chen
a5d74843eb Merge "VP9_resizing: adjust the threshold and another improvement." 2015-10-15 21:35:02 +00:00
Marco Paniconi
cff15f9d3c Merge "Fix resetting of cyclic refresh on dynamic resize change." 2015-10-15 21:09:06 +00:00
JackyChen
dc002cb7b4 VP9_resizing: adjust the threshold and another improvement.
Adjust the qp threshold based on the denoising setting; do not allow
scaling directly from the original resolution to one half and vice versa.

Change-Id: I032a9b22f8e1c88de6bb81cf8351367223a3e40d
2015-10-15 09:27:22 -07:00
Marco
d6bbda4bc2 Fix resetting of cyclic refresh on dynamic resize change.
Put the reset at the right place, during the setup and prior
to updating the map.

Change-Id: I75e550ae9d8cc15081330b8857edc04c23947875
2015-10-15 09:03:51 -07:00
Marco
1a0a10cf3d VP9: Rate control update for re-encode screen-content.
For the re-encoding (at max-qp) on the detected high-content change:
update rate correction factor, reset rate over/under-shoot flags,
and update/reset the rate control for layered coding.

Change-Id: I5dc72bb235427344dc87b5235f2b0f31704a034a
2015-10-15 08:26:15 -07:00
Yaowu Xu
15cc8bc72f Merge "fix a msvc compiler warning" 2015-10-15 14:39:01 +00:00
Yaowu Xu
3e1e3ac789 Merge "Fix two asan failures" 2015-10-15 14:38:05 +00:00
Yaowu Xu
8ced62f250 fix a msvc compiler warning
Change-Id: Ifd6581c1bdb8d8f4b2ecf676c1a3d385dc129abf
2015-10-15 01:05:13 +00:00
Yaowu Xu
4727fa2a75 Fix two asan failures
Change-Id: I57865e9604ac162ef0d97deb16e81ca436a98428
2015-10-14 18:03:31 -07:00
Johann
5d5cc0d082 Check for bswap* builtins before using
Canonical builtin checks for clang are to use
__has_builtin. Much less fragile than version checks.

https://code.google.com/p/webm/issues/detail?id=1082

Change-Id: I8151fb75899acdf1a935c23aad9441da99a9abcd
2015-10-14 15:37:53 -07:00
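
The preferred feature test looks roughly like this (sketch; the guard also
has to cope with compilers that lack __has_builtin entirely):

    #include <stdint.h>

    #ifndef __has_builtin
    #define __has_builtin(x) 0  /* compilers without __has_builtin */
    #endif

    static uint32_t swap32(uint32_t v) {
    #if __has_builtin(__builtin_bswap32)
      return __builtin_bswap32(v);  /* use the builtin when available */
    #else
      return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
             ((v << 8) & 0x00ff0000u) | (v << 24);
    #endif
    }
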
Johann
ec623a0bb7 Upstream Mozilla fix for older Apple clang builds
Also use the _mm_broadcastsi128_si256 intrisic for
Apple clang versions 4.[012]

https://bugzilla.mozilla.org/show_bug.cgi?id=1085607
https://code.google.com/p/webm/issues/detail?id=1082

Change-Id: I6bc821d8163387194ef663e94bfed91fa7281d88
2015-10-14 07:41:23 -07:00
Yaowu Xu
c2b8b5bfe2 Merge "Changes to partition breakout rules." 2015-10-13 22:31:56 +00:00
paulwilkins
cdc359989a Changes to partition breakout rules.
Changes to the breakout behavior for partition selection.
The biggest impact is on speed 0 where encode speed in
some cases more than doubles with typically less than 1%
impact on quality.

Speed 0 encode speed impact examples
Animation test clip: +128%
Park Joy: +59%
Old town Cross: +109%

Change-Id: I222720657e56cede1b2a5539096f788ffb2df3a1
2015-10-13 14:19:06 -07:00
Marco Paniconi
86c16df39d Merge "VP9-SVC: Bugfix to allow skipping lower layer(s) encoding." 2015-10-13 21:09:10 +00:00
Ronald S. Bultje
567c791d01 Merge "vp10: fix compiler warning with --enable-universal_hp." 2015-10-13 19:33:05 +00:00
Hui Su
fe0396cadc Merge "Fix compiler warnings" 2015-10-13 19:30:33 +00:00
Ronald S. Bultje
fa8ba206bf vp10: fix compiler warning with --enable-universal_hp.
Change-Id: I0d7ca20bdd0fc868b28b0755e3114a4499056f45
2015-10-13 14:05:47 -04:00
Hui Su
b9e31b5163 Merge "VP10: Add palette mode part 1" 2015-10-13 17:34:27 +00:00
hui su
6f31722950 Fix compiler warnings
Change-Id: I761256a8100d83abf1b937f3739580237e3fad2a
2015-10-13 10:33:17 -07:00
Marco
1ce01eaaf7 VP9-SVC: Bugfix to allow skipping lower layer(s) encoding.
The setting of svc->spatial_layer_to_encode was missing
in VP9E_SET_SVC_LAYER_ID.

Change-Id: I015b1a64adb9ef2644d6477a02d9d9364c8462b9
2015-10-12 16:11:34 -07:00
Ronald S. Bultje
00170953b1 vp10: allow forward updates for uv_mode probabilities.
See issue 1040 point 4.

Change-Id: I79e06bd71a27f45770c760c47dc71bc3767a77a0
2015-10-12 17:51:01 -04:00
Ronald S. Bultje
5f589826f3 vp10: allow bw adaptivity for skip/tx probabilities in keyframes.
See issue 1040 point 3.

Change-Id: Ieef6d326b7fb50ceca5936525b7c688225a11fd1
2015-10-12 17:51:01 -04:00
Ronald S. Bultje
fee146e60b vp10: don't write tile size marker bit if CONFIG_MISC_FIXES=0.
Change-Id: I41b13b8767e30da391c2c4da9a729ca7292b16b9
2015-10-12 17:50:57 -04:00
Ronald S. Bultje
1799f2f81d vp10: remove ref-MV-dependent use of HP.
This change (in a new config experiment: universal_hp) removes the
bitstream parsing dependency of the HP MV bit on the ref MV to be
coded. It also cleans up clearing of the HP bit in near/nearestMV,
since HP is always on if it's set in the frame header.

This admittedly doesn't clean up the crap that could be cleaned up,
but that's mostly because I think this needs some careful review;
not so much for coding style, but more from hardware people and from
the codec team on what we/you want. It would also be nice to get some
actual numbers on the real quality impact of this change. If, for
example, hardware people come up and tell us they don't actually care
anymore, we should probably just leave this code as-is and do nothing (i.e.
discard this patch).

See issue 1036.

Change-Id: Ic9b106f34422aa0f79de0c28125b72d566bd511a
2015-10-12 14:45:18 -04:00
Ronald S. Bultje
5b4805d6e9 vp10: remove clamp_mv2() call from vp10_find_best_ref_mvs().
This actually has no effect whatsoever, since the input MVs themselves
are clamped by clamp_mv_ref() already, which is significantly more
restrictive in its bounds.

Change-Id: I4a3a7b2b121ee422c56428c2a12d930c3813c06e
2015-10-12 14:45:18 -04:00
Ronald S. Bultje
2e45ce1493 vp10: update assertion/allocation for tokens.
We only write EOSB tokens if we write tokens (i.e. not for skip blocks),
and we write EOSB tokens per-plane instead of per block.

Change-Id: I8d7ee99f8ec50eb7ae809f9f9282c1c91dbf6537
2015-10-12 14:45:18 -04:00
hui su
5d011cb278 VP10: Add palette mode part 1
Add palette mode for keyframe luma channel. Palette mode is enabled
when using "--tune-content=screen" in encoding config parameters.

on screen_content testset:  +6.89%
on derlr                 :  +0.00%

Design doc (WIP):
https://goo.gl/lD4yJw

Change-Id: Ib368b216bfd3ea21c6c27436934ad87afdaa6f88
2015-10-12 10:02:17 -07:00
James Zern
ba7ea4456f tile_worker_hook: fix -Wclobbered warning
*tile should be marked volatile like the others due to the use of
setjmp()

Change-Id: I5dbf8e6792e4c0f34a683434b4fd06e3b4c75c4b
2015-10-10 11:17:08 -07:00
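
A minimal C illustration of why such locals need volatile (a generic sketch,
not the decoder code): any automatic variable modified between setjmp() and
the corresponding longjmp() has an indeterminate value afterwards unless it
is volatile, which is also what -Wclobbered flags.

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf jb;

    static void work_or_fail(int fail) {
      if (fail) longjmp(jb, 1);  /* stands in for the decoder's error handler */
    }

    int main(void) {
      volatile int done = 0;  /* volatile so the value survives longjmp() */
      int i;
      if (setjmp(jb)) {
        printf("failed after %d steps\n", done);
        return 1;
      }
      for (i = 0; i < 4; ++i) {
        work_or_fail(i == 2);
        done = i + 1;
      }
      return 0;
    }
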
James Zern
0b74e5d7af vpx/*.h, cosmetics: fix some typos
Change-Id: Ie9ead2c665c6c065a6b922ab66bae9be63483272
2015-10-09 16:33:15 -07:00
Alex Converse
0c00af126d Add vpx_highbd_convolve_{copy,avg}_sse2
single-threaded:
swanky (silvermont): ~1% faster overall
peppy (celeron,haswell): ~1.5% faster overall

Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073
2015-10-09 11:50:25 -07:00
Alex Converse
7e77938d72 Generate convolve_test wrapper functions with a macro
Change-Id: Iccb4cdc23c1845cf9cb7d69101c9f4f43675d368
2015-10-09 11:42:05 -07:00
James Zern
65055a5fbd Merge "vp9/decode_tiles_mt: remove unnecessary local" 2015-10-09 17:52:34 +00:00
Geza Lore
cbada4a982 Remove 4 mova insts from quantize_ssse3_x86_64.asm
Change-Id: If3cb9345b44162e600e6c74873e0cb4c207fc7fb
2015-10-09 07:52:04 -07:00
Debargha Mukherjee
94bedd013e Merge "Optimization of 8bit block error for high bitdepth" 2015-10-09 13:36:47 +00:00
Geza Lore
0134764fa6 Optimization of 8bit block error for high bitdepth
If high bit depth configuration is enabled, but encoding in profile 0,
the code now falls back on optimized SSE2 assembler to compute the
block errors, similar to when high bit depth is not enabled.

Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
2015-10-08 14:05:25 -07:00
Jacky Chen
66bf686975 Merge "VP9 denoiser: use skin map to improve denoising." 2015-10-08 21:02:46 +00:00
jackychen
bafe1a2d67 VP9 denoiser: use skin map to improve denoising.
Only denoise at small motion if it's a skin block.

Change-Id: I6235cad9dd7f76ab40e7d9cdfe6180e619c20c6e
2015-10-08 12:17:25 -07:00
Ronald S. Bultje
95f8b81962 Merge "vp10: use subexp probability updates for MV probs." 2015-10-08 18:50:50 +00:00
Ronald S. Bultje
ca67339901 Merge "vp10: skip unreachable cat6 token extrabits." 2015-10-08 18:50:39 +00:00
Ronald S. Bultje
b60b15bc11 Merge "vp10: remove superframe size field for last frame in superframe." 2015-10-08 18:50:08 +00:00
Jacky Chen
0f6e9c5d3d Merge "vp9_skin_detection: fix some build warnings." 2015-10-08 18:15:10 +00:00
Ronald S. Bultje
5dd85e525d Merge "vp10: use superframe marker index/size mechanism for tile size." 2015-10-08 17:32:52 +00:00
jackychen
eaa101b502 vp9_skin_detection: fix some build warnings.
Change-Id: Ib779c083e9775dc9922ed6e104f6275bc453bef9
2015-10-08 09:51:34 -07:00
James Zern
50b20b90aa vp9/decode_tiles_mt: remove unnecessary local
reuse the common loop index

Change-Id: I9db45a93c219c2123917514cb8e9d4ea86454711
2015-10-07 17:46:13 -07:00
James Zern
a83e8ec008 Merge "vp9/tile_worker_hook: pass pbi directly" 2015-10-07 22:09:33 +00:00
James Zern
1f2acb7e40 Merge changes Iaee60826,I51cf1e39
* changes:
  vp9/tile_worker_hook: add multiple tile decoding
  invalid_file_test: loosen error check w/tile-threading
2015-10-07 22:09:21 +00:00
jackychen
b0a2ba2ffa VP9_denoiser: pass address in copy_frame to make it faster.
Change-Id: I65269ddb3ea5f911d5be38614b93c97be7e1ba76
2015-10-07 13:22:37 -07:00
Marco Paniconi
780ada18aa Merge "VP9 denoiser bug-fix: artifact caused by false buffer swap." 2015-10-07 19:08:07 +00:00
Alex Converse
061103dc82 Merge "vp9: simplify extrabits encoding" 2015-10-07 18:45:02 +00:00
James Zern
8b55eafed1 Merge "test/reg...check,video_source.h: avoid NOMINMAX redef" 2015-10-07 18:40:02 +00:00
James Zern
12de7e2a4a Merge "vpxdec: quiet signed/unsigned warning" 2015-10-07 18:26:36 +00:00
James Zern
05b4e18142 Merge changes I2965e786,I144bedde
* changes:
  vpx_memset16: drop unnecessary local
  vpx_memset16: quiet signed/unsigned warning
2015-10-07 18:26:15 +00:00
jackychen
7231c62c9f VP9 denoiser bug-fix: artifact caused by false buffer swap.
The artifact occurs periodically when the VP9 denoiser is on and
refresh_golden_frame happens. When refresh_golden_frame happens,
we should copy the frame buffer instead of swapping the pointers.

Change-Id: Ib3204c4b04db28ecf439c6d9e61f3d146f04196d
2015-10-07 11:16:15 -07:00
Marco Paniconi
d20f086be5 Merge "Move setting of refresh threshold outside loop." 2015-10-07 16:44:32 +00:00
Debargha Mukherjee
f3a73f1277 Merge "Backports highbitdepth accelerations into vp10" 2015-10-07 16:28:36 +00:00
James Zern
18bd24ba9d test/reg...check,video_source.h: avoid NOMINMAX redef
Some mingw32 configs define this. Force it on to ensure the
build succeeds.

Change-Id: I2cc490782b6a0736aa617e6a1457fc2bc984adbb
2015-10-06 23:05:15 -07:00
James Zern
fcf1609b7c vpxdec: quiet signed/unsigned warning
Change-Id: I93c56dfa547af9b2f2b96c4f85fd9862ea67af62
2015-10-06 22:56:34 -07:00
James Zern
d0f406366c vpx_memset16: drop unnecessary local
+ add a cast

Change-Id: I2965e7867223aa25bf688c988629ac57b4971905
2015-10-06 22:51:35 -07:00
James Zern
3554089838 vpx_memset16: quiet signed/unsigned warning
Change-Id: I144bedde7ea43f1b84360c1a7c8a042fd30abb6b
2015-10-06 22:48:18 -07:00
James Zern
0bd82af834 vp9/tile_worker_hook: pass pbi directly
reduces the size of TileWorkerData by reusing the storage in the worker
itself

Change-Id: If8a62fcb35167037c3da5814ab84fb81893f9cab
2015-10-06 20:14:24 -07:00
James Zern
1f4a6c8a4e vp9/tile_worker_hook: add multiple tile decoding
this reduces the number of synchronizations in decode_tiles_mt() and
improves overall performance when the number of threads is less than the
number of tiles

Change-Id: Iaee6082673dc187ffe0e3d91a701d1e470c62924
2015-10-06 20:13:54 -07:00
Marco
bc137ff67b Move setting of refresh threshold outside loop.
Small code cleanup. consec_zeromv refresh threshold
does not need to be computed for every super-block.

No change in behavior.

Change-Id: I8c4b1b28072f42b01d917fff6d1f62722f1e1554
2015-10-06 17:51:30 -07:00
James Zern
fb209003a8 invalid_file_test: loosen error check w/tile-threading
The serial decode check is too strict for tile-threaded decoding as
there is no guarantee on the decode order nor which specific error
will take precedence. Currently a tile-level error is not forwarded so
the frame will simply be marked corrupt.

Change-Id: I51cf1e39e44bedeac93746154b36a4ccb2f059b1
2015-10-06 16:40:20 -07:00
Alex Converse
2f7f482c77 vp9: simplify extrabits encoding
Change-Id: I5a2abd35cb303d8f6354b3119ab95acf90405116
2015-10-06 16:26:08 -07:00
Debargha Mukherjee
ce3f4ade67 Merge "SSSE3 optimisation for quantize in high bit depth" 2015-10-06 22:28:11 +00:00
Marco
7266bedc04 Add first_spatial_layer_to_encode to SVC.
Use the existing VP9_SET_SVC control to set the
first spatial layer to encode.

Since we loop over all spatial layers inside the encoder, the
setting of spatial_layer_id via VP9_SET_SVC has no relevance.
Use it instead to set the first_spatial_layer_to_encode,
which allows an application to skip encoding lower layer(s).

Change only affects the 1 pass CBR SVC.

Change-Id: I5d63ab713c3e250fdf42c637f38d5ec8f60cd1fb
2015-10-06 08:56:15 -07:00
Julia Robson
37c68efee2 SSSE3 optimisation for quantize in high bit depth
When configured with high bit depth enabled, the 8bit quantize
function stopped using optimised code. This made 8bit content
decode slowly. This commit re-enables the SSSE3 optimisations.

Change-Id: I194b505dd3f4c494e5c5e53e020f5d94534b16b5
2015-10-06 13:32:02 +01:00
Scott LaVarnway
b212094839 Merge "VPX: refactor vpx_idct32x32_1_add_sse2()" 2015-10-06 11:35:15 +00:00
Ronald S. Bultje
48178d2cf2 Merge "vp10: extend range for delta Q values." 2015-10-06 10:49:30 +00:00
Ronald S. Bultje
177e7b53e7 vp10: use subexp probability updates for MV probs.
See issue 1040 point 2.

Change-Id: I0b37fe74be764610696620f1fe296dc74e4806d7
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
3461e8ce64 vp10: skip unreachable cat6 token extrabits.
We have historically added new bits to cat6 whenever we added a new
transform size (or bitdepth, for that matter). However, we have
always coded these new bits regardless of the actual transform size,
which means that for smaller transforms, we code bits that cannot
possibly be set. The coding (quality) impact of this is negligible,
but the bigger issue is that this allows creating bitstreams with
coefficient values that are nonsensical and can cause int overflows,
which then de facto become part of the bitstream spec. By not coding
these bits, we remove this possibility.

See issue 1065.

Change-Id: Ib3186eca2df6a7a15ddc60c8b55af182aadd964d
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
d77a84bf52 vp10: remove superframe size field for last frame in superframe.
This is identical to what the tile size does for the last tile. See
issue 1042 (which covers generalizing the superframe/tile concepts).

Change-Id: I1f187d2e3b984e424e3b6d79201b8723069e1a50
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
7460798ba5 vp10: use superframe marker index/size mechanism for tile size.
See issue 1042. Should provide slight bitstream savings in most cases
where tiles are being used.

Change-Id: Ie2808cf8ef30b3efe50804396900c4d63a3fa026
2015-10-05 20:58:32 -04:00
Ronald S. Bultje
612104bb8d vp10: extend range for delta Q values.
See issue 1051. 6 bits is fairly arbitrary but at least allows writing
delta Q values that are fairly normal in other codecs. I can extend to
8 if people want full range, although I personally don't have any need
for that.

Change-Id: I0a5a7c3d9b8eb3de4418430ab0e925d4a08cd7a0
2015-10-05 20:58:32 -04:00
jackychen
de53e6de49 Add the check of resolution in VP9 dynamic resizing.
The resolution check fixes an issue where resize_pending was reset
unnecessarily, causing a mismatch (not bit-exact) with the previous
one-step version.

Change-Id: I4e7660b3c8f34f59781e2e61ca30d61080c322de
2015-10-05 15:39:32 -07:00
Julia Robson
5e6533e707 SSE2 optimisation for quantize in high bit depth
When configured with high bit depth enabled, the 8bit quantize
function stopped using optimised code. This made 8bit content
decode slowly. This commit re-enables the SSE2 optimisation
(but not the SSSE3 optimisation).

Change-Id: Id015fe3c1c44580a4bff3f4bd985170f2806a9d9
2015-10-05 10:59:16 -07:00
Marco Paniconi
7777e7a8d5 Merge "Fix to denoiser with dynamic resize." 2015-10-05 14:14:35 +00:00
Marco Paniconi
3da6564f90 Merge "Stabilize the encoder buffer from going too negative." 2015-10-05 14:11:43 +00:00
Scott LaVarnway
23d1c06268 VPX: refactor vpx_idct32x32_1_add_sse2()
Change-Id: Ia1a2cac0e9dc05f3207b3433a6c1589fa7f2aee3
2015-10-05 06:33:42 -07:00
JackyChen
87b2495f95 Turn on two-steps scaling in VP9 encoder dynamic resizing.
First do a 3/4 scaling and then go down to 1/2 when necessary.

Change-Id: I5689c5228ca7e1606baea7f960eb24d0dab04d4d
2015-10-02 15:27:37 -07:00
Marco
86ede50943 Fix to denoiser with dynamic resize.
Temporary fix to denoiser when dynamic resizing is on.
 -Reallocate denoiser buffers on resized frame.
 -Force golden update on resized frame.
 -Don't denoise resized frame, and copy source into denoised buffers.

Change-Id: Ife7638173b76a1c49eac7da4f2a30c9c1f4e2000
2015-10-02 11:50:57 -07:00
Marco
37293583cd Stabilize the encoder buffer from going too negative.
For screen-content mode, with frame dropper off, put a limit
on how low encoder buffer can go.

Under hard slide changes, the buffer level can go too low and then
take a long time to come back up (in particular when frame-dropping
is not used), which will affect the active_worst and target frame size.

Change-Id: Ie9fca097e05cd71141f978ec687f852daf9de332
2015-10-02 11:07:59 -07:00
Ronald S. Bultje
ce3780251c vp10: make render_width/height referenceable through ref frames.
See issue 1079.

Change-Id: I754a61ec011c3508bbb26826cf8e11dbdfdd8379
2015-10-02 13:39:38 -04:00
Ronald S. Bultje
3fedf4a59b Merge "vp10: reimplement d45/4x4 to match vp8 instead of vp9." 2015-10-02 17:15:59 +00:00
Debargha Mukherjee
f18322262f Backports highbitdepth accelerations into vp10
Ports the changes in
https://chromium-review.googlesource.com/#/c/302372/3
into vp10.

Change-Id: I334c409f693691227ad16fc703c91899592dd8dc
2015-10-02 00:57:37 -07:00
Debargha Mukherjee
cb5c47f20d Merge "Accelerated transform in high bit depth" 2015-10-02 06:55:55 +00:00
Marco Paniconi
194b374bb6 Merge "Two-steps scaling in VP9 encoder dynamic resizing." 2015-10-02 03:20:22 +00:00
jackychen
ba06be3844 Two-steps scaling in VP9 encoder dynamic resizing.
Dynamic resizing now supports two-step scaling: first go down to
3/4 and then to 1/2. This feature is under a flag which controls the
switch between two-step scaling and one-step scaling (1/2 only).

Change-Id: I3a6c1d3d5668cf8e016a0a02aeca737565604a0f
2015-10-01 18:18:49 -07:00
hui su
06bdc7f6db Small cleanup
Change-Id: I5aeaa94b743f84738d288f8b027fec4c164f2ec3
2015-10-01 11:19:13 -07:00
Scott LaVarnway
dfeaaeb0ad Merge "VP9: remove plane_type from macroblockd_plane" 2015-10-01 17:31:10 +00:00
Ronald S. Bultje
62a1579525 vp10: reimplement d45/4x4 to match vp8 instead of vp9.
This is more a proof of concept than anything else. The problem here
isn't so much how to code it, but rather where to place the resulting
code. All intrapred DSP code lives in vpx_dsp, so do we want the vp10
specific intra pred functions to live there, or in vp10/?

See issue 1015.

Change-Id: I675f7badcc8e18fd99a9553910ecf3ddf81f0a05
2015-10-01 10:11:54 -04:00
Ronald S. Bultje
b1d85bf60f vp8: align left pixel array by 16 bytes.
The x86 simd expects this. Identical alignment can be found in vp9
and vp10 also. Fixes crashes on 32bit x86 systems.

Change-Id: I229c88d8f696acbef5337c8fa9503528df4e1c40
2015-10-01 10:11:54 -04:00
James Zern
20f43ddfde Merge "sixtap_predict_test: enable NEON tests" 2015-10-01 02:10:22 +00:00
Ronald S. Bultje
31498df5f0 Merge "vp8: change build_intra4x4_predictors() to use vpx_dsp." 2015-10-01 01:01:57 +00:00
Ronald S. Bultje
12238fe851 Merge "vp8: change build_intra_predictors_mbuv_s to use vpx_dsp." 2015-10-01 01:01:45 +00:00
Ronald S. Bultje
0462172ccf Merge "vp8: change build_intra_predictors_mby_s to use vpx_dsp." 2015-10-01 00:57:37 +00:00
Ronald S. Bultje
c26a9ecaa2 vp8: change build_intra4x4_predictors() to use vpx_dsp.
I've added a few new functions (d45e, d63e, he, ve) to cover the
filtered h/v 4x4 predictors that are vp8-specific, the "correct"
d45 with the correctly filtered bottom-right pixel (as opposed to
the unfiltered version in vp9), and the "broken" d63 with weirdly
filtered bottom-right pixels (which is correctly filtered in vp9).

There may be a minor performance impact on all systems because we
have to do an extra copy of the Above pixel array to incorporate
the topleft pixel in the same array (thus fitting the vpx_dsp API).
In addition, armv6 will have a more serious performance impact because
I removed the armv6/vp8-specific assembly. I'm not sure anyone
cares...

Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86
2015-09-30 18:45:49 -04:00
Ronald S. Bultje
7cdcfee82c vp8: change build_intra_predictors_mbuv_s to use vpx_dsp.
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
2015-09-30 18:45:46 -04:00
Ronald S. Bultje
54d48955f6 vp8: change build_intra_predictors_mby_s to use vpx_dsp.
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
2015-09-30 18:45:40 -04:00
Scott LaVarnway
2f8625d824 VP9: remove plane_type from macroblockd_plane
Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec
2015-09-30 15:15:11 -07:00
Scott LaVarnway
13888e0eef Merge "VP9: remove plane_type checks in loopfilter functions" 2015-09-30 22:11:21 +00:00
James Zern
bdcfdebd68 Merge changes I264e75bf,Ifb0f41fb
* changes:
  vp9_loopfilter: remove unnecessary masks
  vp9_reset_lfm: harmonize function signature
2015-09-30 21:52:38 +00:00
James Zern
05c202a702 Merge changes I68c4f189,Ia5a752db
* changes:
  vp9_thread_test: clarify test case names
  vp9_thread_test: add non-frame-parallel files
2015-09-30 21:51:51 +00:00
James Zern
cd6d56e9a6 Merge "test/*.h: (windows) fix min/max conflict" 2015-09-30 19:55:36 +00:00
James Zern
a18cc591a5 vp9_loopfilter: remove unnecessary masks
Change-Id: I264e75bf3ddd083ee5311c50a37fb18fe634ddc3
2015-09-30 12:12:53 -07:00
James Zern
5d91201069 test/*.h: (windows) fix min/max conflict
define NOMINMAX to allow the std:: versions to be used; min/max will be
defined transitively via windows.h otherwise

Change-Id: I692b03fa3e70b7a53962d3fd209498f70f712fed
2015-09-29 23:03:26 -07:00
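
The pattern boils down to defining the macro before any windows.h include
(sketch; harmless on non-Windows builds):

    /* Keep windows.h from defining min()/max() macros so that names such as
     * std::min/std::max in the C++ test code remain usable. */
    #ifndef NOMINMAX
    #define NOMINMAX
    #endif
    #if defined(_WIN32)
    #include <windows.h>
    #endif
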
James Zern
a1914dbb31 vp9_reset_lfm: harmonize function signature
Change-Id: Ifb0f41fb43564a777be29b4c66443b366fa146a3
2015-09-29 20:46:37 -07:00
Alex Converse
aeae7fc903 Change dynamic_cast to static_cast to fix no-rtti build
Change-Id: Iad73b490b171cdda5c368ada69fb8eab2a86c156
2015-09-29 18:49:21 -07:00
Alex Converse
d2a953e02b Merge "Add a test for the interaction between active map and cyclic referesh." 2015-09-30 01:20:30 +00:00
Scott LaVarnway
18373264d9 VP9: remove plane_type checks in loopfilter functions
vp9_filter_block_plane_ss11() and vp9_filter_block_plane_non420()
are only called for the uv planes.

Change-Id: Iacd3b3242c8ce581edd37c8f06d95efc8a0f88a3
2015-09-29 15:54:33 -07:00
Scott LaVarnway
66de2b710f Merge "VP9: move loopfilter build masks to decode loop" 2015-09-29 21:40:48 +00:00
Tom Finegan
388a807e49 Merge "vpxenc: Allow non i420 input for VP10." 2015-09-29 18:56:21 +00:00
Marco Paniconi
0ca0a536f5 Merge "aq-mode for SVC: Add consec_zero_mv to layer context." 2015-09-29 17:47:39 +00:00
Tom Finegan
ed0d9dc836 vpxenc: Allow non i420 input for VP10.
BUG=https://code.google.com/p/webm/issues/detail?id=1066

Change-Id: I3bd26a516ef3d2742c523af570f639f9312df6df
2015-09-29 10:45:00 -07:00
Yaowu Xu
08ae94404f Merge "Fix a macro definition" 2015-09-29 17:22:49 +00:00
Tom Finegan
c0e2b5f473 Merge "build/make/iosbuild.sh: Remove jobs argument." 2015-09-29 17:08:55 +00:00
Marco
c05c58f8ff aq-mode for SVC: Add consec_zero_mv to layer context.
Change-Id: I63fadf1c7240d4b2893384f75c519311e9659d47
2015-09-29 10:01:53 -07:00
Yaowu Xu
45948a03c0 Fix a macro definition
to be consistent with the header file name.

Change-Id: I9634332a2b3fac7e7f3b7ef58821ea7c81c5c813
2015-09-29 09:34:42 -07:00
Scott LaVarnway
7718117104 VP9: move loopfilter build masks to decode loop
The loopfilter masks are now built in the decode loop.
This is done so we can eventually reduce the number of
MODE_INFO structs required by the decoder.

The encoder builds the masks for the entire frame prior
to calling the loopfilter.

Change-Id: Ia2146b07e0acb8c50203e586dfae0c4c5b316f11
2015-09-29 05:20:49 -07:00
Julia Robson
406030d1b0 Accelerated transform in high bit depth
When configured with high bitdepth enabled, the 8bit transform
stopped using optimised code. This made 8bit content decode slowly.

Change-Id: I67d91f9b212921d5320f949fc0a0d3f32f90c0ea
2015-09-28 21:09:16 -07:00
Marco Paniconi
7d28d12ef3 Merge "VP8: Update rate correction factor for drop_overshoot feature." 2015-09-28 19:53:10 +00:00
Marco
bd3088fd56 VP8: Update rate correction factor for drop_overshoot feature.
Update rate correction factor when we drop the frame due to overshoot.
This only applies when the drop_overshoot feature is on: screen_content_mode = 2.

Change-Id: I67e24de979b4c74744151d2ceb3cd75fec2a1e7a
2015-09-28 12:11:33 -07:00
Angie Chiang
e40a448e45 Merge "comment out fdct32" 2015-09-28 17:26:22 +00:00
Ronald S. Bultje
cc5dd3ec10 Merge "vp9/10: improve support for render_width/height." 2015-09-28 16:25:28 +00:00
Ronald S. Bultje
3db5721e21 Merge "Rename display_{size,width,height} to render_*." 2015-09-28 16:25:20 +00:00
Ronald S. Bultje
7238492235 Merge "vp10: code reference_mode in uncompressed header." 2015-09-28 16:23:11 +00:00
Ronald S. Bultje
2e3aa0587c Merge "vp10: split UV int4x4 loopfilter flag in one for each covered edge." 2015-09-28 16:23:00 +00:00
Ronald S. Bultje
812945a8f1 vp9/10: improve support for render_width/height.
In the decoder, map this to the output variable vpx_image_t.r_w/h.
This is intended as an improved version of VP9D_GET_DISPLAY_SIZE,
which doesn't work with parallel frame decoding. In the encoder,
map this to a codec control func (VP9E_SET_RENDER_SIZE) that takes
a w/h pair argument in an int[2] (identical to VP9D_GET_DISPLAY_SIZE).

Also add render_size to the encoder_param_get_to_decoder unit test.

See issue 1030.

Change-Id: I12124c13602d832bf4c44090db08c1009c94c7e8
2015-09-25 22:18:22 -04:00
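
An encoder-side usage sketch of the control described above (example values;
error handling omitted):

    #include "vpx/vpx_encoder.h"
    #include "vpx/vp8cx.h"

    /* The render size is passed as a width/height pair in an int[2],
     * mirroring the decoder-side VP9D_GET_DISPLAY_SIZE convention. */
    static vpx_codec_err_t set_render_size(vpx_codec_ctx_t *encoder) {
      int render_size[2] = { 1920, 1080 };  /* example render target size */
      return vpx_codec_control(encoder, VP9E_SET_RENDER_SIZE, render_size);
    }
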
James Zern
db2056f341 Merge "vp9/10 encoder: prevent NULL access on failure" 2015-09-26 01:52:52 +00:00
Ronald S. Bultje
36ffe64498 Rename display_{size,width,height} to render_*.
The name "display_*" (or "d_*") is used for non-compatible information
(that is, the cropped frame dimensions in pixels, as opposed to the
intended screen rendering surface size). Therefore, continuing to use
display_* would be confusing to end users. Instead, rename the field
to render_*, so that struct vpx_image can include it.

Change-Id: Iab8d2eae96492b71c4ea60c4bce8121cb2a1fe2d
2015-09-25 21:34:29 -04:00
Ronald S. Bultje
fcd6414e77 Merge "vp10: remove MACROBLOCK.{highbd_,}itxfm_add function pointer." 2015-09-26 01:20:14 +00:00
Ronald S. Bultje
690f662e26 Merge "vp10: remove MACROBLOCK.fwd_txm4x4 function pointer." 2015-09-26 01:19:49 +00:00
Angie Chiang
6a382101dd comment out fdct32
comment out fdct32
remove fdct32 test

Change-Id: I31c47fb435377465cd3265e39621ca50d3aae656
2015-09-25 18:18:27 -07:00
Ronald S. Bultje
8979e9e387 vp10: code reference_mode in uncompressed header.
See issue 1041 point 2.

Change-Id: I6fc6427b1a0edff828e39d43428e3271491f8ac5
2015-09-25 20:32:14 -04:00
Ronald S. Bultje
034c28b0a4 vp10: split UV int4x4 loopfilter flag in one for each covered edge.
In practice, this fixes the issue that if you have an odd number of
mi_cols, on the full right of the image, the UV int4x4 loopfilter
will be skipped over odd cols as well as odd rows (because it holds a
single variable for both edges).

See issue 1016.

Change-Id: Id53b501cbff9323a8239ed4775ae01fe91874b7e
2015-09-25 20:25:10 -04:00
James Zern
b945a532e5 Merge "Revert "remove static from fdct4/8/16/32"" 2015-09-26 00:12:43 +00:00
Ronald S. Bultje
bab8d38f7f vp10: remove MACROBLOCK.{highbd_,}itxfm_add function pointer.
This is preparatory work for allowing per-segment lossless coding.

See issue 1035.

Change-Id: I9487d02717ee3e766aee61a487780056bb35d2d3
2015-09-25 19:30:46 -04:00
Ronald S. Bultje
c74b33a413 vp10: remove MACROBLOCK.fwd_txm4x4 function pointer.
This is preparatory work for allowing per-segment lossless coding.

See issue 1035.

Change-Id: Idd72e2a42d90fa7319c10122032d1a7c7a54dc05
2015-09-25 19:30:46 -04:00
Tom Finegan
c6a419b490 build/make/iosbuild.sh: Remove jobs argument.
This can be handled via MAKEFLAGS.

Change-Id: I3a58a8a41f6570cb3b80c7c97e51735b82bf4ec9
2015-09-25 15:18:17 -07:00
Tom Finegan
7602232642 Merge "build/make/configure.sh: Embed bitcode in arm darwin targets." 2015-09-25 22:14:38 +00:00
Alex Converse
35fb3441f8 Add a test for the interaction between active map and cyclic referesh.
Fails with Icac63051bf37c7355e661837b57c257d58c764fc reverted.

Change-Id: I460d7a5a74faa4daace25f911f8dc5f68e16c951
2015-09-25 13:04:00 -07:00
James Zern
e7c949d32d Merge "vp9/10 decoder_remove: check pbi pointer" 2015-09-25 19:31:07 +00:00
Marco Paniconi
040395b944 Merge "VP8: Adjust rate correction factor for drop due to overshoot." 2015-09-25 18:59:58 +00:00
Marco
3f7656cc23 Limit cyclic refresh on steady background blocks.
Use the existing QP condition for limiting cyclic refresh, and add an
additional condition that the block has been encoded with zero/small
motion for x frames in a row (where x is at least several times the
refresh period). The additional condition only affects non-screen
content mode.

This helps to improve visual stability for noisy input, where on steady
background areas the application of delta_qp may lead to encoding the noise.

Also added a change to use the true skip (after encoding) to update the
last QP.

Change-Id: I234a1128d017d284cf767fdb58ef6c59d809f679
2015-09-25 10:40:35 -07:00
James Zern
7e54f0fe4b Merge "configure: reference the README for missing yasm" 2015-09-25 03:22:15 +00:00
James Zern
921c347ef6 vp9/10 decoder_remove: check pbi pointer
fixes crash on error

Change-Id: Ibb1ef5565fb833cdee1a49335473d98f1187ef43
2015-09-24 19:51:14 -07:00
Jacky Chen
ee72b6915e Merge "Change size on first frame and change config cause crash." 2015-09-25 01:04:07 +00:00
Marco
ece841f03f VP8: Adjust rate correction factor for drop due to overshoot.
Change-Id: Id70ca2e18a46247720eb631ae13a8430bd8b0954
2015-09-24 16:40:29 -07:00
Tom Finegan
9194f3c0cb build/make/configure.sh: Embed bitcode in arm darwin targets.
When the iOS SDK major version is 9 or higher:
- Pass -fembed-bitcode to compiler, assembler, and linker.
- Add a warning for simulator targets since yasm doesn't know
  what -fembed-bitcode means, and exits with an error.

BUG=https://code.google.com/p/webm/issues/detail?id=1075

Change-Id: I38c997a0225e53c5dd1b4ddf7935d21362953f76
2015-09-24 15:11:15 -07:00
Tom Finegan
20b770eecd Merge "build/make/configure.sh: Silence arm target Xcode7 link warnings." 2015-09-24 18:44:32 +00:00
Tom Finegan
4327d50904 Merge "build/make/configure.sh: Fix armv7 builds in Xcode7." 2015-09-24 18:44:23 +00:00
Tom Finegan
324bcbfaed build/make/configure.sh: Silence arm target Xcode7 link warnings.
Always add IOS_VERSION_MIN to darwin arm cflags. The warning occurred
because the default (9.0) does not match the value set by configure
(6.0).

BUG=https://code.google.com/p/webm/issues/detail?id=1075

Change-Id: Ia9085ceeca10e057f9eb781c14f07581bb6280a5
2015-09-23 18:42:32 -07:00
Tom Finegan
6cf994b924 build/make/configure.sh: Fix armv7 builds in Xcode7.
- Use the iphoneos SDK path (instead of macosx).
- Detect iOS SDK major version and disable media (armv6) when using
  iOS SDK version 9 or higher.

BUG=https://code.google.com/p/webm/issues/detail?id=1075

Change-Id: I12f77dbeee4c0084e8322f6841813da8b5e91c16
2015-09-23 18:42:21 -07:00
Tom Finegan
6002212d2b build/make/configure.sh: docs for soft_{dis|en}able.
Add function comments explaining what the functions do and do not do.

Change-Id: I23dea09f93bc5cdbea6a0077f90683a1df2f74dc
2015-09-23 18:34:40 -07:00
James Zern
078312979e vp9_thread_test: clarify test case names
rename Decode[2-4] to something more precise

Change-Id: I68c4f189796eb11ac1a5b7b682f24efb71708187
2015-09-23 18:31:36 -07:00
James Zern
f8a5ab5257 vp9_thread_test: add non-frame-parallel files
these have been supported in tile-threaded decoding since:
b3b7645 vp9_dthread: remove frame_parallel_decoding_mode requirement

Change-Id: Ia5a752db9be937153cf4830d9258752136356d1b
2015-09-23 18:31:35 -07:00
James Zern
cf8f6559ce vp9/10 encoder: prevent NULL access on failure
Change-Id: I1fc8e0b3d48675cd5428b7b36f7cc28ab32cbf71
2015-09-23 17:55:51 -07:00
James Zern
f3627c82d0 configure: reference the README for missing yasm
Change-Id: I2ad799901385011764affadeaddcc271df21509f
2015-09-23 17:51:42 -07:00
James Zern
e7c8b71a86 Revert "remove static from fdct4/8/16/32"
This reverts commit 8903b9fa83.

there is no reason for these to be global

Change-Id: I66a31c06f8426aeca348ef12d9b9ab59d6d5e55d
2015-09-23 17:45:57 -07:00
James Zern
af631e1f19 Merge "VP9: Remove frame_parallel_decoding_mode from macroblockd" 2015-09-24 00:33:16 +00:00
Marco Paniconi
30bd74cf74 Merge "Non-rd mode: Limit transform size for intra to 16x16." 2015-09-24 00:12:02 +00:00
Scott LaVarnway
5404978825 VP9: Remove frame_parallel_decoding_mode from macroblockd
Not used.

Change-Id: I71527d0ee43a5730f1a2527e7ab687a77a137db4
2015-09-23 16:06:46 -07:00
Hui Su
d5683faab9 Merge "Adjust rd calculation in choose_tx_size_from_rd" 2015-09-23 21:39:43 +00:00
Marco
9b51b3a9ca Adjust rate-boost threshold in cyclic refresh for seg#2.
Small gain in metrics (average ~0.2dB), small
reduction in rate fluctuation.

Change-Id: Id75bd89c168486f075308fb474ebd26e3bdfb85b
2015-09-23 11:52:55 -07:00
Marco
01860f6fe4 Non-rd mode: Limit transform size for intra to 16x16.
Limit transform size for intra to 16x16, for non-screen content mode.
Little/no change in speed or metrics.
The 32x32 intra block is rarely selected in the RTC (non-screen content) case,
but some visual improvement can be seen in some examples,
e.g., captured_video_dark_whd.yuv.

Change-Id: I68e2db87875343b3fb9bb407a7709f0088f84072
2015-09-23 10:59:24 -07:00
hui su
38cc168822 Adjust rd calculation in choose_tx_size_from_rd
Coding gain:
derflr 0.142%
hevclr 0.153%
hevcmr 0.124%

Change-Id: I63b56ae3a9002c3a266e10e2964135ed43b0ba53
2015-09-23 10:54:28 -07:00
Johann
90a109f0ee Restrict get_msb inputs
Add a warning and assert that inputs for get_msb must not be zero.

Change-Id: I8c6f289ff13248f6e3a8bc24aab3712ed33022a6
2015-09-22 00:24:01 +00:00
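
A sketch of the constrained helper (the libvpx version is built on
__builtin_clz; the portable branch here is for illustration only):

    #include <assert.h>

    /* Index of the most significant set bit; undefined for n == 0, hence the
     * assert added by the change above. */
    static int get_msb_sketch(unsigned int n) {
      assert(n != 0);
    #if defined(__GNUC__) || defined(__clang__)
      return 31 ^ __builtin_clz(n);
    #else
      {
        int msb = 0;
        while (n >>= 1) ++msb;
        return msb;
      }
    #endif
    }
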
Angie Chiang
36c4e8b27a Merge "remove static from fdct4/8/16/32" 2015-09-21 23:25:26 +00:00
Johann Koenig
90889b9a45 Merge "Remove vpx_filter_block1d16_v8_intrin_ssse3" 2015-09-21 19:17:18 +00:00
Angie Chiang
8903b9fa83 remove static from fdct4/8/16/32
remove static from fdct4/8/16/32 in vp10/encoder/dct.c
add prefix vp10_ to fdct4/8/16/32
add vp10/encoder/dct.h

Change-Id: I644827a191c1a7761850ec0b1da705638b618c66
2015-09-21 11:49:10 -07:00
Marco Paniconi
ce2b56cd69 Merge "Non-rd pickmode: Don't skip checking zeromv-last mode." 2015-09-21 18:26:28 +00:00
jackychen
55f092db09 Change size on first frame and change config cause crash.
Reallocation of mi buffer fails if change size on the first frame and
change config in subsequent frames. Add a condition for resolution
check to avoid assertion failure.

BUG=1074

Change-Id: Ie26ed816a57fa871ba27a72db9805baaaeaba9f3
2015-09-21 10:57:05 -07:00
Marco
38ad2dcea6 Non-rd pickmode: Don't skip checking zeromv-last mode.
Reference frame masking logic may skip checking zeromv-last mode.
Fix to avoid this and make sure zero-last is always checked.

No noticeable change in speed, and PSNR/SSIM metrics on RTC set overall
neutral (very small gain ~0.02).
Small visual improvement on a few RTC clips.

Change-Id: I26eacdc449126424001a4a64e5ac31949f064417
2015-09-21 10:32:23 -07:00
Jingning Han
67ec82a262 Merge "Create sub8x8 block inter prediction function" 2015-09-21 16:13:37 +00:00
James Zern
571b7c978e vp9_end_to_end_test: disable vp10 w/high bitdepth
the range check in dct.c (abs(input[i]) < (1 << bit)) will fail in many
cases. this was broken at the time this check was added

BUG=1076

Change-Id: I3df8c7a555e95567d73ac16acda997096ab8d6e2
2015-09-19 09:14:18 -07:00
Jingning Han
d6be2671ed Create sub8x8 block inter prediction function
Change-Id: Ib161e6fb3eb081f7176a1d969fed16a7d1ffc320
2015-09-18 16:31:36 -07:00
James Zern
57694362e0 Merge "configure: add --extra-cxxflags option" 2015-09-18 23:18:10 +00:00
Johann
dd4f953350 Remove vpx_filter_block1d16_v8_intrin_ssse3
This was rewritten and moved to vpx_dsp/x86/vpx_subpixel_8t_ssse3.asm
in 195883023b

Change-Id: I117ce983dae12006e302679ba7f175573dd9e874
2015-09-18 16:05:43 -07:00
Tom Finegan
cd82f69823 Merge "iosbuild: Enable PIC for x86 targets." 2015-09-18 19:38:55 +00:00
Tom Finegan
9a8fe58caf Merge "iosbuild: Add --test-link argument." 2015-09-18 19:38:45 +00:00
James Zern
e00470aef8 vp9_arf_freq_test: disable vp10 w/high bitdepth
the range check in dct.c (abs(input[i]) < (1 << bit)) will fail in the
25-29 range. this was broken at the time this check was added

Change-Id: I8ca9607f6cbdc8be7f47696ffeabbab3ac5727e2
2015-09-17 20:17:35 -07:00
Jingning Han
48b8023ef0 Merge "Refactor mbmi_ext structure" 2015-09-18 00:49:14 +00:00
Tom Finegan
69ceed8e3a iosbuild: Enable PIC for x86 targets.
Change-Id: I03b1e8f983f8cd87519aefda732210359b319c81
2015-09-17 16:22:07 -07:00
Tom Finegan
01276f4453 iosbuild: Add --test-link argument.
Shortcut arg for --extra-configure-args --enable-examples. Enables
the examples, and thus ensures that all versions of libvpx that
iosbuild.sh produces can actually be linked.

Change-Id: I2ddda094361bf0ac77f8d2ae542e4dc7b2cab158
2015-09-17 16:21:22 -07:00
Marco Paniconi
e12ec3615c Merge "Add SVC codec control to set frame flags and buffer indices." 2015-09-17 22:29:07 +00:00
James Zern
9d8decc162 Merge changes from topic 'tile-thread-cleanup'
* changes:
  vp9/decode_tiles_mt: move frame count accum from loop
  VP9Decoder: remove duplicate tile_worker_info
  vp9/decode_tiles_mt: move some inits from inner loop
  vp9_accumulate_frame_counts: pass counts directly
2015-09-17 22:00:23 +00:00
James Zern
e665d0bdd9 Merge "vpx_subpixel_8t_ssse3: fix reg counts/access" 2015-09-17 21:31:14 +00:00
James Zern
683b5a3161 vpx_subpixel_8t_ssse3: fix reg counts/access
fixes build on windows x64; previously 'heightq', i.e. the 64-bit register,
was accessed when only the 32-bit value was needed. given this comes from a
stack variable, the upper bits were undefined.

+ bump register/xmm counts; users of SETUP_LOCAL_VARS touch xmm13 in
64-bit builds and filter_block1d16_v* uses one extra temp variable

Change-Id: I9c768c0b2047481d1d3b11c2e16b2f8de6eb0d80
2015-09-17 12:27:34 -07:00
Jingning Han
c3bf837572 Refactor mbmi_ext structure
This commit removes mbmi_ext_base pointer from MACROBLOCK struct.
Its use case can be fully covered by cpi->mbmi_ext_base pointer.

Change-Id: I155351609336cf5b6145ed13c21b105052727f30
2015-09-17 09:51:45 -07:00
Marco
730cdefd3e Add SVC codec control to set frame flags and buffer indices.
Add SVC codec control to set the frame flags and buffer indices
for each spatial layer of the current (super)frame to be encoded.
This allows the application to set (and change on the fly) the
reference frame configuration for spatial layers.

Added an example layer pattern (spatial and temporal layers)
in vp9_spatial_svc_encoder for the bypass_mode using the new control.

Change-Id: I05f941897cae13fb9275b939d11f93941cb73bee
2015-09-17 09:37:15 -07:00
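A hedged usage sketch of the new control: the struct and field names below
appear in the vp9_spatial_svc_encoder changes further down in this diff, but
the control enum name is an assumption based on the feature description, not
a verbatim API reference.

  #include <string.h>
  #include "vpx/vp8cx.h"
  #include "vpx/vpx_encoder.h"

  /* Restrict spatial layer 0 to referencing and updating only the LAST
   * buffer, then hand the per-layer configuration to the encoder before
   * encoding the superframe. */
  static void set_layer0_refs(vpx_codec_ctx_t *codec) {
    vpx_svc_ref_frame_config_t ref_cfg;
    memset(&ref_cfg, 0, sizeof(ref_cfg));
    ref_cfg.frame_flags[0] = VP8_EFLAG_NO_REF_GF | VP8_EFLAG_NO_REF_ARF |
                             VP8_EFLAG_NO_UPD_GF | VP8_EFLAG_NO_UPD_ARF;
    ref_cfg.lst_fb_idx[0] = 0;
    ref_cfg.gld_fb_idx[0] = 0;
    ref_cfg.alt_fb_idx[0] = 0;
    vpx_codec_control(codec, VP9E_SET_SVC_REF_FRAME_CONFIG, &ref_cfg);
  }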
Ronald S. Bultje
50f944272c vp10: do sub8x8 block reconstruction in full subblocks.
This means that we don't reconstruct in 4x4 dimensions, but in
blocksize dimensions, e.g. 4x8 or 8x4. This may in some cases lead
to performance improvements. Also, if we decide to re-introduce
scalable coding support, this would fix the fact that you need to
re-scale the MV halfway through the block in sub8x8 non-4x4 blocks.

See issue 1013.

Change-Id: If39c890cad20dff96635720d8c75b910cafac495
2015-09-16 19:35:54 -04:00
Ronald S. Bultje
ed29c2f945 vp10: fix 4:2:2 chroma MVs for 8x4/4x4 blocks.
In vp9, the bottom MV would be the average of the topright and
bottomleft luma MV (instead of the bottomleft/bottomright luma MV).

See issue 993.

Change-Id: Ic91c0b195950e7b32fc26c84c04788a09321e391
2015-09-16 19:35:54 -04:00
Ronald S. Bultje
43be86dbff vp10: remove double MV value check.
This has virtually no effect on coding efficiency, but it is more
logical from a theoretical perspective (since it makes no sense to
me that you would exclude an MV from a list just because its sign-
inversed value is identical to a value already in the list), and it
also makes the code simpler (it removes a duplicate value check in
cases where signbias is equal between the two MVs being compared).

See issue 662.

Change-Id: I23e607c6de150b9f11d1372fb2868b813c322d37
2015-09-16 19:35:53 -04:00
Ronald S. Bultje
00a203b7bc vp10: move coding of tx_mode element to the non-arithcoded header.
See issue 1040 point 3.

Change-Id: If051b92c24a34d6a39861fb7d7180c5ca32f3d82
2015-09-16 19:35:53 -04:00
Ronald S. Bultje
a3df343cda vp10: code sign bit before absolute value in non-arithcoded header.
For reading, this makes the operation branchless, although it still
requires two shifts. For writing, this makes the operation as fast
as writing an unsigned value, branchlessly. This is also how other
codecs typically code signed, non-arithcoded bitstream elements.

See issue 1039.

Change-Id: I6a8182cc88a16842fb431688c38f6b52d7f24ead
2015-09-16 19:35:03 -04:00
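A minimal sketch of the branchless read described above, assuming the value
arrives as a (bits + 1)-bit field with the sign bit first; the names are
illustrative and this is not the exact libvpx routine.

  #include <stdio.h>

  /* Sign-extend the raw (bits + 1)-bit field with two shifts; the arithmetic
   * right shift propagates the sign bit, so no branch on the sign is needed. */
  static int inv_signed_from_field(unsigned field, int bits) {
    const int shift = (int)(sizeof(int) * 8) - (bits + 1);
    return ((int)(field << shift)) >> shift;
  }

  int main(void) {
    printf("%d\n", inv_signed_from_field(0x13, 4));  /* 5-bit 10011 -> -13 */
    return 0;
  }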
Ronald S. Bultje
3c8e04e939 Merge "vp10: don't reset contextual skip flag if block has no coefficients." 2015-09-16 20:55:14 +00:00
Ronald S. Bultje
623279169a Merge "Add support for color-range." 2015-09-16 20:26:10 +00:00
Jacky Chen
c21ce82832 Merge "VP9 dynamic resizing unit test with bitrate change." 2015-09-16 16:55:14 +00:00
Ronald S. Bultje
a5d930e464 vp10: don't reset contextual skip flag if block has no coefficients.
The implicitly changed value would be used for contextualizing future
skip flags of neighbour blocks (bottom/right), which is certainly not
what was intended. The original code stems from vp8, and was useful
in cases where coding of the skip flag was disabled. In vp9, the skip
flag is always coded. The result of this change is that for bitstream
parsing purposes, decoding of the skip flag becomes independent of
decoding of block coefficients.

See issue 1014.

Change-Id: I8629e6abe76f7c1d649f28cd6fe22a675ce4a15d
2015-09-16 06:41:51 -04:00
Ronald S. Bultje
eeb5ef0a24 Add support for color-range.
In decoder, export (eventually) into vpx_image_t.range field. In
encoder, use oxcf->color_range to set it (same way as for
color_space).

See issue 1059.

Change-Id: Ieabbb2a785fa58cc4044bd54eee66f328f3906ce
2015-09-16 06:41:46 -04:00
Ronald S. Bultje
e562c71783 Merge "vp10: fix entropy counts for the hp bit." 2015-09-16 01:53:44 +00:00
jackychen
ca8f8fd304 VP9 dynamic resizing unit test with bitrate change.
Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
Start at low target bitrate, raise the bitrate in the middle of the
clip, and verify that scaling up does occur after the bitrate change.

Change-Id: I7ad8c9a4c8288387d897dd6bdda592f142d8870c
2015-09-15 18:03:26 -07:00
Angie Chiang
8c1dce86e8 Merge "fix implicit declaration" 2015-09-16 00:20:43 +00:00
James Zern
c667593e1e Merge changes from topic 'fix-vp9-bitstream-test'
* changes:
  vp9_encoder_parms_get_to_decoder: cosmetics
  vp9...parms_get_to_decoder: remove unneeded func
  vp9...parms_get_to_decoder: fix EXPECT param order
  vp9_encoder_parms_get_to_decoder: delete dead code
  fix BitstreamParms test
  vp9_encoder_parms_get_to_decoder: remove vp10
  yuvconfig2image(): add explicit cast to avoid conv warning
  vp9/10 decoder_init: add missing alloc cast
  vp9/10: set color_space on preview frame
  vp10: add extern "C" to headers
  vp9: add extern "C" to headers
2015-09-15 23:14:34 +00:00
Marco Paniconi
9c82fc457e Merge "VP9 dynamic resizing unit test." 2015-09-15 22:26:03 +00:00
Marco Paniconi
f6097ef243 Merge "SVC fix to set worst/best_quality per layer." 2015-09-15 22:06:13 +00:00
jackychen
9ac42bc15c VP9 dynamic resizing unit test.
Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
Run at low bitrate, with resize_allowed = 1, and verify that we get
one resize down event.

Change-Id: Ic347be60972fa87f7d68310da2a055679788929d
2015-09-15 14:36:55 -07:00
Marco
15c43d9ac7 SVC fix to set worst/best_quality per layer.
Allow the worst/best_quality to be set per layer via the
VP9E_SET_SVC_PARAMETERS control.

Change-Id: Icba5ec8ac757152f3bb7860d6010d9174a7bd578
2015-09-15 14:16:07 -07:00
Marco
eb53c69ece Add cyclic refresh parameters to svc-layer context.
For 1 pass CBR spatial-SVC:
Add cyclic refresh parameters to the svc-layer context.

This allows cyclic refresh (aq-mode=3) to be applied to
the whole super-frame (all spatial layers).
This gives a performance improvement for spatial layer encoding.

Add the aq_mode on/off setting as a command line option.

Change-Id: Ib9c3b5ba3cb7851bfb8c37d4f911664bef38e165
2015-09-15 10:06:36 -07:00
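As the vp9_spatial_svc_encoder diff further down shows, the new setting is
exposed as an -aq / --aqmode argument and, together with speed >= 5, appears
to gate enabling aq-mode 3 on the encoder; a hedged invocation would simply
add "-aq 1" to the existing example encoder command line.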
Debargha Mukherjee
0e1b4fb941 Fix two pass svc encoding
Fixes temporal scalability. Updates were inadvertently turned
off for two-pass SVC, causing crashes due to gf_group.index
growing unchecked.

Change-Id: Iff759946bf61bbde70630347cc8fa4d51a8c2d2f
2015-09-15 06:11:24 -07:00
Yaowu Xu
e723e36da6 Merge "Remove leftover of "frame_parallel_decoding"" 2015-09-15 02:24:36 +00:00
Yaowu Xu
ee825f9372 Remove leftover of "frame_parallel_decoding"
The variable was removed by a previous commit, but this
instance was missed.

Change-Id: Ia34474b0be4945cc6cb9191f0d7cd24a99a4c22e
2015-09-14 18:10:08 -07:00
jackychen
419456617e Change parameters for VP9 dynamic resizing.
Use a smaller window in dynamic resizing and wait a shorter
time after a key frame.

Change-Id: I086f840cdec3c6bdaa9acfe11346d919e445973d
2015-09-14 16:17:52 -07:00
Alex Converse
0b762e0c0c Merge "CR: Don't attempt to read qindex_delta for segments CR is unaware of." 2015-09-14 22:59:24 +00:00
Alex Converse
575e81f7c9 CR: Don't attempt to read qindex_delta for segments CR is unaware of.
Found by the remoting VideoEncoderVpxTest.Vp9LossyUnchangedFrame unit
test under asan.

Change-Id: Icac63051bf37c7355e661837b57c257d58c764fc
2015-09-14 13:55:30 -07:00
Marco Paniconi
bb581f4e83 Merge "For 1 pass: always use the normative filter in vp9_scale_if_required()" 2015-09-14 20:36:34 +00:00
Ronald S. Bultje
1e9e9ce2dc vp10: fix entropy counts for the hp bit.
The counts didn't take usehp into account, which means that if the
scope of the refmv is too large for the hp bit to be coded, the value
(always 1) is still included in the stats. Therefore, the final
counts will not reflect the entropy of the coded bits, but rather the
entropy of the combination of coded bits and the implied value (which
is always 1). Fix that by only including counts if the hp bit is
actually coded.

See issue 1060.

Change-Id: I19a3adda4a8662a05f08a9e58d7e56ff979be11e
2015-09-14 16:13:59 -04:00
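A minimal sketch of the counting fix, with assumed type and field names
rather than the verbatim vp10 structures:

  typedef struct {
    unsigned int hp[2];  /* stats for the high-precision bit */
  } hp_counts;

  /* Only fold the hp bit into the stats when it was actually coded; when
   * high precision is disallowed for this reference MV the bit is implied
   * and contributes no entropy. */
  static void count_hp_bit(hp_counts *c, int hp_bit, int usehp) {
    if (usehp) ++c->hp[hp_bit];
  }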
Marco
4d1424faf9 For 1 pass: always use the normative filter in vp9_scale_if_required()
The normative (convolve8) filter is optimized/faster than
the non-normative one. Pass the choice of scaler (normative/non-normative)
to vp9_scale_if_required(), and always use the normative one for 1 pass.

Change-Id: I2b71d9ff18b3c7499b058d1325a9554de993dd52
2015-09-14 13:13:32 -07:00
Ronald S. Bultje
48f0168e95 Merge "vp10: merge frame_parallel_decoding_mode and refresh_frame_context." 2015-09-14 18:24:24 +00:00
James Zern
12355c4c4c configure: add --extra-cxxflags option
same usage as --extra-cflags

Change-Id: Iff2ed7b8ebb6e51610ee0851aeec08413367ab23
2015-09-12 10:25:28 -07:00
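A hedged usage example: ./configure --extra-cxxflags="-std=c++11" appends the
flag to CXXFLAGS (the particular flag here is illustrative), exactly as
--extra-cflags does for CFLAGS; per the configure hunk further down, the
build dies if the compiler rejects the requested flags.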
Angie Chiang
fe776ce61f add range_check for fdct in vp10
Unify the style of fdct4() fdct8() fdct16()
Add fdct32()
Add range_check() at each stage
Add unit test at ../../test/vp10_dct_test.cc

Change-Id: I13f76d9046c3ea473c82024b09a5bc8662e2c28e
2015-09-12 03:26:09 +00:00
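A minimal sketch of the kind of per-stage check described above, assuming a
32-bit intermediate coefficient type; symbol names are illustrative rather
than the exact vp10 helpers.

  #include <assert.h>
  #include <stdlib.h>

  typedef int tran_low_t;  /* assumption: 32-bit intermediate coefficients */

  /* After each butterfly stage, every intermediate value must stay below
   * 1 << bit in magnitude (compare the abs(input[i]) < (1 << bit) check
   * referenced by the test commits above). */
  static void range_check(const tran_low_t *input, int size, int bit) {
    int i;
    for (i = 0; i < size; ++i) assert(abs(input[i]) < (1 << bit));
  }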
James Zern
9759c3d542 third_party/libwebm: pull from upstream.
Upstream hash: 476366249e1fda7710a389cd41c57db42305e0d4

Changes from upstream since last update:
4763662 mkvparser: fix type warnings
267f71c mkvparser: SafeArrayAlloc fix type warning
f1a99d5 mkvparser: s/LONG_LONG_MAX/LLONG_MAX/ for compatibility
bff1aa5 mkvparser: add msvc compatibility for isnan/isinf

Change-Id: Ie0375e564fc74b3b296744d0039830d2f77b83b6
2015-09-11 19:02:24 -07:00
Ronald S. Bultje
d1474f02aa vp10: merge frame_parallel_decoding_mode and refresh_frame_context.
See issue 1030. The value of frame_parallel_decoding_mode was ignored
in vp9 if refresh_frame_context was 0, so instead make it a 3-member
enum where the dependency is obviously stated.

Change-Id: I37f0177e5759f54e2e6cc6217023d5681de92438
2015-09-11 19:33:46 -04:00
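A sketch of the three-member enum the message describes; the member names are
a best reading of the vp10 change and should be treated as illustrative.

  typedef enum {
    REFRESH_FRAME_CONTEXT_NONE,      /* old refresh_frame_context = 0 */
    REFRESH_FRAME_CONTEXT_FORWARD,   /* refresh = 1, frame-parallel mode:
                                        update from signalled header values */
    REFRESH_FRAME_CONTEXT_BACKWARD   /* refresh = 1, non-parallel mode:
                                        update from counts gathered while
                                        decoding the frame */
  } REFRESH_FRAME_CONTEXT_MODE;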
Ronald S. Bultje
c92c50f2fe vpxdec: remove implied --output-bit-depth=8 for --yv12.
Change-Id: I28c939db49334572476aa2b428ec93111d4e869d
2015-09-11 19:33:45 -04:00
Ronald S. Bultje
ef73bbf778 vp10: remove duplicate frame_parallel_decode field.
Keep the one in VP10_COMMON in favour of the one in VP10_DECODER.

Change-Id: Ia81983ccc95d83829dc815e28d9b1143e16e27b1
2015-09-11 18:37:24 -04:00
Ronald S. Bultje
eba342af87 Don't convert bitdepth for !single-file or MD5.
... unless --output-bit-depth was set.

Change-Id: I3482eaf12e245eec24427518fccdd173f890f4b4
2015-09-11 18:37:24 -04:00
Ronald S. Bultje
812fbc5ecb Merge "Make reset_frame_context an enum." 2015-09-11 22:36:49 +00:00
Marco Paniconi
cd9ae6d758 Merge "Avoid scaling last_source, unless needed." 2015-09-11 21:39:06 +00:00
Angie Chiang
894ab8be7e fix implicit declaration
include vpx_dsp_rtcd.h to avoid implicit declaration of
vp10_highbd_fdct32x32_rd_c

Change-Id: I0b9ad50381a302750138deab14d2d5ac31f286ee
2015-09-11 12:17:15 -07:00
Ronald S. Bultje
62da0bf162 Make reset_frame_context an enum.
In vp9, [0] and [1] had identical meaning, so merge them into a
single value. Make it impossible to code RESET_FRAME_CONTEXT_NONE
for intra_only frames, since that is a nonsensical combination.

See issue 1030.

Change-Id: If450c74162d35ca63a9d279beaa53ff9cdd6612b
2015-09-11 15:12:02 -04:00
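A sketch of the resulting enum, again with member names as a best reading of
the vp10 code (illustrative, not verbatim):

  typedef enum {
    RESET_FRAME_CONTEXT_NONE,     /* no reset; merges vp9's equivalent values
                                     0 and 1, and per the message can no
                                     longer be coded for intra-only frames */
    RESET_FRAME_CONTEXT_CURRENT,  /* reset only the context this frame uses */
    RESET_FRAME_CONTEXT_ALL       /* reset every frame context to defaults */
  } RESET_FRAME_CONTEXT_MODE;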
Marco
e8a4a3e2b1 Avoid scaling last_source, unless needed.
Save some encoding time, for the case of spatial layers
or under dynamic resizing mode.

Change-Id: If4a8eb6f0376c3d2dde8465fde6bfd86ab704920
2015-09-11 11:53:25 -07:00
James Zern
f79f71fc22 Merge "Fix vp10 high bit-depth build" 2015-09-11 18:27:49 +00:00
Jingning Han
481b834842 Fix vp10 high bit-depth build
Change-Id: Ie3daed0b282b43ef81d2f8797ac1f6e8bde7d65e
2015-09-11 08:56:29 -07:00
Marco
6ddbc845cc Remove unneeded/incorrect comment.
Change-Id: I5c923223c284ad4fda0c45572a66bebc8528dd1d
2015-09-11 08:49:13 -07:00
James Zern
d318d7cb6f Merge "build: modify default ARFLAGS / .a target" 2015-09-11 02:30:07 +00:00
Ronald S. Bultje
ad747e94d0 Merge "Add misc_fixes experiment." 2015-09-11 02:00:45 +00:00
Ronald S. Bultje
3ef3dcb8b6 Merge "Don't reset sign_bias fields in vp10_setup_past_independence()." 2015-09-11 02:00:30 +00:00
Angie Chiang
501efcad4a Merge "Isolate vp10's fwd_txfm from vp9" 2015-09-11 00:10:45 +00:00
Alex Converse
3c092e2474 Merge changes Ibb308526,I99e330f8
* changes:
  Prevent CR in screen mode from refreshing flat inter blocks forever.
  For screen content consider intra uv when color_sensitivity is set.
2015-09-10 23:04:46 +00:00
Jingning Han
b50e0badbc Merge "Take out reference_masking speed feature" 2015-09-10 23:03:06 +00:00
Jingning Han
b999f1509c Merge "Take out skip_encode speed feature in vp10" 2015-09-10 23:02:38 +00:00
Jingning Han
7f71d1e00a Merge "Remove speed features in vp10" 2015-09-10 23:02:27 +00:00
Angie Chiang
b0bfea4f5f Merge "Isolate vp10's inv_txfm from vp9" 2015-09-10 22:51:02 +00:00
Angie Chiang
ee5b80597e Isolate vp10's fwd_txfm from vp9
1) copy fwd_txfm related files from vpx_dsp to vp10

    vpx_dsp/fwd_txfm.h → vp10/common/vp10_fwd_txfm.h
    vpx_dsp/fwd_txfm.c → vp10/common/vp10_fwd_txfm.c
    vpx_dsp/x86/fwd_dct32x32_impl_sse2.h →  vp10/common/x86/vp10_fwd_dct32x32_impl_sse2.h
    vpx_dsp/x86/fwd_txfm_sse2.c →  vp10/common/x86/vp10_fwd_txfm_sse2.c
    vpx_dsp/x86/fwd_txfm_impl_sse2.h → vp10/common/vp10_fwd_txfm_impl_sse2.h

Change-Id: Ie9428b2ab1ffeb28e17981bb8a142ebe204f3bba
2015-09-10 15:19:43 -07:00
Angie Chiang
87175ed592 Isolate vp10's inv_txfm from vp9
1) copy the following files from vpx_dsp/ to vp10/common/
vp10_inv_txfm.c
vp10_inv_txfm.h
vp10_inv_txfm_sse2.c
vp10_inv_txfm_sse2.h

2) change the function prefix "vpx_" to "vp10_" in above files

3) add unit test at vp10_inv_txfm_test.cc

Change-Id: I206f10f60c8b27d872c84b7482c3bb1d1cb4b913
2015-09-10 15:08:37 -07:00
Alex Converse
3d6b8a667f Prevent CR in screen mode from refreshing flat inter blocks forever.
Take the minimum last_codec_q_map on inter skip.

Change-Id: Ibb308526dd19793bb359f51ebd7b48d8692903fd
2015-09-10 15:03:13 -07:00
Alex Converse
d5c0e366d7 For screen content consider intra uv when color_sensitivity is set.
Change-Id: I99e330f8a779b4d564c19ef4639a881cb68910ae
2015-09-10 15:03:09 -07:00
Jingning Han
1eb760e55d Take out reference_masking speed feature
This condition is not effectively in use. The actual reference
frame masking is done through a different route.

Change-Id: Ia59c843bcac7243dada92f0f67658d7ce43df5e8
2015-09-10 12:57:48 -07:00
James Zern
1b3d775366 build: modify default ARFLAGS / .a target
remove 'u' and specify all objects to allow objects with the same
basename to be added and an incremental rebuild to succeed

fixes issue #1067

Change-Id: Id0ebc89be826a026f1bbf21b4e32a2b1af45154d
2015-09-10 12:54:01 -07:00
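For context on the change visible in the Makefile and configure hunks below:
in make, $? expands to only the prerequisites newer than the target while $^
expands to all of them, so archiving with $^ (together with dropping the 'u'
update flag from ARFLAGS) re-adds every object on each build, which is what
lets an incremental rebuild succeed when different objects share a basename.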
Vignesh Venkatasubramanian
dd5510750a third_party/libwebm: pull from upstream.
Upstream hash: a58c32339e06e5d672a58cdd5844cea0a661e735

Changes from upstream since last update:
a58c323 mkvmuxer: Add codec id constant for VP10.
714f3c4 mkvparser: validate results in EBMLHeader::Parse.
cec98d4 mkvparser: Correct the ReadID implementation.
eb36ae4 Merge changes I029a268e,Ia272b150,I5c4d1bbc,Ia47a2478,I3a2e2226
229f493 Merge "mkvparser: Segment::AppendCluster asserts to error checks."
287faf9 Merge "mkvparser: Segment::DoLoadClusterUnknownSize asserts to error checks."
1a87b59 Merge "mkvparser: Segment assert clean up."
d26ec69 mkvparser: Cluster::Parse clean up.
f2029be mkvparser: Disallow EBML IDs equal to 0.
19f5694 mkvparser: Cluster::Load clean up.
27a07c1 mkvparser: Segment::Load asserts to error checks.
d0313dd mkvparser: Segment::PreloadCluster asserts to error checks.
b108695 mkvparser: Segment::AppendCluster asserts to error checks.
4630f80 mkvparser: Segment::DoLoadClusterUnknownSize asserts to error checks.
841a9b5 mkvparser: Segment assert clean up.
8c4ca2e Merge "mkvparser: Make mkvparser namespace usage uniform."
49ae6f0 Merge "mkvparser: Fix include order."
0735bb5 mkvparser: Make mkvparser namespace usage uniform.
93b24c4 mkvparser: Fix include order.
a57d660 sample_muxer: fix Segment memory leak on error
1c5bd94 mkvparser: Cues, change asserts to error checks.
7f77201 Merge "mkvparser: Add ReadID."
795fd56 mkvparser: set kMaxAllocSize explicitly
23bb18b mkvparser: Add ReadID.
7b57e37 mkvparser: add SafeArrayAlloc.
83a1f68 mkvparser: Remove buf_t typedef.
5074714 Merge changes Ia1265a63,I799d54df,Icfc582e4,I3425f608
b181105 Merge changes Ie4318152,I1e65f30f
06b4337 Block::Parse: replace pos asserts w/checks
b366a98 Cluster::ParseBlockGroup: replace pos asserts w/checks
2857b23 Tags::*::Parse: replace pos asserts w/checks
f1b2cfa Chapters::*::Parse: replace pos asserts w/checks
ca80629 Merge "mkvparser: Cues::PreloadCuePoint now returns bool."
6b4b297 Block::Parse: use int64 to aggregate laced frame sizes
c0d2c98 UnserializeFloat: check result for Inf/NaN
1a6dc4f mkvparser: Cues::PreloadCuePoint now returns bool.
275ac22 mkvparser: Cluster::Create clean up.
064f2ee Segment::PreloadCluster(): return a bool status
3778408 Segment::AppendCluster(): return a bool status
e86d046 mkvparser: check Cluster::Create() return
f9885b5 mkvparser: check allocations
21ee398 mkvparser: Segment::Load fail w/missing info/tracks
08fb654 Merge changes I264e68b2,Ife6190a4,Ibf37245f,I06efadb5,I88b5dfec, ...
c896095 mkvparser/Cluster: convert asserts to failure returns
680b4bf mkvparser/Tracks: convert asserts to failure returns
5889e6c mkvparser/Track: convert asserts to failure returns
5135c4c mkvparser/ContentEncoding: convert asserts to failure returns
b0e4f32 mkvparser/Cues: convert asserts to failure returns
13ccc7f mkvparser/UnserializeInt: fix sign flip
db3f9bb mkvparser/SeekHead: convert asserts to failure returns
8de3654 mkvparser/Segment: convert asserts to failure returns
fa2aa7d SeekHead::Parse(): fix assertion failure
d9bdade sample{,_muxer}: check SegmentInfo::GetInfo() return
07a9cf7 Merge "mkvparser: Remove some asserts."
c56ee29 mkvparser: Remove some asserts.
d901324 Merge "mkvparser: Remove some asserts from SegmentInfo::Parse."
7f7d898 Fix case sensitivity issue in iosbuild.sh.
42fe2cd mkvparser: Remove some asserts from SegmentInfo::Parse.
8bccd9c Merge "mkvparser: avoid rollover in UnserializeInt()."
7a2fa0d mkvparser: avoid rollover in UnserializeInt().
44f5ce6 mkvparser: Disallow durations in seconds greater than LONG_LONG_MAX.
b521e30 Merge "mkvparser: Segment::ParseHeaders() avoid rollover and bad int sizes."
7680e2a mkvparser: Check for errors in Match().
39a315f mkvparser: Segment::ParseHeaders() avoid rollover and bad int sizes.
f250ace mkvparser: Handle invalid lengths and rollover in ParseElementHeader().
cd96a76 mkvparser: Avoid rollover/truncation in UnserializeString().
8e8b3db Merge "mkvparser: Add error checking in Block::Parse."
82b7e5f sample: correct mbstowcs() error check
04d7809 sample: check allocation return
986b64b mkvparser: Add error checking in Block::Parse.

Change-Id: I39beef84962d6341f8ce53be06807b3e2068f777
2015-09-10 12:47:21 -07:00
Jingning Han
f137697c32 Take out skip_encode speed feature in vp10
Change-Id: Ic39d4523e78863c816b0fc85f56ea5ae5e0b3310
2015-09-10 12:45:39 -07:00
Jingning Han
4fa8e73249 Remove speed features in vp10
Take out speed features that affect the compression performance
to simplify the coding route. This commit removes the motion field
mode search used in speed 3.

Change-Id: Ifdf6862cb1ece8261125a56d9d89bcef60758c00
2015-09-10 12:25:33 -07:00
Vignesh Venkatasubramanian
09969ac9a2 webmdec: Handle codec id being NULL.
WebM files can have the CodecId missing from the track headers. Treat such
files as an unknown input file type in vpxdec.

Fixes issue #1064.

Change-Id: I6c3bb7b4bd3a4f5c244312482a5996f8b68db3f3
2015-09-10 10:44:59 -07:00
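A minimal sketch of the guard, with illustrative names (webmdec.cc itself
goes through the mkvparser track API; the helper below is an assumption):

  #include <string.h>

  /* A track whose CodecId element is absent cannot be classified, so report
   * it as unsupported and let vpxdec treat the file as an unknown type. */
  static int classify_webm_codec(const char *codec_id) {
    if (codec_id == NULL) return 0;
    if (strcmp(codec_id, "V_VP8") == 0) return 8;
    if (strcmp(codec_id, "V_VP9") == 0) return 9;
    return 0;
  }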
Marco Paniconi
2ff108aac6 Merge "vp8: Small adjustment to cyclic_refresh max_mbs_perframe." 2015-09-10 15:47:49 +00:00
James Zern
ba317bc9dc vp9_encoder_parms_get_to_decoder: cosmetics
fix indent, */& association, join a few lines

Change-Id: Idaca24b87b574788f9508168082d0ade3d4e9ecc
2015-09-10 00:21:59 -07:00
James Zern
fc4ddc0d00 vp9...parms_get_to_decoder: remove unneeded func
removes a redundant cast in the process

Change-Id: Ie3727a0938c0093f70f25a875c2c58671938d45c
2015-09-10 00:21:59 -07:00
James Zern
67774db59f vp9...parms_get_to_decoder: fix EXPECT param order
(expected, actual)

Change-Id: I449e7b6c51aa85cdde008d2fad5a9629970222a9
2015-09-10 00:21:58 -07:00
James Zern
21952bab12 vp9_encoder_parms_get_to_decoder: delete dead code
the only input is y4m, there's no need to test for yuv.

Change-Id: Ie5b55ea4af44ad79a55304ef5636a8ad7ed30bb8
2015-09-10 00:21:58 -07:00
James Zern
0fe900a543 fix BitstreamParms test
avoid duplicating internal structures and include vp9_dx_iface.c
directly. these had fallen out of sync after the frame-parallel branch
merge.

Change-Id: I604cfbffa95abe2a1c8e906a696f32436b1422ed
2015-09-10 00:21:57 -07:00
James Zern
7793a51ddc vp9_encoder_parms_get_to_decoder: remove vp10
this file needs to be reworked to remove the duplication of codec
internals + allow for divergence of vp9 and vp10

Change-Id: I6266b94ccfbc24dae30148f134804b52aa411b88
2015-09-10 00:21:56 -07:00
James Zern
58cb7886c3 yuvconfig2image(): add explicit cast to avoid conv warning
prevents an int -> vpx_img_fmt_t conversion warning with high-bitdepth
as it modifies the image format

Change-Id: Ie3135d031565312613a036a1e6937abb59760a7e
2015-09-10 00:19:18 -07:00
James Zern
a124bc7a81 vp9/10 decoder_init: add missing alloc cast
Change-Id: I1ba4400d67095f3a360fb7d97ee8d118d4f741fe
2015-09-09 23:15:59 -07:00
James Zern
a2e61adc96 vp9/10: set color_space on preview frame
Change-Id: If9176ce6ed3eb6c7ef8ffd1378456cb95b4aeb86
2015-09-09 23:15:59 -07:00
James Zern
55f5d557f2 vp10: add extern "C" to headers
Change-Id: Ie2e8b37fa01ce8d6b993684f431f3159d511cfb1
2015-09-09 23:15:59 -07:00
James Zern
b09aa3ac54 vp9: add extern "C" to headers
Change-Id: I1b6927ad820f99340985b094d415aaab14defaf4
2015-09-09 23:15:59 -07:00
James Zern
992d9a062a Merge "Fix ioc warnings related to sub8x8 reference frame" 2015-09-10 06:10:44 +00:00
Tom Finegan
8fa5ca4899 Merge "Revert "Fix building with iOS 9 beta SDK"" 2015-09-10 01:36:29 +00:00
Jingning Han
b6d71a308c Fix ioc warnings related to sub8x8 reference frame
Access scaled reference frame in the sub8x8 rate-distortion
optimization loop only when the current test mode is an inter mode.
This prevents an ioc warning triggered by using the intra_frame index
to fetch a scaled reference frame.

Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604
2015-09-09 15:48:00 -07:00
jackychen
f5617fd083 Change the qp threshold of VP9 dynamic resizing.
Change-Id: I1efe086191665ff8fa063f03d8e2032024dc090f
2015-09-09 15:47:07 -07:00
Marco
3140e90175 vp8: Small adjustment to cyclic_refresh max_mbs_perframe.
For 3 temporal layers, slightly reduce the
cyclic_refresh_mode_max_mbs_perframe parameter, from 20% to ~14%.
This gives a small increase in PSNR/SSIM metrics.

Change-Id: Ia216fa5474048f1ef7fe3db88cd60dfef2a1bf8a
2015-09-09 15:34:58 -07:00
Tom Finegan
d8808d365e Revert "Fix building with iOS 9 beta SDK"
This reverts commit 78637b6136.

Breaks armv7-darwin targets with current SDK (iOS 8/Xcode 6.4).

BUG=https://code.google.com/p/webm/issues/detail?id=1062

Change-Id: I58b27950f330557154d681a894114eadfbd3e593
2015-09-09 05:28:42 -07:00
Ronald S. Bultje
1589ecb0ae Add misc_fixes experiment.
Will be used to hold various trivial bitstream fixes.

Change-Id: Ic8ba07a2ae392db7c956ebae124913afe2ae4409
2015-09-08 14:05:08 -04:00
Ronald S. Bultje
e1d22db451 Don't reset sign_bias fields in vp10_setup_past_independence().
The fields are always coded in the frame itself, so there is never any
dependency on past frames. In practice, this fixes sign_bias being
ignored when error_resilient_mode=1.

See issue 1011.

Change-Id: I9d134ef6b445ced4d100fa735ce579855a0fa5af
2015-09-08 13:48:20 -04:00
James Zern
ad0ac045d5 vp9/decode_tiles_mt: move frame count accum from loop
the check performed within the while was redundant; simply place the
accumulation after all tiles are decoded.

Change-Id: I6a74e87257c775fd8bfc8ac4511e4a6ad8f18346
2015-09-04 20:24:29 -07:00
James Zern
5e1e6a9f17 VP9Decoder: remove duplicate tile_worker_info
unnecessary since: 86f4a3d Remove tile param

Change-Id: Iff75d3acf6c5aade833ea0a214c919279403cf97
2015-09-04 19:47:33 -07:00
James Zern
2d06b08cba vp9/decode_tiles_mt: move some inits from inner loop
worker copies of pbi/xd/counts only need to be initialized once

Change-Id: I0081a85b9c82d39573c22d2fd2c670ec2f7b8715
2015-09-04 19:47:33 -07:00
James Zern
0548046ae3 vp9_accumulate_frame_counts: pass counts directly
Change-Id: Ic3c6cfba5b1867c335f2834da936e20caec8597a
2015-09-04 19:47:33 -07:00
James Zern
14bc773199 sixtap_predict_test: enable NEON tests
the offending assembly code was deleted in:
08e38f0 VP8 for ARMv8 by using NEON intrinsics 14

the intrinsics currently pass.

fixes issue #725

Change-Id: I43e4263bef21f9d9008c51ffdfa39fcf10b8e776
2014-11-05 18:10:59 -08:00
271 changed files with 27153 additions and 11766 deletions


@@ -1,14 +1,18 @@
Adrian Grange <agrange@google.com> Adrian Grange <agrange@google.com>
Alex Converse <aconverse@google.com> <alex.converse@gmail.com> Aex Converse <aconverse@google.com>
Aex Converse <aconverse@google.com> <alex.converse@gmail.com>
Alexis Ballier <aballier@gentoo.org> <alexis.ballier@gmail.com> Alexis Ballier <aballier@gentoo.org> <alexis.ballier@gmail.com>
Alpha Lam <hclam@google.com> <hclam@chromium.org> Alpha Lam <hclam@google.com> <hclam@chromium.org>
Deb Mukherjee <debargha@google.com> Deb Mukherjee <debargha@google.com>
Erik Niemeyer <erik.a.niemeyer@intel.com> <erik.a.niemeyer@gmail.com> Erik Niemeyer <erik.a.niemeyer@intel.com> <erik.a.niemeyer@gmail.com>
Guillaume Martres <gmartres@google.com> <smarter3@gmail.com> Guillaume Martres <gmartres@google.com> <smarter3@gmail.com>
Hangyu Kuang <hkuang@google.com> Hangyu Kuang <hkuang@google.com>
Hui Su <huisu@google.com>
Jacky Chen <jackychen@google.com>
Jim Bankoski <jimbankoski@google.com> Jim Bankoski <jimbankoski@google.com>
Johann Koenig <johannkoenig@google.com> Johann Koenig <johannkoenig@google.com>
Johann Koenig <johannkoenig@google.com> <johann.koenig@duck.com> Johann Koenig <johannkoenig@google.com> <johann.koenig@duck.com>
Johann Koenig <johannkoenig@google.com> <johann.koenig@gmail.com>
John Koleszar <jkoleszar@google.com> John Koleszar <jkoleszar@google.com>
Joshua Litt <joshualitt@google.com> <joshualitt@chromium.org> Joshua Litt <joshualitt@google.com> <joshualitt@chromium.org>
Marco Paniconi <marpan@google.com> Marco Paniconi <marpan@google.com>
@@ -17,10 +21,12 @@ Pascal Massimino <pascal.massimino@gmail.com>
Paul Wilkins <paulwilkins@google.com> Paul Wilkins <paulwilkins@google.com>
Ralph Giles <giles@xiph.org> <giles@entropywave.com> Ralph Giles <giles@xiph.org> <giles@entropywave.com>
Ralph Giles <giles@xiph.org> <giles@mozilla.com> Ralph Giles <giles@xiph.org> <giles@mozilla.com>
Ronald S. Bultje <rsbultje@gmail.com> <rbultje@google.com>
Sami Pietilä <samipietila@google.com> Sami Pietilä <samipietila@google.com>
Tamar Levy <tamar.levy@intel.com> Tamar Levy <tamar.levy@intel.com>
Tamar Levy <tamar.levy@intel.com> <levytamar82@gmail.com> Tamar Levy <tamar.levy@intel.com> <levytamar82@gmail.com>
Tero Rintaluoma <teror@google.com> <tero.rintaluoma@on2.com> Tero Rintaluoma <teror@google.com> <tero.rintaluoma@on2.com>
Timothy B. Terriberry <tterribe@xiph.org> Tim Terriberry <tterriberry@mozilla.com> Timothy B. Terriberry <tterribe@xiph.org> Tim Terriberry <tterriberry@mozilla.com>
Tom Finegan <tomfinegan@google.com> Tom Finegan <tomfinegan@google.com>
Tom Finegan <tomfinegan@google.com> <tomfinegan@chromium.org>
Yaowu Xu <yaowu@google.com> <yaowu@xuyaowu.com> Yaowu Xu <yaowu@google.com> <yaowu@xuyaowu.com>

AUTHORS

@@ -5,9 +5,9 @@ Aaron Watry <awatry@gmail.com>
Abo Talib Mahfoodh <ab.mahfoodh@gmail.com> Abo Talib Mahfoodh <ab.mahfoodh@gmail.com>
Adam Xu <adam@xuyaowu.com> Adam Xu <adam@xuyaowu.com>
Adrian Grange <agrange@google.com> Adrian Grange <agrange@google.com>
Aex Converse <aconverse@google.com>
Ahmad Sharif <asharif@google.com> Ahmad Sharif <asharif@google.com>
Alexander Voronov <avoronov@graphics.cs.msu.ru> Alexander Voronov <avoronov@graphics.cs.msu.ru>
Alex Converse <aconverse@google.com>
Alexis Ballier <aballier@gentoo.org> Alexis Ballier <aballier@gentoo.org>
Alok Ahuja <waveletcoeff@gmail.com> Alok Ahuja <waveletcoeff@gmail.com>
Alpha Lam <hclam@google.com> Alpha Lam <hclam@google.com>
@@ -16,8 +16,10 @@ Ami Fischman <fischman@chromium.org>
Andoni Morales Alastruey <ylatuya@gmail.com> Andoni Morales Alastruey <ylatuya@gmail.com>
Andres Mejia <mcitadel@gmail.com> Andres Mejia <mcitadel@gmail.com>
Andrew Russell <anrussell@google.com> Andrew Russell <anrussell@google.com>
Angie Chiang <angiebird@google.com>
Aron Rosenberg <arosenberg@logitech.com> Aron Rosenberg <arosenberg@logitech.com>
Attila Nagy <attilanagy@google.com> Attila Nagy <attilanagy@google.com>
Brion Vibber <bvibber@wikimedia.org>
changjun.yang <changjun.yang@intel.com> changjun.yang <changjun.yang@intel.com>
Charles 'Buck' Krasic <ckrasic@google.com> Charles 'Buck' Krasic <ckrasic@google.com>
chm <chm@rock-chips.com> chm <chm@rock-chips.com>
@@ -27,6 +29,7 @@ Deb Mukherjee <debargha@google.com>
Dim Temp <dimtemp0@gmail.com> Dim Temp <dimtemp0@gmail.com>
Dmitry Kovalev <dkovalev@google.com> Dmitry Kovalev <dkovalev@google.com>
Dragan Mrdjan <dmrdjan@mips.com> Dragan Mrdjan <dmrdjan@mips.com>
Ed Baker <edward.baker@intel.com>
Ehsan Akhgari <ehsan.akhgari@gmail.com> Ehsan Akhgari <ehsan.akhgari@gmail.com>
Erik Niemeyer <erik.a.niemeyer@intel.com> Erik Niemeyer <erik.a.niemeyer@intel.com>
Fabio Pedretti <fabio.ped@libero.it> Fabio Pedretti <fabio.ped@libero.it>
@@ -34,6 +37,8 @@ Frank Galligan <fgalligan@google.com>
Fredrik Söderquist <fs@opera.com> Fredrik Söderquist <fs@opera.com>
Fritz Koenig <frkoenig@google.com> Fritz Koenig <frkoenig@google.com>
Gaute Strokkenes <gaute.strokkenes@broadcom.com> Gaute Strokkenes <gaute.strokkenes@broadcom.com>
Geza Lore <gezalore@gmail.com>
Ghislain MARY <ghislainmary2@gmail.com>
Giuseppe Scrivano <gscrivano@gnu.org> Giuseppe Scrivano <gscrivano@gnu.org>
Gordana Cmiljanovic <gordana.cmiljanovic@imgtec.com> Gordana Cmiljanovic <gordana.cmiljanovic@imgtec.com>
Guillaume Martres <gmartres@google.com> Guillaume Martres <gmartres@google.com>
@@ -44,7 +49,7 @@ Henrik Lundin <hlundin@google.com>
Hui Su <huisu@google.com> Hui Su <huisu@google.com>
Ivan Maltz <ivanmaltz@google.com> Ivan Maltz <ivanmaltz@google.com>
Jacek Caban <cjacek@gmail.com> Jacek Caban <cjacek@gmail.com>
JackyChen <jackychen@google.com> Jacky Chen <jackychen@google.com>
James Berry <jamesberry@google.com> James Berry <jamesberry@google.com>
James Yu <james.yu@linaro.org> James Yu <james.yu@linaro.org>
James Zern <jzern@google.com> James Zern <jzern@google.com>
@@ -60,9 +65,11 @@ Jingning Han <jingning@google.com>
Joey Parrish <joeyparrish@google.com> Joey Parrish <joeyparrish@google.com>
Johann Koenig <johannkoenig@google.com> Johann Koenig <johannkoenig@google.com>
John Koleszar <jkoleszar@google.com> John Koleszar <jkoleszar@google.com>
Johnny Klonaris <google@jawknee.com>
John Stark <jhnstrk@gmail.com> John Stark <jhnstrk@gmail.com>
Joshua Bleecher Snyder <josh@treelinelabs.com> Joshua Bleecher Snyder <josh@treelinelabs.com>
Joshua Litt <joshualitt@google.com> Joshua Litt <joshualitt@google.com>
Julia Robson <juliamrobson@gmail.com>
Justin Clift <justin@salasaga.org> Justin Clift <justin@salasaga.org>
Justin Lebar <justin.lebar@gmail.com> Justin Lebar <justin.lebar@gmail.com>
KO Myung-Hun <komh@chollian.net> KO Myung-Hun <komh@chollian.net>
@@ -82,6 +89,7 @@ Mike Hommey <mhommey@mozilla.com>
Mikhal Shemer <mikhal@google.com> Mikhal Shemer <mikhal@google.com>
Minghai Shang <minghai@google.com> Minghai Shang <minghai@google.com>
Morton Jonuschat <yabawock@gmail.com> Morton Jonuschat <yabawock@gmail.com>
Nico Weber <thakis@chromium.org>
Parag Salasakar <img.mips1@gmail.com> Parag Salasakar <img.mips1@gmail.com>
Pascal Massimino <pascal.massimino@gmail.com> Pascal Massimino <pascal.massimino@gmail.com>
Patrik Westin <patrik.westin@gmail.com> Patrik Westin <patrik.westin@gmail.com>
@@ -96,7 +104,7 @@ Rafael Ávila de Espíndola <rafael.espindola@gmail.com>
Rafaël Carré <funman@videolan.org> Rafaël Carré <funman@videolan.org>
Ralph Giles <giles@xiph.org> Ralph Giles <giles@xiph.org>
Rob Bradford <rob@linux.intel.com> Rob Bradford <rob@linux.intel.com>
Ronald S. Bultje <rbultje@google.com> Ronald S. Bultje <rsbultje@gmail.com>
Rui Ueyama <ruiu@google.com> Rui Ueyama <ruiu@google.com>
Sami Pietilä <samipietila@google.com> Sami Pietilä <samipietila@google.com>
Scott Graham <scottmg@chromium.org> Scott Graham <scottmg@chromium.org>
@@ -104,6 +112,7 @@ Scott LaVarnway <slavarnway@google.com>
Sean McGovern <gseanmcg@gmail.com> Sean McGovern <gseanmcg@gmail.com>
Sergey Ulanov <sergeyu@chromium.org> Sergey Ulanov <sergeyu@chromium.org>
Shimon Doodkin <helpmepro1@gmail.com> Shimon Doodkin <helpmepro1@gmail.com>
Shunyao Li <shunyaoli@google.com>
Stefan Holmer <holmer@google.com> Stefan Holmer <holmer@google.com>
Suman Sunkara <sunkaras@google.com> Suman Sunkara <sunkaras@google.com>
Taekhyun Kim <takim@nvidia.com> Taekhyun Kim <takim@nvidia.com>


@@ -1,7 +1,19 @@
-xxxx-yy-zz v1.4.0 "Changes for next release"
-  vpxenc is changed to use VP9 by default.
-  Encoder controls added for 1 pass SVC.
-  Decoder control to toggle on/off loopfilter.
+2015-11-09 v1.5.0 "Javan Whistling Duck"
+  This release improves upon the VP9 encoder and speeds up the encoding and
+  decoding processes.
+
+  - Upgrading:
+    This release is ABI incompatible with 1.4.0. It drops deprecated VP8
+    controls and adds a variety of VP9 controls for testing.
+
+    The vpxenc utility now prefers VP9 by default.
+
+  - Enhancements:
+    Faster VP9 encoding and decoding
+    Smaller library size by combining functions used by VP8 and VP9
+
+  - Bug Fixes:
+    A variety of fuzzing issues
 
 2015-04-03 v1.4.0 "Indian Runner Duck"
   This release includes significant improvements to the VP9 codec.


@@ -287,7 +287,7 @@ define archive_template
 # for creating them.
 $(1):
     $(if $(quiet),@echo "    [AR] $$@")
-    $(qexec)$$(AR) $$(ARFLAGS) $$@ $$?
+    $(qexec)$$(AR) $$(ARFLAGS) $$@ $$^
 endef
 
 define so_template


@@ -73,6 +73,7 @@ Build options:
   --target=TARGET             target platform tuple [generic-gnu]
   --cpu=CPU                   optimize for a specific cpu rather than a family
   --extra-cflags=ECFLAGS      add ECFLAGS to CFLAGS [$CFLAGS]
+  --extra-cxxflags=ECXXFLAGS  add ECXXFLAGS to CXXFLAGS [$CXXFLAGS]
   ${toggle_extra_warnings}    emit harmless warnings (always non-fatal)
   ${toggle_werror}            treat warnings as errors, if possible
                               (not available with all compilers)
@@ -200,6 +201,10 @@ disabled(){
   eval test "x\$$1" = "xno"
 }
 
+# Iterates through positional parameters, checks to confirm the parameter has
+# not been explicitly (force) disabled, and enables the setting controlled by
+# the parameter when the setting is not disabled.
+# Note: Does NOT alter RTCD generation options ($RTCD_OPTIONS).
 soft_enable() {
   for var in $*; do
     if ! disabled $var; then
@@ -209,6 +214,10 @@ soft_enable() {
   done
 }
 
+# Iterates through positional parameters, checks to confirm the parameter has
+# not been explicitly (force) enabled, and disables the setting controlled by
+# the parameter when the setting is not enabled.
+# Note: Does NOT alter RTCD generation options ($RTCD_OPTIONS).
 soft_disable() {
   for var in $*; do
     if ! enabled $var; then
@@ -337,6 +346,10 @@ check_add_cflags() {
   check_cflags "$@" && add_cflags_only "$@"
 }
 
+check_add_cxxflags() {
+  check_cxxflags "$@" && add_cxxflags_only "$@"
+}
+
 check_add_asflags() {
   log add_asflags "$@"
   add_asflags "$@"
@@ -428,7 +441,7 @@ NM=${NM}
 CFLAGS = ${CFLAGS}
 CXXFLAGS = ${CXXFLAGS}
-ARFLAGS = -rus\$(if \$(quiet),c,v)
+ARFLAGS = -crs\$(if \$(quiet),,v)
 LDFLAGS = ${LDFLAGS}
 ASFLAGS = ${ASFLAGS}
 extralibs = ${extralibs}
@@ -503,6 +516,9 @@ process_common_cmdline() {
       --extra-cflags=*)
         extra_cflags="${optval}"
         ;;
+      --extra-cxxflags=*)
+        extra_cxxflags="${optval}"
+        ;;
       --enable-?*|--disable-?*)
         eval `echo "$opt" | sed 's/--/action=/;s/-/ option=/;s/-/_/g'`
         if echo "${ARCH_EXT_LIST}" | grep "^ *$option\$" >/dev/null; then
@@ -617,6 +633,11 @@ show_darwin_sdk_path() {
   xcodebuild -sdk $1 -version Path 2>/dev/null
 }
 
+# Print the major version number of the Darwin SDK specified by $1.
+show_darwin_sdk_major_version() {
+  xcrun --sdk $1 --show-sdk-version 2>/dev/null | cut -d. -f1
+}
+
 process_common_toolchain() {
   if [ -z "$toolchain" ]; then
     gcctarget="${CHOST:-$(gcc -dumpmachine 2> /dev/null)}"
@@ -667,6 +688,10 @@ process_common_toolchain() {
       tgt_isa=x86_64
       tgt_os=darwin14
       ;;
+    *darwin15*)
+      tgt_isa=x86_64
+      tgt_os=darwin15
+      ;;
     x86_64*mingw32*)
       tgt_os=win64
       ;;
@@ -729,13 +754,14 @@ process_common_toolchain() {
   # platforms, so use the newest one available.
   case ${toolchain} in
     arm*-darwin*)
-      ios_sdk_dir="$(show_darwin_sdk_path iphoneos)"
-      if [ -d "${ios_sdk_dir}" ]; then
-        add_cflags  "-isysroot ${ios_sdk_dir}"
-        add_ldflags "-isysroot ${ios_sdk_dir}"
+      add_cflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
+      iphoneos_sdk_dir="$(show_darwin_sdk_path iphoneos)"
+      if [ -d "${iphoneos_sdk_dir}" ]; then
+        add_cflags  "-isysroot ${iphoneos_sdk_dir}"
+        add_ldflags "-isysroot ${iphoneos_sdk_dir}"
       fi
       ;;
-    *-darwin*)
+    x86*-darwin*)
       osx_sdk_dir="$(show_darwin_sdk_path macosx)"
       if [ -d "${osx_sdk_dir}" ]; then
         add_cflags  "-isysroot ${osx_sdk_dir}"
@@ -773,6 +799,10 @@ process_common_toolchain() {
       add_cflags  "-mmacosx-version-min=10.10"
       add_ldflags "-mmacosx-version-min=10.10"
       ;;
+    *-darwin15-*)
+      add_cflags  "-mmacosx-version-min=10.11"
+      add_ldflags "-mmacosx-version-min=10.11"
+      ;;
     *-iphonesimulator-*)
       add_cflags  "-miphoneos-version-min=${IOS_VERSION_MIN}"
       add_ldflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
@@ -811,16 +841,35 @@ process_common_toolchain() {
die "Disabling neon while keeping neon-asm is not supported" die "Disabling neon while keeping neon-asm is not supported"
fi fi
case ${toolchain} in case ${toolchain} in
# Apple iOS SDKs no longer support armv6 as of the version 9
# release (coincides with release of Xcode 7). Only enable media
# when using earlier SDK releases.
*-darwin*) *-darwin*)
# Neon is guaranteed on iOS 6+ devices, while old media extensions if [ "$(show_darwin_sdk_major_version iphoneos)" -lt 9 ]; then
# no longer assemble with iOS 9 SDK soft_enable media
else
soft_disable media
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-media "
fi
;; ;;
*) *)
soft_enable media soft_enable media
;;
esac esac
;; ;;
armv6) armv6)
soft_enable media case ${toolchain} in
*-darwin*)
if [ "$(show_darwin_sdk_major_version iphoneos)" -lt 9 ]; then
soft_enable media
else
die "Your iOS SDK does not support armv6."
fi
;;
*)
soft_enable media
;;
esac
;; ;;
esac esac
@@ -938,8 +987,10 @@ EOF
awk '{ print $1 }' | tail -1` awk '{ print $1 }' | tail -1`
fi fi
add_cflags "--sysroot=${alt_libc}" if [ -d "${alt_libc}" ]; then
add_ldflags "--sysroot=${alt_libc}" add_cflags "--sysroot=${alt_libc}"
add_ldflags "--sysroot=${alt_libc}"
fi
# linker flag that routes around a CPU bug in some # linker flag that routes around a CPU bug in some
# Cortex-A8 implementations (NDK Dev Guide) # Cortex-A8 implementations (NDK Dev Guide)
@@ -1003,6 +1054,12 @@ EOF
done done
asm_conversion_cmd="${source_path}/build/make/ads2gas_apple.pl" asm_conversion_cmd="${source_path}/build/make/ads2gas_apple.pl"
if [ "$(show_darwin_sdk_major_version iphoneos)" -gt 8 ]; then
check_add_cflags -fembed-bitcode
check_add_asflags -fembed-bitcode
check_add_ldflags -fembed-bitcode
fi
;; ;;
linux*) linux*)
@@ -1151,32 +1208,43 @@ EOF
soft_enable runtime_cpu_detect soft_enable runtime_cpu_detect
# We can't use 'check_cflags' until the compiler is configured and CC is # We can't use 'check_cflags' until the compiler is configured and CC is
# populated. # populated.
check_gcc_machine_option mmx for ext in ${ARCH_EXT_LIST_X86}; do
check_gcc_machine_option sse # disable higher order extensions to simplify asm dependencies
check_gcc_machine_option sse2 if [ "$disable_exts" = "yes" ]; then
check_gcc_machine_option sse3 if ! disabled $ext; then
check_gcc_machine_option ssse3 RTCD_OPTIONS="${RTCD_OPTIONS}--disable-${ext} "
check_gcc_machine_option sse4 sse4_1 disable_feature $ext
check_gcc_machine_option avx
check_gcc_machine_option avx2
case "${AS}" in
auto|"")
which nasm >/dev/null 2>&1 && AS=nasm
which yasm >/dev/null 2>&1 && AS=yasm
if [ "${AS}" = nasm ] ; then
# Apple ships version 0.98 of nasm through at least Xcode 6. Revisit
# this check if they start shipping a compatible version.
apple=`nasm -v | grep "Apple"`
[ -n "${apple}" ] \
&& echo "Unsupported version of nasm: ${apple}" \
&& AS=""
fi fi
[ "${AS}" = auto ] || [ -z "${AS}" ] \ elif disabled $ext; then
&& die "Neither yasm nor nasm have been found" disable_exts="yes"
;; else
esac # use the shortened version for the flag: sse4_1 -> sse4
log_echo " using $AS" check_gcc_machine_option ${ext%_*} $ext
fi
done
if enabled external_build; then
log_echo " skipping assembler detection"
else
case "${AS}" in
auto|"")
which nasm >/dev/null 2>&1 && AS=nasm
which yasm >/dev/null 2>&1 && AS=yasm
if [ "${AS}" = nasm ] ; then
# Apple ships version 0.98 of nasm through at least Xcode 6. Revisit
# this check if they start shipping a compatible version.
apple=`nasm -v | grep "Apple"`
[ -n "${apple}" ] \
&& echo "Unsupported version of nasm: ${apple}" \
&& AS=""
fi
[ "${AS}" = auto ] || [ -z "${AS}" ] \
&& die "Neither yasm nor nasm have been found." \
"See the prerequisites section in the README for more info."
;;
esac
log_echo " using $AS"
fi
[ "${AS##*/}" = nasm ] && add_asflags -Ox [ "${AS##*/}" = nasm ] && add_asflags -Ox
AS_SFX=.asm AS_SFX=.asm
case ${tgt_os} in case ${tgt_os} in
@@ -1212,6 +1280,13 @@ EOF
enabled x86 && sim_arch="-arch i386" || sim_arch="-arch x86_64" enabled x86 && sim_arch="-arch i386" || sim_arch="-arch x86_64"
add_cflags ${sim_arch} add_cflags ${sim_arch}
add_ldflags ${sim_arch} add_ldflags ${sim_arch}
if [ "$(show_darwin_sdk_major_version iphonesimulator)" -gt 8 ]; then
# yasm v1.3.0 doesn't know what -fembed-bitcode means, so turning it
# on is pointless (unless building a C-only lib). Warn the user, but
# do nothing here.
log "Warning: Bitcode embed disabled for simulator targets."
fi
;; ;;
os2) os2)
add_asflags -f aout add_asflags -f aout


@@ -25,31 +25,42 @@ CONFIGURE_ARGS="--disable-docs
DIST_DIR="_dist" DIST_DIR="_dist"
FRAMEWORK_DIR="VPX.framework" FRAMEWORK_DIR="VPX.framework"
HEADER_DIR="${FRAMEWORK_DIR}/Headers/vpx" HEADER_DIR="${FRAMEWORK_DIR}/Headers/vpx"
MAKE_JOBS=1
SCRIPT_DIR=$(dirname "$0") SCRIPT_DIR=$(dirname "$0")
LIBVPX_SOURCE_DIR=$(cd ${SCRIPT_DIR}/../..; pwd) LIBVPX_SOURCE_DIR=$(cd ${SCRIPT_DIR}/../..; pwd)
LIPO=$(xcrun -sdk iphoneos${SDK} -find lipo) LIPO=$(xcrun -sdk iphoneos${SDK} -find lipo)
ORIG_PWD="$(pwd)" ORIG_PWD="$(pwd)"
TARGETS="arm64-darwin-gcc ARM_TARGETS="arm64-darwin-gcc
armv7-darwin-gcc armv7-darwin-gcc
armv7s-darwin-gcc armv7s-darwin-gcc"
x86-iphonesimulator-gcc SIM_TARGETS="x86-iphonesimulator-gcc
x86_64-iphonesimulator-gcc" x86_64-iphonesimulator-gcc"
OSX_TARGETS="x86-darwin15-gcc
x86_64-darwin15-gcc"
TARGETS="${ARM_TARGETS} ${SIM_TARGETS}"
# Configures for the target specified by $1, and invokes make with the dist # Configures for the target specified by $1, and invokes make with the dist
# target using $DIST_DIR as the distribution output directory. # target using $DIST_DIR as the distribution output directory.
build_target() { build_target() {
local target="$1" local target="$1"
local old_pwd="$(pwd)" local old_pwd="$(pwd)"
local target_specific_flags=""
vlog "***Building target: ${target}***" vlog "***Building target: ${target}***"
case "${target}" in
x86-*)
target_specific_flags="--enable-pic"
vlog "Enabled PIC for ${target}"
;;
esac
mkdir "${target}" mkdir "${target}"
cd "${target}" cd "${target}"
eval "${LIBVPX_SOURCE_DIR}/configure" --target="${target}" \ eval "${LIBVPX_SOURCE_DIR}/configure" --target="${target}" \
${CONFIGURE_ARGS} ${EXTRA_CONFIGURE_ARGS} ${devnull} ${CONFIGURE_ARGS} ${EXTRA_CONFIGURE_ARGS} ${target_specific_flags} \
${devnull}
export DIST_DIR export DIST_DIR
eval make -j ${MAKE_JOBS} dist ${devnull} eval make dist ${devnull}
cd "${old_pwd}" cd "${old_pwd}"
vlog "***Done building target: ${target}***" vlog "***Done building target: ${target}***"
@@ -189,16 +200,29 @@ cleanup() {
fi fi
} }
print_list() {
local indent="$1"
shift
local list="$@"
for entry in ${list}; do
echo "${indent}${entry}"
done
}
iosbuild_usage() { iosbuild_usage() {
cat << EOF cat << EOF
Usage: ${0##*/} [arguments] Usage: ${0##*/} [arguments]
--help: Display this message and exit. --help: Display this message and exit.
--extra-configure-args <args>: Extra args to pass when configuring libvpx. --extra-configure-args <args>: Extra args to pass when configuring libvpx.
--jobs: Number of make jobs. --macosx: Uses darwin15 targets instead of iphonesimulator targets for x86
and x86_64. Allows linking to framework when builds target MacOSX
instead of iOS.
--preserve-build-output: Do not delete the build directory. --preserve-build-output: Do not delete the build directory.
--show-build-output: Show output from each library build. --show-build-output: Show output from each library build.
--targets <targets>: Override default target list. Defaults: --targets <targets>: Override default target list. Defaults:
${TARGETS} $(print_list " " ${TARGETS})
--test-link: Confirms all targets can be linked. Functionally identical to
passing --enable-examples via --extra-configure-args.
--verbose: Output information about the environment and each stage of the --verbose: Output information about the environment and each stage of the
build. build.
EOF EOF
@@ -227,20 +251,22 @@ while [ -n "$1" ]; do
iosbuild_usage iosbuild_usage
exit exit
;; ;;
--jobs)
MAKE_JOBS="$2"
shift
;;
--preserve-build-output) --preserve-build-output)
PRESERVE_BUILD_OUTPUT=yes PRESERVE_BUILD_OUTPUT=yes
;; ;;
--show-build-output) --show-build-output)
devnull= devnull=
;; ;;
--test-link)
EXTRA_CONFIGURE_ARGS="${EXTRA_CONFIGURE_ARGS} --enable-examples"
;;
--targets) --targets)
TARGETS="$2" TARGETS="$2"
shift shift
;; ;;
--macosx)
TARGETS="${ARM_TARGETS} ${OSX_TARGETS}"
;;
--verbose) --verbose)
VERBOSE=yes VERBOSE=yes
;; ;;
@@ -260,15 +286,17 @@ cat << EOF
EXTRA_CONFIGURE_ARGS=${EXTRA_CONFIGURE_ARGS} EXTRA_CONFIGURE_ARGS=${EXTRA_CONFIGURE_ARGS}
FRAMEWORK_DIR=${FRAMEWORK_DIR} FRAMEWORK_DIR=${FRAMEWORK_DIR}
HEADER_DIR=${HEADER_DIR} HEADER_DIR=${HEADER_DIR}
MAKE_JOBS=${MAKE_JOBS}
PRESERVE_BUILD_OUTPUT=${PRESERVE_BUILD_OUTPUT}
LIBVPX_SOURCE_DIR=${LIBVPX_SOURCE_DIR} LIBVPX_SOURCE_DIR=${LIBVPX_SOURCE_DIR}
LIPO=${LIPO} LIPO=${LIPO}
MAKEFLAGS=${MAKEFLAGS}
ORIG_PWD=${ORIG_PWD} ORIG_PWD=${ORIG_PWD}
TARGETS="${TARGETS}" PRESERVE_BUILD_OUTPUT=${PRESERVE_BUILD_OUTPUT}
TARGETS="$(print_list "" ${TARGETS})"
OSX_TARGETS="${OSX_TARGETS}"
SIM_TARGETS="${SIM_TARGETS}"
EOF EOF
fi fi
build_framework "${TARGETS}" build_framework "${TARGETS}"
echo "Successfully built '${FRAMEWORK_DIR}' for:" echo "Successfully built '${FRAMEWORK_DIR}' for:"
echo " ${TARGETS}" print_list "" ${TARGETS}

configure

@@ -35,6 +35,9 @@ Advanced options:
${toggle_debug_libs} in/exclude debug version of libraries ${toggle_debug_libs} in/exclude debug version of libraries
${toggle_static_msvcrt} use static MSVCRT (VS builds only) ${toggle_static_msvcrt} use static MSVCRT (VS builds only)
${toggle_vp9_highbitdepth} use VP9 high bit depth (10/12) profiles ${toggle_vp9_highbitdepth} use VP9 high bit depth (10/12) profiles
${toggle_better_hw_compatibility}
enable encoder to produce streams with better
hardware decoder compatibility
${toggle_vp8} VP8 codec support ${toggle_vp8} VP8 codec support
${toggle_vp9} VP9 codec support ${toggle_vp9} VP9 codec support
${toggle_vp10} VP10 codec support ${toggle_vp10} VP10 codec support
@@ -122,6 +125,7 @@ all_platforms="${all_platforms} x86-darwin11-gcc"
all_platforms="${all_platforms} x86-darwin12-gcc" all_platforms="${all_platforms} x86-darwin12-gcc"
all_platforms="${all_platforms} x86-darwin13-gcc" all_platforms="${all_platforms} x86-darwin13-gcc"
all_platforms="${all_platforms} x86-darwin14-gcc" all_platforms="${all_platforms} x86-darwin14-gcc"
all_platforms="${all_platforms} x86-darwin15-gcc"
all_platforms="${all_platforms} x86-iphonesimulator-gcc" all_platforms="${all_platforms} x86-iphonesimulator-gcc"
all_platforms="${all_platforms} x86-linux-gcc" all_platforms="${all_platforms} x86-linux-gcc"
all_platforms="${all_platforms} x86-linux-icc" all_platforms="${all_platforms} x86-linux-icc"
@@ -142,6 +146,7 @@ all_platforms="${all_platforms} x86_64-darwin11-gcc"
all_platforms="${all_platforms} x86_64-darwin12-gcc" all_platforms="${all_platforms} x86_64-darwin12-gcc"
all_platforms="${all_platforms} x86_64-darwin13-gcc" all_platforms="${all_platforms} x86_64-darwin13-gcc"
all_platforms="${all_platforms} x86_64-darwin14-gcc" all_platforms="${all_platforms} x86_64-darwin14-gcc"
all_platforms="${all_platforms} x86_64-darwin15-gcc"
all_platforms="${all_platforms} x86_64-iphonesimulator-gcc" all_platforms="${all_platforms} x86_64-iphonesimulator-gcc"
all_platforms="${all_platforms} x86_64-linux-gcc" all_platforms="${all_platforms} x86_64-linux-gcc"
all_platforms="${all_platforms} x86_64-linux-icc" all_platforms="${all_platforms} x86_64-linux-icc"
@@ -232,6 +237,16 @@ ARCH_LIST="
x86 x86
x86_64 x86_64
" "
ARCH_EXT_LIST_X86="
mmx
sse
sse2
sse3
ssse3
sse4_1
avx
avx2
"
ARCH_EXT_LIST=" ARCH_EXT_LIST="
edsp edsp
media media
@@ -243,14 +258,7 @@ ARCH_EXT_LIST="
msa msa
mips64 mips64
mmx ${ARCH_EXT_LIST_X86}
sse
sse2
sse3
ssse3
sse4_1
avx
avx2
" "
HAVE_LIST=" HAVE_LIST="
${ARCH_EXT_LIST} ${ARCH_EXT_LIST}
@@ -264,6 +272,7 @@ EXPERIMENT_LIST="
spatial_svc spatial_svc
fp_mb_stats fp_mb_stats
emulate_hardware emulate_hardware
misc_fixes
" "
CONFIG_LIST=" CONFIG_LIST="
dependency_tracking dependency_tracking
@@ -316,6 +325,7 @@ CONFIG_LIST="
vp9_temporal_denoising vp9_temporal_denoising
coefficient_range_checking coefficient_range_checking
vp9_highbitdepth vp9_highbitdepth
better_hw_compatibility
experimental experimental
size_limit size_limit
${EXPERIMENT_LIST} ${EXPERIMENT_LIST}
@@ -374,6 +384,7 @@ CMDLINE_SELECT="
temporal_denoising temporal_denoising
vp9_temporal_denoising vp9_temporal_denoising
coefficient_range_checking coefficient_range_checking
better_hw_compatibility
vp9_highbitdepth vp9_highbitdepth
experimental experimental
" "
@@ -722,6 +733,10 @@ EOF
check_add_cflags ${extra_cflags} || \ check_add_cflags ${extra_cflags} || \
die "Requested extra CFLAGS '${extra_cflags}' not supported by compiler" die "Requested extra CFLAGS '${extra_cflags}' not supported by compiler"
fi fi
if [ -n "${extra_cxxflags}" ]; then
check_add_cxxflags ${extra_cxxflags} || \
die "Requested extra CXXFLAGS '${extra_cxxflags}' not supported by compiler"
fi
} }


@@ -36,6 +36,8 @@ LIBYUV_SRCS += third_party/libyuv/include/libyuv/basic_types.h \
third_party/libyuv/source/scale_neon64.cc \ third_party/libyuv/source/scale_neon64.cc \
third_party/libyuv/source/scale_win.cc \ third_party/libyuv/source/scale_win.cc \
LIBWEBM_COMMON_SRCS += third_party/libwebm/webmids.hpp
LIBWEBM_MUXER_SRCS += third_party/libwebm/mkvmuxer.cpp \ LIBWEBM_MUXER_SRCS += third_party/libwebm/mkvmuxer.cpp \
third_party/libwebm/mkvmuxerutil.cpp \ third_party/libwebm/mkvmuxerutil.cpp \
third_party/libwebm/mkvwriter.cpp \ third_party/libwebm/mkvwriter.cpp \
@@ -43,8 +45,7 @@ LIBWEBM_MUXER_SRCS += third_party/libwebm/mkvmuxer.cpp \
third_party/libwebm/mkvmuxertypes.hpp \ third_party/libwebm/mkvmuxertypes.hpp \
third_party/libwebm/mkvmuxerutil.hpp \ third_party/libwebm/mkvmuxerutil.hpp \
third_party/libwebm/mkvparser.hpp \ third_party/libwebm/mkvparser.hpp \
third_party/libwebm/mkvwriter.hpp \ third_party/libwebm/mkvwriter.hpp
third_party/libwebm/webmids.hpp
LIBWEBM_PARSER_SRCS = third_party/libwebm/mkvparser.cpp \ LIBWEBM_PARSER_SRCS = third_party/libwebm/mkvparser.cpp \
third_party/libwebm/mkvreader.cpp \ third_party/libwebm/mkvreader.cpp \
@@ -68,6 +69,7 @@ ifeq ($(CONFIG_LIBYUV),yes)
vpxdec.SRCS += $(LIBYUV_SRCS) vpxdec.SRCS += $(LIBYUV_SRCS)
endif endif
ifeq ($(CONFIG_WEBM_IO),yes) ifeq ($(CONFIG_WEBM_IO),yes)
vpxdec.SRCS += $(LIBWEBM_COMMON_SRCS)
vpxdec.SRCS += $(LIBWEBM_PARSER_SRCS) vpxdec.SRCS += $(LIBWEBM_PARSER_SRCS)
vpxdec.SRCS += webmdec.cc webmdec.h vpxdec.SRCS += webmdec.cc webmdec.h
endif endif
@@ -89,6 +91,7 @@ ifeq ($(CONFIG_LIBYUV),yes)
vpxenc.SRCS += $(LIBYUV_SRCS) vpxenc.SRCS += $(LIBYUV_SRCS)
endif endif
ifeq ($(CONFIG_WEBM_IO),yes) ifeq ($(CONFIG_WEBM_IO),yes)
vpxenc.SRCS += $(LIBWEBM_COMMON_SRCS)
vpxenc.SRCS += $(LIBWEBM_MUXER_SRCS) vpxenc.SRCS += $(LIBWEBM_MUXER_SRCS)
vpxenc.SRCS += webmenc.cc webmenc.h vpxenc.SRCS += webmenc.cc webmenc.h
endif endif


@@ -80,6 +80,8 @@ static const arg_def_t rc_end_usage_arg =
 ARG_DEF(NULL, "rc-end-usage", 1, "0 - 3: VBR, CBR, CQ, Q");
 static const arg_def_t speed_arg =
 ARG_DEF("sp", "speed", 1, "speed configuration");
+static const arg_def_t aqmode_arg =
+ARG_DEF("aq", "aqmode", 1, "aq-mode off/on");
 #if CONFIG_VP9_HIGHBITDEPTH
 static const struct arg_enum_list bitdepth_enum[] = {
@@ -101,7 +103,7 @@ static const arg_def_t *svc_args[] = {
 &kf_dist_arg, &scale_factors_arg, &passes_arg, &pass_arg,
 &fpf_name_arg, &min_q_arg, &max_q_arg, &min_bitrate_arg,
 &max_bitrate_arg, &temporal_layers_arg, &temporal_layering_mode_arg,
-&lag_in_frame_arg, &threads_arg,
+&lag_in_frame_arg, &threads_arg, &aqmode_arg,
 #if OUTPUT_RC_STATS
 &output_rc_stats_arg,
 #endif
@@ -221,6 +223,8 @@ static void parse_command_line(int argc, const char **argv_,
 #endif
 } else if (arg_match(&arg, &speed_arg, argi)) {
 svc_ctx->speed = arg_parse_uint(&arg);
+} else if (arg_match(&arg, &aqmode_arg, argi)) {
+svc_ctx->aqmode = arg_parse_uint(&arg);
 } else if (arg_match(&arg, &threads_arg, argi)) {
 svc_ctx->threads = arg_parse_uint(&arg);
 } else if (arg_match(&arg, &temporal_layering_mode_arg, argi)) {
@@ -404,7 +408,10 @@ static void set_rate_control_stats(struct RateControlStats *rc,
 for (tl = 0; tl < cfg->ts_number_layers; ++tl) {
 const int layer = sl * cfg->ts_number_layers + tl;
 const int tlayer0 = sl * cfg->ts_number_layers;
-rc->layer_framerate[layer] =
+if (cfg->ts_number_layers == 1)
+rc->layer_framerate[layer] = framerate;
+else
+rc->layer_framerate[layer] =
 framerate / cfg->ts_rate_decimator[tl];
 if (tl > 0) {
 rc->layer_pfb[layer] = 1000.0 *
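For illustration only (a hypothetical helper, not part of the patch), the per-layer framerate rule above amounts to:

/* With framerate = 30.0, 3 temporal layers and ts_rate_decimator[] = {4, 2, 1},
 * the layer framerates come out as 7.5, 15.0 and 30.0 fps; a single temporal
 * layer keeps the full framerate. */
static double layer_framerate(double framerate, int ts_number_layers,
                              const int *ts_rate_decimator, int tl) {
  if (ts_number_layers == 1) return framerate;
  return framerate / ts_rate_decimator[tl];
}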
@@ -540,6 +547,59 @@ vpx_codec_err_t parse_superframe_index(const uint8_t *data,
 }
 #endif
// Example pattern for spatial layers and 2 temporal layers used in the
// bypass/flexible mode. The pattern corresponds to the pattern
// VP9E_TEMPORAL_LAYERING_MODE_0101 (temporal_layering_mode == 2) used in
// non-flexible mode.
void set_frame_flags_bypass_mode(int sl, int tl, int num_spatial_layers,
int is_key_frame,
vpx_svc_ref_frame_config_t *ref_frame_config) {
for (sl = 0; sl < num_spatial_layers; ++sl) {
if (!tl) {
if (!sl) {
ref_frame_config->frame_flags[sl] = VP8_EFLAG_NO_REF_GF |
VP8_EFLAG_NO_REF_ARF |
VP8_EFLAG_NO_UPD_GF |
VP8_EFLAG_NO_UPD_ARF;
} else {
if (is_key_frame) {
ref_frame_config->frame_flags[sl] = VP8_EFLAG_NO_REF_LAST |
VP8_EFLAG_NO_REF_ARF |
VP8_EFLAG_NO_UPD_GF |
VP8_EFLAG_NO_UPD_ARF;
} else {
ref_frame_config->frame_flags[sl] = VP8_EFLAG_NO_REF_ARF |
VP8_EFLAG_NO_UPD_GF |
VP8_EFLAG_NO_UPD_ARF;
}
}
} else if (tl == 1) {
if (!sl) {
ref_frame_config->frame_flags[sl] = VP8_EFLAG_NO_REF_GF |
VP8_EFLAG_NO_REF_ARF |
VP8_EFLAG_NO_UPD_LAST |
VP8_EFLAG_NO_UPD_GF;
} else {
ref_frame_config->frame_flags[sl] = VP8_EFLAG_NO_REF_ARF |
VP8_EFLAG_NO_UPD_LAST |
VP8_EFLAG_NO_UPD_GF;
}
}
if (tl == 0) {
ref_frame_config->lst_fb_idx[sl] = sl;
if (sl)
ref_frame_config->gld_fb_idx[sl] = sl - 1;
else
ref_frame_config->gld_fb_idx[sl] = 0;
ref_frame_config->alt_fb_idx[sl] = 0;
} else if (tl == 1) {
ref_frame_config->lst_fb_idx[sl] = sl;
ref_frame_config->gld_fb_idx[sl] = num_spatial_layers + sl - 1;
ref_frame_config->alt_fb_idx[sl] = num_spatial_layers + sl;
}
}
}
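For illustration only, tracing set_frame_flags_bypass_mode() above with num_spatial_layers = 2 on non-key frames gives this reference/update pattern:

/* TL0, SL0: refs LAST,        updates LAST  (lst=0, gld=0, alt=0)
 * TL0, SL1: refs LAST+GOLDEN, updates LAST  (lst=1, gld=0, alt=0)
 * TL1, SL0: refs LAST,        updates ALT   (lst=0, gld=1, alt=2)
 * TL1, SL1: refs LAST+GOLDEN, updates ALT   (lst=1, gld=2, alt=3) */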
 int main(int argc, const char **argv) {
 AppInput app_input = {0};
 VpxVideoWriter *writer = NULL;
@@ -560,6 +620,7 @@ int main(int argc, const char **argv) {
 VpxVideoWriter *outfile[VPX_TS_MAX_LAYERS] = {NULL};
 struct RateControlStats rc;
 vpx_svc_layer_id_t layer_id;
+vpx_svc_ref_frame_config_t ref_frame_config;
 int sl, tl;
 double sum_bitrate = 0.0;
 double sum_bitrate2 = 0.0;
@@ -635,7 +696,7 @@ int main(int argc, const char **argv) {
 vpx_codec_control(&codec, VP8E_SET_CPUUSED, svc_ctx.speed);
 if (svc_ctx.threads)
 vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, (svc_ctx.threads >> 1));
-if (svc_ctx.speed >= 5)
+if (svc_ctx.speed >= 5 && svc_ctx.aqmode == 1)
 vpx_codec_control(&codec, VP9E_SET_AQ_MODE, 3);
@@ -649,6 +710,37 @@ int main(int argc, const char **argv) {
 end_of_stream = 1;
 }
// For BYPASS/FLEXIBLE mode, set the frame flags (reference and updates)
// and the buffer indices for each spatial layer of the current
// (super)frame to be encoded. The temporal layer_id for the current frame
// also needs to be set.
// TODO(marpan): Should rename the "VP9E_TEMPORAL_LAYERING_MODE_BYPASS"
// mode to "VP9E_LAYERING_MODE_BYPASS".
if (svc_ctx.temporal_layering_mode == VP9E_TEMPORAL_LAYERING_MODE_BYPASS) {
layer_id.spatial_layer_id = 0;
// Example for 2 temporal layers.
if (frame_cnt % 2 == 0)
layer_id.temporal_layer_id = 0;
else
layer_id.temporal_layer_id = 1;
// Note that we only set the temporal layer_id, since we are calling
// the encode for the whole superframe. The encoder will internally loop
// over all the spatial layers for the current superframe.
vpx_codec_control(&codec, VP9E_SET_SVC_LAYER_ID, &layer_id);
set_frame_flags_bypass_mode(sl, layer_id.temporal_layer_id,
svc_ctx.spatial_layers,
frame_cnt == 0,
&ref_frame_config);
vpx_codec_control(&codec, VP9E_SET_SVC_REF_FRAME_CONFIG,
&ref_frame_config);
// Keep track of input frames, to account for frame drops in rate control
// stats/metrics.
for (sl = 0; sl < enc_cfg.ss_number_layers; ++sl) {
++rc.layer_input_frames[sl * enc_cfg.ts_number_layers +
layer_id.temporal_layer_id];
}
}
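For illustration only, the frame_cnt % 2 rule above yields the same 0101 cadence as the non-flexible temporal_layering_mode == 2:

/* frame 0 -> TL0 (key-frame flags), frame 1 -> TL1, frame 2 -> TL0, ...
 * Only the temporal layer_id is set here; vpx_svc_encode() loops over all
 * spatial layers of the superframe internally. */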
 vpx_usec_timer_start(&timer);
 res = vpx_svc_encode(&svc_ctx, &codec, (end_of_stream ? NULL : &raw),
 pts, frame_duration, svc_ctx.speed >= 5 ?
@@ -679,9 +771,16 @@ int main(int argc, const char **argv) {
 vpx_codec_control(&codec, VP9E_GET_SVC_LAYER_ID, &layer_id);
 parse_superframe_index(cx_pkt->data.frame.buf,
 cx_pkt->data.frame.sz, sizes, &count);
-for (sl = 0; sl < enc_cfg.ss_number_layers; ++sl) {
-++rc.layer_input_frames[sl * enc_cfg.ts_number_layers +
-layer_id.temporal_layer_id];
+// Note computing input_layer_frames here won't account for frame
+// drops in rate control stats.
+// TODO(marpan): Fix this for non-bypass mode so we can get stats
+// for dropped frames.
+if (svc_ctx.temporal_layering_mode !=
+VP9E_TEMPORAL_LAYERING_MODE_BYPASS) {
+for (sl = 0; sl < enc_cfg.ss_number_layers; ++sl) {
+++rc.layer_input_frames[sl * enc_cfg.ts_number_layers +
+layer_id.temporal_layer_id];
+}
 }
 for (tl = layer_id.temporal_layer_id;
 tl < enc_cfg.ts_number_layers; ++tl) {
@@ -772,6 +871,16 @@ int main(int argc, const char **argv) {
 pts += frame_duration;
 }
 }
// Compensate for the extra frame count for the bypass mode.
if (svc_ctx.temporal_layering_mode == VP9E_TEMPORAL_LAYERING_MODE_BYPASS) {
for (sl = 0; sl < enc_cfg.ss_number_layers; ++sl) {
const int layer = sl * enc_cfg.ts_number_layers +
layer_id.temporal_layer_id;
--rc.layer_input_frames[layer];
}
}
printf("Processed %d frames\n", frame_cnt); printf("Processed %d frames\n", frame_cnt);
fclose(infile); fclose(infile);
#if OUTPUT_RC_STATS #if OUTPUT_RC_STATS


@@ -684,14 +684,14 @@ int main(int argc, char **argv) {
 if (strncmp(encoder->name, "vp8", 3) == 0) {
 vpx_codec_control(&codec, VP8E_SET_CPUUSED, -speed);
 vpx_codec_control(&codec, VP8E_SET_NOISE_SENSITIVITY, kDenoiserOff);
-vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 0);
+vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
 } else if (strncmp(encoder->name, "vp9", 3) == 0) {
 vpx_svc_extra_cfg_t svc_params;
 vpx_codec_control(&codec, VP8E_SET_CPUUSED, speed);
 vpx_codec_control(&codec, VP9E_SET_AQ_MODE, 3);
 vpx_codec_control(&codec, VP9E_SET_FRAME_PERIODIC_BOOST, 0);
 vpx_codec_control(&codec, VP9E_SET_NOISE_SENSITIVITY, 0);
-vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 0);
+vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
 vpx_codec_control(&codec, VP9E_SET_TUNE_CONTENT, 0);
 vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, (cfg.g_threads >> 1));
 if (vpx_codec_control(&codec, VP9E_SET_SVC, layering_mode > 0 ? 1: 0))

libs.mk

@@ -260,7 +260,7 @@ OBJS-yes += $(LIBVPX_OBJS)
 LIBS-$(if yes,$(CONFIG_STATIC)) += $(BUILD_PFX)libvpx.a $(BUILD_PFX)libvpx_g.a
 $(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
-SO_VERSION_MAJOR := 2
+SO_VERSION_MAJOR := 3
 SO_VERSION_MINOR := 0
 SO_VERSION_PATCH := 0
 ifeq ($(filter darwin%,$(TGT_OS)),$(TGT_OS))
@@ -429,12 +429,10 @@ testdata:: $(LIBVPX_TEST_DATA)
 if [ -n "$${sha1sum}" ]; then\
 set -e;\
 echo "Checking test data:";\
-if [ -n "$(LIBVPX_TEST_DATA)" ]; then\
-for f in $(call enabled,LIBVPX_TEST_DATA); do\
-grep $$f $(SRC_PATH_BARE)/test/test-data.sha1 |\
-(cd $(LIBVPX_TEST_DATA_PATH); $${sha1sum} -c);\
-done; \
-fi; \
+for f in $(call enabled,LIBVPX_TEST_DATA); do\
+grep $$f $(SRC_PATH_BARE)/test/test-data.sha1 |\
+(cd $(LIBVPX_TEST_DATA_PATH); $${sha1sum} -c);\
+done; \
 else\
 echo "Skipping test data integrity check, sha1sum not found.";\
 fi


@@ -0,0 +1,127 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <algorithm>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
#include "test/y4m_video_source.h"
namespace {
// Check if any pixel in a 16x16 macroblock varies between frames.
int CheckMb(const vpx_image_t &current, const vpx_image_t &previous,
int mb_r, int mb_c) {
for (int plane = 0; plane < 3; plane++) {
int r = 16 * mb_r;
int c0 = 16 * mb_c;
int r_top = std::min(r + 16, static_cast<int>(current.d_h));
int c_top = std::min(c0 + 16, static_cast<int>(current.d_w));
r = std::max(r, 0);
c0 = std::max(c0, 0);
if (plane > 0 && current.x_chroma_shift) {
c_top = (c_top + 1) >> 1;
c0 >>= 1;
}
if (plane > 0 && current.y_chroma_shift) {
r_top = (r_top + 1) >> 1;
r >>= 1;
}
for (; r < r_top; ++r) {
for (int c = c0; c < c_top; ++c) {
if (current.planes[plane][current.stride[plane] * r + c] !=
previous.planes[plane][previous.stride[plane] * r + c])
return 1;
}
}
}
return 0;
}
void GenerateMap(int mb_rows, int mb_cols, const vpx_image_t &current,
const vpx_image_t &previous, uint8_t *map) {
for (int mb_r = 0; mb_r < mb_rows; ++mb_r) {
for (int mb_c = 0; mb_c < mb_cols; ++mb_c) {
map[mb_r * mb_cols + mb_c] = CheckMb(current, previous, mb_r, mb_c);
}
}
}
const int kAqModeCyclicRefresh = 3;
class ActiveMapRefreshTest
: public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
ActiveMapRefreshTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ActiveMapRefreshTest() {}
virtual void SetUp() {
InitializeConfig();
SetMode(GET_PARAM(1));
cpu_used_ = GET_PARAM(2);
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
::libvpx_test::Y4mVideoSource *y4m_video =
static_cast<libvpx_test::Y4mVideoSource *>(video);
if (video->frame() == 1) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_SET_AQ_MODE, kAqModeCyclicRefresh);
} else if (video->frame() >= 2 && video->img()) {
vpx_image_t *current = video->img();
vpx_image_t *previous = y4m_holder_->img();
ASSERT_TRUE(previous != NULL);
vpx_active_map_t map = vpx_active_map_t();
const int width = static_cast<int>(current->d_w);
const int height = static_cast<int>(current->d_h);
const int mb_width = (width + 15) / 16;
const int mb_height = (height + 15) / 16;
uint8_t *active_map = new uint8_t[mb_width * mb_height];
GenerateMap(mb_height, mb_width, *current, *previous, active_map);
map.cols = mb_width;
map.rows = mb_height;
map.active_map = active_map;
encoder->Control(VP8E_SET_ACTIVEMAP, &map);
delete[] active_map;
}
if (video->img()) {
y4m_video->SwapBuffers(y4m_holder_);
}
}
int cpu_used_;
::libvpx_test::Y4mVideoSource *y4m_holder_;
};
TEST_P(ActiveMapRefreshTest, Test) {
cfg_.g_lag_in_frames = 0;
cfg_.g_profile = 1;
cfg_.rc_target_bitrate = 600;
cfg_.rc_resize_allowed = 0;
cfg_.rc_min_quantizer = 8;
cfg_.rc_max_quantizer = 30;
cfg_.g_pass = VPX_RC_ONE_PASS;
cfg_.rc_end_usage = VPX_CBR;
cfg_.kf_max_dist = 90000;
::libvpx_test::Y4mVideoSource video("desktop_credits.y4m", 0, 30);
::libvpx_test::Y4mVideoSource video_holder("desktop_credits.y4m", 0, 30);
video_holder.Begin();
y4m_holder_ = &video_holder;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
VP9_INSTANTIATE_TEST_CASE(ActiveMapRefreshTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(5, 6));
} // namespace


@@ -15,9 +15,7 @@
#include "third_party/googletest/src/include/gtest/gtest.h" #include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h" #include "./vpx_config.h"
#if CONFIG_VP9_ENCODER #include "./vpx_dsp_rtcd.h"
#include "./vp9_rtcd.h"
#endif
#include "test/acm_random.h" #include "test/acm_random.h"
#include "test/clear_system_state.h" #include "test/clear_system_state.h"
@@ -194,6 +192,48 @@ class IntProColTest
int16_t sum_c_; int16_t sum_c_;
}; };
typedef int (*SatdFunc)(const int16_t *coeffs, int length);
typedef std::tr1::tuple<int, SatdFunc> SatdTestParam;
class SatdTest
: public ::testing::Test,
public ::testing::WithParamInterface<SatdTestParam> {
protected:
virtual void SetUp() {
satd_size_ = GET_PARAM(0);
satd_func_ = GET_PARAM(1);
rnd_.Reset(ACMRandom::DeterministicSeed());
src_ = reinterpret_cast<int16_t*>(
vpx_memalign(16, sizeof(*src_) * satd_size_));
ASSERT_TRUE(src_ != NULL);
}
virtual void TearDown() {
libvpx_test::ClearSystemState();
vpx_free(src_);
}
void FillConstant(const int16_t val) {
for (int i = 0; i < satd_size_; ++i) src_[i] = val;
}
void FillRandom() {
for (int i = 0; i < satd_size_; ++i) src_[i] = rnd_.Rand16();
}
void Check(const int expected) {
int total;
ASM_REGISTER_STATE_CHECK(total = satd_func_(src_, satd_size_));
EXPECT_EQ(expected, total);
}
int satd_size_;
private:
int16_t *src_;
SatdFunc satd_func_;
ACMRandom rnd_;
};
 uint8_t* AverageTestBase::source_data_ = NULL;
@@ -246,69 +286,126 @@ TEST_P(IntProColTest, Random) {
 RunComparison();
 }
TEST_P(SatdTest, MinValue) {
const int kMin = -32640;
const int expected = -kMin * satd_size_;
FillConstant(kMin);
Check(expected);
}
TEST_P(SatdTest, MaxValue) {
const int kMax = 32640;
const int expected = kMax * satd_size_;
FillConstant(kMax);
Check(expected);
}
TEST_P(SatdTest, Random) {
int expected;
switch (satd_size_) {
case 16: expected = 205298; break;
case 64: expected = 1113950; break;
case 256: expected = 4268415; break;
case 1024: expected = 16954082; break;
default:
FAIL() << "Invalid satd size (" << satd_size_
<< ") valid: 16/64/256/1024";
}
FillRandom();
Check(expected);
}
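For illustration only, the constant-input expectations above imply that vpx_satd() reduces to a sum of absolute coefficient values, e.g.:

/* satd_size_ = 64, FillConstant(-32640):
 * expected = -kMin * satd_size_ = 32640 * 64 = 2088960. */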
 using std::tr1::make_tuple;
 INSTANTIATE_TEST_CASE_P(
 C, AverageTest,
 ::testing::Values(
-make_tuple(16, 16, 1, 8, &vp9_avg_8x8_c),
-make_tuple(16, 16, 1, 4, &vp9_avg_4x4_c)));
+make_tuple(16, 16, 1, 8, &vpx_avg_8x8_c),
+make_tuple(16, 16, 1, 4, &vpx_avg_4x4_c)));
INSTANTIATE_TEST_CASE_P(
C, SatdTest,
::testing::Values(
make_tuple(16, &vpx_satd_c),
make_tuple(64, &vpx_satd_c),
make_tuple(256, &vpx_satd_c),
make_tuple(1024, &vpx_satd_c)));
 #if HAVE_SSE2
 INSTANTIATE_TEST_CASE_P(
 SSE2, AverageTest,
 ::testing::Values(
-make_tuple(16, 16, 0, 8, &vp9_avg_8x8_sse2),
-make_tuple(16, 16, 5, 8, &vp9_avg_8x8_sse2),
-make_tuple(32, 32, 15, 8, &vp9_avg_8x8_sse2),
-make_tuple(16, 16, 0, 4, &vp9_avg_4x4_sse2),
-make_tuple(16, 16, 5, 4, &vp9_avg_4x4_sse2),
-make_tuple(32, 32, 15, 4, &vp9_avg_4x4_sse2)));
+make_tuple(16, 16, 0, 8, &vpx_avg_8x8_sse2),
+make_tuple(16, 16, 5, 8, &vpx_avg_8x8_sse2),
+make_tuple(32, 32, 15, 8, &vpx_avg_8x8_sse2),
+make_tuple(16, 16, 0, 4, &vpx_avg_4x4_sse2),
+make_tuple(16, 16, 5, 4, &vpx_avg_4x4_sse2),
+make_tuple(32, 32, 15, 4, &vpx_avg_4x4_sse2)));
 INSTANTIATE_TEST_CASE_P(
 SSE2, IntProRowTest, ::testing::Values(
-make_tuple(16, &vp9_int_pro_row_sse2, &vp9_int_pro_row_c),
-make_tuple(32, &vp9_int_pro_row_sse2, &vp9_int_pro_row_c),
-make_tuple(64, &vp9_int_pro_row_sse2, &vp9_int_pro_row_c)));
+make_tuple(16, &vpx_int_pro_row_sse2, &vpx_int_pro_row_c),
+make_tuple(32, &vpx_int_pro_row_sse2, &vpx_int_pro_row_c),
+make_tuple(64, &vpx_int_pro_row_sse2, &vpx_int_pro_row_c)));
 INSTANTIATE_TEST_CASE_P(
 SSE2, IntProColTest, ::testing::Values(
-make_tuple(16, &vp9_int_pro_col_sse2, &vp9_int_pro_col_c),
-make_tuple(32, &vp9_int_pro_col_sse2, &vp9_int_pro_col_c),
-make_tuple(64, &vp9_int_pro_col_sse2, &vp9_int_pro_col_c)));
+make_tuple(16, &vpx_int_pro_col_sse2, &vpx_int_pro_col_c),
+make_tuple(32, &vpx_int_pro_col_sse2, &vpx_int_pro_col_c),
+make_tuple(64, &vpx_int_pro_col_sse2, &vpx_int_pro_col_c)));
INSTANTIATE_TEST_CASE_P(
SSE2, SatdTest,
::testing::Values(
make_tuple(16, &vpx_satd_sse2),
make_tuple(64, &vpx_satd_sse2),
make_tuple(256, &vpx_satd_sse2),
make_tuple(1024, &vpx_satd_sse2)));
 #endif
 #if HAVE_NEON
 INSTANTIATE_TEST_CASE_P(
 NEON, AverageTest,
 ::testing::Values(
-make_tuple(16, 16, 0, 8, &vp9_avg_8x8_neon),
-make_tuple(16, 16, 5, 8, &vp9_avg_8x8_neon),
-make_tuple(32, 32, 15, 8, &vp9_avg_8x8_neon)));
+make_tuple(16, 16, 0, 8, &vpx_avg_8x8_neon),
+make_tuple(16, 16, 5, 8, &vpx_avg_8x8_neon),
+make_tuple(32, 32, 15, 8, &vpx_avg_8x8_neon),
+make_tuple(16, 16, 0, 4, &vpx_avg_4x4_neon),
+make_tuple(16, 16, 5, 4, &vpx_avg_4x4_neon),
+make_tuple(32, 32, 15, 4, &vpx_avg_4x4_neon)));
 INSTANTIATE_TEST_CASE_P(
 NEON, IntProRowTest, ::testing::Values(
-make_tuple(16, &vp9_int_pro_row_neon, &vp9_int_pro_row_c),
-make_tuple(32, &vp9_int_pro_row_neon, &vp9_int_pro_row_c),
-make_tuple(64, &vp9_int_pro_row_neon, &vp9_int_pro_row_c)));
+make_tuple(16, &vpx_int_pro_row_neon, &vpx_int_pro_row_c),
+make_tuple(32, &vpx_int_pro_row_neon, &vpx_int_pro_row_c),
+make_tuple(64, &vpx_int_pro_row_neon, &vpx_int_pro_row_c)));
 INSTANTIATE_TEST_CASE_P(
 NEON, IntProColTest, ::testing::Values(
-make_tuple(16, &vp9_int_pro_col_neon, &vp9_int_pro_col_c),
-make_tuple(32, &vp9_int_pro_col_neon, &vp9_int_pro_col_c),
-make_tuple(64, &vp9_int_pro_col_neon, &vp9_int_pro_col_c)));
+make_tuple(16, &vpx_int_pro_col_neon, &vpx_int_pro_col_c),
+make_tuple(32, &vpx_int_pro_col_neon, &vpx_int_pro_col_c),
+make_tuple(64, &vpx_int_pro_col_neon, &vpx_int_pro_col_c)));
INSTANTIATE_TEST_CASE_P(
NEON, SatdTest,
::testing::Values(
make_tuple(16, &vpx_satd_neon),
make_tuple(64, &vpx_satd_neon),
make_tuple(256, &vpx_satd_neon),
make_tuple(1024, &vpx_satd_neon)));
 #endif
 #if HAVE_MSA
 INSTANTIATE_TEST_CASE_P(
 MSA, AverageTest,
 ::testing::Values(
-make_tuple(16, 16, 0, 8, &vp9_avg_8x8_msa),
-make_tuple(16, 16, 5, 8, &vp9_avg_8x8_msa),
-make_tuple(32, 32, 15, 8, &vp9_avg_8x8_msa),
-make_tuple(16, 16, 0, 4, &vp9_avg_4x4_msa),
-make_tuple(16, 16, 5, 4, &vp9_avg_4x4_msa),
-make_tuple(32, 32, 15, 4, &vp9_avg_4x4_msa)));
+make_tuple(16, 16, 0, 8, &vpx_avg_8x8_msa),
+make_tuple(16, 16, 5, 8, &vpx_avg_8x8_msa),
+make_tuple(32, 32, 15, 8, &vpx_avg_8x8_msa),
+make_tuple(16, 16, 0, 4, &vpx_avg_4x4_msa),
+make_tuple(16, 16, 5, 4, &vpx_avg_4x4_msa),
+make_tuple(32, 32, 15, 4, &vpx_avg_4x4_msa)));
 #endif
 } // namespace


@@ -960,511 +960,72 @@ TEST_P(ConvolveTest, CheckScalingFiltering) {
 using std::tr1::make_tuple;
 #if CONFIG_VP9_HIGHBITDEPTH
#define WRAP(func, bd) \
void wrap_ ## func ## _ ## bd(const uint8_t *src, ptrdiff_t src_stride, \
uint8_t *dst, ptrdiff_t dst_stride, \
const int16_t *filter_x, \
int filter_x_stride, \
const int16_t *filter_y, \
int filter_y_stride, \
int w, int h) { \
vpx_highbd_ ## func(src, src_stride, dst, dst_stride, filter_x, \
filter_x_stride, filter_y, filter_y_stride, \
w, h, bd); \
}
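For illustration only, one expansion of the WRAP macro above, WRAP(convolve8_horiz_sse2, 8), is equivalent to (whitespace aside):

void wrap_convolve8_horiz_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
                                 uint8_t *dst, ptrdiff_t dst_stride,
                                 const int16_t *filter_x, int filter_x_stride,
                                 const int16_t *filter_y, int filter_y_stride,
                                 int w, int h) {
  vpx_highbd_convolve8_horiz_sse2(src, src_stride, dst, dst_stride, filter_x,
                                  filter_x_stride, filter_y, filter_y_stride,
                                  w, h, 8);
}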
 #if HAVE_SSE2 && ARCH_X86_64
+#if CONFIG_USE_X86INC
+WRAP(convolve_copy_sse2, 8)
+WRAP(convolve_avg_sse2, 8)
+WRAP(convolve_copy_sse2, 10)
+WRAP(convolve_avg_sse2, 10)
+WRAP(convolve_copy_sse2, 12)
+WRAP(convolve_avg_sse2, 12)
+#endif // CONFIG_USE_X86INC
+WRAP(convolve8_horiz_sse2, 8)
+WRAP(convolve8_avg_horiz_sse2, 8)
+WRAP(convolve8_vert_sse2, 8)
+WRAP(convolve8_avg_vert_sse2, 8)
+WRAP(convolve8_sse2, 8)
+WRAP(convolve8_avg_sse2, 8)
+WRAP(convolve8_horiz_sse2, 10)
+WRAP(convolve8_avg_horiz_sse2, 10)
+WRAP(convolve8_vert_sse2, 10)
+WRAP(convolve8_avg_vert_sse2, 10)
+WRAP(convolve8_sse2, 10)
+WRAP(convolve8_avg_sse2, 10)
+WRAP(convolve8_horiz_sse2, 12)
+WRAP(convolve8_avg_horiz_sse2, 12)
+WRAP(convolve8_vert_sse2, 12)
+WRAP(convolve8_avg_vert_sse2, 12)
+WRAP(convolve8_sse2, 12)
+WRAP(convolve8_avg_sse2, 12)
void wrap_convolve8_horiz_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_sse2(src, src_stride, dst, dst_stride, filter_x,
filter_x_stride, filter_y, filter_y_stride,
w, h, 8);
}
void wrap_convolve8_avg_horiz_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_vert_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_avg_vert_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_avg_sse2_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_horiz_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_horiz_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_vert_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_vert_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_sse2_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_horiz_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_horiz_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_vert_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_vert_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_sse2_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_sse2(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
 #endif // HAVE_SSE2 && ARCH_X86_64
+WRAP(convolve_copy_c, 8)
+WRAP(convolve_avg_c, 8)
+WRAP(convolve8_horiz_c, 8)
+WRAP(convolve8_avg_horiz_c, 8)
+WRAP(convolve8_vert_c, 8)
+WRAP(convolve8_avg_vert_c, 8)
+WRAP(convolve8_c, 8)
+WRAP(convolve8_avg_c, 8)
+WRAP(convolve_copy_c, 10)
+WRAP(convolve_avg_c, 10)
+WRAP(convolve8_horiz_c, 10)
+WRAP(convolve8_avg_horiz_c, 10)
+WRAP(convolve8_vert_c, 10)
+WRAP(convolve8_avg_vert_c, 10)
+WRAP(convolve8_c, 10)
+WRAP(convolve8_avg_c, 10)
+WRAP(convolve_copy_c, 12)
+WRAP(convolve_avg_c, 12)
+WRAP(convolve8_horiz_c, 12)
+WRAP(convolve8_avg_horiz_c, 12)
+WRAP(convolve8_vert_c, 12)
+WRAP(convolve8_avg_vert_c, 12)
+WRAP(convolve8_c, 12)
+WRAP(convolve8_avg_c, 12)
+#undef WRAP
void wrap_convolve_copy_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_copy_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve_avg_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_horiz_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_avg_horiz_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_vert_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_avg_vert_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve8_avg_c_8(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 8);
}
void wrap_convolve_copy_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_copy_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve_avg_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_horiz_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_horiz_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_vert_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_vert_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve8_avg_c_10(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 10);
}
void wrap_convolve_copy_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_copy_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve_avg_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_horiz_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_horiz_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_horiz_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_vert_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_vert_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_vert_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
void wrap_convolve8_avg_c_12(const uint8_t *src, ptrdiff_t src_stride,
uint8_t *dst, ptrdiff_t dst_stride,
const int16_t *filter_x,
int filter_x_stride,
const int16_t *filter_y,
int filter_y_stride,
int w, int h) {
vpx_highbd_convolve8_avg_c(src, src_stride, dst, dst_stride,
filter_x, filter_x_stride,
filter_y, filter_y_stride, w, h, 12);
}
 const ConvolveFunctions convolve8_c(
 wrap_convolve_copy_c_8, wrap_convolve_avg_c_8,
@@ -1563,7 +1124,11 @@ INSTANTIATE_TEST_CASE_P(C, ConvolveTest, ::testing::Values(
 #if HAVE_SSE2 && ARCH_X86_64
 #if CONFIG_VP9_HIGHBITDEPTH
 const ConvolveFunctions convolve8_sse2(
+#if CONFIG_USE_X86INC
+wrap_convolve_copy_sse2_8, wrap_convolve_avg_sse2_8,
+#else
 wrap_convolve_copy_c_8, wrap_convolve_avg_c_8,
+#endif // CONFIG_USE_X86INC
 wrap_convolve8_horiz_sse2_8, wrap_convolve8_avg_horiz_sse2_8,
 wrap_convolve8_vert_sse2_8, wrap_convolve8_avg_vert_sse2_8,
 wrap_convolve8_sse2_8, wrap_convolve8_avg_sse2_8,
@@ -1571,7 +1136,11 @@ const ConvolveFunctions convolve8_sse2(
 wrap_convolve8_vert_sse2_8, wrap_convolve8_avg_vert_sse2_8,
 wrap_convolve8_sse2_8, wrap_convolve8_avg_sse2_8, 8);
 const ConvolveFunctions convolve10_sse2(
+#if CONFIG_USE_X86INC
+wrap_convolve_copy_sse2_10, wrap_convolve_avg_sse2_10,
+#else
 wrap_convolve_copy_c_10, wrap_convolve_avg_c_10,
+#endif // CONFIG_USE_X86INC
 wrap_convolve8_horiz_sse2_10, wrap_convolve8_avg_horiz_sse2_10,
 wrap_convolve8_vert_sse2_10, wrap_convolve8_avg_vert_sse2_10,
 wrap_convolve8_sse2_10, wrap_convolve8_avg_sse2_10,
@@ -1579,7 +1148,11 @@ const ConvolveFunctions convolve10_sse2(
 wrap_convolve8_vert_sse2_10, wrap_convolve8_avg_vert_sse2_10,
 wrap_convolve8_sse2_10, wrap_convolve8_avg_sse2_10, 10);
 const ConvolveFunctions convolve12_sse2(
+#if CONFIG_USE_X86INC
+wrap_convolve_copy_sse2_12, wrap_convolve_avg_sse2_12,
+#else
 wrap_convolve_copy_c_12, wrap_convolve_avg_c_12,
+#endif // CONFIG_USE_X86INC
 wrap_convolve8_horiz_sse2_12, wrap_convolve8_avg_horiz_sse2_12,
 wrap_convolve8_vert_sse2_12, wrap_convolve8_avg_vert_sse2_12,
 wrap_convolve8_sse2_12, wrap_convolve8_avg_sse2_12,


@@ -538,7 +538,7 @@ TEST_P(DatarateTestVP9Large, ChangingDropFrameThresh) {
<< " The first dropped frame for drop_thresh " << i << " The first dropped frame for drop_thresh " << i
<< " > first dropped frame for drop_thresh " << " > first dropped frame for drop_thresh "
<< i - kDropFrameThreshTestStep; << i - kDropFrameThreshTestStep;
ASSERT_GE(num_drops_, last_num_drops * 0.90) ASSERT_GE(num_drops_, last_num_drops * 0.85)
<< " The number of dropped frames for drop_thresh " << i << " The number of dropped frames for drop_thresh " << i
<< " < number of dropped frames for drop_thresh " << " < number of dropped frames for drop_thresh "
<< i - kDropFrameThreshTestStep; << i - kDropFrameThreshTestStep;
@@ -770,7 +770,7 @@ class DatarateOnePassCbrSvc : public ::libvpx_test::EncoderTest,
::libvpx_test::Encoder *encoder) { ::libvpx_test::Encoder *encoder) {
if (video->frame() == 0) { if (video->frame() == 0) {
int i; int i;
for (i = 0; i < 2; ++i) { for (i = 0; i < VPX_MAX_LAYERS; ++i) {
svc_params_.max_quantizers[i] = 63; svc_params_.max_quantizers[i] = 63;
svc_params_.min_quantizers[i] = 0; svc_params_.min_quantizers[i] = 0;
} }


@@ -124,6 +124,11 @@ class Encoder {
 ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
 }
+void Control(int ctrl_id, int *arg) {
+const vpx_codec_err_t res = vpx_codec_control_(&encoder_, ctrl_id, arg);
+ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
+}
 void Control(int ctrl_id, struct vpx_scaling_mode *arg) {
 const vpx_codec_err_t res = vpx_codec_control_(&encoder_, ctrl_id, arg);
 ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();


@@ -1,406 +0,0 @@
/*
* Copyright (c) 2012 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h"
#include "./vp8_rtcd.h"
#include "test/acm_random.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "vp8/common/blockd.h"
#include "vpx_mem/vpx_mem.h"
namespace {
using libvpx_test::ACMRandom;
class IntraPredBase {
public:
virtual ~IntraPredBase() { libvpx_test::ClearSystemState(); }
protected:
void SetupMacroblock(MACROBLOCKD *mbptr,
MODE_INFO *miptr,
uint8_t *data,
int block_size,
int stride,
int num_planes) {
mbptr_ = mbptr;
miptr_ = miptr;
mbptr_->up_available = 1;
mbptr_->left_available = 1;
mbptr_->mode_info_context = miptr_;
stride_ = stride;
block_size_ = block_size;
num_planes_ = num_planes;
for (int p = 0; p < num_planes; p++)
data_ptr_[p] = data + stride * (block_size + 1) * p +
stride + block_size;
}
void FillRandom() {
// Fill edges with random data
ACMRandom rnd(ACMRandom::DeterministicSeed());
for (int p = 0; p < num_planes_; p++) {
for (int x = -1 ; x <= block_size_; x++)
data_ptr_[p][x - stride_] = rnd.Rand8();
for (int y = 0; y < block_size_; y++)
data_ptr_[p][y * stride_ - 1] = rnd.Rand8();
}
}
virtual void Predict(MB_PREDICTION_MODE mode) = 0;
void SetLeftUnavailable() {
mbptr_->left_available = 0;
for (int p = 0; p < num_planes_; p++)
for (int i = -1; i < block_size_; ++i)
data_ptr_[p][stride_ * i - 1] = 129;
}
void SetTopUnavailable() {
mbptr_->up_available = 0;
for (int p = 0; p < num_planes_; p++)
memset(&data_ptr_[p][-1 - stride_], 127, block_size_ + 2);
}
void SetTopLeftUnavailable() {
SetLeftUnavailable();
SetTopUnavailable();
}
int BlockSizeLog2Min1() const {
switch (block_size_) {
case 16:
return 3;
case 8:
return 2;
default:
return 0;
}
}
// check DC prediction output against a reference
void CheckDCPrediction() const {
for (int p = 0; p < num_planes_; p++) {
// calculate expected DC
int expected;
if (mbptr_->up_available || mbptr_->left_available) {
int sum = 0, shift = BlockSizeLog2Min1() + mbptr_->up_available +
mbptr_->left_available;
if (mbptr_->up_available)
for (int x = 0; x < block_size_; x++)
sum += data_ptr_[p][x - stride_];
if (mbptr_->left_available)
for (int y = 0; y < block_size_; y++)
sum += data_ptr_[p][y * stride_ - 1];
expected = (sum + (1 << (shift - 1))) >> shift;
} else {
expected = 0x80;
}
// check that all subsequent lines are equal to the first
for (int y = 1; y < block_size_; ++y)
ASSERT_EQ(0, memcmp(data_ptr_[p], &data_ptr_[p][y * stride_],
block_size_));
// within the first line, ensure that each pixel has the same value
for (int x = 1; x < block_size_; ++x)
ASSERT_EQ(data_ptr_[p][0], data_ptr_[p][x]);
// now ensure that that pixel has the expected (DC) value
ASSERT_EQ(expected, data_ptr_[p][0]);
}
}
// check V prediction output against a reference
void CheckVPrediction() const {
// check that all lines equal the top border
for (int p = 0; p < num_planes_; p++)
for (int y = 0; y < block_size_; y++)
ASSERT_EQ(0, memcmp(&data_ptr_[p][-stride_],
&data_ptr_[p][y * stride_], block_size_));
}
// check H prediction output against a reference
void CheckHPrediction() const {
// for each line, ensure that each pixel is equal to the left border
for (int p = 0; p < num_planes_; p++)
for (int y = 0; y < block_size_; y++)
for (int x = 0; x < block_size_; x++)
ASSERT_EQ(data_ptr_[p][-1 + y * stride_],
data_ptr_[p][x + y * stride_]);
}
static int ClipByte(int value) {
if (value > 255)
return 255;
else if (value < 0)
return 0;
return value;
}
// check TM prediction output against a reference
void CheckTMPrediction() const {
for (int p = 0; p < num_planes_; p++)
for (int y = 0; y < block_size_; y++)
for (int x = 0; x < block_size_; x++) {
const int expected = ClipByte(data_ptr_[p][x - stride_]
+ data_ptr_[p][stride_ * y - 1]
- data_ptr_[p][-1 - stride_]);
ASSERT_EQ(expected, data_ptr_[p][y * stride_ + x]);
}
}
// Actual test
void RunTest() {
{
SCOPED_TRACE("DC_PRED");
FillRandom();
Predict(DC_PRED);
CheckDCPrediction();
}
{
SCOPED_TRACE("DC_PRED LEFT");
FillRandom();
SetLeftUnavailable();
Predict(DC_PRED);
CheckDCPrediction();
}
{
SCOPED_TRACE("DC_PRED TOP");
FillRandom();
SetTopUnavailable();
Predict(DC_PRED);
CheckDCPrediction();
}
{
SCOPED_TRACE("DC_PRED TOP_LEFT");
FillRandom();
SetTopLeftUnavailable();
Predict(DC_PRED);
CheckDCPrediction();
}
{
SCOPED_TRACE("H_PRED");
FillRandom();
Predict(H_PRED);
CheckHPrediction();
}
{
SCOPED_TRACE("V_PRED");
FillRandom();
Predict(V_PRED);
CheckVPrediction();
}
{
SCOPED_TRACE("TM_PRED");
FillRandom();
Predict(TM_PRED);
CheckTMPrediction();
}
}
MACROBLOCKD *mbptr_;
MODE_INFO *miptr_;
uint8_t *data_ptr_[2]; // in the case of Y, only [0] is used
int stride_;
int block_size_;
int num_planes_;
};
typedef void (*IntraPredYFunc)(MACROBLOCKD *x,
uint8_t *yabove_row,
uint8_t *yleft,
int left_stride,
uint8_t *ypred_ptr,
int y_stride);
class IntraPredYTest
: public IntraPredBase,
public ::testing::TestWithParam<IntraPredYFunc> {
public:
static void SetUpTestCase() {
mb_ = reinterpret_cast<MACROBLOCKD*>(
vpx_memalign(32, sizeof(MACROBLOCKD)));
mi_ = reinterpret_cast<MODE_INFO*>(
vpx_memalign(32, sizeof(MODE_INFO)));
data_array_ = reinterpret_cast<uint8_t*>(
vpx_memalign(kDataAlignment, kDataBufferSize));
}
static void TearDownTestCase() {
vpx_free(data_array_);
vpx_free(mi_);
vpx_free(mb_);
data_array_ = NULL;
}
protected:
static const int kBlockSize = 16;
static const int kDataAlignment = 16;
static const int kStride = kBlockSize * 3;
// We use 48 so that the data pointer of the first pixel in each row of
// each macroblock is 16-byte aligned, and this gives us access to the
// top-left and top-right corner pixels belonging to the top-left/right
// macroblocks.
// We use 17 lines so we have one line above us for top-prediction.
static const int kDataBufferSize = kStride * (kBlockSize + 1);
virtual void SetUp() {
pred_fn_ = GetParam();
SetupMacroblock(mb_, mi_, data_array_, kBlockSize, kStride, 1);
}
virtual void Predict(MB_PREDICTION_MODE mode) {
mbptr_->mode_info_context->mbmi.mode = mode;
ASM_REGISTER_STATE_CHECK(pred_fn_(mbptr_,
data_ptr_[0] - kStride,
data_ptr_[0] - 1, kStride,
data_ptr_[0], kStride));
}
IntraPredYFunc pred_fn_;
static uint8_t* data_array_;
static MACROBLOCKD * mb_;
static MODE_INFO *mi_;
};
MACROBLOCKD* IntraPredYTest::mb_ = NULL;
MODE_INFO* IntraPredYTest::mi_ = NULL;
uint8_t* IntraPredYTest::data_array_ = NULL;
TEST_P(IntraPredYTest, IntraPredTests) {
RunTest();
}
INSTANTIATE_TEST_CASE_P(C, IntraPredYTest,
::testing::Values(
vp8_build_intra_predictors_mby_s_c));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, IntraPredYTest,
::testing::Values(
vp8_build_intra_predictors_mby_s_sse2));
#endif
#if HAVE_SSSE3
INSTANTIATE_TEST_CASE_P(SSSE3, IntraPredYTest,
::testing::Values(
vp8_build_intra_predictors_mby_s_ssse3));
#endif
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, IntraPredYTest,
::testing::Values(
vp8_build_intra_predictors_mby_s_neon));
#endif
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, IntraPredYTest,
::testing::Values(
vp8_build_intra_predictors_mby_s_msa));
#endif
typedef void (*IntraPredUvFunc)(MACROBLOCKD *x,
uint8_t *uabove_row,
uint8_t *vabove_row,
uint8_t *uleft,
uint8_t *vleft,
int left_stride,
uint8_t *upred_ptr,
uint8_t *vpred_ptr,
int pred_stride);
class IntraPredUVTest
: public IntraPredBase,
public ::testing::TestWithParam<IntraPredUvFunc> {
public:
static void SetUpTestCase() {
mb_ = reinterpret_cast<MACROBLOCKD*>(
vpx_memalign(32, sizeof(MACROBLOCKD)));
mi_ = reinterpret_cast<MODE_INFO*>(
vpx_memalign(32, sizeof(MODE_INFO)));
data_array_ = reinterpret_cast<uint8_t*>(
vpx_memalign(kDataAlignment, kDataBufferSize));
}
static void TearDownTestCase() {
vpx_free(data_array_);
vpx_free(mi_);
vpx_free(mb_);
data_array_ = NULL;
}
protected:
static const int kBlockSize = 8;
static const int kDataAlignment = 8;
static const int kStride = kBlockSize * 3;
// We use 24 so that the data pointer of the first pixel in each row of
// each macroblock is 8-byte aligned, and this gives us access to the
// top-left and top-right corner pixels belonging to the top-left/right
// macroblocks.
// We use 9 lines so we have one line above us for top-prediction.
// [0] = U, [1] = V
static const int kDataBufferSize = 2 * kStride * (kBlockSize + 1);
virtual void SetUp() {
pred_fn_ = GetParam();
SetupMacroblock(mb_, mi_, data_array_, kBlockSize, kStride, 2);
}
virtual void Predict(MB_PREDICTION_MODE mode) {
mbptr_->mode_info_context->mbmi.uv_mode = mode;
pred_fn_(mbptr_, data_ptr_[0] - kStride, data_ptr_[1] - kStride,
data_ptr_[0] - 1, data_ptr_[1] - 1, kStride,
data_ptr_[0], data_ptr_[1], kStride);
}
IntraPredUvFunc pred_fn_;
// We use 24 so that the data pointer of the first pixel in each row of
// each macroblock is 8-byte aligned, and this gives us access to the
// top-left and top-right corner pixels belonging to the top-left/right
// macroblocks.
// We use 9 lines so we have one line above us for top-prediction.
// [0] = U, [1] = V
static uint8_t* data_array_;
static MACROBLOCKD* mb_;
static MODE_INFO* mi_;
};
MACROBLOCKD* IntraPredUVTest::mb_ = NULL;
MODE_INFO* IntraPredUVTest::mi_ = NULL;
uint8_t* IntraPredUVTest::data_array_ = NULL;
TEST_P(IntraPredUVTest, IntraPredTests) {
RunTest();
}
INSTANTIATE_TEST_CASE_P(C, IntraPredUVTest,
::testing::Values(
vp8_build_intra_predictors_mbuv_s_c));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, IntraPredUVTest,
::testing::Values(
vp8_build_intra_predictors_mbuv_s_sse2));
#endif
#if HAVE_SSSE3
INSTANTIATE_TEST_CASE_P(SSSE3, IntraPredUVTest,
::testing::Values(
vp8_build_intra_predictors_mbuv_s_ssse3));
#endif
#if HAVE_NEON
INSTANTIATE_TEST_CASE_P(NEON, IntraPredUVTest,
::testing::Values(
vp8_build_intra_predictors_mbuv_s_neon));
#endif
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, IntraPredUVTest,
::testing::Values(
vp8_build_intra_predictors_mbuv_s_msa));
#endif
} // namespace


@@ -63,9 +63,22 @@ class InvalidFileTest
 EXPECT_NE(res, EOF) << "Read result data failed";
 // Check results match.
-EXPECT_EQ(expected_res_dec, res_dec)
-<< "Results don't match: frame number = " << video.frame_number()
-<< ". (" << decoder->DecodeError() << ")";
+const DecodeParam input = GET_PARAM(1);
+if (input.threads > 1) {
+// The serial decode check is too strict for tile-threaded decoding as
+// there is no guarantee on the decode order nor which specific error
+// will take precedence. Currently a tile-level error is not forwarded so
+// the frame will simply be marked corrupt.
+EXPECT_TRUE(res_dec == expected_res_dec ||
+res_dec == VPX_CODEC_CORRUPT_FRAME)
+<< "Results don't match: frame number = " << video.frame_number()
+<< ". (" << decoder->DecodeError() << "). Expected: "
+<< expected_res_dec << " or " << VPX_CODEC_CORRUPT_FRAME;
+} else {
+EXPECT_EQ(expected_res_dec, res_dec)
+<< "Results don't match: frame number = " << video.frame_number()
+<< ". (" << decoder->DecodeError() << ")";
+}
 return !HasFailure();
 }


@@ -30,7 +30,9 @@
 #if defined(_WIN64)
-#define _WIN32_LEAN_AND_MEAN
+#undef NOMINMAX
+#define NOMINMAX
+#define WIN32_LEAN_AND_MEAN
 #include <windows.h>
 #include <winnt.h>


@@ -81,6 +81,15 @@ static void write_ivf_frame_header(const vpx_codec_cx_pkt_t *const pkt,
 const unsigned int kInitialWidth = 320;
 const unsigned int kInitialHeight = 240;
+struct FrameInfo {
+FrameInfo(vpx_codec_pts_t _pts, unsigned int _w, unsigned int _h)
+: pts(_pts), w(_w), h(_h) {}
+vpx_codec_pts_t pts;
+unsigned int w;
+unsigned int h;
+};
 unsigned int ScaleForFrameNumber(unsigned int frame, unsigned int val) {
 if (frame < 10)
 return val;
@@ -120,15 +129,6 @@ class ResizeTest : public ::libvpx_test::EncoderTest,
 virtual ~ResizeTest() {}
-struct FrameInfo {
-FrameInfo(vpx_codec_pts_t _pts, unsigned int _w, unsigned int _h)
-: pts(_pts), w(_w), h(_h) {}
-vpx_codec_pts_t pts;
-unsigned int w;
-unsigned int h;
-};
 virtual void SetUp() {
 InitializeConfig();
 SetMode(GET_PARAM(1));
@@ -196,13 +196,27 @@ class ResizeInternalTest : public ResizeTest {
 virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
 libvpx_test::Encoder *encoder) {
-if (video->frame() == kStepDownFrame) {
-struct vpx_scaling_mode mode = {VP8E_FOURFIVE, VP8E_THREEFIVE};
-encoder->Control(VP8E_SET_SCALEMODE, &mode);
-}
-if (video->frame() == kStepUpFrame) {
-struct vpx_scaling_mode mode = {VP8E_NORMAL, VP8E_NORMAL};
-encoder->Control(VP8E_SET_SCALEMODE, &mode);
-}
+if (change_config_) {
+int new_q = 60;
+if (video->frame() == 0) {
+struct vpx_scaling_mode mode = {VP8E_ONETWO, VP8E_ONETWO};
+encoder->Control(VP8E_SET_SCALEMODE, &mode);
+}
+if (video->frame() == 1) {
+struct vpx_scaling_mode mode = {VP8E_NORMAL, VP8E_NORMAL};
+encoder->Control(VP8E_SET_SCALEMODE, &mode);
+cfg_.rc_min_quantizer = cfg_.rc_max_quantizer = new_q;
+encoder->Config(&cfg_);
+}
+} else {
+if (video->frame() == kStepDownFrame) {
+struct vpx_scaling_mode mode = {VP8E_FOURFIVE, VP8E_THREEFIVE};
+encoder->Control(VP8E_SET_SCALEMODE, &mode);
+}
+if (video->frame() == kStepUpFrame) {
+struct vpx_scaling_mode mode = {VP8E_NORMAL, VP8E_NORMAL};
+encoder->Control(VP8E_SET_SCALEMODE, &mode);
+}
+}
 }
@@ -227,6 +241,7 @@ class ResizeInternalTest : public ResizeTest {
 #endif
 double frame0_psnr_;
+bool change_config_;
 #if WRITE_COMPRESSED_STREAM
 FILE *outfile_;
 unsigned int out_frames_;
@@ -237,6 +252,7 @@ TEST_P(ResizeInternalTest, TestInternalResizeWorks) {
 ::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
 30, 1, 0, 10);
 init_flags_ = VPX_CODEC_USE_PSNR;
+change_config_ = false;
 // q picked such that initial keyframe on this clip is ~30dB PSNR
 cfg_.rc_min_quantizer = cfg_.rc_max_quantizer = 48;
@@ -261,6 +277,164 @@ TEST_P(ResizeInternalTest, TestInternalResizeWorks) {
}
}
TEST_P(ResizeInternalTest, TestInternalResizeChangeConfig) {
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 10);
cfg_.g_w = 352;
cfg_.g_h = 288;
change_config_ = true;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
class ResizeRealtimeTest : public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
ResizeRealtimeTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ResizeRealtimeTest() {}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
if (video->frame() == 0) {
encoder->Control(VP9E_SET_AQ_MODE, 3);
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
}
if (change_bitrate_ && video->frame() == 120) {
change_bitrate_ = false;
cfg_.rc_target_bitrate = 500;
encoder->Config(&cfg_);
}
}
virtual void SetUp() {
InitializeConfig();
SetMode(GET_PARAM(1));
set_cpu_used_ = GET_PARAM(2);
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t pts) {
frame_info_list_.push_back(FrameInfo(pts, img.d_w, img.d_h));
}
void DefaultConfig() {
cfg_.rc_buf_initial_sz = 500;
cfg_.rc_buf_optimal_sz = 600;
cfg_.rc_buf_sz = 1000;
cfg_.rc_min_quantizer = 2;
cfg_.rc_max_quantizer = 56;
cfg_.rc_undershoot_pct = 50;
cfg_.rc_overshoot_pct = 50;
cfg_.rc_end_usage = VPX_CBR;
cfg_.kf_mode = VPX_KF_AUTO;
cfg_.g_lag_in_frames = 0;
cfg_.kf_min_dist = cfg_.kf_max_dist = 3000;
// Enable dropped frames.
cfg_.rc_dropframe_thresh = 1;
// Enable error_resilience mode.
cfg_.g_error_resilient = 1;
// Enable dynamic resizing.
cfg_.rc_resize_allowed = 1;
// Run at low bitrate.
cfg_.rc_target_bitrate = 200;
}
std::vector< FrameInfo > frame_info_list_;
int set_cpu_used_;
bool change_bitrate_;
};
TEST_P(ResizeRealtimeTest, TestExternalResizeWorks) {
ResizingVideoSource video;
DefaultConfig();
change_bitrate_ = false;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
for (std::vector<FrameInfo>::const_iterator info = frame_info_list_.begin();
info != frame_info_list_.end(); ++info) {
const unsigned int frame = static_cast<unsigned>(info->pts);
const unsigned int expected_w = ScaleForFrameNumber(frame, kInitialWidth);
const unsigned int expected_h = ScaleForFrameNumber(frame, kInitialHeight);
EXPECT_EQ(expected_w, info->w)
<< "Frame " << frame << " had unexpected width";
EXPECT_EQ(expected_h, info->h)
<< "Frame " << frame << " had unexpected height";
}
}
// Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
// Run at low bitrate, with resize_allowed = 1, and verify that we get
// one resize down event.
TEST_P(ResizeRealtimeTest, TestInternalResizeDown) {
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 299);
DefaultConfig();
cfg_.g_w = 352;
cfg_.g_h = 288;
change_bitrate_ = false;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
unsigned int last_w = cfg_.g_w;
unsigned int last_h = cfg_.g_h;
int resize_count = 0;
for (std::vector<FrameInfo>::const_iterator info = frame_info_list_.begin();
info != frame_info_list_.end(); ++info) {
if (info->w != last_w || info->h != last_h) {
// Verify that resize down occurs.
ASSERT_LT(info->w, last_w);
ASSERT_LT(info->h, last_h);
last_w = info->w;
last_h = info->h;
resize_count++;
}
}
// Verify that we get 1 resize down event in this test.
ASSERT_EQ(1, resize_count) << "Resizing should occur.";
}
// Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
// Start at low target bitrate, raise the bitrate in the middle of the clip,
// scaling-up should occur after bitrate changed.
TEST_P(ResizeRealtimeTest, TestInternalResizeDownUpChangeBitRate) {
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 359);
DefaultConfig();
cfg_.g_w = 352;
cfg_.g_h = 288;
change_bitrate_ = true;
// Disable dropped frames.
cfg_.rc_dropframe_thresh = 0;
// Starting bitrate low.
cfg_.rc_target_bitrate = 80;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
unsigned int last_w = cfg_.g_w;
unsigned int last_h = cfg_.g_h;
int resize_count = 0;
for (std::vector<FrameInfo>::const_iterator info = frame_info_list_.begin();
info != frame_info_list_.end(); ++info) {
if (info->w != last_w || info->h != last_h) {
resize_count++;
if (resize_count == 1) {
// Verify that resize down occurs.
ASSERT_LT(info->w, last_w);
ASSERT_LT(info->h, last_h);
} else if (resize_count == 2) {
// Verify that resize up occurs.
ASSERT_GT(info->w, last_w);
ASSERT_GT(info->h, last_h);
}
last_w = info->w;
last_h = info->h;
}
}
// Verify that we get 2 resize events in this test.
ASSERT_EQ(resize_count, 2) << "Resizing should occur twice.";
}
vpx_img_fmt_t CspForFrameNumber(int frame) {
if (frame < 10)
return VPX_IMG_FMT_I420;
@@ -371,6 +545,9 @@ VP9_INSTANTIATE_TEST_CASE(ResizeTest,
::testing::Values(::libvpx_test::kRealTime));
VP9_INSTANTIATE_TEST_CASE(ResizeInternalTest,
::testing::Values(::libvpx_test::kOnePassBest));
VP9_INSTANTIATE_TEST_CASE(ResizeRealtimeTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(5, 9));
VP9_INSTANTIATE_TEST_CASE(ResizeCspTest,
::testing::Values(::libvpx_test::kRealTime));
} // namespace
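Note: TestExternalResizeWorks above feeds the encoder from a ResizingVideoSource whose frame dimensions follow ScaleForFrameNumber(); that class is defined earlier in resize_test.cc and is not part of the hunks shown here. As a rough, hedged sketch only (assuming the DummyVideoSource helpers from test/video_source.h: SetSize, FillFrame, frame_, limit_; the limit of 60 frames is an assumption for this note), such a source looks approximately like this:

// Sketch: emits frames whose size changes with the frame number, so the
// DecompressedFrameHook can compare decoded sizes against ScaleForFrameNumber().
class ResizingVideoSource : public ::libvpx_test::DummyVideoSource {
 public:
  ResizingVideoSource() {
    SetSize(kInitialWidth, kInitialHeight);
    limit_ = 60;  // number of frames to produce (assumed for this sketch)
  }
  virtual ~ResizingVideoSource() {}

 protected:
  virtual void Next() {
    ++frame_;
    SetSize(ScaleForFrameNumber(frame_, kInitialWidth),
            ScaleForFrameNumber(frame_, kInitialHeight));
    FillFrame();
  }
};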

File diff suppressed because it is too large

View File

@@ -186,70 +186,48 @@ TEST_P(SixtapPredictTest, TestWithRandomData) {
using std::tr1::make_tuple;
-const SixtapPredictFunc sixtap_16x16_c = vp8_sixtap_predict16x16_c;
-const SixtapPredictFunc sixtap_8x8_c = vp8_sixtap_predict8x8_c;
-const SixtapPredictFunc sixtap_8x4_c = vp8_sixtap_predict8x4_c;
-const SixtapPredictFunc sixtap_4x4_c = vp8_sixtap_predict4x4_c;
INSTANTIATE_TEST_CASE_P(
C, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_c),
-make_tuple(8, 8, sixtap_8x8_c),
-make_tuple(8, 4, sixtap_8x4_c),
-make_tuple(4, 4, sixtap_4x4_c)));
make_tuple(16, 16, &vp8_sixtap_predict16x16_c),
make_tuple(8, 8, &vp8_sixtap_predict8x8_c),
make_tuple(8, 4, &vp8_sixtap_predict8x4_c),
make_tuple(4, 4, &vp8_sixtap_predict4x4_c)));
#if HAVE_NEON
-const SixtapPredictFunc sixtap_16x16_neon = vp8_sixtap_predict16x16_neon;
-const SixtapPredictFunc sixtap_8x8_neon = vp8_sixtap_predict8x8_neon;
-const SixtapPredictFunc sixtap_8x4_neon = vp8_sixtap_predict8x4_neon;
INSTANTIATE_TEST_CASE_P(
-DISABLED_NEON, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_neon),
-make_tuple(8, 8, sixtap_8x8_neon),
-make_tuple(8, 4, sixtap_8x4_neon)));
NEON, SixtapPredictTest, ::testing::Values(
make_tuple(16, 16, &vp8_sixtap_predict16x16_neon),
make_tuple(8, 8, &vp8_sixtap_predict8x8_neon),
make_tuple(8, 4, &vp8_sixtap_predict8x4_neon)));
#endif
#if HAVE_MMX
-const SixtapPredictFunc sixtap_16x16_mmx = vp8_sixtap_predict16x16_mmx;
-const SixtapPredictFunc sixtap_8x8_mmx = vp8_sixtap_predict8x8_mmx;
-const SixtapPredictFunc sixtap_8x4_mmx = vp8_sixtap_predict8x4_mmx;
-const SixtapPredictFunc sixtap_4x4_mmx = vp8_sixtap_predict4x4_mmx;
INSTANTIATE_TEST_CASE_P(
MMX, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_mmx),
-make_tuple(8, 8, sixtap_8x8_mmx),
-make_tuple(8, 4, sixtap_8x4_mmx),
-make_tuple(4, 4, sixtap_4x4_mmx)));
make_tuple(16, 16, &vp8_sixtap_predict16x16_mmx),
make_tuple(8, 8, &vp8_sixtap_predict8x8_mmx),
make_tuple(8, 4, &vp8_sixtap_predict8x4_mmx),
make_tuple(4, 4, &vp8_sixtap_predict4x4_mmx)));
#endif
#if HAVE_SSE2
-const SixtapPredictFunc sixtap_16x16_sse2 = vp8_sixtap_predict16x16_sse2;
-const SixtapPredictFunc sixtap_8x8_sse2 = vp8_sixtap_predict8x8_sse2;
-const SixtapPredictFunc sixtap_8x4_sse2 = vp8_sixtap_predict8x4_sse2;
INSTANTIATE_TEST_CASE_P(
SSE2, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_sse2),
-make_tuple(8, 8, sixtap_8x8_sse2),
-make_tuple(8, 4, sixtap_8x4_sse2)));
make_tuple(16, 16, &vp8_sixtap_predict16x16_sse2),
make_tuple(8, 8, &vp8_sixtap_predict8x8_sse2),
make_tuple(8, 4, &vp8_sixtap_predict8x4_sse2)));
#endif
#if HAVE_SSSE3
-const SixtapPredictFunc sixtap_16x16_ssse3 = vp8_sixtap_predict16x16_ssse3;
-const SixtapPredictFunc sixtap_8x8_ssse3 = vp8_sixtap_predict8x8_ssse3;
-const SixtapPredictFunc sixtap_8x4_ssse3 = vp8_sixtap_predict8x4_ssse3;
-const SixtapPredictFunc sixtap_4x4_ssse3 = vp8_sixtap_predict4x4_ssse3;
INSTANTIATE_TEST_CASE_P(
SSSE3, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_ssse3),
-make_tuple(8, 8, sixtap_8x8_ssse3),
-make_tuple(8, 4, sixtap_8x4_ssse3),
-make_tuple(4, 4, sixtap_4x4_ssse3)));
make_tuple(16, 16, &vp8_sixtap_predict16x16_ssse3),
make_tuple(8, 8, &vp8_sixtap_predict8x8_ssse3),
make_tuple(8, 4, &vp8_sixtap_predict8x4_ssse3),
make_tuple(4, 4, &vp8_sixtap_predict4x4_ssse3)));
#endif
#if HAVE_MSA
-const SixtapPredictFunc sixtap_16x16_msa = vp8_sixtap_predict16x16_msa;
-const SixtapPredictFunc sixtap_8x8_msa = vp8_sixtap_predict8x8_msa;
-const SixtapPredictFunc sixtap_8x4_msa = vp8_sixtap_predict8x4_msa;
-const SixtapPredictFunc sixtap_4x4_msa = vp8_sixtap_predict4x4_msa;
INSTANTIATE_TEST_CASE_P(
MSA, SixtapPredictTest, ::testing::Values(
-make_tuple(16, 16, sixtap_16x16_msa),
-make_tuple(8, 8, sixtap_8x8_msa),
-make_tuple(8, 4, sixtap_8x4_msa),
-make_tuple(4, 4, sixtap_4x4_msa)));
make_tuple(16, 16, &vp8_sixtap_predict16x16_msa),
make_tuple(8, 8, &vp8_sixtap_predict8x8_msa),
make_tuple(8, 4, &vp8_sixtap_predict8x4_msa),
make_tuple(4, 4, &vp8_sixtap_predict4x4_msa)));
#endif
} // namespace

View File

@@ -16,8 +16,13 @@
namespace {
const int kTestMode = 0;
const int kSuperframeSyntax = 1;
typedef std::tr1::tuple<libvpx_test::TestMode,int> SuperframeTestParam;
class SuperframeTest : public ::libvpx_test::EncoderTest,
-public ::libvpx_test::CodecTestWithParam<libvpx_test::TestMode> {
public ::libvpx_test::CodecTestWithParam<SuperframeTestParam> {
protected:
SuperframeTest() : EncoderTest(GET_PARAM(0)), modified_buf_(NULL),
last_sf_pts_(0) {}
@@ -25,9 +30,13 @@ class SuperframeTest : public ::libvpx_test::EncoderTest,
virtual void SetUp() {
InitializeConfig();
-SetMode(GET_PARAM(1));
const SuperframeTestParam input = GET_PARAM(1);
const libvpx_test::TestMode mode = std::tr1::get<kTestMode>(input);
const int syntax = std::tr1::get<kSuperframeSyntax>(input);
SetMode(mode);
sf_count_ = 0;
sf_count_max_ = INT_MAX;
is_vp10_style_superframe_ = syntax;
}
virtual void TearDown() {
@@ -50,7 +59,8 @@ class SuperframeTest : public ::libvpx_test::EncoderTest,
const uint8_t marker = buffer[pkt->data.frame.sz - 1];
const int frames = (marker & 0x7) + 1;
const int mag = ((marker >> 3) & 3) + 1;
-const unsigned int index_sz = 2 + mag * frames;
const unsigned int index_sz =
2 + mag * (frames - is_vp10_style_superframe_);
if ((marker & 0xe0) == 0xc0 &&
pkt->data.frame.sz >= index_sz &&
buffer[pkt->data.frame.sz - index_sz] == marker) {
@@ -75,6 +85,7 @@ class SuperframeTest : public ::libvpx_test::EncoderTest,
return pkt;
}
int is_vp10_style_superframe_;
int sf_count_;
int sf_count_max_;
vpx_codec_cx_pkt_t modified_pkt_;
@@ -92,9 +103,11 @@ TEST_P(SuperframeTest, TestSuperframeIndexIsOptional) {
EXPECT_EQ(sf_count_, 1);
}
-VP9_INSTANTIATE_TEST_CASE(SuperframeTest, ::testing::Values(
-::libvpx_test::kTwoPassGood));
VP9_INSTANTIATE_TEST_CASE(SuperframeTest, ::testing::Combine(
::testing::Values(::libvpx_test::kTwoPassGood),
::testing::Values(0)));
-VP10_INSTANTIATE_TEST_CASE(SuperframeTest, ::testing::Values(
-::libvpx_test::kTwoPassGood));
VP10_INSTANTIATE_TEST_CASE(SuperframeTest, ::testing::Combine(
::testing::Values(::libvpx_test::kTwoPassGood),
::testing::Values(CONFIG_MISC_FIXES)));
} // namespace
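Note: the index_sz change above follows the VP10 (CONFIG_MISC_FIXES) superframe syntax, in which the last frame's size is dropped from the trailing index. As a rough illustration of the layout the test checks, here is a minimal, self-contained sketch of a reader for that index. ParseSuperframeIndex is a hypothetical helper written for this note, not a libvpx function, and the vp10_style flag simply mirrors the is_vp10_style_superframe_ offset used above.

#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the frame sizes stored in a trailing superframe
// index, or an empty vector when no valid index is present. vp10_style == true
// mirrors the CONFIG_MISC_FIXES syntax, where the last frame's size is omitted
// from the index (hence "frames - 1" entries).
std::vector<size_t> ParseSuperframeIndex(const uint8_t *data, size_t data_sz,
                                         bool vp10_style) {
  std::vector<size_t> sizes;
  if (data_sz == 0) return sizes;
  const uint8_t marker = data[data_sz - 1];
  if ((marker & 0xe0) != 0xc0) return sizes;   // superframe marker is 110xxxxx
  const int frames = (marker & 0x7) + 1;       // frames in the superframe
  const int mag = ((marker >> 3) & 0x3) + 1;   // bytes used per frame size
  const int entries = frames - (vp10_style ? 1 : 0);
  const size_t index_sz = 2 + static_cast<size_t>(mag) * entries;
  if (data_sz < index_sz || data[data_sz - index_sz] != marker) return sizes;
  const uint8_t *x = data + data_sz - index_sz + 1;  // skip leading marker copy
  for (int i = 0; i < entries; ++i) {
    size_t sz = 0;
    for (int b = 0; b < mag; ++b) sz |= static_cast<size_t>(*x++) << (b * 8);
    sizes.push_back(sz);
  }
  return sizes;
}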

View File

@@ -18,6 +18,7 @@ LIBVPX_TEST_DATA-$(CONFIG_ENCODERS) += park_joy_90p_8_422.y4m
LIBVPX_TEST_DATA-$(CONFIG_ENCODERS) += park_joy_90p_8_444.y4m
LIBVPX_TEST_DATA-$(CONFIG_ENCODERS) += park_joy_90p_8_440.yuv
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += desktop_credits.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += niklas_1280_720_30.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += rush_hour_444.y4m
LIBVPX_TEST_DATA-$(CONFIG_VP9_ENCODER) += screendata.y4m
@@ -417,6 +418,18 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-66x64.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-66x64.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-66x66.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-66x66.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-130x132.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-130x132.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-132x130.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-132x130.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-132x132.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-132x132.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-178x180.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-178x180.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-180x178.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-180x178.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-180x180.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-180x180.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-lf-1920x1080.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-02-size-lf-1920x1080.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-03-deltaq.webm
@@ -641,6 +654,34 @@ LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-fp-tiles-8-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-fp-tiles-8-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-fp-tiles-8-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-fp-tiles-8-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-2-4-8.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-2-4-8.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-8.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-1-8.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-1.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-1.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-8.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-2-8.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-1.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-1.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-8.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-4-8.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-1.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-1.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-4-2-1.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-4-2-1.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-14-resize-10frames-fp-tiles-8-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-15-segkey.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-15-segkey.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-15-segkey_adpq.webm
@@ -768,3 +809,53 @@ endif # CONFIG_ENCODE_PERF_TESTS
# sort and remove duplicates
LIBVPX_TEST_DATA-yes := $(sort $(LIBVPX_TEST_DATA-yes))
# VP9 dynamic resizing test (decoder)
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x180_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_320x240_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x360_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_640x480_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1280x720_7_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_5_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_5_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_5_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_5_3-4.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_1-2.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_1-2.webm.md5
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_3-4.webm
LIBVPX_TEST_DATA-$(CONFIG_VP9_DECODER) += vp90-2-21-resize_inter_1920x1080_7_3-4.webm.md5

View File

@@ -743,3 +743,92 @@ d06285d109ecbaef63b0cbcc44d70a129186f51c *invalid-vp90-2-03-size-224x196.webm.iv
e60d859b0ef2b331b21740cf6cb83fabe469b079 *invalid-vp90-2-03-size-202x210.webm.ivf.s113306_r01-05_b6-.ivf
0ae808dca4d3c1152a9576e14830b6faa39f1b4a *invalid-vp90-2-03-size-202x210.webm.ivf.s113306_r01-05_b6-.ivf.res
9cfc855459e7549fd015c79e8eca512b2f2cb7e3 *niklas_1280_720_30.y4m
5b5763b388b1b52a81bb82b39f7ec25c4bd3d0e1 *desktop_credits.y4m
85771f6ab44e4a0226e206c0cde8351dd5918953 *vp90-2-02-size-130x132.webm
512dad5eabbed37b4bbbc64ce153f1a5484427b8 *vp90-2-02-size-130x132.webm.md5
01f7127d40360289db63b27f61cb9afcda350e95 *vp90-2-02-size-132x130.webm
4a94275328ae076cf60f966c097a8721010fbf5a *vp90-2-02-size-132x130.webm.md5
f41c0400b5716b4b70552c40dd03d44be131e1cc *vp90-2-02-size-132x132.webm
1a69e989f697e424bfe3e3e8a77bb0c0992c8e47 *vp90-2-02-size-132x132.webm.md5
94a5cbfacacba100e0c5f7861c72a1b417feca0f *vp90-2-02-size-178x180.webm
dedfecf1d784bcf70629592fa5e6f01d5441ccc9 *vp90-2-02-size-178x180.webm.md5
4828b62478c04014bba3095a83106911a71cf387 *vp90-2-02-size-180x178.webm
423da2b861050c969d78ed8e8f8f14045d1d8199 *vp90-2-02-size-180x178.webm.md5
338f7c9282f43e29940f5391118aadd17e4f9234 *vp90-2-02-size-180x180.webm
6c2ef013392310778dca5dd5351160eca66b0a60 *vp90-2-02-size-180x180.webm.md5
679fa7d6807e936ff937d7b282e7dbd8ac76447e *vp90-2-14-resize-10frames-fp-tiles-1-2-4-8.webm
fc7267ab8fc2bf5d6c234e34ee6c078a967b4888 *vp90-2-14-resize-10frames-fp-tiles-1-2-4-8.webm.md5
9d33a137c819792209c5ce4e4e1ee5da73d574fe *vp90-2-14-resize-10frames-fp-tiles-1-2.webm
0c78a154956a8605d050bdd75e0dcc4d39c040a6 *vp90-2-14-resize-10frames-fp-tiles-1-2.webm.md5
d6a8d8c57f66a91d23e8e7df480f9ae841e56c37 *vp90-2-14-resize-10frames-fp-tiles-1-4.webm
e9b4e8c7b33b5fda745d340c3f47e6623ae40cf2 *vp90-2-14-resize-10frames-fp-tiles-1-4.webm.md5
aa6fe043a0c4a42b49c87ebbe812d4afd9945bec *vp90-2-14-resize-10frames-fp-tiles-1-8.webm
028520578994c2d013d4c0129033d4f2ff31bbe0 *vp90-2-14-resize-10frames-fp-tiles-1-8.webm.md5
d1d5463c9ea7b5cc5f609ddedccddf656f348d1a *vp90-2-14-resize-10frames-fp-tiles-2-1.webm
92d5872f5bdffbed721703b7e959b4f885e3d77a *vp90-2-14-resize-10frames-fp-tiles-2-1.webm.md5
677cb29de1215d97346015af5807a9b1faad54cf *vp90-2-14-resize-10frames-fp-tiles-2-4.webm
a5db19f977094ec3fd60b4f7671b3e6740225e12 *vp90-2-14-resize-10frames-fp-tiles-2-4.webm.md5
cdd3c52ba21067efdbb2de917fe2a965bf27332e *vp90-2-14-resize-10frames-fp-tiles-2-8.webm
db17ec5d894ea8b8d0b7f32206d0dd3d46dcfa6d *vp90-2-14-resize-10frames-fp-tiles-2-8.webm.md5
0f6093c472125d05b764d7d1965c1d56771c0ea2 *vp90-2-14-resize-10frames-fp-tiles-4-1.webm
bc7c79e1bee07926dd970462ce6f64fc30eec3e1 *vp90-2-14-resize-10frames-fp-tiles-4-1.webm.md5
c5142e2bff4091338196c8ea8bc9266e64f548bc *vp90-2-14-resize-10frames-fp-tiles-4-2.webm
22aa3dd430b69fd3d92f6561bac86deeed90486d *vp90-2-14-resize-10frames-fp-tiles-4-2.webm.md5
ede8b1466d2f26e1b1bd9602addb9cd1017e1d8c *vp90-2-14-resize-10frames-fp-tiles-4-8.webm
508d5ebb9c0eac2a4100281a3ee052ec2fc19217 *vp90-2-14-resize-10frames-fp-tiles-4-8.webm.md5
2b292e3392854cd1d76ae597a6f53656cf741cfa *vp90-2-14-resize-10frames-fp-tiles-8-1.webm
1c24e54fa19e94e1722f24676404444e941c3d31 *vp90-2-14-resize-10frames-fp-tiles-8-1.webm.md5
61beda21064e09634564caa6697ab90bd53c9af7 *vp90-2-14-resize-10frames-fp-tiles-8-2.webm
9c0657b4d9e1d0e4c9d28a90e5a8630a65519124 *vp90-2-14-resize-10frames-fp-tiles-8-2.webm.md5
1758c50a11a7c92522749b4a251664705f1f0d4b *vp90-2-14-resize-10frames-fp-tiles-8-4-2-1.webm
4f454a06750614314ae15a44087b79016fe2db97 *vp90-2-14-resize-10frames-fp-tiles-8-4-2-1.webm.md5
3920c95ba94f1f048a731d9d9b416043b44aa4bd *vp90-2-14-resize-10frames-fp-tiles-8-4.webm
4eb347a0456d2c49a1e1d8de5aa1c51acc39887e *vp90-2-14-resize-10frames-fp-tiles-8-4.webm.md5
4b95a74c032a473b6683d7ad5754db1b0ec378e9 *vp90-2-21-resize_inter_1280x720_5_1-2.webm
a7826dd386bedfe69d02736969bfb47fb6a40a5e *vp90-2-21-resize_inter_1280x720_5_1-2.webm.md5
5cfff79e82c4d69964ccb8e75b4f0c53b9295167 *vp90-2-21-resize_inter_1280x720_5_3-4.webm
a18f57db4a25e1f543a99f2ceb182e00db0ee22f *vp90-2-21-resize_inter_1280x720_5_3-4.webm.md5
d26db0811bf30eb4131d928669713e2485f8e833 *vp90-2-21-resize_inter_1280x720_7_1-2.webm
fd6f9f332cd5bea4c0f0d57be4297bea493cc5a1 *vp90-2-21-resize_inter_1280x720_7_1-2.webm.md5
5c7d73d4d268e2ba9593b31cb091fd339505c7fd *vp90-2-21-resize_inter_1280x720_7_3-4.webm
7bbb949cabc1e70dadcc74582739f63b833034e0 *vp90-2-21-resize_inter_1280x720_7_3-4.webm.md5
f2d2a41a60eb894aff0c5854afca15931f1445a8 *vp90-2-21-resize_inter_1920x1080_5_1-2.webm
66d7789992613ac9d678ff905ff1059daa1b89e4 *vp90-2-21-resize_inter_1920x1080_5_1-2.webm.md5
764edb75fe7dd64e73a1b4f3b4b2b1bf237a4dea *vp90-2-21-resize_inter_1920x1080_5_3-4.webm
f78bea1075983fd990e7f25d4f31438f9b5efa34 *vp90-2-21-resize_inter_1920x1080_5_3-4.webm.md5
96496f2ade764a5de9f0c27917c7df1f120fb2ef *vp90-2-21-resize_inter_1920x1080_7_1-2.webm
2632b635135ed5ecd67fd22dec7990d29c4f4cb5 *vp90-2-21-resize_inter_1920x1080_7_1-2.webm.md5
74889ea42001bf41428cb742ca74e65129c886dc *vp90-2-21-resize_inter_1920x1080_7_3-4.webm
d2cf3b25956415bb579d368e7098097e482dd73a *vp90-2-21-resize_inter_1920x1080_7_3-4.webm.md5
4658986a8ce36ebfcc80a1903e446eaab3985336 *vp90-2-21-resize_inter_320x180_5_1-2.webm
8a3d8cf325109ffa913cc9426c32eea8c202a09a *vp90-2-21-resize_inter_320x180_5_1-2.webm.md5
16303aa45176520ee42c2c425247aadc1506b881 *vp90-2-21-resize_inter_320x180_5_3-4.webm
41cab1ddf7715b680a4dbce42faa9bcd72af4e5c *vp90-2-21-resize_inter_320x180_5_3-4.webm.md5
56648adcee66dd0e5cb6ac947f5ee1b9cc8ba129 *vp90-2-21-resize_inter_320x180_7_1-2.webm
70047377787003cc03dda7b2394e6d7eaa666d9e *vp90-2-21-resize_inter_320x180_7_1-2.webm.md5
d2ff99165488499cc55f75929f1ce5ca9c9e359b *vp90-2-21-resize_inter_320x180_7_3-4.webm
e69019e378114a4643db283b66d1a7e304761a56 *vp90-2-21-resize_inter_320x180_7_3-4.webm.md5
4834d129bed0f4289d3a88f2ae3a1736f77621b0 *vp90-2-21-resize_inter_320x240_5_1-2.webm
a75653c53d22b623c1927fc0088da21dafef21f4 *vp90-2-21-resize_inter_320x240_5_1-2.webm.md5
19818e1b7fd1c1e63d8873c31b0babe29dd33ba6 *vp90-2-21-resize_inter_320x240_5_3-4.webm
8d89814ff469a186312111651b16601dfbce4336 *vp90-2-21-resize_inter_320x240_5_3-4.webm.md5
ac8057bae52498f324ce92a074d5f8207cc4a4a7 *vp90-2-21-resize_inter_320x240_7_1-2.webm
2643440898c83c08cc47bc744245af696b877c24 *vp90-2-21-resize_inter_320x240_7_1-2.webm.md5
cf4a4cd38ac8b18c42d8c25a3daafdb39132256b *vp90-2-21-resize_inter_320x240_7_3-4.webm
70ba8ec9120b26e9b0ffa2c79b432f16cbcb50ec *vp90-2-21-resize_inter_320x240_7_3-4.webm.md5
669f10409fe1c4a054010162ca47773ea1fdbead *vp90-2-21-resize_inter_640x360_5_1-2.webm
6355a04249004a35fb386dd1024214234f044383 *vp90-2-21-resize_inter_640x360_5_1-2.webm.md5
c23763b950b8247c1775d1f8158d93716197676c *vp90-2-21-resize_inter_640x360_5_3-4.webm
59e6fc381e3ec3b7bdaac586334e0bc944d18fb6 *vp90-2-21-resize_inter_640x360_5_3-4.webm.md5
71b45cbfdd068baa1f679a69e5e6f421d256a85f *vp90-2-21-resize_inter_640x360_7_1-2.webm
1416fc761b690c54a955c4cf017fa078520e8c18 *vp90-2-21-resize_inter_640x360_7_1-2.webm.md5
6c409903279448a697e4db63bab1061784bcd8d2 *vp90-2-21-resize_inter_640x360_7_3-4.webm
60de1299793433a630b71130cf76c9f5965758e2 *vp90-2-21-resize_inter_640x360_7_3-4.webm.md5
852b597b8af096d90c80bf0ed6ed3b336b851f19 *vp90-2-21-resize_inter_640x480_5_1-2.webm
f6856f19236ee46ed462bd0a2e7e72b9c3b9cea6 *vp90-2-21-resize_inter_640x480_5_1-2.webm.md5
792a16c6f60043bd8dceb515f0b95b8891647858 *vp90-2-21-resize_inter_640x480_5_3-4.webm
68ffe59877e9a7863805e1c0a3ce18ce037d7c9d *vp90-2-21-resize_inter_640x480_5_3-4.webm.md5
61e044c4759972a35ea3db8c1478a988910a4ef4 *vp90-2-21-resize_inter_640x480_7_1-2.webm
7739bfca167b1b43fea72f807f01e097b7cb98d8 *vp90-2-21-resize_inter_640x480_7_1-2.webm.md5
7291af354b4418917eee00e3a7e366086a0b7a10 *vp90-2-21-resize_inter_640x480_7_3-4.webm
4a18b09ccb36564193f0215f599d745d95bb558c *vp90-2-21-resize_inter_640x480_7_3-4.webm.md5

View File

@@ -36,6 +36,7 @@ LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += external_frame_buffer_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += invalid_file_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += user_priv_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_frame_parallel_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += active_map_refresh_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += active_map_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += borders_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += cpu_speed_test.cc
@@ -91,10 +92,9 @@ endif
## shared library builds don't make these functions accessible.
##
ifeq ($(CONFIG_SHARED),)
-LIBVPX_TEST_SRCS-$(CONFIG_VP9) += lpf_8_test.cc
## VP8
-ifneq ($(CONFIG_VP8_ENCODER)$(CONFIG_VP8_DECODER),)
ifeq ($(CONFIG_VP8),yes)
# These tests require both the encoder and decoder to be built.
ifeq ($(CONFIG_VP8_ENCODER)$(CONFIG_VP8_DECODER),yesyes)
@@ -104,13 +104,12 @@ endif
LIBVPX_TEST_SRCS-$(CONFIG_POSTPROC) += pp_filter_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP8_DECODER) += vp8_decrypt_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += quantize_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += set_roi.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += variance_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += vp8_fdct4x4_test.cc
-LIBVPX_TEST_SRCS-$(CONFIG_VP8_ENCODER) += quantize_test.cc
LIBVPX_TEST_SRCS-yes += idct_test.cc
-LIBVPX_TEST_SRCS-yes += intrapred_test.cc
LIBVPX_TEST_SRCS-yes += sixtap_predict_test.cc
LIBVPX_TEST_SRCS-yes += vpx_scale_test.cc
@@ -121,7 +120,7 @@ endif
endif # VP8
## VP9
-ifneq ($(CONFIG_VP9_ENCODER)$(CONFIG_VP9_DECODER),)
ifeq ($(CONFIG_VP9),yes)
# These tests require both the encoder and decoder to be built.
ifeq ($(CONFIG_VP9_ENCODER)$(CONFIG_VP9_DECODER),yesyes)
@@ -134,25 +133,24 @@ LIBVPX_TEST_SRCS-yes += vp9_boolcoder_test.cc
LIBVPX_TEST_SRCS-yes += vp9_encoder_parms_get_to_decoder.cc
endif
-LIBVPX_TEST_SRCS-$(CONFIG_VP9) += convolve_test.cc
-LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_thread_test.cc
LIBVPX_TEST_SRCS-yes += convolve_test.cc
LIBVPX_TEST_SRCS-yes += lpf_8_test.cc
LIBVPX_TEST_SRCS-yes += vp9_intrapred_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_decrypt_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_DECODER) += vp9_thread_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct16x16_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += dct32x32_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += fdct4x4_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += fdct8x8_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += variance_test.cc
-LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_subtract_test.cc
-LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_avg_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_error_block_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_quantize_test.cc
-LIBVPX_TEST_SRCS-$(CONFIG_VP9) += vp9_intrapred_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_subtract_test.cc
ifeq ($(CONFIG_VP9_ENCODER),yes)
LIBVPX_TEST_SRCS-$(CONFIG_SPATIAL_SVC) += svc_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_INTERNAL_STATS) += blockiness_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_INTERNAL_STATS) += consistency_test.cc
endif
ifeq ($(CONFIG_VP9_ENCODER)$(CONFIG_VP9_TEMPORAL_DENOISING),yesyes)
@@ -162,10 +160,24 @@ LIBVPX_TEST_SRCS-$(CONFIG_VP9_ENCODER) += vp9_arf_freq_test.cc
endif # VP9
-LIBVPX_TEST_SRCS-$(CONFIG_ENCODERS) += sad_test.cc
-TEST_INTRA_PRED_SPEED_SRCS-$(CONFIG_VP9) := test_intra_pred_speed.cc
-TEST_INTRA_PRED_SPEED_SRCS-$(CONFIG_VP9) += ../md5_utils.h ../md5_utils.c
## VP10
ifeq ($(CONFIG_VP10),yes)
LIBVPX_TEST_SRCS-yes += vp10_inv_txfm_test.cc
LIBVPX_TEST_SRCS-$(CONFIG_VP10_ENCODER) += vp10_dct_test.cc
endif # VP10
## Multi-codec / unconditional whitebox tests.
ifeq ($(findstring yes,$(CONFIG_VP9_ENCODER)$(CONFIG_VP10_ENCODER)),yes)
LIBVPX_TEST_SRCS-yes += avg_test.cc
endif
LIBVPX_TEST_SRCS-$(CONFIG_ENCODERS) += sad_test.cc
TEST_INTRA_PRED_SPEED_SRCS-yes := test_intra_pred_speed.cc
TEST_INTRA_PRED_SPEED_SRCS-yes += ../md5_utils.h ../md5_utils.c
endif # CONFIG_SHARED

View File

@@ -187,18 +187,19 @@ INTRA_PRED_TEST(C, TestIntraPred4, vpx_dc_predictor_4x4_c,
vpx_d153_predictor_4x4_c, vpx_d207_predictor_4x4_c,
vpx_d63_predictor_4x4_c, vpx_tm_predictor_4x4_c)
-#if HAVE_SSE && CONFIG_USE_X86INC
-INTRA_PRED_TEST(SSE, TestIntraPred4, vpx_dc_predictor_4x4_sse,
-vpx_dc_left_predictor_4x4_sse, vpx_dc_top_predictor_4x4_sse,
-vpx_dc_128_predictor_4x4_sse, vpx_v_predictor_4x4_sse, NULL,
-NULL, NULL, NULL, NULL, NULL, NULL, vpx_tm_predictor_4x4_sse)
-#endif // HAVE_SSE && CONFIG_USE_X86INC
#if HAVE_SSE2 && CONFIG_USE_X86INC
INTRA_PRED_TEST(SSE2, TestIntraPred4, vpx_dc_predictor_4x4_sse2,
vpx_dc_left_predictor_4x4_sse2, vpx_dc_top_predictor_4x4_sse2,
vpx_dc_128_predictor_4x4_sse2, vpx_v_predictor_4x4_sse2,
vpx_h_predictor_4x4_sse2, NULL, NULL, NULL, NULL, NULL, NULL,
vpx_tm_predictor_4x4_sse2)
#endif // HAVE_SSE2 && CONFIG_USE_X86INC
#if HAVE_SSSE3 && CONFIG_USE_X86INC
INTRA_PRED_TEST(SSSE3, TestIntraPred4, NULL, NULL, NULL, NULL, NULL,
-vpx_h_predictor_4x4_ssse3, vpx_d45_predictor_4x4_ssse3, NULL, NULL,
-NULL, vpx_d153_predictor_4x4_ssse3,
-vpx_d207_predictor_4x4_ssse3, vpx_d63_predictor_4x4_ssse3, NULL)
vpx_d45_predictor_4x4_ssse3, NULL, NULL,
vpx_d153_predictor_4x4_ssse3, vpx_d207_predictor_4x4_ssse3,
vpx_d63_predictor_4x4_ssse3, NULL)
#endif // HAVE_SSSE3 && CONFIG_USE_X86INC
#if HAVE_DSPR2
@@ -235,23 +236,19 @@ INTRA_PRED_TEST(C, TestIntraPred8, vpx_dc_predictor_8x8_c,
vpx_d153_predictor_8x8_c, vpx_d207_predictor_8x8_c,
vpx_d63_predictor_8x8_c, vpx_tm_predictor_8x8_c)
-#if HAVE_SSE && CONFIG_USE_X86INC
-INTRA_PRED_TEST(SSE, TestIntraPred8, vpx_dc_predictor_8x8_sse,
-vpx_dc_left_predictor_8x8_sse, vpx_dc_top_predictor_8x8_sse,
-vpx_dc_128_predictor_8x8_sse, vpx_v_predictor_8x8_sse, NULL,
-NULL, NULL, NULL, NULL, NULL, NULL, NULL)
-#endif // HAVE_SSE && CONFIG_USE_X86INC
#if HAVE_SSE2 && CONFIG_USE_X86INC
-INTRA_PRED_TEST(SSE2, TestIntraPred8, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-NULL, NULL, NULL, NULL, NULL, vpx_tm_predictor_8x8_sse2)
INTRA_PRED_TEST(SSE2, TestIntraPred8, vpx_dc_predictor_8x8_sse2,
vpx_dc_left_predictor_8x8_sse2, vpx_dc_top_predictor_8x8_sse2,
vpx_dc_128_predictor_8x8_sse2, vpx_v_predictor_8x8_sse2,
vpx_h_predictor_8x8_sse2, NULL, NULL, NULL, NULL, NULL,
NULL, vpx_tm_predictor_8x8_sse2)
#endif // HAVE_SSE2 && CONFIG_USE_X86INC
#if HAVE_SSSE3 && CONFIG_USE_X86INC
INTRA_PRED_TEST(SSSE3, TestIntraPred8, NULL, NULL, NULL, NULL, NULL,
-vpx_h_predictor_8x8_ssse3, vpx_d45_predictor_8x8_ssse3, NULL, NULL,
-NULL, vpx_d153_predictor_8x8_ssse3,
-vpx_d207_predictor_8x8_ssse3, vpx_d63_predictor_8x8_ssse3, NULL)
vpx_d45_predictor_8x8_ssse3, NULL, NULL,
vpx_d153_predictor_8x8_ssse3, vpx_d207_predictor_8x8_ssse3,
vpx_d63_predictor_8x8_ssse3, NULL)
#endif // HAVE_SSSE3 && CONFIG_USE_X86INC
#if HAVE_DSPR2
@@ -293,13 +290,13 @@ INTRA_PRED_TEST(SSE2, TestIntraPred16, vpx_dc_predictor_16x16_sse2,
vpx_dc_left_predictor_16x16_sse2,
vpx_dc_top_predictor_16x16_sse2,
vpx_dc_128_predictor_16x16_sse2, vpx_v_predictor_16x16_sse2,
-NULL, NULL, NULL, NULL, NULL, NULL, NULL,
vpx_h_predictor_16x16_sse2, NULL, NULL, NULL, NULL, NULL, NULL,
vpx_tm_predictor_16x16_sse2)
#endif // HAVE_SSE2 && CONFIG_USE_X86INC
#if HAVE_SSSE3 && CONFIG_USE_X86INC
INTRA_PRED_TEST(SSSE3, TestIntraPred16, NULL, NULL, NULL, NULL, NULL,
-vpx_h_predictor_16x16_ssse3, vpx_d45_predictor_16x16_ssse3, NULL,
vpx_d45_predictor_16x16_ssse3,
NULL, NULL, vpx_d153_predictor_16x16_ssse3,
vpx_d207_predictor_16x16_ssse3, vpx_d63_predictor_16x16_ssse3,
NULL)
@@ -340,28 +337,19 @@ INTRA_PRED_TEST(C, TestIntraPred32, vpx_dc_predictor_32x32_c,
vpx_d63_predictor_32x32_c, vpx_tm_predictor_32x32_c)
#if HAVE_SSE2 && CONFIG_USE_X86INC
-#if ARCH_X86_64
INTRA_PRED_TEST(SSE2, TestIntraPred32, vpx_dc_predictor_32x32_sse2,
vpx_dc_left_predictor_32x32_sse2,
vpx_dc_top_predictor_32x32_sse2,
vpx_dc_128_predictor_32x32_sse2, vpx_v_predictor_32x32_sse2,
-NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-vpx_tm_predictor_32x32_sse2)
-#else
-INTRA_PRED_TEST(SSE2, TestIntraPred32, vpx_dc_predictor_32x32_sse2,
-vpx_dc_left_predictor_32x32_sse2,
-vpx_dc_top_predictor_32x32_sse2,
-vpx_dc_128_predictor_32x32_sse2, vpx_v_predictor_32x32_sse2,
-NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL)
-#endif // ARCH_X86_64
vpx_h_predictor_32x32_sse2, NULL, NULL, NULL, NULL, NULL,
NULL, vpx_tm_predictor_32x32_sse2)
#endif // HAVE_SSE2 && CONFIG_USE_X86INC
#if HAVE_SSSE3 && CONFIG_USE_X86INC
INTRA_PRED_TEST(SSSE3, TestIntraPred32, NULL, NULL, NULL, NULL, NULL,
-vpx_h_predictor_32x32_ssse3, vpx_d45_predictor_32x32_ssse3, NULL,
-NULL, NULL, vpx_d153_predictor_32x32_ssse3,
-vpx_d207_predictor_32x32_ssse3, vpx_d63_predictor_32x32_ssse3,
-NULL)
vpx_d45_predictor_32x32_ssse3, NULL, NULL,
vpx_d153_predictor_32x32_ssse3, vpx_d207_predictor_32x32_ssse3,
vpx_d63_predictor_32x32_ssse3, NULL)
#endif // HAVE_SSSE3 && CONFIG_USE_X86INC
#if HAVE_NEON

View File

@@ -10,6 +10,7 @@
#include <cstdio>
#include <cstdlib>
#include <set>
#include <string>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "../tools_common.h"
@@ -44,6 +45,12 @@ class TestVectorTest : public ::libvpx_test::DecoderTest,
TestVectorTest()
: DecoderTest(GET_PARAM(0)),
md5_file_(NULL) {
#if CONFIG_VP9_DECODER
resize_clips_.insert(
::libvpx_test::kVP9TestVectorsResize,
::libvpx_test::kVP9TestVectorsResize +
::libvpx_test::kNumVP9TestVectorsResize);
#endif
}
virtual ~TestVectorTest() {
@@ -77,6 +84,10 @@ class TestVectorTest : public ::libvpx_test::DecoderTest,
<< "Md5 checksums don't match: frame number = " << frame_number; << "Md5 checksums don't match: frame number = " << frame_number;
} }
#if CONFIG_VP9_DECODER
std::set<std::string> resize_clips_;
#endif
private:
FILE *md5_file_;
};
@@ -97,6 +108,14 @@ TEST_P(TestVectorTest, MD5Match) {
if (mode == kFrameParallelMode) {
flags |= VPX_CODEC_USE_FRAME_THREADING;
#if CONFIG_VP9_DECODER
// TODO(hkuang): Fix frame parallel decode bug. See issue 1086.
if (resize_clips_.find(filename) != resize_clips_.end()) {
printf("Skipping the test file: %s, due to frame parallel decode bug.\n",
filename.c_str());
return;
}
#endif
}
cfg.threads = threads;

View File

@@ -52,6 +52,31 @@ const char *const kVP8TestVectors[] = {
const int kNumVP8TestVectors = NELEMENTS(kVP8TestVectors);
#endif // CONFIG_VP8_DECODER
#if CONFIG_VP9_DECODER
#define RESIZE_TEST_VECTORS "vp90-2-21-resize_inter_320x180_5_1-2.webm", \
"vp90-2-21-resize_inter_320x180_5_3-4.webm", \
"vp90-2-21-resize_inter_320x180_7_1-2.webm", \
"vp90-2-21-resize_inter_320x180_7_3-4.webm", \
"vp90-2-21-resize_inter_320x240_5_1-2.webm", \
"vp90-2-21-resize_inter_320x240_5_3-4.webm", \
"vp90-2-21-resize_inter_320x240_7_1-2.webm", \
"vp90-2-21-resize_inter_320x240_7_3-4.webm", \
"vp90-2-21-resize_inter_640x360_5_1-2.webm", \
"vp90-2-21-resize_inter_640x360_5_3-4.webm", \
"vp90-2-21-resize_inter_640x360_7_1-2.webm", \
"vp90-2-21-resize_inter_640x360_7_3-4.webm", \
"vp90-2-21-resize_inter_640x480_5_1-2.webm", \
"vp90-2-21-resize_inter_640x480_5_3-4.webm", \
"vp90-2-21-resize_inter_640x480_7_1-2.webm", \
"vp90-2-21-resize_inter_640x480_7_3-4.webm", \
"vp90-2-21-resize_inter_1280x720_5_1-2.webm", \
"vp90-2-21-resize_inter_1280x720_5_3-4.webm", \
"vp90-2-21-resize_inter_1280x720_7_1-2.webm", \
"vp90-2-21-resize_inter_1280x720_7_3-4.webm", \
"vp90-2-21-resize_inter_1920x1080_5_1-2.webm", \
"vp90-2-21-resize_inter_1920x1080_5_3-4.webm", \
"vp90-2-21-resize_inter_1920x1080_7_1-2.webm", \
"vp90-2-21-resize_inter_1920x1080_7_3-4.webm",
const char *const kVP9TestVectors[] = {
"vp90-2-00-quantizer-00.webm", "vp90-2-00-quantizer-01.webm",
"vp90-2-00-quantizer-02.webm", "vp90-2-00-quantizer-03.webm",
@@ -120,7 +145,10 @@ const char *const kVP9TestVectors[] = {
"vp90-2-02-size-66x10.webm", "vp90-2-02-size-66x16.webm", "vp90-2-02-size-66x10.webm", "vp90-2-02-size-66x16.webm",
"vp90-2-02-size-66x18.webm", "vp90-2-02-size-66x32.webm", "vp90-2-02-size-66x18.webm", "vp90-2-02-size-66x32.webm",
"vp90-2-02-size-66x34.webm", "vp90-2-02-size-66x64.webm", "vp90-2-02-size-66x34.webm", "vp90-2-02-size-66x64.webm",
"vp90-2-02-size-66x66.webm", "vp90-2-03-size-196x196.webm", "vp90-2-02-size-66x66.webm", "vp90-2-02-size-130x132.webm",
"vp90-2-02-size-132x130.webm", "vp90-2-02-size-132x132.webm",
"vp90-2-02-size-178x180.webm", "vp90-2-02-size-180x178.webm",
"vp90-2-02-size-180x180.webm", "vp90-2-03-size-196x196.webm",
"vp90-2-03-size-196x198.webm", "vp90-2-03-size-196x200.webm", "vp90-2-03-size-196x198.webm", "vp90-2-03-size-196x200.webm",
"vp90-2-03-size-196x202.webm", "vp90-2-03-size-196x208.webm", "vp90-2-03-size-196x202.webm", "vp90-2-03-size-196x208.webm",
"vp90-2-03-size-196x210.webm", "vp90-2-03-size-196x224.webm", "vp90-2-03-size-196x210.webm", "vp90-2-03-size-196x224.webm",
@@ -182,6 +210,20 @@ const char *const kVP9TestVectors[] = {
"vp90-2-14-resize-fp-tiles-4-2.webm", "vp90-2-14-resize-fp-tiles-4-8.webm", "vp90-2-14-resize-fp-tiles-4-2.webm", "vp90-2-14-resize-fp-tiles-4-8.webm",
"vp90-2-14-resize-fp-tiles-8-16.webm", "vp90-2-14-resize-fp-tiles-8-1.webm", "vp90-2-14-resize-fp-tiles-8-16.webm", "vp90-2-14-resize-fp-tiles-8-1.webm",
"vp90-2-14-resize-fp-tiles-8-2.webm", "vp90-2-14-resize-fp-tiles-8-4.webm", "vp90-2-14-resize-fp-tiles-8-2.webm", "vp90-2-14-resize-fp-tiles-8-4.webm",
"vp90-2-14-resize-10frames-fp-tiles-1-2-4-8.webm",
"vp90-2-14-resize-10frames-fp-tiles-1-2.webm",
"vp90-2-14-resize-10frames-fp-tiles-1-4.webm",
"vp90-2-14-resize-10frames-fp-tiles-1-8.webm",
"vp90-2-14-resize-10frames-fp-tiles-2-1.webm",
"vp90-2-14-resize-10frames-fp-tiles-2-4.webm",
"vp90-2-14-resize-10frames-fp-tiles-2-8.webm",
"vp90-2-14-resize-10frames-fp-tiles-4-1.webm",
"vp90-2-14-resize-10frames-fp-tiles-4-2.webm",
"vp90-2-14-resize-10frames-fp-tiles-4-8.webm",
"vp90-2-14-resize-10frames-fp-tiles-8-1.webm",
"vp90-2-14-resize-10frames-fp-tiles-8-2.webm",
"vp90-2-14-resize-10frames-fp-tiles-8-4-2-1.webm",
"vp90-2-14-resize-10frames-fp-tiles-8-4.webm",
"vp90-2-15-segkey.webm", "vp90-2-15-segkey_adpq.webm", "vp90-2-15-segkey.webm", "vp90-2-15-segkey_adpq.webm",
"vp90-2-16-intra-only.webm", "vp90-2-17-show-existing-frame.webm", "vp90-2-16-intra-only.webm", "vp90-2-17-show-existing-frame.webm",
"vp90-2-18-resize.ivf", "vp90-2-19-skip.webm", "vp90-2-18-resize.ivf", "vp90-2-19-skip.webm",
@@ -193,10 +235,16 @@ const char *const kVP9TestVectors[] = {
"vp93-2-20-10bit-yuv422.webm", "vp93-2-20-12bit-yuv422.webm", "vp93-2-20-10bit-yuv422.webm", "vp93-2-20-12bit-yuv422.webm",
"vp93-2-20-10bit-yuv440.webm", "vp93-2-20-12bit-yuv440.webm", "vp93-2-20-10bit-yuv440.webm", "vp93-2-20-12bit-yuv440.webm",
"vp93-2-20-10bit-yuv444.webm", "vp93-2-20-12bit-yuv444.webm", "vp93-2-20-10bit-yuv444.webm", "vp93-2-20-12bit-yuv444.webm",
#endif // CONFIG_VP9_HIGHBITDEPTH` #endif // CONFIG_VP9_HIGHBITDEPTH
"vp90-2-20-big_superframe-01.webm", "vp90-2-20-big_superframe-02.webm", "vp90-2-20-big_superframe-01.webm", "vp90-2-20-big_superframe-02.webm",
RESIZE_TEST_VECTORS
};
const int kNumVP9TestVectors = NELEMENTS(kVP9TestVectors);
const char *const kVP9TestVectorsResize[] = {
RESIZE_TEST_VECTORS
};
const int kNumVP9TestVectorsResize = NELEMENTS(kVP9TestVectorsResize);
#undef RESIZE_TEST_VECTORS
#endif // CONFIG_VP9_DECODER
} // namespace libvpx_test

View File

@@ -23,6 +23,8 @@ extern const char *const kVP8TestVectors[];
#if CONFIG_VP9_DECODER
extern const int kNumVP9TestVectors;
extern const char *const kVP9TestVectors[];
extern const int kNumVP9TestVectorsResize;
extern const char *const kVP9TestVectorsResize[];
#endif // CONFIG_VP9_DECODER
} // namespace libvpx_test

File diff suppressed because it is too large

View File

@@ -11,6 +11,9 @@
#define TEST_VIDEO_SOURCE_H_
#if defined(_WIN32)
#undef NOMINMAX
#define NOMINMAX
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#endif
#include <cstdio>

test/vp10_dct_test.cc (new file, 111 lines)

@@ -0,0 +1,111 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <stdlib.h>
#include <new>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/acm_random.h"
#include "test/util.h"
#include "./vpx_config.h"
#include "vpx_ports/msvc.h"
#undef CONFIG_COEFFICIENT_RANGE_CHECKING
#define CONFIG_COEFFICIENT_RANGE_CHECKING 1
#include "vp10/encoder/dct.c"
using libvpx_test::ACMRandom;
namespace {
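// reference_dct_1d below computes a plain (unnormalized) DCT-II:
//   out[k] = sum_{n=0}^{size-1} in[n] * cos(PI * (2n + 1) * k / (2 * size)),
// with the k == 0 term additionally scaled by 1/sqrt(2); the tests below
// compare the integer fdct4/fdct8/fdct16 outputs against this
// double-precision model.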
void reference_dct_1d(const double *in, double *out, int size) {
const double PI = 3.141592653589793238462643383279502884;
const double kInvSqrt2 = 0.707106781186547524400844362104;
for (int k = 0; k < size; ++k) {
out[k] = 0;
for (int n = 0; n < size; ++n) {
out[k] += in[n] * cos(PI * (2 * n + 1) * k / (2 * size));
}
if (k == 0)
out[k] = out[k] * kInvSqrt2;
}
}
typedef void (*FdctFuncRef)(const double *in, double *out, int size);
typedef void (*IdctFuncRef)(const double *in, double *out, int size);
typedef void (*FdctFunc)(const tran_low_t *in, tran_low_t *out);
typedef void (*IdctFunc)(const tran_low_t *in, tran_low_t *out);
class TransTestBase {
public:
virtual ~TransTestBase() {}
protected:
void RunFwdAccuracyCheck() {
tran_low_t *input = new tran_low_t[txfm_size_];
tran_low_t *output = new tran_low_t[txfm_size_];
double *ref_input = new double[txfm_size_];
double *ref_output = new double[txfm_size_];
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
for (int ti = 0; ti < count_test_block; ++ti) {
for (int ni = 0; ni < txfm_size_; ++ni) {
input[ni] = rnd.Rand8() - rnd.Rand8();
ref_input[ni] = static_cast<double>(input[ni]);
}
fwd_txfm_(input, output);
fwd_txfm_ref_(ref_input, ref_output, txfm_size_);
for (int ni = 0; ni < txfm_size_; ++ni) {
EXPECT_LE(
abs(output[ni] - static_cast<tran_low_t>(round(ref_output[ni]))),
max_error_);
}
}
delete[] input;
delete[] output;
delete[] ref_input;
delete[] ref_output;
}
double max_error_;
int txfm_size_;
FdctFunc fwd_txfm_;
FdctFuncRef fwd_txfm_ref_;
};
typedef std::tr1::tuple<FdctFunc, FdctFuncRef, int, int> FdctParam;
class Vp10FwdTxfm
: public TransTestBase,
public ::testing::TestWithParam<FdctParam> {
public:
virtual void SetUp() {
fwd_txfm_ = GET_PARAM(0);
fwd_txfm_ref_ = GET_PARAM(1);
txfm_size_ = GET_PARAM(2);
max_error_ = GET_PARAM(3);
}
virtual void TearDown() {}
};
TEST_P(Vp10FwdTxfm, RunFwdAccuracyCheck) {
RunFwdAccuracyCheck();
}
INSTANTIATE_TEST_CASE_P(
C, Vp10FwdTxfm,
::testing::Values(
FdctParam(&fdct4, &reference_dct_1d, 4, 1),
FdctParam(&fdct8, &reference_dct_1d, 8, 1),
FdctParam(&fdct16, &reference_dct_1d, 16, 2)));
} // namespace

test/vp10_inv_txfm_test.cc (new file, 321 lines)

@@ -0,0 +1,321 @@
/*
* Copyright (c) 2013 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vp10_rtcd.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "vp10/common/blockd.h"
#include "vp10/common/scan.h"
#include "vpx/vpx_integer.h"
#include "vp10/common/vp10_inv_txfm.h"
using libvpx_test::ACMRandom;
namespace {
const double PI = 3.141592653589793238462643383279502884;
const double kInvSqrt2 = 0.707106781186547524400844362104;
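// reference_idct_1d below inverts the DCT-II model used by the forward
// transform tests:
//   out[n] = sum_{k=0}^{size-1} c(k) * in[k] * cos(PI * (2n + 1) * k / (2 * size)),
// where c(0) = 1/sqrt(2) and c(k) = 1 otherwise; vp10_idct4/8/16/32_c are
// checked against it in double precision further down.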
void reference_idct_1d(const double *in, double *out, int size) {
for (int n = 0; n < size; ++n) {
out[n] = 0;
for (int k = 0; k < size; ++k) {
if (k == 0)
out[n] += kInvSqrt2 * in[k] * cos(PI * (2 * n + 1) * k / (2 * size));
else
out[n] += in[k] * cos(PI * (2 * n + 1) * k / (2 * size));
}
}
}
typedef void (*IdctFuncRef)(const double *in, double *out, int size);
typedef void (*IdctFunc)(const tran_low_t *in, tran_low_t *out);
class TransTestBase {
public:
virtual ~TransTestBase() {}
protected:
void RunInvAccuracyCheck() {
tran_low_t *input = new tran_low_t[txfm_size_];
tran_low_t *output = new tran_low_t[txfm_size_];
double *ref_input = new double[txfm_size_];
double *ref_output = new double[txfm_size_];
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 5000;
for (int ti = 0; ti < count_test_block; ++ti) {
for (int ni = 0; ni < txfm_size_; ++ni) {
input[ni] = rnd.Rand8() - rnd.Rand8();
ref_input[ni] = static_cast<double>(input[ni]);
}
inv_txfm_(input, output);
inv_txfm_ref_(ref_input, ref_output, txfm_size_);
for (int ni = 0; ni < txfm_size_; ++ni) {
EXPECT_LE(
abs(output[ni] - static_cast<tran_low_t>(round(ref_output[ni]))),
max_error_);
}
}
delete[] input;
delete[] output;
delete[] ref_input;
delete[] ref_output;
}
double max_error_;
int txfm_size_;
IdctFunc inv_txfm_;
IdctFuncRef inv_txfm_ref_;
};
typedef std::tr1::tuple<IdctFunc, IdctFuncRef, int, int> IdctParam;
class Vp10InvTxfm
: public TransTestBase,
public ::testing::TestWithParam<IdctParam> {
public:
virtual void SetUp() {
inv_txfm_ = GET_PARAM(0);
inv_txfm_ref_ = GET_PARAM(1);
txfm_size_ = GET_PARAM(2);
max_error_ = GET_PARAM(3);
}
virtual void TearDown() {}
};
TEST_P(Vp10InvTxfm, RunInvAccuracyCheck) {
RunInvAccuracyCheck();
}
INSTANTIATE_TEST_CASE_P(
C, Vp10InvTxfm,
::testing::Values(
IdctParam(&vp10_idct4_c, &reference_idct_1d, 4, 1),
IdctParam(&vp10_idct8_c, &reference_idct_1d, 8, 2),
IdctParam(&vp10_idct16_c, &reference_idct_1d, 16, 4),
IdctParam(&vp10_idct32_c, &reference_idct_1d, 32, 6))
);
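// Each IdctParam above is (function under test, double-precision reference,
// transform size, maximum absolute per-coefficient error). The larger
// transforms are given looser bounds, presumably because rounding error in
// the fixed-point butterflies accumulates over more stages.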
typedef void (*FwdTxfmFunc)(const int16_t *in, tran_low_t *out, int stride);
typedef void (*InvTxfmFunc)(const tran_low_t *in, uint8_t *out, int stride);
typedef std::tr1::tuple<FwdTxfmFunc,
InvTxfmFunc,
InvTxfmFunc,
TX_SIZE, int> PartialInvTxfmParam;
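// A PartialInvTxfmParam bundles (forward transform, full inverse transform,
// partial inverse transform specialized for few nonzero coefficients,
// transform size, number of nonzero coefficients in scan order). The tests
// below verify that the partial and full inverse transforms reconstruct
// identical pixels whenever only that many leading coefficients are nonzero.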
const int kMaxNumCoeffs = 1024;
class Vp10PartialIDctTest
: public ::testing::TestWithParam<PartialInvTxfmParam> {
public:
virtual ~Vp10PartialIDctTest() {}
virtual void SetUp() {
ftxfm_ = GET_PARAM(0);
full_itxfm_ = GET_PARAM(1);
partial_itxfm_ = GET_PARAM(2);
tx_size_ = GET_PARAM(3);
last_nonzero_ = GET_PARAM(4);
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
protected:
int last_nonzero_;
TX_SIZE tx_size_;
FwdTxfmFunc ftxfm_;
InvTxfmFunc full_itxfm_;
InvTxfmFunc partial_itxfm_;
};
TEST_P(Vp10PartialIDctTest, RunQuantCheck) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
int size;
switch (tx_size_) {
case TX_4X4:
size = 4;
break;
case TX_8X8:
size = 8;
break;
case TX_16X16:
size = 16;
break;
case TX_32X32:
size = 32;
break;
default:
FAIL() << "Wrong Size!";
break;
}
DECLARE_ALIGNED(16, tran_low_t, test_coef_block1[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, test_coef_block2[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst1[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst2[kMaxNumCoeffs]);
const int count_test_block = 1000;
const int block_size = size * size;
DECLARE_ALIGNED(16, int16_t, input_extreme_block[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, output_ref_block[kMaxNumCoeffs]);
int max_error = 0;
for (int i = 0; i < count_test_block; ++i) {
// clear out destination buffer
memset(dst1, 0, sizeof(*dst1) * block_size);
memset(dst2, 0, sizeof(*dst2) * block_size);
memset(test_coef_block1, 0, sizeof(*test_coef_block1) * block_size);
memset(test_coef_block2, 0, sizeof(*test_coef_block2) * block_size);
ACMRandom rnd(ACMRandom::DeterministicSeed());
for (int i = 0; i < count_test_block; ++i) {
// Initialize a test block with input range [-255, 255].
if (i == 0) {
for (int j = 0; j < block_size; ++j)
input_extreme_block[j] = 255;
} else if (i == 1) {
for (int j = 0; j < block_size; ++j)
input_extreme_block[j] = -255;
} else {
for (int j = 0; j < block_size; ++j) {
input_extreme_block[j] = rnd.Rand8() % 2 ? 255 : -255;
}
}
ftxfm_(input_extreme_block, output_ref_block, size);
// quantization with maximum allowed step sizes
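// (1336 and 1828 appear to be the largest DC and AC dequantizer step sizes
// in the 8-bit quantizer tables; dividing and re-multiplying by them
// emulates the coarsest quantization the encoder could apply.)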
test_coef_block1[0] = (output_ref_block[0] / 1336) * 1336;
for (int j = 1; j < last_nonzero_; ++j)
test_coef_block1[vp10_default_scan_orders[tx_size_].scan[j]]
= (output_ref_block[j] / 1828) * 1828;
}
ASM_REGISTER_STATE_CHECK(full_itxfm_(test_coef_block1, dst1, size));
ASM_REGISTER_STATE_CHECK(partial_itxfm_(test_coef_block1, dst2, size));
for (int j = 0; j < block_size; ++j) {
const int diff = dst1[j] - dst2[j];
const int error = diff * diff;
if (max_error < error)
max_error = error;
}
}
EXPECT_EQ(0, max_error)
<< "Error: partial inverse transform produces different results";
}
TEST_P(Vp10PartialIDctTest, ResultsMatch) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
int size;
switch (tx_size_) {
case TX_4X4:
size = 4;
break;
case TX_8X8:
size = 8;
break;
case TX_16X16:
size = 16;
break;
case TX_32X32:
size = 32;
break;
default:
FAIL() << "Wrong Size!";
break;
}
DECLARE_ALIGNED(16, tran_low_t, test_coef_block1[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, tran_low_t, test_coef_block2[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst1[kMaxNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst2[kMaxNumCoeffs]);
const int count_test_block = 1000;
const int max_coeff = 32766 / 4;
const int block_size = size * size;
int max_error = 0;
for (int i = 0; i < count_test_block; ++i) {
// clear out destination buffer
memset(dst1, 0, sizeof(*dst1) * block_size);
memset(dst2, 0, sizeof(*dst2) * block_size);
memset(test_coef_block1, 0, sizeof(*test_coef_block1) * block_size);
memset(test_coef_block2, 0, sizeof(*test_coef_block2) * block_size);
int max_energy_leftover = max_coeff * max_coeff;
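// Draw coefficients under a shared energy budget: each coefficient may use
// at most the energy left over from the ones before it, so the block's total
// squared magnitude never exceeds max_coeff^2, which is intended to keep the
// reconstruction within range for both inverse transforms.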
for (int j = 0; j < last_nonzero_; ++j) {
int16_t coef = static_cast<int16_t>(sqrt(1.0 * max_energy_leftover) *
(rnd.Rand16() - 32768) / 65536);
max_energy_leftover -= coef * coef;
if (max_energy_leftover < 0) {
max_energy_leftover = 0;
coef = 0;
}
test_coef_block1[vp10_default_scan_orders[tx_size_].scan[j]] = coef;
}
memcpy(test_coef_block2, test_coef_block1,
sizeof(*test_coef_block2) * block_size);
ASM_REGISTER_STATE_CHECK(full_itxfm_(test_coef_block1, dst1, size));
ASM_REGISTER_STATE_CHECK(partial_itxfm_(test_coef_block2, dst2, size));
for (int j = 0; j < block_size; ++j) {
const int diff = dst1[j] - dst2[j];
const int error = diff * diff;
if (max_error < error)
max_error = error;
}
}
EXPECT_EQ(0, max_error)
<< "Error: partial inverse transform produces different results";
}
using std::tr1::make_tuple;
INSTANTIATE_TEST_CASE_P(
C, Vp10PartialIDctTest,
::testing::Values(
make_tuple(&vpx_fdct32x32_c,
&vp10_idct32x32_1024_add_c,
&vp10_idct32x32_34_add_c,
TX_32X32, 34),
make_tuple(&vpx_fdct32x32_c,
&vp10_idct32x32_1024_add_c,
&vp10_idct32x32_1_add_c,
TX_32X32, 1),
make_tuple(&vpx_fdct16x16_c,
&vp10_idct16x16_256_add_c,
&vp10_idct16x16_10_add_c,
TX_16X16, 10),
make_tuple(&vpx_fdct16x16_c,
&vp10_idct16x16_256_add_c,
&vp10_idct16x16_1_add_c,
TX_16X16, 1),
make_tuple(&vpx_fdct8x8_c,
&vp10_idct8x8_64_add_c,
&vp10_idct8x8_12_add_c,
TX_8X8, 12),
make_tuple(&vpx_fdct8x8_c,
&vp10_idct8x8_64_add_c,
&vp10_idct8x8_1_add_c,
TX_8X8, 1),
make_tuple(&vpx_fdct4x4_c,
&vp10_idct4x4_16_add_c,
&vp10_idct4x4_1_add_c,
TX_4X4, 1)));
} // namespace


@@ -230,9 +230,23 @@ VP9_INSTANTIATE_TEST_CASE(
     ::testing::ValuesIn(kEncodeVectors),
     ::testing::ValuesIn(kMinArfVectors));
+#if CONFIG_VP9_HIGHBITDEPTH
+# if CONFIG_VP10_ENCODER
+// TODO(angiebird): 25-29 fail in high bitdepth mode.
+INSTANTIATE_TEST_CASE_P(
+    DISABLED_VP10, ArfFreqTest,
+    ::testing::Combine(
+        ::testing::Values(static_cast<const libvpx_test::CodecFactory *>(
+            &libvpx_test::kVP10)),
+        ::testing::ValuesIn(kTestVectors),
+        ::testing::ValuesIn(kEncodeVectors),
+        ::testing::ValuesIn(kMinArfVectors)));
+# endif  // CONFIG_VP10_ENCODER
+#else
 VP10_INSTANTIATE_TEST_CASE(
     ArfFreqTest,
     ::testing::ValuesIn(kTestVectors),
     ::testing::ValuesIn(kEncodeVectors),
     ::testing::ValuesIn(kMinArfVectors));
+#endif  // CONFIG_VP9_HIGHBITDEPTH
 }  // namespace


@@ -14,38 +14,10 @@
#include "test/encode_test_driver.h" #include "test/encode_test_driver.h"
#include "test/util.h" #include "test/util.h"
#include "test/y4m_video_source.h" #include "test/y4m_video_source.h"
#include "test/yuv_video_source.h" #include "vp9/vp9_dx_iface.h"
#include "vp9/decoder/vp9_decoder.h"
typedef vpx_codec_stream_info_t vp9_stream_info_t;
struct vpx_codec_alg_priv {
vpx_codec_priv_t base;
vpx_codec_dec_cfg_t cfg;
vp9_stream_info_t si;
struct VP9Decoder *pbi;
int postproc_cfg_set;
vp8_postproc_cfg_t postproc_cfg;
vpx_decrypt_cb decrypt_cb;
void *decrypt_state;
vpx_image_t img;
int img_avail;
int flushed;
int invert_tile_order;
int frame_parallel_decode;
// External frame buffer info to save for VP9 common.
void *ext_priv; // Private data associated with the external frame buffers.
vpx_get_frame_buffer_cb_fn_t get_ext_fb_cb;
vpx_release_frame_buffer_cb_fn_t release_ext_fb_cb;
};
static vpx_codec_alg_priv_t *get_alg_priv(vpx_codec_ctx_t *ctx) {
return (vpx_codec_alg_priv_t *)ctx->priv;
}
namespace { namespace {
const unsigned int kFramerate = 50;
const int kCpuUsed = 2; const int kCpuUsed = 2;
struct EncodePerfTestVideo { struct EncodePerfTestVideo {
@@ -66,35 +38,27 @@ struct EncodeParameters {
int32_t lossless; int32_t lossless;
int32_t error_resilient; int32_t error_resilient;
int32_t frame_parallel; int32_t frame_parallel;
vpx_color_range_t color_range;
vpx_color_space_t cs; vpx_color_space_t cs;
int render_size[2];
// TODO(JBB): quantizers / bitrate // TODO(JBB): quantizers / bitrate
}; };
const EncodeParameters kVP9EncodeParameterSet[] = { const EncodeParameters kVP9EncodeParameterSet[] = {
{0, 0, 0, 1, 0, VPX_CS_BT_601}, {0, 0, 0, 1, 0, VPX_CR_STUDIO_RANGE, VPX_CS_BT_601},
{0, 0, 0, 0, 0, VPX_CS_BT_709}, {0, 0, 0, 0, 0, VPX_CR_FULL_RANGE, VPX_CS_BT_709},
{0, 0, 1, 0, 0, VPX_CS_BT_2020}, {0, 0, 1, 0, 0, VPX_CR_FULL_RANGE, VPX_CS_BT_2020},
{0, 2, 0, 0, 1, VPX_CS_UNKNOWN}, {0, 2, 0, 0, 1, VPX_CR_STUDIO_RANGE, VPX_CS_UNKNOWN, { 640, 480 }},
// TODO(JBB): Test profiles (requires more work). // TODO(JBB): Test profiles (requires more work).
}; };
int is_extension_y4m(const char *filename) {
const char *dot = strrchr(filename, '.');
if (!dot || dot == filename)
return 0;
else
return !strcmp(dot, ".y4m");
}
class VpxEncoderParmsGetToDecoder class VpxEncoderParmsGetToDecoder
: public ::libvpx_test::EncoderTest, : public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith2Params<EncodeParameters, \ public ::libvpx_test::CodecTestWith2Params<EncodeParameters,
EncodePerfTestVideo> { EncodePerfTestVideo> {
protected: protected:
VpxEncoderParmsGetToDecoder() VpxEncoderParmsGetToDecoder()
: EncoderTest(GET_PARAM(0)), : EncoderTest(GET_PARAM(0)), encode_parms(GET_PARAM(1)) {}
encode_parms(GET_PARAM(1)) {
}
virtual ~VpxEncoderParmsGetToDecoder() {} virtual ~VpxEncoderParmsGetToDecoder() {}
@@ -112,6 +76,7 @@ class VpxEncoderParmsGetToDecoder
::libvpx_test::Encoder *encoder) { ::libvpx_test::Encoder *encoder) {
if (video->frame() == 1) { if (video->frame() == 1) {
encoder->Control(VP9E_SET_COLOR_SPACE, encode_parms.cs); encoder->Control(VP9E_SET_COLOR_SPACE, encode_parms.cs);
encoder->Control(VP9E_SET_COLOR_RANGE, encode_parms.color_range);
encoder->Control(VP9E_SET_LOSSLESS, encode_parms.lossless); encoder->Control(VP9E_SET_LOSSLESS, encode_parms.lossless);
encoder->Control(VP9E_SET_FRAME_PARALLEL_DECODING, encoder->Control(VP9E_SET_FRAME_PARALLEL_DECODING,
encode_parms.frame_parallel); encode_parms.frame_parallel);
@@ -122,37 +87,44 @@ class VpxEncoderParmsGetToDecoder
encoder->Control(VP8E_SET_ARNR_MAXFRAMES, 7); encoder->Control(VP8E_SET_ARNR_MAXFRAMES, 7);
encoder->Control(VP8E_SET_ARNR_STRENGTH, 5); encoder->Control(VP8E_SET_ARNR_STRENGTH, 5);
encoder->Control(VP8E_SET_ARNR_TYPE, 3); encoder->Control(VP8E_SET_ARNR_TYPE, 3);
if (encode_parms.render_size[0] > 0 && encode_parms.render_size[1] > 0)
encoder->Control(VP9E_SET_RENDER_SIZE, encode_parms.render_size);
} }
} }
virtual bool HandleDecodeResult(const vpx_codec_err_t res_dec, virtual bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::VideoSource& video, const libvpx_test::VideoSource &video,
libvpx_test::Decoder *decoder) { libvpx_test::Decoder *decoder) {
vpx_codec_ctx_t* vp9_decoder = decoder->GetDecoder(); vpx_codec_ctx_t *const vp9_decoder = decoder->GetDecoder();
vpx_codec_alg_priv_t* priv = vpx_codec_alg_priv_t *const priv =
(vpx_codec_alg_priv_t*) get_alg_priv(vp9_decoder); reinterpret_cast<vpx_codec_alg_priv_t *>(vp9_decoder->priv);
FrameWorkerData *const worker_data =
VP9Decoder* pbi = priv->pbi; reinterpret_cast<FrameWorkerData *>(priv->frame_workers[0].data1);
VP9_COMMON* common = &pbi->common; VP9_COMMON *const common = &worker_data->pbi->common;
if (encode_parms.lossless) { if (encode_parms.lossless) {
EXPECT_EQ(common->base_qindex, 0); EXPECT_EQ(0, common->base_qindex);
EXPECT_EQ(common->y_dc_delta_q, 0); EXPECT_EQ(0, common->y_dc_delta_q);
EXPECT_EQ(common->uv_dc_delta_q, 0); EXPECT_EQ(0, common->uv_dc_delta_q);
EXPECT_EQ(common->uv_ac_delta_q, 0); EXPECT_EQ(0, common->uv_ac_delta_q);
EXPECT_EQ(common->tx_mode, ONLY_4X4); EXPECT_EQ(ONLY_4X4, common->tx_mode);
} }
EXPECT_EQ(common->error_resilient_mode, encode_parms.error_resilient); EXPECT_EQ(encode_parms.error_resilient, common->error_resilient_mode);
if (encode_parms.error_resilient) { if (encode_parms.error_resilient) {
EXPECT_EQ(common->frame_parallel_decoding_mode, 1); EXPECT_EQ(1, common->frame_parallel_decoding_mode);
EXPECT_EQ(common->use_prev_frame_mvs, 0); EXPECT_EQ(0, common->use_prev_frame_mvs);
} else { } else {
EXPECT_EQ(common->frame_parallel_decoding_mode, EXPECT_EQ(encode_parms.frame_parallel,
encode_parms.frame_parallel); common->frame_parallel_decoding_mode);
} }
EXPECT_EQ(common->color_space, encode_parms.cs); EXPECT_EQ(encode_parms.color_range, common->color_range);
EXPECT_EQ(common->log2_tile_cols, encode_parms.tile_cols); EXPECT_EQ(encode_parms.cs, common->color_space);
EXPECT_EQ(common->log2_tile_rows, encode_parms.tile_rows); if (encode_parms.render_size[0] > 0 && encode_parms.render_size[1] > 0) {
EXPECT_EQ(encode_parms.render_size[0], common->render_width);
EXPECT_EQ(encode_parms.render_size[1], common->render_height);
}
EXPECT_EQ(encode_parms.tile_cols, common->log2_tile_cols);
EXPECT_EQ(encode_parms.tile_rows, common->log2_tile_rows);
EXPECT_EQ(VPX_CODEC_OK, res_dec) << decoder->DecodeError(); EXPECT_EQ(VPX_CODEC_OK, res_dec) << decoder->DecodeError();
return VPX_CODEC_OK == res_dec; return VPX_CODEC_OK == res_dec;
@@ -164,35 +136,18 @@ class VpxEncoderParmsGetToDecoder
EncodeParameters encode_parms; EncodeParameters encode_parms;
}; };
// TODO(hkuang): This test conflicts with frame parallel decode. So disable it TEST_P(VpxEncoderParmsGetToDecoder, BitstreamParms) {
// for now until fix.
TEST_P(VpxEncoderParmsGetToDecoder, DISABLED_BitstreamParms) {
init_flags_ = VPX_CODEC_USE_PSNR; init_flags_ = VPX_CODEC_USE_PSNR;
libvpx_test::VideoSource *video; libvpx_test::VideoSource *const video =
if (is_extension_y4m(test_video_.name)) { new libvpx_test::Y4mVideoSource(test_video_.name, 0, test_video_.frames);
video = new libvpx_test::Y4mVideoSource(test_video_.name, ASSERT_TRUE(video != NULL);
0, test_video_.frames);
} else {
video = new libvpx_test::YUVVideoSource(test_video_.name,
VPX_IMG_FMT_I420,
test_video_.width,
test_video_.height,
kFramerate, 1, 0,
test_video_.frames);
}
ASSERT_NO_FATAL_FAILURE(RunLoop(video)); ASSERT_NO_FATAL_FAILURE(RunLoop(video));
delete(video); delete video;
} }
VP9_INSTANTIATE_TEST_CASE( VP9_INSTANTIATE_TEST_CASE(VpxEncoderParmsGetToDecoder,
VpxEncoderParmsGetToDecoder, ::testing::ValuesIn(kVP9EncodeParameterSet),
::testing::ValuesIn(kVP9EncodeParameterSet), ::testing::ValuesIn(kVP9EncodePerfTestVectors));
::testing::ValuesIn(kVP9EncodePerfTestVectors));
VP10_INSTANTIATE_TEST_CASE(
VpxEncoderParmsGetToDecoder,
::testing::ValuesIn(kVP9EncodeParameterSet),
::testing::ValuesIn(kVP9EncodePerfTestVectors));
} // namespace } // namespace


@@ -187,9 +187,23 @@ VP9_INSTANTIATE_TEST_CASE(
     ::testing::ValuesIn(kTestVectors),
     ::testing::ValuesIn(kCpuUsedVectors));
+#if CONFIG_VP9_HIGHBITDEPTH
+# if CONFIG_VP10_ENCODER
+// TODO(angiebird): many fail in high bitdepth mode.
+INSTANTIATE_TEST_CASE_P(
+    DISABLED_VP10, EndToEndTestLarge,
+    ::testing::Combine(
+        ::testing::Values(static_cast<const libvpx_test::CodecFactory *>(
+            &libvpx_test::kVP10)),
+        ::testing::ValuesIn(kEncodingModeVectors),
+        ::testing::ValuesIn(kTestVectors),
+        ::testing::ValuesIn(kCpuUsedVectors)));
+# endif  // CONFIG_VP10_ENCODER
+#else
 VP10_INSTANTIATE_TEST_CASE(
     EndToEndTestLarge,
     ::testing::ValuesIn(kEncodingModeVectors),
     ::testing::ValuesIn(kTestVectors),
     ::testing::ValuesIn(kCpuUsedVectors));
+#endif  // CONFIG_VP9_HIGHBITDEPTH
 }  // namespace


@@ -67,12 +67,22 @@ TEST_P(ErrorBlockTest, OperationCheck) {
int64_t ret; int64_t ret;
int64_t ref_ssz; int64_t ref_ssz;
int64_t ref_ret; int64_t ref_ret;
const int msb = bit_depth_ + 8 - 1;
for (int i = 0; i < kNumIterations; ++i) { for (int i = 0; i < kNumIterations; ++i) {
int err_count = 0; int err_count = 0;
block_size = 16 << (i % 9); // All block sizes from 4x4, 8x4 ..64x64 block_size = 16 << (i % 9); // All block sizes from 4x4, 8x4 ..64x64
for (int j = 0; j < block_size; j++) { for (int j = 0; j < block_size; j++) {
coeff[j] = rnd(2 << 20) - (1 << 20); // coeff and dqcoeff will always have at least the same sign, and this
dqcoeff[j] = rnd(2 << 20) - (1 << 20); // can be used for optimization, so generate test input precisely.
if (rnd(2)) {
// Positive number
coeff[j] = rnd(1 << msb);
dqcoeff[j] = rnd(1 << msb);
} else {
// Negative number
coeff[j] = -rnd(1 << msb);
dqcoeff[j] = -rnd(1 << msb);
}
} }
ref_ret = ref_error_block_op_(coeff, dqcoeff, block_size, &ref_ssz, ref_ret = ref_error_block_op_(coeff, dqcoeff, block_size, &ref_ssz,
bit_depth_); bit_depth_);
@@ -85,7 +95,7 @@ TEST_P(ErrorBlockTest, OperationCheck) {
err_count_total += err_count; err_count_total += err_count;
} }
EXPECT_EQ(0, err_count_total) EXPECT_EQ(0, err_count_total)
<< "Error: Error Block Test, C output doesn't match SSE2 output. " << "Error: Error Block Test, C output doesn't match optimized output. "
<< "First failed at test case " << first_failure; << "First failed at test case " << first_failure;
} }
@@ -100,23 +110,36 @@ TEST_P(ErrorBlockTest, ExtremeValues) {
int64_t ret; int64_t ret;
int64_t ref_ssz; int64_t ref_ssz;
int64_t ref_ret; int64_t ref_ret;
int max_val = ((1 << 20) - 1); const int msb = bit_depth_ + 8 - 1;
int max_val = ((1 << msb) - 1);
for (int i = 0; i < kNumIterations; ++i) { for (int i = 0; i < kNumIterations; ++i) {
int err_count = 0; int err_count = 0;
int k = (i / 9) % 5; int k = (i / 9) % 9;
// Change the maximum coeff value, to test different bit boundaries // Change the maximum coeff value, to test different bit boundaries
if ( k == 4 && (i % 9) == 0 ) { if ( k == 8 && (i % 9) == 0 ) {
max_val >>= 1; max_val >>= 1;
} }
block_size = 16 << (i % 9); // All block sizes from 4x4, 8x4 ..64x64 block_size = 16 << (i % 9); // All block sizes from 4x4, 8x4 ..64x64
for (int j = 0; j < block_size; j++) { for (int j = 0; j < block_size; j++) {
if (k < 4) { // Test at maximum values if (k < 4) {
coeff[j] = k % 2 ? max_val : -max_val; // Test at positive maximum values
dqcoeff[j] = (k >> 1) % 2 ? max_val : -max_val; coeff[j] = k % 2 ? max_val : 0;
dqcoeff[j] = (k >> 1) % 2 ? max_val : 0;
} else if (k < 8) {
// Test at negative maximum values
coeff[j] = k % 2 ? -max_val : 0;
dqcoeff[j] = (k >> 1) % 2 ? -max_val : 0;
} else { } else {
coeff[j] = rnd(2 << 14) - (1 << 14); if (rnd(2)) {
dqcoeff[j] = rnd(2 << 14) - (1 << 14); // Positive number
coeff[j] = rnd(1 << 14);
dqcoeff[j] = rnd(1 << 14);
} else {
// Negative number
coeff[j] = -rnd(1 << 14);
dqcoeff[j] = -rnd(1 << 14);
}
} }
} }
ref_ret = ref_error_block_op_(coeff, dqcoeff, block_size, &ref_ssz, ref_ret = ref_error_block_op_(coeff, dqcoeff, block_size, &ref_ssz,
@@ -130,13 +153,30 @@ TEST_P(ErrorBlockTest, ExtremeValues) {
err_count_total += err_count; err_count_total += err_count;
} }
EXPECT_EQ(0, err_count_total) EXPECT_EQ(0, err_count_total)
<< "Error: Error Block Test, C output doesn't match SSE2 output. " << "Error: Error Block Test, C output doesn't match optimized output. "
<< "First failed at test case " << first_failure; << "First failed at test case " << first_failure;
} }
using std::tr1::make_tuple; using std::tr1::make_tuple;
#if CONFIG_USE_X86INC
int64_t wrap_vp9_highbd_block_error_8bit_c(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size,
int64_t *ssz, int bps) {
assert(bps == 8);
return vp9_highbd_block_error_8bit_c(coeff, dqcoeff, block_size, ssz);
}
#if HAVE_SSE2 #if HAVE_SSE2
int64_t wrap_vp9_highbd_block_error_8bit_sse2(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size,
int64_t *ssz, int bps) {
assert(bps == 8);
return vp9_highbd_block_error_8bit_sse2(coeff, dqcoeff, block_size, ssz);
}
INSTANTIATE_TEST_CASE_P( INSTANTIATE_TEST_CASE_P(
SSE2, ErrorBlockTest, SSE2, ErrorBlockTest,
::testing::Values( ::testing::Values(
@@ -145,7 +185,27 @@ INSTANTIATE_TEST_CASE_P(
make_tuple(&vp9_highbd_block_error_sse2, make_tuple(&vp9_highbd_block_error_sse2,
&vp9_highbd_block_error_c, VPX_BITS_12), &vp9_highbd_block_error_c, VPX_BITS_12),
make_tuple(&vp9_highbd_block_error_sse2, make_tuple(&vp9_highbd_block_error_sse2,
&vp9_highbd_block_error_c, VPX_BITS_8))); &vp9_highbd_block_error_c, VPX_BITS_8),
make_tuple(&wrap_vp9_highbd_block_error_8bit_sse2,
&wrap_vp9_highbd_block_error_8bit_c, VPX_BITS_8)));
#endif // HAVE_SSE2 #endif // HAVE_SSE2
#if HAVE_AVX
int64_t wrap_vp9_highbd_block_error_8bit_avx(const tran_low_t *coeff,
const tran_low_t *dqcoeff,
intptr_t block_size,
int64_t *ssz, int bps) {
assert(bps == 8);
return vp9_highbd_block_error_8bit_avx(coeff, dqcoeff, block_size, ssz);
}
INSTANTIATE_TEST_CASE_P(
AVX, ErrorBlockTest,
::testing::Values(
make_tuple(&wrap_vp9_highbd_block_error_8bit_avx,
&wrap_vp9_highbd_block_error_8bit_c, VPX_BITS_8)));
#endif // HAVE_AVX
#endif // CONFIG_USE_X86INC
#endif // CONFIG_VP9_HIGHBITDEPTH #endif // CONFIG_VP9_HIGHBITDEPTH
} // namespace } // namespace


@@ -132,7 +132,6 @@ using std::tr1::make_tuple;
#if HAVE_SSE2 #if HAVE_SSE2
#if CONFIG_VP9_HIGHBITDEPTH #if CONFIG_VP9_HIGHBITDEPTH
#if CONFIG_USE_X86INC #if CONFIG_USE_X86INC
#if ARCH_X86_64
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_8, VP9IntraPredTest, INSTANTIATE_TEST_CASE_P(SSE2_TO_C_8, VP9IntraPredTest,
::testing::Values( ::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_32x32_sse2, make_tuple(&vpx_highbd_dc_predictor_32x32_sse2,
@@ -141,13 +140,13 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_8, VP9IntraPredTest,
&vpx_highbd_tm_predictor_16x16_c, 16, 8), &vpx_highbd_tm_predictor_16x16_c, 16, 8),
make_tuple(&vpx_highbd_tm_predictor_32x32_sse2, make_tuple(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, 8), &vpx_highbd_tm_predictor_32x32_c, 32, 8),
make_tuple(&vpx_highbd_dc_predictor_4x4_sse, make_tuple(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 8), &vpx_highbd_dc_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2, make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 8), &vpx_highbd_dc_predictor_8x8_c, 8, 8),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2, make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16, 8), &vpx_highbd_dc_predictor_16x16_c, 16, 8),
make_tuple(&vpx_highbd_v_predictor_4x4_sse, make_tuple(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 8), &vpx_highbd_v_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2, make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 8), &vpx_highbd_v_predictor_8x8_c, 8, 8),
@@ -155,34 +154,11 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_8, VP9IntraPredTest,
&vpx_highbd_v_predictor_16x16_c, 16, 8), &vpx_highbd_v_predictor_16x16_c, 16, 8),
make_tuple(&vpx_highbd_v_predictor_32x32_sse2, make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, 8), &vpx_highbd_v_predictor_32x32_c, 32, 8),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse, make_tuple(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 8), &vpx_highbd_tm_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2, make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 8))); &vpx_highbd_tm_predictor_8x8_c, 8, 8)));
#else
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_8, VP9IntraPredTest,
::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_4x4_sse,
&vpx_highbd_dc_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 8),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16, 8),
make_tuple(&vpx_highbd_v_predictor_4x4_sse,
&vpx_highbd_v_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 8),
make_tuple(&vpx_highbd_v_predictor_16x16_sse2,
&vpx_highbd_v_predictor_16x16_c, 16, 8),
make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, 8),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse,
&vpx_highbd_tm_predictor_4x4_c, 4, 8),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 8)));
#endif // !ARCH_X86_64
#if ARCH_X86_64
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_10, VP9IntraPredTest, INSTANTIATE_TEST_CASE_P(SSE2_TO_C_10, VP9IntraPredTest,
::testing::Values( ::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_32x32_sse2, make_tuple(&vpx_highbd_dc_predictor_32x32_sse2,
@@ -194,14 +170,14 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_10, VP9IntraPredTest,
make_tuple(&vpx_highbd_tm_predictor_32x32_sse2, make_tuple(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, &vpx_highbd_tm_predictor_32x32_c, 32,
10), 10),
make_tuple(&vpx_highbd_dc_predictor_4x4_sse, make_tuple(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 10), &vpx_highbd_dc_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2, make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 10), &vpx_highbd_dc_predictor_8x8_c, 8, 10),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2, make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16, &vpx_highbd_dc_predictor_16x16_c, 16,
10), 10),
make_tuple(&vpx_highbd_v_predictor_4x4_sse, make_tuple(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 10), &vpx_highbd_v_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2, make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 10), &vpx_highbd_v_predictor_8x8_c, 8, 10),
@@ -211,35 +187,11 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_10, VP9IntraPredTest,
make_tuple(&vpx_highbd_v_predictor_32x32_sse2, make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, &vpx_highbd_v_predictor_32x32_c, 32,
10), 10),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse, make_tuple(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 10), &vpx_highbd_tm_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2, make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 10))); &vpx_highbd_tm_predictor_8x8_c, 8, 10)));
#else
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_10, VP9IntraPredTest,
::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_4x4_sse,
&vpx_highbd_dc_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 10),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16,
10),
make_tuple(&vpx_highbd_v_predictor_4x4_sse,
&vpx_highbd_v_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 10),
make_tuple(&vpx_highbd_v_predictor_16x16_sse2,
&vpx_highbd_v_predictor_16x16_c, 16, 10),
make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, 10),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse,
&vpx_highbd_tm_predictor_4x4_c, 4, 10),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 10)));
#endif // !ARCH_X86_64
#if ARCH_X86_64
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_12, VP9IntraPredTest, INSTANTIATE_TEST_CASE_P(SSE2_TO_C_12, VP9IntraPredTest,
::testing::Values( ::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_32x32_sse2, make_tuple(&vpx_highbd_dc_predictor_32x32_sse2,
@@ -251,14 +203,14 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_12, VP9IntraPredTest,
make_tuple(&vpx_highbd_tm_predictor_32x32_sse2, make_tuple(&vpx_highbd_tm_predictor_32x32_sse2,
&vpx_highbd_tm_predictor_32x32_c, 32, &vpx_highbd_tm_predictor_32x32_c, 32,
12), 12),
make_tuple(&vpx_highbd_dc_predictor_4x4_sse, make_tuple(&vpx_highbd_dc_predictor_4x4_sse2,
&vpx_highbd_dc_predictor_4x4_c, 4, 12), &vpx_highbd_dc_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2, make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 12), &vpx_highbd_dc_predictor_8x8_c, 8, 12),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2, make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16, &vpx_highbd_dc_predictor_16x16_c, 16,
12), 12),
make_tuple(&vpx_highbd_v_predictor_4x4_sse, make_tuple(&vpx_highbd_v_predictor_4x4_sse2,
&vpx_highbd_v_predictor_4x4_c, 4, 12), &vpx_highbd_v_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2, make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 12), &vpx_highbd_v_predictor_8x8_c, 8, 12),
@@ -268,33 +220,11 @@ INSTANTIATE_TEST_CASE_P(SSE2_TO_C_12, VP9IntraPredTest,
make_tuple(&vpx_highbd_v_predictor_32x32_sse2, make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, &vpx_highbd_v_predictor_32x32_c, 32,
12), 12),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse, make_tuple(&vpx_highbd_tm_predictor_4x4_sse2,
&vpx_highbd_tm_predictor_4x4_c, 4, 12), &vpx_highbd_tm_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2, make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 12))); &vpx_highbd_tm_predictor_8x8_c, 8, 12)));
#else
INSTANTIATE_TEST_CASE_P(SSE2_TO_C_12, VP9IntraPredTest,
::testing::Values(
make_tuple(&vpx_highbd_dc_predictor_4x4_sse,
&vpx_highbd_dc_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_dc_predictor_8x8_sse2,
&vpx_highbd_dc_predictor_8x8_c, 8, 12),
make_tuple(&vpx_highbd_dc_predictor_16x16_sse2,
&vpx_highbd_dc_predictor_16x16_c, 16,
12),
make_tuple(&vpx_highbd_v_predictor_4x4_sse,
&vpx_highbd_v_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_v_predictor_8x8_sse2,
&vpx_highbd_v_predictor_8x8_c, 8, 12),
make_tuple(&vpx_highbd_v_predictor_16x16_sse2,
&vpx_highbd_v_predictor_16x16_c, 16, 12),
make_tuple(&vpx_highbd_v_predictor_32x32_sse2,
&vpx_highbd_v_predictor_32x32_c, 32, 12),
make_tuple(&vpx_highbd_tm_predictor_4x4_sse,
&vpx_highbd_tm_predictor_4x4_c, 4, 12),
make_tuple(&vpx_highbd_tm_predictor_8x8_sse2,
&vpx_highbd_tm_predictor_8x8_c, 8, 12)));
#endif // !ARCH_X86_64
#endif // CONFIG_USE_X86INC #endif // CONFIG_USE_X86INC
#endif // CONFIG_VP9_HIGHBITDEPTH #endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_SSE2 #endif // HAVE_SSE2


@@ -54,7 +54,7 @@ vp9_spatial_svc() {
if [ "$(vp9_encode_available)" = "yes" ]; then if [ "$(vp9_encode_available)" = "yes" ]; then
local readonly test_name="vp9_spatial_svc" local readonly test_name="vp9_spatial_svc"
for layers in $(seq 1 ${vp9_ssvc_test_layers}); do for layers in $(seq 1 ${vp9_ssvc_test_layers}); do
vp9_spatial_svc_encoder "${test_name}" -l ${layers} vp9_spatial_svc_encoder "${test_name}" -sl ${layers}
done done
fi fi
} }


@@ -190,7 +190,7 @@ string DecodeFile(const string& filename, int num_threads) {
 void DecodeFiles(const FileList files[]) {
   for (const FileList *iter = files; iter->name != NULL; ++iter) {
     SCOPED_TRACE(iter->name);
-    for (int t = 2; t <= 8; ++t) {
+    for (int t = 1; t <= 8; ++t) {
       EXPECT_EQ(iter->expected_md5, DecodeFile(iter->name, t))
           << "threads = " << t;
     }
@@ -235,13 +235,13 @@ TEST(VPxWorkerThreadTest, TestSerialInterface) {
   EXPECT_EQ(expected_md5, DecodeFile(filename, 2));
 }
-TEST(VP9DecodeMultiThreadedTest, Decode) {
+TEST(VP9DecodeMultiThreadedTest, NoTilesNonFrameParallel) {
   // no tiles or frame parallel; this exercises loop filter threading.
   EXPECT_EQ("b35a1b707b28e82be025d960aba039bc",
             DecodeFile("vp90-2-03-size-226x226.webm", 2));
 }
-TEST(VP9DecodeMultiThreadedTest, Decode2) {
+TEST(VP9DecodeMultiThreadedTest, FrameParallel) {
   static const FileList files[] = {
     { "vp90-2-08-tile_1x2_frame_parallel.webm",
       "68ede6abd66bae0a2edf2eb9232241b6" },
@@ -255,8 +255,7 @@ TEST(VP9DecodeMultiThreadedTest, Decode2) {
   DecodeFiles(files);
 }
-// Test tile quantity changes within one file.
-TEST(VP9DecodeMultiThreadedTest, Decode3) {
+TEST(VP9DecodeMultiThreadedTest, FrameParallelResize) {
   static const FileList files[] = {
     { "vp90-2-14-resize-fp-tiles-1-16.webm",
       "0cd5e632c326297e975f38949c31ea94" },
@@ -307,6 +306,19 @@ TEST(VP9DecodeMultiThreadedTest, Decode3) {
   DecodeFiles(files);
 }
+TEST(VP9DecodeMultiThreadedTest, NonFrameParallel) {
+  static const FileList files[] = {
+    { "vp90-2-08-tile_1x2.webm", "570b4a5d5a70d58b5359671668328a16" },
+    { "vp90-2-08-tile_1x4.webm", "988d86049e884c66909d2d163a09841a" },
+    { "vp90-2-08-tile_1x8.webm", "0941902a52e9092cb010905eab16364c" },
+    { "vp90-2-08-tile-4x1.webm", "06505aade6647c583c8e00a2f582266f" },
+    { "vp90-2-08-tile-4x4.webm", "85c2299892460d76e2c600502d52bfe2" },
+    { NULL, NULL }
+  };
+  DecodeFiles(files);
+}
 #endif  // CONFIG_WEBM_IO
 INSTANTIATE_TEST_CASE_P(Synchronous, VPxWorkerThreadTest, ::testing::Bool());


@@ -9,6 +9,7 @@
  */
 #ifndef TEST_Y4M_VIDEO_SOURCE_H_
 #define TEST_Y4M_VIDEO_SOURCE_H_
+#include <algorithm>
 #include <string>
 #include "test/video_source.h"
@@ -91,6 +92,18 @@ class Y4mVideoSource : public VideoSource {
     y4m_input_fetch_frame(&y4m_, input_file_, img_.get());
   }
+  // Swap buffers with another y4m source. This allows reading a new frame
+  // while keeping the old frame around. A whole Y4mSource is required and
+  // not just a vpx_image_t because of how the y4m reader manipulates
+  // vpx_image_t internals.
+  void SwapBuffers(Y4mVideoSource *other) {
+    std::swap(other->y4m_.dst_buf, y4m_.dst_buf);
+    vpx_image_t *tmp;
+    tmp = other->img_.release();
+    other->img_.reset(img_.release());
+    img_.reset(tmp);
+  }
  protected:
   void CloseSource() {
     y4m_input_close(&y4m_);


@@ -1,7 +1,10 @@
 URL: https://chromium.googlesource.com/webm/libwebm
-Version: 2dec09426ab62b794464cc9971bd135b4d313e65
+Version: 476366249e1fda7710a389cd41c57db42305e0d4
 License: BSD
 License File: LICENSE.txt
 Description:
 libwebm is used to handle WebM container I/O.
+Local Changes:
+* <none>


@@ -528,7 +528,7 @@ class Tracks {
  public:
   // Audio and video type defined by the Matroska specs.
   enum { kVideo = 0x1, kAudio = 0x2 };
+  // Opus, Vorbis, VP8, and VP9 codec ids defined by the Matroska specs.
   static const char kOpusCodecId[];
   static const char kVorbisCodecId[];
   static const char kVp8CodecId[];

File diff suppressed because it is too large.

@@ -9,12 +9,13 @@
 #ifndef MKVPARSER_HPP
 #define MKVPARSER_HPP
-#include <cstdlib>
-#include <cstdio>
 #include <cstddef>
+#include <cstdio>
+#include <cstdlib>
 namespace mkvparser {
+const int E_PARSE_FAILED = -1;
 const int E_FILE_FORMAT_INVALID = -2;
 const int E_BUFFER_NOT_FULL = -3;
@@ -27,8 +28,11 @@ class IMkvReader {
   virtual ~IMkvReader();
 };
+template<typename Type> Type* SafeArrayAlloc(unsigned long long num_elements,
+                                             unsigned long long element_size);
 long long GetUIntLength(IMkvReader*, long long, long&);
 long long ReadUInt(IMkvReader*, long long, long&);
+long long ReadID(IMkvReader* pReader, long long pos, long& len);
 long long UnserializeUInt(IMkvReader*, long long pos, long long size);
 long UnserializeFloat(IMkvReader*, long long pos, long long size, double&);
@@ -833,7 +837,7 @@ class Cues {
  private:
   bool Init() const;
-  void PreloadCuePoint(long&, long long) const;
+  bool PreloadCuePoint(long&, long long) const;
   mutable CuePoint** m_cue_points;
   mutable long m_count;
@@ -999,8 +1003,8 @@ class Segment {
   long DoLoadClusterUnknownSize(long long&, long&);
   long DoParseNext(const Cluster*&, long long&, long&);
-  void AppendCluster(Cluster*);
-  void PreloadCluster(Cluster*, ptrdiff_t);
+  bool AppendCluster(Cluster*);
+  bool PreloadCluster(Cluster*, ptrdiff_t);
   // void ParseSeekHead(long long pos, long long size);
   // void ParseSeekEntry(long long pos, long long size);


@@ -41,6 +41,7 @@ enum MkvId {
   kMkvTimecodeScale = 0x2AD7B1,
   kMkvDuration = 0x4489,
   kMkvDateUTC = 0x4461,
+  kMkvTitle = 0x7BA9,
   kMkvMuxingApp = 0x4D80,
   kMkvWritingApp = 0x5741,
   // Cluster
@@ -107,9 +108,16 @@ enum MkvId {
   kMkvContentEncodingOrder = 0x5031,
   kMkvContentEncodingScope = 0x5032,
   kMkvContentEncodingType = 0x5033,
+  kMkvContentCompression = 0x5034,
+  kMkvContentCompAlgo = 0x4254,
+  kMkvContentCompSettings = 0x4255,
   kMkvContentEncryption = 0x5035,
   kMkvContentEncAlgo = 0x47E1,
   kMkvContentEncKeyID = 0x47E2,
+  kMkvContentSignature = 0x47E3,
+  kMkvContentSigKeyID = 0x47E4,
+  kMkvContentSigAlgo = 0x47E5,
+  kMkvContentSigHashAlgo = 0x47E6,
   kMkvContentEncAESSettings = 0x47E7,
   kMkvAESSettingsCipherMode = 0x47E8,
   kMkvAESSettingsCipherInitData = 0x47E9,


@@ -119,7 +119,7 @@
 %if ABI_IS_32BIT
 %if CONFIG_PIC=1
 %ifidn __OUTPUT_FORMAT__,elf32
-%define GET_GOT_SAVE_ARG 1
+%define GET_GOT_DEFINED 1
 %define WRT_PLT wrt ..plt
 %macro GET_GOT 1
 extern _GLOBAL_OFFSET_TABLE_
@@ -138,7 +138,7 @@
 %define RESTORE_GOT pop %1
 %endmacro
 %elifidn __OUTPUT_FORMAT__,macho32
-%define GET_GOT_SAVE_ARG 1
+%define GET_GOT_DEFINED 1
 %macro GET_GOT 1
 push %1
 call %%get_got
@@ -149,6 +149,8 @@
 %undef RESTORE_GOT
 %define RESTORE_GOT pop %1
 %endmacro
+%else
+%define GET_GOT_DEFINED 0
 %endif
 %endif


@@ -6,7 +6,7 @@ cat <<EOF
 # This file is automatically generated from the git commit history
 # by tools/gen_authors.sh.
-$(git log --pretty=format:"%aN <%aE>" | sort | uniq)
+$(git log --pretty=format:"%aN <%aE>" | sort | uniq | grep -v corp.google)
 Google Inc.
 The Mozilla Foundation
 The Xiph.Org Foundation


@@ -66,7 +66,7 @@ void vp10_foreach_transformed_block_in_plane(
   for (r = 0; r < max_blocks_high; r += (1 << tx_size)) {
     // Skip visiting the sub blocks that are wholly within the UMV.
     for (c = 0; c < max_blocks_wide; c += (1 << tx_size)) {
-      visit(plane, i, plane_bsize, tx_size, arg);
+      visit(plane, i, r, c, plane_bsize, tx_size, arg);
       i += step;
     }
     i += extra_step;


@@ -70,6 +70,9 @@ typedef struct {
PREDICTION_MODE mode; PREDICTION_MODE mode;
TX_SIZE tx_size; TX_SIZE tx_size;
int8_t skip; int8_t skip;
#if CONFIG_MISC_FIXES
int8_t has_no_coeffs;
#endif
int8_t segment_id; int8_t segment_id;
int8_t seg_id_predicted; // valid only when temporal_update is enabled int8_t seg_id_predicted; // valid only when temporal_update is enabled
@@ -128,6 +131,7 @@ struct macroblockd_plane {
ENTROPY_CONTEXT *above_context; ENTROPY_CONTEXT *above_context;
ENTROPY_CONTEXT *left_context; ENTROPY_CONTEXT *left_context;
int16_t seg_dequant[MAX_SEGMENTS][2]; int16_t seg_dequant[MAX_SEGMENTS][2];
uint8_t *color_index_map;
// number of 4x4s in current block // number of 4x4s in current block
uint16_t n4_w, n4_h; uint16_t n4_w, n4_h;
@@ -167,8 +171,6 @@ typedef struct macroblockd {
int up_available; int up_available;
int left_available; int left_available;
const vpx_prob (*partition_probs)[PARTITION_TYPES - 1];
/* Distance of MB away from frame edges */ /* Distance of MB away from frame edges */
int mb_to_left_edge; int mb_to_left_edge;
int mb_to_right_edge; int mb_to_right_edge;
@@ -176,7 +178,6 @@ typedef struct macroblockd {
int mb_to_bottom_edge; int mb_to_bottom_edge;
FRAME_CONTEXT *fc; FRAME_CONTEXT *fc;
int frame_parallel_decoding_mode;
/* pointers to reference frames */ /* pointers to reference frames */
RefBuffer *block_refs[2]; RefBuffer *block_refs[2];
@@ -195,7 +196,7 @@ typedef struct macroblockd {
int bd; int bd;
#endif #endif
int lossless; int lossless[MAX_SEGMENTS];
int corrupted; int corrupted;
struct vpx_internal_error_info *error_info; struct vpx_internal_error_info *error_info;
@@ -224,8 +225,8 @@ static INLINE TX_TYPE get_tx_type(PLANE_TYPE plane_type, const MACROBLOCKD *xd,
const MODE_INFO *const mi = xd->mi[0]; const MODE_INFO *const mi = xd->mi[0];
const MB_MODE_INFO *const mbmi = &mi->mbmi; const MB_MODE_INFO *const mbmi = &mi->mbmi;
if (plane_type != PLANE_TYPE_Y || xd->lossless || is_inter_block(mbmi) || if (plane_type != PLANE_TYPE_Y || xd->lossless[mbmi->segment_id] ||
mbmi->tx_size >= TX_32X32) is_inter_block(mbmi) || mbmi->tx_size >= TX_32X32)
return DCT_DCT; return DCT_DCT;
return intra_mode_to_tx_type_lookup[get_y_mode(mi, block_idx)]; return intra_mode_to_tx_type_lookup[get_y_mode(mi, block_idx)];
@@ -266,16 +267,8 @@ static INLINE void reset_skip_context(MACROBLOCKD *xd, BLOCK_SIZE bsize) {
} }
} }
static INLINE const vpx_prob *get_y_mode_probs(const MODE_INFO *mi,
const MODE_INFO *above_mi,
const MODE_INFO *left_mi,
int block) {
const PREDICTION_MODE above = vp10_above_block_mode(mi, above_mi, block);
const PREDICTION_MODE left = vp10_left_block_mode(mi, left_mi, block);
return vp10_kf_y_mode_prob[above][left];
}
typedef void (*foreach_transformed_block_visitor)(int plane, int block, typedef void (*foreach_transformed_block_visitor)(int plane, int block,
int blk_row, int blk_col,
BLOCK_SIZE plane_bsize, BLOCK_SIZE plane_bsize,
TX_SIZE tx_size, TX_SIZE tx_size,
void *arg); void *arg);
@@ -289,17 +282,6 @@ void vp10_foreach_transformed_block(
const MACROBLOCKD* const xd, BLOCK_SIZE bsize, const MACROBLOCKD* const xd, BLOCK_SIZE bsize,
foreach_transformed_block_visitor visit, void *arg); foreach_transformed_block_visitor visit, void *arg);
static INLINE void txfrm_block_to_raster_xy(BLOCK_SIZE plane_bsize,
TX_SIZE tx_size, int block,
int *x, int *y) {
const int bwl = b_width_log2_lookup[plane_bsize];
const int tx_cols_log2 = bwl - tx_size;
const int tx_cols = 1 << tx_cols_log2;
const int raster_mb = block >> (tx_size << 1);
*x = (raster_mb & (tx_cols - 1)) << tx_size;
*y = (raster_mb >> tx_cols_log2) << tx_size;
}
void vp10_set_contexts(const MACROBLOCKD *xd, struct macroblockd_plane *pd, void vp10_set_contexts(const MACROBLOCKD *xd, struct macroblockd_plane *pd,
BLOCK_SIZE plane_bsize, TX_SIZE tx_size, int has_eob, BLOCK_SIZE plane_bsize, TX_SIZE tx_size, int has_eob,
int aoff, int loff); int aoff, int loff);


@@ -31,6 +31,8 @@ static const uint8_t num_4x4_blocks_high_lookup[BLOCK_SIZES] =
 // Log 2 conversion lookup tables for modeinfo width and height
 static const uint8_t mi_width_log2_lookup[BLOCK_SIZES] =
   {0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3};
+static const uint8_t mi_height_log2_lookup[BLOCK_SIZES] =
+  {0, 0, 0, 0, 1, 0, 1, 2, 1, 2, 3, 2, 3};
 static const uint8_t num_8x8_blocks_wide_lookup[BLOCK_SIZES] =
   {1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 4, 8, 8};
 static const uint8_t num_8x8_blocks_high_lookup[BLOCK_SIZES] =


@@ -403,7 +403,6 @@ const vpx_prob vp10_pareto8_full[COEFF_PROB_MODELS][MODEL_NODES] = {
   {255, 241, 243, 255, 236, 255, 252, 254},
   {255, 243, 245, 255, 237, 255, 252, 254},
   {255, 246, 247, 255, 239, 255, 253, 255},
-  {255, 246, 247, 255, 239, 255, 253, 255},
 };
 static const vp10_coeff_probs_model default_coef_probs_4x4[PLANE_TYPES] = {
@@ -743,7 +742,9 @@ static const vp10_coeff_probs_model default_coef_probs_32x32[PLANE_TYPES] = {
 };
 static void extend_to_full_distribution(vpx_prob *probs, vpx_prob p) {
-  memcpy(probs, vp10_pareto8_full[p = 0 ? 0 : p - 1],
+  // TODO(aconverse): model[PIVOT_NODE] should never be zero.
+  // https://code.google.com/p/webm/issues/detail?id=1089
+  memcpy(probs, vp10_pareto8_full[p == 0 ? 254 : p - 1],
          MODEL_NODES * sizeof(vpx_prob));
 }


@@ -153,7 +153,7 @@ static INLINE const uint8_t *get_band_translate(TX_SIZE tx_size) {
 // 1, 3, 5, 7, ..., 253, 255
 // In between probabilities are interpolated linearly
-#define COEFF_PROB_MODELS 256
+#define COEFF_PROB_MODELS 255
 #define UNCONSTRAINED_NODES 3


@@ -127,6 +127,7 @@ const vpx_prob vp10_kf_y_mode_prob[INTRA_MODES][INTRA_MODES][INTRA_MODES - 1] =
} }
}; };
#if !CONFIG_MISC_FIXES
const vpx_prob vp10_kf_uv_mode_prob[INTRA_MODES][INTRA_MODES - 1] = { const vpx_prob vp10_kf_uv_mode_prob[INTRA_MODES][INTRA_MODES - 1] = {
{ 144, 11, 54, 157, 195, 130, 46, 58, 108 }, // y = dc { 144, 11, 54, 157, 195, 130, 46, 58, 108 }, // y = dc
{ 118, 15, 123, 148, 131, 101, 44, 93, 131 }, // y = v { 118, 15, 123, 148, 131, 101, 44, 93, 131 }, // y = v
@@ -139,6 +140,7 @@ const vpx_prob vp10_kf_uv_mode_prob[INTRA_MODES][INTRA_MODES - 1] = {
{ 116, 12, 64, 120, 140, 125, 49, 115, 121 }, // y = d63 { 116, 12, 64, 120, 140, 125, 49, 115, 121 }, // y = d63
{ 102, 19, 66, 162, 182, 122, 35, 59, 128 } // y = tm { 102, 19, 66, 162, 182, 122, 35, 59, 128 } // y = tm
}; };
#endif
static const vpx_prob default_if_y_probs[BLOCK_SIZE_GROUPS][INTRA_MODES - 1] = { static const vpx_prob default_if_y_probs[BLOCK_SIZE_GROUPS][INTRA_MODES - 1] = {
{ 65, 32, 18, 144, 162, 194, 41, 51, 98 }, // block_size < 8x8 { 65, 32, 18, 144, 162, 194, 41, 51, 98 }, // block_size < 8x8
@@ -147,7 +149,7 @@ static const vpx_prob default_if_y_probs[BLOCK_SIZE_GROUPS][INTRA_MODES - 1] = {
{ 221, 135, 38, 194, 248, 121, 96, 85, 29 } // block_size >= 32x32 { 221, 135, 38, 194, 248, 121, 96, 85, 29 } // block_size >= 32x32
}; };
static const vpx_prob default_if_uv_probs[INTRA_MODES][INTRA_MODES - 1] = { static const vpx_prob default_uv_probs[INTRA_MODES][INTRA_MODES - 1] = {
{ 120, 7, 76, 176, 208, 126, 28, 54, 103 }, // y = dc { 120, 7, 76, 176, 208, 126, 28, 54, 103 }, // y = dc
{ 48, 12, 154, 155, 139, 90, 34, 117, 119 }, // y = v { 48, 12, 154, 155, 139, 90, 34, 117, 119 }, // y = v
{ 67, 6, 25, 204, 243, 158, 13, 21, 96 }, // y = h { 67, 6, 25, 204, 243, 158, 13, 21, 96 }, // y = h
@@ -160,6 +162,7 @@ static const vpx_prob default_if_uv_probs[INTRA_MODES][INTRA_MODES - 1] = {
{ 101, 21, 107, 181, 192, 103, 19, 67, 125 } // y = tm { 101, 21, 107, 181, 192, 103, 19, 67, 125 } // y = tm
}; };
#if !CONFIG_MISC_FIXES
const vpx_prob vp10_kf_partition_probs[PARTITION_CONTEXTS] const vpx_prob vp10_kf_partition_probs[PARTITION_CONTEXTS]
[PARTITION_TYPES - 1] = { [PARTITION_TYPES - 1] = {
// 8x8 -> 4x4 // 8x8 -> 4x4
@@ -183,6 +186,7 @@ const vpx_prob vp10_kf_partition_probs[PARTITION_CONTEXTS]
{ 57, 15, 9 }, // l split, a not split { 57, 15, 9 }, // l split, a not split
{ 12, 3, 3 }, // a/l both split { 12, 3, 3 }, // a/l both split
}; };
#endif
static const vpx_prob default_partition_probs[PARTITION_CONTEXTS] static const vpx_prob default_partition_probs[PARTITION_CONTEXTS]
[PARTITION_TYPES - 1] = { [PARTITION_TYPES - 1] = {
@@ -314,8 +318,16 @@ static const vpx_prob default_switchable_interp_prob[SWITCHABLE_FILTER_CONTEXTS]
{ 149, 144, }, { 149, 144, },
}; };
#if CONFIG_MISC_FIXES
// FIXME(someone) need real defaults here
static const struct segmentation_probs default_seg_probs = {
{ 128, 128, 128, 128, 128, 128, 128 },
{ 128, 128, 128 },
};
#endif
static void init_mode_probs(FRAME_CONTEXT *fc) { static void init_mode_probs(FRAME_CONTEXT *fc) {
vp10_copy(fc->uv_mode_prob, default_if_uv_probs); vp10_copy(fc->uv_mode_prob, default_uv_probs);
vp10_copy(fc->y_mode_prob, default_if_y_probs); vp10_copy(fc->y_mode_prob, default_if_y_probs);
vp10_copy(fc->switchable_interp_prob, default_switchable_interp_prob); vp10_copy(fc->switchable_interp_prob, default_switchable_interp_prob);
vp10_copy(fc->partition_prob, default_partition_probs); vp10_copy(fc->partition_prob, default_partition_probs);
@@ -326,6 +338,10 @@ static void init_mode_probs(FRAME_CONTEXT *fc) {
fc->tx_probs = default_tx_probs; fc->tx_probs = default_tx_probs;
vp10_copy(fc->skip_probs, default_skip_probs); vp10_copy(fc->skip_probs, default_skip_probs);
vp10_copy(fc->inter_mode_probs, default_inter_mode_probs); vp10_copy(fc->inter_mode_probs, default_inter_mode_probs);
#if CONFIG_MISC_FIXES
vp10_copy(fc->seg.tree_probs, default_seg_probs.tree_probs);
vp10_copy(fc->seg.pred_probs, default_seg_probs.pred_probs);
#endif
} }
const vpx_tree_index vp10_switchable_interp_tree const vpx_tree_index vp10_switchable_interp_tree
@@ -334,7 +350,7 @@ const vpx_tree_index vp10_switchable_interp_tree
-EIGHTTAP_SMOOTH, -EIGHTTAP_SHARP -EIGHTTAP_SMOOTH, -EIGHTTAP_SHARP
}; };
void vp10_adapt_mode_probs(VP10_COMMON *cm) { void vp10_adapt_inter_frame_probs(VP10_COMMON *cm) {
int i, j; int i, j;
FRAME_CONTEXT *fc = cm->fc; FRAME_CONTEXT *fc = cm->fc;
const FRAME_CONTEXT *pre_fc = &cm->frame_contexts[cm->frame_context_idx]; const FRAME_CONTEXT *pre_fc = &cm->frame_contexts[cm->frame_context_idx];
@@ -362,6 +378,7 @@ void vp10_adapt_mode_probs(VP10_COMMON *cm) {
vpx_tree_merge_probs(vp10_intra_mode_tree, pre_fc->y_mode_prob[i], vpx_tree_merge_probs(vp10_intra_mode_tree, pre_fc->y_mode_prob[i],
counts->y_mode[i], fc->y_mode_prob[i]); counts->y_mode[i], fc->y_mode_prob[i]);
#if !CONFIG_MISC_FIXES
for (i = 0; i < INTRA_MODES; ++i) for (i = 0; i < INTRA_MODES; ++i)
vpx_tree_merge_probs(vp10_intra_mode_tree, pre_fc->uv_mode_prob[i], vpx_tree_merge_probs(vp10_intra_mode_tree, pre_fc->uv_mode_prob[i],
counts->uv_mode[i], fc->uv_mode_prob[i]); counts->uv_mode[i], fc->uv_mode_prob[i]);
@@ -369,6 +386,7 @@ void vp10_adapt_mode_probs(VP10_COMMON *cm) {
for (i = 0; i < PARTITION_CONTEXTS; i++) for (i = 0; i < PARTITION_CONTEXTS; i++)
vpx_tree_merge_probs(vp10_partition_tree, pre_fc->partition_prob[i], vpx_tree_merge_probs(vp10_partition_tree, pre_fc->partition_prob[i],
counts->partition[i], fc->partition_prob[i]); counts->partition[i], fc->partition_prob[i]);
#endif
if (cm->interp_filter == SWITCHABLE) { if (cm->interp_filter == SWITCHABLE) {
for (i = 0; i < SWITCHABLE_FILTER_CONTEXTS; i++) for (i = 0; i < SWITCHABLE_FILTER_CONTEXTS; i++)
@@ -377,6 +395,13 @@ void vp10_adapt_mode_probs(VP10_COMMON *cm) {
counts->switchable_interp[i], counts->switchable_interp[i],
fc->switchable_interp_prob[i]); fc->switchable_interp_prob[i]);
} }
}
void vp10_adapt_intra_frame_probs(VP10_COMMON *cm) {
int i;
FRAME_CONTEXT *fc = cm->fc;
const FRAME_CONTEXT *pre_fc = &cm->frame_contexts[cm->frame_context_idx];
const FRAME_COUNTS *counts = &cm->counts;
if (cm->tx_mode == TX_MODE_SELECT) { if (cm->tx_mode == TX_MODE_SELECT) {
int j; int j;
@@ -405,6 +430,28 @@ void vp10_adapt_mode_probs(VP10_COMMON *cm) {
for (i = 0; i < SKIP_CONTEXTS; ++i) for (i = 0; i < SKIP_CONTEXTS; ++i)
fc->skip_probs[i] = mode_mv_merge_probs( fc->skip_probs[i] = mode_mv_merge_probs(
pre_fc->skip_probs[i], counts->skip[i]); pre_fc->skip_probs[i], counts->skip[i]);
#if CONFIG_MISC_FIXES
if (cm->seg.temporal_update) {
for (i = 0; i < PREDICTION_PROBS; i++)
fc->seg.pred_probs[i] = mode_mv_merge_probs(pre_fc->seg.pred_probs[i],
counts->seg.pred[i]);
vpx_tree_merge_probs(vp10_segment_tree, pre_fc->seg.tree_probs,
counts->seg.tree_mispred, fc->seg.tree_probs);
} else {
vpx_tree_merge_probs(vp10_segment_tree, pre_fc->seg.tree_probs,
counts->seg.tree_total, fc->seg.tree_probs);
}
for (i = 0; i < INTRA_MODES; ++i)
vpx_tree_merge_probs(vp10_intra_mode_tree, pre_fc->uv_mode_prob[i],
counts->uv_mode[i], fc->uv_mode_prob[i]);
for (i = 0; i < PARTITION_CONTEXTS; i++)
vpx_tree_merge_probs(vp10_partition_tree, pre_fc->partition_prob[i],
counts->partition[i], fc->partition_prob[i]);
#endif
} }
static void set_default_lf_deltas(struct loopfilter *lf) { static void set_default_lf_deltas(struct loopfilter *lf) {
@@ -448,12 +495,12 @@ void vp10_setup_past_independence(VP10_COMMON *cm) {
vp10_init_mv_probs(cm); vp10_init_mv_probs(cm);
cm->fc->initialized = 1; cm->fc->initialized = 1;
if (cm->frame_type == KEY_FRAME || if (cm->frame_type == KEY_FRAME || cm->error_resilient_mode ||
cm->error_resilient_mode || cm->reset_frame_context == 3) { cm->reset_frame_context == RESET_FRAME_CONTEXT_ALL) {
// Reset all frame contexts. // Reset all frame contexts.
for (i = 0; i < FRAME_CONTEXTS; ++i) for (i = 0; i < FRAME_CONTEXTS; ++i)
cm->frame_contexts[i] = *cm->fc; cm->frame_contexts[i] = *cm->fc;
} else if (cm->reset_frame_context == 2) { } else if (cm->reset_frame_context == RESET_FRAME_CONTEXT_CURRENT) {
// Reset only the frame context specified in the frame header. // Reset only the frame context specified in the frame header.
cm->frame_contexts[cm->frame_context_idx] = *cm->fc; cm->frame_contexts[cm->frame_context_idx] = *cm->fc;
} }
@@ -463,7 +510,5 @@ void vp10_setup_past_independence(VP10_COMMON *cm) {
memset(cm->prev_mip, 0, memset(cm->prev_mip, 0,
cm->mi_stride * (cm->mi_rows + 1) * sizeof(*cm->prev_mip)); cm->mi_stride * (cm->mi_rows + 1) * sizeof(*cm->prev_mip));
vp10_zero(cm->ref_frame_sign_bias);
cm->frame_context_idx = 0; cm->frame_context_idx = 0;
} }


@@ -14,6 +14,7 @@
#include "vp10/common/entropy.h" #include "vp10/common/entropy.h"
#include "vp10/common/entropymv.h" #include "vp10/common/entropymv.h"
#include "vp10/common/filter.h" #include "vp10/common/filter.h"
#include "vp10/common/seg_common.h"
#include "vpx_dsp/vpx_filter.h" #include "vpx_dsp/vpx_filter.h"
#ifdef __cplusplus #ifdef __cplusplus
@@ -41,6 +42,12 @@ struct tx_counts {
unsigned int tx_totals[TX_SIZES];
};
struct seg_counts {
unsigned int tree_total[MAX_SEGMENTS];
unsigned int tree_mispred[MAX_SEGMENTS];
unsigned int pred[PREDICTION_PROBS][2];
};
typedef struct frame_contexts {
vpx_prob y_mode_prob[BLOCK_SIZE_GROUPS][INTRA_MODES - 1];
vpx_prob uv_mode_prob[INTRA_MODES][INTRA_MODES - 1];
@@ -56,10 +63,14 @@ typedef struct frame_contexts {
struct tx_probs tx_probs;
vpx_prob skip_probs[SKIP_CONTEXTS];
nmv_context nmvc;
#if CONFIG_MISC_FIXES
struct segmentation_probs seg;
#endif
int initialized;
} FRAME_CONTEXT;
typedef struct FRAME_COUNTS {
unsigned int kf_y_mode[INTRA_MODES][INTRA_MODES][INTRA_MODES];
unsigned int y_mode[BLOCK_SIZE_GROUPS][INTRA_MODES];
unsigned int uv_mode[INTRA_MODES][INTRA_MODES];
unsigned int partition[PARTITION_CONTEXTS][PARTITION_TYPES];
@@ -76,22 +87,30 @@ typedef struct FRAME_COUNTS {
struct tx_counts tx;
unsigned int skip[SKIP_CONTEXTS][2];
nmv_context_counts mv;
#if CONFIG_MISC_FIXES
struct seg_counts seg;
#endif
} FRAME_COUNTS;
extern const vpx_prob vp10_kf_uv_mode_prob[INTRA_MODES][INTRA_MODES - 1];
extern const vpx_prob vp10_kf_y_mode_prob[INTRA_MODES][INTRA_MODES]
[INTRA_MODES - 1];
#if !CONFIG_MISC_FIXES
extern const vpx_prob vp10_kf_uv_mode_prob[INTRA_MODES][INTRA_MODES - 1];
extern const vpx_prob vp10_kf_partition_probs[PARTITION_CONTEXTS]
[PARTITION_TYPES - 1];
#endif
extern const vpx_tree_index vp10_intra_mode_tree[TREE_SIZE(INTRA_MODES)];
extern const vpx_tree_index vp10_inter_mode_tree[TREE_SIZE(INTER_MODES)];
extern const vpx_tree_index vp10_partition_tree[TREE_SIZE(PARTITION_TYPES)];
extern const vpx_tree_index vp10_switchable_interp_tree
[TREE_SIZE(SWITCHABLE_FILTERS)];
void vp10_setup_past_independence(struct VP10Common *cm);
void vp10_adapt_intra_frame_probs(struct VP10Common *cm);
void vp10_adapt_inter_frame_probs(struct VP10Common *cm);
void vp10_tx_counts_to_branch_counts_32x32(const unsigned int *tx_count_32x32p,
unsigned int (*ct_32x32p)[2]);
@@ -100,6 +119,15 @@ void vp10_tx_counts_to_branch_counts_16x16(const unsigned int *tx_count_16x16p,
void vp10_tx_counts_to_branch_counts_8x8(const unsigned int *tx_count_8x8p,
unsigned int (*ct_8x8p)[2]);
static INLINE int vp10_ceil_log2(int n) {
int i = 1, p = 2;
while (p < n) {
i++;
p = p << 1;
}
return i;
}
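A quick sanity check of the helper above (a standalone sketch, not part of the patch): vp10_ceil_log2(n) returns how many bits are needed so that 2^result >= n, with the caveat that it never returns 0.

#include <assert.h>
#include <stdio.h>

/* Standalone copy of the helper, for illustration only. */
static int ceil_log2(int n) {
  int i = 1, p = 2;
  while (p < n) {
    i++;
    p = p << 1;
  }
  return i;
}

int main(void) {
  assert(ceil_log2(2) == 1);
  assert(ceil_log2(4) == 2);
  assert(ceil_log2(5) == 3);   /* rounds up to the next power of two */
  assert(ceil_log2(16) == 4);
  assert(ceil_log2(1) == 1);   /* note: returns 1, not 0, for n <= 2 */
  printf("ceil_log2 checks passed\n");
  return 0;
}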
#ifdef __cplusplus
}  // extern "C"
#endif


@@ -128,8 +128,13 @@ MV_CLASS_TYPE vp10_get_mv_class(int z, int *offset) {
}
int vp10_use_mv_hp(const MV *ref) {
#if CONFIG_MISC_FIXES
(void) ref;
return 1;
#else
return (abs(ref->row) >> 3) < COMPANDED_MVREF_THRESH &&
(abs(ref->col) >> 3) < COMPANDED_MVREF_THRESH;
#endif
}
static void inc_mv_component(int v, nmv_component_counts *comp_counts,
@@ -161,17 +166,19 @@ static void inc_mv_component(int v, nmv_component_counts *comp_counts,
}
}
void vp10_inc_mv(const MV *mv, nmv_context_counts *counts, const int usehp) {
if (counts != NULL) {
const MV_JOINT_TYPE j = vp10_get_mv_joint(mv);
++counts->joints[j];
if (mv_joint_vertical(j)) {
inc_mv_component(mv->row, &counts->comps[0], 1,
!CONFIG_MISC_FIXES || usehp);
}
if (mv_joint_horizontal(j)) {
inc_mv_component(mv->col, &counts->comps[1], 1,
!CONFIG_MISC_FIXES || usehp);
}
}
}


@@ -124,7 +124,7 @@ typedef struct {
nmv_component_counts comps[2];
} nmv_context_counts;
void vp10_inc_mv(const MV *mv, nmv_context_counts *mvctx, const int usehp);
#ifdef __cplusplus
}  // extern "C"


@@ -179,21 +179,24 @@ void vp10_idct32x32_add(const tran_low_t *input, uint8_t *dest, int stride,
}
void vp10_inv_txfm_add_4x4(const tran_low_t *input, uint8_t *dest,
int stride, int eob, TX_TYPE tx_type, int lossless) {
if (lossless) {
assert(tx_type == DCT_DCT);
vp10_iwht4x4_add(input, dest, stride, eob);
} else {
switch (tx_type) {
case DCT_DCT:
vp10_idct4x4_add(input, dest, stride, eob);
break;
case ADST_DCT:
case DCT_ADST:
case ADST_ADST:
vp10_iht4x4_16_add(input, dest, stride, tx_type);
break;
default:
assert(0);
break;
}
}
}
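With this change the 4x4 inverse transform chooses WHT versus DCT/ADST internally, so callers pass a lossless flag instead of a function pointer. A hypothetical call site might look like the sketch below; the function and variable names are illustrative, not taken from the patch.

/* Hypothetical caller sketch: the lossless decision is made once per block
 * and passed down as a flag. */
static void reconstruct_4x4_block(tran_low_t *dqcoeff, uint8_t *dst,
                                  int stride, int eob, TX_TYPE tx_type,
                                  int block_is_lossless) {
  vp10_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, tx_type,
                        block_is_lossless);
}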
@@ -418,21 +421,24 @@ void vp10_highbd_idct32x32_add(const tran_low_t *input, uint8_t *dest,
void vp10_highbd_inv_txfm_add_4x4(const tran_low_t *input, uint8_t *dest,
int stride, int eob, int bd, TX_TYPE tx_type,
int lossless) {
if (lossless) {
assert(tx_type == DCT_DCT);
vp10_highbd_iwht4x4_add(input, dest, stride, eob, bd);
} else {
switch (tx_type) {
case DCT_DCT:
vp10_highbd_idct4x4_add(input, dest, stride, eob, bd);
break;
case ADST_DCT:
case DCT_ADST:
case ADST_ADST:
vp10_highbd_iht4x4_16_add(input, dest, stride, tx_type, bd);
break;
default:
assert(0);
break;
}
}
}


@@ -44,9 +44,7 @@ void vp10_idct4x4_add(const tran_low_t *input, uint8_t *dest, int stride,
int eob);
void vp10_inv_txfm_add_4x4(const tran_low_t *input, uint8_t *dest,
int stride, int eob, TX_TYPE tx_type, int lossless);
void vp10_inv_txfm_add_8x8(const tran_low_t *input, uint8_t *dest,
int stride, int eob, TX_TYPE tx_type);
void vp10_inv_txfm_add_16x16(const tran_low_t *input, uint8_t *dest,
@@ -67,9 +65,7 @@ void vp10_highbd_idct32x32_add(const tran_low_t *input, uint8_t *dest,
int stride, int eob, int bd);
void vp10_highbd_inv_txfm_add_4x4(const tran_low_t *input, uint8_t *dest,
int stride, int eob, int bd, TX_TYPE tx_type,
int lossless);
void vp10_highbd_inv_txfm_add_8x8(const tran_low_t *input, uint8_t *dest,
int stride, int eob, int bd, TX_TYPE tx_type);
void vp10_highbd_inv_txfm_add_16x16(const tran_low_t *input, uint8_t *dest,


@@ -719,7 +719,11 @@ static void build_masks(const loop_filter_info_n *const lfi_n,
uint64_t *const int_4x4_y = &lfm->int_4x4_y;
uint16_t *const left_uv = &lfm->left_uv[tx_size_uv];
uint16_t *const above_uv = &lfm->above_uv[tx_size_uv];
#if CONFIG_MISC_FIXES
uint16_t *const int_4x4_uv = &lfm->left_int_4x4_uv;
#else
uint16_t *const int_4x4_uv = &lfm->int_4x4_uv;
#endif
int i;
// If filter level is 0 we don't loop filter.
@@ -754,8 +758,13 @@ static void build_masks(const loop_filter_info_n *const lfi_n,
// If the block has no coefficients and is not intra we skip applying
// the loop filter on block edges.
#if CONFIG_MISC_FIXES
if ((mbmi->skip || mbmi->has_no_coeffs) && is_inter_block(mbmi))
return;
#else
if (mbmi->skip && is_inter_block(mbmi))
return;
#endif
// Here we are adding a mask for the transform size. The transform
// size mask is set to be correct for a 64x64 prediction block size. We
@@ -812,8 +821,13 @@ static void build_y_mask(const loop_filter_info_n *const lfi_n,
*above_y |= above_prediction_mask[block_size] << shift_y;
*left_y |= left_prediction_mask[block_size] << shift_y;
#if CONFIG_MISC_FIXES
if ((mbmi->skip || mbmi->has_no_coeffs) && is_inter_block(mbmi))
return;
#else
if (mbmi->skip && is_inter_block(mbmi))
return;
#endif
*above_y |= (size_mask[block_size] &
above_64x64_txform_mask[tx_size_y]) << shift_y;
@@ -1005,7 +1019,11 @@ void vp10_setup_mask(VP10_COMMON *const cm, const int mi_row, const int mi_col,
lfm->above_uv[i] &= mask_uv;
}
lfm->int_4x4_y &= mask_y;
#if CONFIG_MISC_FIXES
lfm->above_int_4x4_uv = lfm->left_int_4x4_uv & mask_uv;
#else
lfm->int_4x4_uv &= mask_uv;
#endif
// We don't apply a wide loop filter on the last uv block row. If set
// apply the shorter one instead.
@@ -1039,7 +1057,11 @@ void vp10_setup_mask(VP10_COMMON *const cm, const int mi_row, const int mi_col,
lfm->above_uv[i] &= mask_uv;
}
lfm->int_4x4_y &= mask_y;
#if CONFIG_MISC_FIXES
lfm->left_int_4x4_uv &= mask_uv_int;
#else
lfm->int_4x4_uv &= mask_uv_int;
#endif
// We don't apply a wide loop filter on the last uv column. If set
// apply the shorter one instead.
@@ -1069,7 +1091,11 @@ void vp10_setup_mask(VP10_COMMON *const cm, const int mi_row, const int mi_col,
assert(!(lfm->left_uv[TX_16X16] & lfm->left_uv[TX_8X8]));
assert(!(lfm->left_uv[TX_16X16] & lfm->left_uv[TX_4X4]));
assert(!(lfm->left_uv[TX_8X8] & lfm->left_uv[TX_4X4]));
#if CONFIG_MISC_FIXES
assert(!(lfm->left_int_4x4_uv & lfm->left_uv[TX_16X16]));
#else
assert(!(lfm->int_4x4_uv & lfm->left_uv[TX_16X16]));
#endif
assert(!(lfm->above_y[TX_16X16] & lfm->above_y[TX_8X8]));
assert(!(lfm->above_y[TX_16X16] & lfm->above_y[TX_4X4]));
assert(!(lfm->above_y[TX_8X8] & lfm->above_y[TX_4X4]));
@@ -1077,7 +1103,11 @@ void vp10_setup_mask(VP10_COMMON *const cm, const int mi_row, const int mi_col,
assert(!(lfm->above_uv[TX_16X16] & lfm->above_uv[TX_8X8]));
assert(!(lfm->above_uv[TX_16X16] & lfm->above_uv[TX_4X4]));
assert(!(lfm->above_uv[TX_8X8] & lfm->above_uv[TX_4X4]));
#if CONFIG_MISC_FIXES
assert(!(lfm->above_int_4x4_uv & lfm->above_uv[TX_16X16]));
#else
assert(!(lfm->int_4x4_uv & lfm->above_uv[TX_16X16]));
#endif
}
static void filter_selectively_vert(uint8_t *s, int pitch,
@@ -1432,7 +1462,11 @@ void vp10_filter_block_plane_ss11(VP10_COMMON *const cm,
uint16_t mask_16x16 = lfm->left_uv[TX_16X16];
uint16_t mask_8x8 = lfm->left_uv[TX_8X8];
uint16_t mask_4x4 = lfm->left_uv[TX_4X4];
#if CONFIG_MISC_FIXES
uint16_t mask_4x4_int = lfm->left_int_4x4_uv;
#else
uint16_t mask_4x4_int = lfm->int_4x4_uv;
#endif
assert(plane->subsampling_x == 1 && plane->subsampling_y == 1);
@@ -1484,7 +1518,11 @@ void vp10_filter_block_plane_ss11(VP10_COMMON *const cm,
mask_16x16 = lfm->above_uv[TX_16X16];
mask_8x8 = lfm->above_uv[TX_8X8];
mask_4x4 = lfm->above_uv[TX_4X4];
#if CONFIG_MISC_FIXES
mask_4x4_int = lfm->above_int_4x4_uv;
#else
mask_4x4_int = lfm->int_4x4_uv;
#endif
for (r = 0; r < MI_BLOCK_SIZE && mi_row + r < cm->mi_rows; r += 2) {
const int skip_border_4x4_r = mi_row + r == cm->mi_rows - 1;


@@ -80,7 +80,12 @@ typedef struct {
uint64_t int_4x4_y;
uint16_t left_uv[TX_SIZES];
uint16_t above_uv[TX_SIZES];
#if CONFIG_MISC_FIXES
uint16_t left_int_4x4_uv;
uint16_t above_int_4x4_uv;
#else
uint16_t int_4x4_uv;
#endif
uint8_t lfl_y[64];
uint8_t lfl_uv[16];
} LOOP_FILTER_MASK;


@@ -27,9 +27,13 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
const MV_REF *const prev_frame_mvs = cm->use_prev_frame_mvs ?
cm->prev_frame->mvs + mi_row * cm->mi_cols + mi_col : NULL;
const TileInfo *const tile = &xd->tile;
const int bw = num_8x8_blocks_wide_lookup[mi->mbmi.sb_type] << 3;
const int bh = num_8x8_blocks_high_lookup[mi->mbmi.sb_type] << 3;
#if !CONFIG_MISC_FIXES
// Blank the reference vector list
memset(mv_ref_list, 0, sizeof(*mv_ref_list) * MAX_MV_REF_CANDIDATES);
#endif
// The nearest 2 blocks are treated differently
// if the size < 8x8 we get the mv from the bmi substructure,
@@ -46,10 +50,10 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
if (candidate->ref_frame[0] == ref_frame)
ADD_MV_REF_LIST(get_sub_block_mv(candidate_mi, 0, mv_ref->col, block),
refmv_count, mv_ref_list, bw, bh, xd, Done);
else if (candidate->ref_frame[1] == ref_frame)
ADD_MV_REF_LIST(get_sub_block_mv(candidate_mi, 1, mv_ref->col, block),
refmv_count, mv_ref_list, bw, bh, xd, Done);
}
}
@@ -64,9 +68,11 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
different_ref_found = 1;
if (candidate->ref_frame[0] == ref_frame)
ADD_MV_REF_LIST(candidate->mv[0], refmv_count, mv_ref_list,
bw, bh, xd, Done);
else if (candidate->ref_frame[1] == ref_frame)
ADD_MV_REF_LIST(candidate->mv[1], refmv_count, mv_ref_list,
bw, bh, xd, Done);
}
}
@@ -88,9 +94,11 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
}
if (prev_frame_mvs->ref_frame[0] == ref_frame) {
ADD_MV_REF_LIST(prev_frame_mvs->mv[0], refmv_count, mv_ref_list,
bw, bh, xd, Done);
} else if (prev_frame_mvs->ref_frame[1] == ref_frame) {
ADD_MV_REF_LIST(prev_frame_mvs->mv[1], refmv_count, mv_ref_list,
bw, bh, xd, Done);
}
}
@@ -106,7 +114,7 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
// If the candidate is INTRA we don't want to consider its mv.
IF_DIFF_REF_FRAME_ADD_MV(candidate, ref_frame, ref_sign_bias,
refmv_count, mv_ref_list, bw, bh, xd, Done);
}
}
}
@@ -121,19 +129,21 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
mv.as_mv.row *= -1;
mv.as_mv.col *= -1;
}
ADD_MV_REF_LIST(mv, refmv_count, mv_ref_list, bw, bh, xd, Done);
}
if (prev_frame_mvs->ref_frame[1] > INTRA_FRAME &&
#if !CONFIG_MISC_FIXES
prev_frame_mvs->mv[1].as_int != prev_frame_mvs->mv[0].as_int &&
#endif
prev_frame_mvs->ref_frame[1] != ref_frame) {
int_mv mv = prev_frame_mvs->mv[1];
if (ref_sign_bias[prev_frame_mvs->ref_frame[1]] !=
ref_sign_bias[ref_frame]) {
mv.as_mv.row *= -1;
mv.as_mv.col *= -1;
}
ADD_MV_REF_LIST(mv, refmv_count, mv_ref_list, bw, bh, xd, Done);
}
}
@@ -141,9 +151,14 @@ static void find_mv_refs_idx(const VP10_COMMON *cm, const MACROBLOCKD *xd,
mode_context[ref_frame] = counter_to_context[context_counter];
#if CONFIG_MISC_FIXES
for (i = refmv_count; i < MAX_MV_REF_CANDIDATES; ++i)
mv_ref_list[i].as_int = 0;
#else
// Clamp vectors
for (i = 0; i < MAX_MV_REF_CANDIDATES; ++i)
clamp_mv_ref(&mv_ref_list[i].as_mv, bw, bh, xd);
#endif
}
void vp10_find_mv_refs(const VP10_COMMON *cm, const MACROBLOCKD *xd,
@@ -166,14 +181,13 @@ static void lower_mv_precision(MV *mv, int allow_hp) {
}
}
void vp10_find_best_ref_mvs(int allow_hp,
int_mv *mvlist, int_mv *nearest_mv,
int_mv *near_mv) {
int i;
// Make sure all the candidates are properly clamped etc
for (i = 0; i < MAX_MV_REF_CANDIDATES; ++i) {
lower_mv_precision(&mvlist[i].as_mv, allow_hp);
}
*nearest_mv = mvlist[0];
*near_mv = mvlist[1];


@@ -17,10 +17,6 @@
extern "C" { extern "C" {
#endif #endif
#define LEFT_TOP_MARGIN ((VP9_ENC_BORDER_IN_PIXELS - VP9_INTERP_EXTEND) << 3)
#define RIGHT_BOTTOM_MARGIN ((VP9_ENC_BORDER_IN_PIXELS -\
VP9_INTERP_EXTEND) << 3)
#define MVREF_NEIGHBOURS 8
typedef struct position {
@@ -123,13 +119,26 @@ static const int idx_n_column_to_subblock[4][2] = {
};
// clamp_mv_ref
#if CONFIG_MISC_FIXES
#define MV_BORDER (8 << 3)  // Allow 8 pels in 1/8th pel units
#else
#define MV_BORDER (16 << 3)  // Allow 16 pels in 1/8th pel units
#endif
static INLINE void clamp_mv_ref(MV *mv, int bw, int bh, const MACROBLOCKD *xd) {
#if CONFIG_MISC_FIXES
clamp_mv(mv, xd->mb_to_left_edge - bw * 8 - MV_BORDER,
xd->mb_to_right_edge + bw * 8 + MV_BORDER,
xd->mb_to_top_edge - bh * 8 - MV_BORDER,
xd->mb_to_bottom_edge + bh * 8 + MV_BORDER);
#else
(void) bw;
(void) bh;
clamp_mv(mv, xd->mb_to_left_edge - MV_BORDER,
xd->mb_to_right_edge + MV_BORDER,
xd->mb_to_top_edge - MV_BORDER,
xd->mb_to_bottom_edge + MV_BORDER);
#endif
}
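A quick arithmetic check of the new bounds (an illustrative, standalone sketch, not code from the patch): motion vectors are in 1/8-pel units, so MV_BORDER = 8 << 3 = 64 is 8 whole pixels under CONFIG_MISC_FIXES, and the bw * 8 / bh * 8 terms convert the block width/height in pixels into the same units.

#include <stdio.h>

/* Illustrative numbers only: a 16x16 block whose top-left corner sits at
 * pixel column 32 of the frame.  mb_to_left_edge is the (negative) distance
 * to the left frame edge in 1/8-pel units, as used by clamp_mv_ref(). */
int main(void) {
  const int bw = 16;                      /* block width in pixels */
  const int mb_to_left_edge = -(32 * 8);  /* -256 in 1/8-pel units */
  const int mv_border = 8 << 3;           /* 64 = 8 pixels, CONFIG_MISC_FIXES */
  const int min_col = mb_to_left_edge - bw * 8 - mv_border;
  /* min_col = -256 - 128 - 64 = -448, i.e. -56 pixels: the predictor may
   * start 56 pixels left of the block, so the 16-pixel-wide predictor can
   * sit entirely outside the frame, but no more than MV_BORDER (8 pixels)
   * beyond the left edge. */
  printf("minimum column MV in 1/8-pel units: %d\n", min_col);
  return 0;
}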
// This function returns either the appropriate sub block or block's mv // This function returns either the appropriate sub block or block's mv
@@ -155,35 +164,41 @@ static INLINE int_mv scale_mv(const MB_MODE_INFO *mbmi, int ref,
return mv;
}
#if CONFIG_MISC_FIXES
#define CLIP_IN_ADD(mv, bw, bh, xd) clamp_mv_ref(mv, bw, bh, xd)
#else
#define CLIP_IN_ADD(mv, bw, bh, xd) do {} while (0)
#endif
// This macro is used to add a motion vector to the mv_ref list if it isn't
// already in the list. If it's the second motion vector it will also
// skip all additional processing and jump to Done!
#define ADD_MV_REF_LIST(mv, refmv_count, mv_ref_list, bw, bh, xd, Done) \
do { \
(mv_ref_list)[(refmv_count)] = (mv); \
CLIP_IN_ADD(&(mv_ref_list)[(refmv_count)].as_mv, (bw), (bh), (xd)); \
if (refmv_count && (mv_ref_list)[1].as_int != (mv_ref_list)[0].as_int) { \
(refmv_count) = 2; \
goto Done; \
} \
(refmv_count) = 1; \
} while (0)
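The reworked macro always writes the candidate into the current slot, clamps it in place, and only declares the list complete once two distinct vectors have been collected. A simplified, standalone analog of that counting logic (illustrative only; the real macro also clamps the vector and uses goto to exit early):

#include <assert.h>

static int add_candidate(int list[2], int count, int mv) {
  list[count] = mv;                           /* always write the current slot */
  if (count && list[1] != list[0]) return 2;  /* second distinct mv: done */
  return 1;                                   /* first mv, or a duplicate */
}

int main(void) {
  int list[2] = { 0, 0 };
  int n = 0;
  n = add_candidate(list, n, 5);   /* first candidate */
  assert(n == 1);
  n = add_candidate(list, n, 5);   /* duplicate is ignored */
  assert(n == 1);
  n = add_candidate(list, n, 7);   /* second distinct candidate */
  assert(n == 2 && list[0] == 5 && list[1] == 7);
  return 0;
}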
// If either reference frame is different, not INTRA, and they
// are different from each other, scale and add the mv to our list.
#define IF_DIFF_REF_FRAME_ADD_MV(mbmi, ref_frame, ref_sign_bias, refmv_count, \
mv_ref_list, bw, bh, xd, Done) \
do { \
if (is_inter_block(mbmi)) { \
if ((mbmi)->ref_frame[0] != ref_frame) \
ADD_MV_REF_LIST(scale_mv((mbmi), 0, ref_frame, ref_sign_bias), \
refmv_count, mv_ref_list, bw, bh, xd, Done); \
if (has_second_ref(mbmi) && \
(CONFIG_MISC_FIXES || \
(mbmi)->mv[1].as_int != (mbmi)->mv[0].as_int) && \
(mbmi)->ref_frame[1] != ref_frame) \
ADD_MV_REF_LIST(scale_mv((mbmi), 1, ref_frame, ref_sign_bias), \
refmv_count, mv_ref_list, bw, bh, xd, Done); \
} \
} while (0)
@@ -199,14 +214,6 @@ static INLINE int is_inside(const TileInfo *const tile,
mi_col + mi_pos->col >= tile->mi_col_end);
}
// TODO(jingning): this mv clamping function should be block size dependent.
static INLINE void clamp_mv2(MV *mv, const MACROBLOCKD *xd) {
clamp_mv(mv, xd->mb_to_left_edge - LEFT_TOP_MARGIN,
xd->mb_to_right_edge + RIGHT_BOTTOM_MARGIN,
xd->mb_to_top_edge - LEFT_TOP_MARGIN,
xd->mb_to_bottom_edge + RIGHT_BOTTOM_MARGIN);
}
typedef void (*find_mv_refs_sync)(void *const data, int mi_row);
void vp10_find_mv_refs(const VP10_COMMON *cm, const MACROBLOCKD *xd,
MODE_INFO *mi, MV_REFERENCE_FRAME ref_frame,
@@ -217,7 +224,7 @@ void vp10_find_mv_refs(const VP10_COMMON *cm, const MACROBLOCKD *xd,
// Check a list of motion vectors by SAD score, using a number of rows of
// pixels above and a number of columns of pixels to the left, to select the
// one with the best score to use as the ref motion vector.
void vp10_find_best_ref_mvs(int allow_hp,
int_mv *mvlist, int_mv *nearest_mv, int_mv *near_mv);
void vp10_append_sub8x8_mvs_for_idx(VP10_COMMON *cm, MACROBLOCKD *xd,


@@ -57,6 +57,29 @@ typedef enum {
REFERENCE_MODES = 3,
} REFERENCE_MODE;
typedef enum {
RESET_FRAME_CONTEXT_NONE = 0,
RESET_FRAME_CONTEXT_CURRENT = 1,
RESET_FRAME_CONTEXT_ALL = 2,
} RESET_FRAME_CONTEXT_MODE;
typedef enum {
/**
* Don't update frame context
*/
REFRESH_FRAME_CONTEXT_OFF,
/**
* Update frame context to values resulting from forward probability
* updates signaled in the frame header
*/
REFRESH_FRAME_CONTEXT_FORWARD,
/**
* Update frame context to values resulting from backward probability
* updates based on entropy/counts in the decoded frame
*/
REFRESH_FRAME_CONTEXT_BACKWARD,
} REFRESH_FRAME_CONTEXT_MODE;
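The refresh modes above control what happens to cm->fc once a frame finishes decoding. The following is a hedged sketch of how a decoder might branch on them; the adapt_* names match the declarations introduced in this change, but the surrounding control flow is illustrative only, not copied from the patch.

/* Illustrative only: roughly how a decoder could act on the refresh mode. */
static void maybe_refresh_frame_context(VP10_COMMON *cm, int intra_only) {
  if (cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD) {
    /* Backward adaptation: fold the counts gathered while decoding this
     * frame back into the probabilities. */
    if (intra_only)
      vp10_adapt_intra_frame_probs(cm);
    else
      vp10_adapt_inter_frame_probs(cm);
  }
  if (cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_OFF)
    cm->frame_contexts[cm->frame_context_idx] = *cm->fc;
}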
typedef struct {
int_mv mv[2];
MV_REFERENCE_FRAME ref_frame[2];
@@ -106,10 +129,11 @@ typedef struct BufferPool {
typedef struct VP10Common {
struct vpx_internal_error_info error;
vpx_color_space_t color_space;
int color_range;
int width;
int height;
int render_width;
int render_height;
int last_width;
int last_height;
@@ -161,10 +185,8 @@ typedef struct VP10Common {
int allow_high_precision_mv;
// Flag signaling which frame contexts should be reset to default values.
RESET_FRAME_CONTEXT_MODE reset_frame_context;
// MBs, mb_rows/cols is in 16-pixel units; mi_rows/cols is in
// MODE_INFO (8-pixel) units.
@@ -222,15 +244,18 @@ typedef struct VP10Common {
loop_filter_info_n lf_info;
// Flag signaling how frame contexts should be updated at the end of
// a frame decode
REFRESH_FRAME_CONTEXT_MODE refresh_frame_context;
int ref_frame_sign_bias[MAX_REF_FRAMES];  /* Two state 0, 1 */
struct loopfilter lf;
struct segmentation seg;
#if !CONFIG_MISC_FIXES
struct segmentation_probs segp;
#endif
// TODO(hkuang): Remove this as it is the same as frame_parallel_decode
// in pbi.
int frame_parallel_decode;  // frame-based threading.
// Context probabilities for reference frame prediction
@@ -255,9 +280,9 @@ typedef struct VP10Common {
#endif
int error_resilient_mode;
int frame_parallel_decoding_mode;
int log2_tile_cols, log2_tile_rows;
int tile_sz_mag;
int byte_alignment;
int skip_loop_filter;
@@ -275,6 +300,11 @@ typedef struct VP10Common {
PARTITION_CONTEXT *above_seg_context;
ENTROPY_CONTEXT *above_context;
int above_context_alloc_cols;
// scratch memory for intraonly/keyframe forward updates from default tables
// - this is intentionally not placed in FRAME_CONTEXT since it's reset upon
// each keyframe and not used afterwards
vpx_prob kf_y_prob[INTRA_MODES][INTRA_MODES][INTRA_MODES - 1];
} VP10_COMMON;
// TODO(hkuang): Don't need to lock the whole pool after implementing atomic
@@ -347,14 +377,6 @@ static INLINE int frame_is_intra_only(const VP10_COMMON *const cm) {
return cm->frame_type == KEY_FRAME || cm->intra_only;
}
static INLINE void set_partition_probs(const VP10_COMMON *const cm,
MACROBLOCKD *const xd) {
xd->partition_probs =
frame_is_intra_only(cm) ?
&vp10_kf_partition_probs[0] :
(const vpx_prob (*)[PARTITION_TYPES - 1])cm->fc->partition_prob;
}
static INLINE void vp10_init_macroblockd(VP10_COMMON *cm, MACROBLOCKD *xd,
tran_low_t *dqcoeff) {
int i;
@@ -370,19 +392,11 @@ static INLINE void vp10_init_macroblockd(VP10_COMMON *cm, MACROBLOCKD *xd,
memcpy(xd->plane[i].seg_dequant, cm->uv_dequant, sizeof(cm->uv_dequant));
}
xd->fc = cm->fc;
xd->frame_parallel_decoding_mode = cm->frame_parallel_decoding_mode;
}
xd->above_seg_context = cm->above_seg_context;
xd->mi_stride = cm->mi_stride;
xd->error_info = &cm->error;
set_partition_probs(cm, xd);
}
static INLINE const vpx_prob* get_partition_probs(const MACROBLOCKD *xd,
int ctx) {
return xd->partition_probs[ctx];
}
static INLINE void set_skip_context(MACROBLOCKD *xd, int mi_row, int mi_col) {
@@ -432,6 +446,16 @@ static INLINE void set_mi_row_col(MACROBLOCKD *xd, const TileInfo *const tile,
}
}
static INLINE const vpx_prob *get_y_mode_probs(const VP10_COMMON *cm,
const MODE_INFO *mi,
const MODE_INFO *above_mi,
const MODE_INFO *left_mi,
int block) {
const PREDICTION_MODE above = vp10_above_block_mode(mi, above_mi, block);
const PREDICTION_MODE left = vp10_left_block_mode(mi, left_mi, block);
return cm->kf_y_prob[above][left];
}
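A hedged sketch of how this helper might be used on a keyframe, together with the cm->kf_y_prob scratch memory added above: the vpx_read_tree and vp10_intra_mode_tree symbols are existing libvpx/vp10 names, but the surrounding function and the exact call site are hypothetical, not taken from the patch.

/* Illustrative only: read a keyframe luma intra mode using the
 * context-dependent probabilities returned by get_y_mode_probs(). */
static PREDICTION_MODE read_kf_y_mode(VP10_COMMON *cm, MACROBLOCKD *xd,
                                      MODE_INFO *mi, vpx_reader *r,
                                      int block) {
  const MODE_INFO *above_mi = xd->above_mi;
  const MODE_INFO *left_mi = xd->left_mi;
  const vpx_prob *probs = get_y_mode_probs(cm, mi, above_mi, left_mi, block);
  return (PREDICTION_MODE)vpx_read_tree(r, vp10_intra_mode_tree, probs);
}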
static INLINE void update_partition_context(MACROBLOCKD *xd,
int mi_row, int mi_col,
BLOCK_SIZE subsize,


@@ -48,9 +48,9 @@ static INLINE int vp10_get_pred_context_seg_id(const MACROBLOCKD *xd) {
return above_sip + left_sip;
}
static INLINE vpx_prob vp10_get_pred_prob_seg_id(
const struct segmentation_probs *segp, const MACROBLOCKD *xd) {
return segp->pred_probs[vp10_get_pred_context_seg_id(xd)];
}
static INLINE int vp10_get_skip_context(const MACROBLOCKD *xd) {


@@ -128,6 +128,53 @@ void build_inter_predictors(MACROBLOCKD *xd, int plane, int block,
}
}
void vp10_build_inter_predictor_sub8x8(MACROBLOCKD *xd, int plane,
int i, int ir, int ic,
int mi_row, int mi_col) {
struct macroblockd_plane *const pd = &xd->plane[plane];
MODE_INFO *const mi = xd->mi[0];
const BLOCK_SIZE plane_bsize = get_plane_block_size(mi->mbmi.sb_type, pd);
const int width = 4 * num_4x4_blocks_wide_lookup[plane_bsize];
const int height = 4 * num_4x4_blocks_high_lookup[plane_bsize];
uint8_t *const dst = &pd->dst.buf[(ir * pd->dst.stride + ic) << 2];
int ref;
const int is_compound = has_second_ref(&mi->mbmi);
const InterpKernel *kernel = vp10_filter_kernels[mi->mbmi.interp_filter];
for (ref = 0; ref < 1 + is_compound; ++ref) {
const uint8_t *pre =
&pd->pre[ref].buf[(ir * pd->pre[ref].stride + ic) << 2];
#if CONFIG_VP9_HIGHBITDEPTH
if (xd->cur_buf->flags & YV12_FLAG_HIGHBITDEPTH) {
vp10_highbd_build_inter_predictor(pre, pd->pre[ref].stride,
dst, pd->dst.stride,
&mi->bmi[i].as_mv[ref].as_mv,
&xd->block_refs[ref]->sf, width, height,
ref, kernel, MV_PRECISION_Q3,
mi_col * MI_SIZE + 4 * ic,
mi_row * MI_SIZE + 4 * ir, xd->bd);
} else {
vp10_build_inter_predictor(pre, pd->pre[ref].stride,
dst, pd->dst.stride,
&mi->bmi[i].as_mv[ref].as_mv,
&xd->block_refs[ref]->sf, width, height, ref,
kernel, MV_PRECISION_Q3,
mi_col * MI_SIZE + 4 * ic,
mi_row * MI_SIZE + 4 * ir);
}
#else
vp10_build_inter_predictor(pre, pd->pre[ref].stride,
dst, pd->dst.stride,
&mi->bmi[i].as_mv[ref].as_mv,
&xd->block_refs[ref]->sf, width, height, ref,
kernel, MV_PRECISION_Q3,
mi_col * MI_SIZE + 4 * ic,
mi_row * MI_SIZE + 4 * ir);
#endif // CONFIG_VP9_HIGHBITDEPTH
}
}
static void build_inter_predictors_for_planes(MACROBLOCKD *xd, BLOCK_SIZE bsize,
int mi_row, int mi_col,
int plane_from, int plane_to) {
@@ -135,20 +182,26 @@ static void build_inter_predictors_for_planes(MACROBLOCKD *xd, BLOCK_SIZE bsize,
const int mi_x = mi_col * MI_SIZE;
const int mi_y = mi_row * MI_SIZE;
for (plane = plane_from; plane <= plane_to; ++plane) {
const struct macroblockd_plane *pd = &xd->plane[plane];
const int bw = 4 * num_4x4_blocks_wide_lookup[bsize] >> pd->subsampling_x;
const int bh = 4 * num_4x4_blocks_high_lookup[bsize] >> pd->subsampling_y;
if (xd->mi[0]->mbmi.sb_type < BLOCK_8X8) {
const PARTITION_TYPE bp = bsize - xd->mi[0]->mbmi.sb_type;
const int have_vsplit = bp != PARTITION_HORZ;
const int have_hsplit = bp != PARTITION_VERT;
const int num_4x4_w = 2 >> ((!have_vsplit) | pd->subsampling_x);
const int num_4x4_h = 2 >> ((!have_hsplit) | pd->subsampling_y);
const int pw = 8 >> (have_vsplit | pd->subsampling_x);
const int ph = 8 >> (have_hsplit | pd->subsampling_y);
int x, y;
assert(bp != PARTITION_NONE && bp < PARTITION_TYPES);
assert(bsize == BLOCK_8X8);
assert(pw * num_4x4_w == bw && ph * num_4x4_h == bh);
for (y = 0; y < num_4x4_h; ++y)
for (x = 0; x < num_4x4_w; ++x)
build_inter_predictors(xd, plane, y * 2 + x, bw, bh,
4 * x, 4 * y, pw, ph, mi_x, mi_y);
} else {
build_inter_predictors(xd, plane, 0, bw, bh,
0, 0, bw, bh, mi_x, mi_y);
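To sanity-check the sub-8x8 arithmetic above, here is a standalone sketch (the enum values mirror vp10's partition ordering; everything else is illustrative): a BLOCK_4X8 split of an 8x8 block yields a 2x1 grid of 4x8 luma predictions, while a 4:2:0 chroma plane collapses to a single 4x4 prediction.

#include <stdio.h>

/* Partition values as in vp10: NONE=0, HORZ=1, VERT=2, SPLIT=3. */
enum { PARTITION_NONE, PARTITION_HORZ, PARTITION_VERT, PARTITION_SPLIT };

static void sub8x8_layout(int bp, int ss_x, int ss_y) {
  const int have_vsplit = bp != PARTITION_HORZ;
  const int have_hsplit = bp != PARTITION_VERT;
  const int num_4x4_w = 2 >> ((!have_vsplit) | ss_x);
  const int num_4x4_h = 2 >> ((!have_hsplit) | ss_y);
  const int pw = 8 >> (have_vsplit | ss_x);
  const int ph = 8 >> (have_hsplit | ss_y);
  printf("bp=%d ss=(%d,%d): %dx%d predictions of %dx%d pixels\n",
         bp, ss_x, ss_y, num_4x4_w, num_4x4_h, pw, ph);
}

int main(void) {
  sub8x8_layout(PARTITION_VERT, 0, 0);   /* BLOCK_4X8 luma: 2x1 of 4x8 */
  sub8x8_layout(PARTITION_VERT, 1, 1);   /* BLOCK_4X8 chroma 4:2:0: 1x1 of 4x4 */
  sub8x8_layout(PARTITION_SPLIT, 0, 0);  /* BLOCK_4X4 luma: 2x2 of 4x4 */
  return 0;
}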


@@ -131,6 +131,10 @@ void build_inter_predictors(MACROBLOCKD *xd, int plane, int block,
int x, int y, int w, int h,
int mi_x, int mi_y);
void vp10_build_inter_predictor_sub8x8(MACROBLOCKD *xd, int plane,
int i, int ir, int ic,
int mi_row, int mi_col);
void vp10_build_inter_predictors_sby(MACROBLOCKD *xd, int mi_row, int mi_col,
BLOCK_SIZE bsize);


@@ -21,6 +21,28 @@
#include "vp10/common/reconintra.h" #include "vp10/common/reconintra.h"
#include "vp10/common/onyxc_int.h" #include "vp10/common/onyxc_int.h"
#if CONFIG_MISC_FIXES
enum {
NEED_LEFT = 1 << 1,
NEED_ABOVE = 1 << 2,
NEED_ABOVERIGHT = 1 << 3,
NEED_ABOVELEFT = 1 << 4,
NEED_BOTTOMLEFT = 1 << 5,
};
static const uint8_t extend_modes[INTRA_MODES] = {
NEED_ABOVE | NEED_LEFT, // DC
NEED_ABOVE, // V
NEED_LEFT, // H
NEED_ABOVE | NEED_ABOVERIGHT, // D45
NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT, // D135
NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT, // D117
NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT, // D153
NEED_LEFT | NEED_BOTTOMLEFT, // D207
NEED_ABOVE | NEED_ABOVERIGHT, // D63
NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT, // TM
};
#else
enum {
NEED_LEFT = 1 << 1,
NEED_ABOVE = 1 << 2,
@@ -39,6 +61,134 @@ static const uint8_t extend_modes[INTRA_MODES] = {
NEED_ABOVERIGHT,  // D63
NEED_LEFT | NEED_ABOVE,  // TM
};
#endif
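The NEED_* flags encode which neighbor samples each intra predictor reads. For example, under CONFIG_MISC_FIXES extend_modes[D135_PRED] is NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT: the down-right diagonal predictor needs the left column, the above row and the above-left corner, but neither the above-right nor the bottom-left extension. A small standalone check (constants copied from the table above; the program itself is illustrative):

#include <assert.h>

enum {
  NEED_LEFT = 1 << 1,
  NEED_ABOVE = 1 << 2,
  NEED_ABOVERIGHT = 1 << 3,
  NEED_ABOVELEFT = 1 << 4,
  NEED_BOTTOMLEFT = 1 << 5,
};

int main(void) {
  const int d135 = NEED_LEFT | NEED_ABOVE | NEED_ABOVELEFT;
  const int d207 = NEED_LEFT | NEED_BOTTOMLEFT;
  assert((d135 & NEED_ABOVERIGHT) == 0);  /* no above-right samples needed */
  assert((d207 & NEED_ABOVE) == 0);       /* purely left-based predictor */
  return 0;
}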
#if CONFIG_MISC_FIXES
static const uint8_t orders_64x64[1] = { 0 };
static const uint8_t orders_64x32[2] = { 0, 1 };
static const uint8_t orders_32x64[2] = { 0, 1 };
static const uint8_t orders_32x32[4] = {
0, 1,
2, 3,
};
static const uint8_t orders_32x16[8] = {
0, 2,
1, 3,
4, 6,
5, 7,
};
static const uint8_t orders_16x32[8] = {
0, 1, 2, 3,
4, 5, 6, 7,
};
static const uint8_t orders_16x16[16] = {
0, 1, 4, 5,
2, 3, 6, 7,
8, 9, 12, 13,
10, 11, 14, 15,
};
static const uint8_t orders_16x8[32] = {
0, 2, 8, 10,
1, 3, 9, 11,
4, 6, 12, 14,
5, 7, 13, 15,
16, 18, 24, 26,
17, 19, 25, 27,
20, 22, 28, 30,
21, 23, 29, 31,
};
static const uint8_t orders_8x16[32] = {
0, 1, 2, 3, 8, 9, 10, 11,
4, 5, 6, 7, 12, 13, 14, 15,
16, 17, 18, 19, 24, 25, 26, 27,
20, 21, 22, 23, 28, 29, 30, 31,
};
static const uint8_t orders_8x8[64] = {
0, 1, 4, 5, 16, 17, 20, 21,
2, 3, 6, 7, 18, 19, 22, 23,
8, 9, 12, 13, 24, 25, 28, 29,
10, 11, 14, 15, 26, 27, 30, 31,
32, 33, 36, 37, 48, 49, 52, 53,
34, 35, 38, 39, 50, 51, 54, 55,
40, 41, 44, 45, 56, 57, 60, 61,
42, 43, 46, 47, 58, 59, 62, 63,
};
static const uint8_t *const orders[BLOCK_SIZES] = {
orders_8x8, orders_8x8, orders_8x8, orders_8x8,
orders_8x16, orders_16x8, orders_16x16,
orders_16x32, orders_32x16, orders_32x32,
orders_32x64, orders_64x32, orders_64x64,
};
static int vp10_has_right(BLOCK_SIZE bsize, int mi_row, int mi_col,
int right_available,
TX_SIZE txsz, int y, int x, int ss_x) {
if (y == 0) {
int wl = mi_width_log2_lookup[bsize];
int hl = mi_height_log2_lookup[bsize];
int w = 1 << (wl + 1 - ss_x);
int step = 1 << txsz;
const uint8_t *order = orders[bsize];
int my_order, tr_order;
if (x + step < w)
return 1;
mi_row = (mi_row & 7) >> hl;
mi_col = (mi_col & 7) >> wl;
if (mi_row == 0)
return right_available;
if (((mi_col + 1) << wl) >= 8)
return 0;
my_order = order[((mi_row + 0) << (3 - wl)) + mi_col + 0];
tr_order = order[((mi_row - 1) << (3 - wl)) + mi_col + 1];
return my_order > tr_order && right_available;
} else {
int wl = mi_width_log2_lookup[bsize];
int w = 1 << (wl + 1 - ss_x);
int step = 1 << txsz;
return x + step < w;
}
}
static int vp10_has_bottom(BLOCK_SIZE bsize, int mi_row, int mi_col,
int bottom_available, TX_SIZE txsz,
int y, int x, int ss_y) {
if (x == 0) {
int wl = mi_width_log2_lookup[bsize];
int hl = mi_height_log2_lookup[bsize];
int h = 1 << (hl + 1 - ss_y);
int step = 1 << txsz;
const uint8_t *order = orders[bsize];
int my_order, bl_order;
mi_row = (mi_row & 7) >> hl;
mi_col = (mi_col & 7) >> wl;
if (mi_col == 0)
return bottom_available &&
(mi_row << (hl + !ss_y)) + y + step < (8 << !ss_y);
if (((mi_row + 1) << hl) >= 8)
return 0;
if (y + step < h)
return 1;
my_order = order[((mi_row + 0) << (3 - wl)) + mi_col + 0];
bl_order = order[((mi_row + 1) << (3 - wl)) + mi_col - 1];
return bl_order < my_order && bottom_available;
} else {
return 0;
}
}
#endif
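The orders tables give the coding (z-scan) index of each block of a given size inside its 64x64 superblock; vp10_has_right() and vp10_has_bottom() compare those indices to decide whether the above-right or bottom-left neighbor has already been reconstructed. A small standalone check (the table is copied from orders_16x16 above; the scenario is illustrative):

#include <assert.h>

/* orders_16x16 from above: coding order of 16x16 blocks within a 64x64
 * superblock, laid out as a 4x4 grid in raster order. */
static const unsigned char orders_16x16[16] = {
  0, 1, 4, 5,
  2, 3, 6, 7,
  8, 9, 12, 13,
  10, 11, 14, 15,
};

int main(void) {
  /* A 16x16 block in row 1, column 2 of its superblock... */
  const int my_order = orders_16x16[1 * 4 + 2];   /* 6 */
  /* ...and its above-right neighbour in row 0, column 3. */
  const int tr_order = orders_16x16[0 * 4 + 3];   /* 5 */
  /* The neighbour was coded earlier, so its pixels can be used for the
   * above-right extension (subject to the frame/tile boundary checks). */
  assert(my_order > tr_order);
  return 0;
}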
typedef void (*intra_pred_fn)(uint8_t *dst, ptrdiff_t stride,
const uint8_t *above, const uint8_t *left);
@@ -55,17 +205,26 @@ static intra_high_pred_fn dc_pred_high[2][2][4];
#endif  // CONFIG_VP9_HIGHBITDEPTH
static void vp10_init_intra_predictors_internal(void) {
#define INIT_NO_4X4(p, type) \
p[TX_8X8] = vpx_##type##_predictor_8x8; \
p[TX_16X16] = vpx_##type##_predictor_16x16; \
p[TX_32X32] = vpx_##type##_predictor_32x32
#define INIT_ALL_SIZES(p, type) \
p[TX_4X4] = vpx_##type##_predictor_4x4; \
INIT_NO_4X4(p, type)
INIT_ALL_SIZES(pred[V_PRED], v);
INIT_ALL_SIZES(pred[H_PRED], h);
#if CONFIG_MISC_FIXES
INIT_ALL_SIZES(pred[D207_PRED], d207e);
INIT_ALL_SIZES(pred[D45_PRED], d45e);
INIT_ALL_SIZES(pred[D63_PRED], d63e);
#else
INIT_ALL_SIZES(pred[D207_PRED], d207);
INIT_ALL_SIZES(pred[D45_PRED], d45);
INIT_ALL_SIZES(pred[D63_PRED], d63);
#endif
INIT_ALL_SIZES(pred[D117_PRED], d117);
INIT_ALL_SIZES(pred[D135_PRED], d135);
INIT_ALL_SIZES(pred[D153_PRED], d153);
@@ -79,9 +238,15 @@ static void vp10_init_intra_predictors_internal(void) {
#if CONFIG_VP9_HIGHBITDEPTH
INIT_ALL_SIZES(pred_high[V_PRED], highbd_v);
INIT_ALL_SIZES(pred_high[H_PRED], highbd_h);
#if CONFIG_MISC_FIXES
INIT_ALL_SIZES(pred_high[D207_PRED], highbd_d207e);
INIT_ALL_SIZES(pred_high[D45_PRED], highbd_d45e);
INIT_ALL_SIZES(pred_high[D63_PRED], highbd_d63);
#else
INIT_ALL_SIZES(pred_high[D207_PRED], highbd_d207);
INIT_ALL_SIZES(pred_high[D45_PRED], highbd_d45);
INIT_ALL_SIZES(pred_high[D63_PRED], highbd_d63);
#endif
INIT_ALL_SIZES(pred_high[D117_PRED], highbd_d117);
INIT_ALL_SIZES(pred_high[D135_PRED], highbd_d135);
INIT_ALL_SIZES(pred_high[D153_PRED], highbd_d153);
@@ -96,6 +261,13 @@ static void vp10_init_intra_predictors_internal(void) {
#undef intra_pred_allsizes
}
#if CONFIG_MISC_FIXES
static INLINE void memset16(uint16_t *dst, int val, int n) {
while (n--)
*dst++ = val;
}
#endif
#if CONFIG_VP9_HIGHBITDEPTH
static void build_intra_predictors_high(const MACROBLOCKD *xd,
const uint8_t *ref8,
@@ -104,23 +276,38 @@ static void build_intra_predictors_high(const MACROBLOCKD *xd,
int dst_stride,
PREDICTION_MODE mode,
TX_SIZE tx_size,
#if CONFIG_MISC_FIXES
int n_top_px, int n_topright_px,
int n_left_px, int n_bottomleft_px,
#else
int up_available,
int left_available,
int right_available,
#endif
int x, int y,
int plane, int bd) {
int i;
uint16_t *dst = CONVERT_TO_SHORTPTR(dst8);
uint16_t *ref = CONVERT_TO_SHORTPTR(ref8);
#if CONFIG_MISC_FIXES
DECLARE_ALIGNED(16, uint16_t, left_col[32]);
#else
DECLARE_ALIGNED(16, uint16_t, left_col[64]);
#endif
DECLARE_ALIGNED(16, uint16_t, above_data[64 + 16]);
uint16_t *above_row = above_data + 16;
const uint16_t *const_above_row = above_row;
const int bs = 4 << tx_size;
#if CONFIG_MISC_FIXES
const uint16_t *above_ref = ref - ref_stride;
#else
int frame_width, frame_height;
int x0, y0;
const struct macroblockd_plane *const pd = &xd->plane[plane];
#endif
const int need_left = extend_modes[mode] & NEED_LEFT;
const int need_above = extend_modes[mode] & NEED_ABOVE;
const int need_aboveright = extend_modes[mode] & NEED_ABOVERIGHT;
int base = 128 << (bd - 8);
// 127 127 127 .. 127 127 127 127 127 127
// 129 A B .. Y Z
// 129 C D .. W X
// 129 E F .. U V
// 129 G H .. S T T T T T
#if CONFIG_MISC_FIXES
(void) x;
(void) y;
(void) plane;
(void) need_left;
(void) need_above;
(void) need_aboveright;
// NEED_LEFT
if (extend_modes[mode] & NEED_LEFT) {
const int need_bottom = !!(extend_modes[mode] & NEED_BOTTOMLEFT);
i = 0;
if (n_left_px > 0) {
for (; i < n_left_px; i++)
left_col[i] = ref[i * ref_stride - 1];
if (need_bottom && n_bottomleft_px > 0) {
assert(i == bs);
for (; i < bs + n_bottomleft_px; i++)
left_col[i] = ref[i * ref_stride - 1];
}
if (i < (bs << need_bottom))
memset16(&left_col[i], left_col[i - 1], (bs << need_bottom) - i);
} else {
memset16(left_col, base + 1, bs << need_bottom);
}
}
// NEED_ABOVE
if (extend_modes[mode] & NEED_ABOVE) {
const int need_right = !!(extend_modes[mode] & NEED_ABOVERIGHT);
if (n_top_px > 0) {
memcpy(above_row, above_ref, n_top_px * 2);
i = n_top_px;
if (need_right && n_topright_px > 0) {
assert(n_top_px == bs);
memcpy(above_row + bs, above_ref + bs, n_topright_px * 2);
i += n_topright_px;
}
if (i < (bs << need_right))
memset16(&above_row[i], above_row[i - 1], (bs << need_right) - i);
} else {
memset16(above_row, base - 1, bs << need_right);
}
}
if (extend_modes[mode] & NEED_ABOVELEFT) {
above_row[-1] = n_top_px > 0 ?
(n_left_px > 0 ? above_ref[-1] : base + 1) : base - 1;
}
#else
// Get current frame pointer, width and height.
if (plane == 0) {
frame_width = xd->cur_buf->y_width;
@@ -264,8 +378,207 @@ static void build_intra_predictors(const MACROBLOCKD *xd, const uint8_t *ref,
x0 = (-xd->mb_to_left_edge >> (3 + pd->subsampling_x)) + x;
y0 = (-xd->mb_to_top_edge >> (3 + pd->subsampling_y)) + y;
// NEED_LEFT
if (need_left) {
if (left_available) {
if (xd->mb_to_bottom_edge < 0) {
/* slower path if the block needs border extension */
if (y0 + bs <= frame_height) {
for (i = 0; i < bs; ++i)
left_col[i] = ref[i * ref_stride - 1];
} else {
const int extend_bottom = frame_height - y0;
for (i = 0; i < extend_bottom; ++i)
left_col[i] = ref[i * ref_stride - 1];
for (; i < bs; ++i)
left_col[i] = ref[(extend_bottom - 1) * ref_stride - 1];
}
} else {
/* faster path if the block does not need extension */
for (i = 0; i < bs; ++i)
left_col[i] = ref[i * ref_stride - 1];
}
} else {
// TODO(Peter): this value should probably change for high bitdepth
vpx_memset16(left_col, base + 1, bs);
}
}
// NEED_ABOVE
if (need_above) {
if (up_available) {
const uint16_t *above_ref = ref - ref_stride;
if (xd->mb_to_right_edge < 0) {
/* slower path if the block needs border extension */
if (x0 + bs <= frame_width) {
memcpy(above_row, above_ref, bs * sizeof(above_row[0]));
} else if (x0 <= frame_width) {
const int r = frame_width - x0;
memcpy(above_row, above_ref, r * sizeof(above_row[0]));
vpx_memset16(above_row + r, above_row[r - 1], x0 + bs - frame_width);
}
} else {
/* faster path if the block does not need extension */
if (bs == 4 && right_available && left_available) {
const_above_row = above_ref;
} else {
memcpy(above_row, above_ref, bs * sizeof(above_row[0]));
}
}
above_row[-1] = left_available ? above_ref[-1] : (base + 1);
} else {
vpx_memset16(above_row, base - 1, bs);
above_row[-1] = base - 1;
}
}
// NEED_ABOVERIGHT
if (need_aboveright) {
if (up_available) {
const uint16_t *above_ref = ref - ref_stride;
if (xd->mb_to_right_edge < 0) {
/* slower path if the block needs border extension */
if (x0 + 2 * bs <= frame_width) {
if (right_available && bs == 4) {
memcpy(above_row, above_ref, 2 * bs * sizeof(above_row[0]));
} else {
memcpy(above_row, above_ref, bs * sizeof(above_row[0]));
vpx_memset16(above_row + bs, above_row[bs - 1], bs);
}
} else if (x0 + bs <= frame_width) {
const int r = frame_width - x0;
if (right_available && bs == 4) {
memcpy(above_row, above_ref, r * sizeof(above_row[0]));
vpx_memset16(above_row + r, above_row[r - 1],
x0 + 2 * bs - frame_width);
} else {
memcpy(above_row, above_ref, bs * sizeof(above_row[0]));
vpx_memset16(above_row + bs, above_row[bs - 1], bs);
}
} else if (x0 <= frame_width) {
const int r = frame_width - x0;
memcpy(above_row, above_ref, r * sizeof(above_row[0]));
vpx_memset16(above_row + r, above_row[r - 1],
x0 + 2 * bs - frame_width);
}
// TODO(Peter) this value should probably change for high bitdepth
above_row[-1] = left_available ? above_ref[-1] : (base + 1);
} else {
/* faster path if the block does not need extension */
if (bs == 4 && right_available && left_available) {
const_above_row = above_ref;
} else {
memcpy(above_row, above_ref, bs * sizeof(above_row[0]));
if (bs == 4 && right_available)
memcpy(above_row + bs, above_ref + bs, bs * sizeof(above_row[0]));
else
vpx_memset16(above_row + bs, above_row[bs - 1], bs);
// TODO(Peter): this value should probably change for high bitdepth
above_row[-1] = left_available ? above_ref[-1] : (base + 1);
}
}
} else {
vpx_memset16(above_row, base - 1, bs * 2);
// TODO(Peter): this value should probably change for high bitdepth
above_row[-1] = base - 1;
}
}
#endif
// predict
if (mode == DC_PRED) {
#if CONFIG_MISC_FIXES
dc_pred_high[n_left_px > 0][n_top_px > 0][tx_size](dst, dst_stride,
const_above_row,
left_col, xd->bd);
#else
dc_pred_high[left_available][up_available][tx_size](dst, dst_stride,
const_above_row,
left_col, xd->bd);
#endif
} else {
pred_high[mode][tx_size](dst, dst_stride, const_above_row, left_col,
xd->bd);
}
}
#endif // CONFIG_VP9_HIGHBITDEPTH
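For reference, the dc_pred[have_left][have_top][tx_size] tables used above dispatch to one of four DC variants depending on which borders exist. The stand-alone sketch below (illustrative only, not library code) mirrors that behaviour for the 8-bit path: average the available above/left samples, or fall back to 128 when neither border is available; the high-bitdepth table uses a bit-depth dependent midpoint instead of 128.

#include <stdint.h>

/* Illustrative DC predictor: average of the available above/left
 * neighbors, or 128 when neither border is available (8-bit path). */
static void dc_predictor_sketch(uint8_t *dst, int stride, int bs,
                                const uint8_t *above, const uint8_t *left,
                                int have_above, int have_left) {
  int i, r, c, sum = 0, count = 0, dc = 128;
  if (have_above) {
    for (i = 0; i < bs; ++i) sum += above[i];
    count += bs;
  }
  if (have_left) {
    for (i = 0; i < bs; ++i) sum += left[i];
    count += bs;
  }
  if (count) dc = (sum + (count >> 1)) / count;  /* rounded average */
  for (r = 0; r < bs; ++r)
    for (c = 0; c < bs; ++c)
      dst[r * stride + c] = (uint8_t)dc;
}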
static void build_intra_predictors(const MACROBLOCKD *xd, const uint8_t *ref,
int ref_stride, uint8_t *dst, int dst_stride,
PREDICTION_MODE mode, TX_SIZE tx_size,
#if CONFIG_MISC_FIXES
int n_top_px, int n_topright_px,
int n_left_px, int n_bottomleft_px,
#else
int up_available, int left_available,
int right_available,
#endif
int x, int y, int plane) {
int i;
#if CONFIG_MISC_FIXES
DECLARE_ALIGNED(16, uint8_t, left_col[64]);
const uint8_t *above_ref = ref - ref_stride;
#else
DECLARE_ALIGNED(16, uint8_t, left_col[32]);
int frame_width, frame_height;
int x0, y0;
const struct macroblockd_plane *const pd = &xd->plane[plane];
#endif
DECLARE_ALIGNED(16, uint8_t, above_data[64 + 16]);
uint8_t *above_row = above_data + 16;
const uint8_t *const_above_row = above_row;
const int bs = 4 << tx_size;
// 127 127 127 .. 127 127 127 127 127 127
// 129 A B .. Y Z
// 129 C D .. W X
// 129 E F .. U V
// 129 G H .. S T T T T T
// ..
#if CONFIG_MISC_FIXES
(void) xd;
(void) x;
(void) y;
(void) plane;
assert(n_top_px >= 0);
assert(n_topright_px >= 0);
assert(n_left_px >= 0);
assert(n_bottomleft_px >= 0);
#else
// Get current frame pointer, width and height.
if (plane == 0) {
frame_width = xd->cur_buf->y_width;
frame_height = xd->cur_buf->y_height;
} else {
frame_width = xd->cur_buf->uv_width;
frame_height = xd->cur_buf->uv_height;
}
// Get block position in current frame.
x0 = (-xd->mb_to_left_edge >> (3 + pd->subsampling_x)) + x;
y0 = (-xd->mb_to_top_edge >> (3 + pd->subsampling_y)) + y;
#endif
// NEED_LEFT
if (extend_modes[mode] & NEED_LEFT) {
#if CONFIG_MISC_FIXES
const int need_bottom = !!(extend_modes[mode] & NEED_BOTTOMLEFT);
i = 0;
if (n_left_px > 0) {
for (; i < n_left_px; i++)
left_col[i] = ref[i * ref_stride - 1];
if (need_bottom && n_bottomleft_px > 0) {
assert(i == bs);
for (; i < bs + n_bottomleft_px; i++)
left_col[i] = ref[i * ref_stride - 1];
}
if (i < (bs << need_bottom))
memset(&left_col[i], left_col[i - 1], (bs << need_bottom) - i);
} else {
memset(left_col, 129, bs << need_bottom);
}
#else
if (left_available) {
if (xd->mb_to_bottom_edge < 0) {
/* slower path if the block needs border extension */
@@ -287,10 +600,27 @@ static void build_intra_predictors(const MACROBLOCKD *xd, const uint8_t *ref,
} else {
memset(left_col, 129, bs);
}
#endif
}
// NEED_ABOVE
if (extend_modes[mode] & NEED_ABOVE) {
#if CONFIG_MISC_FIXES
const int need_right = !!(extend_modes[mode] & NEED_ABOVERIGHT);
if (n_top_px > 0) {
memcpy(above_row, above_ref, n_top_px);
i = n_top_px;
if (need_right && n_topright_px > 0) {
assert(n_top_px == bs);
memcpy(above_row + bs, above_ref + bs, n_topright_px);
i += n_topright_px;
}
if (i < (bs << need_right))
memset(&above_row[i], above_row[i - 1], (bs << need_right) - i);
} else {
memset(above_row, 127, bs << need_right);
}
#else
if (up_available) {
const uint8_t *above_ref = ref - ref_stride;
if (xd->mb_to_right_edge < 0) {
@@ -315,8 +645,14 @@ static void build_intra_predictors(const MACROBLOCKD *xd, const uint8_t *ref,
memset(above_row, 127, bs);
above_row[-1] = 127;
}
#endif
}
#if CONFIG_MISC_FIXES
if (extend_modes[mode] & NEED_ABOVELEFT) {
above_row[-1] = n_top_px > 0 ? (n_left_px > 0 ? above_ref[-1] : 129) : 127;
}
#else
// NEED_ABOVERIGHT
if (extend_modes[mode] & NEED_ABOVERIGHT) {
if (up_available) {
@@ -362,29 +698,83 @@ static void build_intra_predictors(const MACROBLOCKD *xd, const uint8_t *ref,
above_row[-1] = 127;
}
}
#endif
// predict
if (mode == DC_PRED) {
#if CONFIG_MISC_FIXES
dc_pred[n_left_px > 0][n_top_px > 0][tx_size](dst, dst_stride,
const_above_row, left_col);
#else
dc_pred[left_available][up_available][tx_size](dst, dst_stride,
const_above_row, left_col);
#endif
} else {
pred[mode][tx_size](dst, dst_stride, const_above_row, left_col);
}
}
void vp10_predict_intra_block(const MACROBLOCKD *xd, int bwl_in, int bhl_in,
TX_SIZE tx_size, PREDICTION_MODE mode,
const uint8_t *ref, int ref_stride,
uint8_t *dst, int dst_stride,
int aoff, int loff, int plane) {
const int bw = (1 << bwl_in);
const int txw = (1 << tx_size);
const int have_top = loff || xd->up_available;
const int have_left = aoff || xd->left_available;
const int have_right = (aoff + txw) < bw;
const int x = aoff * 4;
const int y = loff * 4;
#if CONFIG_MISC_FIXES
const int bw = VPXMAX(2, 1 << bwl_in);
const int bh = VPXMAX(2, 1 << bhl_in);
const int mi_row = -xd->mb_to_top_edge >> 6;
const int mi_col = -xd->mb_to_left_edge >> 6;
const BLOCK_SIZE bsize = xd->mi[0]->mbmi.sb_type;
const struct macroblockd_plane *const pd = &xd->plane[plane];
const int right_available =
mi_col + (bw >> !pd->subsampling_x) < xd->tile.mi_col_end;
const int have_right = vp10_has_right(bsize, mi_row, mi_col,
right_available,
tx_size, loff, aoff,
pd->subsampling_x);
const int have_bottom = vp10_has_bottom(bsize, mi_row, mi_col,
xd->mb_to_bottom_edge > 0,
tx_size, loff, aoff,
pd->subsampling_y);
const int wpx = 4 * bw;
const int hpx = 4 * bh;
const int txpx = 4 * txw;
int xr = (xd->mb_to_right_edge >> (3 + pd->subsampling_x)) + (wpx - x - txpx);
int yd =
(xd->mb_to_bottom_edge >> (3 + pd->subsampling_y)) + (hpx - y - txpx);
#else
const int bw = (1 << bwl_in);
const int have_right = (aoff + txw) < bw;
#endif // CONFIG_MISC_FIXES
#if CONFIG_MISC_FIXES
#if CONFIG_VP9_HIGHBITDEPTH
if (xd->cur_buf->flags & YV12_FLAG_HIGHBITDEPTH) {
build_intra_predictors_high(xd, ref, ref_stride, dst, dst_stride, mode,
tx_size,
have_top ? VPXMIN(txpx, xr + txpx) : 0,
have_top && have_right ? VPXMIN(txpx, xr) : 0,
have_left ? VPXMIN(txpx, yd + txpx) : 0,
have_bottom && have_left ? VPXMIN(txpx, yd) : 0,
x, y, plane, xd->bd);
return;
}
#endif
build_intra_predictors(xd, ref, ref_stride, dst, dst_stride, mode,
tx_size,
have_top ? VPXMIN(txpx, xr + txpx) : 0,
have_top && have_right ? VPXMIN(txpx, xr) : 0,
have_left ? VPXMIN(txpx, yd + txpx) : 0,
have_bottom && have_left ? VPXMIN(txpx, yd) : 0,
x, y, plane);
#else // CONFIG_MISC_FIXES
(void) bhl_in;
#if CONFIG_VP9_HIGHBITDEPTH
if (xd->cur_buf->flags & YV12_FLAG_HIGHBITDEPTH) {
build_intra_predictors_high(xd, ref, ref_stride, dst, dst_stride, mode,
@@ -395,6 +785,7 @@ void vp10_predict_intra_block(const MACROBLOCKD *xd, int bwl_in,
#endif
build_intra_predictors(xd, ref, ref_stride, dst, dst_stride, mode, tx_size,
have_top, have_left, have_right, x, y, plane);
#endif // CONFIG_MISC_FIXES
}
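To make the VPXMIN() clamps above concrete, suppose a 16x16 luma transform block overhangs the right frame border by 8 pixels with nothing to its right inside the superblock; xr then works out to -8, so only the 8 existing above pixels are read and have_right suppresses the top-right fetch entirely. A small stand-alone illustration with hypothetical numbers (not library code):

#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

int main(void) {
  /* Hypothetical 16x16 luma tx block overhanging the right frame edge
   * by 8 pixels, with no further blocks to its right in the superblock. */
  const int txpx = 16;
  const int xr = -8;                 /* 8 pixels missing past the right edge */
  const int have_right = 0;          /* vp10_has_right() fails at the border */
  printf("n_top_px      = %d\n", MIN(txpx, xr + txpx));            /* 8 */
  printf("n_topright_px = %d\n", have_right ? MIN(txpx, xr) : 0);  /* 0 */
  return 0;
}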
void vp10_init_intra_predictors(void) {


@@ -20,7 +20,7 @@ extern "C" {
void vp10_init_intra_predictors(void);
void vp10_predict_intra_block(const MACROBLOCKD *xd, int bwl_in, int bhl_in,
TX_SIZE tx_size, PREDICTION_MODE mode,
const uint8_t *ref, int ref_stride,
uint8_t *dst, int dst_stride,


@@ -695,6 +695,13 @@ DECLARE_ALIGNED(16, static const int16_t, vp10_default_iscan_32x32[1024]) = {
1023,
};
const scan_order vp10_default_scan_orders[TX_SIZES] = {
{default_scan_4x4, vp10_default_iscan_4x4, default_scan_4x4_neighbors},
{default_scan_8x8, vp10_default_iscan_8x8, default_scan_8x8_neighbors},
{default_scan_16x16, vp10_default_iscan_16x16, default_scan_16x16_neighbors},
{default_scan_32x32, vp10_default_iscan_32x32, default_scan_32x32_neighbors},
};
const scan_order vp10_scan_orders[TX_SIZES][TX_TYPES] = {
{ // TX_4X4
{default_scan_4x4, vp10_default_iscan_4x4, default_scan_4x4_neighbors},


@@ -29,6 +29,7 @@ typedef struct {
const int16_t *neighbors;
} scan_order;
extern const scan_order vp10_default_scan_orders[TX_SIZES];
extern const scan_order vp10_scan_orders[TX_SIZES][TX_TYPES];
static INLINE int get_coef_context(const int16_t *neighbors,


@@ -42,13 +42,15 @@ struct segmentation {
uint8_t abs_delta;
uint8_t temporal_update;
vpx_prob tree_probs[SEG_TREE_PROBS];
vpx_prob pred_probs[PREDICTION_PROBS];
int16_t feature_data[MAX_SEGMENTS][SEG_LVL_MAX];
unsigned int feature_mask[MAX_SEGMENTS];
};
struct segmentation_probs {
vpx_prob tree_probs[SEG_TREE_PROBS];
vpx_prob pred_probs[PREDICTION_PROBS];
};
static INLINE int segfeature_active(const struct segmentation *seg,
int segment_id,
SEG_LVL_FEATURES feature_id) {


@@ -434,4 +434,15 @@ void vp10_accumulate_frame_counts(VP10_COMMON *cm, FRAME_COUNTS *counts,
for (i = 0; i < MV_FP_SIZE; i++)
comps->fp[i] += comps_t->fp[i];
}
#if CONFIG_MISC_FIXES
for (i = 0; i < PREDICTION_PROBS; i++)
for (j = 0; j < 2; j++)
cm->counts.seg.pred[i][j] += counts->seg.pred[i][j];
for (i = 0; i < MAX_SEGMENTS; i++) {
cm->counts.seg.tree_total[i] += counts->seg.tree_total[i];
cm->counts.seg.tree_mispred[i] += counts->seg.tree_mispred[i];
}
#endif
}


@@ -14,6 +14,10 @@
#include "vp10/common/loopfilter.h" #include "vp10/common/loopfilter.h"
#include "vpx_util/vpx_thread.h" #include "vpx_util/vpx_thread.h"
#ifdef __cplusplus
extern "C" {
#endif
struct VP10Common;
struct FRAME_COUNTS;
@@ -54,4 +58,8 @@ void vp10_loop_filter_frame_mt(YV12_BUFFER_CONFIG *frame,
void vp10_accumulate_frame_counts(struct VP10Common *cm,
struct FRAME_COUNTS *counts, int is_dec);
#ifdef __cplusplus
} // extern "C"
#endif
#endif // VP10_COMMON_LOOPFILTER_THREAD_H_

vp10/common/vp10_fwd_txfm.c (new file, 824 lines)

@@ -0,0 +1,824 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "vp10/common/vp10_fwd_txfm.h"
void vp10_fdct4x4_c(const int16_t *input, tran_low_t *output, int stride) {
// The 2D transform is done with two passes which are actually pretty
// similar. In the first one, we transform the columns and transpose
// the results. In the second one, we transform the rows. To achieve that,
// as the first pass results are transposed, we transpose the columns (that
// is the transposed rows) and transpose the results (so that it goes back
// in normal/row positions).
int pass;
// We need an intermediate buffer between passes.
tran_low_t intermediate[4 * 4];
const int16_t *in_pass0 = input;
const tran_low_t *in = NULL;
tran_low_t *out = intermediate;
// Do the two transform/transpose passes
for (pass = 0; pass < 2; ++pass) {
tran_high_t input[4]; // canbe16
tran_high_t step[4]; // canbe16
tran_high_t temp1, temp2; // needs32
int i;
for (i = 0; i < 4; ++i) {
// Load inputs.
if (0 == pass) {
input[0] = in_pass0[0 * stride] * 16;
input[1] = in_pass0[1 * stride] * 16;
input[2] = in_pass0[2 * stride] * 16;
input[3] = in_pass0[3 * stride] * 16;
if (i == 0 && input[0]) {
input[0] += 1;
}
} else {
input[0] = in[0 * 4];
input[1] = in[1 * 4];
input[2] = in[2 * 4];
input[3] = in[3 * 4];
}
// Transform.
step[0] = input[0] + input[3];
step[1] = input[1] + input[2];
step[2] = input[1] - input[2];
step[3] = input[0] - input[3];
temp1 = (step[0] + step[1]) * cospi_16_64;
temp2 = (step[0] - step[1]) * cospi_16_64;
out[0] = (tran_low_t)fdct_round_shift(temp1);
out[2] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step[2] * cospi_24_64 + step[3] * cospi_8_64;
temp2 = -step[2] * cospi_8_64 + step[3] * cospi_24_64;
out[1] = (tran_low_t)fdct_round_shift(temp1);
out[3] = (tran_low_t)fdct_round_shift(temp2);
// Do next column (which is a transposed row in second/horizontal pass)
in_pass0++;
in++;
out += 4;
}
// Setup in/out for next pass.
in = intermediate;
out = output;
}
{
int i, j;
for (i = 0; i < 4; ++i) {
for (j = 0; j < 4; ++j)
output[j + i * 4] = (output[j + i * 4] + 1) >> 2;
}
}
}
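As a usage sketch (not part of the library), the 4x4 forward transform above can be driven directly on a single residual block. The prototype normally comes from the generated vp10_rtcd.h and tran_low_t from the vpx_dsp headers; both are assumed here to keep the example compilable inside the source tree.

/* Hedged usage sketch, not library code. */
#include <stdio.h>
#include "vpx_dsp/txfm_common.h"   /* pulls in tran_low_t */

void vp10_fdct4x4_c(const int16_t *input, tran_low_t *output, int stride);

int main(void) {
  /* 4x4 residual block, stride of 4 samples; values chosen arbitrarily. */
  const int16_t residual[4 * 4] = {
     3, -1,  2,  0,
     1,  0, -2,  4,
    -3,  2,  1, -1,
     0,  1,  0,  2,
  };
  tran_low_t coeff[4 * 4];
  vp10_fdct4x4_c(residual, coeff, 4 /* stride in samples */);
  printf("DC coefficient: %d\n", (int)coeff[0]);
  return 0;
}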
void vp10_fdct4x4_1_c(const int16_t *input, tran_low_t *output, int stride) {
int r, c;
tran_low_t sum = 0;
for (r = 0; r < 4; ++r)
for (c = 0; c < 4; ++c)
sum += input[r * stride + c];
output[0] = sum << 1;
output[1] = 0;
}
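Worked example of this DC-only shortcut: for a flat 4x4 block whose samples are all 10, the sum is 160 and output[0] = 160 << 1 = 320, which lines up with the DC term the full vp10_fdct4x4_c produces for the same flat block. A minimal check:

#include <stdio.h>

int main(void) {
  /* Flat 4x4 block where every residual sample is 10. */
  int r, c, sum = 0;
  for (r = 0; r < 4; ++r)
    for (c = 0; c < 4; ++c)
      sum += 10;
  printf("sum = %d, DC-only output[0] = %d\n", sum, sum << 1);  /* 160, 320 */
  return 0;
}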
void vp10_fdct8x8_c(const int16_t *input,
tran_low_t *final_output, int stride) {
int i, j;
tran_low_t intermediate[64];
int pass;
tran_low_t *output = intermediate;
const tran_low_t *in = NULL;
// Transform columns
for (pass = 0; pass < 2; ++pass) {
tran_high_t s0, s1, s2, s3, s4, s5, s6, s7; // canbe16
tran_high_t t0, t1, t2, t3; // needs32
tran_high_t x0, x1, x2, x3; // canbe16
int i;
for (i = 0; i < 8; i++) {
// stage 1
if (pass == 0) {
s0 = (input[0 * stride] + input[7 * stride]) * 4;
s1 = (input[1 * stride] + input[6 * stride]) * 4;
s2 = (input[2 * stride] + input[5 * stride]) * 4;
s3 = (input[3 * stride] + input[4 * stride]) * 4;
s4 = (input[3 * stride] - input[4 * stride]) * 4;
s5 = (input[2 * stride] - input[5 * stride]) * 4;
s6 = (input[1 * stride] - input[6 * stride]) * 4;
s7 = (input[0 * stride] - input[7 * stride]) * 4;
++input;
} else {
s0 = in[0 * 8] + in[7 * 8];
s1 = in[1 * 8] + in[6 * 8];
s2 = in[2 * 8] + in[5 * 8];
s3 = in[3 * 8] + in[4 * 8];
s4 = in[3 * 8] - in[4 * 8];
s5 = in[2 * 8] - in[5 * 8];
s6 = in[1 * 8] - in[6 * 8];
s7 = in[0 * 8] - in[7 * 8];
++in;
}
// fdct4(step, step);
x0 = s0 + s3;
x1 = s1 + s2;
x2 = s1 - s2;
x3 = s0 - s3;
t0 = (x0 + x1) * cospi_16_64;
t1 = (x0 - x1) * cospi_16_64;
t2 = x2 * cospi_24_64 + x3 * cospi_8_64;
t3 = -x2 * cospi_8_64 + x3 * cospi_24_64;
output[0] = (tran_low_t)fdct_round_shift(t0);
output[2] = (tran_low_t)fdct_round_shift(t2);
output[4] = (tran_low_t)fdct_round_shift(t1);
output[6] = (tran_low_t)fdct_round_shift(t3);
// Stage 2
t0 = (s6 - s5) * cospi_16_64;
t1 = (s6 + s5) * cospi_16_64;
t2 = fdct_round_shift(t0);
t3 = fdct_round_shift(t1);
// Stage 3
x0 = s4 + t2;
x1 = s4 - t2;
x2 = s7 - t3;
x3 = s7 + t3;
// Stage 4
t0 = x0 * cospi_28_64 + x3 * cospi_4_64;
t1 = x1 * cospi_12_64 + x2 * cospi_20_64;
t2 = x2 * cospi_12_64 + x1 * -cospi_20_64;
t3 = x3 * cospi_28_64 + x0 * -cospi_4_64;
output[1] = (tran_low_t)fdct_round_shift(t0);
output[3] = (tran_low_t)fdct_round_shift(t2);
output[5] = (tran_low_t)fdct_round_shift(t1);
output[7] = (tran_low_t)fdct_round_shift(t3);
output += 8;
}
in = intermediate;
output = final_output;
}
// Rows
for (i = 0; i < 8; ++i) {
for (j = 0; j < 8; ++j)
final_output[j + i * 8] /= 2;
}
}
void vp10_fdct8x8_1_c(const int16_t *input, tran_low_t *output, int stride) {
int r, c;
tran_low_t sum = 0;
for (r = 0; r < 8; ++r)
for (c = 0; c < 8; ++c)
sum += input[r * stride + c];
output[0] = sum;
output[1] = 0;
}
void vp10_fdct16x16_c(const int16_t *input, tran_low_t *output, int stride) {
// The 2D transform is done with two passes which are actually pretty
// similar. In the first one, we transform the columns and transpose
// the results. In the second one, we transform the rows. To achieve that,
// as the first pass results are transposed, we transpose the columns (that
// is the transposed rows) and transpose the results (so that it goes back
// in normal/row positions).
int pass;
// We need an intermediate buffer between passes.
tran_low_t intermediate[256];
const int16_t *in_pass0 = input;
const tran_low_t *in = NULL;
tran_low_t *out = intermediate;
// Do the two transform/transpose passes
for (pass = 0; pass < 2; ++pass) {
tran_high_t step1[8]; // canbe16
tran_high_t step2[8]; // canbe16
tran_high_t step3[8]; // canbe16
tran_high_t input[8]; // canbe16
tran_high_t temp1, temp2; // needs32
int i;
for (i = 0; i < 16; i++) {
if (0 == pass) {
// Calculate input for the first 8 results.
input[0] = (in_pass0[0 * stride] + in_pass0[15 * stride]) * 4;
input[1] = (in_pass0[1 * stride] + in_pass0[14 * stride]) * 4;
input[2] = (in_pass0[2 * stride] + in_pass0[13 * stride]) * 4;
input[3] = (in_pass0[3 * stride] + in_pass0[12 * stride]) * 4;
input[4] = (in_pass0[4 * stride] + in_pass0[11 * stride]) * 4;
input[5] = (in_pass0[5 * stride] + in_pass0[10 * stride]) * 4;
input[6] = (in_pass0[6 * stride] + in_pass0[ 9 * stride]) * 4;
input[7] = (in_pass0[7 * stride] + in_pass0[ 8 * stride]) * 4;
// Calculate input for the next 8 results.
step1[0] = (in_pass0[7 * stride] - in_pass0[ 8 * stride]) * 4;
step1[1] = (in_pass0[6 * stride] - in_pass0[ 9 * stride]) * 4;
step1[2] = (in_pass0[5 * stride] - in_pass0[10 * stride]) * 4;
step1[3] = (in_pass0[4 * stride] - in_pass0[11 * stride]) * 4;
step1[4] = (in_pass0[3 * stride] - in_pass0[12 * stride]) * 4;
step1[5] = (in_pass0[2 * stride] - in_pass0[13 * stride]) * 4;
step1[6] = (in_pass0[1 * stride] - in_pass0[14 * stride]) * 4;
step1[7] = (in_pass0[0 * stride] - in_pass0[15 * stride]) * 4;
} else {
// Calculate input for the first 8 results.
input[0] = ((in[0 * 16] + 1) >> 2) + ((in[15 * 16] + 1) >> 2);
input[1] = ((in[1 * 16] + 1) >> 2) + ((in[14 * 16] + 1) >> 2);
input[2] = ((in[2 * 16] + 1) >> 2) + ((in[13 * 16] + 1) >> 2);
input[3] = ((in[3 * 16] + 1) >> 2) + ((in[12 * 16] + 1) >> 2);
input[4] = ((in[4 * 16] + 1) >> 2) + ((in[11 * 16] + 1) >> 2);
input[5] = ((in[5 * 16] + 1) >> 2) + ((in[10 * 16] + 1) >> 2);
input[6] = ((in[6 * 16] + 1) >> 2) + ((in[ 9 * 16] + 1) >> 2);
input[7] = ((in[7 * 16] + 1) >> 2) + ((in[ 8 * 16] + 1) >> 2);
// Calculate input for the next 8 results.
step1[0] = ((in[7 * 16] + 1) >> 2) - ((in[ 8 * 16] + 1) >> 2);
step1[1] = ((in[6 * 16] + 1) >> 2) - ((in[ 9 * 16] + 1) >> 2);
step1[2] = ((in[5 * 16] + 1) >> 2) - ((in[10 * 16] + 1) >> 2);
step1[3] = ((in[4 * 16] + 1) >> 2) - ((in[11 * 16] + 1) >> 2);
step1[4] = ((in[3 * 16] + 1) >> 2) - ((in[12 * 16] + 1) >> 2);
step1[5] = ((in[2 * 16] + 1) >> 2) - ((in[13 * 16] + 1) >> 2);
step1[6] = ((in[1 * 16] + 1) >> 2) - ((in[14 * 16] + 1) >> 2);
step1[7] = ((in[0 * 16] + 1) >> 2) - ((in[15 * 16] + 1) >> 2);
}
// Work on the first eight values; fdct8(input, even_results);
{
tran_high_t s0, s1, s2, s3, s4, s5, s6, s7; // canbe16
tran_high_t t0, t1, t2, t3; // needs32
tran_high_t x0, x1, x2, x3; // canbe16
// stage 1
s0 = input[0] + input[7];
s1 = input[1] + input[6];
s2 = input[2] + input[5];
s3 = input[3] + input[4];
s4 = input[3] - input[4];
s5 = input[2] - input[5];
s6 = input[1] - input[6];
s7 = input[0] - input[7];
// fdct4(step, step);
x0 = s0 + s3;
x1 = s1 + s2;
x2 = s1 - s2;
x3 = s0 - s3;
t0 = (x0 + x1) * cospi_16_64;
t1 = (x0 - x1) * cospi_16_64;
t2 = x3 * cospi_8_64 + x2 * cospi_24_64;
t3 = x3 * cospi_24_64 - x2 * cospi_8_64;
out[0] = (tran_low_t)fdct_round_shift(t0);
out[4] = (tran_low_t)fdct_round_shift(t2);
out[8] = (tran_low_t)fdct_round_shift(t1);
out[12] = (tran_low_t)fdct_round_shift(t3);
// Stage 2
t0 = (s6 - s5) * cospi_16_64;
t1 = (s6 + s5) * cospi_16_64;
t2 = fdct_round_shift(t0);
t3 = fdct_round_shift(t1);
// Stage 3
x0 = s4 + t2;
x1 = s4 - t2;
x2 = s7 - t3;
x3 = s7 + t3;
// Stage 4
t0 = x0 * cospi_28_64 + x3 * cospi_4_64;
t1 = x1 * cospi_12_64 + x2 * cospi_20_64;
t2 = x2 * cospi_12_64 + x1 * -cospi_20_64;
t3 = x3 * cospi_28_64 + x0 * -cospi_4_64;
out[2] = (tran_low_t)fdct_round_shift(t0);
out[6] = (tran_low_t)fdct_round_shift(t2);
out[10] = (tran_low_t)fdct_round_shift(t1);
out[14] = (tran_low_t)fdct_round_shift(t3);
}
// Work on the next eight values; step1 -> odd_results
{
// step 2
temp1 = (step1[5] - step1[2]) * cospi_16_64;
temp2 = (step1[4] - step1[3]) * cospi_16_64;
step2[2] = fdct_round_shift(temp1);
step2[3] = fdct_round_shift(temp2);
temp1 = (step1[4] + step1[3]) * cospi_16_64;
temp2 = (step1[5] + step1[2]) * cospi_16_64;
step2[4] = fdct_round_shift(temp1);
step2[5] = fdct_round_shift(temp2);
// step 3
step3[0] = step1[0] + step2[3];
step3[1] = step1[1] + step2[2];
step3[2] = step1[1] - step2[2];
step3[3] = step1[0] - step2[3];
step3[4] = step1[7] - step2[4];
step3[5] = step1[6] - step2[5];
step3[6] = step1[6] + step2[5];
step3[7] = step1[7] + step2[4];
// step 4
temp1 = step3[1] * -cospi_8_64 + step3[6] * cospi_24_64;
temp2 = step3[2] * cospi_24_64 + step3[5] * cospi_8_64;
step2[1] = fdct_round_shift(temp1);
step2[2] = fdct_round_shift(temp2);
temp1 = step3[2] * cospi_8_64 - step3[5] * cospi_24_64;
temp2 = step3[1] * cospi_24_64 + step3[6] * cospi_8_64;
step2[5] = fdct_round_shift(temp1);
step2[6] = fdct_round_shift(temp2);
// step 5
step1[0] = step3[0] + step2[1];
step1[1] = step3[0] - step2[1];
step1[2] = step3[3] + step2[2];
step1[3] = step3[3] - step2[2];
step1[4] = step3[4] - step2[5];
step1[5] = step3[4] + step2[5];
step1[6] = step3[7] - step2[6];
step1[7] = step3[7] + step2[6];
// step 6
temp1 = step1[0] * cospi_30_64 + step1[7] * cospi_2_64;
temp2 = step1[1] * cospi_14_64 + step1[6] * cospi_18_64;
out[1] = (tran_low_t)fdct_round_shift(temp1);
out[9] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step1[2] * cospi_22_64 + step1[5] * cospi_10_64;
temp2 = step1[3] * cospi_6_64 + step1[4] * cospi_26_64;
out[5] = (tran_low_t)fdct_round_shift(temp1);
out[13] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step1[3] * -cospi_26_64 + step1[4] * cospi_6_64;
temp2 = step1[2] * -cospi_10_64 + step1[5] * cospi_22_64;
out[3] = (tran_low_t)fdct_round_shift(temp1);
out[11] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step1[1] * -cospi_18_64 + step1[6] * cospi_14_64;
temp2 = step1[0] * -cospi_2_64 + step1[7] * cospi_30_64;
out[7] = (tran_low_t)fdct_round_shift(temp1);
out[15] = (tran_low_t)fdct_round_shift(temp2);
}
// Do next column (which is a transposed row in second/horizontal pass)
in++;
in_pass0++;
out += 16;
}
// Setup in/out for next pass.
in = intermediate;
out = output;
}
}
void vp10_fdct16x16_1_c(const int16_t *input, tran_low_t *output, int stride) {
int r, c;
tran_low_t sum = 0;
for (r = 0; r < 16; ++r)
for (c = 0; c < 16; ++c)
sum += input[r * stride + c];
output[0] = sum >> 1;
output[1] = 0;
}
static INLINE tran_high_t dct_32_round(tran_high_t input) {
tran_high_t rv = ROUND_POWER_OF_TWO(input, DCT_CONST_BITS);
// TODO(debargha, peter.derivaz): Find new bounds for this assert,
// and make the bounds consts.
// assert(-131072 <= rv && rv <= 131071);
return rv;
}
static INLINE tran_high_t half_round_shift(tran_high_t input) {
tran_high_t rv = (input + 1 + (input < 0)) >> 2;
return rv;
}
void vp10_fdct32(const tran_high_t *input, tran_high_t *output, int round) {
tran_high_t step[32];
// Stage 1
step[0] = input[0] + input[(32 - 1)];
step[1] = input[1] + input[(32 - 2)];
step[2] = input[2] + input[(32 - 3)];
step[3] = input[3] + input[(32 - 4)];
step[4] = input[4] + input[(32 - 5)];
step[5] = input[5] + input[(32 - 6)];
step[6] = input[6] + input[(32 - 7)];
step[7] = input[7] + input[(32 - 8)];
step[8] = input[8] + input[(32 - 9)];
step[9] = input[9] + input[(32 - 10)];
step[10] = input[10] + input[(32 - 11)];
step[11] = input[11] + input[(32 - 12)];
step[12] = input[12] + input[(32 - 13)];
step[13] = input[13] + input[(32 - 14)];
step[14] = input[14] + input[(32 - 15)];
step[15] = input[15] + input[(32 - 16)];
step[16] = -input[16] + input[(32 - 17)];
step[17] = -input[17] + input[(32 - 18)];
step[18] = -input[18] + input[(32 - 19)];
step[19] = -input[19] + input[(32 - 20)];
step[20] = -input[20] + input[(32 - 21)];
step[21] = -input[21] + input[(32 - 22)];
step[22] = -input[22] + input[(32 - 23)];
step[23] = -input[23] + input[(32 - 24)];
step[24] = -input[24] + input[(32 - 25)];
step[25] = -input[25] + input[(32 - 26)];
step[26] = -input[26] + input[(32 - 27)];
step[27] = -input[27] + input[(32 - 28)];
step[28] = -input[28] + input[(32 - 29)];
step[29] = -input[29] + input[(32 - 30)];
step[30] = -input[30] + input[(32 - 31)];
step[31] = -input[31] + input[(32 - 32)];
// Stage 2
output[0] = step[0] + step[16 - 1];
output[1] = step[1] + step[16 - 2];
output[2] = step[2] + step[16 - 3];
output[3] = step[3] + step[16 - 4];
output[4] = step[4] + step[16 - 5];
output[5] = step[5] + step[16 - 6];
output[6] = step[6] + step[16 - 7];
output[7] = step[7] + step[16 - 8];
output[8] = -step[8] + step[16 - 9];
output[9] = -step[9] + step[16 - 10];
output[10] = -step[10] + step[16 - 11];
output[11] = -step[11] + step[16 - 12];
output[12] = -step[12] + step[16 - 13];
output[13] = -step[13] + step[16 - 14];
output[14] = -step[14] + step[16 - 15];
output[15] = -step[15] + step[16 - 16];
output[16] = step[16];
output[17] = step[17];
output[18] = step[18];
output[19] = step[19];
output[20] = dct_32_round((-step[20] + step[27]) * cospi_16_64);
output[21] = dct_32_round((-step[21] + step[26]) * cospi_16_64);
output[22] = dct_32_round((-step[22] + step[25]) * cospi_16_64);
output[23] = dct_32_round((-step[23] + step[24]) * cospi_16_64);
output[24] = dct_32_round((step[24] + step[23]) * cospi_16_64);
output[25] = dct_32_round((step[25] + step[22]) * cospi_16_64);
output[26] = dct_32_round((step[26] + step[21]) * cospi_16_64);
output[27] = dct_32_round((step[27] + step[20]) * cospi_16_64);
output[28] = step[28];
output[29] = step[29];
output[30] = step[30];
output[31] = step[31];
// dump the magnitude by 4, hence the intermediate values are within
// the range of 16 bits.
if (round) {
output[0] = half_round_shift(output[0]);
output[1] = half_round_shift(output[1]);
output[2] = half_round_shift(output[2]);
output[3] = half_round_shift(output[3]);
output[4] = half_round_shift(output[4]);
output[5] = half_round_shift(output[5]);
output[6] = half_round_shift(output[6]);
output[7] = half_round_shift(output[7]);
output[8] = half_round_shift(output[8]);
output[9] = half_round_shift(output[9]);
output[10] = half_round_shift(output[10]);
output[11] = half_round_shift(output[11]);
output[12] = half_round_shift(output[12]);
output[13] = half_round_shift(output[13]);
output[14] = half_round_shift(output[14]);
output[15] = half_round_shift(output[15]);
output[16] = half_round_shift(output[16]);
output[17] = half_round_shift(output[17]);
output[18] = half_round_shift(output[18]);
output[19] = half_round_shift(output[19]);
output[20] = half_round_shift(output[20]);
output[21] = half_round_shift(output[21]);
output[22] = half_round_shift(output[22]);
output[23] = half_round_shift(output[23]);
output[24] = half_round_shift(output[24]);
output[25] = half_round_shift(output[25]);
output[26] = half_round_shift(output[26]);
output[27] = half_round_shift(output[27]);
output[28] = half_round_shift(output[28]);
output[29] = half_round_shift(output[29]);
output[30] = half_round_shift(output[30]);
output[31] = half_round_shift(output[31]);
}
// Stage 3
step[0] = output[0] + output[(8 - 1)];
step[1] = output[1] + output[(8 - 2)];
step[2] = output[2] + output[(8 - 3)];
step[3] = output[3] + output[(8 - 4)];
step[4] = -output[4] + output[(8 - 5)];
step[5] = -output[5] + output[(8 - 6)];
step[6] = -output[6] + output[(8 - 7)];
step[7] = -output[7] + output[(8 - 8)];
step[8] = output[8];
step[9] = output[9];
step[10] = dct_32_round((-output[10] + output[13]) * cospi_16_64);
step[11] = dct_32_round((-output[11] + output[12]) * cospi_16_64);
step[12] = dct_32_round((output[12] + output[11]) * cospi_16_64);
step[13] = dct_32_round((output[13] + output[10]) * cospi_16_64);
step[14] = output[14];
step[15] = output[15];
step[16] = output[16] + output[23];
step[17] = output[17] + output[22];
step[18] = output[18] + output[21];
step[19] = output[19] + output[20];
step[20] = -output[20] + output[19];
step[21] = -output[21] + output[18];
step[22] = -output[22] + output[17];
step[23] = -output[23] + output[16];
step[24] = -output[24] + output[31];
step[25] = -output[25] + output[30];
step[26] = -output[26] + output[29];
step[27] = -output[27] + output[28];
step[28] = output[28] + output[27];
step[29] = output[29] + output[26];
step[30] = output[30] + output[25];
step[31] = output[31] + output[24];
// Stage 4
output[0] = step[0] + step[3];
output[1] = step[1] + step[2];
output[2] = -step[2] + step[1];
output[3] = -step[3] + step[0];
output[4] = step[4];
output[5] = dct_32_round((-step[5] + step[6]) * cospi_16_64);
output[6] = dct_32_round((step[6] + step[5]) * cospi_16_64);
output[7] = step[7];
output[8] = step[8] + step[11];
output[9] = step[9] + step[10];
output[10] = -step[10] + step[9];
output[11] = -step[11] + step[8];
output[12] = -step[12] + step[15];
output[13] = -step[13] + step[14];
output[14] = step[14] + step[13];
output[15] = step[15] + step[12];
output[16] = step[16];
output[17] = step[17];
output[18] = dct_32_round(step[18] * -cospi_8_64 + step[29] * cospi_24_64);
output[19] = dct_32_round(step[19] * -cospi_8_64 + step[28] * cospi_24_64);
output[20] = dct_32_round(step[20] * -cospi_24_64 + step[27] * -cospi_8_64);
output[21] = dct_32_round(step[21] * -cospi_24_64 + step[26] * -cospi_8_64);
output[22] = step[22];
output[23] = step[23];
output[24] = step[24];
output[25] = step[25];
output[26] = dct_32_round(step[26] * cospi_24_64 + step[21] * -cospi_8_64);
output[27] = dct_32_round(step[27] * cospi_24_64 + step[20] * -cospi_8_64);
output[28] = dct_32_round(step[28] * cospi_8_64 + step[19] * cospi_24_64);
output[29] = dct_32_round(step[29] * cospi_8_64 + step[18] * cospi_24_64);
output[30] = step[30];
output[31] = step[31];
// Stage 5
step[0] = dct_32_round((output[0] + output[1]) * cospi_16_64);
step[1] = dct_32_round((-output[1] + output[0]) * cospi_16_64);
step[2] = dct_32_round(output[2] * cospi_24_64 + output[3] * cospi_8_64);
step[3] = dct_32_round(output[3] * cospi_24_64 - output[2] * cospi_8_64);
step[4] = output[4] + output[5];
step[5] = -output[5] + output[4];
step[6] = -output[6] + output[7];
step[7] = output[7] + output[6];
step[8] = output[8];
step[9] = dct_32_round(output[9] * -cospi_8_64 + output[14] * cospi_24_64);
step[10] = dct_32_round(output[10] * -cospi_24_64 + output[13] * -cospi_8_64);
step[11] = output[11];
step[12] = output[12];
step[13] = dct_32_round(output[13] * cospi_24_64 + output[10] * -cospi_8_64);
step[14] = dct_32_round(output[14] * cospi_8_64 + output[9] * cospi_24_64);
step[15] = output[15];
step[16] = output[16] + output[19];
step[17] = output[17] + output[18];
step[18] = -output[18] + output[17];
step[19] = -output[19] + output[16];
step[20] = -output[20] + output[23];
step[21] = -output[21] + output[22];
step[22] = output[22] + output[21];
step[23] = output[23] + output[20];
step[24] = output[24] + output[27];
step[25] = output[25] + output[26];
step[26] = -output[26] + output[25];
step[27] = -output[27] + output[24];
step[28] = -output[28] + output[31];
step[29] = -output[29] + output[30];
step[30] = output[30] + output[29];
step[31] = output[31] + output[28];
// Stage 6
output[0] = step[0];
output[1] = step[1];
output[2] = step[2];
output[3] = step[3];
output[4] = dct_32_round(step[4] * cospi_28_64 + step[7] * cospi_4_64);
output[5] = dct_32_round(step[5] * cospi_12_64 + step[6] * cospi_20_64);
output[6] = dct_32_round(step[6] * cospi_12_64 + step[5] * -cospi_20_64);
output[7] = dct_32_round(step[7] * cospi_28_64 + step[4] * -cospi_4_64);
output[8] = step[8] + step[9];
output[9] = -step[9] + step[8];
output[10] = -step[10] + step[11];
output[11] = step[11] + step[10];
output[12] = step[12] + step[13];
output[13] = -step[13] + step[12];
output[14] = -step[14] + step[15];
output[15] = step[15] + step[14];
output[16] = step[16];
output[17] = dct_32_round(step[17] * -cospi_4_64 + step[30] * cospi_28_64);
output[18] = dct_32_round(step[18] * -cospi_28_64 + step[29] * -cospi_4_64);
output[19] = step[19];
output[20] = step[20];
output[21] = dct_32_round(step[21] * -cospi_20_64 + step[26] * cospi_12_64);
output[22] = dct_32_round(step[22] * -cospi_12_64 + step[25] * -cospi_20_64);
output[23] = step[23];
output[24] = step[24];
output[25] = dct_32_round(step[25] * cospi_12_64 + step[22] * -cospi_20_64);
output[26] = dct_32_round(step[26] * cospi_20_64 + step[21] * cospi_12_64);
output[27] = step[27];
output[28] = step[28];
output[29] = dct_32_round(step[29] * cospi_28_64 + step[18] * -cospi_4_64);
output[30] = dct_32_round(step[30] * cospi_4_64 + step[17] * cospi_28_64);
output[31] = step[31];
// Stage 7
step[0] = output[0];
step[1] = output[1];
step[2] = output[2];
step[3] = output[3];
step[4] = output[4];
step[5] = output[5];
step[6] = output[6];
step[7] = output[7];
step[8] = dct_32_round(output[8] * cospi_30_64 + output[15] * cospi_2_64);
step[9] = dct_32_round(output[9] * cospi_14_64 + output[14] * cospi_18_64);
step[10] = dct_32_round(output[10] * cospi_22_64 + output[13] * cospi_10_64);
step[11] = dct_32_round(output[11] * cospi_6_64 + output[12] * cospi_26_64);
step[12] = dct_32_round(output[12] * cospi_6_64 + output[11] * -cospi_26_64);
step[13] = dct_32_round(output[13] * cospi_22_64 + output[10] * -cospi_10_64);
step[14] = dct_32_round(output[14] * cospi_14_64 + output[9] * -cospi_18_64);
step[15] = dct_32_round(output[15] * cospi_30_64 + output[8] * -cospi_2_64);
step[16] = output[16] + output[17];
step[17] = -output[17] + output[16];
step[18] = -output[18] + output[19];
step[19] = output[19] + output[18];
step[20] = output[20] + output[21];
step[21] = -output[21] + output[20];
step[22] = -output[22] + output[23];
step[23] = output[23] + output[22];
step[24] = output[24] + output[25];
step[25] = -output[25] + output[24];
step[26] = -output[26] + output[27];
step[27] = output[27] + output[26];
step[28] = output[28] + output[29];
step[29] = -output[29] + output[28];
step[30] = -output[30] + output[31];
step[31] = output[31] + output[30];
// Final stage --- outputs indices are bit-reversed.
output[0] = step[0];
output[16] = step[1];
output[8] = step[2];
output[24] = step[3];
output[4] = step[4];
output[20] = step[5];
output[12] = step[6];
output[28] = step[7];
output[2] = step[8];
output[18] = step[9];
output[10] = step[10];
output[26] = step[11];
output[6] = step[12];
output[22] = step[13];
output[14] = step[14];
output[30] = step[15];
output[1] = dct_32_round(step[16] * cospi_31_64 + step[31] * cospi_1_64);
output[17] = dct_32_round(step[17] * cospi_15_64 + step[30] * cospi_17_64);
output[9] = dct_32_round(step[18] * cospi_23_64 + step[29] * cospi_9_64);
output[25] = dct_32_round(step[19] * cospi_7_64 + step[28] * cospi_25_64);
output[5] = dct_32_round(step[20] * cospi_27_64 + step[27] * cospi_5_64);
output[21] = dct_32_round(step[21] * cospi_11_64 + step[26] * cospi_21_64);
output[13] = dct_32_round(step[22] * cospi_19_64 + step[25] * cospi_13_64);
output[29] = dct_32_round(step[23] * cospi_3_64 + step[24] * cospi_29_64);
output[3] = dct_32_round(step[24] * cospi_3_64 + step[23] * -cospi_29_64);
output[19] = dct_32_round(step[25] * cospi_19_64 + step[22] * -cospi_13_64);
output[11] = dct_32_round(step[26] * cospi_11_64 + step[21] * -cospi_21_64);
output[27] = dct_32_round(step[27] * cospi_27_64 + step[20] * -cospi_5_64);
output[7] = dct_32_round(step[28] * cospi_7_64 + step[19] * -cospi_25_64);
output[23] = dct_32_round(step[29] * cospi_23_64 + step[18] * -cospi_9_64);
output[15] = dct_32_round(step[30] * cospi_15_64 + step[17] * -cospi_17_64);
output[31] = dct_32_round(step[31] * cospi_31_64 + step[16] * -cospi_1_64);
}
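The "bit-reversed" note in the final stage above means the destination index is the 5-bit reversal of the step index: step[1] (00001b) lands at output[16] (10000b), step[12] (01100b) at output[6] (00110b), and so on, matching the assignments in the listing. A quick stand-alone check:

#include <stdio.h>

/* Reverse the low 5 bits of k: the index mapping used by the final stage. */
static int bitrev5(int k) {
  int i, r = 0;
  for (i = 0; i < 5; ++i)
    r |= ((k >> i) & 1) << (4 - i);
  return r;
}

int main(void) {
  /* step[1] -> output[16], step[12] -> output[6], as in the listing above. */
  printf("%d %d\n", bitrev5(1), bitrev5(12));   /* prints 16 6 */
  return 0;
}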
void vp10_fdct32x32_c(const int16_t *input, tran_low_t *out, int stride) {
int i, j;
tran_high_t output[32 * 32];
// Columns
for (i = 0; i < 32; ++i) {
tran_high_t temp_in[32], temp_out[32];
for (j = 0; j < 32; ++j)
temp_in[j] = input[j * stride + i] * 4;
vp10_fdct32(temp_in, temp_out, 0);
for (j = 0; j < 32; ++j)
output[j * 32 + i] = (temp_out[j] + 1 + (temp_out[j] > 0)) >> 2;
}
// Rows
for (i = 0; i < 32; ++i) {
tran_high_t temp_in[32], temp_out[32];
for (j = 0; j < 32; ++j)
temp_in[j] = output[j + i * 32];
vp10_fdct32(temp_in, temp_out, 0);
for (j = 0; j < 32; ++j)
out[j + i * 32] =
(tran_low_t)((temp_out[j] + 1 + (temp_out[j] < 0)) >> 2);
}
}
// Note that although we use dct_32_round in dct32 computation flow,
// this 2d fdct32x32 for rate-distortion optimization loop is operating
// within 16 bits precision.
void vp10_fdct32x32_rd_c(const int16_t *input, tran_low_t *out, int stride) {
int i, j;
tran_high_t output[32 * 32];
// Columns
for (i = 0; i < 32; ++i) {
tran_high_t temp_in[32], temp_out[32];
for (j = 0; j < 32; ++j)
temp_in[j] = input[j * stride + i] * 4;
vp10_fdct32(temp_in, temp_out, 0);
for (j = 0; j < 32; ++j)
// TODO(cd): see quality impact of only doing
// output[j * 32 + i] = (temp_out[j] + 1) >> 2;
// PS: also change code in vp10_dsp/x86/vp10_dct_sse2.c
output[j * 32 + i] = (temp_out[j] + 1 + (temp_out[j] > 0)) >> 2;
}
// Rows
for (i = 0; i < 32; ++i) {
tran_high_t temp_in[32], temp_out[32];
for (j = 0; j < 32; ++j)
temp_in[j] = output[j + i * 32];
vp10_fdct32(temp_in, temp_out, 1);
for (j = 0; j < 32; ++j)
out[j + i * 32] = (tran_low_t)temp_out[j];
}
}
void vp10_fdct32x32_1_c(const int16_t *input, tran_low_t *output, int stride) {
int r, c;
tran_low_t sum = 0;
for (r = 0; r < 32; ++r)
for (c = 0; c < 32; ++c)
sum += input[r * stride + c];
output[0] = sum >> 3;
output[1] = 0;
}
#if CONFIG_VP9_HIGHBITDEPTH
void vp10_highbd_fdct4x4_c(const int16_t *input, tran_low_t *output,
int stride) {
vp10_fdct4x4_c(input, output, stride);
}
void vp10_highbd_fdct8x8_c(const int16_t *input, tran_low_t *final_output,
int stride) {
vp10_fdct8x8_c(input, final_output, stride);
}
void vp10_highbd_fdct8x8_1_c(const int16_t *input, tran_low_t *final_output,
int stride) {
vp10_fdct8x8_1_c(input, final_output, stride);
}
void vp10_highbd_fdct16x16_c(const int16_t *input, tran_low_t *output,
int stride) {
vp10_fdct16x16_c(input, output, stride);
}
void vp10_highbd_fdct16x16_1_c(const int16_t *input, tran_low_t *output,
int stride) {
vp10_fdct16x16_1_c(input, output, stride);
}
void vp10_highbd_fdct32x32_c(const int16_t *input,
tran_low_t *out, int stride) {
vp10_fdct32x32_c(input, out, stride);
}
void vp10_highbd_fdct32x32_rd_c(const int16_t *input, tran_low_t *out,
int stride) {
vp10_fdct32x32_rd_c(input, out, stride);
}
void vp10_highbd_fdct32x32_1_c(const int16_t *input,
tran_low_t *out, int stride) {
vp10_fdct32x32_1_c(input, out, stride);
}
#endif // CONFIG_VP9_HIGHBITDEPTH

vp10/common/vp10_fwd_txfm.h (new file, 18 lines)

@@ -0,0 +1,18 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef VP10_COMMON_VP10_FWD_TXFM_H_
#define VP10_COMMON_VP10_FWD_TXFM_H_
#include "vpx_dsp/txfm_common.h"
#include "vpx_dsp/fwd_txfm.h"
void vp10_fdct32(const tran_high_t *input, tran_high_t *output, int round);
#endif // VP10_COMMON_VP10_FWD_TXFM_H_

vp10/common/vp10_inv_txfm.c (new file, 2499 lines)

File diff suppressed because it is too large.

vp10/common/vp10_inv_txfm.h (new file, 122 lines)

@@ -0,0 +1,122 @@
/*
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef VPX_DSP_INV_TXFM_H_
#define VPX_DSP_INV_TXFM_H_
#include <assert.h>
#include "./vpx_config.h"
#include "vpx_dsp/txfm_common.h"
#include "vpx_ports/mem.h"
#ifdef __cplusplus
extern "C" {
#endif
static INLINE tran_low_t check_range(tran_high_t input) {
#if CONFIG_COEFFICIENT_RANGE_CHECKING
// For valid VP9 input streams, intermediate stage coefficients should always
// stay within the range of a signed 16 bit integer. Coefficients can go out
// of this range for invalid/corrupt VP9 streams. However, strictly checking
// this range for every intermediate coefficient can burdensome for a decoder,
// therefore the following assertion is only enabled when configured with
// --enable-coefficient-range-checking.
assert(INT16_MIN <= input);
assert(input <= INT16_MAX);
#endif // CONFIG_COEFFICIENT_RANGE_CHECKING
return (tran_low_t)input;
}
static INLINE tran_low_t dct_const_round_shift(tran_high_t input) {
tran_high_t rv = ROUND_POWER_OF_TWO(input, DCT_CONST_BITS);
return check_range(rv);
}
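For a concrete feel for dct_const_round_shift(): with DCT_CONST_BITS = 14 (as in vpx_dsp/txfm_common.h) and cospi_16_64 = 11585, multiplying a value of 64 and rounding back gives 45, i.e. roughly 64 * cos(pi/4). A stand-alone restatement of the arithmetic:

#include <stdio.h>
#include <stdint.h>

/* Stand-alone restatement of the rounding in dct_const_round_shift,
 * assuming DCT_CONST_BITS == 14 as in vpx_dsp/txfm_common.h. */
#define DCT_CONST_BITS 14
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

int main(void) {
  const int64_t cospi_16_64 = 11585;              /* round(2^14 * cos(pi/4)) */
  const int64_t product = 64 * cospi_16_64;       /* 741440 */
  printf("%lld\n", (long long)ROUND_POWER_OF_TWO(product, DCT_CONST_BITS));  /* 45 */
  return 0;
}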
#if CONFIG_VP9_HIGHBITDEPTH
static INLINE tran_low_t highbd_check_range(tran_high_t input,
int bd) {
#if CONFIG_COEFFICIENT_RANGE_CHECKING
// For valid highbitdepth VP9 streams, intermediate stage coefficients will
// stay within the ranges:
// - 8 bit: signed 16 bit integer
// - 10 bit: signed 18 bit integer
// - 12 bit: signed 20 bit integer
const int32_t int_max = (1 << (7 + bd)) - 1;
const int32_t int_min = -int_max - 1;
assert(int_min <= input);
assert(input <= int_max);
(void) int_min;
#endif // CONFIG_COEFFICIENT_RANGE_CHECKING
(void) bd;
return (tran_low_t)input;
}
static INLINE tran_low_t highbd_dct_const_round_shift(tran_high_t input,
int bd) {
tran_high_t rv = ROUND_POWER_OF_TWO(input, DCT_CONST_BITS);
return highbd_check_range(rv, bd);
}
#endif // CONFIG_VP9_HIGHBITDEPTH
#if CONFIG_EMULATE_HARDWARE
// When CONFIG_EMULATE_HARDWARE is 1 the transform performs a
// non-normative method to handle overflows. A stream that causes
// overflows in the inverse transform is considered invalid in VP9,
// and a hardware implementer is free to choose any reasonable
// method to handle overflows. However to aid in hardware
// verification they can use a specific implementation of the
// WRAPLOW() macro below that is identical to their intended
// hardware implementation (and also use configure options to trigger
// the C-implementation of the transform).
//
// The particular WRAPLOW implementation below performs strict
// overflow wrapping to match common hardware implementations.
// bd of 8 uses trans_low with 16bits, need to remove 16bits
// bd of 10 uses trans_low with 18bits, need to remove 14bits
// bd of 12 uses trans_low with 20bits, need to remove 12bits
// bd of x uses trans_low with 8+x bits, need to remove 24-x bits
#define WRAPLOW(x, bd) ((((int32_t)(x)) << (24 - bd)) >> (24 - bd))
#else
#define WRAPLOW(x, bd) ((int32_t)(x))
#endif // CONFIG_EMULATE_HARDWARE
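A concrete illustration of the emulate-hardware WRAPLOW() behaviour above: at bd = 8 the coefficient is sign-extended from its low 16 bits, so 40000 wraps to -25536; at bd = 10 the wrap happens at the 18-bit boundary instead. The helper below is a stand-alone, well-defined restatement of that wrap, not the macro from this header.

#include <stdio.h>
#include <stdint.h>

/* Sign-extend x to the low (8 + bd) bits -- the wrap that the
 * emulate-hardware WRAPLOW() macro achieves with shifts. */
static int32_t wraplow_emulated(int64_t x, int bd) {
  const int64_t range = (int64_t)1 << (8 + bd);        /* 2^16 at bd == 8 */
  int64_t v = (int64_t)((uint64_t)x & (uint64_t)(range - 1));
  if (v >= range / 2) v -= range;                       /* sign-extend */
  return (int32_t)v;
}

int main(void) {
  printf("%d\n", wraplow_emulated(40000, 8));    /* -25536: 16-bit wrap */
  printf("%d\n", wraplow_emulated(200000, 10));  /* -62144: 18-bit wrap */
  return 0;
}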
void vp10_idct4_c(const tran_low_t *input, tran_low_t *output);
void vp10_idct8_c(const tran_low_t *input, tran_low_t *output);
void vp10_idct16_c(const tran_low_t *input, tran_low_t *output);
void vp10_idct32_c(const tran_low_t *input, tran_low_t *output);
void vp10_iadst4_c(const tran_low_t *input, tran_low_t *output);
void vp10_iadst8_c(const tran_low_t *input, tran_low_t *output);
void vp10_iadst16_c(const tran_low_t *input, tran_low_t *output);
#if CONFIG_VP9_HIGHBITDEPTH
void vp10_highbd_idct4_c(const tran_low_t *input, tran_low_t *output, int bd);
void vp10_highbd_idct8_c(const tran_low_t *input, tran_low_t *output, int bd);
void vp10_highbd_idct16_c(const tran_low_t *input, tran_low_t *output, int bd);
void vp10_highbd_iadst4_c(const tran_low_t *input, tran_low_t *output, int bd);
void vp10_highbd_iadst8_c(const tran_low_t *input, tran_low_t *output, int bd);
void vp10_highbd_iadst16_c(const tran_low_t *input, tran_low_t *output, int bd);
static INLINE uint16_t highbd_clip_pixel_add(uint16_t dest, tran_high_t trans,
int bd) {
trans = WRAPLOW(trans, bd);
return clip_pixel_highbd(WRAPLOW(dest + trans, bd), bd);
}
#endif
static INLINE uint8_t clip_pixel_add(uint8_t dest, tran_high_t trans) {
trans = WRAPLOW(trans, 8);
return clip_pixel(WRAPLOW(dest + trans, 8));
}
#ifdef __cplusplus
} // extern "C"
#endif
#endif // VPX_DSP_INV_TXFM_H_


@@ -87,14 +87,127 @@ specialize qw/vp10_filter_by_weight8x8 sse2 msa/;
if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") { if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
# Note as optimized versions of these functions are added we need to add a check to ensure # Note as optimized versions of these functions are added we need to add a check to ensure
# that when CONFIG_EMULATE_HARDWARE is on, it defaults to the C versions only. # that when CONFIG_EMULATE_HARDWARE is on, it defaults to the C versions only.
if (vpx_config("CONFIG_EMULATE_HARDWARE") eq "yes") {
add_proto qw/void vp10_iht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int tx_type";
specialize qw/vp10_iht4x4_16_add/;
add_proto qw/void vp10_iht8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int tx_type";
specialize qw/vp10_iht8x8_64_add/;
add_proto qw/void vp10_iht16x16_256_add/, "const tran_low_t *input, uint8_t *output, int pitch, int tx_type";
specialize qw/vp10_iht16x16_256_add/;
add_proto qw/void vp10_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4/;
add_proto qw/void vp10_fdct4x4_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4_1/;
add_proto qw/void vp10_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8/;
add_proto qw/void vp10_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8_1/;
add_proto qw/void vp10_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16/;
add_proto qw/void vp10_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16_1/;
add_proto qw/void vp10_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32/;
add_proto qw/void vp10_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_rd/;
add_proto qw/void vp10_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_1/;
add_proto qw/void vp10_highbd_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct4x4/;
add_proto qw/void vp10_highbd_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct8x8/;
add_proto qw/void vp10_highbd_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct8x8_1/;
add_proto qw/void vp10_highbd_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct16x16/;
add_proto qw/void vp10_highbd_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct16x16_1/;
add_proto qw/void vp10_highbd_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32/;
add_proto qw/void vp10_highbd_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32_rd/;
add_proto qw/void vp10_highbd_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32_1/;
} else {
add_proto qw/void vp10_iht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int tx_type";
specialize qw/vp10_iht4x4_16_add sse2/;
add_proto qw/void vp10_iht8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int tx_type";
specialize qw/vp10_iht8x8_64_add sse2/;
add_proto qw/void vp10_iht16x16_256_add/, "const tran_low_t *input, uint8_t *output, int pitch, int tx_type";
specialize qw/vp10_iht16x16_256_add/;
add_proto qw/void vp10_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4 sse2/;
add_proto qw/void vp10_fdct4x4_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4_1 sse2/;
add_proto qw/void vp10_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8 sse2/;
add_proto qw/void vp10_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8_1 sse2/;
add_proto qw/void vp10_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16 sse2/;
add_proto qw/void vp10_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16_1 sse2/;
add_proto qw/void vp10_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32 sse2/;
add_proto qw/void vp10_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_rd sse2/;
add_proto qw/void vp10_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_1 sse2/;
add_proto qw/void vp10_highbd_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct4x4 sse2/;
add_proto qw/void vp10_highbd_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct8x8 sse2/;
add_proto qw/void vp10_highbd_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct8x8_1/;
add_proto qw/void vp10_highbd_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct16x16 sse2/;
add_proto qw/void vp10_highbd_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct16x16_1/;
add_proto qw/void vp10_highbd_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32 sse2/;
add_proto qw/void vp10_highbd_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32_rd sse2/;
add_proto qw/void vp10_highbd_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_highbd_fdct32x32_1/;
}
} else {
# Force C versions if CONFIG_EMULATE_HARDWARE is 1
if (vpx_config("CONFIG_EMULATE_HARDWARE") eq "yes") {
@@ -106,6 +219,33 @@ if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
add_proto qw/void vp10_iht16x16_256_add/, "const tran_low_t *input, uint8_t *output, int pitch, int tx_type";
specialize qw/vp10_iht16x16_256_add/;
add_proto qw/void vp10_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4/;
add_proto qw/void vp10_fdct4x4_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4_1/;
add_proto qw/void vp10_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8/;
add_proto qw/void vp10_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8_1/;
add_proto qw/void vp10_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16/;
add_proto qw/void vp10_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16_1/;
add_proto qw/void vp10_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32/;
add_proto qw/void vp10_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_rd/;
add_proto qw/void vp10_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_1/;
} else {
add_proto qw/void vp10_iht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int tx_type";
specialize qw/vp10_iht4x4_16_add sse2 neon dspr2 msa/;
@@ -115,6 +255,33 @@ if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
add_proto qw/void vp10_iht16x16_256_add/, "const tran_low_t *input, uint8_t *output, int pitch, int tx_type";
specialize qw/vp10_iht16x16_256_add sse2 dspr2 msa/;
add_proto qw/void vp10_fdct4x4/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4 sse2/;
add_proto qw/void vp10_fdct4x4_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct4x4_1 sse2/;
add_proto qw/void vp10_fdct8x8/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8 sse2/;
add_proto qw/void vp10_fdct8x8_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct8x8_1 sse2/;
add_proto qw/void vp10_fdct16x16/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16 sse2/;
add_proto qw/void vp10_fdct16x16_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct16x16_1 sse2/;
add_proto qw/void vp10_fdct32x32/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32 sse2/;
add_proto qw/void vp10_fdct32x32_rd/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_rd sse2/;
add_proto qw/void vp10_fdct32x32_1/, "const int16_t *input, tran_low_t *output, int stride";
specialize qw/vp10_fdct32x32_1 sse2/;
}
}
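Each add_proto/specialize pair above feeds the rtcd generator: the prototype becomes a function pointer that vp10_rtcd() binds to the best implementation the CPU supports, falling back to the _c version otherwise. The stand-alone sketch below shows the general shape of that dispatch for one entry; the stub names and the have_sse2 flag are placeholders, not the generated vp10_rtcd.h.

#include <stdint.h>
#include <stdio.h>

/* Stub kernels standing in for the real C and SSE2 forward transforms. */
static void fdct8x8_c_stub(const int16_t *in, int32_t *out, int stride) {
  (void)in; (void)out; (void)stride;  /* real work lives in vp10_fdct8x8_c */
}
static void fdct8x8_sse2_stub(const int16_t *in, int32_t *out, int stride) {
  (void)in; (void)out; (void)stride;  /* real work lives in vp10_fdct8x8_sse2 */
}

/* The generated header exposes a pointer like this for each prototype. */
static void (*vp10_fdct8x8_dispatch)(const int16_t *, int32_t *, int);

static void setup_rtcd_sketch(int have_sse2) {
  vp10_fdct8x8_dispatch = fdct8x8_c_stub;        /* C fallback first */
  if (have_sse2)
    vp10_fdct8x8_dispatch = fdct8x8_sse2_stub;   /* promote when SSE2 is present */
}

int main(void) {
  setup_rtcd_sketch(1);
  printf("dispatch bound: %s\n",
         vp10_fdct8x8_dispatch == fdct8x8_sse2_stub ? "sse2" : "c");
  return 0;
}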
@@ -184,42 +351,6 @@ if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
#
if (vpx_config("CONFIG_VP10_ENCODER") eq "yes") {
add_proto qw/unsigned int vp10_avg_8x8/, "const uint8_t *, int p";
specialize qw/vp10_avg_8x8 sse2 neon msa/;
add_proto qw/unsigned int vp10_avg_4x4/, "const uint8_t *, int p";
specialize qw/vp10_avg_4x4 sse2 msa/;
add_proto qw/void vp10_minmax_8x8/, "const uint8_t *s, int p, const uint8_t *d, int dp, int *min, int *max";
specialize qw/vp10_minmax_8x8 sse2/;
add_proto qw/void vp10_hadamard_8x8/, "int16_t const *src_diff, int src_stride, int16_t *coeff";
specialize qw/vp10_hadamard_8x8 sse2/, "$ssse3_x86_64_x86inc";
add_proto qw/void vp10_hadamard_16x16/, "int16_t const *src_diff, int src_stride, int16_t *coeff";
specialize qw/vp10_hadamard_16x16 sse2/;
add_proto qw/int16_t vp10_satd/, "const int16_t *coeff, int length";
specialize qw/vp10_satd sse2/;
add_proto qw/void vp10_int_pro_row/, "int16_t *hbuf, uint8_t const *ref, const int ref_stride, const int height";
specialize qw/vp10_int_pro_row sse2 neon/;
add_proto qw/int16_t vp10_int_pro_col/, "uint8_t const *ref, const int width";
specialize qw/vp10_int_pro_col sse2 neon/;
add_proto qw/int vp10_vector_var/, "int16_t const *ref, int16_t const *src, const int bwl";
specialize qw/vp10_vector_var neon sse2/;
if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
add_proto qw/unsigned int vp10_highbd_avg_8x8/, "const uint8_t *, int p";
specialize qw/vp10_highbd_avg_8x8/;
add_proto qw/unsigned int vp10_highbd_avg_4x4/, "const uint8_t *, int p";
specialize qw/vp10_highbd_avg_4x4/;
add_proto qw/void vp10_highbd_minmax_8x8/, "const uint8_t *s, int p, const uint8_t *d, int dp, int *min, int *max";
specialize qw/vp10_highbd_minmax_8x8/;
}
# ENCODEMB INVOKE # ENCODEMB INVOKE
# #
@@ -289,6 +420,188 @@ if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
specialize qw/vp10_fwht4x4 msa/, "$mmx_x86inc"; specialize qw/vp10_fwht4x4 msa/, "$mmx_x86inc";
} }
# Inverse transform
if (vpx_config("CONFIG_VP9_HIGHBITDEPTH") eq "yes") {
# Note as optimized versions of these functions are added we need to add a check to ensure
# that when CONFIG_EMULATE_HARDWARE is on, it defaults to the C versions only.
add_proto qw/void vp10_idct4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_1_add/;
add_proto qw/void vp10_idct4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_16_add/;
add_proto qw/void vp10_idct8x8_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_1_add/;
add_proto qw/void vp10_idct8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_64_add/;
add_proto qw/void vp10_idct8x8_12_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_12_add/;
add_proto qw/void vp10_idct16x16_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_1_add/;
add_proto qw/void vp10_idct16x16_256_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_256_add/;
add_proto qw/void vp10_idct16x16_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_10_add/;
add_proto qw/void vp10_idct32x32_1024_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1024_add/;
add_proto qw/void vp10_idct32x32_34_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_34_add/;
add_proto qw/void vp10_idct32x32_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1_add/;
add_proto qw/void vp10_iwht4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_1_add/;
add_proto qw/void vp10_iwht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_16_add/;
add_proto qw/void vp10_highbd_idct4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct4x4_1_add/;
add_proto qw/void vp10_highbd_idct8x8_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct8x8_1_add/;
add_proto qw/void vp10_highbd_idct16x16_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct16x16_1_add/;
add_proto qw/void vp10_highbd_idct32x32_1024_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct32x32_1024_add/;
add_proto qw/void vp10_highbd_idct32x32_34_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct32x32_34_add/;
add_proto qw/void vp10_highbd_idct32x32_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct32x32_1_add/;
add_proto qw/void vp10_highbd_iwht4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_iwht4x4_1_add/;
add_proto qw/void vp10_highbd_iwht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_iwht4x4_16_add/;
# Force C versions if CONFIG_EMULATE_HARDWARE is 1
if (vpx_config("CONFIG_EMULATE_HARDWARE") eq "yes") {
add_proto qw/void vp10_highbd_idct4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct4x4_16_add/;
add_proto qw/void vp10_highbd_idct8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct8x8_64_add/;
add_proto qw/void vp10_highbd_idct8x8_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct8x8_10_add/;
add_proto qw/void vp10_highbd_idct16x16_256_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct16x16_256_add/;
add_proto qw/void vp10_highbd_idct16x16_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct16x16_10_add/;
} else {
add_proto qw/void vp10_highbd_idct4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct4x4_16_add sse2/;
add_proto qw/void vp10_highbd_idct8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct8x8_64_add sse2/;
add_proto qw/void vp10_highbd_idct8x8_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct8x8_10_add sse2/;
add_proto qw/void vp10_highbd_idct16x16_256_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct16x16_256_add sse2/;
add_proto qw/void vp10_highbd_idct16x16_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride, int bd";
specialize qw/vp10_highbd_idct16x16_10_add sse2/;
} # CONFIG_EMULATE_HARDWARE
} else {
# Force C versions if CONFIG_EMULATE_HARDWARE is 1
if (vpx_config("CONFIG_EMULATE_HARDWARE") eq "yes") {
add_proto qw/void vp10_idct4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_1_add/;
add_proto qw/void vp10_idct4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_16_add/;
add_proto qw/void vp10_idct8x8_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_1_add/;
add_proto qw/void vp10_idct8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_64_add/;
add_proto qw/void vp10_idct8x8_12_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_12_add/;
add_proto qw/void vp10_idct16x16_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_1_add/;
add_proto qw/void vp10_idct16x16_256_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_256_add/;
add_proto qw/void vp10_idct16x16_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_10_add/;
add_proto qw/void vp10_idct32x32_1024_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1024_add/;
add_proto qw/void vp10_idct32x32_34_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_34_add/;
add_proto qw/void vp10_idct32x32_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1_add/;
add_proto qw/void vp10_iwht4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_1_add/;
add_proto qw/void vp10_iwht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_16_add/;
} else {
add_proto qw/void vp10_idct4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_1_add sse2/;
add_proto qw/void vp10_idct4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct4x4_16_add sse2/;
add_proto qw/void vp10_idct8x8_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_1_add sse2/;
add_proto qw/void vp10_idct8x8_64_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_64_add sse2/;
add_proto qw/void vp10_idct8x8_12_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct8x8_12_add sse2/;
add_proto qw/void vp10_idct16x16_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_1_add sse2/;
add_proto qw/void vp10_idct16x16_256_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_256_add sse2/;
add_proto qw/void vp10_idct16x16_10_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct16x16_10_add sse2/;
add_proto qw/void vp10_idct32x32_1024_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1024_add sse2/;
add_proto qw/void vp10_idct32x32_34_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_34_add sse2/;
add_proto qw/void vp10_idct32x32_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_idct32x32_1_add sse2/;
add_proto qw/void vp10_iwht4x4_1_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_1_add/;
add_proto qw/void vp10_iwht4x4_16_add/, "const tran_low_t *input, uint8_t *dest, int dest_stride";
specialize qw/vp10_iwht4x4_16_add/;
} # CONFIG_EMULATE_HARDWARE
} # CONFIG_VP9_HIGHBITDEPTH
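These blocks are rtcd (run-time CPU detection) definitions: add_proto declares a function's C prototype and specialize lists which optimized variants may be dispatched at run time; when CONFIG_EMULATE_HARDWARE is enabled no SIMD names are listed, so only the C implementations are ever used. As a rough sketch only (simplified; the real header is emitted by the build from these perl entries), one of the prototypes above resolves approximately like this:

/* Simplified sketch of what rtcd generates for
 *   add_proto qw/void vp10_idct4x4_16_add/, "...";
 *   specialize qw/vp10_idct4x4_16_add sse2/;
 * Illustrative only -- not the literal generated header. */
void vp10_idct4x4_16_add_c(const tran_low_t *input, uint8_t *dest,
                           int dest_stride);
void vp10_idct4x4_16_add_sse2(const tran_low_t *input, uint8_t *dest,
                              int dest_stride);
RTCD_EXTERN void (*vp10_idct4x4_16_add)(const tran_low_t *input,
                                        uint8_t *dest, int dest_stride);

static void setup_rtcd_internal(void) {
  const int flags = x86_simd_caps();
  vp10_idct4x4_16_add = vp10_idct4x4_16_add_c;        /* C fallback */
  if (flags & HAS_SSE2)
    vp10_idct4x4_16_add = vp10_idct4x4_16_add_sse2;   /* only if listed */
}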
# #
# Motion search # Motion search
# #



@@ -12,14 +12,14 @@
#include "vpx_dsp/x86/txfm_common_sse2.h" #include "vpx_dsp/x86/txfm_common_sse2.h"
#include "vpx_ports/mem.h" #include "vpx_ports/mem.h"
void vp10_iht4x4_16_add_sse2(const int16_t *input, uint8_t *dest, int stride, void vp10_iht4x4_16_add_sse2(const tran_low_t *input, uint8_t *dest, int stride,
int tx_type) { int tx_type) {
__m128i in[2]; __m128i in[2];
const __m128i zero = _mm_setzero_si128(); const __m128i zero = _mm_setzero_si128();
const __m128i eight = _mm_set1_epi16(8); const __m128i eight = _mm_set1_epi16(8);
in[0] = _mm_loadu_si128((const __m128i *)(input)); in[0] = load_input_data(input);
in[1] = _mm_loadu_si128((const __m128i *)(input + 8)); in[1] = load_input_data(input + 8);
switch (tx_type) { switch (tx_type) {
case 0: // DCT_DCT case 0: // DCT_DCT
@@ -77,21 +77,21 @@ void vp10_iht4x4_16_add_sse2(const int16_t *input, uint8_t *dest, int stride,
} }
} }
void vp10_iht8x8_64_add_sse2(const int16_t *input, uint8_t *dest, int stride, void vp10_iht8x8_64_add_sse2(const tran_low_t *input, uint8_t *dest, int stride,
int tx_type) { int tx_type) {
__m128i in[8]; __m128i in[8];
const __m128i zero = _mm_setzero_si128(); const __m128i zero = _mm_setzero_si128();
const __m128i final_rounding = _mm_set1_epi16(1 << 4); const __m128i final_rounding = _mm_set1_epi16(1 << 4);
// load input data // load input data
in[0] = _mm_load_si128((const __m128i *)input); in[0] = load_input_data(input);
in[1] = _mm_load_si128((const __m128i *)(input + 8 * 1)); in[1] = load_input_data(input + 8 * 1);
in[2] = _mm_load_si128((const __m128i *)(input + 8 * 2)); in[2] = load_input_data(input + 8 * 2);
in[3] = _mm_load_si128((const __m128i *)(input + 8 * 3)); in[3] = load_input_data(input + 8 * 3);
in[4] = _mm_load_si128((const __m128i *)(input + 8 * 4)); in[4] = load_input_data(input + 8 * 4);
in[5] = _mm_load_si128((const __m128i *)(input + 8 * 5)); in[5] = load_input_data(input + 8 * 5);
in[6] = _mm_load_si128((const __m128i *)(input + 8 * 6)); in[6] = load_input_data(input + 8 * 6);
in[7] = _mm_load_si128((const __m128i *)(input + 8 * 7)); in[7] = load_input_data(input + 8 * 7);
switch (tx_type) { switch (tx_type) {
case 0: // DCT_DCT case 0: // DCT_DCT
@@ -144,8 +144,8 @@ void vp10_iht8x8_64_add_sse2(const int16_t *input, uint8_t *dest, int stride,
RECON_AND_STORE(dest + 7 * stride, in[7]); RECON_AND_STORE(dest + 7 * stride, in[7]);
} }
void vp10_iht16x16_256_add_sse2(const int16_t *input, uint8_t *dest, int stride, void vp10_iht16x16_256_add_sse2(const tran_low_t *input, uint8_t *dest,
int tx_type) { int stride, int tx_type) {
__m128i in0[16], in1[16]; __m128i in0[16], in1[16];
load_buffer_8x16(input, in0); load_buffer_8x16(input, in0);

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -0,0 +1,271 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <emmintrin.h> // SSE2
#include "./vpx_config.h"
#include "vpx_dsp/vpx_dsp_common.h"
#include "vpx_dsp/x86/fwd_txfm_sse2.h"
void vp10_fdct4x4_1_sse2(const int16_t *input, tran_low_t *output, int stride) {
__m128i in0, in1;
__m128i tmp;
const __m128i zero = _mm_setzero_si128();
in0 = _mm_loadl_epi64((const __m128i *)(input + 0 * stride));
in1 = _mm_loadl_epi64((const __m128i *)(input + 1 * stride));
in1 = _mm_unpacklo_epi64(in1, _mm_loadl_epi64((const __m128i *)
(input + 2 * stride)));
in0 = _mm_unpacklo_epi64(in0, _mm_loadl_epi64((const __m128i *)
(input + 3 * stride)));
tmp = _mm_add_epi16(in0, in1);
in0 = _mm_unpacklo_epi16(zero, tmp);
in1 = _mm_unpackhi_epi16(zero, tmp);
in0 = _mm_srai_epi32(in0, 16);
in1 = _mm_srai_epi32(in1, 16);
tmp = _mm_add_epi32(in0, in1);
in0 = _mm_unpacklo_epi32(tmp, zero);
in1 = _mm_unpackhi_epi32(tmp, zero);
tmp = _mm_add_epi32(in0, in1);
in0 = _mm_srli_si128(tmp, 8);
in1 = _mm_add_epi32(tmp, in0);
in0 = _mm_slli_epi32(in1, 1);
store_output(&in0, output);
}
void vp10_fdct8x8_1_sse2(const int16_t *input, tran_low_t *output, int stride) {
__m128i in0 = _mm_load_si128((const __m128i *)(input + 0 * stride));
__m128i in1 = _mm_load_si128((const __m128i *)(input + 1 * stride));
__m128i in2 = _mm_load_si128((const __m128i *)(input + 2 * stride));
__m128i in3 = _mm_load_si128((const __m128i *)(input + 3 * stride));
__m128i u0, u1, sum;
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
in0 = _mm_load_si128((const __m128i *)(input + 4 * stride));
in1 = _mm_load_si128((const __m128i *)(input + 5 * stride));
in2 = _mm_load_si128((const __m128i *)(input + 6 * stride));
in3 = _mm_load_si128((const __m128i *)(input + 7 * stride));
sum = _mm_add_epi16(u0, u1);
in0 = _mm_add_epi16(in0, in1);
in2 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, in0);
u0 = _mm_setzero_si128();
sum = _mm_add_epi16(sum, in2);
in0 = _mm_unpacklo_epi16(u0, sum);
in1 = _mm_unpackhi_epi16(u0, sum);
in0 = _mm_srai_epi32(in0, 16);
in1 = _mm_srai_epi32(in1, 16);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_unpacklo_epi32(sum, u0);
in1 = _mm_unpackhi_epi32(sum, u0);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_srli_si128(sum, 8);
in1 = _mm_add_epi32(sum, in0);
store_output(&in1, output);
}
void vp10_fdct16x16_1_sse2(const int16_t *input, tran_low_t *output,
int stride) {
__m128i in0, in1, in2, in3;
__m128i u0, u1;
__m128i sum = _mm_setzero_si128();
int i;
for (i = 0; i < 2; ++i) {
input += 8 * i;
in0 = _mm_load_si128((const __m128i *)(input + 0 * stride));
in1 = _mm_load_si128((const __m128i *)(input + 1 * stride));
in2 = _mm_load_si128((const __m128i *)(input + 2 * stride));
in3 = _mm_load_si128((const __m128i *)(input + 3 * stride));
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 4 * stride));
in1 = _mm_load_si128((const __m128i *)(input + 5 * stride));
in2 = _mm_load_si128((const __m128i *)(input + 6 * stride));
in3 = _mm_load_si128((const __m128i *)(input + 7 * stride));
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 8 * stride));
in1 = _mm_load_si128((const __m128i *)(input + 9 * stride));
in2 = _mm_load_si128((const __m128i *)(input + 10 * stride));
in3 = _mm_load_si128((const __m128i *)(input + 11 * stride));
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 12 * stride));
in1 = _mm_load_si128((const __m128i *)(input + 13 * stride));
in2 = _mm_load_si128((const __m128i *)(input + 14 * stride));
in3 = _mm_load_si128((const __m128i *)(input + 15 * stride));
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
sum = _mm_add_epi16(sum, u1);
}
u0 = _mm_setzero_si128();
in0 = _mm_unpacklo_epi16(u0, sum);
in1 = _mm_unpackhi_epi16(u0, sum);
in0 = _mm_srai_epi32(in0, 16);
in1 = _mm_srai_epi32(in1, 16);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_unpacklo_epi32(sum, u0);
in1 = _mm_unpackhi_epi32(sum, u0);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_srli_si128(sum, 8);
in1 = _mm_add_epi32(sum, in0);
in1 = _mm_srai_epi32(in1, 1);
store_output(&in1, output);
}
void vp10_fdct32x32_1_sse2(const int16_t *input, tran_low_t *output,
int stride) {
__m128i in0, in1, in2, in3;
__m128i u0, u1;
__m128i sum = _mm_setzero_si128();
int i;
for (i = 0; i < 8; ++i) {
in0 = _mm_load_si128((const __m128i *)(input + 0));
in1 = _mm_load_si128((const __m128i *)(input + 8));
in2 = _mm_load_si128((const __m128i *)(input + 16));
in3 = _mm_load_si128((const __m128i *)(input + 24));
input += stride;
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 0));
in1 = _mm_load_si128((const __m128i *)(input + 8));
in2 = _mm_load_si128((const __m128i *)(input + 16));
in3 = _mm_load_si128((const __m128i *)(input + 24));
input += stride;
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 0));
in1 = _mm_load_si128((const __m128i *)(input + 8));
in2 = _mm_load_si128((const __m128i *)(input + 16));
in3 = _mm_load_si128((const __m128i *)(input + 24));
input += stride;
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
in0 = _mm_load_si128((const __m128i *)(input + 0));
in1 = _mm_load_si128((const __m128i *)(input + 8));
in2 = _mm_load_si128((const __m128i *)(input + 16));
in3 = _mm_load_si128((const __m128i *)(input + 24));
input += stride;
sum = _mm_add_epi16(sum, u1);
u0 = _mm_add_epi16(in0, in1);
u1 = _mm_add_epi16(in2, in3);
sum = _mm_add_epi16(sum, u0);
sum = _mm_add_epi16(sum, u1);
}
u0 = _mm_setzero_si128();
in0 = _mm_unpacklo_epi16(u0, sum);
in1 = _mm_unpackhi_epi16(u0, sum);
in0 = _mm_srai_epi32(in0, 16);
in1 = _mm_srai_epi32(in1, 16);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_unpacklo_epi32(sum, u0);
in1 = _mm_unpackhi_epi32(sum, u0);
sum = _mm_add_epi32(in0, in1);
in0 = _mm_srli_si128(sum, 8);
in1 = _mm_add_epi32(sum, in0);
in1 = _mm_srai_epi32(in1, 3);
store_output(&in1, output);
}
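The four functions above are the DC-only ("_1") forward-transform paths: each one just accumulates the block and applies a per-size scale before writing the DC coefficient. A scalar sketch of what the SSE2 code computes (illustrative; output[0] is the only meaningful coefficient):

/* Scalar sketch (illustrative) of the DC-only paths above. */
static int32_t block_sum(const int16_t *input, int stride, int n) {
  int r, c;
  int32_t sum = 0;
  for (r = 0; r < n; ++r)
    for (c = 0; c < n; ++c)
      sum += input[r * stride + c];
  return sum;
}
/* vp10_fdct4x4_1:   output[0] = block_sum(input, stride, 4)  << 1;
 * vp10_fdct8x8_1:   output[0] = block_sum(input, stride, 8);
 * vp10_fdct16x16_1: output[0] = block_sum(input, stride, 16) >> 1;
 * vp10_fdct32x32_1: output[0] = block_sum(input, stride, 32) >> 3;  */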
#define DCT_HIGH_BIT_DEPTH 0
#define FDCT4x4_2D vp10_fdct4x4_sse2
#define FDCT8x8_2D vp10_fdct8x8_sse2
#define FDCT16x16_2D vp10_fdct16x16_sse2
#include "vp10/common/x86/vp10_fwd_txfm_impl_sse2.h"
#undef FDCT4x4_2D
#undef FDCT8x8_2D
#undef FDCT16x16_2D
#define FDCT32x32_2D vp10_fdct32x32_rd_sse2
#define FDCT32x32_HIGH_PRECISION 0
#include "vp10/common/x86/vp10_fwd_dct32x32_impl_sse2.h"
#undef FDCT32x32_2D
#undef FDCT32x32_HIGH_PRECISION
#define FDCT32x32_2D vp10_fdct32x32_sse2
#define FDCT32x32_HIGH_PRECISION 1
#include "vp10/common/x86/vp10_fwd_dct32x32_impl_sse2.h" // NOLINT
#undef FDCT32x32_2D
#undef FDCT32x32_HIGH_PRECISION
#undef DCT_HIGH_BIT_DEPTH
#if CONFIG_VP9_HIGHBITDEPTH
#define DCT_HIGH_BIT_DEPTH 1
#define FDCT4x4_2D vp10_highbd_fdct4x4_sse2
#define FDCT8x8_2D vp10_highbd_fdct8x8_sse2
#define FDCT16x16_2D vp10_highbd_fdct16x16_sse2
#include "vp10/common/x86/vp10_fwd_txfm_impl_sse2.h" // NOLINT
#undef FDCT4x4_2D
#undef FDCT8x8_2D
#undef FDCT16x16_2D
#define FDCT32x32_2D vp10_highbd_fdct32x32_rd_sse2
#define FDCT32x32_HIGH_PRECISION 0
#include "vp10/common/x86/vp10_fwd_dct32x32_impl_sse2.h" // NOLINT
#undef FDCT32x32_2D
#undef FDCT32x32_HIGH_PRECISION
#define FDCT32x32_2D vp10_highbd_fdct32x32_sse2
#define FDCT32x32_HIGH_PRECISION 1
#include "vp10/common/x86/vp10_fwd_dct32x32_impl_sse2.h" // NOLINT
#undef FDCT32x32_2D
#undef FDCT32x32_HIGH_PRECISION
#undef DCT_HIGH_BIT_DEPTH
#endif // CONFIG_VP9_HIGHBITDEPTH

File diff suppressed because it is too large


@@ -0,0 +1,184 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef VPX_DSP_X86_INV_TXFM_SSE2_H_
#define VPX_DSP_X86_INV_TXFM_SSE2_H_
#include <emmintrin.h> // SSE2
#include "./vpx_config.h"
#include "vpx/vpx_integer.h"
#include "vp10/common/vp10_inv_txfm.h"
// perform 8x8 transpose
static INLINE void array_transpose_8x8(__m128i *in, __m128i *res) {
const __m128i tr0_0 = _mm_unpacklo_epi16(in[0], in[1]);
const __m128i tr0_1 = _mm_unpacklo_epi16(in[2], in[3]);
const __m128i tr0_2 = _mm_unpackhi_epi16(in[0], in[1]);
const __m128i tr0_3 = _mm_unpackhi_epi16(in[2], in[3]);
const __m128i tr0_4 = _mm_unpacklo_epi16(in[4], in[5]);
const __m128i tr0_5 = _mm_unpacklo_epi16(in[6], in[7]);
const __m128i tr0_6 = _mm_unpackhi_epi16(in[4], in[5]);
const __m128i tr0_7 = _mm_unpackhi_epi16(in[6], in[7]);
const __m128i tr1_0 = _mm_unpacklo_epi32(tr0_0, tr0_1);
const __m128i tr1_1 = _mm_unpacklo_epi32(tr0_4, tr0_5);
const __m128i tr1_2 = _mm_unpackhi_epi32(tr0_0, tr0_1);
const __m128i tr1_3 = _mm_unpackhi_epi32(tr0_4, tr0_5);
const __m128i tr1_4 = _mm_unpacklo_epi32(tr0_2, tr0_3);
const __m128i tr1_5 = _mm_unpacklo_epi32(tr0_6, tr0_7);
const __m128i tr1_6 = _mm_unpackhi_epi32(tr0_2, tr0_3);
const __m128i tr1_7 = _mm_unpackhi_epi32(tr0_6, tr0_7);
res[0] = _mm_unpacklo_epi64(tr1_0, tr1_1);
res[1] = _mm_unpackhi_epi64(tr1_0, tr1_1);
res[2] = _mm_unpacklo_epi64(tr1_2, tr1_3);
res[3] = _mm_unpackhi_epi64(tr1_2, tr1_3);
res[4] = _mm_unpacklo_epi64(tr1_4, tr1_5);
res[5] = _mm_unpackhi_epi64(tr1_4, tr1_5);
res[6] = _mm_unpacklo_epi64(tr1_6, tr1_7);
res[7] = _mm_unpackhi_epi64(tr1_6, tr1_7);
}
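For reference, the unpack/interleave sequence above is just an 8x8 transpose of 16-bit lanes; a scalar sketch (illustrative, not part of the patch):

/* Scalar equivalent (illustrative) of array_transpose_8x8: each __m128i
 * holds one row of eight int16_t values, and the result is the matrix
 * transpose res[j][i] = in[i][j]. */
static void transpose_8x8_ref(const int16_t in[8][8], int16_t res[8][8]) {
  int i, j;
  for (i = 0; i < 8; ++i)
    for (j = 0; j < 8; ++j)
      res[j][i] = in[i][j];
}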
#define TRANSPOSE_8X4(in0, in1, in2, in3, out0, out1) \
{ \
const __m128i tr0_0 = _mm_unpacklo_epi16(in0, in1); \
const __m128i tr0_1 = _mm_unpacklo_epi16(in2, in3); \
\
in0 = _mm_unpacklo_epi32(tr0_0, tr0_1); /* i1 i0 */ \
in1 = _mm_unpackhi_epi32(tr0_0, tr0_1); /* i3 i2 */ \
}
static INLINE void array_transpose_4X8(__m128i *in, __m128i * out) {
const __m128i tr0_0 = _mm_unpacklo_epi16(in[0], in[1]);
const __m128i tr0_1 = _mm_unpacklo_epi16(in[2], in[3]);
const __m128i tr0_4 = _mm_unpacklo_epi16(in[4], in[5]);
const __m128i tr0_5 = _mm_unpacklo_epi16(in[6], in[7]);
const __m128i tr1_0 = _mm_unpacklo_epi32(tr0_0, tr0_1);
const __m128i tr1_2 = _mm_unpackhi_epi32(tr0_0, tr0_1);
const __m128i tr1_4 = _mm_unpacklo_epi32(tr0_4, tr0_5);
const __m128i tr1_6 = _mm_unpackhi_epi32(tr0_4, tr0_5);
out[0] = _mm_unpacklo_epi64(tr1_0, tr1_4);
out[1] = _mm_unpackhi_epi64(tr1_0, tr1_4);
out[2] = _mm_unpacklo_epi64(tr1_2, tr1_6);
out[3] = _mm_unpackhi_epi64(tr1_2, tr1_6);
}
static INLINE void array_transpose_16x16(__m128i *res0, __m128i *res1) {
__m128i tbuf[8];
array_transpose_8x8(res0, res0);
array_transpose_8x8(res1, tbuf);
array_transpose_8x8(res0 + 8, res1);
array_transpose_8x8(res1 + 8, res1 + 8);
res0[8] = tbuf[0];
res0[9] = tbuf[1];
res0[10] = tbuf[2];
res0[11] = tbuf[3];
res0[12] = tbuf[4];
res0[13] = tbuf[5];
res0[14] = tbuf[6];
res0[15] = tbuf[7];
}
static INLINE void load_buffer_8x16(const int16_t *input, __m128i *in) {
in[0] = _mm_load_si128((const __m128i *)(input + 0 * 16));
in[1] = _mm_load_si128((const __m128i *)(input + 1 * 16));
in[2] = _mm_load_si128((const __m128i *)(input + 2 * 16));
in[3] = _mm_load_si128((const __m128i *)(input + 3 * 16));
in[4] = _mm_load_si128((const __m128i *)(input + 4 * 16));
in[5] = _mm_load_si128((const __m128i *)(input + 5 * 16));
in[6] = _mm_load_si128((const __m128i *)(input + 6 * 16));
in[7] = _mm_load_si128((const __m128i *)(input + 7 * 16));
in[8] = _mm_load_si128((const __m128i *)(input + 8 * 16));
in[9] = _mm_load_si128((const __m128i *)(input + 9 * 16));
in[10] = _mm_load_si128((const __m128i *)(input + 10 * 16));
in[11] = _mm_load_si128((const __m128i *)(input + 11 * 16));
in[12] = _mm_load_si128((const __m128i *)(input + 12 * 16));
in[13] = _mm_load_si128((const __m128i *)(input + 13 * 16));
in[14] = _mm_load_si128((const __m128i *)(input + 14 * 16));
in[15] = _mm_load_si128((const __m128i *)(input + 15 * 16));
}
#define RECON_AND_STORE(dest, in_x) \
{ \
__m128i d0 = _mm_loadl_epi64((__m128i *)(dest)); \
d0 = _mm_unpacklo_epi8(d0, zero); \
d0 = _mm_add_epi16(in_x, d0); \
d0 = _mm_packus_epi16(d0, d0); \
_mm_storel_epi64((__m128i *)(dest), d0); \
}
static INLINE void write_buffer_8x16(uint8_t *dest, __m128i *in, int stride) {
const __m128i final_rounding = _mm_set1_epi16(1<<5);
const __m128i zero = _mm_setzero_si128();
// Final rounding and shift
in[0] = _mm_adds_epi16(in[0], final_rounding);
in[1] = _mm_adds_epi16(in[1], final_rounding);
in[2] = _mm_adds_epi16(in[2], final_rounding);
in[3] = _mm_adds_epi16(in[3], final_rounding);
in[4] = _mm_adds_epi16(in[4], final_rounding);
in[5] = _mm_adds_epi16(in[5], final_rounding);
in[6] = _mm_adds_epi16(in[6], final_rounding);
in[7] = _mm_adds_epi16(in[7], final_rounding);
in[8] = _mm_adds_epi16(in[8], final_rounding);
in[9] = _mm_adds_epi16(in[9], final_rounding);
in[10] = _mm_adds_epi16(in[10], final_rounding);
in[11] = _mm_adds_epi16(in[11], final_rounding);
in[12] = _mm_adds_epi16(in[12], final_rounding);
in[13] = _mm_adds_epi16(in[13], final_rounding);
in[14] = _mm_adds_epi16(in[14], final_rounding);
in[15] = _mm_adds_epi16(in[15], final_rounding);
in[0] = _mm_srai_epi16(in[0], 6);
in[1] = _mm_srai_epi16(in[1], 6);
in[2] = _mm_srai_epi16(in[2], 6);
in[3] = _mm_srai_epi16(in[3], 6);
in[4] = _mm_srai_epi16(in[4], 6);
in[5] = _mm_srai_epi16(in[5], 6);
in[6] = _mm_srai_epi16(in[6], 6);
in[7] = _mm_srai_epi16(in[7], 6);
in[8] = _mm_srai_epi16(in[8], 6);
in[9] = _mm_srai_epi16(in[9], 6);
in[10] = _mm_srai_epi16(in[10], 6);
in[11] = _mm_srai_epi16(in[11], 6);
in[12] = _mm_srai_epi16(in[12], 6);
in[13] = _mm_srai_epi16(in[13], 6);
in[14] = _mm_srai_epi16(in[14], 6);
in[15] = _mm_srai_epi16(in[15], 6);
RECON_AND_STORE(dest + 0 * stride, in[0]);
RECON_AND_STORE(dest + 1 * stride, in[1]);
RECON_AND_STORE(dest + 2 * stride, in[2]);
RECON_AND_STORE(dest + 3 * stride, in[3]);
RECON_AND_STORE(dest + 4 * stride, in[4]);
RECON_AND_STORE(dest + 5 * stride, in[5]);
RECON_AND_STORE(dest + 6 * stride, in[6]);
RECON_AND_STORE(dest + 7 * stride, in[7]);
RECON_AND_STORE(dest + 8 * stride, in[8]);
RECON_AND_STORE(dest + 9 * stride, in[9]);
RECON_AND_STORE(dest + 10 * stride, in[10]);
RECON_AND_STORE(dest + 11 * stride, in[11]);
RECON_AND_STORE(dest + 12 * stride, in[12]);
RECON_AND_STORE(dest + 13 * stride, in[13]);
RECON_AND_STORE(dest + 14 * stride, in[14]);
RECON_AND_STORE(dest + 15 * stride, in[15]);
}
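At this point the inverse-transform output is still scaled by 64, so the adds-then-shift above rounds each residual to the nearest integer ((x + 32) >> 6) before RECON_AND_STORE adds it to the predictor and clamps to 8 bits. A scalar sketch (illustrative) of one reconstructed row:

/* Scalar sketch (illustrative) of the rounding + reconstruction above. */
static void recon_row_ref(uint8_t *dest, const int16_t *residual, int w) {
  int i;
  for (i = 0; i < w; ++i) {
    const int v = dest[i] + ((residual[i] + 32) >> 6); /* round, add pred */
    dest[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v)); /* clamp 0..255 */
  }
}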
void idct4_sse2(__m128i *in);
void idct8_sse2(__m128i *in);
void idct16_sse2(__m128i *in0, __m128i *in1);
void iadst4_sse2(__m128i *in);
void iadst8_sse2(__m128i *in);
void iadst16_sse2(__m128i *in0, __m128i *in1);
#endif // VPX_DSP_X86_INV_TXFM_SSE2_H_


@@ -47,6 +47,8 @@
static int is_compound_reference_allowed(const VP10_COMMON *cm) { static int is_compound_reference_allowed(const VP10_COMMON *cm) {
int i; int i;
if (frame_is_intra_only(cm))
return 0;
for (i = 1; i < REFS_PER_FRAME; ++i) for (i = 1; i < REFS_PER_FRAME; ++i)
if (cm->ref_frame_sign_bias[i + 1] != cm->ref_frame_sign_bias[1]) if (cm->ref_frame_sign_bias[i + 1] != cm->ref_frame_sign_bias[1])
return 1; return 1;
@@ -81,12 +83,18 @@ static int decode_unsigned_max(struct vpx_read_bit_buffer *rb, int max) {
return data > max ? max : data; return data > max ? max : data;
} }
#if CONFIG_MISC_FIXES
static TX_MODE read_tx_mode(struct vpx_read_bit_buffer *rb) {
return vpx_rb_read_bit(rb) ? TX_MODE_SELECT : vpx_rb_read_literal(rb, 2);
}
#else
static TX_MODE read_tx_mode(vpx_reader *r) { static TX_MODE read_tx_mode(vpx_reader *r) {
TX_MODE tx_mode = vpx_read_literal(r, 2); TX_MODE tx_mode = vpx_read_literal(r, 2);
if (tx_mode == ALLOW_32X32) if (tx_mode == ALLOW_32X32)
tx_mode += vpx_read_bit(r); tx_mode += vpx_read_bit(r);
return tx_mode; return tx_mode;
} }
#endif
static void read_tx_mode_probs(struct tx_probs *tx_probs, vpx_reader *r) { static void read_tx_mode_probs(struct tx_probs *tx_probs, vpx_reader *r) {
int i, j; int i, j;
@@ -118,6 +126,18 @@ static void read_inter_mode_probs(FRAME_CONTEXT *fc, vpx_reader *r) {
vp10_diff_update_prob(r, &fc->inter_mode_probs[i][j]); vp10_diff_update_prob(r, &fc->inter_mode_probs[i][j]);
} }
#if CONFIG_MISC_FIXES
static REFERENCE_MODE read_frame_reference_mode(const VP10_COMMON *cm,
struct vpx_read_bit_buffer *rb) {
if (is_compound_reference_allowed(cm)) {
return vpx_rb_read_bit(rb) ? REFERENCE_MODE_SELECT
: (vpx_rb_read_bit(rb) ? COMPOUND_REFERENCE
: SINGLE_REFERENCE);
} else {
return SINGLE_REFERENCE;
}
}
#else
static REFERENCE_MODE read_frame_reference_mode(const VP10_COMMON *cm, static REFERENCE_MODE read_frame_reference_mode(const VP10_COMMON *cm,
vpx_reader *r) { vpx_reader *r) {
if (is_compound_reference_allowed(cm)) { if (is_compound_reference_allowed(cm)) {
@@ -128,6 +148,7 @@ static REFERENCE_MODE read_frame_reference_mode(const VP10_COMMON *cm,
return SINGLE_REFERENCE; return SINGLE_REFERENCE;
} }
} }
#endif
static void read_frame_reference_mode_probs(VP10_COMMON *cm, vpx_reader *r) { static void read_frame_reference_mode_probs(VP10_COMMON *cm, vpx_reader *r) {
FRAME_CONTEXT *const fc = cm->fc; FRAME_CONTEXT *const fc = cm->fc;
@@ -151,8 +172,12 @@ static void read_frame_reference_mode_probs(VP10_COMMON *cm, vpx_reader *r) {
static void update_mv_probs(vpx_prob *p, int n, vpx_reader *r) { static void update_mv_probs(vpx_prob *p, int n, vpx_reader *r) {
int i; int i;
for (i = 0; i < n; ++i) for (i = 0; i < n; ++i)
#if CONFIG_MISC_FIXES
vp10_diff_update_prob(r, &p[i]);
#else
if (vpx_read(r, MV_UPDATE_PROB)) if (vpx_read(r, MV_UPDATE_PROB))
p[i] = (vpx_read_literal(r, 7) << 1) | 1; p[i] = (vpx_read_literal(r, 7) << 1) | 1;
#endif
} }
static void read_mv_probs(nmv_context *ctx, int allow_hp, vpx_reader *r) { static void read_mv_probs(nmv_context *ctx, int allow_hp, vpx_reader *r) {
@@ -190,6 +215,7 @@ static void inverse_transform_block_inter(MACROBLOCKD* xd, int plane,
int eob, int block) { int eob, int block) {
struct macroblockd_plane *const pd = &xd->plane[plane]; struct macroblockd_plane *const pd = &xd->plane[plane];
TX_TYPE tx_type = get_tx_type(pd->plane_type, xd, block); TX_TYPE tx_type = get_tx_type(pd->plane_type, xd, block);
const int seg_id = xd->mi[0]->mbmi.segment_id;
if (eob > 0) { if (eob > 0) {
tran_low_t *const dqcoeff = pd->dqcoeff; tran_low_t *const dqcoeff = pd->dqcoeff;
#if CONFIG_VP9_HIGHBITDEPTH #if CONFIG_VP9_HIGHBITDEPTH
@@ -197,9 +223,7 @@ static void inverse_transform_block_inter(MACROBLOCKD* xd, int plane,
switch (tx_size) { switch (tx_size) {
case TX_4X4: case TX_4X4:
vp10_highbd_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, xd->bd, vp10_highbd_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, xd->bd,
tx_type, xd->lossless ? tx_type, xd->lossless[seg_id]);
vp10_highbd_iwht4x4_add :
vp10_highbd_idct4x4_add);
break; break;
case TX_8X8: case TX_8X8:
vp10_highbd_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, xd->bd, vp10_highbd_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, xd->bd,
@@ -222,8 +246,7 @@ static void inverse_transform_block_inter(MACROBLOCKD* xd, int plane,
switch (tx_size) { switch (tx_size) {
case TX_4X4: case TX_4X4:
vp10_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, tx_type, vp10_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, tx_type,
xd->lossless ? vp10_iwht4x4_add : xd->lossless[seg_id]);
vp10_idct4x4_add);
break; break;
case TX_8X8: case TX_8X8:
vp10_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, tx_type); vp10_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, tx_type);
@@ -261,6 +284,7 @@ static void inverse_transform_block_intra(MACROBLOCKD* xd, int plane,
uint8_t *dst, int stride, uint8_t *dst, int stride,
int eob) { int eob) {
struct macroblockd_plane *const pd = &xd->plane[plane]; struct macroblockd_plane *const pd = &xd->plane[plane];
const int seg_id = xd->mi[0]->mbmi.segment_id;
if (eob > 0) { if (eob > 0) {
tran_low_t *const dqcoeff = pd->dqcoeff; tran_low_t *const dqcoeff = pd->dqcoeff;
#if CONFIG_VP9_HIGHBITDEPTH #if CONFIG_VP9_HIGHBITDEPTH
@@ -268,9 +292,7 @@ static void inverse_transform_block_intra(MACROBLOCKD* xd, int plane,
switch (tx_size) { switch (tx_size) {
case TX_4X4: case TX_4X4:
vp10_highbd_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, xd->bd, vp10_highbd_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, xd->bd,
tx_type, xd->lossless ? tx_type, xd->lossless[seg_id]);
vp10_highbd_iwht4x4_add :
vp10_highbd_idct4x4_add);
break; break;
case TX_8X8: case TX_8X8:
vp10_highbd_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, xd->bd, vp10_highbd_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, xd->bd,
@@ -293,8 +315,7 @@ static void inverse_transform_block_intra(MACROBLOCKD* xd, int plane,
switch (tx_size) { switch (tx_size) {
case TX_4X4: case TX_4X4:
vp10_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, tx_type, vp10_inv_txfm_add_4x4(dqcoeff, dst, stride, eob, tx_type,
xd->lossless ? vp10_iwht4x4_add : xd->lossless[seg_id]);
vp10_idct4x4_add);
break; break;
case TX_8X8: case TX_8X8:
vp10_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, tx_type); vp10_inv_txfm_add_8x8(dqcoeff, dst, stride, eob, tx_type);
@@ -343,7 +364,7 @@ static void predict_and_reconstruct_intra_block(MACROBLOCKD *const xd,
if (plane == 0) if (plane == 0)
mode = xd->mi[0]->bmi[(row << 1) + col].as_mode; mode = xd->mi[0]->bmi[(row << 1) + col].as_mode;
vp10_predict_intra_block(xd, pd->n4_wl, tx_size, mode, vp10_predict_intra_block(xd, pd->n4_wl, pd->n4_hl, tx_size, mode,
dst, pd->dst.stride, dst, pd->dst.stride, dst, pd->dst.stride, dst, pd->dst.stride,
col, row, plane); col, row, plane);
@@ -527,6 +548,7 @@ static void dec_build_inter_predictors(VP10Decoder *const pbi, MACROBLOCKD *xd,
struct buf_2d *dst_buf, const MV* mv, struct buf_2d *dst_buf, const MV* mv,
RefCntBuffer *ref_frame_buf, RefCntBuffer *ref_frame_buf,
int is_scaled, int ref) { int is_scaled, int ref) {
VP10_COMMON *const cm = &pbi->common;
struct macroblockd_plane *const pd = &xd->plane[plane]; struct macroblockd_plane *const pd = &xd->plane[plane];
uint8_t *const dst = dst_buf->buf + dst_buf->stride * y + x; uint8_t *const dst = dst_buf->buf + dst_buf->stride * y + x;
MV32 scaled_mv; MV32 scaled_mv;
@@ -623,7 +645,7 @@ static void dec_build_inter_predictors(VP10Decoder *const pbi, MACROBLOCKD *xd,
// Wait until reference block is ready. Pad 7 more pixels as last 7 // Wait until reference block is ready. Pad 7 more pixels as last 7
// pixels of each superblock row can be changed by next superblock row. // pixels of each superblock row can be changed by next superblock row.
if (pbi->frame_parallel_decode) if (cm->frame_parallel_decode)
vp10_frameworker_wait(pbi->frame_worker_owner, ref_frame_buf, vp10_frameworker_wait(pbi->frame_worker_owner, ref_frame_buf,
VPXMAX(0, (y1 + 7)) << (plane == 0 ? 0 : 1)); VPXMAX(0, (y1 + 7)) << (plane == 0 ? 0 : 1));
@@ -650,7 +672,7 @@ static void dec_build_inter_predictors(VP10Decoder *const pbi, MACROBLOCKD *xd,
} else { } else {
// Wait until reference block is ready. Pad 7 more pixels as last 7 // Wait until reference block is ready. Pad 7 more pixels as last 7
// pixels of each superblock row can be changed by next superblock row. // pixels of each superblock row can be changed by next superblock row.
if (pbi->frame_parallel_decode) { if (cm->frame_parallel_decode) {
const int y1 = (y0_16 + (h - 1) * ys) >> SUBPEL_BITS; const int y1 = (y0_16 + (h - 1) * ys) >> SUBPEL_BITS;
vp10_frameworker_wait(pbi->frame_worker_owner, ref_frame_buf, vp10_frameworker_wait(pbi->frame_worker_owner, ref_frame_buf,
VPXMAX(0, (y1 + 7)) << (plane == 0 ? 0 : 1)); VPXMAX(0, (y1 + 7)) << (plane == 0 ? 0 : 1));
@@ -700,12 +722,19 @@ static void dec_build_inter_predictors_sb(VP10Decoder *const pbi,
const int is_scaled = vp10_is_scaled(sf); const int is_scaled = vp10_is_scaled(sf);
if (sb_type < BLOCK_8X8) { if (sb_type < BLOCK_8X8) {
int i = 0, x, y; const PARTITION_TYPE bp = BLOCK_8X8 - sb_type;
const int have_vsplit = bp != PARTITION_HORZ;
const int have_hsplit = bp != PARTITION_VERT;
const int num_4x4_w = 2 >> ((!have_vsplit) | pd->subsampling_x);
const int num_4x4_h = 2 >> ((!have_hsplit) | pd->subsampling_y);
const int pw = 8 >> (have_vsplit | pd->subsampling_x);
const int ph = 8 >> (have_hsplit | pd->subsampling_y);
int x, y;
for (y = 0; y < num_4x4_h; ++y) { for (y = 0; y < num_4x4_h; ++y) {
for (x = 0; x < num_4x4_w; ++x) { for (x = 0; x < num_4x4_w; ++x) {
const MV mv = average_split_mvs(pd, mi, ref, i++); const MV mv = average_split_mvs(pd, mi, ref, y * 2 + x);
dec_build_inter_predictors(pbi, xd, plane, n4w_x4, n4h_x4, dec_build_inter_predictors(pbi, xd, plane, n4w_x4, n4h_x4,
4 * x, 4 * y, 4, 4, mi_x, mi_y, kernel, 4 * x, 4 * y, pw, ph, mi_x, mi_y, kernel,
sf, pre_buf, dst_buf, &mv, sf, pre_buf, dst_buf, &mv,
ref_frame_buf, is_scaled, ref); ref_frame_buf, is_scaled, ref);
} }
@@ -857,7 +886,11 @@ static void decode_block(VP10Decoder *const pbi, MACROBLOCKD *const xd,
} }
if (!less8x8 && eobtotal == 0) if (!less8x8 && eobtotal == 0)
#if CONFIG_MISC_FIXES
mbmi->has_no_coeffs = 1; // skip loopfilter
#else
mbmi->skip = 1; // skip loopfilter mbmi->skip = 1; // skip loopfilter
#endif
} }
} }
@@ -890,11 +923,11 @@ static INLINE void dec_update_partition_context(MACROBLOCKD *xd,
memset(left_ctx, partition_context_lookup[subsize].left, bw); memset(left_ctx, partition_context_lookup[subsize].left, bw);
} }
static PARTITION_TYPE read_partition(MACROBLOCKD *xd, int mi_row, int mi_col, static PARTITION_TYPE read_partition(VP10_COMMON *cm, MACROBLOCKD *xd,
vpx_reader *r, int mi_row, int mi_col, vpx_reader *r,
int has_rows, int has_cols, int bsl) { int has_rows, int has_cols, int bsl) {
const int ctx = dec_partition_plane_context(xd, mi_row, mi_col, bsl); const int ctx = dec_partition_plane_context(xd, mi_row, mi_col, bsl);
const vpx_prob *const probs = get_partition_probs(xd, ctx); const vpx_prob *const probs = cm->fc->partition_prob[ctx];
FRAME_COUNTS *counts = xd->counts; FRAME_COUNTS *counts = xd->counts;
PARTITION_TYPE p; PARTITION_TYPE p;
@@ -929,7 +962,7 @@ static void decode_partition(VP10Decoder *const pbi, MACROBLOCKD *const xd,
if (mi_row >= cm->mi_rows || mi_col >= cm->mi_cols) if (mi_row >= cm->mi_rows || mi_col >= cm->mi_cols)
return; return;
partition = read_partition(xd, mi_row, mi_col, r, has_rows, has_cols, partition = read_partition(cm, xd, mi_row, mi_col, r, has_rows, has_cols,
n8x8_l2); n8x8_l2);
subsize = subsize_lookup[partition][bsize]; // get_subsize(bsize, partition); subsize = subsize_lookup[partition][bsize]; // get_subsize(bsize, partition);
if (!hbs) { if (!hbs) {
@@ -1015,6 +1048,9 @@ static void read_coef_probs(FRAME_CONTEXT *fc, TX_MODE tx_mode,
static void setup_segmentation(VP10_COMMON *const cm, static void setup_segmentation(VP10_COMMON *const cm,
struct vpx_read_bit_buffer *rb) { struct vpx_read_bit_buffer *rb) {
struct segmentation *const seg = &cm->seg; struct segmentation *const seg = &cm->seg;
#if !CONFIG_MISC_FIXES
struct segmentation_probs *const segp = &cm->segp;
#endif
int i, j; int i, j;
seg->update_map = 0; seg->update_map = 0;
@@ -1031,23 +1067,26 @@ static void setup_segmentation(VP10_COMMON *const cm,
seg->update_map = vpx_rb_read_bit(rb); seg->update_map = vpx_rb_read_bit(rb);
} }
if (seg->update_map) { if (seg->update_map) {
#if !CONFIG_MISC_FIXES
for (i = 0; i < SEG_TREE_PROBS; i++) for (i = 0; i < SEG_TREE_PROBS; i++)
seg->tree_probs[i] = vpx_rb_read_bit(rb) ? vpx_rb_read_literal(rb, 8) segp->tree_probs[i] = vpx_rb_read_bit(rb) ? vpx_rb_read_literal(rb, 8)
: MAX_PROB; : MAX_PROB;
#endif
if (frame_is_intra_only(cm) || cm->error_resilient_mode) { if (frame_is_intra_only(cm) || cm->error_resilient_mode) {
seg->temporal_update = 0; seg->temporal_update = 0;
} else { } else {
seg->temporal_update = vpx_rb_read_bit(rb); seg->temporal_update = vpx_rb_read_bit(rb);
} }
#if !CONFIG_MISC_FIXES
if (seg->temporal_update) { if (seg->temporal_update) {
for (i = 0; i < PREDICTION_PROBS; i++) for (i = 0; i < PREDICTION_PROBS; i++)
seg->pred_probs[i] = vpx_rb_read_bit(rb) ? vpx_rb_read_literal(rb, 8) segp->pred_probs[i] = vpx_rb_read_bit(rb) ? vpx_rb_read_literal(rb, 8)
: MAX_PROB; : MAX_PROB;
} else { } else {
for (i = 0; i < PREDICTION_PROBS; i++) for (i = 0; i < PREDICTION_PROBS; i++)
seg->pred_probs[i] = MAX_PROB; segp->pred_probs[i] = MAX_PROB;
} }
#endif
} }
// Segmentation data update // Segmentation data update
@@ -1090,34 +1129,27 @@ static void setup_loopfilter(struct loopfilter *lf,
for (i = 0; i < MAX_REF_FRAMES; i++) for (i = 0; i < MAX_REF_FRAMES; i++)
if (vpx_rb_read_bit(rb)) if (vpx_rb_read_bit(rb))
lf->ref_deltas[i] = vpx_rb_read_signed_literal(rb, 6); lf->ref_deltas[i] = vpx_rb_read_inv_signed_literal(rb, 6);
for (i = 0; i < MAX_MODE_LF_DELTAS; i++) for (i = 0; i < MAX_MODE_LF_DELTAS; i++)
if (vpx_rb_read_bit(rb)) if (vpx_rb_read_bit(rb))
lf->mode_deltas[i] = vpx_rb_read_signed_literal(rb, 6); lf->mode_deltas[i] = vpx_rb_read_inv_signed_literal(rb, 6);
} }
} }
} }
static INLINE int read_delta_q(struct vpx_read_bit_buffer *rb) { static INLINE int read_delta_q(struct vpx_read_bit_buffer *rb) {
return vpx_rb_read_bit(rb) ? vpx_rb_read_signed_literal(rb, 4) : 0; return vpx_rb_read_bit(rb) ?
vpx_rb_read_inv_signed_literal(rb, CONFIG_MISC_FIXES ? 6 : 4) : 0;
} }
static void setup_quantization(VP10_COMMON *const cm, MACROBLOCKD *const xd, static void setup_quantization(VP10_COMMON *const cm,
struct vpx_read_bit_buffer *rb) { struct vpx_read_bit_buffer *rb) {
cm->base_qindex = vpx_rb_read_literal(rb, QINDEX_BITS); cm->base_qindex = vpx_rb_read_literal(rb, QINDEX_BITS);
cm->y_dc_delta_q = read_delta_q(rb); cm->y_dc_delta_q = read_delta_q(rb);
cm->uv_dc_delta_q = read_delta_q(rb); cm->uv_dc_delta_q = read_delta_q(rb);
cm->uv_ac_delta_q = read_delta_q(rb); cm->uv_ac_delta_q = read_delta_q(rb);
cm->dequant_bit_depth = cm->bit_depth; cm->dequant_bit_depth = cm->bit_depth;
xd->lossless = cm->base_qindex == 0 &&
cm->y_dc_delta_q == 0 &&
cm->uv_dc_delta_q == 0 &&
cm->uv_ac_delta_q == 0;
#if CONFIG_VP9_HIGHBITDEPTH
xd->bd = (int)cm->bit_depth;
#endif
} }
static void setup_segmentation_dequant(VP10_COMMON *const cm) { static void setup_segmentation_dequant(VP10_COMMON *const cm) {
@@ -1151,12 +1183,12 @@ static INTERP_FILTER read_interp_filter(struct vpx_read_bit_buffer *rb) {
return vpx_rb_read_bit(rb) ? SWITCHABLE : vpx_rb_read_literal(rb, 2); return vpx_rb_read_bit(rb) ? SWITCHABLE : vpx_rb_read_literal(rb, 2);
} }
static void setup_display_size(VP10_COMMON *cm, static void setup_render_size(VP10_COMMON *cm,
struct vpx_read_bit_buffer *rb) { struct vpx_read_bit_buffer *rb) {
cm->display_width = cm->width; cm->render_width = cm->width;
cm->display_height = cm->height; cm->render_height = cm->height;
if (vpx_rb_read_bit(rb)) if (vpx_rb_read_bit(rb))
vp10_read_frame_size(rb, &cm->display_width, &cm->display_height); vp10_read_frame_size(rb, &cm->render_width, &cm->render_height);
} }
static void resize_mv_buffer(VP10_COMMON *cm) { static void resize_mv_buffer(VP10_COMMON *cm) {
@@ -1204,7 +1236,7 @@ static void setup_frame_size(VP10_COMMON *cm, struct vpx_read_bit_buffer *rb) {
BufferPool *const pool = cm->buffer_pool; BufferPool *const pool = cm->buffer_pool;
vp10_read_frame_size(rb, &width, &height); vp10_read_frame_size(rb, &width, &height);
resize_context_buffers(cm, width, height); resize_context_buffers(cm, width, height);
setup_display_size(cm, rb); setup_render_size(cm, rb);
lock_buffer_pool(pool); lock_buffer_pool(pool);
if (vpx_realloc_frame_buffer( if (vpx_realloc_frame_buffer(
@@ -1227,6 +1259,9 @@ static void setup_frame_size(VP10_COMMON *cm, struct vpx_read_bit_buffer *rb) {
pool->frame_bufs[cm->new_fb_idx].buf.subsampling_y = cm->subsampling_y; pool->frame_bufs[cm->new_fb_idx].buf.subsampling_y = cm->subsampling_y;
pool->frame_bufs[cm->new_fb_idx].buf.bit_depth = (unsigned int)cm->bit_depth; pool->frame_bufs[cm->new_fb_idx].buf.bit_depth = (unsigned int)cm->bit_depth;
pool->frame_bufs[cm->new_fb_idx].buf.color_space = cm->color_space; pool->frame_bufs[cm->new_fb_idx].buf.color_space = cm->color_space;
pool->frame_bufs[cm->new_fb_idx].buf.color_range = cm->color_range;
pool->frame_bufs[cm->new_fb_idx].buf.render_width = cm->render_width;
pool->frame_bufs[cm->new_fb_idx].buf.render_height = cm->render_height;
} }
static INLINE int valid_ref_frame_img_fmt(vpx_bit_depth_t ref_bit_depth, static INLINE int valid_ref_frame_img_fmt(vpx_bit_depth_t ref_bit_depth,
@@ -1248,13 +1283,21 @@ static void setup_frame_size_with_refs(VP10_COMMON *cm,
YV12_BUFFER_CONFIG *const buf = cm->frame_refs[i].buf; YV12_BUFFER_CONFIG *const buf = cm->frame_refs[i].buf;
width = buf->y_crop_width; width = buf->y_crop_width;
height = buf->y_crop_height; height = buf->y_crop_height;
#if CONFIG_MISC_FIXES
cm->render_width = buf->render_width;
cm->render_height = buf->render_height;
#endif
found = 1; found = 1;
break; break;
} }
} }
if (!found) if (!found) {
vp10_read_frame_size(rb, &width, &height); vp10_read_frame_size(rb, &width, &height);
#if CONFIG_MISC_FIXES
setup_render_size(cm, rb);
#endif
}
if (width <= 0 || height <= 0) if (width <= 0 || height <= 0)
vpx_internal_error(&cm->error, VPX_CODEC_CORRUPT_FRAME, vpx_internal_error(&cm->error, VPX_CODEC_CORRUPT_FRAME,
@@ -1285,7 +1328,9 @@ static void setup_frame_size_with_refs(VP10_COMMON *cm,
} }
resize_context_buffers(cm, width, height); resize_context_buffers(cm, width, height);
setup_display_size(cm, rb); #if !CONFIG_MISC_FIXES
setup_render_size(cm, rb);
#endif
lock_buffer_pool(pool); lock_buffer_pool(pool);
if (vpx_realloc_frame_buffer( if (vpx_realloc_frame_buffer(
@@ -1308,6 +1353,9 @@ static void setup_frame_size_with_refs(VP10_COMMON *cm,
pool->frame_bufs[cm->new_fb_idx].buf.subsampling_y = cm->subsampling_y; pool->frame_bufs[cm->new_fb_idx].buf.subsampling_y = cm->subsampling_y;
pool->frame_bufs[cm->new_fb_idx].buf.bit_depth = (unsigned int)cm->bit_depth; pool->frame_bufs[cm->new_fb_idx].buf.bit_depth = (unsigned int)cm->bit_depth;
pool->frame_bufs[cm->new_fb_idx].buf.color_space = cm->color_space; pool->frame_bufs[cm->new_fb_idx].buf.color_space = cm->color_space;
pool->frame_bufs[cm->new_fb_idx].buf.color_range = cm->color_range;
pool->frame_bufs[cm->new_fb_idx].buf.render_width = cm->render_width;
pool->frame_bufs[cm->new_fb_idx].buf.render_height = cm->render_height;
} }
static void setup_tile_info(VP10_COMMON *cm, struct vpx_read_bit_buffer *rb) { static void setup_tile_info(VP10_COMMON *cm, struct vpx_read_bit_buffer *rb) {
@@ -1328,6 +1376,15 @@ static void setup_tile_info(VP10_COMMON *cm, struct vpx_read_bit_buffer *rb) {
cm->log2_tile_rows = vpx_rb_read_bit(rb); cm->log2_tile_rows = vpx_rb_read_bit(rb);
if (cm->log2_tile_rows) if (cm->log2_tile_rows)
cm->log2_tile_rows += vpx_rb_read_bit(rb); cm->log2_tile_rows += vpx_rb_read_bit(rb);
#if CONFIG_MISC_FIXES
// tile size magnitude
if (cm->log2_tile_rows > 0 || cm->log2_tile_cols > 0) {
cm->tile_sz_mag = vpx_rb_read_literal(rb, 2);
}
#else
cm->tile_sz_mag = 3;
#endif
} }
typedef struct TileBuffer { typedef struct TileBuffer {
@@ -1336,10 +1393,27 @@ typedef struct TileBuffer {
int col; // only used with multi-threaded decoding int col; // only used with multi-threaded decoding
} TileBuffer; } TileBuffer;
static int mem_get_varsize(const uint8_t *data, const int mag) {
switch (mag) {
case 0:
return data[0];
case 1:
return mem_get_le16(data);
case 2:
return mem_get_le24(data);
case 3:
return mem_get_le32(data);
}
assert("Invalid tile size marker value" && 0);
return -1;
}
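mem_get_varsize reads a tile length stored in tile_sz_mag + 1 little-endian bytes (1 to 4), replacing the fixed 4-byte big-endian field that get_tile_buffer used before this change. A small usage sketch (byte values illustrative, not from the patch):

/* With tile_sz_mag == 1 the tile length occupies two little-endian bytes. */
const uint8_t hdr[2] = { 0x34, 0x12 };
const int tile_size = mem_get_varsize(hdr, 1); /* 0x1234; *data then
                                                   advances by tile_sz_mag + 1 */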
// Reads the next tile returning its size and adjusting '*data' accordingly // Reads the next tile returning its size and adjusting '*data' accordingly
// based on 'is_last'. // based on 'is_last'.
static void get_tile_buffer(const uint8_t *const data_end, static void get_tile_buffer(const uint8_t *const data_end,
int is_last, const int tile_sz_mag, int is_last,
struct vpx_internal_error_info *error_info, struct vpx_internal_error_info *error_info,
const uint8_t **data, const uint8_t **data,
vpx_decrypt_cb decrypt_cb, void *decrypt_state, vpx_decrypt_cb decrypt_cb, void *decrypt_state,
@@ -1353,12 +1427,12 @@ static void get_tile_buffer(const uint8_t *const data_end,
if (decrypt_cb) { if (decrypt_cb) {
uint8_t be_data[4]; uint8_t be_data[4];
decrypt_cb(decrypt_state, *data, be_data, 4); decrypt_cb(decrypt_state, *data, be_data, tile_sz_mag + 1);
size = mem_get_be32(be_data); size = mem_get_varsize(be_data, tile_sz_mag) + CONFIG_MISC_FIXES;
} else { } else {
size = mem_get_be32(*data); size = mem_get_varsize(*data, tile_sz_mag) + CONFIG_MISC_FIXES;
} }
*data += 4; *data += tile_sz_mag + 1;
if (size > (size_t)(data_end - *data)) if (size > (size_t)(data_end - *data))
vpx_internal_error(error_info, VPX_CODEC_CORRUPT_FRAME, vpx_internal_error(error_info, VPX_CODEC_CORRUPT_FRAME,
@@ -1384,7 +1458,8 @@ static void get_tile_buffers(VP10Decoder *pbi,
const int is_last = (r == tile_rows - 1) && (c == tile_cols - 1); const int is_last = (r == tile_rows - 1) && (c == tile_cols - 1);
TileBuffer *const buf = &tile_buffers[r][c]; TileBuffer *const buf = &tile_buffers[r][c];
buf->col = c; buf->col = c;
get_tile_buffer(data_end, is_last, &pbi->common.error, &data, get_tile_buffer(data_end, pbi->common.tile_sz_mag,
is_last, &pbi->common.error, &data,
pbi->decrypt_cb, pbi->decrypt_state, buf); pbi->decrypt_cb, pbi->decrypt_state, buf);
} }
} }
@@ -1453,14 +1528,17 @@ static const uint8_t *decode_tiles(VP10Decoder *pbi,
tile_data->cm = cm; tile_data->cm = cm;
tile_data->xd = pbi->mb; tile_data->xd = pbi->mb;
tile_data->xd.corrupted = 0; tile_data->xd.corrupted = 0;
tile_data->xd.counts = cm->frame_parallel_decoding_mode ? tile_data->xd.counts =
NULL : &cm->counts; cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD ?
&cm->counts : NULL;
vp10_zero(tile_data->dqcoeff); vp10_zero(tile_data->dqcoeff);
vp10_tile_init(&tile_data->xd.tile, tile_data->cm, tile_row, tile_col); vp10_tile_init(&tile_data->xd.tile, tile_data->cm, tile_row, tile_col);
setup_token_decoder(buf->data, data_end, buf->size, &cm->error, setup_token_decoder(buf->data, data_end, buf->size, &cm->error,
&tile_data->bit_reader, pbi->decrypt_cb, &tile_data->bit_reader, pbi->decrypt_cb,
pbi->decrypt_state); pbi->decrypt_state);
vp10_init_macroblockd(cm, &tile_data->xd, tile_data->dqcoeff); vp10_init_macroblockd(cm, &tile_data->xd, tile_data->dqcoeff);
tile_data->xd.plane[0].color_index_map = tile_data->color_index_map[0];
tile_data->xd.plane[1].color_index_map = tile_data->color_index_map[1];
} }
} }
@@ -1509,7 +1587,7 @@ static const uint8_t *decode_tiles(VP10Decoder *pbi,
// After loopfiltering, the last 7 row pixels in each superblock row may // After loopfiltering, the last 7 row pixels in each superblock row may
// still be changed by the longest loopfilter of the next superblock // still be changed by the longest loopfilter of the next superblock
// row. // row.
if (pbi->frame_parallel_decode) if (cm->frame_parallel_decode)
vp10_frameworker_broadcast(pbi->cur_buf, vp10_frameworker_broadcast(pbi->cur_buf,
mi_row << MI_BLOCK_SIZE_LOG2); mi_row << MI_BLOCK_SIZE_LOG2);
} }
@@ -1527,7 +1605,7 @@ static const uint8_t *decode_tiles(VP10Decoder *pbi,
// Get last tile data. // Get last tile data.
tile_data = pbi->tile_data + tile_cols * tile_rows - 1; tile_data = pbi->tile_data + tile_cols * tile_rows - 1;
if (pbi->frame_parallel_decode) if (cm->frame_parallel_decode)
vp10_frameworker_broadcast(pbi->cur_buf, INT_MAX); vp10_frameworker_broadcast(pbi->cur_buf, INT_MAX);
return vpx_reader_find_end(&tile_data->bit_reader); return vpx_reader_find_end(&tile_data->bit_reader);
} }
@@ -1651,7 +1729,7 @@ static const uint8_t *decode_tiles_mt(VP10Decoder *pbi,
} }
// Initialize thread frame counts. // Initialize thread frame counts.
if (!cm->frame_parallel_decoding_mode) { if (cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD) {
int i; int i;
for (i = 0; i < num_workers; ++i) { for (i = 0; i < num_workers; ++i) {
@@ -1673,8 +1751,9 @@ static const uint8_t *decode_tiles_mt(VP10Decoder *pbi,
tile_data->pbi = pbi; tile_data->pbi = pbi;
tile_data->xd = pbi->mb; tile_data->xd = pbi->mb;
tile_data->xd.corrupted = 0; tile_data->xd.corrupted = 0;
tile_data->xd.counts = cm->frame_parallel_decoding_mode ? tile_data->xd.counts =
0 : &tile_data->counts; cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD ?
&tile_data->counts : NULL;
vp10_zero(tile_data->dqcoeff); vp10_zero(tile_data->dqcoeff);
vp10_tile_init(tile, cm, 0, buf->col); vp10_tile_init(tile, cm, 0, buf->col);
vp10_tile_init(&tile_data->xd.tile, cm, 0, buf->col); vp10_tile_init(&tile_data->xd.tile, cm, 0, buf->col);
@@ -1682,6 +1761,8 @@ static const uint8_t *decode_tiles_mt(VP10Decoder *pbi,
&tile_data->bit_reader, pbi->decrypt_cb, &tile_data->bit_reader, pbi->decrypt_cb,
pbi->decrypt_state); pbi->decrypt_state);
vp10_init_macroblockd(cm, &tile_data->xd, tile_data->dqcoeff); vp10_init_macroblockd(cm, &tile_data->xd, tile_data->dqcoeff);
tile_data->xd.plane[0].color_index_map = tile_data->color_index_map[0];
tile_data->xd.plane[1].color_index_map = tile_data->color_index_map[1];
worker->had_error = 0; worker->had_error = 0;
if (i == num_workers - 1 || n == tile_cols - 1) { if (i == num_workers - 1 || n == tile_cols - 1) {
@@ -1713,7 +1794,8 @@ static const uint8_t *decode_tiles_mt(VP10Decoder *pbi,
} }
// Accumulate thread frame counts. // Accumulate thread frame counts.
if (n >= tile_cols && !cm->frame_parallel_decoding_mode) { if (n >= tile_cols &&
cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD) {
for (i = 0; i < num_workers; ++i) { for (i = 0; i < num_workers; ++i) {
TileWorkerData *const tile_data = TileWorkerData *const tile_data =
(TileWorkerData*)pbi->tile_workers[i].data1; (TileWorkerData*)pbi->tile_workers[i].data1;
@@ -1745,7 +1827,8 @@ static void read_bitdepth_colorspace_sampling(
} }
cm->color_space = vpx_rb_read_literal(rb, 3); cm->color_space = vpx_rb_read_literal(rb, 3);
if (cm->color_space != VPX_CS_SRGB) { if (cm->color_space != VPX_CS_SRGB) {
vpx_rb_read_bit(rb); // [16,235] (including xvycc) vs [0,255] range // [16,235] (including xvycc) vs [0,255] range
cm->color_range = vpx_rb_read_bit(rb);
if (cm->profile == PROFILE_1 || cm->profile == PROFILE_3) { if (cm->profile == PROFILE_1 || cm->profile == PROFILE_3) {
cm->subsampling_x = vpx_rb_read_bit(rb); cm->subsampling_x = vpx_rb_read_bit(rb);
cm->subsampling_y = vpx_rb_read_bit(rb); cm->subsampling_y = vpx_rb_read_bit(rb);
@@ -1776,6 +1859,7 @@ static void read_bitdepth_colorspace_sampling(
static size_t read_uncompressed_header(VP10Decoder *pbi, static size_t read_uncompressed_header(VP10Decoder *pbi,
struct vpx_read_bit_buffer *rb) { struct vpx_read_bit_buffer *rb) {
VP10_COMMON *const cm = &pbi->common; VP10_COMMON *const cm = &pbi->common;
MACROBLOCKD *const xd = &pbi->mb;
BufferPool *const pool = cm->buffer_pool; BufferPool *const pool = cm->buffer_pool;
RefCntBuffer *const frame_bufs = pool->frame_bufs; RefCntBuffer *const frame_bufs = pool->frame_bufs;
int i, mask, ref_index = 0; int i, mask, ref_index = 0;
@@ -1817,7 +1901,7 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
cm->lf.filter_level = 0; cm->lf.filter_level = 0;
cm->show_frame = 1; cm->show_frame = 1;
if (pbi->frame_parallel_decode) { if (cm->frame_parallel_decode) {
for (i = 0; i < REF_FRAMES; ++i) for (i = 0; i < REF_FRAMES; ++i)
cm->next_ref_frame_map[i] = cm->ref_frame_map[i]; cm->next_ref_frame_map[i] = cm->ref_frame_map[i];
} }
@@ -1849,13 +1933,41 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
} else { } else {
cm->intra_only = cm->show_frame ? 0 : vpx_rb_read_bit(rb); cm->intra_only = cm->show_frame ? 0 : vpx_rb_read_bit(rb);
cm->reset_frame_context = cm->error_resilient_mode ? if (cm->error_resilient_mode) {
0 : vpx_rb_read_literal(rb, 2); cm->reset_frame_context = RESET_FRAME_CONTEXT_ALL;
} else {
#if CONFIG_MISC_FIXES
if (cm->intra_only) {
cm->reset_frame_context =
vpx_rb_read_bit(rb) ? RESET_FRAME_CONTEXT_ALL
: RESET_FRAME_CONTEXT_CURRENT;
} else {
cm->reset_frame_context =
vpx_rb_read_bit(rb) ? RESET_FRAME_CONTEXT_CURRENT
: RESET_FRAME_CONTEXT_NONE;
if (cm->reset_frame_context == RESET_FRAME_CONTEXT_CURRENT)
cm->reset_frame_context =
vpx_rb_read_bit(rb) ? RESET_FRAME_CONTEXT_ALL
: RESET_FRAME_CONTEXT_CURRENT;
}
#else
static const RESET_FRAME_CONTEXT_MODE reset_frame_context_conv_tbl[4] = {
RESET_FRAME_CONTEXT_NONE, RESET_FRAME_CONTEXT_NONE,
RESET_FRAME_CONTEXT_CURRENT, RESET_FRAME_CONTEXT_ALL
};
cm->reset_frame_context =
reset_frame_context_conv_tbl[vpx_rb_read_literal(rb, 2)];
#endif
}
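For readability, a minimal sketch (not part of the patch) of the bit layout implied by the new CONFIG_MISC_FIXES reset_frame_context reads above, expressed as the bits an encoder would emit. The enum and function names below are hypothetical stand-ins for the RESET_FRAME_CONTEXT_* names used in the patch.

typedef enum {
  SKETCH_RESET_NONE,     /* keep all frame contexts        */
  SKETCH_RESET_CURRENT,  /* reset only the current context */
  SKETCH_RESET_ALL       /* reset every frame context      */
} sketch_reset_mode;

/* Returns how many bits would be emitted; fills bits[] with them. */
static int sketch_reset_bits(int intra_only, sketch_reset_mode mode,
                             int bits[2]) {
  if (intra_only) {
    bits[0] = (mode == SKETCH_RESET_ALL);   /* 0 means CURRENT */
    return 1;
  }
  bits[0] = (mode != SKETCH_RESET_NONE);
  if (!bits[0]) return 1;                   /* NONE: a single 0 bit */
  bits[1] = (mode == SKETCH_RESET_ALL);     /* 0 means CURRENT */
  return 2;
}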
if (cm->intra_only) { if (cm->intra_only) {
if (!vp10_read_sync_code(rb)) if (!vp10_read_sync_code(rb))
vpx_internal_error(&cm->error, VPX_CODEC_UNSUP_BITSTREAM, vpx_internal_error(&cm->error, VPX_CODEC_UNSUP_BITSTREAM,
"Invalid frame sync code"); "Invalid frame sync code");
#if CONFIG_MISC_FIXES
read_bitdepth_colorspace_sampling(cm, rb);
#else
if (cm->profile > PROFILE_0) { if (cm->profile > PROFILE_0) {
read_bitdepth_colorspace_sampling(cm, rb); read_bitdepth_colorspace_sampling(cm, rb);
} else { } else {
@@ -1864,12 +1976,14 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
// specifies that the default color format should be YUV 4:2:0 in this // specifies that the default color format should be YUV 4:2:0 in this
// case (normative). // case (normative).
cm->color_space = VPX_CS_BT_601; cm->color_space = VPX_CS_BT_601;
cm->color_range = 0;
cm->subsampling_y = cm->subsampling_x = 1; cm->subsampling_y = cm->subsampling_x = 1;
cm->bit_depth = VPX_BITS_8; cm->bit_depth = VPX_BITS_8;
#if CONFIG_VP9_HIGHBITDEPTH #if CONFIG_VP9_HIGHBITDEPTH
cm->use_highbitdepth = 0; cm->use_highbitdepth = 0;
#endif #endif
} }
#endif
pbi->refresh_frame_flags = vpx_rb_read_literal(rb, REF_FRAMES); pbi->refresh_frame_flags = vpx_rb_read_literal(rb, REF_FRAMES);
setup_frame_size(cm, rb); setup_frame_size(cm, rb);
@@ -1914,6 +2028,9 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
get_frame_new_buffer(cm)->bit_depth = cm->bit_depth; get_frame_new_buffer(cm)->bit_depth = cm->bit_depth;
#endif #endif
get_frame_new_buffer(cm)->color_space = cm->color_space; get_frame_new_buffer(cm)->color_space = cm->color_space;
get_frame_new_buffer(cm)->color_range = cm->color_range;
get_frame_new_buffer(cm)->render_width = cm->render_width;
get_frame_new_buffer(cm)->render_height = cm->render_height;
if (pbi->need_resync) { if (pbi->need_resync) {
vpx_internal_error(&cm->error, VPX_CODEC_CORRUPT_FRAME, vpx_internal_error(&cm->error, VPX_CODEC_CORRUPT_FRAME,
@@ -1922,11 +2039,20 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
} }
if (!cm->error_resilient_mode) { if (!cm->error_resilient_mode) {
cm->refresh_frame_context = vpx_rb_read_bit(rb); cm->refresh_frame_context =
cm->frame_parallel_decoding_mode = vpx_rb_read_bit(rb); vpx_rb_read_bit(rb) ? REFRESH_FRAME_CONTEXT_FORWARD
: REFRESH_FRAME_CONTEXT_OFF;
if (cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_FORWARD) {
cm->refresh_frame_context =
vpx_rb_read_bit(rb) ? REFRESH_FRAME_CONTEXT_FORWARD
: REFRESH_FRAME_CONTEXT_BACKWARD;
#if !CONFIG_MISC_FIXES
} else {
vpx_rb_read_bit(rb); // parallel decoding mode flag
#endif
}
} else { } else {
cm->refresh_frame_context = 0; cm->refresh_frame_context = REFRESH_FRAME_CONTEXT_OFF;
cm->frame_parallel_decoding_mode = 1;
} }
// This flag will be overridden by the call to vp10_setup_past_independence // This flag will be overridden by the call to vp10_setup_past_independence
@@ -1961,9 +2087,32 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
vp10_setup_past_independence(cm); vp10_setup_past_independence(cm);
setup_loopfilter(&cm->lf, rb); setup_loopfilter(&cm->lf, rb);
setup_quantization(cm, &pbi->mb, rb); setup_quantization(cm, rb);
#if CONFIG_VP9_HIGHBITDEPTH
xd->bd = (int)cm->bit_depth;
#endif
setup_segmentation(cm, rb); setup_segmentation(cm, rb);
{
int i;
for (i = 0; i < MAX_SEGMENTS; ++i) {
const int qindex = CONFIG_MISC_FIXES && cm->seg.enabled ?
vp10_get_qindex(&cm->seg, i, cm->base_qindex) :
cm->base_qindex;
xd->lossless[i] = qindex == 0 &&
cm->y_dc_delta_q == 0 &&
cm->uv_dc_delta_q == 0 &&
cm->uv_ac_delta_q == 0;
}
}
setup_segmentation_dequant(cm); setup_segmentation_dequant(cm);
#if CONFIG_MISC_FIXES
cm->tx_mode = (!cm->seg.enabled && xd->lossless[0]) ? ONLY_4X4
: read_tx_mode(rb);
cm->reference_mode = read_frame_reference_mode(cm, rb);
#endif
setup_tile_info(cm, rb); setup_tile_info(cm, rb);
sz = vpx_rb_read_literal(rb, 16); sz = vpx_rb_read_literal(rb, 16);
@@ -1978,17 +2127,21 @@ static size_t read_uncompressed_header(VP10Decoder *pbi,
static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data, static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data,
size_t partition_size) { size_t partition_size) {
VP10_COMMON *const cm = &pbi->common; VP10_COMMON *const cm = &pbi->common;
#if !CONFIG_MISC_FIXES
MACROBLOCKD *const xd = &pbi->mb; MACROBLOCKD *const xd = &pbi->mb;
#endif
FRAME_CONTEXT *const fc = cm->fc; FRAME_CONTEXT *const fc = cm->fc;
vpx_reader r; vpx_reader r;
int k; int k, i, j;
if (vpx_reader_init(&r, data, partition_size, pbi->decrypt_cb, if (vpx_reader_init(&r, data, partition_size, pbi->decrypt_cb,
pbi->decrypt_state)) pbi->decrypt_state))
vpx_internal_error(&cm->error, VPX_CODEC_MEM_ERROR, vpx_internal_error(&cm->error, VPX_CODEC_MEM_ERROR,
"Failed to allocate bool decoder 0"); "Failed to allocate bool decoder 0");
cm->tx_mode = xd->lossless ? ONLY_4X4 : read_tx_mode(&r); #if !CONFIG_MISC_FIXES
cm->tx_mode = xd->lossless[0] ? ONLY_4X4 : read_tx_mode(&r);
#endif
if (cm->tx_mode == TX_MODE_SELECT) if (cm->tx_mode == TX_MODE_SELECT)
read_tx_mode_probs(&fc->tx_probs, &r); read_tx_mode_probs(&fc->tx_probs, &r);
read_coef_probs(fc, cm->tx_mode, &r); read_coef_probs(fc, cm->tx_mode, &r);
@@ -1996,9 +2149,35 @@ static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data,
for (k = 0; k < SKIP_CONTEXTS; ++k) for (k = 0; k < SKIP_CONTEXTS; ++k)
vp10_diff_update_prob(&r, &fc->skip_probs[k]); vp10_diff_update_prob(&r, &fc->skip_probs[k]);
if (!frame_is_intra_only(cm)) { #if CONFIG_MISC_FIXES
if (cm->seg.enabled) {
if (cm->seg.temporal_update) {
for (k = 0; k < PREDICTION_PROBS; k++)
vp10_diff_update_prob(&r, &cm->fc->seg.pred_probs[k]);
}
for (k = 0; k < MAX_SEGMENTS - 1; k++)
vp10_diff_update_prob(&r, &cm->fc->seg.tree_probs[k]);
}
for (j = 0; j < INTRA_MODES; j++)
for (i = 0; i < INTRA_MODES - 1; ++i)
vp10_diff_update_prob(&r, &fc->uv_mode_prob[j][i]);
for (j = 0; j < PARTITION_CONTEXTS; ++j)
for (i = 0; i < PARTITION_TYPES - 1; ++i)
vp10_diff_update_prob(&r, &fc->partition_prob[j][i]);
#endif
if (frame_is_intra_only(cm)) {
vp10_copy(cm->kf_y_prob, vp10_kf_y_mode_prob);
#if CONFIG_MISC_FIXES
for (k = 0; k < INTRA_MODES; k++)
for (j = 0; j < INTRA_MODES; j++)
for (i = 0; i < INTRA_MODES - 1; ++i)
vp10_diff_update_prob(&r, &cm->kf_y_prob[k][j][i]);
#endif
} else {
nmv_context *const nmvc = &fc->nmvc; nmv_context *const nmvc = &fc->nmvc;
int i, j;
read_inter_mode_probs(fc, &r); read_inter_mode_probs(fc, &r);
@@ -2008,7 +2187,9 @@ static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data,
for (i = 0; i < INTRA_INTER_CONTEXTS; i++) for (i = 0; i < INTRA_INTER_CONTEXTS; i++)
vp10_diff_update_prob(&r, &fc->intra_inter_prob[i]); vp10_diff_update_prob(&r, &fc->intra_inter_prob[i]);
#if !CONFIG_MISC_FIXES
cm->reference_mode = read_frame_reference_mode(cm, &r); cm->reference_mode = read_frame_reference_mode(cm, &r);
#endif
if (cm->reference_mode != SINGLE_REFERENCE) if (cm->reference_mode != SINGLE_REFERENCE)
setup_compound_reference_mode(cm); setup_compound_reference_mode(cm);
read_frame_reference_mode_probs(cm, &r); read_frame_reference_mode_probs(cm, &r);
@@ -2017,9 +2198,11 @@ static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data,
for (i = 0; i < INTRA_MODES - 1; ++i) for (i = 0; i < INTRA_MODES - 1; ++i)
vp10_diff_update_prob(&r, &fc->y_mode_prob[j][i]); vp10_diff_update_prob(&r, &fc->y_mode_prob[j][i]);
#if !CONFIG_MISC_FIXES
for (j = 0; j < PARTITION_CONTEXTS; ++j) for (j = 0; j < PARTITION_CONTEXTS; ++j)
for (i = 0; i < PARTITION_TYPES - 1; ++i) for (i = 0; i < PARTITION_TYPES - 1; ++i)
vp10_diff_update_prob(&r, &fc->partition_prob[j][i]); vp10_diff_update_prob(&r, &fc->partition_prob[j][i]);
#endif
read_mv_probs(nmvc, cm->allow_high_precision_mv, &r); read_mv_probs(nmvc, cm->allow_high_precision_mv, &r);
} }
@@ -2035,7 +2218,8 @@ static int read_compressed_header(VP10Decoder *pbi, const uint8_t *data,
static void debug_check_frame_counts(const VP10_COMMON *const cm) { static void debug_check_frame_counts(const VP10_COMMON *const cm) {
FRAME_COUNTS zero_counts; FRAME_COUNTS zero_counts;
vp10_zero(zero_counts); vp10_zero(zero_counts);
assert(cm->frame_parallel_decoding_mode || cm->error_resilient_mode); assert(cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_BACKWARD ||
cm->error_resilient_mode);
assert(!memcmp(cm->counts.y_mode, zero_counts.y_mode, assert(!memcmp(cm->counts.y_mode, zero_counts.y_mode,
sizeof(cm->counts.y_mode))); sizeof(cm->counts.y_mode)));
assert(!memcmp(cm->counts.uv_mode, zero_counts.uv_mode, assert(!memcmp(cm->counts.uv_mode, zero_counts.uv_mode,
@@ -2161,10 +2345,11 @@ void vp10_decode_frame(VP10Decoder *pbi,
// If encoded in frame parallel mode, frame context is ready after decoding // If encoded in frame parallel mode, frame context is ready after decoding
// the frame header. // the frame header.
if (pbi->frame_parallel_decode && cm->frame_parallel_decoding_mode) { if (cm->frame_parallel_decode &&
cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_BACKWARD) {
VPxWorker *const worker = pbi->frame_worker_owner; VPxWorker *const worker = pbi->frame_worker_owner;
FrameWorkerData *const frame_worker_data = worker->data1; FrameWorkerData *const frame_worker_data = worker->data1;
if (cm->refresh_frame_context) { if (cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_FORWARD) {
context_updated = 1; context_updated = 1;
cm->frame_contexts[cm->frame_context_idx] = *cm->fc; cm->frame_contexts[cm->frame_context_idx] = *cm->fc;
} }
@@ -2198,11 +2383,17 @@ void vp10_decode_frame(VP10Decoder *pbi,
} }
if (!xd->corrupted) { if (!xd->corrupted) {
if (!cm->error_resilient_mode && !cm->frame_parallel_decoding_mode) { if (cm->refresh_frame_context == REFRESH_FRAME_CONTEXT_BACKWARD) {
vp10_adapt_coef_probs(cm); vp10_adapt_coef_probs(cm);
#if CONFIG_MISC_FIXES
vp10_adapt_intra_frame_probs(cm);
#endif
if (!frame_is_intra_only(cm)) { if (!frame_is_intra_only(cm)) {
vp10_adapt_mode_probs(cm); #if !CONFIG_MISC_FIXES
vp10_adapt_intra_frame_probs(cm);
#endif
vp10_adapt_inter_frame_probs(cm);
vp10_adapt_mv_probs(cm, cm->allow_high_precision_mv); vp10_adapt_mv_probs(cm, cm->allow_high_precision_mv);
} }
} else { } else {
@@ -2214,6 +2405,7 @@ void vp10_decode_frame(VP10Decoder *pbi,
} }
// Non frame parallel update frame context here. // Non frame parallel update frame context here.
if (cm->refresh_frame_context && !context_updated) if (cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_OFF &&
!context_updated)
cm->frame_contexts[cm->frame_context_idx] = *cm->fc; cm->frame_contexts[cm->frame_context_idx] = *cm->fc;
} }


@@ -24,6 +24,19 @@
#include "vpx_dsp/vpx_dsp_common.h" #include "vpx_dsp/vpx_dsp_common.h"
static INLINE int read_uniform(vpx_reader *r, int n) {
int l = get_unsigned_bits(n);
int m = (1 << l) - n;
int v = vpx_read_literal(r, l-1);
assert(l != 0);
if (v < m)
return v;
else
return (v << 1) - m + vpx_read_literal(r, 1);
}
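A minimal sketch (not part of the patch) of the cost of the quasi-uniform code that read_uniform() above decodes: a symbol v in [0, n) takes either l-1 or l raw bits, where l is the bit width of n. The helper name is hypothetical.

static int sketch_uniform_bits(int n, int v) {
  int l = 0, m;
  while ((1 << l) <= n) ++l;  /* for n > 0 this equals get_unsigned_bits(n) */
  m = (1 << l) - n;
  /* e.g. n = 9: l = 4, m = 7, so symbols 0..6 cost 3 bits, 7..8 cost 4 bits */
  return v < m ? l - 1 : l;
}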
static PREDICTION_MODE read_intra_mode(vpx_reader *r, const vpx_prob *p) { static PREDICTION_MODE read_intra_mode(vpx_reader *r, const vpx_prob *p) {
return (PREDICTION_MODE)vpx_read_tree(r, vp10_intra_mode_tree, p); return (PREDICTION_MODE)vpx_read_tree(r, vp10_intra_mode_tree, p);
} }
@@ -60,8 +73,9 @@ static PREDICTION_MODE read_inter_mode(VP10_COMMON *cm, MACROBLOCKD *xd,
return NEARESTMV + mode; return NEARESTMV + mode;
} }
static int read_segment_id(vpx_reader *r, const struct segmentation *seg) { static int read_segment_id(vpx_reader *r,
return vpx_read_tree(r, vp10_segment_tree, seg->tree_probs); const struct segmentation_probs *segp) {
return vpx_read_tree(r, vp10_segment_tree, segp->tree_probs);
} }
static TX_SIZE read_selected_tx_size(VP10_COMMON *cm, MACROBLOCKD *xd, static TX_SIZE read_selected_tx_size(VP10_COMMON *cm, MACROBLOCKD *xd,
@@ -86,6 +100,8 @@ static TX_SIZE read_tx_size(VP10_COMMON *cm, MACROBLOCKD *xd,
TX_MODE tx_mode = cm->tx_mode; TX_MODE tx_mode = cm->tx_mode;
BLOCK_SIZE bsize = xd->mi[0]->mbmi.sb_type; BLOCK_SIZE bsize = xd->mi[0]->mbmi.sb_type;
const TX_SIZE max_tx_size = max_txsize_lookup[bsize]; const TX_SIZE max_tx_size = max_txsize_lookup[bsize];
if (xd->lossless[xd->mi[0]->mbmi.segment_id])
return TX_4X4;
if (allow_select && tx_mode == TX_MODE_SELECT && bsize >= BLOCK_8X8) if (allow_select && tx_mode == TX_MODE_SELECT && bsize >= BLOCK_8X8)
return read_selected_tx_size(cm, xd, max_tx_size, r); return read_selected_tx_size(cm, xd, max_tx_size, r);
else else
@@ -116,18 +132,32 @@ static void set_segment_id(VP10_COMMON *cm, int mi_offset,
cm->current_frame_seg_map[mi_offset + y * cm->mi_cols + x] = segment_id; cm->current_frame_seg_map[mi_offset + y * cm->mi_cols + x] = segment_id;
} }
static int read_intra_segment_id(VP10_COMMON *const cm, int mi_offset, static int read_intra_segment_id(VP10_COMMON *const cm, MACROBLOCKD *const xd,
int x_mis, int y_mis, int mi_offset, int x_mis, int y_mis,
vpx_reader *r) { vpx_reader *r) {
struct segmentation *const seg = &cm->seg; struct segmentation *const seg = &cm->seg;
#if CONFIG_MISC_FIXES
FRAME_COUNTS *counts = xd->counts;
struct segmentation_probs *const segp = &cm->fc->seg;
#else
struct segmentation_probs *const segp = &cm->segp;
#endif
int segment_id; int segment_id;
#if !CONFIG_MISC_FIXES
(void) xd;
#endif
if (!seg->enabled) if (!seg->enabled)
return 0; // Default for disabled segmentation return 0; // Default for disabled segmentation
assert(seg->update_map && !seg->temporal_update); assert(seg->update_map && !seg->temporal_update);
segment_id = read_segment_id(r, seg); segment_id = read_segment_id(r, segp);
#if CONFIG_MISC_FIXES
if (counts)
++counts->seg.tree_total[segment_id];
#endif
set_segment_id(cm, mi_offset, x_mis, y_mis, segment_id); set_segment_id(cm, mi_offset, x_mis, y_mis, segment_id);
return segment_id; return segment_id;
} }
@@ -147,6 +177,12 @@ static void copy_segment_id(const VP10_COMMON *cm,
static int read_inter_segment_id(VP10_COMMON *const cm, MACROBLOCKD *const xd, static int read_inter_segment_id(VP10_COMMON *const cm, MACROBLOCKD *const xd,
int mi_row, int mi_col, vpx_reader *r) { int mi_row, int mi_col, vpx_reader *r) {
struct segmentation *const seg = &cm->seg; struct segmentation *const seg = &cm->seg;
#if CONFIG_MISC_FIXES
FRAME_COUNTS *counts = xd->counts;
struct segmentation_probs *const segp = &cm->fc->seg;
#else
struct segmentation_probs *const segp = &cm->segp;
#endif
MB_MODE_INFO *const mbmi = &xd->mi[0]->mbmi; MB_MODE_INFO *const mbmi = &xd->mi[0]->mbmi;
int predicted_segment_id, segment_id; int predicted_segment_id, segment_id;
const int mi_offset = mi_row * cm->mi_cols + mi_col; const int mi_offset = mi_row * cm->mi_cols + mi_col;
@@ -171,12 +207,28 @@ static int read_inter_segment_id(VP10_COMMON *const cm, MACROBLOCKD *const xd,
} }
if (seg->temporal_update) { if (seg->temporal_update) {
const vpx_prob pred_prob = vp10_get_pred_prob_seg_id(seg, xd); const int ctx = vp10_get_pred_context_seg_id(xd);
const vpx_prob pred_prob = segp->pred_probs[ctx];
mbmi->seg_id_predicted = vpx_read(r, pred_prob); mbmi->seg_id_predicted = vpx_read(r, pred_prob);
segment_id = mbmi->seg_id_predicted ? predicted_segment_id #if CONFIG_MISC_FIXES
: read_segment_id(r, seg); if (counts)
++counts->seg.pred[ctx][mbmi->seg_id_predicted];
#endif
if (mbmi->seg_id_predicted) {
segment_id = predicted_segment_id;
} else {
segment_id = read_segment_id(r, segp);
#if CONFIG_MISC_FIXES
if (counts)
++counts->seg.tree_mispred[segment_id];
#endif
}
} else { } else {
segment_id = read_segment_id(r, seg); segment_id = read_segment_id(r, segp);
#if CONFIG_MISC_FIXES
if (counts)
++counts->seg.tree_total[segment_id];
#endif
} }
set_segment_id(cm, mi_offset, x_mis, y_mis, segment_id); set_segment_id(cm, mi_offset, x_mis, y_mis, segment_id);
return segment_id; return segment_id;
@@ -213,7 +265,7 @@ static void read_intra_frame_mode_info(VP10_COMMON *const cm,
const int x_mis = VPXMIN(cm->mi_cols - mi_col, bw); const int x_mis = VPXMIN(cm->mi_cols - mi_col, bw);
const int y_mis = VPXMIN(cm->mi_rows - mi_row, bh); const int y_mis = VPXMIN(cm->mi_rows - mi_row, bh);
mbmi->segment_id = read_intra_segment_id(cm, mi_offset, x_mis, y_mis, r); mbmi->segment_id = read_intra_segment_id(cm, xd, mi_offset, x_mis, y_mis, r);
mbmi->skip = read_skip(cm, xd, mbmi->segment_id, r); mbmi->skip = read_skip(cm, xd, mbmi->segment_id, r);
mbmi->tx_size = read_tx_size(cm, xd, 1, r); mbmi->tx_size = read_tx_size(cm, xd, 1, r);
mbmi->ref_frame[0] = INTRA_FRAME; mbmi->ref_frame[0] = INTRA_FRAME;
@@ -223,27 +275,27 @@ static void read_intra_frame_mode_info(VP10_COMMON *const cm,
case BLOCK_4X4: case BLOCK_4X4:
for (i = 0; i < 4; ++i) for (i = 0; i < 4; ++i)
mi->bmi[i].as_mode = mi->bmi[i].as_mode =
read_intra_mode(r, get_y_mode_probs(mi, above_mi, left_mi, i)); read_intra_mode(r, get_y_mode_probs(cm, mi, above_mi, left_mi, i));
mbmi->mode = mi->bmi[3].as_mode; mbmi->mode = mi->bmi[3].as_mode;
break; break;
case BLOCK_4X8: case BLOCK_4X8:
mi->bmi[0].as_mode = mi->bmi[2].as_mode = mi->bmi[0].as_mode = mi->bmi[2].as_mode =
read_intra_mode(r, get_y_mode_probs(mi, above_mi, left_mi, 0)); read_intra_mode(r, get_y_mode_probs(cm, mi, above_mi, left_mi, 0));
mi->bmi[1].as_mode = mi->bmi[3].as_mode = mbmi->mode = mi->bmi[1].as_mode = mi->bmi[3].as_mode = mbmi->mode =
read_intra_mode(r, get_y_mode_probs(mi, above_mi, left_mi, 1)); read_intra_mode(r, get_y_mode_probs(cm, mi, above_mi, left_mi, 1));
break; break;
case BLOCK_8X4: case BLOCK_8X4:
mi->bmi[0].as_mode = mi->bmi[1].as_mode = mi->bmi[0].as_mode = mi->bmi[1].as_mode =
read_intra_mode(r, get_y_mode_probs(mi, above_mi, left_mi, 0)); read_intra_mode(r, get_y_mode_probs(cm, mi, above_mi, left_mi, 0));
mi->bmi[2].as_mode = mi->bmi[3].as_mode = mbmi->mode = mi->bmi[2].as_mode = mi->bmi[3].as_mode = mbmi->mode =
read_intra_mode(r, get_y_mode_probs(mi, above_mi, left_mi, 2)); read_intra_mode(r, get_y_mode_probs(cm, mi, above_mi, left_mi, 2));
break; break;
default: default:
mbmi->mode = read_intra_mode(r, mbmi->mode = read_intra_mode(r,
get_y_mode_probs(mi, above_mi, left_mi, 0)); get_y_mode_probs(cm, mi, above_mi, left_mi, 0));
} }
mbmi->uv_mode = read_intra_mode(r, vp10_kf_uv_mode_prob[mbmi->mode]); mbmi->uv_mode = read_intra_mode_uv(cm, xd, r, mbmi->mode);
} }
static int read_mv_component(vpx_reader *r, static int read_mv_component(vpx_reader *r,
@@ -294,7 +346,7 @@ static INLINE void read_mv(vpx_reader *r, MV *mv, const MV *ref,
if (mv_joint_horizontal(joint_type)) if (mv_joint_horizontal(joint_type))
diff.col = read_mv_component(r, &ctx->comps[1], use_hp); diff.col = read_mv_component(r, &ctx->comps[1], use_hp);
vp10_inc_mv(&diff, counts); vp10_inc_mv(&diff, counts, use_hp);
mv->row = ref->row + diff.row; mv->row = ref->row + diff.row;
mv->col = ref->col + diff.col; mv->col = ref->col + diff.col;
@@ -523,8 +575,8 @@ static void read_inter_block_mode_info(VP10Decoder *const pbi,
if (bsize < BLOCK_8X8 || mbmi->mode != ZEROMV) { if (bsize < BLOCK_8X8 || mbmi->mode != ZEROMV) {
for (ref = 0; ref < 1 + is_compound; ++ref) { for (ref = 0; ref < 1 + is_compound; ++ref) {
vp10_find_best_ref_mvs(xd, allow_hp, ref_mvs[mbmi->ref_frame[ref]], vp10_find_best_ref_mvs(allow_hp, ref_mvs[mbmi->ref_frame[ref]],
&nearestmv[ref], &nearmv[ref]); &nearestmv[ref], &nearmv[ref]);
} }
} }


@@ -126,6 +126,9 @@ VP10Decoder *vp10_decoder_create(BufferPool *const pool) {
void vp10_decoder_remove(VP10Decoder *pbi) { void vp10_decoder_remove(VP10Decoder *pbi) {
int i; int i;
if (!pbi)
return;
vpx_get_worker_interface()->end(&pbi->lf_worker); vpx_get_worker_interface()->end(&pbi->lf_worker);
vpx_free(pbi->lf_worker.data1); vpx_free(pbi->lf_worker.data1);
vpx_free(pbi->tile_data); vpx_free(pbi->tile_data);
@@ -258,7 +261,7 @@ static void swap_frame_buffers(VP10Decoder *pbi) {
pbi->hold_ref_buf = 0; pbi->hold_ref_buf = 0;
cm->frame_to_show = get_frame_new_buffer(cm); cm->frame_to_show = get_frame_new_buffer(cm);
if (!pbi->frame_parallel_decode || !cm->show_frame) { if (!cm->frame_parallel_decode || !cm->show_frame) {
lock_buffer_pool(pool); lock_buffer_pool(pool);
--frame_bufs[cm->new_fb_idx].ref_count; --frame_bufs[cm->new_fb_idx].ref_count;
unlock_buffer_pool(pool); unlock_buffer_pool(pool);
@@ -297,7 +300,7 @@ int vp10_receive_compressed_data(VP10Decoder *pbi,
// Check if the previous frame was a frame without any references to it. // Check if the previous frame was a frame without any references to it.
// Release frame buffer if not decoding in frame parallel mode. // Release frame buffer if not decoding in frame parallel mode.
if (!pbi->frame_parallel_decode && cm->new_fb_idx >= 0 if (!cm->frame_parallel_decode && cm->new_fb_idx >= 0
&& frame_bufs[cm->new_fb_idx].ref_count == 0) && frame_bufs[cm->new_fb_idx].ref_count == 0)
pool->release_fb_cb(pool->cb_priv, pool->release_fb_cb(pool->cb_priv,
&frame_bufs[cm->new_fb_idx].raw_frame_buffer); &frame_bufs[cm->new_fb_idx].raw_frame_buffer);
@@ -310,7 +313,7 @@ int vp10_receive_compressed_data(VP10Decoder *pbi,
cm->cur_frame = &pool->frame_bufs[cm->new_fb_idx]; cm->cur_frame = &pool->frame_bufs[cm->new_fb_idx];
pbi->hold_ref_buf = 0; pbi->hold_ref_buf = 0;
if (pbi->frame_parallel_decode) { if (cm->frame_parallel_decode) {
VPxWorker *const worker = pbi->frame_worker_owner; VPxWorker *const worker = pbi->frame_worker_owner;
vp10_frameworker_lock_stats(worker); vp10_frameworker_lock_stats(worker);
frame_bufs[cm->new_fb_idx].frame_worker_owner = worker; frame_bufs[cm->new_fb_idx].frame_worker_owner = worker;
@@ -379,12 +382,12 @@ int vp10_receive_compressed_data(VP10Decoder *pbi,
if (!cm->show_existing_frame) { if (!cm->show_existing_frame) {
cm->last_show_frame = cm->show_frame; cm->last_show_frame = cm->show_frame;
cm->prev_frame = cm->cur_frame; cm->prev_frame = cm->cur_frame;
if (cm->seg.enabled && !pbi->frame_parallel_decode) if (cm->seg.enabled && !cm->frame_parallel_decode)
vp10_swap_current_and_last_seg_map(cm); vp10_swap_current_and_last_seg_map(cm);
} }
// Update progress in frame parallel decode. // Update progress in frame parallel decode.
if (pbi->frame_parallel_decode) { if (cm->frame_parallel_decode) {
// Need to lock the mutex here as another thread may // Need to lock the mutex here as another thread may
// be accessing this buffer. // be accessing this buffer.
VPxWorker *const worker = pbi->frame_worker_owner; VPxWorker *const worker = pbi->frame_worker_owner;
@@ -456,6 +459,9 @@ vpx_codec_err_t vp10_parse_superframe_index(const uint8_t *data,
// an invalid bitstream and need to return an error. // an invalid bitstream and need to return an error.
uint8_t marker; uint8_t marker;
#if CONFIG_MISC_FIXES
size_t frame_sz_sum = 0;
#endif
assert(data_sz); assert(data_sz);
marker = read_marker(decrypt_cb, decrypt_state, data + data_sz - 1); marker = read_marker(decrypt_cb, decrypt_state, data + data_sz - 1);
@@ -464,7 +470,7 @@ vpx_codec_err_t vp10_parse_superframe_index(const uint8_t *data,
if ((marker & 0xe0) == 0xc0) { if ((marker & 0xe0) == 0xc0) {
const uint32_t frames = (marker & 0x7) + 1; const uint32_t frames = (marker & 0x7) + 1;
const uint32_t mag = ((marker >> 3) & 0x3) + 1; const uint32_t mag = ((marker >> 3) & 0x3) + 1;
const size_t index_sz = 2 + mag * frames; const size_t index_sz = 2 + mag * (frames - CONFIG_MISC_FIXES);
// This chunk is marked as having a superframe index but doesn't have // This chunk is marked as having a superframe index but doesn't have
// enough data for it, thus it's an invalid superframe index. // enough data for it, thus it's an invalid superframe index.
@@ -495,13 +501,20 @@ vpx_codec_err_t vp10_parse_superframe_index(const uint8_t *data,
x = clear_buffer; x = clear_buffer;
} }
for (i = 0; i < frames; ++i) { for (i = 0; i < frames - CONFIG_MISC_FIXES; ++i) {
uint32_t this_sz = 0; uint32_t this_sz = 0;
for (j = 0; j < mag; ++j) for (j = 0; j < mag; ++j)
this_sz |= (*x++) << (j * 8); this_sz |= (*x++) << (j * 8);
this_sz += CONFIG_MISC_FIXES;
sizes[i] = this_sz; sizes[i] = this_sz;
#if CONFIG_MISC_FIXES
frame_sz_sum += this_sz;
#endif
} }
#if CONFIG_MISC_FIXES
sizes[i] = data_sz - index_sz - frame_sz_sum;
#endif
*count = frames; *count = frames;
} }
} }
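As a reading aid, a small sketch (not part of the patch; assumptions noted) of the superframe index trailer the loop above parses: the chunk ends with a marker byte 0b110MMFFF, where MM+1 is the number of bytes per stored size and FFF+1 is the frame count; with CONFIG_MISC_FIXES the last frame's size is no longer stored and is inferred from the chunk size. The helper name is hypothetical.

#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the index trailer, or 0 if the marker is not one. */
static size_t sketch_superframe_index_size(uint8_t marker, int misc_fixes) {
  const uint32_t frames = (marker & 0x7) + 1;
  const uint32_t mag = ((marker >> 3) & 0x3) + 1;
  if ((marker & 0xe0) != 0xc0) return 0;
  return 2 + mag * (frames - (misc_fixes ? 1 : 0));  /* two marker bytes */
}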


@@ -34,6 +34,7 @@ typedef struct TileData {
DECLARE_ALIGNED(16, MACROBLOCKD, xd); DECLARE_ALIGNED(16, MACROBLOCKD, xd);
/* dqcoeff are shared by all the planes. So planes must be decoded serially */ /* dqcoeff are shared by all the planes. So planes must be decoded serially */
DECLARE_ALIGNED(16, tran_low_t, dqcoeff[32 * 32]); DECLARE_ALIGNED(16, tran_low_t, dqcoeff[32 * 32]);
DECLARE_ALIGNED(16, uint8_t, color_index_map[2][64 * 64]);
} TileData; } TileData;
typedef struct TileWorkerData { typedef struct TileWorkerData {
@@ -43,6 +44,7 @@ typedef struct TileWorkerData {
DECLARE_ALIGNED(16, MACROBLOCKD, xd); DECLARE_ALIGNED(16, MACROBLOCKD, xd);
/* dqcoeff are shared by all the planes. So planes must be decoded serially */ /* dqcoeff are shared by all the planes. So planes must be decoded serially */
DECLARE_ALIGNED(16, tran_low_t, dqcoeff[32 * 32]); DECLARE_ALIGNED(16, tran_low_t, dqcoeff[32 * 32]);
DECLARE_ALIGNED(16, uint8_t, color_index_map[2][64 * 64]);
struct vpx_internal_error_info error_info; struct vpx_internal_error_info error_info;
} TileWorkerData; } TileWorkerData;
@@ -55,8 +57,6 @@ typedef struct VP10Decoder {
int refresh_frame_flags; int refresh_frame_flags;
int frame_parallel_decode; // frame-based threading.
// TODO(hkuang): Combine this with cur_buf in macroblockd as they are // TODO(hkuang): Combine this with cur_buf in macroblockd as they are
// the same. // the same.
RefCntBuffer *cur_buf; // Current decoding frame buffer. RefCntBuffer *cur_buf; // Current decoding frame buffer.

View File

@@ -163,26 +163,33 @@ static int decode_coefs(const MACROBLOCKD *xd,
case CATEGORY5_TOKEN: case CATEGORY5_TOKEN:
val = CAT5_MIN_VAL + read_coeff(cat5_prob, 5, r); val = CAT5_MIN_VAL + read_coeff(cat5_prob, 5, r);
break; break;
case CATEGORY6_TOKEN: case CATEGORY6_TOKEN: {
#if CONFIG_MISC_FIXES
const int skip_bits = TX_SIZES - 1 - tx_size;
#else
const int skip_bits = 0;
#endif
const uint8_t *cat6p = cat6_prob + skip_bits;
#if CONFIG_VP9_HIGHBITDEPTH #if CONFIG_VP9_HIGHBITDEPTH
switch (xd->bd) { switch (xd->bd) {
case VPX_BITS_8: case VPX_BITS_8:
val = CAT6_MIN_VAL + read_coeff(cat6_prob, 14, r); val = CAT6_MIN_VAL + read_coeff(cat6p, 14 - skip_bits, r);
break; break;
case VPX_BITS_10: case VPX_BITS_10:
val = CAT6_MIN_VAL + read_coeff(cat6_prob, 16, r); val = CAT6_MIN_VAL + read_coeff(cat6p, 16 - skip_bits, r);
break; break;
case VPX_BITS_12: case VPX_BITS_12:
val = CAT6_MIN_VAL + read_coeff(cat6_prob, 18, r); val = CAT6_MIN_VAL + read_coeff(cat6p, 18 - skip_bits, r);
break; break;
default: default:
assert(0); assert(0);
return -1; return -1;
} }
#else #else
val = CAT6_MIN_VAL + read_coeff(cat6_prob, 14, r); val = CAT6_MIN_VAL + read_coeff(cat6p, 14 - skip_bits, r);
#endif #endif
break; break;
}
} }
} }
v = (val * dqv) >> dq_shift; v = (val * dqv) >> dq_shift;
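A short sketch (not part of the patch) of why the CAT6 path above can drop bits: the largest coefficient a small transform can produce needs fewer extra bits, so with CONFIG_MISC_FIXES the top TX_SIZES - 1 - tx_size bits are known to be zero and are neither written nor read. The helper below assumes TX_SIZES == 4 and the 8-bit case (14 extra bits at 32x32); its name is hypothetical.

static int sketch_cat6_extra_bits_8bit(int tx_size /* 0 = 4x4 .. 3 = 32x32 */) {
  const int skip_bits = 4 /* TX_SIZES */ - 1 - tx_size;
  return 14 - skip_bits;  /* 4x4 -> 11 bits, 32x32 -> 14 bits */
}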


@@ -23,13 +23,13 @@ static int inv_recenter_nonneg(int v, int m) {
static int decode_uniform(vpx_reader *r) { static int decode_uniform(vpx_reader *r) {
const int l = 8; const int l = 8;
const int m = (1 << l) - 191; const int m = (1 << l) - 191 + CONFIG_MISC_FIXES;
const int v = vpx_read_literal(r, l - 1); const int v = vpx_read_literal(r, l - 1);
return v < m ? v : (v << 1) - m + vpx_read_bit(r); return v < m ? v : (v << 1) - m + vpx_read_bit(r);
} }
static int inv_remap_prob(int v, int m) { static int inv_remap_prob(int v, int m) {
static int inv_map_table[MAX_PROB] = { static uint8_t inv_map_table[MAX_PROB - CONFIG_MISC_FIXES] = {
7, 20, 33, 46, 59, 72, 85, 98, 111, 124, 137, 150, 163, 176, 189, 7, 20, 33, 46, 59, 72, 85, 98, 111, 124, 137, 150, 163, 176, 189,
202, 215, 228, 241, 254, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 202, 215, 228, 241, 254, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 25, 26, 27, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 25, 26, 27,
@@ -46,7 +46,10 @@ static int inv_remap_prob(int v, int m) {
191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 203, 204, 205, 206, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 203, 204, 205, 206,
207, 208, 209, 210, 211, 212, 213, 214, 216, 217, 218, 219, 220, 221, 222, 207, 208, 209, 210, 211, 212, 213, 214, 216, 217, 218, 219, 220, 221, 222,
223, 224, 225, 226, 227, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 223, 224, 225, 226, 227, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238,
239, 240, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 253 239, 240, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253,
#if !CONFIG_MISC_FIXES
253
#endif
}; };
assert(v < (int)(sizeof(inv_map_table) / sizeof(inv_map_table[0]))); assert(v < (int)(sizeof(inv_map_table) / sizeof(inv_map_table[0])));
v = inv_map_table[v]; v = inv_map_table[v];


@@ -15,6 +15,10 @@
#include "vpx_util/vpx_thread.h" #include "vpx_util/vpx_thread.h"
#include "vpx/internal/vpx_codec_internal.h" #include "vpx/internal/vpx_codec_internal.h"
#ifdef __cplusplus
extern "C" {
#endif
struct VP10Common; struct VP10Common;
struct VP10Decoder; struct VP10Decoder;
@@ -63,4 +67,8 @@ void vp10_frameworker_broadcast(RefCntBuffer *const buf, int row);
void vp10_frameworker_copy_context(VPxWorker *const dst_worker, void vp10_frameworker_copy_context(VPxWorker *const dst_worker,
VPxWorker *const src_worker); VPxWorker *const src_worker);
#ifdef __cplusplus
} // extern "C"
#endif
#endif // VP10_DECODER_DTHREAD_H_ #endif // VP10_DECODER_DTHREAD_H_


@@ -1,160 +0,0 @@
/*
* Copyright (c) 2015 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <arm_neon.h>
#include <assert.h>
#include "./vp10_rtcd.h"
#include "./vpx_config.h"
#include "vpx/vpx_integer.h"
static INLINE unsigned int horizontal_add_u16x8(const uint16x8_t v_16x8) {
const uint32x4_t a = vpaddlq_u16(v_16x8);
const uint64x2_t b = vpaddlq_u32(a);
const uint32x2_t c = vadd_u32(vreinterpret_u32_u64(vget_low_u64(b)),
vreinterpret_u32_u64(vget_high_u64(b)));
return vget_lane_u32(c, 0);
}
unsigned int vp10_avg_8x8_neon(const uint8_t *s, int p) {
uint8x8_t v_s0 = vld1_u8(s);
const uint8x8_t v_s1 = vld1_u8(s + p);
uint16x8_t v_sum = vaddl_u8(v_s0, v_s1);
v_s0 = vld1_u8(s + 2 * p);
v_sum = vaddw_u8(v_sum, v_s0);
v_s0 = vld1_u8(s + 3 * p);
v_sum = vaddw_u8(v_sum, v_s0);
v_s0 = vld1_u8(s + 4 * p);
v_sum = vaddw_u8(v_sum, v_s0);
v_s0 = vld1_u8(s + 5 * p);
v_sum = vaddw_u8(v_sum, v_s0);
v_s0 = vld1_u8(s + 6 * p);
v_sum = vaddw_u8(v_sum, v_s0);
v_s0 = vld1_u8(s + 7 * p);
v_sum = vaddw_u8(v_sum, v_s0);
return (horizontal_add_u16x8(v_sum) + 32) >> 6;
}
void vp10_int_pro_row_neon(int16_t hbuf[16], uint8_t const *ref,
const int ref_stride, const int height) {
int i;
uint16x8_t vec_sum_lo = vdupq_n_u16(0);
uint16x8_t vec_sum_hi = vdupq_n_u16(0);
const int shift_factor = ((height >> 5) + 3) * -1;
const int16x8_t vec_shift = vdupq_n_s16(shift_factor);
for (i = 0; i < height; i += 8) {
const uint8x16_t vec_row1 = vld1q_u8(ref);
const uint8x16_t vec_row2 = vld1q_u8(ref + ref_stride);
const uint8x16_t vec_row3 = vld1q_u8(ref + ref_stride * 2);
const uint8x16_t vec_row4 = vld1q_u8(ref + ref_stride * 3);
const uint8x16_t vec_row5 = vld1q_u8(ref + ref_stride * 4);
const uint8x16_t vec_row6 = vld1q_u8(ref + ref_stride * 5);
const uint8x16_t vec_row7 = vld1q_u8(ref + ref_stride * 6);
const uint8x16_t vec_row8 = vld1q_u8(ref + ref_stride * 7);
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row1));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row1));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row2));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row2));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row3));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row3));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row4));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row4));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row5));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row5));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row6));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row6));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row7));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row7));
vec_sum_lo = vaddw_u8(vec_sum_lo, vget_low_u8(vec_row8));
vec_sum_hi = vaddw_u8(vec_sum_hi, vget_high_u8(vec_row8));
ref += ref_stride * 8;
}
vec_sum_lo = vshlq_u16(vec_sum_lo, vec_shift);
vec_sum_hi = vshlq_u16(vec_sum_hi, vec_shift);
vst1q_s16(hbuf, vreinterpretq_s16_u16(vec_sum_lo));
hbuf += 8;
vst1q_s16(hbuf, vreinterpretq_s16_u16(vec_sum_hi));
}
int16_t vp10_int_pro_col_neon(uint8_t const *ref, const int width) {
int i;
uint16x8_t vec_sum = vdupq_n_u16(0);
for (i = 0; i < width; i += 16) {
const uint8x16_t vec_row = vld1q_u8(ref);
vec_sum = vaddw_u8(vec_sum, vget_low_u8(vec_row));
vec_sum = vaddw_u8(vec_sum, vget_high_u8(vec_row));
ref += 16;
}
return horizontal_add_u16x8(vec_sum);
}
// ref, src = [0, 510] - max diff = 16-bits
// bwl = {2, 3, 4}, width = {16, 32, 64}
int vp10_vector_var_neon(int16_t const *ref, int16_t const *src, const int bwl) {
int width = 4 << bwl;
int32x4_t sse = vdupq_n_s32(0);
int16x8_t total = vdupq_n_s16(0);
assert(width >= 8);
assert((width % 8) == 0);
do {
const int16x8_t r = vld1q_s16(ref);
const int16x8_t s = vld1q_s16(src);
const int16x8_t diff = vsubq_s16(r, s); // [-510, 510], 10 bits.
const int16x4_t diff_lo = vget_low_s16(diff);
const int16x4_t diff_hi = vget_high_s16(diff);
sse = vmlal_s16(sse, diff_lo, diff_lo); // dynamic range 26 bits.
sse = vmlal_s16(sse, diff_hi, diff_hi);
total = vaddq_s16(total, diff); // dynamic range 16 bits.
ref += 8;
src += 8;
width -= 8;
} while (width != 0);
{
// Note: 'total''s pairwise addition could be implemented similarly to
// horizontal_add_u16x8(), but one less vpaddl with 'total' when paired
// with the summation of 'sse' performed better on a Cortex-A15.
const int32x4_t t0 = vpaddlq_s16(total); // cascading summation of 'total'
const int32x2_t t1 = vadd_s32(vget_low_s32(t0), vget_high_s32(t0));
const int32x2_t t2 = vpadd_s32(t1, t1);
const int t = vget_lane_s32(t2, 0);
const int64x2_t s0 = vpaddlq_s32(sse); // cascading summation of 'sse'.
const int32x2_t s1 = vadd_s32(vreinterpret_s32_s64(vget_low_s64(s0)),
vreinterpret_s32_s64(vget_high_s64(s0)));
const int s = vget_lane_s32(s1, 0);
const int shift_factor = bwl + 2;
return s - ((t * t) >> shift_factor);
}
}


@@ -45,6 +45,19 @@ static const struct vp10_token partition_encodings[PARTITION_TYPES] =
static const struct vp10_token inter_mode_encodings[INTER_MODES] = static const struct vp10_token inter_mode_encodings[INTER_MODES] =
{{2, 2}, {6, 3}, {0, 1}, {7, 3}}; {{2, 2}, {6, 3}, {0, 1}, {7, 3}};
static INLINE void write_uniform(vpx_writer *w, int n, int v) {
int l = get_unsigned_bits(n);
int m = (1 << l) - n;
if (l == 0)
return;
if (v < m) {
vpx_write_literal(w, v, l - 1);
} else {
vpx_write_literal(w, m + ((v - m) >> 1), l - 1);
vpx_write_literal(w, (v - m) & 1, 1);
}
}
static void write_intra_mode(vpx_writer *w, PREDICTION_MODE mode, static void write_intra_mode(vpx_writer *w, PREDICTION_MODE mode,
const vpx_prob *probs) { const vpx_prob *probs) {
vp10_write_token(w, vp10_intra_mode_tree, probs, &intra_mode_encodings[mode]); vp10_write_token(w, vp10_intra_mode_tree, probs, &intra_mode_encodings[mode]);
@@ -122,8 +135,11 @@ static void update_switchable_interp_probs(VP10_COMMON *cm, vpx_writer *w,
static void pack_mb_tokens(vpx_writer *w, static void pack_mb_tokens(vpx_writer *w,
TOKENEXTRA **tp, const TOKENEXTRA *const stop, TOKENEXTRA **tp, const TOKENEXTRA *const stop,
vpx_bit_depth_t bit_depth) { vpx_bit_depth_t bit_depth, const TX_SIZE tx) {
TOKENEXTRA *p = *tp; TOKENEXTRA *p = *tp;
#if !CONFIG_MISC_FIXES
(void) tx;
#endif
while (p < stop && p->token != EOSB_TOKEN) { while (p < stop && p->token != EOSB_TOKEN) {
const int t = p->token; const int t = p->token;
@@ -171,6 +187,12 @@ static void pack_mb_tokens(vpx_writer *w,
if (b->base_val) { if (b->base_val) {
const int e = p->extra, l = b->len; const int e = p->extra, l = b->len;
#if CONFIG_MISC_FIXES
int skip_bits =
(b->base_val == CAT6_MIN_VAL) ? TX_SIZES - 1 - tx : 0;
#else
int skip_bits = 0;
#endif
if (l) { if (l) {
const unsigned char *pb = b->prob; const unsigned char *pb = b->prob;
@@ -180,7 +202,12 @@ static void pack_mb_tokens(vpx_writer *w,
do { do {
const int bb = (v >> --n) & 1; const int bb = (v >> --n) & 1;
vpx_write(w, bb, pb[i >> 1]); if (skip_bits) {
skip_bits--;
assert(!bb);
} else {
vpx_write(w, bb, pb[i >> 1]);
}
i = b->tree[i + bb]; i = b->tree[i + bb];
} while (n); } while (n);
} }
@@ -190,13 +217,14 @@ static void pack_mb_tokens(vpx_writer *w,
++p; ++p;
} }
*tp = p + (p->token == EOSB_TOKEN); *tp = p;
} }
static void write_segment_id(vpx_writer *w, const struct segmentation *seg, static void write_segment_id(vpx_writer *w, const struct segmentation *seg,
const struct segmentation_probs *segp,
int segment_id) { int segment_id) {
if (seg->enabled && seg->update_map) if (seg->enabled && seg->update_map)
vp10_write_tree(w, vp10_segment_tree, seg->tree_probs, segment_id, 3, 0); vp10_write_tree(w, vp10_segment_tree, segp->tree_probs, segment_id, 3, 0);
} }
// This function encodes the reference frame // This function encodes the reference frame
@@ -242,6 +270,11 @@ static void pack_inter_mode_mvs(VP10_COMP *cpi, const MODE_INFO *mi,
const MACROBLOCK *const x = &cpi->td.mb; const MACROBLOCK *const x = &cpi->td.mb;
const MACROBLOCKD *const xd = &x->e_mbd; const MACROBLOCKD *const xd = &x->e_mbd;
const struct segmentation *const seg = &cm->seg; const struct segmentation *const seg = &cm->seg;
#if CONFIG_MISC_FIXES
const struct segmentation_probs *const segp = &cm->fc->seg;
#else
const struct segmentation_probs *const segp = &cm->segp;
#endif
const MB_MODE_INFO *const mbmi = &mi->mbmi; const MB_MODE_INFO *const mbmi = &mi->mbmi;
const MB_MODE_INFO_EXT *const mbmi_ext = x->mbmi_ext; const MB_MODE_INFO_EXT *const mbmi_ext = x->mbmi_ext;
const PREDICTION_MODE mode = mbmi->mode; const PREDICTION_MODE mode = mbmi->mode;
@@ -255,12 +288,12 @@ static void pack_inter_mode_mvs(VP10_COMP *cpi, const MODE_INFO *mi,
if (seg->update_map) { if (seg->update_map) {
if (seg->temporal_update) { if (seg->temporal_update) {
const int pred_flag = mbmi->seg_id_predicted; const int pred_flag = mbmi->seg_id_predicted;
vpx_prob pred_prob = vp10_get_pred_prob_seg_id(seg, xd); vpx_prob pred_prob = vp10_get_pred_prob_seg_id(segp, xd);
vpx_write(w, pred_flag, pred_prob); vpx_write(w, pred_flag, pred_prob);
if (!pred_flag) if (!pred_flag)
write_segment_id(w, seg, segment_id); write_segment_id(w, seg, segp, segment_id);
} else { } else {
write_segment_id(w, seg, segment_id); write_segment_id(w, seg, segp, segment_id);
} }
} }
@@ -270,7 +303,7 @@ static void pack_inter_mode_mvs(VP10_COMP *cpi, const MODE_INFO *mi,
vpx_write(w, is_inter, vp10_get_intra_inter_prob(cm, xd)); vpx_write(w, is_inter, vp10_get_intra_inter_prob(cm, xd));
if (bsize >= BLOCK_8X8 && cm->tx_mode == TX_MODE_SELECT && if (bsize >= BLOCK_8X8 && cm->tx_mode == TX_MODE_SELECT &&
!(is_inter && skip)) { !(is_inter && skip) && !xd->lossless[segment_id]) {
write_selected_tx_size(cm, xd, w); write_selected_tx_size(cm, xd, w);
} }
@@ -342,6 +375,11 @@ static void pack_inter_mode_mvs(VP10_COMP *cpi, const MODE_INFO *mi,
static void write_mb_modes_kf(const VP10_COMMON *cm, const MACROBLOCKD *xd, static void write_mb_modes_kf(const VP10_COMMON *cm, const MACROBLOCKD *xd,
MODE_INFO **mi_8x8, vpx_writer *w) { MODE_INFO **mi_8x8, vpx_writer *w) {
const struct segmentation *const seg = &cm->seg; const struct segmentation *const seg = &cm->seg;
#if CONFIG_MISC_FIXES
const struct segmentation_probs *const segp = &cm->fc->seg;
#else
const struct segmentation_probs *const segp = &cm->segp;
#endif
const MODE_INFO *const mi = mi_8x8[0]; const MODE_INFO *const mi = mi_8x8[0];
const MODE_INFO *const above_mi = xd->above_mi; const MODE_INFO *const above_mi = xd->above_mi;
const MODE_INFO *const left_mi = xd->left_mi; const MODE_INFO *const left_mi = xd->left_mi;
@@ -349,15 +387,17 @@ static void write_mb_modes_kf(const VP10_COMMON *cm, const MACROBLOCKD *xd,
const BLOCK_SIZE bsize = mbmi->sb_type; const BLOCK_SIZE bsize = mbmi->sb_type;
if (seg->update_map) if (seg->update_map)
write_segment_id(w, seg, mbmi->segment_id); write_segment_id(w, seg, segp, mbmi->segment_id);
write_skip(cm, xd, mbmi->segment_id, mi, w); write_skip(cm, xd, mbmi->segment_id, mi, w);
if (bsize >= BLOCK_8X8 && cm->tx_mode == TX_MODE_SELECT) if (bsize >= BLOCK_8X8 && cm->tx_mode == TX_MODE_SELECT &&
!xd->lossless[mbmi->segment_id])
write_selected_tx_size(cm, xd, w); write_selected_tx_size(cm, xd, w);
if (bsize >= BLOCK_8X8) { if (bsize >= BLOCK_8X8) {
write_intra_mode(w, mbmi->mode, get_y_mode_probs(mi, above_mi, left_mi, 0)); write_intra_mode(w, mbmi->mode,
get_y_mode_probs(cm, mi, above_mi, left_mi, 0));
} else { } else {
const int num_4x4_w = num_4x4_blocks_wide_lookup[bsize]; const int num_4x4_w = num_4x4_blocks_wide_lookup[bsize];
const int num_4x4_h = num_4x4_blocks_high_lookup[bsize]; const int num_4x4_h = num_4x4_blocks_high_lookup[bsize];
@@ -367,12 +407,12 @@ static void write_mb_modes_kf(const VP10_COMMON *cm, const MACROBLOCKD *xd,
for (idx = 0; idx < 2; idx += num_4x4_w) { for (idx = 0; idx < 2; idx += num_4x4_w) {
const int block = idy * 2 + idx; const int block = idy * 2 + idx;
write_intra_mode(w, mi->bmi[block].as_mode, write_intra_mode(w, mi->bmi[block].as_mode,
get_y_mode_probs(mi, above_mi, left_mi, block)); get_y_mode_probs(cm, mi, above_mi, left_mi, block));
} }
} }
} }
write_intra_mode(w, mbmi->uv_mode, vp10_kf_uv_mode_prob[mbmi->mode]); write_intra_mode(w, mbmi->uv_mode, cm->fc->uv_mode_prob[mbmi->mode]);
} }
static void write_modes_b(VP10_COMP *cpi, const TileInfo *const tile, static void write_modes_b(VP10_COMP *cpi, const TileInfo *const tile,
@@ -382,12 +422,12 @@ static void write_modes_b(VP10_COMP *cpi, const TileInfo *const tile,
const VP10_COMMON *const cm = &cpi->common; const VP10_COMMON *const cm = &cpi->common;
MACROBLOCKD *const xd = &cpi->td.mb.e_mbd; MACROBLOCKD *const xd = &cpi->td.mb.e_mbd;
MODE_INFO *m; MODE_INFO *m;
int plane;
xd->mi = cm->mi_grid_visible + (mi_row * cm->mi_stride + mi_col); xd->mi = cm->mi_grid_visible + (mi_row * cm->mi_stride + mi_col);
m = xd->mi[0]; m = xd->mi[0];
cpi->td.mb.mbmi_ext = cpi->td.mb.mbmi_ext_base + cpi->td.mb.mbmi_ext = cpi->mbmi_ext_base + (mi_row * cm->mi_cols + mi_col);
(mi_row * cm->mi_cols + mi_col);
set_mi_row_col(xd, tile, set_mi_row_col(xd, tile,
mi_row, num_8x8_blocks_high_lookup[m->mbmi.sb_type], mi_row, num_8x8_blocks_high_lookup[m->mbmi.sb_type],
@@ -399,8 +439,16 @@ static void write_modes_b(VP10_COMP *cpi, const TileInfo *const tile,
pack_inter_mode_mvs(cpi, m, w); pack_inter_mode_mvs(cpi, m, w);
} }
assert(*tok < tok_end); if (!m->mbmi.skip) {
pack_mb_tokens(w, tok, tok_end, cm->bit_depth); assert(*tok < tok_end);
for (plane = 0; plane < MAX_MB_PLANE; ++plane) {
TX_SIZE tx = plane ? get_uv_tx_size(&m->mbmi, &xd->plane[plane])
: m->mbmi.tx_size;
pack_mb_tokens(w, tok, tok_end, cm->bit_depth, tx);
assert(*tok < tok_end && (*tok)->token == EOSB_TOKEN);
(*tok)++;
}
}
} }
static void write_partition(const VP10_COMMON *const cm, static void write_partition(const VP10_COMMON *const cm,
@@ -408,7 +456,7 @@ static void write_partition(const VP10_COMMON *const cm,
int hbs, int mi_row, int mi_col, int hbs, int mi_row, int mi_col,
PARTITION_TYPE p, BLOCK_SIZE bsize, vpx_writer *w) { PARTITION_TYPE p, BLOCK_SIZE bsize, vpx_writer *w) {
const int ctx = partition_plane_context(xd, mi_row, mi_col, bsize); const int ctx = partition_plane_context(xd, mi_row, mi_col, bsize);
const vpx_prob *const probs = xd->partition_probs[ctx]; const vpx_prob *const probs = cm->fc->partition_prob[ctx];
const int has_rows = (mi_row + hbs) < cm->mi_rows; const int has_rows = (mi_row + hbs) < cm->mi_rows;
const int has_cols = (mi_col + hbs) < cm->mi_cols; const int has_cols = (mi_col + hbs) < cm->mi_cols;
@@ -486,12 +534,9 @@ static void write_modes_sb(VP10_COMP *cpi,
static void write_modes(VP10_COMP *cpi, static void write_modes(VP10_COMP *cpi,
const TileInfo *const tile, vpx_writer *w, const TileInfo *const tile, vpx_writer *w,
TOKENEXTRA **tok, const TOKENEXTRA *const tok_end) { TOKENEXTRA **tok, const TOKENEXTRA *const tok_end) {
const VP10_COMMON *const cm = &cpi->common;
MACROBLOCKD *const xd = &cpi->td.mb.e_mbd; MACROBLOCKD *const xd = &cpi->td.mb.e_mbd;
int mi_row, mi_col; int mi_row, mi_col;
set_partition_probs(cm, xd);
for (mi_row = tile->mi_row_start; mi_row < tile->mi_row_end; for (mi_row = tile->mi_row_start; mi_row < tile->mi_row_end;
mi_row += MI_BLOCK_SIZE) { mi_row += MI_BLOCK_SIZE) {
vp10_zero(xd->left_seg_context); vp10_zero(xd->left_seg_context);
@@ -714,8 +759,7 @@ static void encode_loopfilter(struct loopfilter *lf,
vpx_wb_write_bit(wb, changed); vpx_wb_write_bit(wb, changed);
if (changed) { if (changed) {
lf->last_ref_deltas[i] = delta; lf->last_ref_deltas[i] = delta;
vpx_wb_write_literal(wb, abs(delta) & 0x3F, 6); vpx_wb_write_inv_signed_literal(wb, delta, 6);
vpx_wb_write_bit(wb, delta < 0);
} }
} }
@@ -725,8 +769,7 @@ static void encode_loopfilter(struct loopfilter *lf,
vpx_wb_write_bit(wb, changed); vpx_wb_write_bit(wb, changed);
if (changed) { if (changed) {
lf->last_mode_deltas[i] = delta; lf->last_mode_deltas[i] = delta;
vpx_wb_write_literal(wb, abs(delta) & 0x3F, 6); vpx_wb_write_inv_signed_literal(wb, delta, 6);
vpx_wb_write_bit(wb, delta < 0);
} }
} }
} }
@@ -736,8 +779,7 @@ static void encode_loopfilter(struct loopfilter *lf,
static void write_delta_q(struct vpx_write_bit_buffer *wb, int delta_q) { static void write_delta_q(struct vpx_write_bit_buffer *wb, int delta_q) {
if (delta_q != 0) { if (delta_q != 0) {
vpx_wb_write_bit(wb, 1); vpx_wb_write_bit(wb, 1);
vpx_wb_write_literal(wb, abs(delta_q), 4); vpx_wb_write_inv_signed_literal(wb, delta_q, CONFIG_MISC_FIXES ? 6 : 4);
vpx_wb_write_bit(wb, delta_q < 0);
} else { } else {
vpx_wb_write_bit(wb, 0); vpx_wb_write_bit(wb, 0);
} }
@@ -756,6 +798,9 @@ static void encode_segmentation(VP10_COMMON *cm, MACROBLOCKD *xd,
int i, j; int i, j;
const struct segmentation *seg = &cm->seg; const struct segmentation *seg = &cm->seg;
#if !CONFIG_MISC_FIXES
const struct segmentation_probs *segp = &cm->segp;
#endif
vpx_wb_write_bit(wb, seg->enabled); vpx_wb_write_bit(wb, seg->enabled);
if (!seg->enabled) if (!seg->enabled)
@@ -770,14 +815,16 @@ static void encode_segmentation(VP10_COMMON *cm, MACROBLOCKD *xd,
if (seg->update_map) { if (seg->update_map) {
// Select the coding strategy (temporal or spatial) // Select the coding strategy (temporal or spatial)
vp10_choose_segmap_coding_method(cm, xd); vp10_choose_segmap_coding_method(cm, xd);
#if !CONFIG_MISC_FIXES
// Write out probabilities used to decode unpredicted macro-block segments // Write out probabilities used to decode unpredicted macro-block segments
for (i = 0; i < SEG_TREE_PROBS; i++) { for (i = 0; i < SEG_TREE_PROBS; i++) {
const int prob = seg->tree_probs[i]; const int prob = segp->tree_probs[i];
const int update = prob != MAX_PROB; const int update = prob != MAX_PROB;
vpx_wb_write_bit(wb, update); vpx_wb_write_bit(wb, update);
if (update) if (update)
vpx_wb_write_literal(wb, prob, 8); vpx_wb_write_literal(wb, prob, 8);
} }
#endif
// Write out the chosen coding method. // Write out the chosen coding method.
if (!frame_is_intra_only(cm) && !cm->error_resilient_mode) { if (!frame_is_intra_only(cm) && !cm->error_resilient_mode) {
@@ -785,15 +832,18 @@ static void encode_segmentation(VP10_COMMON *cm, MACROBLOCKD *xd,
} else { } else {
assert(seg->temporal_update == 0); assert(seg->temporal_update == 0);
} }
#if !CONFIG_MISC_FIXES
if (seg->temporal_update) { if (seg->temporal_update) {
for (i = 0; i < PREDICTION_PROBS; i++) { for (i = 0; i < PREDICTION_PROBS; i++) {
const int prob = seg->pred_probs[i]; const int prob = segp->pred_probs[i];
const int update = prob != MAX_PROB; const int update = prob != MAX_PROB;
vpx_wb_write_bit(wb, update); vpx_wb_write_bit(wb, update);
if (update) if (update)
vpx_wb_write_literal(wb, prob, 8); vpx_wb_write_literal(wb, prob, 8);
} }
} }
#endif
} }
// Segmentation data // Segmentation data
@@ -821,14 +871,45 @@ static void encode_segmentation(VP10_COMMON *cm, MACROBLOCKD *xd,
} }
} }
static void encode_txfm_probs(VP10_COMMON *cm, vpx_writer *w, #if CONFIG_MISC_FIXES
FRAME_COUNTS *counts) { static void update_seg_probs(VP10_COMP *cpi, vpx_writer *w) {
// Mode VP10_COMMON *cm = &cpi->common;
vpx_write_literal(w, VPXMIN(cm->tx_mode, ALLOW_32X32), 2);
if (cm->tx_mode >= ALLOW_32X32) if (!cpi->common.seg.enabled)
vpx_write_bit(w, cm->tx_mode == TX_MODE_SELECT); return;
if (cpi->common.seg.temporal_update) {
int i;
for (i = 0; i < PREDICTION_PROBS; i++)
vp10_cond_prob_diff_update(w, &cm->fc->seg.pred_probs[i],
cm->counts.seg.pred[i]);
prob_diff_update(vp10_segment_tree, cm->fc->seg.tree_probs,
cm->counts.seg.tree_mispred, MAX_SEGMENTS, w);
} else {
prob_diff_update(vp10_segment_tree, cm->fc->seg.tree_probs,
cm->counts.seg.tree_total, MAX_SEGMENTS, w);
}
}
static void write_txfm_mode(TX_MODE mode, struct vpx_write_bit_buffer *wb) {
vpx_wb_write_bit(wb, mode == TX_MODE_SELECT);
if (mode != TX_MODE_SELECT)
vpx_wb_write_literal(wb, mode, 2);
}
#else
static void write_txfm_mode(TX_MODE mode, struct vpx_writer *wb) {
vpx_write_literal(wb, VPXMIN(mode, ALLOW_32X32), 2);
if (mode >= ALLOW_32X32)
vpx_write_bit(wb, mode == TX_MODE_SELECT);
}
#endif
static void update_txfm_probs(VP10_COMMON *cm, vpx_writer *w,
FRAME_COUNTS *counts) {
// Probabilities
if (cm->tx_mode == TX_MODE_SELECT) { if (cm->tx_mode == TX_MODE_SELECT) {
int i, j; int i, j;
unsigned int ct_8x8p[TX_SIZES - 3][2]; unsigned int ct_8x8p[TX_SIZES - 3][2];
@@ -933,7 +1014,8 @@ static int get_refresh_mask(VP10_COMP *cpi) {
} }
} }
static size_t encode_tiles(VP10_COMP *cpi, uint8_t *data_ptr) { static size_t encode_tiles(VP10_COMP *cpi, uint8_t *data_ptr,
unsigned int *max_tile_sz) {
VP10_COMMON *const cm = &cpi->common; VP10_COMMON *const cm = &cpi->common;
vpx_writer residual_bc; vpx_writer residual_bc;
int tile_row, tile_col; int tile_row, tile_col;
@@ -941,6 +1023,7 @@ static size_t encode_tiles(VP10_COMP *cpi, uint8_t *data_ptr) {
size_t total_size = 0; size_t total_size = 0;
const int tile_cols = 1 << cm->log2_tile_cols; const int tile_cols = 1 << cm->log2_tile_cols;
const int tile_rows = 1 << cm->log2_tile_rows; const int tile_rows = 1 << cm->log2_tile_rows;
unsigned int max_tile = 0;
memset(cm->above_seg_context, 0, memset(cm->above_seg_context, 0,
sizeof(*cm->above_seg_context) * mi_cols_aligned_to_sb(cm->mi_cols)); sizeof(*cm->above_seg_context) * mi_cols_aligned_to_sb(cm->mi_cols));
@@ -963,26 +1046,32 @@ static size_t encode_tiles(VP10_COMP *cpi, uint8_t *data_ptr) {
assert(tok == tok_end); assert(tok == tok_end);
vpx_stop_encode(&residual_bc); vpx_stop_encode(&residual_bc);
if (tile_col < tile_cols - 1 || tile_row < tile_rows - 1) { if (tile_col < tile_cols - 1 || tile_row < tile_rows - 1) {
unsigned int tile_sz;
// size of this tile // size of this tile
mem_put_be32(data_ptr + total_size, residual_bc.pos); assert(residual_bc.pos > 0);
tile_sz = residual_bc.pos - CONFIG_MISC_FIXES;
mem_put_le32(data_ptr + total_size, tile_sz);
max_tile = max_tile > tile_sz ? max_tile : tile_sz;
total_size += 4; total_size += 4;
} }
total_size += residual_bc.pos; total_size += residual_bc.pos;
} }
} }
*max_tile_sz = max_tile;
return total_size; return total_size;
} }
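A minimal sketch (not part of the patch) of the tile-size field as encode_tiles() now writes it: the per-tile size is stored little-endian, and with CONFIG_MISC_FIXES it is stored minus one (the assert above guarantees residual_bc.pos > 0). The function name is hypothetical; the byte stores are equivalent to mem_put_le32().

#include <stdint.h>

static void sketch_put_tile_size(uint8_t *dst, unsigned int tile_sz_bytes) {
  const unsigned int coded = tile_sz_bytes - 1;  /* residual_bc.pos - 1 */
  dst[0] = (uint8_t)(coded & 0xff);
  dst[1] = (uint8_t)((coded >> 8) & 0xff);
  dst[2] = (uint8_t)((coded >> 16) & 0xff);
  dst[3] = (uint8_t)((coded >> 24) & 0xff);
}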
static void write_display_size(const VP10_COMMON *cm, static void write_render_size(const VP10_COMMON *cm,
struct vpx_write_bit_buffer *wb) { struct vpx_write_bit_buffer *wb) {
const int scaling_active = cm->width != cm->display_width || const int scaling_active = cm->width != cm->render_width ||
cm->height != cm->display_height; cm->height != cm->render_height;
vpx_wb_write_bit(wb, scaling_active); vpx_wb_write_bit(wb, scaling_active);
if (scaling_active) { if (scaling_active) {
vpx_wb_write_literal(wb, cm->display_width - 1, 16); vpx_wb_write_literal(wb, cm->render_width - 1, 16);
vpx_wb_write_literal(wb, cm->display_height - 1, 16); vpx_wb_write_literal(wb, cm->render_height - 1, 16);
} }
} }
@@ -991,7 +1080,7 @@ static void write_frame_size(const VP10_COMMON *cm,
vpx_wb_write_literal(wb, cm->width - 1, 16); vpx_wb_write_literal(wb, cm->width - 1, 16);
vpx_wb_write_literal(wb, cm->height - 1, 16); vpx_wb_write_literal(wb, cm->height - 1, 16);
write_display_size(cm, wb); write_render_size(cm, wb);
} }
static void write_frame_size_with_refs(VP10_COMP *cpi, static void write_frame_size_with_refs(VP10_COMP *cpi,
@@ -1006,6 +1095,10 @@ static void write_frame_size_with_refs(VP10_COMP *cpi,
if (cfg != NULL) { if (cfg != NULL) {
found = cm->width == cfg->y_crop_width && found = cm->width == cfg->y_crop_width &&
cm->height == cfg->y_crop_height; cm->height == cfg->y_crop_height;
#if CONFIG_MISC_FIXES
found &= cm->render_width == cfg->render_width &&
cm->render_height == cfg->render_height;
#endif
} }
vpx_wb_write_bit(wb, found); vpx_wb_write_bit(wb, found);
if (found) { if (found) {
@@ -1016,9 +1109,15 @@ static void write_frame_size_with_refs(VP10_COMP *cpi,
if (!found) { if (!found) {
vpx_wb_write_literal(wb, cm->width - 1, 16); vpx_wb_write_literal(wb, cm->width - 1, 16);
vpx_wb_write_literal(wb, cm->height - 1, 16); vpx_wb_write_literal(wb, cm->height - 1, 16);
#if CONFIG_MISC_FIXES
write_render_size(cm, wb);
#endif
} }
write_display_size(cm, wb); #if !CONFIG_MISC_FIXES
write_render_size(cm, wb);
#endif
} }
static void write_sync_code(struct vpx_write_bit_buffer *wb) { static void write_sync_code(struct vpx_write_bit_buffer *wb) {
@@ -1055,7 +1154,8 @@ static void write_bitdepth_colorspace_sampling(
} }
vpx_wb_write_literal(wb, cm->color_space, 3); vpx_wb_write_literal(wb, cm->color_space, 3);
if (cm->color_space != VPX_CS_SRGB) { if (cm->color_space != VPX_CS_SRGB) {
vpx_wb_write_bit(wb, 0); // 0: [16, 235] (i.e. xvYCC), 1: [0, 255] // 0: [16, 235] (i.e. xvYCC), 1: [0, 255]
vpx_wb_write_bit(wb, cm->color_range);
if (cm->profile == PROFILE_1 || cm->profile == PROFILE_3) { if (cm->profile == PROFILE_1 || cm->profile == PROFILE_3) {
assert(cm->subsampling_x != 1 || cm->subsampling_y != 1); assert(cm->subsampling_x != 1 || cm->subsampling_y != 1);
vpx_wb_write_bit(wb, cm->subsampling_x); vpx_wb_write_bit(wb, cm->subsampling_x);
@@ -1092,16 +1192,37 @@ static void write_uncompressed_header(VP10_COMP *cpi,
if (!cm->show_frame) if (!cm->show_frame)
vpx_wb_write_bit(wb, cm->intra_only); vpx_wb_write_bit(wb, cm->intra_only);
if (!cm->error_resilient_mode) if (!cm->error_resilient_mode) {
vpx_wb_write_literal(wb, cm->reset_frame_context, 2); #if CONFIG_MISC_FIXES
if (cm->intra_only) {
vpx_wb_write_bit(wb,
cm->reset_frame_context == RESET_FRAME_CONTEXT_ALL);
} else {
vpx_wb_write_bit(wb,
cm->reset_frame_context != RESET_FRAME_CONTEXT_NONE);
if (cm->reset_frame_context != RESET_FRAME_CONTEXT_NONE)
vpx_wb_write_bit(wb,
cm->reset_frame_context == RESET_FRAME_CONTEXT_ALL);
}
#else
static const int reset_frame_context_conv_tbl[3] = { 0, 2, 3 };
vpx_wb_write_literal(wb,
reset_frame_context_conv_tbl[cm->reset_frame_context], 2);
#endif
}
if (cm->intra_only) { if (cm->intra_only) {
write_sync_code(wb); write_sync_code(wb);
#if CONFIG_MISC_FIXES
write_bitdepth_colorspace_sampling(cm, wb);
#else
// Note for profile 0, 420 8bpp is assumed. // Note for profile 0, 420 8bpp is assumed.
if (cm->profile > PROFILE_0) { if (cm->profile > PROFILE_0) {
write_bitdepth_colorspace_sampling(cm, wb); write_bitdepth_colorspace_sampling(cm, wb);
} }
#endif
vpx_wb_write_literal(wb, get_refresh_mask(cpi), REF_FRAMES); vpx_wb_write_literal(wb, get_refresh_mask(cpi), REF_FRAMES);
write_frame_size(cm, wb); write_frame_size(cm, wb);
@@ -1125,8 +1246,13 @@ static void write_uncompressed_header(VP10_COMP *cpi,
} }
if (!cm->error_resilient_mode) { if (!cm->error_resilient_mode) {
vpx_wb_write_bit(wb, cm->refresh_frame_context); vpx_wb_write_bit(wb,
vpx_wb_write_bit(wb, cm->frame_parallel_decoding_mode); cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_OFF);
#if CONFIG_MISC_FIXES
if (cm->refresh_frame_context != REFRESH_FRAME_CONTEXT_OFF)
#endif
vpx_wb_write_bit(wb, cm->refresh_frame_context !=
REFRESH_FRAME_CONTEXT_BACKWARD);
} }
vpx_wb_write_literal(wb, cm->frame_context_idx, FRAME_CONTEXTS_LOG2); vpx_wb_write_literal(wb, cm->frame_context_idx, FRAME_CONTEXTS_LOG2);
@@ -1134,30 +1260,69 @@ static void write_uncompressed_header(VP10_COMP *cpi,
encode_loopfilter(&cm->lf, wb);
encode_quantization(cm, wb);
encode_segmentation(cm, xd, wb);
#if CONFIG_MISC_FIXES
if (!cm->seg.enabled && xd->lossless[0])
cm->tx_mode = TX_4X4;
else
write_txfm_mode(cm->tx_mode, wb);
if (cpi->allow_comp_inter_inter) {
const int use_hybrid_pred = cm->reference_mode == REFERENCE_MODE_SELECT;
const int use_compound_pred = cm->reference_mode != SINGLE_REFERENCE;
vpx_wb_write_bit(wb, use_hybrid_pred);
if (!use_hybrid_pred)
vpx_wb_write_bit(wb, use_compound_pred);
}
#endif
write_tile_info(cm, wb);
}
static size_t write_compressed_header(VP10_COMP *cpi, uint8_t *data) {
VP10_COMMON *const cm = &cpi->common;
MACROBLOCKD *const xd = &cpi->td.mb.e_mbd;
FRAME_CONTEXT *const fc = cm->fc;
FRAME_COUNTS *counts = cpi->td.counts;
vpx_writer header_bc;
int i;
#if CONFIG_MISC_FIXES
int j;
#endif
vpx_start_encode(&header_bc, data);
#if !CONFIG_MISC_FIXES
if (cpi->td.mb.e_mbd.lossless[0]) {
cm->tx_mode = TX_4X4;
} else {
write_txfm_mode(cm->tx_mode, &header_bc);
update_txfm_probs(cm, &header_bc, counts);
}
#else
update_txfm_probs(cm, &header_bc, counts);
#endif
update_coef_probs(cpi, &header_bc);
update_skip_probs(cm, &header_bc, counts);
#if CONFIG_MISC_FIXES
update_seg_probs(cpi, &header_bc);
for (i = 0; i < INTRA_MODES; ++i)
prob_diff_update(vp10_intra_mode_tree, fc->uv_mode_prob[i],
counts->uv_mode[i], INTRA_MODES, &header_bc);
for (i = 0; i < PARTITION_CONTEXTS; ++i)
prob_diff_update(vp10_partition_tree, fc->partition_prob[i],
counts->partition[i], PARTITION_TYPES, &header_bc);
#endif
if (frame_is_intra_only(cm)) {
vp10_copy(cm->kf_y_prob, vp10_kf_y_mode_prob);
#if CONFIG_MISC_FIXES
for (i = 0; i < INTRA_MODES; ++i)
for (j = 0; j < INTRA_MODES; ++j)
prob_diff_update(vp10_intra_mode_tree, cm->kf_y_prob[i][j],
counts->kf_y_mode[i][j], INTRA_MODES, &header_bc);
#endif
} else {
for (i = 0; i < INTER_MODE_CONTEXTS; ++i)
prob_diff_update(vp10_inter_mode_tree, cm->fc->inter_mode_probs[i],
counts->inter_mode[i], INTER_MODES, &header_bc);
@@ -1170,8 +1335,9 @@ static size_t write_compressed_header(VP10_COMP *cpi, uint8_t *data) {
counts->intra_inter[i]);
if (cpi->allow_comp_inter_inter) {
const int use_hybrid_pred = cm->reference_mode == REFERENCE_MODE_SELECT;
#if !CONFIG_MISC_FIXES
const int use_compound_pred = cm->reference_mode != SINGLE_REFERENCE;
vpx_write_bit(&header_bc, use_compound_pred);
if (use_compound_pred) {
@@ -1181,6 +1347,12 @@ static size_t write_compressed_header(VP10_COMP *cpi, uint8_t *data) {
vp10_cond_prob_diff_update(&header_bc, &fc->comp_inter_prob[i],
counts->comp_inter[i]);
}
#else
if (use_hybrid_pred)
for (i = 0; i < COMP_INTER_CONTEXTS; i++)
vp10_cond_prob_diff_update(&header_bc, &fc->comp_inter_prob[i],
counts->comp_inter[i]);
#endif
}
if (cm->reference_mode != COMPOUND_REFERENCE) {
@@ -1201,9 +1373,11 @@ static size_t write_compressed_header(VP10_COMP *cpi, uint8_t *data) {
prob_diff_update(vp10_intra_mode_tree, cm->fc->y_mode_prob[i],
counts->y_mode[i], INTRA_MODES, &header_bc);
#if !CONFIG_MISC_FIXES
for (i = 0; i < PARTITION_CONTEXTS; ++i)
prob_diff_update(vp10_partition_tree, fc->partition_prob[i],
counts->partition[i], PARTITION_TYPES, &header_bc);
#endif
vp10_write_nmv_probs(cm, cm->allow_high_precision_mv, &header_bc,
&counts->mv);
@@ -1215,15 +1389,67 @@ static size_t write_compressed_header(VP10_COMP *cpi, uint8_t *data) {
return header_bc.pos;
}
#if CONFIG_MISC_FIXES
static int remux_tiles(uint8_t *dest, const int sz,
const int n_tiles, const int mag) {
int rpos = 0, wpos = 0, n;
for (n = 0; n < n_tiles; n++) {
int tile_sz;
if (n == n_tiles - 1) {
tile_sz = sz - rpos;
} else {
tile_sz = mem_get_le32(&dest[rpos]) + 1;
rpos += 4;
switch (mag) {
case 0:
dest[wpos] = tile_sz - 1;
break;
case 1:
mem_put_le16(&dest[wpos], tile_sz - 1);
break;
case 2:
mem_put_le24(&dest[wpos], tile_sz - 1);
break;
case 3: // remuxing should only happen if mag < 3
default:
assert("Invalid value for tile size magnitude" && 0);
}
wpos += mag + 1;
}
memmove(&dest[wpos], &dest[rpos], tile_sz);
wpos += tile_sz;
rpos += tile_sz;
}
assert(rpos > wpos);
assert(rpos == sz);
return wpos;
}
#endif
void vp10_pack_bitstream(VP10_COMP *const cpi, uint8_t *dest, size_t *size) {
uint8_t *data = dest;
size_t first_part_size, uncompressed_hdr_size, data_sz;
struct vpx_write_bit_buffer wb = {data, 0};
struct vpx_write_bit_buffer saved_wb;
unsigned int max_tile;
#if CONFIG_MISC_FIXES
VP10_COMMON *const cm = &cpi->common;
const int n_log2_tiles = cm->log2_tile_rows + cm->log2_tile_cols;
const int have_tiles = n_log2_tiles > 0;
#else
const int have_tiles = 0; // we have tiles, but we don't want to write a
// tile size marker in the header
#endif
write_uncompressed_header(cpi, &wb);
saved_wb = wb;
// don't know in advance first part. size
vpx_wb_write_literal(&wb, 0, 16 + have_tiles * 2);
uncompressed_hdr_size = vpx_wb_bytes_written(&wb);
data += uncompressed_hdr_size;
@@ -1232,10 +1458,32 @@ void vp10_pack_bitstream(VP10_COMP *cpi, uint8_t *dest, size_t *size) {
first_part_size = write_compressed_header(cpi, data);
data += first_part_size;
data_sz = encode_tiles(cpi, data, &max_tile);
#if CONFIG_MISC_FIXES
if (max_tile > 0) {
int mag;
unsigned int mask;
// Choose the (tile size) magnitude
for (mag = 0, mask = 0xff; mag < 4; mag++) {
if (max_tile <= mask)
break;
mask <<= 8;
mask |= 0xff;
}
assert(n_log2_tiles > 0);
vpx_wb_write_literal(&saved_wb, mag, 2);
if (mag < 3)
data_sz = remux_tiles(data, (int)data_sz, 1 << n_log2_tiles, mag);
} else {
assert(n_log2_tiles == 0);
}
#endif
data += data_sz;
// TODO(jbb): Figure out what to do if first_part_size > 16 bits.
vpx_wb_write_literal(&saved_wb, (int)first_part_size, 16);
*size = data - dest;
}
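A note on the tile packing above: encode_tiles() initially records each tile's length in 4 bytes, the mag/mask loop then picks the smallest byte width that still holds the largest tile, and remux_tiles() rewrites the size fields with mag + 1 bytes (only when mag < 3, since 4-byte fields are already in place); the 2-bit mag written back into saved_wb tells the decoder how wide each tile-size field is. Below is a minimal standalone sketch of the magnitude choice; the helper name and sample values are illustrative only, not part of the library.

#include <assert.h>
#include <stdio.h>

/* Illustrative helper mirroring the mag/mask loop in vp10_pack_bitstream:
 * pick how many bytes (mag + 1) are needed to store the largest tile size. */
static int choose_tile_size_mag(unsigned int max_tile) {
  int mag;
  unsigned int mask = 0xff;
  for (mag = 0; mag < 4; mag++) {
    if (max_tile <= mask) break;
    mask = (mask << 8) | 0xff;
  }
  return mag;  /* 0..3, i.e. tile sizes stored in 1..4 bytes */
}

int main(void) {
  assert(choose_tile_size_mag(200) == 0);    /* fits in one byte */
  assert(choose_tile_size_mag(70000) == 2);  /* needs three bytes */
  printf("mag for a 70000-byte largest tile: %d\n", choose_tile_size_mag(70000));
  return 0;
}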


@@ -18,7 +18,7 @@ extern "C" {
#include "vp10/encoder/encoder.h" #include "vp10/encoder/encoder.h"
void vp10_pack_bitstream(VP10_COMP *cpi, uint8_t *dest, size_t *size); void vp10_pack_bitstream(VP10_COMP *const cpi, uint8_t *dest, size_t *size);
static INLINE int vp10_preserve_existing_gf(VP10_COMP *cpi) { static INLINE int vp10_preserve_existing_gf(VP10_COMP *cpi) {
return !cpi->multi_arf_allowed && cpi->refresh_golden_frame && return !cpi->multi_arf_allowed && cpi->refresh_golden_frame &&


@@ -58,7 +58,6 @@ struct macroblock {
MACROBLOCKD e_mbd;
MB_MODE_INFO_EXT *mbmi_ext;
MB_MODE_INFO_EXT *mbmi_ext_base;
int skip_block;
int select_tx_size;
int skip_recode;
@@ -71,6 +70,8 @@ struct macroblock {
int rddiv;
int rdmult;
int mb_energy;
int * m_search_count_ptr;
int * ex_search_count_ptr;
// These are set to their default values at the beginning, and then adjusted
// further in the encoding process.
@@ -115,7 +116,6 @@ struct macroblock {
// indicate if it is in the rd search loop or encoding process
int use_lp32x32fdct;
int skip_encode;
// use fast quantization process
int quant_fp;
@@ -134,13 +134,6 @@ struct macroblock {
// Strong color activity detection. Used in RTC coding mode to enhance
// the visual quality at the boundary of moving color objects.
uint8_t color_sensitivity[2];
void (*fwd_txm4x4)(const int16_t *input, tran_low_t *output, int stride);
void (*itxm_add)(const tran_low_t *input, uint8_t *dest, int stride, int eob);
#if CONFIG_VP9_HIGHBITDEPTH
void (*highbd_itxm_add)(const tran_low_t *input, uint8_t *dest, int stride,
int eob, int bd);
#endif
};
#ifdef __cplusplus


@@ -30,13 +30,13 @@ static void alloc_mode_context(VP10_COMMON *cm, int num_4x4_blk,
for (i = 0; i < MAX_MB_PLANE; ++i) {
for (k = 0; k < 3; ++k) {
CHECK_MEM_ERROR(cm, ctx->coeff[i][k],
vpx_memalign(32, num_pix * sizeof(*ctx->coeff[i][k])));
CHECK_MEM_ERROR(cm, ctx->qcoeff[i][k],
vpx_memalign(32, num_pix * sizeof(*ctx->qcoeff[i][k])));
CHECK_MEM_ERROR(cm, ctx->dqcoeff[i][k],
vpx_memalign(32, num_pix * sizeof(*ctx->dqcoeff[i][k])));
CHECK_MEM_ERROR(cm, ctx->eobs[i][k],
vpx_memalign(32, num_blk * sizeof(*ctx->eobs[i][k])));
ctx->coeff_pbuf[i][k] = ctx->coeff[i][k];
ctx->qcoeff_pbuf[i][k] = ctx->qcoeff[i][k];
ctx->dqcoeff_pbuf[i][k] = ctx->dqcoeff[i][k];
@@ -61,6 +61,11 @@ static void free_mode_context(PICK_MODE_CONTEXT *ctx) {
ctx->eobs[i][k] = 0;
}
}
for (i = 0; i < 2; ++i) {
vpx_free(ctx->color_index_map[i]);
ctx->color_index_map[i] = 0;
}
}
static void alloc_tree_contexts(VP10_COMMON *cm, PC_TREE *tree,


@@ -14,6 +14,10 @@
#include "vp10/common/blockd.h" #include "vp10/common/blockd.h"
#include "vp10/encoder/block.h" #include "vp10/encoder/block.h"
#ifdef __cplusplus
extern "C" {
#endif
struct VP10_COMP;
struct VP10Common;
struct ThreadData;
@@ -23,6 +27,7 @@ typedef struct {
MODE_INFO mic;
MB_MODE_INFO_EXT mbmi_ext;
uint8_t *zcoeff_blk;
uint8_t *color_index_map[2];
tran_low_t *coeff[MAX_MB_PLANE][3];
tran_low_t *qcoeff[MAX_MB_PLANE][3];
tran_low_t *dqcoeff[MAX_MB_PLANE][3];
@@ -84,4 +89,8 @@ typedef struct PC_TREE {
void vp10_setup_pc_tree(struct VP10Common *cm, struct ThreadData *td);
void vp10_free_pc_tree(struct ThreadData *td);
#ifdef __cplusplus
} // extern "C"
#endif
#endif /* VP10_ENCODER_CONTEXT_TREE_H_ */


@@ -20,218 +20,711 @@
#include "vpx_dsp/fwd_txfm.h" #include "vpx_dsp/fwd_txfm.h"
#include "vpx_ports/mem.h" #include "vpx_ports/mem.h"
static INLINE void range_check(const tran_low_t *input, const int size,
const int bit) {
#if 0 // CONFIG_COEFFICIENT_RANGE_CHECKING
// TODO(angiebird): the range_check is not used because the bit range
// in fdct# is not correct. Since we are going to merge in a new version
// of fdct# from nextgenv2, we won't fix the incorrect bit range now.
int i;
for (i = 0; i < size; ++i) {
assert(abs(input[i]) < (1 << bit));
}
#else
(void)input;
(void)size;
(void)bit;
#endif
}
static void fdct4(const tran_low_t *input, tran_low_t *output) {
tran_high_t temp;
tran_low_t step[4];
// stage 0
range_check(input, 4, 14);
// stage 1
output[0] = input[0] + input[3];
output[1] = input[1] + input[2];
output[2] = input[1] - input[2];
output[3] = input[0] - input[3];
range_check(output, 4, 15);
// stage 2
temp = output[0] * cospi_16_64 + output[1] * cospi_16_64;
step[0] = (tran_low_t)fdct_round_shift(temp);
temp = output[1] * -cospi_16_64 + output[0] * cospi_16_64;
step[1] = (tran_low_t)fdct_round_shift(temp);
temp = output[2] * cospi_24_64 + output[3] * cospi_8_64;
step[2] = (tran_low_t)fdct_round_shift(temp);
temp = output[3] * cospi_24_64 + output[2] * -cospi_8_64;
step[3] = (tran_low_t)fdct_round_shift(temp);
range_check(step, 4, 16);
// stage 3
output[0] = step[0];
output[1] = step[2];
output[2] = step[1];
output[3] = step[3];
range_check(output, 4, 16);
}
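For context on the arithmetic in these rewritten transforms: the cospi_*_64 constants and fdct_round_shift() follow the usual libvpx fixed-point convention, cosine values scaled by 2^14 (DCT_CONST_BITS) and rounded back out after each multiply stage. A small self-contained sketch of that convention follows; the helper names are stand-ins, and only the rounding rule mirrors the code above.

#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define DCT_CONST_BITS 14  /* assumed convention: cos() scaled by 2^14 */

/* Stand-in for the cospi_n_64 tables: round(cos(n*pi/64) * 2^14). */
static int cospi_n_64(int n) {
  return (int)lrint(cos(n * M_PI / 64.0) * (1 << DCT_CONST_BITS));
}

/* Stand-in for fdct_round_shift(): add half, then shift the scale back out. */
static long long round_shift(long long x) {
  return (x + (1LL << (DCT_CONST_BITS - 1))) >> DCT_CONST_BITS;
}

int main(void) {
  /* cospi_16_64 should come out as 11585, i.e. cos(pi/4) in Q14. */
  printf("cospi_16_64 = %d\n", cospi_n_64(16));
  /* One stage-2 butterfly from fdct4 above: (a + b) * cospi_16_64, rescaled. */
  long long a = 100, b = 23;
  printf("butterfly output = %lld\n", round_shift((a + b) * cospi_n_64(16)));
  return 0;
}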
static void fdct8(const tran_low_t *input, tran_low_t *output) {
tran_high_t temp;
tran_low_t step[8];
// stage 0
range_check(input, 8, 13);
// stage 1
output[0] = input[0] + input[7];
output[1] = input[1] + input[6];
output[2] = input[2] + input[5];
output[3] = input[3] + input[4];
output[4] = input[3] - input[4];
output[5] = input[2] - input[5];
output[6] = input[1] - input[6];
output[7] = input[0] - input[7];
range_check(output, 8, 14);
// stage 2
step[0] = output[0] + output[3];
step[1] = output[1] + output[2];
step[2] = output[1] - output[2];
step[3] = output[0] - output[3];
step[4] = output[4];
temp = output[5] * -cospi_16_64 + output[6] * cospi_16_64;
step[5] = (tran_low_t)fdct_round_shift(temp);
temp = output[6] * cospi_16_64 + output[5] * cospi_16_64;
step[6] = (tran_low_t)fdct_round_shift(temp);
step[7] = output[7];
range_check(step, 8, 15);
// stage 3
temp = step[0] * cospi_16_64 + step[1] * cospi_16_64;
output[0] = (tran_low_t)fdct_round_shift(temp);
temp = step[1] * -cospi_16_64 + step[0] * cospi_16_64;
output[1] = (tran_low_t)fdct_round_shift(temp);
temp = step[2] * cospi_24_64 + step[3] * cospi_8_64;
output[2] = (tran_low_t)fdct_round_shift(temp);
temp = step[3] * cospi_24_64 + step[2] * -cospi_8_64;
output[3] = (tran_low_t)fdct_round_shift(temp);
output[4] = step[4] + step[5];
output[5] = step[4] - step[5];
output[6] = step[7] - step[6];
output[7] = step[7] + step[6];
range_check(output, 8, 16);
// stage 4
step[0] = output[0];
step[1] = output[1];
step[2] = output[2];
step[3] = output[3];
temp = output[4] * cospi_28_64 + output[7] * cospi_4_64;
step[4] = (tran_low_t)fdct_round_shift(temp);
temp = output[5] * cospi_12_64 + output[6] * cospi_20_64;
step[5] = (tran_low_t)fdct_round_shift(temp);
temp = output[6] * cospi_12_64 + output[5] * -cospi_20_64;
step[6] = (tran_low_t)fdct_round_shift(temp);
temp = output[7] * cospi_28_64 + output[4] * -cospi_4_64;
step[7] = (tran_low_t)fdct_round_shift(temp);
range_check(step, 8, 16);
// stage 5
output[0] = step[0];
output[1] = step[4];
output[2] = step[2];
output[3] = step[6];
output[4] = step[1];
output[5] = step[5];
output[6] = step[3];
output[7] = step[7];
range_check(output, 8, 16);
} }
static void fdct16(const tran_low_t in[16], tran_low_t out[16]) { static void fdct16(const tran_low_t *input, tran_low_t *output) {
tran_high_t step1[8]; // canbe16 tran_high_t temp;
tran_high_t step2[8]; // canbe16 tran_low_t step[16];
tran_high_t step3[8]; // canbe16
tran_high_t input[8]; // canbe16
tran_high_t temp1, temp2; // needs32
// step 1 // stage 0
input[0] = in[0] + in[15]; range_check(input, 16, 13);
input[1] = in[1] + in[14];
input[2] = in[2] + in[13];
input[3] = in[3] + in[12];
input[4] = in[4] + in[11];
input[5] = in[5] + in[10];
input[6] = in[6] + in[ 9];
input[7] = in[7] + in[ 8];
step1[0] = in[7] - in[ 8]; // stage 1
step1[1] = in[6] - in[ 9]; output[0] = input[0] + input[15];
step1[2] = in[5] - in[10]; output[1] = input[1] + input[14];
step1[3] = in[4] - in[11]; output[2] = input[2] + input[13];
step1[4] = in[3] - in[12]; output[3] = input[3] + input[12];
step1[5] = in[2] - in[13]; output[4] = input[4] + input[11];
step1[6] = in[1] - in[14]; output[5] = input[5] + input[10];
step1[7] = in[0] - in[15]; output[6] = input[6] + input[9];
output[7] = input[7] + input[8];
output[8] = input[7] - input[8];
output[9] = input[6] - input[9];
output[10] = input[5] - input[10];
output[11] = input[4] - input[11];
output[12] = input[3] - input[12];
output[13] = input[2] - input[13];
output[14] = input[1] - input[14];
output[15] = input[0] - input[15];
// fdct8(step, step); range_check(output, 16, 14);
{
tran_high_t s0, s1, s2, s3, s4, s5, s6, s7; // canbe16
tran_high_t t0, t1, t2, t3; // needs32
tran_high_t x0, x1, x2, x3; // canbe16
// stage 1 // stage 2
s0 = input[0] + input[7]; step[0] = output[0] + output[7];
s1 = input[1] + input[6]; step[1] = output[1] + output[6];
s2 = input[2] + input[5]; step[2] = output[2] + output[5];
s3 = input[3] + input[4]; step[3] = output[3] + output[4];
s4 = input[3] - input[4]; step[4] = output[3] - output[4];
s5 = input[2] - input[5]; step[5] = output[2] - output[5];
s6 = input[1] - input[6]; step[6] = output[1] - output[6];
s7 = input[0] - input[7]; step[7] = output[0] - output[7];
step[8] = output[8];
step[9] = output[9];
temp = output[10] * -cospi_16_64 + output[13] * cospi_16_64;
step[10] = (tran_low_t)fdct_round_shift(temp);
temp = output[11] * -cospi_16_64 + output[12] * cospi_16_64;
step[11] = (tran_low_t)fdct_round_shift(temp);
temp = output[12] * cospi_16_64 + output[11] * cospi_16_64;
step[12] = (tran_low_t)fdct_round_shift(temp);
temp = output[13] * cospi_16_64 + output[10] * cospi_16_64;
step[13] = (tran_low_t)fdct_round_shift(temp);
step[14] = output[14];
step[15] = output[15];
// fdct4(step, step); range_check(step, 16, 15);
x0 = s0 + s3;
x1 = s1 + s2;
x2 = s1 - s2;
x3 = s0 - s3;
t0 = (x0 + x1) * cospi_16_64;
t1 = (x0 - x1) * cospi_16_64;
t2 = x3 * cospi_8_64 + x2 * cospi_24_64;
t3 = x3 * cospi_24_64 - x2 * cospi_8_64;
out[0] = (tran_low_t)fdct_round_shift(t0);
out[4] = (tran_low_t)fdct_round_shift(t2);
out[8] = (tran_low_t)fdct_round_shift(t1);
out[12] = (tran_low_t)fdct_round_shift(t3);
// Stage 2 // stage 3
t0 = (s6 - s5) * cospi_16_64; output[0] = step[0] + step[3];
t1 = (s6 + s5) * cospi_16_64; output[1] = step[1] + step[2];
t2 = fdct_round_shift(t0); output[2] = step[1] - step[2];
t3 = fdct_round_shift(t1); output[3] = step[0] - step[3];
output[4] = step[4];
temp = step[5] * -cospi_16_64 + step[6] * cospi_16_64;
output[5] = (tran_low_t)fdct_round_shift(temp);
temp = step[6] * cospi_16_64 + step[5] * cospi_16_64;
output[6] = (tran_low_t)fdct_round_shift(temp);
output[7] = step[7];
output[8] = step[8] + step[11];
output[9] = step[9] + step[10];
output[10] = step[9] - step[10];
output[11] = step[8] - step[11];
output[12] = step[15] - step[12];
output[13] = step[14] - step[13];
output[14] = step[14] + step[13];
output[15] = step[15] + step[12];
// Stage 3 range_check(output, 16, 16);
x0 = s4 + t2;
x1 = s4 - t2;
x2 = s7 - t3;
x3 = s7 + t3;
// Stage 4 // stage 4
t0 = x0 * cospi_28_64 + x3 * cospi_4_64; temp = output[0] * cospi_16_64 + output[1] * cospi_16_64;
t1 = x1 * cospi_12_64 + x2 * cospi_20_64; step[0] = (tran_low_t)fdct_round_shift(temp);
t2 = x2 * cospi_12_64 + x1 * -cospi_20_64; temp = output[1] * -cospi_16_64 + output[0] * cospi_16_64;
t3 = x3 * cospi_28_64 + x0 * -cospi_4_64; step[1] = (tran_low_t)fdct_round_shift(temp);
out[2] = (tran_low_t)fdct_round_shift(t0); temp = output[2] * cospi_24_64 + output[3] * cospi_8_64;
out[6] = (tran_low_t)fdct_round_shift(t2); step[2] = (tran_low_t)fdct_round_shift(temp);
out[10] = (tran_low_t)fdct_round_shift(t1); temp = output[3] * cospi_24_64 + output[2] * -cospi_8_64;
out[14] = (tran_low_t)fdct_round_shift(t3); step[3] = (tran_low_t)fdct_round_shift(temp);
} step[4] = output[4] + output[5];
step[5] = output[4] - output[5];
step[6] = output[7] - output[6];
step[7] = output[7] + output[6];
step[8] = output[8];
temp = output[9] * -cospi_8_64 + output[14] * cospi_24_64;
step[9] = (tran_low_t)fdct_round_shift(temp);
temp = output[10] * -cospi_24_64 + output[13] * -cospi_8_64;
step[10] = (tran_low_t)fdct_round_shift(temp);
step[11] = output[11];
step[12] = output[12];
temp = output[13] * cospi_24_64 + output[10] * -cospi_8_64;
step[13] = (tran_low_t)fdct_round_shift(temp);
temp = output[14] * cospi_8_64 + output[9] * cospi_24_64;
step[14] = (tran_low_t)fdct_round_shift(temp);
step[15] = output[15];
// step 2 range_check(step, 16, 16);
temp1 = (step1[5] - step1[2]) * cospi_16_64;
temp2 = (step1[4] - step1[3]) * cospi_16_64;
step2[2] = fdct_round_shift(temp1);
step2[3] = fdct_round_shift(temp2);
temp1 = (step1[4] + step1[3]) * cospi_16_64;
temp2 = (step1[5] + step1[2]) * cospi_16_64;
step2[4] = fdct_round_shift(temp1);
step2[5] = fdct_round_shift(temp2);
// step 3 // stage 5
step3[0] = step1[0] + step2[3]; output[0] = step[0];
step3[1] = step1[1] + step2[2]; output[1] = step[1];
step3[2] = step1[1] - step2[2]; output[2] = step[2];
step3[3] = step1[0] - step2[3]; output[3] = step[3];
step3[4] = step1[7] - step2[4]; temp = step[4] * cospi_28_64 + step[7] * cospi_4_64;
step3[5] = step1[6] - step2[5]; output[4] = (tran_low_t)fdct_round_shift(temp);
step3[6] = step1[6] + step2[5]; temp = step[5] * cospi_12_64 + step[6] * cospi_20_64;
step3[7] = step1[7] + step2[4]; output[5] = (tran_low_t)fdct_round_shift(temp);
temp = step[6] * cospi_12_64 + step[5] * -cospi_20_64;
output[6] = (tran_low_t)fdct_round_shift(temp);
temp = step[7] * cospi_28_64 + step[4] * -cospi_4_64;
output[7] = (tran_low_t)fdct_round_shift(temp);
output[8] = step[8] + step[9];
output[9] = step[8] - step[9];
output[10] = step[11] - step[10];
output[11] = step[11] + step[10];
output[12] = step[12] + step[13];
output[13] = step[12] - step[13];
output[14] = step[15] - step[14];
output[15] = step[15] + step[14];
// step 4 range_check(output, 16, 16);
temp1 = step3[1] * -cospi_8_64 + step3[6] * cospi_24_64;
temp2 = step3[2] * cospi_24_64 + step3[5] * cospi_8_64;
step2[1] = fdct_round_shift(temp1);
step2[2] = fdct_round_shift(temp2);
temp1 = step3[2] * cospi_8_64 - step3[5] * cospi_24_64;
temp2 = step3[1] * cospi_24_64 + step3[6] * cospi_8_64;
step2[5] = fdct_round_shift(temp1);
step2[6] = fdct_round_shift(temp2);
// step 5 // stage 6
step1[0] = step3[0] + step2[1]; step[0] = output[0];
step1[1] = step3[0] - step2[1]; step[1] = output[1];
step1[2] = step3[3] + step2[2]; step[2] = output[2];
step1[3] = step3[3] - step2[2]; step[3] = output[3];
step1[4] = step3[4] - step2[5]; step[4] = output[4];
step1[5] = step3[4] + step2[5]; step[5] = output[5];
step1[6] = step3[7] - step2[6]; step[6] = output[6];
step1[7] = step3[7] + step2[6]; step[7] = output[7];
temp = output[8] * cospi_30_64 + output[15] * cospi_2_64;
step[8] = (tran_low_t)fdct_round_shift(temp);
temp = output[9] * cospi_14_64 + output[14] * cospi_18_64;
step[9] = (tran_low_t)fdct_round_shift(temp);
temp = output[10] * cospi_22_64 + output[13] * cospi_10_64;
step[10] = (tran_low_t)fdct_round_shift(temp);
temp = output[11] * cospi_6_64 + output[12] * cospi_26_64;
step[11] = (tran_low_t)fdct_round_shift(temp);
temp = output[12] * cospi_6_64 + output[11] * -cospi_26_64;
step[12] = (tran_low_t)fdct_round_shift(temp);
temp = output[13] * cospi_22_64 + output[10] * -cospi_10_64;
step[13] = (tran_low_t)fdct_round_shift(temp);
temp = output[14] * cospi_14_64 + output[9] * -cospi_18_64;
step[14] = (tran_low_t)fdct_round_shift(temp);
temp = output[15] * cospi_30_64 + output[8] * -cospi_2_64;
step[15] = (tran_low_t)fdct_round_shift(temp);
// step 6 range_check(step, 16, 16);
temp1 = step1[0] * cospi_30_64 + step1[7] * cospi_2_64;
temp2 = step1[1] * cospi_14_64 + step1[6] * cospi_18_64;
out[1] = (tran_low_t)fdct_round_shift(temp1);
out[9] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step1[2] * cospi_22_64 + step1[5] * cospi_10_64; // stage 7
temp2 = step1[3] * cospi_6_64 + step1[4] * cospi_26_64; output[0] = step[0];
out[5] = (tran_low_t)fdct_round_shift(temp1); output[1] = step[8];
out[13] = (tran_low_t)fdct_round_shift(temp2); output[2] = step[4];
output[3] = step[12];
output[4] = step[2];
output[5] = step[10];
output[6] = step[6];
output[7] = step[14];
output[8] = step[1];
output[9] = step[9];
output[10] = step[5];
output[11] = step[13];
output[12] = step[3];
output[13] = step[11];
output[14] = step[7];
output[15] = step[15];
temp1 = step1[3] * -cospi_26_64 + step1[4] * cospi_6_64; range_check(output, 16, 16);
temp2 = step1[2] * -cospi_10_64 + step1[5] * cospi_22_64;
out[3] = (tran_low_t)fdct_round_shift(temp1);
out[11] = (tran_low_t)fdct_round_shift(temp2);
temp1 = step1[1] * -cospi_18_64 + step1[6] * cospi_14_64;
temp2 = step1[0] * -cospi_2_64 + step1[7] * cospi_30_64;
out[7] = (tran_low_t)fdct_round_shift(temp1);
out[15] = (tran_low_t)fdct_round_shift(temp2);
} }
/* TODO(angiebird): Unify this with vp10_fwd_txfm.c: vp10_fdct32
static void fdct32(const tran_low_t *input, tran_low_t *output) {
tran_high_t temp;
tran_low_t step[32];
// stage 0
range_check(input, 32, 14);
// stage 1
output[0] = input[0] + input[31];
output[1] = input[1] + input[30];
output[2] = input[2] + input[29];
output[3] = input[3] + input[28];
output[4] = input[4] + input[27];
output[5] = input[5] + input[26];
output[6] = input[6] + input[25];
output[7] = input[7] + input[24];
output[8] = input[8] + input[23];
output[9] = input[9] + input[22];
output[10] = input[10] + input[21];
output[11] = input[11] + input[20];
output[12] = input[12] + input[19];
output[13] = input[13] + input[18];
output[14] = input[14] + input[17];
output[15] = input[15] + input[16];
output[16] = input[15] - input[16];
output[17] = input[14] - input[17];
output[18] = input[13] - input[18];
output[19] = input[12] - input[19];
output[20] = input[11] - input[20];
output[21] = input[10] - input[21];
output[22] = input[9] - input[22];
output[23] = input[8] - input[23];
output[24] = input[7] - input[24];
output[25] = input[6] - input[25];
output[26] = input[5] - input[26];
output[27] = input[4] - input[27];
output[28] = input[3] - input[28];
output[29] = input[2] - input[29];
output[30] = input[1] - input[30];
output[31] = input[0] - input[31];
range_check(output, 32, 15);
// stage 2
step[0] = output[0] + output[15];
step[1] = output[1] + output[14];
step[2] = output[2] + output[13];
step[3] = output[3] + output[12];
step[4] = output[4] + output[11];
step[5] = output[5] + output[10];
step[6] = output[6] + output[9];
step[7] = output[7] + output[8];
step[8] = output[7] - output[8];
step[9] = output[6] - output[9];
step[10] = output[5] - output[10];
step[11] = output[4] - output[11];
step[12] = output[3] - output[12];
step[13] = output[2] - output[13];
step[14] = output[1] - output[14];
step[15] = output[0] - output[15];
step[16] = output[16];
step[17] = output[17];
step[18] = output[18];
step[19] = output[19];
temp = output[20] * -cospi_16_64 + output[27] * cospi_16_64;
step[20] = (tran_low_t)fdct_round_shift(temp);
temp = output[21] * -cospi_16_64 + output[26] * cospi_16_64;
step[21] = (tran_low_t)fdct_round_shift(temp);
temp = output[22] * -cospi_16_64 + output[25] * cospi_16_64;
step[22] = (tran_low_t)fdct_round_shift(temp);
temp = output[23] * -cospi_16_64 + output[24] * cospi_16_64;
step[23] = (tran_low_t)fdct_round_shift(temp);
temp = output[24] * cospi_16_64 + output[23] * cospi_16_64;
step[24] = (tran_low_t)fdct_round_shift(temp);
temp = output[25] * cospi_16_64 + output[22] * cospi_16_64;
step[25] = (tran_low_t)fdct_round_shift(temp);
temp = output[26] * cospi_16_64 + output[21] * cospi_16_64;
step[26] = (tran_low_t)fdct_round_shift(temp);
temp = output[27] * cospi_16_64 + output[20] * cospi_16_64;
step[27] = (tran_low_t)fdct_round_shift(temp);
step[28] = output[28];
step[29] = output[29];
step[30] = output[30];
step[31] = output[31];
range_check(step, 32, 16);
// stage 3
output[0] = step[0] + step[7];
output[1] = step[1] + step[6];
output[2] = step[2] + step[5];
output[3] = step[3] + step[4];
output[4] = step[3] - step[4];
output[5] = step[2] - step[5];
output[6] = step[1] - step[6];
output[7] = step[0] - step[7];
output[8] = step[8];
output[9] = step[9];
temp = step[10] * -cospi_16_64 + step[13] * cospi_16_64;
output[10] = (tran_low_t)fdct_round_shift(temp);
temp = step[11] * -cospi_16_64 + step[12] * cospi_16_64;
output[11] = (tran_low_t)fdct_round_shift(temp);
temp = step[12] * cospi_16_64 + step[11] * cospi_16_64;
output[12] = (tran_low_t)fdct_round_shift(temp);
temp = step[13] * cospi_16_64 + step[10] * cospi_16_64;
output[13] = (tran_low_t)fdct_round_shift(temp);
output[14] = step[14];
output[15] = step[15];
output[16] = step[16] + step[23];
output[17] = step[17] + step[22];
output[18] = step[18] + step[21];
output[19] = step[19] + step[20];
output[20] = step[19] - step[20];
output[21] = step[18] - step[21];
output[22] = step[17] - step[22];
output[23] = step[16] - step[23];
output[24] = step[31] - step[24];
output[25] = step[30] - step[25];
output[26] = step[29] - step[26];
output[27] = step[28] - step[27];
output[28] = step[28] + step[27];
output[29] = step[29] + step[26];
output[30] = step[30] + step[25];
output[31] = step[31] + step[24];
range_check(output, 32, 17);
// stage 4
step[0] = output[0] + output[3];
step[1] = output[1] + output[2];
step[2] = output[1] - output[2];
step[3] = output[0] - output[3];
step[4] = output[4];
temp = output[5] * -cospi_16_64 + output[6] * cospi_16_64;
step[5] = (tran_low_t)fdct_round_shift(temp);
temp = output[6] * cospi_16_64 + output[5] * cospi_16_64;
step[6] = (tran_low_t)fdct_round_shift(temp);
step[7] = output[7];
step[8] = output[8] + output[11];
step[9] = output[9] + output[10];
step[10] = output[9] - output[10];
step[11] = output[8] - output[11];
step[12] = output[15] - output[12];
step[13] = output[14] - output[13];
step[14] = output[14] + output[13];
step[15] = output[15] + output[12];
step[16] = output[16];
step[17] = output[17];
temp = output[18] * -cospi_8_64 + output[29] * cospi_24_64;
step[18] = (tran_low_t)fdct_round_shift(temp);
temp = output[19] * -cospi_8_64 + output[28] * cospi_24_64;
step[19] = (tran_low_t)fdct_round_shift(temp);
temp = output[20] * -cospi_24_64 + output[27] * -cospi_8_64;
step[20] = (tran_low_t)fdct_round_shift(temp);
temp = output[21] * -cospi_24_64 + output[26] * -cospi_8_64;
step[21] = (tran_low_t)fdct_round_shift(temp);
step[22] = output[22];
step[23] = output[23];
step[24] = output[24];
step[25] = output[25];
temp = output[26] * cospi_24_64 + output[21] * -cospi_8_64;
step[26] = (tran_low_t)fdct_round_shift(temp);
temp = output[27] * cospi_24_64 + output[20] * -cospi_8_64;
step[27] = (tran_low_t)fdct_round_shift(temp);
temp = output[28] * cospi_8_64 + output[19] * cospi_24_64;
step[28] = (tran_low_t)fdct_round_shift(temp);
temp = output[29] * cospi_8_64 + output[18] * cospi_24_64;
step[29] = (tran_low_t)fdct_round_shift(temp);
step[30] = output[30];
step[31] = output[31];
range_check(step, 32, 18);
// stage 5
temp = step[0] * cospi_16_64 + step[1] * cospi_16_64;
output[0] = (tran_low_t)fdct_round_shift(temp);
temp = step[1] * -cospi_16_64 + step[0] * cospi_16_64;
output[1] = (tran_low_t)fdct_round_shift(temp);
temp = step[2] * cospi_24_64 + step[3] * cospi_8_64;
output[2] = (tran_low_t)fdct_round_shift(temp);
temp = step[3] * cospi_24_64 + step[2] * -cospi_8_64;
output[3] = (tran_low_t)fdct_round_shift(temp);
output[4] = step[4] + step[5];
output[5] = step[4] - step[5];
output[6] = step[7] - step[6];
output[7] = step[7] + step[6];
output[8] = step[8];
temp = step[9] * -cospi_8_64 + step[14] * cospi_24_64;
output[9] = (tran_low_t)fdct_round_shift(temp);
temp = step[10] * -cospi_24_64 + step[13] * -cospi_8_64;
output[10] = (tran_low_t)fdct_round_shift(temp);
output[11] = step[11];
output[12] = step[12];
temp = step[13] * cospi_24_64 + step[10] * -cospi_8_64;
output[13] = (tran_low_t)fdct_round_shift(temp);
temp = step[14] * cospi_8_64 + step[9] * cospi_24_64;
output[14] = (tran_low_t)fdct_round_shift(temp);
output[15] = step[15];
output[16] = step[16] + step[19];
output[17] = step[17] + step[18];
output[18] = step[17] - step[18];
output[19] = step[16] - step[19];
output[20] = step[23] - step[20];
output[21] = step[22] - step[21];
output[22] = step[22] + step[21];
output[23] = step[23] + step[20];
output[24] = step[24] + step[27];
output[25] = step[25] + step[26];
output[26] = step[25] - step[26];
output[27] = step[24] - step[27];
output[28] = step[31] - step[28];
output[29] = step[30] - step[29];
output[30] = step[30] + step[29];
output[31] = step[31] + step[28];
range_check(output, 32, 18);
// stage 6
step[0] = output[0];
step[1] = output[1];
step[2] = output[2];
step[3] = output[3];
temp = output[4] * cospi_28_64 + output[7] * cospi_4_64;
step[4] = (tran_low_t)fdct_round_shift(temp);
temp = output[5] * cospi_12_64 + output[6] * cospi_20_64;
step[5] = (tran_low_t)fdct_round_shift(temp);
temp = output[6] * cospi_12_64 + output[5] * -cospi_20_64;
step[6] = (tran_low_t)fdct_round_shift(temp);
temp = output[7] * cospi_28_64 + output[4] * -cospi_4_64;
step[7] = (tran_low_t)fdct_round_shift(temp);
step[8] = output[8] + output[9];
step[9] = output[8] - output[9];
step[10] = output[11] - output[10];
step[11] = output[11] + output[10];
step[12] = output[12] + output[13];
step[13] = output[12] - output[13];
step[14] = output[15] - output[14];
step[15] = output[15] + output[14];
step[16] = output[16];
temp = output[17] * -cospi_4_64 + output[30] * cospi_28_64;
step[17] = (tran_low_t)fdct_round_shift(temp);
temp = output[18] * -cospi_28_64 + output[29] * -cospi_4_64;
step[18] = (tran_low_t)fdct_round_shift(temp);
step[19] = output[19];
step[20] = output[20];
temp = output[21] * -cospi_20_64 + output[26] * cospi_12_64;
step[21] = (tran_low_t)fdct_round_shift(temp);
temp = output[22] * -cospi_12_64 + output[25] * -cospi_20_64;
step[22] = (tran_low_t)fdct_round_shift(temp);
step[23] = output[23];
step[24] = output[24];
temp = output[25] * cospi_12_64 + output[22] * -cospi_20_64;
step[25] = (tran_low_t)fdct_round_shift(temp);
temp = output[26] * cospi_20_64 + output[21] * cospi_12_64;
step[26] = (tran_low_t)fdct_round_shift(temp);
step[27] = output[27];
step[28] = output[28];
temp = output[29] * cospi_28_64 + output[18] * -cospi_4_64;
step[29] = (tran_low_t)fdct_round_shift(temp);
temp = output[30] * cospi_4_64 + output[17] * cospi_28_64;
step[30] = (tran_low_t)fdct_round_shift(temp);
step[31] = output[31];
range_check(step, 32, 18);
// stage 7
output[0] = step[0];
output[1] = step[1];
output[2] = step[2];
output[3] = step[3];
output[4] = step[4];
output[5] = step[5];
output[6] = step[6];
output[7] = step[7];
temp = step[8] * cospi_30_64 + step[15] * cospi_2_64;
output[8] = (tran_low_t)fdct_round_shift(temp);
temp = step[9] * cospi_14_64 + step[14] * cospi_18_64;
output[9] = (tran_low_t)fdct_round_shift(temp);
temp = step[10] * cospi_22_64 + step[13] * cospi_10_64;
output[10] = (tran_low_t)fdct_round_shift(temp);
temp = step[11] * cospi_6_64 + step[12] * cospi_26_64;
output[11] = (tran_low_t)fdct_round_shift(temp);
temp = step[12] * cospi_6_64 + step[11] * -cospi_26_64;
output[12] = (tran_low_t)fdct_round_shift(temp);
temp = step[13] * cospi_22_64 + step[10] * -cospi_10_64;
output[13] = (tran_low_t)fdct_round_shift(temp);
temp = step[14] * cospi_14_64 + step[9] * -cospi_18_64;
output[14] = (tran_low_t)fdct_round_shift(temp);
temp = step[15] * cospi_30_64 + step[8] * -cospi_2_64;
output[15] = (tran_low_t)fdct_round_shift(temp);
output[16] = step[16] + step[17];
output[17] = step[16] - step[17];
output[18] = step[19] - step[18];
output[19] = step[19] + step[18];
output[20] = step[20] + step[21];
output[21] = step[20] - step[21];
output[22] = step[23] - step[22];
output[23] = step[23] + step[22];
output[24] = step[24] + step[25];
output[25] = step[24] - step[25];
output[26] = step[27] - step[26];
output[27] = step[27] + step[26];
output[28] = step[28] + step[29];
output[29] = step[28] - step[29];
output[30] = step[31] - step[30];
output[31] = step[31] + step[30];
range_check(output, 32, 18);
// stage 8
step[0] = output[0];
step[1] = output[1];
step[2] = output[2];
step[3] = output[3];
step[4] = output[4];
step[5] = output[5];
step[6] = output[6];
step[7] = output[7];
step[8] = output[8];
step[9] = output[9];
step[10] = output[10];
step[11] = output[11];
step[12] = output[12];
step[13] = output[13];
step[14] = output[14];
step[15] = output[15];
temp = output[16] * cospi_31_64 + output[31] * cospi_1_64;
step[16] = (tran_low_t)fdct_round_shift(temp);
temp = output[17] * cospi_15_64 + output[30] * cospi_17_64;
step[17] = (tran_low_t)fdct_round_shift(temp);
temp = output[18] * cospi_23_64 + output[29] * cospi_9_64;
step[18] = (tran_low_t)fdct_round_shift(temp);
temp = output[19] * cospi_7_64 + output[28] * cospi_25_64;
step[19] = (tran_low_t)fdct_round_shift(temp);
temp = output[20] * cospi_27_64 + output[27] * cospi_5_64;
step[20] = (tran_low_t)fdct_round_shift(temp);
temp = output[21] * cospi_11_64 + output[26] * cospi_21_64;
step[21] = (tran_low_t)fdct_round_shift(temp);
temp = output[22] * cospi_19_64 + output[25] * cospi_13_64;
step[22] = (tran_low_t)fdct_round_shift(temp);
temp = output[23] * cospi_3_64 + output[24] * cospi_29_64;
step[23] = (tran_low_t)fdct_round_shift(temp);
temp = output[24] * cospi_3_64 + output[23] * -cospi_29_64;
step[24] = (tran_low_t)fdct_round_shift(temp);
temp = output[25] * cospi_19_64 + output[22] * -cospi_13_64;
step[25] = (tran_low_t)fdct_round_shift(temp);
temp = output[26] * cospi_11_64 + output[21] * -cospi_21_64;
step[26] = (tran_low_t)fdct_round_shift(temp);
temp = output[27] * cospi_27_64 + output[20] * -cospi_5_64;
step[27] = (tran_low_t)fdct_round_shift(temp);
temp = output[28] * cospi_7_64 + output[19] * -cospi_25_64;
step[28] = (tran_low_t)fdct_round_shift(temp);
temp = output[29] * cospi_23_64 + output[18] * -cospi_9_64;
step[29] = (tran_low_t)fdct_round_shift(temp);
temp = output[30] * cospi_15_64 + output[17] * -cospi_17_64;
step[30] = (tran_low_t)fdct_round_shift(temp);
temp = output[31] * cospi_31_64 + output[16] * -cospi_1_64;
step[31] = (tran_low_t)fdct_round_shift(temp);
range_check(step, 32, 18);
// stage 9
output[0] = step[0];
output[1] = step[16];
output[2] = step[8];
output[3] = step[24];
output[4] = step[4];
output[5] = step[20];
output[6] = step[12];
output[7] = step[28];
output[8] = step[2];
output[9] = step[18];
output[10] = step[10];
output[11] = step[26];
output[12] = step[6];
output[13] = step[22];
output[14] = step[14];
output[15] = step[30];
output[16] = step[1];
output[17] = step[17];
output[18] = step[9];
output[19] = step[25];
output[20] = step[5];
output[21] = step[21];
output[22] = step[13];
output[23] = step[29];
output[24] = step[3];
output[25] = step[19];
output[26] = step[11];
output[27] = step[27];
output[28] = step[7];
output[29] = step[23];
output[30] = step[15];
output[31] = step[31];
range_check(output, 32, 18);
}
*/
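The refactor above trades the grouped butterfly formulation for explicit per-stage arrays guarded by range_check(), so the nominal bit growth of each stage (14, then 15, then 16 bits in the 4-point case) stays documented even while the asserts are compiled out. A minimal sketch of that guard pattern follows; the names are stand-ins, and the bound |x| < 2^bit is taken from the disabled assert in range_check().

#include <assert.h>
#include <stdlib.h>

typedef int tran_low_t_demo;  /* stand-in for the library's tran_low_t */

/* Same shape as the range_check() added above: every intermediate value
 * must fit the stage's nominal bit width, i.e. |input[i]| < 2^bit. */
static void range_check_demo(const tran_low_t_demo *input, int size, int bit) {
  int i;
  for (i = 0; i < size; ++i) assert(abs(input[i]) < (1 << bit));
}

int main(void) {
  /* fdct4 above declares 14 bits on input, 15 after stage 1, 16 after stage 2. */
  tran_low_t_demo in[4] = { 8191, -8191, 4000, -4000 };
  range_check_demo(in, 4, 14);  /* passes: all values are within the 14-bit bound */
  return 0;
}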
static void fadst4(const tran_low_t *input, tran_low_t *output) {
tran_high_t x0, x1, x2, x3;
tran_high_t s0, s1, s2, s3, s4, s5, s6, s7;
@@ -607,19 +1100,19 @@ void vp10_fdct8x8_quant_c(const int16_t *input, int stride,
output[4 * 8] = (tran_low_t)fdct_round_shift(t1);
output[6 * 8] = (tran_low_t)fdct_round_shift(t3);
// stage 2
t0 = (s6 - s5) * cospi_16_64;
t1 = (s6 + s5) * cospi_16_64;
t2 = fdct_round_shift(t0);
t3 = fdct_round_shift(t1);
// stage 3
x0 = s4 + t2;
x1 = s4 - t2;
x2 = s7 - t3;
x3 = s7 + t3;
// stage 4
t0 = x0 * cospi_28_64 + x3 * cospi_4_64;
t1 = x1 * cospi_12_64 + x2 * cospi_20_64;
t2 = x2 * cospi_12_64 + x1 * -cospi_20_64;

Some files were not shown because too many files have changed in this diff.