This patch adds a buffer-based rate control for temporal layers,
under CBR mode.
Added vpx_temporal_scalable_patters.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).
Updated datarate unittest with tests for temporal layer rate-targeting.
Change-Id: I8900a854288b9354d9c697cfeb0243a9fd6790b1
the public typedef already includes a const, quiets
'same type qualifier used more than once' warnings
Change-Id: Ib118b3b116fba59d4c6ead84d85b26e5d3ed363d
When showing a previously decoded frame, i.e. when
show_existing_frame=1, the update of the
last_show_frame flag must be disabled.
This is to ensure that the last_show_frame flag
reflects the state of the flag for the immediately
previously decoded frame rather then the value that
was forced to ensure that a previously decoded frame
would be displayed.
This patch also adds a test vector to verify that the
display_existing_frame flag works as expected. Code
for generating the test vector can be found in this
patch:
https://gerrit.chromium.org/gerrit/#/c/68581/
(Bug originally reported by Alexander Voronov
<ru.xalba@gmail.com>).
Change-Id: I731d288fba02088959f7fcc87707137fffc6acf5
This patch only works if the video is a width and height that are both
a multiple of 32.. It sets every partition to 16x16, and does INTRADC
only on the first frame and ZEROMV on every other frame. It always does
does the largest possible transform, and loop filter level is set to 4.
Was ~20% faster than speed -5 of vp8
Now 20% slower but adds motion search ( every block ), nearest, near
and zeromv
The SVC test was changed because - while this realtime mode produces
bad quality albeit quickly, it isn't obeying all the rules it should
about which frames are available.
Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7
Adds an arbitrary-size resize library for use in scaling of input
frames in a non-normative manner in the vp9 encoder. The method
used is as follows:
Downsampling - Uses a 8 tap filter for factor of 2 decimation upto
a size just higher than the desired size. Then interpolates pixels
at a precision of 1/32 pel using a set of 8-tap filters.
Upsampling - Interpolates pixels at a precision of 1/32 pel using
a set of 8-tap filters.
There is no assembly optimization yet.
Change-Id: Ib5b81e174fc139da322bb97c8214d52289d60d8a
From frame 2, the lpf deltas are all cleared for for even frames, and
a set of values are set and used for odd frames. The intention is to
exercise decoding code around lpf delta/update decoding.
Change-Id: Ic9ff1bc2c2a023f4805852f8573398f2ec2249d7
The added vector was encoded with aq mode on, with the intent to
exercise the decode code around segment feature.
Change-Id: Iedcb7261e87d3e11b25ecf031d3a69385271148e
Corrected a typo that set rc_2pass_vbr_minsection_pct to
two different values on consecutive lines. Second line
should have set rc_2pass_vbr_maxsection_pct.
Change-Id: Ie07ac67cd5455afe556bef34da8127304db9c97c
Adds a hook that derived test classes can implement to be notified
before every call to decode a frame.
Change-Id: Iefa836459cf3e5d7df9ee27f8198daf82b1be088
Modifications to the spatial scalable encoder to match
changes made to the scaling code in the decoder.
In particular, the use of a dummy first frame was removed
now that the decoder is able to handle a smaller first
frame.
SvcTest.FirstFrameHasLayers unit test re-enabled.
Change-Id: Ic2e91fbe4eadf95895569947670d36d68abaf458
Added the test vector provided by Attila, which caught the bug in
Issue 661 "Decoder produces mismatched outputs with ssse3 enabled
and disabled"
vp90-hantro-stream-001.ivf
size: 320x180; 20 frames
Change-Id: Ic0d2b57ac7596ecb938dd55abc8c706fc2dd6d8f
* Change from thumb mode to arm mode improves test time significantly
* Direct inclusion of test.mk allows for unit test configuration via
configure script
Change-Id: Id58d3ba8289374528756a672459d8334afe20e2a
This commit enables the unit tests for 4x4 DCT and ADST transforms.
It covers tests of round-trip error check, coefficient match check,
coefficient overflow check, and inverse accuracy check.
Change-Id: Ibfea928ee48f0ebc088b7fdb0bf2d89a14161299
The switch to the rate-correction damping factor
in https://gerrit.chromium.org/gerrit/#/c/67536/ was not conditioned on CBR mode.
Change-Id: I2326704e8ac030a4f7b592dd3fedb94c7dd0644d
These changes are to support automated regressions of vpx on android
new file: test/android/Android.mk
new file: test/android/README
new file: test/android/get_files.py
Change-Id: I52c8e9daf3676a3561badbe710ec3a16fed72abd
This to make sure that prediction residue always get coded in lossless
mode.
This commit also fixed lossless unit test
Change-Id: I537726ee55328d4e4cf0a0196393a67e12bfcde1
SVC multiple layer per frame encoding is invoked with vpx_svc_init and
vpx_svc_encode. These interfaces are designed to be invoked from ffmpeg.
Additional improvements:
- make dummy frame handling a bit more explicit
- fixed bug with single layer encodes
- track individual frame sizes and psnrs instead of averages
- parameterized quantizer, 16th scalefactors, more logging,
- enabled single layer encodes to generate baseline
- include new mode for 3 layer I frame with 5 total layers
Change-Id: I46cfa600d102e208c6af8acd6132e0cc25cda8d4
-Don't reduce maxQ for gold/alt in CBR mode.
-Fix to min/maxQ for first/initial key frame.
-Add more speeds to datarate test and reduce the starting bitrate for test.
Change-Id: Id2a333d76dd3f6a51b322ca984588e2a22159c58
For consistency with idct function names. Renames:
vp9_short_fdct4x4 -> vp9_fdct4x4
vp9_short_walsh4x4 -> vp9_fwht4x4
Change-Id: Id15497cc1270acca626447d846f0ce9199770f58
This reverts commit a82001b1cf, reversing
changes made to f6d870f7ae.
This commit breaks windows builds and needs some work to fix those and
some additional comments.
Change-Id: Ic0b0228e36704b127e5e399ce59db26182cfffe7
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: I0ba3c52513a5fdd194f1e7e2901092671398985b
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: Ibc944952a192e6c7b2b6a869ec2894c01da82ed1
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: I2d95fdcbba96aaa0ed24a80870cb38f53487a97d
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: Id623c5113262655fa50f7c9d6cec9a91fcb20bb4
cherry-picked from:
commit 988b70844e03efcfcc075a9bc25d846670494f36
Author: Pascal Massimino <pascal.massimino@gmail.com>
Date: Fri Aug 2 11:15:16 2013 -0700
add WebPWorkerExecute() for convenient bypass
This is mainly for re-using the worker structs without using the
thread.
Change-Id: I8e1be29e53874ef425b15c192fb68036b4c0a359
Original source:
http://git.chromium.org/webm/libwebp.git
100644 blob c0d318aee628fdf9ba4876451a28aa978f1066b8 src/utils/thread.c
100644 blob c2b92c9fe353f8e514f78922f3d237204a9cbc66 src/utils/thread.h
Change-Id: I13fe92b1e94062bb99fdeeb7cb0b4b0575d27793
To ensure fast encoding/decoding on devices without ssse3 support,
SSE2 optimization of sub-pixel filters was done. Test using 1080p
clip showed the decoder speeds were ~70fps with ssse3 filters, ~60fps
with sse2 filters, and ~15fps with c filters.
Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
The idea is to have the following names for each transform size:
vp9_idct4x4_add
vp9_idct4x4_1_add
vp9_idct4x4_10_add
vp9_idct4x4_16_add
vp9_idct8x8_add
vp9_idct8x8_1_add
vp9_idct8x8_10_add
vp9_idct8x8_64_add
etc for 16x16, 32x32
The actual list of renames in this patch:
vp9_idct_add_lossless -> vp9_iwht4x4_add
vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add
vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add
vp9_idct_add -> vp9_idct4x4_add
vp9_short_idct4x4_add -> vp9_idct4x4_16_add
vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add
Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
Modified the resize unit test so that it optionally
writes the encoded bitstream to file. The macro
WRITE_COMPRESSED_STREAM should be set to 1 to enable
output of the test bitstream; it is set to 0 by default.
Change-Id: I7d436b1942f935da97db6d84574a98d379f57fb1
This commit reworked the unit test for 8x8 forward transform. It
allows scalability to cover various implemented versions.
Change-Id: I5594bd3e2307bb5bec764eaffd8860caa260e432
The CpuSpeedTest is extended to cover 2pass good quality with CpuUsed
from 0 to 4. The BordersTest is changed to use CpuUsed 1 for faster
turn around.
Change-Id: I005e89adee7fe63af4b1f2a76a3a13ea826feadf
Added the resize_test unit test to the VP9 set.
Set g_in_frames = 0 to avoid a problem when the total
number of frames being encoded is smaller than
g_in_frames. In this case the test will not have
access to the encoded frames and will not be able to
compare them for testing for encoder/decoder mismatch.
Change-Id: I0d2ff8ef058de7002c5faa894ed6ea794d5f900b
This commit completes the per coefficient accuracy check and memory
overflow check for SSE2 and other implemented versions of 16x16
transform.
Change-Id: If26a3e4f6ba82ccecc13f0b73cb8f7bb6ac14584
This commit refactors the 16x16 transform unit test. It enables the
test on all implemented versions of forward and inverse 16x16 transform
modules.
Change-Id: I0c7d5f3c5fdd5d789a25f73e287aeeaf463b9d69
This commit enabled a full functional test on 32x32 forward/inverse
transform, including round-trip error and memory overflow check. It
tests the prototype functions in C and all other implementations if
applicable.
Change-Id: I9cc50b05abdb4863e7abbcb29209a19b1fe90da7
There is another unit test that has been failing randomly on win32
build. Investigation has shown that the failure was caused by simd
register state is not reset appropriately in the fdct8x8 test. This
commit added ClearSystemState() in the teardown of this test, tests
showed it resolved the random failure issue for win32 build.
Related issue: https://code.google.com/p/webm/issues/detail?id=614
Change-Id: I9381d0c1a6f4b855ccaeef1aca8c417ac8c71ee2
- Intermediate height was not correct i.e. when block size is 4 and
y_step_q4 is 6. In this case intermediate height was
(4*6) >> 4 = 1 and vertical interpolation needs two source pixels
plus 7 extra pixels for taps.
- Also if the current output block is 16x16 and we are using 4x upscaling
we need only 12 rows after horizontal filtering instead of 16.
Patch Set 2: Intermediate_height updated after CL 66723
"Fix bug in convolution functions (filter selection)"
Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
(In response to Issue 604:
https://code.google.com/p/webm/issues/detail?id=604)
There were bugs in the convolution code for two cases:
1. Where the filter table was assumed to be aligned to a
256 byte boundary. The offset of the pixel in the
source buffer was computed incorrectly.
2. Where no such alignment assumption was made. An
incorrect address for the filter table base was used.
To fix both problems, I now assume that the filter table is
256-byte aligned and modify the pixel offset calculation to
match.
A later patch should remove the restriction that the filter
table is aligned to a 256-byte boundary.
There was also a bug in the ConvolveTest unit test
(convolve_test.cc).
(Bug & initial fix suggestion submitted by Tero Rintaluoma
and Sami Pietilä).
Change-Id: I71985551e62846e55e40de9e7e3959d4805baa82
Currently, the best quality mode in VP9 is not very well developed,
and unnecessarily makes the encode too slow. Hence the command line
default is changed to "good" quality. Also, the number of passes
default is changed to 2 passes as well, since 1-pass encoding is
not very efficient in VP9.
Besides, a number of VP9 defaults are set to the currently
recommended settings. With these changes, vpxenc
run with --codec=vp9 --kf-max-dist=9999 --cpu-used=0 should
work about the same as our borg results.
Note when the --cpu-used=0 option is dropped there will be a slight
difference in the output, because of a difference in the cpu-used
value for the first pass. Specifically, the default when unspecified
is to use cpu_used=1 for the first pass and cpu_used=0 for the
second pass. But when specified, both passes will use the cpu-used
value specified.
Note that this also changes the default for VP8 as being "good"
but other options stay unchanged.
Change-Id: Ib23c1a05ae2f36ee076c0e34403efbda518c5066
The mix use of double type and simd code caused invalid values stored
in double variables, further caused unit tests to fail. The failures
were only observed on x86-win32-vs9 build with vs2008.
Change-Id: If0131754a3bf217a5ace303b7963e8f5162c34b5
Enable use_x86inc as a commandline option. Fix Bug with sse2 when
x86inc is disabled. Adds Sad asm protection to x86inc protection
Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6
Support enabling it or disabling it. Moved read out to configure.sh
so that its done once instead of in make and in config.
Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b
Currently the only threaded option for vp9 decode. Enabled when the
decoder config thread count is > 1.
Change-Id: I082959abac9e31aa4a38ed9fd68b94680e57f4df
Chromium does not support 32bit builds for Mac which use x86inc.asm.
Make the files which include it work if 64bit or not PIC enabled
starting with vp9_copy_sse2.asm
Consolidate these targets in vp9_rtcd_defs.sh
Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248
Call the individually optimized horizontal and vertical functions. This
implementation abuses the temp buffer.
This will be replaced with a custom optimized function.
Over 2x speedup.
Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
Adding missed parenthesis around boolean expressions. Bitstream is changed.
Regenerating test vectors.
Change-Id: I4cc00b761e9473f92f180a9fc3a0c607f0aaae56
Super basic conversion from the other implementations. Any changes to
one should be trivial to copy over keep in sync.
Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8
Independent horizontal and vertical implementations.
Requires that blocks be built from 4x4 and [xy]_step_q4 == 16
6-10% improvement. CIF improved the least.
Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda
It does encodings with min and max q set at 0, and check to make sure
output PSNR at MAX_PSNR (100).
Change-Id: Ia2418353cccf6e487204ea4ff874a7e71e55cb3e
In the rare case were 4x4 interior filtering was called for but no
8x8 or larger filtering takes place, the previous code was skipping
the filtering. This patch fixes the issue by including the interior
mask in the overall mask for the filter application loops.
Change-Id: I4a0b65056c64f97478827c2ff41e0914fc7779d0
comment out some unused parameters and adjust the format to avoid:
./test/fdct4x4_test.cc|27| warning C4138: '*/' found outside of comment
Change-Id: I60f93b4c3cd7e8d61f0de80019f3404b40161f03
This commit enables 8x8 DCT and hybrid transform unit tests. It
also tunes the forward hybrid transform rounding opertions for
more precise round-trip performance.
Change-Id: If05c1ce59d75d641b9c6c91527d02d3a6ef498c3
For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.
Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.
Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
Tests resolutions of 8, 10, 16, 18, 32, 34, 64, 66 to exercise the
border conditions, as well as non-SB aligned sizes.
Change-Id: Ie7c2b7860ac3727e23202042f2e86792652912f8
This allows code calling the library can choose an arbitrary
encryption algorithm.
Decoder control parameter VP8_SET_DECRYPT_KEY is renamed to
VP8D_SET_DECRYPTOR, and now takes an small config struct instead
of just a byte array.
Change-Id: I0462b3388d8d45057e4f79a6b6777fe713dc546e
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
These files can stand in until we get proper syntax vectors. They
should provide some additional assurance against inadvertant
bitstream changes.
Change-Id: I12f6c9a5f054e30df40a7ff1f33145abf7e1d59d
No bitstream change.
Removes unused filters and the code for the case of 2 switchable filters;
also changes the 8tap-smooth filter coefficients for integer shifts to be
interpolating to be consistent with the way it is implemented currently.
Change-Id: I96c542fd8c06f4e0df507a645976f58e6de92aae
The partition types of blocks sitting on the frame boundary are
constrained by the block size and the position of each sub-block
relative to the frame. Hence we use truncated probability models
to handle the coding of such information.
100 frames run:
yt 0.138%
Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d
This avoids encoding tokens for blocks that are entirely
in the UMV border. This changes the bitstream.
Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.
Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a
Code was previously using VPX_IMG_FMT_VPXI420, which was intended to be the
"vpx" non-YUV colorspace variant.
Change-Id: Icf8771eeefeb574055ed638a93450c3d0ed5b9f5
Disables the part of the error-resilient test that tests the
quality after dropping undroppable frames. It's not clear how
to set the threshold for this correctly at the moment.
Change-Id: I3ee4a0d475498f44711fdef05749f305e8d08591
This reverts commit b24735c622
since the adjusted threshold doesn't allow the existing tests
to pass. Will disable the failing test in a separate commit.
Change-Id: I26d41cf6175f300bbad493cecdc96e6b0dd6f2fe
input_ is filled with random values just afterward.
the size was wrong anyway as input_ is allocated with memalign so
sizeof(input_)==sizeof(uint8_t*)
Change-Id: I014b832ac60960cd22b6f369dbc9fd648d4055b5
Conflicts:
vp9/common/vp9_findnearmv.c
vp9/common/vp9_rtcd_defs.sh
vp9/decoder/vp9_decodframe.c
vp9/decoder/x86/vp9_dequantize_sse2.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_common.mk
Resolve file name changes in favor of master. Resolve rdopt changes in
favor of experimental, preserving the newer experiments.
Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f
Updates the common convoloution code to support blocks larger than
16x16, and rectangular blocks. This uncovered a bug in the SSSE3
filtering routines due to the order of application of saturation.
This commit fixes that bug, adjusts the unit test to bias its
random values towards the extremes, and adds a test to ensure that
all filters conform to the expected pairwise addition structure.
Change-Id: I81f69668b1de0de5a8ed43f0643845641525c8f0
This is the first CL with vp9_reader changes. All another macro
definitions will be replaced after.
Change-Id: I1c6bd9c9a612ec1663d484d6adb4fb720af54063
the one from gtest in this case: testing::internal::Random.
this will make the tests deterministic between platforms. addresses
issue #568.
Change-Id: I5a8a92f5c33f52cb0a219c1dd3d02335acbbf163
Pick up VP8 encryption, quantization changes, and some fixes to vpxenc
Conflicts:
test/decode_test_driver.cc
test/decode_test_driver.h
test/encode_test_driver.cc
vp8/vp8cx.mk
vpxdec.c
vpxenc.c
Change-Id: I9fbcc64808ead47e22f1f22501965cc7f0c4791c
The superframe index marker byte carries data in the lower 5 bits. Only the
upper 3 should be used as part of the mask to detect it. By masking with
0xf0, the previous code was incorrect for frames over 65k bytes.
Change-Id: I6248889f5af227457f359a56b2348ef6db87a3b4
Improved coding performance made this test fail. Adjust the threshold
so that it passes again. A more stable metric is an open TODO.
Change-Id: I56e18749ced48123ee2488888a3eed631759912b
A 'superframe' is a group of frames that share the same PTS, but have a
defined decoding order. This commit adds the ability to append an index
to such a group of frames, allowing for random access to the constituent
frames. This could be useful for frame-level parallelism or partial
decoding in a multilayer scenario.
Decoding the stream serially without such an index should work as a
fallback, and VP9/TestSuperframeIndexIsOptional verifies that.
Change-Id: Idff83b7560e1a7077d8fb067bfbc45b567e78b1c
Since the 8-tap lowpass filter is non-interpolating, the results are
different between applying it at whole-pel values and not. This
means that 1D-only versions are requried to be implemented, as
opposed to being an optimization of the 2D case. Calling the 2D
filter instead of the horizontal-only filter is not equivalent
in this case. Update the test to pass invalid filters to the
unused stage of the 1D-only calls, to verify they're unused.
Change-Id: Idc1c490f059adadd4cc80dbe770c1ccefe628b0a
Updates the convolve test to verify that all filters match the
reference implementation. This verifies commit 30f866f, which
fixed some problems with the SSE3 version of the filters for
the vp9_sub_pel_filters_8s and vp9_sub_pel_filters_8lp banks
due to overflow and order of operations.
Change-Id: I6b5fe1a41bc20062e2e64633b1355ae58c9c592c
Fixes a bug in vp9_set_internal_size() that prevented returning to
the unscaled state. Updated the ResizeInternalTest to scale both
down and up. Added a check that all frames are within 2.5% of the
quality of the initial keyframe.
Change-Id: I3b7ef17cdac144ed05b9148dce6badfa75cff5c8
This avoids duplicating all the filters twice. Includes fixups to the
convolve routines and associated tests to make this work.
Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
This patch allows coding frames using references of different
resolution, in ZEROMV mode. For compound prediction, either
reference may be scaled.
To test, I use the resize_test and enable WRITE_RECON_BUFFER
in vp9_onyxd_if.c. It's also useful to apply this patch to
test/i420_video_source.h:
--- a/test/i420_video_source.h
+++ b/test/i420_video_source.h
@@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {
virtual void FillFrame() {
// Read a frame from input_file.
+ if (frame_ != 3)
if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
limit_ = frame_;
}
This forces the frame that the resolution changes on to be coded
with no motion, only scaling, and improves the quality of the
result.
Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
Ensure that all inter prediction goes through a common code path
that takes scaling into account. Removes a bunch of duplicate
1st/2nd predictor code. Also introduces a 16x8 mode for 8x8
MVs, similar to the 8x4 trick we were doing before. This has an
unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that
case for now.
Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
Also
1. Removed the test code for fDCT from the iDCT test.
2. changed the criteria of round trip error to be below 1/block, this
is quite strict comparing to smaller transforms when size differences
are accounted for.
Change-Id: Idb46a6380b04c93fc8e2845c75f5a850366b0090
This commit added pre/post scaling for first half of fDCT16x16 to
reduce error, by simulation of 100,000 blocks for random inputs,
the average sse reduced from 2.1/block to 0.0498/block.
also enabled tests for 16x16 fDCT and iDCT
Change-Id: Id2a95f0464c6dd4118797d456237ae90274c0f02
The commit added a final rounding choice for 8x8 forward dct to get
rid of a sign bias at DC position and improve the accuracry in term
of round trip error for 8x8 fDCT/iDCT.
This commit also enabled forward 8x8 dct test.
Change-Id: Ib67f99b0a24d513e230c7812bc04569d472fdc50
1. changed 4x4 test name to Vp9Fdct4x4Test to be consistent
2. remove forward 8x8 dct test code from idct8x8_test.cc
3. temporarily disable other forward dct tests to allow fdct work in
progress
Change-Id: I566aeed9c7c34da5a206190aa7d0e847a4008b36
This is after discussion with the hardware team. Update the unit test
to take these sizes into account. Split out some duplicate code into
a separate file so it can be shared.
Change-Id: I8311d11b0191d8bb37e8eb4ac962beb217e1bff5
* changes:
Initial support for resolution changes on P-frames
Avoid allocating memory when resizing frames
Adds a test for the VP8E_SET_SCALEMODE control
Tests that the external interface to set the internal codec scaling
works as expected. Also updates the test to pull the height from
the decoded frame size rather than parsing the keyframe header,
in anticipation of allowing resolution changes on non-keyframes.
Change-Id: I3ed92117d8e5288fbbd1e7b618f2f233d0fe2c17
This commit adds the 8 tap SSSE3 subpixel filters back into the code
underneath the convolve API. The C code is still called for 4x4
blocks, as well as compound prediction modes. This restores the
encode performance to be within about 8% of the baseline.
Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c
This patch adds column-based tiling. The idea is to make each tile
independently decodable (after reading the common frame header) and
also independendly encodable (minus within-frame cost adjustments in
the RD loop) to speed-up hardware & software en/decoders if they used
multi-threading. Column-based tiling has the added advantage (over
other tiling methods) that it minimizes realtime use-case latency,
since all threads can start encoding data as soon as the first SB-row
worth of data is available to the encoder.
There is some test code that does random tile ordering in the decoder,
to confirm that each tile is indeed independently decodable from other
tiles in the same frame. At tile edges, all contexts assume default
values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode),
and motion vector search and ordering do not cross tiles in the same
frame.
t log
Tile independence is not maintained between frames ATM, i.e. tile 0 of
frame 1 is free to use motion vectors that point into any tile of frame
0. We support 1 (i.e. no tiling), 2 or 4 column-tiles.
The loopfilter crosses tile boundaries. I discussed this briefly with Aki
and he says that's OK. An in-loop loopfilter would need to do some sync
between tile threads, but that shouldn't be a big issue.
Resuls: with tiling disabled, we go up slightly because of improved edge
use in the intra4x4 prediction. With 2 tiles, we lose about ~1% on derf,
~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5%
on derf ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is
concentrated in the low-bitrate end of clips, and most of it is because
of the loss of edges at tile boundaries and the resulting loss of intra
predictors.
TODO:
- more tiles (perhaps allow row-based tiling also, and max. 8 tiles)?
- maybe optionally (for EC purposes), motion vectors themselves
should not cross tile edges, or we should emulate such borders as
if they were off-frame, to limit error propagation to within one
tile only. This doesn't have to be the default behaviour but could
be an optional bitstream flag.
Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f
This commit introduces a new convolution function which will be used to
replace the existing subpixel interpolation functions. It is much the
same as the existing functions, but allows for changing the filter
kernel on a per-pixel basis, and doesn't bake in knowledge of the
filter to be applied or the size of the resulting block into the
function name.
Replacing the existing subpel filters will come in a later commit.
Change-Id: Ic9a5615f2f456cb77f96741856fc650d6d78bb91
Adds an error-resilient mode where frames can be continued
to be decoded even when there are errors (due to network losses)
on a prior frame. Specifically, backward updates are turned off
and probabilities of various symbols are reset to defaults at
the beginning of each frame. Further, the last frame's mvs are
not used for the mv reference list, and the sorting of the
initial list based on search on previous frames is turned off
as well.
Also adds a test where an arbitrary set of frames are skipped
from decoding to simulate errors. The test verifies (1) that if
the error frames are droppable - i.e. frame buffer updates have
been turned off - there are no mismatch errors for the remaining
frames after the error frames; and (2) if the error-frames are non
droppable, there are not only no decoding errors but the mismatch
PSNR between the decoder's version of the post-error frames and the
encoder's version is at least 20 dB.
Change-Id: Ie6e2bcd436b1e8643270356d3a930e8989ff52a5
This commit starts to convert the tests to a system where the codec
to be used is provided by a factory object. Currently no tests are
instantiated for VP9 since they all fail for various reasons, but it
was verified that they're called and the correct codec is
instantiated.
Change-Id: Ia7506df2ca3a7651218ba3ca560634f08c9fbdeb