Commit Graph

351 Commits

Author SHA1 Message Date
Dmitry Kovalev
890eee3b47 Fixing problem with invalid delta_q reading.
This is a bitstream change but no currently produces videos should
be affected. https://code.google.com/p/webm/issues/detail?id=610

Change-Id: Ic85a6477df6c201cdf7f70f6bd84607b71f4593c
2013-09-04 11:25:43 -07:00
James Zern
1cf2272347 Merge "Fix intermediate height in convolve_c" 2013-09-03 15:50:33 -07:00
Tero Rintaluoma
e326cecf18 Fix intermediate height in convolve_c
- Intermediate height was not correct i.e. when block size is 4 and
  y_step_q4 is 6. In this case intermediate height was
  (4*6) >> 4 = 1 and vertical interpolation needs two source pixels
  plus 7 extra pixels for taps.
- Also if the current output block is 16x16 and we are using 4x upscaling
  we need only 12 rows after horizontal filtering instead of 16.

  Patch Set 2: Intermediate_height updated after CL 66723
               "Fix bug in convolution functions (filter selection)"

Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
2013-08-30 10:31:21 +03:00
Jingning Han
ec4b2742e7 Refactor 16x16 unit tests
Make the new test module comply to the unit test rules.

Change-Id: Id79ff7f03f870973ffbc74f26d64edb418b75299
2013-08-29 16:49:11 -07:00
Dmitry Kovalev
bfebe7e927 Merge "Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the common/decoder." 2013-08-27 10:15:21 -07:00
Jim Bankoski
a5cb05c45d Add a test vector that tests color space 444
This adds a test vector for 444 color space.

Change-Id: I1e2ac3883211989a062cfafc0e58151b14d294b8
2013-08-26 15:24:35 -07:00
Jim Bankoski
af13fbb70f Fix Chroma plane md5 check
Chroma plane MD5 calculation was incorrect for 444 and 422
yuv color spaces.

Change-Id: If985396871a2f57db85108a4355172f9793d3007
2013-08-26 14:26:38 -07:00
Dmitry Kovalev
45870619f3 Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the common/decoder.
Adding temporary "typedef BLOCK_SIZE BLOCK_SIZE_TYPE" which will go away
after encoder's patch.

Change-Id: I06ec6a6f079401439843ec981d1496234fd7775c
2013-08-26 11:33:16 -07:00
Adrian Grange
3f10831308 Fix bug in convolution functions (filter selection)
(In response to Issue 604:
 https://code.google.com/p/webm/issues/detail?id=604)

There were bugs in the convolution code for two cases:

1. Where the filter table was assumed to be aligned to a
   256 byte boundary. The offset of the pixel in the
   source buffer was computed incorrectly.

2. Where no such alignment assumption was made. An
   incorrect address for the filter table base was used.

To fix both problems, I now assume that the filter table is
256-byte aligned and modify the pixel offset calculation to
match.

A later patch should remove the restriction that the filter
table is aligned to a 256-byte boundary.

There was also a bug in the ConvolveTest unit test
(convolve_test.cc).

(Bug & initial fix suggestion submitted by Tero Rintaluoma
and Sami Pietilä).

Change-Id: I71985551e62846e55e40de9e7e3959d4805baa82
2013-08-23 11:16:08 -07:00
Deb Mukherjee
0d8723f8d5 Make "good" quality 2-pass vpxenc encoding default
Currently, the best quality mode in VP9 is not very well developed,
and unnecessarily makes the encode too slow. Hence the command line
default is changed to "good" quality. Also, the number of passes
default is changed to 2 passes as well, since 1-pass encoding is
not very efficient in VP9.

Besides, a number of VP9 defaults are set to the currently
recommended settings. With these changes, vpxenc
run with --codec=vp9 --kf-max-dist=9999 --cpu-used=0 should
work about the same as our borg results.
Note when the --cpu-used=0 option is dropped there will be a slight
difference in the output, because of a difference in the cpu-used
value for the first pass. Specifically, the default when unspecified
is to use cpu_used=1 for the first pass and cpu_used=0 for the
second pass. But when specified, both passes will use the cpu-used
value specified.

Note that this also changes the default for VP8 as being "good"
but other options stay unchanged.

Change-Id: Ib23c1a05ae2f36ee076c0e34403efbda518c5066
2013-08-21 12:41:26 -07:00
Dmitry Kovalev
3c43ec206c Renaming BLOCK_SIZE_TYPES constant to BLOCK_SIZES.
There will be another change set to rename BLOCK_SIZE_TYPE enum to
BLOCK_SIZE.

Change-Id: I8d1dfc873d6186fa5e554262f5169e929978085e
2013-08-09 17:47:32 -07:00
Yaowu Xu
bc484ebf06 fix unit test failure on win32 vs2008 build
The mix use of double type and simd code caused invalid values stored
in double variables, further caused unit tests to fail. The failures
were only observed on x86-win32-vs9 build with vs2008.

Change-Id: If0131754a3bf217a5ace303b7963e8f5162c34b5
2013-08-08 18:51:51 -07:00
Jim Bankoski
5b307886fb variance x86inc guards
also fixed bug in sad calcs

Change-Id: I6571fcbe37556c16ae32be66dc0fd879852aac1d
2013-08-06 14:17:13 -07:00
Jim Bankoski
c9126e0b30 sad + miscellaneous updates
Enable use_x86inc as a commandline option.  Fix Bug with sse2 when
x86inc is disabled. Adds Sad asm protection to x86inc protection

Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6
2013-08-06 12:16:04 -07:00
Jim Bankoski
62c6aa884d block error / x86inc mods
Change-Id: Icb607745634e10b9bac5019d06661ece09fcdb40
2013-08-06 06:23:38 -07:00
Jim Bankoski
a93b115cd6 reworked config for use_x86_inc
Support enabling it or disabling it.  Moved read out to configure.sh
so that its done once instead of in make and in config.

Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b
2013-08-05 17:35:25 -07:00
James Zern
d115cd8b12 Merge changes I082959ab,Ib6932640
* changes:
  vp9/decoder: threaded row-based loop filter
  vp9/decoder: add thread worker
2013-08-05 16:07:09 -07:00
Jim Bankoski
a5a7322459 Merge "Begin to restrict x86inc.asm usage" 2013-08-05 14:17:49 -07:00
James Zern
a0ffa2794b vp9/decoder: threaded row-based loop filter
Currently the only threaded option for vp9 decode. Enabled when the
decoder config thread count is > 1.

Change-Id: I082959abac9e31aa4a38ed9fd68b94680e57f4df
2013-08-05 13:22:04 -07:00
James Zern
183b77d5ab vp9/decoder: add thread worker
vp9/decoder/vp9_thread.[hc]
Original source:
 http://git.chromium.org/webm/libwebp.git
 100644 blob b1615d0fb8d311666b2fa4561076c62d72c2e3ff  src/utils/thread.c
 100644 blob 13a61a4c84194c3374080cbf03d881d3cd6af40d  src/utils/thread.h

Local modifications:
 - s/WebP/VP9/g
 - camelcase functions -> lower with _'s

Change-Id: Ib6932640ee34f8b4782c6fbd15864a59d5d4c5fe
2013-08-05 13:21:13 -07:00
Jim Bankoski
c3809f3de5 Begin to restrict x86inc.asm usage
Chromium does not support 32bit builds for Mac which use x86inc.asm.
Make the files which include it work if 64bit or not PIC enabled
starting with vp9_copy_sse2.asm

Consolidate these targets in vp9_rtcd_defs.sh

Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248
2013-08-05 12:07:30 -07:00
Dmitry Kovalev
d007446b3f Replacing long block size enum values with shorter ones (2).
Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b
2013-08-05 10:51:02 -07:00
James Zern
1197d6736c Merge "tests: silence a few type related warnings" 2013-07-22 11:50:22 -07:00
James Zern
4a688b26f7 Merge "cosmetics: idct_test.cc: fix formatting" 2013-07-22 11:49:23 -07:00
James Zern
104dbbbfd9 tests: silence a few type related warnings
Change-Id: If908328c1dbbb5bd84c57e30fab1cda1804933e4
2013-07-18 16:13:39 -07:00
James Zern
bae311772f cosmetics: tile_independence_test: fix formatting
Change-Id: Ifd48f796fa70fe1dc9b87a6f2bdc715bc0ea5ad3
2013-07-18 16:00:01 -07:00
James Zern
36b882eeb6 cosmetics: idct_test.cc: fix formatting
clang-format -style=Google

Change-Id: Ic85f2cd2a1d65d9cf18a0f8bc515c0a0f5161747
2013-07-18 15:42:06 -07:00
Johann
9ca66ec050 Merge "vp9_convolve8_neon placeholder" 2013-07-17 10:09:00 -07:00
Johann
59dc4e9cdd vp9_convolve8_neon placeholder
Call the individually optimized horizontal and vertical functions. This
implementation abuses the temp buffer.

This will be replaced with a custom optimized function.

Over 2x speedup.

Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
2013-07-17 08:39:27 -07:00
James Zern
70fe2b3ec3 Merge "Cosmetic changes in 4x4 and 8x8 fdct unit tests" 2013-07-16 12:55:42 -07:00
Johann
90ebfe621f Merge "vp9_convolve8_[horiz|vert]_avg" 2013-07-16 09:42:52 -07:00
Jingning Han
6094bf37c5 Cosmetic changes in 4x4 and 8x8 fdct unit tests
Make the codes consistent with conventions.

Change-Id: Id044ed8382f83a3c3f54f9edd569f00bcd0523db
2013-07-15 11:37:17 -07:00
Dmitry Kovalev
31a68bcdff Fixing vp9_get_pred_context_comp_ref_p function.
Adding missed parenthesis around boolean expressions. Bitstream is changed.
Regenerating test vectors.

Change-Id: I4cc00b761e9473f92f180a9fc3a0c607f0aaae56
2013-07-12 17:46:02 -07:00
Johann
a15bebfc0a vp9_convolve8_[horiz|vert]_avg
Super basic conversion from the other implementations. Any changes to
one should be trivial to copy over keep in sync.

Change-Id: I1720b4128e0aba4b2779e3761f6494f8a09d3ea8
2013-07-12 16:21:33 -07:00
Jingning Han
119decdee7 Merge "Cosmetic changes in 16x16 ADST/DCT unit test" 2013-07-11 21:52:39 -07:00
Jingning Han
29c45f31ee Cosmetic changes in 16x16 ADST/DCT unit test
Change-Id: Ic649e9e47d14d6f8cae0c443a425ea533a97ad8d
2013-07-11 11:37:38 -07:00
Johann
158c80cbb0 convolve8 optimizations for neon
Independent horizontal and vertical implementations.

Requires that blocks be built from 4x4 and [xy]_step_q4 == 16

6-10% improvement. CIF improved the least.

Change-Id: I137f5ceae4440adc0960bf88e4453e55a618bcda
2013-07-11 11:08:19 -07:00
Ronald S. Bultje
decead7336 Replace copy_memNxM functions with a generic copy/avg function.
Change-Id: I3ce849452ed4f08527de9565a9914d5ee36170aa
2013-07-10 18:27:24 -07:00
Jingning Han
82c415328c Merge "Add unit test for 16x16 forward ADST/DCT" 2013-07-10 11:16:39 -07:00
Jingning Han
cf768b2d80 Add unit test for 16x16 forward ADST/DCT
Unit tests on the functional accuracy of forward ADST/DCT.

Change-Id: I81afff866bdeacbd457b0af96993a035741657f6
2013-07-10 09:40:46 -07:00
Yaowu Xu
9ce6de195b Added a lossless test
It does encodings with min and max q set at 0, and check to make sure
output PSNR at MAX_PSNR (100).

Change-Id: Ia2418353cccf6e487204ea4ff874a7e71e55cb3e
2013-07-09 14:40:20 -07:00
Yaowu Xu
df5731273f Merge "Fix loopfilter bug" 2013-07-09 01:34:25 -07:00
John Koleszar
527fc5caf6 Fix loopfilter bug
In the rare case were 4x4 interior filtering was called for but no
8x8 or larger filtering takes place, the previous code was skipping
the filtering. This patch fixes the issue by including the interior
mask in the overall mask for the filter application loops.

Change-Id: I4a0b65056c64f97478827c2ff41e0914fc7779d0
2013-07-08 16:49:57 -07:00
Jim Bankoski
b0520b61ed new unit test for cpu-speed
Tests q0 ( lossless),  very high bitrate and low bitrates at cpu speed
0, 1 and 2.

Change-Id: I0c5cdca00acd8d01e7b13f124b3b08d4b1ae9f6d
2013-07-02 14:38:03 -07:00
James Zern
e247ab09a6 variance_test: add missing ClearSystemState...
...to recently added SubpelVarianceTest

Change-Id: I8775e39fd5dbfba81ad42b79b47bf6dd6ca8cc0e
2013-06-26 18:32:21 -07:00
Jingning Han
9b744ce35b Fix aligned memory allocation in unit tests
Change-Id: I38fac90e0ed25cb747453ab1d6396187cf5ef3b9
2013-06-26 11:59:46 -07:00
James Zern
e4f38c88da test/fdct*: fix some warnings
comment out some unused parameters and adjust the format to avoid:
./test/fdct4x4_test.cc|27| warning C4138: '*/' found outside of comment

Change-Id: I60f93b4c3cd7e8d61f0de80019f3404b40161f03
2013-06-26 11:09:08 -07:00
James Zern
66c7dffd5c tests/*source: test file pointer before reading
if the caller did not abort after an ASSERT failure in Begin()
FillFrame() would segfault.

Change-Id: I2d3f5a0918611bbd081be6f686dea19c56695073
2013-06-25 17:57:52 -07:00
James Zern
1c05e9de2c encode_test_driver: check for fatal failures
Make the base test be:
!(fatal || abort_) removing some redundancy in the encode tests

Change-Id: I8ffaf33fcf9a3030b38ea3e8eb94704cdc2fc920
2013-06-25 17:57:52 -07:00
Jingning Han
3f184bce7b Merge "Cosmetic changes in 4x4 fwd transform unit test" 2013-06-25 13:17:23 -07:00
Jingning Han
d52c359d43 Merge "Tune the rounding operations in 8x8 ADST/DCT sse2" 2013-06-25 13:17:05 -07:00
James Zern
4c0f283886 Merge "I420VideoSource: normalize framerate types" 2013-06-25 12:57:49 -07:00
James Zern
9d95993115 Merge "intrapred_test: add virtual dtor to IntraPredBase" 2013-06-25 12:56:40 -07:00
Jingning Han
0084e61d5f Tune the rounding operations in 8x8 ADST/DCT sse2
Improve the round-trip precision to meet the unit test setttings.

Change-Id: I303febae56b4b990ea3798b8ebed94c0510ecf79
2013-06-25 12:02:26 -07:00
Ronald S. Bultje
becf1691c4 Merge "Add SAD unit tests for all rectangular sizes." 2013-06-25 12:00:41 -07:00
Jingning Han
29b6e73c2c Cosmetic changes in 4x4 fwd transform unit test
Change-Id: I7a9ea03b92160f1052e56665b19a155211ee241f
2013-06-25 11:39:19 -07:00
Jingning Han
ab362621fe Add 8x8 dct/adst unit tests
This commit enables 8x8 DCT and hybrid transform unit tests. It
also tunes the forward hybrid transform rounding opertions for
more precise round-trip performance.

Change-Id: If05c1ce59d75d641b9c6c91527d02d3a6ef498c3
2013-06-25 09:57:01 -07:00
Ronald S. Bultje
3c4abbe454 Add SAD unit tests for all rectangular sizes.
Change-Id: I47e81b51f072abdb276bdec85423febba34b5f81
2013-06-24 14:05:13 -07:00
Yaowu Xu
93f88ab55a Merge "Fix loopfilter of leftmost 4x4 edges in SB" 2013-06-24 09:55:21 -07:00
John Koleszar
858475a03a Fix loopfilter of leftmost 4x4 edges in SB
For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.

Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
2013-06-24 08:26:00 -07:00
Ronald S. Bultje
4eb8c56587 Merge "Allocate memory using appropriate expected alignment in unit tests." 2013-06-21 21:22:55 -07:00
James Zern
c2fa8390f6 I420VideoSource: normalize framerate types
ctor inputs are ints as are vpx_rational_t members

Change-Id: I62a39bf3df123727a872e40b74e3ee9e55ef2ede
2013-06-21 19:34:51 -07:00
James Zern
f6d293adf6 intrapred_test: add virtual dtor to IntraPredBase
classes with virtual functions should have virtual destructors

Change-Id: If54e2f8384f0bfcbf812cc727eb9d0a586173674
2013-06-21 19:33:50 -07:00
Ronald S. Bultje
ac6ea2ab91 Allocate memory using appropriate expected alignment in unit tests.
Fixes crashes of test_libvpx on 32-bit Linux.

Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
2013-06-21 17:03:57 -07:00
John Koleszar
0c8e13d2f8 Merge "Add some unaligned test vectors" 2013-06-21 16:31:18 -07:00
James Zern
cc774c8bb0 variance_test: use REGISTER_STATE_CHECK
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
2013-06-21 14:30:08 -07:00
Ronald S. Bultje
7756e9892b Merge "Add subtract_block SSE2 version and unit test." 2013-06-21 12:49:50 -07:00
Ronald S. Bultje
9a480482cb Merge "SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance()." 2013-06-21 12:49:43 -07:00
Ronald S. Bultje
25c588b1e4 Add subtract_block SSE2 version and unit test.
3% faster overall (3min35.0 to 3min28.5).

Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
2013-06-21 09:35:37 -07:00
Yaowu Xu
e6cd5ed307 Merge "Implement sse2 and ssse3 versions for all sub_pixel_variance sizes." 2013-06-20 17:42:50 -07:00
Ronald S. Bultje
1e6a32f1af SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance().
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.

Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
2013-06-20 15:59:48 -07:00
Jingning Han
4f4713b417 Merge "Add unit tests for 4x4 ADST" 2013-06-20 10:22:40 -07:00
Ronald S. Bultje
8fb6c58191 Implement sse2 and ssse3 versions for all sub_pixel_variance sizes.
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):

sse2   4x4:    99 ->   82 cycles
sse2   4x8:           128 cycles
sse2   8x4:           121 cycles
sse2   8x8:   149 ->  129 cycles
sse2   8x16:  235 ->  245 cycles (?)
sse2  16x8:   269 ->  203 cycles
sse2  16x16:  441 ->  349 cycles
sse2  16x32:          641 cycles
sse2  32x16:          643 cycles
sse2  32x32: 1733 -> 1154 cycles
sse2  32x64:         2247 cycles
sse2  64x32:         2323 cycles
sse2  64x64: 6984 -> 4442 cycles

ssse3  4x4:           100 cycles (?)
ssse3  4x8:           103 cycles
ssse3  8x4:            71 cycles
ssse3  8x8:           147 cycles
ssse3  8x16:          158 cycles
ssse3 16x8:   188 ->  162 cycles
ssse3 16x16:  316 ->  273 cycles
ssse3 16x32:          535 cycles
ssse3 32x16:          564 cycles
ssse3 32x32:          973 cycles
ssse3 32x64:         1930 cycles
ssse3 64x32:         1922 cycles
ssse3 64x64:         3760 cycles

Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
2013-06-20 09:34:25 -07:00
Jingning Han
362809dfbf Add unit tests for 4x4 ADST
Enable sign bias check and round-trip error unit tests for 4x4 hybrid
transform modules.

Change-Id: Icd3d839f098d4b92b00ff76eac146765b039d0d3
2013-06-20 09:24:48 -07:00
John Koleszar
639db571df Add some unaligned test vectors
Tests resolutions of 8, 10, 16, 18, 32, 34, 64, 66 to exercise the
border conditions, as well as non-SB aligned sizes.

Change-Id: Ie7c2b7860ac3727e23202042f2e86792652912f8
2013-06-19 11:46:09 -07:00
John Koleszar
2319b7aaf1 Merge "tests: clear system state after non-API calls" 2013-06-18 16:40:15 -07:00
James Zern
5b756748fd tests: clear system state after non-API calls
add ClearSystemState() to reset MMX registers avoiding corrupting
subsequent tests.

Change-Id: I668deb09aa7aa467709776e5819f936910698bc0
2013-06-18 11:32:27 -07:00
James Zern
e7b599f683 convolve_test: align filter arrays
fixes issue #583

Change-Id: I4b855a5b5b168c8961410cef6ab5e6d86f14d301
2013-06-17 23:14:15 -07:00
Jeff Petkau
368c72374e Change the encryption feature to use a callback for decryption.
This allows code calling the library can choose an arbitrary
encryption algorithm.

Decoder control parameter VP8_SET_DECRYPT_KEY is renamed to
VP8D_SET_DECRYPTOR, and now takes an small config struct instead
of just a byte array.

Change-Id: I0462b3388d8d45057e4f79a6b6777fe713dc546e
2013-06-17 11:32:16 -07:00
John Koleszar
f616cfe4d7 Merge "Add vp9 test vectors unit test" 2013-06-17 10:32:08 -07:00
Jingning Han
0b7910b9ff Merge "Enable sse2 version of sad8x4/4x8" 2013-06-14 13:15:49 -07:00
Jingning Han
c43af9a8a3 Enable sse2 version of sad8x4/4x8
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.

Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
2013-06-14 09:19:28 -07:00
Jingning Han
15f50e7b42 Enable sse2 version of sad8x4/4x8
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.

Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
2013-06-13 16:18:18 -07:00
John Koleszar
119c9812a5 Add vp9 test vectors unit test
These files can stand in until we get proper syntax vectors. They
should provide some additional assurance against inadvertant
bitstream changes.

Change-Id: I12f6c9a5f054e30df40a7ff1f33145abf7e1d59d
2013-06-13 12:54:01 -07:00
Ronald S. Bultje
fa96eeb835 Implement SSE version for sad4x8x4d and SSE2 version for sad8x4x4d.
Encoding time of crew (CIF, first 50 frames) @ 1500kbps goes from 4min56
to 4min42.

Change-Id: I92c0c8b32980d2ae7c6dafc8b883a2c7fcd14a9f
2013-06-12 17:40:01 -04:00
Deb Mukherjee
995ce523eb Cosmetic cleanups of filters
No bitstream change.

Removes unused filters and the code for the case of 2 switchable filters;
also changes the 8tap-smooth filter coefficients for integer shifts to be
interpolating to be consistent with the way it is implemented currently.

Change-Id: I96c542fd8c06f4e0df507a645976f58e6de92aae
2013-06-10 12:06:36 -07:00
Jingning Han
78b8190cc7 Handle partition type coding of boundary blocks
The partition types of blocks sitting on the frame boundary are
constrained by the block size and the position of each sub-block
relative to the frame. Hence we use truncated probability models
to handle the coding of such information.

100 frames run:
yt 0.138%

Change-Id: I85d9b45665c15280069c0234ea6f778af586d87d
2013-06-07 14:19:40 -07:00
John Koleszar
a425e2cc06 Add marker bit to bool-coded partition start
Adds a marker bit to allow distinguishing the frame header from its residual
data.

Change-Id: Id75d47acc9e5a97007e4690c4f8748a4ce63e641
2013-06-06 23:06:26 -07:00
Jim Bankoski
5a88271b09 don't tokenize & encode tokens for blocks in UMV
This avoids encoding tokens for blocks that are entirely
in the UMV border. This changes the bitstream.

Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9
2013-06-06 06:10:25 -07:00
James Zern
a91e5b4fdc sad_test: fix msvc compile
Fixes:
error C2121: '#' : invalid character : possibly the result of a macro expansion

Change-Id: I63d7ebba29f3a3cbb546406be39270290e9dc47d
2013-05-29 17:48:53 -07:00
Yunqing Wang
f4fcfe3075 Optimize variance functions
Added SSE2 version of variance functions for super blocks.

Change-Id: Ibeaae8771ca21c99d41dd74067574a51e97b412d
2013-05-22 10:29:38 -07:00
Scott LaVarnway
ba48a11130 WIP: 4x4 idct/recon merge
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.

Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
2013-05-20 13:03:17 -04:00
Scott LaVarnway
794a7bedbd WIP: 8x8 idct/recon merge
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.

Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45
2013-05-16 13:52:15 -04:00
Scott LaVarnway
a272ff25cd WIP: 16x16 idct/recon merge
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.

Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1
2013-05-15 13:16:02 -04:00
Scott LaVarnway
2cf0d4be12 WIP: 32x32 idct/recon merge
This patch eliminates the intermediate diff buffer usage by
combining the short idct and the add residual into one function.
The encoder can use the same code as well.

Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a
2013-05-14 15:54:17 -07:00
John Koleszar
9e327dbb76 Change test image format to VPX_IMG_FMT_I420
Code was previously using VPX_IMG_FMT_VPXI420, which was intended to be the
"vpx" non-YUV colorspace variant.

Change-Id: Icf8771eeefeb574055ed638a93450c3d0ed5b9f5
2013-05-08 20:48:37 -07:00
Dmitry Kovalev
cd5113ceec Replacing vp9_{write, write_literal, bit} macros with functions.
Also removing BOOL_CODER and using vp9_writer instead.

Change-Id: I31d1ec661872f7eb1fe869607b6ed0ebfbb03e01
2013-05-07 18:19:50 -07:00
John Koleszar
9fba03456d Partially disable error resilience test
Disables the part of the error-resilient test that tests the
quality after dropping undroppable frames. It's not clear how
to set the threshold for this correctly at the moment.

Change-Id: I3ee4a0d475498f44711fdef05749f305e8d08591
2013-05-07 14:26:26 -07:00
John Koleszar
c0490a5cbb Revert "Adjust error resilience test data rate."
This reverts commit b24735c622
since the adjusted threshold doesn't allow the existing tests
to pass. Will disable the failing test in a separate commit.

Change-Id: I26d41cf6175f300bbad493cecdc96e6b0dd6f2fe
2013-05-07 12:58:32 -07:00
Paul Wilkins
b24735c622 Adjust error resilience test data rate.
Note that the pass fail criteria for this test seems a bit
arbitrary to me.

Change-Id: Idc695c39dd7542e851a7732b2810b45e0bdf91ae
2013-05-07 18:50:39 +01:00
John Koleszar
b844e50a61 Merge "encode_test_driver: make ~Encoder virtual" into experimental 2013-05-06 22:36:40 -07:00
James Zern
2b1a0b68bf test/tile_independence_test: check decode return
abort on failure

Change-Id: I52882613e466ae57e1ed7f10ca64e25b9724fb61
2013-05-06 11:55:15 -07:00
James Zern
51b7fd0d77 encode_test_driver: make ~Encoder virtual
+ some quick lint fixes

Change-Id: I95b6c32454c17d7fc717f1daa2376eb4d5418ee3
2013-05-03 19:08:08 -07:00
James Zern
c9327e6e66 Merge changes Ifea8618a,I014b832a into experimental
* changes:
  convolve_test: cosmetics
  convolve_test: remove unnecessary memset
2013-05-02 19:32:03 -07:00
James Zern
8fb48afd54 convolve_test: cosmetics
fix indent, whitespace, casts

Change-Id: Ifea8618a90f9da263a8955dd242bb3aa7fc59ae5
2013-05-02 19:30:47 -07:00
James Zern
c0b44b3160 superframe_test: use delete[] where appropriate
Change-Id: Id374267c93a7e14e985b8079833364c8eff5248b
2013-05-02 18:19:22 -07:00
James Zern
b0e5775ebc convolve_test: remove unnecessary memset
input_ is filled with random values just afterward.
the size was wrong anyway as input_ is allocated with memalign so
sizeof(input_)==sizeof(uint8_t*)

Change-Id: I014b832ac60960cd22b6f369dbc9fd648d4055b5
2013-05-02 12:32:13 -07:00
Johann
32a5c52856 Merge branch 'master' into experimental
Conflicts:
	vp9/common/vp9_findnearmv.c
	vp9/common/vp9_rtcd_defs.sh
	vp9/decoder/vp9_decodframe.c
	vp9/decoder/x86/vp9_dequantize_sse2.c
	vp9/encoder/vp9_rdopt.c
	vp9/vp9_common.mk

Resolve file name changes in favor of master. Resolve rdopt changes in
favor of experimental, preserving the newer experiments.

Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f
2013-04-26 12:57:10 -07:00
Dmitry Kovalev
0b44624c37 Finally removing BOOL_DECODER and using vp9_reader instead.
Change-Id: I03d5b6f22f0930893709c6db5f1b06762ad3354e
2013-04-19 10:37:24 -07:00
John Koleszar
a9ebbcc338 convolve: support larger blocks, fix asm saturation bug
Updates the common convoloution code to support blocks larger than
16x16, and rectangular blocks. This uncovered a bug in the SSSE3
filtering routines due to the order of application of saturation.
This commit fixes that bug, adjusts the unit test to bias its
random values towards the extremes, and adds a test to ensure that
all filters conform to the expected pairwise addition structure.

Change-Id: I81f69668b1de0de5a8ed43f0643845641525c8f0
2013-04-18 13:57:59 -07:00
John Koleszar
7f7d1357a2 Merge branch 'experimental' into master
VP9 preview bitstream 2, commit '868ecb55a1528ca3f19286e7d1551572bf89b642'

Conflicts:
	vp9/vp9_common.mk

Change-Id: I3f0f6e692c987ff24f98ceafbb86cb9cf64ad8d3
2013-04-16 06:49:46 -07:00
Dmitry Kovalev
67d060067e Replacing vp9_read, vp9_read_literal, vp9_read_bit macros with functions.
This is the first CL with vp9_reader changes. All another macro
definitions will be replaced after.

Change-Id: I1c6bd9c9a612ec1663d484d6adb4fb720af54063
2013-04-15 14:54:19 -07:00
James Zern
c4195e0eb8 tests: use a portable rand() implementation
the one from gtest in this case: testing::internal::Random.
this will make the tests deterministic between platforms. addresses
issue #568.

Change-Id: I5a8a92f5c33f52cb0a219c1dd3d02335acbbf163
2013-04-04 19:29:33 -07:00
John Koleszar
672b75a103 Convert inv_tile_order to control interface
Restore ABI compatibility with the master branch.

Change-Id: Ie9f6fdf536662bd87dfcf114d16f003422670763
2013-03-27 11:22:20 -07:00
John Koleszar
771fc832f3 Merge branch 'master' into experimental
Pick up VP8 encryption, quantization changes, and some fixes to vpxenc

Conflicts:
	test/decode_test_driver.cc
	test/decode_test_driver.h
	test/encode_test_driver.cc
	vp8/vp8cx.mk
	vpxdec.c
	vpxenc.c

Change-Id: I9fbcc64808ead47e22f1f22501965cc7f0c4791c
2013-03-27 10:46:19 -07:00
John Koleszar
449f136886 VP9/ResizeInternalTest: adjust passing threshold
Update to +/- 1dB.

Change-Id: Idada001f261b36945c9334e288a415ee2c79c415
2013-03-18 15:17:45 -07:00
Dmitry Kovalev
26cec5c13f Basic encryption feature for libvpx.
New decoder control paramter VP8_SET_DECRYPT_KEY to set the decryption key.

Change-Id: I6fc1f44d41f74f3b3f702778af1a6f8f5cc9439f
2013-03-15 18:21:55 -07:00
Yaowu Xu
005552639b removed reference to "LLM" and "x8"
The commit changed the name of files and function to remove obselete
reference to LLM and x8.

Change-Id: I973b20fc1a55149ed68b5408b3874768e6f88516
2013-03-13 08:35:46 -07:00
John Koleszar
bd9cd9a185 fix superframe index marker masks
The superframe index marker byte carries data in the lower 5 bits. Only the
upper 3 should be used as part of the mask to detect it. By masking with
0xf0, the previous code was incorrect for frames over 65k bytes.

Change-Id: I6248889f5af227457f359a56b2348ef6db87a3b4
2013-03-12 19:04:32 -07:00
John Koleszar
0a18228274 Merge "Add 'superframe' index" into experimental 2013-03-11 16:31:48 -07:00
John Koleszar
93e10c8e87 Update ResizeInternalTest threshold
Improved coding performance made this test fail. Adjust the threshold
so that it passes again. A more stable metric is an open TODO.

Change-Id: I56e18749ced48123ee2488888a3eed631759912b
2013-03-05 13:44:56 -08:00
John Koleszar
522d4bf852 Add 'superframe' index
A 'superframe' is a group of frames that share the same PTS, but have a
defined decoding order. This commit adds the ability to append an index
to such a group of frames, allowing for random access to the constituent
frames. This could be useful for frame-level parallelism or partial
decoding in a multilayer scenario.

Decoding the stream serially without such an index should work as a
fallback, and VP9/TestSuperframeIndexIsOptional verifies that.

Change-Id: Idff83b7560e1a7077d8fb067bfbc45b567e78b1c
2013-03-05 12:45:40 -08:00
John Koleszar
2d3e879fcc Merge changes If5896507,I06b5ba5c,I2712f99e into experimental
* changes:
  Add unit test for x4 multi-SAD functions
  Add VP9 1 block SAD functions to unit test
  Merge master branch into experimental
2013-03-01 20:50:33 -08:00
John Koleszar
1cfc86ebe0 Add unit test for x4 multi-SAD functions
Update the function prototypes to match between VP9 and VP8.

Change-Id: If58965073989e87df3b62b67a030ec6ce23ca04f
2013-03-01 18:14:02 -08:00
John Koleszar
6b653cba02 Add VP9 1 block SAD functions to unit test
Change-Id: I06b5ba5c457944cfa4cd9f53c3bd8cda132439c2
2013-03-01 18:04:19 -08:00
Ronald S. Bultje
e189edfeb1 Initialize pass variable in tile test.
Change-Id: I7977694223521404fc69f29ae2cff03e36e87299
2013-03-01 12:43:10 -08:00
John Koleszar
69c67c9531 Merge master branch into experimental
Picks up some build system changes, compiler warning fixes, etc.

Change-Id: I2712f99e653502818a101a72696ad54018152d4e
2013-03-01 11:06:05 -08:00
John Koleszar
04c2407874 convolve test: validate 1D filters are 1D
Since the 8-tap lowpass filter is non-interpolating, the results are
different between applying it at whole-pel values and not. This
means that 1D-only versions are requried to be implemented, as
opposed to being an optimization of the 2D case. Calling the 2D
filter instead of the horizontal-only filter is not equivalent
in this case. Update the test to pass invalid filters to the
unused stage of the 1D-only calls, to verify they're unused.

Change-Id: Idc1c490f059adadd4cc80dbe770c1ccefe628b0a
2013-02-27 11:19:11 -08:00
John Koleszar
557a1b209e Run all filters through convolve test
Updates the convolve test to verify that all filters match the
reference implementation. This verifies commit 30f866f, which
fixed some problems with the SSE3 version of the filters for
the vp9_sub_pel_filters_8s and vp9_sub_pel_filters_8lp banks
due to overflow and order of operations.

Change-Id: I6b5fe1a41bc20062e2e64633b1355ae58c9c592c
2013-02-27 11:15:20 -08:00
John Koleszar
9615fd8f39 Merge "Test upscaling as well as downscaling" into experimental 2013-02-27 10:25:51 -08:00
John Koleszar
d8e68bd14b Merge changes I922f8602,I0ac3343d into experimental
* changes:
  Use 256-byte aligned filter tables
  Set scale factors consistently for SPLITMV
2013-02-27 10:08:53 -08:00
John Koleszar
b683eecf6d Test upscaling as well as downscaling
Fixes a bug in vp9_set_internal_size() that prevented returning to
the unscaled state. Updated the ResizeInternalTest to scale both
down and up. Added a check that all frames are within 2.5% of the
quality of the initial keyframe.

Change-Id: I3b7ef17cdac144ed05b9148dce6badfa75cff5c8
2013-02-27 08:22:40 -08:00
John Koleszar
6fd7dd1a70 Use 256-byte aligned filter tables
This avoids duplicating all the filters twice. Includes fixups to the
convolve routines and associated tests to make this work.

Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
2013-02-27 08:22:39 -08:00
Yaowu Xu
103d83cb6c Merge "Enable 32x32 dct tests" into experimental 2013-02-27 07:57:07 -08:00
John Koleszar
eb939f45b8 Spatial resamping of ZEROMV predictors
This patch allows coding frames using references of different
resolution, in ZEROMV mode. For compound prediction, either
reference may be scaled.

To test, I use the resize_test and enable WRITE_RECON_BUFFER
in vp9_onyxd_if.c. It's also useful to apply this patch to
test/i420_video_source.h:

  --- a/test/i420_video_source.h
  +++ b/test/i420_video_source.h
  @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {

     virtual void FillFrame() {
       // Read a frame from input_file.
  +    if (frame_ != 3)
       if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
         limit_ = frame_;
       }

This forces the frame that the resolution changes on to be coded
with no motion, only scaling, and improves the quality of the
result.

Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
2013-02-26 23:54:23 -08:00
John Koleszar
6a4f708c25 Refactor inter recon functions to support scaling
Ensure that all inter prediction goes through a common code path
that takes scaling into account. Removes a bunch of duplicate
1st/2nd predictor code. Also introduces a 16x8 mode for 8x8
MVs, similar to the 8x4 trick we were doing before. This has an
unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that
case for now.

Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
2013-02-26 10:03:29 -08:00
Yaowu Xu
3dbc78b134 Enable 32x32 dct tests
Also
1. Removed the test code for fDCT from the iDCT test.
2. changed the criteria of round trip error to be below 1/block, this
is quite strict comparing to smaller transforms when size differences
are accounted for.

Change-Id: Idb46a6380b04c93fc8e2845c75f5a850366b0090
2013-02-26 09:23:01 -08:00
Yaowu Xu
499fe05dc0 optimize forward 16x16 DCT for accuracy
This commit added pre/post scaling for first half of fDCT16x16 to
reduce error, by simulation of 100,000 blocks for random inputs,
the average sse reduced from 2.1/block to 0.0498/block.

also enabled tests for 16x16 fDCT and iDCT

Change-Id: Id2a95f0464c6dd4118797d456237ae90274c0f02
2013-02-25 07:47:27 -08:00
Yaowu Xu
22012ee994 optimize 8x8 fdct rounding for accuracy
The commit added a final rounding choice for 8x8 forward dct to get
rid of a sign bias at DC position and improve the accuracry in term
of round trip error for 8x8 fDCT/iDCT.

This commit also enabled forward 8x8 dct test.

Change-Id: Ib67f99b0a24d513e230c7812bc04569d472fdc50
2013-02-22 16:55:30 -08:00
James Zern
1711cf2dbb add vp8 variance test
Change-Id: I4e94ee2c4e2360d6a11a454c323f2899c1bb6f72
2013-02-22 16:25:14 -08:00
James Zern
540997afba sixtap_predict_test: fix sizes passed to memset
src_/dst_/dst_c_ are heap allocated, use the allocation size rather than
sizeof(var)

Change-Id: I3335ad487dc9b154cdf212891d1d74c812eff060
2013-02-22 11:29:47 -08:00
Yaowu Xu
4e2697f5cd changes related fdct/idct tests
1. changed 4x4 test name to Vp9Fdct4x4Test to be consistent
2. remove forward 8x8 dct test code from idct8x8_test.cc
3. temporarily disable other forward dct tests to allow fdct work in
progress

Change-Id: I566aeed9c7c34da5a206190aa7d0e847a4008b36
2013-02-22 10:39:31 -08:00
Paul Wilkins
649be94cf0 Removal of Hybrid DWT/DCT experiment.
Removal of experiment to simplify code base for other
changes.

Change-Id: If0a33952504558511926ad212bc311fc2bffb19a
2013-02-13 15:08:48 +00:00
Ronald S. Bultje
f496f601fb Add tile column size limits (256 pixels min, 4096 pixels max).
This is after discussion with the hardware team. Update the unit test
to take these sizes into account. Split out some duplicate code into
a separate file so it can be shared.

Change-Id: I8311d11b0191d8bb37e8eb4ac962beb217e1bff5
2013-02-12 10:33:34 -08:00
John Koleszar
6dfc95fe63 Merge changes Icd1a2a5a,I204d17a1,I3ed92117 into experimental
* changes:
  Initial support for resolution changes on P-frames
  Avoid allocating memory when resizing frames
  Adds a test for the VP8E_SET_SCALEMODE control
2013-02-08 14:20:05 -08:00
John Koleszar
3de8ee6ba1 Merge changes Ife0d8147,I7d469716,Ic9a5615f into experimental
* changes:
  Restore SSSE3 subpixel filters in new convolve framework
  Convert subpixel filters to use convolve framework
  Add 8-tap generic convolver
2013-02-08 13:19:47 -08:00
John Koleszar
88f99f4ec2 Adds a test for the VP8E_SET_SCALEMODE control
Tests that the external interface to set the internal codec scaling
works as expected. Also updates the test to pull the height from
the decoded frame size rather than parsing the keyframe header,
in anticipation of allowing resolution changes on non-keyframes.

Change-Id: I3ed92117d8e5288fbbd1e7b618f2f233d0fe2c17
2013-02-08 12:20:30 -08:00
John Koleszar
29d47ac80e Restore SSSE3 subpixel filters in new convolve framework
This commit adds the 8 tap SSSE3 subpixel filters back into the code
underneath the convolve API. The C code is still called for 4x4
blocks, as well as compound prediction modes. This restores the
encode performance to be within about 8% of the baseline.

Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c
2013-02-08 12:18:14 -08:00
Ronald S. Bultje
1407bdc243 [WIP] Add column-based tiling.
This patch adds column-based tiling. The idea is to make each tile
independently decodable (after reading the common frame header) and
also independendly encodable (minus within-frame cost adjustments in
the RD loop) to speed-up hardware & software en/decoders if they used
multi-threading. Column-based tiling has the added advantage (over
other tiling methods) that it minimizes realtime use-case latency,
since all threads can start encoding data as soon as the first SB-row
worth of data is available to the encoder.

There is some test code that does random tile ordering in the decoder,
to confirm that each tile is indeed independently decodable from other
tiles in the same frame. At tile edges, all contexts assume default
values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode),
and motion vector search and ordering do not cross tiles in the same
frame.
t log

Tile independence is not maintained between frames ATM, i.e. tile 0 of
frame 1 is free to use motion vectors that point into any tile of frame
0. We support 1 (i.e. no tiling), 2 or 4 column-tiles.

The loopfilter crosses tile boundaries. I discussed this briefly with Aki
and he says that's OK. An in-loop loopfilter would need to do some sync
between tile threads, but that shouldn't be a big issue.

Resuls: with tiling disabled, we go up slightly because of improved edge
use in the intra4x4 prediction. With 2 tiles, we lose about ~1% on derf,
~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5%
on derf ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is
concentrated in the low-bitrate end of clips, and most of it is because
of the loss of edges at tile boundaries and the resulting loss of intra
predictors.

TODO:
- more tiles (perhaps allow row-based tiling also, and max. 8 tiles)?
- maybe optionally (for EC purposes), motion vectors themselves
  should not cross tile edges, or we should emulate such borders as
  if they were off-frame, to limit error propagation to within one
  tile only. This doesn't have to be the default behaviour but could
  be an optional bitstream flag.

Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f
2013-02-05 15:43:03 -08:00
John Koleszar
5ca6a3667f Add 8-tap generic convolver
This commit introduces a new convolution function which will be used to
replace the existing subpixel interpolation functions. It is much the
same as the existing functions, but allows for changing the filter
kernel on a per-pixel basis, and doesn't bake in knowledge of the
filter to be applied or the size of the resulting block into the
function name.

Replacing the existing subpel filters will come in a later commit.

Change-Id: Ic9a5615f2f456cb77f96741856fc650d6d78bb91
2013-02-05 14:19:28 -08:00