For cases where there's no transform set in bit 0 (the left edge of
the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the
left edge needs filtering), it was incorrectly being skipped before.
This situation only happens on the leftmost edge of the image, as
the edge at column 0 is intentionally skipped since there aren't
pixels to the left to read.
Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.
Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.
Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.
Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
The new print out includes skips and has prefixed sections so you can
grep to find things like transforms chosen on each frame.
Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.
This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.
Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
This seems to only be used in the encoder. Also remove an empty wrapper
file that contained forward declarations for this function, but didn't
actually define any actual functions.
Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b
vp9_default_inter_mode_probs was being accessed with a different type
than it was defined with. Ensure that its declaration is included
prior to its definition.
Change-Id: I2f963f513ab2f4e339f8a3c17e3d0f03749eba16
All elements of this table are equal to 252, so replace it with a
single constant VP9_COEF_UPDATE_PROB.
Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
Modified to work with 8x8 blocks of memory. Will revisit
later for further optimizations. For the HD clip used, the
decoder improved by almost 20%.
Change-Id: Iaa4785be293a32a42e8db07141bd699f504b8c67
Modified to work with 8x8 blocks of memory. Will revisit
later for further optimizations. For the HD clip used, the
decoder improved my 20%.
Change-Id: Ia0057f55d66d1445882351ea6c43b595a5a980e5
This commit fixed the allowable partition types for bottom-right
corner blocks.
When a block has over half of its pixels as valid content in both
vertical and horizontal directions, allow all the four partition
types in the bit-stream. Otherwise, apply partition type constraints.
Change-Id: I2252e2de7125a8bfb1c824bf34299a13c81102e3
* New probs for subpel filters/tx_count
* Makes a change to not reset to defaults for the tx_size
probs if an intermediate frame reverts to using a fixed tx_size.
* A few updates to the parameters for backward adaptation for mode/mv
* some cosmetic cleanups
derf300: +0.06%
Change-Id: I22994d659bc31ca7a4fc8820fde24001e64a2920