Offsetting by a variable stride prevents instruction reordering,
resulting in poor assembly.
Additionally, reroll the 16x16/32x32 loops to reduce register spill with
this new format.
Change-Id: I0635b8ba21ecdb88116e927dbdab53acdf256e11
The rotation computation using 2x cos(pi/16) can overflow 32 bits.
This commit disables the function to allow further investigation and
optimization.
Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf
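The overflow risk can be illustrated with a small check. This is only a
sketch under assumptions: the constant below is cos(pi/16) in 14-bit fixed
point (16069), and product_fits_int32 is a hypothetical helper, not the
disabled function itself.

```c
#include <stdint.h>

/* Assumed Q14 constant: round(cos(pi/16) * 2^14). */
#define COS_PI_16_Q14 16069

/* Hypothetical helper: returns 1 if x * k is representable in 32 bits.
 * The product is formed in 64 bits so the check itself cannot overflow. */
static int product_fits_int32(int32_t x, int32_t k) {
  const int64_t p = (int64_t)x * (int64_t)k;
  return p >= INT32_MIN && p <= INT32_MAX;
}
```

With the doubled constant (2 * 16069 = 32138), an intermediate value of
about 67000 already exceeds the 32 bit range, while the same value times the
undoubled constant still fits.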
The version currently produces results that differ from the C version
for some inputs. Disable it for now to allow time to investigate the
source of the mismatch.
Change-Id: Id039455494ee531db4886a9f1fa4761174ef6df3
The default golden frame interval was doubled. After encoding a
frame, the background motion was measured. If the motion was high,
the current frame was set as the golden frame. Currently, the
changes are applied only when aq-mode 3 is on.
Borg tests (rtc set) showed a 0.226% PSNR gain and 0.312% SSIM gain.
No speed changes.
Change-Id: Id1e2793cc5be37e8a9bacec1380af6f36182f9b1
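The decision described above might look roughly like the following. All
names and the threshold parameter are hypothetical, and the sketch assumes
the periodic refresh at the (doubled) interval still applies alongside the
motion-triggered one.

```c
/* Hypothetical sketch: decide whether the frame just encoded should
 * become the golden frame. */
static int should_refresh_golden(int frames_since_golden,
                                 int golden_interval,
                                 double bg_motion,
                                 double high_motion_thresh) {
  /* Regular refresh once the interval has elapsed. */
  if (frames_since_golden >= golden_interval) return 1;
  /* Early refresh when the measured background motion is high. */
  return bg_motion > high_motion_thresh;
}
```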
Structured extended feature flags require eax = 7; checking this avoids
incorrectly detecting avx2 on some older processors that support avx.
From [1]:
INPUT EAX = 0: Returns CPUID’s Highest Value for Basic Processor
Information and the Vendor Identification String
[1] http://www.intel.com/content/www/us/en/processors/processor-identification-cpuid-instruction-note.html
Change-Id: I6b4735b5f7b7729a815e428fca767d1e5a10bcab
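A minimal sketch of the check, written as a pure function so it does not
depend on any particular cpuid wrapper. The function name is hypothetical;
the inputs are assumed to be EAX from cpuid leaf 0 and EBX from cpuid leaf 7
(subleaf 0), where bit 5 reports AVX2.

```c
#include <stdint.h>

#define CPUID_LEAF_EXT_FEATURES 7u
#define CPUID_7_EBX_AVX2 (1u << 5)

/* Hypothetical helper. max_basic_leaf is EAX returned for input EAX = 0;
 * leaf7_ebx is EBX returned for input EAX = 7, ECX = 0. */
static int cpu_reports_avx2(uint32_t max_basic_leaf, uint32_t leaf7_ebx) {
  /* If leaf 7 is not supported, its output is not meaningful. */
  if (max_basic_leaf < CPUID_LEAF_EXT_FEATURES) return 0;
  return (leaf7_ebx & CPUID_7_EBX_AVX2) != 0;
}
```

A complete detection would also need to confirm OS support for the AVX
register state; that part is omitted here.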
For color sampling formats other than 420, a partition size that is
valid for the Y plane may not work for the UV planes. This commit
validates the UV partition size before selecting the partition choice.
This fixes a crash in real-time encoding of 422 input.
Change-Id: I1fe3282accfd58625e8b5e6a4c8d2c84199751b6
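To illustrate why a valid Y partition can be invalid for UV, here is a
simplified model. It is not the codec's actual lookup table: it assumes
power-of-two block sides from 4 to 64 with an aspect ratio of at most 2:1,
and both function names are hypothetical.

```c
/* Simplified model of the supported block shapes (assumption). */
static int is_supported_block(int w, int h) {
  if (w < 4 || h < 4 || w > 64 || h > 64) return 0;
  return w == h || w == 2 * h || h == 2 * w;
}

/* ss_x / ss_y are the chroma subsampling shifts:
 * 420 -> (1, 1), 422 -> (1, 0), 444 -> (0, 0). */
static int uv_partition_is_valid(int y_w, int y_h, int ss_x, int ss_y) {
  return is_supported_block(y_w >> ss_x, y_h >> ss_y);
}
```

Under this model a 32x64 Y block is fine for 420 (16x32 chroma) but maps
to 16x64 chroma for 422, which is not a supported shape.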
(see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6)
For the test clip used, the decoder performance improved by ~2%.
This is also an intermediate step towards adding back the
mode_info streams.
Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d
The existing test was triggering a lot of false positives on some types
of animated material with very plain backgrounds. These were triggering
code designed to catch key frames in letter box format clips.
This patch tightens up the criteria and imposes a minimum requirement
on the % blocks coded intra in the first pass and the ratio between the
% coded intra and the modified inter % after discounting neutral (flat)
blocks that are coded equally well either way.
On a particular problem animation clip this change eliminated a large
number of false positives including some cases where the old code
selected kf several times in a row. Marginal false negatives are
typically less damaging to compression, and in the problem clip there
are now a couple of cases where "visual" scene cuts are ignored because
of well correlated content across the scene cut.
Replaced some magic numbers related to this with #defines and added
explanatory comments.
Change-Id: Ia3d304ac60eb7e4323e3817eaf83b4752cd63ecf
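The tightened criteria could be sketched as below. The #define names and
their values are placeholders chosen for illustration, not the ones added
by the patch, and the percentages are assumed to be fractions in [0, 1]
taken from the first pass stats.

```c
/* Placeholder thresholds (hypothetical values). */
#define MIN_INTRA_PCT 0.05
#define INTRA_TO_INTER_RATIO 2.0

/* Hypothetical helper: returns 1 if the frame passes the intra-based
 * part of the key frame test. */
static int passes_kf_intra_test(double pcnt_intra, double pcnt_inter,
                                double pcnt_neutral) {
  /* Discount neutral (flat) blocks that code equally well either way. */
  const double modified_pcnt_inter = pcnt_inter - pcnt_neutral;
  /* Minimum requirement on the share of blocks coded intra. */
  if (pcnt_intra < MIN_INTRA_PCT) return 0;
  /* Intra share must dominate the modified inter share. */
  return pcnt_intra > INTRA_TO_INTER_RATIO * modified_pcnt_inter;
}
```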