This commit fixes a compile error in vp9_idct.h, where the codec
checks that the intermediate steps of the transform fit within
16 bits. The error was caused by a broken file dependency.
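A minimal sketch of the kind of range check described above, assuming the
intermediate value is held in a wider integer. The name fits_in_int16 is
hypothetical and is not the function in vp9_idct.h.

```c
#include <stdint.h>

/* Hypothetical helper (not the codec's own function): returns 1 when an
 * intermediate transform value can be stored in a signed 16-bit integer
 * without overflow, and 0 otherwise. */
static int fits_in_int16(int32_t value) {
  return value >= INT16_MIN && value <= INT16_MAX;
}
```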
Change-Id: Ib22bba13a1e6df28489cb23d6774c561969f1fdc
When calculating delta in the VP8 denoiser, the block size is fixed at 16x16,
so the divisor is 256, the number of pixels in the block.
In VP9 the block size varies, so the divisor should correspond to the block
size.
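As an illustration only, the block-size-dependent divisor could be sketched as
below. The helper name and its arguments are assumptions, and any rounding the
real denoiser applies is not shown.

```c
/* Hypothetical helper: average per-pixel difference over a block.
 * sum_diff is the signed difference accumulated over the whole block.
 * The divisor is the pixel count of the block (256 for 16x16), not a
 * fixed constant. */
static int average_delta(int sum_diff, int block_width, int block_height) {
  const int num_pixels = block_width * block_height;
  return sum_diff / num_pixels;
}
```

With a fixed divisor of 256, an 8x8 block would have its delta underestimated
by a factor of four.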
Change-Id: Ibdc1e5d23ba8c788b0d0dc6d406bcdfc34c1b142
One is a more aggressive version of the pruned subpel tree
search where only a single halfpel candidate is searched.
The search candidate is based on a surface fit result.
The other is a method to obtain the subpel position in one
shot based on the same surface fit.
The methods have not been deployed in any speed setting yet.
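The surface model itself is not spelled out here. As a one-dimensional
illustration only, assuming a parabola is fitted through the costs at integer
offsets -1, 0 and +1, the subpel offset of the minimum could be estimated as
follows. The function name is hypothetical.

```c
/* Hypothetical 1-D illustration: fit a parabola through the costs at
 * integer offsets -1, 0 and +1 and return the offset of its minimum.
 * Returns 0.0 when the three points are not strictly convex, since the
 * fit then has no interior minimum. */
static double parabola_min_offset(double cost_left, double cost_center,
                                  double cost_right) {
  const double curvature = cost_left - 2.0 * cost_center + cost_right;
  if (curvature <= 0.0) return 0.0;
  return (cost_left - cost_right) / (2.0 * curvature);
}
```

A positive result points toward the right-hand neighbor, a negative one toward
the left; applying the same fit along the other axis would give a 2-D estimate.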
Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
This commit enables the encoder to skip checking ALTREF inter modes
in ARF coding, if the predicted motion vectors suggest that the
GOLDEN_FRAME provides higher prediction accuracy than ALTREF_FRAME.
It improves speed 3 encoding speed by about 5%, at the expense of a
compression performance change of -0.041% and -0.225% for derf and
stdhd, respectively.
pedestrian_area 1080p 2000 kbps
66705 b/f, 40.909 dB, 118738 ms ->
66732 b/f, 40.908 dB, 113688 ms
old_town_cross 720p 1500 kbps
14427 b/f, 36.256 dB, 62746 ms ->
14412 b/f, 36.252 dB, 60690 ms
blue_sky 1080p 1500 kbps
51026 b/f, 35.897 dB, 73310 ms ->
50921 b/f, 35.893 dB, 70406 ms
bus CIF 1000 kbps
21301 b/f, 34.841 dB, 7326 ms ->
21248 b/f, 34.837 dB, 7196 ms
Change-Id: I76cf88b4d655e1ee3c0cb03c8a5745493040e8d2
This patch re-enables the feature from Pengchong's patch
(commit 1286126073). Originally, it was turned on when
use_lastframe_partitioning > 0 (no longer used).
It is now a separate feature, turned on when speed >= 2.
As described in the original patch, this feature helps speed up
slideshows on YouTube.
Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471
The first comment is obsolete now that the behavior it describes is
normative in the VP9 bitstream. The second comment line was too long.
Change-Id: I6546585babf60d466485ddcf2daa6d2fa79e999a
As reported in issue #850, the condition for border extension was
incomplete. This commit adds the case where scaling is enabled.
This fixes issue #850.
Change-Id: I67768b23f0dcc4ac9a9aa0a0825b0fe8cb85a72e
Simplified the code and removed some code that is no longer used.
This patch does not change the encoding result.
Change-Id: I7e54a74c8f35a6726dfc8a1c55b337448b7ea124
The rd_thresholds are adapted based on the best mode tested.
Previously they were adapted only for the same block size; this commit
extends the adaptation to similar block sizes as well. The commit also
makes minor adjustments and code cleanups.
The impact on encoding time for _ped:
118089 ms -> 111927 ms
The impact on compression:
derf: -0.339%
stdhd: -0.303%
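A rough sketch of adapting threshold factors across neighboring block sizes.
Everything here is an assumption for illustration: the table dimensions, the
constants, the update rule and the "reach" parameter are hypothetical and are
not taken from the encoder.

```c
/* Illustrative sizes and constants; not the encoder's actual values. */
enum { NUM_BLOCK_SIZES = 13, NUM_MODES = 4 };
enum { FACT_INC = 1, FACT_MAX = 64 };

/* Hypothetical update: for every block size within `reach` of bsize,
 * lower the factor of the best mode (so it is tested more readily) and
 * raise the factors of all other modes, capped at FACT_MAX. */
static void adapt_thresh_factors(int fact[NUM_BLOCK_SIZES][NUM_MODES],
                                 int bsize, int best_mode, int reach) {
  int lo = bsize - reach;
  int hi = bsize + reach;
  int bs, mode;
  if (lo < 0) lo = 0;
  if (hi > NUM_BLOCK_SIZES - 1) hi = NUM_BLOCK_SIZES - 1;
  for (bs = lo; bs <= hi; ++bs) {
    for (mode = 0; mode < NUM_MODES; ++mode) {
      int *f = &fact[bs][mode];
      if (mode == best_mode) {
        *f -= *f >> 4;
      } else {
        *f += FACT_INC;
        if (*f > FACT_MAX) *f = FACT_MAX;
      }
    }
  }
}
```

With reach set to 0 this reduces to adapting the same block size only.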
Change-Id: I8817fed1102350497f2ec631849e43f753878e5d
Adds code to return an integer cost list for NSTEP search. Then
uses it for pruned subpel search in speed 3.
derf: -0.06%
Speed on mobcal 720p increases from 10.28 fps to 10.65 fps.
[Subject to further testing].
Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96
mi_grid_* are arrays of pointers to pointers. They store the pointers that
point to the MIs in cm->mi, but they are unnecessary and complicated. The
original goal was to remove the MODE_INFO_t copy, but with an extra
MODE_INFO_t pointer inside MODE_INFO_t, the same goal can be achieved.
This commit removes the mi_grid_* structures entirely. There are still
many dummy MODE_INFO_t entries inside cm->mi, which waste memory. The next
commit will do on-demand MODE_INFO_t allocation to save that memory.
Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6
vpx_svc_parameters_t contains the id, resolution and min/max qp for each spatial layer.
This change uses the extra config to send min/max qp and scaling factors, and then calculates the layer resolution inside the encoder.
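A minimal sketch of deriving a layer dimension from a scaling factor, assuming
the factor is expressed as a numerator/denominator pair. The struct and
function names are hypothetical, and the rounding used by the encoder is not
stated here.

```c
/* Hypothetical stand-in for a per-layer scaling factor. */
typedef struct {
  int num; /* numerator */
  int den; /* denominator, assumed non-zero */
} layer_scale_t;

/* Hypothetical helper: scale a full-resolution dimension for one layer. */
static int scaled_dimension(int full_size, layer_scale_t scale) {
  return full_size * scale.num / scale.den;
}
```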
Change-Id: Ib673303266605fe803c3b067284aae5f7a25514a
Substantial restructuring of the way we estimate
the rate of decay in prediction quality and determine
the arf interval and amount of boost used.
Also other changes to support moving to a lower first pass
Q which exposes some new features and allows us to better
distinguish genuinely static blocks from low motion or noisy
blocks.
Net gains now visible on all the test sets with std-hd PSNR up
1.87%. There are still some bad outlier cases but most of these
are low motion or slide show type content where the metrics
are already high at any given rate. The best case is up by
more than 10%.
Change-Id: I18e25170053bdf3188f493ff8062f48a74515815
This commit adds back sse2 or ssse3 optimized versions of a couple of
functions, fixing a ~10% performance regression.
Change-Id: I049786906e5a641224dced63c6492aec9d86d183
The ARF frame should always be the same size as the
native resolution of the input frames.
It will be scaled to the required resolution at
encode time.
Change-Id: I0afe858129aa6ef65b1648f43476331715346896
This commit makes the encoder use a non-zero mode threshold for
NEARESTMV modes. The runtime for test clips of speed 3 is reduced
by about 1%.
pedestrian 1080p 2000 kbps, 143239 ms -> 141989 ms
bus CIF 1000 kbps, 7835 ms -> 7749 ms
The compression performance change is about -0.02% for both derf
and stdhd.
Change-Id: Ib71808922c41ae2997100cb7c561f68dcebfa08e
If the partition block is skippable, which means no coefficients
for Y, U, and V planes, its skip flag is set to 1. No quality
change (verified by borg tests), and no noticeable speed change.
Change-Id: I9231f720f8dd6364384cf05aa148ca24d75450f1
Libvpx was calling memset on every external frame buffer before decode. This
was to work around a valgrind issue in our C loop filter. Most of
the time this was not needed, and we have noticed significant
performance loss on some platforms. The application is now required to
zero out the buffers if it is using external frame buffers.
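On the application side, one way to meet this requirement is to allocate the
buffers with calloc(). The sketch below uses a stand-in struct and a
hypothetical callback; the names and the simplified signature are assumptions
for illustration and are not the libvpx API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a frame buffer descriptor handed to the application.
 * Field names are assumptions for illustration. */
typedef struct {
  uint8_t *data;
  size_t size;
} example_frame_buffer_t;

/* Hypothetical "get buffer" callback: returns 0 on success, -1 on failure.
 * calloc() zero-fills the memory, so the decoder never reads stale bytes. */
static int example_get_frame_buffer(size_t min_size,
                                    example_frame_buffer_t *fb) {
  if (fb == NULL || min_size == 0) return -1;
  fb->data = (uint8_t *)calloc(min_size, sizeof(uint8_t));
  if (fb->data == NULL) return -1;
  fb->size = min_size;
  return 0;
}
```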
Change-Id: I7330d00a315e65137ed30edd5f813e8929b76242
This commit enforces ARF validation check for compound inter modes.
It avoids potential access to ARF in the encoding process if it
is not allowed.
Change-Id: I055fec946b5d19d97937dc9001e1e564923e2439
The valid reference frame check in sub8x8 rate-distortion
optimization search has been included in the ref_frame_skip_mask
scheme. This commit removes the later validation checks,
which no longer have any effect.
Change-Id: I853b477c44037d3dc0afec6cbfce08a96c597a75
This commit replaces the best_ref_index table fetch with the use
of best_mbmode in vp9_rd_pick_inter_mode_sub8x8.
Change-Id: I882ee9ee6a8c0e61befcca1f4dba6d2ea8de8f13
The issue was discovered on a bitstream with 2x vertical downscale. For
zero MVs, y_pad is set to 1 only when vertical convolution is
required. The original code assumes that for y_step_q4 == 32 we don't
perform vertical convolution. But vp9_setup_scale_factors_for_frame()
sets the convolve functions so that when x_step and y_step are both not
equal to 16, convolution is performed in both directions. And convolve()
unconditionally subtracts one stride from the source pointer when it calls
convolve_horiz(). This leads to invalid memory access.
Change-Id: I882dfa6081a58e172b5ffa55842bfcd6727f10bf
Call to vp9_rc_get_second_pass_params() moved from
Pass2Encode() to earlier in vp9_get_compressed_data(),
to ensure that two pass stats and parameters are
available before decisions such as frame scaling.
Change-Id: If21537f0073919b04696a7d5e9aac78e23d76f39
When a reference frame type is not in the frame buffer, the mode
search threshold will be set to INT_MAX, so as to effectively
turn off the mode entries in the rate-distortion optimization loop
that involves this reference frame type. This operation is now
integrated in the ref_frame_skip_mask scheme. This commit hence
removes the redundant mode search threshold setting.
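A minimal sketch of a skip-mask scheme of this kind, in which one bit per
reference frame type marks the frames the search loop should ignore. The
enum values and function names are assumptions for illustration, not copied
from the encoder.

```c
/* Illustrative reference frame indices and availability flags. */
enum { REF_LAST = 1, REF_GOLDEN = 2, REF_ALTREF = 3 };
enum { FLAG_LAST = 1, FLAG_GOLDEN = 2, FLAG_ALTREF = 4 };

/* Hypothetical helper: set one mask bit for every reference frame type
 * that is not available in the frame buffer. */
static unsigned build_ref_skip_mask(unsigned available_flags) {
  unsigned mask = 0;
  if (!(available_flags & FLAG_LAST)) mask |= 1u << REF_LAST;
  if (!(available_flags & FLAG_GOLDEN)) mask |= 1u << REF_GOLDEN;
  if (!(available_flags & FLAG_ALTREF)) mask |= 1u << REF_ALTREF;
  return mask;
}

/* Returns 1 when modes using ref_frame should be skipped in the search. */
static int should_skip_ref(unsigned mask, int ref_frame) {
  return (int)((mask >> ref_frame) & 1u);
}
```

Checking the mask once per mode makes a separate INT_MAX threshold for the
same reference frame unnecessary.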
Change-Id: Ib18f45da611afda2af275201efd367df7f5101ab
This commit unifies the reference frame control in the rate-
distortion optimization search loop of sub8x8 block size to remove
the control dependency on mode search order.
Change-Id: I3a174099f71a7cc176ede9fd60e2374243ae9232
Improves function to return sad of integer pels by reusing integer
pels already visited in the smallest scale.
Turns on BIGDIA search for speed 4. Also, turns on the
first version of the pruned subpel search at this speed.
derf: -0.32% (speed 4)
Speed seems to improve by at least 5% but subject to verification.
Change-Id: Iaec8eaffd61d6237ac029e6a2a1b0a88b2a35271
Adds various high bitdepth transform functions and tests.
Many of the changes relate to using the typedefs tran_low_t
and tran_high_t for the final transform coefficients and intermediate
stages of the transform computation, respectively, rather than the fixed
types int16_t/int. When the vp9_highbitdepth configure flag is off,
these map to int16_t/int32_t, but when the flag is on, they map
to int32_t/int64_t to make room for the extra precision needed.
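The mapping described above can be sketched as follows. EXAMPLE_HIGHBITDEPTH
is a hypothetical stand-in for the build flag; the macro name actually used by
the build system is not given here.

```c
#include <stdint.h>

/* EXAMPLE_HIGHBITDEPTH is a hypothetical stand-in for the configure flag.
 * It defaults to off in this sketch. */
#ifndef EXAMPLE_HIGHBITDEPTH
#define EXAMPLE_HIGHBITDEPTH 0
#endif

#if EXAMPLE_HIGHBITDEPTH
typedef int32_t tran_low_t;  /* final transform coefficients */
typedef int64_t tran_high_t; /* intermediate transform stages */
#else
typedef int16_t tran_low_t;  /* final transform coefficients */
typedef int32_t tran_high_t; /* intermediate transform stages */
#endif
```

Code written against the typedefs compiles unchanged in both configurations.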
Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8