Commit Graph

4567 Commits

Author SHA1 Message Date
Jingning Han
ed100c0b00 Fix an ioc issue in super_block_uvrd
This commit fixes an ioc issue that will happen when the cumulative
variables are not in effective use. The fix discards these
redundant additions.

Change-Id: Idbac5bfb989c0cedc5f8a323effce938519b2457
2014-10-16 11:07:39 -07:00
Paul Wilkins
716ae78ce4 Change initialization of static_scene_max_gf_interval.
This removes an unnecessary restriction that causes
a problem (noticed by AWG) when the forced key frame
interval is set to a very small value, such as 10. In this case
we were being forced to code minimal length GF groups.

Change-Id: I76ef5861a09638ff51f61fea02359554184ada53
2014-10-16 16:18:29 +01:00
Minghai Shang
68b550f551 [spatial svc]Another workaround to avoid using prev_mi
We encode an empty invisible frame in front of the base layer frame to
avoid using prev_mi. Since there's a restriction for reference frame
scaling factor, we have to make it smaller and smaller gradually until
its size is 16x16.

Change remerged.

Change-Id: I9efab38bba7da86e056fbe8f663e711c5df38449
2014-10-16 16:09:40 +01:00
Paul Wilkins
d5130af568 Revert "Move input frame scaling into the recode loop"
This reverts commit 452dc21500.

This change has introduced a significant quality regression on content
with forced key frames. (e.g. the YT and yt-hd set). It is most
noticeable in static content where the kf bits dominate. Here, despite
key frames being apparently coded at the same Q, there is a drop in all
metrics of ~20% (e.g. clXR and BFa0).

Change-Id: Iba14cc61778c0846fa0a59c33c55a9fc49512cb4
2014-10-16 15:54:40 +01:00
Paul Wilkins
468032961d Revert "[spatial svc]Another workaround to avoid using prev_mi"
This reverts commit c113457af9.

Temporary revert to allow clean revert of another commit.

Change-Id: Ia9b7b755e6c48e1b6e383329f121fef175a24b27
2014-10-16 15:52:08 +01:00
Marco
48ea5b7190 Merge "Some updates for Speed 6/VAR_BASED_PARTITION." 2014-10-15 15:57:21 -07:00
Jingning Han
e2612fbd70 Add init and reset functions for RD_COST struct
Change-Id: I2902de7051a883fd22e27a655209233733969cfd
2014-10-15 15:02:06 -07:00
Jingning Han
801b77d48c Merge "Replace copy_partitioning use case with choose_partitioning" 2014-10-15 14:54:52 -07:00
Jingning Han
5e766ccee0 Use rate/distortion thresholds to control non-RD partition search
Compare the estimated rate and distortion to the thresholds scaled
according to the operating block size and determine if further
split partition search will be run. The compression performance of
speed -5 is changed by -0.074%. The encoding speed is 10% - 15%
faster.

vidyo1 720p
16545 b/f, 40.492 dB, 11475 ms -> 16535 b/f, 40.486 dB, 10100 ms

nik720p
16624 b/f, 36.310 dB, 10071 ms -> 16617 b/f, 36.313 dB, 8346 ms

Change-Id: Ic9197ab5761279ae55d2fb7813b2af0e0db497b8
2014-10-15 13:40:33 -07:00
Marco
09ea74f194 Some updates for Speed 6/VAR_BASED_PARTITION.
Reduce the intra_cost_penalty for non-rd mode,
and some updates to VAR_BASED_PARTITION.

Visual tests show some improvement at Speed 6, for RTC clips.

Change-Id: If9090daf7aed14906a32d931a538ab544bbca606
2014-10-15 12:06:48 -07:00
Jingning Han
89b8c7a513 Replace copy_partitioning use case with choose_partitioning
This commit replaces the use of copy_partitioning with
choose_partitioning based on the sse of subsampled pixels, which
provides significantly better coding performance and runs at
similar speed, as compared to copy_partitioning. It improves rtc
speed 5 coding performance by 3%.

Change-Id: I52d3682a12dce0147f5e52383a594fc242ca3228
2014-10-15 11:37:20 -07:00
Minghai Shang
c113457af9 [spatial svc]Another workaround to avoid using prev_mi
We encode an empty invisible frame in front of the base layer frame to
avoid using prev_mi. Since there's a restriction for reference frame
scaling factor, we have to make it smaller and smaller gradually until
its size is 16x16.
Change-Id: I60b680314e33a60b4093cafc296465ee18169c19
2014-10-14 16:26:39 -07:00
Adrian Grange
2040bb58fb Merge "Move input frame scaling into the recode loop" 2014-10-14 15:30:42 -07:00
Alex Converse
00a9671bbd Merge "Add a 32-bit friendly sse2 quantizer." 2014-10-14 14:35:02 -07:00
Yunqing Wang
a614f2288c Remove an unneeded function call
set_tile_limits() is called in vp9_change_config() already.

Change-Id: I91c3a0df2c1c7fd7e71546d8f51fd5b65838a7da
2014-10-14 11:41:37 -07:00
Alex Converse
7497d2fb23 Add a 32-bit friendly sse2 quantizer.
This is based on the 64-bit ssse3 quantizer.

1.1x speedup for screen content at speed 7.

Change-Id: I57d15415ef97c49165954bbe3daaaf9318e37448
2014-10-14 11:37:41 -07:00
Jingning Han
f67e75a6f4 Merge "Refactor super_block_uvrd function to remove goto statement" 2014-10-14 11:33:00 -07:00
Jingning Han
f3a5de816d Refactor super_block_uvrd function to remove goto statement
Use return value 0/1 as indicator of the validity of the rate-
distortion cost.

Change-Id: I6244126fbf03472cebcba4f177a6cd329fae4743
2014-10-14 09:58:11 -07:00
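A minimal sketch of the pattern commit f3a5de816d describes: a function that reports validity through a 0/1 return value instead of jumping to a cleanup label. The function name, the per-plane arrays, and the use of INT_MAX as the "invalid" marker are assumptions made for illustration, not the actual super_block_uvrd code.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: accumulate per-plane rate and distortion.
 * Returns 1 when the totals are valid, 0 when any plane was marked
 * invalid (rate == INT_MAX, an assumed convention). */
static int sum_plane_costs(const int *rates, const int64_t *dists, int n,
                           int *rate, int64_t *dist) {
  int i;
  *rate = 0;
  *dist = 0;
  for (i = 0; i < n; ++i) {
    if (rates[i] == INT_MAX) {
      /* Early return replaces a "goto term" style exit. */
      *rate = INT_MAX;
      *dist = INT64_MAX;
      return 0;
    }
    *rate += rates[i];
    *dist += dists[i];
  }
  return 1;
}
```

A caller would test the return value before using the totals, which keeps the validity check at the call site rather than hidden in sentinel values.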
Adrian Grange
452dc21500 Move input frame scaling into the recode loop
Move the point at which input frames are scaled
into the recode loop. This will allow us to change
the coded frame size dynamically in response
to previous attempts to encode the frame at a
higher resolution.

A following patch will implement a scheme for
resizing the frame in the recode loop.

Change-Id: I6a59c02d6ac1626512edad6de8b60063b79433e6
2014-10-14 09:27:55 -07:00
Jingning Han
d0369d6fd4 Merge "Use speed feature variable in vp9_rd_pick_inter/intra_mode" 2014-10-14 09:10:24 -07:00
Jingning Han
fdf2205558 Merge "Fix vp9_rd_pick_inter/intra function types" 2014-10-14 09:10:11 -07:00
Jingning Han
790a96c94f Merge "Refactor rate distortion cost structure" 2014-10-14 08:58:55 -07:00
Jingning Han
69a09a70e9 Use speed feature variable in vp9_rd_pick_inter/intra_mode
Replace repeated fetch cpi->sf with a local sf pointer.

Change-Id: I5a55bba3e1c41fbdbc6ad5f078d2fa49dd95ee67
2014-10-13 16:15:00 -07:00
Jingning Han
3bdb6bfcee Fix vp9_rd_pick_inter/intra function types
The returned value is not used anywhere, hence changing the function
type into void.

Change-Id: I0ece49ed61e7aab6df01140135503ad41d4ef4a4
2014-10-13 16:00:46 -07:00
Jingning Han
811cef97c9 Refactor rate distortion cost structure
This commit makes a struct that contains rate value, distortion
value, and the rate-distortion cost. The goal is to provide a
better interface for rate-distortion related operation. It is
first used in rd_pick_partition and saves a few RDCOST calculations.

Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f
2014-10-13 14:27:16 -07:00
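The struct described in commit 811cef97c9, together with init and reset helpers of the kind e2612fbd70 adds later in this log, could look roughly like the sketch below. The type and function names are hypothetical, and the cost formula (lambda * rate + distortion) is a simplified stand-in rather than libvpx's RDCOST macro.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical struct holding rate, distortion, and their combined cost. */
typedef struct {
  int rate;       /* estimated bits */
  int64_t dist;   /* distortion */
  int64_t rdcost; /* combined rate-distortion cost */
} rd_cost_sketch;

/* Reset: mark the entry as "no valid result yet". */
static void rd_cost_reset(rd_cost_sketch *c) {
  c->rate = INT_MAX;
  c->dist = INT64_MAX;
  c->rdcost = INT64_MAX;
}

/* Init: start from zero so costs can be accumulated. */
static void rd_cost_init(rd_cost_sketch *c) {
  c->rate = 0;
  c->dist = 0;
  c->rdcost = 0;
}

/* Assumed cost model: lambda * rate + dist. Invalid inputs stay invalid. */
static void rd_cost_update(rd_cost_sketch *c, int64_t lambda) {
  if (c->rate == INT_MAX || c->dist == INT64_MAX) {
    c->rdcost = INT64_MAX;
    return;
  }
  c->rdcost = lambda * c->rate + c->dist;
}

/* Returns 1 when candidate a is cheaper than b. */
static int rd_cost_better(const rd_cost_sketch *a, const rd_cost_sketch *b) {
  return a->rdcost < b->rdcost;
}
```

Caching rdcost in the struct means a comparison between two candidates does not need to recompute the cost, which matches the commit's note about saving a few RDCOST calculations.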
Paul Wilkins
6dbb9e4d44 Clamp rate error estimate.
Add back clamp which ensures that the Q adaptation
is turned off when the over_shoot_pct and under_shoot_pct
parameters are set to 100.

Change-Id: Id0161b114d39a3029cd3eb28020caab0c3914922
2014-10-13 18:07:58 +01:00
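One way the clamp in commit 6dbb9e4d44 could switch adaptation off at 100%/100% is sketched below. All names, the percentage formula, and the strict comparisons are assumptions for illustration; the real rate control code may differ.

```c
/* Clamp v into [lo, hi]. */
static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical rate error in percent: positive means overshoot.
 * The result is clamped to [-100, 100]. */
static int rate_error_pct(long long actual_bits, long long target_bits) {
  long long pct;
  if (target_bits <= 0) return 0;
  pct = (actual_bits - target_bits) * 100 / target_bits;
  if (pct > 100) pct = 100;
  if (pct < -100) pct = -100;
  return clamp_int((int)pct, -100, 100);
}

/* Returns +1 to let the Q limits creep up (overshoot), -1 to let them
 * creep down (undershoot), 0 for no adaptation. With both thresholds at
 * 100 the clamped error can never exceed them, so adaptation is off. */
static int q_adapt_direction(int err_pct, int under_shoot_pct,
                             int over_shoot_pct) {
  if (err_pct > over_shoot_pct) return 1;
  if (err_pct < -under_shoot_pct) return -1;
  return 0;
}
```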
Paul Wilkins
f7f0eaa581 Add adaptation option for VBR.
Allow min and maxQ to creep when the undershoot
or overshoot exceeds thresholds controlled by the
command line under_shoot_pct and over_shoot_pct
values.

Default is 100%,100% which ~disables adaptation.

Derf results for example undershoot% / overshoot%:-

Head:- Mean abs (%rate error) = 14.4%

This check in:-
25%/25% - Mean abs (%rate error) = 6.7%
                  PSNR hit -1% SSIM -0.1%

5% / 5%  - Mean abs (%rate error) = 2.2%
                 PSNR hit -3.3% SSIM - 1.1%

Most of the remaining error and most of the quality hit is
at extreme data rates. The adaptation code still has an
exception for material that is in effect static so that we
don't over adjust and over spend on YT slide show type
content.

(Rebase of If25a2449a415449c150acff23df713e9598d64c9
to resolve an auto-merge error)

Change-Id: Iec4e1613ef0d067454751d8220edb7058dfbd816
2014-10-13 10:16:44 +01:00
Jingning Han
a62acf3c0a Fix ActiveMapTest valgrind warning
This fixes a valgrind warning in the ActiveMapTest unit test
reported in issue 870.

Change-Id: Idf172ab0244ebefe630c3577e649bc9ba7c43d10
2014-10-11 22:36:58 -07:00
Alex Converse
a90255c366 Revert "Add adaptation option for VBR."
This reverts commit 869d4ca519.

This breaks the build via conflict with
e18edd5eb6.

Change-Id: If544b99e367a449452834eb8cce600f58c34ec0d
2014-10-10 11:34:00 -07:00
Paul Wilkins
169949dd74 Merge "Add adaptation option for VBR." 2014-10-10 09:22:58 -07:00
Yaowu Xu
bdea0055b2 Merge "vp9/choose_partitioning: add missing clear_system_state" 2014-10-10 09:16:19 -07:00
James Zern
a3e1a9291a vp9/choose_partitioning: add missing clear_system_state
set_vt_partitioning does double math

Change-Id: I8e9d73d5c89b937a5326abf04164d24d9d88c5ef
2014-10-10 08:14:46 -07:00
Paul Wilkins
869d4ca519 Add adaptation option for VBR.
Allow min and maxQ to creep when the undershoot
or overshoot exceeds thresholds controlled by the
command line under_shoot_pct and over_shoot_pct
values.

Default is 100%,100% which ~disables adaptation.

Derf results for example undershoot% / overshoot%:-

Head:- Mean abs (%rate error) = 14.4%

This check in:-
25%/25% - Mean abs (%rate error) = 6.7%
                  PSNR hit -1% SSIM -0.1%

5% / 5%  - Mean abs (%rate error) = 2.2%
                 PSNR hit -3.3% SSIM - 1.1%

Most of the remaining error and most of the quality hit is
at extreme data rates. The adaptation code still has an
exception for material that is in effect static so that we
don't over adjust and over spend on YT slide show type
content.

Change-Id: If25a2449a415449c150acff23df713e9598d64c9
2014-10-10 12:54:16 +01:00
James Zern
7c6fec672f vp9_avg_intrin_sse2: correct intrinsics include
immintrin.h -> emmintrin.h
fixes build where newer intrinsics are unavailable

Change-Id: I79311b39bfa782fc2abeb45884ecb417050cb9f8
2014-10-10 10:05:47 +02:00
Deb Mukherjee
9a29fdbae7 Merge "Rename highbitdepth functions to use highbd prefix" 2014-10-09 15:39:56 -07:00
Deb Mukherjee
1929c9b391 Rename highbitdepth functions to use highbd prefix
Uses highbd_ prefix convention consistently.

Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-09 14:40:40 -07:00
Jingning Han
112789d4f2 Merge "Remove sub8x8 block index from rd_pick_partition argument" 2014-10-09 11:16:11 -07:00
Deb Mukherjee
3117830af3 Merge "Subpel search cleanups and enhancements" 2014-10-09 11:14:51 -07:00
Alex Converse
9ffbc31367 Merge "Move the high freq coeff check outside store_coding_context" 2014-10-09 11:12:02 -07:00
Jingning Han
6a0d291fce Remove sub8x8 block index from rd_pick_partition argument
This parameter is deprecated. Its function is replaced with
other explicit condition check.

Change-Id: I61337e350ba8ca9eb50382db8b4d4acbf45cb7eb
2014-10-09 09:20:16 -07:00
Yaowu Xu
3af39b9997 Merge "Fix src frame buffer copy and extend" 2014-10-09 07:08:48 -07:00
James Zern
cec763bd97 set_vt_partitioning: fix type conversion warning
double -> int64
+ make threshold_multiplier an int

Change-Id: I6d3607fdf13d670f57c9d9b04a80acb2be1346a0
2014-10-09 11:41:36 +02:00
Deb Mukherjee
d78dbff09a Subpel search cleanups and enhancements
- Some fixes to surface fit.
- Returns variance function as cost rather than sad in the
  pattern search and diamond search functions. Only
  vp9_pattern_search_sad function used in bigdia search
  uses sad as integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.

Results:
derf [Speed 3]: About +0.036% in coding efficiency without any
discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.

Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
2014-10-08 23:59:43 -07:00
Yunqing Wang
189566db58 Merge "Allow mode search breakout at very low prediction errors" 2014-10-08 19:58:18 -07:00
Yunqing Wang
e18edd5eb6 Allow mode search breakout at very low prediction errors
In model_rd_for_sb function, the spatial domain SSE and variance
are checked to see if transform coefficients are quantized to 0.
Besides that, this patch adds another set of thresholds that are
much more strict. These thresholds are used to conduct a partition
block level check to measure if all its TX blocks are skippable
for YUV planes. If it is true, x->skip is set for this partition
block, and thus its mode search is terminated.

This speeds up the encoding at very low prediction error case,
such as screen sharing application. This patch covers what
rd_encode_breakout_test() does, so that function is removed.

Borg test at speed 3 shows:
For stdhd set, psnr: +0.008%, ssim: +0.014%;
For derf set, psnr: +0.018%, ssim: +0.025%.
No noticeable speed change.

Change-Id: I4e5f15cf10016a282a68e35175ff854b28195944
2014-10-08 17:46:22 -07:00
Jingning Han
5fcbcf1b22 Move the high freq coeff check outside store_coding_context
This fixes valgrind message issue 870.

Change-Id: Ibbc2481923a2995029ab05de30c9e8a6e9f0f9a8
2014-10-08 16:10:32 -07:00
Jingning Han
41cea46154 Use local variable in vp9_rd_pick_inter_mode_sb
Change-Id: Ie35a965a6b8de536ccaf61ff61498620d22db205
2014-10-08 16:09:47 -07:00
Yaowu Xu
d602500d4a Fix src frame buffer copy and extend
For input source with size that is not multiple of 8, the size is
rounded to 8 and saved in width or height, the original source sizes
are saved in crop_width and crop_height. This commit corrects the
computation of bottom and right extension amounts to use the original
sizes, hence crop_width and crop_height.

In addition, this commit also adds the missed initialization for
uv_crop_width and uv_crop_height.

This addresses issue #834

Change-Id: I084543ca7645a4964b88f7cf8ff668f517d3a39b
2014-10-08 11:07:04 -07:00
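The arithmetic behind commit d602500d4a can be sketched as follows. The struct, the border field, and the exact extension formula are assumptions for illustration; only the roles of width/height (rounded to a multiple of 8) and crop_width/crop_height (original size) come from the commit message.

```c
/* Hypothetical frame dimensions. */
typedef struct {
  int width, height;           /* rounded up to a multiple of 8 */
  int crop_width, crop_height; /* original source size */
  int border;                  /* assumed padding around the frame */
} frame_dims;

static int align8(int v) { return (v + 7) & ~7; }

static frame_dims make_dims(int src_w, int src_h, int border) {
  frame_dims d;
  d.width = align8(src_w);
  d.height = align8(src_h);
  d.crop_width = src_w;
  d.crop_height = src_h;
  d.border = border;
  return d;
}

/* Extension is measured from the last real source pixel, so it uses the
 * crop size and covers the alignment padding as well as the border. */
static int right_extend(const frame_dims *d) {
  return d->border + d->width - d->crop_width;
}

static int bottom_extend(const frame_dims *d) {
  return d->border + d->height - d->crop_height;
}
```

For a 350x286 source with a 32-pixel border, the aligned size is 352x288 and both extensions come out as 34 under these assumptions; for a source that is already a multiple of 8 they equal the border.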
Jim Bankoski
20254d1daa Merge "experimental : partition using 1/8 x 1/8 image" 2014-10-08 09:04:26 -07:00
Jim Bankoski
4130e691bb Merge "Force better lower quantizer keyframe in case of high quantizer." 2014-10-08 09:04:01 -07:00
Paul Wilkins
2faff64866 Merge "Improve two pass VBR accuracy." 2014-10-08 04:23:30 -07:00
Paul Wilkins
679a1c2f5a Merge "Two pass rc changes." 2014-10-08 04:23:20 -07:00
Jim Bankoski
0ce51d823f experimental : partition using 1/8 x 1/8 image
The concept:

There's too much noise in source pixels for variance and at low bitrate
the reconstructed looks nothing like the source so we have problems
getting good partitionings with either.   This skirts the issue by using
a box blur scaled down version for variance calculations.  To compare
against source_var_ moved keyframe to be rd based like source_var.

Change-Id: Ie3babdbfadae324b7b5a76bea192893af27f0624
2014-10-07 16:36:14 -07:00
Jim Bankoski
dae97868da Force better lower quantizer keyframe in case of high quantizer.
Change-Id: Ie69a164bc166b6a8819777038d65a7d9f9c3361f
2014-10-07 15:40:02 -07:00
Jingning Han
3bbec7b422 Merge "Replace mi_width_log2() with mi_width_log2_lookup table" 2014-10-07 15:33:52 -07:00
Jingning Han
27c9577f8e Merge "Take out repeated block width/height lookup functions" 2014-10-07 15:33:45 -07:00
Yunqing Wang
0cac69f594 Merge "Fix skip_txfm issue in rdopt code" 2014-10-07 15:13:44 -07:00
Jingning Han
bd9706506f Merge "Move inter filter defs to vp9_filter.h" 2014-10-07 13:42:26 -07:00
Jingning Han
91442e1626 Merge "Reduce the scope of the header file used in vp9_context_tree.h" 2014-10-07 13:42:17 -07:00
Jingning Han
92b45e5d4e Merge "Remove redundant header file from vp9_encoder.h" 2014-10-07 13:42:13 -07:00
Yunqing Wang
a4aa14020a Fix skip_txfm issue in rdopt code
Fixed an encoder crash. Set skip_txfm to 0 for cases that skip_txfm
isn't calculated. Put memcpy of skip_txfm at right place.

Change-Id: Ib3b6afc1b251a85b2a853c8138fb3393f48cfef6
2014-10-07 12:47:43 -07:00
Jingning Han
7ee58985bd Replace mi_width_log2() with mi_width_log2_lookup table
Change-Id: If0ea98aa139d14d40cd924114e18396aff36b5a5
2014-10-07 12:45:25 -07:00
Jingning Han
b66f7016c1 Take out repeated block width/height lookup functions
The functions b_width_log2 and b_height_log2 only do direct
table fetch. This commit unifies such use cases by using the
table directly and removes these functions.

Change-Id: I3103fc6ba959c1182886a2799d21b8b77c8a7b6b
2014-10-07 12:33:07 -07:00
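The change in commit b66f7016c1 amounts to dropping a wrapper that only indexes a table. A reduced sketch, with a hypothetical five-entry enum and table in place of the real block-size list:

```c
/* Hypothetical, shortened block-size enum. */
typedef enum {
  BLK_4X4,
  BLK_8X8,
  BLK_16X16,
  BLK_32X32,
  BLK_64X64,
  BLK_COUNT
} blk_size;

/* log2 of the block width, measured in 4-pixel units. */
static const unsigned char blk_width_log2_lookup[BLK_COUNT] = {0, 1, 2, 3, 4};

/* Before: a wrapper whose only job is the table fetch. */
static int blk_width_log2(blk_size b) { return blk_width_log2_lookup[b]; }

/* After: callers index the table directly. */
static int blk_width_pixels(blk_size b) {
  return 4 << blk_width_log2_lookup[b];
}
```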
Jingning Han
5d9cdac087 Move inter filter defs to vp9_filter.h
Add comments on the use case of these definitions. Further reduce
the scope of header file in vp9_context_tree.h.

Change-Id: Ic4a7638e838d0ac441b64abfc56e57354c059d75
2014-10-07 12:16:37 -07:00
Deb Mukherjee
cfc337aae8 Merge "Resolves some static analysis / undefined warnings" 2014-10-07 12:15:26 -07:00
Deb Mukherjee
fced63ed30 Resolves some static analysis / undefined warnings
Also fixes a case of distortion becoming negative and messing
up the RDCOST computation.

Change-Id: Id345af9e8dfff31ade622be5756e51f2cdface53
2014-10-07 11:20:56 -07:00
Jingning Han
5e32036b97 Reduce the scope of the header file used in vp9_context_tree.h
Change-Id: I264ee35044a5973c7725daba7af870968353a3c1
2014-10-07 11:13:35 -07:00
JackyChen
a9f479682a Merge "Add SSE2 code and unit test for VP9 denoiser." 2014-10-07 10:51:55 -07:00
Jingning Han
3f93f23120 Remove redundant header file from vp9_encoder.h
Change-Id: Ia212390cf8d36db5436bb0f0e1b696f70066341a
2014-10-07 10:49:58 -07:00
Jingning Han
a75551585b Fix eobs buffer pointer mis-use
This commit fixes a buffer pointer mis-use in store_coding_context.
The compression performance for stdhd set of speed 3 is improved by
0.097%. It fixes issue 869.

Change-Id: Idc59e22035eaf39f7133ca04174894374d647ff7
2014-10-06 15:57:13 -07:00
JackyChen
80465dae88 Add SSE2 code and unit test for VP9 denoiser.
This SSE2 is based on VP8 denoiser's SSE2 code. In VP8, there are
only 16x16 blocks in denoiser, while in VP9, there are 13 different
block sizes.

By adding this SSE2 code, the improvement of encoder speed is around
20% (C code vs. SSE2 code), varying across clips.

The unit test for VP9 denoiser is to confirm that the SSE2 code is
bit-exact with the C code. The unit test covers all block sizes.

Change-Id: Ic8d8ac26db4ea40a5f146b5678a065af07eaaa3d
2014-10-06 15:27:40 -07:00
Jingning Han
1b8c57e915 Merge "Fix an IOC issue in vp9_rd_pick_inter_mode_sb" 2014-10-06 09:29:29 -07:00
Yaowu Xu
5966acc1be Merge "Properly initialize segmentID in nonrd coding path" 2014-10-06 07:57:36 -07:00
Paul Wilkins
0e1068a4bd Improve two pass VBR accuracy.
Adjustments to the GF interval choice and minimum boost.
Adjustment to the calculation of 2 pass worst q.
Compared to 09/29 head there is a metrics hit on derf of
(-0.123%,-0.191%)

Compared to the September 29 head and a September 18
baseline, the accuracy of the VBR rate control
measured on the derf set is as follows:-

Mean error %  / Mean abs(error %)
Sept 18 baseline (-7.0% / 14.76%)
Sept 29 head (-15.7%, 19.8%)
This check in (-1.5% / 14.4%)

The mean undershoot is reduced slightly but the
worst case overshoot on e.g. harbour/highway is
increased. This will be addressed in a later patch.

Change-Id: Iffd9b0ab7432a131c98fbaaa82d1e5b40be72b58
2014-10-06 14:20:09 +01:00
Jingning Han
085b97aa5c Fix an IOC issue in vp9_rd_pick_inter_mode_sb
It is possible that the GOLDEN reference frame is not available, in
which case the predicted mv will be associated with a residual
value of INT_MAX. This commit checks this condition before
left shift and comparison with that of ALTREF frame, to avoid
overflow issue.

Change-Id: Ib98c3149dbdd016f2fe5beaafb13f67d469dd07c
2014-10-05 12:05:14 -07:00
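The guard described in commit 085b97aa5c can be sketched as below. The function name and the shift amount of 1 are assumptions for illustration; the point is that the INT_MAX sentinel is tested before any shift is applied.

```c
#include <limits.h>

/* Hypothetical comparison: is the GOLDEN prediction error, doubled,
 * still below the ALTREF error? INT_MAX marks "reference unavailable"
 * and must be rejected before shifting, since INT_MAX << 1 overflows. */
static int golden_clearly_better(int golden_err, int altref_err) {
  if (golden_err == INT_MAX) return 0;
  /* Widen before shifting so large valid errors cannot overflow either. */
  return ((long long)golden_err << 1) < (long long)altref_err;
}
```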
Jingning Han
a021924092 Merge "Fix indent in encode_rd_sb_row" 2014-10-03 15:24:02 -07:00
Jingning Han
a1088e0b5f Merge "Rework partition search skip scheme" 2014-10-03 15:23:54 -07:00
Yaowu Xu
0065b73481 Properly initialize segmentID in nonrd coding path
This commit adds proper initialization of segment id for variance AQ
mode in non-rd coding path. It fixes the enc/dec mismatch issue of
rt=7 with --aq-mode=1, as reported in issue #816

Change-Id: I02fa41b96345bf2e66077d5ea553f85ba800f7bb
2014-10-03 15:01:53 -07:00
Deb Mukherjee
8a01074d04 Merge "Incorporate WRAPLOW macro into non-highbitdepth tx" 2014-10-03 12:45:39 -07:00
Jingning Han
ef62233396 Fix indent in encode_rd_sb_row
Change-Id: Icbcfe7b56d88474f4398b4c5b52f6719d551ab4a
2014-10-03 11:57:36 -07:00
Jingning Han
bb260d9076 Rework partition search skip scheme
This commit enables the encoder to skip split partition search if
the bigger block size has all non-zero quantized coefficients in low
frequency area and the total rate cost is below a certain threshold.
It logarithmically scales the rate threshold according to the
current block size. For speed 3, the compression performance loss:
derf  -0.093%
stdhd -0.066%

Local experiments show 4% - 20% encoding speed-up for speed 3.
blue_sky_1080p, 1500 kbps
51051 b/f, 35.891 dB, 67236 ms ->
50554 b/f, 35.857 dB, 59270 ms (12% speed-up)

old_town_cross_720p, 1500 kbps
14431 b/f, 36.249 dB, 57687 ms ->
14108 b/f, 36.172 dB, 46586 ms (19% speed-up)

pedestrian_area_1080p, 1500 kbps
50812 b/f, 40.124 dB, 100439 ms ->
50755 b/f, 40.118 dB,  96549 ms (4% speed-up)

mobile_calendar_720p, 1000 kbps
10352 b/f, 35.055 dB, 51837 ms ->
10172 b/f, 35.003 dB, 44076 ms (15% speed-up)

Change-Id: I412e34db49060775b3b89ba1738522317c3239c8
2014-10-03 11:54:30 -07:00
Deb Mukherjee
d50716face Incorporate WRAPLOW macro into non-highbitdepth tx
Incorporates the WRAPLOW macro into the non-highbitdepth transforms
to aid hardware verification between a software C model and an
intended hardware implementation through the use of the configure
options: --enable-experimental --enable-emulate-hardware.
Note that to avoid further discrepancies between the sse/sse2
implementations of the transforms and the C implementation, when the
emulate hardware option is invoked, we also disable sse/sse2/etc.

Also includes some minor cleanups/renaming etc.

Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287
2014-10-03 11:38:05 -07:00
Deb Mukherjee
2c7b94f6ec Merge "Prevent negative cost for highbitdepth" 2014-10-03 11:37:47 -07:00
Deb Mukherjee
431cdc33ee Prevent negative cost for highbitdepth
Adds proper scaling for highbitdepth in a rdopt cost.

Change-Id: I066694799a7f491b830945ef1c66eb202071c355
2014-10-03 10:22:21 -07:00
Deb Mukherjee
00a4b20fbe rdmult data type change
To fix a VS warning.

Change-Id: I4c530c0afe8d06acdb8cc78b7995aba57a25373d
2014-10-03 00:09:41 -07:00
Yaowu Xu
f809475c73 Merge "Make iscan and scan neighbor arrays static const." 2014-10-02 15:15:58 -07:00
Yaowu Xu
9712bc691d Make iscan and scan neighbor arrays static const.
This commit changes the tables to be read only, which fixes
issue #866

Change-Id: I85bbe03f9d344f50570f8c1c61699bdc5cee248f
2014-10-02 14:08:14 -07:00
Alex Converse
a0befb93e7 Fix subsampling check for images 1 pixel wide/tall
Change-Id: I0e262ede7eb4a4ae0c86181922d744e542e93350
2014-10-02 11:02:57 -07:00
Deb Mukherjee
35e8fa1458 rdmult data type change to fix high bit-depth
Fixes an intermittent assert failure for highbitdepth.

Change-Id: If8cad0209a94f1184b69c7b3f1d587934f857d9b
2014-10-02 07:37:26 -07:00
Jingning Han
9641b1b9ac Merge "Remove unused header files from vp9_encodemb.h" 2014-10-01 17:05:25 -07:00
Deb Mukherjee
30fbf23fda Merge "High-bitdepth bugfixes" 2014-10-01 16:47:43 -07:00
Yunqing Wang
e350e3fe68 Merge "Modify block transform skipping check" 2014-10-01 16:19:56 -07:00
Jingning Han
72a78a0c40 Remove unused header files from vp9_encodemb.h
Change-Id: Icfc3fb62cc0b05e435814035bfe1f2e2870442b4
2014-10-01 14:50:24 -07:00
Deb Mukherjee
a160d72522 High-bitdepth bugfixes
Miscellaneous bug-fixes for high bitdepth functionality.
With this patch, high bit-depth profiles become mostly functional,
except for an intermittent assert failure issue that is being
tracked.

Change-Id: I6a7fcbdcf1e5b09842e88535f8442d2e1230748c
2014-10-01 14:18:11 -07:00
Jingning Han
0a9f5fa146 Remove repeated header files from vp9_block.h
This commit removes unused header file vp9_onyxc_int.h and repeatedly
included file vpx_ports/mem.h from vp9_block.h

Change-Id: I400b210bd1da48f1880bd50a8f4a6e2c690e15a1
2014-10-01 13:01:43 -07:00
Yunqing Wang
e4aac6bb61 Modify block transform skipping check
Block transform skipping was implemented based on DCT's energy
conservation property. Modified the thresholds using zero bin
parameters. AC and DC coefficients were checked separately to
allow better identifying of skippable blocks.

Borg test at speed 3 showed:
stdhd set: psnr gain: 0.153%, ssim gain: 0.051%;
derf set: psnr gain: 0.023%, ssim gain: 0.036%

For most test clips, the encoding speedup is 1% - 2%.
parkrun(720p): 7.5% speedup, park_joy(1080p): 3.5% speedup.

Change-Id: If28eb81113a077414f5ca7b021c14f9069b373bb
2014-10-01 12:58:09 -07:00
Jingning Han
20a37391d9 Merge "Conditionally skip reference frame check" 2014-10-01 11:19:10 -07:00
Jingning Han
891793a540 Conditionally skip reference frame check
For regular inter frames, if the distance from GOLDEN_FRAME is larger
than 2 and if the predicted motion vector of LAST_FRAME gives lower
sse than that of GOLDEN_FRAME, skip the GOLDEN_FRAME mode checking in
the rate-distortion optimization. It provides about 5% speed-up at
the expense of a -0.137% and -0.230% performance change for speed 3. Local
experiment results:

pedestrian 1080p 2000 kbps
66712 b/f, 40.908 dB, 113688 ms ->
66768 b/f, 40.911 dB, 108752 ms

blue_sky 1080p 2000 kbps
51054 b/f, 35.894 dB, 70406 ms ->
51051 b/f, 35.891 dB, 67236 ms

old_town_cross 720p 1500 kbps
14412 b/f, 36.252 dB, 60690 ms ->
14431 b/f, 36.249 dB, 57346 ms

Change-Id: Idfcafe7f63da7a4896602fc60bd7093f0f0d82ca
2014-10-01 08:32:15 -07:00
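The skip condition in commit 891793a540 reduces to a three-part predicate. The sketch below uses hypothetical parameter names; how the encoder actually tracks the distance from the golden frame and the predicted-mv SSE values is not shown in this log.

```c
/* Returns 1 when GOLDEN_FRAME mode checking can be skipped:
 * a regular inter frame, more than 2 frames from the golden frame,
 * and LAST_FRAME's predicted mv gives a lower SSE than GOLDEN_FRAME's. */
static int skip_golden_check(int is_regular_inter_frame,
                             int frames_since_golden,
                             unsigned int last_pred_mv_sse,
                             unsigned int golden_pred_mv_sse) {
  return is_regular_inter_frame && frames_since_golden > 2 &&
         last_pred_mv_sse < golden_pred_mv_sse;
}
```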
Yunqing Wang
b1b6fd85db Merge "Skip the partition search for still frames" 2014-09-30 11:59:05 -07:00
Yunqing Wang
c8d01b1eaf Merge "Refactor encode_rd_sb_row function" 2014-09-30 11:58:39 -07:00
Deb Mukherjee
40479dfe92 Misc. high-bit-depth fixes
Change-Id: Ie9fb6a4078eb6a3fb7c4ff1453831ab9afe23121
2014-09-30 10:37:53 -07:00
Deb Mukherjee
63e49be340 Merge "Adds two new subpel search methods" 2014-09-29 20:11:04 -07:00
JackyChen
7ba646f7e6 Fix a bug in calculating delta in VP9 denoiser.
When calculating delta in the VP8 denoiser, the block size is fixed to 16x16,
so the divisor is 256, which is the number of pixels.
In VP9 the block size varies, so the divisor should correspond to the block
size.

Change-Id: Ibdc1e5d23ba8c788b0d0dc6d406bcdfc34c1b142
2014-09-29 13:09:18 -07:00
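The fix in commit 7ba646f7e6 can be illustrated with a small sketch. The function names are hypothetical and the actual denoiser computes delta in more context than this; the sketch only contrasts a fixed divisor of 256 with one derived from the block size.

```c
/* VP8-style assumption: every block is 16x16, so divide by 256. */
static int mean_delta_fixed16(int sum_diff) { return sum_diff / 256; }

/* Block-size-aware version: divide by the actual number of pixels. */
static int mean_delta(int sum_diff, int width, int height) {
  const int num_pixels = width * height;
  return sum_diff / num_pixels;
}
```

For a 16x16 block the two agree; for an 8x8 block with the same summed difference the fixed divisor understates the per-pixel delta by a factor of four.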
Deb Mukherjee
4e9c0d2ad4 Adds two new subpel search methods
One is a more aggressive version of the pruned subpel tree
search where only a single halfpel candidate is searched.
The search candidate is based on a surface fit result.
The other is a method to obtain the subpel position at one
shot based on the same surface fit.

The methods have not been deployed in any speed setting yet.

Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
2014-09-29 12:51:20 -07:00
Jingning Han
8b4dd536a5 Merge "Skip certain ALTREF inter modes in ARF coding" 2014-09-29 10:43:45 -07:00
Deb Mukherjee
d4713f1d50 Fix a bug introduced in a previous patch on highbd
Change-Id: Ice692334f75157446a44a6e81503cada977934f4
2014-09-26 15:43:55 -07:00
Jingning Han
ccdb518ff8 Skip certain ALTREF inter modes in ARF coding
This commit enables the encoder to skip checking ALTREF inter modes
in ARF coding, if the predicted motion vectors suggest that the
GOLDEN_FRAME provides higher prediction accuracy than ALTREF_FRAME.

It improves the speed 3 encoding speed by about 5%, at the expense
of compression performance loss -0.041% and -0.225% for derf and
stdhd, respectively.

pedestrian_area 1080p 2000 kbps
66705 b/f, 40.909 dB, 118738 ms ->
66732 b/f, 40.908 dB, 113688 ms

old_town_cross 720p 1500 kbps
14427 b/f, 36.256 dB, 62746 ms ->
14412 b/f, 36.252 dB, 60690 ms

blue_sky 1080p 1500 kbps
51026 b/f, 35.897 dB, 73310 ms ->
50921 b/f, 35.893 dB, 70406 ms

bus CIF 1000 kbps
21301 b/f, 34.841 dB, 7326 ms ->
21248 b/f, 34.837 dB, 7196 ms

Change-Id: I76cf88b4d655e1ee3c0cb03c8a5745493040e8d2
2014-09-26 12:53:43 -07:00
Paul Wilkins
d3bbd87d5e Two pass rc changes.
Adjustments to the GF interval choice and minimum boost.

Change-Id: I29951621484e1ee339adfb73ab430aa65f310ad8
2014-09-26 17:13:02 +01:00
Yunqing Wang
1fcbf6ed56 Skip the partition search for still frames
This patch re-enabled the feature in Pengchong's patch
(commit 1286126073). Originally, it
was turned on while use_lastframe_partitioning > 0 (not used anymore).
Now it was added as a feature, and turned on while speed >= 2.
As described in the original patch, this feature helps speed up the
slideshows in YouTube.

Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471
2014-09-26 09:03:52 -07:00
Deb Mukherjee
993d10a217 Adds various high bit-depth encode functions
Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f
2014-09-25 01:50:36 -07:00
Jingning Han
6989e81d61 Remove unused variable in handle_inter_mode
Change-Id: Id757d2c940756ce1b0ead2ea24af9ac0a493de05
2014-09-24 18:27:44 -07:00
Paul Wilkins
76035d16d9 Merge "Fix build issue with stats enabled." 2014-09-24 10:32:37 -07:00
Yunqing Wang
14ee2805a3 Refactor encode_rd_sb_row function
Simplified the code and removed some code that was not used anymore.
This patch didn't change encoding result.

Change-Id: I7e54a74c8f35a6726dfc8a1c55b337448b7ea124
2014-09-24 10:24:18 -07:00
Paul Wilkins
5b724fc78e Fix build issue with stats enabled.
Compiler build issue when output stats enabled.

Change-Id: I7b5409108f3f27ba61b0241b9340b412683eff45
2014-09-24 11:48:58 +01:00
Deb Mukherjee
e1d3c36525 Adds high bit-depth frame resize functions
Change-Id: I35b015a759325d72d0da427c61a09f19f8e69697
2014-09-23 22:55:33 -07:00
Yaowu Xu
8751e49a6f Merge "Adapt mode based rd_threshold for similar block size" 2014-09-23 22:28:08 -07:00
Yaowu Xu
60737c9fc8 Merge "Fix an IOC" 2014-09-23 20:44:35 -07:00
Deb Mukherjee
4109372af3 Adds high bit-depth psnr/sse functions
Also adds some miscellaneous high bit-depth setup functions.

Change-Id: I66488b08a5a2a8cb9518ca10497cf1c1501ceded
2014-09-23 17:28:05 -07:00
Deb Mukherjee
e2a90c0b21 Merge "High bit-depth loop/arf/postproc filter functions" 2014-09-23 17:26:32 -07:00
Deb Mukherjee
6c6213d960 Merge "Pruned subpel search for speed 3." 2014-09-23 17:12:03 -07:00
Deb Mukherjee
931ed516ba High bit-depth loop/arf/postproc filter functions
Adds high-bitdepth loopfilter, temporal filter and postproc functions

Change-Id: I81c8a9176890784686bc4f2af0d550d243b3b2d3
2014-09-23 16:20:43 -07:00
Yaowu Xu
4a101310e8 Adapt mode based rd_threshold for similar block size
The rd_thresholds are adaptively changed based on best mode tested.
It was only changed for the same block size, this commit makes the
adaptation for similar block sizes too. The commit also made minor
adjustment and code cleanups.

The impact on encoding time for _ped:
118089 ms -> 111927 ms

The impact on compression:
derf:  -0.339%
stdhd: -0.303%

Change-Id: I8817fed1102350497f2ec631849e43f753878e5d
2014-09-23 16:10:59 -07:00
Yaowu Xu
56032b471d Fix an IOC
Change-Id: I0ca6746696d81657c035b0f6523c9af370da3c95
2014-09-23 16:07:22 -07:00
Deb Mukherjee
c94b17f4b2 Pruned subpel search for speed 3.
Adds code to return an integer cost list for NSTEP search. Then
uses it for pruned subpel search in speed 3.

derf: -0.06%
Speed on mobcal 720p increases from 10.28 fps to 10.65 fps.
[Subject to further testing].

Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96
2014-09-23 11:27:58 -07:00
Yaowu Xu
7feede9869 Merge "Remove code duplication" 2014-09-22 17:13:59 -07:00
Yaowu Xu
052bc8ea6a Merge "Simplify rd_pick_intra_sby_mode()" 2014-09-22 17:13:55 -07:00
Yaowu Xu
c7ab18fe56 Remove code duplication
Change-Id: I453b3e0d946951665d5919248445fc4f3222d2ad
2014-09-22 15:22:51 -07:00
Yaowu Xu
f46326c7a2 Simplify rd_pick_intra_sby_mode()
Change-Id: Ifb0915c94c2db48827ddbd446314cb6e3155b99c
2014-09-22 14:58:51 -07:00
Minghai Shang
38b6aed8fd Merge "[spatial svc] Remove vpx_svc_parameters_t and the loop that sets it for each layer" 2014-09-22 14:01:24 -07:00
Jingning Han
f7023ea014 Remove unnecessary local variable declaration
This commit removes a repetitive local variable declaration in
vp9_rd_pick_inter_mode_sb.

Change-Id: I1b0afa98ff1ecbfb46e17d3d1cee95d32c4309db
2014-09-22 09:29:28 -07:00
Jingning Han
eee904c9b9 Adaptive mode search scheduling
This commit enables an adaptive mode search order scheduling scheme
in the rate-distortion optimization. It changes the compression
performance by -0.433% and -0.420% for derf and stdhd respectively.
It provides speed improvement for speed 3:

bus CIF 1000 kbps
24590 b/f, 35.513 dB, 7864 ms ->
24696 b/f, 35.491 dB, 7408 ms (6% speed-up)

stockholm 720p 1000 kbps
8983 b/f, 35.078 dB, 65698 ms ->
8962 b/f, 35.054 dB, 60298 ms (8%)

old_town_cross 720p 1000 kbps
11804 b/f, 35.666 dB, 62492 ms ->
11778 b/f, 35.609 dB, 56040 ms (10%)

blue_sky 1080p 1500 kbps
57173 b/f, 36.179 dB, 77879 ms ->
57199 b/f, 36.131 dB, 69821 ms (10%)

pedestrian_area 1080p 2000 kbps
74241 b/f, 41.105 dB, 144031 ms ->
74271 b/f, 41.091 dB, 133614 ms (8%)

Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e
2014-09-22 09:28:16 -07:00
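The speed-up percentages quoted in the "Adaptive mode search scheduling" entry above can be cross-checked from its own timings. The commit message does not say how the percentages were computed, so this minimal sketch assumes the formula (before - after) / before. The function name is illustrative and not part of the codebase, and small differences from the quoted figures are possible under this assumption.

```python
# Encode times in ms copied from the commit message: (before, after).
timings_ms = {
    "bus CIF": (7864, 7408),
    "stockholm 720p": (65698, 60298),
    "old_town_cross 720p": (62492, 56040),
    "blue_sky 1080p": (77879, 69821),
    "pedestrian_area 1080p": (144031, 133614),
}


def runtime_reduction_pct(before_ms, after_ms):
    """Share of encode time saved, relative to the original run.

    Assumed formula; the commit message does not state one.
    """
    return 100.0 * (before_ms - after_ms) / before_ms


for clip, (before, after) in timings_ms.items():
    print(f"{clip}: {runtime_reduction_pct(before, after):.1f}% less encode time")
```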
hkuang
c70cea97ac Remove mi_grid_* structures.
mi_grid_* are arrays of pointer to pointer. They save the pointers that point
to the MIs in cm->mi. But they are unnecessary and complicated. The original
goal was to remove MODE_INFO_t copy. But with an extra MODE_INFO_t pointer
inside MODE_INFO_t, same goal could be achieved.

This commit totally removes the mi_grid_* structures. But there are still
many dummy MODE_INFO_t inside cm->mi which waste memory. The next commit
will do on-demand MODE_INFO_t allocation to reclaim that memory.

Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6
2014-09-19 21:27:11 -07:00
Deb Mukherjee
822b51609b High bit-depth coefficient coding functions
Tokenization and Detokenization enhancements for 10/12 bit

Change-Id: I3c269ec30f8eb160ee024905638a193975237559
2014-09-19 15:21:24 -07:00
Minghai Shang
209ee12110 [spatial svc] Remove vpx_svc_parameters_t and the loop that sets it for each layer
vpx_svc_parameters_t contains id, resolution and min/max qp for each spatial layer.

In this change we will use extra config to send min/max qp and scaling factors, then calculate layer resolution inside encoder.

Change-Id: Ib673303266605fe803c3b067284aae5f7a25514a
2014-09-18 18:05:07 -07:00
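The entry above says each layer's resolution is calculated inside the encoder from scaling factors. A minimal sketch of that idea follows, assuming each factor is a numerator/denominator pair applied to the full-resolution frame and that the result is truncated to an integer. The function name, the example factors, and the rounding rule are all assumptions for illustration; the actual encoder code may differ.

```python
def layer_resolution(width, height, scale_num, scale_den):
    """Scale a full-resolution frame size by scale_num / scale_den.

    Hypothetical helper: integer truncation is an assumed rounding rule.
    """
    if scale_num <= 0 or scale_den <= 0:
        raise ValueError("scaling factor terms must be positive")
    return (width * scale_num // scale_den, height * scale_num // scale_den)


# Illustrative three-layer setup on a 1920x1080 source (factors are made up).
full_width, full_height = 1920, 1080
example_factors = [(1, 4), (1, 2), (1, 1)]

for layer_id, (num, den) in enumerate(example_factors):
    w, h = layer_resolution(full_width, full_height, num, den)
    print(f"layer {layer_id}: {w}x{h}")
```

Passing only the factors keeps the per-layer configuration small, since the sizes follow from the source dimensions.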
Minghai Shang
f66be91f61 Merge "[spatial svc] Use same golden frame for all temporal layers" 2014-09-18 12:29:40 -07:00
Minghai Shang
f780b16bb8 [spatial svc] Use same golden frame for all temporal layers
Overhead goes down from 8% to 3% for 1080 60p

Change-Id: Idf3e5ca8712402a914a8cb79df17d3cdab63b163
2014-09-18 11:16:29 -07:00
Deb Mukherjee
6d0ee9860e Merge "Adds high bitdepth convolve, interpred & scaling" 2014-09-18 10:52:23 -07:00
Deb Mukherjee
0d3c3d3ce7 Adds high bitdepth convolve, interpred & scaling
Change-Id: Ie51c352a6b250547207cbc1ebba833a01ed053e3
2014-09-18 07:26:17 -07:00
Paul Wilkins
c389b37bb4 Substantial reworking of code for arf and kf groups.
Substantial restructuring of the way we estimate
the rate of decay in prediction quality and determine
the arf interval and amount of boost used.

Also other changes to support moving to a lower first pass
Q which exposes some new features and allows us to better
distinguish genuinely static blocks from low motion or noisy
blocks.

Net gains now visible on all the test sets with std-hd PSNR up
1.87%. There are still some bad outlier cases but most of these
are low motion or slide show type content where the metrics
are already high at any given rate. The best + case is up by
more than 10%.

Change-Id: I18e25170053bdf3188f493ff8062f48a74515815
2014-09-18 12:53:48 +01:00
Deb Mukherjee
5cd0aab81a Adds high bitdepth quantization functions
Adds various high bitdepth quantization functions.

Change-Id: I36fc0bf75a1bd15128ed271df8723de0ac134b0c
2014-09-16 14:55:37 -07:00
Jingning Han
66f812fb56 Merge "Use non-zero mode threshold for NEARESTMV modes" 2014-09-16 13:39:54 -07:00
Adrian Grange
2b3b63f422 Merge "Fix ARF construction when scaling" 2014-09-16 12:35:23 -07:00
Adrian Grange
99df7ded95 Merge "Move call to vp9_rc_get_second_pass_params()." 2014-09-16 11:37:33 -07:00
Adrian Grange
1def634f1a Fix ARF construction when scaling
The ARF frame should always be the same size as the
native resolution of the input frames.

It will be scaled to the required resolution at
encode time.

Change-Id: I0afe858129aa6ef65b1648f43476331715346896
2014-09-16 11:12:49 -07:00
Jingning Han
56fa3ab886 Use non-zero mode threshold for NEARESTMV modes
This commit makes the encoder use a non-zero mode threshold for
NEARESTMV modes. The runtime for test clips of speed 3 is reduced
by about 1%.

pedestrian 1080p 2000 kbps, 143239 ms -> 141989 ms
bus CIF 1000 kbps, 7835 ms -> 7749 ms

The compression performance change is about -0.02% for both derf
and stdhd.

Change-Id: Ib71808922c41ae2997100cb7c561f68dcebfa08e
2014-09-16 09:56:10 -07:00
Jingning Han
ffaebfc7b4 Merge "Add ARF validation for compound inter mode check" 2014-09-15 21:26:37 -07:00
Jingning Han
c50256c157 Merge "Remove redundant reference frame check in sub8x8 RD search" 2014-09-15 21:26:11 -07:00
Jingning Han
fe96932c69 Merge "Replace best_ref_index table fetch with best_mbmode" 2014-09-15 21:25:48 -07:00
Yunqing Wang
57eb2a4e83 Merge "Simplify the skip flag cost code" 2014-09-15 18:50:30 -07:00
Yunqing Wang
c60ef810a1 Merge "Set the skip flag to 1 for skippable blocks" 2014-09-15 18:50:19 -07:00