7795 Commits

Author SHA1 Message Date
Jingning Han
1bb11781e2 Rework idct8x8_10 SSE2 implementation
This commit optimizes the SSE2 implmentation of idct8x8_10. It exploits
the fact that only top-left 4x4 block contains non-zero coefficients,
and hence reduces the instructions needed.

The runtime of idct8x8_10_sse2 goes down from 216 to 198 CPU cycles,
estimated by averaging over 100000 runs. For pedestrian_area_1080p 300
frames coded at 4000kbps, the average decoding speed goes up from
79.3 fps to 79.7 fps.

Change-Id: I6d277bbaa3ec9e1562667906975bae06904cb180
2014-01-03 12:04:09 -08:00
Dmitry Kovalev
672c355a26 Replacing int_mv with MV.
Change-Id: Ifd432fa3741ba47102d298e0b348eb00f5a9ce53
2014-01-03 11:48:07 -08:00
Dmitry Kovalev
f215c641bb Merge changes Ic0a2427a,I3addbf6d
* changes:
  Removing CONFIG_MD5.
  Using VP9_FRAME_MARKER instead of raw number.
2014-01-03 11:47:57 -08:00
Dmitry Kovalev
5b04962cf4 Merging best_ref_mv and second_best_ref_mv into best_ref_mv[2].
Change-Id: If04b57828847cee09a79c94e1098d1aa4990ea0d
2014-01-03 11:31:00 -08:00
Paul Wilkins
65ede3da45 Modified Handling of min and max vbr rates.
In two pass encodes bits are allocated to each frame
according to a modified error score for the frame as a
fraction of the modified error score for the clip or section.

Previously a minimum rate per frame was reserved and
subtracted from the bits allocatable by the two pass code.
The vbr max section rate was enforced by clipping the
actual number of bits allocated.

In this patch the min and max vbr rates are enforced
instead by clipping the modified error scores for each frame
rather than the number of bits allocated.

Small gains for all test sets (psnr and SSIM) ranging from
~ +0.05 for YT psnr up to ~ +0.25 for Std-hd SSIM.

Change-Id: Iae27d70bdd3944e3f0cceaf225bad2e8802833de
2014-01-03 14:56:08 +00:00
Dmitry Kovalev
f16b186b8e Reusing vp9_get_skip_context() function in encoder.
Change-Id: Ic0345622115941f49b6a568c7b8154ba892cbf0d
2014-01-02 18:29:56 -08:00
Christian Duvivier
3bcece9578 Merge "ARM NEON version of denoiser." 2014-01-02 15:03:58 -08:00
Christian Duvivier
b52db6b7e8 ARM NEON version of denoiser.
Change-Id: I951abd4ad0078f78949f3cb79453ac334fb82a7e
2014-01-02 10:51:05 -08:00
Yaowu Xu
8458c8c450 Merge "Fix show existing frame" 2014-01-02 09:27:28 -08:00
Dmitry Kovalev
d24f4e49c1 Removing CONFIG_MD5.
We don't need compile time md5 configuration because --md5 is a runtime
option.

Change-Id: Ic0a2427ae5de5a18f31e5ee60c3732481b377ca1
2013-12-27 16:10:18 -08:00
Jingning Han
cdc933ca00 Merge "Adaptive motion control on ref and search range" 2013-12-27 15:04:16 -08:00
Dmitry Kovalev
46d5cc4307 Using VP9_FRAME_MARKER instead of raw number.
Change-Id: I3addbf6d89a86a707c8df1a463da3e9e367910df
2013-12-27 14:50:27 -08:00
Dmitry Kovalev
7eb09151bc Merge "Removing vpx_codec_vp9x_cx and internal experimental flag." 2013-12-27 14:47:57 -08:00
Yunqing Wang
a7248a04b7 Merge "Remove a unused sub-pixel search" 2013-12-27 14:05:38 -08:00
Dmitry Kovalev
116e0a1ab7 Removing vpx_codec_vp9x_cx and internal experimental flag.
vpx_codec_vp9x_cx is not used internally. Experimental flag from
vp9_extracfg is also not really used. YUV 4:4:4 just works after these
changes (you have to specify --profile=1 for the encoder).

Change-Id: Ib1c8461d0d19d159827e005efe868f891eea0140
2013-12-27 14:01:12 -08:00
Jingning Han
a4ce53f14d Adaptive motion control on ref and search range
This commit takes a preliminary attempt to refine the motion search
control. It detects the SAD associated with mv predictor per reference
frame, and based on which to determine whether the encoder wants to
reduce the motion search range (if the predicted mv provides fairly
small SAD), or to skip the current reference frame (if there exists
another ref frame that gives much smaller SAD cost).

This feature is turned on in the settings of speed 1 and above.

In speed 1, compression performance changed
derf  -0.018%
yt    -0.043%
hd    -0.045%
stdhd -0.281%

speed-up
pedestrian_area_1080p at 4000 kbps 100 frames
199651ms -> 188846ms (5.5% speed-up)
blue_sky_1080p at 6000 kbps
443531ms -> 415239ms (6.3% speed-up)

In speed 2, compression performance changed
derf  -0.026%
yt    -0.090%
hd    -0.055%
stdhd -0.210%

speed-up
pedstrian 113949ms -> 108855ms (4.5% speed-up)
blue_sky  271057ms -> 257322ms (5% speed-up)

Change-Id: I1b74ea28278c94fea329d971d706d573983d810d
2013-12-27 12:43:06 -08:00
Dmitry Kovalev
f3beca079c Merge "Calculating has_second_ref only once for single_ref context." 2013-12-26 13:41:02 -08:00
Dmitry Kovalev
1e8b5bf4ac Merge "Removing vp9_findnearmv.{h, c} files." 2013-12-26 13:38:38 -08:00
Dmitry Kovalev
87440aeb82 Moving MAX_PROB constant to vp9_prob.h.
Change-Id: I07470ad1b7a0344d088911428ffab8ba9a0d8708
2013-12-20 15:56:59 -08:00
Dmitry Kovalev
f69b5609ff Renaming vp9_dboolhuff.{h, c} to vp9_reader.{h, c}.
Change-Id: I50c009ff8108bda1c57427f23d63a79c04f7e776
2013-12-20 12:53:03 -08:00
Dmitry Kovalev
36ee0a2d0b Merge "Renaming vp9_boolcoder.{h, c} to vp9_writer.{h, c}." 2013-12-20 12:51:37 -08:00
Dmitry Kovalev
b3b9f4a4d0 Merge "Using single struct to represent scale factors." 2013-12-20 11:22:02 -08:00
Dmitry Kovalev
4084566554 Renaming vp9_boolcoder.{h, c} to vp9_writer.{h, c}.
Change-Id: I9b9a5fcce8530284df0f270706ee060a0edc1517
2013-12-20 11:10:24 -08:00
Dmitry Kovalev
47d482cb0a Merge "Reusing FRAME_COUNTS in the encoder." 2013-12-20 10:56:31 -08:00
Jingning Han
9938777058 Merge "Store the SSE of prediction residuals" 2013-12-20 10:37:20 -08:00
Marco Paniconi
68cdbfe50e Merge "Initialize avg_frame_qindex to worst_allowed for 1 pass." 2013-12-20 10:28:09 -08:00
Yaowu Xu
6afd37aa15 Merge "Fix a bug" 2013-12-20 09:20:50 -08:00
Yaowu Xu
2472f125c1 Fix a bug
The line was accidently removed in 4dbad63a7.

Change-Id: Ic1e18f209cead95cecc684f952ae667271b58a97
2013-12-20 08:52:35 -08:00
Yunqing Wang
b6a0ac11f0 Merge "Code clean up" 2013-12-20 08:46:11 -08:00
Paul Wilkins
17b2d63196 Merge "Adjust gf_group_error_left for arf groups." 2013-12-20 04:21:56 -08:00
James Zern
cc8ea84d3d Merge "test/partial_idct_test: fix msvc build" 2013-12-19 19:31:02 -08:00
Jingning Han
243327f43c Store the SSE of prediction residuals
Buffer the SSE of prediction residuals in the rate-distortion
optimization loop of a given block. This information would be used
for later encoding control.

Change-Id: If4e63f3462490513c48be9407d3327c8dd438367
2013-12-19 18:45:28 -08:00
Dmitry Kovalev
987810ad95 Removing vp9_findnearmv.{h, c} files.
Moving all code from that files to vp9_mvref_common.{h, c}.

Change-Id: Ibc4afcb8cea6847166ff411130e93611ebe63b20
2013-12-19 17:39:57 -08:00
Dmitry Kovalev
a3fbcc88bb Using single struct to represent scale factors.
Moving back to scale_factors struct. We don't need anymore x_offset_q4 and
y_offset_q4 because both values are calculated locally inside vp9_scale_mv
function.

Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b
2013-12-19 16:06:33 -08:00
Dmitry Kovalev
40e173ac42 Merge "vp9_encode_frame() cleanup." 2013-12-19 15:37:13 -08:00
Marco Paniconi
5ba4b16c2d Initialize avg_frame_qindex to worst_allowed for 1 pass.
Change-Id: I535bde16c8fc4c2cd263bbbbaed46ead4c776090
2013-12-19 14:55:49 -08:00
Dmitry Kovalev
c872d2be65 Call set_scaled_offsets() just before scale_mv() call.
Before mv scaling it is required to calculate x_offset_q4/y_offset_q4
by calling set_scaled_offsets(). Now offset configuration can not be
missed because it happens just before scale_mv().

Change-Id: I7dd1a85b85811a6cc67c46c9b01e6ccbbb06ce3a
2013-12-19 14:55:13 -08:00
Dmitry Kovalev
5bfd475104 Merge "Adding get_block_variance_fn() function." 2013-12-19 14:47:59 -08:00
Dmitry Kovalev
a619f5a776 Merge "Replacing 1 << mi_{width, height}_log2() with lookup tables." 2013-12-19 14:28:58 -08:00
Dmitry Kovalev
f06187f125 vp9_encode_frame() cleanup.
Change-Id: I82ecbe7fe0baa890ce251043f3c7159188c00665
2013-12-19 14:28:42 -08:00
Dmitry Kovalev
66ef9d1c20 Adding get_block_variance_fn() function.
Change-Id: I67d934b6af899ffd4bcad2d913a650685fa64abd
2013-12-19 14:07:03 -08:00
Dmitry Kovalev
431aaefbec Replacing 1 << mi_{width, height}_log2() with lookup tables.
Change-Id: Iba91ff1e797a83517e2cd7c3ab86cba39f39415b
2013-12-19 13:43:45 -08:00
Deb Mukherjee
84b4d8c692 Merge "Begin refactor of frame schedule in rate control" 2013-12-19 11:37:06 -08:00
Yunqing Wang
6ff4f19269 Remove a unused sub-pixel search
The original iterative search was replaced by subpel_tree search,
and was not used anymore.

Change-Id: I998b38e1cb0ee359a08b2410d0766dbf183ab071
2013-12-19 11:20:56 -08:00
Yunqing Wang
09faf55916 Code clean up
Removed unused filter coefficients.

Change-Id: Ib395a51305e23ff41ab69c1808d56946d25961cd
2013-12-19 11:09:23 -08:00
Dmitry Kovalev
c67ee5ea24 Merge "Converting vp9_treecoder.h to vp9_prob.{h, c}" 2013-12-19 11:03:30 -08:00
Dmitry Kovalev
e4b85c9ed8 Merge "Adding get_zbin_mode_boost() function." 2013-12-19 11:03:23 -08:00
Deb Mukherjee
4dbad63a71 Begin refactor of frame schedule in rate control
Various cleanups and streamlining of interfaces as precursor
to further advancements in rate control.
Pre-encode parameter setting for different use cases:
One-pass, first of 2-pass, second of 2-pass, and Svc
are separated out.

There is no change in output with this change.

Change-Id: Ied8ca7d84d610993776aa30ef263fe20452e0e3e
2013-12-19 10:55:38 -08:00
Tom Finegan
46444c268a Merge "test/decode_perf_test: fix msvc build" 2013-12-19 10:35:41 -08:00
Paul Wilkins
ee29b7e85c Adjust gf_group_error_left for arf groups.
Take account of the fact that the overlay frame is usually
very cheap so distribute target bits among the other frames.

Change-Id: I120685122e8cbbe75da8d07d02932f7877059867
2013-12-19 17:26:04 +00:00