4028 Commits

Author SHA1 Message Date
Christian Duvivier
c129203f7e Faster vp9_short_fdct8x8.
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).

Change-Id: I7e85d8225a914a74c61ea370210414696560094d
2013-02-27 17:23:08 -08:00
Dmitry Kovalev
c6421433c8 Merge "Code cleanup." into experimental 2013-02-27 16:43:04 -08:00
Dmitry Kovalev
347f3a0aa8 Code cleanup.
Fixing code style, using array lookup instead of switch statements for
forward hybrid transforms (in the same way as for their inverses).
Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places.

Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
2013-02-27 13:51:04 -08:00
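The two ideas in 347f3a0aa8 (table lookup in place of a switch, and a rounding shift through ROUND_POWER_OF_TWO) can be sketched in a few lines. This is a minimal illustration, not the libvpx code: every toy_ and TOY_ name is hypothetical, the 2-point kernels are placeholders rather than real DCT/ADST kernels, and the macro is written from the usual "add half, then shift" definition, which may differ in detail from the one in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding right shift: add half of 2^n before shifting by n. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

typedef void (*toy_transform_1d)(const int16_t *in, int16_t *out);

/* Placeholder 2-point kernels (hypothetical, for illustration only). */
static void toy_kernel_a(const int16_t *in, int16_t *out) {
  out[0] = (int16_t)(in[0] + in[1]);
  out[1] = (int16_t)(in[0] - in[1]);
}

static void toy_kernel_b(const int16_t *in, int16_t *out) {
  out[0] = in[0];
  out[1] = in[1];
}

typedef enum { TOY_A_A, TOY_B_A, TOY_A_B, TOY_B_B, TOY_TX_TYPES } toy_tx_type;

typedef struct {
  toy_transform_1d cols; /* kernel applied to each column */
  toy_transform_1d rows; /* kernel applied to each row */
} toy_transform_2d;

/* One table entry per hybrid type replaces a switch on the type. */
static const toy_transform_2d toy_table[TOY_TX_TYPES] = {
  { toy_kernel_a, toy_kernel_a }, /* TOY_A_A */
  { toy_kernel_b, toy_kernel_a }, /* TOY_B_A */
  { toy_kernel_a, toy_kernel_b }, /* TOY_A_B */
  { toy_kernel_b, toy_kernel_b }  /* TOY_B_B */
};

/* 2x2 hybrid transform: columns, then rows, then a rounding shift by 1. */
static void toy_hybrid_2x2(const int16_t in[4], int16_t out[4],
                           toy_tx_type type) {
  int16_t tmp[4], col_in[2], col_out[2];
  int i;
  assert((int)type >= 0 && (int)type < TOY_TX_TYPES);
  for (i = 0; i < 2; ++i) {
    col_in[0] = in[i];
    col_in[1] = in[2 + i];
    toy_table[type].cols(col_in, col_out);
    tmp[i] = col_out[0];
    tmp[2 + i] = col_out[1];
  }
  for (i = 0; i < 2; ++i) toy_table[type].rows(&tmp[2 * i], &out[2 * i]);
  for (i = 0; i < 4; ++i) out[i] = (int16_t)ROUND_POWER_OF_TWO(out[i], 1);
}
```

Adding a new hybrid type in this layout means adding one enum value and one table row; the dispatch code itself does not change.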
John Koleszar
889ce83390 Merge changes Idc1c490f,I6b5fe1a4 into experimental
* changes:
  convolve test: validate 1D filters are 1D
  Run all filters through convolve test
2013-02-27 13:45:42 -08:00
Dmitry Kovalev
9d771f948f Merge "Motion vectors code cleanup." into experimental 2013-02-27 13:34:56 -08:00
Yunqing Wang
bbc7b6a86a Merge "Remove unused file" into experimental 2013-02-27 13:00:10 -08:00
John Koleszar
ebf8b9fc6d Fix rollover and pass 1 time estimate
Fixes a rollover of the cx_time variable for encodes that take
over ~4200 seconds. Also enable the time estimate in first pass.

Change-Id: Ib5a98ee71bccd79a804d709cec7260651d0b7141
2013-02-27 12:29:25 -08:00
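The ~4200 second limit in ebf8b9fc6d is roughly what a 32-bit microsecond counter would give, since 2^32 microseconds is about 4295 seconds. The commit message states neither the unit nor the width of cx_time, so both are assumptions in the sketch below, and the toy_ function names are hypothetical.

```c
#include <stdint.h>

/* Accumulate `seconds` one-second steps in a 32-bit microsecond total.
 * Unsigned arithmetic wraps modulo 2^32 once the total passes ~4295 s. */
static uint32_t toy_total_us_32(unsigned seconds) {
  uint32_t total = 0;
  unsigned i;
  for (i = 0; i < seconds; ++i) total += 1000000u;
  return total;
}

/* The same accumulation in a 64-bit total, which does not roll over
 * for any realistic encode time. */
static uint64_t toy_total_us_64(unsigned seconds) {
  uint64_t total = 0;
  unsigned i;
  for (i = 0; i < seconds; ++i) total += 1000000u;
  return total;
}
```

Under these assumptions a 4200 second encode still fits in the 32-bit total, while a 4300 second encode wraps around to a small value.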
John Koleszar
5ac141187a Merge "Remove unused vp9_copy32xn" into experimental 2013-02-27 12:23:45 -08:00
Yunqing Wang
d6ff6fe2ed Merge "Remove unused file" into experimental 2013-02-27 11:58:29 -08:00
Dmitry Kovalev
0c0de00217 Motion vectors code cleanup.
Fixing indentation, removing redundant parentheses, deciphering single
letter variable names, better spacing.

Change-Id: I1d447a7d69eddbf1e94e0820423615f40ea2d591
2013-02-27 11:48:13 -08:00
Ronald S. Bultje
90932399b4 Merge "Move eob from BLOCKD to MACROBLOCKD." into experimental 2013-02-27 11:39:16 -08:00
Yunqing Wang
8092aaf9ec Merge "Optimize vp9_dc_only_idct_add_c function" into experimental 2013-02-27 11:38:45 -08:00
James Zern
3b79000122 Merge "vp8/encoder/mcomp.c: remove an unused variable" 2013-02-27 11:33:18 -08:00
John Koleszar
d711f1091d Merge "add vp8 variance test" 2013-02-27 11:22:46 -08:00
John Koleszar
09be534f13 Merge "give vp9 variance struct a unique name" 2013-02-27 11:22:36 -08:00
John Koleszar
1a1cacfdb0 Merge "rtcd: make include guard unique" 2013-02-27 11:22:26 -08:00
John Koleszar
04c2407874 convolve test: validate 1D filters are 1D
Since the 8-tap lowpass filter is non-interpolating, the results are
different between applying it at whole-pel values and not. This
means that 1D-only versions are required to be implemented, as
opposed to being an optimization of the 2D case. Calling the 2D
filter instead of the horizontal-only filter is not equivalent
in this case. Update the test to pass invalid filters to the
unused stage of the 1D-only calls, to verify they're unused.

Change-Id: Idc1c490f059adadd4cc80dbe770c1ccefe628b0a
2013-02-27 11:19:11 -08:00
John Koleszar
557a1b209e Run all filters through convolve test
Updates the convolve test to verify that all filters match the
reference implementation. This verifies commit 30f866f, which
fixed some problems with the SSE3 version of the filters for
the vp9_sub_pel_filters_8s and vp9_sub_pel_filters_8lp banks
due to overflow and order of operations.

Change-Id: I6b5fe1a41bc20062e2e64633b1355ae58c9c592c
2013-02-27 11:15:20 -08:00
Yunqing Wang
bf6cca44ad Remove unused file
Removed vp9/decoder/x86/vp9_idct_blk_mmx.c

Change-Id: I07ab06382a394cf556fa5a8e3c98b91f6e4f9ce8
2013-02-27 11:13:19 -08:00
John Koleszar
aba4f7fd42 Merge "vpxdec: support scaling output" into experimental 2013-02-27 11:09:56 -08:00
Yunqing Wang
5ef694cfb8 Remove unused file
Removed vp9_idctllm_mmx.asm

Change-Id: I7152756f23a5a09ed69e8fb40edb2ab3237290fe
2013-02-27 11:00:58 -08:00
Ronald S. Bultje
e8c74e2b70 Move eob from BLOCKD to MACROBLOCKD.
Consistent with VP8.

Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
2013-02-27 11:00:55 -08:00
John Koleszar
4b6fb94637 Merge "vpxenc: support scaling prior to encoding" into experimental 2013-02-27 10:59:21 -08:00
John Koleszar
0921bfb749 Merge "Use ref_frame_map vice active_ref_idx on the encoder" into experimental 2013-02-27 10:59:08 -08:00
John Koleszar
9615fd8f39 Merge "Test upscaling as well as downscaling" into experimental 2013-02-27 10:25:51 -08:00
John Koleszar
7ad8dbe417 Remove unused vp9_copy32xn
This function was part of an optimization used in VP8 that required
caching two macroblocks. This is unused in VP9, and might not
survive refactoring to support superblocks, so removing it for now.

Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
2013-02-27 10:24:56 -08:00
John Koleszar
d8e68bd14b Merge changes I922f8602,I0ac3343d into experimental
* changes:
  Use 256-byte aligned filter tables
  Set scale factors consistently for SPLITMV
2013-02-27 10:08:53 -08:00
Jan Kratochvil
82ed3f9a41 Fix --as=nasm compatibility for new asm code.
s/movd/movq/

Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
2013-02-27 09:55:38 -08:00
John Koleszar
350ba5f30e Merge "Combined motion compensation with scaled predictors" into experimental 2013-02-27 09:46:12 -08:00
John Koleszar
0d2517ce1b vpxdec: support scaling output
Adds an option, --scale, that will rescale any frames produced by
the decoder that don't match the resolution of the first frame to
that resolution. This is useful for playback of files that use
spatial resampling.

Change-Id: I867adc650d535df7ec5b107549359712ea3aaaa0
2013-02-27 08:22:40 -08:00
John Koleszar
34882b9bf5 vpxenc: support scaling prior to encoding
Scales the input of the encoder using libyuv's "box filter". Each stream
may have a different width and height specified. If the width (or
height) parameter is missing (or is explicitly set to 0) then the value
will be calculated based on the specified height (or width) and the
input file's dimensions, preserving its aspect ratio. Leaving the height
unspecified behaves similarly.

Change-Id: Ic7026810b13be030826be80dc6f7fc4aaf0c35d0
2013-02-27 08:22:40 -08:00
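The rule in 34882b9bf5 for a missing width or height can be sketched as follows. This is not the vpxenc code: toy_resolve_scale is a hypothetical helper, round-to-nearest is an assumption, and the case where both values are 0 (treated here as "keep the input size") is not covered by the commit message.

```c
#include <assert.h>

/* Fill in a width or height of 0 from the other dimension and the input
 * size, preserving the input aspect ratio. */
static void toy_resolve_scale(int in_w, int in_h, int *out_w, int *out_h) {
  assert(in_w > 0 && in_h > 0);
  if (*out_w == 0 && *out_h == 0) {
    /* Assumption: nothing requested means no scaling. */
    *out_w = in_w;
    *out_h = in_h;
  } else if (*out_w == 0) {
    /* width = in_w * out_h / in_h, rounded to nearest */
    *out_w = (int)(((long long)in_w * *out_h + in_h / 2) / in_h);
  } else if (*out_h == 0) {
    /* height = in_h * out_w / in_w, rounded to nearest */
    *out_h = (int)(((long long)in_h * *out_w + in_w / 2) / in_w);
  }
}
```

For a 1920x1080 input, requesting only a height of 540 yields a width of 960, and requesting only a width of 640 yields a height of 360.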
John Koleszar
800ad0b886 Use ref_frame_map vice active_ref_idx on the encoder
This patch makes the encoder's use of ref_frame_map and active_ref_idx
consistent with the decoder. ref_frame_map[] maps a reference buffer
index to its actual location in the yv12_fb array, since many
references may share an underlying buffer. active_ref_idx[] mirrors
cpi->{lst,gld,alt}_fb_idx, holding the active references in each
slot.

This also fixes a bug in setup_buffer_inter() where the incorrect
reference was used to populate the scaling factors.

Change-Id: Id3728f6d77cffcd27c248903bf51f9c3e594287e
2013-02-27 08:22:40 -08:00
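The two-level indirection described in 800ad0b886 can be sketched like this. The field names ref_frame_map, active_ref_idx and yv12_fb come from the commit message; the struct layout, the array sizes and the toy_ names are hypothetical.

```c
enum { TOY_NUM_SLOTS = 3, TOY_NUM_REFS = 8, TOY_NUM_BUFFERS = 4 };

typedef struct {
  int width;
  int height;
} toy_frame_buffer;

typedef struct {
  toy_frame_buffer yv12_fb[TOY_NUM_BUFFERS];
  /* reference buffer index -> position in yv12_fb; several references
   * may map to the same underlying buffer */
  int ref_frame_map[TOY_NUM_REFS];
  /* slot (last, golden, altref) -> reference buffer index */
  int active_ref_idx[TOY_NUM_SLOTS];
} toy_common;

/* Resolve a slot to its frame buffer through both tables. */
static const toy_frame_buffer *toy_get_ref(const toy_common *cm, int slot) {
  const int ref_idx = cm->active_ref_idx[slot];
  return &cm->yv12_fb[cm->ref_frame_map[ref_idx]];
}
```

Because lookups always go through ref_frame_map, two slots can point at different reference indices and still resolve to the same buffer.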
John Koleszar
b683eecf6d Test upscaling as well as downscaling
Fixes a bug in vp9_set_internal_size() that prevented returning to
the unscaled state. Updated the ResizeInternalTest to scale both
down and up. Added a check that all frames are within 2.5% of the
quality of the initial keyframe.

Change-Id: I3b7ef17cdac144ed05b9148dce6badfa75cff5c8
2013-02-27 08:22:40 -08:00
John Koleszar
6fd7dd1a70 Use 256-byte aligned filter tables
This avoids duplicating all the filters twice. Includes fixups to the
convolve routines and associated tests to make this work.

Change-Id: I922f86021594e55072ddb63b42b2313605db6e00
2013-02-27 08:22:39 -08:00
John Koleszar
77f88e97fa Combined motion compensation with scaled predictors
This patch extends the previous support for using references of a
different resolution in ZEROMV mode to all inter prediction modes.
Subpixel based best-mv scoring is disabled when the reference frame
differs in resolution from the current frame.

Change-Id: Id4dc3e5e6692de98d9857fd56bfad3ac57e944ac
2013-02-27 08:22:39 -08:00
John Koleszar
472eeaf082 Set scale factors consistently for SPLITMV
This commit updates the 4x4 prediction to consistently use the
build_2x1_inter_predictor() method. That function is updated to
calculate the scale offset, rather than relying on the caller
to calculate it. In the case that the 2x1 prediction can not
be used, the scale offset is recalculated for each 1x1 block.
The idea here is that the offsets are calculated before each
call to vp9_build_scaled_inter_predictor().

Change-Id: I0ac3343dd54e2846efa3c4195fcd328b709ca04d
2013-02-27 08:22:39 -08:00
Yaowu Xu
103d83cb6c Merge "Enable 32x32 dct tests" into experimental 2013-02-27 07:57:07 -08:00
Yaowu Xu
858b60e8d0 Merge "Improve 32x32 forward dct" into experimental 2013-02-27 07:56:42 -08:00
John Koleszar
eb939f45b8 Spatial resampling of ZEROMV predictors
This patch allows coding frames using references of different
resolution, in ZEROMV mode. For compound prediction, either
reference may be scaled.

To test, I use the resize_test and enable WRITE_RECON_BUFFER
in vp9_onyxd_if.c. It's also useful to apply this patch to
test/i420_video_source.h:

  --- a/test/i420_video_source.h
  +++ b/test/i420_video_source.h
  @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {

     virtual void FillFrame() {
       // Read a frame from input_file.
  +    if (frame_ != 3)
       if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
         limit_ = frame_;
       }

This forces the frame that the resolution changes on to be coded
with no motion, only scaling, and improves the quality of the
result.

Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
2013-02-26 23:54:23 -08:00
Dmitry Kovalev
c7805395fd Merge "Removing redundant 'extern' keyword from function declarations." into experimental 2013-02-26 20:56:32 -08:00
Ronald S. Bultje
96d260515a Merge "Merge cnvcontext experiment." into experimental 2013-02-26 19:39:39 -08:00
Ronald S. Bultje
1a0533958b Merge "Fix modes.stt output printf format string." into experimental 2013-02-26 19:39:33 -08:00
Ronald S. Bultje
db54e6774f Merge "Minor cosmetics in rdopt." into experimental 2013-02-26 19:39:28 -08:00
Yunqing Wang
35bc02c6eb Optimize vp9_dc_only_idct_add_c function
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to
improve performance, clipped the absolute diff values to [0, 255].
This allowed us to keep the additions/subtractions in 8 bits.
Tests showed a decoder performance increase of over 2%.

Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
2013-02-26 17:16:13 -08:00
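The clipping trick in 35bc02c6eb can be shown in scalar C. Once the magnitude of the DC difference is clipped to [0, 255], an unsigned saturating 8-bit add (for a non-negative difference) or subtract (for a negative one) gives the same pixel as the wider "add, then clamp to [0, 255]" computation. SSE2 provides such saturating byte operations as `_mm_adds_epu8` and `_mm_subs_epu8`; whether the commit uses exactly these is an assumption, and all function names below are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Scalar models of unsigned saturating 8-bit add and subtract. */
static uint8_t toy_sat_add_u8(uint8_t a, uint8_t b) {
  const int sum = a + b;
  return (uint8_t)(sum > 255 ? 255 : sum);
}

static uint8_t toy_sat_sub_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(a > b ? a - b : 0);
}

/* 8-bit path: clip |diff| to [0, 255], then saturating add or subtract. */
static uint8_t toy_add_diff_8bit(uint8_t pred, int diff) {
  const int mag = abs(diff);
  const uint8_t clipped = (uint8_t)(mag > 255 ? 255 : mag);
  return diff >= 0 ? toy_sat_add_u8(pred, clipped)
                   : toy_sat_sub_u8(pred, clipped);
}

/* Reference path: add in full precision, then clamp to [0, 255]. */
static uint8_t toy_add_diff_reference(uint8_t pred, int diff) {
  const int v = pred + diff;
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}
```

The two paths agree for every 8-bit predictor value and every difference, which is what makes the narrower arithmetic safe.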
James Zern
4446af78f0 Merge "vp9: promote gf_group_bits calculation to 64-bit" into experimental 2013-02-26 16:27:45 -08:00
Dmitry Kovalev
971ff2679f Removing redundant 'extern' keyword from function declarations.
Change-Id: I893fa36297b9bd9cff93d082f1736f6860b15c0d
2013-02-26 15:52:05 -08:00
John Koleszar
25686fc22d Merge "Refactor inter recon functions to support scaling" into experimental 2013-02-26 11:45:28 -08:00
Johann
ef887974aa vp8 fast quantizer with intrinsics
Reduce dependency on offsets file by using intrinsics. Disassembly shows
improvements over previous assembly specifically in register management,
preloading, and {pro,epi}log. Speed change is within margin of error.

Change-Id: I8131b4b4d62bc092407fe847bfaa8f2c0e1384ff
2013-02-26 10:48:24 -08:00
Dmitry Kovalev
998bed1d2c Merge "Changing pitch value meaning for fht and iht transforms." into experimental 2013-02-26 10:44:15 -08:00
Ronald S. Bultje
b1641150b1 Merge cnvcontext experiment.
Change-Id: I35e64998b25694a3bb4a62164bba3c03c1db4bc7
2013-02-26 10:40:15 -08:00