Commit Graph

7765 Commits

Author SHA1 Message Date
Jingning Han
d66c748635 Enable skipping reference frame check in rd loop
This commit allows encoder to compare the SAD cost associated with
the best motion vector predictor, per frame. If one reference frame
has this cost more than 4 times of the best SAD cost given by other
reference frames, skip NEARESTMV, NEARMV, ZEROMV mode check of this
reference frame.

This setting is turned on in speed 2 and above. Compression quality
change in speed 2:
derf  -0.014%
yt    -0.097%
hd    -0.023%
stdhd  0.046%

It reduces the speed 2 runtime of test sequences:
pedestrian_area_1080p 4000 kbps 310763 ms -> 303595 ms
bluesky_1080p 6000 kbps         259852 ms -> 251920 ms

Change-Id: I7f59cf79503d51836d61d56d50dc5bdf0e502e22
2014-01-09 18:25:53 -08:00
Jingning Han
af31b27aae Optimze inv 16x16 DCT with 10 non-zero coeffs - P2
This commit further optimizes SSE2 operations in the second 1-D
inverse 16x16 DCT, with (<10) non-zero coefficients. The average
runtime of this module goes down from 779 cycles -> 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
2014-01-09 12:46:09 -08:00
Jingning Han
ba6ab46cdc Optimze inv 16x16 DCT with 10 non-zero coeffs - P1
This commit is the first patch optimizing SSE2 implementation of inverse
16x16 DCT with <10 non-zero coefficients. It focused on the first 1-D (row)
transformation. It exploits the fact that only top-left 4x4 block contains
non-zero coefficients, in a 2-D inverse 16x16 DCT with <10 coeffients.

The average runtime of idct16x16_10 unit is reduced from
883 cycles -> 779 cycles (12% faster).

For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes
down from 310651 ms  -> 305910 ms. The decoding speed goes up from
80.37 fps -> 80.87 fps.

Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
2014-01-08 15:36:45 -08:00
Alex Converse
8fcb74e6bb Merge "Add a C fallback for get_msb() and change inline to INLINE." 2014-01-08 14:43:46 -08:00
hkuang
5be0ed30dc Merge "Add initial intra frame neon optimization. 1~2% gain." 2014-01-08 14:41:43 -08:00
Dmitry Kovalev
feab7e1146 Merge "Using struct twopass_rc* instead of VP9_COMP*." 2014-01-08 14:14:05 -08:00
Alex Converse
ce7ff3b63d Add a C fallback for get_msb() and change inline to INLINE.
For systems without __builtin_clz() or _BitScanReverse(), taken from libwep

Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e
2014-01-08 12:25:47 -08:00
hkuang
691111aacf Add initial intra frame neon optimization. 1~2% gain.
More intra optimizations will be added.

Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
2014-01-08 11:58:42 -08:00
Yunqing Wang
a84029ad9c Merge "AVX2 Variance Optimization" 2014-01-08 11:33:42 -08:00
Johann
af72081818 Merge "Include gen_msvs_vcxproj.sh" 2014-01-08 11:10:03 -08:00
Alex Converse
22d83a0ab7 Merge "Replace RD modeling with a fixed point approximation." 2014-01-08 11:06:54 -08:00
levytamar82
357b65369f AVX2 Variance Optimization
Optimizing the variance functions: vp9_variance16x16, vp9_variance32x32,
vp9_variance64x64, vp9_variance32x16, vp9_variance64x32,
vp9_mse16x16 by migrating to AVX2
some of the functions were optimized by processing 32 elements instead of 16.
some of the functions were optimized by processing 2 loop strides of 16
elements in a single 256 bit register
This optimization gives between 2.4% - 2.7% user level performance gain
and 42% function level gain.

Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
2014-01-08 12:05:53 -07:00
Alex Converse
f2ca665f1c Replace RD modeling with a fixed point approximation.
Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
2014-01-08 10:37:24 -08:00
Jingning Han
aa9552b0b5 Merge "Fix an issue in motion vector prediction stage" 2014-01-08 10:06:03 -08:00
Johann
87784e3a99 Include gen_msvs_vcxproj.sh
Change-Id: I28e9cf9347acd7279df3b841863a248479633265
2014-01-08 09:51:15 -08:00
Deb Mukherjee
0d21d79bbc Merge "Further rate control cleanups" 2014-01-08 09:20:29 -08:00
Johann
7d4083a8da Merge "Remove yasm.rules dependency" 2014-01-08 08:42:25 -08:00
Deb Mukherjee
730ade414d Further rate control cleanups
Some cleanups on frames_to_key, frames_since_key.
Also removes the unused fixed_q parameters in vp9.

Change-Id: If8743a32c71de30a8d17136477b53d607a7acda8
2014-01-07 13:51:50 -08:00
Jingning Han
06e4f825af Fix an issue in motion vector prediction stage
The previous implementation stops motion vector prediction test when
the zero motion vector appears for the second time. This commit fixes
it by simply skipping the second time check on zero mv and continuing
on to next mv candidate.

It slightly improves stdhd in speed 2 by 0.06% on average. Most static
sequences are not affected. A few hard ones, like jet, ped, and riverbed
were improved by 0.1 - 0.2%.

Change-Id: Ia8d4e2ffb7136669e8ad1fb24ea6e8fdd6b9a3c1
2014-01-07 10:18:04 -08:00
Jingning Han
fdad4fd226 Remove deprecated variable from rt_speed_feature
This resolves a merge error.

Change-Id: Ifb83acc0a08e80c82f7624f9c86f79d3a86cc871
2014-01-07 10:15:51 -08:00
Dmitry Kovalev
16f5607dfe Merge "Adding new_mv local variable." 2014-01-07 09:56:41 -08:00
Dmitry Kovalev
7b496783c2 Merge "Adding get_ref_frame_buffer() function." 2014-01-07 09:56:06 -08:00
Dmitry Kovalev
b3af2f87b0 Merge "Removing unused mvp_fill manipulation code." 2014-01-07 09:54:05 -08:00
Jingning Han
656166ea81 Merge "Remove avoid_frame_with_high_error from RD loop" 2014-01-07 09:31:17 -08:00
Dmitry Kovalev
9e18cf70db Merge "Reusing ROUND_POWER_OF_TWO macro." 2014-01-07 02:40:02 -08:00
Paul Wilkins
e4e58ac400 Merge "Clean up: unused function and variables" 2014-01-07 02:27:20 -08:00
Dmitry Kovalev
6a7a7341ee Removing unused mvp_fill manipulation code.
The code can be removed because mvp_full will be overridden after that.

Change-Id: I89559b1b6914c86bcd02b7359d37241948ac11d3
2014-01-06 18:07:12 -08:00
Dmitry Kovalev
8e6583b1a2 Merge "Replacing &cpi->common with cm." 2014-01-06 17:58:26 -08:00
Dmitry Kovalev
c015ba5f6e Adding new_mv local variable.
Change-Id: I9631b35810c232c134f39dc0edadb1b3860a45ae
2014-01-06 17:58:01 -08:00
Dmitry Kovalev
ff655420b5 Reusing ROUND_POWER_OF_TWO macro.
Change-Id: I064ba32d5358bfbf080a4300fc1793b345080006
2014-01-06 17:38:57 -08:00
Dmitry Kovalev
abe4940d64 Replacing &cpi->common with cm.
Change-Id: Ic5bf5682ccdb8d2fbad6bba0d7db19a4f47b62a1
2014-01-06 17:29:16 -08:00
Alex Converse
7900c80e5a Merge "Fix encoding Raw yv12 and i420 from a pipe." 2014-01-06 17:22:21 -08:00
Marco Paniconi
166d8142ac Merge "Code cleanup: remove unneeded lines." 2014-01-06 16:35:52 -08:00
Alex Converse
64b89f1b4b Fix encoding Raw yv12 and i420 from a pipe.
rewind() does not work on pipes.

https://code.google.com/p/webm/issues/detail?id=678

Change-Id: I057f1e25c3f5662012d6e33ff4c97c88f50df357
2014-01-06 16:31:09 -08:00
Yaowu Xu
9aa16eecd0 Merge "Added placeholder for real time mode" 2014-01-06 16:26:57 -08:00
Marco Paniconi
3817c7c732 Code cleanup: remove unneeded lines.
Change-Id: I44a89b822a436299b9dd4ff26ad2e35767c29c58
2014-01-06 16:04:48 -08:00
Charles 'Buck' Krasic
11660c6b38 Merge "Write correct resolution to the IVF file header (b/11270652)" 2014-01-06 15:18:04 -08:00
Johann
195a085253 Remove yasm.rules dependency
The file was removed by 9152f4851d after
the solution files were changed.

Change-Id: I868c56fd609f45fb3e21afd085b9e6c268aac038
2014-01-06 15:12:42 -08:00
Dmitry Kovalev
a224b0dded Merge "Combining ref_frame and second_ref_frame into ref_frames[2]." 2014-01-06 15:02:31 -08:00
Dmitry Kovalev
29199efd57 Merge "Moving reset_segment_features() to encoder/vp9_segmentation.h." 2014-01-06 15:01:54 -08:00
Dmitry Kovalev
7919bf6afd Adding get_ref_frame_buffer() function.
Encapsulating direct references to lst_fb_idx, gld_fb_idx, alt_fb_idx.

Change-Id: I7e65ba3f131286e433e6651970c5647311fa4687
2014-01-06 14:50:54 -08:00
Dmitry Kovalev
bbb25e6a39 Merge "Adding RefBuffer struct." 2014-01-06 14:19:44 -08:00
Charles 'Buck' Krasic
8aa33ed6b1 Write correct resolution to the IVF file header (b/11270652)
also:
 o remove dead code, create_dummy_frame
 o Fix a bug in command line handling that caused a segfault if wrong
   number of arguments were given.

Change-Id: I78f026aee4e363967b750e6cde0982659c558a1f
2014-01-06 14:18:38 -08:00
Jingning Han
393a8ccef9 Remove avoid_frame_with_high_error from RD loop
The feature undergoes prior assumption that the recursive partition
size search from 4x4 to 64x64, hence utilizing information from small
blocks to determine early termination in large block rate-distortion
optimization search. The current codebase is now going from top down.
The previous function might go with not properly initialized values,
hence removed.

Tested on pedestrian_area_1080p at 4000 kbps running under speed 2.
No visible difference in runtime observed.

Change-Id: I553df415c6191413762db7ae34e8790c71d8118e
2014-01-06 13:34:07 -08:00
Dmitry Kovalev
b57b82b5d6 Using struct twopass_rc* instead of VP9_COMP*.
Change-Id: Id9ff7772aa3a3fb5d6cf94aff7dc9489bd964340
2014-01-06 12:46:23 -08:00
Dmitry Kovalev
6b150c2884 Combining ref_frame and second_ref_frame into ref_frames[2].
Change-Id: I007d66a1cb1b44751dcceafbaa64649ed9a34562
2014-01-06 12:24:37 -08:00
Deb Mukherjee
f73b21439d Merge "Corerctly sets frame type in the 2 pass case" 2014-01-06 12:01:30 -08:00
Yaowu Xu
a2c01ed5b4 Added placeholder for real time mode
Change-Id: I203d10f76c7ca78d875eaae15557cd765c6240d1
2014-01-06 11:57:25 -08:00
Dmitry Kovalev
4603f31d02 Moving reset_segment_features() to encoder/vp9_segmentation.h.
Change-Id: I0db4b31cb2382d4f6249eae0a8f42d227ad0ac57
2014-01-06 11:31:57 -08:00
Dmitry Kovalev
a9deec4389 Merge "Moving get_scan() call out of decode_coeffs() function." 2014-01-06 10:50:16 -08:00