This commit replaces the SAD cost with a modeled rate-distortion cost
for non-RD mode decision. It translates the prediction residual
SSE into estimated rate and reconstruction distortion costs, thereby
capturing the effect of the quantization setting. The compression
performance of speed -7 for the rtc set is improved by 14.79%.
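As a rough illustration of how a residual SSE can be mapped to rate and
distortion estimates that depend on the quantizer, the sketch below uses
a textbook high-rate approximation: uniform quantization noise of
qstep^2 / 12 per sample and 0.5 * log2(variance / noise) bits per sample.
This is an assumption for illustration, not the model used by this
commit. The names model_rd_from_sse(), rd_cost(), approx_log2() and all
parameters are hypothetical.

```c
#include <stdint.h>

/* Piecewise-linear log2 for x >= 1; exact at powers of two.
 * Used here only to keep the sketch free of a libm dependency. */
static double approx_log2(double x) {
  double result = 0.0;
  while (x >= 2.0) {
    x *= 0.5;
    result += 1.0;
  }
  return result + (x - 1.0); /* linear interpolation on [1, 2) */
}

typedef struct {
  double rate_bits; /* estimated bits for the block */
  double dist;      /* estimated reconstruction SSE */
} ModelRd;

/* Hypothetical model: map residual SSE and quantizer step size to
 * estimated rate and distortion for a block of num_pixels samples. */
static ModelRd model_rd_from_sse(uint64_t sse, int num_pixels, double qstep) {
  ModelRd out;
  const double var = (double)sse / num_pixels;
  const double quant_noise = qstep * qstep / 12.0;
  if (var <= quant_noise) {
    /* Residual is assumed to quantize to zero: no bits are spent and
     * the whole residual energy remains as distortion. */
    out.rate_bits = 0.0;
    out.dist = (double)sse;
  } else {
    out.rate_bits = num_pixels * 0.5 * approx_log2(var / quant_noise);
    out.dist = num_pixels * quant_noise;
  }
  return out;
}

/* Lagrangian cost D + lambda * R. */
static double rd_cost(ModelRd m, double lambda) {
  return m.dist + lambda * m.rate_bits;
}
```

With the same SSE, a coarser quantizer moves the estimate from "spend
bits, low distortion" toward "spend nothing, keep the residual as
distortion", which is the quantization effect that a plain SAD cost
cannot express.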
Change-Id: Ifda014eb0501d13109fe7f92680bf1410b463632
Set speed features before running frame encoding. This avoids
redundant RD threshold calculation in key frame coding.
Change-Id: If8e3cf2c02976baa59b310c1c23af9eea0c46e36
- Change type of encrypt_buffer() offset argument to ptrdiff_t, and change the
type of the size argument to size_t.
- Update the size argument of encrypt_buffer() in vp8_boolcoder_test.c
to match.
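A minimal sketch of what such a signature could look like, assuming a
simple XOR scheme with a 16-byte key. The function name
xor_encrypt_buffer(), the key kTestKey and the argument order are
hypothetical and are not taken from this commit. The point is only that
a byte count is naturally a size_t, while a position that may be
negative relative to a base is naturally a ptrdiff_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte key used only for this illustration. */
static const uint8_t kTestKey[16] = {
  0x01, 0x12, 0x23, 0x34, 0x45, 0x56, 0x67, 0x78,
  0x89, 0x9a, 0xab, 0xbc, 0xcd, 0xde, 0xef, 0xf0
};

/* XOR `size` bytes of `buffer` with the key, where `offset` is the
 * position of buffer[0] within the overall stream. */
static void xor_encrypt_buffer(uint8_t *buffer, size_t size,
                               ptrdiff_t offset) {
  size_t i;
  for (i = 0; i < size; ++i) {
    /* Converting the offset to size_t wraps negative values modulo a
     * power of two, so the masked key index always stays in 0..15. */
    buffer[i] ^= kTestKey[((size_t)offset + i) & 15];
  }
}
```

Because XOR is its own inverse, calling the function twice with the same
offset restores the original bytes, and encrypting a sub-range with its
stream offset gives the same bytes as encrypting the whole buffer.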
Change-Id: Ie29c7c82c73318bee01b89c6fb4c4e1442eef03c
The core motion estimation functions now all consistently return SAD.
The only exception is vp9_full_pixel_diamond(); however, the core diamond
and refining search routines called from vp9_full_pixel_diamond() also
return SAD. If variance of the prediction error + mv cost is desired, it
must be calculated explicitly outside these functions. For very fast
encoding, hopefully this will eliminate some redundant computations.
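The split described above can be sketched as follows: the search routine
returns only the best SAD, and a caller that wants the variance of the
prediction error computes it explicitly afterwards. This is a simplified
exhaustive search, not the diamond or pattern search in the encoder. The
names Mv, block_sad(), full_search_sad() and block_variance() are
hypothetical, and the mv cost term is omitted.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
  int row;
  int col;
} Mv;

/* Sum of absolute differences between a w x h source and reference block. */
static unsigned int block_sad(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int w, int h) {
  unsigned int sad = 0;
  int r, c;
  for (r = 0; r < h; ++r)
    for (c = 0; c < w; ++c)
      sad += (unsigned int)abs(src[r * src_stride + c] -
                               ref[r * ref_stride + c]);
  return sad;
}

/* Exhaustive full-pixel search in [-range, range]. Returns the best SAD
 * and writes the winning vector to *best_mv. The caller must ensure the
 * whole search window lies inside the reference buffer. */
static unsigned int full_search_sad(const uint8_t *src, int src_stride,
                                    const uint8_t *ref_center, int ref_stride,
                                    int w, int h, int range, Mv *best_mv) {
  unsigned int best_sad = UINT_MAX;
  int dr, dc;
  for (dr = -range; dr <= range; ++dr) {
    for (dc = -range; dc <= range; ++dc) {
      const uint8_t *cand = ref_center + dr * ref_stride + dc;
      const unsigned int sad =
          block_sad(src, src_stride, cand, ref_stride, w, h);
      if (sad < best_sad) {
        best_sad = sad;
        best_mv->row = dr;
        best_mv->col = dc;
      }
    }
  }
  return best_sad;
}

/* Variance of the prediction error, computed separately by the caller.
 * Also reports the raw SSE through *sse. */
static unsigned int block_variance(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride,
                                   int w, int h, unsigned int *sse) {
  int64_t sum = 0;
  uint64_t sq = 0;
  int r, c;
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq += (uint64_t)(diff * diff);
    }
  }
  *sse = (unsigned int)sq;
  return (unsigned int)(sq - (uint64_t)((sum * sum) / (w * h)));
}
```

A caller that only needs a coarse decision can stop at the returned SAD;
only callers that need the variance pay for the second pass.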
This change also suggests reimplementing FAST_HEX with the
vp9_pattern_search framework. The result is not exactly the same as the
existing FAST_HEX, but performance is slightly better and speed is very
similar. It also enables removing a lot of duplicate code.
Change-Id: I152736393438c25bdf7e96b37cbb8ce330f4f94a
This patch adds a new speed feature that skips the rather
expensive entropy context lookup and the save to the table while
doing costing.
The speedup on desktop36p.y4m is around 10%; on other clips it is much
less. On the RTC test set the overall datarate change was +1%.
Change-Id: Ia5144bbf45270671e7be9c8e4055369909e2f738
This gets more accurate mode hit stats. It's also the first step toward
handling the case where ZEROMV is not allowed more intelligently.
Change-Id: I5de6734507b5177bf73e9ddbad923f218c39f3e4
intra_y_mode_mask is already enforced for the sub8x8 case.
intra_uv_mode_mask is already enforced for all sizes.
Change-Id: Ia9dd14701cb49873c2e8f24eb5f8b255eaf76a1f
The function has evolved over time and now only calls vp9_rtcd(), so this
commit removes the function and calls vp9_rtcd() directly.
Change-Id: I8cfa6190daa4b28f6f3d1e11bb3a07f9c95322bf
Optimize two functions to process 32 elements in parallel instead of 16:
1. vp9_sub_pixel_avg_variance64x64
2. vp9_sub_pixel_avg_variance32x32
Both of these functions previously called
vp9_sub_pixel_avg_variance16xh_ssse3. They now call
vp9_sub_pixel_avg_variance32xh_avx2, which is written in AVX2 and
processes 32 elements in parallel.
This optimization gave an 80% function-level gain and a 2% user-level gain.
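The structure of this change can be sketched in portable C: a 64x64
wrapper walks the block in column strips and accumulates the sum and SSE
returned by a strip kernel, so widening the kernel from 16 to 32 columns
halves the number of kernel calls without changing the result. The
scalar loop below only stands in for the SIMD kernel, and the sub-pixel
interpolation step is left out for brevity. The names avg_strip() and
avg_variance64x64() are hypothetical.

```c
#include <stdint.h>

/* Scalar stand-in for a strip kernel (16 or 32 columns wide in the SIMD
 * versions). Averages the predictor with a second predictor, compares
 * against the source, returns the sum of differences and writes the
 * sum of squared differences to *sse. */
static int avg_strip(const uint8_t *pred, const uint8_t *second,
                     const uint8_t *src, int stride,
                     int width, int height, unsigned int *sse) {
  int sum = 0;
  unsigned int sq = 0;
  int r, c;
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int avg = (pred[r * stride + c] + second[r * stride + c] + 1) >> 1;
      const int diff = avg - src[r * stride + c];
      sum += diff;
      sq += (unsigned int)(diff * diff);
    }
  }
  *sse = sq;
  return sum;
}

/* 64x64 wrapper: strip_width selects how many columns each kernel call
 * covers (16 for the old path, 32 for the new one in this sketch). */
static unsigned int avg_variance64x64(const uint8_t *pred,
                                      const uint8_t *second,
                                      const uint8_t *src, int stride,
                                      int strip_width,
                                      unsigned int *sse_out) {
  int64_t sum = 0;
  unsigned int sse = 0;
  int col;
  for (col = 0; col < 64; col += strip_width) {
    unsigned int strip_sse;
    sum += avg_strip(pred + col, second + col, src + col, stride,
                     strip_width, 64, &strip_sse);
    sse += strip_sse;
  }
  *sse_out = sse;
  /* 64 * 64 = 4096 pixels, hence the shift by 12. */
  return sse - (unsigned int)((sum * sum) >> 12);
}
```

Calling the wrapper with strip_width 16 or 32 gives identical variance
and SSE; only the number of kernel invocations differs (four versus
two per 64-wide block).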
Change-Id: Iea694654e1b7612dc6ed11e2626208c2179502c8