This prevents a duplicate memcpy of a 128-byte struct every time
set_scale_factors() is called (which is a lot), thus leading to a
decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block
RD/partition loop.
Overall, this decreases encoding time of the first 50 frames of bus
@ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5%
overall speedup. We can likely get more gains by removing the copy
of the other struct as well (and replacing it with an index lookup).
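
As a rough illustration of the copy-versus-index idea mentioned above,
here is a minimal sketch. All type and function names in it
(scale_factors_t, block_ctx_t, select_scale_by_copy,
select_scale_by_index) are hypothetical and are not taken from the
codec source; only the 128-byte struct size comes from the message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 128-byte scale-factor struct. */
typedef struct {
  int x_scale_fp;
  int y_scale_fp;
  unsigned char pad[120];
} scale_factors_t;

enum { NUM_REFS = 3 };

typedef struct {
  scale_factors_t table[NUM_REFS]; /* filled once, e.g. per frame */
  scale_factors_t copy;            /* copy style: overwritten on each call */
  const scale_factors_t *active;   /* index style: only a pointer changes */
} block_ctx_t;

/* Copy style: moves sizeof(scale_factors_t) bytes on every call. */
static void select_scale_by_copy(block_ctx_t *ctx, int ref) {
  assert(ref >= 0 && ref < NUM_REFS);
  memcpy(&ctx->copy, &ctx->table[ref], sizeof(ctx->copy));
}

/* Index style: no struct bytes are moved, only a pointer is stored. */
static void select_scale_by_index(block_ctx_t *ctx, int ref) {
  assert(ref >= 0 && ref < NUM_REFS);
  ctx->active = &ctx->table[ref];
}
```

Both helpers expose the same values to the caller; the second one
simply avoids touching the struct contents, which is where a saving
of this kind would come from.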
Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a
These arrays have constant values (they are never updated). Removing
the two corresponding memcpy calls. Making a little cleanup in
vp9_entropymode.h as well: removing the redundant 'extern' keyword and
moving all function declarations to the end.
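
A minimal sketch of the pattern, assuming a made-up table: when an
array is never modified, callers can read the const table directly
instead of reading a per-context copy made with memcpy. The names
kDefaultTree, old_ctx_t, old_init, old_lookup and new_lookup and the
table values are hypothetical, not the actual arrays touched here.

```c
#include <assert.h>
#include <string.h>

enum { TREE_SIZE = 4 };

/* Hypothetical constant table; the values are illustrative only. */
static const int kDefaultTree[TREE_SIZE] = { 2, 4, 6, 8 };

/* Before: each context carried its own copy of the constant data. */
typedef struct {
  int tree_copy[TREE_SIZE];
} old_ctx_t;

static void old_init(old_ctx_t *c) {
  memcpy(c->tree_copy, kDefaultTree, sizeof(kDefaultTree));
}

static int old_lookup(const old_ctx_t *c, int i) {
  assert(i >= 0 && i < TREE_SIZE);
  return c->tree_copy[i];
}

/* After: no copy and no init step; read the const table directly. */
static int new_lookup(int i) {
  assert(i >= 0 && i < TREE_SIZE);
  return kDefaultTree[i];
}
```

Because the copy is never written to after initialization, both
lookups always return the same value, so the copy can be dropped.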
Change-Id: Ia16b38b46aec2e2500f5df29c40a297ae241dede
Call the individually optimized horizontal and vertical functions. This
implementation abuses the temp buffer.
This will be replaced with a custom optimized function.
Over 2x speedup.
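
The general two-pass pattern can be sketched as below. This is not
the code from this change: it uses a trivial 2-tap rounding average
in place of the real filter, and the names filter_horiz, filter_vert,
filter_2d and MAX_DIM are hypothetical. The horizontal pass writes
one extra row into a temp buffer so the vertical pass has the row it
needs below the block.

```c
#include <assert.h>

enum { MAX_DIM = 64 };

/* Horizontal 2-tap average: reads columns x and x + 1 of each row. */
static void filter_horiz(const unsigned char *src, int src_stride,
                         unsigned char *dst, int dst_stride, int w, int h) {
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      dst[y * dst_stride + x] = (unsigned char)(
          (src[y * src_stride + x] + src[y * src_stride + x + 1] + 1) >> 1);
}

/* Vertical 2-tap average: reads rows y and y + 1 of each column. */
static void filter_vert(const unsigned char *src, int src_stride,
                        unsigned char *dst, int dst_stride, int w, int h) {
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      dst[y * dst_stride + x] = (unsigned char)(
          (src[y * src_stride + x] + src[(y + 1) * src_stride + x] + 1) >> 1);
}

/* 2D filter built from the two 1D passes via a temp buffer.
 * src must provide (w + 1) columns and (h + 1) rows. */
static void filter_2d(const unsigned char *src, int src_stride,
                      unsigned char *dst, int dst_stride, int w, int h) {
  unsigned char temp[(MAX_DIM + 1) * MAX_DIM];
  assert(w > 0 && w <= MAX_DIM && h > 0 && h <= MAX_DIM);
  filter_horiz(src, src_stride, temp, MAX_DIM, w, h + 1); /* h + 1 rows */
  filter_vert(temp, MAX_DIM, dst, dst_stride, w, h);
}
```

Reusing the existing 1D routines this way costs an extra round trip
through the temp buffer, which is what a dedicated 2D function would
avoid.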
Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd
Removing tile_rows and tile_columns from VP9Common, removing redundant
constants MIN_TILE_WIDTH and MAX_TILE_WIDTH, changing signature of
vp9_get_tile_n_bits.
Change-Id: I8ff3104a38179b2c6900df965c144c1d6f602267
Cosmetic code changes, renaming 'flat' local var to 'mask', removing
unused field 'blim' from loopfilter_info_n and loop_filter_info structs.
Change-Id: I51e6ccf727fe361ad9a08e29e1201aa7abd4987f
This commit enables the SSE2 implementation of the 16x16 inverse
ADST/DCT hybrid transform. The runtime goes from 5742 cycles to 1821
cycles. This provides about 1% encoding speed-up at speed 0.
Change-Id: I1678d0988bf30b9efd524877705bbb3645edb17b
Making the implementation of vp9_set_pred_flag_{seg_id, mbskip}
consistent with vp9_get_segment_id without using the confusing
sub(a, b) macro. Passing mi_row and mi_col to functions explicitly
instead of relying on mb_to_right_edge and mb_to_bottom_edge.
Change-Id: I54c1087dd2ba9036f8ba7eb165b073e807d00435
In the prior code, the above context pointers used for entropy
decoding were initialized on the first frame, and not updated when
the frame size changed. The per-frame code which initializes the
contexts assumes that the contexts are contiguous, leading to an
incomplete initialization when the frame is smaller. This commit
updates the pointers so that the context is contiguous whenever
the frame size changes.
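
The failure mode and the fix can be sketched as follows, under
simplified assumptions: one byte per context entry, three planes, and
hypothetical names (above_ctx_t, above_ctx_alloc, above_ctx_set_size,
above_ctx_reset) that do not come from the decoder source. The
per-frame reset clears one contiguous span starting at plane 0, so
the per-plane pointers must be recomputed from the current width.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { NUM_PLANES = 3 };

typedef struct {
  unsigned char *base;              /* one allocation, sized for max width */
  unsigned char *above[NUM_PLANES]; /* per-plane views into base */
  int cols;                         /* current width in context units */
  int max_cols;                     /* width the allocation was sized for */
} above_ctx_t;

/* Allocate once for the largest width. Returns 0 on success. */
static int above_ctx_alloc(above_ctx_t *c, int max_cols) {
  assert(max_cols > 0);
  c->base = malloc((size_t)NUM_PLANES * (size_t)max_cols);
  c->max_cols = max_cols;
  c->cols = 0;
  return c->base ? 0 : -1;
}

/* Call whenever the frame size changes: re-pack the plane pointers so
 * that the planes are back to back for the current width. */
static void above_ctx_set_size(above_ctx_t *c, int cols) {
  assert(cols > 0 && cols <= c->max_cols);
  c->cols = cols;
  for (int p = 0; p < NUM_PLANES; ++p)
    c->above[p] = c->base + (size_t)p * (size_t)cols;
}

/* Per-frame reset: relies on the planes being contiguous. */
static void above_ctx_reset(above_ctx_t *c) {
  memset(c->above[0], 0, (size_t)NUM_PLANES * (size_t)c->cols);
}
```

If above_ctx_set_size() were skipped after shrinking from 8 to 4
columns, plane 1 would still start at base + 8 while the reset only
clears the first 12 bytes, leaving stale context in planes 1 and 2.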
Change-Id: I08b53e3a30c8289491212311682ff1b8028cff6c