- For experiment EXT_INTERP under high bit depth.
- Add unit test to verify bit-exact results.
- Speed improvement:
On Xeon E5-2680, park_joy_1080p_12.y4m, 50 frames, encoding time
drops from 6682503 ms to 5390270 ms (~19% reduction).
Change-Id: Iea4debf5414f3accf1eb5672abeab56a0539ac77
- Fix the overwrite bug in horizontal filtering when width = 2.
- Fix 10-tap vertical filtering so that it no longer reads one row of
pixels above the block.
- Fix 10-tap filter zero padding.
- Encoder slows down by ~4.0% compared to commit
81ad953 Convolution vertical filter SSSE3 optimization
Change-Id: I9bb294a4529300081c29bf284e6bc6eb081cc536
- Apply 8-pixel vertical filtering direction parallelism.
- Add unit tests to verify bit-exact results.
- Encoder speed improves ~29% (with EXT_INTERP enabled) on Xeon E5-2680.
- Combinational cycle count of vp10_convolve() drops from 26.06%
to 6.73%.
Change-Id: Ic1ae48f8fb1909991577947a8c00d07832737e57
- Apply signal direction/4-pixel vertical/8-pixel vertical
parallelism.
- Add unit test to verify bit-exact results.
- Overall encoding time improves ~24% on Xeon E5-2680 CPU.
Change-Id: I104dcbfd43451476fee1f94cd16ca5f965878e59