Merging these functions allows merging some loops, which makes the results (particularly after SIMD optimizations) much faster. (cherry picked from commit f8bed30d8b176fa030f6737765338bb4a2bcabc9)