LLVM's integrated assembler does not accept spaces as macro argument delimiters when targeting Darwin. Using an explicit delimiter is a good idea in principle, since it makes cases like 'macro 4 -2' vs 'macro 4 - 2' unambiguous.
The values are positive powers of two, so just replace it with a right shift.
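A minimal C sketch of the idea, not the code touched by this change: the helper name `div_pow2` and its `log2_d` parameter are hypothetical, and the example assumes unsigned operands. For an unsigned `x` and a divisor `d = 1 << log2_d`, `x / d` and `x >> log2_d` give the same result.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: divide x by a divisor known to be 2^log2_d.
 * For unsigned x this equals x / (1u << log2_d). */
static unsigned div_pow2(unsigned x, unsigned log2_d)
{
    /* Shifting by the full width of the type or more is undefined in C. */
    assert(log2_d < sizeof(unsigned) * CHAR_BIT);
    return x >> log2_d;
}
```

The equivalence does not carry over unchanged to negative signed dividends: integer division truncates toward zero, whereas an arithmetic right shift rounds toward negative infinity.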
Approximately as fast as the ARM NEON version on Apple's A7.