This will make the normal C++ code not run on non-NEON devices at all, because with -mfpu=neon the compiler is free to emit NEON instructions anywhere, which makes the runtime CPU feature detection pointless. Adding -mfpu=neon to CFLAGS is not necessary; it's enough to add it when building those individual .S files (via ASMFLAGS).
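
To illustrate the split I mean, here is a rough sketch of the dispatch side, not the actual code in this patch. The names `sum_bytes`, `sum_scalar`, `sum_neon` and the `HAVE_NEON_ASM` macro are made up for the example, and the detection assumes 32-bit ARM Linux where `getauxval` is available; other platforms would need their own check. The C++ below is meant to be compiled without -mfpu=neon, and only the .S file that would provide `sum_neon` gets the flag.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_NEON
#endif

// Portable fallback. Built without -mfpu=neon, so it contains no NEON
// instructions and is safe on any ARM core.
std::uint32_t sum_scalar(const std::uint8_t* p, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

#if defined(HAVE_NEON_ASM)
// Hypothetical routine implemented in a separate .S file; that file is the
// only one assembled with -mfpu=neon. HAVE_NEON_ASM is an assumed macro
// the build would define when the .S file is linked in.
extern "C" std::uint32_t sum_neon(const std::uint8_t* p, std::size_t n);
#endif

// Runtime check. Assumption: 32-bit ARM Linux; everywhere else this
// sketch simply reports "no NEON".
bool has_neon() {
#if defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return false;
#endif
}

// Dispatcher: take the NEON path only when it was built and the CPU
// reports support; otherwise use the scalar fallback.
std::uint32_t sum_bytes(const std::uint8_t* p, std::size_t n) {
#if defined(HAVE_NEON_ASM)
    if (has_neon()) return sum_neon(p, n);
#endif
    return sum_scalar(p, n);
}
```

If -mfpu=neon were in CFLAGS instead, the compiler could vectorize `sum_scalar` or even `has_neon` itself with NEON instructions, and the process could hit an illegal instruction before the check ever runs.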