1. Make the feature test work by excluding known-deficient processors, so
we don't have to maintain a complete list of all the processors that support
REV and REV16.
2. Don't abuse 'register' to get an effect similar to GCC's +l constraint,
but which was unnecessarily restrictive.
3. Fix __swap64md so _x isn't clobbered, breaking 64-bit swaps.
4. Make <byteswap.h> (which declars bswap_16 and friends) use <endian.h>
rather than <sys/endian.h>, so we get the machine-dependent implementations.
Change-Id: I6a38fad7a9fbe394aff141489617eb3883e1e944
ARMv6 ISA has several instructions to handle data in different byte order.
For endian conversion (byte swapping) of single data words, it might be a
good idea to use the REV/REV16 instruction simply.
Change-Id: Ic4a5ed6254e082763e54aa70d428f59a0088636e