Although glibc gets by with an 8-byte mbstate_t, OpenBSD uses 12 bytes (of
the 128 bytes it reserves!).
We can actually implement UTF-8 encoding/decoding with a 0-byte mbstate_t
which means we can make things work on LP32 too, as long as we accept the
limitation that the caller needs to present us with a complete sequence
before we'll process it.
Our behavior is fine when going from characters to bytes; we just
update the source wchar_t** to say how far through the input we got.
I'll come back and use the 4 bytes we do have to cope with byte sequences
split across multiple input buffers. The fact that we don't support
UTF-8 sequences longer than 4 bytes plus the fact that the first byte of
a UTF-8 sequence encodes the length means we shouldn't need the other
fields OpenBSD used (at the cost of some recomputation in cases where a
sequence is split across buffers).
This patch also makes the minimal changes necessary to setlocale(3) to
make us behave like glibc when an app requests UTF-8. (The difference
being that our "C" locale is the same as our "C.UTF-8" locale.)
Change-Id: Ied327a8c4643744b3611bf6bb005a9b389ba4c2f
Use the ANDROID_DATA environment variable instead of the hard-coded
directory for these benchmarks.
Change-Id: I00bae7b4a24e81e77fc8f52e1fe99f4d4918f520
If you rewrite the tokens of a #if you need to rewrite the expression to match
because either might be used later. This was showing up as SIGRTMAX being
rewritten in a #define but not in the #ifndef that guarded it, for which case
I've added a unit test.
Change-Id: I6929675461a1afe272edd667594529fd84a3dc4d
__SIGRTMIN will continue to tell the truth. This matches glibc's
behavior (as evidenced by the fact that we don't need a special case
in the strsignal test now).
Change-Id: I1abe1681d516577afa8cd39c837ef12467f68dd2
This also gets us the C99 wcstoimax and wcstoumax, and a working fgetwc and
ungetwc, all of which are needed in the implementation.
This also brings several other files closer to upstream.
Change-Id: I23b025a8237a6dbb9aa50d2a96765ea729a85579
This replaces a partial set of non-functional functions with a complete
set of functions, all of which actually work.
This requires us to implement mbsnrtowcs and wcsnrtombs which completes
the set of what we need for libc++.
The mbsnrtowcs is basically a copy & paste of wcsnrtombs, but I'm going
to go straight to looking at using the OpenBSD UTF-8 implementation rather
than keep polishing our home-grown turd.
(This patch also opportunistically switches us over to upstream btowc,
mbrlen, and wctob, since they're all trivially expressed in terms of
other functions.)
Change-Id: I0f81443840de0f1aa73b96f0b51988976793a323