bionic/libc/upstream-openbsd
Elliott Hughes 5a0aa3dee2 Switch to a working UTF-8 mb/wc implementation.
Although glibc gets by with an 8-byte mbstate_t, OpenBSD uses 12 bytes (of
the 128 bytes it reserves!).

We can actually implement UTF-8 encoding/decoding with a 0-byte mbstate_t
which means we can make things work on LP32 too, as long as we accept the
limitation that the caller needs to present us with a complete sequence
before we'll process it.

Our behavior is fine when going from characters to bytes; we just
update the source wchar_t** to say how far through the input we got.
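
The characters-to-bytes direction described above can be illustrated with the standard wcsrtombs(3) call. This is a minimal sketch, not code from this commit: convert_chunk is a hypothetical helper name, and the example assumes ASCII-only input so that it behaves the same in any locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of bionic): convert as many wide
 * characters from *src as fit into dst, which holds cap bytes.
 * Returns the number of bytes written, not counting any terminating NUL.
 * On return, *src points just past the last wide character converted,
 * or is NULL if the terminating L'\0' was reached and converted.
 */
static size_t convert_chunk(const wchar_t **src, char *dst, size_t cap) {
    mbstate_t state;
    memset(&state, 0, sizeof state);  /* start in the initial conversion state */
    return wcsrtombs(dst, src, cap, &state);
}
```

Calling convert_chunk repeatedly with the same source pointer drains the input one buffer at a time; the caller needs no other bookkeeping because the updated pointer records how far the conversion got.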

I'll come back and use the 4 bytes we do have to cope with byte sequences
split across multiple input buffers. The fact that we don't support
UTF-8 sequences longer than 4 bytes plus the fact that the first byte of
a UTF-8 sequence encodes the length means we shouldn't need the other
fields OpenBSD used (at the cost of some recomputation in cases where a
sequence is split across buffers).
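
To illustrate why no saved state is needed when the caller supplies a complete sequence, here is a minimal sketch of a stateless decoder. It is not the code from this commit: utf8_sequence_length and utf8_decode are hypothetical names, and the return convention (-1 for invalid input, -2 for an incomplete sequence) is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the length of a UTF-8 sequence as implied by its
 * first byte. Returns 0 for a byte that cannot start a sequence of at
 * most 4 bytes (a continuation byte or an old-style 5/6-byte lead).
 */
static size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/*
 * Hypothetical helper: decode one code point from s[0..n).
 * Returns the number of bytes consumed, -1 if the bytes are not valid
 * UTF-8, or -2 if the buffer ends before the sequence is complete.
 * No state is kept between calls; everything is recomputed from s[0].
 */
static int utf8_decode(const unsigned char *s, size_t n, uint32_t *out) {
    static const unsigned char lead_mask[] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    static const uint32_t min_value[] = {0, 0, 0x80, 0x800, 0x10000};

    if (n == 0) return -2;
    size_t len = utf8_sequence_length(s[0]);
    if (len == 0) return -1;
    if (n < len) return -2;  /* the caller must present the whole sequence */

    uint32_t cp = s[0] & lead_mask[len];
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return -1;  /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min_value[len]) return -1;                   /* overlong form */
    if (cp > 0x10FFFF) return -1;                         /* out of range */
    if (cp >= 0xD800 && cp <= 0xDFFF) return -1;          /* surrogate */

    *out = cp;
    return (int)len;
}
```

For example, the three bytes E2 82 AC decode to U+20AC, while passing only the first two of them yields -2. A caller that receives -2 can retry from the same lead byte once more input arrives, which is the recomputation cost the message mentions.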

This patch also makes the minimal changes necessary to setlocale(3) to
make us behave like glibc when an app requests UTF-8. (The difference
being that our "C" locale is the same as our "C.UTF-8" locale.)

Change-Id: Ied327a8c4643744b3611bf6bb005a9b389ba4c2f
2014-05-01 14:46:54 -07:00
android Clean up and document our hacks for building BSD source. 2014-04-21 17:13:46 -07:00
lib/libc Switch to a working UTF-8 mb/wc implementation. 2014-05-01 14:46:54 -07:00
README.txt Clean up and document our hacks for building BSD source. 2014-04-21 17:13:46 -07:00

This directory contains upstream OpenBSD source. You should not edit these
files directly. Make fixes upstream and then pull down the new version of
the file.

TODO: write a script to automate this process.