Currently we use __thread variable to store thread_local_dtors,
which makes tsan test fork_atexit.cc hang. The problem is as below:
The main thread creates a worker thread, the worker thread calls
pthread_exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init) -> emutls_init().
Then the main thread calls fork(), the child process cals
exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init).
So the child process is waiting for pthread_once(emutls_init)
to finish which will never occur.
It might be the test's fault because POSIX standard says if a
multi-threaded process calls fork(), the new process may only
execute async-signal-safe operations until exec functions are
called. And exit() is not async-signal-safe. But we can make
bionic more reliable by not using __thread in
__cxa_thread_finalize().
Bug: 25392375
Change-Id: Ife403dd7379dad8ddf1859c348c1c0adea07afb3