ZMQ_ROUTER_NOTIFY doesn't have a context and doesn't play nice with protocols. with ZMQ_DISCONNECT_MSG we can set it to a protocol message, like DISCONNECT in majordomo. Router will send it when a peer is disconnected. Another advantage of ZMQ_DISCONNECT_MSG is that it also works on inproc.
Together with ZMQ_HEARTBEAT it allows to build very reliable protocols, and much simpler as well.
When using ZMQ_HEARTBEAT one still needs to implement application-level heartbeat in order to know when to send a hello message.
For example, with the majordomo protocol, the worker needs to send a READY message when connecting to a broker. If the connection to the broker drops, and the heartbeat recognizes it the worker won't know about it and won't send the READY msg.
To solve that, the majordomo worker still has to implement heartbeat. With this new option, whenever the connection drops and reconnects the hello message will be sent, greatly simplify the majordomo protocol, as now READY and HEARTBEAT can be handled by zeromq.
Solution: fix handling of _starting and _terminate flags
Add tests for this situation.
Clarify documentation of zmq_ctx_shutdown and zmq_socket.
Fixes#3792
Solution:
1. Use optional name parameter in thread_t::start for operating
systems that have thread names.
2. Give start_thread() an optional name parameter for the
thread's name. If this parameter is set, it will be appended to "0MQ:".
If not set, "0MQ" will be used as the thread's name.
3. Give epoll the ability to name its thread. Then use this in
io_thread and reaper to name them.
Solution: remove documents and tests for ZMQ_THREAD_PRIORITY getter. It
never worked and can never work as it has the same value as a get-only
option ZMQ_SOCKET_LIMIT. It cannot be changed without breaking ABI.
Note that the setter works fine as ZMQ_SOCKET_LIMIT is get-only.
The zero copy decoding strategy implemented for 4.2.0 can lead to a large
increase of main memory usage in some cases (I have seen one program go up to
40G from 10G after upgrading from 4.1.4). This commit adds a new option to
contexts, called ZMQ_ZERO_COPY_RECV, which allows one to switch to the old
decoding strategy.
* Background thread scheduling
- add ZMQ_THREAD_AFFINITY ctx option; set all thread scheduling options
from the context of the secondary thread instead of using the main
process thread context!
- change ZMQ_THREAD_PRIORITY to support setting NICE of the background
thread when using SCHED_OTHER
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes#2632
Solution: don't set thread name on Android
Setting a thread name on Android may fail with "permission
denied" error and abort the process due to failed assertion.
Tested on Android 5 and 6 (two phones).
Strangely enough, it only happens on signed APKs and is fine
in debug. Using JeroMQ is not an option as we need TCP keepalive
settings and authentication which JeroMQ doesn't support.
Solution: use pthread API to set the name. For now call every thread
"ZMQ b/g thread". Would be nice to number the I/O threads and name
explicitly the reaper thread, but in reality a bit of internal API
churn would be necessary, so perhaps it's not worth it.
This is useful when debugging a process with many threads.
Solution: add a zmq_assert to check if the ephemeral sockets created
to drain the queue of pending inproc connecting sockets was allocated
successfully.
Solution: check if the connecting inproc socket has been closed
before trying to send the identity.
Otherwise the pipe will be in waiting_for_delimiter state causing
writes to fail and the connect to assert when the context is being
torn down and the pending inproc connects are resolved.
Add test case that covers this behaviour.