Solution:
- Add checks for **poller_p_ to ensure that we do not segfault when either it
or the value within it are NULL
- Add tests for the above and increase error state coverage
Solution:
Mark them with LIBZMQ_UNUSED macro as per convention; although in future the
appropriate pthread code should be updated to support thread scheduling
priorities (for Mac OS X, et. al.)
The TIPC protocol bindings in ZeroMQ defaults to a lookup domain
of 1.0.0 to prevent 'closest first' search, and instead always
do round robin if several sockets in the network or node have
the same name published. In retrospect, this might have been a
bad idea because it won't work on standalone configurations.
We solve this by allowing an optional domain suffix to be provided
in the address, and 0.0.0 should be used in that case, or if the
TIPC address range in the cluster configuration is defined to some
other value. Domain suffixes are only relevant for connecting
addresses.
Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Solution:
- Add check for the [count] parameter in zmq_sendiov() and zmq_recviov()
- Use and add test for zmq_sendiov() in tests/test_iov.cpp
- Add error state tests for zmq_sendiov() in tests/test_iov.cpp
- Add error state tests for zmq_recviov() in tests/test_iov.cpp
- Cleanup tests/test_iov.cpp for style, consistency and clarity
- Generally improve test coverage for both API methods
Hat-tip:
@somdoron, @bluca
Solution: try to resolve the TCP endpoint passed by the user in the
zmq_unbind call before giving up, if it doesn't match.
This fixes a breakage in the API, where after a call to
zmq_bind(s, "tcp://127.0.0.1:9999") with IPv6 enabled on s would
result in the call to zmq_unbind(s, "tcp://127.0.0.1:9999") failing.
Add more test cases to increase coverage on all combinations of TCP
endpoints.
Problem:
Conditional logic in check_protocol() that checks if a protocol is supported,
is duplicated twice. Moreover, the first set of checks to ascertain if a
protocol is supported is done regardless of whether the particular protocol
will be built into the library or not.
Solution:
* Simplify/collapse all supported protocol checks into one in check_protocol()
* Enclose pgm/epgm/norm socket+protocol match checks with requisite macros
Solution: return -1 (no event) instead of 0 (event)
For some reason, this just returns 0 if there are no sockets registered
on the poller. Usually this would mean there has been an event. So the
caller would have to check the return value AND the event, or write code
that takes the number of registered sockets into consideration.
By returning -1 and setting errno = ETIMEDOUT like in the usual timeout
cases, it's more consistent and convenient.
Test case included.
Solution: if options.use_fd do not create temporary random
directory for ipc://*, since the socket is already created and
passed to the library by the user.
Solution: use the less nice but correct int constant 1000000000
instead of the shorter 1E9 to avoid a compiler warning when assigning
to timespec.tv_nsec, which is a long int.
Solution: in the Windows-specific ifdef in tcp_listener set_address,
check for error and set errno only after the IPv4 fallback has failed
too, to avoid setting errno when the socket creation succeeds through
the fallback.
Solution: if opening an IPv6 TCP socket fails because IPv6 is not
available, try to open an IPv4 socket instead when creating and
connecting a TCP endpoint.
Solution: if opening an IPv6 TCP socket fails because IPv6 is not
available, try to open an IPv4 socket instead when creating and
binding a TCP endpoint.
Problem: Since pull request #1730 was merged, protocol for REQ socket is
checked at the session level and this check does not take into account
the possibility of a request_id being part of the message. Thus the option
ZMQ_REQ_CORRELATE would no longer work.
This is now fixed: the possiblity of a 4 bytes integer being present
before the delimiter frame is taken into account (whether or not this
breaks the REQ/REP RFC is another issue).
A Visual Studio build from master (commit id: dac5b45dfb) using the v140_xp toolset yields a binary that is not XP compatible.
Two libraries contain exports that cannot be found:
- IPHLPAPI.DLL : if_nametoindex
- KERNEL32.DLL : InitializeConditionVariable
The latter export is already dealt with in the file './src/condition_variable.hpp'; however this requires setting the _WIN32_WINNT pre-processor definition.
I am not experienced enough to figure a work around for the 'if_nametoindex' method, so I have created a new pre-processor definition 'ZMQ_HAVE_WINDOWS_TARGET_XP' and removed the calling of the function with the limitation that these builds cannot handle a IPv6 address with an adapter name.
To make it easier for people targeting XP with an MSVC build I have modified the MSBuild property file to add/modify the pre-processor definitions if they are building using a XP targeting tool set; such as v140_xp.
libsodium calls abort() when /dev/urandom can't be found
even if one creates ZeroMQ context before calling chroot()[1].
This happens because crypto gets initialized on handshake,
and at that moment the process is already chroot'ed.
Solution: initialize cryptographic libraries in ctx
randombytes_close() is already there in the destructor.
[1] https://download.libsodium.org/doc/usage/index.html
Problem: when using ZMQ_REQ_RELAXED + ZMQ_REQ_CORRELATE and two 'send' are
executed in a row and no server is available at the time of the sends,
then the internal request_id used to identify messages gets corrupted and
the two messages end up with the same request_id. The correlation no
longer works in that case and you may end up with the wrong message.
Solution: make a copy of the request_id instance member before sending it
down the pipe.
Solution: socket_base_t::in_event cannot do anything useful with
return status of process_commands. Asserting is the wrong solution,
as it is entirely valid to be interrupted or for the context to be
terminated, so discard the value.
Solution: remove statc initialization to NULL of thread.hpp pthread_t
descriptor. There is no portable way to statically initialize a
pthread_t variable.
Solution: The Coverity Static Code Analyzer was used on libzmq code and found
many issues with uninitialized member variables, some redefinition of variables
hidding previous instances of same variable name and a couple of functions
where return values were not checked, even though all other occurrences were
checked (e.g. init_size() return).
Solution: Corrected Toolset setting where needed and inprove compilation speed
by adding defintion of WIN32_LEAN_AND_MEAN prior to any Windows specific
include files, which skips non-essential definitions during compilation.
Solution: disable the warnings on this file only
We use pragmas wrapped in compiler conditionals. This will need
extending to non-gcc/msvc compilers. We could also fix the warnings
in the code, though I suspect it's not really possible.
libzmq used to switch off pedantic checks when using tweetnacl. As
this is now the default, that means pedantic checks are always off.
This is not good.
Solution: in tweetnacl.c alone, use a GCC pragma to disable sign
comparison warnings. We could also clean the code up yet this is
simpler. In other code, we still want those warnings, hence I've
used a pragma rather than global compile option.
Second, use -Wno-long-long all the time, as this warning does not
work with a pragma.
I removed code that set -wno-long-long, for MinGW and Solaris.
Related problem 2: --with-relaxed is badly named
This option switches off pedantic checks, so should be called
--disable-pedantic. 'with' is for optional packages.
Solution: always initialised zmq::options_t class variables arrays to
avoid reading uninitialised data when CURVE is not yet configured and
a getsockopt ZMQ_CURVE_{SERVER | PUBLIC | SECRET]KEY is issued.
It's all over the place.
Solution: remove duplicates and try to move main includes to start
of source. Also, include net/if.h always, so that the code will
compile if ZMQ_HAVE_IFADDRS isn't defined.
- they have no copyright / license statement
- they are in some randomish directory structure
- they are a mix of postable and non-portable files
- they do not conform to conditional compile environment
Overall, it makes it rather more work than needed, in build scripts.
Solution: clean up tweetnacl sauce.
- merged code into single tweetnacl.c and .h
- standard copyright header, DJB to AUTHORS
- moved into src/ along with all other source files
- all system and conditional compilation hidden in these files
- thus, they can be compiled and packaged in all cases
- ZMQ_USE_TWEETNACL is set when we're using built-in tweetnacl
- HAVE_LIBSODIUM is set when we're using external libsodium
It's unclear which we need and in the source code, conditional code
treats tweetnacl as a subclass of libsodium, which is inaccurate.
Solution: redesign the configure/cmake API for this:
* tweetnacl is present by default and cannot be enabled
* libsodium can be enabled using --with-libsodium, which replaces
the built-in tweetnacl
* CURVE encryption can be disabled entirely using --enable-curve=no
The macros we define in platform.hpp are:
ZMQ_HAVE_CURVE 1 // When CURVE is enabled
HAVE_LIBSODIUM 1 // When we are using libsodium
HAVE_TWEETNACL 1 // When we're using tweetnacl (default)
As of this patch, the default build of libzmq always has CURVE
security, and always uses tweetnacl.
And I'm on a reasonably sized laptop. I think allocating INT_MAX
memory is dangerous in a test case.
Solution: expose this as a context option. I've used ZMQ_MAX_MSGSZ
and documented it and implemented the API. However I don't know how
to get the parent context for a socket, so the code in zmq.cpp is
still unfinished.
These options are confusing and redundant. Their names suggest
they apply to the tcp:// transport, yet they are used for all
stream protocols. The methods zmq::set_tcp_receive_buffer and
zmq::set_tcp_send_buffer don't use these values at all, they use
ZMQ_SNDBUF and ZMQ_RCVBUF.
Solution: merge these new options into ZMQ_SNDBUF and ZMQ_RCVBUF.
This means defaulting these two options to 8192, and removing the
new options. We now have ZMQ_SNDBUF and ZMQ_RCVBUF being used both
for TCP socket control, and for input/output buffering.
Note: the default for SNDBUF and RCVBUF are otherwise 4096.
This option has a few issues. The name is long and clumsy. The
functonality is not smooth: one must set both this and
ZMQ_XPUB_VERBOSE at the same time, or things will break mysteriously.
Solution: rename to ZMQ_XPUB_VERBOSER and make an atomic option.
That is, implicitly does ZMQ_XPUB_VERBOSE.
Solution: be more explicit in the code, and in the zmq_recv man
page (which is the most unobvious case). Assert if length is not
zero and buffer is nonetheless null.
There's no value in this as the same pattern is repeated in several
places and it's fair to expect people to understand it.
Solution: revert to the old, one-liner style.
used static_cast<signed int> around WSA_WAIT_FAILED as it is an unsigned implicitly defined as (0xFFFFFFFF ion winbase.h) and causes a comparison error.
removed use of c++11 style initialiser list for 'sockaddr addr { 0 }' and changed it to 'sockaddr addr = { 0 }'
Solution: when iterating over a map and conditionally deleting
elements, an erased iterator gets invalidated. Call erase using postfix
increment on iterator to avoid using an invalid element in the next
iteration.
Solution: parse the value set by the ZMQ_PRE_ALLOCATED_FD sockopt
when creating a new TCP socket and use it if valid.
Add new tests/test_pre_allocated_fd_tcp.cpp unit test.
Solution: parse the value set by the ZMQ_PRE_ALLOCATED_FD sockopt
when creating a new IPC socket and use it if valid.
Add new tests/test_pre_allocated_fd_ipc.cpp unit test.
Solution: add new [set|get]sockopt ZMQ_PRE_ALLOCATED_FD to allow
users to let ZMQ use a pre-allocated file descriptor instead of
allocating a new one. Update [set|get]sockopt documentation and
test accordingly.
The main use case for this feature is a socket-activated systemd
service. For more information about this feature see:
http://0pointer.de/blog/projects/socket-activation.html
select was improved to support multiple service providers on Windows.
it should be slightly faster because of optimized iteration
over selected sockets.
This commit addresses the following warnings reported on gcc 5.2.1. In
the future, this will help reduce the "noise" and help catch warnings
revealing a serious problem.
It was originally introduce in the refactoring associated with
zeromq/libzmq@da2bc60 (Removing zmq_pollfd as it is replaced by zmq_poller).
8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---
/path/to/libzmq/src/socket_poller.cpp: In member function ‘int zmq::socket_poller_t::add(zmq::socket_base_t*, void*, short int)’:
/path/to/libzmq/src/socket_poller.cpp:92:51: warning: missing initializer for member ‘zmq::socket_poller_t::item_t::pollfd_index’ [-Wmissing-field-initializers]
item_t item = {socket_, 0, user_data_, events_};
^
/path/to/libzmq/src/socket_poller.cpp: In member function ‘int zmq::socket_poller_t::add_fd(zmq::fd_t, void*, short int)’:
/path/to/libzmq/src/socket_poller.cpp:108:50: warning: missing initializer for member ‘zmq::socket_poller_t::item_t::pollfd_index’ [-Wmissing-field-initializers]
item_t item = {NULL, fd_, user_data_, events_};
^
8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---
avoids leaking sockets due to multiple monitor calls on one socket
Alternative: raise error (not sure what errno; EADDRINUSE?) if collision detected; force manual stop.
When using ZMQ_REQ_RELAXED and a 'send' is executed after another 'send' the
previous code would terminate the 'reply_pipe' if any.
This is incorrect as terminating the reply pipe also terminates the send pipe
as they are the same (a pipe associated with a socket is bidirectional).
Doing a terminate on the pipe sets an internal flag called out_active to false
and the pipe can no longer send messages.
Removing the 'terminate' solves the problem. Removing this call is not an issue
as the incorrect ordering of messages that could be incurred is taken care of
by the ZMQ_REQ_CORRELATE option if needed.
These sockets don't handle multipart data, so if callers send it,
they drop frames, and things break silently.
Solution: if the caller tries to use ZMQ_SNDMORE, return -1 and
set errno to EINVAL.
If we're going to add CLASS-like APIs we should use the proper
syntax; specifically 'destroy' instead of 'close', which is a
hangover from the 'ZeroMQ is like sockets' model we're slowly
moving away from.
Solution: change zmq_timers_close(p) to zmq_timers_destroy(&p)
VMCI transport allows fast communication between the Host
and a virtual machine, between virtual machines on the same host,
and within a virtual machine (like IPC).
It requires VMware to be installed on the host and Guest Additions
to be installed on a guest.
This reduces chances of race between writer deactivation and activation.
Reader sends activation command to writer when number or messages is
multiple of LWM. In situation with high throughput (millions of messages
per second) and correspondingly large HWM (e.g. 10M) the difference
between HWM needs to be large enough - so that activation command is
received before pipe becomes full.
Only assert on errors we know are our fault,
instead of trying to whitelist every possible network-related failure.
This makes ZeroMQ more portable to other platforms
where the possible errors are different.
In particular, the previous code would often die under iOS.
See issue #1608.
This is an old issue with Windows 7. The effect is that we see a latency
ramp on the first 500 messages.
* The ramp is unaffected by message size.
* Sleeping up to 100msec between sends has no effect except to switch
off ZeroMQ batching so making the ramp more visible.
* After 500 messages, latency falls back down to ~10-40 usec.
* Over inproc:// the ramp happens when we use the signaler class.
* Client-server over inproc:// does not show the ramp.
* Client-server over tcp:// shows a similar ramp.
We know that the signaller is using TCP on Windows. We can 'prime' the
connection by doing 500 dummy sends. This potentially causes new sockets
to be delayed on creation, which is not a good solution.
Note that the signaller sends zero-byte messages. This may also be
confusing TCP.
Solution: flood the receive buffer when creating a new FD pair; send a
1M buffer and discard it.
Fixes#1608
This causes assertion failures after network reconnects.
Solution: allow EINVAL as a possible condition after read/write.
Fixes#829Fixes#1399
Patch provided by Michele Dionisio @mdionisio, thanks :)
In real world usage, there have been reported signaler failures where the
eventfd read() or socket recv() system call in signaler::recv() fails,
despite having made a prior successful signaler::wait() call.
this patch creates a signaler::recv_failable() method that allows
unreadable eventfd / socket to return an error without asserting.