Commit Graph

2375 Commits

Author SHA1 Message Date
sigiesec
e2d3ba9c62 Problem: classification ZMQ_HANDSHAKE_FAILED_* events is coarse-grained and partially misleading
Solution: redesign ZMQ_HANDSHAKE_FAILED_* events, introduce new class of ZMQ_HANDSHAKE_FAILED_AUTH events
2017-08-18 09:17:59 +02:00
sigiesec
f9985708b7 Problem: unreachable code in zap_client_t
Solution: replaced unreachable code by assertions and adapted uses
2017-08-17 12:54:05 +02:00
sigiesec
f107b53768 Problem: deviating behavior regarding monitoring events between mechanisms
Solution: move relevant behavior to zap_client_t
2017-08-17 12:10:00 +02:00
sigiesec
8dce0396fb Problem: inconsistent handling of ZAP replies
Solution: unification, pulled up common behaviour to zap_client_t/zap_client_common_handshake_t
2017-08-17 09:44:05 +02:00
sigiesec
8c58ef7f5c Problem: zap_msg_available duplicated between curve_server_t and plain_server_t (with deviating behaviour)
Solution: pull up into zap_client_common_handshake_t, along with handle_zap_status_code and error_detail/current_error_detail
2017-08-16 18:05:36 +02:00
sigiesec
314a3acfa9 Problem: status method duplicated between curve_server_t and plain_server_t
Solution: extract into new intermediate base class zap_client_common_handshake_t
2017-08-16 18:05:36 +02:00
sigiesec
ebba815a4d Problem: duplicate but equivalent state enums in curve_server_t and plain_server_t
Solution: pull state enum up to zap_client_t and unify names of enum values
2017-08-16 18:05:35 +02:00
sigiesec
414c6f45b8 Problem: receive_and_process_zap_reply is duplicated in all mechanisms
Solution: extract receive_and_process_zap_reply into zap_client_t and convert zap_client_t into base class of the server mechanism classes
2017-08-16 18:05:35 +02:00
sigiesec
d7a3778387 Problem: plain_server_t duplicates zap_client_t::send_zap_request
Solution: Use zap_client_t::send_zap_request
2017-08-16 18:05:35 +02:00
sigiesec
014b201d3e Problem: ZAP message without credentials is not terminated
Solution: Set more flag depending on presence of credentials
2017-08-16 18:05:35 +02:00
sigiesec
b324c66b6f Problem: null_mechanism duplicates zap_client_t::send_zap_request\nSolution: use zap_client_t::send_zap_request 2017-08-16 18:05:34 +02:00
sigiesec
f3884f3380 Problem: gssapi_server_t duplicates zap_client_t::send_zap_request
Solution: Use zap_client_t::send_zap_request
2017-08-16 18:04:31 +02:00
sigiesec
6e8a0b31be Problem: ZAP client code is duplicated in all mechanisms
Solution: created a zap_client_t class, extracted first function send_zap_request from curve_server_t
2017-08-16 18:04:30 +02:00
Simon Giesecke
4a18f6204c Problem: Possible buffer overruns related to metadata in various mechanisms (#2683)
* Problem: no test case with CURVE encryption and large identity

Solution: added test case (currently crashing)

* Problem: possible buffer overflow in mechanism_t::add_property

Solution: add target buffer length parameter and check the buffer is sufficiently large

* Problem: test cases accidentally excluded from build

Solution: remove #if/#endif

* Problem: possible buffer overruns related to metadata at various locations

Solution: allocate buffer large enough for actual metadata, reduce code duplication

* Problem: syntax error related to pointer type conversion

Solution: change argument type of make_command_with_basic_properties to const char *

* Problem: large metadata may cause an assertion in produce_initiate

Solution: Allow metadata of arbitrary size in produce_initiate
2017-08-15 18:42:31 +01:00
Simon Giesecke
d5e4319edc [WIP, do not merge] Problem: insufficient tests for ZMTP-CURVE protocol errors (#2680)
* Extracted connect_vanilla_socket function

* Problem: no tests for ZMTP-CURVE protocol errors

Solution: added two test cases with erroneous HELLO commands

* Problem: insufficient tests for ZMTP-CURVE protocol errors

Solution: added two test cases with erroneous HELLO command version

* Problem: test HELLO message is invalid apart from deliberate errors

Solution: create cryptographically correct HELLO message
add tweetnacl.c to test_security_curve

* Problem: nonce is incorrect, build fails with GCC

Solution: use correct non prefix

* Problem: make builds are failing

Solution: transfer CMake changes to (auto)make files

* Problem: nonce is incorrect, build fails with GCC

Solution: use correct non prefix

* Problem: make builds are failing

Solution: transfer CMake changes to (auto)make files

* Problem: no test with INITIATE command with invalid length

Solution: added test case

* Problem: code duplication between test_security_curve.cpp and curve_client.cpp

Solution: extracted parts of zmq::curve_client_t::produce_hello into reusable function

* Problem: code duplication between test_security_curve.cpp and curve_client.cpp

Solution: extracted further parts of zmq::curve_client_t into reusable functions
added missing file

* Problem: mechanism_t::add_property can be declared static

Solution: declare mechanism_t::add_property static

* Problem: intermediate crypto data needs to be passed between static function calls to curve_client_tools_t

Solution: add non-static member functions

* Problem: msg_t instance may be closed twice

Solution: remove offending close

* Problem: prepare_hello uses static curve_client_tools_t::produce_hello

Solution: Use non-static curve_client_tools_t::produce_hello

* Problem: no test with invalid command name where INITIATE command is expected

Solution: added test case

* Problem: make builds are failing due to curve_client_tools.hpp not being found

Solution: add curve_client_tools.hpp to list of source files

* Problem: wrong initializer order in zmq::curve_client_t

Solution: reorder

* Problem: under non-Windows systems, test fails because random_open was not called

Solution: call random_open/random_close within test

* Problem: conflict between custom function htonll and macro definition on Darwin

Solution: define htonll function only if not defined as a macro

* Problem: nullptr not defined on all platforms

Solution: replace nullptr by NULL

* Problem: libsodium builds not working

Solution: adapt compile and link file sets for libsodium builds

* Problem: Makefile.am broken

Solution: Fix syntax

* Problem: no tests for garbage encrypted cookie or content in INITIATE

Solution: added test cases

* Problem: test cases accidentally excluded from build

Solution: remove #if/#endif

* Solution: some error cases are unreachable

Problem: for the time being, added some comments without changing the code

* Added comments on hard-to-test cases
2017-08-15 15:28:24 +01:00
Luca Boccassi
e376c81c2d Problem: SIGBUS on SPARC64
Solution: force the compiler to make the atomic_counter_t alignment
friendly.
This will ensure that the pointers inside the buffers allocated by
shared_message_memory are aligned, at the cost of growing the memory
size of atomic_counter_t from 4 to 8 bytes on 64 bit (when not using
mutexes).
Note that although content_t contains an atomic_counter_t, the
compiler already padded the struct so there is no change in the
buffer sizes used by the engines, save for the extra 4 bytes for the
buffer's own single atomic counter.
Fixes #2588
2017-08-11 17:48:23 +01:00
Luca Boccassi
f0ae5e585c Problem: C++11 atomic API never used
Solution: remove requirement to manually define macro and just check
for the C++ supported version.
Note that compiler intrinsics still have priority if available, to
avoid changes unless necessary.
2017-08-11 17:47:18 +01:00
Alain O'Dea
454d1eda65
Problem: log for bad ZAP status code is confusing
Solution: log that handler sent bad status code to clarify ZAP debugging
2017-08-10 10:59:18 -02:30
Luca Boccassi
75fba539d0 Problem: PGM/NORM builds broken due to i_engine API change
Solution: implement get_endpoint in the NORM and PGM engines too
2017-08-08 15:18:07 +01:00
Simon Giesecke
a6cef4ef86 Problem: ZAP status codes != 200 do not result in an appropriate monitor event (#2665)
* Problem: missing test for status code 300, inadequate assertion for status code 500

Solution: add test, change assertion (currently test fails)

* Problem: gcc compiler error deprecated conversion from string constant

Solution: declare variable as const

* Problem: in case of ZAP handler returning a status code other than 200, no appropriate event is emitted

Solution: immediately emit event after receiving reply from ZAP handler

* Problem: endpoint address is not included in zap-reply monitor event

Solution: added functions to retrieve endpoint address in zmq::i_engine and zmq::session_base_t
removed unused code block in zmq::stream_engine_t::next_handshake_command

* Problem: wrong formatting

Solution: fix formatting

* Problem: test fails because of EPIPE

Solution: add EPIPE/ECONNRESET/ECONNAGAIN handling for more test cases
2017-08-08 13:10:20 +01:00
Luca Boccassi
3046fe2053 Problem: getrandom usage breaks build
Solution: add missing flags parameter
2017-08-07 08:43:25 +02:00
Luca Boccassi
dc51ebeb1f Problem: valgrind shows unitialised errors in backtrace print
Solution: fix them
2017-08-05 18:04:15 +01:00
Simon Giesecke
41108b203e Problem: zmq::curve_server_t::produce_error sends sizeof std::string instead of status code length
Solution: send status code length (always 3) instead
2017-08-04 16:54:46 +02:00
Simon Giesecke
9949965717 Problem: Property names are duplicated at several places
Solution: Define them in zmq.h and use them (currently in DRAFT API)
2017-08-04 10:33:51 +02:00
Simon Giesecke
c191909c0e Problem: Misleading error code in case ZAP handler sends an invalid status code (#2646)
Solution: Use EPROTO instead of EACCES error code in that case
2017-08-03 14:20:35 +01:00
Simon Giesecke
5d4e30eb13 Replace console output by monitoring events for curve security issues (#2645)
* Fixing #2002 one way of doing it

 * Mechanisms can implement a new method `error_detail()`
 * This error detail have three values for the moment: no_detail
 (default), protocol, encryption.
    + generic enough to make sense for all mechanisms.
    - low granularity level on information.

* Fixing #2002: implementation of the error details

The ZMQ_EVENT_HANDSHAKE_FAILED event carries the error details
as value.

* Removed Microsoft extenstion for enum member access

This was leading to compilation error under linux.

* Adaptation of CURVE test cases

* Monitoring event: changed API for detailed events

Removed ZMQ_EVENT_HANDSHAKE_FAILED and replaced it by:
- ZMQ_EVENT_HANDSHAKE_FAILED_NO_DETAIL,
- ZMQ_EVENT_HANDSHAKE_FAILED_PROTOCOL,
- ZMQ_EVENT_HANDSHAKE_FAILED_ENCRYPTION

Adaptation of text case `security_curve`

* Removed event value comparison

This was introduced for the previous API model adaptation

* Removed the prints in std output and added missing details

`current_error_detail` was not set in every protocol error cases

* Fixed initialization of current_error_detail

* Fixed error in greeting test case

The handshake failure due to mechanism mismatch in greeting is actually
a protocol error. The error handling method consider it like so and
send a protocol handshake failure monitoring event instead of no_detail.

Fixed the test_security_curve expectation as well.

* Upgraded tests of monitoring events

The tests check the number of monitoring events received

* Problem: does not build under Linux or without ZMQ_DRAFT_API

Solution:
- properly use ZMQ_DRAFT_API conditional compilation
- use receive timeouts instead of Sleep

* Problem: duplicate definition of variable 'timeout'

Solution: merged definitions

* Problem: inconsistent timing dependencies

Solution: reduce timing dependency by using timeouts at more places

* Problem: assertion failure under Linux due to unexpected monitor event

Solution: output event type to aid debugging

* Problem: erroneous assertion code

* Problem: assertion failure with a garbage server key due to an extra third event

Solution: changed assertion to expect three events (needs to be checked)

* Problem: extra include directive to non-existent file

Solution: removed include directive

* Problem: assertion failure on appveyor for unknown reason

Solution: improve debug output

* Problem: no build with libsodium and draft api

Solution: add build configurations with libsodium and draft api

* Problem: assertion failure on CI

Solution: change assertion to reflect actual behaviour on CI (at least temporarily)

* Problem: error in condition in assertion code

* Problem: assertion failure on CI

Solution: generalize assertion to match behavior on CI

* Problem: assertion failures on CI

Solution: removed inconsistent assertion on no monitor events before flushing
improved debuggability by converting function into macro

* Problem: diverging test code for three analogous test cases with garbage key

Solution: extract common code into function

* Problem: does not build without ZMQ_BUILD_DRAFT_API

Solution: introduce dummy variable

* Attempt to remove workaround regarding ZMQ_EVENT_HANDSHAKE_FAILED_NO_DETAIL again

* Problem: EAGAIN error after handshake complete if there is no more data in inbuffer

Solution: Skip tcp_read attempt in that case

* Problem: handshaking event emitted after handshaking failed

Solution: use stream_engine_t::handshaking instead of mechanism_t::status() to determine whether still handshaking

* Include error code in debug output

* Improve debugging output: output flushed events

* Split up ZMQ_EVENT_HANDSHAKE_FAILED_PROTOCOL into ZMQ_EVENT_HANDSHAKE_FAILED_ZMTP and ZMQ_EVENT_HANDSHAKE_FAILED_ZAP

* Fixed compilation without ZMQ_BUILD_DRAFT_API

* Renamed ZMQ_EVENT_HANDSHAKE_SUCCEED to ZMQ_EVENT_HANDSHAKE_SUCCEEDED for language consistency

* Renamed ZMQ_EVENT_HANDSHAKE_SUCCEED to ZMQ_EVENT_HANDSHAKE_SUCCEEDED for language consistency

* Renamed ZMQ_EVENT_HANDSHAKE_SUCCEED to ZMQ_EVENT_HANDSHAKE_SUCCEEDED for language consistency

* Fixed assert_monitor_event (require event instead of allowing no event)
Reverted erroneous change to handshaking condition
Renamed test_wrong_key to test_garbage_key
Generalized assumption in test_garbage_key to allow for ZMQ_EVENT_HANDSHAKE_FAILED_NO_DETAIL with error == EPIPE

* Better isolate test cases from each other by providing a fresh context & server for each

* Added diagnostic output

* Changed assertion to reflect actual behavior on CI

* Fixed formatting, observe maximum line length

* Fixed formatting, observe maximum line length

* Increase timeout to check if this fixes valgrind run

* Close server with close_zero_linger

* Increase timeout to check if this fixes valgrind run

* Increase timeout to check if this fixes valgrind run

* Generalize assertion to also work with valgrind

* Fixed formatting

* Add more diagnostic output

* Generalize assertion to also work with valgrind
2017-08-03 14:15:56 +01:00
Luca Boccassi
bb0b518e7f Problem: ZMQ_BINDTODEVICE not used for ZMQ_DISH
Solution: apply the option outside of the send/recv_enabled blocks so
that it is used for all types of UDP sockets
2017-07-31 16:31:31 +01:00
Luca Boccassi
415bdbc1b9 Problem: ZMQ_BINDTODEVICE is DRAFT but not DRAFT
Solution: move definition in the DRAFT section of the header
2017-07-31 16:31:31 +01:00
Brian Russell
b963542e8f Add socket option BINDTODEVICE
Linux now supports Virtual Routing and Forwarding (VRF) as per:

https://www.kernel.org/doc/Documentation/networking/vrf.txt

In order for an application to bind or connect to a socket with an
address in a VRF, they need to first bind the socket to the VRF device:

    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);

Note "dev" is the VRF device, eg. VRF "blue", rather than an interface
enslaved to the VRF.

Add a new socket option, ZMQ_BINDTODEVICE, to bind a socket to a device.
In general, if a socket is bound to a device, eg. an interface, only
packets received from that particular device are processed by the socket.

If device is a VRF device, then subsequent binds/connects to that socket
use addresses in the VRF routing table.
2017-07-31 15:31:47 +01:00
Luca Boccassi
fbb6bbdcb8 Problem: reading from /dev/urandom is clunky
Solution: if available use the getrandom function as it doesn't
require any synchronization, state or cleanup
2017-07-28 11:28:19 +01:00
Luca Boccassi
2626fdfa23 Problem: tweetnacl leaks file descriptor on fork+exec
Solution: open with O_CLOEXEC if available or set FD_CLOEXEC if not
2017-07-28 11:27:55 +01:00
Luca Boccassi
e015a0f8b9 Problem: fd leak in tweetnacl with one ctx per thread
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes #2632
2017-07-28 11:27:53 +01:00
Luca Boccassi
a7bf010ee2 Problem: misleading indentation in tweetnacl.c
Solution: fix it
2017-07-27 21:04:43 +01:00
pavel.pimenov
dfd9d48496 Suppress C4324 (VC++2017)
'zmq::command_t': structure was padded due to alignment specifier
https://msdn.microsoft.com/en-us/library/92fdk6xx.aspx
2017-07-27 07:06:19 +03:00
Eamonn Coughlan
cfb59dde21 Problem: can't set IPV6_V6ONLY on OpenBSD
Solution: skip setsockopt call resulting in EINVAL
2017-07-22 22:53:12 +02:00
Marc Sune
b7b89a8f60 Fix ROUTER's xhas_out() in MANDATORY mode
Before this commit, xhas_out() was returning true regardless. This
was correct before the ZMQ_ROUTER_MANDATORY flag as introduced.
However, ZMQ_POLLOUT.

With this commit, _if_ ZMQ_ROUTER_MANDATORY is set, xhas_out() will
return false if ALL peer's outgoing pipes are full.

There is an outstanding high-level design question:

If ZMQ_ROUTER_MANDATORY is set, and zmq_poll() waits for ZMQ_POLLOUT
events, zmq_poll() will immediately wake up if only 1 pipe has
room to send, regardless of the peer, creating a busy loop of
zmq_poll() wake-up, zmq_send() (EAGAIN). There is no way for
the application to selectively wait for ZMQ_POLLOUT for specific
peer(s), which seems somehow necessary in ZMQ_ROUTER_MANDATORY.

This discussion will be addressed in a separate issue.

Signed-off-by: Marc Sune <marc@voltanet.io>
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
2017-07-14 15:55:58 +02:00
Luca Boccassi
d04065b778 Problem: CURVE server (connect) fails when client rebinds
Solution: if a CURVE server is using zmq_connect, the same session
will be used for any client "reconnect" (actual binds). This is
acceptable, so do not assert if zap_pipe already exists during the
handshake, but simply reuse it.
Fixes #2608
2017-07-01 17:37:07 +01:00
bjovke
9ef34addb8 Problem: When using print_backtrace() on Linux with libunwind, printout of stack traces from multiple threads are interleaved. Solution: added static mutex to serialize printing of stack traces. 2017-06-27 20:29:08 +02:00
bjovke
69355730a4 Problem: intermittent memory leak for req/rep send/recv. #2602 Solution: memory leak fixed. 2017-06-27 20:15:08 +02:00
Luca Boccassi
3536c4b9c4 Problem: XPUB_MANUAL subscriptions not removed on peer term
Solution: remove the pipe from the real trie when a peer disconnects.
Also add a unit test that exercises the behaviour by reconnecting
a different socket and sending a message that matches.
Fixes #2601 and introduced by #2042
2017-06-22 01:02:08 +01:00
Luca Boccassi
6ad0b08da9 Problem: GSSAPI can no longer be used without ZAP
Solution: do not fail if ZAP is not enabled.
GSSAPI already provides authentication and can be used separately,
so it is a valid use case.
2017-06-13 22:56:49 +01:00
Luca Boccassi
0ce18eac25 Problem: CURVE can no longer be used without ZAP
Solution: revert change that made ZAP mandatory.
The "Stonehouse" pattern, where CURVE is used only for encryption and
without authentication, is a valid use case so we should still
support it.
Also restore CURVE testing in the test_heartbeat.

Fixes #2594
2017-06-13 22:56:32 +01:00
Luca Boccassi
33695d1da8 Problem: ZAP is allowed to be configured incorrectly or not to work
Solution: if inproc://zeromq.zap.01 exists, which means ZAP is
enabled, abort immediately if it cannot be used (eg: out of memory)
or it is configured incorrectly (eg: wrong socket type).
Otherwise authentication failures will simply be ignored and
unauthorised peers will be allowed to slip in.
2017-06-13 22:56:31 +01:00
sunddy
af598f2e1c fix bug: dish client does not resend subscriptions to radio server after radio server restart
problem: for zmq radio/dish pattern, if the radio process restarts, the dish will not resend subscriptions to radio. And the result is that the dish will never receive any more messages.

solution: in session_base_t::reconnect (), take ZMQ_DISH into consideration when invoking hiccup method.
2017-06-12 12:26:21 +08:00
laplaceyang
67a6594fc0 fix bug: coredump if set linger and immediate together
In function session_base_t::reconnect, if we set immediate to 1 and set linger, we will get into first block of reconnect function, and set pipe to NULL, but we forget to cancel timer of linger. Once timer tiggered, we will get coredump. Solution: cancel timer in the end of set pipe to NULL
2017-06-02 11:36:41 +08:00
rkfg
72b4b6830f Problem: abort at socket creation on Android with jzmq
Solution: don't set thread name on Android

Setting a thread name on Android may fail with "permission
denied" error and abort the process due to failed assertion.
Tested on Android 5 and 6 (two phones).
Strangely enough, it only happens on signed APKs and is fine
in debug. Using JeroMQ is not an option as we need TCP keepalive
settings and authentication which JeroMQ doesn't support.
2017-05-17 15:05:37 +03:00
Luca Boccassi
bdc676f687 Problem: REP leaves label msgs for dead REQ in pipe
Solution: roll back the pipe if writing messages other than the
first fails in router::xsend. Roll it back also when the pipe is
terminating.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes #2567
2017-05-17 09:18:15 +01:00
BJovke
1489fc1ac5 Revert "Problem: REP leaves label msgs for dead REQ in pipe" 2017-05-16 11:20:03 +02:00
Luca Boccassi
0999fdd885 Problem: REP leaves label msgs for dead REQ in pipe
Solution: roll back the pipe if writing messages other than the
first fails in router::xsend.
Also add test case that reproduces the memory leak when ran with
valgrind.
Fixes #2567
2017-05-10 23:44:03 +01:00
KIU Shueng Chuan
d11f501dc1 problem: not using official api FD_ZERO to init fd_set
solution: fix it

In particular, on Windows, using FD_ZERO is much more efficient than
zeroing out the whole structure.
2017-05-06 08:03:09 +08:00