In basic client mode, set the origin and receive timestamp to zero.
This reduces the amount of information useful for fingerprinting and
improves privacy as the origin timestamp allows a passive observer to
track individual NTP clients as they move across networks. (With chrony
clients that assumes the timestamp wasn't reset by the chronyc offline
and online commands.)
This follows recommendations from the current version of IETF draft on
NTP data minimization [1].
The timestamp could be theoretically useful for enhanced rate limiting
which can limit individual clients behind NAT and better deal with DoS
attacks, but no server implementation is known to do that.
[1] https://tools.ietf.org/html/draft-ietf-ntp-data-minimization-01
In interleaved client mode, when so many consecutive requests were lost
that the first valid (interleaved) response would be dropped for being
too old, switch to basic mode so the response can be accepted if it
doesn't fail in the other tests.
This reworks commit 16afa8eb50.
In symmetric mode, don't send a packet in interleaved mode unless it is
the first response to the last valid request received from the peer and
there was just one response to the previous valid request. This prevents
the peer from matching the transmit timestamp with an older response if
it can't detect missed responses.
In interleaved symmetric mode, check if the remote TX timestamp is
before RX timestamp. Only the first response from the peer after
receiving a request should pass this test. Check also the interval
between last two remote transmit timestamps when we know the remote poll
can't be constrained by minpoll. Use the minimum of previous remote and
local poll as a lower bound of the actual interval between peer's
transmissions.
Check how many responses were missing before accumulating a sample using
old timestamps to avoid correcting the clock with an offset extrapolated
over a long interval.
This should be eventually done in sourcestats for all sources.
With the new selection of timestamps in the interleaved mode it's no
longer necessary to reverse the poll tracking in order to reduce the
local and remote intervals of measurements that makes the peer with
higher stratum.
This reverts commit 4a24368763.
Use previous local TX and remote RX timestamps for the new sample in the
interleaved mode if it will make the local and remote intervals
significantly shorter in order to improve the accuracy of the measured
delay.
Unlike in the basic mode, the peer with a higher stratum needs to wait
for a response before sending the next request in order to minimize the
delay of the measurement and error in the measured delay.
Slightly increase the delay adjustment to make it work with older chrony
versions.
Update the remote poll and remote stratum even for unsychronised peers,
and handle stratum of 0 as 16, so the peers work with the opposite
differences between their strata and can adjust their polling intervals
in order to interleave the packets.
When the poll value in a client request is smaller than the server's NTP
rate limiting interval, set poll in the response to the rate limiting
interval to suggest the client to increase its polling interval.
This follows ntpd as a server. No current client implementation seems to
be increasing its interval by the poll, but it may change in the future.
It was never used for anything and messages in debug output already
include filenames, which can be easily grepped if there is a need
to see log messages only from a particular file.
While the measurements log can be useful for debugging problems in NTP
configuration (e.g. authentication failures with symmetric keys), it
seems most users are interested only in valid measurements (e.g. for
producing graphs) and don't expect/handle entries where some of the RFC
5905 tests 1-7 failed. Modify the measurements log option to log only
valid measurements, and for debugging purposes add a new rawmeasurements
option.
When the server's transmit timestamp was updated with a kernel/HW
timestamp, it didn't include the time smoothing offset. If the offset
was larger than one second, the update failed and clients using the
interleaved mode received less accurate timestamps. If the update
succeeded, the clients received timestamps that were not adjusted for
the time smoothing offset, which added an error of up to 0.5s/1s to
their measured offset/delay.
Fix the update to include the smoothing offset in the new timestamp.
Before sending an NTP packet, check whether the TX timestamp is not
equal to the RX timestamp. If it is, generate a new TX timestamp and try
again. This is extremely unlikely to happen in normal operation, but it
is needed for reliable detection of the interleaved mode.
Instead of a worst-case delay use a mean value and relate it to the
source's time. This makes it more stable in the interleaved and
symmetric modes, which should improve the weighting and asymmetry
correction. Modify the test A and B to work with a minimum estimated
delay (delay - dispersion).
If the MAC in NTPv4 requests would be truncated, use version 3 by
default to avoid the truncation. This is necessary for compatibility
with older chronyd servers, which do not respond to messages with
truncated MACs.
In order to allow deterministic parsing of NTPv4 extension fields, the
MAC must not be longer than 192 bits (RFC 7822). One way to get around
this limitation when using symmetric keys which produce longer MACs is
to truncate them to 192 bits (32-bit key ID and 160-bit hash).
Modify the code to accept NTPv4 packets with MACs truncated to 192
bits, but still allow long MACs in NTPv4 packets to not break
compatibility with older chrony clients.
In a burst of three requests (two presend + one normal) the server can
detect the client is using the interleaved mode and save the transmit
timestamp of the second response for the third response. This shortens
the interval in which the server has to keep the state.
Rework the code to make a real request for presend and process the
response, but don't accumulate the sample. This allows presend to work
in the interleaved client mode.
Always allow update from the first valid response, even if its transmit
timestamp is not newer than the currently saved timestamp. This shoud
provide a temporary protection in the case where the attacker does have
an authenticated packet from future, but the peers are using the same
polling interval and the protocol is already synchronised. This could be
also useful in the case where the attacker cannot observe the traffic
and authentication is disabled.
A recently published paper [1] (section VIII) describes a DoS attack
on symmetric associations authenticated with a symmetric key where the
attacker can only observe and replay packets. Although the attacker
cannot prevent packets from reaching the other peer (not even by
flooding the network for example), s/he has the same power as a MitM
attacker.
As the authors explain, this is a fundamental flaw of the protocol,
which cannot be fixed in the general case. However, we can at least try
to protect associations in a case where the peers use the same polling
interval (i.e. for each request is expected one response) and all peers
that share the symmetric key never start with clocks in future or very
distant past (i.e. the attacker does not have any packets from future
that could be replayed).
Require that updates of the NTP state between requests have increasing
transmit timestamp and when a packet that passed all NTP tests to be
considered a valid response was received, don't allow any more updates
of the state from packets that don't pass the tests. This should ensure
the last update of the state is from the first time the last real
response was received and still allow the protocol to recover in case
one of the peers steps its clock back or the attacker does have a packet
from future and the attack stops.
[1] Aanchal Malhotra, Matthew Van Gundy, Mayank Varia, Haydn Kennedy,
Jonathan Gardner, and Sharon Goldberg. The Security of NTP's
Datagram Protocol. https://eprint.iacr.org/2016/1006
Adapt the interleaved symmetric mode for client/server associations.
On server, save the state needed for detection and responding in the
interleaved mode in the client log. On client, enable the interleaved
mode when the server is specified with the xleave option. Always accept
responses in basic mode to allow synchronization with servers that
don't support the interleaved mode, have too many clients, or have
multiple clients behing the same IP address. This is also necessary to
prevent DoS attacks on the client by overwriting or flushing the server
state. Protect the client's state variables against replay attacks as
the timestamps are now needed when processing the subsequent packet.
Add xleave option to the peer directive to enable an interleaved mode
compatible with ntpd. This allows peers to exchange transmit timestamps
captured after the actual transmission and significantly improve
the accuracy of the measurements.
Introduce a new structure for local timestamps that will hold the
timestamp with its estimated error and also its source (daemon, kernel
or HW). While at it, reorder parameters of the functions that accept the
timestamps.
Add new functions for processing of packets after they are actually
sent by the kernel or HW in order to get a more accurate transmit
timestamp. Rename old functions for processing of received packets and
their parameters to make the naming more consistent.
Replace struct timeval with struct timespec as the main data type for
timestamps. This will allow the NTP code to work with timestamps in
nanosecond resolution.
Crypto-NAK is useful only with Autokey where it allows quick reset
of the association. There is no plan to support Autokey and NTS will
specify its own message for authentication errors.
When selecting sources from a pool, ignore responses which didn't
produce a new sample. Sources with acceptable delay (as configured by
the maxdelay* options) should be prefered.
When a valid packet is received from an unsynchronised source (i.e. only
a test of leap, stratum or root distance failed), there is no point in
waiting for another packet or the RX timeout, and the client socket can
be immediately closed.
Add support for authenticating MS-SNTP responses in Samba (ntp_signd).
Supported is currently only the old MS-SNTP authenticator field. It's
disabled by default. It can be enabled with the --enable-ntp-signd
configure option and the ntpsigndsocket directive, which specifies the
location of the Samba ntp_signd socket.
When a received packet fails to authenticate, check if the digest
contains zeroes and treat it as an MS-SNTP packet with authenticator or
extended authenticator field. For now, discard these packets, i.e. don't
respond with a crypto-NAK.
Replace the flag that enables authentication using a symmetric key with
an enum. Specify crypto-NAK as a special mode used for responses instead
of relying on zero key ID. Also, rework check_packet_auth() to always
save the mode and key ID.
Add offset option to the server/pool/peer directive. It specifies a
correction which will be applied to offsets measured with the NTP
source. It's particularly useful to compensate for a known asymmetry in
network delay or timestamping errors.
If a special reference mode is enabled, always pass the test for
synchronization loop. This allows chronyd using the initstepslew
directive (or the -q/-Q option) to accept time from its own clients
after restart as is documented in the chrony.conf man page.
This was broken since update to NTPv4.
If local reference is active, return normal leap, but unsynchronised
status. Update the callers of the function to work with the leap
directly and not change their behaviour.
REF_IsLocalActive() is no longer needed.
When ntpd as an NTP server has active orphan mode, it doesn't update
its reference time and the reference timestamp may fail the NTP test
3 and 7. (https://bugs.ntp.org/show_bug.cgi?id=1098)
Remove both checks of the timestamp to allow chronyd to operate as
a client of ntpd server in the orphan mode. When ntpd is fixed and
old versions are no longer used, this may be reverted.
When a valid NTP reply is received, save the local address (e.g. from
IP_PKTINFO), so the reference ID which would the source use for this
host can be calculated when needed.
After restricting authentication of servers and peers to the specified
key, a short key in the key file is a security problem from the client's
point of view only if it's specified for a source.
When a server/peer was specified with a key number to enable
authentication with a symmetric key, packets received from the
server/peer were accepted if they were authenticated with any of
the keys contained in the key file and not just the specified key.
This allowed an attacker who knew one key of a client/peer to modify
packets from its servers/peers that were authenticated with other
keys in a man-in-the-middle (MITM) attack. For example, in a network
where each NTP association had a separate key and all hosts had only
keys they needed, a client of a server could not attack other clients
of the server, but it could attack the server and also attack its own
clients (i.e. modify packets from other servers).
To not allow the server/peer to be authenticated with other keys
extend the authentication test to check if the key ID in the received
packet is equal to the configured key number. As a consequence, it's
no longer possible to authenticate two peers to each other with two
different keys, both peers have to be configured to use the same key.
This issue was discovered by Matt Street of Cisco ASIG.
Instead of time_t use a 32-bit fixed point representation with 4-bit
fraction to save the time of the last hit. The rate can now be measured
up to 16 packets per second. Maximum interval between hits is about 4
years.
When the measured NTP or command request rate of a client exceeds
a threshold, reply only to a small fraction of the requests to reduce
the network traffic. Clients are allowed to send a burst of requests.
Try to detect broken clients which increase the request rate when not
getting replies and suppress the rate limiting for them.
Add ratelimit and cmdratelimit directives to configure the thresholds,
bursts and leak rates independently for NTP and command response rate
limiting. Both are disabled by default. Commands from localhost are
never limited.
Don't log NTP peer access and auth/bad command access. Also, change
types for logging number of hits from long to uint32_t. This reduces the
size of the node and allows more clients to be monitored in the same
amount of memory.
The meaning of the poll value in KoD RATE packets is not currently
defined in the NTP specification (RFC 5905). In the reference NTP
implementation it signals the minimum acceptable polling interval to the
clients. In chrony the minimum poll is set to the KoD RATE poll if it's
larger, but not to a larger value than 10.
The problem is that ntpd as a server sets the KoD RATE poll to the
maximum of the client's poll and the configured rate limiting interval.
An attacker can send a burst of spoofed packets to the server to trigger
the client's request rate limit. When the client sends its next request
and the server responds with a KoD RATE packet, the client will set its
minimum poll to the current poll and it will no longer be able to switch
to a shorter poll when needed.
ntpd could be fixed to always set the KoD RATE poll to the rate limiting
interval. Unfortunately, ntpd as a client seems to depend on the current
behavior. It tries to follow the server poll and if the KoD RATE poll
was shorter than the current poll, the polling interval would be
reduced, defeating the purpose of KoD RATE. The server fix will probably
need to wait until clients are fixed and that could take a very long
time.
For now, ignore the poll value in KoD RATE packets. Just add an extra
delay based on the current poll to the next transmit timeout and stop an
ongoing burst.
First packet after setting a source to online was sent with constant
delay (0.2s). If the period in which the source was offline was shorter
than the current polling interval, the new packet was sent sooner than
it would be if the source wasn't switched to offline and back.
Don't reset the local tx timestamp when mode is changed. When starting
the initial transmit timeout, adjust the delay to make the interval
between the two packets at least as long as the current polling
interval.
In client packets set the leap, stratum, reference ID, reference time,
root delay and root dispersion to constant values to not reveal the
state of the synchronization. Use precision 32 to make the receive and
transmit timestamps completely random and not reveal the local time.
Use UTI_GetRandomBytes() instead of random() to generate random bits
below precision. Save the result in NTP_int64 in the network order and
allow precision in the full range from -32 to 32. With precision 32
the fuzzing now makes the timestamp completely random and can be used to
hide the time.
After sending a client packet, schedule a timeout to close the socket
at the time when all server replies would fail the delay test, so the
socket is not open for longer than necessary (e.g. when the server is
unreachable). With the default maxdelay of 3 seconds the timeout is 7
seconds.