Replace struct timeval with struct timespec as the main data type for
timestamps. This will allow the NTP code to work with timestamps in
nanosecond resolution.
Crypto-NAK is useful only with Autokey where it allows quick reset
of the association. There is no plan to support Autokey and NTS will
specify its own message for authentication errors.
When selecting sources from a pool, ignore responses which didn't
produce a new sample. Sources with acceptable delay (as configured by
the maxdelay* options) should be prefered.
When a valid packet is received from an unsynchronised source (i.e. only
a test of leap, stratum or root distance failed), there is no point in
waiting for another packet or the RX timeout, and the client socket can
be immediately closed.
Add support for authenticating MS-SNTP responses in Samba (ntp_signd).
Supported is currently only the old MS-SNTP authenticator field. It's
disabled by default. It can be enabled with the --enable-ntp-signd
configure option and the ntpsigndsocket directive, which specifies the
location of the Samba ntp_signd socket.
When a received packet fails to authenticate, check if the digest
contains zeroes and treat it as an MS-SNTP packet with authenticator or
extended authenticator field. For now, discard these packets, i.e. don't
respond with a crypto-NAK.
Replace the flag that enables authentication using a symmetric key with
an enum. Specify crypto-NAK as a special mode used for responses instead
of relying on zero key ID. Also, rework check_packet_auth() to always
save the mode and key ID.
Add offset option to the server/pool/peer directive. It specifies a
correction which will be applied to offsets measured with the NTP
source. It's particularly useful to compensate for a known asymmetry in
network delay or timestamping errors.
If a special reference mode is enabled, always pass the test for
synchronization loop. This allows chronyd using the initstepslew
directive (or the -q/-Q option) to accept time from its own clients
after restart as is documented in the chrony.conf man page.
This was broken since update to NTPv4.
If local reference is active, return normal leap, but unsynchronised
status. Update the callers of the function to work with the leap
directly and not change their behaviour.
REF_IsLocalActive() is no longer needed.
When ntpd as an NTP server has active orphan mode, it doesn't update
its reference time and the reference timestamp may fail the NTP test
3 and 7. (https://bugs.ntp.org/show_bug.cgi?id=1098)
Remove both checks of the timestamp to allow chronyd to operate as
a client of ntpd server in the orphan mode. When ntpd is fixed and
old versions are no longer used, this may be reverted.
When a valid NTP reply is received, save the local address (e.g. from
IP_PKTINFO), so the reference ID which would the source use for this
host can be calculated when needed.
After restricting authentication of servers and peers to the specified
key, a short key in the key file is a security problem from the client's
point of view only if it's specified for a source.
When a server/peer was specified with a key number to enable
authentication with a symmetric key, packets received from the
server/peer were accepted if they were authenticated with any of
the keys contained in the key file and not just the specified key.
This allowed an attacker who knew one key of a client/peer to modify
packets from its servers/peers that were authenticated with other
keys in a man-in-the-middle (MITM) attack. For example, in a network
where each NTP association had a separate key and all hosts had only
keys they needed, a client of a server could not attack other clients
of the server, but it could attack the server and also attack its own
clients (i.e. modify packets from other servers).
To not allow the server/peer to be authenticated with other keys
extend the authentication test to check if the key ID in the received
packet is equal to the configured key number. As a consequence, it's
no longer possible to authenticate two peers to each other with two
different keys, both peers have to be configured to use the same key.
This issue was discovered by Matt Street of Cisco ASIG.
Instead of time_t use a 32-bit fixed point representation with 4-bit
fraction to save the time of the last hit. The rate can now be measured
up to 16 packets per second. Maximum interval between hits is about 4
years.
When the measured NTP or command request rate of a client exceeds
a threshold, reply only to a small fraction of the requests to reduce
the network traffic. Clients are allowed to send a burst of requests.
Try to detect broken clients which increase the request rate when not
getting replies and suppress the rate limiting for them.
Add ratelimit and cmdratelimit directives to configure the thresholds,
bursts and leak rates independently for NTP and command response rate
limiting. Both are disabled by default. Commands from localhost are
never limited.
Don't log NTP peer access and auth/bad command access. Also, change
types for logging number of hits from long to uint32_t. This reduces the
size of the node and allows more clients to be monitored in the same
amount of memory.
The meaning of the poll value in KoD RATE packets is not currently
defined in the NTP specification (RFC 5905). In the reference NTP
implementation it signals the minimum acceptable polling interval to the
clients. In chrony the minimum poll is set to the KoD RATE poll if it's
larger, but not to a larger value than 10.
The problem is that ntpd as a server sets the KoD RATE poll to the
maximum of the client's poll and the configured rate limiting interval.
An attacker can send a burst of spoofed packets to the server to trigger
the client's request rate limit. When the client sends its next request
and the server responds with a KoD RATE packet, the client will set its
minimum poll to the current poll and it will no longer be able to switch
to a shorter poll when needed.
ntpd could be fixed to always set the KoD RATE poll to the rate limiting
interval. Unfortunately, ntpd as a client seems to depend on the current
behavior. It tries to follow the server poll and if the KoD RATE poll
was shorter than the current poll, the polling interval would be
reduced, defeating the purpose of KoD RATE. The server fix will probably
need to wait until clients are fixed and that could take a very long
time.
For now, ignore the poll value in KoD RATE packets. Just add an extra
delay based on the current poll to the next transmit timeout and stop an
ongoing burst.
First packet after setting a source to online was sent with constant
delay (0.2s). If the period in which the source was offline was shorter
than the current polling interval, the new packet was sent sooner than
it would be if the source wasn't switched to offline and back.
Don't reset the local tx timestamp when mode is changed. When starting
the initial transmit timeout, adjust the delay to make the interval
between the two packets at least as long as the current polling
interval.
In client packets set the leap, stratum, reference ID, reference time,
root delay and root dispersion to constant values to not reveal the
state of the synchronization. Use precision 32 to make the receive and
transmit timestamps completely random and not reveal the local time.
Use UTI_GetRandomBytes() instead of random() to generate random bits
below precision. Save the result in NTP_int64 in the network order and
allow precision in the full range from -32 to 32. With precision 32
the fuzzing now makes the timestamp completely random and can be used to
hide the time.
After sending a client packet, schedule a timeout to close the socket
at the time when all server replies would fail the delay test, so the
socket is not open for longer than necessary (e.g. when the server is
unreachable). With the default maxdelay of 3 seconds the timeout is 7
seconds.
For testA in the client mode require also that the time the server
needed to process the client request is not longer than 4 seconds.
With maximum peer delay this limits the interval in which the client can
accept a server reply.
Timeout ID of zero can be now safely used to indicate that the timer is
not running. Remove the extra timer_running variables that were
necessary to track that.
Set refid in server/broadcast packets to 127.127.1.255 when a time
smoothing offset is applied to the timestamps. This allows the clients
and administrators to detect that the server is not serving its best
estimate of the true time.
Time smoothing determines an offset that needs to be applied to the
cooked time to make it smooth for external observers. Observed offset
and frequency change slowly and there are no discontinuities. This can
be used on an NTP server to make it easier for the clients to track the
time and keep their clocks close together even when large offset or
frequency corrections are applied to the server's clock (e.g. after
being offline for longer time).
Accumulated offset and frequency are smoothed out in three stages. In
the first stage, the frequency is changed at a constant rate (wander) up
to a maximum, in the second stage the frequency stays at the maximum for
as long as needed and in the third stage the frequency is brought back
to zero.
Time smoothing is configured by the smoothtime directive. It takes two
arguments, maximum frequency offset and maximum wander. It's disabled by
default.
An attacker knowing that NTP hosts A and B are peering with each other
(symmetric association) can send a packet with random timestamps to host
A with source address of B which will set the NTP state variables on A
to the values sent by the attacker. Host A will then send on its next
poll to B a packet with originate timestamp that doesn't match the
transmit timestamp of B and the packet will be dropped. If the attacker
does this periodically for both hosts, they won't be able to synchronize
to each other. It is a denial-of-service attack.
According to [1], NTP authentication is supposed to protect symmetric
associations against this attack, but in the NTPv3 (RFC 1305) and NTPv4
(RFC 5905) specifications the state variables are updated before the
authentication check is performed, which means the association is
vulnerable to the attack even when authentication is enabled.
To fix this problem in chrony, save the originate and local timestamps
only when the authentication check (test5) passed.
[1] https://www.eecis.udel.edu/~mills/onwire.html
Server sockets are now explicitly opened and closed for normal NTP
server, NTP broadcast and NTP peering. This will allow closing the
NTP port when not needed.
The minsamples and maxsamples directives now set the default value,
which can be overriden for individual sources in the server/peer/pool
and refclock directives.