If the whole process group receives a signal (e.g. CTRL-C in a terminal),
the helper process needs to keep running until it gets the QUIT request,
so the system drivers can still use it in their finalisation, e.g. to
cancel the remaining slew.
In receive_reponse() don't interpret return codes in helper responses as
a non-zero value may not necessarily mean an error. Just copy errno if
it's not zero and let PRV_* functions deal with the return code.
Prepare a list of required privileged operations first and from that
define the PRIVOPS macros. This will reduce the amount of code that will
be needed when the privileged helper is used on other platforms.
Rename PRV_Initialise() to PRV_StartHelper() and add a new
initialisation function, which just sets the helper fd to -1. Move
the initialisation/finalisation calls from the system drivers to main.c.
If privops is not included in the build, define empty macros for the
function names, so their calls don't have to be wrapped in #ifdefs.
With SOCK_DGRAM sockets, the helper doesn't stop as there is no error
received when the socket is closed on the daemon side.
Add a QUIT operation to the protocol which is requested when the daemon
is exiting. It has no response. Register the stopping function with
atexit() to stop the helper even when the daemon is not exiting cleanly,
e.g. due to a fatal error.
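
A minimal sketch of the daemon side of such a QUIT request. The request
struct, the OP_QUIT value and the function names are assumptions made for
illustration; the real message layout is not described in this log. The
request is sent without waiting for a reply, and the stop function is
registered with atexit() so it also runs when the daemon exits uncleanly:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum { OP_QUIT = 99 };            /* hypothetical opcode value */

typedef struct {
  int op;                         /* hypothetical one-field request */
} Request;

static int helper_fd = -1;        /* -1 means the helper is not running */

/* Ask the helper to quit; there is no response to wait for */
void stop_helper(void)
{
  Request req;

  if (helper_fd < 0)
    return;                       /* not started or already stopped */

  memset(&req, 0, sizeof (req));
  req.op = OP_QUIT;
  send(helper_fd, &req, sizeof (req), 0);
  close(helper_fd);
  helper_fd = -1;
}

/* Remember the daemon's end of the socket pair and make sure the helper
   is stopped on any exit path that runs atexit() handlers */
int register_helper(int fd)
{
  assert(fd >= 0);
  helper_fd = fd;
  return atexit(stop_helper);
}
```

Calling stop_helper() a second time (e.g. explicitly and then again from
atexit()) is harmless, because the fd is reset to -1 after the first call.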
Split out the sending part of the function into send_request() and
rename the function itself to submit_request(). This will be useful to
send a request
without waiting for a response.
Also, remove the fd parameter from the functions and just use helper_fd
directly.
SOCK_SEQPACKET is preferred over SOCK_DGRAM for communication with the
helper as the process will get an error when the other end of the socket
is closed. It's not supported on all platforms.
If SOCK_SEQPACKET is defined, try creating the pair of sockets with this
type first and if that fails, fall back to SOCK_DGRAM.
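
The fallback can be sketched as follows. The function name is an
assumption; only socketpair() and the two socket types come from the
description above:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Create a pair of connected Unix domain sockets, preferring
   SOCK_SEQPACKET and falling back to SOCK_DGRAM.
   Returns 0 on success, -1 on failure (errno set by socketpair()). */
int open_helper_socket_pair(int *fds)
{
  assert(fds != NULL);

#ifdef SOCK_SEQPACKET
  /* The type may be defined but still unsupported for AF_UNIX */
  if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds) == 0)
    return 0;
#endif

  return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
}
```

The run-time fallback matters because a platform can define the
SOCK_SEQPACKET constant without supporting it for this address family.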
When the rtcsync directive is specified in the chronyd config file,
chronyd will update the RTC via settimeofday() every 60 minutes if
the system time is synchronised to NTP.
Instead of printing some large arbitrary values use a dash in the LastRx
column of the sources output and the Last/Int columns in the clients
output when no sample or hit is recorded.
Instead of time_t use a 32-bit fixed point representation with 4-bit
fraction to save the time of the last hit. The rate can now be measured
up to 16 packets per second. Maximum interval between hits is about 4
years.
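
A sketch of such a representation, with assumed function names. With 4
fraction bits the resolution is 1/16 s, and a signed 32-bit difference
covers 2^27 s, which is roughly 4.25 years:

```c
#include <assert.h>
#include <stdint.h>

#define TS_FRAC 4   /* number of fraction bits */

/* Pack seconds and nanoseconds into a 32-bit value with 28 integer and
   4 fraction bits; the seconds wrap around modulo 2^28 */
uint32_t ts_from_time(uint64_t sec, uint32_t nsec)
{
  uint32_t frac;

  assert(nsec < 1000000000U);
  frac = (uint32_t)(((uint64_t)nsec << TS_FRAC) / 1000000000U);

  return ((uint32_t)sec << TS_FRAC) | frac;
}

/* Interval between two hits in units of 1/16 s. The unsigned subtraction
   handles the wrap-around; the result is meaningful only while the real
   interval is shorter than 2^27 s. The conversion to int32_t assumes
   two's complement. */
int32_t ts_interval(uint32_t now, uint32_t last)
{
  return (int32_t)(now - last);
}
```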
Abort immediately on start if chronyd is compiled on a platform with int
shorter than 32 bits, using a representation other than two's complement,
or with an unexpected conversion of large unsigned integers to signed.
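
One possible form of such a start-up check, as a sketch. The function
names are assumptions, and the exact expression used in chronyd is not
shown in this log:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns non-zero if the platform matches the three assumptions */
int platform_is_supported(void)
{
  /* volatile keeps the conversion from being folded at compile time */
  volatile uint32_t big = 4294967295U;

  return INT_MAX >= 0x7fffffff &&   /* int has at least 32 bits */
         -1 == ~0 &&                /* two's complement */
         (int32_t)big == -1;        /* large unsigned wraps to negative */
}

/* To be called first thing on start; assert() aborts on failure
   (unless the build defines NDEBUG) */
void check_platform(void)
{
  assert(platform_is_supported());
}
```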
Some libc calls like memcpy() expect the pointer to be valid even when
the size is zero and there is nothing to do. Instead of checking the
size before all such calls, modify ARR_GetElements() to return a pointer
to the array instance itself if data was not allocated yet.
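
The idea can be sketched with a simplified stand-in for the array type.
The struct layout and the function names below are assumptions, not the
actual array module:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an array instance */
struct Array {
  void *data;         /* NULL until the first allocation */
  size_t elem_size;
  size_t used;        /* number of elements in use */
};

/* Never returns NULL: an array without allocated data must be empty, so
   any valid pointer will do and the instance itself is always valid */
void *get_elements(struct Array *array)
{
  if (!array->data) {
    assert(array->used == 0);
    return array;
  }

  return array->data;
}

/* Callers can pass the result to memcpy() without checking the size */
void copy_elements(void *dest, struct Array *array)
{
  memcpy(dest, get_elements(array), array->used * array->elem_size);
}
```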
Add new fields from clientlog to the report and print them in chronyc.
Rework the code to skip empty records in the hash table. The reply no
longer has variable length, all client fields are filled even if some
are empty. Reply with RPY_NULL when the facility is disabled.
When the measured NTP or command request rate of a client exceeds
a threshold, reply only to a small fraction of the requests to reduce
the network traffic. Clients are allowed to send a burst of requests.
Try to detect broken clients which increase the request rate when not
getting replies and suppress the rate limiting for them.
Add ratelimit and cmdratelimit directives to configure the thresholds,
bursts and leak rates independently for NTP and command response rate
limiting. Both are disabled by default. Commands from localhost are
never limited.
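
To illustrate how a threshold, a burst and a leak rate can interact, here
is a generic token-bucket sketch. It is not taken from chronyd: the
struct fields, the function name and the deterministic "every 2^leak-th
request" leak are all assumptions made for the example.

```c
#include <assert.h>

typedef struct {
  double interval;    /* one request is allowed per this many seconds */
  int burst;          /* maximum number of saved-up requests */
  int leak;           /* reply to 1 of 2^leak requests over the limit */
} Limit;

typedef struct {
  double tokens;      /* requests the client may still make */
  double last_time;   /* time of the previous request */
  unsigned int over;  /* count of requests over the limit */
} Client;

/* Returns 1 if the request at time now should get a response */
int should_reply(Client *c, const Limit *l, double now)
{
  assert(l->interval > 0.0 && l->leak >= 0 && l->leak < 32);

  /* Refill tokens for the elapsed time, capped at the burst size */
  c->tokens += (now - c->last_time) / l->interval;
  if (c->tokens > l->burst)
    c->tokens = l->burst;
  c->last_time = now;

  if (c->tokens >= 1.0) {
    c->tokens -= 1.0;
    return 1;
  }

  /* Over the limit: still answer a small fraction of the requests */
  return ++c->over % (1U << l->leak) == 0;
}
```

With interval 1.0, burst 4 and leak 2, a client that sends eight requests
at once gets replies to the first four and to the eighth.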
This simplifies the code and allows older records to be reused when no
more memory can be allocated for new addresses. Each slot of the hash
table has 16 records and there is no chaining between different slots.
Reused records may be newer than records in other slots, but the search
time remains constant.
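
A sketch of a lookup in such a table, assuming a hypothetical record
layout, a 32-bit address key with 0 meaning "unused", and a simple
multiplicative hash. Only the 16 records of one slot are ever examined,
and the oldest one is reused when the slot is full:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 16

typedef struct {
  uint32_t addr;      /* 0 marks an unused record in this sketch */
  uint32_t last_hit;  /* time of the last hit */
} Record;

/* Find the record of addr, or claim an unused or the oldest record in
   its slot. The table holds slots * SLOT_SIZE records. */
Record *get_record(Record *table, unsigned int slots, uint32_t addr)
{
  Record *slot, *oldest;
  unsigned int i;

  assert(slots > 0 && addr != 0);

  slot = table + (size_t)((addr * 2654435761U) % slots) * SLOT_SIZE;
  oldest = slot;

  for (i = 0; i < SLOT_SIZE; i++) {
    if (slot[i].addr == addr)
      return &slot[i];

    /* Prefer an unused record, otherwise track the oldest one */
    if (oldest->addr != 0 &&
        (slot[i].addr == 0 || slot[i].last_hit < oldest->last_hit))
      oldest = &slot[i];
  }

  memset(oldest, 0, sizeof (*oldest));
  oldest->addr = addr;

  return oldest;
}
```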
Don't log NTP peer access and auth/bad command access. Also, change
types for logging number of hits from long to uint32_t. This reduces the
size of the node and allows more clients to be monitored in the same
amount of memory.
The meaning of the poll value in KoD RATE packets is not currently
defined in the NTP specification (RFC 5905). In the reference NTP
implementation it signals the minimum acceptable polling interval to the
clients. In chrony the minimum poll is set to the KoD RATE poll if it's
larger, but not to a larger value than 10.
The problem is that ntpd as a server sets the KoD RATE poll to the
maximum of the client's poll and the configured rate limiting interval.
An attacker can send a burst of spoofed packets to the server to trigger
the client's request rate limit. When the client sends its next request
and the server responds with a KoD RATE packet, the client will set its
minimum poll to the current poll and it will no longer be able to switch
to a shorter poll when needed.
ntpd could be fixed to always set the KoD RATE poll to the rate limiting
interval. Unfortunately, ntpd as a client seems to depend on the current
behavior. It tries to follow the server poll and if the KoD RATE poll
was shorter than the current poll, the polling interval would be
reduced, defeating the purpose of KoD RATE. The server fix will probably
need to wait until clients are fixed and that could take a very long
time.
For now, ignore the poll value in KoD RATE packets. Just add an extra
delay based on the current poll to the next transmit timeout and stop an
ongoing burst.
The first packet after setting a source to online was sent with a constant
delay (0.2s). If the period in which the source was offline was shorter
than the current polling interval, the new packet was sent sooner than
it would have been if the source hadn't been switched offline and back.
Don't reset the local tx timestamp when mode is changed. When starting
the initial transmit timeout, adjust the delay to make the interval
between the two packets at least as long as the current polling
interval.
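
The adjustment can be sketched as below. The function names and the
have_last_tx flag are assumptions; the 0.2s constant is the delay
mentioned above:

```c
#include <assert.h>

#define INITIAL_DELAY 0.2   /* seconds */

/* Convert a log2 polling interval to seconds */
static double poll_to_seconds(int poll)
{
  double s = 1.0;

  for (; poll > 0; poll--)
    s *= 2.0;
  for (; poll < 0; poll++)
    s /= 2.0;

  return s;
}

/* Delay before the first packet after the source is set online. If a
   packet was sent before, wait at least until one polling interval has
   passed since that transmission. */
double get_initial_delay(double now, double last_tx, int have_last_tx,
                         int poll)
{
  double delay = INITIAL_DELAY, remaining;

  if (have_last_tx) {
    assert(now >= last_tx);
    remaining = poll_to_seconds(poll) - (now - last_tx);
    if (remaining > delay)
      delay = remaining;
  }

  return delay;
}
```

For example, with poll 6 (64 s) and a source that was offline for only
10 s, the first packet is delayed by 54 s instead of 0.2 s.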
Use UTI_GetRandomBytes() instead of random() to calculate the random
part of the timeout. This was the only remaining use of random() in the
code and the srandom() call can be removed.
In client packets set the leap, stratum, reference ID, reference time,
root delay and root dispersion to constant values to not reveal the
state of the synchronization. Use precision 32 to make the receive and
transmit timestamps completely random and not reveal the local time.
Use UTI_GetRandomBytes() instead of random() to generate random bits
below precision. Save the result in NTP_int64 in network order and
allow precision in the full range from -32 to 32. With precision 32
the fuzzing now makes the timestamp completely random and can be used to
hide the time.
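
A sketch of the bit arithmetic, under stated assumptions: the timestamp
is a 64-bit value with 32 fraction bits, the random bits are supplied by
the caller (in chronyd they would come from UTI_GetRandomBytes()), and
the bits below the precision are replaced rather than combined in some
other way. The Ntp64 struct is a stand-in for NTP_int64.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

typedef struct {
  uint32_t hi;        /* seconds, network order */
  uint32_t lo;        /* fraction, network order */
} Ntp64;

/* Replace all bits of ts below 2^precision seconds with bits from rnd.
   Precision -32 changes nothing; precision 32 replaces all 64 bits. */
Ntp64 fuzz_timestamp(uint64_t ts, uint64_t rnd, int precision)
{
  uint64_t mask;
  Ntp64 out;
  int bits;

  assert(precision >= -32 && precision <= 32);

  bits = 32 + precision;        /* number of bits to randomise, 0..64 */
  mask = bits >= 64 ? UINT64_MAX : ((uint64_t)1 << bits) - 1;
  ts = (ts & ~mask) | (rnd & mask);

  out.hi = htonl((uint32_t)(ts >> 32));
  out.lo = htonl((uint32_t)ts);

  return out;
}
```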