Rework handling of late HW TX timestamps. Instead of suspending reading
from client-only sockets that have HW TX timestamping enabled, save the
whole response if it is valid and a HW TX timestamp was received for the
source before. When the timestamp is received, or the configurable
timeout is reached, process the saved response again, but skip the
authentication test as the NTS code allows only one response per
request. Only one valid response per source can be saved. If a second
valid response is received while waiting for the timestamp, process both
responses immediately in the order they were received.
The main advantage of this approach is that it works on all sockets, i.e.
even in the symmetric mode and with NTP-over-PTP, and the kernel does
not need to buffer invalid responses.
Update the serverstats response to use the new 64-bit integers.
Don't define a new value for the response as it already had an
incompatible change since the latest release (new fields added for
timestamp counters).
Specify maxpoll for HW timestamping (default minpoll + 1) to track the
PHC well even when there is little NTP traffic on the interface. After
each PHC reading schedule a timeout according to the maxpoll. Polling
between minpoll and maxpoll is still triggered by HW timestamps.
Wait for the first HW timestamp before adding the timeout to avoid
polling PHCs on interfaces that are enabled in the configuration but
not used for NTP. Add a new scheduling class to separate polling of
different PHCs to avoid too long intervals between processing I/O
events.
In some cases even the new timeout of 1 millisecond is not sufficient to
get all HW TX timestamps. Add a new directive to allow users to
specify longer timeouts.
In a non-tty session with chronyc it is not possible to detect the
end of the response without relying on timeouts, or separate responses
to a repeated command if using the -c option.
Add -e option to end each response with a line containing a single dot.
The NTP SHM refclock protocol has the following properties:
- the memory segments have a predictable key (first segment 0x4e545030)
- it's expected to work in any order of starting chronyd and the program
providing samples to chronyd, i.e. both the consumer and producer need
to be able to create the segment
- the producer and consumer generally don't know under which user is
the other side running (e.g. gpsd can create the segment as root and
also as nobody after it drops root privileges)
- there is no authentication of data provided via SHM
- there is no way to restart the protocol
This makes it difficult for chronyd to ensure it is receiving
measurements from the process that the admin expects it to and not some
other process that managed to create the segment before it was started.
It's up to the admin to configure the system so that chronyd or the
producer is started before untrusted applications or users can create
the segment, or at least verify at some point later that the segment was
created with the expected owner and permissions.
There doesn't seem to be a backward-compatible fix of the protocol. Even
if one side could detect the segment had a wrong owner or permissions,
it wouldn't be able to tell the other side to reattach after recreating
the segment with the expected owner and permissions, if it still had the
permissions to do that.
The protocol would need to specify which side is responsible for
creating the segment and the start order would need to strictly follow
that.
As gpsd (likely the most common refclock source for chronyd) now
supports in the latest version SOCK even for message-based timing,
update the man page and FAQ to deprecate SHM in favor of SOCK.
If AES-128-GCM-SIV is available on the client, add it to the requested
algorithms in NTS-KE as the first (preferred) entry.
If supported on the server, it will make the cookies shorter, which
will get the length of NTP messages containing only one cookie below
200 octets. This should make NTS more reliable in networks where longer
NTP packets are filtered as a mitigation against amplification attacks
exploiting the ntpd mode 6/7 protocol.
Add a new test for maximum delay using a long-term estimate of a
p-quantile of the peer delay. If enabled, it replaces the
maxdelaydevratio test. It's main advantage is that it is not sensitive
to outliers corrupting the minimum delay.
As it can take a large number of samples for the estimate to reach the
expected value and adapt to a new value after a network change, the
option is recommended only for local networks with very short polling
intervals.
Instead of waiting for the sample filter to accumulate the specified
number of samples and then deciding if the result is acceptable, count
missing samples and get the result after the specified number of polls.
This should work better when samples are dropped at a high rate. The
source and clock update interval will be stable as long as at least
one sample can be collected.
Latest versions of ethtool print only the shorter lower-case names of
capabilities and filters. Explain that chronyd doesn't synchronize the
PHC and refer to the new vclock feature of the kernel, which should be
used by applications that need a synchronized PHC (e.g. ptp4l and
phc2sys) in order to not interfere with chronyd.
Some PHCs that have a PPS input don't have configurable pins (their
function is hardcoded). Accept a negative pin index to skip the pin
configuration before requesting external timestamping.
Add "local" option to specify that the reference clock is an
unsynchronized clock which is more stable than the system clock (e.g.
TCXO, OCXO, or atomic clock) and it should be used as a local standard
to stabilize the system clock.
Handle the local refclock as a PPS refclock locked to itself which gives
the unsynchronized status to be ignored in the source selection. Wait
for the refclock to get at least minsamples samples and adjust the clock
directly to follow changes in the refclock's sourcestats frequency and
offset.
There should be at most one refclock specified with this option.
Allow sources to accumulate samples with the leap status set to not
synchronized. Define a new state for them to be ignored in the
selection. This is intended for sources that are never synchronized and
will be used only for stabilization.
For a long time, the Solaris support in chrony wasn't tested on a real
Solaris system, but on illumos/OpenIndiana, which was forked from
OpenSolaris when it was discontinued in 2010.
While Solaris and illumos might have not diverged enough to make a
difference for chrony, replace Solaris in the documentation with illumos
to make it clear which system is actually supported by the chrony
project.
If the xleave option is enabled, ignore the key option and the hash
length. Always use version 4 as the default to get interleaved responses
from new chrony servers.
Add "extfield F323" option to include the new extension field in
requests. If the server responds with this field, use the root
delay/dispersion and monotonic timestamp. Accumulate changes in the
offset between the monotonic and real-time receive timestamps and use
it for the correction of previous offsets in sourcestats. In the
interleaved mode, cancel out the latest change in the offset in
timestamps of the previous request and response, which were captured
before the change actually happened.
Report the number of received interleaved requests and current timestamp
count with their span.
Expand the serverstats description in chronyc man page.
Use the lowest bit of the server RX and TX timestamp as a flag
indicating RX timestamp. This allows the server to detect potential
interleaved requests without having to save all its RX timestamps. It
significantly reduces the amount of memory needed to support clients
using the interleaved mode if most of the server's clients are using the
basic mode (e.g. a public server).
Capture the TX timestamp on the first response to the request which has
the flag set to not further delay the first interleaved response.
False positives are possible with broken clients which set the origin
timestamp to something else than zero or the server RX or TX timestamp.
This causes an unnecessary RX timestamp to be saved and TX timestamp
captured and saved.