Commit graph

2710 commits

Author SHA1 Message Date
Miroslav Lichvar
de598c2310 main: add log message for timeout reached with -t option
This should make it more clear why chronyd exits if -q/-Q does not
finish before the timeout is reached.
2022-06-30 14:46:18 +02:00
Miroslav Lichvar
91cc4dbb12 doc: improve description of test A in measurements log 2022-06-30 10:20:49 +02:00
Miroslav Lichvar
0ae6f2485b ntp: don't use first response in interleaved mode
With the first interleaved response coming after a basic response the
client is forced to select the four timestamps covering most of the last
polling interval, which makes measured delay very sensitive to the
frequency offset between server and client. To avoid corrupting the
minimum delay held in sourcestats (which can cause testC failures),
reject the first interleaved response in the client/server mode as
failing the test A.

This does not change anything for the symmetric mode, where both sets of
the four timestamps generally cover a significant part of the polling
interval.
2022-06-30 10:18:48 +02:00
Miroslav Lichvar
52ec694d2b test: fix server interleaved mode in ntp_core unit test 2022-06-28 15:43:25 +02:00
Miroslav Lichvar
e2e07af8a4 doc: improve and add more questions to FAQ 2022-06-23 14:29:21 +02:00
Miroslav Lichvar
2ed88c31c7 sys_generic: damp slew oscillation due to delayed stop
If the computer is overloaded so much that chronyd cannot stop a slew
within one second of the scheduled end and the actual duration is more
than doubled (2 seconds with the minimum duration of 1 second), the
overshoot will be larger than the intended correction. If these
conditions persist, the oscillation will grow up to the maximum offset
allowed by maxslewrate and the delay in stopping.

Monitor the excess duration as an exponentially decaying maximum value
and don't allow any slews shorter than 5 times the value to damp the
oscillation. Ignore delays longer than 100 seconds, assuming they have a
different cause (e.g. the system was suspended and resumed) and are
already handled in the scheduler by triggering cancellation of the
ongoing slew.

This should also make it safer to shorten the minimum duration if
needed.

Reported-by: Daniel Franke <dff@amazon.com>
2022-06-15 17:42:49 +02:00
Miroslav Lichvar
af8e4a5115 sys_generic: rename slew constants 2022-06-14 16:02:06 +02:00
Miroslav Lichvar
f503a9a490 test: improve 133-hwtimestamp test 2022-06-09 16:01:22 +02:00
Miroslav Lichvar
9c64fbb9c4 hwclock: improve filtering of readings
Estimate the 1st and 2nd 10-quantile of the reading delay and accept
only readings between them unless the error of the offset predicted from
previous samples is larger than the minimum reading error. With the 25
PHC readings per ioctl it should combine about 2-3 readings.

This should improve hwclock tracking and synchronization stability when
a PHC reading delay occasionally falls below the normal expected
minimum, or all readings in the batch are delayed significantly (e.g.
due to high PCIe load).
2022-06-09 16:01:22 +02:00
Miroslav Lichvar
b428f901c7 quantiles: add support for quantile estimation
Add estimation of quantiles using the Frugal-2U streaming algorithm
(https://arxiv.org/pdf/1407.1121v1.pdf). It does not need to save
previous samples and adapts to changes in the distribution.

Allow multiple estimates of the same quantile and select the median for
better stability.
2022-06-09 16:01:19 +02:00
Miroslav Lichvar
09b7f77f9a hwclock: refactor processing of PHC readings
Move processing of PHC readings from sys_linux to hwclock, where
statistics can be collected and filtering improved.

In the PHC refclock driver accumulate the samples even if not in the
external timestamping mode to update the context which will be needed
for improved filtering.
2022-06-09 12:04:20 +02:00
Miroslav Lichvar
c23c0b8484 ntp: convert HW timestamp even if PHC reading fails
Reading of PHC can fail occasionally on some hardware. If that happens,
don't abort the conversion of the timestamp that triggered the reading.
2022-06-08 15:30:05 +02:00
Miroslav Lichvar
d530055917 sys_linux: increase number of PHC readings
Increase the number of requested readings from 10 to 25 - the maximum
accepted by the PTP_SYS_OFFSET* ioctls. This should improve stability of
HW clock tracking and PHC refclock.
2022-06-08 14:29:35 +02:00
Miroslav Lichvar
f41d09e19f doc: improve hwtimestamp description
Latest versions of ethtool print only the shorter lower-case names of
capabilities and filters. Explain that chronyd doesn't synchronize the
PHC and refer to the new vclock feature of the kernel, which should be
used by applications that need a synchronized PHC (e.g. ptp4l and
phc2sys) in order to not interfere with chronyd.
2022-05-19 10:50:10 +02:00
Miroslav Lichvar
46030d9d3e sources: add selection log
Add an option to enable selection log, capturing some data from the
selectdata report.
2022-05-19 08:23:05 +02:00
Miroslav Lichvar
02ccd3a3c7 sourcestats: don't load samples from future
When loading a dumped file, make sure there are no sample times in
future relative to the current system time (e.g. after reboot with
missing RTC).
2022-05-18 16:38:41 +02:00
Miroslav Lichvar
9cc609c4b0 local: cancel remaining correction after external step
Instead of the generic clock driver silently zeroing the remaining
offset after detecting an external step, cancel it properly with the
slew handlers in order to correct timestamps that are not reset in
handling of the unknown step (e.g. the NTP local TX).
2022-05-16 16:28:36 +02:00
Miroslav Lichvar
a0a496dcb4 refclock: set minimum maxlockage in local mode
Use 3 as the minimum maxlockage in the local mode to avoid disruptions
due to losing the lock when a single sample is missed, e.g. when the PPS
driver polling interval is slightly longer than the pulse interval and a
pulse is skipped.
2022-05-11 14:28:58 +02:00
Miroslav Lichvar
8d08486edf refclock: restart local mode after losing lock
A refclock in the local mode is locked to itself. When the maxlockage
check failed after missing some samples, it failed permanently and the
refclock was not able to accumulate any new samples.

When the check fails, drop all samples and reset the source to start
from scratch.

Reported-by: Dan Drown <dan-ntp@drown.org>
2022-05-11 14:27:18 +02:00
Miroslav Lichvar
a3b376cf0a refclock: fix invalid warning in local mode
A refclock in the local mode is locked to itself by design.

Reported-by: Dan Drown <dan-ntp@drown.org>
2022-05-11 14:22:41 +02:00
Miroslav Lichvar
e66f1df89d samplefilt: drop last sample in SPF_DropSamples()
When SPF_DropSamples() is called, don't keep the last sample to be
retrieved by SPF_GetLastSample(). It should be kept only after
filtering.
2022-05-11 14:21:09 +02:00
Miroslav Lichvar
35220aac9d siv: set key directly with gnutls
A new function is provided by the latest gnutls (should be in 3.7.5) to
set the key of an AEAD cipher. If available, use it to avoid destroying
and creating a new SIV instance with each key change.

This improves the server NTS-NTP performance if using gnutls for SIV.
2022-05-11 12:22:33 +02:00
Miroslav Lichvar
5b04f3ca90 doc: improve description of chronyc -h option 2022-05-05 14:34:15 +02:00
Miroslav Lichvar
beb1c36136 doc: improve maxchange description 2022-05-05 14:34:08 +02:00
Miroslav Lichvar
da3495c472 nts: don't exit if initialization of priority cache fails
Initialization of the gnutls priority cache can fail depending on the
system crypto policy (e.g. disabled TLS1.3). Log an error mentioning
TLS, but continue to run without the server/client credentials.
2022-05-05 10:27:48 +02:00
Miroslav Lichvar
356771c0c3 client: rework command catenation
Use snprintf() instead of strcat() and don't try to parse commands
longer than 2048 characters to make it consistent with the chrony.conf
parser, avoid memory allocation, and not rely on the system ARG_MAX to
keep the length sane.
2022-05-04 14:17:32 +02:00
Miroslav Lichvar
fca8966ada examples: replace grep command in NM dispatcher script
Some grep implementations detect binary data and return success without
matching whole line. This might be an issue for the DHCPv6 NTP FQDN
check. The GNU grep in the C locale seems to check only for the NUL
character, which cannot be passed in an environment variable, but other
implementations might behave differently and there doesn't seem to be a
portable way to force matching the whole line.

Instead of the grep command, check for invalid characters by comparing
the length of the input passed through "tr -d -c".
2022-03-23 15:36:17 +01:00
Miroslav Lichvar
25f80a1a9d doc: include gnutls in libraries providing SECHASH feature 2022-03-16 14:54:12 +01:00
Miroslav Lichvar
1219f99935 ntp: keep original source IP address
When an added source is specified by IP address, save the original
string instead of formatting a new string from the parsed address, which
can be different (e.g. compressed vs expanded IPv6 address).

This fixes the chronyc sourcename command and -N option to print the IP
address exactly as it was specified in the configuration file or chronyc
add command.
2022-03-10 09:54:31 +01:00
Miroslav Lichvar
33a1fe7a9c ntp: split out conf_id allocation 2022-03-10 09:53:29 +01:00
Miroslav Lichvar
eed0a0de56 test: update 007-cmdmon system test for recent changes
The new unsynchronised source state is now reported in selectdata before
the first measurement.

Fixes: c29f8888c767 ("sources: handle unsynchronized sources in selection")
2022-03-07 16:00:20 +01:00
Miroslav Lichvar
07600cbd71 test: extend sources unit test 2022-03-02 12:17:02 +01:00
Miroslav Lichvar
f2e341b5ed sources: improve debug messages
Print source status as char and print the name instead of index in
combining.
2022-03-02 12:11:24 +01:00
Miroslav Lichvar
55717c1ccd refclock: trim offset in local mode
With the local option, trim offset larger than 1 second to not lose
precision after large steps of the clock.
2022-02-24 11:40:26 +01:00
Miroslav Lichvar
d5e645eb38 samplefilt: add function to correct accumulated offsets
Analogously to SST_CorrectOffset(), add SPF_CorrectOffset() to correct
the offsets accumulated in the filter.
2022-02-24 11:40:01 +01:00
Miroslav Lichvar
3196630fb9 sys_linux: don't require configurable pin for external PPS
Some PHCs that have a PPS input don't have configurable pins (their
function is hardcoded). Accept a negative pin index to skip the pin
configuration before requesting external timestamping.
2022-02-23 14:43:39 +01:00
Miroslav Lichvar
663dde1ad7 refclock: improve precision with large offset
If a SHM or PHC refclock has a very large offset compensated by the
offset option, or ignored with the pps or local option, there is a
persistent loss of precision in the calculation of the sample offset
using the double format.

Rework the code to delay the calculation of the accumulated offset to
include the specificed compensation and remaining correction of the
system clock, where the calculation can be split to improve the
precision. In the pps mode ignore integer seconds competely.

The precision of the SOCK refclock is now limited to 1 nanosecond due to
the extra double->timespec->double conversion.
2022-02-23 14:43:39 +01:00
Miroslav Lichvar
62757cda49 refclock: add local option
Add "local" option to specify that the reference clock is an
unsynchronized clock which is more stable than the system clock (e.g.
TCXO, OCXO, or atomic clock) and it should be used as a local standard
to stabilize the system clock.

Handle the local refclock as a PPS refclock locked to itself which gives
the unsynchronized status to be ignored in the source selection. Wait
for the refclock to get at least minsamples samples and adjust the clock
directly to follow changes in the refclock's sourcestats frequency and
offset.

There should be at most one refclock specified with this option.
2022-02-23 14:43:39 +01:00
Miroslav Lichvar
af6ae9186b reference: allow clock adjustments without updating reference
Add support for accumulating frequency and time offset without changing
the reference parameters and calling the local parameter change
handlers.

This will allow an unsynchronized source to operate below other sources
in order to stabilize the clock.
2022-02-23 14:43:39 +01:00
Miroslav Lichvar
4c29f8888c sources: handle unsynchronized sources in selection
Allow sources to accumulate samples with the leap status set to not
synchronized. Define a new state for them to be ignored in the
selection. This is intended for sources that are never synchronized and
will be used only for stabilization.
2022-02-23 14:42:40 +01:00
Miroslav Lichvar
d06ae4a60e sourcestats: add function to get minsamples 2022-02-10 16:38:50 +01:00
Miroslav Lichvar
f9af2f9733 sourcestats: clamp minsamples and maxsamples in initialization
Don't leave the variables set to values outside their effective range.
This has no functional impact, but makes it clear what is the precedence
of the two settings.
2022-02-10 16:31:46 +01:00
Miroslav Lichvar
43ae0131cd sourcestats: use constant for required number of samples 2022-02-10 15:16:08 +01:00
Michael Hudson-Doyle
8bb8f15a7d sys_linux: allow rseq in seccomp filter
Libc 2.35 will use rseq syscalls [1][2] by default and thereby
break chrony in seccomp isolation.

[1]: https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
[2]: https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html

Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2022-02-09 10:46:09 +01:00
Miroslav Lichvar
e55f174bd3 examples: handle more actions in NM dispatcher script
Run the chronyc onoffline command also when the connectivity-change
and dhcp6-change actions are reported by the NetworkManager dispatcher.

The latter should not be necessary, but there currently doesn't seem to
be any action for IPv6 becoming routable after duplicate address
detection, so at least in networks using DHCPv6, IPv6 NTP servers should
not be stuck in the offline state from a previously reported action.
2022-02-07 17:04:19 +01:00
Miroslav Lichvar
5bd13c8d59 examples: support DHCPv6 NTP servers in NM dispatcher script
Latest NetworkManager code provides NTP servers from the DHCPv6 NTP
option (RFC 5908) in the DHCP6_DHCP6_NTP_SERVERS variable to dispatcher
scripts.

Check for invalid characters (which can come from the FQDN suboption)
and include the servers in the interface-specific sources file.
2022-02-07 16:59:10 +01:00
Miroslav Lichvar
759580aa6f client: fix waitsync command to reconnect to server
If chronyc waitsync was started before chronyd, it would try all
addresses (Unix socket, IPv4, IPv6) and get stuck with no address, not
getting any response later when chronyd was running.

Reset the address index in open_io() when returning with failure to
allow the next call to start with the first address again.

Reported-by: Jan Mikkelsen <janm@transactionware.com>
2022-01-26 16:11:01 +01:00
Vincent Blut
b61cbed689 test: ensure awk commands in 008-ntpera return an integer
Some awk interpreters (e.g. mawk) print long integers in exponential
notation skewing the test result.
2022-01-13 09:40:12 +01:00
Miroslav Lichvar
2ac2247756 doc: update NEWS 2021-12-16 13:17:42 +01:00
Miroslav Lichvar
55f48b14b7 update copyright years 2021-12-16 13:17:42 +01:00