If the computer is overloaded so much that chronyd cannot stop a slew
within one second of the scheduled end and the actual duration is more
than doubled (2 seconds with the minimum duration of 1 second), the
overshoot will be larger than the intended correction. If these
conditions persist, the oscillation will grow up to the maximum offset
allowed by maxslewrate and the delay in stopping.
Monitor the excess duration as an exponentially decaying maximum value
and don't allow any slews shorter than 5 times the value to damp the
oscillation. Ignore delays longer than 100 seconds, assuming they have a
different cause (e.g. the system was suspended and resumed) and are
already handled in the scheduler by triggering cancellation of the
ongoing slew.
This should also make it safer to shorten the minimum duration if
needed.
Reported-by: Daniel Franke <dff@amazon.com>
Instead of the generic clock driver silently zeroing the remaining
offset after detecting an external step, cancel it properly with the
slew handlers in order to correct timestamps that are not reset in
handling of the unknown step (e.g. the NTP local TX).
It was never used for anything and messages in debug output already
include filenames, which can be easily grepped if there is a need
to see log messages only from a particular file.
Replace struct timeval with struct timespec as the main data type for
timestamps. This will allow the NTP code to work with timestamps in
nanosecond resolution.
Timeout ID of zero can be now safely used to indicate that the timer is
not running. Remove the extra timer_running variables that were
necessary to track that.
The system drivers may implement their own slewing which the generic
driver can use to slew faster than the maximum frequency the driver is
allowed to set directly.
Different systems may consider different time values to be valid.
Don't exit on settimeofday()/adjtimex() error in case the check in
UTI_IsTimeOffsetSane() isn't restrictive enough.
This will be used to set the kernel adjtimex() variables to allow other
applications running on the system to know if the system clock is
synchronized and the estimated error and the maximum error.
Add a new change type and use it when an unexpected time jump is
detected in the scheduler to reset reference times, offset and slewing,
NCR instances (with their polling interval), synchronization status, and
drop all sourcestats, manual, refclock and RTC samples.
This should make the recovery more graceful if the estimated jump has a
large error (e.g. select didn't timeout, or after system suspend).
There is no need to try to correct offsets below the specified minimum
(1 nanosecond), let the clock drift away after crossing zero offset and
avoid unnecessary updates.
With the generic driver, the maxslewrate directive sets the maximum
frequency offset that the driver is allowed to use to slew the time. By
default, it's set to 83333.333 (1/12). This is identical to what Linux
fast slewing used to use.
This driver is intended to complete system-specific drivers that don't
have implemented all required driver functionality. Currently, it
implements offset functions working on top of system-specific frequency
functions. Offsets are corrected by changing frequency, similarly to
fast slewing implemented in the Linux driver.