Include IP address instead of reference ID in the name of dump file
for NTP sources and for reference clocks format the reference ID as a
hexadecimal number instead of quad dotted notation.
Also, avoid dynamic memory allocation and improve warning messages.
Replace struct timeval with struct timespec as the main data type for
timestamps. This will allow the NTP code to work with timestamps in
nanosecond resolution.
Ignore orphan sources that are unreachable (but still have usable stats)
to have a quick and consistent source selection between orphans.
This also fixes the "Unknown local refid in orphan mode" error appearing
when a selected orphan source is removed, as the source is marked as
unreachable and the selection runs with disabled NTP instance before the
source instance is actually removed.
When the local reference is configured with the orphan option, NTP
sources that have stratum equal to the configured local stratum are
considered to be orphans (i.e. serving local time while not being
synchronised with real time) and are excluded from the normal source
selection. Sources with stratum larger than the local stratum are
considered to be directly on indirectly synchronised to an orphan and
are always ignored.
If no selectable source is available and all orphan sources have
reference IDs larger than the local ID, no source will be selected and
the local reference mode will be activated at some point, i.e. this host
will become an orphan. Otherwise, the orphan source with the smallest
reference ID will be selected. This ensures a group of servers polling
each other (with the same orphan configuration) which have no external
source can settle down to a state where only one server is serving its
local unsychronised time and others are synchronised to it.
Similarly to unreachable sources and falsetickers, try to replace
sources with distance larger than the limit set by the maxdistance
directive with a newly resolved address of the hostname.
Require that at least one of the sources specified with this option is
selectable (i.e. recently reachable and not a falseticker) before
updating the clock. Together with the trust option this may be useful to
allow a trusted, but not very precise, reference clock or a trusted
authenticated NTP source to be safely combined with unauthenticated NTP
sources in order to improve the accuracy of the clock. They can be
selected and used for synchronization only if they agree with the
trusted and required source.
Assume time from a source that is specified with the trust option is
always true. It can't be rejected as falseticker in the source
selection if sources that are specified without this option don't agree
with it.
Add maxdistance directive to set the maximum root distance the sources
are allowed to have to be selected. This is useful to reject NTPv4
sources that are no longer synchronized and report large dispersion.
The default value is 3 seconds.
Create logdir and dumpdir before dropping root. Set their uid/gid to the
user chronyd will switch to. This allows chronyd to create the
directories in a directory where the user won't have write permissions
(e.g. /var/lib).
When reducing the list of selectable sources to sources with the prefer
option, sources before the first preferred source were left with the
SRC_OK status, which triggered an assertion failure in the next
selection.
The minsamples and maxsamples directives now set the default value,
which can be overriden for individual sources in the server/peer/pool
and refclock directives.
Add new functions to change source's reference ID/address and reset the
instance. Use that instead of destroying and creating a new instance
when the NTP address is changed.
The pool directive can be used to configure chronyd for a pool of NTP
servers (e.g. pool.ntp.org). The name is expected to resolve to multiple
addresses which change over time.
On start, a source will be added for each resolved address. When a
source from the pool is unreachable or marked as falseticker, chronyd
will try to replace the source with a newly resolved address of the
pool.
The minimum interval between replacements is currently set to 244
seconds to avoid frequent DNS requests.
With the recent change allowing unreachable sources to remain selected,
offline sources will now be selectable only for some time, similarly to
online unreachable sources.
Reachability is no longer a requirement for selection. An unreachable
source can remain selected if its newest sample is not older than the
oldest sample from all reachable sources.
This is useful to prevent reselection when an accurate source uses a
very short polling interval (e.g. refclock) and is occasionally
unreachable for short periods of time.
Add new source states and rename some states so there is one state for
each reason a source can be rejected in the source selection.
This fixes reported status when sources are selectable, but the actual
selection was postponed until next update. It will also allow more
detailed reports when the cmdmon protocol is updated.
Following RFC 5905, don't call REF_SetUnsynchronised() when there are no
reachable or selectable sources. It's up to the client to consider the
source unsynchronized when the root distance exceeds a threshold.
The unsynchronized status is still set when no majority is reached.
This should reduce the number of possible memory leaks reported by
valgrind. The remaining reported leaks are sched tqe allocation, async
DNS instance allocation, cmdmon response/timestamp cell allocation, and
clientlog subnet allocation.
When current time is within 5 seconds of a leap second, don't accumulate
new samples or update the leap second status to increase the chances of
getting through safely.
Don't stop online burst for unreachable sources until sending succeeds.
This is mainly useful with iburst when chronyd is started before the
network is configured.
Add a new change type and use it when an unexpected time jump is
detected in the scheduler to reset reference times, offset and slewing,
NCR instances (with their polling interval), synchronization status, and
drop all sourcestats, manual, refclock and RTC samples.
This should make the recovery more graceful if the estimated jump has a
large error (e.g. select didn't timeout, or after system suspend).
When source is set as active, it's receiving reachability updates (e.g.
offline NTP sources are not active).
Also add function to count active sources.
Instead of giving up when a source has 7 reach updates, continue as long
as at least one source has fewer than 7 updates and can still have 3
samples to be selectable in that number of updates.
When no sources are responding, it will give up sooner.
Explicitly set the number of iburst samples to the size of the register
to make sure there are at least 7 reachability updates and the
initstepslew mode can be ended.
The initstepslew code has its own minimal NTP implementation. Drop the
code, add a new initstepslew mode to the reference updating code and
use regular NTP sources with iburst flag for initstepslew addresses
instead. When an update is made or a source is found unreachable, log a
message, remove the initstepslew sources and switch to normal mode.
This reduces code duplication and makes initstepslew use features
implemented only in the main code like source combining or SO_TIMESTAMP
support.
List of selectable sources that is used in combining was trimmed to
sources with prefer option, but scoring algorithm considered all
selectable sources. When a source without prefer was selected and
no source was combined, it caused assertion failure.
Combine only sources whose distance is shorter than distance of the
selected source multiplied by the value of combinelimit and their
estimated frequencies are close to the frequency of the selected source.
Add outlyer status for sources which are selectable, but not included in
the combining. The status is displayed as '-' in the chronyc sources
output.
We want to correct the offset quickly, but we also want to keep the
frequency error caused by the correction itself low.
Define correction rate as the area of the region bounded by the graph of
offset corrected in time. Set the rate so that the time needed to correct
an offset equal to the current sourcestats stddev will be equal to the
update interval (assuming linear adjustment). The offset and the
time needed to make the correction are inversely proportional.
This is only a suggestion and it's up to the system driver how the
adjustment will be executed.
Instead of always selecting the source with minimum stratum, add weighted
stratum to the distance when comparing selectable sources. The weight
can be configured with new stratumweight directive and can be set to
zero to ignore stratum completely, by default 1.0.
It seems it triggers even with one source alone if it's close and
accurate or polling interval is long enough. It would need to include
max_clock_error.
Each source has a score against currently selected source which is
updated (multiplied by ratio of their distances) when one of the two
sources has a new sample. When the score reaches a limit, the source
will be selected. This should allow to slowly select the source with
minimum distance without frequent reselecting.
To avoid switching between sources with very variable distances (e.g. on
LAN or when upstream server uses a longer polling interval), sources
that are currently not selected are penalized by a fixed distance. This
can be configured with new reselectdist directive (100 microseconds by
default).
On start, when servers are reachable and use the same polling interval,
wait for them to have the same reachability register (which corresponds
to the number of samples in sourcestats) before selecting one.
Local reference is now updated only when a new source is selected or
match_addr is equal to the selected source, match_addr 0 is no longer
treated specially.
Add new flag to source struct to indicate when source is selectable
(packets with good headers are received) and use a reachability
register for last 8 samples instead of the reachable flag. Source
drivers now provide only reachability updates.
Require that the ratio of the increase in delay from the minimum one in
the stats data register to the standard deviation of the offsets in the
register is less than maxdelaydevratio or the difference between
measured offset and predicted offset is larger than the increase in
delay. In the allowed delay increase is included also skew and maximum
clock frequency error.
maxdelaydevratio is 10.0 by default.
Instead of following skew changes, adjust polling interval so that the
number of measurements used in the regression algorithm remains close to
a target value. It can be configured with a new polltarget option
(6 by default).