When switching from basic mode to interleaved mode following a response
which wasn't accumulated due to failed test A, B, C, or D, allow
timestamps of the failed sample to be reused in interleaved mode, i.e.
replacing the server's less accurate transmit timestamp with a more
accurate timestamp server can turn a failed sample into acceptable one.
Move the presend check into test A to simplify the code.
The presend option in interleaved mode uses two presend requests instead
of one to get an interleaved response from servers like chrony which
delay the first interleaved response due to an optimization saving
timestamps only for clients actually using the interleaved mode.
After commit 0ae6f2485b ("ntp: don't use first response in interleaved
mode") the first interleaved response following the two presend
responses in basic mode is dropped as the preferred set of timestamps
minimizing error in delay was already used by the second sample in
basic mode. There are only three responses in the burst and no sample is
accumulated.
Increasing the number of presend requests to three to get a fourth
sample would be wasteful. Instead, allow reusing timestamps of the
second presend sample in basic mode, which is never accumulated.
Reported-by: Aaron Thompson
Fixes: 0ae6f2485b ("ntp: don't use first response in interleaved mode")
If the network correction is known for both the request and response,
and their sum is not larger that the measured peer delay, allowing the
transparent clocks to be running up to 100 ppm faster than the client's
clock, apply the corrections to the NTP offset and peer delay. Don't
correct the root delay to not change the estimated maximum error.
Provide the network correction (PTP correction + RX duration) of the
request in the new extension field if included in the request and
NTP-over-PTP is enabled.
To be able to verify PTP corrections, the client will need to know both
the correction of the request received by the server and the correction
of the response. Add a new experimental NTP extension field that the
clients will use to request the correction and servers return the
value.
Add two new fields to the NTP_Local_Timestamp structure:
- receive duration as the time it takes to receive the ethernet frame,
currently known only with HW timestamping
- network correction as a generalized PTP correction
The PTP correction is provided by transparent clocks in the correction
field of PTP messages to remove the receive, processing and queueing
delays of network switches and routers. Only one-step end-to-end unicast
transparent clocks are useful for NTP-over-PTP. Two-step transparent
clocks use follow-up messages and peer-to-peer transparent clocks don't
handle delay requests.
The RX duration will be included in the network correction to compensate
for asymmetric link speeds of the server and client as the NTP RX
timestamp corresponds to the end of the reception (in order to
compensate for the asymmetry in the normal case when no corrections
are applied).
Rename the exp1 extension field to exp_mono_root (monotonic timestamp +
root delay/dispersion) to better distinguish it from future experimental
extension fields.
Clients sockets are closed immediately after receiving valid response.
Don't wait for the first early HW TX timestamp to enable waiting for
late timestamps. It may take a long time or never come if the HW/driver
is consistently slow. It's a chicken and egg problem.
Instead, simply check if HW timestamping is enabled on at least one
interface. Responses from NTP sources on other interfaces will always be
saved (for 1 millisecond by default).
The polltarget value is used in a floating-point division in the
calculation of the poll adjustment. Set 1 as the minimum accepted
polltarget value to avoid working with infinite values.
Set the polling interval to minpoll when changing address of a source,
but only if it is reachable to avoid increasing load on server or
network in case that is the reason for the source being unreachable.
This shortens the time needed to replace a falseticker or
unsynchronized source with a selectable source.
Rework handling of late HW TX timestamps. Instead of suspending reading
from client-only sockets that have HW TX timestamping enabled, save the
whole response if it is valid and a HW TX timestamp was received for the
source before. When the timestamp is received, or the configurable
timeout is reached, process the saved response again, but skip the
authentication test as the NTS code allows only one response per
request. Only one valid response per source can be saved. If a second
valid response is received while waiting for the timestamp, process both
responses immediately in the order they were received.
The main advantage of this approach is that it works on all sockets, i.e.
even in the symmetric mode and with NTP-over-PTP, and the kernel does
not need to buffer invalid responses.
Previously, in the calculation of the next transmission time
corresponding to the current polling interval, the reference point was
the current time in the client mode (i.e. the time when the response is
processed) and the last transmission time in the symmetric mode.
Rework the code to use the last transmission in both modes and make it
independent from the time when the response is processed to avoid extra
delays due to waiting for HW TX timestamps.
Count served timestamps in all combinations of RX/TX and
daemon/kernel/hardware. Repurpose CLG_LogAuthNtpRequest() to update all
NTP-specific stats in one call per accepted request and response.
After 5f4cbaab7e ("ntp: optimize detection of clients using
interleaved mode") the local TX timestamp is saved for all requests
indicating interleaved mode even when no previous RX timestamp is found.
Count missing samples for the median filter when
NAU_PrepareRequestAuth() is failing.
Fixes: 4234732b08 ("ntp: rework filter option to count missing samples")
Don't adjust the NTP polling interval and decrement the burst count when
NAU_PrepareRequestAuth() fails (e.g. no NTS-KE response received yet,
network being down, or the server refusing connections), same as if an
NTP request could not be sent. Rely on the rate limiting implemented in
the NTS code.
Add a new test for maximum delay using a long-term estimate of a
p-quantile of the peer delay. If enabled, it replaces the
maxdelaydevratio test. It's main advantage is that it is not sensitive
to outliers corrupting the minimum delay.
As it can take a large number of samples for the estimate to reach the
expected value and adapt to a new value after a network change, the
option is recommended only for local networks with very short polling
intervals.
Instead of waiting for the sample filter to accumulate the specified
number of samples and then deciding if the result is acceptable, count
missing samples and get the result after the specified number of polls.
This should work better when samples are dropped at a high rate. The
source and clock update interval will be stable as long as at least
one sample can be collected.
When the minimum round-trip time is checked to enable a sub-second
polling interval, consider also the last sample in the filter to avoid
waiting for the first sample to be accumulated in sourcestats.
If a sub-second polling interval is configured, initialize the local
poll to 0 to avoid a shorter interval between the first and second
request in case no response to the first request is received (in time).
With the first interleaved response coming after a basic response the
client is forced to select the four timestamps covering most of the last
polling interval, which makes measured delay very sensitive to the
frequency offset between server and client. To avoid corrupting the
minimum delay held in sourcestats (which can cause testC failures),
reject the first interleaved response in the client/server mode as
failing the test A.
This does not change anything for the symmetric mode, where both sets of
the four timestamps generally cover a significant part of the polling
interval.
In addition to the 16s limit in per-response change in the monotonic
offset, don't allow the total accumulated offset injected in sourcestats
to be larger than 16 seconds.
Calculate the delay since the previous transmission only if the
TX timestamp is actually set. This removes an unnecessary delay when
starting at the Unix epoch in 1970 (e.g. in a test).
Don't ignore the magic field when searching for the exp1 extension
field in a received response. If there were two exp1 fields in the
packet, and only one of them had the expected magic value, it should
pick the right one.
Fixes: 2319f72b29 ("ntp: add client support for experimental extension field")
If the xleave option is enabled, ignore the key option and the hash
length. Always use version 4 as the default to get interleaved responses
from new chrony servers.
The interleaved modes are being specified for NTPv4 only. As a server,
detect interleaved requests only in NTPv4 packets.
Clients and peers can still send interleaved requests in lower-version
packets if configured with the version option.
Frequency transfer and time smoothing are conflicting features. Set the
monotonic timestamp in the experimental extension field to zero
(invalid) if time smoothing is activated.
Add "extfield F323" option to include the new extension field in
requests. If the server responds with this field, use the root
delay/dispersion and monotonic timestamp. Accumulate changes in the
offset between the monotonic and real-time receive timestamps and use
it for the correction of previous offsets in sourcestats. In the
interleaved mode, cancel out the latest change in the offset in
timestamps of the previous request and response, which were captured
before the change actually happened.
Maintain a server monotonic timescale needed for the experimental
extension field. It follows the best estimate of frequency without
time corrections. Implement it as an offset relative to the NTP time,
starting at zero, using a slew handler to cancel time corrections of the
NTP clock. The 32-bit epoch ID is set to a random value on start and
every step of the system clock.
Add an experimental extension field for some features that were proposed
for NTPv5. Higher-resolution root delay and dispersion (using 28-bit
fraction) are added. A monotonic receive timestamp will allow a
frequency transfer between the server and client. The client will be
able to separate the server's time corrections from frequency
corrections by tracking the offset between the real-time and monotonic
receive timestamps.
The field has a type of 0xF323 from the new experimental range proposed
by the NTP working group. Include a magic 32-bit value in the field to
avoid interoperability issues if a different implementation choses the
same type for its own experimental field. The value will be changed on
incompatible changes to avoid issues between two different chrony
versions.
Add a new variable to the packet info structure with flags for extension
fields included in received packets and add a new parameter to
transmit_packet() to add the fields to transmitted packets.
Since commit fdfcabd79b ("ntp: drop support for long NTPv4 MACs"), the
parser doesn't need to check validify of MACs in NTPv4 packets to
distinguish them from extension fields. Move the parser to ntp_core to
avoid having a separate iteration looking for non-authentication
extension fields.
When calculating the root delay and dispersion of a sample measured in
the interleaved mode, use the root delay and dispersion values from
the previous response (to which the TX timestamp corresponds). If the TX
timestamp is combined with the RX timestamp of the latest response (e.g.
in the symmetric mode), use the maximum of the previous and latest root
delay/dispersion.
When the server clock was updated between saving of the RX timestamp and
updating the TX timestamp, a client using interleaved mode with the four
timestamps which minimize error in measured delay (e.g. chrony) had the
server clock adjustment included in the measured delay, which could
disrupt the sample filtering and weighting.
Add a handler to track the slew epoch and remember the last offset. Undo
the adjustment in TX timestamps which have their RX timestamp in the
previous epoch to fix the delay observed by the clients.
If an unknown clock step is detected, drop all timestamps.
Don't save server RX and TX timestamp to clientlog if the transmission
or authentication failed (e.g. packet is handled in ntp_signd). They
will not be needed.