NTP and GPS

The turbulence tower data system (also known as the DSM) uses a GPS receiver and the NTP (Network Time Protocol) software to set the system clock. In addition to the normal uses of a system clock, it is used to time-tag the data samples.

The serial messages from the GPS are received on serial port 3, /dev/ttyS3. The pulse-per-second square-wave signal (PPS) from the GPS is also connected to the CTS line of that serial port. The PPS patch has been added to the Linux kernel on the data system so that an interrupt function can be registered to run in response to the CTS interrupts. This interrupt function will be called immediately after the rising edge of the PPS signal has been detected by the serial port hardware.

The NTP software on the DSM runs a reference clock driver for a Generic NMEA GPS Receiver, with PPS. This driver reads the once-per-second GPS RMC records from the serial port, and registers a function to be run on receipt of the PPS interrupt. NTP uses these two sets of information to create a GPS reference clock. It then monitors the state of the GPS reference clock and the system clock, and makes gradual adjustments to the system clock to bring it into close agreement with the GPS clock.

The RMC records contain the current date and time, in addition to latitude, longitude, and other quantities. The transmission time of the RMC message is not tightly controlled. It appears to be primarily affected by lags associated with internal GPS processing, and is likely also affected by which other messages are enabled for output on the GPS. The exact receipt time of the RMC message is not used for clock adjustments. NTP simply uses the time fields within the RMC message as an absolute time label for the previous PPS, whose timing is very precise.
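To make the role of the RMC time fields concrete, here is a minimal Python sketch that pulls the UTC date and time out of an RMC sentence. It assumes the common NMEA 0183 layout (time in field 1, date in field 9, XOR checksum after the `*`). The sample sentence is made up for illustration, with a placeholder position, and the function names are hypothetical. This is not the code NTP or the DSM actually runs.

```python
from datetime import datetime, timedelta, timezone
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of every character between '$' and '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def parse_rmc_time(sentence: str) -> datetime:
    """Return the UTC date and time carried inside an RMC sentence."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    if checksum and checksum[:2].upper() != nmea_checksum(body):
        raise ValueError("bad NMEA checksum")
    fields = body.split(",")
    if not fields[0].endswith("RMC"):
        raise ValueError("not an RMC sentence")
    hhmmss, ddmmyy = fields[1], fields[9]
    # RMC carries a two-digit year; strptime maps 00-68 to 20yy (assumption
    # that suits the 2011 data discussed on this page).
    stamp = datetime.strptime(ddmmyy + hhmmss[:6], "%d%m%y%H%M%S")
    # Optional fractional seconds, e.g. "164139.00".
    fraction = float("0" + hhmmss[6:]) if hhmmss[6:] else 0.0
    return stamp.replace(tzinfo=timezone.utc) + timedelta(seconds=fraction)


# Hypothetical sentence: the position fields are placeholders, not tower data.
body = "GPRMC,164139,A,0000.0000,N,00000.0000,W,000.0,000.0,120411,000.0,E"
sentence = f"${body}*{nmea_checksum(body)}\r\n"
print(parse_rmc_time(sentence).isoformat())
```

The returned time is only a label for the preceding PPS edge; the arrival time of the sentence itself plays no part in it.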

Clock Variables

We monitor the following variables to keep track of the DSM timekeeping, and plot them on the daily web plots:

  • GPSdiff: The time difference, in seconds, between the time-tag that was assigned to an RMC message and the date and time contained within the message. The time-tag assigned to a message sample is the value of the system clock at the moment the first byte of the message was received. For example, a value of 0.6 secs means that the data system assigned a time-tag to the RMC message that was 0.6 seconds later than the time value contained in the message. GPSdiff will be affected by processing lags within the GPS, DSM data sampling lags, and the drift of the DSM system clock relative to the clock within the GPS receiver. When 5 minute statistics are computed, the maximum and minimum values of GPSdiff for each 5 minute period are written to the output NetCDF files as GPSdiff_max and GPSdiff_min.
  • GPSnsat: number of satellites being tracked by the receiver, that is, the number of satellites whose signals are used in the time and location solution.
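The GPSdiff definition above can be sketched in a few lines of Python. This is an illustration only, not the DSM processing code: the function names are hypothetical, and the 5 minute periods are assumed to be aligned to multiples of 300 seconds of UTC.

```python
from datetime import datetime, timedelta, timezone


def gps_diff(time_tag: datetime, message_time: datetime) -> float:
    """Seconds by which the DSM time-tag is later than the time in the message."""
    return (time_tag - message_time).total_seconds()


def five_minute_min_max(samples):
    """Group (time_tag, gpsdiff) pairs into 5 minute periods.

    Returns {period_start: (GPSdiff_min, GPSdiff_max)}.
    """
    stats = {}
    for time_tag, diff in samples:
        # Assumed alignment: periods start on multiples of 300 s of UTC.
        start_s = int(time_tag.timestamp()) // 300 * 300
        start = datetime.fromtimestamp(start_s, tz=timezone.utc)
        low, high = stats.get(start, (diff, diff))
        stats[start] = (min(low, diff), max(high, diff))
    return stats


# Made-up example: the message says 16:41:39, the DSM tagged it 0.6 s later.
msg_time = datetime(2011, 4, 12, 16, 41, 39, tzinfo=timezone.utc)
tag = msg_time + timedelta(milliseconds=600)
print(gps_diff(tag, msg_time))
```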

NTP on the DSM is configured to log information about time-keeping in a "loopstats" file. See http://www.eecis.udel.edu/~mills/ntp/html/monopt.html for information on the NTP monitoring options.

  • NTPClockOffset: the estimated offset of the GPS time from the data system time. A positive value indicates that NTP is estimating that the GPS clock is ahead of the system clock, i.e. the GPS is showing a later time than the system clock.
  • NTPFreqOffset: the correction applied to the system clock frequency, in parts-per-million. A positive value indicates that NTP has determined that the system clock oscillator is slow, and the frequency offset is being added to the system clock values.
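To get a feel for the size of a frequency offset, the following sketch converts parts-per-million into accumulated time error. The 39.3 ppm figure is taken from the loopstats excerpt later on this page; the function name is hypothetical, and the calculation assumes the offset stays constant over the interval.

```python
def drift_seconds(freq_offset_ppm: float, elapsed_seconds: float) -> float:
    """Time error accumulated over elapsed_seconds if the correction were not applied."""
    return freq_offset_ppm * 1e-6 * elapsed_seconds


# About 39.3 ppm over one day (86400 s): a few seconds of uncorrected drift.
print(f"{drift_seconds(39.3, 86400):.2f} s/day")
```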

The NTP logs have not been recorded consistently since the beginning of the project. 2010 data from May 3 to August 12 and from October 14 to November 9 are available, as well as all data from April 9, 2011 onward.

New Garmin 18x-LVC GPS

On April 12, 2011 the old Garmin GPS 25-HVS at the tower was replaced with a much newer Garmin 18x-LVC model. The model numbers are shown in the $PGRMT messages, where the time is UTC:

data_dump -i 1,30 -A manitou_20110412_120000.bz2 | fgrep GPS
...
2011 04 12 16:41:39.6568    0.15      49 $PGRMT,GPS 25-HVS VER 2.50 ,P,P,R,R,P,,23,R*08\r\n
2011 04 12 16:42:50.4248  0.1249      51 $PGRMT,GPS 18x-LVC software ver. 3.10,,,,,,,,*6D\r\

Unexpectedly, the newer GPS provided much better time-keeping.

The following plot is for the old 25-HVS model for 3 days prior to the swap:

The NTPClockOffset_max ranges from approximately -1000 to 50000 microseconds during that period. The upward spikes in NTPClockOffset_max are simultaneous with positive jumps in GPSdiff_max of up to 2.5 seconds. These jumps in GPSdiff_max also seem to happen when the number of tracked satellites changes, indicating that internal processing lags in the 25-HVS cause it to report late. It is unknown whether the PPS signal is affected by these events. At these moments, NTP estimates that the system clock is early relative to the GPS, and starts to correct for the error by speeding up the system clock, seen as the positive spike in NTPClockOffset and NTPFreqOffset. When the GPS recovers from its delayed reporting, NTP sees that the system clock has gotten ahead of the GPS, reports a negative NTPClockOffset, and gradually returns to the previous frequency offset.

After installing the 18x-LVC, the NTPClockOffset stays within a much smaller range, from -10 to 25 microseconds:

GPSdiff is also much better behaved, ranging from 0.5 to 1.1 seconds. The number of satellites tracked by the new GPS is also generally higher.

Temperature Effects

As expected, the frequency offset shows a temperature dependence in the system clock oscillator. We do not have a measurement of the temperature inside the data system. The top panel in the plot below shows a time series of the ambient air temperature at 2 meters on the tower, along with the NTPFreqOffset, for a cool 3 day period in April. When the ambient air temperature is below 5 deg C, the system clock oscillator does not show an obvious temperature relation.

The bottom panel shows a close relationship between the NTPClockOffset and the time derivative of NTPFreqOffset, indicating how NTP adjusts the clock.
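The comparison in the bottom panel can be reproduced from logged values with a simple finite difference. The sketch below uses the four loopstats records quoted later on this page (seconds past midnight, frequency offset in ppm, and clock offset converted to microseconds). The helper name is hypothetical, and four points are far too few to establish the relationship; they only illustrate the calculation.

```python
def finite_difference_slopes(times_s, values):
    """Slope between each pair of consecutive samples (value units per second)."""
    pairs = list(zip(times_s, values))
    return [(v1 - v0) / (t1 - t0) for (t0, v0), (t1, v1) in zip(pairs, pairs[1:])]


# Values copied from the loopstats excerpt further down this page.
seconds = [54504.454, 54520.455, 54536.454, 54552.454]
freq_ppm = [39.301, 39.302, 39.303, 39.305]
offset_us = [5.0, 6.0, 5.0, 5.0]

slopes = finite_difference_slopes(seconds, freq_ppm)
for off, slope in zip(offset_us[1:], slopes):
    print(f"offset {off:4.1f} us   d(freq)/dt {slope:.2e} ppm/s")
```

In this short excerpt the clock offset is positive and the frequency offset is rising, which has the same sign relationship as the plot.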

During a warmer 3 day period in July, when the temperatures were all above 5 deg C, the temperature effect on the system clock oscillator is very evident.

Time Offsets During File Transfers

The periodic spikes in GPSdiff_max of up to 1 second, which occur at 23:00 local time and last about an hour, are simultaneous with the network transfer of the day's data files from the DSM to the RAL server. They indicate increased sample buffering and latency at these times, which needs to be investigated and improved.

At these times there is also a little bump in NTPFreqOffset. I can think of two possible causes of this. It could be due to increased interrupt load at these times, causing increased latency in the interrupt function that is called in response to the PPS interrupts. Increased latency in response to PPS interrupts should cause NTP to think that the GPS clock has fallen behind the system clock, but the NTPClockOffset at these times is positive, and the slope of NTPFreqOffset is positive, indicating that NTP thinks the GPS clock is ahead.

Or the bump could be caused by increased heating of the system clock oscillator, due to increased CPU load during the file transfers. The sign of NTPClockOffset is consistent with a heating effect, as described above. Also, when the ambient air temperature is very cold, the bump is diminished, or has the opposite slope. So my thinking at this point is that the bump is caused by increased oscillator heating.

PPSTEST

On the DSM, the ppstest program is helpful for gaining an understanding of the system and GPS clocks. It displays the system clock value when the interrupt function is called at the time of the assertion and the clear of the PPS signal. Press ctrl-C to terminate ppstest.

root@manitou root# ppstest /dev/ttyS3
trying PPS source "/dev/ttyS3"
found PPS source #3 "serial3" on "/dev/ttyS3"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1315494544.999995675, sequence: 37249847 - clear  1315494544.099998000, sequence: 37249862
source 0 - assert 1315494544.999995675, sequence: 37249847 - clear  1315494545.099995000, sequence: 37249863
source 0 - assert 1315494545.999994675, sequence: 37249848 - clear  1315494545.099995000, sequence: 37249863
source 0 - assert 1315494545.999994675, sequence: 37249848 - clear  1315494546.099993000, sequence: 37249864
source 0 - assert 1315494546.999994675, sequence: 37249849 - clear  1315494546.099993000, sequence: 37249864
ctrl-C

The above sequence shows that the system clock is behind the GPS. The system time when the interrupt function is being called on the PPS assert is 5 microseconds before the exact second (1.0 - 0.999995). This corresponds to a NTPClockOffset of positive 5 microseconds. This is confirmed with the ntpq program (which reports its offset in milliseconds):
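The arithmetic above can be checked with a short Python sketch that parses ppstest output lines of the form shown. It works in integer nanoseconds to avoid floating-point rounding. The regular expression and function names are my own, written against the output quoted on this page, so other ppstest versions may need a different pattern.

```python
import re

PPS_LINE = re.compile(
    r"assert (\d+)\.(\d{9}), sequence: (\d+) - "
    r"clear\s+(\d+)\.(\d{9}), sequence: (\d+)"
)
NS = 1_000_000_000


def parse_pps_line(line: str):
    """Return (assert_ns, clear_ns) as integer nanoseconds since the epoch."""
    m = PPS_LINE.search(line)
    if m is None:
        raise ValueError("not a ppstest data line")
    return (int(m.group(1)) * NS + int(m.group(2)),
            int(m.group(4)) * NS + int(m.group(5)))


def assert_offset_ns(line: str) -> int:
    """System clock reading at PPS assert minus the nearest whole second.

    A negative result means the system clock had not yet reached the second
    when the pulse arrived, i.e. the system clock is behind the GPS.
    """
    assert_ns, _ = parse_pps_line(line)
    nearest_second_ns = (assert_ns + NS // 2) // NS * NS
    return assert_ns - nearest_second_ns


line = ("source 0 - assert 1315494544.999995675, sequence: 37249847 - "
        "clear  1315494545.099995000, sequence: 37249863")
assert_ns, clear_ns = parse_pps_line(line)
print(assert_offset_ns(line) / 1000.0, "microseconds")
print((clear_ns - assert_ns) / 1e6, "ms between assert and clear")
```

For this line the offset comes out at -4.325 microseconds, which matches the roughly 5 microsecond figure above once rounded. The second value is only meaningful on lines where the clear timestamp belongs to the same pulse as the assert.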

root@manitou root# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xral             38.229.71.1      3 u   34   64  377    0.320    3.804   0.031
 LOCAL(0)        .LOCL.          10 l  93d   64    0    0.000    0.000   0.000
oGPS_NMEA(0)     .GPS.            2 l    6   16  377    0.000    0.005   0.031

The ntpq output indicates (with the leading 'o') that NTP is using the GPS as the system's reference clock. It also displays the offset of the RAL server's clock, 3.804 milliseconds, and indicates with an 'x' that it is not using that clock as a reference. The RAL server uses NTP over a Wi-Fi connection to adjust its clock, so it is not as accurate as the DSM.
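For monitoring scripts, the peer table can be split into fields with a few lines of Python. This sketch assumes the ten-column layout shown above, with the tally character in the first column. The tally descriptions are my paraphrase of the ntpq documentation, and the function name is hypothetical.

```python
# Tally characters as I understand them from the ntpq documentation.
TALLY = {
    " ": "rejected",
    "x": "falseticker (not used)",
    "-": "outlier",
    "+": "candidate",
    "*": "system peer",
    "o": "system peer, synchronized via PPS",
}
COLUMNS = ("remote", "refid", "st", "t", "when", "poll",
           "reach", "delay", "offset", "jitter")


def parse_ntpq_peers(text: str):
    """Parse `ntpq -p` output into a list of dicts; offset and jitter are in ms."""
    peers = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith(("remote", "=")):
            continue  # skip blank lines, the header, and the separator
        fields = line[1:].split()
        if len(fields) != len(COLUMNS):
            continue  # not a peer line (e.g. a shell prompt)
        peer = dict(zip(COLUMNS, fields))
        peer["tally"] = line[0]
        peer["offset_ms"] = float(peer["offset"])
        peers.append(peer)
    return peers


SAMPLE = (
    "     remote           refid      st t when poll reach   delay   offset  jitter\n"
    "==============================================================================\n"
    "xral             38.229.71.1      3 u   34   64  377    0.320    3.804   0.031\n"
    " LOCAL(0)        .LOCL.          10 l  93d   64    0    0.000    0.000   0.000\n"
    "oGPS_NMEA(0)     .GPS.            2 l    6   16  377    0.000    0.005   0.031\n"
)
for p in parse_ntpq_peers(SAMPLE):
    print(p["remote"], TALLY.get(p["tally"], "?"), p["offset_ms"] * 1000.0, "us")
```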

The loopstats file also shows the 5 usec offset at this time:

55812 54504.454 0.000005000 39.301 0.000030518 0.001408 4
55812 54520.455 0.000006000 39.302 0.000030518 0.001415 4
55812 54536.454 0.000005000 39.303 0.000030518 0.001392 4
55812 54552.454 0.000005000 39.305 0.000030518 0.001372 4
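The columns can be decoded with a small Python sketch. It assumes the loopstats layout described in the NTP monitoring options page linked above: modified Julian day, seconds past midnight UTC, clock offset (s), frequency offset (ppm), jitter (s), wander (ppm), and the clock discipline time constant. The class and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

# Modified Julian Day 0 is 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


class LoopStat(NamedTuple):
    time: datetime        # UTC time of the record
    offset_us: float      # clock offset, microseconds
    freq_ppm: float       # frequency offset, parts per million
    jitter_us: float      # jitter, microseconds
    wander_ppm: float     # frequency wander, parts per million
    time_constant: int    # clock discipline time constant


def parse_loopstats_line(line: str) -> LoopStat:
    mjd, secs, offset, freq, jitter, wander, tc = line.split()
    time = MJD_EPOCH + timedelta(days=int(mjd), seconds=float(secs))
    return LoopStat(time, float(offset) * 1e6, float(freq),
                    float(jitter) * 1e6, float(wander), int(tc))


rec = parse_loopstats_line("55812 54504.454 0.000005000 39.301 0.000030518 0.001408 4")
print(rec.time.isoformat(), f"offset={rec.offset_us:.1f} us", f"jitter={rec.jitter_us:.3f} us")
```

The first record decodes to September 8, 2011 at about 15:08:24 UTC, the same day and hour as the ppstest timestamps above. As a purely numerical observation, the logged jitter of 0.000030518 s equals 2^-15 s to the printed precision; whether that reflects a quantization floor in NTP is an assumption I have not verified.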

I do not believe I've seen a jitter value less than 31 microseconds. Not sure why that is. I believe the jitter is the standard deviation of the offset, but the NTP documentation is rather unclear to me.
