Heavy dew this morning.
Took most of yesterday off. Still a few sensor issues to deal with today:
- grass Tsoil – probably replace
- init.6m - probably need to replace, but don't have a spare (still at least 6 back at Campbell being repaired). Confirmed a red sonic LED in EC100, and all aspects of installation look fine.
- TRH.8.5m.rel - unknown issue (climbed and replaced SHT sensor, but turned out to be RS422/232 jumpering. Now fixed.)
- mote.20m.rel not sending data (mote console cable Bulgin to standard serial Bulgin connection at 20m doesn't have a purple ring. Retaped and got mote LEDs alive.)
Electric fence is mostly up (to protect CTEMPS fiber).
Dry run IOP is scheduled for tonight: 2200-0000.
Data flow was horrible this morning – cockpit wouldn't connect to streams, and nagios showed something critical on almost all dsms. Did an ansible restart_dsm on everything, which got most going (P4 still won't connect, as usual with morning dew).
3 Comments
Gary Granger
When I was looking this morning while logged into ustar, I could not ping any of the radios except sodar, so I suspected that the radio on rel was down. I'm surprised the ansible dsm restart was able to connect to the DSMs. Perhaps the radio links had recovered by the time you ran that, or only just started to recover? Any hint of power problems on rel or anything else that might have caused the rel access point to go down?
Steve Oncley AUTHOR
When we arrived at about 9, nagios had both OK and Critical buttons for every dsm. I was able to ping/ssh to the one or two that I tried. Thus, I wasn't surprised that ansible worked (except for P4). Of course, we haven't connected power monitoring to rel – I just haven't figured out where to put the batteries to get the USB cable to reach – so I can't comment about power availability. We did have a lot of dew, as I mentioned, though the tower was totally dry by the time I climbed it at about 10.
Gary Granger
According to nagios, the rel radio was down from 2018-09-23 04:22 MDT - 08:05 MDT, or 10:22 to 14:05 UTC, or 05:22 - 09:05 CDT. The trends plot below is in MDT, since that's the timezone of my browser:
I've looked at the data files on rel1, rel2, and relm, and there is definitely a corresponding data gap on rel1 and relm, but not on rel2. I conclude that just relm and rel1 went down, and the rel radio is powered by relm, and both are powered by a different solar panel and battery than rel2 and relt. If that sounds consistent, then perhaps the relm power is not getting enough charging to last it through the night?
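For anyone cross-checking the outage window against data files stamped in UTC, the conversion quoted above can be reproduced with a short Python sketch. This is only an illustration: the `convert` helper and the fixed-offset zone objects are assumptions made for this example (MDT taken as UTC-6, CDT as UTC-5) and are not part of any project tooling.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the daylight-time zones named in the comment
# (assumed: MDT = UTC-6, CDT = UTC-5).
MDT = timezone(timedelta(hours=-6), "MDT")
CDT = timezone(timedelta(hours=-5), "CDT")
UTC = timezone.utc


def convert(stamp, src, dst):
    """Parse 'YYYY-MM-DD HH:MM' as local time in src and return it in dst."""
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=src)
    return local.astimezone(dst)


# Outage window as reported by nagios, in MDT.
start = "2018-09-23 04:22"
end = "2018-09-23 08:05"

for zone in (UTC, CDT):
    window = [convert(t, MDT, zone).strftime("%H:%M") for t in (start, end)]
    print(zone.tzname(None), " - ".join(window))

# Length of the outage, independent of time zone.
outage = convert(end, MDT, UTC) - convert(start, MDT, UTC)
print("duration:", outage)
```

The same window could then be compared against the sample times in the rel1 and relm files to confirm that the data gap lines up with the radio outage.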