Sensor status: Tsoil.x.4.4cm.bao is reading high
T: ok
RH: ok
Ifan: ok
spd: ok
P: ok
co2/h2o: ok
csat u,v,w: ok
csat ldiag: ok
soils: Tsoil.x.4.4cm.bao is reading high
Wetness: ok
Rsw/Rlw/Rpile: ok
Voltages: ok
sstat outputs: ok, ok
- rad.bao was down for most of Fri & Sat – don't know why – Xbee reset should have been running
- tsoil.bao was spotty on Fri – don't know why
- Sat most sonics (including 2D at 10m.bao) were pretty bad in rain/snow
- need to add p.5m.bao and tc.5m.bao to WWW plots.
Just a reminder that the EC150 bandwidth parameter in the project config needs to be set to match the EC150 settings, since it is used to determine the sample delay time. Often this won't be a problem; however, we are now correlating the nanobarometer and EC150 winds for CABL, and it turned out that we had set the bandwidth to 25 Hz rather than 5 Hz. This has now been set to the correct value, and the w'p' cospectra no longer show a phase lag.
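As a quick sanity check (a hypothetical one-liner; the path is only a guess at the usual project layout, and the point is simply that the value found here must match the bandwidth configured in the EC100/EC150 itself):

fgrep -ir bandwidth /home/isfs/projects/CABL/ISFS/config/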
The check_trh.sh script just restarted the TRH at 100m because Ifan went below 10. After the power cycle, the Ifan values were around 43 mA; after about 40 seconds they settled down to 25 mA.
Ifan.100m has been weird since 09:40 MDT this morning, as seen in this screen grab of ncharts.
Ifan.200m dropped from around 35 to around 23 on May 1.
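For context, the check that triggered the restart above is essentially a threshold test on Ifan followed by a port power cycle. A minimal sketch of that logic (a hypothetical helper, not the actual check_trh.sh; the 10 mA threshold and the "tio <port> 1/0" port-power convention follow what is described elsewhere in this log):

#!/bin/sh
# Hypothetical sketch: power cycle a TRH port when its fan current is too low.
# Usage: ifan_check.sh <ifan_mA> [port]
ifan=${1:?usage: ifan_check.sh <ifan_mA> [port]}
port=${2:-5}
min=10
# Ifan in the raw TRH messages is an integer in mA, so an integer compare is enough.
if [ "$ifan" -lt "$min" ]; then
    logger "Ifan is $ifan mA, below $min mA. Power cycling port $port"
    tio $port 0     # port power off
    sleep 5
    tio $port 1     # port power back on; Ifan then reads high (~43 mA) before settling near 25 mA
fi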
Rick and I were out to do this today (5 May). Gordon had set up the config yesterday, so it was plug and play. The mast was down from about 10:12 to 11:15 to do this (it took a while to fish the data cable through).
In addition I:
- added a blue cap to the TRH bluetooth connector (I had noticed it was missing during our last trip)
- installed a "double shield" insert around the Tirga thermistor.
T.100m died on Sat (2 May) ~1600. eio didn't revive it; however, "ddn/dup" did! I noticed that its last message, at May 2 22Z, was normal. I see nothing in the logs about anything odd happening at this time.
The bao rad mote died between 1000 and 1400 on Sun (3 May), but came back; John's watchdog brought it back. The last message was at 16:11:56.3590Z, then an "Xbee reset" occurred at 19:12:00.1041 (the reset time was set to 3 hours, which is consistent with that gap).
Sonic data at 50, 150, and 250m were missing for about an hour around 0600 today (4 May), but came back.
About a week ago, Tsoil.x.4.4cm started going bad, and it has been garbage for the last 3 days. This is the brown-coated probe installed right-side up. The white-epoxy, upside-down Tsoil.4.4cm is okay. I note that the white-epoxy probe was what went bad at ehs...
I'm preparing to add a nanobarometer to bao.5m.
The TRH at 250m went out last night for about an hour, right as the checker script was power cycling it because the temp was high:
/var/log/isfs/dsm.log:May 1 01:34:50 250m root: temperature is -61.48 . Power cycling port 5
/var/log/isfs/dsm.log:May 1 01:34:54 250m root: temperature is 168.51 . Power cycling port 5
/var/log/isfs/dsm.log:May 1 01:39:54 250m root: temperature is 168.59 . Power cycling port 5
It came back on its own. Then this morning the Ifan value went to 0. This could have been because I was testing rserial at that time. I see some of the characters I was typing in the TRH output, such as "|fgre". So perhaps I caused this second issue. At 16:48 UTC I power cycled the TRH and Ifan came back.
2015 05 01 16:14:38.1284 1 37 TRH32 9.30 97.48 39 0 1226 201 124\r\n
2015 05 01 16:14:39.1384 1.01 37 TRH32 9.30 97.48 39 0 1226 201 124\r\n
2015 05 01 16:14:40.1384 1 37 TRH32 9.30 97.48 38 0 1226 201 120\r\n
2015 05 01 16:14:40.5740 0.4357 38 \rTRH32 9.30 97.48 38 0 1226 201 119\r\n
2015 05 01 16:14:42.1484 1.574 37 TRH32 9.30 97.48 38 0 1226 201 119\r\n
2015 05 01 16:14:43.1584 1.01 37 TRH32 9.30 97.48 38 0 1226 201 119\r\n
2015 05 01 16:14:44.1584 1 37 TRH32 9.30 97.48 38 0 1226 201 119\r\n
2015 05 01 16:14:44.6040 0.4456 44 f|freTRH32 9.34 97.49 38 0 1227 201 118\r\n
2015 05 01 16:14:47.2784 2.674 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:48.2784 1 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:49.2884 1.01 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:50.2884 1 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:51.2984 1.01 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:51.5941 0.2956 35 \rTRH32 9.30 97.04 0 0 1226 200 0\r\n
2015 05 01 16:14:53.3084 1.714 34 TRH32 9.34 97.49 0 0 1227 201 0\r\n
2015 05 01 16:14:54.3084 1 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
2015 05 01 16:14:55.3184 1.01 34 TRH32 9.30 97.48 0 0 1226 201 0\r\n
We can now make some statements about the performance of our new Tirga measurement. I'll call this "fan", vs. the original "shield".
Shield was deployed at ehs. First, I have to remove a bias of +1C from tc.5m to make T.2m and tc.5m agree with the heat flux. I then find that Tirga is generally within 1C of both T.2m and tc.5m. On some days and nights (presumably with clear skies), Tirga.5m agrees more closely with T.2m than with tc.5m. This makes sense, because the radiation error would act to raise daytime temps and lower nighttime temps, which is the same effect as measuring closer to the surface. Generally, the magnitude of this radiation error was about 0.5C.
Fan was deployed at bao. No tc adjustment was needed. At this site, large differences between 2m and 5m are seen – typically 5C at night. When the fan was running, differences from tc are typically within 1C. When the fan wasn't running (May 19 16:30 – Apr 29 17:30), daytime Tirga was typically 4C higher than tc. Presumably, this reflects the internal EC100 box temperature heating up.
Considering all of the above and using data only with the fan working, the nighttime Tirga.5m - tc.5m differences are about the same between fan and shield – typically within 0.5C. The daytime Tirga.5m - tc.5m differences for shield are on the order of 70% of those for fan – say 0.9 vs 1.3C. Thus, after all this work, fan is still worse than shield. Perhaps we need a double shield inside the EC100?
Rudy had noted that rad data died Monday afternoon. Efforts to reset it remotely using mote commands have failed, so there must be a hardware issue. We'll try to get out there today to replace this with a spare.
We'll replace the EC150 Tirga fan at the same time.
The GPS at 200m has quit reporting. It died around 01:00 UTC, April 27.
I noticed 200m was an outlier in the "chronyc sourcestats" output on flux. This listing shows an offset of +1051 microseconds for 200m, versus within about ±4 microseconds for the others:
chronyc sourcestats
210 Number of sources = 6
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset   Std Dev
==============================================================================
50m                        39  21   10h     -0.000      0.001  -2198ns     13us
100m                        5   4   68m     +0.002      0.124  +4085ns     27us
150m                       32  15  534m     +0.000      0.001  +2403ns     11us
200m                        7   4  103m     +0.058      0.024  +1051us     16us
250m                       43  21   12h     +0.000      0.001  +1879ns     12us
300m                        5   3   68m     -0.001      0.145  -1918ns     31us
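As a quick way to flag this kind of outlier (a hypothetical helper; it assumes the column layout shown above and the ns/us/ms suffixes that chronyc prints in the Offset column):

chronyc sourcestats | awk 'NR > 3 {
    off = $7
    val = off; gsub(/[^0-9.]/, "", val)      # strip sign and unit suffix, keep magnitude
    if      (off ~ /ns$/) us = val / 1000
    else if (off ~ /us$/) us = val
    else if (off ~ /ms$/) us = val * 1000
    else                  us = val * 1e6     # assume seconds if no recognized suffix
    if (us > 100) print $1, off              # report sources more than 100 us off
}'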
The serial port doesn't show large values of fe (framing errors) or breaks:
root@200m root# cktty 3
3: uart:XR16850 mmio:0x10000000 irq:122 tx:1936 rx:808452280 fe:24 RTS|DTR
I don't think power to serial port 3 can be controlled with "tio 3 1/0". When I tried to power off the GPS on 150m, the output from "rs G" did not stop.
As a workaround, I edited /etc/ntp.conf on 200m and added 150m as a server. So there is no urgency to replace this GPS.
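For reference, the workaround is a one-line addition to /etc/ntp.conf on 200m, something like the following (the iburst option is my assumption; it just speeds up the initial sync), followed by restarting ntpd so it picks up the new server:

server 150m iburst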
Dan noticed that 300m TRH data were NA last night. This morning, a power cycle showed that the fan isn't turning on – it acts like the fan is stuck.
We plan to drop off a new TRH this afternoon for Dan/Bruce to swap in (probably on Monday).
Sensor ID3 I2C ADD: 12 data rate: 1 (secs) fan(0) max current: 80 (ma)\n
resolution: 12 bits 1 sec MOTE: off\r\n
calibration coefficients:\r\n
Ta0 = -4.112729E+1\r\n
Ta1 = 4.153065E-2\r\n
Ta2 = -5.198994E-7\r\n
Ha0 = -7.871138E+0\r\n
Ha1 = 6.237115E-1\r\n
Ha2 = -5.446227E-4\r\n
Ha3 = 8.683383E-2\r\n
Ha4 = 7.886339E-4\r\n
Fa0 = 3.222650E-1\r\n
TRH3 11.82 35.91 329 0 1296 72 1023\r\n
TRH3 11.82 35.91 221 0 1296 72 687\r\n
TRH3 11.82 35.91 118 0 1296 72 367\r\n
TRH3 11.78 35.90 69 0 1295 72 216\r\n
TRH3 11.82 35.91 39 0 1296 72 123\r\n
TRH3 11.78 35.90 18 0 1295 72 57\r\n
TRH3 11.78 35.90 6 0 1295 72 19\r\n
TRH3 11.78 35.90 0 0 1295 72 0\r\n
TRH3 11.78 35.90 0 0 1295 72 0\r\n
TRH3 11.78 36.45 0 0 1295 73 0\r\n
TRH3 11.74 36.45 0 0 1294 73 0\r\n
From the logs of the check_trh process on flux I see these entries since it was started on April 9. For some reason the higher TRHs had some issues yesterday.
Times in MDT:
fgrep cycling /var/log/messages*
Apr 18 18:49:09 flux check_trh.sh: 300m temperature is 137.88 . Power cycling port 5
Apr 23 13:03:58 flux check_trh.sh: 300m temperature is 174.1 . Power cycling port 5
Apr 23 13:06:58 flux check_trh.sh: 300m temperature is 174.28 . Power cycling port 5
Apr 23 13:08:18 flux check_trh.sh: 200m temperature is 181.61 . Power cycling port 5
Apr 23 13:08:38 flux check_trh.sh: 300m temperature is 174.28 . Power cycling port 5
Apr 23 13:08:58 flux check_trh.sh: 200m temperature is 181.53 . Power cycling port 5
Apr 23 13:09:48 flux check_trh.sh: 300m temperature is 174.06 . Power cycling port 5
Apr 23 13:16:48 flux check_trh.sh: 200m temperature is 179.15 . Power cycling port 5
Apr 23 13:19:18 flux check_trh.sh: 250m temperature is 173.33 . Power cycling port 5
Apr 23 13:30:38 flux check_trh.sh: 200m temperature is 177.04 . Power cycling port 5
Apr 23 13:48:38 flux check_trh.sh: 250m temperature is 171.63 . Power cycling port 5
Apr 23 13:50:48 flux check_trh.sh: 250m temperature is 173.08 . Power cycling port 5
Yesterday (April 23) I reworked things so that the check script is run on each DSM, including the bao station. The only entries after that are from 300m. Subtracting 6 hours from the times, these are at 13:27-13:29 MDT.
Times in UTC:
ssh 300m fgrep cycling /var/log/isfs/dsm.log
Apr 23 19:27:33 300m root: temperature is -62.52 . Power cycling port 5
Apr 23 19:28:25 300m root: temperature is -62.52 . Power cycling port 5
Apr 23 19:29:41 300m root: temperature is -62.52 . Power cycling port 5
For example, here is the hiccup from 200m at 19:30:22 UTC. Note that after the first power cycle, things look good for about 5 seconds, then it reports a bad temperature of 89.92 at 19:30:50.1491, is power cycled again, and works after that.
data_dump -i 4,20 -A 200m_20150423_160000.dat | more
...
2015 04 23 19:30:17.3598 1.001 37 TRH30 15.13 27.28 34 0 1377 56 107\r\n
2015 04 23 19:30:18.3691 1.009 37 TRH30 15.09 27.28 33 0 1376 56 105\r\n
2015 04 23 19:30:19.3691 1 37 TRH30 15.13 27.28 34 0 1377 56 108\r\n
2015 04 23 19:30:20.3692 1 37 TRH30 15.09 27.28 33 0 1376 56 103\r\n
2015 04 23 19:30:21.3790 1.01 37 TRH30 15.09 27.28 34 0 1376 56 108\r\n
2015 04 23 19:30:22.6191 1.24 40 TRH30 177.00 260.02 36 0 5510 886 112\r\n
2015 04 23 19:30:23.6290 1.01 40 TRH30 177.00 260.18 35 0 5510 885 109\r\n
2015 04 23 19:30:24.6290 1 40 TRH30 177.04 260.21 34 0 5511 885 106\r\n
2015 04 23 19:30:25.6398 1.011 40 TRH30 177.08 260.40 33 0 5512 884 105\r\n
...
2015 04 23 19:30:37.6898 1.001 40 TRH30 177.26 260.19 32 0 5517 886 102\r\n
2015 04 23 19:30:38.6900 1 40 TRH30 177.23 260.33 34 0 5516 885 108\r\n
2015 04 23 19:30:39.6991 1.009 38 TRH30 177.30 260.87 5 0 5518 882 16\r\n
2015 04 23 19:30:43.7398 4.041 2 \n
2015 04 23 19:30:43.7408 0.001042 80 \r Sensor ID30 I2C ADD: 11 data rate: 1 (secs) fan(0) max current: 80 (ma)\n
2015 04 23 19:30:43.8292 0.08842 44 \rresolution: 12 bits 1 sec MOTE: off\r\n
2015 04 23 19:30:43.8806 0.05133 28 calibration coefficients:\r\n
2015 04 23 19:30:43.9098 0.02924 21 Ta0 = -4.129395E+1\r\n
2015 04 23 19:30:43.9398 0.02995 21 Ta1 = 4.143320E-2\r\n
2015 04 23 19:30:43.9691 0.02937 21 Ta2 = -3.293163E-7\r\n
2015 04 23 19:30:43.9899 0.02073 21 Ha0 = -7.786594E+0\r\n
2015 04 23 19:30:44.0191 0.02922 21 Ha1 = 6.188832E-1\r\n
2015 04 23 19:30:44.0449 0.02582 21 Ha2 = -5.069766E-4\r\n
2015 04 23 19:30:44.0691 0.02418 21 Ha3 = 9.665616E-2\r\n
2015 04 23 19:30:44.0991 0.03 21 Ha4 = 6.398342E-4\r\n
2015 04 23 19:30:44.1191 0.02001 21 Fa0 = 3.222650E-1\r\n
2015 04 23 19:30:45.1098 0.9907 37 TRH30 15.17 26.14 32 0 1378 54 102\r\n
2015 04 23 19:30:46.1191 1.009 37 TRH30 15.17 26.14 33 0 1378 54 103\r\n
2015 04 23 19:30:47.1291 1.01 37 TRH30 15.17 26.14 34 0 1378 54 108\r\n
2015 04 23 19:30:48.1290 0.9999 37 TRH30 15.17 26.14 32 0 1378 54 101\r\n
2015 04 23 19:30:49.1390 1.01 37 TRH30 15.17 26.14 33 0 1378 54 105\r\n
2015 04 23 19:30:50.1491 1.01 32 TRH30 89.92 0.90 0 0 3251 0 0\r\n
2015 04 23 19:30:53.5790 3.43 2 \n
2015 04 23 19:30:53.5801 0.001042 80 \r Sensor ID30 I2C ADD: 11 data rate: 1 (secs) fan(0) max current: 80 (ma)\n
2015 04 23 19:30:53.6699 0.08981 44 \rresolution: 12 bits 1 sec MOTE: off\r\n
2015 04 23 19:30:53.7213 0.05139 28 calibration coefficients:\r\n
2015 04 23 19:30:53.7491 0.0278 21 Ta0 = -4.129395E+1\r\n
2015 04 23 19:30:53.7790 0.02995 21 Ta1 = 4.143320E-2\r\n
2015 04 23 19:30:53.8083 0.02925 21 Ta2 = -3.293163E-7\r\n
2015 04 23 19:30:53.8290 0.02075 21 Ha0 = -7.786594E+0\r\n
2015 04 23 19:30:53.8601 0.03103 21 Ha1 = 6.188832E-1\r\n
2015 04 23 19:30:53.8898 0.02971 21 Ha2 = -5.069766E-4\r\n
2015 04 23 19:30:53.9108 0.02107 21 Ha3 = 9.665616E-2\r\n
2015 04 23 19:30:53.9398 0.02892 21 Ha4 = 6.398342E-4\r\n
2015 04 23 19:30:53.9691 0.02932 21 Fa0 = 3.222650E-1\r\n
2015 04 23 19:30:54.9590 0.99 37 TRH30 15.17 26.14 34 0 1378 54 107\r\n
2015 04 23 19:30:55.9598 1.001 37 TRH30 15.17 26.14 33 0 1378 54 103\r\n
2015 04 23 19:30:56.9691 1.009 37 TRH30 15.21 26.15 34 0 1379 54 108\r\n
2015 04 23 19:30:57.9691 1 37 TRH30 15.17 26.14 33 0 1378 54 103\r\n
Notice the delta-T column after the datetime. I've looked at a few of these, and I think there is always a larger delta-T (in this case 1.24 s instead of 1.0) at the time of the initial bad data, in case that might help in debugging.
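A quick way to hunt for these glitch onsets (a hypothetical one-liner; it assumes the column layout of the data_dump -A listing above, where delta-T is the field right after the time):

data_dump -i 4,20 -A 200m_20150423_160000.dat | awk '$5 + 0 > 1.2'    # print only samples whose delta-T exceeds 1.2 s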
9am, Apr 25: Some more glitches since yesterday. Notice again that the problems in different sensors seem to occur at approximately the same times:
ck_trh
200m
Apr 24 20:56:05 200m root: temperature is 170.15 . Power cycling port 5
Apr 24 20:57:10 200m root: temperature is 170.22 . Power cycling port 5
300m
Apr 24 20:51:47 300m root: temperature is -62.52 . Power cycling port 5
Apr 24 20:56:52 300m root: temperature is -62.52 . Power cycling port 5
This was done from about 2:30-4:30 with Steve S&O and Kurt. Soil samples were taken – I will update the gravimetric posting. I queried each soil sensor for IDs (since I had to look at the Qsoil values anyway) into a minicom capture file (attached: ehsteardown.cap).
Problem soil sensors at this site have been:
Tsoil.0.6cm (SN 12, epoxy-coated, upside down): Didn't work from a few hours after installation until it revived itself ~3 weeks later. Ran fine until tear-down (including the manual reading I took during tear-down). Back in the lab, nothing is visually wrong with this probe.
Qsoil.5cm (SN 12): Worked fine for the first 3 weeks, then started dropping data (for hours at a time) during the last 3 weeks. Worked during the manual reading during tear-down. In the lab, the Binder connector was found not fully seated – pulled out by about 0.7mm. Sorry, I forgot to inspect it during the actual tear-down. This is a post-mortem item: verify that each Binder connector is fully seated during probe installation.
It appears that after EOL systems work yesterday, /scr/isfs didn't come up. WWW plots died (though there is at least one WWW plots issue that is due to ehs being decommissioned). rsync from bao didn't happen last night.
I've just submitted a system help request to correct this, and tonight's rsync task should regenerate the data.
From the plots at http://datavis.eol.ucar.edu/ncharts/projects/CABL/geo_notiltcor, it seems that several of the sonics on the tower were intermittently not reporting data on 4/17. The 5-minute data files also have several '_' points. Is this a permanent data outage, or are the data recoverable?