Pico/Waveshare Meshtastic node with GPS locks up every 8 hours

I have started a new thread for this because the original issue is largely solved, but it has left behind a problem that is proving difficult to resolve.
Entirely thanks to Michael99645's generosity with his time, the Pico/Waveshare LoRa now works with the u-blox GPS. However, one problem remains that has so far resisted our efforts, so I am wondering if anyone else has useful input.
After very close to 8 hours of running with the GPS attached, the Pico locks up. The OLED display stops updating and the green LED on the Pico is either off, permanently on, or flashing rapidly. Disconnecting the USB power cable and reconnecting it restores everything to normal for another 8 hours. But then the Pico locks up again.
Obviously a Meshtastic node that locks up every 8 hours is not much use!
I have 2 other identical nodes that do not have GPS attached and they never lock up.
I have tried:-

  1. A brand new Pico.
  2. Swapping the Waveshare LoRa module.
  3. Different versions of the Meshtastic firmware (each one modified by Michael to talk to the GPS).
  4. Different USB power supplies.
  5. Different USB cables.

Nothing has made any difference. Within plus or minus a few minutes of 8 hours the Pico always locks up.
We (essentially Michael) are still working on the issue but we thought it was time to see if anyone else can contribute usefully.
Ian

Hey @Ian100768

My first thought is that the Pico is storing the location data from the GPS and filling up its available memory.

There should be an option in the config to enable smart broadcast, which will only log the location if the device moves by a specified amount. Maybe give this a crack and see if you get any new results?

Position Configuration Meshtastic

Thanks Dan. I did have that turned off. I have turned it on now and resumed testing.

Dan, I turned on Smart Position and it ran for 8 h 45 m and then locked up. Whether the slightly longer run time is the Smart Position being on I don’t know.
Any other ideas anyone?

Hi @Ian100768

While it's a bit janky, you could use the Meshtastic Python CLI, specifically the reboot command, to reset the device periodically, say every 6 hours. In theory the reboot should go off well before the device locks up.
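A minimal sketch of that workaround, assuming a host computer stays connected to the node over USB and has the Meshtastic Python CLI installed (`--reboot` and `--port` are existing CLI options). The function names, the 6 hour interval, and the `max_cycles` parameter are illustrative choices, not anything from the firmware or the thread.

```python
import subprocess
import time

# Assumption: 6 hours is comfortably below the ~8 hour lock-up seen in testing.
REBOOT_INTERVAL_HOURS = 6


def build_reboot_command(port=None):
    """Return the Meshtastic CLI invocation as an argument list."""
    cmd = ["meshtastic"]
    if port:
        # Optional serial device, e.g. "/dev/ttyACM0" (hypothetical example).
        cmd += ["--port", port]
    cmd.append("--reboot")
    return cmd


def reboot_forever(interval_hours=REBOOT_INTERVAL_HOURS, port=None,
                   run=subprocess.run, sleep=time.sleep, max_cycles=None):
    """Wait for the interval, ask the node to reboot, and repeat.

    `run` and `sleep` are injectable so the loop can be dry-run without
    hardware; `max_cycles=None` means loop until interrupted.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        sleep(interval_hours * 3600)
        # check=False: a failed reboot request should not kill the loop.
        run(build_reboot_command(port), check=False)
        cycles += 1
    return cycles
```

Calling `reboot_forever()` from a small script on the host would run it indefinitely. This only masks the problem, and it cannot help once the node has already locked up, since a frozen Pico will not answer the CLI.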

Michael99645 is still looking at the Meshtastic code to see if he can discover why this is happening. He has been incredibly generous with his time and deep knowledge.
I will keep your suggestion as a last resort should all else fail. Thanks again Dan.

Hey Ian,

Another source to check would be the Meshtastic Discord. There are a lot of issues being solved in there and it might be worth searching the rp2040 channel. Chances are you can also ask a question and get a Meshtastic dev to help out. They might be interested in fixing this bug!

The way the code deals with the GPS/UART is all a bit weird and prone to some issues that then have other fixes.
A PR on GitHub increases the serial port/UART FIFO buffer to 256 bytes to help fix corrupted GPS packets. The bigger buffer is needed because the packets are read whenever the code gets around to it, rather than from an IRQ.

A few things to note:
The GPS code will attempt to tell the GPS to send only the packet types it wants, i.e. to reduce the need to deal with data that is not needed. Fair enough.
Then it will attempt to put the GPS to sleep (so it sends no data) and, when it is ready for the next read, enable it and read. Again, no issue with the idea.
But if the time between reads is too short, it will not put the GPS to sleep, which lets the GPS keep sending data when it is not needed.

There is a hard-coded timer of 5 seconds that calls the GPS “whileActive” function.
In here, if the GPS is meant to be asleep, it will clear anything in the buffer (clearBuffer()) and exit, as it does not care about the data until it is due to “care”.

If it does want the data, it will then feed the FIFO buffer into TinyGPS (byte by byte).
Once the FIFO is empty it will exit. If it did not get the information it wants, it will try again 5 seconds later.

So now let's consider a few things.
The 2 packet types it wants need about 142 bytes (depending on GPS formatting and the data supplied), and the GPS sends that data every second. So 142 × 5 = 710 bytes arrive in the 5 seconds between reads, which means each read will find some good full packets but the 256-byte buffer will also be 100% full.
(If other packet types were left on, then even more data is dropped and the buffer is still left full.)
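The arithmetic can be checked with a few lines of Python. This is only a back-of-envelope model: the 256-byte FIFO, 142 bytes per second, and 5 second poll interval come from the figures above, while the assumption that the FIFO simply discards new bytes once full is mine and may not match what the Pico's serial driver really does.

```python
FIFO_SIZE = 256         # bytes, the enlarged FIFO size from the PR
BYTES_PER_SECOND = 142  # approximate size of the two wanted packet types
POLL_INTERVAL_S = 5     # hard-coded timer between reads


def simulate_poll(fifo_size=FIFO_SIZE,
                  bytes_per_second=BYTES_PER_SECOND,
                  interval_s=POLL_INTERVAL_S):
    """Return (sent, kept, dropped) byte counts for one poll interval.

    Assumption: once the FIFO is full, further incoming bytes are lost.
    """
    sent = bytes_per_second * interval_s
    kept = min(sent, fifo_size)
    dropped = sent - kept
    return sent, kept, dropped


sent, kept, dropped = simulate_poll()
print(f"sent={sent} kept={kept} dropped={dropped}")
# prints: sent=710 kept=256 dropped=454
```

Under these assumptions the FIFO is already full less than 2 seconds into every 5 second interval, so roughly two thirds of the GPS output never reaches the parser.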

While it's hard to home in on the exact cause of the crash, one thing I noticed is that leaving the GPS update period at 120 seconds means the GPS is put to sleep and does a few more things as it wakes up, whereas the reduced update period of 10 seconds means the GPS is left running, so there are lots of buffer clears.
In limited testing, the 120 second beacon period with the GPS going to sleep seems more stable.

At the moment, I'm still looking at the code and adding extra debug outputs to see if we can find the exact cause of the “freeze”. I have seen posts about the Pico UART FIFO buffer that indicate that under some conditions it could lock up, but they read more as passing comments than as anything proven. I was starting to wonder if the FIFO had a memory leak, but that's Pico microcode, not the application level, and again I would need to work out how to test and prove it. What we do know is that we are overflowing the FIFO all the time.

A side note re: the Discord and/or GitHub: there is a second thing that needs to be addressed as well. The way the code is currently written, they don't officially support GPIO 0 as a valid pin. Pin 0 (index) is used to say “not set” in parts of the code.
So using the default pins 0/1 for UART0 (which is free on the Pico side) works, but is not supported. They have also hardcoded the GPS to be on Serial1, which maps to UART0, so to use UART1 we would need to change the code as well. As such, I was keen to get to the bottom of the bug rather than ask a question only to get the answer “not supported”.
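To illustrate why using 0 as a “not set” marker clashes with GPIO 0, here is a generic sketch in Python. It is not Meshtastic's actual code: `PIN_NOT_SET` and both helper functions are hypothetical names that only mirror the behaviour described above.

```python
from typing import Optional

# Hypothetical sentinel: 0 means "no pin configured".
PIN_NOT_SET = 0


def is_configured_zero_sentinel(pin: int) -> bool:
    """Sentinel style: GPIO 0 is indistinguishable from "not set"."""
    return pin != PIN_NOT_SET


def is_configured_optional(pin: Optional[int]) -> bool:
    """Alternative: use None for "not set", so GPIO 0 stays a valid pin."""
    return pin is not None


print(is_configured_zero_sentinel(0))  # prints: False
print(is_configured_optional(0))       # prints: True
```

With the sentinel style, any check of this kind will treat a GPS wired to GPIO 0 as unconfigured, which fits the picture of pins 0/1 working in places but not being officially supported.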

What I have yet to try is moving the UART to different free pins (planned for this weekend) to see how it goes.
