[Ovmsdev] Repeated WDT crashes after upgrade to 3.3 with 4G
Mark Webb-Johnson
mark at webb-johnson.net
Sun Mar 6 22:28:46 HKT 2022
Nice catch. I had reviewed that code section this afternoon, but missed it.
Shall we make a 3.3.002 and push it to EAP?
Regards, Mark
> On 6 Mar 2022, at 5:58 PM, Michael Balzer <dexter at expeedo.de> wrote:
>
> I've found & fixed the bug: there was an endless loop in modem::IncomingMuxData() when the NMEA channel had not been created.
>
> → https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/a6e5b9ab957ab72f56740d6fd4ba484fe1691f80
>
> Regards,
> Michael
>
>
> Am 05.03.22 um 11:21 schrieb Michael Balzer:
>> I've now tried starting the NMEA channel without actually sending any GPS init commands (also no +CGPS=0).
>>
>> That inhibits the crashes, although apparently no NMEA sentences are coming in. The only communication on MUX channel 1 (NMEA) are some initial frames:
>>
>> balzer at leela:~/ovms/v3> grep -A1 CHAN=1 minicom.log
>> V (26168) gsm-mux: ProcessFrame(CHAN=1, ADDR=07, CTRL=73, FCS=15, LEN=6)
>> I (26168) gsm-mux: Channel #1 is open
>> --
>> V (32568) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56, LEN=20)
>> V (32568) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=d1, LEN=20)
>> --
>> V (32938) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
>> V (32938) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=32, LEN=21)
>> --
>> V (36948) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56, LEN=20)
>> V (36948) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=d1, LEN=20)
>> --
>> V (38968) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
>> V (38968) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=32, LEN=21)
>> --
>> V (42638) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
>> V (42638) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f0, LEN=21)
>> --
>> V (44988) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
>> V (44988) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f0, LEN=21)
>>
>>
>>
>> Unrelated to this issue, but possibly to the framing errors: I've also removed the NMEA log filter from modem::StandardLineHandler() and found after normal GPS startup, the NMEA sentences always come in three times via three MUX channels – additionally to channel 1 on channels 3 & 4:
>>
>> V (130148) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=bf, LEN=78)
>> D (130148) gsm-nmea: Incoming RMC: $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
>> V (130158) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=fa, LEN=78)
>> D (130158) cellular: mux-rx-line #3: $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
>> V (130158) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=f7, LEN=78)
>> D (130158) cellular: mux-rx-line #4: $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
>> V (130168) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b1, LEN=82)
>> D (130168) gsm-nmea: Incoming GNS: $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
>> V (130178) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f4, LEN=82)
>> D (130178) cellular: mux-rx-line #3: $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
>> V (130178) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=f9, LEN=82)
>> D (130178) cellular: mux-rx-line #4: $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
>>
>> I don't know if that's intended behaviour or can be changed, but I suspect this to add unnecessary load to the serial communication.
>>
>> Regards,
>> Michael
>>
>>
>> Am 05.03.22 um 09:02 schrieb Michael Balzer:
>>> Another theory: I found Mark's comment in the SIM7600 driver NMEA startup:
>>>
>>> // We need to do this a little differently from the standard, as SIM7600
>>> // may start GPS on power up, and doesn't like us using CGPS=1,1 when
>>> // it is already on. So workaround is to first CGPS=0.
>>>
>>> IOW, the SIM7600 may have GPS enabled by default, possibly also sending NMEA sentences, if not explicitly switched off.
>>>
>>> BTW, that also explains the difficulties in booting a 3.3 module from USB… :-/
>>>
>>> a) We don't switch off GPS if GPS is disabled.
>>>
>>> b) I just tried switching off GPS manually after a GPS-enabled startup, and the modem still continued to send NMEA sentences:
>>>
>>> OVMS# cellular cmd AT+CGPS=0
>>> MODEM command has been sent.
>>> …
>>> D (475190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>>> V (475190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=c1, LEN=31)
>>> V (475190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=cc, LEN=31)
>>> V (475190) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=60, LEN=32)
>>> D (475190) gsm-nmea: Incoming GNS: $GNGNS,,,,,,NNN,,,,,,*1D
>>> V (475190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=25, LEN=32)
>>> V (475190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=28, LEN=32)
>>> V (480180) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=84, LEN=31)
>>> D (480190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>>> …
>>> OVMS# cellular cmd AT+CGPSNMEA=0
>>> MODEM command has been sent.
>>> D (505190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>>> V (505190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=c1, LEN=31)
>>> V (505190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=cc, LEN=31)
>>> V (505190) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=60, LEN=32)
>>> D (505190) gsm-nmea: Incoming GNS: $GNGNS,,,,,,NNN,,,,,,*1D
>>> V (505190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=25, LEN=32)
>>> V (505190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=28, LEN=32)
>>> V (510180) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=84, LEN=31)
>>> D (510190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>>> …
>>>
>>>
>>> So without us starting the NMEA subsystem, is it possible these transmissions are still there, filling up some MUX buffer?
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 05.03.22 um 08:33 schrieb Michael Balzer:
>>>> I've tried to reproduce this with my old 3.2 & hand soldered 3.3 dev module, with different vehicle modules and modem GPS disabled.
>>>>
>>>> Results:
>>>> - the vehicle module is irrelevant
>>>> - 3.2 (SIM5360E) runs without issues (had this running over night)
>>>> - 3.3 (SIM7600G) consistently crashes by the TWDT within a couple of minutes after modem init
>>>>
>>>> This is caused by the modem task hogging core 0 as soon as the PPP connection is up:
>>>>
>>>> I (87161) cellular: Identified cellular modem: SIM7600/Experimental support for SIMCOM SIM7600
>>>> D (87161) cellular: Remove old 'auto' modem driver
>>>> I (87161) cellular: Set modem driver to 'SIM7600'
>>>> I (87161) cellular: State: Enter PoweredOn state
>>>> OVMS# mo ta
>>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit SPIRAM C# PRI CPU% BPR/MH
>>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644 44544 0 22 0% 22/ 0
>>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 188 0 524288 1 5 0% 5/ 0
>>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0 0 0 20 0% 20/ 0
>>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80136 0 79300 1 8 1% 8/ 0
>>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0 50556 0 23 0% 23/ 0
>>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0 0 0 24 0% 24/ 0
>>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0 0 1 24 0% 24/ 0
>>>> 3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0 0 0 0 98% 0/ 0
>>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0 1 0 97% 0/ 0
>>>> 3FFCE1D0 12 Blk Tmr Svc 380 1404 3072 124 0 0 0 20 0% 20/ 0
>>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0 1040 1 18 0% 18/ 0
>>>> 3FFD34E0 20 Blk OVMS Cellular 652 2396 4096 5848 0 0 0 20 0% 20/ 0
>>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0 5632 0 22 1% 22/ 1
>>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0 1 10 0% 10/ 0
>>>> 3FFE7BD4 23 Rdy OVMS Console 848 2688 10240 356 12100 15960 1 5 1% 5/ 1
>>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0 2152 1 5 0% 5/ 0
>>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0 4 0 1 0% 1/ 0
>>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0 0 1 2 0% 2/ 0
>>>>
>>>> …
>>>>
>>>> I (118171) cellular: Network Registration status: RegisteredRoaming
>>>> D (118181) cellular: mux-rx-line #3: +CCLK: "22/03/05,08:22:23+04"
>>>> D (118181) cellular: mux-rx-line #3: +CSQ: 24,0
>>>> D (118181) cellular: mux-rx-line #3: +COPS: 0,0,"vodafone.de Hologram",0
>>>> I (118181) cellular: Network Provider is: vodafone.de Hologram
>>>> D (119151) cellular: State transition NetWait => NetStart
>>>> I (119151) cellular: State: Enter NetStart state
>>>> D (120151) cellular: Netstart AT+CGDCONT=1,"IP","hologram";+CGDATA="PPP",1
>>>> V (120191) gsm-mux: ProcessFrame(CHAN=0, ADDR=01, CTRL=ef, FCS=79, LEN=11)
>>>> V (120201) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=fb, LEN=24)
>>>> D (120201) cellular: mux-rx-line #2: CONNECT 115200
>>>> I (120201) cellular: PPP Connection is ready to start
>>>> V (120751) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56, LEN=20)
>>>> OVMS# mo ta
>>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit SPIRAM C# PRI CPU% BPR/MH
>>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644 44544 0 22 0% 22/ 0
>>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 284 0 524308 1 5 1% 5/ 0
>>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0 0 0 20 0% 20/ 0
>>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80280 0 79680 1 8 1% 8/ 0
>>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0 50556 0 23 0% 23/ 0
>>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0 0 0 24 0% 24/ 0
>>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0 0 1 24 0% 24/ 0
>>>> 3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0 0 0 0 79% 0/ 0
>>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0 1 0 95% 0/ 0
>>>> 3FFCE1D0 12 Blk Tmr Svc 380 1452 3072 124 0 0 0 20 0% 20/ 0
>>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0 1044 1 18 0% 18/ 0
>>>> 3FFD34E0 20 Rdy OVMS Cellular 876 2716 4096 6116 0 10440 0 20 19% 20/ 0
>>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0 5632 0 22 1% 22/ 1
>>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0 1 10 0% 10/ 0
>>>> 3FFE7BD4 23 Rdy OVMS Console 848 2688 10240 356 12100 15960 1 5 3% 5/ 1
>>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0 2152 1 5 0% 5/ 0
>>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0 4 0 1 0% 1/ 0
>>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0 0 1 2 0% 2/ 0
>>>> I (122161) housekeeping: System considered stable (RAM: 8b=64256-64760 32b=184 SPI=3362168-3374964)
>>>> D (122231) ovms-duktape: Duktape: Compacting DukTape memory done in 60 ms
>>>> I (123151) ovms-server-v2: Send MP-0 h1,0,*-OVM-DebugCrash,0,2592000,3.3.001-285-g601f2a70/factory/edge (build idf v3.3.4-848-g1ff5e24b1b Feb 22 2022 20:57:41),1,EarlyCrash,12,12,1,1,abort(),0,,0x4008ddca 0x4008e06
>>>> 5 0x400e88b8 0x40084176 ,6,Task watchdog,,,0,IDLE0,OVMS WIFI BLE BT cores=2 rev=ESP32/3; MODEM SIM7600
>>>> I (123361) ovms-server-v2: Incoming Msg: MP-0 h1
>>>> OVMS# mo ta
>>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit SPIRAM C# PRI CPU% BPR/MH
>>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644 44544 0 22 0% 22/ 0
>>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 188 0 524288 1 5 0% 5/ 0
>>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0 0 0 20 0% 20/ 0
>>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80136 0 79300 1 8 1% 8/ 0
>>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0 50556 0 23 0% 23/ 0
>>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0 0 0 24 0% 24/ 0
>>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0 0 1 24 0% 24/ 0
>>>> 3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0 0 0 0 0% 0/ 0
>>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0 1 0 95% 0/ 0
>>>> 3FFCE1D0 12 Blk Tmr Svc 380 1452 3072 124 0 0 0 20 0% 20/ 0
>>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0 1044 1 18 0% 18/ 0
>>>> 3FFD34E0 20 Rdy OVMS Cellular 876 2716 4096 6116 0 10440 0 20 98% 20/ 0
>>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0 5632 0 22 1% 22/ 1
>>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0 1 10 0% 10/ 0
>>>> 3FFE7BD4 23 Rdy OVMS Console 880 2688 10240 356 12100 15960 1 5 4% 5/ 1
>>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0 2152 1 5 0% 5/ 0
>>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0 4 0 1 0% 1/ 0
>>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0 0 1 2 0% 2/ 0
>>>>
>>>> …
>>>>
>>>> OVMS# E (181151) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
>>>> E (181151) task_wdt: - IDLE0 (CPU 0)
>>>> E (181151) task_wdt: Tasks currently running:
>>>> E (181151) task_wdt: CPU 0: Tmr Svc
>>>> E (181151) task_wdt: CPU 1: IDLE1
>>>> E (181151) task_wdt: Aborting.
>>>>
>>>>
>>>> No other log messages.
>>>>
>>>> Am 04.03.22 um 17:08 schrieb Jeff Anton:
>>>>> I woke from sleep thinking "interrupt storm."
>>>>>
>>>>> Craig's observation makes me wonder... Does the gps system produce interrupts which are not handled properly when gps is turned off. Unhandled interrupts might get constantly repeated causing the IDLE tasks to starve?
>>>>
>>>> I think that may be the case. Maybe the NMEA MUX channel gets continuous input from the 7600 when GPS is disabled?
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>>>>
>>>> Am 04.03.22 um 23:15 schrieb Craig Leres:
>>>>> One more data point; I turned GPS off on my dev module (v3.1 board with a sim7600A running 3.3.001-285-g601f2a70-dirty) and it started crashing.
>>>>>
>>>>> Has anyone outside the US tried this?
>>>>>
>>>>> Craig
>>>>>
>>>>> ice 12 % ./backtrace.sh 0x400891af:0x3ffb0690 0x40089449:0x3ffb06b0 0x400e8950:0x3ffb06d0 0x40083dde:0x3ffb06f0
>>>>> + xtensa-esp32-elf-addr2line -e build/ovms3.elf 0x400891af:0x3ffb0690 0x40089449:0x3ffb06b0 0x400e8950:0x3ffb06d0 0x40083dde:0x3ffb06f0
>>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c:736
>>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c:736
>>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/task_wdt.c:274
>>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/freertos/xtensa_vectors.S:1154
>>>>>
>>>>> # esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c
>>>>> 733
>>>>> 734 void _esp_error_check_failed_without_abort(esp_err_t rc, const char *file, int line, const char *function, const char *expression)
>>>>> 735 {
>>>>> 736 esp_error_check_failed_print("ESP_ERROR_CHECK_WITHOUT_ABORT", rc, file, line, function, expression);
>>>>> 737 }
>>>>>
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.openvehicles.com
>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>
>>>> --
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>
>>>>
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.openvehicles.com
>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
> --
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20220306/f1359a7e/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/octet-stream
Size: 203 bytes
Desc: not available
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20220306/f1359a7e/attachment-0001.obj>
More information about the OvmsDev
mailing list