[Ovmsdev] Repeated WDT crashes after upgrade to 3.3 with 4G
Michael Balzer
dexter at expeedo.de
Sun Mar 6 17:58:40 HKT 2022
I've found & fixed the bug: there was an endless loop in
modem::IncomingMuxData() when the NMEA channel had not been created.
→
https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/a6e5b9ab957ab72f56740d6fd4ba484fe1691f80
Regards,
Michael
Am 05.03.22 um 11:21 schrieb Michael Balzer:
> I've now tried starting the NMEA channel without actually sending any
> GPS init commands (also no +CGPS=0).
>
> That inhibits the crashes, although apparently no NMEA sentences are
> coming in. The only communication on MUX channel 1 (NMEA) are some
> initial frames:
>
> balzer at leela:~/ovms/v3> grep -A1 CHAN=1 minicom.log
> V (26168) gsm-mux: ProcessFrame(CHAN=1, ADDR=07, CTRL=73, FCS=15, LEN=6)
> I (26168) gsm-mux: Channel #1 is open
> --
> V (32568) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56, LEN=20)
> V (32568) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=d1, LEN=20)
> --
> V (32938) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
> V (32938) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=32, LEN=21)
> --
> V (36948) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56, LEN=20)
> V (36948) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=d1, LEN=20)
> --
> V (38968) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
> V (38968) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=32, LEN=21)
> --
> V (42638) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
> V (42638) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f0, LEN=21)
> --
> V (44988) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b5, LEN=21)
> V (44988) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f0, LEN=21)
>
>
>
> Unrelated to this issue, but possibly to the framing errors: I've also
> removed the NMEA log filter from modem::StandardLineHandler() and
> found after normal GPS startup, the NMEA sentences always come in
> three times via three MUX channels – additionally to channel 1 on
> channels 3 & 4:
>
> V (130148) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=bf, LEN=78)
> D (130148) gsm-nmea: Incoming RMC:
> $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
> V (130158) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=fa, LEN=78)
> D (130158) cellular: mux-rx-line #3:
> $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
> V (130158) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=f7, LEN=78)
> D (130158) cellular: mux-rx-line #4:
> $GPRMC,095314.00,A,xxxx.139928,N,xxxx.391626,E,0.0,,050322,0.9,W,A*06
> V (130168) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=b1, LEN=82)
> D (130168) gsm-nmea: Incoming GNS:
> $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
> V (130178) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=f4, LEN=82)
> D (130178) cellular: mux-rx-line #3:
> $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
> V (130178) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=f9, LEN=82)
> D (130178) cellular: mux-rx-line #4:
> $GNGNS,095314.00,xxxx.139928,N,xxxx.391626,E,AAN,16,0.7,327.9,47.0,,,V*60
>
> I don't know if that's intended behaviour or can be changed, but I
> suspect this to add unnecessary load to the serial communication.
>
> Regards,
> Michael
>
>
> Am 05.03.22 um 09:02 schrieb Michael Balzer:
>> Another theory: I found Mark's comment in the SIM7600 driver NMEA
>> startup:
>>
>> // We need to do this a little differently from the standard, as
>> SIM7600
>> // may start GPS on power up, and doesn't like us using CGPS=1,1 when
>> // it is already on. So workaround is to first CGPS=0.
>>
>> IOW, the SIM7600 may have GPS enabled by default, possibly also
>> sending NMEA sentences, if not explicitly switched off.
>>
>> BTW, that also explains the difficulties in booting a 3.3 module from
>> USB… :-/
>>
>> a) We don't switch off GPS if GPS is disabled.
>>
>> b) I just tried switching off GPS manually after a GPS-enabled
>> startup, and the modem still continued to send NMEA sentences:
>>
>> OVMS# cellular cmd AT+CGPS=0
>> MODEM command has been sent.
>> …
>> D (475190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>> V (475190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=c1,
>> LEN=31)
>> V (475190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=cc,
>> LEN=31)
>> V (475190) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=60,
>> LEN=32)
>> D (475190) gsm-nmea: Incoming GNS: $GNGNS,,,,,,NNN,,,,,,*1D
>> V (475190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=25,
>> LEN=32)
>> V (475190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=28,
>> LEN=32)
>> V (480180) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=84,
>> LEN=31)
>> D (480190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>> …
>> OVMS# cellular cmd AT+CGPSNMEA=0
>> MODEM command has been sent.
>> D (505190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>> V (505190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=c1,
>> LEN=31)
>> V (505190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=cc,
>> LEN=31)
>> V (505190) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=60,
>> LEN=32)
>> D (505190) gsm-nmea: Incoming GNS: $GNGNS,,,,,,NNN,,,,,,*1D
>> V (505190) gsm-mux: ProcessFrame(CHAN=3, ADDR=0d, CTRL=ff, FCS=25,
>> LEN=32)
>> V (505190) gsm-mux: ProcessFrame(CHAN=4, ADDR=11, CTRL=ff, FCS=28,
>> LEN=32)
>> V (510180) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=84,
>> LEN=31)
>> D (510190) gsm-nmea: Incoming RMC: $GPRMC,,V,,,,,,,,,,N*53
>> …
>>
>>
>> So without us starting the NMEA subsystem, is it possible these
>> transmissions are still there, filling up some MUX buffer?
>>
>> Regards,
>> Michael
>>
>>
>> Am 05.03.22 um 08:33 schrieb Michael Balzer:
>>> I've tried to reproduce this with my old 3.2 & hand soldered 3.3 dev
>>> module, with different vehicle modules and modem GPS disabled.
>>>
>>> Results:
>>> - the vehicle module is irrelevant
>>> - 3.2 (SIM5360E) runs without issues (had this running over night)
>>> - 3.3 (SIM7600G) consistently crashes by the TWDT within a couple of
>>> minutes after modem init
>>>
>>> This is caused by the modem task hogging core 0 as soon as the PPP
>>> connection is up:
>>>
>>> I (87161) cellular: Identified cellular modem: SIM7600/Experimental
>>> support for SIMCOM SIM7600
>>> D (87161) cellular: Remove old 'auto' modem driver
>>> I (87161) cellular: Set modem driver to 'SIM7600'
>>> I (87161) cellular: State: Enter PoweredOn state
>>> OVMS# mo ta
>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit
>>> SPIRAM C# PRI CPU% BPR/MH
>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644
>>> 44544 0 22 0% 22/ 0
>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 188 0
>>> 524288 1 5 0% 5/ 0
>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0
>>> 0 0 20 0% 20/ 0
>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80136 0
>>> 79300 1 8 1% 8/ 0
>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0
>>> 50556 0 23 0% 23/ 0
>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0
>>> 0 0 24 0% 24/ 0
>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0
>>> 0 1 24 0% 24/ 0
>>> *3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0
>>> 0 0 0 98% 0/ 0*
>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0
>>> 1 0 97% 0/ 0
>>> 3FFCE1D0 12 Blk Tmr Svc 380 1404 3072 124 0
>>> 0 0 20 0% 20/ 0
>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0
>>> 1040 1 18 0% 18/ 0
>>> *3FFD34E0 20 Blk OVMS Cellular 652 2396 4096 5848
>>> 0 0 0 20 0% 20/ 0*
>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0
>>> 5632 0 22 1% 22/ 1
>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0
>>> 1 10 0% 10/ 0
>>> 3FFE7BD4 23 Rdy OVMS Console 848 2688 10240 356 12100
>>> 15960 1 5 1% 5/ 1
>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0
>>> 2152 1 5 0% 5/ 0
>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0
>>> 4 0 1 0% 1/ 0
>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0
>>> 0 1 2 0% 2/ 0
>>>
>>> …
>>>
>>> I (118171) cellular: Network Registration status: RegisteredRoaming
>>> D (118181) cellular: mux-rx-line #3: +CCLK: "22/03/05,08:22:23+04"
>>> D (118181) cellular: mux-rx-line #3: +CSQ: 24,0
>>> D (118181) cellular: mux-rx-line #3: +COPS: 0,0,"vodafone.de Hologram",0
>>> I (118181) cellular: Network Provider is: vodafone.de Hologram
>>> D (119151) cellular: State transition NetWait => NetStart
>>> I (119151) cellular: State: Enter NetStart state
>>> D (120151) cellular: Netstart
>>> AT+CGDCONT=1,"IP","hologram";+CGDATA="PPP",1
>>> V (120191) gsm-mux: ProcessFrame(CHAN=0, ADDR=01, CTRL=ef, FCS=79,
>>> LEN=11)
>>> V (120201) gsm-mux: ProcessFrame(CHAN=2, ADDR=09, CTRL=ff, FCS=fb,
>>> LEN=24)
>>> D (120201) cellular: mux-rx-line #2: CONNECT 115200
>>> *I (120201) cellular: PPP Connection is ready to start*
>>> V (120751) gsm-mux: ProcessFrame(CHAN=1, ADDR=05, CTRL=ff, FCS=56,
>>> LEN=20)
>>> OVMS# mo ta
>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit
>>> SPIRAM C# PRI CPU% BPR/MH
>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644
>>> 44544 0 22 0% 22/ 0
>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 284 0
>>> 524308 1 5 1% 5/ 0
>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0
>>> 0 0 20 0% 20/ 0
>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80280 0
>>> 79680 1 8 1% 8/ 0
>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0
>>> 50556 0 23 0% 23/ 0
>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0
>>> 0 0 24 0% 24/ 0
>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0
>>> 0 1 24 0% 24/ 0
>>> *3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0
>>> 0 0 0 79% 0/ 0*
>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0
>>> 1 0 95% 0/ 0
>>> 3FFCE1D0 12 Blk Tmr Svc 380 1452 3072 124 0
>>> 0 0 20 0% 20/ 0
>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0
>>> 1044 1 18 0% 18/ 0
>>> *3FFD34E0 20 Rdy OVMS Cellular 876 2716 4096 6116 0
>>> 10440 0 20 19% 20/ 0*
>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0
>>> 5632 0 22 1% 22/ 1
>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0
>>> 1 10 0% 10/ 0
>>> 3FFE7BD4 23 Rdy OVMS Console 848 2688 10240 356 12100
>>> 15960 1 5 3% 5/ 1
>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0
>>> 2152 1 5 0% 5/ 0
>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0
>>> 4 0 1 0% 1/ 0
>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0
>>> 0 1 2 0% 2/ 0
>>> I (122161) housekeeping: System considered stable (RAM:
>>> 8b=64256-64760 32b=184 SPI=3362168-3374964)
>>> D (122231) ovms-duktape: Duktape: Compacting DukTape memory done in
>>> 60 ms
>>> I (123151) ovms-server-v2: Send MP-0
>>> h1,0,*-OVM-DebugCrash,0,2592000,3.3.001-285-g601f2a70/factory/edge
>>> (build idf v3.3.4-848-g1ff5e24b1b Feb 22 2022
>>> 20:57:41),1,EarlyCrash,12,12,1,1,abort(),0,,0x4008ddca 0x4008e06
>>> 5 0x400e88b8 0x40084176 ,6,Task watchdog,,,0,IDLE0,OVMS WIFI BLE BT
>>> cores=2 rev=ESP32/3; MODEM SIM7600
>>> I (123361) ovms-server-v2: Incoming Msg: MP-0 h1
>>> OVMS# mo ta
>>> Number of Tasks = 18 Stack: Now Max Total Heap 32-bit
>>> SPIRAM C# PRI CPU% BPR/MH
>>> 3FFAFB88 1 Blk esp_timer 436 708 4096 41648 644
>>> 44544 0 22 0% 22/ 0
>>> 3FFC2BD8 2 Blk OVMS DukTape 500 9252 12288 188 0
>>> 524288 1 5 0% 5/ 0
>>> 3FFC46B4 3 Blk eventTask 480 2032 4608 108 0
>>> 0 0 20 0% 20/ 0
>>> 3FFC6B38 4 Blk OVMS Events 484 3204 8192 80136 0
>>> 79300 1 8 1% 8/ 0
>>> 3FFC9C30 5 Blk OVMS CanRx 476 1996 4096 3240 0
>>> 50556 0 23 0% 23/ 0
>>> 3FFCAA2C 6 Blk ipc0 424 504 1024 7804 0
>>> 0 0 24 0% 24/ 0
>>> 3FFCB030 7 Blk ipc1 428 524 1024 120 0
>>> 0 1 24 0% 24/ 0
>>> *3FFCCEA0 10 Rdy IDLE0 412 508 1024 0 0
>>> 0 0 0 0% 0/ 0*
>>> 3FFCD438 11 Rdy IDLE1 404 580 1024 0 0 0
>>> 1 0 95% 0/ 0
>>> 3FFCE1D0 12 Blk Tmr Svc 380 1452 3072 124 0
>>> 0 0 20 0% 20/ 0
>>> 3FFCC308 17 Blk tiT 532 2036 3072 924 0
>>> 1044 1 18 0% 18/ 0
>>> *3FFD34E0 20 Rdy OVMS Cellular 876 2716 4096 6116 0
>>> 10440 0 20 98% 20/ 0*
>>> 3FFD656C 21 Blk wifi 472 2840 3584 38080 0
>>> 5632 0 22 1% 22/ 1
>>> 3FFE3244 22 Blk OVMS Vehicle 496 496 8192 0 0 0
>>> 1 10 0% 10/ 0
>>> 3FFE7BD4 23 Rdy OVMS Console 880 2688 10240 356 12100
>>> 15960 1 5 4% 5/ 1
>>> 3FFEA5A0 24 Blk OVMS NetMan 908 2684 10240 832 0
>>> 2152 1 5 0% 5/ 0
>>> 3FFEBBDC 25 Blk mdns 504 2200 4096 104 0
>>> 4 0 1 0% 1/ 0
>>> 3FFEFDC8 26 Blk OVMS FileLog 580 1524 3072 36 0
>>> 0 1 2 0% 2/ 0
>>>
>>> …
>>>
>>> OVMS# E (181151) task_wdt: Task watchdog got triggered. The
>>> following tasks did not reset the watchdog in time:
>>> E (181151) task_wdt: - IDLE0 (CPU 0)
>>> E (181151) task_wdt: Tasks currently running:
>>> E (181151) task_wdt: CPU 0: Tmr Svc
>>> E (181151) task_wdt: CPU 1: IDLE1
>>> E (181151) task_wdt: Aborting.
>>>
>>>
>>> No other log messages.
>>>
>>> Am 04.03.22 um 17:08 schrieb Jeff Anton:
>>>> I woke from sleep thinking "interrupt storm."
>>>>
>>>> Craig's observation makes me wonder... Does the gps system produce
>>>> interrupts which are not handled properly when gps is turned off.
>>>> Unhandled interrupts might get constantly repeated causing the IDLE
>>>> tasks to starve?
>>>
>>> I think that may be the case. Maybe the NMEA MUX channel gets
>>> continuous input from the 7600 when GPS is disabled?
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>>
>>> Am 04.03.22 um 23:15 schrieb Craig Leres:
>>>> One more data point; I turned GPS off on my dev module (v3.1 board
>>>> with a sim7600A running 3.3.001-285-g601f2a70-dirty) and it started
>>>> crashing.
>>>>
>>>> Has anyone outside the US tried this?
>>>>
>>>> Craig
>>>>
>>>> ice 12 % ./backtrace.sh 0x400891af:0x3ffb0690 0x40089449:0x3ffb06b0
>>>> 0x400e8950:0x3ffb06d0 0x40083dde:0x3ffb06f0
>>>> + xtensa-esp32-elf-addr2line -e build/ovms3.elf
>>>> 0x400891af:0x3ffb0690 0x40089449:0x3ffb06b0 0x400e8950:0x3ffb06d0
>>>> 0x40083dde:0x3ffb06f0
>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c:736
>>>>
>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c:736
>>>>
>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/esp32/task_wdt.c:274
>>>>
>>>> /home/ice/u0/leres/esp/openvehicles-xtensa-esp32-elf/components/freertos/xtensa_vectors.S:1154
>>>>
>>>>
>>>> # esp/openvehicles-xtensa-esp32-elf/components/esp32/panic.c
>>>> 733
>>>> 734 void _esp_error_check_failed_without_abort(esp_err_t rc,
>>>> const char *file, int line, const char *function, const char
>>>> *expression)
>>>> 735 {
>>>> 736
>>>> esp_error_check_failed_print("ESP_ERROR_CHECK_WITHOUT_ABORT", rc,
>>>> file, line, function, expression);
>>>> 737 }
>>>>
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.openvehicles.com
>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
> --
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20220306/eefc40de/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 203 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20220306/eefc40de/attachment-0001.sig>
More information about the OvmsDev
mailing list