[Ovmsdev] Strange esp32can / Wifi mode interaction

Michael Balzer dexter at expeedo.de
Fri Jan 27 18:49:07 HKT 2023


I've begun reducing my car module's power usage by switching off GPS and 
Wifi AP mode when not needed, and found a rather strange behaviour:

When switching off Wifi AP mode (i.e. running in client mode only), a 
short while after the car's CAN bus has been shut off, the esp32can bus 
runs into a stuck bus-off error state.

The log looks like this:

(…vehicle switches off…)
E (113345) can: can1: intr=1791 rxpkt=903 txpkt=882 errflags=0x8000d9 
rxerr=0 txerr=8 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=0 wdgreset=0 
errreset=0
(…txerr rising up to…)
E (113345) can: can1: intr=1808 rxpkt=903 txpkt=882 errflags=0x204800 
rxerr=0 txerr=128 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=0 
wdgreset=0 errreset=0
I (114345) v-vweup: PollSetState: AWAKE -> OFF

Normal up to here – the bus is now off. I normally wouldn't see another 
CAN log entry now until the next on/off transition.

But /with Wifi in client mode/, after a couple of minutes new errors 
appear, leading to a lot of IRQs and ending in a stuck bus-off state:

I (409995) wifi:
new:<6,2>, old:<6,0>, ap:<255,255>, sta:<6,2>, prof:1

E (411345) can: can1: intr=2106 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=128 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412485) can: can1: intr=6655 rxpkt=903 txpkt=882 errflags=0x804802 
rxerr=0 txerr=136 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412495) can: can1: intr=6659 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=136 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412495) can: can1: intr=6696 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=144 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412505) can: can1: intr=6698 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=144 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412505) can: can1: intr=6707 rxpkt=903 txpkt=882 errflags=0x80480b 
rxerr=0 txerr=152 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412505) can: can1: intr=6708 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=152 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412815) can: can1: intr=7933 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=160 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (412815) can: can1: intr=7936 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=160 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (413015) can: can1: intr=8727 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=168 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (413015) can: can1: intr=8731 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=168 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150 
wdgreset=0 errreset=0
E (417565) can: can1: intr=26805 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=176 rxinval=0 rxovr=0 txovr=0 txdelay=2 txfail=150 
wdgreset=0 errreset=0
E (417565) can: can1: intr=26810 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=176 rxinval=0 rxovr=0 txovr=0 txdelay=2 txfail=150 
wdgreset=0 errreset=0
I (420025) wifi:
new:<6,0>, old:<6,2>, ap:<255,255>, sta:<6,0>, prof:1

I (421055) wifi:
new:<6,2>, old:<6,0>, ap:<255,255>, sta:<6,2>, prof:1

E (421145) can: can1: intr=40992 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=184 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421145) can: can1: intr=40995 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=184 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421405) can: can1: intr=42033 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=192 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421405) can: can1: intr=42037 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=192 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421425) can: can1: intr=42116 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=200 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421425) can: can1: intr=42119 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=200 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421485) can: can1: intr=42328 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=208 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (421485) can: can1: intr=42331 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=208 rxinval=0 rxovr=0 txovr=0 txdelay=3 txfail=150 
wdgreset=0 errreset=0
E (427385) can: can1: intr=65746 rxpkt=903 txpkt=882 errflags=0x80480b 
rxerr=0 txerr=216 rxinval=0 rxovr=0 txovr=0 txdelay=5 txfail=150 
wdgreset=0 errreset=0
E (427385) can: can1: intr=65750 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=216 rxinval=0 rxovr=0 txovr=0 txdelay=5 txfail=150 
wdgreset=0 errreset=0
E (428585) can: can1: intr=70500 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=224 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (428585) can: can1: intr=70505 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=224 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (428585) can: can1: intr=70515 rxpkt=903 txpkt=882 errflags=0x80480a 
rxerr=0 txerr=232 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (428585) can: can1: intr=70517 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=232 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (428845) can: can1: intr=71523 rxpkt=903 txpkt=882 errflags=0x80480b 
rxerr=0 txerr=240 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (428845) can: can1: intr=71526 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=240 rxinval=0 rxovr=0 txovr=0 txdelay=6 txfail=150 
wdgreset=0 errreset=0
E (433465) can: can1: intr=89884 rxpkt=903 txpkt=882 errflags=0x804803 
rxerr=0 txerr=248 rxinval=0 rxovr=0 txovr=0 txdelay=7 txfail=150 
wdgreset=0 errreset=0
E (433465) can: can1: intr=89887 rxpkt=903 txpkt=882 errflags=0x8048d9 
rxerr=0 txerr=248 rxinval=0 rxovr=0 txovr=0 txdelay=7 txfail=150 
wdgreset=0 errreset=0
E (435495) can: can1: intr=97936 rxpkt=903 txpkt=882 errflags=0x4cc0a 
rxerr=256 txerr=384 rxinval=0 rxovr=0 txovr=0 txdelay=8 txfail=150 
wdgreset=0 errreset=0
E (435495) can: can1: intr=97937 rxpkt=903 txpkt=882 errflags=0x48c00 
rxerr=256 txerr=351 rxinval=0 rxovr=0 txovr=0 txdelay=8 txfail=150 
wdgreset=0 errreset=0
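
To put numbers on the IRQ load and the txerr progression in the log above, a short Python sketch along these lines may help. It assumes the value in parentheses is milliseconds since boot (the ESP-IDF default log timestamp) and that wrapped lines belong to the preceding record. The helper names `parse_can_log`, `irq_rate` and `txerr_steps` are illustrative and not part of the OVMS firmware.

```python
import re

# First five error records from the log above, wrapped as in the mail.
SAMPLE = """
E (411345) can: can1: intr=2106 rxpkt=903 txpkt=882 errflags=0x8048d9
rxerr=0 txerr=128 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150
wdgreset=0 errreset=0
E (412485) can: can1: intr=6655 rxpkt=903 txpkt=882 errflags=0x804802
rxerr=0 txerr=136 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150
wdgreset=0 errreset=0
E (412495) can: can1: intr=6659 rxpkt=903 txpkt=882 errflags=0x8048d9
rxerr=0 txerr=136 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150
wdgreset=0 errreset=0
E (412495) can: can1: intr=6696 rxpkt=903 txpkt=882 errflags=0x80480a
rxerr=0 txerr=144 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150
wdgreset=0 errreset=0
E (412505) can: can1: intr=6698 rxpkt=903 txpkt=882 errflags=0x8048d9
rxerr=0 txerr=144 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=150
wdgreset=0 errreset=0
"""

# One record: "E (<ms>) can: can1:" followed by space-separated key=value pairs.
RECORD = re.compile(r"E \((\d+)\) can: can1:((?: [a-z]+=[0-9a-fx]+)+)")


def parse_can_log(text):
    """Return one dict per can1 error record with timestamp, intr and txerr."""
    # Undo the mail wrapping and drop any '*' emphasis markers.
    flat = " ".join(text.replace("*", "").split())
    records = []
    for match in RECORD.finditer(flat):
        fields = dict(pair.split("=") for pair in match.group(2).split())
        records.append({
            "ms": int(match.group(1)),       # assumed: milliseconds since boot
            "intr": int(fields["intr"]),
            "txerr": int(fields["txerr"]),
        })
    return records


def irq_rate(records):
    """Average interrupts per second between the first and last record."""
    first, last = records[0], records[-1]
    seconds = (last["ms"] - first["ms"]) / 1000
    return (last["intr"] - first["intr"]) / seconds


def txerr_steps(records):
    """Non-zero txerr increments between consecutive records."""
    values = [r["txerr"] for r in records]
    return [b - a for a, b in zip(values, values[1:]) if b != a]


records = parse_can_log(SAMPLE)
print(f"{irq_rate(records):.0f} interrupts/s, txerr steps: {txerr_steps(records)}")
```

On this excerpt the rate comes out at roughly 4,000 interrupts per second, and txerr rises in steps of 8, which matches the usual CAN transmit error counter increment.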

OVMS# can can1 status
CAN:       can1
Mode:      Active
Speed:     500000
DBC:       none

Interrupts:               97938
Rx pkt:                     903
Rx ovrflw:                    0
Tx pkt:                     882
Tx delays:                   30
Tx ovrflw:                   42
Tx fails:                   150

Err flags: 0x00040c00
Rx err:                       0
Tx err:                       0
Rx invalid:                   0
Wdg Resets:                   0
Wdg Timer:                    5 sec(s)
Err Resets:                   0

The error flags decode as: IRQ 0x04 = IR.2 Error Interrupt (warning state 
change), SR 0x0c = TX done + free.
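
The decoding above can be reproduced with a small Python sketch. The byte layout (IRQ bits in bits 16-23, status register in bits 8-15) is inferred from the decode given in this message. The bit names follow the SJA1000 PeliCAN register layout, which is assumed here to apply to the ESP32 CAN controller. The low byte is returned as is without interpretation, and `decode_errflags` is an illustrative name, not an OVMS function.

```python
# Bit names per SJA1000 PeliCAN layout (assumed to match the ESP32 controller).
IRQ_BITS = {
    0: "RX", 1: "TX", 2: "error warning", 3: "data overrun",
    4: "wake-up", 5: "error passive", 6: "arbitration lost", 7: "bus error",
}
SR_BITS = {
    0: "RX buffer has data", 1: "data overrun", 2: "TX buffer free",
    3: "TX complete", 4: "receiving", 5: "transmitting",
    6: "error status", 7: "bus-off",
}


def decode_errflags(flags):
    """Split an errflags value into its IRQ and SR bytes and name the set bits."""
    irq = (flags >> 16) & 0xFF   # interrupt register bits
    sr = (flags >> 8) & 0xFF     # status register bits
    low = flags & 0xFF           # low byte, left uninterpreted here
    return {
        "irq": irq,
        "sr": sr,
        "low": low,
        "irq_bits": [name for bit, name in IRQ_BITS.items() if irq & (1 << bit)],
        "sr_bits": [name for bit, name in SR_BITS.items() if sr & (1 << bit)],
    }


for value in (0x00040C00, 0x0004CC0A):
    decoded = decode_errflags(value)
    print(f"{value:#010x}: IRQ {decoded['irq_bits']}, SR {decoded['sr_bits']}")
```

Under these assumptions, 0x00040c00 decodes to the error warning interrupt with TX complete and TX buffer free, while the earlier 0x4cc0a additionally shows the error status and bus-off bits set in SR.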

From here, RX and TX are both blocked, i.e. no vehicle state change is 
detected, no data sent or received.

The only way to resolve this state before my latest commit was to reboot 
the module. I've debugged the "can stop" command for the esp32can driver 
and added an auto reset for this condition.

I think this state is the known bus-off recovery hardware error (see 
"TWAI_ERRATA_FIX_BUS_OFF_REC"), which means our workaround 
implementation doesn't work, at least not in this case. I've tried a 
small modification after looking at the current TWAI driver, but the 
ESP-IDF driver does a lot of things differently now, so that's not easily 
applicable to our driver.

With the CAN auto reset, the above cycle happens about every 5-10 
minutes as long as the Wifi mode is client (scanning or fixed).

The issue disappears as soon as I start the Wifi apclient mode, and 
reappears as soon as I go back to client mode. This is 100% repeatable.

Any ideas? Is this possibly some kind of EM interference, as the CAN bus 
isn't terminated in the module? But if so, wouldn't this be more likely 
to happen with AP mode enabled?

Should I try closing J4 to enable the CAN1 termination?

Has anyone seen a similar effect? Btw, this is a production 3.3 module 
(first batch).

Regards,
Michael

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
