[Ovmsdev] esp32can issue

Mark Webb-Johnson mark at webb-johnson.net
Mon Jun 11 10:07:47 HKT 2018


Nasty.

I think the fault must be internal (controller, or our driver code), as the can bus itself has error checking.

I reviewed the ESP32.COM thread, and master driver code from the original developers, but couldn’t find anything obvious. There was one late commit (early this year) commented as 'avoid dominant CAN bus level while initialisation’ and just affecting initialisation. I don’t think that is related, but I just applied it to our code base anyway.

The only other thing I noticed on my review is that in ESP32CAN_rxframe there is no check that a frame is actually available before reading and queueing it. If the RX interrupt is spurious, that could cause all sorts of issues. I think it would be safer to have such a check, and maybe that is the cause of this issue?

I’m working on the MCP2515 driver at the moment, to try to find out what is causing the OBDII HUD lock-up (and others have seen). Perhaps I’ll find out something when reviewing the driver and shared CAN library. Due to the huge volume of traffic on my Tesla Model S, I get a large number of overruns but don’t see this problem. Maybe it is rare, or we are just not noticing it.

Regards, Mark

> On 10 Jun 2018, at 9:05 PM, Michael Balzer <dexter at expeedo.de> wrote:
> 
> Frank Demes reported a very strange CAN behaviour. He managed to get a CRTD log of one incident.
> 
> The effect is triggered by an unknown cause freezing the CAN rx task for short periods of time, i.e. 70 - 120 ms. This alone should be no problem, the RXFIFO
> will probaly overflow, but that is handled by our driver, so except some messages lost this should not be an issue.
> 
> Unfortunately, something goes completely wrong:
> 
> a) on the overrun event, the last valid message is delivered multiple times (i.e. 5 times)
> 
> b) after resolving the overrun, a garbled message is received.
> 
> The garbled message contains a valid byte sequence in the wrong place, i.e. bytes 5-8 would be valid if they were bytes 1-4. Bytes 1-4 cannot have been read
> from the bus, so something is going very wrong here.
> 
> I have walked through the esp32can driver and checked the SJA1000 documentation but have no clue what is going wrong.
> 
> Any ideas? CRTD excerpt below. (Frame 155 is reflected by the module with byte 1 changed from 07 to 01 to limit the charge current. The problem is, it also
> reflects the garbled message, which causes the charger to stop.)
> 
> Regards,
> Michael
> 
> 
> ----------------------------------------------------------------------------------------
>> 274.341 1R11 155 07 97 E4 54 4B A8 00 6C
> 274.341 1T11 155 01 97 E4 54 4B A8 00 6C
> 274.351 1R11 155 07 97 E5 54 4B A8 00 6C
> 274.351 1T11 155 01 97 E5 54 4B A8 00 6C
> 274.361 1R11 155 07 97 E5 54 4B A8 00 6C
> 274.361 1T11 155 01 97 E5 54 4B A8 00 6C
> 274.371 1R11 423 03 22 FF FF 00 E0 00 DB
> 274.371 1R11 426 00 38 01 00 4D 7E 00
> 274.371 1R11 597 20 A4 03 B1 2F 00 01 53
> 274.371 1R11 627 00 00 00
> 274.371 1R11 155 07 97 E5 54 4B A8 00 6C
> 274.371 1T11 155 01 97 E5 54 4B A8 00 6C
> 274.381 1R11 155 07 97 E5 54 4B A8 00 6C
> 274.381 1T11 155 01 97 E5 54 4B A8 00 6C
> 274.391 1R11 5D7 FF FF 01 E4 53 80 00
> 
>   - normal up to here
> 
>   !!! CAN RX is blocked here for ~70 ms !!!
>     … interrupts disabled? by whom?
> 
> 274.461 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.461 1R11 436 00 0E 3C B2 00 00
> 274.461 1R11 155 07 97 E5 54 4B A8 00 6C
> 274.461 1R11 424 11 40 15 26 47 64 00 48
> 274.461 1R11 425 0A 1D 44 FF FE 40 01 1F
> 274.461 1R11 556 30 63 07 30 73 07 30 7A
> 274.461 1T11 155 01 97 E5 54 4B A8 00 6C      ← reflection for 155 still works here
> 
>   !!! next RX pause, ~90 ms this time !!!
> 
>   !!! ID 599 has a period of 100 ms, but we now get it 5 times:
> 
> 274.551 1R11 599 00 00 4D 7E 1E 26 FF FF
> 
> 274.551 1CEV Error intr=58445 rxpkt=27132 txpkt=528128 errflags=0x15433 rxerr=0 txerr=0 rxovr=2 txovr=0 txdelay=17
>   crtd fprintf bug compensation:
>   274.551 1CEV Error rxpkt=58445 txpkt=27132 errflags=528128 intr=0x15433 rxerr=0 txerr=0 rxovr=2 txovr=0 txdelay=17
>     errflags=528128 = 08 0F 00
>     0x08 = IR: Data overrun
>     0x0F = SR: Bus OK, error limit not yet reached, RX & TX idle, RXFIFO overrun, RX buffer full
>     0x00 = ECC: -
> 
> 274.551 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.551 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.551 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.551 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.551 1R11 155 07 97 E6 54 4B A8 00 6C                    RX "A"
> 274.551 1R11 155 07 97 E6 54 4B A8 00 6C                    RX "B"
> 274.551 1R11 155 07 97 E7 54 4B A8 00 6C                    RX "C"
> 274.551 1R11 423 03 22 FF FF 00 E0 00 DB
> 274.551 1R11 426 00 38 01 00 4D 7E 00
> 274.551 1R11 597 20 A4 03 B1 2F 00 01 53
> 274.551 1R11 627 00 00 00
> 274.551 1T11 155 01 97 E6 54 4B A8 00 6C                    reflection "A"
> 274.551 1CEV TX_Queue T11 155 01 97 E6 54 4B A8 00 6C       reflection "B"
> 274.551 1CEV TX_Queue T11 155 01 97 E7 54 4B A8 00 6C       reflection "C"
> 274.551 1T11 155 01 97 E8 54 4B A8 00 6C                    reflection "D"
> 
>   !!! 120 ms pause !!!
> 
> 274.671 1R11 155 07 97 E8 54 4B A8 00 6C                    RX "D"
> 
>   → "D" was sent to the listeners & reflected by vehicle::IncomingFrame()
>     then, before the LogFrame() call was processed, the CAN task was paused for 120 ms
> 
>   and now it's getting really weird:
> 
> 274.671 1R11 155 07 08 2A A0 07 97 E7 54                    RX "E": invalid frame / garbled message!
> 
>   … this frame certainly was not on the bus!
>     bytes 2-4 (08 2A A0) = complete nonsense, not matching any frame on the bus
>     bytes 5-8 (07 97 E7 54) = normally bytes 1-4 (!) of ID 155
>     - copy / memory error?      … very unlikely
>     - CAN RX handling error?    … doesn't seem so
>     - RXFIFO message window bug?
> 
> 274.671 1CEV Error intr=58458 rxpkt=27134 txpkt=528128 errflags=0x15456 rxerr=0 txerr=0 rxovr=3 txovr=0 txdelay=19
> 274.671 1R11 155 07 97 E7 54 4B A8 00 6C
> 274.671 1R11 155 07 97 E7 54 4B A8 00 6C
> 274.671 1R11 155 07 97 E7 54 4B A8 00 6C
> 274.671 1T11 155 01 97 E6 54 4B A8 00 6C
> 274.671 1CEV TX_Queue T11 155 01 97 E7 54 4B A8 00 6C
> 274.671 1R11 155 07 97 F2 54 4B A8 00 6C
> 274.671 1R11 155 07 97 F3 54 4B A8 00 6C
> 274.671 1R11 423 03 22 FF FF 00 E0 00 DB
> 274.671 1R11 426 00 38 01 00 4D 7E 00
> 274.671 1R11 597 20 A4 03 B1 2F 00 01 53
> 274.671 1R11 627 00 00 00
> 274.671 1R11 155 07 97 F5 54 4B A8 00 6C
> 274.671 1R11 155 07 97 F6 54 4B A8 00 6C
> 274.671 1R11 5D7 FF FF 01 E4 53 80 00
> 274.671 1R11 599 00 00 4D 7E 1E 26 FF FF
> 274.671 1R11 155 07 97 F7 54 4B A8 00 6C
> 274.671 1R11 436 00 0E 3C B2 00 00
> 274.671 1R11 041 A0 07 97 F5 54 4B A8 00
> 274.671 1T11 155 01 97 E7 54 4B A8 00 6C
> 274.671 1T11 155 01 08 2A A0 07 97 E7 54                    reflection of invalid msg "E" → charge terminates due to error
> 
> 
> -- 
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
> 
> 
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev




More information about the OvmsDev mailing list