[Ovmsdev] esp32can issue

Mark Webb-Johnson mark at webb-johnson.net
Sun Nov 4 22:04:15 HKT 2018


Despite working on this on and off for a month, I can’t find the root cause. The MCP2515 just seems to randomly get overloaded and stop receiving, but the flags don’t indicate it in any reliable way.

I have just committed a change to try to work around this:

Introduce a watchdog reset count in the status for each can bus.
Introduce a watchdog timer that tracks the time the last message was received.
If the car is OFF, the watchdog timer is simply reset once every ten seconds.
If the car is ON, and the watchdog timer indicates no messages received for 60 seconds (or longer), the bus is reset (turned off and on). An error is logged to record this, and the status counter is incremented.
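
As a rough illustration of the logic above (a minimal sketch only, not the committed code): the struct name, member names, and the idea of returning a flag for the caller to act on are all assumptions made for this example. Only the 60 second timeout, the re-arm while the car is OFF, and the reset counter come from the description above.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical sketch: these names are NOT the real OVMS classes or members.
constexpr std::uint32_t kWatchdogTimeoutSeconds = 60;

struct CanBusWatchdog {
  std::uint32_t last_rx_s = 0;    // time (seconds) the last message was received
  std::uint32_t reset_count = 0;  // watchdog reset count, reported in the bus status

  // Call whenever a message is received on this bus.
  void OnFrameReceived(std::uint32_t now_s) { last_rx_s = now_s; }

  // Call periodically (for example every ten seconds).
  // Returns true when the caller should turn the bus off and on again.
  bool Tick(std::uint32_t now_s, bool car_on) {
    if (!car_on) {
      // Car is OFF: an idle bus is expected, so just re-arm the timer.
      last_rx_s = now_s;
      return false;
    }
    if (now_s - last_rx_s < kWatchdogTimeoutSeconds) {
      return false;  // a message arrived recently enough
    }
    // Car is ON and nothing was received for 60 seconds or longer.
    ++reset_count;
    last_rx_s = now_s;  // re-arm so the next reset is at least 60 seconds away
    std::fprintf(stderr, "can watchdog: no rx for %u s, resetting bus (count=%u)\n",
                 static_cast<unsigned>(kWatchdogTimeoutSeconds),
                 static_cast<unsigned>(reset_count));
    return true;
  }
};
```

In this sketch the actual stop/start of the controller is left to the caller, so the timing logic can be exercised without hardware. Re-arming the timer after a reset means a bus that stays silent is reset at most once per timeout period.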

That should go out to EDGE release tonight. It is already in my car, but that only fails once every week or so.

Regards, Mark

> On 16 Aug 2018, at 7:05 PM, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
> 
> The mcp2515 issues are probably different from the esp32can ones, but similar logic can probably be used to address them both.
> 
> After a week in my car with three buses running, can2 locked up today. Here is the status:
> 
> OVMS# can can2 status
> CAN:       can2
> Mode:      Active
> Speed:     500000
> Interrupts:               40069
> Rx pkt:                   40429
> Rx err:                       0
> Rx ovrflw:                    0
> Tx pkt:                       0
> Tx delays:                    0
> Tx err:                       0
> Tx ovrflw:                    0
> Err flags: 0x01000001
> 
> OVMS# can can3 status
> CAN:       can3
> Mode:      Active
> Speed:     125000
> Interrupts:             5003600
> Rx pkt:                 5003610
> Rx err:                       0
> Rx ovrflw:                    0
> Tx pkt:                       0
> Tx delays:                    0
> Tx err:                       0
> Tx ovrflw:                    0
> Err flags: 0x01000001
> 
> CAN3 was operating normally. Flags identical.
> 
> I fixed this with:
> 
> OVMS# can can2 start active 500000
> Can bus can2 started in mode active at speed 500000bps
> 
> OVMS# can can2 status
> CAN:       can2
> Mode:      Active
> Speed:     500000
> Interrupts:                 831
> Rx pkt:                     831
> Rx err:                       0
> Rx ovrflw:                    0
> Tx pkt:                       0
> Tx delays:                    0
> Tx err:                       0
> Tx ovrflw:                    0
> Err flags: 0x01000001
> 
> No need to power on/off.
> 
> For mcp2515, I’ll try to add a ‘kick’ function that reads the status registers and restarts the controller as appropriate. That should give us more information.
> 
> Regards, Mark.
> 
>> On 16 Aug 2018, at 4:58 PM, Tom Parker <tom at carrott.org> wrote:
>> 
>> On my Leaf the bus seems to stop quite often, so the reset would have to operate quite often too. Perhaps increment a metric when the reset fires so we can monitor how often it is happening?
>> 
>> 
>> On 16/08/18 01:46, Mark Webb-Johnson wrote:
>>> Sorry, forgot to mention: another option is to poll the bus manually if an interrupt hasn’t been received in N seconds. Check the status registers and if everything is not perfect then reset the controller.
>>> 
>>>> On 15 Aug 2018, at 9:45 PM, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>>> 
>>>> This was my worry for both esp32 can and mcp2515. We get an interrupt, but for some reason the status is not showing anything to do, so we return from the interrupt. Then, the status changes. Or, for a particular status we had to do something that we didn’t. The bus is locked up, and the interrupt never fires again, so we never get called. I wonder if we just fired the interrupt handler again manually, would it recover?
>>>> 
>>>> When this happens, do we need to power off then on the can bus, or is a ‘can canX stop’ ‘can canX start …’ sufficient?
>>>> 
>>>> Perhaps a general solution would be a watchdog. If the CAN bus does not receive anything for N seconds (perhaps 60), then restart the controller? We could simply protect it from restarting twice (a simple check of the counters), so worst case this would restart once, 60 seconds after a bus normally went idle. Or is that too kludgy?
>>>> 
>>>> Regards, Mark.
>> 
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
> 
