[Ovmsdev] Problems with IncomingFrameCan when registering two can buses.

Mark Webb-Johnson mark at webb-johnson.net
Wed Jan 3 20:35:41 HKT 2018


Great work done here. Thanks to all contributors.

I’ve been away for the past few days, spending some quality time with my wife and kids. I left OVMS v3 running on my desk, connected via the cellular network here in Malaysia. It saw a bunch of connection drops, but overall managed to reconnect each time and be stable for 10 days. I think we’re close… Just need to deal with the edge case of modem rebooting and losing MUX connection.

Regards, Mark.



> On 1 Jan 2018, at 7:00 PM, Geir Øyvind Vælidalo <geir at validalo.net> wrote:
> 
> YESSS! Both CAN buses are working now! It was stable for at least a few minutes (that was all I had time for), so I think we can call the problem solved. Thanks, guys!
> 
> Happy new year to all!
> 
> Geir
> 
>> On 31 Dec 2017, at 23:07, Greg D. <gregd2350 at gmail.com> wrote:
>> 
>> Right.  I verified the Tx with Wireshark, but the counters are zero as well.  Good cross-check.
>> 
>> Fully agree with not putting delays in the driver.  My understanding from others was that it was a temporary hack to get things to work, pending a better solution.  We now have that, thanks!
>> 
>> Greg
>> 
>> 
>> Michael Balzer wrote:
>>> Greg,
>>> 
>>> you should now also be able to see TX overflows in the can status output.
>>> 
>>> I removed the delay loop because delays and retries should not be done by the driver. The driver should just tell the application about a failure, which it does now. Also, the minimum delay with vTaskDelay() is 1 "tick" = currently 10 ms, which is already far too much time to spend at this level.
>>> 
>>> The previous esp32can implementation did no checking of the TX buffer at all, resulting in silent losses or possibly corruptions of frames on TX overflows.
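
A minimal sketch of the approach described above, in which the driver reports a busy TX buffer instead of waiting for it. FakeController, TxCounters and TryWrite are hypothetical names used for illustration only; this is not the actual esp32can code.

```cpp
#include <cstdint>

// Hypothetical stand-in for a CAN controller with a single TX buffer.
struct FakeController {
  bool tx_busy = false;    // true while the previous frame is still pending
};

// Hypothetical result type and counters, named after the status output.
enum class TxResult { Ok, Fail };

struct TxCounters {
  uint32_t tx_pkt = 0;     // frames handed to the controller
  uint32_t tx_ovrflw = 0;  // frames rejected because the buffer was busy
};

// Try to queue one frame without blocking. If the buffer is still occupied,
// count an overflow and report the failure. Retrying (or not) is left to
// the caller, so no delay is spent at the driver level.
TxResult TryWrite(FakeController& ctrl, TxCounters& counters) {
  if (ctrl.tx_busy) {
    ++counters.tx_ovrflw;
    return TxResult::Fail;
  }
  ctrl.tx_busy = true;     // the buffer holds the frame until it is sent
  ++counters.tx_pkt;
  return TxResult::Ok;
}
```

In this sketch a second TryWrite() issued before the controller has cleared tx_busy returns TxResult::Fail and increments tx_ovrflw, which is the kind of event the "Tx ovrflw" counter would record.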
>>> 
>>> And TX overflows actually do occur quite often, without a plausible cause:
>>> 
>>> OVMS > can can1 status 
>>> CAN:       can1
>>> Mode:      Active
>>> Speed:     500000
>>> Rx pkt:                  133146
>>> Rx err:                       0
>>> Rx ovrflw:                    0
>>> Tx pkt:                   53238
>>> Tx err:                      95
>>> Tx ovrflw:                  498
>>> Err flags: 0x12c00
>>> 
>>> In this case, a TX occurs every 10 ms on a 500 kbit bus -- plenty of time for the buffer to get sent. I'm looking into that.
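
For scale, a rough calculation of how long one frame occupies the bus. The worst case of about 135 bits (standard 11-bit ID, 8 data bytes, maximum bit stuffing, interframe space) is an assumption taken from general CAN framing rules, not from this thread, and FrameTimeUs is a hypothetical helper.

```cpp
#include <cstdint>

// Time in microseconds that a frame of frame_bits bits occupies a bus
// running at bitrate bits per second (integer division, rounds down).
constexpr uint64_t FrameTimeUs(uint64_t frame_bits, uint64_t bitrate) {
  return frame_bits * 1000000ULL / bitrate;
}
```

With these assumptions, FrameTimeUs(135, 500000) gives 270 µs, so a 10 ms interval leaves room for roughly 37 worst-case frames on an otherwise idle bus.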
>>> 
>>> Regards,
>>> Michael
>>> 
>>> 
>>> On 31.12.2017 at 19:37, Greg D. wrote:
>>>> Ok, MUCH better.  CAN3 is back to working again!  No evidence of the transmit overrun problem, and the frames are about 50ms apart, so I think your transmit changes are working.  I'm running 500k BAUD.
>>>> 
>>>> I'll let it run for a while, just to check stability.  But so far, Yea!
>>>> 
>>>> Greg
>>>> 
>>>> 
>>>> Michael Balzer wrote:
>>>>> I've added another minor change to avoid TX lockups if the bus is lost (i.e. disconnected). Also CMD_READ_RXBUF actually clears the interrupt flag itself, so we can skip the additional BITMODIFY call.
>>>>> 
>>>>> The internal ESP32 CAN controller (SJA1000) unfortunately only has one TX buffer. I'm having performance issues with the TX speed on that: it's not consistently high enough to get the Twizy charge control working reliably. Not sure yet if that's caused by the CAN controller, the driver or the RTOS.
>>>>> 
>>>>> Regards,
>>>>> Michael
>>>>> 
>>>>> 
>>>>> On 31.12.2017 at 14:33, Geir Øyvind Vælidalo wrote:
>>>>>> I’ve been looking into what’s happening with the interrupt/queue this morning and your fix makes more sense than my suggestion. It could potentially fix the problem with the buffer filling up.
>>>>>> Unfortunately my wife is using the car, so I can’t test this yet.
>>>>>> 
>>>>>> Geir    
>>>>>> 
>>>>>>> On 31 Dec 2017, at 10:54, Michael Balzer <dexter at expeedo.de> wrote:
>>>>>>> 
>>>>>>> Geir & Greg,
>>>>>>> 
>>>>>>> first of all, I made an over-optimization mistake in the RxCallback: the return after fetching the frame must always be true. The fix is pushed.
>>>>>>> 
>>>>>>> That bug caused frames to get lost, so you should apply this fix first.
>>>>>>> 
>>>>>>> 
>>>>>>> On 31.12.2017 at 01:00, Geir Øyvind Vælidalo wrote:
>>>>>>>> I did a test where I created three counters.
>>>>>>>> One went into MCP2515_isr and counts every interrupt.
>>>>>>>> One was added as the first code line in mcp2515::RxCallback.
>>>>>>>> And the third one was added to RxCallback, but right before we read the CAN frame via SPI that will end up in IncomingFrame, i.e. it should be a count of every CAN frame.
>>>>>>>> 
>>>>>>>> This is what I got:
>>>>>>>> 
>>>>>>>> OVMS > can can2 status
>>>>>>>> CAN:       can2
>>>>>>>> Mode:      Active
>>>>>>>> Speed:     100000
>>>>>>>> Rx pkt:                      82
>>>>>>>> MCP2515_isr:                 239
>>>>>>>> RxCallback1:                 320
>>>>>>>> RxCallback2:                 295
>>>>>>>> Rx err:                       0
>>>>>>>> Tx pkt:                       0
>>>>>>>> Tx err:                       0
>>>>>>>> Err flags: 0x2040
>>>>>>>> 
>>>>>>>> These numbers puzzle me. Shouldn’t RxCallback1 and RxCallback2 be less than or equal to MCP2515_isr? Where do these extra 81 calls come from? I’m missing something here...
>>>>>>> 
>>>>>>> No, that's expected behaviour. The MCP2515 has two RX buffers plus error conditions. The framework is designed to loop RxCallback over an IRQ event until all buffers and error conditions have been processed, so RxCallback counters should always be >= ISR count.
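
A small simulation of that looping behaviour, to show why the callback count exceeds the ISR count. FakeMcp2515, Counters, RxCallback and HandleIrq are hypothetical stand-ins for illustration, not the OVMS framework code, and the sketch assumes that handling an error condition also asks for another pass.

```cpp
#include <cstdint>

// Hypothetical model of the chip state after one interrupt.
struct FakeMcp2515 {
  int rx_pending = 0;          // filled RX buffers (the MCP2515 has two)
  bool error_pending = false;  // an error condition waiting to be handled
};

struct Counters {
  uint32_t isr = 0;       // interrupts seen
  uint32_t callback = 0;  // RxCallback invocations
  uint32_t frames = 0;    // frames actually fetched
};

// Handle one pending item. Returns true if something was processed and the
// caller should call again, false once nothing is left.
bool RxCallback(FakeMcp2515& chip, Counters& n) {
  ++n.callback;
  if (chip.rx_pending > 0) {
    --chip.rx_pending;
    ++n.frames;
    return true;   // a frame was fetched: always ask for another pass
  }
  if (chip.error_pending) {
    chip.error_pending = false;
    return true;   // error handled, but no frame produced
  }
  return false;
}

// One IRQ event: loop the callback until everything has been processed.
void HandleIrq(FakeMcp2515& chip, Counters& n) {
  ++n.isr;
  while (RxCallback(chip, n)) {
  }
}
```

In this model a single interrupt with both RX buffers full and one error pending produces one ISR count, four callback calls and two frames, so the callback counter is always at least the ISR count and can be larger than the frame count.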
>>>>>>> 
>>>>>>>> Also, RxCallback2 is much bigger than Rx pkt, which means not all frames are sent to IncomingFrame.
>>>>>>> 
>>>>>>> That's in part due to my bug, but it also can happen under normal conditions, as an error IRQ will also trigger the RxCallback but not return a frame to be processed.
>>>>>>> 
>>>>>>>> 
>>>>>>>> What does the 0x2040 mean? And where does that number come from?
>>>>>>>> 
>>>>>>> 
>>>>>>> That's constructed in line 293 from the error interrupt flags and the error register. The lower 8 bits are in the image I sent, the upper 8 bits are
>>>>>>> 
>>>>>>>       //  MERRF 0x80 = message tx/rx error
>>>>>>>       //  ERRIF 0x20 = overflow / error state change
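
A sketch of decoding such a value, assuming the layout described above: the upper byte carries the MERRF/ERRIF interrupt flags and the lower byte is the MCP2515 error flag register. The names for the lower byte are taken from the MCP2515 datasheet's EFLG register rather than from this thread, and DecodeErrFlags is a hypothetical helper. It applies to the MCP2515 buses only, not to the esp32can value shown for can1.

```cpp
#include <cstdint>
#include <string>

// Return a space-separated list of the flag names set in an MCP2515
// "Err flags" value (upper byte: interrupt flags, lower byte: EFLG).
std::string DecodeErrFlags(uint16_t flags) {
  struct Bit {
    uint16_t mask;
    const char* name;
  };
  static const Bit kBits[] = {
      {0x8000, "MERRF"},   // message tx/rx error
      {0x2000, "ERRIF"},   // overflow / error state change
      {0x0080, "RX1OVR"},  // receive buffer 1 overflow
      {0x0040, "RX0OVR"},  // receive buffer 0 overflow
      {0x0020, "TXBO"},    // bus-off
      {0x0010, "TXEP"},    // transmit error-passive
      {0x0008, "RXEP"},    // receive error-passive
      {0x0004, "TXWAR"},   // transmit error warning
      {0x0002, "RXWAR"},   // receive error warning
      {0x0001, "EWARN"},   // error warning
  };
  std::string out;
  for (const Bit& bit : kBits) {
    if (flags & bit.mask) {
      if (!out.empty()) out += ' ';
      out += bit.name;
    }
  }
  return out;
}
```

Under these assumptions, DecodeErrFlags(0x2040) yields "ERRIF RX0OVR", i.e. an error interrupt caused by an overflow of receive buffer 0.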
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Michael
>>>>>>> 
>>>>>>> -- 
>>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>>> _______________________________________________
>>>>>>> OvmsDev mailing list
>>>>>>> OvmsDev at lists.teslaclub.hk
>>>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 69918 bytes
Desc: not available
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20180103/15411d82/attachment-0002.png>