[Ovmsdev] Need some help on CAN errors

Michael Balzer dexter at expeedo.de
Fri Sep 18 21:59:19 HKT 2020


Chris,

I've got no simple way to test changes to the MCP2515 driver, can you
please try the attached patch?

In case you're not familiar with patching:

git apply …/mcp2515.patch

That's simply adding a mutex to Write() & AsynchronousInterruptHandler()
plus a minor optimization to directly queue a frame if the queue isn't
empty.

Thanks,
Michael


Am 18.09.20 um 15:39 schrieb Chris van der Meijden:
> Correction:
>
> The crashes occur when I use
>
>   CAN_frame_t txframe;
>   txframe = {};
>   txframe.MsgID = 0x219;
>   txframe.FIR.B.DLC = 8;
>   txframe.data = { 0x…, 0x…, … };
>   txframe.Write(m_can3);
>
> to write to the bus.
>
> When I use the 
>
> comfBus->WriteStandard(0x219, length, data);
>
> method I have no crashes.
>
> Continuing the quest ...
>
> Chris
>
>
> Am Donnerstag, den 17.09.2020, 21:35 +0200 schrieb Michael Balzer:
>> That looks like a bug in the MCP2515 driver.
>>
>>> The TX that come to the bus (3T11) we see 15 times the first messages
>>> (that is right), one time the second messages (wrong, should be two)
>>> and 6 times the third messages (wrong, should be 5).
>>
>> I haven't fully analyzed the code path yet, but my impression so far
>> is we have a race condition between mcp2515::Write() and
>> mcp2515::AsynchronousInterruptHandler(). If I'm right, that could
>> lead to wrong TX log entries and even to lost or out of sequence
>> frame transmissions in a situation like yours, when many frames get
>> sent in series without any delay.
>>
>> The race scheme is:
>>
>> - The application task (low prio) calls Write() concurrently to the
>> "TX done" interrupt of the last frame sent being raised.
>> - The interrupt is sent to the CAN task (high prio), which calls
>> AsynchronousInterruptHandler() concurrently now to the Write().
>> - Both Write() and AsynchronousInterruptHandler() may now get "TX
>> buffer is free" from the MCP2515
>> - …Write() will then immediately fill the TX buffer with the new frame
>> - …and AsynchronousInterruptHandler() will concurrently fill the TX
>> buffer with the next frame of the TX queue if any
>>
>> As the SPI bus has a lock, both writes may succeed, but m_tx_frame
>> will contain only the last of the two frames, so the next log entry
>> will be wrong. The m_tx_frame container (originating from PR 259) has
>> been designed under the assumption of stricly sequential frame
>> transmissions.
>>
>> You could have had that very effect on the second time of the second
>> message vs. the first time of the third message:
>>
>> 1600352233.392765 3CER TX_Queue T11 291 *8a**aa* 00 00 00 46 03 00
>> 1600352233.394239 3T11 291 *8a**aa* 00 00 00 46 03 00
>> 1600352233.394816 3CER TX_Queue T11 291 *00**aa* 00 00 00 46 03 00
>> 1600352233.395680 3T11 291 *8a**aa* 00 00 00 46 03 00
>> 1600352233.396153 3CER TX_Queue T11 291 *00**aa* 00 00 00 46 03 00
>> 1600352233.404239 3T11 291 *00**aa* 00 00 00 46 03 00
>> 1600352233.404538 3CER TX_Queue T11 291 *00 00* 00 00 00 46 03 00
>> 1600352233.404753 3CER TX_Queue T11 291 *00 00* 00 00 00 46 03 00
>> … TX "00 aa" done IRQ coming in here, queue has "00 00" frame ready
>> which gets sent concurrently
>>   → wrong frame "00 00" is now in m_tx_frame and gets logged instead
>> of "00 aa":
>> 1600352233.405126 3T11 291 *00 00* 00 00 00 46 03 00
>>
>>
>> A second opinion on this would be good, possibly by someone else who
>> has worked on the MCP2515 code last?
>>
>> Chris, while we're thinking about this, you could try serialising the
>> frames by sending them one by one using the TX callback. The TX queue
>> is meant to handle bulk transmissions, but you may need to raise the
>> queue size (sdkconfig: CONFIG_OVMS_HW_CAN_TX_QUEUE_SIZE=20) in that case.
>>
>>> On the RX side (3R11) we only see 3 times the first messages, none of
>>> the second and one of the third.
>>
>> The RX log entries are the frames received from the vehicle.
>>
>> Regards,
>> Michael
>>
>>
>>
>> Am 17.09.20 um 17:06 schrieb Chris van der Meijden:
>>> Hi all
>>>
>>> I'm trying to make the Lock/Unlock function work for the VW e-UP on
>>> T26A.
>>>
>>> I found the right messages and got it working once, not knowing why.
>>>
>>> What I need to do is sending 22 messages in a row. My code for that is:
>>>
>>>         // Send 0x291 messages
>>>
>>>         for (k=0;k<15;k++)
>>>         {
>>>            data[0] = 0x8A;
>>>            data[1] = 0xAA;
>>>            data[2] = 0x00;
>>>            data[3] = 0x00;
>>>            data[4] = 0x00;
>>>            data[5] = 0x46;
>>>            data[6] = 0x03;
>>>            data[7] = 0x00;
>>>            if (vwup_enable_write && !dev_mode)
>>>             comfBus->WriteStandard(0x291, length, data);
>>>
>>>         }
>>>
>>>         for (k=0;k<2;k++)
>>>         {
>>>            data[0] = 0x00;
>>>            data[1] = 0xAA;
>>>            data[2] = 0x00;
>>>            data[3] = 0x00;
>>>            data[4] = 0x00;
>>>            data[5] = 0x46;
>>>            data[6] = 0x03;
>>>            data[7] = 0x00;
>>>            if (vwup_enable_write && !dev_mode)
>>>             comfBus->WriteStandard(0x291, length, data);
>>>
>>>         }
>>>
>>>         for (k=0;k<5;k++)
>>>         {
>>>            data[0] = 0x00;
>>>            data[1] = 0x00;
>>>            data[2] = 0x00;
>>>            data[3] = 0x00;
>>>            data[4] = 0x00;
>>>            data[5] = 0x46;
>>>            data[6] = 0x03;
>>>            data[7] = 0x00;
>>>            if (vwup_enable_write && !dev_mode)
>>>             comfBus->WriteStandard(0x291, length, data);
>>>
>>>         }
>>>
>>> This is what I get in the logging:
>>>
>>> V (96275) canlog-monitor: 1600352233.355713 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96275) canlog-monitor: 1600352233.357066 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96275) canlog-monitor: 1600352233.357673 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96275) canlog-monitor: 1600352233.358156 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> E (96285) mcp2515: AsynchronousInterruptHandler: error while sending
>>> frame. msgId 0x291
>>> V (96285) canlog-monitor: 1600352233.364241 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> E (96285) can: can3: intr=8 rxpkt=0 txpkt=4 errflags=0x85050101 rxerr=0
>>> txerr=107 rxovr=0 txovr=0 txdelay=6 wdgreset=0 errreset=0
>>> V (96285) canlog-monitor: 1600352233.364745 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> W (96285) mcp2515: can3 EFLG: TX_Err_Warn EWARN 
>>> V (96285) canlog-monitor: 1600352233.364965 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> E (96285) can: can3: intr=8 rxpkt=1 txpkt=5 errflags=0x4050104 rxerr=0
>>> txerr=107 rxovr=0 txovr=0 txdelay=6 wdgreset=0 errreset=0
>>> E (96285) canlog-monitor: 1600352233.366305 3CER Error intr=8 rxpkt=0
>>> txpkt=4 errflags=0x85050101 rxerr=0 txerr=107 rxovr=0 txovr=0 txdelay=6
>>> wdgreset=0 errreset=0
>>> V (96285) canlog-monitor: 1600352233.367035 3R11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96285) canlog-monitor: 1600352233.367620 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96295) canlog-monitor: 1600352233.367647 3CER TX_Fail T11 291 8a aa
>>> 00 00 00 46 03 00
>>> E (96295) canlog-monitor: 1600352233.368350 3CER Error intr=8 rxpkt=1
>>> txpkt=5 errflags=0x4050104 rxerr=0 txerr=107 rxovr=0 txovr=0 txdelay=6
>>> wdgreset=0 errreset=0
>>> V (96295) canlog-monitor: 1600352233.368403 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96295) canlog-monitor: 1600352233.369064 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96295) canlog-monitor: 1600352233.370243 3R11 470 40 00 00 00 00 00
>>> 00 14
>>> V (96295) canlog-monitor: 1600352233.371859 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96295) canlog-monitor: 1600352233.374239 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96295) canlog-monitor: 1600352233.374827 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96295) canlog-monitor: 1600352233.375507 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96295) canlog-monitor: 1600352233.375976 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96305) canlog-monitor: 1600352233.384241 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96305) canlog-monitor: 1600352233.384985 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96305) canlog-monitor: 1600352233.385602 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> E (96305) can: can3: intr=14 rxpkt=2 txpkt=10 errflags=0x27450501
>>> rxerr=0 txerr=100 rxovr=0 txovr=0 txdelay=12 wdgreset=0 errreset=0
>>> V (96305) canlog-monitor: 1600352233.386140 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> W (96305) mcp2515: can3 EFLG: TX_Err_Warn EWARN 
>>> E (96305) canlog-monitor: 1600352233.387010 3CER Error intr=14 rxpkt=2
>>> txpkt=10 errflags=0x27450501 rxerr=0 txerr=100 rxovr=0 txovr=0
>>> txdelay=12 wdgreset=0 errreset=0
>>> V (96305) canlog-monitor: 1600352233.387820 3R11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.388508 3R11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.388548 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.388586 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.390509 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.391040 3R11 359 ff 89 ff 00 10 6b
>>> 80 00
>>> V (96315) canlog-monitor: 1600352233.392246 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.392765 3CER TX_Queue T11 291 8a aa
>>> 00 00 00 46 03 00
>>> V (96315) canlog-monitor: 1600352233.394239 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.394816 3CER TX_Queue T11 291 00 aa
>>> 00 00 00 46 03 00
>>> V (96315) canlog-monitor: 1600352233.395680 3T11 291 8a aa 00 00 00 46
>>> 03 00
>>> V (96315) canlog-monitor: 1600352233.396153 3CER TX_Queue T11 291 00 aa
>>> 00 00 00 46 03 00
>>> V (96325) canlog-monitor: 1600352233.404239 3T11 291 00 aa 00 00 00 46
>>> 03 00
>>> V (96325) canlog-monitor: 1600352233.404538 3CER TX_Queue T11 291 00 00
>>> 00 00 00 46 03 00
>>> V (96325) canlog-monitor: 1600352233.404753 3CER TX_Queue T11 291 00 00
>>> 00 00 00 46 03 00
>>> V (96325) canlog-monitor: 1600352233.404791 3R11 291 00 00 00 00 00 46
>>> 03 00
>>> V (96325) canlog-monitor: 1600352233.405089 3R11 457 00 47 00
>>> V (96325) canlog-monitor: 1600352233.405126 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> V (96325) canlog-monitor: 1600352233.406203 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> V (96325) canlog-monitor: 1600352233.407538 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> V (96325) canlog-monitor: 1600352233.408701 3CER TX_Queue T11 291 00 00
>>> 00 00 00 46 03 00
>>> V (96335) canlog-monitor: 1600352233.414259 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> E (96335) can: can3: intr=25 rxpkt=7 txpkt=21 errflags=0x27400501
>>> rxerr=0 txerr=90 rxovr=0 txovr=0 txdelay=18 wdgreset=0 errreset=0
>>> E (96335) canlog-monitor: 1600352233.416398 3CER Error intr=25 rxpkt=7
>>> txpkt=21 errflags=0x27400501 rxerr=0 txerr=90 rxovr=0 txovr=0
>>> txdelay=18 wdgreset=0 errreset=0
>>> V (96335) canlog-monitor: 1600352233.416729 3R11 320 01 00 00 01 00 02
>>> 00 80
>>> V (96335) canlog-monitor: 1600352233.417216 3R11 575 00 00 00 00
>>> V (96335) canlog-monitor: 1600352233.417255 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> V (96335) canlog-monitor: 1600352233.417283 3T11 291 00 00 00 00 00 46
>>> 03 00
>>> I (96335) ovms-server-v2: Send MP-0 c22,0
>>>
>>> I see some weird stuff here ...
>>>
>>> The TX that come to the bus (3T11) we see 15 times the first messages
>>> (that is right), one time the second messages (wrong, should be two)
>>> and 6 times the third messages (wrong, should be 5).
>>>
>>> On the RX side (3R11) we only see 3 times the first messages, none of
>>> the second and one of the third.
>>>
>>> To unlock the doors we need on 3R11 (???) 15 times first, 5 times
>>> second and two times third.
>>>
>>> I don't understand enough from the can communication within OVMS to
>>> solve this on my own. Could someone help me out here?
>>>
>>> Thanx!
>>>
>>> Regards
>>>
>>> Chris
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20200918/9f510264/attachment-0002.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcp2515.patch
Type: text/x-patch
Size: 1945 bytes
Desc: not available
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20200918/9f510264/attachment-0002.bin>


More information about the OvmsDev mailing list