[Ovmsdev] MCP2515 driver bug?
Mark Webb-Johnson
mark at webb-johnson.net
Thu Jan 6 14:15:40 HKT 2022
I think we would need to see the status of the CAN bus and a trace of the traffic to know for sure what is going on. I find it strange that the HUD would send requests (I thought it purely responded to requests from the OVMS master?).
Apart from the SPI changes, the only change to the MCP2515 driver made here was to improve in-order delivery and throughput. I doubt that would improve the situation with a hung HUD.
Regards, Mark
> On 6 Jan 2022, at 2:18 AM, Greg D. <gregd2350 at gmail.com> wrote:
>
> Hi Mark,
>
> Simply power cycling the HUD didn't work. The only way I could clear the issue was to restart the application while the HUD was turned off (ext12v off). As I recall, watching the CAN bus with a monitor showed the HUD was sending requests, but they were not being received by the application. So I concluded something was odd with the driver, but never pinned it down.
>
> Greg
>
>
> Mark Webb-Johnson wrote:
>> If a power cycle of the HUD fixed the problem, then it sounds like the issue is more likely on the HUD side. Conversely if a stop+start of the CAN on ovms fixed it, then I would suspect our driver.
>>
>> Regards, Mark.
>>
>>> On 4 Jan 2022, at 11:05 AM, Greg D <gregd2350 at gmail.com> wrote:
>>>
>>> No error flags as I recall. It just stopped receiving frames. I tried to figure out how to clear it in the driver, but never succeeded. The workaround was to cycle the ext12v, restarting obd2ecu between off and on, triggered by a Car-on event.
>>>
>>> Greg
>>>
>>>
>>> On January 3, 2022 6:54:22 PM PST, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>>
>>> I don’t know the cause of the HUD bus hang, so really not sure if it is resolved or not. The SPI driver is the standard ESP IDF one now, so should be stable. But what was the state of the bus when it hung? Perhaps some error condition set in the MCP2515?
>>>
>>> Regards, Mark
>>>
>>>> On 4 Jan 2022, at 10:46 AM, Greg D <gregd2350 at gmail.com> wrote:
>>>>
>>>> Hi Mark,
>>>>
>>>> Trying this again... (got a 554 blocked reply the first time). Apologies if it's a duplicate.
>>>>
>>>> Will this fix the bus hangs we have with the HUD devices? Ref: "Can buses stop after some time" email thread from 5/28/2019... As I wrote at the time, a 100% way to reproduce the hang was to have a HUD device running, then start the OBD2ECU task. I haven't used my HUD in some time, so I don't know if this was fixed since then. If not, that was a very quick and easy way to reproduce the issue.
>>>>
>>>> Happy New Year!
>>>>
>>>> Greg
>>>>
>>>> On Mon, Jan 3, 2022 at 5:10 PM Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>>> An update on this.
>>>>
>>>> Working with another developer, we have made some changes in a ’spimaster’ branch:
>>>>
>>>> Stop using spi_nodma fork of ESP’s standard spi code, and switch back to use the standard ESP IDF spi master.
>>>>
>>>> To support >3 devices (which the ESP IDF spi master doesn’t, due to hardware limitations of the CS lines and 3x DMA channels), change to use a software CS line for the MAX7317 driver (the MCP2515s continue to use hardware CS).
>>>>
>>>> Confirm the changes to our MCP2515 driver related to keeping track of the last buffer read, to solve the out-of-order issue.
>>>>
>>>> Confirm the fix for another related issue where we don’t block (delay) if the can tx queue is full.
>>>>
>>>> These seem better now, and I am able to establish a CAN IP connection over MCP2515. Frames come in order, and we are seeing performance around 700 frames/second - which should be adequate for our needs.
>>>>
>>>> I’ll do some more testing over the next few days, and if no issues found merge back to master.
>>>>
>>>> Regards, Mark.
>>>>
>>>>> On 7 Jun 2021, at 12:16 AM, Michael Balzer <dexter at expeedo.de> wrote:
>>>>>
>>>>> Mark,
>>>>>
>>>>> I've just found a spot-on post on this issue:
>>>>>
>>>>> https://www.microchip.com/forums/tm.aspx?m=620741
>>>>>
>>>>> Tom suggests implementing a state machine to reproduce the receive order. His analysis & solution look sound to me.
>>>>>
>>>>> Regards,
>>>>> Michael
>>>>>
>>>>>
>>>>> On 06.06.21 at 14:50, Mark Webb-Johnson wrote:
>>>>>>
>>>>>> I spent quite a bit of time on this. With my standard test packet of 11 CAN frames expected, and the standard driver, I get perhaps 4 or 5 of them (about half are lost, and some are out of order).
>>>>>>
>>>>>> I made the suggested change to move the MyCan.IncomingFrame() call out of the ‘can’ object (when frameReceived is true) to within the mcp2515 AsynchronousInterruptHandler itself. That allows the handler to receive more than one frame per call and is a very simple change. Once that is done, we can at least now try to tune it.
>>>>>>
>>>>>> So I then modified the code of mcp2515 AsynchronousInterruptHandler to loop so long as the interrupt flag says either buffer #0 or #1 has a frame. The result looks something like this:
>>>>>>
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler instat=01 errflag=00 txb0ctrl=00
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler rx frame from buffer #0 (ID 0x110 B1=54)
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler instat=23 errflag=40 txb0ctrl=00
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler rx frame from buffer #0 (ID 0x110 B1=40)
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler rx frame from buffer #1 (ID 0x110 B1=45)
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler Clear RX buffer #0 overflow flag
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler instat=23 errflag=00 txb0ctrl=00
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler rx frame from buffer #0 (ID 0x110 B1=24)
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler rx frame from buffer #1 (ID 0x110 B1=34)
>>>>>> D (63192) mcp2515: AsynchronousInterruptHandler instat=20 errflag=00 txb0ctrl=00
>>>>>>
>>>>>> The actual frames on the bus are (B1 values) 54, 45, 40, 0a, 7a, d5, 0c, 14, 1c, 24, 2c, and 34. Looking at the above debug output, we get:
>>>>>>
>>>>>> Interrupt flags show buffer #0 has a frame. It is B1=54. Good.
>>>>>> Interrupt flags show buffers #0 and #1 both have frames. Buffer #0 has B1=40 and buffer #1 has B1=45.
>>>>>> Etc etc
>>>>>>
>>>>>> That is not good. What must have happened is that the first B1=54 frame arrived, got put in buffer #0, and an interrupt was raised. We checked the interrupt flags, found buffer #0 had something, and read the frame ok. All is good. But between the time we checked the interrupt flags and the time we finished reading the 13 bytes from buffer #0, a second frame arrived and was put in buffer #1. Then a third frame arrived and was put in buffer #0. We loop back to check the interrupt flags and find both buffers have frames ready. So we read buffer #0 to get the third frame, then buffer #1 to get the second frame. We are out of sequence.
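The race described above can be replayed with a toy model. This is only a sketch, not the driver code: `RxBuffers` and `replayRace` are made-up names, and the model assumes a new frame lands in RXB0 when that is free and otherwise rolls over to RXB1, which is what the description above implies.

```cpp
#include <cstdint>
#include <vector>

// Toy model of the two MCP2515 receive buffers (assumed rollover behaviour):
// a new frame goes to RXB0 if it is free, else to RXB1, else it is lost.
struct RxBuffers {
    bool    full[2] = {false, false};
    uint8_t b1[2]   = {0, 0};   // first data byte of the frame held
    bool    lost    = false;    // set if a frame found both buffers full

    void arrive(uint8_t value) {
        if (!full[0])      { full[0] = true; b1[0] = value; }
        else if (!full[1]) { full[1] = true; b1[1] = value; }
        else               { lost = true; }
    }

    uint8_t readAndClear(int n) {
        full[n] = false;
        return b1[n];
    }
};

// Replays the sequence from the message and returns the delivery order.
std::vector<uint8_t> replayRace() {
    RxBuffers mcp;
    std::vector<uint8_t> delivered;

    mcp.arrive(0x54);                          // frame 1 -> RXB0, interrupt raised
    // Handler sees only the RXB0 flag and starts the slow SPI read of RXB0.
    mcp.arrive(0x45);                          // frame 2 arrives mid-read -> RXB1
    delivered.push_back(mcp.readAndClear(0));  // read completes, RXB0 is freed
    mcp.arrive(0x40);                          // frame 3 -> RXB0 (free again)

    // Handler loops: both flags are set, buffers are served in fixed order #0, #1.
    for (int n = 0; n < 2; ++n)
        if (mcp.full[n]) delivered.push_back(mcp.readAndClear(n));
    return delivered;
}
```

Replaying it delivers B1 = 54, 40, 45 while the bus order was 54, 45, 40, which is the same swap seen in the debug output. Serving the buffers in a fixed #0-then-#1 order is what loses the ordering in this model.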
>>>>>>
>>>>>> By removing the ESP_LOGD statements, I can improve performance enough to get 10 out of the 11 frames, but still sometimes frames are swapped in order.
>>>>>>
>>>>>> By over-clocking the MCP2515 SPI bus (supposed to be 10MHz, but I push it to 15MHz), I can get all 11 frames, but two are out of order.
>>>>>>
>>>>>> I suppose I can minimise the chance of the out-of-order issue by repeating the call to read interrupt flags after processing buffer #0 but before checking for buffer #1. That would at least reduce the time window to as small as possible, but would be another SPI call and is too slow. Doing that brings us back to losing frames.
>>>>>>
>>>>>> Another approach may relate to our current use of the READ command to read 5 status registers (interrupt flags, error flags, two skipped, then transmit buffer #0 flags). There are two specific commands ‘read status’ (which gets the rx and tx buffer status flags in one byte), and ‘rx status’ (which gets just the receive buffer status and some info on the frames received, again in one byte). I think those are more designed for what we are trying to do. I can try to optimise the read loop at the start of the AsynchronousInterruptHandler to use one of those - they are 2 SPI bytes vs 7 for what we are doing at the moment (so more than three times as fast).
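To put rough numbers on the byte counts above, here is a small sketch. The opcodes and the RX STATUS bit positions are taken from the MCP2515 datasheet; the helper names are hypothetical, and the timing ignores chip-select setup and any SPI driver overhead, so real figures will be higher.

```cpp
#include <cstdint>

// MCP2515 SPI instruction opcodes (datasheet instruction set).
constexpr uint8_t CMD_READ        = 0x03;  // + address byte, then N data bytes
constexpr uint8_t CMD_READ_STATUS = 0xA0;  // + 1 status byte
constexpr uint8_t CMD_RX_STATUS   = 0xB0;  // + 1 status byte

// Bytes clocked for a READ of `nregs` consecutive registers:
// opcode + address + one byte per register.
constexpr unsigned readCommandBytes(unsigned nregs) { return 2 + nregs; }

// Bytes clocked for READ STATUS or RX STATUS: opcode + one result byte.
constexpr unsigned quickStatusBytes() { return 2; }

// Raw transfer time in nanoseconds at the given SPI clock.
constexpr uint64_t transferNs(unsigned bytes, uint64_t spiHz) {
    return bytes * 8ull * 1000000000ull / spiHz;
}

// RX STATUS bits 7:6 report which receive buffers hold a message.
struct RxPending { bool rxb0; bool rxb1; };

constexpr RxPending decodeRxStatus(uint8_t status) {
    return RxPending{ (status & 0x40) != 0, (status & 0x80) != 0 };
}
```

At 10MHz that works out to 5600 ns for the 7-byte READ against 1600 ns for a 2-byte status command, in line with the "more than three times as fast" estimate.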
>>>>>>
>>>>>> I think it will also be worthwhile having a look at some other open source mcp2515 drivers to see how other people are doing it.
>>>>>>
>>>>>> Regards, Mark.
>>>>>>
>>>>>>> On 4 Jun 2021, at 3:02 PM, Mark Webb-Johnson <mark at webb-johnson.net> wrote:
>>>>>>>
>>>>>>>
>>>>>>> The handler can only return one frame. As it is, if both buffers #0 and #1 have a frame, it returns #0. I am not sure if it gets called again (seems to depend on the interrupt gpio status).
>>>>>>>
>>>>>>> // Read the interrupt pin status and if it's still active (low), require another interrupt handling iteration
>>>>>>> return !gpio_get_level((gpio_num_t)m_intpin);
>>>>>>>
>>>>>>> Maybe a quick solution is to just return true, immediately after *frameReceived=true, if intflag=0x01 and (intstat & CANINTF_RX1IF)? That would dispatch the incoming frame, then call back for more (from the loop in the can object).
>>>>>>>
>>>>>>> I am not sure in general why AsynchronousInterruptHandler uses a bool frameReceived flag, and doesn’t just simply dispatch the frame immediately to the can object? That would simplify things and allow the AsynchronousInterruptHandler to handle receiving both frames in the same call. Given that MCP2515 is the only driver using AsynchronousInterruptHandler, that would be an easy fix.
>>>>>>>
>>>>>>> Regards, Mark.
>>>>>>>
>>>>>>>> On 4 Jun 2021, at 2:29 PM, Michael Balzer <dexter at expeedo.de> wrote:
>>>>>>>>
>>>>>>>> Mark,
>>>>>>>>
>>>>>>>> The handler is meant to read both buffers sequentially, and at a quick glance I don't see why it wouldn't. But it can't hurt if you do an audit of the code.
>>>>>>>>
>>>>>>>> I remember us having had that out-of-order discussion here before, when handling both RX buffers, but I don't remember the outcome. Too bad the list archives cannot be searched.
>>>>>>>>
>>>>>>>> I think it was the MCP not doing overflows from RX buffer 1 to 0. I.e. if buffer 1 still has a frame on arrival, the new frame will be lost. That means losing a frame if the handler cannot react fast enough, but receiving out of order would be worse.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Michael
>>>>>>>>
>>>>>>>>
>>>>>>>> On 04.06.21 at 04:16, Mark Webb-Johnson wrote:
>>>>>>>>> Michael,
>>>>>>>>>
>>>>>>>>> Good suggestion on the timing. I think it best to use the same timings as the Arduino library, and have committed that change. No vehicle modules currently use 1Mbps on MCP2515 anyway. Unfortunately, it didn’t resolve my problem.
>>>>>>>>>
>>>>>>>>> Looking at the error flags I see:
>>>>>>>>>
>>>>>>>>> Error flag: 0x23401c01
>>>>>>>>>
>>>>>>>>> intstat 0x23
>>>>>>>>> ERRIF Error Interrupt pending
>>>>>>>>> RX0IF Rx buffer 0 full interrupt
>>>>>>>>> RX1IF Rx buffer 1 full interrupt
>>>>>>>>>
>>>>>>>>> errflag 0x40
>>>>>>>>> RX0OVR Rx buffer 0 overflow
>>>>>>>>>
>>>>>>>>> intflag 0x1c01
>>>>>>>>> 0x01 Implied from Rx buffer 0 full
>>>>>>>>>
>>>>>>>>> 0x1c = 0001 1100
>>>>>>>>> Means RXB0 overflow. No data lost in this case (it went into RXB1)
>>>>>>>>> Means (errflag & EFLG_RX01OVR), clear RX buffer overflow flags
>>>>>>>>> Means (intstat & (CANINTF_MERRF | CANINTF_WAKIF | CANINTF_ERRIF)), clear error & wakeup interrupts
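The breakdown above can be reproduced mechanically. A minimal sketch, assuming the 32-bit error word packs intstat in the top byte, errflag in the next byte and intflag in the low 16 bits, as the breakdown suggests. The bit names follow the MCP2515 CANINTF and EFLG registers; the function names are made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Assumed packing of the logged error word, following the breakdown above.
struct ErrorWord {
    uint8_t  intstat;  // bits 31..24: CANINTF snapshot
    uint8_t  errflag;  // bits 23..16: EFLG snapshot
    uint16_t intflag;  // bits 15..0:  driver-internal flags
};

ErrorWord splitErrorWord(uint32_t w) {
    ErrorWord e;
    e.intstat = static_cast<uint8_t>(w >> 24);
    e.errflag = static_cast<uint8_t>(w >> 16);
    e.intflag = static_cast<uint16_t>(w);
    return e;
}

// Collect the names of all set bits, lowest bit first.
static std::vector<std::string> bitNames(uint8_t v, const char* const (&names)[8]) {
    std::vector<std::string> out;
    for (int bit = 0; bit < 8; ++bit)
        if (v & (1u << bit)) out.emplace_back(names[bit]);
    return out;
}

// CANINTF register, bit 0 to bit 7.
std::vector<std::string> canintfNames(uint8_t v) {
    static const char* const names[8] = {
        "RX0IF", "RX1IF", "TX0IF", "TX1IF", "TX2IF", "ERRIF", "WAKIF", "MERRF"};
    return bitNames(v, names);
}

// EFLG register, bit 0 to bit 7.
std::vector<std::string> eflgNames(uint8_t v) {
    static const char* const names[8] = {
        "EWARN", "RXWAR", "TXWAR", "RXEP", "TXEP", "TXBO", "RX0OVR", "RX1OVR"};
    return bitNames(v, names);
}
```

Feeding it 0x23401c01 gives intstat 0x23 (RX0IF, RX1IF, ERRIF) and errflag 0x40 (RX0OVR), matching the manual decode. The low 16 bits are left undecoded here since their meaning is internal to the driver.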
>>>>>>>>>
>>>>>>>>> So we have CAN frames in BOTH rx buffers #0 and #1. Looking at our driver code (mcp2515::AsynchronousInterruptHandler), it seems in that case we only read from buffer #0. From the flow I can see, we are going to lose that second frame. We’re not really handling the issue of two frames being in the buffers when the interrupt handler is called.
>>>>>>>>>
>>>>>>>>> As the architecture of mcp2515::AsynchronousInterruptHandler can only receive one frame, it is not so simple to fix. We could simply read and return the frame in buffer #0, requesting to be called again (return true), but another frame may arrive (into buffer #0) before we get called again, and that is going to result in out-of-order frames.
>>>>>>>>>
>>>>>>>>> I’ll work on improving the handling of this case.
>>>>>>>>>
>>>>>>>>> Regards, Mark.
>>>>>>>>>
>>>>>>>>> On 3 Jun 2021, at 3:07 PM, Michael Balzer <dexter at expeedo.de> wrote:
>>>>>>>>>>
>>>>>>>>>> Mark,
>>>>>>>>>>
>>>>>>>>>> I'd give the bit timing a try first, the MCP2515 seems to be very sensitive to this. I've even had some trouble finding a working configuration for the 50 kbit timing I added a couple of weeks ago.
>>>>>>>>>>
>>>>>>>>>> We currently use 00 / D0 / 82 which is also the result of the old Intrepid timing calculator. That's a propagation segment of 1 Tq and 3 Tq per phase, resulting in sampling between 50% - 62.5%.
>>>>>>>>>>
>>>>>>>>>> The Arduino MCP CAN lib by Cory Fowler also had this previously, but then changed in…
>>>>>>>>>>
>>>>>>>>>> https://github.com/coryjfowler/MCP_CAN_lib/commit/ece730cf697fef1cbe8a90111694868168d41000
>>>>>>>>>>
>>>>>>>>>> …to 00 / CA / 81, which is a propagation segment of 3 Tq and 2 Tq per phase, shifting the sampling window to 62.5 - 75%.
>>>>>>>>>>
>>>>>>>>>> Our current configuration scheme for the internal SJA1000 compatible CAN seems to sample at 62.5 - 75% as well, so that would also match.
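The two register sets can be decoded to check the segment lengths and sample points quoted above. This is a sketch under stated assumptions: the field layout is the MCP2515 CNF1/CNF2/CNF3 layout, the names `BitTiming`, `decodeCnf` and `bitRateHz` are hypothetical, and the 16 MHz oscillator used in the note below is an assumption (it is not stated in this thread, though it is consistent with these values giving 1 Mbps).

```cpp
#include <cstdint>

// Segment lengths of one CAN bit, in time quanta (Tq).
struct BitTiming {
    unsigned sync;  // synchronisation segment, always 1 Tq
    unsigned prop;  // propagation segment
    unsigned ps1;   // phase segment 1
    unsigned ps2;   // phase segment 2

    unsigned totalTq() const { return sync + prop + ps1 + ps2; }

    // The nominal sample point sits at the end of PS1.
    double samplePointPct() const {
        return 100.0 * (sync + prop + ps1) / totalTq();
    }
};

// Decode CNF2/CNF3. Each field stores (length - 1). PHSEG2 is taken from
// CNF3, which holds when BTLMODE (CNF2 bit 7) is set, as it is in both
// configurations quoted above.
BitTiming decodeCnf(uint8_t cnf2, uint8_t cnf3) {
    BitTiming t{};
    t.sync = 1;
    t.prop = (cnf2 & 0x07u) + 1;
    t.ps1  = ((cnf2 >> 3) & 0x07u) + 1;
    t.ps2  = (cnf3 & 0x07u) + 1;
    return t;
}

// Bit rate for a given oscillator: Tq = 2 * (BRP + 1) / Fosc,
// where BRP is the low six bits of CNF1.
double bitRateHz(double foscHz, uint8_t cnf1, const BitTiming& t) {
    unsigned brp = cnf1 & 0x3Fu;
    return foscHz / (2.0 * (brp + 1) * t.totalTq());
}
```

`decodeCnf(0xD0, 0x82)` gives 1 + 1 + 3 + 3 = 8 Tq with the sample point at 62.5%, and `decodeCnf(0xCA, 0x81)` gives 1 + 3 + 2 + 2 = 8 Tq with the sample point at 75%, the upper ends of the two ranges mentioned. With CNF1 = 00 and the assumed 16 MHz oscillator, 8 Tq per bit comes out at 1 Mbps.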
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Michael
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 03.06.21 at 07:36, Mark Webb-Johnson wrote:
>>>>>>>>>>>
>>>>>>>>>>> I’m working on an implementation of IP stack over CAN for the Tesla Roadster. IP frames are encoded as a length followed by a sequence of CAN frames, all on the same ID. This runs over a 1Mbps bus, so presumably the traffic volume could be high at times.
>>>>>>>>>>>
>>>>>>>>>>> I was having problems with this running on CAN2, so tried CAN1 and it worked perfectly. Here are some simple dumps of a single PING packet (and single PING response packet):
>>>>>>>>>>>
>>>>>>>>>>> ID #111 is used to transmit an IP packet, and ID #110 is used to receive an IP packet. The special empty data frame is an acknowledgment.
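For readers unfamiliar with the framing, here is a minimal sketch of the encoding as it appears in the dumps in this message. It is inferred from the traces rather than taken from the actual implementation: the 84-byte ping shows up as a `54 00` frame followed by eleven data frames, so the length is assumed to be 16 bits, low byte first. `encodeIpPacket` and `CanPayload` are hypothetical names, and the acknowledgment frame is not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using CanPayload = std::vector<uint8_t>;  // 0..8 data bytes of one CAN frame

// Split one IP packet into CAN payloads: a 2-byte length frame (assumed
// little-endian), then the packet bytes in chunks of up to 8.
// Packets longer than 65535 bytes are not handled by this sketch.
std::vector<CanPayload> encodeIpPacket(const std::vector<uint8_t>& packet) {
    std::vector<CanPayload> frames;
    const uint16_t len = static_cast<uint16_t>(packet.size());
    frames.push_back(CanPayload{static_cast<uint8_t>(len & 0xFF),
                                static_cast<uint8_t>(len >> 8)});

    for (std::size_t off = 0; off < packet.size(); off += 8) {
        const std::size_t n = std::min<std::size_t>(8, packet.size() - off);
        const auto first = packet.begin() + static_cast<std::ptrdiff_t>(off);
        frames.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
    }
    return frames;
}
```

An 84-byte packet produces 12 payloads: `54 00`, ten full 8-byte frames, and a final 4-byte frame, which matches the frame counts in the tx and rx listings.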
>>>>>>>>>>>
>>>>>>>>>>> Using latest master branch code (3.2.016-196-g0aad1a9f/ota_1/edge (build idf v3.3.4-848-g1ff5e24b1 Jun 2 2021 09:28:58)).
>>>>>>>>>>>
>>>>>>>>>>> So, first let’s test with traffic on CAN1 (active, 1Mbps), and listening on CAN2 (listen, 1Mbps):
>>>>>>>>>>>
>>>>>>>>>>> TCPDUMP:
>>>>>>>>>>>
>>>>>>>>>>> 05:57:55.980291 IP (tos 0x0, ttl 64, id 43101, offset 0, flags [DF], proto ICMP (1), length 84)
>>>>>>>>>>> 10.10.99.3 > 10.10.99.2: ICMP echo request, id 23372, seq 1, length 64
>>>>>>>>>>> 0x0000: 4500 0054 a85d 4000 4001 b832 0a0a 6303 E..T.]@.@..2..c.
>>>>>>>>>>> 0x0010: 0a0a 6302 0800 7df8 5b4c 0001 5361 b860 ..c...}.[L..Sa.`
>>>>>>>>>>> 0x0020: 19f5 0e00 0809 0a0b 0c0d 0e0f 1011 1213 ................
>>>>>>>>>>> 0x0030: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223 .............!"#
>>>>>>>>>>> 0x0040: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233 $%&'()*+,-./0123
>>>>>>>>>>> 0x0050: 3435 3637 4567
>>>>>>>>>>>
>>>>>>>>>>> 05:57:56.436190 IP (tos 0x0, ttl 64, id 14937, offset 0, flags [none], proto ICMP (1), length 84)
>>>>>>>>>>> 10.10.99.2 > 10.10.99.3: ICMP echo reply, id 23372, seq 1, length 64
>>>>>>>>>>> 0x0000: 4500 0054 3a59 0000 4001 6637 0a0a 6302 E..T:Y..@.f7..c.
>>>>>>>>>>> 0x0010: 0a0a 6303 0000 85f8 5b4c 0001 5361 b860 ..c.....[L..Sa.`
>>>>>>>>>>> 0x0020: 19f5 0e00 0809 0a0b 0c0d 0e0f 1011 1213 ................
>>>>>>>>>>> 0x0030: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223 .............!"#
>>>>>>>>>>> 0x0040: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233 $%&'()*+,-./0123
>>>>>>>>>>> 0x0050: 3435 3637 4567
>>>>>>>>>>>
>>>>>>>>>>> Traffic (as shown on PC the other end of the can log tcp connection):
>>>>>>>>>>>
>>>>>>>>>>> tx: #111 54 00
>>>>>>>>>>> tx: #111 45 00 00 54 a8 5d 40 00
>>>>>>>>>>> tx: #111 40 01 b8 32 0a 0a 63 03
>>>>>>>>>>> tx: #111 0a 0a 63 02 08 00 7d f8
>>>>>>>>>>> tx: #111 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> tx: #111 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> tx: #111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> tx: #111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> tx: #111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> tx: #111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> tx: #111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> tx: #111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> rx: #110
>>>>>>>>>>> rx: #110 54 00
>>>>>>>>>>> rx: #110 45 00 00 54 3a 59 00 00
>>>>>>>>>>> rx: #110 40 01 66 37 0a 0a 63 02
>>>>>>>>>>> rx: #110 0a 0a 63 03 00 00 85 f8
>>>>>>>>>>> rx: #110 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> rx: #110 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> rx: #110 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> rx: #110 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> rx: #110 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> rx: #110 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> rx: #110 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> rx: #110 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> CAN1 active:
>>>>>>>>>>>
>>>>>>>>>>> 1T11 111 54 00
>>>>>>>>>>> 1R11 110
>>>>>>>>>>>
>>>>>>>>>>> 1CER TX_Queue T11 111 40 01 b8 32 0a 0a 63 03
>>>>>>>>>>> 1T11 111 45 00 00 54 a8 5d 40 00
>>>>>>>>>>> 1T11 111 40 01 b8 32 0a 0a 63 03
>>>>>>>>>>> 1CER TX_Queue T11 111 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> 1T11 111 0a 0a 63 02 08 00 7d f8
>>>>>>>>>>> 1T11 111 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> 1CER TX_Queue T11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1T11 111 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> 1T11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1CER TX_Queue T11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1T11 111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1T11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1CER TX_Queue T11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1T11 111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1T11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1T11 111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> 1R11 110 54 00
>>>>>>>>>>> 1R11 110 45 00 00 54 3a 59 00 00
>>>>>>>>>>> 1R11 110 40 01 66 37 0a 0a 63 02
>>>>>>>>>>> 1R11 110 0a 0a 63 03 00 00 85 f8
>>>>>>>>>>> 1R11 110 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> 1R11 110 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> 1R11 110 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1R11 110 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1R11 110 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1R11 110 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1R11 110 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1R11 110 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> CAN2 listen:
>>>>>>>>>>>
>>>>>>>>>>> 2R11 111 54 00
>>>>>>>>>>> 2R11 110
>>>>>>>>>>>
>>>>>>>>>>> 2R11 111 45 00 00 54 a8 5d 40 00
>>>>>>>>>>> 2R11 111 40 01 b8 32 0a 0a 63 03
>>>>>>>>>>> 2R11 111 0a 0a 63 02 08 00 7d f8
>>>>>>>>>>> 2R11 111 5b 4c 00 01 53 61 b8 60
>>>>>>>>>>> 2R11 111 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> 2R11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 2R11 111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 2R11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 2R11 111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 2R11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 2R11 111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> 2R11 110 54 00
>>>>>>>>>>> 2CER Error intr=10 rxpkt=14 txpkt=0 errflags=0x23401c01 rxerr=0 txerr=0 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=0 wdgreset=0 errreset=0
>>>>>>>>>>> 2R11 110 40 01 66 37 0a 0a 63 02
>>>>>>>>>>> 2R11 110 19 f5 0e 00 08 09 0a 0b
>>>>>>>>>>> 2R11 110 34 35 36 37
>>>>>>>>>>> 2R11 110 45 00 00 54 3a 59 00 00
>>>>>>>>>>>
>>>>>>>>>>> Conclusion is that the CAN1 traffic looks fine, and the PING packet gets a good reply. All successful. But the CAN2 listen is missing a few packets and the last packet is out of order.
>>>>>>>>>>>
>>>>>>>>>>> Now, let’s test with traffic on CAN2 (active, 1Mbps), and listening on CAN1 (listen, 1Mbps):
>>>>>>>>>>>
>>>>>>>>>>> TCPDUMP:
>>>>>>>>>>>
>>>>>>>>>>> 06:00:33.004060 IP (tos 0x0, ttl 64, id 58240, offset 0, flags [DF], proto ICMP (1), length 84)
>>>>>>>>>>> 10.10.99.3 > 10.10.99.2: ICMP echo request, id 23393, seq 1, length 64
>>>>>>>>>>> 0x0000: 4500 0054 e380 4000 4001 7d0f 0a0a 6303 E..T..@.@.}...c.
>>>>>>>>>>> 0x0010: 0a0a 6302 0800 7cc8 5b61 0001 f161 b860 ..c...|.[a...a.`
>>>>>>>>>>> 0x0020: 8b0f 0000 0809 0a0b 0c0d 0e0f 1011 1213 ................
>>>>>>>>>>> 0x0030: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223 .............!"#
>>>>>>>>>>> 0x0040: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233 $%&'()*+,-./0123
>>>>>>>>>>> 0x0050: 3435 3637 4567
>>>>>>>>>>>
>>>>>>>>>>> Traffic (as shown on PC the other end of the can log tcp connection):
>>>>>>>>>>>
>>>>>>>>>>> tx: #111 54 00
>>>>>>>>>>> tx: #111 45 00 00 54 e3 80 40 00
>>>>>>>>>>> tx: #111 40 01 7d 0f 0a 0a 63 03
>>>>>>>>>>> tx: #111 0a 0a 63 02 08 00 7c c8
>>>>>>>>>>> tx: #111 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> tx: #111 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> tx: #111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> tx: #111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> tx: #111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> tx: #111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> tx: #111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> tx: #111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> rx: #110
>>>>>>>>>>> rx: #110 54 00
>>>>>>>>>>> rx: #110 45 00 00 54 3a 5a 00 00
>>>>>>>>>>> rx: #110 0a 0a 63 03 00 00 84 c8
>>>>>>>>>>> rx: #110 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> rx: #110 34 35 36 37
>>>>>>>>>>> rx: #110 40 01 66 36 0a 0a 63 02
>>>>>>>>>>>
>>>>>>>>>>> CAN2 active:
>>>>>>>>>>>
>>>>>>>>>>> 2T11 111 54 00
>>>>>>>>>>> 2R11 110
>>>>>>>>>>>
>>>>>>>>>>> 2CER TX_Queue T11 111 40 01 7d 0f 0a 0a 63 03
>>>>>>>>>>> 2T11 111 45 00 00 54 e3 80 40 00
>>>>>>>>>>> 2T11 111 40 01 7d 0f 0a 0a 63 03
>>>>>>>>>>> 2CER TX_Queue T11 111 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 2T11 111 0a 0a 63 02 08 00 7c c8
>>>>>>>>>>> 2T11 111 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 2CER TX_Queue T11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 2T11 111 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 2T11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 2CER TX_Queue T11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 2T11 111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 2T11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 2CER TX_Queue T11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 2T11 111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 2T11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 2T11 111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> 2R11 110 54 00
>>>>>>>>>>> 2R11 110 45 00 00 54 3a 5a 00 00
>>>>>>>>>>> 2CER Error intr=15 rxpkt=3 txpkt=12 errflags=0x23401c01 rxerr=0 txerr=0 rxinval=0 rxovr=0 txovr=0 txdelay=5 txfail=0 wdgreset=0 errreset=0
>>>>>>>>>>> 2R11 110 0a 0a 63 03 00 00 84 c8
>>>>>>>>>>> 2R11 110 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 2R11 110 34 35 36 37
>>>>>>>>>>> 2R11 110 40 01 66 36 0a 0a 63 02
>>>>>>>>>>>
>>>>>>>>>>> CAN1 listen:
>>>>>>>>>>>
>>>>>>>>>>> 1R11 111 54 00
>>>>>>>>>>> 1R11 110
>>>>>>>>>>>
>>>>>>>>>>> 1R11 111 45 00 00 54 e3 80 40 00
>>>>>>>>>>> 1R11 111 40 01 7d 0f 0a 0a 63 03
>>>>>>>>>>> 1R11 111 0a 0a 63 02 08 00 7c c8
>>>>>>>>>>> 1R11 111 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 1R11 111 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 1R11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1R11 111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1R11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1R11 111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1R11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1R11 111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> 1R11 110 54 00
>>>>>>>>>>> 1R11 110 45 00 00 54 3a 5a 00 00
>>>>>>>>>>> 1R11 110 40 01 66 36 0a 0a 63 02
>>>>>>>>>>> 1R11 110 0a 0a 63 03 00 00 84 c8
>>>>>>>>>>> 1R11 110 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 1R11 110 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 1R11 110 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1R11 110 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1R11 110 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1R11 110 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1R11 110 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1R11 110 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> Conclusion is that the CAN2 transmit traffic looks fine, but no PING reply received via CAN. The CAN1 listen shows the reply just fine.
>>>>>>>>>>>
>>>>>>>>>>> Here is that last CAN1 listen, with timestamps:
>>>>>>>>>>>
>>>>>>>>>>> 1622696433.080107 1R11 111 54 00
>>>>>>>>>>> 1622696433.081657 1R11 110
>>>>>>>>>>>
>>>>>>>>>>> 1622696433.227479 1R11 111 45 00 00 54 e3 80 40 00
>>>>>>>>>>> 1622696433.228318 1R11 111 40 01 7d 0f 0a 0a 63 03
>>>>>>>>>>> 1622696433.245727 1R11 111 0a 0a 63 02 08 00 7c c8
>>>>>>>>>>> 1622696433.246214 1R11 111 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 1622696433.248219 1R11 111 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 1622696433.248772 1R11 111 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1622696433.250774 1R11 111 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1622696433.251338 1R11 111 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1622696433.253380 1R11 111 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1622696433.253944 1R11 111 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1622696433.265937 1R11 111 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> 1622696433.269221 1R11 110 54 00
>>>>>>>>>>> 1622696433.272095 1R11 110 45 00 00 54 3a 5a 00 00
>>>>>>>>>>> 1622696433.272125 1R11 110 40 01 66 36 0a 0a 63 02
>>>>>>>>>>> 1622696433.272156 1R11 110 0a 0a 63 03 00 00 84 c8
>>>>>>>>>>> 1622696433.272193 1R11 110 5b 61 00 01 f1 61 b8 60
>>>>>>>>>>> 1622696433.272245 1R11 110 8b 0f 00 00 08 09 0a 0b
>>>>>>>>>>> 1622696433.272277 1R11 110 0c 0d 0e 0f 10 11 12 13
>>>>>>>>>>> 1622696433.272314 1R11 110 14 15 16 17 18 19 1a 1b
>>>>>>>>>>> 1622696433.272354 1R11 110 1c 1d 1e 1f 20 21 22 23
>>>>>>>>>>> 1622696433.272387 1R11 110 24 25 26 27 28 29 2a 2b
>>>>>>>>>>> 1622696433.272420 1R11 110 2c 2d 2e 2f 30 31 32 33
>>>>>>>>>>> 1622696433.272452 1R11 110 34 35 36 37
>>>>>>>>>>>
>>>>>>>>>>> It is 1Mbps, with 30us or so between each packet. This is the only traffic on the bus. Everything else is turned off. Roughly 12 packets each way. Surely even if we were hitting a performance limit, our buffers can handle 12 packets?
>>>>>>>>>>>
>>>>>>>>>>> The good news is that I have a good environment to replicate this issue now, so any fix should be easy to test.
>>>>>>>>>>>
>>>>>>>>>>> I haven’t worked on the MCP2515 driver in our code in a long time, but it certainly seems something is messed up and that could be badly affecting vehicle modules using anything other than CAN1.
>>>>>>>>>>>
>>>>>>>>>>> I will start to look at this over the weekend, but has anyone got any ideas/suggestions? Perhaps the bit timing registers are off by a small amount (so it works on CAN1 but not on CAN2)? Or something more serious in our driver?
>>>>>>>>>>>
>>>>>>>>>>> Regards, Mark.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> OvmsDev mailing list
>>>>>>>>>>> OvmsDev at lists.openvehicles.com
>>>>>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>>>>>> <MCP2515Calc-1000kbit.ods>