[Ovmsdev] MCP2515 driver bug?

Mark Webb-Johnson mark at webb-johnson.net
Thu Jun 3 13:36:33 HKT 2021


I’m working on an implementation of IP stack over CAN for the Tesla Roadster. IP frames are encoded as a length followed by a sequence of CAN frames, all on the same ID. This runs over a 1MHz bus, so presumably the traffic volume could be high at times.

I was having problems with this running on CAN2, so tried CAN1 and it worked perfectly. Here are some simple dumps of a single PING packet (and single PING response packet):

ID #111 is used to transmit an IP packet, and ID #110 is used to receive an IP packet. The special empty data frame is an acknowledgment.

Using latest master branch code (3.2.016-196-g0aad1a9f/ota_1/edge (build idf v3.3.4-848-g1ff5e24b1 Jun  2 2021 09:28:58)).

So, first let’s test with traffic on CAN1 (active, 1Mbps), and listening on CAN2 (listen, 1Mbps):

TCPDUMP:

05:57:55.980291 IP (tos 0x0, ttl 64, id 43101, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.99.3 > 10.10.99.2: ICMP echo request, id 23372, seq 1, length 64
	0x0000:  4500 0054 a85d 4000 4001 b832 0a0a 6303  E..T.]@. at ..2..c.
	0x0010:  0a0a 6302 0800 7df8 5b4c 0001 5361 b860  ..c...}.[L..Sa.`
	0x0020:  19f5 0e00 0809 0a0b 0c0d 0e0f 1011 1213  ................
	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
	0x0050:  3435 3637                                4567

05:57:56.436190 IP (tos 0x0, ttl 64, id 14937, offset 0, flags [none], proto ICMP (1), length 84)
    10.10.99.2 > 10.10.99.3: ICMP echo reply, id 23372, seq 1, length 64
	0x0000:  4500 0054 3a59 0000 4001 6637 0a0a 6302  E..T:Y.. at .f7..c.
	0x0010:  0a0a 6303 0000 85f8 5b4c 0001 5361 b860  ..c.....[L..Sa.`
	0x0020:  19f5 0e00 0809 0a0b 0c0d 0e0f 1011 1213  ................
	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
	0x0050:  3435 3637                                4567

Traffic (as shown on PC the other end of the can log tcp connection):

tx: #111 54 00
tx: #111 45 00 00 54 a8 5d 40 00
tx: #111 40 01 b8 32 0a 0a 63 03
tx: #111 0a 0a 63 02 08 00 7d f8
tx: #111 5b 4c 00 01 53 61 b8 60
tx: #111 19 f5 0e 00 08 09 0a 0b
tx: #111 0c 0d 0e 0f 10 11 12 13
tx: #111 14 15 16 17 18 19 1a 1b
tx: #111 1c 1d 1e 1f 20 21 22 23
tx: #111 24 25 26 27 28 29 2a 2b
tx: #111 2c 2d 2e 2f 30 31 32 33
tx: #111 34 35 36 37

rx: #110
rx: #110 54 00
rx: #110 45 00 00 54 3a 59 00 00
rx: #110 40 01 66 37 0a 0a 63 02
rx: #110 0a 0a 63 03 00 00 85 f8
rx: #110 5b 4c 00 01 53 61 b8 60
rx: #110 19 f5 0e 00 08 09 0a 0b
rx: #110 0c 0d 0e 0f 10 11 12 13
rx: #110 14 15 16 17 18 19 1a 1b
rx: #110 1c 1d 1e 1f 20 21 22 23
rx: #110 24 25 26 27 28 29 2a 2b
rx: #110 2c 2d 2e 2f 30 31 32 33
rx: #110 34 35 36 37

CAN1 active:

1T11 111 54 00
1R11 110

1CER TX_Queue T11 111 40 01 b8 32 0a 0a 63 03
1T11 111 45 00 00 54 a8 5d 40 00
1T11 111 40 01 b8 32 0a 0a 63 03
1CER TX_Queue T11 111 5b 4c 00 01 53 61 b8 60
1T11 111 0a 0a 63 02 08 00 7d f8
1T11 111 5b 4c 00 01 53 61 b8 60
1CER TX_Queue T11 111 0c 0d 0e 0f 10 11 12 13
1T11 111 19 f5 0e 00 08 09 0a 0b
1T11 111 0c 0d 0e 0f 10 11 12 13
1CER TX_Queue T11 111 1c 1d 1e 1f 20 21 22 23
1T11 111 14 15 16 17 18 19 1a 1b
1T11 111 1c 1d 1e 1f 20 21 22 23
1CER TX_Queue T11 111 2c 2d 2e 2f 30 31 32 33
1T11 111 24 25 26 27 28 29 2a 2b
1T11 111 2c 2d 2e 2f 30 31 32 33
1T11 111 34 35 36 37

1R11 110 54 00
1R11 110 45 00 00 54 3a 59 00 00
1R11 110 40 01 66 37 0a 0a 63 02
1R11 110 0a 0a 63 03 00 00 85 f8
1R11 110 5b 4c 00 01 53 61 b8 60
1R11 110 19 f5 0e 00 08 09 0a 0b
1R11 110 0c 0d 0e 0f 10 11 12 13
1R11 110 14 15 16 17 18 19 1a 1b
1R11 110 1c 1d 1e 1f 20 21 22 23
1R11 110 24 25 26 27 28 29 2a 2b
1R11 110 2c 2d 2e 2f 30 31 32 33
1R11 110 34 35 36 37

CAN2 listen:

2R11 111 54 00
2R11 110

2R11 111 45 00 00 54 a8 5d 40 00
2R11 111 40 01 b8 32 0a 0a 63 03
2R11 111 0a 0a 63 02 08 00 7d f8
2R11 111 5b 4c 00 01 53 61 b8 60
2R11 111 19 f5 0e 00 08 09 0a 0b
2R11 111 0c 0d 0e 0f 10 11 12 13
2R11 111 14 15 16 17 18 19 1a 1b
2R11 111 1c 1d 1e 1f 20 21 22 23
2R11 111 24 25 26 27 28 29 2a 2b
2R11 111 2c 2d 2e 2f 30 31 32 33
2R11 111 34 35 36 37

2R11 110 54 00
2CER Error intr=10 rxpkt=14 txpkt=0 errflags=0x23401c01 rxerr=0 txerr=0 rxinval=0 rxovr=0 txovr=0 txdelay=0 txfail=0 wdgreset=0 errreset=0
2R11 110 40 01 66 37 0a 0a 63 02
2R11 110 19 f5 0e 00 08 09 0a 0b
2R11 110 34 35 36 37
2R11 110 45 00 00 54 3a 59 00 00

Conclusion is that the CAN1 traffic looks fine, and the PING packet gets a good reply. All successful. But the CAN2 listen is missing a few packets and the last packet is out of order.

Now, let’s test with traffic on CAN2 (active, 1Mbps), and listening on CAN1 (listen, 1Mbps):

TCPDUMP:

06:00:33.004060 IP (tos 0x0, ttl 64, id 58240, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.99.3 > 10.10.99.2: ICMP echo request, id 23393, seq 1, length 64
	0x0000:  4500 0054 e380 4000 4001 7d0f 0a0a 6303  E..T.. at .@.}...c.
	0x0010:  0a0a 6302 0800 7cc8 5b61 0001 f161 b860  ..c...|.[a...a.`
	0x0020:  8b0f 0000 0809 0a0b 0c0d 0e0f 1011 1213  ................
	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
	0x0050:  3435 3637                                4567

Traffic (as shown on PC the other end of the can log tcp connection):

tx: #111 54 00
tx: #111 45 00 00 54 e3 80 40 00
tx: #111 40 01 7d 0f 0a 0a 63 03
tx: #111 0a 0a 63 02 08 00 7c c8
tx: #111 5b 61 00 01 f1 61 b8 60
tx: #111 8b 0f 00 00 08 09 0a 0b
tx: #111 0c 0d 0e 0f 10 11 12 13
tx: #111 14 15 16 17 18 19 1a 1b
tx: #111 1c 1d 1e 1f 20 21 22 23
tx: #111 24 25 26 27 28 29 2a 2b
tx: #111 2c 2d 2e 2f 30 31 32 33
tx: #111 34 35 36 37

rx: #110
rx: #110 54 00
rx: #110 45 00 00 54 3a 5a 00 00
rx: #110 0a 0a 63 03 00 00 84 c8
rx: #110 8b 0f 00 00 08 09 0a 0b
rx: #110 34 35 36 37
rx: #110 40 01 66 36 0a 0a 63 02

CAN2 active:

2T11 111 54 00
2R11 110

2CER TX_Queue T11 111 40 01 7d 0f 0a 0a 63 03
2T11 111 45 00 00 54 e3 80 40 00
2T11 111 40 01 7d 0f 0a 0a 63 03
2CER TX_Queue T11 111 5b 61 00 01 f1 61 b8 60
2T11 111 0a 0a 63 02 08 00 7c c8
2T11 111 5b 61 00 01 f1 61 b8 60
2CER TX_Queue T11 111 0c 0d 0e 0f 10 11 12 13
2T11 111 8b 0f 00 00 08 09 0a 0b
2T11 111 0c 0d 0e 0f 10 11 12 13
2CER TX_Queue T11 111 1c 1d 1e 1f 20 21 22 23
2T11 111 14 15 16 17 18 19 1a 1b
2T11 111 1c 1d 1e 1f 20 21 22 23
2CER TX_Queue T11 111 2c 2d 2e 2f 30 31 32 33
2T11 111 24 25 26 27 28 29 2a 2b
2T11 111 2c 2d 2e 2f 30 31 32 33
2T11 111 34 35 36 37

2R11 110 54 00
2R11 110 45 00 00 54 3a 5a 00 00
2CER Error intr=15 rxpkt=3 txpkt=12 errflags=0x23401c01 rxerr=0 txerr=0 rxinval=0 rxovr=0 txovr=0 txdelay=5 txfail=0 wdgreset=0 errreset=0
2R11 110 0a 0a 63 03 00 00 84 c8
2R11 110 8b 0f 00 00 08 09 0a 0b
2R11 110 34 35 36 37
2R11 110 40 01 66 36 0a 0a 63 02

CAN1 listen:

1R11 111 54 00
1R11 110

1R11 111 45 00 00 54 e3 80 40 00
1R11 111 40 01 7d 0f 0a 0a 63 03
1R11 111 0a 0a 63 02 08 00 7c c8
1R11 111 5b 61 00 01 f1 61 b8 60
1R11 111 8b 0f 00 00 08 09 0a 0b
1R11 111 0c 0d 0e 0f 10 11 12 13
1R11 111 14 15 16 17 18 19 1a 1b
1R11 111 1c 1d 1e 1f 20 21 22 23
1R11 111 24 25 26 27 28 29 2a 2b
1R11 111 2c 2d 2e 2f 30 31 32 33
1R11 111 34 35 36 37

1R11 110 54 00
1R11 110 45 00 00 54 3a 5a 00 00
1R11 110 40 01 66 36 0a 0a 63 02
1R11 110 0a 0a 63 03 00 00 84 c8
1R11 110 5b 61 00 01 f1 61 b8 60
1R11 110 8b 0f 00 00 08 09 0a 0b
1R11 110 0c 0d 0e 0f 10 11 12 13
1R11 110 14 15 16 17 18 19 1a 1b
1R11 110 1c 1d 1e 1f 20 21 22 23
1R11 110 24 25 26 27 28 29 2a 2b
1R11 110 2c 2d 2e 2f 30 31 32 33
1R11 110 34 35 36 37

Conclusion is that the CAN2 transmit traffic looks fine, but no PING reply received via CAN. The CAN1 listen shows the reply just fine.

Here is that last CAN1 listen, with timestamps:

1622696433.080107 1R11 111 54 00
1622696433.081657 1R11 110

1622696433.227479 1R11 111 45 00 00 54 e3 80 40 00
1622696433.228318 1R11 111 40 01 7d 0f 0a 0a 63 03
1622696433.245727 1R11 111 0a 0a 63 02 08 00 7c c8
1622696433.246214 1R11 111 5b 61 00 01 f1 61 b8 60
1622696433.248219 1R11 111 8b 0f 00 00 08 09 0a 0b
1622696433.248772 1R11 111 0c 0d 0e 0f 10 11 12 13
1622696433.250774 1R11 111 14 15 16 17 18 19 1a 1b
1622696433.251338 1R11 111 1c 1d 1e 1f 20 21 22 23
1622696433.253380 1R11 111 24 25 26 27 28 29 2a 2b
1622696433.253944 1R11 111 2c 2d 2e 2f 30 31 32 33
1622696433.265937 1R11 111 34 35 36 37

1622696433.269221 1R11 110 54 00
1622696433.272095 1R11 110 45 00 00 54 3a 5a 00 00
1622696433.272125 1R11 110 40 01 66 36 0a 0a 63 02
1622696433.272156 1R11 110 0a 0a 63 03 00 00 84 c8
1622696433.272193 1R11 110 5b 61 00 01 f1 61 b8 60
1622696433.272245 1R11 110 8b 0f 00 00 08 09 0a 0b
1622696433.272277 1R11 110 0c 0d 0e 0f 10 11 12 13
1622696433.272314 1R11 110 14 15 16 17 18 19 1a 1b
1622696433.272354 1R11 110 1c 1d 1e 1f 20 21 22 23
1622696433.272387 1R11 110 24 25 26 27 28 29 2a 2b
1622696433.272420 1R11 110 2c 2d 2e 2f 30 31 32 33
1622696433.272452 1R11 110 34 35 36 37

It is 1Mbps, with 30us or so between each packet. This is the only traffic on the bus. Everything else is turned off. Roughly 12 packets each way. Surely even if we were hitting a performance limit, our buffers can handle 12 packets?

The good news is that I have a good environment to replicate this issue now, so any fix should be easy to test.

I haven’t worked on the MCP2515 driver in our code in a long time, but it certainly seems something is messed up and that could be badly affecting vehicle modules using anything other than CAN1.

I will start to look at this over the weekend, but has anyone got any ideas/suggestions? Perhaps the bit timing registers are off by a small amount (so it works on CAN1 but not on CAN2)? Or something more serious in our driver?

Regards, Mark.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20210603/fc8e11d1/attachment-0001.htm>


More information about the OvmsDev mailing list