[Ovmsdev] MCP2515 driver bug?
Mark Webb-Johnson
mark at webb-johnson.net
Tue Jun 8 10:46:43 HKT 2021
Collin,
In the case of this specific protocol, I have no control over the protocol or the firmware at the other end. I guess they are relying on this being a point-to-point CAN bus connection on a private bus with known components at both ends, and on the TCP/IP layer providing the error detection and correction. They have a dedicated bus used for this, at 1 Mbit/s, with only the VMS at one end and a diagnostic device at the other. They also seem to pace the communications with some sort of ‘go ahead’ signal. I don’t have any documentation on the protocol itself, and am just trying to reverse-engineer it. So far it seems to be simply:
Client->Server: N bytes are coming
Server->Client: OK, go ahead
Client->Server: Send a sequence of CAN frames, totalling N bytes of data
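As a rough illustration of the third step, the receiving side could collect frame payloads until the announced N bytes have arrived. This is only a sketch: the layout of the ‘N bytes are coming’ and ‘OK, go ahead’ frames is not known, so they are left out, and the names reassemble and CAN_MAX_PAYLOAD are made up for the example.

```python
from typing import Iterable

CAN_MAX_PAYLOAD = 8  # classic CAN carries at most 8 data bytes per frame


def reassemble(announced_len: int, payloads: Iterable[bytes]) -> bytes:
    """Collect frame payloads until announced_len bytes have arrived.

    Hypothetical helper: it assumes the payloads are plain data bytes
    with no per-frame header, which has not been confirmed.
    """
    buf = bytearray()
    for data in payloads:
        if len(data) > CAN_MAX_PAYLOAD:
            raise ValueError("payload longer than a classic CAN frame")
        buf.extend(data)
        if len(buf) >= announced_len:
            break
    if len(buf) != announced_len:
        # Too few bytes (a frame was lost) or too many (out of sync).
        raise ValueError(f"expected {announced_len} bytes, got {len(buf)}")
    return bytes(buf)
```

A length mismatch is the only error this can notice; a lost frame followed by extra data of the right size would go undetected.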
The data itself is simply an IP datagram (which can be fed directly into the IP stack using a TUN device on Linux - I still need to work out how to feed that into LWIP on ESP32, but it doesn’t seem hard). Strangely, I don’t seem to require the ‘OK, go ahead’ at the client side (for traffic coming the other way). I have no idea what it does if things get out of sequence. Perhaps that is the reason for the ‘OK, go ahead’ message (as, coupled with timeouts, it would allow for comms to be re-synced).
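For the Linux side, feeding a datagram into the IP stack through a TUN device can be sketched roughly as follows. The constants come from linux/if_tun.h; the interface name tun0 and the helper names are just placeholders, and opening the device needs root (or CAP_NET_ADMIN). The LWIP side on ESP32 is not covered here.

```python
import fcntl
import os
import struct

TUNSETIFF = 0x400454CA  # ioctl request from <linux/if_tun.h>
IFF_TUN = 0x0001        # layer-3 device: carries raw IP packets
IFF_NO_PI = 0x1000      # no packet-info header in front of each packet


def build_ifreq(name: str) -> bytes:
    """Pack interface name and flags, padded to 40 bytes (ifreq on 64-bit)."""
    return struct.pack("16sH22x", name.encode("ascii"), IFF_TUN | IFF_NO_PI)


def open_tun(name: str = "tun0") -> int:
    """Open a TUN interface and return its file descriptor (needs root)."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    return fd


def inject(fd: int, datagram: bytes) -> None:
    """Hand one complete IP datagram to the kernel's IP stack."""
    os.write(fd, datagram)
```

Each os.write() must carry exactly one whole datagram, so the bytes from the CAN frames have to be reassembled before being written. The interface still needs an address and to be brought up before traffic flows.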
The only reason I am doing this at all is that it provides the holy grail of full access to the VMS. Access to logs, configuration, etc. That said, improving our driver may help us in general, and provide a framework for others to look at.
While UDP can deliver frames out of order (as it is multi-hop, with route potentially determined on a per-packet basis, and some routes may be quicker than others), can that happen with CAN? Missing frames on CAN is a definite possibility.
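If missing frames need to be detected, the usual approach is a small rolling counter in each frame, as ISO-TP does with the 4-bit sequence number in its consecutive frames. The protocol I am looking at does not appear to carry one, so the following is purely a sketch of the idea, with a made-up function name and a counter that is assumed to start at 0 and wrap at 16.

```python
def find_gaps(seq_numbers, modulus=16, first=0):
    """Return (index, expected, got) for every frame whose counter is off.

    Hypothetical checker for a rolling per-frame sequence counter.
    """
    problems = []
    expected = first % modulus
    for i, seq in enumerate(seq_numbers):
        if seq != expected:
            problems.append((i, expected, seq))
        # Resynchronise on what was actually received.
        expected = (seq + 1) % modulus
    return problems
```

For example, find_gaps([0, 1, 3, 4]) reports a single problem at index 2, where 2 was expected and 3 arrived. A 4-bit counter cannot notice a loss of exactly 16 consecutive frames, which is one reason to pair it with a total length check.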
That said, I do think the drivers should be written not to re-order frames arbitrarily. It is one thing for the communication channel to do that, but quite another for a driver.
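One driver-side hazard raised in the quoted discussion below is clearing interrupt flags that were set after the handler read them. The toy model here is not the real driver and does not talk to hardware; FakeController and service are invented for the example, and only the RX0IF/RX1IF bit positions follow the MCP2515 CANINTF register. It just contrasts ‘clear everything’ with ‘clear only what was handled’.

```python
RX0IF = 0x01  # receive buffer 0 full
RX1IF = 0x02  # receive buffer 1 full


class FakeController:
    """Toy stand-in for a CAN controller with two receive buffers."""

    def __init__(self):
        self.flags = 0
        self.buffers = {RX0IF: None, RX1IF: None}

    def receive(self, flag, frame):
        self.buffers[flag] = frame
        self.flags |= flag


def service(ctrl, late_frame=None, clear_all=False):
    """Run one interrupt pass; late_frame lands in buffer 1 mid-handler."""
    seen = ctrl.flags                      # snapshot of the interrupt flags
    if late_frame is not None:
        ctrl.receive(RX1IF, late_frame)    # arrives after the snapshot
    frames = [ctrl.buffers[f] for f in (RX0IF, RX1IF) if seen & f]
    ctrl.flags &= 0 if clear_all else ~seen
    return frames
```

With clear_all=True, a frame that arrives in buffer 1 after the snapshot has its flag wiped and is never reported. With the default, its flag survives and the next pass picks it up.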
Regards, Mark.
> On 7 Jun 2021, at 9:44 PM, Collin Kidder <collink at kkmfg.com> wrote:
>
> I totally do not disagree with you. The handling is not very good at
> all - and I might have even written that part... Surely it could be
> handled better. However, nobody has fixed it yet. Perhaps I ought to
> create an issue on GitHub to remind myself that someone has to revisit
> that code at some point.
>
> One thing I would like to inject into this discussion, though, is that
> counting on proper in-order reception and no dropped frames is not
> generally advisable. CAN is best treated like UDP. Some packets might
> arrive out of order, some not at all. Any multiframe transmission over
> CAN really needs to include some sort of means to identify lost or out
> of order frames. You simply cannot count on CAN traffic to reliably
> get from point A to point B 100% of the time. This is especially true
> once you get onto a real bus where other frames are coming in and the
> bus is loaded up. It's one thing to try this on an empty bus and get
> it to work but that work is not likely to transition over reliably to
> "the real thing." It's best to not even entertain the notion that it
> could be possible. Some form of higher level protocol that keeps track
> of the order and sequence is necessary. Yes, I know technically no
> dropped frames and in order reception are possible if the drivers on
> both sides are properly coded. But, in practice it tends to be quite
> difficult to ensure this 100% of the time. Once again, this is JUST
> like UDP. It can be reliable... most of the time. But, it isn't
> guaranteed to be so and it's best to take proper precautions against
> the problem. This is why things like ISO-TP have sequence bytes. Also,
> see here: https://datatracker.ietf.org/doc/html/draft-cafi-can-ip-00
>
> On Sun, Jun 6, 2021 at 4:16 PM Michael Balzer <dexter at expeedo.de> wrote:
>> That issue isn't addressed by that driver either; in fact their
>> handling is much worse than ours, even without rollover enabled. As you
>> can see in their interrupt handler…
>>
>> https://github.com/collin80/esp32_can/blob/master/src/mcp2515.cpp#L892
>>
>> …they read the interrupt flags, then check & read both buffers based on
>> the flags read, then clear all (!) interrupt flags.
>>
>> So if an interrupt occurs for buffer 0, and the next frame comes in for
>> buffer 1 right after the driver read the interrupts, their driver will
>> only read buffer 0 and then clear all interrupts, thus losing that for
>> buffer 1.
>>