[Ovmsdev] CAN drivers: fix & harmonize frame transmission failure handling
Michael Balzer
dexter at expeedo.de
Sat Jan 9 21:18:15 HKT 2021
As all tests were positive and without issues, I've merged the rework
into master.
I'm now considering extending the poller to allow hooking into
transmission failures.
Also, if a TX callback is present in a frame, I don't think we need the
error log entry from the CAN framework. That would eliminate most CAN
error log messages from regular poller "pings".
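
To illustrate the idea, here is a minimal sketch. It is not the actual OVMS code: Frame, CompleteTx, BusMonitor and g_error_log are hypothetical names made up for this example, and the OBD-II request used as the "ping" is only a placeholder. It assumes the driver reports exactly one terminal result per frame, and shows the framework skipping its own error log entry whenever the frame carries a TX callback.

```cpp
// Hypothetical sketch, NOT the OVMS API: all names here are illustrative.
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Simplified stand-in for a CAN frame carrying an optional TX callback.
struct Frame {
  uint32_t id = 0;
  uint8_t data[8] = {};
  uint8_t len = 0;
  // Called once when the TX result is final; success=false means aborted.
  std::function<void(const Frame&, bool success)> tx_callback;
};

// Stand-in for the framework's error log.
std::vector<std::string> g_error_log;

// Framework side: the driver reports the terminal TX result here, once.
void CompleteTx(const Frame& frame, bool success) {
  if (frame.tx_callback) {
    // The sender handles the result itself, so no framework log entry.
    frame.tx_callback(frame, success);
  } else if (!success) {
    g_error_log.push_back("TX failed for ID " + std::to_string(frame.id));
  }
}

// Application side: a "ping" whose TX result tells us if the bus is awake.
// The monitor must outlive any frame it creates (the callback captures it).
struct BusMonitor {
  bool bus_online = false;
  int results_seen = 0;

  Frame MakePing() {
    Frame f;
    f.id = 0x7DF;      // OBD-II functional request ID, example only
    f.len = 3;
    f.data[0] = 0x02;  // two payload bytes follow
    f.data[1] = 0x01;  // service 01: show current data
    f.data[2] = 0x00;  // PID 00: supported PIDs
    f.tx_callback = [this](const Frame&, bool success) {
      bus_online = success;
      ++results_seen;
    };
    return f;
  }
};
```

With this shape, a failed poller "ping" only flips bus_online to false and leaves the error log untouched, while a frame sent without a callback still produces a log entry on failure.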
Regards,
Michael
On 08.01.21 at 18:52, Michael Balzer wrote:
> Steve,
>
> thanks, that's perfect. The failure handling works as designed in your
> case.
>
> Regarding your question:
>> It does prompt me to ask a question that I had - On the i3, if you do
>> something like send a lock from the key or the Connected Drive app
>> then the OBD-II comes alive but goes to sleep again in less than a minute.
>>
>> If I have a PID that I poll infrequently - say every 120 seconds.
>> What happens in this case? Would it be seen as "overdue" when the
>> bus comes alive and polled immediately, or is it a matter of luck if
>> the 120th tick arrives at a time when the bus is alive?
>>
>> If the latter I need to poll even things like the VIN every 10
>> seconds to make sure I get it before the bus goes to sleep again.
>
> With the old handling, the queued frames would have been sent as soon
> as the bus woke up again. That's nasty, as the frames may have been
> meant for a specific task (e.g. some protocol part), and should not be
> sent to a car that has just woken up. That could produce any sort of
> problem, up to queued OBD writes corrupting the car memory. It was also
> nasty that the driver would then send the whole TX queue at once,
> flooding the bus. A vehicle could see that as malicious activity and
> block access.
>
> The new handling will abort the transmission as soon as the CAN
> controller runs into the retransmission limit (128 tries, formally CAN
> error-passive mode).
>
> So you now need to "ping" the car with some simple read or session
> state request, and check if a response comes in to determine if the
> bus is online. If using the poller, you'll get a respective
> Incoming…() callback. If you don't use the poller, you can set the TX
> callback pointer on the frame you send. The TX callback is called with
> a success indicator, so you can know a frame has been sent even if you
> don't get a response from the device.
>
> Regards,
> Michael
>
>
> On 08.01.21 at 09:33, Steve Davies wrote:
>> Hi Michael,
>>
>> Here's the log from a test on my car with your branch
>>
>> I started the car, left it for a while, then shut it down and waited
>> until the OBD-II first went to "not getting replies to my requests"
>> and then to "not sending anything at all".
>>
>> Hope it's helpful.
>>
>> https://drive.google.com/file/d/1AavD41HCykYrn-BxQXNufu2dT_UVCXYU/view?usp=sharing
>>
>> Steve
>>
>>
>> On Fri, 8 Jan 2021 at 08:22, Steve Davies <steve at telviva.co.za> wrote:
>>
>> Hi Michael,
>>
>> The change looks helpful, thanks. I'll try it during the course
>> of the day.
>>
>> It does prompt me to ask a question that I had - On the i3, if
>> you do something like send a lock from the key or the Connected
>> Drive app then the OBD-II comes alive but goes to sleep again in
>> less than a minute.
>>
>> If I have a PID that I poll infrequently - say every 120
>> seconds. What happens in this case? Would it be seen as
>> "overdue" when the bus comes alive and polled immediately, or is
>> it a matter of luck if the 120th tick arrives at a time when the
>> bus is alive?
>>
>> If the latter I need to poll even things like the VIN every 10
>> seconds to make sure I get it before the bus goes to sleep again.
>>
>> Thanks,
>> Steve
>>
>>
>> On Thu, 7 Jan 2021 at 18:22, Michael Balzer <dexter at expeedo.de> wrote:
>>
>> Everyone,
>>
>> please pull & test the new "can-txfail-fix" branch. It's up
>> to date and includes the BMW i3 code already.
>>
>> I need to get feedback from users of both can1 (esp32can) &
>> can2/3/4 (mcp2515), as changes had to be made to both drivers.
>>
>> I'll quote from my commit:
>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/c94592a11ad2c989e65313d23a8876cf38787d70
>>
>> Design goals:
>> - any TX can either fail or succeed, the result state is terminal
>> - the respective TX callback is called exactly once
>> - transmissions fail on reaching the error-passive bus state
>> and on message/bus errors while in error-passive state
>> - a failed TX will be aborted (no retries after bus recovery),
>> i.e. will be retried at most 128 times (in error-active phase)
>> - reduce excessive CAN error logging
>> - reduce excessive interrupt load with switched-off buses
>>
>> This results in the application being able to reliably detect a
>> switched-off vehicle bus by the TX callback's success indicator.
>> It also results in frames no longer being held in the TX buffer
>> or added to the TX queue when the bus is switched off. The
>> application can now rely on getting a clean bus state on every
>> reconnect, without any queued old frames to be sent automatically.
>>
>> Secondary benefit from aborting the transmission is, the module
>> doesn't need to handle the load from the continuously triggered
>> CAN error interrupts by retransmission attempts in error-passive
>> state.
>>
>>
>> Reason for this was a) Steve's question on aborting
>> transmissions / flushing the queue and b) my new car now also
>> switching off the bus, with the annoying effect of a frozen
>> can1 every 2-3 days, needing to reboot the module. I'm not
>> sure yet if the freeze issue is solved, but I haven't had it
>> since running these changes on my module.
>>
>> The other issue of the transceivers resending frames queued
>> long ago may have caused all sorts of strange & unrepeatable
>> issues. I remember the VW crew having problems that fell into
>> this category.
>>
>> I've verified the new MCP2515 implementation only on my
>> workbench (with an Arduino as the CAN tester), so real life
>> tests are necessary.
>>
>> Thanks,
>> Michael
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26