[Ovmsdev] CAN drivers: fix & harmonize frame transmission failure handling

Michael Balzer dexter at expeedo.de
Sun Jan 10 01:02:03 HKT 2021


I've just pushed the two poller extensions described before:

 1. If some TxCallback has been registered, either globally or on the
    frame, the CAN framework now won't add an extra error status log
    entry on TX failure. If you need to see/log these, activate a CAN
    logger capable of catching TX failures, e.g. "can log start monitor
    crtd", and look for "TX_Fail" entries. With a registered TxCallback,
    during normal operation you will only see error level log entries
    from TX failures when the CAN framework encounters a bus error
    condition. "Ping" frames/requests sent while the error state is
    active won't produce standard log entries.

 2. The vehicle poller now registers a TxCallback for all requests it
    sends, so it automatically fulfills the above condition. You can hook
    into that callback simply by overriding the following method:

    /**
     * IncomingPollTxCallback: poller TX callback (stub, override with
     *  vehicle implementation)
     *  This is called by PollerTxCallback() on TX success/failure for
     *  a poller request.
     *  You can use this to detect CAN bus issues, e.g. if the car
     *  switches off the OBD port.
     *
     *  ATT: this is executed in the main CAN task context. Keep it simple.
     *    Complex processing here will affect overall CAN performance.
     *
     *  @param bus
     *    CAN bus the current poll is done on
     *  @param txid
     *    The module TX ID of the current poll
     *  @param type
     *    OBD2 mode / UDS polling type, e.g. VEHICLE_POLL_TYPE_READDTC
     *  @param pid
     *    PID addressed (depending on the request type, may be none /
     *    8 bit / 16 bit)
     *  @param success
     *    Frame transmission success
     */
    void OvmsVehicle::IncomingPollTxCallback(canbus* bus, uint32_t txid,
        uint16_t type, uint16_t pid, bool success)
      {
      }
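
As a rough illustration of such an override, here is a minimal sketch that
only counts consecutive TX failures and flips an "online" flag. The canbus
and OvmsVehicle definitions below are reduced stand-ins so the example
compiles on its own, not the real framework declarations. MyVehicle,
BusOnline() and kOfflineThreshold are hypothetical names, and the threshold
of 3 is an arbitrary assumption.

```cpp
#include <cstdint>

// Stand-ins for the OVMS types, reduced to what this sketch needs
// (assumptions, not the real framework declarations).
struct canbus { const char* name; };

class OvmsVehicle {
public:
  virtual ~OvmsVehicle() = default;
  // Same parameter list as the stub shown above; the default does nothing.
  virtual void IncomingPollTxCallback(canbus* bus, uint32_t txid,
                                      uint16_t type, uint16_t pid,
                                      bool success) {}
};

// Hypothetical vehicle class: treats the bus as offline after a number of
// consecutive poller TX failures, and as online again on the first success.
class MyVehicle : public OvmsVehicle {
public:
  static constexpr int kOfflineThreshold = 3;  // assumed value, tune per car

  void IncomingPollTxCallback(canbus* bus, uint32_t txid,
                              uint16_t type, uint16_t pid,
                              bool success) override {
    // Executed in the CAN task context: only touch counters and flags here,
    // leave any heavier reaction to the vehicle's own task or ticker.
    (void)bus; (void)txid; (void)type; (void)pid;
    if (success) {
      m_tx_fail_count = 0;
      m_bus_online = true;
    } else if (++m_tx_fail_count >= kOfflineThreshold) {
      m_bus_online = false;
    }
  }

  bool BusOnline() const { return m_bus_online; }

private:
  int m_tx_fail_count = 0;
  bool m_bus_online = true;
};
```

The vehicle code could then check BusOnline() from its regular ticker, for
example to switch to a slower "ping" poll state while the car keeps the OBD
port off.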


Regards,
Michael


Am 09.01.21 um 14:18 schrieb Michael Balzer:
> As all tests were positive and without issues, I've merged the rework 
> into master.
>
> I'm now considering extending the poller to allow hooking into
> transmission failures.
>
> Also, if a TX callback is present in a frame, I don't think we need 
> the error log entry from the CAN framework. That would eliminate most 
> CAN error log messages from regular poller "pings".
>
> Regards,
> Michael
>
>
> Am 08.01.21 um 18:52 schrieb Michael Balzer:
>> Steve,
>>
>> thanks, that's perfect. The failure handling works as designed in 
>> your case.
>>
>> Regarding your question:
>>> It does prompt me to ask a question that I had - On the i3, if you 
>>> do something like send a lock from the key or the Connected Drive 
>>> APP then the OBD-II comes alive but goes asleep again in less than a 
>>> minute.
>>>
>>> if I have a PID that I poll infrequently - say every 120 seconds.  
>>> What happens in this case?  Would they be seen as "overdue" when the 
>>> bus comes alive and polled immediately, or is it a matter of luck if 
>>> the 120th tick arrives at a time when the bus is alive?
>>>
>>> If the latter I need to poll even things like the VIN every 10 
>>> seconds to make sure I get it before the bus goes to sleep again.
>>
>> With the old handling, the queued frames would have been sent as soon
>> as the bus woke up again. That's nasty, as the frames may have been
>> intended for a specific task (e.g. some protocol part) and should not
>> be sent to a car that has just woken up. That could produce all sorts
>> of problems, up to queued OBD writes corrupting the car's memory. It
>> was also nasty that the driver would then send the whole TX queue at
>> once, flooding the bus. A vehicle could see that as malicious activity
>> and block access.
>>
>> The new handling will abort the transmission as soon as the CAN 
>> controller runs into the retransmission limit (128 tries, formally 
>> CAN error-passive mode).
>>
>> So you now need to "ping" the car with some simple read or session 
>> state request, and check if a response comes in to determine if the 
>> bus is online. If using the poller, you'll get a respective 
>> Incoming…() callback. If you don't use the poller, you can set the TX 
>> callback pointer on the frame you send. The TX callback is called 
>> with a success indicator, so you can know a frame has been sent even 
>> if you don't get a response from the device.
>>
>> Regards,
>> Michael
>>
>>
>> Am 08.01.21 um 09:33 schrieb Steve Davies:
>>> Hi Michael,
>>>
>>> Here's the log from a test on my car with your branch
>>>
>>> I started the car, left it for a while, then shut it down and waited 
>>> until the OBD-II first went to "not getting replies to my requests" 
>>> and then to "not sending anything at all".
>>>
>>> Hope it's helpful.
>>>
>>> https://drive.google.com/file/d/1AavD41HCykYrn-BxQXNufu2dT_UVCXYU/view?usp=sharing
>>>
>>> Steve
>>>
>>>
>>> On Fri, 8 Jan 2021 at 08:22, Steve Davies <steve at telviva.co.za> wrote:
>>>
>>>     Hi Michael,
>>>
>>>     The change looks helpful, thanks.  I'll try it during the course
>>>     of the day.
>>>
>>>     It does prompt me to ask a question that I had - On the i3, if
>>>     you do something like send a lock from the key or the Connected
>>>     Drive APP then the OBD-II comes alive but goes asleep again in
>>>     less than a minute.
>>>
>>>     if I have a PID that I poll infrequently - say every 120
>>>     seconds.  What happens in this case?  Would they be seen as
>>>     "overdue" when the bus comes alive and polled immediately, or is
>>>     it a matter of luck if the 120th tick arrives at a time when the
>>>     bus is alive?
>>>
>>>     If the latter I need to poll even things like the VIN every 10
>>>     seconds to make sure I get it before the bus goes to sleep again.
>>>
>>>     Thanks,
>>>     Steve
>>>
>>>
>>>     On Thu, 7 Jan 2021 at 18:22, Michael Balzer <dexter at expeedo.de> wrote:
>>>
>>>         Everyone,
>>>
>>>         please pull & test the new "can-txfail-fix" branch. It's up
>>>         to date and includes the BMW i3 code already.
>>>
>>>         I need to get feedback from users of both can1 (esp32can) &
>>>         can2/3/4 (mcp2515), as changes had to be made to both drivers.
>>>
>>>         I'll quote from my commit:
>>>         https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/c94592a11ad2c989e65313d23a8876cf38787d70
>>>
>>>         Design goals:
>>>         - any TX can either fail or succeed, the result state is terminal
>>>         - the respective TX callback is called exactly once
>>>         - transmissions fail on reaching the error-passive bus state
>>>            and on message/bus errors while in error-passive state
>>>         - a failed TX will be aborted (no retries after bus recovery),
>>>            i.e. will be retried at most 128 times (in error-active phase)
>>>         - reduce excessive CAN error logging
>>>         - reduce excessive interrupt load with switched-off buses
>>>
>>>         This results in the application being able to reliably detect a
>>>         switched-off vehicle bus by the TX callback's success indicator.
>>>         It also results in frames no longer being held in the TX buffer
>>>         or added to the TX queue when the bus is switched off. The
>>>         application can now rely on getting a clean bus state on every
>>>         reconnect, without any queued old frames to be sent automatically.
>>>
>>>         Secondary benefit from aborting the transmission is, the module
>>>         doesn't need to handle the load from the continuously triggered
>>>         CAN error interrupts by retransmission attempts in error-passive
>>>         state.
>>>
>>>
>>>         Reason for this was a) Steve's question on aborting
>>>         transmissions / flushing the queue and b) my new car now
>>>         also switching off the bus, with the annoying effect of a
>>>         frozen can1 every 2-3 days, needing to reboot the module.
>>>         I'm not sure yet if the freeze issue is solved, but I
>>>         haven't had it since running these changes on my module.
>>>
>>>         The other issue of the transceivers resending frames queued
>>>         long ago may have caused all sorts of strange & unrepeatable
>>>         issues. I remember the VW crew having problems that fell
>>>         into this category.
>>>
>>>         I've verified the new MCP2515 implementation only on my
>>>         workbench (with an Arduino as the CAN tester), so real life
>>>         tests are necessary.
>>>
>>>         Thanks,
>>>         Michael
>>>
>>>         -- 
>>>         Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>         Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>         _______________________________________________
>>>         OvmsDev mailing list
>>>         OvmsDev at lists.openvehicles.com
>>>         http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>

