[Ovmsdev] CAN-3 broken again?

Michael Balzer dexter at expeedo.de
Sun Jan 14 02:14:53 HKT 2018


Just added an additional fix for this.

Regards,
Michael


Am 13.01.2018 um 19:03 schrieb Michael Balzer:
> Please check again.
>
> Thanks,
> Michael
>
>
> Am 13.01.2018 um 18:42 schrieb Greg D.:
>> Gave it a quick try, and got a crash...  I'll see if I can isolate it a bit, but here's something to start with.  Tombstone, attached.  
>>
>> Greg
>>
>>
>>
>>
>> Geir Øyvind Vælidalo wrote:
>>> I currently don’t send anything on can2. I could try to send something, but the car is away this weekend :-(
>>>
>>> Geir
>>>
>>>
>>>> 13. jan. 2018 kl. 17:24 skrev Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>>:
>>>>
>>>> Part one (TX queue) done & pushed.
>>>>
>>>> OVMS > can can1 status
>>>> CAN:       can1
>>>> Mode:      Active
>>>> Speed:     500000
>>>> Rx pkt:                  236657
>>>> Rx err:                       1
>>>> Rx ovrflw:                    0
>>>> Tx pkt:                  106378
>>>> Tx delays:                    4
>>>> Tx err:                       0
>>>> Tx ovrflw:                    0
>>>> Err flags: 0x800caa
>>>>
>>>> TX performance is rock steady on can1 -- the delays occurred when sending the stop charge request (as expected). I can't test can2/3, Greg & Geir, could you…?
>>>>
>>>> The TxCallback can't be used on the mcp2515. The ISR can't query the IRQ register, so the TX IRQs are now also handled by the RxCallback(). As the TX IRQs
>>>> need to be cleared before loading the next frame, this needs another SPI call. I hope that doesn't introduce new problems.
>>>>
>>>>
>>>> No changes are necessary to the application code (well, except you can remove any hard coded delays now). The TX queue has a length of 20 frames and will
>>>> automatically be used by the drivers when no TX buffers are free.
>>>>
>>>> If an application wants to know whether a frame was sent immediately or gets delayed it can check the return code of the Write() method. Write() now also
>>>> can take a second parameter for the maximum wait time for space in the TX queue to become available if it's full (default 0 = fail immediately if queue is
>>>> full).
>>>>
>>>>
>>>> I also added logging of CAN errors. It's currently activated by "can … trace on", I don't think this needs to be active by default, just for CAN issue
>>>> debugging.
>>>>
>>>> E (45718) can: Error can1 rxpkt=3 txpkt=0 errflags=0x800caa rxerr=1 txerr=0 rxovr=0 txovr=0 txdelay=0
>>>> E (83528) can: Error can1 rxpkt=7483 txpkt=226 errflags=0x800caa rxerr=1 txerr=0 rxovr=0 txovr=0 txdelay=0
>>>>
>>>> …that's also a first part of the logging extension (part two).
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>>>> Am 12.01.2018 um 19:01 schrieb Michael Balzer:
>>>>> Yes, I had something like that in mind. On TX IRQ, the drivers send CAN_txcallbacks to the CAN_rxtask. The CAN_rxtask then fetches frames from the TX
>>>>> queue and calls the TxCallback until all TX buffers of the driver are full. From the already existing TxCallback() stubs I suppose you had planned a
>>>>> scheme like that already? ;)
>>>>>
>>>>> Greg, can you create a pull request for your MCP2515 change? I'd like to merge that before beginning on the drivers.
>>>>>
>>>>> Thanks,
>>>>> Michael
>>>>>
>>>>>
>>>>> Am 12.01.2018 um 01:19 schrieb Mark Webb-Johnson:
>>>>>> Option B sounds like a good approach.
>>>>>>
>>>>>> Presumably we are just polling the tx queue in the existing CAN_rxtask based on TxCallback?
>>>>>>
>>>>>> Regards, Mark.
>>>>>>
>>>>>>> On 11 Jan 2018, at 8:42 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>>>>
>>>>>>> Greg, Mark,
>>>>>>>
>>>>>>> I can check your new code after work.
>>>>>>>
>>>>>>> For the TX performance/overflow issue, there are basically two options:
>>>>>>>
>>>>>>>   * A: make all application TX be aware of overflows, i.e. check the return value of the CAN Write() call as necessary and/or introduce sufficient
>>>>>>>     delays (very ugly)
>>>>>>>   * B: add a TX queue to the CAN framework, so the application can just push some frames as fast as it likes, with an option to wait/block/fail if the
>>>>>>>     queue is full
>>>>>>>       o → the framework checks for TX buffers becoming available *(i.e. driver issuing a TxCallback request)* and delivers queued frames only as fast as
>>>>>>>         the driver can handle them
>>>>>>>
>>>>>>> Option B has been on my todo list since removing the delay from the MCP driver and introducing the TX buffer check in the esp32can driver, as I don't
>>>>>>> think applications should need to handle TX overflows.
>>>>>>>
>>>>>>> I can try to implement that this weekend if it's urgent now.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Michael
>>>>>>>
>>>>>>>
>>>>>>> Am 11.01.2018 um 05:55 schrieb Greg D.:
>>>>>>>> Hi Mark, Micheal,
>>>>>>>>
>>>>>>>> Ok, good news and bad news.
>>>>>>>>
>>>>>>>> Good news:  Rx problem I believe is fixed.  Return is true only if we received something, else false.  And the other interrupt conditions are handled
>>>>>>>> at the same time, so no hangs are seen when restarting wifi.  Rx overflow counter does increment properly.  Yea!  Code has been pushed to my clone on
>>>>>>>> Github.
>>>>>>>>
>>>>>>>> Bad news:  I am still able to hang the bus, but I think it's on the transmit side.  The obd2ecu process can send up to 3 frames back to back to report
>>>>>>>> the ECU Name, followed soon after by several more with to grab the VIN.  Without any flow control on the transmit side, and with a half-duplex CAN bus,
>>>>>>>> that's just too much.  Turning off the VIN reporting (config set obd2ecu private yes) seems to let everything run because I don't respond to the VIN
>>>>>>>> request (which lets everything drain as OBDWiz times out).  Also verified by putting temporary delays in the obd2ecu code to let things drain a bit
>>>>>>>> between frames.  So, the transmit side is still a bit fragile, depending on timing.  Not sure quite what to do here, as there is no easy place to queue
>>>>>>>> things...  Do we need to go back to the old way with a delay in the obd2ecu code (perhaps better than in the driver, no?).  Architecturally it's ugly,
>>>>>>>> but this only occurs at startup, and I don't mind the kludge.  Do any other uses of the MCP busses do a burst of transmitting?  If not, I'll put the
>>>>>>>> delays in the obd2ecu code and call it close enough.  Lemme know.
>>>>>>>>
>>>>>>>> For receive, I'd go with what I have for now, if Michael would be so kind as to review what I have done first. 
>>>>>>>> https://github.com/bitsofgreg/Open-Vehicle-Monitoring-System-3/blob/master/vehicle/OVMS.V3/components/mcp2515/mcp2515.cpp  Hopefully he'll be back on
>>>>>>>> line before I get up in the morning.  Wonderful how the Earth's spin helps with the teamwork.
>>>>>>>>
>>>>>>>> I'll keep poking at things tonight, and take it out for a spin in the car tomorrow, just to see everything working together.  But as it is now, it's
>>>>>>>> much better than it was before.  Really, this time.  :)
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>>
>>>>>>>> Greg D. wrote:
>>>>>>>>> Hi Mark,
>>>>>>>>>
>>>>>>>>> I believe you are right about the multiple flags, and the code only processing Rx and "error" separately.  Fundamentally, a roll-over from buffer 0 to
>>>>>>>>> buffer 1 isn't really an error, just a statement of fact on what happened.  So, we should have buffer 1 and the rollover flag at the same time, which
>>>>>>>>> in fact is what I saw.  I need to handle the Rx overflow at the same time as the buffer 1 receive, I think...
>>>>>>>>>
>>>>>>>>> I need to grab some dinner, but have a fix in the works.  Will report back in a few hours, hopefully with good news...
>>>>>>>>>
>>>>>>>>> Greg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mark Webb-Johnson wrote:
>>>>>>>>>>
>>>>>>>>>> The design of the system is as follows:
>>>>>>>>>>
>>>>>>>>>>   * The can object CAN_rxtask listens on the rx queue to receive instructional messages from canbus drivers. These can be:
>>>>>>>>>>       o CAN_frame: simply passes an entire incoming can frame to the IncomingFrame handler.
>>>>>>>>>>       o CAN_rxcallback: an instruction for the CAN_rxtask to call the RxCallback task repeatedly.
>>>>>>>>>>       o CAN_txcallback: an instruction for the CAN_rxtask to call the TxCallback once.
>>>>>>>>>>   * In the case of CAN_rxcallback, the canbus object RxCallback function is expected to return FALSE to indicate nothing should be done and
>>>>>>>>>>     RxCallback should not be called again, or TRUE to indicate an incoming frame has been received and should be passed to IncomingFrame.
>>>>>>>>>>   * The system is arranged so that individual bus driver interrupt implementations can be fast and efficient.
>>>>>>>>>>       o The driver can choose to receive the frame in the interrupt handler itself, and pass it with CAN_frame to CAN_rxtask. The esp32 can driver
>>>>>>>>>>         uses this option.
>>>>>>>>>>       o Or the driver can choose to delay the reception of the frame to the RxCallback stage, and merely pass an indication with CAN_rxcallback. The
>>>>>>>>>>         mcp2515 driver uses this option.
>>>>>>>>>>   * The true/false response from RxCallback is designed to allow the callback to signal it received a frame or not. If it received a frame, then it
>>>>>>>>>>     is called again.
>>>>>>>>>>   * This approach is used in order to be able to centralise the reception of CAN frames to one single task (avoiding having to run individual tasks
>>>>>>>>>>     for each canbus, hence saving stack RAM).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The RxCallback should definitely ONLY return true if an actual can message has been received, and is being passed back in the frame pointer parameter.
>>>>>>>>>>
>>>>>>>>>> I suspect the issue is that the mcp2515 RxCallback is being faced with multiple error flags. Changing that to a return true (as Greg has done) has
>>>>>>>>>> the undesired side-effect of issuing a spurious IncomingFrame (with garbage/blank frame), but also causes the RxCallback to be called again (clearing
>>>>>>>>>> the error flag). Perhaps the solution is to put a loop in RxCallback so that if an error condition is found, it should be cleared, but then loop
>>>>>>>>>> again and keep clearing errors until no more are found, then return false? I think that in the mcp2515 case, this error clearing loop can be simply
>>>>>>>>>> handled in the RxCallback itself.
>>>>>>>>>>
>>>>>>>>>> The alternative is to change the RxCallback logic so that the return bool value means simply ‘loop’ (call me again, please), and have the RxCallback
>>>>>>>>>> itself call IncomingFrame(), rather than passing a frame as a parameter. If Michael/Greg think this is a better approach, I am happy to make that
>>>>>>>>>> change - it is pretty trivial.
>>>>>>>>>>
>>>>>>>>>> Regards, Mark.
>>>>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>>> _______________________________________________
>>>>>>> OvmsDev mailing list
>>>>>>> OvmsDev at lists.teslaclub.hk <mailto:OvmsDev at lists.teslaclub.hk>
>>>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.teslaclub.hk
>>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>>>
>>>>> -- 
>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.teslaclub.hk
>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>>
>>>> -- 
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.teslaclub.hk <mailto:OvmsDev at lists.teslaclub.hk>
>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.teslaclub.hk
>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.teslaclub.hk
>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>
> -- 
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.teslaclub.hk
> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20180113/06add69e/attachment.htm>


More information about the OvmsDev mailing list