[Ovmsdev] CAN-3 broken again?

Michael Balzer dexter at expeedo.de
Sun Jan 14 02:03:03 HKT 2018


Please check again.

Thanks,
Michael


Am 13.01.2018 um 18:42 schrieb Greg D.:
> Gave it a quick try, and got a crash...  I'll see if I can isolate it a bit, but here's something to start with.  Tombstone, attached.  
>
> Greg
>
>
>
>
> Geir Øyvind Vælidalo wrote:
>> I currently don’t send anything on can2. I could try to send something, but the car is away this weekend :-(
>>
>> Geir
>>
>>
>>> 13. jan. 2018 kl. 17:24 skrev Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>>:
>>>
>>> Part one (TX queue) done & pushed.
>>>
>>> OVMS > can can1 status
>>> CAN:       can1
>>> Mode:      Active
>>> Speed:     500000
>>> Rx pkt:                  236657
>>> Rx err:                       1
>>> Rx ovrflw:                    0
>>> Tx pkt:                  106378
>>> Tx delays:                    4
>>> Tx err:                       0
>>> Tx ovrflw:                    0
>>> Err flags: 0x800caa
>>>
>>> TX performance is rock steady on can1 -- the delays occurred when sending the stop charge request (as expected). I can't test can2/3, Greg & Geir, could you…?
>>>
>>> The TxCallback can't be used on the mcp2515. The ISR can't query the IRQ register, so the TX IRQs are now also handled by the RxCallback(). As the TX IRQs
>>> need to be cleared before loading the next frame, this needs another SPI call. I hope that doesn't introduce new problems.
>>>
>>>
>>> No changes are necessary to the application code (well, except you can remove any hard coded delays now). The TX queue has a length of 20 frames and will
>>> automatically be used by the drivers when no TX buffers are free.
>>>
>>> If an application wants to know whether a frame was sent immediately or gets delayed it can check the return code of the Write() method. Write() now also
>>> can take a second parameter for the maximum wait time for space in the TX queue to become available if it's full (default 0 = fail immediately if queue is
>>> full).
>>>
>>>
>>> I also added logging of CAN errors. It's currently activated by "can … trace on", I don't think this needs to be active by default, just for CAN issue
>>> debugging.
>>>
>>> E (45718) can: Error can1 rxpkt=3 txpkt=0 errflags=0x800caa rxerr=1 txerr=0 rxovr=0 txovr=0 txdelay=0
>>> E (83528) can: Error can1 rxpkt=7483 txpkt=226 errflags=0x800caa rxerr=1 txerr=0 rxovr=0 txovr=0 txdelay=0
>>>
>>> …that's also a first part of the logging extension (part two).
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 12.01.2018 um 19:01 schrieb Michael Balzer:
>>>> Yes, I had something like that in mind. On TX IRQ, the drivers send CAN_txcallbacks to the CAN_rxtask. The CAN_rxtask then fetches frames from the TX queue
>>>> and calls the TxCallback until all TX buffers of the driver are full. From the already existing TxCallback() stubs I suppose you had planned a scheme like
>>>> that already? ;)
>>>>
>>>> Greg, can you create a pull request for your MCP2515 change? I'd like to merge that before beginning on the drivers.
>>>>
>>>> Thanks,
>>>> Michael
>>>>
>>>>
>>>> Am 12.01.2018 um 01:19 schrieb Mark Webb-Johnson:
>>>>> Option B sounds like a good approach.
>>>>>
>>>>> Presumably we are just polling the tx queue in the existing CAN_rxtask based on TxCallback?
>>>>>
>>>>> Regards, Mark.
>>>>>
>>>>>> On 11 Jan 2018, at 8:42 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>>>
>>>>>> Greg, Mark,
>>>>>>
>>>>>> I can check your new code after work.
>>>>>>
>>>>>> For the TX performance/overflow issue, there are basically two options:
>>>>>>
>>>>>>   * A: make all application TX be aware of overflows, i.e. check the return value of the CAN Write() call as necessary and/or introduce sufficient delays
>>>>>>     (very ugly)
>>>>>>   * B: add a TX queue to the CAN framework, so the application can just push some frames as fast as it likes, with an option to wait/block/fail if the
>>>>>>     queue is full
>>>>>>       o → the framework checks for TX buffers becoming available *(i.e. driver issuing a TxCallback request)* and delivers queued frames only as fast as
>>>>>>         the driver can handle them
>>>>>>
>>>>>> Option B has been on my todo list since removing the delay from the MCP driver and introducing the TX buffer check in the esp32can driver, as I don't
>>>>>> think applications should need to handle TX overflows.
>>>>>>
>>>>>> I can try to implement that this weekend if it's urgent now.
>>>>>>
>>>>>> Regards,
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> Am 11.01.2018 um 05:55 schrieb Greg D.:
>>>>>>> Hi Mark, Micheal,
>>>>>>>
>>>>>>> Ok, good news and bad news.
>>>>>>>
>>>>>>> Good news:  Rx problem I believe is fixed.  Return is true only if we received something, else false.  And the other interrupt conditions are handled at
>>>>>>> the same time, so no hangs are seen when restarting wifi.  Rx overflow counter does increment properly.  Yea!  Code has been pushed to my clone on Github.
>>>>>>>
>>>>>>> Bad news:  I am still able to hang the bus, but I think it's on the transmit side.  The obd2ecu process can send up to 3 frames back to back to report
>>>>>>> the ECU Name, followed soon after by several more with to grab the VIN.  Without any flow control on the transmit side, and with a half-duplex CAN bus,
>>>>>>> that's just too much.  Turning off the VIN reporting (config set obd2ecu private yes) seems to let everything run because I don't respond to the VIN
>>>>>>> request (which lets everything drain as OBDWiz times out).  Also verified by putting temporary delays in the obd2ecu code to let things drain a bit
>>>>>>> between frames.  So, the transmit side is still a bit fragile, depending on timing.  Not sure quite what to do here, as there is no easy place to queue
>>>>>>> things...  Do we need to go back to the old way with a delay in the obd2ecu code (perhaps better than in the driver, no?).  Architecturally it's ugly,
>>>>>>> but this only occurs at startup, and I don't mind the kludge.  Do any other uses of the MCP busses do a burst of transmitting?  If not, I'll put the
>>>>>>> delays in the obd2ecu code and call it close enough.  Lemme know.
>>>>>>>
>>>>>>> For receive, I'd go with what I have for now, if Michael would be so kind as to review what I have done first. 
>>>>>>> https://github.com/bitsofgreg/Open-Vehicle-Monitoring-System-3/blob/master/vehicle/OVMS.V3/components/mcp2515/mcp2515.cpp  Hopefully he'll be back on
>>>>>>> line before I get up in the morning.  Wonderful how the Earth's spin helps with the teamwork.
>>>>>>>
>>>>>>> I'll keep poking at things tonight, and take it out for a spin in the car tomorrow, just to see everything working together.  But as it is now, it's
>>>>>>> much better than it was before.  Really, this time.  :)
>>>>>>>
>>>>>>> Greg
>>>>>>>
>>>>>>>
>>>>>>> Greg D. wrote:
>>>>>>>> Hi Mark,
>>>>>>>>
>>>>>>>> I believe you are right about the multiple flags, and the code only processing Rx and "error" separately.  Fundamentally, a roll-over from buffer 0 to
>>>>>>>> buffer 1 isn't really an error, just a statement of fact on what happened.  So, we should have buffer 1 and the rollover flag at the same time, which
>>>>>>>> in fact is what I saw.  I need to handle the Rx overflow at the same time as the buffer 1 receive, I think...
>>>>>>>>
>>>>>>>> I need to grab some dinner, but have a fix in the works.  Will report back in a few hours, hopefully with good news...
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>>
>>>>>>>> Mark Webb-Johnson wrote:
>>>>>>>>>
>>>>>>>>> The design of the system is as follows:
>>>>>>>>>
>>>>>>>>>   * The can object CAN_rxtask listens on the rx queue to receive instructional messages from canbus drivers. These can be:
>>>>>>>>>       o CAN_frame: simply passes an entire incoming can frame to the IncomingFrame handler.
>>>>>>>>>       o CAN_rxcallback: an instruction for the CAN_rxtask to call the RxCallback task repeatedly.
>>>>>>>>>       o CAN_txcallback: an instruction for the CAN_rxtask to call the TxCallback once.
>>>>>>>>>   * In the case of CAN_rxcallback, the canbus object RxCallback function is expected to return FALSE to indicate nothing should be done and RxCallback
>>>>>>>>>     should not be called again, or TRUE to indicate an incoming frame has been received and should be passed to IncomingFrame.
>>>>>>>>>   * The system is arranged so that individual bus driver interrupt implementations can be fast and efficient.
>>>>>>>>>       o The driver can choose to receive the frame in the interrupt handler itself, and pass it with CAN_frame to CAN_rxtask. The esp32 can driver
>>>>>>>>>         uses this option.
>>>>>>>>>       o Or the driver can choose to delay the reception of the frame to the RxCallback stage, and merely pass an indication with CAN_rxcallback. The
>>>>>>>>>         mcp2515 driver uses this option.
>>>>>>>>>   * The true/false response from RxCallback is designed to allow the callback to signal it received a frame or not. If it received a frame, then it is
>>>>>>>>>     called again.
>>>>>>>>>   * This approach is used in order to be able to centralise the reception of CAN frames to one single task (avoiding having to run individual tasks
>>>>>>>>>     for each canbus, hence saving stack RAM).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The RxCallback should definitely ONLY return true if an actual can message has been received, and is being passed back in the frame pointer parameter.
>>>>>>>>>
>>>>>>>>> I suspect the issue is that the mcp2515 RxCallback is being faced with multiple error flags. Changing that to a return true (as Greg has done) has the
>>>>>>>>> undesired side-effect of issuing a spurious IncomingFrame (with garbage/blank frame), but also causes the RxCallback to be called again (clearing the
>>>>>>>>> error flag). Perhaps the solution is to put a loop in RxCallback so that if an error condition is found, it should be cleared, but then loop again and
>>>>>>>>> keep clearing errors until no more are found, then return false? I think that in the mcp2515 case, this error clearing loop can be simply handled in
>>>>>>>>> the RxCallback itself.
>>>>>>>>>
>>>>>>>>> The alternative is to change the RxCallback logic so that the return bool value means simply ‘loop’ (call me again, please), and have the RxCallback
>>>>>>>>> itself call IncomingFrame(), rather than passing a frame as a parameter. If Michael/Greg think this is a better approach, I am happy to make that
>>>>>>>>> change - it is pretty trivial.
>>>>>>>>>
>>>>>>>>> Regards, Mark.
>>>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.teslaclub.hk <mailto:OvmsDev at lists.teslaclub.hk>
>>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.teslaclub.hk
>>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>>
>>>> -- 
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>
>>>>
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.teslaclub.hk
>>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>>
>>> -- 
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.teslaclub.hk <mailto:OvmsDev at lists.teslaclub.hk>
>>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>>
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.teslaclub.hk
>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
>
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.teslaclub.hk
> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20180113/7d25b1d2/attachment.htm>


More information about the OvmsDev mailing list