[Ovmsdev] CAN-3 broken again?

Greg D. gregd2350 at gmail.com
Mon Jan 8 09:14:51 HKT 2018


I turned off Canopen, SSH, and Telnet earlier, and that seemed to
stop the crashes.  I've just now added Bluetooth to that list, for good measure.

Let's see how that holds...

As for the issues with CAN-3, I seem to be able to hang it by simply
starting WiFi while the OBDII ECU is running with an OBDII device
attached (OBDWiz, in this case).  Trying to reconnect the OBDII device
fails - no frames are received.  Stopping and restarting the OBDII ECU
task lets me reconnect.  If I look at the can status when hung, I see
that Rx Ovrflw is 1, and the Rx counter doesn't increment.  I'm guessing
that starting WiFi is taking enough CPU time that the OBDII ECU task is
falling behind, causing the overflow.  Apparently, that overflow is not
being handled, leading to the hang.
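
To make the suspected failure concrete, here is a minimal sketch of
overflow recovery.  It is not taken from the OVMS driver.  The register
addresses and bit masks are from the MCP2515 datasheet; FakeMcp2515 and
recoverRxOverflow are hypothetical names, and a plain array stands in
for the SPI bus so the logic can be run off-target.  The assumption is
that a latched overflow flag which nobody clears leaves the error
interrupt asserted, so no new interrupt ever arrives.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Register addresses and bit masks from the MCP2515 datasheet.
constexpr uint8_t REG_CANINTF   = 0x2C;  // interrupt flag register
constexpr uint8_t REG_EFLG      = 0x2D;  // error flag register
constexpr uint8_t CANINTF_RX0IF = 0x01;  // frame waiting in RX buffer 0
constexpr uint8_t CANINTF_ERRIF = 0x20;  // error interrupt (source is in EFLG)
constexpr uint8_t EFLG_RX0OVR   = 0x40;  // RX buffer 0 overflow
constexpr uint8_t EFLG_RX1OVR   = 0x80;  // RX buffer 1 overflow

// Hypothetical stand-in for the chip: a register array instead of SPI.
struct FakeMcp2515 {
  std::array<uint8_t, 128> regs{};

  uint8_t read(uint8_t addr) const {
    assert(addr < regs.size());
    return regs[addr];
  }

  // Mirrors the BIT MODIFY instruction: only bits set in mask change.
  void bitModify(uint8_t addr, uint8_t mask, uint8_t data) {
    assert(addr < regs.size());
    regs[addr] = static_cast<uint8_t>((regs[addr] & ~mask) | (data & mask));
  }
};

// Checks for a receive overflow and clears it so reception can resume.
// Returns true if an overflow was found; overflowCount is incremented
// so the event still shows up in the status output.
bool recoverRxOverflow(FakeMcp2515& chip, unsigned& overflowCount) {
  const uint8_t overflow =
      chip.read(REG_EFLG) & (EFLG_RX0OVR | EFLG_RX1OVR);
  if (overflow == 0) {
    return false;
  }
  ++overflowCount;
  // The overflow flags are latched and must be cleared by the host.
  chip.bitModify(REG_EFLG, overflow, 0);
  // Clear the error interrupt flag too; pending RX flags are left alone
  // so the frames already in the buffers can still be read out.
  chip.bitModify(REG_CANINTF, CANINTF_ERRIF, 0);
  return true;
}
```

In a real driver the same check would run from the interrupt handling
path, after the receive buffers have been read out, with the array
accesses replaced by SPI transfers.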

On an earlier run (before removing Bluetooth), I was able to get the
OBDWiz dongle to connect for a few frames, after which it hung.  That
behavior didn't repeat just now, but I'm not sure what else was running
at the time (e.g. the modem / ppp).  The connect sequence from OBDWiz
does a few frames rapidly (an initial PID 0, followed by requests for
ECU Name and VIN), before a more relaxed polling starts.  So, if there's
another task taking up CPU time, I can see where an Rx overflow could
occur during that initial connect sequence.
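
For reference, the three requests in that connect sequence can be
sketched as below.  The mode and PID values are the standard OBDII ones
(mode 01 PID 00 for supported PIDs, mode 09 PID 0A for ECU name, mode 09
PID 02 for VIN).  CanFrame and makeObdRequest are hypothetical names,
and the 11-bit functional request ID 0x7DF and zero padding are
assumptions; I have not confirmed what OBDWiz actually puts on the wire.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal frame type for illustration only.
struct CanFrame {
  uint32_t id;
  uint8_t dlc;
  std::array<uint8_t, 8> data;
};

// Standard 11-bit functional (broadcast) OBDII request ID.
constexpr uint32_t OBD_FUNCTIONAL_REQUEST_ID = 0x7DF;

// Builds a single-frame OBDII request: byte 0 is the number of payload
// bytes that follow (mode + PID = 2), then the mode and the PID.
CanFrame makeObdRequest(uint8_t mode, uint8_t pid) {
  CanFrame frame{};
  frame.id = OBD_FUNCTIONAL_REQUEST_ID;
  frame.dlc = 8;
  frame.data = {0x02, mode, pid, 0, 0, 0, 0, 0};  // zero padding assumed
  return frame;
}

// The burst described above, in the order the tool is said to send it.
std::array<CanFrame, 3> makeConnectSequence() {
  return {makeObdRequest(0x01, 0x00),   // supported PIDs ("PID 0")
          makeObdRequest(0x09, 0x0A),   // ECU name
          makeObdRequest(0x09, 0x02)};  // VIN
}
```

The MCP2515 has only two receive buffers, so a burst of three frames
that arrives while the servicing task is stalled would be enough to
overflow.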

Driving a HUD is not a critical task, so I would be against a general
raising of task priority.  Rather, we need to figure out how to handle
the Rx Overflow, and keep the frames coming in.  OBDII devices generally
are somewhat forgiving about lost frames, but apparently the OBDWiz has
a short attention span and lets you know that something is wrong.

I'll take a look at the 2515 code, but I'm not much of an expert on the
chip's care and feeding under such circumstances.  If someone more in
the know about it could take a look, that would be great.

Thanks,

Greg


Stephen Casner wrote:
> Greg,
>
> Yes, definitely running out of free RAM, but I don't know the meaning
> of the WindowOverflow messages.
>
> The first time I built with release/v3.0 of esp-idf I was not able to
> open an ssh connection; the error displayed was about a crypto
> failure.  After quite a bit of digging to narrow down to where the
> error was occurring, I finally found that the problem was running out
> of free RAM.  My solution was to disable bluetooth entirely, which
> made a big difference in the amount of free RAM.
>
>                                                         -- Steve
>
> On Sun, 7 Jan 2018, Greg D. wrote:
>
>> Hi Michael, Steve, Mark,
>>
>> Steve, the crash was an abort in new_op.cc, so perhaps being out of space is the
>> issue.  Crash and reboot log attached (crash.txt).  One thing I've been wondering
>> about are the several lines "_WindowOverflow4 at ??:?" during the boot process.  Is
>> that indicative of a problem, later to manifest in the crash?
>>
>> My builds include pretty much everything, except for the Leaf, Twizy, and Soul.
>>
>> The update included some 20 lines changed to mcp2515.cpp, as well as a bunch of other
>> stuff, including a lot of stuff updated in Canopen and Kia.  I have a script that does
>> the git fetch master, merge, and push back to my github fork, the output of which is
>> attached (update.txt).  As a test, I removed Canopen from the build config, and the
>> crash has disappeared.  CAN-3 also appears to have come back to life (!), at least
>> initially.  I can still get CAN-3 to fail if I turn on/off the modem and/or wifi in
>> some sequence (still trying to pin that down), but that also leads to another crash
>> (crash2.txt, attached).
>>
>> Mark:  Note also the issue with DNS failures getting to the v2 server.  I enabled the
>> modem, got connected, then enabled WiFi (simulating arriving at home), and lost the V2
>> server.  Disabling Wifi didn't bring it back, and powering off the modem (in
>> preparation for turning it back on) caused the crash.
>>
>> So, two questions...  First, why the apparent conflict between Canopen or wifi/modem
>> and obd2ecu over access to the 3rd CAN bus?  Why would the modem or wifi have any
>> effect on a CAN bus?
>>
>> Second, overall memory usage seems to be at the limit.  What sort of budget do we have
>> for what remains to be done, and how are we going to be packaging the build options
>> for when non-developers want to get their hands on the product?  Will we be able to
>> turn everything on, minus the developer / debug stuff, or will we have a separate SKU
>> for each model car?
>>
>> Thanks,
>>
>> Greg
>>
>>
>> Michael Balzer wrote:
>>
>> Greg,
>>
>> which commits / changes do you mean? The CAN drivers have not been changed since the
>> TX performance fix, which Geir reported having solved his last issues.
>>
>> The current version is stable over here, but without the SSH component -- I can't use
>> that due to memory getting too low together with the Twizy component.
>>
>> Regards,
>> Michael
>>
>>
>> Am 07.01.2018 um 08:04 schrieb Greg D.:
>>
>> Hi folks,
>>
>> I just resync'd with the main repository, and am not receiving frames on
>> CAN-3 anymore.  I see there were changes to the chip driver...
>>
>> I'm also seeing crashes right after getting connected to WiFi,
>> immediately after the system tries to start SSH.
>>
>> Seems like we just took a big step backward.  What happened?
>>
>> Greg
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.teslaclub.hk
>> http://lists.teslaclub.hk/mailman/listinfo/ovmsdev
