[Ovmsdev] Poor wifi performance
mark at webb-johnson.net
Tue Jan 8 12:58:46 HKT 2019
The core issue here (the wifi connection being lost while the ESP stack is unable to report that to the application, i.e. us) is the real culprit. Like Michael, I hope that later versions of the ESP IDF can solve this. Once the wear levelling version upgrade bug is fixed, perhaps we can try? I do see this at home (despite the wifi access point being about 2 metres from the car), so I suspect it is related to a timeout or interference rather than to distance from the access point. For me, it only happens once every few weeks.
Implementing a facility for fallback to modem even if wifi is up (in the case of connections over wifi failing) is probably a sensible feature. I think we would need:
A way for connection success / failure over a particular transport to be reported to the network layer. Or some way for network layer to access historical statistics.
Some logic in the network layer to determine that wifi is unreliable (presumably based on sequential connection failures without a success, over time).
Some logic in the network layer to determine wifi is reliable again. Perhaps after some time (hours) on modem failover, it could switch back to wifi and try again.
The switch itself would be done in the network layer, and our current default route switching mechanism should support that just fine.
I guess we would also want a config option to enable/disable this feature.
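To make the list above concrete, here is a minimal sketch of the policy part only. It assumes a consecutive-failure threshold and a fixed retry interval; the class name, method names and parameters are all hypothetical and are not taken from the OVMS code base.

```cpp
#include <cstdint>

// Hypothetical policy object: none of these names exist in OVMS.
enum class Transport { Wifi, Modem };

class WifiFailoverPolicy {
public:
  WifiFailoverPolicy(bool enabled, unsigned failure_threshold, uint32_t retry_after_s)
    : m_enabled(enabled), m_threshold(failure_threshold),
      m_retry_after_s(retry_after_s) {}

  // A client reports the outcome of each connection attempt made over wifi.
  void ReportWifiResult(bool success, uint32_t now_s) {
    if (success) {
      m_failures = 0;  // any success resets the run of failures
      return;
    }
    ++m_failures;
    if (m_enabled && !m_on_modem && m_failures >= m_threshold) {
      m_on_modem = true;         // wifi judged unreliable
      m_failover_at_s = now_s;   // remember when we failed over
    }
  }

  // Polled by the network layer; returns the transport the default route should use.
  Transport Preferred(uint32_t now_s) {
    if (m_on_modem && now_s - m_failover_at_s >= m_retry_after_s) {
      m_on_modem = false;  // retry interval elapsed: give wifi another chance
      m_failures = 0;
    }
    return m_on_modem ? Transport::Modem : Transport::Wifi;
  }

private:
  bool m_enabled;
  unsigned m_threshold;
  uint32_t m_retry_after_s;
  unsigned m_failures = 0;
  bool m_on_modem = false;
  uint32_t m_failover_at_s = 0;
};
```

With, say, `WifiFailoverPolicy policy(true, 3, 3600)`, three failed wifi connections in a row would switch the preferred transport to modem, and an hour later wifi would be tried again. The actual route change would still go through the existing default route switching.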
But I would much rather ‘fix’ the wifi in the first place. It is certainly easier to see whether a fix is effective now, with the wifi unreliable, than it would be with an automatic failover to modem in place.
> On 5 Jan 2019, at 2:56 PM, Stephen Casner <casner at acm.org> wrote:
> You're right that we can't depend on the websocket job queue overflow
> to detect loss of wifi connectivity. If the improvements in the wifi
> driver make it sufficiently robust to detect disassociation, then we
> may not need to do anything else to work around that problem.
> However, there may well be situations where wifi is able to associate
> just fine, but there is no connectivity upstream from that point to
> the server. To handle such cases I think it would be a good idea to
> have a signal that both the websocket and server-v can send to
> netmanager to trigger switching to another path.
> -- Steve
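The signal Steve describes could look roughly like the sketch below: any client that notices missing upstream connectivity raises one call, and the receiver moves on to the next configured path. `PathSelector` and its methods are hypothetical names for illustration and are not the netmanager API.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical receiver for a "no connectivity" signal; not the OVMS netmanager.
class PathSelector {
public:
  // Paths are given in priority order and are assumed to be non-empty.
  explicit PathSelector(std::vector<std::string> paths)
    : m_paths(std::move(paths)) {}

  const std::string& Current() const { return m_paths[m_current]; }
  const std::string& LastReporter() const { return m_last_reporter; }

  // Raised by any client (websocket, server connection, ...) that sees
  // the current path associate fine but carry no traffic upstream.
  void SignalNoConnectivity(const std::string& reporter) {
    if (m_paths.empty()) return;
    m_last_reporter = reporter;
    m_current = (m_current + 1) % m_paths.size();  // try the next path, wrapping around
  }

private:
  std::vector<std::string> m_paths;
  std::size_t m_current = 0;
  std::string m_last_reporter;
};
```

A real implementation would presumably debounce repeated signals from several clients so that two reports of the same outage do not switch the path twice.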
> On Fri, 4 Jan 2019, Michael Balzer wrote:
>> You could have enabled event logging additionally, but there clearly is no event from the wifi driver on the disassociation, or the netmanager would have logged this as well.
>> You're probably right that the websocket job queue overflow indicates the loss, but that won't work as a general canary, as it's only active while at least one web client is connected.
>> Another thing you could monitor is the signal quality, or maybe check for a lack of update callbacks? That's CSIRxCallback in esp32wifi.
>> But that's all working around the underlying wifi blob bug. We should first check whether the current IDF blob does a better job.
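The two monitoring ideas Michael mentions (signal quality and a lack of update callbacks) can be combined into a small watchdog. This is only a sketch: the class, its thresholds and the way a driver callback would feed it are assumptions, and nothing here reflects the actual signature of CSIRxCallback.

```cpp
#include <cstdint>

// Hypothetical liveness monitor; how it is wired to the wifi driver is an assumption.
class WifiLivenessMonitor {
public:
  WifiLivenessMonitor(uint32_t max_silence_s, int min_rssi_dbm)
    : m_max_silence_s(max_silence_s), m_min_rssi_dbm(min_rssi_dbm) {}

  // Call whenever the driver delivers fresh signal data.
  void OnUpdate(int rssi_dbm, uint32_t now_s) {
    m_last_rssi_dbm = rssi_dbm;
    m_last_update_s = now_s;
    m_seen = true;
  }

  // Call from a periodic ticker; true means the link looks suspect.
  bool Suspect(uint32_t now_s) const {
    if (!m_seen) return false;  // no data yet, nothing to judge
    if (now_s - m_last_update_s > m_max_silence_s) return true;  // callbacks stopped
    return m_last_rssi_dbm < m_min_rssi_dbm;                     // signal too weak
  }

private:
  uint32_t m_max_silence_s;
  int m_min_rssi_dbm;
  int m_last_rssi_dbm = 0;
  uint32_t m_last_update_s = 0;
  bool m_seen = false;
};
```

Whether update callbacks really stop when the association is silently lost would need to be confirmed on a failing module before relying on this.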
>> On 04.01.19 at 06:15, Stephen Casner wrote:
>>> Yesterday I found another instance where I could not ssh to OVMS nor
>>> ping it. This time I verified in my router status that the wifi
>>> association with OVMS was "inactive" (down). The iPhone app said that
>>> server-v2 was not heard for 108 minutes. This time I have a full log
>>> file covering back to the previous day, which is attached.
>>> I connected to OVMS with the serial console and the output from the
>>> serial monitor is appended to the attached log file. The "wifi stat"
>>> and "net stat" commands both indicated that the wifi connection was up
>>> when all the external indications were to the contrary. Going back
>>> 108 minutes in the log file corresponds roughly to the first instance
>>> of "job queue overflow detected". Perhaps the web server can act as
>>> the canary in the coal mine? There is no indication that the wifi
>>> driver signaled any problem at that time.
>>> Diagnosing this problem is difficult because the loss of connectivity
>>> occurs after a day or a few days of operation. I have not tried the
>>> current esp-idf yet; I may do that, but I'm not sure how long that
>>> would need to operate without loss to determine success.
>>> -- Steve