[Ovmsdev] Branch for-v3.3 network issues
Michael Balzer
dexter at expeedo.de
Mon Mar 22 17:22:21 HKT 2021
I think we must avoid blocking the Mongoose task, as that's the central
network dispatcher.
Chris had implemented a workaround in one of his PRs that could allow
that to be done temporarily by running a local Mongoose main loop during
a synchronous operation, but I still see potential issues from that, as
it wasn't the standard handling as done by the task, and as it may need
to recurse.
Maybe the old OvmsHttpClient using socket I/O is the right way for
synchronous network operations?
Regards,
Michael
Am 22.03.21 um 07:15 schrieb Mark Webb-Johnson:
> Not sure how to resolve this.
>
> OvmsSyncHttpClient is currently used in commands from ovms_plugins and
> ovms_ota.
>
> I could bring back the OvmsHttpClient blocking (non-mongoose)
> implementation, but I don’t think that would address the core problem
> here:
>
> Inside a mongoose callback (inside the mongoose networking task),
> we are making blocking calls (and in particular calls that could
> block for several tens of seconds).
>
>
> But fundamentally is it ok to block the mongoose networking task for
> extended periods during a mongoose event callback?
>
> Mark
>
>> On 21 Mar 2021, at 9:57 PM, Michael Balzer <dexter at expeedo.de
>> <mailto:dexter at expeedo.de>> wrote:
>>
>> Signed PGP part
>> I've found opening the web UI firmware page or calling "ota status"
>> via ssh to consistently deadlock the network on my module.
>>
>> I (130531) webserver: HTTP GET /cfg/firmware
>> D (130531) http: OvmsSyncHttpClient: Connect to
>> ovms.dexters-web.de:80 <http://ovms.dexters-web.de:80>
>> D (130541) http: OvmsSyncHttpClient: waiting for completion
>>
>> After that log message, the network is dead, and the netmanager also
>> doesn't respond:
>>
>> OVMS# network list
>> ERROR: job failed
>> D (183241) netmanager: send cmd 1 from 0x3ffe7054
>> W (193241) netmanager: ExecuteJob: cmd 1: timeout
>>
>> The interfaces seem to be registered and online, but nothing gets in
>> or out:
>>
>> OVMS# network status
>> Interface#3: pp3 (ifup=1 linkup=1)
>> IPv4: 10.170.195.13/255.255.255.255 gateway 10.64.64.64
>>
>> Interface#2: ap2 (ifup=1 linkup=1)
>> IPv4: 192.168.4.1/255.255.255.0 gateway 192.168.4.1
>>
>> Interface#1: st1 (ifup=1 linkup=1)
>> IPv4: 192.168.2.106/255.255.255.0 gateway 192.168.2.1
>>
>> DNS: 192.168.2.1
>>
>> Default Interface: st1 (192.168.2.106/255.255.255.0 gateway 192.168.2.1)
>>
>>
>> A couple of minutes later, server-v2 recognizes the stale connection
>> and issues a network restart, which fails resulting in the same
>> behaviour as shown below with finally forced reboot by loss of an
>> important event.
>>
>> Doing "ota status" from USB works normally, so this looks like
>> OvmsSyncHttpClient not being able to run from within a mongoose client.
>>
>> Regards,
>> Michael
>>
>>
>> Am 18.03.21 um 08:14 schrieb Mark Webb-Johnson:
>>> Tried to repeat this, but not having much success. Here is my car
>>> module, with network still up:
>>>
>>> OVMS# boot status
>>> Last boot was 262355 second(s) ago
>>>
>>>
>>> I did manage to catch one network related crash after repeatedly
>>> disconnecting and reconnecting the cellular antenna. This was:
>>>
>>> I (3717989) cellular: PPP Connection disconnected
>>> Guru Meditation Error: Core 1 panic'ed (LoadProhibited).
>>> Exception was unhandled.
>>>
>>> 0x400fe082 is in OvmsNetManager::PrioritiseAndIndicate()
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_netmanager.cpp:707).
>>>
>>> 707 if ((pri->name[0]==search[0])&&
>>>
>>>
>>> 0x400ed360 is in
>>> OvmsMetricString::SetValue(std::__cxx11::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> >)
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_metrics.cpp:1358).
>>>
>>> 1357 void OvmsMetricString::SetValue(std::string value)
>>>
>>>
>>> 0x4008bdad is at
>>> ../../../../.././newlib/libc/machine/xtensa/strcmp.S:586.
>>>
>>>
>>> 0x4008bdd1 is at
>>> ../../../../.././newlib/libc/machine/xtensa/strcmp.S:604.
>>>
>>>
>>> 0x400fe886 is in
>>> OvmsNetManager::ModemDown(std::__cxx11::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> >, void*)
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_netmanager.cpp:522).
>>>
>>> 522 PrioritiseAndIndicate();
>>>
>>>
>>> 0x400fd752 is in std::_Function_handler<void
>>> (std::__cxx11::basic_string<char, std::char_traits<char>,
>>> std::allocator<char> >, void*), std::_Bind<std::_Mem_fn<void
>>> (OvmsNetManager::*)(std::__cxx11::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> >, void*)>
>>> (OvmsNetManager*, std::_Placeholder<1>, std::_Placeholder<2>)>
>>> >::_M_invoke(std::_Any_data const&,
>>> std::__cxx11::basic_string<char, std::char_traits<char>,
>>> std::allocator<char> >&&, void*&&)
>>> (/home/openvehicles/build/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/functional:600).
>>>
>>> 600 { return
>>> (__object->*_M_pmf)(std::forward<_Args>(__args)...); }
>>>
>>>
>>> 0x400f512e is in std::function<void
>>> (std::__cxx11::basic_string<char, std::char_traits<char>,
>>> std::allocator<char> >,
>>> void*)>::operator()(std::__cxx11::basic_string<char,
>>> std::char_traits<char>, std::allocator<char> >, void*) const
>>> (/home/openvehicles/build/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/functional:2271).
>>>
>>> 2271 return _M_invoker(_M_functor,
>>> std::forward<_ArgTypes>(__args)...);
>>>
>>>
>>> 0x400f52f1 is in
>>> OvmsEvents::HandleQueueSignalEvent(event_queue_t*)
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:283).
>>>
>>> 283 m_current_callback->m_callback(m_current_event,
>>> msg->body.signal.data);
>>>
>>>
>>> 0x400f53d8 is in OvmsEvents::EventTask()
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:237).
>>>
>>> 237 HandleQueueSignalEvent(&msg);
>>>
>>>
>>> 0x400f545d is in EventLaunchTask(void*)
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:80).
>>>
>>> 80 me->EventTask();
>>>
>>>
>>> My for_v3.3 branch does include the preliminary changes to support
>>> the wifi at 20MHz bandwidth, and perhaps those could be affecting
>>> things. I do notice that if I ‘power wifi off’, then ‘wifi mode
>>> client’, it can connect to the station, but not get an IP address.
>>> I’ve just tried to merge in the latest fixes to that, and rebuilt a
>>> release. I will continue to test with that.
>>>
>>> Regards, Mark.
>>>
>>>> On 12 Mar 2021, at 10:32 PM, Michael Balzer <dexter at expeedo.de
>>>> <mailto:dexter at expeedo.de>> wrote:
>>>>
>>>> Signed PGP part
>>>> I just tried switching to for-v3.3 in my car module after tests on
>>>> my desk module were OK, and I've run into the very same problem
>>>> with for-v3.3. So the issue isn't related to esp-idf.
>>>>
>>>> The network only occasionally starts normally, but even then all
>>>> connectivity is lost after a couple of minutes.
>>>>
>>>> The stale connection watchdog in server-v2 triggers a network
>>>> restart, but that also doesn't seem to succeed:
>>>>
>>>> 2021-03-12 14:53:01.802 CET W (981652) ovms-server-v2: Detected
>>>> stale connection (issue #241), restarting network
>>>> 2021-03-12 14:53:01.802 CET I (981652) esp32wifi: Restart
>>>> 2021-03-12 14:53:01.802 CET I (981652) esp32wifi: Stopping WIFI station
>>>> 2021-03-12 14:53:01.812 CET I (981662) wifi:state: run -> init (0)
>>>> 2021-03-12 14:53:01.812 CET I (981662) wifi:pm stop, total sleep
>>>> time: 831205045 us / 975329961 us
>>>>
>>>> 2021-03-12 14:53:01.812 CET I (981662) wifi:new:<1,0>, old:<1,1>,
>>>> ap:<1,1>, sta:<1,1>, prof:1
>>>> 2021-03-12 14:53:01.832 CET I (981682) wifi:flush txq
>>>> 2021-03-12 14:53:01.842 CET I (981692) wifi:stop sw txq
>>>> 2021-03-12 14:53:01.842 CET I (981692) wifi:lmac stop hw txq
>>>> 2021-03-12 14:53:01.852 CET I (981702) esp32wifi: Powering down
>>>> WIFI driver
>>>> 2021-03-12 14:53:01.852 CET I (981702) wifi:Deinit lldesc rx mblock:16
>>>> 2021-03-12 14:53:01.862 CET I (981712) esp32wifi: Powering up WIFI
>>>> driver
>>>> 2021-03-12 14:53:01.862 CET I (981712) wifi:nvs_log_init, erase log
>>>> key successfully, reinit nvs log
>>>> 2021-03-12 14:53:01.882 CET I (981732) wifi:wifi driver task:
>>>> 3ffd4d84, prio:23, stack:3584, core=0
>>>> 2021-03-12 14:53:01.882 CET I (981732) system_api: Base MAC address
>>>> is not set, read default base MAC address from BLK0 of EFUSE
>>>> 2021-03-12 14:53:01.882 CET I (981732) system_api: Base MAC address
>>>> is not set, read default base MAC address from BLK0 of EFUSE
>>>> 2021-03-12 14:53:01.902 CET I (981752) wifi:wifi firmware version:
>>>> 30f9e79
>>>> 2021-03-12 14:53:01.912 CET I (981762) wifi:config NVS flash: enabled
>>>> 2021-03-12 14:53:01.912 CET I (981762) wifi:config nano formating:
>>>> disabled
>>>> 2021-03-12 14:53:01.912 CET I (981762) wifi:Init data frame dynamic
>>>> rx buffer num: 16
>>>> 2021-03-12 14:53:01.912 CET I (981762) wifi:Init management frame
>>>> dynamic rx buffer num: 16
>>>> 2021-03-12 14:53:01.922 CET I (981772) wifi:Init management short
>>>> buffer num: 32
>>>> 2021-03-12 14:53:01.922 CET I (981772) wifi:Init dynamic tx buffer
>>>> num: 16
>>>> 2021-03-12 14:53:01.922 CET I (981772) wifi:Init static rx buffer
>>>> size: 2212
>>>> 2021-03-12 14:53:01.922 CET I (981772) wifi:Init static rx buffer
>>>> num: 16
>>>> 2021-03-12 14:53:01.922 CET I (981772) wifi:Init dynamic rx buffer
>>>> num: 16
>>>> 2021-03-12 14:53:02.642 CET I (982492) wifi:mode : sta
>>>> (30:ae:a4:5f:e7:ec) + softAP (30:ae:a4:5f:e7:ed)
>>>> 2021-03-12 14:53:02.652 CET I (982502) wifi:Total power save buffer
>>>> number: 8
>>>> 2021-03-12 14:53:02.652 CET I (982502) cellular-modem-auto: Restart
>>>> 2021-03-12 14:53:02.662 CET I (982512) cellular: State: Enter
>>>> PowerOffOn state
>>>> 2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: Shutting down (hard)...
>>>> 2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: StatusCallBack:
>>>> User Interrupt
>>>> 2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: PPP connection has
>>>> been closed
>>>> 2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: PPP is shutdown
>>>> 2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: Shutting down (hard)...
>>>> 2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: StatusCallBack:
>>>> User Interrupt
>>>> 2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: PPP connection has
>>>> been closed
>>>> 2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: PPP is shutdown
>>>> 2021-03-12 14:53:02.672 CET I (982522) gsm-nmea: Shutdown (direct)
>>>> 2021-03-12 14:53:02.672 CET I (982522) cellular-modem-auto: Power Cycle
>>>> 2021-03-12 14:53:04.682 CET D (984532) events: Signal(system.wifi.down)
>>>> 2021-03-12 14:53:04.682 CET I (984532) netmanager: WIFI client stop
>>>> 2021-03-12 14:53:04.682 CET E (984532) netmanager: Inconsistent
>>>> state: no interface of type 'pp' found
>>>> 2021-03-12 14:53:04.682 CET I (984532) netmanager: WIFI client down
>>>> (with MODEM up): reconfigured for MODEM priority
>>>> 2021-03-12 14:53:04.692 CET D (984542) events: Signal(system.event)
>>>> 2021-03-12 14:53:04.692 CET D (984542) events:
>>>> Signal(system.wifi.sta.disconnected)
>>>> 2021-03-12 14:53:04.692 CET E (984542) netmanager: Inconsistent
>>>> state: no interface of type 'pp' found
>>>> 2021-03-12 14:53:04.692 CET I (984542) esp32wifi: STA disconnected
>>>> with reason 8 = ASSOC_LEAVE
>>>> 2021-03-12 14:53:04.702 CET D (984552) events: Signal(system.event)
>>>> 2021-03-12 14:53:04.702 CET D (984552) events:
>>>> Signal(system.wifi.sta.stop)
>>>> 2021-03-12 14:53:04.702 CET E (984552) netmanager: Inconsistent
>>>> state: no interface of type 'pp' found
>>>> 2021-03-12 14:53:04.712 CET D (984562) events: Signal(system.event)
>>>> 2021-03-12 14:53:04.712 CET D (984562) events:
>>>> Signal(system.wifi.ap.stop)
>>>> 2021-03-12 14:53:04.712 CET E (984562) netmanager: Inconsistent
>>>> state: no interface of type 'pp' found
>>>> 2021-03-12 14:53:04.712 CET I (984562) netmanager: WIFI access
>>>> point is down
>>>> 2021-03-12 14:53:04.712 CET I (984562) esp32wifi: AP stopped
>>>> 2021-03-12 14:53:04.722 CET D (984572) events:
>>>> Signal(network.wifi.sta.bad)
>>>> 2021-03-12 14:53:04.722 CET D (984572) events: Signal(system.event)
>>>> 2021-03-12 14:53:04.722 CET D (984572) events:
>>>> Signal(system.wifi.sta.start)
>>>> 2021-03-12 14:53:04.732 CET D (984582) events:
>>>> Signal(system.modem.down)
>>>> 2021-03-12 14:53:04.742 CET I (984592) netmanager: MODEM down (with
>>>> WIFI client down): network connectivity has been lost
>>>> 2021-03-12 14:53:04.742 CET D (984592) events:
>>>> Signal(system.modem.down)
>>>> 2021-03-12 14:53:04.752 CET D (984602) events: Signal(system.event)
>>>> 2021-03-12 14:53:04.752 CET D (984602) events:
>>>> Signal(system.wifi.ap.start)
>>>> 2021-03-12 14:53:04.752 CET I (984602) netmanager: WIFI access
>>>> point is up
>>>> 2021-03-12 14:53:26.802 CET E (1006652) events: SignalEvent: queue
>>>> overflow (running system.wifi.ap.start->netmanager for 23 sec),
>>>> event 'ticker.1' dropped
>>>> 2021-03-12 14:53:27.802 CET E (1007652) events: SignalEvent: queue
>>>> overflow (running system.wifi.ap.start->netmanager for 24 sec),
>>>> event 'ticker.1' dropped
>>>> 2021-03-12 14:53:28.802 CET E (1008652) events: SignalEvent: queue
>>>> overflow (running system.wifi.ap.start->netmanager for 25 sec),
>>>> event 'ticker.1' dropped
>>>> …and so on until
>>>> 2021-03-12 14:54:01.802 CET E (1041652) events: SignalEvent: lost
>>>> important event => aborting
>>>>
>>>>
>>>> I need my car now, so will switch back to master for now.
>>>>
>>>> Mark, if you've got specific debug logs I should fetch on the next
>>>> try, tell me.
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>>>> Am 12.03.21 um 05:47 schrieb Craig Leres:
>>>>> I just updated to 3.2.016-68-g8e10c6b7 and still get the network
>>>>> hang immediately after booting and logging into the web gui.
>>>>>
>>>>> But I see now my problem is likely that I'm not using the right
>>>>> esp-idf (duh). Is there a way I can have master build using
>>>>> ~/esp/esp-idf and have for-v3.3 use a different path?)
>>>>>
>>>>> Craig
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>> <http://lists.openvehicles.com/mailman/listinfo/ovmsdev>
>>>>
>>>> --
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20210322/fd27bb10/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 203 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20210322/fd27bb10/attachment-0001.sig>
More information about the OvmsDev
mailing list