Not sure how to resolve this.

OvmsSyncHttpClient is currently used in commands from ovms_plugins and ovms_ota.

I could bring back the OvmsHttpClient blocking (non-mongoose) implementation, but I don’t think that would address the core problem here:

Inside a mongoose callback (inside the mongoose networking task), we are making blocking calls (and in particular calls that could block for several tens of seconds).

But fundamentally is it ok to block the mongoose networking task for extended periods during a mongoose event callback?

Mark

On 21 Mar 2021, at 9:57 PM, Michael Balzer <dexter@expeedo.de> wrote:

Signed PGP part
I've found opening the web UI firmware page or calling "ota status" via ssh to consistently deadlock the network on my module.

I (130531) webserver: HTTP GET /cfg/firmware
D (130531) http: OvmsSyncHttpClient: Connect to ovms.dexters-web.de:80
D (130541) http: OvmsSyncHttpClient: waiting for completion


After that log message, the network is dead, and the netmanager also doesn't respond:

OVMS# network list
ERROR: job failed
D (183241) netmanager: send cmd 1 from 0x3ffe7054
W (193241) netmanager: ExecuteJob: cmd 1: timeout


The interfaces seem to be registered and online, but nothing gets in or out:

OVMS# network status
Interface#3: pp3 (ifup=1 linkup=1)
  IPv4: 10.170.195.13/255.255.255.255 gateway 10.64.64.64

Interface#2: ap2 (ifup=1 linkup=1)
  IPv4: 192.168.4.1/255.255.255.0 gateway 192.168.4.1

Interface#1: st1 (ifup=1 linkup=1)
  IPv4: 192.168.2.106/255.255.255.0 gateway 192.168.2.1

DNS: 192.168.2.1

Default Interface: st1 (192.168.2.106/255.255.255.0 gateway 192.168.2.1)



A couple of minutes later, server-v2 recognizes the stale connection and issues a network restart, which fails resulting in the same behaviour as shown below with finally forced reboot by loss of an important event.

Doing "ota status" from USB works normally, so this looks like OvmsSyncHttpClient not being able to run from within a mongoose client.

Regards,
Michael


Am 18.03.21 um 08:14 schrieb Mark Webb-Johnson:
Tried to repeat this, but not having much success. Here is my car module, with network still up:

OVMS# boot status
Last boot was 262355 second(s) ago

I did manage to catch one network related crash after repeatedly disconnecting and reconnecting the cellular antenna. This was:

I (3717989) cellular: PPP Connection disconnected
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.

0x400fe082 is in OvmsNetManager::PrioritiseAndIndicate() (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_netmanager.cpp:707).
707         if ((pri->name[0]==search[0])&&

0x400ed360 is in OvmsMetricString::SetValue(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_metrics.cpp:1358).
1357    void OvmsMetricString::SetValue(std::string value)

0x4008bdad is at ../../../../.././newlib/libc/machine/xtensa/strcmp.S:586.

0x4008bdd1 is at ../../../../.././newlib/libc/machine/xtensa/strcmp.S:604.

0x400fe886 is in OvmsNetManager::ModemDown(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_netmanager.cpp:522).
522         PrioritiseAndIndicate();

0x400fd752 is in std::_Function_handler<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*), std::_Bind<std::_Mem_fn<void (OvmsNetManager::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*)> (OvmsNetManager*, std::_Placeholder<1>, std::_Placeholder<2>)> >::_M_invoke(std::_Any_data const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, void*&&) (/home/openvehicles/build/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/functional:600).
600             { return (__object->*_M_pmf)(std::forward<_Args>(__args)...); }

0x400f512e is in std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*)>::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*) const (/home/openvehicles/build/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/5.2.0/functional:2271).
2271          return _M_invoker(_M_functor, std::forward<_ArgTypes>(__args)...);

0x400f52f1 is in OvmsEvents::HandleQueueSignalEvent(event_queue_t*) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:283).
283             m_current_callback->m_callback(m_current_event, msg->body.signal.data);

0x400f53d8 is in OvmsEvents::EventTask() (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:237).
237               HandleQueueSignalEvent(&msg);

0x400f545d is in EventLaunchTask(void*) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-pre/vehicle/OVMS.V3/main/ovms_events.cpp:80).
80        me->EventTask();

My for_v3.3 branch does include the preliminary changes to support the wifi at 20MHz bandwidth, and perhaps those could be affecting things. I do notice that if I ‘power wifi off’, then ‘wifi mode client’, it can connect to the station, but not get an IP address. I’ve just tried to merge in the latest fixes to that, and rebuilt a release. I will continue to test with that.

Regards, Mark.

On 12 Mar 2021, at 10:32 PM, Michael Balzer <dexter@expeedo.de> wrote:

Signed PGP part
I just tried switching to for-v3.3 in my car module after tests on my desk module were OK, and I've run into the very same problem with for-v3.3. So the issue isn't related to esp-idf.

The network only occasionally starts normally, but even then all connectivity is lost after a couple of minutes.

The stale connection watchdog in server-v2 triggers a network restart, but that also doesn't seem to succeed:

2021-03-12 14:53:01.802 CET W (981652) ovms-server-v2: Detected stale connection (issue #241), restarting network
2021-03-12 14:53:01.802 CET I (981652) esp32wifi: Restart
2021-03-12 14:53:01.802 CET I (981652) esp32wifi: Stopping WIFI station
2021-03-12 14:53:01.812 CET I (981662) wifi:state: run -> init (0)
2021-03-12 14:53:01.812 CET I (981662) wifi:pm stop, total sleep time: 831205045 us / 975329961 us

2021-03-12 14:53:01.812 CET I (981662) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<1,1>, prof:1
2021-03-12 14:53:01.832 CET I (981682) wifi:flush txq
2021-03-12 14:53:01.842 CET I (981692) wifi:stop sw txq
2021-03-12 14:53:01.842 CET I (981692) wifi:lmac stop hw txq
2021-03-12 14:53:01.852 CET I (981702) esp32wifi: Powering down WIFI driver
2021-03-12 14:53:01.852 CET I (981702) wifi:Deinit lldesc rx mblock:16
2021-03-12 14:53:01.862 CET I (981712) esp32wifi: Powering up WIFI driver
2021-03-12 14:53:01.862 CET I (981712) wifi:nvs_log_init, erase log key successfully, reinit nvs log
2021-03-12 14:53:01.882 CET I (981732) wifi:wifi driver task: 3ffd4d84, prio:23, stack:3584, core=0
2021-03-12 14:53:01.882 CET I (981732) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE
2021-03-12 14:53:01.882 CET I (981732) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE
2021-03-12 14:53:01.902 CET I (981752) wifi:wifi firmware version: 30f9e79
2021-03-12 14:53:01.912 CET I (981762) wifi:config NVS flash: enabled
2021-03-12 14:53:01.912 CET I (981762) wifi:config nano formating: disabled
2021-03-12 14:53:01.912 CET I (981762) wifi:Init data frame dynamic rx buffer num: 16
2021-03-12 14:53:01.912 CET I (981762) wifi:Init management frame dynamic rx buffer num: 16
2021-03-12 14:53:01.922 CET I (981772) wifi:Init management short buffer num: 32
2021-03-12 14:53:01.922 CET I (981772) wifi:Init dynamic tx buffer num: 16
2021-03-12 14:53:01.922 CET I (981772) wifi:Init static rx buffer size: 2212
2021-03-12 14:53:01.922 CET I (981772) wifi:Init static rx buffer num: 16
2021-03-12 14:53:01.922 CET I (981772) wifi:Init dynamic rx buffer num: 16
2021-03-12 14:53:02.642 CET I (982492) wifi:mode : sta (30:ae:a4:5f:e7:ec) + softAP (30:ae:a4:5f:e7:ed)
2021-03-12 14:53:02.652 CET I (982502) wifi:Total power save buffer number: 8
2021-03-12 14:53:02.652 CET I (982502) cellular-modem-auto: Restart
2021-03-12 14:53:02.662 CET I (982512) cellular: State: Enter PowerOffOn state
2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: Shutting down (hard)...
2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: StatusCallBack: User Interrupt
2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: PPP connection has been closed
2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: PPP is shutdown
2021-03-12 14:53:02.662 CET I (982512) gsm-ppp: Shutting down (hard)...
2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: StatusCallBack: User Interrupt
2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: PPP connection has been closed
2021-03-12 14:53:02.672 CET I (982522) gsm-ppp: PPP is shutdown
2021-03-12 14:53:02.672 CET I (982522) gsm-nmea: Shutdown (direct)
2021-03-12 14:53:02.672 CET I (982522) cellular-modem-auto: Power Cycle
2021-03-12 14:53:04.682 CET D (984532) events: Signal(system.wifi.down)
2021-03-12 14:53:04.682 CET I (984532) netmanager: WIFI client stop
2021-03-12 14:53:04.682 CET E (984532) netmanager: Inconsistent state: no interface of type 'pp' found
2021-03-12 14:53:04.682 CET I (984532) netmanager: WIFI client down (with MODEM up): reconfigured for MODEM priority
2021-03-12 14:53:04.692 CET D (984542) events: Signal(system.event)
2021-03-12 14:53:04.692 CET D (984542) events: Signal(system.wifi.sta.disconnected)
2021-03-12 14:53:04.692 CET E (984542) netmanager: Inconsistent state: no interface of type 'pp' found
2021-03-12 14:53:04.692 CET I (984542) esp32wifi: STA disconnected with reason 8 = ASSOC_LEAVE
2021-03-12 14:53:04.702 CET D (984552) events: Signal(system.event)
2021-03-12 14:53:04.702 CET D (984552) events: Signal(system.wifi.sta.stop)
2021-03-12 14:53:04.702 CET E (984552) netmanager: Inconsistent state: no interface of type 'pp' found
2021-03-12 14:53:04.712 CET D (984562) events: Signal(system.event)
2021-03-12 14:53:04.712 CET D (984562) events: Signal(system.wifi.ap.stop)
2021-03-12 14:53:04.712 CET E (984562) netmanager: Inconsistent state: no interface of type 'pp' found
2021-03-12 14:53:04.712 CET I (984562) netmanager: WIFI access point is down
2021-03-12 14:53:04.712 CET I (984562) esp32wifi: AP stopped
2021-03-12 14:53:04.722 CET D (984572) events: Signal(network.wifi.sta.bad)
2021-03-12 14:53:04.722 CET D (984572) events: Signal(system.event)
2021-03-12 14:53:04.722 CET D (984572) events: Signal(system.wifi.sta.start)
2021-03-12 14:53:04.732 CET D (984582) events: Signal(system.modem.down)
2021-03-12 14:53:04.742 CET I (984592) netmanager: MODEM down (with WIFI client down): network connectivity has been lost
2021-03-12 14:53:04.742 CET D (984592) events: Signal(system.modem.down)
2021-03-12 14:53:04.752 CET D (984602) events: Signal(system.event)
2021-03-12 14:53:04.752 CET D (984602) events: Signal(system.wifi.ap.start)
2021-03-12 14:53:04.752 CET I (984602) netmanager: WIFI access point is up
2021-03-12 14:53:26.802 CET E (1006652) events: SignalEvent: queue overflow (running system.wifi.ap.start->netmanager for 23 sec), event 'ticker.1' dropped
2021-03-12 14:53:27.802 CET E (1007652) events: SignalEvent: queue overflow (running system.wifi.ap.start->netmanager for 24 sec), event 'ticker.1' dropped
2021-03-12 14:53:28.802 CET E (1008652) events: SignalEvent: queue overflow (running system.wifi.ap.start->netmanager for 25 sec), event 'ticker.1' dropped
…and so on until
2021-03-12 14:54:01.802 CET E (1041652) events: SignalEvent: lost important event => aborting


I need my car now, so will switch back to master for now.

Mark, if you've got specific debug logs I should fetch on the next try, tell me.

Regards,
Michael


Am 12.03.21 um 05:47 schrieb Craig Leres:
I just updated to 3.2.016-68-g8e10c6b7 and still get the network hang immediately after booting and logging into the web gui.

But I see now my problem is likely that I'm not using the right esp-idf (duh). Is there a way I can have master build using ~/esp/esp-idf and have for-v3.3 use a different path?)

        Craig
_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev

--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26





_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26