Some news...

My issue was caused because of course I made a mistake with the way of registering my metrics Listener.

I used the following code (see my previous post) in the constructor:

+  MyMetrics.RegisterListener(IDTAG.c_str(), "*", std::bind(&canlog::MetricListener, this, _1));

and for the destructor:
+  MyMetrics.DeregisterListener(IDTAG.c_str());

I had to use IDTAG because this class has multiple instances.

The issue is passing IDTAG.c_str() as a parameter ; because MyMetrics.RegisterListener stores the value of this const char* parameter and uses it 1) as an immutable parameter and 2) as a value.

Of course this does not work with my first way to call MyMetrics.RegisterListener :

I rewrote the code this way by adding a new member (m_idtag) to the class:

@ vehicle/OVMS.V3/components/can/src/canlog.cpp:352 @ canlog::canlog(const char* type, std::string format, canformat::canformat_serve_
   m_dropcount = 0;
   m_filtercount = 0;
 
+  m_idtag = strdup(IDTAG.c_str());
+
   using std::placeholders::_1;
   using std::placeholders::_2;
   MyEvents.RegisterEvent(IDTAG, "*", std::bind(&canlog::EventListener, this, _1, _2));
+  if (NULL != m_idtag)
+    {
+    MyMetrics.RegisterListener(m_idtag, "*", std::bind(&canlog::MetricListener, this, _1));
+    }
+  else
+    {
+    ESP_LOGE(TAG,"Cannot allocate memory - not registering metrics listener");
+    }
 
   int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
   m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
@ vehicle/OVMS.V3/components/can/src/canlog.cpp:374 @ canlog::canlog(const char* type, std::string format, canformat::canformat_serve_
 canlog::~canlog()
   {
   MyEvents.DeregisterEvent(IDTAG);
+  if (NULL != m_idtag)
+    {
+    MyMetrics.DeregisterListener(m_idtag);
+    free((void*)m_idtag);
+    }
 
   if (m_task)
     {


Another option is to change MyMetrics.RegisterListener so that it store *a copy* of the parameter, and does string comparison instead of a value comparison:

modified: vehicle/OVMS.V3/main/ovms_metrics.cpp
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
@ vehicle/OVMS.V3/main/ovms_metrics.cpp:1019 @ void OvmsMetrics::RegisterListener(const char* caller, const char* name, MetricC
     return;
     }
 
+  const char * caller_copy = (const char *)strdup(caller);
+
+  if (NULL == caller_copy)
+    {
+    ESP_LOGE(TAG, "Problem registering metric %s for caller %s - insufficient memory",name,caller);
+    return;
+    }
+
   MetricCallbackList *ml = k->second;
-  ml->push_back(new MetricCallbackEntry(caller,callback));
+  ml->push_back(new MetricCallbackEntry(caller_copy,callback));
   }
 
 void OvmsMetrics::DeregisterListener(const char* caller)
@ vehicle/OVMS.V3/main/ovms_metrics.cpp:1041 @ void OvmsMetrics::DeregisterListener(const char* caller)
     while (itc!=ml->end())
       {
       MetricCallbackEntry* ec = *itc;
-      if (ec->m_caller == caller)
+      if (strcmp(ec->m_caller, caller) == 0)
         {
+        free((char *)ec->m_caller);
         itc = ml->erase(itc);
         delete ec;
         }


I've tested both approaches and they both solve my issue : no more crashes and memory corruption.

The second one is more "risky" (as it potentially impacts everyone) but I think that it should not cause much regression.

Let me know what you think - should I make a PR for this second approach ?

Regards,

Le 16/11/2022 à 22:45, Ludovic LANGE a écrit :

Dear list,

You may remember that I'm trying to enhance the "datalogger" side of OVMS - in different ways.

One facet of this endeavour is to have (CAN) logs that also incorporate other data coming from OVMS. In particular, I'd like to have GNSS (GPS) data in the same log as my CAN data.

My first try is to slightly alter `canlog.cpp` and add the ability to `LogInfo` some chosen metrics, in the same spirit of the already logged subset of events. (With CRTD format, it appears as comments)

While doing this, I'm facing multiple crashes that are somehow strange:

The crashes seems to always occur in a window of 20s after the `can log stop` command (In general it's immediate, but sometime can take some more seconds).

(It lead me also to discover that the feature of our version of esp-idf that resolves the memory addresses in the serial output - `make monitor` - is broken, and that it's better to use GDB for that)

While I am doing my homework and try to solve this, I wanted to share it here also to see if you already encountered this and/or if my changes are not done properly / not on the right OS task / etc.... - for which I lack background and experience. (In the process I think I found a non-freed variable, but it's not causing my issue - cf https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/774 )

Here are the changes - you'll see that the crash do occur as soon as I `LogInfo` something - after the log has been closed.


modified: vehicle/OVMS.V3/components/can/src/canlog.cpp
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
@ vehicle/OVMS.V3/components/can/src/canlog.cpp:355 @ canlog::canlog(const char* type, std::string format, canformat::canformat_serve_
   using std::placeholders::_1;
   using std::placeholders::_2;
   MyEvents.RegisterEvent(IDTAG, "*", std::bind(&canlog::EventListener, this, _1, _2));
+  MyMetrics.RegisterListener(IDTAG.c_str(), "*", std::bind(&canlog::MetricListener, this, _1));
 
   int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
   m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
@ vehicle/OVMS.V3/components/can/src/canlog.cpp:365 @ canlog::canlog(const char* type, std::string format, canformat::canformat_serve_
 canlog::~canlog()
   {
   MyEvents.DeregisterEvent(IDTAG);
+  MyMetrics.DeregisterListener(IDTAG.c_str());
 
   if (m_task)
     {
@ vehicle/OVMS.V3/components/can/src/canlog.cpp:441 @ void canlog::EventListener(std::string event, void* data)
     LogInfo(NULL, CAN_LogInfo_Event, event.c_str());
   }
 
+void canlog::MetricListener(OvmsMetric* metric) {
+    std::string name = metric->m_name;
+    ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
+  if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") || (name == "v.p.longitude") || (name == "v.p.altitude") || (name == "v.p.direction") || (name == "v.p.satcount") || (name == "m.net.wifi.sq") || (name == "m.freeram")) {
+    std::string metric_text = "metric: " + name;
+    ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
+    LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());
+  }
+}
+
 const char* canlog::GetType()
   {
   return m_type;


Note: "m.net.wifi.sq" / "m.freeram" are providing debugging data, so that I can quickly receive metrics after the boot. Not what I'm interested in for the final version of course, and the list will be configurable.


Here are some example of panic handler infos:

Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC      : 0x400d95f8  PS      : 0x00060830  A0      : 0x8030d32c  A1      : 0x3ffc4980  
A2      : 0x3ffeb5a8  A3      : 0x00000000  A4      : 0xfefefefe  A5      : 0x3ffb5200  
A6      : 0x3f805e78  A7      : 0x000000fe  A8      : 0x800d95f0  A9      : 0x3ffc4930  
A10     : 0x3ffeb5a8  A11     : 0x00000000  A12     : 0x00000009  A13     : 0x73c80b46  
A14     : 0x3ffc4930  A15     : 0x3f8b4164  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0xfefeff2e  LBEG    : 0x4008abe4  LEND    : 0x4008ac00  LCOUNT  : 0xffffffff  

ELF file SHA256: eb3e5ce8f67640c8

Backtrace: 0x400d95f8:0x3ffc4980 0x4030d329:0x3ffc49e0 0x4012122d:0x3ffc4a00 0x40121296:0x3ffc4a60 0x40120e6e:0x3ffc4a80 0x400ede1f:0x3ffc4aa0 0x400eed41:0x3ffc4ae0 0x400ecdc9:0x3ffc4b00 0x4011c5b6:0x3ffc4b40 0x4011c7b6:0x3ffc4b70 0x4011c8d8:0x3ffc4be0 0x0


0x400d95f8 is exactly the address of the call to LogInfo in my changes:

(gdb) list *0x400d95f8
0x400d95f8 is in canlog::MetricListener(OvmsMetric*) (vehicle/OVMS.V3/components/can/src/canlog.cpp:447).
442       std::string name = metric->m_name;
443       ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
444     if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") || (name == "v.p.longitude") || (name == "v.p.altitude") || (name == "v.p.direction") || (name == "v.p.satcount") || (name == "m.net.wifi.sq") || (name == "m.freeram")) {
445       std::string metric_text = "metric: " + name;
446       ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
447       LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());
448     }
449    }
450   

0xfefe / 0xfefefefe seems related to some of the Heap Memory Debugging options I've activated (Cf https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/heap_debug.html ) and seems to indicate a possible use-after-free issue. I'll look to debug in that direction.


Another version (before Heap Memory Debugging activated) did that from time to time:


Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x00008100  PS      : 0x00060530  A0      : 0x802f0cf0  A1      : 0x3ffd0f00  
A2      : 0x3ffeb214  A3      : 0x00008100  A4      : 0x402f1df8  A5      : 0x00000000  
A6      : 0x00000003  A7      : 0x3ffb556c  A8      : 0x800d9540  A9      : 0x3ffd0ee0  
A10     : 0x3ffeb214  A11     : 0x00000000  A12     : 0x00000009  A13     : 0x3f8b4a24  
A14     : 0xffffffff  A15     : 0x40404040  SAR     : 0x00000018  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00008100  LBEG    : 0x4008a848  LEND    : 0x4008a864  LCOUNT  : 0xffffffff  

ELF file SHA256: f4d0c43aa48ef759

Backtrace: 0x00008100:0x3ffd0f00 0x402f0ced:0x3ffd0fb0 0x4011c56d:0x3ffd0fd0 0x4011c5c6:0x3ffd1020 0x4011c24a:0x3ffd1040 0x4014f51d:0x3ffd1060 0x4014f953:0x3ffd1080 0x4014e6e3:0x3ffd11d0 0x4014ef16:0x3ffd1210 0x4014f001:0x3ffd1230 0x4014f119:0x3ffd1260 0x4014e889:0x3ffd1290 0x4014e9ed:0x3ffd1310 0x4014ebf9:0x3ffd13f0




Thanks for any hint or advices you may want to share.

Regards,


_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev