[Ovmsdev] Strange crashes when logging additional info in can logs

Ludovic LANGE ll-ovmsdev at lange.nom.fr
Fri Nov 18 08:49:36 HKT 2022


Some news...

My issue was caused because of course I made a mistake with the way of 
registering my metrics Listener.

I used the following code (see my previous post) in the constructor:

+ MyMetrics.RegisterListener(IDTAG.c_str(), "*", 
std::bind(&canlog::MetricListener, this, _1));

and for the destructor:
+ MyMetrics.DeregisterListener(IDTAG.c_str());

I had to use IDTAG because this class has multiple instances.

The issue is passing IDTAG.c_str() as a parameter ; 
becauseMyMetrics.RegisterListener stores the value of this const char* 
parameter and uses it 1) as an immutable parameter and 2) as a value.

Of course this does not work with my first way to call 
MyMetrics.RegisterListener :

  * the memory buffer returned by IDTAG.c_str() has no more existence
    after exiting the constructor, so this lead to memory corruption.
  * when calling IDTAG.c_str() again in the destructor, there is
    absolutely no chance that this c_str() call returns the same memory
    address as the call in the constructor, so the value comparison will
    never work.


I rewrote the code this way by adding a new member (m_idtag) to the class:

@ vehicle/OVMS.V3/components/can/src/canlog.cpp:352 @ 
canlog::canlog(const char* type, std::string format, 
canformat::canformat_serve_
    m_dropcount = 0;
    m_filtercount = 0;

+  m_idtag = strdup(IDTAG.c_str());
+
    using std::placeholders::_1;
    using std::placeholders::_2;
    MyEvents.RegisterEvent(IDTAG, "*", std::bind(&canlog::EventListener, 
this, _1, _2));
+  if (NULL != m_idtag)
+    {
+    MyMetrics.RegisterListener(m_idtag, "*", 
std::bind(&canlog::MetricListener, this, _1));
+    }
+  else
+    {
+    ESP_LOGE(TAG,"Cannot allocate memory - not registering metrics 
listener");
+    }

    int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
    m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
@ vehicle/OVMS.V3/components/can/src/canlog.cpp:374 @ 
canlog::canlog(const char* type, std::string format, 
canformat::canformat_serve_
  canlog::~canlog()
    {
    MyEvents.DeregisterEvent(IDTAG);
+  if (NULL != m_idtag)
+    {
+    MyMetrics.DeregisterListener(m_idtag);
+    free((void*)m_idtag);
+    }

    if (m_task)
      {


Another option is to change MyMetrics.RegisterListener so that it store 
*a copy* of the parameter, and does string comparison instead of a value 
comparison:

modified: vehicle/OVMS.V3/main/ovms_metrics.cpp
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
@ vehicle/OVMS.V3/main/ovms_metrics.cpp:1019 @ void 
OvmsMetrics::RegisterListener(const char* caller, const char* name, MetricC
      return;
      }

+  const char * caller_copy = (const char *)strdup(caller);
+
+  if (NULL == caller_copy)
+    {
+    ESP_LOGE(TAG, "Problem registering metric %s for caller %s - 
insufficient memory",name,caller);
+    return;
+    }
+
    MetricCallbackList *ml = k->second;
-  ml->push_back(new MetricCallbackEntry(caller,callback));
+  ml->push_back(new MetricCallbackEntry(caller_copy,callback));
    }

  void OvmsMetrics::DeregisterListener(const char* caller)
@ vehicle/OVMS.V3/main/ovms_metrics.cpp:1041 @ void 
OvmsMetrics::DeregisterListener(const char* caller)
      while (itc!=ml->end())
        {
        MetricCallbackEntry* ec = *itc;
-      if (ec->m_caller == caller)
+      if (strcmp(ec->m_caller, caller) == 0)
          {
+        free((char *)ec->m_caller);
          itc = ml->erase(itc);
          delete ec;
          }


I've tested both approaches and they both solve my issue : no more 
crashes and memory corruption.

The second one is more "risky" (as it potentially impacts everyone) but 
I think that it should not cause much regression.

Let me know what you think - should I make a PR for this second approach ?

Regards,

Le 16/11/2022 à 22:45, Ludovic LANGE a écrit :
>
> Dear list,
>
> You may remember that I'm trying to enhance the "datalogger" side of 
> OVMS - in different ways.
>
> One facet of this endeavour is to have (CAN) logs that also 
> incorporate other data coming from OVMS. In particular, I'd like to 
> have GNSS (GPS) data in the same log as my CAN data.
>
> My first try is to slightly alter `canlog.cpp` and add the ability to 
> `LogInfo` some chosen metrics, in the same spirit of the already 
> logged subset of events. (With CRTD format, it appears as comments)
>
> While doing this, I'm facing multiple crashes that are somehow strange:
>
>   * The crashes only occur when closing the logs (`can log stop`),
>     never when logging
>   * The cause of the crash is not always the same (LoadProhibited,
>     InstrFetchProhibited)
>   * Sometimes the PC is impacted / overwritten
>
> The crashes seems to always occur in a window of 20s after the `can 
> log stop` command (In general it's immediate, but sometime can take 
> some more seconds).
>
> (It lead me also to discover that the feature of our version of 
> esp-idf that resolves the memory addresses in the serial output - 
> `make monitor` - is broken, and that it's better to use GDB for that)
>
> While I am doing my homework and try to solve this, I wanted to share 
> it here also to see if you already encountered this and/or if my 
> changes are not done properly / not on the right OS task / etc.... - 
> for which I lack background and experience. (In the process I think I 
> found a non-freed variable, but it's not causing my issue - cf 
> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/774 
> )
>
> Here are the changes - you'll see that the crash do occur as soon as I 
> `LogInfo` something - after the log has been closed.
>
> ------------------------------------------------------------------------
>
> modified: vehicle/OVMS.V3/components/can/src/canlog.cpp
> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:355 @ 
> canlog::canlog(const char* type, std::string format, 
> canformat::canformat_serve_
>    using std::placeholders::_1;
>    using std::placeholders::_2;
>    MyEvents.RegisterEvent(IDTAG, "*", 
> std::bind(&canlog::EventListener, this, _1, _2));
> +  MyMetrics.RegisterListener(IDTAG.c_str(), "*", 
> std::bind(&canlog::MetricListener, this, _1));
>
>    int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
>    m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:365 @ 
> canlog::canlog(const char* type, std::string format, 
> canformat::canformat_serve_
>  canlog::~canlog()
>    {
>    MyEvents.DeregisterEvent(IDTAG);
> +  MyMetrics.DeregisterListener(IDTAG.c_str());
>
>    if (m_task)
>      {
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:441 @ void 
> canlog::EventListener(std::string event, void* data)
>      LogInfo(NULL, CAN_LogInfo_Event, event.c_str());
>    }
>
> +void canlog::MetricListener(OvmsMetric* metric) {
> +    std::string name = metric->m_name;
> +    ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
> +  if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") || 
> (name == "v.p.longitude") || (name == "v.p.altitude") || (name == 
> "v.p.direction") || (name == "v.p.satcount") || (name == 
> "m.net.wifi.sq") || (name == "m.freeram")) {
> +    std::string metric_text = "metric: " + name;
> +    ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
> +    LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());
> +  }
> +}
> +
>  const char* canlog::GetType()
>    {
>    return m_type;
>
>
> Note: "m.net.wifi.sq" / "m.freeram" are providing debugging data, so 
> that I can quickly receive metrics after the boot. Not what I'm 
> interested in for the final version of course, and the list will be 
> configurable.
>
> ------------------------------------------------------------------------
>
> Here are some example of panic handler infos:
>
> Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception 
> was unhandled.
> Core 1 register dump:
> PC      : 0x400d95f8  PS      : 0x00060830  A0      : 0x8030d32c  
> A1      : 0x3ffc4980
> A2      : 0x3ffeb5a8  A3      : 0x00000000  A4      : 0xfefefefe  
> A5      : 0x3ffb5200
> A6      : 0x3f805e78  A7      : 0x000000fe  A8      : 0x800d95f0  
> A9      : 0x3ffc4930
> A10     : 0x3ffeb5a8  A11     : 0x00000000  A12     : 0x00000009  
> A13     : 0x73c80b46
> A14     : 0x3ffc4930  A15     : 0x3f8b4164  SAR     : 0x00000004  
> EXCCAUSE: 0x0000001c
> EXCVADDR: 0xfefeff2e  LBEG    : 0x4008abe4  LEND    : 0x4008ac00  
> LCOUNT  : 0xffffffff
>
> ELF file SHA256: eb3e5ce8f67640c8
>
> Backtrace: 0x400d95f8:0x3ffc4980 0x4030d329:0x3ffc49e0 
> 0x4012122d:0x3ffc4a00 0x40121296:0x3ffc4a60 0x40120e6e:0x3ffc4a80 
> 0x400ede1f:0x3ffc4aa0 0x400eed41:0x3ffc4ae0 0x400ecdc9:0x3ffc4b00 
> 0x4011c5b6:0x3ffc4b40 0x4011c7b6:0x3ffc4b70 0x4011c8d8:0x3ffc4be0 0x0
>
>
> 0x400d95f8 is exactly the address of the call to LogInfo in my changes:
>
> (gdb) list *0x400d95f8
> 0x400d95f8 is in canlog::MetricListener(OvmsMetric*) 
> (vehicle/OVMS.V3/components/can/src/canlog.cpp:447).
> 442       std::string name = metric->m_name;
> 443       ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
> 444     if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") || 
> (name == "v.p.longitude") || (name == "v.p.altitude") || (name == 
> "v.p.direction") || (name == "v.p.satcount") || (name == 
> "m.net.wifi.sq") || (name == "m.freeram")) {
> 445       std::string metric_text = "metric: " + name;
> 446       ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
> *447       LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());*
> 448     }
> 449    }
> 450
>
> 0xfefe / 0xfefefefe seems related to some of the Heap Memory Debugging 
> options I've activated (Cf 
> https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/heap_debug.html 
> ) and seems to indicate a possible use-after-free issue. I'll look to 
> debug in that direction.
>
>
> Another version (before Heap Memory Debugging activated) did that from 
> time to time:
>
> ------------------------------------------------------------------------
> Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). 
> Exception was unhandled.
> Core 0 register dump:
> PC      : 0x00008100  PS      : 0x00060530  A0      : 0x802f0cf0  
> A1      : 0x3ffd0f00
> A2      : 0x3ffeb214  A3      : 0x00008100  A4      : 0x402f1df8  
> A5      : 0x00000000
> A6      : 0x00000003  A7      : 0x3ffb556c  A8      : 0x800d9540  
> A9      : 0x3ffd0ee0
> A10     : 0x3ffeb214  A11     : 0x00000000  A12     : 0x00000009  
> A13     : 0x3f8b4a24
> A14     : 0xffffffff  A15     : 0x40404040  SAR     : 0x00000018  
> EXCCAUSE: 0x00000014
> EXCVADDR: 0x00008100  LBEG    : 0x4008a848  LEND    : 0x4008a864  
> LCOUNT  : 0xffffffff
>
> ELF file SHA256: f4d0c43aa48ef759
>
> Backtrace: 0x00008100:0x3ffd0f00 0x402f0ced:0x3ffd0fb0 
> 0x4011c56d:0x3ffd0fd0 0x4011c5c6:0x3ffd1020 0x4011c24a:0x3ffd1040 
> 0x4014f51d:0x3ffd1060 0x4014f953:0x3ffd1080 0x4014e6e3:0x3ffd11d0 
> 0x4014ef16:0x3ffd1210 0x4014f001:0x3ffd1230 0x4014f119:0x3ffd1260 
> 0x4014e889:0x3ffd1290 0x4014e9ed:0x3ffd1310 0x4014ebf9:0x3ffd13f0
>
> ------------------------------------------------------------------------
>
>
> Thanks for any hint or advices you may want to share.
>
> Regards,
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20221118/f2ac92b6/attachment-0001.htm>


More information about the OvmsDev mailing list