[Ovmsdev] Strange crashes when logging additional info in can logs

Michael Balzer dexter at expeedo.de
Fri Nov 18 21:51:19 HKT 2022


Ludovic,

there is an even better variant of your second approach, that's changing 
MetricCallbackEntry.m_caller to a std::string.

That's the same scheme EventCallbackEntry already uses, and m_caller is 
only used to identiy the registrant, so that's straight forward.

Regards,
Michael


Am 18.11.22 um 01:49 schrieb Ludovic LANGE:
> Some news...
>
> My issue was caused because of course I made a mistake with the way of 
> registering my metrics Listener.
>
> I used the following code (see my previous post) in the constructor:
>
> + MyMetrics.RegisterListener(IDTAG.c_str(), "*", 
> std::bind(&canlog::MetricListener, this, _1));
>
> and for the destructor:
> + MyMetrics.DeregisterListener(IDTAG.c_str());
>
> I had to use IDTAG because this class has multiple instances.
>
> The issue is passing IDTAG.c_str() as a parameter ; 
> becauseMyMetrics.RegisterListener stores the value of this const char* 
> parameter and uses it 1) as an immutable parameter and 2) as a value.
>
> Of course this does not work with my first way to call 
> MyMetrics.RegisterListener :
>
>   * the memory buffer returned by IDTAG.c_str() has no more existence
>     after exiting the constructor, so this lead to memory corruption.
>   * when calling IDTAG.c_str() again in the destructor, there is
>     absolutely no chance that this c_str() call returns the same
>     memory address as the call in the constructor, so the value
>     comparison will never work.
>
>
> I rewrote the code this way by adding a new member (m_idtag) to the class:
>
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:352 @ 
> canlog::canlog(const char* type, std::string format, 
> canformat::canformat_serve_
>    m_dropcount = 0;
>    m_filtercount = 0;
>
> +  m_idtag = strdup(IDTAG.c_str());
> +
>    using std::placeholders::_1;
>    using std::placeholders::_2;
>    MyEvents.RegisterEvent(IDTAG, "*", 
> std::bind(&canlog::EventListener, this, _1, _2));
> +  if (NULL != m_idtag)
> +    {
> +    MyMetrics.RegisterListener(m_idtag, "*", 
> std::bind(&canlog::MetricListener, this, _1));
> +    }
> +  else
> +    {
> +    ESP_LOGE(TAG,"Cannot allocate memory - not registering metrics 
> listener");
> +    }
>
>    int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
>    m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:374 @ 
> canlog::canlog(const char* type, std::string format, 
> canformat::canformat_serve_
>  canlog::~canlog()
>    {
>    MyEvents.DeregisterEvent(IDTAG);
> +  if (NULL != m_idtag)
> +    {
> +    MyMetrics.DeregisterListener(m_idtag);
> +    free((void*)m_idtag);
> +    }
>
>    if (m_task)
>      {
>
>
> Another option is to change MyMetrics.RegisterListener so that it 
> store *a copy* of the parameter, and does string comparison instead of 
> a value comparison:
>
> modified: vehicle/OVMS.V3/main/ovms_metrics.cpp
> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> @ vehicle/OVMS.V3/main/ovms_metrics.cpp:1019 @ void 
> OvmsMetrics::RegisterListener(const char* caller, const char* name, 
> MetricC
>      return;
>      }
>
> +  const char * caller_copy = (const char *)strdup(caller);
> +
> +  if (NULL == caller_copy)
> +    {
> +    ESP_LOGE(TAG, "Problem registering metric %s for caller %s - 
> insufficient memory",name,caller);
> +    return;
> +    }
> +
>    MetricCallbackList *ml = k->second;
> -  ml->push_back(new MetricCallbackEntry(caller,callback));
> +  ml->push_back(new MetricCallbackEntry(caller_copy,callback));
>    }
>
>  void OvmsMetrics::DeregisterListener(const char* caller)
> @ vehicle/OVMS.V3/main/ovms_metrics.cpp:1041 @ void 
> OvmsMetrics::DeregisterListener(const char* caller)
>      while (itc!=ml->end())
>        {
>        MetricCallbackEntry* ec = *itc;
> -      if (ec->m_caller == caller)
> +      if (strcmp(ec->m_caller, caller) == 0)
>          {
> +        free((char *)ec->m_caller);
>          itc = ml->erase(itc);
>          delete ec;
>          }
>
>
> I've tested both approaches and they both solve my issue : no more 
> crashes and memory corruption.
>
> The second one is more "risky" (as it potentially impacts everyone) 
> but I think that it should not cause much regression.
>
> Let me know what you think - should I make a PR for this second approach ?
>
> Regards,
>
> Le 16/11/2022 à 22:45, Ludovic LANGE a écrit :
>>
>> Dear list,
>>
>> You may remember that I'm trying to enhance the "datalogger" side of 
>> OVMS - in different ways.
>>
>> One facet of this endeavour is to have (CAN) logs that also 
>> incorporate other data coming from OVMS. In particular, I'd like to 
>> have GNSS (GPS) data in the same log as my CAN data.
>>
>> My first try is to slightly alter `canlog.cpp` and add the ability to 
>> `LogInfo` some chosen metrics, in the same spirit of the already 
>> logged subset of events. (With CRTD format, it appears as comments)
>>
>> While doing this, I'm facing multiple crashes that are somehow strange:
>>
>>   * The crashes only occur when closing the logs (`can log stop`),
>>     never when logging
>>   * The cause of the crash is not always the same (LoadProhibited,
>>     InstrFetchProhibited)
>>   * Sometimes the PC is impacted / overwritten
>>
>> The crashes seems to always occur in a window of 20s after the `can 
>> log stop` command (In general it's immediate, but sometime can take 
>> some more seconds).
>>
>> (It lead me also to discover that the feature of our version of 
>> esp-idf that resolves the memory addresses in the serial output - 
>> `make monitor` - is broken, and that it's better to use GDB for that)
>>
>> While I am doing my homework and try to solve this, I wanted to share 
>> it here also to see if you already encountered this and/or if my 
>> changes are not done properly / not on the right OS task / etc.... - 
>> for which I lack background and experience. (In the process I think I 
>> found a non-freed variable, but it's not causing my issue - cf 
>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/774 
>> )
>>
>> Here are the changes - you'll see that the crash do occur as soon as 
>> I `LogInfo` something - after the log has been closed.
>>
>> ------------------------------------------------------------------------
>>
>> modified: vehicle/OVMS.V3/components/can/src/canlog.cpp
>> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:355 @ 
>> canlog::canlog(const char* type, std::string format, 
>> canformat::canformat_serve_
>>    using std::placeholders::_1;
>>    using std::placeholders::_2;
>>    MyEvents.RegisterEvent(IDTAG, "*", 
>> std::bind(&canlog::EventListener, this, _1, _2));
>> +  MyMetrics.RegisterListener(IDTAG.c_str(), "*", 
>> std::bind(&canlog::MetricListener, this, _1));
>>
>>    int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
>>    m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:365 @ 
>> canlog::canlog(const char* type, std::string format, 
>> canformat::canformat_serve_
>>  canlog::~canlog()
>>    {
>>    MyEvents.DeregisterEvent(IDTAG);
>> +  MyMetrics.DeregisterListener(IDTAG.c_str());
>>
>>    if (m_task)
>>      {
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:441 @ void 
>> canlog::EventListener(std::string event, void* data)
>>      LogInfo(NULL, CAN_LogInfo_Event, event.c_str());
>>    }
>>
>> +void canlog::MetricListener(OvmsMetric* metric) {
>> +    std::string name = metric->m_name;
>> +    ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
>> +  if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") || 
>> (name == "v.p.longitude") || (name == "v.p.altitude") || (name == 
>> "v.p.direction") || (name == "v.p.satcount") || (name == 
>> "m.net.wifi.sq") || (name == "m.freeram")) {
>> +    std::string metric_text = "metric: " + name;
>> +    ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
>> +    LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());
>> +  }
>> +}
>> +
>>  const char* canlog::GetType()
>>    {
>>    return m_type;
>>
>>
>> Note: "m.net.wifi.sq" / "m.freeram" are providing debugging data, so 
>> that I can quickly receive metrics after the boot. Not what I'm 
>> interested in for the final version of course, and the list will be 
>> configurable.
>>
>> ------------------------------------------------------------------------
>>
>> Here are some example of panic handler infos:
>>
>> Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception 
>> was unhandled.
>> Core 1 register dump:
>> PC      : 0x400d95f8  PS      : 0x00060830  A0      : 0x8030d32c  
>> A1      : 0x3ffc4980
>> A2      : 0x3ffeb5a8  A3      : 0x00000000  A4      : 0xfefefefe  
>> A5      : 0x3ffb5200
>> A6      : 0x3f805e78  A7      : 0x000000fe  A8      : 0x800d95f0  
>> A9      : 0x3ffc4930
>> A10     : 0x3ffeb5a8  A11     : 0x00000000  A12     : 0x00000009  
>> A13     : 0x73c80b46
>> A14     : 0x3ffc4930  A15     : 0x3f8b4164  SAR     : 0x00000004  
>> EXCCAUSE: 0x0000001c
>> EXCVADDR: 0xfefeff2e  LBEG    : 0x4008abe4  LEND    : 0x4008ac00  
>> LCOUNT  : 0xffffffff
>>
>> ELF file SHA256: eb3e5ce8f67640c8
>>
>> Backtrace: 0x400d95f8:0x3ffc4980 0x4030d329:0x3ffc49e0 
>> 0x4012122d:0x3ffc4a00 0x40121296:0x3ffc4a60 0x40120e6e:0x3ffc4a80 
>> 0x400ede1f:0x3ffc4aa0 0x400eed41:0x3ffc4ae0 0x400ecdc9:0x3ffc4b00 
>> 0x4011c5b6:0x3ffc4b40 0x4011c7b6:0x3ffc4b70 0x4011c8d8:0x3ffc4be0 0x0
>>
>>
>> 0x400d95f8 is exactly the address of the call to LogInfo in my changes:
>>
>> (gdb) list *0x400d95f8
>> 0x400d95f8 is in canlog::MetricListener(OvmsMetric*) 
>> (vehicle/OVMS.V3/components/can/src/canlog.cpp:447).
>> 442       std::string name = metric->m_name;
>> 443       ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
>> 444     if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") 
>> || (name == "v.p.longitude") || (name == "v.p.altitude") || (name == 
>> "v.p.direction") || (name == "v.p.satcount") || (name == 
>> "m.net.wifi.sq") || (name == "m.freeram")) {
>> 445       std::string metric_text = "metric: " + name;
>> 446       ESP_LOGI(TAG,"MetricListener - keep : %s", 
>> metric_text.c_str());
>> *447       LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());*
>> 448     }
>> 449    }
>> 450
>>
>> 0xfefe / 0xfefefefe seems related to some of the Heap Memory 
>> Debugging options I've activated (Cf 
>> https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/heap_debug.html 
>> ) and seems to indicate a possible use-after-free issue. I'll look to 
>> debug in that direction.
>>
>>
>> Another version (before Heap Memory Debugging activated) did that 
>> from time to time:
>>
>> ------------------------------------------------------------------------
>> Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). 
>> Exception was unhandled.
>> Core 0 register dump:
>> PC      : 0x00008100  PS      : 0x00060530  A0      : 0x802f0cf0  
>> A1      : 0x3ffd0f00
>> A2      : 0x3ffeb214  A3      : 0x00008100  A4      : 0x402f1df8  
>> A5      : 0x00000000
>> A6      : 0x00000003  A7      : 0x3ffb556c  A8      : 0x800d9540  
>> A9      : 0x3ffd0ee0
>> A10     : 0x3ffeb214  A11     : 0x00000000  A12     : 0x00000009  
>> A13     : 0x3f8b4a24
>> A14     : 0xffffffff  A15     : 0x40404040  SAR     : 0x00000018  
>> EXCCAUSE: 0x00000014
>> EXCVADDR: 0x00008100  LBEG    : 0x4008a848  LEND    : 0x4008a864  
>> LCOUNT  : 0xffffffff
>>
>> ELF file SHA256: f4d0c43aa48ef759
>>
>> Backtrace: 0x00008100:0x3ffd0f00 0x402f0ced:0x3ffd0fb0 
>> 0x4011c56d:0x3ffd0fd0 0x4011c5c6:0x3ffd1020 0x4011c24a:0x3ffd1040 
>> 0x4014f51d:0x3ffd1060 0x4014f953:0x3ffd1080 0x4014e6e3:0x3ffd11d0 
>> 0x4014ef16:0x3ffd1210 0x4014f001:0x3ffd1230 0x4014f119:0x3ffd1260 
>> 0x4014e889:0x3ffd1290 0x4014e9ed:0x3ffd1310 0x4014ebf9:0x3ffd13f0
>>
>> ------------------------------------------------------------------------
>>
>>
>> Thanks for any hint or advices you may want to share.
>>
>> Regards,
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20221118/4b2f6adc/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 203 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20221118/4b2f6adc/attachment-0001.sig>


More information about the OvmsDev mailing list