[Ovmsdev] Strange crashes when logging additional info in can logs
Michael Balzer
dexter at expeedo.de
Fri Nov 18 21:51:19 HKT 2022
Ludovic,
there is an even better variant of your second approach, that's changing
MetricCallbackEntry.m_caller to a std::string.
That's the same scheme EventCallbackEntry already uses, and m_caller is
only used to identiy the registrant, so that's straight forward.
Regards,
Michael
Am 18.11.22 um 01:49 schrieb Ludovic LANGE:
> Some news...
>
> My issue was caused because of course I made a mistake with the way of
> registering my metrics Listener.
>
> I used the following code (see my previous post) in the constructor:
>
> + MyMetrics.RegisterListener(IDTAG.c_str(), "*",
> std::bind(&canlog::MetricListener, this, _1));
>
> and for the destructor:
> + MyMetrics.DeregisterListener(IDTAG.c_str());
>
> I had to use IDTAG because this class has multiple instances.
>
> The issue is passing IDTAG.c_str() as a parameter ;
> becauseMyMetrics.RegisterListener stores the value of this const char*
> parameter and uses it 1) as an immutable parameter and 2) as a value.
>
> Of course this does not work with my first way to call
> MyMetrics.RegisterListener :
>
> * the memory buffer returned by IDTAG.c_str() has no more existence
> after exiting the constructor, so this lead to memory corruption.
> * when calling IDTAG.c_str() again in the destructor, there is
> absolutely no chance that this c_str() call returns the same
> memory address as the call in the constructor, so the value
> comparison will never work.
>
>
> I rewrote the code this way by adding a new member (m_idtag) to the class:
>
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:352 @
> canlog::canlog(const char* type, std::string format,
> canformat::canformat_serve_
> m_dropcount = 0;
> m_filtercount = 0;
>
> + m_idtag = strdup(IDTAG.c_str());
> +
> using std::placeholders::_1;
> using std::placeholders::_2;
> MyEvents.RegisterEvent(IDTAG, "*",
> std::bind(&canlog::EventListener, this, _1, _2));
> + if (NULL != m_idtag)
> + {
> + MyMetrics.RegisterListener(m_idtag, "*",
> std::bind(&canlog::MetricListener, this, _1));
> + }
> + else
> + {
> + ESP_LOGE(TAG,"Cannot allocate memory - not registering metrics
> listener");
> + }
>
> int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
> m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:374 @
> canlog::canlog(const char* type, std::string format,
> canformat::canformat_serve_
> canlog::~canlog()
> {
> MyEvents.DeregisterEvent(IDTAG);
> + if (NULL != m_idtag)
> + {
> + MyMetrics.DeregisterListener(m_idtag);
> + free((void*)m_idtag);
> + }
>
> if (m_task)
> {
>
>
> Another option is to change MyMetrics.RegisterListener so that it
> store *a copy* of the parameter, and does string comparison instead of
> a value comparison:
>
> modified: vehicle/OVMS.V3/main/ovms_metrics.cpp
> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> @ vehicle/OVMS.V3/main/ovms_metrics.cpp:1019 @ void
> OvmsMetrics::RegisterListener(const char* caller, const char* name,
> MetricC
> return;
> }
>
> + const char * caller_copy = (const char *)strdup(caller);
> +
> + if (NULL == caller_copy)
> + {
> + ESP_LOGE(TAG, "Problem registering metric %s for caller %s -
> insufficient memory",name,caller);
> + return;
> + }
> +
> MetricCallbackList *ml = k->second;
> - ml->push_back(new MetricCallbackEntry(caller,callback));
> + ml->push_back(new MetricCallbackEntry(caller_copy,callback));
> }
>
> void OvmsMetrics::DeregisterListener(const char* caller)
> @ vehicle/OVMS.V3/main/ovms_metrics.cpp:1041 @ void
> OvmsMetrics::DeregisterListener(const char* caller)
> while (itc!=ml->end())
> {
> MetricCallbackEntry* ec = *itc;
> - if (ec->m_caller == caller)
> + if (strcmp(ec->m_caller, caller) == 0)
> {
> + free((char *)ec->m_caller);
> itc = ml->erase(itc);
> delete ec;
> }
>
>
> I've tested both approaches and they both solve my issue : no more
> crashes and memory corruption.
>
> The second one is more "risky" (as it potentially impacts everyone)
> but I think that it should not cause much regression.
>
> Let me know what you think - should I make a PR for this second approach ?
>
> Regards,
>
> Le 16/11/2022 à 22:45, Ludovic LANGE a écrit :
>>
>> Dear list,
>>
>> You may remember that I'm trying to enhance the "datalogger" side of
>> OVMS - in different ways.
>>
>> One facet of this endeavour is to have (CAN) logs that also
>> incorporate other data coming from OVMS. In particular, I'd like to
>> have GNSS (GPS) data in the same log as my CAN data.
>>
>> My first try is to slightly alter `canlog.cpp` and add the ability to
>> `LogInfo` some chosen metrics, in the same spirit of the already
>> logged subset of events. (With CRTD format, it appears as comments)
>>
>> While doing this, I'm facing multiple crashes that are somehow strange:
>>
>> * The crashes only occur when closing the logs (`can log stop`),
>> never when logging
>> * The cause of the crash is not always the same (LoadProhibited,
>> InstrFetchProhibited)
>> * Sometimes the PC is impacted / overwritten
>>
>> The crashes seems to always occur in a window of 20s after the `can
>> log stop` command (In general it's immediate, but sometime can take
>> some more seconds).
>>
>> (It lead me also to discover that the feature of our version of
>> esp-idf that resolves the memory addresses in the serial output -
>> `make monitor` - is broken, and that it's better to use GDB for that)
>>
>> While I am doing my homework and try to solve this, I wanted to share
>> it here also to see if you already encountered this and/or if my
>> changes are not done properly / not on the right OS task / etc.... -
>> for which I lack background and experience. (In the process I think I
>> found a non-freed variable, but it's not causing my issue - cf
>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/pull/774
>> )
>>
>> Here are the changes - you'll see that the crash do occur as soon as
>> I `LogInfo` something - after the log has been closed.
>>
>> ------------------------------------------------------------------------
>>
>> modified: vehicle/OVMS.V3/components/can/src/canlog.cpp
>> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:355 @
>> canlog::canlog(const char* type, std::string format,
>> canformat::canformat_serve_
>> using std::placeholders::_1;
>> using std::placeholders::_2;
>> MyEvents.RegisterEvent(IDTAG, "*",
>> std::bind(&canlog::EventListener, this, _1, _2));
>> + MyMetrics.RegisterListener(IDTAG.c_str(), "*",
>> std::bind(&canlog::MetricListener, this, _1));
>>
>> int queuesize = MyConfig.GetParamValueInt("can", "log.queuesize",100);
>> m_queue = xQueueCreate(queuesize, sizeof(CAN_log_message_t));
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:365 @
>> canlog::canlog(const char* type, std::string format,
>> canformat::canformat_serve_
>> canlog::~canlog()
>> {
>> MyEvents.DeregisterEvent(IDTAG);
>> + MyMetrics.DeregisterListener(IDTAG.c_str());
>>
>> if (m_task)
>> {
>> @ vehicle/OVMS.V3/components/can/src/canlog.cpp:441 @ void
>> canlog::EventListener(std::string event, void* data)
>> LogInfo(NULL, CAN_LogInfo_Event, event.c_str());
>> }
>>
>> +void canlog::MetricListener(OvmsMetric* metric) {
>> + std::string name = metric->m_name;
>> + ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
>> + if (startsWith(name, "v.p.gps") || (name == "v.p.latitude") ||
>> (name == "v.p.longitude") || (name == "v.p.altitude") || (name ==
>> "v.p.direction") || (name == "v.p.satcount") || (name ==
>> "m.net.wifi.sq") || (name == "m.freeram")) {
>> + std::string metric_text = "metric: " + name;
>> + ESP_LOGI(TAG,"MetricListener - keep : %s", metric_text.c_str());
>> + LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());
>> + }
>> +}
>> +
>> const char* canlog::GetType()
>> {
>> return m_type;
>>
>>
>> Note: "m.net.wifi.sq" / "m.freeram" are providing debugging data, so
>> that I can quickly receive metrics after the boot. Not what I'm
>> interested in for the final version of course, and the list will be
>> configurable.
>>
>> ------------------------------------------------------------------------
>>
>> Here are some example of panic handler infos:
>>
>> Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception
>> was unhandled.
>> Core 1 register dump:
>> PC : 0x400d95f8 PS : 0x00060830 A0 : 0x8030d32c
>> A1 : 0x3ffc4980
>> A2 : 0x3ffeb5a8 A3 : 0x00000000 A4 : 0xfefefefe
>> A5 : 0x3ffb5200
>> A6 : 0x3f805e78 A7 : 0x000000fe A8 : 0x800d95f0
>> A9 : 0x3ffc4930
>> A10 : 0x3ffeb5a8 A11 : 0x00000000 A12 : 0x00000009
>> A13 : 0x73c80b46
>> A14 : 0x3ffc4930 A15 : 0x3f8b4164 SAR : 0x00000004
>> EXCCAUSE: 0x0000001c
>> EXCVADDR: 0xfefeff2e LBEG : 0x4008abe4 LEND : 0x4008ac00
>> LCOUNT : 0xffffffff
>>
>> ELF file SHA256: eb3e5ce8f67640c8
>>
>> Backtrace: 0x400d95f8:0x3ffc4980 0x4030d329:0x3ffc49e0
>> 0x4012122d:0x3ffc4a00 0x40121296:0x3ffc4a60 0x40120e6e:0x3ffc4a80
>> 0x400ede1f:0x3ffc4aa0 0x400eed41:0x3ffc4ae0 0x400ecdc9:0x3ffc4b00
>> 0x4011c5b6:0x3ffc4b40 0x4011c7b6:0x3ffc4b70 0x4011c8d8:0x3ffc4be0 0x0
>>
>>
>> 0x400d95f8 is exactly the address of the call to LogInfo in my changes:
>>
>> (gdb) list *0x400d95f8
>> 0x400d95f8 is in canlog::MetricListener(OvmsMetric*)
>> (vehicle/OVMS.V3/components/can/src/canlog.cpp:447).
>> 442 std::string name = metric->m_name;
>> 443 ESP_LOGI(TAG,"MetricListener - enter : %s", name.c_str());
>> 444 if (startsWith(name, "v.p.gps") || (name == "v.p.latitude")
>> || (name == "v.p.longitude") || (name == "v.p.altitude") || (name ==
>> "v.p.direction") || (name == "v.p.satcount") || (name ==
>> "m.net.wifi.sq") || (name == "m.freeram")) {
>> 445 std::string metric_text = "metric: " + name;
>> 446 ESP_LOGI(TAG,"MetricListener - keep : %s",
>> metric_text.c_str());
>> *447 LogInfo(NULL, CAN_LogInfo_Event, metric_text.c_str());*
>> 448 }
>> 449 }
>> 450
>>
>> 0xfefe / 0xfefefefe seems related to some of the Heap Memory
>> Debugging options I've activated (Cf
>> https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/heap_debug.html
>> ) and seems to indicate a possible use-after-free issue. I'll look to
>> debug in that direction.
>>
>>
>> Another version (before Heap Memory Debugging activated) did that
>> from time to time:
>>
>> ------------------------------------------------------------------------
>> Guru Meditation Error: Core 0 panic'ed (InstrFetchProhibited).
>> Exception was unhandled.
>> Core 0 register dump:
>> PC : 0x00008100 PS : 0x00060530 A0 : 0x802f0cf0
>> A1 : 0x3ffd0f00
>> A2 : 0x3ffeb214 A3 : 0x00008100 A4 : 0x402f1df8
>> A5 : 0x00000000
>> A6 : 0x00000003 A7 : 0x3ffb556c A8 : 0x800d9540
>> A9 : 0x3ffd0ee0
>> A10 : 0x3ffeb214 A11 : 0x00000000 A12 : 0x00000009
>> A13 : 0x3f8b4a24
>> A14 : 0xffffffff A15 : 0x40404040 SAR : 0x00000018
>> EXCCAUSE: 0x00000014
>> EXCVADDR: 0x00008100 LBEG : 0x4008a848 LEND : 0x4008a864
>> LCOUNT : 0xffffffff
>>
>> ELF file SHA256: f4d0c43aa48ef759
>>
>> Backtrace: 0x00008100:0x3ffd0f00 0x402f0ced:0x3ffd0fb0
>> 0x4011c56d:0x3ffd0fd0 0x4011c5c6:0x3ffd1020 0x4011c24a:0x3ffd1040
>> 0x4014f51d:0x3ffd1060 0x4014f953:0x3ffd1080 0x4014e6e3:0x3ffd11d0
>> 0x4014ef16:0x3ffd1210 0x4014f001:0x3ffd1230 0x4014f119:0x3ffd1260
>> 0x4014e889:0x3ffd1290 0x4014e9ed:0x3ffd1310 0x4014ebf9:0x3ffd13f0
>>
>> ------------------------------------------------------------------------
>>
>>
>> Thanks for any hint or advices you may want to share.
>>
>> Regards,
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20221118/4b2f6adc/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 203 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20221118/4b2f6adc/attachment-0001.sig>
More information about the OvmsDev
mailing list