[Ovmsdev] Crash debugging

Michael Balzer dexter at expeedo.de
Tue Apr 10 04:44:00 HKT 2018


gdb is better, but also not exact. Line numbers reported are usually around 2-5 lines too low.
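
Because the reported line can be a few lines off, it helps to look at the source context gdb prints around each address instead of only the summary line. The following is a minimal sketch of such a variant of the "a2l" script quoted below, assuming xtensa-esp32-elf-gdb is on the PATH. The function name a2l_context and the GDB override variable are made up for illustration and are not part of the OVMS tree.

```shell
#!/bin/bash
# Sketch only: a2l_context and the GDB override variable are illustrative names.

# Usage: a2l_context ELF ADDR...
# Prints gdb's "list *ADDR" output (location line plus nearby source) per address.
a2l_context() {
  local elf=$1 adr
  shift
  for adr in "$@"; do
    # One gdb run per address keeps the quoting simple, at the cost of speed.
    "${GDB:-xtensa-esp32-elf-gdb}" -batch "$elf" -ex "list *$adr" 2>/dev/null
  done
}
```

Called as "a2l_context build/ovms3.elf 0x400f8432", this should show the lines around the reported location, so a number that is slightly too low can be corrected by eye.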

I'm now pretty sure the crashes in mg_send_dns_query() are also memory related; it looks like MG_CALLOC() fails.

I think this happens because we still buffer notifications in RAM. The Twizy firmware sends a lot of data notifications, which can pile up during a temporary
loss of connection.

I'm looking forward to testing with the v3.1 module. The first one has arrived in Germany; mine has been routed to the customs department… let's see what happens next…

Regards,
Michael


On 09.04.2018 at 00:27, Michael Balzer wrote:
> Attention: addr2line reports wrong line numbers.
>
> That's a known issue: https://github.com/jcmvbkbc/binutils-gdb-xtensa/issues/5
>
> Workaround is using gdb. I've made a little bash script named "a2l" for this:
>
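> #!/bin/bash
> # Map code addresses to source locations with gdb's "l *ADDR" (list) command.
> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
> cmd=()
> for adr in "$@" ; do cmd+=(-ex "l *$adr") ; done
> cmd+=(-ex q)
> # Keep only the "ADDR is in FUNCTION (FILE:LINE)." summary lines.
> xtensa-esp32-elf-gdb -batch "$elf" "${cmd[@]}" 2>/dev/null | grep " is in "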
>
> You can also leave out the final grep if you want to see the source context.
>
>
> 80% of the crashes I have recorded today have been out-of-memory issues.
>
> The remaining 20% have all been along this path:
>
> 0x400f8432 is in mg_send_dns_query (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11168).
> 0x400f85d9 is in mg_resolve_async_eh (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11628).
> 0x400f644b is in mg_call (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2277).
> 0x400f6536 is in mg_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2318).
> 0x400f7a56 is in mg_mgr_handle_conn (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3729).
> 0x400f7ca9 is in mg_socket_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3916).
> 0x400f4471 is in mg_mgr_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2446).
> 0x400eb9e6 is in OvmsNetManager::MongooseTask() (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:382).
> 0x400eba25 is in MongooseRawTask(void*) (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:370).
>
> …which is what Greg reported as "webserver status crash" but also occurs without user interaction.
>
> This is related to network changes; for example, losing contact with a WiFi station quite often triggers this. It looks like mongoose continues to use some
> object that has been deleted by another task, but I need to have a closer look tomorrow.
>
> Regards,
> Michael
>
>
> On 08.04.2018 at 16:11, Mark Webb-Johnson wrote:
>> Michael,
>>
>> Neat. I agree that the stack trace is normally sufficient.
>>
>> I’ve started to store the production PUSH firmwares in a standardised way. You can find them at:
>>
>>     http://api.openvehicles.com/firmware/ota/(v3.0|v3.1)/main/
>>
>>         <ver>.ovms3.bin
>>         <ver>.ovms3.elf
>>         <ver>.ovms3.map
>>
>>     Where <ver> is the version in the format 'major.minor.sequence', such as 3.1.003.
>>
>>
>>     So, for example:
>>
>>
>>         http://api.openvehicles.com/firmware/ota/v3.1/main/3.1.003.ovms3.elf
>>
>>
>> Regards, Mark
>>
>>> On 8 Apr 2018, at 8:56 PM, Michael Balzer <dexter at expeedo.de> wrote:
>>>
>>> I've pushed an update that saves crash data across a reset and automatically sends it to the server using the history record "*-OVM-DebugCrash".
>>>
>>> Note: you will need to pull the esp-idf as well. As the support for exception handlers was insufficient for our needs, I have added a separate error handler
>>> registration that catches all three kinds of crashes (exceptions, panics and aborts).
>>>
>>> Crash reason, registers (if available) and backtrace are available via "boot status", with "make monitor" automatically mapping the addresses:
>>>
>>> OVMS# boot status
>>> Last boot was 16 second(s) ago
>>>   This is reset #1 since last power cycle
>>>   Detected boot reason: Crash
>>>   Crash counters: 1 total, 0 early
>>>   CPU#0 boot reason was 12
>>>   CPU#1 boot reason was 12
>>>
>>> Last crash: abort() was called on core 1
>>>   Backtrace:
>>>   0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5
>>> 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
>>> 0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>>> 0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>>> 0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
>>> 0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
>>> 0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
>>> 0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
>>> 0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
>>> 0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77
>>> (discriminator 2)
>>> 0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
>>> 0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
>>> 0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
>>> 0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
>>> 0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
>>> 0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156
>>>
>>>
>>> The "*-OVM-DebugCrash" record has this structure:
>>>
>>>     // H type "*-OVM-DebugCrash"
>>>     //  ,<firmware_version>
>>>     //  ,<bootcount>,<bootreason_name>,<bootreason_cpu0>,<bootreason_cpu1>
>>>     //  ,<crashcnt>,<earlycrashcnt>,<crashtype>,<crashcore>,<registers>,<backtrace>
>>>
>>> …with the individual register and backtrace entries separated by spaces.
>>>
>>> Example:
>>>
>>> *-OVM-DebugCrash,0,2592000,3.1.003-11-g37c5f4b/factory/main (build idf v3.1-dev-453-g0f978bcb-dirty Apr  8 2018
>>> 14:42:07),1,Crash,12,12,1,0,abort(),1,,0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937
>>> 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
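
To pull the useful parts out of such a record on the command line, something like the following might do. It is only a sketch: the helper names are invented for illustration, the record is assumed to arrive as a single line (the example above is merely wrapped by the mail client), and fields are counted from the end because the two values directly after the record type in the example (0 and 2592000) are not covered by the structure description.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative, and fields are counted from the
# end of the record so the undocumented leading fields do not matter.

# Print the <crashtype> field (fourth field from the end).
crash_type() {
  printf '%s\n' "$1" | awk -F, '{ print $(NF-3) }'
}

# Print the <backtrace> addresses (last field), one per line.
crash_backtrace() {
  printf '%s\n' "${1##*,}" | tr -s ' ' '\n'
}
```

The output of crash_backtrace can then be passed as arguments to whichever address decoder is at hand.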
>>>
>>> To decode the backtrace, feed it to addr2line:
>>>
>>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> xtensa-esp32-elf-addr2line -pfiaC -e build/ovms3.elf 0x4008f698 0x4008f86f 0x400e7027
>>> 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04
>>> 0x400e69dd
>>> 0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>>> 0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>>> 0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
>>> 0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>>> 0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
>>> 0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
>>> 0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
>>> 0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
>>> 0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77
>>> (discriminator 2)
>>> 0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
>>> 0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
>>> 0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
>>> 0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
>>> 0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
>>> 0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156
>>>
>>>
>>> There is also an option to write core dumps in esp-idf, but a core dump takes 64K, and there are still too many crashes. I think the backtrace is sufficient
>>> in most situations.
>>>
>>> We should keep a central archive of .elf files for the releases rolled out, so we don't need to recompile for debugging.
>>>
>>> Regards,
>>> Michael
>>>
>>> -- 
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26