[Ovmsdev] Crash debugging

Michael Balzer dexter at expeedo.de
Mon Apr 9 06:27:45 HKT 2018


Att: addr2line reports wrong line numbers.

That's a known issue: https://github.com/jcmvbkbc/binutils-gdb-xtensa/issues/5

Workaround is using gdb. I've made a little bash script named "a2l" for this:

#!/bin/bash
elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
for adr in $* ; do cmd+=" -ex 'l *$adr'" ; done
cmd+=" -ex 'q'"
eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null | grep " is in "

You can also leave out the final grep if you want to see the source context.


80% of the crashes I have recorded today have been out of memory issues.

The remaining 20% have all been along this path:

0x400f8432 is in mg_send_dns_query (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11168).
0x400f85d9 is in mg_resolve_async_eh (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11628).
0x400f644b is in mg_call (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2277).
0x400f6536 is in mg_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2318).
0x400f7a56 is in mg_mgr_handle_conn (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3729).
0x400f7ca9 is in mg_socket_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3916).
0x400f4471 is in mg_mgr_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2446).
0x400eb9e6 is in OvmsNetManager::MongooseTask() (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:382).
0x400eba25 is in MongooseRawTask(void*) (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:370).

…which is what Greg reported as "webserver status crash" but also occurs without user interaction.

This is related to network changes, for example losing contact to a wifi station quite often triggers this. It looks like mongoose continues to use some object
that has been deleted by another task, but I need to have a closer look tomorrow.

Regards,
Michael


Am 08.04.2018 um 16:11 schrieb Mark Webb-Johnson:
> Michael,
>
> Neat. I agree that the stack track is normally sufficient.
>
> I’ve started to store the production PUSH firmwares in a standardised way. You can find them at:
>
>     http://api.openvehicles.com/firmware/ota/(v3.0|v3.1)/main/ <http://api.openvehicles.com/firmware/ota/v3.0/main/>
>
>         <ver>.ovms3.bin
>         <ver>.ovms3.elf
>         <ver>.ovms3.map
>
>     Where <ver> is the version in format ‘major.minor.sequence' such as 3.1.003.
>
>
>     So, for example:
>
>
>         http://api.openvehicles.com/firmware/ota/v3.1/main/3.1.003.ovms3.elf
>
>
> Regards, Mark
>
>> On 8 Apr 2018, at 8:56 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>
>> I've pushed an update to enable saving crash data over a reset and automatically send that to the server using the history record "*-OVM-DebugCrash".
>>
>> Note: you will need to pull the esp-idf as well. As the support for exception handlers was insufficient for our needs, I have added a separate error handler
>> registration that catches all three kinds of crashes (exceptions, panics and aborts).
>>
>> Crash reason, registers (if available) and backtrace are available via "boot status", with "make monitor" automatically mapping the addresses:
>>
>> OVMS# boot status
>> Last boot was 16 second(s) ago
>>   This is reset #1 since last power cycle
>>   Detected boot reason: Crash
>>   Crash counters: 1 total, 0 early
>>   CPU#0 boot reason was 12
>>   CPU#1 boot reason was 12
>>
>> Last crash: abort() was called on core 1
>>   Backtrace:
>>   0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5
>> 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
>> 0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>> 0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>> 0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
>> 0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
>> 0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
>> 0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
>> 0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
>> 0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77
>> (discriminator 2)
>> 0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
>> 0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
>> 0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
>> 0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
>> 0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
>> 0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156
>>
>>
>> The "*-OVM-DebugCrash" record has this structure:
>>
>>     // H type "*-OVM-DebugCrash"
>>     //  ,<firmware_version>
>>     //  ,<bootcount>,<bootreason_name>,<bootreason_cpu0>,<bootreason_cpu1>
>>     //  ,<crashcnt>,<earlycrashcnt>,<crashtype>,<crashcore>,<registers>,<backtrace>
>>
>> …with registers and backtraces separated by spaces.
>>
>> Example:
>>
>> *-OVM-DebugCrash,0,2592000,3.1.003-11-g37c5f4b/factory/main (build idf v3.1-dev-453-g0f978bcb-dirty Apr  8 2018
>> 14:42:07),1,Crash,12,12,1,0,abort(),1,,0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937
>> 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
>>
>> To decode the backtrace, feed it to addr2line:
>>
>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> xtensa-esp32-elf-addr2line -pfiaC -e build/ovms3.elf 0x4008f698 0x4008f86f 0x400e7027
>> 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04
>> 0x400e69dd
>> 0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>> 0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
>> 0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
>> 0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at
>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
>> 0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
>> 0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
>> 0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
>> 0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
>> 0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77
>> (discriminator 2)
>> 0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
>> 0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
>> 0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
>> 0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
>> 0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
>> 0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156
>>
>>
>> There is also an option to write core dumps in esp-idf, but a core has 64K, and crashes are still too many. I think the backtrace is sufficient in most
>> situations.
>>
>> We should keep a central archive of .elf files for the releases rolled out, so we don't need to recompile for debugging.
>>
>> Regards,
>> Michael
>>
>> -- 
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20180409/705d2554/attachment-0001.html>


More information about the OvmsDev mailing list