gdb is better, but also not exact. Line numbers reported are usually around 2-5 lines too low.

I'm now pretty sure the crashes in mg_send_dns_query() are also memory related; it looks like MG_CALLOC() fails.

I think this happens because we still buffer notifications in RAM. The Twizy firmware sends a lot of data notifications, which may pile up during a temporary loss of connection.

I'm looking forward to testing with the v3.1 module. The first has arrived in Germany, mine has been routed to the customs department… let's see what happens next…

Regards,
Michael


On 09.04.2018 at 00:27, Michael Balzer wrote:
Attention: addr2line reports wrong line numbers.

That's a known issue: https://github.com/jcmvbkbc/binutils-gdb-xtensa/issues/5

The workaround is to use gdb. I've made a little bash script named "a2l" for this:

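#!/bin/bash
# a2l: resolve code addresses to source locations using gdb instead of addr2line
elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
cmd=()
# add one "list" command per address given on the command line
for adr in "$@" ; do cmd+=(-ex "l *$adr") ; done
cmd+=(-ex q)
xtensa-esp32-elf-gdb -batch "$elf" "${cmd[@]}" 2>/dev/null | grep " is in "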

You can also leave out the final grep if you want to see the source context.


80% of the crashes I have recorded today have been out-of-memory issues.

The remaining 20% have all been along this path:

0x400f8432 is in mg_send_dns_query (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11168).
0x400f85d9 is in mg_resolve_async_eh (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:11628).
0x400f644b is in mg_call (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2277).
0x400f6536 is in mg_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2318).
0x400f7a56 is in mg_mgr_handle_conn (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3729).
0x400f7ca9 is in mg_socket_if_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:3916).
0x400f4471 is in mg_mgr_poll (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/mongoose/mongoose/mongoose.c:2446).
0x400eb9e6 is in OvmsNetManager::MongooseTask() (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:382).
0x400eba25 is in MongooseRawTask(void*) (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_netmanager.cpp:370).

…which is what Greg reported as "webserver status crash" but also occurs without user interaction.

This is related to network changes; for example, losing contact with a wifi station quite often triggers it. It looks like mongoose continues to use some object that has been deleted by another task, but I need to have a closer look tomorrow.

Regards,
Michael


On 08.04.2018 at 16:11, Mark Webb-Johnson wrote:
Michael,

Neat. I agree that the stack trace is normally sufficient.

I’ve started to store the production PUSH firmwares in a standardised way. You can find them at:

http://api.openvehicles.com/firmware/ota/(v3.0|v3.1)/main/
<ver>.ovms3.bin
<ver>.ovms3.elf
<ver>.ovms3.map

Where <ver> is the version in the format 'major.minor.sequence', such as 3.1.003.

So, for example:

http://api.openvehicles.com/firmware/ota/v3.1/main/3.1.003.ovms3.elf
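
A minimal sketch of how the naming scheme above could be used from a script. The function name ovms_fw_url is hypothetical (it is not part of any OVMS tooling), and the download itself is only shown as a comment because it needs network access:

```shell
#!/bin/bash
# Hypothetical helper: build the URL of an archived firmware file.
#   $1 = first path component (v3.0 or v3.1)
#   $2 = version in the form major.minor.sequence (e.g. 3.1.003)
#   $3 = file type (bin, elf or map)
ovms_fw_url() {
  printf 'http://api.openvehicles.com/firmware/ota/%s/main/%s.ovms3.%s\n' "$1" "$2" "$3"
}

ovms_fw_url v3.1 3.1.003 elf
# To fetch the file, something like this should work:
#   curl -fO "$(ovms_fw_url v3.1 3.1.003 elf)"
```

The printed URL for these arguments is the same as the example given above.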

Regards, Mark

On 8 Apr 2018, at 8:56 PM, Michael Balzer <dexter@expeedo.de> wrote:

I've pushed an update that enables saving crash data over a reset and automatically sends it to the server using the history record "*-OVM-DebugCrash".

Note: you will need to pull the esp-idf as well. As the support for exception handlers was insufficient for our needs, I have added a separate error handler registration that catches all three kinds of crashes (exceptions, panics and aborts).

Crash reason, registers (if available) and backtrace are available via "boot status", with "make monitor" automatically mapping the addresses:

OVMS# boot status
Last boot was 16 second(s) ago
  This is reset #1 since last power cycle
  Detected boot reason: Crash
  Crash counters: 1 total, 0 early
  CPU#0 boot reason was 12
  CPU#1 boot reason was 12

Last crash: abort() was called on core 1
  Backtrace:
  0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77 (discriminator 2)
0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156


The "*-OVM-DebugCrash" record has this structure:

    // H type "*-OVM-DebugCrash"
    //  ,<firmware_version>
    //  ,<bootcount>,<bootreason_name>,<bootreason_cpu0>,<bootreason_cpu1>
    //  ,<crashcnt>,<earlycrashcnt>,<crashtype>,<crashcore>,<registers>,<backtrace>

…with registers and backtraces separated by spaces.

Example:

*-OVM-DebugCrash,0,2592000,3.1.003-11-g37c5f4b/factory/main (build idf v3.1-dev-453-g0f978bcb-dirty Apr  8 2018 14:42:07),1,Crash,12,12,1,0,abort(),1,,0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
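
As a rough sketch of how the backtrace could be pulled out of such a record: in the structure above it is the last comma-separated field, so awk can pick it out and split it on spaces. The function name crash_backtrace is made up for illustration, the sample record is shortened to three addresses, and the sketch assumes no field contains a comma itself:

```shell
#!/bin/bash
# Hypothetical helper (not part of OVMS): print the backtrace of a
# "*-OVM-DebugCrash" record, one address per line.
# Assumes the backtrace is the last comma-separated field and that
# no field contains a comma.
crash_backtrace() {
  printf '%s\n' "$1" |
    awk -F, '{ n = split($NF, a, " "); for (i = 1; i <= n; i++) print a[i] }'
}

# Sample record from the example above, with the backtrace shortened
rec='*-OVM-DebugCrash,0,2592000,3.1.003-11-g37c5f4b/factory/main (build idf v3.1-dev-453-g0f978bcb-dirty Apr  8 2018 14:42:07),1,Crash,12,12,1,0,abort(),1,,0x4008f698 0x4008f86f 0x400e7027'
crash_backtrace "$rec"
```

The addresses printed this way can be passed as arguments to the decoder, for example as $(crash_backtrace "$rec").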

To decode the backtrace, feed it to addr2line:

balzer@leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> xtensa-esp32-elf-addr2line -pfiaC -e build/ovms3.elf 0x4008f698 0x4008f86f 0x400e7027 0x400edb76 0x400edc8d 0x400edc7f 0x400edcb5 0x400e3908 0x400f11c9 0x400f1230 0x400e3937 0x401cb613 0x400e3f49 0x400e82e5 0x400e84d1 0x400e3df5 0x400e3e04 0x400e69dd
0x4008f698: invoke_abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
0x4008f86f: abort at /home/balzer/esp/esp-idf/components/esp32/./panic.c:669
0x400e7027: module_fault(int, OvmsWriter*, OvmsCommand*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_module.cpp:823
0x400edb76: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edc8d: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edc7f: OvmsCommand::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400edcb5: OvmsCommandApp::Execute(int, OvmsWriter*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_command.cpp:94
0x400e3908: Execute(microrl*, int, char const* const*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:47
0x400f11c9: new_line_handler at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:620
0x400f1230: microrl_insert_char at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/microrl/./microrl.c:668
0x400e3937: OvmsShell::ProcessChar(char) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:70
0x401cb613: OvmsShell::ProcessChars(char const*, int) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_shell.cpp:77 (discriminator 2)
0x400e3f49: ConsoleAsync::HandleDeviceEvent(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:169
0x400e82e5: OvmsConsole::Poll(unsigned int, void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:152
0x400e84d1: OvmsConsole::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./ovms_console.cpp:132 (discriminator 1)
0x400e3df5: ConsoleAsync::Service() at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./console_async.cpp:80
0x400e3e04: non-virtual thunk to ConsoleAsync::Service() at ??:?
0x400e69dd: TaskBase::Task(void*) at /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/main/./task_base.cpp:156


There is also an option to write core dumps in esp-idf, but a core dump takes 64K, and there are still too many crashes. I think the backtrace is sufficient in most situations.

We should keep a central archive of .elf files for the releases rolled out, so we don't need to recompile for debugging.

Regards,
Michael

--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26


_______________________________________________
OvmsDev mailing list
OvmsDev@lists.openvehicles.com
http://lists.openvehicles.com/mailman/listinfo/ovmsdev


