[Ovmsdev] version 3.2.007: Crash on boot

Tamás Kovács kommykt at gmail.com
Wed Dec 18 12:56:31 HKT 2019


I deleted my fork from my github repo, create a new fork from
Open-Vehicle-Monitoring-System-3, now version tagging is fixed.
The crash is my fault I think, missing two row from my code.
In ticker60 I added the following row:

CAN_frame_t frame;
memset(&frame,0,sizeof(frame));

now can't get crash about 24 hour, now testing.

Tamás Kovács <kommykt at gmail.com> ezt írta (időpont: 2019. dec. 16., H,
19:57):

> Yes I merged the fix.
> I don't understand why my version is 3.2.005.
> Now I try to new clone I think.
> I don't know what's this: "or are you on a local spin-off branch of
> 3.2.005". I always create modification on my code then I upload it to my
> git repo and then create a compare with the main open vehicles repo then I
> update my local code in atom with a fetch.
>
> Michael Balzer <dexter at expeedo.de> ezt írta (időpont: 2019. dec. 16., H,
> 18:54):
>
>> Tamás,
>>
>> I'm confused, and your dropbox files don't help… did you merge the fix
>> and still get the crash?
>>
>> Is your version just off because your repository lacks tags (if so:
>> why?), or are you on a local spin-off branch of 3.2.005?
>>
>> I've got the impression your git repository is messed up. Maybe you
>> should try a new clone?
>>
>> Regards,
>> Michael
>>
>>
>> Am 15.12.19 um 20:13 schrieb Tamás Kovács:
>>
>> My code crashed with same error as it described.
>> plus one error: in esp32can.cpp:448
>>
>> The files can access my dropbox folder:
>> https://www.dropbox.com/sh/mowc7hc7zrzszmm/AACO4MAmM1yF3yoXsGoZO6R_a?dl=0
>>
>> Michael Balzer <dexter at expeedo.de> ezt írta (időpont: 2019. dec. 13., P,
>> 10:16):
>>
>>> Greg, Mark,
>>>
>>> the new crash happened here:
>>>
>>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3>
>>> a2l 0x3ffcfda0:0x3ffc87a0 0x400da762:0x3ffc87d0 0x400da95b:0x3ffc87f0
>>> 0x400da901:0x3ffc8810
>>> Using elf file:
>>> /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>>> 0x400da762 is in can::ExecuteCallbacks(CAN_frame_t const*, bool, bool)
>>> (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:866).
>>> 861      {
>>> 862      if (tx)
>>> 863        {
>>> 864        if (frame->callback)
>>> 865          {
>>> 866          (*(frame->callback))(frame, success); // invoke
>>> frame-specific callback function
>>> 867          }
>>> 868        for (auto entry : m_txcallbacks) {      // invoke generic tx
>>> callbacks
>>> 869          entry->m_callback(frame, success);
>>> 870          }
>>> 0x400da95b is in canbus::TxCallback(CAN_frame_t*, bool)
>>> (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:1018).
>>> 1013        }
>>> 1014      else
>>> 1015        {
>>> 1016        LogFrame(CAN_LogFrame_TX_Fail, p_frame);
>>> 1017        }
>>> 1018      MyCan.ExecuteCallbacks(p_frame, true, success);
>>> 1019      }
>>> 1020
>>> 1021    /**
>>> 1022     * canbus::Write -- main TX API
>>> 0x400da901 is in CAN_rxtask(void*)
>>> (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:730).
>>> 725                  me->IncomingFrame(&msg.body.frame);
>>> 726                } while (loop);
>>> 727              break;
>>> 728              }
>>> 729            case CAN_txcallback:
>>> 730              msg.body.bus->TxCallback(&msg.body.frame, true);
>>> 731              break;
>>> 732            case CAN_txfailedcallback:
>>> 733              msg.body.bus->TxCallback(&msg.body.frame, false);
>>> 734              msg.body.bus->LogStatus(CAN_LogStatus_Error);
>>>
>>>
>>> As the frame callback invocation is secured against null, it must have
>>> been a missing initialization of frame.callback. So I checked our source
>>> tree for uses of CAN_frame_t without memset or zero initialization and
>>> found one in obd2ecu::IncomingFrame() -- causing this crash I assume -- and
>>> two more in the can rx/tx shell commands.
>>>
>>> Fix is in commit
>>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/34302254567b15a8c0f9dfb730b5cf24703ac700
>>>
>>> …and build in https://ovms.dexters-web.de/firmware/ota/v3.1/edge/
>>>
>>> Greg, please test again.
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 13.12.19 um 07:09 schrieb Greg D.:
>>>
>>> Hi Michael,
>>>
>>> Ok, so I believe I loaded your Edge build, and the issue remains.  To
>>> verify, this is the string reported by the web Status page:
>>>
>>> 3.2.007-102-g07440970/ota_1/main (build idf v3.3-beta3-774-g6652269e1
>>> Dec 12 2019 11:59:14)
>>>
>>> Following my earlier email, I left the configuration as-is, including
>>> launching the obd2ecu task, but with the attached OBDII module removed, and
>>> the crash was resolved.  Rebooting the module had it come up running
>>> normally.  I then plugged in the OBDII module into the OVMSv3, and OVMS
>>> crashed the instant the first it started polling.  The device is a SyncUp
>>> Drive, which is a telematics and WiFi hotspot.  Connecting the OBDII device
>>> with 12v power off, as expected, did nothing.  Power it on, wait a few
>>> seconds to boot, and OVMS crashes immediately.
>>>
>>> New console log, attached.  It shows the module booting normally (OBDII
>>> device not present), followed by the device being attached (indicated in
>>> the log), and the subsequent crash.  I removed the device right after the
>>> boot started, so as to let the OVMS boot normally.
>>>
>>> So, clearly the obd2ecu task it is receiving frames on CAN3 just fine,
>>> but transmits back out on that same bus aren't working.  Is this something
>>> I need to address within obd2ecu, or is it in the core code?  I've not
>>> changed obd2ecu in quite some time.
>>>
>>> Greg
>>>
>>>
>>> Michael Balzer wrote:
>>>
>>> Fix commit is
>>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/8fef3f5fb6eec9fca26997e01d9c7d08080cdc06
>>>
>>> Greg, please test.
>>>
>>> New build with SPIRAM fix:
>>> https://ovms.dexters-web.de/firmware/ota/v3.1/edge/
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 12.12.19 um 11:17 schrieb Michael Balzer:
>>>
>>> Last line of a2l was missing:
>>>
>>> #!/bin/bash
>>>
>>> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>>> for adr in $* ; do
>>>   if [[ "$adr" =~ "elf" ]] ; then
>>>     elf="$adr"
>>>   else
>>>     cmd+=" -ex 'l *${adr/:*/}'"
>>>   fi
>>> done
>>> cmd+=" -ex 'q'"
>>> echo "Using elf file: $elf"
>>> eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null
>>>
>>>
>>>
>>> Am 12.12.19 um 11:12 schrieb Michael Balzer:
>>>
>>> Mark,
>>>
>>> 1) maybe I missed posting my later version of a2l, which automatically
>>> strips the stack pointers from the ":" address syntax so you can copy&paste
>>> the USB output. Here it is:
>>>
>>> #!/bin/bash
>>>
>>> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>>> for adr in $* ; do
>>>   if [[ "$adr" =~ "elf" ]] ; then
>>>     elf="$adr"
>>>   else
>>>     cmd+=" -ex 'l *${adr/:*/}'"
>>>   fi
>>> done
>>> cmd+=" -ex 'q'"
>>> echo "Using elf file: $elf"
>>>
>>>
>>> 2) You're right, the bug is tx_frame with null origin overwriting
>>> body.bus in the union. I didn't notice that when checking Marko's
>>> submission.
>>>
>>> tx_frame is a copy of the last frame given to the bus for transmission.
>>> The queue msg is gone when the TX done IRQ comes in, and Marko needed a
>>> copy of the frame the TX IRQ relates to. I asked him (see PR discussion),
>>> he checked and confirmed that all TX is done sequentially, so a single
>>> buffer is sufficient.
>>>
>>> Swapping the lines would work. The frame.origin also shouldn't be null,
>>> but the handler should tolerate that. …oops, tx_frame also doesn't get
>>> initialized in the canbus constructor, so there's also potentially garbage
>>> in tx_frame if due to some bug a TX IRQ is generated or processed without a
>>> previous tx.
>>>
>>> I'll do the fixes… and also rename tx_frame to m_tx_frame for
>>> consistency.
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 12.12.19 um 01:20 schrieb Mark Webb-Johnson:
>>>
>>> Two issues:
>>>
>>> *1] A2L*
>>>
>>> My a2l is this:
>>>
>>> #!/bin/bash
>>> elf=3.2.007.ovms3.elf
>>> for adr in $* ; do
>>>   if [[ "$adr" =~ "elf" ]] ; then
>>>     elf="$adr"
>>>   else
>>>     cmd+=" -ex 'l *$adr'"
>>>   fi
>>> done
>>> cmd+=" -ex 'q'"
>>> echo "Using elf file: $elf"
>>> echo "xtensa-esp32-elf-gdb -batch $elf $cmd"
>>> eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null #| grep "^0x.* is
>>> in "
>>>
>>>
>>> When I run it, I get:
>>>
>>> $ a2l 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>> Using elf file: 3.2.007.ovms3.elf
>>> xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf  -ex 'l
>>> *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex ‘q'
>>>
>>>
>>> And a manual run gives:
>>>
>>> $ xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf  -ex 'l
>>> *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex 'q'
>>> Junk at end of line specification.
>>>
>>> $ xtensa-esp32-elf-gdb 3.2.007.ovms3.elf
>>> GNU gdb (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 7.10
>>> (gdb) l *0x400d5e4c:0x3ffc5c40
>>> Junk at end of line specification.
>>> (gdb) l *0x7ffffffd:0x3ffc5c90
>>> (gdb) quit
>>>
>>>
>>> *2] Frame origin*
>>>
>>> Regarding the CAN_txcallback, I think you are correct. And both
>>> generators of that message set the frame correctly. So my initial thought
>>> was that it is either a memory corruption, or somewhere else sending a
>>> frame with garbage data.
>>>
>>> I do see this technique used in both the mcp2515 and esp32can drivers:
>>>
>>> msg.body.bus = me;
>>> msg.body.frame = me->tx_frame;
>>>
>>>
>>> I don’t normally just copy structures over like that. I memcpy() them,
>>> but I guess it must work. However, as our CAN_queue_msg_t is a union of
>>> CAN_frame_t (frame, with first member of the structure a canbus*) and
>>> 'canbus* bus’, that is a little worrying. It seems that the msg.body.bus
>>> will get overwritten with whatever is in msg.body.frame.origin. I can’t see
>>> anywhere that tx_frame.origin is set - which is scary because that would
>>> mean it is always random junk.
>>>
>>> Maybe it works if tx_frame.origin is set to the bus before anything
>>> else, but not in Greg’s circumstances where something else arrives first.
>>> Perhaps related to obd2ecu? I only see tx_frame set in can.cpp
>>> canbus::Write. I don’t really understand the tx_frame approach at all, and
>>> why the frame is just not passed on the queue.
>>>
>>> // CAN Frame
>>> // Note: Take care changing this structure, as it is a union with
>>> // CAN_log_message_t and position of 'origin' is fixed.
>>> struct CAN_frame_t
>>>   {
>>>   canbus*     origin;                   // Origin of the frame
>>>   CanFrameCallback * callback;          // Frame-specific callback. Is
>>> called when this frame is successfully sent (or sending failed)
>>>   CAN_FIR_t   FIR;                      // Frame information record
>>>   uint32_t    MsgID;                    // Message ID
>>>   union
>>>     {
>>>     uint8_t   u8[8];                    // Payload byte access
>>>     uint32_t  u32[2];                   // Payload u32 access (Att:
>>> little endian!)
>>>     uint64_t  u64;                      // Payload u64 access (Att:
>>> little endian!)
>>>     } data;
>>>
>>>   esp_err_t Write(canbus* bus=NULL, TickType_t maxqueuewait=0);  // bus:
>>> NULL=origin
>>>   };
>>>
>>> // CAN message
>>> typedef struct
>>>   {
>>>   CAN_queue_type_t type;
>>>   union
>>>     {
>>>     CAN_frame_t frame;  // CAN_frame
>>>     canbus* bus;
>>>     } body;
>>>   } CAN_queue_msg_t;
>>>
>>>
>>> This approach seems to date back to the swcan support merge (f94ae5a1b).
>>>
>>> Should we swap around, and set the msg.body.bus after msg.body.frame? Or
>>> am I missing something…
>>>
>>> Regards, Mark.
>>>
>>> On 12 Dec 2019, at 2:29 AM, Michael Balzer <dexter at expeedo.de> wrote:
>>>
>>> Mark,
>>>
>>> good example why not to use addr2line: I think that result is wrong. a2l
>>> uses gdb which gives:
>>>
>>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3>
>>> a2l tmp/3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>> Using elf file: tmp/3.2.007.ovms3.elf
>>> 0x400d5e4c is in CAN_rxtask(void*)
>>> (/home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:730).
>>> 725                  me->IncomingFrame(&msg.body.frame);
>>> 726                } while (loop);
>>> 727              break;
>>> 728              }
>>> 729            case CAN_txcallback:
>>> 730              msg.body.bus->TxCallback(&msg.body.frame, true);
>>> 731              break;
>>> 732            case CAN_txfailedcallback:
>>> 733              msg.body.bus->TxCallback(&msg.body.frame, false);
>>> 734              msg.body.bus->LogStatus(CAN_LogStatus_Error);
>>>
>>> …and that actually makes sense and matches the register dump.
>>>
>>> If I read the gdb disassembly correctly, A10 = msg.body.bus, so Greg's
>>> got a CAN_txcallback msg without a bus.
>>>
>>> Hardening the rxtask against null here would probably avoid the crash,
>>> but I don't see yet how that could be possible.
>>> Both esp32can and mcp2515 set the bus field to their object addresses,
>>> which cannot be null.
>>>
>>> Regards,
>>> Michael
>>>
>>>
>>> Am 11.12.19 um 13:45 schrieb Mark Webb-Johnson:
>>>
>>> Can’t get a2l working at the moment. The addr2line gives:
>>>
>>> addr2line -e 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40
>>> 0x7ffffffd:0x3ffc5c90
>>>
>>> /home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:551
>>>
>>>
>>> That is:
>>>
>>> void canbus::LogInfo(CAN_log_type_t type, const char* text)
>>>   {
>>>   MyCan.LogInfo(this, type, text);    <—— HERE
>>>   }
>>>
>>>
>>> ELF is at:
>>>
>>> http://api.openvehicles.com/firmware/ota/v3.2/main/3.2.007.ovms3.elf
>>>
>>>
>>> Regards, Mark.
>>>
>>> On 11 Dec 2019, at 5:20 PM, Michael Balzer <dexter at expeedo.de> wrote:
>>>
>>> Mark,
>>>
>>> Greg uses your build, the crash point seems to be consistent, can you
>>> post the a2l on this?
>>>
>>> Regards,
>>> Michael
>>>
>>> PS: Greg, would you mind switching to EAP to beta test future releases?
>>>
>>>
>>> Am 11.12.19 um 04:33 schrieb Greg D.:
>>>
>>> Hi folks,
>>>
>>> Well, the module updated to 3.2.007 last night.  I just checked on it,
>>> and it appears that it didn't exactly survive.  Crashed while running
>>> the autoconfig script.  Log with two cycles attached.
>>>
>>> I tried renaming the /store/events directory to /store/was_events since
>>> it seems like Duktape was getting in the way, but that didn't resolve
>>> the crash.  I manually enabled wifi so I could manage the module, and
>>> moved it back to 3.2.005 from the other partition.  Seems stable again.
>>>
>>> The car is going into the Service Center tomorrow (the perennial issue
>>> with 1146 alerts), so I need to have the module stable so that I can
>>> keep an eye on it.  Going to leave it on 3.2.005 for now, unless someone
>>> has a quick fix in the next few hours...
>>>
>>> Otherwise, any ideas for troubleshooting after I get the car back back
>>> (hopefully end of day)?
>>>
>>> Greg
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> --
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>
>>
>> --
>> Üdvözlettel:
>> Kovács Tamás
>>
>>
>> _______________________________________________
>> OvmsDev mailing listOvmsDev at lists.openvehicles.comhttp://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>
>
> --
> Üdvözlettel:
> Kovács Tamás
>
>

-- 
Üdvözlettel:
Kovács Tamás
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20191218/f43627ae/attachment-0001.html>


More information about the OvmsDev mailing list