[Ovmsdev] version 3.2.007: Crash on boot

Michael Balzer dexter at expeedo.de
Fri Dec 13 17:15:50 HKT 2019


Greg, Mark,

the new crash happened here:

balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> a2l 0x3ffcfda0:0x3ffc87a0 0x400da762:0x3ffc87d0 0x400da95b:0x3ffc87f0 0x400da901:0x3ffc8810
Using elf file: /home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
0x400da762 is in can::ExecuteCallbacks(CAN_frame_t const*, bool, bool)
(/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:866).
861      {
862      if (tx)
863        {
864        if (frame->callback)
865          {
866          (*(frame->callback))(frame, success); // invoke frame-specific callback function
867          }
868        for (auto entry : m_txcallbacks) {      // invoke generic tx callbacks
869          entry->m_callback(frame, success);
870          }
0x400da95b is in canbus::TxCallback(CAN_frame_t*, bool) (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:1018).
1013        }
1014      else
1015        {
1016        LogFrame(CAN_LogFrame_TX_Fail, p_frame);
1017        }
1018      MyCan.ExecuteCallbacks(p_frame, true, success);
1019      }
1020   
1021    /**
1022     * canbus::Write -- main TX API
0x400da901 is in CAN_rxtask(void*) (/home/balzer/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/components/can/src/can.cpp:730).
725                  me->IncomingFrame(&msg.body.frame);
726                } while (loop);
727              break;
728              }
729            case CAN_txcallback:
730              msg.body.bus->TxCallback(&msg.body.frame, true);
731              break;
732            case CAN_txfailedcallback:
733              msg.body.bus->TxCallback(&msg.body.frame, false);
734              msg.body.bus->LogStatus(CAN_LogStatus_Error);


As the frame callback invocation is secured against null, it must have been a missing initialization of frame.callback. So I checked our source tree for uses of
CAN_frame_t without memset or zero initialization and found one in obd2ecu::IncomingFrame() -- causing this crash I assume -- and two more in the can rx/tx
shell commands.

Fix is in commit https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/34302254567b15a8c0f9dfb730b5cf24703ac700

…and build in https://ovms.dexters-web.de/firmware/ota/v3.1/edge/

Greg, please test again.

Regards,
Michael


Am 13.12.19 um 07:09 schrieb Greg D.:
> Hi Michael,
>
> Ok, so I believe I loaded your Edge build, and the issue remains.  To verify, this is the string reported by the web Status page:
>
> 3.2.007-102-g07440970/ota_1/main (build idf v3.3-beta3-774-g6652269e1 Dec 12 2019 11:59:14)
>
> Following my earlier email, I left the configuration as-is, including launching the obd2ecu task, but with the attached OBDII module removed, and the crash
> was resolved.  Rebooting the module had it come up running normally.  I then plugged in the OBDII module into the OVMSv3, and OVMS crashed the instant the
> first it started polling.  The device is a SyncUp Drive, which is a telematics and WiFi hotspot.  Connecting the OBDII device with 12v power off, as expected,
> did nothing.  Power it on, wait a few seconds to boot, and OVMS crashes immediately.
>
> New console log, attached.  It shows the module booting normally (OBDII device not present), followed by the device being attached (indicated in the log), and
> the subsequent crash.  I removed the device right after the boot started, so as to let the OVMS boot normally.
>
> So, clearly the obd2ecu task it is receiving frames on CAN3 just fine, but transmits back out on that same bus aren't working.  Is this something I need to
> address within obd2ecu, or is it in the core code?  I've not changed obd2ecu in quite some time.
>
> Greg
>
>
> Michael Balzer wrote:
>> Fix commit is https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/8fef3f5fb6eec9fca26997e01d9c7d08080cdc06
>>
>> Greg, please test.
>>
>> New build with SPIRAM fix: https://ovms.dexters-web.de/firmware/ota/v3.1/edge/
>>
>> Regards,
>> Michael
>>
>>
>> Am 12.12.19 um 11:17 schrieb Michael Balzer:
>>> Last line of a2l was missing:
>>>
>>> #!/bin/bash
>>> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>>> for adr in $* ; do
>>>   if [[ "$adr" =~ "elf" ]] ; then
>>>     elf="$adr"
>>>   else
>>>     cmd+=" -ex 'l *${adr/:*/}'"
>>>   fi
>>> done
>>> cmd+=" -ex 'q'"
>>> echo "Using elf file: $elf"
>>> eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null
>>>
>>>
>>>
>>> Am 12.12.19 um 11:12 schrieb Michael Balzer:
>>>> Mark,
>>>>
>>>> 1) maybe I missed posting my later version of a2l, which automatically strips the stack pointers from the ":" address syntax so you can copy&paste the USB
>>>> output. Here it is:
>>>>
>>>> #!/bin/bash
>>>> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>>>> for adr in $* ; do
>>>>   if [[ "$adr" =~ "elf" ]] ; then
>>>>     elf="$adr"
>>>>   else
>>>>     cmd+=" -ex 'l *${adr/:*/}'"
>>>>   fi
>>>> done
>>>> cmd+=" -ex 'q'"
>>>> echo "Using elf file: $elf"
>>>>
>>>>
>>>> 2) You're right, the bug is tx_frame with null origin overwriting body.bus in the union. I didn't notice that when checking Marko's submission.
>>>>
>>>> tx_frame is a copy of the last frame given to the bus for transmission. The queue msg is gone when the TX done IRQ comes in, and Marko needed a copy of the
>>>> frame the TX IRQ relates to. I asked him (see PR discussion), he checked and confirmed that all TX is done sequentially, so a single buffer is sufficient.
>>>>
>>>> Swapping the lines would work. The frame.origin also shouldn't be null, but the handler should tolerate that. …oops, tx_frame also doesn't get initialized
>>>> in the canbus constructor, so there's also potentially garbage in tx_frame if due to some bug a TX IRQ is generated or processed without a previous tx.
>>>>
>>>> I'll do the fixes… and also rename tx_frame to m_tx_frame for consistency.
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>>>> Am 12.12.19 um 01:20 schrieb Mark Webb-Johnson:
>>>>> Two issues:
>>>>>
>>>>> _*1] A2L*_
>>>>>
>>>>> My a2l is this:
>>>>>
>>>>>     #!/bin/bash
>>>>>     elf=3.2.007.ovms3.elf
>>>>>     for adr in $* ; do
>>>>>       if [[ "$adr" =~ "elf" ]] ; then
>>>>>         elf="$adr"
>>>>>       else
>>>>>         cmd+=" -ex 'l *$adr'"
>>>>>       fi
>>>>>     done
>>>>>     cmd+=" -ex 'q'"
>>>>>     echo "Using elf file: $elf"
>>>>>     echo "xtensa-esp32-elf-gdb -batch $elf $cmd"
>>>>>     eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null #| grep "^0x.* is in "
>>>>>
>>>>>
>>>>> When I run it, I get:
>>>>>
>>>>>     $ a2l 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>>>>     Using elf file: 3.2.007.ovms3.elf
>>>>>     xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf  -ex 'l *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex ‘q'
>>>>>
>>>>>
>>>>> And a manual run gives:
>>>>>
>>>>>     $ xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf  -ex 'l *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex 'q'
>>>>>     Junk at end of line specification.
>>>>>
>>>>>     $ xtensa-esp32-elf-gdb 3.2.007.ovms3.elf
>>>>>     GNU gdb (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 7.10
>>>>>     (gdb) l *0x400d5e4c:0x3ffc5c40
>>>>>     Junk at end of line specification.
>>>>>     (gdb) l *0x7ffffffd:0x3ffc5c90
>>>>>     (gdb) quit
>>>>>
>>>>>
>>>>> _*2] Frame origin*_
>>>>>
>>>>> Regarding the CAN_txcallback, I think you are correct. And both generators of that message set the frame correctly. So my initial thought was that it is
>>>>> either a memory corruption, or somewhere else sending a frame with garbage data.
>>>>>
>>>>> I do see this technique used in both the mcp2515 and esp32can drivers:
>>>>>
>>>>>     msg.body.bus = me;
>>>>>     msg.body.frame = me->tx_frame;
>>>>>
>>>>>
>>>>> I don’t normally just copy structures over like that. I memcpy() them, but I guess it must work. However, as our CAN_queue_msg_t is a union of CAN_frame_t
>>>>> (frame, with first member of the structure a canbus*) and 'canbus* bus’, that is a little worrying. It seems that the msg.body.bus will get overwritten
>>>>> with whatever is in msg.body.frame.origin. I can’t see anywhere that tx_frame.origin is set - which is scary because that would mean it is always random junk.
>>>>>
>>>>> Maybe it works if tx_frame.origin is set to the bus before anything else, but not in Greg’s circumstances where something else arrives first. Perhaps
>>>>> related to obd2ecu? I only see tx_frame set in can.cpp canbus::Write. I don’t really understand the tx_frame approach at all, and why the frame is just
>>>>> not passed on the queue.
>>>>>
>>>>>     // CAN Frame
>>>>>     // Note: Take care changing this structure, as it is a union with
>>>>>     // CAN_log_message_t and position of 'origin' is fixed.
>>>>>     struct CAN_frame_t
>>>>>       {
>>>>>       canbus*     origin;                   // Origin of the frame
>>>>>       CanFrameCallback * callback;          // Frame-specific callback. Is called when this frame is successfully sent (or sending failed)
>>>>>       CAN_FIR_t   FIR;                      // Frame information record
>>>>>       uint32_t    MsgID;                    // Message ID
>>>>>       union
>>>>>         {
>>>>>         uint8_t   u8[8];                    // Payload byte access
>>>>>         uint32_t  u32[2];                   // Payload u32 access (Att: little endian!)
>>>>>         uint64_t  u64;                      // Payload u64 access (Att: little endian!)
>>>>>         } data;
>>>>>
>>>>>       esp_err_t Write(canbus* bus=NULL, TickType_t maxqueuewait=0);  // bus: NULL=origin
>>>>>       };
>>>>>
>>>>>     // CAN message
>>>>>     typedef struct
>>>>>       {
>>>>>       CAN_queue_type_t type;
>>>>>       union
>>>>>         {
>>>>>         CAN_frame_t frame;  // CAN_frame
>>>>>         canbus* bus;
>>>>>         } body;
>>>>>       } CAN_queue_msg_t;
>>>>>
>>>>>
>>>>> This approach seems to date back to the swcan support merge (f94ae5a1b).
>>>>>
>>>>> Should we swap around, and set the msg.body.bus after msg.body.frame? Or am I missing something…
>>>>>
>>>>> Regards, Mark.
>>>>>
>>>>>> On 12 Dec 2019, at 2:29 AM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>>>
>>>>>> Mark,
>>>>>>
>>>>>> good example why not to use addr2line: I think that result is wrong. a2l uses gdb which gives:
>>>>>>
>>>>>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> a2l tmp/3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>>>>> Using elf file: tmp/3.2.007.ovms3.elf
>>>>>> 0x400d5e4c is in CAN_rxtask(void*) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:730).
>>>>>> 725                  me->IncomingFrame(&msg.body.frame);
>>>>>> 726                } while (loop);
>>>>>> 727              break;
>>>>>> 728              }
>>>>>> 729            case CAN_txcallback:
>>>>>> 730              msg.body.bus->TxCallback(&msg.body.frame, true);
>>>>>> 731              break;
>>>>>> 732            case CAN_txfailedcallback:
>>>>>> 733              msg.body.bus->TxCallback(&msg.body.frame, false);
>>>>>> 734              msg.body.bus->LogStatus(CAN_LogStatus_Error);
>>>>>>
>>>>>> …and that actually makes sense and matches the register dump.
>>>>>>
>>>>>> If I read the gdb disassembly correctly, A10 = msg.body.bus, so Greg's got a CAN_txcallback msg without a bus.
>>>>>>
>>>>>> Hardening the rxtask against null here would probably avoid the crash, but I don't see yet how that could be possible.
>>>>>> Both esp32can and mcp2515 set the bus field to their object addresses, which cannot be null.
>>>>>>
>>>>>> Regards,
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> Am 11.12.19 um 13:45 schrieb Mark Webb-Johnson:
>>>>>>> Can’t get a2l working at the moment. The addr2line gives:
>>>>>>>
>>>>>>>     addr2line -e 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>>>>>>     /home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:551
>>>>>>>
>>>>>>>
>>>>>>> That is:
>>>>>>>
>>>>>>>     void canbus::LogInfo(CAN_log_type_t type, const char* text)
>>>>>>>       {
>>>>>>>       MyCan.LogInfo(this, type, text);    <—— HERE
>>>>>>>       }
>>>>>>>
>>>>>>>
>>>>>>> ELF is at:
>>>>>>>
>>>>>>>     http://api.openvehicles.com/firmware/ota/v3.2/main/3.2.007.ovms3.elf
>>>>>>>
>>>>>>>
>>>>>>> Regards, Mark.
>>>>>>>
>>>>>>>> On 11 Dec 2019, at 5:20 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>>>>>
>>>>>>>> Mark,
>>>>>>>>
>>>>>>>> Greg uses your build, the crash point seems to be consistent, can you post the a2l on this?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> PS: Greg, would you mind switching to EAP to beta test future releases?
>>>>>>>>
>>>>>>>>
>>>>>>>> Am 11.12.19 um 04:33 schrieb Greg D.:
>>>>>>>>> Hi folks,
>>>>>>>>>
>>>>>>>>> Well, the module updated to 3.2.007 last night.  I just checked on it,
>>>>>>>>> and it appears that it didn't exactly survive.  Crashed while running
>>>>>>>>> the autoconfig script.  Log with two cycles attached.
>>>>>>>>>
>>>>>>>>> I tried renaming the /store/events directory to /store/was_events since
>>>>>>>>> it seems like Duktape was getting in the way, but that didn't resolve
>>>>>>>>> the crash.  I manually enabled wifi so I could manage the module, and
>>>>>>>>> moved it back to 3.2.005 from the other partition.  Seems stable again.
>>>>>>>>>
>>>>>>>>> The car is going into the Service Center tomorrow (the perennial issue
>>>>>>>>> with 1146 alerts), so I need to have the module stable so that I can
>>>>>>>>> keep an eye on it.  Going to leave it on 3.2.005 for now, unless someone
>>>>>>>>> has a quick fix in the next few hours...
>>>>>>>>>
>>>>>>>>> Otherwise, any ideas for troubleshooting after I get the car back back
>>>>>>>>> (hopefully end of day)?
>>>>>>>>>
>>>>>>>>> Greg
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> OvmsDev mailing list
>>>>>>>>> OvmsDev at lists.openvehicles.com
>>>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>>>> _______________________________________________
>>>>>>>> OvmsDev mailing list
>>>>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> OvmsDev mailing list
>>>>>>> OvmsDev at lists.openvehicles.com
>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>
>>>>>> -- 
>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.openvehicles.com
>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>
>>>> -- 
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.openvehicles.com
>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>> -- 
>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> -- 
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev

-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20191213/ce07a9ea/attachment.htm>


More information about the OvmsDev mailing list