[Ovmsdev] version 3.2.007: Crash on boot
Michael Balzer
dexter at expeedo.de
Thu Dec 12 19:07:25 HKT 2019
Fix commit is https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/commit/8fef3f5fb6eec9fca26997e01d9c7d08080cdc06
Greg, please test.
New build with SPIRAM fix: https://ovms.dexters-web.de/firmware/ota/v3.1/edge/
Regards,
Michael
Am 12.12.19 um 11:17 schrieb Michael Balzer:
> Last line of a2l was missing:
>
> #!/bin/bash
> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
> for adr in $* ; do
> if [[ "$adr" =~ "elf" ]] ; then
> elf="$adr"
> else
> cmd+=" -ex 'l *${adr/:*/}'"
> fi
> done
> cmd+=" -ex 'q'"
> echo "Using elf file: $elf"
> eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null
>
>
>
> Am 12.12.19 um 11:12 schrieb Michael Balzer:
>> Mark,
>>
>> 1) maybe I missed posting my later version of a2l, which automatically strips the stack pointers from the ":" address syntax so you can copy&paste the USB
>> output. Here it is:
>>
>> #!/bin/bash
>> elf=~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3/build/ovms3.elf
>> for adr in $* ; do
>> if [[ "$adr" =~ "elf" ]] ; then
>> elf="$adr"
>> else
>> cmd+=" -ex 'l *${adr/:*/}'"
>> fi
>> done
>> cmd+=" -ex 'q'"
>> echo "Using elf file: $elf"
>>
>>
>> 2) You're right, the bug is tx_frame with null origin overwriting body.bus in the union. I didn't notice that when checking Marko's submission.
>>
>> tx_frame is a copy of the last frame given to the bus for transmission. The queue msg is gone when the TX done IRQ comes in, and Marko needed a copy of the
>> frame the TX IRQ relates to. I asked him (see PR discussion), he checked and confirmed that all TX is done sequentially, so a single buffer is sufficient.
>>
>> Swapping the lines would work. The frame.origin also shouldn't be null, but the handler should tolerate that. …oops, tx_frame also doesn't get initialized in
>> the canbus constructor, so there's also potentially garbage in tx_frame if due to some bug a TX IRQ is generated or processed without a previous tx.
>>
>> I'll do the fixes… and also rename tx_frame to m_tx_frame for consistency.
>>
>> Regards,
>> Michael
>>
>>
>> Am 12.12.19 um 01:20 schrieb Mark Webb-Johnson:
>>> Two issues:
>>>
>>> _*1] A2L*_
>>>
>>> My a2l is this:
>>>
>>> #!/bin/bash
>>> elf=3.2.007.ovms3.elf
>>> for adr in $* ; do
>>> if [[ "$adr" =~ "elf" ]] ; then
>>> elf="$adr"
>>> else
>>> cmd+=" -ex 'l *$adr'"
>>> fi
>>> done
>>> cmd+=" -ex 'q'"
>>> echo "Using elf file: $elf"
>>> echo "xtensa-esp32-elf-gdb -batch $elf $cmd"
>>> eval xtensa-esp32-elf-gdb -batch $elf $cmd 2>/dev/null #| grep "^0x.* is in "
>>>
>>>
>>> When I run it, I get:
>>>
>>> $ a2l 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>> Using elf file: 3.2.007.ovms3.elf
>>> xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf -ex 'l *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex ‘q'
>>>
>>>
>>> And a manual run gives:
>>>
>>> $ xtensa-esp32-elf-gdb -batch 3.2.007.ovms3.elf -ex 'l *0x400d5e4c:0x3ffc5c40' -ex 'l *0x7ffffffd:0x3ffc5c90' -ex 'q'
>>> Junk at end of line specification.
>>>
>>> $ xtensa-esp32-elf-gdb 3.2.007.ovms3.elf
>>> GNU gdb (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 7.10
>>> (gdb) l *0x400d5e4c:0x3ffc5c40
>>> Junk at end of line specification.
>>> (gdb) l *0x7ffffffd:0x3ffc5c90
>>> (gdb) quit
>>>
>>>
>>> _*2] Frame origin*_
>>>
>>> Regarding the CAN_txcallback, I think you are correct. And both generators of that message set the frame correctly. So my initial thought was that it is
>>> either a memory corruption, or somewhere else sending a frame with garbage data.
>>>
>>> I do see this technique used in both the mcp2515 and esp32can drivers:
>>>
>>> msg.body.bus = me;
>>> msg.body.frame = me->tx_frame;
>>>
>>>
>>> I don’t normally just copy structures over like that. I memcpy() them, but I guess it must work. However, as our CAN_queue_msg_t is a union of CAN_frame_t
>>> (frame, with first member of the structure a canbus*) and 'canbus* bus’, that is a little worrying. It seems that the msg.body.bus will get overwritten with
>>> whatever is in msg.body.frame.origin. I can’t see anywhere that tx_frame.origin is set - which is scary because that would mean it is always random junk.
>>>
>>> Maybe it works if tx_frame.origin is set to the bus before anything else, but not in Greg’s circumstances where something else arrives first. Perhaps
>>> related to obd2ecu? I only see tx_frame set in can.cpp canbus::Write. I don’t really understand the tx_frame approach at all, and why the frame is just not
>>> passed on the queue.
>>>
>>> // CAN Frame
>>> // Note: Take care changing this structure, as it is a union with
>>> // CAN_log_message_t and position of 'origin' is fixed.
>>> struct CAN_frame_t
>>> {
>>> canbus* origin; // Origin of the frame
>>> CanFrameCallback * callback; // Frame-specific callback. Is called when this frame is successfully sent (or sending failed)
>>> CAN_FIR_t FIR; // Frame information record
>>> uint32_t MsgID; // Message ID
>>> union
>>> {
>>> uint8_t u8[8]; // Payload byte access
>>> uint32_t u32[2]; // Payload u32 access (Att: little endian!)
>>> uint64_t u64; // Payload u64 access (Att: little endian!)
>>> } data;
>>>
>>> esp_err_t Write(canbus* bus=NULL, TickType_t maxqueuewait=0); // bus: NULL=origin
>>> };
>>>
>>> // CAN message
>>> typedef struct
>>> {
>>> CAN_queue_type_t type;
>>> union
>>> {
>>> CAN_frame_t frame; // CAN_frame
>>> canbus* bus;
>>> } body;
>>> } CAN_queue_msg_t;
>>>
>>>
>>> This approach seems to date back to the swcan support merge (f94ae5a1b).
>>>
>>> Should we swap around, and set the msg.body.bus after msg.body.frame? Or am I missing something…
>>>
>>> Regards, Mark.
>>>
>>>> On 12 Dec 2019, at 2:29 AM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>
>>>> Mark,
>>>>
>>>> good example why not to use addr2line: I think that result is wrong. a2l uses gdb which gives:
>>>>
>>>> balzer at leela:~/esp/Open-Vehicle-Monitoring-System-3/vehicle/OVMS.V3> a2l tmp/3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>>> Using elf file: tmp/3.2.007.ovms3.elf
>>>> 0x400d5e4c is in CAN_rxtask(void*) (/home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:730).
>>>> 725 me->IncomingFrame(&msg.body.frame);
>>>> 726 } while (loop);
>>>> 727 break;
>>>> 728 }
>>>> 729 case CAN_txcallback:
>>>> 730 msg.body.bus->TxCallback(&msg.body.frame, true);
>>>> 731 break;
>>>> 732 case CAN_txfailedcallback:
>>>> 733 msg.body.bus->TxCallback(&msg.body.frame, false);
>>>> 734 msg.body.bus->LogStatus(CAN_LogStatus_Error);
>>>>
>>>> …and that actually makes sense and matches the register dump.
>>>>
>>>> If I read the gdb disassembly correctly, A10 = msg.body.bus, so Greg's got a CAN_txcallback msg without a bus.
>>>>
>>>> Hardening the rxtask against null here would probably avoid the crash, but I don't see yet how that could be possible.
>>>> Both esp32can and mcp2515 set the bus field to their object addresses, which cannot be null.
>>>>
>>>> Regards,
>>>> Michael
>>>>
>>>>
>>>> Am 11.12.19 um 13:45 schrieb Mark Webb-Johnson:
>>>>> Can’t get a2l working at the moment. The addr2line gives:
>>>>>
>>>>> addr2line -e 3.2.007.ovms3.elf 0x400d5e4c:0x3ffc5c40 0x7ffffffd:0x3ffc5c90
>>>>> /home/openvehicles/build/Open-Vehicle-Monitoring-System-3.1/vehicle/OVMS.V3/components/can/src/can.cpp:551
>>>>>
>>>>>
>>>>> That is:
>>>>>
>>>>> void canbus::LogInfo(CAN_log_type_t type, const char* text)
>>>>> {
>>>>> MyCan.LogInfo(this, type, text); <—— HERE
>>>>> }
>>>>>
>>>>>
>>>>> ELF is at:
>>>>>
>>>>> http://api.openvehicles.com/firmware/ota/v3.2/main/3.2.007.ovms3.elf
>>>>>
>>>>>
>>>>> Regards, Mark.
>>>>>
>>>>>> On 11 Dec 2019, at 5:20 PM, Michael Balzer <dexter at expeedo.de <mailto:dexter at expeedo.de>> wrote:
>>>>>>
>>>>>> Mark,
>>>>>>
>>>>>> Greg uses your build, the crash point seems to be consistent, can you post the a2l on this?
>>>>>>
>>>>>> Regards,
>>>>>> Michael
>>>>>>
>>>>>> PS: Greg, would you mind switching to EAP to beta test future releases?
>>>>>>
>>>>>>
>>>>>> Am 11.12.19 um 04:33 schrieb Greg D.:
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> Well, the module updated to 3.2.007 last night. I just checked on it,
>>>>>>> and it appears that it didn't exactly survive. Crashed while running
>>>>>>> the autoconfig script. Log with two cycles attached.
>>>>>>>
>>>>>>> I tried renaming the /store/events directory to /store/was_events since
>>>>>>> it seems like Duktape was getting in the way, but that didn't resolve
>>>>>>> the crash. I manually enabled wifi so I could manage the module, and
>>>>>>> moved it back to 3.2.005 from the other partition. Seems stable again.
>>>>>>>
>>>>>>> The car is going into the Service Center tomorrow (the perennial issue
>>>>>>> with 1146 alerts), so I need to have the module stable so that I can
>>>>>>> keep an eye on it. Going to leave it on 3.2.005 for now, unless someone
>>>>>>> has a quick fix in the next few hours...
>>>>>>>
>>>>>>> Otherwise, any ideas for troubleshooting after I get the car back back
>>>>>>> (hopefully end of day)?
>>>>>>>
>>>>>>> Greg
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> OvmsDev mailing list
>>>>>>> OvmsDev at lists.openvehicles.com
>>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OvmsDev mailing list
>>>>> OvmsDev at lists.openvehicles.com
>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>>
>>>> --
>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>> _______________________________________________
>>>> OvmsDev mailing list
>>>> OvmsDev at lists.openvehicles.com <mailto:OvmsDev at lists.openvehicles.com>
>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>>
>>>
>>> _______________________________________________
>>> OvmsDev mailing list
>>> OvmsDev at lists.openvehicles.com
>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>>
>> --
>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>
>> _______________________________________________
>> OvmsDev mailing list
>> OvmsDev at lists.openvehicles.com
>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
>
> --
> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>
> _______________________________________________
> OvmsDev mailing list
> OvmsDev at lists.openvehicles.com
> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openvehicles.com/pipermail/ovmsdev/attachments/20191212/212af617/attachment.htm>
More information about the OvmsDev
mailing list