[Ovmsdev] std::string data corruption (issue #189)

Mark Webb-Johnson mark at webb-johnson.net
Wed Jan 30 17:00:21 HKT 2019



> On 30 Jan 2019, at 4:41 PM, Michael Balzer <dexter at expeedo.de> wrote:
> 
> Mark,
> 
> issue #2892 mentions using only the lower 2 MB of SPI RAM as a workaround. Where are our allocations placed by the allocator? Does it fill from the middle or end? If not we don't use the upper half yet.

I saw that, and had a look. I don’t see anything in makeconfig to limit it to 2MB (rather than 4MB).

I do see a CONFIG_SPIRAM_SIZE=4194304 in sdkconfig, but I'm not sure how it gets there. Perhaps we can just change it there to 2MB and try?

I did see this in the documentation:

During ESP-IDF startup, external RAM is mapped into the data address space starting at address 0x3F800000 (byte-accessible). The length of this region is the same as the SPIRAM size (up to the limit of 4MiB).

So, maybe we can look at the addresses our allocations come from to see their offset from 0x3F800000? If they are coming from the top 2MB, then perhaps we start with a big 2MB allocation that we never use?
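
A rough sketch of that check, just to make the arithmetic concrete. The base address is from the documentation quoted above and the size is the CONFIG_SPIRAM_SIZE value from sdkconfig; the helper names are made up for illustration and are not part of ESP-IDF or OVMS.

```cpp
#include <cstddef>
#include <cstdint>

// Base address from the documentation quote, size from
// CONFIG_SPIRAM_SIZE=4194304 in sdkconfig (both assumptions for this sketch).
constexpr std::uintptr_t kSpiramBase = 0x3F800000u;
constexpr std::size_t kSpiramSize = 4u * 1024u * 1024u;

// Hypothetical helper: true if addr lies inside the mapped external RAM window.
constexpr bool in_spiram(std::uintptr_t addr) {
    return addr >= kSpiramBase && addr - kSpiramBase < kSpiramSize;
}

// Hypothetical helper: byte offset of addr from the start of the window.
// Only meaningful when in_spiram(addr) is true.
constexpr std::size_t spiram_offset(std::uintptr_t addr) {
    return static_cast<std::size_t>(addr - kSpiramBase);
}

// Hypothetical helper: true if addr lies in the upper 2MB of the 4MB window.
constexpr bool in_upper_half(std::uintptr_t addr) {
    return in_spiram(addr) && spiram_offset(addr) >= kSpiramSize / 2;
}
```

On the module this could be fed with reinterpret_cast<std::uintptr_t>(ptr) for any pointer of interest, for example the data() pointer of a std::string, and the result logged.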
 
> A possible workaround for the most apparent issues with this (string assembly) could be to use char buffers instead of std::string. I was thinking about doing that at least for the websocket stream.

It seems strange that we are only seeing it with std::string. Maybe that just stresses the system more, or does a lot of reallocations?
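
To get a feel for the reallocation side, something like the following could be used. It is only a sketch: it counts capacity changes while appending, as a proxy for buffer reallocations, and the function name is made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Appends `total` characters one at a time and counts how often the
// string's capacity changes, used here as a proxy for reallocations.
// If `reserve_first` is true, the full size is reserved before appending.
std::size_t count_reallocations(std::size_t total, bool reserve_first) {
    std::string s;
    if (reserve_first) {
        s.reserve(total);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < total; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

Without the reserve() call the count depends on the library's growth policy; with it, the appends should not reallocate at all, which might be a cheaper experiment than switching to char buffers.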

> But if this really is a hardware issue, it can affect all objects in SPI RAM; appending to std::string then only triggers it more often.
> 
> Another workaround could be to run everything on core 0.

Urgh.

> Regards,
> Michael
> 
> 
> On 30.01.19 at 03:33, Mark Webb-Johnson wrote:
>> Michael,
>> 
>> Espressif’s response (and their link to that other issue) suggests that others are seeing this.
>> 
>> Lousy timing (with Chinese New Year next week), so don’t expect anything quick from Espressif. I guess we’ll just have to live with it until they can find a workaround. It sounds like the issue is at the hardware level and a compiler patch will be needed.
>> 
>> Regards, Mark.
>> 
>>> On 30 Jan 2019, at 3:07 AM, Michael Balzer <dexter at expeedo.de> wrote:
>>> 
>>> https://github.com/espressif/esp-idf/issues/3006
>>> 
>>> Regards,
>>> Michael
>>> 
>>> 
>>> On 28.01.19 at 20:46, Michael Balzer wrote:
>>>> To clarify: the bug is most likely not restricted to the case of building a message in a buffer. It can possibly cause corruption in any RAM section, so it may well be responsible for many or most of the unidentified crashes and stack/heap corruptions we're experiencing.
>>>> 
>>>> Regards,
>>>> Michael
>>>> 
>>>> 
>>>> On 28.01.19 at 20:43, Michael Balzer wrote:
>>>>> For those not following the github discussion: I'm pretty sure I've nailed the bug down.
>>>>> 
>>>>> I have reproduced the bug in a simple test project and intend to raise an issue with Espressif on this.
>>>>> 
>>>>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435
>>>>> 
>>>>> If you'd like to test this on your module, configure the project with your wifi credentials, then use "make flash". As the test project is small, this normally will not erase your OVMS config partition, but a backup is always recommended.
>>>>> 
>>>>> Regards,
>>>>> Michael
>>>>> 
>>>>> 
>>>>> On 24.01.19 at 21:19, Michael Balzer wrote:
>>>>>> Everyone please have a look at…
>>>>>> 
>>>>>> https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248
>>>>>> 
>>>>>> Please try to reproduce the bug on your modules.
>>>>>> 
>>>>>> I'm open to explanations.
>>>>>> 
>>>>>> I thought this might be some copy-on-write bug with std::string, but the gcc 5.x libstdc++ no longer uses that implementation (it wouldn't be C++11 compliant either). I also tried moving all strings to temporary buffers, but modes 5 & 6 eliminated this explanation as well.
>>>>>> 
>>>>>> My remaining theories:
>>>>>> - A task writing out of bounds (but only 0-bytes?)
>>>>>> - A hardware issue only affecting some modules
>>>>>> 
>>>>>> A hardware issue only affecting some percentage of ESP32 could explain this as well as the strange heap corruptions that seem to affect some modules especially often.
>>>>>> 
>>>>>> Regards,
>>>>>> Michael
>>>>>> 
>>>>>> -- 
>>>>>> Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
>>>>>> Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> OvmsDev mailing list
>>>>>> OvmsDev at lists.openvehicles.com
>>>>>> http://lists.openvehicles.com/mailman/listinfo/ovmsdev
