<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">My test log shows the buffer address
0x3f802044, so allocations are done from the beginning and we're
not touching the upper 2 MB.<br>
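<div>For reference, a minimal sketch of that check. The base address
0x3F800000 is taken from the ESP-IDF documentation quoted below; the
helper names are hypothetical and only serve as illustration.</div>
<pre>
```cpp
// Base of the external RAM mapping, as quoted from the ESP-IDF documentation.
constexpr unsigned long kSpiramBase = 0x3F800000UL;
// Size of the lower region discussed as a workaround (2 MiB).
constexpr unsigned long kTwoMiB = 2UL * 1024UL * 1024UL;

// Offset of an address from the start of the mapped external RAM.
// Hypothetical helper; assumes addr is not below kSpiramBase.
constexpr unsigned long spiram_offset(unsigned long addr) {
    return addr - kSpiramBase;
}

// True if the address lies within the first 2 MiB of the mapping.
constexpr bool in_lower_2mib(unsigned long addr) {
    return addr >= kSpiramBase ? spiram_offset(addr) < kTwoMiB : false;
}
```
</pre>
<div>For 0x3f802044 this yields an offset of 0x2044 (8260 bytes), well
inside the lower 2 MB.</div>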
<br>
Maybe the 2 MB workaround doesn't apply in our case, or it
correlates with the core assignment.<br>
<br>
Reallocations as a source can be ruled out, as I reserve enough
capacity on the std::string.<br>
<br>
<blockquote type="cite">Urgh.
</blockquote>
<br>
Indeed.<br>
<br>
<br>
On 30.01.19 at 10:00, Mark Webb-Johnson wrote:<br>
</div>
<blockquote type="cite"
cite="mid:A5FC95B0-CB4E-4C5D-B77C-51A246008E94@webb-johnson.net">
<div>
<blockquote type="cite" class="">
<div class="">On 30 Jan 2019, at 4:41 PM, Michael Balzer &lt;<a
href="mailto:dexter@expeedo.de" class=""
moz-do-not-send="true">dexter@expeedo.de</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8" class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">Mark,<br class="">
<br class="">
issue #2892 mentions using only the lower 2 MB of SPI
RAM as a workaround. Where are our allocations placed by
the allocator? Does it fill from the middle or end? If
not, we aren't using the upper half yet.<br class="">
</div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
I saw that, and had a look. I don’t see anything in makeconfig
to limit it to 2MB (instead of 4MB).</div>
<div><br class="">
</div>
<div>I do see a CONFIG_SPIRAM_SIZE=4194304 in sdkconfig, but I'm not
sure how it gets there. Perhaps we can just change it there to 2MB
and try?</div>
<div><br class="">
</div>
<div>I did see this in the documentation:</div>
<div><br class="">
</div>
<blockquote style="margin: 0 0 0 40px; border: none; padding:
0px;" class="">
<div><span style="caret-color: rgb(64, 64, 64); color: rgb(64,
64, 64); font-family: Lato, proxima-nova, &quot;Helvetica
Neue&quot;, Arial, sans-serif; font-size: 16px;
background-color: rgb(252, 252, 252);" class="">During
ESP-IDF startup, external RAM is mapped into the data
address space starting at address 0x3F800000
(byte-accessible). The length of this region is the same as
the SPIRAM size (up to the limit of 4MiB).</span></div>
</blockquote>
<div><br class="">
</div>
<div>So, maybe we can look at the address our allocations come
from to see their offset from 0x3F800000? If it is coming from
the top 2MB, then perhaps we start with a big 2MB allocation
that we never use?</div>
<div> <br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">A possible workaround for the
most apparent issues with this (string assembly) could
be to use char buffers instead of std::string. I was
thinking about doing that at least for the websocket
stream.<br class="">
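<div class="">A minimal sketch of what such a char buffer could look like.
The type name FixedBuffer and the capacity of 64 bytes are assumptions
for illustration only, not code from the project.</div>
<pre class="">
```cpp
// Hypothetical fixed-capacity text buffer: assembles a message in place,
// without any heap allocation or reallocation.
struct FixedBuffer {
    static constexpr unsigned kCapacity = 64;  // includes the terminating '\0'
    char data[kCapacity] = {0};
    unsigned length = 0;                       // characters stored, without '\0'

    // Appends text only if it fits completely; otherwise the buffer
    // is left unchanged and false is returned.
    bool append(const char* text) {
        unsigned n = 0;
        while (text[n] != '\0') ++n;                   // length of text
        if (length + n + 1 > kCapacity) return false;  // would overflow
        for (unsigned i = 0; i != n; ++i) data[length + i] = text[i];
        length += n;
        data[length] = '\0';
        return true;
    }
};
```
</pre>
<div class="">Rejecting an append that does not fit keeps the buffer
terminated and intact, so the caller can flush it and retry.</div>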
</div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>It seems strange that we are only seeing it with
std::string. Maybe that just stresses the system more, or does
a lot of reallocations?</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix"> But if this really is a
hardware issue, it can affect all objects in SPI RAM;
appending to std::string then only triggers it more
often.<br class="">
<br class="">
Another workaround could be to run everything on core 0.<br
class="">
</div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
Urgh.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 30.01.19 at 03:33, Mark Webb-Johnson wrote:<br
class="">
</div>
<blockquote type="cite"
cite="mid:4737F304-A344-4BD8-9B0C-95366DB7A40F@webb-johnson.net"
class="">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8" class="">
Michael,
<div class=""><br class="">
</div>
<div class="">Espressif’s response (and linking to that
other issue) sounds like others are seeing this.</div>
<div class=""><br class="">
</div>
<div class="">Lousy timing (with Chinese New Year next
week), so don’t expect anything quick from Espressif.
I guess we’ll just have to live with it until they can
find a workaround. It sounds like the issue is at the
hardware level and a compiler patch will be needed.</div>
<div class=""><br class="">
</div>
<div class="">Regards, Mark.<br class="">
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">On 30 Jan 2019, at 3:07 AM, Michael
Balzer &lt;<a href="mailto:dexter@expeedo.de"
class="" moz-do-not-send="true">dexter@expeedo.de</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<meta http-equiv="Content-Type"
content="text/html; charset=UTF-8" class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
→ <a class="moz-txt-link-freetext"
href="https://github.com/espressif/esp-idf/issues/3006"
moz-do-not-send="true">https://github.com/espressif/esp-idf/issues/3006</a><br
class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
<div class="moz-cite-prefix">On 28.01.19 at
20:46, Michael Balzer wrote:<br class="">
</div>
<blockquote type="cite"
cite="mid:b193b172-9a83-6df5-7b61-9e0cbfa46281@expeedo.de"
class="">
<meta http-equiv="Content-Type"
content="text/html; charset=UTF-8"
class="">
To clarify: the bug is most likely not
restricted to the case of building a message
in a buffer. It can possibly cause
corruption in any RAM section, so it may well
be responsible for many or most of the
unidentified crashes and stack/heap
corruptions we're experiencing.<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
<div class="moz-cite-prefix">On 28.01.19 at
20:43, Michael Balzer wrote:<br class="">
</div>
<blockquote type="cite"
cite="mid:06d0ea65-f4d5-e2c5-9ca1-b0ad63bfa3b1@expeedo.de"
class="">
<meta http-equiv="Content-Type"
content="text/html; charset=UTF-8"
class="">
For those not following the github
discussion: I'm pretty sure I've nailed
the bug down.<br class="">
<br class="">
I have reproduced the bug in a simple test
project and intend to raise an issue with
Espressif on this.<br class="">
<br class="">
<a class="moz-txt-link-freetext"
href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435"
moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435</a><br
class="">
<br class="">
If you'd like to test this on your module,
configure the project with your Wi-Fi
credentials, then use "make flash". As the
test project is small, this normally will
not erase your OVMS config partition, but
a backup is always recommended.<br
class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
<div class="moz-cite-prefix">On 24.01.19
at 21:19, Michael Balzer wrote:<br
class="">
</div>
<blockquote type="cite"
cite="mid:e4159a1c-b283-25b0-f1b5-ce570b529a1c@expeedo.de"
class="">
<meta http-equiv="content-type"
content="text/html; charset=UTF-8"
class="">
Everyone please have a look at…<br
class="">
<br class="">
<a class="moz-txt-link-freetext"
href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248"
moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248</a><br
class="">
<br class="">
Please try to reproduce the bug on your
modules.<br class="">
<br class="">
I'm open to explanations.<br class="">
<br class="">
I thought this might be some
copy-on-write bug with std::string, but
the gcc 5.x libstdc++ no longer uses
that implementation (it wouldn't be C++11
compliant either). I also tried moving
all strings to temporary buffers, but
modes 5 &amp; 6 eliminated this
explanation as well.<br class="">
<br class="">
My remaining theories:<br class="">
<ul class="">
<li class="">A task writing out of
bounds (but only 0-bytes?)</li>
<li class="">A hardware issue only
affecting some modules<br class="">
</li>
</ul>
A hardware issue only affecting some
percentage of ESP32 could explain this
as well as the strange heap corruptions
that seem to affect some modules
especially often.<br class="">
<br class="">
Regards,<br class="">
Michael<br>
</blockquote>
</blockquote>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
<pre class="moz-signature" cols="144">--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
</pre>
</body>
</html>