My test log shows the buffer address
0x3f802044, so allocations are done from the beginning and we're
not touching the upper 2 MB.
Maybe the 2 MB workaround doesn't apply in our case, or has a
correlation with the core assignment.
Reallocations as a source can be ruled out, as I reserve enough
capacity on the std::string.
Urgh.
Indeed.
On 30.01.19 at 10:00, Mark Webb-Johnson wrote:
Mark,
issue #2892 mentions using only the lower 2 MB of SPI
RAM as a workaround. Where are our allocations placed by
the allocator? Does it fill from the middle or the end? If
not, we aren't using the upper half yet.
I saw that, and had a look. I don’t see anything in makeconfig
to limit to 2MB (not 4MB).
I do see a CONFIG_SPIRAM_SIZE=4194304 in sdkconfig, but I'm not
sure how it gets there. Perhaps we can just change it there to 2MB
and try?
I did see this in the documentation:
During ESP-IDF startup, external RAM is mapped into the data
address space starting at address 0x3F800000
(byte-accessible). The length of this region is the same as
the SPIRAM size (up to the limit of 4MiB).
So, maybe we can look at the address our allocations come
from to see their offset from 0x3F800000? If it is coming from
the top 2MB, then perhaps we start with a big 2MB allocation
that we never use?
A possible workaround for the
most apparent issues with this (string assembly) could
be to use char buffers instead of std::string. I was
thinking about doing that at least for the websocket
stream.
It seems strange that we are only seeing it with
std::string. Maybe that just stresses the system more, or does
a lot of reallocations?
But if this really is a
hardware issue, it can affect all objects in SPI RAM;
appending to std::string then only triggers it more
often.
Another workaround could be to run everything on core 0.
Urgh.
Regards,
Michael
On 30.01.19 at 03:33, Mark Webb-Johnson wrote:
Michael,
Espressif’s response (and linking to that
other issue) sounds like others are seeing this.
Lousy timing (with Chinese New Year next
week), so don’t expect anything quick from Espressif.
I guess we’ll just have to live with it until they can
find a workaround. It sounds like the issue is at the
hardware level and a compiler patch will be needed.
Regards, Mark.
→
https://github.com/espressif/esp-idf/issues/3006
Regards,
Michael
On 28.01.19 at 20:46, Michael Balzer wrote:
To clarify: the bug is most likely not
restricted to the case of building a message
in a buffer. It can possibly cause
corruptions in any RAM section, so can well
be responsible for many/most of the
unidentified crashes and stack/heap
corruptions we're experiencing.
Regards,
Michael
On 28.01.19 at 20:43, Michael Balzer wrote:
For those not following the github
discussion: I'm pretty sure I've nailed
the bug down.
I have reproduced the bug in a simple test
project and intend to raise an issue with
Espressif on this.
https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435
If you'd like to test this on your module,
configure the project to your wifi
credentials, then use "make flash". As the
test project is small, this normally will
not erase your OVMS config partition, but
a backup is always recommended.
Regards,
Michael
On 24.01.19 at 21:19, Michael Balzer wrote:
Everyone please have a look at…
https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248
Please try to reproduce the bug on your
modules.
I'm open to explanations.
I thought this might be some
copy-on-write bug with std::string, but
the gcc 5.x libstdc++ no longer uses
that implementation (it wouldn't be C++11
compliant either). I also tried moving
all strings to temporary buffers, but
modes 5 & 6 eliminated this
explanation as well.
My remaining theories:
- A task writing out of
bounds (but only 0-bytes?)
- A hardware issue only
affecting some modules
A hardware issue only affecting some
percentage of ESP32 could explain this
as well as the strange heap corruptions
that seem to affect some modules
especially often.
Regards,
Michael