<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">My test log shows the buffer address
      0x3f802044, so allocations are done from the beginning and we're
      not touching the upper 2 MB.<br>
      <br>
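      <div>A minimal sketch of that offset check, assuming the 0x3F800000
        base address from the ESP-IDF documentation quoted below and a
        2 MiB boundary. The constant and function names are hypothetical
        and not part of ESP-IDF or OVMS:</div>
<pre>
```cpp
// Sketch only: the base address comes from the ESP-IDF documentation quoted
// in this thread; all names here are hypothetical helpers.
constexpr unsigned long kSpiramBase = 0x3F800000UL;  // start of mapped external RAM
constexpr unsigned long kTwoMiB = 2UL * 1024UL * 1024UL;

// Byte offset of an address from the start of the SPI RAM mapping.
// For an address below the base, the unsigned subtraction wraps around
// to a very large value.
constexpr unsigned long spiram_offset(unsigned long addr) {
  return addr - kSpiramBase;
}

// True if the address lies in the first 2 MiB of the mapping.
// Integer division by 2 MiB yields 0 only for offsets in that range, so
// wrapped (below-base) and upper-half addresses are both rejected.
constexpr bool in_lower_2mib(unsigned long addr) {
  return spiram_offset(addr) / kTwoMiB == 0;
}
```
</pre>
      <div>For the logged address 0x3f802044 this gives an offset of
        0x2044 (8260 bytes), so in_lower_2mib() returns true.<br>
        <br>
      </div>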
      Maybe the 2 MB workaround doesn't apply in our case, or has a
      correlation with the core assignment.<br>
      <br>
      Reallocations as a source can be ruled out, as I reserve enough
      capacity on the std::string.<br>
      <br>
      <blockquote type="cite">Urgh.
      </blockquote>
      <br>
      Indeed.<br>
      <br>
      <br>
      On 30.01.19 at 10:00, Mark Webb-Johnson wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:A5FC95B0-CB4E-4C5D-B77C-51A246008E94@webb-johnson.net">
      <div>
        <blockquote type="cite" class="">
          <div class="">On 30 Jan 2019, at 4:41 PM, Michael Balzer &lt;<a
              href="mailto:dexter@expeedo.de" class=""
              moz-do-not-send="true">dexter@expeedo.de</a>&gt; wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <meta http-equiv="Content-Type" content="text/html;
              charset=UTF-8" class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <div class="moz-cite-prefix">Mark,<br class="">
                <br class="">
                issue #2892 mentions using only the lower 2 MB of SPI
                RAM as a workaround. Where are our allocations placed by
                the allocator? Does it fill from the middle or end? If
                not, we don't use the upper half yet.<br class="">
              </div>
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
        I saw that and had a look. I don’t see anything in makeconfig
        to limit the SPI RAM to 2MB instead of 4MB.</div>
      <div><br class="">
      </div>
      <div>I do see a CONFIG_SPIRAM_SIZE=4194304 in sdkconfig, but I’m
        not sure how it gets there. Perhaps we can just change it there
        to 2MB and try?</div>
      <div><br class="">
      </div>
      <div>I did see this in the documentation:</div>
      <div><br class="">
      </div>
      <blockquote style="margin: 0 0 0 40px; border: none; padding:
        0px;" class="">
        <div><span style="caret-color: rgb(64, 64, 64); color: rgb(64,
            64, 64); font-family: Lato, proxima-nova, 'Helvetica Neue',
            Arial, sans-serif; font-size: 16px;
            background-color: rgb(252, 252, 252);" class="">During
            ESP-IDF startup, external RAM is mapped into the data
            address space starting at address 0x3F800000
            (byte-accessible). The length of this region is the same as
            the SPIRAM size (up to the limit of 4MiB).</span></div>
      </blockquote>
      <div><br class="">
      </div>
      <div>So, maybe we can look at the addresses our allocations come
        from to see their offset from 0x3F800000? If they are coming
        from the top 2MB, then perhaps we could start with a big 2MB
        allocation that we never use?</div>
      <div> <br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <div class="moz-cite-prefix">A possible workaround for the
                most apparent issues with this (string assembly) could
                be to use char buffers instead of std::string. I was
                thinking about doing that at least for the websocket
                stream.<br class="">
              </div>
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
        <div>It seems strange that we are only seeing it with
          std::string. Maybe that just stresses the system more, or does
          a lot of reallocations?</div>
        <br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <div class="moz-cite-prefix"> But if this really is a
                hardware issue, it can affect all objects in SPI RAM;
                appending to std::string then only triggers it more
                often.<br class="">
                <br class="">
                Another workaround could be to run everything on core 0.<br
                  class="">
              </div>
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
        Urgh.</div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div text="#000000" bgcolor="#FFFFFF" class="">
              <div class="moz-cite-prefix">Regards,<br class="">
                Michael<br class="">
                <br class="">
                <br class="">
                On 30.01.19 at 03:33, Mark Webb-Johnson wrote:<br
                  class="">
              </div>
              <blockquote type="cite"
                cite="mid:4737F304-A344-4BD8-9B0C-95366DB7A40F@webb-johnson.net"
                class="">
                <meta http-equiv="Content-Type" content="text/html;
                  charset=UTF-8" class="">
                Michael,
                <div class=""><br class="">
                </div>
                <div class="">Espressif’s response (and their link to that
                  other issue) suggests that others are seeing this.</div>
                <div class=""><br class="">
                </div>
                <div class="">Lousy timing (with Chinese New Year next
                  week), so don’t expect anything quick from Espressif.
                  I guess we’ll just have to live with it until they can
                  find a workaround. It sounds like the issue is at the
                  hardware level and a compiler patch will be needed.</div>
                <div class=""><br class="">
                </div>
                <div class="">Regards, Mark.<br class="">
                  <div class=""><br class="">
                    <blockquote type="cite" class="">
                      <div class="">On 30 Jan 2019, at 3:07 AM, Michael
                        Balzer &lt;<a href="mailto:dexter@expeedo.de"
                          class="" moz-do-not-send="true">dexter@expeedo.de</a>&gt;
                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <div class="">
                        <meta http-equiv="Content-Type"
                          content="text/html; charset=UTF-8" class="">
                        <div text="#000000" bgcolor="#FFFFFF" class="">
                          → <a class="moz-txt-link-freetext"
                            href="https://github.com/espressif/esp-idf/issues/3006"
                            moz-do-not-send="true">https://github.com/espressif/esp-idf/issues/3006</a><br
                            class="">
                          <br class="">
                          Regards,<br class="">
                          Michael<br class="">
                          <br class="">
                          <br class="">
                          <div class="moz-cite-prefix">On 28.01.19 at
                            20:46, Michael Balzer wrote:<br class="">
                          </div>
                          <blockquote type="cite"
                            cite="mid:b193b172-9a83-6df5-7b61-9e0cbfa46281@expeedo.de"
                            class="">
                            <meta http-equiv="Content-Type"
                              content="text/html; charset=UTF-8"
                              class="">
                            To clarify: the bug is most likely not
                            restricted to the case of building a message
                            in a buffer. It can possibly cause
                            corruption in any RAM section, so it may well
                            be responsible for many or most of the
                            unidentified crashes and stack/heap
                            corruptions we're experiencing.<br class="">
                            <br class="">
                            Regards,<br class="">
                            Michael<br class="">
                            <br class="">
                            <br class="">
                            <div class="moz-cite-prefix">On 28.01.19 at
                              20:43, Michael Balzer wrote:<br class="">
                            </div>
                            <blockquote type="cite"
                              cite="mid:06d0ea65-f4d5-e2c5-9ca1-b0ad63bfa3b1@expeedo.de"
                              class="">
                              <meta http-equiv="Content-Type"
                                content="text/html; charset=UTF-8"
                                class="">
                              For those not following the github
                              discussion: I'm pretty sure I've nailed
                              the bug down.<br class="">
                              <br class="">
                              I have reproduced the bug in a simple test
                              project and intend to raise an issue with
                              Espressif on this.<br class="">
                              <br class="">
                              <a class="moz-txt-link-freetext"
href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435"
                                moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457965435</a><br
                                class="">
                              <br class="">
                              If you'd like to test this on your module,
                              configure the project with your wifi
                              credentials, then use "make flash". As the
                              test project is small, this normally will
                              not erase your OVMS config partition, but
                              a backup is always recommended.<br
                                class="">
                              <br class="">
                              Regards,<br class="">
                              Michael<br class="">
                              <br class="">
                              <br class="">
                              <div class="moz-cite-prefix">On 24.01.19
                                at 21:19, Michael Balzer wrote:<br
                                  class="">
                              </div>
                              <blockquote type="cite"
                                cite="mid:e4159a1c-b283-25b0-f1b5-ce570b529a1c@expeedo.de"
                                class="">
                                <meta http-equiv="content-type"
                                  content="text/html; charset=UTF-8"
                                  class="">
                                Everyone please have a look at…<br
                                  class="">
                                <br class="">
                                <a class="moz-txt-link-freetext"
href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248"
                                  moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/189#issuecomment-457334248</a><br
                                  class="">
                                <br class="">
                                Please try to reproduce the bug on your
                                modules.<br class="">
                                <br class="">
                                I'm open to explanations.<br class="">
                                <br class="">
                                I thought this might be some
                                copy-on-write bug with std::string, but
                                the gcc 5.x libstdc++ no longer uses
                                that implementation (it wouldn't be C++11
                                compliant either). I also tried moving
                                all strings to temporary buffers, but
                                modes 5 &amp; 6 eliminated this
                                explanation as well.<br class="">
                                <br class="">
                                My remaining theories:<br class="">
                                <ul class="">
                                  <li class="">A task writing out of
                                    bounds (but only 0-bytes?)</li>
                                  <li class="">A hardware issue only
                                    affecting some modules<br class="">
                                  </li>
                                </ul>
                                A hardware issue only affecting some
                                percentage of ESP32 could explain this
                                as well as the strange heap corruptions
                                that seem to affect some modules
                                especially often.<br class="">
                                <br class="">
                                Regards,<br class="">
                                Michael<br>
                              </blockquote>
                            </blockquote>
                          </blockquote>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                </div>
              </blockquote>
            </div>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="144">-- 
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
</pre>
  </body>
</html>