<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>(I'm reposting because I had the impression that my message

      didn't get through. If it appears as a duplicate, please forgive

      me - and delete the double post if necessary. Still learning how

      to handle the delay between posting and list visibility

      (moderation?))<br>

    </p>

    <p>Hello List,</p>

    <p>I'm facing some reboots which look load-related (task watchdog

      not reset in time). I'll try to troubleshoot / diagnose this

      further, but I thought it would be interesting to have your

      feedback on it.</p>

    <p><br>

    </p>

    <p>I'm currently tweaking a dashboard; the idea is to have an

      in-vehicle display (WiFi-connected) showing a few important

      metrics to the driver (RPM / speed / voltage / SOC / multiple

      temperatures / range / controller status / BMS and cell status /

      ...).</p>

    <p>I don't know if images are OK on the list; here is a sample of the

      dashboard - you'll recognize the obvious lineage from the official

      OVMS dashboard:</p>

    <p><img src="cid:part1.oSOgLj3B.s0vWm5fk@lange.nom.fr" alt=""

        class=""></p>

    <p>The metrics are coming from DBC analysis of the CAN bus traffic.</p>

    <p>For the tests I'm not in a vehicle; I'm replaying CAN bus

      traffic and feeding it to OVMS over a local CAN bus (not via the

      CAN play framework, as I have not yet had time to look at

      <a class="moz-txt-link-freetext"

href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/747"

        moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/issues/747</a>

      in detail).</p>

    <p>There are (approximately):</p>

    <ul>

      <li>1 message repeating every 3 ms (333 Hz)</li>

      <li>10 messages occurring every 10 ms (100 Hz)</li>

      <li>5 messages spaced 100 ms apart (10 Hz)</li>

      <li>3 messages every 500 ms (2 Hz)<br>

      </li>

    </ul>

    <p>CAN bus speed is 250 kbit/s.<br>

    </p>
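    <p>For reference, here is a rough back-of-the-envelope estimate of

      the frame rate and bus load implied by the message rates listed

      above. It is only a sketch: it assumes classic CAN with 11-bit

      identifiers and 8 data bytes per frame, and ignores bit stuffing,

      none of which I have verified against the actual capture.</p>

    <pre>
```python
# Rough estimate of frame rate and bus load for the replayed traffic.
# Assumptions (not measured): classic CAN, 11-bit identifiers,
# 8 data bytes per frame, bit stuffing ignored.

BITRATE = 250_000  # bit/s, as configured on the bus

# (number of distinct messages, period in milliseconds), from the list above
MESSAGE_GROUPS = [(1, 3), (10, 10), (5, 100), (3, 500)]

# Standard data frame with 8 data bytes, including interframe space:
# SOF 1 + ID 11 + RTR 1 + IDE 1 + r0 1 + DLC 4 + data 64
# + CRC 15 + CRC delimiter 1 + ACK 2 + EOF 7 + IFS 3 = 111 bits
BITS_PER_FRAME = 111


def frames_per_second(groups):
    """Sum the frame rates of all message groups."""
    return sum(count * 1000 / period_ms for count, period_ms in groups)


def bus_load(groups, bitrate=BITRATE, bits_per_frame=BITS_PER_FRAME):
    """Fraction of the bus bandwidth used by the given traffic."""
    return frames_per_second(groups) * bits_per_frame / bitrate


if __name__ == "__main__":
    fps = frames_per_second(MESSAGE_GROUPS)
    print(f"{fps:.0f} frames/s, about {bus_load(MESSAGE_GROUPS):.0%} bus load")
```
    </pre>

    <p>Under these assumptions this gives roughly 1389 frames/s and

      about 62% bus load; bit stuffing, extended identifiers or shorter

      frames would shift that figure.</p>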

    <p>Metrics are properly generated (from DBC) and properly displayed

      on the dashboard. However, the combination of the "intense" bus

      traffic and the number of generated metrics seems to exceed the

      capacity of the WebSocketHandler job queue, which results in a

      reboot from time to time:<br>

    </p>

    <p> </p>

    <blockquote type="cite"><font face="monospace">W (5111095)

        websocket: WebSocketHandler[0x3f8d1654]: job queue overflow

        resolved, 14 drops<br>

        W (5111095) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow detected<br>

        I (5111105) metrics: Modified metric v.g.current: 0A<br>

        I (5111105) metrics: Modified metric v.m.rpm: 763<br>

        I (5111115) metrics: Modified metric v.i.temp: 34.1°C<br>

        W (5111115) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow detected<br>

        W (5111125) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow detected<br>

        I (5111125) metrics: Modified metric v.m.rpm: 765<br>

        W (5111135) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow detected<br>

        I (5111145) metrics: Modified metric v.m.rpm: 758<br>

        W (5111145) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow detected<br>

        I (5111155) metrics: Modified metric v.m.rpm: 756<br>

        W (5111155) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow resolved, 7 drops<br>

        I (5111165) metrics: Modified metric v.m.rpm: 760<br>

        W (5111175) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow resolved, 1 drops<br>

        W (5111185) websocket: WebSocketHandler[0x3f8d1654]: job queue

        overflow<br>

        E (5111845) task_wdt: Task watchdog got triggered. The

        following tasks did not reset the watchdog in time:<br>

        E (5111845) task_wdt:  - IDLE1 (CPU 1)<br>

        E (5111845) task_wdt: Tasks currently running:<br>

        E (5111845) task_wdt: CPU 0: wifi<br>

        E (5111845) task_wdt: CPU 1: OVMS Console<br>

        E (5111845) task_wdt: Aborting.<br>

        abort() was called at PC 0x400e9920 on core 0<br>

        <br>

        ELF file SHA256: 51b422e8c864d36f<br>

        <br>

        Backtrace: 0x4008ddca:0x3ffb0690 0x4008e065:0x3ffb06b0

        0x400e9920:0x3ffb06d0 0x40084176:0x3ffb06f0<br>

        <br>

        Rebooting...<br>

        ets Jul 29 2019 12:21:46<br>

        <br>

        rst:0xc (SW_CPU_RESET),boot:0x1f (SPI_FAST_FLASH_BOOT)<br>

        configsip: 0, SPIWP:0xee<br>

clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00<br>

        mode:DIO, clock div:2<br>

        load:0x3fff0018,len:4<br>

        load:0x3fff001c,len:4796<br>

        load:0x40078000,len:0<br>

        load:0x40078000,len:14896<br>

        entry 0x40078d74<br>

        I (1068) psram: This chip is ESP32-D0WD<br>

        I (1068) spiram: Found 64MBit SPI RAM device</font></blockquote>

    <p>Please note that the Lab setup has:</p>

    <ul>

      <li>OVMS connected to the Lab network</li>

      <li>The computer (displaying the dashboard) also connected to the

        Lab network</li>

    </ul>

    <p>(In the car, by contrast, the computer / tablet would be directly

      connected to the OVMS WiFi.)</p>

    <p><br>

    </p>

    <p>That's it for the context, now a few questions:</p>

    <ul>

      <li>As I don't know the capabilities of OVMS for CAN bus

        traffic analysis, does the number / frequency of messages I'm

        injecting look unreasonable?</li>

      <li>There seems to be some buffering / consolidation of the

        metrics before they are sent to the web socket; is this

        tweakable in some way?</li>

      <li>Does the DBC processor add significant processing time

        (compared to a dedicated vehicle module) when processing CAN

        data?<br>

      </li>

      <li>What would be the best way to diagnose / confirm the health of

        the processes involved here?<br>

      </li>

      <li>Have any of you had a similar use case, or any feedback?</li>

    </ul>
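    <p>To make the buffering / consolidation question concrete, this is

      the kind of behaviour I have in mind. It is purely an illustrative

      sketch, not the OVMS implementation: the class, its parameters

      and the 250 ms interval are all hypothetical.</p>

    <pre>
```python
# Illustration only: NOT how OVMS does it. All names are hypothetical.
import time


class MetricCoalescer:
    """Keep only the latest value per metric and flush at a fixed interval."""

    def __init__(self, send, interval_s=0.25, clock=time.monotonic):
        self._send = send              # callable taking a dict of name to value
        self._interval_s = interval_s  # minimum time between two flushes
        self._clock = clock            # injectable so the class can be tested
        self._pending = {}
        self._last_flush = clock()

    def update(self, name, value):
        # A newer value overwrites the pending one, so bursts collapse
        # into a single entry per metric.
        self._pending[name] = value
        if self._clock() - self._last_flush >= self._interval_s:
            self.flush()

    def flush(self):
        # Send one batch with the latest values, then start a new interval.
        if self._pending:
            self._send(dict(self._pending))
            self._pending.clear()
        self._last_flush = self._clock()
```
    </pre>

    <p>With something like this, a metric such as v.m.rpm changing every

      10 ms would produce one outgoing update per interval instead of

      one job per change.</p>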

    <p><br>

    </p>

    <p>Thanks for any feedback.</p>

    <p><br>

    </p>

    <p>Regards,</p>

    <p>Ludovic.<br>

    </p>


  </body>

</html>