[Ovmsdev] Reboot under some load
Michael Balzer
dexter at expeedo.de
Tue Nov 1 04:20:32 HKT 2022
Ludovic,
On 31.10.22 at 10:59, Ludovic LANGE wrote:
>
> (I'm reposting because I had the impression that my message didn't get
> through. If it appears as a duplicate, please forgive me - and delete
> the double post if necessary. Still learning how to handle this delay
> between post and list visibility (moderation ?))
>
You're right, it didn't get through, but there is no moderation. Have
you checked your junk folder for an error message? Mark may be able to
see something in the logs.
> Metrics are properly generated (from DBC), and properly displayed on
> the dashboard. However, the combination of the "intense" bus traffic,
> + number of generated metrics seems to be, in some way, overflowing
> the capacity of the WebSocketHandler, which results in a reboot from
> time to time:
>
>> W (5111095) websocket: WebSocketHandler[0x3f8d1654]: job queue
>> overflow resolved, 14 drops
>> W (5111095) websocket: WebSocketHandler[0x3f8d1654]: job queue
>> overflow detected
>> I (5111105) metrics: Modified metric v.g.current: 0A
>> I (5111105) metrics: Modified metric v.m.rpm: 763
>> I (5111115) metrics: Modified metric v.i.temp: 34.1°C
>> W (5111115) websocket: WebSocketHandler[0x3f8d1654]: job queue
>> overflow detected
>> W (5111125) websocket: WebSocketHandler[0x3f8d1654]: job queue
>> overflow detected
A WebSocket client channel can jam easily if it can't transmit the data
to the client fast enough. This doesn't depend on the actual Wifi
connection quality alone, but also on the processing speed of the client
device. My impression is that complex and fast chart updates can force
the Javascript engine to do a lot of memory management work.
I haven't had the time to analyse this, but I'm pretty sure there are
options to reduce the load. The dashboard & chart data processing is
still my first implementation, and I didn't invest much time in
optimizing it. For example, every new data series is a new allocation,
so the garbage collector has quite some work to do.
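To illustrate the kind of optimization meant here: the sketch below is not the dashboard code (which is Javascript), just a minimal Python model of the idea, assuming a chart only needs the most recent N samples of a series. The class name SeriesBuffer and its methods are hypothetical. The storage is allocated once and then reused, so adding a sample creates no new series object for the garbage collector to clean up.

```python
class SeriesBuffer:
    """Fixed-capacity ring buffer for one chart series (hypothetical helper)."""

    def __init__(self, capacity):
        self._values = [0.0] * capacity  # allocated once, reused afterwards
        self._capacity = capacity
        self._start = 0                  # index of the oldest sample
        self._count = 0                  # number of valid samples

    def push(self, value):
        """Store a sample, overwriting the oldest one when the buffer is full."""
        end = (self._start + self._count) % self._capacity
        self._values[end] = value
        if self._count < self._capacity:
            self._count += 1
        else:
            self._start = (self._start + 1) % self._capacity

    def __len__(self):
        return self._count

    def __iter__(self):
        """Yield samples from oldest to newest without copying the storage."""
        for i in range(self._count):
            yield self._values[(self._start + i) % self._capacity]
```

Whether this actually helps in the browser would need to be measured with the developer tools; it only shows the allocation pattern.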
Having said that, you should also try to reduce the data volume. From
your logs it seems you've got metrics tracing enabled. That produces a
log message on every metrics update, and all log messages are
transmitted via the WebSocket channel.
>> E (5111845) task_wdt: Tasks currently running:
>> E (5111845) task_wdt: CPU 0: wifi
>> E (5111845) task_wdt: CPU 1: OVMS Console
If you didn't execute a command on the console at that moment, that's
probably also an indicator of a high log load.
> Please note that the Lab setup has:
>
> * OVMS connected to the Lab network
> * The computer (displaying the dashboard) also connected to the Lab
> network
>
> (While, in the car, the computer / tablet would be directly connected
> to OVMS' wifi)
>
Shouldn't make much of a difference. But you could try configuring just
Wifi client or AP mode, not both, depending on the setup. The AP runs
on the same channel as the client connection, so it might cut off some
capacity.
> That's it for the context, now a few questions:
>
> * As I don't know about the capabilities of the OVMS for CAN bus
> traffic analysis, does it looks like the number / frequency of
> messages I'm injecting is unreasonable ?
>
No.
> * It seems like there is a buffering / consolidation of the metrics
> before sending them to the web socket ; is this tweakable in some
> way ?
>
Metrics updates are initiated by the web client update ticker every 250
ms. You can experiment with changing the interval or make it a
configuration option if you like, but I had bad results with higher
frequencies, which produced too much load on the smartphones tested,
and lower frequencies are bad for a smooth UI experience.
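The effect of a fixed update interval can be modelled roughly as follows. This is a simplified Python sketch, not the firmware implementation: MetricBatcher, on_metric_modified and on_tick are hypothetical names, and the only value taken from above is the 250 ms interval. The point it shows is that several modifications of the same metric between two ticks result in only one value being sent.

```python
UPDATE_INTERVAL_MS = 250  # interval of the update ticker as stated above


class MetricBatcher:
    """Collects modified metrics and hands them out once per tick (hypothetical)."""

    def __init__(self):
        self._pending = {}

    def on_metric_modified(self, name, value):
        # A later value for the same metric replaces the earlier one,
        # so at most one value per metric is sent per tick.
        self._pending[name] = value

    def on_tick(self):
        """Called every UPDATE_INTERVAL_MS; returns the batch to transmit."""
        batch, self._pending = self._pending, {}
        return batch


batcher = MetricBatcher()
batcher.on_metric_modified("v.m.rpm", 763)
batcher.on_metric_modified("v.m.rpm", 770)
batcher.on_metric_modified("v.i.temp", 34.1)
print(batcher.on_tick())
```

A longer interval means fewer, larger batches (less load, less smooth UI); a shorter one means the opposite.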
Regarding the queue overflow, you might experiment with raising the
queue size, which is currently 50 jobs. But if 50 tx jobs are reached,
chances are you've got Wifi or client capacity issues.
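For reference, the behaviour behind the "overflow detected" / "overflow resolved, N drops" messages can be modelled like this. It is a minimal Python sketch under assumptions, not the actual WebSocketHandler code: the class TxJobQueue and its members are hypothetical, and the real firmware may count and report drops differently. Only the limit of 50 jobs is taken from above.

```python
from collections import deque


class TxJobQueue:
    """Bounded transmit queue that drops new jobs when full (hypothetical model)."""

    def __init__(self, max_jobs=50):
        self._jobs = deque()
        self._max_jobs = max_jobs
        self._drops = 0    # drops since the overflow started
        self.events = []   # stands in for the log output

    def add(self, job):
        """Queue a job; return False if it had to be dropped."""
        if len(self._jobs) >= self._max_jobs:
            if self._drops == 0:
                self.events.append("overflow detected")
            self._drops += 1
            return False
        if self._drops > 0:
            self.events.append(f"overflow resolved, {self._drops} drops")
            self._drops = 0
        self._jobs.append(job)
        return True

    def take(self):
        """Remove and return the oldest job, or None if the queue is empty."""
        return self._jobs.popleft() if self._jobs else None
```

In this model a larger max_jobs only delays the first drop. If the client keeps taking jobs more slowly than they are added, the queue fills up again.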
> * Does the DBC processor add a significant processing time (compared
> to a dedicated vehicle module) when processing CAN data ?
>
Don't know, haven't used the DBC processor for real data.
> * What would be the best way to diagnose / confirm the health of the
> processes involved here ?
>
Use the task monitoring (module tasks) to check the CPU load of your
processes.
Reduce any unnecessary load: for example, avoid excessive logging, user
event creation, file writes and especially SD card accesses, which can
be very slow. See my warning here:
https://docs.openvehicles.com/en/latest/userguide/scripting.html#vfs
Use the browser developer tools to analyse client performance. Btw, you
can see the actual websocket packets if you open the network monitor
before opening the web UI.
> * any similar use case / feedback from you ?
>
>
> Thanks for any feedback.
>
Regards,
Michael
--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26