<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Btw, regarding releases, I also will
change that on my server to include the commit offset in "edge"
and "main".<br>
<br>
That way I can merge and roll out important changes (e.g. security
updates) to these branches as well without the need for a new
major version.<br>
<br>
In the firmware distribution directory, "edge" and "main" are then
simply links to the release directory.<br>
<br>
Regards,<br>
Michael<br>
<br>
<br>
On 19.09.19 at 09:23, Michael Balzer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:7a3bb65a-9a54-6f67-c794-4b6f8d4ca2b8@expeedo.de">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<div class="moz-cite-prefix">Sorry, I didn't think about this
being an issue elsewhere -- german data plans typically start at
minimum 100 MB/month flat (that's my current plan at 3 € /
month).<br>
<br>
No need for a new release, it can be turned off OTA by issueing<br>
<br>
<tt>config set module debug.tasks no</tt><br>
<br>
Regards,<br>
Michael<br>
<br>
<br>
On 19.09.19 at 09:08, Mark Webb-Johnson wrote:<br>
</div>
<blockquote type="cite"
cite="mid:BFED773E-0CFE-4EBB-A3B4-9B44FE4A783D@webb-johnson.net">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8">
Yep:
<div class=""><br class="">
</div>
<blockquote style="margin: 0 0 0 40px; border: none; padding:
0px;" class="">
<div class="">758 bytes * (86400 / 300) * 30 = 6.5MB/month</div>
</blockquote>
<div class="">
<div><br class="">
</div>
<div>That is going over data (not SD). Presumably cellular
data for a large portion of the time.</div>
<div><br class="">
</div>
<div>I think we need to default this to OFF, and make a
3.2.004 to avoid this becoming an issue.</div>
<div><br class="">
</div>
<div>Regards, Mark.</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On 19 Sep 2019, at 2:04 PM, Stephen Casner
<<a href="mailto:casner@acm.org" class=""
moz-do-not-send="true">casner@acm.org</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">That's 6.55MB/month, unless you have
unusually short months! :-)<br class="">
<br class="">
In what space is that data stored? A log written to
SD? That's not<br class="">
likely to fill up the SD card too fast, but what
happens if no SD card<br class="">
is installed?<br class="">
<br class="">
-- Steve<br
class="">
<br class="">
On Thu, 19 Sep 2019, Mark Webb-Johnson wrote:<br
class="">
<br class="">
<blockquote type="cite" class="">
<blockquote type="cite" class=""> To enable CPU
usage statistics, apply the changes to sdkconfig<br
class="">
included.<br class="">
New history record:<br class="">
- "*-OVM-DebugTasks" v1:
&lt;taskcnt,totaltime&gt; + per task:<br class="">
&lt;tasknum,name,state,stack_now,stack_max,stack_total,<br class="">
heap_total,heap_32bit,heap_spi,runtime&gt;<br class="">
Note: CPU core use percentage = runtime /
totaltime<br class="">
</blockquote>
<br class="">
I’ve just noticed that this is enabled by default
now (my production build has the sdkconfig updated,
as per defaults).<br class="">
<br class="">
I am seeing 758 bytes of history record, every 5
minutes. About 218KB/day, or 654KB/month.<br
class="">
<br class="">
Should this be opt-in?<br class="">
<br class="">
Regards, Mark.<br class="">
<br class="">
<blockquote type="cite" class="">On 8 Sep 2019, at
5:43 PM, Michael Balzer <<a
href="mailto:dexter@expeedo.de" class=""
moz-do-not-send="true">dexter@expeedo.de</a>>
wrote:<br class="">
<br class="">
I've pushed some modifications and improvements to
(hopefully) fix the timer issue or at least be
able to debug it.<br class="">
<br class="">
Some sdkconfig changes are necessary.<br class="">
<br class="">
The build including these updates is on my edge
release as 3.2.002-258-g20ae554b.<br class="">
<br class="">
Btw: the network restart strategy seems to
mitigate issue #241; I've seen a major drop in
record repetitions on my server since the rollout.<br class="">
<br class="">
<br class="">
commit 99e4e48bdd40b7004c0976f51aba9e3da4ecab53<br
class="">
<br class="">
Module: add per task CPU usage statistics, add
task stats history records<br class="">
<br class="">
To enable CPU usage statistics, apply the
changes to sdkconfig<br class="">
included. The CPU usage shown by the commands
is calculated against<br class="">
the last task status retrieved (or system
boot).<br class="">
<br class="">
Command changes:<br class="">
- "module tasks" -- added CPU (core) usage in
percent per task<br class="">
<br class="">
New command:<br class="">
- "module tasks data" -- output task stats in
history record form<br class="">
<br class="">
New config:<br class="">
- [module] debug.tasks -- yes (default) = send
task stats every 5 minutes<br class="">
<br class="">
New history record:<br class="">
- "*-OVM-DebugTasks" v1:
&lt;taskcnt,totaltime&gt; + per task:<br class="">
&lt;tasknum,name,state,stack_now,stack_max,stack_total,<br class="">
heap_total,heap_32bit,heap_spi,runtime&gt;<br class="">
Note: CPU core use percentage = runtime /
totaltime<br class="">
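(Purely for illustration -- a minimal sketch of how such a record payload could be evaluated; this is not the actual OVMS or server code, the surrounding protocol framing is omitted, and the sample values are invented:)<br class="">
<pre class="">
# Sketch only: split a "*-OVM-DebugTasks" v1 payload and compute per-task
# CPU core usage as runtime / totaltime.  Sample values are invented.
payload = ("2,1000000,"
           "1,OVMS Events,Blk,448,1984,4608,104,0,0,150000,"
           "2,OVMS CanRx,Blk,428,428,2048,3052,0,31844,50000")
fields = payload.split(",")
taskcnt, totaltime = int(fields[0]), int(fields[1])
per_task = fields[2:]
for i in range(taskcnt):
    (tasknum, name, state, stack_now, stack_max, stack_total,
     heap_total, heap_32bit, heap_spi, runtime) = per_task[i*10:(i+1)*10]
    print(f"{name}: {100.0 * int(runtime) / totaltime:.1f}% of one core")
</pre>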
<br class="">
commit 950172c216a72beb4da0bc7a40a46995a6105955<br
class="">
<br class="">
Build config: default timer service task
priority raised to 20<br class="">
<br class="">
Background: the FreeRTOS timer service shall
only be used for very<br class="">
short and non-blocking jobs. We delegate
event processing to our<br class="">
events task; anything else in timers needs to
run with high<br class="">
priority.<br class="">
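(For reference, with the stock ESP-IDF option names this presumably corresponds to an sdkconfig line like<br class="">
<tt>CONFIG_FREERTOS_TIMER_TASK_PRIORITY=20</tt> -- please verify against the sdkconfig changes included in the commit.)<br class="">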
<br class="">
commit 31ac19d187480046c16356b80668de45cacbb83d<br
class="">
<br class="">
DukTape: add build config for task priority,
default lowered to 3<br class="">
<br class="">
Background: the DukTape garbage collector shall
run at a lower
priority than tasks like SIMCOM & events.<br class="">
<br class="">
commit e0a44791fbcfb5a4e4cad24c9d1163b76e637b4f<br
class="">
<br class="">
Server V2: use esp_log_timestamp for timeout
detection,<br class="">
add timeout config, limit data records &
size per second<br class="">
<br class="">
New config:<br class="">
- [server.v2] timeout.rx -- timeout in seconds,
default 960<br class="">
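(Following the usual config syntax, the timeout can presumably be adjusted like this -- the 600 is just an example value:)<br class="">
<tt>config set server.v2 timeout.rx 600</tt><br class="">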
<br class="">
commit 684a4ce9525175a910040f0d1ca82ac212fbf5de<br
class="">
<br class="">
Notify: use esp_log_timestamp for creation time
instead of monotonictime<br class="">
to harden against timer service starvation /
ticker event drops<br class="">
<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 07.09.19 at 10:55, Michael Balzer wrote:<br class="">
<blockquote type="cite" class="">I think the RTOS
timer service task starves. It's running on core
0 with priority 1.<br class="">
<br class="">
Tasks on core 0, sorted by priority:<br class="">
<pre class="">
Number of Tasks = 20    Stack:  Now   Max  Total   Heap 32-bit SPIRAM C# PRI
3FFC84A8   6 Blk  ipc0          388   500   1024   7788      0      0  0  24
3FFC77F0   5 Blk  OVMS CanRx    428   428   2048   3052      0  31844  0  23
3FFAFBF4   1 Blk  esp_timer     400   656   4096  35928    644  25804  0  22
3FFD3240  19 Blk  wifi          460  2716   3584  43720      0     20  0  22
3FFC03C4   2 Blk  eventTask     448  1984   4608    104      0      0  0  20
3FFC8F14  17 Blk  tiT           500  2308   3072   6552      0      0  *  18
3FFE14F0  26 Blk  OVMS COrx     456   456   4096      0      0      0  0   7
3FFE19D4  27 Blk  OVMS COwrk    476   476   3072      0      0      0  0   7
3FFCBC34  12 Blk  Tmr Svc       352   928   3072     88      0      0  0   1
3FFE7708  23 Blk  mdns          468  1396   4096    108      0      0  0   1
</pre>
<br class="">
I don't think it's our CanRx, as that only
fetches and queues CAN frames, the actual work
is done by the listeners. The CO tasks only run
for CANopen jobs, which are few for normal
operation.<br class="">
<br class="">
That leaves the system tasks, with the main suspect
-- once again -- being the wifi blob.<br class="">
<br class="">
We need to know how much CPU time the tasks
actually use now. I think I saw some option for
this in the FreeRTOS config.<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 06.09.19 at 23:15, Michael Balzer wrote:<br class="">
<blockquote type="cite" class="">The workaround
is based on the monotonictime being updated
per second, as do the history record offsets.<br
class="">
<br class="">
Apparently, that mechanism doesn't work
reliably. That may be an indicator for some
bigger underlying issue.<br class="">
<br class="">
Example log excerpt:<br class="">
<br class="">
2019-09-06 22:07:48.126919 +0200 info main:
#173 C MITPROHB rx msg h
964,0,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:03.089031 +0200 info main:
#173 C MITPROHB rx msg h
964,-10,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:05.041574 +0200 info main:
#173 C MITPROHB rx msg h
964,-20,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:05.052644 +0200 info main:
#173 C MITPROHB rx msg h
964,-30,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:05.063617 +0200 info main:
#173 C MITPROHB rx msg h
964,-49,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:05.077527 +0200 info main:
#173 C MITPROHB rx msg h
964,-59,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:05.193775 +0200 info main:
#173 C MITPROHB rx msg h
964,-70,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:13.190645 +0200 info main:
#173 C MITPROHB rx msg h
964,-80,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:22.077994 +0200 info main:
#173 C MITPROHB rx msg h
964,-90,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:09:54.590300 +0200 info main:
#173 C MITPROHB rx msg h
964,-109,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:11:10.127054 +0200 info main:
#173 C MITPROHB rx msg h
964,-119,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:11:16.794200 +0200 info main:
#173 C MITPROHB rx msg h
964,-130,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:11:22.455652 +0200 info main:
#173 C MITPROHB rx msg h
964,-140,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:12:49.423412 +0200 info main:
#173 C MITPROHB rx msg h
964,-150,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:12:49.442096 +0200 info main:
#173 C MITPROHB rx msg h
964,-169,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:12:49.461941 +0200 info main:
#173 C MITPROHB rx msg h
964,-179,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:14:39.828133 +0200 info main:
#173 C MITPROHB rx msg h
964,-190,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:14:39.858144 +0200 info main:
#173 C MITPROHB rx msg h
964,-200,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:14:52.020319 +0200 info main:
#173 C MITPROHB rx msg h
964,-210,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:14:54.452637 +0200 info main:
#173 C MITPROHB rx msg h
964,-229,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:15:12.613935 +0200 info main:
#173 C MITPROHB rx msg h
964,-239,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:15:35.223845 +0200 info main:
#173 C MITPROHB rx msg h
964,-250,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:16:09.255059 +0200 info main:
#173 C MITPROHB rx msg h
964,-260,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:17:31.919754 +0200 info main:
#173 C MITPROHB rx msg h
964,-270,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:19:23.366267 +0200 info main:
#173 C MITPROHB rx msg h
964,-289,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:21:57.344609 +0200 info main:
#173 C MITPROHB rx msg h
964,-299,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:23:40.082406 +0200 info main:
#31 C MITPROHB rx msg h
964,-1027,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
2019-09-06 22:25:58.061883 +0200 info main:
#31 C MITPROHB rx msg h
964,-1040,RT-BAT-C,5,86400,2,1,3830,3795,3830,-10,25,25,25,0<br
class="">
<br class="">
<br class="">
This shows the ticker ran only 299 times
from 22:07:48 to 22:21:57 -- a span of 849 seconds.<br class="">
<br class="">
After 22:21:57 the workaround was triggered
and did a reconnect. Apparently, during that
network reinitialization of 103 seconds, the
per-second ticker ran 628 times.<br class="">
<br class="">
That can't be catching up on the event queue,
as that queue has only 20 slots. So something
strange is going on here.<br class="">
<br class="">
Any ideas?<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 06.09.19 at 08:04, Michael Balzer wrote:<br class="">
<blockquote type="cite" class="">Mark &
anyone else running a V2 server,<br class="">
<br class="">
as most cars don't send history records,
this also needs the change to the server I
just pushed, i.e. server version 2.4.2.<br
class="">
<br class="">
<a
href="https://github.com/openvehicles/Open-Vehicle-Monitoring-System/commits/master"
class="" moz-do-not-send="true">https://github.com/openvehicles/Open-Vehicle-Monitoring-System/commits/master</a><br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 05.09.19 at 19:55, Michael Balzer wrote:<br class="">
<blockquote type="cite" class="">I've pushed
the nasty workaround: the v2 server checks
for no RX over 15 minutes, then restarts
the network (wifi & modem) as
configured for autostart.<br class="">
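(Conceptually, the check is just a receive-timestamp watchdog. A simplified sketch, not the actual C++ firmware code -- the names, the once-per-minute ticker and the restart call are illustrative:)<br class="">
<pre class="">
# Simplified sketch of the no-RX watchdog idea (illustrative names only).
import time

RX_TIMEOUT = 15 * 60          # seconds without server RX before acting
last_rx = time.monotonic()    # refreshed whenever the v2 connection receives data

def on_server_rx():
    global last_rx
    last_rx = time.monotonic()

def ticker_60():              # imagine this being called once per minute
    if time.monotonic() - last_rx > RX_TIMEOUT:
        restart_network()

def restart_network():
    # placeholder: stop wifi and modem, then start them again as configured
    # for autostart, which re-establishes the server connection
    print("no RX for 15 minutes -- restarting network (wifi and modem)")
</pre>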
<br class="">
Rolled out on my server in edge as
3.2.002-237-ge075f655.<br class="">
<br class="">
Please test.<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 05.09.19 at 01:58, Mark Webb-Johnson wrote:<br class="">
<blockquote type="cite" class="">
<blockquote type="cite" class="">Mark,
you can check your server logs for
history messages with ridiculous time
offsets:<br class="">
[sddexter@ns27 server]$ cat
log-20190903 | egrep "rx msg h
[0-9]+,-[0-9]{4}" | wc -l<br class="">
455283<br class="">
</blockquote>
<br class="">
I checked my logs and see 12 vehicles
showing this. But 2 only show this for
a debugcrash log (which is expected, I
guess, if the time is not synced at
report time). I’ve got 4 cars with the
offset > 10,000.<br class="">
<br class="">
Regards, Mark.<br class="">
<br class="">
<blockquote type="cite" class="">On 4
Sep 2019, at 4:45 AM, Michael Balzer
<<a href="mailto:dexter@expeedo.de"
class="" moz-do-not-send="true">dexter@expeedo.de</a>
<<a href="mailto:dexter@expeedo.de"
class="" moz-do-not-send="true">mailto:dexter@expeedo.de</a>>>
wrote:<br class="">
<br class="">
Everyone,<br class="">
<br class="">
I've pushed a change that needs some
testing.<br class="">
<br class="">
I had the issue myself now, parked at
a certain distance from my garage wifi
AP, i.e. on the edge of being "in range",
after wifi had been disconnected for some
hours and with the module still
connected via modem. The wifi blob had
been trying to connect to the AP for
about two hours.<br class="">
<br class="">
As seen before, the module saw no
error; the server responses and
commands just stopped coming in. I noticed
the default interface was still "st1"
despite wifi having been disconnected
and modem connected. The DNS was also
still configured for my wifi network,
and the interface seemed to have an IP
address -- but wasn't pingable from
the wifi network.<br class="">
<br class="">
A power cycle of the modem solved the
issue without reboot. So the cause may
be in the modem/ppp subsystem, or it
may be related (in some weird way) to
the default interface / DNS setup.<br
class="">
<br class="">
More tests showed the default
interface again/still got set by the
wifi blob itself at some point,
overriding our modem prioritization.
The events we didn't handle up to now
were "sta.connected" and "sta.lostip",
so I added these, and the bug didn't
show up again since then. That doesn't
mean anything, so we need to test
this.<br class="">
<br class="">
The default interface really shouldn't
affect inbound packet routing of an
established connection, but there
may always be strange bugs lurking in
those libs.<br class="">
<br class="">
The change also reimplements the wifi
signal strength reading, as the tests
also showed that still wasn't working
well using the CSI callback. It now
seems to be much more reliable.<br
class="">
<br class="">
Please test & report. A single
module will be hard to test, as the
bug isn't easily reproducible, but you
can still check whether wifi / modem
transitions work well.<br class="">
<br class="">
Mark, you can check your server logs
for history messages with ridiculous
time offsets:<br class="">
[sddexter@ns27 server]$ cat
log-20190903 | egrep "rx msg h
[0-9]+,-[0-9]{4}" | wc -l<br class="">
455283<br class="">
The bug now severely affects the V2
server performance, as the server is
single-threaded and doesn't scale very
well to this kind of bulk data burst,
especially when it comes from multiple
modules in parallel. So we really need
to solve this now. Slow reactions or
connection drops from my server lately
have been due to this bug. If this
change doesn't solve it, we'll need to
add some reboot trigger on "too many
server v2 notification
retransmissions" -- or maybe a modem
power cycle will do; that wouldn't
discard the data.<br class="">
<br class="">
Thanks,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 03.09.19 at 07:46, Mark Webb-Johnson wrote:<br class="">
<blockquote type="cite" class="">No
problem. We can hold. I won’t commit
anything for the next few days (and
agree to hold off on Markos’s pull).
Let me know when you are ready.<br
class="">
<br class="">
Regards, Mark.<br class="">
<br class="">
<blockquote type="cite" class="">On
3 Sep 2019, at 1:58 AM, Michael
Balzer <<a
href="mailto:dexter@expeedo.de"
class="" moz-do-not-send="true">dexter@expeedo.de</a>>
wrote:<br class="">
<br class="">
Mark, please wait.<br class="">
<br class="">
I may just have found the cause
for issue #241, or at least
something I need to investigate
before releasing.<br class="">
<br class="">
I need to dig into my logs first,
and try something.<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 02.09.19 at 12:23, Michael Balzer wrote:<br class="">
<blockquote type="cite" class="">Nothing
open from my side at the moment.<br
class="">
<br class="">
I haven't had the time to look
into Markos's pull request, but
from a first check I also think
that's going too deep to be
included in this release.<br class="">
<br class="">
Regards,<br class="">
Michael<br class="">
<br class="">
<br class="">
On 02.09.19 at 04:15, Mark Webb-Johnson wrote:<br class="">
<blockquote type="cite" class="">I
think it is well past time for
a 3.2.003 release. Things
seem stable in edge (although
some things are only partially
implemented).<br class="">
<br class="">
Anything people want to
include at the last minute, or
can we go ahead and build?<br
class="">
<br class="">
Regards, Mark.<br class="">
<br>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="144">--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
OvmsDev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:OvmsDev@lists.openvehicles.com">OvmsDev@lists.openvehicles.com</a>
<a class="moz-txt-link-freetext" href="http://lists.openvehicles.com/mailman/listinfo/ovmsdev">http://lists.openvehicles.com/mailman/listinfo/ovmsdev</a>
</pre>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="144">--
Michael Balzer * Helkenberger Weg 9 * D-58256 Ennepetal
Fon 02333 / 833 5735 * Handy 0176 / 206 989 26
</pre>
</body>
</html>